This article provides a short and rather non-technical overview of XML. See also the XML category for all (many) XML-related topics, or follow the links in this overview.
Learning goals
Prerequisites
Next steps
See also: Editing XML tutorial
Currently, there are hundreds of more or less popular XML languages. Within the narrower area of web standards there are fewer, and we shall briefly introduce the most important ones that non-programmers such as content developers or web designers should know about.
Initially, XML was expected to redefine the way contents are delivered. It then turned out that XHTML was (almost) never used as XML, e.g. in the form of HTML combined with other XML contents. This "XML" vision of HTML still exists in the minds of some people, but the death of XHTML 2 put a provisional end to it. The current mainstream, represented by HTML5, is a computer application-centered model, i.e. HTML is seen as a delivery platform for interactive contents and not as a document format.
The picture below shows the idea that web documents can be composed of several components: in the case of HTML, there is HTML + CSS; in the case of HTML5, there is HTML + built-in SVG and MathML + CSS. In the case of XHTML 1 or XHTML 5, a document can include any other XML language, provided that these are identified by so-called namespaces. Although it is no longer popular, we also included SGML in the picture, since it is the "mother" of all tag-based markup languages.
Just to make sure: the death of XHTML by no means implies that XML is not being used on the Internet. It is just dead as a web page format. Other formats like SVG (vector graphics), MathML (mathematical formulas) and RSS (content syndication) are very much in use today and will be so in the future.
The semantic web is essentially defined by the RDF framework. While RDF itself is used in some areas (e.g. metadata formalisms), the global semantic web project seems to be somewhat stalled, except for occasional flares. Web 2.0 was supposed to be semantic, but it became quite the opposite, i.e. it is based on simple microformats. Then it became web 3. Then the anti-semantic HTML5 initiative became dominant, and the "semantic web" remains a "smaller island" of interest and applications.
There exist several protocols for machine-to-machine interaction, like SOAP and XML-RPC. See the web service article for more details.
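To make the role of namespaces more concrete, here is a minimal Python sketch that lists the vocabularies found in a compound document. The sample document is a hypothetical example made up for illustration; only the two namespace URIs (XHTML and SVG) are the official ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical compound document: XHTML that embeds an SVG drawing.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40"/>
    </svg>
  </body>
</html>"""


def vocabularies(xml_text):
    """Return the set of namespace URIs used by the elements of a document."""
    root = ET.fromstring(xml_text)
    found = set()
    for element in root.iter():
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found


if __name__ == "__main__":
    for uri in sorted(vocabularies(DOC)):
        print(uri)
```

Each element belongs to the vocabulary named by its namespace URI, which is how a processor can tell an SVG element from an XHTML one inside the same file.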
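As a rough illustration of what such a protocol puts on the wire, the following sketch uses Python's standard `xmlrpc.client` module to encode and decode an XML-RPC call without contacting any server. The method name `add` and its arguments are assumptions chosen for the example.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method "add" with two arguments.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")

# A server would decode the same text back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)

if __name__ == "__main__":
    print(request_xml)     # the XML message that would be sent over HTTP
    print(method, params)
```

The message is ordinary XML: a `methodCall` element that contains the method name and the typed parameters.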
In addition we can identify:
In a more general perspective, XML is currently one of the most popular standards to define various kinds of data structures. One could define three kinds:
An XML document can refer to a physical file, a database entry, or a data stream. In other words, technically speaking, an XML document is any sort of delimited "text" that is defined as a string and has XML markup inside.
An XML document is well formed if and only if:
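If present, the XML declaration comes first, optionally stating the encoding:
<?xml version="1.0"?>
<?xml version="1.0" encoding="UTF-8"?>
Elements nest properly. Overlapping tags like the following are not well formed:
<i>...<b>...</i> .... </b>
Empty elements are closed:
<br />
Attribute values are quoted:
<a href="http://tecfa.unige.ch:8080/xml.html">
Special characters in text are written with the predefined entities:
&amp; &lt; &gt; &quot; &apos;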
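Well-formedness can be tested with any XML parser. The sketch below is one possible way to do it with Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample fragments are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Pairs of fragments: one that respects a rule and one that breaks it.
SAMPLES = {
    "properly nested": "<p><i>a<b>b</b></i></p>",
    "overlapping tags": "<p><i>a<b>b</i></b></p>",
    "closed empty element": "<p>line<br /></p>",
    "unclosed empty element": "<p>line<br></p>",
    "quoted attribute": '<a href="http://tecfa.unige.ch:8080/xml.html">x</a>',
    "unquoted attribute": "<a href=http://tecfa.unige.ch:8080/xml.html>x</a>",
    "escaped ampersand": "<p>Tom &amp; Jerry</p>",
    "bare ampersand": "<p>Tom & Jerry</p>",
}

if __name__ == "__main__":
    for label, fragment in SAMPLES.items():
        print(label, "->", is_well_formed(fragment))
```

A parser stops at the first well-formedness error, so a document is either accepted as a whole or rejected.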
An XML document is said to be valid if it conforms to some kind of grammar, also called a schema. An XML grammar formally describes an XML application (or vocabulary or language).
The most popular ones are in this order:
In addition to DTDs, XML applications may include other constraints. Some XML applications may include languages that are not XML-based (e.g. CSS or XPath).
The most popular grammars are DTDs. Below we just include a picture of a little grammar (read the details in the DTD tutorial).
Data-centric XML as opposed to the text-centric XML refers to XML whose primary audience is not a human reader, but a computer program which will process the information, respond to it, store data items in a database, and so on.
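A small sketch may help to show what "processed by a program" means in practice. The order document below, including its element and attribute names, is a hypothetical example; the code reads it into plain Python records and computes a total.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: all names are assumptions.
ORDERS = """<orders>
  <order id="1"><item>pencil</item><quantity>12</quantity><price>0.50</price></order>
  <order id="2"><item>notebook</item><quantity>3</quantity><price>2.25</price></order>
</orders>"""


def load_orders(xml_text):
    """Turn each <order> element into a dictionary with typed values."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        rows.append({
            "id": int(order.get("id")),
            "item": order.findtext("item"),
            "quantity": int(order.findtext("quantity")),
            "price": float(order.findtext("price")),
        })
    return rows


def total(rows):
    """Sum quantity times price over all orders."""
    return sum(row["quantity"] * row["price"] for row in rows)


if __name__ == "__main__":
    rows = load_orders(ORDERS)
    print(rows)
    print(total(rows))
```

From here the records could just as well be stored in a database, which is the typical fate of data-centric XML.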
See also Tour de XML, a selection of links demonstrating various uses of XML.
Extend the power of XML
Various style sheet and query languages
These document standards (as well as others) can intervene at all stages of the document production/delivery pipeline. XML in the documentation world appears as:
Any XML document can directly be put on the web together with a CSS stylesheet or an XSLT transformation. Specialized formats like SVG (vector graphics), X3D (3D vector graphics) and MathML (formulas) can be added to XML-compatible browsers. Larger documents are often produced with specialized vocabularies such as DITA or DocBook. Contents can be written either with an XML editor or an XML-aware word processor. Such documents can then be either directly "saved as" or sent through various custom output filters.
Today one can directly display information encoded in XML (of any grammar) in a browser by using a style sheet. Style sheets make it possible to:
The utility of style-sheets is therefore
XSL refers to two languages recommended by the W3C.
XSL refers either to XSLT or to XSL/FO, and they provide two principal functions:
(1) XSLT is a transformation language for XML elements.
For example, XSLT allows for the creation of a table of contents or the translation of XML to HTML.
(2) XSL/FO is a formatting language that allows creating high-quality print documents.
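To give an idea of the kind of transformation mentioned under (1), the sketch below turns a small XML document into an HTML page with a table of contents. It is not XSLT: it uses plain Python with the standard `xml.etree.ElementTree` module, because the Python standard library has no XSLT processor. The source vocabulary (`article`, `section`, `title`, `para`) is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are assumptions.
SOURCE = """<article>
  <title>XML overview</title>
  <section><title>Well-formed documents</title><para>...</para></section>
  <section><title>Valid documents</title><para>...</para></section>
</article>"""


def to_html(xml_text):
    """Build an HTML tree with a heading and a list of section titles."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = source.findtext("title")
    toc = ET.SubElement(body, "ul")
    for section in source.findall("section"):
        # One list item per section title: a minimal table of contents.
        ET.SubElement(toc, "li").text = section.findtext("title")
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(to_html(SOURCE))
```

An XSLT stylesheet would express the same mapping declaratively, with templates that match `article` and `section` elements.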
Formatting with XSL-FO
CSS can also be used to style XML contents. However, since its transformation capabilities are rather poor, the XML should already include all the data to be published.
XLink allows inserting a link in an XML document, where a link expresses a relationship between two or more objects. XLink is a W3C Recommendation, but there is no complete implementation in mainstream browsers for the moment. However, subsets of XLink are used in various other XML languages.
XLink is based on other standards (that are also shared with XSLT)
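The following sketch shows what simple XLink attributes look like and how a program can collect them. The `links`, `ref` and `note` elements are hypothetical; the namespace URI `http://www.w3.org/1999/xlink` and the `type` and `href` attributes come from the XLink specification.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical document: any element can become a link through xlink:* attributes.
DOC = """<links xmlns:xlink="http://www.w3.org/1999/xlink">
  <ref xlink:type="simple" xlink:href="http://www.w3.org/TR/xlink/">XLink specification</ref>
  <ref xlink:type="simple" xlink:href="chapter2.xml">Next chapter</ref>
  <note>Not a link</note>
</links>"""


def simple_links(xml_text):
    """Return (text, target) pairs for every element carrying xlink:href."""
    root = ET.fromstring(xml_text)
    href = "{%s}href" % XLINK_NS
    return [(el.text, el.get(href)) for el in root.iter() if el.get(href) is not None]


if __name__ == "__main__":
    for text, target in simple_links(DOC):
        print(text, "->", target)
```

SVG is an example of a language that has used this subset, in the form of `xlink:href` attributes.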
Principal characteristics of XLink
Where does this standard come from?
One has to make a distinction between languages specifically developed for the education sector (see below) and all the rest of XML technology, most of which can be useful to education.
(longer entries have their own page)
Note: You may need to change the local system identifier of the DTD or schema. These programs must be able to retrieve the DTD. I would rather suggest installing a local program on your machine (like xmllint or xmlTester).
Some websites offer functionality to perform simple XML tasks like formatting, diffing, transforming, validating and querying XML.
Website | Features |
http://www.shancarter.com/data_converter/ | Conversion from Excel and CSV to XML |
http://www.shell-tools.net/index.php?op=xml_format | Format and validation (DTD and XSD) |
http://tools.decisionsoft.com/xmldiff.html | Diff (compare XML files) |
http://tools.decisionsoft.com/schemaValidate/ | Validation (XSD) |
http://chris.photobooks.com/xml/ | Format, transformation (XSLT) and query (XPath) |
http://www.xmltools.dk/ | Query (XPath) |
http://xslt.online-toolz.com/tools/xslt-transformation.php | Format, transformation (XSLT) and validation (XSD) |
http://www.w3schools.com/xsl/tryxslt.asp?xmlfile=cdcatalog&xsltfile=cdcatalog | Transformation (XSLT) |
http://www.qutoric.com/xslt/analyser/xpathtool.html | Query (XPath) |
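Two of these tasks, formatting and querying, can also be done locally with a few lines of Python and its standard library. The sketch below is only an illustration: the catalog data is hypothetical, and `ElementTree` supports a limited subset of XPath, not the full language.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical one-line document, as it might come out of a program.
CATALOG = (
    '<catalog>'
    '<cd year="1985"><title>Empire Burlesque</title></cd>'
    '<cd year="1988"><title>Hide your heart</title></cd>'
    '</catalog>'
)


def pretty(xml_text):
    """Format: re-serialize the document with line breaks and indentation."""
    return xml.dom.minidom.parseString(xml_text).toprettyxml(indent="  ")


def titles_from(xml_text, year):
    """Query: titles of all cd elements whose year attribute equals year."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall("./cd[@year='%s']/title" % year)]


if __name__ == "__main__":
    print(pretty(CATALOG))
    print(titles_from(CATALOG, "1988"))
```

For validation against a DTD or an XSD, a dedicated tool such as xmllint remains the simpler choice, since the Python standard library does not validate.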
(this section needs to be expanded some day)