From Edutechwiki. This is a beginner's tutorial for XML editing, made from slides.
Learning goals
Let us recall some principles that you may also have read in the XML principles article. In particular:
Many XML languages are defined with so-called schemas, i.e. grammars that define elements (tags) and attributes and how they can be combined. There are several schema formalisms. Other languages are defined with a simple textual description, e.g. the well-known RSS 0.9 syndication language. Often a language is defined using both schemas and text, e.g. HTML and SVG define the main structure with a DTD but add extra constraints for certain elements and attributes through simple descriptions. A good example is measures: a length can be expressed in m, cm, in, pt, px, %, etc., and such a rule cannot be expressed in the simple DTD language.
There are four more or less popular schema languages:
(1) Document Type Definitions (DTDs)
(2) XML Schema
(3) Relax NG
(4) Schematron
DTD grammars are just a set of rules that define:
The most important part of a formal XML specification that makes use of DTDs is usually the DTD itself. In addition, other constraints can be added. In particular:
<size length="10cm">
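To make this limitation concrete, here is a minimal sketch of how a DTD could declare the length attribute. The element name size comes from the example above; declaring it as an empty element is an assumption made for illustration. The DTD can only state that length is a character string, so a validating parser would accept length="10cm" and length="banana" alike. The rule "a number followed by a unit" has to be written down in the accompanying text.

```xml
<?xml version="1.0"?>
<!DOCTYPE size [
  <!-- Assumption: size is an empty element that carries one attribute -->
  <!ELEMENT size EMPTY>
  <!-- CDATA means "any character string"; a DTD cannot require number + unit -->
  <!ATTLIST size length CDATA #REQUIRED>
]>
<size length="10cm"/>
```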
DTD file association with an XML file
XML grammars like DTDs and XML Schemas can be directly associated with an XML file. This way, the XML carries information about its content structure that allows any client to verify if it is valid.
A simple DTD example (file "page.dtd")
<!ELEMENT page (title, content, comment?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT content (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
The following XML document is valid with respect to the grammar defined in "page.dtd" (just above)
<?xml version="1.0"?>
<!DOCTYPE page SYSTEM "page.dtd">
<page>
<title>Hello friend</title>
<content>Here is some content :)</content>
<comment>Written by Anonymous</comment>
</page>
A DTD document contains only rule definitions, nothing else (see later for explanations). The "page" DTD defines the following:
Specification of a markup language. Is a DTD enough ?
DTDs can’t define what the character data (element contents) and most attribute values should look like. For example, if you require that the user enter a number between 10 and 15 or the name of one of 15 different capitals, then you would have to use a formalism other than DTD.
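For comparison, here is a rough sketch of how XML Schema, one of the other schema languages listed earlier, could express the "number between 10 and 15" constraint. The element name quantity is hypothetical and only serves as an illustration; it does not come from any DTD in this tutorial.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: its content must be an integer from 10 to 15 -->
  <xs:element name="quantity">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="10"/>
        <xs:maxInclusive value="15"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

With such a schema, a document containing <quantity>12</quantity> would be accepted, while <quantity>20</quantity> would be rejected by a schema-aware validator.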
We introduce some of the DTD "language" below; details are explained in the DTD tutorial. Let us first systematically describe how a DTD file can be associated with an XML document.
There are four ways of using a DTD with an XML file:
(1) No DTD
<?xml version="1.0" standalone="yes"?>
<hello> Hello XML and hello dear reader! </hello>
(2) DTD rules are defined inside the XML document
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE hello [
<!ELEMENT hello (#PCDATA)>
]>
<hello> Hello XML and hello dear readers! </hello>
(3) Private/System DTDs
<?xml version="1.0" ?>
<!DOCTYPE hello SYSTEM "hello.dtd">
<hello> This is a very simple XML document </hello>
(4) Public DTDs
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
"http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
<channel> ...... </channel>
</rss>
The syntax rules are fairly simple and can be understood by looking at the examples above, so you may skip this section.
(1) Every DTD declaration must start with
<!DOCTYPE .... >
(2) The root element must be specified next. Remember that DTDs don’t know their root element; the root is named in the XML document. DTDs must define this root element just like any other element. In some cases, DTDs are meant to be used in different ways, i.e. several elements could be used as root elements.
<!DOCTYPE hello .... >
(3) The next elements of the DTD declaration differ according to the DTD type (internal, private or public)
(a) Syntax for internal DTDs only: DTD rules are inserted between brackets [ ... ]
<!DOCTYPE hello [
<!ELEMENT hello (#PCDATA)>
]>
(b) Syntax to define "private" external DTDs: The DTD is identified by the URL after the "SYSTEM" keyword
<!DOCTYPE hello SYSTEM "hello.dtd">
Example using a URL
<!DOCTYPE hello SYSTEM "http://tecfa.unige.ch/guides/xml/examples/simple/hello-page.dtd">
(c) Syntax for public DTDs: After the "PUBLIC" keyword you have to specify an official name and a backup URL that a validator could use. For example:
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
"http://my.netscape.com/publish/formats/rss-0.91.dtd">
Below we will present a few DTDs in increasing complexity.
Below is a simple XML document of type <page>:
<?xml version="1.0"?>
<page>
<title>Hello friend</title>
<content>
Here is some content :)
</content>
<comment>
Written by DKS/Tecfa, adapted from S.M./the Cocoon samples
</comment>
</page>
The following DTD could validate the document:
<!ELEMENT page (title, content, comment?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT content (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
First, it defines a page element that must include a title element, a content element, and optionally a comment element. Second, each of these sub-elements can only include text data, i.e. no other elements.
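To see what these rules exclude, consider the following counter-example. It is a sketch that assumes the DTD above is stored as "page.dtd", as in the earlier association example. The document is well-formed, but a validating parser should reject it because the required content element is missing.

```xml
<?xml version="1.0"?>
<!DOCTYPE page SYSTEM "page.dtd">
<page>
  <title>Hello friend</title>
  <!-- Not valid: the DTD requires a content element between title and comment -->
  <comment>Written by Anonymous</comment>
</page>
```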

Recipes are very popular in XML education.
Take one
Let's first look at a quite simple example, originally published by Jay Greenspan (dead link)
<?xml version="1.0"?>
<!DOCTYPE list SYSTEM "simple_recipe.dtd">
<list>
<recipe>
<author>Carol Schmidt</author>
<recipe_name>Chocolate Chip Bars</recipe_name>
<meal>Dinner
<course>Dessert</course>
</meal>
<ingredients>
<item>2/3 C butter</item> <item>2 C brown sugar</item>
<item>1 tsp vanilla</item> <item>1 3/4 C unsifted all-purpose flour</item>
<item>1 1/2 tsp baking powder</item>
<item>1/2 tsp salt</item> <item>3 eggs</item>
<item>1/2 C chopped nuts</item>
<item>2 cups (12-oz pkg.) semi-sweet choc. chips</item>
</ingredients>
<directions>
Preheat oven to 350 degrees. Melt butter; combine with brown sugar and
vanilla in large mixing bowl. Set aside to cool. Combine flour, baking
powder, and salt; set aside. Add eggs to cooled sugar mixture; beat
well. Stir in reserved dry ingredients, nuts, and chips. Spread in
greased 13-by-9-inch pan. Bake for 25 to 30 minutes until golden
brown; cool. Cut into squares.
</directions>
</recipe>
</list>
The DTD would look like this
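The original DTD is not reproduced in this text. As a stand-in, here is a sketch inferred only from the XML document above, so every rule in it is an assumption rather than the author's file. Note that meal contains both text and a course element, which calls for a mixed-content declaration.

```xml
<!-- Sketch of a DTD for the recipe list above (inferred, not the original file) -->
<!ELEMENT list (recipe+)>
<!ELEMENT recipe (author, recipe_name, meal, ingredients, directions)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT recipe_name (#PCDATA)>
<!-- Mixed content: text such as "Dinner" plus optional course elements -->
<!ELEMENT meal (#PCDATA | course)*>
<!ELEMENT course (#PCDATA)>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
```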

Take two
Below is a half-filled example of a slightly more complex recipe list in XML. As you can see, this example uses a more nested structure. For example, author, date, and version are children of a meta element. Directions includes a para element, i.e. a kind of formatting instruction which is meant to produce more legible text.
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE list SYSTEM "recipe-2.dtd">
<?xml-stylesheet href="recipe-2.css" type="text/css"?>
<list>
<recipe>
<meta>
<author>Joe</author>
<date></date>
<version></version>
</meta>
<recipe_name>Vegetable soup</recipe_name>
<meal>dinner</meal>
<ingredients>
<item>4 Carrots</item>
<item>2 Onions</item>
<item>Garlic</item>
<item>1/2 Cabbage</item>
<item>Salt</item>
<item>Pepper</item>
</ingredients>
<directions>
<para>Cut the vegies into little pieces. Then boil with
water. Add some salt and pepper</para>
</directions>
</recipe>
</list>
Contents of the DTD (simple_recipe.dtd)
<!-- Simple recipe DTD -->
<!-- This DTD allows writing simple recipes
list = a list of recipes
recipe = container for a recipe
meta = Metainformation: must include author of this file,
date, version in this order
recipe_author = optional name of recipe author
meal = title of meal
ingredients = list of items you need
directions = How to cook, may include paras and bullets.
-->
<!ELEMENT list (recipe+)>
<!ELEMENT recipe (meta, recipe_author?, recipe_name, meal,
ingredients, directions)>
<!ELEMENT meta (author, date, version)>
<!ELEMENT version (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT recipe_author (#PCDATA)>
<!ELEMENT recipe_name (#PCDATA)>
<!ELEMENT meal (#PCDATA)>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT directions (para | bullet)* >
<!ELEMENT bullet (#PCDATA|strong)*>
<!ELEMENT para (#PCDATA|strong)*>
<!ELEMENT strong (#PCDATA)>
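The last four rules are the least obvious ones, so here is a small fragment that exercises them. The wording of the steps is made up for illustration. According to the DTD above, directions may mix para and bullet elements in any order, and both may contain text interleaved with strong elements.

```xml
<directions>
  <!-- para and bullet allow mixed content: text plus strong -->
  <para>Cut the <strong>washed</strong> vegetables into little pieces.</para>
  <bullet>Boil with water.</bullet>
  <bullet>Add some <strong>salt</strong> and pepper.</bullet>
</directions>
```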
Let's present the grammar first
<?xml version="1.0" encoding="UTF-8"?>
<!-- DTD to write simple stories
Made by Daniel K. Schneider / TECFA / University of Geneva
VERSION 1.0
30/10/2003
-->
<!ELEMENT STORY (title, context, problem, goal, THREADS, moral, INFOS)>
<!ATTLIST STORY xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink">
<!ELEMENT THREADS (EPISODE+)>
<!ELEMENT EPISODE (subgoal, ATTEMPT+, result) >
<!ELEMENT ATTEMPT (action | EPISODE) >
<!ELEMENT INFOS ( ( date | author | a )* ) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT context (#PCDATA) >
<!ELEMENT problem (#PCDATA) >
<!ELEMENT goal (#PCDATA) >
<!ELEMENT subgoal (#PCDATA) >
<!ELEMENT result (#PCDATA) >
<!ELEMENT moral (#PCDATA) >
<!ELEMENT action (#PCDATA) >
<!ELEMENT date (#PCDATA) >
<!ELEMENT author (#PCDATA) >
<!ELEMENT a (#PCDATA)>
<!ATTLIST a
xlink:href CDATA #REQUIRED
xlink:type CDATA #FIXED "simple"
>
Below is a short story
<?xml version="1.0"?>
<!DOCTYPE STORY SYSTEM "story-grammar.dtd">
<?xml-stylesheet href="story-grammar.css" type="text/css"?>
<STORY xmlns:xlink="http://www.w3.org/1999/xlink">
<title>The little Flexer</title>
<context>Once upon a time, in a dark small office.</context>
<problem>Kaspar was trying to learn Flex but didn't have a real
project. He then decided that it would be a good idea to look at
Data-Driven Controls. These are most useful in combination with an
external datasources in XML format.</problem>
<goal>So he decided how to write a mx:Tree application that imports
XML data.</goal>
<THREADS>
<EPISODE>
<subgoal>He decided to play with a little example.</subgoal>
<ATTEMPT>
<action>So he went to see the LiveDocs and copied an
example.</action>
</ATTEMPT>
<result>The example worked but he didn't understand why since he
didn't know about E4X.</result>
</EPISODE>
<EPISODE>
<subgoal>He then decided to learn e4X first.</subgoal>
<ATTEMPT>
<action>
Reading 2-3 tutorials and creating a simple example only took
2-3 hours.
</action>
</ATTEMPT>
<result>
He now understood how to write e4X code in Flex.
</result>
</EPISODE>
</THREADS>
<moral>Divide a problem into subproblems and you will get there ...</moral>
<INFOS>
<a xlink:href="http://edutechwiki.unige.ch/en/ECMAscript_for_XML"
xlink:type="simple">ECMAscript for XML</a>
</INFOS>
</STORY>
The story grammar is a text-centric DTD. Therefore it can easily be styled with CSS. You can look at the file story-grammar.xml and also consult story-grammar.css.

A valid XML file
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE family SYSTEM "family.dtd">
<family>
<person name="Joe Miller" gender="male"
type="father" id="123.456.789"/>
<person name="Josette Miller" gender="female"
type="girl" id="123.456.987"/>
</family>
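The file "family.dtd" itself is not shown in this text. The following is a sketch of a DTD that would accept the document above; the enumerated values and the choice of which attributes are required are assumptions. The id attribute is declared as CDATA here because values of the stricter ID type must be XML names, which cannot start with a digit, so "123.456.789" would not qualify.

```xml
<!-- Hypothetical family.dtd, inferred from the XML file above -->
<!ELEMENT family (person+)>
<!-- person carries all its information in attributes -->
<!ELEMENT person EMPTY>
<!ATTLIST person
  name   CDATA #REQUIRED
  gender (male | female) #IMPLIED
  type   (mother | father | boy | girl) #REQUIRED
  id     CDATA #REQUIRED
>
```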
RSS is a news syndication format. There are several RSS variants. RSS 0.91 is Netscape’s original version and is still being used.
<!ELEMENT rss (channel)>
<!ATTLIST rss version CDATA #REQUIRED> <!-- must be "0.91"> -->
<!ELEMENT channel (title | description | link | language | item+ | rating? |
image? | textinput? | copyright? | pubDate? | lastBuildDate? |
docs? | managingEditor? | webMaster? | skipHours? | skipDays?)*>
<!ELEMENT title (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT link (#PCDATA)>
<!ELEMENT image (title | url | link | width? | height? | description?)*>
<!ELEMENT url (#PCDATA)>
<!ELEMENT item (title | link | description)*>
<!ELEMENT textinput (title | description | name | link)*>
<!ELEMENT name (#PCDATA)>
<!ELEMENT rating (#PCDATA)>
<!ELEMENT language (#PCDATA)>
<!ELEMENT width (#PCDATA)>
<!ELEMENT height (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
<!ELEMENT pubDate (#PCDATA)>
<!ELEMENT lastBuildDate (#PCDATA)>
<!ELEMENT docs (#PCDATA)>
<!ELEMENT managingEditor (#PCDATA)>
<!ELEMENT webMaster (#PCDATA)>
<!ELEMENT hour (#PCDATA)>
<!ELEMENT day (#PCDATA)>
<!ELEMENT skipHours (hour+)>
<!ELEMENT skipDays (day+)>
Possible XML document for RSS
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE rss SYSTEM "rss-0.91.dtd">
<rss version="0.91">
<channel>
<title>Webster University</title>
<description>Home Page of Webster University</description>
<link>http://www.webster.edu</link>
<item>
<title>Webster Univ. Geneva</title>
<description>Home page of Webster University Geneva</description>
<link>http://www.webster.ch</link>
</item>
<item>
<title>http://www.course.com/</title>
<description>You can find Thomson text-books materials
(exercise data) on this web site</description>
<link>http://www.course.com/</link>
</item>
</channel>
</rss>
We will come back to this when we learn how to write our own DTDs in the DTD tutorial (don’t worry too much about unexplained details).
| Meaning | Example |
| --- | --- |
| order of elements | <!ELEMENT Name (First, Middle, Last)> |
| optional element | MiddleName? |
| at least one element | movie+ |
| zero or more elements | item* |
| pick one (or operator) | economics \| law |
| grouping construct | (A,B,C) |
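The operators in the table can be combined in a single grammar. The following sketch uses an internal DTD so that rules and document fit in one file; the root element profile and all text contents are hypothetical, while the other element names are borrowed from the table.

```xml
<?xml version="1.0"?>
<!DOCTYPE profile [
  <!-- Sequence, one-or-more, zero-or-more and a grouped choice -->
  <!ELEMENT profile (Name, movie+, item*, (economics | law))>
  <!-- MiddleName is optional -->
  <!ELEMENT Name (First, MiddleName?, Last)>
  <!ELEMENT First (#PCDATA)>
  <!ELEMENT MiddleName (#PCDATA)>
  <!ELEMENT Last (#PCDATA)>
  <!ELEMENT movie (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT economics (#PCDATA)>
  <!ELEMENT law (#PCDATA)>
]>
<profile>
  <Name><First>Josette</First><Last>Miller</Last></Name>
  <movie>First movie title</movie>
  <movie>Second movie title</movie>
  <law>Contract law</law>
</profile>
```

This document omits MiddleName and item, which is allowed by ? and *, and it picks law instead of economics.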
Understanding DTD entities
Most professional DTDs use so-called entities. Entities are just symbols that stand for some information, which is substituted wherever the symbol is used.
DTD entities: Some more complex DTDs use the same structures all over. Instead of typing these several times, one can use an ENTITY construction like this:
<!ENTITY % Content "(Para | List | Listing)*">
Later in the DTD we then can have Element definitions like this:
<!ELEMENT Intro (Title, %Content; ) >
<!ELEMENT Goal (Title, %Content; ) >
The computer will then simply translate these into:
<!ELEMENT Intro (Title, (Para | List | Listing)*) >
<!ELEMENT Goal (Title, (Para | List | Listing)* ) >
... think of these entities as shortcuts.
Note: There is also a second kind of entity, XML entities. XML entities let you define a fragment of XML text once and then include it later.
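As a minimal sketch of such an XML entity, the following document declares an entity in an internal DTD and then uses it in the content. The entity name signature and its replacement text are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE hello [
  <!ELEMENT hello (#PCDATA)>
  <!-- Declared once in the DTD (no % sign, unlike the DTD entities above) -->
  <!ENTITY signature "Written by Anonymous">
]>
<!-- The parser replaces &signature; with the declared text -->
<hello>Hello XML. &signature;</hello>
```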
There are lots of XML editors and there is no easy choice. Depending on your needs you may choose a different editor:
Here is my own little advice with respect to XML editors (also read the XML editor article)
Minimal things your XML editor should be able to do
We then suggest some additional criteria depending on the kind of XML
For data-centric XML:
For text-centric XML:
Any XML editor is difficult to learn (because XML editing is not so easy). Please make an effort to learn the interface, e.g. read the help!
(1) Exchanger XML Lite V3.3
If you are looking for a general purpose editor that is both DTD and Schema aware and that offers XSLT support, I suggest trying this editor first. Try others if you are unhappy with it or if you plan to focus on a single kind of editing, e.g. just edit "data-centric" XML documents.
Hints for editing with Exchanger
To insert an element or attribute:
Read more in Exchanger XML Editor
(2) XMLmind Standard Edition is another free editor. XMLmind may be a better choice if you plan to edit data-centric XML and/or if you like to work with "tree views". The free edition doesn't include XSLT processing, but you can do this with another tool (e.g. Exchanger Lite or just a command line call).
Hints for editing with XMLmind
Other Alternatives
About Java