
Understanding XML – Part III (Building Blocks)


XML documents (and HTML documents) are made up of the following building blocks: Elements, Tags, Attributes, Entities, PCDATA, and CDATA.


Elements are the main building blocks of both XML and HTML documents. Examples of HTML elements are “body” and “table”. Examples of XML elements could be “my-schedule” and “date”.

Elements can contain text, other elements, or be empty. Examples of empty HTML elements are “hr”, “br” and “img”.
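As a minimal sketch of the three cases above (an element holding other elements, an element holding text, and an empty element), the following uses Python's standard xml.etree.ElementTree module. The "my-schedule" and "date" names come from the text; the "note" and "break" elements and the sample values are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "my-schedule" and "date" come from the article,
# "note", "break" and all values are invented for this sketch.
SCHEDULE_XML = """<my-schedule>
  <date>2024-01-15</date>
  <note>Team meeting</note>
  <break/>
</my-schedule>"""

root = ET.fromstring(SCHEDULE_XML)

# An element that contains other elements.
child_names = [child.tag for child in root]

# An element that contains text.
date_text = root.find("date").text

# An empty element: no text and no child elements.
empty = root.find("break")
is_empty = empty.text is None and len(empty) == 0

print(child_names)
print(date_text)
print(is_empty)
```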


Tags are used to mark up elements. A starting tag like <element_name> marks the beginning of an element, and an ending tag like </element_name> marks the end of an element.

Examples: A body element: <body>body text in between</body>. A message element: <message>some message in between</message>.


Attributes provide extra information about elements. Attributes are placed inside the start tag of an element. Attributes come in name/value pairs. For example, when inserting an image, the “img” element carries additional information about a source file: <img src="computer.gif" />

The name of the element is “img”. The name of the attribute is “src”. The value of the attribute is “computer.gif”. Since the element itself is empty, it is closed by a “ /”.
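The img example described above can be checked with a few lines of Python. This is only a sketch using the standard xml.etree.ElementTree module; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# The empty "img" element with one attribute, closed with " /".
img = ET.fromstring('<img src="computer.gif" />')

element_name = img.tag          # name of the element
attributes = dict(img.attrib)   # all name/value pairs on the start tag
src_value = img.get("src")      # value of the "src" attribute

print(element_name, attributes, src_value)
```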


PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.


CDATA also means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
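The difference can be seen by feeding the same characters to a parser twice, once as ordinary element content and once wrapped in a CDATA section. The sketch below assumes Python's standard xml.etree.ElementTree module and a made-up "message" element.

```python
import xml.etree.ElementTree as ET

# PCDATA: the parser expands &lt; and treats <b>...</b> as markup.
parsed = ET.fromstring("<message>1 &lt; 2 and <b>bold</b></message>")
parsed_text = parsed.text
parsed_children = [child.tag for child in parsed]

# CDATA section: the same characters are kept exactly as written.
literal = ET.fromstring(
    "<message><![CDATA[1 &lt; 2 and <b>bold</b>]]></message>"
)
literal_text = literal.text
literal_children = [child.tag for child in literal]

print(repr(parsed_text), parsed_children)
print(repr(literal_text), literal_children)
```

In the first case the text stops at the nested element and the entity reference becomes a "<" character; in the second case the whole string, including "&lt;" and the tags, comes back untouched.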


Entities are variables used to define common text. Entity references are references to entities. Most of you already know the HTML entity reference &nbsp;, which is used to insert an extra (non-breaking) space in an HTML document. Entities are expanded when a document is parsed by an XML parser.

The following entities are predefined in XML:

Entity reference, character, meaning:

&lt; is < (less than)
&gt; is > (greater than)
&amp; is & (ampersand)
&quot; is " (quotation mark)
&apos; is ' (apostrophe)
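A short sketch of the predefined entities in both directions, assuming Python's standard library: the parser expands the references when reading, and xml.sax.saxutils.escape produces them when writing. The "t" element and the sample string are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser expands the five predefined entity references.
expanded = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>").text

# Writing: escape() handles &, < and > by default; the extra mapping
# adds the two quote characters.
QUOTES = {'"': "&quot;", "'": "&apos;"}
escaped = escape("""5 < 6 & "x" > 'y'""", QUOTES)

print(expanded)
print(escaped)
```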

Since we do not plan to go very deep into XML coding right now, we’ll leave the data definitions here and move on to the future implications of XML.

Extensible Markup Language (XML), which complements HTML, promises to increase the benefits that can be derived from the wealth of information found today on IP networks around the world. This is because XML provides a uniform method for describing and exchanging structured data. The ability to describe structured data in an open, text-based format and deliver this data using the standard HTTP protocol is significant for two reasons.

First, XML will facilitate more precise declarations of content and more meaningful search results across multiple platforms. Second, once the data is located, XML will enable a new generation of applications for viewing and manipulating it.

Consider an industry where the interchange of data is vital, such as banking. Banks use proprietary systems to track transactions internally, but if they used a common XML format over the Web, they would be able to describe transaction information to another institution or an application (like Quicken or MS Money).

Of course, they’d also be able to present the data in a pretty Web page. FYI: This markup does exist. It’s called OFX, the Open Financial Exchange format.
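To make the exchange idea concrete, here is a rough sketch of one side building a transaction message and the other side reading it back. The "transaction", "account" and "amount" element names, the "currency" attribute, the function names and all values are hypothetical; this is not the real Open Financial Exchange format, only an illustration using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def build_transaction(account, amount, currency):
    """Sender side: describe a transaction as XML text (hypothetical format)."""
    txn = ET.Element("transaction")
    ET.SubElement(txn, "account").text = account
    amount_el = ET.SubElement(txn, "amount", currency=currency)
    amount_el.text = amount
    return ET.tostring(txn, encoding="unicode")


def read_transaction(xml_text):
    """Receiver side: recover the fields without knowing the sender's internals."""
    txn = ET.fromstring(xml_text)
    amount_el = txn.find("amount")
    return {
        "account": txn.findtext("account"),
        "amount": amount_el.text,
        "currency": amount_el.get("currency"),
    }


message = build_transaction("12345", "250.00", "USD")
received = read_transaction(message)

print(message)
print(received)
```

Because both sides only agree on the element and attribute names, either one could replace its internal system without affecting the other.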

Under certain circumstances, if IE 4 on the PC comes across a tag with the proper contents, a function is started that gives a user the opportunity to update installed software.
