Introduction to XML

What is XML: XML (eXtensible Markup Language) is a text-based markup language that lets you store data in a structured format by using meaningful tags.

 

XML is a cross-platform markup language that is independent of hardware and software. It allows computers to store data in a format that can be interpreted by any other computer system.

 

Advantages of XML:

 

· It provides a way of creating a domain-specific vocabulary.

· It allows data interchange between different computer systems.

· It enables smart searches.

· It provides user-selected views of data.

 

A sample XML file looks like this:

 

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
    <Employee>
        <Id>1</Id>
        <Name>ABC</Name>
    </Employee>
</Employees>

 

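An XML document normally begins with an XML declaration. The declaration uses the same <? ... ?> syntax as a Processing Instruction (PI) and is often called one, although the XML specification treats it as a separate construct.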

 

<?xml version="1.0" encoding="UTF-8"?>

 

The declaration tells a parser how the XML file should be processed. It contains the version information and the encoding scheme; here the version is 1.0 and the encoding scheme is UTF-8. UTF is the abbreviation of UCS (Universal Character Set) Transformation Format. UTF-8 is backward compatible with ASCII-based computing systems and can also represent every other Unicode character, so it works for text in any language. UTF-16 is an alternative encoding; if the file is saved as UTF-16, set the encoding scheme to UTF-16 in the declaration.
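As a rough illustration of how the declared encoding relates to the bytes stored in the file, the sketch below builds the same small document twice, once as UTF-8 and once as UTF-16, and parses both with Python's standard xml.etree.ElementTree module. The employee name and the helper functions encode_document and read_name are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee name containing non-ASCII characters.
NAME = "Zoë 山田"

TEMPLATE = '<?xml version="1.0" encoding="{enc}"?><Employee><Name>{name}</Name></Employee>'


def encode_document(declared, codec):
    """Build the document text and encode it with the given Python codec."""
    text = TEMPLATE.format(enc=declared, name=NAME)
    return text.encode(codec)


def read_name(xml_bytes):
    """Parse raw bytes; the parser works out the encoding from the file itself."""
    return ET.fromstring(xml_bytes).findtext("Name")


utf8_bytes = encode_document("UTF-8", "utf-8")
# Python's "utf-16" codec prepends a byte order mark, which the parser recognizes.
utf16_bytes = encode_document("UTF-16", "utf-16")

print(read_name(utf8_bytes) == NAME)
print(read_name(utf16_bytes) == NAME)
print(len(utf8_bytes), len(utf16_bytes))
```

Both byte strings carry the same name; only the stored bytes and the declared encoding differ. The declaration has to match the way the file was actually saved.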

 

In XML, tags are used to specify a name for a given piece of information. A tag is a name enclosed in opening and closing angle brackets (< and >).

 

Elements are the basic units used to identify and describe data in XML. They are the building blocks of an XML document. In the XML file above, <Employees></Employees>, <Employee></Employee>, <Id></Id> and <Name></Name> are elements. <Employees> is the root element of the XML document. <Employee> is a child element of <Employees> and the parent element of the <Id> and <Name> elements.

 

The information held by an element of an XML document is referred to as the content of that element. For example, the content of <Name>ABC</Name> is ABC.
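The root, parent, child and content relationships described above can be explored with a short script. This is a minimal sketch that uses Python's standard xml.etree.ElementTree module on the sample Employees document; the outline helper is a hypothetical name introduced for the example.

```python
import xml.etree.ElementTree as ET

# The sample document from this article, written with straight quotes.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Employees>
    <Employee>
        <Id>1</Id>
        <Name>ABC</Name>
    </Employee>
</Employees>"""

root = ET.fromstring(SAMPLE)  # the root element, <Employees>


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(root)))

# The content of an element is available as its text.
employee = root.find("Employee")
print(employee.findtext("Id"))
print(employee.findtext("Name"))
```

Iterating over an element yields its direct children, which is why the outline mirrors the nesting of the tags in the file.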

 

You can specify attributes for each element. Attributes provide additional information about the element for which they are declared. An attribute consists of a name-value pair.

 

<Employee Id="1">ABC</Employee>

 

In the example above, Id is an attribute of the <Employee> element and its value is “1”.
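To see how a program might read the name-value pair, here is a small sketch using Python's standard xml.etree.ElementTree module. The Dept attribute and its fallback value are hypothetical and only show what happens when an attribute is missing.

```python
import xml.etree.ElementTree as ET

employee = ET.fromstring('<Employee Id="1">ABC</Employee>')

# Attribute values are always strings, even when they look like numbers.
employee_id = employee.get("Id")

# All attributes of the element as a dictionary of name-value pairs.
attributes = dict(employee.attrib)

# A missing attribute returns the supplied default instead of raising an error.
department = employee.get("Dept", "unknown")

print(employee_id, attributes, department, employee.text)
```

The attribute is kept separate from the element content: the value of Id comes from the attribute, while ABC is still the text of the element.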

 

You can add comments to an XML document in the following format.

 

<!--This is the comment format for an XML document -->
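Comments are meant for human readers, and many parsers skip them. As one illustration, the sketch below assumes Python 3.8 or later and its standard xml.etree.ElementTree module, which discards comments by default and keeps them only when asked to. The document text is a hypothetical example.

```python
import xml.etree.ElementTree as ET

DOC = "<Employees><!--This is a comment--><Employee>ABC</Employee></Employees>"

# Default behaviour: the comment does not appear in the parsed tree.
plain_root = ET.fromstring(DOC)
plain_tags = [child.tag for child in plain_root]

# Opt in to keeping comments (the insert_comments option needs Python 3.8+).
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
kept_root = ET.fromstring(DOC, parser=keeping_parser)
comments = [child.text for child in kept_root if child.tag is ET.Comment]

print(plain_tags)
print(comments)
```

Other XML libraries may treat comments differently, so this behaviour should be checked for the parser actually in use.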

 

Some characters cannot be used directly in the content of an XML document because they have a special meaning in the markup, so they must be written in a different way. For example, you cannot use the “<” symbol directly in content; you have to write “&lt;” instead. The table below lists the XML formats to use for different symbols.

 

Symbol    Format to use in XML
<         &lt;
>         &gt;
&         &amp;
"         &quot;
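In practice these replacements are rarely typed by hand. The following sketch shows one way to apply and reverse them with Python's standard library; the sample sentence is hypothetical. The escape and unescape functions from xml.sax.saxutils handle <, > and & by default, and the double quote is passed in as an extra mapping.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = 'if a < b & b > c then say "ok"'

# Replace the special symbols with their XML formats.
escaped = escape(raw, {'"': "&quot;"})

# Reverse the replacement to get the original text back.
restored = unescape(escaped, {"&quot;": '"'})

# An XML library does the same work automatically when writing and reading.
note = ET.Element("Note")
note.text = raw
serialized = ET.tostring(note, encoding="unicode")
parsed_back = ET.fromstring(serialized).text

print(escaped)
print(serialized)
```

When a document is built through an XML library, as in the second half of the sketch, the library escapes content on output and restores it on parsing, so manual escaping is mainly needed when XML text is assembled as plain strings.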