Accessing and Manipulating XML Data

Learn how to access and manipulate XML data in preparation for your MCAD Exam 70-310.

Terms you'll need to understand:

  • DiffGram
  • Document Object Model (DOM)
  • Document Type Definition (DTD)
  • Valid XML
  • Well-formed XML
  • XPath

Techniques you'll need to master:

  • Retrieving information from XML files by using the Document Object Model, XmlReader class, XmlDocument class, and XmlNode class
  • Synchronizing DataSet data with XML via the XmlDataDocument class
  • Executing XML queries with XPath and the XPathNavigator class
  • Validating XML against XML Schema Definition (XSD) and Document Type Definition (DTD) files
  • Generating XML from SQL Server databases
  • Updating SQL Server databases with DiffGrams

You can't use the .NET Framework effectively unless you're familiar with XML. XML is pervasive in .NET, and it's especially important for the distributed applications covered on the 70-310 exam. The System.Xml namespace contains classes to parse, validate, and manipulate XML. You can read and write XML, use XPath to navigate through an XML document, or check to see whether a particular document is valid XML by using the objects in this namespace.

NOTE

In this chapter, I've assumed that you're already familiar with the basics of XML, such as elements and attributes. If you need a refresher course on XML basics, refer to Appendix B, "XML Standards and Syntax."

Accessing an XML File

In this section, you'll learn how to extract information from an XML file. I'll start by showing you how you can use the XmlReader object to move through an XML file, extracting information as you go. Then you'll see how other objects, including the XmlNode and XmlDocument objects, provide a more structured view of an XML file.

I'll work with a very simple XML file named Books.xml that represents three books a computer bookstore might stock. Here's the raw XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Books>
 <Book Pages="1109">
  <Author>Gunderloy, Mike</Author>
  <Title>Exam 70-306 Training Guide</Title>
  <Publisher>Que</Publisher>
 </Book>
 <Book Pages="357">
  <Author>Wildermuth, Shawn</Author>
  <Title>Pragmatic ADO.NET</Title>
  <Publisher>Addison-Wesley</Publisher>
 </Book>
 <Book Pages="484">
  <Author>Burton, Kevin</Author>
  <Title>.NET Common Language Runtime Unleashed</Title>
  <Publisher>Sams</Publisher>
 </Book>
</Books>

Understanding the DOM

The Document Object Model, or DOM, is an Internet standard for representing the information contained in an HTML or XML document as a tree of nodes. Like many other Internet standards, the DOM is an official standard of the World Wide Web Consortium, better known as the W3C. You can find it at http://www.w3.org/DOM.

In its simplest form, the DOM defines an XML document as a tree of nodes. The root element in the XML file becomes the root node of the tree, and other elements become child nodes. The DOM provides the standard for constructing this tree, including a classification for individual nodes and rules for which nodes can have children.

TIP

In the DOM, attributes are not represented as nodes within the tree. Rather, attributes are considered to be properties of their parent elements. You'll see later in the chapter that this is reflected in the classes provided by the .NET Framework for reading XML files.
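To make the distinction concrete, here is a minimal sketch that uses the XmlDocument class (covered later in this chapter) to load a one-book fragment and compare the element's child nodes with its attributes. The module name DomAttributeDemo and the helper DescribeRoot are hypothetical names chosen for illustration; they are not part of the .NET Framework.

```vb
Imports System
Imports System.Xml

Module DomAttributeDemo
    ' Hypothetical helper: returns "<child node count>/<attribute count>"
    ' for the root element of the supplied XML string.
    Function DescribeRoot(ByVal xml As String) As String
        Dim xd As New XmlDocument()
        xd.LoadXml(xml)
        Dim root As XmlElement = xd.DocumentElement
        ' Attributes are reached through the Attributes collection,
        ' not through ChildNodes
        Return root.ChildNodes.Count.ToString() & "/" & _
            root.Attributes.Count.ToString()
    End Function

    Sub Main()
        ' One child element (Title) and one attribute (Pages)
        Console.WriteLine(DescribeRoot( _
            "<Book Pages=""357""><Title>Pragmatic ADO.NET</Title></Book>"))
    End Sub
End Module
```

With this input the program prints 1/1: the Title element is the only child node, and the Pages attribute appears only in the Attributes collection.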

Using an XmlReader Object

The XmlReader class is designed to provide forward-only, read-only access to an XML file. This class treats an XML file much as a forward-only cursor treats a resultset from a database. At any given time, there is one current node within the XML file, represented by a pointer that you can advance through the file. The class implements a Read method that returns the next XML node to the calling application. The XmlReader class has many other members, as shown in Table 3.1.

Table 3.1 Important Members of the XmlReader Class

| Member | Type | Description |
| --- | --- | --- |
| Depth | Property | The depth of the current node in the XML document |
| EOF | Property | A Boolean property that is True when the current node pointer is at the end of the XML file |
| GetAttribute | Method | Gets the value of an attribute |
| HasAttributes | Property | True when the current node contains attributes |
| HasValue | Property | True when the current node is a type that has a Value property |
| IsEmptyElement | Property | True when the current node represents an empty XML element |
| IsStartElement | Method | Determines whether the current node is a start tag |
| Item | Property | An indexed collection of attributes for the current node (if any) |
| MoveToElement | Method | Moves to the element containing the current attribute |
| MoveToFirstAttribute | Method | Moves to the first attribute of the current element |
| MoveToNextAttribute | Method | Moves to the next attribute |
| Name | Property | The qualified name of the current node |
| NodeType | Property | The type of the current node |
| Read | Method | Reads the next node from the XML file |
| Skip | Method | Skips the children of the current element |
| Value | Property | The value of the current node |


The XmlReader class is a purely abstract class. That is, this class is marked with the MustInherit modifier; you cannot create an instance of XmlReader in your own application. Generally, you'll use the XmlTextReader class instead. The XmlTextReader class implements XmlReader for use with text streams. Here's how you might use this class to dump the nodes of an XML file to a ListBox control:

' Requires Imports System.Xml at the top of the code file
Private Sub btnReadXml_Click( _
 ByVal sender As System.Object, _
 ByVal e As System.EventArgs) Handles btnReadXML.Click
 Dim intI As Integer
 Dim strNode As String
 ' Create a new XmlTextReader on the file
 Dim xtr As XmlTextReader = _
  New XmlTextReader("Books.xml")
 ' Walk through the entire XML file
 Do While xtr.Read
  ' Only display element and text nodes
  If (xtr.NodeType = XmlNodeType.Element) OrElse _
   (xtr.NodeType = XmlNodeType.Text) Then
   strNode = ""
   For intI = 1 To xtr.Depth
    strNode &= " "
   Next
   strNode = strNode & xtr.Name & " "
   strNode &= xtr.NodeType.ToString
   If xtr.HasValue Then
    strNode = strNode & ": " & xtr.Value
   End If
   lbNodes.Items.Add(strNode)
   ' Now add the attributes, if any
   If xtr.HasAttributes Then
    While xtr.MoveToNextAttribute
     strNode = ""
     For intI = 1 To xtr.Depth
      strNode &= " "
     Next
     strNode = strNode & xtr.Name & " "
     strNode &= xtr.NodeType.ToString
     If xtr.HasValue Then
      strNode = strNode & ": " & _
       xtr.Value
     End If
     lbNodes.Items.Add(strNode)
    End While
   End If
  End If
 Loop
 ' Clean up
 xtr.Close()
End Sub

Figure 3.1 shows the view of the sample Books.xml file produced by this code.

Figure 3.1 An XML file translated into schematic form by an XmlTextReader object.

NOTE

This and other examples in this chapter assume that the XML file is located in the bin folder of your Visual Basic .NET project.

The DOM includes nodes for everything in the XML file, including the XML declaration and any whitespace (such as the line feeds and carriage returns that separate lines of the files). On the other hand, the node tree doesn't include XML attributes, though you can retrieve them from the parent elements. However, the DOM and the XmlTextReader are flexible enough that you can customize their work as you like. Note the use of the NodeType property and the MoveToNextAttribute method in this example to display just the elements, text nodes, and attributes from the file.

CAUTION

Alternatively, you can retrieve attributes by using the Item property of the XmlTextReader. If the current node represents an element in the XML file, the following code will retrieve the value of the first attribute of the element:

xtr.Item(0)

This code will retrieve the value of an attribute named Pages:

xtr.Item("Pages")
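As a further illustration of reading attributes by name, the following sketch totals the Pages attribute across all Book elements. It uses the GetAttribute method from Table 3.1 and reads from a string rather than from Books.xml so that it stands alone. The module name PageCounter and the helper TotalPages are hypothetical, and the code assumes that every Pages value is a whole number.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module PageCounter
    ' Hypothetical helper: sums the Pages attribute of every Book element.
    ' Assumes each Pages value parses as an Integer.
    Function TotalPages(ByVal xml As String) As Integer
        Dim total As Integer = 0
        Dim xtr As New XmlTextReader(New StringReader(xml))
        Try
            Do While xtr.Read()
                If xtr.NodeType = XmlNodeType.Element AndAlso _
                 xtr.Name = "Book" Then
                    ' GetAttribute returns Nothing if the attribute is absent
                    Dim pages As String = xtr.GetAttribute("Pages")
                    If Not pages Is Nothing Then
                        total += Integer.Parse(pages)
                    End If
                End If
            Loop
        Finally
            ' Release the reader even if parsing fails
            xtr.Close()
        End Try
        Return total
    End Function

    Sub Main()
        Dim xml As String = "<Books><Book Pages=""1109""/>" & _
            "<Book Pages=""357""/><Book Pages=""484""/></Books>"
        Console.WriteLine(TotalPages(xml))
    End Sub
End Module
```

For the three page counts used in Books.xml (1109, 357, and 484), the program prints 1950. Because GetAttribute leaves the reader positioned on the element, there is no need to call MoveToElement before the next Read.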

The XmlNode Class

The code you saw in the previous example deals with nodes as part of a stream of information returned by the XmlTextReader object. But the .NET Framework also includes another class, XmlNode, that can be used to represent an individual node from the DOM representation of an XML document. If you obtain an XmlNode object that represents a particular portion of an XML document, you can alter the properties of the object and then write the changes back to the original file. The DOM provides two-way access to the underlying XML in this case.

NOTE

In addition to XmlNode, the System.Xml namespace also contains a set of classes that represent particular types of nodes: XmlAttribute, XmlComment, XmlElement, and so on. These classes all inherit from the XmlNode class.

The XmlNode class has a rich interface of properties and methods. You can retrieve or set information about the entity represented by an XmlNode object, or you can use its methods to navigate the DOM. Table 3.2 shows the important members of the XmlNode class.

Table 3.2 Important Members of the XmlNode Class

| Member | Type | Description |
| --- | --- | --- |
| AppendChild | Method | Adds a new child node to the end of this node's list of children |
| Attributes | Property | Returns the attributes of the node as an XmlAttributeCollection |
| ChildNodes | Property | Returns all child nodes of this node |
| CloneNode | Method | Creates a duplicate of this node |
| FirstChild | Property | Returns the first child node of this node |
| HasChildNodes | Property | True if this node has any children |
| InnerText | Property | The value of the node and all its children |
| InnerXml | Property | The markup representing only the children of this node |
| InsertAfter | Method | Inserts a new node immediately after a specified child of this node |
| InsertBefore | Method | Inserts a new node immediately before a specified child of this node |
| LastChild | Property | Returns the last child node of this node |
| Name | Property | The name of the node |
| NextSibling | Property | Returns the next child of this node's parent node |
| NodeType | Property | The type of this node |
| OuterXml | Property | The markup representing this node and its children |
| OwnerDocument | Property | The XmlDocument object that contains this node |
| ParentNode | Property | Returns the parent of this node |
| PrependChild | Method | Adds a new child node to the beginning of this node's list of children |
| PreviousSibling | Property | Returns the previous child of this node's parent node |
| RemoveAll | Method | Removes all children and attributes of this node |
| RemoveChild | Method | Removes a specified child of this node |
| ReplaceChild | Method | Replaces a child of this node with a new node |
| SelectNodes | Method | Selects a group of nodes matching an XPath expression |
| SelectSingleNode | Method | Selects the first node matching an XPath expression |
| WriteContentTo | Method | Writes all children of this node to an XmlWriter object |
| WriteTo | Method | Writes this node to an XmlWriter |
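A few of these members can be combined to locate a node and change it in place. The sketch below is one possible approach, not the only one: it uses SelectSingleNode to find a book by title, sets InnerText on the Publisher child, and returns the OuterXml of the modified Book node. The names NodeEditDemo and ChangePublisher are hypothetical, and the code assumes that the title contains no apostrophe, because the title is pasted directly into the XPath expression.

```vb
Imports System
Imports System.Xml

Module NodeEditDemo
    ' Hypothetical helper: changes the publisher of the first Book whose
    ' Title matches, then returns the markup of that Book node.
    ' Returns an empty string if no matching Book or Publisher exists.
    ' Assumes the title contains no apostrophe.
    Function ChangePublisher(ByVal xml As String, _
     ByVal title As String, ByVal newPublisher As String) As String
        Dim xd As New XmlDocument()
        xd.LoadXml(xml)
        ' XPath: the Book element whose Title child has the given text
        Dim book As XmlNode = _
            xd.SelectSingleNode("/Books/Book[Title='" & title & "']")
        If book Is Nothing Then
            Return ""
        End If
        Dim publisher As XmlNode = book.SelectSingleNode("Publisher")
        If publisher Is Nothing Then
            Return ""
        End If
        ' Setting InnerText replaces the node's text content
        publisher.InnerText = newPublisher
        Return book.OuterXml
    End Function

    Sub Main()
        Dim xml As String = "<Books><Book><Title>Pragmatic ADO.NET</Title>" & _
            "<Publisher>Que</Publisher></Book></Books>"
        Console.WriteLine( _
            ChangePublisher(xml, "Pragmatic ADO.NET", "Addison-Wesley"))
    End Sub
End Module
```

The change affects only the in-memory tree. To persist it, you would call the Save method of the owning XmlDocument, which the next section describes.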


The XmlDocument Class

You can't directly create an XmlNode object that represents an entity from a particular XML document. Instead, you can retrieve XmlNode objects from an XmlDocument object. The XmlDocument object represents an entire XML document. By combining the XmlNode and XmlDocument objects, you can navigate through the DOM representation of an XML document. For example, you can recursively dump the contents of an XML file to a ListBox control with this code:

' Requires Imports System.Xml at the top of the code file
Private Sub btnReadXML_Click( _
 ByVal sender As System.Object, _
 ByVal e As System.EventArgs) Handles btnReadXML.Click
 ' Create a new XmlTextReader on the file
 Dim xtr As XmlTextReader = _
  New XmlTextReader("Books.xml")
 ' Load the XML file to an XmlDocument
 xtr.WhitespaceHandling = WhitespaceHandling.None
 Dim xd As XmlDocument = New XmlDocument()
 xd.Load(xtr)
 ' Get the document root
 Dim xnodRoot As XmlNode = xd.DocumentElement
 ' Walk the tree and display it
 Dim xnodWorking As XmlNode
 If xnodRoot.HasChildNodes Then
  xnodWorking = xnodRoot.FirstChild
  While Not IsNothing(xnodWorking)
   AddChildren(xnodWorking, 0)
   xnodWorking = xnodWorking.NextSibling
  End While
 End If
 ' Clean up
 xtr.Close()
End Sub

Private Sub AddChildren(ByVal xnod As XmlNode, _
 ByVal Depth As Integer)
 ' Add this node to the listbox
 Dim strNode As String
 Dim intI As Integer
 Dim intJ As Integer
 Dim atts As XmlAttributeCollection
 ' Only process Text and Element nodes
 If (xnod.NodeType = XmlNodeType.Element) Or _
  (xnod.NodeType = XmlNodeType.Text) Then
  strNode = ""
  For intI = 1 To Depth
   strNode &= " "
  Next
  strNode = strNode & xnod.Name & " "
  strNode &= xnod.NodeType.ToString
  strNode = strNode & ": " & xnod.Value
  lbNodes.Items.Add(strNode)
  ' Now add the attributes, if any
  atts = xnod.Attributes
  If Not atts Is Nothing Then
   For intJ = 0 To atts.Count - 1
    strNode = ""
    For intI = 1 To Depth + 1
     strNode &= " "
    Next
    strNode = strNode & _
     atts(intJ).Name & " "
    strNode &= atts(intJ).NodeType.ToString
    strNode = strNode & ": " & _
     atts(intJ).Value
    lbNodes.Items.Add(strNode)
   Next
  End If
  ' And recursively walk
  ' the children of this node
  Dim xnodworking As XmlNode
  If xnod.HasChildNodes Then
   xnodworking = xnod.FirstChild
   While Not IsNothing(xnodworking)
    AddChildren(xnodworking, Depth + 1)
    xnodworking = xnodworking.NextSibling
   End While
  End If
 End If
End Sub

The XmlDocument class includes a number of other useful members. Table 3.3 lists the most important of these.

Table 3.3 Important Members of the XmlDocument Class

| Member | Type | Description |
| --- | --- | --- |
| CreateAttribute | Method | Creates an attribute node |
| CreateElement | Method | Creates an element node |
| CreateNode | Method | Creates an XmlNode object |
| DocumentElement | Property | Returns the root XmlNode for this document |
| DocumentType | Property | Returns the node containing the DTD declaration for this document, if it has one |
| ImportNode | Method | Imports a node from another XML document |
| Load | Method | Loads an XML document into the XmlDocument |
| LoadXml | Method | Loads the XmlDocument from a string of XML data |
| NodeChanged | Event | Fires after the value of a node has been changed |
| NodeChanging | Event | Fires when the value of a node is about to be changed |
| NodeInserted | Event | Fires when a new node has been inserted |
| NodeInserting | Event | Fires when a new node is about to be inserted |
| NodeRemoved | Event | Fires when a node has been removed |
| NodeRemoving | Event | Fires when a node is about to be removed |
| PreserveWhitespace | Property | True if whitespace in the document should be preserved when loading or saving the XML |
| Save | Method | Saves the XmlDocument as a file or stream |
| WriteTo | Method | Saves the XmlDocument to an XmlWriter |
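The creation methods in Table 3.3 work together with the XmlNode methods to build new content. As a minimal sketch, the following code adds a new Book element, with a Pages attribute and a Title child, to an existing document. The names AddBookDemo and AddBook are hypothetical, and the sketch returns the resulting markup as a string instead of calling Save so that it does not depend on a file.

```vb
Imports System
Imports System.Xml

Module AddBookDemo
    ' Hypothetical helper: appends a Book element to the document's root
    ' element and returns the markup of the whole document.
    Function AddBook(ByVal xml As String, _
     ByVal title As String, ByVal pages As Integer) As String
        Dim xd As New XmlDocument()
        xd.LoadXml(xml)
        ' New nodes must be created by the document that will own them
        Dim book As XmlElement = xd.CreateElement("Book")
        Dim att As XmlAttribute = xd.CreateAttribute("Pages")
        att.Value = pages.ToString()
        book.Attributes.Append(att)
        Dim titleNode As XmlElement = xd.CreateElement("Title")
        titleNode.InnerText = title
        book.AppendChild(titleNode)
        ' Attach the finished Book to the root element
        xd.DocumentElement.AppendChild(book)
        Return xd.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(AddBook("<Books/>", "Pragmatic ADO.NET", 357))
    End Sub
End Module
```

Run against an empty Books element, this prints <Books><Book Pages="357"><Title>Pragmatic ADO.NET</Title></Book></Books>. In a real application you would typically finish with xd.Save and a file name to write the modified document back to disk.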

