

What is the xml.etree Library?

The etree (ElementTree) library is part of the Python standard library and contains tools that make it simple to parse an XML document and pull information out of it. Other libraries can also parse XML, but etree is commonly used and easy to get started with. It breaks the XML up into easily accessible elements, each representing a single node in the XML tree. For more information on using the etree library beyond the scope of this page, see the Python documentation.


Using the xml.etree Library

There are two ways to load XML data with etree, depending on how the data is stored: it can parse an XML file given its filepath, or it can parse a string. Notice that regardless of how we load the XML, we end up with a root object.

Python - Reading a File
# The library must first be imported no matter how we pull in the data.
import xml.etree.ElementTree as ET

# Grab the filepath using Ignition's built-in openFile function.
# openFile returns None if the user cancels the dialog, so check before parsing.
filepath = system.file.openFile()
if filepath is not None:
	# Parse the file into a tree, then grab the root element.
	tree = ET.parse(filepath)
	root = tree.getroot()
Python - Reading from a String
# The library must first be imported no matter how we pull in the data.
import xml.etree.ElementTree as ET

# Alternately, we can start with a string of the xml data.
xmlString = """
<employee id="1234">
	<name>John Smith</name>
	<start_date>2010-11-26</start_date>
	<department>IT</department>
	<title>Tech Support</title>
</employee>
"""

# Then parse through the string using a different function that takes us straight to the root element.
root = ET.fromstring(xmlString)
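
As a quick check that both routes lead to the same place, the sketch below writes a shortened version of the employee XML to a temporary file, parses it with ET.parse, and compares the result with ET.fromstring. The temporary file is an assumption made so the example runs outside Ignition; it stands in for a path chosen through system.file.openFile. print() is called with a single argument so the code behaves the same in the Python 2 style used on this page and in Python 3.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

xmlString = """<employee id="1234">
    <name>John Smith</name>
    <department>IT</department>
</employee>"""

# Route 1: parse the string directly; fromstring returns the root Element.
rootFromString = ET.fromstring(xmlString)

# Route 2: write the same XML to a temporary file (a stand-in for a
# user-selected path), parse it into a tree, then ask the tree for its root.
handle, filepath = tempfile.mkstemp(suffix=".xml")
try:
    tempFile = os.fdopen(handle, "w")
    tempFile.write(xmlString)
    tempFile.close()
    rootFromFile = ET.parse(filepath).getroot()
finally:
    # Clean up the temporary file whether or not parsing succeeded.
    os.remove(filepath)

# Both routes produce an equivalent root element.
sameTag = rootFromString.tag == rootFromFile.tag
sameAttributes = rootFromString.attrib == rootFromFile.attrib
print("Same tag: %s, same attributes: %s" % (sameTag, sameAttributes))
```

The only practical difference is the extra getroot() call: ET.parse returns an ElementTree wrapper around the document, while ET.fromstring hands back the root Element directly.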


Each tag is represented as an Element object. In the example above, the root element is the employee tag. Elements can have attributes, which are written inside the start tag itself; here the employee element has an attribute id with the value 1234. Each element can also hold text, which sits between the element's start and end tags. The employee element has no text of its own apart from the whitespace used for indentation, but its children do: the name element has the text John Smith. All of this data can be accessed through the Element object's built-in properties and behaviors. The major ones are listed below, and each example uses the root from the "Reading from a String" example above.

Function: Element.tag
Description: Returns the name of the Element's tag.
Example:
print root.tag
Output:
employee

Function: Element.attrib
Description: Returns a dictionary of the Element's attributes.
Example:
print root.attrib
Output:
{'id': '1234'}

Function: Element.text
Description: Returns the text of the Element. The example here prints nothing visible, because the only text in the root is the whitespace before its first child. The next example uses children, which do have text.
Example:
print root.text
Output:
(blank; only whitespace is printed)

Function: for child in Element
Description: Iterates through the Element's direct children. Each child is its own Element, complete with tag, attrib, and text properties.
Example:
for child in root:
	print child.tag, child.text
Output:
name John Smith
start_date 2010-11-26
department IT
title Tech Support

Function: Element[index]
Description: Allows direct reference to an Element's children by zero-based index. Since tags can be nested many times, deeper children can be reached by chaining additional indexes in square brackets as many times as necessary, for example Element[1][4][0]. Starting from the original element, this goes to the child at index 1 (the second child), then to that child's child at index 4, and finally to that element's child at index 0. Child elements referenced this way still have the tag, attrib, and text properties.
Example:
print root[2].tag
print root[3].text
Output:
department
Tech Support
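
The properties described above can be tried together in one short script. This is a minimal sketch rather than part of the original examples: the company document and its element names are made up for illustration so that chained indexing has two levels to walk through. print() is called with a single argument so the code runs the same way in Python 2 and Python 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-level document, used only to demonstrate nested indexing.
companyXml = """<company name="Acme">
    <employee id="1234">
        <name>John Smith</name>
        <department>IT</department>
    </employee>
    <employee id="5678">
        <name>Jane Doe</name>
        <department>HR</department>
    </employee>
</company>"""

root = ET.fromstring(companyXml)

# tag and attrib describe the element itself.
print(root.tag)             # company
print(root.attrib["name"])  # Acme

# root.text holds only the whitespace before the first child,
# so stripping it leaves an empty string.
rootTextIsBlank = root.text.strip() == ""
print(rootTextIsBlank)      # True

# Iterating yields the direct children only, not grandchildren.
for employee in root:
    print("%s %s" % (employee.tag, employee.attrib["id"]))

# Chained indexes walk down the tree: the second employee (index 1),
# then that employee's first child (index 0).
secondName = root[1][0]
print("%s: %s" % (secondName.tag, secondName.text))  # name: Jane Doe
```

Indexing is convenient when the layout of the document is fixed, but it depends on the order of the children, so a document that adds or reorders tags would need the indexes updated.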


A Simple Book Example

Using the functions above, we can now parse an XML file and put the results to use. Let's keep it simple: parse a document and place the values into a table. First we need an XML document. One is provided below in string form for you to test with; it needs to be pasted at the top of the script.


XML String
document = """
<catalog>
	<book id="bk101">
		<author>Gambardella, Matthew</author>
		<title>XML Developer's Guide</title>
		<genre>Computer</genre>
		<price>44.95</price>
		<publish_date>2000-10-01</publish_date>
		<description>An in-depth look at creating applications 
		with XML.</description>
	</book>
	<book id="bk102">
		<author>Ralls, Kim</author>
		<title>Midnight Rain</title>
		<genre>Fantasy</genre>
		<price>5.95</price>
		<publish_date>2000-12-16</publish_date>
		<description>A former architect battles corporate zombies, 
		an evil sorceress, and her own childhood to become queen 
		of the world.</description>
	</book>
	<book id="bk103">
		<author>Corets, Eva</author>
		<title>Maeve Ascendant</title>
		<genre>Fantasy</genre>
		<price>5.95</price>
		<publish_date>2000-11-17</publish_date>
		<description>After the collapse of a nanotechnology 
		society in England, the young survivors lay the 
		foundation for a new society.</description>
	</book>
	<book id="bk104">
		<author>Corets, Eva</author>
		<title>Oberon's Legacy</title>
		<genre>Fantasy</genre>
		<price>5.95</price>
		<publish_date>2001-03-10</publish_date>
		<description>In post-apocalypse England, the mysterious 
		agent known only as Oberon helps to create a new life 
		for the inhabitants of London. Sequel to Maeve 
		Ascendant.</description>
	</book>
	<book id="bk105">
		<author>Corets, Eva</author>
		<title>The Sundered Grail</title>
		<genre>Fantasy</genre>
		<price>5.95</price>
		<publish_date>2001-09-10</publish_date>
		<description>The two daughters of Maeve, half-sisters, 
		battle one another for control of England. Sequel to 
		Oberon's Legacy.</description>
	</book>
	<book id="bk106">
		<author>Randall, Cynthia</author>
		<title>Lover Birds</title>
		<genre>Romance</genre>
		<price>4.95</price>
		<publish_date>2000-09-02</publish_date>
		<description>When Carla meets Paul at an ornithology 
		conference, tempers fly as feathers get ruffled.</description>
	</book>
	<book id="bk107">
		<author>Thurman, Paula</author>
		<title>Splish Splash</title>
		<genre>Romance</genre>
		<price>4.95</price>
		<publish_date>2000-11-02</publish_date>
		<description>A deep sea diver finds true love twenty 
		thousand leagues beneath the sea.</description>
	</book>
	<book id="bk108">
		<author>Knorr, Stefan</author>
		<title>Creepy Crawlies</title>
		<genre>Horror</genre>
		<price>4.95</price>
		<publish_date>2000-12-06</publish_date>
		<description>An anthology of horror stories about roaches,
		centipedes, scorpions  and other insects.</description>
	</book>
	<book id="bk109">
		<author>Kress, Peter</author>
		<title>Paradox Lost</title>
		<genre>Science Fiction</genre>
		<price>6.95</price>
		<publish_date>2000-11-02</publish_date>
		<description>After an inadvertant trip through a Heisenberg
		Uncertainty Device, James Salway discovers the problems 
		of being quantum.</description>
	</book>
	<book id="bk110">
		<author>O'Brien, Tim</author>
		<title>Microsoft .NET: The Programming Bible</title>
		<genre>Computer</genre>
		<price>36.95</price>
		<publish_date>2000-12-09</publish_date>
		<description>Microsoft's .NET initiative is explored in 
		detail in this deep programmer's reference.</description>
	</book>
	<book id="bk111">
		<author>O'Brien, Tim</author>
		<title>MSXML3: A Comprehensive Guide</title>
		<genre>Computer</genre>
		<price>36.95</price>
		<publish_date>2000-12-01</publish_date>
		<description>The Microsoft MSXML3 parser is covered in 
		detail, with attention to XML DOM interfaces, XSLT processing, 
		SAX and more.</description>
	</book>
	<book id="bk112">
		<author>Galos, Mike</author>
		<title>Visual Studio 7: A Comprehensive Guide</title>
		<genre>Computer</genre>
		<price>49.95</price>
		<publish_date>2001-04-16</publish_date>
		<description>Microsoft Visual Studio 7 is explored in depth,
		looking at how Visual Basic, Visual C++, C#, and ASP+ are 
		integrated into a comprehensive development 
		environment.</description>
	</book>
</catalog>
"""

We can then place a Table component and a Button component on the window, and place this script on the Button's actionPerformed event.

Python - Complete XML Parsing
# Start by importing the library
import xml.etree.ElementTree as ET

######
# Here is where you would paste in the document string.
# Simply remove this comment, and paste in the document string.
######

# We can then parse the string into useable elements.
root = ET.fromstring(document)

# This creates empty header and row lists that we will add to later.
# These are used to create the dataset that will go into the Table.
# We could fill in the names of the headers beforehand, since we know what each will be.
# However, this allows us to add or remove children keys, and the script will automatically adjust.
headers = []
rows = []

# Now we can loop through each child of the root.
# Since the root is catalog, each child element is an individual book.
# For each book we create an empty list that will hold all of that book's data as a single row.
for child in root:
	oneRow = []

	# Check if the book has any attributes.
	if child.attrib != {}:

		# If it does contain attributes, we want to loop through all of them.
		for key in child.attrib:

			# Since we only want to add each attribute to our header list once, first check if it is there.
			# If it isn't, add it.
			if key not in headers:
				headers.append(key)

			# Add the attribute value to the oneRow list.
			oneRow.append(child.attrib[key])

	# Loop through the children of the book.
	for child2 in child:

		# Similar to above, we check if the tag is present in the header list before adding it.
		if child2.tag not in headers:
			headers.append(child2.tag)

		# We can then add the text of the Element to the oneRow list.
		oneRow.append(child2.text)

	# Finally, we want to add the oneRow list to our list of rows.
	rows.append(oneRow)

# Once the loop is complete, this will print out the headers and rows list so we can manually check them in the console.
print headers 
print rows

# Convert to a dataset, and insert into the Table.
data = system.dataset.toDataSet(headers, rows)
event.source.parent.getComponent('Table').data = data
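
The script above appends values in document order, so it relies on every book having the same attributes and children in the same order. If one book were missing a tag, its row would be shorter than the header list and the columns would no longer line up. One way to guard against this is sketched below: collect each book into a dictionary first, then build the rows from the finished header list. The function name buildTable, the shortened sample catalog, and the choice of an empty string as the placeholder for missing values are all assumptions made for illustration. In Ignition, the returned lists would be passed to system.dataset.toDataSet as in the script above.

```python
import xml.etree.ElementTree as ET

def buildTable(root, placeholder=""):
    """Return (headers, rows) with every row aligned to the header list."""
    headers = []
    records = []
    for book in root:
        # Gather attributes and child text into one dictionary per book.
        record = dict(book.attrib)
        for field in book:
            record[field.tag] = field.text
        # Remember each new key once, in first-seen order.
        for key in list(book.attrib) + [field.tag for field in book]:
            if key not in headers:
                headers.append(key)
        records.append(record)
    # Build rows only after all headers are known, so short books get padded.
    rows = [[record.get(key, placeholder) for key in headers] for record in records]
    return headers, rows

# Hypothetical catalog in which the first book has no genre tag.
sampleCatalog = """<catalog>
    <book id="bk101"><author>Gambardella, Matthew</author><price>44.95</price></book>
    <book id="bk102"><author>Ralls, Kim</author><genre>Fantasy</genre><price>5.95</price></book>
</catalog>"""

headers, rows = buildTable(ET.fromstring(sampleCatalog))
print(headers)
for row in rows:
    print(row)
```

Because the rows are looked up by header name rather than by position, the first book gets the placeholder in the genre column instead of shifting its price into the wrong place.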

