Commit 504f6810, authored Aug 18, 2012 by Eli Bendersky
Issue #15586: porting ET's new documentation bits to 2.7. Patch by Daniel Ellis
Parent: e6723d63
Showing 1 changed file with 308 additions and 7 deletions.
Doc/library/xml.etree.elementtree.rst
...
...
@@ -46,11 +46,313 @@ the xml.etree.ElementTree.
`Introducing ElementTree 1.3 <http://effbot.org/zone/elementtree-13-intro.htm>`_.
Tutorial
--------
This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.
XML tree and elements
^^^^^^^^^^^^^^^^^^^^^
XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose -
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.
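The relationship between the two classes can be sketched in a few lines. This
is only an illustration: the ``data`` tag is borrowed from the sample document
used later in this tutorial, and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# A single node: an Element with the (illustrative) tag 'data'.
root = ET.Element('data')

# The whole document: an ElementTree wrapping that root element.
tree = ET.ElementTree(root)

# getroot() hands back the very same Element object.
same_root = tree.getroot()
is_same = same_root is root
```

Document-level operations such as ``tree.write()`` live on the wrapper, while
tag, text and attribute access happen on ``root``.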
.. _elementtree-parsing-xml:
Parsing XML
^^^^^^^^^^^
We'll be using the following XML document as the sample data for this section:
.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We have a number of ways to import the data. Reading the file from disk::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Reading the data from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions may
create an :class:`ElementTree`. Check the documentation to be sure.
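The difference in return types can be checked without a file on disk. The
sketch below assumes a tiny stand-in document and uses ``io.BytesIO`` as the
file object; neither is part of the tutorial's sample data.

```python
import io
import xml.etree.ElementTree as ET

# A small stand-in document (not the full sample above).
xml_bytes = b'<data><country name="Panama"/></data>'

# parse() accepts a filename or a file object and returns an ElementTree.
tree = ET.parse(io.BytesIO(xml_bytes))
root_from_tree = tree.getroot()

# fromstring() returns the root Element directly, with no tree wrapper.
root_from_string = ET.fromstring(xml_bytes)

tree_type_name = type(tree).__name__
```

Code that needs ``write()`` after ``fromstring()`` can wrap the result with
``ET.ElementTree(root_from_string)``.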
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print child.tag, child.attrib
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

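Because an element behaves much like a list of its children, ``len()`` and
negative indexes work as well. A minimal sketch, using a trimmed copy of the
sample data (two countries, two children each):

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample document, for illustration only.
country_xml = '''<data>
  <country name="Liechtenstein">
    <rank>1</rank>
    <year>2008</year>
  </country>
  <country name="Singapore">
    <rank>4</rank>
    <year>2011</year>
  </country>
</data>'''
root = ET.fromstring(country_xml)

child_count = len(root)            # number of direct children
first_year = root[0][1].text       # second child of the first country
last_name = root[-1].get('name')   # negative indexes count from the end
```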
Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print neighbor.attrib
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a given tag that are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print name, rank
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.
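The contrast between the search methods is easiest to see side by side. The
following sketch uses a shortened, made-up variant of the sample data; the
``continent`` tag and ``population`` attribute are hypothetical and chosen
only because they do not exist in the document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="Liechtenstein"><rank>1</rank>'
    '<neighbor name="Austria"/></country>'
    '<country name="Panama"><rank>68</rank></country>'
    '</data>'
)

# findall() looks only at direct children, so no neighbors are found
# when searching from the root.
direct_neighbors = root.findall('neighbor')

# iter() walks the whole sub-tree.
all_neighbors = [n.get('name') for n in root.iter('neighbor')]

# find() returns None when nothing matches; get() accepts a default.
missing = root.find('continent')
ranks = [c.find('rank').text for c in root.findall('country')]
population = root[0].get('population', 'unknown')
```

Since ``find()`` can return ``None``, chaining ``.text`` onto it is only safe
when the child is known to exist.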
Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^
:class:`ElementTree` provides a simple way to build XML documents and write them to files.
The :meth:`ElementTree.write` method serves this purpose.
Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).
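The three kinds of change mentioned above can be sketched on a single element.
The ``continent`` attribute is a hypothetical addition for illustration and
is not part of the sample data.

```python
import xml.etree.ElementTree as ET

country = ET.fromstring('<country name="Panama"><rank>68</rank></country>')

# Change a field directly.
country.find('rank').text = '69'

# Add (or overwrite) an attribute; 'continent' is a made-up example.
country.set('continent', 'America')

# Append a new child element.
neighbor = ET.Element('neighbor', {'name': 'Colombia'})
country.append(neighbor)

child_tags = [child.tag for child in country]
```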
Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:
.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

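The same update can be tried without creating ``output.xml``. This sketch
assumes a one-country stand-in document and writes to an in-memory
``io.BytesIO`` buffer instead of a file.

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data><country name="Singapore"><rank>4</rank></country></data>'
)

# Same loop as above: bump each rank and mark it as updated.
for rank in root.iter('rank'):
    rank.text = str(int(rank.text) + 1)
    rank.set('updated', 'yes')

# write() also accepts a file object; here an in-memory buffer.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer)
serialized = buffer.getvalue()
```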
We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:
.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

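The removal loop iterates over the list returned by ``findall()`` rather than
over ``root`` itself, so removing children inside the loop does not disturb
the iteration. A self-contained sketch with a shortened copy of the updated
data:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="Liechtenstein"><rank>2</rank></country>'
    '<country name="Singapore"><rank>5</rank></country>'
    '<country name="Panama"><rank>69</rank></country>'
    '</data>'
)

# findall() returns a list of matches, collected before any removal happens.
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

remaining = [c.get('name') for c in root]
```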
Building XML documents
^^^^^^^^^^^^^^^^^^^^^^
The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

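``SubElement`` also takes attributes, either as a dictionary or as keyword
arguments, so a fragment of the sample document can be rebuilt from scratch.
The sketch below is one possible way to do that; the serialized attribute
order may differ between Python versions, so it is not shown.

```python
import xml.etree.ElementTree as ET

data = ET.Element('data')

# Attributes passed as keyword arguments.
country = ET.SubElement(data, 'country', name='Singapore')

# Text content is set through the .text attribute.
rank = ET.SubElement(country, 'rank')
rank.text = '4'

# Attributes passed as a dictionary plus an extra keyword.
ET.SubElement(country, 'neighbor', {'name': 'Malaysia'}, direction='N')

xml_bytes = ET.tostring(data)
```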
Additional resources
^^^^^^^^^^^^^^^^^^^^
See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.
.. _elementtree-xpath:
XPath support
-------------
This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.
Example
^^^^^^^
Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second 'neighbor' child of their parent
   root.findall(".//neighbor[2]")

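To see what some of these expressions return, here is a runnable sketch. It
assumes a shortened copy of the country data (two countries, with only the
``year`` and ``neighbor`` children) rather than the full ``countrydata``
document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="Liechtenstein"><year>2008</year>'
    '<neighbor name="Austria"/><neighbor name="Switzerland"/></country>'
    '<country name="Singapore"><year>2011</year>'
    '<neighbor name="Malaysia"/></country>'
    '</data>'
)

# All 'neighbor' grand-children of the root, in document order.
neighbors = [n.get('name') for n in root.findall("./country/neighbor")]

# 'year' children of any element whose name attribute is 'Singapore'.
years = [y.text for y in root.findall(".//*[@name='Singapore']/year")]

# 'neighbor' elements that are the second 'neighbor' under their parent.
second = [n.get('name') for n in root.findall(".//neighbor[2]")]
```

Note that Switzerland is the third child of its ``country`` here, yet it
matches ``neighbor[2]``: the position counts only siblings with the same tag.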
Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.
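A few of the predicates from the table, each placed after a tag name as
required, might be exercised like this. The document is a made-up variant of
the sample data; the empty ``Iceland`` entry is an assumption added so that
the ``[tag]`` predicate has something to filter out.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="Liechtenstein">'
    '<neighbor name="Austria"/><neighbor name="Switzerland"/></country>'
    '<country name="Singapore"><neighbor name="Malaysia"/></country>'
    '<country name="Iceland"/>'
    '</data>'
)

def names(elements):
    """Return the 'name' attribute of each element in a list."""
    return [e.get('name') for e in elements]

with_attr = names(root.findall("./country[@name]"))          # [@attrib]
with_child = names(root.findall("./country[neighbor]"))      # [tag]
last_country = names(root.findall("./country[last()]"))      # [position]
second_last = names(root.findall("./country[last()-1]"))
last_neighbors = names(root.findall(".//neighbor[last()]"))
```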
Reference
---------
.. _elementtree-functions:
Functions
---------
^^^^^^^^^
.. function:: Comment(text=None)
...
...
@@ -196,8 +498,7 @@ Functions
.. _elementtree-element-objects:
Element Objects
---------------
^^^^^^^^^^^^^^^
.. class:: Element(tag, attrib={}, **extra)
...
...
@@ -387,7 +688,7 @@ Element Objects
.. _elementtree-elementtree-objects:
ElementTree Objects
-------------------
^^^^^^^^^^^^^^^^^^^
.. class:: ElementTree(element=None, file=None)
...
...
@@ -507,7 +808,7 @@ Example of changing the attribute "target" of every link in first paragraph::
.. _elementtree-qname-objects:
QName Objects
-------------
^^^^^^^^^^^^^
.. class:: QName(text_or_uri, tag=None)
...
...
@@ -523,7 +824,7 @@ QName Objects
.. _elementtree-treebuilder-objects:
TreeBuilder Objects
-------------------
^^^^^^^^^^^^^^^^^^^
.. class:: TreeBuilder(element_factory=None)
...
...
@@ -574,7 +875,7 @@ TreeBuilder Objects
.. _elementtree-xmlparser-objects:
XMLParser Objects
-----------------
^^^^^^^^^^^^^^^^^
.. class:: XMLParser(html=0, target=None, encoding=None)
...
...