Commit 583302c5 authored by Florent Xicluna

Update some parts of the xml.etree documentation.

parent a0fd4ccd
......@@ -9,9 +9,9 @@
.. versionadded:: 2.5
The Element type is a flexible container object, designed to store hierarchical
data structures in memory. The type can be described as a cross between a list
and a dictionary.
The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.
Each element has a number of properties associated with it:
......@@ -35,8 +35,14 @@ convert it from and to XML.
A C implementation of this API is available as :mod:`xml.etree.cElementTree`.
See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of the
xml.etree.ElementTree.
docs. Fredrik Lundh's page is also the location of the development version of
the xml.etree.ElementTree.
.. versionchanged:: 2.7
The ElementTree API is updated to 1.3. For more information, see
`Introducing ElementTree 1.3
<http://effbot.org/zone/elementtree-13-intro.htm>`_.
.. _elementtree-functions:
......@@ -46,17 +52,17 @@ Functions
.. function:: Comment([text])
Comment element factory. This factory function creates a special element that
will be serialized as an XML comment by the standard serializer. The comment
string can be either an 8-bit ASCII string or a Unicode string. *text* is a
string containing the comment string. Returns an element instance representing
a comment.
Comment element factory. This factory function creates a special element
that will be serialized as an XML comment by the standard serializer. The
comment string can be either a bytestring or a Unicode string. *text* is a
string containing the comment string. Returns an element instance
representing a comment.
.. function:: dump(elem)
Writes an element tree or element structure to sys.stdout. This function should
be used for debugging only.
Writes an element tree or element structure to sys.stdout. This function
should be used for debugging only.
The exact output format is implementation dependent. In this version, it's
written as an ordinary XML file.
......@@ -66,34 +72,34 @@ Functions
.. function:: fromstring(text)
Parses an XML section from a string constant. Same as XML. *text* is a string
containing XML data. Returns an Element instance.
Parses an XML section from a string constant. Same as XML. *text* is a
string containing XML data. Returns an :class:`Element` instance.
.. function:: fromstringlist(sequence[, parser])
Parses an XML document from a sequence of string fragments. *sequence* is a list
or other sequence containing XML data fragments. *parser* is an optional parser
instance. If not given, the standard :class:`XMLParser` parser is used.
Returns an Element instance.
Parses an XML document from a sequence of string fragments. *sequence* is a
list or other sequence containing XML data fragments. *parser* is an
optional parser instance. If not given, the standard :class:`XMLParser`
parser is used. Returns an :class:`Element` instance.
.. versionadded:: 2.7
.. function:: iselement(element)
Checks if an object appears to be a valid element object. *element* is an
element instance. Returns a true value if this is an element object.
Checks if an object appears to be a valid element object. *element* is an
element instance. Returns a true value if this is an element object.
.. function:: iterparse(source[, events[, parser]])
Parses an XML section into an element tree incrementally, and reports what's
going on to the user. *source* is a filename or file object containing XML data.
*events* is a list of events to report back. If omitted, only "end" events are
reported. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an :term:`iterator`
providing ``(event, elem)`` pairs.
going on to the user. *source* is a filename or file object containing XML
data. *events* is a list of events to report back. If omitted, only "end"
events are reported. *parser* is an optional parser instance. If not
given, the standard :class:`XMLParser` parser is used. Returns an
:term:`iterator` providing ``(event, elem)`` pairs.
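
A minimal sketch of incremental parsing with ``iterparse``, assuming a current Python 3 interpreter. The sample document and the in-memory ``io.BytesIO`` source are made up for illustration; a real caller would pass a filename or an open file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for a file on disk.
source = io.BytesIO(b"<root><item>a</item><item>b</item></root>")

seen = []
# Ask for both "start" and "end" events; if events is omitted,
# only "end" events are reported.
for event, elem in ET.iterparse(source, events=("start", "end")):
    seen.append((event, elem.tag))

for pair in seen:
    print(pair)
```

Each element produces one "start" pair when its opening tag is read and one "end" pair once it is complete.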
.. note::
......@@ -108,54 +114,57 @@ Functions
.. function:: parse(source[, parser])
Parses an XML section into an element tree. *source* is a filename or file
object containing XML data. *parser* is an optional parser instance. If not
given, the standard :class:`XMLParser` parser is used. Returns an
Parses an XML section into an element tree. *source* is a filename or file
object containing XML data. *parser* is an optional parser instance. If
not given, the standard :class:`XMLParser` parser is used. Returns an
:class:`ElementTree` instance.
.. function:: ProcessingInstruction(target[, text])
PI element factory. This factory function creates a special element that will
be serialized as an XML processing instruction. *target* is a string containing
the PI target. *text* is a string containing the PI contents, if given. Returns
an element instance, representing a processing instruction.
PI element factory. This factory function creates a special element that
will be serialized as an XML processing instruction. *target* is a string
containing the PI target. *text* is a string containing the PI contents, if
given. Returns an element instance, representing a processing instruction.
.. function:: register_namespace(prefix, uri)
Registers a namespace prefix. The registry is global, and any existing mapping
for either the given prefix or the namespace URI will be removed. *prefix* is a
namespace prefix. *uri* is a namespace uri. Tags and attributes in this namespace
will be serialized with the given prefix, if at all possible.
Registers a namespace prefix. The registry is global, and any existing
mapping for either the given prefix or the namespace URI will be removed.
*prefix* is a namespace prefix. *uri* is a namespace uri. Tags and
attributes in this namespace will be serialized with the given prefix, if at
all possible.
.. versionadded:: 2.7
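
The effect of ``register_namespace`` on serialization can be sketched as follows. The prefix ``ex`` and the URI ``http://example.com/ns`` are hypothetical values chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and namespace URI, used only for this sketch.
NS = "http://example.com/ns"
ET.register_namespace("ex", NS)

# Namespaced tags are written in the {uri}local form.
root = ET.Element("{%s}root" % NS)
ET.SubElement(root, "{%s}child" % NS)

# tostring() returns an encoded string; decode it for display.
xml_text = ET.tostring(root).decode("ascii")
print(xml_text)
```

Because the registry is global, the mapping applies to every tree serialized afterwards in the same process.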
.. function:: SubElement(parent, tag[, attrib[, **extra]])
Subelement factory. This function creates an element instance, and appends it
to an existing element.
Subelement factory. This function creates an element instance, and appends
it to an existing element.
The element name, attribute names, and attribute values can be either 8-bit
ASCII strings or Unicode strings. *parent* is the parent element. *tag* is the
subelement name. *attrib* is an optional dictionary, containing element
attributes. *extra* contains additional attributes, given as keyword arguments.
Returns an element instance.
The element name, attribute names, and attribute values can be either
bytestrings or Unicode strings. *parent* is the parent element. *tag* is
the subelement name. *attrib* is an optional dictionary, containing element
attributes. *extra* contains additional attributes, given as keyword
arguments. Returns an element instance.
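
A short sketch of ``SubElement`` with both an attribute dictionary and keyword attributes. The tag and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

root = ET.Element("catalog")

# attrib is given as a dictionary; extra attributes are keyword arguments.
book = ET.SubElement(root, "book", {"id": "b1"}, lang="en")
book.text = "Example title"

# The new element has already been appended to its parent.
print(len(root), root[0] is book)
print(sorted(book.attrib.items()))
```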
.. function:: tostring(element[, encoding])
Generates a string representation of an XML element, including all subelements.
*element* is an Element instance. *encoding* is the output encoding (default is
US-ASCII). Returns an encoded string containing the XML data.
Generates a string representation of an XML element, including all
subelements. *element* is an :class:`Element` instance. *encoding* is the
output encoding (default is US-ASCII). Returns an encoded string containing
the XML data.
.. function:: tostringlist(element[, encoding])
Generates a string representation of an XML element, including all subelements.
*element* is an Element instance. *encoding* is the output encoding (default is
US-ASCII). Returns a sequence object containing the XML data.
Generates a string representation of an XML element, including all
subelements. *element* is an :class:`Element` instance. *encoding* is the
output encoding (default is US-ASCII). Returns a sequence object containing
the XML data.
.. versionadded:: 2.7
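
The relationship between ``tostring`` and ``tostringlist`` can be sketched like this, assuming a current Python 3 interpreter, where the encoded output is a ``bytes`` object. The sample element is made up.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<greeting lang='en'>hello</greeting>")

serialized = ET.tostring(elem)      # one encoded string
fragments = ET.tostringlist(elem)   # a sequence of encoded fragments

# Joining the fragments gives the same data as tostring().
joined = b"".join(fragments)
print(joined == serialized)
```

How the output is split into fragments is not specified, so code should rely only on the joined result.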
......@@ -163,18 +172,18 @@ Functions
.. function:: XML(text[, parser])
Parses an XML section from a string constant. This function can be used to
embed "XML literals" in Python code. *text* is a string containing XML data.
*parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an Element instance.
embed "XML literals" in Python code. *text* is a string containing XML
data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an :class:`Element` instance.
.. function:: XMLID(text[, parser])
Parses an XML section from a string constant, and also returns a dictionary
which maps from element ids to elements. *text* is a string containing XML
data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns a tuple containing an Element
instance and a dictionary.
which maps from element ids to elements. *text* is a string containing XML
data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns a tuple containing an
:class:`Element` instance and a dictionary.
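
A minimal sketch of ``XMLID``, using an invented document in which two elements carry an ``id`` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only elements with an "id" attribute are indexed.
root, ids = ET.XMLID('<doc><p id="intro">Hi</p><p>plain</p><p id="end">Bye</p></doc>')

print(root.tag)
print(sorted(ids))
print(ids["intro"].text)
```

The dictionary values are the same element objects that appear in the tree, so changes made through ``ids`` are visible from ``root``.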
.. _elementtree-element-objects:
......@@ -185,45 +194,46 @@ Element Objects
.. class:: Element(tag[, attrib[, **extra]])
Element class. This class defines the Element interface, and provides a
Element class. This class defines the Element interface, and provides a
reference implementation of this interface.
The element name, attribute names, and attribute values can be either 8-bit
ASCII strings or Unicode strings. *tag* is the element name. *attrib* is an
optional dictionary, containing element attributes. *extra* contains additional
attributes, given as keyword arguments.
The element name, attribute names, and attribute values can be either
bytestrings or Unicode strings. *tag* is the element name. *attrib* is
an optional dictionary, containing element attributes. *extra* contains
additional attributes, given as keyword arguments.
.. attribute:: tag
A string identifying what kind of data this element represents (the element
type, in other words).
A string identifying what kind of data this element represents (the
element type, in other words).
.. attribute:: text
The *text* attribute can be used to hold additional data associated with the
element. As the name implies this attribute is usually a string but may be
any application-specific object. If the element is created from an XML file
the attribute will contain any text found between the element tags.
The *text* attribute can be used to hold additional data associated with
the element. As the name implies this attribute is usually a string but
may be any application-specific object. If the element is created from
an XML file the attribute will contain any text found between the element
tags.
.. attribute:: tail
The *tail* attribute can be used to hold additional data associated with the
element. This attribute is usually a string but may be any
application-specific object. If the element is created from an XML file the
attribute will contain any text found after the element's end tag and before
the next tag.
The *tail* attribute can be used to hold additional data associated with
the element. This attribute is usually a string but may be any
application-specific object. If the element is created from an XML file
the attribute will contain any text found after the element's end tag and
before the next tag.
.. attribute:: attrib
A dictionary containing the element's attributes. Note that while the
A dictionary containing the element's attributes. Note that while the
*attrib* value is always a real mutable Python dictionary, an ElementTree
implementation may choose to use another internal representation, and create
the dictionary only if someone asks for it. To take advantage of such
implementations, use the dictionary methods below whenever possible.
implementation may choose to use another internal representation, and
create the dictionary only if someone asks for it. To take advantage of
such implementations, use the dictionary methods below whenever possible.
The following dictionary-like methods work on the element attributes.
......@@ -243,14 +253,14 @@ Element Objects
.. method:: items()
Returns the element attributes as a sequence of (name, value) pairs. The
Returns the element attributes as a sequence of (name, value) pairs. The
attributes are returned in an arbitrary order.
.. method:: keys()
Returns the element's attribute names as a list. The names are returned in an
arbitrary order.
Returns the element's attribute names as a list. The names are returned
in an arbitrary order.
.. method:: set(key, value)
......@@ -262,8 +272,8 @@ Element Objects
.. method:: append(subelement)
Adds the element *subelement* to the end of this element's internal list of
subelements.
Adds the element *subelement* to the end of this element's internal list
of subelements.
.. method:: extend(subelements)
......@@ -276,22 +286,22 @@ Element Objects
.. method:: find(match)
Finds the first subelement matching *match*. *match* may be a tag name or path.
Returns an element instance or ``None``.
Finds the first subelement matching *match*. *match* may be a tag name
or path. Returns an element instance or ``None``.
.. method:: findall(match)
Finds all subelements matching *match*. *match* may be a tag name or path.
Returns an iterable yielding all matching elements in document order.
Finds all matching subelements, by tag name or path. Returns a list
containing all matching elements in document order.
.. method:: findtext(condition[, default])
.. method:: findtext(match[, default])
Finds text for the first subelement matching *condition*. *condition* may be
a tag name or path. Returns the text content of the first matching element,
or *default* if no element was found. Note that if the matching element has
no text content an empty string is returned.
Finds text for the first subelement matching *match*. *match* may be
a tag name or path. Returns the text content of the first matching
element, or *default* if no element was found. Note that if the matching
element has no text content an empty string is returned.
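
The differences between ``find``, ``findall`` and ``findtext`` can be sketched as follows. The document and tag names are made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    "<book><title>First</title></book>"
    "<book><title>Second</title></book>"
    "<magazine/>"
    "</library>"
)

first_book = root.find("book")                  # first match, or None
titles = [b.findtext("title") for b in root.findall("book")]
missing = root.findtext("newspaper", "n/a")     # default when nothing matches
empty = root.findtext("magazine")               # element found, but no text

print(titles, repr(missing), repr(empty))
```

Note the last case: a matching element without text yields an empty string rather than the default.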
.. method:: getchildren()
......@@ -313,31 +323,48 @@ Element Objects
.. method:: iter([tag])
Creates a tree iterator with the current element as the root. The iterator
iterates over this element and all elements below it, in document (depth
first) order. If *tag* is not ``None`` or ``'*'``, only elements whose tag
equals *tag* are returned from the iterator. If the tree structure is
modified during iteration, the result is undefined.
Creates a tree :term:`iterator` with the current element as the root.
The iterator iterates over this element and all elements below it, in
document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
elements whose tag equals *tag* are returned from the iterator. If the
tree structure is modified during iteration, the result is undefined.
.. method:: iterfind(match)
Finds all matching subelements, by tag name or path. Returns an iterable
yielding all matching elements in document order.
.. versionadded:: 2.7
.. method:: itertext()
Creates a text iterator. The iterator loops over this element and all
subelements, in document order, and returns all inner text.
.. versionadded:: 2.7
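
A small sketch contrasting ``iter``, ``iterfind`` and ``itertext`` on an invented mixed-content fragment.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<p>Hello <b>bold</b> and <i>italic <b>both</b></i>!</p>")

# iter() walks the whole subtree, starting with the element itself.
all_tags = [el.tag for el in root.iter()]

# With a tag argument, only matching elements at any depth are returned.
every_b = [el.text for el in root.iter("b")]

# iterfind("b") matches by path, so only direct children named "b".
direct_b = [el.text for el in root.iterfind("b")]

# itertext() yields the inner text in document order.
text = "".join(root.itertext())

print(all_tags, every_b, direct_b, text)
```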
.. method:: makeelement(tag, attrib)
Creates a new element object of the same type as this element. Do not call
this method, use the SubElement factory function instead.
Creates a new element object of the same type as this element. Do not
call this method, use the :func:`SubElement` factory function instead.
.. method:: remove(subelement)
Removes *subelement* from the element. Unlike the findXYZ methods this
method compares elements based on the instance identity, not on tag value
Removes *subelement* from the element. Unlike the find\* methods this
method compares elements based on the instance identity, not on tag value
or contents.
Element objects also support the following sequence type methods for working
with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`,
:meth:`__len__`.
:class:`Element` objects also support the following sequence type methods
for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
:meth:`__setitem__`, :meth:`__len__`.
Caution: Because Element objects do not define a :meth:`__nonzero__` method,
elements with no subelements will test as ``False``. ::
Caution: Elements with no subelements will test as ``False``. This behavior
will change in future versions. Use an explicit ``len(elem)`` or ``elem is
None`` test instead. ::
element = root.find('foo')
......@@ -356,43 +383,43 @@ ElementTree Objects
.. class:: ElementTree([element,] [file])
ElementTree wrapper class. This class represents an entire element hierarchy,
and adds some extra support for serialization to and from standard XML.
ElementTree wrapper class. This class represents an entire element
hierarchy, and adds some extra support for serialization to and from
standard XML.
*element* is the root element. The tree is initialized with the contents of the
XML *file* if given.
*element* is the root element. The tree is initialized with the contents
of the XML *file* if given.
.. method:: _setroot(element)
Replaces the root element for this tree. This discards the current
contents of the tree, and replaces it with the given element. Use with
care. *element* is an element instance.
care. *element* is an element instance.
.. method:: find(path)
.. method:: find(match)
Finds the first toplevel element with given tag. Same as
getroot().find(path). *path* is the element to look for. Returns the
first matching element, or ``None`` if no element was found.
Finds the first toplevel element matching *match*. *match* may be a tag
name or path. Same as getroot().find(match). Returns the first matching
element, or ``None`` if no element was found.
.. method:: findall(path)
.. method:: findall(match)
Finds all toplevel elements with the given tag. Same as
getroot().findall(path). *path* is the element to look for. Returns a
list or :term:`iterator` containing all matching elements, in document
order.
Finds all matching subelements, by tag name or path. Same as
getroot().findall(match). *match* may be a tag name or path. Returns a
list containing all matching elements, in document order.
.. method:: findtext(path[, default])
.. method:: findtext(match[, default])
Finds the element text for the first toplevel element with given tag.
Same as getroot().findtext(path). *path* is the toplevel element to look
for. *default* is the value to return if the element was not
found. Returns the text content of the first matching element, or the
default value if no element was found. Note that if the element is
found, but has no text content, this method returns an empty string.
Same as getroot().findtext(match). *match* may be a tag name or path.
*default* is the value to return if the element was not found. Returns
the text content of the first matching element, or the default value if no
element was found. Note that if the element is found, but has no text
content, this method returns an empty string.
.. method:: getiterator([tag])
......@@ -409,23 +436,32 @@ ElementTree Objects
.. method:: iter([tag])
Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in section order. *tag* is the tag
loops over all elements in this tree, in section order. *tag* is the tag
to look for (default is to return all elements).
.. method:: iterfind(match)
Finds all matching subelements, by tag name or path. Same as
getroot().iterfind(match). Returns an iterable yielding all matching
elements in document order.
.. versionadded:: 2.7
.. method:: parse(source[, parser])
Loads an external XML section into this element tree. *source* is a file
name or file object. *parser* is an optional parser instance. If not
given, the standard XMLParser parser is used. Returns the section
Loads an external XML section into this element tree. *source* is a file
name or file object. *parser* is an optional parser instance. If not
given, the standard XMLParser parser is used. Returns the section
root element.
.. method:: write(file[, encoding[, xml_declaration]])
Writes the element tree to a file, as XML. *file* is a file name, or a
file object opened for writing. *encoding* [1]_ is the output encoding
(default is US-ASCII). *xml_declaration* controls if an XML declaration
Writes the element tree to a file, as XML. *file* is a file name, or a
file object opened for writing. *encoding* [1]_ is the output encoding
(default is US-ASCII). *xml_declaration* controls if an XML declaration
should be added to the file. Use False for never, True for always, None
for only if not US-ASCII or UTF-8. None is default.
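
A minimal sketch of ``write`` with an explicit encoding and XML declaration, assuming a current Python 3 interpreter. An in-memory ``io.BytesIO`` buffer stands in for a file opened for writing, and the element names are invented.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("config")
ET.SubElement(root, "option", name="debug").text = "on"
tree = ET.ElementTree(root)

# Hypothetical in-memory target; a filename would work as well.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)

data = buffer.getvalue()
print(data.decode("utf-8"))
```

Passing ``xml_declaration=True`` forces the declaration even for encodings where it would otherwise be omitted.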
......@@ -446,13 +482,13 @@ Example of changing the attribute "target" of every link in first paragraph::
>>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree()
>>> tree.parse("index.xhtml")
<Element 'html' at b7d3f1ec>
<Element 'html' at 0xb77e6fac>
>>> p = tree.find("body/p") # Finds first occurrence of tag p in body
>>> p
<Element 'p' at 8416e0c>
<Element 'p' at 0xb77ec26c>
>>> links = list(p.iter("a")) # Returns list of all links
>>> links
[<Element 'a' at b7d4f9ec>, <Element 'a' at b7d4fb0c>]
[<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
>>> for i in links: # Iterates through all found links
... i.attrib["target"] = "blank"
>>> tree.write("output.xhtml")
......@@ -465,12 +501,12 @@ QName Objects
.. class:: QName(text_or_uri[, tag])
QName wrapper. This can be used to wrap a QName attribute value, in order to
get proper namespace handling on output. *text_or_uri* is a string containing
the QName value, in the form {uri}local, or, if the tag argument is given, the
URI part of a QName. If *tag* is given, the first argument is interpreted as a
URI, and this argument is interpreted as a local name. :class:`QName` instances
are opaque.
QName wrapper. This can be used to wrap a QName attribute value, in order
to get proper namespace handling on output. *text_or_uri* is a string
containing the QName value, in the form {uri}local, or, if the tag argument
is given, the URI part of a QName. If *tag* is given, the first argument is
interpreted as a URI, and this argument is interpreted as a local name.
:class:`QName` instances are opaque.
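
The two ways of constructing a ``QName`` and its use as an attribute value can be sketched like this. The namespace URI and names are hypothetical, and the generated prefix in the output is chosen by the serializer.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"  # hypothetical namespace URI

# One-argument form ({uri}local) and two-argument form (uri, tag).
qname_a = ET.QName("{%s}item" % NS)
qname_b = ET.QName(NS, "item")
same = str(qname_a) == str(qname_b)

# Wrapping an attribute value lets the serializer emit a proper prefix.
elem = ET.Element("ref")
elem.set("type", qname_b)
out = ET.tostring(elem).decode("ascii")

print(same)
print(out)
```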
.. _elementtree-treebuilder-objects:
......@@ -481,35 +517,35 @@ TreeBuilder Objects
.. class:: TreeBuilder([element_factory])
Generic element structure builder. This builder converts a sequence of start,
data, and end method calls to a well-formed element structure. You can use this
class to build an element structure using a custom XML parser, or a parser for
some other XML-like format. The *element_factory* is called to create new
Element instances when given.
Generic element structure builder. This builder converts a sequence of
start, data, and end method calls to a well-formed element structure. You
can use this class to build an element structure using a custom XML parser,
or a parser for some other XML-like format. The *element_factory* is called
to create new :class:`Element` instances when given.
.. method:: close()
Flushes the builder buffers, and returns the toplevel document
element. Returns an Element instance.
element. Returns an :class:`Element` instance.
.. method:: data(data)
Adds text to the current element. *data* is a string. This should be
either an 8-bit string containing ASCII text, or a Unicode string.
Adds text to the current element. *data* is a string. This should be
either a bytestring, or a Unicode string.
.. method:: end(tag)
Closes the current element. *tag* is the element name. Returns the closed
element.
Closes the current element. *tag* is the element name. Returns the
closed element.
.. method:: start(tag, attrs)
Opens a new element. *tag* is the element name. *attrs* is a dictionary
containing element attributes. Returns the opened element.
Opens a new element. *tag* is the element name. *attrs* is a dictionary
containing element attributes. Returns the opened element.
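
Driving a ``TreeBuilder`` by hand, as a custom parser would, can be sketched as follows. The tag names and text are made up.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# The same calls a parser would make for <root><child n="1">some text</child></root>.
builder.start("root", {})
builder.start("child", {"n": "1"})
builder.data("some text")
builder.end("child")
builder.end("root")

# close() returns the toplevel element.
root = builder.close()
print(root.tag, root[0].get("n"), root[0].text)
```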
In addition, a custom :class:`TreeBuilder` object can provide the
......@@ -517,9 +553,9 @@ TreeBuilder Objects
.. method:: doctype(name, pubid, system)
Handles a doctype declaration. *name* is the doctype name. *pubid* is the
public identifier. *system* is the system identifier. This method does not
exist on the default :class:`TreeBuilder` class.
Handles a doctype declaration. *name* is the doctype name. *pubid* is
the public identifier. *system* is the system identifier. This method
does not exist on the default :class:`TreeBuilder` class.
.. versionadded:: 2.7
......@@ -532,16 +568,17 @@ XMLParser Objects
.. class:: XMLParser([html [, target[, encoding]]])
Element structure builder for XML source data, based on the expat parser. *html*
are predefined HTML entities. This flag is not supported by the current
implementation. *target* is the target object. If omitted, the builder uses an
instance of the standard TreeBuilder class. *encoding* [1]_ is optional.
If given, the value overrides the encoding specified in the XML file.
:class:`Element` structure builder for XML source data, based on the expat
parser. *html* are predefined HTML entities. This flag is not supported by
the current implementation. *target* is the target object. If omitted, the
builder uses an instance of the standard TreeBuilder class. *encoding* [1]_
is optional. If given, the value overrides the encoding specified in the
XML file.
.. method:: close()
Finishes feeding data to the parser. Returns an element structure.
Finishes feeding data to the parser. Returns an element structure.
.. method:: doctype(name, pubid, system)
......@@ -553,11 +590,11 @@ XMLParser Objects
.. method:: feed(data)
Feeds data to the parser. *data* is encoded data.
Feeds data to the parser. *data* is encoded data.
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLParser.close`
and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`.
:class:`XMLParser` can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file::
......@@ -598,6 +635,6 @@ This is an example of counting the maximum depth of an XML file::
.. rubric:: Footnotes
.. [#] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets.
:mod:`xml.etree` --- The ElementTree API for XML
================================================
.. module:: xml.etree
:synopsis: Package containing common ElementTree modules.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
.. versionadded:: 2.5
The ElementTree package is a simple, efficient, and quite popular library for
XML manipulation in Python. The :mod:`xml.etree` package contains the most
common components from the ElementTree API library. In the current release,
this package contains the :mod:`ElementTree`, :mod:`ElementPath`, and
:mod:`ElementInclude` modules from the full ElementTree distribution.
.. XXX To be continued!
.. seealso::
`ElementTree Overview <http://effbot.org/tag/elementtree>`_
The home page for :mod:`ElementTree`. This includes links to additional
documentation, alternative implementations, and other add-ons.