There is one interface for the XPath system that is used to parse both Paths and Expressions. In both cases, the parser returns a parsed token that represents the path or expression as a tree. This token can then be used to process the nodes of a DOM tree.
A Path in XPath is used to select a set of nodes in a DOM tree. The top-level EBNF production for a Path is the LocationPath production (see http://www.w3.org/TR/xpath). To create a parsed Path, first create an instance of the XPathParser:
from xml.xpath import XPathParser
p = XPathParser.XPathParser()
Use the parseExpression method to parse a string into a parsed tree. The path expression used in this example can be broken into steps separated by '/'. The leading // selects the root node and all of its descendants.
Then child::ENTRY selects all elements named ENTRY that are children of the nodes in that set. Lastly, NAME[position() = 1] selects each NAME element that is a child of one of those ENTRY elements and is the first NAME child of its parent. In short, the path selects the first NAME child of every ENTRY element in the DOM tree.
path = p.parseExpression('//child::ENTRY/NAME[position() = 1]')
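To see which nodes a path like this picks out, here is a minimal sketch that does not use xml.xpath at all. It assumes the standard library's xml.etree.ElementTree as a stand-in, which supports only a limited XPath subset, so the path is written as .//ENTRY/NAME[1]. The sample document and its contents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element names come from the text.
SAMPLE = (
    "<ADDRBOOK>"
    "<ENTRY><NAME>Pieter</NAME><NAME>Piet</NAME><PHONENUM>555-0101</PHONENUM></ENTRY>"
    "<ENTRY><NAME>Emeka</NAME><PHONENUM>555-0102</PHONENUM></ENTRY>"
    "</ADDRBOOK>"
)

root = ET.fromstring(SAMPLE)

# Rough ElementTree counterpart of //child::ENTRY/NAME[position() = 1]:
# './/ENTRY' searches all descendants of root, 'NAME[1]' keeps the first
# NAME child of each ENTRY.
first_names = [el.text for el in root.findall(".//ENTRY/NAME[1]")]
print(first_names)
```

The second NAME of the first ENTRY is skipped because the position predicate is applied per ENTRY parent, not across the whole result.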
The object returned from parseExpression has the following two methods. dump writes the path expression to the given open file object.
import sys
path.dump(sys.stdout)
select uses a context to select a set of nodes. The context is made up of the context node and a context list. For top-level selects, the context list usually contains a single item, the context node.
from xml.xpath import Context
# node is the DOM node to use as the context node
c = Context.Context(node,[node])
rt = path.select(c)
rt will contain a list of the nodes selected by the expression.
print rt
Expressions function much the same as Paths. In the EBNF grammar, the root of an Expression is the Expr production. An Expression can contain Paths, and a Path can contain Expressions. It all depends on what you need.
from xml.xpath import XPathParser
p = XPathParser.XPathParser()
Use the parseExpression method to parse an expression string into parsed tokens. The expression used in the example returns all child elements of the context node that have the tag name NAME or PHONENUM.
exp = p.parseExpression('NAME | PHONENUM')
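The effect of the union expression can be sketched without the XPath engine. The example below assumes the standard library's xml.dom.minidom and a hypothetical helper, select_children, that collects the child elements of a context node whose tag name is in a given set. It illustrates the result only and is not how the parser evaluates the expression.

```python
from xml.dom import minidom

# Hypothetical sample entry; only the element names NAME and PHONENUM
# come from the expression in the text.
doc = minidom.parseString(
    "<ENTRY><NAME>Pieter</NAME><EMAIL>p@example.org</EMAIL>"
    "<PHONENUM>555-0101</PHONENUM></ENTRY>"
)
context_node = doc.documentElement


def select_children(node, names):
    """Return the child elements of node whose tag name is in names,
    in document order (hypothetical helper, not part of xml.xpath)."""
    return [
        child
        for child in node.childNodes
        if child.nodeType == child.ELEMENT_NODE and child.tagName in names
    ]


matched = select_children(context_node, {"NAME", "PHONENUM"})
print([el.tagName for el in matched])
```

The EMAIL element is left out because it matches neither side of the union.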
Expressions also have a dump method.
exp.dump(sys.stdout)
Expressions can be evaluated against a context. There are four return types for expressions: node-set, boolean, number, and string.
from xml.xpath import Context
# node is the DOM node to use as the context node
c = Context.Context(node,[node])
rt = exp.evaluate(c)
A common use of expressions is for matching. Rules for matching a context node are described in the XPath specification. These rules have been combined into a single function call.
This function returns only a boolean result from the expression evaluated in the given context.
rt = Boolean.BooleanEvaluate(exp,c)
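Reducing any expression result to a boolean follows the conversion rules of the XPath 1.0 boolean() function. The sketch below shows those rules in plain Python. The function name to_boolean and the modelling of a node-set as a list or tuple are assumptions made for illustration; this is not the implementation of BooleanEvaluate.

```python
import math


def to_boolean(value):
    """Convert an XPath-style value to a boolean using the XPath 1.0
    boolean() rules (hypothetical helper, not the library's code)."""
    if isinstance(value, bool):
        # Check bool first: in Python, bool is a subclass of int.
        return value
    if isinstance(value, float) and math.isnan(value):
        # NaN converts to false.
        return False
    if isinstance(value, (int, float)):
        # A number is true unless it is zero (NaN was handled above).
        return value != 0
    if isinstance(value, str):
        # A string is true if and only if it is non-empty.
        return len(value) > 0
    if isinstance(value, (list, tuple)):
        # A node-set (modelled here as a sequence) is true if non-empty.
        return len(value) > 0
    raise TypeError("unsupported XPath value: %r" % (value,))


print(to_boolean([]), to_boolean("false"), to_boolean(0.0))
```

Note that the string "false" converts to true, since only the empty string is false.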
XPath location paths require node lists to be sorted in document order. This can be an expensive operation, so the XPath library lets users pre-index documents for faster sorting. To do so, do the following:
from xml.xpath import Util
...
Util.IndexDocument(document_node)
# ... XPath operations ...
Util.FreeDocumentIndex(document_node)
Do be sure to free the index to avoid memory leaks. Also note that it's a bad idea to mutate any node in the document while it is indexed.
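The idea behind pre-indexing can be sketched as follows: walk the document once, record each node's position in document order, and then sort node lists by a cheap lookup. This is a minimal illustration that assumes xml.dom.minidom and a hypothetical build_index helper. It ignores attribute nodes and says nothing about how Util.IndexDocument works internally.

```python
from xml.dom import minidom


def build_index(document):
    """Map each node to its pre-order (document order) position.
    Hypothetical helper; attribute nodes are not indexed."""
    index = {}
    stack = [document]
    while stack:
        node = stack.pop()
        index[node] = len(index)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(node.childNodes))
    return index


doc = minidom.parseString("<A><B><C/></B><D/></A>")
index = build_index(doc)
try:
    b = doc.getElementsByTagName("B")[0]
    c = doc.getElementsByTagName("C")[0]
    d = doc.getElementsByTagName("D")[0]
    # Sorting by the precomputed position avoids walking the tree per comparison.
    ordered = sorted([d, c, b], key=index.__getitem__)
    ordered_names = [n.tagName for n in ordered]
    print(ordered_names)
finally:
    # Release the index when done, mirroring the advice above.
    index.clear()
```

The index is valid only while the tree is unchanged, which is why mutating an indexed document is a bad idea.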
XPath context
Module Summary
Class Summary | |
Context | Represents the context used for XPath processing at any given point |
Represents the context used for XPath processing at any given point
Attribute Summary | |
node | The context node, as used for computing XPath expressions |
position | The context node's position in the context node list, as returned by the XPath position() function |
size | The size of the context node list |
varBindings | Maps variables and parameters by expanded name to the value of the variable |
processorNss | Provides expansion from namespace prefixes to URIs for expanded names in name tests, variable names, etc. |
Method Summary | |
nss | Get a dictionary representing namespace nodes defined at the context node |
Method Details |
__init__(node, position, size, varBindings, processorNss)
Parameters
node
of type Python DOM binding node object
The context node, as used for computing XPath expressions
position
of type positive integer
The context node's position in the context node list, as returned by the XPath position() function
size
of type positive integer
The size of the context node list
varBindings
of type dictionary whose keys are tuples of two strings and whose values are a string, integer, BooleanType or node set (list of nodes)
Maps variables and parameters by expanded name to the value of the variable. Defaults to an empty dictionary.
processorNss
of type dictionary with string keys and string values
Provides expansion from namespace prefixes to URIs for expanded names in name tests, variable names, etc. Defaults to an empty dictionary.
Return Value
None
nss()
Get a dictionary representing namespace nodes defined at the context node
Parameters
None
Return Value
dictionary with string keys and string values
Maps prefixes to namespace URIs
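To make the attribute and method descriptions concrete, here is a minimal stand-in class built on xml.dom.minidom. The class name ContextSketch is hypothetical, and the nss logic (walking up through ancestor elements, letting the nearest declaration win, and storing the default namespace under an empty prefix) is an assumption about what "namespace nodes defined at the context node" covers, not the library's implementation.

```python
from xml.dom import minidom


class ContextSketch:
    """Illustrative stand-in holding the attributes documented above."""

    def __init__(self, node, position, size, varBindings=None, processorNss=None):
        self.node = node
        self.position = position
        self.size = size
        # Both mappings default to an empty dictionary, as documented.
        self.varBindings = varBindings if varBindings is not None else {}
        self.processorNss = processorNss if processorNss is not None else {}

    def nss(self):
        """Return a prefix -> namespace URI dictionary for the context node.
        Assumption: declarations on ancestor elements are included."""
        result = {}
        node = self.node
        while node is not None and node.nodeType == node.ELEMENT_NODE:
            attrs = node.attributes
            for i in range(attrs.length):
                attr = attrs.item(i)
                if attr.name == "xmlns":
                    # setdefault keeps the nearest (innermost) declaration.
                    result.setdefault("", attr.value)
                elif attr.name.startswith("xmlns:"):
                    result.setdefault(attr.name[len("xmlns:"):], attr.value)
            node = node.parentNode
        return result


doc = minidom.parseString('<a xmlns:x="urn:x"><b xmlns:y="urn:y"/></a>')
ctx = ContextSketch(doc.documentElement.firstChild, 1, 1)
print(ctx.nss())
```

Here position and size are both 1 because the context list holds a single node, matching the top-level case described earlier.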