From: Adam Tauno Williams on 16 May 2010 00:11

On Sun, 2010-05-16 at 02:37 +0200, Martin v. Loewis wrote:
> > ??? The namespaces are embedded in the document. Personally I find it
> > odd I have to tell xpath about the namespace of the document it is a
> > $*&@(*& method of.
> How so? Why do you say it's a "method", and why do you say "of"?
> Usually, xpath expressions are *not* part of the document they operate
> on, but part of the code that performs the operation.

    from lxml import etree
    doc = etree.parse(data)
    doc.xpath(....)

> Consequentially,
> the namespace prefixes in the xpath expression do *not* occur in the
> document (other than by chance), but are defined by whoever writes the
> xpath expression. That is typically somebody different from the one
> writing the document

Maybe true technically, but false in practice. If I receive XML data from
source XYZ or service XYZ the use of namespaces and their prefixes is
extremely consistent [in practice] and very customary (for example: I've
never seen the DSML namespace abbreviated as anything other than "dsml"
and I rarely see WebDAV propfind XML use a namespace prefix other than
"D"). The odds that a customer's or vendor's ERP will generate different
namespaces and abbreviations between requests are ludicrously remote [I
don't recall ever seeing it happen]. And if the xpath fails to produce
normal [or any] output the workflow will either do nothing or abend,
which will draw the attention of an administrator.

> - if you would always write them together, you
> wouldn't need xpath in the first place, but could produce the selection
> result right away.
From: Stefan Behnel on 16 May 2010 02:55

Martin v. Loewis, 15.05.2010 23:37:
>> BTW, I'm still not sure I understand your problem. Could you provide
>> some more details?
>
> Wouldn't it be easier if you told the OP how to access the prefix
> mappings in lxml etree, or, if this was actually not possible, admitted
> that it is actually not possible?

Well, there's an "nsmap" property on each Element that provides the
mapping of prefixes to namespace URIs that form the scope of the Element.
However, while this is what the OP asked for, it is not what the OP
wants, simply because it doesn't solve the problem. Prefixes can get
defined and redefined arbitrarily often, so there is no such thing as a
prefix-namespace mapping "of the document". Example:

    <x:tag xmlns:x="urn:uri1">
      <x:tag xmlns:x="urn:uri2">
        <x:tag xmlns:x="urn:uri3" />
      </x:tag>
    </x:tag>

Trying to infer a prefix-namespace mapping from that to push it into an
XPath evaluation is futile.

That's why I asked for more details in order to understand what the
actual problem is that the OP is trying to solve, because the approach
that the OP is apparently trying to follow is clearly misguided.

Stefan
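
A minimal sketch of the scoping point in the post above, not code from the thread. It assumes the standard library's xml.etree.ElementTree (so it runs without lxml) and uses the example document with the closing quote on the innermost xmlns:x declaration restored. ElementTree reports parsed tags in "{uri}localname" form, which makes it visible that the single prefix "x" resolves to three different namespaces depending on where it appears.

```python
import xml.etree.ElementTree as ET

# The example from the post: the same prefix "x" is redeclared at every level.
DOC = (
    '<x:tag xmlns:x="urn:uri1">'
    '<x:tag xmlns:x="urn:uri2">'
    '<x:tag xmlns:x="urn:uri3"/>'
    '</x:tag>'
    '</x:tag>'
)

root = ET.fromstring(DOC)

# Parsed tags are namespace-qualified: "{namespace-uri}localname".
# The prefix itself is gone; only the URI it was bound to in scope remains.
qualified_tags = [element.tag for element in root.iter()]

for tag in qualified_tags:
    print(tag)
```

All three elements were written as x:tag, yet each ends up in a different namespace, so no single document-wide answer exists for "what does x mean".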
From: Stefan Behnel on 16 May 2010 02:59

Adam Tauno Williams, 16.05.2010 06:00:
> Given that XML documents can be very large I'd rather avoid a parsing of
> the document [beyond what lxml/etree has already done] just to retrieve
> the namespaces and their prefixes.

In order to find out which prefixes are used in the document and which
set of namespace URIs each of them is mapped to, you need to traverse the
entire document and aggregate all namespace definitions on all Elements.
However, the result will be mostly useless, as a prefix is only
meaningful within the scope of its definition. It doesn't have any
sensible meaning for the entire document.

Stefan
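
The traversal described above could be sketched roughly as follows. This is an illustration under assumptions, not the approach anyone in the thread used: it relies on the standard library's xml.etree.ElementTree.iterparse and its "start-ns" event, and the helper name collect_prefix_uris is made up for this example. lxml's iterparse is understood to accept the same event name, but that is not verified here.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_prefix_uris(source):
    """Map each prefix to the set of namespace URIs it is bound to anywhere in the document.

    `source` is a filename or a binary file object. The whole document is
    streamed once; nothing cheaper will see every declaration.
    """
    prefix_uris = defaultdict(set)
    # A "start-ns" event carries a (prefix, uri) pair for each namespace
    # declaration the parser encounters.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        prefix_uris[prefix].add(uri)
    return dict(prefix_uris)


# Hypothetical input: one prefix redeclared at every nesting level.
DOC = (
    b'<x:tag xmlns:x="urn:uri1">'
    b'<x:tag xmlns:x="urn:uri2">'
    b'<x:tag xmlns:x="urn:uri3"/>'
    b'</x:tag>'
    b'</x:tag>'
)

mapping = collect_prefix_uris(io.BytesIO(DOC))
print(mapping)
```

The result maps "x" to a set of three URIs, which illustrates why the aggregate is of limited use: it says which bindings exist somewhere, not which one applies at a given element.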
From: Stefan Behnel on 16 May 2010 03:05

Adam Tauno Williams, 15.05.2010 23:04:
> On Sat, 2010-05-15 at 22:58 +0200, Stefan Behnel wrote:
>> Adam Tauno Williams, 15.05.2010 22:40:
>>> On Sat, 2010-05-15 at 22:29 +0200, Stefan Behnel wrote:
>>>> Adam Tauno Williams, 15.05.2010 20:37:
>>>>> Say I have an XML document that begins with:
>>>>> <?xml version="1.0" encoding="utf-8"?>
>>>>> <dsml:dsml xmlns:dsml="http://www.dsml.org/DSML">
>>>>> How can one access the namespaces defined in this node? I've done a fair
>>>>> amount of XML in Python, but haven't been able to uncover the call to
>>>>> enumerate the namespaces.
>>>>> Primarily I am using etree from lxml.
>>>> What do you need the namespaces for?
>>> One needs to know the defined namespace in order to perform xpath
>>> operations.
>> Well, yes, but unless you already know the namespace (URI), you can't know
>> what the tag you find signifies in the first place.
>> Unless, obviously, you are confusing namespaces with namespace prefixes.
>> But you don't need to know the prefixes for XPath.
>> Does this help?
>> http://codespeak.net/lxml/xpathxslt.html#namespaces-and-prefixes
>
> I know that.

I just remembered that there's also this:

http://codespeak.net/lxml/FAQ.html#how-can-i-find-out-which-namespace-prefixes-are-used-in-a-document

Stefan
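
To illustrate the remark that the document's prefixes are not needed for a query, here is a small sketch. It assumes the standard library's xml.etree.ElementTree so that it runs without lxml; the child element name "entry", its "dn" attribute, and the prefix "d" are hypothetical and chosen only for the example. Only the namespace URI is taken from the document quoted in the thread.

```python
import xml.etree.ElementTree as ET

DSML_NS = "http://www.dsml.org/DSML"

# Hypothetical document: the root element matches the one quoted in the
# thread, the children are invented for illustration.
DOC = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<dsml:dsml xmlns:dsml="http://www.dsml.org/DSML">'
    b'<dsml:entry dn="cn=example"/>'
    b'<dsml:entry dn="cn=other"/>'
    b'</dsml:dsml>'
)

root = ET.fromstring(DOC)

# The prefix "d" is defined by the code that writes the query, not by the
# document. Only the namespace URI has to match.
by_prefix = root.findall(".//d:entry", namespaces={"d": DSML_NS})

# The same query with the URI spelled out inline, using no prefix at all.
by_uri = root.findall(".//{%s}entry" % DSML_NS)

dns = [entry.get("dn") for entry in by_prefix]
print(dns)
```

The document says "dsml", the query says "d", and both lookups return the same two elements. With lxml, the corresponding call is doc.xpath("//d:entry", namespaces={"d": DSML_NS}), as described on the namespaces-and-prefixes page linked above.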
From: Martin v. Loewis on 16 May 2010 03:07

> Well, there's an "nsmap" property on each Element that provides the
> mapping of prefixes to namespace URIs that form the scope of the
> Element. However, while this is what the OP asked for, it is not what
> the OP wants, simply because it doesn't solve the problem.

Well, it solves the problem at hand: he gets some prefix mapping. He
probably could have used a hard-coded prefix mapping for the 20 or so
namespaces in his application instead (with a different set of flaws in
that approach).

> That's why I asked for more details in order to understand what the
> actual problem is that the OP is trying to solve, because the approach
> that the OP is apparently trying to follow is clearly misguided.

I completely agree. However, I recommend that we let him find out on his
own. I suspect he has some idiomatic usage of XML, perhaps with all
namespace prefixes defined in the root element. He'll find out that his
approach is flawed in the general case when he encounters such a case.
It's probably pointless trying to convince him in the abstract.

Regards,
Martin