I'm trying to use an xpath expression to select a node-set in an xml document with different namespaces defined.
The xml looks something like this:
<?POSTEN SND="SE00317644000" REC="5566420989" MSGTYPE="EPIX"?>
<ns:Msg xmlns:ns="http://www.noventus.se/epix1/genericheader.xsd">
<GenericHeader>
<SubsysId>1</SubsysId>
<SubsysType>30003</SubsysType>
<SendDateTime>2009-08-13T14:28:15</SendDateTime>
</GenericHeader>
<m:OrderStatus xmlns:m="http://www.noventus.se/epix1/orderstatus.xsd">
<Header>
<OrderSystemId>Soda SE</OrderSystemId>
<OrderNo>20090811</OrderNo>
<Status>0</Status>
</Header>
<Lines>...
I want to select only "Msg" nodes that have an "OrderStatus" child, and therefore I want to use the following xpath expression: /Msg[count('OrderStatus') > 0] but this won't work, since I get an error message saying: "Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function".
So I think I want to use an expression that looks something like this: /*[local-name()='Msg'][count('OrderStatus') > 0] but that doesn't seem to work.. any ideas?
Br,
Andreas
I want to use the following xpath
expression:
/Msg[count('OrderStatus')[ 0]
but this won't work since I get an error message saying: "Namespace
Manager or XsltContext needed.
This is a FAQ.
In XPath an unprefixed name is always considered to belong to "no namespace".
However, the elements you want to select are in fact in the "http://www.noventus.se/epix1/genericheader.xsd" namespace.
You have two possible ways to write your XPath expression:
1. Use the facilities of the hosting language to associate prefixes with all the namespaces that names in the expression belong to. You haven't indicated what the hosting language is in this concrete case, so I can't help you with this. A C# example can be found here.
If you have associated the prefix "xxx" with the namespace "http://www.noventus.se/epix1/genericheader.xsd" and the prefix "yyy" with the namespace "http://www.noventus.se/epix1/orderstatus.xsd", then your expression can be written as:
/xxx:Msg[yyy:OrderStatus]
2. If you don't want to use any prefixes at all, an XPath expression can still be constructed, although it will not be very readable:
/*[local-name() = 'Msg' and *[local-name() = 'OrderStatus']]
Finally, do note:
In order to test whether an element x has a child y, it isn't necessary to test for a positive count(y). Just use: x[y]
XPath positions are 1-based. This means that NodeSetExpression[0] never selects a node. You want: NodeSetExpression[1]
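As a rough illustration of the first approach: the question doesn't name its host language, so the sketch below assumes Python and its standard-library xml.etree.ElementTree module. The prefixes xxx and yyy are the arbitrary ones used above; the trimmed sample document and the helper name msg_has_order_status are made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the document from the question.
XML = """<ns:Msg xmlns:ns="http://www.noventus.se/epix1/genericheader.xsd">
  <GenericHeader><SubsysId>1</SubsysId></GenericHeader>
  <m:OrderStatus xmlns:m="http://www.noventus.se/epix1/orderstatus.xsd">
    <Header><OrderNo>20090811</OrderNo></Header>
  </m:OrderStatus>
</ns:Msg>"""

# The prefixes are local to this mapping; only the URIs must match the document.
NS = {
    "xxx": "http://www.noventus.se/epix1/genericheader.xsd",
    "yyy": "http://www.noventus.se/epix1/orderstatus.xsd",
}

def msg_has_order_status(xml_text):
    """Rough equivalent of boolean(/xxx:Msg[yyy:OrderStatus])."""
    root = ET.fromstring(xml_text)
    # ElementTree stores qualified names as "{namespace-uri}local-name".
    is_msg = root.tag == "{%s}Msg" % NS["xxx"]
    return is_msg and root.find("yyy:OrderStatus", NS) is not None

print(msg_has_order_status(XML))
```

Note that the document itself uses the prefixes ns and m; the lookup still works because matching is done on the namespace URI, not on the prefix.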
I have many entries in an XML file and an XPath expression with a condition:
/XMLReport/Report/PreflightResult/PreflightResultEntry[
@type = 'Check' and @level = 'warning']/PreflightResultEntryMessage/Message/text()
The output is:
onetwothreefour... and more
I need a separator between the values, for example '---':
one---two---three---four
or
[enter]
one
two
three
four
Is it possible?
Why did you wrap the XPath expression in single quotes (')?
Use this (string-join() requires XPath 2.0 or later):
string-join(/XMLReport/Report/PreflightResult/PreflightResultEntry[@type = 'Check' and @level = 'warning']/PreflightResultEntryMessage/Message/text(), '---')
Your XPath expression is actually returning a set of text nodes. The way these are displayed depends on the calling application (which you haven't told us anything about). I think your options are (a) change the way the calling application displays the result, or (b) if you're using XPath 2.0+, use the string-join() function to return the result as a string, formatted any way you like within the XPath expression itself.
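For option (a), here is a minimal sketch of doing the joining in the calling application. The question doesn't say what that application is, so Python with the standard-library xml.etree.ElementTree is an assumption, and the sample report and the function name join_messages are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample report with the element names from the question.
XML = """<XMLReport><Report><PreflightResult>
  <PreflightResultEntry type="Check" level="warning">
    <PreflightResultEntryMessage><Message>one</Message></PreflightResultEntryMessage>
  </PreflightResultEntry>
  <PreflightResultEntry type="Check" level="error">
    <PreflightResultEntryMessage><Message>skipped</Message></PreflightResultEntryMessage>
  </PreflightResultEntry>
  <PreflightResultEntry type="Check" level="warning">
    <PreflightResultEntryMessage><Message>two</Message></PreflightResultEntryMessage>
  </PreflightResultEntry>
</PreflightResult></Report></XMLReport>"""

def join_messages(xml_text, separator):
    """Select the matching Message elements and join their text."""
    root = ET.fromstring(xml_text)  # root is <XMLReport>
    messages = root.findall(
        "./Report/PreflightResult/PreflightResultEntry"
        "[@type='Check'][@level='warning']"
        "/PreflightResultEntryMessage/Message"
    )
    return separator.join(m.text or "" for m in messages)

print(join_messages(XML, "---"))
print(join_messages(XML, "\n"))
```

The two chained predicates stand in for the `and` in the original expression, since ElementTree only supports a subset of XPath.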
Could you please help me with evaluating this XPath expression?
I am working on fetching the proxy references. In the XML file the references are stored in one of two forms.
The first form of the XML file has the reference as below:
<con1:service ref="MyProject/ProxyServices/service1"
xsi:type="con2:PipelineRef" xmlns:ref="http://www.bea.com/wli/sb/reference"/>
Here the namespaces in the XML file are:
xmlns:con1="http://www.bea.com/wli/sb/stages/config"
xmlns:con2="http://www.bea.com/wli/sb/pipeline/config"
The second form of the XML file has the reference as below:
<con1:service ref="MyProject/ProxyServices/service2"
xsi:type="ref:ProxyRef" xmlns:ref="http://www.bea.com/wli/sb/reference"/>
Here the namespaces in the XML file are:
xmlns:con1="http://www.bea.com/wli/sb/stages/config"
xmlns:ref="http://www.bea.com/wli/sb/reference"
I have used this XPath expression, but it does not fetch the reference values. Could you please tell me what is wrong with it?
"//service[@type= @*[local-name() ='ProxyRef' or @type=@*[local-name() ='PipelineRef']]/@ref"
When I write it like this it works, but the namespace prefix keeps changing when there are multiple references in the XML file:
"//service[@type='ref:ProxyRef'or @type='con:PipelineRef' or @type='con1:PipelineRef' or @type='con2:PipelineRef' or @type='con3:PipelineRef' ...@type='con20:PipelineRef' ]/@ref";
Basically, the namespace prefix on the PipelineRef value of the type attribute keeps changing from con to con(n). I am looking for something like @type='*:PipelineRef' or @type='con*:PipelineRef', or for the best way to fetch the ref attribute value of the service element.
Thanks in advance.
Try using contains(), like so:
//service[contains(@type,':ProxyRef') or contains(@type,':PipelineRef')]
Another alternative would be the ends-with() function, which is more precise for this purpose than contains(). However, ends-with() isn't available in XPath 1.0, so you may need to emulate it yourself (feasible, but the resulting XPath is less intuitive to me).
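In XPath 1.0 itself, the usual stand-in for ends-with($s, $t) is substring($s, string-length($s) - string-length($t) + 1) = $t. If the filtering can be done in the host language instead, a suffix test is simpler. The sketch below assumes Python's standard-library xml.etree.ElementTree and assumes the xsi prefix is bound to the standard XML Schema instance namespace; the wrapping root element and the third service entry are invented for the example.

```python
import xml.etree.ElementTree as ET

# Expanded names in ElementTree's "{namespace-uri}local-name" form.
SERVICE = "{http://www.bea.com/wli/sb/stages/config}service"
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"  # assumed xsi binding

XML = """<root xmlns:con1="http://www.bea.com/wli/sb/stages/config"
      xmlns:con2="http://www.bea.com/wli/sb/pipeline/config"
      xmlns:ref="http://www.bea.com/wli/sb/reference"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <con1:service ref="MyProject/ProxyServices/service1" xsi:type="con2:PipelineRef"/>
  <con1:service ref="MyProject/ProxyServices/service2" xsi:type="ref:ProxyRef"/>
  <con1:service ref="MyProject/Other/service3" xsi:type="ref:SomethingElse"/>
</root>"""

def service_refs(xml_text):
    """Collect @ref of service elements whose xsi:type ends in a wanted name."""
    root = ET.fromstring(xml_text)
    refs = []
    for service in root.iter(SERVICE):
        type_name = service.get(XSI_TYPE, "")
        # The prefix before the colon may vary (con, con1, ...), so only
        # the suffix is compared.
        if type_name.endswith((":ProxyRef", ":PipelineRef")):
            refs.append(service.get("ref"))
    return refs

print(service_refs(XML))
```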
If I want to grab a currency's rate, say "USD", for a certain date, say "2015-02-09", how would I go about doing this?
I tried the following:
/gesmes:Envelope/def:Cube/def:Cube[@time="2014-11-19"]/def:Cube[@currency="USD"]/@rate
I suppose this is wrong due to my lack of understanding; at least, I know it is wrong because Nokogiri does not run it.
http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml
EDIT:
I'm going to go ahead and guess that I am not correctly using Nokogiri and XPath.
@doc = Nokogiri::XML(File.open("exchange_data.xml"))
@values = @doc.xpath('XPATH HERE')
@values.each {|i| puts i}
I have read the tutorial, and managed to get it working for other xml files, but this one seems harder to crack.
require 'nokogiri'
doc = Nokogiri::XML(File.open("xml4.xml"))
target_date = "2015-02-09"
target_currency = 'USD'
xpaths = [
  "//gesmes:Envelope",
  "/xmlns:Cube",
  "/xmlns:Cube[@time='#{target_date}']",
  "/xmlns:Cube[@currency='#{target_currency}']",
]
xpath = xpaths.join
target_cube = doc.at_xpath(xpath)
puts target_cube.attribute('rate')
--output:--
1.1297
Response to comment:
Your root tag:
<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
...declares two namespaces with xmlns, which stands for xml namespace. The namespace:
xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
declares that any child tag whose name is prefixed by gesmes, e.g.:
<gesmes:subject>
...
</gesmes:subject>
will actually have a tag name that incorporates the specified url into the tag name, something like this:
<http://www.gesmes.org/xml/2002-08-01:subject>
...
</http://www.gesmes.org/xml/2002-08-01:subject>
The reason you would want to use a namespace is to create a unique name for the Cube tag, so that it doesn't clash with another xml document's Cube tag.
The second namespace declaration:
xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref"
is a default namespace declaration. It declares that any child tag that does not specify a prefix will have the specified url incorporated into its tag name. So a tag like this:
<Cube>
...
</Cube>
becomes something like this:
<http://www.ecb.int/vocabulary/2002-08-01/eurofxref:Cube>
...
</http://www.ecb.int/vocabulary/2002-08-01/eurofxref:Cube>
However, it would be unwieldy to have to write a tag name like that in your xpaths, so in place of the url you instead use the shortcut xmlns:
/xmlns:Cube
This might be due to the namespaces in this document:
<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01" xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
To test this hypothesis, apply the following XPath expression:
/*[local-name() = 'Envelope']/*[local-name() = 'Cube']/*[local-name() = 'Cube'][@time="2014-11-19"]/*[local-name() = 'Cube'][@currency="USD"]/@rate
and let me know what you get. If you are otherwise correctly using XPath, you should end up with:
rate="1.2535"
If not, you are not using the XPath facilities of Nokogiri correctly, and then you'd really need to show all of your Ruby code to get help.
EDIT
Responding to a comment:
I look forward to seeing some examples added to your answer, so that I can learn something new about xml namespaces. – 7stud
7stud already gave the correct answer, I'll only add info I think is missing from this answer.
Explicit namespaces
First of all, if a namespace URI is explicitly present on an element, the correct syntax uses curly brackets, both for a prefixed and default namespace:
<{http://www.gesmes.org/xml/2002-08-01}subject>
Internally, this is how namespaces could be represented on elements (although some applications have other ways to associate elements with namespaces). Prefixes and default namespaces are there to simplify this process.
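To see this expanded form in practice: Python's standard-library xml.etree.ElementTree (chosen here only for illustration, it is not part of the question) exposes element names in exactly this curly-bracket notation. The sample document is a cut-down, partly made-up version of the ECB file.

```python
import xml.etree.ElementTree as ET

XML = """<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
    xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <gesmes:subject>Reference rates</gesmes:subject>
  <Cube/>
</gesmes:Envelope>"""

root = ET.fromstring(XML)

# Each tag is reported as "{namespace-uri}local-name", whether the
# namespace came from a prefix or from the default declaration.
expanded_names = [root.tag] + [child.tag for child in root]
for name in expanded_names:
    print(name)
```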
Namespaces in Nokogiri
Prefixes (gesmes:) do not have any inherent meaning. They can be associated with an arbitrary namespace URI and every document can use gesmes: to mean something different. Namespace declarations are not available to an XPath engine per se - usually, if you'd like to use a prefix in an XPath expression, you need to declare this namespace again for the XPath processor.
Yet, Nokogiri tries to simplify namespace handling for you by redeclaring namespace declarations found on the root element of the input document. This is important because it allows you to reuse the prefixes declared on the root element of the input without actually declaring the namespace. For default namespaces declared on the root element that do not have a prefix, Nokogiri has defined a special syntax:
xmlns:Cube
Namespaces that are present in the document, but declared on an element other than the root element:
<root>
<child xmlns:gesmes="http://other.com"/>
</root>
must be explicitly declared in Nokogiri:
@doc.xpath('//other:Cube', 'other' => 'http://other.com')
What's wrong with your original code?
Your code:
/gesmes:Envelope/def:Cube/def:Cube[@time="2014-11-19"]/def:Cube[@currency="USD"]/@rate
does not work because you are using an unknown prefix def:. This prefix is not declared on the root element of the input, and neither did you declare it with Nokogiri. The Cube elements are in the default namespace, and, as we have seen, the correct way to address them is
/gesmes:Envelope/xmlns:Cube
and so on, 7stud gave you the correct answer.
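For comparison, here is how the same lookup might look in a library that does not redeclare the root namespaces for you. This is only a sketch in Python's standard-library xml.etree.ElementTree, not Nokogiri; the prefix ecb and the function name rate_for are arbitrary choices, and the inline sample contains just the two rates quoted in this thread.

```python
import xml.etree.ElementTree as ET

XML = """<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
    xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <Cube>
    <Cube time="2015-02-09"><Cube currency="USD" rate="1.1297"/></Cube>
    <Cube time="2014-11-19"><Cube currency="USD" rate="1.2535"/></Cube>
  </Cube>
</gesmes:Envelope>"""

# The default namespace of the document has to be given a prefix here,
# because an unprefixed name in the path would mean "no namespace".
NS = {"ecb": "http://www.ecb.int/vocabulary/2002-08-01/eurofxref"}

def rate_for(xml_text, date, currency):
    """Return the rate attribute as a string, or None if nothing matches."""
    root = ET.fromstring(xml_text)  # root is gesmes:Envelope
    path = "ecb:Cube/ecb:Cube[@time='%s']/ecb:Cube[@currency='%s']" % (date, currency)
    cube = root.find(path, NS)
    return cube.get("rate") if cube is not None else None

print(rate_for(XML, "2015-02-09", "USD"))
```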
A certain element of my XML data should match EXACTLY ONE of the following conditions:
1.) It has a @when attribute and nothing else.
2.) It has a @when-iso attribute and nothing else.
3.) It has both a @notBefore-iso and a @notAfter-iso attribute, but neither a @when nor a @when-iso attribute.
I try to test that using schematron, but I fail at creating a matching xpath expression.
I tried
<assert test="@when or @when-iso or (not(@when) and not(@when-iso) and @notBefore-iso and @notAfter-iso)">
but that doesn't work. Obviously, the content in brackets is simply ignored. So, how can I build complex/nested conditional expressions?
An example that should work for your case:
<assert test="(@when and count(@*)=1) or (@when-iso and count(@*)=1) or (@notBefore-iso and @notAfter-iso and count(@*)=2)"/>
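To check the logic of that test outside Schematron, the same rule can be written as a set comparison. This is just a sketch in Python (not something Schematron runs); the function name has_valid_date_attributes is made up, and it reads "nothing else" as "no other attributes at all", like the assert above.

```python
def has_valid_date_attributes(attrs):
    """Mirror of the assert: exactly one of the three allowed attribute sets."""
    names = set(attrs)
    return (
        names == {"when"}                              # @when and count(@*)=1
        or names == {"when-iso"}                       # @when-iso and count(@*)=1
        or names == {"notBefore-iso", "notAfter-iso"}  # both and count(@*)=2
    )

print(has_valid_date_attributes({"when": "1900"}))
print(has_valid_date_attributes({"when": "1900", "when-iso": "1900"}))
```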
I am trying to quickly find a specific node using XPath but it seems my multiple predicates are not working. The div I need has a specific class, but there are 3 others that have it. I want to select the fourth one so I did the following:
//div[@class='myCLass' and 4]
However the "4" is being ignored. Any help? I am new to XPath.
Thanks.
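If an XPath query returns a node-set, you can always use a positional predicate, [OFFSET], to access a certain element of it. In your expression the "4" is not ignored as such: inside "and" it is converted to a boolean, and any non-zero number is true, so it has no effect.
Use the following query to access the fourth element that matches the @class='myCLass' predicate:
//div[@class='myCLass'][4]
@WilliamNarmontas's answer might be an alternative to the syntax shown above.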
Alternatively,
//div[@class='myCLass' and position()=4]
The accepted answer works correctly only if all of the div elements have the same parent. Otherwise use:
(//div[@class='myCLass'])[4]
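The difference between the two forms can be spelled out in host-language terms. The sketch below uses Python's standard-library xml.etree.ElementTree purely for illustration and does the counting by hand rather than relying on the library's own positional predicates; the sample markup and the function names nth_overall and nth_per_parent are invented.

```python
import xml.etree.ElementTree as ET

# Four matching divs in total, but spread over two parents.
HTML = """<body>
  <section><div class="myCLass">a</div><div class="myCLass">b</div></section>
  <section><div class="myCLass">c</div><div class="other">x</div><div class="myCLass">d</div></section>
</body>"""

def nth_overall(root, n):
    """Like (//div[@class='myCLass'])[n]: count across the whole document."""
    matches = root.findall(".//div[@class='myCLass']")  # document order
    return matches[n - 1] if len(matches) >= n else None

def nth_per_parent(root, n):
    """Like //div[@class='myCLass'][n]: count separately under each parent."""
    found = []
    for parent in root.iter():
        kids = [c for c in parent
                if c.tag == "div" and c.get("class") == "myCLass"]
        if len(kids) >= n:
            found.append(kids[n - 1])
    return found

root = ET.fromstring(HTML)
print(nth_overall(root, 4).text)
print([d.text for d in nth_per_parent(root, 4)])
```

With this sample the parenthesized form finds a fourth div, while the per-parent form finds none, because no single parent has four matching children.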