I'm not very familiar with XPath, but I have been working with XPath expressions and storing them in a database. Actually it's just the BAM tool for BizTalk.
Anyway, I have an xml which could look like:
<File>
<Element1>element1</Element1>
<Element2>element2</Element2>
<Element3>
<SubElement>sub1</SubElement>
<SubElement>sub2</SubElement>
<SubElement>sub3</SubElement>
</Element3>
</File>
I was wondering if there is a way to use an XPath expression to get all the SubElements concatenated. At the moment, I am using:
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement']
This works when there is only one SubElement. But my XML sometimes has several of these nodes, and then it gives NULL. I could just use
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement'][1]
but I need all the nodes. Is there a way to do this?
Thanks a lot!
Edit: I changed the XML, I was wrong, it's different, it should look like this:
<item>
<element1>el1</element1>
<element2>el2</element2>
<element3>el3</element3>
<element4>
<subEl1>subel1a</subEl1>
<subEl2>subel2a</subEl2>
</element4>
<element4>
<subEl1>subel1b</subEl1>
<subEl2>subel2b</subEl2>
</element4>
</item>
And I need a one-line expression that gives a result like: "subel2a subel2b";
It has to be one line because I set this XPath expression as an XML attribute (not my choice, it's specified). I tried string-join but it's not really working.
string-join(/file/Element3/SubElement, ',')
/File/Element3/SubElement will match all of the SubElement elements in your sample XML. What are you using to evaluate it?
If your evaluation method is subject to the "first node rule", then it will only match the first one. If you are using a method that returns a nodeset, then it will return all of them.
You can get all SubElements by using:
//SubElement
But this won't keep them grouped by parent the way you want. To do that, query for every element that contains a SubElement (that is, the parent of any SubElement):
//*[SubElement]
Once you have that, you could (depending on your programming language) loop through the parents and concatenate the SubElements.
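As a rough illustration of that loop, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The choice of Python, the helper name joined_subelements, and the separator are assumptions for illustration; the sample is the question's first XML with its closing tags written out.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, with the closing tags written out.
XML = """
<File>
  <Element1>element1</Element1>
  <Element2>element2</Element2>
  <Element3>
    <SubElement>sub1</SubElement>
    <SubElement>sub2</SubElement>
    <SubElement>sub3</SubElement>
  </Element3>
</File>
"""


def joined_subelements(xml_text, sep=" "):
    """Return one joined string per parent that has SubElement children.

    Hypothetical helper, not part of any library.
    """
    root = ET.fromstring(xml_text)
    joined = []
    # ".//*[SubElement]" selects every descendant that has a SubElement child.
    for parent in root.findall(".//*[SubElement]"):
        texts = [child.text or "" for child in parent.findall("SubElement")]
        joined.append(sep.join(texts))
    return joined


result = joined_subelements(XML)
print(result)
```

Each entry in the returned list corresponds to one parent, so documents with several groups of SubElements stay grouped.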
My XPath expression appears to be slightly wrong. Here is a snippet of my XML..
<wd:Repository_Document_Reference wd:Descriptor="EIB_Input.zip">
<wd:ID wd:type="WID">VALUE 1</wd:ID>
<wd:ID wd:type="Document_ID">VALUE 2</wd:ID>
</wd:Repository_Document_Reference>
I am looking to extract 'VALUE 2' as a single output.
The current XPath I am using is not working:
/wd:Repository_Document_Reference/wd:ID[@wd:type='Document_ID']
Does my XPath need a slight tweak?
Thanks
Your XPath selects
<wd:ID wd:type="Document_ID">VALUE 2</wd:ID>
from the XML you've shown. In a context where this is evaluated as a string, it will indeed be
VALUE 2
If you wish to force it to be evaluated as a string, you can explicitly take the string value of your XPath:
string(/wd:Repository_Document_Reference/wd:ID[@wd:type='Document_ID'])
However, there's a chance that the rest of your document, which you've not shown, is causing other complications. Your XPath might be selecting multiple elements or no elements, so make sure it is specific enough to select only the element you want. Also make sure that the namespace prefix, wd, is bound to the right namespace URI in whatever evaluates the expression. Without knowing more about your actual example, we can't say.
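To make the namespace point concrete, here is a small sketch in Python's xml.etree.ElementTree (an assumed tool; the question does not say what evaluates the XPath). The namespace URI "urn:example:wd" is a placeholder, since the real URI is not shown in the snippet.

```python
import xml.etree.ElementTree as ET

# Assumption: the real namespace URI is not shown in the question,
# so "urn:example:wd" is only a placeholder.
WD_NS = "urn:example:wd"

XML = f"""
<wd:Repository_Document_Reference xmlns:wd="{WD_NS}" wd:Descriptor="EIB_Input.zip">
  <wd:ID wd:type="WID">VALUE 1</wd:ID>
  <wd:ID wd:type="Document_ID">VALUE 2</wd:ID>
</wd:Repository_Document_Reference>
"""

root = ET.fromstring(XML)

# The prefix used in the expression must be mapped to the same URI
# that the document binds it to; the prefix name itself is arbitrary.
ns = {"wd": WD_NS}

# root is already wd:Repository_Document_Reference, so the path is relative to it.
node = root.find("wd:ID[@wd:type='Document_ID']", ns)
document_id = node.text if node is not None else None
print(document_id)
```

If the prefix map is missing or points at a different URI, the lookup fails or finds nothing, which looks the same as "the XPath is not working".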
Try:
/wd:Repository_Document_Reference/wd:ID[contains(@wd:type,"Document_ID")]
I am new to XPath and have been trying to get my head around some basic examples using XPath testing sites before I tackle a much more complex piece.
I am trying to understand exactly how to use the contains function in conjunction with other condition(s), but struggling a bit.
Here is my xml:
<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
<actors>
<actor id="1">Christian Bale</actor>
<actor id="2">Liam Neeson</actor>
<actor id="3">Michael Caine</actor>
</actors>
<foo:singers>
<foo:singer id="4">Tom Waits</foo:singer>
<foo:singer id="4">B.B. King</foo:singer>
<foo:singer id="6">Ray Charles</foo:singer>
</foo:singers>
</root>
To replicate the type of xml (or html to be more precise) I am trying to parse, I have got one of the singer attributes repeated.
So I am trying to use contains to find the foo:singer id = 4 and contains "Tom Waits".
From what I have read and examples I have seen, you can use this type of expression:
.//*[@id =4 and //foo:singer[contains(text(),'Tom Waits')]]/text()
However, this returns both Tom Waits and B.B. King.
If I use these two expressions separately, they get the expected results, so not sure exactly if/how they can be combined.
Many thanks if you can assist me.
Andrew
Be sure to pay attention to context. There's no need for the nested predicate.
.//*[@id =4 and contains(.,'Tom Waits')]/text()
So I am trying to use contains to find the foo:singer id = 4 and contains "Tom Waits".
Since you're using //foo:singer for the contains test, the entire document is in context so it's always true.
Use
//foo:singer[contains(text(),'Tom Waits')]/text()
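The context issue can be imitated outside XPath. The sketch below uses Python's xml.etree.ElementTree, which has no contains() function, so the substring test is done in Python; it is an emulation of the two predicates under that assumption, not a literal evaluation of them.

```python
import xml.etree.ElementTree as ET

XML = """
<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
  <actors>
    <actor id="1">Christian Bale</actor>
    <actor id="2">Liam Neeson</actor>
    <actor id="3">Michael Caine</actor>
  </actors>
  <foo:singers>
    <foo:singer id="4">Tom Waits</foo:singer>
    <foo:singer id="4">B.B. King</foo:singer>
    <foo:singer id="6">Ray Charles</foo:singer>
  </foo:singers>
</root>
"""

ns = {"foo": "http://www.foo.org/"}
root = ET.fromstring(XML)

# Attribute test handled by ElementTree's XPath subset.
candidates = root.findall(".//foo:singer[@id='4']", ns)

# Both conditions tested against the SAME element (the current context node).
both_conditions = [el.text for el in candidates if "Tom Waits" in (el.text or "")]

# Emulation of the nested //foo:singer[...] test: it asks whether ANY singer
# in the whole document matches, so it is true for every candidate.
any_singer_matches = any(
    "Tom Waits" in (el.text or "") for el in root.findall(".//foo:singer", ns)
)
nested_absolute = [el.text for el in candidates if any_singer_matches]

print(both_conditions)
print(nested_absolute)
```

The second list contains both id=4 singers, which mirrors the behaviour described in the question.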
I am trying to quickly find a specific node using XPath but it seems my multiple predicates are not working. The div I need has a specific class, but there are 3 others that have it. I want to select the fourth one so I did the following:
//div[@class='myCLass' and 4]
However the "4" is being ignored. Any help? I am new to XPath.
Thanks.
If an XPath query returns a node set, you can always use a positional predicate, [N], to access a particular element of it.
Use the following query to access the fourth element that matches the @class='myCLass' predicate:
//div[@class='myCLass'][4]
@WilliamNarmontas's answer might be an alternative to the syntax shown above.
Alternatively,
//div[@class='myCLass' and position()=4]
The accepted answer works correctly only if all of the div elements have the same parent. Otherwise use:
(//div[@class='myCLass'])[4]
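The difference between the two forms can be sketched in plain Python. This is an emulation using xml.etree.ElementTree plus list indexing, not a direct XPath evaluation, and the sample markup with two section parents is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: four matching divs spread over two parents.
HTML = """
<body>
  <section>
    <div class="myCLass">a</div>
    <div class="myCLass">b</div>
  </section>
  <section>
    <div class="myCLass">c</div>
    <div class="myCLass">d</div>
  </section>
</body>
"""

root = ET.fromstring(HTML)


def is_match(el):
    """True for a div whose class attribute is exactly 'myCLass'."""
    return el.tag == "div" and el.get("class") == "myCLass"


# Like (//div[@class='myCLass'])[4]: collect every match in document
# order first, then take the fourth one overall.
all_matches = root.findall(".//div[@class='myCLass']")
fourth_overall = all_matches[3].text if len(all_matches) >= 4 else None

# Like //div[@class='myCLass'][4]: the position is counted separately
# among the matching children of each parent.
fourth_per_parent = []
for parent in root.iter():
    siblings = [child for child in parent if is_match(child)]
    if len(siblings) >= 4:
        fourth_per_parent.append(siblings[3].text)

print(fourth_overall)
print(fourth_per_parent)
```

With this markup no single parent has four matching divs, so the per-parent form selects nothing while the parenthesized form still finds the fourth match.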
I've been hacking away at this one for hours and I just can't figure it out. Using XPath to find text values is tricky and this problem has too many moving parts.
I have a webpage with a large table, and a section of this table contains a list of users (assignees) that are assigned to a particular unit. There are nearly always multiple users assigned to a unit, and I need to make sure a particular user is assigned to any of the units in the table. I've used XPath for nearly all of my selectors and I'm halfway there on this one. I just can't seem to figure out how to use contains with text() in this context.
Here's what I have so far:
//td[@id='unit']/span[text()='asdfasdfasdfasdfasdf (Primary); asdfasdfasdfasdfasdf, asdfasdfasdfasdf; 456, 3456; testuser']
The XPath Query above captures all text in the particular section I am looking at, which is great. However, I only need to know if testuser is in that section.
text() gets you a set of text nodes. I tend to use it more in a context of //span//text() or something.
If you are trying to check if the text inside an element contains something you should use contains on the element rather than the result of text() like this:
span[contains(., 'testuser')]
XPath is pretty good with context. If you know exactly what text a node should have you can do:
span[.='full text in this span']
But if you want to do something like regular expressions (using EXSLT, for example), you'll need to use the string() function:
span[regexp:test(string(.), 'testuser')]
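The gap between text() and the element's string value can be seen in a small Python sketch. It uses xml.etree.ElementTree, which has no contains(), so the substring checks are done in Python; the markup and the names alice and bob are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the user name sits inside a nested element.
HTML = """
<table>
  <tr>
    <td id="unit"><span>alice (Primary); <b>testuser</b>; bob</span></td>
  </tr>
</table>
"""

root = ET.fromstring(HTML)
span = root.find(".//td[@id='unit']/span")

# Roughly what contains(text(), ...) looks at in XPath 1.0:
# only the first text node directly under the span.
first_text_node = span.text or ""

# Roughly what contains(., ...) looks at: the string value of the
# element, i.e. all descendant text concatenated in document order.
string_value = "".join(span.itertext())

in_first_text_node = "testuser" in first_text_node
in_string_value = "testuser" in string_value

print(in_first_text_node, in_string_value)
```

When the span holds only plain text the two checks agree; they diverge as soon as the text is split by child elements.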
I'm using Nokogiri to parse a large XML file. Say I've got the following structure:
<menagerie>
<penguin>Pablo</penguin>
<penguin>Mortimer</penguin>
<bull>Ferdinand</bull>
<aardvark>James Cornelius Madison Humphrey Zophar Handlebrush III</aardvark>
</menagerie>
I can count the non-penguins like this:
xml.xpath('//menagerie//*[not(self::penguin)]').length # => 2
But how do I get a list of the tags, like this? (The exact format isn't important; I just want to visually scan the non-penguins.)
bull
aardvark
Update
This gave me the list I wanted - thanks Oded and TMN and delnan!
xml.xpath('//menagerie/*[not(self::penguin)]').each do |node|
puts node.name()
end
You can use the name() or local-name() XPath function.
See the examples on zvon.
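I know it's a bit outdated, but you could use xml.xpath('//menagerie/*[not(self::penguin)]/name()') as the expression. Note the slash, not the dot: in XPath 2.0 a function call can be the last step of a path and is evaluated for each selected node. Nokogiri on MRI uses libxml2, which implements only XPath 1.0, so this form is unlikely to work there.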
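One detail worth checking in expressions like these: in XPath 1.0 a bare name inside a predicate is a child test, so not(penguin) means "has no penguin child", while not(self::penguin) means "is not itself a penguin". The Python sketch below imitates both readings with xml.etree.ElementTree; it is an emulation for illustration, not Nokogiri code.

```python
import xml.etree.ElementTree as ET

XML = """
<menagerie>
  <penguin>Pablo</penguin>
  <penguin>Mortimer</penguin>
  <bull>Ferdinand</bull>
  <aardvark>James Cornelius Madison Humphrey Zophar Handlebrush III</aardvark>
</menagerie>
"""

root = ET.fromstring(XML)

# Reading of *[not(penguin)]: children that have no <penguin> child.
# None of the four animals has child elements, so all of them qualify.
no_penguin_child = [child.tag for child in root if child.find("penguin") is None]

# Reading of *[not(self::penguin)]: children that are not <penguin> themselves.
not_a_penguin = [child.tag for child in root if child.tag != "penguin"]

print(no_penguin_child)
print(not_a_penguin)
```

Only the second reading produces the bull and aardvark list the question asks for.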