Good night, friends!
Let's suppose we have an XML document with 30 items like the ones below, and I want to get just the name and url from an item whose name contains the word Richard.
<channel>
<item>
<name>Brian</name>
<lastname>Connor</lastname>
<age>40</age>
<enclosure url="http://www.brian.com"/>
</item>
<item>
<name>Richard</name>
<lastname>Wendell</lastname>
<age>38</age>
<enclosure url="http://www.richard.com"/>
</item>
</channel>
How can I do that using XPath?
I tried:
"//channel/item[name[contains(text(),'Richard')]]" but it returns just the name, and I don't know how to select the url along with it.
Please excuse my English!
Your approach does not work because you are selecting a subtree (in this case an item) of the XML tree, which contains more information than you want. If you want just a subset of the values in ONE XPath expression, you have to select them separately and then concatenate them appropriately, e.g.
concat('name=', //channel/item[contains(name, 'Richard')]/name, ' url=', //channel/item[contains(name, 'Richard')]/enclosure/@url)
The example will allow you to alter the additional formatting easily.
By the way: your XML input was malformed. I corrected this.
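If the XPath is being evaluated from a host language, the same result can be collected without concat(). Below is a minimal sketch in Python using the standard library's xml.etree.ElementTree. That module supports only a subset of XPath (no contains() or concat()), so the substring test is done in Python instead; the function name names_and_urls is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<channel>
  <item>
    <name>Brian</name>
    <lastname>Connor</lastname>
    <age>40</age>
    <enclosure url="http://www.brian.com"/>
  </item>
  <item>
    <name>Richard</name>
    <lastname>Wendell</lastname>
    <age>38</age>
    <enclosure url="http://www.richard.com"/>
  </item>
</channel>"""


def names_and_urls(xml_text, needle):
    """Return (name, url) pairs for every <item> whose <name> contains needle."""
    channel = ET.fromstring(xml_text)
    pairs = []
    for item in channel.iterfind("item"):
        name = item.findtext("name", default="")
        enclosure = item.find("enclosure")
        # Substring test done in Python because ElementTree has no contains().
        if needle in name and enclosure is not None:
            pairs.append((name, enclosure.get("url")))
    return pairs


print(names_and_urls(XML, "Richard"))
```

Returning pairs rather than one formatted string leaves the formatting to the caller, which is the same flexibility the concat() version aims for.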
Related
I was selecting all attributes id and everything was going nicely then one day requirements changed and now I have to select all except one!
Given the following example:
<root>
<structs id="123">
<struct>
<comp>
<data id="asd"/>
</comp>
</struct>
</structs>
</root>
I want to select all attributes id except the one at /root/structs/struct/comp/data
Please note that the Xml could be different.
Meaning, what I really want is: given any Xml tree, I want to select all attributes id except the one on element /root/structs/struct/comp/data
I tried the following:
//@id[not(ancestor::struct)] It kind of worked, but I want to give the full path to the ancestor, which I couldn't do.
//@id[not(contains(name(), 'data'))] It didn't work because name() returns the name of the context node, which is the attribute, not its parent element.
The following should achieve what you're describing:
//@id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root)]
As you can see, it simply checks from bottom to top whether the id attribute's parent matches the path root/structs/struct/comp/data.
I think this should be sufficient for your needs, but it does not 100% ensure that the parent is at the path /root/structs/struct/comp/data because it could be, for example, at the path /someOtherHigherRoot/root/structs/struct/comp/data. I'm guessing that's not a possible scenario in your XML structure, but if you had to check for that, you could do this:
//@id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root[not(parent::*)])]
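For comparison, here is a rough sketch of the same "every id except the one at a given absolute path" check in Python. The standard library's xml.etree.ElementTree has no parent axis, so this version walks the tree and tracks each element's absolute path itself. It is a swapped-in technique rather than an evaluation of the XPath above, and the function name ids_except is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <structs id="123">
    <struct>
      <comp>
        <data id="asd"/>
      </comp>
    </struct>
  </structs>
</root>"""


def ids_except(xml_text, excluded_path):
    """Collect every id attribute except those on elements at excluded_path."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(element, path):
        # path is the absolute element path so far, e.g. /root/structs
        if "id" in element.attrib and path != excluded_path:
            found.append(element.get("id"))
        for child in element:
            walk(child, path + "/" + child.tag)

    walk(root, "/" + root.tag)
    return found


print(ids_except(XML, "/root/structs/struct/comp/data"))
```

Because the path is compared from the document root, an element at /someOtherHigherRoot/root/structs/struct/comp/data is not excluded, which matches the stricter second expression.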
I want to extract a value from xml via xpath and I'm struggling a bit. This is the example of xml I have to work with
<data>
<menu>
<date>2017-10-30</date>
<type>S</type>
<name>onion soup</name>
</menu>
<menu>
<date>2017-10-30</date>
<type>L</type>
<name>ham sandwich</name>
</menu>
<menu>
<date>2017-10-31</date>
<type>S</type>
<name>pumpkin soup</name>
</menu>
<menu>
<date>2017-10-31</date>
<type>L</type>
<name>cheese sandwich</name>
</menu>
<menu>
<date>2017-11-1</date>
<type>S</type>
<name>sweet potato soup</name>
</menu>
<menu>
<date>2017-11-1</date>
<type>L</type>
<name>chicken sandwich</name>
</menu>
</data>
The dates and meal names are dynamically changing.
Now I have 2 columns, for Today's soup and Tomorrow's. I know how to link to xml via xpath for today's soup:
/data/menu/name[../type/text() = "S"] or /data/menu[type[text()='S']]/name
But I struggle with tomorrow's as my xml feed doesn't have any attributes to differentiate, types are the same for both dates and date is constantly changing.
Thanks for any help.
Edit:
Thank you for answering.
I think I described my problem badly.
I should probably point out that I'm using the built-in XPath feature of a piece of local software.
You're right, these lines
/data/menu[type='S' and date='2017-10-31']/name
are for all the soups, I just wrongly described it by how it behaves on my end, where it gives me just the value of the first one.
/data/menu[type='S' and date='2017-11-1']/name
will give me tomorrow's soup, but if I use this output in a static column labelled "Tomorrow's soup", it will only be correct for one day. I need it to be correct on the following days as well.
I need a line that will give me "tomorrow's soup", which is supposed to be pumpkin soup today; tomorrow, when the XML updates, it would be sweet potato soup, and the day after that it will be some new soup that gets added later when the whole XML is updated.
If I use
/data/menu[type='S' and date='2017-10-30']/name
it will not show anything tomorrow, since there won't be a 2017-10-30 entry: the XML will update and will start with 2017-10-31.
I hope it's clearer now what I'm asking. I know it's still confusing; it's hard for me to describe in English, especially since I'm a beginner when it comes to XPath.
How to differentiate elements without attributes? Use other elements...
But first to clear up a wrong assumption:
Now I have 2 columns, for Today's soup and Tomorrow's. I know how to
link to xml via xpath for today's soup:
/data/menu/name[../type/text() = "S"] or
/data/menu[type[text()='S']]/name
The XPaths that you say will give you today's soups will actually give you all soups regardless of date.
XPath 1.0
XPath 1.0 has no date functions [1], so you'll have to pass the current date and tomorrow's date into your XPath, and you're on your own to test the date element's value as a string:
If today is 2017-10-31, then this XPath will give you the names of today's soups,
/data/menu[type='S' and date='2017-10-31']/name
and this XPath will give you the names of tomorrow's soups:
/data/menu[type='S' and date='2017-11-1']/name
[1] XPath 2.0 and 3.0's dynamic context includes a current-dateTime() function, but its format is implementation-dependent, which limits its usefulness. You might be able to use date calculations to determine tomorrow's date, but unless you want to be dependent upon an implementation-defined format for current-dateTime(), you'll have to pass today's date into your XPath at least.
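Where a host language is available, "passing the date in" can look like the sketch below: Python computes today and tomorrow, and the date element is compared as a parsed date rather than as a string. It uses the standard library's xml.etree.ElementTree, which has no "and" operator in predicates, so the filter is written in Python. The helper names parse_day and soups_on are hypothetical, and the fixed value of today stands in for date.today() so that the sample data matches.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

XML = """<data>
  <menu><date>2017-10-30</date><type>S</type><name>onion soup</name></menu>
  <menu><date>2017-10-30</date><type>L</type><name>ham sandwich</name></menu>
  <menu><date>2017-10-31</date><type>S</type><name>pumpkin soup</name></menu>
  <menu><date>2017-10-31</date><type>L</type><name>cheese sandwich</name></menu>
  <menu><date>2017-11-1</date><type>S</type><name>sweet potato soup</name></menu>
  <menu><date>2017-11-1</date><type>L</type><name>chicken sandwich</name></menu>
</data>"""


def parse_day(text):
    """Parse 'YYYY-M-D', with or without zero padding, into a date."""
    year, month, day = (int(part) for part in text.strip().split("-"))
    return date(year, month, day)


def soups_on(xml_text, day):
    """Return the names of all type 'S' menus whose date equals day."""
    root = ET.fromstring(xml_text)
    return [
        menu.findtext("name")
        for menu in root.iterfind("menu")
        if menu.findtext("type") == "S"
        and parse_day(menu.findtext("date")) == day
    ]


today = date(2017, 10, 30)  # assumption for the sample; use date.today() in practice
tomorrow = today + timedelta(days=1)
print(soups_on(XML, today))
print(soups_on(XML, tomorrow))
```

Parsing the date sidesteps the padding mismatch in the feed, where 2017-11-1 would not equal the string 2017-11-01.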
I want to use XPath to select the sub tree containing the <name>-tag with "ABC" and not the other one from the following xml. Is this possible? And as a minor question, which keywords would I use to find something like that over Google (e.g. for selecting the sub tree by an attribute I would have the terminology for)?
<root>
<operation>
<name>ABC</name>
<description>Description 1</description>
</operation>
<operation>
<name>DEF</name>
<description>Description 2</description>
</operation>
</root>
Use:
/*/operation[name='ABC']
For your second question: I strongly recommend not to rely on online sources (there are some that aren't so good) but to read a good book on XPath.
See some resources listed here:
https://stackoverflow.com/questions/339930/any-good-xslt-tutorial-book-blog-site-online/341589#341589
For your first question, I think a more accurate way to do it would be:
//operation[./name[text()='ABC']]
And according to this, we can also write it as:
//operation[./name[text()[.='ABC']]]
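As a quick way to try the child-value predicate, here is a small sketch using Python's standard xml.etree.ElementTree, whose limited XPath support does include the [tag='text'] form. The path is relative to the root element, so it plays the role of /*/operation[name='ABC'].

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <operation>
    <name>ABC</name>
    <description>Description 1</description>
  </operation>
  <operation>
    <name>DEF</name>
    <description>Description 2</description>
  </operation>
</root>"""

root = ET.fromstring(XML)

# Select the <operation> whose <name> child has the text ABC.
operation = root.find("operation[name='ABC']")
print(operation.findtext("description"))
```

The result is the whole operation element, so its other children (here description) remain reachable from it.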
What XPath query will select the <media:thumbnail /> node in the following XML?
<item>
<title>Sublime Federer crushes Wawrinka</title>
<description>Defending champion Roger Federer cruises past Stanislas Wawrinka 6-1 6-3 6-3 to take his place in the Australian Open semi-finals.</description>
<link>http://news.bbc.co.uk/go/rss/-/sport2/hi/tennis/9372592.stm</link>
<guid isPermaLink="false">http://news.bbc.co.uk/sport1/hi/tennis/9372592.stm</guid>
<pubDate>Tue, 25 Jan 2011 04:21:23 GMT</pubDate>
<category>Tennis</category>
<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/50933000/jpg/_50933894_011104979-1.jpg"/>
</item>
The XML came from this RSS feed.
You need to learn about namespaces and how to define/register a namespace in your XPath engine so that you can then use the associated prefix for names in that registered namespace. There are plenty of questions in the xpath tag asking how to use names that are in a namespace -- with good answers. Search for them.
A very rough answer (ignoring namespaces altogether) is:
//*[name()='media:thumbnail']
What worked for me is:
/item/*[local-name()='thumbnail']
If you're looping an XmlNodeList array
just use *[local-name()='thumbnail']
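For the namespace-aware route, registering a prefix might look like the following Python sketch with the standard xml.etree.ElementTree. The snippet in the question omits the namespace declaration, so the xmlns:media attribute added to item here is an assumption: it uses the Media RSS namespace URI, which a real feed would normally declare on its rss element. The item is also shortened to the two children needed for the demonstration.

```python
import xml.etree.ElementTree as ET

# Assumption: the media prefix is bound to the Media RSS namespace.
XML = """<item xmlns:media="http://search.yahoo.com/mrss/">
  <title>Sublime Federer crushes Wawrinka</title>
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/50933000/jpg/_50933894_011104979-1.jpg"/>
</item>"""

# Prefix-to-URI mapping handed to the query; the prefix used here need not
# match the one in the document, only the URI has to.
NAMESPACES = {"media": "http://search.yahoo.com/mrss/"}

item = ET.fromstring(XML)
thumbnail = item.find("media:thumbnail", NAMESPACES)
print(thumbnail.get("url"))
```

Unlike the local-name() workaround, this only matches thumbnail elements that are actually in that namespace.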
I'm not very familiar with XPath, but I have been working with XPath expressions and storing them in a database. Actually, it's just the BAM tool for BizTalk.
Anyway, I have an xml which could look like:
<File>
<Element1>element1</Element1>
<Element2>element2</Element2>
<Element3>
<SubElement>sub1</SubElement>
<SubElement>sub2</SubElement>
<SubElement>sub3</SubElement>
</Element3>
</File>
I was wondering if there is a way to use an XPath expression to get all the SubElements concatenated. At the moment, I am using:
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement']
This works if it only has one index. But apparently my xml sometimes has more nodes, so it gives NULL. I could just use
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement'][1]
but I need all the nodes. Is there a way to do this?
Thanks a lot!
Edit: I changed the XML, I was wrong, it's different, it should look like this:
<item>
<element1>el1</element1>
<element2>el2</element2>
<element3>el3</element3>
<element4>
<subEl1>subel1a</subEl1>
<subEl2>subel2a</subEl2>
</element4>
<element4>
<subEl1>subel1b</subEl1>
<subEl2>subel2b</subEl2>
</element4>
</item>
And I need to have a one line code to get a result like: "subel2a subel2b";
I need the one line because I set this xpath expression as an xml attribute (not my choice, it's specified). I tried string-join but it's not really working.
string-join(/file/Element3/SubElement, ',')
/File/Element3/SubElement will match all of the SubElement elements in your sample XML. What are you using to evaluate it?
If your evaluation method is subject to the "first node rule", then it will only match the first one. If you are using a method that returns a nodeset, then it will return all of them.
You can get all SubElements by using:
//SubElement
But this won't keep them grouped together how you want. You will want to do a query for all elements that contain a SubElement (basically do a search for the parent of any SubElements).
//*[SubElement]
Once you have that, you could (depending on your programming language) loop through the parents and concatenate the SubElements.
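The loop-and-concatenate step could look like this in Python, shown against the edited XML from the question. It is a sketch with the standard xml.etree.ElementTree rather than a single XPath expression, and the function name join_sub_elements is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <element1>el1</element1>
  <element2>el2</element2>
  <element3>el3</element3>
  <element4>
    <subEl1>subel1a</subEl1>
    <subEl2>subel2a</subEl2>
  </element4>
  <element4>
    <subEl1>subel1b</subEl1>
    <subEl2>subel2b</subEl2>
  </element4>
</item>"""


def join_sub_elements(xml_text, path, separator=" "):
    """Join the text of every element matched by path, in document order."""
    root = ET.fromstring(xml_text)
    return separator.join(element.text or "" for element in root.iterfind(path))


print(join_sub_elements(XML, "element4/subEl2"))
```

In an engine that supports XPath 2.0, string-join(/item/element4/subEl2, ' ') expresses the same thing in one line; XPath 1.0 has no equivalent for a node-set of unknown size.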