XPath Expression referencing a node - xpath

I am trying to reference a node in an expression. Take this simple example:
<?xml version="1.0" encoding="UTF-8" ?>
<homelist>
<homes>
<home>
<hname>house</hname>
<location>hell</location>
<url>wee</url>
<cID>1234</cID>
</home>
</homes>
<contacts>
<contactdetails cID="1234">
<cname>John Smith</cname>
<phone>0123234</phone>
<email>test@gmail.com</email>
</contactdetails>
</contacts>
</homelist>
I basically want to select nodes whose value appears somewhere else in the tree.
For example, I want to display the url of homes that have the cID of John Smith. I tried this, but it doesn't work. What is wrong with it?
homelist/homes/home[ancestor::homelist/contacts/contactdetails[cname="John Smith"]/url

"/homelist/homes/home[cID = /homelist/contacts/contactdetails[cname='John Smith']/@cID]/url"
You want to find the <home> whose <cID> child's text content equals that of the cID= attribute of the <contactdetails> whose <cname> contains 'John Smith', then return its <url> child.
Note that I've written this as an absolute path, from the root, since you didn't tell us what the context node was going to be for this XPath.
There are certainly other ways of writing the same concept; this is just the first one that occurred to me offhand.
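For readers who want to try the lookup outside a full XPath engine, here is a minimal Python sketch of the same cross-reference. It assumes the standard library's xml.etree.ElementTree, which implements only a subset of XPath and cannot compare a child against an absolute path inside a predicate, so the lookup is split into two steps. The helper name urls_for_contact is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (XML declaration omitted).
XML = """
<homelist>
  <homes>
    <home>
      <hname>house</hname>
      <location>hell</location>
      <url>wee</url>
      <cID>1234</cID>
    </home>
  </homes>
  <contacts>
    <contactdetails cID="1234">
      <cname>John Smith</cname>
      <phone>0123234</phone>
      <email>test@gmail.com</email>
    </contactdetails>
  </contacts>
</homelist>
"""


def urls_for_contact(root, name):
    """Return the <url> text of every <home> whose <cID> matches a contact named `name`."""
    # Step 1: collect the cID attribute of each matching <contactdetails>.
    contact_ids = {
        contact.get("cID")
        for contact in root.findall("./contacts/contactdetails")
        if contact.findtext("cname") == name
    }
    # Step 2: keep the homes whose <cID> child text is one of those ids.
    return [
        home.findtext("url")
        for home in root.findall("./homes/home")
        if home.findtext("cID") in contact_ids
    ]


root = ET.fromstring(XML)
print(urls_for_contact(root, "John Smith"))
```

A library with full XPath 1.0 support should be able to evaluate the single expression above directly; the two-step version is only a workaround for ElementTree's limited syntax.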
If you prefer to use the ancestor or parent axis, you could write:
"/homelist/homes/home[cID = ancestor::homelist/contacts/contactdetails[cname='John Smith']/@cID]/url"

Related

How to get parent element with attribute using xpath

I have posted sample XML and the expected output. Kindly help me get the result.
Sample XML
<root>
<A id="1">
<B id="2"/>
<C id="2"/>
</A>
</root>
Expected output:
<A id="1"/>
You can formulate this query in several ways:
Find elements that have a matching attribute, only descending all the time:
//*[@id=1]
Find the attribute, then ascend a step:
//@id[.=1]/..
Use the fn:id($id) function, given the document is validated and the ID-attribute is defined as such:
/id('1')
I don't think what you're after is possible. XPath has no way of selecting a node without its children, so the result would always include the nodes B and C in your case.
You could achieve this using XQuery. I'm not sure whether this is what you want, but here is an example that creates a new node based on an existing node stored in the $doc variable.
declare variable $doc := <root><A id="1"><B id="2"/><C id="2"/></A></root>;
element {fn:node-name($doc/*)} {$doc/*/@*}
The above returns <A id="1"></A>.
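The same "copy the element, drop its children" idea can be sketched in Python. This is only an illustration, assuming the standard library's xml.etree.ElementTree; the variable names are made up for the example. It finds the element by its id attribute and builds a new, childless element with the same tag and attributes.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring('<root><A id="1"><B id="2"/><C id="2"/></A></root>')

# ElementTree supports the [@attr='value'] predicate, so this finds <A id="1">.
a = doc.find(".//*[@id='1']")

# Build a new element with the same tag and attributes but no children.
shallow = ET.Element(a.tag, dict(a.attrib))

# Serialize the childless copy; the original <A> still has B and C.
print(ET.tostring(shallow, encoding="unicode"))
```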
Is that what you are looking for?
//*[@id='1']/parent::* , similar to //*[@id='1']/..
If you want to verify that the parent is root:
//*[@id='1']/parent::root
https://en.wikipedia.org/wiki/XPath
If you need not just the parent but an element further up with some attribute, read about axis specifiers and use the "ancestor::" axis.

XPath without specifying the tag? [duplicate]

Given this XML, what XPath returns all elements whose prop attribute contains Foo (the first three nodes):
<bla>
<a prop="Foo1"/>
<a prop="Foo2"/>
<a prop="3Foo"/>
<a prop="Bar"/>
</bla>
//a[contains(@prop,'Foo')]
This works if I use the following XML to get results back.
<bla>
<a prop="Foo1">a</a>
<a prop="Foo2">b</a>
<a prop="3Foo">c</a>
<a prop="Bar">a</a>
</bla>
Edit:
Another thing to note: while the XPath above returns the correct answer for that particular XML, if you want to guarantee that you only get the "a" elements inside the "bla" element, you should, as others have mentioned, also use
/bla/a[contains(@prop,'Foo')]
The following searches all "a" elements in your entire XML document, regardless of whether they are nested in a "bla" element:
//a[contains(@prop,'Foo')]
I added this for the sake of thoroughness and in the spirit of stackoverflow. :)
This XPath will give you all nodes that have attributes containing 'Foo' regardless of node name or attribute name:
//attribute::*[contains(., 'Foo')]/..
Of course, if you're more interested in the contents of the attribute themselves, and not necessarily their parent node, just drop the /..
//attribute::*[contains(., 'Foo')]
descendant-or-self::*[contains(@prop,'Foo')]
Or:
/bla/a[contains(@prop,'Foo')]
Or:
/bla/a[position() <= 3]
Dissected:
descendant-or-self::
The Axis - search through the context node itself and every node underneath it. It is often better to say this than //. I have encountered some implementations where // means anywhere (descendant-or-self of the root node), while others use the default axis.
* or /bla/a
The Tag - a wildcard match, and /bla/a is an absolute path.
[contains(@prop,'Foo')] or [position() <= 3]
The condition within [ ]. @prop is shorthand for attribute::prop, as attribute is another search axis. Alternatively you can select the first 3 by using the position() function.
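As a rough Python counterpart to the wildcard-plus-contains() expression dissected above, the sketch below assumes the standard library's xml.etree.ElementTree. That module's XPath subset has no contains() function, so the substring test is done in Python instead. The helper name elements_with_attr_containing is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<bla>
  <a prop="Foo1"/>
  <a prop="Foo2"/>
  <a prop="3Foo"/>
  <a prop="Bar"/>
</bla>
""")


def elements_with_attr_containing(root, attr, needle):
    """Return elements (any tag) whose `attr` attribute contains `needle`."""
    # iter() visits the root element and all of its descendants, which
    # plays the role of descendant-or-self::* in the XPath version.
    return [el for el in root.iter() if needle in el.get(attr, "")]


matches = elements_with_attr_containing(doc, "prop", "Foo")
print([el.get("prop") for el in matches])
```

The comparison is case sensitive, just like contains() in XPath, so searching for "foo" would not match this document.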
Have you tried something like:
//a[contains(@prop, "Foo")]
I've never used the contains function before but suspect that it should work as advertised...
John C is the closest, but XPath is case sensitive, so the correct XPath would be:
/bla/a[contains(@prop, 'Foo')]
If you also need to match the content of the link itself, use text():
//a[contains(@href,"/some_link")][text()="Click here"]
/bla/a[contains(@prop, "foo")]
try this:
//a[contains(@prop,'foo')]
that should work for any "a" tags in the document
For the code above...
//*[contains(@prop,'foo')]

How to find the parent node by matching text using XPath

I have some XML:
<sys>
<lang>
<employee>
<name>Employee 1</name>
<code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>
</employee>
<employee>
<name>Employee 2</name>
<code>1d960bdc-0853-49af-bb83-18cf92493897</code>
</employee>
</lang>
</sys>
How can I search and get the employee node where name ="Employee 1"?
I tried this but it didn't work:
obj.xpath("//sys/lang[/employee/name = 'Employee 1']")
This XPath
/sys/lang/employee[name = 'Employee 1']
will select the employee element whose name is Employee 1.
Why might OP be getting an "Invalid expression" using the above XPath?
Transcription error.
Resolution: Use copy and paste.
Single quotes around single quotes.
Resolution: Use outer double quotes: "/sys/lang/employee[name = 'Employee 1']"
Smart quotes.
Resolution: Replace ‘ and ’ with single quote '.
Misinterpretation of error message.
Resolution: Carefully check any line number mentioned in error, or carve away surrounding code as much as possible, and see if error goes away.
If none of the above possibilities apply, post an MCVE (Minimal, Complete, and Verifiable Example) that produces the invalid expression error. Include both the provided XPath and the calling code (that is the "complete" in MCVE), and someone will likely spot the problem immediately.
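For comparison, the employee lookup from this answer can be sketched in Python. This assumes the standard library's xml.etree.ElementTree, whose XPath subset supports the [child='text'] predicate used here; the path is written relative to the parsed root element (sys) rather than as an absolute path.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<sys>
  <lang>
    <employee>
      <name>Employee 1</name>
      <code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>
    </employee>
    <employee>
      <name>Employee 2</name>
      <code>1d960bdc-0853-49af-bb83-18cf92493897</code>
    </employee>
  </lang>
</sys>
""")

# doc is the <sys> element, so the path starts at its <lang> child.
# The predicate keeps the <employee> whose <name> child text is "Employee 1".
employee = doc.find("./lang/employee[name='Employee 1']")

print(employee.findtext("code"))
```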
I'm a big fan of using CSS over XPath for readability reasons. Nokogiri implements a number of jQuery's extensions to make it easier to use CSS for things we'd usually use XPath for.
I'd do it this way:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<sys>
<lang>
<employee>
<name>Employee 1</name>
<code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>
</employee>
<employee>
<name>Employee 2</name>
<code>1d960bdc-0853-49af-bb83-18cf92493897</code>
</employee>
</lang>
</sys>
EOT
emp1 = doc.at('employee name:contains("Employee 1")') # => #<Nokogiri::XML::Element:0x3ffed05285b4 name="name" children=[#<Nokogiri::XML::Text:0x3ffed05283d4 "Employee 1">]>
emp1.to_xml # => "<name>Employee 1</name>"
emp1.parent.to_xml # => "<employee>\n <name>Employee 1</name>\n <code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>\n </employee>"
Also note, it's not good practice to define the full path in the selector for a node. If the HTML or XML changes the structure that selector will break. Instead, find useful landmarks and hop from one to the next. That way your selector is more likely to survive changes in the markup. I only care about finding the appropriate <employee>...<name> combination, not those two tags embedded under <sys> and <lang>.
Sometimes an alternate way of getting to the information you want is to use search and look at a particular index:
doc.search('employee').first.to_xml # => "<employee>\n <name>Employee 1</name>\n <code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>\n </employee>"
Or:
doc.at('employee').to_xml # => "<employee>\n <name>Employee 1</name>\n <code>4fdaa994-7015-4ec1-b365-de4ee0279966</code>\n </employee>"
at('some selector') is equivalent to search('some selector').first.

Selecting a XML node with LINQ, and modifying

I've got the following XML:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
</Book>
</Config>
I need to find all instances of Book which are available in a specific country, and then introduce a node underneath "Available In". My selection statement fails anytime I add the where statement:
XElement xmlFile = XElement.Load(xmlFileLocation);
var q = from c in xmlFile.Elements("Book")
where c.Elements("Country").Value == "Canada"
select c;
.Value can't be resolved, and ToString gives me the entire subnode in string form. I need to select all books in a particular country so that I can then update them all to include a new locale node, e.g.:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
<LocaleIDs>
<LocaleID> 3066 </LocaleID>
</LocaleIDs>
</Book>
</Config>
Thanks for your help!
You're trying to use Value on the result of calling Elements which returns a sequence of elements. That's not going to work - it doesn't make any sense. You want to call it on a single element at a time.
Additionally, you're trying to look for direct children of Book, which ignores the Available In element, which isn't even a valid element name...
I suspect you want something like:
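var query = xmlFile.Elements("Book")
    // The inner lambda needs its own parameter name; reusing the outer one does not compile.
    .Where(book => book.Descendants("Country")
        .Any(country => (string) country == "Canada"));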
In other words, find Book elements where any of the descendant Country elements has a text value of "Canada".
You'll still need to fix your XML to use valid element names though...
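The select-then-modify flow can also be sketched in Python, assuming the standard library's xml.etree.ElementTree. The input below is a hypothetical cleaned-up version of the question's XML: AvailableIn replaces the invalid "Available In" name, and a second book is added purely for illustration. The helper names books_available_in and add_locale_id are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input with a valid element name (AvailableIn) and an
# extra, made-up second book so the filter has something to exclude.
config = ET.fromstring("""
<Config>
  <Book>
    <Name>Book Name #1</Name>
    <AvailableIn>
      <Country>US</Country>
      <Country>Canada</Country>
    </AvailableIn>
  </Book>
  <Book>
    <Name>Book Name #2</Name>
    <AvailableIn>
      <Country>US</Country>
    </AvailableIn>
  </Book>
</Config>
""")


def books_available_in(root, country):
    """Return direct <Book> children that have any descendant <Country> equal to `country`."""
    return [
        book
        for book in root.findall("Book")
        if any(c.text == country for c in book.iter("Country"))
    ]


def add_locale_id(book, locale_id):
    """Append <LocaleIDs><LocaleID>locale_id</LocaleID></LocaleIDs> to `book`."""
    locale_ids = ET.SubElement(book, "LocaleIDs")
    ET.SubElement(locale_ids, "LocaleID").text = locale_id


canadian = books_available_in(config, "Canada")
for book in canadian:
    add_locale_id(book, "3066")

print([book.findtext("Name") for book in canadian])
```

As in the LINQ query, the test is applied to each Country element individually rather than to the whole sequence.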

Using XQuery/XPath to get the attribute value of an element's parent node

Given this xml document:
<?xml version="1.0" encoding="UTF-8"?>
<mydoc>
<foo f="fooattr">
<bar r="barattr1">
<baz z="bazattr1">this is the first baz</baz>
</bar>
<bar r="barattr2">
<baz z="bazattr2">this is the second baz</baz>
</bar>
</foo>
</mydoc>
that is being processed by this xquery:
let $d := doc('file:///Users/mark/foo.xml')
let $barnode := $d/mydoc/foo/bar/baz[contains(@z, '2')]
let $foonode := $barnode/../../@f
return $foonode
I get the following error:
"Cannot create an attribute node (f) whose parent is a document node".
It seems that the ../ operation is sort of removing the matching nodes from the rest of the document such that it thinks it's the document node.
I'm open to other approaches but the selection of the parent depends on the child attribute containing a certain sub-string.
Cheers!
The query you have written selects the attribute f. However, it is not legal to return an attribute node from an XQuery. The error refers to the output document, which here contains just an attribute (although the error message is misleading: technically there is no output document, only an attribute node that is returned).
You probably wanted to return the value of the attribute rather than the attribute itself:
return data($foonode)
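To illustrate the difference between an attribute node and its value, here is a small Python sketch of the same lookup. It assumes the standard library's xml.etree.ElementTree, where attributes are only ever exposed as plain strings. Elements there carry no parent pointer, so the sketch builds a child-to-parent map to stand in for the two ".." steps; the name parent_of is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<mydoc>
  <foo f="fooattr">
    <bar r="barattr1">
      <baz z="bazattr1">this is the first baz</baz>
    </bar>
    <bar r="barattr2">
      <baz z="bazattr2">this is the second baz</baz>
    </bar>
  </foo>
</mydoc>
""")

# Map every element to its parent, since ElementTree has no ".." on elements.
parent_of = {child: parent for parent in doc.iter() for child in parent}

# For each <baz> whose z attribute contains "2", go up two levels
# (baz -> bar -> foo) and read the string value of the f attribute.
values = [
    parent_of[parent_of[baz]].get("f")
    for baz in doc.iter("baz")
    if "2" in baz.get("z", "")
]

print(values)
```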

Resources