XPath Wildcard -- Any Node Name, Must have Specific Attribute Value - xpath

I am having difficulty figuring out an XPath query that would allow me to return nodes based on the value of the Program attribute in the example below. For example, I would like to be able to search all nodes for a value of the Program attribute = "011.pas". I tried /Items/*[Program="012.pas"] and also /Items/Item*[Program="01.pas"] but neither works. What is the correct expression?
<Items>
<Item0 Program="01.pas"></Item0>
<Item1 Program="011.pas"></Item1>
</Items>

The attribute is selected with @Program and the child elements of the Items element with /Items/*, so you want /Items/*[@Program = '011.pas'].

Try this:
/Items/*[@Program='011.pas']
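
As a quick check of the wildcard-plus-attribute predicate, here is a minimal sketch using Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath. The variable names are illustrative. Because the query is evaluated relative to the Items root element, the path starts at * rather than /Items/*.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
xml = """<Items>
<Item0 Program="01.pas"></Item0>
<Item1 Program="011.pas"></Item1>
</Items>"""

root = ET.fromstring(xml)  # root is the <Items> element

# '*' matches a child element with any name; [@Program='...'] tests the attribute.
matches = root.findall("*[@Program='011.pas']")
names = [el.tag for el in matches]

# Without '@', the predicate looks for a child *element* named Program instead.
without_at = root.findall("*[Program='011.pas']")

print(names, without_at)
```

The second query mirrors the attempt in the question: without the @ sign the predicate tests a child element called Program, which does not exist here, so nothing matches.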

Related

Xpath expression (nokogiri) to get tag's child element?

From my XML, I can get this:
<home>
<creditors>
<count>2</count>
</creditors>
</home>
Or even this:
<home>
<creditors>
<moreThan>2</moreThan>
</creditors>
</home>
Which XPath expression can I use to get "<count>2</count>" instead of only "2", or "<moreThan>2</moreThan>" instead of "2"?
This XPath,
//creditors/count
will select all count child elements of all creditors elements in the XML document.
Update per OP's request in comments for a single XPath that selects both count and moreThan elements:
This XPath,
//creditors/*[self::count or self::moreThan]
will select all count or moreThan child elements of all creditors elements in the XML document.
Assuming that your xpath expression is OK, you just need to convert the element to string:
doc.xpath("home/creditors/*").to_s
=> "<count>2</count>"
Please check with queries returning more than one element, to make sure that it's desired behaviour.
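
The difference between an element's text and its serialized markup can be sketched in Python with the standard-library xml.etree.ElementTree (a stand-in for Nokogiri here, not the library used in the question). The path is relative to the home root element, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<home><creditors><count>2</count></creditors></home>")

# Select every child element of <creditors>, whatever its name.
children = root.findall("creditors/*")

# The text content alone versus the serialized element.
texts = [el.text for el in children]
markup = [ET.tostring(el, encoding="unicode") for el in children]

print(texts, markup)
```

The query returns element nodes in both cases; whether you see "2" or the full tag depends on how you print the result.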

How to get parent element with attribute using xpath

I have posted sample XML and the expected output below. Kindly help me get this result.
Sample XML
<root>
<A id="1">
<B id="2"/>
<C id="2"/>
</A>
</root>
Expected output:
<A id="1"/>
You can formulate this query in several ways:
Find elements that have a matching attribute, descending only:
//*[@id=1]
Find the attribute, then ascend a step:
//@id[.=1]/..
Use the fn:id($id) function, provided the document is validated and the ID attribute is declared as such:
/id('1')
I don't think what you're after is possible. XPath cannot select a node without its children, so the result would always include the nodes B and C in your case.
You could achieve this using XQuery. I'm not sure this is what you want, but here is an example that creates a new node from an existing node stored in the $doc variable:
declare variable $doc := <root><A id="1"><B id="2"/><C id="2"/></A></root>;
element {fn:node-name($doc/*)} {$doc/*/@*}
The above returns <A id="1"></A>.
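
The same idea, copying only the name and attributes into a new element, can be sketched outside XQuery. This is a minimal Python version using the standard-library xml.etree.ElementTree; the variable names are illustrative and this is not a translation of any particular XQuery processor's behaviour.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<root><A id="1"><B id="2"/><C id="2"/></A></root>')

# Locate the element whose id attribute equals "1".
a = root.find(".//*[@id='1']")

# Copy only the tag name and attributes into a fresh element,
# leaving the children of the original behind.
bare = ET.Element(a.tag, dict(a.attrib))
result = ET.tostring(bare, encoding="unicode")

print(result)
```

The original A element still has its two children; only the new copy is empty.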
Is this what you are looking for?
//*[@id='1']/parent::*, which is equivalent to //*[@id='1']/..
If you want to verify that the parent is root:
//*[@id='1']/parent::root
https://en.wikipedia.org/wiki/XPath
If you need not just the parent but an enclosing element with some attribute, read about axis specifiers and use the ancestor:: axis.

XPath to only select the text contained within an element

I am new to xpath so I apologize in advance for how basic this question is.
How do I extract just the text from a specific element? For example, how would I extract just "text"
<h1>text</h1>
I tried the following but it seems to select everything including the tags instead of just the text.
//h1/text()
Thanks for your help
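import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Inside a method that declares "throws Exception":
// parse the XML file into a DOM document.
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File("src/myFile.xml"));
// Evaluate the path as a string, which yields the element's text content.
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String sessionId = (String) xpath.evaluate(
        "/Envelope/Body/LoginProcessResponse/loginResponse/sessionId",
        doc, XPathConstants.STRING);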
Here Envelope is my root element and I simply traversed to the required path (in my case, sessionId).
Hope it helps.
This answer is rather an XSLT answer than an XPath answer, but many of the concepts are nevertheless applicable.
The XPath expression
//h1/text()
seems to be correct. It does select all text() nodes that are direct children of <h1> elements.
But one problem may be that the XSLT built-in template rules still copy all the other text() nodes, as described in the W3C specification:
In the absence of a select attribute, the xsl:apply-templates instruction processes all of the children of the current node, including text nodes.
So to solve your problem, you have to define an explicit template that ignores all other text() nodes, like this:
<xsl:template match="text()" />
If you add this line to your XSL processing, the result will most likely be more pleasant to you.

Xpath expression returns null

I have plenty of links like this:
<b>Edit issue >></b>
Trying to extract the href content, I use this XPath expression:
//a[contains(@href,'/edit_flat')]
but it returns null. What am I doing wrong?
//a[contains(@href,'/edit_flat')] selects a elements anywhere in the document tree that have an href attribute containing the '/edit_flat' string.
These matching elements do have this very "href" attribute, but the XPath expression you are using returns "only" the a elements, if there are any.
To actually return the matching elements' attribute values, you need an extra step, with / and @href. So what you want is:
//a[contains(@href,'/edit_flat')]/@href
Suggestion:
What you probably want is to select links whose href begins with the substring "/edit_flat", so it's safer to use:
.//a[starts-with(@href,'/edit_flat')]/@href
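
The two steps, selecting the a elements and then reading their href values, can be illustrated in Python. This is only a rough sketch: the standard-library xml.etree.ElementTree does not implement contains() or starts-with(), so the prefix test is done in plain Python rather than in XPath, and the markup and href values below are hypothetical because the original links are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these href values are made up for illustration.
html = """<div>
<a href="/edit_flat/1"><b>Edit issue &gt;&gt;</b></a>
<a href="/view_flat/2"><b>View issue</b></a>
</div>"""

root = ET.fromstring(html)

# Step 1: select the <a> elements (what the predicate alone gives you).
links = [a for a in root.iter("a")
         if a.get("href", "").startswith("/edit_flat")]

# Step 2: read the attribute value from each match (the extra /@href step).
hrefs = [a.get("href") for a in links]

print(hrefs)
```

With a full XPath 1.0 engine the filter and the attribute step fit in the single expression given above.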

Use Nokogiri to get all nodes in an element that contain a specific attribute name

I'd like to use Nokogiri to extract all nodes in an element that contain a specific attribute name.
e.g., I'd like to find the 2 nodes that contain the attribute "blah" in the document below.
@doc = Nokogiri::HTML::DocumentFragment.parse <<-EOHTML
<body>
<h1 blah="afadf">Three's Company</h1>
<div>A love triangle.</div>
<b blah="adfadf">test test test</b>
</body>
EOHTML
I found this suggestion (below) at this website: http://snippets.dzone.com/posts/show/7994, but it doesn't return the 2 nodes in the example above. It returns an empty array.
# get elements with attribute:
elements = @doc.xpath("//*[@*[blah]]")
Thoughts on how to do this?
Thanks!
I found this here
elements = @doc.xpath("//*[@*[blah]]")
This is not a useful XPath expression. It says to give you all elements that have attributes that have child elements named 'blah'. And since attributes can't have child elements, this XPath will never return anything.
The DZone snippet is confusing in that when they say
elements = @doc.xpath("//*[@*[attribute_name]]")
the inner square brackets are not literal... they're there to indicate that you put in the attribute name. Whereas the outer square brackets are literal. :-p
They also have an extra * in there, after the @.
What you want is
elements = @doc.xpath("//*[@blah]")
This will give you all the elements that have an attribute named 'blah'.
You can use CSS selectors:
elements = @doc.css "[blah]"
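
For comparison, the attribute-existence predicate behaves the same way outside Nokogiri. A minimal sketch with Python's standard-library xml.etree.ElementTree, assuming the sample markup is well-formed XML (variable names are illustrative):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<body>
<h1 blah="afadf">Three's Company</h1>
<div>A love triangle.</div>
<b blah="adfadf">test test test</b>
</body>""")

# [@blah] is true for any element that carries a blah attribute,
# whatever its value.
elements = root.findall(".//*[@blah]")
tags = [el.tag for el in elements]

print(tags)
```

The div element is skipped because it has no blah attribute.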

Resources