Find attribute names that start with a certain pattern - xpath

I am looking to find all attributes of an element that match a certain pattern.
So for an element
<element s2="1" name="aaaa" id="1" />
<element s3="1" name="aaaa" id="2" />
I would like to be able to find all attributes that start with 's' (returning the value of s2 for the first element and the value of s3 for the second element).
If this is outside of xpath's ability please let me know.

Use:
element/@*[starts-with(name(), 's')]
This XPath expression selects all attribute nodes whose name starts with the string 's' and that are attributes of elements named element that are children of the current node.
starts-with() is a standard function in XPath 1.0

element/@*[substring(name(), 1,1) = "s"]
will match any attribute that starts with 's'.
The function starts-with() might look better than using substring()

I've tested the answers from both @Dimitre-Novatchev and @Ledhund, using the lxml.html module in Python.
Both element/@*[starts-with(name(), 's')] and element/@*[substring(name(), 1,1) = "s"] return only the values of s2 and s3. You won't be able to tell which value belongs to which attribute.
I think in practice I would be more interested in finding the elements themselves that contain the attributes of names starting with specific characters rather than just their values.
To achieve that is very simple, just add /.. at the end,
element/@*[starts-with(name(), "s")]/..
or
element/@*[starts-with(name(), "s")]/parent::*
or
element/@*[starts-with(name(), "s")]/parent::node()
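For anyone doing this from Python, here is a minimal sketch of the same idea using only the standard library. It is an illustration under assumptions, not a drop-in replacement for the XPath above: xml.etree.ElementTree supports only a limited subset of XPath (it cannot select attribute nodes with @* or call starts-with()), so the filtering by attribute name is done in plain Python. The wrapping <root> element and the helper name attrs_starting_with are made up for this example.

```python
import xml.etree.ElementTree as ET

# The two sample elements, wrapped in a hypothetical <root> so the text parses.
XML = """<root>
  <element s2="1" name="aaaa" id="1" />
  <element s3="1" name="aaaa" id="2" />
</root>"""


def attrs_starting_with(root, tag, prefix):
    """Return (element, attribute name, value) for every matching attribute."""
    matches = []
    # iter() visits every descendant named `tag`, not only direct children.
    for el in root.iter(tag):
        for attr_name, value in el.attrib.items():
            if attr_name.startswith(prefix):
                matches.append((el, attr_name, value))
    return matches


root = ET.fromstring(XML)
found = attrs_starting_with(root, "element", "s")
for el, attr_name, value in found:
    # Each hit keeps its owning element, so name and value stay together.
    print(el.get("id"), attr_name, value)
```

Because each result carries the element, the attribute name and the value, this also covers the case of wanting the elements themselves rather than just the values.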

None of the above worked for me.
So I made some changes and it worked for me. :)
/*:UserCustomField[starts-with(@name, 'purchaseDate')]

Related

Xpath for an element , all ancestors of which have the same name up to a point

I have an XML that looks like the following:
xml tree
I need those tag elements that have only son elements as their ancestors. The only non-son ancestor allowed is the root element parent. After parent, no ancestor of tag can be anything other than son. This xpath therefore would return <tag id="t1" /> and <tag id="t2" />
//son//tag would be one solution. Another would be //tag[ancestor::son]. You could use /descendant:: in place of //; there are differences in the order in which results are reported. There are other variants; which one is best depends on the exact context in which you're doing this.
I should have posted this earlier, or maybe it does not matter. Here is the nasty-looking xpath I wrote to solve this:
/parent/(descendant::tag except(descendant::element() except descendant::son)/descendant::tag)
Hope someone would suggest a better looking alternative.
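If the selection can be done in the host language instead, the rule "only son elements between the root and the tag" can be written as a short recursive walk. This is only a sketch in Python with the standard library: the sample document below is invented (the original tree was posted as an image), and the <daughter> element and the function name tags_under_sons_only are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the original tree was only available as an image.
XML = """<parent>
  <son>
    <tag id="t1" />
    <son>
      <tag id="t2" />
      <daughter>
        <tag id="t3" />
      </daughter>
    </son>
  </son>
  <daughter>
    <son>
      <tag id="t4" />
    </son>
  </daughter>
</parent>"""


def tags_under_sons_only(node):
    """Collect <tag> elements reachable from node through <son> elements only."""
    result = []
    for child in node:
        if child.tag == "tag":
            result.append(child)
        elif child.tag == "son":
            result.extend(tags_under_sons_only(child))
        # Any other element ends the walk, so nothing below it is collected.
    return result


root = ET.fromstring(XML)
ids = [t.get("id") for t in tags_under_sons_only(root)]
print(ids)  # ['t1', 't2']
```

Here t3 and t4 are left out because each has a non-son element somewhere between it and the root.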

Find xml node whose name is a concatenation of attribute of another node and string constant

I have a bit of a tough xpath query (which I'm not entirely sure can be done).
I have the below xml
<Root>
<PersonOne Name='jon'/>
<PersonTwo Name='bob'/>
<JonDetails>some text</JonDetails>
<BobDetails>some details about Bob</BobDetails>
</Root>
I know it is a bit of a contrived example but the xml structure I am dealing with is fixed and I cannot change it.
Basically I'm trying to figure out the xpath to select the *Detail node for the name attribute in the PersonOne node.
So to do this I need to concat the attribute value of 'Name' in the PersonOne node with the constant Details to get 'JonDetails' as a node name.
I have this so far. It doesn't work, but I think it is along the right lines.
/Root/*[contains(name(), concat(/Root/PersonOne/@Name, 'Details'))]
However, just to add to the fun it has to be a case insensitive match on the node name. I know this can be done with a translate function.
Any pointers in the right direction?
Jon
Would this expression be better?
/Root/*[translate(name(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = translate(concat(/Root/PersonOne/@Name, 'details'), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')]
It looks for an exact match.
Just figured it out! It's not too pretty but it works.
/Root/*[contains(translate(name(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), translate(concat(/Root/PersonOne/@Name, 'details'), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))]
If anyone can improve on this it would be good to see how.
Thanks
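For comparison, when the lookup can be done outside of XPath, the case-insensitive match is just a lower-cased string comparison. The following is a rough sketch using Python's standard library, assuming the XML from the question; it does an exact match on the lower-cased name, like the translate(...) = translate(...) expression, not a contains() match.

```python
import xml.etree.ElementTree as ET

XML = """<Root>
  <PersonOne Name='jon'/>
  <PersonTwo Name='bob'/>
  <JonDetails>some text</JonDetails>
  <BobDetails>some details about Bob</BobDetails>
</Root>"""

root = ET.fromstring(XML)

# Build the wanted element name from the attribute, then compare lower-cased.
wanted = (root.find("PersonOne").get("Name") + "Details").lower()
details = [child for child in root if child.tag.lower() == wanted]

print([d.text for d in details])  # ['some text']
```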

XPath check for non-existing node

I'm having a bit of trouble finding the right XPath syntax to check if a particular node in my XML exists. I'm only allowed to use XPath (so no XSL or something else like that, it has to be a pure XPath expression syntax).
I have an XML and it has a node Filename but it doesn't exist in every case. When the filename isn't specified, my LiveCycle process will use a different route to fill in the filename. But how do I check if the Filename node exists?
Similar to count, but maybe more direct depending on what you want, is the function boolean:
boolean(//Filename)
This returns true if a "Filename" node exists and false if not.
You can use the count function - passing in the path of the nodes you are checking.
If they do not exist, then the value of count will be 0:
count(//Filename) = 0
Suppose you have the following XML document:
<top>
<function>
<filenamex>c:\a\y\z\myFile.xml</filenamex>
<default>Default.xml</default>
</function>
</top>
then this XPath expression selects either the filename element when it's present or the default element when no filename element is specified:
(/*/function/filename
|
/*/function/default
)
[1]
The shortest way to check if the filename element exists is:
/*/function/filename
So the first XPath expression could be re-written in the equivalent (but somewhat longer):
/*/function/filename
|
/*/function/default[not(/*/function/filename)]
Given the example Xml from another answer
<top>
<function>
<filenamex>c:\a\y\z\myFile.xml</filenamex>
<default>Default.xml</default>
</function>
</top>
To get nodes WITH node "filenamex" use /top/function[filenamex]
To get nodes WITHOUT node "filenamex" use /top/function[not(filenamex)]
I felt it necessary to answer here as the other answers did not work as advertised in XmlSpy
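As a cross-check outside of an XPath-only tool, the same "use filename if present, otherwise default" logic can be sketched in Python with the standard library. This is an illustration only: the two sample documents and the function name choose_filename are assumptions made for the example, and the file path is shortened.

```python
import xml.etree.ElementTree as ET

# Two hypothetical inputs: one with a <filename> element, one without.
WITH_FILENAME = """<top><function>
  <filename>myFile.xml</filename>
  <default>Default.xml</default>
</function></top>"""

WITHOUT_FILENAME = """<top><function>
  <default>Default.xml</default>
</function></top>"""


def choose_filename(xml_text):
    """Return the text of <filename> if present, else the text of <default>."""
    function = ET.fromstring(xml_text).find("function")
    filename = function.find("filename")
    # find() returns None when nothing matches, so test for None explicitly
    # rather than relying on the truthiness of the element.
    if filename is not None:
        return filename.text
    return function.find("default").text


print(choose_filename(WITH_FILENAME))     # myFile.xml
print(choose_filename(WITHOUT_FILENAME))  # Default.xml
```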

XPath concat multiple nodes

I'm not very familiar with xpath. But I was working with xpath expressions and setting them in a database. Actually it's just the BAM tool for biztalk.
Anyway, I have an xml which could look like:
<File>
<Element1>element1</Element1>
<Element2>element2</Element2>
<Element3>
<SubElement>sub1</SubElement>
<SubElement>sub2</SubElement>
<SubElement>sub3</SubElement>
</Element3>
</File>
I was wondering if there is a way to use an xpath expression of getting all the SubElements concatted? At the moment, I am using:
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement']
This works if it only has one index. But apparently my xml sometimes has more nodes, so it gives NULL. I could just use
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement'][0]
but I need all the nodes. Is there a way to do this?
Thanks a lot!
Edit: I changed the XML, I was wrong, it's different, it should look like this:
<item>
<element1>el1</element1>
<element2>el2</element2>
<element3>el3</element3>
<element4>
<subEl1>subel1a</subEl1>
<subEl2>subel2a</subEl2>
</element4>
<element4>
<subEl1>subel1b</subEl1>
<subEl2>subel2b</subEl2>
</element4>
</item>
And I need to have a one line code to get a result like: "subel2a subel2b";
I need the one line because I set this xpath expression as an xml attribute (not my choice, it's specified). I tried string-join but it's not really working.
string-join(/file/Element3/SubElement, ',')
/File/Element3/SubElement will match all of the SubElement elements in your sample XML. What are you using to evaluate it?
If your evaluation method is subject to the "first node rule", then it will only match the first one. If you are using a method that returns a nodeset, then it will return all of them.
You can get all SubElements by using:
//SubElement
But this won't keep them grouped together how you want. You will want to do a query for all elements that contain a SubElement (basically do a search for the parent of any SubElements).
//*[SubElement]
Once you have that, you could (depending on your programming language) loop through the parents and concatenate the SubElements.
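As a rough illustration of doing the concatenation in the host language, here is a sketch in Python with the standard library, assuming the edited XML from the question and a space as the separator. It is not something that can be stored as a single XPath expression in an attribute, so it only helps if the evaluation can be moved into code.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <element1>el1</element1>
  <element2>el2</element2>
  <element3>el3</element3>
  <element4>
    <subEl1>subel1a</subEl1>
    <subEl2>subel2a</subEl2>
  </element4>
  <element4>
    <subEl1>subel1b</subEl1>
    <subEl2>subel2b</subEl2>
  </element4>
</item>"""

root = ET.fromstring(XML)

# The path is relative to the <item> root; matches come back in document order.
joined = " ".join(el.text for el in root.findall("element4/subEl2"))
print(joined)  # subel2a subel2b
```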

XPath to return string concatenation of qualifying child node values

Can anyone please suggest an XPath expression format that returns a string value containing the concatenated values of certain qualifying child nodes of an element, but ignoring others:
<div>
This text node should be returned.
<em>And the value of this element.</em>
And this.
<p>But this paragraph element should be ignored.</p>
</div>
The returned value should be a single string:
This text node should be returned. And the value of this element. And this.
Is this possible in a single XPath expression?
Thanks.
In XPath 2.0 :
string-join(/*/node()[not(self::p)], '')
In XPath 1.0:
You can use
/div//text()[not(parent::p)]
to capture the wanted text nodes. The concatenation itself cannot be done in XPath 1.0, I recommend doing it in the host application.
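A minimal sketch of what "doing it in the host application" could look like, here in Python with the standard library. Note the assumptions: ElementTree has no text() nodes, so the text is collected from each element's .text and .tail instead of via the XPath above, whitespace is collapsed at the end to get a single line, and the helper name text_without is made up for the example.

```python
import xml.etree.ElementTree as ET

XML = """<div>
This text node should be returned.
<em>And the value of this element.</em>
And this.
<p>But this paragraph element should be ignored.</p>
</div>"""


def text_without(element, skipped_tag):
    """Concatenate the text under element, skipping subtrees named skipped_tag."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != skipped_tag:
            parts.append(text_without(child, skipped_tag))
        # The tail is the text that follows the child, so it belongs to element.
        parts.append(child.tail or "")
    return "".join(parts)


div = ET.fromstring(XML)
# Collapse newlines and repeated spaces into single spaces.
result = " ".join(text_without(div, "p").split())
print(result)
```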
/div//text()
The double slash extracts text nodes regardless of intermediate elements.
This seems to work:
Using /div as the context:
text() | em/text()
Or without the use of context:
/div/text() | /div/em/text()
If you want to concat the first two strings, use this:
concat(/div/text(), /div/em/text())
If you want all children except p, you can try the following...
string-join(//*[name() != 'p']/text(), "")
which returns...
This text node should be returned.
And the value of this element.
And this.
I know this comes a bit late, but I figure my answer could still be relevant. I recently ran into a similar problem. And because I use scrapy in Python 3.6, which does not support xpath 2.0, I could not use the string-join function suggested in several online answers.
I ended up finding a simple workaround (as shown below) which I did not see in any of the stackoverflow answers, that's why I'm sharing it.
temp_selector_list = response.xpath('/div')
string_result = [''.join(x.xpath(".//text()").extract()) for x in temp_selector_list]
Hope this helps!
In XSLT, you could use a for-each loop as well and assemble the values in a variable like this:
<xsl:variable name="newstring">
<xsl:for-each select="/div//text()">
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:variable>

Resources