Select attribute and text() in the same query - xpath

I would like to select an attribute and the text() value of a node in one query, e.g. I have
<Tag1 myattr='test'>MyText</Tag1>
and I am interested in getting "test" and "MyText" with one query.
The obvious
//Tag1/@myattr | //Tag1/text()
fails because unions are only allowed over node-sets.
Any ideas?

I think, given XPath 2.0, you want a sequence of string values, which you get with //Tag1/(@myattr, .)/string(). If you want a single string, use //Tag1/string-join((@myattr, .), ' ').
BTW, your path //Tag1/@myattr | //Tag1/text() would select a sequence containing an attribute node and a text node. I don't see how that would fail.
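Where no XPath 2.0 processor is available, the same pair of values can be collected in host code. The following is only a minimal sketch, assuming Python's standard-library xml.etree.ElementTree (which implements just a subset of XPath 1.0) and a hypothetical <root> wrapper around the sample element; it is not what the answer above runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; <root> is added only to have a container.
doc = ET.fromstring("<root><Tag1 myattr='test'>MyText</Tag1></root>")

# For every Tag1, collect the attribute value and the text content together,
# roughly what //Tag1/(@myattr, .)/string() yields in XPath 2.0.
pairs = [(tag.get("myattr"), tag.text) for tag in doc.iter("Tag1")]

# Rough equivalent of string-join((@myattr, .), ' ') for each Tag1.
joined = [" ".join(pair) for pair in pairs]

print(pairs)
print(joined)
```

Note that tag.text only equals the XPath string value when Tag1 has no child elements, as in this sample.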

Related

Is it possible in XPATH to find an element by attribute value, not by name?

For example I have an XML element:
<input id="optSmsCode" type="tel" name="otp" placeholder="SMS-code">
Suppose I know that some attribute has a value containing otp, but I don't know which attribute it is. Is it possible to write an XPath expression along these lines:
.//input[(contains(*, "otp")) or (contains(*, "ode"))]
Try it like this and see if it works:
one = '//input/@*[(contains(.,"otp") or contains(.,"ode"))]/..'
# Selenium 4 removed find_elements_by_xpath; find_elements("xpath", ...) is the current form
print(driver.find_elements("xpath", one))
Edit:
In XPath 2.0, the first argument of contains() must be a single item or the empty sequence; in XPath 1.0, a node-set passed there is reduced to the string value of its first node. In plain(ish) English, it means you can check only one node at a time to see if it contains the target string.
So, the expression above goes through each attribute of input separately (/@*), checks whether the value of that specific attribute contains the target string and, if it does, goes up to the parent of that attribute (/..), which, in the case of an attribute, is the element that carries it (input).
This XPath expression selects all <input> elements that have some attribute, whose string value contains "otp" or "ode". Notice that there is no need to "go up to the parent ..."
//input[@*[contains(., 'otp') or contains(., 'ode')]]
If we know that "otp" or "ode" must be the whole value of the attribute (not just a substring of the value), then this expression is stricter and more efficient to evaluate:
//input[@*[. = 'otp' or . = 'ode']]
In this latter case ("otp" or "ode" is the whole value of the attribute), if we have to compare against many values, an XPath expression of the above form will quickly become too long. There is a way to simplify such a long expression and do just a single comparison:
//input[@*[contains('|s1|s2|s3|s4|s5|', concat('|', ., '|'))]]
The above expression selects all input elements in the document, that have at least one attribute whose value is one of the strings "s1", "s2", "s3", "s4" or "s5".
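The delimiter trick is easier to see outside XPath. Below is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree and a hypothetical sample form; ElementTree's XPath subset has no contains(), so the per-attribute test is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the first and third inputs carry a wanted value.
doc = ET.fromstring(
    "<form>"
    "<input id='a' name='s2'/>"
    "<input id='b' name='s22'/>"
    "<input id='s5' type='tel'/>"
    "</form>"
)

ALLOWED = "|s1|s2|s3|s4|s5|"

def has_allowed_attribute(element):
    # Same test as contains('|s1|...|s5|', concat('|', ., '|')), per attribute.
    return any(f"|{value}|" in ALLOWED for value in element.attrib.values())

matches = [el.get("id") for el in doc.iter("input") if has_allowed_attribute(el)]
print(matches)
```

Wrapping each value in the delimiter is what keeps "s22" from matching "s2". The approach assumes that no attribute value contains the delimiter character itself.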

How to select a specific category value in this Xpath expression

I have a feed here. I'm trying to create an XPath expression that returns items that have a category equal to Bananas. Due to the limitations in my XML parser, I can't use namespaces directly to select items.
The expression /rss/channel/item//*[name()='itunes:category'] returns this:
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Apples"/>'
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Bananas"/>'
...
And /rss/channel/item//*[name()='itunes:category']/@text returns this:
Attribute='text=Apples'
Attribute='text=Bananas'
...
But I can't figure out how to limit the response to just a single category (e.g., Bananas)?
I want some kind of expression like this:
/rss/channel/item//*[name()='itunes:category' and contains(., 'Bananas')]
But this doesn't work. It's not syntactically valid. What would be the right XPath expression syntax to just return Bananas?
Do you mean that you want to filter by an attribute of an item's child, but still return the item node?
/rss/channel/item/*[name()='itunes:category' and contains(@text,'Apples')]/parent::item
or more simply
/rss/channel/item[*[name()='itunes:category' and @text='Apples']]
I used Apples in the example because, with your example XML file, there are 0 results for Bananas.
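For comparison, here is a minimal sketch of the same filter in Python, assuming a parser that can handle namespaces (unlike the one in the question), the standard-library xml.etree.ElementTree, and a hypothetical two-item feed standing in for the real one. It also assumes the category is a direct child of item.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature feed; the real feed is not reproduced here.
FEED = """<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <item><title>One</title><itunes:category text="Apples"/></item>
    <item><title>Two</title><itunes:category text="Bananas"/></item>
  </channel>
</rss>"""

NS = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}
root = ET.fromstring(FEED)

def items_in_category(category):
    # Keep each item that has an itunes:category child whose text attribute matches.
    path = f"itunes:category[@text='{category}']"
    return [item for item in root.iterfind("channel/item")
            if item.find(path, NS) is not None]

titles = [item.findtext("title") for item in items_in_category("Bananas")]
print(titles)
```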

Need XPath and XQuery query

I'm working on Xpath/Xquery to return values of multiple child nodes based on a sibling node value in a single query. My XML looks like this
<FilterResults>
  <FilterResult>
    <ID>535</ID>
    <Analysis>
      <Name>ZZZZ</Name>
      <Identifier>asdfg</Identifier>
      <Result>High</Result>
      <Score>0</Score>
    </Analysis>
    <Analysis>
      <Name>XXXX</Name>
      <Identifier>qwerty</Identifier>
      <Result>Medium</Result>
      <Score>0</Score>
    </Analysis>
  </FilterResult>
  <FilterResult>
    <ID>745</ID>
    <Analysis>
      <Name>XXXX</Name>
      <Identifier>xyz</Identifier>
      <Result>Critical</Result>
      <Score>0</Score>
    </Analysis>
    <Analysis>
      <Name>YYYY</Name>
      <Identifier>qwerty</Identifier>
      <Result>Medium</Result>
      <Score>0</Score>
    </Analysis>
  </FilterResult>
</FilterResults>
I need to get the values of Score and Identifier based on the Name value. I'm currently trying the query below, but it is not working as desired:
fn:string-join((
for $Identifier in fn:distinct-values(FilterResults/FilterResult/Analysis[Name="XXXX"])
return fn:string-join((//Identifier,//Score),'-')),',')
The output I'm looking for is this:
qwerty-0,xyz-0
Your question suggests some fundamental misunderstandings about XQuery, generally. It's hard to explain everything in a single answer, but 1) that is not how distinct-values works (it returns string values, not nodes), and 2) the double slash selections in your return statement are returning everything because they are not constrained by anything. The XPath you use inside the distinct-values call is very close, however.
Instead of calling distinct-values, you can assign the Analysis results of that XPath to a variable, iterate over them, and generate concatenated strings. Then use string-join to comma separate the full sequence. Note that in the return statement, the variable $a is used to concat only one pair of values at a time.
string-join(
  let $analyses := FilterResults/FilterResult/Analysis[Name="XXXX"]
  for $a in $analyses
  return $a/concat(Identifier, '-', Score),
  ',')
=> qwerty-0,xyz-0
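The same select, pair, and join steps can be checked without an XQuery processor. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree and a condensed copy of the sample XML (the Result elements are left out because they are not used).

```python
import xml.etree.ElementTree as ET

XML = """<FilterResults>
  <FilterResult>
    <ID>535</ID>
    <Analysis><Name>ZZZZ</Name><Identifier>asdfg</Identifier><Score>0</Score></Analysis>
    <Analysis><Name>XXXX</Name><Identifier>qwerty</Identifier><Score>0</Score></Analysis>
  </FilterResult>
  <FilterResult>
    <ID>745</ID>
    <Analysis><Name>XXXX</Name><Identifier>xyz</Identifier><Score>0</Score></Analysis>
    <Analysis><Name>YYYY</Name><Identifier>qwerty</Identifier><Score>0</Score></Analysis>
  </FilterResult>
</FilterResults>"""

root = ET.fromstring(XML)

# Select only the Analysis elements whose Name child is "XXXX".
analyses = root.findall("FilterResult/Analysis[Name='XXXX']")

# Build one "Identifier-Score" string per Analysis, then comma-join them.
result = ",".join(
    f"{a.findtext('Identifier')}-{a.findtext('Score')}" for a in analyses
)
print(result)  # qwerty-0,xyz-0
```

As in the XQuery version, each pair is built relative to a single Analysis element, which is what keeps identifiers from other analyses out of the result.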

XPath OR operator for different nodes

How can I express this with XPath:
//bookstore/book/title or //bookstore/city/zipcode/title
Just //title won't work because I also have //bookstore/magazine/title
p.s. I saw a lot of or examples but mainly with attributes or single node structure.
All title nodes with zipcode or book node as parent:
Version 1:
//title[parent::zipcode|parent::book]
Version 2:
//bookstore/book/title|//bookstore/city/zipcode/title
Version 3: (a single predicate that checks the whole parent chain of each title)
//title[parent::book/parent::bookstore or parent::zipcode/parent::city/parent::bookstore]
or: a Boolean operator in XPath, used inside a predicate or any other true/false test.
|: the union operator in XPath; it combines the node-set selected by the expression on its left with the node-set selected by the expression on its right.
If you want to select only one of two nodes with union operator, you can use this solution:
(//bookstore/book/title | //bookstore/city/zipcode/title)[1]
If the element can be reached by two XPaths, you can combine them like this:
xpath1 | xpath2
E.g.:
//input[@name="username"] | //input[@id="wm_login-username"]
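One detail of the union operator is easy to miss: (a | b)[1] picks the first node in document order, not the first node from the left-hand path. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree and a hypothetical bookstore document; ElementTree has no | operator, so the union is emulated by evaluating both paths and re-sorting into document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which the zipcode title precedes the book title.
doc = ET.fromstring(
    "<bookstore>"
    "<magazine><title>M</title></magazine>"
    "<city><zipcode><title>Z</title></zipcode></city>"
    "<book><title>B</title></book>"
    "</bookstore>"
)

left = doc.findall("book/title")           # //bookstore/book/title
right = doc.findall("city/zipcode/title")  # //bookstore/city/zipcode/title

# Position of every element in document order, used as the sort key.
order = {el: i for i, el in enumerate(doc.iter())}

# Union: merge both results, drop duplicates, restore document order.
union = sorted(set(left) | set(right), key=order.get)

titles = [t.text for t in union]
first = titles[0]  # counterpart of (... | ...)[1]
print(titles, first)
```

The magazine title is excluded because neither path reaches it, and the first result is the zipcode title even though the book path was written first.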

Use Xpath to find the appropriate element based on the element value

I have the following xml snippet
<ZMARA01 SEGMENT="1">
<CHARACTERISTICS_01>X,001,COLOR_ATTRIBUTE_FR,BRUN ÉCORCE,TMBR,French C</CHARACTERISTICS_01>
<CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Timber Brown,TMBR,Color Attr</CHARACTERISTICS_02>
</ZMARA01>
I am looking for an xpath expression that will match based on COLOR_ATTRIBUTE. It will not always be in CHARACTERISTIC_02. It could be CHARACTERISTIC_XX. Also I don't want to match COLOR_ATTRIBUTE_FR. I have been using this:
Transaction.Input_XML{/ZMAT/IDOC/E1MARAM/ZMARA01/*[starts-with(local-name(.), 'CHARACTERISTIC_')][contains(.,'COLOR_ATTRIBUTE')]}
This gets me mostly there but it matches both COLOR_ATTRIBUTE and COLOR_ATTRIBUTE_FR
Use:
contains(concat(',', ., ','), ',COLOR_ATTRIBUTE,')
This first surrounds the string value of the context node with commas, then simply tests whether the string so constructed contains ',COLOR_ATTRIBUTE,'.
Thus we treat all cases (pattern at the start of the string, pattern at the end of the string, and pattern at neither the start nor the end) in the same single way.
If COLOR_ATTRIBUTE is guaranteed not to be in the first or last position, you could use [contains(.,',COLOR_ATTRIBUTE,')], otherwise you could use something like [contains(.,'COLOR_ATTRIBUTE') and not(contains(.,'COLOR_ATTRIBUTE_FR'))].
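The comma-wrapping test can be tried on the two sample values directly. This is a minimal sketch in Python; has_field is a hypothetical helper name introduced here for illustration, not part of any library.

```python
def has_field(value, field, sep=","):
    # Same idea as contains(concat(',', ., ','), ',COLOR_ATTRIBUTE,'):
    # wrap both strings in the separator so only whole fields can match.
    return f"{sep}{field}{sep}" in f"{sep}{value}{sep}"

fr = "X,001,COLOR_ATTRIBUTE_FR,BRUN ÉCORCE,TMBR,French C"
en = "X,001,COLOR_ATTRIBUTE,Timber Brown,TMBR,Color Attr"

print(has_field(fr, "COLOR_ATTRIBUTE"))  # False
print(has_field(en, "COLOR_ATTRIBUTE"))  # True
```

Because the value is wrapped as well, a field in the first or last position still matches, which a plain substring test for ',COLOR_ATTRIBUTE,' would miss.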

Resources