How does the contains() function work in XPath?

<collection>
<movie title="Transformers" shelf="B">
<type>Science Fiction</type>
<format>DVD</format>
<year>1980</year>
<rating>R</rating>
<popularity>7</popularity>
<description>Science Fiction</description>
</movie>
<movie title="Trigun" shelf="B">
<type>Action</type>
<format>DVD</format>
<episodes>4</episodes>
<rating>PG</rating>
<popularity>10</popularity>
<description>Quite a bit of action!</description>
</movie>
<movie title="Ishtar" shelf="A">
<type>Comedy</type>
<format>VHS</format>
<rating>PG</rating>
<popularity>2</popularity>
<description>Boring</description>
</movie>
</collection>
In the XML above, I want to get the title attribute of the movie whose description element contains a given substring. I tried the contains() function, which takes string arguments and returns a Boolean. I tried a few XPath expressions that didn't work, and then learned that the answer is collection/movie[contains(description,'bit')]/@title
Since contains() returns only a Boolean value, how does it work in the above case? Please clarify.

It has nothing to do with contains() specifically. The [...] syntax simply defines a predicate, which filters the nodes selected by the step in front of it. So tag[boolean-expression()] will return to you exactly those tag elements for which boolean-expression() returns true() (or, to be more precise, a value that can be converted to true()).
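To make the filtering step concrete, here is a minimal Python sketch of what the predicate does. It is only an illustration: the standard library's ElementTree does not support contains() in its XPath subset, so the Boolean test is written as a hypothetical helper, description_contains, and the XML is a trimmed copy of the question's data.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (only the parts the expression touches).
XML = """<collection>
  <movie title="Transformers" shelf="B"><description>Science Fiction</description></movie>
  <movie title="Trigun" shelf="B"><description>Quite a bit of action!</description></movie>
  <movie title="Ishtar" shelf="A"><description>Boring</description></movie>
</collection>"""


def description_contains(movie, needle):
    """Hypothetical stand-in for contains(description, needle): one Boolean per movie."""
    return needle in (movie.findtext("description") or "")


root = ET.fromstring(XML)

# Predicate step: keep only the movies for which the Boolean test is true.
matching = [m for m in root.findall("movie") if description_contains(m, "bit")]

# /@title step: read the attribute from each movie that survived the filter.
titles = [m.get("title") for m in matching]
print(titles)
```

The Boolean never becomes the result of the expression. It only decides which movie elements are passed on to the next step.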

Related

Is it possible in XPATH to find an element by attribute value, not by name?

For example I have an XML element:
<input id="optSmsCode" type="tel" name="otp" placeholder="SMS-code">
Suppose I know that some attribute must contain the value otp, but I don't know which attribute it is. Is it possible to write an XPath expression like this:
.//input[(contains(*, "otp")) or (contains(*, "ode"))]
Try it like this and see if it works:
one = '//input/@*[(contains(.,"otp") or contains(.,"ode"))]/..'
print(driver.find_elements_by_xpath(one))
Edit:
The first argument of contains() has a required cardinality of zero or one. In plain(ish) English, this means you can check only one node at a time to see whether it contains the target string.
So, the expression above goes through each attribute of input separately (/@*), checks whether the value of that specific attribute contains the target string and, if the target is found, goes up to the parent of that attribute (/..), which, in the case of an attribute, is the element itself (input).
This XPath expression selects all <input> elements that have some attribute, whose string value contains "otp" or "ode". Notice that there is no need to "go up to the parent ..."
//input[@*[contains(., 'otp') or contains(., 'ode')]]
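As a rough illustration of what this expression tests, the following Python sketch checks every attribute value of each input element. The standard library's ElementTree cannot evaluate @*[contains(...)], so the check is written by hand. The helper has_attribute_containing, the form wrapper, and the second input are assumptions added for the example, and the input is self-closed so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# The first <input> is from the question; the wrapper and the second one are made up.
MARKUP = """<form>
  <input id="optSmsCode" type="tel" name="otp" placeholder="SMS-code"/>
  <input id="user" type="text" name="login" placeholder="User name"/>
</form>"""


def has_attribute_containing(elem, needles):
    """True if any attribute value of elem contains any of the needles."""
    return any(needle in value
               for value in elem.attrib.values()
               for needle in needles)


root = ET.fromstring(MARKUP)
hits = [e.get("id") for e in root.iter("input")
        if has_attribute_containing(e, ("otp", "ode"))]
print(hits)
```

Note that the first element matches more than once (its id contains "ode" and its name is "otp"), yet it is selected only once, just as the predicate form of the XPath returns each input a single time.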
If we know that "otp" or "ode" must be the whole value of the attribute (not just a substring of the value), then this expression is stricter and more efficient to evaluate:
//input[@*[. = 'otp' or . = 'ode']]
In this latter case ("otp" or "ode" are the whole value of the attribute), if we have to compare against many values, an XPath expression of the above form quickly becomes too long. There is a way to simplify such a long expression and do just a single comparison:
//input[@*[contains('|s1|s2|s3|s4|s5|', concat('|', ., '|'))]]
The above expression selects all input elements in the document, that have at least one attribute whose value is one of the strings "s1", "s2", "s3", "s4" or "s5".
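The delimiter trick can be sketched in plain Python to show why wrapping both sides in '|' turns a substring test into a whole-value test. The function name is_one_of is hypothetical. The last check shows a caveat that applies to the XPath version as well: a value that itself contains the delimiter can produce a false positive, so the delimiter should be a character that cannot occur in the data.

```python
def is_one_of(value, allowed, sep="|"):
    """Mimic contains('|s1|s2|...|', concat('|', value, '|'))."""
    haystack = sep + sep.join(allowed) + sep   # '|s1|s2|s3|s4|s5|'
    return (sep + value + sep) in haystack     # '|value|' must appear whole


allowed = ["s1", "s2", "s3", "s4", "s5"]

whole_value = is_one_of("s3", allowed)        # a listed value
partial_value = is_one_of("s", allowed)       # only a prefix of the listed values
empty_value = is_one_of("", allowed)          # '||' never occurs in the list
with_delimiter = is_one_of("s1|s2", allowed)  # caveat: value contains the delimiter
print(whole_value, partial_value, empty_value, with_delimiter)
```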

xpath expression to read value based on value of sibling

I have the XML below and would like to read the value of the Value element whose sibling Name matches 'test2'. I'm using the XPath below, but it did not work. Can someone help?
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']/*[ local-name()='Name'][normalize-space(.) = 'test2']//*[local-name()='Value']/text()
<get:OutputData>
<get:OutputDataItem>
<get:Name>test1</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test2</get:Name>
<get:Value>B5B4</get:Value>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test3</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>OP_VCscEncrptCd_VAR</get:Name>
<get:Value/>
</get:OutputDataItem>
</get:OutputData>
Thanks
You were close, but because get:Name and get:Value are siblings, you need to adjust your XPath a little.
Your XPath was attempting to address get:Value elements that are descendants of get:Name, rather than siblings of it. Move the criteria that filter on get:Name into a predicate on the parent, then step down into get:Value:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name'][normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
You could also combine the criteria of the predicate filter on get:Name and use an and:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name' and normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
This should work, I think (note that local-name() returns the name without the get: prefix):
//*[local-name()="Name" and text()="test2"]/following-sibling::*[local-name()="Value"]/text()
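The sibling logic can also be walked through in Python as a sanity check. This is a sketch under assumptions: the namespace URI urn:example:get is invented, because the question does not show the declaration for the get: prefix, and local_name and normalize_space are hand-written stand-ins for the XPath functions of the same name.

```python
import xml.etree.ElementTree as ET

# urn:example:get is a made-up namespace URI; the real one is not shown in the question.
XML = """<get:OutputData xmlns:get="urn:example:get">
  <get:OutputDataItem><get:Name>test1</get:Name><get:Value/></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test2</get:Name><get:Value>B5B4</get:Value></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test3</get:Name><get:Value/></get:OutputDataItem>
</get:OutputData>"""


def local_name(elem):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return elem.tag.rsplit("}", 1)[-1]


def normalize_space(text):
    """Trim and collapse whitespace, like XPath's normalize-space()."""
    return " ".join((text or "").split())


def values_for(root, wanted_name):
    results = []
    for item in root:
        if local_name(item) != "OutputDataItem":
            continue
        children = list(item)
        # The predicate is applied to the parent item: does it have a matching Name child?
        if any(local_name(c) == "Name" and normalize_space(c.text) == wanted_name
               for c in children):
            # Only then step down to the Value sibling(s) of that Name.
            results.extend(c.text for c in children
                           if local_name(c) == "Value" and c.text)
    return results


root = ET.fromstring(XML)
print(values_for(root, "test2"))
```

The key point mirrors the accepted fix: the Name test selects the OutputDataItem, and Value is then read as a child of that item rather than as a descendant of Name.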

Using xsl:key to store result of boolean expression

In my transformation there is an expression some elements are repeatedly tested against. To reduce redundancy I'd like to encapsulate this in an xsl:key like this (not working):
<xsl:key name="td-is-empty" match="td" use="not(./node()[normalize-space(.) or ./node()])" />
The expected behaviour is the key to yield a boolean value of true in case the expression is evaluated successfully and otherwise false. Then I'd like to use it as follows:
<xsl:template match="td[not(key('td-is-empty', .))]" />
Is this possible and in case yes, how?
I think with XSLT 1.0 a key value is always of type string, so in your sample the key value will be either the string true or the string false. You could then call key('td-is-empty', 'true') to find all td element nodes for which the expression is true and key('td-is-empty', 'false') to find all td elements for which the expression is false.
You seem to want to do something differently with your key however, like storing the result of the use expression for each td node, based on node identity. I don't think that is how keys work in XSLT.
[edit]
You might however be able to express your requirement as
<xsl:template match="td[count(. | key('td-is-empty', 'false')) = count(key('td-is-empty', 'false'))]">...</xsl:template>
That matches those td elements which are a member of the set of elements found by key('td-is-empty', 'false').
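The count(. | set) = count(set) idiom is a set-membership test: taking the union of a node with a set leaves the size unchanged only if the node was already in the set. A small Python sketch of the same idea follows. The row markup and the name key_false are assumptions standing in for the result of key('td-is-empty', 'false'), and "non-empty" is simplified here to "has text".

```python
import xml.etree.ElementTree as ET

# Hypothetical table row: the first and third cells have content, the second is empty.
row = ET.fromstring("<tr><td>a</td><td/><td>b</td></tr>")
cells = list(row)

# Stand-in for key('td-is-empty', 'false'): the set of non-empty td elements.
key_false = {cell for cell in cells if cell.text}


def is_member(node, node_set):
    """Mimic count(. | $set) = count($set): the union only grows if node is new."""
    return len({node} | node_set) == len(node_set)


flags = [is_member(cell, key_false) for cell in cells]
print(flags)
```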

How to get node value using variable node name?

I have an XML document like:
<data>
<item type="apple">
<misc>something</misc>
<appleValue>23</appleValue>
<misc2>something else</misc2>
</item>
<item type="banana">
<bananaValue>47</bananaValue>
<random>something</random>
</item>
</data>
I can get the items with doc("data.xml")/data/item but I need to get the text from the elements that end with Value. So I'd like to get "23" and "47", but I don't necessarily know the element names, meaning all I really know is there are elements that end in Value, I don't know if it's appleValue, bananaValue, etc. except that I could look at the type attribute and buildup a string.
let $type := (doc("data.xml")/data/item)[1]/@type
doc("data.xml")/data/item/$typeValue
...That last line is what I'm trying to get at, clearly that's not correct but I need to find elements whose name is known based on a variable (stored in a variable such as $type) and "Value".
Any ideas? I realize this variable element naming is strange/odd/bad...but that's the way it is and I have to deal with it.
I got it thanks to this post: Can XPath match on parts of an element's name?
doc("data.xml")/data/item/*[ends-with(name(), "Value")]
I would avoid using the name() function in favor of either node-name() or local-name(). The reason for this is that name() can give you different answers depending on what (and whether) namespace prefixes are used in the source. For example, the following three elements have the same exact name (QName):
<appleValue xmlns="http://example.com"/>
<x:appleValue xmlns:x="http://example.com"/>
<y:appleValue xmlns:y="http://example.com"/>
However, the name() function will give you a different answer for each one (appleValue, x:appleValue, and y:appleValue, respectively). So you're better off either ignoring the namespace by using local-name() (which returns the string appleValue for all three of the above cases) or explicitly specifying the namespace (even if it's empty, as Oliver showed), using node-name() (which returns a proper QName value, rather than a string). In this case, since you're not using namespaces (and since even if you added one later, the code will still work), I'd be slightly in favor of using local-name() as follows:
doc("data.xml")/data/item/*['Value' eq substring-after(local-name(),../@type)]
For elaboration on reasons to avoid the name() function (and exceptions), see "Perils of the name function".
You can access the name of the node using name(). XPath 1.0 does not have an ends-with() function, but you can emulate one with substring() and string-length(): 'Value' is five characters long, so take the substring that starts four positions before the end.
//item/*[ substring( name(), string-length(name() ) - 4 ) = 'Value']
A more precise way to implement this would be
for $item in doc("data.xml")/data/item
let $value-name := fn:QName('', concat($item/@type, 'Value'))
return $item/*[node-name() = $value-name]
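For comparison, here is a Python sketch of two of the approaches above, run against the sample document: the XPath 1.0 suffix test built from substring() and string-length(), and the lookup that builds the element name from the type attribute. The helper xpath_substring is hypothetical and only handles integer start positions. It exists to show that XPath positions are 1-based.

```python
import xml.etree.ElementTree as ET

XML = """<data>
  <item type="apple">
    <misc>something</misc>
    <appleValue>23</appleValue>
    <misc2>something else</misc2>
  </item>
  <item type="banana">
    <bananaValue>47</bananaValue>
    <random>something</random>
  </item>
</data>"""


def xpath_substring(s, start):
    """XPath 1.0 substring(s, start) for an integer start (positions are 1-based)."""
    return s[max(start - 1, 0):]


def ends_with_value(name):
    """Mimic substring(name(), string-length(name()) - 4) = 'Value'."""
    return xpath_substring(name, len(name) - 4) == "Value"


root = ET.fromstring(XML)

# Approach 1: any child element whose name ends in 'Value'.
by_suffix = [child.text
             for item in root.findall("item")
             for child in item
             if ends_with_value(child.tag)]

# Approach 2: build the exact element name from the type attribute.
by_type = [item.findtext(item.get("type") + "Value")
           for item in root.findall("item")]

print(by_suffix, by_type)
```

The second approach is stricter: it would ignore a stray element such as otherValue inside an apple item, whereas the suffix test would pick it up.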

Use Xpath to find the appropriate element based on the element value

I have the following xml snippet
<ZMARA01 SEGMENT="1">
<CHARACTERISTICS_01>X,001,COLOR_ATTRIBUTE_FR,BRUN ÉCORCE,TMBR,French C</CHARACTERISTICS_01>
<CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Timber Brown,TMBR,Color Attr</CHARACTERISTICS_02>
</ZMARA01>
I am looking for an xpath expression that will match based on COLOR_ATTRIBUTE. It will not always be in CHARACTERISTIC_02. It could be CHARACTERISTIC_XX. Also I don't want to match COLOR_ATTRIBUTE_FR. I have been using this:
Transaction.Input_XML{/ZMAT/IDOC/E1MARAM/ZMARA01/*[starts-with(local-name(.), 'CHARACTERISTIC_')][contains(.,'COLOR_ATTRIBUTE')]}
This gets me mostly there but it matches both COLOR_ATTRIBUTE and COLOR_ATTRIBUTE_FR
Use:
contains(concat(',', ., ','), ',COLOR_ATTRIBUTE,')
This first surrounds the string value of the context node with commas, then simply tests whether the string so constructed contains ',COLOR_ATTRIBUTE,'.
Thus we treat all cases (pattern at the start of the string, pattern at the end of the string, and pattern neither at the start nor at the end) in the same single way.
If COLOR_ATTRIBUTE is guaranteed not to be in the first or last position, you could use [contains(.,',COLOR_ATTRIBUTE,')]; otherwise you could use something like [contains(.,'COLOR_ATTRIBUTE') and not(contains(.,'COLOR_ATTRIBUTE_FR'))].
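The comma-wrapping test is easy to check outside XPath. The Python sketch below applies the same idea to the two sample values. The function name has_token is hypothetical, and the dictionary simply pairs each element name from the snippet with its text.

```python
def has_token(text, token, sep=","):
    """Mimic contains(concat(',', ., ','), ',TOKEN,'): match whole fields only."""
    return (sep + token + sep) in (sep + text + sep)


rows = {
    "CHARACTERISTICS_01": "X,001,COLOR_ATTRIBUTE_FR,BRUN ÉCORCE,TMBR,French C",
    "CHARACTERISTICS_02": "X,001,COLOR_ATTRIBUTE,Timber Brown,TMBR,Color Attr",
}

matches = [name for name, text in rows.items()
           if has_token(text, "COLOR_ATTRIBUTE")]
print(matches)

# The wrapping also covers a token in the first or last position.
at_start = has_token("COLOR_ATTRIBUTE,Timber Brown", "COLOR_ATTRIBUTE")
at_end = has_token("X,001,COLOR_ATTRIBUTE", "COLOR_ATTRIBUTE")
```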

Resources