XPath: how to get an element by index AND attribute

Given this xml:
<mets:techMD ID="techMD014">
<mets:mdWrap MDTYPE="PREMIS:OBJECT">
<mets:xmlData>
<premis:object
xsi:type="premis:file"
xsi:schemaLocation="info:lc/xmlns/premis-v2
http://www.loc.gov/standards/premis/v2/premis-v2-0.xsd">
<premis:objectIdentifier>
<premis:objectIdentifierType
>filepath</premis:objectIdentifierType>
<premis:objectIdentifierValue
>bib1234_yyyymmdd_99_x_performance.xml</premis:objectIdentifierValue>
</premis:objectIdentifier>
</premis:object>
</mets:xmlData>
</mets:mdWrap>
</mets:techMD>
<mets:techMD ID="techMD015">
<mets:mdWrap MDTYPE="PREMIS:OBJECT">
<mets:xmlData>
<premis:object
xsi:type="premis:representation"
xsi:schemaLocation="info:lc/xmlns/premis-v2
http://www.loc.gov/standards/premis/v2/premis-v2-0.xsd">
<premis:objectIdentifier>
<premis:objectIdentifierType
>local</premis:objectIdentifierType>
<premis:objectIdentifierValue
>bib1234_yyyymmdd_99_x</premis:objectIdentifierValue>
</premis:objectIdentifier>
</premis:object>
</mets:xmlData>
</mets:mdWrap>
</mets:techMD>
I would like to write an XPath query that takes both index and attribute into account.
I.e., can I combine these two into ONE query? (It's the part around the "object" element I'm interested in):
//techMD/mdWrap[
@MDTYPE='PREMIS:OBJECT'
]/xmlData//object[1]/objectIdentifier/objectIdentifierValue
//techMD/mdWrap[
@MDTYPE='PREMIS:OBJECT'
]/xmlData//object[
@xsi:type='premis:file'
]/objectIdentifier/objectIdentifierValue
Thanks!

Just replace the corresponding part with:
object[@xsi:type='premis:file'][1]
if you want the first object among those that have the given xsi:type value, or
object[1][@xsi:type='premis:file']
if you want the first object, provided it has the given xsi:type value.
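To make the difference between the two predicate orders concrete, here is a rough Python sketch. It does not run XPath; it only emulates the two orders with list operations on a simplified, namespace-free stand-in for the document (the sample XML and both helper names are my own assumptions, not part of the question).

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for the question's document (an assumption).
XML = """
<xmlData>
  <object type="representation"><value>bib1234_yyyymmdd_99_x</value></object>
  <object type="file"><value>bib1234_yyyymmdd_99_x_performance.xml</value></object>
</xmlData>
"""

objects = ET.fromstring(XML).findall("object")


def filter_then_index(nodes, wanted):
    """Emulates object[@type=wanted][1]: filter first, then take the first match."""
    matches = [n for n in nodes if n.get("type") == wanted]
    return matches[:1]


def index_then_filter(nodes, wanted):
    """Emulates object[1][@type=wanted]: take the first node, keep it only if it matches."""
    return [n for n in nodes[:1] if n.get("type") == wanted]


first_file = [n.findtext("value") for n in filter_then_index(objects, "file")]
only_if_first = [n.findtext("value") for n in index_then_filter(objects, "file")]
print(first_file)     # the first object whose type is "file"
print(only_if_first)  # empty here, because the first object is not a "file"
```

With a real XPath engine you would use the two expressions above directly; the sketch is only meant to show why the order of the predicates matters.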

Related

Is it possible in XPATH to find an element by attribute value, not by name?

For example I have an XML element:
<input id="optSmsCode" type="tel" name="otp" placeholder="SMS-code">
Suppose I know that some attribute must have the value otp, but I don’t know which attribute it is. Is it possible to write an XPath expression like this:
.//input[(contains(*, "otp")) or (contains(*, "ode"))]
Try it like this and see if it works:
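one = '//input/@*[(contains(.,"otp") or contains(.,"ode"))]/..'
# Newer Selenium 4 releases dropped find_elements_by_xpath; there, use driver.find_elements(By.XPATH, one)
print(driver.find_elements_by_xpath(one))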
Edit:
The contains() function requires its first argument to have a cardinality of zero or one. In plain(ish) English, it means you can check only one node at a time to see if it contains the target string.
So, the expression above goes through each attribute of input separately (/@*), checks whether the value of that specific attribute contains the target string and, if the target is found, goes up to the parent of that attribute (/..) which, in the case of an attribute, is the element itself (input).
This XPath expression selects all <input> elements that have some attribute, whose string value contains "otp" or "ode". Notice that there is no need to "go up to the parent ..."
//input[@*[contains(., 'otp') or contains(., 'ode')]]
If we know that "otp" or "ode" must be the whole value of the attribute (not just a substring of the value), then this expression is stricter and more efficient to evaluate:
//input[@*[. = 'otp' or . = 'ode']]
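If you are not bound to an XPath engine, the same "any attribute" test can be sketched in plain Python with the standard library. This is only an emulation of the two expressions above, not XPath itself; the wrapping form element, the second input and the helper name are assumptions added so the snippet is self-contained.

```python
import xml.etree.ElementTree as ET

# The question's <input>, self-closed and wrapped so it parses as XML (an assumption).
XML = """
<form>
  <input id="optSmsCode" type="tel" name="otp" placeholder="SMS-code"/>
  <input id="email" type="text" name="mail" placeholder="E-mail"/>
</form>
"""


def inputs_with_attr_value(root, needles, exact=False):
    """Return <input> elements having at least one attribute value that
    contains (exact=False) or equals (exact=True) one of the needles."""
    hits = []
    for elem in root.iter("input"):
        values = list(elem.attrib.values())
        if exact:
            matched = any(v in needles for v in values)
        else:
            matched = any(n in v for v in values for n in needles)
        if matched:
            hits.append(elem)
    return hits


root = ET.fromstring(XML)
substring_ids = [e.get("id") for e in inputs_with_attr_value(root, ["otp", "ode"])]
exact_ids = [e.get("id") for e in inputs_with_attr_value(root, ["otp", "ode"], exact=True)]
print(substring_ids, exact_ids)
```

Note that the substring variant also matches on id="optSmsCode" (it contains "ode"), which is one reason the exact comparison is the stricter choice.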
In this latter case ("otp" or "ode" are the whole value of the attribute), if we have to compare against many values then an XPath expression of the above form will quickly become too long. There is a way to simplify such a long expression and do just a single comparison:
//input[@*[contains('|s1|s2|s3|s4|s5|', concat('|', ., '|'))]]
The above expression selects all input elements in the document, that have at least one attribute whose value is one of the strings "s1", "s2", "s3", "s4" or "s5".
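The delimiter trick is easy to try outside XPath. The following Python sketch mirrors the contains/concat test with a plain substring check; the function name and the sample values are made up for illustration.

```python
def in_delimited_set(value, allowed="|s1|s2|s3|s4|s5|", sep="|"):
    """Mirror contains('|s1|...|', concat('|', ., '|')) with a substring test."""
    return f"{sep}{value}{sep}" in allowed


results = {v: in_delimited_set(v) for v in ["s1", "s5", "s", "s6", "s1|s2"]}
print(results)
```

Wrapping the value in delimiters is what stops a partial value such as "s" from matching. The last sample shows the one caveat I am aware of: a value that itself contains the delimiter ("s1|s2") still matches, so pick a delimiter that cannot occur in the attribute values.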

How to select a specific category value in this Xpath expression

I have a feed here. I'm trying to create an XPath expression that returns items that have a category equal to Bananas. Due to the limitations in my XML parser, I can't use namespaces directly to select items.
The expression /rss/channel/item//*[name()='itunes:category'] returns this:
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Apples"/>'
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Bananas"/>'
...
And /rss/channel/item//*[name()='itunes:category']/@text returns this:
Attribute='text=Apples'
Attribute='text=Bananas'
...
But I can't figure out how to limit the response to just a single category (e.g., Bananas)?
I want some kind of expression like this:
/rss/channel/item//*[name()='itunes:category' and contains(., 'Bananas')]
But this doesn't work. It's not syntactically valid. What would be the right XPath expression syntax to just return Bananas?
Do you mean you want to filter by the attributes of a child of item, but still return the item node?
/rss/channel/item/*[name()='itunes:category' and contains(@text,'Apples')]/parent::item
or, simpler:
/rss/channel/item[*[name()='itunes:category' and @text='Apples']]
I used Apples in the example because with your sample XML file there are 0 results for Bananas.
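For comparison, here is a small Python sketch of the same "return the item, filter on the child's attribute" idea using only the standard library. The feed below is a tiny hypothetical stand-in (the real feed is not shown in the question), and items_in_category is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

ITUNES = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# Tiny hypothetical feed standing in for the real one.
FEED = f"""
<rss xmlns:itunes="{ITUNES}">
  <channel>
    <item><title>Episode 1</title><itunes:category text="Apples"/></item>
    <item><title>Episode 2</title><itunes:category text="Bananas"/></item>
    <item><title>Episode 3</title></item>
  </channel>
</rss>
"""


def items_in_category(root, category):
    """Return <item> elements that have an itunes:category child
    whose text attribute equals the given category."""
    tag = f"{{{ITUNES}}}category"  # ElementTree spells qualified names as {uri}local
    return [
        item
        for item in root.findall("./channel/item")
        if any(c.get("text") == category for c in item.findall(tag))
    ]


root = ET.fromstring(FEED)
titles = [item.findtext("title") for item in items_in_category(root, "Bananas")]
print(titles)
```

The filter sits on the item, just like the predicate in the second expression above, so the result is the item itself rather than the category element.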

xpath expression to read value based on value of sibling

I have the XML below and would like to read the value of the 'Value' tag whose Name matches 'test2'. I'm using the XPath below, but it did not work. Can someone help?
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']/*[ local-name()='Name'][normalize-space(.) = 'test2']//*[local-name()='Value']/text()
<get:OutputData>
<get:OutputDataItem>
<get:Name>test1</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test2</get:Name>
<get:Value>B5B4</get:Value>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test3</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>OP_VCscEncrptCd_VAR</get:Name>
<get:Value/>
</get:OutputDataItem>
</get:OutputData>
Thanks
You were close, but because get:Name and get:Value are siblings, you need to adjust your XPath a little.
Your XPath was attempting to address get:Value elements that are descendants of get:Name, rather than siblings. Move the criteria that filter on get:Name into a predicate, then step down into get:Value:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name'][normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
You could also combine the criteria of the predicate filter on the get:name and use an and:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name' and normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
This should work, I think (note that local-name() returns the name without the prefix):
//*[local-name()="Name" and text()="test2"]/following-sibling::*[local-name()="Value"]/text()
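As a quick way to check the "predicate on the parent, then step to the sibling" approach, here is a sketch with Python's standard ElementTree, which supports a [tag='text'] predicate. The namespace URI is a made-up placeholder, since the question never shows the real one.

```python
import xml.etree.ElementTree as ET

# The namespace URI is a made-up placeholder; the question does not show the real one.
NS = {"get": "urn:example:get"}

XML = """
<get:OutputData xmlns:get="urn:example:get">
  <get:OutputDataItem><get:Name>test1</get:Name><get:Value/></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test2</get:Name><get:Value>B5B4</get:Value></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test3</get:Name><get:Value/></get:OutputDataItem>
</get:OutputData>
"""

root = ET.fromstring(XML)

# Filter the OutputDataItem by its Name child, then read the sibling Value.
value = root.findtext(
    "./get:OutputDataItem[get:Name='test2']/get:Value", namespaces=NS
)
print(value)
```

The predicate is attached to get:OutputDataItem, not to get:Name, which is the same restructuring the first answer describes.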

How to refer to position when using xf:setvalue function with iterate

Considering this code example and this post
...
<xf:action>
<xf:setvalue
iterate="instance('fr-send-submission-params')/@*"
ref="."
value="event(name(context()))"/>
</xf:action>
...
How can I refer to the current iterated position? Like value="position()"
Can I use this position as a variable in XPath expressions? Like ref="/AnotherElement[position()]"
The following works:
<xf:action iterate="instance('fr-send-submission-params')/@*">
<xf:var name="p" value="position()"/>
<xf:setvalue ref="." value="$p"/>
</xf:action>
I don't think you can get away just with xf:setvalue, because ref changes the evaluation context of the expression to a single item which means that position() returns 1 within value.
A warning as I see that you iterate on attributes: I don't think that attribute position is guaranteed to be consistent.
Update:
The following works if you have elements, but then you need to have knowledge of the items iterated within the xf:setvalue:
<xf:setvalue
event="DOMActivate"
iterate="value"
ref="."
value="count(preceding-sibling::value) + 1"/>
So I think that the option with an enclosing action is much clearer.

Xpath: find an element value from a match of id attribute to id anchor

I would like to find the value of an element matched on id attribute for which I only have the ref - the bit with #, the anchor.
I am looking for the value of partyId:
<party id="partyA">
<partyId>THEID</partyId>
but to get there I only have the href from the following
<MyData>
<MyReference href="#partyA" />
Stripping the # sign does not look good to me.
Any hints?
Because you haven't provided complete XML documents, I have to use // -- a practice I strongly recommend avoiding.
Suppose that
$vDataRef
is defined as
//MyData/MyReference/@href
and its string value is "#partyA", then one possible XPath expression that selects the wanted node is:
//party[@id=substring($vDataRef,2)]
In case the XML document has a DTD in which the id attribute of party is defined to be of type ID, then it is more convenient and efficient to use the standard XPath function id():
id(substring($vDataRef,2))
Assuming you have your ID as a variable already (let's say $myId), then try using:
//party[contains($myId, @id)]
The contains() function will check, for each matching node, whether or not the value of its id attribute occurs in the value that you pass in.
Alternatively (as that could be considered 'ropey'), you can try:
//party[@id=substring($myId, 2, 1 div 0)]
The substring() function should be a little more precise.
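The "strip the leading # and compare ids" idea can be sketched in Python with the standard library. The wrapping root element and the second party are assumptions added to make the fragment a complete document, and resolve_party_id is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Wrapper element and second party added so the fragment is a full document (assumption).
XML = """
<root>
  <party id="partyA"><partyId>THEID</partyId></party>
  <party id="partyAB"><partyId>OTHER</partyId></party>
  <MyData><MyReference href="#partyA"/></MyData>
</root>
"""


def resolve_party_id(root):
    """Follow MyReference/@href (an anchor such as '#partyA') to the party
    with that id and return the text of its partyId child, or None."""
    href = root.find("./MyData/MyReference").get("href")
    target = href[1:] if href.startswith("#") else href  # same as substring(href, 2)
    for party in root.iter("party"):
        if party.get("id") == target:  # exact comparison, not a substring test
            return party.findtext("partyId")
    return None


root = ET.fromstring(XML)
party_id = resolve_party_id(root)
print(party_id)
```

The exact comparison is why the substring() variants are safer than contains(): in this sample, a hypothetical reference "#partyAB" would contain both "partyA" and "partyAB", so a contains() test would match both party elements.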
