Thought this was going to be an easy query... my limited XPath knowledge certainly does not help and I'm probably complicating things.
Trying to locate any EFFECT tags that reside between two SUBTASK tags.
<TOPIC>
<TITLE/>
<EFFECT/>
<SUBTASK>
</SUBTASK>
<EFFECT/>
<SUBTASK>
</SUBTASK>
</TOPIC>
Here's a false positive example:
<TC>
<TCBODY>
<TASK>
<EFFECT/>
<TITLE/>
<TOPIC>
<TITLE/>
<EFFECT/>
<SUBTASK>
<EFFECT/>
</SUBTASK>
</TOPIC>
</TASK>
</TCBODY>
</TC>
UPDATE: The above question has been answered and the following expression worked:
//EFFECT[preceding-sibling::SUBTASK and following-sibling::SUBTASK]
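The accepted expression can be checked against both samples from the question. This is a minimal sketch in Ruby with REXML (the library used elsewhere in this thread collection); the variable names and the `count_matches` helper are made up for illustration.

```ruby
require 'rexml/document'

# First sample from the question: the second EFFECT sits between two SUBTASKs.
matching_xml = <<~XML
  <TOPIC>
    <TITLE/>
    <EFFECT/>
    <SUBTASK></SUBTASK>
    <EFFECT/>
    <SUBTASK></SUBTASK>
  </TOPIC>
XML

# Second sample: no EFFECT has a SUBTASK sibling on both sides.
false_positive_xml = <<~XML
  <TC><TCBODY><TASK>
    <EFFECT/>
    <TITLE/>
    <TOPIC>
      <TITLE/>
      <EFFECT/>
      <SUBTASK><EFFECT/></SUBTASK>
    </TOPIC>
  </TASK></TCBODY></TC>
XML

between_subtasks = '//EFFECT[preceding-sibling::SUBTASK and following-sibling::SUBTASK]'

# Hypothetical helper: number of nodes the expression selects in a document.
def count_matches(xml, xpath)
  REXML::XPath.match(REXML::Document.new(xml), xpath).size
end

matching_count = count_matches(matching_xml, between_subtasks)
false_positive_count = count_matches(false_positive_xml, between_subtasks)
puts "matching sample: #{matching_count}, false-positive sample: #{false_positive_count}"
```

Note that the predicate only requires some SUBTASK sibling before and some SUBTASK sibling after the EFFECT; the siblings do not have to be immediately adjacent.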
Now trying to locate any EFFECT tag where the preceding-sibling and following-sibling are the same - no matter what the element name is.
Tried this expression, but it's returning too many false positives:
//EFFECT[preceding-sibling::*[name()] = following-sibling::*[name()]]
Any help would be greatly appreciated!! Thanks
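A likely reason the attempt over-matches: the predicate `[name()]` is true for every element, and `=` between two node-sets compares string-values, which are all empty for empty elements. An XPath 1.0 candidate that compares the names of the nearest siblings instead (a suggestion, not verified in the asker's tool) is `//EFFECT[preceding-sibling::* and following-sibling::* and name(preceding-sibling::*[1]) = name(following-sibling::*[1])]`. The sketch below cross-checks the intended result in Ruby using REXML's `previous_element` and `next_element` rather than that expression, and it assumes "preceding" and "following" mean the immediately adjacent element siblings.

```ruby
require 'rexml/document'

xml = <<~XML
  <TOPIC>
    <TITLE/>
    <EFFECT/>
    <SUBTASK></SUBTASK>
    <EFFECT/>
    <SUBTASK></SUBTASK>
  </TOPIC>
XML
doc = REXML::Document.new(xml)

# Keep only EFFECT elements whose nearest element siblings on both sides
# exist and share the same name (whatever that name is).
same_neighbors = REXML::XPath.match(doc, '//EFFECT').select do |effect|
  before = effect.previous_element
  after = effect.next_element
  before && after && before.name == after.name
end

same_neighbors.each do |effect|
  puts "EFFECT between two #{effect.previous_element.name} elements"
end
```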
I have found many similar posts to this question, but nothing that answers this specific question. I must use XPath 1.0. I do not have XSLT (or XQuery or anything else) available to me, and I cannot use XPath 2.0. I am executing this XPath from inside a piece of software (Arbortext Styler), in which I can use XPath 1.0 to select content from other nodes, but XSLT is not available in this context. Also, I have no control over the structure of the source XML.
When I am in the context of <step>, I need to be able to match a previous procedure/task/step for which that step's parent procedure matches the current procedure's @ref and @seq and has the letter "A" as the value for @conf.
<document>
<topic>
<procedure ref="056" seq="01" conf="A">
<task>
<step>1. Blah Blah (056-01-A)</step>
</task>
</procedure>
<procedure ref="057" seq="02" conf="A">
<task>
<step>2. Blah blah (057-02-A)</step>
</task>
</procedure>
<procedure ref="057" seq="02" conf="B">
<task>
<step>2. Blah blah (057-02-B)</step>
</task>
</procedure>
<procedure ref="057" seq="03" conf="A">
<task>
<step>3. Blah blah (057-02-A)</step>
</task>
</procedure>
</topic>
</document>
What I need is something like this, but without the current() function, which is not supported by the software application:
//procedure[@ref=current()/ancestor::procedure/@ref and @seq=current()/ancestor::procedure/@seq and @conf='A']/task/step
Or something like this, but without the for in return statement:
for $ref in ancestor::procedure/@ref, $seq in ancestor::procedure/@seq return //topic/procedure[@ref=$ref and @seq=$seq and @conf='A']/task/step/text()
Does anyone have any suggestions for how this could be accomplished purely with XPath 1.0? Please note that the position of the procedure cannot be hardcoded. The duplicate refs can occur multiple times and in any position. Also, it is a requirement that this match be done with a starting context of <step>.
I suspect the answer to my question is that it can't be done, but I do know that if it can be done, this is the place to find the answer! Thanks, in advance, to all of you who consider this question.
This post was similar, but the search was looking for children of starting context: Xpath Getting All Nodes that Have an Attribute that Matches Another Node
This was also interesting, but my attribute value is not an ID: Xpath: find an element value from a match of id attribute to id anchor
Any suggestions?
As suggested by both Tomalak and Honza Hejzl, this cannot be done with XPath 1.0. Thanks for the feedback.
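For readers who are not bound by the same constraint, the lookup becomes straightforward once a host language can hold the intermediate values. The following is only a sketch in Ruby with REXML (not available in the asker's Arbortext Styler context); it splits the work into two steps, reading @ref and @seq from the enclosing procedure and then interpolating them into a plain XPath 1.0 expression. The variable names are illustrative, and the interpolation assumes the attribute values contain no quote characters.

```ruby
require 'rexml/document'

xml = <<~XML
  <document><topic>
    <procedure ref="056" seq="01" conf="A"><task><step>1. Blah Blah (056-01-A)</step></task></procedure>
    <procedure ref="057" seq="02" conf="A"><task><step>2. Blah blah (057-02-A)</step></task></procedure>
    <procedure ref="057" seq="02" conf="B"><task><step>2. Blah blah (057-02-B)</step></task></procedure>
    <procedure ref="057" seq="03" conf="A"><task><step>3. Blah blah (057-02-A)</step></task></procedure>
  </topic></document>
XML
doc = REXML::Document.new(xml)

# Starting context for the demo: the step inside the conf="B" procedure.
current_step = REXML::XPath.first(doc, "//procedure[@conf='B']/task/step")

# Step 1 (host language): walk up to the enclosing procedure.
# Assumes the step really is nested inside a procedure element.
procedure = current_step.parent
procedure = procedure.parent until procedure.name == 'procedure'
ref = procedure.attributes['ref']
seq = procedure.attributes['seq']

# Step 2 (XPath 1.0): select the step of the conf="A" procedure
# that has the same ref and seq.
xpath = "//procedure[@ref='#{ref}' and @seq='#{seq}' and @conf='A']/task/step"
matched_texts = REXML::XPath.match(doc, xpath).map(&:text)
puts matched_texts
```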
I have an XML that looks like the following:
[Image: XML tree]
I need those tag elements that have only son elements as their ancestors. The only non-son ancestor allowed is the root element, parent. After parent, no ancestor of tag can be anything other than son. This XPath would therefore return <tag id="t1" /> and <tag id="t2" />
//son//tag would be one solution. Another would be //tag[ancestor::son]. You could use /descendant:: in place of //; there are differences in the order in which results are reported. There are other variants; which one is best depends on the exact context in which you're doing this.
I should have posted this earlier, or maybe it does not matter. Here is the nasty-looking XPath I wrote to solve this:
/parent/(descendant::tag except(descendant::element() except descendant::son)/descendant::tag)
Hope someone would suggest a better looking alternative.
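Since the original tree image is not available, the requirement can only be illustrated on an assumed document. In the sketch below the XML, the `daughter` element name, and the `only_son_ancestors?` helper are all hypothetical; the check itself is done in Ruby with REXML by walking up from each tag to the root. Note that the expression above uses `except`, which is XPath 2.0, and that in this assumed tree //son//tag would also select t3 and t4, because both have a son somewhere among their ancestors.

```ruby
require 'rexml/document'

# Hypothetical tree: only t1 and t2 have nothing but son elements
# between themselves and the root element parent.
xml = <<~XML
  <parent>
    <son>
      <tag id="t1"/>
      <son><tag id="t2"/></son>
      <daughter>
        <tag id="t3"/>
        <son><tag id="t4"/></son>
      </daughter>
    </son>
    <daughter><tag id="t5"/></daughter>
  </parent>
XML
doc = REXML::Document.new(xml)
root = doc.root

# True when every ancestor of the tag below the root is named son.
def only_son_ancestors?(tag, root)
  node = tag.parent
  until node.equal?(root)
    return false if node.nil? || node.name != 'son'
    node = node.parent
  end
  true
end

selected_ids = REXML::XPath.match(doc, '//tag')
                           .select { |tag| only_son_ancestors?(tag, root) }
                           .map { |tag| tag.attributes['id'] }
puts selected_ids.inspect
```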
I want to use XPath to select the sub tree containing the <name>-tag with "ABC" and not the other one from the following xml. Is this possible? And as a minor question, which keywords would I use to find something like that over Google (e.g. for selecting the sub tree by an attribute I would have the terminology for)?
<root>
<operation>
<name>ABC</name>
<description>Description 1</description>
</operation>
<operation>
<name>DEF</name>
<description>Description 2</description>
</operation>
</root>
Use:
/*/operation[name='ABC']
For your second question: I strongly recommend not to rely on online sources (there are some that aren't so good) but to read a good book on XPath.
See some resources listed here:
https://stackoverflow.com/questions/339930/any-good-xslt-tutorial-book-blog-site-online/341589#341589
For your first question, I think a more accurate way to do it would be: //operation[./name[text()='ABC']]. And according to this, we can also write it as: //operation[./name[text()[.='ABC']]]
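To see what /*/operation[name='ABC'] selects, here is a small sketch in Ruby with REXML run against the XML from the question. The predicate compares the string-value of the name child with 'ABC', and the result is the whole operation element, so its other children are reachable from it.

```ruby
require 'rexml/document'

xml = <<~XML
  <root>
    <operation>
      <name>ABC</name>
      <description>Description 1</description>
    </operation>
    <operation>
      <name>DEF</name>
      <description>Description 2</description>
    </operation>
  </root>
XML
doc = REXML::Document.new(xml)

# Select the operation subtree whose name child has the value ABC.
operation = REXML::XPath.first(doc, "/*/operation[name='ABC']")

# The selected node is the <operation> element itself.
description = operation.elements['description'].text
puts description
```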
This is my first post here. I have just started working with Ruby and am using REXML for some XML handling. I present a small sample of my xml file here:
<record>
<header>
<identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
<datestamp>2004-08-13T15:32:50Z</datestamp>
<setSpec>gmd</setSpec>
</header>
<metadata>
<titleInfo>
<title>Meet-konstige vertoning van de grote en merk-waardige zons-verduistering</title>
</titleInfo>
</metadata>
</record>
My objective is to match the last numerical value in the <identifier> tag with a list of values that I have from an array. I have achieved this with the following code snippet:
ids = XPath.match(xmldoc, "//identifier[text()='oai:lcoa1.loc.gov:loc.gmd/"+mapid+"']")
Having got a particular identifier that I wish to investigate, I now want to go back up to <record>, select <metadata>, and then select <titleInfo> to get the value in the <title> node for that particular identifier.
I have looked at the XPath tutorials and expressions and many of the related questions on this website as well and learnt about axes and the different concepts such as ancestor/following sibling etc. However, I am really confused and cannot figure this out easily.
I was wondering if I could get any help or if someone could point me towards an online resource "easy" to read.
Thank you.
UPDATE:
I have been trying various combinations of code such as:
idss = XPath.match(xmldoc, "//identifier[text()='oai:lcoa1.loc.gov:loc.gmd/"+mapid+"']/parent::header/following-sibling::metadata/child::mods/child::titleInfo/child::title")
The code compiles but does not output anything. I am wondering what I am doing so wrong.
Here's a way to accomplish it using XPath, then going up to the record, then XPath to get the title:
require 'rexml/document'
include REXML
xml=<<END
<record>
<header>
<identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
<datestamp>2004-08-13T15:32:50Z</datestamp>
<setSpec>gmd</setSpec>
</header>
<metadata>
<titleInfo>
<title>Meet-konstige</title>
</titleInfo>
</metadata>
</record>
END
doc=Document.new(xml)
mapid = "ct000379"
text = "oai:lcoa1.loc.gov:loc.gmd/g3195.#{mapid}"
identifier_nodes = XPath.match(doc, "//identifier[text()='#{text}']")
record_node = identifier_nodes.first.parent.parent
record_node.elements['metadata/titleInfo/title'].text
=> "Meet-konstige"
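The one-expression attempt from the question's update can also be made to work on the posted sample. That sample contains no mods element, so the child::mods step selects nothing there; the real file may well contain one (possibly in a namespace), which is an assumption this sketch does not cover. Dropping that step gives the following, shown in Ruby with REXML with illustrative variable names and the shortened title.

```ruby
require 'rexml/document'

xml = <<~XML
  <record>
    <header>
      <identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
      <datestamp>2004-08-13T15:32:50Z</datestamp>
      <setSpec>gmd</setSpec>
    </header>
    <metadata>
      <titleInfo>
        <title>Meet-konstige</title>
      </titleInfo>
    </metadata>
  </record>
XML
doc = REXML::Document.new(xml)

mapid = 'ct000379'
id_text = "oai:lcoa1.loc.gov:loc.gmd/g3195.#{mapid}"

# identifier -> up to header -> across to metadata -> down to title
xpath = "//identifier[text()='#{id_text}']" \
        '/parent::header/following-sibling::metadata/titleInfo/title'

title = REXML::XPath.first(doc, xpath)
title_text = title && title.text
puts title_text
```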
I am looking to find all attributes of an element that match a certain pattern.
So for an element
<element s2="1" name="aaaa" id="1" />
<element s3="1" name="aaaa" id="2" />
I would like to be able to find all attributes that start with 's' (returning the value of s2 for the first element and the value of s3 for the second element).
If this is outside of xpath's ability please let me know.
Use:
element/@*[starts-with(name(), 's')]
This XPath expression selects all attribute nodes whose name starts with the string 's' and that are attributes of elements named element that are children of the current node.
starts-with() is a standard function in XPath 1.0
element/@*[substring(name(), 1,1) = "s"]
will match any attribute that starts with 's'.
The function starts-with() might look better than using substring()
I've tested the given answers from both @Dimitre-Novatchev and @Ledhund, using the lxml.html module in Python.
Both element/@*[starts-with(name(), 's')] and element/@*[substring(name(), 1,1) = "s"] return only the values of s2 and s3. You won't be able to know which value belongs to which attribute.
I think in practice I would be more interested in finding the elements themselves that contain the attributes of names starting with specific characters rather than just their values.
To achieve that is very simple: just add /.. at the end,
element/@*[starts-with(name(), "s")]/..
or
element/@*[starts-with(name(), "s")]/parent::*
or
element/@*[starts-with(name(), "s")]/parent::node()
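If both the attribute name and its value are needed, along with the element they belong to, one option is to select the elements with XPath and then filter their attributes in the host language. This is a sketch in Ruby with REXML rather than lxml; the <root> wrapper is an assumption added so the two sample elements form a well-formed document, and the filtering is done with REXML's attribute API, not with the XPath predicate.

```ruby
require 'rexml/document'

xml = <<~XML
  <root>
    <element s2="1" name="aaaa" id="1" />
    <element s3="1" name="aaaa" id="2" />
  </root>
XML
doc = REXML::Document.new(xml)

# Collect [element id, attribute name, attribute value] for every
# attribute whose name starts with 's'.
s_attributes = []
REXML::XPath.match(doc, '//element').each do |element|
  element.attributes.each_attribute do |attr|
    next unless attr.name.start_with?('s')
    s_attributes << [element.attributes['id'], attr.name, attr.value]
  end
end

s_attributes.each do |id, name, value|
  puts "element #{id}: #{name}=#{value}"
end
```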
None of the above worked for me.
So I made some changes, and this worked for me. :)
/*:UserCustomField[starts-with(@name, 'purchaseDate')]