Get every <path> where <id> = 1 from an XML file - XPath

I want to get every <path> where <id> = 1 from an XML file.
In SQL my query would be "select path from xml where id = 1".
<xml_export>
<id>1</id>
<path>\DATEN\00000001.003</path>
</xml_export>
<xml_export>
<id>2</id>
<path>\DATEN\00000001.004</path>
</xml_export>
<xml_export>
<id>1</id>
<path>\DATEN\00000001.005</path>
</xml_export>

You can locate the xml_export nodes that contain an id with the desired value, and then select the path nodes inside them:
"//xml_export[.//id='1']//path"
To be stricter and match only cases where id and path are direct children of xml_export, change the expression to:
"//xml_export[./id='1']/path"
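As a rough illustration, the strict form can be tried with Python's standard xml.etree.ElementTree. This is only a sketch under some assumptions: the sample is wrapped in a hypothetical <root> element so that it is well-formed, and because ElementTree implements only a subset of XPath, the predicate is written as [id='1'] rather than [./id='1'].

```python
import xml.etree.ElementTree as ET

# The sample has several top-level <xml_export> elements, so it is wrapped
# in a hypothetical <root> element here to make it well-formed.
xml_text = r"""<root>
<xml_export>
<id>1</id>
<path>\DATEN\00000001.003</path>
</xml_export>
<xml_export>
<id>2</id>
<path>\DATEN\00000001.004</path>
</xml_export>
<xml_export>
<id>1</id>
<path>\DATEN\00000001.005</path>
</xml_export>
</root>"""

root = ET.fromstring(xml_text)

# Strict form: id and path must be direct children of xml_export.
paths = [p.text for p in root.findall(".//xml_export[id='1']/path")]
print(paths)
```

With the sample above, only the first and third path values are selected.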

Related

Finding an ancestor element (not the direct parent) based on a partial match on both the ancestor name and the child value

I have the following setup
<Ancestor_element_*****> Ancestor value
  └ ......
      └ <Child_element> Child value *****
I have part of the child value and part of the ancestor node name. I need to get the Ancestor value (I do not know the exact level of nesting). Can this be done via an XPath query?
You are looking for a child element whose text contains "Child value", then you want its ancestor whose name contains "Ancestor_element", and you want its text value:
//Child_element[contains(text(),'Child value')]
/ancestor::*[contains(name(),'Ancestor_element')]/text()
Tested against
<Root>
<Ancestor_element_1>Ancestor value
<Something/>
<Something_in_between>
<Child_element> Child value 1</Child_element>
</Something_in_between>
</Ancestor_element_1>
</Root>
in xsh.
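For readers without an XPath 1.0 engine at hand, the same lookup can be approximated in Python. ElementTree has no ancestor axis, so this sketch swaps in a different technique: it builds a child-to-parent map and walks upwards. The helper name ancestor_values is made up for illustration, and only the text that precedes the first child element of the ancestor is returned.

```python
import xml.etree.ElementTree as ET

xml_text = """<Root>
<Ancestor_element_1>Ancestor value
<Something/>
<Something_in_between>
<Child_element> Child value 1</Child_element>
</Something_in_between>
</Ancestor_element_1>
</Root>"""

root = ET.fromstring(xml_text)

# ElementTree elements do not know their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestor_values(root, child_tag, child_text, ancestor_name_part):
    """Hypothetical helper: find matching children, then walk up to matching ancestors."""
    values = []
    for child in root.iter(child_tag):
        if child_text not in (child.text or ""):
            continue
        node = parent_of.get(child)
        while node is not None:
            if ancestor_name_part in node.tag:
                # Leading text of the ancestor, with surrounding whitespace removed.
                values.append((node.text or "").strip())
            node = parent_of.get(node)
    return values


result = ancestor_values(root, "Child_element", "Child value", "Ancestor_element")
print(result)
```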

How to get node having a child node having a specified text?

Given this XML:
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>1</mgns1:CODE_CS>
<mgns1:VALEUR_CS>2</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>2</mgns1:CODE_CS>
<mgns1:VALEUR_CS>M</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>3</mgns1:CODE_CS>
<mgns1:VALEUR_CS>LOC</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
I want to get the mgns1:Champ_supplementaire node that has a mgns1:CODE_CS child whose text is 2. How can I do that?
I tried
NodeList nodeliste_cs2 = (NodeList) xpath.evaluate( "//mgns1:Champ_supplementaire[//mgns1:CODE_CS=2]//mgns1:VALEUR_CS",doc, XPathConstants.NODESET);
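//node_foo[//node_bar=2]
means select every node_foo, provided there is a node_bar with value 2 anywhere in the document (the predicate starts from the document root, so it is either true for all node_foo elements or for none)
//node_foo[node_bar=2]
means select every node_foo that has its own child node_bar with value 2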
So you need
"//mgns1:Champ_supplementaire[mgns1:CODE_CS=2]/mgns1:VALEUR_CS"
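The difference can be checked with a small Python sketch. Several things here are assumptions: the namespace URI urn:example:mgns1 is invented because the question does not show what mgns1 is bound to, the fragment is wrapped in a hypothetical <root>, and the value is quoted as '2' because ElementTree's XPath subset only accepts quoted strings in predicates. ElementTree cannot evaluate the absolute //... predicate either, so the broken form is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the question does not show what mgns1 is bound to.
NS = {"mgns1": "urn:example:mgns1"}

xml_text = """<root xmlns:mgns1="urn:example:mgns1">
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>1</mgns1:CODE_CS>
<mgns1:VALEUR_CS>2</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>2</mgns1:CODE_CS>
<mgns1:VALEUR_CS>M</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
<mgns1:Champ_supplementaire>
<mgns1:CODE_CS>3</mgns1:CODE_CS>
<mgns1:VALEUR_CS>LOC</mgns1:VALEUR_CS>
</mgns1:Champ_supplementaire>
</root>"""

root = ET.fromstring(xml_text)

# Relative predicate: CODE_CS must be a child of the entry being tested.
query = ".//mgns1:Champ_supplementaire[mgns1:CODE_CS='2']/mgns1:VALEUR_CS"
values = [v.text for v in root.findall(query, NS)]

# Emulation of the broken predicate [//mgns1:CODE_CS=2]: it only asks whether
# *any* CODE_CS in the document equals 2, so every entry passes the test.
any_code_2 = any(c.text == "2" for c in root.iter("{urn:example:mgns1}CODE_CS"))
all_values = root.findall(".//mgns1:Champ_supplementaire/mgns1:VALEUR_CS", NS)
broken = [v.text for v in all_values] if any_code_2 else []

print(values, broken)
```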

xerces-c 3.1 XPath evaluation

I could not find many examples of evaluating XPath with xerces-c 3.1.
Given the following sample XML input:
<abc>
<def>AAA BBB CCC</def>
</abc>
I need to retrieve the "AAA BBB CCC" string by the XPath "/abc/def/text()[0]".
The following code works:
XMLPlatformUtils::Initialize();
// create the DOM parser
XercesDOMParser *parser = new XercesDOMParser;
parser->setValidationScheme(XercesDOMParser::Val_Never);
parser->parse("test.xml");
// get the DOM representation
DOMDocument *doc = parser->getDocument();
// get the root element
DOMElement* root = doc->getDocumentElement();
// evaluate the xpath
DOMXPathResult* result=doc->evaluate(
XMLString::transcode("/abc/def"), // "/abc/def/text()[0]"
root,
NULL,
DOMXPathResult::ORDERED_NODE_SNAPSHOT_TYPE, //DOMXPathResult::ANY_UNORDERED_NODE_TYPE, //DOMXPathResult::STRING_TYPE,
NULL);
// look into the xpath evaluation result
result->snapshotItem(0);
std::cout<<StrX(result->getNodeValue()->getFirstChild()->getNodeValue())<<std::endl;
XMLPlatformUtils::Terminate();
return 0;
But I really hate that:
result->getNodeValue()->getFirstChild()->getNodeValue()
Does it have to be a node set instead of the exact node I want?
I tried other forms of the XPath, such as "/abc/def/text()[0]", and other result types, such as "DOMXPathResult::STRING_TYPE", but xerces always threw an exception.
What did I do wrong?
I don't code with Xerces C++, but it seems to implement W3C DOM Level 3, so I would suggest selecting an element node with a path like /abc/def and then simply calling result->getNodeValue()->getTextContent() to get the contents of the element (e.g. AAA BBB CCC).
As far as I understand the DOM APIs, if you want a string value then you need to use a path like string(/abc/def) and then result->getStringValue() should do (if the evaluate method requests any type or STRING_TYPE as the result type).
Alternatively, if you know you are only interested in the first node in document order, you could evaluate /abc/def with FIRST_ORDERED_NODE_TYPE and then access result->getNodeValue()->getTextContent().
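The idea of selecting the element and then reading its text content, instead of selecting a text node, can be sketched in Python. This is not Xerces code, only an analogy using the standard xml.etree.ElementTree: joining itertext() plays the role of getTextContent() or string(/abc/def). Note also that XPath positions are 1-based, so a step like text()[0] can never match anything.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<abc><def>AAA BBB CCC</def></abc>")

# root is <abc>, so the relative path "def" plays the role of /abc/def here.
node = root.find("def")

# Joining all descendant text mirrors the XPath string value of an element,
# which is also what DOM getTextContent() returns for an element node.
text = "".join(node.itertext())
print(text)
```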

Does an XPath query have a LIMIT option like MySQL?

I want to limit the number of results I receive from an XPath query.
For example:
$info = $xml->xpath("//*[firstname='Sheila'] LIMIT 0,100");
Note the LIMIT 0,100 at the end.
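You should be able to use "//*[firstname='Sheila' and position() <= 100]". Note that position() here counts an element among the children of its own parent, not across the whole result set. To cap the overall number of matches, wrap the path in parentheses: "(//*[firstname='Sheila'])[position() <= 100]".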
Edit:
Given the following XML:
<root>
<country.php desc="country.php" language="fr|pt|en|in" editable="Yes">
<en/>
<in>
<cityList desc="cityList" language="in" editable="Yes" type="Array" index="No">
<element0>Abu</element0>
<element1>Agartala</element1>
<element2>Agra</element2>
<element3>Ahmedabad</element3>
<element4> Ahmednagar</element4>
<element5>Aizwal</element5>
<element150>abcd</element150>
</cityList>
</in>
</country.php>
</root>
You can use the following XPath to get the first three cities:
//cityList/*[position()<=3]
Results:
Node element0 Abu
Node element1 Agartala
Node element2 Agra
If you want to limit this to nodes that start with element:
//cityList/*[substring(name(), 1, 7) = 'element' and position()<=3]
Note that this latter example works because every child node of cityList has a name starting with element, so position() limits the results as expected. If there were a mix of other node names under the cityList node, you'd get undesirable results.
For example, changing the XML as follows:
<root>
<country.php desc="country.php" language="fr|pt|en|in" editable="Yes">
<en/>
<in>
<cityList desc="cityList" language="in" editable="Yes" type="Array" index="No">
<element0>Abu</element0>
<dog>Agartala</dog>
<cat>Agra</cat>
<element3>Ahmedabad</element3>
<element4> Ahmednagar</element4>
<element5>Aizwal</element5>
<element150>abcd</element150>
</cityList>
</in>
</country.php>
</root>
and using the above XPath expression, we now get
Node element0 Abu
Note that we lose the second and third results. Both conditions in the single predicate are tested against every child of cityList, so position() counts all children, not only those whose name starts with 'element'. The effect is the same as requesting "give me the first three nodes, now out of those give me all the nodes that start with 'element'".
I ran into the same issue myself and had trouble with Geoff's answer because, as he clearly describes, it limits the number of elements returned before applying the other parts of the query.
My solution is to add position() <= 10 as an additional predicate after my other conditions have been applied, e.g.:
//ElementsIWant[./ChildElementToFilterOn='ValueToSearchFor'][position() <= 10]/.
Notice that I'm using two separate predicates.
This first selects the elements that satisfy my condition and then takes only the first 10 of those.
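The two orderings can be compared side by side in a small Python sketch. ElementTree does not implement position() comparisons, so the sketch emulates both XPath forms with ordinary list filtering and slicing instead of running the XPath itself. The cityList element is taken from the modified sample above, with its attributes and wrapper elements left out for brevity.

```python
import xml.etree.ElementTree as ET

xml_text = """<cityList>
<element0>Abu</element0>
<dog>Agartala</dog>
<cat>Agra</cat>
<element3>Ahmedabad</element3>
<element4> Ahmednagar</element4>
<element5>Aizwal</element5>
<element150>abcd</element150>
</cityList>"""

children = list(ET.fromstring(xml_text))

# One predicate: *[substring(name(), 1, 7) = 'element' and position() <= 3]
# position() counts every child, so only the first three are candidates.
single_predicate = [c.tag for c in children[:3] if c.tag.startswith("element")]

# Two predicates: *[substring(name(), 1, 7) = 'element'][position() <= 3]
# The name filter runs first, then the limit applies to what is left.
chained_predicates = [c.tag for c in children if c.tag.startswith("element")][:3]

print(single_predicate, chained_predicates)
```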

Parsing XML with REXML

I have this XML document and I want to find a specific GitHubCommiter using REXML. How do I do that?
<users>
<GitHubCommiter id="Nerian">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
<GitHubCommiter id="xmawet">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
<GitHubCommiter id="JulienChristophe">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
</users>
I have tried:
log = REXML::Document.new(file)
root = log.root
username = root.elements["GitHubCommiter['#{github_user_name}']"].elements['username'].text
password = root.elements["GitHubCommiter['#{github_user_name}']"].elements['password'].text
root.elements["GitHubCommiter['id'=>'#{github_user_name}']"].text
But I can't find a way to do it. Any ideas?
The docs say for elements (emphasis mine):
[]( index, name=nil)
Fetches a child element. Filters only Element children, regardless of the XPath match.
index: the search parameter. This is either an Integer, which will be used to find the index'th child Element, or an XPath, which will be used to search for the Element.
So it needs to be XPath:
root.elements["./GitHubCommiter[@id = '#{github_user_name}']"]
etc.
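The same attribute predicate can be tried outside Ruby. The following is a Python sketch using the standard xml.etree.ElementTree, not REXML. The helper name credentials is invented for illustration, and the id is interpolated straight into the path, which assumes it is trusted and contains no quote characters.

```python
import xml.etree.ElementTree as ET

xml_text = """<users>
<GitHubCommiter id="Nerian">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
<GitHubCommiter id="xmawet">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
<GitHubCommiter id="JulienChristophe">
<username>name</username>
<password>12345</password>
</GitHubCommiter>
</users>"""

root = ET.fromstring(xml_text)


def credentials(root, user_id):
    """Hypothetical helper: return (username, password) for the given id, or None."""
    # Assumes user_id is trusted and contains no quote characters.
    committer = root.find(f"./GitHubCommiter[@id='{user_id}']")
    if committer is None:
        return None
    return committer.findtext("username"), committer.findtext("password")


print(credentials(root, "xmawet"))
```

Looking up an id that does not exist returns None rather than raising, which avoids the chained .elements[...] calls failing on a missing node.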

Resources