I'm new to xpath and I understand how to get a range of values in xpath:
/bookstore/book[position()>=2 and position()<=10]
but in my case, I need to start at position 2 and stop one short of the total (so if there are 10, I need up to the 9th; if there are 5, up to the 4th). I'm applying my code to different pages, and the number of entries is not always the same.
In python, I could do something like book[2:-2], but I'm unsure if I can do this within xpath.
You can use last() which represents the last item in the context:
/bookstore/book[position()>=2 and position() <= (last() - 1)]
In my case, this worked to get the last-but-one element:
/bookstore/book[position() = (last() - 1)]
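To see both expressions in action, here is a minimal sketch using the JDK's built-in XPath 1.0 engine (javax.xml.xpath). The five-book sample document and the names PositionRangeDemo and select are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PositionRangeDemo {
    // Hypothetical sample document with five <book> entries.
    static final String XML =
        "<bookstore><book>b1</book><book>b2</book><book>b3</book>"
        + "<book>b4</book><book>b5</book></bookstore>";

    // Evaluates an XPath 1.0 expression and returns the text of each matched node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Second entry up to the last-but-one entry.
        System.out.println(select(XML,
            "/bookstore/book[position() >= 2 and position() <= (last() - 1)]"));
        // Only the last-but-one entry.
        System.out.println(select(XML, "/bookstore/book[position() = (last() - 1)]"));
    }
}
```

With five books, the first expression should select b2 through b4 and the second only b4, regardless of how many entries a given page has.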
Given the following xml:
<randomName>
<otherName>
<a>item1</a>
<a>item2</a>
<a>item3</a>
</otherName>
<lastName>
<a>item4</a>
<a>item5</a>
</lastName>
</randomName>
Running '//a' gives me an array of all 5 "a" elements. However, '//a[1]' does not give me the first of those five elements (item1); it instead gives me an array containing item1 and item4.
I believe this is because they are both position 1 relatively. How can I grab any a element by its overall index?
I would like to be able to use a variable "x" to get itemX.
You can wrap the path in parentheses so that the index applies to the entire result set:
(//a)[1]
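The difference comes from what the predicate binds to: in //a[1] it applies to each parent's children separately, while in (//a)[1] it applies to the combined node-set. A rough sketch with the JDK's XPath 1.0 engine follows; the class name GlobalIndexDemo and the helpers select and nthA are hypothetical, and nthA covers the "variable x" case by building the expression as a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GlobalIndexDemo {
    // The XML from the question, written on one line.
    static final String XML =
        "<randomName><otherName><a>item1</a><a>item2</a><a>item3</a></otherName>"
        + "<lastName><a>item4</a><a>item5</a></lastName></randomName>";

    // Evaluates an XPath 1.0 expression and returns the text of each matched node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Builds the expression for the x-th <a> in document order (1-based).
    static String nthA(int x) {
        return "(//a)[" + x + "]";
    }

    public static void main(String[] args) {
        System.out.println(select(XML, "//a[1]"));   // first <a> under each parent
        System.out.println(select(XML, "(//a)[1]")); // first <a> overall
        System.out.println(select(XML, nthA(4)));    // fourth <a> overall
    }
}
```

Since x is an int here, concatenating it into the expression is safe; for string values, an XPath variable resolver would be the more careful route.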
Is it possible to select nodes in a similar way?
'./tr[position() in (1, 3, 7)]'
I found only this solution:
'./tr[position() = 1 or position() = 3 or position() = 7]'
In XPath 2.0 you would simply do:
./tr[position() = (1,3,7)]
In XPath 1.0 the usual way to do it is the solution you already found, an alternative that is a bit shorter would be something like:
./tr[contains('1 3 7', position())]
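The spaces in the string are essential here; otherwise you'd also get nodes 13, 37 and 137. Note that this only works while every wanted position is a single digit: with a list such as '1 13', position 3 would also match as a substring of '13'. Padding both sides avoids that: contains(' 1 3 7 ', concat(' ', position(), ' ')).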
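Here is a small sketch of the XPath 1.0 string trick, run against a made-up table of 13 rows with the JDK's XPath engine. The names PositionListDemo and select and the sample document are assumptions for illustration. It also compares the plain contains() form with a space-padded variant once the list contains a multi-digit position.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PositionListDemo {
    // Hypothetical table with 13 rows whose text is r1 .. r13.
    static final String XML = buildXml(13);

    static String buildXml(int rows) {
        StringBuilder sb = new StringBuilder("<t>");
        for (int i = 1; i <= rows; i++) {
            sb.append("<tr>r").append(i).append("</tr>");
        }
        return sb.append("</t>").toString();
    }

    // Evaluates an XPath 1.0 expression and returns the text of each matched node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Single-digit positions: the plain substring test is enough.
        System.out.println(select(XML, "/t/tr[contains('1 3 7', position())]"));
        // Multi-digit list: "3" is a substring of "13", so row 3 sneaks in.
        System.out.println(select(XML, "/t/tr[contains('1 13', position())]"));
        // Padding the list and the position with spaces forces whole-token matches.
        System.out.println(select(XML,
            "/t/tr[contains(' 1 13 ', concat(' ', position(), ' '))]"));
    }
}
```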
In this query, I select the 3rd <td>:
//tablecontainer/table/tbody/tr/td[3]
How do I select both the 3rd and 4th <td>s?
To get both the 3rd and 4th tds, you can use the expression:
//tablecontainer/table/tbody/tr/td[position() >= 3 and position() <= 4]
//tablecontainer/table/tbody/tr/td[position()=3 or position()=4]
If you can use XPath 2.0, you could use the following trick:
//tablecontainer/table/tbody/tr/td[position() = (1,2,4)]
The test position() = (1,2,4) works much like IN in SQL. Note the parentheses around the (1,2,4) sequence.
Using the count(preceding-sibling::*) XPath expression, one can obtain incrementing counters. However, can the same also be accomplished in a two-level-deep sequence?
example XML instance
<grandfather>
<father>
<child>a</child>
</father>
<father>
<child>b</child>
<child>c</child>
</father>
</grandfather>
code (with Saxon HE 9.4 jar on the CLASSPATH for XPath 2.0 features)
Trying to get a counter sequence of 1, 2 and 3 for the three child nodes with different kinds of XPath expressions:
XPathExpression expr = xpath.compile("/grandfather/father/child");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0 ; i < nodes.getLength() ; i++) {
    Node node = nodes.item(i);
    System.out.printf("child's index is: %s %s %s, name is: %s\n"
                     ,xpath.compile("count(preceding-sibling::*)").evaluate(node)
                     ,xpath.compile("count(preceding-sibling::child)").evaluate(node)
                     ,xpath.compile("//child/position()").evaluate(doc)
                     ,xpath.compile(".").evaluate(node));
}
The above code prints:
child's index is: 0 0 1, name is: a
child's index is: 0 0 1, name is: b
child's index is: 1 1 1, name is: c
None of the three XPaths I tried managed to produce the correct sequence: 1, 2, 3. Clearly it can trivially be done using the i loop variable, but I want to accomplish it with XPath if possible. I also need to keep the basic framework of evaluating an XPath expression to get all the nodes to visit and then iterating over that set, since that's how the real application I work on is structured. Basically, I visit each node and then need to evaluate a number of XPath expressions on it (node) or on the document (doc); one of these XPath expressions is supposed to produce this incrementing sequence.
Use the preceding axis with a name test instead. The count is zero-based (0, 1, 2), so add 1 to get 1, 2, 3:
count(preceding::child) + 1
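Plugged into the loop structure from the question, this could look roughly like the sketch below. It uses only the JDK's XPath 1.0 engine (no Saxon needed); because count(preceding::child) counts from zero, the sketch adds 1 to get the 1, 2, 3 sequence. The names PrecedingCountDemo and indexChildren are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrecedingCountDemo {
    // The XML instance from the question, written on one line.
    static final String XML =
        "<grandfather><father><child>a</child></father>"
        + "<father><child>b</child><child>c</child></father></grandfather>";

    // Returns one "index:name" entry per <child>, with the index computed by XPath.
    static List<String> indexChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/grandfather/father/child", doc, XPathConstants.NODESET);
            // Counts every earlier <child> in document order, across all <father>s.
            XPathExpression indexExpr = xpath.compile("count(preceding::child) + 1");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                Double index = (Double) indexExpr.evaluate(node, XPathConstants.NUMBER);
                result.add(index.intValue() + ":" + node.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexChildren(XML));
    }
}
```

Each evaluation walks the preceding axis again, so the total work grows quadratically with the number of child elements.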
Using XPath 2.0, there is a much better way to do this. Fetch all <child/> nodes and use the position() function to get the index:
//child/concat("child's index is: ", position(), ", name is: ", text())
You don't say efficiency is important, but I really hate to see this done with O(n^2) code! Jens' solution shows how to do that if you can use the result in the form of a sequence of (position, name) pairs. You could also return an alternating sequence of strings and numbers using //child/(string(.), position()): though you would then want to use the s9api API rather than JAXP, because JAXP can only really handle the data types that arise in XPath 1.0.
If you need to compute the index of each node as part of other processing, it might still be worth computing the index for every node in a single initial pass, and then looking it up in a table. But if you're doing that, the simplest way is surely to iterate over the result of //child and build a map from nodes to the sequence number in the iteration.
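A rough sketch of that single-pass table, using plain JAXP: iterate over //child once, store each node's sequence number in a map, and look it up later. The names NodeIndexMapDemo, buildIndex and lookupSecondFather are hypothetical, and the sketch assumes the DOM implementation hands back the same Node objects on later queries (which is why an IdentityHashMap is used).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeIndexMapDemo {
    // The XML instance from the question, written on one line.
    static final String XML =
        "<grandfather><father><child>a</child></father>"
        + "<father><child>b</child><child>c</child></father></grandfather>";

    // Single pass over //child: maps each node to its 1-based sequence number.
    static Map<Node, Integer> buildIndex(XPath xpath, Document doc) throws Exception {
        NodeList all = (NodeList) xpath.evaluate("//child", doc, XPathConstants.NODESET);
        Map<Node, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < all.getLength(); i++) {
            index.put(all.item(i), i + 1);
        }
        return index;
    }

    // Later processing: visit only the second <father>'s children and look up their numbers.
    static List<String> lookupSecondFather(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<Node, Integer> index = buildIndex(xpath, doc);
            NodeList nodes = (NodeList) xpath.evaluate(
                "/grandfather/father[2]/child", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                result.add(index.get(node) + ":" + node.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupSecondFather(XML));
    }
}
```

The map is built in one linear pass, so later lookups no longer need to re-walk the document for every node.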
In Ruby we can access an array with negative numbers like array[-1] to get the last object in the array. How do I do this using XPath?
I can't do this:
result = node.xpath('.//ROOT/TAG[-1]/KEY_NAME')
I found a solution here on Stack Overflow, but that query just changes the upper limit of the range, so it could return the last item alone or the last item together with the previous one.
What if I want to get only the previous element, like array[-2] in Ruby?
You can access the last element in XPath using last() in a predicate.
node.xpath('.//ROOT/TAG[last()]/KEY_NAME')
And use [last()-1] for the second-to-last position.