How to select a node referenced by another node with XPath? - xpath

I would like to select the purchase node whose pgnr attribute equals the value of another pgnr attribute (one starting with "KEY"), with the "KEY" prefix removed and "c" appended.
Example:
<root>
<purchact hhid="xxx">
<purchase pgnr="41">
<purchvalues field_name="Number" field_value="1"/>
</purchase>
<purchase pgnr="KEY9802">
<purchvalues field_name="Number" field_value="2"/>
</purchase>
<purchase pgnr="9802c">
<purchvalues field_name="Number" field_value="3"/>
</purchase>
</purchact>
</root>
In this case, I am looking for the purchase node with the pgnr attribute "9802c", because the characters following "KEY" in the other purchase node's pgnr attribute are "9802".
I tried
root/purchact/purchase[@pgnr=concat(substring-after(@pgnr, "KEY"), "c")]
but it doesn't work.
Could anybody help? Thanks so much!

This XPath expression:
root/purchact/purchase[
    @pgnr[substring(., string-length()) = 'c']
][
    concat(
        'KEY',
        substring-before(
            @pgnr,
            'c'
        )
    ) = ../purchase/@pgnr
]
Meaning: a purchase element having a @pgnr attribute ending with 'c' and for which there is at least one @pgnr attribute belonging to a sibling purchase element that equals the concatenation of 'KEY' and the part of the given @pgnr before 'c'. Your attempt fails because, inside the predicate, @pgnr always refers to the purchase element being tested, so each attribute is compared with a value derived from itself.

root/purchact/purchase[
    @pgnr = concat(
        substring-after(
            ../purchase[
                contains(@pgnr, 'KEY')
            ]/@pgnr,
            'KEY'
        ),
        'c')
]
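The logic behind both expressions can also be spelled out in two explicit steps: read the "KEY" value first, then look up the node it points to. Below is a minimal sketch using Python's standard xml.etree.ElementTree, whose XPath subset has no concat() or substring-after(), so the string handling is done in Python. The function name find_referenced_purchase is made up for illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET

XML = """
<root>
  <purchact hhid="xxx">
    <purchase pgnr="41">
      <purchvalues field_name="Number" field_value="1"/>
    </purchase>
    <purchase pgnr="KEY9802">
      <purchvalues field_name="Number" field_value="2"/>
    </purchase>
    <purchase pgnr="9802c">
      <purchvalues field_name="Number" field_value="3"/>
    </purchase>
  </purchact>
</root>
"""


def find_referenced_purchase(root):
    """Return the purchase whose pgnr is '<text after KEY>' + 'c', or None."""
    purchases = root.findall("./purchact/purchase")
    for p in purchases:
        pgnr = p.get("pgnr", "")
        if pgnr.startswith("KEY"):
            # Strip the "KEY" prefix and append "c" to build the target value.
            wanted = pgnr[len("KEY"):] + "c"
            for candidate in purchases:
                if candidate.get("pgnr") == wanted:
                    return candidate
    return None


root = ET.fromstring(XML)
match = find_referenced_purchase(root)
print(match.get("pgnr") if match is not None else "no match")
```

Unlike the contains() version, this sketch only accepts values that start with "KEY", which is what the question describes.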

Related

How to get the first tag containing a word with xpath?

How can I get the first button which has type, class, id or ANYTHING containing text (have a substring equal to) close or Close or CLOSE? I tried this:
//button[contains(text(),'close')]
but it doesn't work.
Your predicate was testing whether any text() nodes contained "close". However, attributes are not text() nodes.
You can adjust your predicate to match on any attribute, then use a predicate on those attributes to test whether its name() is "type", "class" or "id" and whether its lower-cased value contains "close".
With XPath 2.0 you could use this:
//button[@*[ name() = ('type','class','id') and contains(lower-case(.), 'close') ]]
With XPath 1.0, it takes a little more work. You can translate the upper-case letters into lower-case letters:
//button[
    @*[name() = 'type' or name() = 'class' or name() = 'id']
    [contains(translate(., 'CLOSE', 'close'), 'close')]]
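For comparison, here is the same test written out procedurally: a rough sketch in Python's standard library that returns the first button whose type, class or id contains "close" in any letter case. The sample markup and the name first_close_button are invented for the example, and the markup is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

HTML = """
<div>
  <button id="open">Open</button>
  <button class="btn-Close">X</button>
  <button type="CLOSE">Y</button>
</div>
"""

ATTRS = ("type", "class", "id")


def first_close_button(root):
    """Return the first button (document order) with a matching attribute."""
    for button in root.iter("button"):
        # Lower-case each attribute value before the substring test,
        # which plays the role of lower-case() / translate() in XPath.
        if any("close" in button.get(name, "").lower() for name in ATTRS):
            return button
    return None


root = ET.fromstring(HTML)
hit = first_close_button(root)
print(hit.attrib if hit is not None else None)
```

In XPath itself, wrapping the whole expression in parentheses and appending [1] restricts the result to the first match in document order.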

Syntax error about XPath in Nokogiri, when combining namespace and node()

I'm learning XPath with Nokogiri. The XPath is like this:
xml_doc = Nokogiri::XML(open("test.xml"))
result = xml_doc.xpath("//x:foo", 'x' => 'www.example.com')
I could get the results. But when I perform this call:
result = xml_doc.xpath("//x:node()", 'x' => 'www.example.com')
I get an error:
Nokogiri::XML::XPath::SyntaxError: Invalid expression: //x:node()
Am I doing something wrong?
Unlike with elements, you don't need (and can't use) a namespace prefix when matching by node(). The following will return all nodes in any namespace just fine:
result = xml_doc.xpath("//node()")
There are several types of nodes in XPath: text nodes, comment nodes, element nodes, and so on. node() is a node test that simply returns true for any node type whatsoever. Compare it to text(), another node test, which returns true only for text nodes. (See "w3.org > XPath > Node Tests")
In my understanding, the notions of local name and namespace only exist in the context of element nodes, so using a namespace prefix along with the node() test simply doesn't make sense.
If you meant to select all elements in a specific namespace, use * instead of node():
result = xml_doc.xpath("//x:*", 'x' => 'www.example.com')
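To illustrate what "all elements in a specific namespace" means independently of Nokogiri, here is a small sketch using Python's standard library. The sample document is invented; only the namespace URI www.example.com comes from the question.

```python
import xml.etree.ElementTree as ET

XML = """
<root xmlns:x="www.example.com">
  <x:foo>a</x:foo>
  <bar>b</bar>
  <x:baz>c</x:baz>
</root>
"""

NS = "www.example.com"

root = ET.fromstring(XML)
# ElementTree stores qualified names as "{namespace}local", so a prefix
# test on the tag selects exactly the elements in that namespace.
in_ns = [el for el in root.iter() if el.tag.startswith("{" + NS + "}")]
local_names = [el.tag.split("}", 1)[1] for el in in_ns]
print(local_names)
```

Only elements carry a namespace in this filter, which mirrors why //x:* works while //x:node() is rejected.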

how to get attribute values using nokogiri

I have a webpage whose DOM structure I do not know, but I know the text I need to find in that page. To get its XPath, I do this:
require 'nokogiri'

doc = Nokogiri::HTML(webpage)
path = []  # XPaths of every text node whose content matches
doc.traverse do |node|
  path << node.path if node.text? && node.content == "my text"
end
puts path
Now suppose I get an output like:
html/body/div[4]/div[8]/div/div[38]/div/p/text()
so that later on, when I access this webpage again, I can do this:
doc.xpath("#{path[0]}")
instead of traversing the whole DOM tree every time I want the text.
I want to do some further processing. For that I need to know which of the element nodes in the above XPath output have attributes associated with them, and what their attribute values are. How would I achieve that? The output that I want is
#=> output desired
{ p => p_attr_value , div => div_attr_value , div[38] => div[38]_attr_value.....so on }
I am not having trouble finding the nodes where "my text" lies. I wanted the full XPath of the "my text" node, which is why I did the whole traversal. Now that I have the full XPath, I want the attributes of each element node I passed through on the way to the "my text" node.
Constraint: I can't use any of the developer tools available in a web browser.
PS: I am a newbie in Ruby and Nokogiri.
To select all attributes of an element that is selected using the XPath expression someExpr, you need to evaluate a new XPath expression:
someExpr/@*
where someExpr must be substituted with the real XPath expression used to select the particular element.
This selects all attributes of all elements (we assume there is just one) that are selected by the XPath expression someExpr.
For example, if the element we want is selected by:
/a/b/c
then all of its attributes are selected by:
/a/b/c/@*
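Since the question asks for the attributes of every element on the way down to the text, here is a rough sketch of that idea using Python's standard library rather than Nokogiri. The sample markup and the function name ancestor_attributes are invented for illustration. In XPath terms it corresponds roughly to collecting @* for each element matched by someExpr/ancestor::*.

```python
import xml.etree.ElementTree as ET

HTML = """
<html>
  <body class="page">
    <div id="main">
      <p style="bold">my text</p>
    </div>
  </body>
</html>
"""


def ancestor_attributes(node, target, trail=()):
    """Depth-first search for an element whose text equals target.

    Returns [(tag, attributes), ...] from the root down to that element,
    or None if nothing matches. Tail text is not examined in this sketch.
    """
    trail = trail + ((node.tag, dict(node.attrib)),)
    if (node.text or "").strip() == target:
        return list(trail)
    for child in node:
        found = ancestor_attributes(child, target, trail)
        if found is not None:
            return found
    return None


root = ET.fromstring(HTML)
result = ancestor_attributes(root, "my text")
for tag, attrs in result:
    print(tag, attrs)
```

Elements without attributes show up with an empty dictionary, so they can be filtered out afterwards if only attributed elements are wanted.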

Correlating a drop-down list

I have a drop-down list in which the values are:
store:[ ['1', 'Probation'], ['2', 'Confirmed'], ['3', 'Trainee'], ['4', 'Contract'] ],
I want to split the string using a split function to get:
[['1', 'Probation'],
['2', 'Confirmed'],
['3', 'Trainee'],
['4', 'Contract'] ],
Then, I can use regular expressions and pull the values 1, 2, 3, 4, or probation, confirmed, etc. and pass it to a request.
Can anybody help me with this? I want to know where exactly I can see the string after splitting, where I should call it, and how to use regular expressions for the split string.
You don't need to split the string to do a regular expression on it - you'd just add the Regex post processor as a child of your request.
If you do want to do the split, you'd want to add a post-processor BeanShell Assertion as a container for your code.
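As an illustration of the kind of pattern that pulls out both the numbers and the labels without splitting first, here is a small sketch in Python. The pattern is my own guess at what fits the sample string from the question, and whether it carries over unchanged to the post-processor's regex dialect is an assumption.

```python
import re

raw = "store:[ ['1', 'Probation'], ['2', 'Confirmed'], ['3', 'Trainee'], ['4', 'Contract'] ],"

# One inner pair looks like ['1', 'Probation']; capture both quoted parts.
PAIR = re.compile(r"\[\s*'([^']*)'\s*,\s*'([^']*)'\s*\]")

pairs = PAIR.findall(raw)
ids = [value for value, _ in pairs]
labels = [label for _, label in pairs]
print(pairs)
```

Each match yields two groups, so the value and its label stay paired and either one can be passed to the next request.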

LINQ to XML question

My requirement here is to retrieve the node that matches the hostname (for eg. machine1) and I always get back no results. Please let me know what the problem is?
Thanks for the help in advance!!!
XDocument configXML = XDocument.Load("the below xml");
var q = from s in configXML.Descendants("lcsetting")
where ((string)s.Element("host") == hostName)
select s;
The actual xml:
<lcsettings>
<lcsetting env="prod">
<hosts usagelogpath="">
<host>machine1</host>
<host>machine2</host>
<host>machine3</host>
</hosts>
</lcsetting>
<lcsetting env="qa">
<hosts usagelogpath="">
<host>machine4</host>
<host>machine5</host>
<host>machine6</host>
</hosts>
</lcsetting>
<lcsetting env="test">
<hosts usagelogpath="">
<host>machine7</host>
<host>machine8</host>
<host>machine9</host>
</hosts>
</lcsetting>
</lcsettings>
You're looking for a host element directly under an lcsetting - that doesn't occur because there's always a hosts element between the two in the hierarchy. You're also using Element instead of Elements, which means only the first element with the right name will be returned.
You could use Descendants again instead of Element... but you'll need to change the condition. Something like:
var q = from s in configXML.Descendants("lcsetting")
where s.Descendants("host").Any(host => host.Value == hostName)
select s;
Alternatively, you could make your query find host elements and then take the grandparent element in each case:
var q = from host in configXML.Descendants("host")
where host.Value == hostName
select host.Parent.Parent;
(This assumes a host element will only occur once per lcsetting; if that's not the case, you can add a call to Distinct.)
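The difference between "direct child" and "any descendant" is the whole issue here, and it is easy to see outside LINQ as well. The following is a minimal sketch in Python's standard library with a shortened copy of the XML; the function name settings_for_host is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<lcsettings>
  <lcsetting env="prod">
    <hosts usagelogpath="">
      <host>machine1</host>
      <host>machine2</host>
    </hosts>
  </lcsetting>
  <lcsetting env="qa">
    <hosts usagelogpath="">
      <host>machine4</host>
    </hosts>
  </lcsetting>
</lcsettings>"""


def settings_for_host(root, host_name):
    """Return every lcsetting that has a descendant host equal to host_name."""
    return [
        s for s in root.iter("lcsetting")
        # iter() walks all descendants, like Descendants("host") in LINQ.
        if any(h.text == host_name for h in s.iter("host"))
    ]


root = ET.fromstring(XML)
envs = [s.get("env") for s in settings_for_host(root, "machine1")]
# A direct-child lookup finds nothing, because host sits under hosts.
direct = root.find("lcsetting").find("host")
print(envs, direct)
```

The direct-child lookup returning nothing is the same reason the original Element("host") call never matched.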
"host" is not a child of "lcsetting".
You're selecting the descendants lcsetting but then attempting to check the element host which is two levels below it. The Element() function references only child elements 1 level deep. I'd recommend changing this to:
XDocument configXML = XDocument.Load("the below xml");
var q = from s in configXML.Descendants("lcsetting")
where s.Descendants("host").SingleOrDefault(e => e.Value == hostName) != null
select s;
That is because you have a <hosts> tag immediately below your lcsetting, and it contains your <host> tags. <host> is not an immediate child of <lcsetting>.
This query will work:
var q = from s in configXML.Descendants("lcsetting").SelectMany(lcSetting => lcSetting.Descendants("host"))
where s.Name == "host" && s.Value == hostName
select s;
