contains() in XPath only checks the first match

Take this piece of XML as an example:
<list>
<element>
<title>Element 1</title>
<group>Group 1</group>
</element>
<element>
<title>Element 2</title>
<group>Group 1</group>
</element>
<element>
<title>Element 3</title>
<group>Group 1</group>
<group>Group 2</group>
</element>
<element>
<title>Element 4</title>
<group>Group 2</group>
<group>Group 3</group>
</element>
</list>
To get all groups I use the following xpath:
//group/text()
and it works fine (I remove duplicates later in Python using a set, as I don't know whether XPath can do that). But when I want to get the elements that contain "Group 3", I try the following XPath:
//element[contains(group/text(), "Group 3")]
and I get an empty result, whereas when I search for elements that contain "Group 1" with:
//element[contains(group/text(), "Group 1")]
I get the correct result with 3 elements. And if I look for "Group 2", I get a wrong result with only one element.
What am I not taking into account? How can I search by group?

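contains() only tests whether one string occurs in another string. group/text() is not a string; it is a node-set (a sequence in XPath 2.0). In XPath 1.0, a node-set passed where a string is expected is converted using only its first node in document order, so your expression tests just the first group of each element (XPath 2.0 raises a type error instead). That is why "Group 3" finds nothing and "Group 2" finds only Element 4.
If you need a substring test against every group, move contains() into a predicate on group: //element[group[contains(., "Group 3")]]
For an exact match, compare the node-set directly, which is true if any of its nodes equals the string. You can use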
//element[group/text() = "Group 1"]
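Since the question already post-processes results in Python, here is a minimal sketch of the same "any group matches" search using only the standard library. It assumes xml.etree.ElementTree is acceptable as a stand-in for a full XPath engine (it supports just a small XPath subset, and no contains()); titles_in_group is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XML = """
<list>
  <element><title>Element 1</title><group>Group 1</group></element>
  <element><title>Element 2</title><group>Group 1</group></element>
  <element><title>Element 3</title><group>Group 1</group><group>Group 2</group></element>
  <element><title>Element 4</title><group>Group 2</group><group>Group 3</group></element>
</list>
"""

root = ET.fromstring(XML)


def titles_in_group(root, group_name):
    """Titles of every <element> that has at least one matching <group>."""
    # [group='...'] is true if ANY <group> child has exactly this text,
    # which mirrors //element[group/text() = "..."].
    matches = root.findall(f".//element[group='{group_name}']")
    return [element.findtext("title") for element in matches]


# Distinct group names, de-duplicated with a set as in the question.
all_groups = sorted({group.text for group in root.iter("group")})

for name in all_groups:
    print(name, titles_in_group(root, name))
```

Every group of every element is considered here, so "Group 2" yields both Element 3 and Element 4 rather than only the element whose first group happens to match.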

Related

How can I compare map-like elements with XMLUnit

I have below xml data:
Control:
<Data>
<propertyValues>
<propertyName>Name1</propertyName>
<value>
<text>
<value>Value1</value>
</text>
</value>
</propertyValues>
<propertyValues>
<propertyName>Name2</propertyName>
<value>
<text>
<value>Value2</value>
</text>
</value>
</propertyValues>
</Data>
Test:
<Data>
<propertyValues>
<propertyName>Name2</propertyName>
<value>
<text>
<value>Value2</value>
</text>
</value>
</propertyValues>
<propertyValues>
<propertyName>Name1</propertyName>
<value>
<text>
<value>Value1</value>
</text>
</value>
</propertyValues>
</Data>
And I would expect these 2 documents to be treated as the same.
How can I configure XMLUnit to make this work? (I'm using XMLUnit 2.6.3.)
Thanks
Leon
This is pretty similar to the running example of the "Selecting Nodes" part of XMLUnit's User Guide.
You need an ElementSelector that, when looking at the list of propertyValues elements, pairs up the ones whose only child named propertyName contains the same nested text. This translates directly into
ElementSelectors.conditionalBuilder()
.whenElementIsNamed("propertyValues")
.thenUse(ElementSelectors.byXPath("./propertyName", ElementSelectors.byNameAndText))
...
and then you need to add whatever other rules are required to make the rest work. Judging by the visible part of your example, there are no other ambiguous children, so a simple
...
.elseUse(ElementSelectors.byName)
.build();
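will do. Pass the resulting selector to the comparison through a DefaultNodeMatcher (DiffBuilder's withNodeMatcher) so that it is actually used to pair elements, and call checkForSimilar(), because the matched elements still appear in a different order.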
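The matching idea behind that selector can be sketched outside XMLUnit. The following is only an illustration in Python's standard library, not XMLUnit's API: by_property_name and same_properties are hypothetical helpers that pair propertyValues elements by their propertyName text and then compare the nested value, ignoring document order.

```python
import xml.etree.ElementTree as ET

CONTROL = """<Data>
<propertyValues><propertyName>Name1</propertyName><value><text><value>Value1</value></text></value></propertyValues>
<propertyValues><propertyName>Name2</propertyName><value><text><value>Value2</value></text></value></propertyValues>
</Data>"""

TEST = """<Data>
<propertyValues><propertyName>Name2</propertyName><value><text><value>Value2</value></text></value></propertyValues>
<propertyValues><propertyName>Name1</propertyName><value><text><value>Value1</value></text></value></propertyValues>
</Data>"""


def by_property_name(data):
    """Map each propertyName text to its <propertyValues> element."""
    return {pv.findtext("propertyName"): pv for pv in data.findall("propertyValues")}


def same_properties(control_xml, test_xml):
    """True if both documents hold the same name/value pairs, in any order."""
    control = by_property_name(ET.fromstring(control_xml))
    test = by_property_name(ET.fromstring(test_xml))
    if control.keys() != test.keys():
        return False
    # Compare the nested value only between elements paired by name.
    return all(
        control[name].findtext("value/text/value") == test[name].findtext("value/text/value")
        for name in control
    )


print(same_properties(CONTROL, TEST))
```

Pairing by the key first and comparing afterwards is what keeps Name1 from being compared against Name2 just because they occupy the same position.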

Arithmetic operation in xpath on array of elements

I need to compute a sum over every element in my response using XPath; however, I don't know in advance how many elements the response will contain.
Sum = amount1*value1 + amount2*value2 + amount3*value3 + ...
<root>
<element>
<amount>10</amount>
<value>2</value>
</element>
<element>
<amount>20</amount>
<value>2</value>
</element>
<element>
<amount>30</amount>
<value>2</value>
</element>
</root>
Can someone please help?
You can try the XPath below to get the sum of all amount nodes:
sum(//element/amount)
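For the updated question (the sum of the products), the following works in XPath 2.0 or later; XPath 1.0 cannot multiply per element inside sum():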
sum(//element/sum(./amount * ./value))
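If only an XPath 1.0 engine is available, the products can be computed in the host language instead. This is a minimal sketch assuming Python's standard-library ElementTree and that every element carries numeric amount and value children, as in the sample.

```python
import xml.etree.ElementTree as ET

XML = """
<root>
  <element><amount>10</amount><value>2</value></element>
  <element><amount>20</amount><value>2</value></element>
  <element><amount>30</amount><value>2</value></element>
</root>
"""

root = ET.fromstring(XML)

# amount1*value1 + amount2*value2 + ... for however many elements exist.
total = sum(
    float(element.findtext("amount")) * float(element.findtext("value"))
    for element in root.iter("element")
)

print(total)  # 120.0
```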

xpath expression to select an item whose ENTIRE text contents contains a specific string

Can I have an XPath expression to select an item whose ENTIRE text contents (including its descendants') contains a specific string?
My xml is like this:
<item>
<sentence>Good morning. Today is fine.</sentence>
<entry>
<keywords>
<keyword>fooA</keyword>
<keyword>fooB</keyword>
</keywords>
</entry>
<entry>
<keywords>
<keyword>fooC</keyword>
<keyword>fooD</keyword>
</keywords>
</entry>
</item>
<item>
<sentence>It's raining, we'd better get a raincoat.</sentence>
<entry>
<keywords>
<keyword>barA</keyword>
<keyword>barB</keyword>
</keywords>
</entry>
<entry>
<keywords>
<keyword>barC</keyword>
<keyword>barD</keyword>
</keywords>
</entry>
</item>
What I want is this: if I look up any word or words in
"Good morning. Today is fine. fooA fooB fooC fooD"
such as "morning", "fooC", or "fine fooB", I get the first item,
and if I look up any word or words in
"It's raining, we'd better get a raincoat. barA barB barC barD"
such as "raining", "better", "better barD", or "get barB barA", I get the second item.
I'm trying to use contains() and concat(), but it seems I can't concatenate all descendants at different levels within the XPath expression.
contains(concat(sentence/text(),entry/keywords/keyword/text()),"barC")
does not work: concat() only converts the first text node of each argument to a string.
Thanks in advance!
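This seems to work just fine. Called without arguments, string() returns the string-value of the context node, which for an element is the concatenation of all its descendant text nodes: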
//item[contains(string(), 'barC')]
Demo here - http://www.xpathtester.com/xpath/da560511673f56c89386aecfc15a4a63
If you only want to search specific child nodes, try this
//item[(sentence|entry)[contains(string(), 'barC')]]
Demo - http://www.xpathtester.com/xpath/8236042dbcceea163665b015aa4da180
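The question also asks about multi-word lookups such as "fine fooB", which a single contains() call does not cover because the words are not adjacent in the string-value. A minimal Python sketch of that variant follows, assuming standard-library ElementTree, a hypothetical <items> wrapper element around the two items (the original snippet shows no root), and a hypothetical find_items helper. "".join(item.itertext()) plays the role of string() here.

```python
import xml.etree.ElementTree as ET

XML = """<items>
<item><sentence>Good morning. Today is fine.</sentence>
<entry><keywords><keyword>fooA</keyword><keyword>fooB</keyword></keywords></entry>
<entry><keywords><keyword>fooC</keyword><keyword>fooD</keyword></keywords></entry></item>
<item><sentence>It's raining, we'd better get a raincoat.</sentence>
<entry><keywords><keyword>barA</keyword><keyword>barB</keyword></keywords></entry>
<entry><keywords><keyword>barC</keyword><keyword>barD</keyword></keywords></entry></item>
</items>"""

root = ET.fromstring(XML)


def find_items(root, query):
    """Items whose full text (all descendants) contains every word of query."""
    words = query.split()
    found = []
    for item in root.iter("item"):
        # Equivalent of XPath string(): all descendant text, concatenated.
        full_text = "".join(item.itertext())
        if all(word in full_text for word in words):
            found.append(item)
    return found


for query in ("fine fooB", "get barB barA", "barC"):
    print(query, [item.findtext("sentence") for item in find_items(root, query)])
```

Note that this is a plain substring test per word, so "rain" would also match "raining" and "raincoat".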

Ruby + Nokogiri + Xpath navigate Node_Set

<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
I have created a Nokogiri NodeSet with this structure, i.e. a list of items with Links and Data children.
How can I filter out any items that don't match a certain value in the target attribute of <FirstLink>?
What I actually want in the end is to extract the <Data><String> content of every <Item> whose <FirstLink> target attribute matches a certain value.
I've tried several approaches already, but I'm at a loss as to how to identify an element by an attribute of its grandchild and then extract the content of that grandchild's parent's sibling.
We can build up an XPath expression to do this. Assuming we start from the whole XML document rather than the node-set you already have, something like
//Item
will select all <Item> elements (I'm guessing you already have something like that to get this node-set).
Next, to select only those <Item> elements which have <Links><FirstLink> where FirstLink has a target attribute value of one (attributes are addressed with @ in XPath):
//Item[Links/FirstLink[@target='one']]
and finally to select the Data/String children of those nodes:
//Item[Links/FirstLink[@target='one']]/Data/String
So with Nokogiri you could use something like this (where doc is your parsed document):
doc.xpath("//Item[Links/FirstLink[@target='one']]/Data/String")
or, if you want to use the node-set you already have, you can use a relative expression:
nodeset.xpath("self::Item[Links/FirstLink[@target='one']]/Data/String")
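The same filter-then-extract logic can be written as an explicit loop, which makes each step of the XPath visible. This is only a sketch in Python's standard library rather than Nokogiri; the <Items> wrapper element and the strings_for_target helper are hypothetical, and the String values are made distinct so the result shows which item matched.

```python
import xml.etree.ElementTree as ET

XML = """<Items>
<Item id="item0">
  <Links><FirstLink id="link1" target="one"/><SecondLink id="link2" target="two"/></Links>
  <Data><String>content1</String></Data>
</Item>
<Item id="item1">
  <Links><FirstLink id="link1" target="two"/><SecondLink id="link2" target="two"/></Links>
  <Data><String>content2</String></Data>
</Item>
</Items>"""

root = ET.fromstring(XML)


def strings_for_target(root, target):
    """Data/String text of every Item whose Links/FirstLink has this target."""
    results = []
    for item in root.iter("Item"):
        # Relative path: look only inside the current Item.
        first_link = item.find("Links/FirstLink")
        if first_link is not None and first_link.get("target") == target:
            results.append(item.findtext("Data/String"))
    return results


print(strings_for_target(root, "one"))
print(strings_for_target(root, "two"))
```

Only FirstLink is inspected, so item0 is not returned for "two" even though its SecondLink has that target.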
I didn't fully understand your goal, but here is my best guess at how to proceed:
require 'nokogiri'
doc = Nokogiri::XML <<-xml
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content1</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content2</String>
</Data>
</Item>
xml
#xpath with the expression "//Item" selects all the Item nodes. Those nodes are then passed to #select, which keeps only the Item nodes whose Links/FirstLink child has the wanted value in its target attribute.
node.at("Links/FirstLink")['target'] gives you the value of the target attribute of that Item's FirstLink node, for example "one". The paths are relative on purpose: an expression starting with // searches the whole document from the root, so every Item would be checked against the first FirstLink in the document.
Doing the comparison in Ruby rather than in XPath also gives you the flexibility to use a regular expression (for example =~ /one/ instead of == 'one').
# keep only the Item nodes whose FirstLink targets 'one'
nodeset = doc.xpath("//Item").select do |node|
node.at("Links/FirstLink")['target'] == 'one'
end
Now nodeset contains only the required Item nodes. #map then collects the content of each one: #at with the relative expression Data/String selects the String node, and #text returns its content.
nodeset.map { |n| n.at('Data/String').text } # => ["content1"]

counting elements in xml with Nokogiri

I'd like to understand why count gives me 5.
Since I'm at the root element and asking for its children, I expected it to give me 2.
doc = Nokogiri::XML(open('link..to....element.xml'))
root = doc.root.children.count
puts root
<element>
<name>Married with Children</name>
<name>Married with Children</name>
</element>
You get 5 as the result because there are five child nodes under the root <element> node. There are two <name> nodes and three text nodes that each consist of whitespace: one between the opening <element> and the first <name>, one between the two <name> elements, and one between the second <name> and the closing </element>:
doc.root.children.each do |c|
p c
end
output:
#<Nokogiri::XML::Text:0x80544a04 "\n ">
#<Nokogiri::XML::Element:0x80544900 name="name" children=[#<Nokogiri::XML::Text:0x8054470c "Married with Children">]>
#<Nokogiri::XML::Text:0x80544554 "\n ">
#<Nokogiri::XML::Element:0x80544478 name="name" children=[#<Nokogiri::XML::Text:0x80544284 "Married with Children">]>
#<Nokogiri::XML::Text:0x805440cc "\n">
If you use the noblanks option when parsing, Nokogiri won't include these whitespace nodes:
doc = Nokogiri::XML(open('link..to....element.xml')) { |c| c.noblanks }
Now doc.root.children.count will equal 2, because only the two <name> element nodes are included.
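Whitespace-only text nodes are a general DOM behaviour and not specific to Nokogiri. As a hedged illustration, the sketch below reproduces the same counts with Python's standard-library xml.dom.minidom, assuming the file is indented with a newline and two spaces as in the output above.

```python
from xml.dom import minidom

XML = """<element>
  <name>Married with Children</name>
  <name>Married with Children</name>
</element>"""

doc = minidom.parseString(XML)

# All child nodes, including the whitespace-only text nodes.
children = doc.documentElement.childNodes

# Only the element children, which is what the question expected to count.
element_children = [node for node in children if node.nodeType == node.ELEMENT_NODE]

print(len(children))          # 5
print(len(element_children))  # 2
```

Filtering by node type is the alternative to dropping blanks at parse time: the whitespace nodes stay in the tree but are skipped when counting.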
