XPath: Get element with matching attribute value - xpath

I'm trying to get content from an element whose #id attribute matches the context node's #idref. For example, given the following xml (just a contrived sample)...
<doc>
<toc>
<entry idref="ch1"/>
<entry idref="ch2"/>
</toc>
<body>
<chapter id="ch1">
<title>Chapter 1</title>
<para/>
</chapter>
<chapter id="ch2">
<title>Chapter 2</title>
<para/>
</chapter>
<chapter id="ch3">
<title>Chapter 3</title>
<para/>
</chapter>
</body>
</doc>
From the [entry] element, how can I get the content of [title] within [chapter] whose #id matches the current #idref.
So, basically find chapter[where chapter #id = current entry #idref]/title
I've tried
string(//chapter[#id = #idref]/title)
string(//chapter[#id = ./#idref]/title)
string(//chapter[#id = current()/#idref]/title)
all with no luck.

Can you try this expression on your xml?
//chapter[#id=//toc/entry/#idref]/string-join((title,#id),' ')
Output:
Chapter 1 ch1
Chapter 2 ch2

Related

Ruby: Insert new XML element into existing XML file

How can I insert another XML element into an XML file I'm creating with Builder::XmlMarkup? e.g., something like
xml = Builder::XmlMarkup.new( :indent => 4 )
xml.content
xml.common do
xml.common_field1 do
// common_field1 content
end
xml.common_field2 do
// common_field 2 content
end
end
xml.custom do
xml.insert!(<XML element>)
end
end
Where <XML element> looks something like
<elements>
<element>
// element content
</element>
<element>
// element content
</element>
<elements>
and the final output looks like
<content>
<common>
<content1>
<!-- content1 -->
</content1>
<content2>
<!-- content2 -->
</content2>
</common>
<custom>
<elements>
<element>
<!-- element content -->
</element>
<element>
<!-- element content -->
</element>
</elements>
</custom>
</content>
I've tried using the << operator but that doesn't unfortunately doesn't maintain formatting.
<< is exactly what you need:
xml.custom do |custom|
custom << '<XML element>'
end
Rubydocs doesn't seem to work, so here's the link to the source code: https://github.com/jimweirich/builder/blob/master/lib/builder/xmlbase.rb#L104

How to Use multiple conditions in Xpath?

New to Xpath. Was trying in to use XML task in SSIS to load some values. Using Microsoft' XML inventory mentioned below.
How can I load first-name value in bookstore/books where style is novel and award = 'Pulitzer'?
//book[#style='novel' and ./author/award/text()='Pulitzer'] is what I am trying. It gives the whole element. Where should I modify to just get the first-name value?
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="myfile.xsl" ?>
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
<award>Trenton Literary Review Honorable Mention</award>
</author>
<price>12</price>
</book>
<book style="textbook">
<author>
<first-name>Mary</first-name>
<last-name>Bob</last-name>
<publication>Selected Short Stories of
<first-name>Mary</first-name>
<last-name>Bob</last-name>
</publication>
</author>
<editor>
<first-name>Britney</first-name>
<last-name>Bob</last-name>
</editor>
<price>55</price>
</book>
<magazine style="glossy" frequency="monthly">
<price>2.50</price>
<subscription price="24" per="year"/>
</magazine>
<book style="novel" id="myfave">
<author>
<first-name>Toni</first-name>
<last-name>Bob</last-name>
<degree from="Trenton U">B.A.</degree>
<degree from="Harvard">Ph.D.</degree>
<award>P</award>
<publication>Still in Trenton</publication>
<publication>Trenton Forever</publication>
</author>
<price intl="Canada" exchange="0.7">6.50</price>
<excerpt>
<p>It was a dark and stormy night.</p>
<p>But then all nights in Trenton seem dark and
stormy to someone who has gone through what
<emph>I</emph> have.</p>
<definition-list>
<term>Trenton</term>
<definition>misery</definition>
</definition-list>
</excerpt>
</book>
<my:book xmlns:my="uri:mynamespace" style="leather" price="29.50">
<my:title>Who's Who in Trenton</my:title>
<my:author>Robert Bob</my:author>
</my:book>
</bookstore>
I got an answer.
//book[#style='novel' and ./author/award/text()='Pulitzer']//first-name
Use:
/*/book[#style='novel']/author[award = 'Pulitzer']/first-name
This selects any first-name element whose author parent has a award child with string value of 'Pulitzer' and whose (of the author) parent is a book whose style attribute has value "novel" and whose parent is the top element of the XML document.
A similar question in the same context. How can I do the vice-versa ? Let's suppose I want to find the id of all those books whose price is greater than 20 ? I know I am being a nudge, but really want to clear my understanding.
Here is the needed XPATH :
//book/price[text() > 20]/..

xpath selection of a title node within a resultset

I've an xml doc like below. I was trying to select a title node with a particular value in it say "![CDATA[ 1234 ]]". That Title node may be in any Type node. I was using this xpath query
/Results/ResultSet/Type[Title="![CDATA[ 1234 ]]"]
but didnt get anything selected. can someone pls help.
<Results>
<Info>...</Info>
<ResultSet num="4">
<Type type="A">
<Title>
<![CDATA[ 1234 ]]>
</Title>
<Description>
<![CDATA[ 1234 ]]>
</Description>
<Domain>
<![CDATA[1234 ]]>
</Domain>
<Target>
<![CDATA[]]>
</Target>
</Type>
<Type type="A">
<Title>
<![CDATA[ abcdef ]]>
</Title>
<Description>
<![CDATA[abcdef]]>
</Description>
<Domain>
<![CDATA[abcdef]]>
</Domain>
<Target>
<![CDATA[abcdef]]>
</Target>
</Type>
EDIT: included the ruby code that I am using
doc = Nokogiri::HTML(html)
Element = doc.xpath('/Results/ResultSet/Type/Title[text()=" 1234 "]')
if Element.empty?()
puts "not there "
else
Element.each do |node|
puts "Found Title: #{node.text}"
end
end
end
The XPath is wrong:
Use this:
/Results/ResultSet/Type/Title[text()=" 1234 "]
Based on the link OP posted for the XML, here is the working XPath:
/QuigoResults/ResultSet/Listing/Title[text()=" location in DYNAMICREGION "]

XPath in Nokogiri returning empty array [] whereas I am expecting to have results

I am trying to parse XML files using Nokogiri, Ruby and XPath. I usually don't encounter any problem but with the following I can't make any xpath request:
doc = Nokogiri::HTML(open("myfile.xml"))
doc.("//Meta").count
# result ==> 0
doc.xpath("//Meta")
# result ==> []
doc.xpath(.).count
# result => 1
Here is an simplified version of my XML File
<Answer xmlns="test:com.test.search" context="hf%3D10%26target%3Dst0" last="0" estimated="false" nmatches="1" nslices="0" nhits="1" start="0">
<time>
...
</time>
<promoted>
...
</promoted>
<hits>
<Hit url="http://www.test.com/" source="test" collapsed="false" preferred="false" score="1254772" sort="0" mask="272" contentFp="4294967295" did="1287" slice="1">
<groups>
...
</groups>
<metas>
<Meta name="enligne">
<MetaString name="value">
</MetaString>
</Meta>
<Meta name="language">
<MetaString name="value">
fr
</MetaString>
</Meta>
<Meta name="text">
<MetaText name="value">
<TextSeg highlighted="false" highlightClass="0">
La
</TextSeg>
</MetaText>
</Meta>
</metas>
</Hit>
</hits>
<keywords>
...
</keywords>
<groups>
...
</groups>
How can I get all children of <Hit> from this XML?
Include the namespace information when calling xpath:
doc.xpath("//x:Meta", "x" => "test:com.test.search")
You can use the remove_namespaces! method and save your day.
This is one of the most FAQ XPAth questions -- search for "XPath default namespace".
If there is no way to register a namespace for the default namespace and use the registered prefix (say "x" in //x:Meta) then use:
//*[name() = 'Meta` and namespace-uri()='test:com.test.search']
If it is known that Meta can only belong to the default namespace, then the above can be shortened to:
//*[name() = 'Meta`]

How to reference an XML attribute using XPath?

My XML:
<root>
<cars>
<makes>
<honda year="1995">
<model />
<!-- ... -->
</honda>
<honda year="2000">
<!-- ... -->
</honda>
</makes>
</cars>
</root>
I need a XPath that will get me all models for <honda> with year 1995.
so:
/root/cars/makes/honda
But how to reference an attribute?
"I need a XPath that will get me all models for <honda> with year 1995."
That would be:
/root/cars/makes/honda[#year = '1995']/model
Try /root/cars/makes/honda/#year
UPDATE: reading your question again:
/root/cars/makes/honda[#year = '1995']
Bottom line is: use # character to reference xml attributes.

Resources