Using LINQ to XML, how do I get a collection of all elements that have a named child element?
For example:
<root>
<Garage>
<Car id="001">
<Price PaymentType="Cash">$100</Price>
</Car>
<Car id="002">
<Price PaymentType="Cash">$200</Price>
</Car>
<Car id="003">
</Car>
</Garage>
</root>
This should return 2 Car elements (#1 and #2), as they have the Price element. It shouldn't return Car #3, as it doesn't have a Price element.
thanks as always
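Assuming you have an XDocument object named doc with your example XML loaded into it, you could try something like this. Passing the name to Elements restricts the check to Price children; without it, a Car with any child element at all would match.
IEnumerable<XElement> elements = doc.Descendants("Garage").Elements("Car").Where(e => e.Elements("Price").Any());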
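For comparison outside .NET, the filter the question asks for (keep only elements that have a given named child) can be sketched with Python's standard-library xml.etree.ElementTree. This is only an illustration that assumes the sample XML from the question; the names xml_text, cars_with_price and ids are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (whitespace condensed).
xml_text = """
<root>
  <Garage>
    <Car id="001"><Price PaymentType="Cash">$100</Price></Car>
    <Car id="002"><Price PaymentType="Cash">$200</Price></Car>
    <Car id="003"></Car>
  </Garage>
</root>
"""

root = ET.fromstring(xml_text)

# The predicate [Price] keeps only Car elements with at least one Price child.
cars_with_price = root.findall("./Garage/Car[Price]")

ids = [car.get("id") for car in cars_with_price]
print(ids)  # ['001', '002']
```

Car 003 is skipped because it has no Price child, which mirrors what the LINQ query is meant to do.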
Related
I am trying to capture a value using XPath based on value of a different field.
Example XML:
<?xml version="1.0" encoding="UTF-8" ?>
<employees>
<employee>
<id>1</id>
<firstName>Tom</firstName>
<lastName>Cruise</lastName>
<photo>https://jsonformatter.org/img/tom-cruise.jpg</photo>
</employee>
<employee>
<id>2</id>
<firstName>Maria</firstName>
<lastName>Sharapova</lastName>
<photo>https://jsonformatter.org/img/Maria-Sharapova.jpg</photo>
</employee>
<employee>
<id>3</id>
<firstName>Robert</firstName>
<lastName>Downey Jr.</lastName>
<photo>https://jsonformatter.org/img/Robert-Downey-Jr.jpg</photo>
</employee>
</employees>
I am trying to write an XPath expression that returns the value of the firstName field when the id value is 3.
You can locate the parent node based on the known child node and then select the desired child node of that parent, as follows:
//employee[./id='3']/firstName
The expression above gives you the desired firstName node itself.
To retrieve its text value, you can use:
//employee[./id='3']/firstName/text()
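The same "select the parent by one child, then read another child" lookup can be tried out with Python's standard library. This is a sketch, not the only way to evaluate the expression: ElementTree supports only a subset of XPath (there is no text() step, so findtext is used instead), and the employee data is abbreviated from the question.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the question's data (photo elements omitted).
xml_text = """<employees>
  <employee><id>1</id><firstName>Tom</firstName><lastName>Cruise</lastName></employee>
  <employee><id>2</id><firstName>Maria</firstName><lastName>Sharapova</lastName></employee>
  <employee><id>3</id><firstName>Robert</firstName><lastName>Downey Jr.</lastName></employee>
</employees>"""

root = ET.fromstring(xml_text)

# [id='3'] selects the employee whose id child has the text "3";
# findtext then returns the text of that employee's firstName child.
first_name = root.findtext("./employee[id='3']/firstName")
print(first_name)  # Robert
```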
I have this XML:
<?xml version="1.0"?>
<catalog>
<car>
<id>0</id>
<color>green</color>
<color>red</color>
<color>yellow</color>
<vip>
<user>Trump</user>
<user>Obama</user>
<user>Merkel</user>
</vip>
</car>
<car>
<id>1</id>
<color>green</color>
<color>red</color>
<color>yellow</color>
<vip>
<user>Putinski</user>
<user>Orlovski</user>
<user>Idiotski</user>
</vip>
</car>
<car>
<id>2</id>
<color>green</color>
<color>red</color>
<color>yellow</color>
<vip>
<user>Clooney</user>
<user>Lopez</user>
<user>Ford</user>
</vip>
</car>
</catalog>
And I am fighting with some simple things:
a) count the "color" nodes from car id 0
b) retrieve Obama's car id
For a) I know how to identify car id 0
/catalog/car/id=0
gives me TRUE, so this is proof that I am on the right track. But how can I go on to count the "color" nodes of car id 0? The solution posted here does not work, and using following-sibling results in a javax.xml.transformerException. Does anybody know how to solve this?
To count the color nodes in car with id = 0 you can use
count(/catalog/car[id="0"]/color)
Returns 3
To get Obama's car id:
/catalog/car[.//user="Obama"]/id/text()
Returns 0
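Both lookups can be checked with a short Python sketch using only the standard library. ElementTree has no count() function and does not accept nested paths inside a predicate, so the count is taken with len() and the Obama search is written as a plain loop; the data below is a shortened copy of the question's catalog and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's catalog (first two cars only).
xml_text = """<catalog>
  <car><id>0</id><color>green</color><color>red</color><color>yellow</color>
    <vip><user>Trump</user><user>Obama</user><user>Merkel</user></vip></car>
  <car><id>1</id><color>green</color><color>red</color><color>yellow</color>
    <vip><user>Putinski</user><user>Orlovski</user><user>Idiotski</user></vip></car>
</catalog>"""

root = ET.fromstring(xml_text)

# a) count the color children of the car whose id is 0
color_count = len(root.findall("./car[id='0']/color"))

# b) id of the first car that has a user named Obama anywhere below it
obama_car_id = next(
    car.findtext("id")
    for car in root.findall("car")
    if any(user.text == "Obama" for user in car.iter("user"))
)

print(color_count, obama_car_id)  # 3 0
```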
I have an XML file like this:
<carSchema xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="carSchema.xsd">
<car>
<License>23</License>
<year>2010</year>
<model>a</model>
<manufacturer>hyundai</manufacturer>
</car>
<car>
<License>24</License>
<year>2002</year>
<model>b</model>
<manufacturer>hyundai</manufacturer>
</car>
<car>
<License>25</License>
<year>2005</year>
<model>c</model>
<manufacturer>hyundai</manufacturer>
</car>
<car>
<License>26</License>
<year>2004</year>
<model>d</model>
</car>
<car>
<License>27</License>
<year>2016</year>
<model>f</model>
<manufacturer>hyundai</manufacturer>
</car>
I want to find information about the newest car using XQuery. I wrote this query, which returns the year of the newest car:
xquery version "1.0";
max(
for $x in doc("car.xml")/carSchema/car
order by $x/year descending
return $x/year)
How do I return all the information about that car (License, model, manufacturer)?
You can use
(for $car in doc("car.xml")/carSchema/car
order by $car/year descending
return $car)[1]/*
to find all child elements of the element with the latest year or
(for $car in doc("car.xml")/carSchema/car
order by $car/year descending
return $car)[1]
to find the element itself with the latest year.
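The idea of keeping the whole car element rather than just its year can also be sketched in Python with the standard library. This is an illustration on a shortened copy of the data, with the closing carSchema tag added so it parses; newest and info are made-up names.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's data; namespace attributes omitted.
xml_text = """<carSchema>
  <car><License>23</License><year>2010</year><model>a</model><manufacturer>hyundai</manufacturer></car>
  <car><License>26</License><year>2004</year><model>d</model></car>
  <car><License>27</License><year>2016</year><model>f</model><manufacturer>hyundai</manufacturer></car>
</carSchema>"""

root = ET.fromstring(xml_text)

# Pick the car element with the largest year, comparing years as integers.
newest = max(root.findall("car"), key=lambda car: int(car.findtext("year")))

# Collect every child of that car as tag -> text.
info = {child.tag: child.text for child in newest}
print(info)
```

Converting the year to an integer avoids depending on string ordering, which only happens to work while every year has four digits.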
I'm trying to store in an array all the unique XPaths of the lowest-level elements in the XML below, but the way I'm doing it, array a ends up holding the whole XML rather than just the XPaths. The elements sit at different depths: some have only 2 ancestors and others have more.
This is the code I have:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<items>
<item>
<name>Cake</name>
<ppu>0.55</ppu>
<batters>
<batter>Regular</batter>
<batter>Chocolate</batter>
<batter>Blueberry</batter>
<batter>Devil's Food</batter>
</batters>
<topping>None</topping>
<topping>Glazed</topping>
<topping>Sugar</topping>
<topping>Powdered Sugar</topping>
<topping>Chocolate with Sprinkles</topping>
<topping>Chocolate</topping>
<topping>Maple</topping>
</item>
<item>
<name>Raised</name>
<ppu>0.55</ppu>
<batters>
<batter>Regular</batter>
</batters>
<topping>None</topping>
<topping>Glazed</topping>
<topping>Sugar</topping>
<topping>Chocolate</topping>
<topping>Maple</topping>
</item>
</items>
EOT
a = []
a = doc.xpath("//*")
puts a
I'd like to store in array "a" only the unique xpaths as below:
/items/item/name
/items/item/ppu
/items/item/batters/batter
/items/item/topping
Maybe somebody could help me in how to do this.
Thanks for the help.
What you want to select is the "leaf" nodes. You can do it like so:
doc.xpath("//*[not(*)]")
This means "select all elements that don't contain elements".
If you want the XPaths, you'll need to call .path on each node. But the paths provided by Nokogiri have explicit positions (e.g. /items/item[2]/topping[4]), so you'll have to apply a regex to remove them, then remove duplicates with uniq:
doc.xpath("//*[not(*)]").map {|leaf| leaf.path.gsub(/\[.*?\]/, '') }.uniq
Output:
/items/item/name
/items/item/ppu
/items/item/batters/batter
/items/item/topping
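The leaf-path idea does not depend on Nokogiri. As a rough sketch in Python's standard library, the paths can be built while walking the tree, so there are no positions to strip afterwards. The XML is a shortened version of the question's data and leaf_paths is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's data.
xml_text = """<items>
  <item>
    <name>Cake</name><ppu>0.55</ppu>
    <batters><batter>Regular</batter><batter>Chocolate</batter></batters>
    <topping>None</topping><topping>Glazed</topping>
  </item>
  <item>
    <name>Raised</name><ppu>0.55</ppu>
    <batters><batter>Regular</batter></batters>
    <topping>None</topping>
  </item>
</items>"""


def leaf_paths(element, prefix=""):
    """Yield the position-free path of every element with no child elements."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path
    for child in children:
        yield from leaf_paths(child, path)


root = ET.fromstring(xml_text)

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_paths = list(dict.fromkeys(leaf_paths(root)))
print("\n".join(unique_paths))
```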
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
I have created a Nokogiri-NodeSet with this structure, i.e. a list of items with links and data children.
How can I filter out any items that don't match a certain value in the 'target' attribute of <FirstLink>?
What I actually want in the end is to extract the <Data><String> content of every <Item> that matches a certain value in its <FirstLink> 'target' attribute.
I've tried several approaches already, but I'm at a loss as to how to identify an element by an attribute of its grandchild and then extract the content of that grandchild's parent's sibling.
We can build up an XPath expression to do this. Assuming we are starting from the whole XML document, rather than the node-set you already have, something like
//Item
will select all <Item> elements (I’m guessing you already have something like that to get this node-set).
Next, to select only those <Item> elements which have <Links><FirstLink> where FirstLink has a target attribute value of one:
//Item[Links/FirstLink[@target='one']]
and finally to select the Data/String children of those nodes:
//Item[Links/FirstLink[@target='one']]/Data/String
So with Nokogiri you could use something like this (where doc is your parsed document):
doc.xpath("//Item[Links/FirstLink[@target='one']]/Data/String")
or, if you want to use the node-set you already have, you can use a relative expression:
nodeset.xpath("self::Item[Links/FirstLink[@target='one']]/Data/String")
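The same pattern (filter each Item by an attribute of its grandchild, then read a sibling branch) can be sketched with Python's standard library. Several things here are assumptions made for the illustration: the two Item elements are wrapped in an Items root so the text is a well-formed document, and the String contents are changed to content1 and content2 so the result is distinguishable.

```python
import xml.etree.ElementTree as ET

# <Items> wrapper and content1/content2 are assumptions for this example.
xml_text = """<Items>
  <Item id="item0">
    <Links><FirstLink id="link1" target="one"/><SecondLink id="link2" target="two"/></Links>
    <Data><String>content1</String></Data>
  </Item>
  <Item id="item1">
    <Links><FirstLink id="link1" target="two"/><SecondLink id="link2" target="two"/></Links>
    <Data><String>content2</String></Data>
  </Item>
</Items>"""

root = ET.fromstring(xml_text)

# Keep an Item only if its Links/FirstLink has target="one",
# then read the text of that Item's Data/String.
contents = [
    item.findtext("Data/String")
    for item in root.findall("Item")
    if item.find("Links/FirstLink[@target='one']") is not None
]
print(contents)  # ['content1']
```

All paths are relative to the current Item, so each Item is tested against its own FirstLink rather than the first one in the document.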
I didn't fully understand what your goal is, but taking a guess, here is how I would proceed in this case:
require 'nokogiri'
doc = Nokogiri::XML <<-xml
<Items>
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content1</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content2</String>
</Data>
</Item>
</Items>
xml
The two Item elements are wrapped in a single Items root here, because an XML document can have only one root element. The #xpath method with the expression "//Item" selects all the Item nodes. Those nodes are then passed to #select, which keeps only the Items whose FirstLink has a target attribute matching "one". Note that the path is relative ("Links/FirstLink", not "//Links/FirstLink"): an expression that starts with // searches the whole document, so it would return the first FirstLink no matter which Item it is called on.
node.at_xpath("Links/FirstLink")['target'] gives you the value of the target attribute of that Item's FirstLink: "one" for the first Item and "two" for the second. The trailing ['one'] is a call to String#[], which returns the matching substring, or nil when there is no match.
Keep in mind that this approach also gives you the flexibility to use a regular expression in place of the string.
nodeset = doc.xpath("//Item").select do |node|
  node.at_xpath("Links/FirstLink")['target']['one']
end
Now nodeset contains only the required Item nodes. I then use #map to collect the content of each Item's String node: #at_xpath with the relative expression Data/String selects the String node, and #text gives you its content.
nodeset.map { |n| n.at_xpath('Data/String').text } # => ["content1"]