XmlDocument: how to get the last XML node? - xmldocument

In an XML document, I want to get the bottom (last) XML node. How can I get the last XML node?
<Books>
<book>
<author> sasi </author>
<pdate>2013-01-02</pdate>
</book>
<book>
<author> surya</author>
<pdate> 2013-02-02</pdate>
</book>
<book>
<author>dolly</author>
<pdate> 2013-04-01</pdate>
</book>
</Books>
From the above, I want to get the last <book> node in the XML document.

Try this:
var xml = @"<Books>
<book>
<author> sasi </author>
<pdate>2013-01-02</pdate>
</book>
<book>
<author> surya</author>
<pdate> 2013-02-02</pdate>
</book>
<book>
<author>dolly</author>
<pdate> 2013-04-01</pdate>
</book>
</Books>";
var doc = new XmlDocument();
doc.LoadXml(xml);
var node = doc.DocumentElement.LastChild; // last child of the root <Books> element, even if an XML declaration precedes it
Console.WriteLine(node.OuterXml);
Outputs:
<book><author>dolly</author><pdate> 2013-04-01</pdate></book>
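Alternatively, you may select the last book child under the Books element:
doc.SelectSingleNode("Books/book[last()]");
or the last book element no matter where it is in the document. Note that //book[last()] matches every book that is the last book child of its own parent, so parenthesize the path to take the last match overall:
doc.SelectSingleNode("(//book)[last()]");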
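For comparison, here is a minimal sketch of the same lookups in Python's standard-library ElementTree. It is not part of the original answer. The variable names are illustrative, and ElementTree supports only a small XPath subset, so the document-wide case is done with plain iteration rather than an XPath expression.

```python
import xml.etree.ElementTree as ET

xml_text = """<Books>
  <book><author> sasi </author><pdate>2013-01-02</pdate></book>
  <book><author> surya</author><pdate> 2013-02-02</pdate></book>
  <book><author>dolly</author><pdate> 2013-04-01</pdate></book>
</Books>"""

root = ET.fromstring(xml_text)  # root is the <Books> element

# Last <book> child of the root, by list indexing.
last_by_index = root.findall("book")[-1]

# The same node via ElementTree's limited support for the last() predicate.
last_by_xpath = root.find("book[last()]")

# Last <book> anywhere in the document, in document order.
last_anywhere = list(root.iter("book"))[-1]

print(last_by_index.find("author").text)
```

All three lookups land on the same element here because every book is a direct child of the root.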

Related

Retrieving data from xml API with AJAX

I'm trying to retrieve data from an API. Could someone help me grab the node values of some child tags of an element using AJAX? My XML API response looks like this:
<distance ...>
<status>...</status>
<car>
<name>Golf</name>
<year>2016</year>
</car>
......
<car>
<name>BMW</name>
<year>2017</year>
</car>
</distance>
How can I retrieve the values of all the name and year tags? Below is the script; I wrote comments in the area where I need help. Thanks.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<script>
function searchXML()
{
var xmlhttp = new XMLHttpRequest();
var url = "https://www.example.se/api/products/xml";
xmlhttp.onreadystatechange = function () {
if (xmlhttp.readyState === 4 && xmlhttp.status === 200) {
console.log(xmlhttp.responseXML);
//Here I need help on how to retrieve data.
//when I used document.get....., I was getting .getElementsByClassName instead of getElementsById
}
};
xmlhttp.open("GET", url, true);
xmlhttp.send();
}
</script>
<title></title>
</head>
<body>
<h2>The returned data under this text</h2>
<div id="mydata">
</div>
<button type="button" onclick="searchXML()">Get data</button>
</body>
</html>
You can use the XML DOM manipulation APIs. Get the parsed document with var xmlDoc = xml.responseXML; the nodeValue property of a text node then returns its value.
Here is an example XML document:
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web" cover="paperback">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
which is parsed for a node value as follows.
(Sorry about the CORS error; the code snippet will work in Firefox.)
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
myFunction(this);
}
};
xhttp.open("GET", "https://www.w3schools.com/xml/books.xml", true);
xhttp.send();
function myFunction(xml) {
var xmlDoc = xml.responseXML;
var x = xmlDoc.getElementsByTagName('title')[0];
var y = x.childNodes[0];
document.getElementById("demo").innerHTML =
y.nodeValue; // gets the node value
}
<p id="demo"></p>
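The answer above reads a single title. To show the "collect every name and year" step the question asks about, here is a hedged sketch in Python's standard-library ElementTree rather than the browser DOM. The sample response body, including the status value, is a made-up stand-in shaped like the question's <distance> document.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body shaped like the question's <distance> document.
response_text = """<distance>
  <status>OK</status>
  <car><name>Golf</name><year>2016</year></car>
  <car><name>BMW</name><year>2017</year></car>
</distance>"""

root = ET.fromstring(response_text)

# Collect a (name, year) pair for every <car> element in the document.
cars = [(car.findtext("name"), car.findtext("year")) for car in root.iter("car")]

for name, year in cars:
    print(name, year)
```

The same idea carries over to the browser: loop over the elements returned by getElementsByTagName('car') and read the text of each name and year child.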

How to search within a nodeset and delete a node from that same nodeset

I have the following xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document mc:Ignorable="w14 w15 wp14" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:mo="http://schemas.microsoft.com/office/mac/office/2008/main" xmlns:mv="urn:schemas-microsoft-com:mac:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape">
<w:body>
<w:p w14:paraId="56037BEC" w14:textId="1188FA30" w:rsidR="001665B3" w:rsidRDefault="008B4AC6">
<w:r>
<w:t xml:space="preserve">This is the story of a man who </w:t>
</w:r>
<w:ins w:author="Mitchell Gould" w:date="2016-09-28T09:15:00Z" w:id="0">
<w:r w:rsidR="003566BF">
<w:t>went</w:t>
</w:r>
</w:ins>
<w:del w:author="Mitchell Gould" w:date="2016-09-28T09:15:00Z" w:id="1">
<w:r w:rsidDel="003566BF">
<w:delText>goes</w:delText>
</w:r>
</w:del>
...
I use Nokogiri to parse the xml as follows:
zip = Zip::File.open("test.docx")
doc = zip.find_entry("word/document.xml")
file = Nokogiri::XML.parse(doc.get_input_stream)
I have a 'deletions' nodeset that contains all of the w:del elements:
@deletions = file.xpath("//w:del")
I search inside of this nodeset to see if an element exists as follows:
my_node_set = @deletions.search("//w:del[@w:id='1']" && "//w:del/w:r[@w:rsidDel='003566BF']")
If it exists I want to remove it from the deletions nodeset. I do this with the following:
deletions.delete(my_node_set.first)
Which seems to work as no errors are returned and it displays the deleted nodeset in the terminal.
However, when I check my @deletions nodeset it seems the item is still there:
@deletions.search("//w:del[@w:id='1']" && "//w:del/w:r[@w:rsidDel='003566BF']")
I'm just getting my head around Nokogiri so I'm obviously not searching for the element properly inside of my @deletions nodeset and am instead searching the entire document.
How can I search inside of the @deletions nodeset for the element and then delete it from the nodeset?
Consider this:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<html>
<body>
<div id="foo"><p>foo</p></div>
<div id="bar"><p>bar</p></div>
</body>
</html>
EOT
divs contains the div tags, which are a NodeSet:
divs = doc.css('div')
divs.class # => Nokogiri::XML::NodeSet
And contains:
divs.to_html # => "<div id=\"foo\"><p>foo</p></div><div id=\"bar\"><p>bar</p></div>"
You can search a NodeSet using at to find the first match:
divs.at('#foo').to_html # => "<div id=\"foo\"><p>foo</p></div>"
And you can easily remove it:
divs.at('#foo').remove
Which removes it from the document itself:
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <body>
# >>
# >> <div id="bar"><p>bar</p></div>
# >> </body>
# >> </html>
It doesn't delete it from the NodeSet, but we don't care about that: the NodeSet is just a list of pointers to nodes in the document itself, used to say what to delete.
If you then want an updated NodeSet after deleting certain nodes, rescan the document and rebuild the NodeSet:
divs = doc.css('div')
divs.to_html # => "<div id=\"bar\"><p>bar</p></div>"
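The same "remove from the document, then rebuild the list" pattern can be sketched outside Nokogiri. The following uses Python's standard-library ElementTree purely as an illustration. The element IDs mirror the example above, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<body><div id='foo'><p>foo</p></div><div id='bar'><p>bar</p></div></body>"
)

divs = root.findall("div")  # snapshot: a plain list of element references
foo = next(d for d in divs if d.get("id") == "foo")

root.remove(foo)  # removes the node from the tree, not from the snapshot

stale_ids = [d.get("id") for d in divs]                 # snapshot still lists 'foo'
fresh_ids = [d.get("id") for d in root.findall("div")]  # rebuilt from the tree
print(stale_ids, fresh_ids)
```

As with the NodeSet, the earlier snapshot keeps its reference to the removed node; only a fresh query reflects the current document.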
If your goal is to remove all the nodes in the NodeSet, instead of searching through that list you can simply use:
divs.remove
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <body>
# >>
# >>
# >> </body>
# >> </html>
When I'm deleting nodes I don't gather an intermediate NodeSet, instead I do it on the fly using something like:
doc = Nokogiri::HTML(<<EOT)
<html>
<body>
<div id="foo"><p>foo</p></div>
<div id="bar"><p>bar</p></div>
</body>
</html>
EOT
doc.at('div#bar p').remove
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <body>
# >> <div id="foo"><p>foo</p></div>
# >> <div id="bar"></div>
# >> </body>
# >> </html>
which deletes the embedded <p> tag in #bar. By relaxing the selector and changing from at to search I can remove them en masse:
doc.search('div p').remove
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <body>
# >> <div id="foo"></div>
# >> <div id="bar"></div>
# >> </body>
# >> </html>
If you insist on walking through the NodeSet, remember that they are like arrays, and you can treat them as such. Here's an example of using reject to skip a particular node:
doc = Nokogiri::HTML(<<EOT)
<html>
<body>
<div id="foo"><p>foo</p></div>
<div id="bar"><p>bar</p></div>
</body>
</html>
EOT
divs = doc.search('div').reject{ |d| d['id'] == 'foo' }
divs.map(&:to_html) # => ["<div id=\"bar\"><p>bar</p></div>"]
You won't receive a NodeSet, though; you'll get an Array:
divs.class # => Array
While you can do that, you're better off using a specific selector to reduce the set rather than rely on Ruby to select or reject elements.

XPath: get nodes that do not have completely empty children

Scenario (minified):
<a>
<Sections>
<Section>
<Title></Title>
<Subject></Subject>
<Body></Body>
</Section>
<Section>
<Title/>
<Subject/>
<Body/>
</Section>
<Section>
<Title>Hello</Title>
<Subject></Subject>
<Body></Body>
</Section>
<Section>
<Title></Title>
<Subject>I have a problem</Subject>
<Body></Body>
</Section>
</Sections>
</a>
Question:
What XPath should I use to return a list of <Section/> nodes that have at least one non-empty child node, such that this is returned:
<Section>
<Title>Hello</Title>
<Subject></Subject>
<Body></Body>
</Section>
<Section>
<Title></Title>
<Subject>I have a problem</Subject>
<Body></Body>
</Section>
In other words, <Section> nodes with completely empty child nodes should be filtered out.
Try:
.//Section[./*/node()]
i.e. look for Section elements that have children that have children (text nodes or element nodes). This may or may not work depending on your requirement for empty child nodes, and may therefore need refinement.
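If you are using XPath 2.0 or later (note that has-children() was standardized in the XPath 3.0 function library, so check that your processor supports it), you can use: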
/a/Sections/Section[(true() = (for $i in * return has-children($i)))]
This checks, for each child element, whether it has children, and then checks whether that is true for at least one child.
I am not sure if this can be achieved using XPath 1.0. The following works if the child elements can contain only text nodes:
/a/Sections/Section[not(. = "")]
However, this would not return the element if there is an empty element present, e.g. <Title><test/></Title>
Try this XPath:
//a/Sections/Section[count(*[.!='']) > 0]
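To make the filtering rule concrete, here is a small sketch of the same selection in plain Python. It uses the standard-library ElementTree, whose XPath subset does not cover the predicates above, so the test is written as an ordinary function. The helper name is_non_empty and the treatment of whitespace-only text as empty are assumptions.

```python
import xml.etree.ElementTree as ET

xml_text = """<a><Sections>
  <Section><Title></Title><Subject></Subject><Body></Body></Section>
  <Section><Title/><Subject/><Body/></Section>
  <Section><Title>Hello</Title><Subject></Subject><Body></Body></Section>
  <Section><Title></Title><Subject>I have a problem</Subject><Body></Body></Section>
</Sections></a>"""

root = ET.fromstring(xml_text)

def is_non_empty(child):
    # A child counts as non-empty if it has non-whitespace text or child elements.
    return bool((child.text or "").strip()) or len(child) > 0

# Keep only the sections with at least one non-empty child element.
kept = [s for s in root.iter("Section") if any(is_non_empty(c) for c in s)]
titles = [(s.findtext("Title"), s.findtext("Subject")) for s in kept]
print(titles)
```

Counting child elements as content means a case such as <Title><test/></Title> keeps its Section, which matches the first XPath answer rather than the string-comparison ones.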

Getting certain attributes based on other attributes in Nokogiri?

Here's the XML I'm working with:
<order xmlns="http://example.com/schemas/1.0">
<link type="application/xml" rel="http://example.com/rel/self" href="https://example.com/orders/1631"/>
<link type="application/xml" rel="http://example.com/rel/order/history" href="http://example.com/orders/1631/history"/>
<link type="application/xml" rel="http://example.com/rel/order/transition/release" href="https://example.com/orders/1631/release"/>
<link type="application/xml" rel="http://example.com/rel/order/transition/cancel" href="https://example.com/orders/1631/cancel"/>
<state>hold</state>
<order-number>123-456-789</order-number>
<survey-title>Testing</survey-title>
<survey-url>http://example.com/s/123456</survey-url>
<number-of-questions>6</number-of-questions>
<number-of-completes>100</number-of-completes>
<target-group>
<country>
<id>US</id>
<name>United States</name>
</country>
<min-age>15</min-age>
</target-group>
<quote>319.00</quote>
<currency>USD</currency>
</order>
What I need to do is get the href attribute from the link that has a rel of http://example.com/rel/order/transition/release.
How can I do that using Nokogiri?
Easy-peasy:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<order xmlns="http://example.com/schemas/1.0">
<link type="application/xml" rel="http://example.com/rel/self" href="https://example.com/orders/1631"/>
<link type="application/xml" rel="http://example.com/rel/order/history" href="http://example.com/orders/1631/history"/>
<link type="application/xml" rel="http://example.com/rel/order/transition/release" href="https://example.com/orders/1631/release"/>
<link type="application/xml" rel="http://example.com/rel/order/transition/cancel" href="https://example.com/orders/1631/cancel"/>
<state>hold</state>
<order-number>123-456-789</order-number>
<survey-title>Testing</survey-title>
<survey-url>http://example.com/s/123456</survey-url>
<number-of-questions>6</number-of-questions>
<number-of-completes>100</number-of-completes>
<target-group>
<country>
<id>US</id>
<name>United States</name>
</country>
<min-age>15</min-age>
</target-group>
<quote>319.00</quote>
<currency>USD</currency>
</order>
EOT
href = doc.at('link[rel="http://example.com/rel/order/transition/release"]')['href']
# => "https://example.com/orders/1631/release"
This uses Nokogiri's support for CSS selectors. Sometimes it's easier (or the only way) to use XPath, but I prefer CSS selectors because they tend to be more readable.
Nokogiri::XML::Node#at can take a CSS selector or an XPath expression, and returns the first node matching that pattern. If you need to iterate over all the matches, use search instead, which returns a NodeSet that you can treat as an array. Nokogiri also supports at_xpath and at_css, along with css and xpath, for symmetry with at and search.
That's a one-liner:
@doc.xpath('//xmlns:link[@rel = "http://example.com/rel/order/transition/release"]').attr('href')
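The namespace handling is the part that tends to trip people up, so here is a hedged sketch of the same lookup with Python's standard-library ElementTree instead of Nokogiri. The XML is trimmed to the relevant links, and the constant names NS and REL are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/schemas/1.0"                    # default namespace of <order>
REL = "http://example.com/rel/order/transition/release"  # rel value to look for

xml_text = f"""<order xmlns="{NS}">
  <link rel="http://example.com/rel/self" href="https://example.com/orders/1631"/>
  <link rel="{REL}" href="https://example.com/orders/1631/release"/>
  <state>hold</state>
</order>"""

root = ET.fromstring(xml_text)

# Elements in a default namespace are addressed as {namespace}tag in ElementTree.
link = next(el for el in root.iter(f"{{{NS}}}link") if el.get("rel") == REL)
href = link.get("href")
print(href)
```

The unprefixed link elements still belong to the default namespace, which is why the XPath one-liner above needs the xmlns: prefix.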

xml jquery ajax json polling

I am trying to fetch an XML feed every 10 seconds and add the new entries to the results already displayed. Can someone show me how I can do that using AJAX and JSON callbacks?
Also, if I want to take it a step further and paginate the output at 5 results per page, how would I do that?
A working example would be really great.
My code is below; please feel free to expand on it.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Jquery Xml Ajax</title>
<script language="javascript" src="jquery.js"></script>
<script language="javascript">
$(document).ready(function() {
$.ajax({
type:"GET",
url:"sample-xml-feed.xml",
dataType:"xml",
success:parseXml
});
});
function parseXml (xml) {
$(xml).find("Tutorial").each(function() {
$("#output").append($(this).attr("author") +"<br/>")
});
}
</script>
</head>
<body>
<div id="output"></div>
</body>
</html>
The XML is below:
<?xml version="1.0" encoding="utf-8"?>
<RecentTutorials>
<Tutorial author="The Reddest">
<Title>Silverlight and the Netflix API</Title>
<Categories>
<Category>Tutorials</Category>
<Category>Silverlight 2.0</Category>
<Category>Silverlight</Category>
<Category>C#</Category>
<Category>XAML</Category>
</Categories>
<Date>1/13/2009</Date>
</Tutorial>
<Tutorial author="The Hairiest">
<Title>Cake PHP 4 - Saving and Validating Data</Title>
<Categories>
<Category>Tutorials</Category>
<Category>CakePHP</Category>
<Category>PHP</Category>
</Categories>
<Date>1/12/2009</Date>
</Tutorial>
<Tutorial author="The Tallest">
<Title>Silverlight 2 - Using initParams</Title>
<Categories>
<Category>Tutorials</Category>
<Category>Silverlight 2.0</Category>
<Category>Silverlight</Category>
<Category>C#</Category>
<Category>HTML</Category>
</Categories>
<Date>1/6/2009</Date>
</Tutorial>
<Tutorial author="The Fattest">
<Title>Controlling iTunes with AutoHotkey</Title>
<Categories>
<Category>Tutorials</Category>
<Category>AutoHotkey</Category>
</Categories>
<Date>12/12/2008</Date>
</Tutorial>
</RecentTutorials>
What about running your AJAX request inside a setInterval()?
setInterval(function() {
$.ajax({
type:"GET",
url:"sample-xml-feed.xml",
dataType:"xml",
success:parseXml
});
}, 10000); // 10 seconds
For the pagination, your parseXml() function should take care of the logic. Keep a counter variable somewhere and, as soon as it reaches the 5th result, start appending to a new container and reset the counter.
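The counter-and-reset idea amounts to splitting the results into fixed-size chunks. A minimal sketch of that step in Python follows. The function name paginate and the placeholder author names are assumptions, not part of the original code.

```python
def paginate(items, page_size=5):
    """Split items into consecutive pages of at most page_size entries."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

# Hypothetical author names, standing in for parsed <Tutorial> entries.
authors = [f"Author {n}" for n in range(1, 13)]
pages = paginate(authors)

for number, page in enumerate(pages, start=1):
    print(number, page)
```

Each inner list would map to one container (page) in the markup, and the last page simply holds whatever is left over.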
