I am fetching data from a database and need to send the response as XML, like the example below.
Should I load the data into an array or hash and then convert it to XML, or build the XML directly?
Please refer to the XML example below.
<Response>
<Tolls>
<Toll>
<Id>123</Id>
<Name>Bradfield Highway</Name>
<Address>Bradfield Highway, New York</Address>
<Charge>5.95</Charge>
<Location lat="41.145556" lng="-73.995"/>
<EntryRects>
<EntryRect>
<Points>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
</Points>
</EntryRect>
...
</EntryRects>
</Toll>
<Toll>
...
</Toll>
...
</Tolls>
</Response>
Please respond as soon as possible if anyone knows.
You should use Builder::XmlMarkup, which provides a simple way to create XML markup and data structures.
You don't say what database you are using, but many can generate the XML for you as the result of a query, instead of returning a normal "select" statement's output. That would be the fastest/easiest path because the data is going to have to be returned to your app anyway, so let the DBM do the conversion on the fly.
Second easiest is to use something like Nokogiri, Builder or one of several other gems. They can handle the encoding of non-ASCII characters, specifying the correct headers, and make sure the nesting and tag closure is correct. That's why people use those tools, because they save a huge amount of coding.
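As a rough sketch of the library route, here is the toll response built from an array of hashes. It uses REXML only because it ships with Ruby and runs without installing anything; Builder and Nokogiri follow the same idea with nicer syntax. The `tolls` data and the `tolls_to_xml` helper are made up for illustration and stand in for whatever your query returns.

```ruby
require 'rexml/document'

# Hypothetical rows, standing in for the result of the database query.
tolls = [
  { id: 123, name: 'Bradfield Highway', address: 'Bradfield Highway, New York',
    charge: 5.95, lat: 41.145556, lng: -73.995,
    entry_rects: [[[41.145556, -73.995], [41.145556, -73.995]]] }
]

# Builds <Response><Tolls><Toll>...</Toll></Tolls></Response> and returns it as a string.
def tolls_to_xml(tolls)
  doc = REXML::Document.new
  tolls_el = doc.add_element('Response').add_element('Tolls')
  tolls.each do |t|
    toll = tolls_el.add_element('Toll')
    toll.add_element('Id').text = t[:id].to_s
    toll.add_element('Name').text = t[:name]        # special characters are escaped on output
    toll.add_element('Address').text = t[:address]
    toll.add_element('Charge').text = t[:charge].to_s
    toll.add_element('Location', 'lat' => t[:lat].to_s, 'lng' => t[:lng].to_s)
    rects = toll.add_element('EntryRects')
    t[:entry_rects].each do |rect|
      points = rects.add_element('EntryRect').add_element('Points')
      rect.each do |lat, lng|
        points.add_element('Point', 'lat' => lat.to_s, 'lng' => lng.to_s)
      end
    end
  end
  doc.to_s
end

puts tolls_to_xml(tolls)
```

The point is that the library takes care of nesting, tag closure and escaping, so the code only has to walk the data.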
The last choice should be attempting to do it yourself. Simply because you asked the question, I suspect you don't really understand what goes into creating well-formed XML. It's possible to generate trivial XML output using something like ERB or maybe HAML to help with the nesting, but encoding will fall directly on you. If you insist on doing it, then start reading all the related links on the right side of the Stack Overflow page, plus any XML documentation you can find.
I am new to ruby and XML. I have been given an XML file and asked to do some data manipulation in that.
For ex. consider the below XML file.
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to> Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
They are asking me to extract the strings inside the tags, for example "Tove" and "Jani", do some manipulation on them (for example, replacing "Tove" with "John"), and write the data back to the same XML document.
I know ruby has a lot of gems and utilities and there must be a good utility to do it. If someone has any idea about any utility to do this work easily then just let me know.
And if there is no utility then if someone could give me some idea on how to proceed with it then it would be good.
One way is to use REXML that comes as part of the standard library.
Another way is to use Nokogiri (I would recommend using this).
Here are some good tutorials that will definitely help you:
http://ruby.bastardsbook.com/chapters/html-parsing/
https://blog.engineyard.com/2010/getting-started-with-nokogiri/
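A minimal sketch of the REXML route, assuming the note is small enough to load whole. The `replace_text` helper is hypothetical, not part of REXML; it only touches the direct children of the root element.

```ruby
require 'rexml/document'

xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
XML

# Replaces `from` with `to` in the text of each child of the root element
# and returns the modified document as a string.
def replace_text(xml, from, to)
  doc = REXML::Document.new(xml)
  doc.root.each_element do |el|
    el.text = el.text.sub(from, to) if el.text&.include?(from)
  end
  doc.to_s
end

updated = replace_text(xml, 'Tove', 'John')
puts updated
# To rewrite the same file, something like File.write('note.xml', updated)
# would do, where 'note.xml' is a placeholder path.
```

With Nokogiri the shape is the same: parse, find the nodes, assign their content, serialize.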
I need to parse a large (4 GB) XML file in Ruby, preferably with Nokogiri. I've seen a lot of code examples using
File.open(path)
but this takes too much time in my case. Is there an option to read the XML node by node, to avoid loading the whole file at once? Or what would be the fastest way to parse such a large file?
Best,
Phil
You can try using Nokogiri::XML::SAX
The basic way a SAX style parser works is by creating a parser,
telling the parser about the events we’re interested in, then giving
the parser some XML to process. The parser will notify you when it
encounters events you said you would like to know about.
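To make the event-callback idea concrete, here is a small sketch. It uses REXML's stream listener from the standard distribution so it runs without extra gems; Nokogiri::XML::SAX::Document follows the same pattern with callbacks named start_element, end_element and characters. The `ElementTextListener` class and `each_element_text` helper are made-up names, and the element name you pass in is whatever your file uses.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Yields the text of every element called `wanted`, one at a time,
# without ever building the whole document tree in memory.
class ElementTextListener
  include REXML::StreamListener

  def initialize(wanted, &on_value)
    @wanted = wanted
    @on_value = on_value
    @buffer = nil
  end

  # Called for each opening tag: start collecting text if it is the one we want.
  def tag_start(name, _attrs)
    @buffer = +'' if name == @wanted
  end

  # Called for each chunk of character data.
  def text(chunk)
    @buffer << chunk if @buffer
  end

  # Called for each closing tag: hand the collected text to the block.
  def tag_end(name)
    return unless name == @wanted && @buffer
    @on_value.call(@buffer.strip)
    @buffer = nil
  end
end

# `source` can be a String or an open File.
def each_element_text(source, wanted, &block)
  REXML::Document.parse_stream(source, ElementTextListener.new(wanted, &block))
end

sample = '<items><item>a</item><other>x</other><item> b </item></items>'
each_element_text(sample, 'item') { |value| puts value }
```

For a large file you would pass an open File instead of a string, for example inside `File.open(path) { |f| ... }`, so the parser reads it incrementally.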
I do this kind of work with LibXML http://xml4r.github.io/libxml-ruby/ (require 'xml') and its LibXML::XML::Reader API. It's simpler than SAX and allows you to do almost everything. REXML includes a similar API also, but it's quite buggy. Stream APIs like the one I mention or SAX shouldn't have any problem with huge files. I have not tested Nokogiri.
you may like to try this out - https://github.com/amolpujari/reading-huge-xml
HugeXML.read xml, elements_lookup do |element|
# => element{ :name, :value, :attributes}
end
I also tried using ox
I would like to get a structured version of a Wikiquote page via JSON (basically I need all the phrases).
Example: http://en.wikiquote.org/wiki/Fight_Club_(film)
I tried with: http://en.wikiquote.org/w/api.php?format=xml&action=parse&page=Fight_Club_(film)&prop=text
but I get the whole HTML source. I need each phrase as an element of an array.
How could I achieve that with DBPEDIA?
For one thing, I am not sure whether you can query Wikiquote using DBpedia. Secondly, DBpedia only gives you infobox data in a structured way; it does not give you the article content in a structured way. Instead, with a little bit of trouble, you can use the MediaWiki API to get the data.
EDIT
The URI you are trying gives you a text so this will make things easier, but not completely.
Try this piece of code in your console:
require 'json'
require 'open-uri'
require 'nokogiri'
url = "http://en.wikiquote.org/w/api.php?format=json&action=parse&page=Fight_Club_%28film%29&prop=text"
content = JSON.parse(URI.open(url).read) # plain open() no longer fetches URLs on Ruby 3+
data = content['parse']['text']['*']
xpath_data = Nokogiri::HTML(data)
xpath_data.xpath("//ul/li").map { |data_node| data_node.text }
This is the closest I have come to an answer. Of course, this is not completely right, because you will get a lot of unnecessary data. But if you dig into Nokogiri and XPath and find out how to pinpoint the nodes you need, you can get a solution that gives you correct quotes at least 90% of the time.
Just change the format to JSON. Look up the Wikipedia API for more details.
http://en.wikiquote.org/w/api.php?format=json&action=parse&page=Fight_Club_(film)&prop=text
I am reading some data from an XML webservice with Ruby, something like this:
<phrases>
<phrase language="en_US">&iexcl;I&apos;m highly annoyed with character references!</phrase>
</phrases>
I'm parsing the XML and grabbing an array of phrases. As you can see, the phrase text contains some XML character entity references. I'd like to replace them with the actual character being referenced. This is simple enough with the numeric references, but nasty with the XML and HTML ones. I'd like to avoid having a big hash in my code that holds the character for each XML or HTML character reference, i.e. http://www.java2s.com/Code/Java/XML/Resolvesanentityreferenceorcharacterreferencetoitsvalue.htm
Surely there's a library for this out there, right?
Update
Yes, there is a library out there, and it's called HTMLEntities:
: jmglov#laurana; sudo gem install htmlentities
Successfully installed htmlentities-4.2.4
: jmglov#laurana; irb
irb(main):001:0> require 'htmlentities'
=> []
irb(main):002:0> HTMLEntities.new.decode "&iexcl;I&apos;m highly annoyed with character references!"
=> "¡I'm highly annoyed with character references!"
REXML can do it, though it won't handle "&iexcl;" or "&nbsp;". The list of predefined XML entities (aside from Unicode numeric entities) is actually quite small. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
Given this input XML:
<phrases>
<phrase language="en_US">&quot;I&apos;m highly annoyed with character references!&#169;</phrase>
</phrases>
you can parse the XML and the embedded entities like this (for example):
require 'rexml/document'
doc = REXML::Document.new(File.read('/tmp/foo.xml'))
phrase = REXML::XPath.first(doc, '//phrases/phrase')
text = phrase.first # Type is REXML::Text
puts(text.value)
Obviously, that example assumes that the XML is in file /tmp/foo.xml. You can just as easily pass a string of XML. On my Mac and Ubuntu systems, running it produces:
$ ruby /tmp/foo.rb
"I'm highly annoyed with character references!©
This isn't an attempt to provide a solution, it's to relate some of my own experiences dealing with XML from the wild. I was using Perl at first, then later using Ruby, and the experiences are something you can encounter easily if you grab enough XML or RDF/RSS/Atom feeds.
I've often seen XML CDATA contain HTML, both encoded and unencoded. The encoded HTML was probably the result of someone doing things the right way, via some API or library to generate XML. The unencoded HTML was probably someone using a script to wrap the HTML with tags, resulting in invalid XML, but I had to deal with it anyway.
I've also seen XML CDATA containing HTML that had been encoded multiple times, requiring me to unencode everything, even after the XML engine had done its thing. Sometimes during an intermediate pass I'd suddenly have non-UTF8 characters in the string along with encoded ones, as a result of someone appending comments or joining multiple HTML streams together that were from different character-sets. For whatever the reason, it was really ugly and caused XML parsing to break or emit a lot of warnings. I'd have to loop over the content, decoding and checking to see if the previous pass was the same as the current decoding pass, and bailing if nothing had changed. There was no guarantee I'd have a string in a valid character-set at the time though, so I'd have to tell iconv to convert it to UTF8 and throw away characters that wouldn't convert cleanly.
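The decode-until-nothing-changes loop described above might look roughly like this. It is only a sketch: `decode_fully` and `MAX_PASSES` are made-up names, REXML::Text.unnormalize stands in for whatever decoder you prefer (it only knows the five predefined XML entities plus numeric references), and String#scrub replaces the iconv step, since iconv is no longer part of Ruby's standard library.

```ruby
require 'rexml/document'

MAX_PASSES = 10 # arbitrary safety limit, an assumption for this sketch

# Repeatedly decodes entity references until a pass changes nothing.
# Bytes that are not valid UTF-8 are dropped first.
def decode_fully(raw, max_passes: MAX_PASSES)
  current = raw.dup.force_encoding('UTF-8').scrub('')
  max_passes.times do
    decoded = REXML::Text.unnormalize(current)
    return decoded if decoded == current # previous pass equals this one: done
    current = decoded
  end
  current
end

puts decode_fully('&amp;amp;lt;b&amp;amp;gt;bold&amp;amp;lt;/b&amp;amp;gt;')
```

Swapping in HTMLEntities#decode for the inner call would extend this to HTML entity names, at the cost of an extra gem.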
Nokogiri can decode the content of a node various ways, by creative use of the to_xml and to_html methods. You can also look at the HTMLEntities gem, Loofah, and others to go after the CDATA contents. Loofah is nice because it's designed to whitelist/blacklist tags you might encounter.
The XML spec is supposed to protect us from such shenanigans, but, as one of my co-workers used to tell me, "We can make it fool-proof, but not damn-fool-proof". People are SO inventive and the specs mean nothing to someone who didn't bother to read them or doesn't care.
I'm using Builder::XmlMarkup to create XML. I want to create a tag without content because the API forces me to.
If I use a block
xml.tag do
end
I get what I need
<tag></tag>
but I want it shorter
xml.mytag
this gives me
<mytag/>
but I want
<mytag></mytag>
What do I have to pass as an option?
regards Kai
Just pass an empty string as a parameter: xml.mytag('')
Why do you want <mytag></mytag> instead of <mytag/>? Since the output is XML, downstream applications should not know or care about the difference.
According to the Infoset spec (Appendix D point 7), "The difference between the two forms of an empty element: <foo/> and <foo></foo>" is not represented in the XML Information Set.
This doesn't answer your "how" question, but if you discover that you actually don't need to do what you're trying to do, it may save you from a difficult and unnecessary wild goose chase.
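A quick way to convince yourself, sketched with REXML from the standard distribution: parse both forms and compare what the parser actually hands back. The `element_summary` helper is a made-up name for illustration.

```ruby
require 'rexml/document'

# Reduces a document to what a consumer would see of its root element:
# the name, the text content, and the number of child elements.
def element_summary(xml)
  root = REXML::Document.new(xml).root
  [root.name, root.text.to_s, root.elements.size]
end

short_form = element_summary('<mytag/>')
long_form  = element_summary('<mytag></mytag>')

puts short_form == long_form
```

If the receiving API really does reject the short form, it is probably not using a real XML parser, which is worth knowing before you work around it.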
OK, the empty string is nice. Another one-line way I found is an empty block.
xml.mytag{}