Deserialize JSON primitives with the built-in Ruby JSON library - ruby

Why can Ruby's built-in JSON not deserialize simple JSON primitives, and how do I work around it?
irb(main):001:0> require 'json'
#=> true
irb(main):002:0> objects = [ {}, [], 42, "", true, nil ]
#=> [{}, [], 42, "", true, nil]
irb(main):012:0> objects.each do |o|
irb(main):013:1* json = o.to_json
irb(main):014:1> begin
irb(main):015:2* p JSON.parse(json)
irb(main):016:2> rescue Exception => e
irb(main):017:2> puts "Error parsing #{json.inspect}: #{e}"
irb(main):018:2> end
irb(main):019:1> end
{}
[]
Error parsing "42": 706: unexpected token at '42'
Error parsing "\"\"": 706: unexpected token at '""'
Error parsing "true": 706: unexpected token at 'true'
Error parsing "null": 706: unexpected token at 'null'
#=> [{}, [], 42, "", true, nil]
irb(main):020:0> RUBY_DESCRIPTION
#=> "ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-darwin10.7.0]"
irb(main):022:0> JSON::VERSION
#=> "1.4.2"

RFC 4627: The application/json Media Type for JavaScript Object Notation (JSON) has this to say:
2. JSON Grammar
A JSON text is a sequence of tokens. The set of tokens includes six
structural characters, strings, numbers, and three literal names.
A JSON text is a serialized object or array.
JSON-text = object / array
[...]
2.1. Values
A JSON value MUST be an object, array, number, or string, or one of
the following three literal names:
false null true
If you call to_json on your six sample objects, you get this:
>> objects = [ {}, [], 42, "", true, nil ]
>> objects.map { |o| puts o.to_json }
{}
[]
42
""
true
null
So the first and second are valid JSON texts whereas the last four are not valid JSON texts even though they are valid JSON values.
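To make the text/value distinction concrete, here is a rough sketch. It is not a validator: it only checks whether a string starts like an RFC 4627 JSON text (an object or an array). The helper name rfc4627_text? is made up for illustration and is not part of the JSON library.

```ruby
require 'json'

# Hypothetical helper: true when the string starts like an RFC 4627 JSON text,
# i.e. its first non-whitespace character opens an object or an array.
# It does not check that the rest of the string is valid JSON.
def rfc4627_text?(json)
  json.lstrip.start_with?('{', '[')
end

objects = [{}, [], 42, "", true, nil]
objects.each do |o|
  json = o.to_json
  kind = rfc4627_text?(json) ? 'JSON text' : 'JSON value only'
  puts "#{json.inspect}: #{kind}"
end
```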
JSON.parse wants what it calls a JSON document:
Parse the JSON document source into a Ruby data structure and return it.
Perhaps JSON document is the library's term for what RFC 4627 calls a JSON text. If so, then raising an exception is a reasonable response to an invalid input.
If you forcibly wrap and unwrap everything:
objects.each do |o|
json = o.to_json
begin
json_text = '[' + json + ']'
p JSON.parse(json_text)[0]
rescue Exception => e
puts "Error parsing #{json.inspect}: #{e}"
end
end
And as you note in your comment, using an array as the wrapper is better than an object in case the caller wants to use the :symbolize_names option. Wrapping like this means that you'll always be feeding JSON.parse a JSON text and everything should be fine.

This is quite an old question, but I think it is worth having a proper answer to prevent hair loss for those who have just run into the problem and are still searching for a solution :)
To parse "JSON primitives" with a JSON gem version below 2, you can pass the quirks_mode: true option, like so:
JSON::VERSION # => 1.8.6
json_text = "This is a json primitive".to_json
JSON.parse(json_text, quirks_mode: true)
With JSON gem version 2 or later, quirks_mode is no longer necessary.
JSON::VERSION # => 2.0.0
json_text = "This is a json primitive".to_json
JSON.parse(json_text)
Before parsing the JSON, you can check which version of the JSON gem your project uses with bundle show json or gem list | grep json, and then use the corresponding approach.
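If the same code has to run against both old and new versions of the gem, the check can also be done at runtime. This is only a sketch: the helper name parse_json_value is hypothetical, and it assumes Gem::Version is available (it is when RubyGems is loaded, which is the default).

```ruby
require 'json'

# Hypothetical helper: pass quirks_mode only to JSON gem versions below 2,
# where top-level primitives are otherwise rejected by JSON.parse.
def parse_json_value(text)
  if Gem::Version.new(JSON::VERSION) < Gem::Version.new('2')
    JSON.parse(text, quirks_mode: true)
  else
    JSON.parse(text)
  end
end

p parse_json_value('"This is a json primitive"')
p parse_json_value('42')
```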
Happy JSON parsing!

It appears that the built-in JSON parser intentionally fails on anything but objects and arrays. My current workaround is the following:
# Work around a flaw in Ruby's built-in JSON parser
# not accepting anything but an object or array at the root level.
module JSON
def self.parse_any(str,opts={})
parse("[#{str}]",opts).first
end
end
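For anyone who would rather not reopen the JSON module, the same idea can be written as a standalone method. This is a sketch of the wrapping trick above; the name parse_any_json is made up for illustration.

```ruby
require 'json'

# Hypothetical standalone variant of parse_any that avoids reopening JSON.
# Wrapping the input in an array makes it an object-or-array JSON text,
# which is what older versions of the parser insist on.
def parse_any_json(str, opts = {})
  JSON.parse("[#{str}]", opts).first
end

p parse_any_json('42')
p parse_any_json('null')
p parse_any_json('{"a": 1}', symbolize_names: true)
```

One caveat of the wrapper: input such as '1, 2' is silently accepted, because it becomes the valid array [1, 2] and only the first element is returned.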

Use JSON.load instead of JSON.parse to handle primitives:
e.g.
JSON.load('true') # => true
JSON.load('false') # => false
JSON.load('5150') # => 5150
JSON.load('null') # => nil

I think you are right... whether it is a bug or not, there is some wonky logic going on in the implementation. If it can parse arrays and hashes, it should be able to parse everything else.
Because JSON.parse seems geared for objects and arrays, I would try to pass your data one of those ways if you can, and if you can't, stick with the workaround you have.

Related

What's an efficient way (without parsing and re-encoding) to put a string representing JSON into a Ruby hash?

I have a JSON string which has been generated by Jbuilder:
json = "{name: 'Peter', email: 'peter#stackoverflow.com'}"
This is currently a string. However I want to combine it into a new hash (ideally in Ruby) before finally outputting it as JSON.
i.e.
output = {result: :success, data: json}
However if I convert this to JSON the json value gets double-encoded such that it's sent as a string:
output.to_json
#=> "{\"result\":\"success\",\"data\":\"{name: 'Peter', email: 'peter#stackoverflow.com'}\"}"
Now I could parse the JSON into a Ruby hash and then re-output it but that seems like a big fat waste of parsing when what I'd really like to do is to say "hey, this node is already JSON, don't re-encode it already!".
Is there any equivalent to the raw() method Rails has in views? i.e.
output = {result: :success, data: raw(json)}
so that the json evaluation of this then becomes:
output.to_json
#=> "{\"result\":\"success\",\"data\": {\"name\":\"Peter\",\"email\":\"peter#stackoverflow.com\"}}"
Here’s a way you can do this; it’s a bit of a hack, but you might find it useful.
First restating the problem:
# Note the quotes, your example isn't actually valid
json = "{\"name\": \"Peter\", \"email\": \"peter#stackoverflow.com\"}"
output = {result: :success, data: json}
# Without changing anything
puts JSON.generate(output)
This results in the following, where the value of data is a single string:
{"result":"success","data":"{\"name\": \"Peter\", \"email\": \"peter#stackoverflow.com\"}"}
The json gem uses a to_json method that is added to all objects to convert them to json, so the simplest fix would be to replace that method on objects you want to behave differently:
# As before
json = "{\"name\": \"Peter\", \"email\": \"peter#stackoverflow.com\"}"
# Replace to_json on the singleton object
def json.to_json *args
self
end
output = {result: :success, data: json}
# Generate the output (output.to_json gives the same result)
puts JSON.generate(output)
This creates the following, where the data value is now itself a JSON object, as desired:
{"result":"success","data":{"name": "Peter", "email": "peter#stackoverflow.com"}}
A cleaner way to do this, one that avoids manipulating singletons in your code, could be to create a subclass of String that has this behaviour:
class JsonSafeString < String
def to_json *args
self
end
end
You can now create a JsonSafeString when you want the contents included directly in a JSON object:
json = "{\"name\": \"Peter\", \"email\": \"peter#stackoverflow.com\"}"
output = {result: :success, data: JsonSafeString.new(json)}
puts JSON.generate(output)
The result is the same as above:
{"result":"success","data":{"name": "Peter", "email": "peter#stackoverflow.com"}}
You could wrap the call to JsonSafeString.new in a method like raw_json if you wanted.
Obviously this leaves the task of ensuring your string is valid to you – the main point of using a library for this is the user doesn’t have to concern themselves with things like whether to use single or double quotes, so you could be vulnerable to generating invalid JSON if you’re not careful. Also this is just a quick hack, there are probably a load of things I haven’t considered. In particular I haven’t taken character encodings into account, so watch out.
This doesn't address your question, but may help you avoid it altogether...
Do you really need to generate your json variable into JSON before adding it to the hash? Jbuilder can generate a hash just as easily as a JSON string, e.g.:
hash = Jbuilder.new do |json|
json.name 'Peter'
json.email 'peter#stackoverflow.com'
end.attributes!
# => {"name"=>"Peter", "email"=>"peter#stackoverflow.com"}
output = {result: :success, data: hash}
eval will evaluate the string as raw Ruby code.
eval "{name: 'Peter', email: 'peter#stackoverflow.com'}"
=> {:name=>"Peter", :email=>"peter#stackoverflow.com"}
And the results.
output = {result: :success, data: eval("{name: 'Peter', email: 'peter#stackoverflow.com'}") }
=> {:result=>:success, :data=>{:name=>"Peter", :email=>"peter#stackoverflow.com"}}
And to string
output.to_s
=> "{:result=>:success, :data=>{:name=>\"Peter\", :email=>\"peter#stackoverflow.com\"}}"
And JSON
require 'json'
=> true
output.to_json
=> "{\"result\":\"success\",\"data\":{\"name\":\"Peter\",\"email\":\"peter#stackoverflow.com\"}}"

How do I check if a string is valid YAML?

I'd like to check if a string is valid YAML. I'd like to do this from within my Ruby code with a gem or library. I only have this begin/rescue clause, but it doesn't get rescued properly:
def valid_yaml_string?(config_text)
require 'open-uri'
file = open("https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration")
hard_failing_bad_yaml = file.read
config_text = hard_failing_bad_yaml
begin
YAML.load config_text
return true
rescue
return false
end
end
I am unfortunately getting the terrible error of:
irb(main):089:0> valid_yaml_string?("b")
Psych::SyntaxError: (<unknown>): mapping values are not allowed in this context at line 6 column 19
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:151:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:127:in `load'
from (irb):83:in `valid_yaml_string?'
from (irb):89
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/bin/irb:12:in `<main>'
Using a cleaned-up version of your code:
require 'yaml'
require 'open-uri'
URL = "https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration"
def valid_yaml_string?(yaml)
!!YAML.load(yaml)
rescue Exception => e
STDERR.puts e.message
return false
end
puts valid_yaml_string?(open(URL).read)
I get:
(<unknown>): mapping values are not allowed in this context at line 6 column 19
false
when I run it.
The reason is, the data you are getting from that URL isn't YAML at all, it's HTML:
open('https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration').read[0, 100]
=> " \n\n\n<!DOCTYPE html>\n<html>\n <head prefix=\"og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# githubog:"
If you only want a true/false response whether it's parsable YAML, remove this line:
STDERR.puts e.message
Unfortunately, going beyond that and determining if the string is a YAML string gets harder. You can do some sniffing, looking for some hints:
yaml[/^---/m]
will search for the YAML "document" marker, but a YAML file doesn't have to use those, nor do they have to be at the start of the file. We can add that in to tighten up the test:
!!YAML.load(yaml) && !!yaml[/^---/m]
But even that leaves some holes, so adding a test to see what the parser returns can help even more. YAML could return a Fixnum, a String, an Array or a Hash, but if you already know what to expect, you can check what YAML actually returns. For instance:
YAML.load(({}).to_yaml).class
=> Hash
YAML.load(({}).to_yaml).instance_of?(Hash)
=> true
So, you could look for a Hash:
parsed_yaml = YAML.load(yaml)
!!yaml[/^---/m] && parsed_yaml.instance_of?(Hash)
Replace Hash with whatever type you think you should get.
There might be even better ways to sniff it out, but those are what I'd try first.
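Putting the parse check and the type check together might look like the sketch below. It swaps in YAML.safe_load for the YAML.load used above, which is my substitution, not part of the original answer, and it rescues Psych's own error classes instead of Exception. The method name yaml_hash? is hypothetical.

```ruby
require 'yaml'

# Hypothetical helper: true only when the text parses as YAML and the
# top-level result is a Hash. Swap Hash for whatever type you expect.
# Uses safe_load (an assumption of this sketch) so arbitrary objects
# are not instantiated from untrusted input.
def yaml_hash?(text)
  YAML.safe_load(text).instance_of?(Hash)
rescue Psych::SyntaxError, Psych::DisallowedClass
  false
end

puts yaml_hash?("name: Peter\nrole: admin")
puts yaml_hash?("a: b: c")
puts yaml_hash?("just a plain string")
```

The second call hits the same "mapping values are not allowed in this context" syntax error as in the question. The third parses fine but yields a String, which is why the type check matters.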

Check if a string is XML formatted [duplicate]

This question already has answers here:
Ruby Unit Test : Is this a Valid (well-formed) XML Doc?
I'm wondering if there's a function in Ruby like is_xml?(string) to identify if a given string is XML formatted.
Nokogiri's parse uses a simple regex test looking for <html> in an attempt to determine if the data to be parsed is HTML or XML:
string =~ /^\s*<[^Hh>]*html/ # Probably html
Something similar, looking for the XML declaration would be a starting point:
string = '<?xml version="1.0"?><foo><bar></bar></foo>'
string.strip[/\A<\?xml/]
=> "<?xml"
If that returns anything other than nil the string contains the XML declaration. It's important to test for this because an empty string will fool the next steps.
Nokogiri::XML('').errors.empty?
=> true
Nokogiri also has the errors method, which will return an array of errors after attempting to parse a document that is malformed. Testing that for any size would help:
Nokogiri::XML('<foo>').errors
=> [#<Nokogiri::XML::SyntaxError: Premature end of data in tag foo line 1>]
Nokogiri::XML('<foo>').errors.empty?
=> false
Nokogiri::XML(string).errors.empty?
=> true
would be true if the document is syntactically valid.
I just tested Nokogiri to see if it could tell the difference between a regular string vs. true XML:
[2] (pry) main: 0> doc = Nokogiri::XML('foo').errors
[
[0] #<Nokogiri::XML::SyntaxError: Start tag expected, '<' not found>
]
So, you can loop through your files and sort them into XML and non-XML easily:
require 'nokogiri'
[
'',
'foo',
'<xml></xml>'
].group_by{ |s| (s.strip > '') && Nokogiri::XML(s).errors.empty? }
=> {false=>["", "foo"], true=>["<xml></xml>"]}
Assign the result of group_by to a variable, and you'll have a hash you can check for non-XML (false) or XML (true).
There is no such function in Ruby's String class or Active Support's String extensions, but you can use Nokogiri to detect errors in XML:
begin
bad_doc = Nokogiri::XML(badly_formed) { |config| config.strict }
rescue Nokogiri::XML::SyntaxError => e
puts "caught exception: #{e}"
end

How can I parse json and write that data to a database using Sinatra and DataMapper

I'm doing a proof of concept thing here and having a bit more trouble than I thought I was going to. Here is what I want to do and how I am currently doing it.
I am sending my Sinatra app a json file which contains the simple message below.
[
{
title: "A greeting!",
message: "Hello from the Chairman of the Board"
}
]
From there I have a post which I am using to take the params and write them to sqlite database
post '/note' do
data = JSON.parse(params) #<---EDIT - added, now gives error.
@note = Note.new :title => params[:title],
:message => params[:message],
:timestamp => (params[:timestamp] || Time.now)
@note.save
end
When I send the message the timestamp and the id are saved to the database however the title and message are nil.
What am I missing?
Thanks
Edit:
Now when I run my app and send it the json file I get this error:
C:/Users/Norm/ruby/Ruby192/lib/ruby/1.9.1/webrick/server.rb:183:in `block in start_thread'
TypeError: can't convert Hash into String
Edit 2: Some success.
I have the above json in a file called test.json, which is the way the json will be posted. In order to post the file I used HTTPClient:
require 'httpclient'
HTTPClient.post 'http://localhost:4567/note', [ :file => File.new('.\test.json') ]
After thinking about it some more, I thought posting the file was the problem, so I tried sending it a different way. The example below worked once I changed my post /note handler to this:
data = JSON.parse(request.body.read)
My new send.rb
require 'net/http'
require 'rubygems'
require 'json'
@host = 'localhost'
@port = '4567'
@post_ws = "/note"
@payload ={
"title" => "A greeting from...",
"message" => "... Sinatra!"
}.to_json
def post
req = Net::HTTP::Post.new(@post_ws, initheader = {'Content-Type' =>'application/json'})
#req.basic_auth @user, @pass
req.body = @payload
response = Net::HTTP.new(@host, @port).start {|http| http.request(req) }
puts "Response #{response.code} #{response.message}:
#{response.body}"
end
thepost = post
puts thepost
So I am getting closer. Thanks for all the help so far.
Sinatra won't parse the JSON automatically for you, but luckily parsing JSON is pretty straightforward:
Start with requiring it as usual. require 'rubygems' if you're not on Ruby 1.9+:
>> require 'json' #=> true
>> a_hash = {'a' => 1, 'b' => [0, 1]} #=> {"a"=>1, "b"=>[0, 1]}
>> a_hash.to_json #=> "{\"a\":1,\"b\":[0,1]}"
>> JSON.parse(a_hash.to_json) #=> {"a"=>1, "b"=>[0, 1]}
That's a round trip: create some JSON, then parse it. The IRB output shows the hash and embedded array were converted to JSON, then parsed back into the hash. You should be able to break that down for your nefarious needs.
To get the fields we'll break down the example above a bit more and pretend that we've received JSON from the remote side of your connection. So, the received_json below is the incoming data stream. Pass it to the JSON parser and you'll get back a Ruby data hash. Access the hash as you would normally and you get the values:
>> received_json = a_hash.to_json #=> "{\"a\":1,\"b\":[0,1]}"
>> received_hash = JSON.parse(received_json) #=> {"a"=>1, "b"=>[0, 1]}
>> received_hash['a'] #=> 1
>> received_hash['b'] #=> [0, 1]
The incoming JSON is probably a parameter in your params[] hash but I am not sure what key it would be hiding under, so you'll need to figure that out. It might be called 'json' or 'data' but that's app specific.
Your database code looks ok, and must be working if you're seeing some of the data written to it. It looks like you just need to retrieve the fields from the JSON.
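As a sketch of that last step, the field extraction can be pulled into a small method that takes the raw request body (the question's second edit reads it with request.body.read). The name note_attributes is hypothetical, and the sketch assumes the posted JSON has quoted keys, since the unquoted keys shown in the question's file are not valid JSON.

```ruby
require 'json'

# Hypothetical helper: turn a raw JSON request body into the attributes
# passed to Note.new in the question. Accepts either a single object or
# an array wrapping it, as in the question's test.json.
def note_attributes(raw_body, now = Time.now)
  data = JSON.parse(raw_body)
  data = data.first if data.is_a?(Array)
  raise ArgumentError, 'expected a JSON object' unless data.is_a?(Hash)

  {
    :title     => data['title'],
    :message   => data['message'],
    :timestamp => data['timestamp'] || now
  }
end

body = '[{"title": "A greeting!", "message": "Hello from the Chairman of the Board"}]'
p note_attributes(body)
```

Inside the post '/note' handler this would be called as note_attributes(request.body.read), with the resulting hash handed to Note.new.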

JSON object for just an integer

Silly question, but I'm unable to figure it out...
I tried the following in Ruby:
irb(main):020:0> JSON.load('[1,2,3]').class
=> Array
This seems to work. While neither
JSON.load('1').class
nor this
JSON.load('{1}').class
works. Any ideas?
I'd ask the guys who programmed the library. AFAIK, 1 isn't a valid JSON object, and neither is {1} but 1 is what the library itself generates for the fixnum 1.
You'd need to do: {"number" : 1} to be valid json. The bug is that
a != JSON.parse(JSON.generate(a))
I'd say it's a bug:
>> JSON.parse(1.to_json)
JSON::ParserError: A JSON text must at least contain two octets!
from /opt/local/lib/ruby/gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `initialize'
from /opt/local/lib/ruby/gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `new'
from /opt/local/lib/ruby/gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from (irb):7
I assume you're using this: (http://json.rubyforge.org/)
The claim that JSON only supports objects is simply not true -- json.org does not suggest this either, IMO. It was derived from JavaScript, and thus strings and numbers in particular are also valid JSON:
var json_string = "1";
var p = eval('(' + json_string + ')');
console.log(p);
// => 1
typeof p
// => "number"
ActiveSupport::JSON properly understands raw value JSON:
require 'active_support/json'
p = ActiveSupport::JSON.decode '1'
# => 1
p.class
# => Fixnum
and so does MultiJson:
require 'multi_json'
p = MultiJson.load '1'
# => 1
p.class
# => Fixnum
So, as a2800276 mentioned, this must be a bug.
But as of this writing, Ruby 2's JSON has quirks_mode enabled by default when using the load method.
require 'json'
p = JSON.load '1'
# => 1
p.class
# => Fixnum
The first example is valid. The second two are not valid JSON data. go to json.org for details.
As said only arrays and objects are allowed at the top level of JSON.
Maybe wrapping your values in an array will solve your problem.
def set( value ); @data = [value].to_json; end
def get; JSON.parse( @data )[0]; end
From the very basics of what JSON is:
Data types in JSON can be:
Number
String
Json Object ... (and some more)
Reference to see complete list of Json data types
Now any Json data has to be encapsulated in 'Json Object' at the top level.
To understand why this is so, note that without a Json Object at the top level everything would be loose, and the whole of the Json could hold only one value of one data type, i.e. either a number, a string, an array, a null value etc., but only one.
'Json Object' type has a fixed format of 'key' : 'value' pair.
You cannot store just the value. Thus you cannot have something like {1}.
You need to put in the correct format, i.e. 'key' : 'value' pair.

Resources