Parsing valid JSON throwing errors - ruby

I’m confused as to why this throws an error:
s = <<JSON
{"s": "This is \"valid\" JSON"}
JSON
JSON.parse(s) # => JSON::ParserError: 757: unexpected token at '{"s": "This is "valid" JSON"}'
Based on using http://jsonlint.com I can confirm that this is valid JSON, so what’s the deal? I get the feeling that I could be using %q{} here and things would be escaped properly, but I’d really rather use a heredoc here.

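It turns out that a heredoc with a bare identifier behaves like a double-quoted string, so Ruby turns \" into a plain " before JSON.parse ever sees it. Ruby supports disabling interpolation and escape processing in heredocs by surrounding the opening identifier with single quotes, so my example above would look like this: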
s = <<'JSON'
{"s": "This is \"valid\" JSON"}
JSON
JSON.parse(s) # => {"s"=>"This is \"valid\" JSON"}
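To see the difference side by side, here is a minimal sketch that builds the same body with both heredoc styles and prints what each one actually contains. The variable names are only for illustration.

```ruby
require 'json'

# Bare identifier: the body is processed like a double-quoted string,
# so \" collapses to a plain " and the JSON is no longer valid.
interpolated = <<JSON
{"s": "This is \"valid\" JSON"}
JSON

# Single-quoted identifier: the body is taken literally, backslashes included.
raw = <<'JSON'
{"s": "This is \"valid\" JSON"}
JSON

puts interpolated  # the backslashes are gone
puts raw           # the backslashes are intact

parsed = JSON.parse(raw)
puts parsed['s']
```

Printing the string before parsing it is usually the quickest way to spot this kind of problem.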

Related

backslash at end of string causes error when inserting into InfluxDB

I have a string:
string = "\\"
puts string
# => \
I am interpolating this into a new string and sending to a database. However the database (InfluxDB) uses backslashes as escape characters so pushing this string can cause an error.
For example, if I pass the following to Influx it will cause an "unterminated string" error:
insert_cmd = <<-TXT
INSERT INTO my_db.default.my_measurement,my_tag=1 my_val="#{string}"
TXT
My question is how can I replace \ in a string with \\ (two actual backslashes).
I have it working with gsub("\\", "\\\\\\") but I don't understand why this works and the following doesn't:
string.gsub("\\", "\\\\")
# SyntaxError: (irb):10: syntax error, unexpected $undefined, expecting end-of-input
Why doesn't this work? Why does gsub("\\", "\\\\\\") work? Is there a better way?
solved
As I mentioned in a comment, I am not actually interpolating manually into an INSERT INTO string. I am using influxdb-ruby:
INFLUXDB_CLIENT.write_point("things", time: Time.now.to_i, values: { foo: "\\" })
It turns out this is a bug with that gem: https://github.com/influxdata/influxdb-ruby/issues/200
It is fixed in v0.4.2, and I was using 0.4.1.
You just use parameterized query strings:
INSERT INTO my_db.default.my_measurement,my_tag=1 my_val=%{1}
Where when you call it you do this:
influxdb.query("...query...", params: [ string ])
What you did was create a classic injection bug by sending unescaped data into a query. The same principle applies in any database with a plain-text string representation, or even other data formats like HTML and JavaScript.
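On the gsub part of the question, which the answers above do not address directly: in a replacement string, gsub itself treats the backslash as an escape character, so the replacement is unescaped a second time after Ruby has already processed the string literal. The following is a small sketch of that behaviour, not a recommendation to escape database input by hand.

```ruby
string = "\\" # a single backslash

# The replacement literal "\\\\" is two backslash characters, which gsub
# reads as one escaped backslash, so the string comes back unchanged.
unchanged = string.gsub("\\", "\\\\")

# Four backslash characters in the replacement yield two in the result.
doubled = string.gsub("\\", "\\\\\\\\")

# The block form inserts its return value verbatim, with no second
# round of escape processing.
doubled_block = string.gsub("\\") { "\\\\" }

puts unchanged.length     # 1
puts doubled.length       # 2
puts doubled_block.length # 2
```

The three-backslash replacement from the question happens to work because the trailing lone backslash is kept as-is; the block form is arguably easier to read.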

Ruby to_yaml stringifies my json

I am trying to convert a Ruby hash to YAML. I'd like part of the hash to be valid JSON; however, when I try to serialize the JSON string, it is converted to YAML in quotes.
For example, when I just have a simple string, the output is as follows (note foo is not in quotation marks):
request = {}
request['body'] = 'foo'
request.to_yaml # outputs: body: foo
However, when I add something to the beginning of the string, like { foo the output for body gets quoted:
request['body'] = '{ foo'
request.to_yaml # outputs: body: '{ foo'
How can I get around this? I've tried JSON.parse and, though that might work, I can't be guaranteed that this input will actually be JSON (it could be XML, etc.). I just want to give back whatever was given to me, but not "stringified".
Basically, I want to give an object that looks like:
{ 'request' => {
'url' => '/posts',
'method' => 'GET',
'headers' => [
'Content-Type' => 'application/json'
]
},
'response' => {
'code' => 200,
'body' => '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
}
}
Which returns:
request:
  url: /posts
  method: GET
  headers:
    - Content-Type: application/json
response:
  code: 200
  body:
    [{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]
The reason being: right now, I can go from yaml to the correct ruby hash but I can't go the other way.
The method my_hash.to_yaml() just takes a hash and converts it to YAML without doing anything special to the values. The method does not care whether your string is JSON or XML, it just treats it as a string.
So why is my JSON being put into quotes when other strings aren't?
Good question! The reason is simple: curly braces are a valid part of YAML syntax.
This:
my_key: { sub: 1, keys: 2}
Is called flow mapping syntax in YAML, and it allows you to make nested mappings in one line. To escape strings which have curly braces in them, YAML uses quotes:
my_key: "{ sub: 1, keys: 2}" # this is just a string
Of course, the quotes are optional for strings that cannot be mistaken for other YAML syntax:
my_key: "foo" #same as my_key: foo
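A quick way to check this is to load each form and look at what comes back. This is a minimal sketch using the three YAML lines above as input.

```ruby
require 'yaml'

# Unquoted braces are parsed as a flow mapping (a nested Hash).
flow = YAML.load('my_key: { sub: 1, keys: 2}')

# Quoted braces are just a String.
quoted = YAML.load('my_key: "{ sub: 1, keys: 2}"')

p flow['my_key'].class   # Hash
p quoted['my_key'].class # String

# For a plain word, the quoted and unquoted forms load identically.
p YAML.load('my_key: "foo"') == YAML.load('my_key: foo')
```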
Okay, but I want to_yaml() to find my JSON string and convert it to YAML mappings like the rest of the hash.
Well then, you need to convert your JSON string to a hash like the rest of your hash. to_yaml() converts a hash to YAML. It doesn't convert strings to YAML. The proper method for doing this is to use JSON.parse, as you mentioned:
request['body'] = JSON.parse( '{"id":"ef4b3a"}' )
But the string might not be JSON! It might be XML or some other smelly string.
This is exactly why to_yaml() doesn't convert strings. A wise programmer once told me: "Strings are strings. Strings are not data structures. Strings are strings."
If you want to convert a string into a data structure, you need to validate it and parse it. Because there's no guarantee that a string will be valid, it's your responsibility as a programmer to determine whether your data is JSON or XML or just bad, and to decide how you want to respond to each bit of data.
Since it looks like you're parsing web pages, you might want to consider using the same bit of data other web clients use to parse these things:
{ 'request' => {
'url' => '/posts',
'method' => 'GET',
'headers' => [
'Content-Type' => 'application/json' #<== this guy, right here!
]
},
'response' => {
'code' => 200,
'body' => '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
}
}
If the content-type doesn't agree with the body then you should throw an error because your input data is bad.
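As a rough sketch of that idea, the body can be parsed according to the declared content type before calling to_yaml. The helper name parse_body is hypothetical, and only JSON is handled here; any other type is passed through as a plain string.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: parse a body according to its declared content type.
# Bodies of unknown types are returned untouched as strings.
def parse_body(body, content_type)
  case content_type
  when 'application/json'
    JSON.parse(body) # raises JSON::ParserError if the body is not valid JSON
  else
    body
  end
end

record = {
  'response' => {
    'code' => 200,
    'body' => parse_body('[{"id":"ef4b3a","title":"this is the title"}]', 'application/json')
  }
}

# The body is now an Array of Hashes, so to_yaml emits nested YAML
# instead of a quoted string.
puts record.to_yaml
```

Letting JSON::ParserError propagate is one way to "throw an error" when the content type and the body disagree.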
The reason '{ foo' requires quotes is given in the YAML specification, section 7.3.3, Plain Style.
Excerpt
Plain scalars must never contain the “: ” and “#” character combinations. Such combinations would cause ambiguity with mapping key: value pairs and comments. In addition, inside flow collections, or when used as implicit keys, plain scalars must not contain the “[”, “]”, “{”, “}” and “,” characters. These characters would cause ambiguity with flow collection structures.
Based on the above even your stated "return" value is incorrect and the body is probably enclosed in single quotes e.g.
response:
  code: 200
  body: '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
Otherwise it would create ambiguity with "Flow Sequences" ([,]) and "Flow Mappings" ({,}).
If you would like the JSON, XML, or other notation to be represented as structured data (read: as objects), then you will need to determine the correct parser (perhaps from the "Content-Type") and parse it before converting it to YAML.

Using a heredoc as a hash value

I have a method Embed.toggler that takes a hash argument. With the following code, I'm trying to use a heredoc in the hash.
Embed.toggler({
title: <<-RUBY
#{entry['time']}
#{entry['group']['who']
#{entry['name']}
RUBY
content: content
})
However, I'm getting the following error trace:
syntax error, unexpected ':', expecting tSTRING_DEND
content: content
^
can't find string "RUBY" anywhere before EOF
syntax error, unexpected end-of-input, expecting tSTRING_CONTENT or tSTRING_DBEG or tSTRING_DVAR or tSTRING_END
title: <<-RUBY
^
How can I avoid getting this error?
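Add a comma after your <<-RUBY, and close the interpolation on the 'who' line, which is missing its } (that is what "expecting tSTRING_DEND" in the error refers to):
Embed.toggler({
title: <<-RUBY,
#{entry['time']}
#{entry['group']['who']}
#{entry['name']}
RUBY
content: content
})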
This does work in general. I am not sure why it wasn't working in my code, though.
It didn't work because hashes require key/value pairs to be separated by a comma, like {title: 'my title', content: 'my content' }, and your code just didn't have the comma. It was hard to see that because of the cumbersome HEREDOC syntax.
Do you know if there is a way to perform operations on the string?
You're playing with fire. It's always safer (and usually cleaner) to extract a variable and do the post-processing on the variable itself:
title = <<-RUBY
#{entry['time']}
#{entry['group']['who']}
#{entry['name']}
RUBY
Embed.toggler(title: title.upcase, content: content)
However, if you feel dangerous today, you can just add operations after opening HEREDOC literal, just as you've added the comma:
Embed.toggler({
title: <<-RUBY.upcase,
#{entry['time']}
#{entry['group']['who']}
#{entry['name']}
RUBY
content: content
})
But I discourage you from this because it destroys readability.
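For reference, here is a self-contained sketch of the inline form that can be pasted into irb. The entry hash and the toggler method are hypothetical stand-ins for the asker's data and Embed.toggler.

```ruby
# Hypothetical stand-ins for the asker's data and Embed.toggler.
entry = { 'time' => '10:00', 'group' => { 'who' => 'team' }, 'name' => 'standup' }

def toggler(title:, content:)
  { title: title, content: content }
end

# The comma (and any method call) goes on the line that opens the heredoc;
# the body and the terminator follow on the next lines.
result = toggler(
  title: <<-RUBY.upcase,
    #{entry['time']}
    #{entry['group']['who']}
    #{entry['name']}
  RUBY
  content: 'body'
)

puts result[:title]
```

Note that <<-RUBY only allows the terminator to be indented; the leading whitespace of the body lines is kept in the string.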

Is it safe to parse json with YAML.load?

I am using ruby 2.1.0
I have a json file.
For example: test.json
{
"item":[
{"apple": 1},
{"banana": 2}
]
}
Is it safe to load this file with YAML.load?
YAML.load(File.read('test.json'))
I am trying to load a file which is in either json or yaml format.
YAML can load JSON
YAML.load('{"something": "test", "other": 4 }')
=> {"something"=>"test", "other"=>4}
JSON will not be able to load YAML.
JSON.load("- something\n")
JSON::ParserError: 795: unexpected token at '- something'
There will be some obscure cases that work and produce different output.
YAML.load("")
=> false
JSON.load("")
=> nil
But in general, YAML is not valid JSON.
So try JSON.load first, because it is probably better at handling obscure JSON. Catch JSON::ParserError and fall back to YAML.load.
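The fallback could look roughly like the sketch below. The helper name load_json_or_yaml is hypothetical, and it swaps in JSON.parse and YAML.safe_load for the JSON.load and YAML.load calls mentioned above, on the assumption that the input is untrusted and should only produce plain data types.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: try strict JSON first, then fall back to YAML.
def load_json_or_yaml(text)
  JSON.parse(text)
rescue JSON::ParserError
  # safe_load only builds basic types (hashes, arrays, strings, numbers...).
  YAML.safe_load(text)
end

p load_json_or_yaml('{"item": [{"apple": 1}, {"banana": 2}]}')
p load_json_or_yaml("item:\n  - apple: 1\n  - banana: 2\n")
```

If the text is neither valid JSON nor valid YAML, the Psych::SyntaxError from the YAML parser propagates to the caller.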
In recent work I did I found a corner case of the sort alluded to by Matt. For example
puts JSON.load('{"x": "foo\/bar"}')['x']
succeeds in printing
foo/bar
despite the gratuitous escaping¹ whereas
puts YAML.load('{"x": "foo\/bar"}')['x']
fails:
Psych::SyntaxError ((<unknown>): found unknown escape character while parsing a quoted scalar at line 1 column 7)
¹In this case by Java as per net.sf.json.util.JSONUtils.quote. Note that they forgot to do the same quoting in their own Javadoc, ironically enough, so you have to browse source to understand!

Json Parse Error on parsing a hash

I am working on Ruby on Rails. I have a hash like below
{"attachment"=>"{:output_dir=>\"/home/mypath/\", :process_hash=>\"8b9d9c51\", :type=>\"pdf\", :processed_dir=>\"/513/9a1/88a\", :pdf=>\"/system/path/a3ae1194f76d737b6cfb141fa0fde17f78f2e94e.pdf\", :slides_count=>4, :meta=>{:swfs=>\"{/system/path/88a/8b9d9c51[*,0].swf,4}\", :pngs=>\"/system/path/8b9d9c51{page}.png\", :json=>\"/system/path/8b9d9c51.js\"}}"
In my code I have
JSON.parse(params[:attachment])
which throws me an error as
JSON::ParserError (757: unexpected token at '{:output_dir=>"/home/path", :process_hash=>"8b9d9c51", :type=>"pdf", :processed_dir=>"/513/9a1/88a", :pdf=>"/system/path/a3ae1194f76d737b6cfb141fa0fde17f78f2e94e.pdf", :slides_count=>4, :meta=>{:swfs=>"{/system/path/8b9d9c51[*,0].swf,4}", :pngs=>"/system/path/8b9d9c51{page}.png", :json=>"/system/path/8b9d9c51.js"}}'):
Please suggest how to resolve this.
JSON.parse parses a JSON-formatted String into a Hash, not the other way around. I'm not sure what you'd like to accomplish.
If you're trying to convert a Hash into JSON (string) you could use
params[:attachment].to_json
If you're trying to convert a JSON (string) into Hash you could use
JSON.parse(params[:attachment])
However, your string doesn't look like JSON: it has Ruby symbol keys such as :output_dir and uses => where JSON requires quoted keys followed by :
Valid JSON looks like:
{ "attachment": { "output_dir": "/home/mypath", "process_hash": "89r2432" } }
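To make the distinction concrete, here is a minimal sketch with a cut-down, hypothetical version of the attachment hash. It compares the hash's Ruby string form with its JSON form and shows that only the latter can be parsed back.

```ruby
require 'json'

# A cut-down, hypothetical version of the attachment data.
attachment = {
  output_dir: '/home/mypath/',
  slides_count: 4,
  meta: { json: '/system/path/8b9d9c51.js' }
}

ruby_string = attachment.to_s    # Ruby's inspect format, not JSON
json_string = attachment.to_json # real JSON

puts ruby_string
puts json_string

# Only the JSON string round-trips; keys come back as Strings.
round_trip = JSON.parse(json_string)
puts round_trip['meta']['json']
```

In other words, the fix belongs wherever the value is produced: it should be serialized with to_json rather than to_s before it is sent as a parameter.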

Resources