I'm trying to get jq to parse a JSON structure like:
{
"a" : 1,
"b" : 2,
"c" : "{\"id\":\"9ee ...\",\"parent\":\"abc...\"}\n"
}
That is, an element in the JSON is a string with escaped json.
So, I have something along the lines of
$ jq [.c] myFile.json | jq [.id]
But that crashes with jq: error: Cannot index string with string
This is because the output of .c is a string, not more JSON.
How do I get jq to parse this string?
My initial solution is to use sed to replace all the escape chars (\":\", \",\" and \"), but that's messy. I assume there's a way built into jq to do this?
Thanks!
edit:
Also, the jq version available here is:
$ jq --version
jq version 1.3
I guess I could update it if required.
jq has the fromjson builtin for this:
jq '.c | fromjson | .id' myFile.json
fromjson was added in version 1.4.
You can use the raw output (-r) that will unescape characters:
jq -r .c myFile.json | jq .id
ADDENDUM: This has the advantage that it works in jq 1.3 and up; indeed, it should work in every version of jq that has the -r option.
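Both answers above can be checked side by side with a small sketch. The sample document here is made up to mirror the question's structure (the id and parent values are placeholders, not the asker's real data), and the first variant assumes jq 1.4 or newer for fromjson.

```shell
# Hypothetical sample: field "c" holds JSON that has been escaped into a string.
input='{"a":1,"b":2,"c":"{\"id\":\"9ee\",\"parent\":\"abc\"}\n"}'

# Variant 1: decode the embedded string inside jq (needs fromjson, jq >= 1.4).
id_fromjson=$(printf '%s\n' "$input" | jq -r '.c | fromjson | .id')

# Variant 2: print the string raw with -r, then parse it with a second jq.
id_piped=$(printf '%s\n' "$input" | jq -r '.c' | jq -r '.id')

printf '%s\n' "$id_fromjson" "$id_piped"
```

Both variables should end up holding the same id, so the choice mostly comes down to which jq version is available.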
Motivation: you want to parse a JSON string, that is, to unescape a JSON object that is wrapped in quotes and stored as a string, and convert it back into a valid JSON object. For example:
An escaped JSON string:
"{\"name\":\"John Doe\",\"position\":\"developer\"}"
The expected result (a JSON object):
{"name":"John Doe","position":"developer"}
Solution: To unescape a JSON string and convert it into a valid JSON object, use sed on the command line with regular expressions that remove or replace specific characters:
cat current_json.txt | sed -e 's/\\\"/\"/g' -e 's/^.//g' -e 's/.$//g'
s/\\\"/\"/g replaces every backslash-quote pair ( \" ) with a plain quote (")
s/^.//g removes the first character of each line
s/.$//g removes the last character of each line
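The sed expressions above only undo \" and strip the outer quotes, so other escapes such as \\ or \n would pass through unchanged. As a sketch for comparison, assuming jq 1.4 or newer is available, the same sample can be decoded with fromjson, which handles every JSON escape:

```shell
# The escaped sample string from above, kept byte-for-byte by the single quotes.
escaped='"{\"name\":\"John Doe\",\"position\":\"developer\"}"'

# sed approach: unescape quotes, then drop the first and last character.
via_sed=$(printf '%s\n' "$escaped" | sed -e 's/\\"/"/g' -e 's/^.//' -e 's/.$//')

# jq approach: read the input as a JSON string and parse its contents.
via_jq=$(printf '%s\n' "$escaped" | jq -c 'fromjson')

printf '%s\n' "$via_sed" "$via_jq"
```

For this particular input both commands print the same object; they would diverge once the string contains other escape sequences.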
Hello I am using curl to get some info which I need to clean up.
This is from curl command:
{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707, etc..
I need to get "Ita" as output, that is, the first three letters of the country.
After reading "sed JSON regular expression" I tried to adapt it, resulting in
sed -e 's/^.*"country":"[a-zA-Z]{3}".*$/\1/
but this won't work.
Can you please help?
Using jq, you can do:
curl .... | jq -r '.country[0:3]'
If you need to set the country to the first 3 chars,
jq '.country = .country[0:3]'
Some fairly advanced bash:
{
read country
read city
} < <(
curl ... |
jq -r '.country[0:3], .city[0:3]'
)
Then:
$ echo "$country $city"
Ita Mil
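If jq is not an option, the sed attempt from the question can probably be repaired. It fails for two reasons: there is no capture group for \1 to refer to, and {3} is only an interval in extended regex mode. A minimal sketch, assuming a sed that supports -E, and using a shortened sample line that is closed by hand because the original response was truncated:

```shell
# Hypothetical, hand-closed version of the truncated curl output.
json='{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707}'

# Capture the first three letters after "country":" and discard the rest of the line.
country=$(printf '%s\n' "$json" | sed -E 's/^.*"country":"([a-zA-Z]{3}).*$/\1/')

printf '%s\n' "$country"
```

This relies on the response being a single line with no spaces around the colon, which is exactly the kind of fragility the jq version avoids.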
In my shell, I have a JSON response like the one below. When I print the value, it comes out wrapped in double quotes (""), and I want to remove them.
{
"Grade": "tenth"
}
I am using
curl -s "<<API>>"| awk '{print $2;}'
Use jq JSON parser instead of awk:
curl -s "<<API>>" | jq -r '.Grade'
-r is the raw output mode. It prints the string without the surrounding quotes.
How to make jq treat an input argument as numeric instead of string? In the following example, CURR_INDEX is a Bash variable which has array index value that I want to extract.
jq --arg ARG1 $CURR_INDEX '.[$ARG1].patchSets' inputfile.json
I get the following error:
jq: error: Cannot index array with string
I tried the workaround of using bash eval but some jq filters do not work properly in eval statements.
You can convert it to a number, like this:
jq --arg ARG1 1 '.[$ARG1|tonumber]' <<< '["foo", "bar"]'
"bar"
--arg always binds the value as a string. You can use --argjson (introduced in version 1.5) to treat the argument as a json-encoded value instead.
jq --argjson ARG1 "$CURR_INDEX" '.[$ARG1].patchSets' inputfile.json
To see it in action, you can reproduce your original error:
$ jq --argjson ARG1 '"1"' '.[$ARG1]' <<< '["foo", "bar"]'
jq: error (at <stdin>:1): Cannot index array with string "1"
then correct it:
$ jq --argjson ARG1 1 '.[$ARG1]' <<< '["foo", "bar"]'
"bar"
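Putting both suggestions together with a shell variable, as in the question: the array below is an invented stand-in for inputfile.json (only the patchSets key comes from the question), and the --argjson variant assumes jq 1.5 or newer.

```shell
CURR_INDEX=1

# Hypothetical data shaped like the question's input: an array of objects with patchSets.
data='[{"patchSets":["a"]},{"patchSets":["b","c"]}]'

# --arg passes a string, so convert it inside the filter.
with_tonumber=$(printf '%s\n' "$data" | jq -c --arg ARG1 "$CURR_INDEX" '.[$ARG1 | tonumber].patchSets')

# --argjson parses the value as JSON, so it arrives as a number already.
with_argjson=$(printf '%s\n' "$data" | jq -c --argjson ARG1 "$CURR_INDEX" '.[$ARG1].patchSets')

printf '%s\n' "$with_tonumber" "$with_argjson"
```

Quoting "$CURR_INDEX" keeps the command well-formed even if the variable is empty or contains spaces, although jq will still reject a value that is not valid JSON in the --argjson case.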
I would like to convert a list into JSON array. I'm looking at jq for this but the examples are mostly about parsing JSON (not creating it). It would be nice to know proper escaping will occur. My list is single line elements so the new line will probably be the best delimiter.
I was also trying to convert a bunch of lines into a JSON array, and was at a standstill until I realized that -s was the only way I could handle more than one line at a time in the jq expression, even if that meant I'd have to parse the newlines manually.
jq -R -s -c 'split("\n")' < just_lines.txt
-R to read raw input
-s to read all input as a single string
-c to not pretty print the output
Easy peasy.
Edit: I'm on jq ≥ 1.4, which is apparently when the split built-in was introduced.
--raw-input, then --slurp
Just summarizing what the others have said in a hopefully quicker to understand form:
cat /etc/hosts | jq --raw-input . | jq --slurp .
will return you:
[
"fe00::0 ip6-localnet",
"ff00::0 ip6-mcastprefix",
"ff02::1 ip6-allnodes",
"ff02::2 ip6-allrouters"
]
Explanation
--raw-input/-R:
Don't parse the input as JSON. Instead, each line of text is passed
to the filter as a string. If combined with --slurp, then the
entire input is passed to the filter as a single long string.
--slurp/-s:
Instead of running the filter for each JSON object in the input,
read the entire input stream into a large array and run the filter
just once.
You can also use jq -R . to format each line as a JSON string and then jq -s (--slurp) to create an array for the input lines after parsing them as JSON:
$ printf %s\\n aa bb|jq -R .|jq -s .
[
"aa",
"bb"
]
The method in chbrown's answer adds an empty element to the end if the input ends with a linefeed, but you can use printf %s "$(cat)" to remove trailing linefeeds:
$ printf %s\\n aa bb|jq -R -s 'split("\n")'
[
"aa",
"bb",
""
]
$ printf %s\\n aa bb|printf %s "$(cat)"|jq -R -s 'split("\n")'
[
"aa",
"bb"
]
If the input lines don't contain ASCII control characters (which have to be escaped in strings in valid JSON), you can use sed:
$ printf %s\\n aa bb|sed 's/["\]/\\&/g;s/.*/"&"/;1s/^/[/;$s/$/]/;$!s/$/,/'
["aa",
"bb"]
Update: If your jq has inputs you can simply write:
jq -nR '[inputs]' /etc/hosts
to produce a JSON array of strings. This avoids having to read the text file as a whole.
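A quick way to try this on the same two-line sample used earlier, assuming jq 1.5 or newer (where inputs is available):

```shell
# -n: do not read input automatically; -R: treat each line as a raw string;
# [inputs] then collects every remaining line into one array.
lines=$(printf '%s\n' aa bb | jq -cnR '[inputs]')

printf '%s\n' "$lines"
```

Because each line is read separately, the trailing newline does not produce an empty final element the way split("\n") on slurped input does.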
I found in the man page for jq and through experimentation what seems to me to be a simpler answer.
$ cat test_file.txt | jq -Rsc '. / "\n" - [""]'
["aa","bb"]
The -R is to read without trying to parse json, the -s says to read all of the input as one string, and the -c is for one-line output - not necessary, but it's what I was looking for.
Then in the string I pass to jq, the '.' says take the input as it is. The '/ "\n"' says to divide the string (split it) on newlines. The '- [""]' says to remove from the resulting array any empty strings (resulting from an extra newline at the end).
It's one line and without any complicated constructs, using just simple built in jq features.
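One thing to be aware of with '- [""]' is that it removes every empty string, so blank lines in the middle of the input disappear as well. The sketch below contrasts it with a variant that only trims the final newline; it assumes jq 1.5 or newer for rtrimstr, and the three-line input is invented for illustration.

```shell
# Input with a blank line between aa and bb, plus the usual trailing newline.
dropped=$(printf 'aa\n\nbb\n' | jq -Rsc '. / "\n" - [""]')

# Strip one trailing newline first, then split, so interior blank lines survive.
kept=$(printf 'aa\n\nbb\n' | jq -Rsc 'rtrimstr("\n") | split("\n")')

printf '%s\n' "$dropped" "$kept"
```

Which behaviour is right depends on whether blank lines carry meaning in the list being converted.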
What I am trying to achieve is pass the Base64 encoded value captured in the sed regex to the base64 and have it decoded.
But the problem is, even though it seems like the correct value is being passed to the function using backreference, base64 complains that the input is invalid.
Following is my script -
#!/bin/bash
decodeBaseSixtyFour() {
echo "$1 is decoded to `echo $1 | base64 -d`"
}
echo Passing direct value ...
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour SGVsbG8gQmFzZTY0Cg==)/"
echo Passing captured value ...
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour \\1)/"
When run, it produces the following output -
Passing direct value ...
SGVsbG8gQmFzZTY0Cg== is decoded to Hello Base64
Passing captured value ...
base64: invalid input
SGVsbG8gQmFzZTY0Cg== is decoded to
I think the output explains what I mean.
Is it possible to do what I am trying to do? If not, why?
Perl s/// can do what you want, but I don't think what you're asking for is what you need.
$ echo SGVsbG8gQmFzZTY0Cg== | perl -MMIME::Base64 -pe 's/(.+)/decode_base64($1)/e'
Hello Base64
What's actually happening:
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour \\1)/"
Before sed starts reading input, the shell notices the command substitution in the double quoted string
the decodeBaseSixtyFour function is called with the string \1 (the shell reduces the unquoted \\1 to \1)
base64 chokes on the input \1 and emits the error message
the function returns the string "\1 is decoded to "
now the sed script is 's/(.+)$/\1 is decoded to /' which is how you get the last line.
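The order of events described above can be made visible with a small sketch. Here show_arg is a hypothetical stand-in for decodeBaseSixtyFour that just reports the argument it received, and -E is used in place of -r on the assumption that the local sed accepts it.

```shell
# Hypothetical helper: prints whatever argument the shell actually passed in.
show_arg() { printf 'got:%s' "$1"; }

# The shell expands $(...) while building this string, before sed reads any input.
script="s/(.+)\$/$(show_arg \\1)/"
printf '%s\n' "$script"

# Only now does sed run, substituting its capture into the already-built replacement.
result=$(echo SGVsbG8gQmFzZTY0Cg== | sed -E "$script")
printf '%s\n' "$result"
```

The first line printed is the sed script itself, which shows that the helper only ever saw the literal text \1 and never the captured Base64 value.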
As I commented, sed cannot do an equivalent of replace_callback, which is essentially what you're trying to do.
The following awk comes close to what you're trying to do (base64 -D is the macOS spelling; GNU base64 uses -d):
s="My string is SGVsbG8gQmFzZTY0Cg== something"
awk '{for(i=1; i<=NF; i++) if ($i~/==$/) "base64 -D<<<"$i|getline $i}1'<<<"$s"
My string is Hello Base64 something
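The same idea can also be sketched in plain shell, without awk. This assumes a base64 that accepts -d for decoding (as GNU coreutils does) and, like the awk version, treats any whitespace-separated word ending in == as Base64, which is a heuristic rather than a real validity check.

```shell
s="My string is SGVsbG8gQmFzZTY0Cg== something"
out=""

# $s is deliberately unquoted so the shell splits it into words.
for word in $s; do
  case $word in
    # Decode words that look like padded Base64; $(...) drops the trailing newline.
    *==) word=$(printf '%s' "$word" | base64 -d) ;;
  esac
  out="${out:+$out }$word"
done

printf '%s\n' "$out"
```

Words are re-joined with single spaces, so any original runs of whitespace are not preserved.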