csvkit in2csv - how to convert a single JSON object to a two-column CSV

Looking for a one liner with csvkit.
From a plain json object
{
"whatever": 2342,
"otherwise": 119,
"and": 1,
"so": 2,
"on": 3
}
Want this csv
whatever,2342
otherwise,119
and,1
so,2
on,3
I basically want this command to work, but it doesn't.
echo $the_json | in2csv -f json
> When converting a JSON document with a top-level dictionary element, a key must be specified.
Seems like something csvkit can do, and I just haven't found the right options.

short answer
variant A: in2csv (csvkit) + csvtool
wrap your json in brackets
use in2csv's -I option to avoid unexpected behavior
use a command to transpose the two-row CSV, e.g. csvtool
echo "[$the_json]" | in2csv -I -f json | csvtool transpose -
variant B: use jq instead
This is a solution using only jq: (https://stedolan.github.io/jq/)
echo "$the_json" | jq -r 'to_entries[] | [.key, .value] | @csv'
taken from How to map an object to arrays so it can be converted to csv?
long answer (csvkit + csvtool)
the input
in2csv -f json expects a list of JSON objects, so you need to wrap the single object ({...}) into square brackets ([{...}]).
On POSIX compatible shells, write
echo "[$the_json]"
which will print
[{
"whatever": 2342,
"otherwise": 119,
"and": 1,
"so": 2,
"on": 3
}]
the csvkit command
You may pipe the above data directly into in2csv. However, you might run into issues with the "type inference" (CSV data interpretation) feature of csvkit:
$ echo "[$the_json]" | in2csv -f json
whatever,otherwise,and,so,on
2342,119,True,2,3
1 has become True. For details, see the Tips and Troubleshooting part of the docs. It's suggested to turn off type inference using the -I option:
$ echo "[$the_json]" | in2csv -I -f json
whatever,otherwise,and,so,on
2342,119,1,2,3
Now the result is as expected
transpose the data
Still, you need to transpose the data. The csvkit docs say:
To transpose CSVs, consider csvtool.
(csvtool is available on github, opam, debian and probably other distribution channels.)
Using csvkit + csvtool, your final command looks like this:
echo "[$the_json]" | in2csv -I -f json | csvtool transpose -
with the hyphen (-) meaning to take the data from stdin. This is the result:
whatever,2342
otherwise,119
and,1
so,2
on,3
that's it.
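If csvtool is not available, the transpose step can be approximated with awk. This is only a sketch and not part of csvkit or csvtool: the function name transpose_csv is made up for illustration, and it assumes no field contains a quoted comma, since it splits on every comma.

```shell
# Naive transpose: splits on every comma, so quoted fields containing
# commas are NOT handled (csvtool handles those properly).
transpose_csv() {
  awk -F, '
    {
      for (i = 1; i <= NF; i++) cell[NR, i] = $i   # remember every cell
      if (NF > cols) cols = NF                     # widest row seen so far
    }
    END {
      for (i = 1; i <= cols; i++) {                # each column becomes a row
        line = cell[1, i]
        for (r = 2; r <= NR; r++) line = line "," cell[r, i]
        print line
      }
    }'
}

# The two-row CSV that in2csv -I printed above, fed in directly.
printf 'whatever,otherwise,and,so,on\n2342,119,1,2,3\n' | transpose_csv
```

In the full pipeline it would take the place of csvtool transpose -, reading from stdin in the same way.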
I think there is no one-liner solution with csvtool only, you'll need in2csv. You may, however, use jq instead, see the short answer.
FTR, I'm using csvkit version 1.0.3.

I tested the first posted answer and it works. It is a bit confusing, though, because "[$the_json]" stands for the raw content of the JSON. So an example command could be this:
echo '[{"a":"b","c":"d"}]' | in2csv -I -f json | csvtool transpose -
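If you want to read from a file instead, for instance myfile.json, you can add the brackets with a sed command and pipe the result to in2csv (this works when the file holds the object on a single line, because the command appends a comma to every line):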
sed -e '1s/^/[/' -e 's/$/,/' -e '$s/,$/]/' myfile.json | in2csv -I -f json > myfile.csv
Example with the full transposition command:
sed -e '1s/^/[/' -e 's/$/,/' -e '$s/,$/]/' myfile.json | in2csv -I -f json | csvtool transpose - > myfile.csv
source: How to add bracket at beginning and ending in text on UNIX
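The sed command above appends a comma to every line before fixing up the last one, so a pretty-printed, multi-line object would come out as invalid JSON. A sketch that avoids this by printing the brackets around the unchanged file content; the sample file and variable names here are made up for illustration:

```shell
# Hypothetical sample file holding one pretty-printed object.
sample=$(mktemp)
printf '{\n"a": "b",\n"c": "d"\n}\n' > "$sample"

# Emit "[", then the file unchanged, then "]".
wrapped=$(printf '['; cat "$sample"; printf ']\n')
printf '%s\n' "$wrapped"

rm -f "$sample"
```

With a real file, the same three commands can be grouped and piped on: { printf '['; cat myfile.json; printf ']\n'; } | in2csv -I -f json | csvtool transpose -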

Related

Unable to loop through the JSON internal Array having spaces in values using Bash script JQ [duplicate]

Background
I want to be able to pass a json file to WP CLI, to iteratively create posts.
So I thought I could create a JSON file:
[
{
"post_type": "post",
"post_title": "Test",
"post_content": "[leaflet-map][leaflet-marker]",
"post_status": "publish"
},
{
"post_type": "post",
"post_title": "Number 2",
"post_content": "[leaflet-map fitbounds][leaflet-circle]",
"post_status": "publish"
}
]
and iterate the array with jq:
cat posts.json | jq --raw-output .[]
I want to be able to iterate these to execute a similar function:
wp post create \
--post_type=post \
--post_title='Test Map' \
--post_content='[leaflet-map] [leaflet-marker]' \
--post_status='publish'
Is there a way I can do this with jq, or similar?
The closest I've gotten so far is this:
> for i in $(cat posts.json | jq -c .[]); do echo $i; done
But this seems to take issue with the (valid) spaces in the strings. Output:
{"post_type":"post","post_title":"Test","post_content":"[leaflet-map][leaflet-marker]","post_status":"publish"}
{"post_type":"post","post_title":"Number
2","post_content":"[leaflet-map
fitbounds][leaflet-circle]","post_status":"publish"}
Am I way off with this approach, or can it be done?
Use a while to read entire lines, rather than iterating over the words resulting from the command substitution.
while IFS= read -r obj; do
...
done < <(jq -c '.[]' posts.json)
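The difference between the two loop styles can be seen without jq or WP CLI. In this sketch the two records are made-up stand-ins for the jq -c output, and a here-document replaces the process substitution so that the example runs in any POSIX shell:

```shell
# Two sample records, one per line; the second contains a space.
records='{"post_title":"Test"}
{"post_title":"Number 2"}'

# for-loop over an unquoted expansion: splits on every space and newline.
words=0
for i in $records; do
  words=$((words + 1))
done

# while/read: consumes exactly one whole line per iteration.
lines=0
while IFS= read -r obj; do
  lines=$((lines + 1))
done <<EOF
$records
EOF

echo "for loop iterations: $words"
echo "while/read iterations: $lines"
```

The for loop runs three times because "Number 2" is split at the space, while the while/read loop runs once per record.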
Maybe this would work for you:
Make a bash executable, maybe call it wpfunction.sh
#!/bin/bash
wp post create \
--post_type="$1" \
--post_title="$2" \
--post_content="$3" \
--post_status="$4"
Then run jq on your posts.json and pipe it into xargs
jq -M -c '.[] | [.post_type, .post_title, .post_content, .post_status][]' \
posts.json | xargs -n4 ./wpfunction.sh
I am experimenting to see how this would handle post_content that contained quotes...
First generate an array of the arguments you wish to pass, then convert it to a shell-compatible form using @sh. Then you can pass it to xargs to invoke the command.
$ jq -r '.[] | ["post", "create", (to_entries[] | "--\(.key)=\(.value|tojson)")] | @sh' input.json | xargs wp

Need help consolidating three sed calls into one

I have a variable called TR_VERSION that is a JSON list of version numbers that looks something like this:
[
"1.0.1",
"1.0.2",
"1.0.3"
]
I would like to strip all of the JSON specific characters - [, ", , and ]. The following code works but it would be great to consolidate to one sed call instead of three.
TR_VERSION=$(echo $VERSION \
| sed 's|[",]||g' \
| sed 's/\[//' \
| sed 's/\]//')
Thanks for the answers!
Never ever use sed to parse json.
This is the way to go:
$ jq -r '.[]' < file.json
Output as expected
1.0.1
1.0.2
1.0.3
If you just want to remove all ", ,, [ and ] chars you may use
TR_VERSION=$(echo "$VERSION" | sed 's/[][",]//g')
Or,
TR_VERSION=$(sed 's/[][",]//g' <<< "$VERSION")
The [][",] pattern matches ], [, " or , chars.
If you really want to avoid a JSON parser, there is still no need to use sed. You could also do it with
TR_VERSION=$(tr -d '[]",' <<< "$VERSION")
which, IMHO, is slightly more readable than the sed counterpart.
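The sed and tr variants can be checked side by side. This sketch uses the sample list from the question and feeds it in with printf instead of a here-string, so it does not depend on bash; the variable names with_sed and with_tr are made up for illustration:

```shell
VERSION='[
"1.0.1",
"1.0.2",
"1.0.3"
]'

# Delete every [, ], " and , character, once with sed and once with tr.
with_sed=$(printf '%s\n' "$VERSION" | sed 's/[][",]//g')
with_tr=$(printf '%s\n' "$VERSION" | tr -d '[]",')

printf '%s\n' "$with_sed"
```

Both commands only delete characters, so the line that held the opening bracket survives as an empty first line when the input is passed quoted.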

Create variables based on cURL response - Bash

I'm trying to create two bash variables, $lat and $long, based on the result of my curl response.
curl ipinfo.io/33.62.137.111 | grep "loc" | awk '{print $2}'
I got.
"42.6334,-71.3162",
I'm trying to get
$lat=42.6334
$long=-71.3162
Can someone give me a little push ?
IFS=, read -r lat long < <(
curl -s ipinfo.io/33.62.137.111 |
jq -r '.loc'
)
printf 'Latitude is: %s\nLongitude is: %s\n' "$lat" "$long"
The ipinfo.io API returns JSON data, so let's parse it with jq:
Here is the JSON as returned by the query from your sample:
{
"ip": "33.62.137.111",
"city": "Columbus",
"region": "Ohio",
"country": "US",
"loc": "39.9690,-83.0114",
"postal": "43218",
"timezone": "America/New_York",
"readme": "https://ipinfo.io/missingauth"
}
We are going to query the loc entry of the root object:
curl -s ipinfo.io/33.62.137.111: download the JSON data -s silently without progress.
jq -r '.loc': Process JSON data, query the loc entry of the main object and -r output raw string.
IFS=, read -r lat long < <(: Sets the Internal Field Separator to , and read both lat and long variables from the following command group output stream.
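The splitting step can be tried on its own, without the network call. In this sketch the loc string is hard-coded from the question, and a here-document stands in for the process substitution so that it also runs in a plain POSIX shell:

```shell
# Sample value as it would come out of jq -r '.loc'.
loc='42.6334,-71.3162'

# IFS is set to a comma for this one read only; the rest of the
# script keeps the default field separator.
IFS=, read -r lat long <<EOF
$loc
EOF

printf 'Latitude is: %s\nLongitude is: %s\n' "$lat" "$long"
```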
Although the answer from @LeaGris is quite interesting, if you don't want to use an external library or something, you can try this:
Playground: https://repl.it/repls/ThoughtfulImpressiveComputer
coordinates=($(curl ipinfo.io/33.62.137.111 | sed 's/ //g' | grep -P '(?<=\"loc\":").*?(?=\")' -o | tr ',' ' '))
echo "${coordinates[@]}"
echo ${coordinates[0]}
echo ${coordinates[1]}
Example output:
39.9690 -83.0114 # echo "${coordinates[@]}"
39.9690 # ${coordinates[0]}
-83.0114 # ${coordinates[1]}
Explanation:
curl ... get the JSON data
sed 's/ //g' remove all spaces
grep -P ... -o
-P interpret the given pattern as a perl regexp
(?<=\"loc\":").*?(?=\")
(?<=\"loc\":") regex lookbehind
.*? capture the longitude and latitude part with non-greedy search
(?=\") regex lookahead
-o get only the matching part, which would be e.g. 39.9690,-83.0114
tr ',' ' ' replace , with space
Finally we got something like this: 39.9690 -83.0114
Putting it in parentheses lets us create an array with two values in it (cf. ${coordinates[...]}).

How to extract text with sed or grep and regular expression json

Hello I am using curl to get some info which I need to clean up.
This is from curl command:
{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707, etc..
I would need to get "Ita" as output, that is the first three letter of the country.
After reading sed JSON regular expression I tried to adapt it, resulting in
sed -e 's/^.*"country":"[a-zA-Z]{3}".*$/\1/
but this won't work.
Can you please help?
Using jq, you can do:
curl .... | jq -r '.country[0:3]'
If you need to set the country to the first 3 chars,
jq '.country = .country[0:3]'
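For reference, the sed attempt from the question fails for several reasons: the pattern has no capture group for \1 to refer to, {3} is only an interval in extended regex mode (-E), and the closing quote right after three letters would only match a country name of exactly three letters. A corrected sketch follows; the sample line is closed off by hand because the original output is truncated, and jq remains the more robust choice:

```shell
# Hand-completed sample line; the real response contains more fields.
json='{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707}'

# Capture the first three letters of the country value and drop the rest.
prefix=$(printf '%s\n' "$json" | sed -E 's/^.*"country":"([a-zA-Z]{3}).*$/\1/')
echo "$prefix"
```

If the pattern does not match, sed prints the whole line unchanged, so the result should be checked before it is used.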
some fairly advanced bash:
{
read country
read city
} < <(
curl ... |
jq -r '.country[0:3], .city[0:3]'
)
Then:
$ echo "$country $city"
Ita Mil
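The grouped reads work because both read calls share one input stream, so each consumes the next line. A sketch with a here-document in place of the curl and jq pipeline, with the two lines hard-coded as the values jq would print:

```shell
# Each read takes one line from the shared here-document.
{
  read -r country
  read -r city
} <<EOF
Ita
Mil
EOF

echo "$country $city"
```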

converting lines to json in bash

I would like to convert a list into JSON array. I'm looking at jq for this but the examples are mostly about parsing JSON (not creating it). It would be nice to know proper escaping will occur. My list is single line elements so the new line will probably be the best delimiter.
I was also trying to convert a bunch of lines into a JSON array, and was at a standstill until I realized that -s was the only way I could handle more than one line at a time in the jq expression, even if that meant I'd have to parse the newlines manually.
jq -R -s -c 'split("\n")' < just_lines.txt
-R to read raw input
-s to read all input as a single string
-c to not pretty print the output
Easy peasy.
Edit: I'm on jq ≥ 1.4, which is apparently when the split built-in was introduced.
--raw-input, then --slurp
Just summarizing what the others have said in a hopefully quicker to understand form:
cat /etc/hosts | jq --raw-input . | jq --slurp .
will return you:
[
"fe00::0 ip6-localnet",
"ff00::0 ip6-mcastprefix",
"ff02::1 ip6-allnodes",
"ff02::2 ip6-allrouters"
]
Explanation
--raw-input/-R:
Don't parse the input as JSON. Instead, each line of text is passed
to the filter as a string. If combined with --slurp, then the
entire input is passed to the filter as a single long string.
--slurp/-s:
Instead of running the filter for each JSON object in the input,
read the entire input stream into a large array and run the filter
just once.
You can also use jq -R . to format each line as a JSON string and then jq -s (--slurp) to create an array for the input lines after parsing them as JSON:
$ printf %s\\n aa bb|jq -R .|jq -s .
[
"aa",
"bb"
]
The method in chbrown's answer adds an empty element to the end if the input ends with a linefeed, but you can use printf %s "$(cat)" to remove trailing linefeeds:
$ printf %s\\n aa bb|jq -R -s 'split("\n")'
[
"aa",
"bb",
""
]
$ printf %s\\n aa bb|printf %s "$(cat)"|jq -R -s 'split("\n")'
[
"aa",
"bb"
]
If the input lines don't contain ASCII control characters (which have to be escaped in strings in valid JSON), you can use sed:
$ printf %s\\n aa bb|sed 's/["\]/\\&/g;s/.*/"&"/;1s/^/[/;$s/$/]/;$!s/$/,/'
["aa",
"bb"]
Update: If your jq has inputs you can simply write:
jq -nR '[inputs]' /etc/hosts
to produce a JSON array of strings. This avoids having to read the text file as a whole.
I found in the man page for jq and through experimentation what seems to me to be a simpler answer.
$ cat test_file.txt | jq -Rsc '. / "\n" - [""]'
["aa","bb"]
The -R is to read without trying to parse json, the -s says to read all of the input as one string, and the -c is for one-line output - not necessary, but it's what I was looking for.
Then in the string I pass to jq, the '.' says take the input as it is. The '/ "\n"' says to divide the string (split it) on newlines. The '- [""]' says to remove from the resulting array any empty strings (resulting from an extra newline at the end).
It's one line and without any complicated constructs, using just simple built in jq features.
