Loop over JSON and create files with key as filename and value as file content - bash

I have a sample JSON like
{
  "request_id": "016-ae81e5737add",
  "renewable": false,
  "data": {
    "client_decrypted.key": "-----BEGIN KEY-----sdjijhgisdTRjkRD05\n-----END KEY-----",
    "passphrase": "password"
  }
}
In my bash script, I am reading the JSON, specifically the contents of the data key. I would like to create files named after each key (e.g. client_decrypted.key and passphrase), with the contents of each file being the value of the corresponding key.
I was able to use jq to read the key & value pair
for pair in $(jq -r '.data | to_entries | map("\(.key)=\(.value|tostring)") | .[]' json); do
    echo "${pair}" # OUTPUT=passphrase=password
done
I guess I can use IFS to split each pair on =. Is there a better way of doing this?
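For reference, the IFS splitting alluded to would look something like the sketch below (read assigns everything after the first = to value, but the unquoted $(...) loop above has already word-split values containing whitespace):
IFS='=' read -r key value <<< "${pair}"
printf '%s\n' "$value" > "$key"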

One safe way would be to emit the JSON records delimited by NUL bytes and then read them back in a shell loop, splitting on that byte:
while IFS= read -r -d '' key && IFS= read -r -d '' value; do
printf '%s\n' "$value" > "$key"
done < <(jq -j '.data | to_entries[] | (.key, "\u0000", .value, "\u0000")' json)
Since a NUL byte cannot appear inside any valid input record, it is used as the delimiter. The jq expression
(.key, "\u0000", .value, "\u0000")
emits a NUL byte after the key and after the value, and the while loop consumes the stream two reads at a time, storing the key name and the value in the respective variables.
Note that <(..) is a bash/ksh93/zsh process-substitution feature, which is not POSIX compliant and will not work in pure Bourne shells.
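As a quick check (assuming the sample JSON above is saved in a file named json, matching the jq invocation), running the loop should leave:
$ ls
client_decrypted.key  json  passphrase
$ cat passphrase
password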

Related

jq: insert new objects while reading inputs from json file and bash stdout

I want to insert new JSON objects in between existing JSON objects using a bash-generated uuid.
input json file test.json
{"name":"a","type":1}
{"name":"b","type":2}
{"name":"c","type":3}
input bash command uuidgen -r
target output json
{"id": "7e3ca7b0-48f1-41fe-9a19-092a62cba0dc"}
{"name":"a","type":1}
{"id": "3f793fdd-ec3b-4306-8153-12f3f9faf2c1"}
{"name":"b","type":2}
{"id": "cbcd759a-37e7-4da7-b7fe-7572f474ec31"}
{"name":"c","type":3}
basic jq program to insert new objects
jq -c '{"id"}, .' test.json
output json
{"id":null}
{"name":"a","type":1}
{"id":null}
{"name":"b","type":2}
{"id":null}
{"name":"c","type":3}
jq program to insert uuid generated from bash:
jq -c '{"id" | input}, .' test.json < <(uuidgen)
Unsure about how to handle the two inputs: the bash command used to create a value for the new object, and the input file to be transformed (a new object inserted before each object).
I want to process both small and large JSON files, up to a few gigabytes each.
I would greatly appreciate a well-designed solution that scales to large files and performs the operations quickly and efficiently.
Thanks in advance.
If the input file is already well-formed JSONL, then a simple bash solution would be:
while IFS= read -r line; do
    printf '{"id": "%s"}\n' "$(uuidgen)"
    printf '%s\n' "$line"
done < test.json
This might well be the best trivial solution if test.json is very large and known to be valid JSONL.
If the input file is not already JSONL, then you could still use the above approach by first piping it through jq -c . test.json. And if read is too slow, you could still use the same text-processing approach with awk, as in the sketch below.
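A rough awk equivalent might look like the following sketch; it still forks uuidgen once per input line, so it mainly saves the cost of bash's byte-at-a-time read:
awk '{
    cmd = "uuidgen"
    cmd | getline id    # run uuidgen, capture one line of output
    close(cmd)          # close so the next iteration reruns the command
    printf "{\"id\": \"%s\"}\n", id
    print               # the original JSONL line, unchanged
}' test.json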
For the record, a single-call-to-jq solution along the lines you have in mind could be constructed as follows:
jq -n -c -R --slurpfile objects test.json '
$objects[] | {"id": input}, .' <(while true ; do uuidgen ; done)
Obviously you cannot "slurp" the unbounded stream of uuidgen values; less obviously perhaps, if you were simply to pipe in the stream, the process would hang.
Since @peak has already covered the jq side of the problem, I'm going to take a shot at doing this more efficiently using Python, still wrapped so it can be called from a shell script.
This assumes that your input is JSONL, with one document per line. If it isn't, consider piping through jq -c . before piping into the below.
#!/usr/bin/env bash
py_prog=$(cat <<'EOF'
import json, sys, uuid

for line in sys.stdin:
    # Emit a new {"id": <uuid>} object before echoing each original line
    print(json.dumps({"id": str(uuid.uuid4())}))
    sys.stdout.write(line)
EOF
)
python -c "$py_prog" <in.json >out.json
Here's another approach where jq is handling input as raw string, already muxed by a separate copy of bash.
while IFS= read -r line; do
    uuidgen
    printf '%s\n' "$line"
done < test.json | jq -Rrc '({ "id": . }, input)'
It still has all the performance overhead of calling uuidgen once per input line (plus some extra overhead because bash's read operates one byte at a time) -- but it operates in a fixed amount of memory without needing Python.
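If the per-line fork is the bottleneck, a Linux-specific variation (an assumption: the /proc/sys/kernel/random/uuid interface is available, which is not the case on macOS or the BSDs) avoids spawning uuidgen entirely:
while IFS= read -r line; do
    read -r uuid < /proc/sys/kernel/random/uuid  # the kernel hands out a fresh UUID per read
    printf '%s\n%s\n' "$uuid" "$line"
done < test.json | jq -Rrc '({ "id": . }, input)'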
If the input was not known in advance to be valid JSONL,
one of the following bash+jq solutions might make sense
since the overhead of counting the number of objects would be relatively small.
If the input is small enough to fit in memory, you could go with a simple solution:
n=$(jq -n 'reduce inputs as $in (0; .+1)' test.json)
for ((i=0; i < $n; i++)); do uuidgen ; done |
jq -n -c -R --slurpfile objects test.json '
$objects[] | {"id": input}, .'
Otherwise, that is, if the input is very large, then one could avoid slurping it as follows:
n=$(jq -n 'reduce inputs as $in (0; .+1)' test.json)
jq -nc --rawfile ids <(for ((i=0; i < $n; i++)); do uuidgen ; done) '
$ids | split("\n") as $ids
| foreach inputs as $in (-1; .+1; {id: $ids[.]}, $in)
' test.json
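Either way, a run against the three-line test.json above should emit output shaped like the target (the UUIDs themselves will differ on every run):
{"id":"<fresh-uuid>"}
{"name":"a","type":1}
{"id":"<fresh-uuid>"}
{"name":"b","type":2}
{"id":"<fresh-uuid>"}
{"name":"c","type":3}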

Converting JSON response to key value pair using jq

So, I am getting a response from an API that I am calling in a shell script in the following form
[{"id":100000004,"name":"Customs Clearance Requested"},{"id":100000005,"name":"Customs Cleared"},{"id":100000006,"name":"Cargo Loaded to Vessel"}]
I want to create a map out of it that will help me look up the ids by name and use them in the shell script. So something like map["Customs Clearance Requested"] would give me 100000004, which I can use further. Can this be done using jq? I am pretty new to shell scripting and jq and got stuck on this.
json='[{"id":100000004,"name":"Customs Clearance Requested"},{"id":100000005,"name":"Customs Cleared"},{"id":100000006,"name":"Cargo Loaded to Vessel"}]'
declare -A map
while IFS= read -r -d '' name && IFS= read -r -d '' value; do
map[$name]=$value
done < <(jq -j '.[] | "\(.name)\u0000\(.id)\u0000"' <<<"$json")
declare -p map # demo purposes: print the map we created as output
...emits as output:
declare -A map=(["Cargo Loaded to Vessel"]="100000006" ["Customs Clearance Requested"]="100000004" ["Customs Cleared"]="100000005" )
...which you can query exactly as requested:
$ echo "${map['Cargo Loaded to Vessel']}"
100000006
You could use the select function, e.g.:
data='[{"id":100000004,"name":"Customs Clearance Requested"},{"id":100000005,"name":"Customs Cleared"},{"id":100000006,"name":"Cargo Loaded to Vessel"}]'
jq 'map(select(.["name"] == "Customs Clearance Requested"))' <<< "$data"
It will get all elements whose name equals "Customs Clearance Requested", e.g.:
[
  {
    "id": 100000004,
    "name": "Customs Clearance Requested"
  }
]
If you want to get the id field:
jq 'map(select(.["name"] == "Customs Clearance Requested")["id"])' <<< "$data"
This will output:
[
  100000004
]
Please note that it will return an array and not a single element because the search does not know how many results will be found.
If you want to generalize this in a shell function, you could write:
function get_id_from_name
{
    # $1 = name to search for
    local filter=$(printf 'map(select(.["name"] == "%s")["id"])' "$1")
    jq "$filter"
}
Then call it like this:
get_id_from_name "Customs Clearance Requested" <<< "$data"
If your data is stored in a file, you could call it this way:
get_id_from_name "Customs Clearance Requested" < /path/to/file.json
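Note that interpolating $1 into the filter via printf breaks if the name contains double quotes or backslashes. A variant of the same function using jq's --arg (a sketch, not part of the original answer) avoids the issue:
function get_id_from_name
{
    # $1 = name to search for; --arg passes it as a jq variable,
    # so the shell never splices raw text into the filter
    jq --arg name "$1" 'map(select(.name == $name).id)'
}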
The following is very similar to @CharlesDuffy's excellent answer but does not assume that the .name and .id values are NUL-free (i.e., do not have any "\u0000" characters):
declare -A map
while read -r name
do
    name=$(sed -e 's/^"//' -e 's/"$//' <<< "$name")
    read -r id
    map[$name]="$id"
done < <(echo "$json" | jq -c '.[] | .name, .id')
The point is that the -j option is like -r (i.e., produces "raw output"), whereas the -c option produces JSON.
This means that if you don't want the .id values as JSON strings, then the above won't be a solution; also, if the .name values contain double-quotes, then you might want to deal with the occurrences of \".

How to avoid the read command cutting the user input which is a string by space

I wrote a bash script to read multiple inputs from the user
Here is the command:
read -a choice
In this way, I can put all the inputs in the choice variable as an array so that I can extract them using an index.
The problem is that when one of the inputs is a string with spaces in it, like
user1 google.com "login: myLogin\npassword: myPassword"
the read command will split the quoted string into three words. How can I stop this from happening?
bash doesn't process quotes in user input. The only thing I can think of is to use eval to execute an array assignment.
IFS= read -r input
eval "choice=($input)"
Unfortunately this is dangerous -- if the input contains executable code, it will be executed by eval.
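To make the danger concrete, here is a hypothetical hostile input (do not run this); eval executes the command substitution while parsing the assignment:
IFS= read -r input         # attacker types: $(rm -rf ~)
eval "choice=($input)"     # the $(...) runs here, deleting the home directory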
You can use a tab instead of a space as the field delimiter. For instance:
$ IFS=$'\t' read -a choice
value1 value2 a value with many words ## This is typed
$ echo ${choice[2]}
a value with many words
Regards!
Given the risk of using eval, and the fact that the input seems to contain only two kinds of tokens (unquoted and quoted), consider using a scripting engine that will put the text into a format that is easy to read back.
It's not clear from the example what other quoting rules are used. The example assumes 'standard' escapes that can be processed with bash's ${parameter@E} expansion.
The following uses a Perl one-liner to generate TAB-delimited tokens (hoping that raw tabs cannot be part of the input; another character could be used instead).
input='user1 google.com "login: myLogin\npassword: myPassword"'
tsv_input=$(perl -e '$_ = " $ARGV[0]" ; print $2 // $3, "\t" while ( /\s+("([^"]*)"|(\S*))/g) ;' "$input")
IFS=$'\t' read -d '' id domain values <<< "$(echo -e "${tsv_input@E}")"
Or using a function to get more readable code
function data_to_tsv {
    # Translate to TSV
    local tsv_input=$(perl -e '$_ = " $ARGV[0]" ; print $2 // $3, "\t" while ( /\s+("([^"]*)"|(\S*))/g) ;' "$1")
    # Process escapes
    echo -n "${tsv_input@E}"
}
input='user1 google.com "login: myLogin\npassword: myPassword"'
IFS=$'\t' read -d '' id domain values <<< "$(data_to_tsv "$input")"
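A quick check of the resulting variables (assuming the @E expansion turns the \n escape into a real newline, as in $'...', so values contains an embedded newline):
$ printf '[%s] [%s] [%s]\n' "$id" "$domain" "$values"
[user1] [google.com] [login: myLogin
password: myPassword]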

Convert json array of objects to bash associative array

I have an array of JSON objects that I'd like to convert to an associative array in bash, with a slight alteration to the key:
{
  "Parameters": [
    {
      "Name": "/path/user_management/api_key",
      "Type": "SecureString",
      "Value": "1234",
      "Version": 1
    },
    {
      "Name": "/path/user_management/api_secret",
      "Type": "SecureString",
      "Value": "5678",
      "Version": 1
    }
  ]
}
I know I need to use jq and sed, but I just can't quite find the right combination to do what I'm looking for: strip out "/path/user_management/" and set the remainder as the key, and use Value for the value.
I'm trying to find a fairly clean one-liner piping commands together. What I'd like to end up with is a bash associative array along the lines of:
myArray[api_key]="1234"
myArray[api_secret]="5678"
Asking for a one-liner is as good as asking for unreadable code. If you want to do this in a proper way, read the output of the jq command in a while loop and strip out unwanted characters as required.
#!/usr/bin/env bash
# declare an associative array, the -A defines the array of this type
declare -A _my_Array
# The output of jq is separated by '|' so that we have a valid delimiter
# to read our keys and values. The read command processes one line at a
# time and puts the values in the variables 'key' and 'value'
while IFS='|' read -r key value; do
    # Strip out the text up to and including the last '/'
    strippedKey="${key##*/}"
    # Put the key/value pair into the array
    _my_Array["$strippedKey"]="$value"
done < <(jq -r '.Parameters[] | "\(.Name)|\(.Value)"' json)
# Print the whole array with 'declare -p', or print it one key at a time
declare -p _my_Array
Or print the array the traditional way:
for key in "${!_my_Array[@]}"; do
    printf '%s %s\n' "${key}" "${_my_Array[$key]}"
done
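Against the sample document above (saved in a file named json, as in the jq invocation), the declare -p line should print something like:
declare -A _my_Array=([api_key]="1234" [api_secret]="5678" )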

converting lines to json in bash

I would like to convert a list into a JSON array. I'm looking at jq for this, but the examples are mostly about parsing JSON (not creating it). It would be nice to know that proper escaping will occur. My list consists of single-line elements, so the newline is probably the best delimiter.
I was also trying to convert a bunch of lines into a JSON array, and was at a standstill until I realized that -s was the only way I could handle more than one line at a time in the jq expression, even if that meant I'd have to parse the newlines manually.
jq -R -s -c 'split("\n")' < just_lines.txt
-R to read raw input
-s to read all input as a single string
-c to not pretty print the output
Easy peasy.
Edit: I'm on jq ≥ 1.4, which is apparently when the split built-in was introduced.
--raw-input, then --slurp
Just summarizing what the others have said in a hopefully quicker to understand form:
cat /etc/hosts | jq --raw-input . | jq --slurp .
will return you:
[
  "fe00::0 ip6-localnet",
  "ff00::0 ip6-mcastprefix",
  "ff02::1 ip6-allnodes",
  "ff02::2 ip6-allrouters"
]
Explanation
--raw-input/-R:
    Don't parse the input as JSON. Instead, each line of text is passed
    to the filter as a string. If combined with --slurp, then the
    entire input is passed to the filter as a single long string.
--slurp/-s:
    Instead of running the filter for each JSON object in the input,
    read the entire input stream into a large array and run the filter
    just once.
You can also use jq -R . to format each line as a JSON string and then jq -s (--slurp) to create an array for the input lines after parsing them as JSON:
$ printf %s\\n aa bb|jq -R .|jq -s .
[
  "aa",
  "bb"
]
The method in chbrown's answer adds an empty element to the end if the input ends with a linefeed, but you can use printf %s "$(cat)" to remove trailing linefeeds:
$ printf %s\\n aa bb|jq -R -s 'split("\n")'
[
  "aa",
  "bb",
  ""
]
$ printf %s\\n aa bb|printf %s "$(cat)"|jq -R -s 'split("\n")'
[
  "aa",
  "bb"
]
If the input lines don't contain ASCII control characters (which have to be escaped in strings in valid JSON), you can use sed:
$ printf %s\\n aa bb|sed 's/["\]/\\&/g;s/.*/"&"/;1s/^/[/;$s/$/]/;$!s/$/,/'
["aa",
"bb"]
Update: If your jq has inputs you can simply write:
jq -nR '[inputs]' /etc/hosts
to produce a JSON array of strings. This avoids having to read the text file as a whole.
I found in the man page for jq and through experimentation what seems to me to be a simpler answer.
$ cat test_file.txt | jq -Rsc '. / "\n" - [""]'
["aa","bb"]
The -R is to read without trying to parse json, the -s says to read all of the input as one string, and the -c is for one-line output - not necessary, but it's what I was looking for.
Then in the string I pass to jq, the '.' says to take the input as it is. The '/ "\n"' says to divide the string (split it) on newlines. The '- [""]' says to remove from the resulting array any empty strings (resulting from an extra newline at the end).
It's one line and without any complicated constructs, using just simple built in jq features.
