conditional grep from a Json file - shell

I have a question about this JSON and grep:
[
{
"name":"Jon",
"id":123
},
{
"name":"Ray",
"id":1234
},
{
"name":"Abraham",
"id":12345
}
]
How can one extract the name from this JSON where the id matches, say, 1234 (the id can be arbitrary), using grep or sed?

I would suggest using jq, but if you want to use grep, try
grep -B1 'id.*1234' < input_file | grep name
From the man page:
-B num, --before-context=num
Print num lines of leading context before each match. See also the -A and -C options.
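One caveat: the pattern id.*1234 also matches 12345, so Abraham's name would be printed too. Anchoring the id value avoids that; a runnable sketch with the sample data (here 1234 has no trailing comma because id is the object's last field):

```shell
# Recreate the sample file from the question.
cat > input_file <<'EOF'
[
{
"name":"Jon",
"id":123
},
{
"name":"Ray",
"id":1234
},
{
"name":"Abraham",
"id":12345
}
]
EOF

# Anchor the number so 12345 does not match as well.
grep -B1 '"id":1234$' input_file | grep name
# prints: "name":"Ray",
```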

please suggest the jq command
I'll take the liberty of fulfilling the request.
jq -r '.[]|select(.id==1234).name' file
.[] - iterates the array elements
select(.id==1234) - filters element with desired id
.name - extracts name
The option -r causes the name to be written unquoted.
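For example, with the array from the question saved to file:

```shell
# Save the sample array (same data as above, compacted).
cat > file <<'EOF'
[
{"name":"Jon","id":123},
{"name":"Ray","id":1234},
{"name":"Abraham","id":12345}
]
EOF

# Iterate, filter on id, extract the name, print it unquoted.
jq -r '.[]|select(.id==1234).name' file
# prints: Ray
```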

Related

How to extract simple text and store into a file?

I'm writing a bash script for Hetzner's cloud API, and I need to store the server's ID in a text file. After running the command, it outputs the below:
{
"server": {
"id": 12345678,
"name": "servertest-101",
"status": "initializing",
"created": "2020-09-18T09:22:21+00:00",
This is just a snippet; the value I need is right at the top of the response.
How can I extract and store that value?
The API returns JSON. You've not given much information, but use jq to parse it:
$ cat myinput.json
{
"server": {
"id": 12345678,
"name": "servertest-101",
"status": "initializing",
"created": "2020-09-18T09:22:21+00:00"
}
}
$ jq -r .server.id myinput.json
12345678
redirect to a file:
$ jq -r .server.id myinput.json > myoutputfile
$ cat myoutputfile
12345678
You can pipe the output of your command to process it further, like this:
grep -m 1 -E -o '"id": [0-9]+' yourjson.json | cut -d" " -f 2 > yourtextfile.txt
First, grep extracts only the part "id": 12345678 using a regular expression (-m 1 stops after the first match). Then cut splits that result on spaces and selects the second field, which is your value. Finally, the result is redirected to the desired text file.
If you are sure that your value will always be the first number in the input, you can simply select it with grep:
grep -m 1 -E -o '[0-9]+' yourjson.json > output.txt
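A runnable sketch of the first pipeline, with a trimmed copy of the response standing in for the real API output:

```shell
# Trimmed sample of the API response.
cat > yourjson.json <<'EOF'
{
"server": {
"id": 12345678,
"name": "servertest-101"
}
}
EOF

# Extract '"id": 12345678', then keep only the number after the space.
grep -m 1 -E -o '"id": [0-9]+' yourjson.json | cut -d" " -f 2
# prints: 12345678
```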

Get value from JSON with grep

I have this line to read a value from a JSON string
ID=$(echo $ACCOUNT_REQ | grep -Po '"id": *\K"[^"]*"')
TYPE=$(echo $ACCOUNT_REQ | grep -Po '"type": *\K"[^"]*"')
The problem is that from time to time I get too much information in the output:
for id I get ".........." instead of .......... (I get the double quotes too)
for type I get "NATIVE", "zoneTransferWhitelist" (I get the next part, zoneTransferWhitelist, too)
My shorted json
{
"response":{
"data":[
{
"id":"..........",
"type":"NATIVE",
"zoneTransferWhitelist":[
]
}
]
}
}
How do I make it so that I only get the value between the quotation marks?
Not very elegant, but this works with your example and will get any single-line value from the JSON:
get_value() {
  echo "$1" | grep "\"$2\":" | sed "s/.*$2\":\"\(.*\)\",/\1/"
}
Then you can run:
get_value "$ACCOUNT_REQ" id
get_value "$ACCOUNT_REQ" type
The quotes around $ACCOUNT_REQ are important for keeping it split across multiple lines (assuming your initial json is multi-line like you've pasted above).
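A runnable sketch with a hypothetical sample in place of the real response (the values are placeholders; note the sed expression relies on each value line ending with ","):

```shell
# Hypothetical sample standing in for the real API response.
ACCOUNT_REQ='{
"response":{
"data":[
{
"id":"abc-123",
"type":"NATIVE",
"zoneTransferWhitelist":[
]
}
]
}
}'

# Pick the line holding the key, then strip everything but the quoted value.
get_value() {
  echo "$1" | grep "\"$2\":" | sed "s/.*$2\":\"\(.*\)\",/\1/"
}

get_value "$ACCOUNT_REQ" id     # prints: abc-123
get_value "$ACCOUNT_REQ" type   # prints: NATIVE
```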

Create variables base on cURL response - Bash

I'm trying to create two variables in bash, $lat and $long, based on the result of my curl response.
curl ipinfo.io/33.62.137.111 | grep "loc" | awk '{print $2}'
I got:
"42.6334,-71.3162",
I'm trying to get
$lat=42.6334
$long=-71.3162
Can someone give me a little push?
IFS=, read -r lat long < <(
curl -s ipinfo.io/33.62.137.111 |
jq -r '.loc'
)
printf 'Latitude is: %s\nLongitude is: %s\n' "$lat" "$long"
The ipinfo.io API is returning JSON data, so let's parse it with jq.
Here is the JSON as returned by the query from your sample:
{
"ip": "33.62.137.111",
"city": "Columbus",
"region": "Ohio",
"country": "US",
"loc": "39.9690,-83.0114",
"postal": "43218",
"timezone": "America/New_York",
"readme": "https://ipinfo.io/missingauth"
}
We query the loc entry from the root JSON object:
curl -s ipinfo.io/33.62.137.111: download the JSON data -s silently without progress.
jq -r '.loc': Process JSON data, query the loc entry of the main object and -r output raw string.
IFS=, read -r lat long < <(: Sets the Internal Field Separator to , and read both lat and long variables from the following command group output stream.
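Since the curl call needs network access, here is the same read sketch fed by printf instead (bash-only, because of the process substitution):

```shell
# printf stands in for `curl -s ... | jq -r '.loc'`.
IFS=, read -r lat long < <(printf '39.9690,-83.0114\n')
printf 'Latitude is: %s\nLongitude is: %s\n' "$lat" "$long"
# Latitude is: 39.9690
# Longitude is: -83.0114
```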
Although the answer from @LeaGris is quite interesting, if you don't want to use an external tool, you can try this:
Playground: https://repl.it/repls/ThoughtfulImpressiveComputer
coordinates=($(curl ipinfo.io/33.62.137.111 | sed 's/ //g' | grep -P '(?<=\"loc\":").*?(?=\")' -o | tr ',' ' '))
echo "${coordinates[@]}"
echo "${coordinates[0]}"
echo "${coordinates[1]}"
Example output:
39.9690 -83.0114 # echo "${coordinates[@]}"
39.9690 # echo "${coordinates[0]}"
-83.0114 # echo "${coordinates[1]}"
Explanation:
curl ... get the JSON data
sed 's/ //g' remove all spaces
grep -P ... -o
-P interpret the given pattern as a perl regexp
(?<=\"loc\":").*?(?=\")
(?<=\"loc\":") regex lookbehind
.*? capture the longitude and latitude part with non-greedy search
(?=\") regex lookahead
-o get only the matching part, which would be e.g. 39.9690,-83.0114
tr ',' ' ' replace , with space
Finally we got something like this: 39.9690 -83.0114
Putting it in parentheses lets us create an array with two values in it (cf. ${coordinates[...]}).
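An offline version of the same pipeline (printf stands in for the curl call, and the sample JSON is a trimmed stand-in; grep -P requires GNU grep built with PCRE support):

```shell
# Trimmed stand-in for the ipinfo.io response.
json='{"ip":"33.62.137.111","loc":"39.9690,-83.0114","city":"Columbus"}'

# Strip spaces, extract the loc value via lookbehind/lookahead, split on ','.
coordinates=($(printf '%s' "$json" | sed 's/ //g' | grep -Po '(?<="loc":").*?(?=")' | tr ',' ' '))

echo "${coordinates[@]}"   # 39.9690 -83.0114
echo "${coordinates[0]}"   # 39.9690
echo "${coordinates[1]}"   # -83.0114
```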

How to extract text with sed or grep and regular expression json

Hello, I am using curl to get some info which I need to clean up.
This is from curl command:
{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707, etc..
I would need to get "Ita" as output, that is, the first three letters of the country.
After reading "sed JSON regular expression" I tried to adapt it, resulting in
sed -e 's/^.*"country":"[a-zA-Z]{3}".*$/\1/
but this won't work.
Can you please help?
Using jq, you can do:
curl .... | jq -r '.country[0:3]'
If you need to set the country to the first 3 chars,
jq '.country = .country[0:3]'
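jq's slice syntax works on strings as well as arrays; a quick way to check, with a canned response in place of the curl call:

```shell
# "Italy"[0:3] -> "Ita"; -r prints it without quotes.
printf '%s' '{"country":"Italy","city":"Milan"}' | jq -r '.country[0:3]'
# prints: Ita
```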
some fairly advanced bash:
{
read country
read city
} < <(
curl ... |
jq -r '.country[0:3], .city[0:3]'
)
Then:
$ echo "$country $city"
Ita Mil

Bash: Need to replace different email addresses within a file

I'm trying to mask PII in a file (.json).
The file contains different email addresses and I would like to change them with other different email addresses.
For example:
"results":
[{ "email1@domain1.com",
"email2@domain2.com",
"email3@domain3.com",
"email4@domain4.com",
"email5@domain5.com" }]
I need to change them to:
"results":
[{ "mockemail1@mockdomain1.com",
"mockemail2@mockdomain2.com",
"mockemail3@mockdomain3.com",
"mockemail4@mockdomain4.com",
"mockemail5@mockdomain5.com" }]
Using sed and regex I have been able to change the addresses to one of the mock email addresses, but I would like to change each email to a different mock email.
The mock email addresses are stored in a file. To get a random address I use:
RandomEmail=$(shuf -n 1 Mock_data.csv | cut -d "|" -f 3)
Any ideas? Thanks!
input.json
You've got your JSON file (add an extra newline at the end, which does not appear in this example, or bash's read builtin won't process the final line):
"results":
[{ "email1@domain1.com",
"email2@domain2.com",
"email3@domain3.com",
"email4@domain4.com",
"email5@domain5.com" }]
substitutions.txt
(add an extra newline at the end here as well)
domain1.com;mockdomain1.com
domain2.com;mockdomain2.com
domain3.com;mockdomain3.com
domain4.com;mockdomain4.com
domain5.com;mockdomain5.com
script.sh
#!/bin/bash
while read _line; do
  unset _ResultLine
  while read _subs; do
    _strSearch=$(echo "$_subs" | cut -d";" -f1)
    _strReplace=$(echo "$_subs" | cut -d";" -f2)
    if [ "$(echo "$_line" | grep "@$_strSearch")" ]; then
      echo "$_line" | awk -F"\t" -v strSearch="$_strSearch" -v strReplace="$_strReplace" \
        '{sub(strSearch,strReplace); print $1}' >> output.json
      _ResultLine="ok"
    fi
  done < substitutions.txt
  [ "$_ResultLine" != "ok" ] && echo "$_line" >> output.json
done < input.json
output.json
"results":
[{ "email1@mockdomain1.com",
"email2@mockdomain2.com",
"email3@mockdomain3.com",
"email4@mockdomain4.com",
"email5@mockdomain5.com" }]
I saved the first file with emailX@domainX.com to /tmp/1. I created a file /tmp/2 with the mock emails:
mockemail1@mockdomain1.com
mockemail2@mockdomain2.com
mockemail3@mockdomain3.com
mockemail4@mockdomain4.com
mockemail5@mockdomain5.com
First I extract the list of email addresses from /tmp/1 and shuffle the mock emails. Then I join the emails with the shuffled mock emails column-wise using paste. Then I convert each line from the format "email mockemail" into the sed argument s/email/mockemail/; and pass the whole thing to sed, which substitutes each email with a random mock email, reading /tmp/1 on stdin.
sed "$(paste <(cat /tmp/1 | sed -n '/@/{s/.*"\(.*@.*\.com\)".*/\1/;/^$/d;p;}') <(shuf /tmp/2) | sed 's#\(.*\)\t\(.*\)#s/\1/\2/#' | tr '\n' ';')" </tmp/1
This produces:
"results":
[{ "mockemail1@mockdomain1.com",
"mockemail3@mockdomain3.com",
"mockemail5@mockdomain5.com",
"mockemail4@mockdomain4.com",
"mockemail2@mockdomain2.com" }]
Given these input files:
$ cat file1
"results":
[{ "email1@domain1.com",
"email2@domain2.com",
"email3@domain3.com",
"email4@domain4.com",
"email5@domain5.com" }]
$ cat file2
foo|bar|mockemail1@mockdomain1.com|etc
foo|bar|mockemail2@mockdomain2.com|etc
foo|bar|mockemail3@mockdomain3.com|etc
foo|bar|mockemail4@mockdomain4.com|etc
foo|bar|mockemail5@mockdomain5.com|etc
all you need is:
$ shuf file2 | awk 'NR==FNR{a[NR]=$3;next} /@/{$2=a[++c]} 1' FS='|' - FS='"' OFS='"' file1
"results":
[{ "mockemail2@mockdomain2.com",
"mockemail4@mockdomain4.com",
"mockemail5@mockdomain5.com",
"mockemail1@mockdomain1.com",
"mockemail3@mockdomain3.com" }]
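The awk idiom deserves a short note: NR==FNR is true only while reading the first input (here stdin, -, fed by shuf), so the third |-separated field of each of those lines is stored; then file1 is split on double quotes, and any line containing @ gets its second field (the address) replaced with the next stored mock email. A deterministic two-line sketch (cat in place of shuf, so the pairing is predictable):

```shell
cat > file1 <<'EOF'
"results":
[{ "email1@domain1.com",
"email2@domain2.com" }]
EOF
cat > file2 <<'EOF'
foo|bar|mockemail1@mockdomain1.com|etc
foo|bar|mockemail2@mockdomain2.com|etc
EOF

# First pass (stdin, FS='|'): stash the mock addresses.
# Second pass (file1, FS='"'): swap field 2 of each @-line, rebuild with OFS='"'.
cat file2 | awk 'NR==FNR{a[NR]=$3;next} /@/{$2=a[++c]} 1' FS='|' - FS='"' OFS='"' file1
# "results":
# [{ "mockemail1@mockdomain1.com",
# "mockemail2@mockdomain2.com" }]
```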
Quick and dirty implementation with python:
hypothesis:
You have a wellformed JSON input:
{
"results":
[
"email1@domain1.com",
"email2@domain2.com",
"email3@domain3.com",
"email4@domain4.com",
"email5@domain5.com"
]
}
you can validate your JSON at this address https://jsonformatter.curiousconcept.com/
code:
import json
import sys

input_message = sys.stdin.read()
json_dict = json.loads(input_message)
results = []
for elem in json_dict['results']:
    results.append("mock" + elem)
results_dict = {}
results_dict['results'] = results
print(json.dumps(results_dict))
command:
$ echo '{"results":["email1@domain1.com","email2@domain2.com","email3@domain3.com","email4@domain4.com","email5@domain5.com"]}' | python jsonConvertor.py
{"results": ["mockemail1@domain1.com", "mockemail2@domain2.com", "mockemail3@domain3.com", "mockemail4@domain4.com", "mockemail5@domain5.com"]}
A friend of mine suggested the following elegant solution that works in two parts:
Substitute email addresses with a string.
sed -E -i 's/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b/EMAIL_TO_REPLACE/g' data.json
Iterate the file, and on each iteration substitute the 1st appearance of the string with a random email from the file:
for email in $(egrep -o EMAIL_TO_REPLACE data.json) ; do
sed -i '0,/EMAIL_TO_REPLACE/s//'"$(shuf -n 1 Mock_data.csv | cut -d "|" -f 3)"'/' data.json ;
done
And that's it.
Thanks Elina!
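The 0,/regex/ address form used above is worth a note: it is a GNU sed extension whose range ends at the first line matching the regex, and the empty s// pattern reuses that same regex, so only the first placeholder is rewritten per pass. A minimal sketch (mock1@mockdomain1.com is a made-up replacement):

```shell
# Only the first EMAIL_TO_REPLACE is substituted; the second survives.
printf 'EMAIL_TO_REPLACE\nEMAIL_TO_REPLACE\n' |
sed '0,/EMAIL_TO_REPLACE/s//mock1@mockdomain1.com/'
# mock1@mockdomain1.com
# EMAIL_TO_REPLACE
```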
