How to properly remove new lines from JSON parsed data in bash

How to properly remove new lines from JSON parsed data in bash - bash

I want to parsing JSON data using jq(as described here ) and delete any newlines character from the resulting string.
I've already tried to use tr but this approach remove also all the white spaces between parsed values.
My code:
IP=$(curl -s https://ipinfo.io/ip) # Get ip address
curl -s https://ipinfo.io/${IP}/geo | jq -r '.ip, .city, .country' | tr -d '\n' # parse only few values from the JSON data and remove new lines.
What i get with the code above is the following string:
XXX.XXX.XXX.XXXCity_NameCountry_Name but i want something like this:
XXX.XXX.XXX.XXX City_Name Country_Name

You could craft a single string from the three pieces of data so that it appears on a single line (single result by input) :
IP=$(curl -s https://ipinfo.io/ip)
curl -s https://ipinfo.io/${IP}/geo | jq -r '.ip + " " + .city + " " + .country'
> myIp myCity myCountryCode
Another option for a different but similar output format would be to use the #csv output format, where you would want to output an array of cells for each input :
IP=$(curl -s https://ipinfo.io/ip)
curl -s https://ipinfo.io/${IP}/geo | jq -r '[.ip, .city, .country] | #csv'
> "myIp","myCity","myCountryCode"
This result could be easily worked on from any spreadsheet software.

Related

Create variables base on cURL response - Bash

I'm trying to create 2 variables via bash $lat, $long base on the result of my curl response.
curl ipinfo.io/33.62.137.111 | grep "loc" | awk '{print $2}'
I got.
"42.6334,-71.3162",
I'm trying to get
$lat=42.6334
$long=-71.3162
Can someone give me a little push ?

IFS=, read -r lat long < <(
curl -s ipinfo.io/33.62.137.111 |
jq -r '.loc'
)
printf 'Latitude is: %s\nLongitude is: %s\n' "$lat" "$long"
The ipinfo.io API is returning JSON data, so let parse it with jq:
Here is the JSON as returned by the query from your sample:
{
"ip": "33.62.137.111",
"city": "Columbus",
"region": "Ohio",
"country": "US",
"loc": "39.9690,-83.0114",
"postal": "43218",
"timezone": "America/New_York",
"readme": "https://ipinfo.io/missingauth"
}
We are going to JSON query the loc entry from the main root object ..
curl -s ipinfo.io/33.62.137.111: download the JSON data -s silently without progress.
jq -r '.loc': Process JSON data, query the loc entry of the main object and -r output raw string.
IFS=, read -r lat long < <(: Sets the Internal Field Separator to , and read both lat and long variables from the following command group output stream.

Although the answer from #LeaGris is quite interesting, if you don't want to use an external library or something, you can try this:
Playground: https://repl.it/repls/ThoughtfulImpressiveComputer
coordinates=($(curl ipinfo.io/33.62.137.111 | sed 's/ //g' | grep -P '(?<=\"loc\":").*?(?=\")' -o | tr ',' ' '))
echo "${coordinates[#]}"
echo ${coordinates[0]}
echo ${coordinates[1]}
Example output:
39.9690 -83.0114 # echo "${coordinates[#]}"
39.9690 # ${coordinates[0]}
-83.0114 # ${coordinates[1]}
Explanation:
curl ... get the JSON data
sed 's/ //g' remove all spaces
grep -P ... -o
-P interpret the given pattern as a perl regexp
(?<=\"loc\":").*?(?=\")
(?<=\"loc\":") regex lookbehind
.*? capture the longitude and latitude part with non-greedy search
(?=\") regex lookahead
-o get only the matching part which'ld be e.g. 39.9690,-83.0114
tr ',' ' ' replace , with space
Finally we got something like this: 39.9690 -83.0114
Putting it in parentheses lets us create an array with two values in it (cf. ${coordinates[...]}).

How to extract text with sed or grep and regular expression json

Hello I am using curl to get some info which I need to clean up.
This is from curl command:
{"ip":"000.000.000.000","country":"Italy","city":"Milan","longitude":9.1889,"latitude":45.4707, etc..
I would need to get "Ita" as output, that is the first three letter of the country.
After reading sed JSON regular expression i tried to adapt resulting in
sed -e 's/^.*"country":"[a-zA-Z]{3}".*$/\1/
but this won't work.
Can you please help?

Using jq, you can do:
curl .... | jq -r '.country[0:3]'
If you need to set the country to the first 3 chars,
jq '.country = .country[0:3]'
some fairly advanced bash:
{
read country
read city
} < <(
curl ... |
jq -r '.country[0:3], .city[0:3]'
)
Then:
$ echo "$country $city"
Ita Mil

Optimising my script code for GNU parallels

I have a script which queries successfully an API, but is very slow. It will take around 16 hours to get all the resources. I looked at how I could optimise it, and I thought that using GNU parallels (installed on macos via Brew, version 20180522) would do the trick. But even with using 90 jobs (the API endpoints authorizes 100 connections max), my script is not faster. I'm not sure why.
I call my script like so:
bash script.sh | parallel -j90
The script is the following:
#!bin/bash
# This script downloads the list of French MPs who contributed to a specific amendment.
# The script is initialised with a file containing a list of API URLs, each pointing to a resource describing an amendment
# The main function loops over 3 actions:
# 1. assign to $sign the API url that points to the list of amendment authors
# 2. run the functions auteur and cosignataires and save them in their respective variables
# 3. merge the variable contents and append them as a new line into a csv file
main(){
local file="${1}"
local line
local sign
local auteur_clean
local cosign_clean
while read line
do
sign="${line}/signataires"
auteur_clean=$(auteur $sign)
cosign_clean=$(cosignataires $sign)
echo "${auteur_clean}","${cosign_clean}" >> signataires_15.csv
done < "${file}"
}
# The auteur function takes the $sign variable as an input and
# 1. filters the json returned by the API to get only the author's ID
# 2.use the ID stored in $auteur to query the full author resource and capture the key info, which is then assigned to $auteur_nom
# 3. echo a cleaned version of the info stored in $auteur_nom
auteur(){
local url="${1}"
local auteur
local auteur_nom
auteur=$(curl -s "${url}" | jq '.signataires[] | select(.relation=="auteur") | .id') \
&& auteur_nom=$(curl -s "https://www.parlapi.fr/rest/an/acteurs_amendements/${auteur}" \
| jq -r --arg url "https://www.parlapi.fr/rest/an/acteurs_amendements/${auteur}" '$url, .amendement.id, .acteur.id, (.acteur.prenom + " " + .acteur.nom)') \
&& echo "${auteur_nom}" | tr '\n' ',' | sed 's/,$//'
}
# The cosignataires function takes the $sign variable as an input and
# 1. filter the json returned by the API to produce a space separated list of co-authors
# 2. iterates over list of coauthors to get their name and surname, and assign the resulting list to $cosign_nom
# 3. echo a semi-colon separated list of the co-author names
cosignataires(){
local url="${1}"
local cosign
local cosign_nom
local i
cosign=$(curl -s "${url}" | jq '.signataires[] | select(.relation=="cosignataire") | .id' | tr '\n' ' ') \
&& cosign_nom=$(for i in ${cosign}; do curl -s "https://www.parlapi.fr/rest/an/acteurs_amendements/${i}" | jq -r '(.acteur.prenom + " " + .acteur.nom)'; done) \
&& echo "${cosign_nom}" | tr '\n' ';' | sed 's/,$//'
}
main "url_amendements_15.txt"
and the content of url_amendements_15.txt looks like so:
https://www.parlapi.fr/rest/an/amendements/AMANR5L15SEA717460BTC0174P0D1N7
https://www.parlapi.fr/rest/an/amendements/AMANR5L15PO59051B0490P0D1N90
https://www.parlapi.fr/rest/an/amendements/AMANR5L15PO59051B0490P0D1N134
https://www.parlapi.fr/rest/an/amendements/AMANR5L15PO59051B0490P0D1N187
https://www.parlapi.fr/rest/an/amendements/AMANR5L15PO59051B0490P0D1N161

Your script loops over a list of URLs and queries them sequentially. You need to break it up so each API query is done separately, that way parallel will have commands it can execute in parallel.
Change the script so it takes a single URL. Get rid of the main while loop.
main() {
local url=$1
local sign
local auteur_clean
local cosign_clean
sign=$url/signataires
auteur_clean=$(auteur "$sign")
cosign_clean=$(cosignataires "$sign")
echo "$auteur_clean,$cosign_clean" >> signataires_15.csv
}
Then pass url_amendements_15.txt to parallel. Give it the list of URLs that can be processed in parallel.
parallel -j90 script.sh < url_amendements_15.txt

How do i print multiple variables in separate lines into a file using shell scripting

i need to select string from one csv file to another properties file using shell
project.csv - this is the file which contains data like below & this may contain N number of lines/data
PN549,projects.pn549
SaturnTV_SW,projects.saturntv_sw
Need to collect each string "pn549" , "saturntv_sw" into a properties file
properties
[projects]
pn549_pt=pn549
saturntv_sw_pt=saturntv_sw
Below is the code i wrote to fetch the string and to print
cat "project.csv" | while IFS='' read -r line; do
Display_Name="$(echo "$line" | cut -d ',' -f 1 | tr -d '"')"
project_name="$(echo "$TEMP_Name" | cut -d '.' -f 2)"
echo "$project_name"
echo "$project_name"_pt="$project_name" > /opt/properties
How do i print multiple lines like i gave in example(properties)

i have got my answer, simply redirected the output

Remove Leading Spaces from a variable in Bash

I have a script that exports a XML file to my desktop and then extracts all the data in the "id" tags and exports that to a csv file.
xmlstarlet sel -t -m '//id[1]' -v . -n </users/$USER/Desktop/List.xml > /users/$USER/Desktop/List2.csv
I then use the following command to add commas after each number and store it as a variable.
devices=$(sed "s/$/,/g" /users/$USER/Desktop/List2.csv)
If I echo that variable I get an output that looks like this:
123,
124,
125,
etc.
What I need help with is removing those spaces so that output will look like 123,124,125 with no leading space. I've tried multiple solutions that I can't get to work. Any help would be amazing!

If you don't want newlines, don't tell xmlstarlet to put them there in the first place.
That is, change -n to -o , to put a comma after each value rather than a newline:
{ xmlstarlet sel -t -m '//id[1]' -v . -o ',' && printf '\n'; } \
<"/users/$USER/Desktop/List.xml" \
>"/users/$USER/Desktop/List2.csv"
The printf '\n' here puts a final newline at the end of your CSV file after xmlstarlet has finished writing its output.
If you don't want the trailing , this leaves on the output file, the easiest way to be rid of it is to read the result of xmlstarlet into a variable and manipulate it there:
content=$(xmlstarlet sel -t -m '//id[1]' -v . -o ',' <"/users/$USER/Desktop/List.xml")
printf '%s\n' "${content%,}" >"/users/$USER/Desktop/List2.csv"

For a sed solution, try
sed ':a;N;$!ba;y/\n/,/' /users/$USER/Desktop/List2.csv
or if you want a comma even after the last:
sed ':a;N;$!ba;y/\n/,/;s/$/,/' /users/$USER/Desktop/List2.csv
but then more easy would be
cat /users/$USER/Desktop/List2.csv | tr "\n" ","

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How to properly remove new lines from JSON parsed data in bash - bash

Related

Create variables base on cURL response - Bash

How to extract text with sed or grep and regular expression json

Optimising my script code for GNU parallels

How do i print multiple variables in separate lines into a file using shell scripting

Remove Leading Spaces from a variable in Bash

Categories

Resources