How do I extract the content of quoted strings from the output of a shell command

The following shell command returns an output with 3 items:
cred="$(aws sts assume-role --role-arn arn:aws:iam::01234567899:role/test --role-session-name s3-access-example --query '[Credentials.AccessKeyId, Credentials.SecretAccessKey, Credentials.SessionToken]')"
echo $cred returns the following output:
[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]
How do I retrieve the value between double quotes? For example, trttr435
How can I achieve this? Should I use a regex, or are there other options?

IFS=', ' credArray=(`echo "$cred" | tr -d '"[]'`)
Simple as ... that
Testing
cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'
IFS=', ' credArray=(`echo "$cred" | tr -d '"[]'`)
for i in "${credArray[@]}"; do echo "[$i]"; done
echo "2nd parameter is ${credArray[1]}"
Output
[ASRDTDRSIJGISGDT]
[trttr435]
[DF/////eraesr43]
2nd parameter is trttr435
Tested on Mac OS bash and CentOS bash

I didn't quite catch whether the [ and ] are part of $cred, or what your expected output is, but this returns everything between double quotes:
$ awk '{while(match($0,/"[^"]+"/)){print substr($0,RSTART+1,RLENGTH-2);$0=substr($0,RSTART+RLENGTH)}}' file
ASRDTDRSIJGISGDT
trttr435
DF/////eraesr43
You will probably want to pipe the variable in instead of reading a file:
$ echo "$cred" | awk ... # add above script here
Edit: If you just want to get the quoted string from second field ($2):
$ awk -F, '{match($2,/"[^"]+"/);print substr($2,RSTART+1,RLENGTH-2)}' file
trttr435
or even:
$ awk -F, '{gsub(/^[^"]+"|"[^"]*$/,"",$2);print $2}' file

Or use Python, because the content of cred is already a valid Python list literal:
#!/bin/bash
cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'
python-script() {
local INDEX=$1
echo "arr=$cred"
echo "print(arr[$INDEX])"
}
item() {
local INDEX=$1
python-script "$INDEX" | python
}
echo "item1=$(item 1)"
echo "item2=$(item 2)"

Another crude but effective way of extracting the values you need would be to use awk with " as the split delimiter. The valid positions, in this case, would be $2, $4, $6
OUT="[ \"ASRDTDRSIJGISGDT\", \"trttr435\", \"DF/////eraesr43\" ]"
echo $OUT | awk -F '"' '{print $4}'
I would advise you to use python if you need to do a lot of string parsing.
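The $2, $4, $6 pattern above generalizes: with " as the separator, the n-th quoted string sits in field 2*n. A minimal sketch of that idea, assuming the whole list is on one line and no value contains an escaped quote; nth_quoted is a hypothetical helper name, not something from the answers above:

```shell
# nth_quoted TEXT N: print the N-th double-quoted string in TEXT (1-based).
# Hypothetical helper; assumes one line of input and no escaped quotes.
nth_quoted() {
  printf '%s\n' "$1" | awk -F'"' -v n="$2" '{ print $(2 * n) }'
}

cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'
nth_quoted "$cred" 2   # prints trttr435
```

Passing the index with -v keeps the awk program in single quotes, so nothing from the shell is spliced into the awk source.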

Related

Using SED to substitute regex match with variable value

I have a file with following lines:
2022-Nov-23
2021-Jul-14
I want to replace the month with its number. My script should accept the date as an argument, and I added these variables to it:
Jan=01
Feb=02
Mar=03
Apr=04
May=05
Jun=06
Jul=07
Aug=08
Sep=09
Oct=10
Nov=11
Dec=12
How can I match the month name in the string with a regex and substitute it based on the variables? Here is what I have for now:
echo "$1" | sed 's/(\w{3})/${\1}/'
But it doesn't work.

With a file called months containing:
Jan=01
Feb=02
Mar=03
Apr=04
May=05
Jun=06
Jul=07
Aug=08
Sep=09
Oct=10
Nov=11
Dec=12
And a script:
#!/bin/sh
sub() (
set -a
. "${0%/*}/months"
awk -F- -vOFS=- '{ $2 = ENVIRON[$2]; print }'
)
printf 2022-Nov-23 | sub
printf 2021-Jul-14 | sub
The output is:
2022-11-23
2021-07-14
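If a separate months file is not wanted, the lookup table could also live inside awk itself. This is only a sketch under the assumption that the dates are dash-separated and use the English three-letter abbreviations from the question; month_to_num is a hypothetical function name:

```shell
# month_to_num: read YYYY-Mon-DD lines on stdin, print them as YYYY-MM-DD.
# Hypothetical helper; lines whose second field is not a known month pass through unchanged.
month_to_num() {
  awk -F- -v OFS=- '
    BEGIN {
      # Build the table Jan -> 01, Feb -> 02, ... from the position in the list.
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= n; i++) num[names[i]] = sprintf("%02d", i)
    }
    { if ($2 in num) $2 = num[$2]; print }
  '
}

printf '2022-Nov-23\n2021-Jul-14\n' | month_to_num
```

With the two sample dates this should print 2022-11-23 and 2021-07-14, the same result as the ENVIRON version.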

You might convert your data into a sed script, that is, create a file, say mon2num.sed, with the following content:
s/Jan/01/
s/Feb/02/
s/Mar/03/
s/Apr/04/
s/May/05/
s/Jun/06/
s/Jul/07/
s/Aug/08/
s/Sep/09/
s/Oct/10/
s/Nov/11/
s/Dec/12/
and having file.txt with content as follows
2022-Nov-23
2021-Jul-14
you might do
sed -f mon2num.sed file.txt
which gives output
2022-11-23
2021-07-14
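The sed script does not have to be typed by hand; it could be generated from the Name=NN assignments in the question. A rough sketch, assuming every input line has exactly that Name=NN shape; months_to_sed is a hypothetical helper name:

```shell
# months_to_sed: read Name=NN lines on stdin, write one sed "s" command per line.
# Hypothetical helper; assumes the Name=NN format shown in the question.
months_to_sed() {
  sed 's|^\([A-Za-z]*\)=\([0-9]*\)$|s/\1/\2/|'
}

# Build a two-entry script in a variable and apply it to the sample dates.
script=$(printf 'Jul=07\nNov=11\n' | months_to_sed)
printf '2022-Nov-23\n2021-Jul-14\n' | sed -e "$script"
```

Redirecting the output of months_to_sed into mon2num.sed would give a file usable with sed -f as shown above.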

extract string from another using awk

I have this variable, which contains a list of strings separated by spaces:
val=00:21:5D:16:F3 00:21:5D:16:F4 00:21:5D:16:F5
I want to extract each space-separated string and assign it to a numbered variable (val1, val2, ...).
I use this shell code, but it doesn't work:
while [ "$((i++))" != "10" ]; do
val$i=`echo $val | awk '{print $i}'`
echo "val$i=$val$i"
done
the desired result is :
val1="00:21:5D:16:F3"
val2="00:21:5D:16:F4"
val3="00:21:5D:16:F5"
val4=""
val5=""
val6=""
val7=""
val8=""
val9=""
val10=""
any help is appreciated even if the treatment is done with another linux utility like cut , sed , grep.

This awk script should be what you are looking for:
awk -F[' '=] 'BEGIN{t=1} { for (i=2;i<=11;i++) {print "val" t "=\"" $i "\""; t+=1}}' test
Here is the output:
system1:/depot/scripts/sh # awk -F[' '=] 'BEGIN{t=1} { for (i=2;i<=11;i++) {print "val" t "=\"" $i "\""; t+=1}}' test
val1="00:21:5D:16:F3"
val2="00:21:5D:16:F4"
val3="00:21:5D:16:F5"
val4=""
val5=""
val6=""
val7=""
val8=""
val9=""
val10=""
system:/depot/scripts/sh #
test file contains:
system:/depot/scripts/sh # cat test
val=00:21:5D:16:F3 00:21:5D:16:F4 00:21:5D:16:F5
system:/depot/scripts/sh #
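The same loop can also read the list straight from a shell variable instead of a file. A small sketch under that assumption; print_vals is a hypothetical helper, and the count is passed in rather than fixed at 10:

```shell
# print_vals LIST COUNT: print val1..valCOUNT, one per line, from a space-separated LIST.
# Hypothetical helper; fields beyond the end of the list come out as empty strings.
print_vals() {
  printf '%s\n' "$1" | awk -v n="$2" '{
    for (i = 1; i <= n; i++) printf "val%d=\"%s\"\n", i, $i
  }'
}

val="00:21:5D:16:F3 00:21:5D:16:F4 00:21:5D:16:F5"
print_vals "$val" 10
```

This only prints the assignments; it does not create shell variables named val1 to val10.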

Thank you for your help. I want to share the best solution that I found:
i=0
while [ "$((i++))" != "10" ]; do
# val$i=... is not a valid assignment on its own, so build the command with eval
eval "val$i=\$(echo \"\$val\" | awk '{print \$$i}')"
eval "echo \"val$i=\$val$i\""
done

I know this is not what you really asked, but what about using an array to solve this?
For example:
val=(00:21:5D:16:F3 00:21:5D:16:F4 00:21:5D:16:F5)
$ echo ${val[0]}
00:21:5D:16:F3
$ echo ${val[1]}
00:21:5D:16:F4
$ echo ${val[2]}
00:21:5D:16:F5
$ echo ${val[3]}

Convert data from a simple JSON format to a DSV format

I have a file in Unix, with data sample like the following:
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
The desired output is
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexico
456|Americas|Canada
567|APAC|Japan
I tried with a few sed commands. I could remove the following: '{', '}', ' " ', ':'
There are 2 issues with the output file
All rows from input appear in single line in the output.
Adding the pipe ('|') as delimiter.
Any pointers are highly appreciated.

I recommend the tool jq (http://stedolan.github.io/jq/); jq is a lightweight and flexible command-line JSON processor.
jq -r '"\(.ID)|\(.Region)|\(.Location)"' < infile
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation
-r is --raw-output

Through awk,
awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
Example:
$ cat file
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
$ awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation:
-F'"' sets " as the field separator.
OFS="|" sets | as the output field separator.
awk first executes the BEGIN block, before reading any input. Here it prints the header line.
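With " as the separator, the keys land in fields 2, 6 and 10 just as the values land in 4, 8 and 12, so the header could be taken from the first record instead of being hard-coded. A sketch only, assuming every line has exactly three string-valued keys in the same order; json_lines_to_dsv is a hypothetical name:

```shell
# json_lines_to_dsv: read one flat JSON object per line on stdin, print pipe-separated rows.
# Hypothetical helper; assumes exactly three "key":"value" pairs per line, in a fixed order.
json_lines_to_dsv() {
  awk -F'"' -v OFS='|' '
    NR == 1 { print $2, $6, $10 }   # keys of the first record become the header
    { print $4, $8, $12 }           # values of every record
  '
}

printf '%s\n' \
  '{"ID":"123", "Region":"Asia", "Location":"India"}' \
  '{"ID":"234", "Region":"APAC", "Location":"Australia"}' | json_lines_to_dsv
```

Like the original one-liner, this is field counting rather than JSON parsing, so it breaks as soon as a value contains a double quote or the key order changes.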

This sed one-liner does what you want. It's capturing the field values using parenthesized expressions, and then putting them into the output using \1, \2, and \3.
s/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/
Invoke it like:
$ sed -f one-liner.sed input.txt
Or you can invoke it within a Bash script, producing the header:
echo 'ID|Region|Location'
sed -e 's/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/' $input

It is a JSON file so it is best to use a JSON parser. Here is a perl implementation of it.
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
open my $fh, '<', 'path/to/your/file' or die "Cannot open file: $!";
# keys of your structure
my @key = qw(ID Region Location);
print join ("|", @key), "\n";
# iterate over your file, decode it and print in order of your key structure
while (my $json = <$fh>) {
my $text = decode_json($json);
print join ("|", map { $$text{$_} } @key ),"\n";
}
Output:
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan

Using sed as follows
Command line
echo "my_string" |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
or
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g' my_file
I tried this in a terminal as follows:
echo '{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}' |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
Output
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan

Many thanks for your responses; the pointers and solutions helped a lot.
For some mysterious reason, I couldn't get any of the sed commands to work, so I devised my own solution. It's not elegant, but it works.
Here is the script I prepared, which resolved the issue.
#!/bin/bash
# source file path.
infile=/home/exfile.txt
# remove these temp files if they already exist.
rm -f ./efile.txt ./xfile.txt ./yfile.txt ./zfile.txt
# removing the curly braces from input file.
cat exfile.txt | cut -d "{" -f2 | cut -d "}" -f1 >> ./efile.txt
# setting input file name to different value.
infile=./efile.txt
# remove double quotes from the file.
while IFS= read -r line
do
echo $line | sed 's/\"//g' >> ./xfile.txt
done < "$infile"
# creating another temp file.
infile2=./xfile.txt
# remove colon from file.
while IFS= read -r line
do
echo $line | sed 's/\:/,/g' >> ./yfile.txt
done < "$infile2"
# set input file path to new temp file.
infile3=yfile.txt
# initialize variables to hold header column values.
t1=0
t3=0
t5=0
# read each of the line to extract header row. Exit loop after reading 1st row.
once=1
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
echo "$f1 $f2 $f3 $f4 $f5 $f6"
t1=$f1
t3=$f3
t5=$f5
if [ "$once" -eq 1 ]; then
break
fi
done < "$infile3"
# Read each of the line from input file. Write only the value to another output file.
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
echo "$f2|$f4|$f6" >> ./zfile.txt
done < "$infile3"
# insert the header column row into the file generated in the step above.
frstline="$t1|$t3|$t5"
sed -i '1i ID|Region|Location' ./zfile.txt

How to use sed to extract a string [duplicate]

I need to extract a number from the output of a command: cmd. The output is type: 1000
So my question is how to execute the command, store its output in a variable and extract 1000 in a shell script. Also how do you store the extracted string in a variable?

This question has been answered in pieces here before, it would be something like this:
line=$(sed -n '2p' myfile)
echo "$line"
if echo "$line" | grep -q 'type: 1000'; then
echo "It's there!"
fi
Store output of sed into a variable
String contains in Bash
EDIT: sed is very limited, you would need to use bash, perl or awk for what you need.

This is a typical use case for grep:
output=$(cmd | grep -o '[0-9]\+')
You can store the output of a command, or even of a whole pipeline, in a shell variable using so-called command substitution:
variable=$(cmd);
In the comments it turned out that the output of cmd contains more lines than just type : 1000. In this case I would suggest sed:
output=$(cmd | sed -n 's/type : \([0-9]\+\)/\1/p;q')

You tagged your question as sed but your question description does not restrict other tools, so here's a solution using awk.
output=`cmd | awk -F':' '/type: [0-9]+/{print $2}'`
Alternatively, you can use the newer $( ) syntax. Some find the newer syntax preferable, and it can be conveniently nested without the need for escaping backticks. Note that a shell assignment must not have spaces around the = sign.
output=$(cmd | awk -F':' '/type: [0-9]+/{print $2}')

If the output is rigidly restricted to "type: " followed by a number, you can just use cut.
var=$(echo 'type: 1000' | cut -f 2 -d ' ')
Obviously you'll have to pipe the output of your command to cut, I'm using echo as a demo.
In addition, I'd use grep and then cut if the string you are searching is more complex. If we assume there can be all kinds of numbers in the text, but only one occurrence of "type: " followed by a number, you can use the command:
>> var=$(echo "hello 12 type: 1000 foo 1001" | grep -oE "type: [0-9]+" | cut -f 2 -d ' ')
>> echo $var
1000

You can use the | operator to send the output of one command to another, like so:
printf ' 1\n 2\n 3\n' | grep "2"
This sends three lines (" 1", " 2" and " 3") to the grep command, which prints the line containing 2. printf is used here because echo does not interpret \n in every shell. It sounds like you might want to do something like:
cmd | grep "type"

Here is a plain sed solution that uses a regular expression to find the number in your string:
cmd | sed 's/^.*type: \([0-9]\+\)/\1/g'
^ anchors the match at the start of the line
.* matches any sequence of characters (including none)
\([0-9]\+\) captures one or more digits
\1 refers to the first captured group, which here replaces the whole matched text
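To store the number in a variable when the command prints several lines, the substitution can be combined with -n and the p flag, because sed without -n also prints the lines that do not match. A minimal sketch; the cmd function here is a hypothetical stand-in for the real command, and its three output lines are made up for the demonstration:

```shell
# cmd is a hypothetical stand-in for the real command; its output is invented for this demo.
cmd() {
  printf 'name: demo\ntype: 1000\nsize: 42\n'
}

# -n suppresses the default printing; the p flag prints only lines where the substitution happened.
number=$(cmd | sed -n 's/^.*type: \([0-9][0-9]*\).*$/\1/p')
echo "$number"   # prints 1000
```

[0-9][0-9]* is used instead of [0-9]\+ only because \+ is a GNU extension in basic regular expressions.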

currency parsing and conversion using shell commands

I'm looking for a shell one-liner that will parse the following example currency string PHP10000 into $245. I need to parse the number from the string, multiply it with a preset conversion factor then add a "$" prefix to the result.
So far, what I have is only this:
echo PHP10000 | sed -e 's/PHP//'
which gives 10000 as result.
Now, I'm stuck on how to do multiplication on that result.
I'm thinking awk could also give a solution to this but I'm a beginner at shell commands.
Update:
I tried:
echo PHP10000 | expr `sed -e 's/PHP//'` \* 2
and the multiplication works properly only on whole numbers. I can't use floating point numbers as it gives me this error: expr: not a decimal number: '2.1'.

value=PHP10000
factor=40.82
printf -v converted '$%.2f' "$(bc -l <<< "${value#PHP} / $factor")"
echo "$converted" # => $244.98
the ${value#PHP} part is a parameter expansion that removes the PHP string from the front of the $value string
the <<< part is a bash here-string, so you're passing the formula to the bc program
bash does not do floating point arithmetic, so call bc to perform the calculation; the -l option makes bc keep decimal places (without it the division is truncated to an integer)
printf -v varname is the equivalent of other languages' varname = sprintf(...)
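Where bc is not installed, the same parameter expansion can hand the number to awk, which does floating point arithmetic itself. A sketch under stated assumptions: php_to_usd is a hypothetical helper, and the second argument is a multiplication factor (0.0245 is the rate implied by the question's PHP10000 to $245 example, not a real exchange rate):

```shell
# php_to_usd STRING RATE: strip a leading "PHP" and multiply the amount by RATE.
# Hypothetical helper; RATE is an assumed multiplication factor supplied by the caller.
php_to_usd() {
  awk -v a="${1#PHP}" -v r="$2" 'BEGIN { printf "$%.2f\n", a * r }'
}

php_to_usd PHP10000 0.0245   # prints $245.00
```

Note that this multiplies by a rate, whereas the bc example above divides by a factor; either convention works as long as it is applied consistently.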

One way:
echo "PHP10000" | awk -F "PHP" '{ printf "$%d\n", $2 * .0245 }'
Results:
$245
Or to print to two decimal places:
echo "PHP10000" | awk -F "PHP" '{ printf "$%.2f\n", $2 * .0245 }'
Results:
$245.00
EDIT:
Bash doesn't support floating point operations. Use bc instead:
echo "PHP10000" | sed 's/PHP\([0-9]\+\)/echo "scale=2; \1*.0245\/1" | bc/e'
Results:
245.00

Something like:
echo PHP10000 | awk '/PHP/ { printf "$%.0f\n", .0245 * substr($1,4) }'
It can be easily extended to a multi-currency version that converts into one currency (known as quote currency), e.g.:
awk '
BEGIN {
rates["PHPUSD"]=.01
rates["GBPUSD"]=1.58
}
/[A-Z]{3}[0-9.]+/ {
pair=substr($1,1,3) "USD"
amount=substr($1,4)
print "USD" amount * rates[pair]
}
' <<EOF
PHP100
GBP100
EOF
Outputs:
USD1
USD158

Yet another alternative:
$ echo "PHP10000" | awk 'sub(/PHP/,""){ print "$" $0 * .0245 }'
$245
