How to extract CSV field and pass it to another command - bash

I have a bash script that makes an API request that returns the result in CSV format.
I want to extract only the "id" value from the first data line (it will be the same on all the other lines) and then pass it to a wget command (for example, wget http://test.com/$id).
Current bash script:
req=$(curl -k -d '{"returnFormat":"csv","eventid":"2"}' -H "Authorization: xxx" -H "Accept: application/json" -H "Content-type: application/json" -X POST https://test.api/restSearch)
The output:
id,category,type,value,comment,date,subject_tag
1357,"activity","domain","dodskj.com","comment",1547034584,"kill-chain"
1357,"activity","ip-dst","8.8.6.6","comment example",1547034600,"maec-mal"
According to this example, I want to extract the value "1357" into a variable and send it to wget or any other command.

You can use the cut command to take the first comma-delimited field in this case:
curl <params> | cut -d, -f1
or alternatively awk:
curl <params> | awk -F, '{print $1}'
For your specific example, if you only want the second line, you can use:
curl <params> | awk -F, 'NR == 2 {print $1}'
If you want to select a line based on a particular field, then (note the CSV quotes are part of the field value, so they must appear in the comparison):
curl <params> | awk -F, '$4 == "\"dodskj.com\"" {print $1}'
(you can match regular expressions using ~ operator in place of ==)
You can also stop after the first match with exit:
curl <params> | awk -F, '$4 == "\"dodskj.com\"" {print $1; exit}'
Then you can wrap the whole lot in $() to assign it to a variable:
VAR=$(curl ... | awk ...)
Hope that helps!
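Putting the pieces together, a sketch using the sample output from the question (the here-doc stands in for the real curl call, and the wget URL is the hypothetical one from the question):

```shell
# Stand-in for the curl call: feed the sample CSV into awk.
# NR == 2 skips the header; exit stops after the first data line.
id=$(awk -F, 'NR == 2 {print $1; exit}' <<'EOF'
id,category,type,value,comment,date,subject_tag
1357,"activity","domain","dodskj.com","comment",1547034584,"kill-chain"
1357,"activity","ip-dst","8.8.6.6","comment example",1547034600,"maec-mal"
EOF
)
echo "$id"                    # 1357
# wget "http://test.com/$id"  # then pass it on (hypothetical URL)
```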

You can pipe your csv into awk.
req=$(<prev_cmd> | awk -F, 'NR==2{print $1}')
-F, tells awk that fields are comma-delimited.
NR==2 will run the script in braces only for the second row.
print $1 will print the first field.
$(...) performs command substitution.
Note: remember, no spaces around the = in a variable assignment.
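A quick illustration of why the spacing matters (the variable name is arbitrary):

```shell
req=value        # correct: assigns the string "value" to req
echo "$req"      # value
# req = value    # wrong: bash would look for a command named "req"
#                # and pass it the arguments "=" and "value"
```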


Bash: Curl grep result as string variable

I have a bash script as below:
curl -s "$url" | grep "https://cdn" | tail -n 1 | awk -F[\",] '{print $2}'
which works fine; when I run it, I get the CDN URL:
https://cdn.some-domain.com/some-result/
When I put it in a variable:
myvariable=$(curl -s "$url" | grep "https://cdn" | tail -n 1 | awk -F[\",] '{print $2}')
and echo it like this:
echo "CDN URL: '$myvariable'"
I get a blank result: CDN URL: ''
Any idea what could be wrong? Thanks.
If your curl output contains a trailing DOS carriage return, that will botch the result, though not exactly as you describe. Still, maybe try this:
myvariable=$(curl -s "$url" | awk -F[\",] '/https:\/\/cdn/{ sub(/\r/, ""); url=$2} END { print url }')
Notice also how I refactored the grep and the tail (and the carriage-return removal, via sub(/\r/, "")) into the Awk command. Tangentially, see "useless use of grep".
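A minimal reproduction of the carriage-return problem, with printf simulating the curl output (the URL here is made up):

```shell
# A DOS line ending (\r\n) leaves a stray \r at the end of the line;
# sub(/\r/, "") strips it from the record before the URL is captured.
url=$(printf '"https://cdn.some-domain.com/some-result/",tag\r\n' |
      awk -F'[",]' '/https:\/\/cdn/ { sub(/\r/, ""); url=$2 } END { print url }')
echo "$url"   # https://cdn.some-domain.com/some-result/
```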
The result could be blank if there's only one field after awk's split.
You might try grep -o to return only the matched string:
myvariable=$(curl -s "$url" | grep -oP 'https://cdn.*?[",].*' | tail -n 1 | awk -F[\",] '{print $2}')
echo "$myvariable"

How to use awk {print} inside an inline ssh command

I am trying to run an inline ssh command which looks like this:
ssh user@127.0.0.1 "df / -h | awk 'FNR == 2 {print $3}'"
I would expect the output to be 3.8G (the second line, third column), but instead I get /dev/sda1 6.9G 3.8G 2.8G 58% / (the entire second line).
This means that the FNR == 2 is working but the {print $3} is not.
If I run that command directly on the machine that I am ssh'ing into then I get the expected result, just not when calling it through an inline ssh command as above.
This code will eventually run within a bash script. Is it not possible to use print this way? If not, is there another method you can suggest? I am relatively new to terminal life, so please bear with me.
The problem resides in the way you pass your ssh arguments.
By calling:
ssh user@127.0.0.1 "df / -h | awk 'FNR == 2 {print $3}'"
You are passing two arguments:
user@127.0.0.1
"df / -h | awk 'FNR == 2 {print $3}'"
Since your second argument is given inside double quotes, the $3 variable will be expanded. You can prevent this variable expansion by escaping the dollar sign:
ssh user@127.0.0.1 "df / -h | awk 'FNR == 2 {print \$3}'"
The joys of shell quoting. The line:
ssh user@127.0.0.1 "df / -h | awk 'FNR == 2 {print $3}'"
is parsed by the shell, which invokes ssh with two arguments:
user@127.0.0.1 and df / -h | awk 'FNR == 2 {print }'. The $3 was interpolated and (I'm assuming) was empty. To prevent that, you have many options. One of them is:
ssh user@127.0.0.1 << \EOF
df / -h | awk 'FNR == 2 {print $3}'
EOF
another is:
ssh user@127.0.0.1 sh -c '"df / -h | awk '"'"'FNR == 2 {print $3}'"'"'"'
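You can watch the interpolation happen locally, with bash -c standing in for the remote shell (the awk fragments here are just for illustration):

```shell
# Inside double quotes, the outer shell expands $3 before the inner
# shell ever sees the command (with no positional args, $3 is empty):
double=$(bash -c "echo 'FNR == 2 {print $3}'")
# Escaping the dollar sign preserves it for the inner shell:
escaped=$(bash -c "echo 'FNR == 2 {print \$3}'")
echo "$double"    # FNR == 2 {print }
echo "$escaped"   # FNR == 2 {print $3}
```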

AWK -F with print all but last record

/Home/in/test_file.txt
echo /Home/in/test_file.txt | awk -F'/' '{ print $2,$3 }'
Gives the result as:
Home in
But I need /Home/in/ as the result; I have to get everything except test_file.txt.
How can I achieve this?
$ echo '/Home/in/test_file.txt' | awk '{sub("/[^/]+$","")} 1'
/Home/in
$ echo '/Home/in/test_file.txt' | awk '{sub("[^/]+$","")} 1'
/Home/in/
$ echo '/Home/in/test_file.txt' | sed 's:/[^/]*$::'
/Home/in
$ echo '/Home/in/test_file.txt' | sed 's:[^/]*$::'
/Home/in/
$ dirname '/Home/in/test_file.txt'
/Home/in
Your attempt awk -F'/' '{ print $2,$3 }' didn't do what you wanted as -F'/' is telling awk to split the input into fields at every / and then print $2,$3 is telling awk to print the 2nd and 3rd fields separated by a blank char (the default value for OFS). You could do:
$ echo '/Home/in/test_file.txt' | awk 'BEGIN{FS=OFS="/"} { print "",$2,$3,"" }'
/Home/in/
to get the expected output, but it'd be the wrong approach since it removes the field you don't want AND removes the input separators AND then adds new output separators which happen to have the same value as the input separators, rather than simply removing the unwanted field like the other solutions above do.
echo /Home/in/test_file.txt | awk -F'/[^/]*$' '{ print $1 }'
...will print everything up to, but not including, the final slash.
There are several ways to achieve this:
Using dirname:
$ dirname /home/in/test_file.txt
/home/in
Using Shell substitution:
$ var="/home/in/test_file.txt"
$ echo "${var%/*}"
/home/in
Using sed: (See Ed Morton)
Using AWK:
$ echo "/home/in/test_file.txt" | awk -F'/' '{OFS=FS;$NF=""}1'
/home/in/
Remark: all these work since you can't have a filename with a forward slash (Is it possible to use "/" in a filename?)
Note: all but dirname will fail if you just have a single file name without a path. While dirname foo will return ., the others will return foo (or an empty string).
awk behaves as it should.
When you define slash / as the separator, the fields in your expression become the content between the separators (note that $1 is the empty string before the leading slash).
If you need the separators printed as well, you need to add them explicitly, like:
echo /Home/in/test_file.txt | awk -F'/' '{ printf "/%s/%s/",$2,$3 }'
Replace your last field with an empty string and put the slash back in as the (built-in) Output Field Separator (OFS):
echo /Home/in/test_file.txt | awk -F'/' -v OFS='/' '{$NF="";print}'
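Side by side, here is what the main approaches print for the sample path (note which ones keep the trailing slash):

```shell
p='/Home/in/test_file.txt'
dirname "$p"                                        # /Home/in
echo "${p%/*}"                                      # /Home/in   (strip shortest /* suffix)
echo "$p" | awk -F'/' -v OFS='/' '{$NF=""; print}'  # /Home/in/  (trailing slash kept)
```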

How to output awk result to a variable

I need to run a hadoop command to list all live nodes, then reformat the output using awk, and finally store the result in a variable; awk uses a different delimiter each time I call it:
hadoop job -list-active-trackers | sort | awk -F. '{print $1}' | awk -F_ '{print $2}'
It outputs results like this:
hadoop-dn-11
hadoop-dn-12
...
Then I put the whole command in a variable to print out the result line by line:
var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F "." '{print $1}' | awk -F "_" '{print $2}'")
printf %s "$var" | while IFS= read -r line
do
echo "$line"
done
The awk -F didn't work; it outputs the result as:
tracker_hadoop-dn-1.xx.xsy.interanl:localhost/127.0.0.1:9990
tracker_hadoop-dn-1.xx.xsy.interanl:localhost/127.0.0.1:9390
Why doesn't awk with -F work correctly, and how can I fix it?
var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F "." '{print $1}' | awk -F "_" '{print $2}'")
Because you're enclosing the whole command in double quotes, your shell is expanding the variables $1 and $2 before launching sudo. This is what the sudo command looks like (I'm assuming $1 and $2 are empty):
sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F . '{print }' | awk -F _ '{print }'"
So you see, your awk commands print the whole line instead of just the first and second fields, respectively.
This is merely a quoting challenge:
var=$(sudo -H -u hadoop bash -c 'hadoop job -list-active-trackers | sort | awk -F "." '\''{print $1}'\'' | awk -F "_" '\''{print $2}'\')
A bash single quoted string cannot contain single quotes, so that's why you see ...'\''... -- to close the string, concatenate a literal single quote, then re-open the string.
Another way is to escape the vars and inner double quotes:
var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F \".\" '{print \$1}' | awk -F \"_\" '{print \$2}'")
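Both quoting styles can be tried without sudo or hadoop; here echo stands in for the pipeline, so this is just a sketch of the quoting mechanics:

```shell
# Single-quoted -c string: $1 survives to the inner shell
# (the '\'' dance closes the string, emits a literal quote, reopens it).
a=$(bash -c 'echo "x.y" | awk -F "." '\''{print $1}'\')
# Double-quoted -c string with escaped dollars and quotes: same result.
b=$(bash -c "echo \"x.y\" | awk -F \".\" '{print \$1}'")
echo "$a" "$b"   # x x
```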

Assigning deciles using bash

I'm learning bash, and here's a short script to assign deciles to the second column of file $1.
The complicating bit is the use of awk within the script, leading to ambiguous redirects when I run the script.
I would have gotten this done in SAS by now, but like the idea of two lines of code doing the job.
How can I communicate the total number of rows (${N}) to awk within the script? Thanks.
N=$(wc -l < $1)
cat $1 | sort -t' ' -k2gr,2 | awk '{$3=int((((NR-1)*10.0)/"${N}")+1);print $0}'
You can set an awk variable from the command line using -v.
N=$(wc -l < "$1" | tr -d ' ')
sort -t' ' -k2gr,2 "$1" | awk -v n=$N '{$3=int((((NR-1)*10.0)/n)+1);print $0}'
I added tr -d ' ' to get rid of the leading spaces that wc -l puts in its result.
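A self-contained run on ten made-up rows of name/value pairs; with N=10 the decile in the third column simply counts 1 through 10:

```shell
# Hypothetical input: "name value" pairs, one per line.
printf 'a 10\nb 9\nc 8\nd 7\ne 6\nf 5\ng 4\nh 3\ni 2\nj 1\n' > /tmp/decile_demo.txt
N=$(wc -l < /tmp/decile_demo.txt | tr -d ' ')
# Sort descending by the second column, then append the decile as column 3.
sort -t' ' -k2gr,2 /tmp/decile_demo.txt |
  awk -v n="$N" '{$3=int((((NR-1)*10.0)/n)+1); print $0}'
# First line: "a 10 1"; last line: "j 1 10"
```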
