How can I pass variables from awk to a shell command?

I am trying to run a shell command from within awk for each line of a file, and the shell command needs one input argument. I tried to use system(), but it didn't recognize the input argument.
Each line of this file is the path of a file, and I want to run a command to process that file. As a simple example, I want to run the wc command for each line and pass $1 to wc.
awk '{system("wc $1")}' myfile

You are close. You have to concatenate the command string with the awk variable:
awk '{system("wc "$1)}' myfile
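If the file names may contain spaces, the concatenated value also needs shell quoting. Here is a minimal sketch, assuming a throwaway directory made with mktemp and a made-up file name; it wraps the whole line in single quotes (\047) before handing it to system():

```shell
# Hypothetical demo data: one file whose name contains a space.
tmpdir=$(mktemp -d)
printf 'one two three\n' > "$tmpdir/my file.txt"
printf '%s\n' "$tmpdir/my file.txt" > "$tmpdir/list"

# \047 is a single quote. Wrapping $0 (the whole line) in quotes makes the
# shell started by system() treat the name as one argument.
# Caveat: a name that itself contains a single quote would still break this.
word_count=$(awk '{ system("wc -w < \047" $0 "\047") }' "$tmpdir/list")
echo "$word_count"

rm -rf "$tmpdir"
```

Without the quotes, the shell would see two words and wc would look for two files that do not exist.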

You cannot capture the output of an awk system() call; it only returns the exit status. To read the command's output, use the getline/pipe or getline/variable/pipe constructs:
awk '{
cmd = "your_command " $1
while (cmd | getline line) {
do_something_with(line)
}
close(cmd)
}' file
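As a concrete instance of the skeleton above, here is a sketch that sums word counts over the files named in a list. The temporary files and the variable names (cmd, n, total) are made up for the demo:

```shell
# Hypothetical demo data: two small files and a list naming them.
tmpdir=$(mktemp -d)
printf 'a b c\n' > "$tmpdir/f1"
printf 'd e\n'   > "$tmpdir/f2"
printf '%s\n' "$tmpdir/f1" "$tmpdir/f2" > "$tmpdir/list"

total_words=$(awk '{
    cmd = "wc -w < \047" $1 "\047"            # build the command for this line
    if ((cmd | getline n) > 0) total += n     # read its first output line into n
    close(cmd)                                # release the pipe before the next line
} END { print total }' "$tmpdir/list")
echo "$total_words"

rm -rf "$tmpdir"
```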

FYI here's how to use awk to process files whose names are stored in a file (providing wc-like functionality in this example):
gawk '
NR==FNR { ARGV[ARGC++]=$0; next }
{ nW+=NF; nC+=(length($0) + 1) }
ENDFILE { print FILENAME, FNR, nW, nC; nW=nC=0 }
' file
The above uses GNU awk for ENDFILE. With other awks just store the values in an array and print in a loop in the END section.
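A rough sketch of that array-based variant for awks without ENDFILE follows. The array names (names, lines, words, chars) and the demo files are assumptions, not part of the original answer, and it relies on the awk in use honoring ARGV entries added while the list is being read:

```shell
# Hypothetical demo data: two files and a list naming them.
tmpdir=$(mktemp -d)
printf 'a b c\n'  > "$tmpdir/f1"
printf 'd e\nf\n' > "$tmpdir/f2"
printf '%s\n' "$tmpdir/f1" "$tmpdir/f2" > "$tmpdir/list"

summary=$(awk '
NR == FNR { ARGV[ARGC++] = $0; next }   # list file: queue each name as input
FNR == 1  { names[++n] = FILENAME }     # remember files in the order seen
{ lines[FILENAME]++; words[FILENAME] += NF; chars[FILENAME] += length($0) + 1 }
END {
    for (i = 1; i <= n; i++) {
        f = names[i]; base = f
        sub(/.*\//, "", base)           # print the base name only
        print base, lines[f], words[f], chars[f]
    }
}' "$tmpdir/list")
echo "$summary"

rm -rf "$tmpdir"
```

Empty files never trigger the FNR == 1 rule, so this sketch silently omits them.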

I would suggest another solution:
awk '{print $1}' myfile | xargs wc
The difference is that it executes wc once with multiple arguments. That often works (for example, with the kill command).

Or use a pipe (|) as in bash, then retrieve the output in a variable with awk's getline, like this:
zcat /var/log/fail2ban.log* | gawk '/.*Ban.*/ {print $7};' | sort | uniq -c | sort | gawk '{ "geoiplookup " $2 "| cut -f2 -d: " | getline geoip; print $2 "\t\t" $1 " " geoip}'
That line will print all the banned IPs from your server along with their origin (country) using the geoip-bin package.
The last part of that one-liner is the part that matters here:
gawk '{ "geoiplookup " $2 "| cut -f2 -d: " | getline geoip; print $2 "\t\t" $1 " " geoip}'
It simply says: run the command "geoiplookup 182.193.192.4 | cut -f2 -d:" ($2 gets substituted, as you may guess) and put the result of that command in geoip (the | getline geoip bit). Next, print the IP, its count, and whatever is in the geoip variable.
The complete example and the results can be found here, an article I wrote.

Related

Extracting string from line, give as input to a command and then output the entire line with replacing the string

I have a file containing multiple rows like the one below:
test1| 1234 | test2 | test3
I want to extract the second column (1234) and run a command with it as input.
Let's say the command gives X as output.
Then I want to print each line as below:
test1 | X | test2 | test3
Prefer if I could do it in one-liner, but open to ideas.
I am able to extract string using awk, but I am not sure how I can still preserve the initial output and replace it in the output. Below is what I tested
cat file.txt | awk -F '|' '{newVar=system("command "$2); print newVar $4}'
Sample command output, where we extract the "name"
openstack show 36a6c06e-5e97-4a53-bb42
+----------------------------+-----------------------------------+
| Property | Value |
+----------------------------+-----------------------------------+
| id | 36a6c06e-5e97-4a53-bb42 |
| name | testVM1 |
+----------------------------+-----------------------------------+
Perl to the rescue!
perl -lF'/\|/' -ne 'chomp( $F[1] = qx{ command $F[1] }); print join "|", @F' < file.txt
-n reads the input line by line
-l removes newlines from input and adds them to prints
-F specifies how to split each input line into the @F array
$F[1] corresponds to the second column, we replace it with the output of the command
chomp removes the trailing newline from the command output
join glues the array back to one line
Using awk:
awk -F ' *| *' '{("command "$2) | getline $2}1' file.txt
e.g.
$ awk -F ' *| *' '{("date -d @"$2) | getline $2}1' file.txt
test1| Thu 01 Jan 1970 05:50:34 AM IST | test2 | test3
I changed the field separator from | to *| * to accommodate the spaces surrounding the fields. You can remove those based on your actual input.
This finally did the trick..
awk -F' *[|] *' -v OFS=' | ' '{
cmd = "openstack show \047" $2 "\047"
while ( (cmd | getline line) > 0 ) {
if ( line ~ /name/ ) {
split(line,flds,/ *[|] */)
$2 = flds[3]
break
}
}
close(cmd)
print
}' file
If command can take the whole list of values once and generate the converted list as output (e.g. tr 'a-z' 'A-Z') then you'd want to do something like this to avoid spawning a shell once per input line (which is extremely slow):
awk -F' *[|] *' '{print $2}' file |
command |
awk -F' *[|] *' -v OFS=' | ' 'NR==FNR{a[FNR]=$0; next} {$2=a[FNR]} 1' - file
otherwise if command needs to be called with one value at a time (e.g. echo) or you just don't care about execution speed then you'd do:
awk -F' *[|] *' -v OFS=' | ' '{
cmd = "command \047" $2 "\047"
if ( (cmd | getline line) > 0 ) {
$2 = line
}
close(cmd)
print
}' file
The \047s will produce single quotes around $2 when it's passed to command and so shield it from shell interpretation (see https://mywiki.wooledge.org/Quotes) and the test on the result of getline will protect you from silently overwriting the current $2 with the output of an earlier command execution in the event of a failure (see http://awk.freeshell.org/AllAboutGetline). The close() ensures that you don't end up with a "too many open files" error or other cryptic problem if the pipe isn't being closed properly, e.g. if command is generating multiple lines and you're just reading the first one.
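To see the getline guard in action, here is a small sketch with a stand-in command (echo piped to sed, chosen only for the demo) that prints nothing for values it does not recognize. The sample rows are made up:

```shell
# Hypothetical input: the second row has a value the command cannot convert.
rows='a | id-7 | b
c | other | d'

replaced=$(printf '%s\n' "$rows" | awk -F' *[|] *' -v OFS=' | ' '{
    # Stand-in command: prints the number after "id-", or nothing at all.
    cmd = "echo \047" $2 "\047 | sed -n \047s/^id-//p\047"
    if ((cmd | getline line) > 0) {
        $2 = line        # only overwrite when the command produced output
    }
    close(cmd)
    print
}')
echo "$replaced"
```

The second row comes through unchanged instead of inheriting the 7 left over in line from the first row.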
Given your comment below, if you're going with the 2nd approach above then you'd write something like:
awk -F' *[|] *' -v OFS=' | ' '{
cmd = "openstack show \047" $2 "\047"
while ( (cmd | getline line) > 0 ) {
split(line,flds)
if ( flds[2] == "name" ) {
$2 = flds[3]
break
}
}
close(cmd)
print
}' file

how to select the last line of the shell output

Hi I have a shell command like this.
s3=$(awk 'BEGIN{ print "S3 bucket path" }
/Executing command\(queryId/{ sub(/.*queryId=[^[:space:]]+: /,""); q=$0 }
/s3:\/\//{ print "," $10 }' OFS=',' hive-server2.log)
The output of the above command looks like this.
echo $s3
2018-02-21T17:58:22,
2018-02-21T17:58:26,
2018-02-21T18:05:33,
2018-02-21T18:05:34
I want to select the last line only. I need the last output like this.
2018-02-21T18:05:34
I tried like this.
awk -v $s3 '{print $(NF)}'
It is not working. Any help will be appreciated.
In general, command | tail -n 1 prints the last line of the output from command. However, where command is of the form awk '... { ... print something }' you can refactor to awk '... { ... result = something } END { print result }' to avoid spawning a separate process just to discard the other output. (Conversely, you can replace awk '/condition/ { print something }' | head -n 1 with awk '/condition/ { print something; exit }'.)
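A quick sketch of both refactorings, using made-up sample input and the hypothetical variable names sample, last_alpha and first_alpha:

```shell
sample='alpha 1
beta 2
alpha 3'

# Instead of: awk '/alpha/ { print $2 }' | tail -n 1
last_alpha=$(printf '%s\n' "$sample" | awk '/alpha/ { result = $2 } END { print result }')

# Instead of: awk '/alpha/ { print $2 }' | head -n 1
first_alpha=$(printf '%s\n' "$sample" | awk '/alpha/ { print $2; exit }')

echo "last=$last_alpha first=$first_alpha"
```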
If you already have the result in a shell variable s3 and want to print just the last line, a parameter expansion echo "${s3##*$'\n'}" does that. The C-style string $'\n' to represent a newline is a Bash extension that not every /bin/sh supports (the ## operator itself, which removes the longest matching prefix, is standard POSIX), so you should make sure the shebang line says #!/bin/bash, not #!/bin/sh
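If the script has to run under a plain sh, one possible workaround (a sketch, with nl as a made-up helper variable) is to store a literal newline and use it in the same expansion:

```shell
s3='2018-02-21T17:58:22,
2018-02-21T18:05:34'

# A literal newline stored in a variable stands in for the Bash-only $'\n'.
nl='
'
# Remove the longest prefix that ends in a newline, leaving the last line.
last_line=${s3##*"$nl"}
echo "$last_line"
```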
Notice also that $s3 without quotes is an error unless you specifically require the shell to perform whitespace tokenization and wildcard expansion on the value. You should basically always use double quotes around variables except in a couple of very specific scenarios.
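The effect of the missing quotes is easy to check. In this sketch (sample value made up, default IFS assumed) the unquoted expansion collapses the two lines into one:

```shell
s3='2018-02-21T17:58:22,
2018-02-21T18:05:34'

# Unquoted: the shell splits the value on whitespace and echo rejoins the
# pieces with single spaces, so the newline is gone.
unquoted_lines=$(echo $s3 | awk 'END { print NR }')

# Quoted: the value reaches the pipeline unchanged.
quoted_lines=$(printf '%s\n' "$s3" | awk 'END { print NR }')

echo "unquoted=$unquoted_lines quoted=$quoted_lines"
```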
Your Awk command would not work for two reasons. Firstly, as explained in the previous paragraph, the unquoted variable is split into tokens, so you are setting s3 to the first token only, and the second token becomes your Awk script (probably a syntax error). In more detail, you are basically running
awk -v s3=firstvalue secondvalue thirdvalue '{ print $(NF) }'
       ^ value       ^ script    ^ names of files ...
where you probably wanted to say
awk -v s3=$'firstvalue\nsecondvalue\nthirdvalue' '{ print $(NF) }'
But even with quoting, your script would set s3 to something but then tell Awk to (ignore the variable and) process standard input, which on the command line leaves it reading from your terminal. A fixed script might look like
awk 'END { print }' <<<"$s3"
which passes the variable as standard input to Awk, which prints the last line. The <<<value "here string" syntax is also a Bash extension, and not portable to POSIX sh.
A much simpler way is
command | grep "your filter" | tail -n 1
or directly
command | tail -n 1
You could try this:
echo -e "This is the first line \nThis is the second line" | awk 'END{print}'
Another approach is to process the file from the end and exit after the first match:
tac file | awk '/match/{print; exit}'
Hi, you can do it just by adding echo "$s3" | sed '$!d' (the double quotes around $s3 keep the newlines, so sed sees separate lines):
s3=$(awk 'BEGIN{ print "S3 bucket path" }/Executing command\(queryId/{ sub(/.*queryId=[^[:space:]]+: /,""); q=$0 } /s3:\/\//{ print "," $10 }' OFS=',' hive-server2.log)
echo "$s3" | sed '$!d'
It will simply print:
2018-02-21T18:05:34
Hope this helps.

AWK alias not printing

The awk command below (copied and pasted from Stack Overflow) works fine from the command line but doesn't print anything when aliased:
awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
alias getperc="awk '/WORD/ {print \$3}' log.log | awk 'BEGIN{c=0} length(\$0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'"
I am fairly new to using bash. What am I missing here?
Don't use aliases. They require an additional layer of quoting, which is troublesome (as here), and they prevent you from being able to usefully parameterize or add conditional logic to your code.
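One likely cause of the empty output is the single unescaped $0 in the alias definition. The sketch below uses a shortened, made-up version of the definition to show what the shell actually stores, and a hypothetical function count_nonempty to show that single quotes in a function body need no extra escaping:

```shell
# In a double-quoted definition, \$0 stays literal but a bare $0 is expanded
# immediately to the name of the running shell or script.
definition="awk 'length(\$0){a[c]=\$0;c++}'"
broken_definition="awk 'length(\$0){a[c]=$0;c++}'"
printf '%s\n' "$definition" "$broken_definition"

# Inside a function the awk program sits in plain single quotes,
# so nothing is expanded by the shell.
count_nonempty() { awk 'length($0) { c++ } END { print c + 0 }'; }
nonempty=$(printf 'x\n\ny\n' | count_nonempty)
echo "$nonempty"
```

In the broken variant awk ends up assigning an (empty) variable named after the shell to a[c], which would explain why nothing is printed.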
A simple transliteration to a function is:
getperc() { awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'; }
A slightly more capable one, which will still use log.log by default, but which will also let you provide an alternate input file name (as in getperc alternate.log) or pipe to your function (as in cat alternate.log | getperc):
getperc() {
[[ -t 0 || $1 ]] || set -- - # use "-" (stdin) as input file if not a TTY
# ...this will let you pipe to your function.
awk '/WORD/ {print $3}' "${1:-log.log}" | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
}
I think bash is confused by $3 and $0: it expands them as positional parameters when the alias runs. You can verify this in bash. Try this:
alias ech="echo {print \$3}"
Running it will print just
{print }
But now try
alias ech="echo {print \$\3}"
It will print what you expected:
{print $3}
Let me know if this solves your problem.

SSH call inside ruby, using %x

I am trying to make a single line ssh call from a ruby script. My script takes a hostname, and then sets out to return the hostname's machine info.
return_value = %x{ ssh #{hostname} "#{number_of_users}; #{number_of_processes};
#{number_of_processes_running}; #{number_of_processes_sleeping}; "}
Where the variables are formatted like this.
number_of_users = %Q(users | wc -w | cat | awk '{print "Number of Users: "\$1}')
number_of_processes = %Q(ps -el | awk '{print $2}' | wc -l | awk '{print "Number of Processes: "$1}')
I have tried %q, %Q, and plain double quotes, and I cannot get awk to print anything before the output. I either get this error (if I include the colon)
awk: line 1: syntax error at or near :
or if I don't include the slash in front of $1 I just get empty output for that line. Is there any solution for this? I thought it might be because I was using %q, but it even happens with just double quotes.
Use backticks to capture the output of the command and return the output as a string:
number_of_users = `users | wc -w | cat | awk '{print "Number of Users:", $1}'`
puts number_of_users
Results on my system:
48
But you can improve your pipeline:
users | awk '{ print "Number of Users:", NF }'
ps -e | awk 'END { print "Number of Processes:", NR }'
So the solution to this problem is:
%q(users | wc -w | awk '{print \"Number of Users: \"\$1}')
You have to use %q, not %, not %Q, and not "".
You must backslash-escape the double quotes and the dollar sign in front of any awk variables.
If somebody could improve upon this answer by explaining why, that would be most appreciated.
Though, as Steve pointed out, I could have improved my code by using users | awk '{ print \"Number of Users:\", NF }'
In which case there is no need to backslash NF.

bash awk first 1st column and 3rd column with everything after

I am working on the following bash script:
# contents of dbfake file
1 100% file 1
2 99% file name 2
3 100% file name 3
#!/bin/bash
# cat out data
cat dbfake |
# select lines containing 100%
grep 100% |
# print the first and third columns
awk '{print $1, $3}' |
# echo out id and file name and log
xargs -rI % sh -c '{ echo %; echo "%" >> "fake.log"; }'
exit 0
This script works ok, but how do I print everything in column $3 and then all columns after?
You can use cut instead of awk in this case:
cut -f1,3- -d ' '
awk '{ $2 = ""; print }' # remove col 2
If you don't mind a little whitespace:
awk '{ $2="" }1'
But that is a useless use of cat (UUOC) and of grep:
< dbfake awk '/100%/ { $2="" }1' | ...
If you'd like to trim that whitespace:
< dbfake awk '/100%/ { $2=""; sub(FS "+", FS) }1' | ...
For fun, here's another way using GNU sed:
< dbfake sed -r '/100%/s/^(\S+)\s+\S+(.*)/\1\2/' | ...
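Here is a small check of the whitespace-trimming variant against the sample data from the question. Unlike the one-liners above, this sketch prints only the matching rows, and the variable names rows and kept are made up:

```shell
rows='1 100% file 1
2 99% file name 2
3 100% file name 3'

# Blank out column 2, then squeeze the doubled separator it leaves behind.
kept=$(printf '%s\n' "$rows" | awk '/100%/ { $2 = ""; sub(FS "+", FS); print }')
echo "$kept"
```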
All you need is:
awk 'sub(/.*100% /,"")' dbfake | tee "fake.log"
Others responded in various ways, but I want to point out that using xargs to multiplex output is a rather bad idea.
Instead, why don't you:
awk '$2=="100%" { sub("100%[[:space:]]*",""); print; print >>"fake.log"}' dbfake
That's all. You don't need grep, you don't need multiple pipes, and definitely you don't need to fork shell for every line you're outputting.
You could do awk ...; print}' | tee fake.log, but there is not much point in forking tee, if awk can handle it as well.
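A runnable sketch of the awk-only approach, assuming the sample rows from the question and a temporary log path passed in through a hypothetical log_file variable:

```shell
rows='1 100% file 1
2 99% file name 2
3 100% file name 3'
tmpdir=$(mktemp -d)

shown=$(printf '%s\n' "$rows" | awk -v log_file="$tmpdir/fake.log" '
    $2 == "100%" {
        sub(/100% */, "")      # drop column 2 and the spaces after it
        print                  # to stdout
        print >> log_file      # and appended to the log
    }')
logged=$(cat "$tmpdir/fake.log")
echo "$shown"

rm -rf "$tmpdir"
```

Both the terminal output and the log receive the same two lines, with no extra process per line.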
