How to make a '$N' within a string variable in a bash script be used as a field reference in an awk command when the variable is expanded? - bash

Following the suggestions, I'm going to try to be more concise.
#!/usr/bin/env bash
out1='$3 ~ "RB"'
out2='&& $3 ~ "F[0-9]"'
out3='&& $3 ~ /H*TF/'
printing=''"\t"'$3'
awk_output=$(awk -F, -v o1="$out1" -v o2="$out2" -v o3="$out3" -v p1="$printing" \
'o1 o2 o3 { print NR,p1}' test.csv)
dialog --title "title" \
--msgbox "$awk_output" 0 0
The above code is an example of a little program I am trying to write that takes a few user-picked variables (the out and printing variables at the top) and uses them to build an awk query of a database.
The problem is that the passed strings contain, for example, a '$3'. I want this to be read as a field reference in awk, as it normally would be, but it is read as a literal string. Therefore, when printing, awk prints the literal '$3' instead of the field it should represent in the CSV file.
I have also tried ENVIRON and SYMTAB with little success. Has anyone come across this type of thing before?
Thank you; I hope this was a bit more concise too.
Just to clarify, the output at the current moment is this:
1 $3
2 $3
3 $3
4 $3
5 $3
I would like the "$3" to actually represent a field of the CSV file in the awk command. Something like this is what I am trying to get:
1 "info from 3rd field in csv"
2 "info from 3rd field in csv"
3 "info from 3rd field in csv"
4 "info from 3rd field in csv"
5 "info from 3rd field in csv"

I want this to be read as a field reference in awk, as it normally would be, but it is read as a literal string
So pass it as part of the command, not as a variable...
printing='$3'
awk -F, -v OFS='\t' "$out1 $out2 $out3 { print NR, $printing}" test.csv
Debug your shell scripts with set -x. Be sure to read about the difference between single and double quotes in the shell and how the expansions happen.
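Putting that together, a minimal sketch of the fixed script (same test data as the question, with the dialog call replaced by printf so the result is visible) could be:

```shell
#!/usr/bin/env bash
# User-picked pieces of the query; they contain awk syntax ($3, ~, &&),
# so they must become part of the program text itself.
out1='$3 ~ "RB"'
out2='&& $3 ~ "F[0-9]"'
out3='&& $3 ~ /H*TF/'
printing='$3'

# Double quotes make the shell expand the variables *before* awk
# compiles the program, so $3 is a field reference, not a literal string.
awk_output=$(awk -F, -v OFS='\t' "$out1 $out2 $out3 { print NR, $printing }" test.csv)
printf '%s\n' "$awk_output"
```

Note that this splices user-supplied text directly into an awk program, so it is only safe with trusted input.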

A simpler approach: rather than passing all the shell variables with special characters to awk, we can use printf to build the awk command and then use eval to execute it and capture the output.
Modified Script:
#!/usr/bin/env bash
out1='$3 ~ "RB"'
out2='&& $3 ~ "F[0-9]"'
out3='&& $3 ~ /H*TF/'
printing='"\t"$3'
printf -v cmd $'awk -F, \'%s %s %s {print NR, %s}\'' "$out1" "$out2" "$out3" "$printing"
awk_output=$(eval "$cmd" test.csv)
dialog --title "title" \
--msgbox "$awk_output" 0 0
Explanation:
Changed printing to '"\t"$3'. That is all that is needed to print a tab in the output.
Included the printf line. printf -v cmd $'awk -F, \'%s %s %s {print NR, %s}\'' "$out1" "$out2" "$out3" "$printing"
The variables are substituted into the %s placeholders.
Note the $ in front of the 'awk -F, ...' string (ANSI-C quoting, $'...'). This is needed because the awk program itself contains ' characters; inside $'...' a literal ' can be escaped as \'.
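The quoting can be seen in isolation with toy strings (variable names here are just for illustration):

```shell
# $'...' lets us embed escaped single quotes (\') in the format string;
# printf -v stores the formatted result in the named variable.
printf -v cmd $'awk \'%s\'' 'NR==1'
echo "$cmd"   # prints: awk 'NR==1'
```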
CSV used:
apple,orange,RBF1HTF
apple,orange,RBF2HTF
Output:
1 	RBF1HTF
2 	RBF2HTF

Related

how to select the last line of the shell output

Hi, I have a shell command like this.
s3=$(awk 'BEGIN{ print "S3 bucket path" }
/Executing command\(queryId/{ sub(/.*queryId=[^[:space:]]+: /,""); q=$0 }
/s3:\/\//{ print "," $10 }' OFS=',' hive-server2.log)
The output of the above command is like this.
echo $s3
2018-02-21T17:58:22,
2018-02-21T17:58:26,
2018-02-21T18:05:33,
2018-02-21T18:05:34
I want to select only the last line. I need the output to be like this.
2018-02-21T18:05:34
I tried this:
awk -v $s3 '{print $(NF)}'
It is not working. Any help will be appreciated.
In general, command | tail -n 1 prints the last line of the output from command. However, where command is of the form awk '... { ... print something }' you can refactor to awk '... { ... result = something } END { print result }' to avoid spawning a separate process just to discard the other output. (Conversely, you can replace awk '/condition/ { print something }' | head -n 1 with awk '/condition/ { print something; exit }'.)
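For instance, these two pipelines print the same last match, but the second avoids spawning the extra tail process (the input here is just illustrative):

```shell
# Filter, then keep only the last line of the filtered output...
printf 'a 1\nb 2\na 3\n' | awk '/^a/ { print $2 }' | tail -n 1

# ...or remember the last match inside awk and print it once at the end.
printf 'a 1\nb 2\na 3\n' | awk '/^a/ { last = $2 } END { print last }'
```

Both pipelines print 3.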
If you already have the result in a shell variable s3 and want to print just the last line, a parameter expansion echo "${s3##*$'\n'}" does that. The C-style string $'\n' to represent a newline is a Bash extension, and the parameter expansion operator ## to remove the longest matching prefix isn't entirely portable either, so you should make sure the shebang line says #!/bin/bash, not #!/bin/sh.
Notice also that $s3 without quotes is an error unless you specifically require the shell to perform whitespace tokenization and wildcard expansion on the value. You should basically always use double quotes around variables except in a couple of very specific scenarios.
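A quick illustration of the difference, using a throwaway value:

```shell
s3=$'first line\nlast line'
echo $s3     # unquoted: word splitting joins all the words with single spaces
echo "$s3"   # quoted: the embedded newline survives
```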
Your Awk command would not work for two reasons: first, as explained in the previous paragraph, the unquoted $s3 is split into tokens, so -v receives only the first one; second, the resulting Awk invocation is malformed. In more detail, you are basically running
awk -v s3=firstvalue secondvalue thirdvalue '{ print $(NF) }'
^ value ^ script to run ^ names of files ...
where you probably wanted to say
awk -v s3=$'firstvalue\nsecondvalue\nthirdvalue' '{ print $(NF) }'
But even with quoting, your script would set s3 to something but then tell Awk to (ignore the variable and) process standard input, which on the command line leaves it reading from your terminal. A fixed script might look like
awk 'END { print }' <<<"$s3"
which passes the variable as standard input to Awk, which prints the last line. The <<<value "here string" syntax is also a Bash extension, and not portable to POSIX sh.
A much simpler way is
command | grep "your filter" | tail -n 1
or directly
command | tail -n 1
You could try this:
echo -e "This is the first line \nThis is the second line" | awk 'END{print}'
Another approach is to process the file from the end and exit after the first match.
tac file | awk '/match/{print; exit}'
Hi, you can do it just by adding echo "$s3" | sed '$!d':
s3=$(awk 'BEGIN{ print "S3 bucket path" }/Executing command\(queryId/{ sub(/.*queryId=[^[:space:]]+: /,""); q=$0 } /s3:\/\//{ print "," $10 }' OFS=',' hive-server2.log)
echo "$s3" | sed '$!d'
It will simply print:
2018-02-21T18:05:34
Hope this will help you.

Can I have multiple awk actions without inserting newlines?

I'm a newbie with very small and specific needs. I'm using awk to parse something and I need to generate uninterrupted lines of text assembled from several pieces in the original text. But awk inserts a newline in the output whenever I use a semicolon.
Simplest example of what I mean:
Original text:
1 2
awk command:
{ print $1; print $2 }
The output will be:
1
2
The thing is that I need the output to be a single line, and I also need to use the semicolons, because I have to do multiple actions on the original text, not all of them print.
Also, using ORS=" " causes a whole lot of different problems, so it's not an option.
Is there any other way that I can have multiple actions in the same line without newline insertion?
Thanks!
The newlines in the output have nothing to do with your using semicolons to separate statements in your script; they appear because print outputs the arguments you give it followed by the contents of ORS, and the default value of ORS is a newline.
You may want some version of either of these:
$ echo '1 2' | awk '{printf "%s ", $1; printf "%s ", $2; print ""}'
1 2
$
$ echo '1 2' | awk -v ORS=' ' '{print $1; print $2; print "\n"}'
1 2
$
$ echo '1 2' | awk -v ORS= '{print $1; print " "; print $2; print "\n"}'
1 2
$
but it's hard to say without knowing more about what you're trying to do.
At least scan through the book Effective Awk Programming, 4th Edition, by Arnold Robbins to get some understanding of the basics before trying to program in awk or you're going to waste a lot of your time and learn a lot of bad habits first.
You have better control of the output if you use printf, e.g.
awk '{ printf "%s %s\n",$1,$2 }'
awk '{print $1 $2}'
is the solution in this case.
TL;DR
You're getting newlines because print sends ORS to standard output after each print statement. You can format the output in a variety of other ways, but the key is generally to invoke only a single print or printf statement regardless of how many fields or values you want to print.
Use Commas
One way to do this is to use a single call to print using commas to separate arguments. This will insert OFS between the printed arguments. For example:
$ echo '1 2' | awk '{print $1, $2}'
1 2
Don't Separate Arguments
If you don't want any separation in your output, just pass all the arguments to a single print statement. For example:
$ echo '1 2' | awk '{print $1 $2}'
12
Formatted Strings
If you want more control than that, use formatted strings using printf. For example:
$ echo '1 2' | awk '{printf "%s...%s\n", $1, $2}'
1...2
$ echo "1 2" | awk '{print $1 " " $2}'
1 2

utilizing a variable in awk then enter output into another variable

#!/bin/bash
cat /home/user/list
read -p "enter a number: " LISTNUMBER
USERNAME=$(awk '$LISTNUMBER {
match($0, $LISTNUMBER); print substr($0, RLENGTH + 2); }' /home/user/list)
echo "you chose $USERNAME."
This script will use awk to search another file that has a list of numbers and usernames:
1 bob
2 fred
etc...
I only want the username not the corresponding number which is why I tried using: print substr($0, RLENGTH + 2)
Unfortunately, the output of the awk won't attach to $USERNAME.
I have attempted to grep for this but could not achieve the answer. I then read about awk and got here, but am stuck again. Utilizing grep or awk is all the same for me.
Single-quoted strings in POSIX-like shells are invariably literals - no interpolation of their contents is performed. Therefore, $LISTNUMBER in your code will not expand to its value.
To pass shell variables to Awk, use -v {awkVarName}="${shellVarName}".
Also, Awk automatically splits input lines into fields by runs of whitespace (by default); $1 represents the 1st field, $2 the 2nd, and so on.
If we apply both insights to your scenario, we get:
#!/bin/bash
read -p "enter a number: " LISTNUMBER
USERNAME=$(awk -v LISTNUMBER="$LISTNUMBER" '
$1 == LISTNUMBER { print $2 }' /home/user/list)
echo "you chose $USERNAME."

AWK Multiple Field Separators and Variables

I am trying to perform calculations in awk using fields whose numbers are passed in from the shell, as well as the last four fields
e.g. I call my shell script like this
./myProgram myFile.txt 1 2 3 4
Then within my shell script I want to use awk to refer to fields in a text file like this, specifically the last four fields, $(NF-3) through $NF:
0000000022:trevor:736:1,2:3,4
0000000223:john:73:5,6:7,8
0000002224:eliza:54:9,8:7,6
0000022225:paul:22:5,4:3,2
0000222226:chris:0:1,2:3,4
So I can go through the fields; however, because there are two types of field separators, it doesn't seem to work.
My shell script so far:
#! /usr/bin/env bash
file="$1"
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u1 =", $u1 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "v1 =", $v1 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u2 =", $u2 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "v2 =", $v2 }' $1
echo "Argument #1 =" $2
echo "Argument #2 =" $3
echo "Argument #3 =" $4
echo "Argument #4 =" $5
This is the output I get from terminal:
u1 = 1
u1 = 5
u1 = 9
u1 = 5
u1 = 1
v1 = awk: illegal field $(), name "v1"
input record number 1, file database.txt
source line number 1
u2 = awk: illegal field $(), name "u2"
input record number 1, file database.txt
source line number 1
v2 = awk: illegal field $(), name "v2"
input record number 1, file database.txt
source line number 1
Argument #1 = 1
Argument #2 = 2
Argument #3 = 3
Argument #4 = 4
When you use $N in awk, it will retrieve field N. You can use this in combination with passing arguments to awk as you have done to access a field number defined in a shell variable. The main issue would appear to be that you are passing variables that haven't been set in your script.
In your example invocation of the script, you're not passing enough arguments for positional parameters $6 and above to be defined. This is what is causing your error messages that look like illegal field $(), because v1 is an empty string, so you're attempting to get a field with no number.
NF is a special variable in awk that contains the number of fields, so to access the last four fields, you can use $(NF-3), $(NF-2), $(NF-1), and $NF.
There was a stray \ in the awk command line which wasn't doing anything useful, so I removed that as well.
There are a couple of other issues with your code that are worth mentioning too. Quote your shell variables! This prevents issues with word splitting on more complex variables. If your arguments are numbers with no spaces, this won't make any difference but it does no harm either and is a good practice to get into. You've defined file, so I've used that instead of $1.
Combining those changes, we end up with something like this:
awk -F'[:,]' -v u1="$2" -v v1="$3" -v u2="$4" -v v2="$5" '{ print "u1 =", u1 }' "$file"
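Extending that corrected invocation, a fuller sketch that prints all four requested fields plus the last four fields of each line (the field numbers passed on the command line are illustrative):

```shell
#!/usr/bin/env bash
file="$1"
awk -F'[:,]' -v u1="$2" -v v1="$3" -v u2="$4" -v v2="$5" '{
    # u1..v2 hold field *numbers*, so $u1 etc. fetches those fields;
    # NF is the total field count, so $(NF-3)..$NF are the last four.
    print "u1 =", $u1, "v1 =", $v1, "u2 =", $u2, "v2 =", $v2
    print "last four:", $(NF-3), $(NF-2), $(NF-1), $NF
}' "$file"
```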
Looking at just this one line:
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u1 =", $u1 }' $1
Here $5, $6, $7 and $8 are the bash positional parameters, not awk field references.
Here you have 5 parameters to your script according to your command line:
./myProgram myFile.txt 1 2 3 4
$1 = myFile.txt
$2 = 1
$3 = 2
$4 = 3
$5 = 4
$6 =
$7 =
$8 =
That's why awk only alerts you starting with $v1: v1 is empty, so $v1 is equivalent to a bare $, which is not a valid field reference.
If I understood your problem properly, you want to work with the last four fields of each line:
awk -F'[:,]' '{ print "u1=",$(NF-3),"v1=",$(NF-2),"u2=",$(NF-1),"v2=",$NF }' "$1"
NF being the number of fields, NF minus 3 gives the fourth field from the end.

How to execute awk command in shell script

I have an awk command that extracts the 16th column from 3rd line in a csv file and prints the first 4 characters.
awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}'
This works fine.
But when I execute it from a shell script, I get an error:
#!/bin/ksh
YEAR=awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}'
Error message:
-F,: not found
Use command substitution to assign the output of a command to a variable, as shown below:
YEAR=$(awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}')
You are asking the shell to do:
VAR=value command [arguments...]
which means: launch command, but with the VAR=value assignment in its environment
(e.g. LC_ALL=C grep '[0-9]*' /some/file.txt will grep for a number in file.txt, with the LC_ALL variable set to C just for the duration of the grep call).
So here you ask the shell to launch the command -F, (once the shell interprets "," as ,), with arguments 'NR==3'... and with the variable YEAR set to the value awk for the duration of the command invocation.
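This environment-prefix behaviour can be seen with a throwaway variable name:

```shell
unset GREETING
# The assignment is part of the command's environment, not the shell's:
GREETING=hello sh -c 'echo "$GREETING"'   # prints: hello
echo "${GREETING-unset}"                  # prints: unset
```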
Just replace it with:
#!/bin/ksh
YEAR="$(awk -F',' 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,1,4)}')"
(I didn't try it, but I hope it works for you and your sample.csv file.)
(Note that you used "0" as the starting character position, which works in many awk implementations but not all; most, but not all, assume 1 when you write 0.)
From your description, it looks like you want to extract the year from the 16th field, which might contain leading spaces. You can accomplish it by calling AWK once:
YEAR=$(awk -F, 'NR==3{sub(/^[ \t]*/, "", $16); print substr($16,1,4)}' sample.csv)
Better yet, you don't even have to use awk. Since you are already writing shell script, let's do it all in shell script:
{ read line; read line; read line; } < sample.csv  # Get the third line
IFS=,; set -- $line; unset IFS                     # Break the line into comma-separated fields
set -- ${16}                                       # Re-splitting field 16 on whitespace removes the leading spaces; it becomes field 1
YEAR=${1:0:4}                                      # Extract the first 4 chars from field 1
Do this:
year=$(awk -F, 'NR==3{sub(/^[ \t]+/,"",$16); print substr($16,1,4); exit }' sample.csv)
