Passing a shell variable into an awk command - bash

I have a loop that processes a bunch of files within a dir. I would like to insert the filename into the output for each file it processes, but I'm getting an error. It works perfectly without the myvar syntax, but I need the variable for obvious reasons.
Error
awk: cmd. line:1: RS=
awk: cmd. line:1: ^ unexpected newline or end of string
Command
for filename in $files
do
awk -v "myvar=${filename}"
RS= '/-- Ticket/{++i; print "PROMPT myvar Line ",
i ORS $0 ORS; i+=split($0, a, /\n/)+1}' ${filename}.txt
done

There are a couple of issues here. Use a separate -v option for each variable that you pass to awk, including RS, and reference myvar outside the string literal so that awk treats it as a variable:
awk -v myvar="${filename}" -v RS= '/-- Ticket/{++i; print "PROMPT " myvar " Line ", i ORS $0 ORS; i+=split($0, a, /\n/)+1}' "${filename}.txt"
#   ^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^ --> a separate -v for each variable
Without -v, awk takes RS= as the program text, which is what produces the error above.
The quoting style is not the problem: "myvar=${filename}" and myvar="${filename}" are identical to the shell.
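The following is a minimal sketch of the corrected call, run against throwaway sample data. The file name demo.txt, its contents, and the counter n are assumptions made up for illustration; they are not from the question.

```shell
# Hypothetical sample data: two blank-line-separated records in one file.
tmpdir=$(mktemp -d)
printf '%s\n' '-- Ticket 1' 'foo' '' '-- Ticket 2' 'bar' > "$tmpdir/demo.txt"

filename="$tmpdir/demo"
# Each awk variable gets its own -v; RS= (empty) selects paragraph mode.
out=$(awk -v myvar="$(basename "$filename")" -v RS= \
  '/-- Ticket/ { n++; print "PROMPT " myvar " record " n }' "${filename}.txt")
printf '%s\n' "$out"
rm -rf "$tmpdir"
```

Because myvar sits outside the quoted strings in the awk program, awk substitutes its value instead of printing the literal word.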

If you can avoid the shell loop, do so: a single awk call performs better, mainly because it avoids forking a new process for every file.
# assume TXT_files holds the list of $files with the .txt extension added (building it is outside the scope of this question)
awk -v RS= '
# restart the line counter for each file, as the loop version did
FNR == 1 { i = 0 }
/-- Ticket/{
# get the file name without extension
myvar = FILENAME; sub(/\.txt$/, "", myvar)
print "PROMPT " myvar " Line ", ++i ORS $0 ORS
i += split($0, a, /\n/) + 1
}
' ${TXT_files}
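As a rough illustration of the FILENAME approach, the sketch below runs one awk process over two files and derives the label from each file's name. The files a.txt and b.txt and their contents are hypothetical.

```shell
# Hypothetical sample files, one record each.
tmpdir=$(mktemp -d)
printf '%s\n' '-- Ticket 1' 'foo' > "$tmpdir/a.txt"
printf '%s\n' '-- Ticket 2' 'bar' > "$tmpdir/b.txt"

# One awk process handles both files; FILENAME changes as awk moves on.
out=$(cd "$tmpdir" && awk -v RS= '
  /-- Ticket/ {
    name = FILENAME
    sub(/\.txt$/, "", name)   # strip the extension
    print "PROMPT " name
  }' a.txt b.txt)
printf '%s\n' "$out"
rm -rf "$tmpdir"
```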

Related

How to pad a CSV first column with zeroes in awk?

I have a CSV like this:
1,"Paris","3.57"
10,"Singapore","3.57"
211,"Sydney","3.28"
324,"Toronto Center","3.33"
I'd like to pad the first column with zeroes to get:
001,"Paris","3.57"
010,"Singapore","3.57"
211,"Sydney","3.28"
324,"Toronto Center","3.33"
I tried to assign the first column to the output of printf with awk:
awk '{ $1 = printf("%03d", $1); print }' my.csv
But it gives me a syntax error :
awk: cmd. line:1: { $1 = printf("%03d", $1); print }
awk: cmd. line:1: ^ syntax error
It doesn't work either if I quote the printf function.
How could I do that?
If you want just to format the text of one field then you can use sprintf of awk.
awk '{ $1=sprintf("%03d", $1)}1' csvfile
Or standard way:
awk '{printf "%03d %s\n", $1,$2}' csvfile
As per update by OP in question:
awk 'BEGIN{FS=OFS=","}{ $1=sprintf("%03d", $1)}1' csvfile
printf is not a function, it is a keyword, and its result cannot be assigned.
To return a formatted string, use sprintf (which is a function):
awk -F, -v OFS=, '{ $1 = sprintf("%03d", $1) } 1' file
It is necessary to set FS (via -F) and OFS so that when awk reformats the line, the field separators remain intact.
As pointed out in the comments, using %d can lead to problems when the input starts with a 0, because some awk implementations interpret numbers with a leading 0 as octal. Input such as 08 can then break, because 8 is outside the octal digit range (0-7).
One way to get around this is to use %03.0f, which interprets the input as a floating point value, with the output precision set to 0:
awk -F, -v OFS=, '{ $1 = sprintf("%03.0f", $1) } 1' file
(the second 0 in the format specifier can in fact be omitted)
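A quick way to check the %03.0f variant is to feed it a few lines on standard input. The sample rows below are assumptions that mirror the question's CSV; the 08 row is added to exercise the leading-zero case.

```shell
# Hypothetical input: rows in the question's format, one with a leading zero.
out=$(printf '%s\n' '1,"Paris","3.57"' '08,"Oslo","3.10"' '211,"Sydney","3.28"' |
  awk -F, -v OFS=, '{ $1 = sprintf("%03.0f", $1) } 1')
printf '%s\n' "$out"
```

Setting both FS and OFS to a comma keeps the separators intact when awk rebuilds the record after the assignment to $1.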
awk '{printf("%03d", $1) ; print " "$2}' my.csv

Replacing a csv field value using awk

I have a csv file like the following example:
fieldname1: "This is field1", "id":55, fieldname2: "This is field2", "id":66
I would like to replace the fourth field from ""id":66" to ""id":72" using the awk command. I have tried it the following way but am getting a syntax error:
awk -F, '{${4}="\"id\":999";}1' OFS=, rule.txt
The error is:
awk: {${4}="\"id\":999";}1
awk: ^ syntax error
awk: {${4}="\"id\":999";}1
awk: ^ syntax error
awk: cmd. line:1: {${4}="\"id\":999";}1
awk: cmd. line:1: ^ unexpected newline or end of string
Any suggestions for correct way of doing this?
You just need to say $4 instead of ${4}:
$ awk -F, '{$4="\"id\":999";}1' OFS=, file
#           ^^
fieldname1: "This is field1", "id":55, fieldname2: "This is field2","id":999
If you want to give the value via a variable, use -v value="$bash_var" as usual:
$ awk -F, -v val=999 '{$4="\"id\":" val;}1' OFS=, file
#         ^^^^^^^^^^                ^^^
fieldname1: "This is field1", "id":55, fieldname2: "This is field2","id":999
Note that ${ } is used in Bash to delimit a variable name: echo "$hello_you" expands the variable hello_you, whereas echo "${hello}_you" expands hello and then appends _you.
In awk this is not necessary, because variables are never expanded inside string literals; you concatenate them with the quoted parts instead:
$ awk 'BEGIN {a=23; print a"_b"}'
23_b
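To tie this back to the question, here is a small sketch that replaces the fourth field with a value taken from a shell variable. The variable name new_id is hypothetical, and the leading space inside the replacement string is an assumption made to preserve the original spacing after the comma.

```shell
new_id=72   # hypothetical shell variable holding the replacement id
line='fieldname1: "This is field1", "id":55, fieldname2: "This is field2", "id":66'

# $4 is a plain awk field reference; val arrives through -v.
out=$(printf '%s\n' "$line" |
  awk -F, -v OFS=, -v val="$new_id" '{ $4 = " \"id\":" val } 1')
printf '%s\n' "$out"
```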

AWK (unexpected newline or end of string)

I'm trying to execute a bash script, but I got stuck.
data
$ cat test.txt
cat,command,for
cp,command,for
ls,command,to
script
#!/bin/bash
CUT_FILE=test.txt
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"
LCTN_ADDR1="This is test"
cat $CUT_FILE | awk -F ',' '{ print '$TRN_GUID','$LCTN_ID','$LCTN_NAME','$LCTN_ADDR1',$1,$2 }'
output
-bash-3.2# sh test4
awk: cmd. line:1: { print 1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST
awk: cmd. line:1: ^ unexpected newline or end of string
desired output
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,"TEST FILE","This is test",cat,command
Any ideas?
Syntax Error
You are closing your script using the single quotes.
awk -F ',' '{ print '$TRN_GUID','$LCTN_ID','$LCTN_NAME','$LCTN_ADDR1',$1,$2 }'
Can you see, using the syntax highlighting, that the script gets closed after print '? You should use double quotation marks inside your script to prevent this, like so:
awk -F ',' '{ print "$TRN_GUID", "$LCTN_ID", "$LCTN_NAME", "$LCTN_ADDR1", $1, $2 }'
Do you notice the difference in syntax highlighting?
Undesired output
You cannot access bash variables directly inside a single-quoted awk script. Instead, pass them in as awk variables with the -v option. Note that $1 and $2 are awk fields here, so they stay in the script as they are:
awk -F ',' -v OFS=',' -v trn_guid="$TRN_GUID" -v lctn_id="$LCTN_ID" -v lctn_name="$LCTN_NAME" -v lctn_addr1="$LCTN_ADDR1" '{ print trn_guid, lctn_id, lctn_name, lctn_addr1, $1, $2 }' "$CUT_FILE"
The syntax is simply -v variable="value". You can put anything in the value, including bash variables. The variable will be accessible in the awk script using the name you chose.
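The sketch below applies the -v approach and also wraps the name in double quotes, as in the desired output from the question. The awk variable names guid, id, and name are arbitrary choices for illustration, and only three of the four shell variables are used to keep it short.

```shell
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"

# OFS=, makes print join its arguments with commas.
out=$(printf '%s\n' 'cat,command,for' 'cp,command,for' |
  awk -F, -v OFS=, -v guid="$TRN_GUID" -v id="$LCTN_ID" -v name="$LCTN_NAME" \
    '{ print guid, id, "\"" name "\"", $1, $2 }')
printf '%s\n' "$out"
```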
You don't need awk for this. Just use it like this in pure BASH:
CUT_FILE="test.txt"
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"
LCTN_ADDR1="This is test"
while read -r line; do
echo "$TRN_GUID,$LCTN_ID,$LCTN_NAME,$LCTN_ADDR1,${line%,*}"
done < "$CUT_FILE"
Output:
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,cat,command
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,cp,command
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,ls,command

Passing variables into awk from bash

I am writing a shell script in which I have to print certain columns of a file, so I am trying to use awk. The column numbers are calculated in the script. nprop is a for-loop variable that runs from 1 to 8.
avg=1+3*$nprop
awk -v a=$avg '{print $a " " $a+1 " " $a+2}' $filename5 >> neig5.dat
I have tried the following also:
awk -v a=$avg '{print $a " " $(a+1) " " $(a+2) }' $filename5 >> neig5.dat
This results in printing the first three columns all the time.
avg=1+3*$nprop
This will set $avg to the literal string 1+3*4 if $nprop is 4, for instance. You should be evaluating that expression:
avg=$(( 1+3*$nprop ))
And use the version of the awk script with parentheses.
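A minimal sketch of the two fixes together, arithmetic expansion in the shell and parenthesized field expressions in awk, might look like this. The value of nprop and the ten-column sample line are assumptions for illustration.

```shell
nprop=2                    # hypothetical loop value
avg=$(( 1 + 3 * nprop ))   # arithmetic expansion: avg is 7, not the string "1+3*2"

# $(a+1) selects field a+1, whereas $a+1 would add 1 to the value of field a.
out=$(printf '%s\n' 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10' |
  awk -v a="$avg" '{ print $a, $(a+1), $(a+2) }')
printf '%s\n' "$out"
```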
This single awk script is a translation of what you want:
awk '{for(i=4;i<=25;i+=3)printf "%s %s %s ",$i,$(i+1),$(i+2);print ""}' "$filename5"
You don't need to parse your file 8 times in a shell loop; just parse it once with awk.
Use a BEGIN{ } block to create a couple of awk variables:
avg=$((1+3*$nprop))
awk -v a=$avg 'BEGIN{ap1=a+1;ap2=a+2} {print $a " " $ap1 " " $ap2}' $filename5 >> neig5.dat
awk -v n="$nprop" 'BEGIN{x=3*n} {a=x; print $++a, $++a, $++a}' file
If you just want your seed value (nprop) to increment on every pass over the file and to process the file 8 times, get rid of your external loop and just do this:
awk 'BEGIN{for (i=2;i<=8;i++) ARGV[ARGC++] = ARGV[1]} FNR==1{n++} {a=3*n; print $++a, $++a, $++a}' file
Here n counts the passes over the file. In GNU awk you can use ARGIND instead of n.

Awk - unterminated regex

I am writing a shell script which needs to pull values out of a text file which looks like this:
app.full.name /warfilelocation/ warfilename
My shell script will be iterating over a list of application names and pulling out either the location or name using AWK. I have tested doing this on the command line using the following:
awk "\$1 ~/app.full.name/ { print $2 }" applications.txt
which returns what I would expect. However, when I put this in a shell script, I start having issues.
I have a function that looks like this:
function get_location() {
local application=$1
awk "\$1 ~/^$application/ { print \$2 }" applications.txt
}
But when I call this function, I get the following error:
awk: $1 ~/^app.full.name
awk: ^ unterminated regexp
awk: cmd. line:1: app.full.name
awk: cmd. line:1: ^ syntax error
awk: cmd. line:2: app.full.name/ { print $2 }
awk: cmd. line:2: ^ syntax error
Does anyone have any idea what I am doing wrong here? I presume I am not escaping the variable correctly, but no matter what I try it doesn't seem to work.
Thanks in advance
Use this approach to make awk recognize shell variables:
awk -v "v1=$VAR1" -v "v2=$VAR2" '{print v1, v2}' input_file
Update
$ cat input
tinky-winky
dipsy
laa-laa
noo-noo
po
$ teletubby='po'
$ awk -v "regexp=$teletubby" '$0 ~ regexp' input
po
Note that anything can go into the shell variable, even a full-blown regexp such as ^d.*y. Just make sure to single-quote the value in the shell assignment to prevent the shell from doing any expansion.
The error messages seem to indicate that there is a stray newline at the end of $application, which gives the "line 2" error messages.
see this: using awk match() function
kent$ app=app.ful
kent$ echo "app.full.name /warfilelocation/ warfilename"|awk -v a=$app '{if(match($1,a))print $2}'
/warfilelocation/
It's hard to tell without knowing exactly the value of $application, but it seems like you have a strange character in $application, such as a " or a / or something like that.
$ export application=foo/bar
$ awk "\$1 ~/^$application/ { print \$1 }"
gawk: cmd. line:1: $1 ~/^foo/bar/ { print $1 }
gawk: cmd. line:1: ^ parse error
I would look at the exact value that you have in $application, and if it contains a /, escape it.
One way to do this would be to use:
$ export application=`echo foo/bar | sed -e 's;/;\\\\/;g'`
$ awk "\$1 ~/^$application/ { print \$1 }"
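An alternative that sidesteps the escaping of / is to pass the name with -v, so the pattern is never spliced into /.../ delimiters. The sketch below shows a dynamic regexp match and a plain string comparison; the awk variable name app is an assumption. One caveat: awk processes backslash escape sequences in -v values, so names containing backslashes would need extra care.

```shell
application='app.full.name'   # value from the question
line='app.full.name /warfilelocation/ warfilename'

# Dynamic regexp: built from strings, so a / in the name needs no escaping.
out=$(printf '%s\n' "$line" |
  awk -v app="$application" '$1 ~ ("^" app) { print $2 }')

# Plain string comparison: no regexp metacharacters are interpreted at all.
out_literal=$(printf '%s\n' "$line" |
  awk -v app="$application" '$1 == app { print $2 }')

printf '%s\n' "$out" "$out_literal"
```

The string comparison is the stricter of the two: in the regexp form, the dots in app.full.name still match any character.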
