Replacing a csv field value using awk - bash

I have a csv file like the following example:
fieldname1: "This is field1", "id":55, fieldname2: "This is field2", "id":66
I would like to replace the fourth field, "id":66, with a new value such as "id":72 using awk. I tried the following, but I get a syntax error:
awk -F, '{${4}="\"id\":999";}1' OFS=, rule.txt
The error is:
awk: {${4}="\"id\":999";}1
awk: ^ syntax error
awk: {${4}="\"id\":999";}1
awk: ^ syntax error
awk: cmd. line:1: {${4}="\"id\":999";}1
awk: cmd. line:1: ^ unexpected newline or end of string
Any suggestions for the correct way of doing this?

You just need to say $4 instead of ${4}:
$ awk -F, '{$4="\"id\":999";}1' OFS=, file
# ^^
fieldname1: "This is field1", "id":55, fieldname2: "This is field2","id":999
If you want to give the value via a variable, use -v value="$bash_var" as usual:
$ awk -F, -v val=999 '{$4="\"id\":" val;}1' OFS=, file
# ^^^^^^^^^^ ^^^^^
fieldname1: "This is field1", "id":55, fieldname2: "This is field2","id":999
Note that ${ } is Bash syntax, used to delimit a variable name: echo "$hello_you" expands the variable hello_you, so to expand $hello followed by the literal text _you you would write echo "${hello}_you".
In awk this is not necessary, because variables are written without a $ and literal strings are enclosed in double quotes; $ is the field operator, so $4 simply means field 4:
$ awk 'BEGIN {a=23; print a"_b"}'
23_b
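As a small sketch of the same idea, the field number itself can be passed in as an awk variable, since $ is an operator applied to an expression. The variable names n and val are illustrative, and the leading space in the replacement is an assumption made here to keep the original spacing after the comma:

```shell
# Sample line from the question.
line='fieldname1: "This is field1", "id":55, fieldname2: "This is field2", "id":66'

# n is the field to replace and val the new id; both are passed with -v.
# $n means "the field whose number is stored in n", so no braces are needed.
out=$(printf '%s\n' "$line" |
  awk -F, -v OFS=, -v n=4 -v val=72 '{ $n = " \"id\":" val } 1')

printf '%s\n' "$out"
```

Assigning to a field makes awk rebuild the line with OFS, which is why OFS is set to a comma as well.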

Related

Bash-shell grammar issue about Ternary operator and printf function

cat file
chenghuanghuijia jidianzhong 100 E20128
pannybudaqiu gujihuihenwan -1
shuijiao buxihuan 20 E20138
huijiakan babamama 10
I want the last field to be printed when it matches E2[0-9]*$; when it does not match, an empty line (NULL or \n) should be printed instead.
Here is my code:
awk '{printf("%s\n",($NF~/E2[0-9]*$/? "E2.*$" : NULL))}' file
The output as below:
E2.*$
E2.*$
but I want to print E20128 and E20138, so I modified my code by removing the double quotes:
awk '{printf("%s\n",($NF~/E2[0-9]*$/? E2.*$ : NULL))}' file
Then it comes the error:
awk: cmd. line:1: {printf("%s\n",($NF~/E2[0-9]*$/? E2.*$ : NULL))}
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {printf("%s\n",($NF~/E2[0-9]*$/? E2.*$ : NULL))}
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {printf("%s\n",($NF~/E2[0-9]*$/? E2.*$ : NULL))}
awk: cmd. line:1: ^ syntax error
So I think this is a grammar issue with the ternary operator (_?x:y) or the printf function.
Please help.
You can try this awk
awk '{$0=$NF}!/E2[0-9]*$/{$0=""}1' infile
{$0=$NF} : for each line replace the complete line by the last field.
!/E2[0-9]*$/ : if the line does not match the regex
{$0=""} : substitute the complete line by nothing
1 : print each line
With GNU sed:
sed '/.*\(E2[0-9]*\)$/s//\1/;//!s/.*//' file
/.*\(E2[0-9]*\)$/: match lines ending in E2[0-9]*, capture that string, and replace the line with it using a backreference (s//\1/)
//!: if the line does not match, clear it (s/.*//)
After thinking for a while, I found that the following works:
awk '{printf("%s\n",($NF~/E2[0-9]*$/? $NF : NULL))}' file
If you just want to extract the E2..., a simple grep should be enough:
grep -o "E2[0-9]*$" file
-o is the option for extracting the matched pattern
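For comparison, here is a hedged sketch of the ternary form run against the sample data from the question. It uses print instead of printf and an explicit empty string instead of the uninitialized variable NULL, and it anchors the regex at the start of the field, which is an assumption about the intended match:

```shell
# Feed the four sample lines to awk and keep the result in out.
out=$(printf '%s\n' \
  'chenghuanghuijia jidianzhong 100 E20128' \
  'pannybudaqiu gujihuihenwan -1' \
  'shuijiao buxihuan 20 E20138' \
  'huijiakan babamama 10' |
  awk '{ print ($NF ~ /^E2[0-9]*$/ ? $NF : "") }')

# Lines 1 and 3 print their last field; lines 2 and 4 print an empty line.
printf '%s\n' "$out"
```

The key point is that $NF is an expression and can be used as a ternary branch, whereas a bare regex fragment such as E2.*$ is not valid awk syntax.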

Passing a shell variable into an awk command

I have a loop that processes a bunch of files in a directory. I would like to insert the filename into the file it processes, but I'm getting an error. It works perfectly without the myvar assignment, but I need it for obvious reasons.
Error
awk: cmd. line:1: RS=
awk: cmd. line:1: ^ unexpected newline or end of string
Command
for filename in $files
do
awk -v "myvar=${filename}"
RS= '/-- Ticket/{++i; print "PROMPT myvar Line ",
i ORS $0 ORS; i+=split($0, a, /\n/)+1}' ${filename}.txt
done
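There are a couple of issues here. Use a separate -v option for each variable that you pass to awk:
awk -v myvar="${filename}" -v RS= '/-- Ticket/{++i; print "PROMPT " myvar " Line ", i ORS $0 ORS; i+=split($0, a, /\n/)+1}' ${filename}.txt
# ^^^ variable1 ^^^^^ variable2 --> using separate -v for each
This should be the right approach. Also note that myvar has to be outside the double quotes in the print statement, otherwise awk prints the literal text myvar.
To import a shell variable into awk, do it as in the example above. To the shell, -v "myvar=${filename}" and -v myvar="${filename}" are equivalent; what matters is that each assignment is preceded by its own -v, otherwise awk takes RS= as the program text.
If you can avoid the shell loop, that is better (mainly for performance: one awk process instead of one fork per file):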
# assume TXT_files is the list of $files with .txt extension (not purpose of this OP)
awk -v RS='' '
/-- Ticket/{
# get the file name without extension
myvar = FILENAME;sub( /\.txt$/,"",myvar)
print "PROMPT " myvar " Line " ++i ORS $0 ORS
i += split( $0, a, /\n/) + 1
}
' ${TXT_files}
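The following is a minimal, self-contained sketch of the -v approach, simplified from the commands above (it omits the split() bookkeeping). The temporary directory, the file name demo.txt and its contents are made up for illustration:

```shell
# Create a throwaway input file with two blank-line-separated records.
tmpdir=$(mktemp -d)
printf '%s\n' '-- Ticket 1' 'first body' '' '-- Ticket 2' 'second body' \
  > "$tmpdir/demo.txt"

filename="$tmpdir/demo"

# One -v per assignment: myvar gets the base name, RS= enables paragraph mode.
out=$(awk -v myvar="${filename##*/}" -v RS= \
  '/-- Ticket/ { print "PROMPT " myvar " Line " ++i }' "${filename}.txt")

printf '%s\n' "$out"
rm -rf "$tmpdir"
```

With RS set to the empty string, awk treats each blank-line-separated paragraph as one record, so the pattern is tested once per ticket.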

AWK (unexpected newline or end of string)

I'm trying to run a bash script, but I'm stuck.
data
$ cat test.txt
cat,command,for
cp,command,for
ls,command,to
script
#!/bin/bash
CUT_FILE=test.txt
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"
LCTN_ADDR1="This is test"
cat $CUT_FILE | awk -F ',' '{ print '$TRN_GUID','$LCTN_ID','$LCTN_NAME','$LCTN_ADDR1',$1,$2 }'
output
-bash-3.2# sh test4
awk: cmd. line:1: { print 1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST
awk: cmd. line:1: ^ unexpected newline or end of string
desired output
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,"TEST FILE","This is test",cat,command
Any ideas?
Syntax Error
The single quotes inside the program close the awk script early.
awk -F ',' '{ print '$TRN_GUID','$LCTN_ID','$LCTN_NAME','$LCTN_ADDR1',$1,$2 }'
The script is closed right after print ', so the shell expands the variables outside the quotes, and the space in "TEST FILE" then splits the program into separate arguments. You should use double quotation marks inside your script to prevent this, like so:
awk -F ',' '{ print "$TRN_GUID", "$LCTN_ID", "$LCTN_NAME", "$LCTN_ADDR1", $1, $2 }'
This is now syntactically valid, but it prints the literal text $TRN_GUID and so on rather than the values.
Undesired output
The shell does not expand variables inside a single-quoted awk script; to use them you should pass them in as awk variables with the -v option. $1 and $2 stay inside the script, because there they refer to awk fields, not shell arguments. To illustrate this point:
awk -F ',' -v OFS=',' -v trn_guid="$TRN_GUID" -v lctn_id="$LCTN_ID" -v lctn_name="$LCTN_NAME" -v lctn_addr1="$LCTN_ADDR1" '{ print trn_guid, lctn_id, lctn_name, lctn_addr1, $1, $2 }' "$CUT_FILE"
The syntax is simply -v variable="value". The value can contain shell variables (note that awk interprets backslash escapes in it). The variable will be accessible in the awk script using the name you chose.
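To get the quoted fields shown in the desired output, one possible sketch passes a double-quote character in as a variable too. The name q is an illustrative choice, and the input is piped in with printf instead of being read from test.txt:

```shell
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"
LCTN_ADDR1="This is test"

# q holds a literal double quote so the awk program needs no escaped quotes.
out=$(printf '%s\n' 'cat,command,for' 'cp,command,for' 'ls,command,to' |
  awk -F ',' -v OFS=',' -v q='"' \
      -v trn_guid="$TRN_GUID" -v lctn_id="$LCTN_ID" \
      -v lctn_name="$LCTN_NAME" -v lctn_addr1="$LCTN_ADDR1" \
      '{ print trn_guid, lctn_id, q lctn_name q, q lctn_addr1 q, $1, $2 }')

printf '%s\n' "$out"
```

Setting OFS to a comma makes print join its arguments with commas, so the separators do not have to be spelled out in the program.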
You don't need awk for this. You can do it in pure Bash like this:
CUT_FILE="test.txt"
TRN_GUID="1bcd1adf-2016-443b-9f00-2e4ce20726d7"
LCTN_ID="8002"
LCTN_NAME="TEST FILE"
LCTN_ADDR1="This is test"
while read -r line; do
echo "$TRN_GUID,$LCTN_ID,$LCTN_NAME,$LCTN_ADDR1,${line%,*}"
done < "$CUT_FILE"
Output:
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,cat,command
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,cp,command
1bcd1adf-2016-443b-9f00-2e4ce20726d7,8002,TEST FILE,This is test,ls,command

AWK syntax error - what's causing it?

I have simple bash script:
#!/bin/sh
column=${1:-1}
awk ' {colawk='$column'+2; print $colawk}'
awk '(x=4; print $x)'
But I have received error:
awk: (x=4; print $x)
awk: ^ syntax error
awk: cmd. line:1: (x=4; print $x)
awk: cmd. line:1: ^ unexpected newline or end of string
Why? Code in the previous line works.
An AWK program is a series of pattern-action pairs, written as:
condition { action }
where condition is typically an expression and action is a series of statements.
print is not an expression but a statement, so it cannot appear inside a parenthesized condition; the syntax error is expected.
Your problem is with using parentheses instead of braces. Try:
awk '{x=4; print $x}'
instead, as in the following transcript:
pax$ echo a b c d e | awk '(x=4; print $x)'
awk: cmd. line:1: (x=4; print $x)
awk: cmd. line:1: ^ syntax error
awk: cmd. line:2: (x=4; print $x)
awk: cmd. line:2: ^ unexpected newline or end of string
pax$ echo a b c d e | awk '{x=4; print $x}'
d
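A short sketch that uses both halves of a pattern-action pair, and passes the column number with -v instead of splicing it into the quoted program. The condition NF >= 4 and the name column are illustrative:

```shell
# The condition (NF >= 4) sits outside the braces; the statements sit inside.
out=$(echo a b c d e |
  awk -v column=2 'NF >= 4 { x = column + 2; print $x }')

printf '%s\n' "$out"
```

Parentheses may wrap the condition expression, but statements such as print always belong inside the braces.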

Awk - unterminated regex

I am writing a shell script which needs to pull values out of a text file which looks like this:
app.full.name /warfilelocation/ warfilename
My shell script will be iterating over a list of application names and pulling out either the location or name using AWK. I have tested doing this on the command line using the following:
awk "\$1 ~/app.full.name/ { print $2 }" applications.txt
which returns what I would expect; however, when I put this in a shell script I start having issues.
I have a function that looks like this:
function get_location() {
local application=$1
awk "\$1 ~/^$application/ { print \$2 }" applications.txt
}
But when I call this function I get the following error:
awk: $1 ~/^app.full.name
awk: ^ unterminated regexp
awk: cmd. line:1: app.full.name
awk: cmd. line:1: ^ syntax error
awk: cmd. line:2: app.full.name/ { print $2 }
awk: cmd. line:2: ^ syntax error
Does anyone have any idea what I am doing wrong here? I presume I am not escaping the variable correctly, but no matter what I try it doesn't seem to work.
Thanks in advance
Use this approach to make awk recognize shell variables:
awk -v "v1=$VAR1" -v "v2=$VAR2" '{print v1, v2}' input_file
Update
$ cat input
tinky-winky
dipsy
laa-laa
noo-noo
po
$ teletubby='po'
$ awk -v "regexp=$teletubby" '$0 ~ regexp' input
po
Note that anything could go into the shell-variable,
even a full-blown regexp, e.g ^d.*y. Just make sure to use single-quotes
to prevent the shell from doing any expansion.
The error messages seem to indicate that there is a stray newline at the end of $application, which gives the "line 2" error messages.
see this: using awk match() function
kent$ app=app.ful
kent$ echo "app.full.name /warfilelocation/ warfilename"|awk -v a=$app '{if(match($1,a))print $2}'
/warfilelocation/
It's hard to tell without knowing exactly the value of $application, but it seems like you have a strange character in $application, such as a " or a / or something like that.
$ export application=foo/bar
$ awk "\$1 ~/^$application/ { print \$1 }"
gawk: cmd. line:1: $1 ~/^foo/bar/ { print $1 }
gawk: cmd. line:1: ^ parse error
I would look at the exact value that you have in $application, and if it contains a /, escape it.
One way to do this would be to use:
$ export application=`echo foo/bar | sed -e 's;/;\\\\/;g'`
$ awk "\$1 ~/^$application/ { print \$1 }"
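If the application name should be matched literally rather than as a regex, one alternative sketch is to pass it with -v and compare with index(), which sidesteps escaping of / and . altogether. The function below reads from standard input rather than applications.txt, which is a simplification for the example:

```shell
get_location() {
    # "$1" here is the shell argument (the application name).
    # Inside the single-quoted program, $1 and $2 are awk fields.
    # index($1, app) == 1 is true when field 1 starts with the literal text app.
    awk -v app="$1" 'index($1, app) == 1 { print $2 }'
}

out=$(echo 'app.full.name /warfilelocation/ warfilename' |
  get_location 'app.full.name')

printf '%s\n' "$out"
```

Because the name never becomes part of the program text, a stray slash or newline in it cannot cause an unterminated regexp error.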
