awk print variable with spaces - bash

How do I pass a variable containing a space to awk as one variable?
code:
a="one"
b="two three"
c="four"
echo $a $b $c | awk '{ print "A="$1 "\nB="$2 "\nC="$3 }'
expected output:
A=one
B=two three
C=four
actual output:
A=one
B=two
C=three

You could use awk's -v option to pass shell variables to awk:
a="one"
b="two three"
c="four"
awk -v a="$a" -v b="$b" -v c="$c" 'BEGIN { print "A="a "\nB="b "\nC="c }'
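One caveat with -v is that awk processes backslash escape sequences in the assigned value, so a value containing a backslash can be altered. A minimal sketch of an alternative, assuming the variable names from the question and a made-up value for c, that passes the values through the environment and reads them from ENVIRON:

```shell
a="one"
b="two three"
c='four\tfive'   # hypothetical value with a backslash, to show it survives

# Export the values for this one command only; awk reads them from ENVIRON,
# which does not interpret backslash escapes the way -v does.
out=$(a="$a" b="$b" c="$c" awk 'BEGIN {
    print "A=" ENVIRON["a"]
    print "B=" ENVIRON["b"]
    print "C=" ENVIRON["c"]
}')
printf '%s\n' "$out"
```

Whether this is preferable depends on the data; for values without backslashes, -v is shorter.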

There are many possible answers, and which one is right for you depends on what you are trying to do, which you haven't told us. Here's another possibility, using GNU awk for FPAT:
$ echo "\"$a\" \"$b\" \"$c\"" | awk -v FPAT='"[^"]+"' '{print "A="$1 "\nB="$2 "\nC="$3 }'
A="one"
B="two three"
C="four"

Since the separator is space, there cannot be field values containing space.
Protecting with quotes won't help either since awk doesn't consider them as protecting the columns like a csv parser could do (unless you use the FPAT trick as Ed suggested).
$ echo "$a" \"$b\" "$c" | awk '{ print "A="$1 "\nB="$2 "\nC="$3 }'
A=one
B="two
C=three"
workaround: change field separator:
$ echo "$a,$b,$c" | awk -F, '{ print "A="$1 "\nB="$2 "\nC="$3 }'
A=one
B=two three
C=four
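The comma workaround breaks as soon as a value itself contains a comma. A sketch of the same idea with a tab as the separator, assuming the values never contain a tab (the value of c is made up for illustration):

```shell
a="one"
b="two three"
c="four, five"   # hypothetical value containing a comma

# Join the values with tabs and tell awk to split on tabs only.
out=$(printf '%s\t%s\t%s\n' "$a" "$b" "$c" |
    awk -F'\t' '{ print "A=" $1 "\nB=" $2 "\nC=" $3 }')
printf '%s\n' "$out"
```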

Related

AWK -F with print all but last record

/Home/in/test_file.txt
echo /Home/in/test_file.txt | awk -F'/' '{ print $2,$3 }'
Gives the result as:
Home in
But I need /Home/in/ as the result. I have to get everything except test_file.txt.
How to achieve this?
$ echo '/Home/in/test_file.txt' | awk '{sub("/[^/]+$","")} 1'
/Home/in
$ echo '/Home/in/test_file.txt' | awk '{sub("[^/]+$","")} 1'
/Home/in/
$ echo '/Home/in/test_file.txt' | sed 's:/[^/]*$::'
/Home/in
$ echo '/Home/in/test_file.txt' | sed 's:[^/]*$::'
/Home/in/
$ dirname '/Home/in/test_file.txt'
/Home/in
Your attempt awk -F'/' '{ print $2,$3 }' didn't do what you wanted as -F'/' is telling awk to split the input into fields at every / and then print $2,$3 is telling awk to print the 2nd and 3rd fields separated by a blank char (the default value for OFS). You could do:
$ echo '/Home/in/test_file.txt' | awk 'BEGIN{FS=OFS="/"} { print "",$2,$3,"" }'
/Home/in/
to get the expected output, but it'd be the wrong approach: it removes the field you don't want AND removes the input separators AND then adds new output separators that happen to have the same value as the input separators, rather than simply removing the field you don't want as the other solutions above do.
echo /Home/in/test_file.txt | awk -F'/[^/]*$' '{ print $1 }'
This prints everything up to, but not including, the last slash.
There are several ways to achieve this:
Using dirname:
$ dirname /home/in/test_file.txt
/home/in
Using Shell substitution:
$ var="/home/in/test_file.txt"
$ echo "${var%/*}"
/home/in
Using sed: (See Ed Morton)
Using AWK:
$ echo "/home/in/test_file.txt" | awk -F'/' '{OFS=FS;$NF=""}1'
/home/in/
Remark: all these work since you can't have a filename with a forward slash (Is it possible to use "/" in a filename?)
Note: apart from dirname, these behave differently if you have a bare file name without a path. dirname foo returns . (a single dot), the shell substitution ${var%/*} returns foo unchanged, and the awk command above prints an empty line.
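The bare-file-name case can be handled explicitly in the shell. A minimal sketch, where dir_part is a hypothetical helper name and printing ./ for a path without a slash is an assumption about the desired behaviour:

```shell
# dir_part is a hypothetical helper: print the directory part of a path
# with a trailing slash, or "./" when the path contains no slash at all.
dir_part() {
    case $1 in
        */*) printf '%s/\n' "${1%/*}" ;;
        *)   printf './\n' ;;
    esac
}

dir_part /Home/in/test_file.txt
dir_part test_file.txt
```

The first call prints /Home/in/ and the second prints ./ without starting any external command.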
awk behaves as it should.
When you define slash / as a separator, the fields in your expression become the content between the separators.
If you need the separator to be printed as well, you need to do it explicitly, like:
echo /Home/in/test_file.txt | awk -F'/' '{ printf "%s/%s/",$2,$3 }'
replace your last field with an empty string and
put the slash back in as the (builtin) Output Field Separator (OFS)
echo /Home/in/test_file.txt | awk -F'/' -v OFS='/' '{$NF="";print}'

How to use awk for a variable in UNIX script

I would like to use awk for a variable
that has the form abc,def
I also don't know how to use awk for a variable instead of a file
I tried the following but it doesn't work:
awk -F, '{$1" "$2}' $var
and
awk -F, '{$1" "$2}' "$var"
Use a herestring
awk 'commands' <<< "$string"
Also if you want to print the first two fields of a comma separated string, change the command to
awk -F, '{print $1, $2}' <<< "$string"
You can do something like this:
echo "$variable" | awk -F, '{print $1 " " $2}'
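Another option avoids both the pipe and the here-string by handing the string to awk with -v and splitting it in a BEGIN block. A small sketch, assuming a sample value in the abc,def form from the question; the names s and parts are illustrative:

```shell
var="abc,def"   # sample value in the form described in the question

# Pass the string with -v, split it on commas inside BEGIN, and print the
# first two pieces separated by a space. No input file or pipe is needed.
out=$(awk -v s="$var" 'BEGIN { n = split(s, parts, ","); print parts[1], parts[2] }')
printf '%s\n' "$out"
```

Note that -v interprets backslash escapes in the value, so this suits plain strings best.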

Awk: Drop last record separator in one-liner

I have a simple command (part of a bash script) that I'm piping through awk but can't seem to suppress the final record separator without then piping to sed. (Yes, I have many choices and mine is sed.) Is there a simpler way without needing the last pipe?
dolls=$(egrep -o 'alpha|echo|november|sierra|victor|whiskey' /etc/passwd \
| uniq | awk '{IRS="\n"; ORS=","; print}'| sed s/,$//);
Without the sed, this produces output like echo,sierra,victor, and I'm just trying to drop the last comma.
You don't need awk, try:
egrep -o ....uniq|paste -d, -s
Here is another example:
kent$ echo "a
b
c"|paste -d, -s
a,b,c
Also, I think your chained command could be simplified. awk could do everything in a one-liner.
Instead of egrep, uniq, awk, sed etc, all this can be done in one single awk command:
awk -F":" '!($1 in a){l=l $1 ","; a[$1]} END{sub(/,$/, "", l); print l}' /etc/passwd
Here is a small and quite straightforward one-liner in awk that suppresses the final record separator:
echo -e "alpha\necho\nnovember" | awk 'y {print s} {s=$0;y=1} END {ORS=""; print s}' ORS=","
Gives:
alpha,echo,november
So, your example becomes:
dolls=$(egrep -o 'alpha|echo|november|sierra|victor|whiskey' /etc/passwd | uniq | awk 'y {print s} {s=$0;y=1} END {ORS=""; print s}' ORS=",");
The benefit of using awk over paste or tr is that this also works with a multi-character ORS.
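To illustrate the multi-character case, here is a sketch of a slightly different technique from the one-liner above: it prints the separator before every record except the first, rather than delaying each print. The separator " | " and the variable name sep are made up for the example.

```shell
# Join lines with a multi-character separator and no trailing separator.
# The separator is printed before every record except the first one.
out=$(printf '%s\n' alpha echo november |
    awk -v sep=' | ' '{ printf "%s%s", (NR > 1 ? sep : ""), $0 } END { print "" }')
printf '%s\n' "$out"
```

This should print alpha | echo | november followed by a single newline.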
Since you tagged it bash here is one way of doing it:
#!/bin/bash
# Read the /etc/passwd file in to an array called names
while IFS=':' read -r name _; do
names+=("$name");
done < /etc/passwd
# Assign the content of the array to a variable
dolls=$( IFS=, ; echo "${names[*]}")
# Display the value of the variable
echo "$dolls"
echo "a
b
c" |
mawk 'NF-= _==$NF' FS='\n' OFS=, RS=
a,b,c

Use Awk to extract substring

Given a hostname in format of aaa0.bbb.ccc, I want to extract the first substring before ., that is, aaa0 in this case. I use following awk script to do so,
echo aaa0.bbb.ccc | awk '{if (match($0, /\./)) {print substr($0, 0, RSTART - 1)}}'
Running the script on machine A produces aaa0, while running it on machine B produces only aaa, without the 0 at the end. Both machines run Ubuntu/Linaro, but A runs a newer awk (gawk version 3.1.8) while B runs an older one (mawk version 1.2).
I am asking in general, how to write a compatible awk script that performs the same functionality ...
You just want to set the field separator as . using the -F option and print the first field:
$ echo aaa0.bbb.ccc | awk -F'.' '{print $1}'
aaa0
Same thing but using cut:
$ echo aaa0.bbb.ccc | cut -d'.' -f1
aaa0
Or with sed:
$ echo aaa0.bbb.ccc | sed 's/[.].*//'
aaa0
Even grep:
$ echo aaa0.bbb.ccc | grep -o '^[^.]*'
aaa0
Or just use cut:
echo aaa0.bbb.ccc | cut -d'.' -f1
I am asking in general, how to write a compatible awk script that
performs the same functionality ...
Solving the problem in your question is easy (check the other answers).
Writing an awk script that is portable to all awk implementations and versions (gawk/nawk/mawk...) is really hard, even with gawk's --posix option.
for example:
some awk works on string in terms of characters, some with bytes
some supports \x escape, some not
FS interpreter works differently
keywords/reserved words abbreviation restriction
some operator restriction e.g. **
even within the same awk implementation (gawk, for example), versions 4.0 and 3.x differ too.
the implementations of certain functions also differ (your problem is one example, see below).
All the points above are stated in general terms. Back to your problem: it only involves fundamental features of awk. A line like awk '{print $x}' will work in all awks.
There are two reasons why your awk line behaves differently on gawk and mawk:
You used the substr() function incorrectly. This is the main cause. You have substr($0, 0, RSTART - 1); the 0 should be 1, no matter which awk you use. awk arrays, string indexes, etc. are 1-based.
gawk and mawk implement substr() differently.
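With the start index corrected to 1, the original match()/substr() approach no longer depends on how an implementation treats an out-of-range start. A sketch of the question's command with only that change:

```shell
# substr() is 1-based, so start at 1 and take the RSTART - 1 characters
# that precede the first dot.
out=$(echo aaa0.bbb.ccc |
    awk '{ if (match($0, /\./)) print substr($0, 1, RSTART - 1) }')
printf '%s\n' "$out"
```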
You don't need awk for this...
echo aaa0.bbb.ccc | cut -d. -f1
cut -d. -f1 <<< aaa0.bbb.ccc
echo aaa0.bbb.ccc | { IFS=. read a _ ; echo $a ; }
{ IFS=. read a _ ; echo $a ; } <<< aaa0.bbb.ccc
x=aaa0.bbb.ccc; echo ${x/.*/}
Heavier options:
sed:
echo aaa0.bbb.ccc | sed 's/\..*//'
sed 's/\..*//' <<< aaa0.bbb.ccc
awk:
echo aaa0.bbb.ccc | awk -F. '{print $1}'
awk -F. '{print $1}' <<< aaa0.bbb.ccc
You do not need any external command at all, just use Parameter Expansion in bash:
hostname=aaa0.bbb.ccc
echo ${hostname%%.*}
if you don't want to change the input field separator, then it's possible to use split function:
echo "some aaa0.bbb.ccc text" | awk '{split($2, a, "."); print a[1]}'
documentation:
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep
and store the pieces in array and the separator
strings in the seps array.
awk is still the cleanest approach :
mawk NF=1 FS='[.]' <<< aaa0.bbb.ccc
aaa0
If there's stuff before or after :
mawk ++NF FS='[.].+$|^[^ ]* ' OFS= <<< 'some aaa0.bbb.ccc text'
mawk '$!NF=$2' FS='[ .]' <<< 'some aaa0.bbb.ccc text'
aaa0

How to replace the nth column/field in a comma-separated string using sed/awk?

assume I have a string
"1,2,3,4"
Now I want to replace, e.g. the 3rd field of the string by some different value.
"1,2,NEW,4"
I managed to do this with the following command:
echo "1,2,3,4" | awk -F, -v OFS=, '{$3="NEW"; print }'
Now the index for the column to be replaced should be passed as a variable. So in this case
index=3
How can I pass this to awk? Because this won't work:
echo "1,2,3,4" | awk -F, -v OFS=, '{$index="NEW"; print }'
echo "1,2,3,4" | awk -F, -v OFS=, '{$($index)="NEW"; print }'
echo "1,2,3,4" | awk -F, -v OFS=, '{\$$index="NEW"; print }'
Thanks for your help!
This might work for you:
index=3
echo "1,2,3,4" | awk -F, -v OFS=, -v INDEX=$index '{$INDEX="NEW"; print }'
or:
index=3
echo "1,2,3,4" | sed 's/[^,]*/NEW/'$index
Have the shell interpolate the index in the awk program:
echo "1,2,3,4" | awk -F, -v OFS=, '{$'$index'="NEW"; print }'
Note how the originally single quoted awk program is split in three parts, a single quoted beginning '{$', the interpolated index value, followed by the single quoted remainder of the program.
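Splicing shell text into the program means that whatever $index contains is run as awk code. A sketch of a more defensive variant that passes the value with -v and rejects anything that is not a positive integer; the awk variable name n and the check itself are illustrative assumptions (index cannot be used as the awk variable name because it is a built-in awk function):

```shell
index=3

# Pass the column number as an awk variable instead of splicing it into
# the program text, and refuse anything that is not a positive integer.
out=$(echo "1,2,3,4" | awk -F, -v OFS=, -v n="$index" '
    n !~ /^[1-9][0-9]*$/ { print "bad index: " n > "/dev/stderr"; exit 1 }
    { $n = "NEW"; print }')
printf '%s\n' "$out"
```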
Here's a seductive way to break the awkwardness:
$ echo "1,2,3,4" | sed 's/,/\n/g' | sed -e $index's/.*/NEW/'
This is easily extendable to multiple indexes just by adding another -e $newindex's/.*/NEWNEW/'
# This should be faster than awk or sed.
str="1,2,3,4"
IFS=','
read -r -a f <<< "$str"
f[2]='NEW'   # bash arrays are 0-based, so f[2] is the 3rd field
printf '%s\n' "${f[*]}"
Field references in awk take any expression, not just a literal number, and this is standard awk rather than a gawk extension. The problem is that $index inside single quotes refers to an awk variable named index, which is never set (and index is also the name of a built-in awk function); the shell variable is not visible there.
If you prefer not to assign to a field, you can instead use split(string, array, fieldsep), change the array entry of your choice, and join the pieces back together in a loop.
