Ignore empty fields - bash

Given this file
$ cat foo.txt
,,,,dog,,,,,111,,,,222,,,333,,,444,,,
,,,,cat,,,,,555,,,,666,,,777,,,888,,,
,,,,mouse,,,,,999,,,,122,,,133,,,144,,,
I can print the first field like so
$ awk -F, '{print $5}' foo.txt
dog
cat
mouse
However, I would like to ignore those empty fields so that I can call awk like this
$ awk -F, '{print $1}' foo.txt

You can use a regex field separator that treats a run of commas as one delimiter:
$ awk -F',+' '{print $2}' file
dog
cat
mouse
Similarly, you can use $3, $4, $5 and so on. $1 cannot be used in this case because each record begins with the delimiter, so $1 is the empty string before the first run of commas.
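To see why $1 is empty, it can help to dump every field with its index. This is only an illustrative sketch using the first sample line from the question; the variable names line and fields are made up for the example.

```shell
# First sample record from the question.
line=',,,,dog,,,,,111,,,,222,,,333,,,444,,,'

# Print each field together with its index, using ",+" as the separator.
fields=$(printf '%s\n' "$line" |
    awk -F',+' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# 1=[]
# 2=[dog]
# 3=[111]
# 4=[222]
# 5=[333]
# 6=[444]
# 7=[]
```

The leading and trailing runs of commas each produce one empty field, which is why the data starts at $2.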

With GNU awk you can describe what a field looks like (FPAT) instead of what separates fields:
$ awk '{print $1}' FPAT='[^,]+' foo.txt
dog
cat
mouse

You can squeeze repeated occurrences of a character into one with tr -s 'char':
$ tr -s ',' < your_file
,dog,111,222,333,444,
,cat,555,666,777,888,
,mouse,999,122,133,144,
And then you can access dog, etc. with:
$ tr -s ',' < your_file | awk -F, '{print $2}'
dog
cat
mouse

perl -anF,+ -e 'print "$F[1]\n"' foo.txt
dog
cat
mouse
This is not awk, but because Perl arrays are zero-based you get to use index 1 instead of 2.

awk -F, '{gsub(/^,*|,*$/,"");gsub(/,+/,",");print $1}' your_file
tested below:
> cat temp
,,,,dog,,,,,111,,,,222,,,333,,,444,,,
,,,,cat,,,,,555,,,,666,,,777,,,888,,,
,,,,mouse,,,,,999,,,,122,,,133,,,144,,,
execution:
> awk -F, '{gsub(/^,*|,*$/,"");gsub(/,+/,",");print $1}' temp
dog
cat
mouse

Related

Merging awk and cut into one command

My line is:
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"
awk -F ";;;" '{print $1}' <<<$var1 | cut -d= -f2
AAAAA
awk -F ";;;" '{print $2}' <<<$var1 | cut -d= -f2
BBBB
How can I get to the same result using only AWK?
Awk lets you split a field on another delimiter.
awk -F ";;;" '{split($1, a, /=/); print a[2] }'
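As a hedged, self-contained sketch of the same split() idea, the snippet below feeds the question's var1 to awk and picks the field to split through a variable n (n and value are names introduced here, not part of the original answer).

```shell
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"

# Split the record on ";;;", then split field number n on "=" and
# print the part after the "=".
value=$(printf '%s\n' "$var1" |
    awk -F';;;' -v n=2 '{ split($n, a, /=/); print a[2] }')

printf '%s\n' "$value"
# BBBB
```

Changing n to 1 or 3 selects the other name=value pairs.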
However, perhaps a more fruitful approach would be to transform this horribly hostile input format to something a little bit more normal, and take it from there with standard tools.
sed 's/;;;/\
/g' inputfile | ...
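One possible way to finish that pipeline, assuming the goal is just the values after each "=", is to hand the one-pair-per-line output to cut. The variable name values is made up for this sketch.

```shell
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"

# Turn every ";;;" into a newline (backslash-newline keeps this portable
# across sed implementations), then keep what follows the "=".
values=$(printf '%s\n' "$var1" | sed 's/;;;/\
/g' | cut -d= -f2)

printf '%s\n' "$values"
# AAAAA
# BBBB
# CCCCC
```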
Try the following: a single awk whose field separator (-F) is set to either = or ;, so each line is split on both characters.
echo "$var1" | awk -F'=|;' '{print $2}'
AAAAA
echo "$var1" | awk -F'=|;' '{print $6}'
BBBB
OR
echo "$var1" | awk -F"=|;;;" '{print $2}'
AAAAA
echo "$var1" | awk -F"=|;;;" '{print $4}'
BBBB
If you need these values in variables, you could extract them with sed (or awk) and store them in an array for later use. IMHO this is what arrays are for: they save you from creating N separate variables.
Creation of an array with sed:
array=( $(echo "$var1" | sed 's/\([^=]*\)=\([^;]*\)\([^=]*\)=\([^;]*\)\(.*\)/\2 \4/' ) )
Creation of an array with awk:
array=( $(echo "$var1" | awk -F"=|;;;" '{print $2,$4}') )
The above creates an array with the values AAAAA and BBBB. To fetch them you could use:
for i in {0..1}; do echo "$i : ${array[$i]}"; done
0 : AAAAA
1 : BBBB
I used a for loop only for illustration; you can use ${array[0]} directly for AAAAA or ${array[1]} for BBBB.
Whenever you have name/tag=val input data, it's useful to create an array of tag-value pairs so you can print or do whatever else you like with the data by its tags, e.g.:
$ awk -F';;;|=' '{for (i=1; i<NF; i+=2) f[$i]=$(i+1); print f["_source.statistics.test1"]}' <<<"$var1"
AAAAA
$ awk -F';;;|=' '{for (i=1; i<NF; i+=2) f[$i]=$(i+1); print f["_source.statistics.test3"], f["_source.statistics.test2"]}' <<<"$var1"
CCCCC BBBB
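The tag-value approach can be wrapped in a small shell function so the tag is passed in rather than hard-coded. This is a sketch only; the function name lookup and the awk variable tag are hypothetical.

```shell
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"

# lookup DATA TAG: build the tag -> value map in awk and print one entry.
lookup() {
    printf '%s\n' "$1" |
        awk -F';;;|=' -v tag="$2" '{
            for (i = 1; i < NF; i += 2) f[$i] = $(i + 1)
            print f[tag]
        }'
}

result=$(lookup "$var1" "_source.statistics.test3")
printf '%s\n' "$result"
# CCCCC
```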

handler: xyz.lambda_handler is a text and i want xyz.lambda_handler as output using sh script

I have the text "handler: xyz.lambda_handler" in a file and I want "xyz.lambda_handler", i.e. the text next to "handler:", as output using a shell script. How can I do this?
I have tried
awk -F '${handler}' '{print $1}' filename | awk '{print $2}'
grep handler filename
but I am not getting the output described in the question.
I combined the two commands and got my answer:
grep Handler: filename | awk -F '${handler}' '{print $1}' | awk '{print $2}'
grep givepattern givefilename | awk -F '${givepattern}' '{print $1}' | awk '{print $2}'
It's grep, not greap. To print only the matched parts of a matching line, use option -o.
grep -o xyz.lambda_handler filename
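If the value after "handler:" is not known in advance, one alternative is to let awk split on the colon and print the second field. This is a minimal sketch: the file contents (including the extra runtime line) and the variable names are assumptions made for the example.

```shell
# Hypothetical input file; only the "handler:" line comes from the question.
tmpfile=$(mktemp)
printf 'runtime: python3.8\nhandler: xyz.lambda_handler\n' > "$tmpfile"

# Split on ":" plus optional spaces and print the value of the handler key.
handler=$(awk -F': *' '$1 == "handler" { print $2 }' "$tmpfile")
rm -f "$tmpfile"

printf '%s\n' "$handler"
# xyz.lambda_handler
```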

Print only the contents after a certain pattern match

I have a string like this:
query:schema:query_result{cell=ab}: <timestamp>
I'd like to just print the ab and assign it to a variable. How can I do this with grep/sed?
You may try this:
$ var=$(grep -oP '=\K\w+' <<< "$str")
or
$ sed 's/.*=\(\w\+\).*/\1/' <<<"$str"
ab
You can also use awk:
s='query:schema:query_result{cell=ab}: <timestamp>'
awk -F '[=}]' '{print $2}' <<< "$s"
ab
To assign it to a variable:
var="$(awk -F '[=}]' '{print $2}' <<< "$s")"
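For a string this simple, plain shell parameter expansion is another option that avoids an external process. This is a sketch, not one of the original answers; tmp is a throwaway variable introduced here.

```shell
s='query:schema:query_result{cell=ab}: <timestamp>'

tmp=${s#*=}       # drop everything up to and including the first "="
var=${tmp%%\}*}   # drop everything from the first "}" onward

printf '%s\n' "$var"
# ab
```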

AWK command to separate line using Field-separator

Info 750: local macro 'ADD_GEN_METHOD' (line 149, file /home/vya3kor/vmshare/vya3kor_rbin_g3g_tas_lcmccatestadapter.vws/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp) not referenced
I want to separate above line using awk with field-separator (line
I used this command but it's not working
$ grep Info.* 1.txt |awk -F "(line" '{print $1}'
error : awk: fatal: Unmatched ( or \(: /(line/
output I want:
/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp%149%Info 750%local macro 'ADD_GEN_METHOD'%
So I used this command :
$ grep '^[Ii]nfo.*:'|
awk -F ":" '{print $1"%" $2}'|
awk -F ", file.*.vws" '{print $1"%" $2 }'|
awk -F ") not referenced" '{print $1"%" }'|
awk -F '(' '{print $1"%" $2"%" $3}'|
awk -F "line" '{print $1 $2 $3 }' |
awk -F "%" '{print $1$ "\n2" $3 $4 $4 $5}'
You can use this awk:
awk -F '\\(line' '{print $1}'
Info 750: local macro ADD_GEN_METHOD
( is a special regex symbol that needs to be escaped.
Instead of escaping the (, you can put it in a bracket expression:
awk -F'[(]line' '... your codes'
Personally, I think it is easier to read.
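A complete run of the bracket-expression form on the line from the question might look like the sketch below. The sub() call that trims the trailing space is an addition for tidiness, and the variable names line and before are made up.

```shell
line="Info 750: local macro 'ADD_GEN_METHOD' (line 149, file /home/vya3kor/vmshare/vya3kor_rbin_g3g_tas_lcmccatestadapter.vws/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp) not referenced"

# "[(]" matches a literal "(" without any backslashes.
before=$(printf '%s\n' "$line" |
    awk -F'[(]line' '{ sub(/ +$/, "", $1); print $1 }')

printf '%s\n' "$before"
# Info 750: local macro 'ADD_GEN_METHOD'
```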
You need three backslashes if the field separator is set through -v inside double quotes, because both the shell and awk's -v assignment process escape sequences:
$ echo 'Info 750: local macro 'ADD_GEN_METHOD' (line 149, file /home/vya3kor/vmshare/vya3kor_rbin_g3g_tas_lcmccatestadapter.vws/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp) not referenced' | awk -v FS="\\\(line" '{print $1}'
Info 750: local macro ADD_GEN_METHOD
With all that chopping, you may be better off with sed:
sed -n '/^[Ii]nfo/s/\(Info.*\): \([^(]*\).*file \([^)]*)\) .*/\3%\1%\2/gp' 1.txt
$ cat file
Info 750: local macro 'ADD_GEN_METHOD' (line 149, file /home/vya3kor/vmshare/vya3kor_rbin_g3g_tas_lcmccatestadapter.vws/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp) not referenced
$
$ cat tst.awk
BEGIN { FS=" *[()] *"; OFS="%" }
/^[Ii]nfo.*:/ {
    split($2,a,/[ ,]+/)
    sub(/: /,OFS,$1)
    sub(/.*\.vws/,"",$2)
    print $2, a[2], $1 OFS
}
$
$ awk -f tst.awk file
/di_cfc/components/spm/LcmProject/framework/cca/server/generic/spm_CcaServiceHandlerFiParamConfig.cpp%149%Info 750%local macro 'ADD_GEN_METHOD'%
$
Or you could do it all in GNU awk with one gensub() call, or just use sed as @chthonicdaemon suggested, since this is just a simple substitution on a single line.

Splitting CSVs into files named for one of the columns

I have CSVs like this:
apple,file1.txt
banana,file1.txt
carrot,file2.txt
How can I get it to place all of the items from the left column into files named with the items in the right column? E.g. file1.txt would contain this list:
apple
banana
So far, I have this:
while read line
do
firstcolumn=$(echo $line | awk -F ",*" '{print $1}')
secondcolumn=$(echo $line | awk -F ",*" '{print $2}')
done < Text/selection.csv
One way using awk:
awk 'BEGIN { FS = "," } { print $1 >> $2 }' infile
This should work -
awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
Test:
[jaypal:~/Temp] cat file
apple,file1.txt
banana,file1.txt
carrot,file2.txt
[jaypal:~/Temp] awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
[jaypal:~/Temp] ls file*
file file1.txt file2.txt
[jaypal:~/Temp] cat file1.txt
apple
banana
[jaypal:~/Temp] cat file2.txt
carrot
Update:
You can also do something like this -
awk -F, '{print $1 > $2}' INPUT_FILE
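When the right-hand column names many different files, awk can run out of open file descriptors. A common workaround, sketched here under the assumption that appending is acceptable, is to append with >> and close each file after writing. The temporary directory, input.csv, and the awk variable dir are made up for the example.

```shell
workdir=$(mktemp -d)
printf 'apple,file1.txt\nbanana,file1.txt\ncarrot,file2.txt\n' > "$workdir/input.csv"

# Append each item to its target file, closing the file after every write
# so only one output file is open at a time.
awk -F, -v dir="$workdir" '{
    out = dir "/" $2
    print $1 >> out
    close(out)
}' "$workdir/input.csv"

file1=$(cat "$workdir/file1.txt")
file2=$(cat "$workdir/file2.txt")
rm -rf "$workdir"

printf '%s\n' "$file1" "$file2"
# apple
# banana
# carrot
```

Because the files are reopened in append mode, existing target files are extended rather than overwritten.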
Pure Bash, assuming that all target files are empty or do not exist yet:
while IFS=',' read -r item file ; do
echo "$item" >> "$file"
done < "$infile"
sed loves this stuff...
sed "s%\(.*\),\(.*\)%echo \1 >> \2 %" inputfile.txt | sh

Resources