Add blank column using awk or sed - shell

I have a file with the following structure (comma delimited)
116,1,89458180,17,FFFF,0403254F98
I want to insert a blank column as the 4th field so that it becomes
116,1,89458180,,17,FFFF,0403254F98
Any suggestions on how to do this using awk or sed, if possible?
thank you

Assuming that none of the fields contain embedded commas, you can restate the task as replacing the third comma with two commas. This is just:
sed 's/,/,,/3'
With the example line from the file:
$ echo "116,1,89458180,17,FFFF,0403254F98" | sed 's/,/,,/3'
116,1,89458180,,17,FFFF,0403254F98
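The same idea can be wrapped in a small helper so that the column number becomes a parameter. This is only a sketch: the function name add_blank_col is made up for illustration, it assumes no field contains an embedded comma, and the target column must be 2 or greater, because column N is opened up by doubling comma number N-1.

```shell
# Hypothetical helper: insert an empty field so that it becomes column N (N >= 2).
# Assumes plain comma-separated input with no quoted or embedded commas.
add_blank_col() {
    n=$1
    # Doubling the (N-1)th comma opens an empty field at position N.
    sed "s/,/,,/$((n - 1))"
}

echo "116,1,89458180,17,FFFF,0403254F98" | add_blank_col 4
# prints: 116,1,89458180,,17,FFFF,0403254F98
```

Lines with fewer than N-1 commas pass through unchanged, since the numbered substitution simply does not match.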

You can use this awk,
awk -F, '$4="," $4' OFS=, yourfile
(OR)
awk -F, '$4=FS$4' OFS=, yourfile
If you want to add blank columns before several fields at once (here the 1st, 4th and 6th),
awk -F, '{$4=FS$4; $1=FS$1; $6=FS$6}1' OFS=, yourfile
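As a quick check of what the multi-field command produces, here is a sketch that runs it on the sample line from the question. The variable names line and result are only for illustration, and OFS is passed with -v instead of as a trailing operand, which should behave the same way here.

```shell
# Each assignment prefixes a field with the separator, which opens an
# empty column in front of it; the trailing 1 prints the rebuilt record.
line='116,1,89458180,17,FFFF,0403254F98'
result=$(echo "$line" | awk -F, -v OFS=, '{$4=FS$4; $1=FS$1; $6=FS$6}1')
echo "$result"
# prints: ,116,1,89458180,,17,FFFF,,0403254F98
```

The fields are not re-split after each assignment, so $6 still refers to the original sixth field even though $4 now contains a comma.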

Through awk
$ echo '116,1,89458180,17,FFFF,0403254F98' | awk -F, -v OFS="," '{print $1,$2,$3,","$4,$5,$6}'
116,1,89458180,,17,FFFF,0403254F98
It prepends a , to the fourth field, so an empty field appears after the third one.
Through GNU sed
$ echo 116,1,89458180,17,FFFF,0403254F98| sed -r 's/^([^,]*,[^,]*,[^,]*)(.*)$/\1,\2/'
116,1,89458180,,17,FFFF,0403254F98
It captures all the characters up to (but not including) the third comma into one group. The characters from the third , through the end of the line are stored in another group. In the replacement part, we just add a , between these two captured groups.
Through Basic sed
$ echo 116,1,89458180,17,FFFF,0403254F98| sed 's/^\([^,]*,[^,]*,[^,]*\)\(.*\)$/\1,\2/'
116,1,89458180,,17,FFFF,0403254F98

echo 116,1,89458180,17,FFFF,0403254F98|awk -F',' '{print $1","$2","$3",,"$4","$5","$6}'

Non-awk
t="116,1,89458180,17,FFFF,0403254F98"
echo "$(echo "$t" | cut -d, -f1-3),,$(echo "$t" | cut -d, -f4-)"

You can use the awk command below to achieve that. Replace $4 with whichever column you want the blank field to appear in.
awk -F, '{$4="" FS $4;}1' OFS=, filename

sed -e 's/\([^,]*,\)\{3\}/&,/' YourFile
This replaces the first sequence of 3 [content (non-comma) followed by a comma] with itself followed by an extra comma.

Related

Replacing new line with comma separator

I have a text file with records in the following format. Please note that there are no empty fields within the Name, ID and Rank sections.
"NAME","STUDENT1"
"ID","123"
"RANK","10"
"NAME","STUDENT2"
"ID","124"
"RANK","11"
I have to convert the above file to the below format
"STUDENT1","123","10"
"STUDENT2","124","11"
I understand that this can be achieved with a shell script that reads the records and writes them to another output file. But can this be done using awk or sed?
$ awk -F, '{ORS=(NR%3?FS:RS); print $2}' file
"STUDENT1","123","10"
"STUDENT2","124","11"
With awk:
awk -F, '$1=="\"RANK\""{print $2;next}{printf "%s,",$2}' file
With awk, printing newline each 3 lines:
awk -F, '{printf "%s",$2;if (NR%3){printf ","}else{print""};}'
The following awk command may also help:
awk -F, '{ORS=$0~/^"RANK/?"\n":FS;print $NF}' Input_file
With sed
sed -E 'N;N;;y/\n/ /;s/([^,]*)(,[^ ]*)/\2/g;s/,//' infile
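For comparison, the same reshaping can be sketched without awk or sed by combining cut and paste. This alternative is not taken from the answers above; it assumes every record spans exactly three lines and that no value contains a comma. The sample data is recreated in a temporary file, and the names infile and result are illustrative.

```shell
# Recreate the sample input in a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
"NAME","STUDENT1"
"ID","123"
"RANK","10"
"NAME","STUDENT2"
"ID","124"
"RANK","11"
EOF

# cut keeps the second comma-separated field of each line;
# paste with three "-" operands joins every three input lines with commas.
result=$(cut -d, -f2 "$infile" | paste -d, - - -)
echo "$result"
# prints:
# "STUDENT1","123","10"
# "STUDENT2","124","11"

rm -f "$infile"
```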

Remove blank line from awk output

I am trying to remove leading whitespace from awk output. When I use this command, a leading whitespace is displayed.
diff test1.txt test.txt | awk '{print $2}'
output:
asdfasdf.txt
test.txt
weqtwqe.txt
How can I remove the leading whitespace using awk?
Thanks in advance
If you want to print only the lines where $2 exists, make the print conditional on the number of fields:
awk 'NF>1{print $2}'
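One likely source of the empty lines is that diff also emits change commands such as 1d0 and separator lines such as ---, which have only one field, so $2 is empty for them. The sketch below uses made-up diff-style input (it is illustrative, not the asker's real output) to compare the unguarded and guarded commands.

```shell
# Illustrative diff-style input: change commands ("1d0") and "---" have a
# single field, while the "<" / ">" lines carry the file name in field 2.
sample='1d0
< asdfasdf.txt
3c2
< test.txt
---
> weqtwqe.txt'

# Without the NF>1 guard, single-field lines print as empty lines.
unfiltered=$(printf '%s\n' "$sample" | awk '{print $2}')
# With the guard, only lines that actually have a second field are printed.
filtered=$(printf '%s\n' "$sample" | awk 'NF>1{print $2}')
printf '%s\n' "$filtered"
# prints:
# asdfasdf.txt
# test.txt
# weqtwqe.txt
```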

awk command to select exact word in any field

I have input file as
ab,1,3,qqq,bbc
b,445,jj,abc
abcqwe,234,23,123
abc,12,bb,88
uirabc,33,99,66
I have to select the rows in which some field is exactly 'abc'. Note that the abc string can appear in any column. Please help me achieve this using awk.
Output:
b,445,jj,abc
abc,12,bb,88
You could also use grep with extended regular expressions (-E):
grep -E '(^|,)abc(,|$)' file
Or if you have to use awk
awk '/(^|,)abc(,|$)/' file
Using awk
awk 'gsub(/(^|,)abc(,|$)/,"&")' file
b,445,jj,abc
abc,12,bb,88
Based on beny23's regex.
It looks for abc preceded by either the start of the line (^) or a , and
followed by either a , or the end of the line ($).
Another one using beny23 regex:
awk 'NF>1' FS="(^|,)abc(,|$)" infile
Not asked, but if you need to keep just the lines with exactly one occurrence:
$ cat infile
ab,1,3,qqq,bbc
b,445,jj,abc
abcqwe,234,23,123
abc,12,bb,88
abc,12,bb,abc
uirabc,33,99,66
This will be handy:
$ awk 'NF==2' FS="(^|,)abc(,|$)" infile
b,445,jj,abc
abc,12,bb,88
Also possible using Jotne's solution:
$ awk 'gsub(/(^|,)abc(,|$)/,"&")==1' infile
Through awk,
$ awk -F, '{for(i=1;i<=NF;i++){if($i=="abc") print $0;}}' file | uniq
b,445,jj,abc
abc,12,bb,88
OR
$ awk -F, '{for(i=1;i<=NF;i++){if($i=="abc") {print; next}}}' file
b,445,jj,abc
abc,12,bb,88
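In the above awk commands, the field separator is set to , . awk parses the input file line by line, and the for loop traverses all the fields in a line. If the value of any field is exactly abc, the whole line is printed. In the first command a line is printed once per matching field, hence the uniq; in the second, next moves on to the following line after the first match.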
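The field-by-field loop can also be sketched as a reusable helper that takes the word as an argument. The function name match_field and the variable word are hypothetical; passing the word with -v keeps it out of the awk program text. The input is the extended sample, including the line where abc occurs twice.

```shell
# Hypothetical helper: print CSV lines in which at least one whole field equals $1.
match_field() {
    awk -F, -v word="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i == word) { print; next }   # print once, then go to the next line
    }'
}

result=$(printf '%s\n' 'ab,1,3,qqq,bbc' 'b,445,jj,abc' 'abcqwe,234,23,123' \
                       'abc,12,bb,88' 'abc,12,bb,abc' 'uirabc,33,99,66' | match_field abc)
echo "$result"
# prints:
# b,445,jj,abc
# abc,12,bb,88
# abc,12,bb,abc
```

Because of the next, the line containing abc twice is printed only once, so no uniq is needed.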

Awk adding constant values

I have data in the text file like val1,val2 with multiple lines
and I want to change it to 1,val1,val2,0,0,1
I tried using a print statement in awk (Solaris) to add the constants, but it didn't work.
What is the correct way to do it ?
(From the comments) This is what I tried
awk -F, '{print "%s","1,"$1","$2"0,0,1"}' test.txt
Based on the command you posted, a little change makes it:
$ awk -F, 'BEGIN{OFS=FS} {print 1,$1,$2,0,0,1}' file
1,val1,val2,0,0,1
OR using printf (I prefer print):
$ awk -F, '{printf "1,%s,%s,0,0,1\n", $1, $2}' file
1,val1,val2,0,0,1
To prepend every line with the constant 1 and append with 0,0,1 simply do:
$ awk '{print 1,$0,0,0,1}' OFS=, file
1,val1,val2,0,0,1
An idiomatic way would be:
$ awk '$0="1,"$0",0,0,1"' file
1,val1,val2,0,0,1
Using sed:
sed 's/.*/1,&,0,0,1/' inputfile
Example:
$ echo val1,val2 | sed 's/.*/1,&,0,0,1/'
1,val1,val2,0,0,1
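Since the question mentions a file with multiple lines, here is a small sketch that applies the same wrapping to two lines and takes the constants as arguments. The helper name wrap_line, its parameters, and the second sample line val3,val4 are assumptions made for illustration.

```shell
# Hypothetical helper: wrap each input line as PREFIX,line,SUFFIX.
wrap_line() {
    awk -v pre="$1" -v suf="$2" '{ print pre "," $0 "," suf }'
}

result=$(printf '%s\n' 'val1,val2' 'val3,val4' | wrap_line 1 0,0,1)
echo "$result"
# prints:
# 1,val1,val2,0,0,1
# 1,val3,val4,0,0,1
```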

awk sed filter values in all lines greater/smaller than

is there a way to construct a filter in awk (or something similar) that for a given file, say:
0.99,0.98,1.1,0.85,0.92
0.76,1.4,0.99,0.99,0.82
1.0,1.45,0.78,0.91,0.95
would replace any value in a line that is greater than 1.0 with 1.0?
Here is something you can do with awk
awk -F, '{for(i=1;i<=NF;i++) if($i>1) {$i="replacement"}}1' OFS=, file
Test:
$ cat file
0.99,0.98,1.1,0.85,0.92
0.76,1.4,0.99,0.99,0.82
1.0,1.45,0.78,0.91,0.95
$ awk -F, '{for(i=1;i<=NF;i++) if($i>1) {$i="replacement"}}1' OFS=, file
0.99,0.98,replacement,0.85,0.92
0.76,replacement,0.99,0.99,0.82
1.0,replacement,0.78,0.91,0.95
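To get the clamping the question actually asks for, the placeholder string can be swapped for 1.0. The sketch below assumes every field is a plain decimal number; the variable names data and result are illustrative.

```shell
data='0.99,0.98,1.1,0.85,0.92
0.76,1.4,0.99,0.99,0.82
1.0,1.45,0.78,0.91,0.95'

# Any field numerically greater than 1 is overwritten with the literal "1.0";
# $i + 0 forces a numeric comparison, and the trailing 1 prints every record.
result=$(printf '%s\n' "$data" | awk -F, -v OFS=, '{
    for (i = 1; i <= NF; i++)
        if ($i + 0 > 1) $i = "1.0"
}1')
echo "$result"
# prints:
# 0.99,0.98,1.0,0.85,0.92
# 0.76,1.0,0.99,0.99,0.82
# 1.0,1.0,0.78,0.91,0.95
```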
Here’s a sed solution:
sed -e 's/[1-9][0-9]*\.[0-9]*/1.0/g' in-file > out-file
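The pattern [1-9][0-9]*\.[0-9]* matches any sequence that begins with a digit greater than 0, followed by zero or more digits, followed by the decimal point, followed by zero or more additional digits. Such a number is always at least 1.0 (an exact 1.0 is simply replaced by itself). Values without a decimal point, such as 2, are not matched. If you want an in-place replacement, you can use the -i option (supported by GNU sed):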
sed -i -e 's/[1-9][0-9]*\.[0-9]*/1.0/g' in-file

Resources