How to print selected columns separated by tabs? - shell

I have a txt file with columns separated by tabs and based on that file, I want to create a new file that only contains information from some of the columns.
This is what I have now:
awk '{ print $1, $5 }' filename > newfilename
That works, except that when column 5 contains spaces (e.g. 123 Street), only 123 shows up and Street is treated as another column.
How can I achieve what I'm trying to do?

You can specify the field separator as tab:
awk 'BEGIN { FS = "\t" } ; { print $1, $5 }' filename > newfilename
Or from the command line like this:
awk -F"\t" '{ print $1, $5 }' filename > newfilename
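With either form, print $1, $5 joins the two fields with the output field separator OFS, which defaults to a single space. If the new file should stay tab-separated, OFS can be set as well. A minimal sketch, assuming a made-up two-line sample in place of the real file (the function name select_cols is hypothetical):

```shell
# Hypothetical helper: print columns 1 and 5 of tab-separated input,
# joined by a tab. File names may be passed as arguments; with none,
# awk reads standard input.
select_cols() {
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $5 }' "$@"
}

# Made-up sample data; column 5 contains spaces.
printf 'a\tb\tc\td\t123 Street\nx\ty\tz\tw\t45 Main Road\n' | select_cols
```

Because the whole line is split only on tabs, "123 Street" stays in one field.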

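What about the simple cut shell command?
It is very simple, yet it does the job. Tab is already cut's default delimiter, so no -d option is needed (note that -d "\t" would fail, because the shell passes the two characters \ and t rather than a tab):
cut -f 1,5 filename > newfilename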

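You can also do this in pure Bash:
while IFS=$'\t' read -r -a cols; do
printf '%s\t%s\n' "${cols[0]}" "${cols[4]}"
done < in.txt > newfile.txt
This saves the 1st and 5th columns, separated by a tab, into the new file. Note that tab counts as IFS whitespace, so read collapses consecutive tabs; if any column can be empty, the numbering shifts and awk or cut is the safer choice.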

Related

Shell script to add values to a specific column

I have semicolon-separated columns, and I would like to add some characters to a specific column.
aaa;111;bbb
ccc;222;ddd
eee;333;fff
I want to add '#' to the second column, so the output should be:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
I tried
awk -F';' -OFS=';' '{ $2 = "#" $2}1' file
It adds the character, but replaces all the semicolons with spaces.
You could use sed to do your job:
# replaces just the first occurrence of ';', note the absence of `g` that
# would have made it a global replacement
sed 's/;/;#/' file > file.out
or, to do it in place:
sed -i 's/;/;#/' file
Or, use awk:
awk -F';' '{$2 = "#"$2}1' OFS=';' file
All the above commands result in the same output for your example file:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
@atb: Try:
1st:
awk -F";" '{print $1 FS "#" $2 FS $3}' Input_file
The above works only when Input_file has exactly 3 fields.
2nd:
awk -F";" -vfield=2 '{$field="#"$field} 1' OFS=";" Input_file
With this version you can pass any field number in the variable field.
Here the field separator is set to ";", and the variable field holds the number of the field to change. "#" is prepended to that field's value. The trailing 1 is a condition that is always true and has no action, so awk performs the default action and prints the current line.
You just misunderstood how to set variables. Change -OFS to -v OFS:
awk -F';' -v OFS=';' '{ $2 = "#" $2 }1' file
but in reality you should set them both to the same value at one time:
awk 'BEGIN{FS=OFS=";"} { $2 = "#" $2 }1' file
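The semicolons disappeared in the original attempt because assigning to any field makes awk rebuild $0 by joining the fields with OFS, which is a single space unless it is set. A small sketch of that behaviour, using the first line of the sample data from the question:

```shell
# First sample line from the question.
data='aaa;111;bbb'

# OFS left at its default: modifying $2 rebuilds the record with spaces.
printf '%s\n' "$data" | awk -F';' '{ $2 = "#" $2 } 1'
# prints: aaa #111 bbb

# OFS set to match FS: the semicolons survive the rebuild.
printf '%s\n' "$data" | awk 'BEGIN { FS = OFS = ";" } { $2 = "#" $2 } 1'
# prints: aaa;#111;bbb
```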

Gawk Line removal, Splitter is :

Is it possible to move certain columns from one .txt file into another .txt file?
I have a .txt that contains:
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
With gawk I want to extract ADDRESS & POSTCODE columns into another .txt, so for this given file the output should be:
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2
etc.
This is a classic AWK transform. You want to use "-F :" to specify that the input is delimited by ":" and print a new ":" on output:
awk -F: '{ print $5 ":" $6 }' <input.txt >output.txt
Try this:
awk -F: '{printf "%s:%s ",$5,$6}' ex.txt
input is
USERID:ORDER#:IP:PHONE:ADDRESS1:POSTCODE1
USERID:ORDER#:IP:PHONE:ADDRESS2:POSTCODE2
output is (on one line if I understand correctly)
ADDRESS1:POSTCODE1 ADDRESS2:POSTCODE2
The only flaw is that the output ends with a trailing space and has no final newline.
That can be fixed with the slightly more complex (but still readable):
awk -F: 'BEGIN {z=0;} {if (z==1) { printf " "; } ; z=1; printf "%s:%s",$5,$6} END{printf"\n"}' ex.txt
awk -F: 'NR==1 {print $5"1:"$6"1"};NR==2 {print $5"2:"$6"2"}' file
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2
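For comparison, the same extraction can be written so that the delimiter is named only once, by setting FS and OFS together and letting print insert the colon. This is only a sketch: the sample records are made up to match the layout in the question, and extract_addr is a hypothetical name.

```shell
# Hypothetical helper: print the 5th and 6th colon-separated fields,
# joined by a colon, one pair per line.
extract_addr() {
    awk 'BEGIN { FS = OFS = ":" } { print $5, $6 }' "$@"
}

# Made-up records in the USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE layout.
printf '%s\n' \
    'u1:o1:10.0.0.1:555-0100:ADDRESS1:POSTCODE1' \
    'u2:o2:10.0.0.2:555-0101:ADDRESS2:POSTCODE2' | extract_addr
```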

How to preserve new lines while printing to a text file in shell?

I have to print some values to a txt file.
They have the following format:
input="Sno;Name;Field1;Field2"
However the output must be:
Sno-Name
FIELDS ALLOCATED:
Field1
Field2
I do it like so:
echo "$input" | awk -F';' '{print $1"-"$2}' >> "$txtfile"
echo "FIELDS ALLOCATED:" >> "$txtfile"
echo "$input" | cut -d';' -f 3,4 >> "$txtfile"
This is easy. However, the problem is that Field1 or Field2 can contain newlines. Whenever this happens, cut and awk treat the text after the newline as a new line and never see field number 4. How can I print the two fields (with newlines preserved) from the given input format?
If the input is well-formed, you can collect input lines until you have four fields.
awk -F ';' 'r != "" { $0 = r ORS $0 }   # prepend the lines collected so far
NF < 4 { r = $0; next }                 # still incomplete: remember it and read on
{ print $1 "-" $2
print "FIELDS ALLOCATED:"
print $3; print $4
print ""; r = "" }' file
Single gnu-awk can do the job with FPAT and empty RS:
input=$'Sno;Name;Field1\nFoo;Field2'
awk -v RS= -v FPAT='[^;]+' '{
printf "%s-%s\nFIELDS ALLOCATED:\n%s\n%s\n", $1, $2, $3, $4}' <<< "$input"
Sno-Name
FIELDS ALLOCATED:
Field1
Foo
Field2
Just change awk's input record separator, RS. The < and > around each field are only there for clarity.
EDIT: removed the extra trailing newline by appending ';' to the here-string data and adding another condition.
input="Sno;Name;Fie
ld1;Fi
eld2"
awk 'BEGIN{RS=";"} NR==1{f1=$0};
NR==2{print f1 "-" $0; print "FIELDS ALLOCATED:"}
$0=="\n"{next}
NR>2{print "<" $0 ">"}' <<< "$input;"
Gives:
Sno-Name
FIELDS ALLOCATED:
<Fie
ld1>
<Fi
eld2>
input=$'Sno;Name;Field1\nFoo;Field2'
awk 'BEGIN{ RS = "\n\n+" ; FS = ";" } { print $1"-"$2; for(i=3;i<=NF;i++) {print $i}}' <<<"$input"
Since the number of fields is not known in advance, I added a for loop that runs up to NF, and changed RS to a blank line instead of a newline.
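If the record is already held in a shell variable, as in the question, the fields can also be peeled off with parameter expansion, which never treats a newline as special. A minimal sketch, assuming exactly four fields and no ';' inside any field (format_record and its variable names are hypothetical):

```shell
# Hypothetical helper: split a four-field, semicolon-separated record
# using only parameter expansion, so embedded newlines are left alone.
format_record() {
    input=$1
    sno=${input%%;*};    rest=${input#*;}     # first field, then the remainder
    name=${rest%%;*};    rest=${rest#*;}      # second field
    field1=${rest%%;*};  field2=${rest#*;}    # third and fourth fields
    printf '%s-%s\nFIELDS ALLOCATED:\n%s\n%s\n' "$sno" "$name" "$field1" "$field2"
}

# Sample from the answers above: Field1 carries an embedded newline.
format_record "$(printf 'Sno;Name;Field1\nFoo;Field2')"
```

The output can be appended to the target file with >> "$txtfile" as in the question.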

Redirect output of one command to different files in shell script

I have a tab-separated string.
I want to copy the first column to one file and the remaining columns to another file in one go, because the string can change in between if I use two separate commands.
I tried:
tab_seperated_string | awk -F"\t" '{ print $2"\t"$3"\t"$4"\t"$5} {print $1}'
Columns 2, 3, 4 and 5 should go to one file and column 1 should go to another file.
You can do it like this:
tab_seperated_string | awk -F"\t" '{print $2,$3,$4,$5 > "file2"; print $1 > "file1"}' OFS="\t"
It will then save data to two different files.
By setting OFS to \t, you do not need all the \t in the print statement.
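Here is another way, useful if many fields go to one file and only the first field goes to the other:
awk -F"\t" '{print $1 > "file1"; sub(/^[^\t]*\t/,""); print $0 > "file2"}' OFS="\t"
The sub(/^[^\t]*\t/,"") call removes the first field and the tab after it. Anchoring with ^ and using * instead of + keeps it correct even when the first field is empty.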
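To try the one-pass split without touching real files, something like the following can be used. It is a sketch under assumptions: the sample rows are made up, the helper name split_columns is hypothetical, and the output file names are passed in with -v so they can live in a temporary directory.

```shell
# Hypothetical helper: read tab-separated lines on stdin and, in a single
# awk pass, write column 1 to "$1/file1" and columns 2-5 to "$1/file2".
split_columns() {
    awk -F'\t' -v OFS='\t' -v first="$1/file1" -v rest="$1/file2" '
        { print $1 > first
          print $2, $3, $4, $5 > rest }
    '
}

tmpdir=$(mktemp -d)    # throwaway directory for the sketch
printf 'id1\ta\tb\tc\td\nid2\te\tf\tg\th\n' | split_columns "$tmpdir"
cat "$tmpdir/file1"
cat "$tmpdir/file2"
```

Both files are written from the same pass over the input, so they always describe the same version of the string.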

creating a ":" delimited list in bash script using awk

I have following lines
380:<CHECKSUM_VALIDATION>
393:</CHECKSUM_VALIDATION>
437:<CHECKSUM_VALIDATION>
441:</CHECKSUM_VALIDATION>
I need to format it as below
CHECKSUM_VALIDATION:380:393
CHECKSUM_VALIDATION:437:441
Is it possible to achieve above output using "awk"? [I'm using bash]
Thank you!
Here you go:
awk -F '[:<>/]+' '{ n = $1; getline; print $2 ":" n ":" $1 }'
Explanation:
Set the field separator with -F to be a sequence of a mix of :<>/ characters, this way the first field will be the number, and the second will be CHECKSUM_VALIDATION
Save the first field in variable n and read the next line (which would overwrite $1)
Print the line: a combination of the number from the previous line, and the fields on the current line
Another approach without using getline:
awk -F '[:<>/]+' 'NR % 2 { n = $1 } NR % 2 == 0 { print $2 ":" n ":" $1 }'
This one uses the record counter NR to determine whether it's time to print: if NR is odd, save the first field in n, if NR is even, then print.
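For a quick check, the NR-based version can be wrapped in a function and fed the four lines from the question. The function name pair_tags is hypothetical, and the sketch assumes the input strictly alternates opening and closing lines:

```shell
# Hypothetical helper: pair each opening line with the closing line after it.
pair_tags() {
    awk -F '[:<>/]+' '
        NR % 2      { n = $1 }                   # odd line: remember the opening number
        NR % 2 == 0 { print $2 ":" n ":" $1 }    # even line: emit tag:open:close
    ' "$@"
}

# The four sample lines from the question.
printf '%s\n' \
    '380:<CHECKSUM_VALIDATION>' '393:</CHECKSUM_VALIDATION>' \
    '437:<CHECKSUM_VALIDATION>' '441:</CHECKSUM_VALIDATION>' | pair_tags
```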
You can try this sed command (the \+ quantifier is a GNU sed extension):
sed 'N; s/\([0-9]\+\):<\(.*\)>\n\([0-9]\+\):<\(.*\)>/\2:\1:\3/' file.txt
Test:
sat:~$ sed 'N; s/\([0-9]\+\):<\(.*\)>\n\([0-9]\+\):<\(.*\)>/\2:\1:\3/' file.txt
CHECKSUM_VALIDATION:380:393
CHECKSUM_VALIDATION:437:441
Another way:
awk -F: '/<C/ {printf "CHECKSUM_VALIDATION:%d:",$1; next} {print $1}'
Here is one gnu awk
awk -F"[:\n<>]" 'NR==1{print $3,$1,$5;f=$3;next} $3{print f,$3,$7}' OFS=":" RS="</CH" file
CHECKSUM_VALIDATION:380:393
CHECKSUM_VALIDATION:437:441
Based on Jonas's post and avoiding getline, this awk should do:
awk -F '[:<>/]+' '/<C/ {f=$1;next} { print $2,f,$1}' OFS=\: file
CHECKSUM_VALIDATION:380:393
CHECKSUM_VALIDATION:437:441
