AWK not printing output field separator OFS - bash

Input
15.01.2018;Payment sent;;500.00;;
20.12.2017;Payment received;10.40;;;
Expected output
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40
Current output
15.01.2018Payment sent-500.00
20.12.2017Payment received10.40
Does anyone see the problem in my command?
awk 'BEGIN{OFS=";";FS=";"} {print match($4, /[^ ]/) ? $1$2$3"-"$4 : $1$2$3}' < in.csv > out.csv
Thank you

I don't understand why you're surprised that when you print $1$2$3 there's no OFS between them, but I also don't understand why you were trying to use the logic in your script at all instead of just:
$ awk 'BEGIN{FS=OFS=";"} {print $1, $2, ($3=="" ? "-"$4 : $3)}' file
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40

The following awk may help you with this.
awk -F";" '$4~/[0-9]/{$4="-"$4}{gsub(/;+/,";");sub(/;$/,"")} 1' OFS=";" Input_file
Output will be as follows.
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40
Explanation: adding an explanation for the above code as well.
awk -F";" ' ##Set the field separator to a semicolon.
$4~/[0-9]/{ ##If the 4th field contains a digit, do the following:
$4="-"$4 ##Prepend a dash (-) to the 4th field's value.
}
{
gsub(/;+/,";"); ##Globally replace runs of semicolons with a single semicolon, as per the OP's expected output.
sub(/;$/,"") ##Remove a trailing semicolon at the end of the line.
}
1 ##awk works on the condition{action} model; 1 is an always-true condition with no action, so the default action (print) happens.
' OFS=";" Input_file ##Set OFS (output field separator) to a semicolon and mention the Input_file name here too.

awk '{sub(/sent;;/,"sent;-")sub(/;;+/,"")}1' file
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40
The first sub turns the empty field after "sent" into a minus sign on the amount (changing "sent;;" to "sent;-"), and the second sub removes the run of trailing semicolons.

None of these answers is actually responsive to the OP's question. The question was "Why isn't the OFS appearing in the output?" The answer is quite simple, and one person made a snarky comment in the right direction but nobody actually answered.
Here's the answer: in the ... print $1$2$3 ... part, there are no commas between $1, $2, and $3 (they are simply concatenated), so you've asked awk to put those fields right next to each other with no output field separator. If you had ... print $1,$2,$3 ... then OFS would be inserted between the fields and you'd have the result you are looking for.
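For example, keeping the OP's match() test but adding the commas (a minimal sketch of the fix, only checked by hand against the sample lines above):
awk 'BEGIN{OFS=";";FS=";"} {print $1, $2, (match($4, /[^ ]/) ? $3 "-" $4 : $3)}' < in.csv > out.csv
which should give
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40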
And yes, I know this is an old dead question.

There's absolutely no point in using ";" as either FS or OFS here; fold the fixed text into the separators instead: FS='sent;+' splits each record around "sent" plus the run of semicolons, OFS='sent;-' glues the pieces back together with the dash inserted, and the leftover trailing semicolons are trimmed by the sub() in the first variant or swallowed by the RS pattern in the second.
{m,g}awk 'sub(";*$",_,$!(NF=NF))' FS='sent;+' OFS='sent;-'
{m,g}awk NF=NF RS=';*\r?\n' FS='sent;+' OFS='sent;-'
15.01.2018;Payment sent;-500.00
20.12.2017;Payment received;10.40


Reformatting text file using awk and cut as a one liner

Data:
CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P
1 chr1:1243:A:T 1243 T ADD 16283 -6.124 0.543 -1.431 0.3534 -1.123 0.14
Desired output:
MarkerName P-Value
chr1:1243 0.14
The actual file is 1.2G worth of lines like the above
I need to strip everything past the 2nd colon from the 2nd column, then pair this with the final (12th) column and give the result a new header.
I have tried:
awk '{print $2, $12}' | cut -d: -f1-2
but this removes the rest of the line after the colons, and I want to keep the P column
I output this to a new file and then pasted it onto the P-value column using awk, but I was wondering if there is a one-liner way of doing this?
Many thanks
My comment in more understandable form:
$ awk '
BEGIN {
print "MarkerName P-Value" # output header
}
NR>1 { # skip the funky first record
split($2,a,/:/) # split by :
printf "%s:%s %s\n",a[1],a[2],$12 # printf allows easier output formatting
}' file
Output:
MarkerName P-Value
chr1:1243 0.14
EDIT: Adding one more solution here, since the OP mentioned my first solution somehow didn't work for them even though it worked fine for me; adding this as an alternative.
awk '
BEGIN{
print "MarkerName P-Value" ##Print the header line.
}
FNR>1{
match($2,/([^:]*:){2}/) ##Match up to and including the 2nd colon in the 2nd field; this sets RSTART and RLENGTH.
print OFS substr($2,RSTART,RLENGTH-1),$NF ##Print OFS, then the matched prefix without its trailing colon, then the last field.
}
' Input_file
With the shown samples, could you please try the following. You need not use cut with awk; awk can take care of everything by itself.
awk -F' +|:' '
BEGIN{
print "MarkerName P-Value"
}
FNR>1{
print OFS $2":"$3,$NF
}
' Input_file
Explanation: adding a detailed explanation of the above.
awk -F' +|:' ' ##Start the awk program, setting the field separator to one or more spaces or a colon for all lines.
BEGIN{ ##Start the BEGIN section of this program.
print "MarkerName P-Value" ##Print the header.
}
FNR>1{ ##If the line number is greater than 1, do the following.
print OFS $2":"$3,$NF ##Print a space (OFS), then the 2nd field, a colon, the 3rd field, and the last field, as the OP requested.
}
' Input_file ##Mention the Input_file name here.
$ awk -F'[: ]+' '{print (NR==1 ? "MarkerName P-Value" : $2":"$3" "$NF)}' file
MarkerName P-Value
chr1:1243 0.14
Sed alternative:
sed -En '1{s/^.*$/MarkerName\tP-Value/p};s/([[:digit:]]+[[:space:]]+)([[:alnum:]]+:[[:digit:]]+)(.*)([[:digit:]]+\.[[:digit:]]+$)/\2\t\4/p'
For the first line, replace the whole line with the header. For the data lines, the regex splits the line into 4 captured sections, and the substitution prints the 2nd section (the marker) followed by a tab and then the 4th section (the P value).

Transpose rows to column after nth column in bash

I have a file in the format below:
$ cat file_in.csv
1308123;28/01/2019;28/01/2019;22/01/2019
1308456;20/11/2018;27/11/2018;09/11/2018;15/11/2018;10/11/2018;02/12/2018
1308789;06/12/2018;04/12/2018
1308012;unknown
How can I transpose it as below, starting from the second column:
1308123;28/01/2019
1308123;28/01/2019
1308123;22/01/2019
1308456;20/11/2018
1308456;27/11/2018
1308456;09/11/2018
1308456;15/11/2018
1308456;10/11/2018
1308456;02/12/2018
1308789;06/12/2018
1308789;04/12/2018
1308012;unknown
I'm testing my script, but I get the wrong result:
echo "123;23/05/2018;24/05/2018" | awk -F";" 'NR==3{a=$1";";next}{a=a$1";"}END{print a}'
Thanks in advance
1st Solution: The easiest solution is to loop through all the fields (with the field separator set to ;, of course) and print $1 along with each field on a new line. Note that the loop runs from i=2 up to NF, skipping the first field, since we need to print on new lines from the 2nd column onwards.
awk 'BEGIN{FS=OFS=";"} {for(i=2;i<=NF;i++){print $1,$i}}' Input_file
2nd Solution: Using one single substitution (sub) and one global substitution (gsub) in awk. First change the very first occurrence of ; to ### (this assumes your Input_file will NOT contain those characters together; if it does, use any unique character(s) that are not in your Input_file in place of ###). Then globally substitute every remaining ; with ORS val ";" (val is a variable holding the value of $1), which puts each value on its own line prefixed with the key. Finally, change the ### in the first field back to ;. Why this approach: if we did NOT replace the very first ; beforehand, the gsub would also hit it and place a NEW LINE right after the key itself, which we do NOT want. (Also, as per Ed sir's comment, this solution was tested on a single Input_file and may have issues when reading multiple Input_files.)
awk 'BEGIN{FS=OFS=";"} {val=$1;sub(";","###");gsub(";",ORS val ";");sub("###",";",$1)} 1' Input_file
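To see how it works, here is a hand trace of the first sample line (my reading of the intermediate values of $0, with \n standing for the newline that ORS inserts):
$0 at start: 1308123;28/01/2019;28/01/2019;22/01/2019 (and val="1308123")
after sub(";","###"): 1308123###28/01/2019;28/01/2019;22/01/2019
after gsub(";",ORS val ";"): 1308123###28/01/2019\n1308123;28/01/2019\n1308123;22/01/2019
after sub("###",";",$1): 1308123;28/01/2019\n1308123;28/01/2019\n1308123;22/01/2019
The final 1 then prints this as three lines.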
Another awk:
awk -F";" '{ OFS="\n" $1 ";"; $1=$1;$1=""; printf("%s",$0) } ' file
Here OFS is set to a newline followed by the first field and a ";", so when $0 is rebuilt (first by $1=$1, then by emptying $1) every remaining field ends up on its own line prefixed with the key; printf is used so that each record's leading newline acts as the separator.

Filter records based on Text in Unix

I'm trying to extract all the records that matches the text "IN" in the 10th field from this file.
I tried, but it's not giving me accurate results. Any help provided here would be highly appreciated.
awk '$10 == "IN" {print $0}'
input_file: my input file
A1|A2|A3|A4|A5|A6|A7|A8|A9|PK|A11|A13|A14|A15|A16|A17|A18
1|2|3|4|5|6|7|8|9|IN|11|12|13|14|15|16|17|18
AW|BW|CQ|AA|AR|AF|RR|AKL|ASD|US|PP|BN|TY|OL|Q3|M8|I7|V6
AR|BR|CR|A8|AN|AQ|RU|A11|A13|IN|P9P|B0N|T2Y|O4L|Q43|M88|I71|V16
output_file: my output should be
1|2|3|4|5|6|7|8|9|IN|11|12|13|14|15|16|17|18
AR|BR|CR|A8|AN|AQ|RU|A11|A13|IN|P9P|B0N|T2Y|O4L|Q43|M88|I71|V16
All the records that match "IN" in the 10th field should be selected.
Since you haven't set the field separator in your awk code, it defaults to whitespace, but your Input_file is pipe (|) delimited, so you should let awk know by setting it in the code.
Could you please try the following.
awk -F'|' '$10=="IN"' Input_file
Explanation: adding an explanation for the above code too.
awk -F'|' ' ##Set the field separator to | (pipe) for all lines of the Input_file.
$10=="IN" ##If the 10th field equals IN, print the current line (the default action).
' Input_file ##Mention the Input_file name here.

Awk, Shell Scripting

I have a file which has the following form:
#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
111|Arkas|Sarkas|male|1995-09-11|2010-03-17T13:32:10.447+0000|192.248.2.123|Midori
Every field is separated with "|". I am writing a shell script and my goal is to remove the "-" from the fifth field (birthday), in order to make comparisons as if they were numbers.
For example, I want the fifth field to look like |19950911|
The only solution I have reached so far, using sed, deletes all the "-" from each line, which is not what I want.
I would be extremely grateful if you could show me a solution to my problem using awk.
If this is homework, writing the complete script would be a disservice. Some hints: the function you should be using is gsub in awk. The fifth field is $5, and you can set the field separator with -F'|' or in a BEGIN block as FS="|".
Also, the line number is in the NR variable; to skip the first line, for example, you can add the condition NR>1.
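Putting those hints together, a minimal sketch might look like this (the header line is passed through unchanged; this is only a sketch checked by hand against the sample line, not a full script):
awk 'BEGIN{FS=OFS="|"} NR==1{print; next} {gsub("-","",$5); print}' infile.txt
which for the sample data should print
#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
111|Arkas|Sarkas|male|19950911|2010-03-17T13:32:10.447+0000|192.248.2.123|Midori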
An awk one liner:
awk 'BEGIN { FS="|" } { gsub("-","",$5); print }' infile.txt
To keep "|" as the output separator, it is better to also set OFS to "|":
... | awk 'BEGIN { FS="|"; OFS="|"} {gsub("-","",$5); print $0 }'

join all lines that have the same first column to the same line

IE:
File:
1234:abcd
1234:930
1234:999999
194:keee
194:284
194:222222
Result:
1234:abcd:930:999999
194:keee:284:222222
I have exhausted my brain to the best of my knowledge and can't come up with a way. Sorry to bother you guys!
$ awk -F: '$1==last {printf ":%s",$2; next} NR>1 {print "";} {last=$1; printf "%s",$0;} END{print "";}' file
1234:abcd:930:999999
194:keee:284:222222
How it works
-F:
This tells awk to use a : as the field separator.
$1==last {printf ":%s",$2; next}
If the first field of this line is the same as the first field of the last line, print a colon followed by field 2. Then, skip the rest of the commands and start over with the next line.
NR>1 {print "";}
If we get here, that means this line has a new, not-seen-before value of the first field. If this is not the first line, we finish the previous line by printing a newline character.
{last=$1; printf "%s",$0;}
Update the variable last with the new value of field 1. Then, print this line without a trailing newline, so later values can be appended to it.
END{print "";}
After we reach the end of the file, print one last newline character.
Combining non-consecutive lines
Consider this test file:
$ cat testfile2
3:abcd
4:abcd
10:123
3:999
4:999
10:123
Apply this awk script:
$ awk -F: '{a[$1]=a[$1]":"$2;} END{for (x in a) print x ":" substr(a[x],2);}' testfile2
3:abcd:999
4:abcd:999
10:123:123
In this approach, the lines will not necessarily come out in any particular order. If order is important, you may want to pipe this output to sort.
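For example, assuming a numeric sort on the first field is what's wanted:
awk -F: '{a[$1]=a[$1]":"$2;} END{for (x in a) print x ":" substr(a[x],2);}' testfile2 | sort -t: -k1,1n
3:abcd:999
4:abcd:999
10:123:123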
