Printing date within awk - bash

I'm trying to print the date inside of an awk command. I can't find a way around the fact that the gawk program is passed inside single quotes, which prevents the shell from running the date command I need:
gawk '/.*(ge|ga).*/ { print $1 "," $2 "," date } ' >> file.csv
gawk '/.*(ge|ga).*/ { print $1 "," $2 "," echo date } ' >> file.csv
gawk '/.*(ge|ga).*/ { print $1 "," $2 "," `date` } ' >> file.csv
What is a way around this inside the gawk command? Thanks.

It's not 100% clear what you're trying to do here (some input and desired output would be useful) but I think this is what you want:
gawk -v date="$(date)" -v OFS=, '/g[ea]/ { print $1, $2, date }'
This sets an awk variable date based on the output of the date command and prints it after the first and second field. I've set the output field separator OFS to make your print command neater.
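As a quick illustration, with made-up input like
genoa 12
galway 7
on stdin, the command prints one CSV line per matching record, the third column being whatever date produced at the time it ran, e.g.:
genoa,12,Mon Nov 24 08:15:18 UTC 2014
galway,7,Mon Nov 24 08:15:18 UTC 2014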
Alternatively (and probably preferred) is to use the strftime function available in GNU awk:
gawk -v OFS=, '/g[ea]/ { print $1, $2, strftime() }'
The format of the output is slightly different but can be adjusted by passing a format string to the function. See the GNU awk documentation for more details on that.
I have also simplified your regular expression, based on the suggestions made in the comments (thanks).
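For example, to print a fixed layout instead of strftime's default (the format string here is just one common choice):
gawk -v OFS=, '/g[ea]/ { print $1, $2, strftime("%Y-%m-%d %H:%M:%S") }'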

Related

How to pad a CSV first column with zeroes in awk?

I have a CSV like this:
1,"Paris","3.57"
10,"Singapore","3.57"
211,"Sydney","3.28"
324,"Toronto Center","3.33"
I'd like to pad the first column with zeroes to get:
001,"Paris","3.57"
010,"Singapore","3.57"
211,"Sydney","3.28"
324,"Toronto Center","3.33"
I tried to assign the first column to the output of printf with awk:
awk '{ $1 = printf("%03d", $1); print }' my.csv
But it gives me a syntax error:
awk: cmd. line:1: { $1 = printf("%03d", $1); print }
awk: cmd. line:1: ^ syntax error
It doesn't work either if I quote the printf function.
How could I do that?
If you just want to format the text of one field, you can use awk's sprintf.
awk '{ $1=sprintf("%03d", $1)}1' csvfile
Or, the standard way:
awk '{printf "%03d %s\n", $1,$2}' csvfile
As per the OP's update to the question:
awk 'BEGIN{FS=OFS=","}{ $1=sprintf("%03d", $1)}1' csvfile
In awk, printf is a statement, not a function, so it has no result that can be assigned.
To return a formatted string, use sprintf (which is a function):
awk -F, -v OFS=, '{ $1 = sprintf("%03d", $1) } 1' file
It is necessary to set FS (via -F) and OFS so that when awk reformats the line, the field separators remain intact.
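To see why both matter: with -F, alone and OFS left at its default single space, awk rebuilds the record using spaces when $1 is assigned, so the first sample row comes out as
awk -F, '{ $1 = sprintf("%03d", $1) } 1' file
001 "Paris" "3.57"
instead of staying comma-separated.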
As pointed out in the comments, using %d can potentially lead to problems when the input starts with a 0, as numbers with a leading 0 are interpreted as octal. This can break on input like 08 because 8 is outside of the octal range (0-7).
One way to get around this is to use %03.0f, which interprets the input as a floating point value, with the output precision set to 0:
awk -F, -v OFS=, '{ $1 = sprintf("%03.0f", $1) } 1' file
(the second 0 in the format specifier can in fact be omitted)
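As a quick check with a made-up row whose first field already has a leading zero:
echo '08,"Oslo","1.00"' | awk -F, -v OFS=, '{ $1 = sprintf("%03.0f", $1) } 1'
008,"Oslo","1.00"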
awk -F, '{ printf("%03d", $1); print "," $2 "," $3 }' my.csv

Shell script to add values to a specific column

I have semicolon-separated columns, and I would like to add some characters to a specific column.
aaa;111;bbb
ccc;222;ddd
eee;333;fff
To the second column I want to add '#', so the output should be:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
I tried
awk -F';' -OFS=';' '{ $2 = "#" $2}1' file
It adds the character but replaces all the semicolons with spaces.
You could use sed to do your job:
# replaces just the first occurrence of ';', note the absence of `g` that
# would have made it a global replacement
sed 's/;/;#/' file > file.out
or, to do it in place:
sed -i 's/;/;#/' file
Or, use awk:
awk -F';' '{$2 = "#"$2}1' OFS=';' file
All the above commands result in the same output for your example file:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
@atb: Try:
1st:
awk -F";" '{print $1 FS "#" $2 FS $3}' Input_file
The above will only work when your Input_file has exactly 3 fields.
2nd:
awk -F";" -vfield=2 '{$field="#"$field} 1' OFS=";" Input_file
In the above you could put any field number, so it can be adapted to whichever column you need.
Here I set the field separator to ";", define a variable named field that holds the field number, and prepend "#" to that field's value. The trailing 1 is an always-true condition with no action, so awk falls back to its default action and prints the current line.
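For example (only the field number changes), to prefix the third column instead:
awk -F";" -v field=3 '{$field="#"$field} 1' OFS=";" Input_file
aaa;111;#bbb
ccc;222;#ddd
eee;333;#fff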
You just misunderstood how to set variables. Change -OFS to -v OFS:
awk -F';' -v OFS=';' '{ $2 = "#" $2 }1' file
but in reality you should set them both to the same value at one time:
awk 'BEGIN{FS=OFS=";"} { $2 = "#" $2 }1' file

Gawk Line removal, Splitter is :

Is it possible to move certain columns from one .txt file into another .txt file?
I have a .txt that contains:
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
With gawk I want to extract ADDRESS & POSTCODE columns into another .txt, so for this given file the output should be:
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2
etc.
This is a classic AWK transform. You want to use "-F :" to specify that the input is delimited by ":" and print a new ":" on output:
awk -F: '{ print $5 ":" $6 }' <input.txt >output.txt
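An equivalent way to write it, letting OFS supply the separator instead of the literal ":" in the print statement:
awk 'BEGIN{FS=OFS=":"} {print $5, $6}' <input.txt >output.txt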
Try this:
awk -F: '{printf "%s:%s ",$5,$6}' ex.txt
input is
USERID:ORDER#:IP:PHONE:ADDRESS1:POSTCODE1
USERID:ORDER#:IP:PHONE:ADDRESS2:POSTCODE2
output is (on one line if I understand correctly)
ADDRESS1:POSTCODE1 ADDRESS2:POSTCODE2
The only defect is that the output ends with a trailing space and has no final newline, which can be fixed with the slightly more complex (but still readable):
awk -F: 'BEGIN {z=0;} {if (z==1) { printf " "; } ; z=1; printf "%s:%s",$5,$6} END{printf"\n"}' ex.txt
awk -F: 'NR==1 {print $5"1:"$6"1"};NR==2 {print $5"2:"$6"2"}' file
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2

Format date in a column using awk

I just want to fix this problem. I am running the code below
awk -F, 'NR>1{gsub(/\:/,"",$4);gsub(/\-/,"",$4);gsub(/\.0/,"",$4);gsub(/\ /,",",$4);NF--}{$1=$1}1' OFS=, sample
$ cat sample
1,0,null,2014-11-24 08:15:18.0,1
1,0,null,2014-11-24 08:15:16.0,1
The output is
1,0,null,2014-11-24 08:15:18.0,1
1,0,null,20141124,081516
My expected output:
1,0,null,20141124,081518,1
1,0,null,20141124,081516,1
Can anyone help me with my code above?
You probably just need
awk -F, '{gsub(/[-:]/,"",$4);sub(/ /,OFS,$4);sub(/\.0$/,"",$4)}1' OFS=, sample
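Run against the sample above, that should give:
1,0,null,20141124,081518,1
1,0,null,20141124,081516,1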
Instead of using gsub, you are better off using split.
awk '
BEGIN { FS = OFS = "," }
{
    split($4, flds, /[- :.]/)
    $4 = flds[1] flds[2] flds[3] FS flds[4] flds[5] flds[6]
}1' sample
1,0,null,20141124,081518,1
1,0,null,20141124,081516,1
We set the input and output field separators to , in the BEGIN block.
Using split, we break the fourth field on -, :, . and space into an array.
We then reconstruct the fourth field by concatenating the array elements.
The 1 at the end triggers awk's default action, which is to print the line.
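To make the split concrete, here is a standalone check of what the call produces for the timestamp in the first sample row:
gawk 'BEGIN { n = split("2014-11-24 08:15:18.0", flds, /[- :.]/); for (i = 1; i <= n; i++) print i, flds[i] }'
1 2014
2 11
3 24
4 08
5 15
6 18
7 0
The stray seventh element (the 0 left over from .0) is simply never used when the field is rebuilt.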
#!/usr/bin/awk -f
# For every line with a non-empty first field: strip ".0", "-" and ":",
# then turn the remaining space into a comma before printing.
$1 {
    gsub(/(\.0|[-:])/, "")
    gsub(/ /, ",")
    print
}
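Saved to a file (the name fixdate.awk is arbitrary), it can be run either through the shebang or with awk -f:
chmod +x fixdate.awk && ./fixdate.awk sample
awk -f fixdate.awk sample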
$ awk 'BEGIN{FS=OFS=","} {gsub(/[-:]|\.0/,"",$4); sub(/ /,OFS,$4)} 1' file
1,0,null,20141124,081518,1
1,0,null,20141124,081516,1
or:
$ awk 'BEGIN{FS="[ ,]";OFS=","} {gsub(/-/,"",$4); gsub(/:|\.0/,"",$5)} 1' file
1,0,null,20141124,081518,1
1,0,null,20141124,081516,1

How to include a variable in the output filename using awk

There is a command which prints a range of values from a CSV file out to a file:
date1var="mm/dd/yyyy hh:mm:ss"
date2var="mm/dd/yyyy hh:mm:ss"
awk -F, -v d1var="$date1var" -v d2var="$date2var" '$1 > d1var && $1 <= d2var {print $0 }' OFS=, plot_data.csv > graph1.csv
I'm wondering whether it's possible to include my variables in the output filename.
The final name of the file should be similar to:
graph_d1var-d2var.csv
Any ideas?
You can redirect the output of the print command to a file name, like:
awk -F, -v d1var="$date1_var" -v d2var="$date2var" '
$1 > d1var && $1 <= d2var {
print > ("graph_" d1var "-" d2var ".csv")
}'
OFS=, plot_data.csv
This uses the values of d1var and d2var to build the name of the output file. If you instead want the literal variable names in the filename, put them inside the double-quoted string.
Let the shell handle it: you're starting with shell variables after all
date1var="mm/dd/yyyy hh:mm:ss"
date2var="mm/dd/yyyy hh:mm:ss"
awk -F, -v OFS=, -v d1var="$date1var" \
-v d2var="$date2var" \
'
# awk script is unchanged
' plot_data.csv > "graph1_${date1var}-${date2var}.csv"
#!/bin/bash
date1var="1234"
date2var="5678"
awk -F, -v d1="$date1var" -v d2="$date2var" '{print > ("graph" d1 "-" d2 ".txt")}' OFS=, plot_data.csv
Note that you can't compare date strings in awk like you are trying to do. You also have a typo, in that you have written date1_var with an underscore whereas you have used date1var without an underscore further on.
I guess the short answer is that you can print to a named file with print > "filename", and that you can concatenate (join) strings simply by placing them next to each other, like this: string2 = string1 "and" string3.
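A minimal illustration of that concatenation rule (the names and values here are invented):
awk 'BEGIN { d1 = "start"; d2 = "end"; name = "graph_" d1 "-" d2 ".csv"; print name }'
graph_start-end.csv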
