Objective
add "67" to column 1 of the output file with 67 being the variable ($iv) classified on the difference between 2 dates.
File1.csv
display,dc,client,20572431,5383594
display,dc,client,20589101,4932821
display,dc,client,23030494,4795549
display,dc,client,22973424,5844194
display,dc,client,21489000,4251031
display,dc,client,23150347,3123945
display,dc,client,23194965,2503875
display,dc,client,20578983,1522448
display,dc,client,22243554,920166
display,dc,client,20572149,118865
display,dc,client,23077785,28077
display,dc,client,21811100,5439
Current Output 3_file1.csv
BOB-UK-,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5
Desired Output 3_file1.csv
BOB-UK-67,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-67,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-67,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-67,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-67,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-67,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-67,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-67,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-67,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-67,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-67,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-67,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5
Current Code
#! bin/sh
set -eu
de=$(date +"%d-%m-%Y" -d "1 month ago")
ds="15-04-2014"
iv=$(awk -vdate1=$de -vdate2=$ds 'BEGIN{split(date1, A,"-");split(date2, B,"-");year_diff=A[3]-B[3];if(year_diff){months_diff=A[2] + 12 * year_diff - B[2] + 1;} else {months_diff=A[2]>B[2]?A[2]-B[2]+1:B[2]-A[2]+1};print months_diff}')
for f in $(find *.csv); do
awk -F"," -v OFS=',' '{print "BOB-UK-"$iv,$0,0.05}' $f > "1_$f.csv" ##PROBLEM LINE##
awk -F"," -v OFS=',' '{print $0,$6*$7/1000}' "1_$f.csv" > "2_$f.csv" ##calculate price
awk -F"," -v OFS=',' '{print $0}; {sum+=$6}{sum2+=$8} END {print "TOTAL,,,,," (sum)",,"(sum2)}' "2_$f.csv" > "3_$f.csv" ##calculate total
done
Issue
When I run the first awk line (Marked as "## PROBLEM LINE##") the loop doesn't change column $1 to include the "67" after "BOB-UK-". This should be done with the print "BOB-UK-"$iv but instead it doesn't do anything. I suspect this is due to the way print works in awk but I haven't been able to work out a way to treat it within this row. Does anyone know if this is possible or do I need to create a new row to achieve this?
You have to pass the variable value to awk. awk does not inherit variables from the shell and does not expand $variable variables like shell. It is another tool with it's internal language.
awk -v iv="$iv" -F"," -v OFS=',' '{print "BOB-UK-"iv,$0,0.05}' "$f"
Tested in repl with the input provided.
for f in $(find *.csv)
Is useless use of find, makes no sense, just
for f in *.csv
Also note that you are creating 1_$f.csv, 2_$f.csv and 3_$f.csv files in the current directory in your loop, so the next time you run your script there will be 4 times more .csv files to iterate through. Dunno if that's relevant.
How $iv works in awk?
The $<number> is the field number <number> from the line in awk. So for example the $1 is the first field of the line in awk. The $2 is the second field. The $0 is special and it is the whole line.
The $iv expands to $ + the value of iv. So for example:
echo a b c | awk '{iv=2; print $iv}'
will output b, as the $iv expands to $2 then $2 expands to the second field from the input - ie. b.
Uninitialized variables in awk are initialized with 0. So $iv is substituted for $0 in your awk line, so it expands for the whole line.
I have txt file like this:
Dupont Charles
Martin Paul
Dupuis Jean
I want, for each line, to make a login corresponding to first 2 caracters of each names.
For instance : Dupont Charles ==> duch
awk '{print tolower(substr($1,1,1)substr($2,1,1))}' liste.txt
works perfectly.
but i want to store each login in var and call a bash script (uscript) with that var...
awk '{login=tolower(substr($1,1,1)substr($2,1,1));print $login; script1 $login; }' liste.txt
But it does not work and the content of login is not what I want.
The way to use variables in awk is different from Bash. In awk, a variable does not have a leading $. So what you want to say is:
awk '{login=tolower(substr($1,1,1)substr($2,1,1));print login; }' liste.txt
# ^
# intead of $login
However, you are willing to use a script script1 with this value. For this, you may want to use system() to call an external command............... or use a while block to handle all in one.
while IFS= read -r name surname
do
login=${name:0:2}${surname:0:2}
script1 "$login"
done < file
This uses the same logic to get the 3 first characters of a variable:
$ v="123456789"
$ echo ${v:0:3}
123
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs script1
e.g. using echo instead of script1 which of course I don't have:
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs echo
duch mapa duje
or if script1 requires 1 arg at a time:
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 script1
e.g.
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 echo
duch
mapa
duje
I have an awk command that extracts the 16th column from 3rd line in a csv file and prints the first 4 characters.
awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}'
This works fine.
But when I execute it from a shell script, I get and error
#!/bin/ksh
YEAR=awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}'
Error message:
-F,: not found
Use command substitution to assign the output of a command to a variable, as shown below:
YEAR=$(awk -F"," 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,0,4)}')
you are asking the shell to do :
VAR=value command [arguments...]
which means: launch command but pass it the VAR=value environment first
(ex: LC_ALL=C grep '[0-9]*' /some/file.txt : will grep a number in file.txt (and this with the LC_ALL variable set to C just for the duration of the call of grep)
So here : you ask the shell to launch the -F"," command (ie, -F, once the shell interpret the "," into , with arguments 'NR==3.......... and with the variable YEAR set to the value awk for the duration of the command invocation.
Just replace it with :
#!/bin/ksh
YEAR="$(awk -F',' 'NR==3{print $16}' sample.csv|sed -e 's/^[ \t]*//'|awk '{print substr($0,1,4)}')"
(I didn't try it, but I hope they work for you and your sample.csv file)
(Note that you use "0" to match character position 1, which works in many awk implementation but not all (ie most (but not all) assume 1 when you write 0))
From your description, it looks like you want to extract the year from the 16th field, which might contain leading spaces. You can accomplish it by calling AWK once:
YEAR=$(awk -F, 'NR==3{sub(/^[ \t]*/, "", $16); print ">" substr($16,1,4) "<" }')
Better yet, you don't even have to use awk. Since you are already writing shell script, let's do it all in shell script:
{ read line; read line; read line; } < sample.csv # Get the third line
IFS=, set $line # Breaks line into comma-separated fields
IFS=" " set ${16} # Trick to remove leading spaces, field 16 becomes field 1
YEAR=${1:0:4} # Extract the first 4 char from field 1
Do this:
year=$(awk -F, 'NR==3{sub(/^[ \t]+/,"",$16); print substr($16,1,4); exit }' sample.csv)
I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns and I want and append them to a new file
It works but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array. Using command line arguments doesn't seem like the way to go. ANY COMMENTS ARE WELCOME.
# Takes :: separated file as 1st parameters
SOURCE=$1
# create csv target file
TARGET=${SOURCE/dat/csv}
touch $TARGET
echo #userId,itemId > $TARGET
IFS=","
while read LINE
do
# Replaces all matches of :: with a ,
CSV_LINE=${LINE//::/,}
set -- $CSV_LINE
echo "$1,$2" >> $TARGET
done < $SOURCE
Instead of set, you can use an array:
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
The following would print columns 1 and 2 from infile.dat. Replace with
a comma-separated list of the numbered columns you do want.
awk 'BEGIN { IFS='::'; OFS=","; } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a 1 liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of awk and sed:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, '{print $1, $2}'
# Or to avoid a UUOC award (and prolong the life of your keyboard by 3 characters
sed -e 's/::/,/g' inputfile | awk -F, '{print $1, $2}'
awk is indeed the right tool for the job here, it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$