unix time to date and replace in bash using awk

I am trying to convert Unix time to date and time:
1436876820 blah1 stop none john
1436876820 blah0 continu none john
1436876821 blah2 stop good bob
I would like to replace the first column with two columns, date and time, as below:
14-07-15 13:27:00 blah1 stop none john
14-07-15 13:27:00 blah0 continu none john
14-07-15 13:27:01 blah2 stop good bob
etc..
So I have started to do the following.
IN="${1}"
for i in $(awk '{print $1}' ${IN});
do
DD=$(date -d @${i} +'%d-%m-%Y %H:%M:%S')
awk '{ ${1}="'"${DD}"'" }' < ${IN}
done
This does not work due to the syntax and gives an error like this:
awk: { ${1}="14-07-2015 13:27" }
awk: ^ syntax error
I could use sed instead of awk:
sed "s/^1........./${DD}/" ${IN}
Any help with awk is really welcome.
Al.

Get rid of the shell loop and just do it in one awk invocation:
awk '{
  cmd = "date -d @" $1 " +\"%d-%m-%Y %H:%M:%S\""
  if ( (cmd | getline dd) > 0 ) {
    $1 = dd
  }
  close(cmd)
  print
}' "$1"
If you have GNU awk you can just use its built-in strftime() instead of the date+getline:
awk '{
  $1 = strftime("%d-%m-%Y %H:%M:%S", $1)
  print
}' "$1"
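As a quick sanity check of the epoch-to-date mapping (assuming GNU date; the asker's 13:27 output implies a UTC+1 timezone, so TZ is pinned to UTC here for reproducibility):

```shell
# GNU date assumed; @N means "N seconds since the Unix epoch"
TZ=UTC date -d @1436876820 '+%d-%m-%Y %H:%M:%S'
# 14-07-2015 12:27:00
```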

Related

awk how to find first available date field?

Fields 1,2,3,4 are date fields yyyy-mm-dd.
Delimited by ";"
"-" if no date.
Field 4 will always have a date
Examples;
-; 2016-08-19; 2016-08-19; 2018-07-17; Beach-Rangiroa.jpg
-; -; -; 2018-09-12; MV3_0034-copy.webp
2016-12-10; 2016-12-10; 2016-12-20; 2018-07-18; Sukhothai-61.jpg
-; -; -; 2018-07-19; Gdu9Rwhu6W3Q5W6q_1Qag.jpg
Objective: Use awk to print the 1st available date, checking fields 1,2,3,4 in order.
I've tried this;
awk -F";" '{if ($1!="-") print $1; else if ($2!="-") print $2; else if ($3!="-") print $3; else if ($4!="-") print $4}'
Results...
2016-08-19
-
-
bash version 4.3.48
I am trying to achieve this: e.g. line 1 in example...
2016-08-19; Beach-Rangiroa.jpg
echo '-; -; -; 2018-07-15; Stock-Photo-114398301.webp; WEBP; image/webp; 2000; 1333' | \
awk -F';' 'OFS=";" {for(i=1; i<5; ++i) { if ($i ~ /[0-9]{4}-[0-9]{,2}-[0-9]{,2}/) { print $i,$5,$6,$7,$8,$9; next; }}}'
Result;
2018-07-15; Stock-Photo-114398301.webp; WEBP; image/webp; 2000; 1333
This works nicely, except for the leading space on the date. Also, is there a method available to verify the date, e.g. date -d "%Y-%m-%d"?
Thank you.
This is a GNU-only gawk solution using FPAT:
awk 'BEGIN{FPAT="[0-9]{4}-[0-9]{,2}-[0-9]{,2}"}{print $1}' file1
2016-08-19
2018-09-12
2018-07-19
With FPAT you actually instruct gawk what to consider a field, using a whole regex here. If the input line also contains a second date, it will appear as $2; $NF will return the last date field of each line, NF will return the total number of date fields, and so on.
You can use a variable for field numbers:
awk -F\; '{for(i=1; i<5; ++i) { if ($i ~ /[0-9]/) { print $i; next; }}}' in
Solution without awk:
You said you wanted the 1st available date. If you only want one line of output, you can use:
grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}" inputfile| head -1
When you want the first date for each line, change the grep or use sed:
grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}.*" inputfile| cut -d';' -f1
# or
sed -r 's/([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/; s/.*([0-9]{4}-[0-9]{2}-[0-9]{2})/\1/' inputfile
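The question also asked about verifying a candidate date; a small sketch with GNU date, whose exit status tells you whether the string parses (the is_date helper name is made up here):

```shell
# Hypothetical helper: succeeds only if GNU date can parse the string.
# Note: date -d also rejects impossible calendar dates like month 13.
is_date() { date -d "$1" '+%Y-%m-%d' >/dev/null 2>&1; }

is_date 2016-08-19 && echo valid      # valid
is_date 2016-13-40 || echo invalid    # invalid
```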
Thank you all for your help.
I think this accomplishes the objective:
echo '-; -; -; 2018-07-25; Redwood-Forest-Sequoia-4.jpg; JPEG; image/jpeg; 1280; 720' | \
awk -F'; ' 'OFS="; " {for(i=1; i<5; ++i) { if ($i ~ /[0-9]{4}-[0-9]{,2}-[0-9]{,2}/) { print $i,$5,$6,$7,$8,$9; next; }}}'
Result;
2018-07-25; Redwood-Forest-Sequoia-4.jpg; JPEG; image/jpeg; 1280; 720
Best regards.

Add to CSV a timestamp column based on other columns (using bash)

I need to read a CSV file (list.csv) like this:
0;John Doe;2001;03;24
1;Jane Doe;1985;12;05
2;Mr. White;2018;06;01
3;Jake White;2017;11;20
...
and add a column (doesn't matter where I put it) with a Unix timestamp based on the year/month/day being in column 3, 4 and 5, to get this:
0;John Doe;2001;03;24;985392000
1;Jane Doe;1985;12;05;502588800
2;Mr. White;2018;06;01;1527811200
3;Jake White;2017;11;20;1511136000
...
So I wrote this script.sh:
#!/bin/sh
while read line
do
  printf "$line;"
  date -d $(awk -F\; '{print $3$4$5}' <<<$line) +%s
done
and I ran:
<list.csv ./script.sh
and it works, but it's very slow when it comes to having very large CSVs.
Is there a way to do it faster in a sed/awk command line?
I mean, can I (for instance) inject a bash command into a sed/awk line?
For example (I know this won't work, it's just an example):
awk -F\; '{print $1 ";" $2 ";" $3 ";" $4 ";" $5 ";" $(date -d $3$4$5 +%s)}'
GNU awk to the rescue!
$ gawk -F';' '{$0=$0 FS mktime($3" "$4" "$5" 00 00 00")}1' file
0;John Doe;2001;03;24;985410000
1;Jane Doe;1985;12;05;502606800
2;Mr. White;2018;06;01;1527825600
3;Jake White;2017;11;20;1511154000
Not sure what hour/min/sec you want as the default.
For other awks without builtin time functions:
awk -F';' '{
  cmd = "date -d "$3 $4 $5" +%s"
  cmd | getline time
  close(cmd)
  $0 = $0 FS time
  print
}' file
or perl
perl -MTime::Piece -F';' -lane '
  print join ";", @F, Time::Piece->strptime("@F[2..4]", "%Y %m %d")->epoch
' file
# or
perl -MTime::Local -F';' -lane '
  print join ";", @F, timelocal(0, 0, 0, $F[4], $F[3]-1, $F[2]-1900)
' file
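If GNU awk is unavailable but GNU date is, a single date invocation can also timestamp the whole file via its -f option (one date per line on stdin), avoiding a fork per row; TZ is pinned to UTC here, which matches the sample output in the question:

```shell
# Assumption: GNU date and the 5-column layout from the question.
# cut pulls year;month;day, tr joins them into YYYY-MM-DD,
# date -f- converts every line in one process, paste glues the column back on.
cut -d';' -f3-5 list.csv | tr ';' '-' | TZ=UTC date -f- +%s |
paste -d';' list.csv -
```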

Using and manipulating date variable inside AWK

I assume there may be a better way to do it, but the only one I came up with was using AWK.
I have a file with name convention like following:
testfile_2016_03_01.txt
Using one command I am trying to shift it by one day, to get testfile_20160229.txt.
I started from using:
file=testfile_2016_03_01.txt
IFS="_"
arr=($file)
datepart=$(echo ${arr[1]}-${arr[2]}-${arr[3]} | sed 's/.txt//')
date -d "$datepart - 1 days" +%Y%m%d
The above works fine, but I really wanted to do it in AWK. The only thing I found was how to use "date" inside AWK:
new_name=$(echo ${file##.*} | awk -F'_' '{
  "date '+%Y%m%d'" | getline date;
  print date
}')
echo $new_name
Okay, so two things happen here. For some reason $4 still contains .txt even though I removed it(?) with ##.*
And the main problem is I don't know how to pass the variables to that date, the below doesn't work
`awk -F'_' '{"date '-d "2016-01-01"' '+%Y%m%d'" | getline date; print date}')
Ideally I want 2016-01-01 to be variables coming from the file name ($2-$3-$4) and to subtract 1 day, but I think I'm getting way too many single and double quotes here and my brain is giving up..
Equivalent awk command:
file='testfile_2016_03_01.txt'
echo "${file%.*}" |
awk -F_ '{cmd="date -d \"" $2"-"$3"-"$4 " -1 days\"" " +%Y%m%d";
cmd | getline date; close(cmd); print date}'
20160229
WIth GNU awk for time functions:
$ file=testfile_2016_03_01.txt
$ awk -v file="$file" 'BEGIN{ split(file,d,/[_.]/); print strftime(d[1]"_%Y%m%d."d[5],mktime(d[2]" "d[3]" "d[4]" 12 00 00")-(24*60*60)) }'
testfile_20160229.txt
This might work for you:
file='testfile_2016_03_01.txt'
IFS='_.' read -ra a <<< "$file"
date -d "${a[1]}${a[2]}${a[3]} -1 day" "+${a[0]}_%Y%m%d.${a[4]}"
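The read-based answer also handles month and year boundaries correctly; for instance (bash and GNU date assumed, with a hypothetical January file name):

```shell
# bash + GNU date assumed; same approach as above, crossing a year boundary
file='testfile_2016_01_01.txt'
IFS='_.' read -ra a <<< "$file"
date -d "${a[1]}${a[2]}${a[3]} -1 day" "+${a[0]}_%Y%m%d.${a[4]}"
# testfile_20151231.txt
```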

awk print something if column is empty

I am trying out one script in which a file [ file.txt ] has so many columns like
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha| |325
xyz| |abc|123
I would like to get the column values in a bash script using the awk command; if a column is empty it should print "blank", else print the column value.
I have tried the possibilities below but they are not working:
cat file.txt | awk -F "|" {'print $2'} | sed -e 's/^$/blank/' // Using awk and sed
cat file.txt | awk -F "|" '!$2 {print "blank"} '
cat file.txt | awk -F "|" '{if ($2 =="" ) print "blank" } '
Please let me know how we can do that using awk or any other bash tools.
Thanks
I think what you're looking for is
awk -F '|' '{print match($2, /[^ ]/) ? $2 : "blank"}' file.txt
match(str, regex) returns the position in str of the first match of regex, or 0 if there is no match. So in this case, it will return a non-zero value if there is some non-blank character in field 2. Note that in awk, the index of the first character in a string is 1, not 0.
Here, I'm assuming that you're interested only in a single column.
If you wanted to be able to specify the replacement string from a bash variable, the best solution would be to pass the bash variable into the awk program using the -v switch:
awk -F '|' -v blank="$replacement" \
'{print match($2, /[^ ]/) ? $2 : blank}' file.txt
This mechanism avoids problems with escaping metacharacters.
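A quick demonstration on two of the sample lines (the ternary is parenthesized here for portability across awk implementations):

```shell
# Field 2 of the second line is a single space, so match() finds no
# non-blank character and "blank" is printed instead.
printf 'abc|pqr|lmn|123\nxyz| |abc|123\n' |
awk -F'|' '{print (match($2, /[^ ]/) ? $2 : "blank")}'
# pqr
# blank
```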
You can do it using this sed script:
sed -r 's/\| +\|/\|blank\|/g' File
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123
If you don't want the |:
sed -r 's/\| +\|/\|blank\|/g; s/\|/ /g' File
abc pqr lmn 123
pqr xzy 321 azy
lee cha blank 325
xyz blank abc 123
Else with awk:
awk '{gsub(/\| +\|/,"|blank|")}1' File
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123
You can use awk like this:
awk 'BEGIN{FS=OFS="|"} {for (i=1; i<=NF; i++) if ($i ~ /^ *$/) $i="blank"} 1' file
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123

bash awk first 1st column and 3rd column with everything after

I am working on the following bash script:
# contents of dbfake file
1 100% file 1
2 99% file name 2
3 100% file name 3
#!/bin/bash
# cat out data
cat dbfake |
# select lines containing 100%
grep 100% |
# print the first and third columns
awk '{print $1, $3}' |
# echo out id and file name and log
xargs -rI % sh -c '{ echo %; echo "%" >> "fake.log"; }'
exit 0
This script works ok, but how do I print everything in column $3 and then all columns after?
You can use cut instead of awk in this case:
cut -f1,3- -d ' '
awk '{ $2 = ""; print }' # remove col 2
If you don't mind a little whitespace:
awk '{ $2="" }1'
But you have a UUOC (useless use of cat) and grep:
< dbfake awk '/100%/ { $2="" }1' | ...
If you'd like to trim that whitespace:
< dbfake awk '/100%/ { $2=""; sub(FS "+", FS) }1' | ...
For fun, here's another way using GNU sed:
< dbfake sed -r '/100%/s/^(\S+)\s+\S+(.*)/\1\2/' | ...
All you need is:
awk 'sub(/.*100% /,"")' dbfake | tee "fake.log"
Others responded in various ways, but I want to point out that using xargs to multiplex output is a rather bad idea.
Instead, why don't you:
awk '$2=="100%" { sub("100%[[:space:]]*",""); print; print >>"fake.log"}' dbfake
That's all. You don't need grep, you don't need multiple pipes, and definitely you don't need to fork shell for every line you're outputting.
You could do awk '...; print }' | tee fake.log, but there is not much point in forking tee if awk can handle it as well.