Comparing and transforming a date that is piped to (probably) awk? - shell

I've got a reasonably complicated string of piped shell commands (let's assume it's bunch | of | commands), which together produce several rows of output in this format:
some_path/some_file.csv 1439934121
...where 1439934121 is the file's last-modified timestamp.
What I need to do is see if it's a timestamp on the current day, i.e. on or after last midnight, and then include just the lines where that is true.
I assume this means that some string (e.g. the word true) should either replace or be appended to the timestamps of those lines for grep to distinguish them from ones where the timestamps are those of an earlier date.
To put it in shell command terms:
bunch | of | commands | ????
...should produce:
some_path/some_file.csv true
or
some_path/some_file.csv 1439934121 true
...for which I could easily grep (obviously assuming that last midnight <= 1439934121 <= current time).
What kind of ???? would do this? I'm almost certain that awk can do what I need it to, so I've looked at it and date, but I'm basically doing awk-by-google with no skills and getting nowhere.
Don't feel constrained by my tool assumptions; if you can achieve this with alternate means, given the output of bunch | of | commands but still using shell tools and piping, I'm all ears. I'd like to avoid temp files or Perl, if possible :-)
I'm using gawk + bash 4.3 on Ubuntu Linux, specifically, and have no portability concerns.

With the +%s format, date -d 'today 00:00:00' returns the Unix timestamp of last midnight:
$ date -d'today 00:00:00'
Thu Sep 3 00:00:00 CEST 2015
$ date -d 'today 00:00:00' "+%s"
1441231200
You can probably pipe to an awk doing something like:
... | awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>=midnight) ? "true" : "false"}1'
That is, use the ternary operator to check the value of $2 and replace it with true or false depending on the result (>= rather than >, so that a timestamp of exactly midnight counts as today):
awk -v midnight="$(date ...)" '{$2= ($2>=midnight) ? "true" : "false"}1'
Test
$ cat a
hello 1441231201
bye 23
$ awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>=midnight) ? "true" : "false"}1' a
hello true
bye false
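Since the goal is to keep only today's lines, the tag-then-grep step can probably be skipped by letting awk do the filtering itself. A minimal sketch, assuming GNU date and that the timestamp is always the second whitespace-separated field; today_only is a hypothetical helper name, not something from the question:

```shell
# Hypothetical helper: pass through only the lines whose second field
# is an epoch timestamp on or after last midnight (local time).
today_only() {
    awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '$2 >= midnight'
}

# Usage sketch (bunch | of | commands stands in for the real pipeline):
#   bunch | of | commands | today_only
```

An awk pattern with no action prints the matching line unchanged, so the original timestamp is preserved in the output.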

Related

Adding months using shell script

Currently I have a below record in a file.
ABC,XYZ,123,Sep-2018
Looking for a command in linux which will add months and give the output. For example If I want to add 3 months. Expected output is:
ABC,XYZ,123,Dec-2018
Well,
date -d "1-$(echo "ABC,XYZ,123,Sep-2018" | awk -F "," '{ print $4 }')+3 months" "+%b-%Y"
This shows how to get it working. Just replace the echo with a shell variable as you loop through the dates.
Basically, you use awk to grab just the date portion and add a 1- to the front to turn it into a real date; the date command then does the math and prints just the month abbreviation and year.
The line above gives just the date portion. The first part can be found using:
stub=`echo "ABC,XYZ,123,Dec-2018" | awk -F "," '{ printf("%s,%s,%s,",$1,$2,$3) }'`
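Putting the two pieces together, the loop could look roughly like this. It is a sketch under assumptions: GNU date, exactly four comma-separated fields per record, and add_months as a hypothetical function name. The read builtin splits the record, so the separate awk calls are not needed here.

```shell
# Hypothetical helper: add N months to the Mon-YYYY value in field 4.
# Usage: add_months 3 < file
add_months() {
    local n=$1 a b c when
    while IFS=, read -r a b c when; do
        # "1-Sep-2018 +3 months" is a date string GNU date can do arithmetic on;
        # LC_ALL=C keeps the month abbreviations in English.
        printf '%s,%s,%s,%s\n' "$a" "$b" "$c" \
            "$(LC_ALL=C date -d "1-$when +$n months" '+%b-%Y')"
    done
}
```

This starts one date process per line, which is fine for small files but slow for large ones.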
You can use the external date command or (g)awk's date/time functions to do it, but you have to prepare the string for parsing first. Here is another way to do the job:
First prepare an index file, we name it month.txt:
Jan
Feb
......
...
Nov
Dec
Then run this:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]; if(i==12){i=1;++$5}else i++
$4=m[i]"-"$5;NF--}7' month.txt file
With this example file:
ABC,XYZ,123,Jan-2018
ABC,XYZ,123,Nov-2018
ABC,XYZ,123,Dec-2018
You will get:
ABC,XYZ,123,Feb-2018
ABC,XYZ,123,Dec-2018
ABC,XYZ,123,Jan-2019
update
Oh, I didn't notice that you want to add 3 months. Here is the updated code for that:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]+3; if(i>12){i=i-12;++$5}
$4=m[i]"-"$5;NF--}7' month.txt file
Now with the same input, you get:
ABC,XYZ,123,Apr-2018
ABC,XYZ,123,Feb-2019
ABC,XYZ,123,Mar-2019
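A variant of the same idea that avoids the separate month.txt file and takes the number of months as a parameter might look like this. It is only a sketch: add_months_awk is a hypothetical name, and it assumes the date is always field 4 in Mon-YYYY form. The month table is built with split, and the date is turned into a running month count so that offsets larger than 12 also roll the year over correctly.

```shell
# Hypothetical helper: add N months using only awk, no index file.
# Usage: add_months_awk 3 < file
add_months_awk() {
    awk -F, -v OFS=, -v n="$1" '
    BEGIN {
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
        for (i = 1; i <= 12; i++) num[name[i]] = i
    }
    {
        split($4, d, "-")                   # d[1] = month name, d[2] = year
        t = d[2] * 12 + num[d[1]] - 1 + n   # months since year 0, zero-based
        $4 = name[t % 12 + 1] "-" int(t / 12)
        print
    }'
}
```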

bash unit tests with dynamic content

I had to implement some new features in a very old awk script and now want to add some unit tests to check whether my changes break things. I use diff to check whether the script output differs from the expected output:
awk -f mygenerator.awk test.1.gen | diff - test.1.out -q
if [ $? -ne 0 ]; then
echo "test failed"
fi
But now I have some files that generate dynamic content, such as a timestamp of the generation date, which causes diff to fail because the timestamp will obviously be different.
My first thought was to remove the corresponding lines with grep and compare the two "clean" files, then check with egrep that the removed line is a timestamp.
Is there a better way to do this? It should all be done with common Unix tools in a bash script, for compatibility reasons.
You could use sed with regular expressions.
If your output is like Fri Feb 21 22:53:54 UTC 2014 from the date command, use:
regex_timestamp="s/([A-Z]{1}[a-z]{2} [A-Z]{1}[a-z]{2} [0-9]{2} [0-9]{2}\:[0-9]{2}\:[0-9]{2} [A-Z]{3} [0-9]{4})//g";
awk -f mygenerator.awk test.1.gen | diff <(sed -r "$regex_timestamp" -) <(sed -r "$regex_timestamp" test.1.out) -q
If you're trying to filter a unix timestamp, simply use this as regex:
s/([0-9]{10})//g
Please note that the latter replaces any group of numbers the same size as a unix timestamp. What format is your timestamp?
I usually use sed to replace the timestamp with XXXXXX, so I can still compare the other information on the same line.
date | \
sed 's/\(Sun\|Mon\|Tue\|Wed\|Thu\|Fri\|Sat\) \(Jan\|Feb\|Mar\|Apr\|May\|Jun\|Jul\|Aug\|Sep\|Oct\|Nov\|Dec\) \?[0-9]\+ [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [A-Z]\+ [0-9]\{4\}/XXXXXX/'
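The placeholder approach can be wrapped in a small filter that both sides of the diff pass through. A sketch, assuming GNU sed (for -E and \b) and timestamps in either the default date format or 10-digit epoch form; normalize_ts and the placeholder strings are made up for illustration:

```shell
# Hypothetical filter: replace volatile timestamps with fixed placeholders
# so the rest of each line can still be compared.
normalize_ts() {
    sed -E \
        -e 's/(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} [A-Z]+ [0-9]{4}/<TIMESTAMP>/g' \
        -e 's/\b[0-9]{10}\b/<EPOCH>/g'
}

# Usage sketch in a bash test script, with the file names from the question:
#   diff -q <(awk -f mygenerator.awk test.1.gen | normalize_ts) \
#           <(normalize_ts < test.1.out) || echo "test failed"
```

The \b anchors keep the epoch rule from firing inside longer digit runs, but any standalone 10-digit number is still treated as a timestamp.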

sed: mass converting epochs amongst random other text

Centos / Linux
Bash
I have a log file containing lots of text with epoch numbers all over the place. I want to replace all epochs, wherever they are, with a readable date/time.
I've been wanting to do this via sed, as that seems the tool for the job. I can't seem to get the replacement part of sed to actually pass the matched value (the epoch) to the command for conversion.
Sample of what I'm working with...
echo "Some stuff 1346474454 And not working" \
| sed 's/1[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/'"`bpdbm -ctime \&`"'/g'
Some stuff 0 = Thu Jan 1 01:00:00 1970 And not working
The bpdbm part converts a supplied epoch into a readable date, like this:
bpdbm -ctime 1346474454
1346474454 = Sat Sep 1 05:40:54 2012
So how do I get the "found" item passed into a command? I don't seem to be able to get it to work.
Any help would be lovely. If there is another way, that would be cool, but I suspect sed will be quickest.
Thanks for your time!
"that seems the tool for the job"
No, it is not. sed can only use & inside its own replacement text; there is no way to pass it as an argument to a command. You need something more powerful, e.g. Perl:
perl -pe 'if ( ($t) = /(1[0-9]+)/ ) { s/$t/localtime($t)/e }'
You can do it with GNU sed, the input:
infile
Some stuff 1346474454 And not working
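GNU sed supports the e flag on the s command, which executes the resulting pattern space as a shell command and replaces it with that command's output. One way to take advantage of this with bpdbm (-r enables the extended regex syntax used here):
sed -r 's/(.*)(1[0-9]{9})(.*)/echo \1 $(bpdbm -ctime \2) \3/e' infile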
Or with coreutils date:
sed -r 's/(.*)(1[0-9]{9})(.*)/echo \1 $(date -d @\2) \3/e' infile
output with date
Some stuff Sat Sep 1 06:40:54 CEST 2012 And not working
To get the same output as with bpdbm:
sed -r 's/(.*)(1[0-9]{9})(.*)/echo "\1$(date -d @\2 +\"%a %b %_d %T %Y\")\3"/e' infile
output
Some stuff Sat Sep 1 06:40:54 2012 And not working
Note, this only replaces the last epoch found on a line. Re-run if there are more.
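If every epoch on a line should be converted in a single pass, and without handing log text to a shell via the e flag, a pure bash loop is one option. This is a sketch, assuming bash 4.2 or later (for the %(...)T format of the printf builtin); epochs_to_dates is a hypothetical name, and it uses bash's own time formatting instead of bpdbm:

```shell
# Hypothetical filter: replace every 10-digit number starting with 1
# by a human-readable local time, left to right.
epochs_to_dates() {
    local line out epoch when
    while IFS= read -r line; do
        out=
        while [[ $line =~ 1[0-9]{9} ]]; do
            epoch=${BASH_REMATCH[0]}
            printf -v when '%(%a %b %e %T %Y)T' "$epoch"
            out+=${line%%"$epoch"*}$when   # text before the match, then the date
            line=${line#*"$epoch"}         # continue after the match
        done
        printf '%s\n' "$out$line"
    done
}

# Usage sketch: epochs_to_dates < logfile
```

Like the sed versions, this treats any run of ten digits beginning with 1 as an epoch, including one embedded in a longer number.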

grep all lines for a specific date in log-files

I need to grep all lines for "yesterday" in /var/log/messages.
When I use following snippet, I get zero results due to the fact that the dates are in the format "Jun 9". (It doesn't show here, but in the log file, the days of the month are padded with an extra space when smaller than 10).
cat /var/log/messages | grep `date --date="yesterday" +%b\ %e`
When I enter
$ date --date="yesterday" +%b\ %e
on the commandline, it returns yesterday's date, with padding.
But when I combine it with grep and the backticks, the extra padding gets suppressed, which gives me zero results.
What do I need to change so that the "date" gets evaluated with extra padding?
You should be able to fix this by putting quotes around the backticks:
cat /var/log/messages | grep "`date --date="yesterday" +%b\ %e`"
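The quotes matter because an unquoted command substitution undergoes word splitting: the padded date is split into two words, so grep receives the month as its pattern and the day as a file name. A sketch of the same fix with $(...) instead of backticks, assuming GNU date; lines_for_yesterday is a hypothetical helper name:

```shell
# Hypothetical helper: print lines that start with yesterday's syslog-style
# date (e.g. "Jun  9 "), reading from the given files or from stdin.
lines_for_yesterday() {
    local stamp
    # LC_ALL=C keeps the month abbreviation in English; %e pads with a space.
    stamp=$(LC_ALL=C date --date='yesterday' '+%b %e')
    # Quoting "$stamp" preserves the padding inside the pattern.
    grep -- "^$stamp " "$@"
}

# Usage sketch: lines_for_yesterday /var/log/messages
```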

How to convert time format from "2010-10-08 00:00:01" to "1286467201" in awk

In awk, is there a way to convert the time format from "2010-10-08 00:00:01" to 1286467201,
as the date command does?
$ date +%s -d '2010-10-08 00:00:01'
1286467201
GNU awk has a mktime function that can do the job. However, it's crucial to be aware of timezones. The string "2010-10-08 00:00:01" does not contain enough information to define a specific time. If you assume it is in UTC you can do:
$ echo 2010-10-08 00:00:01 | \
TZ=UTC gawk '{ tstr=$1" "$2; gsub(/[\-:]/, " ", tstr); print mktime(tstr); }'
1286496001
If you don't specify the TZ variable you end up with the server's time zone (which should be UTC anyway, but a lot of folks use local time on servers, so it's not a safe assumption).
You can get UTC output from your date command by altering it slightly:
$ date +%s -u -d '2010-10-08 00:00:01'
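Where gawk is not available, the same UTC conversion can be applied to whole lines with date alone. A sketch, assuming GNU date and that each line begins with a date field and a time field; to_epoch_utc is a hypothetical name:

```shell
# Hypothetical filter: replace a leading "YYYY-MM-DD HH:MM:SS" with its
# Unix timestamp, interpreting the time as UTC.
to_epoch_utc() {
    local d t rest
    while read -r d t rest; do
        printf '%s%s\n' "$(date -u -d "$d $t" '+%s')" "${rest:+ $rest}"
    done
}

# Usage sketch: to_epoch_utc < logfile
```

This starts one date process per line, so the mktime approach is likely to be much faster on large inputs.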
