I am looking to write a shell script that will compare the time between two date-time stamps in the format:
2013-12-10 13:25:30.123
2013-12-10 13:25:31.123
I can split the date and time if required (as the comparison should never be more than one second - I am looking at a reporting rate), so I can format the time as 13:25:30.123 / 13:25:31.123.
To just find the newer (or older) of the two timestamps, you could just use a string comparison operator:
time1="2013-12-10 13:25:30.123"
time2="2013-12-10 13:25:31.123"
# Inside [ ], an unescaped > is a redirection; [[ ]] compares the strings lexicographically.
if [[ "$time1" > "$time2" ]]; then
    echo "the 1st timestamp is newer"
else
    echo "the 2nd timestamp is newer"
fi
And, to find the time difference (tested):
ns1=$(date --date "$time1" +%s%N)
ns2=$(date --date "$time2" +%s%N)
echo "the difference in seconds is:" `bc <<< "scale=3; ($ns2 - $ns1) / 1000000000"` "seconds"
Which, in your case prints
the difference in seconds is: 1.000 seconds
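The same difference can also be sketched with integer arithmetic only, which avoids the call to bc. This assumes GNU date (for --date and %N) and a shell with 64-bit arithmetic such as bash; diff_ms is a hypothetical helper name, and -u is added here so that both stamps are read as UTC.

```shell
#!/usr/bin/env bash
# diff_ms (hypothetical helper): print the difference between two
# timestamps in whole milliseconds (second minus first).
# Assumes GNU date, which understands --date and the %N (nanoseconds) format.
diff_ms() {
    local ns1 ns2
    ns1=$(date -u --date "$1" +%s%N) || return 1
    ns2=$(date -u --date "$2" +%s%N) || return 1
    # Nanosecond epochs fit in bash's 64-bit integers until the year 2262.
    echo $(( (ns2 - ns1) / 1000000 ))
}

diff_ms "2013-12-10 13:25:30.123" "2013-12-10 13:25:31.123"   # prints 1000
```

For a reporting-rate check, the result can be compared directly, for example (( $(diff_ms "$time1" "$time2") <= 1000 )).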
Convert them into timestamps before comparing:
if [ $(date -d "2013-12-10 13:25:31.123" +%s) -gt $(date -d "2013-12-10 13:25:30.123" +%s) ]; then
echo "blub";
fi
With Perl using the included Time::Piece library:
perl -MTime::Piece -nE '
BEGIN {
$, = "\t";
sub to_seconds {
my ($dt, $frac) = (shift =~ /(.*)(\.\d*)$/);
return(Time::Piece->strptime($dt, "%Y-%m-%d %T")->epoch + $frac);
}
}
if ($. > 1) {
$a = to_seconds($_);
$b = to_seconds($prev);
say $a, $b, $a-$b
}
$prev = $_
'<<END
2013-12-10 13:25:30.123
2013-12-10 13:25:31.123
2013-12-10 13:25:42.042
END
1386681931.123 1386681930.123 1
1386681942.042 1386681931.123 10.9190001487732
I'm creating a condition that checks the actual date in epoch format and compares another string in epoch format; if the strings are more than 10 days old, do something...
I tried something like this:
#!/bin/bash
timeago='10 days ago'
actual_date=$(date --date "now" +'%s')
last_seen_filter=$(date --date "$timeago" +'%s')
echo "INFO: actual_date=$actual_date, last_seen_filter=$last_seen_filter" >&2
if [ "$actual_date" -lt "$last_seen_filter" ]; then
echo "something"
else
echo "do something"
fi
or
#!/bin/bash
cutoff=$(date -d '10 days ago' +%s)
key="1624684050 1624688000"
while read -r "$key"
do
age=$(date -d "now" +%s)
if (($age < $cutoff))
then
printf "Warning! key %s is older than 10 days\n" "$key" >&2
fi
done < input
That's not enough for what I need. I have epoch dates in a file called converted_data, and I need to include these values in the if comparison.
1624684050
1634015250
1614661650
1622005650
It's not clear what you're trying to do but maybe this with GNU awk will get you started:
$ awk '
BEGIN { today=strftime("%F") }
{
secs = $0
days = 0
date = strftime("%F",secs)
while ( strftime("%F",secs+=(24*60*60)) < today ) {
++days
}
print $0":", date, "->", today, "=", days
}
' file
1624684050: 2021-06-26 -> 2021-06-13 = 0
1634015250: 2021-10-12 -> 2021-06-13 = 0
1614661650: 2021-03-01 -> 2021-06-13 = 102
1622005650: 2021-05-26 -> 2021-06-13 = 17
The 0s for future dates are because you only asked about 10-days past dates so I don't care to adapt for both past and future deltas. I also didn't put much thought into it so check the logic and the math!
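If only a yes/no check against a cutoff is needed, the comparison from the question can be sketched in plain bash by reading the epoch values from the file one per line. This is a minimal sketch, not a drop-in solution: older_than_days is a hypothetical helper name, and the optional third argument (the reference "now" in epoch seconds) is an assumption added to make the function testable.

```shell
#!/usr/bin/env bash
# older_than_days FILE DAYS [NOW] (hypothetical helper):
# print every epoch value in FILE that is more than DAYS days older than NOW.
# NOW defaults to the current time in seconds since the epoch.
older_than_days() {
    local file=$1 days=$2 now=${3:-$(date +%s)}
    local cutoff=$(( now - days * 86400 ))
    local epoch
    while read -r epoch; do
        # Skip blank or non-numeric lines instead of failing on them.
        [[ $epoch =~ ^[0-9]+$ ]] || continue
        if (( epoch < cutoff )); then
            printf '%s\n' "$epoch"
        fi
    done < "$file"
}
```

It would be called as older_than_days converted_data 10, and the printed values can then drive whatever "do something" means in your case.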
I have a file with more than 10K lines of records.
Each line contains two date+time values. Below is an example:
"aaa bbb ccc 170915 200801 12;ddd e f; g; hh; 171020 122030 10; ii jj kk;"
I want to select the lines in which the two dates are less than 30 days apart.
Below is my source code:
#!/bin/bash
filename="$1"
echo $filename
touch filterfile
totalline=`wc -l $filename | awk '{print $1}'`
i=0
j=0
echo $totalline lines
while read -r line
do
i=$[i+1]
if [ $i -gt $[j+9] ]; then
j=$i
echo $i
fi
shortline=`echo $line | sed 's/.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*/\1 \2/'`
date1=`echo $shortline | awk '{print $1}'`
date2=`echo $shortline | awk '{print $2}'`
if [ $date1 -gt 700000 ]
then
continue
fi
d1=`date -d $date1 +%s`
d2=`date -d $date2 +%s`
diffday=$[(d2-d1)/(24*3600)]
#diffdays=`date -d $date2 +%s` - `date -d $date1 +%s`)/(24*3600)
if [ $diffday -lt 30 ]
then
echo $line >> filterfile
fi
done < "$filename"
I am running it in Cygwin. It took about 10 seconds to handle 10 lines. I use echo $i to show the progress.
Is it because I am doing something wrong in my script?
This answer does not address your question directly but gives an alternative to your shell script. The answer to your question is given in Sundeep's comment:
Why is using a shell loop to process text considered bad practice?
Furthermore, you should be aware that every time you call sed, awk, date, and so on, you ask the system to execute a binary, which has to be loaded into memory and started as a new process. Doing this several times per line inside a loop is very inefficient.
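To illustrate the point about external processes, the date extraction that the original script does with one sed and two awk calls per line can be sketched with bash's built-in regular-expression matching, which starts no process at all. extract_dates is a hypothetical helper name, and the pattern assumes each record contains exactly two "yymmdd HHMMSS" pairs, as in the sample line.

```shell
#!/usr/bin/env bash
# extract_dates (hypothetical helper): print the two yymmdd dates found in a
# record, using bash's built-in regex matching instead of sed and awk.
extract_dates() {
    local re='([0-9]{6}) [0-9]{6}.*([0-9]{6}) [0-9]{6}'
    [[ $1 =~ $re ]] || return 1
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}

extract_dates "aaa bbb ccc 170915 200801 12;ddd e f; g; hh; 171020 122030 10; ii jj kk;"
# prints: 170915 171020
```

The two calls to date per line remain, so this only removes part of the overhead; the awk approach avoids the per-line processes entirely.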
alternative solution
awk programs are commonly used to process log files containing timestamp information, indicating when a particular log record was written. gawk extended the awk standard with time-handling functions. The one you are interested in is :
mktime(datespec [, utc-flag ]) Turn datespec into a timestamp in the
same form as is returned by systime(). It is similar to the function
of the same name in ISO C. The argument, datespec, is a string of the
form "YYYY MM DD HH MM SS [DST]". The string consists of six or seven
numbers representing, respectively, the full year including century,
the month from 1 to 12, the day of the month from 1 to 31, the hour of
the day from 0 to 23, the minute from 0 to 59, the second from 0 to
60, and an optional daylight-savings flag.
The values of these numbers need not be within the ranges specified;
for example, an hour of -1 means 1 hour before midnight. The
origin-zero Gregorian calendar is assumed, with year 0 preceding year
1 and year -1 preceding year 0. If utc-flag is present and is either
nonzero or non-null, the time is assumed to be in the UTC time zone;
otherwise, the time is assumed to be in the local time zone. If the
DST daylight-savings flag is positive, the time is assumed to be
daylight savings time; if zero, the time is assumed to be standard
time; and if negative (the default), mktime() attempts to determine
whether daylight savings time is in effect for the specified time.
If datespec does not contain enough elements or if the resulting time
is out of range, mktime() returns -1.
As your dates have the form yymmdd HHMMSS, we need a parser function, convertTime. Note that this function is passed times of the form yymmddHHMMSS. With space-delimited fields, your times are in fields $4 $5 and $11 $12. As mktime converts a time to seconds since 1970-01-01, all we need to do is check whether the time difference is smaller than 30*24*3600 seconds.
awk 'function convertTime(t,    s) {
    # t has the form yymmddHHMMSS; build "YYYY MM DD HH MM SS" for mktime()
    s = "20" substr(t,1,2) " " substr(t,3,2) " " substr(t,5,2) " "
    s = s substr(t,7,2) " " substr(t,9,2) " " substr(t,11,2)
    return mktime(s)
}
{ t1 = convertTime($4 $5); t2 = convertTime($11 $12) }
(t2 - t1 < 30*3600*24) { print }' file
If you are not interested in the real time difference (your sed line removes the time of day), then you can adapt it to:
awk 'function convertTime(t,    s) {
    # only the date part yymmdd is used; the time of day is fixed at midnight
    s = "20" substr(t,1,2) " " substr(t,3,2) " " substr(t,5,2) " "
    s = s "00 00 00"
    return mktime(s)
}
{ t1 = convertTime($4); t2 = convertTime($11) }
(t2 - t1 < 30*3600*24) { print }' file
If the dates are not always in the same fields, you can use match to find them:
awk 'function convertTime(t,    s) {
    s = "20" substr(t,1,2) " " substr(t,3,2) " " substr(t,5,2) " "
    s = s substr(t,7,2) " " substr(t,9,2) " " substr(t,11,2)
    return mktime(s)
}
{
    # first "yymmdd HHMMSS" pair; drop the separating space before converting
    match($0, /[0-9]{6} [0-9]{6}/)
    t1 = convertTime(substr($0,RSTART,6) substr($0,RSTART+7,6))
    # search for the second pair in the remainder of the line
    a = substr($0, RSTART+RLENGTH)
    match(a, /[0-9]{6} [0-9]{6}/)
    t2 = convertTime(substr(a,RSTART,6) substr(a,RSTART+7,6))
}
(t2 - t1 < 30*3600*24) { print }' file
With some modifications, often without speed in mind, I can reduce the processing time by 50% - which is a lot:
#!/bin/bash
filename="$1"
echo "$filename"
# touch filterfile
totalline=$(wc -l < "$filename")
i=0
j=0
echo "$totalline" lines
while read -r line
do
i=$((i+1))
if (( i > ((j+9)) )); then
j=$i
echo $i
fi
shortline=($(echo "$line" | sed 's/.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*/\1 \2/'))
date1=${shortline[0]}
date2=${shortline[1]}
if (( date1 > 700000 ))
then
continue
fi
d1=$(date -d "$date1" +%s)
d2=$(date -d "$date2" +%s)
diffday=$(((d2-d1)/(24*3600)))
# diffdays=$(date -d $date2 +%s) - $(date -d $date1 +%s))/(24*3600)
if (( diffday < 30 ))
then
echo "$line" >> filterfile
fi
done < "$filename"
Some remarks:
# touch filterfile
The later CMD >> filterfile appends to this file and creates it if it doesn't exist, so the touch is not needed.
totalline=$(wc -l < "$filename")
You don't need awk here. wc does not print a filename when it reads from standard input, because it never sees one.
Capturing the output in an array:
shortline=($(echo "$line" | sed 's/.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*\([0-9]\{6\}\)[ ][0-9]\{6\}.*/\1 \2/'))
date1=${shortline[0]}
date2=${shortline[1]}
allows us array access and saves another call to awk.
On my machine, your code took about 42s for 2880 lines (on your machine 2880 s?) and about 19s for the same file with my code.
So I suspect, unless you are running it on an i486 machine, that Cygwin might be the slowdown. It's a Linux-like environment for Windows, isn't it? Well, I'm on a native Linux system. Maybe try the gnu-utils for Windows; the last time I looked for them, they were advertised as gnu-utils x32 or something, and maybe there is a 64-bit version available by now.
The next thing I would look at is the date calculation, which might be a slowdown too.
2880 lines isn't that much, so I don't suspect that my SSD plays a huge role here.
DATE1=201609
DATE2=201508
How can I calculate the difference between these two dates and get the result as a number of months?
For example:
201609 - 201508 = 13 months
Calculating a time difference is generally a complicated task, even for a single calendar type (and there are many). Many programming languages have built-in support for date and time manipulation, including calculation of time differences. Unfortunately, the most useful tool available in the popular shells, the date command, lacks this feature.
Therefore, we should either write a script in another language, or make some assumptions, such as the number of days in a year.
For example, in Perl the task is done with just four lines of code:
perl -e "$(cat <<'PerlScript'
use Time::Piece;
my $t1 = Time::Piece->strptime($ARGV[0], '%Y%m');
my $t2 = Time::Piece->strptime($ARGV[1], '%Y%m');
printf "%d months\n", ($t1 - $t2)->months;
PerlScript
)" 201609 201508
However, the difference of Time::Piece objects is an instance of Time::Seconds which actually assumes that
there are 24 hours in a day, 7 days in a week, 365.24225 days in a
year and 12 months in a year.
which indirectly confirms my words regarding the complexity of the task.
Then let's make the same assumption, and write a simple shell script:
DATE1=201609
DATE2=201508
printf '(%d - %d) / 2629744.2\n' \
$(date -d ${DATE1}01 +%s) \
$(date -d ${DATE2}01 +%s) | bc
where 2629744.2 is the number of seconds in month, i.e. 3600 * 24 * (365.24225 / 12).
Note, most of the shells do not support floating point arithmetic. That's why we need to invoke external tools such as bc.
The script outputs 13. It is portable across shells: you may run it in the standard shell, Bash, Korn shell, or Zsh, for instance. It does rely on GNU date for the -d option. If you want to put the result into a variable, just wrap the printf command in $( ... ):
months=$(printf '(%d - %d) / 2629744.2\n' \
$(date -d ${DATE1}01 +%s) \
$(date -d ${DATE2}01 +%s) | bc)
printf '%d - %d = %d months\n' $DATE1 $DATE2 $months
You can try below solution -
[vipin@hadoop ~]$ cat time.awk
{
diff1=((substr($1,1,4)) - (substr($2,1,4)))
diff2=((substr($1,5,2)) - (substr($2,5,2)))
if(diff1 > 0)
{
if(diff2 > 0)
{
print (diff1+diff2)
}
else if (diff2 == 0)
{
print (diff1+diff2)
}
else
{
diff2=(12+((substr($1,5,2)-(substr($2,5,2)))))
diff1=(diff1-1)
print (diff1+diff2)
}
}
else if(diff1 == 0)
{
if(diff2 > 0)
{
print (diff1+diff2)
}
else if (diff2 == 0)
{
print (diff1+diff2)
}
else
{
print "Argument 2 is greater than 1"
}
}
else
{
print "Argument 2 is greater than 1"
}
}
Test Cases -
[vipin@hadoop ~]$ cat time1.txt && awk -f time.awk time1.txt
201611 201601
10
[vipin@hadoop ~]$ cat time2.txt && awk -f time.awk time2.txt
201601 201611
Argument 2 is greater than 1
[vipin@hadoop ~]$ cat time3.txt && awk -f time.awk time3.txt
201511 201601
Argument 2 is greater than 1
[vipin@hadoop ~]$ cat time4.txt && awk -f time.awk time4.txt
201611 201611
0
Let's try something quick and dirty that won't work for all months in history, but it's probably good enough: convert YYYYMM to the number of months since year 0:
$ ym2m() {
if [[ $1 =~ ^([0-9]{4})([0-9]{2})$ ]]; then
echo $(( 10#${BASH_REMATCH[1]} * 12 + 10#${BASH_REMATCH[2]} ))
else
return 1
fi
}
$ ym2m 201609
24201
$ ym2m 201508
24188
$ echo $(( $(ym2m 201609) - $(ym2m 201508) ))
13
Notes:
requires bash 3.0 or later (for [[ ... =~ ... ]] and BASH_REMATCH)
ym2m => "year-month to months"
uses 10#number in the arithmetic expression to ensure "08" and "09" are not treated as invalid octal numbers.
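Building on the same idea, the two conversions and the subtraction can be wrapped in one function that also rejects impossible months. This is only a sketch: months_between is a hypothetical name, and a negative result simply means that the second argument is the later month.

```shell
#!/usr/bin/env bash
# months_between YYYYMM1 YYYYMM2 (hypothetical helper):
# print the number of months from the second argument to the first.
months_between() {
    local re='^([0-9]{4})(0[1-9]|1[0-2])$' m1 m2
    [[ $1 =~ $re ]] || return 1
    # 10# forces base 10 so that "08" and "09" are not read as octal.
    m1=$(( 10#${BASH_REMATCH[1]} * 12 + 10#${BASH_REMATCH[2]} ))
    [[ $2 =~ $re ]] || return 1
    m2=$(( 10#${BASH_REMATCH[1]} * 12 + 10#${BASH_REMATCH[2]} ))
    echo $(( m1 - m2 ))
}

months_between 201609 201508   # prints 13
```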
I need to know the first monday of the current month using Cygwin bash.
One Liner:
d=$(date -d "today 1300" '+%Y%m01'); w=$(date -d $d '+%w'); i=$(( (8 - $w) % 7)); answer=$(( $d + $i ));
The result is stored in $answer. It uses working variables $d, $w, and $i.
Proof (assuming you just ran the one liner above):
echo $answer; echo $(date -d $answer '+%w')
Expected Result: the first Monday of the current month in YYYYMMDD. On the next line, a 1 for the day of the week.
Expanded Proof (checks the first Monday of each of the next 100 months):
for x in {1..100}; do d=$(date -d "+$x months 1300" '+%Y%m01'); w=$(date -d $d '+%w'); i=$(( (8 - $w) % 7)); answer=$(( $d + $i )); echo $answer; echo $(date -d $answer '+%w'); done | egrep -B1 '^[^1]$'
Expected Result: NOTHING
(If there are results, something is broken)
Breaking it down
The first statement gets the first day of the current month, and stores that in $d, formatted as YYYYMMDD.
The second statement gets the day of the week number of the date $d, and stores that in $w.
The third statement computes the increment of days to add and stores it in $i. Zero is perfectly valid here, because...
The last statement computes the sum of the date $d (as an integer) and the increment $i (as an integer). This works because the domain of the $i is 0 to 6, and we will always start at the first day of the month. This can quickly be converted back to a date variable (see Proof for example of this).
This has been tested on BASH v4.1 (CentOS 6), v4.4 (Ubuntu), and v5 (Archlinux)
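The four statements can also be sketched as a function that takes the month as a parameter, which makes the arithmetic easier to check against known dates. first_monday is a hypothetical name; like the one-liner, it assumes GNU date for parsing YYYYMMDD.

```shell
#!/usr/bin/env bash
# first_monday YYYYMM (hypothetical helper): print the first Monday of the
# given month as YYYYMMDD, using the same arithmetic as the one-liner above.
first_monday() {
    local first w
    [[ $1 =~ ^[0-9]{6}$ ]] || return 1
    first=${1}01
    # %w is the day of the week, 0 (Sunday) to 6 (Saturday).
    w=$(date -d "$first" +%w) || return 1
    # (8 - w) % 7 is 0 when the 1st is a Monday and at most 6 otherwise.
    echo $(( 10#$first + (8 - w) % 7 ))
}

first_monday "$(date +%Y%m)"   # first Monday of the current month
```

For example, 1 June 2021 was a Tuesday, so first_monday 202106 gives 20210607.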
A one-liner (I hope it's correct):
d=$(date -d "$(date +%Y%m01)" +%u); date -d "$(date +%Y%m)0$(( (8 - d) % 7 + 1 ))"
The variable d contains the day of the week (1..7, where 1 is Monday) of the first day of the month.
Then I print the current year and month, replacing the day with $(( (8 - d) % 7 + 1 )), which always falls in the range 1..7.
This should do it, but I have no Cygwin here to test:
#!/bin/bash
# get current year and month:
year=$( date +"%Y" )
month=$( date +"%m" )
# for the first 7 days in current month :
for i in {1..7}
do
# get day of week (dow) for that date:
dow=$( date -d "${year}-${month}-${i}" +"%u" )
# if dow is 1 (Monday):
if [ "$dow" -eq 1 ]
then
# print date of that Monday in default formatting:
date -d "${year}-${month}-${i}"
break
fi
done
See manpage date(1) for more information.
What is the most elegant way to calculate the previous business day in a ksh shell script?
What I have so far is:
#!/bin/ksh
set -x
DAY_DIFF=1
case `date '+%a'` in
"Sun")
DAY_DIFF=2
;;
"Mon")
DAY_DIFF=3
;;
esac
PREV_DT=`perl -e '($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time()-${DAY_DIFF}*24*60*60);printf "%4d%02d%02d",$year+1900,$mon+1,$mday;'`
echo $PREV_DT
How do I make Perl see the value of the ${DAY_DIFF} variable rather than the literal string?
#!/bin/ksh
# GNU date is a veritable Swiss Army Knife...
((D=$(date +%w)+2))
if [ $D -gt 3 ]; then D=1; fi
PREV_DT=$(date -d "-$D days" +%F)
Here is a solution that doesn't use Perl. It works both with ksh and sh.
#!/bin/ksh
diff=-1
[ "`date +%u`" = 1 ] && diff=-3
seconds=$((`date +%s` + $diff * 24 * 3600))
format=+%Y-%m-%d
if date --help 2>/dev/null | grep -q -- -d ; then
# GNU date (e.g., Linux)
date -d "1970-01-01 00:00 UTC + $seconds seconds" $format
else
# For BSD date (e.g., Mac OS X)
date -r $seconds $format
fi
Well, if running Perl counts as part of the script, then develop the answer in Perl. The next question is - what defines a business day? Are you a shop/store that is open on Sunday? Saturday? Or a 9-5 Monday to Friday business? What about holidays?
Assuming you're thinking Monday to Friday and holidays are temporarily immaterial, then you can use an algorithm in Perl that notes that wday will be 0 on Sunday through 6 on Saturday, and therefore if wday is 1, you need to subtract 3 * 86400 from time(); if wday is 0, you need to subtract 2 * 86400; and if wday is 6, you need to subtract 1 * 86400. That's what you've got in the Korn shell stuff - just do it in the Perl instead:
#!/bin/perl -w
use strict;
use POSIX;
use constant SECS_PER_DAY => 24 * 60 * 60;
my(@days) = (2, 3, 1, 1, 1, 1, 1);
my($now) = time;
my($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime($now);
print strftime("%Y-%m-%d\n", localtime($now - $days[$wday] * SECS_PER_DAY));
This does assume you have the POSIX module; if not, then you'll need to do roughly the same printf() as you used. I also use ISO 8601 format for dates by preference (also used by XSD and SQL) - hence the illustrated format.
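For comparison, the same lookup-table idea can be sketched in bash with GNU date. prev_business_day is a hypothetical name, and the optional date argument (YYYY-MM-DD) is an assumption added so that the function can be checked against known weekdays. The calculation is done in UTC to keep daylight-saving changes out of the arithmetic.

```shell
#!/usr/bin/env bash
# prev_business_day [YYYY-MM-DD] (hypothetical helper): print the previous
# Monday-to-Friday business day; holidays are ignored.
# Assumes GNU date (-d with a date string and with @epoch).
prev_business_day() {
    local day=${1:-$(date +%F)} ts wday
    # Days to step back, indexed by %w (0 = Sunday ... 6 = Saturday).
    local -a back=(2 3 1 1 1 1 1)
    ts=$(date -u -d "$day" +%s) || return 1
    wday=$(date -u -d "$day" +%w) || return 1
    date -u -d "@$(( ts - back[wday] * 86400 ))" +%F
}

prev_business_day 2021-06-14   # a Monday; prints 2021-06-11
```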
This should work for Solaris and Linux. What really complicates things on Unix is that you cannot use the same command-line arguments on all Unix derivatives.
On Linux you can use date -d '24 hours ago' to get the previous day;
on Solaris it's TZ=CET+24 date. I guess other Unixes work the same way as Solaris does.
#!/usr/bin/ksh
lbd=5 # last business day (1=Mon, 2=Tue ... 6=Sat, 7=Sun)
lbd_date="" # last business day date
function lbdSunOS
{
typeset back=$1
typeset tz=`date '+%Z'` # timezone
lbd_date=`TZ=${tz}+$back date '+%Y%m%d'`
}
function lbdLinux
{
typeset back=$1
lbd_date=`date -d "$back hours ago" '+%Y%m%d'`
}
function calcHoursBack
{
typeset lbd=$1
typeset dow=`date '+%u'` # day of the week
if [ $dow -ge $lbd ]
then
return $(((dow-lbd)*24))
else
return $(((dow-lbd+7)*24))
fi
}
# Main
calcHoursBack $lbd
lbd`uname -s` $?
echo $lbd_date
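The script above passes the number of hours back through the function's exit status, which only works because the value stays below 256. A more conventional pattern, sketched here, is to print the result and capture it with command substitution. days_back is a hypothetical name, and the optional second argument (today's day of the week, 1=Mon to 7=Sun) is an assumption added so the arithmetic can be checked.

```shell
#!/usr/bin/env bash
# days_back TARGET [DOW] (hypothetical helper): print how many days ago the
# most recent TARGET weekday was (1 = Monday ... 7 = Sunday).
# DOW defaults to today's day of the week as reported by date +%u.
days_back() {
    local target=$1 dow=${2:-$(date +%u)}
    # Adding 7 before taking the remainder keeps the result in 0..6.
    echo $(( (dow - target + 7) % 7 ))
}

days_back 5 1   # today is Monday, target is Friday: prints 3
```

Multiplying the result by 24 gives the number of hours that calcHoursBack returns.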