I want to get the successive differences of the dates in the sample below. How do I convert each of these dates into epoch seconds?
7/21/17 6:39:12:167 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:205 GMT
7/21/17 6:39:12:206 GMT
7/21/17 6:39:12:206 GMT
Once each line is converted into epoch seconds, I can simply run another script to compute the successive differences. Thanks.
You can convert times using the date command. Given a line like this:
7/21/17 6:39:12:167 GMT
You first need to strip the milliseconds and the timezone, keeping everything up to the seconds, to get this:
7/21/17 6:39:12
You can use cut -d: -f1-3 for that. Then convert to epoch seconds; if you're using FreeBSD or macOS:
date -ujf "%m/%d/%y %H:%M:%S" "7/21/17 6:39:12" +%s
Which gives:
1500619152
If you are using GNU date (e.g. on Linux), you can feed an entire file of dates to it. Since the input file isn't in the right format, we can do this:
date --file <(cut -d: -f1-3 infile) +%s
That will read the entire file with only a single invocation of date, which is much more efficient, but only works with GNU date.
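To get the successive differences the question actually asks for, that epoch output can be piped straight into a small awk filter. A minimal sketch, again assuming GNU date and the input in infile:
date --file <(cut -d: -f1-3 infile) +%s |
awk 'NR > 1 { print $1 - prev } { prev = $1 }'
For the sample data every difference is 0, since all six timestamps fall within the same second; the differences live in the stripped milliseconds.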
Here is one in GNU awk. It converts the timestamps to epoch time in seconds and subtracts each one from the one that follows. The mktime function used for the conversion doesn't accept fractions of a second, but the fraction is stored in a[7], and nothing stops you from adding it to the t variable before subtracting:
$ awk '
function zeropad(s) { # zeropad function
return sprintf("%02d", s)
}
{
split($0,a,"[/ :]") # split timestamp to a
for(i in a)
a[i]=zeropad(a[i]) # zeropad all components
t=mktime("20" a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6]) # YYYY MM DD HH MM SS
# add the fractions in a[7] here
if(NR>1) # might be unnecessary
print t-p # output subtracted seconds
p=t # set current time to previous
}' file
0
0
0
0
0
Since you didn't include the expected output or proper data sample, that's the best I can do for now.
EDIT:
Since your data does not make clear whether fractions of a second are presented like 0:0:0:100 or 0:0:0:1, I modified the zeropad function to left- or right-pad the given value. Now you call it like zeropad(value, count, "l"|"r"), for example zeropad(a[7],3,"r"):
function zeropad(s,c,d) { # pad s with zeros to width c, on the left ("l") or right ("r")
while (length(s) < c)
s = (d=="l" ? "0" s : s "0")
return s
}
{
split($0,a,"[/ :]") # split timestamp to a
for(i in a)
a[i]=zeropad(a[i],2,"l") # left-pad all components with 0s
t=mktime("20" a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6]) # YYYY MM DD HH MM SS
t=t+zeropad(a[7],3,"r")/1000 # right-pad fractions with 0s
if(NR>1) # might be unnecessary
print t-p # output subtracted seconds
p=t # set current time to previous
}
0.001
0
0.037
0.001
0
Depending on the platform you may see floating-point noise in the low digits; printf with proper modifiers should probably be used for the output to get sane values.
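For example, changing the print to a fixed-precision printf matches the millisecond resolution of the input (a one-line change to the script above):
printf "%.3f\n", t-p # e.g. 0.001 rather than a long floating-point tail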
Let's suppose that you have one of those dates in a variable:
$ d='7/21/17 6:39:12:167 GMT'
With GNU date, we need to remove the milliseconds part. That can be done with bash's pattern substitution:
$ echo "${d/:??? / }"
7/21/17 6:39:12 GMT
To convert that to seconds-since-epoch, we use the -d option to set the date and the %s format to request seconds-since-epoch:
$ date -d "${d/:??? / }" +%s
1500619152
Compatibility: macOS does not support GNU's -d option, but I gather it has a similar option. On many operating systems, including macOS, there is the option to install the GNU utilities.
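For reference, a macOS/FreeBSD sketch reusing the -j -f flags from the first answer, assuming the milliseconds are always three digits as in the sample:
d='7/21/17 6:39:12:167 GMT'
date -juf "%m/%d/%y %H:%M:%S" "${d/:??? GMT/}" +%s # -> 1500619152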
Related
I'm trying to get the last x minutes of logs from /var/log/maillog on a remote host (I'm using this script within icinga2), but I'm having no luck.
I have tried a few combinations of awk, sed, and grep, but none have seemed to work. I thought it was an issue with double quotes vs. single quotes, but I played around with them and nothing helped.
host=$1
LOG_FILE=/var/log/maillog
hour_segment=$(ssh -o 'StrictHostKeyChecking=no' myUser@${host} 2>/dev/null "sed -n "/^$(date --date='10 minutes ago' '+%b %_d %H:%M')/,\$p" ${LOG_FILE}")
echo "${hour_segment}"
When running the script with bash -x, I get the following output:
bash -x ./myScript.sh host.domain
+ host=host.domain
+ readonly STATE_OK=0
+ STATE_OK=0
+ readonly STATE_WARN=1
+ STATE_WARN=1
+ LOG_FILE=/var/log/maillog
+++ date '--date=10 minutes ago' '+%b %_d %H:%M'
++ ssh -o StrictHostKeyChecking=no myUser@host.domain 'sed -n /^Jan' 8 '12:56/,$p /var/log/maillog'
+ hour_segment=
+ echo ''
Maillog log file output. I'd like $hour_segment to look like the output below as well, so that I can apply filters to it:
head -n 5 /var/log/maillog
Jan 6 04:03:36 hostname imapd: Disconnected, ip=[ip_address], time=5
Jan 6 04:03:36 hostname postfix/smtpd[9501]: warning: unknown[ip_address]: SASL LOGIN authentication failed: authentication failure
Jan 6 04:03:37 hostname imapd: Disconnected, ip=[ip_address], time=5
Jan 6 04:03:37 hostname postfix/smtpd[7812]: warning: unknown[ip_address]: SASL LOGIN authentication failed: authentication failure
Jan 6 04:03:37 hostname postfix/smtpd[7812]: disconnect from unknown[ip_address]
Using GNU awk's time functions:
$ awk '
BEGIN {
m["Jan"]=1 # convert month abbreviations to numbers
# fill in the rest of the months
m["Dec"]=12
nowy=strftime("%Y") # assume current year, deal with Dec/Jan below
nowm=strftime("%b") # get the month, see above comment
nows=strftime("%s") # current epoch time
}
{ # below we form the datespec for mktime
dt=(nowm=="Jan" && $1=="Dec"?nowy-1:nowy) " " m[$1] " " $2 " " gensub(/:/," ","g",$3)
if(mktime(dt)>=nows-600) # if timestamp is less than 600 secs away
print # print it
}' file
The current year is assumed. If it's January and the log has Dec, we subtract one year in mktime's datespec: (nowm=="Jan" && $1=="Dec"?nowy-1:nowy). Datespec: Jan 6 04:03:37 -> 2019 1 6 04 03 37, which for comparison in epoch form is 1546740217.
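To wire this back into the original script, one low-friction option is to stream the remote file into a local gawk, which sidesteps the nested-quoting problem entirely. A sketch (untested against your hosts), with last10min.awk as an illustrative name for a file holding the gawk program above:
host=$1
LOG_FILE=/var/log/maillog
hour_segment=$(ssh -o 'StrictHostKeyChecking=no' "myUser@${host}" "cat ${LOG_FILE}" 2>/dev/null | gawk -f last10min.awk)
echo "${hour_segment}"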
Edit: As no one implemented my specs in the comments, I'll do it myself. tac outputs the file in reverse, and the awk prints records while they are within the given time frame (from t seconds ago to now, or in the future), exiting once it meets a date outside that time frame:
$ tac file | awk -v t=600 ' # time in seconds go here
BEGIN {
m["Jan"]=1
# add more months
m["Dec"]=12
nowy=strftime("%Y")
nowm=strftime("%b")
nows=strftime("%s")
} {
dt=(nowm=="Jan" && $1=="Dec"?nowy-1:nowy) " " m[$1] " " $2 " " gensub(/:/," ","g",$3)
if(mktime(dt)<nows-t) # comparison flipped from the version above
exit
else
print
}'
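Note that because of tac the matching lines come out newest-first. If the original order matters, pipe the result through tac once more, e.g. (with recent.awk as an illustrative name for the program above):
tac file | awk -v t=600 -f recent.awk | tac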
Coming up with a robust solution that is 100% bulletproof is very hard, since we are missing the most crucial piece of information: the year.
Imagine you want the last 10 minutes of available data on March 01 2020 at 00:05:00. This is a bit annoying, since February 29 2020 exists, but in 2019 it does not.
I present here an ugly solution that only looks at the third field (the time) and I will make the following assumptions:
The log-file is sorted by time
There is at least one log every single day!
Under these conditions we can keep track of a sliding window starting from the first available time.
If you save the following in a file extractLastLog.awk
{ t=substr($3,1,2)*3600 + substr($3,4,2)*60 + substr($3,7,2) + offset } # time of day in seconds
(t < to) { t+=86400; offset+=86400 } # time went backwards, so a new day started
{ to = t } # remember this time for the next record
(NR==1) { startTime = t; startIndex = NR } # initialise the window at the first record
{ a[NR]=$0; b[NR]=t } # buffer every line and its computed time
{ while ( startTime+timeSpan*60 <= t ) { # shrink window to the last timeSpan minutes
delete a[startIndex]
delete b[startIndex]
startIndex++; startTime=b[startIndex]
}
}
END { for(i=startIndex; i<=NR; ++i) print a[i] } # print what is left of the window
then you can extract the last 23 minutes in the following way:
awk -f extractLastLog.awk -v timeSpan=23 logfile.log
The second condition I gave (there is at least one log entry every single day!) is needed to avoid messed-up results. In the above code I compute the time fairly simply, as HH*3600 + MM*60 + SS + offset, and I assume that if the current time is smaller than the previous time, we are on a different day, hence the offset is increased by 86400 seconds. So if you have two entries like:
Jan 09 12:01:02 xxx
Jan 10 12:01:01 xxx
it will work, but this
Jan 09 12:01:00 xxx
Jan 10 12:01:01 xxx
will not work. It will not realize the day changed. Other cases that will fail are:
Jan 08 12:01:02 xxx
Jan 10 12:01:01 xxx
as it does not know that it jumped two days. Correcting for this is not easy because of the varying month lengths (and leap years on top of that).
As I said, it's ugly, but might work.
I'm running AIX with coreutils 5.0. I need to advance an arbitrary date (or time) given in the ISO-8601 format YYYY-MM-DD hh:mm:ss.
For example:
The value of D1 is 2017-07-08 19:20:01, and I need to add 30 minutes.
On a modern UNIX system I could probably write something like
date -d "$D1 + 30 minutes" +'%H:%M'
but, alas, I need it to work on an old AIX.
Try
$ date -d "$(date -d "$D1") + 30 minutes" +'%H:%M'
This works in bash, but not in ksh.
The inner call to date will parse D1 to a date, and present it in date's "native" format.
$ date -d "$D1"
Sat Jul 8 19:20:01 CEST 2017
This output will be used with + 30 minutes to create the date that you want, with the outer call to date.
The inner call to date will be expanded so that
$ date -d "$(date -d "$D1") + 30 minutes" +'%H:%M'
will be equivalent to
$ date -d "Sat Jul 8 19:20:01 CEST 2017 + 30 minutes" +'%H:%M'
which will be
19:50
date -d @$(($(date -d "$D1" +%s) + 30 * 60)) +%H:%M
$(date -d "$D1" +%s) echoes the epoch
$((epoch + value)) calculates the wanted time
date -d @epoch +fmt formats it
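Putting those three steps together with the question's variable (a sketch that assumes GNU date's -d and @epoch support, which an old AIX date will not have; see the portability caveat below):
D1='2017-07-08 19:20:01'
date -d @"$(($(date -d "$D1" +%s) + 30 * 60))" +'%H:%M' # prints 19:50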
If you are running AIX from 2003 you are in dire straits, my friend, but if you only need the time, not the full date, as your question implies, I think @RamanSailopal got us half way there.
echo $D1 | awk -F "[: ]" '{
m = $3+30;
h = ($2+int(m/60)) % 24;
printf("%02i:%02i\n", h, m%60)
}'
awk splits the input into fields, with the separator pattern given in the -F argument. The pattern matches either : or a space.
The input will be split in
$1 = 2017-07-08
$2 = 19
$3 = 20
$4 = 01
Then the script calculates a raw minute value (which can be 60 or more) and stores it in m. From that value it calculates the hour, modulo 24, and the actual minutes, m modulo 60.
This could fail if you hit a leap second, so if you need second precision at all times, you should use some other method.
Awk solution:
awk -F '[-: ]' '{
ram=(mktime($1" "$2" "$3" "$4" "$5" "$6)+(30*60));
print strftime("%Y-%m-%d %T",ram)
}' <<< "$D1"
Convert the date string to an epoch time using awk's mktime function. Add 30 minutes (30*60) and then convert back to a date string in the required format using strftime.
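For example, with the question's value (gawk assumed, since mktime and strftime are GNU awk extensions):
$ D1='2017-07-08 19:20:01'
$ awk -F '[-: ]' '{ print strftime("%Y-%m-%d %T", mktime($1" "$2" "$3" "$4" "$5" "$6) + 30*60) }' <<< "$D1"
2017-07-08 19:50:01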
How do I compare the current timestamp with a time field in a file and print the matched and unmatched data? I have 2 columns in a file (see below):
oac.bat 09:09
klm.txt 9:00
I want to compare the timestamp (2nd column) with the current time, say 10:00, and print the output as follows.
At 10:00
greater.txt
xyz.txt 10:32
mnp.csv 23:54
Lesser.txt
oac.bat 09:09
klm.txt 9:00
Could anyone help me with this, please?
I used awk $0 > "10:00", which gives me only the 2nd column's details, but I want both columns, and I am taking the timestamp directly from the system with a variable like
d=`date +%H:%M`
With GNU awk you can just use its builtin time functions:
awk 'BEGIN{now = strftime("%H:%M")} {
split($NF,t,/:/)
cur=sprintf("%02d:%02d",t[1],t[2])
print > ((cur > now ? "greater" : "lesser") ".txt")
}' file
With other awks just set now using -v and date up front, e.g.:
awk -v now="$(date +"%H:%M")" '{
split($NF,t,/:/)
cur = sprintf("%02d:%02d",t[1],t[2])
print > ((cur > now ? "greater" : "lesser") ".txt")
}' file
The above is untested since you didn't provide input/output we could test against.
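As a plausibility check against the two sample rows from the question, forcing now to 10:00 should route both lines to lesser.txt, since the strings 09:09 and 09:00 both compare less than 10:00:
$ awk -v now="10:00" '{ split($NF,t,/:/); cur=sprintf("%02d:%02d",t[1],t[2]); print > ((cur > now ? "greater" : "lesser") ".txt") }' file
$ cat lesser.txt
oac.bat 09:09
klm.txt 9:00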
Pure Bash
The script can be implemented in pure Bash with the help of the date command:
# Current Unix timestamp
let cmp_seconds=$(date +%s)
# Read file line by line
while IFS= read -r line; do
let line_seconds=$(date -d "${line##* }" +%s) || continue
(( line_seconds <= cmp_seconds )) && \
outfile=lesser || outfile=greater
# Append the line to the file chosen above
printf "%s\n" "$line" >> "${outfile}.txt"
done < file
In this script, ${line##* } removes the longest match of '* ' (anything followed by a space) from the front of $line, thus fetching the last column (the time). The time column is expected to be in one of the following formats: HH:MM, or H:MM. Actually, date's -d option argument
can be in almost any common format. It can contain month names, time zones, ‘am’ and ‘pm’, ‘yesterday’, etc.
We use the flexibility of this option to convert the time (HH:MM, or H:MM) to Unix timestamp.
The let builtin allows arithmetic to be performed on shell variables. If the last let expression fails or evaluates to zero, let returns 1 (an error code); otherwise it returns 0 (success). Thus, if for some reason the time column is in an invalid format, the iteration for that line is skipped with the help of continue.
Perl
Here is a Perl version I have written just for fun. You may use it instead of the Bash version, if you like.
# For current date
#cmp_seconds=$(date +%s)
# For specific hours and minutes
cmp_seconds=$(date -d '10:05' +%s)
perl -e '
my @t = localtime('$cmp_seconds');
my $minutes = $t[2] * 60 + $t[1];
while (<>) {
/ (\d?\d):(\d\d)$/ or next;
my $fh = ($1 * 60 + $2) > $minutes ? STDOUT : STDERR;
printf $fh "%s", $_;
}' < file >greater.txt 2>lesser.txt
The script computes the number of minutes in the following way:
HH:MM = HH * 60 + MM minutes
If the number of minutes from the file is greater than the number of minutes for the current time, it prints the line to standard output, otherwise to standard error. Finally, standard output is redirected to greater.txt, and standard error is redirected to lesser.txt.
I have written this script for demonstration of another approach (algorithm), which can be implemented in different languages, including Bash.
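For illustration, here is a rough Bash rendition of the same minutes-comparison algorithm (a sketch; the 10# prefix forces base-10 arithmetic so values like 09 are not parsed as octal):
# minutes since midnight for the current time
now_minutes=$(( 10#$(date +%H) * 60 + 10#$(date +%M) ))
while read -r name time; do
h=${time%%:*} m=${time##*:}
if (( 10#$h * 60 + 10#$m > now_minutes )); then
printf '%s %s\n' "$name" "$time" >> greater.txt
else
printf '%s %s\n' "$name" "$time" >> lesser.txt
fi
done < file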
I'm trying to work on a logfile, and I need to be able to specify the range of dates. So far (before any processing), I'm converting a date/time string to timestamp using date --date "monday" +%s.
Now, I want to be able to iterate over each line in a file, but check if the date (in a human readable format) is within the allowed range. To do this, I'd like to do something like the following:
echo `awk '{if(`date --date "$3 $4 $5 $6 $7" +%s` > $START && `date --date "" +%s` <= $END){/*processing code here*/}}' myfile`
I don't even know if that's possible... I've tried a lot of variations, plus I couldn't find anything understandable/usable online.
Thanks
Update:
Example of myfile is as follows. Its logging IPs and access times:
123.80.114.20 Sun May 01 11:52:28 GMT 2011
144.124.67.139 Sun May 01 16:11:31 GMT 2011
178.221.138.12 Mon May 02 08:59:23 GMT 2011
Given what you have to do, it's really not that hard, and it is much more efficient to do your date processing by converting to strings and comparing.
Here's a partial solution that uses associative arrays to convert the month value to a number. Then you rely on the %02d format specifier to ensure 2 digits. You can reformat the dateTime value with '.', etc or leave the colons in the hr:min:sec if you really need the human readability.
The YYYYMMDD format is a big help in these sort of problems, as LT, GT, EQ all work without any further formatting.
echo "178.221.138.12 Mon May 02 08:59:23 GMT 2011" \
| awk -v StartTime=20110105235959 'BEGIN {
mons["Jan"]=1 ; mons["Feb"]=2; mons["Mar"]=3
mons["Apr"]=4 ; mons["May"]=5; mons["Jun"]=6
mons["Jul"]=7 ; mons["Aug"]=8; mons["Sep"]=9
mons["Oct"]=10 ; mons["Nov"]=11; mons["Dec"]=12
}
{
# 178.221.138.12 Mon May 02 08:59:23 GMT 2011
printf("dateTime=%04d%02d%02d%02d%02d%02d\n",
$NF, mons[$3], $4, substr($5,1,2), substr($5,4,2), substr($5,7,2) )
} '
The -v StartTime option is illustrative of how to pass in your StartTime value (and the matching format); note that -v must come before the awk program.
I hope this helps.
Here's an alternative approach using awk's built-in mktime() function. I'd never bothered with the month parsing until now - thanks to shelter for that part (see the accepted answer). It always feels like time to switch languages around that point.
#!/bin/bash
# input format:
#(1 2 3 4 5 6 7)
#123.80.114.20 Sun May 01 11:52:28 GMT 2011
awk -v startTime=1304252691 -v endTime=1306000000 '
BEGIN {
mons["Jan"]=1 ; mons["Feb"]=2; mons["Mar"]=3
mons["Apr"]=4 ; mons["May"]=5; mons["Jun"]=6
mons["Jul"]=7 ; mons["Aug"]=8; mons["Sep"]=9
mons["Oct"]=10 ; mons["Nov"]=11; mons["Dec"]=12;
}
{
hmsSpaced=$5; gsub(":"," ",hmsSpaced);
timeInSec=mktime($7" "mons[$3]" "$4" "hmsSpaced);
if (timeInSec > startTime && timeInSec <= endTime) print $0
}' myfile
(I've chosen example time thresholds to select only the last two log lines.)
Note that if the mktime() function were a bit smarter this whole thing would reduce to:
awk -v startTime=1304252691 -v endTime=1306000000 '{t=mktime($7" "$3" "$4" "$5); if (t > startTime && t <= endTime) print $0}' myfile
I'm not sure of the format of the data you're parsing, but I do know that you can't use backticks within single quotes; you'll have to use double quotes. If there are too many quotes being nested and it's confusing you, you can also just save the output of your date command to a variable beforehand.
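For instance, with the log format from the update, precomputing the bound and doing the per-line conversion in the shell keeps the quoting simple. A sketch (GNU date assumed; it forks one date per line, so it is slow on large logs):
START=$(date --date "monday" +%s)
while read -r ip rest; do
ts=$(date --date "$rest" +%s) || continue
(( ts > START )) && printf '%s %s\n' "$ip" "$rest"
done < myfile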
I have this awk script that runs through a file and counts every occurrence of a given date. The date format in the original file is the standard date format, like this: Thu Mar 5 16:46:15 EST 2009. I use awk to throw away the weekday, time, and timezone, and then do my counting by pumping the dates into an associative array with the dates as indices.
In order to get the output to be sorted by date, I converted the dates to a different format that I could sort with bash sort.
Now, my output looks like this:
Date Count
03/05/2009 2
03/06/2009 1
05/13/2009 7
05/22/2009 14
05/23/2009 7
05/25/2009 7
05/29/2009 11
06/02/2009 12
06/03/2009 16
I'd really like the output to have more human readable dates, like this:
Mar 5, 2009
Mar 6, 2009
May 13, 2009
May 22, 2009
May 23, 2009
May 25, 2009
May 29, 2009
Jun 2, 2009
Jun 3, 2009
Any suggestions for a way I could do this? If I could do this on the fly when I output the count values that would be best.
UPDATE:
Here's my solution incorporating ghostdog74's example code:
grep -i "E[DS]T 2009" original.txt | awk '{printf "%s %2.d, %s\r\n",$2,$3,$6}' >dates.txt #outputs dates for counting
date -f dates.txt +'%Y %m %d' | awk ' #reformat dates as YYYYMMDD for future sort
{++total[$0]} #pump dates into associative array
END {
for (item in total) printf "%s\t%s\r\n", item, total[item] #output dates as yyyy mm dd with counts
}' | sort -t \t | awk ' #send to sort, then to cleanup
BEGIN {printf "%s\t%s\r\n","Date","Count"}
{t=$1" "$2" "$3" 0 0 0" #cleanup using example by ghostdog74
printf "%s\t%2.d\r\n",strftime("%b %d, %Y",mktime(t)),$4
}'
rm dates.txt
Sorry this looks so messy. I've tried to put clarifying comments in.
Use awk's sort and date's stdin to greatly simplify the script
Date will accept input from stdin so you can eliminate one pipe to awk and the temporary file. You can also eliminate a pipe to sort by using awk's array sort and as a result, eliminate another pipe to awk. Also, there's no need for a coprocess.
This script uses date for the monthname conversion which would presumably continue to work in other languages (ignoring the timezone and month/day order issues, though).
The end result looks like "grep|date|awk". I have broken it into separate lines for readability (it would be about half as big if the comments were eliminated):
grep -i "E[DS]T 2009" original.txt |
date -f - +'%Y %m %d' | #reformat dates as YYYYMMDD for future sort
awk '
BEGIN { printf "%s\t%s\r\n","Date","Count" }
{ ++total[$0] } # pump dates into associative array
END {
idx=1
for (item in total) {
d[idx]=item;idx++ # copy the array indices into the contents of a new array
}
c=asort(d) # sort the contents of the copy
for (i=1;i<=c;i++) { # use the contents of the copy to index into the original
printf "%s\t%2.d\r\n",strftime("%b %e, %Y",mktime(d[i]" 0 0 0")),total[d[i]]
}
}'
I get testy when I see someone using grep and awk (and sed, cut, ...) in a pipeline. Awk can fully handle the work of many utilities.
Here's a way to clean up your updated code to run in a single instance of awk (well, gawk), and using sort as a co-process:
gawk '
BEGIN {
IGNORECASE = 1
}
function mon2num(mon) {
return(((index("JanFebMarAprMayJunJulAugSepOctNovDec", mon)-1)/3)+1)
}
/ E[DS]T [[:digit:]][[:digit:]][[:digit:]][[:digit:]]/ {
month=$2
day=$3
year=$6
date=sprintf("%4d%02d%02d", year, mon2num(month), day)
total[date]++
human[date] = sprintf("%3s %2d, %4d", month, day, year)
}
END {
sort_coprocess = "sort"
for (date in total) {
print date |& sort_coprocess
}
close(sort_coprocess, "to")
print "Date\tCount"
while ((sort_coprocess |& getline date) > 0) {
print human[date] "\t" total[date]
}
close(sort_coprocess)
}
' original.txt
If you are using gawk:
awk 'BEGIN{
s="03/05/2009"
m=split(s,date,"/")
t=date[3]" "date[2]" "date[1]" 0 0 0"
print strftime("%b %d",mktime(t))
}'
The above is just an example; as you did not show your actual code, I cannot incorporate it into your code.
Why don't you prepend your awk-date to the original date? This yields a sortable key but stays human readable.
(Note: to sort right, you should make it yyyymmdd)
If needed, cut can remove the prepended column.
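A minimal sketch of that prepend-sort-cut round trip, starting from raw lines like Thu Mar 5 16:46:15 EST 2009 (the file name dates.txt is illustrative):
# prepend a yyyymmdd key, sort on it numerically, then cut the key away
awk 'BEGIN { split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
for (i in m) mons[m[i]] = i }
{ printf "%04d%02d%02d %s\n", $6, mons[$2], $3, $0 }' dates.txt |
sort -n | cut -d" " -f2-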
Gawk has strftime(). You can also call the date command to format them (see its man page). The Linux Forums give some examples.