How to get last 10 minutes of logs from remote host - bash

I'm trying to get the last x minutes of logs from /var/log/maillog from a remote host (I'm using this script within icinga2) but having no luck.
I have tried a few combinations of awk, sed, and grep but none have seemed to work. I thought it was an issue with double quotes vs single quotes but I played around with them and nothing helped.
host=$1
LOG_FILE=/var/log/maillog
hour_segment=$(ssh -o 'StrictHostKeyChecking=no' myUser@${host} 2>/dev/null "sed -n "/^$(date --date='10 minutes ago' '+%b %_d %H:%M')/,\$p" ${LOG_FILE}")
echo "${hour_segment}"
When running the script with bash -x, I get the following output:
bash -x ./myScript.sh host.domain
+ host=host.domain
+ readonly STATE_OK=0
+ STATE_OK=0
+ readonly STATE_WARN=1
+ STATE_WARN=1
+ LOG_FILE=/var/log/maillog
+++ date '--date=10 minutes ago' '+%b %_d %H:%M'
++ ssh -o StrictHostKeyChecking=no myUser@host.domain 'sed -n /^Jan' 8 '12:56/,$p /var/log/maillog'
+ hour_segment=
+ echo ''
Maillog log file output. I'd like $hour_segment to contain output like the below as well, so I can apply filters to it:
head -n 5 /var/log/maillog
Jan 6 04:03:36 hostname imapd: Disconnected, ip=[ip_address], time=5
Jan 6 04:03:36 hostname postfix/smtpd[9501]: warning: unknown[ip_address]: SASL LOGIN authentication failed: authentication failure
Jan 6 04:03:37 hostname imapd: Disconnected, ip=[ip_address], time=5
Jan 6 04:03:37 hostname postfix/smtpd[7812]: warning: unknown[ip_address]: SASL LOGIN authentication failed: authentication failure
Jan 6 04:03:37 hostname postfix/smtpd[7812]: disconnect from unknown[ip_address]
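For the quoting problem itself, the bash -x trace above shows what happens: the inner double quotes around the sed expression close the outer pair, so the command is split apart before it reaches the remote host. One possible fix (a sketch, untested) is to escape the inner quotes and the $ so they survive both shells:
host=$1
LOG_FILE=/var/log/maillog
# \" survives the outer quotes; \\\$p becomes \$p, so the remote shell hands $p to sed
hour_segment=$(ssh -o 'StrictHostKeyChecking=no' "myUser@${host}" 2>/dev/null \
    "sed -n \"/^$(date --date='10 minutes ago' '+%b %_d %H:%M')/,\\\$p\" ${LOG_FILE}")
echo "${hour_segment}"
Note that this still only prints anything if some line carries that exact minute stamp; the awk approaches below avoid that requirement.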

Using GNU awk's time functions:
$ awk '
BEGIN {
    # map month abbreviations to numbers
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", mon, " ")
    for (i in mon)
        m[mon[i]] = i
    nowy = strftime("%Y")   # assume current year, deal with Dec/Jan below
    nowm = strftime("%b")   # current month, see above comment
    nows = strftime("%s")   # current epoch time
}
{
    # below we form the datespec for mktime
    dt = (nowm=="Jan" && $1=="Dec" ? nowy-1 : nowy) " " m[$1] " " $2 " " gensub(/:/," ","g",$3)
    if (mktime(dt) >= nows-600)   # if the timestamp is within the last 600 secs
        print                     # print the record
}' file
The current year is assumed. If it's January and the log has Dec, we subtract one year in mktime's datespec: (nowm=="Jan" && $1=="Dec" ? nowy-1 : nowy). The datespec for Jan 6 04:03:37 becomes 2019 1 6 04 03 37, which for comparison converts to epoch form: 1546740217.
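You can check the conversion with a one-liner; note that mktime() uses the local timezone, so the exact epoch value varies (the figure above corresponds to UTC+2):
$ TZ=Etc/GMT-2 awk 'BEGIN { print mktime("2019 1 6 04 03 37") }'
1546740217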
Edit: As no one implemented my spec from the comments, I'll do it myself. tac outputs the file in reverse, and the awk prints records while they are within the given time frame (t seconds before now, or the future) and exits once it meets a date outside that time frame:
$ tac file | awk -v t=600 '   # time window in seconds goes here
BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", mon, " ")
    for (i in mon)
        m[mon[i]] = i
    nowy = strftime("%Y")
    nowm = strftime("%b")
    nows = strftime("%s")
}
{
    dt = (nowm=="Jan" && $1=="Dec" ? nowy-1 : nowy) " " m[$1] " " $2 " " gensub(/:/," ","g",$3)
    if (mktime(dt) < nows-t)   # changed: exit at the first record outside the window
        exit
    else
        print
}'

Coming up with a robust solution that is 100% bulletproof is very hard, since we are missing the most crucial piece of information: the year.
Imagine you want the last 10 minutes of available data on March 01 2020 at 00:05:00. This is a bit annoying, since February 29 2020 exists, but in 2019 it does not.
I present here an ugly solution that only looks at the third field (the time), and I will make the following assumptions:
The log-file is sorted by time
There is at least one log every single day!
Under these conditions we can keep track of a sliding window starting from the first available time.
If you save the following in a file extractLastLog.awk,
{ t = substr($3,1,2)*3600 + substr($3,4,2)*60 + substr($3,7,2) + offset }   # time of day in seconds
(t < to) { t += 86400; offset += 86400 }        # time went backwards: we crossed midnight
{ to = t }                                      # remember this time for the next record
(NR == 1) { startTime = t; startIndex = NR }    # initialize the sliding window
{ a[NR] = $0; b[NR] = t }                       # buffer every line and its time
{
    while (startTime + timeSpan*60 <= t) {      # drop entries older than timeSpan minutes
        delete a[startIndex]
        delete b[startIndex]
        startIndex++; startTime = b[startIndex]
    }
}
END { for (i = startIndex; i <= NR; ++i) print a[i] }
then you can extract the last 23 minutes in the following way:
awk -f extractLastLog.awk -v timeSpan=23 logfile.log
The second condition I gave (there is at least one log entry every single day!) is needed to avoid messed-up results. In the above code, I compute the time fairly simply: HH*3600 + MM*60 + SS + offset. But I assume that if the current time is smaller than the previous time, we are on a different day, and hence update the offset by 86400 seconds. So if you have two entries like:
Jan 09 12:01:02 xxx
Jan 10 12:01:01 xxx
it will work, but this
Jan 09 12:01:00 xxx
Jan 10 12:01:01 xxx
will not work. It will not realize the day changed. Another case that will fail is:
Jan 08 12:01:02 xxx
Jan 10 12:01:01 xxx
as it does not know that it jumped two days. Correcting for this is not easy, due to the varying month lengths (and leap years on top).
As I said, it's ugly, but might work.

Related

Shows only last 10 min logs from log file

Log File :-
LOCATION CPIC (TCP/IP) on local host with Unicode
ERROR Error Message
TIME Mon May 4 11:37:17 2020
RELEASE 721
COMPONENT CPIC (TCP/IP) with Unicode
VERSION 3
RC 473
LINE 9261
COUNTER 15
Changing trace level: handle 38331088 / destination (null) / level 0
I tried the below command to automatically get the last 10 minutes of logs.
sed -n "/ $(date +\%R -d "-10 min")/,$"p logfile.log | grep "ERROR"
No output is displayed.
Expected output: the ERROR messages from the last 10 minutes.
Any solution?
The challenge in this case is that the 10-minute test requires comparing timestamps; pattern matching on the exact time 10 minutes ago may fail if nothing was logged in that exact minute.
As an alternative, consider the following awk-based solution. It skips lines until it sees a line matching all of the following:
* First token is TIME
* The day of the month matching today
* Time in last 10 minutes
awk -v DD=$(date +%-d -d '-10 min') -v WHEN="$(date +\%R -d '-10 min')" '
$1 == "TIME" && $4 == +DD && $5 >= WHEN { p=1 }
p { print }
' logfile
The solution can be improved to be more generic. The current implementation will not work well during the first 10 minutes after midnight; this can be addressed (if needed) by the OP.
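For reference, one possible way to handle the midnight case is to also accept any TIME line stamped with today's day whenever "ten minutes ago" falls on the previous day. A sketch (untested, assuming GNU date):
awk -v D1=$(date +%-d) \
    -v D0=$(date +%-d -d '-10 min') -v W0="$(date +%R -d '-10 min')" '
# start printing at a TIME line from the last 10 minutes; if the window
# crossed midnight (D0 != D1), any TIME line dated today also qualifies
$1 == "TIME" && (($4 == +D0 && $5 >= W0) || (+D0 != +D1 && $4 == +D1)) { p=1 }
p { print }
' logfile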

Date subtraction in bash

I want to get successive differences of the dates shown in the sample below. How do I convert each of these dates into epoch seconds?
7/21/17 6:39:12:167 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:205 GMT
7/21/17 6:39:12:206 GMT
7/21/17 6:39:12:206 GMT
Once each line is converted into epoch seconds, I can simply run another script to get the successive differences. Thanks.
You can convert times using the date command. Given a line like this:
7/21/17 6:39:12:167 GMT
You first need to strip everything after the seconds part (the milliseconds and the timezone), to get this:
7/21/17 6:39:12
You can use cut -d: -f1-3 for that. Then, convert to epoch seconds, if you're using FreeBSD or Mac OS:
date -ujf "%m/%d/%y %H:%M:%S" "7/21/17 6:39:12" +%s
Which gives:
1500619152
If you are using GNU date (e.g. on Linux), you can feed an entire file of dates to it. Since the input file isn't in the right format, we can do this:
date --file <(cut -d: -f1-3 infile) +%s
That will read the entire file with only a single invocation of date, which is much more efficient, but only works with GNU date.
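That gives one epoch value per line. To get the successive differences the question actually asks for, one option is to pipe those values through a small awk filter (a sketch along the same lines):
date --file <(cut -d: -f1-3 infile) +%s |
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'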
Here is one in GNU awk. It converts the timestamps to epoch time in seconds and subtracts each previous one from the current. The mktime function used for the conversion doesn't take fractions of a second, but the fractions are stored in a[7], and nothing stops you from adding them to the t variable before subtracting:
$ awk '
function zeropad(s) {            # zero-pad to two digits
    return sprintf("%02d", s)
}
{
    split($0,a,"[/ :]")          # split timestamp into a[]
    for(i in a)
        a[i]=zeropad(a[i])       # zero-pad all components
    t=mktime(20 a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6])
    # add the fractions in a[7] here
    if(NR>1)                     # might be unnecessary
        print t-p                # output the difference in seconds
    p=t                          # remember the current time as the previous
}' file
0
0
0
0
0
Since you didn't include the expected output or a proper data sample, that's the best I can do for now.
EDIT:
Since your data does not make clear whether fractions of a second are represented like 0:0:0:100 or 0:0:0:1, I modified the zeropad function to left- or right-pad given values. Now you call it like zeropad(value, count, left/right), e.g. zeropad(a[7],3,"r"):
function zeropad(s,c,d) {            # pad s to c digits; d is "l" (left) or "r" (right)
    return sprintf("%" (d=="l"? "0" c:"") "d" (d=="r"?"%0" c-length(s) "d":""), s,"")
}
{
    split($0,a,"[/ :]")              # split timestamp into a[]
    for(i in a)
        a[i]=zeropad(a[i],2,"l")     # left-pad all components with 0s
    t=mktime(20 a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6])
    t=t+zeropad(a[7],3,"r")/1000     # right-pad fractions with 0s
    if(NR>1)                         # might be unnecessary
        print t-p                    # output the difference in seconds
    p=t                              # remember the current time as the previous
}
0.00999999
0
0.37
0.00999999
0
printf with proper format modifiers should probably be used for the output to get sane values.
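For example, replacing the print with a fixed-precision printf (three decimals assumed here) gives cleaner output:
if(NR>1)                         # might be unnecessary
    printf "%.3f\n", t-p         # prints 0.010 instead of 0.00999999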
Let's suppose that you have one of those dates in a variable:
$ d='7/21/17 6:39:12:167 GMT'
With GNU date, we need to remove the milliseconds part. That can be done with bash's pattern substitution:
$ echo "${d/:??? / }"
7/21/17 6:39:12 GMT
To convert that to seconds-since-epoch, we use the -d option to set the date and the %s format to request seconds-since-epoch:
$ date -d "${d/:??? / }" +%s
1500619152
Compatibility: Mac OS X does not support GNU's -d option, but I gather it has a similar one. On many operating systems, including OS X, there is the option to install the GNU utilities.

Need a script to split a large file by month that can determine year based off order of the logs

I need to split a large syslog file that runs from October 2015 to February 2016 into separate files by month. Due to background log retention, the format of these logs is similar to:
Oct 21 08:00:00 - Log info
Nov 16 08:00:00 - Log Info
Dec 25 08:00:00 - Log Info
Jan 11 08:00:00 - Log Info
Feb 16 08:00:00 - Log Info
This large file is the result of an initial zgrep search across a large number of log files split by day; for example, user activity on a network across multiple services such as Windows/Firewall/physical access logs.
For a previous request, I used the following:
gawk 'BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",mth,"|")
}
{
for(i=1;i<=m;i++){ if ( mth[i]==$1){ month = i } }
tt="2015 "month" "$2" 00 00 00"
date= strftime("%Y%m",mktime(tt))
print $0 > FILENAME"."date".txt"
}
' logfile
output file examples (note: sometimes I add "%d" to get the day, but not this time):
Test.201503.txt
Test.201504.txt
Test.201505.txt
Test.201506.txt
This script, however, hard-codes 2015 into the output log file name. What I attempted, and failed to do, was a script that numbers each month 1-12, sets 2015 as variable (a) and 2016 as variable (b), and then, working through the months in order 10, 11, 12, 1, 2, detects the rollover: once it reaches a month smaller than the previous one (1 < 12), it knows to use 2016 instead of 2015. Odd request, I know, but any ideas would at least help me get into the right mindset.
You could use date to parse the date and time. E.g.
#!/bin/bash
while IFS=- read -r time info; do
    mon=$(date --date "$time" +%m | sed 's/^0//')
    if (( mon < 10 )); then
        year=2016
    else
        year=2015
    fi
    echo $time - $info >> Test.$year$(printf "%02d" $mon).txt
done < logfile
Here is a gawk solution based on your script and your observation in the question. The idea is to detect a new year when the month number suddenly gets smaller, e.g. from 12 to 1. (Of course, that will not work if the log has Jan 2015 directly followed by Jan 2016.)
script.awk
BEGIN { START_YEAR= 2015
# configure months and a mapping month -> nr, e.g. "Feb" |-> "02"
split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",monthNames,"|")
for( nr in monthNames) { month2Nr[ monthNames[ nr ] ] = sprintf("%02d", nr ) }
yearCounter=0
}
{
currMonth = month2Nr[ $1 ]
# detect a jump to the next year by a reset in the month number
if( prevMonth > currMonth) { yearCounter++ }
newFilename = sprintf("%s.%d%s.txt", FILENAME, (START_YEAR + yearCounter), currMonth)
prevMonth = currMonth
print $0 > newFilename
}
Use it like this: awk -f script.awk logfile
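For the five sample lines above, this produces logfile.201510.txt, logfile.201511.txt, logfile.201512.txt, logfile.201601.txt and logfile.201602.txt.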

gawk - suppress output of matched lines

I'm running into an issue where gawk prints unwanted output. I want to find lines in a file that match an expression, test to see if the information in the line matches a certain condition, and then print the line if it does. I'm getting the output that I want, but gawk is also printing every line that matches the expression rather than just the lines that meet the condition.
I'm trying to search through files containing dates and times for certain actions to be executed. I want to show only lines that contain times in the future. The dates are formatted like so:
text... 2016-01-22 10:03:41 more text...
I tried using sed to just print all lines starting with ones that had the current hour, but there is no guarantee that the file contains a line with that hour (plus there is no guarantee that the lines all have any particular year, month, day, etc.), so I needed something more robust. I decided to try converting the times into seconds since epoch and comparing that to the current systime. If the conversion produces a number greater than systime, I want to print that line.
Right now it seems like gawk's mktime() function is the key to this. Unfortunately, it requires input in the following format:
yyyy mm dd hh mm ss
I'm currently searching a test file (called timecomp) for a regular expression matching the date format.
Edit: the test file only contains a date and time on each line, no other text.
I used sed to replace the date separators (i.e. /, -, and :) with a space, and then piped the output to a gawk script called stime using the following statement:
sed -e 's/[-://_]/ /g' timecomp | gawk -f stime
Here is the script
# stime
# stime
BEGIN { tsec=systime(); } /.*20[1-9][0-9] [0-1][1-9] [0-3][0-9] [0-2][0-9] [0-6][0-9] [0-6][0-9]/ {
    if (tsec < mktime($0))
        print "\t" $0 # the tab is just to differentiate the desired output from the other lines that are being printed.
} $1
Right now this is getting the basic information that I want, but it is also printing every line that matches the original expression, rather than just the lines containing a time in the future. Sample output:
2016 01 22 13 23 20
2016 01 22 14 56 57
2016 01 22 15 46 46
2016 01 22 16 32 30
2016 01 22 18 56 23
2016 01 22 18 56 23
2016 01 22 22 22 28
2016 01 22 22 22 28
2016 01 22 23 41 06
2016 01 22 23 41 06
2016 01 22 20 32 33
How can I print only the lines in the future?
Note: I'm doing this on a Mac, but I want it to be portable to Linux because I'm ultimately making this for some tasks I have to do at work.
I'd like to accomplish this in one script, rather than requiring the sed statement to reformat the dates, but I'm running into other issues that probably require a different question, so I'm sticking to this for now.
Any help would be greatly appreciated! Thanks!
Answered: I had a $1 at the last line of my script, and that was the cause of the additional output.
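For completeness, here is the script with that stray $1 removed; a bare $1 in that position acts as a pattern that is true for any line with a non-empty first field, which triggers awk's default action of printing the line:
# stime, with the trailing $1 removed
BEGIN { tsec=systime(); } /.*20[1-9][0-9] [0-1][1-9] [0-3][0-9] [0-2][0-9] [0-6][0-9] [0-6][0-9]/ {
    if (tsec < mktime($0))
        print "\t" $0
}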
Instead of awk, this is an (almost) pure Bash solution:
#!/bin/bash
# Regex for time string
re='[0-9]{4}-[0-9]{2}-[0-9]{2} ([0-9]{2}:){2}[0-9]{2}'
# Current time, in seconds since epoch
now=$(date +%s)
while IFS= read -r line; do
# Match time string
[[ $line =~ $re ]]
time_string="${BASH_REMATCH[0]}"
# Convert time string to seconds since epoch
time_secs=$(date -d "$time_string" +%s)
# If time is in the future, print line
if (( time_secs > now )); then
echo "$line"
fi
done < <(grep 'pattern' "$1")
This takes advantage of the Coreutils date formatting to convert a date to seconds since epoch for easy comparison of two dates:
$ date
Fri, Jan 22, 2016 11:23:59 PM
$ date +%s
1453523046
And the -d argument to take a string as input:
$ date -d '2016-01-22 10:03:41' +%s
1453475021
The script does the following:
Filter the input file with grep (for lines containing a generic pattern, but could be anything)
Loop over lines containing pattern
Match the line with a regex that matches the date/time string yyyy-mm-dd hh:mm:ss and extract the match
Convert the time string to seconds since epoch
Compare that value to the time in $now, which is the current date/time in seconds since epoch
If the time from the logfile is in the future, print the line
For an example input file like this one
text 2016-01-22 10:03:41 with time in the past
more text 2016-01-22 10:03:41 matching pattern but in the past
other text 2017-01-22 10:03:41 in the future matching pattern
some text 2017-01-23 10:03:41 in the future but not matching
blahblah 2022-02-22 22:22:22 pattern and also in the future
the result is
$ date
Fri, Jan 22, 2016 11:36:54 PM
$ ./future_time logfile
other text 2017-01-22 10:03:41 in the future matching pattern
blahblah 2022-02-22 22:22:22 pattern and also in the future
This is what I have working now. It works for a few different date formats and on the actual files that have more than just the date and time. The default format that it works for is yyyy/mm/dd, but it takes an argument to specify a mm/dd/yyyy format if needed.
BEGIN { tsec=systime(); dtstr=""; dt[1]="" } /.*[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
    cur=$0
    if ( fm=="mdy" ) {
        match($0,/[0-1][1-9][-_\/][0-3][0-9][-_\/]20[1-9][0-9]/) # mm dd yyyy
        section=substr($0,RSTART,RLENGTH)
        split(section, dt, "[-_//]")
        dtstr=dt[3] " " dt[1] " " dt[2]
        gsub(/[0-1][1-9][-\/][0-3][0-9][-\/]20[1-9][0-9]/, dtstr, cur)
    }
    gsub(/[-_:/,]/, " ", cur)
    match(cur,/20[1-9][0-9] [0-1][1-9] [0-3][0-9][[:space:] ]*[0-2][0-9] [0-6][0-9] [0-6][0-9]/)
    arr=mktime(substr(cur,RSTART,RLENGTH))
    if ( tsec < arr)
        print $0
}
I'll be adding more format options as I find more formats, but this works for all the different files I've tested so far. If they have a mm/dd/yyyy format, you call it with:
gawk -f stime fm=mdy filename
I plan on adding an option to specify the time window that you want to see, but this is an excellent start. Thank you guys again, this is going to drastically simplify a few tasks at work (I basically have to retrieve a great deal of data, often under time pressure depending on the situation).

How can I convert a logfile entry date into a date in the future in bash

I'm sure the answer is obvious, but I'm banging my head on it and getting a headache, and my search-fu is failing me…
I have a log file with this date format:
Sep 1 16:55:00 stuff happening
Sep 1 16:55:01 THIS IS THE LINE YOU WANT at this time stamp
Sep 1 16:55:02 more stuff
Sep 1 16:55:02 THIS IS THE LINE YOU WANT at this time stamp
Sep 1 16:55:03 blah
Sep 1 16:55:04 blah and so on…..
My ultimate goal is to:
Find the last line in the log file with a given string, e.g. "THIS IS THE LINE…"; this is my "magic time" that I will do calculations on later.
Take the date of that line and set a variable that is that date +NN seconds. The time in the future will usually be just short of 24hrs after the time in step 1, so crossing into the next day may happen, if that is important.
At some point in the script, advance the system clock to the new date/time, after which I will be checking for certain events to fire.
I know this is way wrong but so far I have figured out how to:
Grab the last date stamp for my event.
logDate=$(cat /logdir/my.log | grep "THIS IS THE LINE" | tail -1 | cut -f1,2,3 -d" ")
Returns: Sep 1 16:55:02
Convert the date into a more usable format
logDate2="$(date -d "$logDate" +"%m-%d %H:%M:%S")"; echo $logDate2
Returns: 09-17 16:55:02
I'm stuck here - what I want is:
futuredate=$logdate2 + XXXSeconds
Could someone help me with the time calculation or perhaps point out a better way to do all of this?
Thanks.
I'm stuck here - what I want is:
futuredate=$logdate2 + XXXSeconds
You can do it by converting through timestamps:
# convert log date to timestamp
logts="$(date -d "$logDate" '+%s')"
# add timestamp with seconds
futurets=$(( logts + XXXSeconds ))
# get date based from timestamp, optionally you can add a format.
futuredate=$(date -d "@${futurets}")
# Get the time in seconds since the epoch (1970-01-01 00:00:00 UTC)
dateinseconds=$(date +"%s" -d "$(grep "THIS IS THE LINE" logfile | tail -1 | awk '{print $1, $2, $3}')")
# You can also use just awk, without grep and tail, to match and print the last matching line
dateinseconds=$(date +"%s" -d "$(awk '/THIS IS THE LINE/{ d = $1 " " $2 " " $3 } END { print d }' logfile)")
gotofuture=$(( dateinseconds + 2345 )) # Add 2345 seconds
newdate=$(date -d "@${gotofuture}")
echo "$newdate"
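Putting it together with the original goal; the 86000-second offset (just short of 24 hours) and the final clock change are assumptions taken from the question, and GNU date is assumed:
logDate=$(awk '/THIS IS THE LINE/{ d = $1 " " $2 " " $3 } END { print d }' /logdir/my.log)
futuredate=$(date -d "@$(( $(date -d "$logDate" +%s) + 86000 ))")
echo "$futuredate"
# step 3 from the question: advance the system clock (requires root)
# sudo date -s "$futuredate"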
