Grep messages for the last hour [duplicate] - bash

This question already has answers here:
Filter log file entries based on date range
(5 answers)
Closed 6 years ago.
I have an application which emits logs in this format:
00:00:10,799 ERROR [stderr] (http-prfilrjb08/10.1.29.34:8180-9) {}:return code: 500
I need to monitor the log file for new ERRORs that happened in the last hour. Looking at some tutorials, I've come up with the following grep:
grep "^$(date -d -1 hour +'%H:%M:%S')" /space/log/server.log | grep 'ERROR'
However, nothing is grepped! Can you help me fix it?
Thanks!

You need quotes around the -1 hour, and you also want to drop the minutes and seconds from the format (your current command matches only the exact second one hour ago):
grep "^$(date -d '-1 hour' +'%H')" /space/log/server.log | grep 'ERROR'

grep -E "^($(date -d '-1 hour' '+%H')|$(date '+%H')):[0-9]{2}:[0-9]{2}" /space/log/server.log | grep 'ERROR'
Let's take a look at the parts:
grep -E tells grep to use extended regular expressions (so we don't need to escape all those brackets).
date -d '-1 hour' '+%H' prints the previous hour, and date '+%H' prints the current hour. These need to be evaluated at runtime and captured in a group; that's why we have the (date|date) structure (you'll probably want data not only from the previous hour, but also from the currently running one).
Next you need to specify that you are indeed looking at timestamps. We use : to delimit hours, minutes and seconds. A two-digit number group can be matched with the [0-9]{2} regexp (this is basically identical to [0-9][0-9], just shorter).
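For example, if the script runs at 14:05, the command above expands to:
grep -E "^(13|14):[0-9]{2}:[0-9]{2}" /space/log/server.log | grep 'ERROR'
which matches every line stamped between 13:00:00 and 14:59:59.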
There you go.
P.S. I'd recommend sed.
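A minimal sed sketch of that idea (my own variant, assuming the file holds at most one day of chronologically ordered entries): start printing at the first line stamped with the previous hour and continue to the end of the file, then filter.
sed -n "/^$(date -d '-1 hour' +'%H'):/,\$p" /space/log/server.log | grep 'ERROR'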

Related

tail a log file from a specific line number

I know how to tail a text file with a specific number of lines,
tail -n50 /this/is/my.log
However, how do I make that line count a variable?
Let's say I have a large log file which is appended to daily by some program; all lines in the log file start with a datetime in this format:
Day Mon YY HH:MM:SS
Every day I want to output the tail of the log file, but only for the previous day's records. Let's say this output runs just after midnight; I'm not worried about the tail spilling over into the next day.
I just want to be able to work out how many rows to tail, based on the first occurrence of yesterday's date...
Is that possible?
Answering the question of the title, for anyone who comes here that way: head and tail can both accept a count for how much of the file to exclude.
For tail, use -n +num for the line number num to start at
For head, use -n -num for the number of lines not to print
This is relevant to the actual question if you remember the number of lines from the previous time you ran the command, and then use that number with tail -n +$prevlines to get the next portion of the partial log, regardless of how often the log is checked.
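As a sketch of that bookkeeping (the state-file path is just an example):
state=/tmp/my.log.lines                    # line count remembered from the previous run
prev=$(cat "$state" 2>/dev/null || echo 0)
tail -n +"$((prev + 1))" /this/is/my.log   # print only the lines added since that run
wc -l < /this/is/my.log > "$state"         # record the new total for next time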
Answering the actual question, one way to print everything after a certain line that you can grep for is to use the -A option with a ridiculously large count. This may be more useful than the other answers here, as you can get several days of results. So, to get everything from yesterday and so far today:
grep "^`date -d yesterday '+%d %b %y'`" -A1000000 log_file.txt
You can combine 2 greps to print a range between 2 dates (see the sketch below).
Note that this relies on the date actually occurring in the log file. It has the weakness that if no events were logged on a particular day used as the range marker, then it will fail to find anything.
To resolve that you could inject dummy records for the start and end dates and sort the file before grepping. This is probably overkill, though, and the sort may be expensive, so I won't demonstrate it here.
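For completeness, here is what the two-grep combination could look like (GNU grep; it assumes both boundary dates really occur in the file). The first grep prints everything from the first line of the start date onward, and the second keeps everything up to the last line of the end date:
grep "^`date -d '2 days ago' '+%d %b %y'`" -A1000000 log_file.txt \
  | grep "^`date -d yesterday '+%d %b %y'`" -B1000000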
I don't think tail has any functionality like this.
You could work out the beginning and ending line numbers using awk, but if you just want to extract those lines from the log file, the simplest way is probably to use grep combined with date. Matching yesterday's date at the beginning of the line should work:
grep "^`date -d yesterday '+%d %b %y'`" < log_file.txt
You may need to adjust the date format to match exactly what you've got in the log file.
You can do it without tail; just grep rows with the previous date (note the format string has to match the log's Day Mon YY layout):
grep "^$(date -d 'yesterday 13:00' '+%d %b %y')" my.log
And if you need the line count, you can append
| wc -l
I worked this out through trial and error by getting the line numbers for the first line containing the date and the total lines, as follows:
lines=$(wc -l < myfile.log)
start=$(grep -n "$datestring" myfile.log | head -n1 | cut -f1 -d:)
n=$((lines - start + 1))   # +1 so the first matching line itself is included
and then a tail, based on that:
tail -n "$n" myfile.log
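Note that tail can also start at a line number directly (see -n +num above), which avoids the arithmetic and its off-by-one pitfalls:
tail -n +"$start" myfile.log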

Last Day of Month in csvfile

I'm trying to delete all rows of a CSV file that do not match the last day of a month, but I haven't found the right solution.
date,price
2018-07-02,162.17
2018-06-29,161.94
2018-06-28,162.22
2018-06-27,162.32
2018-06-12,163.01
2018-06-11,163.53
2018-05-31,164.87
2018-05-30,165.59
2018-05-29,165.42
2018-05-25,165.96
2018-05-02,164.94
2018-04-30,166.16
2018-04-27,166.69
The output I want to get is:
date,price
2018-06-29,161.94
2018-05-31,164.87
2018-04-30,166.16
I tried it with cut + grep:
cut -d, -f1 file.csv | grep -E "28|29|30"
That works, but returns nothing when I combine it with -f1,2.
I found csvkit, which seems to be the right tool, but I haven't found a way to do multiple matches.
csvgrep -c 1 -m 30 file.csv
This brings me the right result, but how can I combine multiple search options? I tried -m 28,29,30 and -m 28 -m 29 -m 30, but neither works. Ideally it would match the last day of every month.
Maybe someone here has an idea.
Thank you, and have a nice Sunday.
Silvio
You want to get all records of the LAST day of the month. But months vary in length (28-29-30-31).
I don't see why you used cut to extract the first field (the date part): the second field holds prices, which don't look like dates at all, so there is no risk of false matches when grepping whole lines.
I suggest using grep directly to display the lines that match the pattern mm-dd, where mm is the month number and dd is the last day of that month.
This command should do the trick:
grep -E "01-31|02-(28|29)|03-31|04-30|05-31|06-30|07-31|08-30|09-31|10-30|11-31|12-30" file.csv
This command will give the following output:
2018-05-31,164.87
2018-04-30,166.16
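Note that 2018-06-29 from the desired output is not matched: it is the last day of June present in the data, not the calendar last day. If what you want is the last available day of each month, here is an awk sketch that relies on the file being sorted newest-first and keeps the first (that is, latest) record seen for each month; be aware it also emits the newest, possibly incomplete month (2018-07-02 here):
awk -F, 'NR == 1 { print; next }        # keep the header line
         { month = substr($1, 1, 7) }   # e.g. "2018-06"
         month != prev { print; prev = month }' file.csv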

Can I replace hours in sed command with variable? [Case Closed] [duplicate]

This question already has answers here:
Replace a string in shell script using a variable
(12 answers)
Closed 6 years ago.
I'm trying to extract certain timestamp ranges (in my case, every 15 minutes) from a logfile using the sed command in a bash script.
My question is, can I replace the hours in the command with a variable?
This is script that I want:
#!/bin/bash
hour=`date +%H`   # current hour, e.g. 14
sed -n '/$hour:15:00/,/$hour:30:00/p' file.log | nice grep Event
The result will print the logfile from 14:15:00 until 14:30:00. But there's a problem when the range runs from minute 45 to minute 60, i.e. 14:45 - 15:00. Is there any solution for this?
UPDATE
This issue is already solved; the command below works for me.
sed -n "/${hour}:15:00/,/${hour}:30:00/p" file.log | nice grep Event
Other reference: Replace a string in shell script using a variable
Thank you.
== Case closed ==
Well, to answer the question - yes, you would take the variable out of the quotes and then it should use the value:
sed -n '/'"$hour"':15:00/,/'"$hour"':30:00/p' file.log | nice grep Event
You could also just use double quotes around the expression:
sed -n "/${hour}:15:00/,/${hour}:30:00/p" file.log | nice grep Event

Shell Script to get exception from logs for last one hour

I am developing a script for the Solaris platform that greps the last hour of logs, checks for any exceptions, and sends an email.
I did the following steps:
grep -n -h "`date +'%Y-%m-%d %H:%M'`" test.logs
The above command gives me a line number, and then I do the following:
tail +6183313 test.log | grep 'exception'
sample logs
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | 1201 or 101 is returned as exception code from SP, but it is ignored
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | SP error ignored and mock success returned
2014-02-17 10:15:02,626 | INFO | 354466740-102951 | ServiceFulfill | 183 - org.apache.cxf | Outbound Message
Please suggest any better alternative for performing the above task.
With GNU date, one can use:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The first step above is to select all log entries from the last hour. This is done with grep by looking for all lines beginning with the year-month-day and hour that matches one hour ago:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs
The next step in the pipeline is to select from those lines the ones that have exceptions:
grep 'exception'
The last step in the pipeline is to send out the mail:
mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The above sends mail to ImranRazaKhan (or whatever email address you chose) with the subject line of "exceptions in last hour of test.logs".
The convenience of having the -d option to date should not be underestimated. It might seem simple to subtract 1 from the current hour but, if the current hour is 12am, then we need to adjust both the day and the hour. If the hour was 12am on the first of the month, we would also have to change the month. And likewise for year. And, of course, February requires special consideration during leap years.
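For example, crossing a leap-year February boundary (GNU date):
date -d '2016-03-01 00:30 -1 hour' +'%Y-%m-%d %H'
prints 2016-02-29 23, with the day, month, and leap-year handling all done for us.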
Adapting the above to Solaris:
Consider three cases:
Under Solaris 11 or better, the GNU date utility is available at /usr/gnu/bin/date. Thus, we need simply to specify a path for date:
grep "^$(/usr/gnu/bin/date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
Under Solaris 10 or earlier, one can download & install GNU date
If GNU date is still not available, we need to find another way to find the date and time for one hour ago. The simplest workaround is likely to select a timezone that is one hour behind your timezone. If that timezone was, say, Hong Kong, then use:
grep "^$(TZ=HongKong date +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
You can do like this:
dt="$(date -d '1 hour ago' "+%m/%d/%Y %H:%M:%S")"
awk -v dt="$dt" '$0 ~ dt && /exceltion/' test.logs
Scanning through millions of lines of log sounds terribly inefficient. I would suggest changing the log4j (which is what this looks like) configuration of your application to cut a new log file every hour. This way, tailing the most recent file becomes a breeze.

count the occurrences of a string in a log file per hour (with a shell script)

I want to make a script to count the occurrences of a specific string (domain name)
from a log file (mail log) per hour, in order to check how many emails they sent per hour.
I know there are many easy and different ways to find a string in a file (like grep etc.)
and count the lines (like wc -l),
but I don't know how to do it per hour.
Yes, I can call the script every 60 minutes via a cron job, but that would read the log file from the beginning up to the moment the script was executed, not just the lines written in the last 60 minutes, and I don't know how to overcome this.
Note:
the command that I'm using to show all the sent emails per domain is:
cat /usr/local/psa/var/log/maillog | grep -i qmail-remote-handlers \
  | grep from | awk '{ print $6 }' | gawk -F# '{ print $2 }' \
  | sort | uniq -c | sort -n | tail
the result is like this:
8 domain1.tld
45 domain34.tld
366 domain80948.tld
etc etc
The main point of the question is this one:
Yes I can call the script every 60 minutes via a cron job but this would read the log file from the beginning till the moment the script was executed, and not just the lines written in the last 60 minutes, and I don't know how to overcome this.
How could you solve the problem?
1. You could save the number of lines in the log file the last time you processed it; on the next run, skip those lines using sed.
2. The same as in 1, but save the number of bytes of the processed file; then skip them using dd.
3. You could rotate (rename) the file after processing it (this method has the disadvantage that you need to reconfigure your system's log handling).
I personally would choose method 2. It is very efficient and very simple to implement; a minimal sketch follows.
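A sketch of method 2, wired into the pipeline from the question (the state-file path is just an example):
state=/tmp/maillog.offset                     # byte offset remembered from the previous run
offset=$(cat "$state" 2>/dev/null || echo 0)
size=$(wc -c < /usr/local/psa/var/log/maillog)
dd if=/usr/local/psa/var/log/maillog bs=1 skip="$offset" 2>/dev/null \
  | grep -i qmail-remote-handlers | grep from \
  | awk '{ print $6 }' | gawk -F# '{ print $2 }' \
  | sort | uniq -c | sort -n
echo "$size" > "$state"                       # remember where this run stopped
(bs=1 keeps the skip byte-exact but is slow on large files; tail -c +"$((offset + 1))" does the same job faster.)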
