Get List of All Lines within the date range - shell

I have a log file with entries like this:
24/09/20 | 11:22:56am | Server1 | Backup Done
28/09/20 | 10:44:05am | Server1 | Error in Config File
10/10/20 | 04:22:10am | Server1 | Error in Config File
How can I extract only the lines which are between
[Today's Date] <-> [Today's Date - 30Days]
After searching the web I found this command, which is supposed to work, but it gives an error.
sed -n "/$(date --date='-30 day' '+%d/%m/%y')/,/$(date +'%d/%m/%y')/p"
Error
cat logs.txt | sed -n "/$(date --date='-10 day' '+%d/%m/%y')/,/$(date '+%d/%m/%y')/p"
sed: -e expression #1, char 5: unknown command: `1'
Can anyone please help me extract only the lines dated within the last 30 days of the current date?

Use GNU date to get the date for 30 days ago.
Then reverse the format, to %y%m%d, in order to apply alphabetical comparison with awk.
start=$(date --date='30 days ago' +'%y%m%d')
awk -v start="$start" '{split($1, a, "/")} a[3]a[2]a[1] >= start' file
Note: In general, date formats that can be directly sorted or compared, as strings, like the ISO 8601 date, or as numbers, like the Unix timestamp, should be preferred for timestamped files.
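As a rough sketch of the same idea with both ends of the window bounded, the following filters a few sample lines in the question's dd/mm/yy format. The sample data and the fixed start and end values are assumptions chosen so the result is reproducible; with GNU date they would come from date --date='30 days ago' +%y%m%d and date +%y%m%d.

```shell
# Sample log in the question's dd/mm/yy format (illustrative data only).
log='24/09/20 | 11:22:56am | Server1 | Backup Done
28/09/20 | 10:44:05am | Server1 | Error in Config File
10/10/20 | 04:22:10am | Server1 | Error in Config File'

# Fixed bounds keep the example reproducible. In practice (GNU date assumed):
#   start=$(date --date='30 days ago' +%y%m%d); end=$(date +%y%m%d)
start=200925
end=201025

in_range=$(printf '%s\n' "$log" |
  awk -v start="$start" -v end="$end" '{
    split($1, a, "/")        # a[1]=dd, a[2]=mm, a[3]=yy
    key = a[3] a[2] a[1]     # rebuilt as yymmdd, so string order equals date order
    if (key >= start && key <= end) print
  }')

printf '%s\n' "$in_range"
```

With these bounds the lines dated 28/09/20 and 10/10/20 are printed and the 24/09/20 line is dropped. The sed attempt in the question fails for a different reason: the slashes inside the dates end the /.../ address early, so sed tries to read the rest of the date as a command.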

Related

Solaris/Unix: How to display only new lines from log file since last check

The machine is SunOS 5.10 so it behaves a bit differently than Unix.
I have an error log file that is filled by multiple users.
I get my entries by:
tail error.log | grep USER1
I get results in the format:
Jan 16 09:06:18 XYZ USER1
Jan 16 09:22:12 XYZ USER1
Jan 16 11:22:30 XYZ USER1
What I need is a command that would only print my errors that were logged after the last time I checked and print nothing otherwise.
I decided to get my last entry into another file by:
grep USER1 error.log | tail -1 > temp_error.log
From there I'd cut only the first chars containing the date by:
less temp_error.log | cut -c 1-15
This would give me the result:
Jan 16 11:22:30
I then use a 'sed' command to get all lines after this one in the file:
sed -e '1,/Jan 16 11:22:30/d' error.log | grep USER1
This would work but of course I have to paste the date every time. I decided to automate this with a variable by executing the below commands:
LAST_ENTRY=`less temp_error.log | cut -c 1-15`
sed -e '1,/$LAST_ENTRY/d' error.log | grep USER1
This doesn't work because the variable is not passed correctly to the 2nd command and I don't understand why. Please, help me to understand what is wrong with this.
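The likely cause is quoting: the shell does not expand variables inside single quotes, so sed receives the literal text $LAST_ENTRY as its pattern. A minimal sketch of the fix, switching the sed script to double quotes, is shown below. The sample log lines are made up for illustration, and the $(...) form assumes a POSIX shell such as ksh or bash (the old Bourne sh on Solaris needs backticks instead).

```shell
# Illustrative log content; in practice this would be error.log.
log='Jan 16 09:06:18 XYZ USER1
Jan 16 09:22:12 XYZ USER2
Jan 16 11:22:30 XYZ USER1
Jan 16 11:40:02 XYZ USER1
Jan 16 11:45:10 XYZ USER2'

# Timestamp of the last entry already seen
# (in the question it comes from: cut -c 1-15 temp_error.log).
LAST_ENTRY='Jan 16 11:22:30'

# Double quotes let the shell substitute $LAST_ENTRY before sed runs.
# sed then deletes everything up to and including the first matching line.
new_entries=$(printf '%s\n' "$log" | sed -e "1,/$LAST_ENTRY/d" | grep USER1)

printf '%s\n' "$new_entries"
```

Only the USER1 entry logged after 11:22:30 is printed. Note that this relies on the timestamp containing no characters that are special to sed, such as a slash.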

Comparing and transforming a date that is piped to (probably) awk?

I've got a reasonably complicated string of piped shell commands (let's assume it's bunch | of | commands), which together produces several rows of output, in this format:
some_path/some_file.csv 1439934121
...where 1439934121 is the file's last-modified timestamp.
What I need to do is see if it's a timestamp on the current day, i.e. on or after last midnight, and then include just the lines where that is true.
I assume this means that some string (e.g. the word true) should either replace or be appended to the timestamps of those lines for grep to distinguish them from ones where the timestamps are those of an earlier date.
To put it in shell command terms:
bunch | of | commands | ????
...should produce:
some_path/some_file.csv true or some_path/some_file.csv 1439934121 true
...for which I could easily grep (obviously assuming that last midnight <= 1439934121 <= current time).
What kind of ???? would do this? I'm almost certain that awk can do what I need it to, so I've looked at it and date, but I'm basically doing awk-by-google with no skills and getting nowhere.
Don't feel constrained by my tool assumptions; if you can achieve this with alternate means, given the output of bunch | of | commands but still using shell tools and piping, I'm all ears. I'd like to avoid temp files or Perl, if possible :-)
I'm using gawk + bash 4.3 on Ubuntu Linux, specifically, and have no portability concerns.
date -d 'today 00:00:00' gives last midnight, and adding the %s format returns the Unix timestamp of that moment:
$ date -d'today 00:00:00'
Thu Sep 3 00:00:00 CEST 2015
$ date -d 'today 00:00:00' "+%s"
1441231200
You can probably pipe to an awk doing something like:
... | awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1'
That is, use the ternary operator to check the value of $2 and replace with either of the values true/false depending on the result:
awk -v midnight="$(date ...)" '{$2= ($2>midnight) ? "true" : "false"}1'
Test
$ cat a
hello 1441231201
bye 23
$ awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1' a
hello true
bye false
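If only the matching lines are wanted, the comparison can be used directly as the awk pattern, which avoids the extra grep step. This is a sketch under assumptions: the file names are invented, and midnight is fixed to the value shown above so the example is reproducible (normally it would be $(date -d 'today 00:00:00' +%s) with GNU date). It uses >= so that a file modified exactly at midnight counts as today.

```shell
# Illustrative rows in the "path timestamp" format from the question.
rows='some_path/a.csv 1441231199
some_path/b.csv 1441231200
some_path/c.csv 1441240000'

# Fixed for reproducibility; normally: midnight=$(date -d 'today 00:00:00' +%s)
midnight=1441231200

# A bare condition is a complete awk program: matching lines are printed as-is.
today_rows=$(printf '%s\n' "$rows" | awk -v midnight="$midnight" '$2 >= midnight')

printf '%s\n' "$today_rows"
```

Here b.csv and c.csv are kept and a.csv, modified one second before midnight, is dropped.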

Grep a time stamp in the H:MM:SS format

I'm working on a file and need to grep the lines with a time stamp in the H:MM:SS format. I tried egrep '[0-9]\:[0-9]\:[0-9]', but it didn't work for me. What am I doing wrong in the regex?
$ date -u | grep -E '[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}'
Fri May 2 00:59:47 UTC 2014
Try a site like http://regexpal.com/
Here is the fix:
grep '[0-9]:[0-9][0-9]:[0-9][0-9]'
If you need only the timestamp itself and your grep is GNU grep:
grep -o '[0-9]:[0-9][0-9]:[0-9][0-9]'
To be stricter, restrict each field to plausible two-digit time values:
grep '[0-2][0-9]:[0-5][0-9]:[0-5][0-9]'
Simplest way that I know of:
grep -E '([0-9]{2}:){2}[0-9]{2}' file
If you need month and day also:
grep -E '.{3,4} .{,2} ([0-9]{2}:){2}[0-9]{2}' file
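The patterns above can be combined into one stricter expression that accepts a one- or two-digit hour from 0 to 23 and two-digit minutes and seconds from 00 to 59. The sketch below is one way to write it, not the only one; the sample lines are invented, and the -o option (print only the match) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Illustrative input: two valid times, a version number, and an invalid time.
sample='Fri May 2 00:59:47 UTC 2014
started at 9:05:07 today
version 1.2.3 has no time
bad value 12:99:00'

# Hour: 0-9, 00-19 or 20-23. Minutes and seconds: 00-59.
times=$(printf '%s\n' "$sample" |
  grep -Eo '([01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]')

printf '%s\n' "$times"
```

This prints 00:59:47 and 9:05:07. The original attempt [0-9]\:[0-9]\:[0-9] fails because each [0-9] matches exactly one digit, while the minute and second fields have two.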

Shell Script to get exception from logs for last one hour

I am developing a script for the Solaris platform that greps the last hour of logs, checks for any exception, and sends an email.
I did the following steps:
grep -n -h "`date +'%Y-%m-%d %H:%M'`" test.logs
The above command gives me a line number, and then I do the following:
tail +6183313 test.log | grep 'exception'
sample logs
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | 1201 or 101 is returned as exception code from SP, but it is ignored
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | SP error ignored and mock success returned
2014-02-17 10:15:02,626 | INFO | 354466740-102951 | ServiceFulfill | 183 - org.apache.cxf | Outbound Message
Please suggest any better alternative to perform above task.
With GNU date, one can use:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The first step above is to select all log entries from the last hour. This is done with grep by looking for all lines beginning with the year-month-day and hour that matches one hour ago:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs
The next step in the pipeline is to select from those lines the ones that have exceptions:
grep 'exception'
The last step in the pipeline is to send out the mail:
mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The above sends mail to ImranRazaKhan (or whatever email address you chose) with the subject line of "exceptions in last hour of test.logs".
The convenience of having the -d option to date should not be underestimated. It might seem simple to subtract 1 from the current hour but, if the current hour is 12am, then we need to adjust both the day and the hour. If the hour was 12am on the first of the month, we would also have to change the month. And likewise for year. And, of course, February requires special consideration during leap years.
Adapting the above to Solaris:
Consider three cases:
Under Solaris 11 or better, the GNU date utility is available at /usr/gnu/bin/date. Thus, we need simply to specify a path for date:
grep "^$(/usr/gnu/bin/date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
Under Solaris 10 or earlier, one can download & install GNU date
If GNU date is still not available, we need to find another way to find the date and time for one hour ago. The simplest workaround is likely to select a timezone that is one hour behind your timezone, using a zone name that your system's zoneinfo database recognizes. If that timezone was, say, Hong Kong, then use:
grep "^$(TZ=Asia/Hong_Kong date +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
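Note that the grep commands above select the previous clock hour (at 10:30 they match everything stamped 09:xx), not the last 60 minutes. If a rolling window is wanted, one possible sketch is to compare the timestamp prefix as a string, which works because the yyyy-mm-dd hh:mm:ss layout sorts chronologically. The log lines and the fixed since value are assumptions for illustration; with GNU date the value would come from date -d '1 hour ago' +'%Y-%m-%d %H:%M:%S'.

```shell
# Illustrative log lines in the format shown in the question.
log='2014-02-17 09:05:10,101 | WARN | old exception outside the window
2014-02-17 09:40:02,625 | WARN | exception code returned from SP
2014-02-17 10:01:12,300 | INFO | Outbound Message
2014-02-17 10:10:45,012 | ERROR | unexpected exception while sending'

# Fixed for reproducibility; normally (GNU date assumed):
#   since=$(date -d '1 hour ago' +'%Y-%m-%d %H:%M:%S')
since='2014-02-17 09:15:02'

# The first 19 characters are the timestamp without milliseconds.
# Comparing them as strings is equivalent to comparing the times.
recent=$(printf '%s\n' "$log" |
  awk -v since="$since" 'substr($0, 1, 19) >= since && /exception/')

printf '%s\n' "$recent"
```

This keeps the 09:40:02 and 10:10:45 entries. The output could then be piped to mail as in the commands above.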
You can do it like this, building the timestamp prefix in the same format the log uses:
dt="$(date -d '1 hour ago' '+%Y-%m-%d %H')"
awk -v dt="$dt" 'index($0, dt) == 1 && /exception/' test.logs
Scanning through millions of log lines sounds terribly inefficient. I would suggest changing the log4j configuration of your application (that is what the format looks like) to cut a new log file every hour. This way, tailing the most recent file becomes a breeze.

Trim text and add timestamp?

So basically I have my output as the following:
<span id="PlayerCount">134,015 people currently online</span>
What I want is a way to trim it to show:
134,015 - 3:24:20AM - Oct 24
Can anyone help? Also note the number may change, so is it possible to output everything between ">" and the "c" in "currently", and add a timestamp somehow?
Using commands from terminal in Linux, so that's called bash right?
Do you perhaps mean something like:
$ echo '<span id="PlayerCount">134,015 people currently online</span>' | sed \
    -e 's/^[^>]*>//' \
    -e "s/currently.*$/$(date '+%r %b %d %Y')/"
which generates:
134,015 people 03:36:30 PM Oct 24 2011
The echo is just for the test data. The first sed command will change everything up to the first > character into nothing (ie, delete it).
The second one will change everything from the currently to the end of the line with the current date in your desired format (although I have added the year since I'm a bit of a stickler for detail).
The relevant arguments for date here are:
%r locale's 12-hour clock time (e.g., 11:11:04 PM)
%b locale's abbreviated month name (e.g., Jan)
%d day of month (e.g., 01)
%Y year
A full list of format specifiers can be obtained from the date man page (execute man date from a shell).
A small script which will give you the desired information from the page you mentioned in the comments is:
#!/usr/bin/bash
wget --output-document=- http://runescape.com/title.ws 2>/dev/null \
| grep PlayerCount \
| head -n 1 \
| sed 's/^[^>]*>//' \
| sed "s/currently.*$/$(date '+%r %b %d %Y')/"
Running this gives me:
pax$ ./online.sh
132,682 people 04:09:17 PM Oct 24 2011
In detail:
The wget bit pulls down the web page and writes it on standard output. The standard error (progress bar) is thrown away.
The grep extracts only lines with the word PlayerCount in them.
The head throws away all but the first of those.
The first sed strips up to the first > character.
The second sed changes the trailing text to the current date and time.
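For output closer to the "134,015 - 3:24:20AM - Oct 24" layout asked for in the question, the count can be captured on its own and the timestamp formatted separately. This is only a sketch: the html variable stands in for the downloaded line, and %I prints a zero-padded hour (03 rather than 3), since the non-padded %l is an extension that not every date supports.

```shell
# Stand-in for the line fetched from the page.
html='<span id="PlayerCount">134,015 people currently online</span>'

# Capture only the digits and commas that precede " people".
count=$(printf '%s\n' "$html" |
  sed -n 's/.*id="PlayerCount">\([0-9,]*\) people.*/\1/p')

# 12-hour time with AM/PM, then abbreviated month and day of month.
stamp=$(date '+%I:%M:%S%p - %b %d')

printf '%s - %s\n' "$count" "$stamp"
```

Because the sed expression uses -n with the p flag, a line that does not contain the PlayerCount span produces no output at all instead of being passed through unchanged.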
Quickhack(tm):
$ people=$(echo '<span id="PlayerCount">134,015 people currently online</span>' | \
sed -e 's/^.*>\(.*\) people.*$/\1/')
$ echo $people - $(date)
134,015 - Mon Oct 24 09:36:23 CEST 2011
produce_OUTPUT | grep -o '[0-9,]\+' | while read count; do
printf "%s - %s\n" $count "$(date +'%l:%M:%S %p - %b %e')"
done
