grep all lines for a specific date in log-files - bash

I need to grep all lines for "yesterday" in /var/log/messages.
When I use the following snippet, I get zero results because the dates are in the format "Jun 9". (It doesn't show here, but in the log file, days of the month smaller than 10 are padded with an extra space.)
cat /var/log/messages | grep `date --date="yesterday" +%b\ %e`
When I enter
$ date --date="yesterday" +%b\ %e
on the commandline, it returns yesterday's date, with padding.
But when I combine it with grep and the backticks, the extra padding gets suppressed, which gives me zero results.
What do I need to change so that the "date" gets evaluated with extra padding?

You should be able to fix this by putting quotes around the backticks:
cat /var/log/messages | grep "`date --date="yesterday" +%b\ %e`"
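The quotes matter because an unquoted command substitution is split on whitespace, so the padded date reaches grep as two separate words instead of one pattern. A minimal sketch of that effect, using a hard-coded string as a stand-in for the date output (the count_args helper and the sample log lines are made up for illustration):

```shell
# Stand-in for the output of: date --date="yesterday" +%b\ %e
# (hard-coded so the result does not depend on today's date)
stamp='Jun  9'

# Hypothetical helper: prints how many arguments it received
count_args() { echo "$#"; }

# Unquoted: the shell splits the value on whitespace
unquoted=$(count_args $stamp)
# Quoted: the value is passed as a single argument, padding intact
quoted=$(count_args "$stamp")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"

# Against a small sample log, the quoted pattern matches the padded date
log=$(mktemp)
printf '%s\n' \
  'Jun  9 10:00:01 host app: started' \
  'Jun 10 11:00:00 host app: stopped' > "$log"
matches=$(grep -c "$stamp" "$log")
echo "matching lines: $matches"
rm -f "$log"
```

The same applies to the $(...) form of command substitution: "$(date ...)" keeps the padding, a bare $(date ...) does not.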

Related

tail a log file from a specific line number

I know how to tail a text file with a specific number of lines,
tail -n50 /this/is/my.log
However, how do I make that line count a variable?
Let's say I have a large log file which is appended to daily by some program, all lines in the log file start with a datetime in this format:
Day Mon YY HH:MM:SS
Every day I want to output the tail of the log file, but only for the previous day's records. Let's say this output runs just after midnight; I'm not worried about the tail spilling over into the next day.
I just want to be able to work out how many rows to tail, based on the first occurrence of yesterday's date...
Is that possible?
Answering the question of the title, for anyone who comes here that way, head and tail can both accept a code for how much of the file to exclude.
For tail, use -n +num for the line number num to start at
For head, use -n -num for the number of lines not to print
This is relevant to the actual question if you have remembered the number of lines from the previous time you did the command, and then used that number for tail -n +$prevlines to get the next portion of the partial log, regardless of how often the log is checked.
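A small sketch of both options on a throwaway file (the file contents are invented for illustration). Two assumptions to note: head -n -num is a GNU extension, and tail -n +N starts at line N itself, so resuming after N already-seen lines means starting at N + 1 here.

```shell
log=$(mktemp)
printf 'line %d\n' 1 2 3 4 5 > "$log"

# tail -n +3: start printing at line 3
from_third=$(tail -n +3 "$log")

# head -n -2 (GNU head): print everything except the last 2 lines
all_but_last_two=$(head -n -2 "$log")

# Remember how many lines have been seen so far
seen=$(wc -l < "$log")

# Later, the program appends more lines
printf 'line %d\n' 6 7 >> "$log"

# Print only what was added since the last check
new_lines=$(tail -n +"$((seen + 1))" "$log")

printf '%s\n' "$new_lines"
rm -f "$log"
```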
Answering the actual question, one way to print everything after a certain line that you can grep is to use the -A option with a ridiculous count. This may be more useful than the other answers here as you can get a number of days of results. So to get everything from yesterday and so-far today:
grep "^`date -d yesterday '+%d %b %y'`" -A1000000 log_file.txt
You can combine 2 greps to print between 2 date ranges.
Note that this relies on the date actually occurring in the log file. It has the weakness that if no events were logged on a particular day used as the range marker, then it will fail to find anything.
To resolve that you could inject dummy records for the start and end dates and sort the file before grepping. This is probably overkill, though, and the sort may be expensive, so I won't give an example.
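One way the "combine 2 greps" idea could look, as a rough sketch: the first grep prints from the start date onward, the second keeps everything up to the last line matching the end date. The sample log and the two dates are invented, and the same caveat applies: both dates must actually occur in the file.

```shell
# Sample log in the "DD Mon YY HH:MM:SS" layout assumed by the answer above
log=$(mktemp)
printf '%s\n' \
  '01 Jun 24 09:00:00 a' \
  '02 Jun 24 09:00:00 b' \
  '02 Jun 24 10:00:00 c' \
  '03 Jun 24 08:00:00 d' \
  '04 Jun 24 08:00:00 e' > "$log"

start='02 Jun 24'   # first day to include
end='03 Jun 24'     # last day to include

# -A keeps everything after the first start match,
# -B keeps everything before the last end match
range=$(grep -A1000000 "^$start" "$log" | grep -B1000000 "^$end")

printf '%s\n' "$range"
rm -f "$log"
```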
I don't think tail has any functionality like this.
You could work out the beginning and ending line numbers using awk, but if you just want to extract those lines from the log file, the simplest way is probably to use grep combined with date to do it. Matching yesterday's date at the beginning of the line should work:
grep "^`date -d yesterday '+%d %b %y'`" < log_file.txt
You may need to adjust the date format to match exactly what you've got in the log file.
You can do it without tail, just grep rows with previous date:
cat my.log | grep "$( date -d "yesterday 13:00" '+%d %m %Y')"
And if you need line count you can add
| wc -l
I worked this out through trial and error by getting the line numbers for the first line containing the date and the total lines, as follows:
lines=$(wc -l < myfile.log)
start=$(grep -n "$datestring" myfile.log | head -n1 | cut -f1 -d:)
# +1 so that the first matching line itself is included in the tail
n=$((lines-start+1))
and then a tail, based on that:
tail -n$n myfile.log
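A possible variation on the same idea, sketched under the assumption that the date sits at the start of each line: since tail -n +N starts at line N, the line number from grep -n can be used directly, without counting the total lines. The sample log and the fixed datestring are made up; in real use datestring would come from date.

```shell
log=$(mktemp)
printf '%s\n' \
  '01 Jun 24 23:59:59 old entry' \
  '02 Jun 24 00:00:01 first entry' \
  '02 Jun 24 12:00:00 second entry' > "$log"

# Fixed here for a repeatable result; normally something like
# datestring=$(date -d yesterday '+%d %b %y')
datestring='02 Jun 24'

# Line number of the first line that starts with the date
start=$(grep -n "^$datestring" "$log" | head -n 1 | cut -d: -f1)

out=''
if [ -n "$start" ]; then
  # Print from that line to the end of the file
  out=$(tail -n +"$start" "$log")
else
  echo "no entries for $datestring" >&2
fi

printf '%s\n' "$out"
rm -f "$log"
```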

Unix: Removing date from a string in single command

To satisfy some legacy code, I had to add a date to a filename as shown below (it is definitely needed, and I cannot modify the legacy code :( ). But I need to remove the date within the same command, without going to a new line. The command is read from a text file, so I must do this within a single command.
$(echo "$file_name".`date +%Y%m%d`| sed 's/^prefix_//')
Here I am removing the prefix from the filename and appending a date to it. I also want to remove the date which I added. For example, prefix_filename.txt or prefix_filename.zip should give the output below.
Expected output:
filename.txt
filename.zip
Current output:
filename.txt.20161002
filename.zip.20161002
Assuming all the files are formatted as filename.ext.date, you can pipe the output to the cut command and keep only the first and second fields:
~> X=filename.txt.20161002
~> echo $X | cut -d"." -f1,2
filename.txt
I am not sure that I understand your question correctly, but perhaps this does what you want:
$(echo "$file_name".`date +%Y%m%d`| sed -e 's/^prefix_//' -e 's/\.[^.]*$//')
Sample input:
cat sample
prefix_original.txt.log.tgz.10032016
prefix_original.txt.log.10032016
prefix_original.txt.10032016
prefix_one.txt.10032016
prefix.txt.10032016
prefix.10032016
Grep from the start of the string up to a literal dot "." followed by a digit:
grep -oP '^.*(?=\.\d)' sample
prefix_original.txt.log.tgz
prefix_original.txt.log
prefix_original.txt
prefix_one.txt
prefix.txt
prefix
Perhaps the following should be used instead, so that lines with no ".digit" suffix are still printed in full:
grep -oP '^.*(?=\.\d)|^.*$' sample
If I understand your question correctly, you want to remove the date part from a variable, AND you already know from the context that the variable DOES contain a date part and that this part comes after the last period in the name.
In this case, the question boils down to removing the last period and what comes after.
This can be done (Posix shell, bash, zsh, ksh) by
filename_without=${filename_with%.*}
assuming that filename_with contains the filename which has the date part in the end.
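A short sketch of how this could be applied to the names from the question, with the prefix removed by the matching # expansion. The variable names other than file_name are made up for illustration, and the date is hard-coded to keep the result repeatable.

```shell
file_name='prefix_filename.txt'

# Name as the legacy code sees it: original name plus a date suffix
# (normally built with: "$file_name.$(date +%Y%m%d)")
with_date="$file_name.20161002"

# ${var#pattern} removes the shortest match from the front
no_prefix=${with_date#prefix_}

# ${var%pattern} removes the shortest match from the end,
# here the last period and everything after it
no_date=${no_prefix%.*}

echo "$no_date"
```

Both expansions are plain POSIX shell, so no external command such as sed or cut is needed.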
% cat example
filename.txt.20161002
filename.zip.20161002
% cat example | sed 's/\.[0-9][0-9]*$//'
filename.txt
filename.zip
%

Bash grep keyword plus trailing numbers upto first whitespace

I'm looking to filter tcpdump output and extract only two constant element names, each with its string of changing numbers, which is followed by whitespace and more unwanted data. Is there a way to extract only up to the first whitespace using grep or sed? I've been using bash for about a month and this is the first time my google-fu has failed me.
Example output: red23:34:23 black23:43 purple00:55:22 yellow32:43 green10:10 (color names are constant)
Looking to extract: black23:43 yellow32:43
The -o option in grep prints only the matching part, so to get just black and the numbers you might do this:
output='red23:34:23 black23:43 purple00:55:22 yellow32:43 green10:10'
echo "$output" | grep -Eo 'black[0-9]+:[0-9]+'
and you could parameterize it like so:
color='green'
echo "$output" | grep -Eo "${color}[0-9]+:[0-9]+"
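To pull out both wanted names in one pass, an alternation could be used. This is only a sketch: it assumes the value after each name runs up to the next whitespace, which is why it matches any run of non-space characters rather than a fixed digits:digits shape.

```shell
output='red23:34:23 black23:43 purple00:55:22 yellow32:43 green10:10'

# Match either name followed by everything up to the first whitespace;
# paste joins the one-per-line matches back onto a single line
result=$(echo "$output" \
  | grep -Eo '(black|yellow)[^[:space:]]+' \
  | paste -s -d ' ' -)

echo "$result"
```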

Comparing and transforming a date that is piped to (probably) awk?

I've got a reasonably complicated string of piped shell commands (let's assume it's bunch | of | commands), which together produces several rows of output, in this format:
some_path/some_file.csv 1439934121
...where 1439934121 is the file's last-modified timestamp.
What I need to do is see if it's a timestamp on the current day, i.e. on or after last midnight, and then include just the lines where that is true.
I assume this means that some string (e.g. the word true) should either replace or be appended to the timestamps of those lines for grep to distinguish them from ones where the timestamps are those of an earlier date.
To put it in shell command terms:
bunch | of | commands | ????
...should produce:
some_path/some_file.csv true or some_path/some_file.csv 1439934121 true
...for which I could easily grep (obviously assuming that last midnight <= 1439934121 <= current time).
What kind of ???? would do this? I'm almost certain that awk can do what I need it to, so I've looked at it and date, but I'm basically doing awk-by-google with no skills and getting nowhere.
Don't feel constrained by my tool assumptions; if you can achieve this with alternate means, given the output of bunch | of | commands but still using shell tools and piping, I'm all ears. I'd like to avoid temp files or Perl, if possible :-)
I'm using gawk + bash 4.3 on Ubuntu Linux, specifically, and have no portability concerns.
date -d 'today 00:00:00' with the %s format returns the Unix timestamp of last midnight:
$ date -d'today 00:00:00'
Thu Sep 3 00:00:00 CEST 2015
$ date -d 'today 00:00:00' "+%s"
1441231200
You can probably pipe to an awk doing something like:
... | awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1'
That is, use the ternary operator to check the value of $2 and replace with either of the values true/false depending on the result:
awk -v midnight="$(date ...)" '{$2= ($2>midnight) ? "true" : "false"}1'
Test
$ cat a
hello 1441231201
bye 23
$ awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1' a
hello true
bye false
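If the goal is only to keep today's lines, the labelling and the later grep could be skipped by letting awk filter directly. A sketch under that assumption, with a fixed midnight value and invented file names so the result is repeatable; it uses >= so that a timestamp of exactly midnight counts as today, as the question describes.

```shell
# Fixed for the example; normally: midnight=$(date -d 'today 00:00:00' '+%s')
midnight=1441231200

# Stand-in for the output of: bunch | of | commands
input=$(printf '%s\n' \
  'some_path/a.csv 1441231200' \
  'some_path/b.csv 1441231199' \
  'some_path/c.csv 1441239999')

# A bare condition with no action prints the lines for which it is true
today_only=$(printf '%s\n' "$input" | awk -v midnight="$midnight" '$2 >= midnight')

printf '%s\n' "$today_only"
```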

Trim text and add timestamp?

So basically I have my output as the following:
<span id="PlayerCount">134,015 people currently online</span>
What I want is a way to trim it to show:
134,015 - 3:24:20AM - Oct 24
Can anyone help? Also note the number may change, so is it possible to output everything between ">" and the "c" in currently? And add a timestamp somehow?
I'm using commands from a terminal in Linux, so that's called bash, right?
Do you perhaps mean something like:
$ echo '<span id="PlayerCount">134,015 people currently online</span>' | sed \
    -e 's/^[^>]*>//' \
    -e "s/currently.*$/$(date '+%r %b %d %Y')/"
which generates:
134,015 people 03:36:30 PM Oct 24 2011
The echo is just for the test data. The first sed command will change everything up to the first > character into nothing (ie, delete it).
The second one will change everything from the currently to the end of the line with the current date in your desired format (although I have added the year since I'm a bit of a stickler for detail).
The relevant arguments for date here are:
%r locale's 12-hour clock time (e.g., 11:11:04 PM)
%b locale's abbreviated month name (e.g., Jan)
%d day of month (e.g., 01)
%Y year
A full list of format specifiers can be obtained from the date man page (execute man date from a shell).
A small script which will give you the desired information from the page you mentioned in the comments is:
#!/usr/bin/bash
wget --output-document=- http://runescape.com/title.ws 2>/dev/null \
| grep PlayerCount \
| head -n 1 \
| sed 's/^[^>]*>//' \
| sed "s/currently.*$/$(date '+%r %b %d %Y')/"
Running this gives me:
pax$ ./online.sh
132,682 people 04:09:17 PM Oct 24 2011
In detail:
The wget bit pulls down the web page and writes it on standard output. The standard error (progress bar) is thrown away.
The grep extracts only lines with the word PlayerCount in them.
The head throws away all but the first of those.
The first sed strips up to the first > character.
The second sed changes the trailing text to the current date and time.
Quickhack(tm):
$ people=$(echo '<span id="PlayerCount">134,015 people currently online</span>' | \
sed -e 's/^.*>\(.*\) people.*$/\1/')
$ echo $people - $(date)
134,015 - Mon Oct 24 09:36:23 CEST 2011
produce_OUTPUT | grep -o '[0-9,]\+' | while read -r count; do
    printf "%s - %s\n" "$count" "$(date +'%l:%M:%S %p - %b %e')"
done
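Putting the pieces together, a sketch that captures only the number with sed and then appends a timestamp in roughly the requested layout. The html variable stands in for the real page output, and %l is assumed to be available (it is in GNU date); it pads single-digit hours with a space.

```shell
# Stand-in for the line fetched from the page
html='<span id="PlayerCount">134,015 people currently online</span>'

# Keep only the digits and commas between ">" and " people";
# -n with the p flag prints nothing if the line does not match
count=$(printf '%s\n' "$html" | sed -n 's/^[^>]*>\([0-9,]*\) people.*/\1/p')

# Append time and date, e.g. "134,015 -  3:24:20AM - Oct 24"
line="$count - $(date '+%l:%M:%S%p - %b %e')"

echo "$line"
```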
