How do I search a log file based on timestamp - shell

I have written a simple script that sends out an email when a service is down. Once I restart the service, the script checks the file for the same keyword again. The problem is that it may find an earlier occurrence of the error in the log and raise a false alarm that the service is still down, so I decided to search based on the timestamp.
dt=$(date +"%D %T")
awk '$0 ~ "Connection refused" && $0 >= $dt' /***.log
This still returns all the old results as well.
This is what the contents of the log look like:
[08/06/20 11:36:54.577]:Work...
Please let me know what I'm missing here and whether this is the best way to go about it.
Edit: This is going to be an automated script that will be run every hour.
Thank you!

The reason you get the old results as well is that you never actually compare against that date: inside the single-quoted awk program, $dt is not your shell variable. awk reads dt as one of its own (uninitialized) variables, so $dt evaluates to $0 and the comparison is always true. The awk body is not a place where you can use a bash variable as-is. See how to do it properly: https://www.gnu.org/software/gawk/manual/html_node/Using-Shell-Variables.html
dt=$(date +"%D %T")
awk -v dt="$dt" '$0 >= dt && $0 ~ /Connection refused/' file
The alphabetical comparison seems enough for your case; I assume you look at logs spanning a few hours or days. (I think it could fail only around New Year's Day, or maybe not, depending on the log file rotation and your environment.)
To make it faster, as your log lines are still sorted by date, you want to search from the restart timestamp to the end of file, so you could set a flag when you find that timestamp and check for the pattern only after that:
awk -v dt="$dt" 'f && $0 ~ /Connection refused/{print; next} $0 >= dt {f=1}' file
Notice that no timestamps are checked again after the critical point. In any case, it would be better to match the last service restart exactly (how to do that depends on details you haven't provided) rather than comparing timestamps.
Edit: In the sample line of the question we have the timestamp inside brackets
[08/06/20 11:36:54.577]:Work...
and this can be passed e.g. with this modification
awk -v dt="$dt" 'f && $0 ~ /Connection refused/{print; next} substr($0,2) >= dt {f=1}' file
where substr($0,2) returns $0 without the first character.
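Putting the pieces together, here is a self-contained sketch of the flag approach. The sample log content, the restart line, and the filename service.log are fabricated for illustration; in the real script, dt would come from date +"%D %T" at restart time:

```shell
# Fabricated sample log in the [MM/DD/YY HH:MM:SS.mmm] format from the question.
cat > service.log <<'EOF'
[08/06/20 11:36:54.577]:Connection refused
[08/06/20 11:50:00.000]:Service restarted
[08/06/20 12:00:00.000]:Connection refused
EOF

# In the real script: dt=$(date +"%D %T"), taken when the service is restarted.
# %D %T produces "08/06/20 11:45:00", the same field order as the log.
dt="08/06/20 11:45:00"

# Report "Connection refused" only on lines after the restart timestamp;
# substr($0, 2) drops the leading "[" so the comparison starts at the date.
awk -v dt="$dt" '
    f && /Connection refused/ { print; next }
    substr($0, 2) >= dt       { f = 1 }
' service.log
# -> [08/06/20 12:00:00.000]:Connection refused
```

The old 11:36 error is skipped because the flag f is only set once a line at or past the restart time is seen.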

Related

Why is there an epoch time from an empty value?

I'm trying to improve a monitor that reads a log file and sends a notification if an application has stopped running and logging. To avoid notifications from moments when the log file rotates and the app hasn't logged anything yet, I added a check to see whether the oldest log line is more than two minutes old.
Here's a snippet of the important part.
<bash lines and variable setting>
LATEST=$(gawk 'match($0, /\[#\|(.*[0-9]*?)T(.*[0-9]*?)\+.*<AppToMonitor>/, m) {print m[1], m[2];} ' $LOG_TO_MONITOR | tail -1 )
OLDEST=$(gawk 'match($0, /\[#\|(.*[0-9]*?)T(.*[0-9]*?)\+.*INFO/, m) {print m[1], m[2];} ' $LOG_TO_MONITOR | head -1)
if [ -z "$LATEST" ]
then
# no line in log
OLDEST_EPOCH=`(date --date="$OLDEST" +%s)`
CURR_MINUS_TWO=$(date +"%Y-%m-%d %T" -d "2 mins ago")
CURR_MINUS_TWO_EPOCH=`(date --date="$CURR_MINUS_TWO" +%s)`
# If oldest log line is over two minutes old
if [[ "$OLDEST_EPOCH" -lt "$CURR_MINUS_TWO_EPOCH" ]]
then
log "No lines found."
<send notification>
else
log "Server.log rotated."
fi
<else and stuff>
I still got some notifications when the log rotated, and the culprit was that the epoch time was taken from a totally empty log file. I tested this by creating an empty .log file with touch test.log, then setting EMPTY=$(gawk 'match($0, /\[#\|(.*[0-9]*?)T(.*[0-9]*?)\+.*INFO/, m) {print m[1], m[2];} ' /home/USER/test.log | head -1)
Now, if I echo $EMPTY, I get a blank line. But if I convert this empty value to epoch time with EPOCHEMPTY=`(date --date="$EMPTY" +%s)`, I get the epoch time 1584914400, which refers to yesterday evening. Apparently the same epoch comes up every time an empty date is converted, for example when replacing "$EMPTY" with "", at least while writing this.
So the question is: what is this epoch time from an empty line? When the if statement makes the comparison with this value, it triggers the notification even though it should not. Is there a way to avoid taking the empty string into the comparison and to use some other time value from the log file instead?
date's manual states that an empty string passed to -d is treated as the start of the current day.
You could however rely on the -f/--file option and process substitution :
date -f <(echo -n "$your_date")
The -f option lets you pass a file as a parameter; each of its lines is treated as an input to -d. An empty file simply produces empty output.
The process substitution creates an ephemeral file on the fly (an anonymous pipe, to be precise, but still a file) that contains only the content of your variable: an empty file if the variable is undefined or the empty string, and a single line otherwise.
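For illustration, here is the difference between -d and -f on an empty value (GNU date assumed; the 1970 timestamp is just a fixed example chosen so the result is predictable):

```shell
# -d "" silently parses the empty string as the start of today:
date -d "" +%s                              # prints today's midnight epoch

# -f on an empty "file" (an empty process substitution) prints nothing at all,
# which is what you want for the comparison:
date -f <(printf '%s' "") +%s               # no output

# A non-empty variable still converts normally:
your_date="1970-01-01 00:00:10"
date -u -f <(printf '%s\n' "$your_date") +%s    # -> 10
```

In the monitor, an empty $OLDEST then yields an empty $OLDEST_EPOCH, which you can detect with [ -z "$OLDEST_EPOCH" ] before comparing.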

Use Awk and Grep to find lines between two time in a log file

I'm searching a log file to see if it contains a certain string between two different times, i.e. whether foo exists between the lines that start with the timestamps 2016-11-10 06:45:00 and 2016-11-10 10:45:00. The threshold variable sets the time window; for example, 240 would be 4 hours.
current="$(date "+%Y-%m-%d %H:%M:%S")"
threshold=240
dt_format="+%Y-%m-%d %H:%M:%S"
from="$(date -d "$threshold minutes ago" "$dt_format")"
if awk '$0 >= "$from" && $0 <= "$current"' /path/file.log | grep "foo"
then
exit 0
else
exit 1
fi
However, I'm not sure why, but when I pass $from and $current to awk in the if statement, it's not actually reading them. It's as if I'm passing in garbage, so it doesn't compare the dates correctly and instead returns all the lines and exits 0.
But if I manually put in the dates in the if statement, i.e. 2016-11-10 06:45:00 as from and 2016-11-10 10:45:00 as current then it returns the correct lines that are in between those two dates and then I can use grep to check whether those lines contain foo.
I really don't understand why my code isn't working, and I can't manually put in the dates as I need to be able to check between two different time based on my needs by changing the threshold variable.
2016-11-10 06:45:00 is how the timestamp is formatted in my log, at the beginning of each line.
Thanks.
You are attempting to have bash expand variables inside single quotes. Run s="string"; echo '$s' and you'll see what I mean.
So '$0 >= "$from" && $0 <= "$current"' literally means those exact characters. Probably not what you wanted.
"But that's the argument to awk"... Right, and awk knows how to handle $0 and $1, so awk expands those itself. But you were expecting awk to receive '$0 >= "some_time" && $0 <= "some_other_time"', and it didn't!
So the way you pass variables to awk is: some_variable="world"; awk -v my_variable="$some_variable" 'BEGIN{print "hello", my_variable}'
So you should have if awk -v f="$from" -v c="$current" '$0 >= f && $0 <= c' /path/file.log | grep "foo"
Check out http://www.catonmat.net/blog/ten-awk-tips-tricks-and-pitfalls/ This article actually has some good insight into neat things you can do with awk. You might be able to use the "split file on patterns" here to reduce the amount of commands you use but either way you'll learn something about awk.
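As a sketch, the whole check can even be collapsed into a single awk call, folding the grep into the awk condition. The log content and the fixed timestamps below are fabricated only so the example is self-contained; in the real script, from and current come from date as in the question:

```shell
# Fabricated log in the question's "YYYY-mm-dd HH:MM:SS ..." format.
cat > file.log <<'EOF'
2016-11-10 05:00:00 something else
2016-11-10 07:00:00 foo happened
2016-11-10 11:00:00 foo again
EOF

from="2016-11-10 06:45:00"      # really: date -d "$threshold minutes ago" "$dt_format"
current="2016-11-10 10:45:00"   # really: date "+%Y-%m-%d %H:%M:%S"

# Exit 0 if any line inside the window contains "foo", 1 otherwise.
if awk -v f="$from" -v c="$current" '
        $0 >= f && $0 <= c && /foo/ { found = 1; exit }
        END { exit !found }
   ' file.log
then
    echo "foo found in window"
else
    echo "foo not found"
fi
# -> foo found in window
```

The early exit stops reading as soon as a match is found, and END turns the flag into the exit status your if statement needs.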

How to grep files from specific date to EOF awk

I have a little problem printing data from a file from a given date to the end of the file. I have this file:
2016/08/10-12:45:14.970000 <some_data> <some_data> ...
2016/08/10-12:45:15.970000 <some_data> <some_data> ...
2016/08/10-12:45:18.970000 <some_data> <some_data> ...
2016/08/10-12:45:19.970000 <some_data> <some_data> ...
And this file has hundreds of lines.
I have to print the file from one point in time to the end of the file, but I don't know the precise time at which the row appeared in the logfile.
I need to print data from the date 2016/08/10-12:45:16 to the end of the file, so I want to receive output that looks like this:
2016/08/10-12:45:18.970000
2016/08/10-12:45:19.970000
OK, if I know the specific date from which I want to print data, everything is easy:
awk '/<start_time>/,/<end/'
awk '/2016\/08\/10-12:45:18/,/<end/'
But if I don't know the specific date, only the approximate date 2016/08/10-12:45:16, it's harder.
Can any one please help me?
You can benefit from the fact that the time format you are using supports alphanumerical comparison. With awk the command can look like this:
awk -v start='2016/08/10-12:45:16' '$1>=start' file
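For instance, on the sample data from the question (the file name test.log is made up):

```shell
cat > test.log <<'EOF'
2016/08/10-12:45:14.970000 somedata 1
2016/08/10-12:45:15.970000 somedata 2
2016/08/10-12:45:18.970000 somedata 3
2016/08/10-12:45:19.970000 somedata 4
EOF

# Lexicographic comparison works because the timestamp format is
# fixed-width with the most significant part first.
awk -v start='2016/08/10-12:45:16' '$1 >= start' test.log
# -> 2016/08/10-12:45:18.970000 somedata 3
# -> 2016/08/10-12:45:19.970000 somedata 4
```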
You can use mktime function of awk to check for time:
awk -v TIME="2016/08/10-12:45:16" '
BEGIN{
gsub("[/:-]"," ",TIME)
reftime=mktime(TIME)
}
{
t=$1
sub("[0-9]*$","",t)
gsub("[/:-]"," ",t)
if(mktime(t)>reftime)
print
}' file
This script takes your reference time, converts it into a number, and then compares it to the time found on each line of the file.
Note that the sub and gsub calls are only there to convert your specific time format into the one understood by awk's mktime.
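To see what the rewriting does to a single timestamp, you can run the transformation on its own (mktime itself is a GNU awk extension, and its numeric result depends on your timezone, so no value is shown for it here):

```shell
# The gsub turns "2016/08/10-12:45:16" into mktime's "YYYY MM DD HH MM SS" form.
awk 'BEGIN {
    t = "2016/08/10-12:45:16"
    gsub(/[\/:-]/, " ", t)
    print t                      # -> 2016 08 10 12 45 16
}'

# With GNU awk, mktime then yields the epoch seconds for that local time:
gawk 'BEGIN { print mktime("2016 08 10 12 45 16") }'   # timezone-dependent
```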
You should be able to do this simply with awk:
awk '{m = "2016/08/10-12:45:18"} $0 ~ m,0 {print}' file
If you weren't sure exactly the time or date you could do:
awk '{m = "2016/08/10-12:45:1[6-8]"} $0 ~ m,0 {print}' file
This should print everything from your specified date and time (12:45:16 through 12:45:18) to the end of the file. The character class [6-8] treats the final seconds digit as a range appended to the original time 12:45:1...
Output:
2016/08/10-12:45:18.970000 somedata 3
2016/08/10-12:45:19.970000 somedata 4

Save changes to a file AWK/SED

I have a huge text file delimited with commas.
19429,(Starbucks),390 Provan Walk,Glasgow,G34 9DL,-4.136909,55.872982
The first field is a unique id. I want the user to enter the id and a value for one of the following six fields in order to replace it. I'm also asking him to enter a value from 2-7 to identify which field should be replaced.
Now I've done something like this: I check every line to find the id the user entered and then replace the value.
awk -F ',' -v elem=$element -v id=$code -v value=$value '{if($1==id) {if(elem==2) { $2=value } etc }}' $path
Where $path = /root/clients.txt
Let's say the user enters "2" in order to replace the second field, and also enters "Whatever". Now I want "(Starbucks)" to be replaced with "Whatever". What I've done works fine but does not save the change to the file. I know that awk is not supposed to do so, but I don't know how else to do it. I've searched a lot on Google but still no luck.
Can you tell me how I'm supposed to do this? I know that I could do it with sed, but I don't know how.
Newer versions of GNU awk support inplace editing:
awk -i inplace -v elem="$element" -v id="$code" -v value="$value" '
BEGIN{ FS=OFS="," } $1==id{ $elem=value } 1
' "$path"
With other awks:
awk -v elem="$element" -v id="$code" -v value="$value" '
BEGIN{ FS=OFS="," } $1==id{ $elem=value } 1
' "$path" > /usr/tmp/tmp$$ &&
mv /usr/tmp/tmp$$ "$path"
NOTES:
Always quote your shell variables unless you have an explicit reason not to and fully understand all of the implications and caveats.
If you're creating a tmp file, use "&&" before replacing your original with it so you don't zap your original file if the tmp file creation fails for any reason.
I fully support replacing Starbucks with Whatever in Glasgow - I'd like to think they wouldn't have let it open in the first place back in my day (1986 Glasgow Uni Comp Sci alum) :-).
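The tmp-file-and-&& pattern above can also be written with mktemp instead of a fixed /usr/tmp path. This sketch uses a fabricated one-line clients.txt standing in for "$path" and hard-codes the user's answers, but keeps the same safety property:

```shell
# Fabricated one-record database in the question's format.
printf '19429,(Starbucks),390 Provan Walk,Glasgow,G34 9DL,-4.136909,55.872982\n' > clients.txt

elem=2 code=19429 value="Whatever"   # normally read from the user

tmp=$(mktemp) &&
awk -v elem="$elem" -v id="$code" -v value="$value" '
    BEGIN { FS = OFS = "," }
    $1 == id { $elem = value }
    1
' clients.txt > "$tmp" &&
mv "$tmp" clients.txt                # the original survives if any step fails

head -1 clients.txt
# -> 19429,Whatever,390 Provan Walk,Glasgow,G34 9DL,-4.136909,55.872982
```

mktemp avoids collisions between concurrent runs that a fixed tmp$$ name can't fully rule out.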
awk is much easier than sed for processing specific variable fields, but it traditionally has no in-place editing. Thus you might do the following:
#!/bin/bash
code=$1
element=$2
value=$3
echo "code is $code"
awk -F ',' -v elem=$element -v id=$code -v value=$value 'BEGIN{OFS=",";} /^'$code',/{$elem=value}1' mydb > /tmp/mydb.txt
mv /tmp/mydb.txt ./mydb
This finds a match for a line starting with the code followed by a comma (you could also use $1==code), then sets the elem-th field to value; finally it prints the line, using the comma as the output field separator. Lines that don't match are echoed unchanged.
Everything is written to a temporary file, which then overwrites the original.
Not very pretty, but it gets the job done.

how can I supply bash variables as fields for print in awk

I currently am trying to use awk to rearrange a .csv file that is similar to the following:
stack,over,flow,dot,com
and the output would be:
over,com,stack,flow,dot
(or any other order, just using this as an example)
and when it comes time to rearrange the csv file, I have been trying to use the following:
first='$2'
second='$5'
third='$1'
fourth='$3'
fifth='$4'
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' '{print $a,$b,$c,$d,$e}' somefile.csv
with the intent of awk/print interpreting the $a,$b,$c,etc as field numbers, so it would come out to the following:
{print $2,$5,$1,$3,$4}
and print out the fields of the csv file in that order. Unfortunately, I have not been able to get this to work correctly yet; of the several methods I've tried, this seemed the most promising. Having said that, I was wondering if anyone could give suggestions or point out my flaw, as I am stumped at this point. Any help would be much appreciated, thanks!
Use simple numbers:
first='2'
second='5'
third='1'
fourth='3'
fifth='4'
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' \
'{print $a, $b, $c, $d, $e}' somefile.csv
Another way with a shorter example:
aa='$2'
bb='$1'
cc='$3'
awk -F '^|,|$' "{print $aa,$bb,$cc}" somefile.csv
You already got the answer to your specific question but have you considered just specifying the order as a string instead of each individual field? For example:
order="2 5 1 3 4"
awk -v order="$order" '
BEGIN{ FS=OFS=","; n=split(order,a," ") }
{ for (i=1;i<n;i++) printf "%s%s",$(a[i]),OFS; print $(a[i]) }
' somefile.csv
That way if you want to add/delete fields or change the order you just trivially rearrange the numbers in the first line instead of having to mess with a bunch of hard-coded variables, etc.
Note that I changed your FS, as there was no need for it to be that complicated. Also, you don't need the shell variable order; you could just populate the awk variable of the same name explicitly. I only started with a shell variable because you had, so maybe you have a reason for it.
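Applied to the sample line from the question, the order-string version behaves like this:

```shell
printf 'stack,over,flow,dot,com\n' |
awk -v order="2 5 1 3 4" '
    BEGIN { FS = OFS = ","; n = split(order, a, " ") }
    { for (i = 1; i < n; i++) printf "%s%s", $(a[i]), OFS; print $(a[i]) }
'
# -> over,com,stack,flow,dot
```

The loop prints the first n-1 reordered fields followed by OFS, and the final print emits the last one with the record terminator instead of a trailing comma.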
