Extract date from log file - bash

I have a log line like this:
Tue Dec 2 10:03:46 2014 1 10.0.0.1 0 /home/test4/TEST_LOGIN_201312021003.201412021003.23872.sqlLdr b _ i r test4 ftp 0 * c
And I can print the date value of this line like this:
echo $log | awk '{print $9}' | grep -oP '(?<!\d)201\d{9}' | head -n 1
I have another log line like this; how can I print the date value?
Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c
I tried my awk/grep solution, but it just prints 201 and the 9 digits that follow wherever it sees 201, which is not the date I want.
The subfolders and the data file name encode the same date:
2014/12/11/16/20 --> 11 Dec 2014 16:20 <-- blablabla_data-2014_12_11_16_20.txt
note: /home/DATA1 is not static; the year/month/day/hour/minute structure is.

As the format in the path is /.../YYYY/MM/DD/HH/MM/filename, you can use 201\d/\d{2}/\d{2}/\d{2}/\d{2} in the grep expression to match the date block:
$ log="Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data2_11_16_20.txt b _ i r spy ftp 0 * c"
$ echo "$log" | grep -oP '(?<!\d)201\d/\d{2}/\d{2}/\d{2}/\d{2}'
2014/12/11/16/20
Then, if you want just the digits, remove the slashes with tr:
$ echo "$log" | grep -oP '(?<!\d)201\d/\d{2}/\d{2}/\d{2}/\d{2}' | tr -d '/'
201412111620

sed can also work, if you are acquainted with it:
echo "Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c" | sed 's#.*[[:alnum:]]*/\([[:digit:]]\{4\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}\).*#\1#'
Output
2014/12/11/16/20
To remove "/", the same above command piped to tr -d '/'
Full command line
echo "Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c"|sed 's#.*[[:alnum:]]*/\([[:digit:]]\{4\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}\).*#\1#'|tr -d '/'
Output
201412111620
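If grep -P (PCRE lookbehind) is not available, a plain POSIX awk sketch along the same lines; the pattern mirrors the grep expression above and is an approximation based on the sample line:
echo "$log" | awk '{
    # look for the YYYY/MM/DD/HH/MM block anywhere in the line
    if (match($0, /201[0-9]\/[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]/)) {
        s = substr($0, RSTART, RLENGTH)
        gsub(/\//, "", s)   # 2014/12/11/16/20 -> 201412111620
        print s
    }
}'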

Related

Get the longest logon time of a given user using awk

My task is to write a bash script, using awk, to find the longest logon of a given user ("still logged in" does not count), and print the month, day, IP and logon time in minutes.
Sample input: ./scriptname.sh username1
Content of last username1:
username1 pts/ IP Apr 2 .. .. .. .. (00.03)
username1 pts/ IP Apr 3 .. .. .. .. (00.13)
username1 pts/ IP Apr 5 .. .. .. .. (12.00)
username1 pts/ IP Apr 9 .. .. .. .. (12.11)
Sample output:
Apr 9 IP 731
(note: 12 hours and 11 minutes is in total 731 minutes)
I have written this script, but a bunch of errors pop up, and I am really confused:
#!/bin/bash
usr=$1
last $usr | grep -v "still logged in" | awk 'BEGIN {max=-1;}
{
h=substr($10,2,2);
min=substr($10,5,2) + h/60;
}
(max < min){
max = min;
}
END{
maxh=max/60;
maxmin=max-maxh;
($maxh == 0 && $maxmin >=10){
last $usr | grep "00:$maxmin" | awk '{print $5," ",$6," ", $3," ",$maxmin}'
exit 1
}
($maxh == 0 $$ $maxmin < 10){
last $usr | grep "00:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh < 10 && $maxmin == 0){
last $usr | grep "0$maxh:00" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh < 10 && $maxmin < 10){
last $usr | grep "0$maxh:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh >= 10 && $maxmin < 10){
last $usr | grep "$maxh:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh >=10 && $maxmin >= 10){
last $usr | grep "$maxh:$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
}'
A bit of explanation of how I imagined this would work:
After the initialization, I want to find the (hh:mm) column of the last $usr command, save the hours and minutes of every line, and find the biggest number (in minutes), meaning the longest logon time.
After I have found the longest logon time (in minutes, stored in the variable max), I have to reformat the minutes-only value back to hh:mm so I can grep for it, run the last command again, now searching only for the line(s) that contain the max logon time, and print the needed information in the month day IP logon-time-in-minutes format, using another awk.
Errors I get when running this code: a bunch of syntax errors when I try using grep and awk inside the original awk.
awk is not shell. You can't directly call tools like last, grep and awk from awk any more than you could call them directly from a C program.
Using any awk in any shell on every Unix box, and assuming that if multiple rows have the max time you'd want all of them printed, and that if no timestamped rows are found you want something like "No matching records" printed (both are easy tweaks if not; just tell us your requirements for those cases and include them in the example in your question):
last username1 |
awk '
    /still logged in/ {
        next
    }
    {
        split($NF, t, /[().]/)
        cur = (t[2] * 60) + t[3]
    }
    cur >= max {
        out = (cur > max ? "" : out ORS) $4 OFS $5 OFS $3 OFS cur
        max = cur
    }
    END {
        print (out ? out : "No matching records")
    }
'
Apr 9 IP 731
If gnu-awk is available, you might use a pattern with 2 capture groups for the numbers in the last field. In the END block print the format that you want.
In this example, file contains the sample content and the last column contains the logon duration:
awk '
match($NF, /\(([0-9]+)\.([0-9]+)\)/, a) {
    hm = (a[1] * 60) + a[2]
    if (hm > max) { max = hm; line = $0 }
}
END {
    n = split(line, a, /[[:space:]]+/)
    print a[3], a[4], a[5], max
}
' file
Output
IP Apr 9 731
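If gawk is not available, the same idea can be approximated with POSIX match() and substr() (a sketch, untested):
awk '
match($NF, /\([0-9]+\.[0-9]+\)/) {
    # RSTART/RLENGTH mark "(hh.mm)"; strip the parentheses, split on the dot
    split(substr($NF, RSTART + 1, RLENGTH - 2), a, /\./)
    hm = (a[1] * 60) + a[2]
    if (hm > max) { max = hm; line = $0 }
}
END {
    split(line, a, /[[:space:]]+/)
    print a[3], a[4], a[5], max
}
' file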
Testing the last command on my machine (Red Hat Linux 7.8), I got the following output:
user0022 pts/1 10.164.240.158 Sat Apr 25 19:32 - 19:47 (00:14)
user0022 pts/1 10.164.243.80 Sat Apr 18 22:31 - 23:31 (1+01:00)
user0022 pts/1 10.164.243.164 Sat Apr 18 19:21 - 22:05 (02:43)
user0011 pts/0 10.70.187.1 Thu Nov 21 15:26 - 18:37 (03:10)
user0011 pts/0 10.70.187.1 Thu Nov 7 16:21 - 16:59 (00:38)
astukals pts/0 10.70.187.1 Mon Oct 7 19:10 - 19:13 (00:03)
reboot system boot 3.10.0-957.10.1. Mon Oct 7 22:09 - 14:30 (156+17:21)
astukals pts/0 10.70.187.1 Mon Oct 7 18:56 - 19:08 (00:12)
reboot system boot 3.10.0-957.10.1. Mon Oct 7 21:53 - 19:08 (-2:-44)
IT pts/0 10.70.187.1 Mon Oct 7 18:50 - 18:53 (00:03)
IT tty1 Mon Oct 7 18:48 - 18:49 (00:00)
user0022 pts/1 30.30.30.168 Thu Apr 16 09:43 - 14:54 (05:11)
user0022 pts/1 30.30.30.59 Wed Apr 15 11:48 - 04:59 (17:11)
user0022 pts/1 30.30.30.44 Tue Apr 14 19:03 - 04:14 (09:11)
Found that the time format is DD+HH:MM, where the DD+ part appears only when DD is not zero.
Found that there are additional technical users (IT, system, reboot) that need to be filtered out.
Suggested solution:
last | awk 'BEGIN {FS = "[ ()+:]*"}
    /reboot|system|still/ {next}
    {print $5 OFS $6 OFS $3 OFS $(NF-1) + ($(NF-2) * 60) + ($(NF-3) * 60 * 24)}
' | sort -nrk4 | head -1
Result:
Apr 15 30.30.30.59 85991
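Note that splitting on + and : also splits the hh:mm login/logout times, so on lines without a day part $(NF-3) picks up a stray minute field and the totals above overcount. A sketch that parses the (DD+HH:MM) duration field explicitly instead, filtering the technical users found above (assuming GNU awk for match() with a capture array):
last | awk '
    /reboot|system|still logged in|^IT / {next}    # skip technical users and open sessions
    match($0, /\(([0-9]+\+)?([0-9]+):([0-9]+)\)/, m) {
        mins = (m[1] + 0) * 24 * 60 + m[2] * 60 + m[3]   # "1+" coerces to 1, "" to 0
        if (mins > max) {max = mins; out = $5 OFS $6 OFS $3 OFS mins}
    }
    END {if (out) print out}
'
With the listing above this would report the 1+01:00 session: Apr 18 10.164.243.80 1500.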

How can I extract the data between two times in two or more log files

I have two log files, namely Log1.log and Log2.log, each containing the following data.
Log1.log:
Apr 10 02:07:20 Data 1
May 10 04:11:09 Data 2
June 11 06:22:35 Data 3
Aug 12 09:08:07 Data 4
Log2.log:
Apr 10 09:07:20 Data 1
Apr 10 10:07:10 Data 2
Jul 11 11:07:30 Data 3
Aug 18 12:50:40 Data 4
What command can I use to get the data between Apr 10 02:07:20 and Aug 18 12:50:40?
I have used:
$ awk -v start=01:06:04 -v stop=01:07:16 'start <= $3 && $3 <= stop' Log1.log Log2.log
I have also used:
awk -v StartTime="$StartTime" -v EndTime="$EndTime" -f script.sh Log1.log Log2.log
where script.sh contains:
BEGIN { Keep = 0; }
{
    if ($3 >= StartTime)
    {
        keep = 1;
    }
    if ($3 > EndTime)
    {
        exit;
    }
    if (keep)
    {
        print;
    }
}
I am not getting the desired result. Can someone help me improve my answer? Thanks in advance.
I would first use sort to sort the input. Then I would use sed to extract that range:
LC_TIME=C sort -t' ' -k1,1M -k2,3n Log1.log Log2.log \
| sed -n '/Apr 10 02:07:20/,/Aug 18 12:50:40/p'
Btw, it is not fully clear to me whether you want to exclude or include the range borders. The above example includes them; the below example excludes them:
LC_TIME=C sort -t' ' -k1,1M -k2,3n Log1.log Log2.log \
| sed -n '/Apr 10 02:07:20/,/Aug 18 12:50:40/{/Apr 10 02:07:20/!{/Aug 18 12:50:40/!p}}'
At least GNU sed allows simplifying the latter command to:
LC_TIME=C sort -t' ' -k1,1M -k2,3n Log1.log Log2.log \
| sed -n '/Apr 10 02:07:20/,/Aug 18 12:50:40/{//!p}'
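If the borders should be selected by timestamp value rather than by literal match, a rough awk sketch (assuming all entries fall in the same year and that months compare by their three-letter abbreviation, so the sample's "June" is normalized with substr):
awk -v start="Apr 10 02:07:20" -v stop="Aug 18 12:50:40" '
    # build a lexically sortable "MM DD hh:mm:ss" key from month, day, time
    function key(mo, d, t) { return mon[substr(mo, 1, 3)] sprintf(" %02d ", d) t }
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m)
        for (i = 1; i <= n; i++) mon[m[i]] = sprintf("%02d", i)
        split(start, s); lo = key(s[1], s[2], s[3])
        split(stop, e);  hi = key(e[1], e[2], e[3])
    }
    { k = key($1, $2, $3) }
    k >= lo && k <= hi
' Log1.log Log2.log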

How can I sort by custom date in a file?

I have a log file like this:
Fri Jan 30 13:52:57 2015 1 10.1.1.1 0 /home/test1/MAIL_201401301353.201501301352.19721.sqlLdr b _ i r test1 ftp 0 * c
Fri Jan 30 13:52:58 2015 1 10.1.1.1 0 /home/test2/MAIL_201401301354.201501301352.12848.sqlLdr b _ i r test2 ftp 0 * c
Fri Jan 30 13:53:26 2015 1 10.1.1.1 0 /home/test3/MAIL_201401301352.201501301353.17772.sqlLdr b _ i r test3 ftp 0 * c
I need to sort by the date value; the date value is the first 2014... number in the file name.
I can find the date value like this:
echo $log | awk '{print $9}' | grep -oP '(?<!\d)201\d{9}' | head -n 1
How can I sort by this date value (new to old)?
To sort this file you can use:
sort -t_ -nk2,2 file
Fri Jan 30 13:53:26 2015 1 10.1.1.1 0 /home/test3/MAIL_201401301352.201501301353.17772.sqlLdr b _ i r test3 ftp 0 * c
Fri Jan 30 13:52:57 2015 1 10.1.1.1 0 /home/test1/MAIL_201401301353.201501301352.19721.sqlLdr b _ i r test1 ftp 0 * c
Fri Jan 30 13:52:58 2015 1 10.1.1.1 0 /home/test2/MAIL_201401301354.201501301352.12848.sqlLdr b _ i r test2 ftp 0 * c
Details:
-n     # numeric sort
-t_    # set the field separator to _
-k2,2  # sort on the 2nd field
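Note that this sorts oldest first; since you want new to old, add -r to reverse the order:
sort -t_ -rnk2,2 file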

Script to generate a list to run a command

Sorry for the semi-vague title, I wasn't exactly sure how to word it. I'm looking to generate a list, excluding devices without a matching major/minor number, and run
lkdev -l hdiskn -a -c DATAn
where the hdisk and the DATA device have corresponding major/minor numbers.
In /dev, I have -
root# testbox /dev
#ls -l | grep -E "DATA|hdisk" | grep -v rhd
crw-r--r-- 1 root system 18, 3 Oct 03 10:50 DATA01
crw-r--r-- 1 root system 18, 2 Oct 03 10:50 DATA02
brw------- 1 root system 18, 1 Apr 12 2013 hdisk0
brw------- 1 root system 18, 0 Apr 12 2013 hdisk1
brw------- 1 root system 18, 3 Jan 14 2014 hdisk2
brw------- 1 root system 18, 2 Jan 14 2014 hdisk3
brw------- 1 root system 18, 4 Jan 14 2014 hdisk4
So essentially, I'm trying to create something where hdisk0, hdisk1 and hdisk4 are all excluded, and hdisk2 and hdisk3 are locked with DATA01 and DATA02, respectively.
I originally was trying to use sort and/or uniq to isolate/remove fields, but haven't been able to generate the desired list to even begin looking at running the command on each.
(As a note, I have several servers with hundreds of these. If it were just these few, I'd find a "simpler" way.)
(I can't test it right now, so please correct syntax errors if any)
You could play with sort and uniq like below:
ls -l | grep -E "DATA|hdisk" | sed -e 's/.* \([0-9]*, *[0-9]*\).*/\1/' | sort |
uniq -c | grep -v " 1" | cut -c8- | while read majorminor; do
ls -l | grep " ${majorminor}" | sed 's/.* //'
done
However, you should start with selecting the right lines without counting:
# run this from /dev, since find's output is cut down to the bare device names
for data in $(find /dev -type c -name "DATA*" | cut -d/ -f3); do
    majorminor="$(ls -l $data | sed -e 's/.* \([0-9]*, *[0-9]*\).*/\1/')"
    echo "$data <==> $(ls -l hdisk* | grep " ${majorminor}" | sed 's/.* //')"
done
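To actually run the lkdev command from the question on each matching pair, a sketch along the same lines (untested; it assumes the script runs from /dev and that each major/minor pair matches exactly one hdisk):
cd /dev || exit 1
for data in DATA*; do
    majorminor="$(ls -l "$data" | sed -e 's/.* \([0-9]*, *[0-9]*\).*/\1/')"
    hdisk="$(ls -l hdisk* | grep " ${majorminor}" | sed 's/.* //')"
    # lock the pair only when a matching hdisk was found
    [ -n "$hdisk" ] && lkdev -l "$hdisk" -a -c "$data"
done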

pick up files based on dates in ksh script

I have this list of files. Now I will have to pick the latest file based on some conditions:
3679 Jul 21 23:59 belk_rpo_error_**po9324892**_07212014.log
0 Jul 22 23:59 belk_rpo_error_**po9324892**_07222014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
0 Jul 20 05:50 belk_rpo_error_**po9999992**_07202014.log
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
742 Jul 21 07:30 belk_rpo_error_**po9999991**_07212014.log
0 Jul 23 2014 belk_rpo_error_**po9999991**_07232014.log
For a PARTICULAR Order_No (marked with ** **):
If the latest file is 0 kB, then we will discard it (and the rest of the files with the same Order_No as well).
If the latest file is non-zero, then I will take it (only the latest one),
then append the contents to a txt file.
My expected output would be:
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
I am at my wits' end here. I can't seem to figure out how to compare dates in Unix. Any help is very appreciated.
You can try something like:
touch test.txt
for var in $(find . ! -empty -exec ls -r {} \;)
do
    cat "$var" >> test.txt
done
untested
use stat to emit date (epoch time), size and filename.
use awk to filter out zero-length files and extract order number.
sort by order number and date
awk to pick up the last filename for each order number
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' -v OFS='\t' '
    $2 > 0 {
        split($3, a, /_/)
        print a[4], $1, $3
    }' |
sort -t $'\t' -k1,1 -k2,2n |
awk -F'\t' '
    NR > 1 && $1 != prev_order {print filename}
    {filename = $3; prev_order = $1}
    END {print filename}
'
The sort command might be wrong: In order to group by order number, you might need to sort first by file time then by order number.
If I understand your question, the resulting files need to be concatenated and appended to a file. If the above pipeline is working OK, then pipe into | xargs cat >> something.log
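One caveat: the $2 > 0 filter drops empty files before the latest one is chosen, so an order whose latest file is empty (po9999991 in the sample) falls back to an older non-zero file instead of being discarded with its whole group. A single-awk sketch of that rule (assuming GNU stat and the same file-name layout):
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' '
    {
        split($3, a, /_/)                  # a[4] is the order number
        if (latest[a[4]] == "" || $1 > latest[a[4]]) {
            latest[a[4]] = $1              # remember each order_no's newest file,
            size[a[4]] = $2                # its size,
            name[a[4]] = $3                # and its name
        }
    }
    END {
        for (o in latest)
            if (size[o] > 0) print name[o]   # skip orders whose newest file is empty
    }'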
