How can I sort by a custom date in a file? - bash

I have log file like this:
Fri Jan 30 13:52:57 2015 1 10.1.1.1 0 /home/test1/MAIL_201401301353.201501301352.19721.sqlLdr b _ i r test1 ftp 0 * c
Fri Jan 30 13:52:58 2015 1 10.1.1.1 0 /home/test2/MAIL_201401301354.201501301352.12848.sqlLdr b _ i r test2 ftp 0 * c
Fri Jan 30 13:53:26 2015 1 10.1.1.1 0 /home/test3/MAIL_201401301352.201501301353.17772.sqlLdr b _ i r test3 ftp 0 * c
I need to sort by the date value, which is the first 2014.... number in the filename.
I can extract that date value like this:
echo $log | awk '{print $9}' | grep -oP '(?<!\d)201\d{9}' | head -n 1
How can I sort by this date value (new to old)?

To sort this file you can use:
sort -t_ -nk2,2 file
Fri Jan 30 13:53:26 2015 1 10.1.1.1 0 /home/test3/MAIL_201401301352.201501301353.17772.sqlLdr b _ i r test3 ftp 0 * c
Fri Jan 30 13:52:57 2015 1 10.1.1.1 0 /home/test1/MAIL_201401301353.201501301352.19721.sqlLdr b _ i r test1 ftp 0 * c
Fri Jan 30 13:52:58 2015 1 10.1.1.1 0 /home/test2/MAIL_201401301354.201501301352.12848.sqlLdr b _ i r test2 ftp 0 * c
Details:
-n    # numeric sort
-t_   # set the field separator to _
-k2,2 # sort on the 2nd field only
Note that this sorts old to new; add the -r flag to reverse the order (new to old, as asked).
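Here is a minimal runnable sketch of the new-to-old variant, writing the sample lines from the question to a temporary file (the /tmp/xfer.log name is just for illustration):

```shell
#!/bin/sh
# Write the three sample log lines from the question to a temp file
cat > /tmp/xfer.log <<'EOF'
Fri Jan 30 13:52:57 2015 1 10.1.1.1 0 /home/test1/MAIL_201401301353.201501301352.19721.sqlLdr b _ i r test1 ftp 0 * c
Fri Jan 30 13:52:58 2015 1 10.1.1.1 0 /home/test2/MAIL_201401301354.201501301352.12848.sqlLdr b _ i r test2 ftp 0 * c
Fri Jan 30 13:53:26 2015 1 10.1.1.1 0 /home/test3/MAIL_201401301352.201501301353.17772.sqlLdr b _ i r test3 ftp 0 * c
EOF
# -r reverses the numeric sort on the 2nd _-separated field: newest first
sort -t_ -rnk2,2 /tmp/xfer.log
```

The first line printed is the test2 line (…1354, the newest timestamp) and the last is the test3 line (…1352, the oldest).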

Related

Get the longest logon time of a given user using awk

My task is to write a bash script, using awk, to find the longest logon of a given user ("still logged in" does not count), and print the month, day, IP and logon time in minutes.
Sample input: ./scriptname.sh username1
Content of last username1:
username1 pts/ IP Apr 2 .. .. .. .. (00.03)
username1 pts/ IP Apr 3 .. .. .. .. (00.13)
username1 pts/ IP Apr 5 .. .. .. .. (12.00)
username1 pts/ IP Apr 9 .. .. .. .. (12.11)
Sample output:
Apr 9 IP 731
(note: 12 hours and 11 minutes is in total 731 minutes)
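That arithmetic can be sanity-checked with a one-line awk sketch (using the 12.11 value from the sample; this is just a check, not part of the script):

```shell
# split the "12.11" (hh.mm) value on the dot, then hours*60 + minutes
awk 'BEGIN { split("12.11", t, "."); print t[1] * 60 + t[2] }'   # prints 731
```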
I have written this script, but a bunch of errors pop up, and I am really confused:
#!/bin/bash
usr=$1
last $usr | grep -v "still logged in" | awk 'BEGIN {max=-1;}
{
h=substr($10,2,2);
min=substr($10,5,2) + h/60;
}
(max < min){
max = min;
}
END{
maxh=max/60;
maxmin=max-maxh;
($maxh == 0 && $maxmin >=10){
last $usr | grep "00:$maxmin" | awk '{print $5," ",$6," ", $3," ",$maxmin}'
exit 1
}
($maxh == 0 $$ $maxmin < 10){
last $usr | grep "00:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh < 10 && $maxmin == 0){
last $usr | grep "0$maxh:00" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh < 10 && $maxmin < 10){
last $usr | grep "0$maxh:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh >= 10 && $maxmin < 10){
last $usr | grep "$maxh:0$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
($maxh >=10 && $maxmin >= 10){
last $usr | grep "$maxh:$maxmin" | awk '{print $5," ",$6," ",$3," ",$maxmin}'
exit 1
}
}'
So, a bit of explanation of how I imagined this would work:
After the initialization, I want to look at the (hh:mm) column of the last $usr output, save the hours and minutes of every line, and find the biggest number of minutes, i.e. the longest logon time.
After I have found the longest logon time (in minutes, stored in the variable max), I have to reformat it from plain minutes back to hh:mm so that I can grep for it: run the last command again, keep only the line(s) that contain the max logon time, and print the needed information in the month day IP logon-time-in-minutes format using another awk.
Errors I get when running this code: A bunch of syntax errors when I try using grep and awk inside the original awk.
awk is not shell. You can't directly call tools like last, grep and awk from awk any more than you could call them directly from a C program.
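That said, awk does have two sanctioned mechanisms for running an external command, system() and cmd | getline; reaching for them is usually a sign you should restructure, but for completeness here is a minimal sketch (the date call is only an illustration):

```shell
awk 'BEGIN {
  cmd = "date +%Y"     # build the command as a string
  cmd | getline year   # read the first line of its output
  close(cmd)           # always close the pipe when done
  print "year:", year
}'
```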
Using any awk in any shell on every Unix box. This assumes that if multiple rows have the max time you'd want all of them printed, and that if no timestamped rows are found you want something like No matching records printed (easy to tweak if not; just tell us your requirements for those cases and include them in the example in your question):
last username1 |
awk '
/still logged in/ {
next
}
{
split($NF,t,/[().]/)
cur = (t[2] * 60) + t[3]
}
cur >= max {
out = ( cur > max ? "" : out ORS ) $4 OFS $5 OFS $3 OFS cur
max = cur
}
END {
print (out ? out : "No matching records")
}
'
Apr 9 IP 731
If GNU awk is available, you can use match() with a pattern containing 2 capture groups for the numbers in the last field, and print the format that you want in the END block.
In this example, file contains the sample content and the last column holds the logon duration:
awk '
match($NF, /\(([0-9]+)\.([0-9]+)\)/, a) {
hm = (a[1] * 60) + a[2]
if(hm > max) {max = hm; line = $0;}
}
END {
n = split(line,a,/[[:space:]]+/)
print a[3], a[4], a[5], max
}
' file
Output
IP Apr 9 731
Testing the last command on my machine:
Using Red Hat Linux 7.8
Got the following output:
user0022 pts/1 10.164.240.158 Sat Apr 25 19:32 - 19:47 (00:14)
user0022 pts/1 10.164.243.80 Sat Apr 18 22:31 - 23:31 (1+01:00)
user0022 pts/1 10.164.243.164 Sat Apr 18 19:21 - 22:05 (02:43)
user0011 pts/0 10.70.187.1 Thu Nov 21 15:26 - 18:37 (03:10)
user0011 pts/0 10.70.187.1 Thu Nov 7 16:21 - 16:59 (00:38)
astukals pts/0 10.70.187.1 Mon Oct 7 19:10 - 19:13 (00:03)
reboot system boot 3.10.0-957.10.1. Mon Oct 7 22:09 - 14:30 (156+17:21)
astukals pts/0 10.70.187.1 Mon Oct 7 18:56 - 19:08 (00:12)
reboot system boot 3.10.0-957.10.1. Mon Oct 7 21:53 - 19:08 (-2:-44)
IT pts/0 10.70.187.1 Mon Oct 7 18:50 - 18:53 (00:03)
IT tty1 Mon Oct 7 18:48 - 18:49 (00:00)
user0022 pts/1 30.30.30.168 Thu Apr 16 09:43 - 14:54 (05:11)
user0022 pts/1 30.30.30.59 Wed Apr 15 11:48 - 04:59 (17:11)
user0022 pts/1 30.30.30.44 Tue Apr 14 19:03 - 04:14 (09:11)
Found that the DD+ prefix in the time format (DD+HH:MM) appears only when the number of days is not zero.
Found that there are additional technical entries (IT, system boot, reboot) that need to be filtered out.
Suggested solution (using GNU awk for the 3rd argument to match(), so the optional days part is parsed correctly):
last | awk '
/reboot|system|still|^IT/ {next}
match($0, /\(([0-9]+\+)?([0-9]+):([0-9]+)\)/, m) {
  print $5, $6, $3, m[3] + m[2] * 60 + m[1] * 1440
}
' | sort -rnk4 | head -1
Result:
Apr 18 10.164.243.80 1500

awk print date formats for all letters - lower and upper cases

I'm working on an awk one-liner to get the date command output for all possible format characters (upper and lower case) like below:
a Tue | A Tuesday
b Apr | B April
c Tue Apr 14 17:33:37 2020 | C 20
d 14 | D 04/14/20
. . . .
. . . .
z +0530 | Z IST
The command below seems to be syntactically correct, but it throws an error:
seq 0 25 | awk ' { d="date \"+" printf("%c",$0+97) " %" printf("%c",$0+97) "\""; d | getline ; print } '
-bash: syntax error near unexpected token `)'
What is wrong with my attempt? Any other awk solution is also welcome.
bash can do this:
for c in {a..z}; do date "+$c %$c | ${c^} %${c^}"; done
Could you please try the following (it avoids the ASCII-code trick by using split):
awk -v s1="\"" '
BEGIN{
num=split("a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z",alphabets,",")
for(i=1;i<=num;i++){
print "date " s1 "+"alphabets[i] " %"alphabets[i] " | " toupper(alphabets[i]) " %"toupper(alphabets[i]) s1
}
}
' | bash
Logical explanation:
Start the awk program and set the variable s1 to the value ".
Everything we are doing is in the BEGIN section of the code only.
Use split to create an array named alphabets, in which all the lowercase letters are stored with indices 1, 2, 3, and so on.
Run a for loop from 1 to the length of the alphabets array.
The print command only prints the date commands (in the form in which they should run); it does not execute them.
Closing the awk command and piping its output to bash executes those commands and shows their output on the terminal.
Any time you find yourself considering using awk like a shell (i.e. as a tool to call other tools from), you really need to think hard about whether or not it's the right approach.
Using any awk in any shell, without the complications of having the shell call awk to spawn a subshell to call date and then having getline try to read from it and close the pipe, etc., as happens if you try to call date from awk:
$ awk 'BEGIN{for (i=0; i<=25; i++) print c=sprintf("%c",i+97), toupper(c)}' |
while read c C; do date "+$c %$c | $C %$C"; done
a Tue | A Tuesday
b Apr | B April
c Tue Apr 14 09:03:28 2020 | C 20
d 14 | D 04/14/20
e 14 | E E
f f | F 2020-04-14
g 20 | G 2020
h Apr | H 09
i i | I 09
j 105 | J J
k 9 | K K
l 9 | L L
m 04 | M 03
n
| N N
o o | O O
p AM | P P
q q | Q Q
r 09:03:28 AM | R 09:03
s 1586873008 | S 28
t | T 09:03:28
u 2 | U 15
v 14-Apr-2020 | V 16
w 2 | W 15
x 04/14/2020 | X 09:03:28
y 20 | Y 2020
z -0500 | Z CDT
You may want to have this:
awk -v q='"' 'BEGIN{for(i=0;i<=25;i++){
  ch=sprintf("%c",i+97)
  fmt="date +%s%s %%%s%s"
  cmd=sprintf(fmt,q,ch,ch,q);                   cmd | getline v;  close(cmd)
  cmd=sprintf(fmt,q,toupper(ch),toupper(ch),q); cmd | getline v2; close(cmd)
  print v "|" v2
}}'
Note
you don't need to feed awk with seq 0 25; you can use the BEGIN block
printf writes to the output; if you want the result in a string, use sprintf()
you should close() each command after executing it
you didn't implement the "uppercase" part
Output:
a Tue|A Tuesday
b Apr|B April
c Tue 14 Apr 2020 03:02:33 PM CEST|C 20
d 14|D 04/14/20
e 14|E %E
f %f|F 2020-04-14
g 20|G 2020
h Apr|H 15
i %i|I 03
j 105|J %J
k 15|K %K
l 3|L %L
m 04|M 02
n |N 396667929
o %o|O %O
p PM|P pm
q 2|Q %Q
r 03:02:33 PM|R 15:02
s 1586869353|S 33
t |T 15:02:33
u 2|U 15
v %v|V 16
w 2|W 15
x 04/14/2020|X 03:02:33 PM
y 20|Y 2020
z +0200|Z CEST
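The printf-versus-sprintf point from the notes above can be seen in a tiny sketch:

```shell
awk 'BEGIN {
  printf "%c\n", 97       # printf writes straight to the output: prints "a"
  s = sprintf("%c", 97)   # sprintf returns the string instead of printing it
  print "captured: " s
}'
```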

How can I skip a line with awk

I have a command like this:
tac $log | awk -v pattern="test" '$9 ~ pattern {print; exit}'
It shows me the last line in which $9 contains the text test.
Like this:
Thu Mar 26 20:21:38 2015 1 10.8.0.22 94 /home/xxxyyy/zzz_test_223123.txt b _ o r spy ftp 0 * c
Thu Mar 26 20:21:39 2015 1 10.8.0.22 94 /home/SAVED/zzz_test_123123.txt b _ o r spy ftp 0 * c
Thu Mar 26 20:21:40 2015 1 10.8.0.22 94 /home/xxxyyy/zzz_test_123123.txt b _ o r spy ftp 0 * c
Thu Mar 26 20:21:41 2015 1 10.8.0.22 94 /home/SAVED/zzz_test_123124.txt b _ o r spy ftp 0 * c
which gives:
Thu Mar 26 20:21:41 2015 1 10.8.0.22 94 /home/SAVED/zzz_test_123124.txt b _ o r spy ftp 0 * c
This command shows me the last line, but I need to skip lines that contain SAVED, so it should show this instead:
Thu Mar 26 20:21:40 2015 1 10.8.0.22 94 /home/xxxyyy/zzz_test_123123.txt b _ o r spy ftp 0 * c
How can I do this?
To skip a line, you can match it, and use the next command.
$9 ~ /SAVED/ { next }
$9 ~ /\.txt$/ { print; exit }
You can add another condition with !~ to prevent the second pattern from being matched (I use pattern2 to make it more generic; of course you can hardcode SAVED there):
$9 ~ pattern && $9 !~ pattern2
All together:
$ awk -v pattern="test" -v pattern2="SAVED" '$9 ~ pattern && $9 !~ pattern2 {print; exit}'
Thu Mar 26 20:21:40 2015 1 10.8.0.22 94 /home/xxxyyy/zzz_test_123123.txt b _ o r spy ftp 0 * c
Use !~ to test if a line doesn't match a pattern.
awk -v pattern="test" '$9 ~ pattern && $9 !~ /SAVED/ { print; exit }'

Extract date from log file

I have a log line like this:
Tue Dec 2 10:03:46 2014 1 10.0.0.1 0 /home/test4/TEST_LOGIN_201312021003.201412021003.23872.sqlLdr b _ i r test4 ftp 0 * c
And I can print the date value of this line like this:
echo $log | awk '{print $9}' | grep -oP '(?<!\d)201\d{9}' | head -n 1
I have another log line like this, how can I print date value?
Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c
I tried my awk/grep solution, but it just prints 201 plus the 9 digits that follow the first 201 it sees.
The subfolders and the data file name carry the same date:
2014/12/11/16/20 --> 11 Dec 2014 16:20 <-- blablabla_data-2014_12_11_16_20.txt
note: /home/DATA1 is not static; the year/month/day/hour/minute structure is.
As the format in the path is /.../YYYY/MM/DD/HH/MM/filename, you can match the date block with 201\d/\d{2}/\d{2}/\d{2}/\d{2} in the grep expression:
$ log="Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data2_11_16_20.txt b _ i r spy ftp 0 * c"
$ echo "$log" | grep -oP '(?<!\d)201\d/\d{2}/\d{2}/\d{2}/\d{2}'
2014/12/11/16/20
Then remove the slashes with tr:
$ echo "$log" | grep -oP '(?<!\d)201\d/\d{2}/\d{2}/\d{2}/\d{2}' | tr -d '/'
201412111620
sed can also work, if you are acquainted with it:
echo "Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c"|sed 's#.*[[:alnum:]]*/\([[:digit:]]\{4\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}\).*#\1#'
output
2014/12/11/16/20
To remove "/", the same above command piped to tr -d '/'
Full command line
echo "Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c"|sed 's#.*[[:alnum:]]*/\([[:digit:]]\{4\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}/[[:digit:]]\{2\}\).*#\1#'|tr -d '/'
Output
201412111620
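An awk-only alternative is also possible; this is a sketch using POSIX match() with RSTART/RLENGTH, assuming the same path layout as above (the regex mirrors the grep one, spelled without {n} intervals for portability):

```shell
log="Tue Dec 9 10:48:13 2014 1 10.0.0.1 80 /home/DATA1/2014/12/11/16/20/blablabla_data-2014_12_11_16_20.txt b _ i r spy ftp 0 * c"
# match the YYYY/MM/DD/HH/MM block, extract it, then strip the slashes
echo "$log" | awk 'match($0, /201[0-9]\/[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]/) {
  s = substr($0, RSTART, RLENGTH)   # the matched 2014/12/11/16/20 block
  gsub(/\//, "", s)                 # drop the slashes
  print s                           # 201412111620
}'
```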

How to grep two columns from a single file

cat Error00
4 0 375
4 2001 21
4 2002 20
cat Error01
4 0 465
4 2001 12
4 2002 40
4 2016 1
I want output as below
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
I am using the query below. The problem is that I am not able to grep on the two fields together because of the space between them.
Please suggest how I can get rid of this.
keylist=$(awk '{print $1,$2}' Error0[0-1] | sort | uniq)
for key in ${keylist} ; do
echo ${key}
val_a=$(grep "^${key}" Error00 | awk '{print $3}') ;val_a=${val_a:---}
val_b=$(grep "^${key}" Error01 | awk '{print $1,$2}') ; val_b=${val_b:--- --}
echo $key ${val_a} >>testreport
done
I am getting the output below:
4 375 465
0
4 21 12
2001
4 20 20
2002
4 - 1
2016
A single awk one-liner can handle this easily:
awk 'FNR==NR{a[$1,$2]=$3;next}{print $1,$2,(a[$1,$2]?a[$1,$2]:"-"),$3}' err0 err1
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
For formatted output you can use printf instead of print, as Jonathan Leffler suggests:
printf "%s %-6s %-6s %s\n",$1,$2,(a[$1,$2]?a[$1,$2]:"-"),$3
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
However, a more general solution is to pipe the output through column -t for a nice table:
awk '{....}' err0 err1 | column -t
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
grep is not really the right tool for this job. You can either play with awk or Perl (or Python, or …), or you can use join. However, join only joins on a single column at a time, and you appear to need to join on two columns. So we're going to have to massage the data so that it will work with join. I'm going to assume you're using bash and so have process substitution available. You can do the job without it, but it is fiddlier and involves temporary files (and traps to clean them up, etc.).
The key to the join will be to replace the blank between the first two columns with a colon (or any other convenient character; control-A would work fine too), then join the files on column 1 of the massaged data. The inputs to join must be sorted, and the colon must be replaced with a blank again in the output.
$ join -o 0,1.2,2.2 -a 1 -a 2 -e '-' \
> <(sed 's/  */:/' Error00 | sort) \
> <(sed 's/  */:/' Error01 | sort) |
> sed 's/:/ /'
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
$
The 's/  */:/' operation replaces the first sequence of one or more blanks with a colon; the input data has two blanks between the 4 and the 0 in the first line of Error00. The input to join must be in sorted order of the joining field, here the first field. The output is the join field, the second column of Error00 and the second column of Error01 (remembering that means the second column after the first two have been fused by the colon). If there's an unmatched line in the first file, generate an output line (-a 1); ditto for the second file; and for the missing fields, insert a dash (-e '-'). The final sed removes the colon that was added.
If you want the data formatted, pipe it through awk.
$ join -o 0,1.2,2.2 -a 1 -a 2 -e '-' \
> <(sed 's/  */:/' Error00 | sort) \
> <(sed 's/  */:/' Error01 | sort) |
> sed 's/:/ /' |
> awk '{printf("%s %-6s %-6s %s\n", $1, $2, $3, $4)}'
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
$
