Displaying formatted records using AWK - sorting

I have been given an input file that looks like this:
ID PID PPID C STIME TTY TIME CMD
adz110 5344 5334 0 08:47 pts/2 00:00:00 bash
dmq292 6908 6854 0 08:53 pts/1 00:00:00 bash
adz110 7227 7150 0 08:54 pts/9 00:00:00 who
erg474 7466 7461 0 08:54 pts/10 00:00:00 ls
dmq292 7966 7960 0 08:55 pts/13 00:00:00 assign1.sh if of
xle135 8636 8628 0 08:58 pts/15 00:00:00 bash
xle135 8983 8636 0 08:59 pts/15 00:00:00 ssh ctf.cs.utsarr.net
zeh458 9057 1980 0 08:59 pts/7 00:00:00 vim prog.c
HanSolo 9150 9139 0 08:59 pts/16 00:00:00 ps -af
and the output needs to look like this:
User: adz110
bash
who
User: dmq292
bash
assign1.sh if of
User: erg474
ls
User: xle135
bash
ssh ctf.cs.utsarr.net
User: zeh458
vim prog.c
Earliest Start Time:
adz110 5344 5334 0 08:47 pts/2 00:00:00 bash
Latest Start Time
xle135 8983 8636 0 08:59 pts/15 00:00:00 ssh ctf.cs.utsarr.net
What I have come up with so far is in file.awk:
/[a-z]{3}[0-9]{3}/
{
print $1
if( match($1 , arg1) )
{
for(i=8; i <= NF ; i ++)
printf("%s", $i);
printf("\n");
}
}
END {
print " Earliest Start Time:"
print "Latest Start Time: "
}
So instead of printing the commands for each user whose username matches [a-z]{3}[0-9]{3}, it prints the username followed by the cmd with all the spaces removed. I am having a lot of trouble understanding associative arrays and the use of special variables like NR, RS, etc.

Try this, file.awk:
$1 !~ /[a-z]{3}[0-9]{3}/ {next;}
!fstTime {fstTime=$5; lstTime=$5; first=$0; last = $0;}
!($1 in a) {a[$1];users[++ind]=$1;}
{ cmd=$8; for(i=9;i<=NF;i++) cmd=cmd OFS $i;
cmds[$1] = cmds[$1] ? cmds[$1] "\n " cmd : " " cmd;
if ($5 < fstTime) { fstTime=$5; first=$0; }
if ($5 > lstTime) { lstTime=$5; last = $0; }
}
END {
for(i=1;i<=ind;i++) {
print "User: " users[i];
print cmds[users[i]];
}
print "Earliest Start Time:\n" first "\n\nLatest Start Time:\n" last;
}
awk -f file.awk fileToTest.txt yields:
User: adz110
bash
who
User: dmq292
bash
assign1.sh if of
User: erg474
ls
User: xle135
bash
ssh ctf.cs.utsarr.net
User: zeh458
vim prog.c
Earliest Start Time:
adz110 5344 5334 0 08:47 pts/2 00:00:00 bash
Latest Start Time:
xle135 8983 8636 0 08:59 pts/15 00:00:00 ssh ctf.cs.utsarr.net
If you want the zeh458 9057 1980 0 08:59 pts/7 00:00:00 vim prog.c line to be reported under Latest Start Time:, change ($5 > lstTime) to ($5 >= lstTime).

You want to apply the regex only against the first field, and collect the values for each user in memory in a format suitable for printing out.
Observe that the listing is fixed-width; so the program name and arguments are in a field which starts in column 55. Similarly, the time stamp is in columns 28-32.
awk 'NR > 1 && $1 ~ /^[a-z]{3}[0-9]{3}$/ {
    when = substr($0, 28, 5)
    command = substr($0, 55)
    if ($1 in user) {
        # Append this command to previous value
        user[$1] = user[$1] ORS "    " command
    } else {
        # Create a new item in this associative array
        user[$1] = "    " command
    }
    if (NR==2 || when > max) { max=when; maxcmd = $0 }
    if (NR==2 || when < min) { min=when; mincmd = $0 }
}
END {
    # XXX TODO: sort array?
    for (u in user) printf "User: %s\n%s\n", u, user[u]
    print "Earliest Start Time:"; print mincmd
    print "Latest Start Time:"; print maxcmd
}' filename
So the first time we see a particular user, they will not be in the user array, and so we just put their first command, indented by four spaces, as the value for that key in the associative array. If we see them again, we append another line to the value, with a newline (ORS) and four spaces in front.
NR is the current line number -- we skip NR==1 to avoid capturing the header line, and with NR==2 we know that this is the first line we are processing, so we set max and min to their baseline values.
Ideally, you should also normalize the timestamps into a canonical form so that you can sort "yesterday 21:24" before "21:23", but I'm not going there with this simple answer. You probably want to add embellishments in various other places, too. (Or maybe you could rely on the fact that the input seems to be sorted by process start time?)
If you want the user names to be sorted, GNU Awk has array sorting built-in; for other Awks, you will need to write your own simple sort function, or use an external pipeline.
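For Awks without built-in array sorting, the external-pipeline route could look roughly like the sketch below. The first awk pass emits one tab-separated line per command, tagged with the user and the input line number. sort then orders by user and, within a user, by input order. A second awk pass prints the grouped listing. The function name list_cmds_by_user is made up for illustration. The sketch also takes everything from field 8 onward as the command instead of relying on fixed columns, which is an assumption about the input rather than part of the answer above.

```shell
# Hypothetical helper: print each matching user's commands, users sorted by name.
# Assumes ps -af style input: header on line 1, user in $1, command from $8 on.
list_cmds_by_user() {
    awk 'NR > 1 && $1 ~ /^[a-z][a-z][a-z][0-9][0-9][0-9]$/ {
             cmd = $8
             for (i = 9; i <= NF; i++) cmd = cmd " " $i
             # user, input line number (keeps per-user order), command
             printf "%s\t%d\t%s\n", $1, NR, cmd
         }' "$1" |
    sort -t "$(printf '\t')" -k1,1 -k2,2n |
    awk -F '\t' '$1 != prev { print "User: " $1; prev = $1 }
                 { print "    " $3 }'
}
```

Called as list_cmds_by_user filename, it prints the "User:" blocks in alphabetical order. The earliest/latest bookkeeping is left out to keep the sketch short.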

There is one ambiguity in your input: several processes share the same start time (e.g. xle135 and zeh458). The solution below sorts the input file (you could substitute your command for it), then finds the minimum and maximum start times across all entries. When several entries share a timestamp, it does not concatenate them (though it could); it keeps only the last one in sorted order.
awk '
/^ID/{
next
}
{
split($5,array,":")
seconds=array[1]*60+array[2]
}
FNR==NR{
a[$1]++
tim[seconds]=$0
next
}
a[$1]==2{
print "User: " $1 ORS " " $NF
getline
sub(/.*:00/,"")
print" " $0
}
a[$1]==1{
val=$0
sub(/.*:00/,"",val)
print "User: " $1 ORS " " val
}
{
min=min<seconds?(min?min:seconds):seconds
max=max>seconds?max:seconds
}
END{
print "Earliest Start Time:" ORS tim[min] ORS "Latest Start Time" ORS tim[max]
}
' <(sort -k1,5 Input_file) <(sort -k1,5 Input_file)
Output will be as follows.
User: adz110
bash
who
User: dmq292
bash
assign1.sh if of
User: erg474
ls
User: HanSolo
ps -af
User: xle135
bash
ssh ctf.cs.utsarr.net
User: zeh458
vim prog.c
Earliest Start Time:
adz110 5344 5334 0 08:47 pts/2 00:00:00 bash
Latest Start Time
zeh458 9057 1980 0 08:59 pts/7 00:00:00 vim prog.c

Assuming that if a given user is running the same command multiple times you want that command to appear multiple times in the output for that user:
$ cat tst.awk
NR == 1 { next }
/^[a-z]{3}[0-9]{3}/ {
user = $1
users[user]
cmd = $0
sub(/([^[:space:]]+[[:space:]]+){7}/,"",cmd)
cmds[user,++numCmds[user]] = cmd
stime = $5
if ( (earliestStime == "") || (stime < earliestStime) ) {
earliestStime = stime
earliestData = $0
}
if ( (latestStime == "") || (stime > latestStime) ) {
latestStime = stime
latestData = $0
}
}
END {
for (user in users) {
printf "User: %s\n", user
for (cmdNr=1; cmdNr<=numCmds[user]; cmdNr++) {
printf " %s\n", cmds[user,cmdNr]
}
}
print "\nEarliest Start Time:"
print earliestData
print "\nLatest Start Time:"
print latestData
}
$ awk -f tst.awk file
User: xle135
bash
ssh ctf.cs.utsarr.net
User: zeh458
vim prog.c
User: dmq292
bash
assign1.sh if of
User: erg474
ls
User: adz110
bash
who
Earliest Start Time:
adz110 5344 5334 0 08:47 pts/2 00:00:00 bash
Latest Start Time:
xle135 8983 8636 0 08:59 pts/15 00:00:00 ssh ctf.cs.utsarr.net
If the order of users or commands in the output matters then tell us what the order should be and it'll be easy enough to adapt, e.g. by retaining the input order or setting PROCINFO["sorted_in"] with GNU awk before each loop in the END section.

Related

awk modify printf output

I need to parse the output of ldapsearch and only keep the attributes with numeric values.
Also I need to transform the output to make it usable in prometheus monitoring.
This is the output of a raw ldapsearch:
# 389, snmp, monitor
dn: cn=389,cn=snmp,cn=monitor
cn: 389
objectClass: top
objectClass: extensibleObject
anonymousbinds: 9
unauthbinds: 9
simpleauthbinds: 122256
strongauthbinds: 0
bindsecurityerrors: 27869
inops: 24501385
readops: 17933653
compareops: 24852
addentryops: 14205
removeentryops: 0
modifyentryops: 378287
modifyrdnops: 0
listops: 0
searchops: 19194674
onelevelsearchops: 117
wholesubtreesearchops: 1260904
referrals: 0
chainings: 0
securityerrors: 2343
errors: 4694375
connections: 1075
connectionseq: 4720927
bytesrecv: 1608469180
bytessent: -424079608
entriesreturned: 19299393
referralsreturned: 0
I execute this query in order to remove the fields that are not numerical and also the dn/cn fields if they have numbers eg cn=389.
${LDAPSEARCH} -LLL -H ${LDAP_URI} -x -D "${BINDDN}" -w ${LDAP_PASSWD} -b "${cn}" -s base | sed '/^cn\|^dn/d' | awk -F: '{ if ( $1 != "connection" && $2 ~ /[[:digit:]$]/) printf "dsee_%s\n", $1 $2}'
But I need to modify the printf so that it prints each field like this:
dsee_modifyrdnops{node="vm1",cn="389"} 0
dsee_listops{node="vm1",cn="1389"} 0
dsee_strongauthbinds{node="vm1",cn="389"} 0
dsee_readops{node="vm1",cn="389"} 37194588
I am having difficulty adding the curly brackets and quotes to the printf command.
What would be the best way to improve the awk/sed command and modify the printf output?
In plain bash:
#!/bin/bash
node=vm1
while IFS=: read -r key val; do
[[ $key = cn ]] && { cn=${val# }; continue; }
if [[ $val =~ ^\ -?[0-9]+(\.[0-9]*)?$ ]]; then
printf 'dsee_%s{node="%s",cn="%s"}%s\n' "$key" "$node" "$cn" "$val"
fi
done < <( your_raw_ldapsearch_command )
Something along these lines:
$ cat tst.awk
BEGIN {
FS=":[[:blank:]]*"
qq="\""
node="vm1"
}
$1=="cn" {cn=$2}
$1!~/^((cn|dn)$|connection)/ && $2~/^[[:digit:]]+$/ {
printf("dsee_%s{node=%s%s%s,cn=%s%s%s} %d\n", $1, qq, node, qq, qq, cn, qq, $2)
}
$ awk -f tst.awk myFile
dsee_anonymousbinds{node="vm1",cn="389"} 9
dsee_unauthbinds{node="vm1",cn="389"} 9
dsee_simpleauthbinds{node="vm1",cn="389"} 122256
dsee_strongauthbinds{node="vm1",cn="389"} 0
dsee_bindsecurityerrors{node="vm1",cn="389"} 27869
dsee_inops{node="vm1",cn="389"} 24501385
dsee_readops{node="vm1",cn="389"} 17933653
dsee_compareops{node="vm1",cn="389"} 24852
dsee_addentryops{node="vm1",cn="389"} 14205
dsee_removeentryops{node="vm1",cn="389"} 0
dsee_modifyentryops{node="vm1",cn="389"} 378287
dsee_modifyrdnops{node="vm1",cn="389"} 0
dsee_listops{node="vm1",cn="389"} 0
dsee_searchops{node="vm1",cn="389"} 19194674
dsee_onelevelsearchops{node="vm1",cn="389"} 117
dsee_wholesubtreesearchops{node="vm1",cn="389"} 1260904
dsee_referrals{node="vm1",cn="389"} 0
dsee_chainings{node="vm1",cn="389"} 0
dsee_securityerrors{node="vm1",cn="389"} 2343
dsee_errors{node="vm1",cn="389"} 4694375
dsee_bytesrecv{node="vm1",cn="389"} 1608469180
dsee_entriesreturned{node="vm1",cn="389"} 19299393
dsee_referralsreturned{node="vm1",cn="389"} 0
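The raw output in the question also contains a negative counter (bytessent: -424079608), which a digits-only test drops. A possible variation, sketched here under some assumptions: the helper name ldap_to_prom is hypothetical, the node label is passed in as an argument rather than hard-coded, and no attribute other than dn and cn is filtered out by name. The quotes are written as \" inside the printf format, which is one way to get them past the shell's single quotes.

```shell
# Hypothetical helper: turn "attr: value" lines into Prometheus-style samples.
# $1 is the node label; cn is taken from the "cn:" line of the entry.
ldap_to_prom() {
    awk -v node="$1" -F ':[ \t]*' '
        $1 == "cn" { cn = $2; next }
        # keep only attributes whose value is an (optionally negative) integer
        $1 != "dn" && $2 ~ /^-?[0-9]+$/ {
            printf "dsee_%s{node=\"%s\",cn=\"%s\"} %s\n", $1, node, cn, $2
        }'
}
```

It reads the ldapsearch output on standard input, for example your_raw_ldapsearch_command | ldap_to_prom vm1. The value is printed with %s so that large counters are passed through unchanged.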

How to print result data on the same line in a bash command?

I have my command below and I want to have the result on a single line with delimiters. My command:
Array=("GET" "POST" "OPTIONS" "HEAD")
echo $(date "+%Y-%m-%d %H:%M")
for i in "${Array[@]}"
do
cat /home/log/myfile_log | grep "$(date "+%d/%b/%Y:%H")"| awk -v last5=$(date --date="-5 min" "+%M") -F':' '$3>=last5 && $3<last5+5{print}' | egrep -a "$i" | wc -l
done
The result is:
2019-01-01 13:27
1651
5760
0
0
I want to have the result below:
2019-01-01 13:27,1651,5760,0,0
It looks (to me) like the overall objective is to scan /home/log/myfile_log for entries that have occurred within the last 5 minutes and which match one of the 4 entries in ${Array[@]}, keeping count of the matches along the way and finally printing the current date and the counts to a single line of output.
I've opted for a complete rewrite that uses awk's abilities of pattern matching, keeping counts and generating a single line of output:
date1=$(date "+%Y-%m-%d %H:%M") # current date
date5=$(date --date="-5 min" "+%M") # date from 5 minutes ago
awk -v d1="${date1}" -v d5="${date5}" -F":" '
BEGIN { keep=0 # init some variables
g=0
p=0
o=0
h=0
}
$3>=d5 && $3<d5+5 { keep=1 } # do we keep processing this line?
!keep { next } # if not then skip to next line
/GET/ { g++ } # increment our counters
/POST/ { p++ }
/OPTIONS/ { o++ }
/HEAD/ { h++ }
{ keep=0 } # reset keep flag for next line
# print results to single line of output
END { printf "%s,%s,%s,%s,%s\n", d1, g, p, o, h }
' <(grep "$(date '+%d/%b/%Y:%H')" /home/log/myfile_log)
NOTE: The OP may need to revisit the <(grep "$(date ...)" /home/log/myfile_log) to handle time windows that span hours, days, months and years, e.g. 14:59 - 15:04, 12/31/2019 23:59 - 01/01/2020 00:04, etc.
Yeah, it's a bit verbose but a bit easier to understand; OP can rewrite/reduce as sees fit.
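One way to sidestep the hour/day/year boundary problem mentioned in the note is to compare whole timestamps rather than just the minute field. The sketch below rebuilds each log timestamp as a fixed-width YYYYMMDDHHMM string and compares it with a cutoff in the same form. The helper name count_since and its argument order are made up for illustration. The log format is assumed to be [dd/Mon/yyyy:HH:MM:SS ...], as the grep pattern above suggests.

```shell
# Hypothetical helper: count request methods in log lines at or after a cutoff.
# $1 = cutoff as YYYYMMDDHHMM, $2 = label for the first column, $3 = log file
# Assumes timestamps of the form [01/Jan/2019:13:27:45 ...].
count_since() {
    awk -v cutoff="$1" -v label="$2" '
        BEGIN {
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
            for (i = 1; i <= n; i++) mon[name[i]] = sprintf("%02d", i)
            g = p = o = h = 0
        }
        {
            start = index($0, "[")
            if (start == 0) next
            ts = substr($0, start + 1, 17)             # dd/Mon/yyyy:HH:MM
            key = substr(ts, 8, 4) mon[substr(ts, 4, 3)] substr(ts, 1, 2)
            key = key substr(ts, 13, 2) substr(ts, 16, 2)
            if (key < (cutoff "")) next                # string compare, fixed width
        }
        /GET/     { g++ }
        /POST/    { p++ }
        /OPTIONS/ { o++ }
        /HEAD/    { h++ }
        END { printf "%s,%d,%d,%d,%d\n", label, g, p, o, h }
    ' "$3"
}
```

With GNU date, a call might look like count_since "$(date --date='-5 min' '+%Y%m%d%H%M')" "$(date '+%Y-%m-%d %H:%M')" /home/log/myfile_log. Because the key includes year, month and day, a window such as 23:58 to 00:03 needs no special handling, at the cost of scanning the whole file instead of pre-filtering with grep.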

Attempt to print out the row containing the earliest/latest date

I'm stuck on what to do. My script will output the date/time itself, but not the whole row containing that value.
And the input looks like the following:
UID PID PPID C STIME TTY TIME CMD
adz110 5344 5334 0 08:47 pts /2 00:00:00 bash
dmq292 6908 6854 0 Jun04 pts /1 00:00:00 bash
adz110 7227 7150 0 Jul11 pts /9 00:00:00 who
erg474 7466 7461 0 08:54 pts /10 00:00:00 ls
dmq292 7966 7960 0 Jun04 pts /13 00:00:00 assign1 . sh if of
xle135 8983 8636 0 08:59 pts /15 00:00:00 ssh ctf . cs . utsarr . net
zeh458 9057 1980 0 08:59 pts /7 00:00:00 vim prog . c
rslavin 9150 9139 0 08:59 pts /16 00:00:00 ps - af
xle135 8636 8628 0 08:58 pts /15 00:00:00 bash
This seems to work for printing out "Earliest Start Time: Jun04" and "Latest Start Time: 08:59".
NR==0{min = $5; next} {if ($5 > min) min = $5;}
{max = 0}{if ($5>max) max=$5}
I'm finishing up the entire script, which looks like this:
sort |
gawk '
$1 !~ /[a-z]{3}[0-9]{3}/ {
next
}
NR==0{min = $5; next} {if ($5 > min) min = $5;}
{max = 0}{if ($5>max) max=$5}
prev!=$1 && prev{
print "User: " prev ORS "\t " val
prev=val=""
}
{
prev=$1
val=(val?val ORS "\t ":"")substr($0, index($0,$8))
} ;
END {
if(prev){
print "User: " prev ORS "\t " val
}; { print "\nEarliest Start Time :\n" } { print min };
{ print "\nLatest Start Time :\n" } { print max };
}'
I thought I would want to change it to "min = $0" and "max = $0" to get the whole row containing the min and max values. But that prints out different rows instead.
The sample output should be:
User : adz110
bash
who
User : dmq292
bash
assign1 . sh if of
User : erg474
ls
User : xle135
bash
ssh ctf . cs . utsarr . net
User : zeh458
vim prog . c

Earliest Start Time :
dmq292 6908 6854 0 Jun04 pts /1 00:00:00 bash

Latest Start Time :
xle135 8983 8636 0 08:59 pts /15 00:00:00 ssh ctf . cs . utsarr . net
For me, earliest Start time just comes out as "Jun04" and latest start time comes out as "08:59"
#! /bin/bash
file=$1
awk ' BEGIN {
mon["Jan"]=1
mon["Feb"]=2
mon["Mar"]=3
mon["Apr"]=4
mon["May"]=5
mon["Jun"]=6
mon["Jul"]=7
mon["Aug"]=8
mon["Sep"]=9
mon["Oct"]=10
mon["Nov"]=11
mon["Dec"]=12
}
# main logic here
NR != 1{
# storing user info
users[$1]=users[$1]"\n\t"$9
for (i = 10 ; i <= NF ; ++i)
users[$1]=users[$1]" "$i
# min
if (min != "") {
if (compare(min, $5) == 1) {
min=$5
min_line=$0
}
} else {
min=$5
min_line=$0
}
# max
if (max != "") {
if (compare(max, $5) == -1) {
max=$5
max_line=$0
}
} else {
max=$5
max_line=$0
}
}
# printing here
END {
for (user in users) {
print "User:", user, users[user]
}
print "Earliest Start Time :\n\t"min_line
print "Latest Start Time :\n\t"max_line
}
function compare(arg1, arg2,    t1, t2, m1, m2, d1, d2) {
    # returns
    # -1 if arg1 < arg2
    #  0 if arg1 == arg2
    #  1 if arg1 > arg2
    # a value starting with a digit is a time (HH:MM, started today);
    # anything else is a date (MonDD, started on an earlier day)
    t1 = (arg1 ~ /^[0-9]/)
    t2 = (arg2 ~ /^[0-9]/)
    if (t1 && t2) {
        # both are times: HH:MM is zero-padded, so string order is time order
        if (arg1 > arg2) return 1
        if (arg1 < arg2) return -1
        return 0
    }
    if (t1) return 1     # arg2 is a date, means arg2 is smaller
    if (t2) return -1    # arg1 is a date, means arg1 is smaller
    # both arg1 and arg2 are dates: compare month number, then day
    m1 = mon[substr(arg1, 1, 3)]
    m2 = mon[substr(arg2, 1, 3)]
    if (m1 != m2) return (m1 > m2) ? 1 : -1
    d1 = substr(arg1, 4) + 0
    d2 = substr(arg2, 4) + 0
    if (d1 != d2) return (d1 > d2) ? 1 : -1
    return 0
}' "${file}"
The main barrier in writing this script is a function to compare two different formats of time.
The above script should produce the desired result, although I haven't taken care of sorting. If that's needed, just ask.
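An alternative to a pairwise comparison function is to map every STIME value to a fixed-width key that sorts correctly as a plain string, and then track the minimum and maximum key. The sketch below is one way to do that. The helper name earliest_latest is made up, and it assumes that a MonDD value always refers to an earlier day in the current year, so every date sorts before every time.

```shell
# Hypothetical helper: print the rows with the earliest and latest STIME ($5).
# Assumes STIME is either HH:MM (started today) or MonDD (an earlier day of
# the same year), and that the header is on line 1.
earliest_latest() {
    awk '
        BEGIN {
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
            for (i = 1; i <= n; i++) mon[name[i]] = i
        }
        # "0MMDD" for dates, "1HHMM" for times: string order is time order
        function key(s) {
            if (s ~ /^[0-9]/) return "1" substr(s, 1, 2) substr(s, 4, 2)
            return sprintf("0%02d%02d", mon[substr(s, 1, 3)], substr(s, 4))
        }
        NR > 1 {
            k = key($5)
            if (min == "" || k < min) { min = k; min_line = $0 }
            if (max == "" || k > max) { max = k; max_line = $0 }
        }
        END {
            print "Earliest Start Time:"; print min_line
            print "Latest Start Time:";   print max_line
        }' "$1"
}
```

Because the comparisons are strict, the first row seen wins when several rows share the same start time.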

Combine awk and another command to send report to user

I need a little help with a Unix shell script that uses awk.
I have a file like below:
139341 8.61248 python_dev ntoma2 r 07/17/2017 07:27:43 gpuml@acepd1641.udp.finco.com 1
139342 8.61248 python_val ntoma2 r 07/17/2017 07:27:48 gpuml@acepd1611.udp.finco.com 1
139652 8.61248 python_dev ntoma2 r 07/17/2017 10:55:57 gpuml@acepd1671.udp.finco.com 1
The file is space-separated. I need the 1st and 4th columns, which are the job ID and user name (ntoma2 in this case), for rows whose 6th column (a date in mm/dd/yyyy format) is more than 7 days before the current date.
I have the command below to get the job ID and user name of jobs older than 7 days:
cat filename.txt | awk -v dt="$(date "--date=$(date) -7 day" +%m/%d/%Y)" -F" " '/qw/{ if($6<dt) print $4,":",$1 }' >> ./longRunningJob.$$
I also have another command, shown below, that looks up the email ID for a user name (the 4th column above):
/ccore/pbis/bin/enum-members "adsusers" | grep ^UNIX -B3 | grep <User-Name> -B2 | grep UPN | awk '{print $2}'
I need to combine the above 2 commands and send a report to every user, like the one below:
echo "Hello <User Name>, There is a long running job which is of job-id: <job-id> more than 7days, so please kill the job or let us know if we can help. Thank you!" | mailx -s "Long Running Job"
NOTE: if user name repeated, all the list should go in one email.
I am not sure how I can combine these 2 commands and send the email to each user. Can someone please help me?
Thank you in advance!!
Vasu
You can certainly do this in awk -- easier in gawk because of date support.
Just to give you an outline of how to do this, I wrote this in Ruby:
$ cat file
139341 8.61248 python_dev ntoma2 r 07/10/2017 07:27:43 gpuml@acepd1641.udp.finco.com 1
139342 8.61248 python_val ntoma2 r 07/09/2017 07:27:48 gpuml@acepd1611.udp.finco.com 1
139652 8.61248 python_dev ntoma2 r 07/17/2017 10:55:57 gpuml@acepd1671.udp.finco.com 1
$ ruby -lane 'BEGIN{ require "date"
jobs=Hash.new { |h,k| h[k]=[] }
users=Hash.new()
pn=7.0
}
t=DateTime.parse("%s %s" % [$F[5].split("/").rotate(-1).join("-"), $F[6]])
ti_days=(DateTime.now-t).to_f
ts="%d days, %d hours, %d minutes and %d seconds" % [60,60,24]
.reduce([ti_days*86400]) { |m,o| m.unshift(m.shift.divmod(o)).flatten }
users[$F[3]]=$F[7]
jobs[$F[3]] << "Job: %s has been running %s" % [$F[0], ts] if (DateTime.now-t).to_f > pn
END{
jobs.map { |id, v|
w1,w2=["is a","job"]
w1,w2=["are","jobs"] if v.length>1
s="Hello #{id}, There #{w1} long running #{w2} running more than the policy of #{pn.to_i} days. Please kill the #{w2} or let us know if we can help. Thank you!\n\t" << v.join("\n\t")
puts "#{users[id]} \n#{s}"
# s is the formated email address and body. You take it from here...
}
}
' /tmp/file
gpuml@acepd1671.udp.finco.com
Hello ntoma2, There are long running jobs running more than the policy of 7 days. Please kill the jobs or let us know if we can help. Thank you!
Job: 139341 has been running 11 days, 9 hours, 28 minutes and 44 seconds
Job: 139342 has been running 12 days, 9 hours, 28 minutes and 39 seconds
I have a solution, but there is a bug in it. Here it is:
#!/bin/bash
{ qstat -u \*; /ccore/pbis/bin/enum-members "adsusers"; } | awk -v dt=$(date "--date=$(date) -7 day" +%m/%d/%Y) '
/^User obj/ {
F2 = 1
FS = ":"
T1 = T2 = ""
next
}
!F2 {
if (NR < 3) next
if ($5 ~ "qw" && $6 < dt) JID[$4] = $1 "," JID[$4]
next
}
/^UPN/ {T1 = $2
}
/^Display/ {T2 = $2
}
/^Alias/ {gsub (/ /, _, $2)
EM[$2] = T1
DN[$2] = T2
}
END {for (j in JID) {print "echo -e \"Hello " DN[j] " \\n \\nJob(s) with job id(s): " JID[j] " executing more than last 7 days, hence request you to take action, else job(s) will be killed in another 1 day \\n \\n Thank you.\" | mailx -s \"Long running job for user: " DN[j] " (" j ") and Job ID(s): " JID[j] "\" " EM[j]
}
}
' | sh
The bug in the code above is that the date comparison in the if condition (shown below) does not work as expected. I am not sure how to compare $6 with the variable dt (both in mm/dd/yyyy format). I think I should use mktime() or something similar. Can someone please help?
if ($5 ~ "qw" && $6 < dt)
Thank you!!
Vasu
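The comparison $6 < dt misbehaves because awk compares the two values as strings, character by character, so the month decides first: 12/31/2016 sorts after 07/10/2017. A common workaround that needs no date functions is to rearrange both values into yyyymmdd before comparing. The sketch below shows only that step. The helper name jobs_older_than is made up, and the job-state test on $5 and the mail lookup from the script above are left out.

```shell
# Hypothetical helper: list "user: jobid" for jobs dated before a cutoff.
# $1 = cutoff as mm/dd/yyyy, $2 = file; job id in $1, user in $4, date in $6.
jobs_older_than() {
    awk -v dt="$1" '
        # mm/dd/yyyy -> yyyymmdd, which sorts chronologically as a string
        function ymd(d,    p) { split(d, p, "/"); return p[3] p[1] p[2] }
        BEGIN { cutoff = ymd(dt) }
        ymd($6) < cutoff { print $4 ": " $1 }
    ' "$2"
}
```

In the script above, the same idea would mean replacing $6 < dt with ymd($6) < ymd(dt) and adding the ymd function to the awk program.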

Shell ps command under Ubuntu

I have a question regarding shell scripts, and I will try to be as specific as possible. I have to write a monitoring shell script that writes to a file all the users who have been running a vi command for more than one minute. I don't really have any idea about the approach, except that I should use the ps command. I have something like this:
ps -ewo "%t %u %c %g" | grep '\<vi\>'
With this I get the times and the users that run a vi command. The problem is that I don't really know how to parse the result of this command. Can anyone help, please? All answers are appreciated. Thanks
I will use awk:
ps eo user,etime,pid,args --no-heading -C vi | awk '{MIN=int(substr($2,1,2)); printf "minutes=%s pid=%d\n", MIN, $3; }'
Note that you don't have to grep for "vi"; you can use "ps -C procname".
This is what I'd do:
ps fo "etime,user" --no-heading --sort 'uid,-etime' $(pgrep '\<vi\>') |
perl -ne '($min,$sec,$user) = (m/^\s+(\d\d):(\d\d)\s+(\w+)$/mo);
print "$user\t$min:$sec\n" unless ((0+$min)*60+$sec)<60'
Tack on | cut -f1 | uniq or | cut -f1 | uniq -c to get some nicer stats
Note that the way this is formulated it is easy to switch the test to 59 seconds or 3min11s if you so wish by changing <60 to e.g. <191 (for 3m11s)
If you have Ruby(1.9+)
#!/usr/bin/env ruby
while true
process="ps eo user,etime,args"
f = IO.popen(process) #call the ps command
f.readlines.each do|ps|
user, elapsed, command = ps.split
if command["vi"] && elapsed > "01:00"
puts "User #{user} running vi for more than 1 minute: #{elapsed}"
end
end
f.close
sleep 10 # sleep 10 seconds before monitoring again
end
#!/bin/sh
# -e :: all processes (including other users')
# -o :: define output format
# user :: user name
# etimes :: time in seconds after the process was started
# pid :: process id
# comm :: name of the executable
# --no-headers :: do not print column names
ps -eo user,etimes,pid,comm --no-headers |
awk '
# (...) :: select only rows that meet the condition in ()
# $4 ~ // :: 4th field (comm) should match the pattern in //
# (^|\/)vim?$ :: beginning of the line or "/", then "vi",
# nothing or "m" (to capture vim), end of the line
# $2 >= 60 :: 2nd field (etimes) is at least 60 seconds
($4 ~ /(^|\/)vim?$/ && $2 >= 60){
# convert 2nd field (etimes) into minutes
t = int($2 / 60);
# check if the time is more than 1 minute
s = (t > 1) ? "s" : "";
# output
printf "user %s : [%s] (pid=%d) started %d minute%s ago\n", $1, $4, $3, t, s;
}'
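On systems where ps offers only etime and not etimes, the elapsed time arrives in the form [[dd-]hh:]mm:ss and has to be converted before it can be compared with 60. A rough sketch of that conversion follows; the helper name etime_to_seconds is made up for illustration.

```shell
# Hypothetical helper: convert a ps etime value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0
        t = $1
        # split off an optional leading "dd-" part
        if (index(t, "-")) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        # p holds [hh,] mm, ss; the last two parts are always mm and ss
        secs = p[n] + 60 * p[n - 1]
        if (n == 3) secs += 3600 * p[1]
        print secs + 86400 * days
    }'
}
```

For example, etime_to_seconds 01:05 prints 65 and etime_to_seconds 1-02:03:04 prints 93784. The same arithmetic could be moved inside the awk filter above to replace the dependency on etimes.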

Resources