I need some help getting a script up and running. Basically, I have some data that comes from a command's output, and I want to select some of it and evaluate it.
Example data is
JSnow <jsnow@email.com> John Snow spotted 30/1/2015
BBaggins <bbaggins@email.com> Bilbo Baggins spotted 20/03/2015
Batman <batman@email.com> Batman spotted 09/09/2015
So far I have something along the lines of
# Define date to check
check=$(date -d "-90 days" "+%Y/%m/%d")
# Return user name
for user in $(command | awk '{print $1}')
do
# Return last logon date
lastdate=$(command | awk '{for(i=1;i<=NF;i++) if ($i=="spotted") print $(i+1)}')
# Evaluation date again current -90days
if [[ "$lastdate" < "$check" ]]; then
printf '%s not logged on for ages\n' "$user"
fi
done
I have a couple of problems, not least the fact that while I can get information from various places, I don't know how to put it all together! I'm also guessing my date evaluation will be more complicated, but at this point that's another problem, just there to give a better idea of my intentions. If anyone can explain the logical steps needed to achieve my goal, as well as propose a solution, that would be great. Thanks
Every time you write a loop in shell just to manipulate text you have the wrong approach (see, for example, https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice). The general purpose text manipulation tool that comes on every UNIX installation is awk. This uses GNU awk for time functions:
$ cat tst.awk
BEGIN { check = systime() - (90 * 24 * 60 * 60) }
{
user = $1
date = gensub(/([0-9]+)\/([0-9]+)\/([0-9]+)/,"\\3 \\2 \\1 0 0 0",1,$NF)
secs = mktime(date)
if (secs < check) {
printf "%s not logged in for ages\n", user
}
}
$ cat file
JSnow <jsnow@email.com> John Snow spotted 30/1/2015
BBaggins <bbaggins@email.com> Bilbo Baggins spotted 20/03/2015
Batman <batman@email.com> Batman spotted 09/09/2015
$ cat file | awk -f tst.awk
JSnow not logged in for ages
BBaggins not logged in for ages
Batman not logged in for ages
Replace cat file with command.
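In other words, assuming command emits lines like the sample data above:
command | awk -f tst.awk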
I have a file that looks like this:
user1,135.4,MATLAB,server1,14:53:59,15:54:28
user2,3432,Solver_HF+,server1,14:52:01,14:54:28
user3,3432,Solver_HF+,server1,14:52:01,15:54:14
user4,3432,Solver_HF+,server1,14:52:01,14:54:36
I want to run a comparison between the last two columns, and if the difference is greater than an hour (such as lines 1 and 3) it will trigger something like this:
echo "individual line from file" | mail -s "subject" email@site.com
I was trying to come up with a possible solution using awk, but I'm still fairly new to Linux and couldn't quite figure out something that worked.
The following awk script may be what you want:
awk 'BEGIN{FS=","}
{a="2019 01 01 " gensub(":"," ","g",$5);
b="2019 01 01 " gensub(":"," ","g",$6);
c = int((mktime(b)-mktime(a))/60)}
{if (c >= 60){system("echo \"" $0 "\" | mail -s \"subject\" email@site.com")}}' your_filename
Then put the script into crontab or another trigger, for example:
*/5 * * * * awk_scripts.sh
If you just want to check new lines, tail (e.g. tail -f filename) may be more useful than cat.
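For example, a minimal cron-able wrapper sketch (the path and the line count are hypothetical; the awk body is the one above):
#!/bin/sh
# awk_scripts.sh - check only the most recent lines instead of the whole file
tail -n 50 /path/to/your_filename | awk 'BEGIN{FS=","}
{a="2019 01 01 " gensub(":"," ","g",$5)
 b="2019 01 01 " gensub(":"," ","g",$6)
 if (int((mktime(b)-mktime(a))/60) >= 60)
   system("echo \"" $0 "\" | mail -s \"subject\" email@site.com")}'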
Here you go (using GNU awk due to mktime):
awk -F, '{
split($(NF-1),t1,":");
split($NF,t2,":");
d1=mktime("0 0 0 "t1[1]" "t1[2]" "t1[3]" 0");
d2=mktime("0 0 0 "t2[1]" "t2[2]" "t2[3]" 0");
if (d2-d1>3600) print $0}' file
user1,135.4,MATLAB,server1,14:53:59,15:54:28
user3,3432,Solver_HF+,server1,14:52:01,15:54:14
Using comma as the field separator to get the last and second-to-last fields.
Then split the two fields into arrays t1 and t2 to get hour, minute and second.
mktime converts these to seconds.
Do the math and print only lines that differ by more than 3600 seconds.
This can then be piped to other commands.
See how time functions are used in GNU awk: https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html
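For example, to mail each offending line as the original question wanted (the address is the question's placeholder):
awk -F, '{
split($(NF-1),t1,":");
split($NF,t2,":");
d1=mktime("0 0 0 "t1[1]" "t1[2]" "t1[3]" 0");
d2=mktime("0 0 0 "t2[1]" "t2[2]" "t2[3]" 0");
if (d2-d1>3600) print $0}' file |
while IFS= read -r line; do
echo "$line" | mail -s "subject" email@site.com
done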
I'm trying to write a script to pull the numbers out of 4 files that store temperature readings from 4 industrial freezers. This is a hobby script; it generates the general readouts I wanted. However, when I try to generate a SUM of the temperature readings, I get the following printout in the file, and my goal is to print the final SUM only, not the individual numbers printed out in a vertical format.
Any help would be greatly appreciated; here's my code:
grep -o "[0.00-9.99]" "/location/$value-1.txt" | awk '{ SUM += $1; print $1} END { print SUM }' >> "/location/$value-1.txt"
Here is what I am getting in return:
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
1
7
.
2
8
1
7
.
0
1
1
7
.
0
0
1
8
.
7
2
53
It does generate the SUM, but I don't need the already-listed numbers, just the SUM total.
Why not stick with AWK completely? Code:
$ cat > summer.awk
{
while(match($0,/[0-9]+\.[0-9]+/)) # while matches on record
{
sum+=substr($0, RSTART, RLENGTH) # extract matches and sum them
$0=substr($0, RSTART + RLENGTH) # reset to start after previous match
count++ # count matches
}
}
END {
print sum"/"count"="sum/count # print sum, count and the average
}
Data:
$ cat > data.txt
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
Run:
$ awk -f summer.awk data.txt
70.01/4=17.5025
It might work in the winter too.
The regex in grep -o "[0.00-9.99]" "/location/$value-1.txt" is equivalent to [0-9.], but you're probably looking for numbers in the range 0.00 to 9.99. For that, you need a different regex:
grep -o "[0-9]\.[0-9][0-9]" "/location/$value-1.txt"
That looks for a digit, a dot, and two more digits. It was almost tempting to use [.] in place of \.; it would also work. A plain . would not; that would select entries such as 0X87.
Note that the pattern shown ([0-9]\.[0-9][0-9]) will match 192.16.24.231 twice (2.16 and 4.23). If that's not what you want, you have to be a lot more precise. OTOH, it may not matter in the slightest for the actual data you have. If you'd want it to match 192.16 and 24.231 (or .24 and .231), you have to refine your regex.
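For illustration, one way to sidestep both issues with this particular data is to anchor on the colon that precedes each reading (assuming, as in the sample, every reading follows a label ending in a colon):
# match ":17.28" etc., then strip the colon
grep -oE ':[0-9]+\.[0-9][0-9]' "/location/$value-1.txt" | tr -d ':'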
Your command structure:
grep … filename | awk '…' >> filename
is living dangerously. In the example, it is 'OK' (but there's a huge grimace on my face as I type 'OK') because the awk script doesn't write anything to the file until grep has read it all. But change the >> to > and you have an empty input, or have awk write material before the grep is complete and suddenly it gets very tricky to determine what happens (it depends, in part, on what awk writes to the end of the file).
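A safer sketch writes to a temporary file first and appends only once the pipeline has finished (using the corrected regex from above):
tmp=$(mktemp) &&
grep -o "[0-9]\.[0-9][0-9]" "/location/$value-1.txt" |
awk '{ SUM += $1 } END { print SUM }' > "$tmp" &&
cat "$tmp" >> "/location/$value-1.txt" &&
rm -f "$tmp"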
Just for the record, I am running ksh on a Unix Solaris system.
I have a script that could look like this:
awk '{ do stuff here }'
for n in x; do
echo $something | awk '{ do more stuff here }'
done
I know that I can use time in the console when running the script. For example, if the name of the file were file_name, I would run it as time ksh file_name, but that way I get the run time of the whole script, and I need only the run time of the for loop in order to do some calculations in comparison with another script. Is it even possible to invoke time inside the script, so that I could add it before the for and have it measure only that part? I would also like to know if it is possible to have two time measurements in one script, so you would get the run times of two different parts of the script.
Since variables in a Bash script are global by default, you can store that for loop in a function and call it with time prepended:
my_for_function () {
for n in x; do
echo $something | awk '{ do more stuff here }'
done
}
awk '{ do stuff here }'
time my_for_function # here you call `time`
Then you run the script normally.
Little example
$ cat silly_script.sh
myfunc() {
for x in "${a[@]}"
do
echo "$x --"
sleep 1
done
}
awk 'BEGIN {print "i am here"}'
a=(1 2 3 4)
time myfunc
echo "i am done"
Let's run it:
$ bash silly_script.sh
i am here
1 --
2 --
3 --
4 --
real 0m4.008s
user 0m0.004s
sys 0m0.000s
i am done
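To time two different parts, as the question also asks, just wrap each part in its own function and call time twice; myfunc2 here is a hypothetical second section added to the script above:
myfunc2() {
sleep 2
}
time myfunc    # run time of the first part
time myfunc2   # run time of the second part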
I've a log file that contains some lines I need to grab:
Jul 2 06:42:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (3)
Jul 2 06:51:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (2)
Account id (xx) gives me the name of an object, which I am able to gather through a mysql query.
The following command (which is surely not optimized at all, but working) gives me the number of matching lines per account id:
grep "Slow transactions" logfile| awk '{print $18}' | awk -F '[^0-9]+' '{OFS=" ";for(i=1; i<=NF; i++) if ($i != "") print($i)}' | sort | uniq -c
14 20
The output (14 20) means the account id 20 was observed 14 times (14 lines in the logfile).
Then I also have the number of dropped slow transactions: (2) part.
This gives the real number of dropped transactions that were logged. In other words, a single log entry can represent 1 or more dropped transactions.
I do have a small command to count the number of dropped transactions:
grep "Slow transactions" logfile | awk '{print $24}' | sed 's/(//g' | sed 's/)//g' | awk '{s+=$1} END {print s}'
73
That means 73 transactions were dropped.
These two work, but when it comes to merging them I am stuck. I really don't see how to combine them; I am pretty sure awk can do it (and probably in a better way than I did), but I would appreciate it if any expert from the community could give me some guidance.
update
Since the above was too easy for some of our awk experts on SO, I'll introduce an optional feature :)
As previously mentioned I can convert account ID into a name issuing a mysql query. So, the idea is now to include the ID => name conversion into the awk command.
The mySQL query looks like this (XX being the account ID):
mysql -Bs -u root -p$(perl -MF5::GenUtils -e "print get_mysql_password.qq{\n}") -e "SELECT name FROM myTABLE where account_id= 'XX'"
I found the post below, which deals with passing command output into awk, but I'm facing syntax errors...
How can I pass variables from awk to a shell command?
This uses parentheses as your field separator, so it's easier to grab the account number and the number of slow connections.
awk -F '[()]' '
/Slow transactions/ {
acct[$2]++
dropped[$2] += $4
}
END {
PROCINFO["sorted_in"] = "@ind_num_asc" # https://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html
for (acctnum in acct)
print acctnum, acct[acctnum], dropped[acctnum]
}
' logfile
Given your sample input, this outputs
20 2 5
Requires GNU awk for the "sorted_in" method of controlling array traversal order by index.
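For the optional ID => name feature, one hedged option is to post-process the awk output in a shell loop rather than calling mysql from inside awk (PASS stands in for the asker's perl-derived password; myTABLE and account_id come from the question):
awk -F '[()]' '/Slow transactions/ { acct[$2]++; dropped[$2] += $4 }
END { for (a in acct) print a, acct[a], dropped[a] }' logfile |
while read -r id count drops; do
name=$(mysql -Bs -u root -p"$PASS" -e "SELECT name FROM myTABLE WHERE account_id='$id'")
echo "$name $count $drops"
done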
I'm trying to write a Bash script that reads files with several columns of data and multiplies each value in the second column by the corresponding value in the third column, adding the results of all those multiplications together.
For example if the file looked like this:
Column 1 Column 2 Column 3 Column 4
genome 1 30 500
genome 2 27 500
genome 3 83 500
...
The script should multiply 1*30 to give 30, then 2*27 to give 54 (and add that to 30), then 3*83 to give 249 (and add that to 84), etc.
I've been trying to use awk to parse the input file but am unsure of how to get the operation to proceed line by line. Right now it stops after the first line is read and the operations on the variables are performed.
Here's what I've written so far:
for file in fileone filetwo
do
set -- $(awk '/genome/ {print $2,$3}' $file.hist)
var1=$1
var2=$2
var3=$((var1*var2))
total=$((total+var3))
echo var1 \= $var1
echo var2 \= $var2
echo var3 \= $var3
echo total \= $total
done
I tried placing a "while read" loop around everything but could not get the variables to update with each line. I think I'm going about this the wrong way!
I'm very new to Linux and Bash scripting so any help would be greatly appreciated!
That's because awk reads the entire file and runs its program on each line. So the output you get from awk '/genome/ {print $2,$3}' $file.hist will look like
1 30
2 27
3 83
and so on, which means in the bash script, the set command makes the following variable assignments:
$1 = 1
$2 = 30
$3 = 2
$4 = 27
$5 = 3
$6 = 83
etc. But you only use $1 and $2 in your script, meaning that the rest of the file's contents - everything after the first line - is discarded.
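For illustration, you could consume that whole positional-parameter list two values at a time with shift, though awk remains simpler:
set -- 1 30 2 27 3 83   # what set -- $(awk ...) produces for the sample
total=0
while [ "$#" -ge 2 ]; do
total=$((total + $1 * $2))
shift 2
done
echo "$total"   # 333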
Honestly, unless you're doing this just to learn how to use bash, I'd say just do it in awk. Since awk automatically runs over every line in the file, it'll be easy to multiply columns 2 and 3 and keep a running total.
awk '{ total += $2 * $3 } ENDFILE { print total; total = 0 }' fileone filetwo
Here ENDFILE is a special pattern (a GNU awk extension) that means "run this next block at the end of each file, not at each line."
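If your awk lacks ENDFILE, a portable sketch runs awk once per file and uses a plain END block:
for file in fileone filetwo; do
awk '{ total += $2 * $3 } END { print total }' "$file"
done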
If you are doing this for educational purposes, let me say this: the only thing you need to know about doing arithmetic in bash is that you should never do arithmetic in bash :-P Seriously though, when you want to manipulate numbers, bash is one of the least well-adapted tools for that job. But if you really want to know, I can edit this to include some information on how you could do this task primarily in bash.
I agree that awk is in general better suited for this kind of work, but if you are curious what a pure bash implementation would look like:
for f in file1 file2; do
total=0
while read -r _ x y _; do
((total += x * y))
done < "$f"
echo "$total"
done