bash log file count words and replace them by number - bash

I need to keep warnings from my script log and prepend "LAST" to every line on each start, so I can tell at a glance when an alert occurred. So I added this as the first line of my script:
echo "$( cat $ALERT_LOG_FILE | grep WARNING | tail -n 2k | ts "LAST ")" > $ALERT_LOG_FILE
Script log looks like this at first run :
WARNING : ...
WARNING : ...
WARNING : ...
WARNING : ...
When the script starts/restarts, the echo line adds "LAST" to each line, making it look like this:
LAST WARNING : ...
LAST WARNING : ...
LAST WARNING : ...
LAST WARNING : ...
The problem is that the log file ends up like this after a few restarts:
LAST LAST LAST LAST WARNING : ....
LAST LAST LAST WARNING : ....
LAST LAST WARNING : ....
LAST LAST WARNING : ....
LAST WARNING : ....
WARNING:
Is there any way to make it look like this instead:
LAST 4 WARNING : ....
LAST 3 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 1 WARNING : ....
WARNING:
EDIT:
Code with @Yoda's suggestion:
echo "$( cat $LOG_FILE | grep WARNING | tail -n 2k | ts "LAST " | awk '{n=gsub("LAST ",X);if(n) print "LAST",n,$0;else print}')" > $LOG_FILE
Output log after some restarts with @Yoda's suggestion:
LAST 2 2 1 WARNING : ...
LAST 2 1 WARNING : ...
LAST 1 WARNING : ...
WARNING : ...

Based on some assumptions:-
$ awk '{n=gsub("LAST ",X);if(n) print "LAST",n,$0;else print}' file
LAST 4 WARNING : ....
LAST 3 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 1 WARNING : ....
WARNING:
If this is not what you are looking for, then I would suggest posting a representative sample of your log file and the expected output.

Here is something that might help:-
awk '
{
    n = gsub("LAST ", X)
    if ( n )
    {
        for ( i = 1; i <= NF; i++ )
        {
            if ( $i ~ /WARNING/ )
            {
                sub(/^ */, X)
                print "LAST", n, $0
                next
            }
            if ( $i ~ /^[0-9]+$/ )   # allow counts of 10 and above
            {
                n += $i - 1          # the stored marker was itself counted
                $i = ""              # by gsub, on top of the one ts added
            }
        }
    }
    else
        print $0
}
'
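Wired back into the original script, the whole restart-time rewrite might look like the sketch below. This is only a sketch: a plain sed stands in for moreutils' ts (which is only being used to add a literal prefix anyway), and the log file here is a temporary file created for illustration.

```shell
# Sketch of the restart-time rewrite; call mark_log once per start.
# sed replaces ts "LAST " (ts only adds a literal prefix here).
ALERT_LOG_FILE=$(mktemp)                       # stand-in for the real log
printf 'WARNING : disk full\nWARNING : cpu hot\n' > "$ALERT_LOG_FILE"

mark_log() {
    tmp=$(mktemp)
    grep WARNING "$ALERT_LOG_FILE" | sed 's/^/LAST /' | awk '
    {
        n = gsub("LAST ", "")                 # count and strip all markers
        for (i = 1; i <= NF; i++) {
            if ($i ~ /WARNING/) {             # reached the message itself
                sub(/^ +/, "")
                print "LAST", n, $0
                next
            }
            if ($i ~ /^[0-9]+$/) {            # fold in the stored count;
                n += $i - 1                   # its own marker was already
                $i = ""                       # counted by gsub above
            }
        }
    }' > "$tmp" && mv "$tmp" "$ALERT_LOG_FILE"
}

mark_log && mark_log                           # simulate two restarts
cat "$ALERT_LOG_FILE"                          # both lines now read "LAST 2 WARNING : ..."
```

Writing to a temporary file and then moving it over the log avoids the `echo "$(...)" > file` pattern, which empties the file before the command substitution's result is written back.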

Related

I am very new to Bash scripting. How do I keep a running total (count) of the keyword hits (grep) in a file and then display the totals at the end?

count=$(grep -c success fileName.txt)
The above only gives me the total count of the word "success", but I want to keep a running total. For example, suppose there are 25 'expected' hits of which only 20 were found; that would mean there were 5 failures. So I think I need to keep a running total so that in the end I can report (echo) as follows:
20 out of 25 expected success found; 5 failures.
You could use awk, which can also print the custom output:
awk '/success/ { success++ } END { expected=25; fail=expected-success; print success " out of " expected " success found; " fail " failures" }' input_file
Example output:
$ for i in {1..25}; do echo "success"; done |\
> awk '/success/ { success++ } END { expected=25; fail=expected-success; print success " out of " expected " success found; " fail " failures" }'
25 out of 25 success found; 0 failures
This should suffice:
~$ count=$(grep -o -i <search_term> <data_source> | wc -l)
e.g.
~$ count=$(grep -o -i computer myfile.txt | wc -l)
~$ echo $count
--- flag explanations ---
-o means print only the matching part of the line. Can also be written as --only-matching.
-i makes the search case-insensitive. Also written as --ignore-case.
-l (a wc flag, not a grep flag) outputs the number of input lines. Because grep -o prints each match on its own line, wc -l ends up counting individual matches rather than matching lines.
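To see why -o matters when counting, compare with grep -c on a small demo file (the myfile.txt here is created just for illustration):

```shell
# grep -c counts matching lines; grep -o | wc -l counts every occurrence.
printf 'computer computer\nComputer\nno match here\n' > myfile.txt
grep -c -i computer myfile.txt              # 2 matching lines
grep -o -i computer myfile.txt | wc -l      # 3 individual matches
```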

Count checker from log file with bash script

I have a script whose output logfile.txt looks like this:
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 3
Replay completed : 3
--------------------
I need to check that the counts on all 3 lines are the same and, if any of the lines mismatch, echo the File_name.
I tried grep with a pattern file, like cat logfile.txt | grep -f patternfile.ptrn, inside a for loop, but got no result. I can't work out how to store the first count in a variable so I can compare it with the next line, nor how to handle the many file_names in the logfile.
The pattern file was:
Replay requests :
Replay responsee :
Replay completed :
--------------------
Is this the right idea, or am I moving in the wrong direction?
I need to check that the counts on all 3 lines are the same and, if any of the lines mismatch, echo the File_name.
Here is one approach/solution.
Given your input example.
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 3
Replay completed : 3
--------------------
The script:
#!/usr/bin/env bash
while mapfile -tn4 array && ((${#array[*]})); do
    name="${array[0]}"
    contents=("${array[@]:1}")
    contents=("${contents[@]##* }")
    for n in "${contents[@]:1}"; do
        (( contents[0] != n )) &&
            printf '%s\n' "$name" &&
            break
    done
done < <(grep -Ev '^-+$' file.txt)
As it stands it will not print anything (no filename); but change just one of the count values (assuming the count is the last, numeric, string on each line) and it should print the corresponding filename.
Note that mapfile (aka readarray) is a Bash 4+ feature.
The script above assumes that there are 4 lines between the dashed separators for each filename.
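For reference, here is a self-contained copy of that script run against the sample input (the names check.sh and file.txt are just for the demo); File_name2 is given one mismatched count, so only it should be printed:

```shell
# Sample input: File_name2 has a mismatched "responsee" count.
cat > file.txt <<'EOF'
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 4
Replay completed : 3
--------------------
EOF
cat > check.sh <<'EOF'
#!/usr/bin/env bash
# Read 4-line records (dashed separators filtered out by grep) and print
# the filename whenever the three trailing counts disagree.
while mapfile -tn4 array && ((${#array[*]})); do
    name="${array[0]}"
    contents=("${array[@]:1}")
    contents=("${contents[@]##* }")
    for n in "${contents[@]:1}"; do
        (( contents[0] != n )) && printf '%s\n' "$name" && break
    done
done < <(grep -Ev '^-+$' file.txt)
EOF
bash check.sh    # prints File_name2
```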
and how to check when there are many files_names at the logfile.
Not sure what that means. Clarify the question.
Here is a starting point for a script; I have not fully understood the whole question and don't know exactly what output is expected.
#!/bin/bash
declare -A dict
while read -r -a line ; do
    test "${line[0]}" == "Replay" || continue
    rep="${line[1]}"
    num="${line[3]}"
    if test "${dict[$rep]}" == "" ; then
        dict[$rep]=$num
    elif test "${dict[$rep]}" != "$num" ; then
        echo "Value changed for $rep : ${dict[$rep]} -> $num"
    fi
done < "logfile.txt"
If for instance the input is
File_name1
Replay requests : 5
Replay responsee : 3
Replay completed : 7
--------------------
File_name2
Replay requests : 2
Replay responsee : 3
Replay completed : 6
--------------------
the output will be :
Value changed for requests : 5 -> 2
Value changed for completed : 7 -> 6
Is this helpful?

Bash one-liner code to output unique values

I have this command which will output 0, 1 or 2.
This line is part of a config file (Zabbix), which is the only reason it is a one-liner.
mysql -u root -e "show slave status\G" | \
grep -E 'Slave_IO_Running:|Slave_SQL_Running:' | cut -f2 -d':' | \
sed "s/No/0/;s/Yes/1/" | awk '{ SUM += $1} END { print SUM }'
But I want it to output values like this, so I can set up an alert with the correct status:
If only Slave_IO_Running is No then output 1.
If only Slave_SQL_Running is No then output 2.
If both are Yes then output 3.
If both are No then output 0.
If no lines/output from show slave status command then output 4.
So, something like replacing the first occurrence of No with a unique value using sed or awk, the second occurrence with another unique value, and so on.
Output of show slave status\G
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.10.10
Master_User: replicationslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.009081
Read_Master_Log_Pos: 856648307
Relay_Log_File: mysqld-relay-bin.002513
Relay_Log_Pos: 1431694
Relay_Master_Log_File: mysql-bin.009081
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
You can do all the string processing here in awk:
mysql -u root -e "show slave status\G" | awk 'BEGIN {output=0} /Slave_IO_Running.*No/ {output+=1} /Slave_SQL_Running.*No/ {output+=2} END {if(output==3){print 0} else if(output==0){print 3} else {print output}}'
This starts the output counter at 0; if we match Slave_IO_Running with No we add 1, and if we match Slave_SQL_Running with No we add 2. At the end we print the total: 0 if neither matched, 1 if only IO is No, 2 if only SQL is No, and 3 if both are No. Since you want 0 when both are Yes, we invert the count at the end: if we got 3 then both were No, so print 0; if we got 0, print 3; otherwise print the value as-is.
The following awk code could be compacted into a single line if you feel the urge to do that:
awk -F: -v ret=0 '
/Slave_IO_Running:.*No/ { ret=1 }
/Slave_IO_Running:.*Yes/ { yes++ }
/Slave_SQL_Running:.*No/ { ret=(ret==1) ? 0 : 2 }
/Slave_SQL_Running:.*Yes/ { yes++ }
END { print (yes==2) ? 3 : ret }
'
No grep or cut or sed is required, this takes the output of your mysql command directly. It also assumes that Slave_IO_Running will always appear before Slave_SQL_Running in the output of your command.
The notation in the third line and last line functions as an in-line "if" statement -- if the value of ret equals 1, set ret to 0; otherwise set ret to 2.
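The ternary can be tried in isolation (the values below are purely illustrative):

```shell
# cond ? a : b evaluates to a when cond is true, otherwise b
awk 'BEGIN { ret = 1; ret = (ret == 1) ? 0 : 2; print ret }'   # prints 0
awk 'BEGIN { ret = 9; ret = (ret == 1) ? 0 : 2; print ret }'   # prints 2
```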
Whenever you have name-to-value pairs in your data, it's usually clearest, simplest, and easiest to enhance later if you first create an array mapping the names to the values and then access the values by their names, e.g.:
awk '
{ f[$1]=$2 }
END {
    if (f["Slave_IO_Running:"] == "Yes")
        rslt = (f["Slave_SQL_Running:"] == "Yes" ? 3 : 2)
    else
        rslt = (f["Slave_SQL_Running:"] == "Yes" ? 1 : 0)
    print rslt
}
' file
3

Output of command to array not working

I'm attempting to store the output of a series of beeline HQL queries into an array, so that I can parse it to pull out the interesting bits. Here's the relevant code:
#!/usr/bin/env ksh
ext_output=()
while IFS= read -r line; do
ext_output+=( "$line" )
done < <( bee --hiveconf hive.auto.convert.join=false -f temp.hql)
bee is just an alias for the full beeline command with the JDBC URL, etc., and temp.hql contains multiple HQL queries.
And here's a snippet of what the output of each query looks like:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| tableName:myTable |
| owner:foo |
| location:hdfs://<server>/<path>...
<big snip>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
15 rows selected (0.187 seconds)
The problem is, my array is only getting the last line from each result (15 rows selected (0.187 seconds)).
Am I doing something wrong here? The exact same approach is working in other instances, so I really don't understand.
Hmmmm, I'm not having any problems with the code you've posted.
I can reproduce what I think you may be seeing (ie, array contains a single value consisting of the last line of output) if I make the following change in your code:
# current/correct code - from your post
ext_output+=( "$line" )
# modified/wrong code
ext_output=+( "$line" )
Notice the placement of the plus sign (+):
when on the left side of the equal sign (+=) each $line is appended to the end of the array (see sample run - below)
when on the right side of the equal sign (=+) each $line is assigned to the first slot in the array (index=0); the plus sign (+) and parens (()) are treated as part of the data to be stored in the array (see sample run - at bottom of this post)
Could there be a typo between what you're running (with 'wrong' results) vs what you've posted here in this thread (and what you've mentioned generates the correct results in other instances)?
Here's what I get when I run your posted code (plus sign on the left of the equal sign : +=) ...
NOTE: I've replaced the bee/HQL call with an output file containing your sample lines plus a couple of (bogus) data lines; I've also cut down the longer lines for readability:
$ cat temp.out
-----------------------------------------+--+
| tableName:myTable
| owner:foo
| location:hdfs://<server>/<path>...
abc def ghi
123 456 789
-----------------------------------------+--+
15 rows selected (0.187 seconds)
Then I ran your code against temp.out:
ext_output=()
while IFS= read -r line
do
    ext_output+=( "$line" )
done < temp.out
Some stats on the array:
$ echo "array size : ${#ext_output[*]}"
array size : 10
$ echo "array indx : ${!ext_output[*]}"
array indx : 0 1 2 3 4 5 6 7 8 9
$ echo "array vals : ${ext_output[*]}"
array vals : -----------------------------------------+--+ | tableName:myTable | owner:foo | location:hdfs://<server>/<path>... abc def ghi 123 456 789 -----------------------------------------+--+ 15 rows selected (0.187 seconds)
And a dump of the array's contents:
$ for i in ${!ext_output[*]}
> do
> echo "${i} : ${ext_output[$i]}"
> done
0 : -----------------------------------------+--+
1 : | tableName:myTable
2 : | owner:foo
3 : | location:hdfs://<server>/<path>...
4 :
5 : abc def ghi
6 : 123 456 789
7 :
8 : -----------------------------------------+--+
9 : 15 rows selected (0.187 seconds)
If I modify your code to place the plus sign on the right side of the equal sign (=+) ...
ext_output=()
while IFS= read -r line
do
    ext_output=+( "$line" )
done < temp.out
... the array stats:
$ echo "array size : ${#ext_output[*]}"
array size : 1
$ echo "array indx : ${!ext_output[*]}"
array indx : 0
$ echo "array vals : ${ext_output[*]}"
array vals : +( 15 rows selected (0.187 seconds) )
... and the contents of the array:
$ for i in ${!ext_output[*]}
> do
> echo "${i} : ${ext_output[$i]}"
> done
0 : +( 15 rows selected (0.187 seconds) )
!! Notice that the plus sign and parens are part of the string stored in ext_output[0]

{awk} How to read a line and compare a $ with its next/previous line?

The command below is used to read an input file containing 7682 lines:
I use --field-separator=";", then convert some fields into what I need, and the grep gets rid of the first 2 lines, which I do not need.
awk --field-separator=";" '($1<15) {print int(a=(($1-1)/480)+1) " " ($1-((int(a)-1)*480)) " " (20*log($6)/log(10))}' 218_DW.txt | grep -v "0 480 -inf"
I used ($1<15) so that I only print 14 lines, better for testing. The output I get is exactly what I want, but, there is more I need to do on that:
1 1 48.2872
1 2 48.3021
1 3 48.1691
1 4 48.1502
1 5 48.1564
1 6 48.1237
1 7 48.1048
1 8 48.015
1 9 48.0646
1 10 47.9472
1 11 47.8469
1 12 47.8212
1 13 47.8616
1 14 47.8047
From the above, $1 increments from 1-16 and $2 from 1-480, and the numbering is always continuous:
when $2 reaches 480, $1 increments and $2 restarts from 1, until the last line, which is 16 480 10.2156.
So I get 16*480 = 7680 lines.
What I want to do is simple, but, I don't get it :)
I want to compare the current line with the next one. But not all fields, only $3, it's a value in dB that decreases when $2 increases.
In example:
The current line is 1 1 48.2872=a
Next line is 1 2 48.3021=b
If [ (a - b) > 6 ] then print $1 $2 $3
Of course (a - b) has to be taken as an absolute value, always >= 0.
The tricky part will be comparing the current line's $3 with both the next and the previous line's $3.
Something like this:
1 3 48.1691=a
1 4 48.1502=b
1 5 48.1564=c
If [ ABS(b - a) > 6 ] OR If [ ABS(b - c) > 6 ] then print $1 $2 $3
But of course the first line can only be compared with the next one, and the last one with the previous one. Is it possible?
Try this:
#!/usr/bin/awk -f
function abs(x) {
    if (x >= 0)
        return x;
    else
        return -1 * x;
}
function compare(a, b) {
    return abs(a - b) > 6;
}
function update() {
    before_value = current_value;
    current_line = $0;
    current_value = $3;
}
BEGIN {
    line_n = 1;
}
# Edit: added to skip blank lines and differently formatted lines in
# general. You could add some error message and/or exit function
# here to detect badly formatted data.
NF != 3 {
    next;
}
line_n == 1 {
    update();
    line_n += 1;
    next;
}
line_n == 2 {
    if (compare(current_value, $3))
        print current_line;
    update();
    line_n += 1;
    next;
}
{
    if (compare(current_value, before_value) && compare(current_value, $3))
        print current_line;
    update();
}
END {
    if (compare(current_value, before_value)) {
        print current_line;
    }
}
The funny thing is that I had this code lying around from an old project where I had to do basically the same thing. I've adapted it a little for you. I think it solves your problem (as I understood it, at least); if it doesn't, it should point you in the right direction.
Instructions to run the awk script:
Supposing you saved the code as "awkscript" and the data file is named "datafile", both in the current folder: first mark the script as executable with chmod +x awkscript, then run it with the data file as a parameter, ./awkscript datafile, or use it as part of a pipeline, as in cat datafile | ./awkscript.
Comparing the current line to the previous one is trivial, so I think the problem you're having is that you can't figure out how to compare the current line to the next one. Just keep 2 previous lines instead of 1 and always operate on the line before the one that's actually being read as $0, i.e. the line stored in the array p1 in this example (p2 is the line before it and $0 is the line after it):
function abs(val) { return (val > 0 ? val : -val) }
NR==2 {
if ( abs(p1[3] - $3) > 6 ) {
print p1[1], p1[2], p1[3]
}
}
NR>2 {
if ( ( abs(p1[3] - p2[3]) > 6 ) || ( abs(p1[3] - $3) > 6 ) ) {
print p1[1], p1[2], p1[3]
}
}
{ prev2=prev1; prev1=$0; split(prev2,p2); split(prev1,p1) }
END {
if ( ( abs(p1[3] - p2[3]) > 6 ) ) {
print p1[1], p1[2], p1[3]
}
}
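To see the edge handling in action, here is a self-contained run of the script above (the names jumps.awk and data.txt are just for the demo). The toy data has a single jump of more than 6 between the second and third values, so both lines touching the jump are printed:

```shell
# Save the script, build 3 lines of toy data, and run it.
cat > jumps.awk <<'EOF'
function abs(val) { return (val > 0 ? val : -val) }
NR==2 {
    if ( abs(p1[3] - $3) > 6 ) {
        print p1[1], p1[2], p1[3]
    }
}
NR>2 {
    if ( ( abs(p1[3] - p2[3]) > 6 ) || ( abs(p1[3] - $3) > 6 ) ) {
        print p1[1], p1[2], p1[3]
    }
}
{ prev2=prev1; prev1=$0; split(prev2,p2); split(prev1,p1) }
END {
    if ( ( abs(p1[3] - p2[3]) > 6 ) ) {
        print p1[1], p1[2], p1[3]
    }
}
EOF
printf '1 1 48.1\n1 2 50.0\n1 3 60.0\n' > data.txt
awk -f jumps.awk data.txt    # prints "1 2 50.0" and "1 3 60.0"
```

The middle line is printed by the NR>2 rule (it differs from its successor by 10), and the last line by the END rule (it differs from its predecessor by 10); the first line is not printed because it differs from its neighbour by only 1.9.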
