Bash: Verifying numbers inside a line near the end of a "big" log file

What I'm trying to do: make a bash script that performs some tests on my system, which then reads some log files and in the end reports some analysis to me.
I have, for example, a log file (and although it's not always that big, I want to save processing when possible) that ends with something like this:
Ran 6 tests with 1 failures and 0 errors in 1.042 seconds.
Tearing down left over layers:
Tear down Products.PloneTestCase.layer.PloneSite in 0.463 seconds.
Tear down Products.PloneTestCase.layer.ZCML in 0.008 seconds.
And I already have this bash line, that takes the line I want (the one with failures and errors):
error_line=$(tac "$p.log" | grep -m 1 '[1-9] .* \(failures\|errors\)')
Note: Could anyone tell me whether tac feeds grep line by line, or first loads the whole file into memory and then grep runs over that whole buffer? I was thinking of grepping each line as it arrives, and stopping the tac process as soon as the line I want shows up.
If it does work that way (grepping line by line), does grep finding its match (with the -m 1 option) stop the tac process? How would I do that?
Also, do you know a better way?
Continuing...
So, the result of the command is:
Ran 6 tests with 1 failures and 0 errors in 1.042 seconds.
Now, I want to check that both the '1' and the '0' values in the $error_line variable are equal to 0 (as they're not in this case), so that if either of them is different I can run some other process to signal that an error or failure was found.
Answers?

When grep exits because the pattern is found, a SIGPIPE is sent to tac causing it to exit so it won't continue to run needlessly.
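A quick way to see this early-exit behaviour for yourself (a sketch; seq stands in here for tac reading a huge file):

```shell
# grep -m 1 exits as soon as it finds the first match; the next time
# the producer (seq here, tac in your case) writes to the closed pipe
# it receives SIGPIPE and stops, so the rest is never generated.
seq 1 100000000 | grep -m 1 '^42$'
```

This returns almost instantly, even though seq would otherwise emit 100 million lines.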
read failures errors <<< $(tac "$p.log" | grep -Pom 1 '(?<= )[0-9]*(?= *(failures|errors))' | tr '\n' ' ')
if [[ $failures == 0 && $errors == 0 ]]
then
echo "success"
else
echo "failure"
fi
The grep command will output only the numbers found preceding the words "failures" and "errors" without outputting any of the other text on the line.
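Here is a self-contained run of the idea against a sample log (written to a hypothetical sample.log; requires GNU grep for -P). The tr joins the two matched numbers onto one line, because depending on the bash version a here-string may preserve the newline, and read would then only see the first number:

```shell
# Recreate the tail of the log from the question.
cat > sample.log <<'EOF'
Ran 6 tests with 1 failures and 0 errors in 1.042 seconds.
Tearing down left over layers:
Tear down Products.PloneTestCase.layer.PloneSite in 0.463 seconds.
Tear down Products.PloneTestCase.layer.ZCML in 0.008 seconds.
EOF

# grep -Po extracts just the two numbers; tr puts them on one line
# so read can split them into the two variables.
read failures errors <<< "$(tac sample.log \
  | grep -Pom 1 '(?<= )[0-9]*(?= *(failures|errors))' | tr '\n' ' ')"
echo "failures=$failures errors=$errors"
```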
Edit:
If your grep doesn't have -P, change the first line above to:
read failures errors <<< $(tac $p.log | grep -om 1 ' [0-9]* *\(failures\|errors\)' | cut -d ' ' -f 2 | tr '\n' ' ')
or, to use a variation of Ignacio's answer, to:
read failures errors <<< $(tac $p.log | awk '/^Ran/ {printf "%s %s\n", $5, $8}')

awk:
/^Ran / {
print "No failures: " ($5 == 0)
print "No errors: " ($8 == 0)
}

For very big log files, it's better to use grep first to find the lines you need, then awk to do the rest of the processing.
grep "^Ran" very_big_log_file | awk '{print (($5 $8 == "00") ? "no failure" : "failure")}'
Otherwise, just awk will do.
awk '{print (($5 $8 == "00") ? "no failure" : "failure")}'
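Checking the one-liner against the sample line from the question (with parentheses around the ternary, which some awks require inside print):

```shell
printf 'Ran 6 tests with 1 failures and 0 errors in 1.042 seconds.\n' \
  | awk '{print (($5 $8 == "00") ? "no failure" : "failure")}'
# $5 is "1" and $8 is "0", so $5$8 is "10" and this prints "failure".
```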

Related

Determine script length at beginning of Zip file

I have a Spring Boot app with the bash loader at the beginning. When I unzip it, the script at the beginning gets lost, but I need it for re-assembly. So the idea was to split it off with head -c, but I have no idea how to determine the byte location efficiently. less tells me the number of bytes in the script when I open the zip with it, but I'd like to automate this. Is there a way to determine it with (un)zip? Or is there another easy way?
I thought of determining the end location of exit 0. In my current app, this is at 8720. With
echo 'ibase=16;obase=A;'$(xxd nevisadmin-app.jar | grep -m 1 "exit 0" | awk -F: '{print $1}') | bc
I get 8704 (because it's at the end of the line), but this is super fragile: it fails whenever "exit 0" is split across two lines of the xxd output, e.g.
000021f0 ... bla bla ex
00002200 ... it 0 binarystartshere
Thanks
It seems that searching for exit 0 is a reliable way to determine the end of the Spring Boot start script.
Extracting the script can then be done like this:
head -n $(grep -a -n "exit 0" springboot-app.jar | awk -F: '{print $1}') springboot-app.jar > startscript.sh
So it doesn't determine the length, but that becomes irrelevant to the original question.
If someone looks up this question: instead of redirecting the output to a file, one could pipe it to wc -c to get the length.
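Putting the two steps together on a fabricated stand-in for the jar (the launcher script and the binary payload below are made up for the demo):

```shell
# Fake "jar": a small launcher script ending in "exit 0",
# followed by binary-looking zip data.
printf '#!/bin/bash\necho launcher\nexit 0\n' > fake-app.jar
printf 'PK\003\004binarypayload' >> fake-app.jar

# Line number of "exit 0" (-a forces grep to treat the binary file as text).
n=$(grep -a -n "exit 0" fake-app.jar | awk -F: '{print $1}')

# Extract the script; pipe a second copy to wc -c for the byte length.
head -n "$n" fake-app.jar > startscript.sh
head -n "$n" fake-app.jar | wc -c
```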

How to process each grep (with -A option) result separately? Not by line, but by item

I have a long log to parse, the log message for each event takes more than one line usually. If I use hardcoded line numbers of each event, then I can use grep -A $EventLineNumbers to grab the whole message of each event.
Now, for example, I want to grep for two data fields in one event whose message is 20 lines long. That event might be logged 100 times in the log. I want to avoid processing the grep results line by line, because if a data field doesn't exist in one of the events, I would end up taking data from a different event and bundling it with the previous one.
In short, how do I process each grep individually given that I have set a -A option in the command?
For this kind of parsing, I would use awk(1). Compared to grep(1), it allows you to easily use variables and conditions. So you can:
match on the line(s) indicating the start of a bunch of lines you are interested in
set a variable to remember this state
process lines following, remember relevant fields
when you reach the end of this bunch of lines, print the formatted information
If in fact, those lines are not interesting, you can reset the state and skip to the next interesting message.
The command will be longer than a simple grep, but it remains concise and you don't need to use a full blown programming language.
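A sketch of that state-machine pattern, with made-up markers (BEGIN EVENT/END EVENT and FIELD1=/FIELD2= are placeholders for whatever delimits events and fields in your log):

```shell
awk '
  /^BEGIN EVENT/ { inevent = 1; f1 = ""; f2 = "" }  # start of a bunch: set state, reset fields
  inevent && /^FIELD1=/ { f1 = substr($0, 8) }      # remember relevant fields
  inevent && /^FIELD2=/ { f2 = substr($0, 8) }
  /^END EVENT/ && inevent {                         # end of the bunch: print what we collected
    print f1, f2
    inevent = 0
  }
' <<'EOF'
BEGIN EVENT
FIELD1=a
FIELD2=b
END EVENT
BEGIN EVENT
FIELD1=c
END EVENT
EOF
```

Each event produces exactly one output line; a field missing from an event stays empty instead of leaking in from a neighbouring event.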
This should work:
Example Input:
1
2
3
1
4
5
6
2
4
Assuming that all blocks are 3 lines and that you want to match the block with a 1 in it. The following code works:
echo -e '1\n2\n3\n1\n4\n5\n6\n2\n4' | grep 1 -A2 | awk -v n=3 '{print} NR % n == 0 { printf "\0" }' | xargs -0 -n 1 echo -n "WOOT: "
Explanation:
The grep 1 -A2 is your expression to find the relevant block. I assume that the matching expression occurs only once per block. The -A2 is for the two lines of context after the match.
The awk -v n=3 '{print} NR % n == 0 { printf "\0" }' part instructs awk to print a \0 character every n lines (3 here; adjust this for the relevant block size). We need this for the next command.
The xargs -0 -n1 part executes the next command (which would be the extra filters for you), giving the whole block to it. -0 for the null terminated items (a block) and -n1 for only passing a single item to the next command.
In my example this will give the following output:
WOOT: 1
2
3
WOOT: 1
4
5
Meaning that echo was executed for each block exactly once.

Different output for pipe in script vs. command line

I have a directory with files that I want to process one by one and for which each output looks like this:
==== S=721 I=47 D=654 N=2964 WER=47.976% (1422)
Then I want to calculate the average percentage (column 6) by piping the output to AWK. I would prefer to do this all in one script and wrote the following code:
for f in $dir; do
echo -ne "$f "
process $f
done | awk '{print $7}' | awk -F "=" '{sum+=$2}END{print sum/NR}'
When I run this several times, I often get different results, although as far as I can tell nothing changes. The result is almost always incorrect, though.
However, if I only put the for loop in the script and pipe to AWK on the command line, the result is always the same and correct.
What is the difference and how can I change my script to achieve the correct result?
Guessing a little about what you're trying to do; without more details it's hard to say what exactly is going wrong.
for f in $dir; do
unset TEMPVAR
echo -ne "$f "
TEMPVAR=$(process $f | awk '{print $7}')
ARRAY+=($TEMPVAR)
done
I would append all your values to an array inside your for loop. Now all your percentages are in $ARRAY. It should be easy to calculate the average value, using whatever tool you like.
This will also help you troubleshoot. If you get too few elements in the array (${#ARRAY[@]}), then you will know where your loop is terminating early.
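For the averaging step itself, a sketch (assuming the collected values look like 47.976%, with the trailing % stripped before summing):

```shell
# Hypothetical collected values, as the loop above would gather them.
ARRAY=(47.976% 52.024% 50.000%)

# Strip the trailing % from each element, then let awk sum and divide by the count.
avg=$(printf '%s\n' "${ARRAY[@]%\%}" \
  | awk '{ sum += $1 } END { printf "%.3f", sum/NR }')
echo "$avg"
```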
# To get the percentage of all files
Percs=$(sed -r 's/.*WER=([[:digit:].]*).*/\1/' *)
# The divisor
Lines=$(wc -l <<< "$Percs")
# To change new lines into spaces
P=$(echo $Percs)
# Execute one time without the bc. It's easier to understand
echo "scale=3; (${P// /+})/$Lines" | bc

Grep outputs multiple lines, need while loop

I have a script which uses grep to find lines in a text file (ics calendar to be specific)
My script finds a date match, then goes up and down a few lines to copy the summary and start time of the appointment into a separate variable. The problem I have is that I'm going to have multiple appointments at the same time, and I need to run through the whole process for each result in grep.
Example:
LINE=`grep -F -n 20130304T232200 /path/to/calendar.ics | cut -f1 -d:`
And it outputs only the lines, such as
86 89
Then it goes on to capture my other variables, as such:
SUMMARYLINE=$(( $LINE + 5 ))
SUMMARY=`sed -n "${SUMMARYLINE}p" /path/to/calendar.ics`
My script runs fine with one result, but it obviously won't work with more than one, and I need it to. Should I send the grep results into an array? A separate text file to read from? I'm sure I'll need a while loop in here somehow. Need some help, please.
You can call grep from a loop quite easily:
while IFS=':' read -r LINE notused # avoids the use of cut
do
# First field is now in $LINE
# Further processing
done < <(grep -F -n 20130304T232200 /path/to/calendar.ics)
However, if the file is not too large then it might be easier to read the whole file into an array and move around in that.
With your proposed solution, you are reading through the file several times. Using awk, you can do it in one pass:
awk -F: -v time=20130304T232200 '
$1 == "SUMMARY" {summary = substr($0,9)}
/^DTSTART/ {start = $2}
/^END:VEVENT/ && start == time {print summary}
' calendar.ics
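For example, against a minimal two-event calendar (fabricated here as calendar.ics), it prints only the summary of the event whose DTSTART matches:

```shell
# Minimal stand-in for calendar.ics with two events.
cat > calendar.ics <<'EOF'
BEGIN:VEVENT
SUMMARY:Team meeting
DTSTART:20130304T232200
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20130305T120000
END:VEVENT
EOF

awk -F: -v time=20130304T232200 '
  $1 == "SUMMARY" {summary = substr($0,9)}
  /^DTSTART/ {start = $2}
  /^END:VEVENT/ && start == time {print summary}
' calendar.ics
# prints: Team meeting
```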

Check occurrence of a word after a particular number of lines

How do I check if a line starts with "WARNING: ", ignoring the first 15 lines?
To provide context, I want to use it in a bash if condition to do error processing based on a log. The way the system is setup, it gives a known warning that comes within the first 15 lines, which we need to ignore.
tail -n +16 /var/log/syslog | grep '^WARNING'
awk 'NR>15 && /^WARNING/' file
ruby -ne 'print if $.>15 && /^WARNING/' file
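To check these against known input, here is a generated 20-line log (hypothetical file name test.log) with a known warning at line 5 that must be ignored and a real one at line 18:

```shell
# Build the test log.
for i in $(seq 1 20); do
  case $i in
    5)  echo "WARNING: known startup warning" ;;
    18) echo "WARNING: real problem" ;;
    *)  echo "line $i" ;;
  esac
done > test.log

# Only the warning after line 15 is reported.
if tail -n +16 test.log | grep -q '^WARNING'; then
  echo "warning found after line 15"
fi
```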
