Count checker from log file with bash script - bash

I have a script that writes its output to logfile.txt:
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 3
Replay completed : 3
--------------------
I need to check that the counts on all 3 lines are the same, and if any line does not match, echo the File_name.
I tried grepping with a pattern file, e.g. cat logfile.txt | grep -f patternfile.ptrn inside a for loop, but got no result. I can't work out how to store the first count in a variable so that I can compare it with the following lines, or how to do the check when the log file contains many file names.
The pattern file was:
Replay requests :
Replay responsee :
Replay completed :
--------------------
Is this the right idea, or am I heading in the wrong direction?

I need to check that the counts on all 3 lines are the same, and if any line does not match, echo the File_name.
Here is one approach/solution.
Given your input example.
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 3
Replay completed : 3
--------------------
The script.
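#!/usr/bin/env bash
# Read the log in groups of 4 lines: the file name plus its three count lines.
while mapfile -tn4 array && ((${#array[*]})); do
  name="${array[0]}"
  contents=("${array[@]:1}")        # the three "Replay ..." lines
  contents=("${contents[@]##* }")   # keep only the trailing number of each line
  for n in "${contents[@]:1}"; do
    # Print the file name as soon as one count differs from the first.
    (( contents[0] != n )) &&
      printf '%s\n' "$name" &&
      break
  done
done < <(grep -Ev '^-+$' file.txt)  # drop the dashed separator lines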
With the input above it prints nothing. Change just one of the counts (assuming the count is the last space-separated field on each line and is a number) and it should print that file name.
Note that mapfile aka readarray is a bash4+ feature.
The script above assumes that each file name is followed by exactly three count lines, so that every block holds 4 lines once the dashed separators are removed.
and how to check when there are many files_names at the logfile.
Not sure what that means. Clarify the question.
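If the fixed four-lines-per-block assumption is a concern, a line-by-line variant could look like the sketch below. It is only an illustration: the function name check_counts is made up, and it assumes that separator lines start with a dash, count lines start with "Replay", and every other line is a file name.

```shell
#!/usr/bin/env bash
# Sketch: print the name of every block whose counts are not all equal.
# check_counts is a hypothetical name; the line classification is an assumption.
check_counts() {
    local line name="" first="" bad=0 n
    while IFS= read -r line; do
        case $line in
            -*)                     # separator: report the finished block, then reset
                if (( bad )); then printf '%s\n' "$name"; fi
                name="" first="" bad=0
                ;;
            Replay*)                # count line: the number is the last field
                n=${line##* }
                if [[ -z $first ]]; then
                    first=$n
                elif [[ $n != "$first" ]]; then
                    bad=1
                fi
                ;;
            *)                      # anything else is taken to be the file name
                name=$line
                ;;
        esac
    done
    # Report a last block that has no trailing separator.
    if (( bad )); then printf '%s\n' "$name"; fi
}

check_counts <<'EOF'
File_name1
Replay requests : 5
Replay responsee : 5
Replay completed : 5
--------------------
File_name2
Replay requests : 3
Replay responsee : 4
Replay completed : 3
--------------------
EOF
```

For a real log it would be called as check_counts < logfile.txt; it copes with any number of file names because each block is judged on its own.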

Here is a starting point for a script; I have not fully understood the question and don't know exactly what output is expected.
#! /bin/bash
declare -A dict
while read -a line ; do
  test "${line[0]}" == "Replay" || continue
  rep="${line[1]}"
  num="${line[3]}"
  if test "${dict[$rep]}" == "" ; then
    dict[$rep]=$num
  elif test "${dict[$rep]}" != "$num" ; then
    echo "Value changed for $rep : ${dict[$rep]} -> $num"
  fi
done < "logfile.txt"
If for instance the input is
File_name1
Replay requests : 5
Replay responsee : 3
Replay completed : 7
--------------------
File_name2
Replay requests : 2
Replay responsee : 3
Replay completed : 6
--------------------
the output will be :
Value changed for requests : 5 -> 2
Value changed for completed : 7 -> 6
Is it helpful?

Related

continuing echo of two different statements back to same line

I know that with echo's -n option we can keep the output on the same line.
But I have a different scenario:
echo "Currently reading file :"
if [ some_condition ]; then
  echo -n $file_read
else
  echo -n "Skipping file read:" $file_skip
fi
echo "successfully completed"
Current output :
Currently reading file :
0 1 2 Skipping file read : 3 Skipping file read 4 5 6 7 8
Completed reading files successfully
Expecting output 1 :
Currently reading file :
0 1 2 .....9 10 11 ....
Skipping file read : 4 5 6 ....12 13
OR
Expecting Output 2:
(for this I can use the -en option)
currently Reading file : only latest file no
Skipped these files : 1 2 9 15 21 ....
OR
Any other, better way, if one exists.
Is there a better way to display this output? The number of files may range from 1 to 2000.
I even tried displaying only the latest output using
echo -en "$file_read \r"
But this doesn't look good either when a file is skipped.
You can first declare initial values for FIRST_ROW and SECOND_ROW as you wish.
For example:
SECOND_ROW="Skipping file read : "
Then collect the outputs you need in the two variables, for 'Expecting output 1' (note that there must be no space after the = sign):
if [ some_condition ]; then
  FIRST_ROW="$FIRST_ROW $file_read"
else
  SECOND_ROW="$SECOND_ROW $file_skip"
fi
Finally, print the desired output using echo and whatever custom strings you wish.
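Put together, the two-variable idea might look like the following sketch. The function name report_files and the skip rule (every fourth file number is skipped) are invented stand-ins for the real loop and for some_condition.

```shell
#!/usr/bin/env bash
# Sketch: collect read and skipped file numbers in two strings, print once at the end.
# report_files and the "every 4th file" rule are hypothetical placeholders.
report_files() {
    local first_row="Currently reading file :"
    local second_row="Skipping file read :"
    local n
    for n in "$@"; do
        if (( n % 4 != 0 )); then
            first_row="$first_row $n"      # file was read
        else
            second_row="$second_row $n"    # file was skipped
        fi
    done
    echo "$first_row"
    echo "$second_row"
}

report_files 1 2 3 4 5 6 7 8
```

Because nothing is printed until the loop has finished, the two kinds of message can no longer interleave on one line.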

Shell: why does while loop processing additional char

This is my simple shell script
root@Ubuntu:/tmp# cat -n script.sh
1 echo
2 while x= read -n 1 char
3 do
4 echo -e "Original value = $char"
5 echo -e "Plus one = `expr $char + 1`\n"
6 done < number.txt
7 echo
root@Ubuntu:/tmp#
And this is the content of number.txt
root@Ubuntu:/tmp# cat number.txt
12345
root@Ubuntu:/tmp#
As you can see in the code, I'm trying to read each number and process it separately. In this case, I would like to add one to each of them and print it on a new line.
root@Ubuntu:/tmp# ./script.sh
Original value = 1
Plus one = 2
Original value = 2
Plus one = 3
Original value = 3
Plus one = 4
Original value = 4
Plus one = 5
Original value = 5
Plus one = 6
Original value =
Plus one = 1
root@Ubuntu:/tmp#
Everything looks fine except for the last line. I only have 5 numbers, yet it seems the code is processing an additional one.
Original value =
Plus one = 1
Question is how does this happen and how to fix it?
It seems the input file number.txt contains a complete line, terminated by a line feed character (LF). (You can verify that the file is longer than 5 bytes using ls -l.) read eventually encounters the LF and gives you an empty char (stripping the terminating LF from the input, as it would without the -n option). This leaves expr evaluating + 1, which results in 1. You can explicitly test for the empty char and terminate the while loop using test -n, which checks for a non-zero-length string:
echo "12345" | while read -n 1 char && [ -n "$char" ]; do echo "$char" ; done
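The one-liner above stops at the first LF. If the input may span several lines, a variant that skips the empty value instead of ending the loop could be sketched as follows; the function name plus_one_per_digit is made up, and the input is assumed to contain only digits and newlines.

```shell
#!/usr/bin/env bash
# Sketch: read one character at a time and skip the empty value produced by each LF.
# plus_one_per_digit is a hypothetical name; input is assumed to be digits and newlines only.
plus_one_per_digit() {
    local char
    while IFS= read -r -n 1 char; do
        [ -n "$char" ] || continue      # an LF was read: nothing to process
        printf 'Original value = %s\n' "$char"
        printf 'Plus one = %s\n' "$((char + 1))"
    done
}

printf '12345\n' | plus_one_per_digit
```

Using $(( )) instead of expr also avoids starting an external process for every character.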

Continuously-updated (running-count) output from a program reading from a pipeline

How can I get continuously-updated output from a program that's reading from a pipeline? For example, let's say that this program were a version of wc:
$ ls | running_wc
So I'd like this to output instantly, e.g.
0 0 0
and then every time a new output line is received, it'd update again, e.g.
1 2 12
2 4 24
etc.
Of course my command isn't really ls, it's a process that slowly outputs data... I'd actually love to dynamically have it count matches and non matches, and sum this info up on a single line, e.g,
$ my_process | count_matches error
This would constantly update a single line of output with the matching and non matching counts, e.g.
$ my_process | count_matches error
0 5
then later on it might look like so, since it's found 2 matches and 10 non matching lines.
$ my_process | count_matches error
2 10
dd will print out statistics if it receives a SIGUSR1 signal, but neither wc nor grep does that. You'll need to re-implement them, more or less.
count_matches() {
    local pattern=$1
    local matches=0 nonmatches=0
    local line
    while IFS= read -r line; do
        if [[ $line == *$pattern* ]]; then ((++matches)); else ((++nonmatches)); fi
        printf '\r%s %s' "$matches" "$nonmatches"
    done
    printf '\n'
}
Printing a carriage return \r each time causes the printouts to overwrite each other.
Most programs will switch from line buffering to full buffering when used in a pipeline. Your slow-running program should flush its output after each line to ensure the results are available immediately. Or if you can't modify it, you can often use stdbuf -oL to force programs that use C stdio to line buffer stdout.
stdbuf -oL my_process | count_matches error
Using awk. First we create the "my_process":
$ for i in {1..10} ; do echo $i ; sleep 1 ; done # slowly prints lines
The match counter:
$ awk 'BEGIN {
    print "match","miss"   # print header
    m=0                    # reset match count
}
{
    if($1~/(3|6)/)         # match is a 3 or 6 (for this output)
        m++                # increment match count
    print m,NR-m           # for each record output match / miss counts
}'
Running it:
$ for i in {1..10} ; do echo $i ; sleep 1 ; done | awk 'BEGIN{print "match","miss";m=0}{if($1~/(3|6)/)m++;print m,NR-m}'
match miss
0 1
0 2
1 2
1 3
1 4
2 4
2 5
2 6
2 7
2 8
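The awk run above prints one line per record. To keep everything on a single, constantly updated line, as the question asks, the carriage-return trick can be combined with awk roughly as in this sketch. The wrapper name count_matches_awk is invented, and the pattern is treated as an awk regular expression.

```shell
#!/usr/bin/env bash
# Sketch: running match / non-match counts on one line, using awk.
# count_matches_awk is a hypothetical name; $1 is used as an awk regular expression.
count_matches_awk() {
    awk -v pat="$1" '
        {
            if ($0 ~ pat) m++                 # count lines matching the pattern
            printf "\r%d %d", m, NR - m       # return to column 0 and overwrite the counts
            fflush()                          # push the update out immediately
        }
        END { print "" }                      # finish with a newline
    '
}

printf '%s\n' ok error ok | count_matches_awk error
```

On a terminal only the latest pair of counts stays visible; the upstream process still has to flush its own output line by line for the updates to arrive promptly.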

Output of command to array not working

I'm attempting to store the output of a series of beeline HQL queries into an array, so that I can parse it to pull out the interesting bits. Here's the relevant code:
#!/usr/bin/env ksh
ext_output=()
while IFS= read -r line; do
ext_output+=( "$line" )
done < <( bee --hiveconf hive.auto.convert.join=false -f temp.hql)
bee is just an alias to the full beeline command with the JDBC url, etc. Temp.hql is multiple hql queries.
And here's a snippet of what the output of each query looks like:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| tableName:myTable |
| owner:foo |
| location:hdfs://<server>/<path>...
<big snip>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
15 rows selected (0.187 seconds)
The problem is, my array is only getting the last line from each result (15 rows selected (0.187 seconds)).
Am I doing something wrong here? The exact same approach is working in other instances, so I really don't understand.
Hmmmm, I'm not having any problems with the code you've posted.
I can reproduce what I think you may be seeing (ie, array contains a single value consisting of the last line of output) if I make the following change in your code:
# current/correct code - from your post
ext_output+=( "$line" )
# modified/wrong code
ext_output=+( "$line" )
Notice the placement of the plus sign (+):
when on the left side of the equal sign (+=) each $line is appended to the end of the array (see sample run - below)
when on the right side of the equal sign (=+) each $line is assigned to the first slot in the array (index=0); the plus sign (+) and parens (()) are treated as part of the data to be stored in the array (see sample run - at bottom of this post)
Could there be a typo between what you're running (with 'wrong' results) vs what you've posted here in this thread (and what you've mentioned generates the correct results in other instances)?
Here's what I get when I run your posted code (plus sign on the left of the equal sign : +=) ...
NOTE: I've replaced the bee/HQL call with an output file containing your sample lines plus a couple of (bogus) data lines; I've also cut down the longer lines for readability:
$ cat temp.out
-----------------------------------------+--+
| tableName:myTable
| owner:foo
| location:hdfs://<server>/<path>...

abc def ghi
123 456 789

-----------------------------------------+--+
15 rows selected (0.187 seconds)
Then I ran your code against temp.out:
ext_output=()
while IFS= read -r line
do
ext_output+=( "$line" )
done < temp.out
Some stats on the array:
$ echo "array size : ${#ext_output[*]}"
array size : 10
$ echo "array indx : ${!ext_output[*]}"
array indx : 0 1 2 3 4 5 6 7 8 9
$ echo "array vals : ${ext_output[*]}"
array vals : -----------------------------------------+--+ | tableName:myTable | owner:foo | location:hdfs://<server>/<path>... abc def ghi 123 456 789 -----------------------------------------+--+ 15 rows selected (0.187 seconds)
And a dump of the array's contents:
$ for i in ${!ext_output[*]}
> do
> echo "${i} : ${ext_output[$i]}"
> done
0 : -----------------------------------------+--+
1 : | tableName:myTable
2 : | owner:foo
3 : | location:hdfs://<server>/<path>...
4 :
5 : abc def ghi
6 : 123 456 789
7 :
8 : -----------------------------------------+--+
9 : 15 rows selected (0.187 seconds)
If I modify your code to place the plus sign on the right side of the equal sign (=+) ...
ext_output=()
while IFS= read -r line
do
ext_output=+( "$line" )
done < temp.out
... the array stats:
$ echo "array size : ${#ext_output[*]}"
array size : 1
$ echo "array indx : ${!ext_output[*]}"
array indx : 0
$ echo "array vals : ${ext_output[*]}"
array vals : +( 15 rows selected (0.187 seconds) )
... and the contents of the array:
$ for i in ${!ext_output[*]}
> do
> echo "${i} : ${ext_output[$i]}"
> done
0 : +( 15 rows selected (0.187 seconds) )
!! Notice that the plus sign and parens are part of the string stored in ext_output[0]
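To check quickly what a read loop really stored, the size and contents can be printed straight after the loop. The sketch below is written for bash rather than ksh (it relies on local), and the helper name dump_lines is made up.

```shell
#!/usr/bin/env bash
# Sketch: read stdin into an array with += and report what was stored.
# dump_lines is a hypothetical helper; bash is assumed (the question uses ksh).
dump_lines() {
    local -a lines=()
    local line i
    while IFS= read -r line; do
        lines+=( "$line" )      # += appends: the plus sign comes before the equals sign
    done
    printf 'array size : %d\n' "${#lines[@]}"
    for i in "${!lines[@]}"; do
        printf '%d : %s\n' "$i" "${lines[$i]}"
    done
}

printf 'first\nsecond\nthird\n' | dump_lines
```

A size of 1 where many lines were expected points at the assignment inside the loop rather than at the command producing the input.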

I want to delete a batch from file

I have a file and contents are like :
|T1234
010000000000
02123456878
05122345600000000000000
07445678920000000000000
09000000000123000000000
10000000000000000000000
.T1234
|T798
013457829
0298365799
05600002222222222222222
09348977722220000000000
10000057000004578933333
.T798
Here one complete batch means it will start from |T and end with .T.
In the file i have 2 batches.
I want to edit this file to delete a batch based on record 10 (the record whose positions 1-2 are 10): if positions 3 through 20 of that record are all 0, delete the whole batch.
Please let me know how i can achieve this by writing a shell script or syncsort or sed or awk .
I am still a little unclear about exactly what you want, but I think I have it enough to give you an outline on a bash solution. The part I was unclear on is exactly which line contained the first two characters of 10 and remaining 0's, but it looks like that is the last line in each batch. Not knowing exactly how you wanted the batch (with the matching 10) handled, I have simply written the remaining wanted batch(es) out to a file called newbatch.txt in the current working directory.
The basic outline of the script is to read each batch into a temporary array. If during the read, the 10 and 0's match is found, it sets a flag to delete the batch. After the last line is read, it checks the flag, if set simply outputs the batch number to delete. If the flag is not set, then it writes the batch to ./newbatch.txt.
Let me know if your requirements are different, but this should be fairly close to a solution. The code is fairly well commented. If you have questions, just drop a comment.
#!/bin/bash
ifn=${1:-dat/batch.txt}   # input filename
ofn=./newbatch.txt        # output filename
:>"$ofn"                  # truncate output file
declare -i bln=0          # batch line number
declare -i delb=0         # delete batch flag
declare -a ba             # temporary batch array
[ -r "$ifn" ] || {        # test input file readable
    printf "error: file not readable. usage: %s filename\n" "${0##*/}"
    exit 1
}
## read each line in input file
while read -r line || test -n "$line"; do
    printf " %d %s\n" "$bln" "$line"
    ba+=( "$line" )       # add line to array
    ## if chars 1-2 == 10 and chars 3-20 are all zeros
    if [[ ${line:0:2} == 10 && ${line:2:18} == 000000000000000000 ]]; then
        delb=1            # set delete flag
    fi
    ((bln++))             # increment line number
    ## if the line starts with '.'
    if [[ ${line:0:1} == '.' ]]; then
        ## if the delete batch flag is set
        if [ "$delb" -eq 1 ]; then
            ## do nothing (but show batch no. to delete)
            printf " => deleting batch : %s\n" "${ba[0]}"
        ## if delb not set, then write the batch to output file
        else
            printf "%s\n" "${ba[@]}" >> "$ofn"
        fi
        ## reset line no., flags, and unset array.
        bln=0
        delb=0
        unset ba
    fi
done <"$ifn"
exit 0
Output (to stdout)
$ bash batchdel.sh
0 |T1234
1 010000000000
2 02123456878
3 05122345600000000000000
4 07445678920000000000000
5 09000000000123000000000
6 10000000000000000000000
7 .T1234
=> deleting batch : |T1234
0 |T798
1 013457829
2 0298365799
3 05600002222222222222222
4 09348977722220000000000
5 10000057000004578933333
6 .T798
Output (to newbatch.txt)
$ cat newbatch.txt
|T798
013457829
0298365799
05600002222222222222222
09348977722220000000000
10000057000004578933333
.T798
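Since the question also mentions awk, the same buffering idea can be sketched in a few lines of awk called from the shell. This is an alternative to the bash script above, not a replacement for it. The wrapper name filter_batches is made up, and the sketch assumes every batch starts with a |T line and ends with a .T line.

```shell
#!/usr/bin/env bash
# Sketch: drop every batch whose record 10 has zeros in positions 3-20.
# filter_batches is a hypothetical name; batches are assumed to run from "|T..." to ".T...".
filter_batches() {
    awk '
        /^\|T/ { buf = ""; del = 0 }              # batch header: start a fresh buffer
        { buf = buf $0 "\n" }                     # collect every line of the batch
        substr($0, 1, 2) == "10" &&
        substr($0, 3, 18) == "000000000000000000" { del = 1 }   # flag the batch
        /^\.T/ { if (!del) printf "%s", buf }     # batch trailer: emit unless flagged
    ' "$@"
}

printf '%s\n' '|T1' '10000000000000000000000' '.T1' \
              '|T2' '10000057000004578933333' '.T2' | filter_batches
```

It reads from the files given as arguments, or from standard input when there are none, so filter_batches batch.txt > newbatch.txt would mirror the output file of the bash version.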

Resources