Check occurrence of a word after a particular number of lines - bash

How do I check if a line starts with "WARNING: ", ignoring the first 15 lines?
To provide context, I want to use it in a bash if condition to do error processing based on a log. The way the system is set up, it emits a known warning within the first 15 lines, which we need to ignore.

tail -n +16 /var/log/syslog | grep '^WARNING'

awk 'NR>15 && /^WARNING/' file
ruby -ne 'print if $.>15 && /^WARNING/' file
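To use the tail and grep pipeline in an if condition, as the question asks, one option is to rely on grep's exit status with -q. This is a minimal sketch; the function name has_late_warning and its optional skip argument are illustrative assumptions, not part of the original answers.

```shell
#!/usr/bin/env bash
# has_late_warning FILE [SKIP]
# Succeeds (exit status 0) if any line after the first SKIP lines
# (default 15) starts with "WARNING: ".
# The function name and the SKIP argument are hypothetical.
has_late_warning() {
    local file=$1 skip=${2:-15}
    # tail -n +N starts output at line N, so N = skip + 1 drops the first $skip lines.
    # grep -q prints nothing; the pipeline's exit status is grep's exit status.
    tail -n "+$((skip + 1))" "$file" | grep -q '^WARNING: '
}

# Example use in an if condition:
# if has_late_warning /var/log/syslog; then
#     echo "unexpected warning found" >&2
# fi
```

Because the function only reports an exit status, it can go directly after if, or be combined with && and ||.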

Related

Can I do a Bash wildcard expansion (*) on an entire pipeline of commands?

I am using Linux. I have a directory of many files, I want to use grep, tail and wildcard expansion * in tandem to print the last occurrence of <pattern> in each file:
Input: <some command>
Expected Output:
<last occurrence of pattern in file 1>
<last occurrence of pattern in file 2>
...
<last occurrence of pattern in file N>
What I am trying now is grep "pattern" * | tail -n 1 but the output contains only one line, which is the last occurrence of pattern in the last file. I assume the reason is because the * wildcard expansion happens before pipelining of commands, so the tail runs only once.
Does there exist some Bash syntax so that I can achieve the expected outcome, i.e. let tail run for each file?
I know I can always use a for-loop to solve the problem. I'm just curious if the problem can be solved with a more condensed command.
I've also tried grep -m1 "pattern" <(tac *), and it seems like the aforementioned reasoning still applies: wildcard expansion applies only to the immediate command it is associated with, and the "outer" command runs only once.
Wildcards are expanded on the command line before any command runs. For example if you have files foo and bar in your directory and run grep pattern * | tail -n1 then bash transforms this into grep pattern foo bar | tail -n1 and runs that. Since there's only one stream of output from grep, there's only one stream of input to tail and it prints the last line of that stream.
If you want to search each file and print the last line of grep's output separately you can use a loop:
for file in * ; do
    grep pattern "${file}" | tail -n1
done
The problem with non-loop solutions is that tail doesn't inherently know where the output of one file ends and the output of another file begins, or indeed that there are even files involved on the other end of the pipe. It just knows input is coming in from somewhere and it has to print the last line of that input. If you didn't want a loop, you'd have to use a more powerful tool like awk and perhaps use the fact that grep prepends the names of matched files (if multiple files are matched, or with -H) to delimit the start and end of outputs from each file. But, the work to write an awk program that keeps track of the current file to know when its output ends and print its last line is probably more effort than is worth when the loop solution is so simple.
You can achieve what you want using xargs. For your example it would be:
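printf '%s\0' * | xargs -0 -n 1 sh -c 'grep "pattern" "$0" | tail -n 1'
This saves you from having to write a loop. (Passing NUL-separated names from printf to xargs -0, and quoting "$0", keeps filenames with spaces intact; parsing ls * would also list the contents of any subdirectories. xargs -0 is a widely supported extension rather than POSIX.)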
You can do this with awk, although (as tjm3772 pointed out in their answer) it's actually more complicated than the shell for loop. For the record, here's what I came up with:
awk -v pattern="YourPatternHere" '(FNR==1 && line!="") {print line; line=""}; $0~pattern {line=$0}; END {if (line!="") print line}' *
Explanation: when it finds a matching line ($0~pattern), it stores that line in the line variable ({line=$0}). This means that at the end of the file, line will hold the last matching line.
(Note: if you want to just include a literal pattern in the program, remove the -v pattern="YourPatternHere" part and replace $0~pattern with just /YourPatternHere/)
There's no simple trigger to print a match at the end of each file, so that part's split into two pieces: if it's the first line of a file AND line is set because of a match in the previous file ((FNR==1 && line!="")), print line and then clear it so it's not mistaken for a match in the current file ({print line; line=""}). Finally, at the end of the final file (END), print a match found in that last file if there was one ({if (line!="") print line}).
Also, note that the print-at-beginning-of-new-file test must be before the check for a matching line, or else it'll get very confused if the first line of the new file matches.
So... yeah, a shell for loop is simpler (and much easier to get right).
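For repeated use, the loop can be wrapped in a small function. This is only a sketch of the loop approach from the answers above; the name last_match_per_file is hypothetical, and skipping non-regular files is an added assumption about what is wanted.

```shell
#!/usr/bin/env bash
# last_match_per_file PATTERN FILE...
# Prints the last line matching PATTERN in each FILE, one line per file.
# Files with no match produce no output.
# (Hypothetical helper name; not from the original answers.)
last_match_per_file() {
    local pattern=$1 file
    shift
    for file in "$@"; do
        # Assumption: directories and other non-regular files should be skipped.
        [ -f "$file" ] || continue
        # One grep | tail pipeline per file, so tail sees each file separately.
        grep -e "$pattern" -- "$file" | tail -n 1
    done
}

# Usage:
# last_match_per_file 'pattern' *
```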

Show only newly added lines of logfile in terminal

I use tail -f to show the contents of a logfile.
What I want is when the logfile content changes, instead of appending the new lines to my screen, only the newly added lines should be shown on my screen.
So as if a clearscreen was made every time before printing the new lines.
I tried to find a solution by web search but couldn't find anything useful.
edit:
In my case it happens that several lines will be added at once (it is a php error logfile). So I am looking for a solution where more than the single last line can be shown on screen.
The watch command, combined with tail, shows the last line of a log file and refreshes every 2 seconds by default. It doesn't refresh at the moment a new line is appended, but since you can specify the interval, it might work for your use case.
watch -t tail -1 <path_to_logfile>
If you need a shorter interval, such as every 0.5 seconds, you can specify it with the -n option, e.g.:
watch -t -n 0.5 tail -1 <path_to_logfile>
Try
$ watch 'tac FILE | grep -m1 -C2 PATTERN | tac'
where
PATTERN is any keyword (or regexp) to identify errors you seek in the log,
tac prints the lines in reverse,
-m is a max count of matching lines to grep,
-C is any number of lines of context (before and after the match) to show (optional).
That would be similar to
$ tail -f FILE | grep -C2 PATTERN
if you didn't mind just appending occurrences to the output in real-time.
But if you don't know any generic PATTERN to look for at all,
you'd have to just follow all the updates as the logfile grows:
$ tail -n0 -f FILE
Or even, create a copy of the logfile and then do a diff:
Copy: cp file.log{,.old}
Refresh the webpage with your .php code (or whatever, to trigger the error)
Run: diff file.log{,.old}
(or, if you prefer sort to diff: $ sort file.log{,.old} | uniq -u)
The curly braces are shorthand for both filenames (see Brace Expansion in $ man bash)
If you must avoid any temp copies, store the line count in memory:
z=$(grep -c ^ file.log)
Refresh the webpage to trigger an error
tail -n +$((z+1)) file.log
The latter approach can be built upon, to create a custom scripting solution more suitable for your needs (check timestamps, clear screen, filter specific errors, etc). For example, to only show the lines that belong to the last error message in the log file updated in real-time:
$ clear; z=$(grep -c ^ FILE); while true; do d=$(date -r FILE); sleep 1; b=$(date -r FILE); if [ "$d" != "$b" ]; then clear; tail -n +$((z+1)) FILE; z=$(grep -c ^ FILE); fi; done
where
FILE is, obviously, your log file name;
grep -c ^ FILE counts all lines in a file, including a final line that lacks a trailing newline (unlike cat FILE|wc -l, which counts only newline characters);
sleep 1 sets the pause between timestamp checks to 1 second; you could change it to a fractional value if your sleep supports that (the shorter the interval, the higher the CPU usage).
To simplify any repetitive invocations in future, you could save this compound command in a Bash script that could take a target logfile name as an argument, or define a shell function, or create an alias in your shell, or just reverse-search your bash history with CTRL+R. Hope it helps!
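As one way to package this, here is a sketch of a shell function version. It is a variation, not the exact command above: it compares line counts rather than timestamps, and the names count_lines, print_lines_after and follow_fresh are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; all names here are illustrative.

# count_lines FILE: number of lines, including a final line without a newline.
count_lines() {
    # grep -c exits with status 1 when the count is 0, but still prints "0".
    grep -c ^ "$1" || true
}

# print_lines_after FILE N: print everything after the first N lines.
print_lines_after() {
    tail -n "+$(($2 + 1))" "$1"
}

# follow_fresh FILE [INTERVAL]: whenever FILE has grown, clear the screen and
# show only the lines added since the previous check. Runs until interrupted.
follow_fresh() {
    local file=$1 interval=${2:-1} seen now
    seen=$(count_lines "$file")
    while sleep "$interval"; do
        now=$(count_lines "$file")
        if [ "$now" -gt "$seen" ]; then
            clear
            # head limits output to the lines counted in this round, so lines
            # appended during the check are shown in the next round instead.
            print_lines_after "$file" "$seen" | head -n "$((now - seen))"
        fi
        seen=$now   # also resets the baseline if the file was truncated
    done
}

# Usage:
# follow_fresh /path/to/php_error.log 0.5
```

Comparing line counts means that touching the file without adding lines does not clear the screen; whether that is preferable to the timestamp check depends on the log.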

echo last character of text file in Unix/Bash

I need to see the last characters of a bunch of text files (or alternatively test whether they are "}" and give a list of files that test negative). Is there an easy way to do this from the command line?
(Ideally the solution works without reading the whole file from the start, because in addition to there being many files, they can also be quite large.)
P.S.: Any answer would be great but I would really appreciate if the function and syntax of everything in the answer can be fully explained.
It can be done fairly easily with tail and then string indexing in bash. For example, you obtain the last line in a file with, tail -n1 file. You will need to store the line in a variable using command-substitution, e.g.
lastln=$(tail -n1 file)
Then it is simply a matter of indexing the last characters, e.g.
echo ${lastln:(-1)}
(Note: when indexing from the end of the string, you must either put the offset in parentheses, e.g. (-1), or leave a space before the -1; echo ${lastln: -1} is also valid.)
You can try this:
for file in file1 file2; do tail -n 1 "$file" | grep -q '}$' || echo "$file"; done
where you should replace file1 file2 with the list of files you want to analyze, e.g. * or the like. Now what happens here? The outer part
for file in file1 file2; do ...; done
is a simple loop over the files, where inside the loop, you can refer to the current file as $file. Then,
tail -n 1 "$file"
prints the last line of the given file and
| grep -q '}$'
pipes the output to grep (put into silent mode with -q), which looks for '}' immediately followed by the end of the line ($). The return value of this command can be used to chain another action: when grep returns non-zero (indicating failure, i.e., the pattern is not matched), the last part
|| echo "$file"
is executed, resulting in the list of files you need.
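Since the question mentions large files, another option is to ask tail for the final bytes only, using tail -c. The following is a sketch under two assumptions: the last character is a single byte, and at most one newline follows it. The names last_char and files_not_ending_with are hypothetical.

```shell
#!/usr/bin/env bash
# last_char FILE: print the last character of FILE, ignoring a trailing newline.
# Assumes single-byte characters. (Hypothetical helper name.)
last_char() {
    local tail_bytes
    # tail -c 2 outputs only the final two bytes of the file; command
    # substitution then strips a trailing newline if there is one.
    tail_bytes=$(tail -c 2 "$1")
    # ${var: -1} is the last character (note the space before -1).
    printf '%s\n' "${tail_bytes: -1}"
}

# files_not_ending_with CHAR FILE...: list the files whose last character
# is not CHAR. (Hypothetical helper name.)
files_not_ending_with() {
    local want=$1 file
    shift
    for file in "$@"; do
        [ "$(last_char "$file")" = "$want" ] || printf '%s\n' "$file"
    done
}

# Usage:
# files_not_ending_with '}' *.json
```

On regular files, tail typically seeks to the end instead of reading from the start, which is what makes this cheap for large inputs.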

Find string then from there pull numbers

I'm just starting to code in bash and I'm not the best at it, but I have a situation. I have an output like:
Configuration file 'hello2.conf' is in use by process 735.
Ending
I want to extract the process ID 735.
I've seen answers that extract only the numbers from the output, but then I am left with 2735.
How can I go about extracting 735 from the output? I was thinking of searching for "process" and then grabbing the number after it, perhaps?
Thanks!
Use GNU grep with its Perl Compatible Regular Expression capabilities enabled with the -P flag and print only the matching entry using -o flag.
grep -Po 'process \K[0-9]+' <<<"Configuration file 'hello2.conf' is in use by process 735."
735
Use it in a command line as
.. | grep -Po 'process \K[0-9]+'
where the \K escape sequence stands for
\K: This sequence resets the starting point of the reported match. Any previously matched characters are not included in the final matched sequence.
You might want to use a regular expression:
[[ "$line" =~ ([0-9]+)\.$ ]] && echo "${BASH_REMATCH[1]}"
This should match any number at the end of the line, select the number part, and print it!
Good Luck!
If your line format remains the same, use cut -d" " -f 9 (this returns 735. with the trailing period, which you can strip with tr -d .)
sed can extract only the numbers at the specific location of the message (using \(...\) match grouping and \1 replacement).
... | sed "s#^Configuration file '.*' is in use by process \([0-9]*\)\.#\1#"
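Because the sample output has a second line ("Ending"), it can help to print only the lines where the substitution matched. A minimal sketch with sed -n and the p flag, keyed on the word "process" as the question suggests; the function name extract_pid is hypothetical.

```shell
#!/usr/bin/env bash
# extract_pid: read text on stdin and print the number that follows the
# word "process". (Hypothetical helper name.)
extract_pid() {
    # -n suppresses sed's default output; the trailing p prints a line only
    # when the substitution succeeded, so lines such as "Ending" are dropped.
    sed -n 's/.*process \([0-9][0-9]*\).*/\1/p'
}

# Usage:
# pid=$(some_command | extract_pid)
```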

How do I delete all rows with a blank space in the third column within a file?

So, I have a file which contains the results of some calculations I've run in the past weeks. I've collected the results in a file which I intend to plot. It is basically a bunch of rows with the format "x" "y" "f(x,y)", like this:
1.7 4.7 -460.5338556921
1.7 4.9 -460.5368762353
1.7 5.5
However, some lines, exemplified by the last one, contain a blank space in the 3rd column, resulting from failed calculations. I'd still like to plot the viable points, but, as there are thousands of points (and therefore rows), that task can't be accomplished easily by hand. I'd like to know how to make a script or program (I'd prefer a shell script, but I'll gladly go along with whatever works) which identifies those lines and deletes them. Does anyone know a way to do it?
awk '$3' <filename>
or better
awk 'NF > 2' <filename> # also keeps rows whose third column is 0, which awk '$3' would drop
This will serve the purpose!
A simple grep command that uses only POSIX character classes, so it should work with any grep implementation these days:
grep -v '^[^[:space:]]*[[:space:]]*[^[:space:]]*[[:space:]]*$' <filename>
With grep:
grep ' .* [^ ]' file
or using ERE (note that \s and \S are GNU grep extensions):
grep -E '\s\S+\s\S' file
I would use:
perl -lanE 'print if @F==3 && /^[\d\s\.+-]+$/' file
which will print only lines that:
contain 3 fields
and contain only numbers, spaces, and .+-
I do not know how you are going to plot. If your plotting application reads from a file or a pipe, you can use a grep or awk solution and pipe all valid lines into it.
When you need to call a program for each set of values, you can skip the invalid lines when you are reading the values:
while read -r x y fxy; do
    if [ -n "${fxy}" ]; then
        myplotter "$x" "$y" "${fxy}"
    fi
done < file
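If you would rather filter first and plot afterwards, a stricter awk check can also reject rows whose third field is present but not numeric. This is a sketch only; the function name keep_complete_rows is hypothetical, and the regular expression assumes plain decimal numbers with an optional sign and exponent.

```shell
#!/usr/bin/env bash
# keep_complete_rows: read rows on stdin and print only those that have
# exactly three whitespace-separated fields and a numeric-looking third field.
# (Hypothetical helper name; the number format is an assumption.)
keep_complete_rows() {
    awk 'NF == 3 && $3 ~ /^[-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?$/'
}

# Usage:
# keep_complete_rows < results.dat > results.clean.dat
```

Unlike awk '$3', this keeps a row whose third column is 0, and unlike awk 'NF > 2', it drops a row whose third column holds text such as an error marker.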
