Echo something while piping stdout - bash

I know how to pipe stdout:
./myScript | grep 'important'
Example output of the above command:
Very important output.
Only important stuff here.
But while grepping I would also like to echo something for each line, so it looks like this:
1) Very important output.
2) Only important stuff here.
How can I do that?
Edit: Apparently I haven't specified well enough what I want to do. Numbering the lines is just an example; I want to know in general how to add text (any text, including variables and whatnot) to pipe output. I see one can achieve that using awk '{print $0}', where $0 is the piped line I want to decorate.
Are there any other ways to achieve this?

This will number the matched lines starting from 1:
./myScript | grep 'important' | awk '{printf("%d) %s\n", NR, $0)}'
1) Very important output.
2) Only important stuff here.
This will give you the line number of the hit
./myScript | grep -n 'important'
3:Very important output.
47:Only important stuff here.

If you want line numbers on the new output running from 1..n, where n is the number of lines in the new output:
./myScript | awk '/important/{printf("%d) %s\n", ++i, $0)}'
# /important/ replaces the grep; ++i numbers the matches starting at 1

A solution with a while loop is not suited for large files, so you should only use it when you do not have a lot of important stuff:
i=0
while IFS= read -r line; do
    ((i++))
    printf '(%s) Look out: %s\n' "$i" "$line"
done < <(./myScript | grep 'important')
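If numbering is just an example and you really want to prepend arbitrary text, sed and nl can do it too. A couple of sketches:
./myScript | grep 'important' | sed 's/^/Look out: /'
./myScript | grep 'important' | nl -w1 -s') '
The sed version prefixes each line with a fixed string (shell variables can be spliced into the sed program, minding the quoting); nl numbers the lines much like the awk answers above.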

Related

Bash Iterative approach in place of process substitution not working as expected

Complete bash noob here. I had the following command (1.), and it worked as expected, but it seemed a bit naive for what I needed: essentially generating a wordlist from a messy input file with tab delimiters.
cat users.txt | tee >(cut -f 1 >> cut_out.txt) >(cut -f 2 >> cut_out.txt) >(cut -f 3 >> cut_out.txt) >(cut -f 4 >> cut_out.txt)
Output:
W Humphrey
SummersW
FoxxR
noreply
DaibaN
PeanutbutterM
PetersJ
DaviesJ
BlaireJ
GongoH
MurphyF
JeffersD
HorsemanB
...
I thought I could cut down on the ridiculous command above with the following:
cat users.txt | for i in {1..4}; do cut -f $i >> cut_out.txt; done
Output:
HumphreyW
The command above only returned a single word from the list and some whitespace.
The solution: I knew that I could get it working by simply looping the entire command instead. This did exactly what I wanted, but I still want to know why the command above (2.) returned an almost empty file.
for i in {1..4}; do cat users.txt | cut -f $i >> cut_out.txt; done
I have a solution; what I'm really after is an explanation, because I'm still learning about I/O in bash. Cheers.
Just a remark
awk -F '[\t]' '{for(i = 1; i <= 4; i++) print $i}' users.txt > cut_out.txt
Is basically what your cat ... | tee >(cut ...) ... does.
If the order of the output is unimportant, and there are only four columns in the file, simply
tr '\t' '\n' <users.txt >cut_out.txt
If you only want the first four columns in any order,
cut -f1-4 users.txt |
tr '\t' '\n' >cut_out.txt
(Thanks to @KamilCuk for raising this in a comment.)
Otherwise your third attempt is basically fine, though you want to avoid the useless cat and redirect only once:
for i in {1..4}; do
cut -f "$i" users.txt
done > cut_out.txt
This is obviously less efficient than only reading the file once. If the file is small enough to fit into memory, you could write a simple Awk script to read it once and split it up into variables, and then write out these variables in the order you want.
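For instance, a single-pass sketch along those lines (assuming four tab-separated columns and a file small enough to buffer), which emits all of column 1, then column 2, and so on, matching the order of the cut loop:
awk -F '\t' '
    { for (i = 1; i <= 4; i++) col[i] = col[i] $i "\n" }   # accumulate each column in memory
    END { for (i = 1; i <= 4; i++) printf "%s", col[i] }   # then emit the columns in order
' users.txt > cut_out.txt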
The second attempt is wrong because cat only supplies a single instance of the data to the pipe, and the first iteration of the loop consumes it all.
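You can watch this happen with inline data; a minimal sketch:
printf 'a\tb\nc\td\n' | for i in 1 2; do cut -f "$i"; done
The first iteration's cut reads the pipe to EOF and prints field 1 of both lines (a, c); the second iteration's cut finds stdin already exhausted and prints nothing, which is why attempt (2.) produced an almost empty file.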

Select line below search expression in log file

I am trying to search logs for an expression, then select the line below each match.
Example
I know I want the lines below CommonText, for example given the log data:
CommonTerm: something
This Should
random stuff
CommonTerm: something else
Be The
random stuff
more random stuff
CommonTerm: something else
Output Text
random fluff
Desired Output
This Should
Be The
Output Text
Current Attempt
Currently I can use grep 'CommonTerm' -B 0 -A 1 log_file to get:
CommonTerm: something
This Should
--
CommonTerm: something else
Be The
--
CommonTerm: something else
Output Text
I can then pipe this through | grep "\-\-" -B 0 -A 1 to get
This Should
--
--
Be The
--
--
Output Text
--
And then through awk '{if (count++%3==0) print $0;}', giving:
This Should
Be The
Output Text
My question is: surely there's a good 'unix-y' way to do this? Multiple greps and a hacky awk feel pretty silly... Is there?
Edit: I also tried:
(grep 'CommonTerm:' log_file -B 0 -A 2) | grep "\-\-" -B 1 -A 0 | grep -v "^--$"
but it seems much more clunky than the answers below which was expected ;)
Edit:
There are some great answers coming in; are there any that would let me easily select the nth line after the search term? I see a few might be easier than others...
awk 'p { print; p=0 }
/CommonTerm/ { p=1 }' file
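Regarding the edit about selecting the nth line after the search term: the same awk idea generalizes with a countdown, e.g. (a sketch, with n=2 for the second line after each match):
awk -v n=2 'c && !--c; /CommonTerm/ { c = n }' file
Each match arms the counter c; the bare pattern c && !--c becomes true (and so prints the current line) exactly when the counter reaches zero, i.e. on the nth line after the match.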
You can use sed:
sed -n "/^CommonTerm: /{n;p}" log_file
This searches for "CommonTerm: " at the start of the line (^), then reads the next line (n) and prints it (p).
EDIT: As per the comment thread, if you're using BSD sed rather than GNU sed (likely to be the case on OS X), you need a couple of extra semicolons to get round a bug:
sed -n "/^CommonTerm: /{;n;p;}" log_file
How about:
grep -B 0 -A 1 "CommonTerm" log_file | grep -v "^CommonTerm:" | grep -v "^--$"
I'd do this with awk:
awk 'found{found=0;print;next}/CommonTerm/{found=1}'
For those who have pcregrep installed, this can be done in one shot. Notice the use of \K to reset the starting point of the match:
pcregrep -Mo 'CommonTerm.*?\n\K.*?(?=\n)' file

How can I split a string in shell?

I have two strings that I want to split on spaces and use two by two:
namespaces="Calc Fs"
files="calc.hpp fs.hpp"
For example, I want to use them like this: command -q namespace[i] -l files[j]
I'm a noob in Bourne Shell.
Put them into an array like so:
#!/bin/bash
namespaces="Calc Fs"
files="calc.hpp fs.hpp"
i=1
j=0
name_arr=( $namespaces )
file_arr=( $files )
command -q "${name_arr[i]}" -l "${file_arr[j]}"
echo "hello world" | awk '{split($0, array, " ")} END{print array[2]}'
is how you would split a simple string.
If what you want to do is loop through combinations of the two split strings, then you want something like this:
for namespace in $namespaces
do
for file in $files
do
command -q $namespace -l $file
done
done
EDIT:
Or, to expand on the awk solution that was posted, you could also just do:
echo $foo | awk '{print $'$i'}'
EDIT 2:
Disclaimer: I do not profess to be any kind of expert in awk at all, so there may be small errors in this explanation.
Basically what the snippet above does is pipe the contents of $foo into the standard input of awk. Awk reads its standard input line by line, separating each line into fields based on a field separator, which is whitespace by default. Awk executes the program it is given as an argument. In this case the shell expands '{print $'$i'}' into, say, {print $2} when i is 2, which simply tells awk to print field number 2 of each line of its input.
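Concretely, with made-up values for $foo and $i (a sketch):
foo="one two three"
i=2
echo $foo | awk '{print $'$i'}'   # the shell builds the program {print $2}, so this prints "two"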
If you want to learn more I think that this blog post does a pretty good job of describing the basics (as well as the basics of sed and grep) if you skip past the more theoretical stuff at the start (unless you're into that kind of thing).
I wanted to find a way to do it without arrays; here it is:
paste -d " " <(tr " " "\n" <<< $namespaces) <(tr " " "\n" <<< $files) |
while read namespace file; do
command -q $namespace -l $file
done
Two special usages here: process substitution (<(...)) and here strings (<<<). The here string is a shortcut: tr " " "\n" <<< $namespaces is equivalent to echo $namespaces | tr " " "\n". Process substitution is a shortcut for FIFO creation; it allows paste to be run on the output of commands instead of files.
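For illustration, with the example strings the paste step produces something like this (a sketch):
$ paste -d " " <(tr " " "\n" <<< "$namespaces") <(tr " " "\n" <<< "$files")
Calc calc.hpp
Fs fs.hpp
Each read then picks up one namespace/file pair per line.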
If you are using zsh this could be very easy:
files="calc.hpp fs.hpp"
# all elements
print -l ${(s/ /)files}
# just the first one
echo ${${(s/ /)files}[1]}

Script to grab output lines after specific pattern

I have a program, "wimaxcu scan" to be precise, that outputs data in a format like the following:
network A
frequency
signal strength
noise ratio
network B
frequency
signal strength
noise ratio
etc....
There are a huge number of elements that get output by the program. I am only interested in a few of the properties of one particular network, say for example network J. I would like to write a bash script that will place the signal strength and noise ratio of J on a new line in a specified text file every time that I run the script. So after running the script many times I would have a file that looks like:
Point 1 signal_strength noise_ratio
Point 2 signal_strength noise_ratio
Point 3 signal_strength noise_ratio
etc...
I was advised to pipe the output into grep to accomplish this. I'm fairly certain that grep alone is not the best method here, because the lines I want to grab are indistinguishable from the other signal-strength and noise-ratio lines. I'm thinking that the "network J" pattern would have to be recognized (it is unique), and then the lines that come 2nd and 3rd after the found pattern would be grabbed.
My question is how others would recommend that I implement such a script. I'm not very experienced with bash, so the simplest method would be appreciated, rather than a complex but efficient method.
With awk!
If your data is in a file called "data," you can do this on the command line:
$ awk -v RS='\n\n' -v FS='\n' '{ print $1,$3,$4 }' data
What that will do is set your "record separator" to two newlines, the "field separator" to a single newline, and then print fields one, three, and four from each data set.
Awk, if you're not familiar, operates on records, and can do various things with them. So this simply says "a record looks like this, and print it this way." Specifically, "A record has fields that are separated by newlines, and each record is separated by two consecutive newlines. Print the first, third, and fourth fields of these records out for each record."
Edit: As Jo So (who fully read and comprehended what you were asking for) points out, you can add an if statement to the inside of the curly braces to specify a specific network. Or, if it were unique, you could just throw in a pipe to grep at the end. But his solution is more correct, since it will only match against that first field!
$ awk -v RS='\n\n' -v FS='\n' '{ if ($1 == "network J") print $1,$3,$4 }' data
To complete Dan Fego's very good answer (sorry, it seems I'm not yet allowed to place comments), consider this:
awk -v RS='\n\n' -v FS='\n' '{if ($1 == "network J") print $3}' data
This is actually a very robust piece of code.
Actually, grep is the right option.
What you have to do is use the -A (after) and -B (before) options of grep. You can use something like:
grep "network J" -A 3 original_output
this will output the matching "network J" line plus the 3 lines after it. But you don't want the words "network J", so
grep "network J" -A 3 original_output | grep -v "network J"
You then have to put them on one line, which is easily done by echoing the output, as in:
echo $(grep "network J" -A 3 original_output | grep -v "network J")
Now you will end up with all instances of network J in the file. You can append them to an output file:
Part A
echo $(grep "network J" -A 3 original_output | grep -v "network J") >> net_j_report.txt
Adding "Point 1", "Point 2", etc. to the beginning can be done later with:
Part B
grep -v '^[[:space:]]*$' net_j_report.txt | cat -n | sed -e 's/^/Point /'
Here grep -v removes any accidental empty lines, cat -n adds line numbers, and the final sed puts the word Point at the beginning.
So combine parts A and B, and voilà.
This might work for you:
# cat file1 # same format for file2, file3, ...
network A
frequency
signalA strength
noise1 ratio
network B
frequency
signalB strength
noise1 ratio
# sed -n '/network/{s/network \(.\)/cat <<\\EOF >>\1/p;n;n;N;y/ /_/;s/\n/ /;s/$/\nEOF/p}' file1 | sh
# sed -n '/network/{s/network \(.\)/cat <<\\EOF >>\1/p;n;n;N;y/ /_/;s/\n/ /;s/$/\nEOF/p}' file2 | sh
# sed -n '/network/{s/network \(.\)/cat <<\\EOF >>\1/p;n;n;N;y/ /_/;s/\n/ /;s/$/\nEOF/p}' file3 | sh
# sed -i = A
# sed -i 'N;s/^/Point /;s/\n/ /' A
# sed -i = B
# sed -i 'N;s/^/Point /;s/\n/ /' B
# cat A
Point 1 signalA_strength noise1_ratio
Point 2 signalA_strength noise2_ratio
Point 3 signalA_strength noise3_ratio
# cat B
Point 1 signalB_strength noise1_ratio
Point 2 signalB_strength noise2_ratio
Point 3 signalB_strength noise3_ratio

"while read LINE do" and grep problems

I have two files.
file1.txt:
Afghans
Africans
Alaskans
...
file2.txt contains the output from a wget on a webpage, so it's a big sloppy mess, but it does contain many of the words from the first list.
Bashscript:
cat file1.txt | while read LINE; do grep $LINE file2.txt; done
This did not work as expected. I wondered why, so I echoed the $LINE variable inside the loop and added a sleep 1 so I could see what was happening:
cat file1.txt | while read LINE; do echo $LINE; sleep 1; grep $LINE file2.txt; done
The output in the terminal looks something like this:
Afghans
Africans
Alaskans
Albanians
Americans
grep: Chinese: No such file or directory
: No such file or directory
Arabians
Arabs
Arabs/East Indians
: No such file or directory
Argentinans
Armenians
Asian
Asian Indians
: No such file or directory
file2.txt: Asian Naruto
...
So you can see it did finally find the word "Asian". But why does it say:
No such file or directory
?
Is there something weird going on or am I missing something here?
What about
grep -f file1.txt file2.txt
@OP, first, use dos2unix as advised. Then use awk:
awk 'FNR==NR{a[$1];next}{ for(i=1;i<=NF;i++){ if($i in a) {print $i} } } ' file1 file2_wget
Note: using a while loop with grep inside is not efficient, since for every iteration you invoke grep on file2 again.
@OP, crude explanation:
For the meaning of FNR and NR, please refer to the gawk manual. FNR==NR{a[$1];next} loads the contents of file1 into array a (as keys). When FNR is not equal to NR (which means we are now reading the 2nd file), it checks whether each word in the line is in array a, and prints it if so. (The for loop iterates over the fields.)
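Spelled out with comments, the same program reads as (a sketch):
awk '
    FNR == NR { a[$1]; next }     # 1st file: store each word as a key of array a
    {                             # 2nd file: for every field on the line...
        for (i = 1; i <= NF; i++)
            if ($i in a)          # ...print it if it appeared in file1
                print $i
    }
' file1 file2_wget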
Use more quotes and use less cat
while IFS= read -r LINE; do
grep "$LINE" file2.txt
done < file1.txt
As well as the quoting issue, the file you've downloaded contains CRLF line endings which are throwing read off. Use dos2unix to convert file1.txt before iterating over it.
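If dos2unix isn't available, stripping the carriage returns with standard tools works just as well; a sketch:
tr -d '\r' < file1.txt > file1.clean.txt
(or, with GNU sed, in place: sed -i 's/\r$//' file1.txt).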
Although using awk is faster, grep produces a lot more detail with less effort. So, after issuing dos2unix, use:
grep -F -i -n -f <file_containing_pattern> <file_containing_data_blob>
You will have all the matches plus line numbers (case-insensitive).
At minimum this will suffice to find all the words from file_containing_pattern:
grep -F -f <file_containing_pattern> <file_containing_data_blob>
