print some text every Nth line in bash

How can I throttle the output of some command so that it prints three lines, then prints something, then the next three lines, and keeps doing that? Note that sometimes this tool prints a lot of data at once and sometimes it prints very slowly (it's a kind of tail command). I need to do this in pure bash; I cannot use awk.
./sometool
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
expected result:
./sometool | <some pure bash logic / no awk>
line1
line2
line3
#some random text
line4
line5
line6
#some random text
line7
line8
line9
#some random text
line10
...
did it with awk but needed a bash approach:
./sometool | awk 'NR%3==0{$0= $0 RS"#some random text"}1'
line1
line2
line3
#some random text
line4
line5
line6
#some random text
line7
line8
line9
#some random text
line10

$ seq 10 | while IFS= read -r line; do echo "$line"; if [[ $((++c%3)) -eq 0 ]]; then echo "some random text"; fi; done
1
2
3
some random text
4
5
6
some random text
7
8
9
some random text
10
Create a line counter modulo 3 and apply the same logic as in awk. I think this can be simplified a bit.
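One possible simplification, spread over several lines and using an arithmetic command for the test (a sketch with the same behaviour as the one-liner above):
seq 10 | while IFS= read -r line; do
    echo "$line"
    # every third line, print the marker
    (( ++c % 3 == 0 )) && echo "some random text"
done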

How about this:
some process that generates output | while
    IFS= read -r first
    IFS= read -r second
    IFS= read -r third
do
    echo "$first"
    echo "$second"
    echo "$third"
    echo "some random text"
done
Obviously inflexible for values of N != 3
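A sketch of one way to generalize this to N lines per group, still in pure bash (n is a placeholder for the group size):
n=3
./sometool | while true; do
    for (( i = 0; i < n; i++ )); do
        IFS= read -r line || break 2   # stop both loops when the input runs out
        echo "$line"
    done
    echo "#some random text"
done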

Related

Bash Process Substitution usage with tee and while loop

I want to use nested process substitution with tee in a while loop.
while read line; do
    #process line
    echo "--$line"
done < <(cat sample.file | tee >(grep "SPECLINE") | grep "LINESTOPROCESS")
Therefore, I need:
all lines in sample.file that contain the "LINETOPROCESS" expression to be passed into the loop, where they will be printed with a "--" prefix;
all lines containing "SPECLINE" to be printed by tee's first process substitution (the grep).
I want to avoid cat-ting sample.file more than once, as it is very large.
With a simple sample.test file:
line1 SPECLINE
line2 LINETOPROCESS
line3 LINETOPROCESS
line4 SPECLINE
line5 I don't need it
line6 also not
line7 also not
line8 SPECLINE
line9 LINETOPROCESS
My result:
# ./test.sh
#
My desired result:
# ./test.sh
line1 SPECLINE
--line2 LINETOPROCESS
--line3 LINETOPROCESS
line4 SPECLINE
line8 SPECLINE
--line9 LINETOPROCESS
Or I can also accept this as output:
# ./test.sh
--line2 LINETOPROCESS
--line3 LINETOPROCESS
--line9 LINETOPROCESS
line1 SPECLINE
line4 SPECLINE
line8 SPECLINE
UPDATE1
The greps are for demo only.
I really need those two substitutions.
sample.file is an HTML file.
grep "SPECLINE" would really be hxselect -i -s ';' -c 'div.hour'
grep "LINESTOPROCESS" would really be hxselect -i -s ';' -c 'div.otherclass' | hxpipe
The hx programs are not line-oriented; they read from stdin and write to stdout.
So tee's first command selects the divs with the 'hour' class and separates them with ';'; the pipe after tee selects all divs with class 'otherclass', and hxpipe flattens them for further processing in the loop.
I would use no process substitution at all.
while IFS= read -r line; do
    if [[ $line = *SPECLINE* ]]; then
        printf '%s\n' "$line"
    elif [[ $line = *LINETOPROCESS* ]]; then
        printf '--%s\n' "$line"
    fi
done < sample.txt
You are already paying the cost of reading an input stream line-by-line in bash; no reason to add the overhead of two separate grep processes to it.
A single awk process would be even better, as it is more efficient than bash's read-one-character-at-a-time approach to reading lines of text.
awk '/SPECLINE/ {print} /LINETOPROCESS/ {print "--"$0}' sample.txt
(which is too simple if a single line could match both SPECLINE and LINETOPROCESS, but I leave that as an exercise to the reader to fix.)
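If a line matching both patterns should only be counted as a SPECLINE, one possible fix (assuming that precedence is what's wanted) is a next:
awk '/SPECLINE/ { print; next } /LINETOPROCESS/ { print "--" $0 }' sample.txt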
The following simply loops through the entire file and prints only the matching lines; all other lines are ignored.
while read line; do
    case "$line" in
        *SPECLINE*) echo "$line" ;;
        *LINETOPROCESS*) echo "--$line" ;;
    esac
done < sample.file
When you want to keep the tee, you need two changes.
Your test code greps for LINESTOPROCESS, but the input has LINETOPROCESS.
The output process substitution causes problems like the ones explained in https://stackoverflow.com/a/42766913/3220113. You can do this differently:
while IFS= read -r line; do
    #process line
    echo "--$line"
done < sample.file |
    tee >(grep "SPECLINE") >(grep "LINETOPROCESS") >/dev/null
I don't know hxselect, but it seems to operate on a complete well-formed XML document, so avoid the grep.
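For the hx case, one way to keep a single read of sample.file is to leave the line-oriented branch on the main pipeline and send the whole-document 'hour' extraction to a file through the process substitution. A sketch, using the hxselect/hxpipe invocations quoted in the update (hours.out is a hypothetical output name):
< sample.file tee >(hxselect -i -s ';' -c 'div.hour' > hours.out) |
    hxselect -i -s ';' -c 'div.otherclass' | hxpipe |
    while IFS= read -r line; do
        echo "--$line"
    done
# Note: the >( ... ) branch runs asynchronously, so hours.out may be completed
# slightly after the pipeline returns (the issue described in the linked answer).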

How to grep and remove from a file all lines between a separator

I have a file that looks like this:
===SEPARATOR===
line2
line3
===SEPARATOR===
line5
line6
===SEPARATOR===
line8
...
lineX
===SEPARATOR===
How can I do a while loop and go through the file, dump anything between two ===SEPARATOR=== occurrences into another file for further processing?
On the first iteration I want to add only line2 and line3 to the second file and parse it; on the next iteration I want line5 and line6 in the second file, so I can do the same parsing again on different data.
It sounds like you want to save each block of lines to a separate file.
The following solutions create output files f1, f2, ..., containing the (non-empty) blocks of lines between the ===SEPARATOR=== lines.
With GNU Awk or Mawk:
awk -v fnamePrefix='f' -v RS='(^|\n)===SEPARATOR===(\n|$)' \
'NF { fname = fnamePrefix (++n); print > fname; close(fname) }' file
Pure bash - which will be slow:
#!/usr/bin/env bash
fnamePrefix='f'; i=0
while IFS= read -r line; do
    [[ $line == '===SEPARATOR===' ]] && { (( ++i )); > "${fnamePrefix}${i}"; continue; }
    printf '%s\n' "$line" >> "${fnamePrefix}${i}"
done < file
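If each block should instead be handled immediately inside the loop, rather than written to numbered files first, a sketch that collects a block in an array and hands it off at every separator (current_block.txt stands in for whatever per-iteration processing is needed):
#!/usr/bin/env bash
block=()
while IFS= read -r line; do
    if [[ $line == '===SEPARATOR===' ]]; then
        # a complete block has been read; process it (here: dump it to a scratch file)
        (( ${#block[@]} )) && printf '%s\n' "${block[@]}" > current_block.txt
        block=()
    else
        block+=("$line")
    fi
done < file
# handle a trailing block in case the file does not end with a separator
(( ${#block[@]} )) && printf '%s\n' "${block[@]}" > current_block.txt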
You can exclude all lines matching ===SEPARATOR=== with grep -v and redirect the rest to a file:
grep -vx '===SEPARATOR===' file > file_processed
-x makes sure that only lines completely matching ===SEPARATOR=== are excluded.
This uses sed to find lines between separators, and then grep -v to delete the separators.
$ sed -n '/===SEPARATOR===/,/===SEPARATOR===/ p' file | grep -v '===SEPARATOR==='
line2
line3
line8
...
lineX
There's got to be a more elegant answer that doesn't repeat the separator three times, but I'm drawing a blank.
I am assuming that you do not need line5 and line6. You can do it with awk like this:
awk '$0 == "===SEPARATOR===" {interested = ! interested; next} interested {print}'
Credit goes to https://www.gnu.org/software/gawk/manual/html_node/Boolean-Ops.html#Boolean-Ops
Output:
[root@hostname ~]# awk '$0 == "===SEPARATOR===" {interested = ! interested; next} interested {print}' /tmp/1
line2
line3
line8
...
lineX
awk to the rescue!
with multi-char support (e.g. gawk)
$ awk -v RS='\n?===SEPARATOR===\n' '!(NR%2)' file
line2
line3
line8
...
lineX
or without that
$ awk '/===SEPARATOR===/{p=!p;next} p' file
line2
line3
line8
...
lineX
which is practically the same as @Jay Rajput's answer.
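For completeness, the same toggle written in pure bash (a sketch; like the toggle-based awk answers, it skips every other block):
interested=0
while IFS= read -r line; do
    if [[ $line == '===SEPARATOR===' ]]; then
        (( interested = !interested ))
        continue
    fi
    (( interested )) && printf '%s\n' "$line"
done < file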

Bash script file reading

I'm trying to select certain text from an output file.
I'm reading my file as follows:
while read line
do
if [ "$line" == "SUMMARY OF POLARIZATION CALCULATION" ]; then
break
fi
done < tutorial1/Tutorial1_1.out
When the loop reaches that "Summary" line, I need to read only the next 9 lines. I'm trying to use a for loop, but I'm not sure how to use it:
for i in {1..9}
do
read line < tutorial1/Tutorial1_1.out
echo $line >> Summary.out
done
My output is as follows:
next is setrmt
next is setrmt
next is setrmt
next is setrmt
next is setrmt
next is setrmt
next is setrmt
next is setrmt
next is setrmt
But I need it to be the next 9 lines after the "SUMMARY" statement. Please help.
You can use the -A parameter for the grep command like:
grep -A9 "SUMMARY OF POLARIZATION CALCULATION"
from the man:
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines.
demo:
while read -r line
do
echo "$line"
done < <(grep -A9 "SUMMARY OF POLARIZATION CALCULATION" filename | tail -9)
For the following input file:
before1
before2
before3
before4
before5
before6
SUMMARY OF POLARIZATION CALCULATION
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
line11
line12
prints:
line1
line2
line3
line4
line5
line6
line7
line8
line9
or simply:
grep -A9 "SUMMARY OF POLARIZATION CALCULATION" ../Tutorial1_1.out | tail -9 >> Summary.out
You can't redirect again. That reopens the file at the beginning. Do the second block inside the first and use the same file descriptor:
while read line
do
if [ "$line" == "SUMMARY OF POLARIZATION CALCULATION" ]; then
for i in {1..9}
do
read line
echo $line >> Summary.out
done
break
fi
done < tutorial1/Tutorial1_1.out
Let's try a clean solution:
1) find the line number of the line that contains the string you want. You could do this by implementing a counter, but this is better:
linenr=$(grep -n -m 1 "SUMMARY OF POLARIZATION CALCULATION" <your file name here> | cut -d':' -f1)
2) calculate the number of the last line you want:
let "lastlinenr = $linenr + 9"
3) fetch the lines and write them to a file:
cat <your file name here> | sed -n "$linenr","$lastlinenr"p > <your destination file here>
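Combined into a single pass, and starting one line past the match so that the SUMMARY line itself is not copied (a sketch using the file names from the question):
linenr=$(grep -n -m 1 "SUMMARY OF POLARIZATION CALCULATION" tutorial1/Tutorial1_1.out | cut -d':' -f1)
sed -n "$((linenr + 1)),$((linenr + 9))p" tutorial1/Tutorial1_1.out > Summary.out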

Bash while sed is not null

I need to do a while loop that runs while sed's output is not null. Example:
File 1
line1
line2
line3
File 2
i=1
while sed "${int}p" # here I need expression which checks if line is not null
# here echo this line and i++
I tried to write just while sed -n "${int}p" but it does not work as I expected.
You can use the = command in sed to print line numbers:
sed -n '/./!q;=;p' input | sed 'N;s/\n/ /'
For an input:
a
b
c
d
This gives:
1 a
2 b
3 c
If you only want the line number of the last line before the first empty line:
sed -n '/./!q;=' input | tail -1
A while loop that prints all lines:
while read line; do
echo "$line"
done < input
If you want to count the lines until the first empty line, you could do this:
$ cat in.txt
line1
line2
line3

line4
line5
$ echo $(($(sed '/^\s*$/q' < in.txt | wc -l) - 1))
3
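If the goal really is a bash loop that keeps asking sed for line $i until sed returns nothing, a minimal sketch (input_file is a placeholder; note that this re-reads the file on every iteration):
i=1
while line=$(sed -n "${i}p" input_file); [[ -n $line ]]; do
    echo "$line"
    (( i++ ))
done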

How do I list the file content's selected lines?

I want to list the content of the file from the fourth line to the last line, then display the content from the first to the third line, and append both outputs to a new file.
This will print from 4th to last line.
awk 'NR>=4' file
This will print from first to 3rd line.
awk 'NR<4' file
To have all the output in this order:
awk 'NR>=4' file > new_file
awk 'NR<4' file >> new_file
Test
$ cat a
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
$ awk 'NR>=4' a
line4
line5
line6
line7
line8
line9
line10
$ awk 'NR<4' a
line1
line2
line3
$ awk 'NR>=4' a > new_file
$ awk 'NR<4' a >> new_file
$ cat new_file
line4
line5
line6
line7
line8
line9
line10
line1
line2
line3
Update
You can also do it using head and tail:
$ tail -n +4 a
line4
line5
line6
line7
line8
line9
line10
$ head -n 3 a
line1
line2
line3
We use the -n option of tail with +:
-n, --lines=K
output the last K lines, instead of the last 10; or use -n +K to
output lines starting with the Kth
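So the head/tail version of the whole task, written to a new file in the same order as the awk version above:
tail -n +4 a > new_file
head -n 3 a >> new_file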
This can also be done using sed:
$ sed -ne '4,$p' inp.txt > out.txt
$ sed -ne '4q;p' inp.txt >> out.txt
The last line could be shortened to sed 3q inp.txt >> out.txt if you like. I did it this way so that if you wanted to auto-generate your sed script using a single breakpoint, you wouldn't need to do any extra math.
Of course, you could also make this a more complex one-liner:
$ sed -ne '1x;2,3H;4,$p;${x;p;}' inp.txt > out.txt
The breakdown is this:
1x - Store the first line in sed's "hold" buffer.
2,3H - Append subsequent lines to the "hold" buffer, up to our breakpoint.
4,$p - If the current line number is "4 to the end", just print it.
${x;p;} - On the last line, swap the hold buffer and pattern space, then print.
This may look a bit arcane. And it is. And requires a bit more math if your breakpoint ("4") is a number that will change. But sed is tiny, and this runs as a single process, so it will likely be quite a bit faster if you're processing many files this way.
One more way you can try this:
For a single line
This will print only line 5 of the file:
sed -n '5p' yourFilesPath
For multiple lines (1 through 5)
sed -n '1,5p' yourFilesPath
The problem with using the 'head' and 'tail' commands is that you need to know the total number of lines in the given file. You can use the 'wc' command to get the number of lines in the file.
wc -l < file will give you the number of lines.
echo `expr \`wc -l < file\` - 3` will give you the number of lines minus 3. Feed that to the tail command. That's it.
-bash-2.05b$ tail -`expr \`wc -l < file\` - 3` file
line4
line5
line6
line7
line8
line9
line10
-bash-2.05b$
-bash-2.05b$ head -3 file
line1
line2
line3
-bash-2.05b$
-bash-2.05b$ tail -`expr \`wc -l < file\` - 3` file > output_file
-bash-2.05b$ head -3 file >> output_file
-bash-2.05b$
-bash-2.05b$ cat output_file
line4
line5
line6
line7
line8
line9
line10
line1
line2
line3
Not as elegant as the answers posted earlier, but just another way of doing it. Please note that this solution only works when the number of lines is >= 3 :)

Resources