Omit the last line with sed - bash

I have a file with the following content:
2013-07-30 debug
line1
2013-07-30 info
line2
line3
2013-07-30 debug
line4
line5
I want to get the following output with sed.
2013-07-30 info
line2
line3
This command gives me nearly the output I want
sed -n '/info/I,/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/{p}' myfile.txt
2013-07-30 info
line2
line3
2013-07-30 debug
How do I omit the last line here?

IMO, sed starts to become unwieldy as soon as you have to add conditions into it. I realize you did not tag the question with awk, but here is an awk program to print only "info" sections.
awk -v type="info" '
$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ {p = ($2 == type)}
p
' myfile.txt
2013-07-30 info
line2
line3

Try:
sed -n '/info/I p; //,/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/{ //! p}' myfile.txt
It prints the first match, then the range omits both edges; the opening line has already been printed by the first command, so effectively only the closing date line is skipped. It yields:
2013-07-30 info
line2
line3
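A quick way to verify this on the question's input. The trick relies on the empty regex // reusing the most recently applied pattern: on the opening line it still refers to /info/ (so the already-printed header is not printed twice), while on later lines it refers to the date pattern (so the closing date line is skipped):

```shell
# Recreate the sample input from the question.
cat > myfile.txt <<'EOF'
2013-07-30 debug
line1
2013-07-30 info
line2
line3
2013-07-30 debug
line4
line5
EOF

section=$(sed -n '/info/I p; //,/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/{ //! p}' myfile.txt)
echo "$section"
```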

This might work for you (GNU sed):
sed -r '/info/I{:a;n;/^[0-9]{4}(-[0-9]{2}){2}/!ba;s/^/\n/;D};d' file
or if you prefer:
sed '/info/I{:a;n;/^....-..-.. /!ba;s/^/\n/;D};d' file
N.B. This caters for consecutive matching sections.
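To see the consecutive-section handling, here is a small test with two adjacent "info" sections (GNU sed assumed; log.txt is a hypothetical file name). The D command restarts the script on the date line that ends a section, so a section that begins immediately after is picked up without reading further input:

```shell
cat > log.txt <<'EOF'
2013-07-30 debug
line1
2013-07-30 info
line2
2013-07-30 info
line3
2013-07-30 debug
line4
EOF

sections=$(sed -r '/info/I{:a;n;/^[0-9]{4}(-[0-9]{2}){2}/!ba;s/^/\n/;D};d' log.txt)
echo "$sections"
```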

Related

Remove blank lines from the ends of a bunch of files

I have a bunch of files with many lines in them, and usually one or two blank lines at the end.
I want to remove the blank lines at the end, while keeping all of the blank lines that may exist within the file.
I want to restrict the operation to the use of GNU utilities or similar, e.g. bash, sed, awk, cut, grep, etc.
I know that I can easily remove all blank lines, with something like:
sed '/^$/d'
But I want to keep blank lines which exist prior to further content in the file.
File input might be as follows:
line1
line2

line4
line5

(with one or two blank lines at the end)
I'd want the output to look like:
line1
line2

line4
line5
All files are <100K, and we can make temporary copies.
With Perl:
perl -0777 -pe 's/\n*$//; s/$/\n/' file
The second substitution (s/$/\n/) appends a newline to the end of the file again, to keep it POSIX compliant.
Or shorter:
perl -0777 -pe 's/\n*$/\n/' file
Following Fela Maslen's comment, to edit files in place (-i) and glob all files in the current directory (*):
perl -0777 -pe 's/\n*$/\n/' -i *
If lines containing just space chars are to be considered empty:
$ tac file | awk 'NF{f=1}f' | tac
line1
line2
line4
line5
otherwise:
$ tac file | awk '/./{f=1}f' | tac
line1
line2
line4
line5
Here is an awk solution (standard Linux gawk), which I enjoyed writing.
single line:
awk '/^\s*$/{s=s $0 ORS; next}{print s $0; s=""}' input.txt
Using a readable script, script.awk:
/^\s*$/{skippedLines = skippedLines $0 ORS; next}
{print skippedLines $0; skippedLines= ""}
explanation:
/^\s*$/ { # for each empty line
skippedLines = skippedLines $0 ORS; # pad string of newlines
next; # skip to next input line
}
{ # for each non empty line
print skippedLines $0; # print any skippedLines and current input line
skippedLines= ""; # reset skippedLines
}
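A runnable check of the one-liner on a file with one internal blank line and trailing blank lines (doc.txt is a hypothetical file name; note that \s in a regex is a gawk extension):

```shell
printf 'line1\n\nline2\n\n\n' > doc.txt

# Blank lines are buffered in s; they are only flushed when a non-empty
# line follows, so blanks at end-of-file are silently dropped.
kept=$(awk '/^\s*$/{s=s $0 ORS; next}{print s $0; s=""}' doc.txt)
echo "$kept"
```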
This might work for you (GNU sed):
sed ':a;/\S/{n;ba};$d;N;ba' file
If the current line contains a non-space character, print the current pattern space, fetch the next line and repeat. If the current line(s) is/are empty and it is the last line in the file, delete the pattern space, otherwise append the next line and repeat.
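The same test input run through the sed version (doc.txt is a hypothetical file name):

```shell
printf 'line1\n\nline2\n\n\n' > doc.txt

# Non-blank lines are printed as they are seen; a run of blank lines is
# accumulated with N and either printed when more content follows or
# deleted ($d) when it reaches end-of-file.
cleaned=$(sed ':a;/\S/{n;ba};$d;N;ba' doc.txt)
echo "$cleaned"
```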

Merging Two, Nearly Similar Text Files

Suppose we have ~/file1:
line1
line2
line3
...and ~/file2:
line1
lineNEW
line3
Notice that these two files are nearly identical, except that line2 differs from lineNEW.
Question: How can I merge these two files to produce one that reads as follows:
line1
line2
lineNEW
line3
That is, how can I merge the two files so that all unique lines are captured (without overlap) into a third file? Note that the order of the lines doesn't matter (as long as all unique lines are being captured).
awk '{
print
getline line < second
if ($0 != line) print line
}' second=file2 file1
will do it
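A self-contained run on the question's files:

```shell
printf 'line1\nline2\nline3\n' > file1
printf 'line1\nlineNEW\nline3\n' > file2

# For each line of file1: print it, read the corresponding line of file2
# via getline, and print that line too if the two differ.
merged=$(awk '{
  print
  getline line < second
  if ($0 != line) print line
}' second=file2 file1)
echo "$merged"
```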
Consider the command below. It is more robust, since it also works for files where a new line has been added instead of replaced (see f1 and f2 below).
First, I executed it using your files. I divided the command(s) into two lines so that it fits nicely in the "code block":
$ (awk '{ print NR, $0 }' file1; awk '{ print NR, $0 }' file2) |\
sort -k 2 | uniq -f 1 | sort | cut -d " " -f 2-
It produces your expected output:
line1
line2
lineNEW
line3
I also used these two extra files to test it:
f1:
line1 stuff after a tab
line2 line2
line3
line4
line5
line6
f2:
line1 stuff after a tab
lineNEW
line2 line2
line3
line4
line5
line6
Here is the command:
$ (awk '{ print NR, $0 }' f1; awk '{ print NR, $0 }' f2) |\
sort -k 2 | uniq -f 1 | sort | cut -d " " -f 2-
It produces this output:
line1 stuff after a tab
line2 line2
lineNEW
line3
line4
line5
line6
When you do not care about the order, just sort them:
cat ~/file1 ~/file2 | sort -u > ~/file3
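For example, on the question's files (sort can take both files directly, so cat is not needed; the C locale is forced here only so the ordering in the check below is stable):

```shell
printf 'line1\nline2\nline3\n' > file1
printf 'line1\nlineNEW\nline3\n' > file2

# Concatenate, sort, and drop duplicate lines in one step.
combined=$(LC_ALL=C sort -u file1 file2)
echo "$combined"
```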

replacing word in shell script or sed

I am a newbie, but would like to create a script which does the following.
Suppose I have a file of the form
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
I would like to replace it in the form
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
That is, at the start of each paragraph I would like to add the text \textbf{ and end that line with }. Is there a way to search for double end-of-lines? I am having trouble creating such a script with sed. Thank you!
Using awk you can write something like
$ awk '!f{ $0 = "\\textbf{"$0"}"; f++} 1; /^$/{f=0}' input
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
What does it do?
!f{ $0 = "\\textbf{"$0"}"; f++}
!f True if the value of f is 0. For the first line, since f is not yet set, this evaluates to true. When it is true, awk performs the action part {}
$0 = "\\textbf{"$0"}" adds \textbf{ and } around the line
f++ increments f so that subsequent lines do not enter this action part unless f is reset to zero
1 always true. Since its action part is missing, awk performs the default action and prints the entire line
/^$/ pattern that matches an empty line
{f=0} if the line is empty, set f=0 so that the next line is modified by the first action part to include the changes
An approach using sed
sed '/^$/{N;s/^\(\n\)\(.*\)/\1\\textbf{\2}/};1{s/\(.*\)/\\textbf{\1}/}' my_file
Find all lines that contain only a newline character and append the next line to the pattern space == /^$/{N;s/^\(\n\)\(.*\)/\1\\textbf{\2}/}
This marks the line below the blank line and modifies it.
Find the first line in the file and do the same == 1{s/\(.*\)/\\textbf{\1}/}
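Testing it on the question's input, shortened to two paragraphs (para.txt is a hypothetical file name):

```shell
printf 'This is line1\nThis is line2\n\nThis is line4\nThis is line5\n' > para.txt

# Rule 1 wraps the line that follows each blank line; rule 2 handles the
# very first line of the file, which no blank line precedes.
tagged=$(sed '/^$/{N;s/^\(\n\)\(.*\)/\1\\textbf{\2}/};1{s/\(.*\)/\\textbf{\1}/}' para.txt)
echo "$tagged"
```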
Just use awk's paragraph mode:
$ awk 'BEGIN{RS="";ORS="\n\n";FS=OFS="\n"} {$1="\\textbf{"$1"}"} 1' file
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
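A runnable version of the same command (para.txt is a hypothetical file name). With FS=OFS="\n", each line of a paragraph is a field, so assigning $1 rewrites just the first line of each paragraph:

```shell
printf 'This is line1\nThis is line2\n\nThis is line4\nThis is line5\n' > para.txt

# RS="" puts awk in paragraph mode; ORS="\n\n" restores the blank line
# between paragraphs on output.
tagged=$(awk 'BEGIN{RS="";ORS="\n\n";FS=OFS="\n"} {$1="\\textbf{"$1"}"} 1' para.txt)
echo "$tagged"
```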

Separate by blank lines in bash

I have an input like this:
Block 1:
line1
line2
line3
line4
Block 2:
line1
line2
Block 3:
line1
line2
line3
This is an example; is there an elegant way to print Block 2 and its lines only, without relying on their names? It would be like "separate the blocks by the blank line and print the second block".
try this:
awk '!$0{i++;next;}i==1' yourFile
Considering performance, you can also add exit after the 2nd block has been processed:
awk '!$0{i++;next;}i==1;i>1{exit;}' yourFile
test:
kent$ cat t
Block 1:
line1
line2
line3
line4
Block 2:
line1
line2
Block 3:
line1
line2
line3
kent$ awk '!$0{i++;next;}i==1' t
Block 2:
line1
line2
kent$ awk '!$0{i++;next;}i==1;i>1{exit;}' t
Block 2:
line1
line2
Set the record separator to the empty string to separate on blank lines. To print the second block:
$ awk -v RS= 'NR==2{ print }'
(Note that this only separates on lines that do not contain any whitespace.
A line containing only white space is not considered a blank line.)
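A self-contained run, with the input piped on stdin as in the answer:

```shell
# In paragraph mode (RS=""), each block is one record, so NR==2 selects
# the second block regardless of its name.
block2=$(printf 'Block 1:\nline1\nline2\n\nBlock 2:\nline1\nline2\n\nBlock 3:\nline1\n' |
  awk -v RS= 'NR==2{ print }')
echo "$block2"
```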

bash, sed, awk: extracting lines within a range

How can I get sed to extract the lines between two patterns, write that data to a file, and then extract the lines between the next range and write that text to another file? For example given the following input:
pattern_a
line1
line2
line3
pattern_b
pattern_a
line4
line5
line6
pattern_b
I want line1 line2 and line3 to appear in one file and line4 line5 and line6 to appear in another file. I can't see a way of doing this without using a loop and maintaining some state between iterations of the loop, where the state tells you where sed must start searching for the start pattern (pattern_a) again.
For example, in bash-like pseudocode:
while not done
    if [[ first ]]; then
        sed -n -e '/pattern_a/,/pattern_b/p' > $filename
    else
        sed -n -e '$linenumber,/pattern_b/p' > $filename
    fi
    linenumber = last_matched_line
    filename = new_filename
Is there a nifty way of doing this using sed? Or would awk be better?
How about this:
awk '/pattern_a/{f=1;c+=1;next}/pattern_b/{f=0;next}f{print > "outfile_"c}' input_file
This will create an outfile_x for every range.
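A quick run on the question's input (the redirection target is parenthesized here for portability across awk implementations, but is otherwise the same program):

```shell
printf 'pattern_a\nline1\nline2\nline3\npattern_b\npattern_a\nline4\nline5\nline6\npattern_b\n' > input_file

# f tracks whether we are inside a range; c numbers the ranges, and each
# in-range line is appended to the output file for the current range.
awk '/pattern_a/{f=1;c+=1;next}/pattern_b/{f=0;next}f{print > ("outfile_"c)}' input_file

first=$(cat outfile_1)
second=$(cat outfile_2)
echo "$first"
echo "$second"
```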
