I'm having the following file content.
2013-07-30 debug
line1
2013-07-30 info
line2
line3
2013-07-30 debug
line4
line5
I want to get the following output with sed.
2013-07-30 info
line2
line3
This command gives me nearly the output I want
sed -n '/info/I,/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/{p}' myfile.txt
2013-07-30 info
line2
line3
2013-07-30 debug
How do I omit the last line here?
IMO, sed starts to become unwieldy as soon as you have to add conditions into it. I realize you did not tag the question with awk, but here is an awk program to print only "info" sections.
awk -v type="info" '
$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ {p = ($2 == type)}
p
' myfile.txt
2013-07-30 info
line2
line3
Try:
sed -n '/info/I p; //,/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/{ //! p}' myfile.txt
It prints first match, and in range omits both edges but the first one is already printed, so only skips the second one. It yields:
2013-07-30 info
line2
line3
This might work for you (GNU sed):
sed -r '/info/I{:a;n;/^[0-9]{4}(-[0-9]{2}){2}/!ba;s/^/\n/;D};d' file
or if you prefer:
sed '/info/I{:a;n;/^....-..-.. /!ba;s/^/\n/;D};d' file
N.B. This caters for consecutive patterns
Related
I have a bunch of files with many lines in them, and usually one or two blank lines at the end.
I want to remove the blank lines at the end, while keeping all of the blank lines that may exist within the file.
I want to restrict the operation to the use of GNU utilities or similar, i.e. bash, sed, awk, cut, grep etc.
I know that I can easily remove all blank lines, with something like:
sed '/^$/d'
But I want to keep blank lines which exist prior to further content in the file.
File input might be as follows:
line1
line2
line4
line5
I'd want the output to look like:
line1
line2
line4
line5
All files are <100K, and we can make temporary copies.
With Perl:
perl -0777 -pe 's/\n*$//; s/$/\n/' file
Second S command (s/$/\n/) appends again a newline to end of your file to be POSIX compilant.
Or shorter:
perl -0777 -pe 's/\n*$/\n/' file
With Fela Maslen's comment to edit files in place (-i) and glob all elements in current directory (*):
perl -0777 -pe 's/\n*$/\n/' -i *
If lines containing just space chars are to be considered empty:
$ tac file | awk 'NF{f=1}f' | tac
line1
line2
line4
line5
otherwise:
$ tac file | awk '/./{f=1}f' | tac
line1
line2
line4
line5
Here is an awk solution (Standard linux gawk). I enjoyed writing.
single line:
awk '/^\s*$/{s=s $0 ORS; next}{print s $0; s=""}' input.txt
using a readable script script.awk
/^\s*$/{skippedLines = skippedLines $0 ORS; next}
{print skippedLines $0; skippedLines= ""}
explanation:
/^\s*$/ { # for each empty line
skippedLines = skippedLines $0 ORS; # pad string of newlines
next; # skip to next input line
}
{ # for each non empty line
print skippedLines $0; # print any skippedLines and current input line
skippedLines= ""; # reset skippedLines
}
This might work for you (GNU sed):
sed ':a;/\S/{n;ba};$d;N;ba' file
If the current line contains a non-space character, print the current pattern space, fetch the next line and repeat. If the current line(s) is/are empty and it is the last line in the file, delete the pattern space, otherwise append the next line and repeat.
Suppose we have ~/file1:
line1
line2
line3
...and ~/file2:
line1
lineNEW
line3
Notice that thes two files are nearly identical, except line2 differs from lineNEW.
Question: How can I merge these two files to produce one that reads as follows:
line1
line2
lineNEW
line3
That is, how can I merge the two files so that all unique lines are captured (without overlap) into a third file? Note that the order of the lines doesn't matter (as long as all unique lines are being captured).
awk '{
print
getline line < second
if ($0 != line) print line
}' second=file2 file1
will do it
Considered the command below. It is more robust since it also works for files where a new line has been added instead of replaced (see f1 and f2 below).
First, I executed it using your files. I divided the command(s) into two lines so that it fits nicely in the "code block":
$ (awk '{ print NR, $0 }' file1; awk '{ print NR, $0 }' file2) |\
sort -k 2 | uniq -f 1 | sort | cut -d " " -f 2-
It produces your expected output:
line1
line2
lineNEW
line3
I also used these two extra files to test it:
f1:
line1 stuff after a tab
line2 line2
line3
line4
line5
line6
f2:
line1 stuff after a tab
lineNEW
line2 line2
line3
line4
line5
line6
Here is the command:
$ (awk '{ print NR, $0 }' f1; awk '{ print NR, $0 }' f2) |\
sort -k 2 | uniq -f 1 | sort | cut -d " " -f 2-
It produces this output:
line1 stuff after a tab
line2 line2
lineNEW
line3
line4
line5
line6
When you do not care about the order, just sort them:
cat ~/file1 ~/file2 | sort -u > ~/file3
I am a newbie, but would like to create a script which does the following.
Suppose I have a file of the form
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
I would like to replace it in the form
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
That is, at the start of the paragraph I would like to add a text \textbf{ and end the line with }. Is there a way to search for double end of lines? I am having trouble creating such a script with sed. Thank you !
Using awk you can write something like
$ awk '!f{ $0 = "\\textbf{"$0"}"; f++} 1; /^$/{f=0}' input
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
What it does?
!f{ $0 = "\\textbf{"$0"}"; f++}
!f True if value of f is 0. For the first line, since the value of f is not set, will evaluates true. If its true, awk performs tha action part {}
$0 = "\\textbf{"$0"}" adds \textbf{ and } to the line
f++ increments the value of f so that it may not enter into this action part, unless f is set to zero
1 always True. Since action part is missing, awk performs the default action to print the entire line
/^$/ Pattern matches an empty line
{f=0} If the line is empty, then set f=0 so that the next line is modfied by the first action part to include the changes
An approach using sed
sed '/^$/{N;s/^\(\n\)\(.*\)/\1\\textbf{\2}/};1{s/\(.*\)/\\textbf{\1}/}' my_file
find all lines that only have a newline character and then add the next line to it. ==
^$/{N;s/^\(\n\)\(.*\)/\1\\textbf{\2}/}
mark the line below the blank line and modify it
find the first line in the file and do the same == 1{s/\(.*\)/\\textbf{\1}/}
Just use awk's paragraph mode:
$ awk 'BEGIN{RS="";ORS="\n\n";FS=OFS="\n"} {$1="\\textbf{"$1"}"} 1' file
\textbf{This is line1}
This is line2
This is line3
\textbf{This is line4}
This is line5
This is line6
I have an input like this:
Block 1:
line1
line2
line3
line4
Block 2:
line1
line2
Block 3:
line1
line2
line3
This is an example, is there an elegant way to print Block 2 and its lines only without rely on their names? It would be like "separate the blocks by the blank line and print the second block".
try this:
awk '!$0{i++;next;}i==1' yourFile
considering performance, also can add exit after 2nd block was processed:
awk '!$0{i++;next;}i==1;i>1{exit;}' yourFile
test:
kent$ cat t
Block 1:
line1
line2
line3
line4
Block 2:
line1
line2
Block 3:
line1
line2
line3
kent$ awk '!$0{i++;next;}i==1' t
Block 2:
line1
line2
kent$ awk '!$0{i++;next;}i==1;i>1{exit;}' t
Block 2:
line1
line2
Set the record separater to the empty string to separate on blank lines. To
print the second block:
$ awk -v RS= 'NR==2{ print }'
(Note that this only separates on lines that do not contain any whitespace.
A line containing only white space is not considered a blank line.)
How can I get sed to extract the lines between two patterns, write that data to a file, and then extract the lines between the next range and write that text to another file? For example given the following input:
pattern_a
line1
line2
line3
pattern_b
pattern_a
line4
line5
line6
pattern_b
I want line1 line2 and line3 to appear in one file and line4 line5 and line6 to appear in another file. I can't see a way of doing this without using a loop and maintaining some state between iterations of the loop where the state tells you where sed must start start search to looking for the start pattern (pattern_a) again.
For example, in bash-like psuedocode:
while not done
if [[ first ]]; then
sed -n -e '/pattern_a/,/pattern_b/p' > $filename
else
sed -n -e '$linenumber,/pattern_b/p' > $filename
fi
linenumber = last_matched_line
filename = new_filename
Is there a nifty way of doing this using sed? Or would awk be better?
How about this:
awk '/pattern_a/{f=1;c+=1;next}/pattern_b/{f=0;next}f{print > "outfile_"c}' input_file
This will create a outfile_x for every range.