How can I remove the line before a matching pattern?
Or, to put it another way, the opposite of:
sed -n '/pattern/{g;1!p;};h'
Use tac | sed | tac (Linux/Solaris); once the file is reversed, the problem becomes deleting the next line after a matching pattern :)
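Spelled out, the pipeline would look something like this (a sketch, assuming GNU tac; the sed deletes the line after the match in the reversed stream, which is the line before it in the original):
tac file | sed '/pattern/{n;d;}' | tac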
sed is an excellent tool for simple substitutions on a single line; for anything else, just use awk:
$ cat file
here is a
a bad line before
a good line
in a file
$ awk 'NR==FNR{if (/good/) del[NR-1]; next} !(FNR in del)' file file
here is a
a good line
in a file
You can use the above idiom to delete any number of lines before and/or after a given pattern, e.g. to delete the 3 lines before and 2 lines after a given target:
$ cat file
-5
-4
-3
-2
-1
target
+1
+2
+3
+4
+5
$
$ awk 'NR==FNR{if (/target/) for (i=-3;i<=2;i++) del[NR+i]; next} !(FNR in del)' file file
-5
-4
+3
+4
+5
or to leave the target in place and just delete the lines around it:
$ awk 'NR==FNR{if (/target/) for (i=-3;i<=2;i++) if (i!=0) del[NR+i]; next} !(FNR in del)' file file
-5
-4
target
+3
+4
+5
All very clear, trivial, and scalable...
For "relatively complex" navigation around a search expression, ed might be a good solution (comments are not part of the command):
ed testfile << EOF
/r.*o/ # Search the pattern
-1d # delete one line above
w # write
EOF
Here is an example (using <<< and \n to write as a single line):
sh$ cat testfile
john
paul
george
ringo
sh$ ed testfile <<< $'/r.*o/\n-1d\nw'
23
ringo
16
sh$ cat testfile
john
paul
ringo
You can reverse the file, delete the line after the matched pattern (which is simple), and then reverse the result again. Note that tail -r is a BSD feature; on GNU systems use tac instead. Here is the code:
tail -r|sed '/pattern/{n;d;}'|tail -r
Here is another awk:
awk '/pattern/ {f=1} !f&&NR>1 {print p} {p=$0;f=0} END {print p}' file
A tac awk version:
tac file | awk '1; /pattern/ {getline}' | tac
PS getline should normally be avoided since it has many pitfalls, so then this:
tac file | awk '!p||NR!=p+1; /pattern/ {p=NR}' | tac
This might work for you (GNU sed):
sed '$!N;/\n.*pattern/!P;D' file
Keep a window of 2 lines and test the second of them for the pattern. If the pattern is present do not print the first line.
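For instance, on the sample file from the first awk answer, with good as the pattern, this should give the same result:
$ sed '$!N;/\n.*good/!P;D' file
here is a
a good line
in a file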
This would also work, but only if the file has more than one line and the pattern does not occur on the last line.
sed -n '/pattern/ { h; b }; 1 { h; b }; ${ H }; x; p' file
Better use awk instead:
awk '!/pattern/ && NR > 1 { print p } { p = $0 } END { if (NR) print p }' file
Related
What's the easiest/quickest way to interleave the lines of two (or more) text files? Example:
File 1:
line1.1
line1.2
line1.3
File 2:
line2.1
line2.2
line2.3
Interleaved:
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3
Sure, it's easy to write a little Perl script that opens them both and does the task. But I was wondering if it's possible to get away with less code, maybe a one-liner using Unix tools?
paste -d '\n' file1 file2
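With the file1 and file2 from the question, that should produce exactly the interleaved output shown above:
$ paste -d '\n' file1 file2
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3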
Here's a solution using awk:
awk '{print; if(getline < "file2") print}' file1
produces this output:
line 1 from file1
line 1 from file2
line 2 from file1
line 2 from file2
...etc
Using awk can be useful if you want to add some extra formatting to the output, for example if you want to label each line based on which file it comes from:
awk '{print "1: "$0; if(getline < "file2") print "2: "$0}' file1
produces this output:
1: line 1 from file1
2: line 1 from file2
1: line 2 from file1
2: line 2 from file2
...etc
Note: this code assumes that file1 is at least as long as file2.
If file1 contains more lines than file2 and you want to output blank lines for file2 after it finishes, add an else clause to the getline test:
awk '{print; if(getline < "file2") print; else print ""}' file1
or
awk '{print "1: "$0; if(getline < "file2") print "2: "$0; else print"2: "}' file1
@Sujoy's answer points in a useful direction. You can add line numbers, sort, and strip the line numbers:
(cat -n file1 ; cat -n file2 ) | sort -n | cut -f2-
Note (of interest to me) this needs a little more work to get the ordering right if instead of static files you use the output of commands that may run slower or faster than one another. In that case you need to add/sort/remove another tag in addition to the line numbers:
(cat -n <(command1...) | sed 's/^/1\t/' ; cat -n <(command2...) | sed 's/^/2\t/' ; cat -n <(command3) | sed 's/^/3\t/' ) \
| sort -n | cut -f2- | sort -n | cut -f2-
With GNU sed:
sed 'R file2' file1
Output:
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3
Here's a GUI way to do it: Paste them into two columns in a spreadsheet, copy all cells out, then use regular expressions to replace tabs with newlines.
cat file1 file2 | sort -t. -k 2.1
Here it is specified that the separator is "." and that we are sorting on the first character of the second field.
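This relies on the numbering embedded in the sample lines themselves; with the question's file1 and file2 (sort's last-resort whole-line comparison keeps file1's line first on ties) it should come out interleaved:
$ cat file1 file2 | sort -t. -k 2.1
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3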
I have a text file like this:
1.2.3.t
1.2.4.t
complete
I need to print the last non-blank line and the second-to-last line as two variables. The output should be:
a=1.2.4.t
b=complete
I tried this for the last line:
b=$(awk '/./{line=$0} END{print line}' myfile)
but I have no idea for a.
grep . file | tail -n 2 | sed 's/^ *//;1s/^/a=/;2s/^/b=/'
Output:
a=1.2.4.t
b=complete
awk to the rescue!
$ awk 'NF{a=b;b=$0} END{print "a="a;print "b="b}' file
a=1.2.4.t
b=complete
Or, if you want the real variable assignment:
$ awk 'NF{a=b;b=$0} END{print a, b}' file | read a b; echo "a=$a"; echo "b=$b"
a=1.2.4.t
b=complete
You may need the -r option for read if the values contain backslashes.
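One caveat: in bash, the read at the end of a pipeline runs in a subshell, so a and b will not survive into the current shell (the pipeline form works in ksh or zsh, where the last pipeline component runs in the current shell). A sketch that sets them in bash using process substitution:
read a b < <(awk 'NF{a=b;b=$0} END{print a, b}' file)
echo "a=$a"
echo "b=$b"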
I have a file with ~700,000 lines and I would like to remove a bunch of specific lines (~30,000) using bash scripting or another method.
I know I can remove lines using sed:
sed -i.bak -e '1d;34d;45d;678d' myfile.txt # an example
I have the line numbers in a text file, but I don't know if I can use it as input to sed; maybe perl?
Thanks
A few options:
sed -f <(sed 's/$/d/' lines_file) data_file
awk 'NR==FNR {del[$1]; next} !(FNR in del)' lines_file data_file
perl -MPath::Class -e '
%del = map {$_ => 1} file("lines_file")->slurp(chomp => 1);
$f = file("data_file")->openr();
while (<$f>) {
print unless $del{$.};
}
'
perl -ne'
BEGIN{ local @ARGV = pop; @h{<>} = () }
exists $h{"$.\n"} or print;
' myfile.txt lines
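As a quick sanity check of the awk option, with a hypothetical five-line data_file and a lines_file listing lines 2 and 4 (the file names are just placeholders):
$ printf '%s\n' 2 4 > lines_file
$ printf 'line %s\n' 1 2 3 4 5 > data_file
$ awk 'NR==FNR {del[$1]; next} !(FNR in del)' lines_file data_file
line 1
line 3
line 5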
You can remove the lines using a sed script file.
First make a list of lines to remove (one line number per line).
$ cat lines
1
34
45
678
Convert this file to sed format.
$ sed -e 's|$| d|' lines >lines.sed
$ cat lines.sed
1 d
34 d
45 d
678 d
Now give this file to the sed command as its script.
$ sed -i.bak -f lines.sed file_with_70k_lines
This will remove the lines.
If you can create a text file of the format
1d
34d
45d
678d
then you can run something like
sed -i.bak -f scriptfile datafile
You can use a genuine editor for that, and ed is the standard editor.
I'm assuming your lines are in a file lines.txt, one number per line, e.g.,
1
34
45
678
Then (with a blatant bashism):
ed -s file.txt < <(sed -n '/^[[:digit:]]\+$/p' lines.txt | sort -nr | sed 's/$/d/'; printf '%s\n' w q)
The first sed selects only the numbers from lines.txt (just in case).
There's something special to take into account here: when you delete line 1, line 34 of the original file becomes line 33. So it's better to remove the lines from the end: start with 678, then 45, and so on; that's why we use sort -nr (to sort the numbers in reverse order). A final sed appends d (ed's delete command) to each number.
Then we issue the w (write) and q (quit) commands.
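For the lines.txt above, the script actually fed to ed would look like this (reverse-sorted deletes, then write and quit):
678d
45d
34d
1d
w
q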
Note that this overwrites the original file!
Is there a shell command to pick the n-th line of a string ?
Example:
line1
line2
line3
pick line 2.
UPDATE: Thank you so far. With your help, I came up with this solution for a string:
Pick the 2nd line:
echo -e "1\n2\n3" | head -2 | tail -1
$ head -n <N> filename | tail -1
where <N> is your line number. But it's a little inefficient, launching two processes.
Alternatively sed can do this. To print the 4th line:
$ sed -n 4p filename
This forum answer details 3 different methods for sed
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
Using gawk:
gawk -v n=3 'n==NR { print; exit }' a.txt
head -4 a.txt | tail -1
To print the 4th line of a.txt.
I have one of my large file as
foo:43:sdfasd:daasf
bar:51:werrwr:asdfa
qux:34:werdfs:asdfa
foo:234:dfasdf:dasf
qux:345:dsfasd:erwe
...............
Here the 1st column (foo, bar, qux, etc.) contains file names, and the 2nd column (43, 51, 34, etc.) contains line numbers. I want to print the Nth line (specified by the 2nd column) of each file (specified by the 1st column).
How can I automate the above in a Unix shell?
Actually, the above file is generated during compilation, and I want to print the warning lines from the code.
-Thanks,
while IFS=: read name line rest
do
head -n $line $name | tail -1
done < input.txt
while IFS=: read file line message; do
echo "$file:$line - $message:"
sed -n "${line}p" "$file"
done <yourfilehere
awk 'NR==4 {print}' yourfilename
or
cat yourfilename | awk 'NR==4 {print}'
The above will print the 4th line of your file. You can change the number as per your requirement.
Just awk, but probably with worse performance than the answers by @kev or @MarkReed.
However, it does process each file just once. Requires GNU awk (for asort):
gawk -F: '
BEGIN {OFS=FS}
{
files[$1] = 1
lines[$1] = lines[$1] " " $2
msgs[$1, $2] = $3
}
END {
for (file in files) {
split(lines[file], l, " ")
n = asort(l)
count = 0
for (i=1; i<=n; i++) {
while (++count <= l[i])
getline line < file
print file, l[i], msgs[file, l[i]]
print line
}
close(file)
}
}
'
This might work for you:
sed 's/^\([^,]*\),\([^,]*\).*/sed -n "\2p" \1/' file |
sort -k4,4 |
sed ':a;$!N;s/^\(.*\)\(".*\)\n.*"\(.*\)\2/\1;\3\2/;ta;P;D' |
sh
sed -nr '3{s/^([^:]*):([^:]*):.*$/\1 \2/;p}' namesNnumbers.txt
qux 34
-n no output by default,
-r extended regular expressions (simplifies the use of parentheses)
on line 3, do {...;p} (print at the end)
s: substitute the whole "foo:bar:baz..." line with "foo bar" (i.e., keep the first two fields)
So to work with the values:
fnUln=$(sed -nr '3{s/^([^:]*):([^:]*):.*$/\1 \2/;p}' namesNnumbers.txt)
fn=$(echo ${fnUln/ */})
ln=$(echo ${fnUln/* /})
sed -n "${ln}p" "$fn"