I have a file that contains lines starting with a number, for example
1 This is the first line
2 this is the second line
3 this is the third line
4 This is the fourth line
I want to delete a line, for example line 2, and update the numbering so that the file looks like the following. I want to do this in a bash script.
1 This is the first line
2 this is the third line
3 This is the fourth line
Thanks
IMO it might be a little easier with awk:
awk '!/regex/ {$1=++x; print}' inputFile
Inside /.../, put a regex that matches the line to be deleted.
Test:
$ cat inputFile
1 This is the first line
2 this is the second line
3 this is the third line
4 This is the fourth line
$ awk '!/second/ {$1=++x; print}' inputFile
1 This is the first line
2 this is the third line
3 This is the fourth line
$ awk '!/third/ {$1=++x; print}' inputFile
1 This is the first line
2 this is the second line
3 This is the fourth line
$ awk '!/first/ {$1=++x; print}' inputFile
1 this is the second line
2 this is the third line
3 This is the fourth line
Note: because assigning to $1 makes awk rebuild the record, any run of whitespace between fields is collapsed to a single space.
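If the spacing inside the lines matters, one possible workaround is to rewrite only the leading number with sub() instead of assigning to $1. This is a minimal sketch, not part of the answer above: the function name renumber_without and its parameters are made up for illustration, and it selects the line by its number rather than by a regex.

```shell
# Hypothetical helper: drop the line whose leading number is $1 and renumber
# the remaining lines, leaving the spacing in the rest of each line untouched.
renumber_without() {
    # $1: number of the line to delete, $2: input file
    awk -v del="$1" '
        $1 != del {                # keep every line except the one numbered del
            sub(/^[0-9]+/, ++n)    # replace only the leading number
            print
        }
    ' "$2"
}
```

Calling renumber_without 2 inputFile should print the same three lines as the first test above, but with any repeated spaces preserved.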
You can use this set of commands:
grep -v '^2 ' file | cut -d' ' -f2- | nl -w1 -s' '
grep with the -v option removes line #2.
cut strips the first column, which is the line number.
Finally, nl renumbers the lines.
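Since the question asks for a bash script that updates the file, the pipeline can be wrapped in a function that writes to a temporary file and then copies the result back. This is only a sketch under assumptions: the function name delete_numbered_line and the variable dnl_tmp are hypothetical, and -ba is added to nl so that lines left empty after cut are still numbered.

```shell
# Hypothetical wrapper around the grep | cut | nl pipeline that updates the file itself.
delete_numbered_line() {
    # $1: number of the line to delete, $2: file to update
    dnl_tmp=$(mktemp) || return 1
    # Drop the numbered line, strip the old numbers, renumber everything.
    grep -v "^$1 " "$2" | cut -d' ' -f2- | nl -w1 -s' ' -ba > "$dnl_tmp" || return 1
    # Copy back instead of mv so the original file keeps its permissions.
    cat "$dnl_tmp" > "$2" && rm -f "$dnl_tmp"
}
```

For example, delete_numbered_line 2 file removes line 2 and leaves the remaining lines numbered 1 to 3.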
I need to delete a certain number of lines before a given text, but only if the lines before and after the searched string are empty.
E.g. (line number, content):
1
2
3 Hello
4
5 yellow
In this case, if the lines before and after the line containing Hello are empty (lines 2 and 4), I have to delete lines 3 down to 1.
I can delete lines 3 to 1 using tac and sed, but I am having difficulty adding that if condition.
tac file1|sed -e '/Hello/,+3d'|tac
This might work for you (GNU sed):
sed ':a;N;s/\n/&/3;Ta;/\n\n.*Hello.*\n$/s/.*\n//;ta;P;D' file
Gather up 4 lines in the pattern space and if the 2nd and the 4th are empty and the 3rd contains Hello, delete the first three lines and repeat. Otherwise print the first line and repeat.
If you are OK with awk, you could try the following.
awk -v string="Hello" '
FNR==NR{
a[FNR]=$0
next
}
($0==string) && a[FNR-1]=="" && a[FNR+1]==""{
a[FNR-1]=a[FNR]=a[FNR-2]="del_flag"
}
END{
for(i=1;i<=length(a);i++){
if(a[i]!="del_flag"){
print a[i]
}
}
}
' Input_file Input_file
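The same two-pass idea can also be sketched without calling length() on an array, which not every awk supports, and without touching array elements past the end of the file. This is an illustrative variant, not the answer's own code: the function name delete_before_match and the variable names are assumptions, and it matches the whole line exactly, like the script above.

```shell
# Hypothetical helper: read the file twice; pass 1 stores the lines,
# pass 2 prints only the lines that were not marked for deletion.
delete_before_match() {
    # $1: exact text of the line to look for, $2: input file
    awk -v s="$1" '
        NR == FNR { line[FNR] = $0; n = FNR; next }    # pass 1: remember every line
        FNR == 1 {                                     # start of pass 2: mark deletions
            for (i = 2; i < n; i++)
                if ((line[i] "") == (s "") && line[i-1] == "" && line[i+1] == "")
                    for (j = i - 2; j <= i; j++) del[j] = 1
        }
        !(FNR in del)                                  # print unmarked lines
    ' "$2" "$2"
}
```

With the five-line example from the question, delete_before_match Hello file1 should leave only the empty line 4 and the line yellow.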
With GNU sed option -z you can match
some_line
empty line
line With Hello
empty line
and replace this with an empty line.
sed -rz 's/(^|\n)[^\n]*\n\nHello\n\n/\1\n/g' file1
EDIT: added g for multiple segments.
Suppose I have an input file with lines of text:
line 1
line 2
line 3
line 4
line 2
now suppose I would like to check if my inputfile contains
line 2
line 3
and remove that block of text if it is found. This would give:
line 1
line 4
line 2
Note that I don't want to remove just every occurrence of line 2 or line 3; but only if they are found one after another. (In reality I want to check for a block of 5 lines, and not just any block of code between two placeholders, but let's keep the example simple).
I looked into awk, but that gets complicated very quickly. (I haven't finished this yet, since I feel it is not the right approach and will explode with 5 lines...)
awk '/line 2/ {if (line0) {print line0; line0=""}; line0=$0}' input.txt
One way with GNU awk for multi-char RS and RT:
$ awk -v RS='(^|\n)line 2\nline 3\n' '{ORS=(RT ~ /^\n/ ? "\n" : "")} 1' file
line 1
line 4
line 2
With any awk:
$ cat file
line 2
line 3
line 1
line 2
line 3
line 4
line 2
line 3
$ awk '
{ rec = rec $0 RS }
END {
rec = RS rec
gsub(/\nline 2\nline 3\n/,RS,rec)
gsub(/^\n|\n$/,"",rec)
print rec
}
' file
line 1
line 4
The above assumes you want to match using regexps since that's what your posted code does. If you want to do literal string matches instead that's do-able too with some massaging:
$ cat tst.awk
{ rec = rec $0 RS }
END {
while ( beg = index(RS rec,RS block RS) ) {
out = out substr(RS rec,1,beg-1)
rec = substr(RS rec,beg+length(block)+2)
}
print substr(out rec,2)
}
$ awk -v block='line 2\nline 3' -f tst.awk file
line 1
line 4
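For a longer block (the question mentions five lines), another option is to keep the block in its own file and compare it against a sliding window of input lines, so nothing needs escaping. The following is a rough sketch under assumptions: the function name remove_block and the use of a separate, non-empty block file are not from the answers here.

```shell
# Hypothetical helper: remove every occurrence of a multi-line block using
# literal, line-by-line comparison over a sliding window.
remove_block() {
    # $1: file holding the block to remove (must not be empty), $2: input file
    awk '
        NR == FNR { want[++n] = $0; next }       # first file: the block, line by line
        {
            buf[++m] = $0                        # queue the current input line
            if (m == n) {                        # window is full: compare with the block
                same = 1
                for (i = 1; i <= n; i++)
                    if ((buf[i] "") != (want[i] "")) { same = 0; break }   # force string comparison
                if (same) {
                    m = 0                        # whole block matched: drop it
                } else {
                    print buf[1]                 # release the oldest line
                    for (i = 1; i < m; i++) buf[i] = buf[i+1]
                    m--
                }
            }
        }
        END { for (i = 1; i <= m; i++) print buf[i] }   # flush what is left
    ' "$1" "$2"
}
```

Here the block file would contain the two lines line 2 and line 3, and the call would be remove_block block.txt file, where block.txt is a made-up file name.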
Not awk, but this is straightforward with Perl 5, as @triplee pointed out. With the five-line input file you showed above saved as foo.txt:
perl -0777 -pe 's{^line 2\nline 3\n}{}gm' foo.txt
produces the desired three-line output.
Explanation:
-0777 causes perl to read the entire input as one string (see perlrun).
The /m modifier on the regex causes ^ to match at the beginning of a line (see perlre).
Edit: ^ will also match at the beginning of the file, so you can detect blocks of lines even if there is not a newline before them.
The separators between the lines are literal \ns because $ matches before the \n with the /m modifier. Therefore, it's easier just to match the \n.
Thanks to this U&L SE answer by Stéphane Chazelas for the basics.
With GNU sed:
sed -z 's/line 2\nline 3\n//g;s/line 2\nline 3\n$//' infile
This might work for you (GNU sed):
sed '/^line 2$/!b;N;/^line 3$/Md;P;D' file
If a line does not match the string line 2, print it and begin the next cycle. Otherwise, append the following line and if that does match the string line 3, delete both lines. Otherwise, print then delete the first line and repeat.
I have a long text file comprised of numbers, such as:
1
2
9.252
9.252
9.272
1
1
6.11
6.11
6.129
I would like to keep the first line, delete the subsequent three and then keep the next one, and repeat this process for the whole file. Following that logic, given the input above, I would like to have the following output:
1
9.272
1
6.129
Using GNU sed (needed for the ~ extension):
sed -n '1~5p;5~5p' file
Saving your numbers in a "textfile.txt" I can use the following with sed:
sed -n 'p;n;n;n;n;p;' textfile.txt
Sed prints the first line, reads the next 4 and prints the last line.
Or the following using while read in bash:
while read -r firstline && read -r nextone1 && read -r nextone2 && read -r nextone3 && read -r lastone; do
printf "%s\n" "$firstline" "$lastone";
done < textfile.txt
This just reads 5 lines at a time and prints only the first and 5th lines.
You can simply say:
awk 'NR%5<2' input.txt
Explanation: since the pattern repeats every five lines, take the line number NR modulo five. The 1st line of each five-line block yields 1 and the 5th line yields 0, while the 2nd, 3rd and 4th lines yield 2, 3 and 4. Testing for a result less than two therefore selects exactly the 1st and 5th lines.
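To see the arithmetic for yourself, a small helper can print each line number next to its remainder and the resulting decision. This is only an illustration; the function name show_mod is made up.

```shell
# Hypothetical helper: show NR, NR % 5 and whether NR % 5 < 2 keeps the line.
show_mod() {
    # $1: input file
    awk '{ printf "%d %d %s\n", NR, NR % 5, (NR % 5 < 2 ? "keep" : "drop") }' "$1"
}
```

Running show_mod input.txt lists one row per input line, so the keep/drop pattern of each block of five is visible at a glance.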
To print the 1st and 5th line of every block of 5 lines (remember that 5%5 = 0):
$ awk '(NR%5) ~ /[10]/' file
1
9.272
1
6.129
If you want to print the 2nd, 3rd, and 4th line of every block of 5 lines instead of the 1st and 5th:
$ awk '(NR%5) ~ /[234]/' file
2
9.252
9.252
1
6.11
6.11
If you wanted to print the 27th and 53rd line of every block of 100:
awk '(NR%100) ~ /^(27|53)$/' file
We couldn't use a bracket expression there as we're now beyond single char numbers.
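If the positions change often, the regexp can be avoided by passing the block size and the wanted positions as parameters and looking the remainder up in an array. A minimal sketch, assuming a comma-separated list of positions; the function name keep_positions and its parameters are hypothetical.

```shell
# Hypothetical helper: print the given positions of every block of n lines.
keep_positions() {
    # $1: block size, $2: comma-separated positions within a block, $3: input file
    awk -v n="$1" -v list="$2" '
        BEGIN {
            k = split(list, p, ",")
            for (i = 1; i <= k; i++) keep[p[i] % n] = 1   # position n is stored as remainder 0
        }
        (NR % n) in keep
    ' "$3"
}
```

For instance, keep_positions 5 1,5 file should match the first example above, and keep_positions 100 27,53 file the last one.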
This might work for you (GNU sed):
sed '2~5,+2d' file
For each line whose number is 2 modulo 5 (2, 7, 12, ...), delete that line and the two lines that follow it.
An alternative:
sed -n '1p;5~5,+1p' file
Considering your groups are packed as 5 lines, you could use awk with a mod 5 operation.
awk '{i=(NR-1)%5;if(i==0||i==4)print $0}' input.txt
With indentation it looks like this:
{
i=(NR-1)%5;
if (i==0||i==4)
print $0;
}
i=(NR-1)%5 takes the line number and computes it modulo 5; since line numbers start at 1 (instead of 0), you need to subtract 1 from it before computing the modulo.
This leaves you with an integer i that ranges from 0 to 4. You want to print the first line (index 0), skip the next three lines (indexes 1-3) and print the last line (index 4), which is exactly what if (i==0||i==4) print $0 does.
Alternatively, you can do the same thing with a shorter (and possibly slightly faster) version:
awk '((NR-1)%5==0||(NR-1)%5==4)' input.txt
This tells awk to do something for every 1st out of 5 lines and every 5th out of 5 lines. Since the "something" is not defined, by default it outputs the current line. If it helps, this is strictly equivalent to:
awk '((NR-1)%5==0||(NR-1)%5==4){print $0}' input.txt
I have a text file like this:
line 1
line 2
*
line 3
*
line 4
line 5
line 6
*
line 7
line 8
I would like to write out parts which are between the two patterns (* in this case). So if I want the first section, I want to get
line 1
line 2
If I want to get the third one it should be
line 4
line 5
line 6
The returned lines should be without the asterisk, and it is important that there is no asterisk at the beginning or at the end.
I was thinking about "splitting" the whole text into columns using '*' as the delimiter with sed or awk, but I did not succeed. Could anyone help? Thanks a lot.
This was what I tried:
sed "/^%/{x;s/^/X/;/^X\\{$choice\\}$/ba;x};d;:a;x;:b;$!{n;/^%/!bb}" "$file"
but this needs to have an * at the beginning and it also prints the asterisks before and after.
$ awk -v num=3 '$0=="*"{ if (++count >= num) exit; next } num-1==count' data
line 4
line 5
line 6
$ awk -v num=1 '$0=="*"{ if (++count >= num) exit; next } num-1==count' data
line 1
line 2
awk to the rescue
$ awk -v RS='\n\\*\n' -v n=3 'NR==n' file
line 4
line 5
line 6
This requires multi-character record separator support (gawk).
Another alternative, with counting stars
$ awk -v n=3 '/^\*/{c++;next} c==n-1' file
line 4
line 5
line 6
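In a script it can be handy to know whether the requested section exists at all. The star-counting approach can be wrapped so that it returns a non-zero status when nothing was printed. This is a sketch under assumptions: the function name get_section is made up, a separator is a line consisting of a single *, and an empty section is treated the same as a missing one.

```shell
# Hypothetical helper: print section n (1-based) of a file whose sections are
# separated by lines containing only "*"; fail if the section has no lines.
get_section() {
    # $1: section number, $2: input file
    awk -v n="$1" '
        $0 == "*" { if (++c >= n) exit; next }   # separator: stop once section n is complete
        c == n - 1 { print; found = 1 }          # inside the wanted section
        END { exit !found }                      # status 1 when nothing was printed
    ' "$2"
}
```

Usage could look like get_section 3 file || echo "no such section".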
I want to insert a line of text, let's say "Hello", at the 3rd line of the file, followed by an empty line:
1st
2nd
Hello
3rd
How can I do that?
Very straightforward with awk:
$ cat file
1
2
3
4
5
6
$ awk 'NR==3{print "hello\n"}1' file
1
2
hello
3
4
5
6
Here NR is the current line number. Change the 3 in NR==3 to whichever line you want the text inserted before.
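The line number and the text can also be passed in as awk variables, which avoids editing the program for every call. A minimal sketch with made-up names (insert_before, n, text); note that awk interprets backslash escapes in values given with -v, so text containing backslashes would need extra care.

```shell
# Hypothetical helper: insert a line of text plus an empty line before line n.
insert_before() {
    # $1: line number, $2: text to insert, $3: input file
    awk -v n="$1" -v text="$2" '
        NR == n { print text; print "" }   # emit the new line and an empty line first
        { print }                          # then the original line
    ' "$3"
}
```

For the example in the question this would be called as insert_before 3 Hello file.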
Does it have to be sed?
head -n 2 infile ; echo Hello ; echo ; tail -n +3 infile
$ sed '3s/^/Hello\n\n/' file.txt
1st
2nd
Hello
3rd
The 3 at the beginning of the sed command specifies that the command should be applied to line 3 only. Thus, the command 3s/^/Hello\n\n/ inserts "Hello" and two newlines at the beginning of line 3 (^ matches the beginning of a line). The rest of the file is left unchanged.
sed '3 i\
Hello\
' YourFile
This inserts the text that follows i\ before line 3. The trailing \ after Hello continues the inserted text onto a second, empty line.