sed substitute and show line number - bash

I'm working in bash trying to use sed substitution on a file and show both the line number where the substitution occurred and the final version of the line. For a file with lines that contain foo, trying with
sed -n 's/foo/bar/gp' filename
will show me the lines where substitution occurred, but I can't figure out how to include the line number. If I try to use = as a flag to print the current line number like
sed -n 's/foo/bar/gp=' filename
I get
sed: -e expression #1, char 14: unknown option to `s'
I can accomplish the goal with awk like
awk '{if (sub("foo","bar",$0)){print NR $0}}' filename
but I'm curious if there's a way to do this with one line of sed. If possible I'd love to use a single sed statement without a pipe.

I can't think of a way to do it without listing the search pattern twice and using command grouping.
sed -n "/foo/{s/foo/bar/g;=;p;}" filename
EDIT: mklement0 helped me out there by mentioning that if the regular expression is empty, sed reuses the last regular expression it used, as mentioned in the manual. So you could get away with it like this:
sed -n "/foo/{s//bar/g;=;p;}" filename
Before that, I figured out a way not to repeat the pattern, but it uses branches and labels. "In most cases," the docs specify, "use of these commands indicates that you are probably better off programming in something like awk or Perl. But occasionally one is committed to sticking with sed, and these commands can enable one to write quite convoluted scripts." [source]
sed -n "s/foo/bar/g;tp;b;:p;=;p" filename
This does the following:
s/foo/bar/g does your substitution.
tp will jump to :p iff a substitution happened.
b (branch with no label) will process the next line.
:p defines label p, which is the target for the tp command above.
= and p will print the line number and then the line.
End of script, so go back and process the next line.
See? Much less readable...and maybe a distant cousin of :(){ :|:& };:. :)
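To see what either form actually prints, here is a minimal sketch run against a made-up four-line file (the file contents and the variable names are assumptions for illustration only). Note that = writes the line number on its own line, ahead of the modified line, rather than on the same line as the awk version does.

```shell
# Made-up sample input; only lines 2 and 4 contain "foo".
sample=$(mktemp)
printf '%s\n' 'alpha' 'foo one foo' 'beta' 'two foo' > "$sample"

# Grouped form: the empty regex in s//bar/g reuses /foo/.
grouped=$(sed -n '/foo/{s//bar/g;=;p;}' "$sample")

# Branch form, one command per -e so each label ends with its expression.
branched=$(sed -n -e 's/foo/bar/g' -e 'tp' -e 'b' -e ':p' -e '=' -e 'p' "$sample")

printf '%s\n' "$grouped"
rm -f "$sample"
```

Both variables should end up holding the same four lines: 2, bar one bar, 4, two bar.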

It cannot be done in any reasonable way with sed. Here's how to do it clearly and simply in awk:
awk 'gsub(/foo/,"bar"){print NR, $0}' filename
sed is an excellent tool for simple substitutions on a single line; for anything else, use awk.

Related

How to insert a specific character at a specific line of a file using sed or awk?

I want to use a command to edit a specific line of a file instead of using vi. If the line starts with #, remove the # to uncomment it; otherwise, add a # to comment it out. I'd like to use sed or awk, but it doesn't work as expected.
This is the file.
what are you doing now?
what are you gonna do? stab me?
this is interesting.
This is a test.
go big
don't be rude.
For example, I just want to add a # at the beginning of line 4 (This is a test.) if it doesn't start with #, and remove the # if it does.
I've already tried via sed & gawk (awk)
gawk -i inplace '$1!="#" {print "#",$0;next};{print substr($0,3,length-1)}' file
sed -i /test/s/^#// file # make it uncomment
sed -i /test/s/^/#/ file # make it comment
I don't know how to use if else to make sed work. I could only make it with a single command, then use another regex to make the opposite.
Using gawk, it works as the main line. But it will mess the rest of the code up.
This might work for you (GNU sed):
sed '4{s/^/#/;s/^##//}' file
On line 4, prepend a # to the line; if there are now two #s at the start, remove both.
Could also be written:
sed '4s/^/#/;4s/^##//' file
This will remove # from the start of line 4 or add it if it wasn't already there:
sed -i '4s/^#/\n/; 4s/^[^\n]/#&/; 4s/^\n//' File
The above assumes GNU sed. If you have BSD/MacOS sed, some minor changes will be required.
When sed reads a new line, the one thing that we know for sure about the new line is that it does not contain \n. (If it did, it would be two lines, not one.) Using this knowledge, the script works by:
4s/^#/\n/
If the fourth line starts with #, replace # with \n. (The \n serves as a notice that the line had originally been commented out.)
4s/^[^\n]/#&/
If the fourth line now starts with anything other than \n (meaning that it was not originally commented), put a # in front.
4s/^\n//
If the fourth line now starts with \n, remove it.
Alternative: Modifying lines that contain test
To comment/uncomment lines that contain test:
sed '/test/{s/^#/\n/; s/^[^\n]/#&/; s/^\n//}' File
Alternative: using awk
The exact same logic can be applied using awk. If we want to comment/uncomment line 4:
awk 'NR==4 {sub(/^#/, "\n"); sub(/^[^\n]/, "#&"); sub(/^\n/, "")} 1' File
If we want to comment/uncomment any line containing test:
awk '/test/ {sub(/^#/, "\n"); sub(/^[^\n]/, "#&"); sub(/^\n/, "")} 1' File
Alternative: using sed but without newlines
To comment/uncomment any line containing test:
sed '/test/{s/^#//; t; s/^/#/; }' File
How it works:
s/^#//; t
If the line begins with #, then remove it.
t tells sed that, if the substitution succeeded, then it should skip the rest of the commands.
s/^/#/
If we get to this command, that means that the substitution did not succeed (meaning the line was not originally commented out), so we insert #.
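As a quick check of the t-based logic, here is a minimal sketch that wraps it in a helper and toggles the same line twice. The function name toggle_comment and the sample file are made up for illustration; the script is written one command per line, which is an assumption made for portability rather than something the answer requires.

```shell
# toggle_comment LINE FILE: hypothetical helper; prints FILE with that line toggled.
toggle_comment() {
    sed "$1"'{
s/^#//
t
s/^/#/
}' "$2"
}

demo=$(mktemp)
printf '%s\n' 'this is interesting.' 'This is a test.' 'go big' > "$demo"

once=$(toggle_comment 2 "$demo")      # line 2 gains a leading #
printf '%s\n' "$once" > "$demo"
twice=$(toggle_comment 2 "$demo")     # line 2 loses it again

printf '%s\n' "$once"
rm -f "$demo"
```

Running the helper twice should give back the original text, which is the property the question asks for.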
If you end up on a system with a sed that doesn't support in-place editing, you can fall back to its uncle ed:
ed -s file 2>/dev/null <<EOF
4 s/^/#/
s/^##//
w
q
EOF
(Standard error is redirected to /dev/null because in ed, unlike sed, it's an error if s doesn't replace anything and a question mark is thus printed to standard error.)
$ awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1' file
what are you doing now?
what are you gonna do? stab me?
this is interesting.
#This is a test.
go big
don't be rude.
$ awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1' file |
awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1'
what are you doing now?
what are you gonna do? stab me?
this is interesting.
This is a test.
go big
don't be rude.

Delete specific lines in range with sed

I am aware of several other questions related to this one such as: Sed Unknown Option to s; however I am not having that problem. I am trying to run:
sed -n '/Ce./,/EOF/ {s!^#!! d} p' more_tests_high.job
but I keep getting:
sed: -e expression #1, char 21: unknown option to `s'
I am trying to search more_tests_high.job for the text between Ce and EOF, but remove any comment lines that start with #. Yes, EOF is literal text in the file that I want to search for. I have tried using /, !, and _ as delimiters. I can run:
sed -n '/Ce./,/EOF/ p' more_tests_high.job
and see all the text that is between Ce and EOF, but how do I remove the commented lines that start with #?
Your command should look like this:
sed -n '/Ce./,/EOF/{/^#/d;p}' more_tests_high.job
For every line between the Ce and EOF lines, you check whether it is a comment line; if it is, you delete it, which restarts the cycle and skips the p.
If it's not a comment line, it will be printed.
BSD sed (also found on Mac OS X) requires an extra semicolon between the p and the closing brace.
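Here is a minimal sketch of that range-plus-delete approach on a made-up job file (the file contents and variable names are assumptions for illustration). It uses the Ce. pattern from the question and includes the extra semicolon before the closing brace.

```shell
# Made-up job file: comments both outside and inside the Ce..EOF range.
job=$(mktemp)
printf '%s\n' 'header' '# before the range' 'Ce1 start' '# a comment' 'data' 'EOF' 'trailer' > "$job"

# The semicolon after p is for BSD sed; GNU sed accepts it too.
in_range=$(sed -n '/Ce./,/EOF/{/^#/d;p;}' "$job")

printf '%s\n' "$in_range"
rm -f "$job"
```

Only the non-comment lines inside the range should survive: Ce1 start, data, and EOF.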
awk to the rescue!
awk '/Ce./,/EOF/{if($0!~/^#/) print}' file
almost direct translation of the requirements without cryptic syntax.
This might work for you (GNU sed):
sed '/Ce./,/EOF/!d;/^#/d' file
You are only interested in the range of lines between Ce. and EOF, so delete everything else. Within that range, delete any line that begins with #. All remaining lines are printed.

Using both GNU Utils with Mac Utils in bash

I am working with plotting extremely large files with N number of relevant data entries. (N varies between files).
In each of these files, comments are automatically generated at the start and end of the file, and I would like to filter these out before recombining the data into one grand data set.
Unfortunately, I am using Mac OS X, where I encounter some issues when trying to remove the last line of the file. I have read that the most efficient way is to use the head/tail commands to cut off sections of data. Since head -n -1 does not work on Mac OS X, I had to install coreutils through Homebrew, where the ghead command works wonderfully. However, the command
tail -n+9 $COUNTER/test.csv | ghead -n -1 $COUNTER/test.csv >> gfinal.csv
does not work. A less than pleasing workaround was to separate the commands: use ghead > newfile, then use tail on newfile > gfinal. Unfortunately, this will take a while, as I have to write a new file with the first ghead.
Is there a workaround to incorporating both GNU Utils with the standard Mac Utils?
Thanks,
Keven
The problem with your command is that you specify the file operand again for the ghead command, instead of letting it take its input from stdin, via the pipe; this causes ghead to ignore stdin input, so the first pipe segment is effectively ignored; simply omit the file operand for the ghead command:
tail -n+9 "$COUNTER/test.csv" | ghead -n -1 >> gfinal.csv
That said, if you only want to drop the last line, there's no need for GNU head - OS X's own BSD sed will do:
tail -n +9 "$COUNTER/test.csv" | sed '$d' >> gfinal.csv
$ matches the last line, and d deletes it (meaning it won't be output).
Finally, as @ghoti points out in a comment, you could do it all using sed:
sed -n '9,$ {$!p;}' file
Option -n tells sed to only produce output when explicitly requested; 9,$ matches everything from line 9 through (,) the end of the file (the last line, $), and {$!p;} prints (p) every line in that range, except (!) the last ($).
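To compare the two approaches side by side, here is a minimal sketch on a made-up twelve-line file, where lines 1 through 8 stand in for the leading comments and line 12 for the trailing one (the file and variable names are assumptions for illustration).

```shell
input=$(mktemp)
# Twelve made-up lines: "line 1" .. "line 12".
awk 'BEGIN { for (i = 1; i <= 12; i++) print "line " i }' > "$input"

via_pipe=$(tail -n +9 "$input" | sed '$d')   # tail drops the top, sed drops the last line
via_sed=$(sed -n '9,$ {$!p;}' "$input")      # sed alone does both

printf '%s\n' "$via_sed"
rm -f "$input"
```

Both variables should hold line 9, line 10 and line 11.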
I realize that your question is about using head and tail, but I'll answer as if you're interested in solving the original problem rather than figuring out how to use those particular tools to solve the problem. :)
One method using sed:
sed -e '1,8d;$d' inputfile
At this level of simplicity, GNU sed and BSD sed both work the same way. Our sed script says:
1,8d - delete lines 1 through 8,
$d - delete the last line.
If you decide to generate a sed script like this on-the-fly, beware of your quoting; you will have to escape the dollar sign if you put it in double quotes.
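The quoting caveat can be illustrated with a minimal sketch in which the number of leading lines comes from a shell variable. The variable name skip and the sample file are made up for this example. Inside double quotes an unescaped $d would be expanded by the shell as a variable named d, so the last-line address is written as \$.

```shell
input=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 12; i++) print "line " i }' > "$input"

skip=8   # hypothetical variable: how many leading lines to drop

# ${skip} must be expanded by the shell; \$ reaches sed as a literal $.
trimmed=$(sed -e "1,${skip}d;\$d" "$input")

printf '%s\n' "$trimmed"
rm -f "$input"
```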
Another method using awk:
awk 'NR>9{print last} NR>1{last=$0}' inputfile
This works a bit differently in order to "recognize" the last line, capturing the previous line and printing after line 8, and then NOT printing the final line.
This awk solution is a bit of a hack, and like the sed solution, relies on the fact that you only want to strip ONE final line of the file.
If you want to strip more lines than one off the bottom of the file, you'd probably want to maintain an array that would function sort of as a buffered FIFO or sliding window.
awk -v striptop=8 -v stripbottom=3 '
NR <= striptop { next }              # drop the leading lines
{ buf[NR] = $0 }                     # buffer every remaining line
(NR - stripbottom) in buf {          # stripbottom newer lines exist, so this one is safe
  print buf[NR - stripbottom]
  delete buf[NR - stripbottom]
}
' inputfile
You specify how much to strip in variables. The buf array keeps only a small window of lines in memory: each line is printed and deleted once stripbottom newer lines have been read, so the final stripbottom lines are still sitting in the buffer, unprinted, when the input ends.

Delete all lines before last case of a string

How would I go about deleting all the lines before the last occurrence of a string? For example, if I had a file that looked like
Icecream is good
And
Chocolate is good
And
They have lots of sugar
If I want all lines after and including the last occurrence of "And" what's the cleanest way to do this? Specifically, I want
And
They have lots of sugar
I was doing sed -n -E -e '/And/,$p' file but I see this gives me the first occurrence.
This might work for you (GNU sed):
sed -n '/And/h;//!H;$!d;x;//p' file
Replace anything in the hold space with the line containing And. Append all other lines to the hold space. At the end of the file, swap the pattern space with the hold space and print the result, as long as it matches the required string And.
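A minimal sketch of that hold-space script, run on the sample text from the question and on input with no match at all (the variable names are made up for illustration, and GNU sed is assumed as in the answer):

```shell
f=$(mktemp)
printf '%s\n' 'Icecream is good' 'And' 'Chocolate is good' 'And' 'They have lots of sugar' > "$f"

# Each And line resets the hold space; later lines are appended to it.
tail_part=$(sed -n '/And/h;//!H;$!d;x;//p' "$f")

# With no And anywhere, the final //p test fails and nothing is printed.
no_match=$(printf '%s\n' 'one' 'two' | sed -n '/And/h;//!H;$!d;x;//p')

printf '%s\n' "$tail_part"
rm -f "$f"
```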
I know that you asked for sed and that Potong provided a good sed solution. But, for comparison, here is an awk solution:
$ awk 's{s=s"\n"$0;} /And/{s=$0;} END{print s;}' file
And
They have lots of sugar
How it works:
s{s=s"\n"$0;}
If the variable s is not empty, then add to it the current line, $0.
/And/{s=$0;}
If the current line contains And, then set s to the current line, $0.
END{print s;}
After we have reached the end of the file, print s.
$ tac file | awk '!f; /And/{f=1}' | tac
And
They have lots of sugar
$ awk 'NR==FNR{if(/And/)nr=NR;next} FNR>=nr' file file
And
They have lots of sugar

Insert line after match using sed

For some reason I can't seem to find a straightforward answer to this and I'm on a bit of a time crunch at the moment. How would I go about inserting a choice line of text after the first line matching a specific string using the sed command. I have ...
CLIENTSCRIPT="foo"
CLIENTFILE="bar"
And I want insert a line after the CLIENTSCRIPT= line resulting in ...
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Try doing this using GNU sed:
sed '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
if you want to substitute in-place, use
sed -i '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
Output
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Doc
see sed doc and search \a (append)
Note the standard sed syntax (as in POSIX, so supported by all conforming sed implementations around (GNU, OS/X, BSD, Solaris...)):
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' file
Or on one line:
sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' file
(-e expressions (and the contents of -f files) are joined with newlines to make up the sed script sed interprets).
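A minimal sketch of the two-expression form, run against a made-up two-line config file (the file and variable names are assumptions for illustration):

```shell
conf=$(mktemp)
printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' > "$conf"

# sed joins the two -e pieces with a newline, giving the "a\<newline>text" form.
result=$(sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' "$conf")

printf '%s\n' "$result"
rm -f "$conf"
```

The new line should appear between the two original lines.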
The -i option for in-place editing is also a GNU extension; some other implementations (like FreeBSD's) support -i '' for that.
Alternatively, for portability, you can use perl instead:
perl -pi -e '$_ .= qq(CLIENTSCRIPT2="hello"\n) if /CLIENTSCRIPT=/' file
Or you could use ed or ex:
printf '%s\n' /CLIENTSCRIPT=/a 'CLIENTSCRIPT2="hello"' . w q | ex -s file
Sed command that works on MacOS (at least, OS 10) and Unix alike (ie. doesn't require gnu sed like Gilles' (currently accepted) one does):
sed -e '/CLIENTSCRIPT="foo"/a\'$'\n''CLIENTSCRIPT2="hello"' file
This works in bash and maybe other shells too that know the $'\n' quoting style. Everything can be on one line and work in older/POSIX sed commands. If there might be multiple lines matching CLIENTSCRIPT="foo" (or your equivalent) and you wish to add the extra line only the first time, you can rework it as follows:
sed -e '/^ *CLIENTSCRIPT="foo"/b ins' -e b -e ':ins' -e 'a\'$'\n''CLIENTSCRIPT2="hello"' -e ': done' -e 'n;b done' file
(this creates a loop after the line insertion code that just cycles through the rest of the file, never getting back to the first sed command again).
You might notice I added a '^ *' to the matching pattern in case that line shows up in a comment, say, or is indented. It's not 100% perfect but covers some other situations likely to be common. Adjust as required...
These two solutions also get round the problem (for the generic solution to adding a line) that if your new inserted line contains unescaped backslashes or ampersands they will be interpreted by sed and likely not come out the same, just like the \n is - eg. \0 would be the first line matched. Especially handy if you're adding a line that comes from a variable where you'd otherwise have to escape everything first using ${var//} before, or another sed statement etc.
This solution is a little less messy in scripts (though that quoting and \n is not easy to read) when you don't want to put the text for the a command at the start of a line, say in a function with indented lines. I've taken advantage of the fact that $'\n' is evaluated to a newline by the shell; it's not in regular '\n' single-quoted values.
It's getting long enough, though, that I think perl or even awk might win by being more readable.
A POSIX compliant one using the s command:
sed '/CLIENTSCRIPT="foo"/s/.*/&\
CLIENTSCRIPT2="hello"/' file
Maybe a bit late to post an answer for this, but I found some of the above solutions a bit cumbersome.
I tried simple string replacement in sed and it worked:
sed 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
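& stands for the matched string; it is followed by \n and the text of the new line. (A \n in the replacement is understood by GNU sed; other implementations may not support it.)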
As mentioned, if you want to do it in-place:
sed -i 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
Another thing. You can match using an expression:
sed -i 's/CLIENTSCRIPT=.*/&\nCLIENTSCRIPT2="hello"/' file
Hope this helps someone
The awk variant :
awk '1;/CLIENTSCRIPT=/{print "CLIENTSCRIPT2=\"hello\""}' file
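The awk variant above adds the line after every match. A minimal sketch of a first-match-only version, using a made-up flag variable named seen and a made-up sample file with two matching lines:

```shell
conf=$(mktemp)
printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' 'CLIENTSCRIPT="again"' > "$conf"

# "1" prints every line; the extra line is printed only while seen is still unset.
first_only=$(awk '1; !seen && /CLIENTSCRIPT=/ { print "CLIENTSCRIPT2=\"hello\""; seen = 1 }' "$conf")

printf '%s\n' "$first_only"
rm -f "$conf"
```

The second CLIENTSCRIPT= line should be left without a follow-up line.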
I had a similar task, and was not able to get the above perl solution to work.
Here is my solution:
perl -i -pe "BEGIN{undef $/;} s/^\[mysqld\]$/[mysqld]\n\ncollation-server = utf8_unicode_ci\n/sgm" /etc/mysql/my.cnf
Explanation:
Uses a regular expression to search for a line in my /etc/mysql/my.cnf file that contained only [mysqld] and replaced it with
[mysqld]
collation-server = utf8_unicode_ci
effectively adding the collation-server = utf8_unicode_ci line after the line containing [mysqld].
I had to do this recently as well, for both Mac and Linux, and after browsing many posts and trying many things I never quite got what I wanted: a solution that is simple to understand, uses well-known standard commands with simple patterns, fits on one line, is portable, and can be extended with more constraints. Then I looked at it from a different perspective and realized I could drop the "one-liner" requirement if a "two-liner" met the rest of my criteria. In the end I came up with this solution, which works on both Ubuntu and Mac:
insertLine=$(( $(grep -n "foo" sample.txt | cut -f1 -d: | head -1) + 1 ))
sed -i -e "$insertLine"' i\'$'\n''bar'$'\n' sample.txt
In the first command, grep finds the line numbers of lines containing "foo", cut/head select the first occurrence, and the arithmetic expansion adds 1 to that line number, since I want to insert after the occurrence.
In the second command, -i edits the file in place and i inserts text: an ANSI-C-quoted newline, "bar", then another newline. The result is a new line containing "bar" after the "foo" line. Each of these two commands can be extended to more complex operations and matching.

Resources