I've got a file with the following contents, and want to remove the last comma (in this case, the comma after the 'c' and 'f').
some more text
This has to be done using bash and standard utilities, not Perl or Python etc., as these are not installed on my target system. I can use sed, awk etc., but I cannot use sed with the -z argument as I'm using an old version of the utility.
So sed -zi 's/,\n);/\n);/g' $file is off the table.
This might work in your version of sed. Then again it might not.
sed 'x;1d;G;/;$/s/,//;$!s/\n.*//' $file
Rough translation: "Swap this line with the hold space. If this is the first line, do no more with it. Append the hold space to the line in the buffer (so that you're looking at the last line and the current one). If what you have ends with a semicolon, delete the comma. If you're not on the last line of the file, delete the second of the two lines you have (i.e. the current line, which we'll deal with after we see the next one)."
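To try the one-liner without touching a real file, here is a minimal sketch. The sample input is made up for illustration (the question's actual file contents are not reproduced above), and strip_last_comma is just a hypothetical wrapper name. Note that s/,// removes the first comma on the buffered line, so the sketch assumes at most one comma per line.

```sh
# Hypothetical wrapper around the one-liner above; reads stdin, writes stdout.
strip_last_comma() {
    sed 'x;1d;G;/;$/s/,//;$!s/\n.*//'
}

# Made-up sample: two blocks, each closed by ");" with a trailing comma
# on the line just before it.
printf '%s\n' 'foo(' 'a,' 'b,' 'c,' ');' 'bar(' 'd,' 'f,' ');' | strip_last_comma
```

On this sample the commas after c and f disappear, while the ones after a, b and d are kept.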

Using awk, with RS='^$' to read in the whole file and a regex to replace parts of the text (RS is quoted so the shell passes it through untouched):
$ awk -v RS='^$' '{gsub(/,\n\);/,"\n);")}1' file
Some output:

This should work with GNU sed and BSD sed on the shown input:
sed -e ':a' -e '/,\n);$/!{N' -e 'ba' -e '}' -e 's/,\n);$/\n);/' file.txt
We concatenate lines in the pattern space until it ends with ,\n);. Then we delete the comma, print (the default) and restart the cycle with a new line.
Simpler and more readable version with GNU sed (that you do not have):
sed ':a;/,\n);$/!{N;ba};s/,\n);$/\n);/' file.txt

Using awk:
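awk '
$0==");" {sub(/,$/, "", l)}   # on the closing line, strip the comma from the saved previous line
FNR!=1 {print l}              # print the previous line once the current one has been checked
{l=$0}                        # remember the current line for the next cycle
END {print l}' file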

This might work for you (GNU sed):
sed '/,$/{N;/);$/Ms/,$//M;P;D}' file
If a line ends with a comma, fetch the next line and if this ends in );, remove the comma.
Otherwise, if the following line does not match as above, print/delete the first of the lines and repeat.

Using sed there are broadly two approaches:
Keep multiple lines in the pattern space; or
Keep the previous line in the hold space.
Using just the pattern space means a very concise version:
sed 'N; s/,[[:space:]]*\n*[[:space:]]*)/)/; P; D'
This relies on the pattern space being able to hold multiple lines, and being able to match the newline with \n. Not all versions of sed can do this, but GNU sed can.
This also relies on the implicit behaviours of N, P, and D, which change depending on when end-of-input is reached. Read man sed for the gory details.
Unrolling this to one command per line gets:
sed '
  N
  s/,[[:space:]]*\n*[[:space:]]*)/)/
  P
  D
'
If you have only a POSIX version of sed available, you'll need to use the hold space as well. In this case the idea is that when you see the ) in the pattern space, you edit the line that's in the hold space to remove the comma:
sed '1 { h; d; }; /^)/ { x; s/,[[:space:]]*$//; x; }; x; $ { p; x; s/,$//; }'
Unrolling that we get:
sed '
1 {
    h
    d
}
/^)/ {
    x
    s/,[[:space:]]*$//
    x
}
x
$ {
    p
    x
    s/,$//
}
'
Breaking that apart: what follows is a "sed script", so just put '' around it and "sed" in front of it:
sed '
Start by unconditionally copying the first line from the pattern space to the hold space, and then deleting the pattern space (which forces a skip to the next line):
1 {
    h
    d
}
For each line that starts with ')', swap the pattern space and hold space (so you now have the previous line in the pattern space), remove the trailing comma (if any), and then swap back again:
/^)/ {
    x
    s/,[[:space:]]*$//
    x
}
Now swap the pattern space with the hold space, so that the hold space now holds the current line and the pattern space holds the previous line:
x
Normally the contents of the pattern space will be sent to output when the end of the script is reached, but we have one more case to take care of first.
On the last line, print the previous line, then swap to retrieve the last line and then (because we reach the end of the script) print it too. This code will also remove a trailing comma from the last line, but that's optional; you can remove the s command in the following if you don't want that.
$ {
    p
    x
    s/,$//
}
Upon reaching the end of the sed script, the pattern space will be printed, so there's no "p" at the end.
As mentioned before, close the quote from the beginning:
'
If you need to scan ahead more than one line, instead of "x" to swap one line, use "H;g" to append to the hold space and then copy the hold space to the pattern space, then "P;D" to print and remove up to the first newline. (H, g, P and D are all specified by POSIX, so this does not require GNU sed.)


I have been tinkering with this for a while but can't quite figure it out. A sample line within the file looks like this:
"...~236 characters of data...Y  YYY.  Y...many more characters of data"
How would I use sed or awk to replace spaces with a B character only between positions 236 and 246? In that example string it starts at character 29 and ends at character 39 within the string. I would want to preserve all the text preceding and following the target chunk of data within the line.
For clarification based on the comments, it should be applied to all lines in the file and expected output would be:
"...~236 characters of data...YBBYYY.BBY...many more characters of data"
With GNU awk:
$ awk -v FIELDWIDTHS='29 10 *' -v OFS= '{gsub(/ /, "B", $2)} 1' ip.txt
...~236 characters of data...YBBYYY.BBY...many more characters of data
FIELDWIDTHS='29 10 *' means 29 characters for first field, next 10 characters for second field and the rest for third field. OFS is set to empty, otherwise you'll get space added between the fields.
With perl:
$ perl -pe 's/^.{29}\K.{10}/$&=~tr| |B|r/e' ip.txt
...~236 characters of data...YBBYYY.BBY...many more characters of data
^.{29}\K match and ignore first 29 characters
.{10} match 10 characters
e flag to allow Perl code instead of string in replacement section
$&=~tr| |B|r convert space to B for the matched portion
Use this Perl one-liner with substr and tr. Note that this uses the fact that you can assign to substr, which changes the original string:
perl -lpe 'BEGIN { $from = 29; $to = 39; } (substr $_, ( $from - 1 ), ( $to - $from + 1 ) ) =~ tr/ /B/;' in_file > out_file
To change the file in-place, use:
perl -i.bak -lpe 'BEGIN { $from = 29; $to = 39; } (substr $_, ( $from - 1 ), ( $to - $from + 1 ) ) =~ tr/ /B/;' in_file
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
I would use GNU AWK in the following way. For simplicity's sake, say we have file.txt with the content
S o m e s t r i n g
and want to change spaces from position 5 (inclusive) to position 10 (inclusive); then
awk 'BEGIN{FPAT=".";OFS=""}{for(i=5;i<=10;i+=1)$i=($i==" "?"B":$i);print}' file.txt
output is
S o mBeBsBt r i n g
Explanation: I set the field pattern (FPAT) to any single character and the output field separator (OFS) to the empty string, so every field holds a single character and I do not get superfluous spaces when printing. I use a for loop to access the desired fields; for each one I check whether it is a space, and if it is I assign B, otherwise I assign the original value. Finally I print the whole changed line.
Using GNU awk:
awk -v strt=29 -v end=39 '{ ram=substr($0,strt,(end-strt+1));gsub(" ","B",ram);print substr($0,1,(strt-1)) ram substr($0,(end+1)) }' file
awk -v strt=29 -v end=39 '{ # Pass the start and end character positions as strt and end respectively
ram=substr($0,strt,(end-strt+1)); # Extract the 29th to the 39th characters (inclusive) of the line into variable ram
gsub(" ","B",ram); # Replace spaces with B in ram
print substr($0,1,(strt-1)) ram substr($0,(end+1)) # Rebuild the line incorporating ram and print the result
}' file
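A quick way to check the boundaries of a substr-based approach is to run it on the short "S o m e s t r i n g" sample from the FPAT answer, where the expected result is already known. This is only a sketch: replace_range is a hypothetical helper name, and it treats both positions as inclusive.

```sh
# Hypothetical helper: replace spaces with B between two character positions
# (both inclusive). Usage: replace_range START END < input
replace_range() {
    awk -v strt="$1" -v end="$2" '{
        mid = substr($0, strt, end - strt + 1)   # the slice strt..end
        gsub(/ /, "B", mid)                      # edit only that slice
        print substr($0, 1, strt - 1) mid substr($0, end + 1)
    }'
}

printf 'S o m e s t r i n g\n' | replace_range 5 10
```

This prints S o mBeBsBt r i n g, the same result as the FPAT version, using only standard awk functions.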
This is certainly a suitable task for perl, and saddens me that my perl has become so rusty that this is the best I can come up with at the moment:
perl -e 'local $/=\1;while(<>) { s/ /B/ if $. >= 236 && $. <= 246; print }' input;
Another awk but using FS="":
$ awk 'BEGIN{FS=OFS=""}{for(i=29;i<=39;i++)sub(/ /,"B",$i)}1' file
"...~236 characters of data...YBBYYY.BBY...many more characters of data"
$ awk '                      # yes awk yes
BEGIN {
    FS=OFS=""                # set empty field delimiters
}
{
    for(i=29;i<=39;i++)      # between desired indexes
        sub(/ /,"B",$i)      # replace space with B
    # if($i==" ")            # could have taken this route, too
    #     $i="B"
}1' file                     # implicit output
With sed :
sed '
s/ /B/g
x' infile
When you have an input string without \r, you can use:
sed -r 's/(.{236})(.{10})(.*)/\1\r\2\r\3/;:a;s/(\r.*) (.*\r)/\1B\2/;ta;s/\r//g' input
First put \r around the area that you want to change.
Next introduce a label to jump back to.
Next replace a space between 2 markers.
Repeat until all spaces are replaced.
Remove the markers.
In your case, where the length doesn't change, you can do without the markers.
Replace a space after 236..245 characters and try again when it succeeds.
sed -r ':a; s/^(.{236})([^ ]{0,9}) /\1\2B/;ta' input
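To watch the loop work on something short, the same idea can be scaled down. The numbers below are chosen purely for illustration: skip 4 characters and allow at most 5 non-spaces, so only spaces in positions 5 to 10 are touched. b_in_5_to_10 is a hypothetical name, and -E is assumed to be available as the extended-regex switch (it is the modern spelling of -r).

```sh
# Hypothetical helper: replace spaces with B in character positions 5..10 only.
# {4}   = characters to skip untouched
# {0,5} = at most 5 non-space characters before the space being replaced,
#         so that space always lies within positions 5..10
b_in_5_to_10() {
    sed -E -e ':a' -e 's/^(.{4})([^ ]{0,5}) /\1\2B/' -e 'ta'
}

printf 'S o m e s t r i n g\n' | b_in_5_to_10
```

Each pass replaces the first remaining space in the window, and ta jumps back to :a until the substitution stops matching. The result here is S o mBeBsBt r i n g.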
This might work for you (GNU sed):
sed -E 's/./&\n/245;s//\n&/236;h;y/ /B/;H;g;s/\n.*\n(.*)\n.*\n(.*)\n.*/\2\1/' file
Divide the problem into 2 lines, one with spaces and one with B's where there were spaces.
Then using pattern matching make a composite line from the two lines.
N.B. The newline can be used as a delimiter as it is guaranteed not to be in sed's pattern space.

I want to replace only the last string "delay" by "ens_delay" in my file and delete the others one before the last one:
Input file:
Output file: (expected value)
Here my first command but it doesn't work because it will work only if I have delay as last line.
sed -e '$,/delay/ s/delay/ens_delay/'
My second command will delete all lines contain "delay", even "ens_delay" will be deleted.
sed -i '/delay/d'
Thank you
This might work for you (GNU sed):
sed '/^delay=/,$!b;/^delay=/!H;//{x;s/^[^\n]*\n\?//;/./p;x;h};$!d;x;s/^/ens_/' file
Lines before the first line beginning delay= should be printed as normal. Otherwise, a line beginning delay= is stored in the hold space and subsequent lines that do not begin delay= are appended to it. Should the hold space already contain such lines, the first line is deleted and the remaining lines printed before the hold space is replaced by the current line. At the end of the file, the first line of the hold space is amended to prepend the string ens_ and then the whole of the hold space is printed.
You cannot do this kind of thing with sed. There is no way in sed to "look forward" and tell if there are more matches to the pattern. You can kind of look back, but that won't be sufficient to solve this problem.
This perl script will solve it:
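use strict;
use warnings;
my ($seek, $replacement, $last, @new) = (shift, shift, 0);
open(my $fh, shift) or die $!;
my @l = <$fh>;
close($fh) or die $!;
foreach (reverse @l){
    if (/$seek/){
        if ($last++ == 0){
            s/$seek/$replacement/;   # last occurrence in the file: rename it
        } else {
            next;                    # earlier occurrence: drop the line
        }
    }
    unshift(@new, $_);
}
print join "", @new;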
Call like:
./script delay= ens_delay= inputfile
I chose to entirely eliminate lines which you intended to delete rather than collapse them in to a single blank line. If that is really required then it's a bit more complicated: the first such line in any consecutive set (or rather the last such) must be pushed on to the output list and you have to track whether this has just been done so you know whether to push the next time, too.
You could also solve this problem with awk, python, or any number of other languages. Just not sed.
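As one possible awk take on this, here is a minimal two-pass sketch. It assumes the lines of interest start with delay= (as in the call example above) and that earlier matches should be deleted outright rather than blanked. keep_last_delay is a hypothetical helper name and the sample data is made up.

```sh
# Hypothetical helper: keep only the last line starting with "delay=",
# renamed to "ens_delay=". The file is named twice so awk reads it twice.
keep_last_delay() {
    awk '
        NR == FNR { if (/^delay=/) last = FNR; next }   # pass 1: note the last match
        /^delay=/ {
            if (FNR != last) next                        # pass 2: drop earlier matches
            sub(/^delay=/, "ens_delay=")                 # rename the last one
        }
        { print }
    ' "$1" "$1"
}

tmp=$(mktemp)
printf '%s\n' 'a=1' 'delay=5' 'b=2' 'delay=7' 'c=3' > "$tmp"
keep_last_delay "$tmp"
rm -f "$tmp"
```

For the made-up sample this prints a=1, b=2, ens_delay=7 and c=3, one per line. Reading the file twice keeps memory use flat, at the cost of needing a real file rather than a pipe.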
Have this monster:
sed -e "1,$(expr $(sed -n '/^delay=/=' your_file.txt | tail -1) - 1)"'s/^delay=.*$//' \
-e 's/^delay=/ens_delay=/' your_file.txt
sed -n '/^delay=/=' your_file.txt | tail -1 return the last line number of the encountered pattern (let's name it X)
expr is used to get the X-1 line
"1,X-1"'[command]' means "perform this command between the first and the X-1 line, inclusive" (I used double quotes to let the expansion happen)
's/^delay=.*$//' the said [command]
-e 's/^delay=/ens_delay=/' the next expression to perform (will occur only on the last line)
If you want to delete the lines instead of leaving them blank:
sed -e "1,$(expr $(sed -n '/^delay=/=' your_file.txt | tail -1) - 1)"'{/^delay=.*$/d}' \
-e 's/^delay=/ens_delay=/' your_file.txt
As was mentioned elsewhere, sed can't know which occurrence of a substring is the last one. But awk can keep track of things in arrays. For example, the following will delete all duplicate assignments, as well as making your substitution:
awk 'BEGIN{FS=OFS="="} $1=="delay"{$1="ens_delay"} !($1 in a){o[++i]=$1} {a[$1]=$0} END{for(x=1;x<=i;x++) printf "%s\n",a[o[x]]}' inputfile
Or, broken out for easier reading/comments:
BEGIN {
  FS=OFS="="               # set the field separator, to help isolate the left hand side
}
$1=="delay" {
  $1="ens_delay"           # your field substitution
}
!($1 in a) {
  o[++i]=$1                # if we haven't seen this variable, record its position
}
{
  a[$1]=$0                 # record the value of the last-seen occurrence of this variable
}
END {
  for (x=1;x<=i;x++)       # step through the array (o is filled from index 1),
    printf "%s\n",a[o[x]]  # printing the last-seen values, in the order
}                          # their variable was first seen in the input file.
You might not care about the order of the variables. If so, the following might be simpler:
awk 'BEGIN{FS=OFS="="} $1=="delay"{$1="ens_delay"} {o[$1]=$0} END{for(i in o) printf "%s\n", o[i]}' inputfile
This simply stores the last-seen line in an array whose key is the variable name, then prints out the content of the array in an unknown order.
Assuming I understand your specifications properly, this should do what you need. Given infile x,
$: last=$( grep -n delay x|tail -1|sed 's/:.*//' )
This grep's the file for all lines with delay and returns them with the line number prepended with a colon. The tail -1 grabs the last of those lines, ignoring all the others. sed 's/:.*//' strips the colon and the actual line content, leaving only the number (here it was 14.)
That all evaluates out to assign 14 as $last.
$: sed '/delay/ { '$last'!d; '$last' s/delay/ens_delay/; }' x
Apologies for the ugly catenation. What this does is writes the script using the value of $last so that the result looks like this to sed:
$: sed '/delay/ { 14!d; 14 s/delay/ens_delay/; }' x
sed reads leading numbers as line selectors. Here is what this script of commands does:
First, sed automatically prints lines unless told not to, so by default it would just print every line. The script modifies that.
/delay/ {...} is a pattern-based record selector. It will apply the commands between the {} to all lines that match /delay/, which is why it doesn't need another grep - it handles that itself. Inside the curlies, the script does two things.
First, 14!d says (only if this line has delay, which it will) that if the line number is 14, do not (the !) delete the record. Since all the other lines with delay won't be line 14 (or whatever value of the last one the earlier command created), those will get deleted, which automatically restarts the cycle and reads the next record.
Second, if the line number is 14, then it won't delete, and so will progress to the s/delay/ens_delay/ which updates your value.
For all lines that don't match /delay/, sed just prints them as-is.

SET_VALUE(, dsad_sd );
How can I use sed to replace only from the SET_VALUE until the , with each letter after _ to be upper case?
SET_VALUE(, dsad_sd );
For your input string you may apply the following sed expression + bash variable substitution:
s="SET_VALUE(, dsad_sd );"
res=$(sed '1s/_\([a-z]\)/\U\1/g;' <<< "${s%,*}"),${s#*,}
echo "$res"
The output:
SET_VALUE(, dsad_sd );
Got distracted while writing this one up so Roman beat me to the punch, but this has a slight variation so figured I'd post it as another option ...
$ s="SET_VALUE(, dsad_sd );"
$ sed 's/,/,\n/g' <<< "$s" | sed -n '1{s/_\([a-z]\)/\U\1/g;N;s/\n//;p}'
SET_VALUE(, dsad_sd );
s/,/,\n/g : break input into separate lines at the comma (leave comma on first line, push rest of input to a second line)
at this point we've broken our input into 2 lines; the second sed invocation will now be working with a 2-line input
sed -n : refrain from printing input lines as they're processed; we'll explicitly print lines when required
1{...} : for the first line, apply the commands inside the braces ...
s/_\([a-z]\)/\U\1/g : for each pattern we find like '_[a-z]', save the [a-z] in buffer #1, and replace the pattern with the upper case of the contents of buffer #1
at this point we've made the desired edits to line #1 (ie, everything before the comma in the original input), now ...
N : read and append the next line into the pattern space
s/\n// : remove the newline (replace it with nothing)
at this point we've pasted lines #1 and #2 together into a single line
p : print the pattern space
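Since \U is a GNU sed extension, here is a sketch of the same transformation in plain awk. It splits at the first comma (an assumption; the answers above split at the last comma or at every comma), and the input string is invented for illustration because the question's sample has lost its first argument. camel_before_comma is a hypothetical name.

```sh
# Hypothetical helper: in the text before the first comma, turn "_x" into "X"
# for every lowercase letter x; everything from the comma onwards is untouched.
camel_before_comma() {
    awk '{
        p = index($0, ",")
        if (p == 0) { head = $0; tail = "" }
        else        { head = substr($0, 1, p - 1); tail = substr($0, p) }
        # Each pass removes one "_x" pair, so the loop always terminates.
        while (match(head, /_[a-z]/))
            head = substr(head, 1, RSTART - 1) toupper(substr(head, RSTART + 1, 1)) substr(head, RSTART + 2)
        print head tail
    }'
}

printf '%s\n' 'SET_VALUE(ab_cd, ef_gh );' | camel_before_comma
```

With this made-up input the result is SET_VALUE(abCd, ef_gh ); because _V is left alone (V is not lowercase) and nothing after the comma is changed.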

Hello: I have tab separated data of the form
customer-item description-purchase price-category
e.g. a.out contains:
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
I'm attempting to get rid of all the NULL fields. I can't rely on the simple replacement of the string "NULL" as it may be a substring; so I am attempting
sed -i 's:\tNULL\t:\t\t:g' a.out
when I do this, I end up with
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
what's wrong here is that #5 has only suffered a replacement of the first instance of the search string on each line.
If I run my sed command twice, I end up with the result I want:
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
where you can see that line 5 has both of the NULLs removed
But I don't understand why I'm suffering this?
awk -F'\t' -v OFS='\t' '{
    for (i = 1; i <= NF; ++i) {
        if ($i == "NULL") {
            $i = "";
        }
    }
    print
}' test.txt
The straightforward solution is to use \t as a field separator and then loop over all of the fields looking for an exact match of "NULL". No substringing.
Here's the same thing as a one liner:
awk -F'\t' -v OFS='\t' '{for(i=1;i<=NF;++i) if($i=="NULL") $i=""} 1' test.txt
Since tabs can't be inside strings in your case since that would imply a new field you might be able to do what you want simply by doing this;
sed ':start ; s/\tNULL\(\t\|$\)/\t\1/ ; t start' a.out
First the inner part s/\tNULL\(\t\|$\)/\t\1/ searches for a tab and NULL followed by a tab or end of line ($), and replaces it with a tab followed by whatever did appear after NULL (this last part is done using \1). We'll call that expression.
We now have:
sed ':start ; expression ; t start' a.out
This is effectively a loop (like goto). :start is a label. ; acts as a statement delimiter. I have described what expression does above. t start says that IF the expression did any substitution that a jump will be made to label start. The buffer will contain the substituted text. This loop occurs until no substitution can be done on the line and then processing continues.
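The reason a single s///g pass leaves some NULLs behind is that matches found by g cannot overlap: in "tab NULL tab NULL tab", the middle tab is consumed by the first match and so cannot start the second one. The sketch below shows the difference on a made-up row (the question's row 5 is not reproduced above). A literal tab is used so the sketch does not rely on sed understanding \t, and unlike the command above it does not handle a NULL at the end of the line. null_once and null_loop are hypothetical names.

```sh
tab=$(printf '\t')   # a literal tab character

# One pass with the g flag: the tab that ends one match cannot begin the next.
null_once() { sed "s/${tab}NULL${tab}/${tab}${tab}/g"; }

# Looping variant: retry the substitution until it no longer matches.
null_loop() { sed -e ':a' -e "s/${tab}NULL${tab}/${tab}${tab}/" -e 'ta'; }

printf '5\tNULL\tNULL\tfruit\n' | null_once   # one NULL is left behind
printf '5\tNULL\tNULL\tfruit\n' | null_loop   # both NULLs are gone
```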
Information on sed flow control and other useful tidbits can be found here
awk makes it simpler:
awk -F '\tNULL\\>' -v OFS='\t' '{$1=$1}1' file
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
From grep(1) on a recent Linux:
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the
beginning and end of a word. The symbol \b matches the empty string at
the edge of a word [...]
So, how about:
sed -i 's:\<NULL\>::g' a.out

I'm trying to delete two lines either side of a pattern match from a file full of transactions, i.e. find the match, then delete the two lines before it, the two lines after it, and the match itself. Then write this back to the original file.
So the input data is
and my pattern is
sed -i '/PINITIAL BALANCE/,+2d' test.txt
However this is only deleting two lines after the pattern match and then deleting the pattern match. I can't work out any logical way to delete all 5 lines of data from the original file using sed.
an awk one-liner may do the job:
awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file
kent$ cat file
this line will be kept
this line will be kept too
kent$ awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file
this line will be kept
this line will be kept too
add some explanation
awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];} #if match found, add the line and +- 2 lines' line number in an array "d"
{a[NR]=$0} # save all lines in an array with line number as index
END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' #finally print only those index not in array "d"
file # your input file
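A small self-contained run of the same awk logic, using invented line contents so the effect is easy to see (the real transaction data is not reproduced above). drop_around_match is a hypothetical wrapper that reads standard input.

```sh
# Hypothetical wrapper: delete each matching line plus the 2 lines before
# and the 2 lines after it.
drop_around_match() {
    awk '/PINITIAL BALANCE/ { for (x = NR - 2; x <= NR + 2; x++) d[x]; }  # mark line numbers to drop
         { a[NR] = $0 }                                                   # buffer every line
         END { for (i = 1; i <= NR; i++) if (!(i in d)) print a[i] }'     # print the unmarked ones
}

printf '%s\n' keep1 drop1 drop2 'PINITIAL BALANCE' drop3 drop4 keep2 | drop_around_match
```

Only keep1 and keep2 are printed. Because the whole input is buffered in the array a, memory use grows with file size.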
sed will do it:
sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'
It works this way:
if sed has only one string in pattern space it joins another one
if there are only two it joins the third one
if it does match the pattern LINE + LINE + LINE with BALANCE, it joins the two following lines, deletes everything and goes back to the beginning
if not, it prints the first line from the pattern space, deletes it, and goes back to the beginning without wiping the pattern space
To prevent the appearance of pattern on the first string you should modify the script:
sed '1{/PINITIAL BALANCE/{N;N;d}};/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'
However, it fails if another PINITIAL BALANCE appears within the lines that are about to be deleted. The other solutions fail in that case too =)
For such a task, I would probably reach for a more advanced tool like Perl:
perl -ne 'push @x, $_;
          if (@x > 4) {
              if ($x[2] =~ /PINITIAL BALANCE/) { undef @x }
              else { print shift @x }
          }
          END { print @x }' input-file > output-file
This will remove 5 lines from the input file. These lines will be the 2 lines before the match, the matched line, and the two lines afterwards. You can change the total number of lines being removed by modifying @x > 4 (this removes 5 lines), and the line being matched by modifying $x[2] (this makes the match on the third line to be removed, and so removes the two lines before the match).
A more simple and easy to understand solution might be:
awk '/PINITIAL BALANCE/ {print NR-2 "," NR+2 "d"}' input_filename \
| sed -f - input_filename > output_filename
awk is used to make a sed-script that deletes the lines in question and the result is written on the output_filename.
This uses two processes which might be less efficient than the other answers.
This might work for you (GNU sed):
sed ':a;$q;N;s/\n/&/2;Ta;/\nPINITIAL BALANCE$/!{P;D};$q;N;$q;N;d' file
save this code into a file grep.sed
/.*\n.*\n/ {
and run a command like this:
`sed -i -f grep.sed FILE`
You can use it so either:
sed -i 'H;s:.*::;x;s:^\n::;:r;/PINITIAL BALANCE/{N;N;d;};/.*\n.*\n/{P;D;};x;d' FILE
