sed swallows whitespaces while inserting content - bash

I am trying to insert a new line before the first match with sed. The content that I want to insert starts with spaces, but sed or bash swallows the whitespace.
hello.txt
line 1
line 2
line 3
In my case the content to insert comes from a variable:
content=" hello"
The command that I have created:
sed -i "/line 2/i $content" hello.txt
Result:
line 1
hello
line 2
line 3
Expected result:
line 1
 hello
line 2
line 3
It seems that bash swallows the whitespace. I have tried to use quotes around the variable this way: sed -i "/line 2/i \"$content\"" but unfortunately it does not work. After playing with this for 2 hours I decided to ask for help.

That's how GNU sed works, yes - any whitespace between the command i and the start of text is eaten. You can add a backslash (At least, in GNU sed; haven't tested others) to keep the spaces:
$ sed "/line 2/i \\$content" hello.txt
line 1
 hello
line 2
line 3
POSIX sed i always requires a backslash after the i and then a newline with the text to insert on the next line (And to insert multiple lines, a backslash at the end of all but the last), instead of the one-line version that's a GNU extension.
sed "/line 2/i\\
$content
" hello.txt
I don't have a non-GNU version handy to test, so I don't know if the leading whitespace in the variable will be eaten, but I suspect not.

You need to use a backslash instead:
content="\ hello"
Otherwise, the leading whitespace is treated as part of the separator between the i command and hello, and is discarded.
Note also that the $content variable could end up including all sorts of characters that sed would interpret differently... so be careful.
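One way to sidestep both the whitespace stripping and sed's interpretation of special characters is to hand the text to awk through the environment. This is only a sketch: insert_before_first is a made-up helper name, and it matches the needle as a plain substring rather than as a regular expression.

```shell
# insert_before_first is a hypothetical helper name, not something from the thread.
# Usage: insert_before_first NEEDLE TEXT FILE
# Prints FILE with TEXT inserted before the first line that contains NEEDLE.
insert_before_first() {
    # Both strings travel through the environment, so awk copies them verbatim:
    # no whitespace stripping, no escape processing, no regex interpretation.
    NEEDLE=$1 TEXT=$2 awk '
        !done && index($0, ENVIRON["NEEDLE"]) { print ENVIRON["TEXT"]; done = 1 }
        { print }
    ' "$3"
}

# Example with the file from the question (written to a temporary file here).
demo=$(mktemp)
printf 'line 1\nline 2\nline 3\n' > "$demo"
content="   hello"
insert_before_first 'line 2' "$content" "$demo"
rm -f "$demo"
```

The result goes to stdout; redirect it to a new file and move that over the original if you need the effect of -i.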

ed would work well here:
ed hello.txt <<END_ED
/line 2/i
$content
.
wq
END_ED

Related

Using sed to append a line with alt characters

I have this file I want to add a line with and I seem to have a problem when using alt characters.
Here's the script piece:
#testfile.txt
Main
├────►projectA
└────►projectC
sed "/├────►projectA/a ├────►projectB/" testfile.txt
sed doesn't seem to find the "├────►projectA" portion. Grep won't even find it.
grep "├────►projectA" testfile.txt
grep returns nothing.
So how can you make it found so I can add my line below it?
Edit: I found my problem. I was using the wrong character in my sed command. This script is on a different system so I had to make an example off the top of my head.
I'm trying to add spaces as well after the /a but it trims it. Is there a way to preserve the spaces?
(e.g. "[5 spaces] ├────►projectB")
I can't add spaces to the above line because Stack Overflow's formatting trims them as well, so I write [5 spaces] to represent the amount of whitespace.
It seems to me like you're trying to use sed to append a literal string (i.e. a string containing any characters), but sed doesn't understand literal strings, only regular expressions and backreference-enabled replacements; see Is it possible to escape regex metacharacters reliably with sed. You should instead be using a tool like awk that understands literal strings.
Is this all you want to do?
$ awk '{print} index($0,"├────►projectA"){print "├────►projectB"}' file
#testfile.txt
Main
├────►projectA
├────►projectB
└────►projectC
If you want 5 leading blanks, just add 5 leading blanks to the string in the print statement:
$ awk '{print} index($0,"├────►projectA"){print "     ├────►projectB"}' file
#testfile.txt
Main
├────►projectA
     ├────►projectB
└────►projectC
If you just want to duplicate whatever indent the preceding line has, here's a modified version of your input file:
$ cat file
#testfile.txt
Main
├────►projectA
└────►projectC
and here's how to print the new line with whatever indent (blanks and/or tabs) the preceding line uses:
$ awk '{print} s=index($0,"├────►projectA"){print substr($0,1,s-1) "├────►projectB"}' file
#testfile.txt
Main
├────►projectA
├────►projectB
└────►projectC
This might work for you (GNU sed):
sed '/^├────►projectA$/{p;s/A/B/}' file
Match on ├────►projectA, print the line, then substitute B for A.
Use the sed command l0 to see each line's representation in octal and ASCII.
Thus this is the same as above:
sed '/^\o342\o224\o234\o342\o224\o200\o342\o224\o200\o342\o224\o200\o342\o224\o200\o342\o226\o272projectA$/{p;s/A/B/}' file
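If you want to check those octal values without GNU sed's l command, od can dump the bytes of a string. A small sketch, assuming a UTF-8 locale and a POSIX shell; octal_bytes is a made-up helper name:

```shell
# octal_bytes is a hypothetical helper: print the bytes of a string in octal.
octal_bytes() {
    # The unquoted substitution is split on whitespace, which drops od's padding;
    # "$*" then joins the values with single spaces.
    set -- $(printf '%s' "$1" | od -An -to1)
    echo "$*"
}

octal_bytes '├'    # in UTF-8 this prints: 342 224 234
```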
To add 5 spaces to front of such a line, use:
sed '/^├────►projectA$/{p;s/^/     /;s/A/B/}' file
Or you may prefer (the empty regex in s// reuses the last regex matched):
sed '/^\(├────►project\)A$/{p;s//     \1B/}' file

match repeated character in sed on mac

I am trying to find all instances of 3 or more new lines and replace them with only 2 new lines (imagine a file with wayyy too much white space). I am using sed, but OK with an answer using awk or the like if that's easier.
note: I'm on a mac, so sed is slightly different than on linux (BSD vs GNU)
My actual goal is new lines, but I can't get it to work at all so for simplicity I'm trying to match 3 or more repetitions of bla and replace that with BLA.
Make an example file called stupid.txt:
$ cat stupid.txt
blablabla
$
My understanding is that you match i or more things using regex syntax thing{i,}.
I have tried variations of this to match the 3 blas with no luck:
cat stupid.txt | sed 's/bla{3,}/BLA/g' # simplest way
cat stupid.txt | sed 's/bla\{3,\}/BLA/g' # escape curly brackets
cat stupid.txt | sed -E 's/bla{3,}/BLA/g' # use extended regular expressions
cat stupid.txt | sed -E 's/bla\{3,\}/BLA/g' # use -E and escape brackets
Now I am out of ideas for what else to try!
thing{3,} matches thinggg. Use (..) to group things to make the quantifier apply to what you want:
$ echo blablabla | sed -E 's/(bla){3}/BLA/g'
BLA
If slurping the whole file is acceptable:
perl -0777pe 's/(\n){3,}/\n\n/g' newlines.txt
Where you should replace \n with whatever newline sequence is appropriate.
-0777 tells perl to not break each line into its own record, which allows a regex that works across lines to function.
If you are satisfied with the result, -i causes perl to replace the file in-place rather than output to stdout:
perl -i -0777pe 's/(\n){3,}/\n\n/g' newlines.txt
You can also write -i~ to create a backup file with the given suffix (~ in this case).
If slurping the whole file is not acceptable:
perl -ne 'if (/^$/) {$i++}else{$i=0}print if $i<3' newlines.txt
This prints any line that is not the third (or higher) consecutive empty line. -i works with this the same.
P.S. macOS comes with perl installed.
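Since the question also allows awk, the same line-by-line idea can be sketched there. squeeze_blank_lines is a made-up name, and the default of keeping at most one blank line (that is, two consecutive newlines) is my reading of the goal:

```shell
# squeeze_blank_lines is a hypothetical helper name.
# Usage: squeeze_blank_lines FILE [MAX]
# Prints FILE, keeping at most MAX consecutive empty lines (default 1).
squeeze_blank_lines() {
    awk -v max="${2:-1}" '
        /^$/ { if (++run > max) next; print; next }   # count the run of empty lines
             { run = 0; print }                       # any other line resets the run
    ' "$1"
}

demo=$(mktemp)
printf '1\n\n\n\n2\n\n3\n' > "$demo"
squeeze_blank_lines "$demo"
rm -f "$demo"
```

It reads one line at a time, so it does not need the whole file in memory, and it uses nothing that differs between the BSD and GNU tools.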
sed -E 's/bla{3,}/BLA/g'
The above matches bl followed by three or more repetitions of a. This is not what you want. It appears that you actually want three or more repetitions of bla. If that is the case, then replace:
$ sed -E 's/bla{3,}/BLA/g' stupid.txt
blablabla
With:
$ sed -E 's/(bla){3,}/BLA/g' stupid.txt
BLA
The above, though, doesn't directly help with your task of replacing newlines because, by default, sed reads in only one line at a time.
Replacing newlines
Let's consider this file which has 3 newlines between the 1 and 3:
$ cat file.txt
1


3
To replace any occurrence of three or more newlines with a single newline:
$ sed -E 'H;1h;$!d;x; s/\n{3,}/\n/g' file.txt
1
3
How it works:
H;1h;$!d;x
This complex series of commands reads in the whole file. It is probably simplest to think of this as an idiom. If you really want to know the gory details:
H - Append current line to hold space
1h - If this is the first line, overwrite the hold space with it
$!d - If this is not the last line, delete pattern space and jump to the next line.
x - Exchange hold and pattern space to put whole file in pattern space
s/\n{3,}/\n/g
This replaces all sequences of three or more newlines with a single newline.
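The original question asked to keep two newlines rather than one. A sketch of that variant of the same idiom, assuming GNU sed (BSD sed may not expand \n in the replacement text); squeeze_newlines is a made-up name:

```shell
# squeeze_newlines is a hypothetical helper name. Assumes GNU sed.
# Usage: squeeze_newlines FILE
# Slurps FILE and replaces every run of 3 or more newlines with exactly 2.
squeeze_newlines() {
    sed -E 'H;1h;$!d;x; s/\n{3,}/\n\n/g' "$1"
}

demo=$(mktemp)
printf '1\n\n\n\n3\n' > "$demo"
squeeze_newlines "$demo"
rm -f "$demo"
```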
Alternate
The above solution reads in the whole file at once. For large (gigabyte) files that could be a disadvantage. This alternate approach avoids that:
$ sed -E '/^$/{:a; N; /\n$/ba; s/\n{3,}([^\n]*)/\1/}' file.txt # GNU only
1
3
How it works:
/^$/{...}
This selects blank lines. For blank lines and only blank lines, the commands in braces are executed and they are:
:a
This defines a label a.
N
This reads in the next line from the file into the pattern space, separated from the previous by a newline.
/\n$/ba
If the last line read in is empty, branch (jump) to label a.
s/\n{3,}([^\n]*)/\1/
If we didn't branch, then this substitution is performed which removes the excess newlines.
BSD Version: I don't have a BSD system to test this on but I am guessing:
sed -E -e '/^$/{:a' -e N -e '/\n$/ba' -e 's/\n{3,}([^\n]*)/\1/}' file.txt
To keep only 2 newlines, you can try this sed
sed '
/^$/!b
N
/../b
h
:A
y/\n/#/
/^#$/!bB
s/#//
$bB
N
bA
:B
s/^#//
/./ {
x
G
b
}
g
' infile
/^$/!b If the line is not empty, jump to the end (it is printed as is)
N get a new line
/../b if this new line is not empty print the 2 lines
h keep the 2 empty lines in the hold buffer
:A label A
At this point there are always 2 lines in the pattern buffer and the first is empty
y/\n/#/ substitute \n by # (you can choose another char not present in your file)
/^#$/!bB If the second line is not empty jump to B
s/#// remove the #
$bB If it's the last line jump to B
At this point there is 1 empty line in the pattern space
N get the next line
bA jump to A
:B label B
s/^#// remove the # at the start of the line
/./ { If the last line is not empty
x exchange pattern and hold buffer
G add the hold buffer to the pattern space
b jump to end
}
g replace the pattern space (empty) by the hold space
print the pattern space

Sed substitution places characters after back reference at beginning of line

I have a text file that I am trying to convert to a Latex file for printing. One of the first steps is to go through and change lines that look like:
Book 01 Introduction
To look like:
\chapter{Introduction}
To this end, I have devised a very simple sed script:
sed -n -e 's/Book [[:digit:]]\{2\}\s*\(.*\)/\\chapter{\1}/p'
This does the job, except, the closing curly bracket is placed where the initial backslash should be in the substituted output. Like so:
}chapter{Introduction
Any ideas as to why this is the case?
Your call to sed is fine; the problem is that your file uses DOS line endings (CRLF). sed does not treat the CR as part of the line ending; it sees it as just another character on the line. The string Introduction\r is captured, and the terminal displays the result \chapter{Introduction\r} by first printing everything up to the carriage return (the ^ represents the cursor position)
\chapter{Introduction
                     ^
then moving the cursor to the beginning of the line
\chapter{Introduction
^
then printing the rest of the result (}) over what has already been printed
}chapter{Introduction
 ^
The solution is to either fix the file to use standard POSIX line endings (linefeed only), or to strip the carriage return from the end of the line before the capture. Stripping it first matters because a greedy .* would otherwise still capture the CR. With GNU sed:
sed -n -e 's/\r$//' -e 's/Book [[:digit:]]\{2\}\s*\(.*\)/\\chapter{\1}/p'
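If you would rather not depend on GNU extensions such as \r and \s, the carriage returns can be removed with tr before sed sees the text. A minimal sketch; to_chapter is a made-up name, and it deletes every CR in the input, not only those at line ends:

```shell
# to_chapter is a hypothetical helper name.
# Reads text on stdin, deletes carriage returns, and prints a \chapter{...}
# line for every "Book NN Title" line.
to_chapter() {
    tr -d '\r' |
        sed -n 's/^Book [[:digit:]]\{2\}[[:space:]]*\(.*\)$/\\chapter{\1}/p'
}

# A line with DOS line endings, as in the question:
printf 'Book 01 Introduction\r\n' | to_chapter    # prints \chapter{Introduction}
```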
As an alternative to sed, awk using gsub might work well in this situation:
awk '{gsub(/Book [0-9]+/,"\\chapter"); print $1"{"$2"}"}'
Result:
\chapter{Introduction}
A solution is to modify the capture group. In this case, since all book chapter names consist only of alphabetic characters I was able to use [[:alpha:]]*. This gave a revised sed script of:
sed -n -e 's/Book [[:digit:]]\{2\}\s*\([[:alpha:]]*\)/\\chapter{\1}/p'

Add multiple lines in file using bash script

Using a bash script, I am trying to insert a line in a file (eventually there will be 4 extra lines, one after the other).
I am trying to implement the answer by iiSeymour to the thread:
Insert lines in a file starting from a specific line
which I think is the same comment that dgibbs made in his own thread:
Bash: Inserting a line in a file at a specific location
The line after which I want to insert the new text is very long, so I save it in a variable first:
field1=$(head -2 file847script0.xml | tail -1)
The text I want to insert is:
insert='newtext123'
When running:
sed -i".bak" "s/$field1/$field1\n$insert/" file847script0.xml
I get the error:
sed: 1: "s/<ImageAnnotation xmln ...": bad flag in substitute command: 'c'
I also tried following the thread
sed throws 'bad flag in substitute command'
but the command
sed -i".bak" "s/\/$field1/$field1\n$insert/" file847script0.xml
still gives me the same error:
sed: 1: "s/\/<ImageAnnotation xm ...": bad flag in substitute command: 'c'
I am using a Mac OS X 10.5.
Any idea of what am I doing wrong? Thank you!
Good grief, just use awk. No need to worry about special characters in your replacement text or random single-character commands and punctuation.
In this case it looks like all you need is to print some new text after the 2nd line so that's just:
$ cat file
a
b
c
$ insert='absolutely any text you want, including newlines
slashes (/), backslashes (\\), whatever...'
$ awk -v insert="$insert" '{print} NR==2{print insert}' file
a
b
absolutely any text you want, including newlines
slashes (/), backslashes (\), whatever...
c
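Note that awk -v still processes backslash escapes in the value, which is why \\ came out as \ above. If the text must be copied byte for byte, or if you want to match the saved line from the question instead of a line number, the strings can be passed through the environment. A sketch with a made-up helper name, insert_after_line, comparing whole lines literally:

```shell
# insert_after_line is a hypothetical helper name.
# Usage: insert_after_line TARGET TEXT FILE
# Prints FILE with TEXT (one or more lines) added after every line equal to TARGET.
insert_after_line() {
    # ENVIRON values are used verbatim: slashes, backslashes and & need no escaping.
    TARGET=$1 TEXT=$2 awk '
        { print }
        $0 == ENVIRON["TARGET"] { print ENVIRON["TEXT"] }
    ' "$3"
}

demo=$(mktemp)
printf 'a\n<Image xmlns="http://example/ns">\nc\n' > "$demo"
field1=$(head -2 "$demo" | tail -1)
insert='newtext123
second new line'
insert_after_line "$field1" "$insert" "$demo"
rm -f "$demo"
```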
Isn't it easier to do it by line number? If you know it's the second line or the nth line (and grep will tell you line numbers if you are pattern matching) then you can simply use sed to find the correct line and then append a new line (or 4 new lines).
cat <<EOF > testfile
one two three
four five six
seven eight nine
EOF
sed -re '2a\hello there' testfile
will output
one two three
four five six
hello there
seven eight nine

Why does sed add a new line in OSX?

echo -n 'I hate cats' > cats.txt
sed -i '' 's/hate/love/' cats.txt
This changes the word in the file correctly, but also adds a newline to the end of the file. Why? This only happens in OSX, not Ubuntu etc. How can I stop it?
echo -n 'I hate cats' > cats.txt
This command will populate the contents of 'cats.txt' with the 11 characters between the single quotes. If you check the size of cats.txt at this stage it should be 11 bytes.
sed -i '' 's/hate/love/' cats.txt
This command will read the cats.txt file line by line, and replace it with a file where each line has had the first instance of 'hate' replaced by 'love' (if such an instance exists). The important part is understanding what a line is. From the sed man page:
Normally, sed cyclically copies a line of input, not including its
terminating newline character, into a pattern space, (unless there is
something left after a ``D'' function), applies all of the commands
with addresses that select that pattern space, copies the pattern
space to the standard output, appending a newline, and deletes the
pattern space.
Note the appending a newline part. On your system, sed is still interpreting your file as containing a single line, even though there is no terminating newline. So the output will be the 11 characters, plus the appended newline. On other platforms this would not necessarily be the case. Sometimes sed will completely skip the last line in a file (effectively deleting it) because it is not really a line! But in your case, sed is basically fixing the file for you (as a file with no lines in it, it is broken input to sed).
See more details here: Why should text files end with a newline?
See this question for an alternate approach to your problem: SED adds new line at the end
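To see whether a file actually ends with a newline before and after running sed, you can look at its last byte. A small sketch; ends_with_newline is a made-up name, and it also reports success for an empty file:

```shell
# ends_with_newline is a hypothetical helper name.
# Succeeds if the last byte of FILE is a newline (or FILE is empty).
ends_with_newline() {
    # Command substitution strips a trailing newline, so the result is empty
    # exactly when the last byte is a newline (or there is no byte at all).
    [ -z "$(tail -c 1 "$1")" ]
}

demo=$(mktemp)
printf 'I hate cats' > "$demo"
if ends_with_newline "$demo"; then echo "ends with newline"; else echo "no trailing newline"; fi
rm -f "$demo"
```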
If you need a solution which will not add the newline, you can use gsed (brew install gnu-sed)
A good way to avoid this problem is to use perl instead of sed. Perl will respect the EOF newline, or lack thereof, that is in the original file.
echo -n 'I hate cats' > cats.txt
perl -pi -e 's/hate/love/' cats.txt
Note that GNU sed does not add the newline on Mac OS.
Another thing you can do is this:
echo -n 'I hate cats' > cats.txt
SAFE=$(cat cats.txt; echo x)
SAFE=$(printf '%s' "$SAFE" | sed -e 's/hate/love/')
SAFE=${SAFE%x}
That way if cats.txt ends in a newline it gets preserved. If it doesn't, it doesn't get one added on.
This worked for me. I didn't have to use an intermediate file.
OUTPUT=$( echo 'I hate cats' | sed 's/hate/love/' )
echo -n "$OUTPUT"
