Bash Concatenate and Replace carriage returns with newline - bash

I need to convert a series of text files that are formatted with line breaks to single lines separated by newlines (\n). For example:
This is an example text file
where the contents are separated
by line breaks
What I want this to look like is:
This is an example text file\nwhere the contents are separated\nby line breaks\n
I'm open to using awk, sed, or any builtin POSIX commands.

Please try this solution:
awk 'BEGIN{RS="\n";ORS="\\n"}1' file.txt
This keeps the Record Separator (RS) at the newline ("\n") and sets the Output Record Separator (ORS) to the two-character string \n. The doubled backslash in "\\n" makes awk print a literal backslash followed by n rather than a real newline. The pattern 1 is always true, so the default action (print the whole record) runs for every line.
If you have any problem let me know, I don't have an awk available to try it.
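Since the answer above was posted untested, here is a minimal sketch that can be run to check it. The function name join_escaped is made up for illustration, and the sketch relies on awk's default RS (newline) instead of setting it explicitly.

```shell
# join_escaped: hypothetical helper (not from the original answer).
# Reads stdin and prints every line followed by a literal backslash-n
# instead of a real newline.
join_escaped() {
    awk 'BEGIN { ORS = "\\n" } 1'
}

printf 'This is an example text file\nwhere the contents are separated\nby line breaks\n' | join_escaped
echo   # finish the terminal line, since the output contains no real newline
```

Note that the last line also gets a trailing \n, which matches what the question asked for.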

It's not clear whether by "line break" you mean Carriage Return, Line Feed, Newline, or something else, nor whether you want to replace newlines with the string \n, just strip Carriage Returns that precede newlines, or something else. If it's the stripping you want, all you need is:
dos2unix file
If you don't have dos2unix you can do it with any awk:
$ printf 'foo\r\nbar\r\n' | cat -v
foo^M
bar^M
$ printf 'foo\r\nbar\r\n' | awk '{sub(/\r$/,"")}1' | cat -v
foo
bar
You can't do it robustly with tr since it can't tell when a \r is at the end of a line or not, and you can't do it portably with sed.
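To make the point about tr concrete, here is a small sketch with an input that has one carriage return in the middle of a line and one at the end. The two function names are hypothetical and only wrap the commands discussed above.

```shell
# Hypothetical helpers contrasting the two approaches on stdin.
strip_cr_tr()  { tr -d '\r'; }                   # deletes every CR, wherever it is
strip_cr_awk() { awk '{ sub(/\r$/, "") } 1'; }   # deletes only a CR that ends a line

# Input is a<CR>b<CR><LF>: one CR mid-line, one at the end of the line.
printf 'a\rb\r\n' | strip_cr_tr  | od -c
printf 'a\rb\r\n' | strip_cr_awk | od -c
```

The tr version also removes the CR between a and b, while the awk version leaves it alone.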

This might work for you (GNU sed):
sed '1h;1!H;$!d;x;s/\n/\\n/g' file
Slurp the file into memory and quote newlines.
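A quick sketch of what that command produces, assuming GNU sed as the answer does; slurp_escape is a made-up wrapper name. One thing worth checking against the question: sed only sees the newlines between lines, so the final line ends with a real newline rather than a literal \n.

```shell
# slurp_escape: hypothetical wrapper around the sed command above.
# 1h    copy the first line to the hold space
# 1!H   append every later line to the hold space
# $!d   print nothing until the last line
# x     fetch the collected text, then escape the embedded newlines
slurp_escape() {
    sed '1h;1!H;$!d;x;s/\n/\\n/g'
}

printf 'a\nb\n' | slurp_escape
```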

Related

Read in a file AS a single line [duplicate]

How can I replace a newline ("\n") with a space (" ") using the sed command?
I unsuccessfully tried:
sed 's#\n# #g' file
sed 's#^$# #g' file
How do I fix it?
sed is intended to be used on line-based input, although it can do what you need.
A better option here is to use the tr command as follows:
tr '\n' ' ' < input_filename
or remove the newline characters entirely:
tr -d '\n' < input.txt > output.txt
or if you have the GNU version (with its long options)
tr --delete '\n' < input.txt > output.txt
Use this solution with GNU sed:
sed ':a;N;$!ba;s/\n/ /g' file
This reads the whole file in a loop (':a;N;$!ba'), then replaces the newline(s) with a space ('s/\n/ /g'). Additional substitutions can simply be appended if needed.
Explanation:
sed starts by reading the first line excluding the newline into the pattern space.
Create a label via :a.
Append a newline and next line to the pattern space via N.
If we are before the last line, branch to the created label $!ba ($! means not to do it on the last line. This is necessary to avoid executing N again, which would terminate the script if there is no more input!).
Finally the substitution replaces every newline with a space on the pattern space (which is the whole file).
Here is cross-platform compatible syntax which works with BSD and OS X's sed (as per @Benjie's comment):
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' file
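The loop can be tried on a small input like this. It is only a sketch: join_lines is a hypothetical name, and it uses the multi -e spelling so that it does not depend on how a particular sed parses labels.

```shell
# join_lines: hypothetical wrapper; joins all stdin lines with single spaces.
# :a    define label a
# N     append the next input line (with an embedded newline)
# $!ba  unless this is the last line, branch back to a
# s     finally replace every embedded newline with a space
join_lines() {
    sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g'
}

printf 'one\ntwo\nthree\n' | join_lines
```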
As you can see, using sed for this otherwise simple problem is problematic. For a simpler and adequate solution see this answer.
Fast answer
sed ':a;N;$!ba;s/\n/ /g' file
:a create a label 'a'
N append the next line to the pattern space
$! if not the last line, ba branch (go to) label 'a'
s substitute, /\n/ regex for new line, / / by a space, /g global match (as many times as it can)
sed loops through steps 1 to 3 until it reaches the last line, so that all lines end up in the pattern space, where sed then substitutes every \n character
Alternatives
All alternatives, unlike sed will not need to reach the last line to begin the process
with bash, slow
while IFS= read -r line; do printf '%s ' "$line"; done < file
with perl, sed-like speed
perl -p -e 's/\n/ /' file
with tr, faster than sed, can replace by one character only
tr '\n' ' ' < file
with paste, tr-like speed, can replace by one character only
paste -s -d ' ' file
with awk, tr-like speed
awk 1 ORS=' ' file
Other alternative like "echo $(< file)" is slow, works only on small files and needs to process the whole file to begin the process.
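One difference between these alternatives that is easy to miss is what happens to the last newline. A small sketch, with made-up function names, that puts brackets around each result so a trailing space is visible:

```shell
# Hypothetical wrappers around three of the alternatives listed above.
join_tr()    { tr '\n' ' '; }
join_paste() { paste -s -d ' ' -; }   # "-" names stdin explicitly
join_awk()   { awk 1 ORS=' '; }

# Command substitution strips trailing newlines but keeps trailing spaces.
printf '[%s]\n' "$(printf 'a\nb\n' | join_tr)"
printf '[%s]\n' "$(printf 'a\nb\n' | join_paste)"
printf '[%s]\n' "$(printf 'a\nb\n' | join_awk)"
```

tr and awk turn the final newline into a space as well, while paste leaves the final newline in place.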
Long answer from the sed FAQ 5.10
5.10. Why can't I match or delete a newline using the \n escape
sequence? Why can't I match 2 or more lines using \n?
The \n will never match the newline at the end-of-line because the
newline is always stripped off before the line is placed into the
pattern space. To get 2 or more lines into the pattern space, use
the 'N' command or something similar (such as 'H;...;g;').
Sed works like this: sed reads one line at a time, chops off the
terminating newline, puts what is left into the pattern space where
the sed script can address or change it, and when the pattern space
is printed, appends a newline to stdout (or to a file). If the
pattern space is entirely or partially deleted with 'd' or 'D', the
newline is not added in such cases. Thus, scripts like
sed 's/\n//' file # to delete newlines from each line
sed 's/\n/foo\n/' file # to add a word to the end of each line
will NEVER work, because the trailing newline is removed before
the line is put into the pattern space. To perform the above tasks,
use one of these scripts instead:
tr -d '\n' < file # use tr to delete newlines
sed ':a;N;$!ba;s/\n//g' file # GNU sed to delete newlines
sed 's/$/ foo/' file # add "foo" to end of each line
Since versions of sed other than GNU sed have limits to the size of
the pattern buffer, the Unix 'tr' utility is to be preferred here.
If the last line of the file contains a newline, GNU sed will add
that newline to the output but delete all others, whereas tr will
delete all newlines.
To match a block of two or more lines, there are 3 basic choices:
(1) use the 'N' command to add the Next line to the pattern space;
(2) use the 'H' command at least twice to append the current line
to the Hold space, and then retrieve the lines from the hold space
with x, g, or G; or (3) use address ranges (see section 3.3, above)
to match lines between two specified addresses.
Choices (1) and (2) will put an \n into the pattern space, where it
can be addressed as desired ('s/ABC\nXYZ/alphabet/g'). One example
of using 'N' to delete a block of lines appears in section 4.13
("How do I delete a block of specific consecutive lines?"). This
example can be modified by changing the delete command to something
else, like 'p' (print), 'i' (insert), 'c' (change), 'a' (append),
or 's' (substitute).
Choice (3) will not put an \n into the pattern space, but it does
match a block of consecutive lines, so it may be that you don't
even need the \n to find what you're looking for. Since GNU sed
version 3.02.80 now supports this syntax:
sed '/start/,+4d' # to delete "start" plus the next 4 lines,
in addition to the traditional '/from here/,/to there/{...}' range
addresses, it may be possible to avoid the use of \n entirely.
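The FAQ's central claim is easy to verify. In this sketch (function names are hypothetical) the first command never sees a newline, and the second sees one only because N pulled the next line into the pattern space.

```shell
# Hypothetical helpers illustrating the FAQ entry.
try_plain()  { sed 's/\n//'; }     # \n never matches: the newline was stripped
try_with_n() { sed 'N;s/\n//'; }   # N appends the next line, so \n can match

printf 'a\nb\n' | try_plain    # lines come out unchanged
printf 'a\nb\n' | try_with_n   # the two lines are joined
```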
A shorter awk alternative:
awk 1 ORS=' '
Explanation
An awk program is built up of rules which consist of conditional code-blocks, i.e.:
condition { code-block }
If the code-block is omitted, the default is used: { print $0 }. Thus, the 1 is interpreted as a true condition and print $0 is executed for each line.
When awk reads the input it splits it into records based on the value of RS (Record Separator), which by default is a newline, thus awk will by default parse the input line-wise. The splitting also involves stripping off RS from the input record.
Now, when printing a record, ORS (Output Record Separator) is appended to it, default is again a newline. So by changing ORS to a space all newlines are changed to spaces.
GNU sed has an option, -z, for null-separated records (lines). You can just call:
sed -z 's/\n/ /g'
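A sketch of the -z variant, assuming GNU sed is the sed on the PATH (other implementations may not accept -z); join_z is a made-up name. With -z the whole input is one record as long as it contains no NUL bytes, so the last newline is replaced too.

```shell
# join_z: hypothetical wrapper; requires GNU sed for -z.
join_z() {
    sed -z 's/\n/ /g'
}

# od -c shows that the output ends in a space, with no newline after it.
printf 'a\nb\n' | join_z | od -c
```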
The Perl version works the way you expected.
perl -i -p -e 's/\n//' file
As pointed out in the comments, it's worth noting that this edits in place. -i.bak will give you a backup of the original file before the replacement in case your regular expression isn't as smart as you thought.
Who needs sed? Here is the bash way:
cat test.txt | while read line; do echo -n "$line "; done
In order to replace all newlines with spaces using awk, without reading the whole file into memory:
awk '{printf "%s ", $0}' inputfile
If you want a final newline:
awk '{printf "%s ", $0} END {printf "\n"}' inputfile
You can use a character other than space:
awk '{printf "%s|", $0} END {printf "\n"}' inputfile
tr '\n' ' '
is the command.
Simple and easy to use.
Three things.
tr (or cat, etc.) is absolutely not needed. (GNU) sed and (GNU) awk, when combined, can do 99.9% of any text processing you need.
stream != line based. ed is a line-based editor. sed is not. See sed lecture for more information on the difference. Most people confuse sed to be line-based because it is, by default, not very greedy in its pattern matching for SIMPLE matches - for instance, when doing pattern searching and replacing by one or two characters, it by default only replaces on the first match it finds (unless specified otherwise by the global command). There would not even be a global command if it were line-based rather than STREAM-based, because it would evaluate only lines at a time. Try running ed; you'll notice the difference. ed is pretty useful if you want to iterate over specific lines (such as in a for-loop), but most of the times you'll just want sed.
That being said,
sed -e '{:q;N;s/\n/ /g;t q}' file
works just fine in GNU sed version 4.2.1. The above command will replace all newlines with spaces. It's ugly and a bit cumbersome to type in, but it works just fine. The {}'s can be left out, as they're only included for sanity reasons.
Why didn't I find a simple solution with awk?
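awk '{printf "%s", $0}' file
printf prints every line without its newline. Pass the line as an argument to "%s" rather than as the format string itself, otherwise a % in the data is treated as a format specifier. If you want to separate the original lines with a space or something else:
awk '{printf "%s ", $0}' file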
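A short sketch of the awk printf approach on input that contains percent signs, which is where using the line itself as the format string becomes fragile. The name join_printf is hypothetical.

```shell
# join_printf: hypothetical helper; prints each stdin line followed by a
# space, passing the line as data to "%s" so that % characters are safe.
join_printf() {
    awk '{ printf "%s ", $0 }'
}

printf '50%% done\n100%% done\n' | join_printf
```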
The answer with the :a label ...
How can I replace a newline (\n) using sed?
... does not work in freebsd 7.2 on the command line:
( echo foo ; echo bar ) | sed ':a;N;$!ba;s/\n/ /g'
sed: 1: ":a;N;$!ba;s/\n/ /g": unused label 'a;N;$!ba;s/\n/ /g'
foo
bar
But does if you put the sed script in a file or use -e to "build" the sed script...
> (echo foo; echo bar) | sed -e :a -e N -e '$!ba' -e 's/\n/ /g'
foo bar
or ...
> cat > x.sed << eof
:a
N
$!ba
s/\n/ /g
eof
> (echo foo; echo bar) | sed -f x.sed
foo bar
Maybe the sed in OS X is similar.
Easy-to-understand Solution
I had this problem. The kicker was that I needed the solution to work on BSD's (Mac OS X) and GNU's (Linux and Cygwin) sed and tr:
$ echo 'foo
bar
baz

foo2
bar2
baz2' \
| tr '\n' '\000' \
| sed 's:\x00\x00.*:\n:g' \
| tr '\000' '\n'
Output:
foo
bar
baz
(has trailing newline)
It works on Linux, OS X, and BSD - even without UTF-8 support or with a crappy terminal.
Use tr to swap the newline with another character.
NULL (\000 or \x00) is nice because it doesn't need UTF-8 support and it's not likely to be used.
Use sed to match the NULL
Use tr to swap back extra newlines if you need them
You can use xargs:
seq 10 | xargs
or
seq 10 | xargs echo -n
cat file | xargs
for the sake of completeness
If you are unfortunate enough to have to deal with Windows line endings, you need to remove the \r and the \n:
tr '\r\n' '  ' < "$input" > "$output"
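If a single space per line break is wanted rather than one for the \r and one for the \n, a possible variant is to delete the carriage returns first. This is a sketch, not the answer's own command, and crlf_to_space is a made-up name.

```shell
# crlf_to_space: hypothetical helper; drops every CR, then turns each
# LF into exactly one space.
crlf_to_space() {
    tr -d '\r' | tr '\n' ' '
}

printf 'a\r\nb\r\n' | crlf_to_space
```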
I'm not an expert, but I guess in sed you'd first need to append the next line into the pattern space by using "N". From the section "Multiline Pattern Space" in "Advanced sed Commands" of the book sed & awk (Dale Dougherty and Arnold Robbins; O'Reilly 1997; page 107 in the preview):
The multiline Next (N) command creates a multiline pattern space by reading a new line of input and appending it to the contents of the pattern space. The original contents of pattern space and the new input line are separated by a newline. The embedded newline character can be matched in patterns by the escape sequence "\n". In a multiline pattern space, the metacharacter "^" matches the very first character of the pattern space, and not the character(s) following any embedded newline(s). Similarly, "$" matches only the final newline in the pattern space, and not any embedded newline(s). After the Next command is executed, control is then passed to subsequent commands in the script.
From man sed:
[2addr]N
Append the next line of input to the pattern space, using an embedded newline character to separate the appended material from the original contents. Note that the current line number changes.
I've used this to search (multiple) badly formatted log files, in which the search string may be found on an "orphaned" next line.
In response to the "tr" solution above, on Windows (probably using the Gnuwin32 version of tr), the proposed solution:
tr '\n' ' ' < input
was not working for me, it'd either error or actually replace the \n w/ '' for some reason.
Using another feature of tr, the "delete" option -d did work though:
tr -d '\n' < input
or '\r\n' instead of '\n'
I used a hybrid approach to get around the newline thing: use tr to replace newlines with tabs, then replace the tabs with whatever I want. In this case, " <br> ", since I'm trying to generate HTML breaks.
echo -e "a\nb\nc\n" | tr '\n' '\t' | sed 's/\t/ <br> /g'
You can also use this method:
sed 'x;G;1!h;s/\n/ /g;$!d'
Explanation
x - which is used to exchange the data from both space (pattern and hold).
G - which is used to append the data from hold space to pattern space.
h - which is used to copy the pattern space to hold space.
1!h - On the first line the pattern space is not copied to the hold space,
because at that point it starts with a stray \n.
$!d - Clears the pattern space on every line except the last, so nothing is
printed until the last line.
Flow
When the first line is read, the exchange moves line 1 to the hold space and leaves the pattern space empty. G then appends a \n plus the hold space to the pattern space, the substitution is performed, and the pattern space is deleted.
On the second line, the exchange moves line 2 to the hold space and brings line 1 into the pattern space. G appends the hold space to the pattern space, h copies the result back to the hold space, the substitution is made, and the pattern space is deleted. This continues until the last line, where the pattern space is not deleted and the joined result is printed.
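The flow can be checked on a three-line input. This is a sketch with a made-up function name; the second call shows an edge case of the script, where a one-line input comes out with a leading space because G runs while the pattern space is still empty.

```shell
# hold_join: hypothetical wrapper around the hold-space script above.
hold_join() {
    sed 'x;G;1!h;s/\n/ /g;$!d'
}

printf '1\n2\n3\n' | hold_join   # the three lines joined by spaces
printf '1\n' | hold_join         # single line: note the leading space
```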
Bullet-proof solution. Binary-data-safe and POSIX-compliant, but slow.
POSIX sed requires input according to the POSIX text file and POSIX line definitions, so NULL-bytes and too long lines are not allowed and each line must end with a newline (including the last line). This makes it hard to use sed for processing arbitrary input data.
The following solution avoids sed and instead converts the input bytes to octal codes and then to bytes again, but intercepts octal code 012 (newline) and outputs the replacement string in place of it. As far as I can tell the solution is POSIX-compliant, so it should work on a wide variety of platforms.
od -A n -t o1 -v | tr ' \t' '\n\n' | grep . |
while read x; do [ "0$x" -eq 012 ] && printf '<br>\n' || printf "\\$x"; done
POSIX reference documentation: sh, shell command language, od, tr, grep, read, [, printf.
read, [, and printf are all built-ins in at least bash, but that is probably not guaranteed by POSIX, so on some platforms each input byte may start one or more new processes, which will slow things down. Even in bash this solution only reaches about 50 kB/s, so it's not suited for large files.
Tested on Ubuntu (bash, dash, and busybox), FreeBSD, and OpenBSD.
In some situations maybe you can change RS to some other string or character. This way, \n is available for sub/gsub:
$ gawk 'BEGIN {RS="dn" } {gsub("\n"," ") ;print $0 }' file
The power of shell scripting is that if you do not know how to do it in one way you can do it in another way. And many times you have more things to take into account than make a complex solution on a simple problem.
Regarding the thing that gawk is slow... and reads the file into memory, I do not know this, but to me gawk seems to work with one line at the time and is very very fast (not that fast as some of the others, but the time to write and test also counts).
I process MB and even GB of data, and the only limit I found is line size.
Find and replace with \n allowed in the pattern (GNU sed):
sed -i -z 's/Marker\n/# Marker Comment\nMarker\n/g' myfile.txt
Marker
Becomes
# Marker Comment
Marker
You could use xargs — it will replace \n with a space by default.
However, it would have problems if your input has any case of an unterminated quote, e.g. if the quote signs on a given line don't match.
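Both halves of that can be seen in a short sketch. It uses only plain xargs with its default echo command; the sample inputs are made up.

```shell
# Lines without quote characters are simply joined with spaces.
printf 'a\nb\nc\n' | xargs

# A lone apostrophe is read as an opening quote, so xargs gives up
# with an error instead of printing the joined text.
printf "it's\nfine\n" | xargs || echo "xargs failed on the unmatched quote"
```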
On Mac OS X (using FreeBSD sed):
# replace each newline with a space
printf "a\nb\nc\nd\ne\nf" | sed -E -e :a -e '$!N; s/\n/ /g; ta'
printf "a\nb\nc\nd\ne\nf" | sed -E -e :a -e '$!N; s/\n/ /g' -e ta
To remove empty lines:
sed -n "s/^$//;t;p;"
Using Awk:
awk "BEGIN { o=\"\" } { o=o \" \" \$0 } END { print o; }"
A solution I particularly like is to append all the file in the hold space and replace all newlines at the end of file:
$ (echo foo; echo bar) | sed -n 'H;${x;s/\n//g;p;}'
foobar
However, someone told me the hold space can be of limited size in some sed implementations.
Replace newlines with any string, and replace the last newline too
The pure tr solutions can only replace with a single character, and the pure sed solutions don't replace the last newline of the input. The following solution fixes these problems, and seems to be safe for binary data (even with a UTF-8 locale):
printf '1\n2\n3\n' |
sed 's/%/%p/g;s/#/%a/g' | tr '\n' '#' | sed 's/#/<br>/g;s/%a/#/g;s/%p/%/g'
Result:
1<br>2<br>3<br>
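The escape-then-translate idea can be wrapped in a function and tried on input that already contains the placeholder character. This is a sketch: nl_to_br is a made-up name, and the # is quoted because an unquoted # after whitespace starts a shell comment.

```shell
# nl_to_br: hypothetical helper; replaces every newline with <br>.
# Step 1: escape % as %p and # as %a, so no raw # is left in the text.
# Step 2: turn each newline into # (a single character, so tr can do it).
# Step 3: turn each # into <br>, then undo the escaping from step 1.
nl_to_br() {
    sed 's/%/%p/g;s/#/%a/g' | tr '\n' '#' | sed 's/#/<br>/g;s/%a/#/g;s/%p/%/g'
}

printf 'a#b%%\nc\n' | nl_to_br
echo   # finish the terminal line
```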
It is sed that introduces the new-lines after "normal" substitution. First, it trims the new-line char, then it processes according to your instructions, then it introduces a new-line.
Using sed you can replace "the end" of a line (not the new-line char) after being trimmed, with a string of your choice, for each input line; but, sed will output different lines. For example, suppose you wanted to replace the "end of line" with "===" (more general than a replacing with a single space):
PROMPT~$ cat <<EOF |sed 's/$/===/g'
first line
second line
3rd line
EOF
first line===
second line===
3rd line===
PROMPT~$
To replace the new-line char with the string, you can, inefficiently though, use tr , as pointed before, to replace the newline-chars with a "special char" and then use sed to replace that special char with the string you want.
For example:
PROMPT~$ cat <<EOF | tr '\n' $'\x01'|sed -e 's/\x01/===/g'
first line
second line
3rd line
EOF
first line===second line===3rd line===PROMPT~$

Put the first letter of each column in eol

I have a file like this:
A_City,QQQQ
B_State,QQQQ
C_Country,QQQQ
A_Cityt,YYYY
B_State,YYYY
C_Country,YYYY
I want to add one more column at end of the line on the same file with the first letter of each column.
A_City,QQQQ,AQ
B_State,QQQQ,BQ
C_Country,QQQQ,CQ
A_Cityt,YYYY,AY
B_State,YYYY,BY
C_Country,YYYY,CY
I would like to get this using sed but if there is an awk code would help.
awk to the rescue!
$ awk '{print $0 "," substr($0,1,1) substr($0,length($0))}' file
A_City,QQQQ,AQ
B_State,QQQQ,BQ
C_Country,QQQQ,CQ
A_Cityt,YYYY,AY
B_State,YYYY,BY
C_Country,YYYY,CY
or, perhaps
$ awk -F, '{print $0 FS substr($1,1,1) substr($2,1,1)}' file
When you have only one , you can use
sed -r 's/^(.).*,(.).*/&,\1\2/' file
This might work for you (GNU sed):
sed -r 's/^|,+/&\n/g;s/$/,\n/;:a;s/\n(.).*,\n.*/&\1/;s/\n//;/\n.*,\n/ba;s/\n//g' file
Insert a newline at the start of a line or following one or more ,'s. Append an additional , and a newline to the end of the line. Append a character following a newline followed by zero or more characters followed by a , and a final newline and any following characters to its match. Remove the first newline. If there are two or more newlines repeat. Finally remove all newlines.
N.B. If the line is initially empty, this will add a , to such lines. Empty fields are catered for and will be represented by no first character.
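For files with more than two columns, the awk version generalizes with a loop over the fields. A minimal sketch, assuming comma-separated input as in the question; add_initials and the second sample line are made up.

```shell
# add_initials: hypothetical helper; appends one extra column made of the
# first character of every existing comma-separated column.
add_initials() {
    awk -F, '{
        initials = ""
        for (i = 1; i <= NF; i++)
            initials = initials substr($i, 1, 1)
        print $0 FS initials
    }'
}

printf 'A_City,QQQQ\nB_State,YYYY,ZZ\n' | add_initials
```

An empty field contributes no character, the same as in the sed answer above.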

sed - remove line break if line does not end on \"

I have a tsv.-file and there are some lines which do not end with an '"'. So now I would like to remove every line break which is not directly after an '"'.
How could I accomplish that with sed? Or any other bash shell program...
Kind regards,
Snafu
This sed command should do it:
sed '/"$/!{N;s/\n//}' file
It says: on every line not matching "$ do:
read next line, append it to pattern space;
remove linebreak between the two lines.
Example:
$ cat file.txt
"test"
"qwe
rty"
foo
$ sed '/"$/!{N;s/\n//}' file.txt
"test"
"qwerty"
foo
To elaborate on @Lev's answer, the BSD (OS X) version of sed is less forgiving about the command syntax within the curly braces: a semicolon is required after each command, including the last one before the closing brace:
sed '/"$/!{N;s/\n//;}' file.txt
per the documentation here -- an excerpt:
Following an address or address range, sed accepts curly braces '{...}' so several commands may be applied to that line or to the lines matched by the address range. On the command line, semicolons ';' separate each instruction and must precede the closing brace.
give this awk one-liner a try:
awk '{printf "%s%s",$0,(/"$/?"\n":"")}' file
test
kent$ cat f
"foo"
"bar"
"a long
text with
many many
lines"
"lalala"
kent$ awk '{printf "%s%s",$0,(/"$/?"\n":"")}' f
"foo"
"bar"
"a longtext withmany manylines"
"lalala"
This might work for you (GNU sed):
sed ':a;/"$/!{N;s/\n//;ta}' file
This checks if the last character of the pattern space is a " and if not appends another line, removes a newline and repeats until the condition is met or the end-of-file is encountered.
An alternative is:
sed -r ':a;N;s/([^"])\n/\1/;ta;P;D' file
The mechanism is left for the reader to ponder.
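A record that spans three lines shows why the looping versions matter. In this sketch (function names are hypothetical) the single-N sed command joins only the first two pieces, while the awk one-liner keeps going until it sees a closing quote at the end of a line.

```shell
# Hypothetical wrappers around two of the commands from this thread.
join_once() { sed '/"$/!{N;s/\n//;}'; }
join_all()  { awk '{ printf "%s%s", $0, (/"$/ ? "\n" : "") }'; }

# One quoted value broken across three lines.
printf '"a\nb\nc"\n' | join_once   # still two lines
printf '"a\nb\nc"\n' | join_all    # a single line
```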

How to replace newlines with tab characters?

I have pattern like below
hi
hello
hallo
greetings
salutations
no more hello for you
I am trying to replace all newlines with tab using the following command
sed -e "s_/\n_/\t_g"
but it's not working.
Could anybody please help? I'm looking for a solution in sed/awk.
tr is better here, I think:
tr "\n" "\t" < newlines
As Nifle suggested in a comment, newlines here is the name of the file holding the original text.
Because sed is so line-oriented, it's more complicated to use in a case like this.
not sure about output you want
# awk -vRS="\n" -vORS="\t" '1' file
hi hello hallo greetings salutations no more hello for you
sed '$!{:a;N;s/\n/\t/;ta}' file
You can't replace newlines on a line-by-line basis with sed. You have to accumulate lines and replace the newlines between them.
text abc\n <- can't replace this one
text abc\ntext def\n <- you can replace the one after "abc" but not the one at the end
This sed script accumulates all the lines and eliminates all the newlines but the last:
sed -n '1{x;d};${H;x;s/\n/\t/g;p};{H}'
By the way, your sed script sed -e "s_/\n_/\t_g" is trying to say "replace all slashes followed by newlines with slashes followed by tabs". The underscores are taking on the role of delimiters for the s command so that slashes can be more easily used as characters for searching and replacing.
paste -s
-s Concatenate all of the lines of each separate input file in
command line order. The newline character of every line
except the last line in each input file is replaced with the
tab character, unless otherwise specified by the -d option.
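A small sketch of paste -s on a few of the question's lines, reading from standard input (the "-" operand names stdin explicitly, which some implementations require). The second call uses -d only to make the delimiter visible.

```shell
# Join stdin lines with the default delimiter, a tab.
printf 'hi\nhello\nhallo\n' | paste -s -

# Same, with a visible delimiter chosen via -d.
printf 'hi\nhello\nhallo\n' | paste -s -d ',' -
```

The newline of the last line is kept, so the output is still a proper text line.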
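You are almost there with your sed script. Written with / as the delimiter, it becomes:
sed -e "s/\n/\t/g"
The \ is enough to escape n and t; you don't need the _ characters, which sed was treating as the delimiters of the s command.
Note that on its own this still matches nothing, because sed strips the trailing newline before the line reaches the pattern space. With GNU sed you can add -z so that the whole input is read as one record.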

How to insert a newline in front of a pattern?

How to insert a newline before a pattern within a line?
For example, this will insert a newline behind the regex pattern.
sed 's/regex/&\n/g'
How can I do the same but in front of the pattern?
Given this sample input file, the pattern to match on is the phone number.
some text (012)345-6789
Should become
some text
(012)345-6789
This works in bash and zsh, tested on Linux and OS X:
sed 's/regexp/\'$'\n/g'
In general, for $ followed by a string literal in single quotes bash performs C-style backslash substitution, e.g. $'\t' is translated to a literal tab. Plus, sed wants your newline literal to be escaped with a backslash, hence the \ before $. And finally, the dollar sign itself shouldn't be quoted so that it's interpreted by the shell, therefore we close the quote before the $ and then open it again.
Edit: As suggested in the comments by @mklement0, this works as well:
sed $'s/regexp/\\\n/g'
What happens here is: the entire sed command is now a C-style string, which means the backslash that sed requires to be placed before the new line literal should now be escaped with another backslash. Though more readable, in this case you won't be able to do shell string substitutions (without making it ugly again.)
Some of the other answers didn't work for my version of sed.
Switching the position of & and \n did work.
sed 's/regexp/\n&/g'
Edit: This doesn't seem to work on OS X, unless you install gnu-sed.
In sed, you can't add newlines in the output stream easily. You need to use a continuation line, which is awkward, but it works:
$ sed 's/regexp/\
&/'
Example:
$ echo foo | sed 's/.*/\
&/'
foo
See here for details. If you want something slightly less awkward you could try using perl -pe with match groups instead of sed:
$ echo foo | perl -pe 's/(.*)/\n$1/'
foo
$1 refers to the first matched group in the regular expression, where groups are in parentheses.
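Applied to the phone number from the question, the backslash-newline form might look like the sketch below. The function name and the exact regular expression are assumptions; in a basic regular expression the parentheses are literal, so they need no escaping.

```shell
# break_before_phone: hypothetical helper; inserts a newline in front of a
# phone number shaped like (012)345-6789. The replacement is a backslash,
# a real newline, and & (the matched text).
break_before_phone() {
    sed 's/([0-9]\{3\})[0-9]\{3\}-[0-9]\{4\}/\
&/'
}

printf 'some text (012)345-6789\n' | break_before_phone
```

The space that preceded the number stays at the end of the first output line.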
On my mac, the following inserts a single 'n' instead of newline:
sed 's/regexp/\n&/g'
This replaces with newline:
sed "s/regexp/\\`echo -e '\n\r'`/g"
echo one,two,three | sed 's/,/\
/g'
You can use perl one-liners much like you do with sed, with the advantage of full perl regular expression support (which is much more powerful than what you get with sed). There is also very little variation across *nix platforms - perl is generally perl. So you can stop worrying about how to make your particular system's version of sed do what you want.
In this case, you can do
perl -pe 's/(regex)/\n$1/'
-pe puts perl into a "execute and print" loop, much like sed's normal mode of operation.
' quotes everything else so the shell won't interfere
() surrounding the regex is a grouping operator. $1 on the right side of the substitution prints out whatever was matched inside these parens.
Finally, \n is a newline.
Regardless of whether you are using parentheses as a grouping operator, you have to escape any parentheses you are trying to match. So a regex to match the pattern you list above would be something like
\(\d\d\d\)\d\d\d-\d\d\d\d
\( or \) matches a literal paren, and \d matches a digit.
Better:
\(\d{3}\)\d{3}-\d{4}
I imagine you can figure out what the numbers in braces are doing.
Additionally, you can use delimiters other than / for your regex. So if you need to match / you won't need to escape it. Either of the below is equivalent to the regex at the beginning of my answer. In theory you can substitute any character for the standard /'s.
perl -pe 's#(regex)#\n$1#'
perl -pe 's{(regex)}{\n$1}'
A couple final thoughts.
using -ne instead of -pe acts similarly, but doesn't automatically print at the end. It can be handy if you want to print on your own. E.g., here's a grep-alike (m/foobar/ is a regex match):
perl -ne 'if (m/foobar/) {print}'
If you are finding dealing with newlines troublesome, and you want it to be magically handled for you, add -l. Not useful for the OP, who was working with newlines, though.
Bonus tip - if you have the pcre package installed, it comes with pcregrep, which uses full perl-compatible regexes.
In this case, I do not use sed. I use tr.
cat Somefile |tr ',' '\012'
This takes each comma and replaces it with a newline (octal 012 is the line feed character, not the carriage return).
To insert a newline to output stream on Linux, I used:
sed -i "s/def/abc\\\ndef/" file1
Where file1 was:
def
Before the sed in-place replacement, and:
abc
def
After the sed in-place replacement. Please note the use of \\\n. If the patterns have a " inside it, escape using \".
Hmm, just escaped newlines seem to work in more recent versions of sed (I have GNU sed 4.2.1),
dev:~/pg/services/places> echo 'foobar' | sed -r 's/(bar)/\n\1/;'
foo
bar
echo pattern | sed -E -e $'s/^(pattern)/\\\n\\1/'
worked fine on El Capitan with () support
In my case the below method works.
sed -i 's/playstation/PS4/' input.txt
Can be written as:
sed -i 's/playstation/PS4\nplaystation/' input.txt
PS4
playstation
Consider using \\n while using it in a string literal.
sed : is stream editor
-i : Allows to edit the source file
+: Is delimiter.
I hope the above information works for you 😃.
in sed you can reference groups in your pattern with "\1", "\2", ....
so if the pattern you're looking for is "PATTERN", and you want to insert "BEFORE" in front of it, you can use, sans escaping
sed 's/(PATTERN)/BEFORE\1/g'
i.e.
sed 's/\(PATTERN\)/BEFORE\1/g'
You can also do this with awk, using -v to provide the pattern:
awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
This checks if a line contains the given pattern. If so, it inserts a newline in front of every match on that line.
See a basic example:
$ cat file
hello
this is some pattern and we are going ahead
bye!
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
hello
this is some
pattern and we are going ahead
bye!
Note that it affects every occurrence of the pattern in a line:
$ cat file
this pattern is some pattern and we are going ahead
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
this
pattern is some
pattern and we are going ahead
sed -e 's/regexp/\0\n/g'
In GNU sed, \0 stands for the whole match (the same as &), so the matched text is kept and...
\n is the newline that follows it.
This doesn't work on some flavors of Unix, but I think it's the solution to your problem.
echo "Hello" | sed -e 's/Hello/\0\ntmow/g'
Hello
tmow
This works on a Mac for me:
sed -i.bak -e 's/regex/xregex/g' input.txt
sed -i.bak -e 's/qregex/\'$'\nregex/g' input.txt
I don't know whether it's a perfect one...
After reading all the answers to this question, it still took me many attempts to get the correct syntax to the following example script:
#!/bin/bash
# script: add_domain
# using fixed values instead of command line parameters $1, $2
# to show typical variable values in this example
ipaddr="127.0.0.1"
domain="example.com"
# no need to escape $ipaddr and $domain values if we use separate quotes.
sudo sed -i '$a \\n'"$ipaddr www.$domain $domain" /etc/hosts
The script appends a newline \n followed by another line of text to the end of a file using a single sed command.
In vi on Red Hat, I was able to insert carriage returns using just the \r character. I believe this internally executes 'ex' instead of 'sed', but it's similar, and vi can be another way to do bulk edits such as code patches. For example. I am surrounding a search term with an if statement that insists on carriage returns after the braces:
:.,$s/\(my_function(.*)\)/if(!skip_option){\r\t\1\r\t}/
Note that I also had it insert some tabs to make things align better.
Just to add to the list of many ways to do this, here is a simple python alternative. You could of course use re.sub() if a regex were needed.
python -c 'print(open("./myfile.txt", "r").read().replace("String to match", "String to match\n"))' > myfile_lines.txt
sed 's/regexp/\'$'\n/g'
works, as justified and detailed by mojuba in his answer.
However, this did not work:
sed 's/regexp/\\\n/g'
It added a new line, but at the end of the original line, a \n was added.

Resources