escaping newlines in sed replacement string

Here are my attempts to replace a b character with a newline using sed while running bash
$> echo 'abc' | sed 's/b/\n/'
anc
no, that's not it
$> echo 'abc' | sed 's/b/\\n/'
a\nc
no, that's not it either. The output I want is
a
c
HELP!

Looks like you are on BSD or Solaris. Try this:
[jaypal:~/Temp] echo 'abc' | sed 's/b/\
> /'
a
c
Add a backslash, press Enter, and then complete the sed command on the next line.

$ echo 'abc' | sed 's/b/\'$'\n''/'
a
c
In Bash, $'\n' expands to a literal newline character (see the "QUOTING" section of man bash). The shell concatenates the three adjacent strings into one argument before passing it to sed. Sed requires a newline in the replacement to be escaped, hence the backslash at the end of the first string.
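Where $'...' is not available (plain POSIX sh, for instance), the same argument can be built from a variable that holds a literal newline. This is only a sketch; the variable names nl and result are made up for illustration:

```sh
# nl holds exactly one newline character (the quotes span two lines).
nl='
'
# Inside double quotes, \\ becomes one backslash, so sed receives
# s/b/\<newline>/ which is the escaped newline it expects in a replacement.
result=$(echo 'abc' | sed "s/b/\\${nl}/")
printf '%s\n' "$result"
```

This should print a and c on separate lines with both GNU and BSD sed.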

You didn't say whether you want to replace every b. If so, tr is simpler:
$ echo abcbd | tr b $'\n'
a
c
d
Works for me on Solaris 5.8 and bash 2.03

In a multiline file I had to pipe through tr on both sides of sed, like so:
echo "$FILE_CONTENTS" | \
tr '\n' ¥ | tr ' ' ∑ | mySedFunction $1 | tr ¥ '\n' | tr ∑ ' '
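Newlines and leading spaces kept getting stripped somewhere along the way (most likely by the shell's word splitting of unquoted expansions rather than by sed itself), so I swap them for placeholder characters before sed and swap them back afterwards. This solved the problem for me. Note that ¥ and ∑ are multibyte characters in UTF-8, and GNU tr works on single bytes, so a single-byte placeholder that cannot occur in the data is safer. I wish I had seen someone post this somewhere, because it would have saved me about three hours of my life.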
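The stripping described above can usually be reproduced, and avoided, with quoting alone. A minimal sketch, assuming the loss comes from an unquoted expansion; contents is a hypothetical stand-in for $FILE_CONTENTS:

```sh
# contents is a hypothetical stand-in for $FILE_CONTENTS.
contents='  indented line
second   line'
# Unquoted: the shell splits on whitespace and echo rejoins with single spaces.
unquoted=$(echo $contents)
# Quoted: every space and newline reaches the pipeline untouched.
quoted=$(printf '%s\n' "$contents" | sed 's/line/LINE/')
printf '%s\n' "$unquoted" "$quoted"
```

If the quoted form is enough for your data, the tr round trip is not needed.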

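echo 'abc' | sed 's/b/\'$'\n''/'
You are missing $'...' around \n. Without it, the shell reduces the unquoted \n to n, and sed receives s/b/\n/ again, the same as your first attempt.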

Related

Delete words in a line using grep or sed

I want to delete three words with a special character on a line such as
Input:
\cf4 \cb6 1749,1789 \cb3 \
Output:
1749,1789
I have tried a couple sed and grep statements but so far none have worked, mainly due to the character \.
My unsuccessful attempt:
sed -i 's/ [.\c ] //g' inputfile.ext >output file.ext

Awk accepts a regex Field Separator (in this case, comma or space):
$ awk -F'[ ,]' '$0 = $3 "." $4' <<< '\cf4 \cb6 1749,1789 \cb3 \'
1749.1789
-F'[ ,]' - Use a single character from the set space/comma as Field Separator
$0 = $3 "." $4 - Set the entire line $0 to Field 3 ($3), followed by a literal period ".", followed by Field 4 ($4); if the result is truthy, do the default behavior (print the entire line)
Replace <<< 'input' with file if every line of that file has the same delimiters (spaces/comma) and number of fields. If your input file is more complex than the sample you shared, please edit your question to show actual input.
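To see how the [ ,] separator splits the sample line, it can help to print every field with its index. A small sketch; the variable names line and fields are made up for illustration:

```sh
line='\cf4 \cb6 1749,1789 \cb3 \'
# Print each field as index=value so the split points are visible.
fields=$(printf '%s\n' "$line" |
  awk -F'[ ,]' '{ for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }')
printf '%s\n' "$fields"
```

Fields 3 and 4 hold the two halves of the number, which is why the answer joins $3 and $4.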

The backslash is a special meta-character that confuses bash.
We treat it like any other meta-character, by escaping it, with--you guessed it--a backslash!
But first, we need to grep this pattern out of our file
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file # Close enough! -E makes + mean "one or more"
Now, just sed out those pesky backslashes
| sed -e 's/\\//g' # Don't forget the g, otherwise it'll only strip out 1 backslash
Now, finally, sed out the clusters of 2 alpha followed by a number and a space!
| sed -e 's/[a-z][a-z][0-9] //g'
And, finally....
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file | sed -e 's/\\//g' | sed -e 's/[a-z][a-z][0-9] //g'
Output:
1749,1789

My guess is you are having trouble because you have backslashes in input and can't figure out how to get backslashes into your regex. Since backslashes are escape characters to shell and regex you end up having to type four backslashes to get one into your regex.
Ben Van Camp already posted an answer that uses single quotes to make the escaping a little easier; however I shall now post an answer that simply avoids the problem altogether.
grep -o '[0-9]*,[0-9]*' | tr , .
Locks on to the comma and selects the digits on either side and outputs the number. Alternately if comma is not guaranteed we can do it this way:
egrep -o ' [0-9,]*|^[0-9,]*' | tr , . | tr -d ' '
Both of these assume there's only one usable number per line.
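The backslash doubling mentioned at the start of this answer can be checked directly. A minimal sketch with a made-up sample string, counting matching lines with grep -c:

```sh
sample='a\b'
# Single quotes: the shell passes both backslashes through,
# and the regex \\ matches one literal backslash.
single=$(printf '%s\n' "$sample" | grep -c 'a\\b')
# Double quotes: the shell halves them first, so four are needed to deliver two.
double=$(printf '%s\n' "$sample" | grep -c "a\\\\b")
printf '%s %s\n' "$single" "$double"
```

Both commands should report one matching line.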

$ awk '{sub(/,/,".",$3); print $3}' file
1749.1789
$ sed 's/\([^ ]* \)\{2\}\([^ ]*\).*/\2/; s/,/./' file
1749.1789

Bash - Read in a file and replace multiple spaces with just one comma

I'm trying to write a bash script that will take in a file with spaces and output the same file, but comma delimited. I figured out how to replace spaces with commas, but I've run into a problem: there are some rows that have a variable number of spaces. Some rows contain 2 or 3 spaces and some contain as many as 7 or 13. Here's what I have so far:
sed 's/ /,/g' $varfile > testdone.txt
$varfile is the file name that the user gives.
But I'm not sure how to fix the variable space problem. Any suggestions are welcome. Thank you.

This is not a job for sed. tr is more appropriate:
$ printf 'foo bar\n' | tr -s ' ' ,
foo,bar
The -s tells tr to squash multiple occurrences. Also, you can generalize with tr -s '[:space:]' , (which will replace newlines, perhaps undesirable) or tr -s ' \t' , to handle spaces or tabs.
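One thing to watch for with the squeeze approach: leading or trailing spaces turn into a leading or trailing comma. A sketch of trimming them first (the sed trim step is an addition for illustration, not part of the answer above):

```sh
# Plain squeeze: the leading run of spaces becomes a leading comma.
raw=$(printf '  foo   bar       baz\n' | tr -s ' ' ',')
# Trim leading and trailing spaces first, then squeeze.
trimmed=$(printf '  foo   bar       baz\n' | sed 's/^ *//; s/ *$//' | tr -s ' ' ',')
printf '%s\n' "$raw" "$trimmed"
```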

You just need to use the + quantifier to match one or more
Assuming GNU sed
sed 's/ \+/,/g' file
# or
sed -E 's/ +/,/g' file
With GNU basic regular expressions, the "one or more" quantifier is \+
With GNU extended regular expressions, the "one or more" quantifier is +
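If GNU sed cannot be assumed, "one or more spaces" can also be written with the plain * quantifier, which every POSIX sed supports. A minimal sketch with made-up input:

```sh
# Two spaces then *: "a space followed by zero or more spaces", i.e. one or more.
converted=$(printf 'a  b       c d\n' | sed 's/  */,/g')
printf '%s\n' "$converted"
```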

Unix Shell - Removing special newline characters

We are receiving a file that is delimited into rows with the \(newline) and columns with the \(tab) character.
When there is a manual newline present in one of the "fields" of the file, it comes in as a special newline with two backslashes (\\newline).
To remove the special tabs \(tab), we are using this sed command, which works correctly:
sed "s/$(printf '\\\\\t')/ /g"
The corresponding command for newlines, however does not:
sed "s/$(printf '\\\\\n')/ /g"
It does not remove the \n, only the backslash before it. Is there special handling that needs to be done to remove \(newline)?
Clarification: normal newlines are formatted like this:
\(newline)
Whereas the special characters that need removal are
\\(newline)

Here you go:
echo -e 'hello\\\nthere' | perl -ne 's/\\\n/ /; print'
It would be difficult (but probably possible) to do this in sed, because sed processes input line by line, and your data is broken into multiple lines. This perl one-liner processes the input line by line, and since it treats the newline character as part of the line, it can perform a substitution with space, which I think has the effect that you want.
Or if you prefer awk:
echo -e 'hello\\\nthere' | awk '{ if (sub(/\\$/, " ")) printf "%s", $0; else print }'
At first I suspected your "special newline" character is just the string \\n like in the output of this command:
echo 'hello\\nthere'
You can replace the string \\n with a space like this:
echo 'hello\\nthere' | sed -e 's/\\\\n/ /g'
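For the sed route that this answer calls difficult, one common idiom is to pull in the next line with N whenever the current line ends in a backslash. A sketch, assuming (as the perl one-liner does) that the special newline is a single backslash followed by a newline:

```sh
# printf turns \\ into one backslash and \n into a newline, so the input is:
#   hello\
#   there
#   last
joined=$(printf 'hello\\\nthere\nlast\n' |
  sed -e ':a' -e '/\\$/N; s/\\\n/ /; ta')
printf '%s\n' "$joined"
```

The ta loops back to the label after each successful join, so several continued lines in a row are also merged.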

You can use tr (translate) command as well to do this, like
tr '\n' ' ' < inputfile.txt
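Edit: In that case use it like
tr '\\\n' ' ' < inputfile.txt
Note that tr maps single characters, not sequences, so this turns every backslash and every newline into a space.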

Replace 5 dots with a single space

I have a title that has 5 consecutive dots which I'd like replaced with just one space using bash script. Doing this is not helping:
tr '.....' ' '
Obviously because it's replacing the five dots with 5 spaces.
Basically, I have a title that I want changed to a slug. So I'm using:
tr A-Z a-z | tr '[:punct:] [:blank:]' '-'
to change everything to lowercase and change any punctuation mark and spaces to a hyphen, but I'm stuck with the dots.
The title I'm using is something like: Believe.....Right Now
So I want that turned into believe-right-now
How do I change the 5 dots to a single space?

You don't need sed or awk. Your original tr command should do the trick, you just need to add the -s flag. After tr translates the desired characters into hyphens, -s will squeeze all repeated hyphens into one:
tr A-Z a-z | tr -s '[:punct:] [:blank:]' '-'
I'm not sure what the input/output context is for you, but I tested the above as follows, and it worked for me:
tr A-Z a-z <<< "Believe.....Right Now" | tr -s '[:punct:] [:blank:]' '-'
output:
believe-right-now
See http://www.ss64.com/bash/tr.html for reference.

Transliterations are performed with mappings, which means each character is mapped to another character, or deleted with tr -d. This is why tr '.....' ' ' does not work: it maps the single character . to a space, five times over.
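The character-by-character mapping is easy to see side by side with the squeezed form. A minimal sketch with made-up input:

```sh
# One-to-one mapping: five dots become five spaces.
one_to_one=$(printf 'a.....b\n' | tr '.' ' ')
# With -s, the run of translated characters is squeezed to a single space.
squeezed=$(printf 'a.....b\n' | tr -s '.' ' ')
printf '[%s]\n[%s]\n' "$one_to_one" "$squeezed"
```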
Replacing five dots with space:
using sed with extended regular expressions:
$ sed -r 's/\.{5}/ /g' <<< "Believe.....Right Now"
Believe Right Now
using sed without -r:
$ sed 's/\.\{5\}/ /g' <<< "Believe.....Right Now"
Believe Right Now
using parameter expansion:
$ text="foo.....bar.....zzz" && echo "${text//...../ }"
foo bar zzz
Replacing five dots and spaces with -:
$ sed -r 's/\.{5}| /-/g' <<< "Believe.....Right Now"
Believe-Right-Now
Full replacement -- ditching tr usage:
$ sed -re 's/\.{5}| /-/g' -e 's/([A-Z])/\l&/g' <<< "Believe.....Right Now"
believe-right-now
or, in case your sed version does not support the flag -r, you may use:
$ sed -e 's/\.\{5\}\| /-/g' -e 's/\([A-Z]\)/\l&/g' <<< "Believe.....Right Now"
believe-right-now
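The \l replacement escape and the <<< here-string are GNU sed and bash features. Where neither can be assumed, a sketch along these lines should give the same slug with POSIX tools only (the variable name slug is made up):

```sh
# Step 1: five literal dots become one space.
# Step 2: lowercase everything.
# Step 3: squeeze runs of spaces into a single hyphen.
slug=$(printf '%s\n' 'Believe.....Right Now' |
  sed 's/\.\{5\}/ /g' |
  tr 'A-Z' 'a-z' |
  tr -s ' ' '-')
printf '%s\n' "$slug"
```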

$ cat file
Believe.....Right Now
$ awk '{gsub(/[[:punct:][:space:].]+/,"-"); print tolower($0)}' file
believe-right-now

shell replace cr\lf by comma

I have input.txt
1
2
3
4
5
I need to get such output.txt
1,2,3,4,5
How to do it?

Try this:
tr '\n' ',' < input.txt > output.txt

With sed, you could use:
sed -e 'H;${x;s/\n/,/g;s/^,//;p;};d'
The H appends the pattern space to the hold space (saving the current line in the hold space). The ${...} surrounds actions that apply to the last line only. Those actions are: x swap hold and pattern space; s/\n/,/g substitute embedded newlines with commas; s/^,// delete the leading comma (there's a newline at the start of the hold space); and p print. The d deletes the pattern space - no printing.
You could also use, therefore:
sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}'
The -n suppresses default printing so the final d is no longer needed.
This solution assumes that the CRLF line endings are the local native line ending (so you are working on DOS) and that sed will therefore generate the local native line ending in the print operation. If you have DOS-format input but want Unix-format (LF only) output, then you have to work a bit harder - but you also need to stipulate this explicitly in the question.
It worked OK for me on MacOS X 10.6.5 with the numbers 1..5, and 1..50, and 1..5000 (23,893 characters in the single line of output); I'm not sure that I'd want to push it any harder than that.
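For the DOS-format input, Unix-format output case mentioned above, one option is to drop the carriage returns before sed sees them. A sketch under that assumption, using the same sed script:

```sh
# tr -d '\r' drops every carriage return, leaving LF-only lines for sed.
joined=$(printf '1\r\n2\r\n3\r\n' | tr -d '\r' |
  sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}')
printf '%s\n' "$joined"
```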

In response to @Jonathan's comment on @eumiro's answer:
tr -s '\r\n' ',' < input.txt | sed -e 's/,$/\n/' > output.txt

tr and sed used to be very good, but when it comes to file parsing and regex you can't beat perl
(Not sure why people think that sed and tr are closer to shell than perl... )
perl -pe 's/\n/,/' your_file
if you want pure shell to do it then look at string matching
${string/#substring/replacement}
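A pure-shell join does not strictly need the bash-only substitution syntax; a read loop with standard parameter expansion also works. A minimal sketch, with the input supplied inline instead of from input.txt and a made-up variable name joined:

```sh
joined=
while IFS= read -r line; do
  # ${joined:+$joined,} expands to "previous," only when joined is non-empty,
  # so no leading or trailing comma is produced.
  joined="${joined:+$joined,}$line"
done <<'EOF'
1
2
3
4
5
EOF
printf '%s\n' "$joined"
```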

Use the paste command. Here it is with a pipe:
printf '1\n2\n3\n4\n5\n' | paste -s -d, /dev/stdin
Here it is with a file:
printf '1\n2\n3\n4\n5\n' > /tmp/input.txt
paste -s -d, /tmp/input.txt
Per the man page, -s concatenates all lines into one and -d sets the delimiter character.

Awk versions:
awk '{printf("%s,",$0)}' input.txt
awk 'BEGIN{ORS=","} {print $0}' input.txt
Output - 1,2,3,4,5,
Since you asked for 1,2,3,4,5 rather than 1,2,3,4,5, (note the comma after the 5; most of the solutions above also leave that trailing comma), here are two more versions with Awk (with wc and sed) that get rid of the last comma:
i='input.txt'; awk -v c="$(wc -l < "$i")" '{printf("%s",$0);if(NR<c){printf(",")}}' "$i"
awk '{printf("%s,",$0)}' input.txt | sed 's/,\s*$//'
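The trailing comma can also be avoided inside a single awk call by printing the separator before every line except the first. A sketch with the input supplied via printf instead of input.txt:

```sh
# Print "," before each line except the first; END adds the final newline.
joined=$(printf '1\n2\n3\n4\n5\n' |
  awk '{ printf "%s%s", (NR > 1 ? "," : ""), $0 } END { print "" }')
printf '%s\n' "$joined"
```

This avoids both the extra wc pass and the cleanup with sed.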

printf "1\n2\n3" | tr '\n' ','
if you want to output that to a file just do
printf "1\n2\n3" | tr '\n' ',' > myFile
if you have the content in a file do
cat myInput.txt | tr '\n' ',' > myOutput.txt

python version:
python -c 'import sys; print(",".join(sys.stdin.read().splitlines()))'
Doesn't have the trailing comma problem (because join works that way), and splitlines splits data on native line endings (and removes them).

cat input.txt | sed -e 's|$|,|' | xargs -i echo "{}"
