Bash: Replacing "" with newline character, using sed or tr - bash

I'm trying to format output in a way that inserts newline characters after each 'line', with lines denoted by double quotes (""). The quotes themselves are temporary and to be stripped in a later step.
Input:
"a",1,"aa""b",2,"bb"
Output:
a,1,aa
b,2,bb
I've tried:
sed 's/""/\n/'
sed 's/""/\/g'
tr '""' '\n'
But tr seems to replace every quote character and sed seems to insert \n as text instead of a newline. What can I do to make this work?

echo '"a",1,"aa""b",2,"bb"' |awk -v RS='""' '{$1=$1} {gsub(/"/,"")}1'
a,1,aa
b,2,bb
or using sed:
echo '"a",1,"aa""b",2,"bb"' | sed -e 's/""/\n/g' -e 's/"//g' # OR sed -e 's/""/\n/g;s/"//g'
a,1,aa
b,2,bb
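awk solution: the record separator (RS) is changed from the default newline to "", so awk ends a record each time it hits "". A multi-character RS is an extension supported by gawk and mawk; POSIX leaves its behavior unspecified.
sed solution: the first substitution converts "" into a newline, and the second removes the remaining " characters. The \n in the replacement relies on GNU sed.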
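Whether sed replaces only the first "" or every "" only becomes visible with more than two records. A minimal sketch, assuming GNU sed (for \n in the replacement) and a made-up three-record input; the variable names are illustrative only:

```shell
# Hypothetical input with three records; GNU sed is assumed for \n.
input='"a",1,"aa""b",2,"bb""c",3,"cc"'

# Without the g flag only the first "" becomes a newline.
first_only=$(printf '%s\n' "$input" | sed -e 's/""/\n/' -e 's/"//g')

# With the g flag every "" becomes a newline.
all_records=$(printf '%s\n' "$input" | sed -e 's/""/\n/g' -e 's/"//g')

printf '%s\n' "$all_records"
```

For the two-record sample in the question both forms give the same result, which is why the missing flag is easy to overlook.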

neech#nicolaw.uk:~ $ cat file.txt
"a",1,"aa""b",2,"bb"
neech#nicolaw.uk:~ $ sed 's/""/\n/g' file.txt | tr -d '"'
a,1,aa
b,2,bb

You seem to be dealing with POSIX sed, which does not support the \n notation in the replacement text. Insert an actual newline into the replacement, either:
sed 's/""/\
/'
Or:
sed 's/""/\'$'\n''/'
E.g.:
sed 's/""/\
/' | tr -d \"
Output:
a,1,aa
b,2,bb
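The literal-newline form can be wrapped in a small function so the awkward quoting lives in one place. This is a sketch rather than the answer's own code: split_records and nl are hypothetical names, and the g flag is added so that every "" is replaced.

```shell
# Hold a real newline in a variable (nl is a hypothetical name).
nl='
'

# split_records: read stdin, turn each "" into a newline, drop remaining quotes.
# The sed script expands to s/""/\<newline>/g, which POSIX sed accepts.
split_records() {
  sed "s/\"\"/\\${nl}/g" | tr -d '"'
}

printf '%s\n' '"a",1,"aa""b",2,"bb"' | split_records
```

Because the newline is a real character preceded by a backslash, this does not depend on sed understanding \n.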

As suggested by George Vasiliou if you have perl you could use:
> echo '"a",1,"aa""b",2,"bb"' | perl -pe 's/""/"\n"/g;s/"//g'
This avoids the non portable sed problem.
Or, for a crappy hack version:
Replace the "" with another character, use tr to turn that character into a newline (tr does understand \n), and then remove the remaining single " characters. The placeholder (# here) must not occur anywhere in the data.
So you can get the "" replaced with newline like this:
sed 's/""/#/g' | tr '#' '\n'
Then the rest follows:
> echo '"a",1,"aa""b",2,"bb"'| sed 's/""/#/g' | tr '#' '\n' | sed 's/\"//g'

Related

Replace '>' with '>\n' in several files in shell/bash

I have several files in a single folder and I want to replace the character > with >\n everywhere in all of those files.
But whatever I do, the \n character does not get added after the > character.
I have tried the following:
echo '>ABCCHACAC' | tr '\>' '>\\n'
echo '>ABCCHACAC' | tr '>' '>\\n'
echo '>ABCCHACAC' | tr '>' '>\n'
echo '>ABCCHACAC' | tr '>' '\>\n'
echo '>ABCCHACAC' | tr '>' '\>\\n'
echo '>ABCCHACAC' | tr '>' '\>\\n'
But I get the same input string as output, whereas the correct output I want is:
>
ABCCHACAC
And I am using this script to do the same thing on many files:
for f in *.txt
do
tr ">" ">\n" < "$f" > $(basename "$f" .txt)_newline_added.txt
done
tr is for one-for-one character replacements, not replacing strings. E.g. if you translate abc with def, it replaces all a with d, all b with e, and all c with f. When the second string is longer than the first, the extra characters are ignored. So tr '>' '>\n' means to replace > with > and ignores \n.
Use sed to perform string replacements.
sed 's/>/>\n/g' "$f" > "$(basename "$f" .txt)_newline_added.txt"
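The one-for-one mapping is easy to check directly. A small sketch (the variable names are only for illustration):

```shell
# tr maps characters position by position: a->d, b->e, c->f.
mapped=$(printf 'abcabc\n' | tr 'abc' 'def')

# Here '>' maps to '>' and the extra \n in the second set is ignored,
# so the text comes back unchanged.
unchanged=$(printf '>ABCCHACAC\n' | tr '>' '>\n')

printf '%s\n%s\n' "$mapped" "$unchanged"
```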
In addition to Barmar's answer, if you're using a BSD-based *nix (e.g. OS X), you'll either need to include an escaped literal newline or use tr in addition to sed.
Escaped literal newline:
$ sed 's/^>/>\
/' "$f"
sed with tr:
$ sed 's/^>/>▾/' "$f" | tr '▾' '\n'
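Applied to the loop from the question, the escaped literal newline might look like the sketch below. add_newline_after_gt is a hypothetical helper name; unlike the question's loop, which writes into the current directory via basename, it writes each output file next to its input, and it replaces every > on a line rather than only a leading one.

```shell
# add_newline_after_gt DIR: for each DIR/*.txt, write DIR/<name>_newline_added.txt
# with a newline inserted after every '>'.
add_newline_after_gt() {
  for f in "$1"/*.txt; do
    [ -e "$f" ] || continue                        # no matching files
    case $f in *_newline_added.txt) continue ;; esac  # skip earlier output
    out="${f%.txt}_newline_added.txt"
    # The replacement is '>' followed by backslash + real newline;
    # the next line must start at column 0 or the indent becomes output.
    sed 's/>/>\
/g' "$f" > "$out"
  done
}
```

Calling it as add_newline_after_gt . would process the .txt files in the current directory.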

replacing spaces and brackets in a string + sed + is there a better way than this?

I'm trying to replace the spaces with underscores and remove the brackets in the string: this is just a (test)
I do the following:
echo "this is just a (test)" | sed -e 's/ /_/g' | sed -e 's/(//g' | sed -e 's/)//g'
And this gives me:
this_is_just_a_test
Is there a better, shorter way of writing it in sed?
You can achieve the same thing using tr:
echo "this is just a (test)" | tr \ _ | tr -d \(\)
The first tr replaces spaces with underscores and the second one deletes all parentheses.
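If the goal is to stay within sed, the three processes can be folded into one. A minimal sketch (result is just an illustrative variable name):

```shell
# One sed process: replace spaces, then delete both kinds of parenthesis
# with a single bracket expression.
result=$(echo "this is just a (test)" | sed -e 's/ /_/g' -e 's/[()]//g')
printf '%s\n' "$result"
```

Each -e adds another command to the same script, so the input is read only once.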

removing new line character from incoming stream using sed

I am new to shell scripting and I am trying to remove the newline character from each line using sed. This is what I have done so far:
printf "{new\nto\nlinux}" | sed ':a;N;s/\n/ /g'
This removes only the first newline character.
I found this command somewhere:
printf "{new\nto\nlinux}" | sed ':a;N;$!ba;s/\n/ /g'
but it gives: "ba: Event not found."
If I do:
printf "{new\nto\nlinux}" | sed ':a;N;s/\n/ /g' | sed ':a;N;s/\n/ /g'
then it gives the correct output, but I am looking for something better, as I am not sure how many newline characters I will get when I run the script.
The incoming stream is from echo, printf, or some variable in the script.
To remove newlines, use tr:
tr -d '\n'
If you want to replace each newline with a single space:
tr '\n' ' '
The error ba: Event not found is coming from csh, and is due to csh trying to match !ba in your history list. You can escape the ! and write the command:
sed ':a;N;$\!ba;s/\n/ /g' # Suitable for csh only!!
but sed is the wrong tool for this, and you would be better off using a shell that handles quoted strings more reasonably. That is, stop using csh and start using bash.
This might work for you:
printf "{new\nto\nlinux}" | paste -sd' '
{new to linux}
or:
printf "{new\nto\nlinux}" | tr '\n' ' '
{new to linux}
or:
printf "{new\nto\nlinux}" |sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n/ /g'
{new to linux}
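For readers wondering what the sed loop is doing, here is an annotated sketch of the same idea. join_lines is a hypothetical wrapper name, and note that this approach holds the whole input in the pattern space, so it is best kept for modest inputs.

```shell
# join_lines: replace every newline on stdin with a single space.
join_lines() {
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g'
}
# :a        define a label named a
# N         append the next input line to the pattern space
# $!ba      if this is not the last line, branch back to a
# s/\n/ /g  once all lines are collected, turn the newlines into spaces

printf '{new\nto\nlinux}\n' | join_lines
```

Splitting the script across several -e options avoids the semicolon-after-label form, which some non-GNU seds do not accept.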
Use perl instead of sed. perl is similar to sed:
ubuntu#ubuntu:/$ printf "{new\nto\nlinux}" | sed 's/\n/ /g'; echo ''
{new
to
linux}
ubuntu#ubuntu:/$ printf "{new\nto\nlinux}" | perl -pe 's/\n/ /g'; echo ''
{new to linux}
ubuntu#ubuntu:/$ echo -e "new\nto\nlinux\ntest\n1\n2 3" | perl -pe 's/\n/_ _/g'; echo ''
new_ _to_ _linux_ _test_ _1_ _2 3_ _
ubuntu#ubuntu:/$

Replace blank spaces with commas in a string in bash shell

I would like to replace blank spaces/white spaces in a string with commas.
STR1=This is a string
to
STR1=This,is,a,string
Without using external tools:
echo ${STR1// /,}
Demo:
$ STR1="This is a string"
$ echo ${STR1// /,}
This,is,a,string
See bash: Manipulating strings.
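The expansion replaces each space individually, so a run of spaces yields a run of commas. If one comma per run is wanted without external tools, one option is to split into an array and join with IFS. This is a bash-only sketch; words, each_space and joined are hypothetical names, and the input string is made up to contain repeated spaces.

```shell
# Made-up input with runs of spaces.
STR1="This   is a    string"

# Every single space becomes a comma.
each_space=${STR1// /,}

# Split on whitespace into an array, then join with commas:
# "${words[*]}" joins elements using the first character of IFS.
read -r -a words <<< "$STR1"
joined=$(IFS=,; printf '%s' "${words[*]}")

printf '%s\n%s\n' "$each_space" "$joined"
```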
Just use sed:
echo $STR1 | sed 's/ /,/g'
or the pure Bash way:
echo ${STR1// /,}
kent$ echo "STR1=This is a string"|awk -v OFS="," '$1=$1'
STR1=This,is,a,string
Note:
If there are consecutive blanks, each run is replaced with a single comma, because awk splits fields on runs of whitespace by default.
This might work for you:
echo 'STR1=This is a string' | sed 'y/ /,/'
STR1=This,is,a,string
or:
echo 'STR1=This is a string' | tr ' ' ','
STR1=This,is,a,string
How about
STR1="This is a string"
StrFix="$( echo "$STR1" | sed 's/[[:space:]]/,/g')"
echo "$StrFix"
**output**
This,is,a,string
If you have multiple adjacent spaces in your string and want to reduce them to just one comma, then change the sed to
STR1="This    is a   string"
StrFix="$( echo "$STR1" | sed 's/[[:space:]][[:space:]]*/,/g')"
echo "$StrFix"
**output**
This,is,a,string
I'm using a non-standard sed, and so have used [[:space:]][[:space:]]* to indicate one or more "white-space" characters (including tabs, VT, maybe a few others). In a modern sed, I would expect [[:space:]]+ to work as well (with extended regular expressions enabled, e.g. sed -E).
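A sketch of the extended-regex form, assuming a sed that accepts -E (current GNU and BSD seds do). The input is made up to mix spaces and tabs; input and result are illustrative names.

```shell
# Made-up input: "This", space-tab-space, "is", two spaces, "a", tab, "string".
input=$(printf 'This \t is  a\tstring')

# With -E, + means "one or more", so each run of whitespace becomes one comma.
result=$(printf '%s\n' "$input" | sed -E 's/[[:space:]]+/,/g')

printf '%s\n' "$result"
```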
STR1=`echo $STR1 | sed 's/ /,/g'`

Using sed to replace a string with the contents of a variable, even if it's an escape character

I'm using
sed -e "s/\*DIVIDER\*/$DIVIDER/g" to replace *DIVIDER* with a user-specified string, which is stored in $DIVIDER. The problem is that I want them to be able to specify escape characters as their divider, like \n or \t. When I try this, I just end up with the letter n or t, or so on.
Does anyone have any ideas on how to do this? It will be greatly appreciated!
EDIT: Here's the meat of the script, I must be missing something.
curl --silent "$URL" > tweets.txt
if [[ `cat tweets.txt` == *\<error\>* ]]; then
grep -E '(error>)' tweets.txt | \
sed -e 's/<error>//' -e 's/<\/error>//' |
sed -e 's/<[^>]*>//g' |
head $headarg | sed G | fmt
else
echo $REPLACE | awk '{gsub(".", "\\\\&");print}'
grep -E '(description>)' tweets.txt | \
sed -n '2,$p' | \
sed -e 's/<description>//' -e 's/<\/description>//' |
sed -e 's/<[^>]*>//g' |
sed -e 's/\&amp\;/\&/g' |
sed -e 's/\&lt\;/\</g' |
sed -e 's/\&gt\;/\>/g' |
sed -e 's/\&quot\;/\"/g' |
sed -e 's/\&....\;/\?/g' |
sed -e 's/\&.....\;/\?/g' |
sed -e 's/^ *//g' |
sed -e :a -e '$!N;s/\n/\*DIVIDER\*/;ta' | # Replace newlines with *divider*.
sed -e "s/\*DIVIDER\*/${DIVIDER//\\/\\\\}/g" | # Replace *DIVIDER* with the actual divider.
head $headarg | sed G
fi
The long list of sed lines are replacing characters from an XML source, and the last two are the ones that are supposed to replace the newlines with the specified character. I know it seems redundant to replace a newline with another newline, but it was the easiest way I could come up with to let them pick their own divider. The divider replacement works great with normal characters.
You can use bash to escape the backslash like this:
sed -e "s/\*DIVIDER\*/${DIVIDER//\\/\\\\}/g"
The syntax is ${name/pattern/string}. If pattern begins with /, every occurrence of pattern in name is replaced by string. Otherwise only the first occurrence is replaced.
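To see what the expansion does on its own, here is a small bash sketch with a made-up divider containing one backslash (escaped and result are hypothetical names). It only handles backslashes; a divider containing / or & would still need separate escaping before being placed in a sed replacement.

```shell
# Made-up divider containing a single backslash.
DIVIDER='a\b'

# Double every backslash so sed sees \\ and emits one literal backslash.
escaped=${DIVIDER//\\/\\\\}

result=$(printf '%s\n' 'x*DIVIDER*y' | sed "s/\*DIVIDER\*/${escaped}/g")

printf '%s\n' "$escaped" "$result"
```

After escaping, the backslash survives as a literal character in the output instead of being consumed by sed.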
Maybe:
case "$DIVIDER" in
(*\\*) DIVIDER=$(echo "$DIVIDER" | sed 's/\\/\\\\/g');;
esac
I played with this script:
for DIVIDER in 'xx\n' 'xxx\\ddd' "xxx"
do
echo "In: <<$DIVIDER>>"
case "$DIVIDER" in (*\\*) DIVIDER=$(echo "$DIVIDER" | sed 's/\\/\\\\/g');;
esac
echo "Out: <<$DIVIDER>>"
done
Run with 'ksh' or 'bash' (but not 'sh') on MacOS X:
In: <<xx\n>>
Out: <<xx\\n>>
In: <<xxx\\ddd>>
Out: <<xxx\\\\ddd>>
In: <<xxx>>
Out: <<xxx>>
It seems to be a simple substitution:
$ d='\n'
$ echo "a*DIVIDER*b" | sed "s/\*DIVIDER\*/$d/"
a
b
Maybe I don't understand what you're trying to accomplish.
Then maybe this step could take the place of the last two of yours:
sed -n ":a;$ {s/\n/$DIVIDER/g;p;b};N;ba"
Note the space after the dollar sign. It prevents the shell from interpreting "${s..." as a variable name.
And as ghostdog74 suggested, you have way too many calls to sed. You may be able to change a lot of the pipe characters to backslashes (line continuation) and delete "sed" from all but the first one (leave the "-e" everywhere). (untested)
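As an illustration of collapsing several sed processes into one, the entity-decoding steps could be grouped like this. It is a sketch, not the script's actual code: decode_entities is a hypothetical name, only four entities are covered, and &amp; is deliberately decoded last (the original pipeline does it first) so that text such as &amp;lt; is not decoded twice.

```shell
# decode_entities: decode a few XML entities on stdin in a single sed process.
decode_entities() {
  sed -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&quot;/"/g' \
      -e 's/&amp;/\&/g'   # \& is a literal ampersand in the replacement
}

printf '%s\n' 'a &lt;b&gt; &amp; &quot;c&quot;' | decode_entities
```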
You just need to escape the escape char.
\n will match \n
\ will match \
\\ will match \
Using FreeBSD sed (e.g. on Mac OS X) you have to preprocess the $DIVIDER user input:
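d='\n'                 # user input: backslash followed by n
d='\t'                 # alternative input: backslash followed by t (this assignment wins)
NL=$'\\\n'             # backslash followed by a real newline
TAB=$'\\\t'            # backslash followed by a real tab
d="${d/\\n/${NL}}"     # replace the first literal \n with backslash-newline
d="${d/\\t/${TAB}}"    # replace the first literal \t with backslash-tab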
echo "a*DIVIDER*b" | sed -E -e "s/\*DIVIDER\*/${d}/"

Resources