This is what my input looks like:
string="a 1,a 2,a 3"
This is how I generate list out of the input:
sed -e 's/[^,]*/"&"/g' <<< ${string}
Above command gives me the desired output as:
"a 1","a 2","a 3"
How do I trim each element so that if the input is " a 1, a 2, a 3", my output still comes back as "a 1","a 2","a 3"?
I think it is important to understand that in bash, the double quotes have a special meaning.
string="a 1,a 2,a 3" represents the string a 1,a 2,a 3 (no quotes)
sed -e 's/[^,]*/"&"/g' <<< ${string} is equivalent to the variable out='"a 1","a 2","a 3"'
To accomplish what you want, you can do:
$ string=" a 1, a 2, a 3 "
$ echo "\"$(echo ${string//*( ),*( )/\",\"})\""
"a 1","a 2","a 3"
This uses only bash builtin operations:
1) Replace every comma, together with any spaces around it, by a quoted comma: ${string//*( ),*( )/\",\"}
2) Use word splitting to remove all leading and trailing blanks: $(echo ...) (note: this is a bit ugly and will fail on input with two consecutive spaces inside an element, because word splitting collapses them into one)
3) Print two extra double quotes at the beginning and end of the string.
A better way is to use a double substitution:
$ string=" a 1, a 2, a 3 "
$ foobar="\"${string//,/\",\"}\""
$ echo "${foobar//*( )\"*( )/\"}"
"a 1","a 2","a 3"
Note: both commands use KSH-style globs, which must be enabled with the extglob setting (shopt -s extglob).
Here is an answer that extends your sed command with some basic preprocessing that removes the unwanted spaces:
sed -E -e 's/ *(^|,) */\1/g;s/[^,]*/"&"/g' <<< ${string}
The -E option enables extended regular expressions, which saves some backslashes.
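A portable variant of the same idea, as a minimal sketch: it assumes single-line input and uses only basic regular expressions, so it needs neither -E nor a here-string. The function name quote_list is made up for illustration.

```shell
# quote_list: hypothetical helper, not taken from any answer in this thread.
# Trims blanks around each comma-separated element and wraps it in double quotes.
# Assumes the input is a single line.
quote_list() {
    printf '%s\n' "$1" |
        sed -e 's/^ *//' -e 's/ *$//' \
            -e 's/ *, */","/g' \
            -e 's/^/"/' -e 's/$/"/'
}

out=$(quote_list " a 1, a 2 , a 3 ")
printf '%s\n' "$out"
```

The outer blanks are stripped first, so the only spaces left to remove are the ones next to a comma, and those are consumed by the same substitution that inserts the quotes.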
EDIT: Since the OP asked for the output to be wrapped in ", I am adding that now.
echo "$string" | sed -E 's/^ +/"/;s/$/"/;s/, +/"\,"/g'
Output will be as follows.
echo "$string" | sed -E 's/^ +/"/;s/$/"/;s/, +/"\,"/g'
"a 1","a 2","a 3"
Could you please try the following and let me know if it helps.
awk '{sub(/^\" +/,"\"");gsub(/, +/,"\",\"")} 1' Input_file
In case you want to save the output back into Input_file itself, append > temp_file && mv temp_file Input_file.
Second solution: using sed.
sed -E 's/^" +/"/;s/, +/"\,"/g' Input_file
Instead of a complex sed pattern, you can use the grep -ow option.
[nooka@lori ~]$ string1="a 1,a 2,a 3"
[nooka@lori ~]$ string2=" a 1, a 2, a 3, a 4"
[nooka@lori ~]$
[nooka@lori ~]$ echo $(echo $string1 | grep -ow "[a-zA-Z] [0-9]"|sed "s/^/\"/;s/$/\"/")|sed "s/\" /\",/g"
"a 1","a 2","a 3"
[nooka@lori ~]$ echo $(echo $string2 | grep -ow "[a-zA-Z] [0-9]"|sed "s/^/\"/;s/$/\"/")|sed "s/\" /\",/g"
"a 1","a 2","a 3","a 4"
1) Use grep -ow to get only the words that match the pattern defined above. You can tweak the pattern to your needs (for example [a-zA-Z] [0-9][0-9]*) to cover more cases.
2) Then wrap each match (a 1, a 2, etc.) in " using the first sed command.
3) Then put a , between each closing and opening " and you get what you wanted. This assumes your pattern always has a single space between the string and the number.
Ok. Finally got it working.
string=" a 1 b 2 c 3 , something else , yet another one with spaces , totally working "
trimmed_string=$(awk '{gsub(/[[:space:]]*,[[:space:]]*/,",")}1' <<< $string)
echo ${trimmed_string}
a 1 b 2 c 3,something else,yet another one with spaces,totally working
string_as_list=$(sed -e 's/[^,]*/"&"/g' <<< ${trimmed_string})
echo ${string_as_list}
"a 1 b 2 c 3","something else","yet another one with spaces","totally working"
All of this was needed because terraform expects list variables to be passed like that: each element must be surrounded by double quotes (" "), and the elements must be delimited by commas ( , ) inside square brackets ([ ]).
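The two steps above, plus the square brackets, can be sketched in one awk call. This is only an illustration under stated assumptions: to_list is a made-up helper name, and elements are assumed to contain no commas or double quotes.

```shell
# to_list: hypothetical helper; assumes elements contain no commas or double quotes.
to_list() {
    printf '%s\n' "$1" | awk -F',' '{
        out = ""
        for (i = 1; i <= NF; i++) {
            item = $i
            gsub(/^[ \t]+|[ \t]+$/, "", item)   # trim blanks around the element
            sep = (i > 1) ? "," : ""
            out = out sep "\"" item "\""        # quote it and append to the result
        }
        print "[" out "]"
    }'
}

list=$(to_list " a 1 b 2 c 3 , something else , totally working ")
printf '%s\n' "$list"
```

Spaces inside an element are left alone; only the blanks next to the commas and at both ends are dropped.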
I have a string containing duplicate words, for example:
abc, def, abc, def
How can I remove the duplicates? The string that I need is:
abc, def
We have this test file:
$ cat file
abc, def, abc, def
To remove duplicate words:
$ sed -r ':a; s/\b([[:alnum:]]+)\b(.*)\b\1\b/\1\2/g; ta; s/(, )+/, /g; s/, *$//' file
abc, def
How it works
:a
This defines a label a.
s/\b([[:alnum:]]+)\b(.*)\b\1\b/\1\2/g
This looks for a duplicated word consisting of alphanumeric characters and removes the second occurrence.
ta
If the last substitution command resulted in a change, this jumps back to label a to try again.
In this way, the code keeps looking for duplicates until none remain.
s/(, )+/, /g; s/, *$//
These two substitution commands clean up any left over comma-space combinations.
Mac OSX or other BSD System
For Mac OSX or other BSD system, try:
sed -E -e ':a' -e 's/\b([[:alnum:]]+)\b(.*)\b\1\b/\1\2/g' -e 'ta' -e 's/(, )+/, /g' -e 's/, *$//' file
Using a string instead of a file
sed easily handles input either from a file, as shown above, or from a shell string as shown below:
$ echo 'ab, cd, cd, ab, ef' | sed -r ':a; s/\b([[:alnum:]]+)\b(.*)\b\1\b/\1\2/g; ta; s/(, )+/, /g; s/, *$//'
ab, cd, ef
You can use awk to do this.
Example:
#!/bin/bash
string="abc, def, abc, def"
string=$(printf '%s\n' "$string" | awk -v RS='[,[:space:]]+' '!a[$0]++{printf "%s%s", $0, RT}')
string="${string%,*}"
echo "$string"
Output:
abc, def
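RT in the command above is specific to GNU awk. A minimal sketch that avoids it, assuming the words are separated by exactly ", " and the input is one line (dedupe is a made-up name):

```shell
# dedupe: hypothetical helper; keeps the first occurrence of each word,
# preserving the original order. Assumes ", " as the separator.
dedupe() {
    printf '%s\n' "$1" | awk -F', ' '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if (!seen[$i]++) {                  # true only the first time a word appears
                sep = (out == "") ? "" : ", "
                out = out sep $i
            }
        }
        print out
    }'
}

unique=$(dedupe "abc, def, abc, def")
printf '%s\n' "$unique"
```

Because the words are collected in the order they are read, nothing gets sorted along the way.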
This can also be done in pure Bash:
#!/bin/bash
string="abc, def, abc, def"
declare -A words
IFS=", "
for w in $string; do
words+=( [$w]="" )
done
echo "${!words[@]}"
Output
def abc
Explanation
words is an associative array (declare -A words) and every word is added as
a key to it:
words+=( [${w}]="" )
(We do not need its value therefore I have taken "" as value).
The list of unique words is the list of keys (${!words[@]}).
There is one caveat though: the output is not separated by ", ". (You will
have to iterate again. IFS is only used with ${words[*]}, and even then only
the first character of IFS is used.)
I have another way for this case. I changed my input string as shown below and ran this command to edit it:
#string="abc def abc def"
$ echo "abc def abc def" | xargs -n1 | sort -u | xargs | sed "s# #, #g"
abc, def
Thanks for all support!
The problem with listing the keys of an associative array, or with xargs and sort, as in the other examples, is that the original word order is lost. My solution only skips words that have already been processed. The associative array map keeps track of them.
Bash function
function uniq_words() {
    local string="$1"
    local delimiter=", "
    local words=""
    local word
    declare -A map
    # split the string into lines so that we don't have
    # to overwrite the $IFS field separator
    while read -r word; do
        # skip already processed words
        if [ -n "${map[$word]}" ]; then
            continue
        fi
        # mark the found word
        map[$word]=1
        # don't add a delimiter, if it is the first word
        if [ -z "$words" ]; then
            words=$word
            continue
        fi
        # add a delimiter and the word
        words="$words$delimiter$word"
    # the quotes keep the newlines intact in every bash version
    done <<< "$(sed -e "s/$delimiter/\n/g" <<< "$string")"
    echo "$words"
}
Example 1
uniq_words "abc, def, abc, def"
Output:
abc, def
Example 2
uniq_words "1, 2, 3, 2, 1, 0"
Output:
1, 2, 3, 0
Example with xargs and sort
In this example, the output is sorted.
echo "1 2 3 2 1 0" | xargs -n1 | sort -u | xargs | sed "s# #, #g"
Output:
0, 1, 2, 3
Suppose I have the following string:
some letters foo/substring/goo/some additional letters
I need to extract this substring supposing that foo/ and /goo are constant strings that are known in advance. How can I do that?
This sed one-liner does it.
sed 's#.*foo/##;s#/goo/.*##' file
Besides sed, awk and grep can do the job too. Or with zsh:
kent$ v="some letters foo/substring/goo/some additional letters"
kent$ echo ${${v##*foo/}%%/goo/*}
substring
Note that (comment by @Nahuel Fouilleul):
in ${var%%/goo/*}, var must be a variable name and can't be the result of another expansion.
So in bash, the line has to be split into two statements.
$ echo $0
bash
$ v="some letters foo/substring/goo/some additional letters"
$ v=${v##*foo/}
$ v=${v%%/goo/*}
$ echo $v
substring
The line I executed worked in zsh, but when I tested it in bash, it didn't work.
$ echo $0
-zsh
$ v="some letters foo/substring/goo/some additional letters"
$ echo ${${v##*foo/}%%/goo/*}
substring
With variable expansion
line='some letters foo/substring/goo/some additional letters'
line=${line%%/goo*} # remove suffix /goo*
line=${line##*foo/} # remove prefix *foo/
echo "$line"
or bash regular expression
line='some letters foo/substring/goo/some additional letters'
if [[ $line =~ foo/([^/]*)/goo ]]; then
echo "${BASH_REMATCH[1]}"
fi
If you know there are no other / in your "other letters", you can use cut :
> echo "some letters foo/substring/goo/some additional letters" | cut -d'/' -f2
In terms of readability I think awk is a good solution
echo "some letters foo/substring/goo/some additional letters" | awk -v FS="(foo/|/goo)" '{print $2}'
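The parameter-expansion approach shown earlier can be wrapped in a small function that also reports when a marker is missing. A minimal sketch: between is a made-up name, it works in any POSIX shell, and it cuts at the first occurrence of each marker (the answers above use ## and therefore cut at the last foo/).

```shell
# between: hypothetical helper. Prints the text between the first occurrence
# of the left marker ($2) and the first following right marker ($3) in $1.
# Returns 1 if either marker is missing. The markers are quoted inside the
# expansions so they are matched literally, not as glob patterns.
between() {
    rest=${1#*"$2"}                    # drop everything up to the left marker
    [ "$rest" = "$1" ] && return 1     # unchanged: left marker not found
    inner=${rest%%"$3"*}               # drop the right marker and what follows
    [ "$inner" = "$rest" ] && return 1 # unchanged: right marker not found
    printf '%s\n' "$inner"
}

v="some letters foo/substring/goo/some additional letters"
sub=$(between "$v" "foo/" "/goo")
printf '%s\n' "$sub"
```

Note that rest and inner are ordinary global variables here, since plain POSIX sh has no local.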
I want to print a text, for example:
'(]\{&}$"\n
the escaped version I have is this:
"'(]\\{&}$\"\\n"
I tried the following:
cat $CLIPBOARD_HISTORY_FILE | sed "$2!d" | sed 's/^.\(.*\).$/\1/'
cat $CLIPBOARD_HISTORY_FILE | sed "$2!d" | sed 's/^.\(.*\).$/\1/' | eval 'stdin=$(cat); echo "$stdin"'
VAR1=$(cat $CLIPBOARD_HISTORY_FILE | sed "$2!d" | sed 's/^.\(.*\).$/\1/')
VAR2="'(]\\{&}\$\"\\n"
VAR3=$VAR1
echo "1 '(]\\{&}\$\"\\n"
echo "2 $VAR1"
echo "3 $VAR2"
echo "4 $VAR3"
echo -e "5 $VAR1"
echo -e "6 $VAR2"
echo -e "7 $VAR3"
$
'(]\\{&}$\"\\n
'(]\\{&}$\"\\n
1 '(]\{&}$"\n
2 '(]\\{&}$\"\\n
3 '(]\{&}$"\n
4 '(]\\{&}$\"\\n
5 '(]\{&}$\"\n
6 '(]\{&}$"
7 '(]\{&}$\"\n
Echoing the text directly works, but not if it comes from a command. What am I not seeing or understanding?
Thanks for the help!
In general, it's best to enclose material in single quotes rather than double quotes; then you only have to worry about single quotes. Thus:
$ x="'"'(]\{&}$"\n'
$ printf "%s\n" "$x"
'(]\{&}$"\n
$ printf "%s\n" "$x" | sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/$/'/"
''\''(]\{&}$"\n'
$
The use of printf is important; it doesn't futz with its data, unlike echo.
The '\'' sequence is crucial; it stops the current single quoted string, outputs a single quote and then restarts the single quoted string. That output is 'sub-optimal'; the initial '' could be left out (and similarly the final '' could be left out if the data ends with a single quote):
$ printf "%s\n" "$x" | sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/$/'/" -e "s/^''//" -e "s/''$//"
\''(]\{&}$"\n'
$
If you really must have double quotes around the data, rather than single quotes, then you have to escape more ($`\" need protection), but the concept is similar.
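The single-quote technique can be checked with a round trip through eval. A minimal sketch, assuming single-line data; shell_quote is a made-up name and the sed expressions are the same as in the commands above:

```shell
# shell_quote: hypothetical helper; wraps $1 in single quotes and turns every
# embedded single quote into '\'' (close the string, add an escaped quote, reopen).
# Assumes the data is a single line.
shell_quote() {
    printf '%s\n' "$1" | sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/"
}

original="it's \$HOME \"q\" \\n"
quoted=$(shell_quote "$original")
printf '%s\n' "$quoted"

# Round trip: the shell should read the quoted form back as the original text.
eval "roundtrip=$quoted"
printf '%s\n' "$roundtrip"
```

Nothing inside single quotes is expanded, so $HOME, the double quotes and the \n all come back unchanged.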
I am new to shell scripts. I want to read a file containing arguments line by line, and if an argument contains any spaces, I want to enclose its value in quotes.
For example if the file (test.dat) contains:
-DtestArgument1=/path/to a/text file
-DtestArgument2=/path/to a/text file
After parsing the above file, shell script should prepare the string with following:
-DtestArgument1="/path/to a/text file" -DtestArgument2="/path/to a/text file"
Here is my shell script:
while read ARGUMENT; do
ARGUMENT=`echo ${ARGUMENT} | tr "\n" " "`
if [[ "${ARGUMENT}" =~ " " ]]; then
ARGUMENT=`echo $ARGUMENT | sed 's/\^(-D.*\)=(.*)/\1=\"\2\"/g'`
NEW_ARGUMENT="${NEW_ARGUMENT} ${ARGUMENT}"
else
echo "doesn't contains spaces"
NEW_ARGUMENT="${NEW_ARGUMENT} ${ARGUMENT}"
fi
done < test.dat
But it's throwing the following error:
sed: -e expression #1, char 28: Unmatched ) or \)
The code should be compatible with all shells.
I think you should simplify the problem. Rather than worrying about spaces, just quote the argument after the =. Something like:
sed -e 's/=/="/' -e 's/$/"/' test.dat | paste -s -d' ' -
Should be sufficient. If you really care about spaces, you could try something like:
sed -e '/=.* /{ s/=/="/; s/$/"/; }' test.dat | paste -s -d' ' -
That will only notice spaces after the =. Just use / / if you really want to change any line that has a space anywhere.
There's no need to use a while/read loop: just let sed read the file directly.
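A self-contained way to try this without creating test.dat, as a minimal sketch: the sample lines are fed in with printf, and quote_spaced_args is a made-up name for the sed and paste pipeline.

```shell
# quote_spaced_args: hypothetical wrapper; reads lines on stdin, quotes the value
# after the first = on lines that have a space after it, and joins all lines with
# single spaces.
quote_spaced_args() {
    sed -e '/=.* /{ s/=/="/; s/$/"/; }' | paste -s -d' ' -
}

args=$(printf '%s\n' \
    '-DtestArgument1=/path/to a/text file' \
    '-Dplain=/no/spaces' | quote_spaced_args)
printf '%s\n' "$args"
```

The second sample line has no space after the =, so it passes through unquoted.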
The sed parentheses must be escaped, and the ^ anchor must not be (\^ matches a literal caret):
ARGUMENT=`echo $ARGUMENT | sed "s/^\(-D.*\)=\(.*\)/\1=\"\2\"/g"`
You escaped one parenthesis and forgot the other three... BTW, I generally use " quotation.
If you prefer '-style, do it like this:
ARGUMENT=`echo $ARGUMENT | sed 's/^\(-D.*\)=\(.*\)/\1="\2"/g'`
How to remove extra spaces in variable HEAD?
HEAD=" how to remove extra spaces "
Result:
how to remove extra spaces
Try this:
echo "$HEAD" | tr -s " "
or maybe you want to save it in a variable:
NEWHEAD=$(echo "$HEAD" | tr -s " ")
Update
To remove leading and trailing whitespaces, do this:
NEWHEAD=$(echo "$HEAD" | tr -s " ")
NEWHEAD=${NEWHEAD%% }
NEWHEAD=${NEWHEAD## }
Using awk:
$ echo "$HEAD" | awk '$1=$1'
how to remove extra spaces
Take advantage of the word-splitting effects of not quoting your variable
$ HEAD=" how to remove extra spaces "
$ set -- $HEAD
$ HEAD=$*
$ echo ">>>$HEAD<<<"
>>>how to remove extra spaces<<<
If you don't want to use the positional parameters, use an array
ary=($HEAD)
HEAD=${ary[@]}
echo "$HEAD"
One dangerous side-effect of not quoting is that filename expansion will be in play. So turn it off first, and re-enable it after:
$ set -f
$ set -- $HEAD
$ set +f
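Put together as a function, the word-splitting trick with globbing switched off might look like the sketch below. normalize_spaces is a made-up name; the code assumes the default IFS and works in any POSIX shell.

```shell
# normalize_spaces: hypothetical helper; relies on the default IFS.
normalize_spaces() {
    set -f        # turn off filename expansion so a lone * stays a *
    set -- $1     # unquoted on purpose: word splitting drops the extra blanks
    set +f        # turn filename expansion back on
    printf '%s\n' "$*"
}

HEAD="  how to   remove *  extra spaces "
HEAD=$(normalize_spaces "$HEAD")
printf '>>>%s<<<\n' "$HEAD"
```

Inside the function, set -- only replaces the function's own positional parameters, and calling it in a command substitution keeps the set -f and set +f switches away from the calling shell.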
This horse isn't quite dead yet: Let's keep beating it!*
Read into array
Other people have mentioned read, but since using unquoted expansion may cause undesirable expansions all answers using it can be regarded as more or less the same. You could do
set -f
read HEAD <<< $HEAD
set +f
or you could do
read -rd '' -a HEAD <<< "$HEAD" # Assuming the default IFS
HEAD="${HEAD[*]}"
Extended Globbing with Parameter Expansion
$ shopt -s extglob
$ HEAD="${HEAD//+( )/ }" HEAD="${HEAD# }" HEAD="${HEAD% }"
$ printf '"%s"\n' "$HEAD"
"how to remove extra spaces"
*No horses were actually harmed – this was merely a metaphor for getting six+ diverse answers to a simple question.
Here's how I would do it with sed:
string=' how to remove extra spaces '
echo "$string" | sed -e 's/  */ /g' -e 's/^ //' -e 's/ $//'
=> how to remove extra spaces # (no spaces at beginning or end)
The first sed expression replaces every run of one or more spaces with a single space, and the other two expressions remove the single space that may be left at the beginning and at the end.
echo -e " abc \t def "|column -t|tr -s " "
column -t will:
remove the spaces at the beginning and at the end of the line
convert tabs to spaces
tr -s " " will squeeze multiple spaces to single space
BTW, to see the whole output you can use cat - -A, which shows all special characters including tabs and EOL:
echo -e " abc \t def "|cat - -A
output: abc ^I def $
echo -e " abc \t def "|column -t|tr -s " "|cat - -A
output:
abc def$
Whitespace can take the form of both spaces and tabs. Although both are non-printing characters that look the same to us, sed and other tools treat them as different characters and only operate on what you ask for. That is, if you tell sed to delete a number of spaces, it will do so, but the expression will not match tabs. The inverse is also true: give sed a tab and it will not match spaces, even if there are as many of them as a tab is wide.
A more extensible solution that will work for removing either/both additional space in the form of spaces and tabs (I've tested mixing both in your specimen variable) is:
echo $HEAD | sed 's/^[[:blank:]]*//g'
or we can tighten up @Frontear's excellent suggestion of using xargs without the tr:
echo $HEAD | xargs
However, note that xargs would also remove newlines. So if you were to cat a file and pipe it to xargs, all the extra space- including newlines- are removed and everything put on the same line ;-).
Both of the foregoing achieved your desired result in my testing.
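To handle spaces and tabs with the variable quoted, so that the shell does none of the work, the [[:blank:]] class can do the squeezing as well. A minimal sketch, with squeeze_blanks as a made-up name:

```shell
# squeeze_blanks: hypothetical helper; collapses every run of spaces and tabs
# into one space, then removes the single space left at either end.
squeeze_blanks() {
    printf '%s\n' "$1" |
        sed -e 's/[[:blank:]][[:blank:]]*/ /g' -e 's/^ //' -e 's/ $//'
}

tab=$(printf '\t')
HEAD="  how to${tab}remove   extra ${tab} spaces  "
NEWHEAD=$(squeeze_blanks "$HEAD")
printf '"%s"\n' "$NEWHEAD"
```

Unlike xargs, this leaves newlines alone and works line by line, so it can also be applied to a whole file.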
Try this one:
echo ' how to remove extra spaces ' | sed 's/^ *//' | sed 's/ *$//' | sed 's/  */ /g'
or
HEAD=" how to remove extra spaces "
HEAD=$(echo "$HEAD" | sed 's/^ *//' | sed 's/ *$//' | sed 's/  */ /g')
I would make use of tr to remove the extra spaces, and xargs to trim the back and front.
TEXT=" This is some text "
echo $(echo $TEXT | tr -s " " | xargs)
# [...]$ This is some text
echo variable without quotes does what you want:
HEAD=" how to remove extra spaces "
echo $HEAD
# or assign to new variable
NEW_HEAD=$(echo $HEAD)
echo $NEW_HEAD
output: how to remove extra spaces