Remove a fixed prefix/suffix from a string in Bash - bash

I want to remove the prefix/suffix from a string. For example, given:
string="hello-world"
prefix="hell"
suffix="ld"
How do I get the following result?
"o-wor"

$ prefix="hell"
$ suffix="ld"
$ string="hello-world"
$ foo=${string#"$prefix"}
$ foo=${foo%"$suffix"}
$ echo "${foo}"
o-wor
This is documented in the Shell Parameter Expansion section of the manual:
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern and matched according to the rules described below (see Pattern Matching). If the pattern matches the beginning of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the # case) or the longest matching pattern (the ## case) deleted. […]
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern and matched according to the rules described below (see Pattern Matching). If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the % case) or the longest matching pattern (the %% case) deleted. […]
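Because word is treated as a pattern, the quoting in the answer above matters whenever the prefix or suffix may contain glob characters. A minimal sketch of this, assuming a helper named strip_affixes (a made-up name, not something from the manual) that quotes both so they are matched literally:

```bash
# Hypothetical helper: strip a literal prefix ($2) and suffix ($3) from $1.
# The body runs in a subshell ( ... ) so the variable s does not leak out.
strip_affixes() (
    s=$1
    s=${s#"$2"}   # quoted: $2 is matched literally, not as a glob
    s=${s%"$3"}   # quoted: $3 is matched literally, not as a glob
    printf '%s\n' "$s"
)

strip_affixes "hello-world" "hell" "ld"   # prints: o-wor
strip_affixes "*star*" "*" "*"            # prints: star
```

If the string does not start with the prefix or end with the suffix, it is printed unchanged.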

Using sed:
$ echo "$string" | sed -e "s/^$prefix//" -e "s/$suffix$//"
o-wor
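Within the sed command, ^ anchors the match to the beginning of the line, so $prefix is removed only when the text starts with it, and the trailing $ anchors the match to the end of the line, so $suffix is removed only when the text ends with it.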
Adrian Frühwirth makes some good points in the comments below, but sed for this purpose can be very useful. The fact that the contents of $prefix and $suffix are interpreted by sed can be either good OR bad- as long as you pay attention, you should be fine. The beauty is, you can do something like this:
$ prefix='^.*ll'
$ suffix='ld$'
$ echo "$string" | sed -e "s/^$prefix//" -e "s/$suffix$//"
o-wor
which may be what you want, and is both fancier and more powerful than bash variable substitution. If you remember that with great power comes great responsibility (as Spiderman says), you should be fine.
A quick introduction to sed can be found at http://evc-cit.info/cit052/sed_tutorial.html
A note regarding the shell and its use of strings:
For the particular example given, the following would work as well:
$ echo $string | sed -e s/^$prefix// -e s/$suffix$//
...but only because:
echo doesn't care how many strings are in its argument list, and
There are no spaces in $prefix and $suffix
It's generally good practice to quote a string on the command line because even if it contains spaces it will be presented to the command as a single argument. We quote $prefix and $suffix for the same reason: each edit command to sed will be passed as one string. We use double quotes because they allow for variable interpolation; had we used single quotes the sed command would have gotten a literal $prefix and $suffix which is certainly not what we wanted.
Notice, too, my use of single quotes when setting the variables prefix and suffix. We certainly don't want anything in the strings to be interpreted, so we single quote them so no interpolation takes place. Again, it may not be necessary in this example but it's a very good habit to get into.
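To see what "interpreted by sed" means in practice, here is a small sketch comparing sed with a quoted parameter expansion when the prefix contains a regex metacharacter. The variable names via_sed and via_expansion are only illustrative:

```bash
string='hello-world'
prefix='h.ll'    # '.' matches any single character in a sed regex

# sed treats $prefix as a regular expression, so 'h.ll' matches 'hell'
via_sed=$(printf '%s\n' "$string" | sed -e "s/^$prefix//")

# quoted parameter expansion treats $prefix literally, so nothing matches
via_expansion=${string#"$prefix"}

printf 'sed:       %s\n' "$via_sed"         # sed:       o-world
printf 'expansion: %s\n' "$via_expansion"   # expansion: hello-world
```

Whether the sed behaviour is a feature or a bug depends on where $prefix comes from; for untrusted or arbitrary input the literal match is usually the safer default.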

$ string="hello-world"
$ prefix="hell"
$ suffix="ld"
$ #remove "hell" from "hello-world" if "hell" is found at the beginning.
$ prefix_removed_string=${string/#$prefix}
$ #remove "ld" from "o-world" if "ld" is found at the end.
$ suffix_removed_String=${prefix_removed_string/%$suffix}
$ echo $suffix_removed_String
o-wor
Notes:
#$prefix : the leading # makes sure the substring "hell" is removed only if it is found at the beginning.
%$suffix : the leading % makes sure the substring "ld" is removed only if it is found at the end.
Without these, the first occurrence of "hell" or "ld" is removed wherever it appears, even in the middle of the string.
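The difference between the anchored and unanchored forms can be checked with a string that contains the substring more than once. This is a bash-only sketch; the sample string and variable names are invented for illustration:

```bash
s='hello-ld-world-ld'

anchored_start=${s/#ld}   # no "ld" at the start, so nothing is removed
anchored_end=${s/%ld}     # only the "ld" at the very end is removed
first_only=${s/ld}        # the first occurrence, wherever it is
all=${s//ld}              # every occurrence

printf '%s\n' "$anchored_start" "$anchored_end" "$first_only" "$all"
# hello-ld-world-ld
# hello-ld-world-
# hello--world-ld
# hello--wor-
```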

I use grep for removing prefixes from paths (which aren't handled well by sed):
echo "$input" | grep -oP "^$prefix\K.*"
\K removes from the match all the characters before it.

Do you know the length of your prefix and suffix? In your case:
result=$(echo $string | cut -c5- | rev | cut -c3- | rev)
Or more general:
result=$(echo $string | cut -c$((${#prefix}+1))- | rev | cut -c$((${#suffix}+1))- | rev)
But the solution from Adrian Frühwirth is way cool! I didn't know about that!

Small and universal solution:
expr "$string" : "$prefix\(.*\)$suffix"

Using the =~ operator:
$ string="hello-world"
$ prefix="hell"
$ suffix="ld"
$ [[ "$string" =~ ^$prefix(.*)$suffix$ ]] && echo "${BASH_REMATCH[1]}"
o-wor

NOTE: Not sure if this was possible back in 2013 but it's certainly possible today (10 Oct 2021) so adding another option ...
Since we're dealing with known fixed length strings (prefix and suffix) we can use a bash substring to obtain the desired result with a single operation.
Inputs:
string="hello-world"
prefix="hell"
suffix="ld"
Plan:
bash substring syntax: ${string:<start>:<length>}
skipping over prefix="hell" means our <start> will be 4
<length> will be total length of string (${#string}) minus the lengths of our fixed length strings (4 for hell / 2 for ld)
This gives us:
$ echo "${string:4:(${#string}-4-2)}"
o-wor
NOTE: the parens can be removed and still obtain the same result
If the values of prefix and suffix are unknown, or could vary, we can still use this same operation but replace 4 and 2 with ${#prefix} and ${#suffix}, respectively:
$ echo "${string:${#prefix}:${#string}-${#prefix}-${#suffix}}"
o-wor
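Note that the substring form cuts purely by length and never checks that the string really starts with prefix or ends with suffix. A hedged bash sketch that adds such a check first (the function name strip_by_length is made up for illustration):

```bash
strip_by_length() {
    local string=$1 prefix=$2 suffix=$3
    # Only cut when the string really has this prefix and this suffix.
    if [[ $string == "$prefix"*"$suffix" ]]; then
        printf '%s\n' "${string:${#prefix}:${#string}-${#prefix}-${#suffix}}"
    else
        printf '%s\n' "$string"   # leave non-matching input untouched
    fi
}

strip_by_length "hello-world" "hell" "ld"   # prints: o-wor
strip_by_length "jello-world" "hell" "ld"   # prints: jello-world
```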

Using @Adrian Frühwirth's answer:
function strip {
    # remove $2 from the start, then from the end, of $1
    local STRING=${1#"$2"}
    echo "${STRING%"$2"}"
}
Use it like this:
HELLO=":hello:"
HELLO=$(strip "$HELLO" ":")
echo $HELLO # hello

Related

Insert the contents of the variable in SED command [duplicate]

If I run these commands from a script:
#my.sh
PWD=bla
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
xxx
bla
it is fine.
But, if I run:
#my.sh
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
$ sed: -e expression #1, char 8: Unknown option to `s'
I read in tutorials that to substitute environment variables from shell you need to stop, and 'out quote' the $varname part so that it is not substituted directly, which is what I did, and which works only if the variable is defined immediately before.
How can I get sed to recognize a $var as an environment variable as it is defined in the shell?
Your two examples look identical, which makes problems hard to diagnose. Potential problems:
You may need double quotes, as in sed 's/xxx/'"$PWD"'/'
$PWD may contain a slash, in which case you need to find a character not contained in $PWD to use as a delimiter.
To nail both issues at once, perhaps
sed 's#xxx#'"$PWD"'#'
In addition to Norman Ramsey's answer, I'd like to add that you can double-quote the entire string (which may make the statement more readable and less error prone).
So if you want to search for 'foo' and replace it with the content of $BAR, you can enclose the sed command in double-quotes.
sed 's/foo/$BAR/g'
sed "s/foo/$BAR/g"
In the first, $BAR will not expand correctly while in the second $BAR will expand correctly.
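A quick way to check the difference, as a sketch with a made-up value for BAR:

```bash
BAR='replacement'

# Single quotes: the shell passes $BAR through untouched, so sed inserts the literal text.
single=$(printf '%s\n' 'foo' | sed 's/foo/$BAR/g')

# Double quotes: the shell expands $BAR before sed ever sees the command.
double=$(printf '%s\n' 'foo' | sed "s/foo/$BAR/g")

printf '%s\n' "$single" "$double"
# $BAR
# replacement
```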
Another easy alternative:
Since $PWD will usually contain a slash /, use | instead of / for the sed statement:
sed -e "s|xxx|$PWD|"
You can use other characters besides "/" in substitution:
sed "s#$1#$2#g" -i FILE
1. Bad way: change the delimiter
sed 's/xxx/'"$PWD"'/'
sed 's:xxx:'"$PWD"':'
sed 's#xxx#'"$PWD"'#'
These may not be the final answer:
you cannot know which characters will occur in $PWD (/, : or #).
If the delimiter character occurs in $PWD, it will break the expression.
A better way is to escape the special character in $PWD.
2. Good way: escape the delimiter
For example:
try to replace URL with $url (which has : and / in its content)
x.com:80/aa/bb/aa.js
in the string $tmp
URL
A. Use / as the delimiter
Escape / as \/ in the variable (before using it in the sed expression)
## step 1: try escape
echo ${url//\//\\/}
x.com:80\/aa\/bb\/aa.js #escape fine
echo ${url//\//\/}
x.com:80/aa/bb/aa.js #escape not success
echo "${url//\//\/}"
x.com:80\/aa\/bb\/aa.js #escape fine, notice `"`
## step 2: do sed
echo $tmp | sed "s/URL/${url//\//\\/}/"
URL
echo $tmp | sed "s/URL/${url//\//\/}/"
URL
OR
B. use : as delimiter (more readable than /)
escape : as \: in var (before use in sed expression)
## step 1: try escape
echo ${url//:/\:}
x.com:80/aa/bb/aa.js #escape not success
echo "${url//:/\:}"
x.com\:80/aa/bb/aa.js #escape fine, notice `"`
## step 2: do sed
echo $tmp | sed "s:URL:${url//:/\:}:g"
x.com:80/aa/bb/aa.js
With your question edit, I see your problem. Let's say the current directory is /home/yourname ... in this case, your command below:
sed 's/xxx/'$PWD'/'
will be expanded to
sed 's/xxx//home/yourname/'
which is not valid. You need to put a \ character in front of each / in your $PWD if you want to do this.
Actually, the simplest thing (in GNU sed, at least) is to use a different separator for the sed substitution (s) command. So, instead of s/pattern/'$mypath'/ being expanded to s/pattern//my/path/, which will of course confuse the s command, use s!pattern!'$mypath'!, which will be expanded to s!pattern!/my/path!. I’ve used the bang (!) character (or use anything you like) which avoids the usual, but-by-no-means-your-only-choice forward slash as the separator.
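The two expansions described above can be compared directly. This is only a sketch; mypath holds an example value, and the status variable names are invented:

```bash
mypath='/my/path'

# With / as the separator the command expands to s/pattern//my/path/, which sed rejects.
if printf '%s\n' 'pattern' | sed "s/pattern/$mypath/" 2>/dev/null; then
    slash_status=ok
else
    slash_status=failed
fi

# With ! as the separator the slashes in $mypath are ordinary characters.
bang_result=$(printf '%s\n' 'pattern' | sed 's!pattern!'"$mypath"'!')

printf '%s\n' "$slash_status" "$bang_result"
# failed
# /my/path
```

The same caveat applies to any separator: it only helps as long as that character cannot occur in the variable.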
Dealing with VARIABLES within sed
[root@gislab00207 ldom]# echo domainname: None > /tmp/1.txt
[root@gislab00207 ldom]# cat /tmp/1.txt
domainname: None
[root@gislab00207 ldom]# echo ${DOMAIN_NAME}
dcsw-79-98vm.us.oracle.com
[root@gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: ${DOMAIN_NAME}/g'
--- Below is the result: the variable is not expanded inside single quotes.
domainname: ${DOMAIN_NAME}
--- You need to close the single quotes around your variable like this ...
[root@gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: '${DOMAIN_NAME}'/g'
--- The right result is below
domainname: dcsw-79-98vm.us.oracle.com
VAR=8675309
echo "abcde:jhdfj$jhbsfiy/.hghi$jh:12345:dgve::" |\
sed 's/:[0-9]*:/:'$VAR':/1'
where VAR contains what you want to replace the field with
I had similar problem, I had a list and I have to build a SQL script based on template (that contained #INPUT# as element to replace):
for i in LIST
do
awk "sub(/\#INPUT\#/,\"${i}\");" template.sql >> output
done
If your replacement string may contain other sed control characters, then a two-step substitution (first escaping the replacement string) may be what you want:
PWD='/a\1&b$_' # these are problematic for sed
PWD_ESC=$(printf '%s\n' "$PWD" | sed -e 's/[\/&]/\\&/g')
echo 'xxx' | sed "s/xxx/$PWD_ESC/" # now this works as expected
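The escaping step can also be wrapped in a small function. The sketch below is one possible variant, not the answer's exact command: the name escape_sed_replacement is hypothetical, it uses , as the delimiter of the escaping command itself so that / needs no special treatment inside the bracket expression, and it assumes the value contains no newlines.

```bash
# Hypothetical helper: escape \, & and / so the value can be used as the
# replacement part of s/.../.../ (assumes the value has no newlines).
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's,[\&/],\\&,g'
}

value='/a\1&b$_'
escaped=$(escape_sed_replacement "$value")
result=$(printf '%s\n' 'xxx' | sed "s/xxx/$escaped/")
printf '%s\n' "$result"   # prints: /a\1&b$_
```

Using a separate variable name instead of PWD also avoids overwriting the shell's own working-directory variable.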
For me, replacing some text in a file with the value of an environment variable using sed works only with quotes, as follows:
sed -i 's/original_value/'"$MY_ENVIRONMENT_VARIABLE"'/g' myfile.txt
But when the value of MY_ENVIRONMENT_VARIABLE contains a URL (e.g. https://andreas.gr), the above did not work.
In that case, use a different delimiter:
sed -i "s|original_value|$MY_ENVIRONMENT_VARIABLE|g" myfile.txt

Shell scripting selecting a part of a word

Shell scripting - I need to get only "v1.0.42" from below. There is no space between any words here
"ansible-project-development-environment-TEMPLATE-v1.0.42-role_test_example_run_environment"
Could you please try following.
awk 'match($0,/v[0-9]+\.[0-9]+\.[0-9]+/){print substr($0,RSTART,RLENGTH)}' Input_file
2nd solution: Using GNU sed:
sed -E 's/.*(v[0-9]+\.[0-9]+\.[0-9]+).*/\1/' Input_file
OR with BRE sed as per David sir's comments:
sed 's/^.*-\(v[^-][^-]*\).*$/\1/' Input_file
3rd solution: With perl one liner.
perl -ne 'print "$&\n" if /v[0-9]+\.[0-9]+\.[0-9]+/' Input_file
If you're using Bash and the above string in var $var:
$ [[ $var =~ v([0-9]+\.?)+ ]] && echo ${BASH_REMATCH[0]}
v1.0.42
Assuming you have the string in a shell variable, you could use parameter expansion to remove the parts you don't want:
v="ansible-project-development-environment-TEMPLATE-v1.0.42-role_test_example_run_environment"
v=${v#*-TEMPLATE-} # v1.0.42-role_test_example_run_environment
v=${v%%-*} # v1.0.42
This is standard POSIX shell, not requiring any non-standard extensions.
Relevant quote:
${parameter#[word]}
Remove Smallest Prefix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in
parameter, with the smallest portion of the prefix matched by the
pattern deleted. If present, word shall not begin with an unquoted
'#'.
${parameter%%[word]}
Remove Largest Suffix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in
parameter, with the largest portion of the suffix matched by the
pattern deleted.
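The same two expansions can be combined so that they do not depend on the literal word TEMPLATE. This is a sketch under the assumption that the version is the first "-v" followed by a digit and ends at the next "-"; the function name extract_version is invented:

```bash
# Hypothetical helper: print the first "v<digit>..." field of a dash-separated string.
extract_version() (
    s=$1
    case $s in
        *-v[0-9]*) ;;             # contains "-v<digit>": carry on
        *) exit 1 ;;              # nothing that looks like a version
    esac
    head=${s%%-v[0-9]*}           # everything before the first "-v<digit>"
    rest=${s#"$head"-}            # now starts at "v<digit>..."
    printf '%s\n' "${rest%%-*}"   # cut at the next "-"
)

extract_version "ansible-project-development-environment-TEMPLATE-v1.0.42-role_test_example_run_environment"
# prints: v1.0.42
```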
try this:
echo "ansible-project-development-environment-TEMPLATE-v1.0.42-role_test_example_run_environment" | cut -d '-' -f6

Add prefix to each word of each line in bash

I have a variable called deps:
deps='word1 word2'
I want to add a prefix to each word of the variable.
I tried with:
echo $deps | while read word do \ echo "prefix-$word" \ done
but i get:
bash: syntax error near unexpected token `done'
any help? thanks
With sed :
$ deps='word1 word2'
$ echo "$deps" | sed 's/[^ ]* */prefix-&/g'
prefix-word1 prefix-word2
For well behaved strings, the best answer is:
printf "prefix-%s\n" $deps
as suggested by 123 in the comments to fedorqui's answer.
Explanation:
Without quoting, bash will split the contents of $deps according to $IFS (which defaults to " \n\t") before calling printf
printf evaluates the pattern for each of the provided arguments and writes the output to stdout.
printf is a shell built-in (at least for bash) and does not fork another process, so this is faster than sed-based solutions.
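"Well behaved" matters here because an unquoted expansion is also subject to filename globbing. A sketch that keeps the word splitting but switches globbing off (add_prefix is a made-up name, and the trailing * in the sample is there only to show the effect):

```bash
# Hypothetical helper: print each whitespace-separated word of $2 with prefix $1.
add_prefix() (
    set -f                      # no globbing, so a word like '*' stays literal
    for word in $2; do          # unquoted on purpose: split on $IFS
        printf '%s%s\n' "$1" "$word"
    done
)

deps='word1 word2 *'
add_prefix 'prefix-' "$deps"
# prefix-word1
# prefix-word2
# prefix-*
```

The function body runs in a subshell, so set -f does not affect the calling shell.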
In another question I just came across the markers for beginning (\<) and end (\>) of words. With those you can shorten the solution of SLePort above somewhat. The solution also nicely extends to appending a suffix, which I needed in addition to the prefix, but couldn't figure out how to use above solution for it, as the & also includes the possible trailing whitespace after the word.
So my solution is this:
$ deps='word1 word2'
# add prefix:
$ echo "$deps" | sed 's/\</prefix-/g'
prefix-word1 prefix-word2
# add suffix:
$ echo "$deps" | sed 's/\>/-suffix/g'
word1-suffix word2-suffix
Explanation: \< matches the beginning of every word, and \> matches the end of each word. You can simply "replace" these by the prefix/suffix, resulting in them being prepended/appended. There is no need to reference them anymore in the replacement, as these are not "real" characters anyway!
You can read the string into an array and then prepend the string to every item:
$ IFS=' ' read -r -a myarray <<< "word1 word2"
$ printf "%s\n" "${myarray[@]}"
word1
word2
$ printf "prefix-%s\n" "${myarray[@]}"
prefix-word1
prefix-word2
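Once the words are in an array, bash can also build the prefixed (or suffixed) array directly with pattern substitution, without printf. A bash-only sketch with illustrative array names:

```bash
read -r -a words <<< "word1 word2"

prefixed=("${words[@]/#/prefix-}")    # empty pattern anchored at the start of each element
suffixed=("${words[@]/%/-suffix}")    # empty pattern anchored at the end of each element

printf '%s\n' "${prefixed[@]}" "${suffixed[@]}"
# prefix-word1
# prefix-word2
# word1-suffix
# word2-suffix
```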

How to ensure I have exactly 2 spaces before string and zero spaces after

I get a string that can have from zero to multiple leading and trailing spaces.
I'm trying to get rid of them without lot of hackery but my code looks huge.
How to do this in a clean way?
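As easy as:
$ src=" some text "
$ dst="  $(echo $src)"
$ echo ":$dst:"
:  some text:
The unquoted $(echo $src) gets rid of all surrounding spaces (note that it also collapses runs of whitespace inside the text and expands glob characters).
Then you simply add as many spaces as you need before it.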
How are you calling out the string? If it's an echo you can just put
Echo "<2 spaces>". "string";
If it's a normal string, you just put 2 spaces between the first quote and the string.
"<2spaces> string here"
One way using GNU sed:
sed 's/^[ \t]*/  /; s/[ \t]*$//' file.txt
You can apply this to a bash variable like this:
echo "$string" | sed 's/^[ \t]*/  /; s/[ \t]*$//'
And save it like this:
variable=$(echo "$string" | sed 's/^[ \t]*/  /; s/[ \t]*$//')
Explanation:
The first substitution removes all leading whitespace and replaces it with two spaces.
The second substitution simply removes all trailing whitespace from a line.
The simplest is probably to use an external process.
value=$(echo "$value" | sed 's/^ *\(.*[^ ]\) *$/  \1/')
If you need to transform an empty string into two spaces, you'll need to modify the regex; and if you're not on Linux, your sed dialect may differ slightly. For maximum portability, switch to awk or Perl, or do it all in Bash. That gets a bit more complex, but for a start, trailing=${value##*[! ]} contains any trailing spaces, and you can trim them off with ${value%$trailing}, and similarly for leading spaces. See the section on variable substitution in the Bash manual for details.
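The pure-shell route mentioned above can be sketched as follows. This is only one way to put the pieces together: normalize_spaces is an invented name, and it handles spaces only, not tabs.

```bash
# Hypothetical helper: trim leading and trailing spaces from $1, then prefix two spaces.
normalize_spaces() (
    value=$1
    trailing=${value##*[! ]}     # the run of spaces at the end (may be empty)
    value=${value%"$trailing"}   # drop it
    leading=${value%%[! ]*}      # the run of spaces at the start (may be empty)
    value=${value#"$leading"}    # drop it
    printf '  %s\n' "$value"     # exactly two leading spaces, none trailing
)

normalize_spaces '     some text    '   # prints "  some text"
```

Spaces inside the text are left alone, and an empty or all-space input comes out as just the two leading spaces.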
You can use a regular expression to match everything between the leading and trailing spaces. The matched text is found in the BASH_REMATCH array (the text matching the first parentheses group is in element 1).
spcs='\ *'
text='.*[^ ]'
[[ $src =~ ^$spcs($text)$spcs$ ]]
dst="  ${BASH_REMATCH[1]}"

Bash - Extract numbers from String

I got a string which looks like this:
"abcderwer 123123 10,200 asdfasdf iopjjop"
Now I want to extract numbers, following the scheme xx,xxx where x is a number between 0-9. E.g. 10,200. Has to be five digit, and has to contain ",".
How can I do that?
Thank you
You can use grep:
$ echo "abcderwer 123123 10,200 asdfasdf iopjjop" | grep -Eo '[0-9]{2},[0-9]{3}'
10,200
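As written, the pattern also matches five digits that sit inside a longer number (for example the 10,200 in 110,2000). If that is not wanted, one option is grep's -w flag, sketched here with an invented wrapper name, extract_amounts, and assuming a grep that supports -o and -w, as GNU and BSD grep do:

```bash
# Hypothetical helper: print every standalone xx,xxx number found in $1.
extract_amounts() {
    # -w: a match must not touch other word characters (letters, digits, _)
    printf '%s\n' "$1" | grep -Eow '[0-9]{2},[0-9]{3}'
}

extract_amounts 'abcderwer 123123 10,200 asdfasdf iopjjop'   # prints: 10,200
extract_amounts 'abc 110,2000 def'                           # prints nothing
```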
In pure Bash:
pattern='([[:digit:]]{2},[[:digit:]]{3})'
[[ $string =~ $pattern ]]
echo "${BASH_REMATCH[1]}"
Simple pattern matching (glob patterns) is built into the shell. Assuming you have the strings in $* (that is, they are command-line arguments to your script, or you have used set on a string you have obtained otherwise), try this:
for token; do
case $token in
[0-9][0-9],[0-9][0-9][0-9] ) echo "$token" ;;
esac
done
Check out pattern matching and regular expressions.
Links:
Bash regular expressions
Patterns and pattern matching
SO question
and as mentioned above, one way to utilize pattern matching is with grep.
Other uses: the shell expands glob patterns in command arguments (for example, those passed to echo), and find supports regular expressions.
A slightly non-typical solution:
< input tr -cd [0-9,\ ] | tr \ '\012' | grep '^..,...$'
(The first tr removes everything except commas, spaces, and digits. The
second tr replaces spaces with newlines, putting each "number" on a separate
line, and the grep discards everything except those that match your criterion.)
The following example using your input data string should solve the problem using sed.
$ echo abcderwer 123123 10,200 asdfasdf iopjjop | sed -ne 's/^.*\([0-9,]\{6\}\).*$/\1/p'
10,200

Resources