sed Capital_Case not working - shell

I'm trying to convert a string that has either - (hyphen) or _ (underscore) to Capital_Case string.
#!/usr/bin/env sh
function cap_case() {
[ $# -eq 1 ] || return 1;
_str=$1;
_capitalize=${_str//[-_]/_} | sed -E 's/(^|_)([a-zA-Z])/\u\2/g'
echo "Capitalize:"
echo $_capitalize
return 0
}
read string
echo $(cap_case $string)
But I don't get anything out.
First I replace any occurrence of - and _ with _ using ${_str//[-_]/_}. Then I pipe that string to sed, which matches either the start of the string or _ as the first group and the letter that follows as the second group, and I want to uppercase that letter with \u\2. I also tried \U\2, but that didn't work either.
I want the string some_string to become
Some_String
And string some-string to become
Some_String
I'm on a mac, using zsh if that is helpful.

EDIT: More generic solution here to make each field's first letter Capital.
echo "some_string_other" | awk -F"_" '{for(i=1;i<=NF;i++){$i=toupper(substr($i,1,1)) substr($i,2)}} 1' OFS="_"
Following awk may help you.
echo "some_string" | awk -F"_" '{$1=toupper(substr($1,1,1)) substr($1,2);$2=toupper(substr($2,1,1)) substr($2,2)} 1' OFS="_"
Output will be as follows.
echo "some_string" | awk -F"_" '{$1=toupper(substr($1,1,1)) substr($1,2);$2=toupper(substr($2,1,1)) substr($2,2)} 1' OFS="_"
Some_String

This being zsh, you don't need sed (or even a function, really):
$ s=some-string-bar
$ print ${(C)s:gs/-/_}
Some_String_Bar
The (C) flag capitalizes words (where "words" are defined as sequences of alphanumeric characters separated by other characters); :gs/-/_ replaces hyphens with underscores.
If you really want a function, it's cap_case () { print ${(C)1:gs/-/_} }.

pure bash:
#!/bin/bash
camel_case(){
local d display string
declare -a strings # = scope local
[ "$2" ] && d="$2" || d=" " # optional output delimiter
ifs_ini="$IFS"
IFS+='_-' # we keep initial IFS
strings=( "$1" ) # array
for string in ${strings[@]} ; do
display+="${string^}$d"
done
echo "${display%$d}"
IFS="$ifs_ini"
}
camel_case "some-string_here" "_"
camel_case "some-string_here some strings here" "+"
camel_case "some-string_here some strings here"
echo "$BASH_VERSION"
exit
output:
Some_String_Here
Some+String+Here+Some+Strings+Here
Some String Here Some Strings Here
4.4.18(1) release

You can try this gnu sed
echo 'some_other-string' | sed -E 's/(^.)/\u&/;s/[_-](.)/_\u\1/g'
Explanation:
s/(^.)/\u&/
(^.) matches the first character, and \u& replaces it with its uppercase form.
s/[_-](.)/_\u\1/g
[_-](.) captures a character preceded by _ or - and replaces the pair with _ followed by the captured character in uppercase.
The g at the end tells sed to make the replacement for every match on the line.

You didn't assign to _capitalize - you set a _capitalize environment variable for the empty command that you piped into sed.
You probably meant
_capitalize=$(<<<"${_str//[-_]/_}" sed -E 's/(^|_)([a-zA-Z])/\1\u\2/g')
Note also that ${//} isn't standard shell, so you really ought to specify an interpreter other than sh.
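To see why the original line leaves _capitalize empty, here is a minimal sketch of the difference between an assignment that is merely one stage of a pipeline and an assignment from command substitution. The variable names (demo_var, after_pipeline, after_substitution) are made up for illustration.

```shell
# An assignment used as a pipeline stage runs in a subshell in sh and bash
# (zsh and ksh run only the LAST stage in the current shell), so the value
# is lost as soon as the pipeline ends.
unset demo_var
demo_var=hello | cat
after_pipeline=${demo_var-unset}

# Command substitution captures the pipeline's output in the current shell.
demo_var=$(printf '%s\n' hello | tr '[:lower:]' '[:upper:]')
after_substitution=$demo_var

printf '%s\n' "$after_pipeline" "$after_substitution"
# prints:
# unset
# HELLO
```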
A simpler approach would be simply:
#!/bin/sh
cap_case() {
printf "Capitalize: "
echo "$*" | sed -e 'y/-/_/' -e 's/\(^\|_\)[[:alpha:]]/\U&/g'
}
echo $(cap_case "snake_case")
Note that the \u / \U replacement is a GNU extension to sed - if you're using a non-GNU implementation, check whether it supports this feature.
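If the sed at hand does not support \u / \U (the stock sed on macOS is one such case), the same result can be had from tr and awk alone. The following is only a sketch; the function name cap_case_portable is hypothetical and not part of the answer above.

```shell
# cap_case_portable: hypothetical helper, POSIX sh + tr + awk only.
# Turns hyphens into underscores, then uppercases the first letter of
# every underscore-separated field.
cap_case_portable() {
    printf '%s\n' "$1" | tr '-' '_' | awk -F_ -v OFS=_ '{
        for (i = 1; i <= NF; i++)
            $i = toupper(substr($i, 1, 1)) substr($i, 2)
        print
    }'
}

cap_case_portable "some-string"   # prints Some_String
cap_case_portable "some_string"   # prints Some_String
```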

Related

find & replace only exact match between delimiters in string values

I have a string value stored in a variable:
PTYPE="Other Farm|Raised Ranch|Farm house|Other|A-Frame|Log Home"
I want to find & replace Other with some value like NOTHING. All values are stored in variables.
WhatToChange=Other
NewValue=NOTHING
echo $PTYPE|sed -e "s#${WhatToChange}#${NewValue}#g"
This replaces all occurrences of Other, giving output like:
NOTHING Farm|Raised Ranch|Farm house|NOTHING|A-Frame|Log Home
Is there any way I can exactly change only the exact one? The place for ${WhatToChange} is variable.
As you have well defined fields and want an exact match, awk could be easier to use than sed; at the very least, you won't have to worry about escaping the strings for using it in the sed expression:
echo "Other Farm|Raised Ranch|Farm house|Other|A-Frame|Log Home" |
awk -v old="Other" -v new="NOTHING" \
'BEGIN {FS = OFS = "|"} {for(i=1;i<=NF;i++) if($i == old) $i = new} 1'
output:
Other Farm|Raised Ranch|Farm house|NOTHING|A-Frame|Log Home
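Since the question keeps all values in variables, the awk command above can be wrapped in a small function. This is a sketch only: replace_field is a hypothetical name, and the comparison is forced to be a string comparison so that numeric-looking fields are not compared as numbers. Note that awk still interprets backslash escapes in values passed with -v.

```shell
# replace_field: hypothetical helper name, not from the answer above.
# Usage: replace_field LIST OLD NEW   (fields in LIST are separated by |)
replace_field() {
    printf '%s\n' "$1" | awk -v old="$2" -v new="$3" '
        BEGIN { FS = OFS = "|" }
        {
            for (i = 1; i <= NF; i++)
                if ($i "" == old "")   # whole-field string comparison
                    $i = new
            print
        }'
}

PTYPE="Other Farm|Raised Ranch|Farm house|Other|A-Frame|Log Home"
WhatToChange=Other
NewValue=NOTHING
replace_field "$PTYPE" "$WhatToChange" "$NewValue"
# prints: Other Farm|Raised Ranch|Farm house|NOTHING|A-Frame|Log Home
```

Because each field is compared on its own, adjacent matches such as Other|Other are both replaced.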
To match either the exact character | or the beginning of the line, use ([|]|^).
To match either the exact character | or the end of the line, use ([|]|$).
To put a | back in place only when appropriate, store these in match groups, and refer to those groups with \1 or \2:
PTYPE="Other Farm|Raised Ranch|Farm house|Other|A-Frame|Log Home"
WhatToChange=Other
NewValue=NOTHING
sed -re "s#(^|[|])${WhatToChange}($|[|])#\1${NewValue}\2#g" <<<"$PTYPE"
...emits as output:
Other Farm|Raised Ranch|Farm house|NOTHING|A-Frame|Log Home
...and still works even if WhatToChange is matched at the beginning or end of the list.
For fun, some perl:
This is like @Charles's sed solution: Note the \Q...\E so that the "to change" value is treated as literal text.
echo "$PTYPE" | perl -spe '
s{ (?:^|\|)\K \Q$WhatToChange\E (?=\||$) }{$NewValue}gx
' -- -WhatToChange=Other -NewValue=NOTHING
This is like @Fravadona's awk solution:
echo "$PTYPE" | perl -F'[|]' -sane '
print join "|", map {$_ eq $WhatToChange ? $NewValue : $_} @F
' -- -WhatToChange=Other -NewValue=NOTHING
How about
echo ${PTYPE//$WhatToChange/$NewValue}
UPDATE:
I just realized that the replacement should happen only if WhatToChange is the whole content between two separators (|). In this case, we can do it in bash as well (without the need to resort to a child process):
if [[ $PTYPE =~ (.*[|]|^)$WhatToChange([|].*|$) ]]; then
echo "${BASH_REMATCH[1]}${NewValue}${BASH_REMATCH[2]}"
fi
UPDATE (based on the comment by Fravadona):
Used in this way, WhatToChange is interpreted as a regular expression. This can be useful if you want to catch variations of the string, for instance:
WhatToChange='[Oo]ther' # to catch Other and other
If you always want to have a literal match, you have to quote the variable:
[[ $PTYPE =~ (.*[|]|^)"$WhatToChange"([|].*|$) ]]
This might work for you (GNU sed & bash):
<<<"$PTYPE" sed 'y/|/\n/;s/^'"$WhatToChange"'$/'"$NewValue"'/mg;y/\n/|/'
Input $PTYPE as a here-string into sed.
Translate | separators to newlines.
Replace $WhatToChange to $NewValue for each matched line.
Translate newlines back to |'s.
N.B. The use of the m flag in the substitution command allows sed to work in multiline mode and this presents each value between separators on its own line.
An alternative:
sed -z 'y/|/\x00/;s/^'"$WhatToChange"'$/'"$NewValue"'/mg;y/\x00/|/;' file

Double quotes containing variable not working in sed [duplicate]

In my bash script I have an external (received from user) string, which I should use in sed pattern.
REPLACE="<funny characters here>"
sed "s/KEYWORD/$REPLACE/g"
How can I escape the $REPLACE string so it would be safely accepted by sed as a literal replacement?
NOTE: The KEYWORD is a dumb substring with no matches etc. It is not supplied by user.
Warning: This does not consider newlines. For a more in-depth answer, see this SO-question instead. (Thanks, Ed Morton & Niklas Peter)
Note that escaping everything is a bad idea. Sed needs many characters to be escaped to get their special meaning. For example, if you escape a digit in the replacement string, it will turn in to a backreference.
As Ben Blank said, there are only three characters that need to be escaped in the replacement string (escapes themselves, forward slash for end of statement and & for replace all):
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
# Now you can use ESCAPED_REPLACE in the original sed statement
sed "s/KEYWORD/$ESCAPED_REPLACE/g"
If you ever need to escape the KEYWORD string, the following is the one you need:
sed -e 's/[]\/$*.^[]/\\&/g'
And can be used by:
KEYWORD="The Keyword You Need";
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
# Now you can use it inside the original sed statement to replace text
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/g"
Remember, if you use a character other than / as the delimiter, you need to replace the slash in the expressions above with the character you are using. See PeterJCLaw's comment for an explanation.
Edited: Due to some corner cases previously not accounted for, the commands above have changed several times. Check the edit history for details.
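As an end-to-end illustration of escaping the replacement side, here is a sketch that escapes the same three characters (\, / and &). It deliberately uses | as the delimiter of the escaping command so that the slash needs no special treatment inside the bracket expression; that choice and the function name escape_sed_replacement are assumptions of this example, not part of the answer above. Like the commands above, it handles single-line values only.

```shell
# escape_sed_replacement: hypothetical helper name.
# Backslash-escapes \, / and & so the value is taken literally on the
# right-hand side of s/.../.../ (single-line values only).
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

REPLACE='a/b&c\d'
ESCAPED_REPLACE=$(escape_sed_replacement "$REPLACE")
result=$(printf '%s\n' 'fooKEYWORDbar' | sed "s/KEYWORD/$ESCAPED_REPLACE/g")
printf '%s\n' "$result"
# prints: fooa/b&c\dbar
```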
The sed command allows you to use other characters instead of / as separator:
sed 's#"http://www\.fubar\.com"#URL_FUBAR#g'
The double quotes are not a problem.
The only three literal characters which are treated specially in the replace clause are / (to close the clause), \ (to escape characters, backreference, &c.), and & (to include the match in the replacement). Therefore, all you need to do is escape those three characters:
sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
Example:
$ export REPLACE="'\"|\\/><&!"
$ echo fooKEYWORDbar | sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
foo'"|\/><&!bar
Based on Pianosaurus's regular expressions, I made a bash function that escapes both keyword and replacement.
function sedeasy {
sed -i "s/$(echo $1 | sed -e 's/\([[\/.*]\|\]\)/\\&/g')/$(echo $2 | sed -e 's/[\/&]/\\&/g')/g" $3
}
Here's how you use it:
sedeasy "include /etc/nginx/conf.d/*" "include /apps/*/conf/nginx.conf" /etc/nginx/nginx.conf
It's a bit late to respond... but there IS a much simpler way to do this. Just change the delimiter (i.e., the character that separates fields). So, instead of s/foo/bar/ you write s|foo|bar|.
And, here's the easy way to do this:
sed 's|/\*!50017 DEFINER=`snafu`@`localhost`\*/||g'
The resulting output is devoid of that nasty DEFINER clause.
It turns out you're asking the wrong question. I also asked the wrong question. The reason it's wrong is the beginning of the first sentence: "In my bash script...".
I had the same question & made the same mistake. If you're using bash, you don't need to use sed to do string replacements (and it's much cleaner to use the replace feature built into bash).
Instead of something like, for example:
function escape-all-funny-characters() { UNKNOWN_CODE_THAT_ANSWERS_THE_QUESTION_YOU_ASKED; }
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A="$(escape-all-funny-characters 'KEYWORD')"
B="$(escape-all-funny-characters '<funny characters here>')"
OUTPUT="$(sed "s/$A/$B/g" <<<"$INPUT")"
you can use bash features exclusively:
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A='KEYWORD'
B='<funny characters here>'
OUTPUT="${INPUT//"$A"/"$B"}"
Use awk - it is cleaner:
$ awk -v R='//addr:\\file' '{ sub("THIS", R, $0); print $0 }' <<< "http://file:\_THIS_/path/to/a/file\\is\\\a\\ nightmare"
http://file:\_//addr:\file_/path/to/a/file\\is\\\a\\ nightmare
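The -v form above still needs backslashes doubled, and sub() treats & in the replacement specially. A way to sidestep escaping altogether is to pass both strings through the environment and do a plain substring search. The following is a sketch under those assumptions; replace_literal and the environment variable names SEARCH and REPLACE are hypothetical.

```shell
# replace_literal: hypothetical helper; reads stdin, writes stdout.
# Usage: replace_literal SEARCH REPLACE
# Both strings are taken literally: no regex, no & or \ handling.
replace_literal() {
    SEARCH=$1 REPLACE=$2 awk '
        BEGIN { s = ENVIRON["SEARCH"]; r = ENVIRON["REPLACE"]; n = length(s) }
        {
            line = $0; out = ""
            # Copy everything before each match, then the replacement.
            while (n > 0 && (i = index(line, s)) > 0) {
                out = out substr(line, 1, i - 1) r
                line = substr(line, i + n)
            }
            print out line
        }'
}

printf '%s\n' 'fooKEYWORDbar' | replace_literal 'KEYWORD' 'a/b&c\d'
# prints: fooa/b&c\dbar
```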
Here is an example of an AWK I used a while ago. It is an AWK that prints new AWKS. AWK and SED being similar it may be a good template.
ls | awk '{ print "awk " "'"'"'" " {print $1,$2,$3} " "'"'"'" " " $1 ".old_ext > " $1 ".new_ext" }' > for_the_birds
It looks excessive, but somehow that combination of quotes works to keep the ' printed as a literal. Then, if I remember correctly, the variables are just surrounded with quotes like this: "$1". Try it, and let me know how it works with sed.
These are the escape codes that I've found:
* = \x2a
( = \x28
) = \x29
" = \x22
/ = \x2f
\ = \x5c
' = \x27
? = \x3f
% = \x25
^ = \x5e
sed is typically a mess, especially the difference between gnu-sed and bsd-sed
might just be easier to place some sort of sentinel at the sed side, then a quick pipe over to awk, which is far more flexible in accepting any ERE regex, escaped hex, or escaped octals.
e.g. OFS in awk is the true replacement ::
date | sed -E 's/[0-9]+/\xC1\xC0/g' |
mawk NF=NF FS='\xC1\xC0' OFS='\360\237\244\241'
1 Tue Aug 🤡 🤡:🤡:🤡 EDT 🤡
(tested and confirmed working on both BSD-sed and GNU-sed - the emoji isn't a typo that's what those 4 bytes map to in UTF-8 )
There are dozens of answers out there... If you don't mind using a bash function schema, below is a good answer. The objective was to allow using sed with practically any parameter as a KEYWORD (F_PS_TARGET) or as a REPLACE (F_PS_REPLACE). We tested it in many scenarios and it seems to be pretty safe. The implementation below supports tabs, line breaks and single quotes for both KEYWORD and REPLACE.
NOTES: The idea here is to use sed to escape entries for another sed command.
CODE
F_REVERSE_STRING_R=""
f_reverse_string() {
: 'Do a string reverse.
To undo just use a reversed string as STRING_INPUT.
Args:
STRING_INPUT (str): String input.
Returns:
F_REVERSE_STRING_R (str): The modified string.
'
local STRING_INPUT=$1
F_REVERSE_STRING_R=$(echo "x${STRING_INPUT}x" | tac | rev)
F_REVERSE_STRING_R=${F_REVERSE_STRING_R%?}
F_REVERSE_STRING_R=${F_REVERSE_STRING_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/2705678/3223785 ]
F_POWER_SED_ECP_R=""
f_power_sed_ecp() {
: 'Escape strings for the "sed" command.
Escaped characters will be processed as is (e.g. /n, /t ...).
Args:
F_PSE_VAL_TO_ECP (str): Value to be escaped.
F_PSE_ECP_TYPE (int): 0 - For the TARGET value; 1 - For the REPLACE value.
Returns:
F_POWER_SED_ECP_R (str): Escaped value.
'
local F_PSE_VAL_TO_ECP=$1
local F_PSE_ECP_TYPE=$2
# NOTE: Operational characters of "sed" will be escaped, as well as single quotes.
# By Questor
if [ ${F_PSE_ECP_TYPE} -eq 0 ] ; then
# NOTE: For the TARGET value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[]\/$*.^[]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
else
# NOTE: For the REPLACE value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[\/&]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
fi
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R%?}
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/24134488/3223785 ,
# https://stackoverflow.com/a/21740695/3223785 ,
# https://unix.stackexchange.com/a/655558/61742 ,
# https://stackoverflow.com/a/11461628/3223785 ,
# https://stackoverflow.com/a/45151986/3223785 ,
# https://linuxaria.com/pills/tac-and-rev-to-see-files-in-reverse-order ,
# https://unix.stackexchange.com/a/631355/61742 ]
F_POWER_SED_R=""
f_power_sed() {
: 'Facilitate the use of the "sed" command. Replaces in files and strings.
Args:
F_PS_TARGET (str): Value to be replaced by the value of F_PS_REPLACE.
F_PS_REPLACE (str): Value that will replace F_PS_TARGET.
F_PS_FILE (Optional[str]): File in which the replacement will be made.
F_PS_SOURCE (Optional[str]): String to be manipulated in case "F_PS_FILE" was
not informed.
F_PS_NTH_OCCUR (Optional[int]): [1~n] - Replace the nth match; [n~-1] - Replace
the last nth match; 0 - Replace every match; Default 1.
Returns:
F_POWER_SED_R (str): Return the result if "F_PS_FILE" is not informed.
'
local F_PS_TARGET=$1
local F_PS_REPLACE=$2
local F_PS_FILE=$3
local F_PS_SOURCE=$4
local F_PS_NTH_OCCUR=$5
if [ -z "$F_PS_NTH_OCCUR" ] ; then
F_PS_NTH_OCCUR=1
fi
local F_PS_REVERSE_MODE=0
if [ ${F_PS_NTH_OCCUR} -lt -1 ] ; then
F_PS_REVERSE_MODE=1
f_reverse_string "$F_PS_TARGET"
F_PS_TARGET="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_REPLACE"
F_PS_REPLACE="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_SOURCE"
F_PS_SOURCE="$F_REVERSE_STRING_R"
F_PS_NTH_OCCUR=$((-F_PS_NTH_OCCUR))
fi
f_power_sed_ecp "$F_PS_TARGET" 0
F_PS_TARGET=$F_POWER_SED_ECP_R
f_power_sed_ecp "$F_PS_REPLACE" 1
F_PS_REPLACE=$F_POWER_SED_ECP_R
local F_PS_SED_RPL=""
if [ ${F_PS_NTH_OCCUR} -eq -1 ] ; then
# NOTE: We kept this option because it performs better when we only need to replace
# the last occurrence. By Questor
# [Ref(s).: https://linuxhint.com/use-sed-replace-last-occurrence/ ,
# https://unix.stackexchange.com/a/713866/61742 ]
F_PS_SED_RPL="'s/\(.*\)$F_PS_TARGET/\1$F_PS_REPLACE/'"
elif [ ${F_PS_NTH_OCCUR} -gt 0 ] ; then
# [Ref(s).: https://unix.stackexchange.com/a/587924/61742 ]
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/$F_PS_NTH_OCCUR'"
elif [ ${F_PS_NTH_OCCUR} -eq 0 ] ; then
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/g'"
fi
# NOTE: As the "sed" commands below always process literal values for the "F_PS_TARGET"
# so we use the "-z" flag in case it has multiple lines. By Quaestor
# [Ref(s).: https://unix.stackexchange.com/a/525524/61742 ]
if [ -z "$F_PS_FILE" ] ; then
F_POWER_SED_R=$(echo "x${F_PS_SOURCE}x" | eval "sed -z $F_PS_SED_RPL")
F_POWER_SED_R=${F_POWER_SED_R%?}
F_POWER_SED_R=${F_POWER_SED_R#?}
if [ ${F_PS_REVERSE_MODE} -eq 1 ] ; then
f_reverse_string "$F_POWER_SED_R"
F_POWER_SED_R="$F_REVERSE_STRING_R"
fi
else
if [ ${F_PS_REVERSE_MODE} -eq 0 ] ; then
eval "sed -i -z $F_PS_SED_RPL \"$F_PS_FILE\""
else
tac "$F_PS_FILE" | rev | eval "sed -z $F_PS_SED_RPL" | tac | rev > "$F_PS_FILE"
fi
fi
}
MODEL
f_power_sed "F_PS_TARGET" "F_PS_REPLACE" "" "F_PS_SOURCE"
echo "$F_POWER_SED_R"
EXAMPLE
f_power_sed "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" "[ ]+|$/,\"\0\"" "" "Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate that concatenation of the final \", \" then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,\"\0\"); print; }' <<<\"$string\") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readar"
echo "$F_POWER_SED_R"
IF YOU JUST WANT TO ESCAPE THE PARAMETERS TO THE SED COMMAND
MODEL
# "TARGET" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 0
echo "$F_POWER_SED_ECP_R"
# "REPLACE" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 1
echo "$F_POWER_SED_ECP_R"
IMPORTANT: If the strings for KEYWORD and/or replace REPLACE contain tabs or line breaks you will need to use the "-z" flag in your "sed" command. More details here.
EXAMPLE
f_power_sed_ecp "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" 0
echo "$F_POWER_SED_ECP_R"
f_power_sed_ecp "[ ]+|$/,\"\0\"" 1
echo "$F_POWER_SED_ECP_R"
NOTE: The f_power_sed_ecp and f_power_sed functions above were made available completely free as part of this project: ez_i - Create shell script installers easily!.
Standard recommendation here: use perl :)
echo KEYWORD > /tmp/test
REPLACE="<funny characters here>"
perl -pi.bck -e "s/KEYWORD/${REPLACE}/g" /tmp/test
cat /tmp/test
Don't forget the complications that the shell's quoting rules for " and ' add.
So (in ksh):
Var=">New version of \"content' here <"
printf "%s" "${Var}" | sed "s/[&\/\\\\*\"']/\\\\&/g" | read -r EscVar
echo "Here is your \"text\" to change" | sed "s/text/${EscVar}/g"
If you are generating a random password to pass to a sed replacement pattern, you can simply be careful about which characters the random string may contain. If the password is made by encoding a value as base64, then there is only one character that is both possible in base64 and special in a sed replacement pattern. That character is "/", and it is easily removed from the password you are generating:
# password 32 characters long, minus any copies of the "/" character.
pass=`openssl rand -base64 32 | sed -e 's/\///g'`;
If you just want the shell to expand a variable inside the sed command, remove the single quotes.
Example: change
sed -i 's/dev-/dev-$ENV/g' test
to
sed -i s/dev-/dev-$ENV/g test
I have an improvement over the sedeasy function, which WILL break with special characters like tab.
function sedeasy_improved {
sed -i "s/$(
echo "$1" | sed -e 's/\([[\/.*]\|\]\)/\\&/g' |
sed -e 's:\t:\\t:g'
)/$(
echo "$2" | sed -e 's/[\/&]/\\&/g' |
sed -e 's:\t:\\t:g'
)/g" "$3"
}
So, whats different? $1 and $2 wrapped in quotes to avoid shell expansions and preserve tabs or double spaces.
Additional piping | sed -e 's:\t:\\t:g' (I like : as token) which transforms a tab in \t.
An easier way to do this is simply building the string before hand and using it as a parameter for sed
rpstring="s/KEYWORD/$REPLACE/g"
sed -i "$rpstring" test.txt

In bash how can I get the last part of a string after the last hyphen [duplicate]

I have this variable:
A="Some variable has value abc.123"
I need to extract this value i.e abc.123. Is this possible in bash?
Simplest is
echo "$A" | awk '{print $NF}'
Edit: explanation of how this works...
awk breaks the input into different fields, using whitespace as the separator by default. Hardcoding 5 in place of NF prints out the 5th field in the input:
echo "$A" | awk '{print $5}'
NF is a built-in awk variable that gives the total number of fields in the current record. The following returns the number 5 because there are 5 fields in the string "Some variable has value abc.123":
echo "$A" | awk '{print NF}'
Combining $ with NF outputs the last field in the string, no matter how many fields your string contains.
Yes; this:
A="Some variable has value abc.123"
echo "${A##* }"
will print this:
abc.123
(The ${parameter##word} notation is explained in §3.5.3 "Shell Parameter Expansion" of the Bash Reference Manual.)
Some examples using parameter expansion
A="Some variable has value abc.123"
echo "${A##* }"
abc.123
Shortest match on " " space, from the end
echo "${A% *}"
Some variable has value
Shortest match on . dot, from the end
echo "${A%.*}"
Some variable has value abc
Longest match on " " space, from the end
echo "${A%% *}"
Some
Read more Shell-Parameter-Expansion
The documentation is a bit painful to read, so I've summarised it in a simpler way.
Note that the '*' needs to swap places with the ' ' depending on whether you use # or %. (The * is just a wildcard, so you may need to take off your "regex hat" while reading.)
${A% *} - remove shortest trailing * (strip the last word)
${A%% *} - remove longest trailing * (strip the last words)
${A#* } - remove shortest leading * (strip the first word)
${A##* } - remove longest leading * (strip the first words)
Of course a "word" here may contain any character that isn't a literal space.
You might commonly use this syntax to trim filenames:
${A##*/} removes all containing folders, if any, from the start of the path, e.g.
/usr/bin/git -> git
/usr/bin/ -> (empty string)
${A%/*} removes the last file/folder/trailing slash, if any, from the end:
/usr/bin/git -> /usr/bin
/usr/bin/ -> /usr/bin
${A%.*} removes the last extension, if any (just be wary of things like my.path/noext):
archive.tar.gz -> archive.tar
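The four operators summarised above can be tried side by side. This is a small sketch; the variable names other than A (last_word, base, and so on) are made up for illustration. All four forms are plain POSIX parameter expansion, so they are not limited to bash.

```shell
A="Some variable has value abc.123"

last_word=${A##* }       # longest leading "* "   removed -> abc.123
all_but_first=${A#* }    # shortest leading "* "  removed -> variable has value abc.123
all_but_last=${A% *}     # shortest trailing " *" removed -> Some variable has value
first_word=${A%% *}      # longest trailing " *"  removed -> Some

path=/usr/bin/git
base=${path##*/}         # git
dir=${path%/*}           # /usr/bin

printf '%s\n' "$last_word" "$all_but_first" "$all_but_last" "$first_word" "$base" "$dir"
```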
How do you know where the value begins? If it's always the 5th and 6th words, you could use e.g.:
B=$(echo "$A" | cut -d ' ' -f 5-)
This uses the cut command to slice out part of the line, using a simple space as the word delimiter.
As pointed out by Zedfoxus here. A very clean method that works on all Unix-based systems. Besides, you don't need to know the exact position of the substring.
A="Some variable has value abc.123"
echo "$A" | rev | cut -d ' ' -f 1 | rev
# abc.123
More ways to do this:
(Run each of these commands in your terminal to test this live.)
For all answers below, start by typing this in your terminal:
A="Some variable has value abc.123"
The array example (#3 below) is a really useful pattern, and depending on what you are trying to do, sometimes the best.
1. with awk, as the main answer shows
echo "$A" | awk '{print $NF}'
2. with grep:
echo "$A" | grep -o '[^ ]*$'
the -o says to only retain the matching portion of the string
the [^ ] part says "don't match spaces"; ie: "not the space char"
the * means "match 0 or more instances of the preceding match pattern" (which is [^ ]), and the $ means "match the end of the line." So, this matches the last word after the last space through to the end of the line; i.e., abc.123 in this case.
3. via regular bash "indexed" arrays and array indexing
Convert A to an array, with elements being separated by the default IFS (Internal Field Separator) char, which is space:
Option 1 (will "break in mysterious ways", as @tripleee put it in a comment here, if the string stored in the A variable contains certain special shell characters, so Option 2 below is recommended instead!):
# Capture space-separated words as separate elements in array A_array
A_array=($A)
Option 2 [RECOMMENDED!]. Use the read command, as I explain in my answer here, and as is recommended by the bash shellcheck static code analyzer tool for shell scripts, in ShellCheck rule SC2206, here.
# Capture space-separated words as separate elements in array A_array, using
# a "herestring".
# See my answer here: https://stackoverflow.com/a/71575442/4561887
IFS=" " read -r -a A_array <<< "$A"
Then, print only the last element in the array:
# Print only the last element via bash array right-hand-side indexing syntax
echo "${A_array[-1]}" # last element only
Output:
abc.123
Going further:
What makes this pattern so useful is that it also lets you easily do the opposite: obtain all words except the last one, like this:
array_len="${#A_array[@]}"
array_len_minus_one=$((array_len - 1))
echo "${A_array[@]:0:$array_len_minus_one}"
Output:
Some variable has value
For more on the ${array[@]:start:length} array slicing syntax above, see my answer here: Unix & Linux: Bash: slice of positional parameters, and for more info. on the bash "Arithmetic Expansion" syntax, see here:
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Arithmetic-Expansion
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Arithmetic
You can use a Bash regex:
A="Some variable has value abc.123"
[[ $A =~ [[:blank:]]([^[:blank:]]+)$ ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
Prints:
abc.123
That works with any [:blank:] delimiter in the current locale (usually [ \t]). If you want to be more specific:
A="Some variable has value abc.123"
pat='[ ]([^ ]+)$'
[[ $A =~ $pat ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
echo "Some variable has value abc.123"| perl -nE'say $1 if /(\S+)$/'

Is there a way to format the width of a substring within a string in a bash/sh script?

I have to format the width of a substring within a string using a bash script, but without using tokens or loops. A single character between two colons should be prepended by a 0 in order to match the standard width of 2 for each field.
For e.g
from:
6:0:36:35:30:30:72:6c:73:0:c:52:4c:30:31:30:31:30:30:30:31:36:39:0:1:3
to
06:00:36:35:30:30:72:6c:73:00:0c:52:4c:30:31:30:31:30:30:30:31:36:39:00:01:03
How can I do this?
sed -r 's/\<([0-9a-f])\>/0\1/g'
Search and replace with a regex. Use \< and \> to match word boundaries so [0-9a-f] only matches single digits.
$ sed -r 's/\<([0-9a-f])\>/0\1/g' <<< "6:0:36:35:30:30:72:6c:73:0:c:52:4c:30:31:30:31:30:30:30:31:36:39:0:1:3"
06:00:36:35:30:30:72:6c:73:00:0c:52:4c:30:31:30:31:30:30:30:31:36:39:00:01:03
awk -F: -v OFS=: '{for(i=1;i<=NF;i++) if(length($i)==1)gsub($i,"0&",$i)}1' file
Output:
06:00:36:35:30:30:72:6c:73:00:0c:52:4c:30:31:30:31:30:30:30:31:36:39:00:01:03
This splits the whole line into fields separated by :. If the length of a field is 1, that field is replaced with 0 followed by the field.
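The same field-by-field idea can be written with plain string concatenation instead of gsub, which avoids using the field's content as a regular expression. This is a sketch only; the function name pad_fields is hypothetical.

```shell
# pad_fields: hypothetical helper name.
# Prepends 0 to every colon-separated field that is exactly one character long.
pad_fields() {
    printf '%s\n' "$1" | awk -F: -v OFS=: '{
        for (i = 1; i <= NF; i++)
            if (length($i) == 1)
                $i = "0" $i
        print
    }'
}

pad_fields "6:0:36:35:72:6c:0:c:1:3"
# prints: 06:00:36:35:72:6c:00:0c:01:03
```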
Bash solution:
IFS=:; for i in $string; do echo -n 0$i: | tail -c 3; done
With
str="6:0:36:35:30:30:72:6c:73:0:c:52:4c:30:31:30:31:30:30:30:31:36:39:0:1:3"
you can add a '0' to all tokens and remove those that are unwanted:
sed -r 's/0([0-9a-f]{2})/\1/g' <<< "0${str//:/:0}"
That doesn't feel right, making errors and repairing them.
A better alternative is
echo $(IFS=:; printf "%2s:" ${str} | tr " " "0")

Replace strings in multiple files with corresponding caps using bash on MacOSX

I have multiple .txt files, in which I want to replace the strings
old -> new
Old -> New
OLD -> NEW
The first step is to only replace one string Old->New. Here is my current code, but it does not do the job (the files remain unchanged). The sed line works only if I replace the variables with the actual strings.
#!/bin/bash
old_string="Old"
new_string="New"
sed -i '.bak' 's/$old_string/$new_string/g' *.txt
Also, how do I convert a string to all upper-caps and all lower-caps?
Thank you very much for your advice!
To complement @merlin2011's helpful answer:
If you wanted to create the case variants dynamically, try this:
# Define search and replacement strings
# as all-lowercase.
old_string='old'
new_string='new'
# Loop 3 times and create the case variants dynamically.
# Build up a _single_ sed command that performs all 3
# replacements.
sedCmd=
for (( i = 1; i <= 3; i++ )); do
case $i in
1) # as defined (all-lowercase)
old_string_variant=$old_string
new_string_variant=$new_string
;;
2) # initial capital
old_string_variant="$(tr '[:lower:]' '[:upper:]' <<<"${old_string:0:1}")${old_string:1}"
new_string_variant="$(tr '[:lower:]' '[:upper:]' <<<"${new_string:0:1}")${new_string:1}"
;;
3) # all-uppercase
old_string_variant=$(tr '[:lower:]' '[:upper:]' <<<"$old_string")
new_string_variant=$(tr '[:lower:]' '[:upper:]' <<<"$new_string")
;;
esac
# Append to the sed command string. Note the use of _double_ quotes
# to ensure that variable references are expanded.
sedCmd+="s/$old_string_variant/$new_string_variant/g; "
done
# Finally, invoke sed.
sed -i '.bak' "$sedCmd" *.txt
Note that bash 4 supports case conversions directly (as part of parameter expansion), but OS X, as of 10.9.3, is still on bash 3.2.51.
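For shells without those bash 4 expansions, the three case conversions used above can be collected into small helpers built on tr. This is a sketch; the function names to_upper, to_lower and capitalize are hypothetical, and they rely only on POSIX sh, tr and cut.

```shell
# Hypothetical helper names; POSIX sh, tr and cut only.
to_upper() { printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'; }
to_lower() { printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'; }

# capitalize: uppercase the first character, leave the rest untouched.
capitalize() {
    _first=$(printf '%s\n' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    printf '%s%s\n' "$_first" "${1#?}"   # ${1#?} drops the first character
}

to_upper "old"     # prints OLD
to_lower "OLD"     # prints old
capitalize "old"   # prints Old
```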
Alternative solution, using awk to create the case variants and synthesize the sed command:
Aside from being shorter, it is also more robust: through appropriate escaping, it correctly handles strings that contain regex metacharacters (characters with special meaning in a regular expression, e.g., *) or characters with special meaning in the replacement-string parameter of sed's s function (e.g., \). Without escaping, the sed command would not work as expected.
Caveat: Doesn't support strings with embedded \n chars. (though that could be fixed, too).
# Define search and replacement strings as all-lowercase literals.
old_string='old'
new_string='new'
# Synthesize the sed command string, utilizing awk and its tolower() and toupper()
# functions to create the case variants.
# Note the need to escape \ chars to prevent awk from interpreting them.
sedCmd=$(awk \
-v old_string="${old_string//\\/\\\\}" \
-v new_string="${new_string//\\/\\\\}" \
'BEGIN {
printf "s/%s/%s/g; s/%s/%s/g; s/%s/%s/g",
old_string, new_string,
toupper(substr(old_string,1,1)) substr(old_string,2), toupper(substr(new_string,1,1)) substr(new_string,2),
toupper(old_string), toupper(new_string)
}')
# Invoke sed with the synthesized command.
# The inner sed command ensures that all regex metacharacters in the strings
# are escaped so that sed treats them as literals.
sed -i '.bak' "$(sed 's#[][(){}^$.*?+\]#\\&#g' <<<"$sedCmd")" *.txt
If you want to do bash variable expansion inside the argument to sed, you need to use double quotes " instead of single quotes '.
sed -i '.bak' "s/$old_string/$new_string/g" *.txt
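The difference is easy to see by printing the sed script instead of running it. This is a minimal sketch; the variable names single_quoted and double_quoted are made up for illustration.

```shell
old_string="Old"
new_string="New"

# Single quotes: the shell passes the text through untouched,
# so sed would search for the literal characters $old_string.
single_quoted='s/$old_string/$new_string/g'

# Double quotes: the shell substitutes the values before sed runs.
double_quoted="s/$old_string/$new_string/g"

printf '%s\n' "$single_quoted" "$double_quoted"
# prints:
# s/$old_string/$new_string/g
# s/Old/New/g
```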
In terms of getting matches on all three of the literal substitutions, the cleanest solution may be just to run sed three times in a loop like this.
declare -a olds=(old Old OLD)
declare -a news=(new New NEW)
for i in `seq 0 2`; do
sed -i "s/${olds[$i]}/${news[$i]}/g" *.txt
done;
Update: The solution above works on Linux, but apparently OS X has different requirements. Additionally, as @mklement0 mentioned, my for loop is silly. Here is an improved version for OS X.
declare -a olds=(old Old OLD)
declare -a news=(new New NEW)
for (( i = 0; i < ${#olds[@]}; i++ )); do
sed -i '.bak' "s/${olds[$i]}/${news[$i]}/g" *.txt
done;
Assuming each string is separated by spaces from your other strings and that you don't want partial matches within longer strings and that you don't care about preserving white space on output and assuming that if an "old" string matches on a "new" string after a previous conversion operation, then the string should be changed again:
$ cat tst.awk
BEGIN {
split(tolower(old),oldStrs)
split(tolower(new),newStrs)
}
{
for (fldNr=1; fldNr<=NF; fldNr++) {
for (stringNr=1; stringNr in oldStrs; stringNr++) {
oldStr = oldStrs[stringNr]
if (tolower($fldNr) == oldStr) {
newStr = newStrs[stringNr]
split(newStr,newChars,"")
split($fldNr,fldChars,"")
$fldNr = ""
for (charNr=1; charNr in fldChars; charNr++) {
fldChar = fldChars[charNr]
newChar = newChars[charNr]
$fldNr = $fldNr ( fldChar ~ /[[:lower:]]/ ?
newChar : toupper(newChar) )
}
}
}
}
print
}
.
$ cat file
The old Old OLD smOLDering QuICk brown FoX jumped
$ awk -v old="old" -v new="new" -f tst.awk file
The new New NEW smOLDering QuICk brown FoX jumped
Note that the "old" in "smOLDering" did not get changed. Is that desirable?
$ awk -v old="QUIck Fox" -v new="raBid DOG" -f tst.awk file
The old Old OLD smOLDering RaBId brown DoG jumped
$ awk -v old="THE brown Jumped" -v new="FEW dingy TuRnEd" -f tst.awk file
Few old Old OLD smOLDering QuICk dingy FoX turned
Think about whether or not this is your expected output:
$ awk -v old="old new" -v new="new yes" -f tst.awk file
The yes Yes YES smOLDering QuICk brown FoX jumped
A few lines of sample input and expected output in the question would be useful to avoid all the guessing and assumptions.
