Replace strings in multiple files with corresponding caps using bash on MacOSX - macos

I have multiple .txt files, in which I want to replace the strings
old -> new
Old -> New
OLD -> NEW
The first step is to only replace one string Old->New. Here is my current code, but it does not do the job (the files remain unchanged). The sed line works only if I replace the variables with the actual strings.
#!/bin/bash
old_string="Old"
new_string="New"
sed -i '.bak' 's/$old_string/$new_string/g' *.txt
Also, how do I convert a string to all upper-caps and all lower-caps?
Thank you very much for your advice!

To complement #merlin2011's helpful answer:
If you wanted to create the case variants dynamically, try this:
# Define search and replacement strings
# as all-lowercase.
old_string='old'
new_string='new'
# Loop 3 times and create the case variants dynamically.
# Build up a _single_ sed command that performs all 3
# replacements.
sedCmd=
for (( i = 1; i <= 3; i++ )); do
case $i in
1) # as defined (all-lowercase)
old_string_variant=$old_string
new_string_variant=$new_string
;;
2) # initial capital
old_string_variant="$(tr '[:lower:]' '[:upper:]' <<<"${old_string:0:1}")${old_string:1}"
new_string_variant="$(tr '[:lower:]' '[:upper:]' <<<"${new_string:0:1}")${new_string:1}"
;;
3) # all-uppercase
old_string_variant=$(tr '[:lower:]' '[:upper:]' <<<"$old_string")
new_string_variant=$(tr '[:lower:]' '[:upper:]' <<<"$new_string")
;;
esac
# Append to the sed command string. Note the use of _double_ quotes
# to ensure that variable references are expanded.
sedCmd+="s/$old_string_variant/$new_string_variant/g; "
done
# Finally, invoke sed.
sed -i '.bak' "$sedCmd" *.txt
Note that bash 4 supports case conversions directly (as part of parameter expansion), but OS X, as of 10.9.3, is still on bash 3.2.51.
Alternative solution, using awk to create the case variants and synthesize the sed command:
Aside from being shorter, it is also more robust, because it also handles strings correctly that happen to contain characters that are regex metacharacters (characters with special meaning in an regular expression, e.g., *) or have special meaning in sed's s function's replacement-string parameter (e.g., \), through appropriate escaping; without escaping, the sed command would not work as expected.
Caveat: Doesn't support strings with embedded \n chars. (though that could be fixed, too).
# Define search and replacement strings as all-lowercase literals.
old_string='old'
new_string='new'
# Synthesize the sed command string, utilizing awk and its tolower() and toupper()
# functions to create the case variants.
# Note the need to escape \ chars to prevent awk from interpreting them.
sedCmd=$(awk \
-v old_string="${old_string//\\/\\\\}" \
-v new_string="${new_string//\\/\\\\}" \
'BEGIN {
printf "s/%s/%s/g; s/%s/%s/g; s/%s/%s/g",
old_string, new_string,
toupper(substr(old_string,1,1)) substr(old_string,2), toupper(substr(new_string,1,1)) substr(new_string,2),
toupper(old_string), toupper(new_string)
}')
# Invoke sed with the synthesized command.
# The inner sed command ensures that all regex metacharacters in the strings
# are escaped so that sed treats them as literals.
sed -i '.bak' "$(sed 's#[][(){}^$.*?+\]#\\&#g' <<<"$sedCmd")" *.txt

If you want to do bash variable expansion inside the argument to sed, you need to use double quotes " instead of single quotes '.
sed -i '.bak' "s/$old_string/$new_string/g" *.txt
In terms of getting matches on all three of the literal substitutions, the cleanest solution may be just to run sed three times in a loop like this.
declare -a olds=(old Old OLD)
declare -a news=(new New NEW)
for i in `seq 0 2`; do
sed -i "s/${olds[$i]}/${news[$i]}/g" *.txt
done;
Update: The solution above works on Linux, but apparently OS X has different requirements. Additionally, as #mklement0 mentioned, my for loop is silly. Here is an improved version for OS X.
declare -a olds=(old Old OLD)
declare -a news=(new New NEW)
for (( i = 0; i < ${#olds[#]}; i++ )); do
sed -i '.bak' "s/${olds[$i]}/${news[$i]}/g" *.txt
done;

Assuming each string is separated by spaces from your other strings and that you don't want partial matches within longer strings and that you don't care about preserving white space on output and assuming that if an "old" string matches on a "new" string after a previous conversion operation, then the string should be changed again:
$ cat tst.awk
BEGIN {
split(tolower(old),oldStrs)
split(tolower(new),newStrs)
}
{
for (fldNr=1; fldNr<=NF; fldNr++) {
for (stringNr=1; stringNr in oldStrs; stringNr++) {
oldStr = oldStrs[stringNr]
if (tolower($fldNr) == oldStr) {
newStr = newStrs[stringNr]
split(newStr,newChars,"")
split($fldNr,fldChars,"")
$fldNr = ""
for (charNr=1; charNr in fldChars; charNr++) {
fldChar = fldChars[charNr]
newChar = newChars[charNr]
$fldNr = $fldNr ( fldChar ~ /[[:lower:]]/ ?
newChar : toupper(newChar) )
}
}
}
}
print
}
.
$ cat file
The old Old OLD smOLDering QuICk brown FoX jumped
$ awk -v old="old" -v new="new" -f tst.awk file
The new New NEW smOLDering QuICk brown FoX jumped
Note that the "old" in "smOLDering" did not get changed. Is that desirable?
$ awk -v old="QUIck Fox" -v new="raBid DOG" -f tst.awk file
The old Old OLD smOLDering RaBId brown DoG jumped
$ awk -v old="THE brown Jumped" -v new="FEW dingy TuRnEd" -f tst.awk file
Few old Old OLD smOLDering QuICk dingy FoX turned
Think about whether or not this is your expected output:
$ awk -v old="old new" -v new="new yes" -f tst.awk file
The yes Yes YES smOLDering QuICk brown FoX jumped
A few lines of sample input and expected output in the question would be useful to avoid all the guessing and assumptions.

Related

shell command that eats its input if it matches a string

I’m looking for a shell command (a one-liner) that eats all of its standard input if that input matches a string. Otherwise, it just outputs its input.
Likewise, how about a command that eats all of its input if it doesn’t match the string.
Presumably sed or awk can do it, but it’s beyond my fu. I can easily do it with a shell script but I’d prefer a single command.
This is useful in crontab so that you only receive an email if the output of a command fails (i.e. matches a string).
So, below, what are “eatIfMatch” and “eatIfNoMatch”?
Thanks for any ideas.
$ cat matchFile
This file
will match the
pattern
$ cat noMatchFile
This file
doesn’t quite
Match
though
$ eatIfMatch match < matchFile
$ eatIfMatch match < noMatchFile
This file
doesn’t quite
Match
though
$ eatIfNoMatch match < matchFile
This file
will match the
pattern
$ eatIfNoMatch match < noMatchFile
$
This might work for you (GNU sed and Bash):
eatIfMatch(){ sed -zn '/'"$1"'/!p' "$2"; }
eatIfMatch match fileMatch
eatIfMatch match fileNoMatch
This file
doesn’t quite
Match
though
eatIfNoMatch(){ sed -zn '/'"$1"'/p' "$2"; }
eatIfNoMatch match fileMatch
This file
will match the
pattern
eatIfNoMatch match fileNoMatch
Function eatIfMatch accepts regexp to match and if it does eats the file.
Function eatIfNoMatch accepts regexp to mat and if it does not eats file.
Assign the input to a variable. Check if the variable contains the string. If not, it prints the string.
eatIfMatch() {
local var=$(cat)
if [[ $var != *"$1"* ]]
then printf "%s\n" "$var"
fi
}
If you want awk, it's essentially the same logic
awk '/match/ {found=1} {all = all "\n" $0} END {if(found) print substr(all, 1)}'

Double quotes containing variable not working in sed [duplicate]

In my bash script I have an external (received from user) string, which I should use in sed pattern.
REPLACE="<funny characters here>"
sed "s/KEYWORD/$REPLACE/g"
How can I escape the $REPLACE string so it would be safely accepted by sed as a literal replacement?
NOTE: The KEYWORD is a dumb substring with no matches etc. It is not supplied by user.
Warning: This does not consider newlines. For a more in-depth answer, see this SO-question instead. (Thanks, Ed Morton & Niklas Peter)
Note that escaping everything is a bad idea. Sed needs many characters to be escaped to get their special meaning. For example, if you escape a digit in the replacement string, it will turn in to a backreference.
As Ben Blank said, there are only three characters that need to be escaped in the replacement string (escapes themselves, forward slash for end of statement and & for replace all):
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
# Now you can use ESCAPED_REPLACE in the original sed statement
sed "s/KEYWORD/$ESCAPED_REPLACE/g"
If you ever need to escape the KEYWORD string, the following is the one you need:
sed -e 's/[]\/$*.^[]/\\&/g'
And can be used by:
KEYWORD="The Keyword You Need";
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
# Now you can use it inside the original sed statement to replace text
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/g"
Remember, if you use a character other than / as delimiter, you need replace the slash in the expressions above wih the character you are using. See PeterJCLaw's comment for explanation.
Edited: Due to some corner cases previously not accounted for, the commands above have changed several times. Check the edit history for details.
The sed command allows you to use other characters instead of / as separator:
sed 's#"http://www\.fubar\.com"#URL_FUBAR#g'
The double quotes are not a problem.
The only three literal characters which are treated specially in the replace clause are / (to close the clause), \ (to escape characters, backreference, &c.), and & (to include the match in the replacement). Therefore, all you need to do is escape those three characters:
sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
Example:
$ export REPLACE="'\"|\\/><&!"
$ echo fooKEYWORDbar | sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
foo'"|\/><&!bar
Based on Pianosaurus's regular expressions, I made a bash function that escapes both keyword and replacement.
function sedeasy {
sed -i "s/$(echo $1 | sed -e 's/\([[\/.*]\|\]\)/\\&/g')/$(echo $2 | sed -e 's/[\/&]/\\&/g')/g" $3
}
Here's how you use it:
sedeasy "include /etc/nginx/conf.d/*" "include /apps/*/conf/nginx.conf" /etc/nginx/nginx.conf
It's a bit late to respond... but there IS a much simpler way to do this. Just change the delimiter (i.e., the character that separates fields). So, instead of s/foo/bar/ you write s|bar|foo.
And, here's the easy way to do this:
sed 's|/\*!50017 DEFINER=`snafu`#`localhost`\*/||g'
The resulting output is devoid of that nasty DEFINER clause.
It turns out you're asking the wrong question. I also asked the wrong question. The reason it's wrong is the beginning of the first sentence: "In my bash script...".
I had the same question & made the same mistake. If you're using bash, you don't need to use sed to do string replacements (and it's much cleaner to use the replace feature built into bash).
Instead of something like, for example:
function escape-all-funny-characters() { UNKNOWN_CODE_THAT_ANSWERS_THE_QUESTION_YOU_ASKED; }
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A="$(escape-all-funny-characters 'KEYWORD')"
B="$(escape-all-funny-characters '<funny characters here>')"
OUTPUT="$(sed "s/$A/$B/g" <<<"$INPUT")"
you can use bash features exclusively:
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A='KEYWORD'
B='<funny characters here>'
OUTPUT="${INPUT//"$A"/"$B"}"
Use awk - it is cleaner:
$ awk -v R='//addr:\\file' '{ sub("THIS", R, $0); print $0 }' <<< "http://file:\_THIS_/path/to/a/file\\is\\\a\\ nightmare"
http://file:\_//addr:\file_/path/to/a/file\\is\\\a\\ nightmare
Here is an example of an AWK I used a while ago. It is an AWK that prints new AWKS. AWK and SED being similar it may be a good template.
ls | awk '{ print "awk " "'"'"'" " {print $1,$2,$3} " "'"'"'" " " $1 ".old_ext > " $1 ".new_ext" }' > for_the_birds
It looks excessive, but somehow that combination of quotes works to keep the ' printed as literals. Then if I remember correctly the vaiables are just surrounded with quotes like this: "$1". Try it, let me know how it works with SED.
These are the escape codes that I've found:
* = \x2a
( = \x28
) = \x29
" = \x22
/ = \x2f
\ = \x5c
' = \x27
? = \x3f
% = \x25
^ = \x5e
sed is typically a mess, especially the difference between gnu-sed and bsd-sed
might just be easier to place some sort of sentinel at the sed side, then a quick pipe over to awk, which is far more flexible in accepting any ERE regex, escaped hex, or escaped octals.
e.g. OFS in awk is the true replacement ::
date | sed -E 's/[0-9]+/\xC1\xC0/g' |
mawk NF=NF FS='\xC1\xC0' OFS='\360\237\244\241'
1 Tue Aug 🤡 🤡:🤡:🤡 EDT 🤡
(tested and confirmed working on both BSD-sed and GNU-sed - the emoji isn't a typo that's what those 4 bytes map to in UTF-8 )
There are dozens of answers out there... If you don't mind using a bash function schema, below is a good answer. The objective below was to allow using sed with practically any parameter as a KEYWORD (F_PS_TARGET) or as a REPLACE (F_PS_REPLACE). We tested it in many scenarios and it seems to be pretty safe. The implementation below supports tabs, line breaks and sigle quotes for both KEYWORD and replace REPLACE.
NOTES: The idea here is to use sed to escape entries for another sed command.
CODE
F_REVERSE_STRING_R=""
f_reverse_string() {
: 'Do a string reverse.
To undo just use a reversed string as STRING_INPUT.
Args:
STRING_INPUT (str): String input.
Returns:
F_REVERSE_STRING_R (str): The modified string.
'
local STRING_INPUT=$1
F_REVERSE_STRING_R=$(echo "x${STRING_INPUT}x" | tac | rev)
F_REVERSE_STRING_R=${F_REVERSE_STRING_R%?}
F_REVERSE_STRING_R=${F_REVERSE_STRING_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/2705678/3223785 ]
F_POWER_SED_ECP_R=""
f_power_sed_ecp() {
: 'Escape strings for the "sed" command.
Escaped characters will be processed as is (e.g. /n, /t ...).
Args:
F_PSE_VAL_TO_ECP (str): Value to be escaped.
F_PSE_ECP_TYPE (int): 0 - For the TARGET value; 1 - For the REPLACE value.
Returns:
F_POWER_SED_ECP_R (str): Escaped value.
'
local F_PSE_VAL_TO_ECP=$1
local F_PSE_ECP_TYPE=$2
# NOTE: Operational characters of "sed" will be escaped, as well as single quotes.
# By Questor
if [ ${F_PSE_ECP_TYPE} -eq 0 ] ; then
# NOTE: For the TARGET value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[]\/$*.^[]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
else
# NOTE: For the REPLACE value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[\/&]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
fi
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R%?}
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/24134488/3223785 ,
# https://stackoverflow.com/a/21740695/3223785 ,
# https://unix.stackexchange.com/a/655558/61742 ,
# https://stackoverflow.com/a/11461628/3223785 ,
# https://stackoverflow.com/a/45151986/3223785 ,
# https://linuxaria.com/pills/tac-and-rev-to-see-files-in-reverse-order ,
# https://unix.stackexchange.com/a/631355/61742 ]
F_POWER_SED_R=""
f_power_sed() {
: 'Facilitate the use of the "sed" command. Replaces in files and strings.
Args:
F_PS_TARGET (str): Value to be replaced by the value of F_PS_REPLACE.
F_PS_REPLACE (str): Value that will replace F_PS_TARGET.
F_PS_FILE (Optional[str]): File in which the replacement will be made.
F_PS_SOURCE (Optional[str]): String to be manipulated in case "F_PS_FILE" was
not informed.
F_PS_NTH_OCCUR (Optional[int]): [1~n] - Replace the nth match; [n~-1] - Replace
the last nth match; 0 - Replace every match; Default 1.
Returns:
F_POWER_SED_R (str): Return the result if "F_PS_FILE" is not informed.
'
local F_PS_TARGET=$1
local F_PS_REPLACE=$2
local F_PS_FILE=$3
local F_PS_SOURCE=$4
local F_PS_NTH_OCCUR=$5
if [ -z "$F_PS_NTH_OCCUR" ] ; then
F_PS_NTH_OCCUR=1
fi
local F_PS_REVERSE_MODE=0
if [ ${F_PS_NTH_OCCUR} -lt -1 ] ; then
F_PS_REVERSE_MODE=1
f_reverse_string "$F_PS_TARGET"
F_PS_TARGET="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_REPLACE"
F_PS_REPLACE="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_SOURCE"
F_PS_SOURCE="$F_REVERSE_STRING_R"
F_PS_NTH_OCCUR=$((-F_PS_NTH_OCCUR))
fi
f_power_sed_ecp "$F_PS_TARGET" 0
F_PS_TARGET=$F_POWER_SED_ECP_R
f_power_sed_ecp "$F_PS_REPLACE" 1
F_PS_REPLACE=$F_POWER_SED_ECP_R
local F_PS_SED_RPL=""
if [ ${F_PS_NTH_OCCUR} -eq -1 ] ; then
# NOTE: We kept this option because it performs better when we only need to replace
# the last occurrence. By Questor
# [Ref(s).: https://linuxhint.com/use-sed-replace-last-occurrence/ ,
# https://unix.stackexchange.com/a/713866/61742 ]
F_PS_SED_RPL="'s/\(.*\)$F_PS_TARGET/\1$F_PS_REPLACE/'"
elif [ ${F_PS_NTH_OCCUR} -gt 0 ] ; then
# [Ref(s).: https://unix.stackexchange.com/a/587924/61742 ]
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/$F_PS_NTH_OCCUR'"
elif [ ${F_PS_NTH_OCCUR} -eq 0 ] ; then
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/g'"
fi
# NOTE: As the "sed" commands below always process literal values for the "F_PS_TARGET"
# so we use the "-z" flag in case it has multiple lines. By Quaestor
# [Ref(s).: https://unix.stackexchange.com/a/525524/61742 ]
if [ -z "$F_PS_FILE" ] ; then
F_POWER_SED_R=$(echo "x${F_PS_SOURCE}x" | eval "sed -z $F_PS_SED_RPL")
F_POWER_SED_R=${F_POWER_SED_R%?}
F_POWER_SED_R=${F_POWER_SED_R#?}
if [ ${F_PS_REVERSE_MODE} -eq 1 ] ; then
f_reverse_string "$F_POWER_SED_R"
F_POWER_SED_R="$F_REVERSE_STRING_R"
fi
else
if [ ${F_PS_REVERSE_MODE} -eq 0 ] ; then
eval "sed -i -z $F_PS_SED_RPL \"$F_PS_FILE\""
else
tac "$F_PS_FILE" | rev | eval "sed -z $F_PS_SED_RPL" | tac | rev > "$F_PS_FILE"
fi
fi
}
MODEL
f_power_sed "F_PS_TARGET" "F_PS_REPLACE" "" "F_PS_SOURCE"
echo "$F_POWER_SED_R"
EXAMPLE
f_power_sed "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" "[ ]+|$/,\"\0\"" "" "Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate that concatenation of the final \", \" then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,\"\0\"); print; }' <<<\"$string\") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readar"
echo "$F_POWER_SED_R"
IF YOU JUST WANT TO ESCAPE THE PARAMETERS TO THE SED COMMAND
MODEL
# "TARGET" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 0
echo "$F_POWER_SED_ECP_R"
# "REPLACE" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 1
echo "$F_POWER_SED_ECP_R"
IMPORTANT: If the strings for KEYWORD and/or replace REPLACE contain tabs or line breaks you will need to use the "-z" flag in your "sed" command. More details here.
EXAMPLE
f_power_sed_ecp "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" 0
echo "$F_POWER_SED_ECP_R"
f_power_sed_ecp "[ ]+|$/,\"\0\"" 1
echo "$F_POWER_SED_ECP_R"
NOTE: The f_power_sed_ecp and f_power_sed functions above was made available completely free as part of this project ez_i - Create shell script installers easily!.
Standard recommendation here: use perl :)
echo KEYWORD > /tmp/test
REPLACE="<funny characters here>"
perl -pi.bck -e "s/KEYWORD/${REPLACE}/g" /tmp/test
cat /tmp/test
don't forget all the pleasure that occur with the shell limitation around " and '
so (in ksh)
Var=">New version of \"content' here <"
printf "%s" "${Var}" | sed "s/[&\/\\\\*\\"']/\\&/g' | read -r EscVar
echo "Here is your \"text\" to change" | sed "s/text/${EscVar}/g"
If the case happens to be that you are generating a random password to pass to sed replace pattern, then you choose to be careful about which set of characters in the random string. If you choose a password made by encoding a value as base64, then there is is only character that is both possible in base64 and is also a special character in sed replace pattern. That character is "/", and is easily removed from the password you are generating:
# password 32 characters log, minus any copies of the "/" character.
pass=`openssl rand -base64 32 | sed -e 's/\///g'`;
If you are just looking to replace Variable value in sed command then just remove
Example:
sed -i 's/dev-/dev-$ENV/g' test to sed -i s/dev-/dev-$ENV/g test
I have an improvement over the sedeasy function, which WILL break with special characters like tab.
function sedeasy_improved {
sed -i "s/$(
echo "$1" | sed -e 's/\([[\/.*]\|\]\)/\\&/g'
| sed -e 's:\t:\\t:g'
)/$(
echo "$2" | sed -e 's/[\/&]/\\&/g'
| sed -e 's:\t:\\t:g'
)/g" "$3"
}
So, whats different? $1 and $2 wrapped in quotes to avoid shell expansions and preserve tabs or double spaces.
Additional piping | sed -e 's:\t:\\t:g' (I like : as token) which transforms a tab in \t.
An easier way to do this is simply building the string before hand and using it as a parameter for sed
rpstring="s/KEYWORD/$REPLACE/g"
sed -i $rpstring test.txt

sed Capital_Case not working

I'm trying to convert a string that has either - (hyphen) or _ (underscore) to Capital_Case string.
#!/usr/bin/env sh
function cap_case() {
[ $# -eq 1 ] || return 1;
_str=$1;
_capitalize=${_str//[-_]/_} | sed -E 's/(^|_)([a-zA-Z])/\u\2/g'
echo "Capitalize:"
echo $_capitalize
return 0
}
read string
echo $(cap_case $string)
But I don't get anything out.
First I am replacing any occurrence of - and _ with _ ${_str//[-_]/_}, and then I pipe that string to sed which finds the first letter, or _ as the first group, and then the letter after the first group in the second group, and I want to uppercase the found letter with \u\2. I tried with \U\2 but that didn't work as well.
I want the string some_string to become
Some_String
And string some-string to become
Some_String
I'm on a mac, using zsh if that is helpful.
EDIT: More generic solution here to make each field's first letter Capital.
echo "some_string_other" | awk -F"_" '{for(i=1;i<=NF;i++){$i=toupper(substr($i,1,1)) substr($i,2)}} 1' OFS="_"
Following awk may help you.
echo "some_string" | awk -F"_" '{$1=toupper(substr($1,1,1)) substr($1,2);$2=toupper(substr($2,1,1)) substr($2,2)} 1' OFS="_"
Output will be as follows.
echo "some_string" | awk -F"_" '{$1=toupper(substr($1,1,1)) substr($1,2);$2=toupper(substr($2,1,1)) substr($2,2)} 1' OFS="_"
Some_String
This being zsh, you don't need sed (or even a function, really):
$ s=some-string-bar
$ print ${(C)s:gs/-/_}
Some_String_Bar
The (C) flag capitalizes words (where "words" are defined as sequences of alphanumeric characters separated by other characters); :gs/-/_ replaces hyphens with underscores.
If you really want a function, it's cap_case () { print ${(C)1:gs/-/_} }.
pure bash:
#!/bin/bash
camel_case(){
local d display string
declare -a strings # = scope local
[ "$2" ] && d="$2" || d=" " # optional output delimiter
ifs_ini="$IFS"
IFS+='_-' # we keep initial IFS
strings=( "$1" ) # array
for string in ${strings[#]} ; do
display+="${string^}$d"
done
echo "${display%$d}"
IFS="$ifs_ini"
}
camel_case "some-string_here" "_"
camel_case "some-string_here some strings here" "+"
camel_case "some-string_here some strings here"
echo "$BASH_VERSION"
exit
output:
Some_String_Here
Some+String+Here+Some+Strings+Here
Some String Here Some Strings Here
4.4.18(1) release
You can try this gnu sed
echo 'some_other-string' | sed -E 's/(^.)/\u&/;s/[_-](.)/_\u\1/g'
Explains :
s/(^.)/\u&/
(^.) match the first char and \u& put the match in capital letter.
s/[_-](.)/_\u\1/g
[_-](.) capture a char preceded by _ or - and replace it by _ and the matched char in capital letter.
The g at the end tell sed to make the replacement for each char which meet the criteria
You didn't assign to _capitalize - you set a _capitalize environment variable for the empty command that you piped into sed.
You probably meant
_capitalize=$(<<<"${_str//[-_]/_}" sed -E 's/(^|_)([a-zA-Z])/\1\u\2/g')
Note also that ${//} isn't standard shell, so you really ought to specify an interpreter other than sh.
A simpler approach would be simply:
#!/bin/sh
cap_case() {
printf "Capitalize: "
echo "$*" | sed -e 'y/-/_/' -e 's/\(^\|_\)[[:alpha:]]/\U&/g'
}
echo $(cap_case "snake_case")
Note that the \u / \U replacement is a GNU extension to sed - if you're using a non-GNU implementation, check whether it supports this feature.

Bash - Search and Replace operation with reporting the files and lines that got changed

I have a input file "test.txt" as below -
hostname=abc.com hostname=xyz.com
db-host=abc.com db-host=xyz.com
In each line, the value before space is the old value which needs to be replaced by the new value after the space recursively in a folder named "test". I am able to do this using below shell script.
#!/bin/bash
IFS=$'\n'
for f in `cat test.txt`
do
OLD=$(echo $f| cut -d ' ' -f 1)
echo "Old = $OLD"
NEW=$(echo $f| cut -d ' ' -f 2)
echo "New = $NEW"
find test -type f | xargs sed -i.bak "s/$OLD/$NEW/g"
done
"sed" replaces the strings on the fly in 100s of files.
Is there a trick or an alternative way by which i can get a report of the files changed like absolute path of the file & the exact lines that got changed ?
PS - I understand that sed or stream editors doesn't support this functionality out of the box. I don't want to use versioning as it will be an overkill for this task.
Let's start with a simple rewrite of your script, to make it a little bit more robust at handling a wider range of replacement values, but also faster:
#!/bin/bash
# escape regexp and replacement strings for sed
escapeRegex() { sed 's/[^^]/[&]/g; s/\^/\\^/g' <<<"$1"; }
escapeSubst() { sed 's/[&/\]/\\&/g' <<<"$1"; }
while read -r old new; do
find test -type f -exec sed "/$(escapeRegex "$old")/$(escapeSubst "$new")/g" -i '{}' \;
done <test.txt
So, we loop over pairs of whitespace-separated fields (old, new) in lines from test.txt and run a standard sed in-place replace on all files found with find.
Pretty similar to your script, but we properly read lines from test.txt (no word splitting, pathname/variable expansion, etc.), we use Bash builtins whenever possible (no need to call external tools like cat, cut, xargs); and we escape sed metacharacters in old/new values for proper use as sed's regexp and replacement expressions.
Now let's add logging from sed:
#!/bin/bash
# escape regexp and replacement strings for sed
escapeRegex() { sed 's/[^^]/[&]/g; s/\^/\\^/g' <<<"$1"; }
escapeSubst() { sed 's/[&/\]/\\&/g' <<<"$1"; }
while read -r old new; do
find test -type f -printf '\n[%p]\n' -exec sed "/$(escapeRegex "$old")/{
h
s//$(escapeSubst "$new")/g
H
x
s/\n/ --> /
w /dev/stdout
x
}" -i '{}' > >(tee -a change.log) \;
done <test.txt
The sed script above changes each old to new, but it also writes old --> new line to /dev/stdout (Bash-specific), which we in turn append to change.log file. The -printf action in find outputs a "header" line with file name, for each file processed.
With this, your "change log" will look something like:
[file1]
hostname=abc.com --> hostname=xyz.com
[file2]
[file1]
db-host=abc.com --> db-host=xyz.com
[file2]
db-host=abc.com --> db-host=xyz.com
Just for completeness, a quick walk-through the sed script. We act only on lines containing the old value. For each such line, we store it to hold space (h), change it to new, append that new value to the hold space (joined with newline, H) which now holds old\nnew. We swap hold with pattern space (x), so we can run s command that converts it to old --> new. After writing that to the stdout with w, we move the new back from hold to pattern space, so it gets written (in-place) to the file processed.
From man sed:
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
This can be used to create a backup file when replacing. You can then look for any backup files, which indicate which files were changed, and diff those with the originals. Once you're done inspecting the diff, simply remove the backup files.
If you formulate your replacements as sed statements rather than a custom format you can go one further, and use either a sed shebang line or pass the file to -f/--file to do all the replacements in one operation.
There's several problems with your script, just replace it all with (using GNU awk instead of GNU sed for inplace editing):
mapfile -t files < <(find test -type f)
awk -i inplace '
NR==FNR { map[$1] = $2; next }
{ for (old in map) gsub(old,map[old]) }
' test.txt "${files[#]}"
You'll find that is orders of magnitude faster than what you were doing.
That still has the issue your existing script does of failing when the "test.txt" strings contain regexp or backreference metacharacters and modifying previously-modified strings and handling partial matches - if that's an issue let us know as it's easy to work around with awk (and extremely difficult with sed!).
To get whatever kind of report you want you just tweak the { for ... } line to print them, e.g. to print a record of the changes to stderr:
mapfile -t files < <(find test -type f)
awk -i inplace '
NR==FNR { map[$1] = $2; next }
{
orig = $0
for (old in map) {
gsub(old,map[old])
}
if ($0 != orig) {
printf "File %s, line %d: \"%s\" became \"%s\"\n", FILENAME, FNR, orig, $0 | "cat>&2"
}
}
' test.txt "${files[#]}"

String manipulation via script

I am trying to get a substring between &DEST= and the next & or a line break.
For example :
MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546
In this I need to extract "SFO"
MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546
In this I need to extract "SANFRANSISCO"
MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE
In this I need to extract "SANJOSE"
I am reading a file line by line, and I need to update the text after &DEST= and put it back in the file. The modification of the text is to mask the dest value with X character.
So, SFO should be replaced with XXX.
SANJOSE should be replaced with XXXXXXX.
Output :
MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX
Please let me know how to achieve this in script (Preferably shell or bash script).
Thanks.
$ cat file
MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=PORTORICA
MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE
$ sed -E 's/^.*&DEST=([^&]*)[&]*.*$/\1/' file
SFO
PORTORICA
SANFRANSISCO
SANJOSE
should do it
Replacing airports with an equal number of Xs
Let's consider this test file:
$ cat file
MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE
To replace the strings after &DEST= with an equal length of X and using GNU sed:
$ sed -E ':a; s/(&DEST=X*)[^X&]/\1X/; ta' file
MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX
To replace the file in-place:
sed -i -E ':a; s/(&DEST=X*)[^X&]/\1X/; ta' file
The above was tested with GNU sed. For BSD (OSX) sed, try:
sed -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta file
Or, to change in-place with BSD(OSX) sed, try:
sed -i '' -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta file
If there is some reason why it is important to use the shell to read the file line-by-line:
while IFS= read -r line
do
echo "$line" | sed -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta
done <file
How it works
Let's consider this code:
search_str="&DEST="
newfile=chart.txt
sed -E ':a; s/('"$search_str"'X*)[^X&]/\1X/; ta' "$newfile"
-E
This tells sed to use Extended Regular Expressions (ERE). This has the advantage of requiring fewer backslashes to escape things.
:a
This creates a label a.
s/('"$search_str"'X*)[^X&]/\1X/
This looks for $search_str followed by any number of X followed by any character that is not X or &. Because of the parens, everything except that last character is saved into group 1. This string is replaced by group 1, denoted \1 and an X.
ta
In sed, t is a test command. If the substitution was made (meaning that some character needed to be replaced by X), then the test evaluates to true and, in that case, ta tells sed to jump to label a.
This test-and-jump causes the substitution to be repeated as many times as necessary.
Replacing multiple tags with one sed command
$ name='DEST|ORIG'; sed -E ':a; s/(&('"$name"')=X*)[^X&]/\1X/; ta' file
MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=XXXX
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=XXXX
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX
Answer for original question
Using shell
$ s='MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546'
$ s=${s#*&DEST=}
$ echo ${s%%&*}
SFO
How it works:
${s#*&DEST=} is prefix removal. This removes all text up to and including the first occurrence of &DEST=.
${s%%&*} is suffix removal_. It removes all text from the first & to the end of the string.
Using awk
$ echo 'MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546' | awk -F'[=\n]' '$1=="DEST"{print $2}' RS='&'
SFO
How it works:
-F'[=\n]'
This tells awk to treat either an equal sign or a newline as the field separator
$1=="DEST"{print $2}
If the first field is DEST, then print the second field.
RS='&'
This sets the record separator to &.
With GNU bash:
while IFS= read -r line; do
[[ $line =~ (.*&DEST=)(.*)((&.*|$)) ]] && echo "${BASH_REMATCH[1]}fooooo${BASH_REMATCH[3]}"
done < file
Output:
MYREQUESTISTO8764GETTHIS&DEST=fooooo&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=fooooo&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=fooooo
Replace the characters between &DEST and & (or EOL) with x's:
awk -F'&DEST=' '{
printf("%s&DEST=", $1);
xlen=index($2,"&");
if ( xlen == 0) xlen=length($2)+1;
for (i=0;i<xlen;i++) printf("%s", "X");
endstr=substr($2,xlen);
printf("%s\n", endstr);
}' file

Resources