Convert text from HttpStatus.NOT_FOUND into status().isNotFound() in bash - shell

I want to convert the text in a bash variable, e.g. HttpStatus.NOT_FOUND, into status().isNotFound(). I have accomplished this using sed:
result=HttpStatus.NOT_FOUND
result=$(echo $result | cut -d'.' -f2- | sed -r 's/(^|_)([A-Z])/\L\2/g' | sed -E 's/([[:lower:]])|([[:upper:]])/\U\1\L\2/g')
echo "status().is$result()"
Output:
status().isNotFound()
As you can see here I'm using 2 sed commands.
Is there a way to achieve the same result using 1 sed or any other simpler way?

Since the replacement inserts a lot of new text, the sed command can be spelled out in full as below. Just pipe the variable's content into sed; there is no need for cut.
result=HttpStatus.NOT_FOUND
echo "$result" |
sed -E 's/^.*(Status)\.([[:upper:]])([[:upper:]]+)_([[:upper:]])([[:upper:]]+)$/\L\1().is\u\2\L\3\u\4\L\5()/g'
The idea is to apply GNU sed's case-conversion operators to the captured groups. We capture:
(Status) in \1, which is lowercased in full, with ().is appended to the result.
The first uppercase character after the ., here N, in \2, and the rest of the word, OT, in \3. The second group is kept as is and the third is lowercased.
The same pattern repeats for the next word, FOUND, in \4 and \5.
\L and \u are case-conversion operators available in GNU sed.
If you are looking to modify only the part beyond the . to CamelCase, then you can use sed as
result=HttpStatus.NOT_FOUND
result=$(echo "$result" |
sed -E 's/^.*\.([[:upper:]])([[:upper:]]+)_([[:upper:]])([[:upper:]]+)/\u\1\L\2\u\3\L\4/g')
echo "status().is$result()"

This might work for you (GNU sed):
<<<"$result" sed -r 's/.*(Status)\.(.*)_(.*)/\L\1().is\u\2\u\3()/'
Use pattern matching, grouping and back references. Most of the RHS is lowercase, so use the \L metacharacter to lowercase everything from Status onwards, and uppercase just the start of each word with \u, which converts only the next character to uppercase.
N.B. \L (and likewise \U) converts all following characters to lowercase (uppercase) until \E or the next \U/\L; \l and \u override this for the next character only.
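As a sketch of how these case-conversion escapes generalize, the following handles any number of underscore-separated words in one sed call instead of hard-coding two. It assumes GNU sed (for \L), and the function name to_camel is made up for illustration.

```shell
# Hypothetical helper (not from the answers above); requires GNU sed for \L.
to_camel() {
  # 1st expression: drop everything up to and including the last dot.
  # 2nd expression: for each word, keep its first letter and lowercase the rest.
  printf '%s\n' "$1" |
    sed -E -e 's/^.*\.//' -e 's/(^|_)([A-Z])([A-Z]*)/\2\L\3/g'
}

result=$(to_camel HttpStatus.INTERNAL_SERVER_ERROR)
echo "status().is${result}()"
```

With the input above this prints status().isInternalServerError(). The \L only lasts until the end of each replacement, so every word starts fresh with its own uppercase first letter.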

Since you are using GNU sed (-r switch), here's another sed solution,
just a little bit more concise, and locale safe:
$ result=HttpStatus.NOT_FOUND
$ echo "$result" | sed -r 's/^.*([A-Z][a-z]*)\.([a-zA-Z])([a-zA-Z]*)_([a-zA-Z])([a-zA-Z]*)/\L\1().is\u\2\L\3\U\4\L\5()/'
status().isNotFound()
An even more concise way of sed is:
echo "$result" | sed -r 's/^.*([A-Z][a-z]*)\.([a-zA-Z]*)_([a-zA-Z]*)/\L\1().is\u\2\u\3()/'
They both are case insensitive for the second part, for example .nOt_fOuNd also works here.
And a GNU awk solution:
echo "$result" | awk 'function cap(str){return (toupper(substr(str,1,1)) tolower(substr(str,2)))}match($0, /([A-Z][a-z]*)\.([a-zA-Z]*)_([a-zA-Z]*)/, m){print tolower(m[1]) "().is" cap(m[2]) cap(m[3]) "()"}'

You can use the sed option "-e" to concatenate multiple expressions.
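A minimal sketch of that suggestion, assuming GNU sed: the cut step and the two sed commands from the question become three -e expressions in a single sed process. The second and third expressions are the question's own; the first one, which stands in for cut, is an addition.

```shell
result=HttpStatus.NOT_FOUND

# The -e expressions run in order on each line, like a pipeline inside one sed.
result=$(printf '%s\n' "$result" |
  sed -r \
    -e 's/^[^.]*\.//' \
    -e 's/(^|_)([A-Z])/\L\2/g' \
    -e 's/([[:lower:]])|([[:upper:]])/\U\1\L\2/g')

echo "status().is${result}()"
```

This prints status().isNotFound(), the same as the original pipeline, but with one process instead of three.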

Related

How to replace "\n" string with a new line in Unix Bash script

Cannot seem to find an answer to this one online...
I have a string variable (externally sourced) with new lines "\n" encoded as strings.
I want to replace those strings with actual new line carriage returns. The code below can achieve this...
echo $EXT_DESCR | sed 's/\\n/\n/g'
But when I try to store the result in its own variable, it converts them back to strings
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
How can this be achieved, or what am I doing wrong?
Here's my code I've been testing to try get the right results
EXT_DESCR="This is a text\nWith a new line"
echo $EXT_DESCR | sed 's/\\n/\n/g'
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
echo ""
echo "$NEW_DESCR"
No need for sed, using parameter expansion:
$ foo='1\n2\n3'; echo "${foo//'\n'/$'\n'}"
1
2
3
With bash 4.4 or newer, you can use the E operator in ${parameter@operator}:
$ foo='1\n2\n3'; echo "${foo@E}"
1
2
3
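Since the question was about storing the result in a variable, here is a short sketch that assigns both forms and compares them. The variable names are illustrative only, and the @E form assumes bash 4.4 or newer.

```shell
# Requires bash; the @E form needs bash 4.4 or newer.
foo='1\n2\n3'

# Replace each literal backslash-n pair with a real newline.
replaced=${foo//'\n'/$'\n'}

# Expand every backslash escape, as $'...' quoting would.
expanded=${foo@E}

printf '%s\n' "$replaced"
[ "$replaced" = "$expanded" ] && echo "both forms agree"
```

The two differ on other input: the pattern substitution touches only \n, while @E also expands sequences such as \t, so the first is the safer choice when the string may contain other backslashes that should stay literal.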
Other answers contain alternative solutions. (I especially like the parameter expansion one.)
Here's what's wrong with your attempt:
In
echo $EXT_DESCR | sed 's/\\n/\n/g'
the sed command is in single quotes, so sed gets s/\\n/\n/g as is.
In
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
the whole command is in backticks, so a round of backslash processing is applied. That leads to sed getting the code s/\n/\n/g, which does nothing.
A possible fix for this code:
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\\\n/\\n/g'`
By doubling up the backslashes, we end up with the right command in sed.
Or (easier):
NEW_DESCR=$(echo $EXT_DESCR | sed 's/\\n/\n/g')
Instead of backticks use $( ), which has less esoteric escaping rules.
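The difference can be observed directly. The sketch below runs the same single-quoted sed script through both substitution styles; it assumes GNU sed and uses printf instead of echo so that the shell's echo cannot interpret the backslashes itself. Variable names are illustrative.

```shell
# Assumes GNU sed, which understands \n in the replacement text.
ext_descr='This is a text\nWith a new line'

# Backticks: the shell turns \\ into \ first, so sed receives s/\n/\n/g.
with_backticks=`printf '%s\n' "$ext_descr" | sed 's/\\n/\n/g'`

# $( ): the single-quoted script reaches sed untouched as s/\\n/\n/g.
with_dollar_paren=$(printf '%s\n' "$ext_descr" | sed 's/\\n/\n/g')

printf 'backticks: %s\n' "$with_backticks"
printf 'dollar-paren: %s\n' "$with_dollar_paren"
```

The first variable still holds the literal \n, while the second holds two lines.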
Note: Don't use ALL_UPPERCASE for your shell variables. UPPERCASE is (informally) reserved for system variables such as HOME and special built-in variables such as IFS or RANDOM.
Depending on what exactly you need it for:
echo -e $EXT_DESCR
might be all you need.
From echo man page:
-e
enable interpretation of backslash escapes
This printf would do the job by interpreting all escaped constructs:
printf -v NEW_DESCR "%b" "$EXT_DESCR"
-v option will store output in a variable so no need to use command substitution here.
The problem with your approach is the use of old-style backticks. You could do:
NEW_DESCR=$(echo "$EXT_DESCR" | sed 's/\\n/\n/g')
This assumes GNU sed; BSD sed does not treat \n in the replacement as a newline, so this approach won't work there.

Proper use of capture groups in SED command

I need to convert a string "1,234" =to=> 1234.
this string is just a part of a bigger line. There are thousands of such lines in the file.
I have written a sed command which is not working as I expect it to.
echo \"1,234\" | sed 's/\("\)\([0-9]+\)\(,\)\([0-9]+\)\("\)/\2\4/g'
As far as I understand, in this code,
\1 is "
\2 is the digits before comma
\3 is ,
\4 is the digits after comma
I expect this command to output 1234 which should be \2\4. But it just yields back "1,234". So I think it is not being parsed properly. Some help would be appreciated.
I would suggest you use POSIX Extended Regular Expressions (ERE), where you don't have to escape parentheses and the repetition operator. To enable ERE in sed, you can use the -E switch (or -r in GNU sed). Your expression will then look like this:
$ echo '"1,234"' | sed -E 's/"([0-9]+),([0-9]+)"/\1\2/g'
1234
For completeness, your original BRE expression will function properly if you escape the +:
echo \"1,234\" | sed 's/\("\)\([0-9]\+\)\(,\)\([0-9]\+\)\("\)/\2\4/g'
1234
Your second and fourth groups contain [0-9]+, which matches any digit followed by a plus sign.
It looks like you meant [0-9]\+, to match one or more digits.
In passing: there's no need to group the parts you'll not be using (\1, \3 and \5). You can simplify to:
echo \"1,234\" | sed 's/"\([0-9]\+\),\([0-9]\+\)"/\1\2/g'
If you're finding all those \ hard to handle, you could use Extended Regular Expression syntax, with the -E flag:
echo \"1,234\" | sed -E 's/"([0-9]+),([0-9]+)"/\1\2/g'
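To make the BRE/ERE difference concrete on a line that holds more than one such number, here is a small sketch. The sample line is invented for illustration.

```shell
line='id,"1,234",total,"56,789"'

# BRE without the backslash: + is a literal plus sign, so nothing matches.
bre_literal=$(printf '%s\n' "$line" | sed 's/"\([0-9]+\),\([0-9]+\)"/\1\2/g')

# ERE: + means one or more, and groups need no backslashes.
ere=$(printf '%s\n' "$line" | sed -E 's/"([0-9]+),([0-9]+)"/\1\2/g')

printf '%s\n' "$bre_literal" "$ere"
```

The first output line is the input unchanged; the second is id,1234,total,56789. Note that this pattern covers a single comma per quoted number only, so a value like "1,234,567" would be left as it is.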

Adding double quotes to beginning, end and around comma's in bash variable

I have a shell script that accepts a parameter that is comma delimited,
-s 1234,1244,1567
That is passed to a curl PUT json field. Json needs the values in a "1234","1244","1567" format.
Currently, I am passing the parameter with the quotes already in it:
-s "\"1234\",\"1244\",\"1567\"", which works, but the users are complaining that it's too much typing and hard to do. So I'd like to just take a comma delimited list like I had at the top and programmatically stick the quotes in.
Basically, I want a parameter to be passed in as 1234,2345 and end up as a variable that is "1234","2345"
I've come to read that easiest approach here is to use sed, but I'm really not familiar with it and all of my efforts are failing.
You can do this in BASH:
$> arg='1234,1244,1567'
$> echo "\"${arg//,/\",\"}\""
"1234","1244","1567"
awk to the rescue!
$ awk -F, -v OFS='","' -v q='"' '{$1=$1; print q $0 q}' <<< "1234,1244,1567"
"1234","1244","1567"
or shorter with sed
$ sed -r 's/[^,]+/"&"/g' <<< "1234,1244,1567"
"1234","1244","1567"
translating this back to awk
$ awk '{print gensub(/([^,]+)/,"\"\\1\"","g")}' <<< "1234,1244,1567"
"1234","1244","1567"
you can use this:
QV=$(echo 1234,2345,56788 | sed -e 's/^/"/' -e 's/$/"/' -e 's/,/","/g')
result:
echo $QV
"1234","2345","56788"
just add double quotes at start, end, and replace commas with quote/comma/quote globally.
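The same three steps (a quote at the start, a quote at the end, every comma replaced by quote-comma-quote) can also be sketched without an external process, using bash pattern substitution. The function name quote_csv is hypothetical, and keeping the separator in a variable is only a way to avoid escaping quotes inside the expansion.

```shell
# Hypothetical helper; uses bash pattern substitution, no external process.
quote_csv() {
  local list=$1
  local sep='","'
  # Replace every comma with "," and wrap the whole string in quotes.
  printf '"%s"\n' "${list//,/$sep}"
}

ids=$(quote_csv 1234,1244,1567)
echo "$ids"
```

This prints "1234","1244","1567". An empty argument yields a bare pair of quotes, which may or may not be what the JSON field expects, so the script would need to check for that case separately.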
easy to do with sed
$ echo '1234,1244,1567' | sed 's/[0-9]*/"\0"/g'
"1234","1244","1567"
[0-9]* zero or more consecutive digits; since * is greedy, it will try to match as many as possible
"\0" double quote the matched pattern, entire match is by default saved in \0
g global flag, to replace all such patterns
In case, \0 isn't recognized in some sed versions, use & instead:
$ echo '1234,1244,1567' | sed 's/[0-9]*/"&"/g'
"1234","1244","1567"
Similar solution with perl
$ echo '1234,1244,1567' | perl -pe 's/\d+/"$&"/g'
"1234","1244","1567"
Note: Using * instead of + with perl will give
$ echo '1234,1244,1567' | perl -pe 's/\d*/"$&"/g'
"1234""","1244""","1567"""
""$
I think this difference between sed and perl is similar to this question: GNU sed, ^ and $ with | when first/last character matches
Using sed:
$ echo 1234,1244,1567 | sed 's/\([0-9]\+\)/\"\1\"/g'
"1234","1244","1567"
ie. replace all strings of numbers with the same strings of numbers quoted using backreferencing (\1).

How to toggle cases of characters in a string with a one-liner?

"I am Groot" should be changed to "i AM gROOT" using sed one-liner.
I've tried...
sed -e 's/(.*)/\L\1/' -e 's/(.*)/\U\1/'
..., but both expressions don't seem to run in parallel.
Any suggestions?
Using sed:
$ echo "I am Groot" | sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/'
i AM gROOT
tr is a little more compact (but not unicode-safe):
$ echo "I am Groot" | tr '[:upper:][:lower:]' '[:lower:][:upper:]'
i AM gROOT
Another sed solution (requires GNU sed)
The following toggles the case with help from a character that we think will never be in a sed input line. One possibility would be to choose \x00 for that character because it can never be part of a bash variable. Another is to choose \n because it is never part of a sed input line. For the following, \n was chosen.
All lower case characters in the input are tagged by putting a \n in front of them. Then, any upper-case character is converted to lower case. Finally, any character with a \n in front of it is converted to upper case:
$ echo "I am Groot" | sed -r 's/[[:lower:]]/\n&/g; s/[[:upper:]]/\L&/g; s/\n(.)/\U\1/g'
i AM gROOT
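The three steps can be laid out one per line with comments, which makes the tagging trick easier to follow. This is the same script as above wrapped in a function whose name, toggle_case, is made up here; it still assumes GNU sed.

```shell
# Hypothetical wrapper; requires GNU sed for \n, \L and \U in this role.
toggle_case() {
  printf '%s\n' "$1" | sed -E '
    # Step 1: tag every lowercase letter by putting a newline in front of it.
    s/[[:lower:]]/\n&/g
    # Step 2: lowercase every letter that is still uppercase.
    s/[[:upper:]]/\L&/g
    # Step 3: uppercase each tagged letter and drop the tag.
    s/\n(.)/\U\1/g
  '
}

toggle_case "I am Groot"
```

The order matters: the lowercase letters have to be tagged before step 2 runs, otherwise the letters it produces could not be told apart from the ones that were lowercase to begin with.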
This might work for you (GNU sed):
sed "s/.*/echo '&'|tr '[:upper:][:lower:]' '[:lower:][:upper:]'/e" file
Used GNU sed's evaluation command but really why not just use tr?
N.B. Can be used in conjunction with the -i option to reverse case a file in place.

How to ensure I have exactly 2 spaces before string and zero spaces after

I get a string that can have from zero to multiple leading and trailing spaces.
I'm trying to get rid of them without lot of hackery but my code looks huge.
How to do this in a clean way?
as easy as:
$ src=" some text "
$ dst="  $(echo $src)"
$ echo ":$dst:"
:  some text:
$(echo $src) will get rid of all surrounding spaces.
Then you simply add as many spaces as you need before it.
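One caveat worth checking before relying on this trick: the unquoted $src goes through word splitting, so runs of spaces inside the string collapse to a single space as well, and glob characters such as * would be expanded. A small sketch, assuming the default IFS:

```shell
src='   some    text   '

# Unquoted on purpose: word splitting drops leading/trailing spaces
# and also squeezes the run of spaces between the two words.
dst="  $(echo $src)"

printf ':%s:\n' "$dst"
```

This prints :  some text: with one space between the words, so the approach suits strings whose inner spacing does not need to be preserved.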
How are you calling out the string? If it's an echo you can just put
Echo "<2 spaces>". "string";
if it's a normal string you just put 2 spaces between the first quote and the string.
"<2spaces> string here"
One way using GNU sed:
sed 's/^[ \t]*/  /; s/[ \t]*$//' file.txt
You can apply this to a bash variable like this:
echo "$string" | sed 's/^[ \t]*/  /; s/[ \t]*$//'
And save it like this:
variable=$(echo "$string" | sed 's/^[ \t]*/  /; s/[ \t]*$//')
Explanation:
The first substitution will remove all leading whitespace and replace it with two spaces.
The second substitution will simply remove all trailing whitespace from a line.
The simplest is probably to use an external process.
value=$(echo "$value" | sed 's/^ *\(.*[^ ]\) *$/  \1/')
If you need to transform an empty string into two spaces, you'll need to modify the regex; and if you're not on Linux, your sed dialect may differ slightly. For maximum portability, switch to awk or Perl, or do it all in Bash. That gets a bit more complex, but for a start, trailing=${value##*[! ]} contains any trailing spaces, and you can trim them off with ${value%$trailing}, and similarly for leading spaces. See the section on variable substitution in the Bash manual for details.
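The all-in-Bash route described above could look roughly like this. It is a sketch under the assumption that only spaces (not tabs) need trimming; the function name normalize_spaces is hypothetical.

```shell
# Hypothetical helper: trim spaces with parameter expansion only,
# then print the text with exactly two leading spaces.
normalize_spaces() {
  local value=$1 leading trailing
  # Everything after the last non-space character is the trailing run.
  trailing=${value##*[! ]}
  value=${value%"$trailing"}
  # Everything before the first non-space character is the leading run.
  leading=${value%%[! ]*}
  value=${value#"$leading"}
  printf '  %s\n' "$value"
}

printf ':%s:\n' "$(normalize_spaces '   some text  ')"
```

This prints :  some text: and, unlike the sed version, also turns an empty or all-space string into exactly two spaces, because the expansions simply strip everything in that case.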
You can use a regular expression to match everything between the leading and trailing spaces. The matched text is found in the BASH_REMATCH array (the text matching the first parentheses group is in element 1).
spcs='\ *'
text='.*[^ ]'
[[ $src =~ ^$spcs($text)$spcs$ ]]
dst="  ${BASH_REMATCH[1]}"

Resources