How do I cut everything on each line starting with a multi-character string in bash?

How do I cut everything starting with a given multi-character string using a common shell command?
e.g., given:
foo+=bar
I want:
foo
i.e., cut everything starting with +=
cut doesn't work because it only takes a single-character delimiter, not a multi-character string:
$ echo 'foo+=bar' | cut -d '+=' -f 1
cut: bad delimiter
If I can't use cut, I would consider using perl instead, or if there's another shell command that is more commonly installed.

cut only allows a single-character delimiter.
You may use bash string manipulation:
s='foo+=bar'
echo "${s%%+=*}"
foo
or use more powerful awk:
awk -F '\\+=' '{print $1}' <<< "$s"
foo
'\\+=' is a regex that matches a literal + followed by the = character; the + must be escaped because awk treats the field separator as a regular expression.
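For instance, both forms should behave the same, even when the string contains more than one += (a quick check; the bracket expression is an equivalent way to escape the +):
awk -F '\\+=' '{print $1}' <<< 'foo+=bar+=baz'   # foo
awk -F '[+]=' '{print $1}' <<< 'foo+=bar+=baz'   # foo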

You can use the sed command to do this:
string='foo+=bar'
echo "${string}" | sed 's/+=.*//g'
foo
or, if you're using the Bash shell, use the parameter expansion below (recommended), since it avoids creating a pipeline and spawning another sed process, and is therefore more efficient:
echo ${string%%\+\=*}
or
echo ${string%%[+][=]*}
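Since the question asks about doing this on each line, here is a minimal per-line sketch built on the same expansion (input.txt is a hypothetical file name):
while IFS= read -r line; do
    printf '%s\n' "${line%%+=*}"
done < input.txt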

Related

Why are results different when passing an argument to a function from piping to it as a process?

I found this thread with two solutions for trimming whitespace: piping to xargs and defining a trim() function:
trim() {
    local var="$*"
    # remove leading whitespace characters
    var="${var#"${var%%[![:space:]]*}"}"
    # remove trailing whitespace characters
    var="${var%"${var##*[![:space:]]}"}"
    echo -n "$var"
}
I prefer the second because of one comment:
This is overwhelmingly the ideal solution. Forking one or more external processes merely to trim whitespace from a single string is fundamentally insane – particularly when most shells (including bash) already provide native string munging facilities out-of-the-box.
For example, I am getting the Wi-Fi SSID on macOS by piping to awk (once I get comfortable with regular expressions in bash, I won't fork an awk process), and the result includes a leading space:
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}'
<some-ssid>
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}' | xargs
<some-ssid>
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}' | trim
$ wifi=$(/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}')
$ trim "$wifi"
<some-ssid>
Why does piping to the trim function fail and giving it an argument work?
It is because your trim() function expects a positional argument list to process; $* is the argument list passed to your function. In the case you report as not working, the function's standard input is connected to the read end of a pipe, and nothing in the function reads from that file descriptor.
In such a case you need to read from standard input using the read command instead of relying on the argument list, i.e. as
trim() {
    # convert the input received over the pipe to a single string
    IFS= read -r var
    # remove leading whitespace characters
    var="${var#"${var%%[![:space:]]*}"}"
    # remove trailing whitespace characters
    var="${var%"${var##*[![:space:]]}"}"
    echo -n "$var"
}
for which you can now do
$ echo " abc " | trim
abc
or use command substitution to run the command that fetches the string and pass its output to trim() with your original definition:
trim "$(/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}')"
In this case, the shell expands the $(..) by running the command inside and replacing it with the command's output. The function then sees trim <args>, interprets the output as its positional arguments, and runs the string manipulation directly on them.
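If you want a single function that works both ways, here is a sketch (not from the answers above) that falls back to reading one line from standard input when no arguments are given:
trim() {
    local var
    if (( $# )); then
        # arguments were passed: trim those
        var="$*"
    else
        # no arguments: assume a single line of piped input
        IFS= read -r var
    fi
    # remove leading whitespace characters
    var="${var#"${var%%[![:space:]]*}"}"
    # remove trailing whitespace characters
    var="${var%"${var##*[![:space:]]}"}"
    printf '%s' "$var"
}
echo "  abc  " | trim    # abc
trim "  abc  "           # abc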

Convert text from HttpStatus.NOT_FOUND into status().isNotFound() in bash

I want to convert the text in a bash variable, e.g. HttpStatus.NOT_FOUND, into status().isNotFound(). I accomplished this by using sed:
result=HttpStatus.NOT_FOUND
result=$(echo $result | cut -d'.' -f2- | sed -r 's/(^|_)([A-Z])/\L\2/g' | sed -E 's/([[:lower:]])|([[:upper:]])/\U\1\L\2/g')
echo "status().is$result()"
Output:
status().isNotFound()
As you can see here I'm using 2 sed commands.
Is there a way to achieve the same result using 1 sed or any other simpler way?
Since the replacement part involves inserting a lot of new text, the sed command can be written out in detail as below. Just pass the variable content over a pipe, without using cut:
result=HttpStatus.NOT_FOUND
echo "$result" |
sed -E 's/^.*(Status)\.([[:upper:]])([[:upper:]]+)_([[:upper:]])([[:upper:]]+)$/\L\1().is\u\2\L\3\u\4\L\5()/g'
The idea is to apply GNU sed's case-conversion functions to the captured groups. So we capture:
(Status) in \1, which we lowercase entirely and then follow with ().is in the replacement
The next captured group, \2, is the first uppercase character following the . (here N), and the rest of the word, OT, goes into \3. We keep the second group as is and lowercase the third.
The same sequence as above is repeated for the next word FOUND in \4 and \5.
\L and \u are case-conversion operators available in GNU sed.
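A quick illustration of those escapes (all of them are GNU extensions, so this assumes GNU sed):
echo 'Hello World' | sed -E 's/.*/\U&/'    # HELLO WORLD  (\U uppercases everything that follows)
echo 'Hello World' | sed -E 's/.*/\L&/'    # hello world  (\L lowercases everything that follows)
echo 'hello world' | sed -E 's/\w+/\u&/g'  # Hello World  (\u uppercases only the next character)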
If you are looking to modify only the part beyond the . to CamelCase, then you can use sed as
result=HttpStatus.NOT_FOUND
result=$(echo "$result" |
sed -E 's/^.*\.([[:upper:]])([[:upper:]]+)_([[:upper:]])([[:upper:]]+)/\u\1\L\2\u\3\L\4/g')
echo "status().is$result()"
This might work for you (GNU sed):
<<<"$result" sed -r 's/.*(Status)\.(.*)_(.*)/\L\1().is\u\2\u\3()/'
Use pattern matching, grouping, and back references. The majority of the RHS is lowercase, so use the \L metacharacter to convert everything from Status onwards to lowercase, and uppercase just the start of each word using \u, which converts only the next character to uppercase.
N.B. \L and likewise \U convert all following characters to lowercase/uppercase until a \E, \U or \L is encountered; \l and \u only affect the next character.
Since you are using GNU sed (-r switch), here's another sed solution,
just a little bit more concise, and locale safe:
$ result=HttpStatus.NOT_FOUND
$ echo "$result" | sed -r 's/^.*([A-Z][a-z]*)\.([a-zA-Z])([a-zA-Z]*)_([a-zA-Z])([a-zA-Z]*)/\L\1().is\u\2\L\3\U\4\L\5()/'
status().isNotFound()
An even more concise sed version is:
echo "$result" | sed -r 's/^.*([A-Z][a-z]*)\.([a-zA-Z]*)_([a-zA-Z]*)/\L\1().is\u\2\u\3()/'
Both are case-insensitive for the second part; for example, .nOt_fOuNd also works here.
And a GNU awk solution:
echo "$result" | awk 'function cap(str){return (toupper(substr(str,1,1)) tolower(substr(str,2)))}match($0, /([A-Z][a-z]*)\.([a-zA-Z]*)_([a-zA-Z]*)/, m){print tolower(m[1]) ".is" cap(m[2]) cap(m[3]) "()"}'
You can use the sed option -e to concatenate multiple expressions.
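For completeness, a sed-free sketch (not one of the answers above) using bash 4+ case-modification expansions; it assumes the input always has the HttpStatus.WORD_WORD shape:
result=HttpStatus.NOT_FOUND
suffix=${result#*.}                        # NOT_FOUND
IFS=_ read -r first second <<< "$suffix"   # NOT and FOUND
first=${first,,} second=${second,,}        # lowercase both words
echo "status().is${first^}${second^}()"    # status().isNotFound()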

How to replace "\n" string with a new line in Unix Bash script

Cannot seem to find an answer to this one online...
I have a string variable (externally sourced) with new lines "\n" encoded as strings.
I want to replace those strings with actual newlines. The code below can achieve this...
echo $EXT_DESCR | sed 's/\\n/\n/g'
But when I try to store the result of this in its own variable, it converts them back to literal strings
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
How can this be achieved, or what am I doing wrong?
Here's my code I've been testing to try get the right results
EXT_DESCR="This is a text\nWith a new line"
echo $EXT_DESCR | sed 's/\\n/\n/g'
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
echo ""
echo "$NEW_DESCR"
No need for sed, using parameter expansion:
$ foo='1\n2\n3'; echo "${foo//'\n'/$'\n'}"
1
2
3
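Applied to the variable from the question, the same expansion gives:
EXT_DESCR='This is a text\nWith a new line'
printf '%s\n' "${EXT_DESCR//'\n'/$'\n'}"
# This is a text
# With a new line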
With bash 4.4 or newer, you can use the E operator in ${parameter@operator}:
$ foo='1\n2\n3'; echo "${foo@E}"
1
2
3
Other answers contain alternative solutions. (I especially like the parameter expansion one.)
Here's what's wrong with your attempt:
In
echo $EXT_DESCR | sed 's/\\n/\n/g'
the sed command is in single quotes, so sed gets s/\\n/\n/g as is.
In
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\n/\n/g'`
the whole command is in backticks, so a round of backslash processing is applied. That leads to sed getting the code s/\n/\n/g, which does nothing.
A possible fix for this code:
NEW_DESCR=`echo $EXT_DESCR | sed 's/\\\\n/\\n/g'`
By doubling up the backslashes, we end up with the right command in sed.
Or (easier):
NEW_DESCR=$(echo $EXT_DESCR | sed 's/\\n/\n/g')
Instead of backticks use $( ), which has less esoteric escaping rules.
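A small demonstration of that extra backslash pass, independent of sed:
printf '%s\n' `printf '%s' '\\n'`    # \n   (backticks collapse \\ to \ before the inner command runs)
printf '%s\n' $(printf '%s' '\\n')   # \\n  ($( ) passes the inner command through unchanged)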
Note: Don't use ALL_UPPERCASE for your shell variables. UPPERCASE is (informally) reserved for system variables such as HOME and special built-in variables such as IFS or RANDOM.
Depending on what exactly you need it for:
echo -e $EXT_DESCR
might be all you need.
From echo man page:
-e
enable interpretation of backslash escapes
This printf would do the job by interpreting all escaped constructs:
printf -v NEW_DESCR "%b" "$EXT_DESCR"
-v option will store output in a variable so no need to use command substitution here.
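As the %b format interprets all backslash escapes, the same approach also handles things like \t (a quick check):
printf -v demo '%b' 'col1\tcol2\nrow1\trow2'   # \t and \n are both expanded
printf '%s\n' "$demo"
# col1    col2
# row1    row2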
The problem with your approach is the use of old-style backticks. You could do:
NEW_DESCR=$(echo "$EXT_DESCR" | sed 's/\\n/\n/g')
Assuming you're using GNU sed, as BSD sed won't work with this approach.

Adding double quotes to beginning, end and around comma's in bash variable

I have a shell script that accepts a parameter that is comma delimited,
-s 1234,1244,1567
That is passed to a JSON field in a curl PUT. The JSON needs the values in a "1234","1244","1567" format.
Currently, I am passing the parameter with the quotes already in it:
-s "\"1234\",\"1244\",\"1567\"", which works, but the users are complaining that its too much typing and hard to do. So I'd like to just take a comma delimited list like I had at the top and programmatically stick the quotes in.
Basically, I want a parameter to be passed in as 1234,2345 and end up as a variable that is "1234","2345"
I've read that the easiest approach here is to use sed, but I'm really not familiar with it and all of my efforts have failed.
You can do this in BASH:
$> arg='1234,1244,1567'
$> echo "\"${arg//,/\",\"}\""
"1234","1244","1567"
awk to the rescue!
$ awk -F, -v OFS='","' -v q='"' '{$1=$1; print q $0 q}' <<< "1234,1244,1567"
"1234","1244","1567"
or shorter with sed
$ sed -r 's/[^,]+/"&"/g' <<< "1234,1244,1567"
"1234","1244","1567"
translating this back to awk
$ awk '{print gensub(/([^,]+)/,"\"\\1\"","g")}' <<< "1234,1244,1567"
"1234","1244","1567"
You can use this:
echo QV=$(echo 1234,2345,56788 | sed -e 's/^/"/' -e 's/$/"/' -e 's/,/","/g')
result:
echo $QV
"1234","2345","56788"
Just add double quotes at the start and end, and replace commas with quote/comma/quote globally.
Easy to do with sed:
$ echo '1234,1244,1567' | sed 's/[0-9]*/"\0"/g'
"1234","1244","1567"
[0-9]* matches zero or more consecutive digits; since * is greedy, it will try to match as many as possible
"\0" wraps the matched text in double quotes; the entire match is by default available as \0
g is the global flag, to replace all such matches
In case \0 isn't recognized in your sed version, use & instead:
$ echo '1234,1244,1567' | sed 's/[0-9]*/"&"/g'
"1234","1244","1567"
Similar solution with perl
$ echo '1234,1244,1567' | perl -pe 's/\d+/"$&"/g'
"1234","1244","1567"
Note: Using * instead of + with perl will give
$ echo '1234,1244,1567' | perl -pe 's/\d*/"$&"/g'
"1234""","1244""","1567"""
""$
I think this difference between sed and perl is similar to this question: GNU sed, ^ and $ with | when first/last character matches
Using sed:
$ echo 1234,1244,1567 | sed 's/\([0-9]\+\)/\"\1\"/g'
"1234","1244","1567"
i.e. replace all strings of numbers with the same strings of numbers quoted, using a backreference (\1).

Grep last match of returned multi line result and assign to variable

Let's say that I have a command list kittens that returns something in this multi-line format in my terminal (in this exact layout):
[ 'fluffy'
'buster'
'bob1' ]
How can I fetch bob1 and assign it to a variable for scripting use? Here's my non-working attempt so far.
list kittens | grep "'([^']+)' \]"
I am not overly familiar with grepping on the CLI and am running into syntax issues with quotes and such.
If you know that bob1 will be in the last line, you can capture it like this:
myvar="$(list kittens | tail -n1 | grep -oP "'\K[^']+(?=')")"
This uses tail to find the last line, then grep with \K and a lookahead in the regular expression to extract the part between the single quotes.
Edit: The above assumes that you are using GNU grep (for the -P mode). Here's an alternative with sed:
myvar="$(list kittens | tail -n1 | sed -e "s/^[^']*'//; s/'[^']*$//")"
Could be done by awk alone:
list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
Example:
echo "$kit"
[ 'fluffy'
'buster'
'bob1' ]
echo "$kit" |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
To assign it to a variable:
var=$(list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}')
Explanation:
END{}: the END block is used to take data from the last line, as we are interested only in the last line.
gsub: This is awk's built-in function for search-and-replace tasks. Here single quotes, whitespace and the closing bracket are removed. Note that \047 is the octal escape used to match a single quote.
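For comparison, a bash-only sketch (not from the answers) that reuses the $kit example above and assumes the wanted value sits between the last line's single quotes:
last=''
while IFS= read -r line; do last=$line; done <<< "$kit"   # keep only the last line
last=${last#*\'}    # drop everything up to and including the first '
last=${last%\'*}    # drop everything from the last ' to the end
echo "$last"        # bob1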
