extract hostname with parameter expansion in a single assignment - shell

I am trying to get the hostname for a url. I am able to do it with 2 assignments but I want to do it in a single step.
#!/usr/bin/env sh
HELM_CHART=oci://foo.bar.io/ns/chart-name
host=${HELM_CHART#*//}
host=${host%%/*}
echo "$host"
# foo.bar.io
I am not able to figure out a pattern or combination of patterns to achieve this. I read that you can combine patterns, as shown in "Remove a fixed prefix/suffix from a string in Bash". I tried all sorts of combinations, but I can't get it working.
Is it possible to do in a single assignment?

You can't do this with parameter expansion alone. You can with a regular expression match, though. In bash,
[[ $HELM_CHART =~ ://([^/]*)/ ]] && host=${BASH_REMATCH[1]}
In POSIX shell,
host=$(expr "$HELM_CHART" : '.*://\([^/]*\)/')
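If the goal is a single assignment at the call site rather than a single expansion, one option is to wrap the two expansions from the question in a small function. This is only a sketch: url_host is a hypothetical helper name, and it assumes the URL contains "//" before the host. Unlike the expr pattern above, it does not require a slash after the host.

```shell
# url_host is a hypothetical helper name, not part of any standard tool.
url_host() {
    _rest=${1#*//}                 # drop everything up to and including the first "//"
    printf '%s\n' "${_rest%%/*}"   # drop the first "/" and everything after it
}

HELM_CHART=oci://foo.bar.io/ns/chart-name
host=$(url_host "$HELM_CHART")
echo "$host"
# foo.bar.io
```

The function uses only POSIX parameter expansion, so it should behave the same in sh and bash.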

Related

Using "expanding characters" in a variable in a bash script

I apologize beforehand for this question, which is probably both ill formulated and answered a thousand times over. I get the feeling that my inability to find an answer is that I don't quite know how to ask the question.
I'm writing a script that traverses folders in a bunch of mounted external hard drives, like so:
for g in /Volumes/compartment-?/{Private/Daniel,Daniel}/Projects/*/*
It then proceeds to perform long-running tasks on each of the directories found there. Because these operations are io-intensive rather than cpu-intensive, I thought I'd add the option to provide which "compartment" I want to work in, so that I can parallelize the workloads.
But, doing
cmp="?"
[[ ! "$1" = "" ]] && cmp="$1"
And then,
for g in /Volumes/compartment-$cmp/{Private/Daniel,Daniel}/Projects/*/*
Doesn't work - the question mark that should expand to all compartments instead becomes literal, so I get an error that "compartment-?" doesn't exist, which is of course true.
How do I create a variable with a value that "expands," like dir="./*" working with ls $dir?
EDIT: Thanks to @dan for the answer. I was brought up to be courteous and thank people, so I did thank him for it in a comment on his question, but that comment has been removed, and I'm anxious that repeating it might be some kind of infraction here. I ended up simply escaping my question mark glob character, i.e. \?, since for this script I only need to either search all drives or one particular drive. But I'll keep the answer handy for the next time I write a script where I'd like to support more advanced arguments.
Brace expansion occurs before variable expansion. Pathname/glob expansion (eg ?, *) occurs last. Therefore you can't use the glob character ? in a variable, and in a brace expansion.
You can use a glob expression in an unquoted variable, without brace expansion. Eg. q=\?; echo compartment-$q is equivalent to echo compartment-?.
To solve your problem, you could define an array based on the input argument:
if [[ $1 ]]; then
[[ -d /Volumes/compartment-$1 ]] || exit 1
files=("/Volumes/compartment-$1"/{Private/Daniel,Daniel}/Projects/*/*)
else
files=(/Volumes/compartment-?/{Private/Daniel,Daniel}/Projects/*/*)
fi
# then iterate the list:
for i in "${files[@]}"; do
...
Another option is a nested loop. The path expression in the outer loop doesn't use brace expansion, so (unlike the first example) it can expand a glob in $1 (or default to ? if $1 is empty):
for i in /Volumes/compartment-${1:-?}; do
[[ -d $i ]] &&
for j in "$i"/{Private/Daniel,Daniel}/Projects/*/*; do
[[ -e $j ]] || continue
...
Note that the second example expands a glob expression passed in $1 (eg. ./script '[1-9]'). The first example does not.
Remember that pathname expansion has the property of expanding only to existing files, or literally. shopt -s nullglob guarantees expansion only to existing files (or nothing).
You should either use nullglob, or check that each file or directory exists, like in the examples above.
Using $1 unquoted also subjects it to word splitting on whitespace. You can set IFS= (empty) to avoid this.
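The nested-loop idea can be tried safely in a throwaway directory. The sketch below is an illustration under assumptions: the temporary directory stands in for /Volumes, the directory names under it are made up, and list_projects is a hypothetical function name. It sticks to POSIX sh, which has no brace expansion, so the two sub-paths are listed in an inner loop instead.

```shell
# Hypothetical sandbox standing in for /Volumes
root=$(mktemp -d)
mkdir -p "$root/compartment-1/Daniel/Projects/a/x" \
         "$root/compartment-2/Private/Daniel/Projects/b/y"

list_projects() {
    # $1: optional glob selecting compartments; defaults to ? (any single character)
    cmp=${1:-?}
    # $cmp is left unquoted on purpose so that glob characters in it are expanded
    for d in "$root"/compartment-$cmp; do
        [ -d "$d" ] || continue            # skip the literal pattern when nothing matched
        for sub in Private/Daniel Daniel; do
            for p in "$d/$sub"/Projects/*/*; do
                [ -e "$p" ] || continue    # same check for the inner glob
                rel=${p#"$root"/}          # print paths relative to the sandbox
                printf '%s\n' "$rel"
            done
        done
    done
}

all=$(list_projects)      # no argument: every compartment
one=$(list_projects 2)    # a single compartment
rm -rf "$root"

echo "$all"
# compartment-1/Daniel/Projects/a/x
# compartment-2/Private/Daniel/Projects/b/y
echo "$one"
# compartment-2/Private/Daniel/Projects/b/y
```

Passing a quoted pattern such as '[1-2]' as the argument selects a range of compartments, because the pattern is expanded only inside the function.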

what happens in bash 'echo ${full_path##/*}'

I found this easy filename printing on the internet, but I can't find an explanation of what ##*/ means. It doesn't look like a regex. Moreover, could it be used with the result of readlink in one line?
From Manipulating String, Advanced Bash-Scripting Guide
${string##substring}
Deletes longest match of substring from front of $string.
So in your case, the * in the substring indicates: match everything.
The command echo ${full_path##/*} will:
Print $full_path unless it starts with a forward slash (/), in that case an empty string will be shown
Example cases:
$ test_1='/foo/bar'
$ test_2='foo/bar'
$
$ echo "${test_1##/*}"
$ echo "${test_2##/*}"
foo/bar
$
Regarding your second question:
More over, could it be used with result of readlink in one line?
Please take a look at Can command substitution be nested in variable substitution?.
If you're using bash I'd recommend keeping it simple, by assigning the result of readlink to a variable, then using the regular variable substitution to get the desired output. Linking both actions could be done using the && syntax.
A one-liner could look something like:
tmp="$(readlink -f file_a)" && echo "${tmp##/*}"
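Note that the order of * and / in the pattern matters: the title uses ##/* while the question text mentions ##*/, and the two behave very differently. A short sketch, using a made-up sample path, shows the contrast:

```shell
full_path='/usr/local/bin/tool'      # hypothetical sample path

# Pattern "/*": a leading slash followed by anything.
# On an absolute path the longest match is the whole string.
stripped=${full_path##/*}

# Pattern "*/": anything followed by a slash.
# The longest match ends at the last slash, leaving the file name.
base_name=${full_path##*/}

printf '[%s]\n' "$stripped"    # []
printf '[%s]\n' "$base_name"   # [tool]
```

For printing a file name, ##*/ is therefore the form that gives a useful result on an absolute path.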

Trim a string (tailing end) based on a specific character in Bash

I was looking to try and figure out how to trim a string in Bash, from the trailing end, once I hit a certain character.
Example: if my string is this (or any link): https://www.cnpp.usda.gov/Innovations/DataSource/MyFoodapediaData.zip
(I'll set that as my variable).
(I.e. if I echo $var it will return that link:)
I'm looking to use Bash, I'm guessing I will need to utilize sed or awk, but I want to trim, starting from the end until I see the first / (since that will be the file name) and strip that out.
So using that link, I'm trying to get just what comes after the last /, so just "MyFoodapediaData.zip", and set that to a different variable.
So in the end, if I echo $var2 (if I call it that) it will just return: MyFoodapediaData.zip
I tried working with sed 's.*/" and that will start from the beginning until it finds the first slash. I was looking for the reverse order if possible.
You can use bash builtin parameter substitution for this:
$ var='https://www.cnpp.usda.gov/Innovations/DataSource/MyFoodapediaData.zip'
$ echo "$var"
https://www.cnpp.usda.gov/Innovations/DataSource/MyFoodapediaData.zip
$ var2=${var##*/}
$ echo "$var2"
MyFoodapediaData.zip
${var##*/} means "from the beginning of the value of the var variable, remove everything up to and including the last slash."
See parameter substitution in the manual
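For comparison, here is a short sketch of the four prefix/suffix removal operators applied to the same URL. The variable names other than var are made up for illustration. # and ## remove from the front, % and %% from the end; the doubled form removes the longest match instead of the shortest.

```shell
var='https://www.cnpp.usda.gov/Innovations/DataSource/MyFoodapediaData.zip'

file_name=${var##*/}   # longest prefix matching "*/" removed
dir_part=${var%/*}     # shortest suffix matching "/*" removed
after_first=${var#*/}  # shortest prefix matching "*/" removed
scheme=${var%%/*}      # longest suffix matching "/*" removed

echo "$file_name"      # MyFoodapediaData.zip
echo "$dir_part"       # https://www.cnpp.usda.gov/Innovations/DataSource
echo "$after_first"    # /www.cnpp.usda.gov/Innovations/DataSource/MyFoodapediaData.zip
echo "$scheme"         # https:
```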

How to grep from a single line

I'm using a weather API that outputs all data in a single line. How do I use grep to get the values for "summary" and "apparentTemperature"? My command of regular expressions is basically nonexistent, but I'm ready to learn.
{"latitude":59.433335,"longitude":24.750486,"timezone":"Europe/Tallinn","offset":2,"currently":{"time":1485880052,"summary":"Clear","icon":"clear-night","precipIntensity":0,"precipProbability":0,"temperature":0.76,"apparentTemperature":-3.34,"dewPoint":-0.13,"humidity":0.94,"windSpeed":3.99,"windBearing":262,"visibility":9.99,"cloudCover":0.11,"pressure":1017.72,"ozone":282.98}}
Thank you!
How do I use grep to get the values for "summary" and "apparentTemperature"?
You use grep's -o flag, which makes it output only the matched part.
Since you don't know much about regex, I suggest you instead learn to use a JSON parser, which would be more appropriate for this task.
For example with jq, the following command would extract the current summary :
<whatever is your JSON source> | jq '.currently.summary'
Assume your single-line data is contained in a variable called DATA_LINE.
If you are certain the field is only present once in the whole line, you could do something like this in Bash:
if
[[ "$DATA_LINE" =~ \"summary\":\"([^\"]*)\" ]]
then
summary="${BASH_REMATCH[1]}"
echo "Summary field is : $summary"
else
echo "Summary field not found"
fi
You would have to do that once for each field, unless you build a more complex matching expression that assumes fields are in a specific order.
As a note, the matching expression \"summary\":\"([^\"]*)\" finds the first occurrence in the data of a substring consisting of :
"summary":" (double quotes included), followed by
([^\"]*) a sub-expression formed of a sequence of zero or more characters other than a double quote : this is in parentheses to make it available later as an element in the BASH_REMATCH array, because this is the value you want to extract
and finally a final quote ; this is not absolutely necessary, but protects from reading from a truncated data line.
For apparentTemperature the code will be a bit different because the field does not have the same format.
if
[[ "$DATA_LINE" =~ \"apparentTemperature\":([^,]*), ]]
then
apparentTemperature="${BASH_REMATCH[1]}"
echo "Apparent temperature field is : $apparentTemperature"
else
echo "Apparent temperature field not found"
fi
This is fairly easily understood if your skills are limited - like mine! Assuming your string is in a variable called $LINE:
summary=$(sed -e 's/.*summary":"//' -e 's/".*//' <<< "$LINE")
Then check:
echo "$summary"
Clear
That executes (-e) 2 sed commands. The first one substitutes everything up to and including summary":" with nothing and the second substitutes the first remaining double quote and everything that follows with nothing.
Extract apparent temperature:
appTemp=$(sed -e 's/.*apparentTemperature"://' -e 's/,.*//' <<< "$LINE")
Then check:
echo "$appTemp"
-3.34
As Aaron mentioned a json parser like jq is the right tool for this, but since the question was about grep, let's see one way to do it.
Assuming your API return value is in $json:
json='{"latitude":59.433335,"longitude":24.750486,"timezone":"Europe/Tallinn","offset":2,"currently":{"time":1485880052,"summary":"Clear","icon":"clear-night","precipIntensity":0,"precipProbability":0,"temperature":0.76,"apparentTemperature":-3.34,"dewPoint":-0.13,"humidity":0.94,"windSpeed":3.99,"windBearing":262,"visibility":9.99,"cloudCover":0.11,"pressure":1017.72,"ozone":282.98}}'
The patterns in parentheses are lookbehind and lookahead assertions for context matching. They require grep's -P (Perl-compatible regex) option, and the text they match is not included in the output.
summary=$(<<< "$json" grep -oP '(?<="summary":").*?(?=",)')
apparentTemperature=$(<<< "$json" grep -oP '(?<="apparentTemperature":).*?(?=,)')
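The -P option is a GNU grep feature and is not available in every grep. Where it is missing, a similar result can be had with plain -o plus cut. This is a minimal sketch, assuming a grep that supports -o and using a shortened, hypothetical sample of the response; like the other text-based approaches, it assumes each field appears only once.

```shell
# Shortened sample of the API response (assumed to be stored in $json)
json='{"currently":{"time":1485880052,"summary":"Clear","temperature":0.76,"apparentTemperature":-3.34,"humidity":0.94}}'

# -o prints only the matched part, here: "summary":"Clear"
# cut then takes the 4th field when splitting on double quotes.
summary=$(printf '%s\n' "$json" | grep -o '"summary":"[^"]*"' | cut -d '"' -f 4)

# Numbers are unquoted, so match up to the next comma or closing brace,
# then keep what follows the colon.
apparentTemperature=$(printf '%s\n' "$json" | grep -o '"apparentTemperature":[^,}]*' | cut -d ':' -f 2)

echo "$summary"               # Clear
echo "$apparentTemperature"   # -3.34
```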

What is the meaning of @(${VAR})?

Whilst looking at some shell scripts I encountered several instances of if statements comparing some normal variable against another variable which is enclosed in @( ) brackets.
Does @(....) have some special meaning or am I missing something obvious here? Example of if test:
if [[ ${VAR} != @(${VAR2}) ]]
Thanks
It's an extended pattern, borrowed from ksh. Originally you would need to enable support for it with shopt -s extglob, but it became the default behavior inside [[ ... ]] in bash 4.1. @(...) matches one of the enclosed patterns. By itself, @(pattern) and pattern would be equivalent, so I would assume that the contents of $VAR2 contains at least one pipe, so that the expansion is something like @(foo|bar). In that case, the test would succeed if $VAR matches neither foo nor bar.
From the bash man page:
@(pattern-list)
Matches one of the given patterns
So ${VAR2} is expected to be a list of patterns separated by |, and your code tests whether ${VAR} matches any of them.
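A small sketch of how such a test behaves, assuming bash and a made-up pattern list in VAR2; is_allowed is a hypothetical function name:

```shell
#!/usr/bin/env bash
shopt -s extglob   # only needed before bash 4.1; harmless afterwards

VAR2='foo|bar'     # hypothetical pattern list, alternatives separated by |

is_allowed() {
    # Succeeds when $1 matches one of the alternatives in $VAR2
    [[ $1 == @(${VAR2}) ]]
}

for VAR in foo bar baz; do
    if is_allowed "$VAR"; then
        echo "$VAR: matches"
    else
        echo "$VAR: no match"
    fi
done
# foo: matches
# bar: matches
# baz: no match
```

The != form from the question is simply the negation of this test.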
