How to use parameter expansion correctly in bash [duplicate] - bash

I have a string with the structure task_name-student_name and I want to split it into two variables:
task: containing the chunk before the -
student: containing the chunk after the -
I can get this to work with sed:
string="task_name-student_name"
student=$(echo "$string" | sed "s/.*-//")
task=$(echo "$string" | sed "s/-[^-]*$//")
However, VS Code suggests "See if you can use $(variable//search/replace) instead".
So I have two questions:
Why would $(variable//search/replace) be better
How do I get the parameter expansion to work without it being interpreted as a command?
When I try
echo $("$string"//-[^-]*$//)
or
echo $(echo $("$string"//-[^-]*$//))
I get this output:
bash: task_name-student_name//-[^-]*$//: No such file or directory
Thanks in advance!

First: for variable expansion, you want curly braces instead of parentheses. $(something) will execute something as a command; ${something} will expand something as a variable. And just for completeness, $((something)) will evaluate something as an arithmetic expression (integers only, no floating point).
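To make the three forms concrete, here is a minimal sketch; the variable names are made up for illustration and are not from the question.

```shell
name="world"

# ${...}: parameter expansion, substitutes the variable's value.
greeting="hello ${name}"

# $(...): command substitution, runs a command and captures its output.
upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')

# $((...)): arithmetic expansion, integers only (7 / 2 truncates to 3).
half=$((7 / 2))

printf '%s\n' "$greeting" "$upper" "$half"
```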
As for replacing the sed with a variable expansion: I wouldn't use ${variable//search/replace} for this; there are more appropriate modifications. ${variable#pattern} removes the shortest possible match of pattern from the beginning of the variable's value, so use that with the pattern *- to remove everything through the first "-":
student=${string#*-}
Similarly, ${variable%pattern} will remove from the end of the variable's value, so you can use this with the pattern -* to remove from the dash to the end:
task=${string%-*}
Note that the patterns used here are "glob" expressions (like filename wildcards), not regular expressions like sed uses; they're just different enough to be confusing. Also, the way I've written these assumes there's exactly one "-" in the string; if there's a possibility some student will have a hyphenated name or something like that, you may need to modify them.
There are lots more modifications you can do in a parameter expansion; see the bash hacker's wiki on the subject. Some of these modifications will work in other shells besides bash; the # and % modifiers (and the "greedy" versions, ## and %%) will work in any shell that conforms to the POSIX standard.
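As a rough illustration of how the short and greedy modifiers differ once a second "-" shows up, here is a small sketch; the hyphenated value is a made-up example, and which split is right depends on whether task names or student names may contain dashes.

```shell
string="task_name-student_name"
student=${string#*-}        # shortest prefix match removed
task=${string%-*}           # shortest suffix match removed

# Hypothetical input with a hyphenated student name.
hyphenated="task_name-anne-marie"
student_short=${hyphenated#*-}    # removes "task_name-"
student_greedy=${hyphenated##*-}  # removes through the last "-"
task_short=${hyphenated%-*}       # removes "-marie"
task_greedy=${hyphenated%%-*}     # removes from the first "-" onward

printf '%s\n' "$student" "$task" \
    "$student_short" "$student_greedy" "$task_short" "$task_greedy"
```

If only the student name can contain dashes, pairing # with %% keeps the split at the first "-".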

Related

Bash use variable in 'for' keyword [duplicate]

How do I get this to work correctly?
The line below works, but I need to pass the pattern to 'for' as an argument.
for f in ~/files/*/*.txt; do
Code:
list ()
{
for f in $1; do
echo $f
done
}
list "~/files/*/*.txt"
list "~/files/*.txt"
Output:
~/files/*/*.txt
~/files/*.txt
The problem lies in the order of expansions in bash.
There is a certain logic in what you did.
It would work with list "./dir/*/*.txt", for example.
You enclosed the argument to list in double quotes, so that pattern expansion is not done when calling list.
So $1 in the list function is literally ./dir/*/*.txt, as you wanted it to be.
Then you leave $1 unquoted in the for statement, so that it is expanded into a list of file names.
So it does what you want.
Or almost.
The problem is that the order of expansion is:
~ first, then parameters, then patterns (plus other, irrelevant steps in between).
So, by the time $1 is replaced by its value, it is too late to expand ~.
Suppose you had done this instead (not a suggestion at all, just an explanation):
list ()
{
for f in ~/$1; do
echo $f
done
}
list "files/*/*.txt"
list "files/*.txt"
It would have worked. Not a solution obviously. But it helps understand what happens: "files/*.txt" is literally passed to "list".
Then
for f in ~/$1
is transformed into
for f in /home/you/$1 [~ substitution]
then transformed into
for f in /home/you/files/*.txt [parameter substitution]
then transformed into
for f in /home/you/files/a.txt /home/you/files/b.txt [pattern expansion]
Now for the solution: passing the pattern unquoted and then using "$@" inside the function, as suggested in the comments, would indeed do the trick.
If you don't quote the argument and call
list ~/files/*.txt
Then expansion will occur before the call.
list ~/files/*.txt
is transformed into
list /home/you/files/*.txt
then into
list /home/you/files/a.txt /home/you/files/b.txt
Then passed to list.
But then, inside list, what you have are 2 arguments.
So indeed, in the for loop you then need to use "$@":
list ()
{
for f in "$@"; do
echo "$f"
done
}
list ~/files/list.txt
Another way, if you have a reason to want the expansion to occur inside list (for example, if you want to pass two of those patterns), would be to force the expansion after the parameter substitution.
There is no ideal way to do that.
Either you reimplement the path expansion yourself (or use external commands, such as realpath).
Or you use eval to force a second evaluation of your "$1". But that is a huge security hole if the arguments come from the user (one could pass $(rm -fr /) as an argument, and eval would execute it), and it can also be tricky if, for example, you have file names containing "$".
If you know that the patterns will always look like your examples (maybe a tilde and some * and the like), then you could just do the tilde substitution yourself and keep the rest of the code as is:
list ()
{
param=${1/#\~/$HOME}
for f in $param; do
echo $f
done
}
list "~/files/list.txt"
Not the best solution. But the one closest to yours.
tl;dr:
The problem is the order of substitution in bash. You need to understand that bash rewrites commands in several stages before executing them.
More specifically, ~ is expanded before parameters and variables. So if x="~/*", then echo $x is still echo $x after ~ expansion (there is no ~ in echo $x), then becomes echo ~/* after variable expansion ($x is replaced by its value), and stays echo ~/* after * expansion (since you have no directory literally named ~, * matches nothing).
The easiest solution is to have list take many arguments, not just one: let the expansion occur before the call to list (so do not enclose the argument to list in "), and then rewrite list to take into account that $1 is just the first of many arguments.
If you insist on passing a single argument to list, you have to deal with a potential ~ yourself, for example with ${1/#\~/$HOME} if the ~ is always a single ~ (not ~user) at the beginning of the pattern.
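A minimal sketch of the single-argument approach, run against a throwaway directory that stands in for the home directory. The name demo_home and the file names are made up for illustration, and the ${1/#\~/$HOME} substitution is bash syntax, not POSIX.

```shell
list() {
    # Tilde expansion has already happened by the time $1 is substituted,
    # so replace a leading "~" with $HOME by hand (bash-specific syntax).
    local pattern=${1/#\~/$HOME}
    local f
    # $pattern is deliberately unquoted so that pathname expansion happens.
    for f in $pattern; do
        printf '%s\n' "$f"
    done
}

# Throwaway directory standing in for the home directory (assumption).
demo_home=$(mktemp -d)
mkdir -p "$demo_home/files"
touch "$demo_home/files/a.txt" "$demo_home/files/b.txt"

# HOME is overridden for this one call only.
listing=$(HOME=$demo_home list '~/files/*.txt')
printf '%s\n' "$listing"

rm -rf "$demo_home"
```

Because $pattern is expanded unquoted, it is also word-split, so this sketch breaks on patterns or home directories that contain whitespace.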

Bash bad substitution with glob expansion for environment variables

How can I match environment variables which include the case-insensitive segment "proxy" that is not a prefix? I'm on bash:
root@PDPINTDEV9:~# echo ${SHELL}
/bin/bash
I want to unset a bunch of proxy variables simultaneously. They all have "proxy" or "PROXY" in the name, such as http_proxy or NO_PROXY. I would like to use glob expansion, which this answer & comment says is what bash uses.
Also based on that answer, I see that I can find environment vars which start with "PROXY":
root@PDPINTDEV9:~# echo "${!PROXY*}"
PROXY_IP PROXY_PORT
But that doesn't make sense with what I've read about glob expansion. Based on those, "${!PROXY*}" should match anything that doesn't start with proxy... I think.
Furthermore, I can't get anything that does make sense with glob syntax to actually work:
root@PDPINTDEV9:~# echo ${*proxy}
-bash: ${*proxy}: bad substitution
root@PDPINTDEV9:~# echo "${!*[pP][rR][oO][xX][yY]}"
-bash: ${!*[pP][rR][oO][xX][yY]}: bad substitution
SOLVED below: Turns out you can't. Crazy, but thanks everyone.
Variable name expansion, as a special case of shell parameter expansion, does not support globbing. But it has two flavors:
${!PREFIX*}
${!PREFIX@}
In both, the * and @ characters are hard-coded.
The first form will expand to variable names prefixed with PREFIX and joined by the first character of the IFS (which is a space, by default):
$ printf "%s\n" "${!BASH*}"
BASH BASHOPTS BASHPID BASH_ALIASES BASH_ARGC BASH_ARGV BASH_CMDS BASH_COMMAND ...
The second form will expand to variable names (prefixed with PREFIX), but as separate words:
$ printf "%s\n" "${!BASH@}"
BASH
BASHOPTS
BASHPID
BASH_ALIASES
BASH_ARGC
...
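One way to see the difference between the two quoted forms is to count how many words each produces. This is only a sketch: the zz_demo_ prefix and the count_args helper are invented for the demonstration, and it assumes no other variables with that prefix exist.

```shell
# Two hypothetical variables sharing a prefix.
zz_demo_alpha=1
zz_demo_beta=2

# Helper (made up for this demo): prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

star_count=$(count_args "${!zz_demo_*}")   # names joined into a single word
at_count=$(count_args "${!zz_demo_@}")     # one word per variable name

printf 'star=%s at=%s\n' "$star_count" "$at_count"
```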
Both of these forms are case-sensitive, so to get the variable names in a case-insensitive manner, you can use set, in combination with some cut and grep:
$ (set -o posix; set) | cut -d= -f1 | grep -i ^proxy
PROXY_IP
proxy_port
But that doesn't make sense with what I've read about glob expansion.
Based on those, "${!PROXY*}" should match anything that doesn't start
with proxy... I think.
No and no.
In the first place, the ! character is not significant to pathname expansion, except when it appears at the beginning of a character class in a pattern, in which case the sense of the class is inverted. For example, fo[!o] is a pattern that matches any three-character string whose first two characters are "fo" and whose third is not another 'o'. But there is no character class in your expression.
But more importantly, pathname expansion isn't relevant to your expression ${!PROXY*} at all. There is no globbing there. The '!' and '*' are fixed parts of the syntax for one of the forms of parameter expansion. That particular expansion produces, by definition, the names of all shell variables whose names start with "PROXY", separated by the first character of the value of the IFS variable. Where it appears outside of double quotes, it is equivalent to ${!PROXY@}, which is less susceptible to globbing-related confusion.
Furthermore, I can't get anything that does make sense with glob syntax to actually work: [...]
No, because, again, there is no globbing going on. You need exactly ${! followed by the name prefix of interest, followed by *} or @} to form the particular kind of parameter expansion you're asking about.
How can I match environment variables which include the case-insensitive segment "proxy"?
You need to explicitly express the case variations of interest to you. For example:
${!PROXY*} ${!proxy*} ${!Proxy*}
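Since the prefix forms cannot match "proxy" in the middle of a name, another option is to filter the full list of variable names. The sketch below assumes bash (compgen is a bash builtin) and uses made-up variables; note that it unsets every variable whose name contains "proxy", in any letter case.

```shell
# Hypothetical variables for the demonstration.
http_proxy=http://proxy.example:3128
NO_PROXY=localhost
PROXY_IP=192.0.2.1

# compgen -v prints every shell variable name, one per line;
# grep -i keeps the names containing "proxy" in any letter case.
for name in $(compgen -v | grep -i 'proxy'); do
    unset -v "$name"
done

# Each expansion falls back to the word "unset" if the variable is gone.
printf '%s\n' "${http_proxy-unset}" "${NO_PROXY-unset}" "${PROXY_IP-unset}"
```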

Why this expansion inside expansion pattern doesn't work?

Suppose you have something like:
$ a=(fooa foob foox)
Then you can do:
$ b=(${(M)a:#*(a|b)})
To select a's elements matching the pattern.
So you have:
$ print ${(qq)b}
'fooa' 'foob'
Then you expect to build the pattern in some dynamic way, so you have it in another variable, say:
$ p="*(a|b)"
And you expect this:
$ b=(${(M)a:#$p})
Would work the same as before, as the documentation says, but it doesn't:
$ print ${(qq)b}
''
Why is that?
Because in this case zsh matches $p's value literally (as a plain string):
a=('*(a|b)' fooa foob)
p="*(a|b)"
b=(${(M)a:#$p})
print ${(qq)b}
;#⇒'*(a|b)'
We can explicitly tell zsh to treat the expansion of $p as a pattern rather than a literal string by using the ${~spec} form.
${~spec}
Turn on the GLOB_SUBST option for the evaluation of spec; if the ‘~’ is doubled, turn it off. When this option is set, the string resulting from the expansion will be interpreted as a pattern anywhere that is possible, such as in filename expansion and filename generation and pattern-matching contexts like the right hand side of the ‘=’ and ‘!=’ operators in conditions.
-- zshexpn(1): Expansion, Parameter Expansion
In this case, we could use it like this:
a=(fooa foob foox)
p="*(a|b)"
b=(${(M)a:#${~p}}) ;# tell zsh treat as a pattern for `$p`
print ${(qq)b}
;#⇒'fooa' 'foob'
Note: the documentation for the parameter expansion flag b gives some hints on storing patterns in variable values:
b
Quote with backslashes only characters that are special to pattern matching. This is useful when the contents of the variable are to be tested using GLOB_SUBST, including the ${~...} switch.
Quoting using one of the q family of flags does not work for this purpose since quotes are not stripped from non-pattern characters by GLOB_SUBST. In other words,
pattern=${(q)str}
[[ $str = ${~pattern} ]]
works if $str is ‘a*b’ but not if it is ‘a b’, whereas
pattern=${(b)str}
[[ $str = ${~pattern} ]]
is always true for any possible value of $str.
-- zshexpn(1): Expansion, Parameter Expansion, Parameter Expansion Flags

How do I store a command in a variable and use it in a pipeline? [duplicate]

If I use this command in a pipeline, it works very well:
pipeline ... | grep -P '^[^\s]*\s3\s'
But if I store the grep command in a variable like this:
var="grep -P '^[^\s]*\s3\s'"
and then use the variable in the pipeline:
pipeline ... | $var
nothing happens, as if there were no matches.
What am I doing wrong?
The robust way to store a simple command in a variable in Bash is to use an array:
# Store the command names and arguments individually
# in the elements of an *array*.
cmd=( grep -P '^[^\s]*\s3\s' )
# Use the entire array as the command to execute - be sure to
# double-quote ${cmd[@]}.
echo 'before 3 after' | "${cmd[@]}"
If, by contrast, your command is more than a simple command and, for instance, involves pipes, multiple commands, loops, ..., defining a function is the right approach:
# Define a function encapsulating the command...
myGrep() { grep -P '^[^\s]*\s3\s'; }
# ... and use it:
echo 'before 3 after' | myGrep
Why what you tried didn't work:
var="grep -P '^[^\s]*\s3\s'"
causes the single quotes around the regex to become a literal, embedded part of $var's value.
When you then use $var - unquoted - as a command, the following happens:
Bash performs word-splitting, which means that it breaks the value of $var into words (separate tokens) by whitespace (the chars. defined in special variable $IFS, which contains a space, a tab, and a newline character by default).
Bash also performs globbing (pathname expansion) on the resulting words, which is not a problem here, but can have unintended consequences in general.
Also, if any of your original arguments had embedded whitespace, word splitting would split them into multiple words, and your original argument partitioning is lost.
(As an aside: "$var" - i.e., double-quoting the variable reference - is not a solution, because then the entire string is treated as the command name.)
Specifically, the resulting words are:
grep
-P
'^[^\s]*\s3\s' - including the surrounding single quotes
The words are then interpreted as the name of the command and its arguments, and invoked as such.
Given that the pattern argument passed to grep starts with a literal single quote, matching won't work as intended.
Short of using eval "$var" - which is NOT recommended for security reasons - you cannot persuade Bash to see the embedded single quotes as syntactical elements that should be removed (a process appropriately called quote removal).
Using an array bypasses all these problems by storing arguments in individual elements and letting Bash robustly assemble them into a command with "${cmd[@]}".
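The contrast can be reproduced side by side. The sketch below swaps grep -P for grep -E with POSIX character classes, purely so that it does not depend on PCRE support; the sample input is made up.

```shell
# Array version: one element per word, quotes are consumed by the shell.
cmd=( grep -E '^[^[:space:]]*[[:space:]]3[[:space:]]' )
input='before 3 after
before 4 after'

matched=$(printf '%s\n' "$input" | "${cmd[@]}")

# String version: the single quotes become part of the pattern,
# so grep requires a literal ' in the input and finds nothing.
var="grep -E '^[^[:space:]]*[[:space:]]3[[:space:]]'"
broken=$(printf '%s\n' "$input" | $var) || true

printf 'array: [%s]\nstring: [%s]\n' "$matched" "$broken"
```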
What you are doing wrong is trying to store a command in a variable. For simplicity, robustness, etc. commands are stored in aliases (if no arguments) or functions (if arguments), not variables. In this case:
$ alias foo='grep X'
$ echo "aXb" | foo
aXb
I recommend you read the book Shell Scripting Recipes by Chris Johnson ASAP to get the basics of shell programming and then Effective Awk Programming, 4th Edition, by Arnold Robbins when you're ready to start writing scripts to manipulate text.

Bash Parameter Expansion - get immediate parent directory of file [duplicate]

I'd like to get the name of the immediate parent directory of a given file, e.g. foo given /home/blah/foo/bar.txt, using a parameter expansion. Right now I can do it in two lines:
f="/home/blah/foo/bar.txt"
dir_name="${f%/*}"
immediate_parent="${dir_name##*/}"
But I'm very new to parameter expansions, so I assume this could be optimized. Is there a way to do it in only one line?
You can't do it with a single parameter expansion, but you can use =~, Bash's regex-matching operator:
[[ $f =~ ^.*/(.+)/.+$ ]] && immediate_parent=${BASH_REMATCH[1]}
Note: Assumes an absolute path with at least 2 components.
If calling an external utility is acceptable, awk offers a potentially simpler alternative:
immediate_parent=$(awk -F/ '{ print $(NF-1) }' <<<"$f")
As for why it can't be done with a single parameter expansion:
Parameter expansion allows for stripping either a prefix (# / ##) or a suffix (% / %%) from a variable value, but not both.
Nesting prefix- and suffix-trimming expansions, while supported in principle, does not help, because you'd need iterative modification of values, whereas an expansion only ever returns the modified string; in other words: the effective overall expansion still only performs a single expansion operation, and you're again stuck with either a prefix or a suffix operation.
Using a single parameter expansion, extracting an inner substring can only be done by character position and length; e.g., a hard-coded solution based on the sample input would be: immediate_parent=${f:11:3}
You can use arithmetic expressions and even command substitutions as parameter expansion arguments, but the pointlessness of this approach - at least in this scenario - becomes obvious if we try it; note the embedded command substitutions - the first to calculate the character position, the second to calculate the length:
immediate_parent=${f:$(d=${f%/*/*}; printf %s ${#d})+1:$(awk -F/ '{ print length($(NF-1)) }' <<<"$f")}
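If the real goal is a one-line call site rather than a single expansion, the two-step version from the question can be wrapped in a small function. This is only a sketch: parent_name is a made-up name, and like the regex solution it assumes a path with at least two components.

```shell
parent_name() {
    local dir=${1%/*}            # strip the file name: /home/blah/foo
    printf '%s\n' "${dir##*/}"   # strip everything through the last slash
}

immediate_parent=$(parent_name /home/blah/foo/bar.txt)
printf '%s\n' "$immediate_parent"
```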

Resources