Pattern matching in bash with tuple-like arguments - bash

I want to pass a variable number of 'tuples' as arguments into a bash script and go through them in a loop using pattern matching, something like this:
for *,* in "$@"; do
#do something with first part of tuple
#do something with second part of tuple
done
Is this possible? If so, how do I access each part of the tuple?
For example I would like to call my script like:
bash bashscript.sh first_file.xls,1 second_file,2 third_file,2 ... nth_file,1

Since bash doesn't have a tuple datatype (it just has strings), you would need to encode and decode them yourself. For example:
$ bash bashscript.sh first_file.xls,1 second_file,2 third_file,2 ... nth_file,1
In bashscript.sh:
for tuple in "$@"; do
IFS=, read -r first second <<< "$tuple"
...
done
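A minimal sketch of how the loop above might look as a complete script. The function name process_tuples and the printf output format are illustrative, not part of the original answer, and the sketch assumes each argument contains one comma (anything after the first comma ends up in the second variable).

```shell
#!/usr/bin/env bash
# Hypothetical helper: splits each "name,count" argument at the first comma.
process_tuples() {
    local tuple first second
    for tuple in "$@"; do
        # IFS=, applies only to this read; -r keeps backslashes literal.
        IFS=, read -r first second <<< "$tuple"
        printf 'file=%s count=%s\n' "$first" "$second"
    done
}

process_tuples first_file.xls,1 second_file,2
```

In a real script you would call process_tuples "$@" to forward the script's own arguments.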

Yes, it's possible, and there is more than one way to do it. You can use the prefix/suffix expansion syntax on variables (e.g. ${var#prefix}, ${var##prefix}, ${var%suffix}, ${var%%suffix}; these remove either the shortest or longest prefix/suffix matching the specified pattern). Or you can replace the positional parameters by setting IFS to a comma and then running set -- ${var} (the IFS assignment has to take effect before ${var} is expanded, so a one-line IFS=, set -- ${var} will not split at the commas; you'd also have to save the rest of the original parameters in some way first so you can continue your loop). You can use arrays, if your version of bash is new enough (and if it isn't, it's pretty old...). Those are probably three of the better methods, but there are others...
Edit: some examples using the suffix/prefix expansions:
for tuple in first_file.xls,1
do
echo ${tuple%,*} # "first_file.xls"
echo ${tuple#*,} # "1"
done
If your tuples are more than 2-ary, that method's a little more complex; for example:
for tuple in x,y,z
do
first=${tuple%%,*} # "x"
rest=${tuple#"${first}",} # "y,z" (strip the first field and its comma)
second=${rest%%,*} # "y"
last=${rest#*,} # "z"
done
In that case you might prefer @chepner's answer of IFS=, read first second third <<< "${tuple}"... Otherwise, the bookkeeping can get hairy for large tuples. Setting an array from the tuple would be an acceptable alternative as well.
For simple pairs, though, I tend to prefer just stripping off a prefix/suffix as appropriate...
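The array alternative mentioned above could look roughly like this. It is a sketch, assuming the fields themselves contain no commas or newlines; the array name parts is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical example: split one comma-separated tuple into an array.
tuple=x,y,z
# read -a fills the indexed array "parts" with the comma-separated fields.
IFS=, read -r -a parts <<< "$tuple"
echo "fields: ${#parts[@]}"
echo "first=${parts[0]} second=${parts[1]} third=${parts[2]}"
```

This scales to tuples of any arity without the prefix/suffix bookkeeping, at the cost of one extra read per tuple.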

Related

(bash) What is the least redundant way to systematically apply changes to an array of variables?

My goal is to check a list of file paths if they end in "/" and remove it if that is the case.
Ideally I would like to change the original FILEPATH variables to reflect this change, and I'd like this to work for a long list without unnecessary redundancy. I tried doing it as a loop, but the changes didn't alter the original variables, it just changed the iterating "EACH_PATH" variable. Can anyone think of a better way to do this?
Here is my code:
FILEPATH1="filepath1/file1"
FILEPATH2="filepath2/file2/"
PATH_ARRAY=(${FILEPATH1} ${FILEPATH2})
echo ${PATH_ARRAY[@]}
for EACH_PATH in ${PATH_ARRAY[@]}
do
if [ "${EACH_PATH:$((${#EACH_PATH}-1)):${#EACH_PATH}}"=="/" ]
then EACH_PATH=${EACH_PATH:0:$((${#EACH_PATH}-1))}
fi
done
edit: I'm happy to do this in a totally different way and scrap the code above, I just want to know the most elegant way to do this.
I'm not entirely clear on the actual goal here, but depending on the situation I can see several possible solutions. The best (if it'll work in the situation) is to dispense with the individual variables, and just use array entries. For example, you could use:
declare -a filepath
filepath[1]="filepath1/file1"
filepath[2]="filepath2/file2/"
for index in "${!filepath[@]}"; do
if [[ "${filepath[index]}" = *?/ ]]; then
filepath[index]="${filepath[index]%/}"
fi
done
...and then use "${filepath[x]}" instead of "$FILEPATHx" throughout. Some notes:
I've used lowercase names. It's generally best to avoid all-caps names, since there are a lot of them with special functions, and accidentally using one of those names can cause trouble.
"${!filepath[@]}" gets a list of the indexes of the array (in this case, "1" "2") rather than their values; this is necessary so we can set the entries rather than just look at them.
I changed the logic of the slash-trimming test: it uses [[ = ]] to do pattern matching, to see if the entry ends with "/" and has at least one character before that (i.e. it isn't just "/", because you don't want to trim that). Then it uses %/ in the expansion to trim just the "/" from the end of the value.
If a numerically-indexed array won't work (and you have at least bash version 4), how about a string-indexed ("associative") array? It's very similar, but use declare -A and use $ on variables in the index (and generally quote them). Something like this:
declare -A filepath
filepath["foo"]="filepath1/file1"
filepath["bar"]="filepath2/file2/"
for index in "${!filepath[@]}"; do
if [[ "${filepath["$index"]}" = *?/ ]]; then
filepath["$index"]="${filepath["$index"]%/}"
fi
done
If you really need separate variables instead of array entries, you might be able to use an array of variable names, and indirect variable references. How this works varies quite a bit between different shells, and can easily be unsafe depending on what's in your data (in this case, specifically what's in path_array). Here's a way to do it in bash:
filepath1="filepath1/file1"
filepath2="filepath2/file2/"
path_array=(filepath1 filepath2)
for varname in "${path_array[@]}"; do
if [[ "${!varname}" = *?/ ]]; then
declare "$varname=${!varname%/}"
fi
done
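One caveat with the loop above: inside a function, declare creates local variables, so the change would not be visible to the caller. A possible variant, sketched here with a hypothetical function name trim_trailing_slash, uses printf -v to assign through the variable name instead. It assumes the names passed in are trusted and do not collide with the function's own local varname.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip one trailing "/" from each named variable.
trim_trailing_slash() {
    local varname
    for varname in "$@"; do
        if [[ "${!varname}" = *?/ ]]; then
            # printf -v assigns to the variable whose name is in $varname.
            printf -v "$varname" '%s' "${!varname%/}"
        fi
    done
}

filepath1="filepath1/file1"
filepath2="filepath2/file2/"
trim_trailing_slash filepath1 filepath2
echo "$filepath1" "$filepath2"
```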
Using sed:
PATH_ARRAY=($(echo ${PATH_ARRAY[@]} | sed 's#\/ # #g;s#/$##g'))
Demo:
$ FILEPATH1="filepath1/file1"
$ FILEPATH2="filepath2/file2/"
$ PATH_ARRAY=(${FILEPATH1} ${FILEPATH2})
$ echo ${PATH_ARRAY[@]}
filepath1/file1 filepath2/file2/
$ PATH_ARRAY=($(echo ${PATH_ARRAY[@]} | sed 's#\/ # #g;s#/$##g'))
$ echo ${PATH_ARRAY[@]}
filepath1/file1 filepath2/file2
$
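The sed pipeline relies on word splitting, so it would break on paths containing spaces. A pure-bash sketch of the same idea applies the %/ expansion to every array element at once. The array contents are made up for the example, and note that unlike the [[ ... = *?/ ]] test in the earlier answer, this would turn a bare "/" into an empty string.

```shell
#!/usr/bin/env bash
# Example data (hypothetical), including a path with a space in it.
path_array=("filepath1/file1" "filepath2/file2/" "dir with space/")
# %/ removes one trailing slash from each element; quoting keeps elements intact.
path_array=("${path_array[@]%/}")
printf '%s\n' "${path_array[@]}"
```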

Using a variable for associative array key in Bash

I'm trying to create associative arrays based on variables. So below is a super simplified version of what I'm trying to do (the ls command is not really what I want, just used here for illustrative purposes)...
I have a statically defined array (text-a,text-b). I then want to iterate through that array, and create associative arrays with those names and _AA appended to them (so associative arrays called text-a_AA and text-b_AA).
I don't really need the _AA appended, but was thinking it might be
necessary to avoid duplicate names since $NAME is already being used
in the loop.
I will need those defined and will be referencing them in later parts of the script, and not just within the for loop seen below where I'm trying to define them... I want to later, for example, be able to reference text-a_AA[NUM] (again, using variables for the text-a_AA part). Clearly what I have below doesn't work... and from what I can tell, I need to be using namerefs? I've tried to get the syntax right, and just can't seem to figure it out... any help would be greatly appreciated!
#!/usr/bin/env bash
NAMES=('text-a' 'text-b')
for NAME in "${NAMES[@]}"
do
NAME_AA="${NAME}_AA"
$NAME_AA[NUM]=$(cat $NAME | wc -l)
done
for NAME in "${NAMES[@]}"
do
echo "max: ${$NAME_AA[NUM]}"
done
You may want to use "NUM" as the name of the associative array and the file name as the key. Declare it once with declare -A NUM before the loop; then you can rewrite the assignment as:
NUM[${NAME}_AA]=$(wc -l < "$NAME")
Then rephrase your loop as:
for NAME in "${NAMES[@]}"
do
echo "max: ${NUM[${NAME}_AA]}"
done
Check your script at shellcheck.net
As an aside: all uppercase is not a good practice for naming normal shell variables. You may want to take a look at:
Correct Bash and shell script variable capitalization
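Putting the pieces of this answer together, a self-contained sketch might look like the following. It assumes bash 4 or later, uses lowercase names as suggested above, drops the _AA suffix for brevity, and creates two throwaway files in a temporary directory so that it can run anywhere; the file contents are invented for the demo.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
workdir=$(mktemp -d)                 # scratch directory for the demo files
printf 'one\ntwo\n' > "$workdir/text-a"
printf 'one\n' > "$workdir/text-b"

names=(text-a text-b)
declare -A num                       # must be declared before first use
for name in "${names[@]}"; do
    # Key is the file name, value is its line count.
    num[$name]=$(wc -l < "$workdir/$name")
done
for name in "${names[@]}"; do
    echo "max: ${num[$name]}"
done
rm -r "$workdir"
```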

What is the best way to store sliced arguments in Bash?

Example
>>./my_script.sh a b c
If I want to echo the arguments from the second one onward, I can do
>>echo "${@:2}"
b c
And if I want to store ${@:2} in a variable, these methods do not work:
my_params=${@:2}
or
my_params="${@:2}"
But this way works:
my_params="$(echo ${@:2})"
This feels ugly. So, my questions are:
What is the proper way to store sliced arguments?
How do I assign those sliced arguments to a variable?
How do I reuse them as parameters of another function?
In the original Bourne shell, only the positional argument list was available for this. Fortunately, modern derivatives have an array variable type specifically for this kind of situation.
array=("${@:2}") # note parentheses for array
echo "${array[0]}" # first arg of array
command "${array[@]}" # pass array as quoted arguments
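To illustrate the third question (reusing the slice as parameters of another function), here is a small sketch. The function names show_args and forward_rest are hypothetical; the point is that each array element, including one with a space in it, arrives as a separate argument.

```shell
#!/usr/bin/env bash
# Hypothetical functions showing how a slice of "$@" can be stored and reused.
show_args() {
    printf '<%s>' "$@"        # wrap each received argument in angle brackets
    printf '\n'
}

forward_rest() {
    local rest=("${@:2}")     # everything from the second argument on
    show_args "${rest[@]}"    # each element stays a separate argument
}

forward_rest a "b c" d
```

By contrast, storing the slice in a plain string variable would merge "b c" and d into one space-separated value, and the original argument boundaries would be lost.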

bash command expansion

The following bash command substitution does not work as I thought.
echo $TMUX_$(echo 1)
only prints 1, but I am expecting the value of the variable $TMUX_1. I also tried:
echo ${TMUX_$(echo 1)}
-bash: ${TMUXPWD_$(echo 1)}: bad substitution
Any suggestions ?
If I understand correctly what you're looking for, you're trying to programmatically construct a variable name and then access the value of that variable. Doing this sort of thing normally requires an eval statement:
eval "echo \$TMUX_$(echo 1)"
Important features of this statement include the use of double-quotes, so that the $( ) gets properly interpreted as a command substitution, and the escaping of the first $ so that it doesn't get evaluated the first time through. Another way to achieve the same thing is
eval 'echo $TMUX_'"$(echo 1)"
where in this case I used two strings which automatically get concatenated. The first is single-quoted so that it's not evaluated at first.
There is one exception to the eval requirement: Bash has a method of indirect referencing, ${!name}, for when you want to use the contents of a variable as a variable name. You could use this as follows:
tmux_var="TMUX_$(echo 1)"
echo "${!tmux_var}"
I'm not sure if there's a way to do it in one statement, though, since you have to have a named variable for this to work.
P.S. I'm assuming that echo 1 is just a stand-in for some more complicated command ;-)
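A runnable sketch of the indirect-reference approach, with a made-up value assigned to TMUX_1 so there is something to print. The helper variable suffix is only there to stand in for the more complicated command.

```shell
#!/usr/bin/env bash
TMUX_1="value of TMUX_1"      # hypothetical value for the demo
suffix=$(echo 1)              # stand-in for a more complicated command
tmux_var="TMUX_${suffix}"     # build the variable name as a string
echo "${!tmux_var}"           # indirect expansion: prints the value of TMUX_1
```

Unlike eval, this never re-parses the command's output as shell code; it only uses it as part of a variable name.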
Are you looking for arrays? Bash has them. There are a number of ways to create and use arrays in bash, the section of the bash manpage on arrays is highly recommended. Here is a sample of code:
TMUX=( "zero" "one" "two" )
echo "${TMUX[2]}"
The result in this case is, of course, two.
Here are a few short lines from the bash manpage:
Bash provides one-dimensional indexed and associative array variables. Any variable may be
used as an indexed array; the declare builtin will explicitly declare an array. There is
no maximum limit on the size of an array, nor any requirement that members be indexed or
assigned contiguously. Indexed arrays are referenced using integers (including arithmetic
expressions) and are zero-based; associative arrays are referenced using arbitrary
strings.
An indexed array is created automatically if any variable is assigned to using the syntax
name[subscript]=value. The subscript is treated as an arithmetic expression that must
evaluate to a number greater than or equal to zero. To explicitly declare an indexed
array, use declare -a name (see SHELL BUILTIN COMMANDS below). declare -a name[subscript]
is also accepted; the subscript is ignored.
This works (tested):
eval echo \$TMUX_`echo 1`
Probably not very clear though. A command substitution around the echo (backticks here, or the equivalent $(...)) is needed to get that to work.

Tricky brace expansion in shell

When using a shell that supports brace expansion (such as Bash; it is not part of POSIX sh), the following
touch {quick,man,strong}ly
expands to
touch quickly manly strongly
Which will touch the files quickly, manly, and strongly, but is it possible to dynamically create the expansion? For example, the following illustrates what I want to do, but does not work because of the order of expansion:
TEST=quick,man,strong #possibly output from a program
echo {$TEST}ly
Is there any way to achieve this? I do not mind constricting myself to Bash if need be. I would also like to avoid loops. The expansion should be given as complete arguments to any arbitrary program (i.e. the program cannot be called once for each file, it can only be called once for all files). I know about xargs but I'm hoping it can all be done from the shell somehow.
... There is so much wrong with using eval. What you're asking is only possible with eval, BUT what you might want is easily possible without having to resort to bash bug-central.
Use arrays! Whenever you need to keep multiple items in one datatype, you need (or, should use) an array.
TEST=(quick man strong)
touch "${TEST[@]/%/ly}"
That does exactly what you want without the thousand bugs and security issues introduced and concealed in the other suggestions here.
The way it works is:
"${foo[@]}": Expands the array named foo by expanding each of its elements, properly quoted. Don't forget the quotes!
${foo/a/b}: This is a type of parameter expansion that replaces the first a in foo's expansion with b. In this type of expansion you can use % to anchor the match at the end of the expanded value, sort of like $ in regular expressions.
Put all that together and "${foo[@]/%/ly}" will expand each element of foo, properly quote it as a separate argument, and replace each element's end with ly.
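If the data really does arrive as a single comma-separated string, as in the question, it can be split into an array first and then expanded the same way. This is a sketch, assuming the items contain no commas or newlines; the array name words is made up, and printf stands in for touch so the example has no side effects.

```shell
#!/usr/bin/env bash
TEST=quick,man,strong                 # possibly output from a program
IFS=, read -r -a words <<< "$TEST"    # split at commas into an array
# Append "ly" to every element; each one is passed as a separate argument.
printf '%s\n' "${words[@]/%/ly}"
```

Replacing printf '%s\n' with touch passes all the names to a single invocation of the program, with no eval and no explicit loop.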
In bash, you can do this:
#!/bin/bash
TEST=quick,man,strong
eval echo $(echo {$TEST}ly)
#eval touch $(echo {$TEST}ly)
That last line is commented out but will touch the specified files.
Zsh can easily do that:
TEST=quick,man,strong
print ${(s:,:)^TEST}ly
The variable's content is split at commas, then each element is distributed to the string around the braces:
quickly manly strongly
Taking inspiration from the answers above:
$ TEST=quick,man,strong
$ touch $(eval echo {$TEST}ly)

Resources