Suppressing shell expansion when truncating variable obtained by asterisk - bash

In my folder I have the following files:
roi_1_Precentral_L/
roi_1_Precentral_L_both.fig
roi_1_Precentral_L_left.fig
roi_1_Precentral_L_right.fig
roi_1_Precentral_L_slice.fig
roi_2_Precentral_R/
roi_2_Precentral_R_both.fig
...
roi_116_Vermis_10/
roi_116_Vermis_10_both.fig
roi_116_Vermis_10_left.fig
roi_116_Vermis_10_right.fig
roi_116_Vermis_10_slice.fig
I use the following script to obtain the desired filename prefix for each of the 116 types:
for iroi in `seq 1 116`;
do
d=roi_${iroi}_*/
d2=${d:0:-1} # <-- THIS LINE IS IMPORTANT
echo $d2
done;
Desired output for iroi=1:
$ roi_1_Precentral_L
Actual output:
$ roi_1_Precentral_L roi_1_Precentral_L_both.fig roi_1_Precentral_L_left.fig roi_1_Precentral_L_right.fig roi_1_Precentral_L_slice.fig
How can I avoid shell expansion in the emphasized line of code to get the desired output?

If you assign to an array, the glob is expanded at the assignment itself rather than later at the echo, which is what happened in your original code.
d=( "roi_${iroi}_"*/ )
d2=${d:0:-1} # Note that this only works with new bash. ${d%/} would be better.
echo "$d2"
If you expect multiple directories, "${d[@]%/}" will expand to the full list, with the trailing / removed from each:
d=( "roi_${iroi}_"*/ )
printf '%s\n' "${d[@]%/}"
With respect to avoiding unwanted expansions -- note that in the above, every expansion except for those on the right-hand side of a simple (string, not array) assignment is in double quotes. (Regular assignments implicitly inhibit string-splitting and glob expansion -- though it doesn't hurt to have quotes even then! This inhibition is why ${d:0:-1} was removing the / from the glob expression itself, not from its results).
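A minimal sketch of the difference between the two kinds of assignment, run in a throwaway directory. The mktemp sandbox and the file names are invented for the demo and only mirror the listing in the question:

```bash
#!/usr/bin/env bash
# Hypothetical sandbox mirroring the question's layout.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir roi_1_Precentral_L
touch roi_1_Precentral_L_both.fig roi_1_Precentral_L_left.fig

iroi=1

# Scalar assignment: the glob is stored literally, not expanded.
s=roi_${iroi}_*/
printf 'scalar: %s\n' "$s"

# Array assignment: the glob is expanded right here.
d=( "roi_${iroi}_"*/ )
d2=${d[0]%/}                 # strip the trailing slash from the first match
printf 'array:  %s\n' "$d2"
```

The scalar still holds the pattern roi_1_*/, while the array element holds the directory name that matched it.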

Answer to Question
If you wanted, you could quote to avoid expansion of * in $d ...
d=roi_${iroi}_*/
d2="${d:0:-1}"
echo $d2
... but then you could directly write ...
d2="roi_${iroi}_*"
echo $d2
... and the output would still be the same as in your question.
Answer to Expected Output
You could do the expansion in an array and select the first array entry, then remove the / from that entry.
for iroi in {1..116}; do
d=(roi_"$iroi"_*/)
d2="${d[0]:0:-1}"
echo "$d2"
done
This matches only directories and prints the first directory without the trailing /.

Related

Multiple elements instead of one in bash script for loop

I have been following the answers given in these questions
Shellscript Looping Through All Files in a Folder
How to iterate over files in a directory with Bash?
to write a bash script which goes over files inside a folder and processes them. So, here is the code I have:
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in "$INFOLDER$YEAR*.mdb";
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done
I am receiving this error: basename: extra operand.
I added echo $f and I realized that f contains all the filenames separated by space. But I expected to get one at a time. What could be the problem here?
You're running into problems with quoting. In the shell, double-quotes prevent word splitting and wildcard expansion; generally, you don't want these things to happen to variables' values, so you should double-quote variable references. But when you have something that should be word-split or wildcard-expanded, it cannot be double-quoted. In your for statement, you have the entire file pattern in double-quotes:
for f in "$INFOLDER$YEAR*.mdb";
...which prevents word-splitting and wildcard expansion on the variables' values (good) but also prevents it on the * which you need expanded (that's the point of the loop). So you need to quote selectively, with the variables inside quotes and the wildcard outside them:
for f in "$INFOLDER$YEAR"*.mdb;
And then inside the loop, you should double-quote the references to $f in case any filenames contain whitespace or wildcards (which are completely legal in filenames):
echo "$f"
absname="$INFOLDER$YEAR$(basename "$f")"
(Note: the double-quotes around the assignment to absname aren't actually needed -- the right side of an assignment is one of the few places in the shell where it's safe to skip them -- but IMO it's easier and safer to just double-quote all variable references and $( ) expressions than to try to keep track of where it's safe and where it's not.)
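To see the effect of where the quotes end, here is a small sketch. The sandbox directory and the two .mdb file names are made up for illustration; one of them deliberately contains a space:

```bash
#!/usr/bin/env bash
# Hypothetical sandbox; directory and file names are invented for the demo.
infolder=$(mktemp -d)/
year="2002/"
mkdir -p "$infolder$year"
touch "$infolder$year/first file.mdb" "$infolder$year/second.mdb"

count_quoted=0
for f in "$infolder$year*.mdb"; do      # whole pattern quoted: one literal word
    count_quoted=$((count_quoted + 1))
done

count_selective=0
for f in "$infolder$year"*.mdb; do      # wildcard outside the quotes: one word per file
    count_selective=$((count_selective + 1))
    printf '%s\n' "$(basename "$f")"
done

echo "quoted: $count_quoted, selective: $count_selective"
```

The fully quoted loop runs once with the unexpanded pattern, while the selectively quoted loop runs once per matching file, and the file with a space in its name stays in one piece.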
Just quote your shell variables if they are supposed to contain strings with spaces in between.
basename "$f"
Not doing so will lead to the string being split into separate words (see WordSplitting in bash), which breaks the basename command, since it expects a single string argument rather than several.
It would also be wise to keep the * outside the quotes, since shell globbing does not work inside them (single or double).
#!/bin/bash
# good practice to lower-case variable names to distinguish them from
# shell environment variables
year="2002/"
in_folder="/local/data/datasets/Convergence/"
for file in "${in_folder}${year}"*.mdb; do
# skip the iteration gracefully if no files are found
[ -e "$file" ] || continue
echo "$file"
# Worth noting here: $file already holds the name of the file
# with its absolute path, just as below. You don't need to
# construct it manually
absname=${in_folder}${year}$(basename "$file")
done
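The [ -e "$file" ] guard matters because, with bash's default options (nullglob and failglob unset), a pattern that matches nothing is left in place as a literal string. A small sketch, using an empty throwaway directory that is purely an assumption of the demo:

```bash
#!/usr/bin/env bash
# Hypothetical empty sandbox: no .mdb files exist here.
in_folder=$(mktemp -d)/

seen=0
processed=0
for file in "${in_folder}"*.mdb; do
    seen=$((seen + 1))                 # runs once, with the literal pattern
    [ -e "$file" ] || continue         # the pattern is not an existing file
    processed=$((processed + 1))
done

echo "iterations: $seen, files processed: $processed"
```

Without the guard, the body would try to process a file literally named *.mdb.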
just remove "" from this line
for f in "$INFOLDER$YEAR*.mdb";
so it looks like this
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in $INFOLDER$YEAR*.mdb;
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done

Why does this Bash pathname expansion not take place?

I'm struggling with Bash variable expansion. Please see the following code:
~/tmp 689$ a=~/Library/Application\ *; echo $a
/Users/foo/Library/Application *
~/tmp 690$ echo ~/Library/Application\ *
/Users/foo/Library/Application Scripts /Users/foo/Library/Application Support
As the order of expansion is brace->tilde->parameter->....->pathname,
why is pathname expansion not applied to $a in the same way that it is in the 2nd command?
[added]
Does whitespace escaping have hidden behaviour regarding the following output?
~/tmp 705$ a=~/Library/Application*; echo $a
/Users/foo/Library/Application Scripts /Users/foo/Library/Application Support
To do what you meant to do, you'd have to use the following:
a=(~/Library/Application\ *) # use an *array* to capture the pathname-expanded results
echo "${a[@]}" # output all array elements (without further expansion)
As for why your code didn't work:
In the context of variable assignment involving only literals or string interpolation (references to other variables), NO pathname expansion takes place, even with unquoted strings (e.g., a=*, a="*", and a='*' all assign literal *)[1].
(By contrast, pathname expansion is applied to unquoted strings inside an array definition (e.g., a=(*)), or inside a command substitution (e.g., a=$(echo *)).)
Thus, the literal content of $a is /Users/foo/Library/Application *
Executing echo $a - i.e., NOT double-quoting the variable reference $a - then applies word splitting and does the following:
it prints literal '/Users/foo/Library/Application' (the 1st word - no expansion applied, due to its contents)
it prints the pathname expansion applied to * (the 2nd word - i.e., it expands to matching filenames in the current dir.)
The fact that the latter results in * in your case implies that you happen to be running the echo command from an empty directory (save for hidden files, assuming the default configuration).
[1]Whether the string is unquoted or not does, however, matter with respect to tilde expansion; e.g., a=~ expands ~ to the user's home directory, whereas a='~' or a="~" assign literal ~.
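The behaviour described above can be reproduced without touching a real ~/Library. In this sketch the temporary directory and its two subdirectories are stand-ins chosen for the demo:

```bash
#!/usr/bin/env bash
# Hypothetical sandbox standing in for ~/Library.
lib=$(mktemp -d)
mkdir "$lib/Application Scripts" "$lib/Application Support"

a=$lib/Application\ *          # scalar assignment: stored literally, no pathname expansion
[ "$a" = "$lib/Application *" ] && echo "assignment kept the literal *"

arr=( "$lib"/Application\ * )  # array assignment: pathname expansion happens here
echo "array holds ${#arr[@]} entries"
printf '%s\n' "${arr[@]##*/}"  # print just the last path component of each match
```

The scalar keeps the space and the asterisk as plain characters; only the array captures the two matching directories.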

Bash replace ls with for loop

I was given a tip to use file globbing instead of ls in Bash scripts. In my code I followed the instructions and replaced array=($(ls)) with:
function list_files() {
for f in *; do
[[ -e $f ]] || continue
done
}
array=($(list_files))
However, the new function doesn't return anything. Am I doing something wrong here?
Simply write this:
array=(*)
Leaving aside that your "list_files" doesn't output anything, there are still other problems with your approach.
Unquoted command substitution (in your case "$(list_files)") will still be subject to "word splitting" and "pathname expansion" (see bash(1) "EXPANSION"), which means that if there are spaces in "list_files" output, they will be used to split it into array elements, and if there are pattern characters, they will be used to attempt to match and substitute the current directory file names as separate array elements.
OTOH, if you quote the command substitution with double quotes, then the whole output will be considered a single array element.
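A short sketch of the word-splitting problem, assuming a throwaway directory with one file name that contains a space (both names are invented for the demo):

```bash
#!/usr/bin/env bash
# Hypothetical sandbox with one awkward file name.
cd "$(mktemp -d)" || exit 1
touch "my notes.txt" plain.txt

from_ls=( $(ls) )     # word-split: "my notes.txt" becomes two elements
from_glob=( * )       # one element per file, whatever the name contains

echo "ls:   ${#from_ls[@]} elements"
echo "glob: ${#from_glob[@]} elements"
```

The command-substitution version ends up with three elements for two files, while the plain glob yields exactly two.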

place a multi-line output inside a variable

I'm writing a script in bash and I want it to execute a command and to handle each line separately. for example:
LINES=$(df)
echo $LINES
it will return all the output converting new lines with spaces.
example:
if the output was supposed to be:
1
2
3
then I would get
1 2 3
how can I place the output of a command into a variable allowing new lines to still be new lines so when I print the variable i will get proper output?
In bash, an unquoted $v is asking for trouble in most cases. Almost always what you really mean is "$v" in double quotes:
LINES="$(df)"
echo "$LINES"
No, it will not. The $(something) only strips trailing newlines.
The unquoted expansion in the argument to echo is split on whitespace, and then echo joins the separate arguments with spaces. To preserve the whitespace, you need to quote again:
echo "$LINES"
Note that the assignment does not need to be quoted; the result of expansion is not word-split in an assignment to a variable or in the argument to case. But it can be quoted, and it's easier to just learn to always put the quotes in.
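A small sketch of both points. printf stands in for df so the output is predictable, and the lowercase variable name and the read loop at the end are additions for illustration, not part of the question:

```bash
#!/usr/bin/env bash
# printf stands in for a real command such as df.
# A lowercase name is used because bash itself uses LINES for the terminal height.
lines=$(printf '1\n2\n3\n')

unquoted=$(echo $lines)      # word-split, then re-joined with single spaces
quoted=$(echo "$lines")      # newlines preserved

echo "$unquoted"
echo "$quoted"

# Hypothetical follow-up: handle each line separately.
while IFS= read -r line; do
    printf 'line: %s\n' "$line"
done <<< "$lines"
```

The newlines are in the variable all along; they only disappear when the expansion is left unquoted.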

Quoting vs not quoting the variable on the RHS of a variable assignment

In shell scripting, what is the difference between these two when assigning one variable to another:
a=$b
and
a="$b"
and when should I use one over the other?
I think there is no big difference here. Yes, it is advisable to enclose a variable in double quotes when that variable is being referenced. However, $x does not seem to be referenced here in your question.
y=$x does not by itself affect how whitespaces will be handled. It is only when $y is actually used that quoting matters. For example:
$ x=" a b "
$ y=$x
$ echo $y
a b
$ echo "$y"
a b
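Since leading and trailing spaces are hard to see in terminal output, the same comparison can be sketched with brackets around each printed word. The variable z is an extra name introduced here for the quoted variant:

```bash
#!/usr/bin/env bash
x="  a   b  "
y=$x            # unquoted RHS
z="$x"          # quoted RHS

[ "$y" = "$z" ] && echo "assignments are identical"
printf '[%s]\n' "$y"     # quoted use: whitespace intact, one word
printf '[%s]\n' $y       # unquoted use: split into separate words
```

Both assignments store the same string; the difference only appears at the point of use.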
From section 2.9.1 of the POSIX shell syntax specification:
Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.
String-splitting and globbing (the steps which double quotes suppress) are not in this list.
Thus, the quotes are superfluous in all simple assignments (not speaking here to those implemented with arguments to declare, export or similar commands) except those where (1) the behavior of single-quoted, not double-quoted, strings are desired; or (2) whitespace or other content in the value would be otherwise parsed as syntactic rather than literal.
(Note that the decision on how to parse a command -- thus, whether it is an assignment, a simple command, a compound command, or something else -- takes place before parameter expansions; thus, var=$1 is determined to be an assignment before the value of $1 is ever considered! Were this untrue, such that data could silently become syntax, it would be far more difficult -- if not impossible -- to write secure code handling untrusted data in bash).
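The two exceptions can be sketched as follows. The variable names and values are arbitrary choices for the demo:

```bash
#!/usr/bin/env bash
b="two words"

a=$b                 # fine: the expanded value is not re-parsed as syntax
printf '[%s]\n' "$a"

# Literal whitespace in the source IS syntax, so it must be quoted or escaped.
c="two words"
# c=two words        # would try to run a command named "words" with c=two in its environment
printf '[%s]\n' "$c"

# Single quotes give different behaviour: no expansion at all.
d='$b'
printf '[%s]\n' "$d"
```

The whitespace inside the value of $b is data, so a=$b is safe, whereas whitespace typed directly on the right-hand side is parsed by the shell before any assignment happens.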
There are no (good) reasons to double quote the RHS of a variable assignment when used as a statement on its own.
The RHS of an assignment statement is not subject to word splitting (or brace expansion), etc. so cannot need quotes to assign correctly. All other expansions (as far as I'm aware) do occur on the RHS but also occur in double quotes so the quoting serves no purpose.
That being said, there are reasons not to quote the RHS. Namely, see how to address error "bash: !d': event not found" in Bash command substitution (specifically, see my answer and rici's answer).
Here are some other examples (with two files, t.sh and file, in the current directory):
a='$(ls)' # no command substitution
b="$(ls)" # command substitution, no word splitting
c='*' # no filename expansion
d="*" # no filename expansion
e=* # no filename expansion
f=$a # no expansions or splittings
g="$a" # no expansions or splittings
h=$d # no expansions or splittings
echo ---'$a'---
echo $a # no command substitution
echo ---'$b'---
echo $b # word splitting
echo ---'"$b"'---
echo "$b" # no word splitting
echo ---'$c'---
echo $c # filename expansion, word splitting
echo ---'"$c"'---
echo "$c" # no filename expansion, no word splitting
echo ---'$d'---
echo $d # filename expansion, word splitting
echo ---'"$d"'---
echo "$d" # no filename expansion, no word splitting
echo ---'"$e"'---
echo "$e" # no filename expansion, no word splitting
echo ---'$e'---
echo $e # filename expansion, word splitting
echo ---'"$f"'---
echo "$f" # no filename expansion, no word splitting
echo ---'"$g"'---
echo "$g" # no filename expansion, no word splitting
echo ---'$h'---
echo $h # filename expansion, word splitting
echo ---'"$h"'---
echo "$h" # no filename expansion, no word splitting
Output:
---$a---
$(ls)
---$b---
file t.sh
---"$b"---
file
t.sh
---$c---
file t.sh
---"$c"---
*
---$d---
file t.sh
---"$d"---
*
---"$e"---
*
---$e---
file t.sh
---"$f"---
$(ls)
---"$g"---
$(ls)
---$h---
file t.sh
---"$h"---
*
One interesting thing to notice is that command substitution occurs in a variable assignment when the RHS is given explicitly as "$(ls)" in double quotes, but not when the text $(ls) only arrives implicitly through the value of "$a".
Advanced Bash-Scripting Guide: Chapter 5: Quoting
When referencing a variable, it is generally advisable to enclose its name in double quotes. This prevents reinterpretation of all special characters within the quoted string.
Use double quotes to prevent word splitting. An argument enclosed in double quotes presents itself as a single word, even if it contains whitespace separators.

Resources