how to address files by their suffix - bash

I am trying to copy a .nii file (Gabor3.nii) path to a variable but even though the file is found by the find command, I can't copy the path to the variable.
find . -type f -name "*.nii"
Data= '/$PWD/"*.nii"'
output:
./Gabor3.nii
./hello.sh: line 21: /$PWD/"*.nii": No such file or directory

What went wrong
You show that you're using:
Data= '/$PWD/"*.nii"'
The space after the = means that the Data= part sets the variable Data to an empty string in the environment of a command, and the shell then attempts to run '/$PWD/"*.nii"' as that command. The single quotes mean that nothing between them is expanded, so the shell looks for a file literally named "*.nii" (double quotes included) inside a directory literally named $PWD (that's a directory name of $, P, W, D) in the root directory. No such file exists, hence the error message.
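The effect of the stray space can be reproduced in isolation. This is only a minimal sketch: demo_var is a hypothetical variable name, and the true builtin stands in for the command that the shell tried to run.

```shell
#!/usr/bin/env bash
demo_var="before"

# With a space after "=", this means "run `true` with demo_var set to the
# empty string in its environment", not "assign to demo_var in this shell".
demo_var= true

echo "demo_var is still: $demo_var"
```

The assignment only applies to the one command, so the variable in the current shell keeps its old value.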
Using arrays
OK; that's what's wrong. What's right?
You have a couple of options. The most reliable is to use an array assignment and shell expansion:
Data=( "$PWD"/*.nii )
The parentheses (note the absence of spaces before the opening parenthesis; that's crucial) make it an array assignment. Using shell globbing gives a list of names, preserving spaces etc. in the names correctly. Using double quotes around "$PWD" ensures that the expansion is correct even if there are spaces in the current directory name.
You can find out how many files there are in the list with:
echo "${#Data[@]}"
You can iterate over the list of file names with:
for file in "${Data[@]}"
do
echo "File is [$file]"
ls -l "$file"
done
Note that variable references must be in double quotes for names with spaces to work correctly. The "${Data[@]}" notation has parallels with "$@", which also preserves spaces in the arguments to the command. There is a "${Data[*]}" variant which behaves analogously to "$*", and is of similarly limited value.
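The difference between the two notations is easy to see by counting arguments. A small sketch, where count_args is a hypothetical helper and the file names are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

Data=( "my scan.nii" "Gabor3.nii" )

at_count=$(count_args "${Data[@]}")    # one argument per array element
star_count=$(count_args "${Data[*]}")  # all elements joined into one argument

echo "[@] passes $at_count arguments, [*] passes $star_count"
```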
If you're worried that there might not be any files with the extension, then use shopt -s nullglob to expand the globbing expression into an empty list rather than the unexpanded expression which is the historical default. You can unset the option with shopt -u nullglob if necessary.
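To see what nullglob changes, the same pattern can be expanded twice in a directory that has no matching files. This is a sketch only; the scratch directory comes from mktemp and is an assumption of the example, not part of the original problem.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)                 # empty directory, so nothing matches

shopt -u nullglob
default_list=( "$scratch"/*.nii )    # the pattern is kept as one literal word

shopt -s nullglob
nullglob_list=( "$scratch"/*.nii )   # the pattern expands to nothing
shopt -u nullglob

echo "default: ${#default_list[@]} element(s), nullglob: ${#nullglob_list[@]} element(s)"
rm -rf "$scratch"
```

With the default setting, a loop over the array would run once with the unexpanded pattern as the "file name", which is usually not what you want.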
Alternatives
Alternatives involve things like using command substitution Data=$(ls "$PWD"/*.nii), but this is vastly inferior to using an array unless neither the path in $PWD nor the file names contain any spaces, tabs or newlines. If there is no white space in the names, it works OK; you can iterate over:
for file in $Data
do
echo "No white space [$file]"
ls -l "$file"
done
but this is altogether less satisfactory if there are (or might be) any white space characters around.
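Here is a rough comparison of the two approaches when one name does contain a space. The file names and the scratch directory are invented for the example, and it assumes the temporary directory path itself contains no white space.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
touch "$scratch/my scan.nii" "$scratch/Gabor3.nii"

# Command substitution: the list is one string, split on white space later.
listing=$(ls "$scratch"/*.nii)
split_count=0
for file in $listing; do
    split_count=$((split_count + 1))
done

# Array assignment: one element per file, spaces preserved.
files=( "$scratch"/*.nii )

echo "word-split loop saw $split_count items, array holds ${#files[@]} files"
rm -rf "$scratch"
```

The loop over the plain variable sees "my" and "scan.nii" as two separate items, neither of which is a real file.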

You can use command substitution:
Data=$(find . -type f -name "*.nii" -print -quit)
The -quit option makes find stop searching after the first file is found, which prevents multi-line output. Omit it if you're sure only one file will be found, or if you want to process multiple files.

The syntax to do what you seem to be trying to do with:
Data= '/$PWD/"*.nii"'
would be:
Data="$(ls "$PWD"/*.nii)"
Not saying it's the best approach for whatever you want to do next of course, it's probably not...

Related

Can't run for loops inside script command over ssh connection

I'm trying to run a for loop after using the command
script
to output the command-terminal copy to a txt file (for further checking).
This is all being done over an SSH connection on solar-putty.
This is my code:
filename=$(ls /home/*.txt | xargs -n1 -I{} basename "{}" | head -3)
echo "$filename"
script /home/test.txt
for f in $filename; do
echo $f; done
exit
Which does not initiate the for loop. It simply logs in the command above and I can't execute it.
When I run:
for f in $filename; do
echo $f; done
Everything works fine...
I'm using all of this inside a TMUX terminal as sudo su (because I'm afraid of losing my terminal over SSH and I need sudo su)
If I understand what you're doing, the problem is that script is starting a new shell (as a subprocess), and it doesn't have the old (parent process) shell's variables. Can you define the variable after starting script, so it's defined in the right shell?
Another possible solution is to export the variable, which converts it from a shell variable to an environment variable, and subprocesses will inherit a copy of it. Note that, depending on which shell you're using, you may need to double-quote the value being assigned to avoid problems with word-splitting:
export filename="$(ls /home/*.txt | xargs -n1 -I{} basename "{}" | head -3)"
BTW, this way of handling lists of filenames will run into trouble with names that have spaces or some other shell metacharacters. The right way to handle lists of filenames is to store them as arrays, but unfortunately it's not possible to export arrays.
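The difference between a shell variable and an exported one can be checked with a child shell. A minimal sketch, assuming bash is on the PATH; demo_list is a hypothetical variable name standing in for filename:

```shell
#!/usr/bin/env bash
unset demo_list                  # make sure it is not already in the environment
demo_list="a.txt b.txt"

# A child shell does not see a plain shell variable...
before=$(bash -c 'echo "${demo_list:-<unset>}"')

# ...but it does inherit a copy once the variable is exported.
export demo_list
after=$(bash -c 'echo "${demo_list:-<unset>}"')

echo "before export: $before"
echo "after export:  $after"
```

The shell started by script is a child process in the same sense, which is why the loop there found nothing to iterate over.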
[EDIT:] The problem with filenames with spaces and/or other weird characters is that 1) the way ls outputs filenames is ambiguous and inconsistent, and 2) shell "word splitting" on unquoted variables can parse lists of filenames in ... unfortunate ... ways. For an extreme example, suppose you had a file named /home/this * that.txt -- if that's in a variable, and you use the variable without double-quotes around it, it'll treat /home/this and that.txt as totally separate things, and it'll also expand the * into a list of filenames in the current directory. See this question from yesterday for just one of many examples of this sort of thing happening for real.
To safely handle filenames with weird characters, the basic rules are that to get lists of files you use raw shell wildcards (not ls!) or find with -exec or -print0, always store lists of filenames in arrays (not plain variables), and double-quote all variable (/array) references. See BashFAQ #20: "How can I find and safely handle file names containing newlines, spaces or both?"
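As an illustration of the find with -print0 route, here is one common way to collect the results into an array. It is a sketch under assumptions: the scratch directory and file names are made up, and it relies on bash features (process substitution, read -d '').

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
touch "$scratch/plain.txt" "$scratch/with space.txt"

found=()
# -print0 separates names with NUL bytes, which cannot occur in a path;
# read -d '' consumes one NUL-terminated name at a time.
while IFS= read -r -d '' path; do
    found+=( "$path" )
done < <(find "$scratch" -type f -name '*.txt' -print0)

echo "found ${#found[@]} files"
for path in "${found[@]}"; do
    echo "[$path]"
done
rm -rf "$scratch"
```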
In this case, you just need to use a wildcard expression to make an array of paths, then the shell's builtin string manipulation to remove the path prefix:
filepaths=( /home/*.txt ) # Create array of matching files
filenames=( "${filepaths[@]##*/}" ) # Remove path prefixes
You can then use "${filenames[@]:0:3}" to get the first three names from the array. You can either create a new array with just the first three files, or use that directly in the loop:
first3files=( "${filenames[@]:0:3}" ) # ...or...
for f in "${filenames[@]:0:3}"; do
echo "$f" # Always double-quote variable references!
done
Note that bash doesn't allow stacking most array/variable modifiers, so getting the array of paths, stripping the prefixes, and selecting just the first few, must be done as three separate steps.

Multiple elements instead of one in bash script for loop

I have been following the answers given in these questions
Shellscript Looping Through All Files in a Folder
How to iterate over files in a directory with Bash?
to write a bash script which goes over files inside a folder and processes them. So, here is the code I have:
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in "$INFOLDER$YEAR*.mdb";
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done
I am receiving this error: basename: extra operand.
I added echo $f and I realized that f contains all the filenames separated by space. But I expected to get one at a time. What could be the problem here?
You're running into problems with quoting. In the shell, double-quotes prevent word splitting and wildcard expansion; generally, you don't want these things to happen to variables' values, so you should double-quote variable references. But when you have something that should be word-split or wildcard-expanded, it cannot be double-quoted. In your for statement, you have the entire file pattern in double-quotes:
for f in "$INFOLDER$YEAR*.mdb";
...which prevents word-splitting and wildcard expansion on the variables' values (good) but also prevents it on the * which you need expanded (that's the point of the loop). So you need to quote selectively, with the variables inside quotes and the wildcard outside them:
for f in "$INFOLDER$YEAR"*.mdb;
And then inside the loop, you should double-quote the references to $f in case any filenames contain whitespace or wildcards (which are completely legal in filenames):
echo "$f"
absname="$INFOLDER$YEAR$(basename "$f")"
(Note: the double-quotes around the assignment to absname aren't actually needed -- the right side of an assignment is one of the few places in the shell where it's safe to skip them -- but IMO it's easier and safer to just double-quote all variable references and $( ) expressions than to try to keep track of where it's safe and where it's not.)
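The two quoting styles can be compared side by side. This is just a sketch with an invented folder name (deliberately containing a space) under a scratch directory:

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
infolder="$scratch/my data/"
mkdir -p "$infolder"
touch "${infolder}a.mdb" "${infolder}b.mdb"

# Whole pattern quoted: no expansion, the loop runs once with the literal pattern.
quoted_all=0
for f in "$infolder*.mdb"; do
    quoted_all=$((quoted_all + 1))
done

# Variable quoted, wildcard outside the quotes: one iteration per file.
selective=0
for f in "$infolder"*.mdb; do
    selective=$((selective + 1))
done

echo "fully quoted: $quoted_all iteration(s), selectively quoted: $selective iteration(s)"
rm -rf "$scratch"
```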
Just quote your shell variables if they are supposed to contain strings with spaces in between.
basename "$f"
Not doing so will lead to the string being split into separate words (see WordSplitting in bash), which breaks the basename command, since it expects one string argument rather than several.
It would also be wise to keep the * outside the quotes, as shell globbing doesn't work inside them (single or double).
#!/bin/bash
# good practice to lower-case variable names to distinguish them from
# shell environment variables
year="2002/"
in_folder="/local/data/datasets/Convergence/"
for file in "${in_folder}${year}"*.mdb; do
# skip the unexpanded pattern if no files are found
[ -e "$file" ] || continue
echo "$file"
# Worth noting here: $file already holds the name of the file
# with its full path, just as below. You don't need to
# construct it manually
absname=${in_folder}${year}$(basename "$file")
done
Just remove the double quotes from this line
for f in "$INFOLDER$YEAR*.mdb";
so it looks like this
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in $INFOLDER$YEAR*.mdb;
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done

Produce a file that contains names of all empty subfolders

I want to write a script that takes a name of a folder as a command line argument and produces a file that contains the names of all subfolders with size 0 (empty subfolder). This is what I got:
#!/bin/bash
echo "Name of a folder'
read FOLDER
for entry in "$search_dir"/*
do
echo "$entry"
done
Your script doesn't have the logic you intended. The find command has a feature for this:
$ find path/to/dir -type d -empty
will print empty directories starting from the given path/to/dir
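Wrapped up so that the result ends up in a file, as the question asks, it could look roughly like this. The function name list_empty_dirs and the output file name are hypothetical, and -mindepth and -empty are assumed to be available (both GNU and BSD find provide them, but they are not in POSIX).

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the empty subdirectories of $1 to the file $2.
list_empty_dirs() {
    find "$1" -mindepth 1 -type d -empty > "$2"
}

# Demonstration in a scratch directory with one empty and one non-empty folder.
scratch=$(mktemp -d)
mkdir "$scratch/empty" "$scratch/full"
touch "$scratch/full/file.txt"

list_empty_dirs "$scratch" "$scratch/empty_dirs.txt"
result=$(cat "$scratch/empty_dirs.txt")
echo "$result"
rm -rf "$scratch"
```

In a real script you would pass "$1" (the command line argument) as the folder instead of the scratch directory.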
I would suggest you accept the answer which suggests to use find instead. But just to be complete, here is some feedback on your code.
You read the input directory into FOLDER but then never use this variable.
As an aside, don't use uppercase for your private variables; this is reserved for system variables.
You have unpaired quotes in the prompt string. If the opening quote is double, you need to close with a double quote, or vice versa for single quotes.
You loop over directory entries, but do nothing to isolate just the ones which are directories, let alone empty directories.
Finally, nothing in your script uses Bash-only facilities, so it would be safe and somewhat more portable to use #!/bin/sh
Now, looping over directories can be done by using search_dir/*/ instead of just search_dir/*; and finding out which ones are empty can be done by checking whether a wildcard within the directory stays unexpanded. (This assumes default globbing behavior -- with nullglob you would make a wildcard with no matches expand to an empty list, but this is problematic in some scenarios so it's not the default.)
#!/bin/bash
# read -p is not POSIX
read -r -p "Name of a folder: " search_dir
for dir in "$search_dir"/*/
do
    # Skip the unexpanded pattern when there are no subdirectories
    [ -d "$dir" ] || continue
    # Arrays are Bash only; $dir already ends with a slash
    entries=( "$dir"* )
    # An unmatched wildcard is left as a single literal word,
    # which does not name an existing file
    if [ "${#entries[@]}" -eq 1 ] && [ ! -e "${entries[0]}" ]; then
        echo "$dir"
    fi
done
Putting the wildcard directly inside [ is problematic because it is not prepared to deal with a wildcard expansion -- you get "too many arguments" if the wildcard expands into more than one filename -- and the Bash replacement [[ performs no wildcard expansion on its operands at all, so I'm capturing the expansion in an array and testing that instead. Note that * does not match hidden files by default, so a directory containing only dotfiles is reported as empty. Alternatively, you could use case, which I would actually prefer here; but I've stuck to if in order to make only minimal changes to your script.

macOS – Creating folders based on part of a filename

I'm running macOS and looking for a way to quickly sort thousands of jpg files. I need to create folders based on part of filenames and then move those files into it.
Simply, I want to put these files:
x_not_relevant_part_of_name.jpg
x_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
Into these folders:
x
y
Keep in mind that length of "x" and "y" part of name may be different.
Is there an automatic solution for that in macOS?
I've tried using Automator and Terminal, but I'm not a programmer so I haven't done well.
I would back up the files first to somewhere safe in case it all goes wrong. Then I would install homebrew and then install rename with:
brew install rename
Then you can do what you want with this:
rename --dry-run -p 's|(^[^_]*)|$1/$1|' *.jpg
If that looks correct, remove the --dry-run and run it again.
Let's look at that command.
--dry-run means just say what the command would do without actually doing anything
-p means create any intermediate paths (i.e. directories) as necessary
's|...|' I will explain in a moment
*.jpg means to run the command on all JPG files.
The funny bit in single quotes is actually a substitution. In its simplest form it is s|a|b|, which means substitute thing a with b. In this particular case, the a is a caret (^), which means start of filename, followed by [^_]*, which means any number of things that are not underscores. As I have surrounded that with parentheses, I can refer back to it in the b part as $1, since it is the first thing in parentheses in a. The b part means "whatever was before the underscore", followed by a slash and "whatever was before the underscore" again.
Using find with bash Parameter Substitution in Terminal would likely work:
find . -type f -name "*jpg" -maxdepth 1 -exec bash -c 'mkdir -p "${0%%_*}"' {} \; \
-exec bash -c 'mv "$0" "${0%%_*}"' {} \;
This uses bash Parameter Substitution with find to create directories (if they don't already exist) named after the prefix of each filename matching *jpg, that is, the characters before the first underscore (_), and then moves the matching files into the appropriate directory. To use the command, simply cd into the directory you would like to organize. Keep in mind that without the -maxdepth option, find also descends into the folders it has just created, so running the command multiple times can produce more folders; -maxdepth 1 limits the command to the current directory.
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded
value of parameter, then the result of the expansion is the value of
parameter with the shortest matching pattern (the ‘%’ case) or the
longest matching pattern (the ‘%%’ case) deleted.
↳ GNU Bash : Shell Parameter Expansion
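For comparison, the same ${parameter%%word} expansion can be used in a plain loop without find or rename. This is only a sketch: sort_by_prefix is a hypothetical helper name, the file names are invented, and as with the other approaches it is worth trying it on a copy of the files first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move every NAME_rest.jpg in directory $1 into $1/NAME/.
sort_by_prefix() {
    local dir=$1 file name prefix
    for file in "$dir"/*_*.jpg; do
        [ -f "$file" ] || continue      # the pattern matched nothing
        name=${file##*/}                # strip the directory part
        prefix=${name%%_*}              # keep what precedes the first underscore
        [ -n "$prefix" ] || continue    # skip names that start with "_"
        mkdir -p "$dir/$prefix" && mv "$file" "$dir/$prefix/"
    done
}

# Demonstration in a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/x_one.jpg" "$scratch/x_two.jpg" "$scratch/y_one.jpg"
sort_by_prefix "$scratch"
moved=$(cd "$scratch" && find . -type f | sort)
echo "$moved"
rm -rf "$scratch"
```

To use it on real files you would call sort_by_prefix with the folder that holds the JPGs, for example sort_by_prefix . after a cd into it.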

Expand part of the path in bash script

I am trying to list all files located in specific sub-directories of a directory in my bash script. Here is what I tried.
root_dir="/home/shaf/data/"
sub_dirs_prefixes=('ab' 'bp' 'cd' 'cn' 'll' 'mr' 'mb' 'nb' 'nh' 'nw' 'oh' 'st' 'wh')
ls "$root_dir"{$(IFS=,; echo "${sub_dirs_prefixes[*]}")}"rrc/"
Please note that I do not want to expand value stored in $root_dir as it may contain spaces but I do want to expand sub-path contained in {} which is a comma delimited string of contents of $sub_dirs_prefixes. I stored sub-directories prefixes in an array variable, $sub_dirs_prefixes , because I have to use them later on for something else.
I am getting following error:
ls: cannot access /home/shaf/data/{ab,bp,cd,cn,ll,mr,mb,nb,nh,nw,oh,st,wh}rrc/
If I copy the path in error message and run ls from command line, it does display contents of listed sub-directories.
You can use command substitution to generate an extended pattern.
shopt -s extglob
ls "$root_dir"/$(IFS="|"; echo "@(${sub_dirs_prefixes[*]})rrc")
By the time parameter and command substitutions have completed, the shell sees this just before performing pathname expansion:
ls "/home/shaf/data/"/@(ab|bp|cd|cn|ll|mr|mb|nb|nh|nw|oh|st|wh)rrc
The @(...) pattern matches one of the enclosed prefixes.
It gets a little trickier if the components of the directory names contain characters that need to be quoted, since we aren't quoting the command substitution.
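If quoting is a concern, one alternative (a different technique from the extended pattern, not a variation of it) is to build the list of directories in an array, so that every path stays quoted. A sketch using a scratch directory and a shortened prefix list as stand-ins for the real data:

```shell
#!/usr/bin/env bash
# Scratch layout standing in for /home/shaf/data/ (assumption for the demo).
scratch=$(mktemp -d)
mkdir "$scratch/abrrc" "$scratch/bprrc" "$scratch/zzrrc"
touch "$scratch/abrrc/one" "$scratch/bprrc/two"

root_dir="$scratch/"
sub_dirs_prefixes=( 'ab' 'bp' )

# Build one fully quoted path per prefix, keeping only existing directories.
targets=()
for prefix in "${sub_dirs_prefixes[@]}"; do
    candidate="${root_dir}${prefix}rrc/"
    [ -d "$candidate" ] && targets+=( "$candidate" )
done

listing=$(ls "${targets[@]}")
echo "$listing"
rm -rf "$scratch"
```

Note that if no directory matches, targets is empty and ls falls back to listing the current directory, so a real script would want to check "${#targets[@]}" first.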

Resources