script-file vs command-line: rsync and --exclude - bash

I have a simple test bash script which looks like that:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
$cmd # execute command
When I run the script, it also copies the files ending with ~ even though I meant to exclude them. When I run the very same rsync command directly from the command line, it works! Does anyone know why, and how to make the bash script work?
Btw, I know that I can also work with --exclude-from but I want to know how this works anyway.

Try eval:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
eval $cmd # execute command

The problem isn't that you're running it in a script; it's that you put the command in a variable and then run the expanded variable. Since variable expansion happens after quote removal has already been done, the single quotes around your exclude pattern never get removed, and rsync winds up excluding files with names starting with ' and ending with ~'. To fix this, just remove the quotes around the pattern (the whole thing is already in double quotes, so they aren't needed):
#!/bin/bash
cmd="rsync -rv --exclude *~ ./dir ./new"
$cmd # execute command
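To see why the original version misbehaves, print the words the shell actually hands to rsync; a quick sketch:
printf '<%s> ' $cmd; echo
With the single-quoted pattern still in cmd, this prints <rsync> <-rv> <--exclude> <'*~'> <./dir> <./new>, quotes and all (assuming nothing in the current directory happens to match that odd pattern). Note that the unquoted *~ version above is itself subject to pathname expansion when $cmd is expanded, so it relies on nothing in the current directory matching *~; the plain command or the array sketch below is more robust.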
...speaking of which, why are you putting the command in a variable before running it? In general, this is a good way to make code more confusing than it needs to be, and to trigger parsing oddities (some even weirder than this). So how about:
#!/bin/bash
rsync -rv --exclude '*~' ./dir ./new
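If you really do need to keep the command in a variable, a Bash array preserves each word exactly as you wrote it; a minimal sketch:
#!/bin/bash
cmd=(rsync -rv --exclude '*~' ./dir ./new)
"${cmd[@]}" # execute command
The quotes are processed at assignment time, so each array element becomes exactly one argument and the pattern reaches rsync intact.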

You can use a simple --exclude '*~' because (according to the man page):
if the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions. Thus "/foo" would match a name of "foo" at either the "root of the transfer" (for a global rule) or in the merge-file's directory (for a per-directory rule). An unqualified "foo" would match a name of "foo" anywhere in the tree because the algorithm is applied recursively from the top down; it behaves as if each path component gets a turn at being the end of the filename. Even the unanchored "sub/foo" would match at any point in the hierarchy where a "foo" was found within a directory named "sub". See the section on ANCHORING INCLUDE/EXCLUDE PATTERNS for a full discussion of how to specify a pattern that matches at the root of the transfer.
if the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.
rsync chooses between doing a simple string match and wildcard matching by checking if the pattern contains one of these three wildcard characters: '*', '?', and '['.
a '*' matches any path component, but it stops at slashes.
use '**' to match anything, including slashes.
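In short, an unanchored pattern is matched at every level of the tree, while a leading slash anchors it to the transfer root; a quick sketch (the tmp directory is hypothetical):
rsync -rv --exclude '*~' ./dir ./new # excludes any name ending in ~, anywhere in the tree
rsync -rv --exclude '/tmp/' ./dir ./new # excludes only a tmp directory at the root of the transfer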

Related

move command with regular expressions

Bash is not recognizing the regular expression in this mv command:
mv ../downloads'^[exam].*$[.pdf] ../physics2400/exams
I'm trying to move files from a download directory to what ever directory I have made for them to go into.
An example of such a file is 'Exam 2 Practice Homework (Solutions).pdf'
(the single quotes are part of the file in Bash, apparently).
There are many other files in the download folder hence the regex or the attempt anyway.
When performing filename expansion, Bash does not use regular expressions. Instead, a type of pattern matching referred to as globbing is used. This is discussed in the Filename Expansion section of the Bash manual.
In regards to your example file name (Exam 2 Practice Homework (Solutions).pdf), here are a couple things to note:
the single quotes are not part of the file name, but are a convenience to avoid having to escape special characters in the filename (i.e. the spaces and the parentheses). Without the quotes, the filename would be specified Exam\ 2\ Practice\ Homework\ \(Solutions\).pdf. See the Quoting section of the Bash manual for further details.
filesystems in Unix-like operating systems are case sensitive, so you need to account for the upper case E the filename starts with
Here's a pattern matching expression that would match your example filename as well as other files that start with Exam and end with .pdf.
mv ../downloads/Exam*.pdf ../physics2400/exams
If you have files that start with both Exam and exam, you could account for both with the following:
mv ../downloads/[Ee]xam*.pdf ../physics2400/exams
The bracketed expression is interpreted as "matches any one of the enclosed characters". This allows you to account for both upper and lower case.
Before executing such mv commands, I would test the filename expansion by running ls to verify that the intended files are matched:
ls ../downloads/[Ee]xam*.pdf
If you want to use a regular expression, how about this?
find ./downloads -regex '.*\.pdf' -exec mv '{}' exams/ \;
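Note that this moves every PDF, not just the exams. If your find supports the -iregex extension (GNU find has it), a case-insensitive sketch that narrows the match:
find ../downloads -maxdepth 1 -iregex '.*exam.*\.pdf' -exec mv '{}' ../physics2400/exams/ \;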

how to address files by their suffix

I am trying to copy the path of a .nii file (Gabor3.nii) to a variable, but even though the file is found by the find command, I can't copy the path to the variable.
find . -type f -name "*.nii"
Data= '/$PWD/"*.nii"'
output:
./Gabor3.nii
./hello.sh: line 21: /$PWD/"*.nii": No such file or directory
What went wrong
You show that you're using:
Data= '/$PWD/"*.nii"'
The space means that the Data= part sets the environment variable $Data to an empty string for the command that follows, and then attempts to run '/$PWD/"*.nii"'. The single quotes mean that what is between them is not expanded, and you don't have a directory /$PWD (that's a directory literally named $PWD in the root directory), so the script "*.nii" isn't found in it, hence the error message.
Using arrays
OK; that's what's wrong. What's right?
You have a couple of options. The most reliable is to use an array assignment and shell expansion:
Data=( "$PWD"/*.nii )
The parentheses (note the absence of a space before the (, which is crucial) make it an array assignment. Using shell globbing gives a list of names, preserving spaces etc. in the names correctly. Using double quotes around "$PWD" ensures that the expansion is correct even if there are spaces in the current directory name.
You can find out how many files there are in the list with:
echo "${#Data[#]}"
You can iterate over the list of file names with:
for file in "${Data[@]}"
do
echo "File is [$file]"
ls -l "$file"
done
Note that variable references must be in double quotes for names with spaces to work correctly. The "${Data[@]}" notation has parallels with "$@", which also preserves spaces in the arguments to the command. There is a "${Data[*]}" variant which behaves analogously to "$*", and is of similarly limited value.
If you're worried that there might not be any files with the extension, then use shopt -s nullglob to expand the globbing expression into an empty list rather than the unexpanded expression which is the historical default. You can unset the option with shopt -u nullglob if necessary.
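For instance, a minimal sketch guarding the assignment:
shopt -s nullglob
Data=( "$PWD"/*.nii )
shopt -u nullglob
echo "${#Data[@]} matching file(s)"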
Alternatives
Alternatives involve things like using command substitution Data=$(ls "$PWD"/*.nii), but this is vastly inferior to using an array unless neither the path in $PWD nor the file names contain any spaces, tabs, or newlines. If there is no white space in the names, it works OK; you can iterate over:
for file in $Data
do
echo "No white space [$file]"
ls -l "$file"
done
but this is altogether less satisfactory if there are (or might be) any white space characters around.
You can use command substitution:
Data=$(find . -type f -name "*.nii" -print -quit)
To prevent multiline output, the -quit option stops searching after the first file is found (omit it if you're sure only one file will be found, or if you want to process multiple files).
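If you do want to process every match, and do it safely even with whitespace in the names, a common Bash sketch is to read find's NUL-delimited output:
find . -type f -name '*.nii' -print0 |
while IFS= read -r -d '' file
do
echo "Found [$file]"
done
Note that the while loop runs in a subshell here, so variables set inside it won't be visible after the loop.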
The syntax to do what you seem to be trying to do with:
Data= '/$PWD/"*.nii"'
would be:
Data="$(ls "$PWD"/*.nii)"
Not saying it's the best approach for whatever you want to do next of course, it's probably not...

Produce a file that contains names of all empty subfolders

I want to write a script that takes a name of a folder as a command line argument and produces a file that contains the names of all subfolders with size 0 (empty subfolder). This is what I got:
#!/bin/bash
echo "Name of a folder'
read FOLDER
for entry in "$search_dir"/*
do
echo "$entry"
done
Your script doesn't have the logic you intended. The find command has a feature for this:
$ find path/to/dir -type d -empty
This will print the empty directories starting from the given path/to/dir.
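Wrapped up the way the question asked, with the folder as a command-line argument and the names written to a file (the output file name is just a placeholder):
#!/bin/sh
# usage: ./empty_dirs.sh path/to/dir
find "$1" -type d -empty > empty_folders.txt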
I would suggest you accept the answer that suggests using find instead. But just to be complete, here is some feedback on your code.
You read the input directory into FOLDER but then never use this variable.
As an aside, don't use uppercase for your private variables; this is reserved for system variables.
You have unpaired quotes in the prompt string. If the opening quote is double, you need to close with a double quote, or vice versa for single quotes.
You loop over directory entries, but do nothing to isolate just the ones which are directories, let alone empty directories.
Finally, nothing in your script uses Bash-only facilities, so it would be safe and somewhat more portable to use #!/bin/sh
Now, looping over directories can be done by using "$search_dir"/*/ instead of just "$search_dir"/*; and finding out which ones are empty can be done by checking whether a wildcard inside the directory matches anything, or is left behind as the unexpanded pattern. (This assumes default globbing behavior -- with nullglob you would make a wildcard with no matches expand to an empty list, but this is problematic in some scenarios, so it's not the default.)
#!/bin/bash
# read -p is not POSIX
read -p "Name of a folder: " search_dir
for dir in "$search_dir"/*/
do
# Pathname expansion does not happen inside [[ ... ]], so expand
# the wildcard into an array first ($dir already ends with a slash)
contents=("$dir"*)
# [[ is Bash only
if [[ ${contents[*]} = "$dir*" ]]; then # Notice tricky quoting
echo "$dir"
fi
done
Using the wildcard expansion with [ is problematic because it is not prepared to deal with a wildcard expansion -- you get "too many arguments" if the wildcard expands into more than one filename. And [[ won't perform the expansion at all, which is why the wildcard is captured into an array first; the Bash-only [[ then compares it against the quoted (hence literal) pattern, which only matches when the glob found nothing and was left unexpanded. Alternatively, you could use case, which I would actually prefer here; but I've stuck to if in order to make only minimal changes to your script.
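For completeness, the case variant of the same trick might look like this (a sketch):
contents=("$dir"*)
case ${contents[*]} in
"$dir*") echo "$dir";; # the glob matched nothing: the directory is empty
esac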

macOS – Creating folders based on part of a filename

I'm running macOS and looking for a way to quickly sort thousands of jpg files. I need to create folders based on part of filenames and then move those files into it.
Simply, I want to put these files:
x_not_relevant_part_of_name.jpg
x_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
Into these folders:
x
y
Keep in mind that length of "x" and "y" part of name may be different.
Is there an automatic solution for that in macOS?
I've tried using Automator and Terminal, but I'm not a programmer, so I haven't done well.
I would back up the files first to somewhere safe in case it all goes wrong. Then I would install homebrew and then install rename with:
brew install rename
Then you can do what you want with this:
rename --dry-run -p 's|(^[^_]*)|$1/$1|' *.jpg
If that looks correct, remove the --dry-run and run it again.
Let's look at that command.
--dry-run means just say what the command would do without actually doing anything
-p means create any intermediate paths (i.e. directories) as necessary
's|...|' I will explain in a moment
*.jpg means to run the command on all JPG files.
The funny bit in single quotes is actually a substitution, in its simplest form it is s|a|b| which means substitute thing a with b. In this particular case, the a is caret (^) which means start of filename and then [^_]* means any number of things that are not underscores. As I have surrounded that with parentheses, I can refer back to it in the b part as $1 since it is the first thing in parentheses in a. The b part means "whatever was before the underscore" followed by a slash and "whatever was before the underscore again".
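For example, given x_not_relevant_part_of_name.jpg, $1 captures x, the replacement produces x/x_not_relevant_part_of_name.jpg, and -p creates the x folder along the way.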
Using find with bash Parameter Substitution in Terminal would likely work:
find . -maxdepth 1 -type f -name "*.jpg" -exec bash -c 'mkdir -p "${0%%_*}"' {} \; \
-exec bash -c 'mv "$0" "${0%%_*}"' {} \;
This uses Bash parameter substitution with find to create directories (if they don't already exist) from the prefix of each matching filename, i.e. the characters before the first underscore (_), and then moves each matching file into the appropriate directory. To use the command, simply cd into the directory you would like to organize. Keep in mind that -maxdepth 1 stops find from descending into the folders it just created; without it, running the command multiple times can produce more and more nested folders.
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted.
↳ GNU Bash : Shell Parameter Expansion
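For comparison, the same %% expansion in a plain Bash loop, run from inside the folder with the JPGs (a sketch):
for f in *_*.jpg
do
dir=${f%%_*} # everything before the first underscore
mkdir -p "$dir" && mv "$f" "$dir/"
done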

Why does grep ignore the shell variable containing directories to be ignored?

On Mac OS X, I have a bash script like this:
# Directories excluded from grep go here.
EXCLUDEDIR="--exclude-dir={node_modules,.git,tmp,angular*,icons,server,coffee}"
# This grep needs to include one line below the hit.
grep -iIrn -A1 $EXCLUDEDIR -e "class=[\"\']title[\"\']>$" -e "<div class=\"content" . > microcopy.txt
but it seems to be ignoring $EXCLUDEDIR. If I simply use the --exclude-dir directly, it works. Why won't it expand the variable and work right?
The braces are technically an error. When they are in a variable, they are included verbatim, whereas when you type them directly as part of the command, Bash performs brace expansion and effectively removes the braces from your expression.
bash$ echo --exclude-dir=moo{bar,baz}
--exclude-dir=moobar --exclude-dir=moobaz
bash$ x='moo{bar,baz}'
bash$ echo --exclude-dir=$x
--exclude-dir=moo{bar,baz}
The (not so simple) workaround is to list your parameters explicitly instead. This can be somewhat simplified by using an array to list the directory names you want to exclude (but this is not portable to legacy /bin/sh).
x=(node_modules .git tmp angular\* icons server coffee)
EXCLUDEDIR=("${x[@]/#/--exclude-dir=}")
The backslash in angular\* is to pass this wildcard expression through to grep unexpanded -- if the shell expanded it, grep would not exclude directories matching the wildcard in subdirectories (unless they conveniently happened to match one of the expanded values in the current directory). If you have nullglob in effect, an unescaped wildcard would simply disappear from the list.
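With the array in place, expand it quoted so each option reaches grep as a single argument:
grep -iIrn -A1 "${EXCLUDEDIR[@]}" -e "class=[\"\']title[\"\']>$" -e "<div class=\"content" . > microcopy.txt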
@tripleee correctly describes the problem, but there are two workarounds that I think are simpler (and, I think, more portable) than using an array: use eval in the grep command, or use echo in the variable assignment itself. The echo method is preferable.
Using eval
# Directories excluded from grep go here.
EXCLUDEDIR="--exclude-dir={node_modules,.git,tmp,angular*,icons,server,coffee}"
# This grep needs to include one line below the hit.
eval grep -iIrn -A1 $EXCLUDEDIR # .... etc
This causes the braces to be expanded as if they had been typed literally. Note, however, that it may have some unintended side-effects if you're not careful; for instance, you may need to add some extra \'s to escape quotes and $-signs.
Using echo
This is potentially safer than eval, since you won't accidentally execute code hidden in the EXCLUDEDIR variable.
# Directories excluded from grep go here.
EXCLUDEDIR="$(echo --exclude-dir={node_modules,.git,tmp,angular*,icons,server,coffee})"
# This grep needs to include one line below the hit.
grep -iIrn -A1 $EXCLUDEDIR # .... etc
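You can sanity-check what the assignment produced before running grep:
bash$ echo "$EXCLUDEDIR"
--exclude-dir=node_modules --exclude-dir=.git --exclude-dir=tmp --exclude-dir=angular* --exclude-dir=icons --exclude-dir=server --exclude-dir=coffee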
