move command with regular expressions - bash

Bash is not recognizing the regular expression in this mv command:
mv ../downloads'^[exam].*$[.pdf] ../physics2400/exams
I'm trying to move files from a download directory to whatever directory I have made for them to go into.
An example of such a file is 'Exam 2 Practice Homework (Solutions).pdf'
(The single quotes are apparently part of the file name in Bash.)
There are many other files in the download folder, hence the regex, or at least the attempt at one.

When performing filename expansion, Bash does not use regular expressions. Instead, a type of pattern matching referred to as globbing is used. This is discussed in the Filename Expansion section of the Bash manual.
Regarding your example file name (Exam 2 Practice Homework (Solutions).pdf), here are a couple of things to note:
The single quotes are not part of the file name; they are a convenience to avoid having to escape special characters in the filename (i.e. the spaces and the parentheses). Without the quotes, the filename would be specified as Exam\ 2\ Practice\ Homework\ \(Solutions\).pdf. See the Quoting section of the Bash manual for further details.
Filesystems in Unix-like operating systems are typically case sensitive, so you need to account for the upper case E the filename starts with.
Here's a pattern matching expression that would match your example filename as well as other files that start with Exam and end with .pdf.
mv ../downloads/Exam*.pdf ../physics2400/exams
If you have files that start with both Exam and exam, you could account for both with the following:
mv ../downloads/[Ee]xam*.pdf ../physics2400/exams
The bracketed expression is interpreted as "matches any one of the enclosed characters". This allows you to account for both upper and lower case.
Before executing such mv commands, I would test the filename expansion by running ls to verify that the intended files are matched:
ls ../downloads/[Ee]xam*.pdf
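The preview-then-move idea can be sketched in a script as well. This is only an illustration: the temporary directory and the sample file names are made up here, and nullglob is switched on so that a pattern with no matches expands to nothing instead of being passed to mv literally.

```bash
# Sketch: preview a glob, then move only if something matched.
# The directory layout below is a throwaway stand-in for ../downloads.
demo=$(mktemp -d)
mkdir -p "$demo/downloads" "$demo/exams"
touch "$demo/downloads/Exam 2 Practice Homework (Solutions).pdf" \
      "$demo/downloads/exam1.pdf" \
      "$demo/downloads/notes.pdf"

shopt -s nullglob                          # unmatched pattern -> empty list
matches=( "$demo"/downloads/[Ee]xam*.pdf ) # expand once, keep names intact
shopt -u nullglob

printf 'would move: %s\n' "${matches[@]}"  # the preview step

if (( ${#matches[@]} > 0 )); then
    mv -- "${matches[@]}" "$demo/exams/"
fi

moved=( "$demo"/exams/* )                  # what ended up in exams
left=( "$demo"/downloads/* )               # what stayed behind
```

Storing the expansion in an array means the list that was previewed is exactly the list that gets moved.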

If you want to use the regular expression, how about this?
find ./downloads -regex '.*\.pdf' -exec mv '{}' exams/ \;

Related

how to address files by their suffix

I am trying to copy a .nii file (Gabor3.nii) path to a variable but even though the file is found by the find command, I can't copy the path to the variable.
find . -type f -name "*.nii"
Data= '/$PWD/"*.nii"'
output:
./Gabor3.nii
./hello.sh: line 21: /$PWD/"*.nii": No such file or directory
What went wrong
You show that you're using:
Data= '/$PWD/"*.nii"'
The space means that the Data= part sets an environment variable Data to an empty string, and the shell then attempts to run '/$PWD/"*.nii"' as a command. The single quotes mean that what is between them is not expanded, and you don't have a directory /$PWD (that's a directory name of $, P, W, D in the root directory), so the script "*.nii" isn't found in it, hence the error message.
Using arrays
OK; that's what's wrong. What's right?
You have a couple of options. The most reliable is to use an array assignment and shell expansion:
Data=( "$PWD"/*.nii )
The parentheses (note the absence of spaces before the ( ; that's crucial) make it an array assignment. Using shell globbing gives a list of names, preserving spaces etc. in the names correctly. Using double quotes around "$PWD" ensures that the expansion is correct even if there are spaces in the current directory name.
You can find out how many files there are in the list with:
echo "${#Data[@]}"
You can iterate over the list of file names with:
for file in "${Data[@]}"
do
    echo "File is [$file]"
    ls -l "$file"
done
Note that variable references must be in double quotes for names with spaces to work correctly. The "${Data[@]}" notation has parallels with "$@", which also preserves spaces in the arguments to the command. There is a "${Data[*]}" variant which behaves analogously to "$*", and is of similarly limited value.
If you're worried that there might not be any files with the extension, then use shopt -s nullglob to expand the globbing expression into an empty list rather than the unexpanded expression which is the historical default. You can unset the option with shopt -u nullglob if necessary.
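As a rough sketch of the array approach combined with nullglob (the temporary directory, the second file name, and the Missing array are invented for the example):

```bash
# Sketch: collect *.nii files into an array and tolerate zero matches.
workdir=$(mktemp -d)
touch "$workdir/Gabor3.nii" "$workdir/second scan.nii"

shopt -s nullglob
Data=( "$workdir"/*.nii )                  # two matches, one with a space
Missing=( "$workdir"/*.does-not-exist )    # no matches -> empty array
shopt -u nullglob

echo "nii files: ${#Data[@]}"
echo "missing:   ${#Missing[@]}"

for file in "${Data[@]}"; do
    echo "File is [$file]"
done
```

Without nullglob, Missing would hold one element: the unexpanded pattern itself.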
Alternatives
Alternatives involve things like using command substitution, Data=$(ls "$PWD"/*.nii), but this is vastly inferior to using an array unless neither the path in $PWD nor the file names contain any spaces, tabs, or newlines. If there is no white space in the names, it works OK; you can iterate over:
for file in $Data
do
    echo "No white space [$file]"
    ls -l "$file"
done
but this is altogether less satisfactory if there are (or might be) any white space characters around.
You can use command substitution:
Data=$(find . -type f -name "*.nii" -print -quit)
The -quit option stops the search after the first file is found, which prevents multi-line output. You can omit it if you're sure only one file will be found or if you want to process multiple files.
The syntax to do what you seem to be trying to do with:
Data= '/$PWD/"*.nii"'
would be:
Data="$(ls "$PWD"/*.nii)"
Not saying it's the best approach for whatever you want to do next of course, it's probably not...

macOS – Creating folders based on part of a filename

I'm running macOS and looking for a way to quickly sort thousands of jpg files. I need to create folders based on part of filenames and then move those files into it.
Simply, I want to put these files:
x_not_relevant_part_of_name.jpg
x_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
y_not_relevant_part_of_name.jpg
Into these folders:
x
y
Keep in mind that length of "x" and "y" part of name may be different.
Is there an automatic solution for that in macOS?
I've tried using Automator and Terminal, but I'm not a programmer so I haven't done well.
I would back up the files first to somewhere safe in case it all goes wrong. Then I would install homebrew and then install rename with:
brew install rename
Then you can do what you want with this:
rename --dry-run -p 's|(^[^_]*)|$1/$1|' *.jpg
If that looks correct, remove the --dry-run and run it again.
Let's look at that command.
--dry-run means just say what the command would do without actually doing anything
-p means create any intermediate paths (i.e. directories) as necessary
's|...|' I will explain in a moment
*.jpg means to run the command on all JPG files.
The funny bit in single quotes is actually a substitution. In its simplest form it is s|a|b|, which means substitute thing a with b. In this particular case, the a is a caret (^), which means start of filename, and then [^_]*, which means any number of things that are not underscores. As I have surrounded that with parentheses, I can refer back to it in the b part as $1, since it is the first thing in parentheses in a. The b part means "whatever was before the underscore" followed by a slash and "whatever was before the underscore" again.
Using find with bash Parameter Substitution in Terminal would likely work:
find . -type f -name "*jpg" -maxdepth 1 -exec bash -c 'mkdir -p "${0%%_*}"' {} \; \
-exec bash -c 'mv "$0" "${0%%_*}"' {} \;
This uses bash parameter expansion with find to create directories (if they don't already exist) named after the prefix of each file whose name ends in jpg. It takes the characters before the first underscore (_), then moves the matching files into the appropriate directory. To use the command, simply cd into the directory you would like to organize. Keep in mind that without the maxdepth option, running the command multiple times can produce more folders, because find would descend into the folders created earlier; the maxdepth option limits the "depth" at which the command can operate.
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted.
↳ GNU Bash : Shell Parameter Expansion
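The same %% expansion can also be used in a plain loop without find. The following is a minimal sketch under assumed names: the temporary directory and the three sample files are invented, and it only handles files directly inside that one directory.

```bash
# Sketch: sort files into folders named after the text before the first "_".
src=$(mktemp -d)
touch "$src/x_one.jpg" "$src/x_two.jpg" "$src/yy_three.jpg"

for f in "$src"/*_*.jpg; do
    [ -f "$f" ] || continue       # skip if the pattern matched nothing
    name=${f##*/}                 # drop the directory part of the path
    prefix=${name%%_*}            # delete the longest trailing match of "_*"
    mkdir -p "$src/$prefix"       # create the folder if it is not there yet
    mv -- "$f" "$src/$prefix/"
done
```

Because %% removes the longest match of _*, everything from the first underscore onward is dropped, so the prefix may be any length.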

Glob matching (wildcards) in fish shell not matching bash behavior

When I run the following command in bash, I get a list of files that match the regular expression I want:
$> ls *-[0-9].jtl
benchmark-1422478133-1.jtl benchmark-1422502883-4.jtl benchmark-1422915207-2.jtl
However, when I run the same command in the fish shell, I get a different result:
$> ls *-[0-9].jtl
fish: No matches for wildcard '*-[0-9].jtl'.
ls *-[0-9].jtl
^
How come?
Fish's documentation does not claim to support the full power of POSIX glob patterns.
Quoting the docs:
Wildcards
If a star (*) or a question mark (?) is present in the parameter, fish attempts to match the given parameter to any files in such a way that:
? can match any single character except /.
* can match any string of characters not containing /. This includes matching an empty string.
** matches any string of characters. This includes matching an empty string. The string may include the / character but does not need to.
Notably, there's no mention of character classes, as fish doesn't support them.
If you want globs guaranteed to support all POSIX (fnmatch) features, use a POSIX-compliant or POSIX-superset shell.
You can also use a more capable tool, the Unix find command. It is very powerful.
https://kb.iu.edu/d/admm
https://duckduckgo.com/?q=unix+find
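example: use find's own pattern matching (note that -path takes shell-style glob patterns, including bracket expressions, rather than regular expressions)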
find . -path '.*-[0-9].jtl' -not -path '.*-32.jtl'
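A small sketch of letting find do the matching with -name, which also accepts bracket expressions. Because the pattern is quoted, the shell passes it through untouched and find applies it. The directory and file names below are invented for the demonstration.

```bash
# Sketch: match "-<one digit>.jtl" with find instead of a shell glob.
d=$(mktemp -d)
touch "$d/benchmark-1422478133-1.jtl" \
      "$d/benchmark-1422502883-32.jtl" \
      "$d/other.txt"

# The quoted pattern is interpreted by find, not by the calling shell.
found=$(find "$d" -type f -name '*-[0-9].jtl')
echo "$found"
```

The file ending in -32.jtl is not matched, since [0-9] stands for exactly one digit directly after the dash.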
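In fish, quoting the pattern ("*.conf") passes it to the command unexpanded, which is what bash does with an unquoted *.conf that matches nothing. This only helps when the command does its own pattern matching.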
This is an older post, but I think it's worth revisiting this. At time of writing (Mar 2021), the documentation does explicitly state supporting wildcards.
Fish supports the familiar wildcard *. To list all JPEG files:
> ls *.jpg
lena.jpg
meena.jpg
santa maria.jpg
You can include multiple wildcards:
> ls l*.p*
lena.png
lesson.pdf
Especially powerful is the recursive wildcard ** which searches directories recursively:
> ls /var/**.log
/var/log/system.log
/var/run/sntp.log
However, I still all too frequently run into this same issue
[/home/glass ]
><glass@rockpiX-Ubuntu> rm *.log.old
fish: No matches for wildcard “*.log.old”. See `help expand`.
rm *.log.old
^
In fish 3+ you could string match:
ls | string match -r --entire -- '-[0-9]\.jtl$'
options:
-r: regular expression
--entire: returns the entire matching string
--: ends option parsing, so the leading - of the pattern is not read as an option

bash - mass renaming files with many special characters

I have a lot of files (in single directory) like:
[a]File-. abc'.d -001[xxx].txt
so there are many spaces, apostrophes, brackets, and full stops. The only differences between them are numbers in place of 001, and letters in place of xxx.
How to remove the middle part, so all that remains would be
[a]File-001[xxx].txt
I'd like an explanation how such code would work, so I could adapt it for other uses, and hopefully help answer others similar questions.
Here is a simple script in pure bash:
for f in *; do                    # for all entries in the current directory
    if [ -f "$f" ]; then          # if the entry is a regular file (i.e. not a directory)
        mv "$f" "${f/-*-/-}"      # rename it by removing everything between two dashes
                                  # and the dashes, and replace the removed part
                                  # with a single dash
    fi
done
The magic done in the "${f/-*-/-}" expression is described in the bash manual (the command is info bash) in the chapter 3.5.3 Shell Parameter Expansion
The * pattern in the first line of the script can be replaced with anything that can help to narrow the list of the files you want to rename, e.g. *.txt, *File*.txt, etc.
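To see what the expansion does before renaming anything, it can be tried on a plain string. This sketch uses the sample name from the question plus a made-up second string (several) to show that the match is the longest one.

```bash
# Sketch: ${var/pattern/replacement} replaces the first, longest match.
f="[a]File-. abc'.d -001[xxx].txt"
new=${f/-*-/-}             # "-. abc'.d -" collapses to a single "-"
echo "$new"

several="a-b-c-d"          # hypothetical name with three dashes
collapsed=${several/-*-/-} # match runs from the first dash to the last
echo "$collapsed"
```

With more than two dashes in a name, everything between the first and the last dash is removed, which may or may not be what is wanted.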
If you have the rename (aka prename) utility that's a part of Perl distribution, you could say:
rename -n 's/([^-]*-).*-(.*)/$1$2/' *.txt
to rename all txt files in your desired format. The -n above would not perform the actual rename, it'd only tell you what it would do had you not specified it. (In order to perform the actual rename, remove -n from the above command.)
For example, this would rename the file
[a]File-. abc'.d -001[xxx].txt
as
[a]File-001[xxx].txt
Regarding the explanation, this captures the part up to the first - into a group, and the part after the last one into another, and combines those.
Read about Regular Expressions. If you have perl docs available on your system, saying perldoc perlre should help.
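If rename is not available, a similar capture-and-combine can be approximated in bash itself with [[ =~ ]] and BASH_REMATCH. This is a sketch rather than a drop-in replacement; the variable names are arbitrary, and it only computes the new name without renaming anything.

```bash
# Sketch: the same two capture groups, using bash's built-in regex match.
re='^([^-]*-).*-(.*)$'     # group 1: up to the first dash; group 2: after the last
f="[a]File-. abc'.d -001[xxx].txt"

if [[ $f =~ $re ]]; then
    new="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
    echo "would rename: $f -> $new"
fi
```

Keeping the regex in a variable avoids quoting problems on the right-hand side of =~.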

script-file vs command-line: rsync and --exclude

I have a simple test bash script which looks like that:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
$cmd # execute command
When I run the script it will copy also the files ending with a ~ even though I meant to exclude them. When I run the very same rsync command directly from the command line, it works! Does someone know why and how to make bash script work?
Btw, I know that I can also work with --exclude-from but I want to know how this works anyway.
Try eval:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
eval "$cmd" # execute command
The problem isn't that you're running it in a script; it's that you put the command in a variable and then run the expanded variable. The shell parses quotes before it expands variables, so quote characters that come out of a variable's value are treated as ordinary characters and are never removed... and so rsync winds up excluding files with names starting with ' and ending with ~'. To fix this, just remove the quotes around the pattern (the whole thing is already in double-quotes, so they aren't needed). Note that the unquoted $cmd is still subject to filename expansion, so the *~ will be expanded by the shell if any files in the current directory match it:
#!/bin/bash
cmd="rsync -rv --exclude *~ ./dir ./new"
$cmd # execute command
...speaking of which, why are you putting the command in a variable before running it? In general, this is a good way to make code more confusing than it needs to be, and to trigger parsing oddities (some even weirder than this). So how about:
#!/bin/bash
rsync -rv --exclude '*~' ./dir ./new
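If the command really does need to be stored before it is run, a bash array is one option that keeps each argument as a single word. The sketch below only prints the arguments rather than running rsync; the words array exists purely to show what the string version splits into.

```bash
# Sketch: command stored in an array vs. in a string.
cmd=( rsync -rv --exclude '*~' ./dir ./new )
printf '<%s>\n' "${cmd[@]}"       # inspect the arguments, one per line
# "${cmd[@]}"                     # uncomment to actually run rsync

str="rsync -rv --exclude '*~' ./dir ./new"
set -f                            # disable globbing so only word splitting shows
words=( $str )                    # what an unquoted $str would turn into
set +f
printf '[%s]\n' "${words[@]}"
```

In the array, the fourth argument is *~ with the quotes already removed by the shell; in the split string, the quote characters are still part of the word.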
You can use a simple --exclude '~' as (according to the man page):
if the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions. Thus "/foo" would match a name of "foo" at either the "root of the transfer" (for a global rule) or in the merge-file's directory (for a per-directory rule). An unqualified "foo" would match a name of "foo" anywhere in the tree because the algorithm is applied recursively from the top down; it behaves as if each path component gets a turn at being the end of the filename. Even the unanchored "sub/foo" would match at any point in the hierarchy where a "foo" was found within a directory named "sub". See the section on ANCHORING INCLUDE/EXCLUDE PATTERNS for a full discussion of how to specify a pattern that matches at the root of the transfer.
if the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.
rsync chooses between doing a simple string match and wildcard matching by checking if the pattern contains one of these three wildcard characters: '*', '?', and '[' .
a '*' matches any path component, but it stops at slashes.
use '**' to match anything, including slashes.

Resources