Batch copy files from subdirectories to a new folder? - bash

I would like to batch copy files ending in fastq.gz from each folder (the folders have unique names) to a new directory, but I keep getting an error saying that the files cannot be found. Is it because I am using the wildcard incorrectly?
for f in ./*/split-adapter-quality-trimmed/*.fastq.gz; do
cp *fastq.gz ../../new;
done

The loop for f in ./*/split-adapter-quality-trimmed/*.fastq.gz already puts each matching path in the variable f. The cp *fastq.gz inside your loop ignores f and expands the wildcard again in the current directory, where no such files exist, hence the error. Use the variable directly in cp (cp "$f" destination) inside the loop. If you first put echo "$f" inside the loop, you can list all the files and verify them before copying.
for f in ./*/split-adapter-quality-trimmed/*.fastq.gz; do
cp "$f" ../../new
done

Unless you specifically want a for loop, you can do this with a single find command:
find ./*/split-adapter-quality-trimmed -name "*fastq.gz" -exec cp {} ../../new \;
It searches the directories matching ./*/split-adapter-quality-trimmed for files whose names end in fastq.gz and runs cp once for each file found. The command runs from the shell's current directory, so ../../new is resolved relative to it, and the \; marks the end of the -exec command:
cp <found-path> ../../new
(The pattern *fastq.gz is quoted to prevent Bash from expanding it before find sees it. The semicolon is escaped for the same reason, so that Bash does not treat it as a command separator.)
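A slightly more defensive version of the loop above, as a minimal sketch. The function name copy_fastq and its two parameters are hypothetical and not part of either answer. It creates the destination if it is missing and skips the unexpanded pattern that the shell hands to the loop when nothing matches.

```shell
# copy_fastq SRC_ROOT DEST  (hypothetical helper, for illustration only)
# Copies SRC_ROOT/*/split-adapter-quality-trimmed/*.fastq.gz into DEST.
copy_fastq() {
    src_root=$1
    dest=$2
    mkdir -p -- "$dest" || return 1
    for f in "$src_root"/*/split-adapter-quality-trimmed/*.fastq.gz; do
        # If nothing matched, the loop sees the literal pattern; skip it.
        [ -e "$f" ] || continue
        cp -- "$f" "$dest"/ || return 1
    done
}
```

It would be called as copy_fastq . ../../new. Files that share a name across sample folders would still overwrite each other in the destination, so this assumes the file names themselves are unique.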

Related

Copy multiple files with wildcard in bash

Using Ubuntu 18.04. Say we have a file called debug.log. You can create a copy called debug_BACKUP.log with either of these commands:
cp debug.log debug_BACKUP.log
cp debug{,_BACKUP}.log
Alternatively, substitute cp with mv to rename the file.
Now suppose we have debug1.log and debug2.log. We would like to create copies called debug1_BACKUP.log and debug2_BACKUP.log. Is there a single command to achieve this?
When I tried either of the following:
cp debug*.log debug*_BACKUP.log
cp debug*{,_BACKUP}.log
the error is cp: target 'debug*_BACKUP.log' is not a directory.
Brace expansions are an instruction for the shell about how to rewrite your command before glob expansion takes place. They aren't passed to the command itself -- cp has no idea if a brace expansion was used. For that matter, cp doesn't even have any idea if a wildcard is used; when you run cp *.txt dir/, the shell generates an array of C strings corresponding to something like cp foo.txt bar.txt baz.txt dir/ before running it.
This means that if you want to rewrite content after wildcard expansion takes place, you need to do it by hand.
for f in debug*.log; do
[[ $f = *_BACKUP.log ]] && continue # skip things that are already backup files
cp "$f" "${f%.log}_BACKUP.log"
done
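To see the order of expansions for yourself, here is a small sketch that turns off globbing so that only brace expansion acts on the word. It goes through bash -c purely so that it behaves the same from any calling shell; that wrapper is my addition, not part of the answer.

```shell
# With globbing disabled (set -f), only brace expansion rewrites the word,
# which exposes the two arguments that cp would be given.
bash -c 'set -f; printf "%s\n" debug*{,_BACKUP}.log'
# prints:
#   debug*.log
#   debug*_BACKUP.log
```

With globbing on, the first word expands to the existing log files, while the second stays literal unless backup files already exist. cp then sees that literal word as its final argument and reports that it is not a directory.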
There are a few excellent bulk-rename programs, including the Perl-based file-rename. You can achieve your bulk copy in three steps:
Copy the files to tmp sub folder
Perform bulk rename, moving the files back into the current folder
Remove the tmp folder
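The three steps can be sketched roughly as follows. The answer does not give the exact invocation of a bulk-rename tool, so the rename step here is a plain loop standing in for one. The function name backup_via_tmp and the scratch folder name tmp_backup are assumptions made for illustration.

```shell
# backup_via_tmp DIR  (hypothetical helper, for illustration only)
backup_via_tmp() {
    dir=$1
    scratch=$dir/tmp_backup          # assumed scratch folder name
    mkdir -- "$scratch" || return 1
    # Step 1: copy the original logs into the scratch folder.
    for f in "$dir"/debug*.log; do
        [ -e "$f" ] || continue
        case $f in *_BACKUP.log) continue ;; esac   # skip existing backups
        cp -- "$f" "$scratch"/ || return 1
    done
    # Step 2: rename while moving back (a bulk-rename tool would replace this loop).
    for f in "$scratch"/*.log; do
        [ -e "$f" ] || continue
        name=${f##*/}
        mv -- "$f" "$dir/${name%.log}_BACKUP.log" || return 1
    done
    # Step 3: remove the now-empty scratch folder.
    rmdir -- "$scratch"
}
```

Because rmdir only removes empty directories, a failure in the rename step leaves the copies in the scratch folder rather than deleting them.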

How to copy recursively files with multiple specific extensions in bash

I want to copy all files with specific extensions recursively in bash.
****editing****
I've written the full script. I have a list of names in a CSV file. I iterate through each name in that list and create a directory with that name somewhere else. Then I look in my source directory for the directory with that name; inside it there are a few files ending in xlsx, tsv, html, or gz, and I'm trying to copy all of them into the newly created directory.
sample_list_filepath=/home/lists/papers
destination_path=/home/ds/samples
source_directories_path=/home/papers_final/new
cat $sample_list_filepath/sample_list.csv | while read line
do
echo $line
cd $source_directories_path/$line
cp -r *.{tsv,xlsx,html,gz} $source_directories_path/$line $destination_path
done
This runs, but it copies all the files there, regardless of extension.
What is the problem?
An easy way to solve your problem is to use find with a regex:
find src/ -regex '.*\.\(tsv\|xlsx\|gz\|html\)$' -exec cp {} dest/ \;
find searches recursively in the directory you specify (src/ in my example), lets you filter the results with -regex, and applies a command to each match with -exec. The backslash-escaped \( \| \) syntax is the default regex flavour of GNU find.
For the regex part:
.*\.
matches the path and file name up to and including the dot before the extension, and
\(tsv\|xlsx\|gz\|html\)$
checks that the extension is one of those you want.
The -exec block is what you do with the files that matched the regex:
-exec cp {} dest/ \;
In this case, you copy each match (which is what {} stands for) to the destination directory.
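As for why the original loop copies everything: its cp -r line lists $source_directories_path/$line itself as a source, so the whole directory is copied recursively alongside the globbed files. Below is a minimal sketch that combines the loop over the name list with a find filter. The function name copy_samples and its three parameters are hypothetical. It uses -name tests instead of -regex, since the \| alternation relies on GNU find's regex flavour.

```shell
# copy_samples LIST SRC_ROOT DEST_ROOT  (hypothetical helper, for illustration only)
# For every name in LIST, copy matching files from SRC_ROOT/name to DEST_ROOT/name.
copy_samples() {
    list=$1
    src_root=$2
    dest_root=$3
    # "|| [ -n "$name" ]" also handles a last line with no trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue              # skip blank lines
        [ -d "$src_root/$name" ] || continue    # skip names with no source directory
        mkdir -p -- "$dest_root/$name" || return 1
        find "$src_root/$name" -type f \
            \( -name '*.tsv' -o -name '*.xlsx' -o -name '*.html' -o -name '*.gz' \) \
            -exec cp -- {} "$dest_root/$name"/ \;
    done < "$list"
}
```

With the paths from the question this would be called as copy_samples "$sample_list_filepath/sample_list.csv" "$source_directories_path" "$destination_path". Files found in nested subdirectories are flattened into one destination folder per name.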

Bash: Loop through each file in each subfolder and rename

I'm in a directory with 3 subdirectories: sub1, sub2, and sub3. Each subdirectory has files in it. I would like to rename each file by prepending sample_ to it.
Here's what I have:
for d in */; do
for f in "$d"; do
mv "$f" "sample_$f"
done
done
This prepends to the folder name, which isn't what I want. What am I doing incorrectly?
Thanks!
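You can accomplish this with find and its -execdir option:
find . -type f -execdir sh -c 'mv -- "$1" "sample_${1#./}"' sh {} \;
This recursively finds only regular files (-type f), including any in the current directory itself, and renames each one from within its own directory (see -execdir below), prepending sample_ to the filename. A shorter form that is often suggested, mv {,sample_}{}, uses brace expansion (part of shell expansion) as the Cartesian-product way of writing mv {} sample_{}. With GNU find, however, -execdir substitutes ./name for {}, so the target becomes sample_./name and the move fails; stripping the leading ./ with ${1#./} avoids this.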
-execdir command {} +
Like -exec, but the specified command is run from the subdirectory containing the matched file, which is not normally the directory in which you started find. This a much more secure method for invoking commands, as it avoids race conditions during resolution of the paths to the matched files. As with the -exec option, the '+' form of -execdir will build a command line to process more than one matched file, but any given invocation of command will only list files that exist in the same subdirectory. If you use this option, you must ensure that your $PATH environment variable does not reference the current directory; otherwise, an attacker can run any commands they like by leaving an appropriately-named file in a directory in which you will run -execdir.
↳ GNU : Brace / Shell Expansions
The inner loop has to iterate over the files inside each directory, and you can use basename to split the file name from its path:
for d in */; do
for f in "$d"*; do
mv -- "$f" "${d}sample_$(basename -- "$f")"
done
done
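One thing the loop does not guard against is being run twice, which would produce sample_sample_ names. A minimal sketch of a re-runnable variant follows. The function name add_prefix and its arguments are hypothetical, and it uses parameter expansion in place of basename.

```shell
# add_prefix PREFIX DIR...  (hypothetical helper, for illustration only)
# Prepends PREFIX to every regular file directly inside each DIR.
add_prefix() {
    prefix=$1
    shift
    for d in "$@"; do
        for f in "${d%/}"/*; do
            [ -f "$f" ] || continue                      # regular files only
            name=${f##*/}                                # strip the directory part
            case $name in "$prefix"*) continue ;; esac   # already renamed
            mv -- "$f" "${d%/}/$prefix$name" || return 1
        done
    done
}
```

From the directory in the question it would be called as add_prefix sample_ */. Unlike the find approach, it does not descend into nested subdirectories.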

How to rename files using wildcard in bash?

I was trying to rename some files to another extension:
# mv *.sqlite3_done *.sqlite3
but got an error:
mv: target '*.sqlite3' is not a directory
Why?
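The easy way is to use find. Passing the path to sh as an argument ($1), rather than embedding {} in the script text, keeps unusual filenames from being interpreted as shell code:
find . -type f -name '*.sqlite3_done' -exec sh -c 'mv -- "$1" "${1%_done}"' sh {} \;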
mv can only move multiple files into a single directory; it can’t move each one to a different name. You can loop in bash instead:
for x in *.sqlite3_done; do
mv -- "$x" "${x%_done}"
done
${x%_done} removes _done from the end of $x.
The wildcard expansion results in multiple names being passed to the command. Because no file matches *.sqlite3, that pattern is passed through literally as the last argument, and mv, given more than two arguments, treats the last one as the target directory.
You need to use a loop:
for nam in *sqlite3_done
do
newname=${nam%_done}
mv -- "$nam" "$newname"
done
The %_done removes _done from the end of the string; an occurrence elsewhere in the name is left alone.
The variables are quoted in the mv command so that filenames containing spaces are handled correctly.
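A quick way to check what ${x%_done} does before renaming anything is to print the result for a few sample names. The helper name strip_done and the sample file names are made up for this illustration.

```shell
# strip_done NAME: print NAME with one trailing "_done" removed, if present.
# (hypothetical helper, for illustration only)
strip_done() {
    printf '%s\n' "${1%_done}"
}

strip_done 'db.sqlite3_done'     # prints: db.sqlite3
strip_done 'a_done_b.sqlite3'    # prints: a_done_b.sqlite3 (no trailing _done)
strip_done 'x_done_done'         # prints: x_done (only one suffix is removed)
```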

bash script for copying files between directories

I am writing the following script to copy *.nzb files to a folder to queue them for Download.
I wrote the following script
#!/bin/bash
#This script copies NZB files from Downloads folder to HellaNZB queue folder.
${DOWN}="/home/user/Downloads/"
${QUEUE}="/home/user/.hellanzb/nzb/daemon.queue/"
for a in $(find ${DOWN} -name *.nzb)
do
cp ${a} ${QUEUE}
rm *.nzb
done
it gives me the following error saying:
HellaNZB.sh: line 5: =/home/user/Downloads/: No such file or directory
HellaNZB.sh: line 6: =/home/user/.hellanzb/nzb/daemon.queue/: No such file or directory
The thing is that those directories exist, and I do have the right to access them.
Any help would be nice.
Please and thank you.
Variable names on the left side of an assignment should be bare, without the ${...} around them:
foo="something"
echo "$foo"
Here are some more improvements to your script:
#!/bin/bash
# This script moves NZB files from the Downloads folder to the HellaNZB queue folder.
down="/home/myusuf3/Downloads/"
queue="/home/myusuf3/.hellanzb/nzb/daemon.queue/"
find "${down}" -name "*.nzb" | while IFS= read -r file
do
mv -- "${file}" "${queue}"
done
Using while read instead of for, and quoting the variables that hold filenames, keeps a filename containing spaces from being split into several words. (Filenames containing newlines would still break this loop.) Removing the rm keeps the loop from deleting the remaining files after the first iteration, which would produce repeated errors and leave only the first file copied. The glob given to -name needs to be quoted so that find, not the shell, expands it. Habitually using lowercase variable names reduces the chance of colliding with the shell's own variables.
If all your files are in one directory (and not in multiple subdirectories) your whole script could be reduced to the following, by the way:
mv /home/myusuf3/Downloads/*.nzb /home/myusuf3/.hellanzb/nzb/daemon.queue/
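If you do have files in multiple subdirectories (the {} must come immediately before the +, so the destination is given with mv's -t option, which is specific to GNU coreutils):
find /home/myusuf3/Downloads/ -name "*.nzb" -exec mv -t /home/myusuf3/.hellanzb/nzb/daemon.queue/ {} +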
As you can see, there's no need for a loop.
The correct syntax is:
DOWN="/home/myusuf3/Downloads/"
QUEUE="/home/myusuf3/.hellanzb/nzb/daemon.queue/"
for a in $(find ${DOWN} -name *.nzb)
# escape the * or it will be expanded in the current directory
# let's just hope no file has blanks in its name
do
cp ${a} ${QUEUE} # ok, although I'd normally add a -p
rm *.nzb # again, this is expanded in the current directory
# when you fix that, it will remove ${a}s before they are copied
done
Why don't you just use rm ${a}?
Why use a combination of cp and rm anyway, instead of mv?
Do you realize all files will end up in the same directory, and files with the same name from different directories will overwrite each other?
What if the cp fails? You'll lose your file.
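The last three questions (why not mv, name collisions, a failed copy) can be addressed together. The sketch below is one way to do it under stated assumptions: the function name queue_nzb and its two parameters are hypothetical, and skipping a colliding file with a warning is a policy chosen for the example, not something the original script specifies.

```shell
# queue_nzb DOWN QUEUE  (hypothetical helper, for illustration only)
# Moves every *.nzb under DOWN into QUEUE, refusing to overwrite an existing name.
queue_nzb() {
    down=$1
    queue=$2
    find "$down" -type f -name '*.nzb' -exec sh -c '
        queue=$1
        shift
        for f in "$@"; do
            target=$queue/${f##*/}
            if [ -e "$target" ]; then
                # Same name already queued: leave the source file where it is.
                printf "skipping %s: %s already exists\n" "$f" "$target" >&2
            else
                mv -- "$f" "$target"
            fi
        done
    ' sh "$queue" {} +
}
```

Because mv removes the source only after the move succeeds, a failure leaves the original file in place, which is the risk the cp-then-rm version runs.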
