Recursively go through directory and copy files in unix - bash

I need to recursively go through a directory and all its subdirectories and copy all the files in them to another, empty directory. "From" is the directory I want to walk through, copying every file to "to". Currently the script only copies the files directly inside "from" to "to" and does not descend into the subdirectories. Is there any way to make this work, or maybe some other way of doing this?
#!/bin/bash
from=$1
to=$2
list=`ls -lR $from | grep '^-' | awk '{ print $10; }'`
for i in $list
do
cp $from/$i $to/$i
done

You can use https://www.shellcheck.net/ to check your syntax and learn about indentation; it will suggest some modifications to your code. There are also excellent suggestions by @KamilCuk in the comments.
When you need to do something recursively in sub-directories, you can always use find with the -exec option. The {} placeholder stands for each file (or directory) that matches your find criteria, and the command given to -exec is applied to it.
This would do the trick: find "$from" -type f -name "*.txt" -exec cp {} "$to" \; -print
-type f: match regular files only.
-name "*.txt": filter by file name (here, the extension .txt must be present to match).
-exec cp {} "$to" \;: for each file that is found ({}), copy it to the $to directory. \; marks the end of the command given to -exec.
-print: print each file as it is copied, so you can see what happens as find goes through your directories.
Note also that find will work even if you have a huge number of sub-directories or a very large number of files in them. Parsing the output of ls sometimes fails, or takes an enormously long time, with a large number of files.
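For reference, a minimal corrected version of the original script using find instead of parsing ls (a sketch; note that files with the same name in different subdirectories will overwrite each other in the flat $to directory):
#!/bin/bash
from=$1
to=$2
# copy every regular file found anywhere under $from into $to
find "$from" -type f -exec cp {} "$to" \;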

Related

How to copy recursively files with multiple specific extensions in bash

I want to copy all files with specific extensions recursively in bash.
Edit:
I've written the full script. I have a list of names in a CSV file. I iterate through each name in that list, create a directory with that same name somewhere else, and then search my source directory for the directory with that name. Inside it there are a few files with the extensions .xlsx, .tsv, .html, and .gz, and I am trying to copy all of them into the newly created directory.
sample_list_filepath=/home/lists/papers
destination_path=/home/ds/samples
source_directories_path=/home/papers_final/new
cat $sample_list_filepath/sample_list.csv | while read line
do
echo $line
cd $source_directories_path/$line
cp -r *.{tsv,xlsx,html,gz} $source_directories_path/$line $destination_path
done
This works, but it copies all the files there, with no discrimination by extension.
What is the problem?
An easy way to solve your problem is to use find with a regex:
find src/ -regex '.*\.\(tsv\|xlsx\|gz\|html\)$' -exec cp {} dest/ \;
find looks recursively through the directory you specify (in this example src/), lets you filter the results with -regex, and applies a command to each matching result with -exec.
For the regex part:
.*\.
matches the name of the file up to and including the dot before the extension,
\(tsv\|xlsx\|gz\|html\)$
checks the extension against the ones you want.
The -exec block is what is done with the files matched by the regex:
-exec cp {} dest/ \;
In this case, you copy each match (that is what {} stands for) to the destination directory.
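Putting this together with the loop from the question, a sketch (it reuses the question's variable names and adds mkdir -p to create each per-sample destination directory, as the question describes):
sample_list_filepath=/home/lists/papers
destination_path=/home/ds/samples
source_directories_path=/home/papers_final/new
while IFS= read -r line
do
    echo "$line"
    # create the destination directory for this sample
    mkdir -p "$destination_path/$line"
    # copy only the files with the wanted extensions
    find "$source_directories_path/$line" -type f \
        -regex '.*\.\(tsv\|xlsx\|gz\|html\)$' \
        -exec cp {} "$destination_path/$line" \;
done < "$sample_list_filepath/sample_list.csv"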

Renaming multiple files in a nested structure

I have a directory with this structure:
root
|-dir1
| |-pred_20181231.csv
|
|-dir2
| |-pred_20181234.csv
...
|-dir84
|-pred_2018123256.csv
I want to run a command that will rename all the pred_XXX.csv files to pred.csv.
How can I easily achieve that?
I have looked into the rename facility but I do not understand the perl expression syntax.
EDIT: I tried with this code: rename -n 's/\training_*.csv$/\training_history.csv/' *.csv but it did not work
Try with this command:
find root -type f -name "*.csv" -exec perl-rename 's/_\d+(\.csv)/$1/g' '{}' \;
Options used:
-type f to match regular files only (not directories).
-name "*.csv" to only match files with the extension csv.
-exec/-execdir to execute a command on each match, in this case perl-rename.
's/_\d+(\.csv)/$1/g' searches for a suffix like _20181234.csv and replaces it with .csv; $1 refers to the first captured group.
Note
Depending on your OS, the command may be called just rename instead of perl-rename.
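As in the question's own attempt, the Perl rename also accepts -n (no-act) to preview the renames without performing them:
find root -type f -name "*.csv" -exec perl-rename -n 's/_\d+(\.csv)/$1/g' '{}' \;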
Use some shell looping:
for file in **/*.csv
do
echo mv "$file" "$(dirname "$file")/pred.csv"
done
On modern shells ** is a wildcard that matches across multiple directory levels in a hierarchy, an alternative to find, which is a fine solution too (in bash it must first be enabled with shopt -s globstar). I'm not sure if this should instead be /**/*.csv or /root/**/*.csv based on the tree you provided, so I've put echo before the mv to see what it's about to do. After making sure this is going to do what you expect, remove the echo.
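A self-contained sketch of the whole thing, assuming bash and that the top of the tree is named root as in the question:
#!/bin/bash
shopt -s globstar              # enable ** in bash
for file in root/**/pred_*.csv
do
    # dry run: remove echo once the output looks right
    echo mv "$file" "$(dirname "$file")/pred.csv"
done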

How to delete a file in any of the directories or subdirectories except one subdirectory

I want to delete a file from a directory tree that contains many subdirectories, but the deletion should not happen in one subdirectory (searc) whose name is predefined but whose path varies, as shown below. To delete the file I am currently using the command below:
find . -type f -name "*.txt" -exec rm -f {} \;
This command deletes all the matching files in the whole tree. How can I delete the files without searching that subdirectory?
The excluded subdirectory's name is always the same, but its path differs, for example:
Main
|
a --> searc
|
b --> x --> searc
|
c --> y --> x --> searc
The subdirectory that must not be searched can be present anywhere, as shown above.
I think you want the -prune option. In combination with a successful name match, this prevents descent into the named directories. Example:
% mkdir -p test/{a,b,c}
% touch test/{a,b,c}/foo.txt
% find test -name b -prune -o -name '*.txt' -print
test/a/foo.txt
test/c/foo.txt
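Applied to the layout from the question, a sketch might look like this (note that -delete cannot be combined with -prune, because -delete implies -depth, which disables pruning; use -exec rm instead, and run the -print version first to verify):
find Main -type d -name searc -prune -o -type f -name '*.txt' -print
find Main -type d -name searc -prune -o -type f -name '*.txt' -exec rm -f {} +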
I am not completely sure what you're asking, so I can give only somewhat generic advice.
You already know the -name option. This refers to the filename only. You can, however, also use -wholename (a.k.a. -path), which refers to the full path (beginning with the one given as first option to find).
So if you want to delete all *.txt files except in the foo/bar subdirectory, you can do this:
find . -type f -name "*.txt" ! -wholename "./foo/bar/*" -delete
Note the -delete option; it doesn't require a subshell, and is easier to type.
If you would like to exclude a certain directory name regardless of where in the tree it might be, just don't "root" it. In the above example, foo/bar was "rooted" to ./, so only a top-level foo/bar would match. If you write ! -wholename "*/foo/bar/*" instead (allowing anything before or after via the *), you would exclude any files below any directory foo/bar from the operation.
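For the directory layout in the question, that would be something like:
find Main -type f -name "*.txt" ! -wholename "*/searc/*" -delete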
You can use xargs instead of -exec:
find .... <without the -exec stuff> | grep -v 'your search' | xargs echo rm -f
Try this first. If the output is satisfactory, remove the echo.
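With GNU tools, a null-delimited version of the same idea is safer with unusual file names (a sketch; -print0, grep -z, and xargs -0 all treat the paths as NUL-separated):
find Main -type f -name '*.txt' -print0 | grep -zv '/searc/' | xargs -0 echo rm -f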

Bash Script for listing subdirectories and files in textfile

I need a script that writes the path of every file in a directory and its subdirectories to a text file.
For example, the script lives in /Mainfolder, and this folder contains four other folders, each of which contains several files.
I would like the script to write the path of each file into the text file, like this:
Subfolder1/File1.dat
Subfolder1/File2.dat
Subfolder2/File1.dat
Subfolder3/File1.dat
Subfolder4/File1.dat
Subfolder4/File2.dat
The important part is that there is no slash in front of each entry.
Use the find command:
find Mainfolder > outputfile
and if you only want the files listed, do
find Mainfolder -type f > outputfile
You can also strip the leading ./ if you search the current directory, with the %P format option:
find . -type f -printf '%P\n' > outputfile
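With GNU find, %P also works when the starting point is the folder itself, which gives exactly the listing asked for in the question:
find Mainfolder -type f -printf '%P\n' > outputfile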
If your bash version is recent enough, you can do it like this:
#!/bin/bash
shopt -s globstar
echo ** > yourtextfile
This solution assumes that the subdirectories contain only files -- they do not contain any directory in turn.
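Note also that echo prints all the matches on a single line separated by spaces; to get one path per line, printf can be used instead:
printf '%s\n' ** > yourtextfile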
find . -type f -print | sed 's|^\./||'
I have created a single file in each of the four subdirectories. The original output is:
./Subfolder1/File1.dat
./Subfolder4/File4.dat
./Subfolder2/File2.dat
./Subfolder3/File3.dat
The filtered output is:
Subfolder1/File1.dat
Subfolder4/File4.dat
Subfolder2/File2.dat
Subfolder3/File3.dat
You can use this find with -exec:
find . -type f -exec bash -c 'f="$1"; echo "${f:2}"' _ {} \;
This will print all files under the current path with the leading ./ removed. Passing the file name to bash -c as an argument (rather than embedding {} inside the script) keeps unusual file names from being interpreted as shell code.

How do I grab the filename of the file containing a certain string when there are hundreds of files?

I have a folder with 200 files in it. We can say that the files are named "abc0" to "abc199". Five of these files contain the string "ez123" but I don't know which ones. My current attempt to find the file names of the files that contain the string is:
#!/bin/sh
while read FILES
do
cat $FILES | egrep "ez123"
done
I have a file that contains the filenames of all files in the directory. So I then execute:
./script < filenames
This verifies for me that the files containing the string exist, but I still don't have the names of the files. Any ideas about the best way to accomplish this?
Thanks
You can try:
grep -l "ez123" abc*
find /directory -maxdepth 1 -type f -exec fgrep -l 'ez123' \{\} \;
(-maxdepth 1 is only necessary if you want to search only that directory and not the tree below it recursively, if there is any).
fgrep is a bit faster than grep. -l lists the matched filenames only.
Try
find -type f -exec grep -qs "ez123" {} \; -print
This will use find to find all regular files in the current directory (and subdirectories) and execute grep on them ({} is replaced by the file name; -qs tells grep to be silent and just set an exit code). -print then prints the names of the files in which grep found a matching line.
What about:
xargs egrep -l ez123 < filenames
That reads the filenames from stdin and prints the names of the files with matches.
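With GNU grep you can also skip the file list and let grep search recursively itself; -r recurses into the directory and -l prints only the matching file names:
grep -rl 'ez123' /directory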
