Moving a directory tree with only some of the files - bash

I have files structured as follows:
folder1
-file1.a
-file2.b
-file3.c
folder2
-file1.a
-file2.b
folder3
-file1.a
-file2.b
-file3.c
I want to copy just the files with the .a extension while preserving the tree structure:
folder1
-file1.a
folder2
-file1.a
folder3
-file1.a
How can I accomplish this using bash?

If you have GNU cp (Linux), you can use:
cp --parents folder*/**/*.a /path/to/destination
or with two piped tar commands:
tar cf - folder*/**/*.a | tar -C /path/to/dest -xvf -
or
find folder* -name \*.a -print | cpio -o | (cd /path/to/dest ; cpio -idv)
or better, from @JonathanLeffler's comment:
find folder* -name '*.a' -print | cpio -pvd /path/to/dest
# and with NUL-terminated names
find ... -print0 | cpio -p0dv /path/to/dest
The ** means (from man bash, in the description of *):
Matches any string, including the null string. When the globstar shell option is enabled, and *
is used in a pathname expansion context, two adjacent *s used as a single pattern will match all
files and zero or more directories and subdirectories. If followed by a /, two adjacent *s will
match only directories and subdirectories.
The globstar option is off by default; enable it with shopt -s globstar:
globstar
If set, the pattern ** used in a pathname expansion context will match all files and zero or
more directories and subdirectories. If the pattern is followed by a /, only directories and
subdirectories match.
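Where neither GNU cp nor cpio is available, a fallback is to let find hand each match to a small shell snippet that recreates the parent directory before copying. This is a minimal sketch, not part of the answer above: it builds a scratch copy of the question's layout under a temporary directory, and the names `work` and `dest` are made up for the demo.

```bash
# Scratch layout mirroring the question (hypothetical paths under a temp dir).
work=$(mktemp -d)
dest="$work/dest"
mkdir -p "$work/src/folder1" "$work/src/folder2" "$dest"
touch "$work/src/folder1/file1.a" "$work/src/folder1/file2.b" \
      "$work/src/folder2/file1.a" "$work/src/folder2/file2.b"

# Run find from inside the source root so every match is a relative path
# such as ./folder1/file1.a; that relative path is reused under $dest.
(
  cd "$work/src" || exit 1
  find . -type f -name '*.a' -exec sh -c '
    dest=$1; shift
    for f do
      mkdir -p "$dest/$(dirname "$f")"   # recreate the parent directory
      cp -p "$f" "$dest/$f"              # copy the file into it
    done' sh "$dest" {} +
)

# List what ended up in the destination.
find "$dest" -type f | sort
```

Because find passes the names as arguments rather than as text, names with spaces or newlines should survive intact.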

One way is to create a tar file containing just the files you need, and then extract it wherever you want, e.g.:
$ tar -cvf allfiles.tar `find . -name "*.a" -print`
$ cp allfiles.tar /newdirectory
$ cd /newdirectory
$ tar -xvf allfiles.tar
Another way would be:
# below command will find all files with .a extension and then
# copy them to the newdir directory with folder structure.
$ cp --parents `find . -name "*.a" -print` newdir
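The backtick form above splits on whitespace, so it can misbehave when a directory or file name contains a space. A hedged alternative, assuming GNU find and GNU tar (for `-print0`, `--null` and `-T -`), pipes a NUL-delimited list straight into tar. The directory names here (`work`, `src`, `newdir`, `folder 1`) are invented for the demo.

```bash
# Scratch tree with a space in a directory name (hypothetical layout).
work=$(mktemp -d)
dest="$work/newdir"
mkdir -p "$work/src/folder 1" "$dest"
touch "$work/src/folder 1/file1.a" "$work/src/folder 1/file2.b"

(
  cd "$work/src" || exit 1
  # find emits NUL-terminated names; the first tar reads them from stdin
  # (--null -T -) and writes an archive to stdout, which the second tar
  # unpacks inside the destination directory.
  find . -type f -name '*.a' -print0 |
    tar --null -T - -cf - |
    tar -xf - -C "$dest"
)

ls -R "$dest"
```

No intermediate allfiles.tar is left behind, which matters when the selected files are large.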

You could use a for loop:
for f in folder{1..3}; do mkdir -p "dest/$f" && cp "src/$f/file1.a" "dest/$f"; done
For example:
$ ls src/*
src/folder1:
file1.a file2.b file3.c
src/folder2:
file1.a file2.b file3.c
src/folder3:
file1.a file2.b file3.c
$ for f in folder{1..3}; do mkdir -p "dest/$f" && cp "src/$f/file1.a" "dest/$f"; done
$ ls dest/*
dest/folder1:
file1.a
dest/folder2:
file1.a
dest/folder3:
file1.a

Related

Using 'diff' with mismatched directories and filenames

I have two separate directories, which mostly contain the same files, but the directory structure is completely different between the two. The filenames do not correspond either.
So, for example:
FOLDER 1
--- Subfolder A
-file1
-file2
--- Subfolder B
-file3
-file4
FOLDER 2
--- Subfolder C
-Subfolder C1
-file5
-file6
-file7
-Subfolder C2
-file8
-file9
Let's suppose that file1=file5, file2=file6, file3=file7, file4=file8
And file9 is unmatched.
Is there some combination of options to the diff command that will identify the matches? Doing a recursive diff with -r doesn't seem to do the job.
This is a way to get the different and/or identical files with find and xargs:
find FOLDER1 -type f -print0 |
xargs -0 -I % find FOLDER2 -type f -exec diff -qs --from-file="%" '{}' \+
Sample output:
Files FOLDER1/SubfolderB/file3 and FOLDER2/SubfolderC/SubfolderC1/file5 differ
Files FOLDER1/SubfolderB/file3 and FOLDER2/SubfolderC/SubfolderC1/file7 are identical
So, you can filter the ones you want with grep (see example).
Note that this solution supports filenames with embedded spaces and special characters (e.g. newlines), so you don't have to worry about them.
Explanation
For every file in FOLDER1 (find FOLDER1 -type f -print0), executes:
find FOLDER2 -type f -exec diff -qs --from-file="%" '{}' \+
That line calls find again to get all the files in FOLDER2 and executes the following (processed):
diff -qs --from-file="<a file from FOLDER1>" <all the files from FOLDER2>
From man diff:
--from-file=FILE1
Compare FILE1 to all operands. FILE1 can be a directory.
Example
This is the directory tree and the file content:
$ find FOLDER1 FOLDER2 -type f -exec sh -c 'echo "$0": && cat "$0"' '{}' \;
FOLDER1/SubfolderA/file1:
1=5
FOLDER1/SubfolderA/file2:
2=6
FOLDER1/SubfolderB/file3:
3=7
FOLDER1/SubfolderB/file4:
4=8
FOLDER2/SubfolderC/SubfolderC1/file5:
1=5
FOLDER2/SubfolderC/SubfolderC1/file6:
2=6
FOLDER2/SubfolderC/SubfolderC1/file7:
3=7
FOLDER2/SubfolderC/SubfolderC2/file8:
4=8
FOLDER2/SubfolderC/SubfolderC2/file9:
anything
And this is the command (pipeline) getting just the identical ones:
$ find FOLDER1 -type f -print0 |
> xargs -0 -I % find FOLDER2 -type f -exec diff -qs --from-file="%" '{}' \+ |
> grep "identical$"
Files FOLDER1/SubfolderA/file1 and FOLDER2/SubfolderC/SubfolderC1/file5 are identical
Files FOLDER1/SubfolderA/file2 and FOLDER2/SubfolderC/SubfolderC1/file6 are identical
Files FOLDER1/SubfolderB/file3 and FOLDER2/SubfolderC/SubfolderC1/file7 are identical
Files FOLDER1/SubfolderB/file4 and FOLDER2/SubfolderC/SubfolderC2/file8 are identical
Enhanced solution with bash's Process Substitution and Arrays
If you're using bash, you can first save all the FOLDER2 filenames in an array to avoid calling find for each file in FOLDER1:
# first of all, we save all the FOLDER2 filenames (recursively) in an array
folder2_files=()
while IFS= read -r -d '' file; do
    folder2_files+=("$file")
done < <(find FOLDER2 -type f -print0)
# now we compare each file in FOLDER1 with the files in the array
find FOLDER1 -type f -exec diff -qs --from-file='{}' "${folder2_files[@]}" \; |
grep "identical$"
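If GNU diff's --from-file is not available, roughly the same pairing can be sketched with cmp, which is part of POSIX. This is an illustration under two assumptions that the answer above does not make: file names contain no newlines, and the scratch tree (a cut-down version of the example) lives under a temporary directory named `work`.

```bash
# Cut-down version of the example tree, built in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p FOLDER1/SubfolderA FOLDER2/SubfolderC/SubfolderC1
echo '1=5'      > FOLDER1/SubfolderA/file1
echo '1=5'      > FOLDER2/SubfolderC/SubfolderC1/file5
echo 'anything' > FOLDER2/SubfolderC/SubfolderC1/file9

# For every file in FOLDER1, list the FOLDER2 files with identical bytes.
# cmp -s prints nothing and succeeds only when both files are the same,
# so the inner find reaches -print only for matches.
matches=$(
  find FOLDER1 -type f | while IFS= read -r f; do
    find FOLDER2 -type f -exec cmp -s "$f" {} \; -print |
      while IFS= read -r g; do
        printf '%s == %s\n' "$f" "$g"
      done
  done
)
printf '%s\n' "$matches"
```

Like the original pipeline, this compares every file against every other file, so it gets slow on large trees.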
Create a temporary Git repository. Add the first directory tree to it, and commit.
Remove all the files and add the second directory tree to it. Do the second commit.
A git diff between those two commits will use rename detection (pass -M to request it explicitly), and you will probably see something more enlightening.
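The Git approach can be sketched as follows. Everything here is a throwaway demo: the temporary directory, the `repo` name, the commit messages and the placeholder identity are assumptions, and the tree is a cut-down version of the example from the earlier answer.

```bash
# Two small trees to compare (contents chosen so one pair is identical).
work=$(mktemp -d)
mkdir -p "$work/FOLDER1/SubfolderA" "$work/FOLDER2/SubfolderC/SubfolderC1"
echo '1=5'      > "$work/FOLDER1/SubfolderA/file1"
echo '1=5'      > "$work/FOLDER2/SubfolderC/SubfolderC1/file5"
echo 'anything' > "$work/FOLDER2/SubfolderC/SubfolderC1/file9"

repo="$work/repo"
mkdir "$repo"
cd "$repo" || exit 1
git init -q .
# Placeholder identity so the commits work without a global git config.
git config user.name  "demo"
git config user.email "demo@example.invalid"

# First commit: only the first tree.
cp -R "$work/FOLDER1" .
git add -A
git commit -q -m "first tree"

# Second commit: remove the first tree, add the second.
git rm -r -q FOLDER1
cp -R "$work/FOLDER2" .
git add -A
git commit -q -m "second tree"

# -M requests rename detection; byte-identical files are reported
# as R100 (100% similar) with the old and the new path.
renames=$(git diff -M --name-status HEAD~1 HEAD)
printf '%s\n' "$renames"
```

Files that were only slightly edited can also be paired this way, with a similarity score below 100.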

How can I recursively copy same-named files from one directory structure to another in bash?

I have two directories, say dir1 and dir2, that have exactly the same directory structure. How do I recursively copy all the *.txt files from dir1 to dir2?
Example:
I want to copy from
dir1/subdir1/file.txt
dir1/subdir2/someFile.txt
dir1/.../..../anotherFile.txt
to
dir2/subdir1/file.txt
dir2/subdir2/someFile.txt
dir2/.../..../anotherFile.txt
The .../... in the last file example means this could be any sub-directory, which can have sub-directories itself.
Again I want to do this programmatically. Here's the pseudo-code
SRC=dir1
DST=dir2
for f in `find ./$SRC "*.txt"`; do
# $f should now be dir1/subdir1/file.txt
# I want to copy it to dir2/subdir1/file.txt
# the next line conveys the idea, but does not work
# I'm attempting to substitute "dir1" with "dir2" in $f,
# and store the new path in tmp.txt
echo `sed -i "s/$SRC/$DST/" $f` > tmp.txt
# Do the copy
cp -f $f `cat tmp.txt`
done
You can simply use rsync. This answer is based on this thread.
rsync -av --include='*.txt' --include='*/' --exclude='*' dir1/ dir2/
If you only have .txt files in dir1, this would work:
cp -R dir1/* dir2/
But if you have other file extensions, it will copy them too. In this case, this will work:
cd /path/to/dir1
cp --parents `find . -name '*.txt'` /path/to/dir2/
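The path rewrite that the question's pseudo-code attempts with sed can also be done with shell parameter expansion, with no temporary file. This is a minimal sketch: it works in a scratch directory, reuses the question's `SRC` and `DST` names, and the sample files are invented for the demo.

```bash
# Scratch copy of the question's layout.
work=$(mktemp -d)
cd "$work" || exit 1
SRC=dir1
DST=dir2
mkdir -p "$SRC/subdir1" "$SRC/subdir2/deeper" "$DST"
echo one  > "$SRC/subdir1/file.txt"
echo two  > "$SRC/subdir2/deeper/anotherFile.txt"
echo skip > "$SRC/subdir1/notes.log"

find "$SRC" -type f -name '*.txt' -exec sh -c '
  src=$1; dst=$2; shift 2
  for f do
    rel=${f#"$src"}                     # strip the leading "dir1"
    target=$dst$rel                     # dir1/subdir1/file.txt -> dir2/subdir1/file.txt
    mkdir -p "$(dirname "$target")"     # in case dir2 lacks the subdirectory
    cp -f "$f" "$target"
  done' sh "$SRC" "$DST" {} +

find "$DST" -type f | sort
```

`${f#"$src"}` removes the source prefix only at the start of the path, so a "dir1" that appears deeper in the path is left alone, unlike the unanchored sed substitution.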

How to restrict pattern matching to the first matching file

I have a very delicate bash problem:
given a directory foo
$ ls
foo.tgz bar.tgz baz.tgz
I want to write a bash one-liner that extracts from the first tarball matching a pattern, like:
bash -c "tar -zxvf foo.tgz file1" # fine
bash -c "tar -zxvf *.tgz file1" # oops, trying to extract bar.tgz from foo.tgz!
Is there a possibility to restrict pattern matching to the first expanded parameter?
Refinement:
find -iname '*.tgz' | xargs tar -zxvf # oops! cannot add restriction to only extract file1
Use the -quit primary with find:
find -iname '*.tgz' -exec tar -zxvf '{}' \; -quit
As the actions are processed from left-to-right, this will run tar on the first match, then end the find command.
Sure, try this:
tar -zxvf "$(ls *.tgz | head -n 1)" file1
You should consider what happens if nothing matches the pattern....
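One way to handle the no-match case without parsing ls is to expand the glob into the positional parameters and test the first one. This is a sketch under assumptions: the three tarballs are created on the spot in a scratch directory, and the `.d` helper directories exist only to build them.

```bash
work=$(mktemp -d)
cd "$work" || exit 1
# Three tarballs, each holding its own file1 (the content names the archive).
for name in foo bar baz; do
  mkdir "$name.d"
  echo "from $name" > "$name.d/file1"
  tar -czf "$name.tgz" -C "$name.d" file1
done

# Let the shell expand the glob into "$@", then keep only the first entry.
# Globs expand in sorted order, so "first" means first alphabetically.
set -- *.tgz
first=$1
if [ -e "$first" ]; then
  tar -zxf "$first" file1
else
  # An unmatched glob is left as the literal string "*.tgz".
  echo "no .tgz files here" >&2
fi
```

With the names from the question, the first tarball is bar.tgz, not foo.tgz, because the expansion is sorted.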

Recursively move files of certain type and keep their directory structure

I have a directory which contains multiple sub-directories with mov and jpg files.
/dir/
/subdir-a/ # contains a-1.jpg, a-2.jpg, a-1.mov
/subdir-b/ # contains b-1.mov
/subdir-c/ # contains c-1.jpg
/subdir-d/ # contains d-1.mov
... # more directories with the same pattern
I need to find a way using command-line tools (on Mac OSX, ideally) to move all the mov files to a new location. However, one requirement is to keep directory structure i.e.:
/dir/
/subdir-a/ # contains a-1.mov
/subdir-b/ # contains b-1.mov
# NOTE: subdir-c isn't copied because it doesn't have mov files
/subdir-d/ # contains d-1.mov
...
I am familiar with find, grep, and xargs but wasn't sure how to solve this issue. Thank you very much beforehand!
It depends slightly on your O/S and, more particularly, on the facilities in your version of tar and whether you have the command cpio. It also depends a bit on whether you have newlines (in particular) in your file names; most people don't.
Option #1
cd /old-dir
find . -name '*.mov' -print | cpio -pvdumB /new-dir
Option #2
find . -name '*.mov' -print | tar -c -f - -T - |
(cd /new-dir; tar -xf -)
The cpio command has a pass-through (copy) mode which does exactly what you want given a list of file names, one per line, on its standard input.
Some versions of the tar command have an option to read the list of file names, one per line, from standard input; on MacOS X, that option is -T - (where the lone - means 'standard input'). In the first tar command, which writes an archive with -c, the option -f - means 'write to standard output'; in the second tar command, which extracts with -x, the same -f - means 'read from standard input'.
There may be other options; look at the manual page or help output of tar rather carefully.
This process copies the files rather than moving them. The second half of the operation would be:
find . -name '*.mov' -exec rm -f {} +
ASSERT: No files have newline characters in them. Spaces, however, are AOK.
# TEST FIRST: CREATION OF FOLDERS
find . -type f -iname \*.mov -printf '%h\n' | sort | uniq | xargs -n 1 -d '\n' -I '{}' echo mkdir -vp "/TARGET_FOLDER_ROOT/{}"
# EXECUTE CREATION OF EMPTY TARGET FOLDERS
find . -type f -iname \*.mov -printf '%h\n' | sort | uniq | xargs -n 1 -d '\n' -I '{}' mkdir -vp "/TARGET_FOLDER_ROOT/{}"
# TEST FIRST: REVIEW FILES TO BE MOVED
find . -type f -iname \*.mov -exec echo mv {} /TARGET_FOLDER_ROOT/{} \;
# EXECUTE MOVE FILES
find . -type f -iname \*.mov -exec mv {} /TARGET_FOLDER_ROOT/{} \;
Since these are large files, if source and destination are on the same file system you don't want to copy them; you just want to replicate their directory structure while moving them.
You can use this function:
# moves a file (or folder) preserving its folder structure (relative to source path)
# usage: move_keep_path source destination
move_keep_path () {
    # create directories up to one level up
    mkdir -p "$(dirname "$2")"
    mv "$1" "$2"
}
Or, adding support to merging existing directories:
# moves a file (or folder) preserving its folder structure (relative to source path)
# usage: move_keep_path source destination
move_keep_path () {
    # create directories up to one level up
    mkdir -p "$(dirname "$2")"
    if [[ -d "$1" && -d "$2" ]]; then
        # merge existing folder
        find "$1" -mindepth 1 -maxdepth 1 | while IFS= read -r file; do
            # call recursively for all files inside
            move_keep_path "$file" "$2/$(basename "$file")"
        done
        # remove after merge
        rmdir "$1"
    else
        # either file or non-existing folder
        mv "$1" "$2"
    fi
}
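A hedged usage sketch for the simple (non-merging) form of the helper: find selects the .mov files and the function is called once per match. The scratch layout and the `new` target directory are invented for the demo, and file names are assumed to contain no newlines.

```bash
# Simple variant of the helper, restated so the sketch is self-contained.
move_keep_path () {
  mkdir -p "$(dirname "$2")"
  mv "$1" "$2"
}

# Scratch tree loosely following the question's layout.
work=$(mktemp -d)
mkdir -p "$work/dir/subdir-a" "$work/dir/subdir-c" "$work/new"
touch "$work/dir/subdir-a/a-1.jpg" "$work/dir/subdir-a/a-1.mov" \
      "$work/dir/subdir-c/c-1.jpg"

cd "$work/dir" || exit 1
# Paths come out relative (./subdir-a/a-1.mov), so the same relative
# path can be appended to the target root.
find . -type f -name '*.mov' | while IFS= read -r f; do
  move_keep_path "$f" "$work/new/$f"
done

find "$work/new" -type f
```

Directories without .mov files (subdir-c here) are never created under the target, which matches the requirement in the question.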
It is easier to just copy the files like:
cp --parents some/folder/*/*.mov new_folder/
From the parent directory of "dir", execute this:
find ./dir -name "*.mov" | xargs tar cif mov.tar
Then cd to the directory you want to move the files to and execute this:
tar xvf /path/to/parent/directory/of"dir"/mov.tar
This should work if you want to move all mov files to a directory called new location -
find ./dir -iname '*.mov' -exec mv '{}' ./newlocation \;
However, if you wish to move the mov files along with their sub-dirs then you can do something like this -
Step 1: Copy entire structure of /dir to a new location using cp
cp -iprv dir/ newdir
Step 2: Find jpg files from newdir and delete them.
find ./newdir -iname "*.jpg" -delete
Test:
[jaypal:~/Temp] ls -R a
a.mov aa b.mov
a/aa:
aaa c.mov d.mov
a/aa/aaa:
e.mov f.mov
[jaypal:~/Temp] mkdir d
[jaypal:~/Temp] find ./a -iname '*.mov' -exec mv '{}' ./d \;
[jaypal:~/Temp] ls -R d
a.mov b.mov c.mov d.mov e.mov f.mov
I amended the function of @djjeck, because it didn't work as I needed. The function below moves a source file to a destination directory, also creating the needed levels of hierarchy from the source file path (see the example below):
# moves a file, creates needed levels of hierarchy in destination
# usage: move_with_hierarchy source_file destination top_level_directory
move_with_hierarchy () {
    local path_tail
    # directory part of the source path, relative to the top-level directory
    path_tail=$(dirname "$(realpath --relative-to="$3" "$1")")
    mkdir -p "$2/$path_tail"
    mv "$1" "$2/$path_tail"
}
example:
$ ls /home/sergei/tmp/dir1/dir2/bla.txt
/home/sergei/tmp/dir1/dir2/bla.txt
$ rm -rf tmp2
$ mkdir tmp2
$ move_with_hierarchy /home/sergei/tmp/dir1/dir2/bla.txt /home/sergei/tmp2 /home/sergei/tmp
$ tree ~/tmp2
/home/sergei/tmp2
└── dir1
└── dir2
└── bla.txt
2 directories, 1 file

Unix script to find all folders in the directory, then tar and move them

Basically I need to run a Unix script to find all folders in the directory /fss/fin, if it exists; then I have to tar them and move them to another directory /fs/fi.
This is my command so far:
find /fss/fin -type d -name "essbase" -print
Here I have directly mentioned the folder name essbase. But instead, I would like to find all the folders in the /fss/fin and use them all.
How do I find all folders in the /fss/fin directory & tar them to move them to /fs/fi?
Clarification 1:
Yes I need to find only all folders in the directory /fss/fin directory using a Unix shell script and tar them to another directory /fs/fi.
Clarification 2:
I want to make it clear with the requirement. The Shell Script should contain:
Find all the folders in the directory /fss/fin
Tar the folders
Move the folders in another directory /fs/fi which is located on the server s11003232sz.net
On user request, it should untar the folders and move them back to the original directory /fss/fin
Here is an example I am working with that may lead you in the right direction:
BackUpDIR="/srv/backup/"
SrvDir="/srv/www/"
DateStamp=$(date +"%Y%m%d");
for Dir in $(find $SrvDir* -maxdepth 0 -type d );
do
FolderName=$(basename $Dir);
tar zcf "$BackUpDIR$DateStamp.$FolderName.tar.gz" -P $Dir
done
Since tar does directories automatically, you really don't need to do very much. Assuming GNU tar:
tar -C /fss/fin -cf - essbase |
tar -C /fs/fi -xf -
The '-C' option changes directory before operating. The first tar writes to standard output (the lone '-') everything found in the essbase directory. The output of that tar is piped to the second tar, which reads its standard input (the lone '-'; fun isn't it!).
Assuming GNU find, you can also do:
(cd /fss/fin; tar -cf - $(find . -maxdepth 1 -type d | sed '/^\.$/d')) |
tar -xf - -C /fs/fi
This changes directory to the source directory; it runs 'find' with a maximum depth of 1 to find the directories and removes the current directory from the list with 'sed'; the first 'tar' then writes the output to the second one, which is the same as before (except I switched the order of the arguments to emphasize the parallelism between the two invocations).
If your top-level directories (those actually in /fss/fin) have spaces in the names, then there is more work to do again - I'm assuming none of the directories to be backed up start with a '.':
(cd /fss/fin; find * -maxdepth 0 -type d -print0 | xargs -0 tar -cf -) |
tar -xf - -C /fs/fi
This weeds out the non-directories from the list generated by '*', and writes them with NUL '\0' (zero bytes) marking the end of each name (instead of a newline). The output is written to 'xargs', which is configured to expect the NUL-terminated names, and it runs 'tar' with the correct directory names. The output of this ensemble is sent to the second tar, as before.
If you have directory names starting with a '.' to collect, then add '.[a-z]*' or another suitable pattern after the '*'; it is crucial that what you use does not list '.' or '..'. If you have names starting with dashes in the directory, then you need to use './*' and './.[a-z]*'.
If you've got still more perverse requirements, enunciate them clearly in an amendment to the question.
find /fss/fin -mindepth 1 -maxdepth 1 -type d -print
The above command gives you the list of first-level subdirectories of /fss/fin.
You can then do anything with this list, e.g. tar them into your output directory as in the command below:
tar -czf /fs/fi/outfile.tar.gz `find /fss/fin -mindepth 1 -maxdepth 1 -type d -print`
Original directory structure will be recreated after untar-ing.
Here is a bash example (change /fss/fin, /fs/fi with your paths):
dirs=($(find /fss/fin -type d))
for dir in "${dirs[@]}"; do
tar zcf "$dir.tgz" "$dir" -P -C /fs/fi && mv -v "$dir" /fs/fi/
done
This finds all the folders, tars them separately and, if successful, moves them into a different folder.
This should do it:
#!/bin/sh
list=`find . -type d`
for i in $list
do
if [ "$i" != "." ]; then
tar -czf "${i}.tar.gz" "${i}"
fi
done
mv *.tar.gz ~/tardir
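Putting the local part of the requirement together (one archive per top-level folder, then restore on request) could look like the sketch below. It is an illustration under assumptions: `src` and `dst` stand in for /fss/fin and /fs/fi inside a scratch directory, the sample folder names are invented, and the transfer to the remote server is left out.

```bash
work=$(mktemp -d)
src="$work/fin"      # stands in for /fss/fin
dst="$work/fi"       # stands in for /fs/fi
mkdir -p "$src/essbase/data" "$src/other dir" "$dst"
echo hello > "$src/essbase/data/a.txt"
touch "$src/plain-file"

# Archive each top-level directory into its own tarball under $dst.
# The trailing slash in the glob restricts the matches to directories.
for d in "$src"/*/; do
  [ -d "$d" ] || continue            # the glob matched nothing
  name=$(basename "$d")
  # Remove the original only if the archive was written successfully.
  tar -C "$src" -cf "$dst/$name.tar" "$name" && rm -rf "$src/$name"
done

# Restore one directory on request.
tar -C "$src" -xf "$dst/essbase.tar"

ls "$src" "$dst"
```

Using -C keeps the archived paths relative (essbase/...), so a tarball can be unpacked under any directory, not just the original one.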

Resources