Copy all files with a certain extension from all subdirectories and preserving structure of subdirectories - bash

How can I copy specific files from all directories and subdirectories to a new directory while preserving the original subdirectory structure?
This answer:
find . -name \*.xls -exec cp {} newDir \;
copies all xls files from all subdirectories into the same directory newDir. That is not what I want.
If an xls file is in /s1/s2/, then it should be copied to newDir/s1/s2.
The command above copies all files from all folders and subfolders to a new folder, but the original file structure is lost: everything is copied into the same new folder, on top of each other.

You can try:
find . -type f -name '*.xls' -exec sh -c \
'd="newDir/${1%/*}"; mkdir -p "$d" && cp "$1" "$d"' sh {} \;
This applies the d="newDir/${1%/*}"; mkdir -p "$d" && cp "$1" "$d" shell script to each xls file: it first creates the target directory, then copies the file into it. The expansion ${1%/*} strips the file name from the path, leaving only the directory part.
If you have a lot of files and performance issues you can try to optimize a bit with:
find . -type f -name '*.xls' -exec sh -c \
'for f in "$@"; do d="newDir/${f%/*}"; mkdir -p "$d" && cp "$f" "$d"; done' sh {} +
This second version processes the files in batches and thus spawns fewer shells.
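To see what the batched version does, here is a minimal sketch run against a throwaway tree. The scratch directory, the file names (report.xls, notes.txt) and the choice to pass the destination as an extra argument are assumptions made for the demonstration, not part of the answer above. The destination is placed outside the searched tree so that find never walks into it.

```shell
#!/bin/sh
# Hypothetical scratch tree, created only for this demonstration.
work=$(mktemp -d)
mkdir -p "$work/src/s1/s2" "$work/src/s3"
: > "$work/src/s1/s2/report.xls"
: > "$work/src/s3/notes.txt"
dest="$work/newDir"

cd "$work/src" || exit 1

# The inner shell receives the destination as $1 and one batch of
# matching paths as the remaining arguments.
find . -type f -name '*.xls' -exec sh -c '
  dest=$1; shift
  for f in "$@"; do
    d="$dest/${f%/*}"          # directory part of the source path
    mkdir -p "$d" && cp "$f" "$d"
  done' sh "$dest" {} +
```

After this runs, report.xls sits under newDir/s1/s2, and no s3 directory is created because it holds no xls file.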

This should do:
# Ensure that newDir exists and is empty. Omit this step if you
# don't want it.
rm -rf newDir && mkdir newDir
# Copy the xls files.
rsync -a --include='**/*.xls' --include='*/' --exclude='*' . newDir
The trick here is the combination of include and exclude. By default, rsync copies everything below its source directory (. in your case). We change this by excluding everything, but also including the xls files.
In your example, newDir is itself a subdirectory of your working directory and hence part of the directory tree searched for copying. I would rethink this decision.
NOTE: This would not only also copy directories whose names end in .xls, but would also recreate the whole directory structure of your source tree (even where there are no xls files in it) and populate it only with xls files.
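The concern raised in this answer, that newDir lives inside the tree being searched, can also be handled on the find side with -prune. The following is a sketch under assumed names (a scratch directory, a.xls, and a leftover stale.xls from an earlier run); it uses find rather than rsync.

```shell
#!/bin/sh
# Hypothetical scratch tree; newDir already holds a file from an earlier run.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p s1/s2 newDir/old
: > s1/s2/a.xls
: > newDir/old/stale.xls

# -path ./newDir -prune stops find from descending into the destination,
# so files that are already in newDir are never copied into it again.
find . -path ./newDir -prune -o -type f -name '*.xls' -exec sh -c '
  for f in "$@"; do
    d="newDir/${f%/*}"
    mkdir -p "$d" && cp "$f" "$d"
  done' sh {} +
```

Without the -prune clause, stale.xls would be picked up as a source file and copied to newDir/newDir/old.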

Thanks for the solutions.
Meanwhile I found also:
find . -name '*.xls' | cpio -pdm newDir

Related

bash script to Remove *.pom file from multiple directories which contains *.jar and *.pom aswell

Multiple directories on the Linux system contain both *.jar and *.pom files. However, some directories contain only *.pom files, and those are the ones I am trying to delete, so far without success.
The script below removes all *.pom files from every directory, including the directories that also contain *.jar files.
#!/bin/sh
sudo find /var/opt/jfrog/artifactory/2021*/repositories/ -type f \( -name "*.pom" \) -exec rm {} \;
I'm attempting to delete only the *.pom file from the directory structure shown in the picture below.
[image: directory structure]
Instead of running rm directly, run a script that checks for jar files and deletes the pom only if there is no jar. Here we use -execdir to make this a bit easier and more efficient:
find ... -type f -name '*.pom' -execdir bash -c \
'compgen -G \*.jar > /dev/null || rm "$@"' . {} +
I would code:
#!/bin/bash
find /var/opt/jfrog/artifactory/2021*/repositories/ -name '*.pom' |
while IFS= read -r file; do
if [ ! -f "${file%.pom}.jar" ] ; then rm "$file" ; fi
done
Explanation: find sends all the .pom files to a while loop. For each .pom file, the loop checks that the corresponding .jar file does not exist and in that case, it deletes the .pom file.
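The two answers differ slightly: the first keeps a .pom when any .jar exists in the same directory, the second only when a .jar with the same base name exists. Below is a sketch of the first rule written for plain sh, tried on a hypothetical scratch tree (the directory names with-jar and pom-only and the file name lib are made up for the demonstration).

```shell
#!/bin/sh
# Hypothetical scratch repository, created only for this demonstration.
work=$(mktemp -d)
mkdir -p "$work/repo/with-jar" "$work/repo/pom-only"
: > "$work/repo/with-jar/lib.jar"
: > "$work/repo/with-jar/lib.pom"
: > "$work/repo/pom-only/lib.pom"

find "$work/repo" -type f -name '*.pom' -exec sh -c '
  for pom in "$@"; do
    dir=${pom%/*}
    found=no
    # An unmatched glob stays literal, so test that the entry really exists.
    for j in "$dir"/*.jar; do
      if [ -e "$j" ]; then found=yes; break; fi
    done
    if [ "$found" = no ]; then rm -- "$pom"; fi
  done' sh {} +
```

Replacing rm with echo is a cheap way to preview which files would be deleted before running this against real data.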

Script for moving all *.512.png to a new folder

Can you make a bash script for moving all the files ending in .512.png to a new folder like res512 (which will be a new branch), keeping all the subfolders, for this repo?
I tried for a really long time but I can't figure it out.
You're not very specific about what you're asking.
If you want to move all files that have the suffix .512.png from within your current directory to a new directory, you can use the following
mkdir res512
cp -r *.512.png res512/
If you want to move all files that have the suffix .512.png from within your directory and all child directories into a new directory, you can use
mkdir res512
find . -type f -name "*.512.png" -print0 |
while IFS= read -r -d '' f
do
cp "$f" res512/
done
If you want to move all files that have the suffix .512.png including their directory structure into a new directory, you can use
find . -name '*.512.png' -exec cp --parents \{\} res512/ \;
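Replace cp with mv in the first two commands if you want to move the files instead of copying them. Note that mv has no --parents option, so the last command cannot be converted that way; to move files while keeping their directory structure, create the target directory with mkdir -p before calling mv.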
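Since --parents is specific to GNU cp and mv offers no equivalent, moving files while keeping their subfolders needs the target directory to be created first. A minimal sketch, assuming a made-up icon tree (icons/apps, icons/places) in a scratch directory:

```shell
#!/bin/sh
# Hypothetical scratch tree, created only for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p icons/apps icons/places res512
: > icons/apps/term.512.png
: > icons/apps/term.256.png
: > icons/places/home.512.png

# Skip res512 itself, then recreate each source directory under it
# before moving the file.
find . -path ./res512 -prune -o -type f -name '*.512.png' -exec sh -c '
  for f in "$@"; do
    d="res512/${f%/*}"
    mkdir -p "$d" && mv "$f" "$d/"
  done' sh {} +
```

Only the .512.png files end up under res512/icons/...; term.256.png stays where it was.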

How to copy every file with extension X while keeping the original folder structure? (Unix-like systems)

I am trying to copy every HTML file from an src folder to a dist folder. However, I should like to preserve the original folder structure and if the dist folder does not exist I should like to create a new one.
Create the folder if it does not exist:
[ -d _dist/ ] || mkdir _dist/
Copy every file:
cp -R _src/**/*.html _dist/
Together:
[ -d _dist/ ] || mkdir _dist/ && cp -R _src/**/*.html _dist/
However, if I use ** only the files inside a folder will get copied and if I remove the ** only the root files will get copied. Is this even accomplishable?
find _src -type f -name "*.html" -exec cp -t _dist --parents {} ";"
cp -t : target directory (GNU cp)
--parents : recreate the full source path, including parent directories, under the target (GNU cp)
This will omit empty (no html-files) dirs. But if you need them, repeat without -name "*.html" but with -type d.
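If the empty directories are wanted as well, one way to get them without relying on GNU cp is to recreate the directory skeleton with mkdir -p first and copy the files afterwards (cp without -r skips directories, so the sketch does not pass directories to cp). The tree below, with its index.html, about.html and logo.png, is hypothetical. The result is laid out as _dist/_src/..., the same layout cp --parents produces.

```shell
#!/bin/sh
# Hypothetical scratch tree, created only for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p _src/pages/empty _src/assets
: > _src/index.html
: > _src/pages/about.html
: > _src/assets/logo.png

# Pass 1: recreate every directory, including those without html files.
find _src -type d -exec sh -c '
  for d in "$@"; do mkdir -p "_dist/$d"; done' sh {} +

# Pass 2: copy only the html files into the matching directories.
find _src -type f -name '*.html' -exec sh -c '
  for f in "$@"; do cp "$f" "_dist/$f"; done' sh {} +
```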
In case you don't have a version of cp with the --parents option, cpio is awesome.
mkdir -p _dist && (cd _src && find . -name '*.html' | cpio -pdm ../_dist)
This would recursively copy all html files into _dist while maintaining the directory structures.
↳ GNU cpio manual - GNU Project

Traverse directory and zip certain subdirectories in place

How can I bulk-zip folders in subdirectories without including the parent folder in the zip archives? I have a folder structure like this:
folder01
    folder02
        file01
        file02
When I run:
find . -type d -name "folder02" -exec zip -r '{}'.zip '{}' \;
I get "folder02.zip" which always extracts its contents into a parent folder "folder01". How can I prevent this? For me it creates useless parent folder structures when extracting these archives anywhere else.
Using some simple bash:
find . -type d -name "folder02" -exec bash -c 'cd "$(dirname "$1")" && zip -r "$(basename "$1").zip" "$(basename "$1")"' bash {} \;
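The idea in this answer is to archive from inside the parent directory so that only the folder's own name is stored. The sketch below shows the same idea with tar instead of zip (a deliberate swap, since tar's -C option changes directory before adding files); the scratch tree mirrors the layout in the question, and the listing variable is only there to inspect the result.

```shell
#!/bin/sh
# Hypothetical scratch tree mirroring the layout in the question.
work=$(mktemp -d)
mkdir -p "$work/folder01/folder02"
: > "$work/folder01/folder02/file01"
: > "$work/folder01/folder02/file02"
cd "$work" || exit 1

find . -type d -name 'folder02' -exec sh -c '
  for dir in "$@"; do
    parent=${dir%/*}     # e.g. ./folder01
    name=${dir##*/}      # e.g. folder02
    # -C switches to the parent first, so stored paths start at folder02/.
    tar -czf "$parent/$name.tar.gz" -C "$parent" "$name"
  done' sh {} +

# List the stored paths to confirm there is no folder01/ prefix.
listing=$(tar -tzf "$work/folder01/folder02.tar.gz")
printf '%s\n' "$listing"
```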

Find and gzip files in subdirectories

I have *.xls files in the location /home/Docs/Calc. There are multiple subdirectories inside that folder. Eg
/home/Docs/Calc/2011
/home/Docs/Calc/2012
/home/Docs/Calc/2013
I can gzip each file under the subdirectories using the find command,
find /home/Docs/Calc -iname "*.xls" -exec gzip {} \;
but how can I gzip all the files in each subdirectory ? eg.
/home/Docs/Calc/2011/2011.tar.gz
/home/Docs/Calc/2012/2012.tar.gz
/home/Docs/Calc/2013/2013.tar.gz
I must add that /home/Docs/Calc is one of the many folders eg Calc-work, calc-tax, calc-bills. All of these have the year subfolders in them
Since we don't have to recurse, I'd approach it not with find but with globbing and a for loop. If we're in the Calc directory, echo * will give us all the directory names:
~/Docs/Calc$ echo *
2011 2012 2013
It just so happens that we can use a for loop to iterate over these and tar them up in the usual way:
for year in *; do
tar czf "$year.tar.gz" "$year"
done
If you want the resulting tarballs in the year directories, you could add an mv after the tar command. I'd be hesitant to put the tarball in the directory from outset or tar might start trying to tar its own output into itself.
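The suggestion to add an mv after tar could look like the sketch below. The scratch Calc directory and the file names jan.xls and feb.xls are assumptions for the demonstration; the */ glob restricts the loop to directories, so stray files next to the year folders are ignored.

```shell
#!/bin/sh
# Hypothetical scratch tree, created only for this demonstration.
work=$(mktemp -d)
mkdir -p "$work/Calc/2011" "$work/Calc/2012"
: > "$work/Calc/2011/jan.xls"
: > "$work/Calc/2012/feb.xls"
cd "$work/Calc" || exit 1

for year in */; do
  year=${year%/}                     # strip the trailing slash from the glob
  # Build the tarball next to the directory, then move it inside, so tar
  # never sees its own output while it is reading the directory.
  tar -czf "$year.tar.gz" "$year" && mv "$year.tar.gz" "$year/"
done
```

To cover sibling folders such as Calc-work or calc-tax, the same loop can be wrapped in an outer loop over those directories.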
I set up a simple function in my .bashrc:
function gzdp () {
find . -type f -name "$1" -exec gzip {} \;
}
The $1 is replaced with the first argument that comes after gzdp when you call the function. Now, in your command window you can navigate to the /home/Docs/Calc/ folder and just call:
gzdp '*.txt'
and it should gzip all .txt files in the current directory and in all subdirectories below it. Quote the pattern so that the shell passes it to find unexpanded.
Not sure if this helps or not, my first post on this website. Careful that you don't accidentally gzip unwanted .txt files.
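Why the pattern should be quoted is easiest to see with a matching file in the current directory. In the sketch below (scratch directory and file names are made up), an unquoted *.txt would be expanded by the shell to top.txt before the function is called, and find would then only look for files named top.txt.

```shell
#!/bin/sh
# A variant of the helper that takes the pattern as its first argument.
gzdp() {
  find . -type f -name "$1" -exec gzip {} +
}

# Hypothetical scratch tree, created only for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir sub
: > top.txt
: > sub/a.txt
: > sub/b.log

# The quotes keep the shell from expanding the pattern; find does the matching.
gzdp '*.txt'
```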
Try this:
find /home/Docs/Calc -type d -exec tar cvzf {}.tar.gz {} \;
Try this script as well:
#!/bin/bash
find /home/Docs/Calc/ -mindepth 1 -type d | while read -r DIR; do
NAME=${DIR##*/}
pushd "$DIR" >/dev/null && {
tar -cvpzf "${NAME}.tar.gz" *
popd >/dev/null
}
done
You can use the following shell script:
#!/bin/bash
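cd /home/Docs/Calc/ || exit 1
# Visit every subdirectory (not Calc itself) and archive its xls files in place.
find . -mindepth 1 -type d | while IFS= read -r f; do
# The subshell keeps the cd from affecting the next iteration.
(cd "$f" && tar -cvzf "${f##*/}.tar.gz" *.xls)
done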
