Find and gzip files in subdirectories - bash

I have *.xls files in /home/Docs/Calc. There are multiple subdirectories inside that folder, e.g.
/home/Docs/Calc/2011
/home/Docs/Calc/2012
/home/Docs/Calc/2013
I can gzip each file under the subdirectories using the find command,
find /home/Docs/Calc -iname "*.xls" -exec gzip {} \;
but how can I archive all the files in each subdirectory into a single tarball per subdirectory? e.g.
/home/Docs/Calc/2011/2011.tar.gz
/home/Docs/Calc/2012/2012.tar.gz
/home/Docs/Calc/2013/2013.tar.gz
I must add that /home/Docs/Calc is only one of many folders, e.g. Calc-work, calc-tax, calc-bills. All of these have the year subfolders in them.

Since we don't have to recurse, I'd approach it not with find but with globbing and a for loop. If we're in the Calc directory, echo * will give us all the directory names:
~/Docs/Calc$ echo *
2011 2012 2013
It just so happens that we can use a for loop to iterate over these and tar them up in the usual way:
for year in *; do
    tar czf "$year.tar.gz" "$year"
done
If you want the resulting tarballs in the year directories, you could add an mv after the tar command. I'd be hesitant to put the tarball in the directory from the outset, or tar might start trying to tar its own output into itself.
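For instance, a minimal sketch of that variant (assuming the same year-directory layout): create the archive first, then move it in, so tar never sees its own output:
for year in */; do
    year=${year%/}                  # strip the trailing slash left by the glob
    tar czf "$year.tar.gz" "$year"  # archive is created next to the directory
    mv "$year.tar.gz" "$year"/      # then moved inside it
done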

I set up a simple function in my .bashrc:
function gzdp () {
    find . -type f -name "$1" -exec gzip {} \;
}
The $1 gets replaced with whatever pattern you pass to gzdp when you call the function. Now, in your command window you can navigate to the /home/Docs/Calc/ folder and just call:
gzdp "*.txt"
and it should gzip all .txt files in all lower subdirectories.
Not sure if this helps; it is my first post on this website. Be careful that you don't accidentally gzip unwanted .txt files.
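Applied to the original question, a hedged usage example; the quotes keep the calling shell from expanding the pattern before the function sees it:
cd /home/Docs/Calc
gzdp "*.xls"   # gzips every .xls beneath the current directory, one .gz per file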

Try this:
find /home/Docs/Calc -type d -exec tar cvzf {}.tar.gz {} \;
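Note that this also creates an archive for /home/Docs/Calc itself and for every nested directory. A hedged variant restricted to the first-level year folders (assuming GNU find) would be:
find /home/Docs/Calc -mindepth 1 -maxdepth 1 -type d -exec tar cvzf {}.tar.gz {} \;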

Try this script as well:
#!/bin/bash
# For every subdirectory, create NAME.tar.gz inside it containing that directory's contents.
find /home/Docs/Calc/ -mindepth 1 -type d | while read -r DIR; do
    NAME=${DIR##*/}                    # directory name without its leading path
    pushd "$DIR" >/dev/null && {
        tar -cvpzf "${NAME}.tar.gz" *
        popd >/dev/null
    }
done

You can use the following shell script:
#!/bin/bash
cd /home/Docs/Calc/ || exit 1
# Create one NAME.tar.gz of the *.xls files inside each subdirectory
# (directories without .xls files will make tar report an error).
find . -mindepth 1 -type d | while read -r f; do
    (cd "$f" && tar -czf "${f##*/}.tar.gz" -- *.xls)
done


Copy all files with a certain extension from all subdirectories while preserving the structure of the subdirectories

How can I copy specific files from all directories and subdirectories to a new directory while preserving the original subdirectory structure?
This answer:
find . -name \*.xls -exec cp {} newDir \;
copies all xls files from all folders and subfolders into the same directory newDir, but the original file structure is lost; everything is copied on top of everything else. That is not what I want.
If an xls file is in /s1/s2/, then it should be copied to newDir/s1/s2.
You can try:
find . -type f -name '*.xls' -exec sh -c \
'd="newDir/${1%/*}"; mkdir -p "$d" && cp "$1" "$d"' sh {} \;
This applies the d="newDir/${1%/*}"; mkdir -p "$d" && cp "$1" "$d" shell snippet to every xls file found; that is, it first creates the target directory and then copies the file into it.
If you have a lot of files and performance issues you can try to optimize a bit with:
find . -type f -name '*.xls' -exec sh -c \
'for f in "$@"; do d="newDir/${f%/*}"; mkdir -p "$d" && cp "$f" "$d"; done' sh {} +
This second version processes the files in batches and thus spawns fewer shells.
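On GNU systems, cp's --parents option is a hedged alternative that produces the same layout without an inline shell:
mkdir -p newDir
find . -type f -name '*.xls' -exec cp --parents {} newDir \;   # copies ./s1/s2/a.xls to newDir/s1/s2/a.xls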
This should do:
# Ensure that newDir exists and is empty. Omit this step if you
# don't want it.
[[ -d newDir ]] && rm -r newDir && mkdir newDir
# Copy the xls files.
rsync -a --include='**/*.xls' --include='*/' --exclude='*' . newDir
The trick here is the combination of include and exclude. By default, rsync copies everything below its source directory (. in your case). We change this by excluding everything while explicitly including directories and the xls files; since the first matching rule wins, the include rules must come before the catch-all exclude.
In your example, newDir is itself a subdirectory of your working directory and hence part of the directory tree searched for copying. I would rethink this decision.
NOTE: This not only copies directories whose names end in .xls, but also recreates the whole directory structure of your source tree (even where there are no xls files in it), populating it only with xls files.
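A hedged way to preview what such a filter chain will copy before running it for real is rsync's dry-run mode:
rsync -avn --include='**/*.xls' --include='*/' --exclude='*' . newDir   # -n lists what would be transferred without copying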
Thanks for the solutions.
Meanwhile I found also:
find . -name '*.xls' | cpio -pdm newDir
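If filenames may contain spaces or newlines, a hedged variant of the same idea pairs find -print0 with GNU cpio's --null option:
find . -name '*.xls' -print0 | cpio -0 -pdm newDir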

bash script optimization file rename

I am a total noob, but I figured out this script for doing the following:
I have a folder called "unrar"; inside it are subfolders with unknown names, each containing a rar file.
I enter the unknown subfolder, find the rar file and unrar it in that subfolder.
After that I find the new file and rename it with the unknown subfolder's name. Then I grab the file and move it to ./unrar.
#!/bin/bash
cd /home/user/unrar/
for dir in /home/user/unrar/*;
do (cd "$dir" && find -name "*.rar" -execdir unrar e -r '{}' \;); done
echo "$(tput setaf 2)-> unrar done!$(tput sgr0)"
for dir in /home/user/unrar/*;
do (cd "$dir" && find -name "*.mkv" -exec mv '{}' "${PWD##*\/}.mkv" \;); done
for dir in /home/user/unrar/*;
do (cd "$dir" && find -name "*.mp4" -exec mv '{}' "${PWD##*\/}.mp4" \;); done
for dir in /home/user/unrar/*;
do (cd "$dir" && find -name "*.avi" -exec mv '{}' "${PWD##*\/}.avi" \;); done
cd /home/user/unrar
find -name "*.mkv" -exec mv '{}' /home/user/unrar \;
find -name "*.mp4" -exec mv '{}' /home/user/unrar \;
find -name "*.avi" -exec mv '{}' /home/user/unrar \;
This works fine with most files, but in some cases it doesn't.
I want to find *.rar in DIR and unrar it. The new file (.mkv|.avi|.mp4) should be renamed to DIR(.mkv|.avi|.mp4) and moved to ./unrar.
This is my filestructure.
./unrar/
  - unknownsubfolder/
    - file.rar
    - file.r00
    - ...
  - unknownsubfolder1/
    - s01/
      - file.rar
      - file.r00
      - ...
    - s02/
      - file.rar
      - file.r00
      - ...
    - ...
In case 1, "/unknownsubfolder/file.rar" is unrarred to give "x.mkv"; the file is renamed from "x.mkv" to "unknownsubfolder.mkv" and moved to "./unrar/unknownsubfolder.mkv"
(same with *.avi and *.mp4). == perfect
In case 2, my script unrars unknownsubfolder1/s01/file.rar, but the result is renamed to unknownsubfolder1.mkv instead of s01.mkv
(and if there are more, like s02, s03, s04 ..., I always end up with a single unknownsubfolder1.mkv file in ./unrar). == wrong output
So I guess I have 3 questions:
How do I get the right DIR name for renaming the file? Or how do I enter unknownsubfolder1/s01 and so on?
Is there a way to exclude a word from the find? Sometimes "unknownsubfolder" contains another folder and file called "sample(.mkv|.avi|.mp4)". I would like to exclude that, to prevent the original file from being overwritten with the sample file, which happens sometimes.
I am sure I can combine some of the code to make it even shorter. Could someone explain how? For example, how do I combine the mkv, avi and mp4 handling in one line?
Regards, wombat
(EDIT: for better understanding)
UPDATE:
I adjusted the solution to work with unrar. Since I did not have unrar installed previously, I used gunzip to construct the solution and then simply replaced it with unrar. The problem with this approach was that, by default, unrar extracts to the current working directory. Another difference is that the name of the extracted file can be completely different from the archive's name; it is not just a matter of different extensions. The original archive is also not deleted after extraction.
Here is the solution specifically tailored to work with unrar with respect to aforementioned behavior:
#!/bin/bash
path="$1"    # location of the unrar folder, e.g. /home/user/unrar
omit="$2"    # name of the folder to skip, e.g. sample

# Run unrar on every file found, extracting into the file's own directory.
while read f; do
    unrar e -r "${f}" "${f%/*}" > /dev/null
done < <(find "${path}" -type d -name "${omit}" -prune -o -type f -print)

# Move each remaining non-rar file to ${path}/<parent-directory-name>.
while read f; do
    new="${f%/*}"      # path of the parent directory
    new="${new##*/}"   # just its name
    mv "${f}" "${path}/${new}"
done < <(find "${path}" -type d -name "${omit}" -prune -o -type f -a \! -name '*.rar' -print)
You can save the script, e.g., as rename-script (do not forget to make it executable), and then call it like
./rename-script /path/to/unrar omitfolder
Notice that inside the script there is no cd. You will have to at least provide the location of the unrar folder as the first parameter, otherwise you will get an error. In the OP's case this would be /home/user/unrar. The omitfolder argument is not a path; it is just the name of the folder that you want to omit, so in the OP's case this would be sample.
./rename-script /home/user/unrar sample
As requested by the OP in the comments: you can read about the bash read builtin and process substitution to understand how the while loop works and how it assigns the filenames returned by find to the variable f.
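For readers unfamiliar with that combination, here is a minimal standalone sketch of the while/read plus process-substitution pattern (the path is illustrative only):
# Print the parent-directory name of every file under /tmp/demo
while read -r f; do
    parent="${f%/*}"       # drop the filename, keeping the directory path
    echo "${parent##*/}"   # drop everything but the last path component
done < <(find /tmp/demo -type f -print)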

Script to backup folders

I have a folder /opt/backup in which folders are created every day. In order to save space I would like to gzip all folders that are older than 2 days.
I don't want to create one single archive; rather, I want to archive each folder on its own, with its name preserved. I have tried:
#!/bin/bash
# Backup files
files=($(find /opt/backup/ -mtime +"2"))
for files in ${files[*]}
do
    echo $files
    tar cvfz backup.tar.gz $files
done
But all this does is create a single archive; I would like each folder archived separately.
The script will run every 2 days at 02:00 in the morning. How do I write this script, please?
You are making it too complicated. You should find the directories that are old enough and simply tar and gzip those.
find /opt/backup/ -mtime +"2" -type d -exec tar cvfz backup.tar.gz {} \;
This will look for all directories (-type d) and execute a certain command on them (tar cvfz backup.tar.gz {}), where {} is a placeholder for the directory found.
If you want to preserve the name of the dir, simply use {} a second time:
find /opt/backup/ -mtime +"2" -type d -exec tar cvfz {}.tar.gz {} \;
Note that no quotes are required around {} as special chars will be handled well inside find's exec.
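Since the question mentions running this every two days at 02:00, here is a hedged sketch of how it might be wired up; the script path, the restriction to first-level folders, and the schedule are assumptions:
#!/bin/bash
# backup-folders.sh - archive each first-level folder in /opt/backup older than 2 days
find /opt/backup/ -mindepth 1 -maxdepth 1 -type d -mtime +2 -exec tar cvfz {}.tar.gz {} \;
with a crontab entry along the lines of:
0 2 */2 * * /path/to/backup-folders.sh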

Bash: Moving multiple files into subfolders

I have a folder with a couple thousand files and I want to move them into subfolders according to a string in the filename. The files all have a structure like
something-run1_001.txt
something-run22_1243.txt
So I tried the following script in order to move all files with "run1" in the name into a subfolder r1 and all "run22" files into a subfolder r22 (and so on), but it does not work that way and I get the message "File X is the same as file X".
#!bin/bash
for i in {1..39}
do
foldername=r$i
#echo "$foldername"
mkdir $foldername
find . -type f -name "*run$i_*" | xargs -i mv {} $foldername/
done
How to solve this?
for i in {1..39}
do
    mkdir -p r${i}/
    mv *run${i}_* r${i}/
done
Does this work for your requirement?
mv *run*.html dir1
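For reference, the "File X is the same as file X" message from the original script most likely comes from the pattern "*run$i_*": bash parses $i_ as a variable named i_, which is unset, so the pattern collapses to "*run*" and matches files that have already been moved. A hedged fix of the original find/xargs approach (assuming GNU find and coreutils):
for i in {1..39}; do
    mkdir -p "r$i"
    # ${i} keeps the variable name separate from the underscore that follows it
    find . -maxdepth 1 -type f -name "*run${i}_*" -exec mv -t "r$i/" {} +
done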
If you still run into the "too many arguments" trap you can pipe find into a while loop
#!/bin/bash -u
find . -maxdepth 1 -name '*-run*_*.txt' |
{
    while read -r FNAME
    do
        N=${FNAME##*-run}    # strip everything up to and including "-run"
        N=${N%_*}            # strip the trailing "_NNN.txt"
        DIR=r$N
        test -d "$DIR" || mkdir "$DIR"
        mv "$FNAME" "$DIR/."
    done
}

Unix script to find all folders in the directory, then tar and move them

Basically I need to run a Unix script to find all folders in the directory /fss/fin, if any exist; then I have to tar them and move them to another directory, /fs/fi.
This is my command so far:
find /fss/fin -type d -name "essbase" -print
Here I have directly mentioned the folder name essbase. But instead, I would like to find all the folders in the /fss/fin and use them all.
How do I find all folders in the /fss/fin directory & tar them to move them to /fs/fi?
Clarification 1:
Yes, I only need to find all folders in the /fss/fin directory using a Unix shell script, tar them, and move them to another directory, /fs/fi.
Clarification 2:
I want to make the requirement clear. The shell script should:
Find all the folders in the directory /fss/fin
Tar the folders
Move the tarred folders to another directory /fs/fi, which is located on the server s11003232sz.net
On user request, untar the folders and move them back to the original directory /fss/fin
Here is an example I am working with that may lead you in the correct direction:
BackUpDIR="/srv/backup/"
SrvDir="/srv/www/"
DateStamp=$(date +"%Y%m%d")
for Dir in $(find "$SrvDir"* -maxdepth 0 -type d); do
    FolderName=$(basename "$Dir")
    tar zcf "$BackUpDIR$DateStamp.$FolderName.tar.gz" -P "$Dir"
done
Since tar does directories automatically, you really don't need to do very much. Assuming GNU tar:
tar -C /fss/fin -cf - essbase |
tar -C /fs/fi -xf -
The '-C' option changes directory before operating. The first tar writes to standard output (the lone '-') everything found in the essbase directory. The output of that tar is piped to the second tar, which reads its standard input (the lone '-'; fun isn't it!).
Assuming GNU find, you can also do:
(cd /fss/fin; tar -cf - $(find . -maxdepth 1 -type d | sed '/^\.$/d')) |
tar -xf - -C /fs/fi
This changes directory to the source directory; it runs 'find' with a maximum depth of 1 to find the directories and removes the current directory from the list with 'sed'; the first 'tar' then writes the output to the second one, which is the same as before (except I switched the order of the arguments to emphasize the parallelism between the two invocations).
If your top-level directories (those actually in /fss/fin) have spaces in the names, then there is more work to do again - I'm assuming none of the directories to be backed up start with a '.':
(cd /fss/fin; find * -maxdepth 0 -type d -print0 | xargs -0 tar -cf -) |
tar -xf - -C /fs/fi
This weeds out the non-directories from the list generated by '*', and writes them with NUL '\0' (zero bytes) marking the end of each name (instead of a newline). The output is written to 'xargs', which is configured to expect the NUL-terminated names, and it runs 'tar' with the correct directory names. The output of this ensemble is sent to the second tar, as before.
If you have directory names starting with a '.' to collect, then add '.[a-z]*' or another suitable pattern after the '*'; it is crucial that what you use does not list '.' or '..'. If you have names starting with dashes in the directory, then you need to use './*' and './.[a-z]*'.
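Putting that advice together, a hedged sketch that also picks up dot-directories and guards against leading dashes (it assumes each pattern matches at least once):
(cd /fss/fin; find ./* ./.[a-z]* -maxdepth 0 -type d -print0 | xargs -0 tar -cf -) |
tar -xf - -C /fs/fi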
If you've got still more perverse requirements, enunciate them clearly in an amendment to the question.
find /fss/fin -mindepth 1 -maxdepth 1 -type d -print
The above command gives you the list of first-level subdirectories of /fss/fin.
Then you can do anything with this. E.g. tar them to your output directory as in the command below:
tar -czf /fss/fi/outfile.tar.gz `find /fss/fin -mindepth 1 -maxdepth 1 -type d -print`
Original directory structure will be recreated after untar-ing.
Here is a bash example (change /fss/fin, /fs/fi with your paths):
dirs=($(find /fss/fin -type d))
for dir in "${dirs[@]}"; do
    tar zcf "$dir.tgz" "$dir" -P -C /fs/fi && mv -v "$dir" /fs/fi/
done
done
which finds all the folders, tars each of them separately, and, if successful, moves them into the other folder.
This should do it:
#!/bin/sh
list=$(find . -type d)
for i in $list
do
    if [ "$i" != "." ]; then
        tar -czf "${i}.tar.gz" "${i}"
    fi
done
mv *.tar.gz ~/tardir
