Extract recursively and append extension? - bash

I want to make a script that can extract rar files recursively and append an extension to the files being extracted.
The extension should be added during the process (so that other software doesn't see a recognised file extension and start its process until all files are extracted). Once all files are completed the extension should be removed.
Here is an example file structure...
/some/path/
folder1/
folder2/
file1.rar
folder3/
file2.rar
file3.rar
folder4/
file4.rar
I want this to turn to this...
/some/path/
folder1/
folder2/
file1.rar
file1.txt.extracting
folder3/
file2.rar
file2.txt.extracting
file3.rar
file3.txt.extracting
folder4/
file4.rar
file4.txt.extracting
Then once all are complete, to this...
/some/path/
folder1/
folder2/
file1.rar
file1.txt
folder3/
file2.rar
file2.txt
file3.rar
file3.txt
folder4/
file4.rar
file4.txt
I hope that makes sense. Is this possible?

The following should work:
cd "$(mktemp -d)"
find /some/path/ -name '*.rar' \
-exec unrar e {} \; \
-exec bash -c 'mv -- * "$(dirname "$1")"' _ {} \;
# if relevant, cd - to get back to the previous directory
This will iterate over all the .rar files in /some/path and its subdirectories, extracting each one into a temporary directory before moving the extracted content to the .rar file's directory.
Executing bash -c instead of mv directly is required so that the * glob and the $(dirname ...) command substitution are evaluated once per archive, after it has been extracted. The archive path is passed to bash as $1 rather than embedded in the command string, so paths containing spaces or quotes are handled safely.
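The question also asks for a temporary extension, which the command above does not add. Here is a minimal sketch of one way to do it, assuming unrar is installed and that nothing else under the root already ends in .extracting. The function names and the suffix variable are made up for illustration.

```shell
#!/bin/bash
# Sketch: extract every .rar under a root, tagging the extracted files with
# a temporary suffix, then strip the suffix once all archives are done.
suffix=".extracting"

# Extract one archive into a scratch directory, then move each extracted
# file next to the archive with the temporary suffix appended.
extract_one() {
    local rar=$1 scratch f
    scratch=$(mktemp -d) || return 1
    # </dev/null stops unrar from consuming the caller's piped input
    unrar e -inul "$rar" "$scratch/" </dev/null || { rm -rf "$scratch"; return 1; }
    for f in "$scratch"/*; do
        [ -e "$f" ] || continue
        mv -- "$f" "$(dirname "$rar")/$(basename "$f")$suffix"
    done
    rm -rf "$scratch"
}

# Remove the temporary suffix from every tagged file under the root.
strip_suffix() {
    local root=$1 f
    find "$root" -type f -name "*$suffix" -print0 |
    while IFS= read -r -d '' f; do
        mv -- "$f" "${f%"$suffix"}"
    done
}

# Extract all archives first, and only then rename the results.
extract_all() {
    local root=$1 rar
    find "$root" -type f -name '*.rar' -print0 |
    while IFS= read -r -d '' rar; do
        extract_one "$rar"
    done
    strip_suffix "$root"
}
```

After sourcing this, extract_all /some/path would run the whole process. Because the rename happens in a separate pass, other software watching the tree only sees the final names once every archive has been extracted.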

Related

How to write a bash script to copy files from one base to another base location

I have a bash script I'm trying to write
I have 2 base directories:
./tmp/serve/
./src/
I want to go through all the directories in ./tmp and copy the *.html files into the same folder path in ./src
i.e.
if I have an HTML file at ./tmp/serve/app/components/help/help.html -->
copy it to ./src/app/components/help/ and recursively do this for all subdirectories in ./tmp/
NOTE: the folder structures should exist so just need to copy them only. If it doesn't then hopefully it could create the folder for me (not what I want) but with GIT I can track these folders to manually handle those loose html files.
I got as far as
echo $(find . -name "*.html")\n
But not sure how to actually extract the file path with pwd and do what I need to, maybe it's not a one liner and needs to be done with some vars.
something like
for i in `find /tmp/ -name "*.html"`
do
cp -r $i /src/app/components/help/
done
going so far to create the directories would take some more time for me.
I'll try to do it on my own and see if I come up with something
but for argument sake if you do run pwd and get a response the pseudo code for that:
pwd
get response
if that directory does not exist in src create that directory
copy all the original directories contents into the new folder at /src/$newfolder
(possibly running two for loops, one to check the directory tree, and then one to go through each original directory, copying all the html files)
You can use process substitution to loop over the output of your find command, create the destination directories, and then copy the files:
#!/bin/bash
# Accept the source and destination directories as the first and second
# parameters, or fall back to the defaults if none are passed.
src_dir=${1:-./tmp/serve}
dest=${2:-./src}
# Drop any trailing slash so the prefix removal below matches find's output.
src_dir=${src_dir%/}
while IFS= read -r orig_path ; do
    # To remove a leading pattern from a value, use ${parameter#pattern};
    # here it strips the source directory, and the destination is prepended.
    dest_path="${dest}/${orig_path#"${src_dir}"/}"
    # Use dirname to remove the filename from the destination path
    # and create the destination directory.
    dest_dir=$(dirname "${dest_path}")
    mkdir -p "${dest_dir}"
    cp "${orig_path}" "${dest_path}"
done < <(find "${src_dir}" -name '*.html')
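This script copies .html files from the src directory to the des directory, creating subdirectories if they do not exist.
It finds the files, removes the src directory name from each path, and copies them into the destination directory.
#!/bin/bash
# Note: this loop splits on whitespace, so it assumes file names without spaces.
for i in $(find src/ -name "*.html")
do
    # Remove the leading src/ so the path is relative to the source directory
    file=${i#src/}
    mkdir -p "des/$(dirname "$file")"
    cp "$i" "des/$file"
done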
Not sure if you must use bash constructs or not, but here is a GNU tar solution (if you use GNU tar), which IMHO is the best way to handle this situation because all the metadata for the files (permissions, etc.) are preserved:
find ./tmp/serve -name '*.html' -type f -print0 | tar --null -T - -cf - | tar -xvf - -C ./src --strip-components=3
This finds all the .html files (-type f) in the ./tmp/serve directory and prints them nul-terminated (-print0), then sends these filenames via stdin to tar as nul-terminated literals (--null) for inclusion (-T -), creating (-c) an archive on standard output (-f -). That archive is piped to another tar instance, which extracts (-x) it from standard input, printing its contents along the way (optional: -v), changing directory to the destination (-C ./src) before commencing and stripping (--strip-components=3) the ./tmp/serve/ prefix from the files. (You could also cd ./tmp/serve beforehand, using find . instead, and change -C to ../../src.)

Copy all files in directory except ".txt" and not to replace existing files

I have to copy all the files from a source directory to a destination directory, but skip all files with the extension ".txt" and not replace a file if it is already present in the destination directory.
example
source directory
/a/aone.js
/a/atwo.js
/b/bone.txt
/b/btwo.js
destination directory
/a/atwo.js
then it should only copy
/a/aone.js
/b/btwo.js
and skip "/a/atwo.js" because it's already present in the destination folder
and skip "/b/bone.txt" because its extension is ".txt"
I tried these commands, but they do not work:
find /path/to/source/ \( ! -name "*.txt" \) -type f | cp -n /path/to/destination/ -R
cp -n /path/to/source/*(!*.txt) /path/to/destination/ -R
Assuming you can use rsync (-vaz is verbose, archive and compress; --ignore-existing skips files that already exist in the destination; I believe the other options are self-explanatory):
rsync -vaz --ignore-existing --exclude "*.txt" /path/to/source/ /path/to/destination/
Why make it difficult? You were on the right track. A simple:
cp -an /path/to/source/*.[^t]* /path/to/destination
will copy all files from source, except those whose extension begins with a t, to destination. It will do so without overwriting existing files in destination. This presumes that file names do not have more than one dot. If they do, a few more lines of code will be needed.
The following illustrates the use of the above:
$ mkdir tmp
$ mkdir a
$ mkdir b
$ touch a/a.{j,k,l,txt}
$ ls -1 a
a.j
a.k
a.l
a.txt
$ cp -an a/a*.[^t]* b
$ ls -1 b
a.j
a.k
a.l
Using cp, you must match the proper directory depth. If you have another intervening directory, simply add an additional wildcard. For example:
$ ls -1 dat/*/*.[^t]*
dat/a/a.j
dat/a/a.k
dat/a/a.l
dat/b/a.j
dat/b/a.k
dat/b/a.l
If your directory structure gets more complex, then go with find or rsync. Both are excellent tools and rsync can handle both local and network transfers. cp is the right tool for small jobs, but when more flexibility is needed, then grab a bigger hammer.
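For the recursive case, here is a rough find-based sketch, assuming bash and a find that supports -print0 (GNU and BSD both do). The copy_tree name is made up for illustration. It skips only names ending in .txt, so multiple dots and other extensions starting with t are not a problem.

```shell
#!/bin/bash
# copy_tree SRC DST (illustrative name): copy every regular file under SRC,
# except *.txt, to the same relative path under DST, never overwriting a
# file that already exists there.
copy_tree() {
    local src=${1%/} dst=${2%/} f rel
    find "$src" -type f ! -name '*.txt' -print0 |
    while IFS= read -r -d '' f; do
        rel=${f#"$src"/}                     # path relative to SRC
        [ -e "$dst/$rel" ] && continue       # already present: leave it alone
        mkdir -p "$(dirname "$dst/$rel")"
        cp -p -- "$f" "$dst/$rel"
    done
}
```

With the example from the question, copy_tree /path/to/source /path/to/destination would copy a/aone.js and b/btwo.js and leave the existing a/atwo.js untouched.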

How can I recursively copy same-named files from one directory structure to another in bash?

I have two directories, say dir1 and dir2, that have exactly the same directory structure. How do I recursively copy all the *.txt files from dir1 to dir2?
Example:
I want to copy from
dir1/subdir1/file.txt
dir1/subdir2/someFile.txt
dir1/.../..../anotherFile.txt
to
dir2/subdir1/file.txt
dir2/subdir2/someFile.txt
dir2/.../..../anotherFile.txt
The .../... in the last file example means this could be any sub-directory, which can have sub-directories itself.
Again I want to do this programmatically. Here's the pseudo-code
SRC=dir1
DST=dir2
for f in `find ./$SRC "*.txt"`; do
# $f should now be dir1/subdir1/file.txt
# I want to copy it to dir2/subdir1/file.txt
# the next line conveys the idea, but does not work
# I'm attempting to substitute "dir1" with "dir2" in $f,
# and store the new path in tmp.txt
echo `sed -i "s/$SRC/$DST/" $f` > tmp.txt
# Do the copy
cp -f $f `cat tmp.txt`
done
You can simply use rsync. This answer is based on this thread.
rsync -av --include='*.txt' --include='*/' --exclude='*' dir1/ dir2/
If you only have .txt files in dir1, this would work:
cp -R dir1/* dir2/
But if you have other file extensions, it will copy them too. In this case, this will work:
cd /path/to/dir1
# --parents is a GNU cp option; it recreates each file's relative path under the target
find . -name '*.txt' -exec cp --parents {} /path/to/dir2/ \;
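If you would rather stay close to the loop in the question, the path substitution can be done with bash parameter expansion instead of sed. This is only a sketch: copy_txt is an illustrative name, and it assumes the destination already has the same directory layout, as the question states.

```shell
#!/bin/bash
# copy_txt SRC DST (illustrative name): copy every *.txt under SRC to the
# same relative path under DST, which is assumed to mirror SRC's layout.
copy_txt() {
    local src=${1%/} dst=${2%/} f
    while IFS= read -r -d '' f; do
        # Swap the leading SRC for DST:
        # dir1/subdir1/file.txt becomes dir2/subdir1/file.txt
        cp -f -- "$f" "$dst/${f#"$src"/}"
    done < <(find "$src" -type f -name '*.txt' -print0)
}
```

Calling copy_txt dir1 dir2 then covers the example in the question. The NUL-delimited read keeps file names with spaces intact, which the backtick loop would not.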

Unzip Folders to Parent Directory Keeping Zipped Folder Name

I have a file structure as follows:
archives/
zips/
zipfolder1.zip
zipfolder2.zip
zipfolder3.zip
...
zipfolderN.zip
I have a script that unzips the folders to the parent directory "archives", but it is unzipping the contents of the folders to the "archives" directory. I need the zipped folders to remain as folders under the "archives" directory. The resultant file structure should look like this:
archives/
zips/
zipfolder1.zip
zipfolder2.zip
...
zipfolder1/
contents...
zipfolder2/
contents...
...
I am currently using the following:
find /home/username/archives/zips/*.zip -type f | xargs -i unzip -d ../ -q '{}'
How can I modify this line to keep the original folder names? Is it as simple as using ../*?
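You could use basename to extract each zip into the desired directory (run this from inside zips/, as with your original command, because ../ is relative to the current directory):
find /home/username/archives/zips/*.zip -type f -exec sh -c 'unzip -q -d ../"$(basename "$1" .zip)" "$1"' sh {} \;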
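If you do not want the result to depend on the current directory, the target can be derived from each archive's own path. A small sketch, assuming unzip is installed; target_dir_for and unzip_all are illustrative names, not existing tools.

```shell
#!/bin/bash
# target_dir_for ZIP (illustrative helper): print the directory ZIP should
# be extracted into: <parent of the zip's directory>/<zip name without .zip>
target_dir_for() {
    local zip=$1 zip_dir
    zip_dir=$(dirname "$zip")
    printf '%s/%s\n' "$(dirname "$zip_dir")" "$(basename "$zip" .zip)"
}

# unzip_all ZIPS_DIR: extract each archive in ZIPS_DIR into its own folder
# one level up.
unzip_all() {
    local zip
    for zip in "$1"/*.zip; do
        [ -e "$zip" ] || continue   # no .zip files: the glob stays literal
        unzip -q -d "$(target_dir_for "$zip")" "$zip"
    done
}
```

With the layout from the question, unzip_all /home/username/archives/zips would extract zipfolder1.zip into /home/username/archives/zipfolder1, and so on.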

Unix script to find all folders in the directory, then tar and move them

Basically I need to run a Unix script to find all folders in the directory /fss/fin, if it exists; then I have to tar them and move them to another directory /fs/fi.
This is my command so far:
find /fss/fin -type d -name "essbase" -print
Here I have directly mentioned the folder name essbase. But instead, I would like to find all the folders in the /fss/fin and use them all.
How do I find all folders in the /fss/fin directory & tar them to move them to /fs/fi?
Clarification 1:
Yes I need to find only all folders in the directory /fss/fin directory using a Unix shell script and tar them to another directory /fs/fi.
Clarification 2:
I want to make it clear with the requirement. The Shell Script should contain:
Find all the folders in the directory /fss/fin
Tar the folders
Move the folders in another directory /fs/fi which is located on the server s11003232sz.net
On user request it should untar the folders and move them back to the original directory /fss/fin
Here is an example I am working with that may lead you in the right direction:
BackUpDIR="/srv/backup/"
SrvDir="/srv/www/"
DateStamp=$(date +"%Y%m%d")
# The trailing slash makes the glob match only directories
for Dir in "$SrvDir"*/
do
    FolderName=$(basename "$Dir")
    tar zcf "$BackUpDIR$DateStamp.$FolderName.tar.gz" -P "$Dir"
done
Since tar does directories automatically, you really don't need to do very much. Assuming GNU tar:
tar -C /fss/fin -cf - essbase |
tar -C /fs/fi -xf -
The '-C' option changes directory before operating. The first tar writes to standard output (the lone '-') everything found in the essbase directory. The output of that tar is piped to the second tar, which reads its standard input (the lone '-'; fun isn't it!).
Assuming GNU find, you can also do:
(cd /fss/fin; tar -cf - $(find . -maxdepth 1 -type d | sed '/^\.$/d')) |
tar -xf - -C /fs/fi
This changes directory to the source directory; it runs 'find' with a maximum depth of 1 to find the directories and removes the current directory from the list with 'sed'; the first 'tar' then writes the output to the second one, which is the same as before (except I switched the order of the arguments to emphasize the parallelism between the two invocations).
If your top-level directories (those actually in /fss/fin) have spaces in the names, then there is more work to do again - I'm assuming none of the directories to be backed up start with a '.':
(cd /fss/fin; find * -maxdepth 0 -type d -print0 | xargs -0 tar -cf -) |
tar -xf - -C /fs/fi
This weeds out the non-directories from the list generated by '*', and writes them with NUL '\0' (zero bytes) marking the end of each name (instead of a newline). The output is written to 'xargs', which is configured to expect the NUL-terminated names, and it runs 'tar' with the correct directory names. The output of this ensemble is sent to the second tar, as before.
If you have directory names starting with a '.' to collect, then add '.[a-z]*' or another suitable pattern after the '*'; it is crucial that what you use does not list '.' or '..'. If you have names starting with dashes in the directory, then you need to use './*' and './.[a-z]*'.
If you've got still more perverse requirements, enunciate them clearly in an amendment to the question.
find /fss/fin -mindepth 1 -maxdepth 1 -type d -print
The above command gives you the list of first-level subdirectories of /fss/fin.
Then you can do anything with this list, e.g. tar them to your output directory as in the command below:
tar -czf /fs/fi/outfile.tar.gz $(find /fss/fin -mindepth 1 -maxdepth 1 -type d -print)
Original directory structure will be recreated after untar-ing.
Here is a bash example (replace /fss/fin and /fs/fi with your paths):
# Only the top-level folders; this assumes their names contain no whitespace
dirs=($(find /fss/fin -mindepth 1 -maxdepth 1 -type d))
for dir in "${dirs[@]}"; do
    tar zcf "$dir.tgz" "$dir" -P -C /fs/fi && mv -v "$dir" /fs/fi/
done
which finds all the top-level folders, tars them separately and, if that succeeds, moves them into a different folder.
This should do it:
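#!/bin/sh
# Only the first-level directories, so every archive lands in the current directory
list=$(find . -maxdepth 1 -type d)
for i in $list
do
    # POSIX [ uses != for string comparison (== is a bashism)
    if [ "$i" != "." ]; then
        tar -czf "${i}.tar.gz" "${i}"
    fi
done
mv ./*.tar.gz ~/tardir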
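To cover the "untar on request" part of the question as well, here is a minimal sketch with one archive per top-level folder. It assumes bash and a tar that supports -z, and that the destination is a locally reachable path; copying to the remote server is not covered. archive_dirs and restore_dir are illustrative names.

```shell
#!/bin/bash
# archive_dirs SRC DST (illustrative name): write one DST/<name>.tar.gz per
# top-level directory in SRC. Hidden directories are not matched by the glob.
archive_dirs() {
    local src=${1%/} dst=${2%/} dir name
    mkdir -p "$dst" || return 1
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # -C stores paths relative to SRC (./name/...), not absolute ones
        tar -C "$src" -czf "$dst/$name.tar.gz" "./$name" || return 1
    done
}

# restore_dir ARCHIVE SRC (illustrative name): unpack one archive back under SRC.
restore_dir() {
    tar -C "$2" -xzf "$1"
}
```

For the paths in the question this would be archive_dirs /fss/fin /fs/fi, and later restore_dir /fs/fi/essbase.tar.gz /fss/fin to bring a single folder back. Storing relative paths is what lets the archive be restored under any directory.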

Resources