How to decompress multiple nested archives of different formats? - bash

I have hundreds of .zip and .tar archives nested in each other to an unknown depth, and I need to decompress all of them to get to the innermost one. How can I achieve that?
I have the part for the zip files:
while 'true'
do
find . '(' -iname '*.zip' ')' -exec sh -c 'unzip -o -d "${0%.*}" "$0"' '{}' ';'
done
but once it stumbles upon the .tar file it expectedly does nothing. I'm running the script on mac.
The structure is just an archive in an archive, the extensions are not in any particular order, like:
a.zip/b.zip/c.tar/d.tar/e.zip/f.tar...
and so on

You can use an existing command like 7z x to extract either archive type, or build your own using case "$file" in *.zip) unzip ... ;; *.tar) ... ;; esac and so on.
The following script unpacks nested archives as long as the unpacked content is exactly one .tar or .zip archive. It stops when multiple archives, multiple files, or even a directory containing just one .zip are unpacked at once.
#! /usr/bin/env bash
# this function can be replaced by `7z x "$1"`
# if 7zip is installed (package managers often call it p7zip)
extract() {
case "$1" in
*.zip) unzip "$1" ;;
*.tar) tar -xf "$1" ;;
*) echo "Unknown archive type: $1"; exit 1 ;;
esac
}
isOne() {
[ $# = 1 ]
}
mkdir out tmp
ln {,out/}yourOutermostArchive.zip # <-- Adapt this line
cd out
shopt -s nullglob
while isOne * && isOne *.{zip,tar}
do
a=(*)
mv "$a" ../tmp/
extract "../tmp/$a"
rm "../tmp/$a"
done
rm -r ../tmp
cd ..
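If the nesting can branch (several archives unpacked at once, or archives sitting inside directories), one option is to keep extracting whatever archive find reports until none remain. The following is only a sketch under assumptions: the names extract_one and extract_nested are made up for illustration, each archive is unpacked into a directory named after it and then deleted, and file names are assumed not to contain newlines.

```shell
# Hypothetical helper: extract one archive into a sibling directory named
# after the archive (minus its last extension), then delete the archive.
extract_one() {
    archive=$1
    dest=${archive%.*}
    mkdir -p "$dest" || return 1
    case "$archive" in
        *.[Zz][Ii][Pp]) unzip -o -q -d "$dest" "$archive" ;;
        *.[Tt][Aa][Rr]) tar -xf "$archive" -C "$dest" ;;
        *) return 1 ;;
    esac && rm -- "$archive"
}

# Keep extracting until no .zip or .tar file remains under the start directory.
# Stops with a non-zero status as soon as one extraction fails, so it cannot
# loop forever on a broken archive.
extract_nested() {
    start=${1:-.}
    while :; do
        next=$(find "$start" -type f \( -iname '*.zip' -o -iname '*.tar' \) | head -n 1)
        [ -n "$next" ] || break
        extract_one "$next" || return 1
    done
}
```

Because every archive is removed after it has been unpacked, this should be run on a copy of the data, for example with extract_nested ./copy_of_archives.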

Related

Unpack .tar.gz and modify result files

I wanted to write a bash script that will unpack .tar.gz archives and for each result file it will set an additional attribute with the name of the original archive. Just to know what the origin is of the unpacked file.
I tried to store the inside files in an array and then for-loop them.
for archive in "$1"*.tar.gz; do
if [ -f "${archive}" ]
then
readarray -t fileNames < <(tar tzf "$archive")
for file in "${fileNames}"; do
echo "${file}"
tar xvzf "${archive}" -C "$1" --no-wildcards "${file}" &&
attr -s package -V "${archive}" "${file}"
done
fi
done
The result is that only one file is extracted and no extra attribute is set.
#! /bin/bash
for archive in "$1"*.tar.gz; do
if [ -f "${archive}" ] ; then
# Unpack the archive into subfolder $1
tar xvf "$archive" -C "$1"
# Assign attributes
tar tf "$archive" | (cd "$1" && xargs -t -L1 attr -s package -V "$archive" )
fi
done
Notes:
The script unpacks each archive with a single tar call. This is more efficient than unpacking one file at a time. It also avoids problems with directory entries, which would otherwise lead to unnecessary repeated work.
The script uses attr. If the target file system supports it, setfattr would be better, because it can set the attribute on multiple files per call (using xargs with several file names per command).
The structure of the output folder is not clear. From the question, it looks as if all archives are unpacked into the same folder "$1". The solution above assumes that this is the intended behavior, and that file names are distinct across archives. If each archive is to be unpacked into its own subfolder, the task is easier and more efficient to implement.
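The per-archive subfolder variant mentioned in the last note could look roughly like this. It is a sketch, not part of the original answer: the function name unpack_each_to_subfolder is hypothetical, and it assumes the archives sit directly inside the directory passed as the first argument. With this layout the origin of each file is visible from its folder name, so no extended attribute is needed.

```shell
# Hypothetical helper: unpack every .tar.gz in directory $1 into its own
# subfolder of $1, named after the archive without the .tar.gz suffix.
unpack_each_to_subfolder() {
    dir=${1:-.}
    for archive in "$dir"/*.tar.gz; do
        [ -f "$archive" ] || continue      # no match: the pattern stays literal
        name=$(basename "$archive" .tar.gz)
        mkdir -p "$dir/$name" || return 1
        tar -xzf "$archive" -C "$dir/$name" || return 1
    done
}
```

Called as unpack_each_to_subfolder /path/to/archives, it leaves /path/to/archives/foo/ next to foo.tar.gz, so identical file names in different archives no longer collide.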

Bash script of unzipping unknown name files

I have a folder that will contain a zip file after an rsync. I want to unzip it into its own folder (if the zip is L155.zip, its contents should go into an L155 folder). The problem is that I don't know its name beforehand (although I know it will be "letter-number-number-number"), so I have to unzip an unknown file into an unknown folder, and this has to happen automatically.
The command “unzip *” (or unzip *.zip) works in the terminal, but not in a script.
These are the commands that worked one by one in the terminal, but don't work in a script.
#!/bin/bash
unzip * #also tried .zip and /path/to/file/* when script is on different folder
i=$(ls | head -1)
y=${i:0:4}
mkdir $y
unzip * -d $y
First I unzip the file, then I read the name of the first extracted file through ls and save it in a variable. I take the first 4 chars, make a directory with that name, and then unzip the files again into that specific folder.
The whole procedure after the first unzip is there because the files inside the .zip all start with the name of the zip itself, so if L155.ZIP is the zip, the files inside will be L155***.txt.
The zip file is at /path/to/file/NAME.zip.
When I run the script I get errors like the following:
unzip: cannot find or open /path/to/file/*.ZIP
unzip: cannot find or open /path/to/file//*.ZIP.zip
unzip: cannot find or open /path/to/file//*.ZIP.ZIP. No zipfiles found.
mkdir: cannot create directory 'data': File exists data
unzip: cannot find or open data, data.zip or data.ZIP.
Original answer
Supposing that foo.zip contains a folder foo, you could simply run
#!/bin/bash
unzip \*.zip \*
And then run it as bash auto-unzip.sh.
If you want to have these files extracted into a different folder, then I would modify the above as
#!/bin/bash
cp *.zip /home/user
cd /home/user
unzip \*.zip \*
rm *.zip
This, of course, you would run from the folder where all the zip files are stored.
Another answer
Another "simple" fix is to get dtrx (also available in the Ubuntu repos, possibly for other distros). This will extract each of your *.zip files into its own folder. So if you want the data in a different folder, I'd follow the second example and change it thusly:
#!/bin/bash
cp *.zip /home/user
cd /home/user
dtrx *.zip
rm *.zip
I would try the following.
for i in *.[Zz][Ii][Pp]; do
DIRECTORY=$(basename "$i" .zip)
DIRECTORY=$(basename "$DIRECTORY" .ZIP)
unzip "$i" -d "$DIRECTORY"
done
As noted, the basename program removes the indicated suffix .zip from the filename provided.
I have edited it to be case-insensitive. Both .zip and .ZIP will be recognized.
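The two basename calls can also be replaced with shell parameter expansion, which strips the last extension regardless of its case. A small sketch; zip_dir_name is a hypothetical helper name, not something from the answer above:

```shell
# Hypothetical helper: derive the target directory name from a zip path.
zip_dir_name() {
    name=${1##*/}              # drop any leading directory part
    printf '%s\n' "${name%.*}" # drop the last extension (.zip, .ZIP, ...)
}

# Example loop (requires unzip); uncomment to use:
# for i in *.[Zz][Ii][Pp]; do
#     [ -f "$i" ] && unzip "$i" -d "$(zip_dir_name "$i")"
# done
```

For instance, zip_dir_name /path/to/file/L155.ZIP prints L155.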
for zfile in $(find . -maxdepth 1 -type f -name "*.zip")
do
fn=${zfile:2:4} # skips the leading "./" and keeps the next 4 characters (e.g. L155), i.e. the name without the .zip extension
mkdir -p "$fn"
unzip "$zfile" -d "$fn"
done
If the folder has only one file with the extension .zip, you can extract the name without the extension with the basename tool:
BASE=$(basename *.zip .zip)
This will produce an error message if there is more than one file matching *.zip.
Just to be clear about the issue here, the assumption is that the zip file does not contain a folder structure. If it did, there would be no problem; you could simply extract it into the subfolders with unzip. The following is only needed if your zipfile contains loose files, and you want to extract them into a subfolder.
With that caveat, the following should work:
#!/bin/bash
DIR=${1:-.}
BASE=$(basename "$DIR/"*.zip .zip 2>/dev/null) ||
{ echo More than one zipfile >> /dev/stderr; exit 1; }
if [[ $BASE = "*" ]]; then
echo No zipfile found >> /dev/stderr
exit 1
fi
mkdir -p "$DIR/$BASE" ||
{ echo Could not create $DIR/$BASE >> /dev/stderr; exit 1; }
unzip "$DIR/$BASE.zip" -d "$DIR/$BASE"
Put it in a file (anywhere), call it something like unzipper.sh, and chmod a+x it. Then you can call it like this:
/path/to/unzipper.sh /path/to/data_directory
simple one liner I use all the time
$ for file in `ls *.zip`; do unzip $file -d `echo $file | cut -d . -f 1`; done

How to untar specific files from a number of tar files and zip them?

The requirement is to extract all the *.properties files from multiple tars and put them into a zip.
I tried this:
find . -iwholename "*/ext*/*.tar.gz"|xargs -n 1 tar --wildcards '*.properties' -xvzf | zip -@ tar-properties.zip
This is creating a zip with the .properties files in all the tars.
But the issue is the tars are structured as in each tar contains a properties folder which contains the files. The above command is creating a zip with a single properties folder which contains all the files .
Is there a way to put these in the zip with a folder structure like {name of the tar}/properties/*.properties ?
You could use this script. My solution uses --transform as well. Please check first if your tar command supports it with tar --help 2>&1 | grep -Fe --transform.
#!/bin/bash
[ -n "$BASH_VERSION" ] || {
echo "You need bash to run this script." >&2
exit 1
}
TEMPDIR=/tmp/properties-files
OUTPUTFILE=$PWD/tar-properties.zip ## Must be an absolute path.
IFS=
if [[ ! -d $TEMPDIR ]]; then
mkdir -p "$TEMPDIR" || {
echo "Unable to create temporary directory $TEMPDIR." >&2
exit 1
}
fi
NAMES=()
while read -r FILE; do
NAMEOFTAR=${FILE##*/} ## Remove dir part.
NAMEOFTAR=${NAMEOFTAR%.tar.gz} ## Remove .tar.gz extension.
echo "Extracting $FILE."
tar --wildcards '*.properties' -xvzf "$FILE" -C "$TEMPDIR" --transform "s#.*/#${NAMEOFTAR//#/\\#}/properties/#" || {
echo "An error occurred extracting to $TEMPDIR." >&2
exit 1
}
NAMES+=("$NAMEOFTAR")
done < <(exec find . -type f -iwholename '*/ext*/*.tar.gz')
(
cd "$TEMPDIR" >/dev/null || {
echo "Unable to change directory to $TEMPDIR."
exit 1
}
zip -r "$OUTPUTFILE" "${NAMES[@]}"
)
Save it to a script then run it on the directory where those files are to be searched with
bash /path/to/script.sh
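You can probably do the trick with the tar option --transform (alias --xform). This option rewrites member paths using a sed replacement expression. It has to come before -xvzf, because the f flag takes the next argument as the archive name, and --show-transformed-names makes tar print the rewritten paths so that zip can find them.
find . -iwholename "*/ext*/*.tar.gz"|xargs -n 1 tar --wildcards '*.properties' --xform 's#.*/#name_of_the_tar/properties/#' --show-transformed-names -xvzf | zip -@ tar-properties.zip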
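To see what such a replacement expression does to member paths before running tar, the same expression can be fed to sed directly. This is just an illustration: preview_transform and the archive name myarchive are made up, and the archive name is assumed not to contain characters that are special to sed (such as # or &).

```shell
# Hypothetical helper: show how member paths read from stdin would be
# rewritten for an archive called $1 (tar itself is not run).
preview_transform() {
    sed "s#.*/#$1/properties/#"
}

printf '%s\n' 'properties/app.properties' 'properties/db.properties' |
    preview_transform myarchive
# prints:
#   myarchive/properties/app.properties
#   myarchive/properties/db.properties
```

Because .*/ is greedy, everything up to the last slash is replaced, so files from deeper subdirectories all end up directly under the properties folder of their archive.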

bash scripting copying all files in folder

I'm writing a shell script as follows:
for file in `ls`
do
mkdir "$file"_folder
cp $file "$file"_folder
done
What I want to do is to make a folder for each file in the current directory with its name and then underscore folder as the name and then copy that file into it. My problem is that the file names contain spaces in them. How do I escape them?
There are many resources explaining how to do this for variables but none of them can be applied to this situation where I use a for loop to get the names.
Don't use ls there, use shell globbing. (In general, do not parse the output of ls.)
for file in *
do
# only consider files, not directories
if [ -f "$file" ] ; then
new_dir="$file"_folder
# create the directory
if [ ! -d "$new_dir" ] ; then
mkdir "$new_dir"
if [ $? -ne 0 ] ; then
# handle directory creation error: report it and skip this file
echo "could not create $new_dir" >&2
continue
fi
fi
# possibly check for the copied file existence here
# and deal with that appropriately (i.e. skip/error/copy anyway)
cp "$file" "$new_dir"
fi
done
How about
find . -type f -exec mkdir {}_folder \; -exec cp {} {}_folder \;
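It finds all regular files under the current directory, creates the folder (first -exec), and copies the file into the new folder (second -exec). Note that find also descends into subdirectories unless you add -maxdepth 1, and that replacing {} inside a longer argument such as {}_folder is not guaranteed by POSIX (GNU find supports it).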
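A variant of the find approach that stays in one directory and does not rely on {} being replaced inside a longer word might look like this. It is a sketch: copy_into_own_folders is a hypothetical name, and -maxdepth is assumed to be available (it is in GNU and BSD find, but not in POSIX).

```shell
# Hypothetical helper: for every regular file directly inside $1 (default: .),
# create "<file>_folder" next to it and copy the file into that folder.
copy_into_own_folders() {
    find "${1:-.}" -maxdepth 1 -type f -exec sh -c '
        for f do
            mkdir -p "${f}_folder" && cp "$f" "${f}_folder"/
        done
    ' sh {} +
}
```

File names are passed to the inner shell as positional parameters, so names containing spaces need no extra escaping.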
You do not parse ls for exactly this reason.
for file in *
do
mkdir "${file}_folder"
cp "$file" "${file}_folder"
done

BASH :: find file in archive from command line

I know how to find a file with find:
# find /root/directory/to/search -name 'filename.*'
But how can I also look inside archives, since the file may be zipped inside one?
Thanks.
I defined a function (written for zsh; it needs only minor changes for bash):
## preview archives before extraction
# Usage: show-archive <archive>
# Description: view archive without unpack
show-archive() {
if [[ -f $1 ]]
then
case $1 in
*.tar.gz) gunzip -c "$1" | tar -tf - -- ;;
*.tar) tar -tf "$1" ;;
*.tgz) tar -ztf "$1" ;;
*.zip) unzip -l "$1" ;;
*.bz2) bzless "$1" ;;
*) echo "'$1' Error. Please go away" ;;
esac
else
echo "'$1' is not a valid archive"
fi
}
You can
find /directory -name '*.tgz' | while IFS= read -r f; do show-archive "$f" | grep filename; done
find /directory -name '*.tgz' -exec sh -c 'tar ztf "$1" | grep filename' sh {} \;
or something like that... But I don't think there's an 'easy' solution.
If your archive is some sort of zipped tarball, you can use the feature of tar that searches for a particular file and prints only that filename. If your tar supports wildcards, you can use those too. For example, on my system:
tar tf sprt12823.logs.tar --wildcards '*tomcat*'
prints:
tomcat.log.20090105
There are many more files in the tarball, but only one matches the pattern "*tomcat*". This way you don't have to use grep.
You can combine this with find and gunzip or whatever other zipping utility you've used.
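Combining find with tar's listing mode could look like the sketch below. The function name find_in_tars is made up, the file name is matched with grep as a basic regular expression, and it assumes a tar that detects gzip compression on its own when reading (current GNU tar and bsdtar do) and archive names without newlines.

```shell
# Hypothetical helper: list members matching pattern $2 in every tar archive
# below directory $1, one "archive: member" line per hit.
find_in_tars() {
    dir=$1
    pattern=$2
    find "$dir" -type f \( -name '*.tar' -o -name '*.tar.gz' -o -name '*.tgz' \) |
    while IFS= read -r archive; do
        tar -tf "$archive" 2>/dev/null | grep -- "$pattern" |
        while IFS= read -r member; do
            printf '%s: %s\n' "$archive" "$member"
        done
    done
}
```

For example, find_in_tars /root/directory/to/search tomcat prints each archive together with the matching member names, without unpacking anything.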
