Unzip ZIP file and extract unknown folder name's content - bash

My users will be zipping up files which will look like this:
TEMPLATE1.ZIP
|--------- UnknownName
|------- index.html
|------- images
|------- image1.jpg
I want to extract this zip file as follows:
/mysite/user_uploaded_templates/myrandomname/index.html
/mysite/user_uploaded_templates/myrandomname/images/image1.jpg
My trouble is with UnknownName - I do not know what it is beforehand and extracting everything to the "base" level breaks all the relative paths in index.html
How can I extract from this ZIP file the contents of UnknownName?
Is there anything better than:
1. Extract everything
2. Detect which "new subdidrectory" got created
3. mv newsubdir/* .
4. rmdir newsubdir/
If there is more than one subdirectory at UnknownName level, I can reject that user's zip file.

I think your approach is a good one. Step 2 could be improved my extracting to a newly created directory (later deleted) so that "detection" is trivial.
# Bash (minimally tested)
tempdest=$(mktemp -d)
unzip -d "$tempdest" TEMPLATE1.ZIP
dir=("$tempdest"/*)
if (( ${#dir[#]} == 1 )) && [[ -d $dir ]]
# in Bash, etc., scalar $var is the same as ${var[0]}
mv "$dir"/* /mysite/user_uploaded_templates/myrandomname
else
echo "rejected"
fi
rm -rf "$tempdest"

The other option I can see other than the one you suggested is to use the unzip -j flag which will dump all paths and put all files into the current directory. If you know for certain that each of your TEMPLATE1.ZIP files includes an index.html and *.jpg files then you can just do something like:
destdir=/mysite/user_uploaded_templates/myrandomname
unzip -j -d "$destdir"
mkdir "${destdir}/images"
mv "${destdir}/*.jpg" "${destdir}/images"
It's not exactly the cleanest solution but at least you don't have to do any parsing like you do in your example. I can't seem to find any option similar to patch -p# that lets you specify the path level.

Each zip and unzip command differs, but there's usually a way to list the file contents. From there, you can parse the output to determine the unknown directory name.
On Windows, the 1996 Wales/Gaily/van der Linden/Rommel version it is unzip -l.
Of course, you could just simply allow the unzip to unzip the files to whatever directory it wants, then use mv to rename the directory to what you want it as.
$tempDir = temp.$$
mv $zipFile temp.$$
cd $tempDir
unzip $zipFile
$unknownDir = * #Should be the only directory here
mv $unknownDir $whereItShouldBe
cd ..
rm -rf $tempDir
It's always a good idea to create a temporary directory for these types of operations in case you end up running two instances of this command.

Related

How to write a bash script to copy files from one base to another base location

I have a bash script I'm trying to write
I have 2 base directories:
./tmp/serve/
./src/
I want to go through all the directories in ./tmp and copy the *.html files into the same folder path in ./src
i.e
if I have a html file in ./tmp/serve/app/components/help/ help.html -->
copy to ./src/app/components/help/ And recursively do this for all subdirectories in ./tmp/
NOTE: the folder structures should exist so just need to copy them only. If it doesn't then hopefully it could create the folder for me (not what I want) but with GIT I can track these folders to manually handle those loose html files.
I got as far as
echo $(find . -name "*.html")\n
But not sure how to actually extract the file path with pwd and do what I need to, maybe it's not a one liner and needs to be done with some vars.
something like
for i in `echo $(find /tmp/ -name "*.html")\n
do
cp -r $i /src/app/components/help/
done
going so far to create the directories would take some more time for me.
I'll try to do it on my own and see if I come up with something
but for argument sake if you do run pwd and get a response the pseudo code for that:
pwd
get response
if that directory does not exist in src create that directory
copy all the original directories contents into the new folder at /src/$newfolder
(possibly running two for loops, one to check the directory tree, and then one to go through each original directory, copying all the html files)
You process substitution to loop the output from your find command and create the destination directory(ies) and then copy the file(s):
#!/bin/bash
# accept first parameters to script as src_dir and dest values or
# simply use default values if no parameter(s) passed
src_dir=${1:-/tmp/serve}
dest=${2-src}
while read -r orig_path ; do
# To replace the first occurrence of a pattern with a given string,
# use ${parameter/pattern/string}
dest_path="${orig_path/tmp\/serve/${dest}}"
# Use dirname to remove the filename from the destination path
# and create the destination directory.
dest_dir=$(dirname "${dest_path}")
mkdir -p "${dest_dir}"
cp "${orig_path}" "${dest_path}"
done < <(find "${src_dir}" -name '*.html')
This script copy .html files from src directory to des directory (create the subdirectory if they do not exist)
Find the files, then remove the src directory name and copy them into the destination directory.
#!/bin/bash
for i in `echo $(find src/ -name "*.html")`
do
file=$(echo $i | sed 's/src\///g')
cp -r --parents $i des
done
Not sure if you must use bash constructs or not, but here is a GNU tar solution (if you use GNU tar), which IMHO is the best way to handle this situation because all the metadata for the files (permissions, etc.) are preserved:
find ./tmp/serve -name '*.html' -type f -print0 | tar --null -T - -c | tar -x -v -C ./src --strip-components=3
This finds all the .html files (-type f) in the ./tmp/serve directory and prints them nul-terminated (-print0), then sends these filenames via stdin to tar as nul-terminated literals (--null) for inclusion (-T -), creating (-c) an archive which is then sent to another tar instance which extracts (-x) the archive printing its contents along the way (optional: -v), changing directory to the destination (-C ./src) before commencing and stripping (--strip-components=3) the ./tmp/serve/ prefix from the files. (You could also cd ./tmp/serve beforehand, using find . instead, and change -C to ../../src.)

Bash Extract tar.gz file

I have my tar file under:
/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz
With the tar command:
tar -tf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz"
The command show me:
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/code_FUNC
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/code_SCRI
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/check_appprivilege.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/check_login.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/privilege.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/sys
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/sys_func
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/SSH_ERROR
My Plan or better my wish is to handle it like this:
IFS=$'\n'
for PATHS in $(tar -tPf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz")
do
SED=$(echo "$PATHS" | sed 's/.*\///')
if [[ -n "$SED" ]]
then
tar -C "${target_archiv}" -xvf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz" "$PATHS"
#echo JA
echo "$PATHS"
fi
done
unset IFS
i only want one file of the tar and Store this to a different Directory....
but this command with the -C don´t work... it Extract all the files of the tar....
My Question is, is it possible to extract only one file of the Tar without cd to the Directory ??
Another Question: is it possible to Extract only the files of the tar without the Folders this is maybe the better way but I don´t know how...?
and no I can not tar the files without the paths of it I need them...
so this is no way for me...
I hope for help here :)
If your ultimate goal is to extract files without the full path, you can use a SED-like expression to rename the files while they are extracted, using the --xform option:
tar -C "${target_archiv}" -xvf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz" --xform='s,^.*/,,'
The 's,^.*/,,' expression asks to substitute (s) from the beginning of the filename (^), capture everything (.*) and stop at the last slash (/) then replace it with nothing. In other words, it removes the directory structure from the filenames.
If you want to get rid of the empty folders that have been extracted, you may call this command after extracting:
find "${target_archiv}" -mindepth 1 -maxdepth 1 -type d -exec rmdir {} \;
Keep in mind it will remove all the (empty) subfolders of "${target_archiv}", even the ones that were already here before extracting the tarball. However, because rmdir will not remove directories that contain files, it will be mostly harmless to the subdirectories you had.

Shell: Copy list of files with full folder structure stripping N leading components from file names

Consider a list of files (e.g. files.txt) similar (but not limited) to
/root/
/root/lib/
/root/lib/dir1/
/root/lib/dir1/file1
/root/lib/dir1/file2
/root/lib/dir2/
...
How can I copy the specified files (not any other content from the folders which are also specified) to a location of my choice (e.g. ~/destination) with a) intact folder structure but b) N folder components (in the example just /root/) stripped from the path?
I already managed to use
cp --parents `cat files.txt` ~/destination
to copy the files with an intact folder structure, however this results in all files ending up in ~/destination/root/... when I'd like to have them in ~/destination/...
I think I found a really nice an concise solution by using GNU tar:
tar cf - -T files.txt | tar xf - -C ~/destination --strip-components=1
Note the --strip-components option that allows to remove an arbitrary number of path components from the beginning of the file name.
One minor problem though: It seems tar always "compresses" the whole content of folders mentioned in files.txt (at least I couldn't find an option to ignore folders), but that is most easily solved using grep:
cat files.txt | grep -v '/$' > files2.txt
This might not be the most graceful solution - but it works:
for file in $(cat files.txt); do
echo "checking for $file"
if [[ -f "$file" ]]; then
file_folder=$(dirname "$file")
destination_folder=/destination/${file_folder#/root/}
echo "copying file $file to $destination_folder"
mkdir -p "$destination_folder"
cp "$file" "$destination_folder"
fi
done
I had a look at cp and rsync, but it looks like they would benefit more if you to cd into /root first.
However, if you did cd to the correct directory before hand, you could always run it as a subshell so that you would be returned to your original location once the subshell has finished.

Move files to the correct folder in Bash

I have a few files with the format ReportsBackup-20140309-04-00 and I would like to send the files with same pattern to the files as the example to the 201403 file.
I can already create the files based on the filename; I would just like to move the files based on the name to their correct folder.
I use this to create the directories
old="directory where are the files" &&
year_month=`ls ${old} | cut -c 15-20`&&
for i in ${year_month}; do
if [ ! -d ${old}/$i ]
then
mkdir ${old}/$i
fi
done
you can use find
find /path/to/files -name "*201403*" -exec mv {} /path/to/destination/ \;
Here’s how I’d do it. It’s a little verbose, but hopefully it’s clear what the program is doing:
#!/bin/bash
SRCDIR=~/tmp
DSTDIR=~/backups
for bkfile in $SRCDIR/ReportsBackup*; do
# Get just the filename, and read the year/month variable
filename=$(basename $bkfile)
yearmonth=${filename:14:6}
# Create the folder for storing this year/month combination. The '-p' flag
# means that:
# 1) We create $DSTDIR if it doesn't already exist (this flag actually
# creates all intermediate directories).
# 2) If the folder already exists, continue silently.
mkdir -p $DSTDIR/$yearmonth
# Then we move the report backup to the directory. The '.' at the end of the
# mv command means that we keep the original filename
mv $bkfile $DSTDIR/$yearmonth/.
done
A few changes I’ve made to your original script:
I’m not trying to parse the output of ls. This is generally not a good idea. Parsing ls will make it difficult to get the individual files, which you need for copying them to their new directory.
I’ve simplified your if ... mkdir line: the -p flag is useful for “create this folder if it doesn’t exist, or carry on”.
I’ve slightly changed the slicing command which gets the year/month string from the filename.

Bash script of unzipping unknown name files

I have a folder that after an rsync will have a zip in it. I want to unzip it to its own folder(if the zip is L155.zip, to unzip its content to L155 folder). The problem is that I dont know it's name beforehand(although i know it will be "letter-number-number-number"), so I have to unzip an uknown file to its unknown folder and this to be done automatically.
The command “unzip *”(or unzip *.zip) works in terminal, but not in a script.
These are the commands that have worked through terminal one by one, but dont work in a script.
#!/bin/bash
unzip * #also tried .zip and /path/to/file/* when script is on different folder
i=$(ls | head -1)
y=${i:0:4}
mkdir $y
unzip * -d $y
First I unzip the file, then I read the name of the first extracted file through ls and save it in a variable.I take the first 4 chars and make a directory with it and then again unzip the files to that specific folder.
The whole procedure after first unzip is done, is because the files inside .zip, all start with a name that the zip already has, so if L155.ZIP is the zip, the files inside with be L155***.txt.
The zip file is at /path/to/file/NAME.zip.
When I run the script I get errors like the following:
unzip: cannot find or open /path/to/file/*.ZIP
unzip: cannot find or open /path/to/file//*.ZIP.zip
unzip: cannot find or open /path/to/file//*.ZIP.ZIP. No zipfiles found.
mkdir: cannot create directory 'data': File exists data
unzip: cannot find or open data, data.zip or data.ZIP.
Original answer
Supposing that foo.zip contains a folder foo, you could simply run
#!/bin/bash
unzip \*.zip \*
And then run it as bash auto-unzip.sh.
If you want to have these files extracted into a different folder, then I would modify the above as
#!/bin/bash
cp *.zip /home/user
cd /home/user
unzip \*.zip \*
rm *.zip
This, of course, you would run from the folder where all the zip files are stored.
Another answer
Another "simple" fix is to get dtrx (also available in the Ubuntu repos, possibly for other distros). This will extract each of your *.zip files into its own folder. So if you want the data in a different folder, I'd follow the second example and change it thusly:
#!/bin/bash
cp *.zip /home/user
cd /home/user
dtrx *.zip
rm *.zip
I would try the following.
for i in *.[Zz][Ii][Pp]; do
DIRECTORY=$(basename "$i" .zip)
DIRECTORY=$(basename "$DIRECTORY" .ZIP)
unzip "$i" -d "$DIRECTORY"
done
As noted, the basename program removes the indicated suffix .zip from the filename provided.
I have edited it to be case-insensitive. Both .zip and .ZIP will be recognized.
for zfile in $(find . -maxdepth 1 -type f -name "*.zip")
do
fn=$(echo ${zfile:2:4}) # this will give you the filename without .zip extension
mkdir -p "$fn"
unzip "$zfile" -d "$fn"
done
If the folder has only file file with the extension .zip, you can extract the name without an extension with the basename tool:
BASE=$(basename *.zip .zip)
This will produce an error message if there is more than one file matching *.zip.
Just to be clear about the issue here, the assumption is that the zip file does not contain a folder structure. If it did, there would be no problem; you could simply extract it into the subfolders with unzip. The following is only needed if your zipfile contains loose files, and you want to extract them into a subfolder.
With that caveat, the following should work:
#!/bin/bash
DIR=${1:-.}
BASE=$(basename "$DIR/"*.zip .zip 2>/dev/null) ||
{ echo More than one zipfile >> /dev/stderr; exit 1; }
if [[ $BASE = "*" ]]; then
echo No zipfile found >> /dev/stderr
exit 1
fi
mkdir -p "$DIR/$BASE" ||
{ echo Could not create $DIR/$BASE >> /dev/stderr; exit 1; }
unzip "$DIR/$BASE.zip" -d "$DIR/$BASE"
Put it in a file (anywhere), call it something like unzipper.sh, and chmod a+x it. Then you can call it like this:
/path/to/unzipper.sh /path/to/data_directory
simple one liner I use all the time
$ for file in `ls *.zip`; do unzip $file -d `echo $file | cut -d . -f 1`; done

Resources