Loop through and unzip directories and then unzip items in subdirectories - bash

I have a folder designed in the following way:
-parentDirectory
---folder1.zip
----item1
-----item1.zip
-----item2.zip
-----item3.zip
---folder2.zip
----item1
-----item1.zip
-----item2.zip
-----item3.zip
---folder3.zip
----item1
-----item1.zip
-----item2.zip
-----item3.zip
I would like to write a bash script that will loop through and unzip the folders, then go into each subdirectory of those folders, unzip the files there, and name them a certain way.
I have tried the following:
cd parentDirectory
find ./ -name \*.zip -exec unzip {} \;
count=1
for fname in *
do
unzip
mv $fname $attempt{count}.cpp
count=$(($count + 1))
done
I thought the first two lines would go into the parentDirectory folder and unzip all zips in that folder and then the for loop would handle the unzipping and renaming. But instead, it unzipped everything it could and placed it in the parentDirectory. I would like to maintain the same directory structure I have.
Any help would be appreciated

Excerpt from man unzip:
[-d exdir]
An optional directory to which to extract files. By default, all files and subdirectories are recreated in the current directory; the -d option allows extraction in an arbitrary directory (always assuming one has permission to write to the directory).
It's doing exactly what you told it, and what would happen if you had done the same on the command line. Just tell it where to extract, since you want it to extract there.
See "Ubuntu bash script: how to split path by last slash?" for an example of splitting the path out of fname.
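For illustration, either dirname or a plain parameter expansion does that split; a quick sketch (the path is just an invented example):
fname=./folder1/item1/item1.zip     # example path, not from the question
dirname "$fname"                    # prints ./folder1/item1
echo "${fname%/*}"                  # same result via parameter expansion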
Putting it all together, your command, executed in the parentDirectory, is:
find ./ -name \*.zip -exec unzip {} \;
But you want unzip to extract into the directory where it found the file. I was going to just wrap dirname {} in backticks, but I can't get it to work right: it either runs on the literal {} before find does, or never runs at all.
The easiest workaround was to write my own wrapper script for unzip which extracts in place.
> cat unzip_in_place
unzip "$1" -d "$(dirname "$1")"
> find . -name "*.zip" -exec ./unzip_in_place {} \;
You could probably alias unzip to do that automatically, but that is unwise in case you ever use other tools that expect unzip to work as documented.
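As a rough alternative (a sketch, not part of the original answer; it assumes a POSIX shell is available for find to call), the helper script can be inlined so the directory is computed per file:
find . -name '*.zip' -exec sh -c 'unzip "$1" -d "$(dirname "$1")"' _ {} \;
The quoted '*.zip' keeps the shell from expanding the pattern, and the underscore just fills $0 so the found path lands in $1.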

Related

How to copy recursively files with multiple specific extensions in bash

I want to copy all files with specific extensions recursively in bash.
Edit:
I've written the full script. I have a list of names in a CSV file. I iterate through each name in that list, create a directory with that same name somewhere else, then search my source directory for the directory with that name. Inside it there are a few files ending in xlsx, tsv, html, and gz, and I'm trying to copy all of them into the newly created directory.
sample_list_filepath=/home/lists/papers
destination_path=/home/ds/samples
source_directories_path=/home/papers_final/new
cat $sample_list_filepath/sample_list.csv | while read line
do
echo $line
cd $source_directories_path/$line
cp -r *.{tsv,xlsx,html,gz} $source_directories_path/$line $destination_path
done
This works, but it copies all the files, with no filtering by extension.
What is the problem?
An easy way to solve your problem is to use find with a regex:
find src/ -regex '.*\.\(tsv\|xlsx\|gz\|html\)$' -exec cp {} dest/ \;
find looks recursively in the directory you specify (in my example it's src/), lets you filter with -regex, and applies a command to each matching result with -exec.
For the regex part:
.*\.
matches the file name up to and including the dot before the extension, and
\(tsv\|xlsx\|gz\|html\)$
matches any one of the extensions you want, anchored at the end of the name.
The -exec part is what you do with the files matched by the regex:
-exec cp {} dest/ \;
In this case, you copy each match ({} stands for the matched file) to the destination directory.
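As a sketch of how this could plug into the loop from the question (variable names taken from the question; it assumes, unverified, that each $line names a directory directly under $source_directories_path):
while IFS= read -r line
do
    mkdir -p "$destination_path/$line"
    find "$source_directories_path/$line" -maxdepth 1 \
        -regex '.*\.\(tsv\|xlsx\|gz\|html\)$' \
        -exec cp {} "$destination_path/$line/" \;
done < "$sample_list_filepath/sample_list.csv"
Drop -maxdepth 1 if the files can sit deeper inside each directory.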

How can I copy the contents of a directory located in multiple locations using the find command, preserving directory structure?

I have a folder named accdb under multiple directories, all under one parent directory dist. I want to copy the contents of accdb for all directories while preserving the directory structure.
I succeeded in making the recursive folder structure with:
cd ~/dist; find . -name "accdb" -type d -exec mkdir -p -- ~/acc_trial/{} \;
But I am failing to copy the contents of accdb. This command only recreates the structure up to the accdb directory.
I tried
find . -name "accdb" -type d -exec mkdir -p -- ~/acc_trial/{} \ && cp -r {} ~/acc_trial/{} \;
I get an error:
find: missing argument to `-exec'
I don't know if this is possible using only a find expression; I'm pretty sure it is not. Besides, you must consider that if you have a subfolder named accdb inside another accdb folder, you'll probably get an error. That's why I decided to use rsync in the script I've written:
#!/bin/bash
DEST='/home/corronx/provisional/destination_dir'
#Clean destination directory, PLEASE BE CAREFUL IT MUST BE A REMOVABLE DIRECTORY
rm -rf $DEST/*
FIND='test'
LOOK_PATH='/home/corronx/provisional'
FILES=($(find . -type d -name $FIND))
for ((i=0; i<${#FILES[@]}; i++))
do
#Remove first character .
FILES[$i]=${FILES[$i]:1:${#FILES[$i]}}
#Create directories in destination path
mkdir -p $DEST${FILES[$i]}
rsync -aHz --delete ${FILES[$i]:1:${#FILES[$i]}}/ $DEST${FILES[$i]}
echo $i
done
Explanation
First of all, I'd recommend using full paths in your script, because an rm -rf expression inside a script is pretty dangerous. (If you prefer, comment out that line and delete the destination folder manually before running the script.)
DEST= Destination path.
FIND= Subfolder name that you are looking for.
LOOK_PATH= Path where you want to execute find.
I create an array called FILES that contains all the folders the find expression returns. After that I just create the destination directories and run rsync to copy the files. I've used rsync because I think it copes better if there is any subdirectory with the same name.
PLEASE BE CAREFUL WITH the rm -rf expression: if DEST is not set, you'll delete everything on your machine.
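For comparison, here is a sketch of the same find-plus-rsync idea written with quoting that survives spaces in path names (a variation, not the original script; DEST and FIND are placeholders to fill in):
DEST='/path/to/destination'    # placeholder, must not be empty
FIND='accdb'                   # placeholder: name of the folder to look for
find . -type d -name "$FIND" -print0 |
while IFS= read -r -d '' dir
do
    rel=${dir#.}               # strip the leading "." from find's output
    mkdir -p "$DEST$rel"
    rsync -aHz --delete "$dir/" "$DEST$rel"
done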

Bash Extract tar.gz file

I have my tar file under:
/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz
With the tar command:
tar -tf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz"
The command shows me:
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/code_FUNC
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/exit_codes/code_SCRI
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/check_appprivilege.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/check_login.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/login/privilege.php
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/sys
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/webapp/scripte/Lysto BackUp/sys_func
/volume1/02_public/3rd_Party_Apps/SPK_SCRIPTS/SynoDSApps/SSH_ERROR
My plan, or rather my wish, is to handle it like this:
IFS=$'\n'
for PATHS in $(tar -tPf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz")
do
SED=$(echo "$PATHS" | sed 's/.*\///')
if [[ -n "$SED" ]]
then
tar -C "${target_archiv}" -xvf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz" "$PATHS"
#echo JA
echo "$PATHS"
fi
done
unset IFS
I only want one file from the tar, and to store it in a different directory...
but this command with -C doesn't work... it extracts all the files from the tar...
My question is: is it possible to extract only one file from the tar without cd'ing into the directory?
Another question: is it possible to extract only the files from the tar without the folders? That is maybe the better way, but I don't know how...
And no, I cannot tar the files without their paths, I need them...
so that is not an option for me...
I hope for help here :)
If your ultimate goal is to extract files without the full path, you can use a sed-like expression to rename the files while they are extracted, using the --xform option:
tar -C "${target_archiv}" -xvf "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz" --xform='s,^.*/,,'
The 's,^.*/,,' expression substitutes (s) everything from the beginning of the member name (^) up to the last slash (.* is greedy, so it runs to the final /) with nothing. In other words, it strips the directory part from the file names.
If you want to get rid of the empty folders that have been extracted, you may call this command after extracting:
find "${target_archiv}" -mindepth 1 -maxdepth 1 -type d -exec rmdir {} \;
Keep in mind it will remove all the (empty) subfolders of "${target_archiv}", even ones that were already there before extracting the tarball. However, because rmdir will not remove directories that contain files, it is mostly harmless to the subdirectories you already had.
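As for pulling just one file out of the archive without cd'ing anywhere, a sketch along these lines should work (GNU tar assumed; SSH_ERROR is only an example member taken from the listing above, and --wildcards avoids having to type the exact stored path):
tar -C "${target_archiv}" -xvf \
    "/volume1/#appstore/SynoDSApps/archiv/DE/2018_08_18__Lysto BackUp.tar.gz" \
    --wildcards --xform='s,^.*/,,' '*SSH_ERROR'
Leave out --xform if the file should keep its directory path under "${target_archiv}".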

Bash script to recursively copy files and folders when subdir is not present

I have lots of projects archived under a directory tree, some of which have a .git folder in them.
What I'd like to do is recursively copy those files and directories to a new destination, keeping the current structure - EXCEPT for those directories containing a .git folder, in which case the script should run a command (let's say "echo", I'll change it later) followed by the folder name, without creating or copying it.
Any help would be much appreciated.
Edit: I'll try to explain myself better: I need to copy every single file and directory, except for those containing .git, which should be skipped and their path should be passed to another command. In this example, path a/b/c/d and its subfolders should be skipped entirely and a/b/c/d should be displayed using echo (just for brevity, I'll replace it with a different command later):
a
a/b
a/b/c
a/b/c/d/.git
a/b/c/d/e
a/b/c/d/f/g
a/b/c/e
a/b/d
a/c
b
b/c
...
IIUC, the following find one-liner will do the job:
find . -mindepth 1 -maxdepth 1 -type d -exec sh -c "test -e '{}/.git' && echo not copy '{}' || cp -r -- '{}' /tmp/copy-here" \;
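That one-liner only inspects the top-level directories; if the .git folders can sit several levels down, as in the a/b/c/d example, something along these lines might work (a sketch assuming GNU find and GNU cp; /tmp/copy-here is again a placeholder and must exist):
find . -mindepth 1 -type d -exec sh -c 'test -e "$1/.git"' _ {} \; -prune -print \
    -o -type f -exec cp --parents -- {} /tmp/copy-here \;
Directories containing a .git are pruned and only printed here (swap -print for the real command later); everything else is copied file by file, with cp --parents recreating the path. Note that empty directories are not recreated this way.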

running 2 unix commands in the same line in a batch file

I appreciate any help, and excuse me if my terminology is incorrect.
What I am trying to do is write a script/.bat file that will do the following:
copy one directory (and subdirectories) from pointA to pointB.
Then, in pointB (and subdirectories), unzip the files, which will give *.csv files.
Then, in pointB (and subdirectories), delete some rows from all these csv files.
This Unix command, run on Cygwin, will copy all the files from /cygdrive/v/pointA/* to the current directory . (i.e. the dot is the current working directory):
cp /cygdrive/v/pointA/* .
This Unix command, run on Cygwin, will go through all the files in the directory and subdirectories that end with .zip and unzip them:
find -iname *.zip -execdir unzip {} \;
This Unix command, run on Cygwin, will go through all the files in the directory and subdirectories that end with .csv. For each file it deletes the first 6 rows and the last row, leaving the rest of the file in place:
find ./ -iname '*.csv' -exec sed -i '1,6d;$ d' '{}' ';'
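For a quick sanity check of that sed expression on a throwaway file (demo.csv is just an invented example, not one of the real files):
printf '%s\n' h1 h2 h3 h4 h5 h6 row1 row2 trailer > demo.csv
sed '1,6d;$ d' demo.csv     # prints only row1 and row2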
I was looking to do this in one script/.bat file, but I am having trouble putting the find and unzip commands on one line and am wondering how, and whether, this can be done:
chdir C:\pointA
C:\cygwin\bin\cp.exe /cygdrive/v/pointB/* .
::find -iname *.zip -execdir unzip {} \;
::find ./ -iname '*.csv' -exec sed -i '1,6d;$ d' '{}' ';'
I did try something like this:
C:\cygwin\bin\find.exe -iname *.zip -execdir C:\cygwin\bin\unzip.exe {} \;
but I get the following:
/usr/bin/find: missing argument to `-execdir'
Can anyone advise if/how this can be done?
The Cygwin tools use their own kind of paths, e.g. /cygdrive/c/cygwin/bin/unzip.exe. Though the Windows paths with backslashes sometimes work, the backslashes do tend to confuse the Cygwin tools.
I highly recommend you write your tool in Bash shell script instead of a cmd.exe Windows batch file. In my experience (1) it's much easier to do flow control in bash scripts than in batch files, and (2) the Cygwin environment works better from Bash. You can open a bash shell and run bash yourscript.sh.
Your Bash script might look something like this: (untested)
#!/bin/bash
# This script would be run from a Cygwin Bash shell.
# You can use the Mintty program or run C:\cygwin\bin\bash --login
# to start a bash shell from Windows Command Prompt.
# Configure bash so the script will exit if a command fails.
set -e
cd /cygdrive/c/pointA
cp /cygdrive/v/pointB/* .
# Regarding the find command you tried:
# 1. Make sure you quote wildcards so the shell doesn't expand them
# before passing them to the 'find' program.
#
# 2. If you start bash with the --login option, the PATH will be
# configured so that C:\cygwin\bin is in your PATH, and you can
# just call 'find', 'cp' etc. without specifying full path to it.
# This will unzip all .zip files in all subdirectories under this one.
find -iname '*.zip' -execdir unzip {} \;
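If the CSV-trimming step from the question should run in the same script, it can presumably be appended right after the unzip step (this is simply the command from the question, untested here):
# Delete the first 6 rows and the last row of every extracted .csv file.
find ./ -iname '*.csv' -exec sed -i '1,6d;$ d' {} \;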
