Bash: Recreate recursively sub-directories - bash

I've got a lot of files in a lot of sub-directories.
I would like to perform some task on them and return the result in a new file, but in an output directory which has the exact same sub-directories as the input.
I have already tried this:
#!/bin/bash
########################################################
# $1 = "../benchmarks/k"
# $2 = Output Folder;
# $3 = Path to access the solver
InputFolder=$1;
OutputFolder=$2;
Solver=$3
mkdir -p $2;
########################################
#
# Send the command on the cluster
# to run the solver on the instance.
#
########################################
solveInstance() {
instance=$1;
# $3 $instance > $2/$i.out
}
########################################
#
# Loop on benchmarks folders recursively
#
########################################
loop_folder_recurse() {
for i in "$1"/*;
do
if [ -d "$i" ]; then
echo "dir: $i"
mkdir -p "$2/$i";
loop_folder_recurse "$i"
elif [ -f "$i" ]; then
solveInstance "$i"
fi
done
}
########################################
#
# Main of the Bash script.
#
########################################
echo "Dir: $1";
loop_folder_recurse $1
########################################################
The problem is my line mkdir -p "$2/$i";. $2 is the name of a directory that we create at the beginning, so there is no problem there. But $i can be an absolute path, in which case mkdir tries to recreate every directory leading to the file underneath the output folder, which is not what I want. It can also contain .., and the same kind of problem appears.
I don't know exactly how to fix this bug. I tried some things with sed but did not succeed.

The easiest way is to use find:
find "$1" -type d | while IFS= read -r i # Finds all the subfolders and loops over them.
do
mkdir -p "${i/#$1/$2}" # Replaces the leading root with the new root and creates the dir.
done
In such a way you recreate the folder structure of $1 in $2. You can even avoid the loop if you use sed to replace the old folder path with the new.
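The prefix replacement described above can also be wrapped in a small function. This is only a sketch: the name `mirror_dirs` is made up for illustration, and it assumes bash (for `read -d ''`) and a `find` that supports `-print0`. It strips the source root from each path, so an absolute root or one containing `..` does not leak into the output tree.

```bash
#!/usr/bin/env bash
# mirror_dirs SRC DST (hypothetical helper, not from the original answer):
# recreate the sub-directory tree of SRC under DST without copying any files.
mirror_dirs() {
    local src=${1%/} dst=${2%/} dir
    mkdir -p -- "$dst" || return 1
    # -print0 with read -d '' keeps directory names containing spaces intact.
    find "$src" -type d -print0 |
    while IFS= read -r -d '' dir; do
        # Remove the leading SRC so that only the relative part lands under DST.
        mkdir -p -- "$dst${dir#"$src"}"
    done
}
```

It would be called as `mirror_dirs "$InputFolder" "$OutputFolder"` before the solver loop runs.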

Related

Bash simple script copying files to specific folder + renaming to todays effective date

Good day,
I need your help creating the following script.
Every day a teacher uploads files in the following format:
STUDENT_ACCOUNTS_20200217074343-20200217.xlsx
STUDENT_MARKS_20200217074343-20200217.xlsx
STUNDENT_HOMEWORKS_20200217074343-20200217.xlsx
STUDENT_PHYSICAL_20200217074343-20200217.xlsx
SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx
[file_name+todaydatetime-todaydate.xlsx]
But sometimes the teacher does not upload these files, and we then need to manually rename the files received for the previous date and copy each file to its own folder, like:
cp STUDENT_ACCOUNTS_20200217074343-20200217.xlsx /incoming/A1/STUDENT_ACCOUNTS_20200318074343-20200318.xlsx
cp STUDENT_MARKS_20200217074343-20200217.xlsx /incoming/B1/STUDENT_ACCOUNTS_20200318074343-20200318.xlsx
.............
cp SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx /incoming/F1/SUBSCRIBED_STUDENTS_20200318074343-20200318.xlsx.
In short: take the files from the previous date and copy them to a specific folder with a new timestamp.
#!/bin/bash
cd /home/incoming/
date=$(date '+%Y%m%d')
previousdate="$( date --date=yesterday '+%Y%m%d' )"
cp /home/incoming/SUBSCRIBED_STUDENTS_'$previousdate'.xlsx /incoming/F1/SUBSCRIBED_STUDENTS_'$date'.xlsx
There could also be a case where the teacher uploads one file and not the others. How do I check for existing files?
Thanks for reading. If you can help me I will be really thankful; you will save me plenty of manual work.
The process can be automated completely if your directory structure is known. If it follows some kind of pattern, do mention it here.
For the timing, this may be helpful:
Filename "tscp"
#
# Stands for timestamped cp
#
tscp() {
local file1=$1 ; shift
local to_dir=$1 ; shift
local force_copy=$1 ; shift
local current_date="$(date '+%Y%m%d')"
if [ "${force_copy}" == "--force" ] ; then
cp "${file1}" "${to_dir}/$(basename "${file1%-*}")-${current_date}.xlsx"
else
cp -n "${file1}" "${to_dir}/$(basename "${file1%-*}")-${current_date}.xlsx"
fi
}
tscp "$@"
Its usage is as follows:
tscp source to_directory [--force]
Basically the script takes 2 arguments and the 3rd one is optional.
The first arg is the source file path and the second arg is the path of the directory you want to copy to (. if same directory).
By default this copy is made if and only if the destination file doesn't exist.
If you want to overwrite the destination file then pass a third arg --force.
Again, this can be refined much much more based on details provided.
Sample usage for now:
bash tscp SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx /incoming/F1/
will copy SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx to directory /incoming/F1/ with updated date if it doesn't exist yet.
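For the "how do I check for existing files" part of the question, a minimal sketch follows. The function name `restamp_copy` and its four-argument interface are assumptions made for illustration; unlike `tscp` above, it takes both dates explicitly and replaces every occurrence of the old date in the file name.

```bash
#!/usr/bin/env bash
# restamp_copy FILE DEST_DIR OLD_DATE NEW_DATE (hypothetical helper)
# Copies FILE into DEST_DIR with OLD_DATE replaced by NEW_DATE in its name.
# A missing source is reported; an existing destination is left alone.
restamp_copy() {
    local file=$1 dest_dir=$2 old=$3 new=$4 name target
    if [ ! -f "$file" ]; then
        printf 'missing source: %s\n' "$file" >&2
        return 1
    fi
    name=${file##*/}                        # file name without its directory
    target=$dest_dir/${name//$old/$new}     # replace every occurrence of OLD_DATE
    if [ -e "$target" ]; then
        printf 'already present, skipping: %s\n' "$target" >&2
        return 0
    fi
    cp -- "$file" "$target"
}
```

A call such as `restamp_copy SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx /incoming/F1 20200217 20200318` would then produce the target name shown in the question.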
UPDATE:
Give this a go:
#! /usr/bin/env bash
printf_err() {
ERR_COLOR='\033[0;31m'
NORMAL_COLOR='\033[0m'
printf "${ERR_COLOR}$1${NORMAL_COLOR}" ; shift
printf "${ERR_COLOR}%s${NORMAL_COLOR}\n" "$@" >&2
}
alias printf_err='printf_err "Line ${LINENO}: " '
shopt -s expand_aliases
usage() {
printf_err \
"" \
"usage: ${BASH_SOURCE##*/} " \
" -f copy_data_file" \
" -d days_before" \
" -m months_before" \
" -o" \
" -y years_before" \
" -r " \
" -t to_dir" \
>&2
exit 1
}
fullpath() {
local path="$1" ; shift
local abs_path
if [ -z "${path}" ] ; then
printf_err "${BASH_SOURCE}: Line ${LINENO}: param1(path) is empty"
return 1
fi
abs_path="$( cd "$( dirname "${path}" )" ; pwd )/$( basename "${path}" )"
printf '%s' "${abs_path}"
}
OVERWRITE=0
REVIEW=0
COPYSCRIPT="$( mktemp "/tmp/copyscriptXXXXX" )"
while getopts 'f:d:m:y:t:or' option
do
case "${option}" in
d)
DAYS="${OPTARG}"
;;
f)
INPUT_FILE="${OPTARG}"
;;
m)
MONTHS="${OPTARG}"
;;
t)
TO_DIR="${OPTARG}"
;;
y)
YEARS="${OPTARG}"
;;
o)
OVERWRITE=1
;;
r)
REVIEW=1
COPYSCRIPT="copyscript"
;;
*)
usage
;;
esac
done
INPUT_FILE=${INPUT_FILE:-$1}
TO_DIR=${TO_DIR:-$2}
if [ ! -f "${INPUT_FILE}" ] ; then
printf_err "No such file ${INPUT_FILE}"
usage
fi
DAYS="${DAYS:-1}"
MONTHS="${MONTHS:-0}"
YEARS="${YEARS:-0}"
if date -v -1d > /dev/null 2>&1; then
# BSD date
previous_date="$( date -v -${DAYS}d -v -${MONTHS}m -v -${YEARS}y '+%Y%m%d' )"
else
# GNU date
previous_date="$( date --date="-${DAYS} days -${MONTHS} months -${YEARS} years" '+%Y%m%d' )"
fi
current_date="$( date '+%Y%m%d' )"
tmpfile="$( mktemp "/tmp/dstnamesXXXXX" )"
awk -v to_replace="${previous_date}" -v replaced="${current_date}" '{
gsub(to_replace, replaced, $0)
print
}' "${INPUT_FILE}" > "${tmpfile}"
paste "${INPUT_FILE}" "${tmpfile}" |
while IFS=$'\t' read -r -a arr
do
src=${arr[0]}
dst=${arr[1]}
opt=${arr[2]}
if [ -n "${opt}" ] ; then
if [ ! -d "${dst}" ] ;
then
printf_err "No such directory ${dst}"
usage
fi
dst="${dst}/$( basename "${opt}" )"
else
if [ ! -d "${TO_DIR}" ] ;
then
printf_err "No such directory ${TO_DIR}"
usage
fi
dst="${TO_DIR}/$( basename "${dst}" )"
fi
src=$( fullpath "${src}" )
dst=$( fullpath "${dst}" )
if [ "${OVERWRITE}" -eq 1 ] ; then
echo "cp ${src} ${dst}"
else
echo "cp -n ${src} ${dst}"
fi
done > "${COPYSCRIPT}"
if [ "${REVIEW}" -eq 0 ] ; then
${BASH} "${COPYSCRIPT}"
rm "${COPYSCRIPT}"
fi
rm "${tmpfile}"
Steps:
Store the above script in a file, say `tscp`.
Now you need to create the input file for it.
From your example, a sample input file can be like:
STUDENT_ACCOUNTS_20200217074343-20200217.xlsx /incoming/A1/
STUDENT_MARKS_20200217074343-20200217.xlsx /incoming/B1/
STUNDENT_HOMEWORKS_20200217074343-20200217.xlsx
STUDENT_PHYSICAL_20200217074343-20200217.xlsx
SUBSCRIBED_STUDENTS_20200217074343-20200217.xlsx /incoming/F1/
Here the first part is the source file name and, after a "tab" (it must be a tab), you mention the destination directory. These paths should be either absolute or relative to the directory where you are executing the script. You may omit the destination directory if all files are to be sent to the same directory (discussed later).
Let's say you named this file `file`.
Also, you don't really have to type all that. If you have these files in the current directory, just do this:
ls -1 > file
(the above is ls "one", not "l".)
Now we have the `file` from above, in which we mentioned a destination directory for only some of the entries.
Let's say we want to copy all the other files to `/incoming/x`, and that it exists.
Now the script is to be executed like:
bash tscp -f file -t /incoming/x -r
where `/incoming/x` is the default directory, i.e. when no other directory is mentioned in `file`, your files are copied to this directory.
Now a script named `copyscript` will be generated in the current directory; it will contain the `cp` commands to copy all files. You can open and review `copyscript` and, if the copying seems right, go ahead and:
bash copyscript
which will copy all the files and then you can:
rm copyscript
You need not generate `copyscript`; you can go straight for a copy like:
bash tscp -f file -t /incoming/x
which won't leave any `copyscript` behind and copies straight away.
Previously, `-r` caused the generation of `copyscript`.
I would recommend using the version with `-r` because it is a little safer and you can be sure that the right copies are being made.
By default it checks for the previous day and renames to the current date, but you can override that behaviour as:
bash tscp -f file -t /incoming/x -d 3
`-d 3` would look in `file` for files dated 3 days back.
By default copies won't overwrite, i.e. if a file already exists at the destination, the copy won't be made.
If you want to overwrite, add the flag `-o`.
In conclusion, I would advise using:
bash tscp -f file -r
where file contains tab-separated values like the above for all entries.
Also, adding tscp to your PATH would be a good idea once you are sure it works OK.
Also, the script was written on a Mac and there is always a chance of a version clash among the tools used. I would suggest trying the script on some sample data first to make sure it works right on your machine.

move command performs move in source directory too

I have written a shell script to move files from source directory to destination directory.
/home/tmp/ to /home/from/
The move happens correctly but it displays message
mv: /home/tmp/testfile_retry_17072017.TIF
/home/tmp/testfile_retry_17072017.TIF are identical.
and if source directory is empty it displays
mv: cannot rename /home/tmp/* to /home/from/*
for file in /home/tmp/*
if [ -f "$file" ]
then
do
DIRPATH=$(dirname "${file}")
FILENAME=$(basename "${file}")
# echo "Dirpath = ${DIRPATH} Filename = ${FILENAME}"
mv "${DIRPATH}/"${FILENAME} /home/from
echo ${FILENAME} " moved to from directory"
done
else
echo "Directory is empty"
fi
You should use find instead of /home/tmp/* as shown.
for file in $(find /home/tmp/ -type f)
do
if [ -f "$file" ]
then
DIRPATH=$(dirname "${file}")
FILENAME=$(basename "${file}")
# echo "Dirpath = ${DIRPATH} Filename = ${FILENAME}"
mv "${DIRPATH}/${FILENAME}" /home/from
echo "${FILENAME} moved to from directory"
else
echo "Directory is empty"
fi
done
You have things a bit out of order with:
for file in /home/tmp/*
if [ -f "$file" ]
then
do
Of course "$file" will exist -- you are looping for file in /home/tmp/*. It looks like you intended
for file in /home/tmp/*
do
FILENAME=$(basename "${file}")
if [ ! -f "/home/from/$FILENAME" ] ## if it doesn't already exist in dest
then
Note: the POSIX shell includes parameter expansions that allow you to avoid calling dirname and basename. Instead you can simply use "${file##*/}" for the filename (which just says remove everything from the left up to, and including, the last /). That is the only expansion you need (as you already know the destination directory name). This allows you to check [ -f "/home/from/${file##*/}" ] to determine whether a file with the same name as the one you are moving already exists in /home/from.
You could use that to your advantage with:
src=/home/tmp ## source dir
dst=/home/from ## destination dir
for f in "$src"/* ## for each file in src
do
[ "$f" = "$src/*" ] && break ## src is empty
if [ -f "$dst/${f##*/}" ] ## test if it already exists in dst
then
printf "file '%s' exists in '%s' - forcing mv.\n" "${f##*/}" "$dst"
mv -f "$f" "$dst" ## use -f to overwrite existing
else
mv "$f" "$dst" ## regular move otherwise
fi
done
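The parameter expansions mentioned in this answer can be tried on their own. The short sketch below uses the path from the question; the variable names `name`, `dir`, `stem` and `ext` are only illustrative.

```bash
#!/usr/bin/env bash
# Parameter expansions that stand in for basename and dirname.
file=/home/tmp/testfile_retry_17072017.TIF

name=${file##*/}   # strip the longest prefix ending in "/"  -> testfile_retry_17072017.TIF
dir=${file%/*}     # strip the shortest suffix starting at "/" -> /home/tmp
stem=${name%.*}    # strip the shortest suffix starting at "." -> testfile_retry_17072017
ext=${name##*.}    # strip the longest prefix ending in "."   -> TIF

printf '%s\n' "$name" "$dir" "$stem" "$ext"
```

No external process is started for any of these, which matters when looping over many files.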
There is a great resource for checking your shell code called ShellCheck.net. Just type (or paste) your code into the webpage and it will analyze your logic and variable use and let you know where problems are identified.
Look things over and let me know if you have further questions.

Sort files into sub folders by date - bash

Basically my HDD crashed. I was able to recover all the files; they have all retained their metadata and some have retained their names. I have 274000 images which I need to, more or less, sort into folders by date.
So let's say it starts with the first file: it would get the date from the file, create a sub folder, and keep moving files into that folder until the date changes. Once the date changes, it would create a new folder and keep doing the same thing.
I'm sure this is possible; I really don't want to have to do this manually as it would take weeks...
Let's say I have a target folder /target/
Target contains 274000 files, in no sub folders at all.
The folders structure should be /target/YY/DD_MM/filenames
I would like to create a bash script for this, but I'm not really sure where to proceed from here.
I've found this:
#!/bin/bash
DIR=/home/data
target=$DIR
cd "$DIR"
for file in *; do
dname="$( date -d "${file%-*}" "+$target/%Y/%b_%m" )"
mkdir -vp "${dname%/*}"
mv -vt "$dname" "$file"
done
Would creating a folder without checking if it exists delete files inside that folder?
I'm also not quite sure what adding an asterisk to the dir pathname would do?
I'm not quite familiar with bash, but I'd love to get this working if someone could please explain to me a little more what's going on?
Thank you!
I seem to have found an answer that suits me. This worked on OSX just fine on three files; before I run it on the massive folder, can you check that this isn't going to fail somewhere?
#!/bin/bash
DIR=/Users/limeworks/Downloads/target
target=$DIR
cd "$DIR"
for file in *; do
# Top tier folder name
year=$(stat -f "%Sm" -t "%Y" "$file")
# Secondary folder name
subfolderName=$(stat -f "%Sm" -t "%d-%m-%Y" "$file")
if [ ! -d "$target/$year" ]; then
mkdir "$target/$year"
echo "starting new year: $year"
fi
if [ ! -d "$target/$year/$subfolderName" ]; then
mkdir "$target/$year/$subfolderName"
echo "starting new day & month folder: $subfolderName"
fi
echo "moving file $file"
mv "$file" "$target/$year/$subfolderName"
done
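On the question of whether creating a folder without checking can delete anything: `mkdir -p` leaves an existing directory and its contents untouched, so the two existence checks can be folded into one call. The sketch below assumes GNU `date`, where `-r FILE` prints the file's modification time (on macOS `date -r` expects seconds instead, which is why the script above uses `stat -f`). The function name `sort_by_mtime` is made up, and the layout `%Y/%d_%m` follows the year/DD_MM structure asked for in the question.

```bash
#!/usr/bin/env bash
# sort_by_mtime DIR (hypothetical helper, GNU date assumed):
# move every regular file in DIR into DIR/YYYY/DD_MM based on its mtime.
sort_by_mtime() {
    local dir=${1%/} file sub
    for file in "$dir"/*; do
        [ -f "$file" ] || continue            # skip directories and an unmatched glob
        sub=$(date -r "$file" '+%Y/%d_%m') || return 1
        mkdir -p -- "$dir/$sub"               # no-op when the folder already exists
        mv -n -- "$file" "$dir/$sub/"         # -n: never overwrite an existing file
    done
}
```

Trying it on a copy of a few files first, as the asker did, is still a sensible precaution before running it on the full folder.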
I've had issues with the performance of the other solutions since my filesystem is remotely mounted and access times are big.
I've worked on some improved solutions in bash and python:
Bash version:
record # cat test.sh
for each in *.mkv
do
date=$(date +%Y-%d-%m -r "$each")
_DATES+=("$date")
FILES+=("$each")
done
DATES=$(printf "%s\n" "${_DATES[@]}" | sort -u)
for date in $DATES; do
if [ ! -d "$date" ]; then
mkdir "$date"
fi
done
for i in "${FILES[@]}"; do
dest=$(date +%Y-%d-%m -r "$i")
mv "$i" "$dest/$i"
done
record # time bash test.sh
real 0m3.785s
record #
Python version:
import os, datetime, errno, argparse, sys
def create_file_list(CWD):
    """ takes string as path, returns list of tuple(file, date) """
    files_with_mtime = []
    for filename in [f for f in os.listdir(CWD) if os.path.splitext(f)[1] in ext]:
        mtime = os.stat(os.path.join(CWD, filename)).st_mtime
        files_with_mtime.append((filename, datetime.datetime.fromtimestamp(mtime).strftime('%Y-%m-%d')))
    return files_with_mtime
def create_directories(files):
    """ takes tuple(file,date) from create_file_list() """
    m = []
    for i in files:
        m.append(i[1])
    for i in set(m):
        try:
            os.makedirs(os.path.join(CWD, i))
        except OSError as exception:
            # an already existing directory is fine, anything else is not
            if exception.errno != errno.EEXIST:
                raise
def move_files_to_folders(files):
    """ gets tuple(file,date) from create_file_list() """
    for i in files:
        os.rename(os.path.join(CWD, i[0]), os.path.join(CWD, i[1], i[0]))
    return len(files)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(prog=sys.argv[0], usage='%(prog)s [options]')
    parser.add_argument("-e", "--extension", action='append', help="File extensions to match", required=True)
    args = parser.parse_args()
    ext = ['.' + e for e in args.extension]
    print("Moving files with extensions:", ext)
    CWD = os.getcwd()
    files = create_file_list(CWD)
    create_directories(files)
    print("Moved %i files" % move_files_to_folders(files))
record # time python sort.py -e mkv
Moving files with extensions: ['.mkv']
Moved 319 files
real 0m1.543s
record #
Both scripts are tested upon 319 mkv files modified in the last 3 days.
I worked on a little script and tested it. Hope this helps.
#!/bin/bash
pwd=`pwd`
#list all files,cut date, remove duplicate, already sorted by ls.
dates=`ls -l --time-style=long-iso|grep -e '^-.*'|awk '{print $6}'|uniq`
#for loop to find all files modified on each unique date and copy them to your pwd
for date in $dates; do
if [ ! -d "$date" ]; then
mkdir "$date"
fi
#find command will find all files modified at particular dates and ignore hidden files.
forward_date=`date -d "$date + 1 day" +%F`
find "$pwd" -maxdepth 1 -not -path '*/\.*' -type f -newermt "$date" ! -newermt "$forward_date" -exec cp -f {} "$pwd/$date" \;
done
You must run it from the working directory that contains the files to be copied according to date.

Shell script to poll a directory and stop upon an event

Need shell script to:
1/keep polling a directory "receive_dir" irrespective of having files or no files in it.
2/move the files over to another directory "send_dir".
3/the script should only stop polling upon a file "stopfile" get moved to "receive_dir". Thanks !!
My script:
until [ $i = stopfile ]
do
for i in `ls receive_dir`; do
time=$(date +%m-%d-%Y-%H:%M:%S)
echo $time
mv receive_dir/$i send_dir/;
done
done
This fails on empty directories. Also, is there a better way?
If you are running on Linux, you might wish to consider inotifywait
$ declare -f tillStopfile
tillStopfile ()
{
cd receive_dir
[[ -d ../send_dir ]] || mkdir ../send_dir
while true; do
date +%m-%d-%Y-%H:%M:%S
for f in *
do
mv "$f" ../send_dir
[[ $f == "stopfile" ]] && break 2
done
sleep 3
done
}
$
Improvements:
while true ... break: easier to control this loop
cd receive_dir: why not run in the "receive_dir"?
factor date out of the inner loop: unless you need to see each time-stamp?
added the suggested "sleep": pick a suitable interval
Run:
$ tillStopfile 2>/dev/null # suppresses mv error messages (e.g. when the directory is empty)
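For the inotifywait suggestion, a rough sketch is given below. It assumes the inotify-tools package is installed (Linux only), and the function name `watch_and_move` is made up for illustration. Files that are already in the directory before the watch starts are not reported, so they would need one ordinary pass first.

```bash
#!/usr/bin/env bash
# watch_and_move RECEIVE_DIR SEND_DIR (hypothetical helper; needs inotify-tools)
# Moves each file that arrives in RECEIVE_DIR to SEND_DIR and stops once
# a file named "stopfile" has been moved.
watch_and_move() {
    local receive_dir=$1 send_dir=$2 name
    mkdir -p -- "$send_dir"
    # -m keeps watching; --format '%f' prints one bare file name per event.
    inotifywait -q -m -e close_write -e moved_to --format '%f' "$receive_dir" |
    while IFS= read -r name; do
        mv -- "$receive_dir/$name" "$send_dir/"
        if [ "$name" = stopfile ]; then
            break
        fi
    done
}
```

Compared with the sleep loop, this reacts as soon as a file is closed or moved in, and it does nothing at all while the directory is idle. After the loop breaks, the inotifywait process only exits when it next tries to write an event.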

bash shell script to run imgcmp on two JPGs and store 'different' ones

I've got an IP camera that ftps files to a directory on my SuSE server.
I'm trying to write a shell script to do the following:
for every file in a directory;
use image compare to check this file against the next one
store the output in a file or variable.
if the next file is different then
copy the original to another folder
else
delete the original
end for
Running the following at the prompt generates this:
myserver:/uploads # imgcmp -f img_01.jpg -F img_02.jpg -m rmse > value.txt
myserver:/uploads # cat value.txt
5.559730
5.276747
6.256132
myserver:/uploads #
I know there's loads wrong with the code, the main issue I've got is with executing imgcmp from the script and extracting a value from it, so please point out the obvious as it may not be to me.
FILES=/uploads/img*
declare -i value
declare -i result
value = 10
shopt -s nullglob
# no idea what the above even does #
# IFS=.
# attempt to read the floating point number from imgcmp & make it an integer
for f in $FILES
do
echo "doing stuff w/ $f"
imgcmp -f 4f -F 4f+1 -m rmse > value.txt
# doesn't seem to find the files from the variables #
result= ( $(<value.txt) )
if [ $result > $value ] ; then
echo 'different';
# and copy it off to another directory #
else
echo 'same'
# and delete it #
fi
if $f+1 = null; then
break;
fi
done
when running the above, I get an error cannot open /uploads/img_023.jpg+1
and doing a cat of value.txt shows nothing, so all the files show as being the same.
I know where the issues are, but I've got no idea what I should actually be doing to extract the output of imgcmp (run from within a script) and then get it into a variable that I can compare it with.
FILES=/uploads/*
current=
for f in $FILES; do
if [ -z "$current" ]; then
current="$f"
continue
fi
next="$f"
echo "<> Comparing $current against $next"
## imgcmp will return non-0 if images cannot be compared
## and print an explanation message to stderr;
if result=$(imgcmp -f "$current" -F "$next" -m rmse); then
echo "comparison result: " $result
## Checking whether the first value returned
## is greater than 10
if [ "$(echo "$result" | awk 'NR == 1 && $1 > 10 {print "different"}')" = "different" ]; then
echo 'different';
# cp -v $current /some/other/folder/
else
echo 'same'
# rm -v $current
fi
else
## images cannot be compared... different dimensions / components / ...
echo 'wholly different'
# cp -v $current /some/other/folder/
fi
current="$next"
done
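A note on the comparison itself: in the question's `[ $result > $value ]`, the `>` is a redirection to a file named after `$value`, not a numeric test, and `[` cannot compare floating-point numbers anyway. The awk-based test used in the answer can be wrapped as a small helper. The name `exceeds` is an assumption for illustration, and the sample numbers are the imgcmp output shown in the question.

```bash
#!/usr/bin/env bash
# exceeds VALUE THRESHOLD (hypothetical helper):
# succeeds (exit status 0) when VALUE > THRESHOLD, compared as floating point.
exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

result=$'5.559730\n5.276747\n6.256132'   # sample output copied from the question
first=${result%%$'\n'*}                  # keep only the first line

if exceeds "$first" 10; then
    echo different
else
    echo same                            # this branch runs: 5.559730 is not > 10
fi
```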
