Move all files in a folder to a new location if a folder with the same name exists at the remote location - bash

Looking for a bash script:
Here's the situation:
I have thousands of folders and subfolders on my backup directory drive,
let's say...
/backup
/backup/folderA
/backup/folderA/FolderAA
/backup/folderB
/backup/folderB/FolderBB
I have dozens of similar folders in a secondary location (with files in them), and the folder names will match one of the folders or subfolders on the main backup drive.
I would like to move all contents of specific extension types from my secondary location $FolderName to the backup location's matching subfolder, ONLY if the $FolderName matches exactly, and then remove the folders from my secondary location.
If there is no corresponding folder or subfolder in the backup location, then leave the source folders & files alone.
Looking forward to getting some help/guidance.
Mike
Additional info requested. Expected input and output:
Let's say I have the following:
Backup Folder
/backup/test/file.bak
And for my secondary folder location:
/secondarylocation/mike/test/hello/john.bak
/secondarylocation/mike/test/hello/backup.zip
I would like this as the end result:
/backup/test/file.bak
/backup/test/john.bak
/backup/test/backup.zip
and /secondarylocation/mike/test (and its subfolders and files) removed.
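For this one example, the manual equivalent would be roughly:
mv /secondarylocation/mike/test/hello/*.bak /secondarylocation/mike/test/hello/*.zip /backup/test/
rm -r /secondarylocation/mike/test
but I need this to happen automatically for every folder name that matches.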

Run this script with quoted folders and file types:
./merge.sh "backup" "secondarylocation/mike" "*.zip" "*.bak"
Replace -iname with -name if you want the suffix search to be case-sensitive.
Replace mv -fv with mv -nv when you don't want to overwrite duplicate file names.
Add -mindepth 1 to the last find if you want to keep the empty folder test.
merge.sh:
#!/bin/bash
# read folders from positional parameters
[ -d "$1" ] && targetf="$1" && shift
[ -d "$1" ] && sourcef="$1" && shift
if [ -z "$targetf" ] || [ -z "$sourcef" ]
then
    echo -e "usage: ./merge.sh <targetfolder> <sourcefolder> [PATTERN]..."
    exit 1
fi
# add prefix -iname for each pattern
while [ ${pattern:-1} -le $# ]
do
    set -- "$@" "-iname \"$1\""
    shift
    pattern=$((${pattern:-1}+1))
done
# concatenate all prefix+patterns with -o and wrap in parentheses ()
if (( $# > 1 ))
then
    pattern="\( $1"
    while (( $# > 1 ))
    do
        pattern="$pattern -o $2"
        shift
    done
    pattern="$pattern \)"
else
    pattern="$1"
fi
# move files from searchf to destf
find "$targetf" -mindepth 1 -type d -print0 | sort -z | while IFS= read -r -d $'\0' destf
do
    find "$sourcef" -mindepth 1 -type d -name "${destf##*/}" -print0 | sort -z | while IFS= read -r -d $'\0' searchf
    do
        if (( $# ))
        then
            # search with pattern
            eval find "\"$searchf\"" -depth -type f "$pattern" -exec mv -fv {} "\"$destf\"" \\\;
        else
            # all files
            find "$searchf" -depth -type f -exec mv -fv {} "$destf" \;
        fi
        # delete empty folders
        find "$searchf" -depth -type d -exec rmdir --ignore-fail-on-non-empty {} +
    done
done
exit 0
This will merge hello into test (earn the fruits and cut the tree).
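If you want to preview what the script would pick up before running it for real, one rough check (not part of the script itself) is to run the inner search by hand with the same patterns, e.g.:
find "secondarylocation/mike" -type f \( -iname "*.zip" -o -iname "*.bak" \) -print
or to temporarily change mv -fv to echo mv -fv inside merge.sh so it only prints the moves it would make.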

Related

Count the number of .xlsx files across all directories and subdirectories where a .tab file also exists

I have a directory which consists of 'n' subdirectories. The following is the structure:
RootDir ->
    SubDir1 ->
        test.xlsx
        test.tab
    SubDir2 ->
        test.xlsx
As shown above, SubDir1 has both the .xlsx and .tab files, and SubDir2 has only the .xlsx. Like this, I have 'n' subdirectories, and I want to count only the .xlsx files from the folders where a .tab file is also present.
I want to do it using shell scripting.
My present code returns the count of .xlsx files, but it also includes the .xlsx files where no .tab file is present.
find . -name '*.xlsx' -type f
The following code should work:
count=0
for file in `find . -name '*.xlsx' -type f`; do
    if [ -f "${file%.xlsx}"".tab" ]; then
        count=`expr $count + 1`
    fi
done
echo $count
A slightly refined version of Ankush Pandit's answer:
#!/bin/bash
count=0
while IFS= read -r -d "" f; do
    [[ -f ${f%.xlsx*}.tab ]] && (( count++ ))
done < <(find RootDir -type f -name "*.xlsx" -print0)
echo "$count"
The combination of -print0 and read -d "" options protects filenames
which contain special characters such as a space character.
The syntax <(find ..) is a process substitution and the output of
find is fed to the read command in the while loop via the redirect.
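If you prefer to let find do the check as well, a variant without the shell loop is possible (a sketch, assuming find can spawn a POSIX sh for the test):
find RootDir -type f -name '*.xlsx' -exec sh -c '[ -f "${1%.xlsx}.tab" ]' _ {} \; -print | wc -l
Here the -exec acts as a test: -print only fires for .xlsx files whose sibling .tab exists, and wc -l counts those lines.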

find emitting unexpected ".", making wc -l list more contents than expected

I'm trying to use find with the -newer option as follows:
touch $HOME/mark.start -d "$d1"
touch $HOME/mark.end -d "$d2"
SF=$HOME/mark.start
EF=$HOME/mark.end
find . -newer $SF ! -newer $EF
But this gives me an output like this:
.
./File5
and counts as 2 files; however, that directory only has 1 file, i.e., File5. Why is this happening, and how do I solve it?
UPDATE:
I'm actually trying to run the following script:
#!/bin/bash
check_dir () {
d1=$2
d2=$((d1+1))
f1=`mktemp`
f2=`mktemp`
touch -d $d1 $f1
touch -d $d2 $f2
n=$(find $1 \( -name "*$d1*" \) -o \( -newer $f1 ! -newer $f2 \) | wc -l)
if [ $n != $3 ]; then echo $1 "=" $n ; fi
rm -f $f1 $f2
}
That checks whether the directory has files that either have a particular date in the format YYYYMMDD in their name, or whose last modification time was within the last day.
check_dir ./dir1 20151215 4
check_dir ./dir2 20151215 3
where dir1 should have 4 such files, and if that is not true then it prints the actual number of files that are there.
So, when the directory only has files with dates in their names, it checks them fine, but when it checks with -newer, it always reports 1 extra file (which is not even there in the directory). Why is this happening?
The question asks why there's an extra . in the results from find, even when no file or directory by that name exists. The answer is simple: . always exists, even when it's hidden. Use ls -a to show hidden contents, and you'll see that it's present.
Your existing find command doesn't exempt the target directory itself -- . -- from being a legitimate result, which is why you're getting more results than you expect.
Add the following filter:
-mindepth 1 # only include content under the file or directory specified
...or, if you only want to count files, use...
-type f # only include regular files
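Applied to the command from the question, that gives:
find . -mindepth 1 -type f -newer "$SF" ! -newer "$EF"
which no longer reports . itself.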
Assuming GNU find, by the way, this all can be made far more efficient:
check_dir() {
    local d1 d2 n # otherwise these variables leak into global scope
    d1=$2
    d2=$(gdate -d "+ 1 day $d1" '+%Y%m%d') # assuming GNU date is installed as gdate
    n=$(find "$1" -mindepth 1 \
          '(' -name "*${d1}*" -o \
              '(' -newermt "$d1" '!' -newermt "$d2" ')' ')' \
          -printf '\n' | wc -l) # outer parentheses so -printf applies to both alternatives
    if (( n != $3 )); then
        echo "$1 = $n"
    fi
}
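On most Linux systems GNU date is installed simply as date, so that line can be written as (a platform assumption on my part):
d2=$(date -d "+ 1 day $d1" '+%Y%m%d')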

Counting the total number of files and directories in a provided folder, including subdirectories and their files

I want to count all the files and directories from a provided folder, including the files and directories in its subdirectories. I have written a script which accurately counts the number of files and directories, but it does not handle the subdirectories. Any ideas?
I want to do it without using the find command.
#!/bin/bash
givendir=$1
cd "$givendir" || exit
file=0
directories=0
for d in *;
do
    if [ -d "$d" ]; then
        directories=$((directories+1))
    else
        file=$((file+1))
    fi
done
echo "Number of directories :" $directories
echo "Number of file Files :" $file
Use find:
echo "Number of directories: $(find "$1" -type d | wc -l)"
echo "Number of files/symlinks/sockets: $(find "$1" ! -type d | wc -l)"
Using plain shell and recursion:
#!/bin/bash
countdir() {
    cd "$1"
    dirs=1
    files=0
    for f in *
    do
        if [[ -d $f ]]
        then
            read subdirs subfiles <<< "$(countdir "$f")"
            (( dirs += subdirs, files += subfiles ))
        else
            (( files++ ))
        fi
    done
    echo "$dirs $files"
}
shopt -s dotglob nullglob
read dirs files <<< "$(countdir "$1")"
echo "There are $dirs dirs and $files files"
find "$1" -type f | wc -l will give you the files, find "$1" -type d | wc -l the directories
My quick-and-dirty shellscript would read
#!/bin/bash
test -d "$1" || exit
files=0
# Start with 1 to count the starting dir (as find does), else with 0
directories=1
function docount () {
    for d in "$1"/*; do
        if [ -d "$d" ]; then
            directories=$((directories+1))
            docount "$d";
        else
            files=$((files+1))
        fi
    done
}
docount "$1"
echo "Number of directories :" $directories
echo "Number of files :" $files
But mind it: on the build folder for one of my projects, there were quite a few differences:
find: 6430 dirs, 74377 non-dirs
my script: 6032 dirs, 71564 non-dirs
@thatotherguy's script: 6794 dirs, 76862 non-dirs
I assume that has to do with the legions of links, hidden files etc., but I am too lazy to investigate: find is the tool of choice.
Here are some one-line commands that work without find:
Number of directories: ls -Rl ./ | grep ":$" | wc -l
Number of files: ls -Rl ./ | grep "[0-9]:[0-9]" | wc -l
Explanation:
ls -Rl lists all files and directories recursively, one line each.
grep ":$" finds just the results whose last character is ':'. These are all of the directory names.
grep "[0-9]:[0-9]" matches on the HH:MM part of the timestamp. The timestamp only shows up on file, not directories. If your timestamp format is different then you will need to pick a different grep.
wc -l counts the number of lines that matched from the grep.

bash delete directories based on contents

Currently I have multiple directories
Directory1 Directory2 Directory3 Directory4
each of these directories contain files (the files are somewhat cryptic)
What I wish to do is scan the files within the folders to see if certain files are present; if they are, then leave that folder alone, and if they are not present, then delete the entire directory. Here is what I mean:
I'm searching for the files that have the word .pass. in the filename.
Say Directory4 has the file that I'm looking for:
Directory4:
file1.temp.pass.exmpl
file1.temp.exmpl
file1.tmp
and the rest of the Directories do not have that specific file:
file.temp
file.exmp
file.tmp.other
So I would like to delete Directory1, 2 and 3, but only keep Directory4...
So far I have come up with this code
(arr is an array of all the directory names)
for x in ${arr[#]}
do
find $x -type f ! -name "*pass*" -exec rd {} $x\;
done
Another way I have thought of doing this is like this:
for x in ${arr[#]}
do
cd $x find . -type f ! -name "*Pass*" | xargs -i rd {} $x/
done
So far these don't seem to work, and I'm scared that I might do something wrong and have all my files deleted (I have backed up).
Is there any way that I can do this? Remember, I want Directory4 to be unchanged; everything in it I want to keep.
To see if your directory contains a pass file:
if [ "" = "$(find directory -iname '*pass*' -type f | head -n 1)" ]
then
echo notfound
else
echo found
fi
To do that in a loop:
for x in "${arr[#]}"
do
if [ "" = "$(find "$x" -iname '*pass*' -type f | head -n 1)" ]
then
rm -rf "$x"
fi
done
Try this:
# arr is an array of all the directory names
for x in "${arr[@]}"
do
    ret=$(find "$x" -type f -name "*pass*" -exec echo "0" \;)
    # expect a zero-length $ret value to remove the directory
    if [ -z "$ret" ]; then
        # remove dir
        rm -rf "$x"
    fi
done
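If you have GNU find, a slightly cheaper variant of the same idea stops at the first match instead of echoing for every file (a sketch using -print -quit):
for x in "${arr[@]}"
do
    if [ -z "$(find "$x" -type f -name '*pass*' -print -quit)" ]; then
        rm -rf "$x"
    fi
done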

How to loop through a directory recursively to delete files with certain extensions

I need to loop through a directory recursively and remove all files with extension .pdf and .doc. I'm managing to loop through a directory recursively but not managing to filter the files with the above mentioned file extensions.
My code so far
#!/bin/sh
SEARCH_FOLDER="/tmp/*"
for f in $SEARCH_FOLDER
do
    if [ -d "$f" ]
    then
        for ff in $f/*
        do
            echo "Processing $ff"
        done
    else
        echo "Processing file $f"
    fi
done
I need help to complete the code, since I'm not getting anywhere.
As a followup to mouviciel's answer, you could also do this as a for loop, instead of using xargs. I often find xargs cumbersome, especially if I need to do something more complicated in each iteration.
for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm $f; done
As a number of people have commented, this will fail if there are spaces in filenames. You can work around this by temporarily setting the IFS (internal field separator) to the newline character. This also fails if there are wildcard characters [?* in the file names. You can work around that by temporarily disabling wildcard expansion (globbing).
IFS=$'\n'; set -f
for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm "$f"; done
unset IFS; set +f
If you have newlines in your filenames, then that won't work either. You're better off with an xargs based solution:
find /tmp \( -name '*.pdf' -or -name '*.doc' \) -print0 | xargs -0 rm
(The escaped brackets are required here to have the -print0 apply to both or clauses.)
GNU and *BSD find also has a -delete action, which would look like this:
find /tmp \( -name '*.pdf' -or -name '*.doc' \) -delete
find is just made for that.
find /tmp -name '*.pdf' -or -name '*.doc' | xargs rm
Without find:
for f in /tmp/* /tmp/**/* ; do
...
done;
/tmp/* are files in dir and /tmp/**/* are files in subfolders. It is possible that you have to enable globstar option (shopt -s globstar).
So for the question the code should look like this:
shopt -s globstar
for f in /tmp/*.pdf /tmp/*.doc /tmp/**/*.pdf /tmp/**/*.doc ; do
rm "$f"
done
Note that this requires bash ≥4.0 (or zsh without shopt -s globstar, or ksh with set -o globstar instead of shopt -s globstar). Furthermore, in bash <4.3, this traverses symbolic links to directories as well as directories, which is usually not desirable.
If you want to do something recursively, I suggest you use recursion (yes, you can do it using stacks and so on, but hey).
recursiverm() {
    for d in *; do
        if [ -d "$d" ]; then
            (cd -- "$d" && recursiverm)
        fi
        rm -f *.pdf
        rm -f *.doc
    done
}
(cd /tmp; recursiverm)
That said, find is probably a better choice as has already been suggested.
Here is an example using shell (bash):
#!/bin/bash
# loop & print a folder recursively
print_folder_recurse() {
    for i in "$1"/*;do
        if [ -d "$i" ];then
            echo "dir: $i"
            print_folder_recurse "$i"
        elif [ -f "$i" ]; then
            echo "file: $i"
        fi
    done
}
# try to get the path from the first parameter
path=""
if [ -d "$1" ]; then
    path=$1;
else
    path="/tmp"
fi
echo "base path: $path"
print_folder_recurse $path
This doesn't answer your question directly, but you can solve your problem with a one-liner:
find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -exec rm {} +
Some versions of find (GNU, BSD) have a -delete action which you can use instead of calling rm:
find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -delete
For bash (since version 4.0):
shopt -s globstar nullglob dotglob
echo **/*".ext"
That's all.
The trailing extension ".ext" is there to select files (or dirs) with that extension.
Option globstar activates ** (recursive search).
Option nullglob makes a pattern that matches nothing expand to nothing instead of itself.
Option dotglob includes files that start with a dot (hidden files).
Beware that before bash 4.3, **/ also traverses symbolic links to directories which is not desirable.
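Putting those options together for the .pdf/.doc case from the question, a sketch (assuming bash ≥ 4.3) would be:
shopt -s globstar nullglob dotglob
rm -f -- **/*.pdf **/*.doc
The -f only serves to keep rm quiet if one of the globs happens to match nothing.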
This method handles spaces well.
files="$(find -L "$dir" -type f)"
echo "Count: $(echo -n "$files" | wc -l)"
echo "$files" | while read file; do
echo "$file"
done
Edit, fixes off-by-one
function count() {
    files="$(find -L "$1" -type f)";
    if [[ "$files" == "" ]]; then
        echo "No files";
        return 0;
    fi
    file_count=$(echo "$files" | wc -l)
    echo "Count: $file_count"
    echo "$files" | while read file; do
        echo "$file"
    done
}
This is the simplest way I know to do this:
rm **/@(*.doc|*.pdf)
** makes this work recursively
@(*.doc|*.pdf) looks for a file ending in pdf OR doc
Easy to safely test by replacing rm with ls
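In bash, that pattern needs globstar and extglob switched on first (an assumption about the shell being used):
shopt -s globstar extglob
ls **/@(*.doc|*.pdf)   # preview the matches
rm **/@(*.doc|*.pdf)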
The following function recursively iterates through all the directories under /home/ubuntu (the whole directory structure under ubuntu) and applies the necessary checks in the else block.
function check {
    for file in "$1"/*
    do
        if [ -d "$file" ]
        then
            check "$file"
        else
            ## check the file
            if [ "$(head -c 4 "$file")" = "%PDF" ]; then
                rm -r "$file"
            fi
        fi
    done
}
domain=/home/ubuntu
check "$domain"
There is no reason to pipe the output of find into another utility. find has a -delete action built into it.
find /tmp \( -name '*.pdf' -or -name '*.doc' \) -delete
(The parentheses are needed here so that -delete applies to both -name tests.)
The other answers provided will not include files or directories that start with a dot. The following worked for me:
#!/bin/sh
getAll()
{
    local fl1="$1"/*;
    local fl2="$1"/.[!.]*;
    local fl3="$1"/..?*;
    for inpath in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        if [ "$inpath" != "$fl1" -a "$inpath" != "$fl2" -a "$inpath" != "$fl3" ]; then
            stat --printf="%F\0%n\0\n" -- "$inpath";
            if [ -d "$inpath" ]; then
                getAll "$inpath"
            #elif [ -f $inpath ]; then
            fi;
        fi;
    done;
}
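The function above is only defined, not invoked; it still needs to be called on a starting directory, for example:
getAll .   # start from the current directory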
I think the most straightforward solution is to use recursion. In the following example, I print all the file names in the directory and its subdirectories.
You can modify it according to your needs.
#!/bin/bash
printAll() {
    for i in "$1"/*;do        # for all in the root
        if [ -f "$i" ]; then      # if a file exists
            echo "$i"             # print the file name
        elif [ -d "$i" ];then     # if a directory exists
            printAll "$i"         # call printAll inside it (recursion)
        fi
    done
}
printAll "$1" # e.g.: ./printAll.sh .
OUTPUT:
> ./printAll.sh .
./demoDir/4
./demoDir/mo st/1
./demoDir/m2/1557/5
./demoDir/Me/nna/7
./TEST
It works fine with spaces as well!
Note:
You can use echo "$(basename "$i")" to print the file name without its path.
Or use echo "${i##*/}", which runs much faster because it does not call the external basename.
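For example, with one of the paths from the output above:
i="./demoDir/mo st/1"
echo "${i##*/}"   # prints: 1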
Just do
find . -name '*.pdf'|xargs rm
If you can change the shell used to run the command, you can use ZSH to do the job.
#!/usr/bin/zsh
for file in /tmp/**/*
do
echo $file
done
This will recursively loop through all files/folders.
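To restrict the same zsh loop to the extensions from the question, the glob can use zsh's alternation (a sketch; it assumes at least one match exists):
#!/usr/bin/zsh
for file in /tmp/**/*.(pdf|doc)
do
    rm -- $file
done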
The following will loop through the given directory and list the contents of each of its subdirectories:
for d in /home/ubuntu/*;
do
    echo "listing contents of dir: $d";
    ls -l "$d"/;
done
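If you need it to go arbitrarily deep without find, bash's globstar can be combined with the same loop (a sketch, assuming bash ≥ 4.0):
shopt -s globstar
for d in /home/ubuntu/**/;
do
    echo "listing contents of dir: $d";
    ls -l "$d";
done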
