Bash script backup, check if directory contains the files from another directory - bash

I am writing a bash backup script and want it to check whether the files in one directory are already present in another directory. If any are missing, I want to output their names.
#!/bin/bash
TARGET_DIR=$1
INITIAL_DIR=$2
TARG_ls=$(ls -A $1)
INIT_ls=$(ls -A $2)
if [[ "$(ls -A $2)" ]]; then
if [[ ! -n "$(${TARG_ls} | grep ${INIT_ls})" ]]; then
echo All files in ${INITIAL_DIR} have backups for today in ${TARGET_DIR}
exit 0
else
#code for listing the missing files
fi
else
echo Error!! ${INITIAL_DIR} has no files
exit 1
fi
I have thought about storing the ls output of both directories inside strings and comparing them, as it is shown in the code, but in the event where I have to list the files from INITIAL_DIR that are missing in TARGET_DIR, I just don't know how to proceed.
I tried using the diff command comparing the two directories but that takes into account the preexisting files of TARGET_DIR.
In the above code, if [[ "$(ls -A $2)" ]]; checks whether INITIAL_DIR contains any files, and if [[ ! -n "$(${TARG_ls} | grep ${INIT_ls})" ]]; is meant to check whether the target directory contains all the files of the initial directory.
Does anyone have a suggestion or hint?

You can use the comm command. It expects sorted input, and ls sorts its output by default:
$ comm <(ls -A a) <(ls -A b)
This prints three columns: files only in a, files only in b, and files in both. To list only the files that are in a but not in b, suppress columns 2 and 3:
$ comm -23 <(ls -A a) <(ls -A b)
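Applied to the backup script in the question, the comm approach could look like the sketch below. The function name missing_in_target is made up for illustration, and the sketch assumes file names without embedded newlines.

```shell
#!/bin/bash
# Print the names found in the first directory that are absent from the second.
# missing_in_target is a hypothetical helper name, not part of the question.
missing_in_target() {
    local initial_dir=$1 target_dir=$2
    # comm needs both inputs sorted the same way, so sort explicitly
    # under one fixed locale instead of relying on the order ls produces.
    LC_ALL=C comm -23 \
        <(ls -A -- "$initial_dir" | LC_ALL=C sort) \
        <(ls -A -- "$target_dir" | LC_ALL=C sort)
}
```

In the question's script, the output of missing_in_target "$INITIAL_DIR" "$TARGET_DIR" could be stored in a variable: an empty result means every file has a backup, otherwise the variable holds the missing names, one per line.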

rsync has a --dry-run switch that shows which files differ between two directories without copying anything. Before making rsync copies of my home directory, I preview the changes this way to check for signs of mass malicious encryption or corruption before proceeding.

Related

compare multiple directories on remote hosts

Is there a way to diff on contents of multiple directories instead of two directories? Or diff a single directory on multiple hosts.
I wrote the following bash script to diff a directory on three hosts
#!/bin/bash
if [[ -n "$(diff <(ssh user@host1 ls -r /user/test1) <(ssh user@host2 ls -r /user/test1))" || -n "$(diff <(ssh user@host2 ls -r /user/test1) <(ssh user@host3 ls -r /user/test1))" ]]; then
echo "There are differences"
fi
Is there a better way to do this?
Yes, GNU diff has an option --from-file that allows the comparison of one reference file or directory to many others.
diff -r --from-file=ref-dir dir1 dir2 ... dirN
Note that it will only compare ref-dir to dir1, ..., dirN; it won't compare dir1 to dir2, ..., dirN.
As for your remote directories, since you have ssh access to the machines, you can mount them locally with sshfs in order to execute diff over them.
You could compute an MD5 checksum of the file list on each host. This lets the same script work for any number of servers. If the lists are identical, the checksums are identical, so you compare each checksum with the previous one. If any pair differs, the listings differ.
#!/bin/bash
MD5SUMS=$(
for hostname in host{1,2,3}
do
result=$(ssh user@${hostname} ls -r /user/test1 | md5sum)
echo "${result%% *}" # print only the checksum, without the trailing " -"
done
)
PREVSUM=""
for SUM in ${MD5SUMS}
do
if [ -z "$PREVSUM" ]
then
PREVSUM=$SUM
continue
else
if [ "$PREVSUM" != "$SUM" ]
then
echo "There are differences"
fi
PREVSUM=$SUM
fi
done
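A more compact variant of the same idea is to count the distinct checksums instead of comparing neighbours. The sketch below works on local directories so that it can be tried without any hosts; listings_match is a hypothetical name, and for remote hosts the ls call would be wrapped in ssh as in the script above.

```shell
#!/bin/bash
# Succeed (exit status 0) when every directory given has the same listing.
# listings_match is a hypothetical helper name used only for this sketch.
listings_match() {
    local dir distinct
    distinct=$(
        for dir in "$@"; do
            # Hash the recursive listing and keep only the checksum field.
            (cd "$dir" && ls -R) | md5sum | cut -d' ' -f1
        done | sort -u | wc -l
    )
    # One distinct checksum (or none) means all listings are identical.
    [ "$distinct" -le 1 ]
}
```

It could be used as listings_match dir1 dir2 dir3 || echo "There are differences". Like the original, it compares only file names, not file contents.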

bash check for subdirectories under directory

This is my first day scripting. I use Linux, and I have been racking my brain over a script until I finally decided to ask for help. I need to check a directory that already contains some directories to see whether any new, unexpected directories have been added.
I think I have made this as simple as possible. The script below works, but it also displays all the files in the directory. I will keep working at it unless someone can tell me how to avoid listing the files as well. I tried ls -d, but then the script only echoes "nothing new". I feel like an idiot and should have got this sooner.
#!/bin/bash
workingdirs=`ls ~/ | grep -viE "temp1|temp2|temp3"`
if [ -d "$workingdirs" ]
then
echo "nothing new"
else
echo "The following directories are now present"
echo ""
echo "$workingdirs"
fi
If you want to take some action when a new directory is created, use inotifywait. If you just want to check that the directories that exist are the ones you expect, you could do something like:
: "${TMPDIR:=/tmp}" # fall back to /tmp when TMPDIR is unset
trap 'rm -f "$TMPDIR/manifest"' 0
# Create the expected values. Really, you should hand edit
# the manifest, but this is just for demonstration.
find "$Workingdir" -maxdepth 1 -type d > "$TMPDIR/manifest"
while true; do
sleep 60 # Check every 60 seconds. Modify period as needed, or
# (recommended) use inotifywait
if ! find "$Workingdir" -maxdepth 1 -type d | cmp - "$TMPDIR/manifest"; then
: Unexpected directories exist or have been removed
fi
done
The shell script below reports whether each directory in a list is present.
#!/bin/bash
Workingdir=/root/working/
knowndir1=/root/working/temp1
knowndir2=/root/working/temp2
knowndir3=/root/working/temp3
my=/home/learning/perl
arr=($Workingdir $knowndir1 $knowndir2 $knowndir3 $my) #creating an array
for i in "${arr[@]}" #checking for each element in array
do
if [ -d $i ]
then
echo "directory $i present"
else
echo "directory $i not present"
fi
done
output:
directory /root/working/ not present
directory /root/working/temp1 not present
directory /root/working/temp2 not present
directory /root/working/temp3 not present
directory /home/learning/perl present
This will save the available directories in a list to a file. When you run the script a second time, it will report directories that have been deleted or added.
#!/bin/sh
dirlist="$HOME/dirlist" # dir list file for saving state between runs
topdir='/some/path' # the directory you want to keep track of
tmpfile=$(mktemp)
find "$topdir" -type d -print | sort -o "$tmpfile"
if [ -f "$dirlist" ] && ! cmp -s "$dirlist" "$tmpfile"; then
echo 'Directories added:'
comm -1 -3 "$dirlist" "$tmpfile"
echo 'Directories removed:'
comm -2 -3 "$dirlist" "$tmpfile"
else
echo 'No changes'
fi
mv "$tmpfile" "$dirlist"
The script will have problems with directories that have very exotic names (containing newlines).

bash - find duplicate file in directory and rename

I have a directory that has thousands of files in it with various extensions. I also have a drop location where users drop files to be migrated to this directory. I'm looking for a script that will scan the target directory for a duplicate file name, if found, rename the file in the drop folder, then move it to the target directory.
Example:
/target/file.doc
/drop/file.doc
Script will rename file.doc to file1.doc then move it to /target/.
It needs to maintain the file extension too.
for path in /drop/*
do
fil=$(basename "$path") # file name without the /drop/ prefix
if test -f "/target/$fil"
then
suff=$(awk -F. '{ print "."$NF }' <<<"$fil") # extension, including the dot
bdot=$(basename -s "$suff" "$fil") # name without the extension
mv "/drop/$fil" "/drop/${bdot}1$suff"
cp "/drop/${bdot}1$suff" "/target/${bdot}1$suff"
fi
done
Take each file in the drop directory and check whether a file of the same name exists in /target using test -f. If it does, rename the dropped file and then copy it to /target.
You have to take a bit more care than simply checking whether a file exists before moving it if you want a flexible solution that handles files with or without extensions. You may also want to form duplicate filenames in a way that preserves sort order. For example, if file.txt already exists, you may want to use file_001.txt as the duplicate in target rather than file1.txt, because once you reach 10 the names no longer sort in order.
Also, you never want to iterate with for i in $(ls dir); that is fraught with pitfalls. See Bash Pitfalls No. 1.
Putting those pieces together, with details in the comments, you could do something like the following for a reasonably flexible solution that lets you specify either just filename.ext or /path/to/drop/filename.ext. You must set the drop and target directories in the script to suit your circumstances, e.g.
#!/bin/bash
tgt=target ## set target and drop directories as required
drp=drop
declare -i cnt=1 ## counter for filename_$cnt
test -z "$1" && { ## validate one argument given
printf "error: insufficient input\nusage: %s filename\n" "${0##*/}"
exit 1
}
test -w "$1" || test -w "$drp/$1" || { ## validate valid filename is writeable
printf "error: file not found or lack permission to move '%s'.\n" "$1"
exit 1
}
fn="${1##*/}" ## strip any path info from filename
if test "$fn" != "${fn%.*}" ; then ## test the bare filename, not the path
ext="${fn##*.}" ## get file extension
fnwoe="${fn%."$ext"}" ## get filename without extension
test "$fnwoe" = '' && ext= ## was a dotfile, reset ext
fi
vfn="$fn" ## set valid filename = filename
## form valid filename e.g. "$fn_001.$ext" if duplicate found
while test -e "$tgt/$vfn"; do
if test -n "$ext" ## did we have an extension?
then
printf -v vfn "%s_%03d.%s" "$fnwoe" "$((cnt++))" "$ext"
else
printf -v vfn "%s_%03d" "$fn" "$((cnt++))"
fi
done
mv "$drp/$fn" "$tgt/$vfn" ## move file under non-conflicting name
Example drop and target
$ ls -1 drop
file
file.txt
$ ls -1 target
file.txt
file_001.txt
file_002.txt
Example Use
$ bash mvdrop.sh file
$ bash mvdrop.sh drop/file.txt
Resulting drop and target
$ ls -1 drop
$ ls -1 target
file
file.txt
file_001.txt
file_002.txt
file_003.txt
This tests whether the file already exists in the target directory, preserves the extension (along with any structure before the extension, such as in the case of FILE.tar.gz), and moves the file to the target directory.
#!/bin/bash
TARGET="/target/"
DROP="/drop/"
for F in $(ls "$DROP"); do
if [[ -f $TARGET$F ]]; then
EXT=${F##*.} # text after the last dot
PRE=${F%.*} # everything before the last dot
mv "$DROP$F" "$DROP${PRE}1.$EXT"
F="${PRE}1.$EXT"
fi
mv "$DROP$F" "$TARGET"
done
Additionally, you may want to do some filtering in the ls command so that you aren't moving entire directories.
Display only regular files (no directories or symbolic links)
ls -p $DROP | grep -v /
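The same filtering can be done without parsing ls output, which also keeps names containing spaces intact. A minimal sketch, where regular_files is a made-up helper name and hidden files are skipped just as plain ls would skip them:

```shell
#!/bin/bash
# Print the names of regular files (not directories, not symlinks)
# directly inside the given directory, one per line.
# regular_files is a hypothetical helper name used only for this sketch.
regular_files() {
    local dir=${1%/} path
    for path in "$dir"/*; do
        # -f follows symlinks, so exclude links explicitly with -L.
        if [ -f "$path" ] && [ ! -L "$path" ]; then
            printf '%s\n' "${path##*/}"
        fi
    done
}
```

The loop in the script could then read its names with regular_files "$DROP" | while IFS= read -r F; do ...; done instead of iterating over the output of ls.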

How to identify files which are not in list using bash?

Unfortunately, my bash knowledge is limited, and I have a rather non-standard task.
I have a file containing a list of files.
Example: /tmp/my/file1.txt /tmp/my/file2.txt
How can I write a script that checks whether the files from the list exist in the folder /tmp/my and prints two kinds of messages when it is done?
1 - Files exist and show files:
/tmp/my/file1.txt
/tmp/my/file2.txt
2 - The folder /tmp/my contains files and folders which are not in your list. The files and folders:
/tmp/my/test
/tmp/my/1.txt
You speak of files and folders, which seems unclear.
Anyways, I wanted to try it with arrays, so here we go :
unset valid_paths; declare -a valid_paths
unset invalid_paths; declare -a invalid_paths
while read -r line
do
if [ -e "$line" ]
then
valid_paths+=("$line")
else
invalid_paths+=("$line")
fi
done < files.txt
echo "VALID PATHS:"; printf '%s\n' "${valid_paths[@]}"
echo "INVALID PATHS:"; printf '%s\n' "${invalid_paths[@]}"
You can check for the files' existence (assuming a list of files, one filename per line) and print the existing ones with a prefix using this
# Part 1 - check list contents for files
while IFS= read -r thefile; do
if [[ -n "$thefile" ]] && [[ -f "/tmp/my/$thefile" ]]; then
echo "Y: $thefile"
else
echo "N: $thefile"
fi
done < filelist.txt | sort
# Part 2 - check existing files against list
for filepath in /tmp/my/* ; do
filename="$(basename "$filepath")"
grep -qxF -- "$filename" filelist.txt || echo "U: $filename" # -x whole line, -F literal text
done
The files that exist are prefixed here with Y:, all others are prefixed with N:
In the second section, files in the tmp directory that are not in the file list are labelled with U: (unaccounted for/unexpected)
You can swap the -f test which checks that a path exists and is a regular file for -d (exists and is a directory) or -e (exists)
See
man test
for more options.
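If the list holds full paths, one per line, as in the question's example, both messages can come from one small function. This is only a sketch under that assumption; report_paths is a hypothetical name, and hidden files in the folder are not reported because the * glob skips them.

```shell
#!/bin/bash
# report_paths LIST DIR  (hypothetical helper name)
# LIST holds one full path per line; DIR is the folder to check.
report_paths() {
    local list=$1 dir=${2%/} path
    echo "Files exist:"
    while IFS= read -r path; do
        [ -e "$path" ] && printf '%s\n' "$path"
    done < "$list"
    echo "Not in the list:"
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # empty folder: the glob matched nothing
        # -x matches whole lines only, -F treats the path as literal text.
        grep -qxF -- "$path" "$list" || printf '%s\n' "$path"
    done
}
```

For the question's layout it would be called as report_paths filelist.txt /tmp/my.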

Removing old directories with logs

My IM client stores its logs in directories named after each contact. I have created a file listing the active contacts. My problem is the following:
I would like to create a bash script that reads the active contact names from the file and compares them with the directories. If a directory name is not found in the list, the directory should be moved to another directory (let's call it "archive"). Here is an illustration.
content of the list:
contact1
contact2
content of the dir
contact1
contact2
contact3
contact4
after running the script, the content of the dir:
contact1
contact2
contact3 ==> ../archive
contact4 ==> ../archive
You could use something like this:
mv $(ls | grep -v -x -F -f ../file.txt) ../archive
Where ../file.txt contains the names of the directories that should not be moved. It is assumed here that the current directory only contains directories, if that is not the case, ls should be replaced with something else. Note that the command fails if there are no directories that should be moved.
Since in the comments to the other answer you state that directories with whitespace in the name can occur, you could replace this by:
for i in *
do
echo "$i" | grep -v -x -q -F -f ../file.txt && mv "$i" ../archive
done
This is an improved version of marcog's answer. Note that the associative array requires Bash 4.
#!/bin/bash
sourcedir=/path/to/foo
destdir=/path/to/archive
contactfile=/path/to/list
declare -A contacts
while read -r contact
do
contacts[$contact]=1
done < "$contactfile"
for contact in "$sourcedir"/*
do
if [[ -f $contact ]]
then
index=${contact##*/}
if [[ ! ${contacts[$index]} ]]
then
mv "$contact" "$destdir"
fi
fi
done
Edit:
If you're moving directories instead of files, then change the for loop above to look like this:
for contact in "$sourcedir"/*/
do
index=${contact/%\/}
index=${index##*/}
if [[ ! ${contacts[$index]} ]]
then
mv "$contact" "$destdir"
fi
done
There might be a more concise solution, but this works. I'd strongly recommend prefixing the mv with echo to test it out first, otherwise you could end up with a serious mess if it doesn't do what you want.
declare -A contacts
for contact in "$@"
do
contacts[$contact]=1
done
ls a | while read contact
do
if [[ ! ${contacts[$contact]} ]]
then
mv "a/$contact" ../archive
fi
done
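The advice to try the move with echo first can be built into the script itself. The sketch below is one way to do that; archive_inactive and the DRY_RUN variable are names invented here, and it uses grep instead of an associative array so that it does not depend on Bash 4.

```shell
#!/bin/bash
# archive_inactive LIST SRC DEST  (hypothetical helper)
# Moves every directory in SRC whose name is not a line of LIST into DEST.
# With DRY_RUN=1 the mv commands are only printed and nothing is moved.
archive_inactive() {
    local list=$1 src=${2%/} dest=$3 path name
    for path in "$src"/*/; do
        [ -d "$path" ] || continue      # the glob matched nothing
        path=${path%/}                  # drop the trailing slash
        name=${path##*/}                # directory name without its parent path
        if ! grep -qxF -- "$name" "$list"; then
            if [ "${DRY_RUN:-0}" = 1 ]; then
                echo mv -- "$path" "$dest"
            else
                mv -- "$path" "$dest"
            fi
        fi
    done
}
```

Running DRY_RUN=1 archive_inactive list logs ../archive first shows what would be moved; the same call without DRY_RUN performs the move.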
