How to find the filenames changed between two Linux directories, judged by file contents, using md5sum followed by diff - bash

I have two Linux directories, dir1 and dir2, with some files in both. I want a list of the filenames that were added to or deleted from dir2 as compared to dir1. The files should be compared by their contents. I am new to bash scripting on Linux. Please help me.
Currently I am doing it like this:
find dir1 \( -iname '*.c' -o -iname '*.h' -o -iname '*.prm' \) | xargs -n1 md5sum > dir1.fingerprint.md5sum
find dir2 \( -iname '*.c' -o -iname '*.h' -o -iname '*.prm' \) | xargs -n1 md5sum > dir2.fingerprint.md5sum
cat dir1.fingerprint.md5sum | cut -d" " -f1 | sort -u > dir1.fingerprint
cat dir2.fingerprint.md5sum | cut -d" " -f1 | sort -u > dir2.fingerprint
diff -NrU 2 dir1.fingerprint dir2.fingerprint
The result is only a list of changed hashes, as shown below:
--- dir1.fingerprint 2013-03-08 11:57:24.421311354 +0530
+++ dir2.fingerprint 2013-03-08 11:57:34.901311856 +0530
@@ -1,3 +1,3 @@
-43551a78e0f5b0be4aec23fdab881e65
-4639647e4f86eb84987cd01df8245d14
4c9cc7c6332b4105197576f66d1efee7
+9f944e70cb20b275b2e9b4f0ee26141a
+d41d8cd98f00b204e9800998ecf8427e
I want the result to be the filenames of the files that were modified in or newly added to dir2. How can I get this? Please help me.

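Try this script with the arguments dir2 and dir1. It recurses through the first directory and prints DIFFERENT for each file whose counterpart in the second directory exists but has different contents: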
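#!/bin/sh
# Usage: script dir2 dir1
# Prints "DIFFERENT <path>" for every file under $1 whose counterpart
# under $2 exists but has different contents.
if [ -z "$1" ] || [ -z "$2" ]
then
    exit 0
fi
#echo "DIFF $1 $2"
if [ -f "$1" ]
then
    if [ -e "$2" ]
    then
        if ! diff "$1" "$2" >/dev/null
        then
            echo "DIFFERENT $1"
        fi
    fi
    exit 0
fi
# $1 is not a regular file: recurse into each entry it contains
for path in "$1"/*
do
    [ -e "$path" ] || continue   # empty directory: the glob stays unexpanded
    "$0" "$path" "$2/${path##*/}"
done
exit 0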
EDIT:
# Compare every regular file in $1 with every regular file in $2
for f in "$1"/*
do
    if [ -f "$f" ]
    then
        for g in "$2"/*
        do
            if [ -f "$g" ] && diff "$f" "$g" >/dev/null
            then
                echo "SAME CONTENT $f $g"
            fi
        done
    fi
done
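For the md5sum approach from the question, one option is to keep the path next to each hash instead of cutting it away, so that the comparison can report names. The sketch below is one way to do that, not the method of the answer above; the function names fingerprint and compare_dirs are made up for illustration, and it assumes GNU md5sum and file names without newlines.

```bash
#!/bin/bash
# Sketch: report files ADDED, DELETED or MODIFIED in dir2 relative to dir1,
# judged by MD5 checksum. fingerprint and compare_dirs are illustrative names.

# Print "hash  ./relative/path" for every .c/.h/.prm file under directory $1.
fingerprint() {
    ( cd "$1" && find . -type f \( -iname '*.c' -o -iname '*.h' -o -iname '*.prm' \) \
        -exec md5sum {} + )
}

# Compare the fingerprints of directory $1 (old) and directory $2 (new).
compare_dirs() {
    local old new
    old=$(mktemp) || return 1
    new=$(mktemp) || return 1
    fingerprint "$1" > "$old"
    fingerprint "$2" > "$new"
    awk '
        # Remember the hash, then strip it so that $0 is only the path.
        { h = $1; sub(/^[^ ]+ [ *]/, "") }
        FILENAME == ARGV[1] { oldhash[$0] = h; next }
        !($0 in oldhash)    { print "ADDED " $0 }
        ($0 in oldhash) && oldhash[$0] != h { print "MODIFIED " $0 }
        { seen[$0] = 1 }
        END { for (f in oldhash) if (!(f in seen)) print "DELETED " f }
    ' "$old" "$new" | sort
    rm -f "$old" "$new"
}
```

Calling compare_dirs dir1 dir2 prints one line per changed file, each prefixed with ADDED, DELETED or MODIFIED; unchanged files are not listed.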

Related

searching for file unix script

My script is shown below.
It takes a directory and prints information about it; however, I am having trouble handling the exceptional cases.
if [ -d "$1" ];
then
directories=$(find "$1" -type d | wc -l)
files=$(find "$1" -type f | wc -l)
sym=$(find "$1" -type l | wc -l)
printf "%s %'d\n" "Directories" $directories
printf "%s %'d\n" "Files" $files
printf "%s %'d\n" "Sym links" $sym
exit 0
else
echo "Must provide one argument"
exit 1
fi
How do I make it so that if I pass a file, it tells me that a directory needs to be given? I'm stuck on it; I've tried test commands but I don't know what to do.
You're missing your shebang in the first line of your script:
#!/bin/bash
I get correct results from your script if I add it:
Directories 1,991
Files 13,363
Sym links 0
You may also have to make the script executable with chmod +x scriptname.sh.
Entire script looks like this:
#!/bin/bash
if [ -z "$1" ];
then
echo "Please provide at least one argument!"
exit 1
elif [ -d "$1" ];
then
directories=$(find "$1" -type d | wc -l)
files=$(find "$1" -type f | wc -l)
sym=$(find "$1" -type l | wc -l)
printf "%s %'d\n" "Directories" $directories
printf "%s %'d\n" "Files" $files
printf "%s %'d\n" "Sym links" $sym
exit 0
else
echo "$1 is not a directory"
exit 1
fi
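The final else branch above is also reached for a path that does not exist at all, so one way to give a more precise message is to test existence separately from the directory check. A minimal sketch; the function name dir_stats and the wording of the messages are made up for illustration.

```bash
#!/bin/bash
# Sketch: separate "no argument", "does not exist" and "not a directory".
dir_stats() {
    if [ "$#" -ne 1 ]; then
        echo "Must provide exactly one argument" >&2
        return 1
    elif [ ! -e "$1" ]; then
        echo "No such file or directory: $1" >&2
        return 1
    elif [ ! -d "$1" ]; then
        echo "A directory is required, but '$1' is not a directory" >&2
        return 1
    fi
    # Count entries of each type; find includes the starting directory itself.
    printf '%s %d\n' "Directories" "$(find "$1" -type d | wc -l)"
    printf '%s %d\n' "Files" "$(find "$1" -type f | wc -l)"
    printf '%s %d\n' "Sym links" "$(find "$1" -type l | wc -l)"
}
```

Error messages go to standard error and the function returns 1, so a caller can tell a failed check from a successful count.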

Looping through each file in directory - bash

I'm trying to perform a certain operation on each file in a directory, but there is a problem with the order in which it goes through them. It should process one file at a time. The long line (unzipping, grepping, zipping) works fine on a single file outside the script, so the problem is in the loop. Any ideas?
The script should grep through each zipped file and look for word1 or word2. If at least one of them exists, then:
unzip file
grep word1 and word2 and save it to file_done
remove unzipped file
zip file_done to /donefiles/ with original name
remove file_done from original directory
#!/bin/bash
for file in *.gz; do
counter=$(zgrep -c 'word1\|word2' $file)
if [[ $counter -gt 0 ]]; then
echo $counter
for file in *.gz; do
filenoext=${file::-3}
filedone=${filenoext}_done
echo $file
echo $filenoext
echo $filedone
gunzip $file | grep 'word1\|word2' $filenoext > $filedone | rm -f $filenoext | gzip -f -c $filedone > /donefiles/$file | rm -f $filedone
done
else
echo "nothing to do here"
fi
done
The code snippet you've provided has a few problems, e.g. an unneeded nested for loop and an erroneous pipeline
(the whole line gunzip $file | grep 'word1\|word2' $filenoext > $filedone | rm -f $filenoext | gzip...): the stages of a pipeline run concurrently, so rm and gzip do not wait for grep to finish.
Note also that your code will work correctly only if the *.gz file names contain no spaces (or special characters).
Also, zgrep -c 'word1\|word2' will match strings like line_starts_withword1_orword2_ as well.
Here is the working version of the script:
#!/bin/bash
for file in *.gz; do
counter=$(zgrep -c -E 'word1|word2' $file) # counter is the number of lines in $file that contain word1 or word2
if [[ $counter -gt 0 ]]; then
name=$(basename $file .gz)
zcat $file | grep -E 'word1|word2' > ${name}_done
gzip -f -c ${name}_done > /donefiles/$file
rm -f ${name}_done
else
echo 'nothing to do here'
fi
done
What we can improve here:
since we are unzipping the file anyway to check for word1|word2, we can write the matches to a temp file and avoid unzipping twice
we don't need to count how many lines contain word1 or word2; we can just check whether any do
${name}_done can be a temp file that is cleaned up automatically
we can use a while loop to handle file names with spaces
#!/bin/bash
tmp=`mktemp /tmp/gzip_demo.XXXXXX` # create temp file for us
trap "rm -f \"$tmp\"" EXIT INT TERM QUIT HUP # clean $tmp upon exit or termination
find . -maxdepth 1 -mindepth 1 -type f -name '*.gz' | while IFS= read -r f; do
# quotes around $f are now required in case of spaces in it
s=$(basename "$f") # short name w/o dir
gunzip -f -c "$f" | grep -P '\b(word1|word2)\b' > "$tmp"
[ -s "$tmp" ] && gzip -f -c "$tmp" > "/donefiles/$s" # create archive if anything is found
done
It looks like you have an inner loop inside the outer one :
#!/bin/bash
for file in *.gz; do
counter=$(zgrep -c 'word1\|word2' $file)
if [[ $counter -gt 0 ]]; then
echo $counter
for file in *.gz; do #<<< HERE
filenoext=${file::-3}
filedone=${filenoext}_done
echo $file
echo $filenoext
echo $filedone
gunzip $file | grep 'word1\|word2' $filenoext > $filedone | rm -f $filenoext | gzip -f -c $filedone > /donefiles/$file | rm -f $filedone
done
else
echo "nothing to do here"
fi
done
The inner loop goes through all the files in the directory whenever one of them contains word1 or word2. You probably want this:
#!/bin/bash
for file in *.gz; do
counter=$(zgrep -c 'word1\|word2' $file)
if [[ $counter -gt 0 ]]; then
echo $counter
filenoext=${file::-3}
filedone=${filenoext}_done
echo $file
echo $filenoext
echo $filedone
gunzip $file | grep 'word1\|word2' $filenoext > $filedone | rm -f $filenoext | gzip -f -c $filedone > /donefiles/$file | rm -f $filedone
else
echo "nothing to do here"
fi
done
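A sketch of the same job with every expansion quoted, so that a plain for loop copes with spaces in file names, and with the pipe used only where each stage consumes the previous stage's output. The function name filter_gz_dir and its three parameters are hypothetical; /donefiles from the question becomes the second argument.

```bash
#!/bin/bash
# Sketch: for each .gz file in $1, keep only the lines matching the extended
# regex $3 and write them, gzipped, to directory $2 under the original name.
filter_gz_dir() {
    local src=$1 dest=$2 pattern=$3 file name
    for file in "$src"/*.gz; do
        [ -e "$file" ] || continue          # the glob matched nothing
        name=$(basename "$file")
        if gunzip -c "$file" | grep -E "$pattern" >/dev/null; then
            # A pipe fits here: each stage reads what the previous one writes.
            gunzip -c "$file" | grep -E "$pattern" | gzip -c > "$dest/$name"
        else
            echo "nothing to do for $file"
        fi
    done
}
```

For example, filter_gz_dir . /donefiles 'word1|word2' processes the current directory. Nothing is unpacked to disk, so there is no intermediate file to remove.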

Unix Bash - Copy files from a source folder recursively to destination/*file_extension*(ex. “txt”) folder

This is my code, something in the rec_copy() function isn't working properly, probably this line:
cp $1/$f $HOME/$2/$dest
The extension named folders are created in the destination folder but the files are not copied there. Can you help me?
#!/bin/bash
if [ $# -ne 2 ]
then
echo "Usage: $0 <source> <destination>"
exit
fi
if [ ! -d $1 ]
then
echo "Source folder does not exist"
exit
fi
if [ -d $2 ]
then
rm -r $2
mkdir $2
else
mkdir $2
fi
extension=`ls -l $1 | grep -v "^d" | awk '{ print $10; }' | sed 's/^.*\.//g'`
for f in $extension
do
if [ ! -d $1/$f ]
then
mkdir $2/$f
fi
done
rec_copy(){
folder=`ls $1`
for f in $folder
do
dest=`echo "$f" | sed 's/.*\.//g'`
if [ -f $1/$f ]
then
cp $1/$f $HOME/$2/$dest
elif [ -d $1/$f ]
then
rec_copy $1/$f
fi
done
}
rec_copy $1
Here is the answer in case someone ever needs it:
#!/bin/bash
if [ $# -ne 2 ]
then
echo "Usage: $0 <source> <destination>"
exit
fi
if [ ! -d "$1" ]
then
echo "Source folder does not exist"
exit
fi
if [ -d "$2" ]
then
rm -r "$2"
mkdir "$2"
else
mkdir "$2"
fi
extension=`ls -l "$1" | grep -v "^d" | awk '{ print $10; }' | sed 's/^.*\.//g' | sort -u`
for f in $extension
do
if [ ! -d "$1/$f" ]
then
mkdir "$2/$f"
fi
done
rec_copy(){
    folder=`ls "$1"`
    for f in $folder
    do
        if [ -f "$1/$f" ]
        then
            # copy regular files only, into the folder named after the extension
            dest=`echo "$f" | sed 's/.*\.//g'`
            mkdir -p "$2/$dest"
            cp "$1/$f" "$2/$dest"
        elif [ -d "$1/$f" ]
        then
            rec_copy "$1/$f" "$2"
        fi
    done
}
rec_copy "./$1" "./$2"
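An alternative sketch of the same idea lets find do the recursion and takes the extension with the parameter expansion ${name##*.}, which avoids parsing the output of ls. The name copy_by_extension and the noext folder for files without a dot are assumptions made for this example; it also assumes that the destination lies outside the source tree and that file names contain no newlines.

```bash
#!/bin/bash
# Sketch: copy every regular file under $1 into $2/<extension>/.
copy_by_extension() {
    local src=$1 dest=$2 path name ext
    find "$src" -type f | while IFS= read -r path; do
        name=${path##*/}                    # file name without directories
        case $name in
            *.*) ext=${name##*.} ;;         # text after the last dot
            *)   ext=noext ;;               # hypothetical bucket: no extension
        esac
        mkdir -p "$dest/$ext" && cp "$path" "$dest/$ext/"
    done
}
```

Files with the same name in different source folders end up at the same destination path, so the one copied last wins.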

Bash script loop through subdirectories and write to file

I have spent many hours on this problem and have run out of ideas. I need to write a script that loops recursively through the subdirectories of the current directory and checks the file count in each one. If the file count is greater than 10, it should write the names of those files to a file named "BigList"; otherwise it should write them to a file named "ShortList". The output should look like this:
---<directory name>
<filename>
<filename>
<filename>
<filename>
....
---<directory name>
<filename>
<filename>
<filename>
<filename>
....
My script only works if the subdirectories do not themselves contain subdirectories.
I am confused, because it doesn't work as I expect. It would take me less than 5 minutes to write this in any other programming language.
Please help me solve this problem, because I have no idea how to do it.
Here is my script:
#!/bin/bash
parent_dir=""
if [ -d "$1" ]; then
path=$1;
else
path=$(pwd)
fi
parent_dir=$path
loop_folder_recurse() {
local files_list=""
local cnt=0
for i in "$1"/*;do
if [ -d "$i" ];then
echo "dir: $i"
parent_dir=$i
echo before recursion
loop_folder_recurse "$i"
echo after recursion
if [ $cnt -ge 10 ]; then
echo -e "---"$parent_dir >> BigList
echo -e $file_list >> BigList
else
echo -e "---"$parent_dir >> ShortList
echo -e $file_list >> ShortList
fi
elif [ -f "$i" ]; then
echo file $i
if [ $cur_fol != $main_pwd ]; then
file_list+=$i'\n'
cnt=$((cnt + 1))
fi
fi
done
}
echo "Base path: $path"
loop_folder_recurse $path
I believe that this does what you want:
find . -type d -exec env d={} bash -c 'out=ShortList; [ $(ls "$d" | wc -l) -ge 10 ] && out=BigList; { echo "---$d"; ls "$d"; echo; } >>"$out"' ';'
If subdirectories should neither count toward the cut-off nor be listed in the output, use this version:
find . -type d -exec env d={} bash -c 'out=ShortList; [ $(ls -p "$d" | grep -v "/$" | wc -l) -ge 10 ] && out=BigList; { echo "---$d"; ls -p "$d"; echo; } | grep -v "/$" >>"$out"' ';'
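The same traversal can be sketched as a function, which is easier to read than a one-liner. The name split_lists, the separate output directory and the limit parameter are assumptions made for this example. It treats "greater than 10" literally (the posted code uses -ge 10), counts only regular files, and skips hidden files because the glob does not match them.

```bash
#!/bin/bash
# Sketch: append each directory under $1 and the names of its regular files
# to $2/BigList (more than $3 files, default 10) or to $2/ShortList.
split_lists() {
    local root=$1 outdir=$2 limit=${3:-10} dir f out
    local -a names
    find "$root" -type d | sort | while IFS= read -r dir; do
        names=()
        for f in "$dir"/*; do
            [ -f "$f" ] && names+=( "${f##*/}" )   # keep only the file name
        done
        out=ShortList
        [ "${#names[@]}" -gt "$limit" ] && out=BigList
        {
            echo "---$dir"
            [ "${#names[@]}" -gt 0 ] && printf '%s\n' "${names[@]}"
        } >> "$outdir/$out"
    done
}
```

The output directory should lie outside the tree being scanned, otherwise BigList and ShortList would be counted as files themselves.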

Bash capturing output of find with exclusion into the While loop

I made a script that recursively lists the contents of a folder, excluding some paths, and then asks which action to take on each line of the results.
The find command on its own works fine and excludes the paths as expected. It looks something like this:
$SOURCE="FOLDER/"
$EXCLUDESTRING="! -path \"FOLDER/*/.svn/*\" ! -path \"FOLDER/uploads/*\" ! -path \"FOLDER/ai-cache/*\""
find "$SOURCE"* $EXCLUDESTRING # uploads and ai-cache folders are not included in the results
But when I pipe the result to the while loop, the exclusions are not applied.
find "$SOURCE"* $EXCLUDESTRING -print0 | while read -d $'\0' file_1
do
echo $file_1 # uploads and ai-cache folders are included in the results
if statement ...
more commands ...
done
I want to mention that the goal is to find the desired files and folders and process them instantaneously without using an array.
UPDATE
For those who are interested in my script (Step by step unidirectional sync) or could test (it will be very appreciated) Here is a more detailed copy:
#!/bin/bash
excludepath=( "*/.svn/*" "uploads/*" "design/*" "ai-cache/*" )
bold=`tput bold`
normal=`tput sgr0`
validsource="false"
while [ "$validsource" == "false" ]
do
echo ""
echo "Specify project to compare :"
echo -n "/home/myaccount/public_html/projects/"
read -e project
project=`echo "$project" | sed -e "s/\/*$//" `
projectpath="/home/myaccount/public_html/projects/$project"
source="$(readlink -f $projectpath)/"
if [ -d "$source" ];then
validsource="true"
else
echo "The working copy cannot be found ($projectpath)."
fi
done
echo "Compare project with folder :"
read -e target
excludestring=""
for i in "${excludepath[@]}"
do
excludestring="$excludestring ! -path \"$source$i\""
done
echo ""
echo "______________________________________________"
echo ""
echo "COMPARISON IN PROGRESS ..."
echo "______________________________________________"
echo ""
echo "List of paths excluded from the comparison: ${excludepath[@]}"
echo "Executed command : find \"$source\"* $excludestring"
echo ""
liveexclude=()
find "$source"* $excludestring -print0 | while read -d $'\0' file_1
do
file=$( echo "$file_1" | sed "s,$source,,g" ) # Strip base path
file_2=$( echo "$file_1" | sed "s,$source,$target,g" ) # Getting file path in $target
dir=$( dirname "$file_2" | sed "s,$target,,g" )
dir_1=$( dirname "$file_1" )
dir_2=$( dirname "$file_2" )
#Check for live excluded folders
process="true"
for i in "${liveexclude[@]}"
do
if [[ $file_1 == "$i"* ]]
then
process="false"
break
fi
done
if [ "$process" == "true" ];then
if [ -d "$file_1" ];then
if [ ! -d "$file_2" ] # Checking if sub-dir exists in $target
then
while [ "$confirm" != "y" ] && [ "$confirm" != "n" ]
do
echo ""
echo "${bold}Folder${normal} \"$file\" doesn't exist."
echo -n "Would you like to ${bold}create it and its entire contents${normal} ? (y/n) "
read -e confirm </dev/tty
done
if [ "$confirm" == "y" ];then
mkdir -p $file_2 # Creating if sub-dir missing
cp -r "$file_1/"* "$file_2"
fi
confirm=""
liveexclude+=("$file_2")
fi
else
if [ -f "$file_1" ];then
if [ -f "$file_2" ] # Checking if file exists in $target
then
cksum_file_1=$( cksum "$file_1" | cut -f 1 -d " " ) # Get cksum of file in $source
cksum_file_2=$( cksum "$file_2" | cut -f 1 -d " " ) # Get cksum of file in $target
if [ $cksum_file_1 -ne $cksum_file_2 ] # Check if cksum matches
then
while [ "$confirm" != "y" ] && [ "$confirm" != "n" ]
do
if [ "$file_1" -nt "$file_2" ]
then
echo ""
echo "${bold}File${normal} \"$file\" is not updated."
echo -n "Would you like to ${bold}replace${normal} it ? (y/n) "
else
echo ""
echo "${bold}File${normal} \"$file\" was modified."
echo "${bold}CAUTION${normal}: The file \"$file_2\" is newer than the file \"$file_1\""
echo -n "Would you still ${bold}overwrite${normal} it ? (y/n) "
fi
read -e confirm </dev/tty
done
if [ "$confirm" == "y" ];then
cp "$file_1" "$file_2" # Copy if cksum mismatch
fi
confirm=""
fi
else
while [ "$confirm" != "y" ] && [ "$confirm" != "n" ]
do
echo ""
echo "${bold}File${normal} \"$file\" doesn't exist."
echo -n "Would you like to ${bold}copy${normal} it ? (y/n) "
read -e confirm </dev/tty
done
if [ "$confirm" == "y" ];then
cp "$file_1" "$file_2" # Copy if file does not exist.
fi
confirm=""
fi
fi
fi
fi
done
PS. We use this script for applying new changes on an existing project if a detailed check is required.
Don't put your commands in a string, but in an array. And don't use a dollar in the left-hand side of an assignment (we're not in Perl/PHP). Oh, and avoid using upper case variable names. It looks ugly; it seems you're shouting the variable's name; but more seriously it might clash with reserved names (like PATH, LINES, GROUPS, USERS, etc.); if you stick to lower case variable names, you're on the safe side (and it's prettier!).
source=FOLDER/
excludeary=( \! -path "FOLDER/*/.svn/*" \! -path "FOLDER/uploads/*" \! -path "FOLDER/ai-cache/*" )
find "$source" "${excludeary[@]}" -print0 | while IFS= read -r -d '' file_1
do
echo "$file_1" # the contents of the uploads and ai-cache folders are now excluded
if statement ...
more commands ...
done
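The difference between the two forms can be seen by printing the words that find would receive. A small sketch; show_args and the folder name "my uploads" are made up for the demonstration.

```bash
#!/bin/bash
# Sketch: why a command fragment stored in a string differs from an array.
show_args() { printf '<%s>' "$@"; echo; }

excludestring='! -path "FOLDER/my uploads/*"'
excludeary=( ! -path "FOLDER/my uploads/*" )

set -f                        # no globbing, so only word splitting is shown
show_args $excludestring      # prints <!><-path><"FOLDER/my><uploads/*">
set +f
show_args "${excludeary[@]}"  # prints <!><-path><FOLDER/my uploads/*>
```

In the string version the double quotes are ordinary characters that become part of the pattern, and the space still splits it in two; the array keeps each argument exactly as it was typed.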
Edit. Here's a small example:
$ mkdir Test
$ cd Test
$ mkdir -p {excl,incl}/{1,2}
$ touch {excl,incl}/{1,2}/{a,b}
$ tree
.
|-- excl
| |-- 1
| | |-- a
| | `-- b
| `-- 2
| |-- a
| `-- b
`-- incl
|-- 1
| |-- a
| `-- b
`-- 2
|-- a
`-- b
6 directories, 8 files
$ source=~/Test
$ excludeary=( \! -path "$source/excl/*" )
$ find "$source" "${excludeary[@]}"
/home/gniourf/Test
/home/gniourf/Test/excl
/home/gniourf/Test/incl
/home/gniourf/Test/incl/1
/home/gniourf/Test/incl/1/a
/home/gniourf/Test/incl/1/b
/home/gniourf/Test/incl/2
/home/gniourf/Test/incl/2/a
/home/gniourf/Test/incl/2/b
That's how ! -path works. See, you still have the /home/gniourf/Test/excl folder (but not its children). Maybe you want -prune instead:
$ pruneary=( \! \( -type d -name excl -prune \) )
$ find "$source" "${pruneary[@]}"
/home/gniourf/Test
/home/gniourf/Test/incl
/home/gniourf/Test/incl/1
/home/gniourf/Test/incl/1/a
/home/gniourf/Test/incl/1/b
/home/gniourf/Test/incl/2
/home/gniourf/Test/incl/2/a
/home/gniourf/Test/incl/2/b
Or to exclude all the 1 directories together with the excl directory:
$ excludeary=( \! \( -type d \( -name excl -o -path '*/1' \) -prune \) )
$ find "$source" "${excludeary[@]}"
/home/gniourf/Test
/home/gniourf/Test/incl
/home/gniourf/Test/incl/2
/home/gniourf/Test/incl/2/a
/home/gniourf/Test/incl/2/b
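When the exclusions start out as a list of patterns, as with excludepath in the question's full script, the tests for find can be collected in an array inside a loop. A minimal sketch using ! -path as in the first example; the function name find_excluding is hypothetical, and the patterns are taken to be relative to the source folder.

```bash
#!/bin/bash
# Sketch: build "! -path ..." tests from a list of patterns relative to $1
# and print the remaining paths NUL-terminated.
find_excluding() {
    local source=$1 pattern
    local -a tests=()
    shift
    for pattern in "$@"; do
        tests+=( ! -path "$source/$pattern" )   # three separate words per pattern
    done
    find "$source" "${tests[@]}" -print0
}

# Possible use with the loop from the question:
# find_excluding ~/Test 'excl/*' | while IFS= read -r -d '' file_1; do
#     echo "$file_1"
# done
```

As with the hand-written array, the excluded folder itself is still printed and only the paths below it are dropped.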
All of the necessary exclusions worked for me when I removed all of the double quotes and put everything in single quotes:
EXCLUDESTRING='! -path FOLDER/*/.svn/* ! -path FOLDER/uploads/* ! -path FOLDER/ai-cache/*'
