find and gzip a directory recursively without a directory/file test - bash

I'm working on improving our bash backup script, and would like to move away from rsync and towards using gzip and a "find since last run timestamp" system. I would like to have a mirror of the original tree, except have each destination file gzipped. However, if I pass a destination path to gzip that does not exist, it complains. I created the test below, but I can't believe that this is the most efficient solution. Am I going about this wrong?
Also, I'm not crazy about using while read either, but I can't get the right variable expansion with the alternatives I've tried, such as a for file in 'find' do.
Centos 6.x. Relevant snip below, simplified for focus:
cd /mnt/${sourceboxname}/${drive}/ && eval find . -newer timestamp | while read objresults;
do
if [[ -d "${objresults}" ]]
then
mkdir -p /backup/${sourceboxname}/${drive}${objresults}
else
cat /mnt/${sourceboxname}/${drive}/"${objresults}" | gzip -fc > /backup/${sourceboxname}/${drive}"${objresults}".gz
fi
done
touch timestamp #if no stderr

With proposed changes from my comments incorporated, I suggest this code:
#!/bin/bash
src="/mnt/$sourceboxname/$drive"
dst="/backup/$sourceboxname/$drive"
timestamp="$src/timestamp"
errors=$({ cd "$src" && find . -newer "$timestamp" | while IFS= read -r objresults;
do
# create the parent directory of the destination file (dirname, not basename)
mkdir -p "$(dirname "$dst/$objresults")"
[[ -d "$objresults" ]] || gzip -fc < "$objresults" > "$dst/$objresults.gz"
done; } 2>&1)
if [[ -z "$errors" ]]
then
touch "$timestamp"
else
echo "$errors" >&2
exit 1
fi
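A null-delimited variant of the loop above could look roughly like this. It is only a sketch, assuming GNU find and bash; the function name mirror_gzip and its three arguments are made up for illustration and are not part of the original script. The point is that -print0 together with read -d '' keeps file names with spaces or newlines intact.

```shell
#!/bin/bash
# Sketch only: mirror_gzip and its arguments are illustrative names.
# Usage: mirror_gzip SRC DST STAMP
# STAMP should be an absolute path (a relative one is resolved against SRC).
mirror_gzip() {
    local src=$1 dst=$2 stamp=$3 rel
    # The subshell keeps the cd from leaking into the caller.
    (
        cd "$src" || exit 1
        # Regular files newer than the stamp, NUL-delimited so any name is safe.
        find . -type f -newer "$stamp" -print0 |
        while IFS= read -r -d '' rel; do
            rel=${rel#./}
            # Create the destination's parent directory, then compress.
            mkdir -p -- "$(dirname -- "$dst/$rel")" || exit 1
            gzip -c < "$rel" > "$dst/$rel.gz" || exit 1
        done
    )
}
```

Because only regular files are selected, no directory test is needed inside the loop; directories are created on demand from each file's path.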

Related

Send files to folders using bash script

I want to copy the functionality of a windows program called files2folder, which basically lets you right-click a bunch of files and send them to their own individual folders.
So
1.mkv 2.png 3.doc
gets put into directories called
1 2 3
I have got it to work using this script but it throws out errors sometimes while still accomplishing what I want
#!/bin/bash
ls > list.txt
sed -i '/list.txt/d' ./list.txt
sed 's/.$//;s/.$//;s/.$//;s/.$//' ./list.txt > list2.txt
for i in $(cat list2.txt); do
mkdir $i
mv $i.* ./$i
done
rm *.txt
is there a better way of doing this? Thanks
EDIT: My script failed with real-world filenames because they contain more than one dot, so I had to use a different sed command to make it work. This is an example of a filename I'm working with:
Captain.America.The.First.Avenger.2011.INTERNAL.2160p.UHD.BluRay.X265-IAMABLE
I guess you are getting errors on . and .. so change your call to ls to:
ls -A > list.txt
-A: List all entries except for . and .. (always set for the super-user).
You don't have to create a file to achieve the same result, just assign the output of your ls command to a variable. Doing something like this:
files=`ls -A`
for file in $files; do
echo $file
done
You can also check if the resource is a file or directory like this:
files=`ls -A`
for res in $files; do
if [[ -d $res ]];
then
echo "$res is a folder"
fi
done
This script will do what you ask for:
files2folder:
#!/usr/bin/env sh
for file; do
dir="${file%.*}"
{ ! [ -f "$file" ] || [ "$file" = "$dir" ]; } && continue
echo mkdir -p -- "$dir"
echo mv -n -- "$file" "$dir/"
done
Example directory/files structure:
ls -1 dir/*.jar
dir/paper-279.jar
dir/paper.jar
Running the script above:
chmod +x ./files2folder
./files2folder dir/*.jar
Output:
mkdir -p -- dir/paper-279
mv -n -- dir/paper-279.jar dir/paper-279/
mkdir -p -- dir/paper
mv -n -- dir/paper.jar dir/paper/
To make it actually create the directories and move the files, remove the echo in front of mkdir and mv.
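The ${file%.*} expansion used in the script removes only the last extension, which is why names with several dots are handled. A small sketch of that behaviour follows; stem_of is a hypothetical helper name, and the .mkv suffix on the long example is an assumption, since the question shows the name without its extension.

```shell
#!/bin/bash
# Sketch: stem_of is a hypothetical helper, not part of files2folder.
# "%.*" removes the shortest suffix matching ".*", i.e. only the last
# extension, so earlier dots in the name are kept.
stem_of() {
    printf '%s\n' "${1%.*}"
}

stem_of 'paper-279.jar'
# prints: paper-279
stem_of 'Captain.America.The.First.Avenger.2011.INTERNAL.2160p.UHD.BluRay.X265-IAMABLE.mkv'
# prints: Captain.America.The.First.Avenger.2011.INTERNAL.2160p.UHD.BluRay.X265-IAMABLE
```

A name without any dot comes back unchanged, which is the case the script skips with its "$file" = "$dir" test.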

bash move is failing

I am running below commands in a script
move_jobs() {
cd $JOB_DIR
for i in `cat $JOBS_FILE`
do
if [ `ls | grep -i ^${i}- | wc -l` -gt 0 ]; then
cd $i
if [ ! -d jobs ]; then
mkdir jobs && cd .. && mv "${i}"-* "${i}"/jobs/
else
cd .. && mv "${i}"-* "${i}"/jobs/
fi
error_handler $?
fi
done
}
but it is failing with
mv: cannot stat `folder-*': No such file or directory
I'm not sure why the move command is failing with the wildcard (glob) pattern.
Your script is overly complicated and has several issues, any one of which could be the problem. My guess is the ls | grep ... part, but to confirm that you should add some debug logging.
for i in $(cat ...) loops through words, not lines.
Do not parse ls
And if you still do, do not ever grep for filenames; put the pattern in the ls call instead: ls "${i}"-* | wc -l.
You do not need to check whether a folder exists when the only difference is that you create it if it is missing. Use mkdir -p instead.
Jumping around folders in your script makes it almost unreadable, as you need to keep track of all cd commands when reading your script.
You could simply write the following, which I think will do what you want:
xargs -a "$JOBS_FILE" -I{} \
sh -c "
mkdir -p '$JOB_DIR/{}/jobs';
mv '$JOB_DIR/{}-'* '$JOB_DIR/{}/jobs';
"
or if you need more control:
while IFS= read -r jid; do
if ls "$JOB_DIR/$jid-"* &>/dev/null; then
TARGET_DIR="$JOB_DIR/$jid/jobs"
mkdir -p "$TARGET_DIR"
mv "$JOB_DIR/$jid-"* "$TARGET_DIR"
echo "OK"
else
echo "No files to move."
fi
done < "$JOBS_FILE"
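To illustrate the earlier point that for i in $(cat ...) loops over words rather than lines, here is a small comparison. It is a sketch only; both function names are made up for the demonstration.

```shell
#!/bin/bash
# Sketch: count_words_loop and count_lines_loop are hypothetical names.

# Counts loop iterations when iterating over an unquoted $(cat FILE):
# the output is split on whitespace (and glob-expanded), so a line with
# a space in it produces several iterations.
count_words_loop() {
    local n=0 i
    for i in $(cat "$1"); do
        n=$((n + 1))
    done
    echo "$n"
}

# Counts loop iterations when reading the file line by line.
count_lines_loop() {
    local n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
    done < "$1"
    echo "$n"
}
```

For a file containing the two lines "job one" and "job two", the first function reports 4 iterations and the second reports 2.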

Perform an undo action using a script

I have an excellent interactive script which sorts and processes a variety of filetypes from an unsorted folder into newly created directories.
I was wondering how I could write a small script, or modify the existing script, so that I could undo the executed script and its sorting process and return to the pre-sort state if need be.
#!/bin/bash
read -p "Good Morning, Please enter your file type name for sorting [ENTER]:" all_extensions
if cd /Users/christopherdorman/desktop
then while read extension
do destination="folder$extension"
mkdir -p "$destination"
mv -v unsorted/*."$extension" "$destination"
done <<< "${all_extensions// /$'\n'}"
mkdir -p foldermisc
if mv -v unsorted/* "foldermisc"
then echo "Good News, the rest of Your files have been successfully processed"
fi
for i in folder*/; do
ls -S "$i" > "${i}filelist"
cat "${i}filelist" >> ~/desktop/summary.txt
done
fi
If you want to generate a script with an inverse action for each action you're performing, use printf %q to quote names in an eval-safe manner. For instance:
if [[ $undo_log ]]; then
# at the top of your script: open FD 3 as undo log
exec 3>"$undo_log"
fi
# later: if we're generating an undo log, write the inverse commands first,
# while the glob still matches the files that are about to be moved
if [[ $undo_log ]]; then
for f in unsorted/*."$extension"; do
[[ -e $f ]] || continue # the glob matched nothing
printf 'mv %q/%q %q\n' "$destination" "${f##*/}" "$f" >&3
done
fi
mv -v unsorted/*."$extension" "$destination"
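The same idea can be wrapped in a helper that moves each file and logs its inverse in one place. This is a sketch under assumptions: move_with_undo is a hypothetical name, the log is simply a file of mv commands, and paths are assumed to be relative to the directory the undo script is later run from.

```shell
#!/bin/bash
# Sketch: move_with_undo is a hypothetical helper. The undo log is a plain
# script of mv commands that can be run with bash to reverse the moves.
# Usage: move_with_undo UNDO_LOG DEST FILE...
move_with_undo() {
    local undo_log=$1 dest=$2 f
    shift 2
    mkdir -p -- "$dest" || return 1
    for f in "$@"; do
        [[ -e $f ]] || continue    # skip unmatched glob patterns
        # Move first, and log the inverse only if the move succeeded.
        mv -- "$f" "$dest/" || return 1
        # %q quotes each name so the logged line is safe to execute later.
        printf 'mv -- %q %q\n' "$dest/${f##*/}" "$f" >> "$undo_log"
    done
}
```

Usage might look like move_with_undo undo.sh foldertxt unsorted/*.txt, followed later by bash undo.sh from the same directory to put the files back.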

How to find latest modified files and delete them with SHELL code

I need some help with a shell code. Now I have this code:
find $dirname -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 33 | cut -c 35-
This code finds duplicated files (with same content) in a given directory. What I need to do is to update it - find out latest (by date) modified file (from duplicated files list), print that file name and also give opportunity to delete that file in terminal.
Doing this in pure bash is a tad awkward, it would be a lot easier to write
this in perl or python.
Also, if you were looking to do this with a bash one-liner, it might be feasible,
but I really don't know how.
Anyhoo, if you really want a pure bash solution below is an attempt at doing
what you describe.
Please note that:
I am not actually calling rm, just echoing it - don't want to destroy your files
There's a "read -u 1" in there that I'm not entirely happy with.
Here's the code:
#!/bin/bash
buffer=''
function process {
if test -n "$buffer"
then
nbFiles=$(printf "%s" "$buffer" | wc -l)
echo "================================================================================="
echo "The following $nbFiles files are byte identical and sorted from oldest to newest:"
ls -ltr $buffer
# no -l here, so that lastFile holds only the file name (sorted by modification time)
lastFile=$(ls -tr $buffer | tail -1)
echo
while true
do
read -u 1 -p "Do you wish to delete the last file $lastFile (y/n/q)? " answer
case $answer in
[Yy]* ) echo rm $lastFile; break;;
[Nn]* ) echo skipping; break;;
[Qq]* ) exit;;
* ) echo "please answer yes, no or quit";;
esac
done
echo
fi
}
find . -type f -exec md5sum '{}' ';' |
sort |
uniq --all-repeated=separate -w 33 |
cut -c 35- |
{
while read -r line
do
if test -z "$line"
then
process
buffer=''
else
buffer=$(printf "%s\n%s" "$buffer" "$line")
fi
done
# still inside the pipeline's subshell, so the last group's buffer is visible here
process
}
echo "done"
Here's a "naive" solution implemented in bash (except for two external commands: md5sum, of course, and stat used only for user's comfort, it's not part of the algorithm). The thing implements a 100% Bash quicksort (that I'm kind of proud of):
#!/bin/bash
# Finds similar (based on md5sum) files (recursively) in given
# directory. If several files with same md5sum are found, sort
# them by modified (most recent first) and prompt user for deletion
# of the oldest
die() {
printf >&2 '%s\n' "$@"
exit 1
}
quicksort_files_by_mod_date() {
if ((!$#)); then
qs_ret=()
return
fi
# the return array is qs_ret
local first=$1
shift
local newers=()
local olders=()
qs_ret=()
for i in "$@"; do
if [[ $i -nt $first ]]; then
newers+=( "$i" )
else
olders+=( "$i" )
fi
done
quicksort_files_by_mod_date "${newers[@]}"
newers=( "${qs_ret[@]}" )
quicksort_files_by_mod_date "${olders[@]}"
olders=( "${qs_ret[@]}" )
qs_ret=( "${newers[@]}" "$first" "${olders[@]}" )
}
[[ -n $1 ]] || die "Must give an argument"
[[ -d $1 ]] || die "Argument must be a directory"
dirname=$1
shopt -s nullglob
shopt -s globstar
declare -A files
declare -A hashes
for file in "$dirname"/**; do
[[ -f $file ]] || continue
read md5sum _ < <(md5sum -- "$file")
files[$file]=$md5sum
((hashes[$md5sum]+=1))
done
has_found=0
for hash in "${!hashes[@]}"; do
((hashes[$hash]>1)) || continue
files_with_same_md5sum=()
for file in "${!files[@]}"; do
[[ ${files[$file]} = $hash ]] || continue
files_with_same_md5sum+=( "$file" )
done
has_found=1
echo "Found ${hashes[$hash]} files with md5sum=$hash, sorted by modified (most recent first):"
# sort them by modified date (using quicksort :p)
quicksort_files_by_mod_date "${files_with_same_md5sum[@]}"
for file in "${qs_ret[@]}"; do
printf " %s %s\n" "$(stat --printf '%y' -- "$file")" "$file"
done
read -p "Do you want to remove the oldest? [yn] " answer
if [[ ${answer,,} = y ]]; then
echo rm -fv -- "${qs_ret[@]:1}"
fi
done
if((!has_found)); then
echo "Didn't find any similar files in directory \`$dirname'. Yay."
fi
I guess the script is self-explanatory (you can read it like a story). It uses the best practices I know of, and is 100% safe regarding any silly characters in file names (e.g., spaces, newlines, file names starting with hyphens, file names ending with a newline, etc.).
It uses bash's globs, so it might be a bit slow if you have a bloated directory tree.
There are a few error checks, but many are missing, so don't use as-is in production! (It's a trivial but rather tedious task to add these.)
The algorithm is as follows: scan each file in the given directory tree; for each file, compute its md5sum and store it in two associative arrays:
files, whose keys are the file names and whose values are the md5sums.
hashes, whose keys are the md5sums and whose values are the number of files with that md5sum.
After this is done, we'll scan through all the found md5sum, select only the ones that correspond to more than one file, then select all files with this md5sum, then quicksort them by modified date, and prompt the user.
A sweet effect when no dups are found: the script nicely informs the user about it.
I would not say it's the most efficient way of doing things (might be better in, e.g., Perl), but it's really a lot of fun, surprisingly easy to read and follow, and you can potentially learn a lot by studying it!
It uses a few bashisms and features that only are in bash version ≥ 4
Hope this helps!
Remark. If on your system date has the -r switch, you can replace the stat command by:
date -r "$file"
Remark. I left the echo in front of rm. Remove it if you're happy with how the script behaves. Then you'll have a script that uses 3 external commands :).
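If only the most recently modified file of a duplicate group is needed, a full sort is not required; a single pass with the -nt test is enough. A minimal sketch, assuming bash; newest_file is a hypothetical helper name and is not part of either script above.

```shell
#!/bin/bash
# Sketch: newest_file is a hypothetical helper name.
# Prints whichever argument has the most recent modification time.
# Usage: newest_file FILE...
newest_file() {
    local newest=$1 f
    shift
    for f in "$@"; do
        # -nt is true when $f was modified more recently than $newest.
        if [[ $f -nt $newest ]]; then
            newest=$f
        fi
    done
    printf '%s\n' "$newest"
}
```

The arguments are passed as separate words, so file names containing spaces are safe as long as the caller quotes them.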

Removing old directories with logs

My IM stores the logs according to the contact name. I have created a file with the list of active contacts. My problem is following:
I would like to create a bash script which reads the active contact names from the file and compares them with the directories. If a directory name is not found in the list, it should be moved to another directory (let's call it "archive"). Let me try to visualise it for you.
content of the list:
contact1
contact2
content of the dir
contact1
contact2
contact3
contact4
after running the script, the content of the dir:
contact1
contact2
contact3 ==> ../archive
contact4 ==> ../archive
You could use something like this:
mv $(ls | grep -v -x -F -f ../file.txt) ../archive
Where ../file.txt contains the names of the directories that should not be moved. It is assumed here that the current directory only contains directories, if that is not the case, ls should be replaced with something else. Note that the command fails if there are no directories that should be moved.
Since in the comments to the other answer you state that directories with whitespace in the name can occur, you could replace this by:
for i in *
do
printf '%s\n' "$i" | grep -v -x -q -F -f ../file.txt && mv "$i" ../archive
done
This is an improved version of marcog's answer. Note that the associative array requires Bash 4.
#!/bin/bash
sourcedir=/path/to/foo
destdir=/path/to/archive
contactfile=/path/to/list
declare -A contacts
while read -r contact
do
contacts[$contact]=1
done < "$contactfile"
for contact in "$sourcedir"/*
do
if [[ -f $contact ]]
then
index=${contact##*/}
if [[ ! ${contacts[$index]} ]]
then
mv "$contact" "$destdir"
fi
fi
done
Edit:
If you're moving directories instead of files, then change the for loop above to look like this:
for contact in "$sourcedir"/*/
do
index=${contact/%\/}
index=${index##*/}
if [[ ! ${contacts[$index]} ]]
then
mv "$contact" "$destdir"
fi
done
There might be a more concise solution, but this works. I'd strongly recommend prefixing the mv with echo to test it out first, otherwise you could end up with a serious mess if it doesn't do what you want.
declare -A contacts
for contact in "$@"
do
contacts[$contact]=1
done
ls a | while read contact
do
if [[ ! ${contacts[$contact]} ]]
then
mv "a/$contact" ../archive
fi
done
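For comparison, the same job can be done without associative arrays (so it also runs on Bash older than 4) by asking grep whether each directory name appears in the list as a whole line. This is only a sketch; archive_inactive and its three arguments are made-up names, and it assumes the archive directory is not located inside the source directory.

```shell
#!/bin/bash
# Sketch: archive_inactive and its arguments are hypothetical names.
# Usage: archive_inactive LIST_FILE SOURCE_DIR ARCHIVE_DIR
archive_inactive() {
    local list=$1 src=$2 archive=$3 dir name
    mkdir -p -- "$archive" || return 1
    for dir in "$src"/*/; do
        [[ -d $dir ]] || continue    # the glob matched nothing
        dir=${dir%/}                 # drop the trailing slash
        name=${dir##*/}              # keep only the directory name
        # -F: fixed string, -x: whole-line match, -q: quiet.
        # "--" guards against names that begin with a dash.
        if ! grep -Fxq -- "$name" "$list"; then
            mv -- "$dir" "$archive/" || return 1
        fi
    done
}
```

It runs one grep per directory, which is slower than the array lookup for very large trees but keeps the logic short.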

Resources