I have this script for backing up PostgreSQL databases:
# Location to place backups.
#String to append to the name of the backup files
backup_date=`date +%Y-%m-%d`
#Number of days you want to keep copies of your databases
databases=`psql -l -t | cut -d'|' -f1 | sed -e 's/ //g' -e '/^$/d'`
for i in $databases; do
    if [ "$i" != "template0" ] && [ "$i" != "template1" ]; then
        echo "Dumping $i to ${backup_dir}${i}_${backup_date}"
        pg_dump -Fc "$i" > "${backup_dir}${i}_${backup_date}"
    fi
done
# Delete backups older than $number_of_days days
find "$backup_dir" -type f -prune -mtime +"$number_of_days" -exec rm -f {} \;
Only one cluster was used on the server, so everything was fine. But now a new cluster has been created, so I started wondering whether the backups will still be done properly and, if not, how to make sure every cluster gets backed up.
Will this script now go over every cluster and back up all databases in all clusters? If so, there might be name clashes.
How could I make sure the backups for different clusters go into different directories?
I solved this by creating a second script to back up the other cluster's databases. It is not very elegant, but it works. If anyone could write a more universal script (one that could be used for all DB backups and takes different backup directories and clusters into account), please post it as an answer, as it would be a better solution.
The second script looks like this (of course, it would be best to merge both scripts into one):
# Location to place backups.
#String to append to the name of the backup files
backup_date=`date +%Y-%m-%d`
#Number of days you want to keep copies of your databases
databases=` psql -p 5433 -l -t | cut -d'|' -f1 | sed -e 's/ //g' -e '/^$/d' `
for i in $databases; do
    if [ "$i" != "template0" ] && [ "$i" != "template1" ]; then
        echo "Dumping $i to ${backup_dir}${i}_${backup_date}.gz"
        pg_dump -p 5433 -Fc "$i" | gzip -f > "${backup_dir}${i}_${backup_date}.gz"
    fi
done
# Delete backups older than $number_of_days days
find "$backup_dir" -type f -prune -mtime +"$number_of_days" -exec rm -f {} \;
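Since psql and pg_dump talk to one server at a time (selected with -p or PGPORT), a single script can loop over the cluster ports and give each cluster its own directory. The following is only a sketch under assumptions: backup_root, the ports list, the cluster_PORT directory naming and the function names are made up for illustration, and the database list is taken from pg_database instead of parsing psql -l.

```shell
#!/usr/bin/env bash
# Sketch: one backup script for several clusters, keyed by port.
# backup_root, ports and number_of_days are placeholder values.
backup_root="/tmp/pg_backups"
ports=(5432 5433)
number_of_days=7
backup_date=$(date +%Y-%m-%d)

# Build the dump path for one database of one cluster.
dump_path() {  # usage: dump_path PORT DBNAME
    printf '%s/cluster_%s/%s_%s\n' "$backup_root" "$1" "$2" "$backup_date"
}

# Dump every non-template database of the cluster listening on PORT.
backup_cluster() {  # usage: backup_cluster PORT
    local port=$1 db
    mkdir -p "$backup_root/cluster_$port"
    psql -p "$port" -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate" |
    while IFS= read -r db; do
        echo "Dumping $db (port $port) to $(dump_path "$port" "$db")"
        pg_dump -p "$port" -Fc "$db" > "$(dump_path "$port" "$db")"
    done
    # Remove dumps of this cluster that are older than number_of_days days.
    find "$backup_root/cluster_$port" -type f -mtime +"$number_of_days" -exec rm -f {} \;
}

# Only attempt the backups when the PostgreSQL client tools are installed.
if command -v psql >/dev/null && command -v pg_dump >/dev/null; then
    for port in "${ports[@]}"; do
        backup_cluster "$port"
    done
fi
```

Because each cluster writes below its own cluster_PORT directory, two databases with the same name in different clusters no longer overwrite each other.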
I needed to move a large s3 bucket to a local file store for a variety of reasons, and the files were stored as 160,000 directories with subdirectories.
As this is far too many folders to browse with something like a GUI FTP client, I'd like to move the 160,000 root directories into, say, 320 directories, with 500 directories in each.
I'm a newbie at bash scripting, and I just wrote this up, but I'm scared I'm going to mangle the whole thing and have to redo the transfer. I tested with [[ "$i" -ge 3 ]]; and some directories with subdirectories and it looked like it worked okay, but I'm quite nervous. Do not want to retransfer all this data.
i=0; j=1
for file in *; do
    if [[ -d "$file" && ! -L "$file" ]]; then
        echo "directory $file is being written to assets_$j"
        mv "$file" "./assets_$j/"
        if [[ "$i" -ge 499 ]]; then i=0; j=$((j+1)); else i=$((i+1)); fi
    fi
done
Thanks for the help!
Find all the directories in the current folder.
Read a chunk of the folders at a time.
Exec mv for each chunk.
find . -mindepth 1 -maxdepth 1 -type d |
while IFS= readarray -n10 -t files && ((${#files[@]})); do
    # "$dest" must name the target directory for this chunk
    echo mkdir -v -p "$dest"
    echo mv -v "${files[@]}" "$dest"
done
On the condition that assets_1, assets_2, etc. do not exist in the working directory yet:
dirs=(*/)
for (( i=0,j=1; i<${#dirs[@]}; i+=500,j++ )); do
    echo mkdir ./assets_$j/
    echo mv "${dirs[@]:i:500}" ./assets_$j/
done
If you're happy with the output, remove echos.
A possible way, although it gives you no control over the counter, is:
find . -type d -mindepth 1 -maxdepth 1 -print0 \
| xargs -0 -n 500 sh -c 'echo mkdir -v ./assets_$$ && echo mv -v "$@" ./assets_$$' _
This takes the assets counter from the PID, which is only reused once the PID counter wraps around (Linux PID recycling).
The order that find returns is slightly different from that of the glob * (Find command default sorting order).
If you want to have the sort order alphabetically, you can add a simple sort:
find . -type d -mindepth 1 -maxdepth 1 -print0 | sort -z \
| xargs -0 -n 500 sh -c 'echo mkdir -v ./assets_$$ && echo mv -v "$@" ./assets_$$' _
note: remove the echo if you are pleased with the output
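If control over the counter matters, the same find and sort pipeline can feed a plain loop that numbers the buckets itself. This is a minimal sketch, assuming GNU sort (for -z) and bash; chunk_dirs, the demo directory names and the chunk size of 3 are illustrative only. It rehearses the move on throwaway data so the logic can be checked before touching the real transfer.

```shell
#!/usr/bin/env bash
# Move the top-level directories of SRC into numbered buckets
# (assets_1, assets_2, ...) holding CHUNK_SIZE directories each.
chunk_dirs() {  # usage: chunk_dirs SRC CHUNK_SIZE
    local src=$1 chunk_size=$2 count=0 bucket=1 dir
    while IFS= read -r -d '' dir; do
        mkdir -p "$src/assets_$bucket"
        mv -- "$dir" "$src/assets_$bucket/"
        count=$((count + 1))
        if (( count >= chunk_size )); then
            count=0
            bucket=$((bucket + 1))
        fi
    done < <(find "$src" -mindepth 1 -maxdepth 1 -type d ! -name 'assets_*' -print0 | sort -z)
}

# Rehearsal on throwaway data: 7 directories, 3 per bucket.
demo=$(mktemp -d)
for n in 1 2 3 4 5 6 7; do
    mkdir -p "$demo/dir_$n/sub"
done
chunk_dirs "$demo" 3
ls "$demo"
ls "$demo/assets_3"
```

For the real data the call would be something like chunk_dirs . 500, run from the directory that holds the 160,000 folders.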
# Define source, target, maxdepth and cd to source
cd "${source}"
# Set the maximum number of concurrent rsync threads
# How long to wait before checking the number of rsync threads again
# Find all folders in the source directory within the maxdepth level
find . -maxdepth ${depth} -type d | while read dir
do
    # Make sure to ignore the parent folder
    if [ `echo "${dir}" | awk -F'/' '{print NF}'` -gt ${depth} ]
    then
        # Strip leading dot slash
        subfolder=$(echo "${dir}" | sed 's#^\./##g')
        if [ ! -d "${target}/${subfolder}" ]
        then
            # Create destination folder and set ownership and permissions to match source
            mkdir -p "${target}/${subfolder}"
            chown --reference="${source}/${subfolder}" "${target}/${subfolder}"
            chmod --reference="${source}/${subfolder}" "${target}/${subfolder}"
        fi
        # Make sure the number of rsync threads running is below the threshold
        while [ `ps -ef | grep -c [r]sync` -gt ${maxthreads} ]
        do
            echo "Sleeping ${sleeptime} seconds"
            sleep ${sleeptime}
        done
        # Run rsync in background for the current subfolder and move on to the next one
        nohup rsync -au "${source}/${subfolder}/" "${target}/${subfolder}/" \
            </dev/null >/dev/null 2>&1 &
    fi
done
# Find all files above the maxdepth level and rsync them as well
find . -maxdepth ${depth} -type f -print0 | rsync -au --files-from=- --from0 ./ "${target}/"
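The throttling step above counts every rsync process on the machine via ps. A possible alternative, shown here only as a sketch, is to count the background jobs of the script itself with the bash builtin jobs. The run_limited helper and the values of maxthreads and sleeptime are assumptions for illustration, and short sleep commands stand in for the rsync calls.

```shell
#!/usr/bin/env bash
maxthreads=2      # illustrative value
sleeptime=0.1     # illustrative value; fractional seconds assume GNU or BSD sleep

# Start a command in the background once fewer than maxthreads
# background jobs of this shell are still running.
run_limited() {
    while (( $(jobs -rp | wc -l) >= maxthreads )); do
        sleep "$sleeptime"
    done
    "$@" &
}

out=$(mktemp)
for n in 1 2 3 4 5; do
    # Stand-in for: rsync -au "${source}/${subfolder}/" "${target}/${subfolder}/"
    run_limited sh -c 'sleep 0.2; echo "job $1 finished" >> "$2"' _ "$n" "$out"
done
wait    # block until every background job has exited
cat "$out"
```

Counting the shell's own jobs avoids being throttled by unrelated rsync processes that other users or cron jobs may have started.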
Thank you for all your help. By adding the -v switch to rsync, I solved the problem.
Not sure if this is what you are after (I don't know what rsync is), but can you not just run the script as
./myscript > logfile.log
or
./myscript | tee logfile.log
(i.e. pipe to tee if you want to see the output as it goes along)?
Alternatively... not sure this is what real coders do, but you could append the output of each command in the script to a logfile, eg:
#at the beginning define the logfile name:
#remove the file if it exists
if [ -e "${logfile}.log" ]; then rm -i "${logfile}.log"; fi
#for each command that you want to capture the output of, use >> ${logfile}.log
mkdir -p "${target}/${subfolder}" >> ${logfile}.log
If rsync has several threads with names, I imagine you could store to separate logfiles as >> ${logfile}${thread}.log and concatenate the files at the end into 1 logfile.
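Instead of appending >> to every single command, a group of commands can share one redirection. The snippet below is a small sketch of that idea; the temporary logfile and workdir are placeholders created with mktemp, not paths from the script in the question.

```shell
#!/usr/bin/env bash
logfile=$(mktemp)       # placeholder log location; any writable path works
workdir=$(mktemp -d)    # placeholder working directory

# Everything inside the braces writes to the logfile,
# standard error included (2>&1).
{
    echo "creating target folders"
    mkdir -p "$workdir/target/subfolder"
    ls "$workdir/does-not-exist"     # the error message lands in the log too
    echo "finished"
} >> "$logfile" 2>&1

cat "$logfile"
```

Commands started in the background inside the braces inherit the same redirection, so their output ends up in the same file.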
Hope that is helpful? (am new to answering things - so I apologise if what I post is basic/bad, or if you already considered these ideas!)
On my external HDD I have two partitions, one for Mac and the other for Windows (FAT32). Since my Mac partition is almost full due to Time Machine backups, I want to move some of my old folders (which contain movies) from the Mac partition to the Windows partition. However, the FAT32 file system only allows files smaller than 4 GB, and some of my folders contain files larger than that. I don't want to manually go through each folder, check its size, and then copy and paste the small ones.
So my question is:
What is the command for moving all the folders (including the sub-directories) less than 4GB to the new partition? Does it have anything to do with the options of mv command?
--- Update 12/7/2014---
I ran find . -mindepth 1 -type d -exec bash -c 'f="$1";set $(du -bs "$f"); \ [[ $1 -lt 4294967296 ]] && echo mv "$f" /dest-dir' - '{}' \; >> output.txt.
The following are the first few lines of my output:
BASH_EXECUTION_STRING=$'f="$1";set $(du -bs "$f"); \\\n [[ $1 -lt 4294967296 ]] && echo mv "$f" /Volumes/WIN_PANC/movies/'
BASH_VERSINFO=([0]="3" [1]="2" [2]="53" [3]="1" [4]="release" [5]="x86_64-apple-darwin14")
They are not the folders I want to move. Am I doing it right?
You can use this find command to list directories whose total size is less than 4 GB (du -sk reports sizes in 1 KB blocks, on OS X as well):
find . -mindepth 1 -type d -exec bash -c 'f="$1"; read s _ < <(du -sk "$f"); \
[[ $s -lt 4194304 ]] && echo mv "$f" /dest-dir' - '{}' \;
Remove echo before mv command once you're satisfied with the listing.
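Since the FAT32 limit applies to each file and not to the folder total, another option is to test whether a folder contains any single file above the limit. This swaps the du total for find -size and is only a sketch: list_movable_dirs, the demo folder names and the 100 KiB limit are illustrative, and a 200 KiB file stands in for a huge movie.

```shell
#!/usr/bin/env bash
# Print each top-level directory under SRC that contains
# no regular file larger than LIMIT_KB kibibytes.
list_movable_dirs() {  # usage: list_movable_dirs SRC LIMIT_KB
    local src=$1 limit_kb=$2 dir
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        if [ -z "$(find "$dir" -type f -size +"${limit_kb}"k -print | head -n 1)" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Rehearsal on throwaway data.
demo=$(mktemp -d)
mkdir -p "$demo/small/sub" "$demo/big"
echo "tiny" > "$demo/small/sub/note.txt"
head -c 204800 /dev/zero > "$demo/big/movie.bin"   # 200 KiB stand-in for a huge file
list_movable_dirs "$demo" 100
```

For the 4 GB case the limit would be around 4194303 (4 GiB expressed in KiB, minus one), and the printf line would become the mv command once the listing looks right.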
The following script can do this for you (for files >4G):
#! /bin/bash
my_files=`ls --almost-all -1v -s -A --block-size=G|sort|sed -e 's#^[0-4]*G##g' -e '$ s#.*##g'`
echo "$my_files" >> my_files.txt
while read -r file; do
echo "MOVING FILE : $file"
mv "$file" "destination_location"
sleep 0.5
done < my_files.txt
rm -rf my_files.txt
Note: in a terminal, change to the directory that contains the files to be moved, then run the script from that same terminal. Make sure you replace "destination_location" in the script with the directory you want to move the files to.
Note: You will have to change directory and run the script in each directory.
I have a script that backs up my svn repo to another server (set up as a cron job to run daily):
svnadmin dump /path/to/repo | gzip > /backups/`date +%F`_repo.svn.gz
scp /backups/`date +%F`_repo.svn.gz user@ip:/backups/svn/
So example filenames:
2014-04-30_repo.svn.gz, 2014-04-29_repo.svn.gz, 2014-04-28_repo.svn.gz
Using bash, how would I go about removing backups older than 7 days?
This should work:
find /path/to/files -name '*_repo.svn.gz' -mtime +7 | xargs rm
If you're trying to rely totally on the file name for the date, then something like this:
TODAY=$(date '+%s')
for f in /backup/*_repo.svn.gz ; do
    DATESTR=$(echo "$f" | sed "s/^\/backup\/\(.*\)_repo\.svn\.gz/\1/")
    FILEDATE=$(date -d "$DATESTR" '+%s')
    if ((FILEDATE + 7*24*60*60 < TODAY)) ; then
        rm "$f"
    fi
done
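Because the %F format (YYYY-MM-DD) sorts the same way as the dates themselves, the date in the file name can also be compared as a plain string against a cutoff date. The following is a sketch under assumptions: it needs GNU date for -d, and prune_by_name and the demo directory are made-up names used to rehearse the deletion on empty files.

```shell
#!/usr/bin/env bash
# Delete DIR/*_repo.svn.gz files whose name starts with a date
# earlier than the date DAYS days ago.
prune_by_name() {  # usage: prune_by_name DIR DAYS
    local dir=$1 days=$2 cutoff f name datestr
    cutoff=$(date -d "$days days ago" +%F)    # GNU date syntax
    for f in "$dir"/*_repo.svn.gz; do
        [ -e "$f" ] || continue               # glob matched nothing
        name=${f##*/}                         # strip the directory part
        datestr=${name%%_*}                   # keep the text before the first "_"
        if [[ $datestr < $cutoff ]]; then
            rm -- "$f"
        fi
    done
}

# Rehearsal on throwaway data: one old backup, one from today.
demo=$(mktemp -d)
today=$(date +%F)
touch "$demo/2014-04-28_repo.svn.gz" "$demo/${today}_repo.svn.gz"
prune_by_name "$demo" 7
ls "$demo"
```

This version depends only on the name, so it still works after the files have been copied and their modification times have changed.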
This script sorts the files by date and then moves the first 2500 files to another directory.
When I run the script below, the system reports an "Argument list too long" message. Can anyone help me improve the script? Thanks
echo "unused_file directory does not exist!"
echo "$DESTINATION_DIRECTORY directory created!"
echo "Moving $NUM_OF_FILES oldest files to $DESTINATION_DIRECTORY directory"
xargs -i sh -c "mv {} $DESTINATION_DIRECTORY"
You didn't say, but I assume this is where the problem occurs:
ls -tr $FROM_DIRECTORY/MSCERC*.Z|head -2500 | \
xargs -i sh -c "mv {} $DESTINATION_DIRECTORY"
(You can verify it by adding "set -x" to the top of your script.)
The problem is that the kernel has a fixed maximum size for the total length of the command line given to a new process, and you're exceeding that in the ls command. You can work around it by not using globbing and instead using grep:
ls -tr $FROM_DIRECTORY/ | grep '^MSCERC.*\.Z$' | head -2500 | \
xargs -i sh -c "mv $FROM_DIRECTORY/{} $DESTINATION_DIRECTORY"
(grep uses regular expressions instead of globs, so the pattern looks a little bit different.)
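To see the limit in question, getconf can report it. The sketch below also shows that a shell builtin such as printf is not affected, because the expanded glob never has to be passed to a new process. The demo directory and file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Size limit (in bytes) for the arguments plus environment of a new process.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

demo=$(mktemp -d)
touch "$demo/MSCERC_a.Z" "$demo/MSCERC_b.Z" "$demo/other.txt"

# printf is a bash builtin: the glob is expanded inside the shell and
# no new process receives the long argument list.
matches=$(printf '%s\n' "$demo"/MSCERC*.Z | wc -l)
echo "matching files: $matches"
```

An external command such as ls, by contrast, receives the whole expanded list as arguments, which is where the "Argument list too long" error comes from.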
ls -tr $FROM_DIRECTORY/MSCERC*.Z|head -2500 | \
xargs -i sh -c "mv {} $DESTINATION_DIRECTORY"
Instead of the above, do something like the following:
find "$FROM_DIRECTORY" -maxdepth 1 -type f -name 'MSCERC*.Z' -printf '%p\t%T@\n' | sort -k2,2n | cut -f1 | head -$NUM_OF_FILES | xargs mv -t "$DESTINATION_DIRECTORY"
This uses find to create a list of files with modification timestamps, sorts by the timestamp, then removes the unneeded field before passing the output to head and xargs
Another variant, for systems where mv lacks the -t option (find -printf is still a GNU extension):
find "$FROM_DIRECTORY" -type f -name 'MSCERC*.Z' -printf '%p\t%T@\n' | sort -k2,2n | cut -f1 | head -$NUM_OF_FILES | xargs -i mv \{\} "$DESTINATION_DIRECTORY"
First of all, create a list of the files to be processed. Then read that list line by line and handle each file. For example:
echo "unused_file directory does not exist!"
echo "$DESTINATION_DIRECTORY directory created!"
echo "Moving $NUM_OF_FILES oldest files to $DESTINATION_DIRECTORY directory"
ls -tr $FROM_DIRECTORY/MSCERC*.Z|head -2500 > list
exec 3<list
while read file <&3
do
    mv "$file" "$DESTINATION_DIRECTORY"
done
A quick way to fix this would be to change to $FROM_DIRECTORY, so that you can refer to the files using (shorter) relative paths.
ls -tr MSCERC*.Z|head -2500 |xargs -i sh -c "mv {} $DESTINATION_DIRECTORY"
This is also not entirely fool-proof, if you have too many files that match.
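A variant that avoids the glob altogether and also copes with unusual file names is to pass NUL-terminated names through the whole pipeline. This is a sketch that assumes GNU find, sort, head, cut, xargs and mv (the -z and -printf options are GNU extensions); move_oldest and the demo file names are illustrative.

```shell
#!/usr/bin/env bash
# Move the COUNT oldest MSCERC*.Z files from FROM_DIR to DEST_DIR.
move_oldest() {  # usage: move_oldest FROM_DIR DEST_DIR COUNT
    local from=$1 dest=$2 count=$3
    mkdir -p "$dest"
    # Emit "mtime<TAB>path" records separated by NUL, sort them oldest first,
    # keep COUNT records, drop the timestamp, and hand the paths to mv in batches.
    find "$from" -maxdepth 1 -type f -name 'MSCERC*.Z' -printf '%T@\t%p\0' |
        sort -z -n |
        head -z -n "$count" |
        cut -z -f2- |
        xargs -0 -r mv -t "$dest"
}

# Rehearsal on throwaway data: five files with increasing modification times.
demo=$(mktemp -d)
for n in 1 2 3 4 5; do
    touch -d "2020-01-0$n 12:00" "$demo/MSCERC_$n.Z"
done
move_oldest "$demo" "$demo/unused_file" 2
ls "$demo/unused_file"
```

xargs splits the list into as many mv invocations as needed, so the number of matching files no longer matters.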