rsync -aviuP misses unfinished or corrupt files - shell

When I run the command:
rsync -aviuP /src /trgt
The command seems to skip some files that are incomplete at the target destination. For example, a 25 GB file on src ends up as a corrupt 4 GB file at the target. I run the command 3 times to be safe.
While a sync is running, I have to stop it with Ctrl+C every now and then because I need the drives to work faster for some other task, but I thought the -P flag was meant to make that kosher.
Am I missing something here? The problem is happening relatively frequently and I can't seem to find any answers on the web.
Thanks in advance.

Avoid interrupting the sync process. Instead, limit the maximum bandwidth used by rsync with the --bwlimit option:
rsync --bwlimit=1000 -aviuP /src /trgt
In this example the maximum bandwidth used would be limited to roughly 1 MB/s (1000 KB/s).
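One possible contributing factor, offered as an assumption rather than something the answer states: -u (--update) skips destination files whose modification time is newer than the source, and a partial file kept by -P after Ctrl+C carries the time of the interruption, so later runs may skip it. A minimal sketch for spotting such leftovers by size follows; list_size_mismatches and the paths are hypothetical names, not part of rsync.

```shell
#!/bin/sh
# Print files under SRC whose copy under DST is missing or has a different size.
# list_size_mismatches is a hypothetical helper, not part of rsync.
list_size_mismatches() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        if [ ! -f "$dst/$rel" ]; then
            printf 'missing: %s\n' "$rel"
        elif [ "$(( $(wc -c < "$src/$rel") ))" -ne "$(( $(wc -c < "$dst/$rel") ))" ]; then
            printf 'size differs: %s\n' "$rel"
        fi
    done
}

# Example (placeholder paths; "rsync -a /src /trgt" copies into /trgt/src):
# list_size_mismatches /src /trgt/src
```

Files reported this way could then be re-synced with a run that omits -u, for example rsync -aviP /src /trgt. File names containing newlines are not handled by this sketch.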

Related

Can "rsync --append" replace files that are larger at destination?

I have an rsync job that moves log files from a web server to an archive. The server rotates its own logs, so I might see a structure like this:
/logs
error.log
error.log.20200420
error.log.20200419
error.log.20200418
I use rsync to sync these log files every few minutes:
rsync --append --size-only /foo/logs/* /mnt/logs/
This command syncs everything with the least amount of processing. And it's important - calculating checksums or writing an entire file every time a few lines are added is a no-go. But it ignores files if there is a larger version on the server instead of replacing them:
man rsync:
--append [...] If a file needs to be transferred and its size on the receiver is the
same or longer than the size on the sender, the file is skipped.
Is there a way to tell rsync to replace files instead in this case? Using --append is important for me and works well for other log files that use unique filenames. Maybe there's a better tool for this?
The service is a packaged application that I can't really edit or configure unfortunately, so changing the file structure or paths isn't an option for me.

How to compare two directories, and if they're the EXACT SAME, delete the second

I'm trying to set up an automatic backup on a Raspberry Pi system connected to an external hard drive.
Basically, I have shared folders and they're mounted via Samba on the Raspberry Pi under
/mnt/Comp1
/mnt/Comp2
I will then have the external hard drive plugged in and mounted with two folders under
/media/external/Comp1
/media/external/Comp2
I will then run a recursive copy from /mnt/Comp1* to /media/external/Comp1/* and the same with Comp2.
What I need help with is at the end of the copies (because it will be a total of 5 computers), I would like to verify that all the files transferred, and if they did and everything is on the external, then I can delete from the local machine automatically. I understand this is risky, because almost inevitably it will delete things that may not be backed up, but I need help knowing where to start.
I've found a lot of information on checking contents of a folder, and I know I can use the diff command, but I don't know how to use it in this pseudocode
use diff on directories /mnt/Comp1/ and /media/external/Comp1
if no differences, proceed to delete /mnt/Comp1/* recursively
if differences, preferably move the files not saved to /media/external/Comp1
repeat checking for differences, and deleting if necessary
Try something like:
diff -r -q d1/ d2/ >/dev/null 2>&1
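Check the return value with $?.
Remove d2 if the return value is 0 (no differences). A return value of 1 means the directories differ, and 2 means diff ran into trouble.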
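The check above can be sketched as a small function. This is only an illustration: delete_if_identical is a hypothetical name, and it relies on diff exiting with status 0 when it finds no differences.

```shell
#!/bin/sh
# delete_if_identical is a hypothetical helper: it removes COPY only when
# "diff -r -q" finds no differences between ORIGINAL and COPY.
delete_if_identical() {
    original=$1
    copy=$2
    if diff -r -q "$original" "$copy" >/dev/null 2>&1; then
        # Exit status 0: no differences were found.
        rm -rf -- "$copy"
        echo "identical, removed $copy"
    else
        # Exit status 1 (differences) or 2 (trouble): keep everything.
        echo "not identical, kept $copy" >&2
        return 1
    fi
}

# Example (placeholder directory names from the answer):
# delete_if_identical d1 d2
```

Note that diff -r compares file contents only, not ownership, permissions or timestamps, so "identical" here means identical in names and contents.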

Directory monitoring using fswatch

I am using fswatch to monitor a directory and run a script when video files are copied into that directory:
fswatch -o /Path/To/Directory/Directory | xargs -n 1 sh /Path/To/Script/Script.sh
The problem is that the file has often not finished copying before the script runs. The files are video files of varying size. Small files are OK, larger files are not.
How can I delay the fswatch notification until the file has completed its copy?
First of all, the behaviour of the fswatch "monitors" is OS-specific: when asking a question about fswatch, it helps to specify the OS you use.
However, there's no way to do that using fswatch alone. A process may open a file for writing and keep it open for an amount of time sufficiently long for the OS to send multiple events. I'm afraid there is nothing fswatch can do about it.
An alternate approach may be using another tool to check whether the modified file is currently open: if it is not, then run your script, otherwise skip it and wait for its next event. Such tools are OS-specific: in OS X and Linux you may use lsof. Beware this approach does not protect you from another process opening that file while your script is running.
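The lsof check described above might look like the following sketch. run_when_closed is a hypothetical wrapper name. It assumes lsof exits with status 0 when some process has the file open and non-zero otherwise.

```shell
#!/bin/sh
# run_when_closed is a hypothetical wrapper: it runs COMMAND on FILE only if
# lsof reports no process holding FILE open.
# Caveat: if lsof is not installed, the check cannot detect open files and
# the command runs anyway.
run_when_closed() {
    file=$1
    shift
    if lsof -- "$file" >/dev/null 2>&1; then
        echo "skipping $file: still open" >&2
        return 1
    fi
    "$@" "$file"
}

# Example (placeholder paths):
# run_when_closed /Path/To/Directory/video.mov sh /Path/To/Script/Script.sh
```

As the answer notes, this does not protect against another process opening the file after the check has passed.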

How to keep two folders automatically synchronized?

I would like to have a synchronized copy of one folder with all its subtree.
It should work automatically in this way: whenever I create, modify, or delete stuff from the original folder those changes should be automatically applied to the sync-folder.
Which is the best approach to this task?
BTW: I'm on Ubuntu 12.04
Final goal is to have a separated real-time backup copy, without the use of symlinks or mount.
I used Ubuntu One to synchronize data between my computers, and after a while something went wrong and all my data was lost during a synchronization.
So I thought to add a step further to keep a backup copy of my data:
I keep my data stored on a "folder A"
I need the answer to my current question to create a one-way sync of "folder A" to "folder B" (maybe a cron job running a script with rsync?). It must be one-way only, from A to B: any changes to B must not be applied to A.
Then I simply keep "folder B" synchronized with Ubuntu One
In this manner any change in A will be applied to B, which will be detected by U1 and synchronized to the cloud. If anything goes wrong and U1 deletes my data on B, I always have it on A.
Inspired by lanzz's comments, another idea could be to run rsync at startup to backup the content of a folder under Ubuntu One, and start Ubuntu One only after rsync is completed.
What do you think about that?
How to know when rsync ends?
You can use inotifywait (with the modify,create,delete,move flags enabled) and rsync.
while inotifywait -r -e modify,create,delete,move /directory; do
    rsync -avz /directory /target
done
If you don't have inotifywait on your system, run sudo apt-get install inotify-tools
You need something like this:
https://github.com/axkibe/lsyncd
It is a tool which combines rsync and inotify. The former is a tool that, with the correct options set, mirrors a directory down to the last bit. The latter tells the kernel to notify a program of changes to a directory or file.
It says:
It aggregates and combines events for a few seconds and then spawns one (or more) process(es) to synchronize the changes.
But - according to Digital Ocean at https://www.digitalocean.com/community/tutorials/how-to-mirror-local-and-remote-directories-on-a-vps-with-lsyncd - it ought to be in the Ubuntu repository!
I have similar requirements, and this tool, which I have yet to try, seems suitable for the task.
Just a simple modification of #silgon's answer:
while true; do
    inotifywait -r -e modify,create,delete /directory
    rsync -avz /directory /target
done
(#silgon version sometimes crashes on Ubuntu 16 if you run it in cron)
Using the cross-platform fswatch and rsync:
fswatch -o /src | xargs -n1 -I{} rsync -a /src /dest
You can take advantage of fschange. It’s a Linux filesystem change notification. The source code is downloadable from the above link, you can compile it yourself. fschange can be used to keep track of file changes by reading data from a proc file (/proc/fschange). When data is written to a file, fschange reports the exact interval that has been modified instead of just saying that the file has been changed.
If you are looking for the more advanced solution, I would suggest checking Resilio Connect.
It is cross-platform, provides extended options for use and monitoring. Since it’s BitTorrent-based, it is faster than any other existing sync tool. It was written on their behalf.
I use this free program to synchronize local files and directories: https://github.com/Fitus/Zaloha.sh. The repository contains a simple demo as well.
The good point: It is a bash shell script (one file only). Not a black box like other programs. Documentation is there as well. Also, with some technical talents, you can "bend" and "integrate" it to create the final solution you like.

Rsync bash script and hard linking files

I am creating a bash script to backup my files with rsync.
Backups all come from a single directory.
I only want new or modified files to be backed up.
Currently, I am telling rsync to backup the dir, and to check the files compared to the last backup.
The way I am doing this is
THE_TIME=`date "+%Y-%m-%dT%H:%M:%S"`
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup /Backup/Backup-$THE_TIME
rm -f /Backup/Current
ln -s /Backup/Backup-$THE_TIME /Backup/Current
I am pretty sure I have the syntax correct for this. Each backup will check against the "Current" folder and upload only as necessary. It will then delete the "Current" symlink and re-create it pointing to the newest backup it just did.
I am getting an error when I run the script:
rsync: link "/Backup/Backup-2010-08-04-12:21:15/dgs1200series_manual_310.pdf"
=> /Backup/Current/dgs1200series_manual_310.pdf
failed: Operation not supported (45)
The host OS is running HFS filesystem, which supports hard linking. I am trying to figure out if something else is not supporting this, or if I have a problem in my code.
Thanks for any help
Edit:
I am able to create a hard link on my local machine.
I am also able to create a hard link on the remote server (when logged in locally)
I am NOT able to create a hard link on the remote server when mounted via afp. Even if both files exist on the server.
I am guessing this is a limitation of afp.
Just in case your command line is only an example: Be sure to always specify the link-dest directory with an absolute pathname! That’s something which took me quite some time to figure out …
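One way to make sure the --link-dest argument is absolute is to resolve it before calling rsync. A minimal sketch follows; absolute_dir is a hypothetical helper name and the paths in the example are placeholders.

```shell
#!/bin/sh
# absolute_dir is a hypothetical helper that prints the absolute, physical
# path of an existing directory (symlinks are resolved by "pwd -P").
absolute_dir() {
    ( cd -- "$1" >/dev/null && pwd -P )
}

# Example (placeholder paths, not run here):
# rsync -aP --link-dest="$(absolute_dir ../Backup/Current)" /src/ /Backup/new/
```

Because pwd -P resolves symlinks, a "Current" symlink would be expanded to the real backup directory it points to.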
Two things from the man page stand out that are worth checking:
If file's aren't linking, double-check their attributes. Also
check if some attributes are getting forced outside of rsync's
control, such a mount option that squishes root to a single
user, or mounts a removable drive with generic ownership (such
as OS X's “Ignore ownership on this volume” option).
and
Note that rsync versions prior to 2.6.1 had a bug that could
prevent --link-dest from working properly for a non-super-user
when -o was specified (or implied by -a). You can work-around
this bug by avoiding the -o option when sending to an old rsync.
Do you have the "ignore ownership" option turned on? What version of rsync do you have?
Also, have you tried manually creating a similar hardlink using ln at the command line?
I don't know if this is the same issue, but I know that rsync can't sync a file when the destination is a FAT32 partition and the filename has a ":" (colon) in it. [The source filesystem is ext3, and the destination is FAT32]
Try reconfiguring the date command so that it doesn't use a colon and see if that makes a difference.
e.g.
THE_TIME=`date "+%Y-%m-%dT%H_%M_%S"`
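Put together, a colon-free backup directory name could be built as in this sketch. BACKUP_ROOT and TARGET are placeholder names introduced here for illustration, not taken from the question.

```shell
#!/bin/sh
# Build a timestamp without colons and use it in a backup directory name.
# BACKUP_ROOT is a placeholder default, overridable from the environment.
BACKUP_ROOT=${BACKUP_ROOT:-/Backup}
THE_TIME=$(date "+%Y-%m-%dT%H_%M_%S")
TARGET="$BACKUP_ROOT/Backup-$THE_TIME"
echo "$TARGET"
```

Underscores are used instead of colons so the name stays valid on filesystems such as FAT32 that reject ":" in file names.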

Resources