I am trying to use grsync (a GUI for rsync) on Windows to run backups. The directory I am backing up contains many large files that are updated periodically. I would like to sync just the changes to those files, not the entire file, on each backup. I was under the impression that rsync is a block-level file copier and would only copy the bytes that had changed between syncs. Perhaps this is not the case, or I have misunderstood what block-level file copying is!
To test this I used grsync to synchronize a 5GB zip file between two directories. Then I added a very small text file to the zip file and ran grsync again. However it proceeded to copy over the entire zip file again. Is there a utility that would only copy over the changes to this zip file and not the entire file again? Or is there a command within grsync that could be used to this effect?
The reason the entire file was copied is simply that the algorithm that handles block-level changes is disabled when copying between two directories on a local filesystem.
This would have worked, because the file is being copied (or updated) to a remote system:
rsync -av big_file.zip remote_host:
This will not use the "delta" algorithm and the entire file will be copied:
rsync -av big_file.zip D:\target\folder\
Some notes
Even if the target is a network share, rsync will treat it as a path on your local filesystem and will disable the "delta" (block changes) algorithm.
Adding data to the beginning or middle of a data file will not upset the algorithm that handles the block-level changes.
Rationale
The delta algorithm is disabled when copying between two local targets because it needs to read both the source and the destination file completely in order to determine which blocks need changing. The rationale is that the time taken to read the target file is much the same as just writing to it, and so there's no point reading it first.
Workaround
If you know for certain that reading from your target filesystem is significantly faster than writing to it, you can force the block-level algorithm to run by including the --no-whole-file flag.
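One way to check whether the delta algorithm actually ran is to add --stats and compare the "Literal data" and "Matched data" lines. The following is a minimal sketch, assuming a Unix-like shell with rsync on the PATH; the temporary directory and the file name big_file.bin are made up for illustration.

```shell
#!/bin/sh
# Sketch: force rsync's delta algorithm for a local-to-local copy and
# show how much data was matched versus sent literally.
# Assumes rsync is installed; all paths are throwaway examples.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"

# Create a 4 MiB test file and copy it once.
head -c 4194304 /dev/urandom > "$work/src/big_file.bin"
rsync -a "$work/src/big_file.bin" "$work/dst/"

# Append a little data, then sync again with the delta algorithm forced on.
echo "a few extra bytes" >> "$work/src/big_file.bin"
rsync -a --no-whole-file --stats "$work/src/big_file.bin" "$work/dst/" \
    | grep -E 'Literal data|Matched data'
```

If the delta algorithm was used, "Matched data" should account for most of the file and "Literal data" for only a small part. Without --no-whole-file, a local copy would be expected to report the whole file as literal data.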
If you add a file to a zip, the entire zip file can change if the file was added as the first file in the archive, because the rest of the archive shifts. So yours is not a valid test.
I was just looking for this myself. I think you have to use
rsync -av --inplace
for this to work.
Related
I have an rsync job that moves log files from a web server to an archive. The server rotates its own logs, so I might see a structure like this:
/logs
error.log
error.log.20200420
error.log.20200419
error.log.20200418
I use rsync to sync these log files every few minutes:
rsync --append --size-only /foo/logs/* /mnt/logs/
This command syncs everything with the least amount of processing. And it's important - calculating checksums or writing an entire file every time a few lines are added is a no-go. But it ignores files if there is a larger version on the server instead of replacing them:
man rsync:
--append [...] If a file needs to be transferred and its size on the receiver is the
same or longer than the size on the sender, the file is skipped.
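The skip rule quoted above comes down to a size comparison, so it can be checked before rsync runs. This is a minimal sketch, assuming a POSIX shell; would_skip_append is a hypothetical helper name, not an rsync feature.

```shell
#!/bin/sh
# would_skip_append SRC DST  (hypothetical helper)
# Returns 0 (true) when DST exists and is the same size as or larger than
# SRC, which is the case in which rsync --append skips the file.
would_skip_append() {
    [ -f "$2" ] || return 1
    src_size=$(wc -c < "$1" | tr -d ' ')
    dst_size=$(wc -c < "$2" | tr -d ' ')
    [ "$dst_size" -ge "$src_size" ]
}

# Example: report log files whose archived copy is not shorter than the
# source. /foo/logs and /mnt/logs are the paths from the question.
for f in /foo/logs/*; do
    [ -f "$f" ] || continue
    if would_skip_append "$f" "/mnt/logs/$(basename "$f")"; then
        echo "would be skipped by --append: $f"
    fi
done
```

A file reported here is either unchanged or has been rotated and restarted, so its archived copy cannot be updated by appending.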
Is there a way to tell rsync to replace files instead in this case? Using --append is important for me and works well for other log files that use unique filenames. Maybe there's a better tool for this?
The service is a packaged application that I can't really edit or configure unfortunately, so changing the file structure or paths isn't an option for me.
I want to know how to move files into a .zip archive. I'm using this command: xcopy C:\Folder C:\AnotherFolder\zippedFolder.zip. This copies the files from C:\Folder DIRECTLY into the archive, but I want to have that file in the archive (so I can double-click the archive and see the file unopened).
I want to do this to create an Excel file from a .cmd script.
Use -m to move a file into a ZIP archive.
I found this on Stack Overflow; maybe it helps you.
How to move a file in to zip uncompressed, with zip cmd tool
But be careful it deletes the source file after it adds it to the archive. See the link for more details.
UPDATE
Instructions from this site. http://linux.about.com/od/commands/l/blcmdl1_zip.htm.
-m moves the specified files into the ZIP archive; actually, this deletes the target directories/files after making the specified ZIP archive.
If a directory becomes empty after removal of the files, the directory is also removed. No deletions are done until zip has created the archive without errors. This is useful for conserving disk space, but is potentially dangerous so it is recommended to use it in combination with -T to test the archive before removing all input files.
zip -m archive.zip yourfile
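Following the recommendation to combine -m with -T, a cautious invocation might look like the sketch below. It assumes Info-ZIP's zip (and unzip, which -T uses to test the archive) is installed; archive.zip and report.txt are placeholder names.

```shell
#!/bin/sh
# Sketch: move a file into a ZIP archive, testing the archive before the
# source file is deleted. Assumes Info-ZIP zip; names are placeholders.
work=$(mktemp -d)
cd "$work" || exit 1
echo "example data" > report.txt

# -m deletes report.txt only after the archive was created without errors;
# -T tests the new archive's integrity before that deletion happens.
zip -m -T archive.zip report.txt
```

Note that the archive name comes first and the files to add come after it.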
I am trying to create a script for a cron job. I have a folder of around 8 GB containing thousands of files. I want a bash script that first tars the folder and then transfers the resulting tar file to an FTP server.
But I am not sure what happens if, while tar is archiving the folder, some other process is reading or writing files inside it.
It is fine for me if the tar file does not contain changes made while tar was archiving the folder.
Please suggest the proper way. Thanks.
tar will happily tar "whatever it can". But tar also stores the size of each file before archiving it, so you may get some surprises when untarring.
A very unpleasant surprise: if a file is truncated, tar will "fill" it with NUL characters to match its recorded size, which can have very unpleasant side effects. In some cases tar says nothing when untarring and silently adds as many NUL characters as it needs to match the size (in fact, on Unix, it doesn't even need to do that: the OS does it, see "sparse files"). In other cases, if the truncation occurred while the file was being archived, tar will complain on untarring that it encountered an unexpected end of file (it expected XXX bytes but read fewer), but it will still say that the file should be XXX bytes (and Unix systems will then create it as a sparse file, with NUL characters magically appended at the end to match the expected size when you read it).
(To see the NUL characters, an easy way is to run less thefile, or cat -v thefile | more on a very old Unix, and look for any ^@.)
But on the contrary, if files are only appended to (logs, etc), then the side effect is less problematic : you will only miss some bits of them (which you say you're ok about), and not have that unpleasant "fill with NUL characters" side effects. tar may complain when untarring the file, but it will untar it.
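To check an extracted file for NUL padding without paging through it, the NUL bytes can be counted directly. This is a minimal sketch, assuming a POSIX shell; count_nuls is a hypothetical helper name.

```shell
#!/bin/sh
# count_nuls FILE  (hypothetical helper)
# Prints the number of NUL (zero) bytes in FILE.
count_nuls() {
    # tr -cd deletes every byte except NUL; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}
```

For example, count_nuls thefile prints 0 for an ordinary text log. A non-zero count in a file that should be plain text suggests it was padded.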
I think tar fails (and so does not create the archive) when a file is modified while it is being archived. As Etan said, the solution depends on what you ultimately want in the tarball.
To avoid a tar failure, you can simply COPY the folder elsewhere before calling tar. But in that case you cannot be confident in the consistency of the backed-up directory. The copy is NOT an atomic operation, so some files will be up to date while others will be outdated. Whether that is a severe issue depends on your situation.
If you can, I suggest you configure how these files are created. For example: "only recent files are appended to; files older than 1 day are never changed". In this case you can easily back up only the old files and the backup will be consistent.
More generally, you either have to accept losing the latest data AND being inconsistent (each file is backed up at a different time), or you have to act at a different level. I suggest:
Configure the software that produces the data so that it guarantees some level of consistency
Or use OS/virtualization features. For example, it is possible to take a consistent snapshot of the storage on some virtual storage systems...
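If it is acceptable for the archive to miss the most recent writes, one option is to run tar against the live folder and treat "file changed" as a warning, not a failure. The sketch below assumes GNU tar, which documents exit status 1 on creation as meaning some files changed while being archived, and 2 as a fatal error. Other tar implementations may differ. archive_folder and its arguments are hypothetical.

```shell
#!/bin/sh
# archive_folder PARENT NAME OUTPUT  (hypothetical helper)
# Archives PARENT/NAME into OUTPUT, tolerating files that change mid-read.
# Assumes GNU tar exit codes: 0 = ok, 1 = some files changed, 2 = fatal.
archive_folder() {
    tar -czf "$3" -C "$1" "$2"
    status=$?
    case "$status" in
        0) return 0 ;;
        1) echo "warning: some files changed while being archived" >&2
           return 0 ;;
        *) echo "error: tar failed with status $status" >&2
           return "$status" ;;
    esac
}
```

A cron script could call archive_folder /data myfolder /tmp/myfolder.tar.gz (placeholder paths) and start the upload only if it returns 0.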
I am using rsync to copy tarballs to an external hard drive on a Windows XP machine.
My files are tar.gz files (perms 600) in a directory (perms 711).
However, when I do a dry-run, only the folders are returned, the files are ignored.
I use RSync a lot, so I presume there is no issue with my installation.
I have tried changing permissions of the files but this makes no difference
The owner of the files is root, which is also the user which the script logs in as
I am not using Rsync's CVS option
The command I am using is:
rsync^
-azvr^
--stats^
--progress^
-e 'ssh -p 222' root@servername:/home/directory/ ./
Is there something I am missing to get my files copied over?
I can think of only a single possibility: My experience with rsync is that it creates the directory structure before copying files in. Rsync may be terminating prematurely, but after this directory step has been completed.
Update0
You mentioned that you were running dry run. Rsync by default only shows the directory names when the directory and all its contents are not present on the receiver.
After a lot of experimentation, I'm only able to reproduce the behaviour you describe if the directories on the source have later modification dates than on the receiver. In this instance, the modification times are adjusted on the receiver.
I had this problem too, and it turns out that when backing up to a Windows drive from Linux, rsync doesn't seem to move the temp files into place after they are transferred.
Try adding the --inplace flag, when rsyncing to windows drives.
I'm extracting a folder from a tarball, and I see these zero-byte files showing up in the result (where they are not in the source.) Setup (all on OS X):
On machine one, I have a directory /My/Stuff/Goes/Here/ containing several hundred files.
I build it like this
tar -cZf mystuff.tgz /My/Stuff/Goes/Here/
On machine two, I scp the tgz file to my local directory, then unpack it.
tar -xZf mystuff.tgz
It creates ~scott/My/Stuff/Goes/, but then under Goes, I see two files:
Here/ - a directory,
Here.bGd - a zero byte file.
The "Here.bGd" zero-byte file has a random 3-character suffix, mixed upper and lower-case characters. It has the same name as the lowest-level directory mentioned in the tar-creation command. It only appears at the lowest level directory named. Anybody know where these come from, and how I can adjust my tar creation to get rid of them?
Update: I checked the table of contents on the files using tar tZvf: toc does not list the zero-byte files, so I'm leaning toward the suggestion that the uncompress machine is at fault. OS X is version 10.5.5 on the unzip machine (not sure how to check the filesystem type). Tar is GNU tar 1.15.1, and it came with the machine.
You can get a table of contents from the tarball by doing
tar tZvf mystuff.tgz
If those zero-byte files are listed in the table of contents, then the problem is on the computer making the tarball. If they aren't listed, then the problem is on the computer decompressing the tarball.
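The same check can be scripted for a single suspicious name. This is a minimal sketch, assuming a tar that detects compression automatically when reading (GNU tar and bsdtar both do for regular files); if yours does not, add the Z flag as in the command above. tar_lists is a hypothetical helper name.

```shell
#!/bin/sh
# tar_lists ARCHIVE NAME  (hypothetical helper)
# Returns 0 if NAME appears as an entry in ARCHIVE's table of contents.
tar_lists() {
    # -F: fixed string, -x: match the whole line, -q: no output.
    tar -tf "$1" | grep -Fxq -- "$2"
}
```

For example, tar_lists mystuff.tgz My/Stuff/Goes/Here.bGd would return non-zero if the zero-byte file is not in the tarball. The name is written without a leading slash on the assumption that tar stripped it from the absolute path when creating the archive.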
I can't replicate this on my 10.5.5 system.
So, for each system:
what version of OSX are you using?
what filesystem is in use?
I have not seen this particular problem before with tar. However, there is another problem where tar bundles metadata files with regular files (they have the same name but are prefixed with "._"). The solution to this was to set the environment variable COPYFILE_DISABLE=y. If those weird files you have are more metadata files, maybe this would solve your problem as well?
Failing that, you could try installing a different version of tar.
On my Mac OS X (10.4.11) machine, I sometimes acquire .DS_Store files in a directory (but these are not empty files), and I've seen other hidden file names on memory sticks that have been used on the Mac. These are somehow related to the Mac file system. I'd guess that what you are seeing is related to one or the other of these sets of files. Original Macs (Mac OS 9 and earlier) had data forks and resource forks for files.
A little bit of play shows that a directory in which I've never used Finder has no .DS_Store file; if I use the Finder to navigate to that directory, the .DS_Store file appears. It presumably contains information about how the files should appear in the Finder display (so if you move files around, they stay where you put them).
This doesn't directly answer your question; I hope it gives some pointers.
I don't know (and boy is this a hard problem to Google for!), but here's a troubleshooting step: try tar without Z. That will determine whether compress or tar is causing the issue.