I want to automate uploading my website's files, but the remote server does not support SSH, so I am trying the lftp command below instead of rsync.
lftp -c "set ftp:use-mdtm no && set ftp:timezone -9 && open -u user,password ftp.example.com && mirror -Ren local_directory remote_directory"
If no local files have changed, this command uploads nothing. But when I change a single file and run the command, all files are uploaded.
I know about lftp/FTP's MDTM problem, so I tried "set ftp:use-mdtm no && set ftp:timezone -9", but all files are still uploaded even though I changed only one file.
Does anyone know why lftp mirror --only-newer does not transfer only the newer files?
On the following page
http://www.bouthors.fr/wiki/doku.php?id=en:linux:synchro_lftp
the authors state:
When uploading, it is not possible to set the date/time on the files uploaded, that's why --ignore-time is needed.
So if you use the flag combination --only-newer and --ignore-time, you can achieve decent backup properties, in that all files that differ in size are replaced. Of course it doesn't help if you really need to rely on time synchronization, but if it is just to perform a regular backup of data, it'll do the job.
mirror -R -n works for me as a very simple backup of new files
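For instance, a reverse-mirror upload combining both flags could look like this (host, credentials, and directory names are placeholders taken from the question):
lftp -c "open -u user,password ftp.example.com && mirror -R --only-newer --ignore-time local_directory remote_directory"
With --ignore-time, a file is re-uploaded only when it is missing on the server or differs in size, which sidesteps the unreliable timestamps on uploaded files.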
I am at an impasse with my knowledge of bash scripting and rsync (over SSH).
In my use case there is a local folder with log files in it. Those log files are rotated every 24 hours and receive a date stamp in their filename (e.g. logfile.DATE), while the current one is just called logfile.
I'd like to copy those files to another (remote) server and then compress the copied log files on that remote server.
I'd like to use rsync so that, if the script fails once or twice, no files end up skipped (so I would rather not mess with dates and date abbreviations unless necessary).
However, if I understand correctly, all files would be rsynced again, because the files that were already transferred no longer match rsync's comparison once they have been compressed on the remote side.
How can I avoid copying a file again when it already exists at the remote location (just already compressed)?
Does someone have an idea, or a direction I should focus my research on?
Thank you very much
best regards
When you do the rotation, you rename logfile to logfile.DATE. As part of that operation, use ssh to run the same mv on the archive server at the same time (you can even have the server compress the file at that point).
Then you only ever need to rsync the current logfile.
For example, your rotate operation goes from this:
mv logfile logfile.$(date +%F)
To this:
mv logfile logfile.$(date +%F)
ssh archiver "mv logfile logfile.$(date +%F) && gzip logfile.$(date +%F)"
And your rsync job goes from this:
rsync logdir/ archiver:
To this:
rsync logdir/logfile archiver:
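Put together, a rotation script along these lines could drive both sides at once. This is only a sketch under the assumptions already used above (remote host archiver, local directory logdir, GNU date):
#!/bin/bash
# sketch: rotate the local log and the remote copy in one step
set -euo pipefail
stamp=$(date +%F)
cd logdir
mv logfile "logfile.$stamp"
# rename and compress the copy that rsync has been keeping up to date on the archive server
ssh archiver "mv logfile logfile.$stamp && gzip logfile.$stamp"
During the day, the only transfer still needed is rsync logdir/logfile archiver: for the current file.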
I'm using lftp to run a backup job from one location to another. It's the only possible solution for me, and it works really well. I'm using this command:
/usr/bin/lftp -c "open -p 9922 -u jdoe,passw0rd sftp://sftpsiteurl.com; mirror -c -e -R -L /source-folder /destination-folder/"
But I need to change the created date or modified date on the files coming down. Right now the date on the files is the one from the remote location, and I'm not sure how to do this.
I can see that you can run some kind of script to validate the files coming down, but I'm unsure of the command and I can't seem to find any examples.
Does anybody know if this is possible, and how to do it?
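One simple workaround, assuming the job is downloading into a local folder (i.e. a mirror without -R), is to reset the local modification times right after the transfer; host, port, and paths are the placeholders from the command above:
/usr/bin/lftp -c "open -p 9922 -u jdoe,passw0rd sftp://sftpsiteurl.com; mirror -c -e /source-folder /destination-folder/"
# stamp every downloaded file with the current local time instead of the remote one
find /destination-folder -type f -exec touch {} +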
I am trying to mirror a public FTP server to a local directory. When I use wget -m {url}, wget quite quickly skips lots of files that have already been downloaded (and have no newer version); when I use lftp open -u user,pass {url}; mirror, lftp sends MDTM for every file before deciding whether to download it or not. With 2 million+ files in 50 thousand+ directories this is very slow; besides, I get error messages that the MDTM of directories could not be obtained.
In the manual it says that using set sync-mode off will result in sending all requests at once, so that lftp doesn't wait for each response. When I do that, I get error messages from the server saying there are too many connections from my IP address.
I tried running wget first to download only the newer files, but this does not delete the files which were removed from the FTP server, so I follow up with lftp to remove the old files; however, lftp still sends MDTM for each file, so there is no advantage to this approach.
If I use set ftp:use-mdtm off, then it seems that lftp just downloads all files again.
Could someone suggest the correct settings for lftp with a large number of directories/files (specifically, so that it skips directories which were not updated, like wget seems to do)?
Use set ftp:use-mdtm off and mirror --ignore-time for the first invocation to avoid re-downloading all the files.
You can also try to upgrade lftp and/or use set ftp:use-mlsd on, in this case lftp will get precise file modification time from the MLSD command output (provided that the server supports the command).
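A sketch of what the combined invocation might look like, assuming a current lftp and a server that supports MLSD (host, credentials, and paths are placeholders; --delete is included because removed files should also be cleaned up):
lftp -c "set ftp:use-mdtm off; set ftp:use-mlsd on; open -u user,pass ftp.example.com; mirror --delete --ignore-time /remote/dir /local/dir"
Once the timestamps come from MLSD, later runs can drop --ignore-time so that unchanged files are skipped by time and size again.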
I am using codeship.io to upload files from a code repository to shared hosting without SSH.
This is the original command; it took two hours to complete:
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R ${HOME}/clone/ /public_html/targetfolder"
I tried adding -n, which is supposed to upload only newer files, but I can still see from the streaming logs that some unchanged files are being uploaded:
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R -n ${HOME}/clone/ /public_html/targetfolder"
What is the correct command to upload only the updated files?
The command is correct.
The question is why lftp considers the files "changed". It uploads a file if it is missing, has a different size, or has a different modification time.
You can try doing an "ls" in the directory where lftp uploads the files and check whether the files are really present and have the same size and the same or newer modification time.
If for some reason the modification time is older, add --ignore-time to the mirror command.
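For example, the command from the question would then become (credentials, host, and paths exactly as above):
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R -n --ignore-time ${HOME}/clone/ /public_html/targetfolder"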
Codeship builds the code first before deployment.
This means that the code on Codeship's temporary build server is always newer than anything else in your pipeline, even though the code itself may not have changed.
This is why, when you use lftp's "only newer files" option, it effectively matches everything.
As far as I know, you can't upload only the actual newer files.
I need to copy all the files from an FTP folder to my local Windows folder, without replacing the files that already exist. This needs to be a job/task that runs unattended every hour.
This is what the job would need to do:
1. Connect to FTP server.
2. In ftp, move to folder /var/MyFolder.
3. In local PC, move to c:\MyDestination.
4. Copy all files in /var/MyFolder that do not exist in c:\MyDestination.
5. Disconnect.
I had previously tried the following script using MGET * (run from a .bat), but it copies and overwrites everything, which means that even if 1000 files were previously copied, it will copy them all again.
open MyFtpServer.com
UserName
Password
lcd c:\MyDestination
cd /var/MyFolder
binary
mget *
Any help is appreciated.
Thanks.
Use wget for Windows.
If you want to include subdirectories (adjust the cut-dirs number according to the depth of your actual remote path):
cd /d C:\MyDestination
wget.exe --mirror -np -nH --cut-dirs=2 ftp://UserName:Password@MyFtpServer.com/var/MyFolder
If you don't want subdirectories:
cd /d C:\MyDestination
wget.exe -nc ftp://UserName:Password@MyFtpServer.com/var/MyFolder/*
The "magic" bit (for this second form) is the -nc option, which tells wget not to overwrite files that are already there locally. Do keep in mind that old files are also left alone, so if a file on your FTP server gets edited or updated, it won't get re-downloaded. If you want to also update files, use -N instead of -nc.
(Note that you can also type wget instead of wget.exe, I just included the extension to point out that these are Windows batch file commands)
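To run this unattended every hour, one option is to wrap the wget call in a small batch file and register it with the Windows task scheduler. A minimal sketch, assuming the script lives at C:\scripts\ftp-mirror.bat (the path and task name are made up for the example):
rem C:\scripts\ftp-mirror.bat -- fetch only files that are not here yet
cd /d C:\MyDestination
wget.exe -nc ftp://UserName:Password@MyFtpServer.com/var/MyFolder/*
Then register the script once so it runs every hour:
schtasks /create /tn "FtpMirrorHourly" /tr "C:\scripts\ftp-mirror.bat" /sc hourly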