Correct LFTP command to upload only updated files - lftp

I am using codeship.io to upload files in a code repository to a shared hosting without SSH.
This is the original command; it took two hours to complete:
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R ${HOME}/clone/ /public_html/targetfolder"
I tried to add -n, which is supposed to upload only newer files. But I can still see from the streaming logs that some unchanged files are being uploaded:
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R -n ${HOME}/clone/ /public_html/targetfolder"
What is the correct command to upload only the updated files?

The command is correct.
The question is why lftp considers the files "changed". It uploads a file if it is missing, has a different size, or has a different modification time.
You can try an "ls" on the directory lftp uploads to and check whether the files are really present, have the same size, and have the same or a newer modification time.
If for some reason the modification time is older, add --ignore-time to the mirror command.
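A minimal sketch of both steps, reusing the variables from the question (the cls listing and the --ignore-time flag are only relevant if the timestamps turn out to be the problem):
# Inspect what is actually on the server: names, sizes, modification times.
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; cls -l /public_html/targetfolder"
# If the remote times are older than the local ones, ignore times entirely,
# so only missing files or files with a different size are uploaded.
lftp -c "open -u $FTP_USER,$FTP_PASSWORD ftp.mydomain.com; set ssl:verify-certificate no; mirror -R -n --ignore-time ${HOME}/clone/ /public_html/targetfolder"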

Codeship builds the code first before deployment.
This means that the code in Codeship's temporary server is newer than anything else in your pipeline, even though the code itself may not have changed.
This is why, when you use lftp's "only newer files" option, it effectively matches everything.
As far as I know, you can't upload only the actual newer files.

Related

wget hangs after large file download

I'm trying to download a large file (5 GB) over FTP. Here is my script:
read ZipName
wget -c -N -q --show-progress "ftp://Password@ftp.server.com/$ZipName"
unzip $ZipName
The file downloads to 100% but the script never reaches the unzip command. There is no error message and no output in the terminal, just a blank new line. I have to press CTRL+C and re-run the script to unzip, since on the second run wget detects that the file is already fully downloaded.
Why does it hang like this? Is it because of the large file, or because of passing an argument in the command?
By the way, I can't use the ftp client because it's not on the VM I'm working on, and it's a temporary VM, so I have no root privileges to install anything.
I've run some tests, and I think the available disk space was the reason.
I tried curl -O instead, and it worked with the same disk space.
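A sketch of the curl-based variant (server name and credentials are placeholders, not from the original script):
#!/bin/sh
# Read the file name, fetch it over FTP with curl, then unzip it.
# curl -O saves the download under the file name taken from the URL.
read ZipName
curl -O "ftp://user:password@ftp.server.com/$ZipName" && unzip "$ZipName"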

LFTP to touch every file it's downloading

I'm using lftp to run a backup job from one location to another. It's the only possible solution in this setup, but it works really well. I'm using this command:
/usr/bin/lftp -c "open -p 9922 -u jdoe,passw0rd sftp://sftpsiteurl.com; mirror -c -e -R -L /source-folder /destination-folder/"
But I need to change the created date or modified date on the files coming down. Right now the date on the files is the one from the remote location, and I'm not sure how to change this.
I can see that you can run some kind of script for validating the files coming down, but I'm unsure of the command and I can't seem to find any examples.
Does anybody know if this is possible, and how to do it?
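One possible approach, sketched here as an assumption rather than a confirmed lftp feature: run the transfer first, then reset the timestamps afterwards with find/touch. This only works if the backup lands in a local directory, so the mirror below runs in download direction (without -R); host, credentials and paths are the placeholders from the question.
# Run the backup, then stamp every file in the local destination with the current time.
/usr/bin/lftp -c "open -p 9922 -u jdoe,passw0rd sftp://sftpsiteurl.com; mirror -c -e -L /source-folder /destination-folder/"
find /destination-folder -type f -exec touch {} +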

Speeding up lftp mirroring with many directories

I am trying to mirror a public FTP to a local directory. When I use wget -m {url}, wget quite quickly skips lots of files that have already been downloaded (and have no newer version), but when I use lftp open -u user,pass {url}; mirror, lftp sends MDTM for every file before deciding whether to download it or not. With 2 million+ files in 50 thousand+ directories this is very slow, and besides I get error messages that the MDTM of directories could not be obtained.
In the manual it says that using set sync-mode off will result in sending all requests at once, so that lftp doesn't wait for each response. When I do that, I get error messages from the server saying there are too many connections from my IP address.
I tried running wget first to download only the newer files, but this does not delete the files which were removed from the FTP server, so I follow up with lftp to remove the old files. However, lftp still sends MDTM for each file, which means there is no advantage to this approach.
If I use set ftp:use-mdtm off, then it seems that lftp just downloads all files again.
Could someone suggest the correct setting for lftp with large number of directories/files (specifically, so that it skips directories which were not updated, like wget seems to do)?
Use set ftp:use-mdtm off and mirror --ignore-time for the first invocation to avoid re-downloading all the files.
You can also try upgrading lftp and/or using set ftp:use-mlsd on; in that case lftp will get precise file modification times from the MLSD command output (provided the server supports the command).
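A sketch of how the two runs could look (URL and paths are placeholders; --delete is added here only because the question also wants removed files cleaned up):
# First run: MDTM disabled, timestamps ignored, so existing files are not re-downloaded.
lftp -c "set ftp:use-mdtm off; set ftp:use-mlsd on; open -u user,pass ftp.example.com; mirror --ignore-time --delete /remote/dir /local/dir"
# Later runs: drop --ignore-time and rely on MLSD timestamps (if the server supports MLSD).
lftp -c "set ftp:use-mdtm off; set ftp:use-mlsd on; open -u user,pass ftp.example.com; mirror --delete /remote/dir /local/dir"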

Automatically copy contents of FTP folder to local Windows folder without overwrite?

I need to copy all the files of an FTP folder to my local Windows folder, but without replacing the files that already exist. This would need to be a job/task that runs unattended every hour.
This is what the job would need to do:
1. Connect to FTP server.
2. In ftp, move to folder /var/MyFolder.
3. In local PC, move to c:\MyDestination.
4. Copy all files in /var/MyFolder that do not exist in c:\MyDestination.
5. Disconnect.
I had previously tried the following script using mget * (run from a .bat), but it copies and overwrites everything, which means that even if 1000 files were previously copied, it copies them all again.
open MyFtpServer.com
UserName
Password
lcd c:\MyDestination
cd /var/MyFolder
binary
mget *
Any help is appreciated.
Thanks.
Use wget for Windows.
If you want to include subdirectories (adjust the cut-dirs number according to the depth of your actual remote path):
cd /d C:\MyDestination
wget.exe --mirror -np -nH --cut-dirs=2 ftp://UserName:Password@MyFtpServer.com/var/MyFolder
If you don't want subdirectories:
cd /d C:\MyDestination
wget.exe -nc ftp://UserName:Password@MyFtpServer.com/var/MyFolder/*
The "magic" bit (for this second form) is the -nc option, which tells wget not to overwrite files that are already there locally. Do keep in mind that old files are also left alone, so if a file on your FTP server gets edited or updated, it won't get re-downloaded. If you want to also update files, use -N instead of -nc.
(Note that you can also type wget instead of wget.exe; I just included the extension to point out that these are Windows batch-file commands.)
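Since the job needs to run unattended every hour, one way (an assumption, not part of the original answer) is to put the commands above into a .bat file and register it with the built-in Task Scheduler; the path below is a placeholder:
:: Create an hourly scheduled task that runs the batch file containing the wget command.
schtasks /Create /SC HOURLY /TN "FtpMirror" /TR "C:\Scripts\ftp-mirror.bat"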

Why lftp mirror --only-newer does not transfer "only newer" file?

I want to automate uploading my websites' files. But the remote server does not support SSH, so I am trying the lftp command below instead of rsync.
lftp -c "set ftp:use-mdtm no && set ftp:timezone -9 && open -u user,password ftp.example.com && mirror -Ren local_directory remote_directory"
If the local files are not changed, no files are uploaded by this command. But if I change one file and run the command, all files are uploaded.
I know about lftp/ftp's MDTM problem, so I tried "set ftp:use-mdtm no && set ftp:timezone -9", but all files are still uploaded even though I changed only one file.
Does anyone know why lftp mirror --only-newer does not transfer only the "newer" files?
On the following page
http://www.bouthors.fr/wiki/doku.php?id=en:linux:synchro_lftp
the authors state:
When uploading, it is not possible to set the date/time on the files uploaded, that's why --ignore-time is needed.
So if you use the flag combination --only-newer and --ignore-time, you can achieve decent backup properties, in the sense that all files that differ in size are replaced. Of course it doesn't help if you really need to rely on time synchronization, but if it is just a regular backup of data, it will do the job.
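A sketch of that combination applied to the command from the question (credentials and directories are the question's placeholders):
# With --ignore-time, a file is re-uploaded only if it is missing or differs in size,
# so the unreliable timestamps of uploaded files no longer force a full transfer.
lftp -c "open -u user,password ftp.example.com && mirror -Ren --ignore-time local_directory remote_directory"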
mirror -R -n works for me as a very simple backup of new files
