FTP backup script with hard links using bash

Usually I use rsync-based backups.
But now I have to write a backup script that copies from a Windows server to Linux.
So there is no rsync, only FTP.
I like the idea of using hard links to save disk space and incremental transfers to minimize traffic.
Is there a similar backup script that works over FTP instead of rsync?
UPDATE:
I need to back up the Windows server over FTP. The backup script runs on the Linux backup server.
SOLUTION:
I found a useful script that backs up over FTP with hard links and incremental transfers.
Note for Ubuntu users: there is no md5 command in Ubuntu. Use md5sum instead:
# filehash1="$(md5 -q "$curfile"".gz")"
# filehash2="$(md5 -q "$mysqltmpfile")"
filehash1="$(md5sum "$curfile"".gz" | awk '{ print $1 }')"
filehash2="$(md5sum "$mysqltmpfile" | awk '{ print $1 }')"

Edit, since the setup was not clear to me from the original question.
Based on the update to the question, the situation is that you need to pull the data from the Windows system onto the backup server via FTP. In this case you could adapt the script you found yourself (see comment) or use a similar idea, sketched below:
Use cp -lr to clone the previous backup with hard links.
Use lftp's mirror command to overwrite this copy with anything that got updated on the remote system.
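As a rough illustration of that idea (untested; the host, credentials, and paths are placeholders you must adapt, and lftp is assumed to be installed):
#!/bin/bash
# Sketch of a pull-style incremental FTP backup using hard links.
backup_root="/backup/windows-server"
today="$(date +%Y-%m-%d)"
prev="$(ls -1d "$backup_root"/????-??-?? 2>/dev/null | tail -n 1)"
dest="$backup_root/$today"
mkdir -p "$dest"
# Clone the previous snapshot with hard links so unchanged files take no extra space.
if [ -n "$prev" ] && [ "$prev" != "$dest" ]; then
    cp -al "$prev/." "$dest/"
fi
# Let lftp's mirror replace only files that are newer on the remote side.
# Caveat: verify that your lftp version writes changed files as new files rather than
# in place; an in-place write would also modify the hard-linked previous snapshot.
lftp -u "$FTP_USER,$FTP_PASS" "$FTP_HOST" -e "mirror --only-newer / $dest; quit"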
But I initially assumed that you need to push the data from the Windows system to the backup server, that is, that the FTP server is on the backup system. That case cannot be handled this way (original answer follows):
Since FTP has no idea of links at all, any transfer will only result in new or overwritten files. The only way would be to use the SITE command to issue server-specific commands and deal with hard links that way. But SITE commands are usually heavily restricted, so you can typically do something like change permissions but not do anything with hard links.
And even if you could support hard links with SITE, you would have to implement the logic which decides when to use such links. With rsync this logic is built into the rsync server and executed on the server side. With FTP you would have to build all the logic on the client side, which means you would have to download a file, compare it with a local file, and then decide whether to upload the new file or whether a hard link to an existing file could be used.

Related

Retrieving latest file in a directory from a remote server

I was hoping to crack this myself, but it seems I have fallen at the first hurdle because I can't make head nor tail of the other options I've read about.
I wish to access a database file hosted as follows (i.e. the hhsuite_dbs is a folder containing several databases)
http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_08Oct15.tgz
Periodically they update these databases, and so I want to download the latest version. My plan is to run a bash script via cron, most likely monthly (though I've yet to even tackle the scheduling aspect of the task).
I believe the database is refreshed fortnightly, so if my script runs monthly I can expect there to be a new version. I'll then be running downstream programs that require the database.
My question, then, is how do I go about retrieving this (and, for a little more finesse, I'd perhaps like to check whether the remote file has changed in name or content, to avoid an unnecessary large download)? Is the best approach to query the name of the file, or its last-modified date (given that they may change the naming syntax of the file too)? To my naive brain, some kind of globbing on pdb70 (something I think I can rely on to be in the filename), then pulling it down with wget, was all I had come up with so far.
EDIT: Another confounding issue that has just occurred to me is that the file I want won't necessarily be the newest in the folder (as there are other types of databases there too); rather, I need the newest version of, in this case, the pdb70 database.
Solutions I've looked at so far have mentioned weex, lftp, and curlftpfs, but all of these seem to require logins/passwords for the server, which I don't have/need if I just download it via the web. I've also seen mention of rsync, but on a cursory read it seems like people are steering clear of it for FTP use.
Quite a few barriers in your way for this.
My first suggestion is that rather than getting the filename itself, you simply mirror the directory using wget, which should already be installed on your Ubuntu system, and let wget figure out what to download.
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
And new files will be created in the "safe" directory.
But that just gets you your mirror. What you're still after is the "newest" file.
Luckily, wget sets the datestamp of files it downloads, if it can. So after mirroring, you might be able to do something like:
newestfile=$(ls -t /some/place/safe/pdb70*gz | head -1)
Note that this fails if ever there are newlines in the filename.
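If you want to avoid parsing ls output entirely, a small sketch that compares modification times in the shell instead (the path and pattern are the same placeholders as above):
# Newline-safe alternative: let the shell compare modification times itself.
newestfile=
for f in /some/place/safe/pdb70*gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    if [ -z "$newestfile" ] || [ "$f" -nt "$newestfile" ]; then
        newestfile=$f
    fi
done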
Another possibility might be to check the difference between the current file list and the last one. Something like this:
#!/bin/bash
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
rm index.html* *.gif # remove debris from mirroring an index
ls > /tmp/filelist.txt.$$
if [ -f /tmp/filelist.txt ]; then
    echo "Difference since last check:"
    diff /tmp/filelist.txt /tmp/filelist.txt.$$
fi
mv /tmp/filelist.txt.$$ /tmp/filelist.txt
You can parse the diff output (man diff for more options) to determine what file has been added.
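For instance, a minimal way to keep only the added names (the "> " lines of a default diff), run before the mv at the end of the script while both lists still exist:
# Lines starting with "> " are entries that appear only in the new listing.
added=$(diff /tmp/filelist.txt /tmp/filelist.txt.$$ | sed -n 's/^> //p')
echo "New files since last check: $added"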
Of course, with a solution like this, you could run your script every day and hopefully download a new update within a day of it being ready, rather than a fortnight later. Nice thing about --mirror is that it won't download files that are already on-hand.
Oh, and I haven't tested what I've written here. That's one monstrously large file.

Bash script for recursive directory listing on FTP server without -R

There are multiple folders with subfolders and image files on the FTP server. The -R option is disabled. I need to dump the recursive directory listing, with path names, into a text file. The logic I have so far is: traverse into each folder, check whether the entry name contains a '.' to tell files from folders, and if it is a folder, go in and check for subfolders or files and list them. Since I cannot use -R, I have to use a function that traverses each folder.
#!/bin/sh
ftp_host='1.1.1.1'
userName='uName'
ftp -in <<EOF
open $ftp_host
user $userName
recurList() {
path=`pwd`
level=()
for entry in `ls`
do
`cwd`
close
bye
EOF
I am stuck with the argument for the for loop!
Sorry to see you didn't get any replies yet. I think the reason may be that Bash isn't a good way to solve this problem, since it requires interacting with the FTP client, i.e. sending commands and reading responses. Bash is no good at that sort of thing. So there is no easy answer other than "don't use Bash".
I suggest you look at two other tools.
Firstly, you may be able to get the information you want using http://curlftpfs.sourceforge.net/. If you mount the FTP server using curlftpfs, then you can use the find command to dump the directory structure. This is the easiest option... if it works!
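A minimal sketch of that route, reusing the host and user name from the question (curlftpfs and FUSE assumed installed; the password, mount point, and output file are placeholders):
#!/bin/sh
# Mount the FTP server as a local file system, dump a recursive listing, unmount.
mkdir -p /tmp/ftpmount
curlftpfs "ftp://uName:PASSWORD@1.1.1.1/" /tmp/ftpmount
find /tmp/ftpmount > /tmp/ftp_listing.txt
fusermount -u /tmp/ftpmount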
Alternatively, you could write a program using Python with the ftplib module: https://docs.python.org/2/library/ftplib.html. The module allows you to interact with the FTP server through API calls.

Windows Command Line FTP to deploy website

I'm trying to set up a post-build script on my CI server to push changes to our web server by FTP. In as few lines as possible, how can I push a folder of files to my web server using Windows FTP? For example, the deployment folder is:
c:\deployment\*.*
How can I recursively push all files to replace those on the web server?
I'm open to using cmd or powershell - MS Windows only
Thanks
Windows' built-in command-line FTP client doesn't support recursion. The easiest way would be to use a different FTP client. NcFTP will do what you're looking for. See the manual page for ncftpput. The syntax is basically as follows:
cd c:\deployment
ncftpput -u user -p pass -R ftp.ftpserver.com /path/on/ftp/server .\*
Or if your web server also runs an ssh service, then rsync would be even better.
Fsync is good; I have been using it for a long time. It pushes only what has changed, handles recursion of course, can exclude files, and tracks what has changed on the client side (much faster). Its biggest drawback: no SFTP support.

Find oldest item in a folder SVN

I want to write a bash script that will store 10 backups of a website in SVN, backing it up nightly and then deleting the oldest backup.
Is there an SVN command that gives me the age of the files in SVN, so that I can programmatically call "svn delete" on the oldest one?
Subversion is definitely not the tool for this job. Once you commit something to subversion, there is no practical way to delete it.
There are a lot of ways to achieve your goal using standard commands in bash. You can use tools like ftp, wget, curl, scp, ssh, or whatever to download your site files, then tar and zip them up with different file names based on the date.
#!/bin/bash
DELETEME='htdocs_'`date '+%Y%m%d' -d '-10 days'`'.tar.gz'
NEW='htdocs_'`date '+%Y%m%d'`'.tar.gz'
SOURCE='/path/on/server/to/backup'
HOST='IP_or_hostname'
USER='user_on_HOST'
ssh $USER@$HOST tar czvf - $SOURCE > $NEW
rm -v $DELETEME
Then just schedule this as a daily cron job.
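For instance, a crontab entry along these lines (the script path and log file are placeholders) runs it every night at 02:30:
# m h dom mon dow  command
30 2 * * * /path/to/website-backup.sh >> /var/log/website-backup.log 2>&1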
It doesn't sound like you understand how Subversion works.
Subversion is a version control system. You really use it the other way around: you keep your web pages and JavaScript in Subversion and then deploy your website from Subversion. You have a complete history of all of your files in Subversion, and you can use features like tagging to mark specific revisions of your website. This way, you can find out who made changes and why they were made.
It sounds like you simply want to make a backup of your website, and then delete the oldest backup to save room.
You should look into rsync which is really great for backups. Rsync is fast and is pretty simple to use.
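As a rough sketch of that approach (host and paths are placeholders), each nightly run can hard-link unchanged files against the previous snapshot so backups stay small:
# Incremental backup with rsync; unchanged files are hard-linked to the previous copy.
today=$(date +%Y%m%d)
rsync -a --delete --link-dest=/backups/latest \
      user@webhost:/path/on/server/to/backup/ "/backups/$today/"
ln -sfn "/backups/$today" /backups/latest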
You can look at the Subversion online manual and read the first two or three chapters. It'll explain how Subversion is used and it's one of the best manuals for open source software out there. After you read it, you might decide to use Subversion after all, but not for backups, but for development.

How can I ftp multiple files?

I have two Unix servers between which I need to FTP some files.
The directory structure is almost the same, with a slight difference, like:
server a                 server b
miabc/v11_0/a/b/c/*.c    miabc/v75_0/a/b/c/
miabc/v11_0/xy/*.h       miabc/v11_0/xy/
There are many modules:
miabc
mfabc
The directory structure inside them is the same on both servers except for v11_0 versus v75_0, and the directory structure inside different modules differs.
How can I FTP all the files in every module into the corresponding module on server b using a scripting language like awk, Perl, shell, or ksh?
I'd say if you want to go with Perl, you have to use Net::FTP.
Once I needed a script that diffs a directory/file structure on an FTP server with a corresponding directory/file structure on a local hard disk, which led me to write this script. I don't know if it is efficient or elegant, but you might find an idea or two in it.
hth / Rene
You need to use the correct path of the directory where you want to send the files.
You can create a small script with PHP.
PHP provides good FTP functions, so you can easily FTP your files with it. But before that, check the FTP settings of your IIS server or FileZilla.
I have used the following code for sending files over FTP; this is in PHP:
$conn_id = ftp_connect($FTP_HOST) or die("Couldn't connect to ".$FTP_HOST);
$login_result = ftp_login($conn_id, $FTP_USER, $FTP_PW);
ftp_fput($conn_id, $from, $files, $mode); // this is the function to put a file on the FTP server
This code is just for reference; go through the PHP manual before using it.
I'd use a combination of Expect, lftp and a recursive function to walk the directory structure.
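For example, a minimal sketch with lftp's reverse mirror, which uploads a local tree to the remote path (host, credentials, and paths are placeholders):
# Push a local module tree to the corresponding remote module (mirror -R uploads).
lftp -u user,password ftp.serverb.example -e "mirror -R /local/miabc/v11_0/ /miabc/v75_0/; bye"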
If the file system supports symlinking or hard linking, I would use a simple wget to mirror the FTP server. On one of them, when you're wgetting, just hack the directory v11_0 to point to 75_0; wget won't know the difference.
server a:
go to /project/servera
wget the whole thing. (this should place them all in /project/servera/miabc/v11_0)
server b:
go to /project/serverb
create a directory /project/serverb/miabc/75_0, link it to /project/servera/v11_0:
ln -s /project/serverb/miabc/75_0 /project/servera/v11_0
wget server b; the link will be followed, so when wget tries to cwd into 75_0 it will find itself in /project/servera/v11_0.
Don't make the project harder than it needs to be: read the docs on wget, and ln. If wget doesn't follow symbolic links, file a bug report, and use a hard link if your FS supports it.
It sounds like you really want rsync instead. I'd try to avoid any programming in solving this problem.
I suggest you log in on either server first and go to the appropriate path, miabc/v75_0/a/b/c/. From there you need to sftp to the other server:
sftp user@servername
Go to the appropriate path whose files need to be transferred.
Then run the command mget *
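To run that non-interactively, a small sketch using sftp's batch mode (key-based login assumed; the paths are placeholders, and plain get expands wildcards too):
# Pull all files for one module in a single batch; repeat per module/path as needed.
sftp -b - user@servername <<'EOF'
lcd /local/miabc/v11_0/a/b/c
cd /path/to/miabc/v75_0/a/b/c
get *
EOF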
