Good workflow for uploading thousands of files - shell

Every day ~30000 files are created on one server and then need to be moved to a web server. Until now I've been using ncftpput to transfer all the files, but it's very slow and stalls now and then.
My plan now is to compress all the files into a single archive (tar + gzip) and then use scp to transfer the archive to my web server and unpack it there.
Is this a good solution or is there a better way?
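For reference, the archive-and-copy plan can also be done as a single streamed command, so the archive never has to be written to disk on either side. A minimal sketch, assuming a hypothetical source directory /data/outgoing and destination host web01 (the target directory must already exist):

# stream a gzipped tarball over ssh and unpack it on the web server in one pass
tar czf - -C /data/outgoing . | ssh user@web01 'tar xzf - -C /var/www/incoming'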

I would use "rsync" if available on your platform. It is restartable, only does the minimum necessary and very fast and easy to use.

I think it is better than uploading raw files.
But I found this: Slow Upload Speeds via SCP - https://discussion.dreamhost.com/thread-134902.html

Related

Backup strategy ubuntu laravel

I am searching for a backup strategy for my web application files.
I am hosting my (Laravel) application on an Ubuntu (18.04) server in the cloud and currently have around 80 GB of storage that needs to be backed up (and this grows fast). The biggest files are around ~30 MB; the rest are small jpg/txt/pdf files.
I want to make a full backup of the storage directory at least twice a day and store it as a zip file on a local server. I have 2 reasons for this: independence from cloud providers, and archiving.
My first backup strategy was to zip all the contents of the storage folder and rsync the zip; this works until the archive reaches a couple of gigabytes, at which point the server gets completely stuck on CPU usage.
My second approach is plain rsync, but with this I can't track when a file is deleted or added.
I am looking for a good backup strategy that preferably generates zips before or after the backup and stores them, so we can browse and examine the data back in time.
Strangely enough I could not find anything that suits me; I hope someone can help me out.
I agree with @RobertFridzema that the whole server becomes unresponsive when using the ZIP functionality from the Spatie package.
I had the same situation on a customer project. My suggestion is to keep the source code files in version control and back up only the dynamic/changing files with rsync (incremental works best and is fast), and to set up a separate database backup strategy. For example, with MySQL/MariaDB: run mysqldump, encrypt the resulting file, and move it to external storage as well.
If ZIP creation is still a problem, I would maybe use storage that is already set up with RAID functionality, or if that is not possible, I would definitely not run the ZIP creation on the live server: rsync incrementally to another server and do the backup work there.
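A minimal sketch of that split, run from the backup server; the hosts, paths, database name, and GPG recipient are all hypothetical, and mysqldump is assumed to find its credentials in the usual option files:

# pull only the changed files from the live server (incremental, low CPU on the source)
rsync -az --delete user@live:/var/www/app/storage/ /backups/app/storage/

# dump the database on the live server, compress it in transit, then encrypt it locally
ssh user@live 'mysqldump --single-transaction appdb | gzip' > /backups/app/appdb.sql.gz
gpg --encrypt --recipient backup@example.com /backups/app/appdb.sql.gz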
Spatie has a package for Laravel backups that can be scheduled with the Laravel job scheduler. It will create zips of the entire project, including the storage directories:
https://github.com/spatie/laravel-backup
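If you go that route, the backup can also be triggered from a plain cron entry via the package's artisan command; a minimal sketch, assuming the project lives in /var/www/app and that backup:run is the command exposed by the package version you install:

# run a full backup twice a day (03:00 and 15:00)
0 3,15 * * * cd /var/www/app && php artisan backup:run >> /var/log/app-backup.log 2>&1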

advice: manage file transfer between multiple remote servers

Hoping for a bit of expert advice on best practice.
At the moment I have about 5 servers (some dedicated, some EC2 instances). I need to transfer files between them. I have one server running an SFTP server which I use as a 'middle-man', but this means I'm double-handling data (upload to the SFTP server, then download from the other server).
How do others manage this sort of thing? Should I run an SFTP server on each server? Or is there a simpler SSH-based method?
Instead of an SFTP server as your 'middle-man', use Amazon S3. Create pick-up/drop-off locations on Amazon S3 that your servers can push to and pull from. It's cheap, extremely reliable and scalable. File transfers can be scripted from the command line or done via a GUI (just like FTP).
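A minimal sketch of that push/pull pattern with the AWS CLI; the bucket name and paths are hypothetical, and credentials are assumed to be configured on each server:

# on the producing server: push finished files to the shared drop location
aws s3 cp /data/outbox/report.csv s3://my-transfer-bucket/drop/

# on the consuming server: pull everything waiting in the drop location
aws s3 sync s3://my-transfer-bucket/drop/ /data/inbox/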

Download large files using SFTP or HTTPS

We have an application that generates many files of 2 GB-10 GB each. We want to save these files on a server and allow specific customers to download them. The system will delete the files after 30 days (we have around 30 customers).
From your experience, which download method should we use, SFTP or HTTPS, and why?
And do you have any suggestions on how to guarantee download security?
Depends on who downloads what.
If it is customers downloading the files, then make things as easy as possible: offer HTTPS and recommend one or two download managers you have tested that can resume a broken download.
For internal use (backups and the like) I strongly suggest rsync via ssh. It is much easier to use, since you can do incremental downloads: only those files are downloaded that do not exist locally or have changed remotely. That means you can simply trigger synchronization on a daily basis and the files will accumulate locally over time just as they are created remotely.
When using sftp or rsync via ssh, the ssh server should be configured to accept only keys for authentication, not passwords, as this is more secure.
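A minimal sketch of the internal-use path, assuming key-based access to a hypothetical host files.example.com:

# incremental pull over ssh; only new or changed files are transferred
rsync -az -e "ssh -i ~/.ssh/transfer_key" user@files.example.com:/srv/exports/ /local/archive/

# on the server side, /etc/ssh/sshd_config should contain:
#   PasswordAuthentication no
#   PubkeyAuthentication yes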

best practices for uploading many files to live server while updating database

I have roughly 200 files that I need to push to our live server after business hours. In addition to this push, I have a few database updates that need to run in conjunction with the roll-out.
What has been done in the past on this system is to create a directory of the updated files on the server and a cron script that copies those files over their previous versions on the server, and then executes the database calls.
Here are the problems I am trying to work around:
1) There is no staging server.
2) There is no easy way to push from our version control (svn) to our live server.
3) There are a lot of files and the directory structure is deep, so setting up a copy of the directories to be copied over on the server seems precarious and time-consuming.
What's the best way to do this?
The way I've done similar things in the past is to have a cron job run a script on an administrative machine that:
1) checks out the files I need for the production server onto some sort of staging machine
2) rsyncs the files onto the server
3) runs a post-rsync script on the server (say, by ssh'ing to the server)
However, you specify that you have no ability to use a staging machine, by which I assume you mean that you have no administrative machine at all, and that you cannot check out your repository on the server either. That makes doing this cleanly far harder. Are you sure you can't at least use your workstation or some similar box as an administrative or staging machine here?
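A minimal sketch of that three-step workflow, assuming an administrative machine is available; the repository URL, hosts, and script paths are hypothetical:

# 1) check out / export the needed files onto the admin machine
svn export https://svn.example.com/repo/trunk /tmp/release

# 2) push them to the live server
rsync -az --delete /tmp/release/ deploy@live.example.com:/var/www/site/

# 3) run the post-deploy steps (e.g. the database updates) on the server
ssh deploy@live.example.com 'bash /var/www/site/scripts/post_deploy.sh'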

How do I precache files on a Perforce proxy server in order to get decent speed?

The Perforce proxy only caches files when a user syncs them, so if you are the only user of the proxy syncing those locations, you will gain almost nothing from it.
The question is how to make the proxy cache the files for you in advance.
The solution is to create a scheduled task/crontab entry that simulates a sync for you.
You will have to create a client spec (workspace) for this, e.g. one named prefetch, and then run:
export P4CLIENT=prefetch
p4 -Zproxyload sync //depot/main/...
This command will not copy the files to your client; it will only tell the proxy to cache them.
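A minimal sketch of the scheduled job; the proxy address and depot path are hypothetical, and the p4 -Zproxyload sync call is the one shown above:

# run every night at 02:00 so the proxy cache is warm before the workday
0 2 * * * env P4PORT=perforce-proxy:1666 P4CLIENT=prefetch p4 -Zproxyload sync //depot/main/...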
