AWS cli skip files that are in use? - bash

How does the AWS cli tool handle files that are in use?
My log files are created hourly (log-2018.01.26.13, 14, 15, etc.). I was thinking of making a very simple aws cli mv script to move the files to an S3 bucket and having it run every 10 minutes through cron so I get the logs as soon as possible.
However, there will be files that haven't finished writing yet. Is the AWS CLI smart enough to leave those files alone, or do I need extra logic that first checks whether files are in use?
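Something like the sketch below is what I had in mind (the script name, log directory and bucket name are just placeholders), with the current hour's file skipped explicitly in case that extra logic turns out to be needed:

#!/bin/bash
# move_logs.sh - move finished hourly logs to S3; run from cron every 10 minutes:
#   */10 * * * * /usr/local/bin/move_logs.sh
current="log-$(date +%Y.%m.%d.%H)"
for f in /var/log/app/log-*; do
    [ -e "$f" ] || continue                            # no logs present yet
    [ "$(basename "$f")" = "$current" ] && continue    # skip the log still being written
    aws s3 mv "$f" s3://my-log-bucket/
done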

Related

I need to write a shell script for transferring files to a GCS bucket using gsutil, but I don't know how to write it completely. Please explain from the start.

#!/bin/bash
gsutil -m cp -r dir gs://my-bucket
I wrote this code for transferring files from my machine to a GCS bucket, but I think I also need to configure the project and a service account. Please tell me the code for this.
The steps are basically:
Create project
Attach billing to the project
Create the bucket
Assign privilege to the bucket (if you open it up to all users for testing make sure you get rid of it or lock it down later)
Upload/download an object via the GUI
Upload objects via gsutil
These two quickstarts should be followed in order and should cover everything you need to achieve what you're talking about.
Quickstart Storage
Quickstart Gsutil
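Once the project and billing exist (easiest via the console, per the quickstarts), the remaining steps can be done from the shell. A rough sketch, where the key file, project ID, location and bucket name are all placeholders:

# authenticate as the service account and point gcloud at the project
gcloud auth activate-service-account --key-file=/path/to/service-account-key.json
gcloud config set project my-project-id

# create the bucket once, then upload the directory
gsutil mb -l us-central1 gs://my-bucket
gsutil -m cp -r dir gs://my-bucket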

How to utilize shell script and AWS CLI to automatically copy a file daily from one S3 bucket to another?

I'd like to create a way (using shell scripts and AWS's CLI) so that the following can be automated:
Copy specific files from an s3 bucket
Paste them into a different bucket in S3.
Would the below 'sync' command work?
aws s3 sync s3://directory1/bucket1 s3://directory2/bucket2 --exclude "US*.gz" --exclude "CA*.gz" --include "AU*.gz"
The goal here is to ONLY transfer files whose filenames begin with "AU" and exclude everything else, in as automated a fashion as possible. Also, is it possible to exclude very old files?
The second part of the question is: what do I need to add to my shell script in order to automate this process as much as possible, given that "AU" files get dropped in this folder every day?
Copy objects
The AWS CLI can certainly copy objects between buckets. In fact, it does not even require files to be downloaded — S3 will copy directly between buckets, even if they are in different regions.
The aws s3 sync command is certainly an easy way to do it, since it will replicate any files from the source to the destination without having to specifically state which files to copy.
To only copy AU* files, use: --exclude "*" --include "AU*"
See: Use of Exclude and Include Filters
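For example, a minimal form of the command (the bucket and prefix names below are placeholders, not the actual buckets from the question) could look like:

aws s3 sync s3://source-bucket/incoming/ s3://destination-bucket/incoming/ --exclude "*" --include "AU*"

Note that filters appearing later in the command take precedence, so the --exclude "*" must come before the --include "AU*".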
You asked about excluding old files: the sync command will sync all files, so any files that were previously copied will not be copied again. By default, files deleted from the source will not be deleted in the destination unless specifically requested (via the --delete option).
Automate
How to automate this? The most cloud-worthy way to do this would be to create an AWS Lambda function. The Lambda function can be automatically triggered by an Amazon CloudWatch Events rule on a regular schedule.
However, the AWS CLI is not installed by default in Lambda, so it might be a little more challenging. See: Running aws-cli Commands Inside An AWS Lambda Function - Alestic.com
It would be better to have the Lambda function do the copy itself, rather than calling the AWS CLI.
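That said, if you prefer to stay with the shell-script approach from the question rather than Lambda, the sync command above can simply be wrapped in a small script and scheduled with cron (the paths and schedule below are placeholders):

#!/bin/bash
# copy_au_files.sh - copy only the AU* objects between buckets
aws s3 sync s3://source-bucket/incoming/ s3://destination-bucket/incoming/ --exclude "*" --include "AU*"

# crontab entry: run daily at 02:00 and keep a log of each run
# 0 2 * * * /usr/local/bin/copy_au_files.sh >> /var/log/copy_au_files.log 2>&1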
Alternative idea
Amazon S3 can be configured to trigger an AWS Lambda function whenever a new object is added to an S3 bucket. This way, as soon as the object is added in S3, it will be copied to the other Amazon S3 bucket. Logic in the Lambda function can determine whether or not to copy the file, such as checking that it starts with AU.
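As a sketch of how that trigger might be wired up from the CLI (the function ARN, account ID and bucket name are hypothetical, and the prefix filter is assumed to be enough to catch the AU files):

# the function must already grant s3.amazonaws.com permission to invoke it (aws lambda add-permission)
aws s3api put-bucket-notification-configuration --bucket source-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:copy-au-files",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": "AU"}]}}
    }]
  }'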

Backup strategy (Ubuntu, Laravel)

I am searching for a backup strategy for my web application files.
I am hosting my (Laravel) application on an Ubuntu (18.04) server in the cloud and currently have around 80 GB of storage that needs to be backed up (and this grows fast). The biggest files are around ~30 MB; the rest are small jpg/txt/pdf files.
I want to make a full backup of the storage directory at least twice a day and store it as a zip file on a local server. I have two reasons for this: independence from cloud providers, and archiving.
My first backup strategy was to zip all the contents of the storage folder and rsync the zip. This goes well up to a couple of gigabytes, then the server gets completely stuck on CPU usage.
My second approach is plain rsync, but with this I can't track when a file is deleted/added.
I am looking for a good backup strategy that preferably generates zips before or after the backup and stores them, so we can browse and examine them back in time.
Strangely enough I could not find anything that suits me; I hope anyone can help me out.
I agree with @RobertFridzema that the whole server becomes unresponsive when using the ZIP functionality from the spatie package.
I had the same situation with a customer project. My suggestion is to keep the source code files within version control. Just back up the dynamic/changing files with rsync (incremental works best and fast) and create a separate database backup strategy. For example with MySQL/MariaDB: mysqldump, encrypt the resulting file and move it to an external storage as well.
If ZIP creation is still a problem, I would use storage that is already set up with RAID functionality, or if that is not possible, I would definitely not run the ZIP creation on the live server: rsync incrementally to another server and do the backup there.
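To make that concrete, a rough sketch of the rsync-plus-mysqldump approach described above; the paths, backup host, database name and passphrase file are all placeholders:

#!/bin/bash
# Incremental file backup: only changed files are transferred, so the CPU spike
# from zipping ~80GB on the live server is avoided.
rsync -az --delete /var/www/app/storage/ backup-host:/backups/app-storage/

# Separate database backup: dump, compress and encrypt before it leaves the server.
stamp=$(date +%F-%H%M)
mysqldump --single-transaction mydb | gzip \
  | openssl enc -aes-256-cbc -pbkdf2 -pass file:/root/.backup_passphrase \
  > "/tmp/mydb-$stamp.sql.gz.enc"
rsync -a "/tmp/mydb-$stamp.sql.gz.enc" backup-host:/backups/db/ && rm -f "/tmp/mydb-$stamp.sql.gz.enc"

Note that this keeps only the current state of the storage directory on the backup host; for the "browse back in time" requirement you would layer something like rsync --link-dest snapshots or a snapshotting tool on top.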
Spatie has a package for Laravel backups that can be scheduled in the Laravel job scheduler. It will create zips of the entire project, including the storage dirs:
https://github.com/spatie/laravel-backup

How to delete X old files in Amazon S3 using the Amazon CLI

I have found this regarding deletion of old files in bash:
Delete all but the most recent X files in bash
I want the same functionality; however, I cannot apply the same principles in my script, as it is interacting with an Amazon S3 directory.
Does anyone know how to use amazon CLI to achieve this?
Well you can just create a lifecycle rule on S3 to delete older files. Then this process is done automatically for you.
Otherwise, I guess you need to LIST all the objects' metadata and write a script that checks whether each object is old enough. But if you have a lot of objects this can be quite costly, while the lifecycle rule is free.
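A sketch of such a lifecycle rule via the CLI (the bucket name, prefix and 30-day expiry are placeholders; note that this expires objects by age rather than keeping only the most recent X):

aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-old-objects",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Expiration": {"Days": 30}
    }]
  }'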

Sync a bucket with AWS ruby tools

I am using Amazon's official aws-sdk gem, but I can't seem to find any functionality that works like the command line tool aws s3 sync <path> <bucket>. Does it exist, or am I forced to upload each file separately (slow)?
You don't have an API call that achieves that.
The sync is basically a call to list the objects, a call to inspect your local path, and after that uploads/downloads to bring the two locations in sync; that's what the AWS CLI tool does under the hood.

Resources