Scheduled process to copy files out of S3 into a temp-folder in Ubuntu 18.04 - bash

Looking for recommendations for the following scenario:
On an Ubuntu 18.04 server, check an AWS S3 bucket every minute for new files, fetch only the newest files into a temp folder, and remove them at the end of the day.
It should be automated in bash.
I proposed using S3 event notifications, queues and Lambda, but it was decided that it is best to keep it simple.
I am looking for recommendations for the steps described below.
For step 1 I was doing aws s3 ls | awk (with a function to filter files updated within the last minute); then I realized it was easier to do it with grep. A sketch of this is shown below the list.
0-Cron job should run from 7:00 to 23:00 every minute
1-List the files uploaded to the S3 bucket during the past 1 minute
2-List the files in a temp-encrypted folder on Ubuntu 18.04
3-Check whether the files listed in step 1 are already downloaded in the temp-encrypted folder from step 2
4-If the files are not already downloaded, download the newest files from the S3 bucket into temp-encrypted
5-At the end of the day (23:00), take a record of the last files fetched from S3
6-Run a cleanup script at the end of the day to remove everything in temp-encrypted
I attach a diagram with the intended process and infrastructure design.
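A minimal sketch of steps 0 and 1 (the bucket name, script paths and timestamp handling are assumptions, not part of the original setup):

#!/usr/bin/env bash
# Step 0: crontab entries could look like
#   * 7-22 * * * /opt/scripts/list-new-s3-files.sh   # every minute from 07:00 to 22:59
#   0 23  * * * /opt/scripts/end-of-day.sh            # final run at 23:00
set -euo pipefail

BUCKET="s3://my-bucket"        # assumed bucket name
# aws s3 ls prints timestamps in the machine's local time, so build the cutoff in local time too
CUTOFF=$(date -d '1 minute ago' '+%Y-%m-%d %H:%M')

# Step 1: `aws s3 ls --recursive` prints lines such as "2023-05-01 10:22:33   1234 path/to/key".
# Keep the keys whose "date HH:MM" prefix is at or after the one-minute cutoff
# (keys containing spaces would need extra handling).
aws s3 ls "$BUCKET" --recursive | awk -v cutoff="$CUTOFF" '
  ($1 " " substr($2, 1, 5)) >= cutoff { print $4 }
'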

The solution was like this:
1. Change FTPS to SFTP running on Ubuntu 18.04
2. Change the main ports: randomport1 for SSH and randomport2 for SFTP
3. Configure SFTP in the sshd_config file
4. Once everything is working, create the local directory structure
5. Then, using a bash script:
5.1 List what is in S3 and save it in a variable
5.2 For each of the files listed in S3, check whether it is a new file with no mirrored counterpart in the local directory s3-mirror
5.3 If there is a new file, fetch it, touch a file with empty contents and the same name in the s3-mirror directory, move the encrypted file to the SFTP area, and remove the fetched S3 file from the mirrored local directory
5.4 Record successful actions in a log.
So far it works well. A sketch of the script is shown below.
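A minimal sketch of one way to implement steps 5.1-5.4 (the bucket name, directories and log path are assumptions):

#!/usr/bin/env bash
set -euo pipefail

BUCKET="s3://my-bucket"            # assumed bucket name
MIRROR="/srv/s3-mirror"            # holds zero-byte markers for already-fetched keys
SFTP_DIR="/srv/sftp/incoming"      # assumed SFTP landing directory
LOG="/var/log/s3-fetch.log"

# 5.1 list what is in S3 and save it in a variable (object keys only)
keys=$(aws s3 ls "$BUCKET" --recursive | awk '{print $NF}')

for key in $keys; do               # assumes keys without spaces
    name=$(basename "$key")

    # 5.2 skip keys that already have a marker in s3-mirror
    [ -e "$MIRROR/$name" ] && continue

    # 5.3 fetch the new file, leave an empty marker with the same name, move the file to the SFTP area
    aws s3 cp "$BUCKET/$key" "/tmp/$name"
    touch "$MIRROR/$name"
    mv "/tmp/$name" "$SFTP_DIR/$name"

    # 5.4 record the successful action in a log
    echo "$(date '+%F %T') fetched $key" >> "$LOG"
done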

Related

How can I confirm if azcopy synced successfully a dir?

I have a group of files and directories stored in a container in a storage account in Azure. I am using the sync operation of azcopy to bring only the files and directories that I am missing into a local directory. When executing the command I use the --delete-destination true flag and the --recursive flag. The command is executed as part of a bash script.
How can I check if the sync process was successful? I've noticed that azcopy doesn't necessarily return a zero exit code even though the sync was successful. Other users have mentioned that checking the exit code with the copy operation of azcopy has worked for them. However, it seems that the story is different with the sync operation.
Currently what I do is delete from my local directory a file that I know will always exist in every Azure container I have to sync with azcopy sync. After deleting the file, I run azcopy sync and, when it finishes, I check whether the file I had deleted was restored... This is clearly not the ideal solution.
I am considering checking the logs from each one of the jobs that azcopy creates, or exploring the --mirror-mode flag or even figuring out if the details provided by --dry-run can help me review if everything went according to plan.
However, all these options seem like too much for something that should be much simpler. So, most likely, there is something here that I am missing...
I tried the same steps in my environment and got the results below. First, I copied from the storage account to the local environment using the azcopy copy command.
Command:
azcopy copy "https://venkat123.blob.core.windows.net/container1/<SAS-token>" "C:\Users\xxxx" --recursive
The above command copied two files to the local environment.
You mentioned: "delete in my local directory a file that I know will always exist in all Azure containers that I have to sync with azcopy sync. After deleting the file, I run azcopy sync and when finished I check if the file I had deleted was restored."
I tried the same condition in my environment: I deleted the file in the local environment and then synced from local to the storage account with the command below.
azcopy sync "C:\Users\xxxxx" "https://venkat123.blob.core.windows.net/container1<SAS-Token>" --recursive --delete-destination=true --mirror-mode
Console:
The above command synced the local environment with the Azure blob container, the file was deleted in the Azure portal as well, and the logs from the command can be seen below.
azcopy sync "C:\Users\v-vsettu\xxxx" "https://venkat123.blob.core.windows.net/container1<SAS Token>" --recursive --delete-destination=true --mirror-mode
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
Job d8d2e3c3-d583-0a4c-6841-da4c919004d0 has started
Log file is located at: C:\Users\v-vsettu\.azcopy\d8d2e3c3-d583-0a4c-6841-da4c919004d0.log
INFO: azcopy.exe: A newer version 10.17.0 is available to download
100.0 %, 1 Done, 0 Failed, 0 Pending, 1 Total, 2-sec Throughput (Mb/s): 0.0103
Job d8d2e3c3-d583-0a4c-6841-da4c919004d0 Summary
Files Scanned at Source: 1
Files Scanned at Destination: 1
Elapsed Time (Minutes): 0.067
Number of Copy Transfers for Files: 1
Number of Copy Transfers for Folder Properties: 0
Total Number Of Copy Transfers: 1
Number of Copy Transfers Completed: 1
Number of Copy Transfers Failed: 0
Number of Deletions at Destination: 0
Total Number of Bytes Transferred: 2575
Total Number of Bytes Enumerated: 2575
Final Job Status: Completed
Reference:
azcopy sync | Microsoft Learn
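If the exit code alone is not reliable, one option (a sketch, not part of the original answer; the paths and SAS token are placeholders) is to capture the azcopy sync output and check the "Final Job Status" and failed-transfer count from the job summary shown above:

#!/usr/bin/env bash
set -uo pipefail

SRC="/data/local-dir"                                                 # placeholder source
DST="https://venkat123.blob.core.windows.net/container1<SAS-Token>"   # placeholder destination

out=$(azcopy sync "$SRC" "$DST" --recursive --delete-destination=true 2>&1)
echo "$out"

# The job summary contains lines such as "Final Job Status: Completed"
# and "Number of Copy Transfers Failed: 0"; adjust the match if your azcopy
# version reports other final statuses (e.g. completed-with-errors variants).
if echo "$out" | grep -qE 'Final Job Status: Completed$' \
   && echo "$out" | grep -q 'Number of Copy Transfers Failed: 0'; then
    echo "sync OK"
else
    echo "sync reported failures" >&2
    exit 1
fi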

aws s3 glacier restore from vault

I have a vault and need to restore one of its folders. I initiated the job using the AWS CLI and got the inventory as a JSON file, but I am unable to get the complete folder from the inventory. Can anyone help me restore the folder?
I am able to get the inventory in CSV format to see the archive IDs of the files, but is it possible to retrieve the complete folder as a whole, since it shows a separate archive ID for every file in the folder?

Permission problem when mirroring a bucket: I'm trying to migrate a bucket from one MinIO server to another

I'm trying to migrate a bucket from one MinIO server to another using the mc client. The command I'm using is mc mirror (mc mirror --remove --overwrite --preserve minioproducao/compartilhado minioteste/compartilhado). The command works fine, but I was checking some
permissions inside the bucket on both servers and I realized that the permissions are different, for example:
I connected inside both Kubernetes containers and ran ls -l inside the bucket directory.
On the origin it is shown like this: drwxr-xr-x arquivo.JPG # note: it isn't a file, it's a directory, and inside it there are two files: part.1 and xl.meta
On the destination it is shown like this: -rw-r--r-- arquivo.JPG # note: it was copied as a file, not as a directory like on the origin MinIO server bucket
I'm wondering if there is a way to make an exact copy of what is on my origin MinIO server to another one?
Thank you in advance!
Here's a picture that will help illustrate What I'm saying:

Local DynamoDB Tables Backup

I want to back up my DynamoDB Local server. I have installed the DynamoDB server on a Linux machine. Some sites suggest creating a bash script on Linux and connecting to an S3 bucket, but on a local machine we don't have an S3 bucket.
So I am stuck with my work. Please help me, thanks.
You need to find the database file created by DynamoDb local. From the docs:
-dbPath value — The directory where DynamoDB will write its database file. If you do not specify this option, the file will be written to
the current directory. Note that you cannot specify both -dbPath and
-inMemory at once.
The file name would be of the form youraccesskeyid_region.db. If you used the -sharedDb option, the file name would be shared-local-instance.db
By default, the file is created in the directory from which you ran DynamoDB Local. To restore, copy that same file back and, when starting DynamoDB Local, specify the same -dbPath.
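A minimal backup/restore sketch along those lines (the data and backup directories are assumptions, and the file name assumes the -sharedDb option):

#!/usr/bin/env bash
set -euo pipefail

DB_DIR="/opt/dynamodb-local/data"      # directory passed to -dbPath (assumed)
BACKUP_DIR="/var/backups/dynamodb"     # assumed backup location
DB_FILE="shared-local-instance.db"     # file name when started with -sharedDb

mkdir -p "$BACKUP_DIR"

# Back up: copy the database file under a timestamped name.
cp "$DB_DIR/$DB_FILE" "$BACKUP_DIR/${DB_FILE%.db}-$(date +%F).db"

# Restore: stop DynamoDB Local, copy a backup back into the -dbPath directory,
# then start it again pointing at the same directory, e.g.:
#   java -jar DynamoDBLocal.jar -sharedDb -dbPath "$DB_DIR"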

Remote duplicate on FTP server

I have a server 1 (running Ubuntu) that hosts a website.
I have a server 2 (running Windows Server 2012) where some applications run and where I have space for my backups.
Server 1 has limited space, so I keep backups of both my MySQL database and web server files for 1 week only (daily backups).
When doing my daily backup, the script does the following (a sketch of this flow is shown after the list):
- back up MySQL to a file (mysqldump)
- compress the web server root folder to a tar.gz
- push both generated files to an FTP server (total is 6 GB)
- clean up files older than the retention period
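A minimal sketch of that daily script (the host name, credentials, database name and paths are placeholders, and lftp is just one of several ways to do the FTP push):

#!/usr/bin/env bash
set -euo pipefail

STAMP=$(date +%F)
BACKUP_DIR="/var/backups/site"     # assumed local staging directory
WEB_ROOT="/var/www/html"           # assumed web server root
RETENTION_DAYS=7

mkdir -p "$BACKUP_DIR"

# 1. dump the MySQL database (credentials typically come from ~/.my.cnf)
mysqldump --single-transaction mydatabase > "$BACKUP_DIR/db-$STAMP.sql"

# 2. compress the web server root folder
tar -czf "$BACKUP_DIR/www-$STAMP.tar.gz" -C "$(dirname "$WEB_ROOT")" "$(basename "$WEB_ROOT")"

# 3. push both generated files to the FTP server on server 2
lftp -u backupuser,backuppass ftp://server2 -e "
  cd Daily;
  put $BACKUP_DIR/db-$STAMP.sql;
  put $BACKUP_DIR/www-$STAMP.tar.gz;
  bye
"

# 4. clean up local files older than the retention period
find "$BACKUP_DIR" -type f -mtime +"$RETENTION_DAYS" -delete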
Now I want to add a step to have a stronger backup policy on server 2: keep dailies for 10 days, weeklies for 5 weeks, monthlies for a year, and keep the yearlies forever. Each backup interval has its own folder (i.e. a Daily folder, a Weekly folder, a Monthly folder and a Yearly folder).
Every Sunday I want my backup file to be copied into both the Daily and the Weekly folders (each of them being cleaned according to the policy above by another scheduled task), but I do not want to FTP it twice. Basically, from server 1 I want to trigger a copy of the file from \Server2\Daily to \Server2\Weekly.
Is RCP the right thing to use? I could not find out how to use it with a password.
Well, some more research pointed me to a web service, so I ended up with the following setup:
In my cron job on Server1, after pushing the backed-up files to the FTP server, I call (using curl) a PHP script on Server2; this PHP script then calls a batch file that does the copy/duplication job entirely on Server2.
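A rough sketch of that trigger, where the URL, parameter name, file name and Windows paths are all placeholders:

#!/usr/bin/env bash
set -euo pipefail

STAMP=$(date +%F)

# On Server1, at the end of the backup cron job: on Sundays, ask Server2 to
# duplicate today's archive from its Daily folder into its Weekly folder.
if [ "$(date +%u)" -eq 7 ]; then
    curl -fsS "http://server2.example/duplicate.php?file=www-$STAMP.tar.gz"
fi

# duplicate.php on Server2 would then invoke a batch file doing something like:
#   copy "D:\Backups\Daily\%FILE%" "D:\Backups\Weekly\%FILE%"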
