Delete files after awk command - shell

I'm trying to do an ls on a bucket, print the folder names, strip the trailing /, sort numerically, and keep the last 3, which are the most recent. Then I want to remove all folders except those 3 recent ones.
for i in $(aws s3 ls s3://portal-storage-site | awk -F '-' '{print $2}' | sed 's/\///g'| sort -n| tail -3| xargs| sed 's/ /|/g');
do aws s3 ls s3://portal-storage-site| grep -Ev "PRE\s.*\-($i)\/" | awk '{print $2}'|xargs echo "aws s3 ls s3://portal-storage-site/"; done
I expect it to execute:
aws s3 ls s3://portal-storage-site/2e5d0599-120/
aws s3 ls s3://portal-storage-site/6f08a223-118/
aws s3 ls s3://portal-storage-site/ba67667e-121/
aws s3 ls s3://portal-storage-site/ba67667e-122/
but the actual result is
aws s3 ls s3://portal-storage-site/2e5d0599-119/ 2e5d0599-120/ 6f08a223-118/ ba67667e-121/ ba67667e-122/

Instead of using xargs, you can try to compose your second aws s3 ls command in awk and send it to bash:
aws s3 ls s3://portal-storage-site| grep -Ev "PRE\s.*\-($i)\/" | awk '{print "aws s3 ls s3://portal-storage-site/" $2}'| bash
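For reference, the reason the original version printed everything on one line is that, without -I or -n, xargs packs all of the folder names onto a single command line of the echoed command. If you would rather keep xargs, a minimal sketch of the same loop body forcing one aws call per folder (same pipeline and $i as above):
# -I {} runs one aws s3 ls per input line instead of batching all names together.
aws s3 ls s3://portal-storage-site | grep -Ev "PRE\s.*\-($i)\/" | awk '{print $2}' \
  | xargs -I {} aws s3 ls s3://portal-storage-site/{}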

Related

Shell script to fetch S3 bucket size with AWS CLI

I have this script that fetches all the buckets in AWS along with their sizes. When I run the script it fetches the buckets, but the loop that fetches the sizes throws an error. Can someone point out where I am going wrong? When I run the AWS CLI commands for an individual bucket, they fetch the size without any issues.
The desired output would be as below, but for all the buckets; here I have fetched it for one bucket.
Desired output:
aws --profile aws-stage s3 ls s3://<bucket> --recursive --human-readable --summarize | awk END'{print}'
Total Size: 75.1 KiB
Error:
Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
Script:
#!/bin/bash
aws_profile=('aws-stage' 'aws-prod');
# loop AWS profiles
for i in "${aws_profile[@]}"; do
  echo "${i}"
  buckets=$(aws --profile "${i}" s3 ls s3:// --recursive | awk '{print $3}')
  # loop S3 buckets
  for j in "${buckets[@]}"; do
    echo "${j}"
    aws --profile "${i}" s3 ls s3://"${j}" --recursive --human-readable --summarize | awk END'{print}'
  done
done
Try this:
#!/bin/bash
aws_profiles=('aws-stage' 'aws-prod');
for profile in "${aws_profiles[@]}"; do
  echo "$profile"
  read -rd "\n" -a buckets <<< "$(aws --profile "$profile" s3 ls | cut -d " " -f3)"
  for bucket in "${buckets[@]}"; do
    echo "$bucket"
    aws --profile "$profile" s3 ls s3://"$bucket" --human-readable --summarize | awk END'{print}'
  done
done
The problem was that your buckets variable was a single string rather than an array.
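As a side note, the same fix can be written with mapfile (a sketch, assuming bash 4+; not part of the answer above), which builds the array one bucket per element and avoids the read -d quirks:
# mapfile puts each line of output into its own array element.
mapfile -t buckets < <(aws --profile "$profile" s3 ls | awk '{print $3}')
printf 'bucket: %s\n' "${buckets[@]}"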

Using makefile to download a file from AWS to local

I want to set up a target which downloads the latest S3 file containing _id_config within a path. I know I can get the name of the file I am interested in with
FILE=$(shell aws s3 ls s3:blah//xyz/mno/here --recursive | sort | tail -n 2 | awk '{print $4}' | grep id_config)
Now, I want to download the file locally with something like
download_stuff:
aws s3 cp s3://prod_an.live.data/$FILE .
But when I run this, my $FILE has some extra stuff like
aws s3 cp s3://blah/2022-02-17 16:02:21 2098880 blah//xyz/mno/here54fa8c68e41_id_config.json .
Unknown options: 2098880,blah/xyz/mno/here54fa8c68e41_id_config.json,.
Please can someone help me understand why 2098880 and the spaces are in the output, and how to resolve this? Thank you in advance.
Suggesting a trick with ls options -1 and -t to get the latest files in a folder:
FILE=$(shell aws s3 ls -1t s3:blah//xyz/mno/here |head -n 2 | grep id_config)
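On the original symptom: inside a Makefile, make expands $4 before the shell ever runs, so awk receives '{print }' and prints the whole "date time size key" line, which is where 2098880 and the spaces come from. Escaping the dollar sign fixes it. A sketch assuming GNU Make, with <bucket>/<path> as placeholders for the question's real paths:
# $$4 survives make's expansion and reaches awk as $4; reference the result as $(FILE), not $FILE.
FILE=$(shell aws s3 ls s3://<bucket>/<path> --recursive | sort | tail -n 2 | awk '{print $$4}' | grep id_config)

download_stuff:
	aws s3 cp s3://<bucket>/$(FILE) .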

bash script (or something else) to automate docker tag docker push

Looking for help writing a bash script to automate my docker workflow, or open to suggestions on what to do instead.
Current workflow is:
1.
me$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
abc.amazonaws.com/XYZ/XYZ-server 0.0.3-7-g45b4b4232e cf324458299c 8 minutes ago 936MB
2.
me$ docker tag cf324458299c abc.amazonaws.com/XYZ/XYZ-server:0.0.3-7-g45b4b4232e
me$ docker tag <last_image_id> <last_repo_id>:<last_tag>
3.
me$ docker push abc.amazonaws.com/XYZ/XYZ-server:0.0.3-7-g45b4b4232e
How could I automate this in a bash script so I can put it as an alias?
Thank you very much.
You would just extract the needed data and use them.
# 1.
if ! tmp=$(docker images --format '{{.Repository}}\t{{.Tag}}\t{{.ID}}' | grep 'abc.amazonaws.com/XYZ/XYZ-server'); then
  : # handle error
fi
IFS=$'\t' read -r last_image_id last_repo_id last_tag <<<"$tmp"
# 2.
docker tag "$last_image_id" "$last_repo_id:$last_tag"
# 3.
docker push "$last_repo_id:$last_tag"
Steps:
Create a bash script sample.sh and add these lines to it:
var="amazonaws"
echo docker tag $(docker images | grep $var | awk '{print $3}') $(docker images | grep $var | awk '{print $1}'):$(docker images | grep $var | awk '{print $2}')
docker tag $(docker images | grep $var | awk '{print $3}') $(docker images | grep $var | awk '{print $1}'):$(docker images | grep $var | awk '{print $2}')
echo docker push $(docker images | grep $var | awk '{print $1}'):$(docker images | grep $var | awk '{print $2}')
docker push $(docker images | grep $var | awk '{print $1}'):$(docker images | grep $var | awk '{print $2}')
chmod +x sample.sh
execute it as: ./sample.sh
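Since the goal is something alias-friendly, another option is to wrap the first answer's logic in a shell function in your ~/.bashrc. A sketch only; pushlast is an arbitrary name and the repository filter is the one from the question:
# Tag and push the newest matching image in one command.
pushlast() {
  local repo="abc.amazonaws.com/XYZ/XYZ-server"
  local line image_repo image_tag image_id
  # "docker images" lists newest images first; take the first match for the repo.
  line=$(docker images --format '{{.Repository}}\t{{.Tag}}\t{{.ID}}' | grep "$repo" | head -n 1)
  [ -n "$line" ] || { echo "no image found for $repo" >&2; return 1; }
  IFS=$'\t' read -r image_repo image_tag image_id <<< "$line"
  docker tag "$image_id" "$image_repo:$image_tag"
  docker push "$image_repo:$image_tag"
}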

aws s3 ls - how to recursively list objects with bash script avoid pagination error

I have on-premises, AWS S3-like storage. I need to list all files in a specific bucket. When I do it at the top of the bucket I get this error:
Error during pagination: The same next token was received twice:{'ContinuationToken':"file path"}
I think it happens when too many objects need to be listed. Something is wrong on the storage side, but there is no cure for that right now.
I worked around it by running s3 ls in a bash while loop. I managed to prepare a simple loop for a different bucket where I have far fewer objects. That loop operated deep inside the tree, where I knew how many directories I had.
./aws --profile us-bucket --endpoint-url https://endpoint:18082 --no-verify-ssl s3 ls us-bucket/dir1/dir2/dir3/dir4/dir5/dir6/ \
  | tr -s ' ' | tr '/' ' ' | awk '{print $2}' \
  | while read line0; do
      ./aws --profile us-bucket --endpoint-url https://endpoint:18082 --no-verify-ssl s3 ls us-bucket/dir1/dir2/dir3/dir4/dir5/dir6/${line0}/ \
        | tr -s ' ' | tr '/' ' ' | awk '{print $2}' \
        | while read line1; do
            ./aws --profile us-bucket --endpoint-url https://endpoint:18082 --no-verify-ssl s3 ls us-bucket/dir1/dir2/dir3/dir4/dir5/dir6/${line0}/${line1}/ \
              | tr -s ' ' | tr '/' ' ' | awk '{print $2}' \
              | while read line2; do
                  ./aws --profile us-bucket --endpoint-url https://endpoint:18082 --no-verify-ssl s3 ls --recursive us-bucket/dir1/dir2/dir3/dir4/dir5/dir6/${line0}/${line1}/${line2}/
                done
          done
    done > /tmp/us-bucket/us-bucket_dir2_dir3_dir4_dir5_dir6.txt
I would like to write a loop which starts from the top (root) and lists all files, no matter how many directories are in the path, working from the last directory in the path upwards, to avoid getting:
Error during pagination: The same next token was received twice:{'ContinuationToken':"file path"}
Any help/clues appreciated. Thanks.
Br,
Jay
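One possible direction, sketched under the assumption that a single listing only fails when it has to page through the whole tree: recurse over the "PRE" sub-prefixes and run --recursive only on leaf directories, so every call stays small. The walk function name and output path are illustrative; the profile/endpoint flags are the ones from the question.
#!/bin/bash
# Walk the bucket prefix by prefix; only leaf prefixes get --recursive.
AWS="./aws --profile us-bucket --endpoint-url https://endpoint:18082 --no-verify-ssl"

walk() {
  local prefix="$1" subdirs
  # Sub-prefixes show up as "PRE name/" lines in a non-recursive listing.
  subdirs=$($AWS s3 ls "us-bucket/${prefix}" | awk '$1 == "PRE" {print $2}')
  if [ -n "$subdirs" ]; then
    while IFS= read -r sub; do
      walk "${prefix}${sub}"
    done <<< "$subdirs"
  else
    # Leaf prefix: list every object below it.
    $AWS s3 ls --recursive "us-bucket/${prefix}"
  fi
}

# Note: objects stored directly in a non-leaf prefix are not listed by this sketch.
walk "" > /tmp/us-bucket/full_listing.txt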

Command run via xargs fails but runs manually

I'm trying to create a command that will automatically attach to my existing Python Docker container, chaining a bunch of commands together.
docker ps | grep "mypythoncontainer" | awk '{print $1}' | xargs docker attach
If I run
docker ps | grep "mypythoncontainer" | awk '{print $1}' | xargs echo
I get back a Docker ID string, as expected. And if I do docker attach {id string} (copied from the output of the statement right above this), it works. But when I run the full command at the top, I get an error (the input device is not a TTY).
So docker ps | grep "mypythoncontainer" | awk '{print $1}' | xargs echo would echo out abc, but docker ps | grep "mypythoncontainer" | awk '{print $1}' | xargs docker attach would fail, while docker attach abc works. Not sure what about xargs I don't understand.
Try:
docker attach $(docker ps | grep "mypythoncontainer" | awk '{print $1}')
or simpler:
docker attach $(docker ps | awk '/mypythoncontainer/{print $1}')
Not sure what about xargs I don't understand.
Running ...| ... docker ... will redirect docker's standard input to the output of awk, which was already read by xargs. So docker attach abc will run with a broken (already closed) STDIN, and then fail.
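As an aside, recent xargs implementations (GNU findutils 4.7+, where it is spelled --open-tty, and BSD xargs) have an -o option that reopens the child's stdin on /dev/tty, which sidesteps the "the input device is not a TTY" error while keeping xargs:
# -o is not universal; check your xargs man page before relying on it.
docker ps | awk '/mypythoncontainer/{print $1}' | xargs -o docker attach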
