Removing instances from HAProxy during AWS CodeDeploy - bash

Our application requires HAProxy for load balancing and traffic routing (one HAProxy instance per AZ); ALBs and ELBs are not configurable enough for our purposes. When deploying new code via AWS CodeDeploy, we would like the instances being patched to be placed into Maintenance Mode (removed from load balancing, with connections drained). We have modified the default CodeDeploy lifecycle bash scripts so that the instances in question remove themselves from their respective HAProxy instances by sending an SSM Run Command to the HAProxy host. Currently this modification doesn't work, and the reason for the failure is unknown. The script works when executed manually step by step (at least up to the current point of failure). The part that fails is either the test that prints "$INSTANCE_ID doesn't seem to be in an AZ with a HAProxy instance, skipping deregistration.", or the setting of $HAPROXY_ID, on which that test depends. The script runs fine up to that point, but then exits because it can't find the HAProxy instance ID.
I have checked IAM role permissions/credentials, environment variables, and file permissions, which all appear to be correct. Normally I would add more logging to the script to debug it, but deployments are too few and far between for that to be practical.
My question: is there a better way to do this? I can only assume we're not the only ones using HAProxy with CodeDeploy, and there has to be a reliable method of doing this. Below is the current code, which is not working.
#!/bin/bash
#
# Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
# http://aws.amazon.com/apache2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.
. $(dirname $0)/common_functions.sh
if [[ "$DEPLOYMENT_GROUP_NAME" != "redacted" ]]; then
msg "ELB Deregistration doesn't need to happen when not on redacted."
exit
fi
msg "Running AWS CLI with region: $(get_instance_region)"
# get this instance's ID
INSTANCE_ID=$(get_instance_id)
if [ $? != 0 -o -z "$INSTANCE_ID" ]; then
    error_exit "Unable to get this instance's ID; cannot continue."
fi
# Get current time
msg "Started $(basename $0) at $(/bin/date "+%F %T")"
start_sec=$(/bin/date +%s.%N)
msg "Checking if instance $INSTANCE_ID is part of an AutoScaling group"
asg=$(autoscaling_group_name $INSTANCE_ID)
if [ $? == 0 -a -n "${asg}" ]; then
    msg "Found AutoScaling group for instance $INSTANCE_ID: ${asg}"
    msg "Checking that installed CLI version is at least at version required for AutoScaling Standby"
    check_cli_version
    if [ $? != 0 ]; then
        error_exit "CLI must be at least version ${MIN_CLI_X}.${MIN_CLI_Y}.${MIN_CLI_Z} to work with AutoScaling Standby"
    fi
    msg "Attempting to put instance into Standby"
    autoscaling_enter_standby $INSTANCE_ID "${asg}"
    if [ $? != 0 ]; then
        error_exit "Failed to move instance into standby"
    else
        msg "Instance is in standby"
    fi
else
    msg "Instance is not part of an ASG, continuing..."
fi
## Get the instanceID of the HAProxy instance in this AZ and ENVIRONMENT - Will there ever be more than one???
HAPROXY_ID=$(/usr/local/bin/aws ec2 describe-instances --region us-east-1 --filters "Name=availability-zone,Values=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)" "Name=tag:deployment_group,Values=haproxy.$ENVIRONMENT" --output text | \
grep INSTANCES | \
awk '{print $8}' )
HAPROXY_IP=$(/usr/local/bin/aws ec2 describe-instances --region us-east-1 --filters "Name=availability-zone,Values=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)" "Name=tag:deployment_group,Values=haproxy.$ENVIRONMENT" --output text | \
grep INSTANCES | \
awk '{print $13}' )
if test -z "$HAPROXY_ID"; then
msg "$INSTANCE_ID doesn't seem to be in an AZ with a HAProxy instance, skipping deregistration."
exit
fi
## Put the current instance into MAINT mode with the HAProxy instance via SSM
msg "Deregistering $INSTANCE_ID from HAProxy $HAPROXY_ID"
DEREGCMD="{\"commands\":[\"haproxyctl disable server bk_app_servers/$INSTANCEID\"],\"executionTimeout\":[\"3600\"]}"
/usr/local/bin/aws ssm send-command \
--document-name "AWS-RunShellScript" \
--instance-ids "$HAPROXY_ID" \
--parameters "$DEREGCMD" \
--timeout-seconds 600 \
--output-s3-bucket-name "redacted" \
--output-s3-key-prefix "haproxy-codedeploy/deregister" \
--region us-east-1
if [ $? != 0 ]; then
error_exit "Failed to send SSM command to deregister instance $INSTANCE_ID from HAProxy $HAPROXY_ID"
fi
## Wait for all connections to drain from instance
SESS_COUNT=$(/usr/bin/curl -s "http://$HAPROXY_IP:<portredacted>/<urlredacted>" | grep $INSTANCE_ID | awk -F "," '{print $5}')
DRAIN_TIME=60
COUNTER=0
msg "Initial session count: $SESS_COUNT"
while [[ "$SESS_COUNT" -gt 0 ]]; do
    if [[ "$COUNTER" -gt "$DRAIN_TIME" ]]; then
        msg "Instance failed to drain all connections within $DRAIN_TIME seconds. Continuing to deploy anyway."
        break
    fi
    msg $SESS_COUNT
    sleep 1
    COUNTER=$(($COUNTER + 1))
    SESS_COUNT=$(/usr/bin/curl -s "http://$HAPROXY_IP:<portredacted>/<urlredacted>" | grep $INSTANCE_ID | awk -F "," '{print $5}')
done
msg "Finished $(basename $0) at $(/bin/date "+%F %T")"
end_sec=$(/bin/date +%s.%N)
elapsed_seconds=$(echo "$end_sec - $start_sec" | /usr/bin/bc)
msg "Elapsed time: $elapsed_seconds"

At the moment the only real option is to add more logging, issue a deployment to exercise the script, and then look at your deployment logs; you don't know why it's failing, and only the logs can tell you that. CodeDeploy should be executing your script exactly as-is, so it shouldn't behave any differently from a manual run, but it's hard to tell more without seeing the logs.
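One thing worth ruling out first: the CodeDeploy agent runs hooks as root with a minimal environment, so if $ENVIRONMENT is only set in your interactive shell and not for the agent, the tag filter haproxy.$ENVIRONMENT matches nothing and $HAPROXY_ID comes back empty, which is exactly the failure you describe. Separately, the --output text | grep INSTANCES | awk pipeline depends on column positions that shift with an instance's attributes. A sketch of a less fragile lookup (same tags and region as your script; the running-state filter and log line are additions of mine) that also leaves a trace when it fails:
# Sketch only: resolve the HAProxy instance for this AZ via --query rather than
# positional awk fields, and log what was found so a failed deployment leaves a trace.
AZ=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
read -r HAPROXY_ID HAPROXY_IP <<< "$(/usr/local/bin/aws ec2 describe-instances \
    --region us-east-1 \
    --filters "Name=availability-zone,Values=$AZ" \
              "Name=tag:deployment_group,Values=haproxy.$ENVIRONMENT" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].[InstanceId,PrivateIpAddress]' \
    --output text)"
msg "AZ=$AZ ENVIRONMENT=${ENVIRONMENT:-<unset>} HAPROXY_ID=${HAPROXY_ID:-<none>} HAPROXY_IP=${HAPROXY_IP:-<none>}"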
Good luck,
-Asaf

Related

Getting line 19: syntax error: unexpected end of file

I'm trying to associate Elastic IP addresses with an Auto Scaling group, so that whenever autoscaling triggers, the new instance is automatically associated with an EIP.
For this I'm adding a script to the user data.
We have 2 servers, so there are 2 EIPs; whenever autoscaling triggers, the script has to check whether an EIP is free, and if it is free, associate it with the instance using the instance ID.
When I run the script I get "line 19: syntax error: unexpected end of file".
I have checked the indentation, and I think it's correct.
#!/bin/bash
INSTANCE_ID=$(ec2-metadata --instance-id | cut -d " " -f 2);
EIP_LIST=(eipalloc-07da69856432f7cef eipalloc-0355263fcb50412ed)
for EIP in $${EIP_LIST}; do
    echo"Checkin if EIP is free"
    ISFREE=$(aws ec2 describe-addresses --allocation-ids $EIP --query Addresses[].InstanceID --output text --region ap-south-1)
    STARTWAIT=$(date +%s)
    while [ ! -z "$ISFREE" ]; do
        if [ "$(($(date +%s) - $STARTWAIT))" -gt $MAXWAIT ]; then
            echo "WARNING: We waited for 30 seconds, we are forcing it now."
            ISFREE=""
        else
            echo "checking the other EIP [$EIP]"
            ISFREE=$(aws ec2 describe-adresses --allocation-ids $EIP --query Addresses[].InstanceID --output text --region ap-south-1)
        fi
    done
    echo "Running: aws ec2 associate-address --instance-id $INSTANCE_ID --allocation-id $EIP --allow-reassociation --region ap-south-1"
    aws ec2 association-address --instance-id $INSTANCE_ID --allocation-id $EIP --allow-reassociation --region ap-south-1
You are missing the final done for the for-loop.
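For reference, here is a corrected sketch of the whole script. Beyond the missing done, note that $${EIP_LIST} expands to the shell's PID followed by the literal text {EIP_LIST} (iterating the array needs "${EIP_LIST[@]}"), echo"Checkin ..." is missing a space, MAXWAIT is never set (30 is assumed here to match the warning text), the subcommands are describe-addresses and associate-address, and the query key is InstanceId:
#!/bin/bash
# Corrected sketch; assumptions are called out inline.
INSTANCE_ID=$(ec2-metadata --instance-id | cut -d " " -f 2)
EIP_LIST=(eipalloc-07da69856432f7cef eipalloc-0355263fcb50412ed)
MAXWAIT=30   # assumed value; the original never set it
for EIP in "${EIP_LIST[@]}"; do
    echo "Checking if EIP $EIP is free"
    ISFREE=$(aws ec2 describe-addresses --allocation-ids "$EIP" \
        --query 'Addresses[].InstanceId' --output text --region ap-south-1)
    STARTWAIT=$(date +%s)
    while [ -n "$ISFREE" ]; do
        if [ $(($(date +%s) - STARTWAIT)) -gt "$MAXWAIT" ]; then
            echo "WARNING: We waited for $MAXWAIT seconds, we are forcing it now."
            ISFREE=""
        else
            sleep 3   # added: avoid hammering the API while polling
            ISFREE=$(aws ec2 describe-addresses --allocation-ids "$EIP" \
                --query 'Addresses[].InstanceId' --output text --region ap-south-1)
        fi
    done
    echo "Running: aws ec2 associate-address --instance-id $INSTANCE_ID --allocation-id $EIP --allow-reassociation --region ap-south-1"
    aws ec2 associate-address --instance-id "$INSTANCE_ID" \
        --allocation-id "$EIP" --allow-reassociation --region ap-south-1
    break   # added: stop after claiming one EIP so a single instance doesn't take both
done        # this closing done is the one the "unexpected end of file" error is about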

Bash script to run AWS CLI commands in parallel to reduce time

Sorry, I am still new to bash scripting. I have around 10,000 EC2 instances, and I created this bash script to change my EC2 instance types; all the instance names and types are stored in a file. The code works, but it takes very long to go through instance by instance.
Does anyone know if I can run the AWS CLI command against all EC2 instances in one go? Thanks :)
#!/bin/bash
my_file='test.txt'
declare -a instanceID
declare -a fmo # Future Instance Size
while IFS=, read -r COL1 COL2; do
    instanceID+=("$COL1")
    fmo+=("$COL2")
done <"$my_file"
len=${#instanceID[@]}
for (( i=0; i < len; i++ )); do
    vm_instance_id="${instanceID[$i]}"
    vm_type="${fmo[$i]}"
    echo "Stopping $vm_instance_id"
    aws ec2 stop-instances --instance-ids "$vm_instance_id"
    echo "Waiting for $vm_instance_id state to be STOPPED"
    aws ec2 wait instance-stopped --instance-ids "$vm_instance_id"
    echo "Resizing $vm_instance_id to $vm_type"
    aws ec2 modify-instance-attribute --instance-id "$vm_instance_id" --instance-type "$vm_type"
    echo "Starting $vm_instance_id"
    aws ec2 start-instances --instance-ids "$vm_instance_id"
done
Refactor your code into a function that is passed a line from the file:
work() {
    IFS=, read -r instanceID fmo <<<"$1"
    stuff "$instanceID" "$fmo"  # "stuff" stands for your stop/wait/resize/start sequence
}
Then run GNU xargs or GNU parallel for each line of the file, calling the exported function. Use the -P option to run the invocations in parallel, and -n1 so each invocation handles a single line; see the documentation.
export -f work
xargs -P0 -n1 -t bash -c 'work "$@"' -- <"$my_file"
As @KamilCuk pointed out here, you can easily make this run in parallel. However, if you run this script in parallel you might get throttled by EC2, so make sure you include some backoff + retry logic and respect the limits specified here: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html
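A minimal backoff wrapper along those lines (the function name, attempt count, and delays are illustrative, not from any AWS tooling) might look like:
# Sketch: retry a command with exponential backoff. Note it retries on any
# non-zero exit, not only RequestLimitExceeded throttling errors.
with_backoff() {
    local attempt=1 delay=2 max_attempts=5
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $max_attempts attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
# Usage inside work(): with_backoff aws ec2 stop-instances --instance-ids "$instanceID"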

Error in AWS CLI call doesn't get sent to logfile

We have to check the status of an instance, and I am trying to capture any error to a logfile. The logfile has the instance information, but the error is not being written to it. Below is the code; let me know what needs to be corrected.
function wait-for-status {
    instance=$1
    target_status=$2
    status=unknown
    while [[ "$status" != "$target_status" ]]; do
        status=`aws rds describe-db-instances \
            --db-instance-identifier $instance | head -n 1 \
            | awk -F \ '{print $10}'` >> ${log_file} 2>&1
        sleep 30
        echo $status >> ${log_file}
    done
}
Rather than using all that head/awk stuff, if you want a value out of the CLI, you should use the --query parameter. For example:
aws rds describe-db-instances --db-instance-identifier xxx --query 'DBInstances[*].DBInstanceStatus'
See: Controlling Command Output from the AWS Command Line Interface - AWS Command Line Interface
Also, if your goal is to wait until an Amazon RDS instance is available, then you should use the db-instance-available waiter — AWS CLI Command Reference:
aws rds wait db-instance-available --db-instance-identifier xxx
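Putting the two together, the polling function from the question could be reduced to the sketch below. Incidentally, the reason nothing reached the logfile is that >> ${log_file} 2>&1 was attached to a bare variable assignment; the command substitution runs before that redirection takes effect, so the CLI's stderr has to be redirected inside the substitution:
# Sketch: poll with --query instead of head/awk, and capture CLI errors by
# redirecting stderr inside the command substitution. log_file is assumed to
# be set by the surrounding script, as in the question.
function wait-for-status {
    instance=$1
    target_status=$2
    status=unknown
    while [[ "$status" != "$target_status" ]]; do
        status=$(aws rds describe-db-instances \
            --db-instance-identifier "$instance" \
            --query 'DBInstances[0].DBInstanceStatus' \
            --output text 2>> "${log_file}")
        echo "$status" >> "${log_file}"
        sleep 30
    done
}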

Command line tool to access Amazon Athena

I am looking for a command line tool to make queries to Amazon Athena.
It works with JDBC, using the driver com.amazonaws.athena.jdbc.AthenaDriver, but I haven't found any command line tool that works with it.
Expanding on previous answer from @MasonWinsauer. Requires bash and jq.
#!/bin/bash
# Athena queries are fundamentally asynchronous. So we have to:
# 1) Make the query, and tell Athena where in S3 to put the results (tell it the same place as the UI uses).
# 2) Wait for the query to finish.
# 3) Pull down the results and un-wacky-Jsonify them.

# Run the query, use jq to capture the QueryExecutionId, and then capture that into a bash variable.
queryExecutionId=$(
    aws athena start-query-execution \
        --query-string "SELECT Count(*) AS numBooks FROM books" \
        --query-execution-context "Database=demo_books" \
        --result-configuration "OutputLocation"="s3://whatever_is_in_the_athena_UI_settings" \
        --region us-east-1 | jq -r ".QueryExecutionId"
)
echo "queryExecutionId was ${queryExecutionId}"

# Wait for the query to finish running.
# This will wait for up to 60 seconds (30 * 2).
for i in $(seq 1 30); do
    queryState=$(
        aws athena get-query-execution --query-execution-id "${queryExecutionId}" --region us-east-1 | jq -r ".QueryExecution.Status.State"
    )
    if [[ "${queryState}" == "SUCCEEDED" ]]; then
        break
    fi
    echo "  Awaiting queryExecutionId ${queryExecutionId} - state was ${queryState}"
    if [[ "${queryState}" == "FAILED" ]]; then
        # Exit with a "bad" error code.
        exit 1
    fi
    sleep 2
done

# Get the results.
aws athena get-query-results \
    --query-execution-id "${queryExecutionId}" \
    --region us-east-1 > numberOfBooks_wacky.json

# TODO: un-wacky the JSON with jq or something, e.g.:
# cat numberOfBooks_wacky.json | jq -r ".ResultSet.Rows[] | .Data[0].VarCharValue"
As of version 1.11.89, the AWS command line tool supports Amazon Athena operations.
First, you will need to attach the AmazonAthenaFullAccess policy to the IAM role of the calling user.
Then, to get started querying, you will use the start-query-execution command as follows:
aws athena start-query-execution \
    --query-string "SELECT * FROM MyDb.MyTable" \
    --result-configuration "OutputLocation"="s3://MyBucket/logs" [Optional: EncryptionConfiguration] \
    --region <region>
This will return a JSON object containing the QueryExecutionId, which can be used to retrieve the query results using the following command:
aws athena get-query-results \
    --query-execution-id <id> \
    --region <region>
This likewise returns a JSON object with the results and metadata.
More information can be found in the official AWS Documentation.
Hope this helps!
You can try AthenaCLI, which is a command line client for the Athena service that supports auto-completion and syntax highlighting, and is a proud member of the dbcli community:
https://github.com/dbcli/athenacli
athena-cli should be a good start.

Mount an EBS volume (not snapshot) to Elastic Beanstalk EC2

I'm migrating a legacy app to Elastic Beanstalk. It needs persistent storage (for the time being), so I want to mount an EBS volume.
I was hoping the following would work in .ebextensions/ebs.config:
commands:
  01mkdir:
    command: "mkdir /data"
  02mount:
    command: "mount /dev/sdh /data"

option_settings:
  - namespace: aws:autoscaling:launchconfiguration
    option_name: BlockDeviceMappings
    value: /dev/sdh=vol-XXXXX
https://blogs.aws.amazon.com/application-management/post/Tx224DU59IG3OR9/Customize-Ephemeral-and-EBS-Volumes-in-Elastic-Beanstalk-Environments
But unfortunately I get the following error: "(vol-XXXX) for parameter snapshotId is invalid. Expected: 'snap-...'."
Clearly this method only allows snapshots. Can anyone suggest a fix or an alternative method?
I have found a solution. It could be improved by removing the "sleep 10", but unfortunately that is required because aws ec2 attach-volume is asynchronous and returns straight away, before the attachment takes place.
container_commands:
  01mount:
    command: "aws ec2 attach-volume --volume-id vol-XXXXXX --instance-id $(curl -s http://169.254.169.254/latest/meta-data/instance-id) --device /dev/sdh"
    ignoreErrors: true
  02wait:
    command: "sleep 10"
  03mkdir:
    command: "mkdir /data"
    test: "[ ! -d /data ]"
  04mount:
    command: "mount /dev/sdh /data"
    test: "! mountpoint -q /data"
Note: ideally this would run in the commands section rather than container_commands, but the environment variables are not set in time.
To add to @Simon's answer (to avoid traps for the unwary):
If the persistent storage being mounted will ultimately be used inside a Docker container (e.g. if you're running Jenkins and want to persist jenkins_home), you need to restart the docker container after running the mount.
You need to have the 'ec2:AttachVolume' action permitted against both the EC2 instance (or the instance/* ARN) and the volume(s) you want to attach (or the volume/* ARN) in the EB assumed role policy. Without this, the aws ec2 attach-volume command fails.
You need to pass --region to the aws ec2 ... command as well (at least, as of this writing).
Alternatively, instead of using an EBS volume, you could consider using Elastic File System (EFS) storage. AWS has published a script showing how to mount an EFS volume to Elastic Beanstalk EC2 instances, and an EFS volume can be attached to multiple EC2 instances simultaneously (which is not possible with EBS).
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/services-efs.html
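For a sense of what that looks like, here is a minimal .ebextensions sketch (assuming Amazon Linux with the amazon-efs-utils mount helper available in yum; fs-xxxxxxxx and /efs are placeholders, and the AWS-documented version linked above is more complete, covering security groups and environment variables):
packages:
  yum:
    amazon-efs-utils: []

commands:
  01mkdir:
    command: "mkdir -p /efs"
    test: "[ ! -d /efs ]"
  02mount:
    command: "mount -t efs fs-xxxxxxxx:/ /efs"
    test: "! mountpoint -q /efs"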
Here's a config file that you can drop in .ebextensions. You will need to provide the VOLUME_ID that you want to attach. The test commands make it so that attaching and mounting only happens as needed, so that you can eb deploy repeatedly without errors.
container_commands:
  00attach:
    command: |
      export REGION=$(/opt/aws/bin/ec2-metadata -z | awk '{print substr($2, 0, length($2)-1)}')
      export INSTANCE_ID=$(/opt/aws/bin/ec2-metadata -i | awk '{print $2}')
      export VOLUME_ID=$(aws ec2 describe-volumes --region ${REGION} --output text --filters Name=tag:Name,Values=tf-trading-prod --query 'Volumes[*].VolumeId')
      aws ec2 attach-volume --region ${REGION} --device /dev/sdh --instance-id ${INSTANCE_ID} --volume-id ${VOLUME_ID}
      aws ec2 wait volume-in-use --region ${REGION} --volume-ids ${VOLUME_ID}
      sleep 1
    test: "! file -E /dev/xvdh"
  01mkfs:
    command: "mkfs -t ext3 /dev/xvdh"
    test: "file -s /dev/xvdh | awk '{print $2}' | grep -q data"
  02mkdir:
    command: "mkdir -p /data"
  03mount:
    command: "mount /dev/xvdh /data"
    test: "! mountpoint /data"
You have to use container_commands because when commands are run, the source bundle is not fully unpacked yet.
.ebextensions/whatever.config
container_commands:
  chmod:
    command: chmod +x .platform/hooks/predeploy/mount-volume.sh
Predeploy hooks run after container commands but before the deployment. There is no need to restart your docker container even if it mounts a directory on the attached EBS volume, because Beanstalk spins the container up after the predeploy hooks complete. You can see this in the logs.
.platform/hooks/predeploy/mount-volume.sh
#!/bin/sh
# Make sure LF line endings are used in the file, otherwise there would be an error saying "file not found".
# All platform hooks run as root user, no need for sudo.
# Before attaching the volume find out the root volume's name, so that we can later use it for filtering purposes.
# -d – to filter out partitions.
# -P – to display the result as key-value pairs.
# -o – to output only the matching part.
# lsblk strips the "/dev/" part
ROOT_VOLUME_NAME=$(lsblk -d -P | grep -o 'NAME="[a-z0-9]*"' | grep -o '[a-z0-9]*')
aws ec2 attach-volume --volume-id vol-xxx --instance-id $(curl -s http://169.254.169.254/latest/meta-data/instance-id) --device /dev/sdf --region us-east-1
# The above command is async, so we need to wait.
aws ec2 wait volume-in-use --volume-ids vol-xxx --region us-east-1
# Now lsblk should show two devices. We figure out which one is non-root by filtering out the stored root volume name.
NON_ROOT_VOLUME_NAME=$(lsblk -d -P | grep -o 'NAME="[a-z0-9]*"' | grep -o '[a-z0-9]*' | awk -v name="$ROOT_VOLUME_NAME" '$0 !~ name')
FILE_COMMAND_OUTPUT=$(file -s /dev/$NON_ROOT_VOLUME_NAME)
# Create a file system on the non-root device only if there isn't one already, so that we don't accidentally override it.
if test "$FILE_COMMAND_OUTPUT" = "/dev/$NON_ROOT_VOLUME_NAME: data"; then
mkfs -t xfs /dev/$NON_ROOT_VOLUME_NAME
fi
mkdir /data
mount /dev/$NON_ROOT_VOLUME_NAME /data
# Need to make sure that the volume gets mounted after every reboot, because by default only root volume is automatically mounted.
cp /etc/fstab /etc/fstab.orig
NON_ROOT_VOLUME_UUID=$(lsblk -d -P -o +UUID | awk -v name="$NON_ROOT_VOLUME_NAME" '$0 ~ name' | grep -o 'UUID="[-0-9a-z]*"' | grep -o '[-0-9a-z]*')
# We specify 0 to prevent the file system from being dumped, and 2 to indicate that it is a non-root device.
# If you ever boot your instance without this volume attached, the nofail mount option enables the instance to boot
# even if there are errors mounting the volume.
# Debian derivatives, including Ubuntu versions earlier than 16.04, must also add the nobootwait mount option.
echo "UUID=$NON_ROOT_VOLUME_UUID /data xfs defaults,nofail 0 2" | tee -a /etc/fstab
Pretty sure that things that I do with grep and awk could be done in a more concise manner. I'm not great at Linux.
The instance profile should include these permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AttachVolume",
                "ec2:DetachVolume",
                "ec2:DescribeVolumes"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:volume/*",
                "arn:aws:ec2:*:*:instance/*"
            ]
        }
    ]
}
You have to ensure that you deploy the EBS volume in the same AZ as the Beanstalk instance and that you use a SingleInstance deployment. Then, if your instance crashes, the ASG will terminate it and create another one, and the hook will attach the volume to the new instance, keeping all the data.
Here it is with the missing config:
commands:
  01mount:
    command: "export AWS_ACCESS_KEY_ID=<replace with your AWS key> && export AWS_SECRET_ACCESS_KEY=<replace with your AWS secret> && aws ec2 attach-volume --volume-id <replace with your volume id> --instance-id $(curl -s http://169.254.169.254/latest/meta-data/instance-id) --device /dev/xvdf --region <replace with your region>"
    ignoreErrors: true
  02wait:
    command: "sleep 10"
  03mkdir:
    command: "mkdir /home/lucene"
    test: "[ ! -d /home/lucene ]"
  04mount:
    command: "mount /dev/xvdf /home/lucene"
    test: "! mountpoint -q /home/lucene"
