How to delete CloudWatch alarms that are attached to non-existent instances? - amazon-ec2

When I create new EC2 instances I use an ansible dynamic inventory to create new cloudwatch metrics alarms. So far so good:
- name: set AWS CloudWatch alarms
  hosts: tag_env_production
  vars:
    alarm_slack: 'arn:aws:sns:123:metrics-alarms-slack'
  tasks:
    - name: "CPU > 70%"
      ec2_metric_alarm:
        state: present
        name: "{{ ec2_tag_Name }}-CPU"
        region: "{{ ec2_region }}"
        dimensions:
          InstanceId: '{{ ec2_id }}'
        namespace: "AWS/EC2"
        metric: CPUUtilization
        statistic: Average
        comparison: ">="
        threshold: 70.0
        unit: Percent
        period: 300
        evaluation_periods: 1
        description: Triggered when CPU utilization is more than 70% for 5 minutes
        alarm_actions: ['{{ alarm_slack }}']
      when: ec2_tag_group == 'lazyservers'
Executing as follows:
ansible-playbook -v ec2_alarms.yml -i inventories/ec2/ec2.py
After creating the new instances I drop the old ones (manually). The problem is that I then need to delete the alarms attached to the old instances.
Am I missing something, or is there no way to do this via the dynamic inventory?
My current idea is to delete the alarms for instances that are in the "terminating" state, but the downside is that if I run the playbook after those instances are terminated, they simply won't be visible in the inventory any more.

Before deleting the instance, delete the alarm. Try something like this:
- name: delete alarm
  ec2_metric_alarm:
    state: absent
    region: ap-southeast-2
    name: "cpu-low"
    metric: "CPUUtilization"
    namespace: "AWS/EC2"
    statistic: Average
    comparison: "<="
    threshold: 5.0
    period: 300
    evaluation_periods: 3
    unit: "Percent"
    description: "This will alarm when a bamboo slave's cpu usage average is lower than 5% for 15 minutes"
    dimensions: {'InstanceId': '{{ instance_id }}'}
    alarm_actions: ["action1", "action2"]
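
Since the create play names each alarm "{{ ec2_tag_Name }}-CPU", one hedged way to automate the cleanup is to run a small play against the hosts you are about to retire (for example instances re-tagged with group=retiring, which the ec2.py inventory would expose as tag_group_retiring; that tag is only an assumption) and delete their alarms by name before terminating them:

- name: remove CloudWatch alarms from instances about to be dropped
  hosts: tag_group_retiring          # assumed tag/group for instances scheduled for termination
  tasks:
    - name: delete the CPU alarm created for this instance
      ec2_metric_alarm:
        state: absent
        region: "{{ ec2_region }}"
        name: "{{ ec2_tag_Name }}-CPU"   # same naming convention as in the create play

With state: absent the alarm name (plus region) should be enough to delete it, though passing the full parameter set as in the snippet above also works.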

Related

Launch instance in AWS and add it to the host file in Ansible (not dynamic inventory)

I have a lot of inventories inside Ansible, and I'm not looking for a dynamic inventory solution.
When I create an instance in AWS from Ansible, I need to run some tasks inside that new server, but I don't know the best way to do it.
tasks:
  - ec2:
      aws_secret_key: "{{ ec2_secret_key }}"
      aws_access_key: "{{ ec2_access_key }}"
      region: us-west-2
      key_name: xxxxxxxxxx
      instance_type: t2.medium
      image: ami-xxxxxxxxxxxxxxxxxx
      wait: yes
      wait_timeout: 500
      volumes:
        - device_name: /dev/xvda
          volume_type: gp3
          volume_size: 20
          delete_on_termination: yes
      vpc_subnet_id: subnet-xxxxxxxxxxxx
      assign_public_ip: no
      instance_tags:
        Name: new-instances
      count: 1

  - name: Find the IP of the created instance
    ec2_instance_facts:
      filters:
        "tag:Name": new-instances
    register: ec2_instance_info

  - set_fact:
      msg: "{{ ec2_instance_info | json_query('instances[*].private_ip_address') }}"

  - debug: var=msg
I can currently display the IP of the new instance, but I need to create a new host file and insert that IP under an alias, because I have to run several tasks on the server after creating it.
Any help doing this?
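
One common approach, sketched below under the assumption that the register name ec2_instance_info from the tasks above is available, is to skip the intermediate host file and register the new IP as an in-memory host with the built-in add_host module, then target it from a second play (the group name new_ec2 is just an illustrative choice):

    # still inside the tasks of the play that created the instance:
    - name: add the freshly created instance to the in-memory inventory
      add_host:
        name: "{{ ec2_instance_info.instances[0].private_ip_address }}"
        groups: new_ec2            # illustrative group name, exists only for this run

# second play in the same playbook, targeting the new host
- hosts: new_ec2
  become: yes
  tasks:
    - name: example follow-up task on the new server
      ping:

If a persistent host file is really required, the same IP can instead be appended to a static inventory file with lineinfile, but add_host avoids the extra file when the follow-up tasks run in the same playbook.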

Attach boot disk if exist to Gcloud instance with Ansible

I'm creating an instance in Google Cloud with Ansible, but when I want to attach an existing disk to the new Compute Engine instance, I can't attach or add it.
- name: Launch instances
  gce:
    instance_names: mongo
    machine_type: "n1-standard-1"
    image: "debian-9"
    service_account_email: "xxxx@xxxx.iam.gserviceaccount.com"
    credentials_file: "gcp-credentials.json"
    project_id: "learning"
    disk_size: 10
    disk_auto_delete: false
    preemptible: true
    tags: "mongo-server"
  register: gce

- name: Wait for SSH for instances
  wait_for:
    delay: 1
    host: "{{ item.public_ip }}"
    port: 22
    state: started
    timeout: 30
  with_items: "{{ gce.instance_data }}"
The error I have is:
The error was: libcloud.common.google.ResourceExistsError: {'domain': 'global', 'message': "The resource 'projects/xxx-xxx/zones/us-central1-a/disks/mongo' already exists", 'reason': 'alreadyExists'}
Is there any way to configure this with Ansible? For now I'm doing it with external scripts.
Existing disks can be provided as a list under the 'disks' attribute; the first entry needs to be the boot disk.
https://docs.ansible.com/ansible/2.6/modules/gce_module.html
- gce:
    instance_names: my-test-instance
    zone: us-central1-a
    machine_type: n1-standard-1
    state: present
    metadata: '{"db":"postgres", "group":"qa", "id":500}'
    tags:
      - http-server
      - my-other-tag
    disks:
      - name: disk-2
        mode: READ_WRITE
      - name: disk-3
        mode: READ_ONLY
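
Applied to the task from the question, a hedged sketch would pass the already existing disk (assumed here to be named mongo, like the instance) as the first, READ_WRITE entry so it is reused as the boot disk instead of the module trying to create a new one:

- name: Launch instance reusing the existing boot disk
  gce:
    instance_names: mongo
    machine_type: "n1-standard-1"
    service_account_email: "xxxx@xxxx.iam.gserviceaccount.com"
    credentials_file: "gcp-credentials.json"
    project_id: "learning"
    zone: us-central1-a
    disk_auto_delete: false
    preemptible: true
    tags: "mongo-server"
    disks:
      - name: mongo            # assumed name of the pre-existing disk; first entry is the boot disk
        mode: READ_WRITE
  register: gce

The image and disk_size parameters are dropped because the existing boot disk already carries its own image and size.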

async_status loses results of the task it's checking

I have these tasks to build AWS EC2 instances in parallel:
- name: Set up testing EC2 instances
  ec2_instance:
    image_id: "{{ ami }}"
    name: "testing {{ item }}"
    tags:
      Resposible Party: neil.watson@genesys.com
      Purpose: Temporary shared VPC CICD testing
    vpc_subnet_id: "{{ item }}"
    wait: yes
  register: ec2_instances
  async: 7200
  poll: 0
  loop:
    - "{{ PrivateSubnet01.value }}"

- name: Wait for instance creation to complete
  async_status: jid={{ item.ansible_job_id }}
  register: ec2_jobs
  until: ec2_jobs.finished
  retries: 300
  loop: "{{ ec2_instances.results }}"

- debug:
    msg: "{{ ec2_instances }}"
The trouble is that the end debug task doesn't show what I expect. I expect to see all the return values of the ec2_instance module, but instead I only see:
ok: [localhost] =>
  msg:
    changed: true
    msg: All items completed
    results:
    - _ansible_ignore_errors: null
      _ansible_item_label: subnet-0f69db3460b3391d1
      _ansible_item_result: true
      _ansible_no_log: false
      _ansible_parsed: true
      ansible_job_id: '814747228663.130'
      changed: true
      failed: false
      finished: 0
      item: subnet-0f69db3460b3391d1
      results_file: /root/.ansible_async/814747228663.130
      started: 1
Why?
"Set up testing EC2 instances" task was run asynchronously (poll: 0) and registered ec2_instances before they finish booting (finished: 0). Variable ec2_instances has not been changed afterwards. Would probably ec2_jobs ,registered after the task "Wait for instance creation to complete" had completed, keep the info you expect?
I was looking for a solution to this exact scenario as well; I couldn't find one, then figured it out on my own.
Apparently ec2_instance will return the instance of an already spun-up EC2 if you provide the same name, region, and image_id.
So in my case, where I had to provision 3 new instances,
I ran 3 tasks, each spinning up an instance asynchronously.
Then I ran 3 async_status tasks to make sure all 3 instances were up. Then I ran
- community.aws.ec2_instance:
    name: "machine_1"
    region: "us-east-1"
    image_id: ami-042e8287309f5df03
  register: machine_1
for each of the three machines, and then stored the results in my variables.

Connect EC2 instance to Target Group using Ansible

I have been registering EC2 instances to ELBs using Ansible for a while. But now I'm starting to use ALBs, and I need to connect my instances to target groups which in turn are connected to the ALB. Is there an Ansible plugin that allows me to register an instance to an AWS Target Group?
Since Ansible does not support registration of instances to target groups, I had to use the AWS CLI. With the following command you can register an instance to a target group:
aws elbv2 register-targets --target-group-arn arn:aws:elasticloadbalancing:us-east-1:your-target-group --targets Id=i-your-instance
So I just call this command from Ansible and it's done.
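For instance, a hedged sketch of wrapping that CLI call in an Ansible task with the command module; target_group_arn and instance_id are placeholder variables, not values from the original answer:

- name: register the instance in the target group via the AWS CLI
  command: >
    aws elbv2 register-targets
    --target-group-arn {{ target_group_arn }}
    --targets Id={{ instance_id }}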
Use elb_target:
- name: Gather facts for all new proxy instances
  ec2_instance_facts:
    filters:
      "tag:Name": "{{ ec2_tag_proxy }}"
  register: ec2_proxy

- elb_target_group:
    name: uat-target-proxy
    protocol: http
    port: 80
    vpc_id: vpc-4e6e8112
    deregistration_delay_timeout: 60
    stickiness_enabled: True
    stickiness_lb_cookie_duration: 86400
    health_check_path: /
    successful_response_codes: "200"
    health_check_interval: "20"
    state: present

- elb_target:
    target_group_name: uat-target-proxy
    target_id: "{{ item.instance_id }}"
    target_port: 80
    state: present
  with_items: "{{ ec2_proxy.instances }}"
  when: ec2_proxy.instances|length > 0
Try the following configurations:
- name: creating target group
  local_action:
    module: elb_target_group
    region: us-east-2
    vpc_id: yourvpcid
    name: create-targetgrp-delete
    protocol: https
    port: 443
    health_check_path: /
    successful_response_codes: "200,250-260"
    state: present
    targets:
      - Id: ec2isntanceid
        Port: 443
    wait_timeout: 200
  register: tgp

How to get filesystem via AWS EC2 dynamic inventory

I found quite a specific issue when setting AWS CloudWatch alarms via Ansible using the ec2 dynamic inventory.
I've successfully set up the aws-script-mon scripts to monitor disk and RAM usage on my machines.
I've also managed to set RAM usage alarms with the Ansible ec2_metric_alarm module.
The problem I'm facing at the moment is that, when setting alarms for disk usage, the Filesystem dimension parameter is required but is not returned among the ec2 dynamic inventory variables.
Some of my machines have the filesystem set to /dev/xvda1 and others have something like /dev/disk/by-uuid/abcd123-def4-....
My current "solution" is as follows:
- name: "Disk > 60% (filesystem by fixed uuid)"
ec2_metric_alarm:
state: present
name: "{{ ec2_tag_Name }}-Disk"
region: "{{ ec2_region }}"
dimensions:
InstanceId: '{{ ec2_id }}'
MountPath: "/"
Filesystem: '/dev/disk/by-uuid/abcd123-def4-...'
namespace: "System/Linux"
metric: DiskSpaceUtilization
statistic: Average
comparison: ">="
threshold: 60.0
unit: Percent
period: 300
evaluation_periods: 1
description: Triggered when Disk utilization is more than 60% for 5 minutes
alarm_actions: ['arn:aws:sns:us-west-2:1234567890:slack']
when: ec2_tag_Name in ['srv1', 'srv2']
- name: "Disk > 60% (filesystem /dev/xvda1)"
ec2_metric_alarm:
state: present
name: "{{ ec2_tag_Name }}-Disk"
region: "{{ ec2_region }}"
dimensions:
InstanceId: '{{ ec2_id }}'
MountPath: "/"
Filesystem: '/dev/xvda1'
namespace: "System/Linux"
metric: DiskSpaceUtilization
statistic: Average
comparison: ">="
threshold: 60.0
unit: Percent
period: 300
evaluation_periods: 1
description: Triggered when Disk utilization is more than 60% for 5 minutes
alarm_actions: ['arn:aws:sns:us-west-2:1234567890:slack']
when: ec2_tag_Name not in ['srv1', 'srv2']
The only difference between those two tasks is the Filesystem dimension and the when condition (in or not in).
Is there any way to obtain the Filesystem value so I can use it as comfortably as, say, ec2_id? My biggest concern is that I have to keep track of filesystem values when creating new machines and maintain lists of machines according to those values.
I couldn't find a nice solution to this problem, and ended up writing a bash script to generate a YAML file containing UUID variables.
Run the script on the remote machine using the script module:
- script: ../files/get_disk_uuid.sh > /disk_uuid.yml
Fetch the created file from the remote host.
Use the include_vars module to import the variables from the file (a task-level sketch of these steps follows after the script below). Variable names require hyphens to be replaced with underscores, so the disk label 'cloudimg-rootfs' becomes 'cloudimg_rootfs'. Unlabeled disks get the variable names disk0, disk1, disk2, ...
Scripts aren't ideal. One day I'd like to write a module that accomplishes this same task.
#! /bin/bash
# get_disk_uuid.sh - emit a YAML document mapping disk labels (or diskN) to UUIDs
echo '---'
disk_num=0
for entry in `blkid -o export -s LABEL -s UUID`; do
    # blkid prints LABEL=... and UUID=... tokens; remember whichever was just seen
    if [[ $entry == LABEL* ]]; then
        label=${entry:6}
    elif [[ $entry == UUID* ]]; then
        uuid=${entry:5}
    fi
    # both a label and a uuid are known: print "label: uuid", turning hyphens into underscores
    if [ "$label" ] && [ "$uuid" ]; then
        printf '%s:' "$label" | tr - _
        printf ' %s\n' "$uuid"
        label=''
        uuid=''
    # a uuid without a label: fall back to the diskN variable names
    elif [ "$uuid" ]; then
        printf 'disk%s:' "$disk_num"
        printf ' %s\n' "$uuid"
        disk_num=$((disk_num + 1))
        label=''
        uuid=''
    fi
done
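
For completeness, a hedged sketch of the fetch and include_vars steps described above; the destination path under fetched/ is an assumption, while the remote /disk_uuid.yml path follows the script task from the answer:

- name: generate the UUID variables file on the remote host
  script: ../files/get_disk_uuid.sh > /disk_uuid.yml

- name: fetch the generated file back to the control machine
  fetch:
    src: /disk_uuid.yml
    dest: "fetched/{{ inventory_hostname }}/disk_uuid.yml"
    flat: yes

- name: load the disk UUIDs as variables
  include_vars:
    file: "fetched/{{ inventory_hostname }}/disk_uuid.yml"

The loaded values can then feed the Filesystem dimension in the alarm task, e.g. Filesystem: '/dev/disk/by-uuid/{{ cloudimg_rootfs | default(disk0) }}'.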
