getting throttle exception while using aws describe_log_streams - aws-lambda

Below is my boto3 code snippet for a Lambda function. My requirement is to read all of the CloudWatch logs and, based on certain criteria, push them to S3.
I use the snippet below to read the CloudWatch logs from each stream. It works fine for small amounts of data. However, when each log stream contains a large volume of logs, it throws:
Throttle exception - (reached max retries: 4)
The default (and maximum) value of limit is 50.
I tried other values, but to no avail. Is there an alternative approach?
while v_nextToken is not None:
    cnt+=1
    loglist += '\n' + "No of iterations inside describe_log_streams 2nd stage - Iteration Cnt" + str(cnt)
    #Note : Max value of limit=50 and by default value will be 50
    #desc_response = client.describe_log_streams(logGroupName=vlog_groups,orderBy='LastEventTime',nextToken=v_nextToken,descending=True, limit=50)
    try:
        desc_response = client.describe_log_streams(logGroupName=vlog_groups,orderBy='LastEventTime',nextToken=v_nextToken,descending=True, limit=50)
    except Exception as e:
        print ( "Throttling error" + str(e) )

You can use a CloudWatch Logs subscription filter with Lambda, so that the function is triggered directly by new log events instead of polling the log streams. You can also consider subscribing a Kinesis stream instead, which has some advantages.
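As a rough sketch of what the subscribed Lambda might look like: CloudWatch Logs delivers each batch as base64-encoded, gzip-compressed JSON under event["awslogs"]["data"]. The matches_criteria helper below is a hypothetical stand-in for the "certain criteria" in the question, and the S3 upload is left out.

```python
import base64
import gzip
import json


def decode_awslogs_event(event):
    """Decode the payload CloudWatch Logs sends to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def matches_criteria(message):
    # Hypothetical filter: keep only messages that mention ERROR.
    return "ERROR" in message


def lambda_handler(event, context):
    payload = decode_awslogs_event(event)
    selected = [
        log_event["message"]
        for log_event in payload["logEvents"]
        if matches_criteria(log_event["message"])
    ]
    # Writing `selected` to S3 is intentionally omitted from this sketch.
    return {
        "logGroup": payload["logGroup"],
        "logStream": payload["logStream"],
        "selected": selected,
    }
```

With this approach the function never calls describe_log_streams, so the throttling limit on that API no longer applies to the read path.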

Related

Does event source mapping get deleted along with the lambda?

I have an AWS lambda and have created an event source mapping for the same. When I delete the lambda using Python boto3, does the event source mapping also get deleted along with that?
No. A Lambda Event Source Mapping is a separate, customer-managed resource. It has its own CRUD API and CloudFormation AWS::Lambda::EventSourceMapping resource type. You must delete it yourself with delete_event_source_mapping.
res = client.list_event_source_mappings(EventSourceArn=queue_arn, FunctionName=function_name)
assert len(res["EventSourceMappings"]) == 1
client.delete_function(FunctionName=function_name)
res = client.list_event_source_mappings(EventSourceArn=queue_arn, FunctionName=function_name)
assert len(res["EventSourceMappings"]) == 1
client.delete_event_source_mapping(UUID=mapping_uuid)
res = client.list_event_source_mappings(EventSourceArn=queue_arn, FunctionName=function_name)
assert len(res["EventSourceMappings"]) == 0 # wait a few seconds for deletion to finish
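If the goal is to clean up both resources together, one possible sketch is to collect the mapping UUIDs first and delete them before the function. The helper name delete_function_with_mappings is an assumption for illustration; client is expected to be a boto3 Lambda client.

```python
def delete_function_with_mappings(client, function_name):
    """Delete all event source mappings of a function, then the function itself.

    Returns the list of mapping UUIDs that were deleted.
    """
    uuids = []
    kwargs = {"FunctionName": function_name}
    # Collect every mapping first, following NextMarker across pages.
    while True:
        page = client.list_event_source_mappings(**kwargs)
        uuids.extend(m["UUID"] for m in page.get("EventSourceMappings", []))
        marker = page.get("NextMarker")
        if not marker:
            break
        kwargs["Marker"] = marker

    for uuid in uuids:
        client.delete_event_source_mapping(UUID=uuid)

    client.delete_function(FunctionName=function_name)
    return uuids
```

Usage would be along the lines of delete_function_with_mappings(boto3.client("lambda"), function_name). Mapping deletion is asynchronous, so a listing made immediately afterwards may still show entries for a few seconds.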

CloudWatch doesn't show any AWS Lambda failure details

I'm trying to debug my lambda_function.py in AWS.
It normally writes its logs to CloudWatch, but...
In some cases of 'Internal Server Error' (I cannot work out which), it writes nothing to CloudWatch except the START and END records, which makes it impossible to understand the root cause of the failure.
Here is my code:
import json
import psycopg2
def lambda_handler(event, context):
    try:
        print('started')
        s = psycopg2.__version__
        print(s)
        conn = psycopg2.connect(
            user='pg_user',
            password='*********',
            host='pg_host',
            port='5432',
            database='dev_db'
        )
        cur = conn.cursor()
        cur.execute("select count(1) q from keywords_to_scrape")
        for q in cur:
            print(f'q = {q}')
    except Exception as e:
        print(f'exception: {e} ')
    finally:
        print('returning result')
        return {
            'statusCode' : 200,
            'body' : json.dumps(f'{s}')
        }
If I comment out this part:
.............
#conn = psycopg2.connect(
# user='pg_user',
# password='*********',
# host='pg_host',
# port='5432',
# database='dev_db'
#)
.............
then it writes the lines "started" and "exception" (with a clear exception message) to CloudWatch and finally returns 200 OK.
But with the database connection lines in place, it just dies with 'Internal server error' and no messages in CloudWatch.
Could you please advise how to track such failures?
According to your comment, you are hitting a timeout error:
Task timed out after 3.01 seconds
A few things to try and check:
Increase your Lambda timeout, e.g. to 10 seconds.
If your Lambda still times out after you increase the timeout, check the connectivity between the Lambda and the database. E.g. make sure your Lambda is placed in the same VPC as your database and that your database security group allows traffic from your Lambda.
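One way to make such a failure visible in CloudWatch, sketched here under assumptions, is to give the database connection attempt a shorter deadline than the Lambda itself, so that the except block gets a chance to run and log the error. The helper name and the margin value are hypothetical; context.get_remaining_time_in_millis() is the standard method on the Lambda context object, and connect_timeout is the libpq connection parameter (in seconds) that psycopg2.connect accepts.

```python
def connect_timeout_seconds(context, margin_ms=1000, floor_s=1):
    """Return a connect timeout (whole seconds) that expires before the Lambda does.

    margin_ms is time reserved for logging and returning a response.
    """
    remaining_ms = context.get_remaining_time_in_millis()
    return max(floor_s, (remaining_ms - margin_ms) // 1000)


# Possible use inside the handler from the question (not executed here):
#
#     conn = psycopg2.connect(
#         user='pg_user', password='*********', host='pg_host',
#         port='5432', database='dev_db',
#         connect_timeout=connect_timeout_seconds(context),
#     )
```

With the default 3 second Lambda timeout this leaves roughly 2 seconds for the connection attempt, after which psycopg2 should raise an exception that the existing except block prints.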

Create CloudWatch alarm that sets an instance to standby via SNS/Lambda

What I am looking to do is set an instance to standby mode when it hits an alarm state. I already have an alarm set up to detect when my instance hits 90% CPU for a while. The alarm currently sends a Slack and text message via SNS calling a Lambda function. What I would like to add is to have the instance go into standby mode. The instances are in an autoscaling group.
I found that you can perform this through the CLI using the command:
aws autoscaling enter-standby --instance-ids i-66b4f7d5be234234234 --auto-scaling-group-name my-asg --should-decrement-desired-capacity
You can also do this with boto3:
response = client.enter_standby(
    InstanceIds=[
        'string',
    ],
    AutoScalingGroupName='string',
    ShouldDecrementDesiredCapacity=True|False
)
I assume I need to write another Lambda function that will be triggered by SNS that will use the boto3 code to do this?
Is there a better/easier way before I start?
I already have the InstanceId passed into the event to the Lambda so I will have to add the ASG name in the event.
Is there a way to get the ASG name in the Lambda function when I already have the Instance ID? Then I do not have to pass it in with the event.
Thanks!
Your question has a couple sub-parts, so I'll try to answer them in order:
I assume I need to write another Lambda function that will be triggered by SNS that will use the boto3 code to do this?
You don't need to, you could overload your existing function. I could see a valid argument for either separate functions (separation of concerns) or one function (since "reacting to CPU hitting 90%" is basically "one thing").
Is there a better/easier way before I start?
I don't know of any other way you could do it, other than CloudWatch -> SNS -> Lambda.
Is there a way to get the ASG name in the Lambda function when I already have the Instance ID?
Yes, see this question for an example. It's up to you whether it looks like doing it in the Lambda or passing an additional parameter is the cleaner option.
For anyone interested, here is what I came up with for the Lambda function (in Python):
# Puts the instance in the standby mode which takes it off the load balancer
# and a replacement unit is spun up to take its place
#
import json
import boto3
ec2_client = boto3.client('ec2')
asg_client = boto3.client('autoscaling')
def lambda_handler(event, context):
    # Get the id from the event JSON
    msg = event['Records'][0]['Sns']['Message']
    msg_json = json.loads(msg)
    id = msg_json['Trigger']['Dimensions'][0]['value']
    print("Instance id is " + str(id))
    # Capture all the info about the instance so we can extract the ASG name later
    response = ec2_client.describe_instances(
        Filters=[
            {
                'Name': 'instance-id',
                'Values': [str(id)]
            },
        ],
    )
    # Get the ASG name from the response JSON
    #autoscaling_name = response['Reservations'][0]['Instances'][0]['Tags'][1]['Value']
    tags = response['Reservations'][0]['Instances'][0]['Tags']
    autoscaling_name = next(t["Value"] for t in tags if t["Key"] == "aws:autoscaling:groupName")
    print("Autoscaling name is - " + str(autoscaling_name))
    # Put the instance in standby
    response = asg_client.enter_standby(
        InstanceIds=[
            str(id),
        ],
        AutoScalingGroupName=str(autoscaling_name),
        ShouldDecrementDesiredCapacity=False
    )
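As an alternative to reading the aws:autoscaling:groupName tag, the Auto Scaling API can be asked directly which group an instance belongs to. The following is a minimal sketch, assuming asg_client is a boto3 autoscaling client; the helper name find_asg_name is hypothetical.

```python
def find_asg_name(asg_client, instance_id):
    """Return the Auto Scaling group name for an instance, or None if it has none."""
    response = asg_client.describe_auto_scaling_instances(InstanceIds=[instance_id])
    instances = response.get("AutoScalingInstances", [])
    if not instances:
        # The instance is not managed by any Auto Scaling group.
        return None
    return instances[0]["AutoScalingGroupName"]
```

This avoids depending on the position or presence of tags, and returns None rather than raising when the instance is not part of a group.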

How to synchronize data between multiple workers

I've the following problem that is begging a zmq solution. I have a time-series data:
A,B,C,D,E,...
I need to perform an operation, Func, on each point.
It makes good sense to parallelize the task using multiple workers via zmq. However, what is tripping me up is how do I synchronize the result, i.e., the results should be time-ordered exactly the way the input data came in. So the end result should look like:
Func(A), Func(B), Func(C), Func(D),...
I should also point out that the time to complete, say, Func(A) will be slightly different from that of Func(B). This may require me to block for a while.
Any suggestions would be greatly appreciated.
You will always need to block for a while in order to synchronize things. You can send requests to a pool of workers and, when a response is received, buffer it if it is not the next one in sequence. A simple workflow could be described in pseudocode as follows:
socket receiver;  # zmq.PULL
socket workers;   # zmq.DEALER, the worker thread socket is started as zmq.DEALER too.
poller = poller(receiver, workers);
next_id_req = incr()
out_queue = queue;
out_queue.last_id = next_id_req
buffer = sorted_queue;
sock = poller.poll()
if sock is receiver:
    packet_N = receiver.recv()
    # send N for processing
    workers.send(packet_N, ++next_id_req)
else if sock is workers:
    # get a processed response Func(N)
    func_N_response, id = workers.recv()
    if out_queue.last_id != id-1:
        # not subsequent id, buffer it
        buffer.push(id, func_N_response)
    else:
        # in order, push to out queue
        out_queue.push(id, func_N_response)
        # also consume all buffered subsequent items
        while (out_queue.last_id == buffer.min_id() - 1):
            id, buffered_N_resp = buffer.pop()
            out_queue.push(id, buffered_N_resp)
But here comes the problem: what happens if a packet is lost in the processing thread (the workers pool)? You can skip it after a certain timeout (flush the buffer into the out queue) and continue filling the out queue, then reorder if the packet arrives later, if it ever does.
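The buffering step of the workflow above can be sketched in plain Python, independent of zmq. The ReorderBuffer class is a hypothetical helper, not part of any library; it assumes each result carries a unique, consecutive integer id and it does not implement the timeout for lost packets.

```python
class ReorderBuffer:
    """Release results strictly in sequence-id order, buffering early arrivals."""

    def __init__(self, first_id=0):
        self._next_id = first_id  # id of the next result allowed out
        self._pending = {}        # results that arrived ahead of their turn

    def push(self, seq_id, result):
        """Store a result and return the list of results that are now in order."""
        self._pending[seq_id] = result
        ready = []
        # Drain every buffered result that directly continues the sequence.
        while self._next_id in self._pending:
            ready.append(self._pending.pop(self._next_id))
            self._next_id += 1
        return ready


buf = ReorderBuffer()
print(buf.push(1, "Func(B)"))  # arrived early, nothing released yet
print(buf.push(0, "Func(A)"))  # releases Func(A) and the buffered Func(B)
```

Each list returned by push can be appended to the output queue as is, which keeps the output in the same order as the input regardless of which worker finishes first.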

sqs message between a client and a server

I need to set up a client which will send SQS messages to a server:
client side:
...
sqs = AWS::SQS.new
q = sqs.queues.create("q_name")
m = q.send_message("meta")
...
But how can the server read the client's message?
Thank you in advance.
First you need to have your server connect to SQS; then you can get your queue.
Call get_messages on your queue. See the boto docs for more information on the attributes. This will give you 1 to 10 message objects, depending on your parameters. Then call get_body() on each of those objects to get the message as a string.
Here's a simple example in Python. Sorry, I don't know Ruby.
from boto.sqs import connect_to_region
sqsConn = connect_to_region("us-west-1",  # this is the region you created the queue in
                            aws_access_key_id=AWS_ACCESS_KEY_ID,
                            aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
QUEUE = sqsConn.get_queue("my-queue")  # the name of your queue
msgs = QUEUE.get_messages(num_messages=10,  # try and get 10 messages
                          wait_time_seconds=1,  # wait 1 second for these messages
                          visibility_timeout=10)  # hide them from other consumers for 10 seconds
body = msgs[0].get_body()  # get the string from the first object
Hope this helps.
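For reference, the same read could be sketched with the newer boto3 client instead of the legacy boto library used above. This is an assumption-laden sketch: receive_bodies is a hypothetical helper, sqs_client is expected to be a boto3 SQS client, and each message is deleted right after its body is read.

```python
def receive_bodies(sqs_client, queue_url, max_messages=10, wait_seconds=1):
    """Fetch up to max_messages from the queue and return their bodies as strings."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows 1 to 10 per call
        WaitTimeSeconds=wait_seconds,      # long-poll for this many seconds
    )
    bodies = []
    # "Messages" is absent from the response when the queue is empty.
    for message in response.get("Messages", []):
        bodies.append(message["Body"])
        # Delete the message so it is not delivered again after the visibility timeout.
        sqs_client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return bodies
```

In a real consumer you would probably delete a message only after it has been processed successfully, rather than immediately as done here.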
