CloudWatch doesn't show any AWS Lambda failure details - aws-lambda

I'm trying to debug my lambda_function.py in AWS.
It always writes its logs to CloudWatch, but...
In some cases (I cannot tell which) of 'Internal Server Error' it writes nothing but the START and END records to CloudWatch, which makes it impossible to understand the root cause of the failure.
Here is my code:
import json
import psycopg2

def lambda_handler(event, context):
    try:
        print('started')
        s = psycopg2.__version__
        print(s)
        conn = psycopg2.connect(
            user='pg_user',
            password='*********',
            host='pg_host',
            port='5432',
            database='dev_db'
        )
        cur = conn.cursor()
        cur.execute("select count(1) q from keywords_to_scrape")
        for q in cur:
            print(f'q = {q}')
    except Exception as e:
        print(f'exception: {e} ')
    finally:
        print('returning result')
        return {
            'statusCode': 200,
            'body': json.dumps(f'{s}')
        }
and if I comment out this part
.............
#conn = psycopg2.connect(
# user='pg_user',
# password='*********',
# host='pg_host',
# port='5432',
# database='dev_db'
#)
.............
then it writes the lines "started" and "exception" (with a clear exception message) to CloudWatch perfectly and finally returns 200 OK.
But with the database connection lines in place it just dies with 'Internal server error' and writes no messages to CloudWatch.
Could you please advise how to track such failures?

You are hitting a timeout error, according to your comment:
Task timed out after 3.01 seconds
A few things for you to try and check:
Make your Lambda timeout longer, e.g. 10 seconds.
If your Lambda still hits the timeout after you increase it, then you might want to check your connectivity to the database. E.g. make sure your Lambda is placed in the same VPC as your database and that your database security group allows traffic from your Lambda.
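For reference, here is a minimal sketch of raising the timeout programmatically with boto3 (the function name below is a hypothetical placeholder; you can equally change it in the console under Configuration > General configuration):

import boto3

lambda_client = boto3.client("lambda")

# "my-db-function" is a placeholder for your own Lambda's name.
lambda_client.update_function_configuration(
    FunctionName="my-db-function",
    Timeout=10,  # seconds; the default is 3
)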

Related

Is it alright to include connect() inside the lambda_handler in order to close the connection after use?

I wrote a lambda function to access the MySQL database and fetch data, i.e. to fetch the number of users, but any real-time update is not fetched unless the connection is re-established.
And closing the connection inside the lambda_handler before returning results in a connection error on its next call.
The query which I am using is -> select count(*) from users
import os
import pymysql
import json
import logging

endpoint = os.environ.get('DBMS_endpoint')
username = os.environ.get('DBMS_username')
password = os.environ.get('DBMS_password')
database_name = os.environ.get('DBMS_name')
DBport = int(os.environ.get('DBMS_port'))

logger = logging.getLogger()
logger.setLevel(logging.INFO)

try:
    connection = pymysql.connect(endpoint, user=username, passwd=password, db=database_name, port=DBport)
    logger.info("SUCCESS: Connection to RDS mysql instance succeeded")
except:
    logger.error("ERROR: Unexpected error: Could not connect to MySql instance.")

def lambda_handler(event, context):
    try:
        cursor = connection.cursor()
        ............some.work..........
        ............work.saved..........
        cursor.close()
        connection.close()
        return .....
    except:
        print("ERROR")
The above code results in a connection error on its second usage:
the first time it works fine and gives the output, but the second time I run the lambda function it results in a connection error.
Upon removal of this line ->
connection.close()
the code works fine, but the real-time data which was inserted into the DB is not fetched by the lambda;
however, when I don't use the lambda function for 2 minutes and then use it again, the new value is fetched.
So,
in order to rectify this problem,
I placed the connect() inside the lambda_handler, and the problem is solved; it also fetches the real-time data upon insertion.
import os
import pymysql
import json
import logging

endpoint = os.environ.get('DBMS_endpoint')
username = os.environ.get('DBMS_username')
password = os.environ.get('DBMS_password')
database_name = os.environ.get('DBMS_name')
DBport = int(os.environ.get('DBMS_port'))

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        try:
            connection = pymysql.connect(endpoint, user=username, passwd=password, db=database_name, port=DBport)
        except:
            logger.error("ERROR: Unexpected error: Could not connect to MySql instance.")
        cursor = connection.cursor()
        ............some.work..........
        ............work.saved..........
        cursor.close()
        connection.close()
        return .....
    except:
        print("ERROR")
So, I want to know whether it is right to do this, or whether there is some other way to solve the problem. I have been trying to solve this for a few days and this solution finally works, but I am not sure whether it is good practice.
Will any problems occur if the number of connections to the database increases?
Or any kind of resource problem?
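For what it's worth, a common middle ground (a hedged sketch, not from the original post) is to keep the module-level connection so warm invocations reuse it, but verify it at the start of every call with pymysql's Connection.ping(reconnect=True), and commit after each read so the reused connection's open transaction doesn't hide newly inserted rows (the stale reads described above are typically caused by MySQL's REPEATABLE READ isolation on a long-lived connection):

import os
import pymysql

endpoint = os.environ.get('DBMS_endpoint')
username = os.environ.get('DBMS_username')
password = os.environ.get('DBMS_password')
database_name = os.environ.get('DBMS_name')
DBport = int(os.environ.get('DBMS_port'))

# Module-level connection: created once per warm container and reused across invocations.
connection = pymysql.connect(host=endpoint, user=username, passwd=password,
                             db=database_name, port=DBport)

def lambda_handler(event, context):
    # Re-establish the connection if MySQL or a previous invocation dropped it.
    connection.ping(reconnect=True)
    with connection.cursor() as cursor:
        cursor.execute("select count(*) from users")
        (count,) = cursor.fetchone()
    # End the read transaction so the next invocation sees rows inserted in the meantime.
    connection.commit()
    return {"user_count": count}

This keeps the number of database connections roughly equal to the number of warm Lambda containers instead of opening a new one per invocation.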

Error: Runtime exited without providing a reason in python lambda

This code is used to do the following:
This lambda gets triggered via an event rule.
An event is sent when an instance's state goes to running/terminated.
When an instance is running, its attributes are saved to a DynamoDB table.
Route53 record sets are created/deleted when the instance is in the running/terminated state.
The lambda now creates the records when an instance is launched, but then throws the error below:
RequestId: 9f4fb9ed-88db-442a-bc4f-079744f5bbcf Error: Runtime exited without providing a reason
Runtime.ExitError
import ipaddress
import os
import time
from datetime import datetime
from typing import Dict, List, Optional
import boto3
from botocore.exceptions import ClientError, ParamValidationError
from pynamodb.attributes import UnicodeAttribute, UTCDateTimeAttribute
from pynamodb.exceptions import DoesNotExist
from pynamodb.models import Model
def lambda_handler(event, context):
    """Registers or de-registers private DNS resource records for a given EC2 instance."""
    # Retrieve details from invocation event object.
    try:
        account_id = event["account"]
        instance_id = event["detail"]["instance-id"]
        instance_region = event["region"]
        instance_state = event["detail"]["state"]
    except KeyError as err:
        raise RuntimeError(
            f"One or more required fields missing from event object {err}"
        )
    print(
        f"EC2 instance {instance_id} changed to state `{instance_state}` in account "
        f"{account_id} and region {instance_region}."
    )
    print(f"Creating a new aws session in {instance_region} for account {account_id}.")
    target_session = aws_session(
        region=instance_region, account_id=account_id, assume_role=ASSUME_ROLE_NAME,
    )
    print(f"Retrieving instance and VPC attributes for instance {instance_id}.")
    instance_resource = get_instance_resource(instance_id, target_session)
    vpc_resource = get_vpc_resource(instance_resource.vpc_id, target_session)
    route53_client = target_session.client("route53")
    print(f"Retrieving DNS configuration from VPC {instance_resource.vpc_id}.")
    forward_zone = get_vpc_domain(vpc_resource)
    print(f"Calculating reverse DNS configuration for instance {instance_id}.")
    reverse_lookup = determine_reverse_lookup(instance_resource, vpc_resource)
    if instance_state == "running":
        print(f"Building DNS registration record for instance {instance_id}.")
        #vpc_resource = get_vpc_resource(instance_resource.vpc_id, target_session)
        #print(f"Retrieving DNS configuration from VPC {instance_resource.vpc_id}.")
        #forward_zone = get_vpc_domain(vpc_resource)
        #print(f"Calculating reverse DNS configuration for instance {instance_id}.")
        #reverse_lookup = determine_reverse_lookup(instance_resource, vpc_resource)
        record = Registration(
            account_id=account_id,
            hostname=generate_hostname(instance_resource),
            instance_id=instance_resource.id,
            forward_zone=forward_zone,
            forward_zone_id=get_zone_id(forward_zone, route53_client),
            private_address=instance_resource.private_ip_address,
            region=instance_region,
            reverse_hostname=reverse_lookup["Host"],
            reverse_zone=reverse_lookup["Zone"],
            reverse_zone_id=get_zone_id(reverse_lookup["Zone"], route53_client),
            vpc_id=instance_resource.vpc_id,
        )
        print(record)
        try:
            if record.forward_zone_id is not None:
                manage_resource_record(record, route53_client)
            if record.forward_zone_id and record.reverse_zone_id is not None:
                manage_resource_record(record, route53_client, record_type="PTR")
        except RuntimeError as err:
            print(f"An error occurred while creating records: {err}")
            exit(os.EX_IOERR)
        if record.forward_zone_id:
            print(
                f"Saving DNS registration record to database for instance {instance_id}."
            )
            record.save()
        else:
            print(
                f"No matching hosted zone for {record.forward_zone} associated "
                f"with {record.vpc_id}."
            )
    else:
        try:
            print(
                f"Getting DNS registration record from database for instance {instance_id}."
            )
            record = Registration.get(instance_id)
            if record.forward_zone_id is not None:
                manage_resource_record(record, route53_client, action="DELETE")
            if record.reverse_zone_id is not None:
                manage_resource_record(record, route53_client, record_type="PTR", action="DELETE")
            print(
                "Deleting DNS registration record from database for "
                f"instance {instance_id}."
            )
            record.delete()
        except DoesNotExist:
            print(f"A registration record for instance {instance_id} does not exist.")
            exit(os.EX_DATAERR)
        except RuntimeError as err:
            print(f"An error occurred while removing resource records: {err}")
            exit(os.EX_IOERR)
    exit(os.EX_OK)
I had this problem with a dotnet lambda. Turns out it'd run out of memory. Raising the memory ceiling allowed it to pass.
The exit(os.EX_OK) statement on the last line was causing this. Removing that line resolved my issue.
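More generally, the Python runtime reports Runtime.ExitError when the handler process terminates via exit()/sys.exit() instead of returning, so the usual fix is to return on success and raise on failure. A minimal sketch (do_work is a hypothetical placeholder for the actual Route53/DynamoDB logic):

def lambda_handler(event, context):
    try:
        do_work(event)  # hypothetical placeholder for the record management code
    except RuntimeError as err:
        # Raising (rather than calling exit(os.EX_IOERR)) marks the invocation as
        # failed and puts the stack trace in CloudWatch.
        raise RuntimeError(f"record management failed: {err}")
    # Returning (rather than calling exit(os.EX_OK)) ends the invocation cleanly.
    return {"status": "ok"}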
What I did to get around this, but still have an exit code if errors were detected, was the following:

import sys
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def main(event=None, context=None):
    logger.info('Starting Lambda')
    error_count = 0
    s3_objects = parse_event(event)  # parse_event / parse_object defined elsewhere
    for s3_object in s3_objects:
        logger.info('Parsing s3://{}/{}'.format(s3_object['bucket'], s3_object['key']))
        error_count += parse_object(s3_object)
    logger.info('Total Errors: {}'.format(error_count))
    if error_count > 255:
        error_count = 255
    logger.info('Exiting lambda')
    # exit only if error_count > 0 (lambda doesn't like sys.exit(0))
    if error_count > 0:
        sys.exit(error_count)
TL;DR - Exit only if an error is detected.

getting throttle exception while using aws describe_log_streams

Below is my boto3 code snippet for Lambda. My requirement is to read the entire CloudWatch logs and, based on certain criteria, push them to S3.
I have used the snippet below to read the CloudWatch logs from each stream. This works absolutely fine for smaller amounts of data. However, for massive logs inside each LogStream it throws
Throttle exception - (reached max retries: 4)
The default/max value of limit is 50.
I tried giving certain other values, but to no avail. Please check and let me know if there is any other alternative for this.
while v_nextToken is not None:
    cnt += 1
    loglist += '\n' + "No of iterations inside describe_log_streams 2nd stage - Iteration Cnt" + str(cnt)
    # Note: max value of limit=50 and by default value will be 50
    #desc_response = client.describe_log_streams(logGroupName=vlog_groups, orderBy='LastEventTime', nextToken=v_nextToken, descending=True, limit=50)
    try:
        desc_response = client.describe_log_streams(logGroupName=vlog_groups, orderBy='LastEventTime', nextToken=v_nextToken, descending=True, limit=50)
    except Exception as e:
        print("Throttling error" + str(e))
You can use a CloudWatch Logs subscription filter for Lambda, so the Lambda is triggered directly from the log stream. You can also consider subscribing a Kinesis stream, which has some advantages.
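For illustration, a minimal sketch of wiring up such a subscription filter with boto3 (the log group name and function ARN below are hypothetical):

import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/aws/lambda/my-app",                                         # hypothetical log group
    filterName="push-to-s3",
    filterPattern="",                                                          # empty pattern forwards every event
    destinationArn="arn:aws:lambda:us-east-1:123456789012:function:log-to-s3"  # hypothetical Lambda ARN
)

Note that the destination Lambda also needs a resource-based policy that allows logs.amazonaws.com to invoke it (lambda add-permission); otherwise the call above is rejected.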

Tornado cancel httpclient.AsyncHTTPClient fetch() from on_chunk()

Inside one of the handlers I am doing the following:
async def get(self):
    client = httpclient.AsyncHTTPClient()
    url = 'some url here'
    request = httpclient.HTTPRequest(url=url, streaming_callback=self.on_chunk, request_timeout=120)
    result = await client.fetch(request)
    self.write("done")

@gen.coroutine
def on_chunk(self, chunk):
    self.write(chunk)
    yield self.flush()
The requests can sometimes be quite large and the client may leave while the request is still being fetched and pumped to the client. If this happens, an exception appears in the on_chunk function when self.write() is attempted. My question is: how do I abort the remaining download if my client went away?
If your streaming_callback raises an exception, the client request should be aborted. This will spam the logs with stack traces, but there's not currently a cleaner way to do it. You can override on_connection_close to detect when the client has disconnected and set an attribute on self that you can check in on_chunk.
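A hedged sketch of what that could look like (the handler class name and the client_gone attribute are made up for illustration):

import tornado.iostream
import tornado.web
from tornado import httpclient


class ProxyHandler(tornado.web.RequestHandler):
    async def get(self):
        self.client_gone = False
        client = httpclient.AsyncHTTPClient()
        request = httpclient.HTTPRequest(
            url='some url here',
            streaming_callback=self.on_chunk,
            request_timeout=120,
        )
        try:
            await client.fetch(request)
        except Exception:
            # An exception raised from on_chunk aborts the upstream fetch;
            # if the client is already gone there is nothing left to send.
            if not self.client_gone:
                raise
            return
        self.write("done")

    def on_connection_close(self):
        # Tornado calls this when the downstream client disconnects.
        self.client_gone = True

    def on_chunk(self, chunk):
        if self.client_gone:
            # Raising from the streaming_callback cancels the remaining download.
            raise tornado.iostream.StreamClosedError()
        self.write(chunk)
        self.flush()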

sqs message between a client and a server

I need to set up a client which will send SQS messages to a server:
client side:
...
sqs = AWS::SQS.new
q = sqs.queues.create("q_name")
m = q.send_message("meta")
...
but how can the server read the client's message?
Thank you in advance.
First you need to have your server connect to SQS, then you can get your queue.
Do a get_messages on your queue. Go to the boto docs to get more information on the attributes. This will give you 1 to 10 message objects based on your parameters. Then on each of those objects do a get_body() and you'll have the string of the message.
Here's a simple example in Python. Sorry, I don't know Ruby.
from boto.sqs import connect_to_region

sqsConn = connect_to_region("us-west-1",  # this is the region you created the queue in
                            aws_access_key_id=AWS_ACCESS_KEY_ID,
                            aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
QUEUE = sqsConn.get_queue("my-queue")  # the name of your queue
msgs = QUEUE.get_messages(num_messages=10,       # try and get 10 messages
                          wait_time_seconds=1,   # wait 1 second for these messages
                          visibility_timeout=10) # keep them visible for 10 seconds
body = msgs[0].get_body()  # get the string from the first object
Hope this helps.
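If you end up using boto3 (the current AWS SDK for Python) instead of the legacy boto library shown above, a roughly equivalent hedged sketch looks like this (the queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs", region_name="us-west-1")

# Replace with the URL of your own queue (available from get_queue_url or the console).
queue_url = "https://sqs.us-west-1.amazonaws.com/123456789012/my-queue"

resp = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,  # try to get up to 10 messages
    WaitTimeSeconds=1,       # wait 1 second for messages
    VisibilityTimeout=10,    # keep them hidden from other consumers for 10 seconds
)
for message in resp.get("Messages", []):
    print(message["Body"])   # the message text sent by the client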
