AWS Lambda: How To Upload & Test Code Using Python And Command Line

I am no longer able to edit my AWS lambda function using the inline editor because of the error, "Your inline editor code size is too large. Maximum size is 51200." Yet, I can't find a walk-through that explains how to do these things from localhost:
Upload a python script to Lambda
Supply "event" data to the script
View Lambda output

You'll need to create a deployment package for your code, which is just a zip archive with a particular layout: your handler module at the root of the archive, with its dependencies alongside it. Instructions are in the AWS Python deployment documentation.
Event data is passed to your handler as its first argument (the second argument is the context object, which carries runtime information); starter information is in the AWS Python programming model documentation.
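Once the zip is built, all three steps can be driven from localhost. A minimal sketch with boto3 (the function name, file names, and payload below are placeholders, not from the question):

import json
import boto3

client = boto3.client("lambda")

# 1. Upload the deployment package
with open("lambda_function.zip", "rb") as f:
    client.update_function_code(
        FunctionName="my-function",
        ZipFile=f.read(),
    )

# 2. Supply event data and 3. view the output
response = client.invoke(
    FunctionName="my-function",
    Payload=json.dumps({"key": "value"}),
)
print(response["Payload"].read().decode())

The AWS CLI equivalents are aws lambda update-function-code and aws lambda invoke.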
Side note: once your Lambda code starts to get larger, it's often handy to move to some sort of management framework. Several have been written for Lambda. I use Apex, which is written in Go but works for any Lambda code, but you might be more comfortable with Gordon (has a great list of examples and is more active) or Kappa, which are both written in Python.

Related

How to run DBT in AWS Lambda?

I have currently dockerized my DBT solution and I launch it in AWS Fargate (triggered from Airflow). However, Fargate takes about a minute to start running (image pull, resource provisioning, etc.), which is fine for long-running executions (hours), but not for short ones (1-5 minutes).
I'm trying to run my docker container in AWS Lambda instead of in AWS Fargate for short executions, but I encountered several problems during this migration.
The one I cannot fix relates to the message below, which appears when running dbt deps --profiles-dir . && dbt run -t my_target --profiles-dir . --select my_model:
Running with dbt=0.21.0
Encountered an error:
[Errno 38] Function not implemented
It says a function is not implemented, but I cannot see anywhere which function it means. Since the error appears while installing the dbt packages (redshift and dbt_utils), I tried downloading them and including them in the docker image (setting local paths in packages.yml), but nothing changed. Moreover, DBT writes no logs at this phase (I set the log-path to /tmp in dbt_project.yml so that it has write permissions within the Lambda), so I'm blind.
Digging into this problem, I've found that this can be related to multiprocessing issues within AWS Lambda (my docker image contains python scripts), as stated in https://github.com/dbt-labs/dbt-core/issues/2992. I run DBT from python using the subprocess library.
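If it is the multiprocessing issue described there, the error should be reproducible outside of dbt, since the Lambda runtime does not implement the POSIX semaphores (via /dev/shm) that multiprocessing primitives rely on; this is my assumption about the cause:

import multiprocessing

# Inside Lambda this raises OSError: [Errno 38] Function not implemented,
# because the underlying sem_open call is unavailable there
queue = multiprocessing.Queue()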
Since it may be a multiprocessing issue, I have also tried to set "threads": 1 in profiles.yml but it did not solve the problem.
Has anyone succeeded in deploying DBT in AWS Lambda?
I've recently been trying to do this, and the summary of what I've found is that it seems to be possible, but isn't worth it.
You can pretty easily build a Lambda Layer that includes dbt & the adapter you want to use, but you'll also need to patch the multiprocessing behavior and invoke dbt.main from within the Lambda code. Once you've jumped through all those hoops, you're left with a dbt instance that is limited to a relatively small memory ceiling, a 15-minute maximum runtime, and a single thread.
This discussion gives a rough example of what's needed to get it running in Lambda: https://github.com/dbt-labs/dbt-core/issues/2992#issuecomment-919288906
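The general shape ends up something like the sketch below; this assumes dbt ~0.21 installed in the image or layer, the project unpacked under /tmp, and multiprocessing already patched as in that comment:

import os
from dbt import main as dbt_main

def handler(event, context):
    # /tmp is the only writable path in Lambda; dbt reads dbt_project.yml
    # from the current working directory
    os.chdir("/tmp/project")
    results, success = dbt_main.handle_and_check(
        ["run", "--profiles-dir", "/tmp/project"]
    )
    return {"success": success}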
All that said, I'd love to put dbt on a Lambda and I hope dbt's multiprocessing will one day support it.

AWS: Call jupyter notebook from step function or lambda

I created a python notebook called 'test.ipynb' in SageMaker to retrieve a csv file from my S3 bucket, manipulate the data, calculate values, create a new csv file and save it back into the S3 bucket. This part works.
I want to test triggering it from a step function or from a lambda function. For Step Functions, I added a SageMaker step called StartNotebookInstance that successfully starts the notebook instance, but the next step is to run the notebook 'test.ipynb', and I do not see anything in that step that allows me to specify the notebook name. I also do not see an equivalent to 'RunNotebook'. Has anyone successfully called a notebook from a step function? If so, how did you do it?
If it is not possible, perhaps I can create a lambda function to call 'test.ipynb'. Is anyone familiar with the code to do so, or can someone point me in the right direction? I found this video, but it uses API Gateway, which I am unsure I need. I also checked the AWS Lambda and Step Functions documentation but did not find a solution to this. I also tried to use AWS Data Pipeline, but that API is blocked due to security reasons.
I am also wondering whether there is a more practical / efficient way to call a python notebook; maybe I found no solutions because this is not a recommended approach.
Thank you in advance for your assistance

AWS Lambda "Unable to import moudle"

I have been trying to create my first Lambda function but every time I run it I get "Unable to import module 'app'".
I have been following the tutorial written by Amazon to create a function that connects to an Amazon RDS database, so my Lambda function must include the PyMySQL library.
https://docs.aws.amazon.com/lambda/latest/dg/vpc-rds-create-rds-mysql.html
I am pretty sure my issue is that I am not zipping the contents of the directories correctly, so Lambda cannot find my app.py file to locate the handler and execute it.
I have followed the steps to create a deployment package that I found online: creating new Linux machines, zipping inside a virtualenv, etc., but nothing works and I get the same error back each time.
Can someone please write down, step by step with full commands, what I need to do to create the deployment package correctly, or what they do to get their Python functions to run in Lambda?
Thanks in advance!!
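For what it's worth, the usual cause of that error is zipping the enclosing folder rather than its contents, which leaves the handler module nested instead of at the archive root. A minimal sketch of building the package in Python (the directory and file names are assumptions):

import os
import zipfile

# Assumes dependencies were installed next to app.py beforehand, e.g.:
#   pip install pymysql -t package/
def build_package(src_dir="package", out_zip="function.zip"):
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _, files in os.walk(src_dir):
            for name in files:
                path = os.path.join(root, name)
                # arcname is taken relative to src_dir so that app.py sits
                # at the archive root; zipping the folder itself is what
                # produces "Unable to import module 'app'"
                zf.write(path, os.path.relpath(path, src_dir))

build_package()

With that layout, and assuming your handler function is named handler, a handler setting of app.handler should resolve.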

What functions does the CLI perform before executing the command

This topic came up during a discussion about CLIs, GUIs, and interfaces in general. I gathered some information about it but could not find a definitive, complete answer.
Suppose you are developing a CLI for a set of APIs. What functions can the CLI perform before it calls the particular method for the command that was entered?
There are a few points that I was able to think of or found on the internet:
Validation of the command
Security check
What else does a CLI do before executing the command?
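As an illustration of the parsing and validation step, here is a minimal Python sketch (the command names are hypothetical):

import argparse

def main(argv=None):
    parser = argparse.ArgumentParser(prog="mycli")
    subparsers = parser.add_subparsers(dest="command", required=True)
    upload = subparsers.add_parser("upload", help="upload a file")
    upload.add_argument("path")
    # parsing and validation happen here: unknown commands or missing
    # arguments are rejected before any API method is called
    args = parser.parse_args(argv)
    # a security/permission check could run here, before dispatch
    if args.command == "upload":
        print("would call the upload API with", args.path)

main(["upload", "report.csv"])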

EC2 init.d script - what's the best practice

I'm creating an init.d script that will run a couple of tasks when the instance starts up.
It will create a new volume with our code repository and mount it, if it doesn't exist already.
It will tag the instance.
Completing the tasks above is crucial for our site (i.e. without the code repository mounted, the site won't work). How can I make sure the server doesn't end up being publicly visible before they finish? Should I start my init.d script by de-registering the instance from the ELB (I'm not even sure it will be registered at that point), and then register it again when all the tasks finish successfully?
What is the best practice?
Thanks!
You should have a health check on your ELB, so your server shouldn't be put into service unless it reports as healthy. And it shouldn't report healthy if the boot script errors out.
(Also, you should look into using cloud-init. That way you can change the boot script without making a new AMI.)
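For example, a classic ELB health check can point at an endpoint the app only serves once boot has finished; a sketch with boto3 (the load balancer name and target are placeholders):

import boto3

elb = boto3.client("elb")  # classic ELB API
elb.configure_health_check(
    LoadBalancerName="my-load-balancer",
    HealthCheck={
        # only passes once the app is actually serving, i.e. after the
        # boot script has mounted the code volume successfully
        "Target": "HTTP:80/health",
        "Interval": 30,
        "Timeout": 5,
        "UnhealthyThreshold": 2,
        "HealthyThreshold": 2,
    },
)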
I suggest you use CloudFormation instead. You can bring up a full stack of your system by describing it in a JSON template.
For example, you can create an Auto Scaling group whose instances have unique tags and an additional volume attached (which presumably holds your code).
Here's a sample JSON template attaching an EBS volume to an instance:
https://s3.amazonaws.com/cloudformation-templates-us-east-1/EC2WithEBSSample.template
And here many other JSON templates that you can use for your guidance and deploy your specific Stack and Application.
http://aws.amazon.com/cloudformation/aws-cloudformation-templates/
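Launching a stack from one of those templates can itself be scripted; a sketch with boto3 (the stack name is made up, and the required parameters depend on the template):

import boto3

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="ec2-with-ebs-sample",
    TemplateURL="https://s3.amazonaws.com/cloudformation-templates-us-east-1/EC2WithEBSSample.template",
    # parameter names depend on the template; KeyName is a typical one
    Parameters=[{"ParameterKey": "KeyName", "ParameterValue": "my-keypair"}],
)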
Of course you can accomplish the same using an init.d script or the rc.local file in your instance, but I believe CloudFormation is a cleaner solution because it works from outside (not inside) your instance.
You can also write your own script that brings up your stack from the outside, but why reinvent the wheel.
Hope this helps.
