AWS Lambda "Unable to import module" - aws-lambda

I have been trying to create my first Lambda function but every time I run it I get "Unable to import module 'app'".
I have been following the tutorial written by Amazon to create a function that connects to an Amazon RDS database, so my Lambda function must include the PyMySQL library.
https://docs.aws.amazon.com/lambda/latest/dg/vpc-rds-create-rds-mysql.html
I am pretty sure my issue is that I am not zipping the contents of the directories correctly, so Lambda cannot find my app.py file, locate the handler, and execute it.
I have followed the steps to create a deployment package, looked online, created new Linux machines, zipped inside a virtual env, etc., but nothing works and I get the same error back each time.
Can someone please write down, step by step and with full commands, what I need to do to create the deployment package correctly, or what they do to get their Python functions to run in Lambda?
Thanks in advance!!
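The packaging problem described above can be sketched in Python. This is a minimal sketch, not the tutorial's method: it assumes the handler setting has the form app.<function>, so app.py has to sit at the root of the archive rather than inside a parent folder. The function name build_package and the folder name "package" are hypothetical; the dependencies are assumed to have been installed into that folder first, for example with pip install --target package pymysql.

```python
import zipfile
from pathlib import Path
from typing import List


def build_package(source_dir: str, zip_path: str) -> List[str]:
    """Zip the *contents* of source_dir so its files sit at the archive root."""
    source = Path(source_dir).resolve()
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in source_dir.
            if path.is_file() and path != target:
                # The stored name is relative to source_dir, so app.py is
                # written as "app.py" and not as "package/app.py".
                arcname = path.relative_to(source).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


if __name__ == "__main__":
    # "package" is a hypothetical folder holding app.py and the pymysql folder.
    for name in build_package("package", "function.zip"):
        print(name)
```

If the printed list shows app.py with a folder prefix in front of it, the archive was built from the parent directory, which is one common cause of this error. A dependency missing from the archive can produce the same message.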

Related

Provision existing AWS lambda using Terraform

I was learning Terraform and was asked to set it up for a CI/CD pipeline on GitLab.
My question is this:
Let's say a Lambda function is already running/live.
How can I provision it using Terraform?
Should I use a data block to consume the running AWS Lambda?
Or is this not how it works? I am not sure how to do this.
I searched the docs, but they don't seem to cover this use case.
For a Lambda function that is already running, there are basically two use cases:
Either you want to make further changes/updates to that Lambda later on using Terraform. In this case, you need to import it into your Terraform state, and then all the changes you make to that Lambda can be deployed via your CI/CD pipeline, e.g.:
terraform import aws_lambda_function.my_lambda existing_lambda_function_name
Note: my_lambda is the name of the aws_lambda_function resource block in your Terraform code that describes the Lambda that is already running. The import matches the existing resource to that block and adds it to the state.
Or you only need some outputs of that Lambda to use as inputs to other services. Here you can keep the Lambda up and running and use a Terraform data source, e.g.:
data "aws_lambda_function" "existing_lambda" {
  function_name = var.function_name
}
And somewhere else in your code you can use it as follows:
function_name = data.aws_lambda_function.existing_lambda.function_name
I hope this was helpful

How to run DBT in AWS Lambda?

I have currently dockerized my DBT solution and I launch it in AWS Fargate (triggered from Airflow). However, Fargate requires about 1 minute to start running (image pull + resource provisioning + etc.), which is great for long running executions (hours), but not for short ones (1-5 minutes).
I'm trying to run my docker container in AWS Lambda instead of in AWS Fargate for short executions, but I encountered several problems during this migration.
The one I cannot fix is the message below, which appears when running dbt deps --profiles-dir . && dbt run -t my_target --profiles-dir . --select my_model:
Running with dbt=0.21.0
Encountered an error:
[Errno 38] Function not implemented
It says a function is not implemented, but I cannot see anywhere which function that is. Since the error appears while installing the dbt packages (redshift and dbt_utils), I tried downloading them and including them in the Docker image (with local paths set in packages.yml), but nothing changed. Moreover, DBT writes no logs at this phase (I set the log-path to /tmp in dbt_project.yml so that it has write permissions within the Lambda), so I'm blind.
Digging into this problem, I've found that it may be related to multiprocessing issues within AWS Lambda (my Docker image contains Python scripts), as stated in https://github.com/dbt-labs/dbt-core/issues/2992. I run DBT from Python using the subprocess library.
Since it may be a multiprocessing issue, I have also tried setting "threads": 1 in profiles.yml, but it did not solve the problem.
Has anyone succeeded in deploying DBT in AWS Lambda?
I've recently been trying to do this, and the summary of what I've found is that it seems to be possible, but isn't worth it.
You can pretty easily build a Lambda Layer that includes dbt & the provider you want to use, but you'll also need to patch the multiprocessing behavior and invoke dbt.main from within the Lambda code. Once you've jumped through all those hoops, you're left with a dbt instance that is limited to a relatively small upper bound on memory, a 15 minute maximum runtime, and is throttled to a single thread.
This discussion gives a rough example of what's needed to get it running in Lambda: https://github.com/dbt-labs/dbt-core/issues/2992#issuecomment-919288906
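Before patching anything, it can help to confirm that the environment really lacks the primitives multiprocessing needs. The probe below is a minimal sketch under an assumption: that the "[Errno 38] Function not implemented" error comes from missing semaphore support, as the linked issue suggests. It is not dbt code, and the function name semaphores_available is hypothetical.

```python
import errno
import multiprocessing


def semaphores_available() -> bool:
    """Return False if the platform cannot create multiprocessing locks."""
    try:
        # Creating a Lock allocates an OS-level semaphore under the hood.
        lock = multiprocessing.Lock()
    except OSError as exc:
        # ENOSYS is errno 38 on Linux: "Function not implemented".
        if exc.errno == errno.ENOSYS:
            return False
        raise
    del lock
    return True


if __name__ == "__main__":
    print("multiprocessing semaphores available:", semaphores_available())
```

Running this inside the Lambda container and on a local machine should show whether the two environments differ in this respect. If both report True, the error probably has another cause.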
All that said, I'd love to put dbt on a Lambda and I hope dbt's multiprocessing will one day support it.

AWS: Call jupyter notebook from step function or lambda

I created a python notebook called 'test.ipynb' in SageMaker to retrieve a csv file from my S3 bucket, manipulate the data, calculate values, create a new csv file and save it back into the S3 bucket. This part works.
I want to test triggering it from a step function or from a lambda function. For step functions, I added a SageMaker event / item / step called StartNotebookInstance that successfully starts the notebook instance. The next step is to start the notebook 'test.ipynb', but I do not see anything in that step that allows me to specify the notebook name. I also do not see an equivalent of 'RunNotebook'. Has anyone successfully called a notebook from a step function? If so, how did you do it?
If it is not possible, perhaps I can create a lambda function to call 'test.ipynb'. Is anyone familiar with the code to do so or can someone point me in the right direction? I found this video but it uses api gateway which I am unsure I need. I also checked the aws lambda and step documents but did not find any solution to this behavior. I also tried to use aws data pipeline but that api is blocked due to security reasons.
I am also wondering if there is a more practical / efficient way to call a python notebook because I did not find any solutions and maybe it is because it is not a recommended way.
Thank you in advance for your assistance

AWS Lambda: How To Upload & Test Code Using Python And Command Line

I am no longer able to edit my AWS lambda function using the inline editor because of the error, "Your inline editor code size is too large. Maximum size is 51200." Yet, I can't find a walk-through that explains how to do these things from localhost:
Upload a python script to Lambda
Supply "event" data to the script
View Lambda output
You'll need to create a deployment package for your code, which is just a zip archive but with a particular format. Instructions are in the AWS Python deployment documentation.
Then you can supply event data to your script through the event argument of the handler (the context argument carries runtime information about the invocation). Starter information is in the AWS Python programming model documentation.
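To illustrate how event data reaches the script, here is a minimal sketch of a handler that can also be exercised from localhost. The handler body, the "name" field, and the sample event are hypothetical; only the handler(event, context) signature is the standard one for Python Lambda functions.

```python
import json


def handler(event, context):
    """Hypothetical Lambda handler: builds a greeting from the event."""
    # The invocation payload arrives as a plain dict in `event`.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Locally the event is just a dict; context is unused here, so None is passed.
    sample_event = {"name": "Lambda"}
    print(handler(sample_event, None))
```

Once the package is uploaded, the same JSON can be sent with the AWS CLI through the --payload option of aws lambda invoke, which writes the function's response to an output file.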
Side note: once your Lambda code starts to get larger, it's often handy to move to some sort of management framework. Several have been written for Lambda. I use Apex, which is written in Go but works for any Lambda code. You might be more comfortable with Gordon (has a great list of examples and is more active) or Kappa, which are both written in Python.

Working with Flask-Script and cron jobs

So I've been meaning to create a cron job on my prototype Flask app running on Heroku. Searching the web I found that the best way is by using Flask-Script but I fail to see the point of using it. Do I get easier access to my app logic and storage info? And if I do use Flask-Script, how do I organize it around my app? I'm using it right now to start my server without really knowing the benefits. My folder structure is like this:
/app
  /manage.py
  /flask_prototype
    all my Flask code
Should I put the 'script.py' to be run by the Heroku Scheduler in the app folder, at the same level as manage.py? If so, do I get access to the models defined within flask_prototype?
Thank you for any info
Flask-Script just provides a framework under which you can create your script(s). It does not give you any better access to the application than what you can obtain when you write a standalone script. But it handles a few mundane tasks for you, like command line arguments and help output. It also folds all of your scripts into a single, consistent command line master script (this is manage.py, in case it isn't clear).
As for where to put the script, it does not really matter. As long as manage.py can import it and register it with Flask-Script, and your script can import what it needs from the application, you should be fine.
