I have created a deployment package for AWS Lambda with my Python file and the dependencies, including sqlalchemy and psycopg2. The code works perfectly when accessing the DB locally. But when I uploaded this zip file, I got the following error.
No module named 'psycopg2._psycopg': ModuleNotFoundError
The stack trace of the error is,
{
  "errorMessage": "No module named 'psycopg2._psycopg'",
  "errorType": "ModuleNotFoundError",
  "stackTrace": [
    [
      "/var/task/DBAccessLamdaHandler.py",
      50,
      "lambda_handler",
      "engine = create_engine(rds_host)"
    ],
    [
      "/var/task/sqlalchemy/engine/__init__.py",
      387,
      "create_engine",
      "return strategy.create(*args, **kwargs)"
    ],
    [
      "/var/task/sqlalchemy/engine/strategies.py",
      80,
      "create",
      "dbapi = dialect_cls.dbapi(**dbapi_args)"
    ],
    [
      "/var/task/sqlalchemy/dialects/postgresql/psycopg2.py",
      554,
      "dbapi",
      "import psycopg2"
    ],
    [
      "/var/task/psycopg2/__init__.py",
      50,
      "<module>",
      "from psycopg2._psycopg import ( # noqa"
    ]
  ]
}
Any help is appreciated.
The AWS Lambda runtime environment doesn't include the PostgreSQL libraries so you need to include them within your AWS Lambda upload.
One way to do this is to get them from the jkehler/awslambda-psycopg2 repo at GitHub. Note that you don't need to build this project from scratch as the repo includes a pre-built package in the psycopg2 folder that you can simply include in your Lambda upload.
The psycopg2 library from jkehler/awslambda-psycopg2 was built for Python 3.6, so make sure that when uploading your code to AWS Lambda you select Python 3.6 as the runtime, and it should work. I banged my head on this for a full day, and when I changed to 3.6 the import error just vanished.
If you are going to attempt to build it yourself, remember that you must build on a machine or VM with the same architecture as your target at AWS.
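For example, here is a minimal sketch of such a self-build using an Amazon Linux Docker container (the image tag and package names are assumptions; adjust them to your target runtime and Python version):

# Compile psycopg2 inside an Amazon Linux container so the native
# _psycopg module matches the Lambda environment; the result lands
# in ./package, ready to be added to the deployment zip.
docker run --rm -v "$PWD":/var/task amazonlinux:2 bash -c "
  yum install -y gcc python3 python3-devel python3-pip postgresql-devel &&
  pip3 install psycopg2 --target /var/task/package
"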
Like the other answers say, psycopg2-binary worked fine for Python 3.9 (it looks like the awslambda-psycopg2 package is only available for Python 3.6).
But if you build on macOS before sending the package to AWS Lambda, you must specify the platform in your pip install, like this:
pip3.9 install --platform=manylinux1_x86_64 --only-binary=:all: psycopg2-binary
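Note that pip may also insist on an explicit target directory when --platform is given; here is a variant that drops the files into a folder you can zip (the folder name is just an example):

# install Linux-compatible wheels into ./package for the deployment zip
pip3.9 install --platform=manylinux1_x86_64 --only-binary=:all: --target ./package psycopg2-binary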
Latest as of 26/MAR/2020
I was skeptical about depending on a third-party library for my production code. After some research, the following works.
The issue happens only when the packages are built on macOS.
I can confirm today that the issue is fixed when I build the package on CentOS 7 (the AWS AMI).
The following is my approach
requirements.txt
psycopg2-binary==2.8.4
Build process
pip install -r requirements.txt --target .
Lambda code is in the root directory
+-- lambda_function.py
+-- psycopg2
    +-- psycopg2 files
Zip the directory and test that the code works in Lambda.
The only additional step is to build the package in a Linux environment instead of macOS, using a Docker container. An example can be found here: Deploy AWS Amplify Python Lambda from macOS with Docker
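As a rough sketch of that Docker step (the image name is an assumption; any Amazon Linux based Python build image should do):

# run pip inside an Amazon Linux based image so the compiled wheels
# match the Lambda runtime; the output lands in the current directory
docker run --rm -v "$PWD":/var/task -w /var/task public.ecr.aws/sam/build-python3.8 \
    pip install -r requirements.txt --target .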
Thanks to AWS Lambda Layers, we can include a ready-compiled psycopg2 layer for our selected Python version directly in our Lambda function. Use the version you need from this GitHub repo. Be careful to use the correct Python version and region when creating the Lambda function and the layer.
For Windows (and seemingly any OS that is not built on an Amazon AMI or CentOS), the easiest fix is to use
psycopg2 Python Library for AWS Lambda
which you can find https://github.com/jkehler/awslambda-psycopg2
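A rough sketch of pulling in the pre-built package (the folder name inside the repo depends on the Python version you target; check the repo layout):

git clone https://github.com/jkehler/awslambda-psycopg2.git
# copy the pre-built psycopg2 folder next to your handler before zipping
cp -r awslambda-psycopg2/psycopg2-3.6 ./psycopg2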
I've had issues getting SQLAlchemy to run on AWS Lambda despite trying multiple psycopg2 versions; what eventually solved it for me was using an older Python version. I went from Python 3.9 to 3.7 and it finally ran (using psycopg2-binary 2.8.4; I didn't try other versions or the non-binary package with 3.7).
I faced a similar issue. I tried to create the layers using AWS CloudShell; it worked with Python 3.8 but failed with Python 3.9.
Please change the region name as per your AWS region.
sudo amazon-linux-extras install python3.8
curl -O https://bootstrap.pypa.io/get-pip.py
python3.8 get-pip.py --user
mkdir python
python3.8 -m pip install --platform=manylinux1_x86_64 --only-binary=:all: pandas numpy pymysql psycopg2-binary SQLAlchemy -t python/
zip -r layer.zip python
aws lambda publish-layer-version \
--layer-name mics-layer \
--description "Pandas Numpy psycopg2-binary SQLAlchemy pymysql" \
--zip-file fileb://layer.zip \
--compatible-runtimes python3.8 python3.9 \
--region eu-west-1
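Once the layer is published, you can attach it to a function, for example (the function name and layer version ARN are placeholders; use the LayerVersionArn returned by the publish command):

aws lambda update-function-configuration \
    --function-name my-function \
    --layers arn:aws:lambda:eu-west-1:123456789012:layer:mics-layer:1 \
    --region eu-west-1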
Instead of using psycopg2, try using pg8000.
In the command prompt, go to the directory you are currently working in (for example F:\example) and run:
pip install pg8000 -t .
(the -t . flag installs pg8000 into the directory you are working in)
# handler.py
import pg8000

# fill in your connection details (port must be an integer, e.g. 5432)
database = ''
host = ''
port = ''
user = ''
password = ''

conn = pg8000.connect(database=database, host=host, port=port,
                      user=user, password=password)

def lambda_handler(event, context):
    # ...
    # your queries go here
    # ...
After writing your code, zip it and upload it to your AWS Lambda function (a sketch of this step follows below).
It worked for me!!!
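For the zip-and-upload step, something along these lines should work (the function name is a placeholder):

# zip the handler together with the locally installed pg8000 packages
zip -r function.zip .
# upload the package to an existing Lambda function
aws lambda update-function-code --function-name my-function --zip-file fileb://function.zip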
If you are using CDK you can use SQLAlchemy on AWS Lambda by bundling with Docker and providing the right pip install command:
Python 3.8:
sqlalchemy_layer = lambda_.LayerVersion(
    scope=self,
    id=LAYERS_SQLALCHEMY_LAYER,
    layer_version_name=LAYERS_SQLALCHEMY_LAYER,
    code=lambda_.Code.from_asset(
        str(pathlib.Path(__file__).parent.joinpath("dependencies").resolve()),
        bundling=BundlingOptions(
            image=lambda_.Runtime.PYTHON_3_8.bundling_image,
            command=[
                "bash", "-c",
                "pip install -r requirements-sqlalchemy.txt -t /asset-output/python && cp -au . /asset-output/python",
            ],
        ),
    ),
    compatible_runtimes=[
        lambda_.Runtime.PYTHON_3_8,
    ],
)
Python 3.9:
sqlalchemy_layer = lambda_.LayerVersion(
    scope=self,
    id=LAYERS_SQLALCHEMY_LAYER,
    layer_version_name=LAYERS_SQLALCHEMY_LAYER,
    code=lambda_.Code.from_asset(
        str(pathlib.Path(__file__).parent.joinpath("dependencies").resolve()),
        bundling=BundlingOptions(
            image=lambda_.Runtime.PYTHON_3_9.bundling_image,
            command=[
                "bash", "-c",
                "pip3.9 install --platform=manylinux1_x86_64 --only-binary=:all: -r requirements-sqlalchemy.txt -t /asset-output/python && cp -au . /asset-output/python",
            ],
        ),
    ),
    compatible_runtimes=[
        lambda_.Runtime.PYTHON_3_9,
    ],
)
The requirements-sqlalchemy.txt:
psycopg2-binary
sqlalchemy
And the file structure:
dependencies/
    requirements-sqlalchemy.txt
my_stack.py
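To actually use the layer, reference it from a function definition; a minimal sketch (the function id, handler and asset path are placeholders):

my_function = lambda_.Function(
    scope=self,
    id="MyDbFunction",
    runtime=lambda_.Runtime.PYTHON_3_9,
    handler="lambda_function.lambda_handler",
    code=lambda_.Code.from_asset("src"),
    layers=[sqlalchemy_layer],  # the layer defined above
)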
When I try to run the experiment defined in this notebook, I encounter an error while it is creating the conda env. The error occurs when the cell below is executed:
from azureml.core import Experiment, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.widgets import RunDetails

# Create a Python environment for the experiment
sklearn_env = Environment("sklearn-env")

# Ensure the required packages are installed (we need scikit-learn, Azure ML defaults, and Azure ML dataprep)
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
                                    pip_packages=['azureml-defaults','azureml-dataprep[pandas]'])
sklearn_env.python.conda_dependencies = packages

# Get the training dataset
diabetes_ds = ws.datasets.get("diabetes dataset")

# Create a script config
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='diabetes_training.py',
                                arguments=['--regularization', 0.1,  # Regularization rate parameter
                                           '--input-data', diabetes_ds.as_named_input('training_data')],  # Reference to dataset
                                environment=sklearn_env)

# submit the experiment
experiment_name = 'mslearn-train-diabetes'
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=script_config)
RunDetails(run).show()
run.wait_for_completion()
Every time I run this, I face the same issue while creating the conda env:
Creating conda environment...
Running: ['conda', 'env', 'create', '-p', '/home/azureuser/.azureml/envs/azureml_000000000000', '-f', 'azureml-environment-setup/mutated_conda_dependencies.yml']
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Installing pip dependencies: ...working...
Attempting to clean up partially built conda environment: /home/azureuser/.azureml/envs/azureml_000000000000
Remove all packages in environment /home/azureuser/.azureml/envs/azureml_000000000000:
Creating conda environment failed with exit code: -15
I could not find anything useful on the internet, and this is not the only script where it fails; I have sometimes faced this issue when running other experiments too. One solution which worked in the above case was moving pandas from pip to conda, after which it was able to create the conda env. Example below:
# Original (fails to create the environment):
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
                                    pip_packages=['azureml-defaults','azureml-dataprep[pandas]'])

# Modified (pandas moved from pip to conda, environment is created successfully):
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip','pandas'],
                                    pip_packages=['azureml-defaults','azureml-dataprep'])
The error message (and the logs from Azure) are also not much help. I would appreciate it if a proper solution is available.
Edit: I have recently started learning to use Azure for machine learning, so I am not sure if I am missing something. I assume the example notebooks should work as-is, hence this question.
short answer
Totally been in your shoes before. This code sample seems a smidge out of date. Using this notebook as a reference, can you try the following?
packages = CondaDependencies.create(
    pip_packages=['azureml-defaults','scikit-learn']
)
longer answer
Using pip with conda is not always smooth sailing. In this instance, conda isn't reporting up the issue that pip is having. The solution is to create and test this environment locally, where we can get more information; at the very least it will give you a more informative error message.
Install anaconda or miniconda (or use an Azure ML Compute Instance which has conda pre-installed)
Make a file called environment.yml that looks like this
name: aml_env
dependencies:
  - python=3.8
  - pip=21.0.1
  - pip:
    - azureml-defaults
    - azureml-dataprep[pandas]
    - scikit-learn==0.24.1
Create this environment with the command conda env create -f environment.yml.
Respond to any discovered error message.
If there's no error, use this new environment.yml with Azure ML like so:
sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './environment.yml')
more context
The error I'm guessing is happening occurs when you reference a pip requirements file from a conda environment file. In this scenario, conda calls pip install -r requirements.txt, and if that command errors out, conda can't report the error.
requirements.txt
scikit-learn==0.24.1
azureml-dataprep[pandas]
environment.yml
name: aml_env
dependencies:
  - python=3.8
  - pip=21.0.1
  - pip:
    - -r requirements.txt
What worked for me looking at the previous notebook 05 - Train Models.ipynb:
packages = CondaDependencies.create(conda_packages=['pip', 'scikit-learn'],
                                    pip_packages=['azureml-defaults'])
You have to:
Remove 'azureml-dataprep[pandas]' from pip_packages
Change the order of conda_packages - pip should go first
I'm trying to spin up a super simple package as a proof of concept and I can't see what I'm missing.
My aim is to be able to do the following:
$ python3
>>> import mypackage
>>> mypackage.add2(2)
4
Github link
I created a public repo to reproduce the issue here
git clone https://github.com/OliverFarren/testPackage
Problem
I have a basic file structure as follows:
src/
    mypackage/
        __init__.py
        mymodule.py
setup.cfg
setup.py
pyproject.toml
setup.cfg is pretty much boilerplate from here.
setup.py is just there to allow pip install in editable mode:
import setuptools
setuptools.setup()
I ran the following commands from the top-level directory in my PyCharm virtual env:
python3 -m pip install --upgrade build
python3 -m build
That created my dist and build directories and the mypackage.egg-info file, so now the directory looks like this:
testpackage/
    build/
        bdist.linux-x86_64/
    dist/
        mypackage-0.1.0.tar.gz
        mypackage-0.1.0-py3-none-any.whl
    src/
        mypackage/
            mypackage.egg-info
            __init__.py
            mymodule.py
    setup.cfg
    setup.py
    pyproject.toml
I've then tried to install the package as follows:
sudo pip3 install -e .
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Installing collected packages: mypackage
Running setup.py develop for mypackage
Successfully installed mypackage
Which I think should have installed it, except when I try to import the package I get a ModuleNotFoundError.
I'm wondering whether this is a permissions issue of some sort. When I try:
sudo pip3 list
pip3 list
I notice I'm getting different outputs; I can see my package present in the list and in my sys.path:
~/testpackage/src/mypackage
I just don't understand what I'm missing here. Any advice would be appreciated!
Ok, so I found the issue. Posting the solution and leaving the GitHub repo live (with the fix) in case anyone else has this issue.
It turns out my setup.cfg wasn't boilerplate.
Here was my incorrect code:
[metadata]
# replace with your username:
name = mypackage
author = Oliver Farren
version = 0.1.0
description = Test Package
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src/mypackage
packages = find:
python_requires = >=3.6

[options.packages.find]
where = src/mypackage
src/mypackage should be src; it was looking inside the package for packages.
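For clarity, these are the corrected sections based on that fix (the rest of the file stays the same):

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.6

[options.packages.find]
where = src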
A key step in debugging this issue was checking the mypackage.egg-info files. SOURCES.txt contains a list of all the files in the built package, and I could clearly see that in the incorrect build src/mypackage/mymodule.py and src/mypackage/__init__.py were missing. So the package was correctly installed by pip, but being empty it made for a very confusing error message.
I have a CodePipeline that grabs code out of CodeCommit, bundles it up in CodeBuild and then publishes it via CloudFormation.
I want to use the Python package gspread and because it's not part of the standard AWS Linux image I need to install it.
Currently when the code is run I get the error:
[ERROR] Runtime.ImportModuleError: Unable to import module 'index': No module named 'gspread'
Code structure
- buildspec.yml
- template.yml
package/
    - gspread/
    - gspread-3.6.0.dist-info/
    - (37 other python packages)
source/
    - index.py
buildspec.yml -- EDITED
version: 0.2

phases:
  install:
    commands:
      # Use Install phase to install packages or any pre-reqs you may need throughout the build (e.g. dev deps, security checks, etc.)
      - echo "[Install phase]"
      - pip install --upgrade pip
      - pip install --upgrade aws-sam-cli
      - sam --version
      - cd source
      - ls
      - pip install --target . gspread oauth2client
      # consider using pipenv to install everything in the environment and then copy the files installed into the /source folder
      - ls
    runtime-versions:
      python: 3.8
  pre_build:
    commands:
      # Use Pre-Build phase to run tests, install any code deps or any other customization before build
      # - echo "[Pre-Build phase]"
  build:
    commands:
      - cd ..
      - sam build
  post_build:
    commands:
      # Use Post Build for notifications, git tags and any further customization after build
      - echo "[Post-Build phase]"
      - export BUCKET=property-tax-invoice-publisher-deployment
      - sam package --template-file template.yml --s3-bucket $BUCKET --output-template-file outputtemplate.yml
      - echo "SAM packaging completed on `date`"

##################################
# Build Artifacts to be uploaded #
##################################
artifacts:
  files:
    - outputtemplate.yml
  discard-paths: yes

cache:
  paths:
    # List of path that CodeBuild will upload to S3 Bucket and use in subsequent runs to speed up Builds
    - '/root/.cache/pip'
The index.py file has more in it than this, but to show the offending line:
-- index.py --
import os
import boto3
import io
import sys
import csv
import json
import smtplib
import gspread  # <--- Right here

def lambda_handler(event, context):
    print("In lambda_handler")
What I've tried
Creating the /package folder and committing the gspread and other packages
Running "pip install gspread" in the CodeBuild builds: commands:
At the moment, I'm installing it everywhere and seeing what sticks. (nothing is currently sticking)
Version: Python 3.8
I think you may need to do the following steps:
Use virtualenv to install the packages locally.
Create requirements.txt to let CodeBuild know of the package requirements.
In the CodeBuild buildspec.yml, include commands to install virtualenv and then supply requirements.txt:
pre_build:
  commands:
    - pip install virtualenv
    - virtualenv env
    - . env/bin/activate
    - pip install -r requirements.txt
Detailed steps here for reference:
https://adrian.tengamnuay.me/programming/2018/07/01/continuous-deployment-with-aws-lambda/
My OS is win 10 Home.
I have a Python 3.6 script that updates an MSSQL DB using pymssql. The script works fine locally.
Now I need to upload it as an AWS Lambda function, so I followed this using cmd:
python -m venv .
Scripts\activate
pip install pymssql
Then I copied my .py function into the Lib\site-packages dir, zipped all of the dir's content and uploaded it to the Lambda service.
The result was this error:
Unable to import module 'validationLambda': No module named 'pymssql'
How to fix this?
I am not strong with Windows environments, but you should probably try to do the opposite:
Copy the Lib\site-packages\pymssql directory to the root of your package (the same level as your_function.py).
Try to import cython along with pymssql.
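As an alternative to copying the Windows-built files (just a sketch, borrowing the approach from the psycopg2 answers above): ask pip for a Linux-compatible wheel and install it straight into your package folder before zipping:

# install a manylinux wheel of pymssql into the current package directory
pip install --platform manylinux1_x86_64 --only-binary=:all: --target . pymssql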
I am trying to use python-ldap with AWS Lambda. I downloaded the tarball from: https://pypi.python.org/pypi/python-ldap
and wrote this code to use it in Lambda (lambda_function.py):
from ldap_dir.ldap_query.Lib import ldap
and uploaded the zip to Lambda, where my directory structure is:
ldap_dir -> ldap_query -> Lib -> ldap folder
ldap_dir -> lambda_function.py
Am I missing something?
python-ldap is built on top of native OpenLDAP libraries. This article - even though unrelated to the python ldap module - describes how to bundle Python packages that have native dependencies.
The outline of this is the following:
Create an Amazon EC2 instance with Amazon Linux
Install compiler packages as well as the OpenLDAP development package: yum install -y gcc openldap-devel
Create a virtual environment: virtualenv env
Activate the virtual environment: source env/bin/activate
Upgrade pip (I am not sure this is necessary, but I got a warning without this): pip install --upgrade pip
Install python-ldap: pip install python-ldap
Create a handler Python script, for example, lambda.py with the following code:
import os
import subprocess

libdir = os.path.join(os.getcwd(), 'local', 'lib')

def handler(event, context):
    command = 'LD_LIBRARY_PATH={} python ldap.py'.format(libdir)
    subprocess.call(command, shell=True)
Implement your LDAP function, in this example ldap.py:
import ldap
print ldap.PORT
Create a zip package, let's say ldap.zip:
zip -9 ~/ldap.zip ldap.py
zip -9 ~/ldap.zip lambda.py
cd env/lib/python2.7/site-packages
zip -r9 ~/ldap.zip *
cd ../../../lib64/python2.7/site-packages
zip -r9 ~/ldap.zip *
Download the zip to your system (or put it into an S3 bucket). Now you can create your Lambda function using lambda.handler as the function name and use the zip file as the code.
I hope this helps.
One more step/check to add to the solution above:
If you still get No module named '_ldap', check that the Python version you used to install the package locally/on EC2 is the same as the runtime on Lambda.
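A quick way to compare the two (the function name is a placeholder):

# the Python used to build the package, on your local machine or EC2 instance
python --version
# the runtime configured on the Lambda function
aws lambda get-function-configuration --function-name my-function --query Runtime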