How do I add a Python library dependency for a Lambda function in CodeBuild - aws-lambda

I have a CodePipeline that grabs code out of CodeCommit, bundles it up in CodeBuild, and then publishes it via CloudFormation.
I want to use the Python package gspread, and because it's not part of the standard AWS Linux image, I need to install it.
Currently, when the code is run, I get the error:
[ERROR] Runtime.ImportModuleError: Unable to import module 'index': No module named 'gspread'
Code structure
- buildspec.yml
- template.yml
package/
  - gspread/
  - gspread-3.6.0.dist-info/
  - (37 other python packages)
source/
  - index.py
buildspec.yml -- EDITED
version: 0.2
phases:
  install:
    commands:
      # Use Install phase to install packages or any pre-reqs you may need throughout the build (e.g. dev deps, security checks, etc.)
      - echo "[Install phase]"
      - pip install --upgrade pip
      - pip install --upgrade aws-sam-cli
      - sam --version
      - cd source
      - ls
      - pip install --target . gspread oauth2client
      # consider using pipenv to install everything in the environment and then copy the files installed into the /source folder
      - ls
    runtime-versions:
      python: 3.8
  pre_build:
    commands:
      # Use Pre-Build phase to run tests, install any code deps or any other customization before build
      # - echo "[Pre-Build phase]"
  build:
    commands:
      - cd ..
      - sam build
  post_build:
    commands:
      # Use Post Build for notifications, git tags and any further customization after build
      - echo "[Post-Build phase]"
      - export BUCKET=property-tax-invoice-publisher-deployment
      - sam package --template-file template.yml --s3-bucket $BUCKET --output-template-file outputtemplate.yml
      - echo "SAM packaging completed on `date`"
##################################
# Build Artifacts to be uploaded #
##################################
artifacts:
  files:
    - outputtemplate.yml
  discard-paths: yes
cache:
  paths:
    # List of paths that CodeBuild will upload to S3 Bucket and use in subsequent runs to speed up Builds
    - '/root/.cache/pip'
The index.py file has more in it than this, but this shows the offending line:
-- index.py --
import os
import boto3
import io
import sys
import csv
import json
import smtplib
import gspread  # <--- Right here

def lambda_handler(event, context):
    print("In lambda_handler")
What I've tried
- Creating the /package folder and committing the gspread and other packages
- Running "pip install gspread" in the CodeBuild build commands
At the moment, I'm installing it everywhere and seeing what sticks (nothing is currently sticking).
Version: Python 3.8

I think you may need to do the following steps:
- Use virtualenv to install the packages locally.
- Create a requirements.txt to let CodeBuild know of the package requirements.
- In the CodeBuild buildspec.yml, include commands to install virtualenv and then supply requirements.txt:
pre_build:
  commands:
    - pip install virtualenv
    - virtualenv env
    - . env/bin/activate
    - pip install -r requirements.txt
Detailed steps here for reference:
https://adrian.tengamnuay.me/programming/2018/07/01/continuous-deployment-with-aws-lambda/

Related

Unable to use locally built python package in development mode

I'm trying to spin up a super simple package as a proof of concept and I can't see what I'm missing.
My aim is to be able to do the following:
>>> import mypackage
>>> mypackage.add2(2)
4
GitHub link
I created a public repo to reproduce the issue here
git clone https://github.com/OliverFarren/testPackage
Problem
I have a basic file structure as follows:
src/
  mypackage/
    __init__.py
    mymodule.py
setup.cfg
setup.py
pyproject.toml
setup.cfg is pretty boilerplate from here
setup.py is just to allow pip install in editable mode:
import setuptools
setuptools.setup()
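For context, a minimal sketch of what the package contents might look like; the function body here is an assumption based on the add2(2) example above, not copied from the repo:
# src/mypackage/__init__.py (hypothetical sketch)
from .mymodule import add2

# src/mypackage/mymodule.py (assuming add2 simply adds 2 to its argument)
def add2(x):
    return x + 2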
I ran the following commands at the top-level directory in my PyCharm virtual env:
python3 -m pip install --upgrade build
python3 -m build
That created my dist and build directories and mypackage.egg-info file so now the directory looks like this:
testpackage/
  build/
    bdist.linux-x86_64/
  dist/
    mypackage-0.1.0.tar.gz
    mypackage-0.1.0-py3-none-any.whl
  src/
    mypackage/
      mypackage.egg-info/
      __init__.py
      mymodule.py
  setup.cfg
  setup.py
  pyproject.toml
I've then tried installing the package as follows:
sudo pip3 install -e .
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Installing collected packages: mypackage
Running setup.py develop for mypackage
Successfully installed mypackage
Which I think should have installed it, except when I try to import the package I get a ModuleNotFoundError.
I'm wondering whether this is a permissions issue of some sort. When I try:
sudo pip3 list
pip3 list
I notice I'm getting different outputs. I can see my package present in the list and in my sys.path:
~/testpackage/src/mypackage
I just don't understand what I'm missing here. Any advice would be appreciated!
OK, so I found the issue. Posting the solution and leaving the GitHub repo live, with the fix, in case anyone else has this issue.
It turns out my setup.cfg wasn't boilerplate.
Here was my incorrect code:
[metadata]
# replace with your username:
name = mypackage
author = Oliver Farren
version = 0.1.0
description = Test Package
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src/mypackage
packages = find:
python_requires = >=3.6

[options.packages.find]
where = src/mypackage
src/mypackage should be src: it was looking inside the package for packages.
A key step in debugging this issue was checking the mypackage.egg-info files. The SOURCES.txt contains a list of all the files in the built package, and I could clearly see that in the incorrect build, src/mypackage/mymodule.py and src/mypackage/__init__.py were missing. So the package was correctly installed by pip, but being empty it made for a very confusing error message.
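As an extra sanity check (a hedged suggestion, not part of the original fix), you can ask the interpreter where the installed package resolves from and whether its symbols are present:
# Run inside the same interpreter/venv that pip installed into.
import mypackage

print(mypackage.__file__)  # should point at src/mypackage/__init__.py for the editable install
print(dir(mypackage))      # should list add2 once the src layout is picked up correctly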

Anaconda: Installing package raises UnsatisfiableError on itself

I'm trying to build a new conda package. I used conda skeleton to create a meta.yaml file which I edited to my needs:
package:
  name: ofl
  version: "master"

source:
  git_url: git@github.com:fergalm/Fireworks.git
  git_tag: master

build:
  # noarch: python
  # preserve_egg_dir: True
  entry_points:
    # Put any entry points (scripts to be generated automatically) here. The
    # syntax is module:function. For example
    #
    # - pyinstrument = pyinstrument:main
    #
    # Would create an entry point called pyinstrument that calls pyinstrument.main()

  # If this is a new build for the same version, increment the build
  # number. If you do not include this key, it defaults to 0.
  # number: 1

requirements:
  build:
    - python
    - setuptools
  run:
    - python
    - boto3
    - yaml
    - dateutil
I can build this package with conda-build ofl, but when I try to install it with conda install ofl --use-local I get the following error
Fetching package metadata .............
Solving package specifications: .
UnsatisfiableError: The following specifications were found to be in conflict:
- ofl
Use "conda info <package>" to see the dependencies for each package.
What does it mean for my package, ofl, to be in conflict with itself? What do I need to do to fix this?

No module named 'psycopg2._psycopg': ModuleNotFoundError in AWS Lambda

I have created a deployment package for AWS Lambda with my Python file and the dependencies, including sqlalchemy and psycopg2. The code works perfectly in accessing the DB locally, but when I uploaded this zip file, I got the following error.
No module named 'psycopg2._psycopg': ModuleNotFoundError
The stack trace of the error is,
{
  "errorMessage": "No module named 'psycopg2._psycopg'",
  "errorType": "ModuleNotFoundError",
  "stackTrace": [
    [
      "/var/task/DBAccessLamdaHandler.py",
      50,
      "lambda_handler",
      "engine = create_engine(rds_host)"
    ],
    [
      "/var/task/sqlalchemy/engine/__init__.py",
      387,
      "create_engine",
      "return strategy.create(*args, **kwargs)"
    ],
    [
      "/var/task/sqlalchemy/engine/strategies.py",
      80,
      "create",
      "dbapi = dialect_cls.dbapi(**dbapi_args)"
    ],
    [
      "/var/task/sqlalchemy/dialects/postgresql/psycopg2.py",
      554,
      "dbapi",
      "import psycopg2"
    ],
    [
      "/var/task/psycopg2/__init__.py",
      50,
      "<module>",
      "from psycopg2._psycopg import ( # noqa"
    ]
  ]
}
Any help is appreciated.
The AWS Lambda runtime environment doesn't include the PostgreSQL libraries so you need to include them within your AWS Lambda upload.
One way to do this is to get them from the jkehler/awslambda-psycopg2 repo at GitHub. Note that you don't need to build this project from scratch as the repo includes a pre-built package in the psycopg2 folder that you can simply include in your Lambda upload.
The psycopg2 library from jkehler/awslambda-psycopg2 was built for Python 3.6, so make sure that when uploading your code to AWS Lambda you select Python 3.6 as the runtime, and it should work. I banged my head on this for a full day, and when I changed to 3.6 the import error just vanished.
If you are going to attempt to build it yourself, remember that you must build on a machine or VM with the same architecture as your target at AWS.
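A quick way to verify a bundle before wiring up the database code (a hypothetical smoke test, not part of the original answer) is a handler that does nothing but import the native extension:
def lambda_handler(event, context):
    # Raises ModuleNotFoundError for 'psycopg2._psycopg' if the package was
    # built for the wrong Python version or architecture.
    import psycopg2
    return {"psycopg2_version": psycopg2.__version__}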
As in other answers, psycopg2-binary worked fine for Python 3.9 (it looks like the awslambda-psycopg2 package was only available for Python 3.6).
But if you run on macOS before sending to AWS Lambda, you must specify the platform for your pip install like this:
pip3.9 install --platform=manylinux1_x86_64 --only-binary=:all: psycopg2-binary
Latest as of 26/MAR/2020
I was skeptical about depending on a third-party library for my production code. After some research, the following works:
The issue happens only when the packages are built from macOS.
I can confirm today that the issue is fixed when I build the package from CentOS 7 (AWS AMI).
The following is my approach.
requirements.txt
psycopg2-binary==2.8.4
Build process
pip install -r requirements.txt --target .
Lambda code is in the root directory
+-- lambda_function.py
+-- psycopg2
    +-- psycopg2 files
Zip the directory and test that the code works in Lambda.
The only additional step is to build the package in a Linux environment instead of macOS, using a Docker container. An example can be found here: Deploy AWS Amplify Python Lambda from macOS with Docker
Thanks to AWS Lambda Layers we can include a ready-compiled psycopg2 layer for our selected Python version directly in our Lambda function. Please use the version you need from this GitHub repo. Be careful to use the correct Python version and region when creating the Lambda function and the layer.
For Windows (and it looks like any OS that is not built on an Amazon AMI or CentOS), the easiest fix is to use the
psycopg2 Python library for AWS Lambda
which you can find at https://github.com/jkehler/awslambda-psycopg2
I've had issues getting SQLAlchemy to run on AWS Lambda even though I tried multiple psycopg2 versions, but what eventually solved it for me was using an older Python version. I went from Python 3.9 to 3.7 and it was finally able to run (using psycopg2-binary 2.8.4, but I didn't try other versions or the non-binary version with 3.7).
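For reference, a minimal sketch of the kind of engine setup involved; the connection string and query are placeholders, not taken from the original post:
from sqlalchemy import create_engine, text

def lambda_handler(event, context):
    # Succeeds once a psycopg2 build compatible with the Lambda runtime is bundled.
    engine = create_engine("postgresql+psycopg2://user:password@host:5432/dbname")
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar()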
I have faced a similar issue. I tried to create the layers using AWS CloudShell; it worked with Python 3.8 but failed with Python 3.9.
Please change the region name as per your AWS Region
sudo amazon-linux-extras install python3.8
curl -O https://bootstrap.pypa.io/get-pip.py
python3.8 get-pip.py --user
mkdir python
python3.8 -m pip install --platform=manylinux1_x86_64 --only-binary=:all: pandas numpy pymysql psycopg2-binary SQLAlchemy -t python/
zip -r layer.zip python
aws lambda publish-layer-version \
--layer-name mics-layer \
--description "Pandas Numpy psycopg2-binary SQLAlchemy pymysql" \
--zip-file fileb://layer.zip \
--compatible-runtimes python3.8 python3.9 \
--region eu-west-1
Instead of using psycopg2, try using pg8000.
In the command prompt, go to the directory that you are currently working in (example: F:\example).
pip install pg8000 -t .   (-t . installs pg8000 into the directory that you are working in)
# handler.py
import pg8000

database = ''
host = ''
port = ''
user = ''
password = ''

conn = pg8000.connect(database=database, host=host, port=port, user=user,
                      password=password)

def lambda_function(event, context):
    # ... your queries against conn go here ...
    ...

After writing your code, zip it and upload it to your AWS Lambda.
It worked for me!!!
If you are using CDK, you can use SQLAlchemy on AWS Lambda by bundling with Docker and providing the right pip install command.
Python 3.8:
sqlachemy_layer = lambda_.LayerVersion(
    scope=self,
    id=LAYERS_SQLALCHEMY_LAYER,
    layer_version_name=LAYERS_SQLALCHEMY_LAYER,
    code=lambda_.Code.from_asset(
        str(pathlib.Path(__file__).parent.joinpath("dependencies").resolve()),
        bundling=BundlingOptions(
            image=lambda_.Runtime.PYTHON_3_8.bundling_image,
            command=[
                "bash", "-c",
                "pip install -r requirements-sqlalchemy.txt -t /asset-output/python && cp -au . /asset-output/python",
            ],
        ),
    ),
    compatible_runtimes=[
        lambda_.Runtime.PYTHON_3_8,
    ],
)
Python 3.9:
sqlachemy_layer = lambda_.LayerVersion(
    scope=self,
    id=LAYERS_SQLALCHEMY_LAYER,
    layer_version_name=LAYERS_SQLALCHEMY_LAYER,
    code=lambda_.Code.from_asset(
        str(pathlib.Path(__file__).parent.joinpath("dependencies").resolve()),
        bundling=BundlingOptions(
            image=lambda_.Runtime.PYTHON_3_9.bundling_image,
            command=[
                "bash", "-c",
                "pip3.9 install --platform=manylinux1_x86_64 --only-binary=:all: -r requirements-sqlalchemy.txt -t /asset-output/python && cp -au . /asset-output/python",
            ],
        ),
    ),
    compatible_runtimes=[
        lambda_.Runtime.PYTHON_3_9,
    ],
)
The requirements-sqlalchemy.txt:
psycopg2-binary
sqlalchemy
And the file structure:
dependencies/
  requirements-sqlalchemy.txt
my_stack.py
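To actually use the layer, attach it to the function definition; a minimal sketch follows (the construct id, handler path, and asset folder are placeholders, not from the original answer):
my_function = lambda_.Function(
    scope=self,
    id="MyFunction",
    runtime=lambda_.Runtime.PYTHON_3_9,
    handler="handler.lambda_handler",
    code=lambda_.Code.from_asset("src"),
    layers=[sqlachemy_layer],  # the layer defined above
)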

GitLab CI - Cache not working

I'm currently using GitLab in combination with CI runners to run the unit tests of my project. To speed up the process of bootstrapping the tests, I'm using the built-in cache functionality; however, this doesn't seem to work.
Each time someone commits to master, my runner does a git fetch and proceeds to remove all cached files, which means I have to stare at my screen for around 10 minutes to wait for a test to complete while the runner re-downloads all dependencies (NPM and PIP being the biggest time killers).
Output of the CI runner:
Fetching changes...
Removing bower_modules/jquery/ --+-- Shouldn't happen!
Removing bower_modules/tether/ |
Removing node_modules/ |
Removing vendor/ --'
HEAD is now at 7c513dd Update .gitlab-ci.yml
Currently my .gitlab-ci.yml
image: python:latest

services:
  - redis:latest
  - node:latest

cache:
  key: "$CI_BUILD_REF_NAME"
  untracked: true
  paths:
    - ~/.cache/pip/
    - vendor/
    - node_modules/
    - bower_components/

before_script:
  - python -V
  # Still gets executed even though node is listed as a service??
  - '(which nodejs && which npm) || (apt-get update -q && apt-get -o dir::cache::archives="vendor/apt/" install nodejs npm -yqq)'
  - npm install -g bower gulp
  # Following statements ignore cache!
  - pip install -r requirements.txt
  - npm install --only=dev
  - bower install --allow-root
  - gulp build

test:
  variables:
    DEBUG: "1"
  script:
    - python -m unittest myproject
I've tried reading the following articles for help however none of them seem to fix my problem:
http://docs.gitlab.com/ce/ci/yaml/README.html#cache
https://fleschenberg.net/gitlab-pip-cache/
https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/issues/336
Turns out that I was doing some things wrong:
Your script can't cache files outside of your project scope; creating a virtual environment instead and caching that allows you to cache your pip modules.
Most important of all: your test must succeed in order for it to cache the files.
After using the following config I got a -3 minute time difference:
Currently my configuration looks as follows and works for me.
# Official framework image. Look for the different tagged releases at:
# https://hub.docker.com/r/library/python
image: python:latest

# Pick zero or more services to be used on all builds.
# Only needed when using a docker container to run your tests in.
# Check out: http://docs.gitlab.com/ce/ci/docker/using_docker_images.html#what-is-service
services:
  - mysql:latest
  - redis:latest

cache:
  untracked: true
  key: "$CI_BUILD_REF_NAME"
  paths:
    - venv/
    - node_modules/
    - bower_components/

# This is a basic example for a gem or script which doesn't use
# services such as redis or postgres
before_script:
  # Check python installation
  - python -V
  # Install NodeJS (Gulp & Bower)
  # Default repository is outdated, this is the latest version
  - 'curl -sL https://deb.nodesource.com/setup_8.x | bash -'
  - apt-get install -y nodejs
  - npm install -g bower gulp
  # Install dependencies
  - pip install -U pip setuptools
  - pip install virtualenv

test:
  # Indicate to the framework that it's being unit tested
  variables:
    DEBUG: "1"
  # Test script
  script:
    # Set up virtual environment
    - virtualenv venv -ppython3
    - source venv/bin/activate
    - pip install coverage
    - pip install -r requirements.txt
    # Install NodeJS & Bower + Compile JS
    - npm install --only=dev
    - bower install --allow-root
    - gulp build
    # Run all unit tests
    - coverage run -m unittest project.tests
    - coverage report -m project/**/*.py
Which resulted in the following output:
Fetching changes...
Removing .coverage --+-- Don't worry about this
Removing bower_components/ |
Removing node_modules/ |
Removing venv/ --`
HEAD is now at 24e7618 Fix for issue #16
From https://git.example.com/repo
85f2f9b..42ba753 master -> origin/master
Checking out 42ba7537 as master...
Skipping Git submodules setup
Checking cache for master... --+-- The files are back now :)
Successfully extracted cache --`
...
project/module/script.py 157 9 94% 182, 231-244
---------------------------------------------------------------------------
TOTAL 1084 328 70%
Creating cache master...
Created cache
Uploading artifacts...
venv/: found 9859 matching files
node_modules/: found 7070 matching files
bower_components/: found 982 matching files
Trying to load /builds/repo.tmp/CI_SERVER_TLS_CA_FILE ...
Dialing: tcp git.example.com:443 ...
Uploading artifacts to coordinator... ok id=127 responseStatus=201 Created token=XXXXXX
Job succeeded
For the coverage report, I used the following regular expression:
^TOTAL\s+(?:\d+\s+){2}(\d{1,3}%)$

AWS Lambda + Python-ldap

I am trying to use python-ldap with AWS Lambda. I downloaded the tarball from: https://pypi.python.org/pypi/python-ldap
and my code to use it in Lambda (lambda_function.py) is:
from ldap_dir.ldap_query.Lib import ldap
and uploaded the zip to Lambda.
where my directory structure is
ldap_dir -> ldap_query -> Lib -> ldap folder
ldap_dir -> lambda_function.py
Am I missing out something?
python-ldap is built on top of native OpenLDAP libraries. This article - even though unrelated to the python ldap module - describes how to bundle Python packages that have native dependencies.
The outline of this is the following:
Create an Amazon EC2 instance with Amazon Linux.
Install compiler packages as well as the OpenLDAP developer package: yum install -y gcc openldap-devel
Create a virtual environment: virtualenv env
Activate the virtual environment: source env/bin/activate
Upgrade pip (I am not sure this is necessary, but I got a warning without this): pip install --upgrade pip
Install python-ldap: pip install python-ldap
Create a handler Python script, for example, lambda.py with the following code:
import os
import subprocess

libdir = os.path.join(os.getcwd(), 'local', 'lib')

def handler(event, context):
    command = 'LD_LIBRARY_PATH={} python ldap.py'.format(libdir)
    subprocess.call(command, shell=True)
Implement your LDAP function, in this example ldap.py:
import ldap
print ldap.PORT
Create a zip package, let's say ldap.zip:
zip -9 ~/ldap.zip ldap.py
zip -9 ~/ldap.zip lambda.py
cd env/lib/python2.7/site-packages
zip -r9 ~/ldap.zip *
cd ../../../lib64/python2.7/site-packages
zip -r9 ~/ldap.zip *
Download the zip to your system (or put it into an S3 bucket). Now you can create your Lambda function using lambda.handler as the function name and use the zip file as the code.
I hope this helps.
One more step/check for the solution above:
If you still get No module named '_ldap', check whether the Python version you installed on local/EC2 is the same as the runtime selected on Lambda.
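A quick, hedged way to compare (this check is not from the original answer): log the interpreter version from a throwaway handler and compare it with python --version on the build machine.
import sys

def handler(event, context):
    # Compare this output with the Python version used to build the package on EC2.
    return sys.version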

Resources