I have a conda virtual environment based on the following yaml:
channels:
- conda-forge
dependencies:
- gcc_linux-64
- gxx_linux-64
- gfortran_linux-64
- theano
This is a simplified example; in reality the YAML lists many more packages.
In detail, the software is installed in the base environment inside a Docker container, although I do not believe my problem is related to containers at all. The important part of the Dockerfile is below:
# BASE IMAGE
FROM ubuntu:18.04
# PATH EXPORT
ENV PATH="/root/miniconda3/bin:${PATH}"
ARG PATH="/root/miniconda3/bin:${PATH}"
# UPDATE THE PACKAGE LIST
RUN apt-get update
# INSTALL WGET
RUN apt-get install -y wget && rm -rf /var/lib/apt/lists/*
# INSTALL MINICONDA WITH PYTHON 3.7
RUN wget --no-verbose \
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-latest-Linux-x86_64.sh -b \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
# UPDATE CONDA
RUN conda update --name base --channel defaults conda
# COPY THE YAML & INSTALL SOFTWARE WITH CONDA
COPY conda_packages.yaml .
RUN conda env update --name base --file conda_packages.yaml
The container builds properly, and afterwards I can run the Anaconda compilers inside it using the commands x86_64-conda_cos6-linux-gnu-gcc or x86_64-conda_cos6-linux-gnu-c++. However, when I run a test Python script that does import theano, I get an error:
/root/miniconda3/lib/python3.7/site-packages/theano/configdefaults.py:560: UserWarning:
DeprecationWarning: there is no c++ compiler.This is deprecated and with Theano 0.11 a c++ compiler will be mandatory
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.configdefaults): install mkl with `conda install mkl-service`: No module named 'mkl'
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
When I check the build logs afterwards, the installed Theano version is 1.0.4
and the compiler versions are 7.3.0.
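As a quick diagnostic, the sketch below checks which compiler names are actually visible on PATH inside the container. It assumes the detection behind the "g++ not detected" warning is a lookup of the plain name g++; the helper find_compiler is hypothetical and only wraps the standard library's shutil.which.

```python
import shutil


def find_compiler(candidates):
    """Return (name, path) for the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return name, path
    return None


# The plain name from the warning, then the prefixed conda name from the question.
candidates = ["g++", "x86_64-conda_cos6-linux-gnu-c++"]
print(find_compiler(candidates))
```

If only the prefixed name is found, pointing the cxx flag mentioned in the warning at that compiler may be worth trying; that is an assumption, not something confirmed here.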
I'm trying to create a new Lambda layer from a zip file containing psycopg2, because the library pushed my deployment package over 3 MB, so I can no longer see the inline code in my Lambda function.
I created a Lambda layer for each of the following 2 cases with Python 3.7:
psycopg2_lib.zip, which contains the psycopg2, psycopg2_binary.libs and psycopg2_binary-2.8.5.dist-info folders
psycopg2_only.zip, which contains only the psycopg2 folder.
I added the newly created layer to my Lambda function.
But in both cases my lambda_function throws the following error:
{
"errorMessage": "Unable to import module 'lambda_function': No module named 'psycopg2'",
"errorType": "Runtime.ImportModuleError"
}
The error looks as if something is wrong with my zip files so that they are not recognized, yet the same library works well in my deployment package.
Any help or explanation would be much appreciated. Thanks!
Not sure if the OP found a solution to this, but in case others land here: I resolved this using the following steps:
1. Download the code or clone the repository from:
https://github.com/jkehler/awslambda-psycopg2
2. Create the following directory tree. This is for Python 3.7; otherwise replace 'python3.7' with your version:
mkdir -p python/lib/python3.7/site-packages/psycopg2
3. Copy the files for your Python version from the folders downloaded in step 1 to the directory tree from step 2. For example, for a Python 3.7 layer:
cp psycopg2-3.7/* python/lib/python3.7/site-packages/psycopg2
4. Create the zip file for the layer, e.g.:
zip -r9 psycopg2-py37.zip python
5. Create a layer in the console or CLI and upload the zip.
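The directory layout is what matters in these steps: the package has to sit under python/lib/python3.7/site-packages/ inside the archive. A minimal sketch of the same packaging in Python, using only the standard library; the helper names build_layer_zip and layer_entries are hypothetical, and the default prefix assumes Python 3.7 as in the steps above.

```python
import os
import zipfile


def build_layer_zip(package_dir, zip_path,
                    prefix="python/lib/python3.7/site-packages"):
    """Zip package_dir so that it unpacks under `prefix` inside the archive."""
    package_dir = os.path.abspath(package_dir)
    package_name = os.path.basename(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                rel_parts = os.path.relpath(full_path, package_dir).split(os.sep)
                # Zip entries always use forward slashes.
                arcname = "/".join([prefix, package_name] + rel_parts)
                archive.write(full_path, arcname)
    return zip_path


def layer_entries(zip_path):
    """List the file paths stored in the archive, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())
```

Calling layer_entries on the zips from the question is a quick way to see whether psycopg2 sits at the archive root instead of under a python/ directory.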
If you land on this page in 2022 or later, use the official psycopg2-binary package: https://pypi.org/project/psycopg2-binary/
It works well for me. Just run:
pip install --target ./python psycopg2-binary
zip -r python.zip python
The maintainers of psycopg2 do not recommend using psycopg2-binary because it bundles its own libpq, libssl, and other libraries, which may cause issues in production under certain circumstances.
I can imagine this becoming a problem when the PostgreSQL server is upgraded and the bundled libpq is incompatible with it. I also had issues with psycopg2-binary on AWS Lambda in an arm64 environment.
I resorted to building PostgreSQL and psycopg2 in Docker on the linux/arm64 platform, using public.ecr.aws/lambda/python:3.9 as the base image.
FROM public.ecr.aws/lambda/python:3.9
RUN yum -y update && \
yum -y upgrade && \
yum -y install libffi-devel postgresql-devel postgresql-libs zip rsync wget openssl openssl-devel && \
yum -y groupinstall "development tools" && \
pip install pipenv
ENTRYPOINT ["/bin/bash"]
The build script below is valid for the aarch64 platform. For x86_64, change the path in the "Prepare psycopg2" step to the x86_64 build directory.
#!/usr/bin/env bash
set -e
PG_VERSION="14.5"
cd "$TERRAFORM_ROOT"
if [ ! -f "postgresql-$PG_VERSION.tar.bz2" ]; then
wget "https://ftp.postgresql.org/pub/source/v$PG_VERSION/postgresql-$PG_VERSION.tar.bz2"
tar -xf "$(pwd)/postgresql-$PG_VERSION.tar.bz2"
fi
if [ ! -d "psycopg2" ]; then
git clone https://github.com/psycopg/psycopg2.git
fi
# Build postgres
cd "$TERRAFORM_ROOT/postgresql-$PG_VERSION"
./configure --without-readline --without-zlib
make
make install
# Build psycopg2
cd "$TERRAFORM_ROOT/psycopg2"
make clean
python setup.py build_ext \
--pg-config "$TERRAFORM_ROOT/postgresql-$PG_VERSION/src/bin/pg_config/pg_config"
# Prepare psycopg2
cd build/lib.linux-aarch64-3.9
mkdir -p python/
cp -r psycopg2 python/
zip -9 -r "$BUNDLE" ./python
# Prepare libpq
cd "$TERRAFORM_ROOT/postgresql-$PG_VERSION/src/interfaces/libpq/"
mkdir -p lib/
cp libpq.so.5 lib/
zip -9 -r "$BUNDLE" ./lib
Here $BUNDLE is the path to an already existing .zip file.
I also tried to build the psycopg2 binary statically and link libpq.a; however, I ran into quite a lot of issues with missing symbols.
From AWS post How do I add Python packages with compiled binaries to my deployment package and make the package compatible with Lambda?:
To create a Lambda deployment package or layer that's compatible with Lambda Python runtimes when using pip outside of Linux operating system, run the pip install command with manylinux2014 as the value for the --platform parameter.
pip install \
--platform manylinux2014_x86_64 \
--target=my-lambda-function \
--implementation cp \
--python-version 3.9 \
--only-binary=:all: --upgrade \
psycopg2-binary
You can then zip the contents of the my-lambda-function directory.
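For repeated builds it can help to assemble that command programmatically. The sketch below only builds the argument list; the function name lambda_pip_args and its parameters are hypothetical, it uses pip's --python-version option, and the manylinux2014_aarch64 tag for arm64 functions is an assumption that goes beyond the quoted post.

```python
def lambda_pip_args(package, target, python_version="3.9", arch="x86_64"):
    """Build a pip command line for a Lambda-compatible binary install.

    `arch` is "x86_64" or "aarch64" (assumed tag for arm64 functions).
    """
    return [
        "pip", "install",
        "--platform", "manylinux2014_" + arch,
        "--target", target,
        "--implementation", "cp",
        "--python-version", python_version,
        "--only-binary=:all:",
        "--upgrade",
        package,
    ]


print(" ".join(lambda_pip_args("psycopg2-binary", "my-lambda-function")))
```

The list can be passed to subprocess.run as-is, which avoids shell quoting issues.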
I am entirely new to Docker. I am creating the following Dockerfile as an exercise.
FROM ubuntu:latest
MAINTAINER kesarling
RUN apt update && apt upgrade -y
RUN apt install nginx curl zip unzip -y
RUN apt install openjdk-14-jdk python3 python3-doc clang golang-go gcc g++ -y
RUN curl -s "https://get.sdkman.io" | bash
RUN bash /root/.sdkman/bin/sdkman-init.sh
RUN sdk version
RUN yes | bash -c 'sdk install kotlin'
CMD [ "echo","The development environment has now been fully setup with C, C++, JAVA, Python3, Go and Kotlin" ]
I am using SDKMAN! to install Kotlin. Initially I used RUN source /root/.sdkman/bin/sdkman-init.sh instead of RUN bash /root/.sdkman/bin/sdkman-init.sh, but that failed with an error saying source was not found. I then tried RUN . /root/.sdkman/bin/sdkman-init.sh, which did not work either. RUN bash /root/.sdkman/bin/sdkman-init.sh appears to work, in that it raises no error and the build moves on to the next command, but Docker then fails with sdk: not found
Where am I going wrong?
It should be noted that these steps worked like a charm on my host distribution (the one on which I'm running Docker), which is Pop!_OS 20.04.
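The script /root/.sdkman/bin/sdkman-init.sh is what makes the sdk command available, and it only takes effect in the shell that sources it.
source is a bash built-in rather than a binary somewhere on the filesystem. RUN uses /bin/sh by default, which on Ubuntu is dash and has no source built-in; hence the "source not found" error.
The source command executes the file in the current shell. Running the script with bash instead starts a separate shell that exits as soon as the script finishes, so nothing it defined survives.
Each RUN instruction executes its commands in a new layer on top of the current image and commits the result.
The resulting committed image is used for the next step in the Dockerfile, but shell state is not carried over, so sdk has to be sourced and used within the same RUN instruction.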
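The effect of each RUN starting a fresh shell can be reproduced outside Docker. In this minimal sketch every call to the hypothetical helper run_in_new_shell plays the role of one RUN instruction; SDK_DEMO_VAR is an illustrative variable name that is assumed not to be set in your environment.

```python
import subprocess


def run_in_new_shell(command):
    """Run `command` in a fresh /bin/sh process and return its stdout."""
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


# Two separate shells: the variable set in the first is gone in the second.
run_in_new_shell("SDK_DEMO_VAR=hello")
separate = run_in_new_shell('echo "${SDK_DEMO_VAR:-unset}"')

# One shell: the variable is still set when the second command runs.
combined = run_in_new_shell('SDK_DEMO_VAR=hello && echo "${SDK_DEMO_VAR:-unset}"')

print(separate)  # unset
print(combined)  # hello
```

Shell functions such as sdk behave the same way as the variable here, which is why sourcing and using sdk have to share one RUN line.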
Try this:
FROM ubuntu:latest
MAINTAINER kesarling
RUN apt update && apt upgrade -y
RUN apt install nginx curl zip unzip -y
RUN apt install openjdk-14-jdk python3 python3-doc clang golang-go gcc g++ -y
RUN curl -s "https://get.sdkman.io" | bash
RUN /bin/bash -c "source /root/.sdkman/bin/sdkman-init.sh; sdk version; sdk install kotlin"
CMD [ "echo","The development environment has now been fully setup with C, C++, JAVA, Python3, Go and Kotlin" ]
SDKMAN in Ubuntu Dockerfile
tl;dr
the sdk command is not a binary but a bash script loaded into memory
Shell sessions are a "process", which means environment variables and declared shell function only exist for the duration that shell session exists; which lasts only as long as the RUN command.
Manually tweak your PATH
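# RUN uses /bin/sh by default; switch to bash so that `source` is available
SHELL ["/bin/bash", "-c"]
RUN apt-get update && apt-get install curl bash unzip zip -y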
RUN curl -s "https://get.sdkman.io" | bash
RUN source "$HOME/.sdkman/bin/sdkman-init.sh" \
&& sdk install java 8.0.275-amzn \
&& sdk install sbt 1.4.2 \
&& sdk install scala 2.12.12
ENV PATH=/root/.sdkman/candidates/java/current/bin:$PATH
ENV PATH=/root/.sdkman/candidates/scala/current/bin:$PATH
ENV PATH=/root/.sdkman/candidates/sbt/current/bin:$PATH
Full Version
Oh wow, this was a journey to figure out. In the Dockerfile below, each line is commented to explain why certain commands are run.
I learnt a lot about how Unix works, how sdkman works, how Docker works, and why the intersection of the three gives very unusual behaviour.
# I am using a multi-stage build so I am just copying the built artifacts
# from this stage to keep final image small.
FROM ubuntu:latest as ScalaBuild
# Switch from `sh -c` to `bash -c` as the shell behind a `RUN` command.
SHELL ["/bin/bash", "-c"]
# Usual updates
RUN apt-get update && apt-get upgrade -y
# Dependencies for sdkman installation
RUN apt-get install curl bash unzip zip -y
#Install sdkman
RUN curl -s "https://get.sdkman.io" | bash
# FUN FACTS:
# 1) the `sdk` command is not a binary but a bash script loaded into memory
# 2) Shell sessions are a "process", which means environment variables
# and declared shell function only exist for
# the duration that shell session exists
RUN source "$HOME/.sdkman/bin/sdkman-init.sh" \
&& sdk install java 8.0.275-amzn \
&& sdk install sbt 1.4.2 \
&& sdk install scala 2.12.12
# Once the real binaries exist these are
# the symlinked paths that need to exist on PATH
ENV PATH=/root/.sdkman/candidates/java/current/bin:$PATH
ENV PATH=/root/.sdkman/candidates/scala/current/bin:$PATH
ENV PATH=/root/.sdkman/candidates/sbt/current/bin:$PATH
# This is specific to running a minimal empty Scala project and packaging it
RUN touch build.sbt
RUN sbt compile
RUN sbt package
FROM alpine AS production
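# setup production environment image here
# NOTE: INSTALL_PATH is not defined in this snippet; declare it with ENV first.
COPY --from=ScalaBuild /root/target/scala-2.12/ $INSTALL_PATH
# NOTE: the exec form of ENTRYPOINT does not expand $INSTALL_PATH;
# use a literal path or the shell form instead.
ENTRYPOINT ["java", "-cp", "$INSTALL_PATH", "your.main.classfile"]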
Generally you want to avoid using "version manager" type tools in Docker; it's better to install a specific version of the compiler or runtime you need.
In the case of Kotlin, it's a JVM application distributed as a zip file so it should be fairly easy to install:
FROM openjdk:15-slim
ARG KOTLIN_VERSION=1.3.72
# Get OS-level updates:
RUN apt-get update \
&& apt-get install --no-install-recommends --assume-yes \
curl \
unzip
# and if you need C/Python dependencies, those too
# Download and unpack Kotlin
RUN cd /opt \
&& curl -LO https://github.com/JetBrains/kotlin/releases/download/v${KOTLIN_VERSION}/kotlin-compiler-${KOTLIN_VERSION}.zip \
&& unzip kotlin-compiler-${KOTLIN_VERSION}.zip \
&& rm kotlin-compiler-${KOTLIN_VERSION}.zip
# Add its directory to $PATH
ENV PATH=/opt/kotlinc/bin:$PATH
The real problem with version managers is that they heavily depend on the tool setting environment variables. As @JeevanRao notes in their answer, each Dockerfile RUN command runs in a separate shell in a separate container, and any environment variable settings within that command get lost for the next command.
# Does absolutely nothing: environment variables do not stay set
RUN . /root/.sdkman/bin/sdkman-init.sh
Since an image generally contains only one application and its runtime, you don't need the ability to change which version of the runtime or compiler you're using. My Dockerfile example passes it as an ARG, so you can change it in the Dockerfile or pass a docker build --build-arg KOTLIN_VERSION=... option to use a different version.
The GPU driver and CUDA are not enabled and accessible by PyTorch.
torch.cuda.is_available() returns False.
I am using macOS Mojave 10.14.6.
I have installed the CUDA 10.0 version of PyTorch.
I tried the verification on https://pytorch.org/get-started/locally/ and constructing a randomly initialized tensor works just fine.
But when I tried
import torch
torch.cuda.is_available()
it returns False.
Therefore, I followed the instructions on the PyTorch site and installed Anaconda and CUDA.
Then I tried this:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
In the terminal, I got
fatal error: 'string.h' file not found
#include_next <string.h>
I searched Stack Overflow and found this question: Build Pytorch from source. So I tried:
$ find /Library/Developer/CommandLineTools/usr -type f -name string.h
which returned /Library/Developer/CommandLineTools/usr/include/c++/v1/string.h
Doesn't this mean I already have string.h?
How can I solve this problem?
Are you installing from a conda env? According to the github this should work:
- Create a conda env
- conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing (that installs some requirements)
Then this (which I assume you've already done):
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive
And finally set up the conda variable and install:
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
There are issues on GitHub reporting this behavior, which suggest adding something like:
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ NO_CUDA=1 python setup.py install
Note the NO_CUDA=1 flag. This issue has also been mentioned in the forums, and it seems it could be caused by the OS and driver versions. If that is the case, I recommend using Nvidia Docker (hopefully it has Mac support) with a PyTorch container from https://ngc.nvidia.com/catalog/landing
If that also fails, your best bet is to install without CUDA support.
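With a CPU-only build, code that selects its device at runtime keeps working unchanged. A minimal sketch, assuming only the torch.cuda.is_available() call from the question; the helper pick_device is hypothetical and also tolerates PyTorch not being installed at all.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this interpreter
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The returned string can be passed wherever PyTorch accepts a device name, for example when creating tensors.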
I built caffe2 with Anaconda following the page.
The server has a single Titan X with cuDNN 7 and CUDA 9 but no NCCL, so I downloaded NCCL 2 from NVIDIA, extracted it to path/to/local/nccl2, and edited line 42 of ./pytorch/conda/integrated/build.sh to read: "export NCCL_ROOT_DIR=path/to/local/nccl2".
I also need to use caffe2 with Python 2, so I added "conda_args+=(" --python 2.7")" to ./pytorch/scripts/build_anaconda.sh to use Python 2.7.
The build succeeded, but when I run python2 test.py, which contains from caffe2.python import core,
it tells me:
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_hip
Segmentation fault (core dumped)
My question is:
a. Why does the conda build not support the GPU?
b. If I am using a single GPU, is NCCL necessary for building?
c. How do I fix No module named caffe2_pybind11_state_hip?
PyTorch or Caffe2: caffe2
How you installed PyTorch (conda, pip, source): conda
Build command you used (if compiling from source):./scripts/build_anaconda.sh --install-locally --cuda 9.0 --cudnn 7
OS:ubuntu16
PyTorch version:
Python version:2.7
CUDA/cuDNN version:9.1/7
GPU models and configuration:??
GCC version (if compiling from source):5.4.0
CMake version:not install
Versions of any other relevant libraries:
Thank you very much!
First of all get CUDA and install it:
sudo apt-get update && sudo apt-get install wget -y --no-install-recommends
wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
Now proceed with the installation from source (do it in an environment):
FULL_CAFFE2=1 python setup.py install
You can find more info here: https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile#install-with-gpu-support
Follow the procedure below; it worked for me:
ubuntu@test:~$ cd $HOME
ubuntu@test:~$ conda create -n caffe2
ubuntu@test:~$ source activate caffe2
(caffe2) ubuntu@test:~$ git clone --recursive https://github.com/pytorch/pytorch.git && cd pytorch
(caffe2) ubuntu@test:~/pytorch$ git submodule update --init
(caffe2) ubuntu@test:~/pytorch$ CONDA_INSTALL_LOCALLY=1 ./scripts/build_anaconda.sh --cuda 8.0 --cudnn 7 -DUSE_CUDA=ON -DUSE_NCCL=ON
Environment: Ubuntu 14.04 (64-bit), Python 2.7.11
First, I installed TensorFlow using the Virtualenv installation method.
$ sudo apt-get install python-pip python-dev python-virtualenv
$ virtualenv --system-site-packages ~/tensorflow
$ source ~/tensorflow/bin/activate
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
$ pip install --upgrade $TF_BINARY_URL
Then I tested my installation and some issues appeared, so I know TensorFlow was not installed successfully.
>>> import tensorflow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
I don't know how to solve the problem. Please help me; it has already cost me a day. I tried uninstalling TensorFlow and reinstalling it with the pip installation method, but I get the same error.
The protobuf version is 3.1.0.
Are you running python in the same virtual environment you installed tensorflow in?
To access your tensorflow installation, you have to first "activate" the virtualenv in any new terminals, as follows:
source ~/tensorflow/bin/activate
python
import tensorflow as tf
If you run the above in a new terminal, does it solve your problem?
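To confirm which interpreter a terminal is really using, a small check like the one below can be run with the same python command that fails to import tensorflow. It is a sketch; the helper name in_virtualenv is hypothetical, and it relies on virtualenv on Python 2 setting sys.real_prefix, while newer environments make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    if hasattr(sys, "real_prefix"):  # set by classic virtualenv, including on Python 2
        return True
    # venv and recent virtualenv: prefix differs from the base installation
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

If the interpreter path does not point into ~/tensorflow/bin, the environment has not been activated in that terminal.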
When you ran
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
you selected the GPU build, which means you are going to use an Nvidia card.
To run TensorFlow on a GPU (an Nvidia graphics card) you need to satisfy all of Nvidia's requirements.
Nvidia requires some special privileges to access its CUDA cores.
You also need to add the CUDA paths to the LD_LIBRARY_PATH environment variable; check the Nvidia documentation. In addition, you need profiling support, which is provided by the libcupti-dev library, the NVIDIA CUDA Profiling Tools Interface. To install this library, issue the following command:
sudo apt-get install libcupti-dev
But if you want to run TensorFlow in CPU mode only, do not run $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl. That command overrides the TF_BINARY_URL variable with the GPU build.
So, to use the CPU, remove $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl from your steps, keep only $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl, and reinstall.
I hope this clears up the problem.
If your prerequisite Python packages are not installed properly, check a few things:
$ source $HOME/tensorflow/bin/activate
$ which python
$ which pip
Check that these binaries are in $HOME/tensorflow/bin. If so, try
$ pip install -I --upgrade $TF_BINARY_URL
where the -I (--ignore-installed) option forces pip to reinstall the packages.
INSTALLATION OF TENSORFLOW ON UBUNTU 18.04
Download the Anaconda Python package.
Install it from the shell using bash:
$ bash anaconda*.sh
Edit the .bashrc script (located in your home directory):
$ sudo apt-get update
$ sudo apt-get install python3-pip
$ cd
$ nano .bashrc
nano is the text editor.
Insert the following line at the end of the file:
export PATH=~/anaconda3/bin:$PATH
Create a virtual environment using conda:
$ conda create -n myenv python=3.5
Specify the version you require. Do not use 3.7, as there is a compatibility issue with TENSORFLOW 10.
$ source activate myenv
$ pip install -U tensorflow
$ python
>>> import tensorflow as tf
>>> # if you get this prompt back without an error, the installation was successful
>>> exit()
$ source deactivate
This is fully tested; if an issue arises, do let me know.
Whenever you install Python packages, I would suggest doing it in a virtual environment.