Beam Dataflow job stuck after upgrading the Apache Beam version from 2.27.0 to 2.32.0 - pip

I am currently upgrading Apache Beam from 2.27.0 to 2.32.0, but when I start my jobs on the Dataflow runner, the job gets stuck during worker startup and never finishes installing dependencies. The Python version is 3.7.
This is what I see in the logs:
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime
After some initial analysis, it looks like the issue is pip's dependency resolver backtracking: it keeps downloading and installing different versions of dependencies. These are some of the other messages in the logs:
INFO: pip is looking at multiple versions of google-auth to determine which version is compatible with other requirements.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead:
This is the setup.py for the Beam pipeline:
import setuptools

REQUIRED_PACKAGES = [
    "numpy==1.21.4",
    "pandas==0.25.3",
    "dateparser==1.1.0",
    "python-dateutil==2.8.2",
    "pytz==2021.3",
    "google-api-core==1.14.0",
    "google-cloud-storage==1.36.1",
    "fastavro==0.22.10",
]

setuptools.setup(
    name="data-workflows",
    version="0.1.0",
    install_requires=REQUIRED_PACKAGES,
    packages=setuptools.find_packages(),
)
The pipelines used to run fine with Beam 2.27.0. I am not sure whether these warnings are the cause of the issue. Could someone please help me identify the root cause of this problem?
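One way to narrow this down is to compare the packages pip reports backtracking on with the packages pinned in setup.py. The sketch below does that for the log line and requirement list quoted in the question. The helper names are hypothetical, and the log format is assumed to match the single line shown above. It only flags candidates for stricter pins; it does not pick versions.

```python
import re

# Requirements copied from the setup.py in the question.
REQUIRED_PACKAGES = [
    "numpy==1.21.4",
    "pandas==0.25.3",
    "dateparser==1.1.0",
    "python-dateutil==2.8.2",
    "pytz==2021.3",
    "google-api-core==1.14.0",
    "google-cloud-storage==1.36.1",
    "fastavro==0.22.10",
]

# Log line copied from the question.
LOG_LINES = [
    "INFO: pip is looking at multiple versions of google-auth to determine "
    "which version is compatible with other requirements.",
]

# Assumed format of pip's backtracking message, based on the line above.
BACKTRACK_RE = re.compile(r"pip is looking at multiple versions of (\S+) to determine")


def pinned_names(requirements):
    """Return lower-cased names of requirements pinned with '=='."""
    return {r.split("==", 1)[0].strip().lower() for r in requirements if "==" in r}


def backtracked_names(log_lines):
    """Return lower-cased package names that pip reports backtracking on."""
    names = set()
    for line in log_lines:
        match = BACKTRACK_RE.search(line)
        if match:
            names.add(match.group(1).lower())
    return names


def unpinned_backtracking(requirements, log_lines):
    """Packages pip backtracks on that are not pinned in the requirement list."""
    return sorted(backtracked_names(log_lines) - pinned_names(requirements))


print(unpinned_backtracking(REQUIRED_PACKAGES, LOG_LINES))  # ['google-auth']
```

Here google-auth is a transitive dependency that setup.py does not pin, which is consistent with pip's hint about providing stricter constraints.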

Related

Issue installing Tax4Fun

I'm trying to install the package "Tax4Fun" but keep failing.
I've tried 2 different ways:
install.packages("devtools")
devtools::install_url("http://tax4fun.gobics.de/Tax4Fun/Tax4Fun_0.3.1.tar.gz")
library(Tax4Fun)
The error that I get is:
ERROR: dependency 'biom' is not available for package 'Tax4Fun'
I've also tried installing biom directly
BiocManager::install("biom")
which does not work either
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'biom'
Installation path not writeable, unable to update packages: boot, foreign, KernSmooth,
mgcv, nlme, survival
Warning message:
package ‘biom’ is not available (for R version 3.6.1)
The other way I've tried to install Tax4Fun directly is
BiocManager::install("Tax4Fun")
I get the following error code:
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'Tax4Fun'
Installation path not writeable, unable to update packages: boot, foreign, KernSmooth,
mgcv, nlme, survival
Warning message:
package ‘Tax4Fun’ is not available (for R version 3.6.1)
Please help :)
You need to download the package source from http://tax4fun.gobics.de and install it from there. How you do that depends on whether you are running Linux/Mac or Windows.
From the command line, navigate to the folder containing the downloaded .tar.gz package, then install it with:
R CMD INSTALL Tax4Fun_0.3.1.tar.gz
However, dependencies are not installed by default, so you need to install them manually: Qiimer and Biom, both of which have been archived on CRAN. Install them with the same command after downloading the packages from the CRAN archives.
Before that, you also need to install their dependencies in R:
install.packages("pheatmap")
install.packages("RJSONIO")
Then you should be able to proceed as mentioned above: install Qiimer and Biom from the command line first. Then Tax4Fun from the command line too.
If you are running on Windows, you will run into much the same issues, but the installation procedure for the packages and dependencies is different. Have a look at the readme at http://tax4fun.gobics.de
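The install order described above (dependencies first, Tax4Fun last) can be sketched as a small script that builds the R CMD INSTALL commands. This is a minimal sketch: the two dependency file names are placeholders, not real archive names, and the helper name is hypothetical. It only prints the commands rather than running them.

```python
import shlex


def build_install_commands(tarballs):
    """Return one `R CMD INSTALL <tarball>` command per tarball, in order."""
    return [["R", "CMD", "INSTALL", tarball] for tarball in tarballs]


# Dependencies first, Tax4Fun last. The first two names are placeholders:
# substitute the actual file names downloaded from the CRAN archives.
TARBALLS = [
    "qiimer.tar.gz",          # placeholder name
    "biom.tar.gz",            # placeholder name
    "Tax4Fun_0.3.1.tar.gz",   # file name from the answer above
]

for command in build_install_commands(TARBALLS):
    # Print only; to execute, use subprocess.run(command, check=True).
    print(" ".join(shlex.quote(part) for part in command))
```

Keeping the commands as argument lists avoids shell quoting problems if a download path contains spaces.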

Installing specific versions of pyproj from github

I have been trying to install obspy and have been running into a lot of problems. I want to install obspy which has a dependency on pyproj. But apparently obspy only works with pyproj 1.9.5.1, which I tried installing using pip (pip3 install pyproj==1.9.5.1), but only got the errors like-
_proj.c:7488:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
Digging deeper I found that it might be a Cython problem, and installing pyproj directly from github might help, because it would apparently make Cython recompile all the necessary files. Something along the lines of -
pip3 install git+https://github.com/jswhit/pyproj.git
However this one gives the error -
ERROR: Minimum supported proj version is 6.2.0, installed version is 5.2.0.
I did try installing a higher version of libproj-dev (sudo apt install libproj-dev=6.2.0), but it reports that there is no candidate for 6.2.0. I then tried downloading the deb file and installing from that using -
sudo apt-get install ~/Downloads/libproj-dev_6.2.0-1_amd64.deb
which just leads to the error -
The following packages have unmet dependencies:
libproj-dev : Depends: libproj15 (= 6.2.0-1) but it is not installable
E: Unable to correct problems, you have held broken packages.
But I think this is not the right way to install for me anyway, since I need a specific version. Hence I tried installing directly from the tarball of the release -
pip3 install https://github.com/pyproj4/pyproj/archive/v1.9.5.1rel.tar.gz
Which leads to the first error I had, evidently due to Cython.
With errors on everything I tried to do to fix this, I am not sure what even is relevant to my problem now.
Any help is appreciated, and if this site is not the correct place for this question, please help me migrate it to its proper destination.
I am on Ubuntu 18.10.
The problem is that Cython-generated C files don't work with Python 3.7 if they were generated with Cython versions up to 0.27.3 (at least). The setup.py of pyproj (at least in version 1.9.5.1) doesn't regenerate the _proj.c file, which was generated with Cython 0.23.2, and thus the installation cannot succeed.
You have the following options:
1. Stay on Python 3.6, where everything works out of the box.
2. Regenerate _proj.c with a current Cython version.
For the second option:
1. Download and unzip your preferred version from https://github.com/pyproj4/pyproj/releases/tag/v1.9.5.1rel and switch to the created folder pyproj-1.9.5.1rel.
2. Check that the Cython version is >=0.27.3 via cython --version.
3. Regenerate the _proj.c file via cython -3 _proj.pyx (_proj.pyx looks like Python 3 code, but language_level=2, i.e. cython -2 _proj.pyx, will probably work as well).
4. Install by running pip install .
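The version check in the steps above can be sketched as a small helper that parses the text printed by cython --version and compares it with the 0.27.3 threshold. This is a sketch only: the helper names are hypothetical, and the "Cython version X.Y.Z" output format is an assumption.

```python
import re

# Threshold taken from the steps above.
MINIMUM_CYTHON = (0, 27, 3)


def parse_cython_version(output):
    """Extract the dotted numeric version from `cython --version` output."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(output, minimum=MINIMUM_CYTHON):
    """True if the reported Cython version is at least `minimum`."""
    return parse_cython_version(output) >= minimum


# Assumed output format; 0.23.2 is the version that generated the bundled _proj.c.
print(is_new_enough("Cython version 0.23.2"))   # False
print(is_new_enough("Cython version 0.29.14"))  # True
```

Comparing integer tuples avoids the pitfalls of comparing version strings lexically, where "0.9" would sort after "0.27".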
pyproj 1.9.5.1 was released on Jan 7, 2016. At that time, the latest version of Python was 3.5. In my tests, pyproj 1.9.5.1 failed to install on Python 3.7.4 but succeeded on Python 3.5.7.
You need to create an environment with Python 3.5 using pyenv or conda.

Phoenix: Running mix ecto.create Error Compiling Ranch Dependency

I'm trying to go through the Up And Running tutorial on the Phoenix framework site. I have the following setup :
macOS 10.14.5
Phoenix 1.4.6
Elixir 1.8.2
Erlang/OTP 22
I create the project with the mix phx.new command. I get prompted to fetch and install the dependencies. I type Y. The dependencies get fetched and installed successfully.
I go to my project directory and enter the following command :
mix ecto.create
The following error appears :
(Mix) Could not compile dependency :ranch, "/Volumes/Macintosh HD/Users/mark/.mix/rebar3 bare compile --paths "/Code/hello/_build/dev/lib/*/ebin"" command failed.
You can recompile this dependency with "mix deps.compile ranch", update it with "mix deps.update ranch" or clean it with "mix deps.clean ranch"
I get the same error if I run mix phx.server.
If I run mix deps.clean ranch and mix deps.update ranch, it lists the following unchanged dependencies:
Resolving Hex dependencies...
Dependency resolution completed:
Unchanged:
connection 1.0.4
cowboy 2.6.3
cowlib 2.7.3
db_connection 2.0.6
decimal 1.7.0
ecto 3.1.4
ecto_sql 3.1.3
file_system 0.2.7
gettext 0.16.1
jason 1.1.2
mime 1.3.1
phoenix 1.4.6
phoenix_ecto 4.0.0
phoenix_html 2.13.2
phoenix_live_reload 1.2.0
phoenix_pubsub 1.1.2
plug 1.8.0
plug_cowboy 2.0.2
plug_crypto 1.0.0
postgrex 0.14.3
ranch 1.7.1
telemetry 0.4.0
So ranch has been compiled. But when I run mix ecto.create again, I get the same error about being unable to compile dependency :ranch.
I did an Internet search to see if anyone else had the same issue. Every issue someone had with mix ecto.create involved creating database users. No one else had an issue with ranch.
What do I have to do to get the Up and Running tutorial running properly?
I've run into this problem under Ubuntu, and the issue was that the ~/.configure folder was unreadable by my current user. Changing the owner and group on that folder and its contents solved the problem for us.
I was able to recreate this problem under macOS, using Elixir 1.7.4 and Erlang 20.1, by changing the permissions on my ~/.config folder to 600. Setting the permissions back to 755 allowed the compile to succeed.
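The permission check described in these answers can be sketched as follows. The helper names are hypothetical. The sketch only inspects the owner bits of the directory mode, so it covers the 600 versus 755 case; it does not cover the case where the folder is owned by a different user.

```python
import os
import stat


def owner_can_use_directory(mode):
    """True if the owner bits grant read and search (execute), e.g. 755 but not 600."""
    needed = stat.S_IRUSR | stat.S_IXUSR
    return (mode & needed) == needed


def directory_mode(path):
    """Return the permission bits of `path` (for example 0o755)."""
    return stat.S_IMODE(os.stat(path).st_mode)


print(owner_can_use_directory(0o600))  # False: no search bit on the directory
print(owner_can_use_directory(0o755))  # True

# Report the actual mode of ~/.config if it exists on this machine.
config_dir = os.path.expanduser("~/.config")
if os.path.isdir(config_dir):
    mode = directory_mode(config_dir)
    print(oct(mode), owner_can_use_directory(mode))
```

A directory needs the execute (search) bit before files inside it can be opened, which is why mode 600 on ~/.config is enough to break the compile.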

Pip install failed in openshift 3

I want to use the new OpenShift 3 platform, but I can't install lxml for Weblate with pip when the build process is launched.
In the logs, the last line is "Running setup.py install for lxml", with no error after it.
How can I find out what happened?
Thanks
Some of the packages around data analytics when compiled with compiler optimisations can chew up too much memory and hit the default memory limit for builds. Try following steps outlined in:
Pandas on OpenShift v3
It is less likely, but in case the version of pip in use is the problem, add a file .s2i/environment containing:
UPGRADE_PIP_TO_LATEST=1
This ensures that the latest version of pip is installed first. This is sometimes required when a package provides a wheel file: an older version of pip may ignore the binary wheel or get confused in other ways.
Thanks @Graham, I followed the instructions in Pandas on OpenShift v3 and edited the YAML build configuration:
resources:
  limits:
    memory: 1Gi

Invalid parameter elasticsearch_package_name on Elasticsearch_plugin

OS : 'CentOS 6.5'
ElasticSearch version : '2.3.0'
Master's puppet version: '3.8.7'
Client's puppet version : '3.7.4'
Base module version before upgrade : '0.10.2'
Base module version after upgrade : '5.1.0'
Error: could not retrieve catalog from remote server: Error 400 on
SERVER: invalid parameter elasticsearch_package_name on
Elasticsearch_plugin[license] at
/etc/puppet/environments/production/modules/elasticsearch/manifests/plugin.pp:169
on node bla-test01.dom'
Hi,
This error started after we upgraded our Elasticsearch's base (Official from puppet forge) module from version '0.10.2' to '5.1.0'. Our puppet module of elasticsearch worked just fine before the upgrade.
Since the upgrade this error occurred whenever puppet ran on our nodes.
After we saw this case we tried to restart our puppetserver service. Since the restart, the error occurs once every 3-4 runs of puppet and we have no idea why.
Looking at the elastic/elasticsearch module, which is the one you seem to be using, I can see that the elasticsearch_plugin custom type did not have the elasticsearch_package_name parameter in version 0.11.0, whereas version 5.1.0 does. This suggests that you updated the module on disk but did not restart the puppet server, so it still has the 0.11.0 custom type/provider Ruby files loaded.
Restart the puppet master server and see if that fixes the issue.
