Is MySQL a prerequisite for building HUE? - hadoop

I'm trying to build Hue and not having much success so far. I'm
getting the following error message:
--- Building egg for MySQL-python-1.2.3c1
sh: mysql_config: command not found
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "build/bdist.linux-i686/egg/setuptools/sandbox.py", line 62, in run_setup
File "build/bdist.linux-i686/egg/setuptools/sandbox.py", line 105, in run
File "build/bdist.linux-i686/egg/setuptools/sandbox.py", line 64, in <lambda>
File "setup.py", line 15, in <module>
metadata, options = get_config()
File "/Users/kramachandran/Sandbox/hue/hue/desktop/core/ext-py/MySQL-python-1.2.3c1/setup_posix.py", line 43, in get_config
libs = mysql_config("libs_r")
File "/Users/kramachandran/Sandbox/hue/hue/desktop/core/ext-py/MySQL-python-1.2.3c1/setup_posix.py", line 24, in mysql_config
raise EnvironmentError("%s not found" % (mysql_config.path,))
EnvironmentError: mysql_config not found
make[2]: *** [/Users/kramachandran/Sandbox/hue/hue/desktop/core/build/MySQL-python-1.2.3c1/egg.stamp] Error 1
make[1]: *** [.recursive-env-install/core] Error 2
I'm interpreting this to mean that MySQL or MySQL-python is a prerequisite.
However, I don't see any documentation to that effect.
Any information would be appreciated.
Thanks

Hue lists the packages required to build it in its README.
You will need the MySQL development packages, e.g. 'libmysqlclient-dev' on Ubuntu.

Hue uses SQLite by default but can be configured to use MySQL;
see the installation doc.

You need mysql_config, which is provided by your Linux distribution's MySQL/MariaDB development package. On CentOS 7.x the following will do:
yum install mariadb-devel
On other distributions, install the equivalent package (for example, with apt on Debian-based systems).
Once this is done, the build should continue past this step.
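Before re-running the build, it can help to confirm that the helper is actually visible. The following is only a minimal sketch, assuming Python 3.7+ is available on the build machine; the function names are made up for illustration and are not part of Hue or MySQL-python.

```python
import shutil
import subprocess


def find_config_tool(name="mysql_config"):
    """Return the full path of the tool if it is on PATH, else None."""
    return shutil.which(name)


def describe_config_tool(name="mysql_config"):
    """Report whether the build helper is available and, if so, its linker flags."""
    path = find_config_tool(name)
    if path is None:
        return f"{name} not found on PATH; install the MySQL/MariaDB development package first"
    # `mysql_config --libs` prints the linker flags for the client library.
    result = subprocess.run([path, "--libs"], capture_output=True, text=True)
    return f"{name} found at {path}: {result.stdout.strip()}"


if __name__ == "__main__":
    print(describe_config_tool())
```

If this reports that the tool is missing even after installing the development package, the directory containing mysql_config is probably not on the PATH of the shell that runs make.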

Related

Airflow on WSL2 Ubuntu: ValueError: Unable to configure handler 'processor'

I am aware that very similar questions have been asked previously, but they have tended to involve Docker, which I am not using at this time and do not have installed. I am led to believe that WSL2 can be an alternative to Docker for running Airflow on Windows.
I am using WSL2 on my Windows 11 laptop and have installed Apache Airflow by following the tutorial at this link:
https://coding-stream-of-consciousness.com/2018/11/06/apache-airflow-windows-10-install-ubuntu/
On WSL2 I have:
Python version 3.8.10
Pip version 20.0.2
Apache-Airflow 2.4.1 (I believe)
I have run the following commands (as per the tutorial) with no issue:
sudo apt-get install software-properties-common
sudo apt-add-repository universe
sudo apt-get update
sudo apt-get install python-pip
sudo pip install apache-airflow #I had path issues without 'sudo' command
But when I attempt to use the 'airflow' command in the WSL2 Ubuntu terminal, I am greeted with the following error:
$ airflow
Unable to load the config, contains a configuration error.
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/config.py", line 563, in configure
handler = self.configure_handler(handlers[name])
File "/usr/lib/python3.8/logging/config.py", line 744, in configure_handler
result = factory(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/log/file_processor_handler.py", line 45, in __init__
self.filename_template, self.filename_jinja_template = parse_template_string(filename_template)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/helpers.py", line 165, in parse_template_string
import jinja2
File "/usr/lib/python3/dist-packages/jinja2/__init__.py", line 33, in <module>
from jinja2.environment import Environment, Template
File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 15, in <module>
from jinja2 import nodes
File "/usr/lib/python3/dist-packages/jinja2/nodes.py", line 23, in <module>
from jinja2.utils import Markup
File "/usr/lib/python3/dist-packages/jinja2/utils.py", line 656, in <module>
from markupsafe import Markup, escape, soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/dist-packages/markupsafe/__init__.py)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 5, in <module>
from airflow.__main__ import main
File "/usr/local/lib/python3.8/dist-packages/airflow/__init__.py", line 46, in <module>
settings.initialize()
File "/usr/local/lib/python3.8/dist-packages/airflow/settings.py", line 564, in initialize
LOGGING_CLASS_PATH = configure_logging()
File "/usr/local/lib/python3.8/dist-packages/airflow/logging_config.py", line 74, in configure_logging
raise e
File "/usr/local/lib/python3.8/dist-packages/airflow/logging_config.py", line 69, in configure_logging
dictConfig(logging_config)
File "/usr/lib/python3.8/logging/config.py", line 808, in dictConfig
dictConfigClass(config).configure()
File "/usr/lib/python3.8/logging/config.py", line 570, in configure
raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'processor'
I have googled the ValueError extensively and can't find any clear solutions that don't involve Docker.
Any insight into the error would be much appreciated!
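The root cause in your traceback is a dependency mismatch: the jinja2 under /usr/lib/python3/dist-packages tries to import soft_unicode, which the markupsafe installed under /usr/local/lib/python3.8/dist-packages no longer provides. You should install Airflow using constraints, which pin its dependencies to versions known to work with that Airflow release:
https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#installation-tools
This is the only way an installation of Airflow is guaranteed to work.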
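As a rough illustration of what a constrained install looks like, here is a small sketch that builds the constraints URL and the matching pip command. The URL pattern is the one documented on the page linked above; the helper names are my own, and the versions in the example are the ones from the question (Airflow 2.4.1, Python 3.8), so adjust them to your setup.

```python
import sys


def constraints_url(airflow_version, python_version=None):
    """Build the constraints-file URL for an Airflow release and Python minor version."""
    if python_version is None:
        # Default to the interpreter running this script, e.g. "3.8".
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_command(airflow_version, python_version=None):
    """Return the pip command (as an argument list) for a constrained install."""
    return [
        sys.executable, "-m", "pip", "install",
        f"apache-airflow=={airflow_version}",
        "--constraint", constraints_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Versions taken from the question: Airflow 2.4.1 on Python 3.8.
    print(" ".join(pip_install_command("2.4.1", "3.8")))
```

Running the printed command inside a fresh virtual environment, rather than with sudo against the system Python, should also keep the apt-installed jinja2 out of the picture.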

Heroku deployment failed because of missing mariadb_config or MariaDB Connector/C (remote DB)

I want to run my Python Flask project on Heroku (it's the first time I am using it). The project connects to a remote MariaDB on an Ubuntu VM.
When I try to push from my local git repository, I get an error that seems to be related to MariaDB. Does anyone have an idea how I can solve this problem?
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
/bin/sh: 1: mariadb_config: not found
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-2j753bw3/mariadb_0c456c224d01457ab040c19f32e1f9ee/setup.py", line 26, in <module>
cfg = get_config(options)
File "/tmp/pip-install-2j753bw3/mariadb_0c456c224d01457ab040c19f32e1f9ee/mariadb_posix.py", line 63, in get_config
cc_version = mariadb_config(config_prg, "cc_version")
File "/tmp/pip-install-2j753bw3/mariadb_0c456c224d01457ab040c19f32e1f9ee/mariadb_posix.py", line 28, in mariadb_config
raise EnvironmentError(
OSError: mariadb_config not found.
This error typically indicates that MariaDB Connector/C, a dependency which must be preinstalled,
is not found.
If MariaDB Connector/C is not installed, see installation instructions
at: https://github.com/mariadb-corporation/mariadb-connector-c/wiki/install.md.
If MariaDB Connector/C is installed, either set the environment variable MARIADB_CONFIG or edit
the configuration file 'site.cfg' to set the 'mariadb_config' option to the file location of the
mariadb_config utility.
[end of output]
It looks like you need the mariadb_config utility at compile time. This is likely because whatever Python driver you are using requires it.
mariadb_config is available in the libmariadb-dev Ubuntu package. You'll need to install that using the Apt buildpack.
Add the buildpack:
heroku buildpacks:add --index 1 heroku-community/apt
That buildpack doesn't do dependency resolution, so you'll have to list all transitive dependencies.
Create a new file called Aptfile (no extension) that lists the dependencies you wish to install:
mysql-common
mariadb-common
libmariadb3
libmariadb-dev
Commit your Aptfile and redeploy.

Getting error when trying to install apache-airflow on Mac. How can I fix this?

Error output below:
ronakvora:dtc ronakvora$ pip install apache-airflow
Installing build dependencies ... done
Complete output from command python setup.py egg_info:
running egg_info
creating pip-egg-info/pendulum.egg-info
writing requirements to pip-egg-info/pendulum.egg-info/requires.txt
writing pip-egg-info/pendulum.egg-info/PKG-INFO
writing top-level names to pip-egg-info/pendulum.egg-info/top_level.txt
writing dependency_links to pip-egg-info/pendulum.egg-info/dependency_links.txt
writing manifest file 'pip-egg-info/pendulum.egg-info/SOURCES.txt'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-install-WFGcOd/pendulum/setup.py", line 50, in <module>
setup(**setup_kwargs)
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 296, in run
self.find_sources()
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 303, in find_sources
mm.run()
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 534, in run
self.add_defaults()
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 570, in add_defaults
sdist.add_defaults(self)
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/py36compat.py", line 36, in add_defaults
self._add_defaults_ext()
File "/private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-build-env-HZt1xp/lib/python2.7/site-packages/setuptools/command/py36compat.py", line 119, in _add_defaults_ext
build_ext = self.get_finalized_command('build_ext')
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 312, in get_finalized_command
cmd_obj.ensure_finalized()
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 109, in ensure_finalized
self.finalize_options()
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/command/build_ext.py", line 159, in finalize_options
self.include_dirs.append(py_include)
AttributeError: 'unicode' object has no attribute 'append'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/6x/xsb52c7936l38mmb9f7s268m0000gn/T/pip-install-WFGcOd/pendulum/
You are using pip version 18.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
I recently encountered a similar error on Airflow 1.10.2; in my case it was caused by an incorrect version of pendulum.
Run pip show pendulum to check the installed version:
Name: pendulum
Version: 1.4.4
Summary: Python datetimes made easy.
...
If your version of pendulum is different from 1.4.4, force-reinstall it (Airflow 1.10.2 requires pendulum===1.4.4):
pip install --force-reinstall pendulum===1.4.4
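The same check can be scripted. This is just a sketch, assuming Python 3.8+ (for importlib.metadata), so it will not run on the Python 2.7 interpreter from the question; the helper names are hypothetical, and 1.4.4 is the pin quoted above for Airflow 1.10.2.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_reinstall(package, required):
    """True when the package is missing or its version differs from the required pin."""
    return installed_version(package) != required


if __name__ == "__main__":
    # 1.4.4 is the pin quoted in the answer for Airflow 1.10.2.
    if needs_reinstall("pendulum", "1.4.4"):
        print("pip install --force-reinstall pendulum===1.4.4")
    else:
        print("pendulum 1.4.4 is already installed")
```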
I didn't exactly solve the issue above, but just decided to switch to python3 and used pip3 install apache-airflow.

What does this stack trace tell me is wrong with my Tensorflow Installation

I have gone through and installed CUDA and cuDNN, and followed the instructions to the best of my ability. I have added the environment variables I believe I need, but I still seem to have problems.
I have come as far as testing whether TensorFlow has been installed correctly. I open a command prompt, type python to start the shell, and then type import tensorflow as tf.
I then get this stack trace, which I cannot make sense of well enough to solve the problem myself. This is where I need the community's help:
>>> import tensorflow as tf
Traceback (most recent call last):
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\self_check.py", line 75, in preload_check
ctypes.WinDLL(build_info.cudart_dll_name)
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\ctypes\__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 30, in <module>
self_check.preload_check()
File "C:\Users\Troy\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\self_check.py", line 82, in preload_check
% (build_info.cudart_dll_name, build_info.cuda_version_number))
ImportError: Could not find 'cudart64_90.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 9.0 from this URL: https://developer.nvidia.com/cuda-toolkit
Please try installing CUDA 9.0. The error says TensorFlow is looking for cudart64_90.dll, which ships with CUDA 9.0, so the problem is most likely that you installed CUDA 9.1 instead. You can download it from the following link: CUDA Toolkit 9.0
To uninstall CUDA on Ubuntu, run the following commands (on Windows, as in the question, uninstall the CUDA components through the system's list of installed programs instead):
sudo apt-get --purge remove cuda
sudo apt autoremove
These commands should uninstall CUDA from your system.
If you have cuDNN configured to work with the GPU:
You can remove it just by deleting the files from the directories that you copied them to during its setup.
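To see whether the DLL named in the ImportError is reachable at all, you can scan the directories on PATH for it. A minimal sketch (the helper name is made up for illustration; the file name comes from the error message in the question):

```python
import os


def find_on_path(filename, path=None):
    """Return every directory in the PATH-style string `path` that contains `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        # Skip empty entries, which appear when PATH has a trailing separator.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    dll = "cudart64_90.dll"  # the file named in the ImportError
    found = find_on_path(dll)
    print(found if found else f"{dll} is not in any PATH directory")
```

An empty result means either CUDA 9.0 is not installed or its bin directory was never added to PATH; finding only a differently numbered cudart DLL there would point to a mismatched CUDA version.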

Installing GDAL on Google Cloud Datalab

I am having trouble installing GDAL on Google Cloud Datalab. When I run:
!pip install gdal
I get the following error
Collecting gdal
Using cached GDAL-2.2.4.tar.gz
Complete output from command python setup.py egg_info:
running egg_info
creating pip-egg-info/GDAL.egg-info
writing pip-egg-info/GDAL.egg-info/PKG-INFO
writing top-level names to pip-egg-info/GDAL.egg-info/top_level.txt
writing dependency_links to pip-egg-info/GDAL.egg-info/dependency_links.txt
writing manifest file 'pip-egg-info/GDAL.egg-info/SOURCES.txt'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-umhRKK/gdal/setup.py", line 342, in <module>
**extra )
File "/usr/local/lib/python2.7/dist-packages/setuptools/__init__.py", line 129, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/egg_info.py", line 278, in run
self.find_sources()
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/egg_info.py", line 293, in find_sources
mm.run()
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/egg_info.py", line 524, in run
self.add_defaults()
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/egg_info.py", line 560, in add_defaults
sdist.add_defaults(self)
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/py36compat.py", line 36, in add_defaults
self._add_defaults_ext()
File "/usr/local/lib/python2.7/dist-packages/setuptools/command/py36compat.py", line 119, in _add_defaults_ext
build_ext = self.get_finalized_command('build_ext')
File "/usr/lib/python2.7/distutils/cmd.py", line 312, in get_finalized_command
cmd_obj.ensure_finalized()
File "/usr/lib/python2.7/distutils/cmd.py", line 109, in ensure_finalized
self.finalize_options()
File "/tmp/pip-build-umhRKK/gdal/setup.py", line 217, in finalize_options
self.gdaldir = self.get_gdal_config('prefix')
File "/tmp/pip-build-umhRKK/gdal/setup.py", line 191, in get_gdal_config
return fetch_config(option)
File "/tmp/pip-build-umhRKK/gdal/setup.py", line 144, in fetch_config
raise gdal_config_error, e""")
File "<string>", line 4, in <module>
__main__.gdal_config_error: [Errno 2] No such file or directory
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-umhRKK/gdal/
Same thing goes for :
!pip install gdal==2.2
or:
!pip install python-gdal
and a few other similar commands I have tried. The fix I found, updating setuptools, did not help: I still get this error after updating.
What is the problem?
As per the Datalab documentation on how to add a Python library to Cloud Datalab, there are three approaches you can follow to install a third-party library:
1. Use !pip install lib-name. This has not worked for you, even after applying some of the troubleshooting fixes you found.
2. Use the code under # Option 2 below to add the command to the startup script of the Datalab instance, then restart the instance.
3. Use your own customized Docker container, inherited from the Datalab Docker container. This option is the most complex, but it may be the most appropriate one for packages that need more flexible customization, such as the gdal package you are having issues with.
Script to be used in option 2:
%%bash
# Option 2
echo "pip install lib-name" >> /content/datalab/.config/startup.sh
cat /content/datalab/.config/startup.sh

Resources