Airflow initdb fails due to SyntaxError on importing AsyncRetrying - installation

I am new to Airflow. I installed it with pip install apache-airflow. When I run the command airflow initdb in the terminal, I get the error below. Where did I go wrong during the install, and how can I fix this issue?
aamir@aamir-Inspiron-3542:~$ airflow initdb
[2019-03-30 18:32:27,309] {__init__.py:51} INFO - Using executor SequentialExecutor
DB: sqlite:////home/aamir/airflow/airflow.db
[2019-03-30 18:32:31,790] {db.py:338} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
ERROR [airflow.models.DagBag] Failed to import: /home/aamir/anaconda3/lib/python3.7/site-packages/airflow/example_dags/example_http_operator.py
Traceback (most recent call last):
File "/home/aamir/anaconda3/lib/python3.7/site-packages/airflow/models.py", line 374, in process_file
m = imp.load_source(mod_name, filepath)
File "/home/aamir/anaconda3/lib/python3.7/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 696, in _load
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/aamir/anaconda3/lib/python3.7/site-packages/airflow/example_dags/example_http_operator.py", line 27, in <module>
from airflow.operators.http_operator import SimpleHttpOperator
File "/home/aamir/anaconda3/lib/python3.7/site-packages/airflow/operators/http_operator.py", line 21, in <module>
from airflow.hooks.http_hook import HttpHook
File "/home/aamir/anaconda3/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 23, in <module>
import tenacity
File "/home/aamir/anaconda3/lib/python3.7/site-packages/tenacity/__init__.py", line 352
from tenacity.async import AsyncRetrying
^
SyntaxError: invalid syntax
Done.

In Python 3.7, async is a reserved keyword, which means it can no longer be used as an identifier, including in module and variable names. This was valid in earlier Python versions, but from 3.7 onwards a SyntaxError is raised.
In your case, Airflow ships with example DAGs, which were parsed when you ran airflow initdb. Some of those DAGs use the SimpleHttpOperator, which depends on http_hook.py. That hook in turn depends on the tenacity library, which tries to import its async module as part of initialization:
from tenacity.async import AsyncRetrying
To fix this, wait for/install Airflow v1.10.3, which updates tenacity (see AIRFLOW-2876). Alternatively, you can downgrade your Python version to 3.6. You can verify that the import fails under Python 3.7.3:
$ docker run --rm -it python:3.7
Python 3.7.3 (default, Mar 27 2019, 23:40:30)
>>> from tenacity.async import AsyncRetrying
File "<stdin>", line 1
from tenacity.async import AsyncRetrying
^
SyntaxError: invalid syntax
The same statement parses fine in version 3.6.8, however; the ModuleNotFoundError below only means tenacity isn't installed in that container, not that the syntax is invalid:
$ docker run --rm -it python:3.6
Python 3.6.8 (default, Feb 6 2019, 12:07:20)
>>> from tenacity.async import AsyncRetrying
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'tenacity'
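Either route can be applied with pip; a minimal sketch, where the tenacity version pin is an assumption (tenacity renamed its async module in a later release, but check the exact version, and note that a manual upgrade may clash with Airflow's own pinned requirement):
$ pip install "apache-airflow>=1.10.3"   # once released; pulls in a Python 3.7-compatible tenacity
$ pip install --upgrade "tenacity>=5.0"  # stopgap, assumed pin; verify against Airflow's requirements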

Related

Run docker-compose with podman as a backend on macOS

I am trying to run the docker-compose CLI with a podman backend on my local machine (macOS). Here is what I did:
Installed podman using brew: brew install podman
Initialized the podman machine: podman machine init
Started the machine: podman machine start
Running podman version gives me this:
Client:
Version: 3.4.4
API Version: 3.4.4
Go Version: go1.17.6
Built: Wed Dec 8 19:41:11 2021
OS/Arch: darwin/amd64
Server:
Version: 3.4.4
API Version: 3.4.4
Go Version: go1.16.8
Built: Wed Dec 8 22:45:07 2021
OS/Arch: linux/amd64
I created a symlink for podman (ln -s podman docker), so running docker version gives me the same output, and I can actually run containers using docker run even though docker is not installed.
Afterwards I installed docker-compose using:
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose version gives me this:
docker-compose version 1.29.2, build 5becea4c
docker-py version: 5.0.0
CPython version: 3.9.0
OpenSSL version: OpenSSL 1.1.1h 22 Sep 2020
The problem is that docker-compose up test is not working: docker-compose doesn't seem to find the Docker host to connect to, or is somehow blocked. Does somebody know how to solve this issue?
Traceback (most recent call last):
File "urllib3/connectionpool.py", line 670, in urlopen
File "urllib3/connectionpool.py", line 392, in _make_request
File "http/client.py", line 1255, in request
File "http/client.py", line 1301, in _send_request
File "http/client.py", line 1250, in endheaders
File "http/client.py", line 1010, in _send_output
File "http/client.py", line 950, in send
File "docker/transport/unixconn.py", line 43, in connect
ConnectionRefusedError: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "requests/adapters.py", line 439, in send
File "urllib3/connectionpool.py", line 726, in urlopen
File "urllib3/util/retry.py", line 410, in increment
File "urllib3/packages/six.py", line 734, in reraise
File "urllib3/connectionpool.py", line 670, in urlopen
File "urllib3/connectionpool.py", line 392, in _make_request
File "http/client.py", line 1255, in request
File "http/client.py", line 1301, in _send_request
File "http/client.py", line 1250, in endheaders
File "http/client.py", line 1010, in _send_output
File "http/client.py", line 950, in send
File "docker/transport/unixconn.py", line 43, in connect
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionRefusedError(61, 'Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "docker/api/client.py", line 214, in _retrieve_server_version
File "docker/api/daemon.py", line 181, in version
File "docker/utils/decorators.py", line 46, in inner
File "docker/api/client.py", line 237, in _get
File "requests/sessions.py", line 543, in get
File "requests/sessions.py", line 530, in request
File "requests/sessions.py", line 643, in send
File "requests/adapters.py", line 498, in send
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionRefusedError(61, 'Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "docker-compose", line 3, in <module>
File "compose/cli/main.py", line 81, in main
File "compose/cli/main.py", line 200, in perform_command
File "compose/cli/command.py", line 60, in project_from_options
File "compose/cli/command.py", line 152, in get_project
File "compose/cli/docker_client.py", line 41, in get_client
File "compose/cli/docker_client.py", line 170, in docker_client
File "docker/api/client.py", line 197, in __init__
File "docker/api/client.py", line 221, in _retrieve_server_version
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', ConnectionRefusedError(61, 'Connection refused'))
[63757] Failed to execute script docker-compose
Did you install podman-mac-helper?
sudo /opt/homebrew/Cellar/podman/4.0.2/bin/podman-mac-helper install
docker-compose v2.3.4 works just fine with Podman v4.0.2 on a MacBook Pro M1 running macOS Monterey, as long as you install podman-mac-helper (which makes /var/run/docker.sock available).
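If you would rather not use the helper (it requires root and wires up the system-wide /var/run/docker.sock), pointing docker-compose at Podman's user-level socket can also work; a minimal sketch, where the socket path is an assumption — podman machine start prints the exact path for your machine:
$ export DOCKER_HOST="unix://$HOME/.local/share/containers/podman/machine/podman-machine-default/podman.sock"
$ docker-compose up test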

How to install pyspark without hadoop?

I want to install PySpark, but I don't want to use Hadoop because I just want to test out some functions. I followed instructions from a bunch of websites: I used pip to install pyspark, installed JDK 8, and set the JAVA_PATH, SPARK_HOME, and PATH variables, but it's not working.
My program is:
from pyspark import *
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
I am getting this exception:
\Java\jdk1.8.0_291\bin\java was unexpected at this time.
Traceback (most recent call last):
File "c:\Users\ankit\Untitled-1.py", line 4, in <module>
spark = SparkSession.builder.getOrCreate()
File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\sql\session.py", line 228, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 384, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 144, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 331, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\java_gateway.py", line 108, in launch_gateway
raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
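For reference, the "\Java\jdk1.8.0_291\bin\java was unexpected at this time." line comes from cmd.exe failing to parse a path, which usually points at an environment variable containing spaces, parentheses, or stray quotes (e.g. C:\Program Files\...) rather than at PySpark itself. Note also that PySpark's launcher reads JAVA_HOME, not JAVA_PATH. A minimal sketch of the setup on Windows cmd, assuming a hypothetical JDK location without spaces (adjust to your install; with a pip-installed pyspark, SPARK_HOME is usually not needed at all):
:: hypothetical JDK location without spaces or parentheses
set JAVA_HOME=C:\Java\jdk1.8.0_291
set PATH=%JAVA_HOME%\bin;%PATH%
python -c "from pyspark.sql import SparkSession; print(SparkSession.builder.getOrCreate().version)"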

VirtualBox anaconda installation failing

I'm having trouble installing Anaconda on my Ubuntu VirtualBox VM. I have tried rebooting and assigning a bigger chunk of base memory, but it still fails at the final few hurdles.
Unpacking payload ...
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
File "concurrent/futures/process.py", line 368, in _queue_management_worker
File "multiprocessing/connection.py", line 251, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "entry_point.py", line 69, in <module>
File "concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
File "concurrent/futures/_base.py", line 611, in result_iterator
File "concurrent/futures/_base.py", line 439, in result
File "concurrent/futures/_base.py", line 388, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or
pending.
[1981] Failed to execute script entry_point
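A BrokenProcessPool here means one of the installer's worker processes was killed abruptly, which inside a VM is often the kernel OOM killer or a full disk rather than the installer itself. A quick sanity check inside the guest before re-running the installer (the installer filename is a placeholder):
$ free -h          # enough free memory for the unpack step?
$ df -h ~          # is the target filesystem full?
$ bash Anaconda3-*.sh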

"No module named error" when try to run Elasticalert

When I try to run elastalert
python -m elastalert.elastalert --verbose --start 2019-09-04 --rule rules/rule.yaml --config config.yaml
I get the following error.
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/lib/python2.7/dist-packages/elastalert-0.2.1-py2.7.egg/elastalert/elastalert.py", line 29, in <module>
from . import kibana
File "/usr/local/lib/python2.7/dist-packages/elastalert-0.2.1-py2.7.egg/elastalert/kibana.py", line 4, in <module>
import urllib.error
ImportError: No module named error
My environment is:
Ubuntu 18
Elasticsearch 6.3.0
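Worth noting: the traceback shows Python 2.7 executing import urllib.error, and the urllib.error module only exists in Python 3, so this elastalert build evidently expects a Python 3 interpreter. A minimal re-run under Python 3, assuming one is installed (elastalert is the PyPI package name):
$ python3 --version
$ pip3 install elastalert==0.2.1
$ python3 -m elastalert.elastalert --verbose --start 2019-09-04 --rule rules/rule.yaml --config config.yaml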

Ubuntu 10.04 - Python multiprocessing - 'module' object has no attribute 'local' error

The following code is from the Python 2.6 manual.
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', 'me')
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
This creates the following stack traces:
Traceback (most recent call last):
File "threading.py", line 1, in <module>
from multiprocessing import Process
File "/usr/lib/python2.6/multiprocessing/__init__.py", line 64, in <module>
from multiprocessing.util import SUBDEBUG, SUBWARNING
File "/usr/lib/python2.6/multiprocessing/util.py", line 287, in <module>
class ForkAwareLocal(threading.local):
AttributeError: 'module' object has no attribute 'local'
Exception AttributeError: '_shutdown' in <module 'threading' from '/home/v0idnull/tmp/pythreads/threading.pyc'> ignored
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib/python2.6/multiprocessing/util.py", line 258, in _exit_function
info('process shutting down')
TypeError: 'NoneType' object is not callable
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib/python2.6/multiprocessing/util.py", line 258, in _exit_function
info('process shutting down')
TypeError: 'NoneType' object is not callable
I'm completely clueless as to WHY this is happening, and Google has given me very little to work with.
That code runs fine on my machine: Ubuntu 10.10, Python 2.6.6 64-bit.
Your error is actually happening because you have a file named threading.py in the directory you are running this code from (see the stack-trace details). This causes a namespace clash: the multiprocessing module needs the real threading module from the standard library, but picks up your local file instead. Rename your file to something other than threading.py and run it again; a quick way to confirm the clash is shown below.
Also, the example you posted is not from the Python 2.6 docs; it is from the Python 3.x docs. Make sure you are reading the docs for the version that matches what you are running.
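You can verify which file Python actually binds to the threading name; a minimal check, run from the same directory as your script:
$ python -c "import threading; print(threading.__file__)"
# if this prints your local threading.py (or a stale threading.pyc) instead of
# /usr/lib/python2.6/threading.py, rename the file, delete the .pyc, and re-run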
