I have installed Celery 3.1.5, RabbitMQ server 3.2.1, and Python 2.7.5 on a Windows 7 64-bit machine. Here is my code, which I copied from the first-steps-with-celery tutorial.
from celery import Celery
app = Celery('tasks', backend='amqp', broker='amqp://guest@localhost//')
@app.task
def add(x, y):
    return x + y
When I execute the task from the Python shell I get a "The operation timed out" exception, and state and ready() always return 'PENDING' and False.
>>> from tasks import *
>>> result = add.delay(4, 4)
>>> result.ready()
False
>>> result.state
'PENDING'
>>> result.get(timeout=20)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\celery\result.py", line 136, in get
interval=interval)
File "C:\Python27\lib\site-packages\celery\backends\amqp.py", line 154, in wait_for
raise TimeoutError('The operation timed out.')
celery.exceptions.TimeoutError: The operation timed out.
>>>
I verified that the RabbitMQ server is running, but I have no clue why Celery is throwing this exception.
You can try to start the worker with the command:
celery -A proj worker -l info --pool=solo
There are lots of things that can cause the result.get() call to fail, because there are many steps in the chain between sending the message via .delay(), through Celery, to the broker (RabbitMQ), and back to a Celery worker, which does the work and posts the result back. I had this problem, and the solution was the one that @Deja_vu suggested: --pool=solo (note one equals sign, not two).
The default "pool" option is "prefork" (see http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#module-celery.bin.worker ). So this may be a Celery bug in its "prefork" system under Windows: see https://github.com/celery/celery/issues/2146
Related StackOverflow questions:
Celery 'Getting Started' not able to retrieve results; always pending
Trouble getting result from Celery queue
I am using the Redis Cloud add-on on my Heroku application, and I keep getting this error sporadically. I have tried flushing the Redis DB and restarting dynos, and that seems to fix it, but I am curious why this is happening so often.
I am running worker dynos that use this Redis DB, and I am using python-rq to schedule jobs on the worker queues.
File "/usr/local/lib/python3.8/site-packages/redis/connection.py", line 563, in connect
raise ConnectionError(self._error_message(e))
2020-08-05T17:12:25.451733+00:00 app[worker_proc.5]: redis.exceptions.ConnectionError: Error -3 connecting to redis-13618.c73.us-east-1-2.ec2.cloud.redislabs.com:13618. Temporary failure in name resolution.
2020-08-05 17:12:25,461 INFO exited: worker_proc-0 (exit status 1; not expected)
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/redis/connection.py", line 559, in connect
sock = self._connect()
File "/usr/local/lib/python3.8/site-packages/redis/connection.py", line 584, in _connect
for res in socket.getaddrinfo(self.host, self.port, self.socket_type,
File "/usr/local/lib/python3.8/socket.py", line 918, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
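The "Temporary failure in name resolution" error means the dyno briefly failed to resolve the Redis Labs hostname via DNS. One common mitigation (not a root-cause fix) is to retry the connection with backoff. A minimal sketch using redis-py, where the environment variable name and retry parameters are assumptions:

import os
import time

import redis

def connect_with_retry(url, attempts=5, base_delay=1.0):
    # retry transient DNS/connection failures with linear backoff
    for attempt in range(1, attempts + 1):
        try:
            conn = redis.Redis.from_url(url)
            conn.ping()  # force an actual connection attempt
            return conn
        except redis.exceptions.ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(base_delay * attempt)

conn = connect_with_retry(os.environ['REDISCLOUD_URL'])  # env var name is an assumption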
The following is an attempt to launch a cluster with ten slaves.
12:13:44/sparkup $ec2/spark-ec2 -k sparkeast -i ~/.ssh/myPem.pem \
-s 10 -z us-east-1a -r us-east-1 launch spark2
Here is the output. Note that the same command had been successful with the February master code; today I updated to the latest 1.4.0-SNAPSHOT.
Setting up security groups...
Searching for existing cluster spark2 in region us-east-1...
Spark AMI: ami-5bb18832
Launching instances...
Launched 10 slaves in us-east-1a, regid = r-68a0ae82
Launched master in us-east-1a, regid = r-6ea0ae84
Waiting for AWS to propagate instance metadata...
Waiting for cluster to enter 'ssh-ready' state.........unable to load cexceptions
TypeError
p0
(S''
p1
tp2
Rp3
(dp4
S'child_traceback'
p5
S'Traceback (most recent call last):\n File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1280, in _execute_child\n sys.stderr.write("%s %s (env=%s)\\n" %(executable, \' \'.join(args), \' \'.join(env)))\nTypeError\n'
p6
sb.Traceback (most recent call last):
File "ec2/spark_ec2.py", line 1444, in <module>
main()
File "ec2/spark_ec2.py", line 1436, in main
real_main()
File "ec2/spark_ec2.py", line 1270, in real_main
cluster_state='ssh-ready'
File "ec2/spark_ec2.py", line 869, in wait_for_cluster_state
is_cluster_ssh_available(cluster_instances, opts):
File "ec2/spark_ec2.py", line 833, in is_cluster_ssh_available
if not is_ssh_available(host=dns_name, opts=opts):
File "ec2/spark_ec2.py", line 807, in is_ssh_available
stderr=subprocess.STDOUT # we pipe stderr through stdout to preserve output order
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 709, in __init__
errread, errwrite)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1328, in _execute_child
raise child_exception
TypeError
The AWS console shows that the instances are actually running, so it is unclear what actually failed.
Any hints or workarounds appreciated.
UPDATE: This same error occurs when running the login command. It seems to be a problem with the boto API, but the cluster itself appears to be OK.
ec2/spark-ec2 -i ~/.ssh/sparkeast.pem login spark2
Searching for existing cluster spark2 in region us-east-1...
Found 1 master, 10 slaves.
Logging into master ec2-54-87-46-170.compute-1.amazonaws.com...
unable to load cexceptions
TypeError
p0
(.. same exception stacktrace as above )
The issue is that the Python 2.7.6 installation on my Yosemite MacBook appears to have become corrupted.
I reset PATH and PYTHONPATH to point to a custom Homebrew-installed Python version, after which boto and the other Python commands, including building the Spark performance project, work fine.
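A quick diagnostic sketch to confirm which interpreter and boto your scripts are actually picking up (the expected Homebrew path is an assumption about a typical install):

import sys
print(sys.executable)  # expect the Homebrew python, e.g. /usr/local/bin/python,
                       # not /System/Library/Frameworks/Python.framework/...

import boto
print(boto.__file__)   # shows which boto installation is being imported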
I have a Flask app running on nginx + uWSGI.
On my local server (non-nginx), I get a nice stack trace + error reporting for exceptions.
Like this:
$ python run.py
Traceback (most recent call last):
File "run.py", line 1, in <module>
from myappname import app
File "/home/me/myappname/myappname/__init__.py", line 27, in <module>
file_handler.setLevel(logging.debug)
File "/usr/lib/python2.7/logging/__init__.py", line 710, in setLevel
self.level = _checkLevel(level)
File "/usr/lib/python2.7/logging/__init__.py", line 190, in _checkLevel
raise TypeError("Level not an integer or a valid string: %r" % level)
On nginx, there is next to no logging whatsoever (in /var/log/nginx/error.log).
This post suggests adding app.logger.exception('Failed') to my script, which didn't help.
How do I enable this sort of logging for debugging purposes?
Nginx will capture your app's console output, but you must make the app recover from exceptions; otherwise, you'll only get 500 or 400 errors from nginx.
Try running the app outside nginx until it seems stable.
Use the logging module to capture app status information to your own log file; this strategy will be useful in the long run.
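Note that the traceback in the question comes from passing logging.debug (a function) to setLevel(), which expects a level constant such as logging.DEBUG. A minimal sketch of file logging for a Flask app; the log path and format are assumptions, and the file must be writable by the uWSGI user:

import logging
from logging.handlers import RotatingFileHandler

from flask import Flask

app = Flask(__name__)

file_handler = RotatingFileHandler('/var/log/myappname/app.log',
                                   maxBytes=1000000, backupCount=3)
file_handler.setLevel(logging.DEBUG)  # a level constant, not logging.debug
file_handler.setFormatter(
    logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
app.logger.addHandler(file_handler)
app.logger.setLevel(logging.DEBUG)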
Is there a way to set a timeout in psycopg2 for db transactions or for db queries?
A sample use-case:
Heroku limits Django web requests to 30 seconds, after which Heroku terminates the request without allowing Django to gracefully roll back any transactions that have not yet returned. This can leave outstanding transactions open on Postgres. You could configure a timeout in the database, but that would also limit non-web-related queries such as maintenance scripts, analytics, etc. In this case, setting a timeout via the middleware (or via Django) would be preferable.
You can set the timeout at connection time using the options parameter. The syntax is a bit weird:
>>> import psycopg2
>>> cnn = psycopg2.connect("dbname=test options='-c statement_timeout=1000'")
>>> cur = cnn.cursor()
>>> cur.execute("select pg_sleep(2000)")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
psycopg2.extensions.QueryCanceledError: canceling statement due to statement timeout
It can also be set using an environment variable:
>>> import os
>>> os.environ['PGOPTIONS'] = '-c statement_timeout=1000'
>>> import psycopg2
>>> cnn = psycopg2.connect("dbname=test")
>>> cur = cnn.cursor()
>>> cur.execute("select pg_sleep(2000)")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
psycopg2.extensions.QueryCanceledError: canceling statement due to statement timeout
You can set a per-statement timeout at any time using SQL. For example:
SET statement_timeout = '2s'
will abort any statement (following it) that takes more than 2 seconds (you can use any valid unit, such as 's' or 'ms'). Note that when a statement times out, psycopg2 raises an exception, and it is up to you to catch it and act appropriately.
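For instance, a minimal sketch of catching the timeout (pg_sleep is just a stand-in for a slow query):

import psycopg2
import psycopg2.extensions

cnn = psycopg2.connect("dbname=test")
cur = cnn.cursor()
cur.execute("SET statement_timeout = '2s'")
try:
    cur.execute("select pg_sleep(10)")  # stand-in for a slow statement
except psycopg2.extensions.QueryCanceledError:
    cnn.rollback()  # the aborted transaction must be rolled back before reuse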
Looks like PostgreSQL 9.6 added idle transaction timeouts. See:
https://www.postgresql.org/docs/9.6/static/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT for reference.
http://blog.dbi-services.com/a-look-at-postgresql-9-6-killing-idle-transactions-automatically/ as an example.
PostgreSQL 9.6 is also supported on Heroku, so you should be able to use this.
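If you go that route, the setting can also be enabled per session; a one-line sketch reusing the cursor from the examples above (the 30s value is an arbitrary assumption):

cur.execute("SET idle_in_transaction_session_timeout = '30s'")  # requires PostgreSQL 9.6+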
In Python, I am using the following:
context = zmq.Context()
socket = context.socket(zmq.PUSH)
socket.bind_to_random_port('tcp://*', min_port=6001, max_port=6004, max_tries=100)
port_selected = socket.???????
How do I know which port was chosen? I will have a lookup table in Redis for the workers to read.
I am using a push/pull model, and I need to let the workers know which ports to connect to.
I have to do this because I am using the gevent loop in uWSGI, and specifying a plain bind throws an error because of a fork. If I use bind_to_random_port then a port is selected, I just don't know which.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/gevent-1.0b2-py2.7-linux-x86_64.egg/gevent/greenlet.py",
line 328, in run
result = self._run(*self.args, **self.kwargs)
File "/home/ubuntu/workspace/rtbopsConfig/rtbServers/rtbUwsgiPixelServer/uwsgiPixelServer.py",
line 43, in sendthis
socket.send(push)
File "/usr/local/lib/python2.7/dist-packages/zmq/green/core.py",
line 173, in send
self._wait_write()
File "/usr/local/lib/python2.7/dist-packages/zmq/green/core.py",
line 108, in _wait_write
assert self.__writable.ready(), "Only one greenlet can be waiting
on this event"
AssertionError: Only one greenlet can be waiting on this event
<Greenlet at 0x2d41370: sendthis('2012-07-02 04:05:15')> failed with
AssertionError
port_selected = socket.bind_to_random_port('tcp://*', min_port=6001, max_port=6004, max_tries=100)
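bind_to_random_port returns the port number it bound to. From there, a sketch of the Redis lookup-table idea mentioned in the question (the key name is an assumption):

import redis

# publish the chosen port so workers know where to connect
r = redis.Redis()
r.set('push_socket_port', port_selected)

# a worker can then read it back and connect:
#   port = int(r.get('push_socket_port'))
#   worker_socket.connect('tcp://push-host:%d' % port)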