Does django-celery-backend only save a record when the task completes?

I've been doing some testing on django-celery-results.
I have found that the task result is only stored when the task completes, using the sqlite db.
Is this correct?

I am unfamiliar with django-celery-results, but a quick glance at its code suggests it is just saving data using the Django ORM, which implies that the same rules as for regular Celery should apply.
In that case, yes: by default only terminal states (such as SUCCESS and FAILURE) are stored (you can read more about it here, but generally only terminal states are stored by default).
You can tweak this by setting the track_started flag to also record the STARTED state (more info here); the default is:
track_started = False
Note that the PENDING state is never actually persisted; it is returned whenever no other state exists for that task (more info here).
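For example, a minimal configuration sketch (assuming Celery 4+ lowercase setting names and the django-celery-results backend name):

```python
# celeryconfig.py -- minimal sketch; setting names assume Celery 4+,
# and "django-db" assumes the django-celery-results backend.
task_track_started = True        # also persist the STARTED state
result_backend = "django-db"     # store results via the Django ORM
```

With this, a long-running task shows up in the result table as STARTED as soon as a worker picks it up, rather than only appearing when it reaches a terminal state.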

Related

Is this Redis Race Condition Scenario Possible?

I'm debugging an issue in an application and I've run into a scenario where I'm out of ideas, but I suspect a race condition might be in play.
Essentially, I have two API routes - let's call them A and B. Route A generates some data and Route B is used to poll for that data.
Route A first creates an entry in the redis cache under a given key, then starts a background process to generate some data. The route immediately returns a polling ID to the caller, while the background data thread continues to run. When the background data is fully generated, we write it to the cache using the same cache key. Essentially, an overwrite.
Route B is a polling route. We simply query the cache using that same cache key - we expect one of 3 scenarios in this case:
1. The object is in the cache but contains no data - this indicates that the data is still being generated by the background thread and isn't ready yet.
2. The object is in the cache and contains data - this means that the process has finished and we can return the result.
3. The object is not in the cache - we assume this means you are trying to poll for an ID that never existed in the first place.
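The three cases above can be sketched as a small polling handler (names here are hypothetical; client is any Redis-style client, and an empty value is the "still generating" placeholder written by route A):

```python
def poll(client, key):
    """Return the polling status for a key, distinguishing the three cases."""
    value = client.get(key)
    if value is None:
        # Case 3: key was never created (or expired) -- unknown polling ID.
        raise KeyError("unknown polling ID: %s" % key)
    if value == b"":
        # Case 1: placeholder written by route A; data still being generated.
        return {"status": "pending"}
    # Case 2: the background thread has overwritten the placeholder with data.
    return {"status": "done", "data": value}
```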
For the most part, this works as intended. However, every now and then we see scenario 3 being hit, where an error is being thrown because the object wasn't in the cache. Because we add the placeholder object to the cache before the creation route ever returns, we should be able to safely assume this scenario is impossible. But that's clearly not the case.
Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying? That is, is it possible that even though the call to add the cache entry has completed, the data would briefly not be returned by queries? It seems to be the only thing that can explain the behavior we are seeing.
If that is a possibility, how can I avoid this scenario? Is there some way to force Redis to wait until the data is available for query before returning?
Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying?
Yes, and it may depend on your Redis topology and on your network configuration. Only a standalone Redis server provides strong consistency, albeit with some considerations - see below.
Redis replication
When using replication in Redis, writes that happen on the master need some time to propagate to its replica(s), and the whole process is asynchronous. Your client may happen to issue read-only commands to replicas, a common approach used to distribute load among the available nodes of your topology. If that is the case, you may want to lower the chance of an inconsistent read by:
directing your read queries to the master node; and/or,
issuing a WAIT command right after the write operation, and ensuring all the replicas acknowledge it: this makes replication effectively synchronous from the client's standpoint, but it should be used only if absolutely needed because of its performance cost.
There would still be the (tiny) possibility of an inconsistent read if, during a failover, the replication process promotes a replica which did not receive the write operation.
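As a sketch of the WAIT option, a write-then-acknowledge helper might look like this in Python (client is a redis-py-style client; the key name and timeout are hypothetical):

```python
def write_with_ack(client, key, value, num_replicas=1, timeout_ms=100):
    """SET a key, then block until the replicas acknowledge the write.

    WAIT returns how many replicas acknowledged within the timeout;
    the write is only treated as safe if enough of them did.
    """
    client.set(key, value)
    acked = client.execute_command("WAIT", num_replicas, timeout_ms)
    return acked >= num_replicas
```

With redis-py this would be called right after route A creates its placeholder entry, e.g. write_with_ack(redis.Redis(), "poll:1234", "").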
Standalone Redis server
With a standalone Redis server, there is no need to synchronize data with replicas and, on top of that, your read-only commands would be always handled by the same server which processed the write commands. This is the only strongly consistent option, provided you are also persisting your data accordingly: in fact, you may end up having a server restart between your write and read operations.
Persistence
Redis supports several different persistence options; in your scenario, you may want to configure your server so that it:
logs every write operation to disk (AOF), and
fsyncs on every write (appendfsync always).
Of course, every configuration setting is a trade off between performance and durability.
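In redis.conf terms, a durability-leaning sketch of those two settings (at the cost of write throughput):

```
# Append-only file: log every write operation to disk
appendonly yes
# fsync on every write (slowest, most durable policy)
appendfsync always
```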

Hibernate-search elasticsearch data indexing is synchronous or Asynchronous

We are implementing hibernate-search elasticsearch in our project and need a clarification.
After em.persist, once the transaction is committed, is the data indexed into Elasticsearch synchronously or asynchronously? Basically, will the method return right after the commit, or will it ensure the data has been indexed in Elasticsearch first? In the following video https://www.youtube.com/watch?v=_NGnbON3xAo at 17:30 I hear that the data is put on a queue and indexed in ES in a batch after the DB commit. So if Elasticsearch indexing fails for some reason, I might want to implement a manual DB rollback so that the data is in a consistent state in both stores.
Thanks in advance
Data is sent to Elasticsearch on commit, either synchronously or asynchronously depending on your settings.
If you're interested in changing those settings, have a look at the documentation: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#configuration-worker
EDIT: Also, even in synchronous mode, Elasticsearch is "near-real-time", meaning that even once the changes have been processed, they won't be visible until a short time afterwards. So you might also have a look at the hibernate.search.default.elasticsearch.refresh_after_write configuration setting. From the docs:
Whether to perform an explicit refresh after a set of operations has been executed against a specific index (true or false).
hibernate.search.default.elasticsearch.refresh_after_write = false (default)
This is useful in unit tests to ensure that a write is visible by a
query immediately without delay. This keeps unit tests simpler. You
should not rely on the synchronous behaviour for your production code
except in rare cases as Elasticsearch is optimised for asynchronous
writes: leave at false for optimal performance.
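As a sketch, the relevant Hibernate Search 5.x properties might look like this (enable the sync/refresh behaviour only where you genuinely need read-your-writes, such as in tests):

```
# Block on commit until Elasticsearch has processed the index works
hibernate.search.default.worker.execution = sync
# Force an index refresh after each write so it is immediately searchable
hibernate.search.default.elasticsearch.refresh_after_write = true
```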

Connecting Redis events to Lua Script execution and concurrency issues

I have grouped key-value pairs, or data structures, built using the Redisson library. The design is that a change in the value of any group's value(s) should be sent as an event to subscribing Lua scripts. These scripts then do computations and update another group's key-value pair. This process is implemented as a chain: once a Lua script updates a key-value pair, that in turn generates an event, and another Lua script does work similar to the first, based on certain parameters.
Question 1: How to connect the Lua script and the event?
Question 2: Events are pipelined, but my Lua scripts may have to wait for network I/O. In that case, I assume the next event is processed and its subscribing script executed. This is a problem for me, because the first script hasn't finished updating the key-value pair it needs to and the second script is already going ahead with its work. This will cause errors for me. Is there a way to get around this?
Question 3: How do I emit events from Redisson data structures? I also need the Lua script to understand that data structure's layout. How?
At the time of writing, Redis (3.2.9) does not allow blocking commands inside Lua scripts, including the subscribe command. So it is impossible to achieve what you have described via Lua script.
However you can do it using Redisson Topic and/or Redisson distributed services:
Modify a value, then send a message to a channel. Another process receives the message and does the computation and the update.
Or ...
If there's only one particular process that does the computation and updating, you can use Redisson remote service to tell this process do the work, it works like RPC. Maybe it is able to modify the first value too.
Or ...
Create the whole lot as one runnable job and send it to be processed by a Redisson remote executor. You can also choose to schedule the job if it is not immediately required.
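The first option (write the value, then publish a change event) can be sketched in Python against a generic Redis-style client (Redisson's RTopic plays this role in Java; the key and channel names here are hypothetical):

```python
import json

def publish_change(client, key, value, channel_prefix="changes:"):
    """Write the new value, then notify subscribers that it changed.

    Subscribers listening on the key's channel receive the new value,
    recompute, and can call this same function on the next key in the
    chain -- which triggers the next stage.
    """
    payload = json.dumps(value)
    client.set(key, payload)
    client.publish(channel_prefix + key, payload)
    return payload
```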

Ensure consistency when caching data in after_commit hooks

For a specific database table, we need an in-memory cache of this data that is always in-sync with the database. My current attempt is to write the changes to the cache in an after_commit hook - this way we make sure not to write any changes to the cache that could get reverted later.
However, this strategy is vulnerable to the following scenario:
Thread A locks and updates record, stores value 1
Thread A commits the change
Thread B locks and updates record, stores value 2
Thread B commits the change
Thread B runs the after_commit hook, so the cache now has value 2
Thread A runs the after_commit hook, so the cache now has value 1 but should have value 2
Am I right about this problem and how would one solve this?
You are right about this problem.
There is an after_save callback that runs within the same transaction. You might want to use that one instead of the after_commit hook, which runs after the transaction.
But then you will need to deal with rolled-back transactions yourself.
Or you might want to write your caching method in a way that does not depend on a specific instance, but instead caches the latest version found in the database, by reloading the record from the database first.
But even then: multithreaded systems are hard to keep in sync. And you cannot even ensure whether the first or the second update sent to your cache will be stored, because the caching system might be multi-threaded too.
You might want to read about different consistency models.
The solution we came up with is to lock the cache for read / write before_commit and unlock it in the after_commit. This seems to do the trick.
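Another way to express the same guarantee is to make the cache write itself reject stale data, e.g. by storing a version (in Rails, lock_version or updated_at) alongside the value. A language-agnostic sketch of the idea in Python:

```python
import threading

cache = {}
cache_lock = threading.Lock()

def write_cache(key, value, version):
    """Apply the write only if it is newer than what the cache already holds."""
    with cache_lock:
        current = cache.get(key)
        if current is None or version > current[0]:
            cache[key] = (version, value)

# Thread B's after_commit runs first, then thread A's stale hook arrives late:
write_cache("record:1", "value 2", version=2)
write_cache("record:1", "value 1", version=1)  # stale write is ignored
```

With this, the ordering of the after_commit hooks no longer matters: whichever hook runs last, the cache ends up holding the highest committed version.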

How to restore bolt state during failover

I'm trying to figure out how to restore the state of a Storm bolt instance during failover. I can persist the state externally (DB or file system), but once the bolt instance is restarted I need to point to the specific state of that bolt instance to recover it.
The prepare method of a bolt receives a context, documented here http://nathanmarz.github.io/storm/doc/backtype/storm/task/TopologyContext.html
What is not clear to me is - is there any piece of this context that uniquely identifies the specific bolt instance so I can understand which persistent state to point to? Is that ID preserved during failover? Alternatively, is there any variable/object I can set for the specific bolt/instance that is preserved during failover? Any help appreciated!
br
Sib
P.S.
New to stackoverflow so pls bear with me...
You could probably look at Trident. It's basically an abstraction built on top of Storm. The documentation says:
Trident has first-class abstractions for reading from and writing to stateful sources. The state can either be internal to the topology – e.g., kept in-memory and backed by HDFS – or externally stored in a database like Memcached or Cassandra
In case of any failover, it says:
Trident manages state in a fault-tolerant way so that state updates are idempotent in the face of retries and failures.
You can go through the documentation for any further clarification.
Tx (and credit) to Storm user group!
http://mail-archives.apache.org/mod_mbox/storm-user/201312.mbox/%3C74083558E4509844944FF5CF2BA7B20F1060FD0E#ESESSMB305.ericsson.se%3E
In the original Storm, both spouts and bolts are stateless. Storm manages to restart nodes, but restoring their state requires some effort. There are two solutions I can think of:
If a message fails to process, Storm will replay it from the ROOT of the topology, and the replay logic has to be implemented by the user. So in this case I would put more state information (e.g. the ID of some external state storage and the id of this task) in the messages.
Or you can use Trident. It provides a txid for each transaction and simplifies the storage process.
I'm OK with the first solution because my app doesn't require transactional operations and I have a better understanding of the original Storm (Storm generates simpler logs than Trident does).
You can use the task ID.
Task ids are assigned at topology creation and are static. If a task dies/restarts or gets reassigned somewhere else, it will still have the same id.
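A sketch of using that stable id to key externally persisted state (in the Java API the id comes from TopologyContext.getThisTaskId() inside prepare(); the key format and store here are hypothetical):

```python
def state_key(topology_name, task_id):
    """Build a stable storage key so a restarted task can find its own state."""
    return "bolt-state:{}:{}".format(topology_name, task_id)

def restore_state(store, topology_name, task_id):
    """Load persisted state for this task, or start fresh if none exists."""
    return store.get(state_key(topology_name, task_id), {})
```

Because the task id survives reassignment, the same key is computed before and after failover, so the restarted instance picks up exactly the state it previously persisted.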
