A few years ago I read ODL recommendation not to use READ operation but instead use Data Change Listener or some of its variation. Is it still valid recommendation?
Looking at the ODL code, I got impression that each transaction commit is applied to “In Memory Data Store” immediately during the commit simultaneously with sending notification to the listener. Is it correct?
Why in this case, reading is not as efficient as using the notification?
Where did you read this recommendation? It depends on your use case. Using a data tree change listener (DTCL) with your own cache is going to have faster access than issuing a read operation, especially in a clustered environment if the shard leader is remote. However maintaining your own cache via a DTCL is eventually consistent, meaning your cache may not have up-to-date data. This has to be considered for the use case. If you need strong consistency, then you must use read operations.
Related
I'm debugging an issue in an application and I'm running into a scneario where I'm out of ideas, but I suspect a race condition might be in play.
Essentially, I have two API routes - let's call them A and B. Route A generates some data and Route B is used to poll for that data.
Route A first creates an entry in the redis cache under a given key, then starts a background process to generate some data. The route immediately returns a polling ID to the caller, while the background data thread continues to run. When the background data is fully generated, we write it to the cache using the same cache key. Essentially, an overwrite.
Route B is a polling route. We simply query the cache using that same cache key - we expect one of 3 scenarios in this case:
The object is in the cache but contains no data - this indicates that the data is still being generated by the background thread and isn't ready yet.
The object is in the cache and contains data - this means that the process has finished and we can return the result.
The object is not in the cache - we assume that this means you are trying to poll for an ID that never existed in the first place.
For the most part, this works as intended. However, every now and then we see scenario 3 being hit, where an error is being thrown because the object wasn't in the cache. Because we add the placeholder object to the cache before the creation route ever returns, we should be able to safely assume this scenario is impossible. But that's clearly not the case.
Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying? That is, is it possible that even though the call to add the cache entry has completed but the data would briefly not be returned by queries? It seems the be the only thing that can explain the behavior we are seeing.
If that is a possibility, how can I avoid this scenario? Is there some way to force Redis to wait until the data is available for query before returning?
Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying?
Yes and it may depend on your Redis topology and on your network configuration. Only standalone Redis servers provides strong consistency, albeit with some considerations - see below.
Redis replication
While using replication in Redis, the writes which happen in a master need some time to propagate to its replica(s) and the whole process is asynchronous. Your client may happen to issue read-only commands to replicas, a common approach used to distribute the load among the available nodes of your topology. If that is the case, you may want to lower the chance of an inconsistent read by:
directing your read queries to the master node; and/or,
issuing a WAIT command right after the write operation, and ensure all the replicas acknowledged it: while the replication process would happen to be synchronous from the client standpoint, this option should be used only if absolutely needed because of its bad performance.
There would still be the (tiny) possibility of an inconsistent read if, during a failover, the replication process promotes a replica which did not receive the write operation.
Standalone Redis server
With a standalone Redis server, there is no need to synchronize data with replicas and, on top of that, your read-only commands would be always handled by the same server which processed the write commands. This is the only strongly consistent option, provided you are also persisting your data accordingly: in fact, you may end up having a server restart between your write and read operations.
Persistence
Redis supports several different persistence options; in your scenario, you may want to configure your server so that it
logs to disk every write operation (AOF) and
fsync every query.
Of course, every configuration setting is a trade off between performance and durability.
I have been starting to make greater use of the message data feature of masstransit and am getting to the point needing to manage the message data in the store - i.e. remove old data.
The obvious choice is to have some outside process tidy up data, but clearly a scheduled (or not) clean up could remove data still in use or referenced by error or dead letter queues.
Ideally I would like to limit stored message data retention to messages only in error or dead letter queues, and automatically remove data for messages that have been successfully processed.
What would be the best approach to achieve this with MassTransit? Perhaps with a MiddleWare approach or similar, and if that is the case what is the correct approach?
Manual cleanup is recommended, using whatever makes sense for the repository in use. Because messages may still be in queues, or in error/dead-letter queues as you pointed out, it is really up to development/operations team to know when the right time is to remove older message data.
I'd suggest monitoring and managing the error/dead-letter queues more aggressively, keeping them empty. And then, just figure a good timeframe to delete old message data - one week, ten days, whatever - and deal with it that way.
I have had a backlog item to come up with a way to automatically manage message data, but since message data can be forwarded (using the same stored data) either via publish or send, there is no good way to track references.
I'm new in ES, and only trying to sort everything in my head. I have heard that ES is actually solving the consistency issue between write and read database (with some delay for sure). But I still do not fully understand how?
If command is coming to domain and aggregate root firing event to update event store, same event is sending to update read side?? But what if message lost, we will have outdated read side.
Is projections the only solution??So instead of updating from event, read side walking through event store and reproducing aggregate (from beginning or from some snapshot). But in such case it's probably breaking some rules as read side should be simple and it should not know about domain. And also usually read side is a separate application so she can't know about aggregate.
For sure we also can use rabbitMQ or some other message broker to not lost messages,and actually I think we need. But I also read that to make it consistent "you can use rabbit or ES", but again how ES can make it consistent by own??
Benjamin is completely right about the purpose of Event Sourcing.
My answer aims to add some more details.
First:
Read models and projections aren't suppose to represent the aggregate state.
Projections are the way for event-sourced systems to build the read model for CQRS. CQRS in essence postulates that write and read models usually serve different purposes and therefore it makes perfect sense to use another model for the read side.
Therefore, you often find multiple projections building different, narrowly purposed models, targeting specific needs for queries.
Second:
By "solving consistency issues" you probably mean that in event-sourced systems each state transition is represented as an event (or multiple events). Therefore, writes are always transactional. The database you choose as your event store should support (could using some library or additional tool) real-time subscription that would allow you to receive new events in your projection, in order. For new projections, it will start reading from the start and eventually come real-time. Subscriptions usually need to keep the current processing position in the global stream of events so when the projection restarts, it starts receiving events from the point which is last known to it.
By doing this, you will guarantee that every state transition in the write model will be reflected in the read model. This is probably what you mean in your original question.
Third:
Now, all those things above imply that you cannot use a message bus (only) to deliver events to projections. Brokers give no ordering guarantees and can deliver one message more than once. Also, message brokers don't keep history so you cannot build new projections at will.
However, it doesn't mean that you can't use brokers at all. Some projections don't require ordering and are idempotent. But the feed for events to publish via a broker is the same subscription, so you get guaranteed delivery and can read past events if necessary.
Fourth:
CQRS doesn't imply separate databases. Sometimes, using CQRS just means that you use some persistence layer for your domain objects, so you read and write aggregates. But for queries, you just query at will, whatever you want. A database view is a technical example of CQRS.
Almost there:
Projections need to have little to no logic, it is true. The main point here is to ensure idempotency, if possible, so projections usually should not use operations to calculate new values based on old values and information from events.
But projections will know about your domain. Everything in your system should know about your domain.
And last:
You can definitely use different databases for write and read models without getting to Event Sourcing. You just need to choose a database that supports a change feed. SQL Server, Postgres, CosmosDb and other databases have such functionality.
P.S. I'd suggest spending some time studying those concepts. I can point to the book repository, it has CQRS and Event Sourcing examples: https://github.com/PacktPublishing/Hands-On-Domain-Driven-Design-with-.NET-Core
I have heard that ES is actually solving the consistency issue between
write and read database
To the best of my knowledge, Event sourcing has NOTHING to do with consistency between read/write to your db. Consistency between read/write has actually more to do with the type of db you are using such as relational which are mostly ACID versus the non-relational db which are often eventual consistency.
ES is not meant for that, instead ES : "Capture all changes to an application state as a sequence of events" Martin Fowler.
ES works like time machine, which allows you to change the state of your application to a specific date time in the past.
I am new to the topic. Having read a handful of articles on it, and asked a couple of persons, I still do not understand what you people do in regard to one problem.
There are UI clients making requests to several backend instances (for now it's irrelevant whether sessions are sticky or not), and those instances are connected to some highly available DB cluster (may it be Cassandra or something else of even Elasticsearch). Say the backend instance is not specifically tied to one or cluster's machines, and instead its every request to DB may be served by a different machine.
One client creates some record, it's synchronously of asynchronously stored to one of cluster's machines then eventually gets replicated to the rest of DB machines. Then another client requests the list or records, the request ends up served by a distant machine not yet received the replicated changes, and so the client does not see the record. Well, that's bad but not yet ugly.
Consider however that the second client hits the machine which has the record, displays it in a list, then refreshes the list and this time hits the distant machine and again does not see the record. That's very weird behavior to observe, isn't it? It might even get worse: the client successfully requests the record, starts some editing on it, then tries to store the updates to DB and this time hits the distant machine which says "I know nothing about this record you are trying to update". That's an error which the user will see while doing something completely legitimate.
So what's the common practice to guard against this?
So far, I only see three solutions.
1) Not actually a solution but rather a policy: ignore the problem and instead speed up the cluster hard enough to guarantee that 99.999% of changes will be replicated on the whole cluster in, say, 0.5 secord (it's hard to imagine some user will try to make several consecutive requests to one record in that time; he can of course issue several reading requests, but in that case he'll probably not notice inconsistency between results). And even if sometimes something goes wrong and the user faces the problem, well, we just embrace that. If the loser gets unhappy and writes a complaint to us (which will happen maybe once a week or once an hour), we just apologize and go on.
2) Introduce an affinity between user's session and a specific DB machine. This helps, but needs explicit support from the DB, and also hurts load-balancing, and invites complications when the DB machine goes down and the session needs to be re-bound to another machine (however with proper support from DB I think that's possible; say Elasticsearch can accept routing key, and I believe if the target shard goes down it will just switch the affinity link to another shard - though I am not entirely sure; but even if re-binding happens, the other machine may contain older data :) ).
3) Rely on monotonic consistency, i.e. some method to be sure that the next request from a client will get results no older than the previous one. But, as I understand it, this approach also requires explicit support from DB, like being able so pass some "global version timestamp" to a cluster's balancer, which it will compare with it's latest data on all machines' timestamps to determine which machines can serve the request.
Are there other good options? Or are those three considered good enough to use?
P.S. My specific problem right now is with Elasticsearch; AFAIK there is no support for monotonic reads there, though looks like option #2 may be available.
Apache Ignite has primary partition for a key and backup partitions. Unless you have readFromBackup option set, you will always be reading from primary partition whose contents is expected to be reliable.
If a node goes away, a transaction (or operation) should be either propagated by remaining nodes or rolled back.
Note that Apache Ignite doesn't do Eventual Consistency but instead Strong Consistency. It means that you can observe delays during node loss, but will not observe inconsistent data.
In Cassandra if using at least quorum consistency for both reads and writes you will get monotonic reads. This was not the case pre 1.0 but thats a long time ago. There are some gotchas if using server timestamps but thats not by default so likely wont be an issue if using C* 2.1+.
What can get funny is since C* uses timestamps is things that occur at "same time". Since Cassandra is Last Write Wins the times and clock drift do matter. But concurrent updates to records will always have race conditions so if you require strong read before write guarantees you can use light weight transactions (essentially CAS operations using paxos) to ensure no one else updates between your read to update, these are slow though so I would avoid it unless critical.
In a true distributed system, it does not matter where your record is stored in remote cluster as long as your clients are connected to that remote cluster. In Hazelcast, a record is always stored in a partition and one partition is owned by one of the servers in the cluster. There could be X number of partitions in the cluster (by default 271) and all those partitions are equally distributed across the cluster. So a 3 members cluster will have a partition distribution like 91-90-90.
Now when a client sends a record to store in Hazelcast cluster, it already knows which partition does the record belong to by using consistent hashing algorithm. And with that, it also knows which server is the owner of that partition. Hence, the client sends its operation directly to that server. This approach applies on all client operations - put or get. So in your case, you may have several UI clients connected to the cluster but your record for a particular user is stored on one server in the cluster and all your UI clients will be approaching that server for their operations related to that record.
As for consistency, Hazelcast by default is strongly consistent distributed cache, which implies that all your updates to a particular record happen synchronously, in the same thread and the application waits until it has received acknowledgement from the owner server (and the backup server if backups are enabled) in the cluster.
When you connect a DB layer (this could be one or many different types of DBs running in parallel) to the cluster then Hazelcast cluster returns data even if its not currently present in the cluster by reading it from DB. So you never get a null value. On updating, you configure the cluster to send the updates downstream synchronously or asynchronously.
Ah-ha, after some even more thorough study of ES discussions I found this: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html
Note how they specifically highlight the "custom value" case, recommending to use it exactly to solve my problem.
So, given that's their official recommendation, we can summarise it like this.
To fight volatile reads, we are supposed to use "preference",
with "custom" or some other approach.
To also get "read your
writes" consistency, we can have all clients use
"preference=_primary", because primary shard is first to get all
writes. This however will probably have worse performance than
"custom" mode due to no distribution. And that's quite similar to what other people here said about Ignite and Hazelcast.
Right?
Of course that's a solution specifically for ES. Reverting to my initial question which is a bit more generic, turns out that options #2 and #3 are really considered good enough for many distributed systems, with #3 being possible to achieve with #2 (even without immediate support for #3 by DB).
I have a WCF service that uses ODP.NET to read data from an Oracle database. The service also writes to the database, but indirectly, as all updates and inserts are achieved through an older layer of business logic that I access via COM+, which I wrap in a TransactionScope. The older layer connects to Oracle via ODBC, not ODP.NET.
The problem I have is that because Oracle uses a two-phase-commit, and because the older business layer is using ODBC and not ODP.NET, the transaction sometimes returns on the TransactionScope.Commit() before the data is actually available for reads from the service layer.
I see a similar post about a Java user having trouble like this as well on Stack Overflow.
A representative from Oracle posted that there isn't much I can do about this problem:
This maybe due to the way OLETx
ITransaction::Commit() method behaves.
After phase 1 of the 2PC (i.e. the
prepare phase) if all is successful,
commit can return even if the resource
managers haven't actually committed.
After all the successful "prepare" is
a guarantee that the resource managers
cannot arbitrarily abort after this
point. Thus even though a resource
manager couldn't commit because it
didn't receive a "commit" notification
from the MSDTC (due to say a
communication failure), the
component's commit request returns
successfully. If you select rows from
the table(s) immediately you may
sometimes see the actual commit occur
in the database after you have already
executed your select. Your select will
not therefore see the new rows due to
consistent read semantics. There is
nothing we can do about this in Oracle
as the "commit success after
successful phase 1" optimization is
part of the MSDTC's implementation.
So, my question is this:
How should I go about dealing with the possible delay ("asyc" via the title) problem of figuring out when the second part of the 2PC actually occurs, so I can be sure that data I inserted (indirectly) is actually available to be selected after the Commit() call returns?
How do big systems deal with the fact that the data might not be ready for reading immediately?
I assume that the whole transaction has prepared and a commit outcome decided by the TransactionManager, therefore eventually (barring heuristic damage) the Resource Managers will receive their commit message and complete. However, there are no guarantees as to how long that might take - could be days, no timeouts apply, having voted "commit" in the Prepare the Resource Manager must wait to hear the collective outcome.
Under these conditions, the simplest approach is to take "an understood, we're thinking" approach. Your request has been understood, but you actually don't know the outcome, and that's what you tell the user. Yes, in all sane circumstances the request will complete, but under some conditions operators could actually choose to intervene in the transaction manually (and maybe cause heuristic damage in doing so.)
To go one step further, you could start a new transaction and perform some queries to see if the data is there. Now, if you are populating a result screen you will naturally be doing such as query. The question would be what to do if the expected results are not there. So again, tell the user "your recent request is being processed, hit refresh to see if it's complete". Or retry automatically (I don't much like auto retry - prefer to educate the user that it's effectively an asynch operation.)