I'm trying to understand Heroku's release phase. As far as I can tell, Heroku will serve requests from the existing dynos during the release phase. Is that correct?
Which means that if I remove a column from the database during the release phase, the existing dynos will still be serving requests against a database that no longer has that column. It is only when the release phase finally finishes (which could take a while if I'm running other things besides migrations in the release tasks) that the new dynos with the new code will be swapped in to serve requests.
During the time it takes the release phase to finish and the new code to be swapped into place, there will be exceptions raised (I use Python) by code that tries to access the removed column.
Does that make sense, and is my understanding correct?
So to avoid exceptions being raised, I guess it is best to deploy twice: once to remove the code that accesses the database column to be removed, and a second time to remove the column itself. Is this a strategy that people use?
About the release phase: the behaviour you describe is part of preboot, which you have to activate for your application. Otherwise Heroku will stop the old dynos before booting the new ones.
About dropping the column:
Yes, since the routers point to the old dynos while the release phase is running, errors will happen.
The strategy here is exactly what you proposed: do a first deploy that removes the code, then a second one that removes the column.
Keep in mind that some database operations also lock the whole table for some time, during which reads and writes to that table have to wait (DROP COLUMN, I believe, also takes a full table lock). Adding a column with a default on a bigger table, for example, can also take quite a long time.
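If the app happens to be on Django, the second deploy can be as small as a single migration that drops the column, while the first deploy only removes every read/write of the field. A hedged sketch with made-up app, model, and field names:

# migrations/0008_drop_legacy_flag.py -- shipped in the *second* deploy,
# after no code references the field any more (all names are hypothetical).
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ("orders", "0007_stop_reading_legacy_flag"),
    ]

    operations = [
        migrations.RemoveField(model_name="order", name="legacy_flag"),
    ]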
A system is being implemented using microservices. In order to decrease interactions between microservices implemented "at the same level" in an architecture, some microservices will locally cache copies of tables managed by other services. The assumption is that the locally cached table (a) is frequently accessed in a "read mode" by the microservice, and (b) has relatively static content (i.e., more of a "lookup table" than transactional content).
The local caches will be kept in sync using inter-service messaging. As the content should be fairly static, this should not be a significant issue/workload. However, on startup of a microservice, there is a possibility that its local cache has gone stale.
I'd like to implement some sort of rolling revision number on the source table, so that microservices with local caches can check this revision number to potentially avoid a re-synch event.
Is there a "best practice" to this approach? Or, a "better alternative", given that each microservice is backed by it's own database (i.e., no shared database)?
In my opinion you shouldn't be loading the data at start up. It might be bit complicated to maintain version.
Cache-Aside Pattern
Generally in a microservices architecture you consider the "cache-aside pattern". You don't build the cache up front but on demand. When you get a request you check the cache; if the value isn't there, you update the cache with the latest value and return the response; from then on it's always returned from the cache. The benefit is that you don't need to load everything up front. Say you have 200 records while services are only using 50 of them frequently; you would be maintaining extra cache entries that may not be required.
Let the requests build the cache; it's a one-time DB hit. You can set an expiry on the cache and an incoming request will build it again.
If you have data which is totally static (it never ever changes) then this pattern may not be worth a discussion, but if you have a lookup table that can change even once a week or month, then you should be using this pattern with a longer cache expiration time. Maintaining the version could be costly. But it's really up to you how you want to implement it.
https://learn.microsoft.com/en-us/azure/architecture/patterns/cache-aside
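A minimal sketch of the cache-aside flow described above, assuming a hypothetical fetch_from_db() helper and an in-process dict as the cache:

import time

CACHE_TTL_SECONDS = 3600        # expire entries so weekly/monthly changes show up
_cache = {}                     # key -> (value, cached_at)


def fetch_from_db(key):
    # Placeholder for the real lookup-table read (hypothetical).
    raise NotImplementedError


def get_lookup_value(key):
    entry = _cache.get(key)
    if entry is not None:
        value, cached_at = entry
        if time.time() - cached_at < CACHE_TTL_SECONDS:
            return value                     # cache hit
    value = fetch_from_db(key)               # miss: one-time DB hit
    _cache[key] = (value, time.time())       # build the cache on demand
    return value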
We ran into this same issue and have temporarily solved it by using a LastUpdated timestamp comparison (same concept as your VersionNumber). Every night (when our application tends to be slow) each service publishes a ServiceXLastUpdated message that includes the most recent timestamp when the data it owns was added/edited. Any other service that subscribes to this data processes the message and, if there's a mismatch, it requests all rows "touched" since its last local update so that it can get back in sync.
For us, for now, this is okay as new services don't tend to come online and be in use the same day. But our plan going forward is that any time a service starts up, it can publish a message for each subscribed service indicating its most recent cache update timestamp. If a "source" service sees the timestamp is not current, it can send updates to re-sync the data. This has the advantage of only sending the needed updates to the specific service(s) that need it, even though (at least for us) all subscribed services have access to the messages.
We started with using persistent queues so that if all instances of a microservice were down, the messages would just build up in its queue. There are 2 issues with this that led us to build something better:
1) It obviously doesn't solve the "first startup" scenario as there is no queue for messages to build up in
2) If ANYTHING goes wrong either in storing queued messages or processing them, you end up out of sync. If that happens, you still need a proactive mechanism like we have now to bring things back in sync. So, it seemed worth going this route
I wouldn't say our method is a "best practice" and if there is one I'm not aware of it. But, the way we're doing it (including planned future work) has so far proven simple to build, easy to understand and monitor, and robust in that it's extremely rare we get an event caused by out-of-sync local data.
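For illustration, a hedged sketch of the timestamp comparison this answer describes; the message shape and the watermark handling are made up, and the actual transport (queues, pub/sub) is left out:

from datetime import datetime, timezone


def build_last_updated_message(most_recent_edit: datetime) -> dict:
    # Owning service: announce the newest timestamp of the data it owns.
    return {"type": "ServiceXLastUpdated",
            "last_updated": most_recent_edit.isoformat()}


def needs_resync(message: dict, local_watermark: datetime) -> bool:
    # Subscriber: re-sync only if the owner is ahead of our cache watermark.
    return datetime.fromisoformat(message["last_updated"]) > local_watermark


# Subscriber side, on receiving the nightly (or startup) message:
watermark = datetime(2024, 1, 1, tzinfo=timezone.utc)
message = build_last_updated_message(datetime.now(timezone.utc))
if needs_resync(message, watermark):
    # request all rows touched since `watermark`, apply them, then advance it
    pass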
I am new to the topic. Having read a handful of articles on it and asked a couple of people, I still do not understand what people do about one particular problem.
There are UI clients making requests to several backend instances (for now it's irrelevant whether sessions are sticky or not), and those instances are connected to some highly available DB cluster (be it Cassandra or something else, or even Elasticsearch). Say the backend instance is not specifically tied to one of the cluster's machines; instead, every request it makes to the DB may be served by a different machine.
One client creates some record; it's synchronously or asynchronously stored on one of the cluster's machines, then eventually gets replicated to the rest of the DB machines. Then another client requests the list of records; the request ends up served by a distant machine that has not yet received the replicated changes, and so the client does not see the record. Well, that's bad, but not yet ugly.
Consider however that the second client hits the machine which has the record, displays it in a list, then refreshes the list and this time hits the distant machine and again does not see the record. That's very weird behavior to observe, isn't it? It might even get worse: the client successfully requests the record, starts some editing on it, then tries to store the updates to DB and this time hits the distant machine which says "I know nothing about this record you are trying to update". That's an error which the user will see while doing something completely legitimate.
So what's the common practice to guard against this?
So far, I only see three solutions.
1) Not actually a solution but rather a policy: ignore the problem and instead speed up the cluster enough to guarantee that 99.999% of changes will be replicated across the whole cluster in, say, 0.5 seconds (it's hard to imagine some user will try to make several consecutive requests to one record in that time; he can of course issue several reading requests, but in that case he'll probably not notice the inconsistency between results). And even if sometimes something goes wrong and the user faces the problem, well, we just embrace that. If the loser gets unhappy and writes a complaint to us (which will happen maybe once a week or once an hour), we just apologize and go on.
2) Introduce affinity between a user's session and a specific DB machine. This helps, but needs explicit support from the DB, hurts load-balancing, and invites complications when the DB machine goes down and the session needs to be re-bound to another machine (however, with proper support from the DB I think that's possible; say, Elasticsearch can accept a routing key, and I believe that if the target shard goes down it will just switch the affinity link to another shard - though I am not entirely sure; but even if re-binding happens, the other machine may contain older data :) ).
3) Rely on monotonic consistency, i.e. some method to be sure that the next request from a client will get results no older than the previous one. But, as I understand it, this approach also requires explicit support from the DB, like being able to pass some "global version timestamp" to a cluster's balancer, which it will compare with its latest data on all machines' timestamps to determine which machines can serve the request. (A rough sketch of this idea follows below.)
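To make option #3 a bit more concrete, here is a rough, hedged sketch of a client-side monotonic-read guard; the replica interface (applied_version(), query()) is entirely made up and only illustrates the idea of refusing answers from machines that lag behind what we have already seen:

class MonotonicReader:
    """Illustrative only: reject answers from replicas that lag behind us."""

    def __init__(self, replicas):
        # Each hypothetical replica exposes applied_version() and query(),
        # where query() returns (result, version_at_read_time).
        self.replicas = replicas
        self.last_seen_version = 0

    def read(self, query):
        for replica in self.replicas:
            if replica.applied_version() >= self.last_seen_version:
                result, version = replica.query(query)
                self.last_seen_version = max(self.last_seen_version, version)
                return result
        raise RuntimeError("no replica has caught up yet; retry or fall back")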
Are there other good options? Or are those three considered good enough to use?
P.S. My specific problem right now is with Elasticsearch; AFAIK there is no support for monotonic reads there, though it looks like option #2 may be available.
Apache Ignite has a primary partition for a key and backup partitions. Unless you have the readFromBackup option set, you will always be reading from the primary partition, whose contents are expected to be reliable.
If a node goes away, a transaction (or operation) should be either propagated by remaining nodes or rolled back.
Note that Apache Ignite doesn't do Eventual Consistency but instead Strong Consistency. It means that you can observe delays during node loss, but will not observe inconsistent data.
In Cassandra, if you use at least quorum consistency for both reads and writes, you will get monotonic reads. This was not the case pre-1.0, but that's a long time ago. There are some gotchas if you use server-side timestamps, but that's not the default, so it likely won't be an issue if you're on C* 2.1+.
What can get funny, since C* uses timestamps, is things that occur at the "same time". Since Cassandra is last-write-wins, timestamps and clock drift do matter. But concurrent updates to records will always have race conditions, so if you require strong read-before-write guarantees you can use lightweight transactions (essentially CAS operations using Paxos) to ensure no one else updates between your read and your update. These are slow, though, so I would avoid them unless critical.
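For illustration, a hedged sketch with the DataStax Python driver; contact points, keyspace, table, and column names are all made up:

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1"])
session = cluster.connect("app")

# QUORUM on both the write and the read is what gives the monotonic reads
# described above.
write = SimpleStatement(
    "UPDATE records SET status = %s WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)
session.execute(write, ("active", "record-42"))

read = SimpleStatement(
    "SELECT status FROM records WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)
row = session.execute(read, ("record-42",)).one()

# For the read-before-write case, a lightweight transaction would add an
# IF clause to the UPDATE (slower, so only where it's critical).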
In a true distributed system, it does not matter where your record is stored in the remote cluster as long as your clients are connected to that remote cluster. In Hazelcast, a record is always stored in a partition, and one partition is owned by one of the servers in the cluster. There could be X number of partitions in the cluster (by default 271), and all those partitions are equally distributed across the cluster. So a 3-member cluster will have a partition distribution like 91-90-90.
Now, when a client sends a record to store in the Hazelcast cluster, it already knows which partition the record belongs to by using a consistent hashing algorithm. With that, it also knows which server is the owner of that partition, so the client sends its operation directly to that server. This applies to all client operations - put or get. So in your case, you may have several UI clients connected to the cluster, but the record for a particular user is stored on one server in the cluster, and all your UI clients will go to that server for operations related to that record.
As for consistency, Hazelcast is by default a strongly consistent distributed cache, which implies that all your updates to a particular record happen synchronously, in the same thread, and the application waits until it has received acknowledgement from the owner server (and the backup server, if backups are enabled) in the cluster.
When you connect a DB layer (this could be one or many different types of DBs running in parallel) to the cluster, the Hazelcast cluster returns data even if it's not currently present in the cluster, by reading it from the DB. So you never get a null value. On updates, you configure the cluster to send the updates downstream synchronously or asynchronously.
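For illustration, with the Hazelcast Python client the put/get path described above looks roughly like this (a hedged sketch; the cluster address, map name, key, and value are made up):

import hazelcast

# The client hashes the key to a partition and talks directly to the member
# that owns it, so every UI client ends up at the same owner for this record.
client = hazelcast.HazelcastClient(cluster_members=["10.0.0.1:5701"])
records = client.get_map("user-records").blocking()

records.put("user-42", "active")
print(records.get("user-42"))

client.shutdown()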
Ah-ha, after some even more thorough study of ES discussions I found this: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html
Note how they specifically highlight the "custom value" case, recommending it as a way to solve exactly my problem.
So, given that's their official recommendation, we can summarise it like this. To fight volatile reads, we are supposed to use "preference", with "custom" or some other approach. To also get "read your writes" consistency, we can have all clients use "preference=_primary", because the primary shard is the first to get all writes. This however will probably have worse performance than "custom" mode due to no distribution. And that's quite similar to what other people here said about Ignite and Hazelcast.
Right?
Of course that's a solution specifically for ES. Coming back to my initial question, which is a bit more generic: it turns out that options #2 and #3 are really considered good enough for many distributed systems, with #3 being achievable via #2 (even without direct support for #3 in the DB).
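For completeness, a hedged sketch of the preference knob with elasticsearch-py; the index name, query, and the session-derived preference value are made up:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Using the same preference string for every search in a user's session keeps
# hitting the same shard copies, so results don't flip between refreshes.
resp = es.search(
    index="records",
    query={"match": {"owner": "user-42"}},
    preference="session-9f3a",
)
print(resp["hits"]["total"])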
I want to load a large list (5000 items, say) from a Firebase Database as fast as possible and then get updates whenever a child is added or changed.
My approach so far is to use ObserveSingleEvent to load the entire list at startup:
pathRef.ObserveSingleEvent(DataEventType.Value, (snapshot) =>
and when that has finished use ObserveEvent for child changes and additions:
nuint observerHandle1 = pathRef.ObserveEvent(DataEventType.ChildChanged, (snapshot) =>
nuint observerHandle2 = pathRef.ObserveEvent(DataEventType.ChildAdded, (snapshot) =>
The problem is that the ChildAdded event is triggered 5000 times, all of which are usually unnecessary, and it takes a long time to complete.
(From the docs: "child_added is triggered once for each existing child and then again every time a new child is added to the specified path.")
And I can't just ignore the first update for each child, because it might be a genuine change (e.g. the database was updated just after the ObserveSingleEvent completed).
So I have to check each item on the client and see if it has changed. This is all very slow and inefficient.
I've seen posts suggesting to use a query with LimitToLast etc, but this is not reliable as child changes may be missed.
My Plan B is to not do ObserveSingleEvent to load the list and just rely on the 5000 child added calls, but that seems crazy and is slower to get the list initially, and probably uses more of the user's data allowance.
I would think this is a common scenario, so what is the best solution?
Also, I have persistence enabled, so a couple of related questions:
1) If the entire list has been loaded before, does ObserveSingleEvent with Value just load from disk with no network use when online (assuming no changes have happened since the last download), or does it download the whole list again?
2) Similarly with the 5000 calls to child added - if it's all on disk, is there any network use when online, if nothing has changed?
(I'm using C# for iOS, but I'm sure it's the same issue for other platforms.)
The wire traffic for a value listener and for a child_ listener is exactly the same. The events are purely a client-side interpretation of the underlying data.
If you need all children, consider using only the child_* listeners and dropping the pathRef.ObserveSingleEvent(DataEventType.Value, ...) call. It'll lead to the simplest code.
I have a Laravel (Lumen 5.2) project that runs against a MariaDB Galera Cluster. When running the app it seems to work just fine. But when I run the PHPUnit tests, they randomly fail.
The problem is that I populate the database and then try to get the data (ids) to populate other tables via a foreign key. But when I try to get the data immediately afterwards, the data is null.
The Laravel database connection uses a READ user and a WRITE user (Laravel automatically picks the correct one when inserting or reading), and I think this is somehow the problem. When I only use the WRITE user, the tests work just fine.
SET SESSION wsrep_sync_wait = 1;
before the SELECT will guarantee that the writes have caught up.
Although Galera is "synchronous", it is not quite. The writes are guaranteed to be sent to all the other nodes and to apply there. However, in the case of a "critical" read, a SELECT can get to the recipient node too fast to see the write. The above setting solves that problem.
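Wiring the SET into application code is one extra statement per connection before the critical read; a minimal sketch in Python with PyMySQL (connection details, table, and query are made up):

import pymysql

conn = pymysql.connect(host="galera-node-2", user="read_user",
                       password="secret", database="app")
with conn.cursor() as cur:
    # Make this session wait until replication has caught up before reading.
    cur.execute("SET SESSION wsrep_sync_wait = 1")
    cur.execute("SELECT id FROM users WHERE email = %s", ("new@example.com",))
    row = cur.fetchone()   # now sees writes committed on the other nodes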
Now, let's get real about how you should implement, say, a web site with Galera under the covers.
When practical, such as in a transaction, do all the commands against the same node. There is no penalty for this. But, check for errors after the COMMIT.
When not practical -- such as starting a new web page from a new HTTP connection -- use that SET.
While there is, potentially, a slight delay in the SELECT to wait for replication, the delay is usually very close to zero. I would suggest that your tests are effectively stress tests that will deceptively say the wait is high. That is, a benchmark is usually designed to find the "worst", not the "typical".
How much delay is there between the write and the failing SELECT? Perhaps only 1ms. How fast can a 'user' post a 'blog', then get to the next page and find it "missing"? Perhaps over 100ms.
Your stress test has discovered the need for the SET, not that Galera is 'broken'.
More Galera Caveats.
We have an issue, more often than I would like, where worker or client sessions crash while they were in the process of using a number sequence to create a new record. They end up blocking that number sequence, and anyone else trying to create a record using the same sequence will have their client freeze.
When this happens, I usually go into the NUMBERSEQUENCELIST table, spot the correct DataAreaId and the user, and delete the row whose Status = 1.
But this is kind of annoying, really. Is there anything, any way I can configure the AOS server to release the number sequence when clients/workers crash?
For the worker sessions, I guess we can fine-tune the code which runs in them, but for the client sessions crashing, there's not much we can do...
Any ideas ?
Thanks!
EDIT: It turns out that in this situation, after restarting the AOS server, you can go into List in the number sequence menu and clean it up. Prior to the restart, my client would freeze trying to do that. So there's no need to do it directly through SQL.
Continuous numbers in NumberSequenceList are automatically cleaned up every 24 hours (or as set up on the number sequence). The cleanup process is quite slow if there are many "dead" numbers (hundreds or thousands). This may be considered as a hang, but is not.
Things to consider:
Is a continuous number sequence needed?
Do the cleanup more frequently (say every half hour instead of the default 24 hours)
Set up the cleanup process as a batch process
Fix the bug in the client code using the number sequence
Also avoid reserving the number, just use it. Instead of the anti-pattern:
NumberSeq idSequence = NumberSeq::newGetNum(IntrastatParameters::numRefIntrastatArchiveID(), true);
this.IntrastatArchiveID = idSequence.num();
idSequence.used();
Just use the number:
this.IntrastatArchiveID = NumberSeq::newGetNum(IntrastatParameters::numRefIntrastatArchiveID()).num();
The makeDecisionLater parameter should only be used in forms, where the user may decide not to use the number (by deleting or by pressing Escape). And in that case the NumberSeqFormHandler class should be used anyway.