Does the Service Fabric actor model allow throttling? If not, how can I build it?

Here is my scenario using a Service Fabric cluster:
One actor instance per customer.
The actor instance ID is the customer ID, so this actor is my customer actor.
The customer actor implements a workflow that has n steps.
Let's say that at step m (of n steps), this actor needs to talk to an external system (E).
System E does not allow more than x clients at any given time.
I have 100,000 customer actor instances at any given time.
Because of external system E, I need to throttle my customer actors to x at step m.
After step m (with throttling set at x), I want to return to the full potential of the actor model.
Here are my questions:
Does the Service Fabric actor model provide any throttling mechanism? With that throttling mechanism, how would my scenario change?
If there is no out-of-the-box solution for throttling in the actor model, how would I go about creating one?

Usually you'd use some kind of Queue to deal with load bursts. You could (for instance) add a Service between the Actor and system E that enqueues messages from the Actors, and then processes them in the background, using throttling there.
Actors and their callers won't need to know about any throttling. And it decouples the Actors from system E.
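As a rough illustration of the queue idea, here is a minimal Python sketch (not Service Fabric code). It assumes an in-process `asyncio.Queue` stands in for the intermediate service, and `call_system_e`, `MAX_CLIENTS` and the `stats` dictionary are hypothetical names used only for this example. The number of workers is what caps concurrency at x.

```python
import asyncio

MAX_CLIENTS = 3  # stands in for "x", the limit imposed by system E (assumed value)

async def call_system_e(customer_id, stats):
    # Hypothetical placeholder for the real call to system E.
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # simulate the external call taking some time
    stats["active"] -= 1
    return f"done:{customer_id}"

async def worker(queue, results, stats):
    # Each worker handles one request at a time, so the number of workers
    # is the maximum number of concurrent calls to system E.
    while True:
        customer_id = await queue.get()
        try:
            results[customer_id] = await call_system_e(customer_id, stats)
        finally:
            queue.task_done()

async def run(customer_ids):
    queue = asyncio.Queue()
    results, stats = {}, {"active": 0, "peak": 0}
    workers = [asyncio.create_task(worker(queue, results, stats))
               for _ in range(MAX_CLIENTS)]
    for customer_id in customer_ids:
        queue.put_nowait(customer_id)  # actors only enqueue; they never call E directly
    await queue.join()                 # wait until every request has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results, stats["peak"]

results, peak = asyncio.run(run(range(20)))
```

In a real cluster the queue would need to be durable and shared across nodes; this sketch only shows where the limit is enforced.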

No, there's nothing like that built in. If your system E can only handle x clients, and you have y > x actors, then your actors can simply try to talk to E at step m, and if E is at capacity, back off and try again later. What other sort of throttling are you hoping for?
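The back-off-and-retry approach might look roughly like the following Python sketch. `AtCapacity`, `call_with_backoff` and `flaky_step_m` are hypothetical names for illustration; the sketch assumes system E signals "at capacity" in a way the client can turn into an exception, which the original post does not specify.

```python
import random
import time

class AtCapacity(Exception):
    """Raised by the (hypothetical) client when system E refuses a new client."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    # Try the operation; on AtCapacity wait longer each time before retrying.
    for attempt in range(max_attempts):
        try:
            return operation()
        except AtCapacity:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Exponential backoff with jitter so retries do not synchronise.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            sleep(delay)

# Demo: step m fails twice because E is full, then succeeds.
attempts = []

def flaky_step_m():
    attempts.append(1)
    if len(attempts) < 3:
        raise AtCapacity()
    return "ok"

result = call_with_backoff(flaky_step_m, sleep=lambda _: None)
```

The jitter matters with 100,000 actors: without it, many actors that were rejected together would retry together.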


Which data structure is suitable: queue or stack?

I have been given the task of developing an application that manages appointments for patients coming in for treatment. There are two types of patients: normal patients and emergency patients. The system takes each patient's information and decides the patient's turn based on the patient type.
A normal patient gets an appointment in order of arrival, while an emergency patient gets an appointment earlier than normal patients. All normal and emergency patients are added to the same system, but appointments are assigned differently. Every emergency patient gets an appointment immediately.
Which data structure, stack or queue, would be the most efficient choice for developing this application?
Note: You are not allowed to use any other data structures, such as a priority queue or a double-ended queue.
In my view, a queue is the better option for calling normal patients because it follows FIFO. But how do I deal with emergency patients without using a priority queue?
Are you prohibited from using two queues? If not, use two: one for emergency patients and the other for regular patients.
A stack won't be of much help since that's a LIFO ("last in, first out") container. You should use a queue.
For the simple case, you could use two separate regular FIFO ("first in, first out") queues. When it's time to pick a patient:
If the emergency queue is not empty, pick one from that queue.
Otherwise, pick from the other queue.
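A minimal Python sketch of the two-queue rule above, using the standard library's FIFO `queue.Queue`. The class name `AppointmentDesk` and the patient names are made up for the example.

```python
from queue import Queue

class AppointmentDesk:
    def __init__(self):
        self.emergency = Queue()  # plain FIFO queue for emergency patients
        self.normal = Queue()     # plain FIFO queue for normal patients

    def register(self, name, is_emergency=False):
        target = self.emergency if is_emergency else self.normal
        target.put(name)

    def next_patient(self):
        # Emergency patients always go first; within a queue, order is by arrival.
        if not self.emergency.empty():
            return self.emergency.get()
        if not self.normal.empty():
            return self.normal.get()
        return None  # nobody is waiting

desk = AppointmentDesk()
desk.register("Ann")
desk.register("Bob")
desk.register("Cara", is_emergency=True)
order = [desk.next_patient() for _ in range(3)]
```

Here `order` ends up as Cara, Ann, Bob: the emergency patient jumps ahead, and the normal patients keep their arrival order.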
A more generic solution is to use a priority queue.
Upon initial inspection of a new patient, you assign a priority to the patient, which can be a number from 0 to 100, for example. The patient is then placed in the priority queue, which automatically puts the patient before all other patients with a lower priority.
When it's time to pick a patient you will always pick the first in line since the patients are lined up in the queue based on the priority of the matter.
A second sorting parameter for the priority queue could be the time of arrival, so that two patients that have been assigned the same priority are ordered according to when they came to seek help.
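The priority queue with arrival time as a tie-breaker could be sketched in Python with `heapq`. `TriageQueue` is a hypothetical name, and the sketch assumes a higher number means a more urgent patient, which the answer does not state explicitly.

```python
import heapq
import itertools

class TriageQueue:
    def __init__(self):
        self._heap = []
        self._arrivals = itertools.count()  # monotonically increasing arrival number

    def add(self, name, priority):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        # The arrival number breaks ties: equal priorities come out in arrival order.
        heapq.heappush(self._heap, (-priority, next(self._arrivals), name))

    def next_patient(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

triage = TriageQueue()
triage.add("Ann", 10)
triage.add("Bob", 90)
triage.add("Cara", 90)
triage.add("Dan", 50)
order = [triage.next_patient() for _ in range(4)]
```

Bob and Cara share priority 90, so Bob (who arrived first) is seen before Cara, followed by Dan and then Ann.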

Solana - leader validator and incrementing field

As I understand it, Solana will elect a leader each round and there will be multiple validators handling the transactions independently. The leader will then consolidate all the transactions.
From this understanding, I'm curious how Solana actually handles programs that increment a field. Let's say we have a counter field that increases by 1 each time the program is called. What happens if 10 different users call this program at the same time? How does this work if the 10 transactions are handled by ten validators independently? For example, at the start of the round counter=50, and during the round ten different validators handle the transactions separately, so each validator increases the counter to 51. When the leader gets back all the transactions, it will say counter=51. What happens in this scenario?
I feel like there is something missing in my assumptions.
So my understanding here was incorrect. It is actually the leader who executes the transactions, and the validators who verify them.
Source
Page 2 - Section 3 - https://solana.com/solana-whitepaper.pdf
As shown in Figure 1, at any given time a system node is designated as
Leader to generate a Proof of History sequence, providing the network global
read consistency and a verifiable passage of time. The Leader sequences user
messages and orders them such that they can be efficiently processed by other
nodes in the system, maximizing throughput. It executes the transactions
on the current state that is stored in RAM and publishes the transactions
and a signature of the final state to the replications nodes called Verifiers.
Verifiers execute the same transactions on their copies of the state, and publish their computed signatures of the state as confirmations. The published
confirmations serve as votes for the consensus algorithm.
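To make the counter example concrete, here is a toy Python model of the flow the whitepaper excerpt describes. It is not Solana's runtime: `execute` and `state_signature` are hypothetical helpers, and a SHA-256 hash of the counter merely stands in for the "signature of the final state".

```python
import hashlib

def execute(counter, transactions):
    # Apply the transactions one after another, in the order they were sequenced.
    for _ in transactions:
        counter += 1
    return counter

def state_signature(counter):
    # Stand-in for the signature of the final state: a hash of the counter value.
    return hashlib.sha256(str(counter).encode()).hexdigest()

# The leader puts the ten user transactions into a single order...
ordered = [f"user-{i}: increment" for i in range(10)]

# ...executes them on its own state, and publishes the result's signature.
leader_counter = execute(50, ordered)
leader_sig = state_signature(leader_counter)

# Each verifier replays the SAME ordered list on its own copy of the state.
verifier_sigs = [state_signature(execute(50, ordered)) for _ in range(3)]
confirmed = all(sig == leader_sig for sig in verifier_sigs)
```

Because every node applies the same ten increments in the same order, the counter ends at 60 everywhere rather than 51, and the verifiers' signatures match the leader's.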
The "recent blockhash" is another important part of this. A transaction references a recent blockhash, which is part of the Proof of History sequence. If two transactions reference the same blockhash, they are counted as duplicates by the network, even if they come from two different users.
More information can be found at https://docs.solana.com/developing/programming-model/transactions#recent-blockhash
There is only one PoH generator (block producer) at a time. Other nodes are just validating.
I cannot comment on Jon C's answer, but it is wrong. You can use the same recent blockhash; otherwise there is no way Solana could handle 50,000 TPS when the block time is around 0.4 seconds.

Is the REPLICATE DATA pattern a good option to minimize synchronous microservice communication?

In a world of microservices, one microservice often needs to invoke another, either synchronously or asynchronously.
As I understand it, synchronous communication affects the availability of services, since both services need to be available during a call.
To minimize synchronous communication, one possible solution is DATA REPLICATION at the client service. The client service keeps its copy of the data up to date by listening to events published by the other services.
In my view, this is not a good choice: we are duplicating data, it might become stale, and it adds database overhead.
In which scenarios is this pattern the best fit?
Microservices are distributed systems. This means that they are constrained by the CAP theorem, which basically means you have a choice between:
Sacrifice availability to preserve consistency: this would (among other things) lead to one service invoking functionality in another in a synchronous way. If the other service is unavailable, so is all functionality in this service which depends on that service's functionality.
Sacrifice consistency to preserve availability: you build services to be autonomous and not depend on other services being up. This leads in fairly short order to services not sharing databases and to asynchronous replication of data (because if service A has synchronously replicated data from service B, then service B being down doesn't affect A's availability, but A being down affects B's availability): with asynchronous replication, the best you can hope for is eventual consistency.
The choice between those two (if you happen to have the ability to freeze the entire universe if there's a network partition, you might be able to sacrifice partition tolerance for consistency and availability) is ultimately a business question (it's worth noting that there's a continuum of approaches between those extremes). How much are you spending on storage and on designing an (arguably) more complex system vs. how much are you losing by being unavailable?
It should be noted that the universe is inherently eventually consistent: the sun could have gone supernova a few minutes ago and we can't know it for a few minutes more.
As for the concern about duplicated data: chances are the data is already duplicated (backups) and in any database worth using the data is duplicated (the write-ahead log).
As for situations, it's a lot harder to think of a situation where aiming for strong consistency is strictly the most suitable option.
But for an example, consider a chain of coffee shops. We have a cash register service and we have a loyalty/rewards service. Data from the loyalty/rewards service is needed by the cash register (if a customer is redeeming a "50% off a latte" reward you'd want the register to know that it's valid), and every transaction (at least those with a loyalty ID) at the register should be known by the rewards service.
If we want the reward redemptions to be consistent, then it implies that if the loyalty/rewards service is inaccessible from the register, no rewards can be redeemed. There's a nonzero chance that a customer who can't redeem a reward just walks out (and a further nonzero chance that they never get coffee from you again).
Conversely, if we want both services to have a consistent view then we're demanding that if the power's out at any store we can't determine new rewards, or if the loyalty/rewards service is inaccessible from the register, no new sales can be made.
The solution is for both services to maintain the data they need to function, even if another service controls updates to that data. They'll eventually catch up. In the case of reward redemption, assuming the unavailability happens rarely enough, it may even be desirable to have the cash register perform a preliminary validation and if that passes, assume that the reward is valid and submit it later to the loyalty/reward service.
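A small Python sketch of the coffee shop example, under stated assumptions: direct in-process callbacks stand in for an asynchronous event bus, and the class and method names (`LoyaltyService`, `CashRegister`, `on_event`, `flush`) are invented for illustration.

```python
class LoyaltyService:
    """Owns reward data and publishes an event whenever a reward is issued."""
    def __init__(self):
        self.subscribers = []   # event handlers of interested services
        self.redemptions = []   # redemptions reported back by registers

    def issue_reward(self, customer_id, reward):
        event = {"type": "RewardIssued", "customer_id": customer_id, "reward": reward}
        for handler in self.subscribers:
            handler(event)

    def record_redemption(self, customer_id, reward):
        self.redemptions.append((customer_id, reward))


class CashRegister:
    """Keeps a local replica of the rewards it needs, so it can work offline."""
    def __init__(self):
        self.rewards = {}   # customer_id -> set of rewards (replicated data)
        self.pending = []   # redemptions not yet reported to the loyalty service

    def on_event(self, event):
        if event["type"] == "RewardIssued":
            self.rewards.setdefault(event["customer_id"], set()).add(event["reward"])

    def redeem(self, customer_id, reward):
        # Preliminary validation against the local replica only.
        available = self.rewards.get(customer_id, set())
        if reward not in available:
            return False
        available.remove(reward)
        self.pending.append((customer_id, reward))
        return True

    def flush(self, loyalty):
        # Submit queued redemptions once the loyalty service is reachable again.
        while self.pending:
            loyalty.record_redemption(*self.pending.pop(0))


loyalty = LoyaltyService()
register = CashRegister()
loyalty.subscribers.append(register.on_event)

loyalty.issue_reward("cust-1", "50% off a latte")
accepted = register.redeem("cust-1", "50% off a latte")  # works without calling loyalty
rejected = register.redeem("cust-1", "50% off a latte")  # already used locally
register.flush(loyalty)
```

The register never calls the loyalty service while serving the customer; the two views converge when `flush` runs, which is the eventual consistency trade-off described above.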

Advice on pub/sub topic division based on geohashes for the Ably WebSocket connection service

My question concerns the following use case:
Use case actors
User A: The user who sets a broadcast region and views stream with live posts.
User B: The first user who sends a broadcast message from within the broadcast region set by user A.
User C: The second user who sends a broadcast message from within the broadcast region set by user A.
Use case description
User A selects a broadcast region (a radius) within which they want to receive live broadcast messages.
User A opens the livefeed and requests an initial set of livefeed items.
User B broadcasts a message from within the broadcast region of user A while user A’s livefeed is still open.
A label with 1 new livefeed item appears at the top of User A’s livefeed while it is open.
As user C publishes another livefeed post from within the selected broadcast region from user A, the label counter increments.
User A receives a notification similar to this example of Facebook:
The solution I thought of applying (and which I think PubNub uses) is to create a topic per geohash.
In my case, that would mean that every broadcast message is published to its geohash topic, and clients (app / website users) consume a geohash topic through a WebSocket if it falls within the range of their defined area (radius). Ably seems to provide this kind of scalable service using WebSockets.
Simplified, I guess it would be something like this:
So a geohash needs to be extracted from the location from which the broadcast message is sent. This geohash should be fine-grained enough that the receiving user can set a broadcast region that is more or less accurate. (That is, if we want to allow users to define a broadcast region within which to receive live messages, the geohash needs enough precision, which means we should expect a large number of topics at scale.)
Option 2 would be to create topics for a geohash that has a less specific granularity (covering a larger area), and let clients handle the accuracy based on latlng values that are sent along with the message.
The client would then decide whether or not to drop messages. However, this means more messages are sent (more overhead), and a higher cost.
I don't have experience with this kind of architecture, and question the viability / scalability of this approach.
Could you think of an alternate solution to this question to achieve the desired result or provide more insight on how to solve this kind of problem overall? (I also considered using regular req-res flow, but this means spamming the server, which also doesn't seem like a very good solution).
I actually checked.
Take a region of 161.4 km² (like the Brussels region). Geohash cell sizes by string length are as follows:
Length   Cell size (at most)
1        5,000km × 5,000km
2        1,250km × 625km
3        156km × 156km
4        39.1km × 19.5km
5        4.89km × 4.89km
6        1.22km × 0.61km
7        153m × 153m
8        38.2m × 19.1m
9        4.77m × 4.77m
10       1.19m × 0.596m
11       149mm × 149mm
12       37.2mm × 18.6mm
If we allow an inaccuracy of up to 153m (on the region to which users subscribe to receive local broadcast messages), the number of topics required is already too large just to cover the entire Brussels region.
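For a rough sense of scale, the topic count can be estimated by dividing the region's area by the cell area. The Python sketch below does that with the figures quoted above; `cells_to_cover` is a hypothetical helper, and the result is only a lower bound, since real geohash cells will not line up with the region's boundary.

```python
import math

def cells_to_cover(area_km2, cell_width_km, cell_height_km):
    # Lower bound: region area divided by the area of one cell, rounded up.
    return math.ceil(area_km2 / (cell_width_km * cell_height_km))

BRUSSELS_AREA_KM2 = 161.4

length_7 = cells_to_cover(BRUSSELS_AREA_KM2, 0.153, 0.153)  # 153m × 153m cells
length_6 = cells_to_cover(BRUSSELS_AREA_KM2, 1.22, 0.61)    # 1.22km × 0.61km cells
```

With these inputs, length-7 geohashes give about 6,900 cells for Brussels alone, while length-6 geohashes give about 220.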
So I'm still a bit stuck at this level currently.
1. PubNub
PubNub is currently the only service that offers an out-of-the-box geohash pub/sub solution over WebSockets, but their pricing is extremely high (500 connected devices cost about $49, 20k devices cost $799). UPDATE: PubNub has updated its pricing, now with unlimited devices. Website updates coming soon.
PubNub is working on their pricing model because some of their customers were paying a lot for unexpected spikes in traffic.
However, it will not be a viable solution for a generic broadcast messaging app that is meant to be open to everybody, and for which traffic is therefore highly unpredictable.
This is a pity, since this service would have been the perfect solution for us otherwise.
2. Ably
Ably offers a pubsub system to stream data to clients over websockets for custom channels. Channels are created dynamically when a client attaches itself in order to either publish or subscribe to that channel.
The main problem here is that:
If we want high geohash accuracy, we need a high number of channels and hence we have to pay more;
If we go with low geohash accuracy, there will be a lot of redundant messaging:
Let's say that we take a channel that is represented by a geohash of 4 characters, spanning a geographical area of 39.1 x 19.5 km.
Any post that gets sent to that channel, would be multiplexed to everybody within that region who is currently listening.
However, let's say that our app allows for a maximum radius of 10km, and half of the connected users have set a 1km radius.
This means that all posts outside of that 1km radius will be multiplexed to these users unnecessarily, and will just be dropped without being of any further use.
We should also take into account the scalability of this approach. For every geohash that either producer or consumer needs, another channel will be created.
It is definitely more expensive to have an app that requires topics based on geohashes worldwide, than an app that requires only theme-based topics.
That is, with worldwide adoption, the number of topics increases dramatically, and so does the price.
Another consideration is that our app requires an additional number of channels:
By geohash and group: Our app allows the possibility to create geolocation based groups (which would be the equivalent of Twitter like #hashtags).
By place
By followed users (premium feature)
Despite this, there are a few optimistic considerations to this approach:
Streaming is only required when the newsfeed is active:
when the user has a browser window open with our website
when the user is on a mobile device and actively has the related feed open
Further optimisation can be done, e.g. only start streaming 10 to 20 seconds after a refresh of the feed
Streaming by place / followed users may have high traffic depending on current activity, but many place channels will be idle as well
A very important note in this regard is how Ably bills its consumers, which can be used to our full advantage:
A channel is opened when any of the following happens:
A message is published on the channel via REST
A realtime client attaches to the channel. The channel remains active for the entire time the client is attached to that channel, so if you connect to Ably, attach to a channel, and publish a message but never detach the channel, the channel will remain active for as long as that connection remains open.
A channel that is open will automatically close when all of the following conditions apply:
There are no more realtime clients attached to the channel
At least two minutes has passed since the last message was published. We keep channels alive for two minutes to ensure that we can provide continuity on the channel as part of our connection state recovery.
As an example, if you have 10,000 users, and at your busiest time of the month there is a single spike where 500 customers establish a realtime connection to Ably and each attach to one unique channel and one global shared channel, the peak number of channels would be the sum of the 500 unique channels per client and the one global shared channel i.e. 501 peak channels. If throughout the month each of those 10,000 users connects and attaches to their own unique channel, but not necessarily at the same time, then this does not affect your peak channel count as peak channels is the concurrent number of channels open at any point of time during that month.
Optimistic conclusion
The most important conclusion is that this feature may not be as crucial as we believe it is for a first version of the app.
Although Twitter, Facebook, etc. offer this feature of receiving live updates (and users have grown to expect it), an initial beta of our app on a limited scale can work without it, i.e. the user has to refresh in order to receive new updates.
During a first launch of the app, statistics can be gathered to gain more insight into detailed user behaviour. This will enable us to make more solid infrastructural and financial decisions based on factual data.
Putting aside the question of Ably, Pubnub and a DIY solution, the core of the question is this:
Where is message filtering taking place?
There are three possible solutions:
The Pub/Sub service.
The Server (WebSocket connection handler).
Client side (the client's device).
Since this is obviously a mobile oriented approach, client side message filtering is extremely rude, as it increases data consumption by the client while much of the data might be irrelevant.
Client side filtering will also increase battery consumption and will likely result in lower acceptance rates by clients.
This leaves pub/sub filtering (channel names / pattern matching) and server-side filtering.
Pub/Sub channel name filtering
A single pub/sub service serves a number of servers (if not all of them), making it a very expensive resource (relative to the resources we have at hand).
Using channel names to filter messages would be ideal - as long as the filtering is cheap (using exact matches with channel name hash mapping).
However, pattern matching (when subscribing to channels with inexact names, such as "users.*") is very expensive compared to exact matching.
This means that Pub/Sub channel name filtering can't be used to filter all the messages without overloading the pub/sub system.
Server side filtering
Since a server accepts WebSocket connections and bridges between the WebSocket and the pub/sub service, it's in an ideal position to filter the messages.
However, we don't want the server to process all the messages for all the clients for each connection, as this is an extreme duplication of effort.
Hybrid solution
A classic solution would divide the earth into manageable sections (1 sq. km per section will require 510.1 million unique channel names for full coverage... but I would suggest that the 70% ocean space should be neglected).
Busy sections might be subdivided (NYC might require a section per 250 sq meters rather than 1 sq kilometer).
This allows publishers to publish to exact channel names and subscribers to subscribe to exact channel names.
Publishers might need to publish to more than one channel and subscribers might need to subscribe to more than one channel, depending on their exact location and the grid's borders.
This filtering scheme will filter much, but not all.
The server node will need to look into the message, review its exact geo-location and filter messages before deciding if they should be sent along the WebSocket connection to the client.
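The two filtering stages could be sketched in Python as follows. This is an illustration under assumptions, not a description of any particular pub/sub product: `channel_for` uses a made-up naming scheme that snaps coordinates to a fixed latitude/longitude grid, and `should_forward` is a hypothetical server-side check based on the haversine great-circle distance.

```python
import math

def channel_for(lat, lng, cell_deg=0.01):
    # Coarse filter: exact channel name derived from a fixed grid (assumed scheme).
    return f"geo:{math.floor(lat / cell_deg)}:{math.floor(lng / cell_deg)}"

def haversine_km(lat1, lng1, lat2, lng2):
    # Great-circle distance between two points, using a mean Earth radius of 6371 km.
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def should_forward(message, subscriber):
    # Fine filter on the server node: forward only if inside the subscriber's radius.
    distance = haversine_km(message["lat"], message["lng"],
                            subscriber["lat"], subscriber["lng"])
    return distance <= subscriber["radius_km"]

subscriber = {"lat": 50.8503, "lng": 4.3517, "radius_km": 1.0}
near_post = {"lat": 50.8510, "lng": 4.3530, "text": "nearby"}
far_post = {"lat": 50.8950, "lng": 4.3517, "text": "about 5 km north"}

same_channel = channel_for(near_post["lat"], near_post["lng"]) == \
    channel_for(subscriber["lat"], subscriber["lng"])
forward_near = should_forward(near_post, subscriber)
forward_far = should_forward(far_post, subscriber)
```

A subscriber whose radius crosses cell borders would subscribe to every cell the radius touches; the exact distance check then drops whatever the grid let through.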
Why the Hybrid Solution?
This allows the system to scale with relative ease.
Since server nodes are (by design) cheaper than the pub/sub service, they could be used to handle the exact location filtering (the heavy work).
At the same time, the strength of the pub/sub system can be used to minimize the server's workload and filter the obvious mis-matches.
Pubnub vs. Ably?
I don't know. I didn't use either of them. I worked with Redis and implemented my own pub/sub solution.
I assume they are both great and it's really up to your needs.
Personally I prefer the DIY approach when it comes to customized or complex situations. IMHO, this seems like it would fall into the DIY category if I were to implement it.

How to determine the number of actors to spawn in Akka?

I have recently started looking into the Akka 2.0 framework and was able to get some code running, spawning actors that perform simple Oracle database calls, simple calculations and whatnot; nothing in production, however.
What I want to know is: is there a general rule of thumb or best practice for determining how many actors to spawn for certain types of tasks? Say, for example, I have a connection pool of 200 JDBC connections. Do I create an actor to represent each connection? Do I create a handful of them and use a round-robin approach?
Thanks.
Note that numberOf(actors) != numberOf(threads).
You should create an actor for every entity that would otherwise share mutable state across threads. The whole thing about the actor model is that it shall isolate mutable state so that only immutable messages get exchanged between the actors. The result is that you don't need any locks anymore and you can easily reason about the thread safety of your program because all mutable state is isolated in actors and you can rely on the framework to properly pass the memory barrier whenever required, e.g. when switching an actor from one thread to another.
The number of threads is a different subject: This depends on the number of cores and the blocking coefficient for each thread, i.e. the percentage of time it spends waiting for other threads or the I/O subsystem. For example, if your actors are doing CPU intensive calculations (e.g. calculating Pi) then the blocking coefficient will be close to 0%. If however your actors are doing mostly I/O, you can easily assume a blocking coefficient of 90% or more.
Finally, the number of threads can be calculated like this:
int threads = Runtime.getRuntime().availableProcessors() * 100 / (100 - blockingCoefficient);
where blockingCoefficient represents an integer percentage between 0 and 99 inclusive.
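As a worked example, the same arithmetic in Python, with the core count passed in explicitly so the numbers are easy to follow. `thread_count` is a hypothetical helper, and the 8-core machine is an assumed figure.

```python
def thread_count(cores, blocking_coefficient):
    # blocking_coefficient is an integer percentage from 0 to 99, as in the formula above.
    if not 0 <= blocking_coefficient <= 99:
        raise ValueError("blocking_coefficient must be between 0 and 99")
    return cores * 100 // (100 - blocking_coefficient)

cpu_bound = thread_count(8, 0)    # pure computation: one thread per core
half_io = thread_count(8, 50)     # threads wait half of the time
mostly_io = thread_count(8, 90)   # threads wait 90% of the time
```

On 8 cores this gives 8, 16 and 80 threads respectively: the more time each thread spends blocked, the more threads are needed to keep the cores busy.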
You can create as many actors as you like; however, you're limited to about 2 billion per parent. Also, don't forget to stop them when they are done, and do not create your actors as top-level unless they're actually top-level actors (i.e. create actors inside actors using context.actorOf instead of system.actorOf).
