I am developing an SMPP platform that must be able to deliver a specific number of SMS per second.
This was easy to implement using AMQP with Spring Integration.
But:
I need to run the project as an active-active service on 2 nodes, and each node has a connection to 2 SMSCs.
For this configuration, I have an allowed traffic of 100 msg/s, and ideally I need to spread that traffic across all the available connections.
A simple poller can easily be configured to 25 msg/s for each connection (4 * 25 = 100), but if one of my connections goes down, I want to redistribute the lost capacity to the other live nodes/connections.
For this I would like to create a dynamic poller that reads the connection status from Redis and adapts the number of messages allowed per poll at runtime (for example, 0 for the broken connection and 33% for each of the 3 others, or 50% each if only 2 of the 4 connections are available).
Is it possible to implement this behavior with a custom PollerMetadata or should I look for some other solution?
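The redistribution described in the question can be sketched independently of Spring Integration. The following is a minimal Python sketch, assuming the connection status has already been read (from Redis in the question; a plain dict here). The function name and the connection names are hypothetical, and how the result is fed into a poller is left open.

```python
def allowed_rates(total_rate, connection_up):
    """Split total_rate (msg/s) evenly across the connections that are up.

    connection_up maps a connection name to True (up) or False (down).
    Returns a dict mapping every connection name to its allowed msg/s.
    """
    live = sorted(name for name, up in connection_up.items() if up)
    rates = {name: 0 for name in connection_up}
    if not live:
        return rates  # nothing is up, so nothing may be sent
    base, remainder = divmod(total_rate, len(live))
    # Give the leftover messages to the first few live connections so the
    # shares always add up to exactly total_rate.
    for i, name in enumerate(live):
        rates[name] = base + (1 if i < remainder else 0)
    return rates


# Hypothetical status snapshot: one of the four connections is down.
status = {
    "node1-smsc1": True,
    "node1-smsc2": False,
    "node2-smsc1": True,
    "node2-smsc2": True,
}
rates = allowed_rates(100, status)
```

Each node would recompute its own share whenever the status changes, so the lost capacity moves to the surviving connections without a restart.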
Polling is quite heavy and may be considered old-fashioned these days.
I highly recommend trying SSE (server-sent events) or WebSocket instead.
Many technologies support both of these solutions (Spring, for example).
You can find more details in this article:
https://codeburst.io/polling-vs-sse-vs-websocket-how-to-choose-the-right-one-1859e4e13bd9
I'm trying to understand how NATS JetStream scales and have a couple of questions.
How efficient is subscribing by subject to historic messages? For example, let's say I have a stream foo that consists of 100 million messages with a subject of foo.bar and then a single message with a subject of foo.baz. If I then subscribe to foo.baz from the start of the stream, will something on the server have to perform a linear scan of all messages in foo, or will it be able to seek immediately to the foo.baz message?
How well does the system scale horizontally? I ask because I'm having issues getting JetStream to scale much above a few thousand messages per second, regardless of how many machines I throw at it. Test parameters are as follows:
NATS Server 2.6.3 running on 4 core 8GB nodes
Single Stream replicated 3 times (disk or in-memory appears to make no difference)
500 byte message payloads
n publishers each publishing 1k messages per second
The bottleneck appears to be on the publishing side as I can retrieve messages at least as fast as I can publish them.
Publishing in NATS JetStream is slightly different than publishing in Core NATS.
Yes, you can publish a Core NATS message to a subject that is recorded by a stream, and that message will indeed be captured in the stream. But with a Core NATS publication, the publishing application does not expect an acknowledgement back from the nats-server, whereas with the JetStream publish call, the nats-server sends an acknowledgement back to the client indicating whether the message was successfully persisted and replicated.
So when you do js.Publish() you are actually making a synchronous relatively high latency request-reply (especially if your replication is 3 or 5, and more so if your stream is persisted to file, and depending on the network latency between the client application and the nats-server), which means that your throughput is going to be limited if you are just doing those synchronous publish calls back to back.
If you want high publishing throughput to a stream, you should use the asynchronous version of the JetStream publish call instead (i.e. js.PublishAsync(), which returns a PubAckFuture).
However, in that case you must also remember to introduce some flow control by limiting the number of 'in-flight' asynchronous publications you have at any given time (this is because you can always publish asynchronously much, much faster than the nats-server(s) can replicate and persist messages).
If you were to continuously publish asynchronously as fast as you can (e.g. when publishing the result of some kind of batch process) then you would eventually overwhelm your servers, which is something you really want to avoid.
You have two options to flow-control your JetStream async publications:
Specify a maximum number of in-flight asynchronous publication requests as an option when obtaining your JetStream context, i.e. js, err := nc.JetStream(nats.PublishAsyncMaxPending(100))
Use a simple batching mechanism that checks for the publications' PubAcks every so many asynchronous publications, as nats bench does: https://github.com/nats-io/natscli/blob/e6b2b478dbc432a639fbf92c5c89570438c31ee7/cli/bench_command.go#L476
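The idea behind the first option, a cap on in-flight publications, can be illustrated without a NATS server. The following is a minimal Python sketch that uses asyncio and a simulated publish call. fake_publish, publish_all and the 1 ms delay are assumptions made for illustration; they are not part of any NATS client API.

```python
import asyncio


async def fake_publish(msg, acked):
    """Stand-in for an async publish that completes when the ack arrives."""
    await asyncio.sleep(0.001)  # simulated replicate-and-persist latency
    acked.append(msg)


async def publish_all(messages, max_pending):
    """Publish every message while keeping at most max_pending in flight."""
    window = asyncio.Semaphore(max_pending)
    acked = []
    in_flight = 0
    peak = 0

    async def publish_one(msg):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            await fake_publish(msg, acked)
        finally:
            in_flight -= 1
            window.release()  # free a slot once the ack is in

    tasks = []
    for msg in messages:
        # The producer waits here whenever the window is full.
        await window.acquire()
        tasks.append(asyncio.create_task(publish_one(msg)))
    await asyncio.gather(*tasks)
    return acked, peak


acked, peak = asyncio.run(publish_all(range(1000), max_pending=100))
```

The producer never gets more than max_pending messages ahead of the acknowledgements, which is the property that keeps a fast publisher from overwhelming the servers.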
About the expected performance: using async publications allows you to really get the throughput that NATS and JetStream are capable of. A simple way to validate or measure performance is to use the nats CLI tool (https://github.com/nats-io/natscli) to run benchmarks.
For example, you can start with a simple test: nats bench foo --js --pub 4 --msgs 1000000 --replicas 3 (an in-memory stream with 3 replicas, and 4 goroutines, each with its own connection, publishing 128-byte messages in batches of 100), and you should get a lot more than a few thousand messages per second.
For more information and examples of how to use the nats bench command you can take a look at this video: https://youtu.be/HwwvFeUHAyo
It would be good to get an opinion on this. I see similar behaviour, and the only way I have found to achieve higher publisher throughput is to lower replication (from 3 to 1), which is not an acceptable solution.
I have tried adding more resources (CPU/RAM), with no increase in the publishing rate.
Scaling horizontally did not make any difference either.
In my case, I am using the nats bench tool to publish to JetStream.
For an R3 filestore you can expect ~250k small msgs per second. If you utilize synchronous publish that will be dominated by RTT from the application to the system, and from the stream leader to the closest follower. You can use windowed intelligent async publish to get better performance.
You can get higher numbers with memory stores, but again will be dominated by RTT throughout the system.
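To see why the round-trip time (RTT) dominates, a back-of-the-envelope bound helps. The following Python sketch assumes that each publication costs exactly one round trip and that the servers themselves are never the limit, so the results are upper bounds rather than measurements. The function names and the 1 ms RTT are illustrative assumptions.

```python
def max_sync_rate(rtt_ms):
    """Upper bound on msg/s when each publish waits for the previous ack."""
    return 1000 / rtt_ms


def max_windowed_rate(rtt_ms, window):
    """Upper bound on msg/s when up to `window` publishes are in flight."""
    return window * 1000 / rtt_ms


# Illustrative numbers only: a 1 ms round trip and a window of 100.
sync_bound = max_sync_rate(1)
windowed_bound = max_windowed_rate(1, 100)
```

Under these assumptions, back-to-back synchronous publishing tops out at 1,000 msg/s per publisher, while a window of 100 in-flight publications raises the bound to 100,000 msg/s.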
If you give me a sense of how large your messages are, we can show you some results from nats bench against the demo servers (R1) and NGS (R1 & R3).
For the original question regarding filtered consumers, >= 2.8.x will not do a linear scan to retrieve foo.baz. We could also show an example of this as well if it would help.
Feel free to join the slack channel (slack.nats.io) which is a pretty active community. Even feel free to DM me directly, happy to help.
I am using IBM MQ in my application, and the connection factory is defined at the JBoss level. The maximum pool size property at the JBoss level is configured as 50. The max instances per client setting for the channel is 999999999. Sharing conversations is left at the default of 10.
I would appreciate it if someone could explain how these settings work together.
I understand that the JVM can establish at most 50 connections to the queue manager (the maximum connection pool size). If I have 50 message listener threads running in parallel, all of them will be consumed. But at the channel level, sharing conversations is 10, which means up to 10 conversations can be shared over a single TCP connection. In that case, are we not using this capability, since we have exhausted the 50 connections? If 10 conversations can share a connection, shouldn't we see only 5 connections established for 50 messages?
Also, if we have 100 messages to be consumed or loaded, and max channels is set to a high value, does that mean 100 channels will be used, or 10 channels with 10 shared conversations each?
Please excuse me if the above assumptions are completely wrong, as I am a complete beginner in async architecture.
Sharing conversations is used to tune load balancing on the MQ server. When more than one shared conversation is used, the client code acts as if it has locked or round-robin access to the channel instance. So if 50 listeners with a sharing conversations value of 10 are waiting for messages, only 5 of them are active at any moment, one for each channel instance, regardless of the threading model in the JVM.
By setting sharing conversations to 1 you eliminate this contention. The price is higher resource usage. Keep this number at 1 unless you have a large number of lightly used queues.
Note: This is a design-related question to which I couldn't find a satisfying answer, hence asking here.
I have a Spring Boot app which is deployed in the cloud (Cloud Foundry). The app connects to an Oracle database to retrieve data. The application uses a connection pool (HikariCP) to maintain the connections to the database. Let's say the number of connections is set to 5. The application can scale automatically based on the load, and all the instances share the same database. At any moment there could be 50 instances of the same application running, which means the total number of database connections will be 250 (i.e. 5 * 50).
Now suppose the database can handle only 100 concurrent connections. In the current scenario, 20 instances will use up the 100 connections available. What will happen if the next 30 instances try to connect to the database? If this is a design issue, how can it be avoided?
Please note that the numbers provided in the question are hypothetical for simplicity. The actual numbers are much higher.
Let's say:
Number of available DB connections = X
Number of concurrent instances of your application = Y
Maximum size of the DB connection pool within each instance of your application = X / Y
That's slightly simplistic since you might want to be able to connect to your database from other clients (support tools, for example) so perhaps a safer formula is (X * 0.95) / Y.
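The formula above can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, the 5% reserve mirrors the 0.95 factor, and integer arithmetic is used so the result rounds down to a whole number of connections.

```python
def max_pool_size(db_connections, app_instances, reserve_percent=5):
    """Largest per-instance pool size that keeps the total within the DB limit.

    reserve_percent of the DB connections is held back for other clients
    (support tools, for example), matching the 0.95 factor above.
    """
    usable = db_connections * (100 - reserve_percent) // 100
    return usable // app_instances


# The numbers from the question: 100 DB connections and 50 instances.
pool_for_50_instances = max_pool_size(100, 50)
# The same database shared by only 20 instances.
pool_for_20_instances = max_pool_size(100, 20)
```

With the question's numbers this leaves 1 connection per instance at 50 instances and 4 per instance at 20, which shows how quickly the per-instance pool shrinks as the application scales out.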
Now, you have ensured that your application layer will not encounter 'no database connection exists' issues. However if (X * 0.95) / Y is, say, 25 and you have more than 25 concurrent requests passing through your application which need a database connection at the same time then some of those requests will encounter delays when trying to acquire a database connection and, if those delays exceed a configured timeout, they will result in failed requests.
If you can limit throughput in your application such that you will never have more than (X * 0.95) / Y concurrent 'get database connection' requests then hey presto the issue disappears. But, of course, that's not typically realistic (indeed since less is rarely more ... telling your clients to stop talking to you is generally an odd signal to send). This brings us to the crux of the issue:
Now the application has the capacity to scale automatically based on the load.
Upward scaling is not free. If you want the same responsiveness when handling 100000N concurrent requests as you have when handling N concurrent requests, then something has to give; you have to scale up the resources which those requests need. So, if they make use of database connections, then the number of concurrent connections supported by your database will have to grow. If server-side resources cannot grow proportionally to client usage, then you need some form of back pressure or you need to carefully manage your server-side resources. One common way of managing your server-side resources is to ...
Make your service non-blocking i.e. delegate each client request to a threadpool and respond to the client via callbacks within your service (Spring facilitates this via DeferredResult or its Async framework or its RX integration)
Configure your server-side resources (such as the maximum number of available connections allowed by your DB) to match the maximum throughput from your services, based on the total size of your service instances' client-request threadpools
The client-request threadpool limits the number of currently active requests in each service instance; it does not limit the number of requests your clients can submit. This approach allows the service to scale upwards (to a limit represented by the size of the client-request threadpools across all service instances), and in so doing it allows the service owner to safeguard resources (such as their database) from being overloaded. And since all client requests are accepted (and delegated to the client-request threadpool), they are never rejected, so from the clients' perspective it feels as if scaling is seamless.
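The effect of a client-request threadpool can be shown with a minimal Python sketch (not Spring code). Every request is accepted and queued, but only POOL_SIZE of them run at once. POOL_SIZE, handle_request and the sleep that stands in for database work are assumptions made for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4  # hypothetical: sized to the DB connections available per instance

active = 0
peak_active = 0
lock = threading.Lock()


def handle_request(request_id):
    """Simulated request handler that needs a database connection."""
    global active, peak_active
    with lock:
        active += 1
        peak_active = max(peak_active, active)
    time.sleep(0.01)  # stands in for work that holds a database connection
    with lock:
        active -= 1
    return request_id


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # All 40 requests are accepted immediately; the pool queues the excess.
    futures = [pool.submit(handle_request, i) for i in range(40)]
    results = [f.result() for f in futures]
```

No request is rejected, yet peak_active never exceeds POOL_SIZE, so the database sees a bounded number of concurrent users per instance.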
This sort of design is further augmented by a load balancer over the cluster of service instances which distributes traffic across them (round robin or even via some mechanism whereby each node reports its 'busy-ness' with that feedback being used to direct the load balancer's behaviour e.g. direct more traffic to NodeA because it is under utilised, direct less traffic to NodeB because it is over utilised).
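The feedback-driven variant can be reduced to a single selection step. This is a minimal Python sketch, assuming each node reports a busy-ness value between 0 and 1. pick_node and the reported values are hypothetical, and a real load balancer would also handle stale reports and node failures.

```python
def pick_node(busyness):
    """Return the name of the node that currently reports the lowest load."""
    return min(busyness, key=busyness.get)


# Hypothetical reports: NodeA is underutilised, NodeB is overutilised.
reports = {"NodeA": 0.2, "NodeB": 0.9}
target = pick_node(reports)
```

Here the next request would be directed to NodeA, matching the example in the paragraph above.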
The above description of a non blocking service only scratches the surface; there's plenty more to them (and loads of docs, blog postings, helpful bits-n-pieces on the Internet) but given your problem statement (concern about server side resources in the face of increasing load from a client) it sounds like a good fit.
I have multiple clients (referred to here as channels) accessing a service on a WebSphere Message Broker.
The service is likely to be a SOAP-based web service (it could possibly be RESTful too).
Prioritizing requests for MQ/JMS can be handled by WMB using the header info (priority).
The SOAP and HTTP nodes do not seem to have an equivalent property, so I am wondering how we can achieve priority for requests from a specific client channel.
Can I use multiple execution groups (EGs) to give higher priority to a specific channel? In other words, I am thinking of using EGs to give a bigger pipe to a specific channel, which should translate to its requests being processed faster than those of the other channels.
Thanks
If you have IIB v9 you can use the "workload management" feature described here:
http://pic.dhe.ibm.com/infocenter/wmbhelp/v9r0m0/topic/com.ibm.etools.mft.doc/bj58250_.htm
https://www.youtube.com/watch?v=K11mKCHMRxo
The problem with this is that it only lets you cap different classes of messages at maximum rates; it won't let you run low-priority work at full speed when there is no high-priority work, for example.
So a better approach might be to create multiple EGs, using the maxThreads property on the EG-level HTTP connector and the number of additional instances configured on each flow to give relative priority to the different classes of traffic.
Recently we noticed that our NServiceBus subscribers were not able to handle the increasing load. We have a fairly constant input stream of events (measurement data from embedded devices), so it is very important that the throughput keeps up with the input.
After some profiling, we concluded that it was not the handling of the events that was taking a lot of time, but rather the NServiceBus process of retrieving and publishing events. To try to get a better idea of what goes on, I recreated the Pub/Sub sample (http://particular.net/articles/nservicebus-step-by-step-publish-subscribe-communication-code-first).
On my laptop, using all the NServiceBus defaults, the maximum throughput of the Ordering.Server is about 10 events/second. The only thing it does is
class PlaceOrderHandler : IHandleMessages<PlaceOrder>
{
    // Injected by NServiceBus.
    public IBus Bus { get; set; }

    public void Handle(PlaceOrder message)
    {
        // Republish every incoming order as an OrderPlaced event.
        Bus.Publish<OrderPlaced>(e => { e.Id = message.Id; e.Product = message.Product; });
    }
}
I then started to play around with configuration settings. Most had no impact on this (very low) performance, until I switched the subscription storage to MSMQ:
Configure.With()
    .DefaultBuilder()
    .UseTransport<Msmq>()
    .MsmqSubscriptionStorage();
With this configuration, the throughput instantly went up to 60 messages/sec.
I have two questions:
When using MSMQ as subscription storage, the performance is much better than RavenDB. Why does something as trivial as the storage for subscription data have such an impact?
I would have expected a much higher performance. Are there any other configuration settings that I should use to get at least one order of magnitude better than this? On our servers, the maximum throughput when running this sample is about 200 msg/s. This is far from spectacular for a system that doesn't even do anything useful yet.
MSMQ doesn't have native pub/sub capabilities, so NServiceBus adds support for this by storing the list of subscribers and then looping over that list, sending a copy of the event to each of the subscribers. This translates to X message queuing operations, where X is the number of subscribers. This explains why RabbitMQ is faster: it has native pub/sub, so you only need one operation against the broker.
The reason the storage based on an MSMQ queue is faster is that it is local storage (it can't be used if you need to scale out the endpoint), which means that we can cache the data, since there can't be any other endpoint instances updating the storage. In short, we get away with an in-memory lookup, which, as you can see, is the fastest option.
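The cost of storage-driven pub/sub can be sketched in a few lines of Python. This illustrates the fan-out described above and is not NServiceBus code: the function, the subscriber names and the dict of lists standing in for queues are all hypothetical.

```python
def publish_via_subscriber_list(event, subscribers, queues):
    """Send one copy of the event to every subscriber's queue.

    Returns the number of queuing operations performed, which equals
    the number of subscribers.
    """
    operations = 0
    for subscriber in subscribers:
        queues.setdefault(subscriber, []).append(event)
        operations += 1
    return operations


queues = {}
subscribers = ["Billing", "Shipping", "Audit"]
operations = publish_via_subscriber_list("OrderPlaced", subscribers, queues)
```

With three subscribers, a single publish costs three queue sends, whereas a broker with native pub/sub would receive one operation from the publisher.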
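The caching argument can be illustrated the same way. This is a minimal Python sketch, assuming the storage is only ever updated by this one process, which is the condition stated above. The class and function names are hypothetical and do not reflect NServiceBus internals.

```python
class CachedSubscriptionLookup:
    """Caches subscriber lists so storage is read once per event type."""

    def __init__(self, load_subscribers):
        self._load = load_subscribers  # slow call into the real storage
        self._cache = {}

    def subscribers_for(self, event_type):
        if event_type not in self._cache:
            self._cache[event_type] = list(self._load(event_type))
        return self._cache[event_type]


storage_reads = []


def load_from_storage(event_type):
    """Stand-in for a slow storage query; records every call."""
    storage_reads.append(event_type)
    return ["Billing", "Shipping"]


lookup = CachedSubscriptionLookup(load_from_storage)
for _ in range(1000):
    found = lookup.subscribers_for("OrderPlaced")
```

A thousand publishes trigger a single storage read. If another endpoint instance could change the subscriptions, the cache would need invalidation, which is why this shortcut only applies to local storage.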
There are plans to add native caching across all storages:
https://github.com/Particular/NServiceBus/issues/1320
200 msg/s sounds quite low. What number do you get if you skip the Bus.Publish (just to get a baseline)?
Possibility 1: distributed transactions
Distributed transactions are created when processing messages because of the combination Queue-Database.
Try measuring without transactional handling of the messages. How does that compare?
Possibility 2: msmq might not be the best queueing system for your needs
Have you considered switching to RabbitMQ for transport? I have had very good experiences with RabbitMQ in combination with MassTransit; it far exceeds the numbers you mention in your question.