CometD - WebSocket open / active connections count in real time

We are running Jetty 9.2.9 in production with CometD 3.0.1.
We are trying to understand the current load on the system at any given point of time and estimate its maximum scale.
Please suggest the best approach to accomplish the same.
I tried different approaches, such as listening on the meta channels and keeping a count for each message passed and each channel closed. But this does not look like a good approach, as it has to touch the meta channels every time and may also slow down message passing across channels.
Thank you!

CometD relies on the Servlet Container implementation of WebSocket, so for monitoring the open/closed WebSocket connections you should probably rely on container features.
For example, if you are using Jetty you may look at the JMX MBeans exposed by Jetty about its WebSocket implementation.
If what is already exposed is not enough for you, you can file a feature request (for Jetty, at this URL).
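For embedded Jetty the MBeans are exposed with one extra bean; a minimal sketch, assuming embedded Jetty 9.2 with the jetty-jmx module on the classpath (with a jetty.xml-based install you would enable the jmx module in the configuration instead):

```java
import java.lang.management.ManagementFactory;
import org.eclipse.jetty.jmx.MBeanContainer;
import org.eclipse.jetty.server.Server;

public class JmxEnabledServer {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);

        // Register Jetty's internal beans (connectors, thread pools,
        // WebSocket sessions, ...) with the platform MBean server so they
        // can be inspected with JConsole/VisualVM or any other JMX client.
        MBeanContainer mbeanContainer =
                new MBeanContainer(ManagementFactory.getPlatformMBeanServer());
        server.addBean(mbeanContainer);

        // ... register the CometD servlet / handlers here as usual ...

        server.start();
        server.join();
    }
}
```

On the CometD side, the current number of connected clients (regardless of transport) can also be read from BayeuxServer.getSessions().size(), which is often a simpler load metric than counting raw WebSocket connections.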

Related

Spring + Websockets + STOMP + Broker + Gateway does not scale

We have been evaluating Spring + STOMP + broker + WebSockets for a full-duplex messaging application that will run on AWS. We had hoped to use Amazon MQ. We are pushing messages to individual users and also broadcasting, so functionally the stack did look good. We have about 40,000 - 80,000 users. We quickly found, with load testing, that neither the Spring stack nor Amazon MQ scales very well. The issues:
Spring Cloud Gateway instance cannot handle more than about 3,000 websockets before dying.
Spring Websocket server instance can also only handle about 4,000 websockets, on a T3.Medium, when we bypass the Gateway.
AWS limits Active MQ connections to 100 for a small server, and then only 1000 on a massive server. No in-between, this is just weird.
Yes, we have increased the file handles etc. on the machines, so TCP connections are not the limit; there is no way Spring could ever get close to the limit here. We are sending an 18 KB message for load, the maximum we expect. In our results message size has little impact; it's just the connection overhead on the Spring stack.
The StompBrokerRelayMessageHandler opens a connection to the broker for each STOMP CONNECT, and there is no way to pool the connections. This makes the feature completely useless for any 'real' web application. In order to support our users, the cost of massive AWS servers for MQ makes this solution ridiculously expensive, requiring 40 of the biggest servers. In load testing, the Amazon MQ machine is doing nothing; with the 1,000 users it is not loaded. In reality a couple of medium-sized machines is all we need for all our brokers.
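For reference, this is roughly the configuration being discussed; a minimal sketch of the standard broker-relay setup (the endpoint path, destination prefixes and broker host/port are placeholders), where every client session gets its own TCP connection from the relay to the external broker:

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.simp.config.MessageBrokerRegistry;
import org.springframework.web.socket.config.annotation.EnableWebSocketMessageBroker;
import org.springframework.web.socket.config.annotation.StompEndpointRegistry;
import org.springframework.web.socket.config.annotation.WebSocketMessageBrokerConfigurer;

@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        // WebSocket endpoint the browser clients connect to (placeholder path).
        registry.addEndpoint("/ws");
    }

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        registry.setApplicationDestinationPrefixes("/app");
        // The relay opens one TCP connection to the external broker per client
        // session; there is no built-in pooling of these connections.
        registry.enableStompBrokerRelay("/topic", "/queue")
                .setRelayHost("broker.example.com")  // placeholder broker host
                .setRelayPort(61613);
    }
}
```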
Has anyone ever built a real-world solution, as above, using the Spring stack? It appears no one has done this, and no one has scaled it up.
Has anyone written a pooling StompBrokerRelayMessageHandler? I assume there must be a reason this can't work, as it should be the default approach. What is the issue here?
It seems these issues make the whole Spring Websocket + STOMP + Broker approach pretty useless, and we are now forced to use a different approach for message reliability and for messaging across servers where users are not connected (the main reason we are using a broker). We have gone back to using a Simple Broker and wrote a registry to manage the client/server location. So we have now eliminated the broker, and the figures above are with that model. We may then add in AWS SQS for reliability of messages.
What's left? We were going to use Spring Cloud Gateway to load balance across multiple small WebSocket servers, but it seems this approach will not work, as the WebSocket load a server can handle is just way too small; the Gateway just cannot handle it. We are now removing Spring Cloud Gateway and using an AWS load balancer instead, so now we can get significantly more connections load balanced. Why does Spring Cloud Gateway not load balance?
What's left? The WebSocket server instances are t3.mediums; they have no business logic and just pass a message between two clients, so they really do not need a bigger server. We would expect considerably better than 4,000 connections; however, this is close to usable.
We are now drilling into the issues to get more details on where the performance bottlenecks are, but the lack of any tuning guides or scaling information does not suggest good things about Spring. Compare this to Node solutions, which scale very well and handle larger numbers of connections on small machines.
The next approach is to look at WebFlux + WebSocket, but then we lose STOMP. Maybe we'll check raw websockets?
This is just an early attempt to see if anyone has actually used Spring Websockets in anger and can share a real working production architecture, as only toy examples are available. Any help on the above issues would be appreciated.

Notifying golongpoll.SubscriptionManager of an event from kafka-go

I was writing a POC on long-polling using Go.
I see the general package to be used is https://github.com/jcuga/golongpoll.
But what if I want to publish an event to the golongpoll.SubscriptionManager from a general context, especially when the long-poll API request may be served by one machine while the Kafka event for that particular consumer group is consumed by another instance in the cluster?
The examples given in the documentation do not cover such a scenario at all, even though it seems like a common one. One way I can think of is to have a distributed cache like Redis in between and have all the services poll it for a change, but that sounds a bit dumb to me.

Web server and ZeroMQ patterns

I am running an Apache server that receives HTTP requests and connects to a daemon script over ZeroMQ. The script implements the Multithreaded Server pattern (http://zguide.zeromq.org/page:all#header-73): it receives the request, dispatches it to one of its worker threads, performs the action, responds back to the server, and the server responds back to the client. Everything is done synchronously, as the client needs to receive a success or failure response to its request.
As the number of users grows into a few thousand, I am looking into potentially improving this. The first thing I looked at is the different patterns of ZeroMQ, and whether what I am using is optimal for my scenario. I've read the guide, but I find it challenging to understand all the details and differences across patterns. I was looking, for example, at the Load Balancing Message Broker pattern (http://zguide.zeromq.org/page:all#header-73). It seems quite a bit more complicated to implement than what I am currently using, and if I understand things correctly, its advantages are:
Actual load balancing vs the round-robin task distribution that I currently have
Asynchronous requests/replies
Is that everything? Am I missing something? Given the description of my problem, and its synchronous requirement, what would you say is the best pattern to use? Lastly, how would the answer change if I want to make my setup distributed (i.e. having the Apache server load balance the requests across different machines)? I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and having that layer bridge the communication between the web server and my workers.
Some thoughts about the subject...
Keep it simple
I would try to keep things simple with "plain" ZeroMQ as long as possible. To increase performance, I would simply change your backend script to send requests out from a DEALER socket and move the request-handling code into its own program. Then you could just run multiple worker servers on different machines to get more requests handled.
I assume this was the approach you took:
I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
The only problem here is that there is no request retry in the backend: if a worker fails to handle a given task, it is lost forever. However, the worker servers could be written so that they handle all the requests they have already received before shutting down. With this kind of setup it is possible to update backend workers without clients noticing any outages. This will not save requests that get lost if a server crashes.
I have the feeling that in common scenarios this kind of approach would be more than enough.
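As a concrete illustration of that distribution layer, here is a sketch using the JeroMQ Java binding (chosen purely for illustration; the same pattern works in any ZeroMQ binding, and the port numbers are arbitrary). The web-facing side binds a ROUTER socket for the client requests and a DEALER socket that fair-queues them out to worker processes on any machine:

```java
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

public class RequestDistributor {
    public static void main(String[] args) {
        try (ZContext context = new ZContext()) {
            // The web server's REQ sockets connect here.
            ZMQ.Socket frontend = context.createSocket(SocketType.ROUTER);
            frontend.bind("tcp://*:5559");

            // Worker processes (REP sockets), possibly on other machines, connect here.
            ZMQ.Socket backend = context.createSocket(SocketType.DEALER);
            backend.bind("tcp://*:5560");

            // Shuttle messages between the two sockets until the process is stopped;
            // requests are fair-queued across whichever workers are connected.
            ZMQ.proxy(frontend, backend, null);
        }
    }
}
```

Adding capacity then just means starting more worker processes that connect to tcp://distributor-host:5560.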
Mongrel2
Mongrel2 seems to handle many of the things you have already implemented, so it might be worthwhile to check it out. It probably does not completely solve your problem, but it provides tested infrastructure for distributing the workload. It could be used to deliver the requests to multithreaded servers running on different machines.
Broker
One way to increase the robustness of the setup is a broker. In this scenario the broker's main role would be to provide robustness by implementing a queue for the requests. I understood that all the requests the workers handle are basically of the same type; if the requests had different types, the broker could also do lookups to find the correct server for each request.
Using a queue provides a way to ensure that every request is handled by some worker even if worker servers crash. This does not come without a price: the broker is itself a single point of failure, and if it crashes or is restarted, all queued messages could be lost.
These problems can be avoided, but it requires quite a lot of work: the requests could be persisted to disk, the servers could be clustered. The need has to be weighed against the payoff: does one want to spend time writing a message broker or the actual system?
If a message broker seems like a good idea, the time required to implement one can be reduced by using an existing product (like RabbitMQ). The negative side effect is that there could be a lot of unwanted features, and adding new things is not as straightforward as with a self-made broker.
Writing your own broker could turn into reinventing the wheel. Many brokers provide similar things: security, logging, a management interface and so on. It seems likely that these are eventually needed in a home-made solution as well. But if not, then a single home-made broker that does one thing and does it well can be a good choice.
Even if a broker product is chosen, I think it is a good idea to hide the broker behind a ZeroMQ proxy, a dedicated piece of code that sends/receives messages to/from the broker. Then no other part of the system has to know anything about the broker, and it can be easily replaced.
Using a broker is somewhat heavy on developer time: you either need time to implement the broker or time to get used to a product. I would avoid this route until it is clearly needed.
Some links
Comparison between broker and brokerless
RabbitMQ
Mongrel2

Jersey and AsyncResponse vs. Redirects

Currently I am using Jersey 1.0 and am about to switch to 2.0. For REST requests that may last over a second or two I use the following pattern:
Client calls GET or PUT
Server returns a polling URL to the client
The client polls the URL until it gets a redirect to the completed resource
Pretty standard and straightforward. However, I noticed that Jersey 2.0 has an AsyncResponse capability. But it looks like this is done with no changes on the wire. In other words, the client still blocks for the result while the server is asynchronously processing the request.
So what good is this? Should I be using it instead of my current asynchronous approach for calls >1 second? Or is it really just to keep the connections freed on the server for calls that would be only a few hundred milliseconds?
I want my server to be as scalable as possible but the approach I use now can be tedious for the client. AsyncResponse seems super simple but I'm not sure how it would work for something like a heroku service where you want very short connection times.
AsyncResponse presumably gives you more scalability within the web app server for standard requests in terms of thread pooling resources, but I don't think it changes anything about the client experience, which will continue to block on read on its connection. Therefore, if you have already implemented a polling solution on your client side, this won't add much of any value to you imho.
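For completeness, this is roughly what the server side looks like; a minimal sketch (the executor and the slow generateReport() are placeholders, not part of any particular API). The container thread is freed immediately, but the HTTP connection stays open until resume() is called, so the client still blocks:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.container.AsyncResponse;
import javax.ws.rs.container.Suspended;

@Path("/report")
public class ReportResource {

    // Placeholder executor for the long-running work.
    private static final ExecutorService executor = Executors.newCachedThreadPool();

    @GET
    public void getReport(@Suspended final AsyncResponse asyncResponse) {
        // The container thread returns to Jersey's pool here; the client keeps
        // waiting on the open connection until resume() delivers the entity.
        executor.submit(() -> {
            String result = generateReport(); // placeholder for the slow work
            asyncResponse.resume(result);
        });
    }

    private String generateReport() {
        return "done";
    }
}
```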

Solution/Architecture: queues or something else?

I have multiple frontends to my service written in Node.js and workers written in Ruby. Now the question is how to make those communicate. I need to maintain a dynamic pool of workers to handle load (spawn more workers when load rises), and messages are quite big, ~2-3 MB, because I'm sending images uploaded by users through the Node.js frontends to the workers. Because I want nice scaling, I thought about some queuing solution, but I didn't find any existing solution (or I misunderstood the guides) that provides:
Fallback mechanisms. Solutions I've found so far have a single point of failure, the message broker, and there is no way to provide fallbacks.
Serialization, so that when the broker fails tasks are not lost.
Ability to pass big messages.
An easy API for Ruby and Node.js.
Some API to track queue size so I could rearrange the worker pool.
Preferably lightweight.
Maybe my approach is wrong? Maybe I shouldn't use queues but some other way? Or is there some queueing solution that fits the requirements above?
No doubt you require a queue to scale, and you can monitor this queue to spawn "workers".
Apache ActiveMQ is very robust and supports a REST protocol. A Ruby client is also available to access the queue.
Interesting article on RESTful queue using Apache ActiveMQ
At the end of the day I took the ZeroMQ queue solution: a very fast, robust and lightweight implementation. I had to write my own broker, but that's the only con of this solution.
Redis publish/subscribe should do the trick:
http://redis.io/topics/pubsub
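To show the shape of that approach, here is a minimal pub/sub sketch using the Jedis Java client (Java only to keep the examples in one language; the same two calls exist in the Node and Ruby Redis clients, and the channel name and payload are made up):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class PubSubSketch {
    public static void main(String[] args) throws Exception {
        // Worker side: blocks and handles every message published to "jobs".
        new Thread(() -> {
            try (Jedis subscriber = new Jedis("localhost", 6379)) {
                subscriber.subscribe(new JedisPubSub() {
                    @Override
                    public void onMessage(String channel, String message) {
                        System.out.println("got job: " + message);
                    }
                }, "jobs");
            }
        }).start();

        Thread.sleep(500); // crude wait so the subscriber is registered (sketch only)

        // Frontend side: publish a job description, e.g. a reference to the
        // uploaded image rather than the multi-megabyte image bytes themselves.
        try (Jedis publisher = new Jedis("localhost", 6379)) {
            publisher.publish("jobs", "resize:/uploads/42.png");
        }
    }
}
```

Note that pub/sub is fire-and-forget: messages published while no subscriber is connected are simply dropped, so for the "tasks are not lost" requirement a Redis list used as a queue (LPUSH on the frontend, BRPOP in the workers) is the closer fit.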
