What are the technologies for building real-time servers? - websocket

I am a backend developer and I would like to know what are the common technologies for building real-time servers. I know I could use a service like Firebase, but I really want to create it. I have some experience using Websockets on Java, but I would like to know more ways to achieve a real-time server. When I say real-time, I mean something like Facebook. I also would like to know how to scale real-time servers.
Thank you all!

I've asked the same in multiple forums. Common answer to this question is strangely enough still:
WebSocket
Socket.io
Server-Sent Events (SSE)
But those are mainly ways of transporting or streaming events to the clients. Something needs to be built on top of it. And there are multiple other things to consider, such as:
Considerations for real-time API's
What events to send to the client
How to send each client only the events they need
How to handle authorization for events
Where to keep state on the event subscriptions (for stateless services)
How to recover from missed events due to lost connections and service crashes
Producing events for search-, or pagination queries
How to scale
Publish/Subscribe solutions
There are multiple pub/sub solutions out there, such as:
Pusher
PubNub
SocketCluster
etc.
But because of the limitation of a topic based pub/sub architecture, some of the above questions are still left unanswered and has to be dealt with by yourself. Examples are lost connections, where Pusher has no fallback, neither does SocketCluster, and PubNub has a limited queue.
Resgate - Realtime API Gateway
An alternative to the traditional topic based pub/sub pattern is using a resource-aware realtime API Gateway, such as Resgate.
Instead of the client subscribing to topics, the gateway keeps track on which resources (objects or arrays) that the client has fetched, keeping the client data up to date until it unsubscribes.
As a developer of Resgate, I can really recommend checking it out as it solves all above question, is language agnostic, simple and light-weight, and blazingly fast.
Read more at NATS blog.
Scaling
Let's say you want to scale both in the number of concurrent clients and the number of events that is produced. You will eventually need to ensure each client only gets the data they are interested in through either traditional topic based publish/subscribe, or through resource subscriptions. All above solutions handles that.
I also assume all the above mentioned solutions scales concurrent clients by allowing you to add more nodes/servers that handles the persistent WebSocket connections.
With Resgate, first level of scaling is done by simply running multiple instances (it is a simple executable), and adding a load balancer that distributes the connection evenly between them:
Handling 100M concurrent clients
Let's say a single Resgate instance handles 10000 persistent WebSocket connections, and you can add 10000 Resgates (distributed to multiple data centers) to a single NATS Server. This would allow a total of 100M connections. Of course, depending on your data, you might have other scaling issues as well, such as network traffic ;) .
A second layer of scaling (and adding redundancy) would be to replicate the whole setup to different data centers, and have the services synchronize their data between the data centers using other tools like Kafka, CockroachDB, etc.
Scaling data retrieval
With the traditional publish/subscribe solution that only deals with events, you will also have to handle scaling for the HTTP (REST) requests.
With Resgate, this is not required, as resource data is also fetched over the WebSocket connection. This allows Resgate not only to ensure that resource data and events are synchronized (another issue with separate pub/sub solutions), but also that the data can be cached. If multiple clients requests the same data, Resgate will only need to fetch it from the service once, effectively improving scalability.

Butterfly Server .NET is a real-time server written in C# allowing you to create real-time apps. You can see the source at https://github.com/firesharkstudios/butterfly-server-dotnet.

Related

KDB+/Q: GRPC implementation?

gRPC is a modern open source high performance RPC framework that can
run in any environment. It can efficiently connect services in and
across data centers with pluggable support for load balancing,
tracing, health checking and authentication. It is also applicable in
last mile of distributed computing to connect devices, mobile
applications and browsers to backend services.
I'm finding GRPC is becoming increasingly more pertinent in backend infrastructure, and would've liked to have it in my favorite language/tsdb kdb+/q.
I was surprised to find that kdb+ does not have a grpc implementation. Obviously, the (https://code.kx.com/q/interfaces/protobuf/)
package doesn't support the parsing of rpc's, is there anything quantitatively preventing there being a KDB+ implementation of the rpc requests/services etc. found in grpc?
Why would one not want to implement rpc's (grpc) in kdb+ and would it be a good idea to wrap a c++/c implemetation therin inorder to achieve this functionality.
Thanks for your advice.
Interesting post:
https://zimarev.com/blog/event-sourcing/myth-busting/2020-07-09-overselling-event-sourcing/
outlines event sourcing, which I think might be a better fit for kdb?
What is the main issue with services using RPC calls to exchange information? Well, it’s the high degree of coupling introduced by RPC by its nature. The whole group of services or even the whole system can go down if only one of the services stops working. This approach diminishes the whole idea of independent components.
In my practice I hardly encounter any need to use RPC for inter-service communication. Partially because I often use Event Sourcing, more about it later. But we always use asynchronous communication and exchange information between services using events, even without Event Sourcing.
For example, an order microservice in an e-commerce system needs customer data from the customer microservice. These dependencies between microservices are not ideal. Other microservices can go down and synchronous RESTful requests over https do not scale well due to their blocking nature. If there was a way to completely eliminate dependencies between microservices completely the result would be a more robust architecture with less bottlenecks.
You don’t need Event Sourcing to fix this issue. Event-driven systems are perfectly capable of doing that. Event Sourcing can eliminate some of the associated issues like two-phase commits, but again, not a requirement to remove the temporal coupling from your system.

Transaction Log in Aerospilke

What I have?
A lot of different microservices managing by different teams. All microservices persist data in Aerospike database.
What I want to achieve?
I'm building new microservice that relies on data handled by another services. I want to listen the changes in entities, but unfortunately that microservices don't put anything in message queue, they have only usual REST APIs, so I cant just subscribe to events.
The idea is to listen a transaction log(event log/commit log/WAL) of database. This approach is also using in different Event Sourcing systems, but I cant found any Aerospike API that would stream this log. So the question - does Aerospike provide any similar functionality, may be with different name?
Aerospike, in its enterprise edition, has a feature called change notification framework which may fit your requirements. It informs an external agent about all the write operations. This is built over the XDR functionality which is meant for replicating across data centers using a digestlog.
If you are not planning for enterprise, you should reconsider having your own message queue in front of Aerospike.

Bidirectional client-server communication using Server-Sent Events instead of WebSockets?

It is possible to achieve two-way communication between a client and server using Server Sent Events (SSE) if the clients send messages using HTTP POST and receive messages asynchronously using SSE.
It has been mentioned here that SSE with AJAX would have higher round-trip latency and higher client->server bandwidth since an HTTP request includes headers and that websockets are better in this case, however isn't it advantageous for SSE that they can be used for consistent data compression, since websockets' permessage-deflate supports selective compression, meaning some messages might be compressed while others aren't compressed
Your best bet in this scenario would be to use a WebSockets server because building a WS implementation from scratch is not only time-consuming but the fact that it has already been solved makes it useless. As you've tagged Socket.io, that's a good option to get started. It's an open source tool and easy to use and follow from the documentation.
However, since it is open-source, it doesn't provide some functionality that is critical when you want to stream data in a production level application. There are issues like scalability, interoperability (for endpoints operating on protocols other than WebSockets), fault tolerance, ensuring reliable message ordering, etc.
The real-time messaging infrastructure plus these critical production level features mentioned above are provided as a service called a 'Data Stream Network'. There are a couple of companies providing this, such as Ably, PubNub, etc.
I've extensively worked with Ably so comfortable to share an example in Node.js that uses Ably:
var Ably = require('ably');
var realtime = new Ably.Realtime('YOUR-API-KEY');
var channel = realtime.channels.get('data-stream-a');
//subscribe on devices or database
channel.subscribe(function(message) {
console.log("Received: " message.data);
});
//publish from Server A
channel.publish("example", "message data");
You can create a free account to get an API key with 3m free messages per month, should be enough for trying it out properly afaik.
There's also a concept of Reactor functions, which is essentially invoking serverless functions in realtime on AWS, Azure, Gcloud, etc. You can place a database on one side too and log data as it arrives. Pasting this image found on Ably's website for context:
Hope this helps!
Yes, it's possible.
You can have more than 1 parallel HTTP connection open, so there's nothing stopping you.

web Api application subscribing to a queue. Is it a good idea?

We are designing a reporting system using microservice architecture. All the services are supposed to be subscribers to the event bus and they communicate by raising events. We also decided to expose each of our services using REST api. Now the question is , is it a good idea to create our services as web api [RESTful] applications which are also subscribers to the event bus? so basically there are 2 ponits of entry to each service - api and events. I have a feeling that we should separate out these 2 as these are 2 different concerns. Any ideas?
Since Microservices architecture are Un-opinionated software design. So you may get different answers on this questions.
Yes, REST and Event based are two different things but sometime both combined gives design to achieve better flexibility.
Answering to your concerns, I don't see any harm if REST APIs also subscribe to a queue as long as you can maintain both of them i.e changes to message does not have any impact of APIs and you have proper fallback and Eventual consistency mechanism in place. you can check discussion . There are already few project which tried it such as nakadi and ponte.
So It all depends on your service's communication behaviour to choose between REST APIs and Event-Based design Or Both.
What you do is based on your requirement you can choose REST APIs where you see synchronous behaviour between services
and go with Event based design where you find services needs asynchronous behaviour, there is no harm combining both also.
Ideally for inter-process communication protocol it is better to go with messaging and for client-service REST APIs are best fitted.
Check the Communication style in microservices.io
REST based Architecture
Advantage
Request/Response is easy and best fitted when you need synchronous environments.
Simpler system since there in no intermediate broker
Promotes orchestration i.e Service can take action based on response of other service.
Drawback
Services needs to discover locations of service instances.
One to one Mapping between services.
Rest used HTTP which is general purpose protocol built on top of TCP/IP which adds enormous amount of overhead when using it to pass messages.
Event Driven Architecture
Advantage
Event-driven architectures are appealing to API developers because they function very well in asynchronous environments.
Loose coupling since it decouples services as on a event of once service multiple services can take action based on application requirement. it is easy to plug-in any new consumer to producer.
Improved availability since the message broker buffers messages until the consumer is able to process them.
Drawback
Additional complexity of message broker, which must be highly available
Debugging an event request is not that easy.

Which group messaging technology to use?

I feel a little bit kind of confused — for about 24 hours I have been thinking which group broadcasting technology to use in my project.
Basically, what I need is:
create groups (by some backend process)
broadcast messages by any client (1:N, N:N)
(potentially) direct messages (1:1)
(important) authenticate/authorize clients with my own backend (say, through some kind of HTTP API)
to be able to kick specific clients by backend process (or server plugin)
Here is what I will have:
Backend-related process(es) in either Ruby or Haxe
Frontend in JS+Haxe(Flash9) — in browser, so ideally communicating through 80/443, but not necessarily.
So, this technology will have to be easily accessible in Haxe for Flash and preferably Ruby.
I've been thinking about: RabbitMQ (or OpenAMQ), RabbitMQ+STOMP, ejabberd, ejabberd+BOSH, juggernaut (with a need to write a Haxe lib for it).
Any ideas/suggestions?
Yurii,
RabbitMQ, Haxe and as3: http://geekrelief.wordpress.com/2008/12/15/hxamqp-amqp-with-haxe/
RabbitMQ, Ruby and ACLs: http://pastie.org/pastes/368315
You might also want to look at using Nanite with RabbitMQ to manage backend groups: http://brainspl.at/articles/2008/10/11/merbcamp-keynote-and-introducing-nanite
You say you need:
* broadcast messages by any client (1:N, N:N)
* (potentially) direct messages (1:1)
You can easily do both using RabbitMQ. RabbitMQ supports both cases, 1:N pubsub and 1:1 messaging, with 'direct' exchanges.
The direct exchange pattern is as follows:
Any publisher (group member) sends a message to the broker with a 'routing key' such as "yurii". RabbitMQ matches this key with subscription bindings in the routing table (aka "exchange") for you. Each binding represents a subscription by a queue, expressing interest in messages with a given routing key. When the routing and binding keys match, the message is then routed to queues for subsequent consumption by clients (group members). This works for 1:N and 1:1 cases; with N:N building on 1:N.
Introduction to the routing model: http://blogs.digitar.com/jjww/2009/01/rabbits-and-warrens/
General intro: http://google-ukdev.blogspot.com/2008/09/rabbitmq-tech-talk-at-google-london.html
You also require:
* (important) authenticate/authorize clients with my own backend (say, through some kind of HTTP API)
Please see the ACLs code for this (link above). There is also a HTTP interface to RabbitMQ but we have not yet combined the HTTP front end with the ACL code. That shouldn't hold oyu back though. Please come to the rabbitmq-discuss list where this topic has been talked about recently.
You also require:
* create groups (by some backend process)
* to be able to kick specific clients by backend process (or server plugin)
I suggest looking at how tools like Nanite and Workling do this. Group creation is not usually part of a messaging system, instead, in RabbitMQ, you create routing patterns using subscriptions. You can kick specific clients by sending messages to them by whichever key they have used to bind their consuming queue to the exchange.
Hope this helps!
alexis
If you are going to be doing Flash dev have you looked at SmartfoxServer? It has everything you want and has native Flash client libraries. I used in on a project to manage 10s of thousands of connected users.
http://www.smartfoxserver.com/
Well group communication is a slightly different beast than simple messaging / queuing.
Most group communication systems are commercial but there are two (that I know of) open-source / free you can take a look at:
Spread Toolkit
OpenAIS
Both of these might be tough to find Ruby bindings though. Spread, and probably OpenAIS, view clients as trusted so a browser based client doesn't make sense. You'd need to have your browser front-ends talk to a group client(s) on the back-end.
We've been using ActiveMQ. Our vendor who supplies our HR system is using Ruby/ActiveMQ to broadcast and receive updates.
http://activemq.apache.org/cross-language-clients.html
Other open source message brokers which support the Stomp protocol are OpenMQ, which is included in GlassFish V3 and GlassFish 2.1.1 but also works standalone, and soon the JBoss message broker, HornetQ V2.1.
OpenMQ supports temporary queues which are useful for a RPC style communication, but ActiveMQ offers some interesting features in the Stomp adapter too.

Resources