Scaling FastAPI and Redis Pub/Sub for chat - WebSocket

I'm trying to build a scalable chat app with FastAPI and Redis pub/sub.
Suppose we have 10 processes running the FastAPI app. Each process creates 1 connection pool to Redis at startup, and the Redis instance allows a maximum of 10 connections. Each user has their own Redis channel where all notifications (chat messages, app notifications, etc.) arrive. When a user connects to a websocket, 2 tasks are launched: 1 that listens to the websocket, and 1 that listens to the user's Redis channel. Below is a simplified version of what we have now.
resources.py
import aioredis
from aioredis import Redis

redis = None

async def startup_event():
    global redis
    # REDIS_URL and REDIS_PASSWORD come from our settings
    redis = aioredis.from_url(
        url=REDIS_URL,
        password=REDIS_PASSWORD,
        encoding='utf-8',
        decode_responses=True,
    )

async def get_redis() -> Redis:
    return redis
views.py
import asyncio
import orjson as json
from fastapi import APIRouter, Depends
from starlette.websockets import WebSocket, WebSocketDisconnect
from aioredis import Redis
from resources import get_redis

router = APIRouter()
channel = 'user:channel'

async def listen_socket(websocket: WebSocket, redis: Redis):
    # Task 1: read from the websocket until the client disconnects.
    while True:
        try:
            data = await websocket.receive_bytes()
        except WebSocketDisconnect:
            await redis.publish(channel, json.dumps({'type': 'disconnect_user'}))
            return None

async def listen_redis(websocket: WebSocket, redis: Redis):
    # Task 2: consume the user's Redis channel and forward to the websocket.
    ps = redis.pubsub()
    await ps.psubscribe(channel)
    async for data in ps.listen():
        if data['type'] == 'pmessage':
            data = json.loads(data['data'])
            event_type = data.get('type')
            if event_type == 'disconnect_user':
                return None
            elif event_type == 'echo':
                await websocket.send_bytes(json.dumps(data))

@router.websocket('/', name='ws')
async def process_ws(websocket: WebSocket, redis: Redis = Depends(get_redis)):
    await websocket.accept()
    await asyncio.gather(
        listen_redis(websocket=websocket, redis=redis),
        listen_socket(websocket=websocket, redis=redis),
    )
The line async for data in ps.listen(): ties up the connection, and that particular connection cannot serve clients on different threads of the same process, not even the current client. Is this true? If yes, then this approach is absolutely not scalable, because we cannot afford 1 Redis connection per user.
What would solve the above issue? 2 Redis connections per process: 1 connection pool, plus 1 connection dedicated to consuming the Redis pub/sub channel? In this case publishing would be done to a process channel, not to a specific user channel, and we would need a task that consumes the pub/sub channel and routes messages to the user websockets connected to that process (see the sketch below). Is this correct?
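To make that concrete, here is a minimal sketch of the per-process dispatcher idea; the channel name, the local_sockets registry, and the user_id field are all hypothetical:

import orjson as json

local_sockets = {}  # user_id -> WebSocket, for users connected to this process

async def redis_dispatcher(redis):
    # One pub/sub connection per process, started as a single background
    # task at startup; websocket handlers register themselves in
    # local_sockets instead of opening their own subscriptions.
    ps = redis.pubsub()
    await ps.subscribe('process:channel')  # hypothetical per-process channel
    async for msg in ps.listen():
        if msg['type'] != 'message':
            continue
        data = json.loads(msg['data'])
        ws = local_sockets.get(data.get('user_id'))
        if ws is not None:
            await ws.send_bytes(json.dumps(data))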
Am I overthinking?
Are there better approaches?
Thank you so much for help!

I recommend that you don't try to implement the functionality you describe manually with just FastAPI and Redis. It is a path of pain and suffering that is unjustified and highly ineffective.
Just use Centrifugo and you'll be happy.

I recommend using queues to scale your real-time application.
e.g. RabbitMQ, or even RPUSH and LPOP with Redis lists if you want to stay with Redis. This approach is much easier to implement than pub/sub and scales great; see the sketch below.
Handling and sharing events bidirectionally with Pub/Sub & WebSockets is a pain in most languages.
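For example, a minimal sketch of the Redis-lists variant, using the blocking BLPOP rather than polling with LPOP (the queue naming and the websocket forwarding are assumptions):

async def produce(redis, user_id, message):
    # RPUSH appends the message to the tail of the user's list (the queue)
    await redis.rpush(f'queue:user:{user_id}', message)

async def consume(redis, user_id, websocket):
    # BLPOP blocks until a message is available, then pops it from the head
    while True:
        _key, raw = await redis.blpop(f'queue:user:{user_id}')
        await websocket.send_text(raw)  # forward to the user's websocket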

Related

Correct use of streamz with websocket

I am trying to figure out a correct way of processing streaming data using streamz. My streaming data is loaded using websocket-client, after which I do this:
from streamz import Stream
from websocket import create_connection  # websocket-client (blocking)
from tornado import gen
from tornado.ioloop import IOLoop

# open a stream and push updates into the stream
stream = Stream()

# establish a connection
ws = create_connection("ws://localhost:8765")

# get continuous updates
async def f():
    while True:
        await gen.sleep(0.001)
        data = ws.recv()
        stream.emit(data)

IOLoop.current().add_callback(f)
While this works, I find that my stream is not able to keep pace with the streaming data (so the data I see in the stream is several seconds behind the streaming data, which is both high volume and high frequency). I tried setting the gen.sleep(0.001) to a smaller value (removing it completely halts the jupyter lab), but the problem remains.
Is this a correct way of connecting streamz with streaming data using websocket?
I don't think websocket-client provides an async API, so it's blocking the event loop.
You should use an async websocket client, such as the one Tornado provides:
from tornado import gen
from tornado.websocket import websocket_connect

async def f():
    # websocket_connect returns a Future, so await it inside a coroutine
    ws = await websocket_connect("ws://localhost:8765")
    while True:
        data = await ws.read_message()
        if data is None:
            break
        else:
            await stream.emit(data)
            # considering you're receiving data from a localhost
            # socket, it will be really fast, and the `await`
            # statement above won't pause the while-loop for
            # enough time for the event loop to have a chance to
            # run other things.
            # Therefore, sleep for a small time to suspend the
            # while-loop.
            await gen.sleep(0.0001)
You don't need to sleep if you're receiving/sending data from/to a remote connection which will be slow enough to suspend the while loop at await statements.

Broadcasting messages from a gRPC server to all/some connected clients in Python

I am learning how to use gRPC streams to exchange messages between clients and a server in Python. I found a base example that enables simple message sending between server and client. I am trying to modify it so that I can keep track of all the clients connected to the gRPC server (on the server side) and do two things: 1) broadcast from the server to all clients, 2) send a message to a particular connected client.
Here is the .proto file
syntax = 'proto3';

service Scenario {
    rpc Chat(stream DPong) returns (stream DPong) {}
}

message DPong {
    string name = 1;
}
And here is client.py, which creates a daemon thread to listen for incoming messages and waits on stdin for any outgoing messages:
import queue
import threading
import time

import grpc
import scenario_pb2_grpc, scenario_pb2

# new changes
msgQueue = queue.Queue()

def run():
    channel = grpc.insecure_channel('localhost:50052')
    stub = scenario_pb2_grpc.ScenarioStub(channel)
    print('client connected')

    def inputStream():
        while 1:
            msg = input('>>Enter message\n>>')
            yield scenario_pb2.DPong(name=msg)

    input_stream = stub.Chat(inputStream())

    def read_incoming():
        while 1:
            print('receivedFromServer: {}\n>>'.format(next(input_stream).name))

    thread = threading.Thread(target=read_incoming)
    thread.daemon = True
    thread.start()

    while 1:
        time.sleep(1)

if __name__ == '__main__':
    print('client starting ...')
    run()
Below is the server.py
import random
import string
import threading
import grpc
import scenario_pb2_grpc
import scenario_pb2
import time
from concurrent import futures
clientList = []
class Scenario(scenario_pb2_grpc.ScenarioServicer):
def Chat(self, request_iterator, context):
clients = []
def stream():
while 1:
time.sleep(1)
msg = input('>>Enter message\n>>')
for i in clientList:
yield msg
output_stream = stream()
def read_incoming():
while 1:
received = next(request_iterator).name
if (context,request_iterator) not in clientList:
clientList.append((context, request_iterator))
print('receivedFromClient: {}'.format(received), len(clientList))
thread = threading.Thread(target=read_incoming)
thread.daemon = True
thread.start()
while 1:
msg = output_stream
yield scenario_pb2.DPong(name=next(msg))
if __name__ == '__main__':
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
scenario_pb2_grpc.add_ScenarioServicer_to_server(
Scenario(), server)
server.add_insecure_port('[::]:50052')
server.start()
print('listening ...')
while 1:
time.sleep(1)
So far, I have tried to maintain a list object clientList that contains the context & request_iterator objects of each client and is updated every time a new client joins the server. But how do I use these objects from clientList before sending out an outgoing message? I have tried to iterate the list, but the server sends the message to the same client (the last client heard from) a number of times instead of sending it to all the clients once.
Any help is highly appreciated!
This is certainly possible. The problem that you're running into here is that each call to Scenario.Chat on the server side corresponds to a single client connection. That is, this function is called when the streaming RPC starts and as soon as the function exits, the RPC ends.
So if you want n connected clients, you'll need n instances of Scenario.Chat running concurrently, each on its own thread. This does mean that the number of concurrently connected clients is limited by the size of the threadpool with which you instantiate your server.
So, let's say you have n threads in your server process dedicated to maintaining client connections. Then you need another n+1th thread (perhaps the main thread) determining when the server will broadcast a message to all clients (maybe by looking for input from STDIN?). When this extra thread determines that a message should be broadcast, it needs to communicate this intent to all of the threads maintaining connections to a client. There are many ways to make this happen. A threading.Condition and a global collections.deque, or a collections.deque per client connection (somewhat like channels between goroutines) would be two ways. The tricky bit here is ensuring that each client connection will receive the message regardless of how long the client connection thread takes to wake up and how many messages the n+1th thread decides to send in the interim.
If this is still unclear, I can follow up with some actual code demonstrating the idea.
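For instance, here is a minimal sketch of the per-client-queue variant of that idea (broadcast side only; the reading of request_iterator is omitted, and all names beyond the generated scenario_pb2 types are hypothetical):

import queue
import threading

import scenario_pb2
import scenario_pb2_grpc

clients_lock = threading.Lock()
client_queues = []  # one queue per connected client

class Scenario(scenario_pb2_grpc.ScenarioServicer):
    def Chat(self, request_iterator, context):
        # Each connected client gets its own queue; this handler blocks
        # on it and streams out whatever the broadcaster thread puts in.
        q = queue.Queue()
        with clients_lock:
            client_queues.append(q)
        try:
            while True:
                msg = q.get()
                yield scenario_pb2.DPong(name=msg)
        finally:
            with clients_lock:
                client_queues.remove(q)

def broadcast(msg):
    # Called from the extra (n+1th) thread, e.g. after reading STDIN.
    with clients_lock:
        for q in client_queues:
            q.put(msg)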
You can spin up multiple ports in one application.
gRPC can be running on port 50011, and Flask with socket.io can be running on port 8080.
With Python, you can use the Flask framework and the flask_socketio library in your server.py,
eg server.py

from flask import Flask
from flask_socketio import SocketIO, emit

app = Flask(__name__)
socketio = SocketIO(app)

@app.route('/')
def index():
    return "Hello, World!"

if __name__ == '__main__':
    # socketio.run wraps app.run, so one call serves both HTTP and Socket.IO
    socketio.run(app, port=8080, debug=True)
Instead of using the gRPC streaming API, use WebSocket to broadcast to all connected clients, and to specific/selected clients using rooms.
eg

@socketio.on('message')
def handle_message(data):
    # logic to send large data in chunks: the logic should call the
    # emit function in socket.io and emit an event that sends the
    # large data in chunks, e.g. emit('my response', chunkData)
    emit('my response', data, broadcast=True)
gRPC is primarily built for one client's request and response, while WebSocket is built for addressing multiple clients.
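A rough sketch of the rooms part (the register event and client_id are assumptions):

from flask_socketio import join_room

@socketio.on('register')
def register(client_id):
    # each client joins its own room, so it can be addressed individually
    join_room(client_id)

def send_to_client(client_id, payload):
    # emitting to a single room reaches just that one client
    socketio.emit('my response', payload, room=client_id)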

How to connect queues to a ZeroMQ PUB/SUB

Consider the following:
a set of 3 logical services: S1, S2 and S3
two instances of each service are running, so we have the following processes: S1P1, S1P2, S2P1, S2P2, S3P1, S3P2
a ZeroMQ broker running in a single process and reachable by all service processes
A logical service, let's say S1, publishes a message M1 that is of interest to logical services S2 and S3. Only one process of each logical service must receive M1, so let's say S2P1 and S3P2.
I have tried the following, but without success:
broker thread 1 is running a XSUB/XPUB proxy
broker thread 2 is running a ROUTER/DEALER proxy with the ROUTER connected to the XPUB socket and subscribed to everything (for logical S1)
broker thread 3 is running a ROUTER/DEALER proxy with the ROUTER connected to the XPUB socket and subscribed to everything (for logical S2)
broker thread 4 is running a ROUTER/DEALER proxy with the ROUTER connected to the XPUB socket and subscribed to everything (for logical S3)
each logical service process is running a REP socket thread connected to the broker DEALER socket
I figured that the XSUB/XPUB proxy would give me publish/subscribe semantics and that the ROUTER/DEALER proxies would introduce a competition between the REP sockets for the messages sent by the XSUB/XPUB proxy.
How can I combine ZeroMQ sockets to accomplish this?
Update1
I know "without success" isn't helpful, I've tried different configurations and got different errors. The latest configuration I tried is the following:
(XSUB proxy=> XPUB) => (SUB copyLoop=> REQ) => (ROUTER proxy=> DEALER) => REP
The copyLoop goes like this:
public void start() {
    context = ZMQ.context(1);
    subSocket = context.socket(ZMQ.SUB);
    subSocket.connect(subSocketUrl);
    subSocket.subscribe("".getBytes());
    reqSocket = context.socket(ZMQ.REQ);
    reqSocket.connect(reqSocketUrl);
    while (!Thread.currentThread().isInterrupted()) {
        final Message msg = receiveNextMessage();
        resendMessage(msg);
    }
}

private Message receiveNextMessage() {
    final String header = subSocket.recvStr();
    final String entity = subSocket.recvStr();
    return new Message(header, entity);
}

private void resendMessage(Message msg) {
    reqSocket.sendMore(msg.getKey());
    reqSocket.send(msg.getData(), 0);
}
The exception I get is the following:
java.lang.IllegalStateException: Cannot send another request
at zmq.Req.xsend(Req.java:51) ~[jeromq-0.3.4.jar:na]
at zmq.SocketBase.send(SocketBase.java:613) ~[jeromq-0.3.4.jar:na]
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1206) ~[jeromq-0.3.4.jar:na]
at org.zeromq.ZMQ$Socket.sendMore(ZMQ.java:1189) ~[jeromq-0.3.4.jar:na]
at com.xyz.messaging.zeromq.SubReqProxyConnector.resendMessage(SubReqProxyConnector.java:47) ~[classes/:na]
at com.xyz.messaging.zeromq.SubReqProxyConnector.start(SubReqProxyConnector.java:35) ~[classes/:na]
I'm running JeroMQ 0.3.4, Oracle Java 8 JVM and Windows 7.
You seem to be adding in some complexity with your ROUTER connection - you should be able to do everything connected directly to your publisher.
The error you're currently running into is that REQ sockets have a strict message ordering pattern - you are not allowed to send() twice in a row, you must send/receive/send/receive/etc (likewise, REP sockets must receive/send/receive/send/etc). From what it looks like, you're just doing send/send/send/etc on your REQ socket without ever receiving a response. If you don't care about a response from your peer, then you must receive and discard it or use DEALER (or ROUTER, but DEALER makes more sense in your current diagram).
I've created a diagram of how I would accomplish this architecture below - using your basic process structure.
Broker T1          Broker T2           Broker T3           Broker T4
 (PUB) ---------> (SUB)--(DEALER)
    \-----------------------------> (SUB)--(DEALER)
     \----------------------------------------------> (SUB)--(DEALER)
                    |      |            |      |            |      |
                 (REP)  (REP)        (REP)  (REP)        (REP)  (REP)
                 S1P1   S1P2         S2P1   S2P2         S3P1   S3P2
So, the main difference is that I've ditched your (SUB copyLoop=> REQ) step. Whether you choose XPUB/XSUB vs PUB/SUB is up to you, but I would tend to start simpler unless you currently want to make use of the extra features of XPUB/XSUB.
Obviously this diagram doesn't deal with how information enters your broker, where you currently show an XSUB socket - that's out of scope for the information you've provided thus far, presumably you're able to receive information into your broker successfully already so I won't deal with that.
I assume your broker threads that are dedicated to each service are making intelligent choices on whether to send the message to their service or not? If so, then your choice of having them subscribed to everything should work fine, otherwise more intelligent subscription setups might be necessary.
If you're using a REP socket on your service processes, then the service process must take that message and deal with it asynchronously, never communicating back any details about that message to the broker. It must then respond to each message with an acknowledgement (like "RECEIVED") so that it follows the strict receive/send/receive/send pattern for REP sockets.
If you want any other type of communication about how the service handles that message sent back to the broker, REP is no longer the appropriate socket type for your service processes, and DEALER may no longer be the correct socket type for your broker. If you want some form of load balancing so that you send to the next open service process, you'll need to use ROUTER/REQ and have each service indicate its availability and have the broker hold on to the message until the next service process says its available by sending results back. If you want some other type of message handling, you'll have to indicate what that is so a suitable architecture can be proposed.
Clearly I got mixed up with a few elements:
Sockets have the same API whether you're using it as a client-side socket (Socket.connect) or a server-side socket (Socket.bind)
Sockets have the same API regardless of the type (e.g. Socket.subscribe should not be called on a PUSH socket)
Some socket types require a send/receive response loop (e.g. REQ/REP)
Some nuances in communication patterns (PUSH/PULL vs ROUTER/DEALER)
The difficulty (impossibility?) of debugging a ZeroMQ setup
So a big thanks to Jason for his incredibly detailed answer (and awesome diagram!) that pointed me to the right direction.
I ended up with the following design:
broker thread 1 is running a fan-out XSUB/XPUB proxy on bind(localhost:6000) and bind(localhost:6001)
broker thread 2 is running a queuing SUB/PUSH proxy on connect(localhost:6001) and bind(localhost:6002); broker threads 3 and 4 use a similar design with different bind port numbers
message producers connect to the broker using a PUB socket on connect(localhost:6000)
message consumers connect to the broker queuing proxy using a PULL socket on connect(localhost:6002)
On top of this service-specific queuing mechanism, I was able to add a similar service-specific fan-out mechanism rather simply:
broker thread runs a SUB/PUB proxy on connect(localhost:6001) and bind(localhost:6003)
message producers still connect to the broker using a PUB socket on connect(localhost:6000)
message consumers connect to the broker fan-out proxy using a SUB socket on connect(localhost:6003)
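For reference, a minimal pyzmq sketch of this design's queuing path (one XSUB/XPUB fan-out proxy plus one SUB/PUSH queuing proxy per service; the port numbers for the second and third queuing proxies are assumptions, and per-service subscription filtering is omitted):

import threading
import zmq

ctx = zmq.Context.instance()

def fanout_proxy():
    # broker thread 1: XSUB/XPUB proxy
    xsub = ctx.socket(zmq.XSUB)
    xsub.bind("tcp://*:6000")    # producers connect their PUB sockets here
    xpub = ctx.socket(zmq.XPUB)
    xpub.bind("tcp://*:6001")    # internal proxies connect their SUB sockets here
    zmq.proxy(xsub, xpub)

def queuing_proxy(port):
    # broker threads 2-4: SUB/PUSH proxy; PUSH load-balances
    # messages across the connected PULL consumers
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://localhost:6001")
    sub.setsockopt(zmq.SUBSCRIBE, b"")  # a real broker would filter per service
    push = ctx.socket(zmq.PUSH)
    push.bind(f"tcp://*:{port}")
    zmq.proxy(sub, push)

threading.Thread(target=fanout_proxy, daemon=True).start()
for port in (6002, 6004, 6006):
    threading.Thread(target=queuing_proxy, args=(port,), daemon=True).start()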
This has been an interesting ride.

Polling if I can PUSH or send in zmq?

Using 0mq, I am trying to detect whether I have made a successful connection to a PULL port, and whether I can PUSH. However, it didn't work as I expected; see the example code below. The poller returns immediately even when the remote peer hasn't started accepting connections. Is there a way to fix it?
import sys
import zmq

context = zmq.Context()
pusher = context.socket(zmq.PUSH)
pusher.connect("tcp://localhost:5555")

poller = zmq.Poller()
poller.register(pusher, zmq.POLLOUT)
socks = dict(poller.poll(timeout=1000))
if pusher in socks and socks[pusher] == zmq.POLLOUT:
    print("Pusher can push")
else:
    print("Failed to connect, exit.")
    sys.exit(1)
You would be allowed to send as long as you haven't reached the High Water Mark (HWM) of the sending socket - the number of messages allowed to pile up on the sender side.
By default it is set to 1000 as far as I remember.
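A rough way to observe this (a sketch, assuming nothing is listening on port 5555; HWM accounting in ZeroMQ is approximate, so the exact cut-off may vary):

import zmq

ctx = zmq.Context.instance()
pusher = ctx.socket(zmq.PUSH)
pusher.setsockopt(zmq.SNDHWM, 1)  # allow only 1 queued outgoing message
pusher.connect("tcp://localhost:5555")

pusher.send(b"queued locally")    # succeeds even with no peer: it is buffered

poller = zmq.Poller()
poller.register(pusher, zmq.POLLOUT)
# once the outgoing queue is full, POLLOUT is no longer reported
print(dict(poller.poll(timeout=100)))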
/Søren

ZeroMQ: Many-to-one no-reply async messages

I have read through the zguide but haven't found the kind of pattern I'm looking for:
There is one central server (with a known endpoint) and many clients (which may come and go).
Clients keep sending heartbeats to the server, but they don't want the server to reply.
The server receives heartbeats, but it does not reply to clients.
Heartbeats sent while clients and server are disconnected should somehow be dropped, to prevent a heartbeat flood when they go back online.
The closest I can think of is the DEALER-ROUTER pattern, but since this is meant to be used as an async REQ-REP pattern (no?), I'm not sure what would happen if the server just kept silent on incoming "requests." Also, the DEALER socket would block rather than start dropping heartbeats when the send High Water Mark is reached, which would still result in a heartbeat flood.
The PUSH/PULL pattern should give you what you need.
# Client example
import zmq

class Client(object):
    def __init__(self, client_id):
        self.client_id = client_id
        ctx = zmq.Context.instance()
        self.socket = ctx.socket(zmq.PUSH)
        self.socket.connect("tcp://localhost:12345")

    def send_heartbeat(self):
        self.socket.send_string(str(self.client_id))

# Server example
import zmq

class Server(object):
    def __init__(self):
        ctx = zmq.Context.instance()
        self.socket = ctx.socket(zmq.PULL)
        self.socket.bind("tcp://*:12345")

    def receive_heartbeat(self):
        # returns the client_id of the message's sender
        return self.socket.recv_string()
This PUSH/PULL pattern works with multiple clients, as you wish. The server should keep an administration of the received heartbeats (i.e. a dictionary like {client_id: last_received}, updated with datetime.utcnow() on each received message) and implement some housekeeping function to periodically check the administration for clients with old timestamps; see the sketch below.
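A minimal sketch of that housekeeping administration (the timeout value and function names are hypothetical):

from datetime import datetime, timedelta

last_seen = {}                   # client_id -> time of last heartbeat
TIMEOUT = timedelta(seconds=10)  # hypothetical liveness threshold

def on_heartbeat(client_id):
    # call this for every message the PULL socket receives
    last_seen[client_id] = datetime.utcnow()

def reap_dead_clients():
    # run this periodically, e.g. from a timer or a housekeeping thread
    now = datetime.utcnow()
    for client_id, ts in list(last_seen.items()):
        if now - ts > TIMEOUT:
            del last_seen[client_id]  # client considered gone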
