Is there a concept of acknowledgements in Redis Pub/Sub?
For example, when using RabbitMQ, I can have two workers running on separate machines and when I publish a message to the queue, only one of the workers will ack/nack it and process the message.
However, I have discovered that with Redis Pub/Sub, both workers will process the message.
Consider this simple example: I have this goroutine running on two different machines/clients:
go func() {
    for {
        switch n := pubSubClient.Receive().(type) {
        case redis.Message:
            process(n.Data)
        case redis.Subscription:
            if n.Count == 0 {
                return
            }
        case error:
            log.Print(n)
        }
    }
}()
When I publish a message:
conn.Do("PUBLISH", "tasks", "task A")
Both goroutines will receive it and run the process function.
Is there a way of achieving behaviour similar to RabbitMQ? E.g. the first worker to ack the message will be the only one to receive and process it.
Redis PubSub is more like a broadcast mechanism.
If you want queues, you can use BLPOP along with RPUSH to get the same interaction. Keep in mind that RabbitMQ does all sorts of other stuff that is not really there in Redis. But if you are looking for simple job scheduling / request handling, this will work just fine.
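A minimal sketch of that pattern with redigo, assuming a Redis instance at localhost:6379 and an illustrative list key named "tasks"; with BLPOP, each pushed task is delivered to exactly one of the blocked workers:
package main

import (
    "log"

    "github.com/gomodule/redigo/redis"
)

func main() {
    conn, err := redis.Dial("tcp", "localhost:6379")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Producer: push a task onto the list.
    if _, err := conn.Do("RPUSH", "tasks", "task A"); err != nil {
        log.Fatal(err)
    }

    // Worker: block until a task is available. BLPOP returns [key, value],
    // and only one blocked worker receives each pushed task.
    reply, err := redis.Strings(conn.Do("BLPOP", "tasks", 0))
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("processing %s from %s", reply[1], reply[0])
}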
No, Redis' PubSub does not guarantee delivery nor does it limit the number of possible subscribers who'll get the message.
Redis Streams (introduced in Redis 5.0) support acknowledgement of tasks as they are completed by a consumer group.
https://redis.io/topics/streams-intro
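A hedged sketch of the consumer-group flow with redigo; the stream name "tasks", group "workers", and consumer "worker-1" are illustrative, and a Redis 5.0+ instance at localhost:6379 is assumed:
package main

import (
    "log"

    "github.com/gomodule/redigo/redis"
)

func main() {
    conn, err := redis.Dial("tcp", "localhost:6379")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Create the consumer group once (the error if it already exists can be ignored).
    conn.Do("XGROUP", "CREATE", "tasks", "workers", "$", "MKSTREAM")

    // Producer: append an entry to the stream.
    id, err := redis.String(conn.Do("XADD", "tasks", "*", "payload", "task A"))
    if err != nil {
        log.Fatal(err)
    }

    // Worker: read one entry on behalf of the group; each entry is delivered
    // to a single consumer in the group until it is acknowledged.
    if _, err := redis.Values(conn.Do("XREADGROUP", "GROUP", "workers", "worker-1",
        "COUNT", 1, "BLOCK", 0, "STREAMS", "tasks", ">")); err != nil {
        log.Fatal(err)
    }

    // After processing, acknowledge the entry so it is not redelivered.
    // (In a real worker the ID comes from the XREADGROUP reply.)
    if _, err := conn.Do("XACK", "tasks", "workers", id); err != nil {
        log.Fatal(err)
    }
}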
I have a scenario where clients can connect to a server via gRPC and I would like to implement backpressure on it, meaning that I would like to accept many simultaneous requests (say 10,000), but have only 50 simultaneous threads executing the requests (this is inspired by the Apache Tomcat NIO interface behaviour). I would also like the communication to be asynchronous, in a reactive manner, meaning that the client sends the request without waiting on it, the server sends the response back later, and the client then executes some function registered to run on it.
How can I do that in Go gRPC? Should I use streams? Is there any example?
The Go API is a synchronous API; this is how Go usually works. You block in a loop until an event happens, and then you proceed to handle that event. With respect to having more simultaneous threads executing requests, gRPC doesn't control that on the client side. On the client side, at the application layer above gRPC, you can fork more goroutines, each executing requests (see the sketch below). The server side already forks a goroutine for each accepted connection, and even for each stream on the connection, so there is already inherent concurrency on the server side.
Note that there are no threads in Go; Go uses goroutines.
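A hedged sketch of that client-side bounding, using the limits from the question (up to 10,000 requests queued as goroutines, 50 executing at once); callRPC and Request are placeholders for the real gRPC stub call and message type, and the "sync" package is assumed to be imported:
// sem caps concurrent executions at 50 while any number of goroutines
// (one per accepted request) can be waiting on it.
sem := make(chan struct{}, 50)
var wg sync.WaitGroup
for _, req := range requests {
    wg.Add(1)
    go func(r Request) {
        defer wg.Done()
        sem <- struct{}{}        // acquire an execution slot
        defer func() { <-sem }() // release it when done
        callRPC(r)               // placeholder for the actual gRPC call
    }(req)
}
wg.Wait()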
The behavior described is already built into the gRPC server. For example, see this option:
// NumStreamWorkers returns a ServerOption that sets the number of worker
// goroutines that should be used to process incoming streams. Setting this to
// zero (default) will disable workers and spawn a new goroutine for each
// stream.
//
// # Experimental
//
// Notice: This API is EXPERIMENTAL and may be changed or removed in a
// later release.
func NumStreamWorkers(numServerWorkers uint32) ServerOption {
    // TODO: If/when this API gets stabilized (i.e. stream workers become the
    // only way streams are processed), change the behavior of the zero value to
    // a sane default. Preliminary experiments suggest that a value equal to the
    // number of CPUs available is most performant; requires thorough testing.
    return newFuncServerOption(func(o *serverOptions) {
        o.numServerWorkers = numServerWorkers
    })
}
The workers are initialized at some point:
// initServerWorkers creates worker goroutines and channels to process incoming
// connections to reduce the time spent overall on runtime.morestack.
func (s *Server) initServerWorkers() {
    s.serverWorkerChannels = make([]chan *serverWorkerData, s.opts.numServerWorkers)
    for i := uint32(0); i < s.opts.numServerWorkers; i++ {
        s.serverWorkerChannels[i] = make(chan *serverWorkerData)
        go s.serverWorker(s.serverWorkerChannels[i])
    }
}
I suggest you read the server code yourself, to learn more.
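For instance, a minimal sketch of enabling the option; the worker count of 50 simply mirrors the question, and the option is experimental, so its behaviour may change:
import "google.golang.org/grpc"

func newServer() *grpc.Server {
    // Process incoming streams with a fixed pool of 50 worker goroutines
    // instead of spawning a new goroutine per stream.
    return grpc.NewServer(grpc.NumStreamWorkers(50))
    // Register your services on the returned server as usual.
}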
Below is a service with a set of 3 goroutines that process a message from Kafka:
Channel-1 & Channel-2 are unbuffered data channels in Go. A channel is like a queueing mechanism.
Goroutine-1 reads a message from a Kafka topic and, after validating the message, puts its payload on Channel-1.
Goroutine-2 reads from Channel-1, processes the payload, and puts the processed payload on Channel-2.
Goroutine-3 reads from Channel-2, encapsulates the processed payload in an HTTP request, and sends it (using an HTTP client) to another service.
Loophole in the above flow: in our case, processing fails either due to bad network connections between services or because the remote service is not ready to accept HTTP requests from Goroutine-3 (HTTP client timeout), and as a result the service loses that message (already read from the Kafka topic).
Goroutine-1 currently consumes the message from Kafka without sending an acknowledgement back to Kafka (to inform it that the specific message was processed successfully by Goroutine-3).
Correctness is preferred over performance.
How to ensure that every message is processed successfully?
E.g., add feedback from Goroutine-3 to Goroutine-1 through a new Channel-3. Goroutine-1 will block until it gets an acknowledgement from Channel-3.
// in goroutine 1
channel1 <- data
select {
case <-channel3:
    // acknowledgement received; safe to commit the Kafka offset
case <-ctx.Done(): // or something else to prevent a deadlock
}
...
// in goroutine 3
data := <-channel2
for {
    if err := sendData(data); err == nil {
        break
    }
    // optionally back off here before retrying
}
channel3 <- struct{}{}
To ensure correctness, you need to commit (i.e. acknowledge) the message only after processing has finished successfully.
For the cases when processing doesn't finish successfully, you generally need to implement a retry mechanism yourself.
The details are specific to your use case, but generally you publish the message to a dedicated Kafka retry topic (that you create), add a delay, and process the message again. If the processing still fails after x attempts, you publish the message to a DLQ (dead letter queue).
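A hedged sketch of that flow; Message (carrying an attempt counter, e.g. in a header), process, commit, produceTo and retryDelay are hypothetical helpers standing in for your business logic and Kafka client, and the topic names and attempt limit are illustrative:
const maxAttempts = 3 // the "x attempts" above; choose what fits your use case

// handle is a hypothetical per-message entry point in Goroutine-3's place.
func handle(msg Message) {
    if err := process(msg); err == nil {
        commit(msg) // acknowledge the Kafka offset only after success
        return
    }
    if msg.Attempts+1 >= maxAttempts {
        produceTo("tasks-dlq", msg) // give up: dead letter queue
        return
    }
    msg.Attempts++
    time.Sleep(retryDelay)        // or apply the delay when consuming the retry topic
    produceTo("tasks-retry", msg) // dedicated retry topic, consumed again later
}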
You can read more here:
https://eng.uber.com/reliable-reprocessing/
https://www.confluent.io/blog/error-handling-patterns-in-kafka/
A program sends data to an API using N concurrent workers, implemented as goroutines that consume data from a channel (producer/consumer pattern). The API signals that it can't handle more using HTTP status codes and demands a back-off.
How do I block all workers until the back-off interval has passed?
Where do I put the requests that failed, so they can be retried?
Any links/pointers to this presumably already solved problem are much appreciated!
You can use select to control calling the API:
for _, k := range data {
    // Non-blocking check: if a back-off was signalled, pause before calling.
    select {
    case <-backoff:
        time.Sleep(backoffDuration)
    default:
    }
    // Call the API with k.
    // Check the HTTP status code; on a back-off response, trigger the backoff channel:
    // backoff <- 1
}
Here is the set-up:
Pass the same backoff channel to all goroutines.
Set backoffDuration. This is tricky, as all goroutines should be able to set this value and all others should be able to read it. One method is to use a closure (see the sketch below).
Once these two are set up, you can control API calls by manipulating the backoff channel and backoffDuration to control how long a routine will pause.
Disclaimer: This is just pseudo-code.
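To make the closure idea above concrete, a hedged sketch of sharing backoffDuration between goroutines through getter/setter closures guarded by a mutex (assumes the "sync" and "time" packages are imported):
func newBackoff(initial time.Duration) (get func() time.Duration, set func(time.Duration)) {
    var mu sync.Mutex
    d := initial
    get = func() time.Duration {
        mu.Lock()
        defer mu.Unlock()
        return d
    }
    set = func(v time.Duration) {
        mu.Lock()
        defer mu.Unlock()
        d = v
    }
    return get, set
}

// Usage: every worker shares the same pair.
// getBackoff, setBackoff := newBackoff(time.Second)
// setBackoff(5 * time.Second) // after a back-off response
// time.Sleep(getBackoff())    // when pausing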
You can check Hashicorp's library here. It looks like it will solve your problem.
My scenario:
I have a producer and a consumer. Both are goroutines, and they communicate through one channel.
The producer is capable of (theoretically) generating a message at any time.
Generating a message requires some computation.
The message is somewhat time-sensitive (i.e. the older it is, the less relevant it is).
The consumer reads from the channel occasionally. For the sake of this example, let's say the consumer uses a time.Ticker to read a message once every couple of seconds.
The consumer would prefer "fresh" messages (i.e. messages that were generated as recently as possible).
So, the question is: How can the producer generate a message as late as possible?
Sample code that shows the general idea:
func producer() {
    for {
        select {
        ...
        case pipe <- generateMsg():
            // I'd like to call generateMsg as late as possible,
            // i.e. calculate the timestamp when I know
            // that writing to the channel will not block.
        }
    }
}
func consumer() {
    for {
        select {
        ...
        case <-timeTicker.C:
            // Read from the pipe in the consumer.
            msg := <-pipe
            ...
        }
    }
}
Full code (slightly different from above) is available at the Go Playground: https://play.golang.org/p/y0oCf39AV6P
One idea I had was to check if writing to a channel would block. If it wouldn't block, then I can generate a message and then send it. However…
I couldn't find any way to test if writing to a channel would block or not.
In the general case, this is a bad idea because it introduces a race condition if we have multiple producers. In this specific case, I only have one producer.
Another (bad) idea:
func producer() {
    var msg Message
    for {
        // This is BAD. DON'T DO THIS!
        select {
        case pipe <- msg:
            // It may send the same message multiple times.
        default:
            msg = generateMsg()
            // It causes a busy-wait loop, high CPU usage
            // because it re-generates the message all the time.
        }
    }
}
This answer (for Go non-blocking channel send, test-for-failure before attempting to send?) suggests using a second channel to send a signal from the consumer to the producer:
Consumer wants to get a message (e.g. after receiving a tick from timer.Ticker).
Consumer sends a signal through a side channel to the producer goroutine. (So, for this side channel, the producer/consumer roles are reversed).
Producer receives the signal from the side channel.
Producer starts computing the real message.
Producer sends the message through the main channel.
Consumer receives the message.
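A hedged, runnable sketch of those steps with a single producer and consumer; the names (request, pipe, generateMsg, Message) are illustrative and the ticker interval is arbitrary:
package main

import (
    "fmt"
    "time"
)

type Message struct{ CreatedAt time.Time }

func generateMsg() Message { return Message{CreatedAt: time.Now()} }

func producer(request <-chan struct{}, pipe chan<- Message) {
    for range request {
        // The message is computed only after the consumer asks for it,
        // so it is as fresh as possible.
        pipe <- generateMsg()
    }
}

func consumer(request chan<- struct{}, pipe <-chan Message) {
    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop()
    for i := 0; i < 3; i++ {
        <-ticker.C
        request <- struct{}{} // signal the producer via the side channel
        msg := <-pipe         // receive the freshly generated message
        fmt.Println("got message generated at", msg.CreatedAt)
    }
}

func main() {
    request := make(chan struct{})
    pipe := make(chan Message)
    go producer(request, pipe)
    consumer(request, pipe)
}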
I'm using the Go Redis client redigo to write images to ~20 Redis servers.
Speed is an important factor here and I'm just sending SET commands to Redis, so I'm using Send and Flush without calling Receive.
After a few hours I'm getting "connection reset by peer" on the client.
I was wondering, does it have something to do with the fact that I don't call Receive?
Maybe my RX queue is just reaching its maximum capacity because I don't empty it with Receive?
Thank you.
An application must call Receive to clear the responses from the server and to check for errors. If the application is not pipelining commands, then it's best to call Do. Do combines Send, Flush and Receive.
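A hedged sketch of pipelining with explicit Receive calls; conn and payload are assumed from the surrounding code and the key names are illustrative:
for i := 0; i < 3; i++ {
    if err := conn.Send("SET", fmt.Sprintf("img:%d", i), payload); err != nil {
        log.Fatal(err)
    }
}
if err := conn.Flush(); err != nil {
    log.Fatal(err)
}
// One Receive per Send: this drains the replies and surfaces any errors.
for i := 0; i < 3; i++ {
    if _, err := conn.Receive(); err != nil {
        log.Fatal(err)
    }
}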
If you don't care about errors, then start a goroutine to read the responses:
go func(c redis.Conn) {
    // Drain replies so the pending responses don't pile up;
    // stop once the connection reports an error.
    for c.Err() == nil {
        c.Receive()
    }
}(conn)