How to make every message process successfully? - go

Below is a service with a set of 3 goroutines that process a message from Kafka:
Channel-1 and Channel-2 are unbuffered data channels in Go. A channel acts like a queuing mechanism.
Goroutine-1 reads a message from a Kafka topic, validates it, and puts the message payload on Channel-1.
Goroutine-2 reads from Channel-1, processes the payload, and puts the processed payload on Channel-2.
Goroutine-3 reads from Channel-2, wraps the processed payload in an HTTP request, and sends it (using an HTTP client) to another service.
Loophole in the above flow: in our case, processing fails either because of a bad network connection between the services or because the remote service is not ready to accept HTTP requests from Goroutine-3 (HTTP client timeout). As a result, the service loses that message (already read from the Kafka topic).
Goroutine-1 currently consumes the message from Kafka without sending an acknowledgement back to Kafka (to confirm that the specific message was processed successfully by Goroutine-3).
Correctness is preferred over performance.
How to ensure that every message is processed successfully?

E.g., add a feedback path from Goroutine-3 to Goroutine-1 through a new Channel-3. Goroutine-1 will block until it gets an acknowledgement on Channel-3.
// in goroutine 1
channel1 <- data
select {
case <-channel3:
	// Goroutine-3 confirmed delivery; safe to acknowledge the Kafka message.
case <-ctx.Done():
	// Cancellation/timeout, or something else to prevent a deadlock.
}
...
// in goroutine 3
data := <-channel2
for {
	if err := sendData(data); err == nil {
		break
	}
	// Optionally back off here before retrying.
}
channel3 <- struct{}{}

To ensure correctness you need to commit (= acknowledge) the message only after processing has finished successfully.
For the cases where processing does not finish successfully, you generally need to implement a retry mechanism yourself.
The details are specific to your use case, but a common approach is to publish the message to a dedicated Kafka retry topic (that you create), add a sleep, and process the message again. If the processing still fails after x attempts, you move the message to a DLQ (= dead letter queue); a sketch of this flow follows the links below.
You can read more here:
https://eng.uber.com/reliable-reprocessing/
https://www.confluent.io/blog/error-handling-patterns-in-kafka/
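For illustration, here is a minimal sketch of that commit-after-success flow with retries and a DLQ, assuming the segmentio/kafka-go client; the retry limit, backoff, topic wiring and the sendData placeholder are assumptions for the example, not part of the original setup.

package main

import (
	"context"
	"time"

	"github.com/segmentio/kafka-go"
)

// sendData stands in for Goroutine-3's HTTP call to the remote service.
func sendData(ctx context.Context, payload []byte) error { return nil }

func consume(ctx context.Context, r *kafka.Reader, dlq *kafka.Writer) error {
	for {
		// FetchMessage does not auto-commit; the offset only advances
		// when CommitMessages is called explicitly below.
		msg, err := r.FetchMessage(ctx)
		if err != nil {
			return err
		}

		processed := false
		for attempt := 0; attempt < 5; attempt++ { // arbitrary retry limit
			if err := sendData(ctx, msg.Value); err == nil {
				processed = true
				break
			}
			time.Sleep(time.Second << attempt) // simple exponential backoff
		}

		if !processed {
			// Give up and park the message on the dead-letter topic
			// configured on the dlq writer.
			if err := dlq.WriteMessages(ctx, kafka.Message{Value: msg.Value}); err != nil {
				return err
			}
		}

		// Only now acknowledge (commit) the message back to Kafka.
		if err := r.CommitMessages(ctx, msg); err != nil {
			return err
		}
	}
}

func main() {
	// Wiring of the kafka.Reader / kafka.Writer (brokers, topics, consumer
	// group) is omitted here.
}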

Related

How can I compute the message to be sent on a channel as late as possible?

My scenario:
I have a producer and a consumer. Both are goroutines, and they communicate through one channel.
The producer is capable of (theoretically) generating a message at any time.
Generating a message requires some computation.
The message is somewhat time-sensitive (i.e. the older it is, the less relevant it is).
The consumer reads from the channel occasionally. For the sake of this example, let's say the consumer uses a time.Ticker to read a message once every couple of seconds.
The consumer would prefer "fresh" messages (i.e. messages that were generated as recently as possible).
So, the question is: How can the producer generate a message as late as possible?
Sample code that shows the general idea:
func producer() {
	for {
		select {
		...
		case pipe <- generateMsg():
			// I'd like to call generateMsg as late as possible,
			// i.e. calculate the timestamp when I know
			// that writing to the channel will not block.
		}
	}
}

func consumer() {
	for {
		select {
		...
		case <-timeTicker.C:
			// Read a message from the pipe.
			msg := <-pipe
			...
		}
	}
}
Full code (slightly different from above) is available at the Go Playground: https://play.golang.org/p/y0oCf39AV6P
One idea I had was to check whether writing to the channel would block. If it wouldn't block, I could generate a message and then send it. However…
I couldn't find any way to test whether writing to a channel would block.
In the general case, this is a bad idea because it introduces a race condition if there are multiple producers. In this specific case, I only have one producer.
Another (bad) idea:
func producer() {
	var msg Message
	for {
		// This is BAD. DON'T DO THIS!
		select {
		case pipe <- msg:
			// It may send the same message multiple times.
		default:
			msg = generateMsg()
			// It causes a busy-wait loop, high CPU usage
			// because it re-generates the message all the time.
		}
	}
}
This answer (for Go non-blocking channel send, test-for-failure before attempting to send?) suggests using a second channel to send a signal from the consumer to the producer:
Consumer wants to get a message (e.g. after receiving a tick from timer.Ticker).
Consumer sends a signal through a side channel to the producer goroutine. (So, for this side channel, the producer/consumer roles are reversed).
Producer receives the signal from the side channel.
Producer starts computing the real message.
Producer sends the message through the main channel.
Consumer receives the message.
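For illustration, a rough, self-contained sketch of that side-channel pattern (the channel names, the Message type and the ticker interval are made up for the example):

package main

import (
	"fmt"
	"time"
)

type Message struct {
	Generated time.Time
}

// generateMsg stands in for the expensive, time-sensitive computation.
func generateMsg() Message { return Message{Generated: time.Now()} }

func producer(request <-chan struct{}, pipe chan<- Message) {
	for range request {
		// The message is only generated once the consumer has asked for
		// one, i.e. as late as possible.
		pipe <- generateMsg()
	}
}

func consumer(request chan<- struct{}, pipe <-chan Message) {
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		request <- struct{}{} // signal the producer on the side channel
		msg := <-pipe         // then receive the freshly generated message
		fmt.Println("received message generated at", msg.Generated)
	}
}

func main() {
	request := make(chan struct{})
	pipe := make(chan Message)
	go producer(request, pipe)
	consumer(request, pipe)
}

Because pipe is unbuffered, generateMsg only runs after the consumer has signalled that it is about to receive, so the message is as fresh as possible.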

reconnectable websocket which drops message while reconnecting

I'm implementing a websocket client in Go.
I have to send several messages in one websocket session.
To deal with network problems, I need to reconnect to the websocket server whenever a connection is accidentally closed.
Currently I'm thinking of an implementation like this:
for {
	select {
	case t := <-message:
		err := connection.WriteMessage(websocket.TextMessage, []byte(t.String()))
		if err != nil {
			// If session is disconnected.
			// Try to reconnect session here.
			connection.reconnect()
		}
	case t := <-errSignal:
		panic(t)
	}
}
In the above example, messages stack up while reconnecting.
This is not desirable for my purpose.
How can I drop websocket messages while reconnecting?
I take it message is a buffered channel. It's not clear to me exactly what behavior you're asking for with regard to dropping websocket messages while reconnecting, or why you want to do that, but you have some options for tracking messages related to the reconnect and handling them however you want.
First off, buffered channels act like queues: first in, first out (FIFO). You can always pop an element off the channel with a receive operation; you don't need to assign it to a variable or use it. So say you just wanted to remove the first two messages from the queue around the reconnect and do nothing with them (not sure why), you can:
if err != nil {
	// If session is disconnected.
	// Try to reconnect session here.
	connection.reconnect()
	// drop the next two messages
	<-message
	<-message
}
But that's going to remove the messages from the front of the queue. If the queue wasn't empty when you started the reconnect, it won't specifically remove the messages added during the reconnect.
If you want to relate the number of messages removed to the number of messages added during the reconnect, you can use the channel length:
before := len(message)
connection.reconnect()
after := len(message)
for x := 0; x < after-before; x++ {
	<-message
}
Again, that will remove from the front of the queue, and I don't know why you would want to do that unless you're just trying to empty the channel.
And if the channel is non-empty at the start of the reconnect and you really want to drop the messages that were added during the reconnect, you can use the time package. Channels can be defined for any Go type, so create a struct with fields for the message and a timestamp, redefine your buffered message channel to the struct type, and set the timestamp before sending the message. Save a "before" timestamp from before the reconnect and an "after" timestamp afterward. Then, before processing a received message, you can check whether it falls in an after/before window and drop it (not write it) if so. You could build a small data structure that stores several before/after windows, with methods on the type to check whether a given time falls within any of them. Again, I don't know why you would want to do this.
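For example, a rough sketch of that idea (the timedMessage type and droppedDuring helper are illustrative names, not from the original code):

import "time"

// timedMessage wraps the original payload together with the time it was
// put on the channel.
type timedMessage struct {
	payload string // or whatever type your messages have
	queued  time.Time
}

// droppedDuring reports whether a message was queued inside the reconnect
// window delimited by before and after.
func droppedDuring(m timedMessage, before, after time.Time) bool {
	return m.queued.After(before) && m.queued.Before(after)
}

On the receiving side you would record before := time.Now() and after := time.Now() around connection.reconnect(), and skip (not write) any received timedMessage for which droppedDuring returns true.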
Perhaps a better solution would be to just limit the buffer size of the channel instead, and then no new messages could be added to the channel when the channel is full. Would that meet your needs? If you have a reason to drop messages maybe you can explain more about your goals and design -- especially to explain which messages you want to drop.
It might also clarify your question if you include more of the relevant code, such as the declaration of the message channel.
Edit: The asker added more information in a comment on this answer and a comment on the question.
The choice between a buffered channel and an unbuffered channel is, in part, about whether you want senders to block when receivers are not available. If this is your only receiver, during the reconnect it will be unavailable. Therefore if it works for your design for senders to block, you can use an unbuffered channel instead of timestamps and no messages will be added to the channel during the reconnect. However, senders blocked at the channel send will be waiting for the receiver with their old message, and only after that send succeeds will they send a new message with current data. If that doesn't work for you, a buffered channel is probably the better option.

RabbitMQ multiple acknowledges to same message closes the consumer

If I acknowledge the same message twice using the Delivery.Ack method, my consumer channel just closes by itself.
Is this expected behaviour? Has anyone experienced this?
The reason I am acknowledging the same message twice is a special case where I have to break the original message into copies and process them on the consumer. Once the consumer processes everything, it loops and acks everything. Since there are copies of the entity, it acks the same message twice and my consumer channel shuts down.
According to the AMQP reference, a channel exception is raised when a message gets acknowledged for the second time:
A message MUST not be acknowledged more than once. The receiving peer MUST validate that a non-zero delivery-tag refers to a delivered message, and raise a channel exception if this is not the case.
A second call to Ack(...) for the same message will not return an error, but the channel gets closed due to this exception received from the server:
Exception (406) Reason: "PRECONDITION_FAILED - unknown delivery tag ?"
It is possible to register a listener via Channel.NotifyClose to observe this exception.
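For example, a minimal sketch of such a listener, assuming the github.com/rabbitmq/amqp091-go client (the older streadway/amqp exposes the same NotifyClose signature); the function name is made up:

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

// watchChannelClose logs the channel-level exception (e.g. the 406
// PRECONDITION_FAILED raised by a double Ack) instead of letting the
// channel disappear silently.
func watchChannelClose(ch *amqp.Channel) {
	closeCh := ch.NotifyClose(make(chan *amqp.Error, 1))
	go func() {
		if err := <-closeCh; err != nil {
			log.Printf("channel closed by broker: %v", err)
		}
	}()
}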

Redis Pub/Sub Ack/Nack

Is there a concept of acknowledgements in Redis Pub/Sub?
For example, when using RabbitMQ, I can have two workers running on separate machines and when I publish a message to the queue, only one of the workers will ack/nack it and process the message.
However I have discovered with Redis Pub/Sub, both workers will process the message.
Consider this simple example: I have this goroutine running on two different machines/clients:
go func() {
	for {
		switch n := pubSubClient.Receive().(type) {
		case redis.Message:
			process(n.Data)
		case redis.Subscription:
			if n.Count == 0 {
				return
			}
		case error:
			log.Print(n)
		}
	}
}()
When I publish a message:
conn.Do("PUBLISH", "tasks", "task A")
Both goroutines will receive it and run the process function.
Is there a way of achieving behaviour similar to RabbitMQ, i.e. where the first worker to ack the message is the only one to receive and process it?
Redis Pub/Sub is more of a broadcast mechanism.
If you want queues, you can use BLPOP along with RPUSH to get the same interaction. Keep in mind that RabbitMQ does all sorts of other things that are not really there in Redis, but if you are looking for simple job scheduling / request handling, this will work just fine.
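For illustration, a rough sketch of that queue pattern using the same redigo-style connection as the question (the list key is made up):

// Producer: push a task onto a list instead of publishing it.
if _, err := conn.Do("RPUSH", "tasks", "task A"); err != nil {
	log.Print(err)
}

// Worker (run on every machine): BLPOP blocks until a task is available,
// and each task is handed to exactly one worker. 0 means no timeout.
reply, err := redis.Strings(conn.Do("BLPOP", "tasks", 0))
if err != nil {
	log.Print(err)
} else {
	process([]byte(reply[1])) // reply[0] is the list key, reply[1] the task
}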
No, Redis' PubSub does not guarantee delivery nor does it limit the number of possible subscribers who'll get the message.
Redis Streams (introduced in Redis 5.0) support acknowledgement of tasks as they are completed by a consumer group.
https://redis.io/topics/streams-intro
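For illustration, the corresponding Streams commands issued through the same redigo-style connection (stream, group and consumer names are made up, and reply parsing is elided):

// Producer: append a task as a stream entry.
if _, err := conn.Do("XADD", "tasks", "*", "payload", "task A"); err != nil {
	log.Print(err)
}

// One-time setup: create a consumer group (returns BUSYGROUP if it already exists).
if _, err := conn.Do("XGROUP", "CREATE", "tasks", "workers", "$", "MKSTREAM"); err != nil {
	log.Print(err)
}

// Worker: each entry is delivered to exactly one consumer in the group.
reply, err := conn.Do("XREADGROUP", "GROUP", "workers", "worker-1",
	"COUNT", 1, "BLOCK", 0, "STREAMS", "tasks", ">")
if err != nil {
	log.Print(err)
}
_ = reply // parse the nested reply, process the entry, then acknowledge it with:
// conn.Do("XACK", "tasks", "workers", entryID)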

Changing state of messages which are "in delivery"

In my application, I have a queue (HornetQ) set up on JBoss 7 AS.
I have used Spring Batch to do some work once a message is received (save values in the database etc.) and then the consumer commits the JMS session.
Sometimes when there is an exception while processing a message, the execution of the consumer is aborted abruptly and the message remains in the "in delivery" state. There are about 30 messages in this state on my production queue.
I have tried restarting the consumer but the state of these messages does not change. The only way to remove these messages from the queue is to restart the queue. But before doing that I want a way to read these messages so that they can be corrected and sent to the queue again to be processed.
I have tried using QueueBrowser to read them but it does not work. I have searched a lot on Google but could not find any way to read these messages.
I am using a Transacted session, where once the message is processed, I am calling:
session.commit();
This sends the acknowledgement.
I am implementing Spring's org.springframework.jms.listener.SessionAwareMessageListener to receive messages and then process them.
While processing the messages, I am using Spring Batch to insert some data in the database. For a particular case, it tries to insert data too big for the column. It throws an exception and the transaction is aborted.
Now, I have fixed my producer and consumer so that such data is no longer produced, so this case should not happen again.
But my question is: what about the 30 "in delivery" state messages that are in my production queue? I want to read them so that they can be corrected and sent to the queue again to be processed. Is there any way to read these messages? Once I know their content, I can restart the queue and submit them again (after correcting them).
Thanking you in anticipation,
Suvarna
It all depends on the transaction mode you are using.
For instance, if you use transactions:
// session here is a TX (transacted) session
MessageConsumer consumer = session.createConsumer(someQueue);
connection.start(); // in JMS, delivery is started on the Connection
Message msg = consumer.receive...
session.rollback(); // this will make the messages be redelivered
If you are using non-TX (auto-acknowledge):
// session here is auto-ack
MessageConsumer consumer = session.createConsumer(someQueue);
connection.start(); // in JMS, delivery is started on the Connection
// this means the message is ACKed as we receive it (auto-ACK)
Message msg = consumer.receive...
// however, the consumer here could have a buffer from the server...
// if you are not using the consumer any longer, close it
consumer.close(); // this will release messages in the client buffer
Alternatively you could also set consumerWindowSize=0 on the connectionFactory.
This is documented for 2.2.5 but has not changed in later releases:
http://docs.jboss.org/hornetq/2.2.5.Final/user-manual/en/html/flow-control.html
I'm covering all the possibilities I could think of since you're not being specific about how you are consuming. If you provide more detail I will be able to tell you more.
You can indeed read the messages in the queue using JMX (with, for example, jconsole).
In JBoss AS7 you can do it the following way:
MBeans > jboss.as > messaging > default > myJmsQueue > Operations
listMessagesAsJson
Edit:
Since 2.3.0 there is a dedicated method for this specific case:
listDeliveringMessages
See https://issues.jboss.org/browse/HORNETQ-763

Resources