I'm implementing a WebSocket client in Go.
I have to send several messages in one WebSocket session.
To deal with network problems, I need to reconnect to the WebSocket server whenever a connection is accidentally closed.
Currently I'm thinking of an implementation like this.
for {
	select {
	case t := <-message:
		err := connection.WriteMessage(websocket.TextMessage, []byte(t.String()))
		if err != nil {
			// If the session is disconnected,
			// try to reconnect here.
			connection.reconnect()
		}
	case err := <-errSignal:
		panic(err)
	}
}
In the above example, messages stack up while reconnecting.
This is not preferable for my purpose.
How can I drop WebSocket messages while reconnecting?
I take it message is a buffered channel. It's not clear to me exactly what behavior you're asking for with regard to dropping WebSocket messages while reconnecting, or why you want to do that, but you have some options for tracking messages related to the reconnect and handling them however you want.
First off, buffered channels act like queues: first in, first out (FIFO). You can always pop an element off the channel with a receive operation; you don't need to assign it to a variable or use it. So say you just wanted to remove the first two messages from the queue around the reconnect and do nothing with them (not sure why), you can:
if err != nil {
	// If the session is disconnected,
	// try to reconnect here.
	connection.reconnect()
	// Drop the next two messages.
	<-message
	<-message
}
But that's going to remove the messages from the front of the queue. If the queue wasn't empty when you started the reconnect, it won't specifically remove the messages added during the reconnect.
If you want to relate the number of messages removed to the number of messages added during the reconnect, you can use the channel length:
before := len(message)
connection.reconnect()
after := len(message)
for x := 0; x < after-before; x++ {
	<-message
}
Again, that will remove from the front of the queue, and I don't know why you would want to do that unless you're just trying to empty the channel.
And if the channel is non-empty at the start of the reconnect and you really want to drop only the messages that got added during the reconnect, you can use the time package. Channels can be defined for any Go type, so create a struct with fields for the message and a timestamp, redefine your buffered message channel to that struct type, and set the timestamp before sending the message. Save a "before" timestamp from before the reconnect and an "after" one afterward. Then, before processing a received message, you can check whether it falls in an after/before window and drop it (not write it) if so. You could make a new data structure to save several before/after windows, with methods on the type to check whether a given time falls within any of them. Again, I don't know why you would want to do this.
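A minimal sketch of that timestamp idea, assuming a hypothetical Message wrapper type and keeping your connection/reconnect names (all names here are illustrative):

// Hypothetical wrapper: the payload plus the time it was queued.
type Message struct {
	Payload string
	Queued  time.Time
}

message := make(chan Message, 64) // senders do: message <- Message{s, time.Now()}

// Around the reconnect, record the window.
before := time.Now()
connection.reconnect()
after := time.Now()

// In the receive loop, drop anything queued during that window.
m := <-message
if m.Queued.After(before) && m.Queued.Before(after) {
	// Queued while reconnecting: drop it instead of writing it.
} else if err := connection.WriteMessage(websocket.TextMessage, []byte(m.Payload)); err != nil {
	connection.reconnect()
}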
Perhaps a better solution would be to just limit the buffer size of the channel instead; then no new messages could be added to the channel while the channel is full. Would that meet your needs? If you have a reason to drop messages, maybe you can explain more about your goals and design, especially which messages you want to drop.
It might also clarify your question if you include more of the relevant code, such as the declaration of the message channel.
Edit: The asker added info in a comment to this answer and a comment to the question.
The choice between a buffered channel and an unbuffered channel is, in part, about whether you want senders to block when receivers are not available. If this is your only receiver, during the reconnect it will be unavailable. Therefore if it works for your design for senders to block, you can use an unbuffered channel instead of timestamps and no messages will be added to the channel during the reconnect. However, senders blocked at the channel send will be waiting for the receiver with their old message, and only after that send succeeds will they send a new message with current data. If that doesn't work for you, a buffered channel is probably the better option.
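For illustration, a minimal sketch of the unbuffered variant (the names are illustrative):

message := make(chan string) // unbuffered: every send waits for a receiver

go func() {
	// This send blocks until the receive loop is ready again,
	// so nothing piles up in the channel during a reconnect.
	message <- "payload"
}()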
Related
This page in the Go tutorial about channels seems to be missing a word or two, or was just not edited. I can't tell what it is supposed to say about sending and receiving through channels.
By default, sends and receives block until the other side is ready.
Is a block something within Go? I haven't seen it before. Is block being used as a noun?
I tried searching for clarification. The only other page that has similar wording is educative.io
Moreover, by default, channels send and receive until the other side is ready
But it doesn't make sense. Do they mean:
Channels send and receive regardless of whether or not the other side is ready? Doesn't this seem wasteful?
Or is "don't" missing in the statement above?
"Block" means that the goroutine will wait. You could write it this way:
By default, sends and receives wait until the other side is ready.
"Block" is just the normal term for this. It is not specific to Go. It is possible to use a channel in Go in a non-blocking manner:
You can create a channel with a buffer. As long as there is space in the buffer, a write is non-blocking (but it will block if the buffer is full). As long as there is data in the buffer, a read is non-blocking (but it will block if the buffer is empty).
You can use a select statement with a default branch.
var readch chan int
var writech chan int
var value int

select {
case n := <-readch:
	// Received data.
	_ = n // use the received value here
case writech <- value:
	// Sent data.
default:
	// Didn't send or receive data.
}
In this code, instead of blocking (waiting), the goroutine will go to the default branch.
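And for the first option, a minimal runnable sketch of buffered-channel behavior (the capacity and values are arbitrary):

package main

import "fmt"

func main() {
	ch := make(chan int, 2) // buffered channel with capacity 2

	ch <- 1 // does not block: buffer has space
	ch <- 2 // does not block: buffer is now full
	// A third send here would block until a receive frees space.

	fmt.Println(<-ch) // does not block: buffer has data; prints 1
	fmt.Println(<-ch) // prints 2
	// Another receive here would block until a new send arrives.
}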
Below is a service with a set of three goroutines that process a message from Kafka:
Channel-1 and Channel-2 are unbuffered data channels in Go. A channel is like a queuing mechanism.
Goroutine-1 reads a message from a Kafka topic and, after validating the message, puts its payload on Channel-1.
Goroutine-2 reads from Channel-1, processes the payload, and puts the processed payload on Channel-2.
Goroutine-3 reads from Channel-2, encapsulates the processed payload in an HTTP request, and performs the request (using an HTTP client) against another service.
The loophole in the above flow: processing fails either due to bad network connections between the services or because the remote service is not ready to accept HTTP requests from Goroutine-3 (HTTP client timeout), and in that case the service loses the message (already read from the Kafka topic).
Goroutine-1 currently consumes messages from Kafka without sending an acknowledgement to Kafka (to inform it that a specific message was processed successfully by Goroutine-3).
Correctness is preferred over performance.
How can I ensure that every message is processed successfully?
For example, add feedback from Goroutine-3 to Goroutine-1 through a new Channel-3. Goroutine-1 will block until it gets an acknowledgement from Channel-3.
// in goroutine 1
channel1 <- data
select {
case <-channel3:
case <-ctx.Done(): // or something else to prevent deadlock
}

...

// in goroutine 3
data := <-channel2
for {
	if err := sendData(data); err == nil {
		break
	}
}
channel3 <- struct{}{}
To ensure correctness, you need to commit (= acknowledge) the message after processing has finished successfully.
For the cases where processing doesn't finish successfully, you generally need to implement a retry mechanism yourself.
That should be specific to your use case, but typically you send the message to a dedicated Kafka retry topic (that you create), sleep, and process the message again. If the processing still fails after x attempts, you send the message to a DLQ (= dead letter queue).
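A rough sketch of that flow, where process, produceTo, the attempt counter, and the topic names are all assumptions rather than any particular Kafka client's API:

const maxAttempts = 3

func handle(msg Message) {
	if err := process(msg); err == nil {
		return // success: commit/acknowledge the offset here
	}
	if msg.Attempts >= maxAttempts {
		produceTo("orders-dlq", msg) // give up: dead letter queue
		return
	}
	msg.Attempts++
	time.Sleep(time.Duration(msg.Attempts) * time.Second) // simple backoff
	produceTo("orders-retry", msg) // consumed again later and retried
}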
You can read more here:
https://eng.uber.com/reliable-reprocessing/
https://www.confluent.io/blog/error-handling-patterns-in-kafka/
This question has quite possibly been answered before, but I couldn't find it, so here we go:
I have this Go function that sends or receives "messages", whichever one is available, using a select statement:
func Seek(name string, match chan string) {
	select {
	case peer := <-match:
		fmt.Printf("%s sent a message to %s.\n", peer, name)
	case match <- name:
		// Wait for someone to receive my message.
	}
}
I start this function on 4 different goroutines, using an unbuffered channel (it would be better to use a buffer of 1, but this is merely experimental):
people := []string{"Anna", "Bob", "Cody", "Dave"}
match := make(chan string)
for _, name := range people {
	go Seek(name, match)
}
Now, I've just started using Go and thought that since we're using an unbuffered channel, both the send and receive statements of the select should block (there's no one waiting to send a message so you can't receive, and there's no one waiting to receive so you can't send), meaning that there won't be any communication done between the functions, aka deadlock. However, running the code shows us that this is not the case:
API server listening at: 127.0.0.1:48731
Dave sent a message to Cody.
Anna sent a message to Bob.
Process exiting with code: 0
My question to you lovely people is: why does this happen? Does the compiler realize that the functions want to read/write on the same channel and arrange for that to happen? Or does the select statement continually check whether there's anyone available to use the channel with?
Sorry if the question is hard to answer; I'm still a novice and not that experienced in how things operate behind the scenes :)
Now, I've just started using Go and thought that since we're using an unbuffered channel, both the send and receive statements of the select should block (there's no one waiting to send a message so you can't receive, and there's no one waiting to receive so you can't send)
This is actually not true; in fact, there are multiple goroutines waiting to receive and multiple goroutines waiting to send. When a goroutine does a select like yours:
select {
case peer := <-match:
	fmt.Printf("%s sent a message to %s.\n", peer, name)
case match <- name:
	// Wait for someone to receive my message.
}
It is simultaneously waiting to send and to receive. Since you have multiple goroutines doing this, every goroutine will find both senders and receivers. Nothing will block. The selects will choose cases at random, since multiple cases are unblocked at the same time.
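A minimal runnable example of that last point: when several cases are ready at once, select picks one pseudo-randomly.

package main

import "fmt"

func main() {
	a := make(chan int, 1)
	b := make(chan int, 1)
	a <- 1
	b <- 2

	// Both cases are ready, so select chooses one at random.
	select {
	case v := <-a:
		fmt.Println("received from a:", v)
	case v := <-b:
		fmt.Println("received from b:", v)
	}
}

Run it a few times and either branch may fire.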
My scenario:
I have a producer and a consumer. Both are goroutines, and they communicate through one channel.
The producer is capable of (theoretically) generating a message at any time.
Generating a message requires some computation.
The message is somewhat time-sensitive (i.e. the older it is, the less relevant it is).
The consumer reads from the channel occasionally. For the sake of this example, let's say the consumer uses a time.Ticker to read a message once every couple of seconds.
The consumer would prefer "fresh" messages (i.e. messages that were generated as recently as possible).
So, the question is: How can the producer generate a message as late as possible?
Sample code that shows the general idea:
func producer() {
	for {
		select {
		...
		case pipe <- generateMsg():
			// I'd like to call generateMsg as late as possible,
			// i.e. calculate the timestamp when I know
			// that writing to the channel will not block.
		}
	}
}
func consumer() {
	for {
		select {
		...
		case <-timeTicker.C:
			// Reading from the pipe.
			msg := <-pipe
			...
		}
	}
}
Full code (slightly different from above) is available at the Go Playground: https://play.golang.org/p/y0oCf39AV6P
One idea I had was to check whether writing to a channel would block. If it wouldn't block, then I could generate a message and send it. However:
I couldn't find any way to test whether writing to a channel would block.
In the general case, this is a bad idea because it introduces a race condition if there are multiple producers. In this specific case, I only have one producer.
Another (bad) idea:
func producer() {
	var msg Message
	for {
		// This is BAD. DON'T DO THIS!
		select {
		case pipe <- msg:
			// It may send the same message multiple times.
		default:
			msg = generateMsg()
			// It causes a busy-wait loop and high CPU usage,
			// because it re-generates the message all the time.
		}
	}
}
This answer (for Go non-blocking channel send, test-for-failure before attempting to send?) suggests using a second channel to send a signal from the consumer to the producer:
Consumer wants to get a message (e.g. after receiving a tick from timer.Ticker).
Consumer sends a signal through a side channel to the producer goroutine. (So, for this side channel, the producer/consumer roles are reversed).
Producer receives the signal from the side channel.
Producer starts computing the real message.
Producer sends the message through the main channel.
Consumer receives the message.
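A sketch of that choreography; Message, generateMsg, and process are placeholders, and only the channel handshake is the point:

func producer(requests <-chan struct{}, pipe chan<- Message) {
	for range requests {
		// generateMsg runs only after the consumer has asked,
		// so the message is as fresh as possible.
		pipe <- generateMsg()
	}
}

func consumer(requests chan<- struct{}, pipe <-chan Message) {
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		requests <- struct{}{} // signal the producer via the side channel
		process(<-pipe)        // receive the freshly generated message
	}
}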
Fairly new to ZeroMQ and trying to get a basic pub/sub to work. When I run the following (sub starting before pub), the publisher finishes but the subscriber hangs, having not received all the messages. Why?
I think the socket is being closed before all the messages have been sent. Is there a way of ensuring all messages are received?
Publisher:
import zmq
import random
import time
import tnetstring

context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("tcp://*:5556")

y = 0
for x in xrange(5000):
    st = random.randrange(1, 10)
    data = []
    data.append(random.randrange(1, 100000))
    data.append(int(time.time()))
    data.append(random.uniform(1.0, 10.0))
    s = tnetstring.dumps(data)
    print 'Sending ...%d %s' % (st, s)
    socket.send("%d %s" % (st, s))
    print "Messages sent: %d" % x
    y += 1

print '*** SERVER FINISHED. # MESSAGES SENT = ' + str(y)
Subscriber:
import sys
import zmq
import tnetstring

# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5556")

filter = ""  # get all messages
socket.setsockopt(zmq.SUBSCRIBE, filter)

x = 0
while True:
    topic, data = socket.recv().split()
    print "Topic: %s, Data = %s. Total # Messages = %d" % (topic, data, x)
    x += 1
In ZeroMQ, clients and servers always try to reconnect; they won't go down if the other side disconnects (because in many cases you'd want them to resume talking if the other side comes up again). So in your test code, the client will just wait until the server starts sending messages again, unless you stop recv()ing messages at some point.
In your specific instance, you may want to investigate using socket.close() and context.term(); context.term() will block until all the messages have been sent. You also have the slow-joiner problem: the subscriber takes a moment to finish connecting, so it misses messages published before the subscription is in place. You can add a sleep after the bind but before you start publishing. That works in a test case, but you will want to really understand what is a solution versus a band-aid.
You need to think of the PUB/SUB pattern like a radio. The sender and receiver are both asynchronous. The Publisher will continue to send even if no one is listening. The subscriber will only receive data if it is listening. If the network goes down in the middle, the data will be lost.
You need to understand this in order to design your messages. For example, if you design your messages to be "idempotent", it doesn't matter if you lose data. An example of this would be a status-type message: it doesn't matter if you missed any of the previous statuses, because the latest one is correct and message loss doesn't matter. The benefit of this approach is that you end up with a more robust and performant system. The downside is that not every message can be designed this way.
Your example includes a type of message that must not be lost. Another such type would be transactional messages: if you send only the deltas of what changed in your system, you cannot afford to lose any of them. Database replication is often managed this way, which is why DB replication is often so fragile. To try to provide guarantees, you need to do a couple of things. First, add a persistent cache: each message sent is logged in the cache, and each message is assigned a unique ID (preferably a sequence number) so that clients can determine whether they are missing a message. Second, add a socket (ROUTER/REQ) for the client to request the missing messages individually. Alternatively, you could use the secondary socket to request resending over the PUB/SUB socket; the clients would then all receive the messages again (which works for the multicast version) and ignore the messages they had already seen. NOTE: this follows the MAJORDOMO pattern found in the ZeroMQ guide.
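A transport-agnostic sketch of the sequence-number part of that idea, in Go for brevity; Msg, process, and requestResend are illustrative assumptions, not any ZeroMQ binding's API:

type Msg struct {
	Seq     uint64
	Payload []byte
}

func receiveLoop(in <-chan Msg, requestResend func(seq uint64)) {
	var next uint64 // next sequence number we expect
	for m := range in {
		if m.Seq < next {
			continue // already seen: ignore the duplicate
		}
		for ; next < m.Seq; next++ {
			requestResend(next) // ask the publisher for each gap
		}
		process(m)
		next = m.Seq + 1
	}
}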
An alternative approach is to create your own broker using ROUTER/DEALER sockets. When the ROUTER socket saw each DEALER connect, it would store its ID. When the ROUTER needed to send data, it would iterate over all client IDs and publish the message. Each message should contain a sequence number so that the client knows which missing messages to request. NOTE: this is a sort of reimplementation of Kafka from LinkedIn.