Is it safe to hide sending to channel behind function call - go

I have a struct called Hub with a Run() method which is executed in its own goroutine. This method sequentially handles incoming messages. Messages arrive concurrently from multiple producers (separate goroutines). Of course I use a channel to accomplish this task. But now I want to hide the Hub behind an interface to be able to choose from its implementations. So, using a channel as a simple Hub's field isn't appropriate.
package main

import (
    "fmt"
    "time"
)

type Hub struct {
    msgs chan string
}

func (h *Hub) Run() {
    for {
        msg, hasMore := <-h.msgs
        if !hasMore {
            return
        }
        fmt.Println("hub: msg received", msg)
    }
}

func (h *Hub) SendMsg(msg string) {
    h.msgs <- msg
}

func send(h *Hub, prefix string) {
    for i := 0; i < 5; i++ {
        fmt.Println("main: sending msg")
        h.SendMsg(fmt.Sprintf("%s %d", prefix, i))
    }
}

func main() {
    h := &Hub{make(chan string)}
    go h.Run()
    for i := 0; i < 10; i++ {
        go send(h, fmt.Sprintf("msg sender #%d", i))
    }
    time.Sleep(time.Second)
}
So I've introduced Hub.SendMsg(msg string) function that just calls h.msgs <- msg and which I can add to the HubInterface. And as a Go-newbie I wonder, is it safe from the concurrency perspective? And if so - is it a common approach in Go?

Channel send semantics do not change when you move the send into a method. Andrew's answer points out that the channel needs to be created with make to send successfully, but that was always true, whether or not the send is inside a method.
If you are concerned about making sure callers can't accidentally wind up with invalid Hub instances with a nil channel, one approach is to make the struct type private (hub) and have a NewHub() function that returns a fully initialized hub wrapped in your interface type. Since the struct is private, code in other packages can't try to initialize it with an incomplete struct literal (or any struct literal).
That said, it's often possible to create invalid or nonsense values in Go and that's accepted: net.IP("HELLO THERE BOB") is valid syntax, or net.IP{}. So if you think it's better to expose your Hub type go ahead.
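The constructor approach described above could be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the interface name HubInterface comes from the question, but its exact method set and the Close method are assumptions added here so the example can shut down cleanly.

```go
package main

import "fmt"

// HubInterface is what callers see. Its method set is an assumption
// made for this sketch.
type HubInterface interface {
    SendMsg(msg string)
    Run()
    Close()
}

// hub is unexported, so code in other packages cannot build one with a
// struct literal and end up with a nil channel.
type hub struct {
    msgs chan string
}

// NewHub returns a hub whose channel is always initialized.
func NewHub() HubInterface {
    return &hub{msgs: make(chan string)}
}

func (h *hub) SendMsg(msg string) { h.msgs <- msg }

// Close (hypothetical) tells Run to return once the channel is drained.
func (h *hub) Close() { close(h.msgs) }

func (h *hub) Run() {
    for msg := range h.msgs {
        fmt.Println("hub: msg received", msg)
    }
}

func main() {
    h := NewHub()
    done := make(chan struct{})
    go func() {
        h.Run()
        close(done)
    }()
    h.SendMsg("hello")
    h.Close()
    <-done // wait for Run to finish before exiting
}
```

Callers only ever hold a HubInterface, so swapping in another implementation means writing another constructor.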

Easy answer
Yes
Better answer
No
Channels are great for emitting data from unknown goroutines, and they do so safely. However, I would recommend being careful with a few parts. In the listed example the channel is created when the struct is constructed, by the consumer of the API rather than by the Hub itself.
Say the consumer creates the Hub like this: &Hub{}. Perfectly valid... apart from the fact that every call to SendMsg() will block forever, because a send on a nil channel never proceeds. Luckily you placed those calls in their own goroutines, so you're still fine, right? Wrong. You are now leaking goroutines. That seems fine... until you run this for a period of time. Go encourages you to have valid zero values. In this case &Hub{} is not valid.
Ensuring SendMsg() won't block could be achieved with a select statement that has a default case; however, you then have to decide what to do when you hit the default case (e.g. throw the data away). The channel could block for more reasons than bad setup, too. Say you later do more than simply print the data after reading from the channel. What if the read gets very slow, or blocks on I/O? You will then start pushing back on the producers.
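A non-blocking send of that kind might look like the sketch below. The method name TrySendMsg and the choice to report a dropped message through a bool are assumptions for illustration; what to do with a dropped message is still up to you.

```go
package main

import "fmt"

type Hub struct {
    msgs chan string
}

// TrySendMsg (hypothetical name) attempts a send without blocking.
// It reports false when no receiver or buffer slot is ready, which
// includes the case where msgs is nil.
func (h *Hub) TrySendMsg(msg string) bool {
    select {
    case h.msgs <- msg:
        return true
    default:
        return false // caller decides: drop, retry, log, ...
    }
}

func main() {
    h := &Hub{msgs: make(chan string, 1)}
    fmt.Println(h.TrySendMsg("first"))    // true: the buffer has room
    fmt.Println(h.TrySendMsg("second"))   // false: the buffer is full
    fmt.Println((&Hub{}).TrySendMsg("x")) // false: a nil channel is never ready
}
```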
Ultimately, channels allow you to not think much about concurrency... However if this is something of high-throughput, then you have quite a bit to consider. If it is production code, then you need to understand that your API here involves SendMsg() blocking.

Related

How does this go-routine in an anonymous function exactly work?

func (s *server) send(m *message) error {
    go func() {
        s.outgoingMessageChan <- m
    }()
    return nil
}
func main(s *server) {
    for {
        select {
        case <-someChannel:
            // do something
        case msg := <-s.outgoingMessageChan:
            // take message sent from "send" and do something
        }
    }
}
I am pulling messages out of s.outgoingMessageChan in another function. Before I wrapped the send in an anonymous go function, a call to send would block: the channel send would wait until something received from the channel. After wrapping it like this, it no longer seems to block. I understand that it sends the operation to the background and proceeds as usual, but I can't wrap my head around how this doesn't affect the current function call.
Each time send is called a new goroutine is created, and send returns immediately. (BTW there is no reason to return an error if there can never be an error.) The goroutine (which has its own "thread" of execution) will block if nothing is ready to read from the chan (assuming it's unbuffered). Once the message is read off the chan the goroutine will continue, but since it does nothing else it will simply end.
I should point out that there is no such thing as an anonymous goroutine. Goroutines have no identifier at all (except for a number that you should only use for debugging purposes). You have an anonymous function which you put the go keyword in front causing it to run in a separate goroutine.
For a send function that blocks as you seem to want then just use:
func (s *server) send(m *message) {
    s.outgoingMessageChan <- m
}
However, I can't see any point in this function (though it would be inlined and just as efficient as not using a function).
I suspect you may be calling send many times before anything is read from the chan. In this case many new goroutines will be created (each time you call send) which will all block. Each time the chan is read from one will unblock delivering its value and that goroutine will terminate. Doing this you are simply creating an inefficient buffering mechanism. Moreover, if send is called for a prolonged period at a faster rate than the values can be read from the chan then you will eventually run out of memory. Better would be to use a buffered chan (and no goroutines) that once it (the chan) became full exerted "back-pressure" on whatever was producing the messages.
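The buffered-channel alternative could be sketched like this. The constructor newServer, the shortened field name outgoing, and the buffer size are assumptions for illustration, not part of the original code.

```go
package main

import "fmt"

type message struct{ body string }

type server struct {
    outgoing chan *message
}

// newServer (hypothetical) takes the buffer size as a tuning knob.
func newServer(buffer int) *server {
    return &server{outgoing: make(chan *message, buffer)}
}

// send returns immediately while the buffer has room and blocks once
// it is full, which slows the producer down instead of piling up
// goroutines.
func (s *server) send(m *message) {
    s.outgoing <- m
}

func main() {
    s := newServer(3)
    for i := 0; i < 3; i++ {
        s.send(&message{body: fmt.Sprint("msg ", i)}) // fits in the buffer
    }
    fmt.Println("queued:", len(s.outgoing))
    close(s.outgoing)
    for m := range s.outgoing {
        fmt.Println("received:", m.body)
    }
}
```

Memory use is bounded by the buffer size, and a fourth send here would block until a receiver caught up.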
Another point is that the function name main identifies the entry point of a program. Please use another name for your second function above. It also seems like it should be a method (with an s *server receiver) rather than a function.

Concurrent queue which returns channels, locking doubts

There is a queue of Message structs (their contents are not important) with the classic Push and Pop methods:
type Queue struct {
    messages list.List
}
//The implementation is not relevant for the sake of the question
func (q *Queue) Push(msg Message) { /*...*/ }
func (q *Queue) Pop() (Message, bool) { /*...*/ }
/*
* NewTimedChannel runs a goroutine which pops a message from the queue every
* given time duration and sends it over the returned channel
*/
func (q *Queue) NewTimedChannel(t time.Duration) (<-chan Message) {/*...*/}
The client of the Push function will be a web gui in which users will post their messages.
The client of the channel returned by NewTimedChannel will be a service which sends each message to a not relevant endpoint over the network.
I'm a newbie in concurrency and go and I have the following question:
I know that since Queue.messages is a shared state between the main goroutine which deals with pushing the message after the user submit a web form and the ones created for each NewTimedChannel invocation, I need to lock it.
Do I need to lock and unlock using the sync.Mutex in all the Push, Pop and NewTimedChannel methods?
And is there a more idiomatic way to handle this specific problem in the go environment?
As others have pointed out, it requires synchronization or there will be a data race.
There is a saying in Go: "Don't communicate by sharing memory; share memory by communicating." In this case, I think an idiomatic way is to have channels send to a separate goroutine, which synchronizes all the operations using select. The code can easily be extended by adding more channels to support more kinds of operations (like the timed channel in your code, whose purpose I don't fully understand), and with select and other utilities it can handle more complex synchronization without explicit locks. I wrote some sample code:
type SyncQueue struct {
    Q                AbsQueue
    pushCh, popMsgCh chan Message
    popOkCh          chan bool
    popCh            chan struct{}
}

// An abstraction of the Queue type. You can remove the abstract layer.
type AbsQueue interface {
    Push(Message)
    Pop() (Message, bool)
}

func (sq SyncQueue) Push(m Message) {
    sq.pushCh <- m
}

func (sq SyncQueue) Pop() (Message, bool) {
    sq.popCh <- struct{}{} // send a signal for pop. struct{}{} costs no memory at all.
    return <-sq.popMsgCh, <-sq.popOkCh
}

// Every pop and push gets synchronized here.
func (sq SyncQueue) Run() {
    for {
        select {
        case m := <-sq.pushCh:
            sq.Q.Push(m)
        case <-sq.popCh:
            m, ok := sq.Q.Pop()
            sq.popMsgCh <- m
            sq.popOkCh <- ok
        }
    }
}

func NewSyncQueue(Q AbsQueue) *SyncQueue {
    sq := SyncQueue{
        Q:      Q,
        pushCh: make(chan Message), popMsgCh: make(chan Message),
        popOkCh: make(chan bool), popCh: make(chan struct{}),
    }
    go sq.Run()
    return &sq
}
Note that for simplicity I did not use a quit channel or a context.Context, so the goroutine running sq.Run() has no way of exiting and would leak.
Do I need to lock and unlock using the sync.Mutex in all the Push, Pop and NewTimedChannel methods?
Yes.
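A minimal sketch of the mutex approach follows, assuming that NewTimedChannel only touches the list through Pop (in which case it needs no locking of its own). The Message fields and the body of Push and Pop are assumptions, since the question leaves them out.

```go
package main

import (
    "container/list"
    "fmt"
    "sync"
)

// Message's contents are an assumption for this sketch.
type Message struct{ Text string }

type Queue struct {
    mu       sync.Mutex // guards messages
    messages list.List  // the zero value is an empty, usable list
}

func (q *Queue) Push(msg Message) {
    q.mu.Lock()
    defer q.mu.Unlock()
    q.messages.PushBack(msg)
}

func (q *Queue) Pop() (Message, bool) {
    q.mu.Lock()
    defer q.mu.Unlock()
    front := q.messages.Front()
    if front == nil {
        return Message{}, false
    }
    return q.messages.Remove(front).(Message), true
}

func main() {
    var q Queue
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(i int) { // concurrent producers, as with web requests
            defer wg.Done()
            q.Push(Message{Text: fmt.Sprint("msg ", i)})
        }(i)
    }
    wg.Wait()
    n := 0
    for _, ok := q.Pop(); ok; _, ok = q.Pop() {
        n++
    }
    fmt.Println("popped", n, "messages")
}
```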
And is there a more idiomatic way to handle this specific problem in the go environment?
For insight, have a look at the last answer for this question:
How do I (succinctly) remove the first element from a slice in Go?

Calling Functions Inside a "LockOSThread" GoRoutine

I'm writing a package to control a Canon DSLR using their EDSDK DLL from Go.
This is a personal project for a photo booth to use at our wedding, at my partner's request, which I'll be happy to post on GitHub when complete :).
Looking at the examples of using the SDK elsewhere, it isn't threadsafe and uses thread-local resources, so I'll need to make sure I'm calling it from a single thread during usage. While not ideal, it looks like Go provides a "runtime.LockOSThread" function for doing just that, although this does get called by the core DLL interop code itself, so I'll have to wait and find out if that interferes or not.
I want the rest of the application to be able to call the SDK using a higher level interface without worrying about the threading, so I need a way to pass function call requests to the locked thread/Goroutine to execute there, then pass the results back to the calling function outside of that Goroutine.
So far, I've come up with this working example of using very broad function definitions using []interface{} arrays and passing back and forward via channels. This would take a lot of mangling of input/output data on every call to do type assertions back out of the interface{} array, even if we know what we should expect for each function ahead of time, but it looks like it'll work.
Before I invest a lot of time doing it this way for possibly the worst way to do it - does anyone have any better options?
package edsdk

import (
    "fmt"
    "runtime"
)

type CanonSDK struct {
    FChan chan functionCall
}

type functionCall struct {
    Function  func([]interface{}) []interface{}
    Arguments []interface{}
    Return    chan []interface{}
}

func NewCanonSDK() (*CanonSDK, error) {
    c := &CanonSDK{
        FChan: make(chan functionCall),
    }
    go c.BackgroundThread(c.FChan)
    return c, nil
}

func (c *CanonSDK) BackgroundThread(fcalls <-chan functionCall) {
    runtime.LockOSThread()
    for f := range fcalls {
        f.Return <- f.Function(f.Arguments)
    }
    runtime.UnlockOSThread()
}

func (c *CanonSDK) TestCall() {
    ret := make(chan []interface{})
    f := functionCall{
        Function:  c.DoTestCall,
        Arguments: []interface{}{},
        Return:    ret,
    }
    c.FChan <- f
    results := <-ret
    close(ret)
    fmt.Printf("%#v", results)
}

func (c *CanonSDK) DoTestCall([]interface{}) []interface{} {
    return []interface{}{"Test", nil}
}
For similar embedded projects I've played with, I tend to create a single worker goroutine that listens on a channel and performs all the work on that USB device, with any results sent back out on another channel.
Talk to the device only through channels, in a one-way exchange, and listen for responses on the other channel.
Since USB is serial and polled, I had to set up a dedicated channel, with another goroutine that just picks items off the channel as the looping worker goroutine pushes them into it.
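The worker arrangement described above might be sketched as follows, combined with runtime.LockOSThread from the question. The command and result types are hypothetical, and the SDK call is replaced by a placeholder, so this only shows the channel plumbing.

```go
package main

import (
    "fmt"
    "runtime"
)

// command and result are hypothetical message types for this sketch.
type command struct{ name string }

type result struct {
    name string
    err  error
}

// worker is the only goroutine that would touch the device. It stays
// on one OS thread for its whole lifetime.
func worker(cmds <-chan command, results chan<- result) {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()
    defer close(results) // lets readers stop when the worker is done
    for c := range cmds {
        // A real implementation would call into the SDK here.
        results <- result{name: c.name}
    }
}

func main() {
    cmds := make(chan command)
    results := make(chan result)
    go worker(cmds, results)

    // Commands go in on one channel...
    go func() {
        for _, n := range []string{"open", "shoot", "close"} {
            cmds <- command{name: n}
        }
        close(cmds)
    }()

    // ...and results come back on the other.
    for r := range results {
        fmt.Println("done:", r.name, r.err)
    }
}
```

Because each command is a concrete struct, callers avoid the []interface{} type assertions, at the cost of defining a type (or a field) per operation.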

Handle different types of messages - one or many channels?

Consider this simple code:
type Message struct { /* ... */ }
type MyProcess struct {
    in chan Message
}
func (foo *MyProcess) Start() {
    for msg := range foo.in {
        // handle `msg`
    }
    // someone closed `in` - bye
}
I'd like to change MyProcess to support 2 different kinds of messages.
I have 2 ideas:
a) Type switch
type Message struct { /* ... */ }
type OtherMessage struct { /* ... */ }
type MyProcess struct {
    in chan interface{} // Changed signature to something more generic
}
func (foo *MyProcess) Start() {
    for msg := range foo.in {
        switch msg := msg.(type) {
        case Message:
            // handle `msg`
        case OtherMessage:
            // handle `msg`
        default:
            // programming error, type system didn't save us this time.
            // panic?
        }
    }
    // someone closed `in` - bye
}
b) Two channels
type Message struct { /* ... */ }
type OtherMessage struct { /* ... */ }
type MyProcess struct {
    in      chan Message
    otherIn chan OtherMessage
}
func (foo *MyProcess) Start() {
loop:
    for {
        select {
        case msg, ok := <-foo.in:
            if !ok {
                // Someone closed `in`
                break loop // a bare break would only leave the select
            }
            // handle `msg`
        case msg, ok := <-foo.otherIn:
            if !ok {
                // Someone closed `otherIn`
                break loop
            }
            // handle `msg`
        }
    }
    // someone closed `in` or `otherIn` - bye
}
What's the functional difference between the two implementations? One thing is the ordering differences - only the first one guarantees that the messages (Message and OtherMessage) will be processed in the proper sequence.
Which one is more idiomatic? The approach 'a' is shorter but doesn't enforce message type correctness (one could put anything in the channel). The approach 'b' fixes this, but has more boilerplate and more space for human error: both channels need to be checked for closedness (easy to forget) and someone needs to actually close both of them (even easier to forget).
Long story short I'd rather use 'a' but it doesn't leverage the type system and thus feels ugly. Maybe there is an even better option?
I would also go with option 'a': one channel only. You can enforce type correctness if you create a base message type (an interface) and both of the possible message types implement it (or if they are interfaces too, they can embed it).
A further advantage of the one-channel solution is that it is extensible. If you later want to handle a third type of message, it's very easy to add and handle it. With the other approach you would need a third channel, which soon becomes unmanageable as the number of message types grows and makes your code ugly. Also, with multiple channels, select picks pseudo-randomly among the channels that are ready, so the relative order of messages arriving on different channels is not preserved, and a lone message on a quiet channel may wait behind traffic on busy ones.
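One way to sketch the base-message-type idea is an interface with an unexported marker method, so that only types in your package can be sent on the channel. The names Msg, isMsg and describe, and the struct fields, are assumptions for illustration.

```go
package main

import "fmt"

// Msg is a hypothetical base type. The unexported marker method means
// only types declared in this package can implement it.
type Msg interface{ isMsg() }

type Message struct{ Text string }
type OtherMessage struct{ Code int }

func (Message) isMsg()      {}
func (OtherMessage) isMsg() {}

// describe stands in for the per-type handling inside Start.
func describe(m Msg) string {
    switch m := m.(type) {
    case Message:
        return "message: " + m.Text
    case OtherMessage:
        return fmt.Sprint("other: ", m.Code)
    default:
        return "unknown"
    }
}

func main() {
    in := make(chan Msg, 2)
    in <- Message{Text: "hi"}
    in <- OtherMessage{Code: 7}
    // in <- "a string" would not compile: string does not implement Msg.
    close(in)
    for m := range in {
        fmt.Println(describe(m))
    }
}
```

The type switch is still there, but the compiler now rejects arbitrary values at the send site, which addresses the main objection to chan interface{}.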
Answer to your questions first:
1) You already identified the major functional difference: the ordering depends on how the channels are written to. There are also some differences in how a channel of a struct type versus a channel of an interface type is implemented. Mostly these are implementation details and don't change the outcome of your code much, but if you're sending millions of messages, they may cost you.
2) I would say neither example you gave is more idiomatic than the other simply by reading your pseudocode, because whether you read from one channel or two has more to do with the semantics and requirements of your program (ordering, where the data is coming from, channel depth requirements, etc.) than anything else. For example, what if one of the message types was a "stop" message to tell your processor to stop reading, or to do something that could change the state of future messages processed? Maybe that would go on its own channel to make sure it doesn't get delayed by pending writes to the other channel.
And then you asked for possibly a better option?
One way to keep using a single channel and also keep from doing type checks is to instead send an enclosing type as the channel type:
type Message struct { /* ... */ }
type OtherMessage struct { /* ... */ }
type Wrap struct {
    *Message
    *OtherMessage
}
type MyProcess struct {
    in chan Wrap
}
func (foo *MyProcess) Start() {
    for msg := range foo.in {
        if msg.Message != nil {
            // do processing of message here
        }
        if msg.OtherMessage != nil {
            // process OtherMessage here
        }
    }
    // someone closed `in` - bye
}
An interesting side effect of struct Wrap is you can send both a Message and OtherMessage in the same channel message. It's up to you to decide whether this means anything or will happen at all.
One should note that if Wrap was going to grow beyond a handful of message types the cost of sending a wrap instance may actually be higher at some breakoff point (easy enough to benchmark) than simply sending an interface type and doing a type switch.
The other thing you may want to look at, depending on the similarity between the types, is defining a non-empty interface that both Message and OtherMessage satisfy through their method sets; maybe it will contain functionality that removes the need for a type switch at all.
Maybe you're reading messages to send them to a queuing library and all you really needed to get was:
interface {
    MessageID() string
    SerializeJSON() []byte
}
(I just made that up for illustration purposes)
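To show how such an interface removes the type switch, here is a rough sketch built on the made-up methods above. The interface name Queueable, the struct fields, and the forward function are all assumptions for illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// Queueable names the made-up interface from the answer.
type Queueable interface {
    MessageID() string
    SerializeJSON() []byte
}

type Message struct {
    ID   string `json:"id"`
    Text string `json:"text"`
}

func (m Message) MessageID() string { return m.ID }
func (m Message) SerializeJSON() []byte {
    b, _ := json.Marshal(m) // error ignored: these field types always marshal
    return b
}

type OtherMessage struct {
    ID   string `json:"id"`
    Code int    `json:"code"`
}

func (m OtherMessage) MessageID() string { return m.ID }
func (m OtherMessage) SerializeJSON() []byte {
    b, _ := json.Marshal(m)
    return b
}

// forward handles every message the same way, with no type switch.
func forward(in <-chan Queueable) []string {
    var out []string
    for m := range in {
        out = append(out, m.MessageID()+" "+string(m.SerializeJSON()))
    }
    return out
}

func main() {
    in := make(chan Queueable, 2)
    in <- Message{ID: "m1", Text: "hi"}
    in <- OtherMessage{ID: "o1", Code: 7}
    close(in)
    for _, line := range forward(in) {
        fmt.Println(line)
    }
}
```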

More idiomatic way of adding channel result to queue on completion

So, right now, I just pass a pointer to a Queue object (implementation doesn't really matter) and call queue.add(result) at the end of goroutines that should add things to the queue.
I need that same sort of functionality—and of course doing a loop checking completion with the comma ok syntax is unacceptable in terms of performance versus the simple queue add function call.
Is there a way to do this better, or not?
There are actually two parts to your question: how does one queue data in Go, and how does one use a channel without blocking.
For the first part, it sounds like what you need to do is instead of using the channel to add things to the queue, use the channel as a queue. For example:
var (
    ch = make(chan int) // You can add an int parameter to this make call to create a buffered channel
    // Do not buffer these channels!
    gFinished       = make(chan bool)
    processFinished = make(chan bool)
)

func f() {
    go g()
    for {
        // send values over ch here...
    }
    <-gFinished
    close(ch)
}

func g() {
    // create more expensive objects...
    gFinished <- true
}

func processObjects() {
    for val := range ch {
        // Process each val here
    }
    processFinished <- true
}

func main() {
    go processObjects()
    f()
    <-processFinished
}
As for how you can make this more asynchronous, you can (as cthom06 pointed out) pass an integer as the second argument to the make call for ch, which makes send operations asynchronous until the channel's buffer is full.
EDIT: However (as cthom06 also pointed out), because you have two goroutines writing to the channel, one of them has to be responsible for closing the channel. Also, my previous revision would exit before processObjects could complete. The way I chose to synchronize the goroutines is by creating a couple more channels that pass around dummy values to ensure that the cleanup gets finished properly. Those channels are specifically unbuffered so that the sends happen in lock-step.

Resources