How to pass vars with chan between functions - go

I'm implementing a simple mechanism for passing a variable between two goroutines with a channel. Here is my code:
pipe := make(chan string)
go func(out chan string, data string) { // 1st goroutine
	out <- DataSignerMd5(data)
}(pipe, data)
go func(in chan string) { // 2nd goroutine
	data := <-in
	in <- DataSignerCrc32(data)
}(pipe)
crcMdData := <-pipe
Most likely, crcMdData receives the value from pipe before the 2nd goroutine does. I guess I can simply create another channel to make this work. But maybe it's possible with a single pipe?

You should use a second channel for what you want to do. You could get away with using a single channel and switching on the result, but that's not really ideal - you're basically trying to put two different types of objects into the same channel, and your program will end up being a lot cleaner and easier to reason about if you just have one channel per data type / intended transformation.
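Here is a minimal sketch of the two-channel version. The question doesn't show DataSignerMd5 and DataSignerCrc32, so dataSignerMd5 and dataSignerCrc32 below are stand-ins I made up (hex MD5 and decimal CRC32); md5Out, crcOut and signPipeline are illustrative names too.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"hash/crc32"
	"strconv"
)

// dataSignerMd5 and dataSignerCrc32 are assumed stand-ins for the
// question's DataSignerMd5 and DataSignerCrc32, which are not shown.
func dataSignerMd5(data string) string {
	sum := md5.Sum([]byte(data))
	return hex.EncodeToString(sum[:])
}

func dataSignerCrc32(data string) string {
	return strconv.FormatUint(uint64(crc32.ChecksumIEEE([]byte(data))), 10)
}

// signPipeline runs the two stages in separate goroutines,
// with one channel per hop so no goroutine reads its own output.
func signPipeline(data string) string {
	md5Out := make(chan string)
	crcOut := make(chan string)

	go func(out chan<- string, data string) { // 1st goroutine
		out <- dataSignerMd5(data)
	}(md5Out, data)

	go func(in <-chan string, out chan<- string) { // 2nd goroutine
		out <- dataSignerCrc32(<-in)
	}(md5Out, crcOut)

	// Only the second stage ever sends on crcOut, so this receive
	// cannot pick up the intermediate MD5 value by accident.
	return <-crcOut
}

func main() {
	fmt.Println(signPipeline("hello"))
}
```

The directional types (chan<- and <-chan) are optional, but they let the compiler reject a stage that tries to send on its input.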

Related

Golang: Can channels be closed in a deferred function?

I was working on some Go code and wanted to close a channel at the end of it. However, I wanted to test whether the channel would actually be closed if I used an anonymous function, passing the channel as an argument, and called that function with the defer keyword.
channel := make(chan string, 2)
defer func(channel chan string) {
	close(channel)
	value, ok := <-channel
	fmt.Println("Value:", value, "Open channel?", ok)
}(channel)
To my understanding everything in Go is passed by value and not by reference, and make's documentation says that it returns a channel when used to spawn a channel, not a pointer to a channel. Does this imply that if I pass the channel as argument to a function and close the channel within that function, the channel outside the function would remain open? If so, I'm guessing I'd need to pass the channel as a pointer to the function to achieve that behavior?
I'm new to Go, so I'd appreciate it if you could explain the "why" rather than "Oh, just swap this chunk for this other chunk of code", just so I can better understand how it's actually working :)
Thanks!
Channels behave like slices, maps, strings, or functions, in that they actually store a wrapped pointer to the underlying data. They behave like opaque pointers.
So passing a channel copies it by value, but just like a slice, its "value" is literally just a reference to the actual low-level channel data structure. So there is no need to explicitly pass a channel pointer, since that would be like passing a pointer to a pointer.
To answer your questions: yes, a channel can be closed in a deferred function, and yes, it's safe to pass channels by value because they are "pointers" in and of themselves.
You can also verify it with a slightly modified version of your code:
package main

import "fmt"

func main() {
	channel := make(chan string, 2)
	// channel is open here.
	// close the channel by passing it into a function
	// by value and closing it within the function.
	func(channel chan string) {
		close(channel)
	}(channel)
	// is the channel closed?
	value, ok := <-channel
	fmt.Println("Value:", value, "Open channel?", ok)
}
Output (go version go1.16 darwin/amd64):
Value:  Open channel? false
Does this imply that if I pass the channel as argument to a function and close the channel within that function, the channel outside the function would remain open?
No. Copying a chan variable doesn't produce a copy of the channel itself. Variables of type chan are handles to the underlying channel created by make. Closing any copy of the chan variable closes the channel.
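A small sketch of the "handle" point, in case it helps: chan values are comparable, so you can check that a copy and the original refer to the same channel. closeCopy and isClosed are helper names I made up for the illustration.

```go
package main

import "fmt"

// closeCopy receives its own copy of the chan value and closes it.
func closeCopy(c chan string) {
	close(c)
}

// isClosed reports whether a receive on c completes immediately with
// ok == false. It assumes c holds no buffered values, so nothing real
// is consumed by the check.
func isClosed(c chan string) bool {
	select {
	case _, ok := <-c:
		return !ok
	default:
		return false // receive would block: channel is open and empty
	}
}

func main() {
	original := make(chan string, 2)
	alias := original // copies the handle, not the channel itself

	fmt.Println("same channel:", original == alias)
	fmt.Println("closed before:", isClosed(original))
	closeCopy(alias)
	fmt.Println("closed after:", isClosed(original))
}
```

Closing through alias is visible through original, because both variables point at the one channel that make created.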

Channel of type "set"

I recently started writing Go after years of programming in C#, and I'm having a hard time wrapping my head around several concepts of the language. Here's an example of what I'm trying to solve: I'd like to be able to create a routine that iterates over a list, calls a function, and stores the output in a buffered channel. The issue is I want to return a distinct set of these output values, as the function can return similar results for two different elements in the list.
Since Go doesn't have a built-in set type, I'm trying to use a map[string]bool to store distinct values (using map[string]bool or map[string]struct{} is what others suggested as a replacement for a set), and I'm using a buffered channel to insert into this map. However, I'm not certain what the right syntax for inserting one element into a map would look like. Here's what I'm trying to do:
resultsChnl := make(chan map[string]bool, len(myList))
go func(myList []string, resultsChnl chan map[string]bool) {
	for _, item := range myList {
		result, err := getResult(item)
		/* error checking */
		resultsChnl <- {result: true}
	}
	close(resultsChnl)
}(myList, resultsChnl)
for item := range resultsChnl {
	...
}
Obviously this doesn't compile because resultsChnl <- {result: true} is invalid syntax. I know this sounds impractical, since in this particular case I could create a local map inside the for loop, assign one map[string]bool object to a non-buffered channel, and return that. But let's assume I was creating a goroutine for each item in the list and really wanted to use a buffered channel (as opposed to using a mutex to grab a lock on a shared map). So is there any way to insert one key-value pair in a map channel? Or am I thinking about this completely wrong?
To answer the question directly, you would want
resultsChnl <- map[string]bool{result: true}
But this doesn't seem useful at all. You may want to collect the results in a map, but there's no reason to pass a map over the channel for each result when you know it will only have one element. Simply use a channel of string, do
resultsChnl <- result
for each result in your producer goroutine, and
seenResult[item] = true
in your consumer loop to collect the results (where seenResult is a map[string]bool).
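Put together, that could look roughly like the sketch below. getResult here is a made-up stand-in (it just lower-cases its input so that two items can collide), and distinctResults is an illustrative name, not something from the question.

```go
package main

import (
	"fmt"
	"strings"
)

// getResult is a hypothetical stand-in for the question's getResult.
func getResult(item string) (string, error) {
	return strings.ToLower(item), nil
}

// distinctResults sends each result over a buffered chan string and
// deduplicates on the receiving side with a map[string]bool.
func distinctResults(myList []string) map[string]bool {
	resultsChnl := make(chan string, len(myList))

	go func() {
		defer close(resultsChnl) // lets the range loop below terminate
		for _, item := range myList {
			result, err := getResult(item)
			if err != nil {
				continue // real code would report the error
			}
			resultsChnl <- result
		}
	}()

	seenResult := make(map[string]bool)
	for item := range resultsChnl {
		seenResult[item] = true
	}
	return seenResult
}

func main() {
	// "A" and "a" produce the same result, so only two keys remain.
	fmt.Println(len(distinctResults([]string{"A", "a", "b"})))
}
```

Only the consumer touches the map, so no lock is needed.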
Or forget about the channel entirely and have your producer goroutines write directly into a sync.Map.
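A sketch of the sync.Map variant, with one goroutine per item and a sync.WaitGroup to know when they are all done. Again, getResult is a hypothetical stand-in and distinctResults is just a name for the example.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// getResult is a hypothetical stand-in for the question's getResult.
func getResult(item string) (string, error) {
	return strings.ToLower(item), nil
}

// distinctResults starts one goroutine per item; each stores its
// result as a key in a sync.Map, which is safe for concurrent use.
func distinctResults(myList []string) []string {
	var seen sync.Map
	var wg sync.WaitGroup

	for _, item := range myList {
		wg.Add(1)
		go func(item string) {
			defer wg.Done()
			result, err := getResult(item)
			if err != nil {
				return // real code would report the error
			}
			seen.Store(result, true)
		}(item)
	}
	wg.Wait() // all producers have finished storing

	var distinct []string
	seen.Range(func(key, _ interface{}) bool {
		distinct = append(distinct, key.(string))
		return true // keep iterating
	})
	return distinct
}

func main() {
	fmt.Println(len(distinctResults([]string{"A", "a", "b"})))
}
```

Range visits keys in no particular order, so sort the slice if the order matters.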

Catching return values from goroutines

The below code gives compilation error saying 'unexpected go':
x := go doSomething(arg)
func doSomething(arg int) int {
	...
	return my_int_value
}
I know I can fetch the return value if I call the function normally, i.e. without using a goroutine, or I can use channels etc.
My question is: why is it not possible to fetch a return value like this from a goroutine?
Why is it not possible to fetch a return value from a goroutine assigning it to a variable?
Running a goroutine (asynchronously) and fetching a return value from a function are essentially contradictory actions. When you say go, you mean "do it asynchronously", or even simpler: "Go on! Don't wait for the function to finish executing". But when you assign a function's return value to a variable, you expect to have that value in the variable. So when you write x := go doSomething(arg), you are saying: "Go on, don't wait for the function! Wait-wait-wait! I need the returned value to be accessible in x on the very next line!"
Channels
The most natural way to fetch a value from a goroutine is through channels. Channels are the pipes that connect concurrent goroutines. You can send values into a channel from one goroutine and receive those values in another goroutine or in a synchronous function. You can easily obtain a value from a goroutine without breaking concurrency by using select:
package main

import (
	"fmt"
	"time"
)

func main() {
	c1 := make(chan string)
	c2 := make(chan string)
	go func() {
		time.Sleep(time.Second * 1)
		c1 <- "one"
	}()
	go func() {
		time.Sleep(time.Second * 2)
		c2 <- "two"
	}()
	for i := 0; i < 2; i++ {
		// Await both of these values
		// simultaneously, printing each one as it arrives.
		select {
		case msg1 := <-c1:
			fmt.Println("received", msg1)
		case msg2 := <-c2:
			fmt.Println("received", msg2)
		}
	}
}
The example is taken from Go By Example
CSP & message-passing
Go is largely based on CSP theory. The naive description above could be stated precisely in terms of CSP (although I believe that is out of scope for this question). I strongly recommend familiarizing yourself with CSP theory, if only because it is RAD. These short quotations give a direction of thinking:
As its name suggests, CSP allows the description of systems in terms of component processes that operate independently, and interact with each other solely through message-passing communication.
In computer science, message passing sends a message to a process and relies on the process and the supporting infrastructure to select and invoke the actual code to run. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name.
The strict answer is that you can do that. It's just probably not a good idea. Here's code that would do that:
var x int
go func() {
	x = doSomething()
}()
This will spawn off a new goroutine which will calculate doSomething() and then assign the result to x. The problem is: how are you going to use x from the original goroutine? You probably want to make sure the spawned goroutine is done with it so that you don't have a race condition. But if you want to do that, you'll need a way to communicate with the goroutine, and if you've got a way to do that, why not just use it to send the value back?
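To make that concrete, here is a sketch that keeps the shared variable but adds the synchronization it needs, using a sync.WaitGroup. doSomething is a placeholder returning a fixed value, and compute is a name I picked for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// doSomething is a placeholder for the question's function.
func doSomething() int {
	return 42
}

// compute assigns to x from a spawned goroutine and waits for it
// before reading x, so there is no race on x.
func compute() int {
	var x int
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		x = doSomething()
	}()

	// Other work could happen here while doSomething runs.

	wg.Wait() // once Wait returns, the write to x is visible here
	return x
}

func main() {
	fmt.Println(compute())
}
```

It works, but you end up with a WaitGroup plus a shared variable where a single channel would carry both the "done" signal and the value.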
The idea of the go keyword is that you run the doSomething function asynchronously, and continue the current goroutine without waiting for the result, kind of like executing a command in a Bash shell with an '&' after it. If you want to do
x := doSomething(arg)
// Now do something with x
then you need the current goroutine to block until doSomething finishes. So why not just call doSomething in the current goroutine? There are other options (like, doSomething could post a result to a channel, which the current goroutine receives values from) but simply calling doSomething and assigning the result to a variable is obviously simpler.
It's a design choice by Go's creators. There are many abstractions/APIs for representing the value of async I/O operations: promise, future, async/await, callback, observable, etc. These abstractions/APIs are inherently tied to the unit of scheduling (coroutines), and they dictate how coroutines (or, more precisely, the return values of the async I/O they represent) can be composed.
Go chose message passing (aka channels) as the abstraction/API to represent the return value of async I/O operations. And of course, goroutines and channels give you a composable tool to implement async I/O operations.
Why not use a channel to write into?
chanRes := make(chan int, 1)
go doSomething(arg, chanRes)
// blocks here, or you can use some other sync mechanism (do something else) and wait
x := <-chanRes
func doSomething(arg int, out chan<- int) {
	...
	out <- my_int_value
}
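A runnable version of the same idea, as a sketch: the body of doSomething is elided in the snippet above, so here it simply squares its argument (my assumption), and run is a wrapper name added for the example.

```go
package main

import "fmt"

// doSomething is a stand-in: it squares arg and sends the result on
// out instead of returning it.
func doSomething(arg int, out chan<- int) {
	out <- arg * arg
}

// run starts doSomething in a goroutine and collects its result.
func run(arg int) int {
	// Buffer of 1: the goroutine can send and exit even if the
	// receiver is not ready yet.
	chanRes := make(chan int, 1)
	go doSomething(arg, chanRes)

	// Other work could happen here.

	return <-chanRes // blocks until the result has been sent
}

func main() {
	fmt.Println(run(7))
}
```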

How to block all goroutines except the one running

I have two (later there will be three) goroutines that handle incoming messages from a remote server (from an AMQP channel). But because they operate on the same data/state, I want to block all other goroutines except the one running.
I came up with a solution using a chan bool, where each goroutine blocks the others and then releases them. The code looks like this:
package main

func a(deliveries <-chan amqp, handleDone chan bool) {
	for d := range deliveries {
		<-handleDone        // Data comes always, wait for other channels
		handleDone <- false // Block other channels
		// Do stuff with data...
		handleDone <- true // I'm done, other channels are free to do anything
	}
}

func b(deliveries <-chan amqp, handleDone chan bool) {
	for d := range deliveries {
		<-handleDone
		handleDone <- false
		// Do stuff with data...
		handleDone <- true
	}
}

func main() {
	handleDone := make(chan bool, 1)
	go a(arg1, handleDone)
	go b(arg2, handleDone)
	// go c(arg3, handleDone) , later
	handleDone <- true // kickstart
}
But the first time around, each function will receive the true from handleDone, so they will all execute. And if I later add a third function, things will get more complicated. How can I block all other goroutines except the one running? Are there any better solutions?
You want to look at the sync package.
http://golang.org/pkg/sync/
You would do this with a mutex.
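A minimal sketch of the mutex approach. The delivery type is simplified to string, and the state struct, feed helper and process function are names invented for the example; feed stands in for an AMQP delivery stream.

```go
package main

import (
	"fmt"
	"sync"
)

// state is hypothetical shared data guarded by a mutex.
type state struct {
	mu      sync.Mutex
	handled int
}

// handle is the critical section: only one goroutine runs it at a time.
func (s *state) handle(delivery string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Do stuff with data...
	s.handled++
}

// worker processes every delivery on its own stream.
func worker(deliveries <-chan string, s *state, wg *sync.WaitGroup) {
	defer wg.Done()
	for d := range deliveries {
		s.handle(d)
	}
}

// feed returns a pre-filled, closed channel standing in for a
// delivery stream.
func feed(msgs ...string) <-chan string {
	ch := make(chan string, len(msgs))
	for _, m := range msgs {
		ch <- m
	}
	close(ch)
	return ch
}

// process runs two workers against the same state and returns how
// many deliveries were handled.
func process() int {
	s := &state{}
	var wg sync.WaitGroup
	wg.Add(2)
	go worker(feed("1", "2"), s, &wg)
	go worker(feed("3"), s, &wg)
	wg.Wait()
	return s.handled
}

func main() {
	fmt.Println(process())
}
```

Adding a third worker is just another go worker(...) call sharing the same state; nothing about the locking changes.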
If you have an incoming stream of messages and you have three goroutines listening on that stream and processing and you want to ensure that only one goroutine is running at a time, the solution is quite simple: kill off two of the goroutines.
You're spinning up concurrency and adding complexity and then trying to prevent them from running concurrently. The end result is the same as a single stream reader, but with lots of things that can go wrong.
I'm puzzled why you want this. Why can't each message on deliveries be handled independently? And why are there two different functions handling those messages? If each is responsible for a particular type of message, it seems like you want one deliveries receiver that dispatches to the appropriate logic for the type.
But to answer your question, I don't think it's true that each function will get a true from handleDone on start. One (let's say it's a) is receiving the true sent from main; the other (b then) is getting the false sent from the first. Because you're discarding the value received, you can't tell this. And then both are running, and you're using a buffered channel (you probably want make(chan bool) instead for an unbuffered one), so confusion ensues, particularly when you add that third goroutine.
The handleDone <- false doesn't actually accomplish anything. Just treat any value on handleDone as the baton in a relay race. Once a goroutine receives this value, it can do its thing; when it's done, it should send it to the channel to hand it to the next goroutine.
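The baton idea could be sketched like this. I'm assuming a channel with a buffer of 1, so that handing the baton back never blocks, even when no other goroutine is waiting for it. Deliveries are simplified to string, and worker, feed and run are illustrative names.

```go
package main

import (
	"fmt"
	"sync"
)

// worker handles a delivery only while it holds the baton.
func worker(name string, deliveries <-chan string, baton chan struct{}, handled *[]string, wg *sync.WaitGroup) {
	defer wg.Done()
	for d := range deliveries {
		<-baton // take the baton; other workers block here
		*handled = append(*handled, name+":"+d)
		baton <- struct{}{} // hand the baton to whoever is next
	}
}

// feed returns a pre-filled, closed channel standing in for a
// delivery stream.
func feed(msgs ...string) <-chan string {
	ch := make(chan string, len(msgs))
	for _, m := range msgs {
		ch <- m
	}
	close(ch)
	return ch
}

// run starts two workers sharing one baton and returns what they handled.
func run() []string {
	var handled []string
	var wg sync.WaitGroup
	baton := make(chan struct{}, 1)

	wg.Add(2)
	go worker("a", feed("1", "2"), baton, &handled, &wg)
	go worker("b", feed("3"), baton, &handled, &wg)

	baton <- struct{}{} // kickstart: put the single baton in play
	wg.Wait()
	return handled
}

func main() {
	fmt.Println(len(run()), "deliveries handled, one at a time")
}
```

The value carried by the baton is never inspected, which is why struct{} is enough. In effect this is a mutex built from a channel.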

More idiomatic way of adding channel result to queue on completion

So, right now, I just pass a pointer to a Queue object (implementation doesn't really matter) and call queue.add(result) at the end of goroutines that should add things to the queue.
I need that same sort of functionality—and of course doing a loop checking completion with the comma ok syntax is unacceptable in terms of performance versus the simple queue add function call.
Is there a way to do this better, or not?
There are actually two parts to your question: how does one queue data in Go, and how does one use a channel without blocking.
For the first part, it sounds like what you need to do is instead of using the channel to add things to the queue, use the channel as a queue. For example:
var (
	ch = make(chan int) // You can add an int parameter to this make call to create a buffered channel
	// Do not buffer these channels!
	gFinished       = make(chan bool)
	processFinished = make(chan bool)
)

func f() {
	go g()
	for {
		// send values over ch here...
	}
	<-gFinished
	close(ch)
}

func g() {
	// create more expensive objects...
	gFinished <- true
}

func processObjects() {
	for val := range ch {
		// Process each val here
	}
	processFinished <- true
}

func main() {
	go processObjects()
	f()
	<-processFinished
}
As for how you can make this more asynchronous, you can (as cthom06 pointed out) pass an integer as the second argument to the make call that creates ch, which will make send operations asynchronous until the channel's buffer is full.
EDIT: However (as cthom06 also pointed out), because you have two goroutines writing to the channel, one of them has to be responsible for closing the channel. Also, my previous revision would exit before processObjects could complete. The way I chose to synchronize the goroutines is by creating a couple more channels that pass around dummy values to ensure that the cleanup gets finished properly. Those channels are specifically unbuffered so that the sends happen in lock-step.
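To illustrate the buffer argument to make mentioned above, here is a small sketch that counts how many sends go through without any receiver. trySend and fillCount are helper names made up for the example.

```go
package main

import "fmt"

// trySend reports whether v could be sent on ch without blocking.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // the send would block
	}
}

// fillCount returns how many sends succeed on a channel of the given
// capacity when nothing is receiving.
func fillCount(capacity int) int {
	ch := make(chan int, capacity)
	n := 0
	for trySend(ch, n) {
		n++
	}
	return n
}

func main() {
	// An unbuffered channel accepts no send without a receiver;
	// a channel with capacity 3 accepts exactly three.
	fmt.Println(fillCount(0), fillCount(3))
}
```

Once the buffer is full, a plain send blocks again until the consumer catches up, so the buffer smooths out bursts but does not remove the need for a receiver.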
