Channel of type "set" - go

I recently started writing Go after years of programming in C#, and I'm having a hard time wrapping my head around several concepts of the language. Here's an example of what I'm trying to solve: I'd like to be able to create a routine that iterates over a list, calls a function, and stores the output in a buffered channel. The issue is I want to return a distinct set of these output values, as the function can return similar results for two different elements in the list.
Since Go doesn't have a built-in set type, I'm trying to use a map[string]bool to store distinct values (using map[string]bool or map[string]struct is what others suggested as a replacement for a set); and I'm using a buffered channel to insert into this map, however I'm not certain what the right syntax for inserting 1 element into a map would look like. Here's what I'm trying to do:
resultsChnl := make(chan map[string]bool, len(myList))
go func(myList []string, resultsChnl chan map[string]bool) {
for _, item := range myList {
result, err := getResult(item)
/* error checking */
resultsChnl <- {result: true}
}
close(resultsChnl)
}(myList, resultsChnl)
for item := range resultsChnl {
...
}
Obviously this doesn't compile due to invalid syntax of resultsChnl <- {result: true}. I know this sounds impractical since naturally in this particular case I could create a local map inside the for loop and assign one map[string]bool object to a non-buffered channel and return that, but let's assume I was creating a go routine for each item in the list and really wanted to use a buffered channel (as opposed to using a mutex to grab a lock on a shared map). So is there any way to insert one key-value pair in a map channel? Or am I thinking about this completely wrong?

To answer the question directly, you would want
resultsChnl <- map[string]bool{result: true}
But this doesn't seem useful at all. You may want to collect the results in a map, but there's no reason to pass a map over the channel for each result when you know it will only have one element. Simply use a channel of string, do
resultsChnl <- result
for each result in your producer goroutine, and
seenResult[item] = true
in your consumer loop to collect the results (where seenResult is a map[string]bool).
Or forget about the channel entirely and have your producer goroutines write directly into a sync.Map.

Related

Calling multiple functions with different signatures concurrently

I'd like some feedback on the implementation details of what I'm trying to build. What I want to achieve is have multiple functions with different signatures that can be called concurrently.
Calling the functions in coroutines sequentially works fine, but I'm wondering if there's a way to do this in a more idiomatic way, e.g. iterate over a slice of functions.
Since each function has different arguments and return values though, I have trouble figuring out what the best approach would be. An example that is a bit similar to my goal can be seen here: Golang - How do you create a slice of functions with different signatures?, but there the code just calls the functions and doesn't account for any return values.
Is what I have in mind even possible?
You can use code from linked question and just wrap the v.Call(params) into an anonymous function executing in its own goroutine like this:
...
// WaitGroup to wait on goroutines to finish their execution
var wg sync.WaitGroup
for a, v := range f {
v := reflect.TypeOf(v)
//calling the function from reflect
val := reflect.ValueOf(f[a])
params := make([]reflect.Value, v.NumIn())
if v.NumIn() == 1 {
params[0] = reflect.ValueOf(1564)
} else if v.NumIn() == 2 {
params[0] = reflect.ValueOf("Test FROM reflect")
params[1] = reflect.ValueOf(float32(123456))
}
// Run them in parallel
wg.Add(1)
go func() {
defer wg.Done()
val.Call(params)
}()
}
wg.Wait()
See it on Go Playground
As for return values Value.Call() returns []Value which is slice of return values - so you are covered here too. Your question doesn't specify what you intend to do with results but given they will be generated in parallel you'll probably need to send them through a channel(s) - you can do that in anonymous function (after processing return slice) too.
go func() { MyPackage.MyFunc(with, whatsoever, signature); }() - roughtly, that's what you need. You span as many goroutines (using the go keyword) as there are concurrent functions.
There is no notion of "returned value" from goroutine. For that you have to use channels. They are primary communication mechanism. So, you span a new goroutine with some function f of arbitrary signature and when it's done and you got some result, you send it to some channel shared between goroutines for communication.
Channels are thread-safe and were carefully designed to handle such a communication gracefully. Go, as programming language, provides few keywords that deal with reading/writing to/from channels. So there are pretty fundamental to (concurrent) programming in Go.
However, of course, you can handle it differently. Sharing some mutable memory protected by some kind of locking, or relying upon lockless compareAndSet fashion. Arguably, that's less idiomatic way and generally have to be avoided. Always prefer channels.

How to pass vars with chan between functions

I'm implementing a simple mechanism of passing variable between two goroutines with a channel. Here is my code:
pipe := make(chan string)
go func(out chan string, data string) { //1st goroutine
out <- DataSignerMd5(data)
}(pipe, data)
go func(in chan string) { //2nd goroutine
data := <-in
in <- DataSignerCrc32(data)
}(pipe)
crcMdData := <- pipe
More likely, crcMdData pulls a variable from pipe before 2nd goroutine. I guess that I simply can create another channel to make this work. But maybe it's possible with a single pipe?
You should use a second channel for what you want to do. You could get away with using a single channel and switching on the result, but that's not really ideal - you're basically trying to put two different types of objects into the same channel, and your program will end up being a lot cleaner and easier to reason about if you just have one channel per data type / intended transformation.

is delete guaranteed to delete from a hash in golang?

I have a hash like this:
var TransfersInFlight map[string]string = make(map[string]string)
And before I send a file I make a key for it store, send it, delete it:
timeKey := fmt.Sprintf("%v",time.Now().UnixNano())
TransfersInFlight[timeKey] = filename
total, err := sendTheFile(filename)
delete(TransfersInFlight, timeKey)
i.e. during the time it takes to send the file, there is a key in the hash with a timestamp pointing to the filename.
the func sendTheFile always either works, or has an err but never throws a stacktrace exception and crashes the whole program so the line:
delete(TransfersInFlight, timeKey)
should be called 100% of the time. And yet, I sometimes find cases where it's like this line was never called and the file is stuck in TransfersInFlight forever. How is this possible?
Maps are not safe for concurrent access. I would do this either using a mutex to moderate map access or having a goroutine reading either a channel of "op" structs or have a "add" channel and a "delete" channel.
You're probably safe having multiple read-only accesses concurrently, but once you have writes in the mix, you really want to ensure you only have one access at a time.
If you are set on using a goroutine to manage the count, one way would be something like:
import "sync/atomic"
var TransferChan chan int32
var TransfersInFlight int32
func TransferManager() {
TransfersInFlight = 0
for delta := range TransferChan {
// You're *probably* safe just using +=, but, you know...
atomic.AddInt32(&TransfersInFlight, delta)
}
}
That way, you only need to do go TransferManager() and then pass your increments and decrements over the TransferChan channel.

Catching return values from goroutines

The below code gives compilation error saying 'unexpected go':
x := go doSomething(arg)
func doSomething(arg int) int{
...
return my_int_value
}
I know, I can fetch the return value if I call the function normally i.e. without using goroutine or I can use channels etc.
My question is why is it not possible to fetch a return value like this from a goroutine.
Why is it not possible to fetch a return value from a goroutine assigning it to a variable?
Run goroutine (asynchronously) and fetch return value from function are essentially contradictory actions. When you say go you mean "do it asynchronously" or even simpler: "Go on! Don't wait for the function execution be finished". But when you assign function return value to a variable you are expecting to have this value within the variable. So when you do that x := go doSomething(arg) you are saying: "Go on, don't wait for the function! Wait-wait-wait! I need a returned value be accessible in x var right in the next line below!"
Channels
The most natural way to fetch a value from a goroutine is channels. Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine or in a synchronous function. You could easily obtain a value from a goroutine not breaking concurrency using select:
func main() {
c1 := make(chan string)
c2 := make(chan string)
go func() {
time.Sleep(time.Second * 1)
c1 <- "one"
}()
go func() {
time.Sleep(time.Second * 2)
c2 <- "two"
}()
for i := 0; i < 2; i++ {
// Await both of these values
// simultaneously, printing each one as it arrives.
select {
case msg1 := <-c1:
fmt.Println("received", msg1)
case msg2 := <-c2:
fmt.Println("received", msg2)
}
}
}
The example is taken from Go By Example
CSP & message-passing
Go is largerly based on CSP theory. The naive description from above could be precisely outlined in terms of CSP (although I believe it is out of scope of the question). I strongly recommend to familiarize yourself with CSP theory at least because it is RAD. These short quotations give a direction of thinking:
As its name suggests, CSP allows the description of systems in terms of component processes that operate independently, and interact with each other solely through message-passing communication.
In computer science, message passing sends a message to a process and relies on the process and the supporting infrastructure to select and invoke the actual code to run. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name.
The strict answer is that you can do that. It's just probably not a good idea. Here's code that would do that:
var x int
go func() {
x = doSomething()
}()
This will spawn off a new goroutine which will calculate doSomething() and then assign the result to x. The problem is: how are you going to use x from the original goroutine? You probably want to make sure the spawned goroutine is done with it so that you don't have a race condition. But if you want to do that, you'll need a way to communicate with the goroutine, and if you've got a way to do that, why not just use it to send the value back?
The idea of the go keyword is that you run the doSomething function asynchronously, and continue the current goroutine without waiting for the result, kind of like executing a command in a Bash shell with an '&' after it. If you want to do
x := doSomething(arg)
// Now do something with x
then you need the current goroutine to block until doSomething finishes. So why not just call doSomething in the current goroutine? There are other options (like, doSomething could post a result to a channel, which the current goroutine receives values from) but simply calling doSomething and assigning the result to a variable is obviously simpler.
It's a design choice by Go creators. There's a whole lot of abstractions/APIs to represent the value of async I/O operations - promise, future, async/await, callback, observable, etc. These abstractions/APIs are inherently tied to the unit of scheduling - coroutines - and these abstractions/APIs dictate how coroutines (or more precisely the return value of async I/O represented by them) can be composed.
Go chose message passing (aka channels) as the abstraction/API to represent the return value of async I/O operations. And of course, goroutines and channels give you a composable tool to implement async I/O operations.
Why not use a channel to write into?
chanRes := make(chan int, 1)
go doSomething(arg, chanRes)
//blocks here or you can use some other sync mechanism (do something else) and wait
x := <- chanRes
func doSomething(arg int, out chan<- int){
...
out <- my_int_value
}

How do I find out if a goroutine is done, without blocking?

All the examples I've seen so far involve blocking to get the result (via the <-chan operator).
My current approach involves passing a pointer to a struct:
type goresult struct {
result resultType;
finished bool;
}
which the goroutine writes upon completion. Then it's a simple matter of checking finished whenever convenient. Do you have better alternatives?
What I'm really aiming for is a Qt-style signal-slot system. I have a hunch the solution will look almost trivial (chans have lots of unexplored potential), but I'm not yet familiar enough with the language to figure it out.
You can use the "comma, ok" pattern (see their page on "effective go"):
foo := <- ch; // This blocks.
foo, ok := <- ch; // This returns immediately.
Select statements allows you to check multiple channels at once, taking a random branch (of the ones where communication is waiting):
func main () {
for {
select {
case w := <- workchan:
go do_work(w)
case <- signalchan:
return
// default works here if no communication is available
default:
// do idle work
}
}
}
For all the send and receive
expressions in the "select" statement,
the channel expressions are evaluated,
along with any expressions that appear
on the right hand side of send
expressions, in top-to-bottom order.
If any of the resulting operations can
proceed, one is chosen and the
corresponding communication and
statements are evaluated. Otherwise,
if there is a default case, that
executes; if not, the statement blocks
until one of the communications can
complete.
You can also peek at the channel buffer to see if it contains anything by using len:
if len(channel) > 0 {
// has data to receive
}
This won't touch the channel buffer, unlike foo, gotValue := <- ch which removes a value when gotValue == true.

Resources