Goroutine not executing after sending on a channel

package main
import (
"fmt"
"sync"
)
// PUT function
func put(hashMap map[string](chan int), key string, value int, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Printf("this is getting printed")
hashMap[key] <- value
fmt.Printf("this is not getting printed")
fmt.Printf("PUT sent %d\n", value)
}
func main() {
var value int
var key string
wg := &sync.WaitGroup{}
hashMap := make(map[string](chan int), 100)
key = "xyz"
value = 100
for i := 0; i < 5; i++ {
wg.Add(1)
go put(hashMap, key, value, wg)
}
wg.Wait()
}
The last two print statements in the put function are not getting printed. I am trying to put values into the map based on the key.
Also, how do I close the hashMap in this case?

You need to create the channel first, for example hashMap[key] = make(chan int).
Since you are not reading from the channel, you need a buffered channel (with enough capacity for all the sends) to make it work:
key := "xyz"
hashMap[key] = make(chan int, 5)
Try the following code:
func put(hashMap map[string](chan int), key string, value int, wg *sync.WaitGroup) {
hashMap[key] <- value
fmt.Printf("PUT sent %d\n", value)
wg.Done()
}
func main() {
var wg sync.WaitGroup
hashMap := map[string]chan int{}
key := "xyz"
hashMap[key] = make(chan int, 5)
for i := 0; i < 5; i++ {
wg.Add(1)
go put(hashMap, key, 100, &wg)
}
wg.Wait()
}
Output:
PUT sent 100
PUT sent 100
PUT sent 100
PUT sent 100
PUT sent 100
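If you would rather keep the channel unbuffered, the alternative is to have something receiving on the other side while the PUTs run. A minimal sketch of that variant (the reader goroutine and the done channel are only there for illustration):
package main

import (
    "fmt"
    "sync"
)

func put(hashMap map[string]chan int, key string, value int, wg *sync.WaitGroup) {
    defer wg.Done()
    hashMap[key] <- value
    fmt.Printf("PUT sent %d\n", value)
}

func main() {
    var wg sync.WaitGroup
    hashMap := map[string]chan int{}
    key := "xyz"
    hashMap[key] = make(chan int) // unbuffered this time

    // A reader on the other side lets each unbuffered send proceed.
    done := make(chan struct{})
    go func() {
        defer close(done)
        for v := range hashMap[key] {
            fmt.Println("GET received", v)
        }
    }()

    for i := 0; i < 5; i++ {
        wg.Add(1)
        go put(hashMap, key, 100, &wg)
    }
    wg.Wait()           // all PUTs have sent
    close(hashMap[key]) // no more sends; ends the reader's range loop
    <-done
}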

My solution to fix the problem is:
// PUT function
func put(hashMap map[string](chan int), key string, value int, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Printf("this is getting printed")
hashMap[key] <- value // <-- nil problem
fmt.Printf("this is not getting printed")
fmt.Printf("PUT sent %d\n", value)
}
In the line hashMap[key] <- value inside the put function, the send cannot proceed because hashMap[key] is a nil channel: no channel was ever created for that key in the map received through the hashMap map[string](chan int) parameter, and a send on a nil channel blocks forever.
// PUT function
func put(hashMap map[string](chan int), cval chan int, key string, value int, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Println("this is getting printed")
cval <- value // put the value in chan int (cval) which is initialized
hashMap[key] = cval // set the cval(chan int) to hashMap with key
fmt.Println("this is not getting printed")
fmt.Printf("PUT sent %s %d\n", key, value)
}
func main() {
var value int
wg := &sync.WaitGroup{}
cval := make(chan int,100)
hashMap := make(map[string](chan int), 100)
value = 100
for i := 0; i < 5; i++ {
wg.Add(1)
go put(hashMap, cval, fmt.Sprintf("key%d",i), value, wg)
}
wg.Wait()
/* uncomment to test cval
close(cval)
fmt.Println("Result:",<-hashMap["key2"])
fmt.Println("Result:",<-hashMap["key1"])
cval <- 88 // cannot send value to a close channel
hashMap["key34"] = cval
fmt.Println("Result:",<-hashMap["key1"])
*/
}
In my code example, I initialized cval as a buffered channel with capacity 100, the same size as hashMap, and passed cval as a value to the put function. You can only close cval; you cannot close the hashMap itself (close applies to channels, not maps).

Also, your code can be reduced to this; there is no need to pass parameters unnecessarily. One extra modification: I use different values to make the concept clearer.
package main
import (
"log"
"sync"
)
func put(hash chan int, wg *sync.WaitGroup) {
defer wg.Done()
log.Println("Put sent: ", <-hash)
}
func main() {
hashMap := map[string]chan int{}
key := "xyz"
var wg sync.WaitGroup
hashMap[key] = make(chan int, 5)
for i := 0; i < 5; i++ {
value := i
wg.Add(1)
go func(val int) {
hashMap[key] <- val
put(hashMap[key], &wg)
}(value)
}
wg.Wait()
}


How to return value from aggregate function over a chan [duplicate]

I have an aggregate function. I'm sending data to this function through a channel. Once I process the data, I have to send the updated information back to each original caller. Aggregation helps us improve latency.
I'm trying to send a struct of a channel & an int over a channel. The aggregate function will send the result back, via the channel inside the struct, to the original caller. This is what I have tried (Playground link):
package main
import (
"context"
"fmt"
"time"
)
// Original at https://elliotchance.medium.com/batch-a-channel-by-size-or-time-in-go-92fa3098f65
// This works.
func BatchStringsCtx[T any](ctx context.Context, values <-chan T, maxItems int, maxTimeout time.Duration) chan []T {
batches := make(chan []T)
go func() {
defer close(batches)
for keepGoing := true; keepGoing; {
var batch []T
expire := time.After(maxTimeout)
for {
select {
case <-ctx.Done():
keepGoing = false
goto done
case value, ok := <-values:
if !ok {
keepGoing = false
goto done
}
batch = append(batch, value)
if len(batch) == maxItems {
goto done
}
case <-expire:
goto done
}
}
done:
if len(batch) > 0 {
batches <- batch
}
}
}()
return batches
}
type ER struct{
e int
r chan int
}
// Process will do aggregation and some processing over the batch. Now result should go back to caller of each ER
func process(strings chan ER){
ctx := context.Background()
batches := BatchStringsCtx[ER](ctx, strings, 2, 10*time.Millisecond)
for batch := range batches {
for _, b := range batch{ // 2 elem in batch
b.r <- b.e + 100 // some operation. Batching helps in improving latency.
}
}
}
func main() {
strings := make(chan ER)
go process(strings)
er := ER{ e:0, make(chan chan int)}
er1 := ER{ e:1, make(chan chan int)}
go func() {
strings <- er
strings <- er1
close(strings)
}()
fmt.Println(<-er.r, <-er1.r) // print 100, 101
}
But I get these errors:
./prog.go:71:17: mixture of field:value and value elements in struct literal
./prog.go:72:18: mixture of field:value and value elements in struct literal
Any idea what can be improved?
Make the following changes to your code snippet. These two lines:
er := ER{ e:0, make(chan chan int)}
er1 := ER{ e:1, make(chan chan int)}
should be:
er := ER{e: 0, r: make(chan int)}
er1 := ER{e: 1, r: make(chan int)}
A composite literal must use either all keyed field: value elements or all positional values (you cannot mix them), and r has type chan int, so it should be created with make(chan int), not make(chan chan int).
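For a quick check, here is a stripped-down, runnable sketch with the corrected literals (the two goroutines merely stand in for the process() aggregator in your program):
package main

import "fmt"

type ER struct {
    e int
    r chan int
}

func main() {
    // All elements keyed: the valid form of the composite literal.
    er := ER{e: 0, r: make(chan int)}
    er1 := ER{e: 1, r: make(chan int)}

    // Each caller gets its result back on its own channel.
    go func() { er.r <- er.e + 100 }()
    go func() { er1.r <- er1.e + 100 }()

    fmt.Println(<-er.r, <-er1.r) // 100 101
}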

golang producer consumer number of messages received

I have written a producer-consumer pattern in Go, reading multiple CSV files and processing their records. I read all the records of a CSV file in one go.
I want to log the completion percentage in increments of 5% of the total records across all CSV files. For example, if I have 3 CSVs with 20, 30, and 50 rows (100 records in total), I want to log progress each time another 5 records have been processed.
func processData(inputCSVFiles []string) {
producerCount := len(inputCSVFiles)
consumerCount := producerCount
link := make(chan []string, 100)
wp := &sync.WaitGroup{}
wc := &sync.WaitGroup{}
wp.Add(producerCount)
wc.Add(consumerCount)
for i := 0; i < producerCount; i++ {
go produce(link, inputCSVFiles[i], wp)
}
for i := 0; i < consumerCount; i++ {
go consume(link, wc)
}
wp.Wait()
close(link)
wc.Wait()
fmt.Println("Completed data migration process for all CSV data files.")
}
func produce(link chan<- []string, filePath string, wg *sync.WaitGroup) {
defer wg.Done()
records := readCsvFile(filePath)
for _, record := range records {
link <- record
}
}
func consume(link <-chan []string, wg *sync.WaitGroup) {
defer wg.Done()
for record := range link {
// process csv record
}
}
I have used an atomic variable and a counter channel: each consumer pushes a count when a record is processed, and another goroutine reads from the channel and calculates the percentage of records processed.
var progressPercentageStep float64 = 5.0
var totalRecordsToProcess int32
func processData(inputCSVFiles []string) {
producerCount := len(inputCSVFiles)
consumerCount := producerCount
link := make(chan []string, 100)
counter := make(chan int, 100)
defer close(counter)
wp := &sync.WaitGroup{}
wc := &sync.WaitGroup{}
wp.Add(producerCount)
wc.Add(consumerCount)
for i := 0; i < producerCount; i++ {
go produce(link, inputCSVFiles[i], wp)
}
go progressStats(counter)
for i := 0; i < consumerCount; i++ {
go consume(link, counter, wc)
}
wp.Wait()
close(link)
wc.Wait()
}
func produce(link chan<- []string, filePath string, wg *sync.WaitGroup) {
defer wg.Done()
records := readCsvFile(filePath)
atomic.AddInt32(&totalRecordsToProcess, int32(len(records)))
for _, record := range records {
link <- record
}
}
func consume(link <-chan []string,counter chan<- int, wg *sync.WaitGroup) {
defer wg.Done()
for record := range link {
// process csv record
counter <- 1
}
}
func progressStats(counter <-chan int) {
var processed int32
feedbackThreshold := progressPercentageStep
for count := range counter {
processed += int32(count)
total := atomic.LoadInt32(&totalRecordsToProcess)
donePercent := 100.0 * float64(processed) / float64(total)
// log progress each time the next 5% step is crossed
if donePercent >= feedbackThreshold {
log.Printf("Progress ************** Total Records: %d, Processed Records: %d, Processed Percentage: %.2f **************\n", total, processed, donePercent)
feedbackThreshold += progressPercentageStep
}
}
}

Deadlock when trying to code a pool of worker methods

In the code hereunder, I don't understand why the Worker methods seem to exit instead of pulling values from the input channel "in" and processing them.
I had assumed they would only return after having consumed and processed all input from the channel "in".
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
i int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
for item := range in {
item *= item // returns the square of the input value
fmt.Printf("=> %d: %d\n", id, item)
out <- Result{item, id}
}
wg.Done()
fmt.Printf("%d exiting ", id)
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
wg := sync.WaitGroup{}
for id := 0; id < n_workers; id++ {
fmt.Printf("Starting : %d\n", id)
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks
close(out) // close the output channel when all tasks are completed
}
const (
NW = 4
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
for i := 0; i < 100; i++ {
in <- i
}
close(in)
}()
Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out : %d: %d", item.i, item.val)
}
}
The output is
Starting : 0
Starting : 1
Starting : 2
Starting : 3
=> 3: 0
=> 0: 1
=> 1: 4
=> 2: 9
fatal error: all goroutines are asleep - deadlock!
The full error shows where each goroutine is "stuck". If you run this in the playground, it will even show you the line number. That made it easy for me to diagnose.
Your Run_parallel runs in the main goroutine, so before main can read from out, Run_parallel must return. Before Run_parallel can return, it must wg.Wait(). But before the workers call wg.Done(), they must write to out. That's what causes the deadlock.
One solution is simple: just run Run_parallel concurrently in its own Goroutine.
go Run_parallel(NW, in, out, Worker)
Now, main ranges over out, waiting for out to be closed to signal completion. Run_parallel waits for the workers with wg.Wait(), and the workers range over in. All the work gets done, and the program won't end until it's all done. (https://go.dev/play/p/oMrgH2U09tQ)
Solution:
Run_parallel has to run in its own goroutine:
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
id int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
defer wg.Done()
for item := range in {
item *= 2 // returns the double of the input value (Bogus handling of data)
out <- Result{id, item}
}
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
wg := sync.WaitGroup{}
for id := 0; id < n_workers; id++ {
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks
close(out) // close the output channel when all tasks are completed
}
const (
NW = 8
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
for i := 0; i < 10; i++ {
in <- i
}
close(in)
}()
go Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out [%d]: %d\n", item.id, item.val)
}
println("- - - All done - - -")
}
Alternative formulation of the solution:
In this alternative formulation, it is not necessary to start Run_parallel as a goroutine (it launches its own goroutine internally).
I prefer this second solution because it encapsulates the fact that Run_parallel() has to run in parallel with the main function. For the same reason it is also safer and less error-prone (there is no need to remember to launch Run_parallel with the go keyword).
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
id int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
defer wg.Done()
for item := range in {
item *= 2 // returns the double of the input value (Bogus handling of data)
out <- Result{id, item}
}
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
go func() {
wg := sync.WaitGroup{}
defer close(out) // close the output channel when all tasks are completed
for id := 0; id < n_workers; id++ {
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks *and* trigger the deferred close(out)
}()
}
const (
NW = 8
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
defer close(in)
for i := 0; i < 10; i++ {
in <- i
}
}()
Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out [%d]: %d\n", item.id, item.val)
}
println("- - - All done - - -")
}

Is it possible to access channels ch1, ch2 using `select` in Golang?

I was trying to debug this code but am stuck here. I wanted to access ch1 and ch2, but found that nothing is printed.
package main
import (
"fmt"
)
type degen struct {
i, j string
}
func (x degen) CVIO(ch1, ch2 chan string, quit chan int, m, n string) {
for {
select {
case ch1 <- m:
fmt.Println(x.i)
case ch2 <- n:
fmt.Println("ok")
case <-quit:
fmt.Println("quit")
return
}
}
}
func main() {
ch1 := make(chan string)
ch2 := make(chan string)
quit := make(chan int)
x := degen{"goosebump", "ok"}
go x.CVIO(ch1, ch2, quit, "goosebump", "ok")
}
Desired:
It should print the channel data as it is produced.
It's not really clear what you expect your code to do:
main() ends without waiting for the goroutine to exit (it's quite possible the loop will not run at all).
In the select, the sends will not proceed because there is no receiver (spec: "if the capacity is zero or absent, the channel is unbuffered and communication succeeds only when both a sender and receiver are ready").
Nothing is sent to the quit channel.
I suspect that the following (playground) might do what you were expecting.
package main
import (
"fmt"
"sync"
)
type degen struct {
i, j string
}
func (x degen) CVIO(ch1, ch2 chan string, quit chan int, m, n string) {
for {
select {
case ch1 <- m:
fmt.Println(x.i)
case ch2 <- n:
fmt.Println("ok")
case <-quit:
fmt.Println("quit")
return
}
}
}
func main() {
ch1 := make(chan string)
ch2 := make(chan string)
quit := make(chan int)
x := degen{"goosebump", "ok"}
var wg sync.WaitGroup
wg.Add(1)
go func() {
x.CVIO(ch1, ch2, quit, "goosebump", "ok")
wg.Done()
}()
<-ch1 // Receive from CH1 (allowing "ch1 <- m" in go routine to proceed)
<-ch2 // Receive from CH2 (allowing "ch2 <- n" in go routine to proceed)
quit <- 1
wg.Wait() // Wait for CVIO to end (which it should do due to above send)
}
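As a side note, if the goal is only to tell the goroutine to stop, closing the quit channel is a common alternative to sending a value on it, because a receive on a closed channel always proceeds. A minimal, separate sketch of that idiom:
package main

import (
    "fmt"
    "sync"
)

func main() {
    quit := make(chan int)
    var wg sync.WaitGroup

    // A few workers that all stop once quit is closed.
    for i := 0; i < 3; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            <-quit // a receive on a closed channel returns immediately
            fmt.Println("worker", id, "quitting")
        }(i)
    }

    close(quit) // "broadcasts" quit to every receiver
    wg.Wait()
}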

What's the idiomatic solution to embarrassingly parallel tasks in Go?

I'm currently staring at a beefed up version of the following code:
func embarrassing(data []string) []string {
resultChan := make(chan string)
var waitGroup sync.WaitGroup
for _, item := range data {
waitGroup.Add(1)
go func(item string) {
defer waitGroup.Done()
resultChan <- doWork(item)
}(item)
}
go func() {
waitGroup.Wait()
close(resultChan)
}()
var results []string
for result := range resultChan {
results = append(results, result)
}
return results
}
This is just blowing my mind. Everything this does can be expressed in other languages as
results = parallelMap(data, doWork)
Even if it can't be done quite this easily in Go, isn't there still a better way than the above?
If you need all the results, you don't need the channel (and the extra goroutine to close it) to communicate the results; you can write directly into the results slice:
func cleaner(data []string) []string {
results := make([]string, len(data))
wg := &sync.WaitGroup{}
wg.Add(len(data))
for i, item := range data {
go func(i int, item string) {
defer wg.Done()
results[i] = doWork(item)
}(i, item)
}
wg.Wait()
return results
}
This is possible because slice elements act as distinct variables, and thus can be written individually without synchronization. For details, see Can I concurrently write different slice elements. You also get the results in the same order as your input for free.
Another variation: if doWork() did not return the result but instead received the address where the result should be "placed", along with the sync.WaitGroup to signal completion, that doWork() function could be executed "directly" as a new goroutine.
We can create a reusable wrapper for doWork():
func doWork2(item string, result *string, wg *sync.WaitGroup) {
defer wg.Done()
*result = doWork(item)
}
If you have the processing logic in such format, this is how it can be executed concurrently:
func cleanest(data []string) []string {
results := make([]string, len(data))
wg := &sync.WaitGroup{}
wg.Add(len(data))
for i, item := range data {
go doWork2(item, &results[i], wg)
}
wg.Wait()
return results
}
Yet another variation could be to pass a channel to doWork() on which it is supposed to deliver the result. This solution doesn't even require a sync.WaitGroup, as we know how many elements we want to receive from the channel:
func cleanest2(data []string) []string {
ch := make(chan string)
for _, item := range data {
go doWork3(item, ch)
}
results := make([]string, len(data))
for i := range results {
results[i] = <-ch
}
return results
}
func doWork3(item string, res chan<- string) {
res <- "done:" + item
}
"Weakness" of this last solution is that it may collect the result "out-of-order" (which may or may not be a problem). This approach can be improved to retain order by letting doWork() receive and return the index of the item. For details and examples, see How to collect values from N goroutines executed in a specific order?
You can also use reflection to achieve something similar.
In this example it distributes the handler function over 4 goroutines and returns the results in a new instance of the given source slice type.
package main
import (
"fmt"
"reflect"
"strings"
"sync"
)
func parralelMap(some interface{}, handle interface{}) interface{} {
rSlice := reflect.ValueOf(some)
rFn := reflect.ValueOf(handle)
dChan := make(chan reflect.Value, 4)
rChan := make(chan []reflect.Value, 4)
var waitGroup sync.WaitGroup
for i := 0; i < 4; i++ {
waitGroup.Add(1)
go func() {
defer waitGroup.Done()
for v := range dChan {
rChan <- rFn.Call([]reflect.Value{v})
}
}()
}
nSlice := reflect.MakeSlice(rSlice.Type(), rSlice.Len(), rSlice.Cap())
for i := 0; i < rSlice.Len(); i++ {
dChan <- rSlice.Index(i)
}
close(dChan)
go func() {
waitGroup.Wait()
close(rChan)
}()
i := 0
for v := range rChan {
nSlice.Index(i).Set(v[0])
i++
}
return nSlice.Interface()
}
func main() {
fmt.Println(
parralelMap([]string{"what", "ever"}, strings.ToUpper),
)
}
Test here https://play.golang.org/p/iUPHqswx8iS
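Since Go 1.18, type parameters let you write a similar helper without reflection. A minimal sketch (one goroutine per element, results returned in input order):
package main

import (
    "fmt"
    "strings"
    "sync"
)

// parallelMap applies fn to every element of in concurrently and
// returns the results in input order.
func parallelMap[T, U any](in []T, fn func(T) U) []U {
    out := make([]U, len(in))
    var wg sync.WaitGroup
    wg.Add(len(in))
    for i, v := range in {
        go func(i int, v T) {
            defer wg.Done()
            out[i] = fn(v) // each goroutine writes only its own element
        }(i, v)
    }
    wg.Wait()
    return out
}

func main() {
    fmt.Println(parallelMap([]string{"what", "ever"}, strings.ToUpper)) // [WHAT EVER]
}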
