I'd like to track the execution of some long-running process and show the user the completion percentage and errors (if any). With one long-running process it's easy: you can create channels for progress (percentage) and errors. What is the correct way to implement such logic when we have X long-running processes?
Below is a snippet of code that works, but I don't really like how it's implemented.
I created a struct ProgressTracker that keeps Url as a field and Error and Progress as channels.
I keep these ProgressTrackers in a slice, and once I have submitted all tasks I iterate over the slice and listen to the channels of each tracker. Once the number of submitted requests equals the number of received responses, I exit the loop.
Is this an idiomatic Go solution? It would be easier to pass the ProgressTracker to the function as a channel, but I don't know how to properly send "progress", "error" and "complete" events in that case.
The code is below, the same is available in Go playground: https://go.dev/play/p/f3hXJsZR9WV
package main

import (
    "errors"
    "fmt"
    "strings"
    "sync"
    "time"
)

type ProgressTracker struct {
    Progress chan int
    Error    chan error
    Complete chan bool
    Url      string
}
/**
This method sleeps for 1 second per iteration and sends the progress (in %) to the Progress channel.
For .net sites it fails with an error on the 3rd iteration.
When everything is completed, it sends a message to the Complete channel.
*/
func work(url string, tracker *ProgressTracker) {
    tracker.Url = url
    fmt.Printf("processing url %s\n", url)
    for i := 1; i <= 5; i++ {
        time.Sleep(time.Second)
        if i == 3 && strings.HasSuffix(url, ".net") {
            tracker.Error <- errors.New("emulating error for .net sites")
            tracker.Complete <- true
        }
        progress := 20 * i
        tracker.Progress <- progress
    }
    tracker.Complete <- true
}
func main() {
    var trackers []*ProgressTracker
    var urls = []string{"google.com", "youtube.com", "someurl.net"}
    var wg sync.WaitGroup
    wg.Add(len(urls))
    for _, url := range urls {
        tracker := &ProgressTracker{
            Progress: make(chan int),
            Error:    make(chan error),
            Complete: make(chan bool),
        }
        trackers = append(trackers, tracker)
        go func(workUrl string, progressTracker *ProgressTracker) {
            work(workUrl, progressTracker)
        }(url, tracker)
    }
    go func() {
        wg.Wait()
    }()
    var processed = 0
    // Iterate through all trackers and select on each tracker's channels.
    // Exit from this loop when the number of processed requests equals the number of trackers.
    for {
        for _, t := range trackers {
            select {
            case pr := <-t.Progress:
                fmt.Printf("Url = %s, progress = %d\n", t.Url, pr)
            case err := <-t.Error:
                fmt.Printf("Url = %s, error = %s\n", t.Url, err.Error())
            case <-t.Complete:
                fmt.Printf("Url = %s is completed\n", t.Url)
                processed = processed + 1
                if processed == len(trackers) {
                    fmt.Printf("Everything is completed, exit")
                    return
                }
            }
        }
    }
}
UPD:
If I add a delay to one of the tasks, then the for loop where I select all the channels will also wait for the slowest worker on each iteration.
Go playground: https://go.dev/play/p/9FvDE7ZGIrP
Updated work function:
func work(url string, tracker *ProgressTracker) {
    tracker.Url = url
    fmt.Printf("processing url %s\n", url)
    for i := 1; i <= 5; i++ {
        if url == "google.com" {
            time.Sleep(time.Second * 3)
        }
        time.Sleep(time.Second)
        if i == 3 && strings.HasSuffix(url, ".net") {
            tracker.Error <- errors.New("emulating error for .net sites")
            tracker.Complete <- true
            return
        }
        progress := 20 * i
        tracker.Progress <- progress
    }
    tracker.Complete <- true
}
You're deadlocking because you're continuing to select against trackers that have finished with no default. Your inner for loop iterates all trackers every time, which includes trackers that are done and are never going to send another message. The easiest way out of this is an empty default, which would also make these behave better in real life where they don't all go at the same pace, but it does turn this into a tight loop which will consume more CPU.
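For illustration, a sketch of that non-blocking select, using the tracker fields from the question (the empty default is the only change):

for _, t := range trackers {
    select {
    case pr := <-t.Progress:
        fmt.Printf("Url = %s, progress = %d\n", t.Url, pr)
    case err := <-t.Error:
        fmt.Printf("Url = %s, error = %s\n", t.Url, err.Error())
    case <-t.Complete:
        processed++
    default:
        // Nothing ready on this tracker right now; move on instead of blocking.
    }
}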
Your WaitGroup doesn't do anything at all; you're calling Wait in a goroutine but doing nothing when it returns, and you never call Done in the goroutines it's tracking, so it will never return. Instead, you're separately tracking the number of Complete messages you get and using that instead of the WaitGroup; it's unclear why this is implemented this way.
Fixing both resolves the stated issue: https://go.dev/play/p/do0g9jrX0mY
However, that's probably not the right approach. It's impossible to say with a contrived example what the right approach would be; if the example is all it needs to do, you don't need any of the logic, you could put your print statements in the workers and just use a waitgroup and no channels and be done with it. Assuming you're actually doing something with the results, you probably want a single Completed channel and a single Error channel shared by all the workers, and possibly a different mechanism altogether for tracking progress, like an atomic int/float you can just read from when you want to know the current progress. Then you don't need the nested looping stuff, you just have one loop with one select to read messages from the shared channels. It all depends on the context in which this code is intended to be used.
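As a minimal sketch of that suggestion (the names result, results and progress are illustrative, not from the original code): all workers share one results channel, progress is a single atomic counter that can be read at any time, and one loop drains the channel.

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

type result struct {
    url string
    err error // nil on success
}

func main() {
    urls := []string{"google.com", "youtube.com", "someurl.net"}
    results := make(chan result) // one channel shared by all workers
    var progress atomic.Int64    // combined percentage points across all workers

    for _, url := range urls {
        go func(url string) {
            for i := 1; i <= 5; i++ {
                time.Sleep(100 * time.Millisecond) // simulated work
                progress.Add(20)
            }
            results <- result{url: url}
        }(url)
    }

    // One loop, one receive operation: no nested looping over trackers.
    for completed := 0; completed < len(urls); completed++ {
        r := <-results
        fmt.Printf("%s done (err = %v), combined progress: %d%%\n", r.url, r.err, progress.Load())
    }
}

(atomic.Int64 requires Go 1.19 or later; on older versions, sync/atomic's AddInt64 and LoadInt64 on a plain int64 work the same way.)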
Thank you for your answers! I came up with this approach and it works for my needs:
package main

import (
    "errors"
    "fmt"
    "strings"
    "sync"
    "time"
)

type ProgressTracker struct {
    Progress  int
    Error     error
    Completed bool
    Url       string
}
/**
This method sleeps for 1 second per iteration and sends the current state (including progress in %) to the tracker channel.
For .net sites it fails with an error on the 3rd iteration.
When everything is completed, it sends a final state with Completed set to true.
*/
func work(url string, tracker chan ProgressTracker) {
    var internalTracker = ProgressTracker{
        Url: url,
    }
    tracker <- internalTracker
    fmt.Printf("processing url %s\n", url)
    for i := 1; i <= 5; i++ {
        if url == "google.com" {
            time.Sleep(time.Second * 3)
        }
        time.Sleep(time.Second)
        if i == 3 && strings.HasSuffix(url, ".net") {
            internalTracker.Error = errors.New("error for .net sites")
            internalTracker.Completed = true
            tracker <- internalTracker
            return
        }
        progress := 20 * i
        internalTracker.Progress = progress
        internalTracker.Completed = false
        tracker <- internalTracker
    }
    internalTracker.Completed = true
    tracker <- internalTracker
}
func main() {
    var urls = []string{"google.com", "youtube.com", "someurl.net"}
    var tracker = make(chan ProgressTracker, len(urls))
    var wg sync.WaitGroup
    wg.Add(len(urls))
    for _, url := range urls {
        go func(workUrl string) {
            defer wg.Done()
            work(workUrl, tracker)
        }(url)
    }
    go func() {
        wg.Wait()
        close(tracker)
        fmt.Printf("After wg wait")
    }()
    var completed = 0
    for completed < len(urls) {
        select {
        case t := <-tracker:
            if t.Completed {
                fmt.Printf("Processing for %s is completed!\n", t.Url)
                completed = completed + 1
            } else {
                fmt.Printf("Processing for %s is in progress: %d\n", t.Url, t.Progress)
            }
            if t.Error != nil {
                fmt.Printf("Url %s has errors %s\n", t.Url, t.Error)
            }
        }
    }
}
Here I pass the ProgressTracker over a channel (the fields in ProgressTracker are plain fields, not channels), and on each event the work function sends a complete snapshot of its current state: if progress increased, it sets the new value and sends the structure to the channel; if an error happened, it sets the error and sends the structure, and so on.
Related
Consider a group of check works, each of which has independent logic, so they seem like good candidates to run concurrently:
type Work struct {
    // ...
}

// This Check could be quite time-consuming
func (w *Work) Check() bool {
    // return succeed or not
    // ...
}

func CheckAll(works []*Work) {
    num := len(works)
    results := make(chan bool, num)
    for _, w := range works {
        go func(w *Work) {
            results <- w.Check()
        }(w)
    }
    for i := 0; i < num; i++ {
        if r := <-results; !r {
            ReportFailed()
            break
        }
    }
}

func ReportFailed() {
    // ...
}
Regarding the results: if the logic is that when any single work fails we treat the whole set as failed, then the remaining values in the channel are useless. Letting the remaining unfinished goroutines continue to run and send results to the channel is meaningless and wasteful, especially when w.Check() is quite time-consuming. The ideal effect is similar to:
for _, w := range works {
    if !w.Check() {
        ReportFailed()
        break
    }
}
This runs only the necessary check works and then breaks, but it is a sequential, non-concurrent scenario.
So, is it possible to cancel the unfinished goroutines, or their pending sends to the channel?
Cancelling a (blocking) send
Your original question asked how to cancel a send operation. A send on a channel is basically "instant" when the channel is ready; it blocks only if the channel's buffer is full and there is no ready receiver.
You can "cancel" this send by using a select statement and a cancel channel which you close, e.g.:
cancel := make(chan struct{})

select {
case ch <- value:
case <-cancel:
}
Closing the cancel channel with close(cancel) on another goroutine will make the above select abandon the send on ch (if it's blocking).
But as said, the send is "instant" on a "ready" channel, and the send first evaluates the value to be sent:
results <- w.Check()
This first has to run w.Check(), and once it's done, its return value will be sent on results.
Cancelling a function call
So what you really need is to cancel the w.Check() method call. For that, the idiomatic way is to pass a context.Context value which you can cancel, and w.Check() itself must monitor and "obey" this cancellation request.
See Terminating function execution if a context is cancelled
Note that your function must support this explicitly. There is no implicit termination of function calls or goroutines, see cancel a blocking operation in Go.
So your Check() should look something like this:
// This Check could be quite time-consuming
func (w *Work) Check(ctx context.Context, workDuration time.Duration) bool {
    // Do your thing and monitor the context!
    select {
    case <-ctx.Done():
        return false
    case <-time.After(workDuration): // Simulate work
        return true
    case <-time.After(2500 * time.Millisecond): // Simulate failure after 2.5 sec
        return false
    }
}
And CheckAll() may look like this:
func CheckAll(works []*Work) {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    num := len(works)
    results := make(chan bool, num)
    wg := &sync.WaitGroup{}
    for i, w := range works {
        workDuration := time.Second * time.Duration(i)
        wg.Add(1)
        go func(w *Work) {
            defer wg.Done()
            result := w.Check(ctx, workDuration)
            // You may check and return if context is cancelled
            // so result is surely not sent, I omitted it here.
            select {
            case results <- result:
            case <-ctx.Done():
                return
            }
        }(w)
    }

    go func() {
        wg.Wait()
        close(results) // This allows the for range over results to terminate
    }()

    for result := range results {
        fmt.Println("Result:", result)
        if !result {
            cancel()
            break
        }
    }
}
Testing it:
CheckAll(make([]*Work, 10))
Output (try it on the Go Playground):
Result: true
Result: true
Result: true
Result: false
We get true printed 3 times (works that complete under 2.5 seconds), then the failure simulation kicks in, returns false, and terminates all other jobs.
Note that the sync.WaitGroup in the above example is not strictly needed as results has a buffer capable of holding all results, but in general it's still good practice (should you use a smaller buffer in the future).
See related: Close multiple goroutine if an error occurs in one in go
The short answer is: No.
You cannot cancel or close any goroutine from the outside; a goroutine only ends when it returns or reaches the end of its function.
If you want to cancel something, the best approach is to pass a context.Context to the goroutines and listen on context.Done() inside the routine. Whenever the context is canceled you should return, and the goroutine will die after executing its defers (if any).
package main

import "fmt"

type Work struct {
    // ...
    Name      string
    IsSuccess chan bool
}

// This Check could be quite time-consuming
func (w *Work) Check() {
    // return succeed or not
    // ...
    if len(w.Name) > 0 {
        w.IsSuccess <- true
    } else {
        w.IsSuccess <- false
    }
}
func main() {
    works := make([]*Work, 3)
    works[0] = &Work{
        Name:      "",
        IsSuccess: make(chan bool),
    }
    works[1] = &Work{
        Name:      "111",
        IsSuccess: make(chan bool),
    }
    works[2] = &Work{
        Name:      "",
        IsSuccess: make(chan bool),
    }
    for _, w := range works {
        go w.Check()
    }
    for i, w := range works {
        select {
        case checkResult := <-w.IsSuccess:
            fmt.Printf("index %d checkresult %t \n", i, checkResult)
        }
    }
}
I have the following piece of code. I'm trying to run three goroutines at the same time, never exceeding three. This works as expected, but the code is supposed to be running updates to a table in the DB.
So the first routine processes the first 50 rows, the second the next 50, the third the next 50, and it repeats. I don't want two routines processing the same rows at the same time, but due to how long the update takes, this happens almost every time.
To solve this, I started flagging the rows with a new column processing which is a bool. I set it to true for all rows to be updated when the routine starts and sleep the script for 6 seconds to allow the flag to be updated.
This works for a random amount of time, but every now and then, I'll see 2-3 jobs processing the same rows again. I feel like the method I'm using to prevent duplicate updates is a bit janky and was wondering if there was a better way.
stopper := make(chan struct{}, 3)
var counter int
for {
    counter++
    stopper <- struct{}{}
    go func(db *sqlx.DB, c int) {
        fmt.Println("start")
        updateTables(db)
        fmt.Println("stop")
        <-stopper
    }(db, counter)
    time.Sleep(6 * time.Second)
}
in updateTables
var ids []string
err := sqlx.Select(db, &data, `select * from table_data where processing = false`)
if err != nil {
    panic(err)
}
for _, row := range data {
    ids = append(ids, row.Id)
}
if len(ids) == 0 {
    return
}
for _, row := range data {
    _, err = db.Exec(`update table_data set processing = true where id = $1`, row.Id)
    if err != nil {
        panic(err)
    }
}
// Additional row processing
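One way to fix this at the database layer, assuming PostgreSQL (the snippet uses $1 placeholders), is to let the database hand out each batch exactly once with FOR UPDATE SKIP LOCKED, so concurrent workers can never claim the same rows. A sketch (claimBatch and the batch size are illustrative; the table and column names are taken from the snippet above):

// claimBatch selects up to 50 unclaimed rows and marks them as processing,
// all inside the caller's transaction. SKIP LOCKED makes concurrent
// transactions skip rows that another worker has already locked.
func claimBatch(tx *sqlx.Tx) ([]string, error) {
    var ids []string
    err := tx.Select(&ids, `
        select id
        from table_data
        where processing = false
        limit 50
        for update skip locked`)
    if err != nil {
        return nil, err
    }
    for _, id := range ids {
        if _, err := tx.Exec(`update table_data set processing = true where id = $1`, id); err != nil {
            return nil, err
        }
    }
    return ids, nil
}

Each worker would begin a transaction, call claimBatch, commit, and then process the claimed ids; with that in place, the processing flag and the 6-second sleep no longer do the coordination.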
I think there's a misunderstanding of the approach to goroutines in this case.
Goroutines doing this kind of work should be treated like worker threads, with channels as the communication method between the main routine (which does the synchronization) and the worker goroutines (which do the actual job).
package main

import (
    "log"
    "sync"
    "time"
)

type record struct {
    id int
}

func main() {
    const WORKER_COUNT = 10
    recordschan := make(chan record)

    var wg sync.WaitGroup
    for k := 0; k < WORKER_COUNT; k++ {
        wg.Add(1)
        // Create the worker which will be doing the updates
        go func(workerID int) {
            defer wg.Done() // Marking the worker as done
            for record := range recordschan {
                updateRecord(record)
                log.Printf("req %d processed by worker %d", record.id, workerID)
            }
        }(k)
    }

    // Feeding the records channel
    for _, record := range fetchRecords() {
        recordschan <- record
    }

    // Closing our channel as we're not using it anymore
    close(recordschan)

    // Waiting for all the go routines to finish
    wg.Wait()
    log.Println("we're done!")
}

func fetchRecords() []record {
    result := []record{}
    for k := 0; k < 100; k++ {
        result = append(result, record{k})
    }
    return result
}

func updateRecord(req record) {
    time.Sleep(200 * time.Millisecond)
}
You can even batch things in the main goroutine if you need to update all 50 at once; a sketch follows below.
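A hedged sketch of that batching idea, reusing the record type and fetchRecords from the example above (batchSize is illustrative): the main goroutine chunks the fetched records into slices of 50 and sends whole batches, and each worker updates one batch per iteration.

const batchSize = 50

batcheschan := make(chan []record)
// Workers would range over batcheschan and update one whole batch per iteration.

records := fetchRecords()
for start := 0; start < len(records); start += batchSize {
    end := start + batchSize
    if end > len(records) {
        end = len(records) // the last batch may be smaller
    }
    batcheschan <- records[start:end]
}
close(batcheschan)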
I am attempting to loop through an array and process each value, spinning each iteration off in a separate goroutine. When I run it with goroutines it loops one less than the size of the array (len(Array) - 1), but if I get rid of the goroutine it processes just fine.
Am I missing something about how this should work? It seems very odd that it is always one less when running goroutines. Below is my code.
func createEventsForEachWorkoutReference(plan *sharedstructs.Plan, user *sharedstructs.User, startTime time.Time, timeZoneKey *string, transactionID *string, monitoringChannel chan interface{}) {
    // Set the activity type as these workouts are coming from plans
    activityType := "workout"

    for _, workoutReference := range plan.WorkoutReferences {
        go func(workoutReference sharedstructs.WorkoutReference) {
            workout, getWorkoutError := workout.GetWorkoutByName(workoutReference.WorkoutID.ID, *transactionID)
            if getWorkoutError == nil && workout != nil {
                // For each workout, create a reference to be inserted into the event
                reference := sharedstructs.Reference{ID: workout.WorkoutID, Type: activityType, Index: 0}
                referenceArray := make([]sharedstructs.Reference, 0)
                referenceArray = append(referenceArray, reference)

                event := sharedstructs.Event{
                    EventID:       uuidhelper.GenerateUUID(),
                    Description:   workout.Description,
                    Type:          activityType,
                    UserID:        user.UserID,
                    IsPublic:      false,
                    References:    referenceArray,
                    EventDateTime: startTime,
                    PlanID:        plan.PlanID}

                // Insert the Event into the database; I don't handle errors intentionally as it will be async
                creationError := eventdomain.CreateNewEvent(&event, transactionID)
                if creationError != nil {
                    redFalconLogger.LogCritical("plan.createEventsForEachWorkoutReference() Error Creating a workout"+creationError.Error(), *transactionID)
                }

                // Add to the output channel
                monitoringChannel <- event

                // Calculate the next start time for the next loop
                startTime = calculateNextEventTime(&startTime, &workoutReference.RestTime, timeZoneKey, transactionID)
            }
        }(workoutReference)
    }
    return
}
After a deeper dive, I think I've figured out the root cause but not the (elegant) solution yet.
What appears to be happening is that my calling function is running in an async goroutine as well and using a "chan interface{}" to monitor and stream progress back to the client. On the last item in the array, it is completing the calling goroutine before the chan can be processed upstream.
What is the proper way to wait for the channel processing to complete. Below is a portion of my unit test that I am using to provide context.
var wg sync.WaitGroup
wg.Add(1)
go func() {
    defer wg.Done()
    createEventsForEachWorkoutReference(plan, &returnedUser, startDate, &timeZone, &transactionID, monitoringChan)
}()

var userEventArrayList []sharedstructs.Event
go func() {
    for result := range monitoringChan {
        switch result.(type) {
        case sharedstructs.Event:
            counter++
            event := result.(sharedstructs.Event)
            userEventArrayList = append(userEventArrayList, event)
            fmt.Println("Channel Picked Up New Event: " + event.EventID + " with counter " + strconv.Itoa(counter))
        default:
            fmt.Println("No Match")
        }
    }
}()

wg.Wait()
// I COULD SLEEP HERE BUT THAT SEEMS HACKY
close(monitoringChan)
Wanted to add one more example (without my custom code). You can comment out the sleep line to see it work with sleep in there.
https://play.golang.org/p/t6L_C4zScP-
Finally figured out the answer...
The problem was that I needed to close monitoringChan once the producer goroutine finished, and then range over the channel in the second goroutine until it was closed. Worked great when I did that!
https://play.golang.org/p/fEaZXiWCLt-
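In outline, the pattern is: the producer side closes the channel when it is done sending, and a second WaitGroup tracks the consumer so nothing is lost (a sketch reduced from the test code above; the second WaitGroup replaces the sleep):

var producerWG, consumerWG sync.WaitGroup

producerWG.Add(1)
go func() {
    defer producerWG.Done()
    createEventsForEachWorkoutReference(plan, &returnedUser, startDate, &timeZone, &transactionID, monitoringChan)
}()

consumerWG.Add(1)
go func() {
    defer consumerWG.Done()
    for result := range monitoringChan { // ends only when monitoringChan is closed
        _ = result // collect events here
    }
}()

producerWG.Wait()     // all sends have happened
close(monitoringChan) // lets the consumer's range loop finish
consumerWG.Wait()     // consumer has drained every message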
Short story:
I'm having an issue where a map that previously had data but should now be empty is reporting a len() of > 0 even though it appears to be empty, and I have no idea why.
Longer story:
I need to process a number of devices at a time. Each device can have a number of messages. The concurrency of Go seemed like an obvious place to begin, so I wrote up some code to handle it and it seems to be going mostly very well. However...
I started a single goroutine for each device. In the main() function I have a map that contains each of the devices. When a message comes in I check to see whether the device already exists and if not I create it, store it in the map, and then pass the message into the device's receiving buffered channel.
This works great, and each device is being processed nicely. However, I need the device (and its goroutine) to terminate when it doesn't receive any messages for a preset amount of time. I've done this by checking in the goroutine itself how much time has passed since the last message was received, and if the goroutine is considered stale then the receiving channel is closed. But how to remove from the map?
So I passed in a pointer to the map, and I have the goroutine delete the device from the map and close the receiving channel before returning. The problem though is that at the end I'm finding that the len() function returns a value > 0, but when I output the map itself I see that it's empty.
I've written up a toy example to try to replicate the fault, and indeed I'm seeing that len() is reporting > 0 when the map is apparently empty. The last time I tried it I saw 10. The time before that 14. The time before that one, 53.
So I can replicate the fault, but I'm not sure whether the fault is with me or with Go. How is len() reporting > 0 when there are apparently no items in it?
Here's an example of how I've been able to replicate it. I'm using Go v1.5.1 windows/amd64.
There are two things here, as far as I'm concerned:
Am I managing the goroutines properly (probably not) and
Why does len(m) report > 0 when there are no items in it?
Thanks all
Example Code:
package main

import (
    "log"
    "os"
    "time"
)

const (
    chBuffSize        = 100             // How large the thing's channel buffer should be
    thingIdleLifetime = time.Second * 5 // How long things can live for when idle
    thingsToMake      = 1000            // How many things and associated goroutines to make
    thingMessageCount = 10              // How many messages to send to the thing
)

// The thing that we'll be passing into a goroutine to process -----------------
type thing struct {
    id string
    ch chan bool
}

// Go go gadget map test -------------------------------------------------------
func main() {
    // Make all of the things!
    things := make(map[string]thing)
    for i := 0; i < thingsToMake; i++ {
        t := thing{
            id: string(i),
            ch: make(chan bool, chBuffSize),
        }
        things[t.id] = t

        // Pass the thing into its own goroutine
        go doSomething(t, &things)

        // Send (thingMessageCount) messages to the thing
        go func(t thing) {
            for x := 0; x < thingMessageCount; x++ {
                t.ch <- true
            }
        }(t)
    }

    // Check the map of things to see whether we're empty or not
    size := 0
    for {
        if size == len(things) && size != thingsToMake {
            log.Println("Same number of items in map as last time")
            log.Println(things)
            os.Exit(1)
        }
        size = len(things)
        log.Printf("Map size: %d\n", size)
        time.Sleep(time.Second)
    }
}

// Func for each goroutine to run ----------------------------------------------
//
// Takes two arguments:
//  1) the thing that it is working with
//  2) a pointer to the map of things
//
// When this goroutine is ready to terminate, it should remove the associated
// thing from the map of things to clean up after itself
func doSomething(t thing, things *map[string]thing) {
    lastAccessed := time.Now()
    for {
        select {
        case <-t.ch:
            // We received a message, so extend the lastAccessed time
            lastAccessed = time.Now()
        default:
            // We haven't received a message, so check if we're allowed to continue
            n := time.Now()
            d := n.Sub(lastAccessed)
            if d > thingIdleLifetime {
                // We've run for >thingIdleLifetime, so close the channel, delete the
                // associated thing from the map and return, terminating the goroutine
                close(t.ch)
                delete(*things, string(t.id))
                return
            }
        }
        // Just sleep for a second in each loop to prevent the CPU being eaten up
        time.Sleep(time.Second)
    }
}
Just to add: in my original code this loops forever. The program is designed to listen for TCP connections and receive and process the data, so the function that checks the map count runs in its own goroutine. However, this example has exactly the same symptom even though the map len() check is in the main() function and is designed to handle an initial burst of data and then break out of the loop.
UPDATE 2015/11/23 15:56 UTC
I've refactored my example below. I'm not sure if I've misunderstood #RobNapier or not but this works much better. However, if I change thingsToMake to a larger number, say 100000, then I get lots of errors like this:
goroutine 199734 [select]:
main.doSomething(0xc0d62e7680, 0x4, 0xc0d64efba0, 0xc082016240)
C:/Users/anttheknee/go/src/maptest/maptest.go:83 +0x144
created by main.main
C:/Users/anttheknee/go/src/maptest/maptest.go:46 +0x463
I'm not sure if the problem is that I'm asking Go to do too much, or if I've made a hash of understanding the solution. Any thoughts?
package main

import (
    "log"
    "os"
    "time"
)

const (
    chBuffSize        = 100             // How large the thing's channel buffer should be
    thingIdleLifetime = time.Second * 5 // How long things can live for when idle
    thingsToMake      = 10000           // How many things and associated goroutines to make
    thingMessageCount = 10              // How many messages to send to the thing
)

// The thing that we'll be passing into a goroutine to process -----------------
type thing struct {
    id   string
    ch   chan bool
    done chan string
}

// Go go gadget map test -------------------------------------------------------
func main() {
    // Make all of the things!
    things := make(map[string]thing)

    // Make a channel to receive completion notifications on
    doneCh := make(chan string, chBuffSize)

    log.Printf("Making %d things\n", thingsToMake)
    for i := 0; i < thingsToMake; i++ {
        t := thing{
            id:   string(i),
            ch:   make(chan bool, chBuffSize),
            done: doneCh,
        }
        things[t.id] = t

        // Pass the thing into its own goroutine
        go doSomething(t)

        // Send (thingMessageCount) messages to the thing
        go func(t thing) {
            for x := 0; x < thingMessageCount; x++ {
                t.ch <- true
                time.Sleep(time.Millisecond * 10)
            }
        }(t)
    }
    log.Printf("All %d things made\n", thingsToMake)

    // Receive on doneCh when a goroutine is complete, and clean the map up
    for {
        id := <-doneCh
        close(things[id].ch)
        delete(things, id)
        if len(things) == 0 {
            log.Printf("Map: %v", things)
            log.Println("All done. Exiting")
            os.Exit(0)
        }
    }
}

// Func for each goroutine to run ----------------------------------------------
//
// Takes the thing that it is working with; the thing carries the channel to
// report completion through.
//
// When this goroutine is ready to terminate, it reports its id so the caller
// can remove the associated thing from the map of things
func doSomething(t thing) {
    timer := time.NewTimer(thingIdleLifetime)
    for {
        select {
        case <-t.ch:
            // We received a message, so extend the timer
            timer.Reset(thingIdleLifetime)
        case <-timer.C:
            // Timer fired, so we need to exit now
            t.done <- t.id
            return
        }
    }
}
UPDATE 2015/11/23 16:41 UTC
The completed code that appears to be working properly. Do feel free to let me know if there are any improvements that could be made, but this works (sleeps are deliberate to see progress as it's otherwise too fast!)
package main

import (
    "log"
    "os"
    "strconv"
    "time"
)

const (
    chBuffSize        = 100             // How large the thing's channel buffer should be
    thingIdleLifetime = time.Second * 5 // How long things can live for when idle
    thingsToMake      = 100000          // How many things and associated goroutines to make
    thingMessageCount = 10              // How many messages to send to the thing
)

// The thing that we'll be passing into a goroutine to process -----------------
type thing struct {
    id       string
    receiver chan bool
    done     chan string
}

// Go go gadget map test -------------------------------------------------------
func main() {
    // Make all of the things!
    things := make(map[string]thing)

    // Make a channel to receive completion notifications on
    doneCh := make(chan string, chBuffSize)

    log.Printf("Making %d things\n", thingsToMake)
    for i := 0; i < thingsToMake; i++ {
        t := thing{
            id:       strconv.Itoa(i),
            receiver: make(chan bool, chBuffSize),
            done:     doneCh,
        }
        things[t.id] = t

        // Pass the thing into its own goroutine
        go doSomething(t)

        // Send (thingMessageCount) messages to the thing
        go func(t thing) {
            for x := 0; x < thingMessageCount; x++ {
                t.receiver <- true
                time.Sleep(time.Millisecond * 100)
            }
        }(t)
    }
    log.Printf("All %d things made\n", thingsToMake)

    // Check the len() of things every second and exit when empty
    go func() {
        for {
            time.Sleep(time.Second)
            m := things
            log.Printf("Map length: %v", len(m))
            if len(m) == 0 {
                log.Printf("Confirming empty map: %v", things)
                log.Println("All done. Exiting")
                os.Exit(0)
            }
        }
    }()

    // Receive on doneCh when a goroutine is complete, and clean the map up
    for {
        id := <-doneCh
        close(things[id].receiver)
        delete(things, id)
    }
}

// Func for each goroutine to run ----------------------------------------------
//
// When this goroutine is ready to terminate it reports through t.done to
// notify the caller that it has finished and can be cleaned up. It waits
// for `thingIdleLifetime` until it times out and terminates on its own
func doSomething(t thing) {
    timer := time.NewTimer(thingIdleLifetime)
    for {
        select {
        case <-t.receiver:
            // We received a message, so extend the timer
            timer.Reset(thingIdleLifetime)
        case <-timer.C:
            // Timer expired, so we need to exit now
            t.done <- t.id
            return
        }
    }
}
map is not thread-safe. You cannot access a map on multiple goroutines safely. You can corrupt the map, as you're seeing in this case.
Rather than allow the goroutine to modify the map, the goroutine should write their identifier to a channel before returning. The main loop should watch that channel, and when an identifier comes back, should remove that element from the map.
You'll probably want to read up on Go concurrency patterns. In particular, you may want to look at Fan-out/Fan-in. Look at the links at the bottom. The Go blog has a lot of information on concurrency.
Note that your goroutine is busy-waiting to check for the timeout. There's no reason for that; the fact that you sleep(1 second) should be a clue that there's a mistake. Instead, look at time.Timer, which gives you a chan that receives a value after some time, and which you can reset.
Your problem is how you're converting numbers to strings:
id: string(i),
That creates a string using i as a rune (int32). For example, string(65) is "A". Some unequal runes resolve to equal strings, so you get a collision and close the same channel twice. See http://play.golang.org/p/__KpnfQc1V
You meant this:
id: strconv.Itoa(i),
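A quick illustration of the difference (any invalid code point, such as the surrogate values below, converts to the replacement character "\uFFFD", which is why distinct ints can collide):

fmt.Println(string(rune(65))) // "A": the int is used as a code point
fmt.Println(strconv.Itoa(65)) // "65": the int is formatted as decimal digits

// Two different ints, one identical string:
fmt.Println(string(rune(55296)) == string(rune(57343))) // true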
This is a good example of workers & controller mode in Go written by #Jimt, in answer to
"Is there some elegant way to pause & resume any other goroutine in golang?"
package main

import (
    "fmt"
    "runtime"
    "sync"
    "time"
)

// Possible worker states.
const (
    Stopped = 0
    Paused  = 1
    Running = 2
)

// Maximum number of workers.
const WorkerCount = 1000

func main() {
    // Launch workers.
    var wg sync.WaitGroup
    wg.Add(WorkerCount + 1)

    workers := make([]chan int, WorkerCount)
    for i := range workers {
        workers[i] = make(chan int)

        go func(i int) {
            worker(i, workers[i])
            wg.Done()
        }(i)
    }

    // Launch controller routine.
    go func() {
        controller(workers)
        wg.Done()
    }()

    // Wait for all goroutines to finish.
    wg.Wait()
}

func worker(id int, ws <-chan int) {
    state := Paused // Begin in the paused state.

    for {
        select {
        case state = <-ws:
            switch state {
            case Stopped:
                fmt.Printf("Worker %d: Stopped\n", id)
                return
            case Running:
                fmt.Printf("Worker %d: Running\n", id)
            case Paused:
                fmt.Printf("Worker %d: Paused\n", id)
            }
        default:
            // We use runtime.Gosched() to prevent a deadlock in this case.
            // It will not be needed if work performed here yields
            // to the scheduler.
            runtime.Gosched()

            if state == Paused {
                break
            }

            // Do actual work here.
        }
    }
}

// controller handles the current state of all workers. They can be
// instructed to be either running, paused or stopped entirely.
func controller(workers []chan int) {
    // Start workers
    for i := range workers {
        workers[i] <- Running
    }

    // Pause workers.
    <-time.After(1e9)
    for i := range workers {
        workers[i] <- Paused
    }

    // Unpause workers.
    <-time.After(1e9)
    for i := range workers {
        workers[i] <- Running
    }

    // Shutdown workers.
    <-time.After(1e9)
    for i := range workers {
        close(workers[i])
    }
}
But this code also has an issue: if you want to remove a worker channel from workers when worker() exits, a deadlock happens.
If you close(workers[i]), the next time the controller writes into it, it will panic, since Go can't write into a closed channel. If you use a mutex to protect it, the controller will be stuck on workers[i] <- Running, since the worker is no longer reading from the channel, so the write blocks and the mutex causes a deadlock. You could also give the channel a bigger buffer as a work-around, but it's not good enough.
So I think the best way to solve this is for worker() to close its channel when it exits; if the controller finds a channel closed, it will skip it and do nothing. But I can't find a way to check whether a channel is already closed in this situation. If I try to read from the channel in the controller, the controller might be blocked. So I'm very confused for now.
PS: I have tried recovering the raised panic, but that terminates the goroutine which raised the panic; in this case that would be the controller, so it's no use.
Still, I think it would be useful for the Go team to implement this function in the next version of Go.
There's no way to write a safe application where you need to know whether a channel is open without interacting with it.
The best way to do what you're wanting to do is with two channels -- one for the work and one to indicate a desire to change state (as well as the completion of that state change if that's important).
Channels are cheap. Complex design overloading semantics isn't.
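A sketch of that two-channel shape (all names illustrative): one channel carries the work, a separate channel carries state changes, and the worker selects over both, so the controller never needs to probe whether a channel is closed.

type Worker struct {
    work  chan int  // the actual work items
    pause chan bool // true = pause, false = resume; closing it means stop
}

func (w *Worker) run() {
    paused := false
    for {
        if paused {
            // While paused, only listen for state changes.
            p, ok := <-w.pause
            if !ok {
                return // pause channel closed: shut down
            }
            paused = p
            continue
        }
        select {
        case p, ok := <-w.pause:
            if !ok {
                return // pause channel closed: shut down
            }
            paused = p
        case item := <-w.work:
            _ = item // do the actual work here
        }
    }
}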
[also]
<-time.After(1e9)
is a really confusing and non-obvious way to write
time.Sleep(time.Second)
Keep things simple and everyone (including you) can understand them.
In a hacky way it can be done for channels which one attempts to write to by recovering the raised panic. But you cannot check if a read channel is closed without reading from it.
Either you will
eventually read the "true" value from it (v := <-c)
read the "true" value and the 'not closed' indicator (v, ok := <-c)
read a zero value and the 'closed' indicator (v, ok := <-c) (example)
block in the channel read forever (v := <-c)
Only the last one technically doesn't read from the channel, but that's of little use.
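A short demonstration of those outcomes (a buffered channel, so the pre-close value is still delivered first):

c := make(chan bool, 1)
c <- true
close(c)

v, ok := <-c // buffered value still delivered: v == true, ok == true
fmt.Println(v, ok)

v, ok = <-c // drained and closed: zero value, ok == false
fmt.Println(v, ok)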
I know this answer is late. I wrote this solution by hacking the Go runtime. It's not safe and it may crash:
import (
    "reflect"
    "unsafe"
)

func isChanClosed(ch interface{}) bool {
    if reflect.TypeOf(ch).Kind() != reflect.Chan {
        panic("only channels!")
    }

    // get interface value pointer, from cgo_export
    // typedef struct { void *t; void *v; } GoInterface;
    // then get channel real pointer
    cptr := *(*uintptr)(unsafe.Pointer(
        unsafe.Pointer(uintptr(unsafe.Pointer(&ch)) + unsafe.Sizeof(uint(0))),
    ))

    // this function will return true if chan.closed > 0
    // see hchan on https://github.com/golang/go/blob/master/src/runtime/chan.go
    // type hchan struct {
    //     qcount   uint           // total data in the queue
    //     dataqsiz uint           // size of the circular queue
    //     buf      unsafe.Pointer // points to an array of dataqsiz elements
    //     elemsize uint16
    //     closed   uint32
    //     ...
    cptr += unsafe.Sizeof(uint(0)) * 2
    cptr += unsafe.Sizeof(unsafe.Pointer(uintptr(0)))
    cptr += unsafe.Sizeof(uint16(0))
    return *(*uint32)(unsafe.Pointer(cptr)) > 0
}
Well, you can use a default branch to detect it, because a closed channel will always be selected. For example, the following code selects default, channel, channel; the first select does not block.
func main() {
    ch := make(chan int)
    go func() {
        select {
        case <-ch:
            log.Printf("1.channel")
        default:
            log.Printf("1.default")
        }
        select {
        case <-ch:
            log.Printf("2.channel")
        }
        close(ch)
        select {
        case <-ch:
            log.Printf("3.channel")
        default:
            log.Printf("3.default")
        }
    }()
    time.Sleep(time.Second)
    ch <- 1
    time.Sleep(time.Second)
}
Prints
2018/05/24 08:00:00 1.default
2018/05/24 08:00:01 2.channel
2018/05/24 08:00:01 3.channel
Note: as #Angad points out in a comment under this answer, this doesn't work if you're using a buffered channel and it contains unread data.
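A small demonstration of that caveat (sketch): with unread data in the buffer, the channel's case is still selected after close, and ok is still true, so the result is indistinguishable from an open channel that has data.

ch := make(chan int, 1)
ch <- 42
close(ch)

select {
case v, ok := <-ch:
    fmt.Println(v, ok) // 42 true: looks exactly like an open channel with data
default:
    fmt.Println("default") // not reached
}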
I have had this problem frequently with multiple concurrent goroutines.
It may or may not be a good pattern, but I define a struct for my workers with a quit channel and a field for the worker state:
type Worker struct {
    data    chan struct{}
    quit    chan bool
    stopped bool
}
Then you can have a controller call a stop function for the worker:
func (w *Worker) Stop() {
    w.quit <- true
    w.stopped = true
}

func (w *Worker) eventloop() {
    for {
        if w.stopped {
            return
        }
        select {
        case d := <-w.data:
            // Do something with d
            _ = d
            if w.stopped {
                return
            }
        case <-w.quit:
            return
        }
    }
}
This gives you a pretty good way to get a clean stop on your workers without anything hanging or generating errors, which is especially good when running in a container.
You could set your channel to nil in addition to closing it. That way you can check if it is nil.
example in the playground:
https://play.golang.org/p/v0f3d4DisCz
edit:
This is actually a bad solution as demonstrated in the next example,
because setting the channel to nil in a function would break it:
https://play.golang.org/p/YVE2-LV9TOp
package main

import "fmt"

func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)

    go func() {
        for i := 0; i < 10; i++ {
            ch1 <- i
        }
        close(ch1)
    }()

    go func() {
        for i := 10; i < 15; i++ {
            ch2 <- i
        }
        close(ch2)
    }()

    ok1, ok2 := false, false
    v := 0
    for {
        ok1, ok2 = true, true
        select {
        case v, ok1 = <-ch1:
            if ok1 {
                fmt.Println(v)
            }
        default:
        }

        select {
        case v, ok2 = <-ch2:
            if ok2 {
                fmt.Println(v)
            }
        default:
        }

        if !ok1 && !ok2 {
            return
        }
    }
}
From the documentation:
A channel may be closed with the built-in function close. The multi-valued assignment form of the receive operator reports whether a received value was sent before the channel was closed.
https://golang.org/ref/spec#Receive_operator
An example from the book Go in Action shows this case:
// This sample program demonstrates how to use an unbuffered
// channel to simulate a game of tennis between two goroutines.
package main

import (
    "fmt"
    "math/rand"
    "sync"
    "time"
)

// wg is used to wait for the program to finish.
var wg sync.WaitGroup

func init() {
    rand.Seed(time.Now().UnixNano())
}

// main is the entry point for all Go programs.
func main() {
    // Create an unbuffered channel.
    court := make(chan int)

    // Add a count of two, one for each goroutine.
    wg.Add(2)

    // Launch two players.
    go player("Nadal", court)
    go player("Djokovic", court)

    // Start the set.
    court <- 1

    // Wait for the game to finish.
    wg.Wait()
}

// player simulates a person playing the game of tennis.
func player(name string, court chan int) {
    // Schedule the call to Done to tell main we are done.
    defer wg.Done()

    for {
        // Wait for the ball to be hit back to us.
        ball, ok := <-court
        fmt.Printf("ok %t\n", ok)
        if !ok {
            // If the channel was closed we won.
            fmt.Printf("Player %s Won\n", name)
            return
        }

        // Pick a random number and see if we miss the ball.
        n := rand.Intn(100)
        if n%13 == 0 {
            fmt.Printf("Player %s Missed\n", name)

            // Close the channel to signal we lost.
            close(court)
            return
        }

        // Display and then increment the hit count by one.
        fmt.Printf("Player %s Hit %d\n", name, ball)
        ball++

        // Hit the ball back to the opposing player.
        court <- ball
    }
}
It's easier to check first whether the channel has elements; that would ensure the channel is alive.
func isChanClosed(ch chan interface{}) bool {
    if len(ch) == 0 {
        select {
        case _, ok := <-ch:
            return !ok
        default:
            // channel is empty but still open; don't block
            return false
        }
    }
    return false
}
If you listen on this channel, you can always find out that the channel was closed:
case state, opened := <-ws:
    if !opened {
        // channel was closed
        // return or do some final work
    }
    switch state {
    case Stopped:
But remember, you cannot close a channel twice; that would raise a panic.
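If several goroutines might try to close the same channel, a common guard (a sketch, not from the answer above) is to wrap the close in a sync.Once so it can only ever run once:

type stopper struct {
    ch   chan struct{}
    once sync.Once
}

// stop is safe to call from any number of goroutines; close runs exactly once.
func (s *stopper) stop() {
    s.once.Do(func() { close(s.ch) })
}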