Best way to stop a single goroutine? - go

In my program I have several go-routines who are essentially running endless processes. Why? you may ask, long story short it is the purpose of my entire application so it's out of question to change that. I would like to give users the ability to stop a single go-routine. I understand that I can use channel to signal the go-routines to stop, however there may be cases where I have, say, 10 go-routines running and I only want to stop 1. The issue is that the number of go-routines I want to run is dynamic and based on user input. What is the best way for me to add the ability to stop a go-routine dynamically and allow for singles to be stopped without the rest?

You need design a map to manage contexts.
Assume you've already known usage of context. It might look like:
ctx, cancel := context.WithCancel(ctx.TODO())
go func(ctx){
for {
select {
case <-ctx.Done():
return
default:
// job
}
}
}(ctx)
cancel()
Ok, now you can convert your question to another, it might called 'how to manage contexts of many goroutine'
type GoroutineManager struct{
m sync.Map
}
func (g *GoroutineManager) Add(cancel context.CancelFunc, key string)) {
g.m.Store(key, cancel)
}
func (g *GoroutineManager) KillGoroutine(key string) {
cancel, exist := g.m.Load(key)
if exist {
cancel()
}
}
Ok, Now you can manage your goroutine like :
ctx, cancel := context.WithCancel(ctx.TODO())
manager.Add(cancel, "routine-job-1")
go func(ctx){
for {
select {
case <-ctx.Done():
return
default:
// job
}
}
}(ctx)
// kill it as your wish
manager.KillGoroutine("routine-job-1")

Related

bounded 'mutex pools' for synchronized entity behaviour

I have a function
type Command struct {
id Uuid
}
handleCommand(cmd Command)
{
entity := lookupEntityInDataBase(cmd.Uuid)
entity.handleCommand(cmd)
saveEntityInDatabase(entity)
}
However this function can be called in parallel, and the entities are assumed to be non-threadsafe leading to racyness in the entity's state and the state which will be saved in the database.
A simple mutex locking at the beginning and end in this function would solve this, but would result in overly pessimistic synchronisation, since entities of different instances (i.e. different uuid), should be allowed to handle their commands in parallel.
An alternative can be to keep a Map of map[uuid]sync.Mutex, and create a new mutex if a uuid is not encountered before, and create it in a threadsafe manner. However, this will result in a possibly endlessly growing map of all the uuids every encountered at runtime.
I thought of cleaning up the mutexes afterwards, but doing this threadsafe, and realising that another thread may already be waiting for the mutex opens so many cans of worms.
I hope I am missing a very simple and elegant solution.
There isn't really an elegant solution. Here's a version using channels:
var m map[string]chan struct{}
var l sync.Mutex
func handleCommand(cmd Command) {
for {
l.Lock()
ch, ok:=m[cmd.Uuid]
if !ok {
ch=make(chan struct{})
m[uuid]=ch
defer func() {
l.Lock()
delete(m,cmd.Uuid)
close(ch)
l.Unlock()
}()
l.Unlock()
break
}
l.Unlock()
<-ch
}
entity := lookupEntityInDataBase(cmd.Uuid)
entity.handleCommand(cmd)
saveEntityInDatabase(entity)
}
The Moby project has such a thing as a library, see https://github.com/moby/locker
The one-line description is,
locker provides a mechanism for creating finer-grained locking to help free up more global locks to handle other tasks.

context without channels in the same thread of execution

Can't figure out how I can cancel a task if it takes to much to time compute in the same thread of execution via context semantics?
I use this example as a reference point
https://golang.org/src/context/context_test.go
The goal here call a doWork, if doWork takes to much time to compute, GetValueWithDeadline should after a timeout return 0, or if caller called cancel that cancel a wait, (here it main is caller) or the value returned in in give a time window.
The same scenario can be done In a different way. ( separate goroutine sleep, wakeup check value etc, condition on a mutex, etc) but I really want to understand the correct way to use context.
The channel semantic I understand but here I can't achieve the desired effect, the default case
call to a doWork fault under default case and sleep.
package main
import (
"context"
"fmt"
"log"
"math/rand"
"sync"
"time"
)
type Server struct {
lock sync.Mutex
}
func NewServer() *Server {
s := new(Server)
return s
}
func (s *Server) doWork() int {
s.lock.Lock()
defer s.lock.Unlock()
r := rand.Intn(100)
log.Printf("Going to nap for %d", r)
time.Sleep(time.Duration(r) * time.Millisecond)
return r
}
// I take an example from here and it very unclear where is do work executed
// https://golang.org/src/context/context_test.go
func (s *Server) GetValueWithDeadline(ctx context.Context) int {
val := 0
select {
case <- time.After(150 * time.Millisecond):
fmt.Println("overslept")
return 0
case <- ctx.Done():
fmt.Println(ctx.Err())
return 0
default:
val = s.doWork()
}
return all
}
func main() {
rand.Seed(time.Now().UTC().UnixNano())
s := NewServer()
for i :=0; i < 10; i++ {
d := time.Now().Add(50 * time.Millisecond)
ctx, cancel := context.WithDeadline(context.Background(), d)
log.Print(s.GetValueWithDeadline(ctx))
cancel()
}
}
Thank you
There are multiple problems with your approach.
What problem contexts solve
First, the primary reason contexts were invented in Go is that they allow to unify an approach to cancellation of a set of tasks.
To explain this concept using a simple example, consider a client request to some sever; to simplify further let it be an HTTP request.
The client connects to the server, sends some data telling the server what to do to fulfill the request and then waits for the server to respond.
Let's now suppose the request requires elaborate and time-consuming processing on the server — for instance, suppose it needs to perform multiple complex queries to multiple remote database engines, do multiple HTTP requests to external services and then process the acquired results to actually produce the data the client wants.
So the client starts its request and the server goes on with all those requests.
To hide latency of individual tasks the server has to perform to fulfill the request, it runs them in separate goroutines.
Once each goroutine completes the assigned task, it communicates its result (and/or an error) back to the goroutine which handles the client's request, and so on.
Now suppose that the client fails to wait for the response to its request for whatever reason — a network outage, an explicit timeout in the client's software, the user kills the app which initiated the request etc, — there are lots of possibilities.
As you can see, there's little sense for the server to continue spending resources to finish the tasks which were logically bound to the now-dead request: there's no one to hear back the result anyway.
So it makes sense to reap those tasks once we know the request is not going to be completed, and that's where contexts come into play: you can associate each incoming request with a single context and then either pass it itself to any goroutine spawned to carry out a single task required to be done to fulfill the request, or derive another request from that and pass it instead.
Then, as soon as you cancel the "root" request, that signal is propagated through the whole tree of requests derived from the root one.
Now each goroutine which were given a context, might "listen" on it to be notified when that cancellation signal is sent, and once the goroutine notices that it might drop whatever it was busy doing and exit.
In terms of actual context.Context type that signal is called "done" — as in "we're done doing whatever that context is assotiated with", — and that's why the goroutine which wants to know it should stop doing its work listens on a special channel returned by the context's method called Done.
Back to your example
To make it work, you'd do something like:
func (s *Server) doWork(ctx context.Context) int {
s.lock.Lock()
defer s.lock.Unlock()
r := rand.Intn(100)
log.Printf("Going to nap for %d", r)
select {
case <- time.After(time.Duration(r) * time.Millisecond):
return r
case <- ctx.Done():
return -1
}
}
func (s *Server) GetValueWithTimeout(ctx context.Context, maxTime time.Duration) int {
d := time.Now().Add(maxTime)
ctx, cancel := context.WithDeadline(ctx, d)
defer cancel()
return s.doWork(ctx)
}
func main() {
const maxTime = 50 * time.Millisecond
rand.Seed(time.Now().UTC().UnixNano())
s := NewServer()
for i :=0; i < 10; i++ {
v := s.GetValueWithTimeout(context.Background(), maxTime)
log.Print(v)
}
}
(Playground).
So what happens here?
The GetValueWithTimeout method accepts the maximum time it should take the doWork method to produce a value, calculates the deadline, derives a context which cancels itself once the deadline passes from the context passed to the method and calls doWork with the new context object.
The doWork method arms its own timer to go off after a random time interval and then listens on both the context and the timer.
This one is the critical point: the code which performs some unit of work which is supposed to be cancellable must check the context to become "done" actively, by itself.
So, in our toy example, either the doWork's own timer fires first or the deadline of the generated context gets reached first; whatever happens first, makes the select statement unblock and proceed.
Note that if your "do the work" code wold be more involved — it would actually do something instead of sleeping, — you would most probably need to check on the context's status periodically, usually after performing invividual bits of that work.

Is there any way to prevent default golang program finish

I have a server working with websocket connections and a database. Some users can connect by sockets, so I need to increment their "online" in db; and at the moment of their disconnection I also decrement their "online" field in db. But in case the server breaks down I use a local variable replica map[string]int of users online. So I need to postpone the server shutdown until it completes a database request that decrements all users "online" in accordance with my variable replica, because at this way socket connection doesnt send default "close" event.
I have found a package github.com/xlab/closer that handles some system calls and can do some action before program finished, but my database request doesnt work in this way (code below)
func main() {
...
// trying to handle program finish event
closer.Bind(cleanupSocketConnections(&pageHandler))
...
}
// function that handles program finish event
func cleanupSocketConnections(p *controllers.PageHandler) func() {
return func() {
p.PageService.ResetOnlineUsers()
}
}
// this map[string]int contains key=userId, value=count of socket connections
type PageService struct {
Users map[string]int
}
func (p *PageService) ResetOnlineUsers() {
for userId, count := range p.Users {
// decrease online of every user in program variable
InfoService{}.DecreaseInfoOnline(userId, count)
}
}
Maybe I use it incorrectly or may be there is a better way to prevent default program finish?
First of all executing tasks when the server "breaks down" as you said is quite complicated, because breaking down can mean a lot of things and nothing can guarantee clean up functions execution when something goes really bad in your server.
From an engineering point of view (if setting users offline on breakdown is so important), the best would be to have a secondary service, on another server, that receives user connection and disconnection events and ping event, if it receives no updates in a set timeout the service considers your server down and proceeds to set every user offline.
Back to your question, using defer and waiting for termination signals should cover 99% of cases. I commented the code to explain the logic.
// AllUsersOffline is called when the program is terminated, it takes a *sync.Once to make sure this function is performed only
// one time, since it might be called from different goroutines.
func AllUsersOffline(once *sync.Once) {
once.Do(func() {
fmt.Print("setting all users offline...")
// logic to set all users offline
})
}
// CatchSigs catches termination signals and executes f function at the end
func CatchSigs(f func()) {
cSig := make(chan os.Signal, 1)
// watch for these signals
signal.Notify(cSig, syscall.SIGKILL, syscall.SIGTERM, syscall.SIGINT, syscall.SIGQUIT, syscall.SIGHUP) // these are the termination signals in GNU => https://www.gnu.org/software/libc/manual/html_node/Termination-Signals.html
// wait for them
sig := <- cSig
fmt.Printf("received signal: %s", sig)
// execute f
f()
}
func main() {
/* code */
// the once is used to make sure AllUsersOffline is performed ONE TIME.
usersOfflineOnce := &sync.Once{}
// catch termination signals
go CatchSigs(func() {
// when a termination signal is caught execute AllUsersOffline function
AllUsersOffline(usersOfflineOnce)
})
// deferred functions are called even in case of panic events, although execution is not to take for granted (OOM errors etc)
defer AllUsersOffline(usersOfflineOnce)
/* code */
// run server
err := server.Run()
if err != nil {
// error logic here
}
// bla bla bla
}
I think that you need to look at go routines and channel.
Here something (maybe) useful:
https://nathanleclaire.com/blog/2014/02/15/how-to-wait-for-all-goroutines-to-finish-executing-before-continuing/

Attempting to acquire a lock with a deadline in golang?

How can one only attempt to acquire a mutex-like lock in go, either aborting immediately (like TryLock does in other implementations) or by observing some form of deadline (basically LockBefore)?
I can think of 2 situations right now where this would be greatly helpful and where I'm looking for some sort of solution. The first one is: a CPU-heavy service which receives latency sensitive requests (e.g. a web service). In this case you would want to do something like the RPCService example below. It is possible to implement it as a worker queue (with channels and stuff), but in that case it becomes more difficult to gauge and utilize all available CPU. It is also possible to just accept that by the time you acquire the lock your code may already be over deadline, but that is not ideal as it wastes some amount of resources and means we can't do things like a "degraded ad-hoc response".
/* Example 1: LockBefore() for latency sensitive code. */
func (s *RPCService) DoTheThing(ctx context.Context, ...) ... {
if s.someObj[req.Parameter].mtx.LockBefore(ctx.Deadline()) {
defer s.someObj[req.Parameter].mtx.Unlock()
... expensive computation based on internal state ...
} else {
return s.cheapCachedResponse[req.Parameter]
}
}
Another case is when you have a bunch of objects which should be touched, but which may be locked, and where touching them should complete within a certain amount of time (e.g. updating some stats). In this case you could also either use LockBefore() or some form of TryLock(), see the Stats example below.
/* Example 2: TryLock() for updating stats. */
func (s *StatsObject) updateObjStats(key, value interface{}) {
if s.someObj[key].TryLock() {
defer s.someObj[key].Unlock()
... update stats ...
... fill in s.cheapCachedResponse ...
}
}
func (s *StatsObject) UpdateStats() {
s.someObj.Range(s.updateObjStats)
}
For ease of use, let's assume that in the above case we're talking about the same s.someObj. Any object may be blocked by DoTheThing() operations for a long time, which means we would want to skip it in updateObjStats. Also, we would want to make sure that we return the cheap response in DoTheThing() in case we can't acquire a lock in time.
Unfortunately, sync.Mutex only and exclusively has the functions Lock() and Unlock(). There is no way to potentially acquire a lock. Is there some easy way to do this instead? Am I approaching this class of problems from an entirely wrong angle, and is there a different, more "go"ish way to solve them? Or will I have to implement my own Mutex library if I want to solve these? I am aware of issue 6123 which seems to suggest that there is no such thing and that the way I'm approaching these problems is entirely un-go-ish.
Use a channel with buffer size of one as mutex.
l := make(chan struct{}, 1)
Lock:
l <- struct{}{}
Unlock:
<-l
Try lock:
select {
case l <- struct{}{}:
// lock acquired
<-l
default:
// lock not acquired
}
Try with timeout:
select {
case l <- struct{}{}:
// lock acquired
<-l
case <-time.After(time.Minute):
// lock not acquired
}
I think you're asking several different things here:
Does this facility exist in the standard libray? No, it doesn't. You can probably find implementations elsewhere - this is possible to implement using the standard library (atomics, for example).
Why doesn't this facility exist in the standard library: the issue you mentioned in the question is one discussion. There are also several discussions on the go-nuts mailing list with several Go code developers contributing: link 1, link 2. And it's easy to find other discussions by googling.
How can I design my program such that I won't need this?
The answer to (3) is more nuanced and depends on your exact issue. Your question already says
It is possible to implement it as a worker queue (with channels and
stuff), but in that case it becomes more difficult to gauge and
utilize all available CPU
Without providing details on why it would be more difficult to utilize all CPUs, as opposed to checking for a mutex lock state.
In Go you usually want channels whenever the locking schemes become non-trivial. It shouldn't be slower, and it should be much more maintainable.
How about this package: https://github.com/viney-shih/go-lock . It use channel and semaphore (golang.org/x/sync/semaphore) to solve your problem.
go-lock implements TryLock, TryLockWithTimeout and TryLockWithContext functions in addition to Lock and Unlock. It provides flexibility to control the resources.
Examples:
package main
import (
"fmt"
"time"
"context"
lock "github.com/viney-shih/go-lock"
)
func main() {
casMut := lock.NewCASMutex()
casMut.Lock()
defer casMut.Unlock()
// TryLock without blocking
fmt.Println("Return", casMut.TryLock()) // Return false
// TryLockWithTimeout without blocking
fmt.Println("Return", casMut.TryLockWithTimeout(50*time.Millisecond)) // Return false
// TryLockWithContext without blocking
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
fmt.Println("Return", casMut.TryLockWithContext(ctx)) // Return false
// Output:
// Return false
// Return false
// Return false
}
PMutex from package https://github.com/myfantasy/mfs
PMutex implements RTryLock(ctx context.Context) and TryLock(ctx context.Context)
// ctx - some context
ctx := context.Background()
mx := mfs.PMutex{}
isLocked := mx.TryLock(ctx)
if isLocked {
// DO Something
mx.Unlock()
} else {
// DO Something else
}

Best way to stop reading a channel if a condition occurs

I have a go routine that keeps blocked until a channel receives new data. However, I need to stop the go routine whenever a condition is true. I wonder what is the best way
to do this.
I will illustrate the problem with an example code. The first solution I thought was using a select statement and check the condition constantly, like this:
func routine(c chan string, shouldStop func() bool) {
select {
case s := <-c:
doStuff(s)
default:
if shouldStop() {
return
}
}
}
However, this approach will force the routine to call shouldStop() every time and never block. I thought this could lead to performance problems, specially because there a lot others routines running.
Another option would be to use a sleep to at least block a little between shouldStop() calls. However, this would not be a perfect solution, since I'd like to call doStuff() in the exact time the channel receives with new data
Lastly, I thought about using a second channel just to achieve this, like:
func routine(c chan string, stop chan bool) {
select {
case s := <-c:
doStuff(s)
case b := <-stop:
return
}
}
While I thought that this might work, this would force me to have an extra channel along with the shouldStop flag. Maybe there is a better solution I'm not aware of.
Any suggestion is appreciated. Thanks.

Resources