My problem can be summarized as the following snippet:
package main
import (
"fmt"
"time"
)
func main() {
done := make(chan int)
done2 := make(chan int)
go func() {
for {
fmt.Println("1")
time.Sleep(time.Duration(1) * time.Second)
}
done <- 1
}()
go func() {
for {
fmt.Println("2")
}
done2 <- 1
}()
<- done
<- done2
}
Where the go routine "1" never gets the chance to run again. After doing some research, it looks like it's because go routine "2" takes up all the CPU.
I had done something similar in java before, and thread "1" can always wake up approximately 1 sec later.
My question is how can I achieve the same behavior in go?(I'm transferring a socket program originally written in Java to Go)
I have also tried runtime.GOMAXPROCS(2), but didn't work
Go routines to someone new to Go can have a number of surprising side effects. One of the more important of these is the problems of empty infinite loops.
The way that the Go scheduler works, every function call should get a preemption point placed in the beginning of the call. This should be happening with your fmt.Println("2") call, meaning that every time a print is called, the scheduler in the background has the ability to move around what routines are running at any given time, increase memory for a goroutine, etc.
Under normal circumstances, even if it is not ideally written code, this should be enough. Given that you have stated in the comments that this is not the actual code that you are working with, its not really possible to answer why your other code might not be working.
If you are interested in learning more about how the go scheduler works, I would recommend this article which goes over preemption, and these articles which covers the scheduler at a lower level.
I would also recommend taking a look through the Tour of Go site. While some of the concepts are rather basic, there are some very complex things that you can do if you are aware of all the tools in your toolbox. I would specifically recommend the concurrency section, and the select statement, which might be able to help you refactor your program easier, and in a manner more fitting to go.
Related
I'm trying to understand best practices for Golang concurrency. I read O'Reilly's book on Go's concurrency and then came back to the Golang Codewalks, specifically this example:
https://golang.org/doc/codewalk/sharemem/
This is the code I was hoping to review with you in order to learn a little bit more about Go. My first impression is that this code is breaking some best practices. This is of course my (very) unexperienced opinion and I wanted to discuss and gain some insight on the process. This isn't about who's right or wrong, please be nice, I just want to share my views and get some feedback on them. Maybe this discussion will help other people see why I'm wrong and teach them something.
I'm fully aware that the purpose of this code is to teach beginners, not to be perfect code.
Issue 1 - No Goroutine cleanup logic
func main() {
// Create our input and output channels.
pending, complete := make(chan *Resource), make(chan *Resource)
// Launch the StateMonitor.
status := StateMonitor(statusInterval)
// Launch some Poller goroutines.
for i := 0; i < numPollers; i++ {
go Poller(pending, complete, status)
}
// Send some Resources to the pending queue.
go func() {
for _, url := range urls {
pending <- &Resource{url: url}
}
}()
for r := range complete {
go r.Sleep(pending)
}
}
The main method has no way to cleanup the Goroutines, which means if this was part of a library, they would be leaked.
Issue 2 - Writers aren't spawning the channels
I read that as a best practice, the logic to create, write and cleanup a channel should be controlled by a single entity (or group of entities). The reason behind this is that writers will panic when writing to a closed channel. So, it is best for the writer(s) to create the channel, write to it and control when it should be closed. If there are multiple writers, they can be synced with a WaitGroup.
func StateMonitor(updateInterval time.Duration) chan<- State {
updates := make(chan State)
urlStatus := make(map[string]string)
ticker := time.NewTicker(updateInterval)
go func() {
for {
select {
case <-ticker.C:
logState(urlStatus)
case s := <-updates:
urlStatus[s.url] = s.status
}
}
}()
return updates
}
This function shouldn't be in charge of creating the updates channel because it is the reader of the channel, not the writer. The writer of this channel should create it and pass it to this function. Basically saying to the function "I will pass updates to you via this channel". But instead, this function is creating a channel and it isn't clear who is responsible of cleaning it up.
Issue 3 - Writing to a channel asynchronously
This function:
func (r *Resource) Sleep(done chan<- *Resource) {
time.Sleep(pollInterval + errTimeout*time.Duration(r.errCount))
done <- r
}
Is being referenced here:
for r := range complete {
go r.Sleep(pending)
}
And it seems like an awful idea. When this channel is closed, we'll have a goroutine sleeping somewhere out of our reach waiting to write to that channel. Let's say this goroutine sleeps for 1h, when it wakes up, it will try to write to a channel that was closed in the cleanup process. This is another example of why the writters of the channels should be in charge of the cleanup process. Here we have a writer who's completely free and unaware of when the channel was closed.
Please
If I missed any issues from that code (related to concurrency), please list them. It doesn't have to be an objective issue, if you'd have designed the code in a different way for any reason, I'm also interested in learning about it.
Biggest lesson from this code
For me the biggest lesson I take from reviewing this code is that the cleanup of channels and the writing to them has to be synchronized. They have to be in the same for{} or at least communicate somehow (maybe via other channels or primitives) to avoid writing to a closed channel.
It is the main method, so there is no need to cleanup. When main returns, the program exits. If this wasn't the main, then you would be correct.
There is no best practice that fits all use cases. The code you show here is a very common pattern. The function creates a goroutine, and returns a channel so that others can communicate with that goroutine. There is no rule that governs how channels must be created. There is no way to terminate that goroutine though. One use case this pattern fits well is reading a large resultset from a
database. The channel allows streaming data as it is read from the
database. In that case usually there are other means of terminating the
goroutine though, like passing a context.
Again, there are no hard rules on how channels should be created/closed. A channel can be left open, and it will be garbage collected when it is no longer used. If the use case demands so, the channel can be left open indefinitely, and the scenario you worry about will never happen.
As you are asking about if this code was part of a library, yes it would be poor practice to spawn goroutines with no cleanup inside a library function. If those goroutines carry out documented behaviour of the library, it's problematic that the caller doesn't know when that behaviour is going to happen. If you have any behaviour that is typically "fire and forget", it should be the caller who chooses when to forget about it. For example:
func doAfter5Minutes(f func()) {
go func() {
time.Sleep(5 * time.Minute)
f()
log.Println("done!")
}()
}
Makes sense, right? When you call the function, it does something 5 minutes later. The problem is that it's easy to misuse this function like this:
// do the important task every 5 minutes
for {
doAfter5Minutes(importantTaskFunction)
}
At first glance, this might seem fine. We're doing the important task every 5 minutes, right? In reality, we're spawning many goroutines very quickly, probably consuming all available memory before they start dropping off.
We could implement some kind of callback or channel to signal when the task is done, but really, the function should be simplified like so:
func doAfter5Minutes(f func()) {
time.Sleep(5 * time.Minute)
f()
log.Println("done!")
}
Now the caller has the choice of how to use it:
// call synchronously
doAfter5Minutes(importantTaskFunction)
// fire and forget
go doAfter5Minutes(importantTaskFunction)
This function arguably should also be changed. As you say, the writer should effectively own the channel, as they should be the one closing it. The fact that this channel-reading function insists on creating the channel it reads from actually coerces itself into this poor "fire and forget" pattern mentioned above. Notice how the function needs to read from the channel, but it also needs to return the channel before reading. It therefore had to put the reading behaviour in a new, un-managed goroutine to allow itself to return the channel right away.
func StateMonitor(updates chan State, updateInterval time.Duration) {
urlStatus := make(map[string]string)
ticker := time.NewTicker(updateInterval)
defer ticker.Stop() // not stopping the ticker is also a resource leak
for {
select {
case <-ticker.C:
logState(urlStatus)
case s := <-updates:
urlStatus[s.url] = s.status
}
}
}
Notice that the function is now simpler, more flexible and synchronous. The only thing that the previous version really accomplishes, is that it (mostly) guarantees that each instance of StateMonitor will have a channel all to itself, and you won't have a situation where multiple monitors are competing for reads on the same channel. While this may help you avoid a certain class of bugs, it also makes the function a lot less flexible and more likely to have resource leaks.
I'm not sure I really understand this example, but the golden rule for channel closing is that the writer should always be responsible for closing the channel. Keep this rule in mind, and notice a few points about this code:
The Sleep method writes to r
The Sleep method is executed concurrently, with no method of tracking how many instances are running, what state they are in, etc.
Based on these points alone, we can say that there probably isn't anywhere in the program where it would be safe to close r, because there's seemingly no way of knowing if it will be used again.
Trying to understand the flow of goroutines so i wrote this code only one thing which i am not able to understand is that how routine-end runs between the other go routines and complete a single go routines and print the output from the channel at the end.
import(
"fmt"
)
func add(dataArr []int,dataChannel chan int,i int ){
var sum int
fmt.Println("GOROUTINE",i+1)
for i:=0;i<len(dataArr);i++{
sum += dataArr[i]
}
fmt.Println("wRITING TO CHANNEL.....")
dataChannel <- sum
fmt.Println("routine-end")
}
func main(){
fmt.Println("main() started")
dataChannel := make(chan int)
dataArr := []int{1,2,3,4,5,6,7,8,9}
for i:=0;i<len(dataArr);i+=3{
go add(dataArr[i:i+3],dataChannel,i)
}
fmt.Println("came to blocking statement ..........")
fmt.Println(<-dataChannel)
fmt.Println("main() end")
}
output
main() started
came to blocking statement ..........
GOROUTINE 1
wRITING TO CHANNEL.....
routine-end
GOROUTINE 4
wRITING TO CHANNEL.....
6
main() end
Your for loop launches 3 goroutines that invoke the add function.
In addition, main itself runs in a separate "main" goroutine.
Since goroutines execute concurrently, the order of their run is typically unpredictable and depends on timing, how busy your machine is, etc. Results may differ between runs and between machines. Inserting time.Sleep calls in various places may help visualize it. For example, inserting time.Sleep for 100ms before "came to blocking statement" shows that all add goroutines launch.
What you may see in your run typically is that one add goroutine launches, adds up its slice to its sum and writes sum to dataChannel. Since main launches a few goroutines and immediately reads from the channel, this read gets the sum written by add and then the program exists -- because by default main won't wait for all goroutines to finish.
Moreover, since the dataChannel channel is unbuffered and main only reads one value, the other add goroutines will block on the channel indefinitely while writing.
I do recommend going over some introductory resources for goroutines and channels. They build up the concepts from simple principles. Some good links for you:
Golang tour
https://gobyexample.com/ -- start with the Goroutines example and do the next several ones.
Besides a few tutorials on Go I have no actual experience in it. I'm trying to take a project written in Go and converting it into a windows service.
I honestly haven't tried anything besides trying to find things to read over. I have found a few threads and choosen the best library I felt covered all of our needs
https://github.com/golang/sys
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build windows
package main
import (
"fmt"
"strings"
"time"
"golang.org/x/sys/windows/svc"
"golang.org/x/sys/windows/svc/debug"
"golang.org/x/sys/windows/svc/eventlog"
)
var elog debug.Log
type myservice struct{}
func (m *myservice) Execute(args []string, r <-chan svc.ChangeRequest, changes chan<- svc.Status) (ssec bool, errno uint32) {
const cmdsAccepted = svc.AcceptStop | svc.AcceptShutdown | svc.AcceptPauseAndContinue
changes <- svc.Status{State: svc.StartPending}
fasttick := time.Tick(500 * time.Millisecond)
slowtick := time.Tick(2 * time.Second)
tick := fasttick
changes <- svc.Status{State: svc.Running, Accepts: cmdsAccepted}
loop:
for {
select {
case <-tick:
beep()
elog.Info(1, "beep")
case c := <-r:
switch c.Cmd {
case svc.Interrogate:
changes <- c.CurrentStatus
// Testing deadlock from https://code.google.com/p/winsvc/issues/detail?id=4
time.Sleep(100 * time.Millisecond)
changes <- c.CurrentStatus
case svc.Stop, svc.Shutdown:
// golang.org/x/sys/windows/svc.TestExample is verifying this output.
testOutput := strings.Join(args, "-")
testOutput += fmt.Sprintf("-%d", c.Context)
elog.Info(1, testOutput)
break loop
case svc.Pause:
changes <- svc.Status{State: svc.Paused, Accepts: cmdsAccepted}
tick = slowtick
case svc.Continue:
changes <- svc.Status{State: svc.Running, Accepts: cmdsAccepted}
tick = fasttick
default:
elog.Error(1, fmt.Sprintf("unexpected control request #%d", c))
}
}
}
changes <- svc.Status{State: svc.StopPending}
return
}
func runService(name string, isDebug bool) {
var err error
if isDebug {
elog = debug.New(name)
} else {
elog, err = eventlog.Open(name)
if err != nil {
return
}
}
defer elog.Close()
elog.Info(1, fmt.Sprintf("starting %s service", name))
run := svc.Run
if isDebug {
run = debug.Run
}
err = run(name, &myservice{})
if err != nil {
elog.Error(1, fmt.Sprintf("%s service failed: %v", name, err))
return
}
elog.Info(1, fmt.Sprintf("%s service stopped", name))
}
So I spent some time going over this code. Tested it out to see what it does. It performs as it should.
The question I have is we currently have a Go program that takes in arguments and for our service we pass in server. Which spins up our stuff on a localhost webpage.
I believe the code above may have something to do with that but I'm lost at how I would actually get it spin off our exe with the correct arguements. Is this the right spot to call main?
Im sorry if this is vague. I dont know exactly how to make this interact with our already exisiting exe.
I can get that modified if I know what needs to be changed. I appreacite any help.
OK, that's much clearer now. Well, ideally you should start with some tutorial on what constitutes a Windows service—I bet tihis might have solved the problem for you. But let's try anyway.
Some theory
A Windows service sort of has two facets: it performs some useful task and it communicates with the SCM facility. When you manipulate a service using the sc command or through the Control Panel, you have that piece of software to talk with SCM on your behalf, and SCM talks with that service.
The exact protocol the SCM and a service use is low-level and complicated
and the point of the Go package you're using is to hide that complexity from you
and offer a reasonably Go-centric interface to that stuff.
As you might gather from your own example, the Execute method of the type you've created is—for the most part—concerned with communicating with SCM: it runs an endless for loop which on each iteration sleeps on reading from the r channel, and that channel delivers SCM commands to your service.
So you basically have what could be called "an SCM command processing loop".
Now recall those two facets above. You already have one of them: your service interacts with SCM, so you need another one—the code which actually performs useful tasks.
In fact, it's already partially there: the example code you've grabbed creates a time ticker which provides a channel on which it delivers a value when another tick passes. The for loop in the Execute method reads from that channel as well, "doing work" each time another tick is signalled.
OK, this is fine for a toy example but lame for a real work.
Approaching the solution
So let's pause for a moment and think about our requirements.
We need some code running and doing our actual task(s).
We need the existing command processing loop to continue working.
We need these two pieces of code to work concurrently.
In this toy example the 3rd point is there "for free" because a time ticker carries out the task of waiting for the next tick automatically and fully concurrently with the rest of the code.
Your real code most probably won't have that luxury, so what do you do?
In Go, when you need to do something concurrently with something else,
an obvious answer is "use a goroutine".
So the first step is to grab your existing code, turn it into a callable function
and then call it in a separate goroutine right before entering the for loop.
This way, you'll have both pieces run concurrently.
The hard parts
OK, that wasn't hard.
The hard parts are:
How to configure the code which performs the tasks.
How to make the SCM command processing loop and the code carrying out tasks communicate.
Configuration
This one really depends on the policies at your $dayjob or of your $current_project, but there are few hints:
A Windows service may receive command-line arguments—either for a single run or permanently (passed to the service on each of its runs).
The downside is that it's not convenient to work with them from the UI/UX standpoint.
Typically Windows services used to read the registry.
These days (after the advent of .NET and its pervasive xml-ity) the services tend to read configuration files.
The OS environment most of the time is a bad fit for the task.
You may combine several of these venues.
I think I'd start with a configuration file but then again, you should pick the path of the least resistance, I think.
One of the things to keep in mind is that the reading and processing of the configuration should better be done before the service signals the SCM it started OK: if the configuration is invalid or cannot be loaded, the service should extensively log that and signal it failed, and not run the actual task processing code.
Communication between the command processing loop and the tasks carrying code
This is IMO the hardest part.
It's possible to write a whole book here but let's keep it simple for now.
To make it as simple as possible I'd do the following:
Consider pausing, stopping and shutting down mostly the same: all these signals must tell your task processing code to quit, and then wait for it to actually do that.
Consider the "continue" signal the same as starting the task processing function: run it again—in a new goroutine.
Have a one-directional communication: from the control loop to the tasks processing code, but not the other way—this will greatly simplify service state management.
This way, you may create a single channel which the task processing code listens on—or checks periodically, and when a value comes from that channel, the code stops running, closes the channel and exits.
The control loop, when the SCM tells it to pause or stop or shut down, sends anything on that channel then waits for it to close. When that happens, it knows the tasks processing code is finished.
In Go, a paradigm for a channel which is only used for signaling, is to have a channel of type struct{} (an empty struct).
The question of how to monitor this control channel in the tasks running code is an open one and heavily depends on the nature of the tasks it performs.
Any further help here would be reciting what's written in the Go books on concurrency so you should have that covered first.
There's also an interesting question of how to have the communication between the control loop and the tasks processing loop resilient to the possible processing stalls in the latter, but then again, IMO it's too early to touch upon that.
func main() {
messages := make(chan string)
go func() { messages <- "hello" }()
go func() { messages <- "ping" }()
msg := <-messages
msg2 := <-messages
fmt.Println(msg)
fmt.Println(msg2)
The above code consistently prints "ping" and then "hello" on my terminal.
I am confused about the order in which this prints, so I was wondering if I could get some clarification on my thinking.
I understand that unbuffered channels are blocking while waiting for both a sender and a receiver. So in the above case, when these 2 go routines are executed, there isn't, in both cases,a receiver yet. So I am guessing that both routines block until a receiver is available on the channel.
Now... I would assume that first "hello" is tried into the channel, but has to wait... at the same time, "ping" tries, but again has to wait. Then
msg := <- messages
shows up, so I would assume that at that stage, the program will arbitrarily pick one of the waiting goroutines and allow it to send its message over into the channel, since msg is ready to receive.
However, it seems that no matter how many times I run the program, it always is msg that gets assigned "ping" and msg2 that gets assigned "hello", which gives the impression that "ping" always gets priority to send first (to msg). Why is that?
It’s not about order of reading a channel but about order of goroutines execution which is not guaranteed.
Try to ‘Println’ from the function where you are writing to the channel (before and after writing) and I think it should be in same order as reading from the channel.
In Golang Spec Channels order is described as:-
Channels act as first-in-first-out queues. For example, if one
goroutine sends values on a channel and a second goroutine receives
them, the values are received in the order sent.
It will prints which value is available first to be received on other end.
If you wants to synchronize them use different channels or add wait Groups.
package main
import (
"fmt"
)
func main() {
messages1 := make(chan string)
messages2 := make(chan string)
go func(<-chan string) {
messages2 <- "ping"
}(messages2)
go func(<-chan string) {
messages1 <- "hello"
}(messages1)
fmt.Println(<-messages1)
fmt.Println(<-messages2)
}
If you see you can easily receive any value you want according to your choice using different channels.
Go playground
I just went through this same thing. See my post here: Golang channels, order of execution
Like you, I saw a pattern that was counter-intuitive. In a place where there actually shouldn't be a pattern. Once you launch a go process, you have launched a thread of execution, and basically all bets are off at that point regarding the order that the threads will execute their steps. But if there was going to be an order, logic tells us the first one called would be executed first.
In actual fact, if you recompile that program each time, the results will vary. That's what I found when I started compiling/running it on my local computer. In order to make the results random, I had to "dirty" the file, by adding and removing a space for instance. Then the compiler would re-compile the program, and then I would get a random order of execution. But when compiled in the go sandbox, the result was always the same.
When you use the sandbox, the results are apparently cached. I couldn't get the order to change in the sandbox by using insignificant changes. The only way I got it to change was to issue a time.Sleep(1) command between the launching of the go statements. Then, the first one launched would be the first one executed every time. I still don't think I'd bet my life on that continuing to happen though because they are separate threads of execution and there are no guarantees.
The bottom line is that I was seeing a deterministic result where there should be no determinism. That's what stuck me. I was fully cleared up when I found that the results really are random in a normal environment. The sandbox is a great tool to have. But it's not a normal environment. Compile and run your code locally and you will see the varying results you would expect.
I was confused when I first met this. but now I am clear about this. the reason cause this is not about channel, but for goroutine.
As The Go Memory Model mention, there's no guaranteed to goroutine's running and exit, so when you create two goroutine, you cannot make sure that they are running in order.
So if you want printing follow FIFO rule, you can change your code like this:
func main() {
messages := make(chan string)
go func() {
messages <- "hello"
messages <- "ping"
}()
//go func() { messages <- "ping" }()
msg := <-messages
msg2 := <-messages
fmt.Println(msg)
fmt.Println(msg2)
}
I feel like the answer to my question is no but asking for certainty as I've only started playing around with Go for a few days. Should we encapsulate IO bound tasks (like http requests) into goroutines even if it was to be used in a sequential use case?
Here's my naive example. Say I have a method that makes 3 http requests but need to be executed sequentially. Is there any benefit in creating the invoke methods as goroutines? I understand the example below would actually take a performance hit.
func myMethod() {
chan1 := make(chan int)
chan2 := make(chan int)
chan3 := make(chan int)
go invoke1(chan1)
res1 := <-chan1
invoke2(res1, chan2)
res2 := <-chan2
invoke3(res2, chan3)
// Do something with <-chan3
}
One possible reason that comes to mind is to future proof the invoke methods for when they're called in a concurrent context later on when other develops start re-using the method. Any other reasons?
There's nothing standard that would say yes or no to this question.
Although you can do it correctly this way, it is much simpler to stick to plain sequential execution.
Three reasons come to mind:
this is still sequential: you're waiting for every single goroutine sequentially, so this buys you nothing. Performance probably doesn't change much if it's only doing an http request, both cases will spend most of their time waiting for the response.
error handling is much simpler if you just get result, err := invoke; if err != nil .... rather than having to pass both results and errors through channels
over-generalization is a more apt word than "future proofing". If you need to call your invoke methods asynchronously in the future, then change your code in the future. It will be just as easy then to add asynchronous wrappers around your functions.
I am a little bit late to the party but I think I still have something to share.
Before answering the question, I would like to dive a little into the Go Proverb, Concurrency is not parellism. It is a very common misunderstanding of goroutine and Go's language feature that when people thinking of goroutine, they thinks about the ability of being parell.
But as Rob Pike pointed out in many Go Talks, what Go and goroutine auctually provides, is concurency. Concurency is a model of better interepting the real world, a way and a structure of code, of how code interact.
So back to the question. Should goroutine be used when steps are sequential? It depends on the design. If your code compose of individual parts that talks to each other very naturally, or if some of your code preserve a state and frequently return just does not make sense, or if your code fits in any other concurency design, it is perfectly fine to use goroutine and channel and select statement. Rob Pike, again, gives a Go Talk on a lexer used by now Go's text/template where the lexer uses a goroutine but the parser, obviously, only use the lexer sequentially. He stated that by using goroutine and channel, at cost of a little performance, a better API is acheived.
But on the other hand, in your example and probably what you are thinking about (codes that may require future parellism), I agree with #Marc. Stick for blocking call, at least for now.
You could use a channel like buf, or use []string (slice of strings urls). You don't get any benefit from gorutines, if you need only sequential execution, because we couldn't control goroutine when it's start.
But we can ask then don't wait runtime.Gosched()
From documentation:
Gosched yields the processor, allowing other goroutines to run. It does not suspend the current goroutine, so execution resumes automatically.
Example of sequential execution:
package main
func main() {
urls := []string{
"https://google.com",
"https://yahoo.com",
"https://youtube.com",
}
buf := make([][]byte, 0, len(urls))
for _, v := range urls {
buf = append(buf, sendRequest(v))
}
}
func sendRequest(url string) []byte {
//send
return []byte("")
}
For the solution as such, I see no gain by solving your problem the way you are in the real world. That doesn't mean there isn't gain in your solution. If you're after practice for example synchronising go routines, then you have a plethora of possibilities experimenting and learning.
So to me this is far from a clear no, but no clear yes either. I would refrain from saying maybe, it depends on what exactly your after. Are you after actually solving a problem or learning?