golang: how to debug possible race condition

I wrote a log collector program in Go, which runs a bunch of goroutines as follows:
routine A runs an HTTP server that lets users view log information
routine B runs a UDP server that accepts log messages sent to it from the LAN
routine C runs a timer that periodically queries and downloads zipped log archives from an internal HTTP file server (not part of the program)
routines B and C both send processed messages to a channel
routine D runs a for {} loop with a select statement that receives messages from the channel and flushes them to disk
there are a few other goroutines, such as one that scans the log archives generated by routine D to create SQLite indices
The program has a problem: after running for a few hours, the log viewer HTTP server still works well, but NO messages come in from either the UDP or the file server routine. I know that log messages are continuously being sent from various sources, and if I restart the program, it starts processing incoming logs again.
I added -race to the build, and it did find some problematic code, which I fixed, but the problem persists. What's more, the old version of the code running on our production server works well despite containing the same racy code.
My question is: how can I proceed to pinpoint the problem? The following is the key loop in my log processing routine:
for {
    select {
    case msg := <-logCh:
        logque.Cache(msg)
    case <-time.After(time.Second):
    }
    if time.Since(lastFlush) >= 3 * time.Second {
        logque.Flush()
        lastFlush = time.Now()
    }
}

I finally found the code that created the blocking. In the following code:
for {
    select {
    case msg := <-logCh:
        logque.Cache(msg)
    case <-time.After(time.Second):
    }
    if time.Since(lastFlush) >= 3 * time.Second {
        logque.Flush()
        lastFlush = time.Now()
    }
}
Inside logque.Flush() there is some code that generates log messages, which in turn are written into the same channel. Because Flush() is called from the goroutine that drains the channel, those sends eventually filled up the channel's buffer and blocked the loop. This only occurs when I turn on debug mode; the production code does not do this in the Flush() method.
To answer my own question, the method I used to nail down the problem is pretty simple: check whether the channel is full before each send.
if len(logCh) >= LOG_CHANNEL_CAP {
    // drop the message or store it into
    // secondary buffer...
    return
}
logCh <- msg
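One caveat with the len() check is that it is not atomic with the send that follows it, so with several senders the buffer can still fill up between the two statements. A minimal sketch of an alternative, assuming a buffered channel of strings: a select with a default case makes the send itself non-blocking. The helper name trySend and the string message type are assumptions for illustration; the original program's types are not shown.

```go
package main

import "fmt"

// trySend attempts a non-blocking send on ch. It reports whether the message
// was queued; when the buffer is full it returns false instead of blocking.
// (Hypothetical helper, not part of the original program.)
func trySend(ch chan<- string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false // buffer full: drop or divert to a secondary buffer
	}
}

func main() {
	logCh := make(chan string, 2) // small capacity to make the effect visible

	for i := 1; i <= 3; i++ {
		msg := fmt.Sprintf("debug message %d", i)
		if !trySend(logCh, msg) {
			fmt.Println("dropped:", msg)
		}
	}
	fmt.Println("queued:", len(logCh))
}
```

With nothing draining the channel, the third message is dropped and two stay queued. Logging each dropped message (to stderr, not to the channel) is one way to spot which code path floods the channel.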

Related

Program with select statements escape deadlock in go

This question has quite possibly been answered already, but I couldn't find it, so here we go:
I have this Go function that sends or receives "messages", whichever is available, using a select statement:
func Seek(name string, match chan string, wg *sync.WaitGroup) {
    defer wg.Done()
    select {
    case peer := <-match:
        fmt.Printf("%s sent a message to %s.\n", peer, name)
    case match <- name:
        // Wait for someone to receive my message.
    }
}
I start this function on 4 different goroutines, using an unbuffered channel (it would be better to use a buffer of 1, but this is merely experimental):
people := []string{"Anna", "Bob", "Cody", "Dave"}
match := make(chan string)
wg := new(sync.WaitGroup)
wg.Add(len(people))
for _, name := range people {
    go Seek(name, match, wg)
}
wg.Wait()
Now, I've just started using Go and thought that since we're using an unbuffered channel, both the send and receive statements of the "select" should block (there's no one waiting to send a message so you can't receive, and there's no one waiting to receive so you can't send), meaning that there won't be any communication between the functions, aka deadlock. However, running the code shows us that this is not the case:
API server listening at: 127.0.0.1:48731
Dave sent a message to Cody.
Anna sent a message to Bob.
Process exiting with code: 0
My question to you lovely people is: why does this happen? Does the compiler realize that the functions want to read from and write to the same channel and arrange for that to happen? Or does the "select" statement continually check whether there's anyone available to use the channel with?
Sorry if the question is hard to answer, I'm still a novice and not that experienced in how things operate behind the scenes :)

Now, I've just started using Go and thought that since we're using an unbuffered channel, both the send and receive statements of the "select" should block (there's no one waiting to send a message so you can't receive, and there's no one waiting to receive so you can't send)
This is actually not true; in fact, there are multiple goroutines waiting to receive and multiple goroutines waiting to send. When a goroutine does a select like yours:
select {
case peer := <-match:
    fmt.Printf("%s sent a message to %s.\n", peer, name)
case match <- name:
    // Wait for someone to receive my message.
}
It is simultaneously waiting to send and to receive. Since you have multiple goroutines doing this, each one soon finds a partner: one goroutine's send pairs with another's receive, so nothing blocks for long. When more than one case is ready at the same time, select picks among them at random.
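The pairing can be observed in a small self-contained program. This is a sketch under assumptions: the function name pairUp, the mutex-protected slice and the WaitGroup are additions for illustration and are not part of the question's code.

```go
package main

import (
	"fmt"
	"sync"
)

// pairUp starts one goroutine per name. Each goroutine offers its own name on
// the unbuffered channel and, at the same time, is ready to receive someone
// else's. It returns one message per completed exchange.
func pairUp(people []string) []string {
	match := make(chan string)
	var (
		mu       sync.Mutex
		messages []string
		wg       sync.WaitGroup
	)
	for _, name := range people {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			select {
			case peer := <-match:
				mu.Lock()
				messages = append(messages, fmt.Sprintf("%s sent a message to %s.", peer, name))
				mu.Unlock()
			case match <- name:
				// Another goroutine received our name.
			}
		}(name)
	}
	wg.Wait()
	return messages
}

func main() {
	for _, m := range pairUp([]string{"Anna", "Bob", "Cody", "Dave"}) {
		fmt.Println(m)
	}
}
```

With four names there are always exactly two exchanges, although who pairs with whom varies from run to run. With an odd number of names the last goroutine would never find a partner and wg.Wait() would block forever, which is the deadlock the question expected.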

Is it a bad idea to pass a channel containing a file handle to a goroutine?

ORIGINAL 09/11/2019
conn := createConnection() // or a file handle
go getData(conn)
Is it possible that getData runs on a different thread from the one that created the conn handle, and can that result in a connection error?
---- UPDATED 11/11/2019 09am ----
Scenario 1
func createConnection() handler {
    // ... create a socket connection (tcp://.....) or open a file handle
    return conn
}
func sendData(conn handler, data string) {
    conn.send(data)
}
conn := createConnection() // or a file handle
go sendData(conn, "test data")
Scenario 2
func createConnection() handler {
    // ... create a socket connection (tcp://.....) or open a file handle
    return conn
}
func sendData(ch chan handler, data string) {
    conn := <-ch
    conn.send(data)
}
ch := make(chan handler, 10)
ch <- createConnection() // or a file handle
go sendData(ch, "test data")
Story behind:
I was working on a task to proxy data to a socket server. My solution to the challenge used the idea of [Scenario 2].
A few of my colleagues are C programmers who work on system-level programming. They pointed out that a Go channel had better contain only data, and that putting a file handle into a channel can cause unknown problems, for example: the thread that receives from the channel may differ from the thread that sent to it, so the file handle could be lost.
To my understanding, Go should already solve this problem by itself. I then asked the question above.
Having looked at the source code of some socket-related projects, I think [Scenario 1] is fine. However, [Scenario 2] is still an open question to me.
Again, my question is not [can I pass a file handle to a function]; everyone knows the answer is yes. The question is whether, in Go's CSP model, using go and chan together and passing a file handle through the channel can be a problem. Or, more interestingly: can sending a pointer into a channel and receiving it on the other side be a problem? By the book, it is a big "no no" in C. If it is fine in Go, how does Go achieve it?
---- UPDATED 11/11/2019 10am ----
The question only applies to Go. Such a problem does not occur in Node.js, since it is single-threaded. The question focuses on threads and file handles. I have limited knowledge of the problem, so I apologise if I asked a bad question or provided misleading information.
---- UPDATED 11/11/2019 10:40am ----
I re-confirmed with my colleague; the concern is: "Every time code asks for a file handle, the system returns a number. However, the number is only unique within one process, which means the same file handle number in a different process may point to a different resource. I am not sure whether goroutines take care of this or not."

There is nothing wrong with passing a connection handle to a separate goroutine as long as you are careful about the following:
Do not close the handle while the goroutine is working, or write the goroutine to deal with it.
If you are using the handle from multiple goroutines, make sure the connection you're dealing with is thread-safe, or put a lock around it.
Be clear and explicit about who's going to close it. The goroutine may close it when it is done, or another goroutine closes it when all work using the handle is done.
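On the colleague's concern: all goroutines of a program run inside one OS process and share its file descriptor table, so a handle received from a channel refers to the same resource no matter which thread happens to run the goroutine. A minimal sketch of Scenario 2 with a single, explicit owner follows. It assumes net.Pipe as a stand-in for a real TCP connection, and the names sendData and roundTrip are illustrative only.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// sendData takes ownership of a connection received from ch, writes data to
// it, and closes it when done. Exactly one goroutine uses and closes the
// connection.
func sendData(ch <-chan net.Conn, data string, errc chan<- error) {
	conn := <-ch
	defer conn.Close()
	_, err := io.WriteString(conn, data)
	errc <- err
}

// roundTrip hands one end of an in-memory connection to a goroutine through
// a channel and reads what that goroutine wrote from the other end.
func roundTrip(data string) (string, error) {
	client, server := net.Pipe() // stands in for a real TCP connection
	defer server.Close()

	ch := make(chan net.Conn, 1)
	errc := make(chan error, 1)
	ch <- client
	go sendData(ch, data, errc)

	got, err := io.ReadAll(server) // returns once the writer closes its end
	if err != nil {
		return "", err
	}
	return string(got), <-errc
}

func main() {
	got, err := roundTrip("test data")
	fmt.Println(got, err)
}
```

Sending the connection through the channel transfers responsibility for it: after `ch <- client`, the sending side no longer touches that end, which keeps the "who closes it" question unambiguous.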

How do I run another executable from a Windows service

Besides a few tutorials on Go, I have no actual experience with it. I'm trying to take a project written in Go and convert it into a Windows service.
I honestly haven't tried anything besides finding things to read over. I have found a few threads and chosen the library that I felt best covered all of our needs:
https://github.com/golang/sys
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// +build windows

package main
import (
    "fmt"
    "strings"
    "time"
    "golang.org/x/sys/windows/svc"
    "golang.org/x/sys/windows/svc/debug"
    "golang.org/x/sys/windows/svc/eventlog"
)
var elog debug.Log
type myservice struct{}
func (m *myservice) Execute(args []string, r <-chan svc.ChangeRequest, changes chan<- svc.Status) (ssec bool, errno uint32) {
    const cmdsAccepted = svc.AcceptStop | svc.AcceptShutdown | svc.AcceptPauseAndContinue
    changes <- svc.Status{State: svc.StartPending}
    fasttick := time.Tick(500 * time.Millisecond)
    slowtick := time.Tick(2 * time.Second)
    tick := fasttick
    changes <- svc.Status{State: svc.Running, Accepts: cmdsAccepted}
loop:
    for {
        select {
        case <-tick:
            beep()
            elog.Info(1, "beep")
        case c := <-r:
            switch c.Cmd {
            case svc.Interrogate:
                changes <- c.CurrentStatus
                // Testing deadlock from https://code.google.com/p/winsvc/issues/detail?id=4
                time.Sleep(100 * time.Millisecond)
                changes <- c.CurrentStatus
            case svc.Stop, svc.Shutdown:
                // golang.org/x/sys/windows/svc.TestExample is verifying this output.
                testOutput := strings.Join(args, "-")
                testOutput += fmt.Sprintf("-%d", c.Context)
                elog.Info(1, testOutput)
                break loop
            case svc.Pause:
                changes <- svc.Status{State: svc.Paused, Accepts: cmdsAccepted}
                tick = slowtick
            case svc.Continue:
                changes <- svc.Status{State: svc.Running, Accepts: cmdsAccepted}
                tick = fasttick
            default:
                elog.Error(1, fmt.Sprintf("unexpected control request #%d", c))
            }
        }
    }
    changes <- svc.Status{State: svc.StopPending}
    return
}
func runService(name string, isDebug bool) {
    var err error
    if isDebug {
        elog = debug.New(name)
    } else {
        elog, err = eventlog.Open(name)
        if err != nil {
            return
        }
    }
    defer elog.Close()
    elog.Info(1, fmt.Sprintf("starting %s service", name))
    run := svc.Run
    if isDebug {
        run = debug.Run
    }
    err = run(name, &myservice{})
    if err != nil {
        elog.Error(1, fmt.Sprintf("%s service failed: %v", name, err))
        return
    }
    elog.Info(1, fmt.Sprintf("%s service stopped", name))
}
So I spent some time going over this code and tested it to see what it does. It performs as it should.
My question is this: we currently have a Go program that takes arguments, and for our service we pass in server, which spins up our stuff on a localhost web page.
I believe the code above may have something to do with that, but I'm lost as to how I would actually get it to spin off our exe with the correct arguments. Is this the right spot to call main?
I'm sorry if this is vague. I don't know exactly how to make this interact with our already existing exe.
I can get that modified if I know what needs to be changed. I appreciate any help.

OK, that's much clearer now. Well, ideally you should start with some tutorial on what constitutes a Windows service; I bet this might have solved the problem for you. But let's try anyway.
Some theory
A Windows service sort of has two facets: it performs some useful task and it communicates with the SCM facility. When you manipulate a service using the sc command or through the Control Panel, you have that piece of software to talk with SCM on your behalf, and SCM talks with that service.
The exact protocol the SCM and a service use is low-level and complicated
and the point of the Go package you're using is to hide that complexity from you
and offer a reasonably Go-centric interface to that stuff.
As you might gather from your own example, the Execute method of the type you've created is—for the most part—concerned with communicating with SCM: it runs an endless for loop which on each iteration sleeps on reading from the r channel, and that channel delivers SCM commands to your service.
So you basically have what could be called "an SCM command processing loop".
Now recall those two facets above. You already have one of them: your service interacts with SCM, so you need another one—the code which actually performs useful tasks.
In fact, it's already partially there: the example code you've grabbed creates a time ticker which provides a channel on which it delivers a value when another tick passes. The for loop in the Execute method reads from that channel as well, "doing work" each time another tick is signalled.
OK, this is fine for a toy example but lame for real work.
Approaching the solution
So let's pause for a moment and think about our requirements.
We need some code running and doing our actual task(s).
We need the existing command processing loop to continue working.
We need these two pieces of code to work concurrently.
In this toy example the 3rd point is there "for free" because a time ticker carries out the task of waiting for the next tick automatically and fully concurrently with the rest of the code.
Your real code most probably won't have that luxury, so what do you do?
In Go, when you need to do something concurrently with something else,
an obvious answer is "use a goroutine".
So the first step is to grab your existing code, turn it into a callable function
and then call it in a separate goroutine right before entering the for loop.
This way, you'll have both pieces run concurrently.
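A platform-independent sketch of that first step follows. It deliberately avoids the Windows-only svc package: the command type and the execute function are simplified, hypothetical stand-ins for svc.ChangeRequest and the Execute method, and the task function stands in for the existing program wrapped in a callable function.

```go
package main

import (
	"fmt"
	"time"
)

// command is a simplified, hypothetical stand-in for svc.ChangeRequest.
type command int

const (
	cmdInterrogate command = iota
	cmdStop
)

// execute mirrors the shape of the Execute method: it starts the real work in
// its own goroutine and then enters the command processing loop. It returns
// the number of commands it handled.
func execute(args []string, cmds <-chan command, task func(args []string)) int {
	go task(args) // the existing program, wrapped in a function

	handled := 0
	for c := range cmds {
		handled++
		if c == cmdStop {
			break
		}
	}
	return handled
}

func main() {
	cmds := make(chan command)

	// Simulated service control manager: interrogate once, then stop.
	go func() {
		cmds <- cmdInterrogate
		time.Sleep(20 * time.Millisecond)
		cmds <- cmdStop
	}()

	n := execute([]string{"server"}, cmds, func(args []string) {
		fmt.Println("task running with arguments:", args)
	})
	fmt.Println("commands handled:", n)
}
```

Note that this sketch never tells the task to stop and never waits for it; that is the communication problem discussed under "The hard parts".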
The hard parts
OK, that wasn't hard.
The hard parts are:
How to configure the code which performs the tasks.
How to make the SCM command processing loop and the code carrying out tasks communicate.
Configuration
This one really depends on the policies at your $dayjob or of your $current_project, but there are a few hints:
A Windows service may receive command-line arguments—either for a single run or permanently (passed to the service on each of its runs).
The downside is that it's not convenient to work with them from the UI/UX standpoint.
Typically Windows services used to read the registry.
These days (after the advent of .NET and its pervasive xml-ity) the services tend to read configuration files.
The OS environment most of the time is a bad fit for the task.
You may combine several of these avenues.
I think I'd start with a configuration file, but then again, you should pick the path of least resistance.
One of the things to keep in mind is that reading and processing the configuration is better done before the service signals the SCM that it started OK: if the configuration is invalid or cannot be loaded, the service should log that extensively and signal that it failed, and not run the actual task processing code.
Communication between the command processing loop and the task-carrying code
This is IMO the hardest part.
It's possible to write a whole book here but let's keep it simple for now.
To make it as simple as possible I'd do the following:
Consider pausing, stopping and shutting down mostly the same: all these signals must tell your task processing code to quit, and then wait for it to actually do that.
Consider the "continue" signal the same as starting the task processing function: run it again—in a new goroutine.
Have a one-directional communication: from the control loop to the tasks processing code, but not the other way—this will greatly simplify service state management.
This way, you may create a single channel which the task processing code listens on—or checks periodically, and when a value comes from that channel, the code stops running, closes the channel and exits.
The control loop, when the SCM tells it to pause or stop or shut down, sends anything on that channel then waits for it to close. When that happens, it knows the tasks processing code is finished.
In Go, a paradigm for a channel which is only used for signaling, is to have a channel of type struct{} (an empty struct).
The question of how to monitor this control channel in the tasks running code is an open one and heavily depends on the nature of the tasks it performs.
Any further help here would be reciting what's written in the Go books on concurrency so you should have that covered first.
There's also an interesting question of how to have the communication between the control loop and the tasks processing loop resilient to the possible processing stalls in the latter, but then again, IMO it's too early to touch upon that.

Golang Channel Won't Receive Messages

I am trying to explore Go channels. I created a buffered channel (max 10) with GOMAXPROCS set to 2, but I wonder why this code doesn't receive any messages:
runtime.GOMAXPROCS(2)
messages := make(chan int, 9)
go func() {
    for {
        i := <-messages
        fmt.Println("Receive data:", i)
    }
}()
for i := 0; i <= 9; i++ {
    fmt.Println("Send data ", i)
    messages <- i
}

Your code may appear to work some of the time, but it is not guaranteed to always work.
Just to add some context: with an unbuffered channel, the sending goroutine blocks when it tries to send a value, and a receive is guaranteed to occur before the sending goroutine (in this case main) is awakened, so an unbuffered channel may seem like a viable option in such cases. But the sending goroutine may still exit before the print statement in the receiving goroutine is executed. So basically you need a synchronization mechanism that lets the sending goroutine exit only after the work in the receiver is completed.
Here's how you can use a synchronization mechanism; I have annotated it so that you can make better sense of it. This will work for both buffered and unbuffered channels. Another option is to do the receive in the main goroutine itself so that the program doesn't exit before receive processing is done; this way you won't need a separate synchronization mechanism. Hope this helps.
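A minimal sketch of one such synchronization mechanism, assuming a sync.WaitGroup plus closing the channel to tell the receiver that no more values are coming. The function name sendAndWait and the returned slice are additions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sendAndWait sends 0..n-1 on a buffered channel and returns only after the
// receiving goroutine has processed every value.
func sendAndWait(n int) []int {
	messages := make(chan int, 9)
	var received []int

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := range messages { // ends when messages is closed and drained
			received = append(received, i)
		}
	}()

	for i := 0; i < n; i++ {
		messages <- i
	}
	close(messages) // no more values: lets the receiver's loop finish
	wg.Wait()       // block until the receiver is done
	return received
}

func main() {
	for _, i := range sendAndWait(10) {
		fmt.Println("Receive data:", i)
	}
}
```

Because the sender waits on the WaitGroup before returning, the program cannot exit while values are still sitting in the buffer, regardless of the channel's capacity.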

You created a channel with a buffer of 9, which means the main goroutine (r1) will not block until it tries to send the 10th element to messages.
The goroutine started by go func() (r2) most probably starts running only when r1 has almost finished, because r2 is a new goroutine and the runtime takes time to create its stack and so on.
So r2 doesn't print anything: r1 is done and the program exits while r2 has only just begun running.

Go showing strange behavior in infinite loop

I am having very strange behavior in my Go code. The overall gist is that when I have
for {
    if messagesRecieved == l {
        break
    }
    select {
    case result := <-results:
        newWords[result.index] = result.word
        messagesRecieved += 1
    default:
        // fmt.Printf("messagesRecieved: %v\n", messagesRecieved)
        if i != l {
            request := Request{word: words[i], index: i, thesaurus_word: results}
            requests <- request
            i += 1
        }
    }
}
the program freezes and fails to advance, but when I uncomment the fmt.Printf call, the program works fine. You can see the entire code here. Does anyone know what's causing this behavior?

Go in version 1.1.2 (the current release) still has only the original (since the initial release) cooperative scheduling of goroutines. The compiler improves the behavior by inserting scheduling points. As can be inferred from the memory model, they are next to channel operations. Additionally, they are also in some well-known but intentionally undocumented places, such as where I/O occurs. The latter explains why uncommenting fmt.Printf changes the behavior of your program. And, BTW, the Go tip version now sports a preemptive scheduler.
Your code keeps one of your goroutines busy going through the default select case. As there are no other scheduling points without the print, no other goroutine has a chance to make progress (assuming the default GOMAXPROCS=1).
I recommend rewriting the logic of the program in a way that avoids spinning (busy waiting). One possible approach is to use a channel send in the default case. As a perhaps nice side effect of using a buffered channel for that, one gets a simple limiter for free.
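One way to remove the spinning, sketched here as a variation on that suggestion: make the send a case of the select itself, with no default branch, so the loop blocks until it can either send a request or receive a result. The types, the worker function and the upper-casing "work" are assumptions for illustration; the question's full code is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

type result struct {
	index int
	word  string
}

type request struct {
	index int
	word  string
	reply chan<- result
}

// worker is a hypothetical stand-in for the thesaurus lookup.
func worker(requests <-chan request) {
	for r := range requests {
		r.reply <- result{r.index, strings.ToUpper(r.word)}
	}
}

// process sends every word to the workers and collects the results without
// busy waiting: the select blocks until a send or a receive can proceed.
func process(words []string, workers int) []string {
	requests := make(chan request)
	results := make(chan result)
	for w := 0; w < workers; w++ {
		go worker(requests)
	}

	out := make([]string, len(words))
	sent, received := 0, 0
	for received < len(words) {
		var send chan<- request // nil channel: the send case is disabled
		var next request
		if sent < len(words) {
			send = requests
			next = request{sent, words[sent], results}
		}
		select {
		case res := <-results:
			out[res.index] = res.word
			received++
		case send <- next:
			sent++
		}
	}
	close(requests)
	return out
}

func main() {
	fmt.Println(process([]string{"go", "select", "channel"}, 2))
}
```

Setting the send channel to nil once all requests are out relies on the rule that operations on a nil channel block forever, so that case is simply never chosen.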
