Number of specific go routines running - go

In Go's runtime package we have NumGoroutine(), which returns the total number of goroutines that currently exist. Is there a simple way to get the number of goroutines running a specific function?
Currently it tells me I have 1000 (or whatever) goroutines in total, but I would like to know that 500 of them are running func Foo. Possible? Simple? Shouldn't bother with it?

I'm afraid you have to count the goroutines on your own if you are interested in those numbers. The cheapest way to achieve your goal would be to use the sync/atomic package directly.
import "sync/atomic"

var counter int64

func example() {
    atomic.AddInt64(&counter, 1)
    defer atomic.AddInt64(&counter, -1)
    // ...
}
Use atomic.LoadInt64(&counter) whenever you want to read the current value of the counter.
And it's not uncommon to have a couple of such counters in your program, so that you can monitor it easily. For example, take a look at the CacheStats struct of the recently published groupcache source.
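To make the counter approach concrete, here is a minimal runnable sketch. The names foo, fooRunning, fooCount and launchFoo are made up for illustration, and the WaitGroups exist only so that the example reads the counter at a well-defined moment; a real program would simply read the counter whenever it wants a snapshot.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fooRunning counts the goroutines currently executing foo (hypothetical name).
var fooRunning int64

// fooCount returns a snapshot of the counter.
func fooCount() int64 { return atomic.LoadInt64(&fooRunning) }

// foo registers itself on entry, deregisters on exit, and blocks until
// release is closed so that the example has something to count.
func foo(started *sync.WaitGroup, release <-chan struct{}) {
	atomic.AddInt64(&fooRunning, 1)
	defer atomic.AddInt64(&fooRunning, -1)
	started.Done()
	<-release
}

// launchFoo starts n goroutines running foo. It returns the counter value
// observed once all of them have entered foo, and a function that stops them.
func launchFoo(n int) (running int64, stop func()) {
	var started, finished sync.WaitGroup
	release := make(chan struct{})
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			foo(&started, release)
		}()
	}
	started.Wait()
	stop = func() {
		close(release)
		finished.Wait()
	}
	return fooCount(), stop
}

func main() {
	running, stop := launchFoo(500)
	fmt.Println("goroutines in foo:", running) // 500
	stop()
	fmt.Println("after stop:", fooCount()) // 0
}
```

One counter per function of interest keeps the overhead to two atomic operations per goroutine.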

Related

Goroutine Channel, Copy vs Pointer

Both functions are doing the same task which is initializing "Data struct". what are the Pros or Cons of each function? e.g. the function should unmarshal a big JSON file.
package main

type Data struct {
    i int
}

func funcp(c chan *Data) {
    var t *Data
    t = <-c // receive
    t.i = 10
}

func funcv(c chan Data) {
    var t Data
    t.i = 20
    c <- t // send
}

func main() {
    c := make(chan Data)
    cp := make(chan *Data)
    var t Data
    go funcp(cp)
    cp <- &t // send
    println(t.i)
    go funcv(c)
    t = <-c // receive
    println(t.i)
}
The title of your question seems wrong. You are asking not about swapping things but rather about whether to send a pointer to some data or a copy of some data. More importantly, the overall thrust of your question lacks crucial information.
Consider two analogies:
Which is better, chocolate ice cream, or strawberry? That's probably a matter of opinion, but at least both serve similar purposes.
Which is better, a jar of glue or a brick of C4? That depends on whether you want to build something, or blow something up, doesn't it?
If you send a copy of data through a channel, the receiver gets ... a copy. The receiver does not have access to the original. The copying process may take some time, but the fact that the receiver does not have to share access may speed things up. So this is something of an opinion, and if your question is about which is faster, well, you'll have to benchmark it. Be sure to benchmark the real problem, and not a toy example, because benchmarks on toy examples don't translate to real-world performance.
If you send a pointer to data through a channel, the receiver gets a copy of the pointer, and can therefore modify the original data. Copying the pointer is fast, but the fact that the receiver has to share access may slow things down. But if the receiver must be able to modify the data, you have no choice. You must use a tool that works, and not one that does not.
In your two functions, one generates values (funcv) so it does not have to send pointers. That's fine, and gives you the option. The other (funcp) receives objects but wants to update them so it must receive a pointer to the underlying object. That's fine too, but it means that you are now communicating by sharing (the underlying data structure), which requires careful coordination.
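The difference can be seen in a small sketch. The helper names sendCopy and sendPointer are invented for this illustration, and the done channels are an addition: they are one way of doing the coordination that sharing requires, because without them the sender could read the data before the receiver has finished writing to it.

```go
package main

import "fmt"

type Data struct{ i int }

// sendCopy sends d by value. The receiver changes only its own copy,
// so the sender's value is untouched.
func sendCopy(d Data) (original, received Data) {
	c := make(chan Data)
	done := make(chan Data)
	go func() {
		r := <-c
		r.i = 10
		done <- r
	}()
	c <- d
	return d, <-done
}

// sendPointer sends the address of the caller's value, so the receiver
// modifies the original.
func sendPointer(d *Data) {
	c := make(chan *Data)
	done := make(chan struct{})
	go func() {
		r := <-c
		r.i = 10
		close(done)
	}()
	c <- d
	<-done // wait for the receiver's write before the caller reads d again
}

func main() {
	orig, got := sendCopy(Data{i: 1})
	fmt.Println(orig.i, got.i) // 1 10

	d := Data{i: 1}
	sendPointer(&d)
	fmt.Println(d.i) // 10
}
```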

Can't get golang pprof working

I have tried to profile some Go applications but couldn't get it working. I followed these two tutorials:
http://blog.golang.org/profiling-go-programs
http://saml.rilspace.org/profiling-and-creating-call-graphs-for-go-programs-with-go-tool-pprof
Both say that after adding a few lines of code to the application, you have to run it. I did that and received the following message on the screen:
2015/06/16 12:04:00 profile: cpu profiling enabled,
/var/folders/kg/4fxym1sn0bx02zl_2sdbmrhr9wjvqt/T/profile680799962/cpu.pprof
So, I understand that the profiling is being executed, sending info to the file.
But when I look at the file size, for any program that I test, it is always 64 bytes.
When I try to open the cpu.pprof file with pprof, and I execute the "top10" command, I see that nothing is in the file:
("./fact" is my app)
go tool pprof ./fact
/var/folders/kg/4fxym1sn0bx02zl_2sdbmrhr9wjvqt/T/profile680799962/cpu.pprof
top10 -->
(pprof) top10 0 of 0 total ( 0%)
flat flat% sum% cum cum%
So, it is like nothing is happening when I am profiling.
I have tested it in mac (this example) and in ubuntu, with three different programs.
Do you know what I am doing wrong?
The example program is very simple. This is the code (a factorial program that I took from the internet):
package main

import (
    "fmt"

    "github.com/davecheney/profile"
)

func fact(n int) int {
    if n == 0 {
        return 1
    }
    return n * fact(n-1)
}

func main() {
    defer profile.Start(profile.CPUProfile).Stop()
    fmt.Println(fact(30))
}
Thanks,
Fer
As inf mentioned already, your code executes too fast. The reason is that pprof works by repeatedly halting your program during its execution, looking at which function is running at that moment in time and writing that down (together with the whole function call stack). Pprof samples with a rate of 100 samples per second. This is hardcoded in runtime/pprof/pprof.go as you can easily check (see https://golang.org/src/runtime/pprof/pprof.go line 575 and the comment above it):
func StartCPUProfile(w io.Writer) error {
    // The runtime routines allow a variable profiling rate,
    // but in practice operating systems cannot trigger signals
    // at more than about 500 Hz, and our processing of the
    // signal is not cheap (mostly getting the stack trace).
    // 100 Hz is a reasonable choice: it is frequent enough to
    // produce useful data, rare enough not to bog down the
    // system, and a nice round number to make it easy to
    // convert sample counts to seconds. Instead of requiring
    // each client to specify the frequency, we hard code it.
    const hz = 100
    // Avoid queueing behind StopCPUProfile.
    // Could use TryLock instead if we had it.
    if cpu.profiling {
        return fmt.Errorf("cpu profiling already in use")
    }
    cpu.Lock()
    defer cpu.Unlock()
    if cpu.done == nil {
        cpu.done = make(chan bool)
    }
    // Double-check.
    if cpu.profiling {
        return fmt.Errorf("cpu profiling already in use")
    }
    cpu.profiling = true
    runtime.SetCPUProfileRate(hz)
    go profileWriter(w)
    return nil
}
The longer your program runs, the more samples are taken and the more likely it becomes that short-running functions are sampled too. If your program finishes before even the first sample is taken, then the generated cpu.pprof will be empty.
As you can see from the code above, the sampling rate is set with
runtime.SetCPUProfileRate(..)
If you call runtime.SetCPUProfileRate() with another value before you call StartCPUProfile(), you can override the sampling rate. During execution your program will print the warning "runtime: cannot set cpu profile rate until previous profile has finished.", which you can ignore. It appears because pprof.go calls SetCPUProfileRate() again. Since you have already set the value, the one from pprof is ignored.
Also, Dave Cheney has released a new version of his profiling tool, you can find it here: https://github.com/pkg/profile . There, you can, among other changes, specify the path where the cpu.pprof is written to:
defer profile.Start(profile.CPUProfile, profile.ProfilePath(".")).Stop()
You can read about it here: http://dave.cheney.net/2014/10/22/simple-profiling-package-moved-updated
By the way, your fact() function will quickly overflow, even if you take int64 as parameter and return value. 30! is roughly 2*10^32 and an int64 stores values only up to 2^63-1 which is roughly 9*10^18.
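A small sketch of the overflow point, using math/big for the exact value. The function names factInt64 and factBig are made up here; factInt64 mirrors the recursive function from the question but with an explicit 64-bit type.

```go
package main

import (
	"fmt"
	"math/big"
)

// factInt64 is the recursive factorial with a fixed 64-bit type.
// It wraps around silently once the result exceeds 2^63-1 (from 21! on).
func factInt64(n int64) int64 {
	if n == 0 {
		return 1
	}
	return n * factInt64(n-1)
}

// factBig computes n! with arbitrary precision.
func factBig(n int64) *big.Int {
	result := big.NewInt(1)
	for i := int64(2); i <= n; i++ {
		result.Mul(result, big.NewInt(i))
	}
	return result
}

func main() {
	fmt.Println(factInt64(20)) // 2432902008176640000, still fits in 64 bits
	fmt.Println(factInt64(30)) // a wrapped value, not 30!
	fmt.Println(factBig(30))   // 265252859812191058636308480000000
}
```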
The problem is that your function is running too fast and pprof can't sample it. Try adding a loop around the fact call and sum the result to artificially prolong the program.
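A rough sketch of that suggestion, using the standard runtime/pprof package directly rather than the profile package from the question. The helper profileFact and the sink variable are invented for this example, and the profile is written to an in-memory buffer only to keep the sketch self-contained; a real program would normally pass a file created with os.Create.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// sink keeps the compiler from discarding the loop's result.
var sink int

func fact(n int) int {
	if n == 0 {
		return 1
	}
	return n * fact(n-1)
}

// profileFact calls fact in a loop for roughly ms milliseconds while CPU
// profiling is active, so the 100 Hz sampler has a chance to observe it.
// It returns the number of loop iterations and the size of the profile.
func profileFact(ms int) (iterations, profileBytes int, err error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, 0, err
	}
	deadline := time.Now().Add(time.Duration(ms) * time.Millisecond)
	sum := 0
	for time.Now().Before(deadline) {
		sum += fact(20)
		iterations++
	}
	pprof.StopCPUProfile() // flushes the profile into buf
	sink = sum
	return iterations, buf.Len(), nil
}

func main() {
	iterations, size, err := profileFact(500)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("%d iterations, %d bytes of profile data\n", iterations, size)
}
```

At 100 samples per second, half a second of work gives the profiler on the order of 50 chances to catch fact on the stack.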
I struggled with empty pprof files, until I realized that I followed outdated blog articles.
The upstream docs are good: https://pkg.go.dev/runtime/pprof
Write a test which you want to profile, then:
go test -cpuprofile cpu.prof -memprofile mem.prof -bench .
This creates cpu.prof and mem.prof.
You can analyze them like this:
go tool pprof cpu.prof
This gives you a command-line. The "top" and "web" commands are used often.
Same for memory:
go tool pprof mem.prof

Golang error function arguments too large for new goroutine

I am running a program with go 1.4 and I am trying to pass a large struct to a go function.
go ProcessImpression(network, &logImpression, campaign, actualSpent, partnerAccount, deviceId, otherParams)
I get this error:
runtime.newproc: function arguments too large for new goroutine
I have moved to pass by reference which helps but I am wondering if there is some way to pass large structs in a go function.
Thanks,
No, none I know of.
I don't think you should be too aggressive tuning to avoid copying, but it appears from the source that this error is emitted when parameters exceed the usable stack space for a new goroutine, which should be kilobytes. The copying overhead is real at that point, especially if this isn't the only time these things are copied. Perhaps some struct either explicitly is larger than expected thanks to a large struct member (1kb array rather than a slice, say) or indirectly. If not, just using a pointer as you have makes sense, and if you're worried about creating garbage, recycle the structs pointed to using sync.Pool.
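A minimal sketch of the pointer plus sync.Pool idea. The type bigRecord, the pool variable and processAll are hypothetical names, and the 8 KB payload only stands in for whatever makes the real struct large.

```go
package main

import (
	"fmt"
	"sync"
)

// bigRecord is a hypothetical large struct that is too costly to copy
// into every new goroutine.
type bigRecord struct {
	id      int
	payload [8192]byte
}

// recordPool recycles bigRecord values so that passing pointers does not
// produce a fresh allocation per goroutine.
var recordPool = sync.Pool{
	New: func() interface{} { return new(bigRecord) },
}

// processAll hands each id to its own goroutine through a pooled pointer
// and returns the sum of the ids the workers saw.
func processAll(ids []int) int {
	var wg sync.WaitGroup
	results := make(chan int, len(ids))
	for _, id := range ids {
		rec := recordPool.Get().(*bigRecord)
		*rec = bigRecord{id: id} // reset any state left from a previous use
		wg.Add(1)
		go func(r *bigRecord) {
			defer wg.Done()
			results <- r.id
			recordPool.Put(r) // the worker is done with r; hand it back
		}(rec)
	}
	wg.Wait()
	close(results)
	total := 0
	for v := range results {
		total += v
	}
	return total
}

func main() {
	fmt.Println(processAll([]int{1, 2, 3, 4})) // 10
}
```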
I was able to fix this issue by changing the arguments from
func doStuff(prev, next User)
to
func doStuff(prev, next *User)
The answer from @twotwotwo here is very helpful.
I got this error when processing a list of values ([]BigType) of a big struct:
for _, stct := range listBigStcts {
    go func(stct BigType) {
        // ... process stct ...
    }(stct) // <-- error occurs here
}
The workaround is to replace []BigType with []*BigType, so that each goroutine receives a pointer instead of a copy.
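The workaround might look like the sketch below. The layout of BigType and the function sumIDs are assumptions made for the example, since the original type is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// BigType stands in for the large struct; its fields are assumed.
type BigType struct {
	id   int
	data [16384]byte
}

// sumIDs starts one goroutine per element. Only a pointer is passed as
// the goroutine argument, so the argument size is one machine word.
func sumIDs(items []*BigType) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, item := range items {
		wg.Add(1)
		go func(p *BigType) {
			defer wg.Done()
			mu.Lock()
			total += p.id
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	return total
}

func main() {
	items := []*BigType{{id: 1}, {id: 2}, {id: 3}}
	fmt.Println(sumIDs(items)) // 6
}
```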

Go "panic: sync: unlock of unlocked mutex" without a known reason

I have a CLI application in Go (still in development). No changes were made to the source code or to the dependencies, but all of a sudden it started to panic with panic: sync: unlock of unlocked mutex.
The only place I'm running concurrent code is to handle when program is requested to close:
func handleProcTermination() {
    c := make(chan os.Signal, 1)
    signal.Notify(c, os.Interrupt)
    go func() {
        <-c
        curses.Endwin()
        os.Exit(0)
    }()
    defer curses.Endwin()
}
Only thing I did was to rename my $GOPATH and work space folder. Can this operation cause such error?
Have any of you experienced a similar problem without finding an explanation? Is there a rational checklist that would help to find the cause of the problem?
Ok, after some unfruitful debugging sessions, as a last resort, I simply wiped all third party code (dependencies) from the workspace:
cd $GOPATH
rm -rf pkg/ bin/ src/github.com src/golang.org # the idea is to remove all except your own source
Used go get to get all used dependencies again:
go get github.com/yadayada/yada
go get # etc
And the problem is gone! Application is starting normally and tests are passing. No startup panics anymore. It looks like this problem happens when you mv your work space folder but I'm not 100% sure yet. Hope it helps someone else.
From now on, reinstalling dependencies will be my first step when weird panics like that suddenly appear.
You're not giving much information to go on, so my answer is generic.
In theory, bugs in concurrent code can remain unnoticed for a long time and then suddenly show up. In practice, if the bug is easily repeatable (happens nearly every run) this usually indicates that something did change in the code or environment.
The solution: debug.
Knowing what has changed can be helpful to identify the bug. In this case, it appears that lock/unlock pairs are not matching up. If you are not passing locks between threads, you should be able to find a code path within the thread that has not acquired the lock, or has released it early. It may be helpful to put assertions at certain points to validate that you are holding the lock when you think you are.
Make sure you don't copy the lock somewhere.
What can happen with seemingly bulletproof code in concurrent environments is that the struct containing the lock gets copied or overwritten elsewhere, which results in the underlying lock being different.
Consider this code snippet:
type someStruct struct {
    lock sync.Mutex
}

func (s *someStruct) DoSomethingUnderLock() {
    s.lock.Lock()
    defer s.lock.Unlock() // This will panic
    time.Sleep(200 * time.Millisecond)
}

func main() {
    s1 := &someStruct{}
    go func() {
        time.Sleep(100 * time.Millisecond) // Wait until DoSomethingUnderLock takes the lock
        s2 := &someStruct{}
        *s1 = *s2
    }()
    s1.DoSomethingUnderLock()
}
*s1 = *s2 is the key here - it results in a different lock being used by the same receiver function and if the struct is replaced while the lock is taken, we'll get sync: unlock of unlocked mutex.
What makes it harder to debug is that someStruct might be nested in another struct (and another, and another), and if the outer struct gets replaced (as long as someStruct is not a reference there) in a similar manner, the result will be the same.
If this is the case, you can use a reference to the lock (or the whole struct) instead. Now you need to initialize it, but it's a small price that might save you some obscure bugs. See the modified code that doesn't panic here.
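As a sketch of the pointer variant (not the code the answer links to, which is not reproduced here): the constructor newSomeStruct, the value field and the method ReplaceUnderLock are made up, and the overwrite happens in the same goroutine so the example is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

type someStruct struct {
	lock  *sync.Mutex // a pointer: copying the struct copies the pointer, not the mutex state
	value int
}

// newSomeStruct is needed because a nil *sync.Mutex cannot be locked.
func newSomeStruct(v int) *someStruct {
	return &someStruct{lock: &sync.Mutex{}, value: v}
}

// ReplaceUnderLock overwrites the whole struct while its lock is held.
func (s *someStruct) ReplaceUnderLock(with *someStruct) {
	s.lock.Lock()
	// s.lock is evaluated when the defer statement runs, so Unlock is
	// called on the mutex that was actually locked, even though s.lock
	// points at a different mutex after the assignment below.
	defer s.lock.Unlock()
	*s = *with
}

func main() {
	s1 := newSomeStruct(1)
	s1.ReplaceUnderLock(newSomeStruct(2))
	fmt.Println(s1.value) // 2
}
```

With lock declared as a plain sync.Mutex, the same assignment would reset the mutex in place and the deferred Unlock would fail with "sync: unlock of unlocked mutex".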
For those who come here and still haven't solved the problem: check whether the application was compiled on one version of Linux but is running on another. At least in my case, that was the cause.

why go routine only operate on one core [duplicate]

I'm testing this Go code on my VirtualBoxed Ubuntu 11.4
package main

import (
    "big"
    "fmt"
    "time"
)

var c chan *big.Int

func sum(start, stop, step int64) {
    bigStop := big.NewInt(stop)
    bigStep := big.NewInt(step)
    bigSum := big.NewInt(0)
    for i := big.NewInt(start); i.Cmp(bigStop) < 0; i.Add(i, bigStep) {
        bigSum.Add(bigSum, i)
    }
    c <- bigSum
}

func main() {
    s := big.NewInt(0)
    n := time.Nanoseconds()
    step := int64(4)
    c = make(chan *big.Int, int(step))
    stop := int64(100000000)
    for j := int64(0); j < step; j++ {
        go sum(j, stop, step)
    }
    for j := int64(0); j < step; j++ {
        s.Add(s, <-c)
    }
    n = time.Nanoseconds() - n
    fmt.Println(s, float64(n)/1000000000.)
}
Ubuntu has access to all my 4 cores. I checked this with simultaneous run of several executables and System Monitor.
But when I run this code, it uses only one core and gains nothing from parallel processing.
What am I doing wrong?
You probably need to review the Concurrency section of the Go FAQ, specifically these two questions, and work out which (if not both) apply to your case:
Why doesn't my multi-goroutine program use multiple CPUs?
You must set the GOMAXPROCS shell environment variable or use the similarly-named function of the runtime package to allow the run-time support to utilize more than one OS thread.
Programs that perform parallel computation should benefit from an increase in GOMAXPROCS. However, be aware that concurrency is not parallelism.
Why does using GOMAXPROCS > 1 sometimes make my program slower?
It depends on the nature of your program. Programs that contain several goroutines that spend a lot of time communicating on channels will experience performance degradation when using multiple OS threads. This is because of the significant context-switching penalty involved in sending data between threads.
Go's goroutine scheduler is not as good as it needs to be. In future, it should recognize such cases and optimize its use of OS threads. For now, GOMAXPROCS should be set on a per-application basis.
For more detail on this topic see the talk entitled Concurrency is not Parallelism.
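For reference, a sketch of the question's program with GOMAXPROCS set through the runtime package. It is written against current Go (import path math/big), and the names partialSum and parallelSum are introduced here. Note that Go 1.5 and later already default GOMAXPROCS to the number of available CPUs, so the explicit call mainly matters for older releases like the one in the question.

```go
package main

import (
	"fmt"
	"math/big"
	"runtime"
)

// partialSum adds start, start+step, start+2*step, ... below stop
// and sends the result on out.
func partialSum(start, stop, step int64, out chan<- *big.Int) {
	sum := big.NewInt(0)
	term := new(big.Int)
	for i := start; i < stop; i += step {
		sum.Add(sum, term.SetInt64(i))
	}
	out <- sum
}

// parallelSum computes 0 + 1 + ... + (stop-1) using the given number
// of goroutines, each handling an interleaved share of the range.
func parallelSum(stop int64, workers int) *big.Int {
	out := make(chan *big.Int, workers)
	for w := 0; w < workers; w++ {
		go partialSum(int64(w), stop, int64(workers), out)
	}
	total := big.NewInt(0)
	for w := 0; w < workers; w++ {
		total.Add(total, <-out)
	}
	return total
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	fmt.Println("CPUs:", runtime.NumCPU(), "GOMAXPROCS:", runtime.GOMAXPROCS(0))
	runtime.GOMAXPROCS(runtime.NumCPU()) // allow one running OS thread per CPU
	fmt.Println(parallelSum(1000000, 4)) // 499999500000
}
```

The same effect can be had without code changes by setting the GOMAXPROCS environment variable before starting the program.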
