How to measure function run times - go

In a Go web server I want to measure the time taken by an HTTP controller. I call time.Now() before invoking the controller function and time.Since() after it returns. But if the controller makes a long remote I/O request that takes a second, or the process is throttled, or the controller is parallelized with goroutines, then that measurement is not exactly what I want.
If we take the bash time command as an analogy, then this technique gives me the real time:
time go build
real 0m5,204s
user 0m12,012s
sys 0m2,043s
How can I measure user and sys times for a function run (preferably for a goroutine plus the goroutines it spawns) in a Go program, preferably with standard packages?
This is my profiler implementation. How can I extend it with sys and user time per goroutine?
const HeaderCost = "Cost"

// Timed middleware will set the Cost header in the http response.
func Timed(h http.Handler) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        h.ServeHTTP(&responseWriterWithTimer{
            ResponseWriter: w,
            headerWritten:  false,
            startedAt:      time.Now(),
        }, r)
    }
}

type responseWriterWithTimer struct {
    http.ResponseWriter
    headerWritten bool
    startedAt     time.Time
}

func (w *responseWriterWithTimer) WriteHeader(statusCode int) {
    w.Header().Set(
        HeaderCost,
        strconv.FormatFloat(
            time.Since(w.startedAt).Seconds(),
            'g', -1, 64, // prec -1: use the smallest number of digits necessary
        ),
    )
    w.ResponseWriter.WriteHeader(statusCode)
    w.headerWritten = true
}

func (w *responseWriterWithTimer) Write(b []byte) (int, error) {
    if !w.headerWritten {
        w.WriteHeader(http.StatusOK)
    }
    return w.ResponseWriter.Write(b)
}

If you want to do basic instrumentation at runtime, you can wrap your handlers to measure their execution time:
func perfMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        t := time.Now()
        h.ServeHTTP(w, r)
        log.Printf("handler took %s", time.Since(t))
    })
}
You could expose this more easily using expvar. Going beyond this, there are also numerous instrumentation/telemetry/APM libraries available for Go if you look for them, along with metrics management solutions like the TICK stack, Datadog, and so on.
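For instance, a minimal sketch of publishing cumulative handler timings through expvar (the variable names and route are illustrative, not from the original code; importing expvar registers a handler at /debug/vars on the default mux):
package main

import (
    "expvar"
    "fmt"
    "net/http"
    "time"
)

// Counters published at /debug/vars.
var (
    handlerCalls = expvar.NewInt("handler_calls")
    handlerNanos = expvar.NewInt("handler_nanos_total")
)

func instrumented(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        t := time.Now()
        h.ServeHTTP(w, r)
        handlerCalls.Add(1)
        handlerNanos.Add(time.Since(t).Nanoseconds())
    })
}

func main() {
    h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "hello")
    })
    http.Handle("/", instrumented(h))
    http.ListenAndServe(":8080", nil)
}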
As for the real, user, and sys data output by time, these are POSIX measures that don't perfectly apply to instrumenting a Go HTTP handler (or any other unit of code at runtime), for a number of reasons:
goroutines have no parent/child relationship; all are equal peers, so there is no metric of the time taken by the "children" of your handler.
most of the I/O is handled within the stdlib, which isn't instrumented to this level (and instrumentation at this level would have a non-negligible performance impact of its own).
You can of course instrument each piece individually, which is often more useful; for example, instrument your HTTP handlers, as well as any code that is making its own external requests, in order to measure the performance of each component. From this you can analyze the data and get a much clearer picture of what is taking time, in order to address any performance issues you find.
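For example, a sketch of timing one outbound call in the same style as the handler middleware (the URL and log wording are illustrative):
package remote

import (
    "context"
    "io"
    "log"
    "net/http"
    "time"
)

// fetchRemote times a single outbound request so the external call's
// latency can be reported separately from the handler's total time.
func fetchRemote(ctx context.Context, url string) ([]byte, error) {
    t := time.Now()
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    body, err := io.ReadAll(resp.Body)
    log.Printf("remote call to %s took %s", url, time.Since(t))
    return body, err
}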

If you want to measure something in isolation, benchmarks are probably exactly what you're after.
If you're trying to measure an http.Handler, you can use httptest.NewRecorder and httptest.NewRequest to create a new response writer and request object and just invoke the handler directly inside your benchmark.
func BenchmarkHttpHandler(b *testing.B) {
    req := httptest.NewRequest("GET", "/foo", nil)
    myHandler := thingtotest.Handler{}
    for n := 0; n < b.N; n++ {
        myHandler.ServeHTTP(httptest.NewRecorder(), req)
    }
}
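You can run it with go test -bench . (add -benchmem for allocation statistics). Note that benchmarks report wall-clock time per iteration, not user/sys CPU time.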

How can I measure user and sys times for a function run
You cannot. That distinction is not observable for Go functions.
(But honestly: measuring them is of no real use and doesn't make much sense. This sounds like an XY problem.)
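That said, if process-wide figures (not per-goroutine) are enough as an approximation, the standard syscall package exposes them on Unix systems; a minimal sketch:
//go:build unix

package main

import (
    "fmt"
    "syscall"
)

// processTimes returns the user and system CPU time consumed so far by
// the whole process; there is no per-goroutine equivalent.
func processTimes() (user, sys float64, err error) {
    var ru syscall.Rusage
    if err := syscall.Getrusage(syscall.RUSAGE_SELF, &ru); err != nil {
        return 0, 0, err
    }
    toSec := func(tv syscall.Timeval) float64 {
        return float64(tv.Sec) + float64(tv.Usec)/1e6
    }
    return toSec(ru.Utime), toSec(ru.Stime), nil
}

func main() {
    u, s, err := processTimes()
    if err != nil {
        panic(err)
    }
    fmt.Printf("user %.3fs sys %.3fs\n", u, s)
}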

Related

Golang passing function as argument

My question relates to an example from this link: Effective Go. Why do they pass the function sum() as an argument instead of calling it directly? Below is the sample code from the link. The handle() function invokes sum() as req.f(req.args). What are the advantages of doing it this way versus invoking it as sum(req.args)?
type Request struct {
    args       []int
    f          func([]int) int
    resultChan chan int
}

func sum(a []int) (s int) {
    for _, v := range a {
        s += v
    }
    return
}

request := &Request{[]int{3, 4, 5}, sum, make(chan int)}
// Send request
clientRequests <- request
// Wait for response.
fmt.Printf("answer: %d\n", <-request.resultChan)

func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
        /**** how about calling the same function this way ***/
        req.resultChan <- sum(req.args)
        /***************************/
    }
}
With only a single example of a Request this is a valid observation and question. However, if you limit yourself to only that one example of a Request summing a fixed set of integers, then you might also ask: why not simply do the arithmetic yourself and declare a constant?
i.e. all of this code is unnecessary; just write: fmt.Print("Answer: 12\n")
:)
So, assuming that all of this code serves some useful purpose, let's examine what those purposes might be...
The Use of Channels
Requests and results are passed via channels. This is completely unnecessary in this case, as the code is entirely synchronous, but in a more complex scenario where fulfilling a request involves some I/O, channels help improve the efficiency of the code.
The example illustrates the pattern of sending a request over a channel, providing a per request result channel and receiving the result over the request specific result channel.
Request Fields (arguments/parameters)
By providing a Request struct that accepts a slice of ints, requests can be submitted to operate on an arbitrary number of arbitrary int values. There may be 0 or more ints that we want to process in a given request.
The example shows just one scenario involving the ints 3, 4, and 5.
The Func Reference
By providing a func reference in the Request, the specific processing performed by the request is decoupled from the asynchronous channel invocation used to make the request and return the result. This avoids having to recreate different handlers for different types of requests that operate over arbitrary slices of integers and return an int value.
The example illustrates using the asynchronous Request mechanism to sum() the slice of ints. But another request might involve a different function, e.g. mult() to multiply all the ints, or mode()/median() to return the mode or median average.
(A mean() request is not possible, at least not accurately, since that would need to return a float, which this particular Request type does not support via its chan int result channel.)
Why Not Just Call sum() in the handle() func?
What the example illustrates here is "Inversion of Control" or "Don't Call Me, I'll Call You".
i.e. rather than embed the logic needed to fulfil each request in the function that handles the request, each request carries the required logic with it to be called by the handler. The handler is then only responsible for co-ordinating over the channels and calling the request logic at the appropriate point.
You don't have to do this and it may not be appropriate in all cases. Indeed, you could eliminate the sum() func entirely and implement the summing functionality directly in the handle func, if the handle func was only ever required to sum integers in a request (though there is still an argument for decomposing the sum functionality into a separate func to aid unit testing).
But if you did want to perform different asynchronous operations over slices of ints, e.g. multiplication or calculate averages using the same asynchronous pattern, you would need:
additional channels, one for each different type of request
additional handler funcs, one for each different type of request
additional goroutines to run the handler funcs
You wouldn't necessarily need different Request types themselves, since these would be the same, reduced to simply a slice of ints as "input" and a chan int for output. But it might be argued that they should still be separated into different but identical types in order to separate the concerns (so you can change one request type without inadvertently affecting or even breaking the others).
A Different Example
The example falls a little short in not demonstrating the use of a second function operating over a slice of ints, to illustrate the flexibility of the implementation.
A more complete example might be:
package main

import "fmt"

type Request struct {
    args       []int
    f          func([]int) int
    resultChan chan int
}

func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
    }
}

func mult(a []int) (s int) {
    if len(a) == 0 {
        return 0
    }
    s = a[0]
    for _, v := range a[1:] {
        s *= v
    }
    return
}

func sum(a []int) (s int) {
    for _, v := range a {
        s += v
    }
    return
}

func main() {
    // Set up our request handler
    requests := make(chan *Request)
    go handle(requests)

    // Set up some requests
    ints := []int{3, 4, 5}
    rqsum := &Request{ints, sum, make(chan int)}
    rqprod := &Request{ints, mult, make(chan int)}

    // Send sum request, wait for and print result
    requests <- rqsum
    fmt.Printf("sum: %d\n", <-rqsum.resultChan)

    // Send product request, wait for and print result
    requests <- rqprod
    fmt.Printf("product: %d\n", <-rqprod.resultChan)
}
Please note that this does not necessarily illustrate good channel patterns or practices, only serving to demonstrate the inversion of control that function references provide!

context without channels in the same thread of execution

Can't figure out how I can cancel a task, in the same thread of execution, via context semantics if it takes too much time to compute.
I use this example as a reference point:
https://golang.org/src/context/context_test.go
The goal here is to call doWork; if doWork takes too much time to compute, GetValueWithDeadline should return 0 after a timeout; if the caller called cancel (here main is the caller), the wait is cancelled; otherwise the value is returned within the given time window.
The same scenario can be done in a different way (a separate goroutine that sleeps, wakes up and checks the value, a condition on a mutex, etc.), but I really want to understand the correct way to use context.
The channel semantics I understand, but here I can't achieve the desired effect: the call to doWork falls under the default case and sleeps.
package main

import (
    "context"
    "fmt"
    "log"
    "math/rand"
    "sync"
    "time"
)

type Server struct {
    lock sync.Mutex
}

func NewServer() *Server {
    s := new(Server)
    return s
}

func (s *Server) doWork() int {
    s.lock.Lock()
    defer s.lock.Unlock()
    r := rand.Intn(100)
    log.Printf("Going to nap for %d", r)
    time.Sleep(time.Duration(r) * time.Millisecond)
    return r
}

// I took this example from here and it is very unclear where doWork is executed
// https://golang.org/src/context/context_test.go
func (s *Server) GetValueWithDeadline(ctx context.Context) int {
    val := 0
    select {
    case <-time.After(150 * time.Millisecond):
        fmt.Println("overslept")
        return 0
    case <-ctx.Done():
        fmt.Println(ctx.Err())
        return 0
    default:
        val = s.doWork()
    }
    return val
}

func main() {
    rand.Seed(time.Now().UTC().UnixNano())
    s := NewServer()
    for i := 0; i < 10; i++ {
        d := time.Now().Add(50 * time.Millisecond)
        ctx, cancel := context.WithDeadline(context.Background(), d)
        log.Print(s.GetValueWithDeadline(ctx))
        cancel()
    }
}
Thank you
There are multiple problems with your approach.
What problem contexts solve
First, the primary reason contexts were invented in Go is that they unify the approach to cancellation of a set of tasks.
To explain this concept using a simple example, consider a client request to some server; to simplify further, let it be an HTTP request.
The client connects to the server, sends some data telling the server what to do to fulfill the request and then waits for the server to respond.
Let's now suppose the request requires elaborate and time-consuming processing on the server — for instance, suppose it needs to perform multiple complex queries to multiple remote database engines, do multiple HTTP requests to external services and then process the acquired results to actually produce the data the client wants.
So the client starts its request and the server goes on with all those requests.
To hide latency of individual tasks the server has to perform to fulfill the request, it runs them in separate goroutines.
Once each goroutine completes the assigned task, it communicates its result (and/or an error) back to the goroutine which handles the client's request, and so on.
Now suppose that the client fails to wait for the response to its request for whatever reason: a network outage, an explicit timeout in the client's software, the user kills the app which initiated the request, and so on.
As you can see, there's little sense for the server to continue spending resources on finishing the tasks which were logically bound to the now-dead request: there's no one to hear the result anyway.
So it makes sense to reap those tasks once we know the request is not going to be completed, and that's where contexts come into play: you can associate each incoming request with a single context and then either pass that context itself to any goroutine spawned to carry out a single task required to fulfill the request, or derive another context from it and pass that instead.
Then, as soon as you cancel the "root" context, that signal is propagated through the whole tree of contexts derived from the root one.
Now each goroutine which was given a context might "listen" on it to be notified when that cancellation signal is sent, and once the goroutine notices it, it can drop whatever it was busy doing and exit.
In terms of the actual context.Context type, that signal is called "done" (as in "we're done doing whatever this context is associated with"), and that's why a goroutine which wants to know it should stop doing its work listens on a special channel returned by the context's method called Done.
Back to your example
To make it work, you'd do something like:
func (s *Server) doWork(ctx context.Context) int {
    s.lock.Lock()
    defer s.lock.Unlock()
    r := rand.Intn(100)
    log.Printf("Going to nap for %d", r)
    select {
    case <-time.After(time.Duration(r) * time.Millisecond):
        return r
    case <-ctx.Done():
        return -1
    }
}

func (s *Server) GetValueWithTimeout(ctx context.Context, maxTime time.Duration) int {
    d := time.Now().Add(maxTime)
    ctx, cancel := context.WithDeadline(ctx, d)
    defer cancel()
    return s.doWork(ctx)
}

func main() {
    const maxTime = 50 * time.Millisecond
    rand.Seed(time.Now().UTC().UnixNano())
    s := NewServer()
    for i := 0; i < 10; i++ {
        v := s.GetValueWithTimeout(context.Background(), maxTime)
        log.Print(v)
    }
}
(Playground).
So what happens here?
The GetValueWithTimeout method accepts the maximum time the doWork method should take to produce a value, calculates the deadline, derives from the context passed to the method a new context which cancels itself once the deadline passes, and calls doWork with the new context object.
The doWork method arms its own timer to go off after a random time interval and then listens on both the context and the timer.
This is the critical point: code which performs some unit of work that is supposed to be cancellable must actively check, by itself, whether the context has become "done".
So, in our toy example, either doWork's own timer fires first or the deadline of the derived context is reached first; whichever happens first unblocks the select statement and lets it proceed.
Note that if your "do the work" code were more involved (it would actually do something instead of sleeping), you would most probably need to check the context's status periodically, usually after performing individual bits of that work.
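A minimal sketch of that periodic check (the chunked loop and names are mine, not from the question):
package main

import (
    "context"
    "fmt"
    "time"
)

// processChunks does its work in small pieces, checking between pieces
// whether the context has been cancelled.
func processChunks(ctx context.Context, chunks []string) error {
    for _, c := range chunks {
        if err := ctx.Err(); err != nil {
            // Cancelled or deadline exceeded: stop early.
            return err
        }
        // Stand-in for a real unit of work.
        time.Sleep(20 * time.Millisecond)
        fmt.Println("processed", c)
    }
    return nil
}

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
    defer cancel()
    fmt.Println(processChunks(ctx, []string{"a", "b", "c", "d"}))
}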

Rate limit with golang.org/x/time/rate api request

I already created a function for limiting API logins to 50 requests in one day:
package middleware

import (
    "log"
    "net"
    "net/http"
    "sync"
    "time"

    "golang.org/x/time/rate"
)

var limit = 50

// Create a custom request struct which holds the rate limiter for each
// visitor and the last time that the request was seen.
type request struct {
    limiter  *rate.Limiter
    lastSeen time.Time
}

// Change the map to hold values of the type request.
// defaultTime using 3 minutes
var requests = make(map[string]*request)
var mu sync.Mutex

func getRequest(ip string, limit int) *rate.Limiter {
    mu.Lock()
    defer mu.Unlock()
    v, exists := requests[ip]
    if !exists {
        limiter := rate.NewLimiter(1, limit)
        requests[ip] = &request{limiter, time.Now()}
        return limiter
    }
    // Update the last seen time for the visitor.
    v.lastSeen = time.Now()
    return v.limiter
}

func throttle(next http.Handler, limit int) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip, _, err := net.SplitHostPort(r.RemoteAddr)
        if err != nil {
            log.Println(err.Error())
            http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            return
        }
        limiter := getRequest(ip, limit)
        // Call Allow only once per request: every call consumes a token.
        if !limiter.Allow() {
            http.Error(w, http.StatusText(http.StatusTooManyRequests), http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}
Is it correct? When I try it, requests still pass; the limit is not working.
I have doubts about NewLimiter():
limiter := rate.NewLimiter(1, limit)
Does it mean one user can only make 50 login requests per day? (I already read the docs, but I do not understand.)
From the rate docs:
func NewLimiter(r Limit, b int) *Limiter
NewLimiter returns a new Limiter that allows events up to rate r and
permits bursts of at most b tokens.
So the first parameter is the rate limit, not the second. Burst is the number of requests you want to allow that occur faster than the rate limit: typically one uses a value of 1 to disallow bursting; anything higher will let that number of requests in before the regular rate limit kicks in. Anyway...
To create the rate.Limit for your needs, you can use the helper function rate.Every():
rt := rate.Every(24*time.Hour / 50)
limiter := rate.NewLimiter(rt, 1)
NewLimiter(1, 50) means 1 request/second with a burst of up to 50 requests. It's a token bucket, which means that there are 50 tokens, each accepted API call uses up one token, and the tokens are regenerated at the given rate, up to the burst. Your code is creating a limiter per IP address, so that's a limit per IP address (which I guess you are approximating as one IP address being one user).
If you're running on a single persistent server, and the server and code never restart, then you may be able to get something like 50 requests/day per user by specifying a rate of 50 / (3600*24) and a burst of 50. (Note: 3600*24 is the number of seconds in a day.) But the rate-limiting package you're using is not designed for such coarse rate-limiting (on the order of requests per day); it's designed to prevent server overload under heavy traffic in the short term (on the order of requests per second).
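In code, that configuration would look something like this (a sketch under the single-process assumption; rate.Limit is expressed in events per second):
package main

import (
    "fmt"

    "golang.org/x/time/rate"
)

func main() {
    // Roughly 50 requests per day: one token about every 28.8 minutes,
    // with a bucket that holds at most 50 tokens.
    perDay := rate.Limit(50.0 / (24 * 60 * 60))
    limiter := rate.NewLimiter(perDay, 50)
    fmt.Println(limiter.Allow()) // true while tokens remain in the bucket
}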
You probably want a rate-limiter that works with a database or similar (perhaps using a token bucket scheme, since that can be implemented efficiently). There's probably a package somewhere for that, but I don't know of one off the top of my head.
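To sketch the idea anyway (all names invented for illustration; the persistence layer is left out): a token bucket needs only two stored values per user, a token count and a last-refill timestamp, so it maps naturally onto a database row:
package main

import (
    "fmt"
    "math"
    "time"
)

// bucket is the per-user state you would persist, e.g. as a database row:
// a token count and the time of the last refill.
type bucket struct {
    Tokens   float64
    LastSeen time.Time
}

// allow refills the bucket for the elapsed time, then tries to take one
// token. ratePerSec is tokens added per second; burst caps the bucket.
func (b *bucket) allow(now time.Time, ratePerSec, burst float64) bool {
    elapsed := now.Sub(b.LastSeen).Seconds()
    b.Tokens = math.Min(burst, b.Tokens+elapsed*ratePerSec)
    b.LastSeen = now
    if b.Tokens >= 1 {
        b.Tokens--
        return true
    }
    return false
}

func main() {
    // 50 tokens per day, bucket starts full.
    b := &bucket{Tokens: 50, LastSeen: time.Now()}
    fmt.Println(b.allow(time.Now(), 50.0/86400, 50)) // true: one token spent
}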

Calling Functions Inside a "LockOSThread" GoRoutine

I'm writing a package to control a Canon DSLR using their EDSDK DLL from Go.
This is a personal project for a photo booth to use at our wedding, at my partner's request; I'll be happy to post it on GitHub when complete :).
Looking at examples of using the SDK elsewhere, it isn't thread-safe and uses thread-local resources, so I'll need to make sure I'm calling it from a single thread during usage. While not ideal, it looks like Go provides a runtime.LockOSThread function for doing just that, although this also gets called by the core DLL interop code itself, so I'll have to wait and find out whether that interferes or not.
I want the rest of the application to be able to call the SDK through a higher-level interface without worrying about the threading, so I need a way to pass function call requests to the locked thread/goroutine, execute them there, then pass the results back to the calling function outside of that goroutine.
So far, I've come up with this working example using very broad function definitions with []interface{} slices, passed back and forth via channels. This would take a lot of mangling of input/output data on every call to do type assertions back out of the interface{} slice, even though we know what to expect for each function ahead of time, but it looks like it'll work.
Before I invest a lot of time doing it this way for possibly the worst way to do it - does anyone have any better options?
package edsdk

import (
    "fmt"
    "runtime"
)

type CanonSDK struct {
    FChan chan functionCall
}

type functionCall struct {
    Function  func([]interface{}) []interface{}
    Arguments []interface{}
    Return    chan []interface{}
}

func NewCanonSDK() (*CanonSDK, error) {
    c := &CanonSDK{
        FChan: make(chan functionCall),
    }
    go c.BackgroundThread(c.FChan)
    return c, nil
}

func (c *CanonSDK) BackgroundThread(fcalls <-chan functionCall) {
    runtime.LockOSThread()
    for f := range fcalls {
        f.Return <- f.Function(f.Arguments)
    }
    runtime.UnlockOSThread()
}

func (c *CanonSDK) TestCall() {
    ret := make(chan []interface{})
    f := functionCall{
        Function:  c.DoTestCall,
        Arguments: []interface{}{},
        Return:    ret,
    }
    c.FChan <- f
    results := <-ret
    close(ret)
    fmt.Printf("%#v", results)
}

func (c *CanonSDK) DoTestCall([]interface{}) []interface{} {
    return []interface{}{"Test", nil}
}
For similar embedded projects I've played with, I tend to create a single worker goroutine that listens on a channel and performs all the work against that USB device, with any results sent back out on another channel.
Talk to the device with channels only, in a one-way exchange; listen for responses on the other channel.
Since USB is serial and polling, I had to set up a dedicated channel with another goroutine that just picks items off that channel as they are pushed into it by the looping worker goroutine.
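One way to avoid the []interface{} mangling from the question, sketched under the same locked-thread assumption (my own variation, not tested against the EDSDK), is to send typed closures to the worker instead:
package edsdk

import "runtime"

// Worker executes arbitrary closures on a single locked OS thread.
type Worker struct {
    calls chan func()
}

func NewWorker() *Worker {
    w := &Worker{calls: make(chan func())}
    go func() {
        runtime.LockOSThread()
        defer runtime.UnlockOSThread()
        for f := range w.calls {
            f()
        }
    }()
    return w
}

// Do runs f on the locked thread and waits for it to finish.
func (w *Worker) Do(f func()) {
    done := make(chan struct{})
    w.calls <- func() {
        f()
        close(done)
    }
    <-done
}
Callers keep full type safety because each closure captures its own variables, e.g. var name string; w.Do(func() { name = queryCameraName() }), with queryCameraName being a hypothetical SDK call; no type assertions are needed.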

Should I care about providing asynchronous calls in my go library?

I am developing a simple Go library for JSON-RPC over HTTP.
There is the following method:
rpcClient.Call("myMethod", myParam1, myParam2)
This method internally does an http.Get() and returns the result or an error (a tuple).
This is of course synchronous for the caller and returns when the Get() call returns.
Is this the way to provide libraries in Go? Should I leave it to the user of my library to make it asynchronous if she wants to?
Or should I provide a second function called:
rpcClient.CallAsync()
and return a channel here? Because channels cannot carry tuples, I would have to pack the (response, error) tuple in a struct and return that struct instead.
Does this make sense?
Otherwise the user would have to wrap every call in an ugly block like:
result := make(chan AsyncResponse)
go func() {
    res, err := rpcClient.Call("myMethod", myParam1, myParam2)
    result <- AsyncResponse{res, err}
}()
Is there a best practice for Go libraries and asynchrony?
The whole point of Go's execution model is to hide asynchronous operations from the developer and behave like a threaded model with blocking operations. Behind the scenes there are green threads, asynchronous I/O, and a very sophisticated scheduler.
So no, you shouldn't provide an async API in your library. Networking in Go is done in a pseudo-blocking way from the code's perspective, and you open as many goroutines as needed, since they are very cheap.
So your last example is the way to go, and I don't consider it ugly, because it allows the developer to choose the concurrency model. In the context of an HTTP server, where each request is handled in a separate goroutine, I'd just call rpcClient.Call("myMethod", myParam1, myParam2).
Or if I want a fan-out, I'll create the fan-out logic.
You can also create a convenience function for executing the call and returning on a channel:
func CallAsync(method string, p1, p2 interface{}) chan AsyncResponse {
    result := make(chan AsyncResponse)
    go func() {
        res, err := rpcClient.Call(method, p1, p2)
        result <- AsyncResponse{res, err}
    }()
    return result
}
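Usage is then just a receive (assuming AsyncResponse carries, say, Result and Err fields; the answer leaves the struct unspecified):
resp := <-CallAsync("myMethod", myParam1, myParam2)
if resp.Err != nil {
    log.Fatal(resp.Err)
}
fmt.Println(resp.Result)
or, with a bounded wait:
select {
case resp := <-CallAsync("myMethod", myParam1, myParam2):
    fmt.Println(resp.Result, resp.Err)
case <-time.After(2 * time.Second):
    fmt.Println("timed out")
}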
