Why does Go's log package slow down my HTTP APIs so much? Is it really that slow?
Here is my router example using httprouter without logging:
package main

import (
	"fmt"
	"net/http"

	"github.com/julienschmidt/httprouter"
)

func main() {
	handler := httprouter.New()
	handler.GET("/hello", f)
	http.ListenAndServe(fmt.Sprintf(":%d", 8080), handler)
}

func f(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, "world")
}
Using wrk to benchmark that endpoint I got this:
$ wrk -t1 -d1s -c100 http://localhost:8080/hello
Running 1s test @ http://localhost:8080/hello
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.15ms 197.55us 2.84ms 80.02%
Req/Sec 84.58k 6.15k 99.01k 80.00%
83904 requests in 1.01s, 9.68MB read
Requests/sec: 83380.37
Transfer/sec: 9.62MB
When I added middleware for logging:
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"

	"github.com/julienschmidt/httprouter"
)

func main() {
	handler := httprouter.New()
	handler.GET("/hello", logger(f))
	fmt.Println("httprouter")
	http.ListenAndServe(fmt.Sprintf(":%d", 8080), handler)
}

func logger(next httprouter.Handle) httprouter.Handle {
	return func(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
		start := time.Now()
		next(w, r, ps)
		elapsed := time.Since(start)
		log.Printf("%s | %s | %s | %d\n", time.Now().Format(time.RFC3339), r.Method, r.URL.Path, elapsed)
	}
}

func f(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, "world")
}
Throughput dropped by a factor of 4:
$ wrk -t1 -d1s -c100 http://localhost:8080/hello
Running 1s test @ http://localhost:8080/hello
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.25ms 4.34ms 26.47ms 60.23%
Req/Sec 20.51k 2.19k 24.28k 70.00%
20449 requests in 1.01s, 2.36MB read
Requests/sec: 20330.66
Transfer/sec: 2.35MB
I tested it locally on:
MacBook Pro 13inches
2 GHz Quad-Core Intel Core i5
Memory 16GB
I use the default GOMAXPROCS, without modifying anything after installation.
Is the log package really that slow? Any suggestions on how to improve this?
This answer summarizes the commentary on the question:
- Use buffered I/O.
- Write from a separate goroutine, so logging does not block the goroutines doing the work.
Here's the code:
type writer chan []byte

func (w writer) Write(p []byte) (int, error) {
	w <- append([]byte(nil), p...)
	return len(p), nil
}

func writePump(w writer) {
	bw := bufio.NewWriter(os.Stderr)
	for p := range w {
		bw.Write(p)
		// Slurp up buffered messages before flushing. This ensures
		// timely output.
		n := len(w)
		for i := 0; i < n; i++ {
			bw.Write(<-w)
		}
		bw.Flush()
	}
}
Set it up as follows:
w := make(writer, 16) // adjust capacity to meet your needs
go writePump(w)
log.SetOutput(w)
Related
Broadly, I'm trying to answer the question "which of two approaches to handling incoming requests is more efficient". I'm measuring efficiency by:
lower max RAM use
lower percentage of CPU usage
These metrics will hopefully signal which approach is better suited to running on a resource-constrained server. The server gets a lot of volume, so the most important thing is that we can process requests faster than they come in. We've had "too many open files" problems when our queue filled up and the server had too many open connections.
So, each of the two programs aims to respond to the initial request as quickly as possible and then queue up the actual work. In real life, the work is an outgoing HTTP request; in my tests, it's a sleep for a variable amount of time, to simulate inconsistent conditions.
Here's the first program:
// approach A: spin up a few goroutines at the start
// and send messages into a long buffered channel
package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	_ "net/http/pprof"
	"time"
)

const (
	minResponseTimeSec = 0.5
	maxResponseTimeSec = 2.5
)

var messageChan = make(chan int, 1024)

func randResponseTime() float64 {
	return minResponseTimeSec + rand.Float64()*(maxResponseTimeSec-minResponseTimeSec)
}

func main() {
	rand.Seed(time.Now().UnixNano())
	for i := 1; i <= 8; i++ {
		go worker()
	}
	http.HandleFunc("/message", handler)
	log.Fatal(http.ListenAndServe(":1234", nil))
}

func handler(writer http.ResponseWriter, request *http.Request) {
	messageChan <- 3 // would normally send POST body
	fmt.Fprint(writer, "ok\n")
}

func worker() {
	for range messageChan {
		time.Sleep(time.Second * time.Duration(randResponseTime()))
	}
}
and here's the second:
// approach B: set a maximum number of concurrent
// goroutines and spin them up as needed
package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	_ "net/http/pprof"
	"time"
)

const (
	minResponseTimeSec = 0.5
	maxResponseTimeSec = 2.5
)

var semaphoreChan chan struct{}

func randResponseTime() float64 {
	return minResponseTimeSec + rand.Float64()*(maxResponseTimeSec-minResponseTimeSec)
}

func main() {
	rand.Seed(time.Now().UnixNano())
	semaphoreChan = make(chan struct{}, 1024)
	http.HandleFunc("/message", handler)
	log.Fatal(http.ListenAndServe(":1234", nil))
}

func handler(writer http.ResponseWriter, request *http.Request) {
	semaphoreChan <- struct{}{}
	go fanout() // would normally send POST body
	fmt.Fprint(writer, "ok\n")
}

func fanout() {
	defer func() { <-semaphoreChan }()
	time.Sleep(time.Second * time.Duration(randResponseTime()))
}
With each program running (separately), I used ab to send lots of requests at each in turn. But I'm having trouble interpreting the data available under /debug/pprof. I was focusing on the stats at the bottom of /debug/pprof/heap?debug=1, especially TotalAlloc and HeapAlloc. Those numbers all seem to grow indefinitely as I refresh the page, while I'd expect them to stay flat before/after the benchmarking is done, which leads me to think I'm looking at the wrong numbers.
I have the following program, where the HTTP server is created using gorilla/mux.
When any request comes in, it starts goroutine 1 (GR1). During processing, GR1 starts another goroutine, GR2.
I want GR1 to wait for GR2's response. How can I do that?
How do I ensure that only GR2 will give the response to GR1?
Similarly, GR3 might create GR4, and GR3 should wait for GR4 only.
GR = Goroutine
SERVER
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"time"

	"github.com/gorilla/mux"
)

type Post struct {
	ID    string `json:"id"`
	Title string `json:"title"`
	Body  string `json:"body"`
}

var posts []Post
var i = 0

func getPosts(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	i++
	fmt.Println(i)
	ch := make(chan int)
	go getTitle(ch, i)
	p := Post{
		ID: "123",
	}
	// Wait for the getTitle result and update p with the title
	s := <-ch
	p.Title = strconv.Itoa(s) + strconv.Itoa(i)
	json.NewEncoder(w).Encode(p)
}

func main() {
	router := mux.NewRouter()
	posts = append(posts, Post{ID: "1", Title: "My first post", Body: "This is the content of my first post"})
	router.HandleFunc("/posts", getPosts).Methods("GET")
	http.ListenAndServe(":9999", router)
}

func getTitle(resultCh chan int, m int) {
	time.Sleep(2 * time.Second)
	resultCh <- m
}
CLIENT
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

func main() {
	for i := 0; i < 100; i++ {
		go main2()
	}
	time.Sleep(200 * time.Second)
}

func main2() {
	url := "http://localhost:9999/posts"
	method := "GET"

	client := &http.Client{}
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	res, err := client.Do(req)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer res.Body.Close()

	body, _ := ioutil.ReadAll(res.Body)
	fmt.Println(string(body))
}
RESULT ACTUAL
{"id":"123","title":"25115","body":""}
{"id":"123","title":"23115","body":""}
{"id":"123","title":"31115","body":""}
{"id":"123","title":"44115","body":""}
{"id":"123","title":"105115","body":""}
{"id":"123","title":"109115","body":""}
{"id":"123","title":"103115","body":""}
{"id":"123","title":"115115","body":""}
{"id":"123","title":"115115","body":""}
{"id":"123","title":"115115","body":""}
RESULT EXPECTED
{"id":"123","title":"112112","body":""}
{"id":"123","title":"113113","body":""}
{"id":"123","title":"115115","body":""}
{"id":"123","title":"116116","body":""}
{"id":"123","title":"117117","body":""}
There are a few ways to do this; a simple way is to use channels.
Change the getTitle func to this:
func getTitle(resultCh chan string) {
	time.Sleep(2 * time.Second)
	resultCh <- "Game Of Thrones"
}
and getPosts will use it like this:
func getPosts(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	ch := make(chan string)
	go getTitle(ch)
	s := <-ch // this will wait until getTitle inserts data into the channel
	p := Post{
		ID: s,
	}
	json.NewEncoder(w).Encode(p)
}
I suspect you are new to Go; this is basic channel usage. For more details, see Channels.
So the problem you're having is that you haven't really learned how to deal with concurrent code yet (not a dig, I was there once). Most of the problem does not center around channels. The channels are working correctly, as @kojan's answer explains. Where things go awry is with the i variable. Firstly, you have to understand that i is not being mutated atomically, so if your client requests arrive in parallel you can mess up the number:
C1:        C2:
i == 6     i == 6
i++        i++
i == 7     i == 7
Two increments in software become one increment in actuality, because i++ is really three operations: load, increment, store.
The second problem you have is that i is not a pointer, so when you pass i to your goroutine you're making a copy. The copy of i in the goroutine is sent back on the channel and becomes the first number in your concatenated string, which you can watch increment. However, the i left behind, which is used in the tail of the string, has continued to be incremented by successive client invocations.
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

var ch chan bool

func testTimer1() {
	go func() {
		log.Println("test timer 1")
		ch <- true
	}()
}

func timer1() {
	timer1 := time.NewTicker(2 * time.Second)
	select {
	case <-timer1.C:
		testTimer1()
	}
}

func myhandler(w http.ResponseWriter, r *http.Request) {
	for {
		go timer1()
		a := <-ch
		log.Println("get a: ", a)
		fmt.Fprintf(w, "hello world!!!!", a)
	}
	log.Println("test for break")
}

func main() {
	ch = make(chan bool)
	http.HandleFunc("/", myhandler)
	http.ListenAndServe(":8080", nil)
}
I wrote the above code and put a channel into myhandler. The channel is given a bool value when the timer task executes.
Then I read the value from the channel and write "hello world" to the http writer,
but I found the client never receives the "hello world": the writer is blocked!
Does anyone know why?
The for loop is an infinite loop, so printing to the ResponseWriter is not "scheduled" to happen. If you want a comet-like approach (or long-polling URL), you may want to try this method.
There's also a leak of tickers in timer1(). According to the Go Docs:
Stop the ticker to release associated resources.
You're always creating a new ticker every time you call go timer1(), and the tickers are never stopped, so every new ticker just adds up.
Short answer
Avoid buffering
Client Side
Use curl with --no-buffer set
curl http://localhost:8080 --no-buffer
Server Side
Flush after every fmt.Fprint
w.(http.Flusher).Flush()
Long Answer
The biggest problem when implementing HTTP streaming is understanding the effect of buffering. Buffering is the practice of accumulating reads or writes in a temporary, fixed-size memory area. The advantages of buffering include reducing the overhead of read or write calls. For example, instead of writing 1KB 4096 times, you can just write 4096KB at once. This means your program can create a write buffer holding 4096KB of temporary data (which can be aligned to the disk block size), and once the space limit is reached, the buffer is flushed to disk.
Here, the HTTP exchange involves two components: the server (the Go program) and the client (curl). Each of these components can have its own adjustable buffering styles and limits.
An unrelated issue: the program as given has one more problem, namely that it never stops the ticker. Always stop the ticker to release the associated resources.
Here is an implementation with some corrections
Code
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

var ch chan bool

func testTimer1() {
	go func() {
		log.Println("test timer 1")
		ch <- true
	}()
}

func timer1() {
	timer1 := time.NewTicker(2 * time.Second)
	defer timer1.Stop()
	<-timer1.C
	testTimer1()
}

func myhandler(w http.ResponseWriter, r *http.Request) {
	for {
		go timer1()
		a := <-ch
		log.Println("get a: ", a)
		fmt.Fprintf(w, "hello world!!!! - %v", a)
		w.(http.Flusher).Flush()
	}
}

func main() {
	ch = make(chan bool)
	http.HandleFunc("/", myhandler)
	http.ListenAndServe(":8080", nil)
}
Curl
curl http://localhost:8080 --no-buffer
Why is there such a delay in processing of incoming requests by the main server goroutine and how can this delay be avoided ?
Simple code with NO DELAYS
package main

import (
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", root)
	http.ListenAndServe(":8090", nil)
}

//---------------------------------------------------------------------------
// http handlers
//---------------------------------------------------------------------------

func root(w http.ResponseWriter, r *http.Request) {
	log.Printf("[root] `%v`\n", r.URL.Path)
	w.Write([]byte("What the hell"))
}
Result of load testing:
╰─➤ wrk -d20s -t5 -c100 http://localhost:8090
Running 20s test @ http://localhost:8090
5 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 17.07ms 46.63ms 368.77ms 91.73%
Req/Sec 10.99k 4.87k 24.48k 62.49%
1038912 requests in 20.10s, 128.80MB read
Requests/sec: 51684.63
Transfer/sec: 6.41MB
Adding goroutines
package main

import (
	"log"
	"net/http"
)

func main() {
	_ = NewTesterGo(100)
	http.HandleFunc("/", root)
	http.ListenAndServe(":8090", nil)
}

//---------------------------------------------------------------------------
// http handlers
//---------------------------------------------------------------------------

func root(w http.ResponseWriter, r *http.Request) {
	log.Printf("[root] `%v`\n", r.URL.Path)
	w.Write([]byte("What the hell"))
}

//---------------------------------------------------------------------------
// tester segment
//---------------------------------------------------------------------------

type (
	TesterGo struct {
		Work chan string
	}
)

func NewTesterGo(count int) *TesterGo {
	t := &TesterGo{
		Work: make(chan string, 100),
	}
	for ; count > 0; count-- {
		go t.Worker()
	}
	return t
}

func (t *TesterGo) Worker() {
	log.Printf("[testergo][worker][work] started....\n")
	for {
		select {
		case work := <-t.Work:
			log.Printf("[testerGo][Worker] %v\n", work)
		default:
		}
	}
}
Result of load testing with the goroutines:
╰─➤ wrk -d20s -t5 -c100 http://localhost:8090
Running 20s test @ http://localhost:8090
5 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 464.71ms 305.44ms 1.90s 77.90%
Req/Sec 54.62 43.74 200.00 67.50%
3672 requests in 20.05s, 466.17KB read
Socket errors: connect 0, read 0, write 0, timeout 97
Requests/sec: 183.11
Transfer/sec: 23.25KB
Your goroutines use a default case, causing them to spin immediately if there is nothing in the channel (and there is nothing in your example). This probably makes Go's scheduler do far more context switching than needed, and probably consumes a lot of CPU for nothing.
Is there a reason to have a default case in the loop? If not, try one of the following:
Either drop the default, so the goroutines simply "sleep" until there's work:
for {
	select {
	case work := <-t.Work:
		log.Printf("[testerGo][Worker] %v\n", work)
	}
}
This BTW makes the select completely redundant, so just get rid of it:
for { // you can also use a range over the channel
	work := <-t.Work
	log.Printf("[testerGo][Worker] %v\n", work)
}
Second option: a timeout that makes them wait before continuing the loop:
for {
	select {
	case work := <-t.Work:
		log.Printf("[testerGo][Worker] %v\n", work)
	case <-time.After(100 * time.Millisecond): // or whatever you prefer
	}
}
The code below works fine with hard-coded JSON data; however, it doesn't work when I read the JSON data from a file. I'm getting a "fatal error: all goroutines are asleep - deadlock" error when using sync.WaitGroup.
WORKING EXAMPLE WITH HARD-CODED JSON DATA:
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"time"
)

func connect(host string) {
	cmd := exec.Command("ssh", host, "uptime")
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		fmt.Println(err)
	}
	fmt.Printf("%s: %q\n", host, out.String())
	time.Sleep(time.Second * 2)
	fmt.Printf("%s: DONE\n", host)
}

func listener(c chan string) {
	for {
		host := <-c
		go connect(host)
	}
}

func main() {
	hosts := [2]string{"user1@111.79.154.111", "user2@111.79.190.222"}
	var c chan string = make(chan string)
	go listener(c)
	for i := 0; i < len(hosts); i++ {
		c <- hosts[i]
	}
	var input string
	fmt.Scanln(&input)
}
OUTPUT:
user@user-VirtualBox:~/go$ go run channel.go
user1@111.79.154.111: " 09:46:40 up 86 days, 18:16, 0 users, load average: 5"
user2@111.79.190.222: " 09:46:40 up 86 days, 17:27, 1 user, load average: 9"
user1@111.79.154.111: DONE
user2@111.79.190.222: DONE
NOT WORKING - EXAMPLE WITH READING JSON DATA FILE:
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"sync"
	"time"
)

func connect(host string) {
	cmd := exec.Command("ssh", host, "uptime")
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		fmt.Println(err)
	}
	fmt.Printf("%s: %q\n", host, out.String())
	time.Sleep(time.Second * 2)
	fmt.Printf("%s: DONE\n", host)
}

func listener(c chan string) {
	for {
		host := <-c
		go connect(host)
	}
}

type Content struct {
	Username string `json:"username"`
	Ip       string `json:"ip"`
}

func main() {
	var wg sync.WaitGroup
	var source []Content
	var hosts []string
	data := json.NewDecoder(os.Stdin)
	data.Decode(&source)
	for _, value := range source {
		hosts = append(hosts, value.Username+"@"+value.Ip)
	}
	var c chan string = make(chan string)
	go listener(c)
	for i := 0; i < len(hosts); i++ {
		wg.Add(1)
		c <- hosts[i]
		defer wg.Done()
	}
	var input string
	fmt.Scanln(&input)
	wg.Wait()
}
OUTPUT
user@user-VirtualBox:~/go$ go run deploy.go < hosts.txt
user1@111.79.154.111: " 09:46:40 up 86 days, 18:16, 0 users, load average: 5"
user2@111.79.190.222: " 09:46:40 up 86 days, 17:27, 1 user, load average: 9"
user1@111.79.154.111: DONE
user2@111.79.190.222: DONE
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc210000068)
/usr/lib/go/src/pkg/runtime/sema.goc:199 +0x30
sync.(*WaitGroup).Wait(0xc210047020)
/usr/lib/go/src/pkg/sync/waitgroup.go:127 +0x14b
main.main()
/home/user/go/deploy.go:64 +0x45a
goroutine 3 [chan receive]:
main.listener(0xc210038060)
/home/user/go/deploy.go:28 +0x30
created by main.main
/home/user/go/deploy.go:53 +0x30b
exit status 2
user#user-VirtualBox:~/go$
HOSTS.TXT
[
  {
    "username": "user1",
    "ip": "111.79.154.111"
  },
  {
    "username": "user2",
    "ip": "111.79.190.222"
  }
]
A Go program ends when the main function ends.
From the language specification
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
Therefore, you need to wait for your goroutines to finish. The common solution for this is to use sync.WaitGroup object.
The simplest possible code to synchronize a goroutine:
package main

import "fmt"
import "sync"

var wg sync.WaitGroup // 1

func routine() {
	defer wg.Done() // 3
	fmt.Println("routine finished")
}

func main() {
	wg.Add(1)    // 2
	go routine() // *
	wg.Wait()    // 4
	fmt.Println("main finished")
}
And for synchronizing multiple goroutines
package main

import "fmt"
import "sync"

var wg sync.WaitGroup // 1

func routine(i int) {
	defer wg.Done() // 3
	fmt.Printf("routine %v finished\n", i)
}

func main() {
	for i := 0; i < 10; i++ {
		wg.Add(1)     // 2
		go routine(i) // *
	}
	wg.Wait() // 4
	fmt.Println("main finished")
}
WaitGroup usage in order of execution.
Declaration of global variable. Making it global is the easiest way to make it visible to all functions and methods.
Increasing the counter. This must be done in the main goroutine, because there is no guarantee that a newly started goroutine will execute before step 4, due to memory model guarantees.
Decreasing the counter. This must be done at the exit of the goroutine. Using a deferred call, we make sure that it will be called whenever the function ends, no matter how it ends.
Waiting for the counter to reach 0. This must be done in the main goroutine to prevent program exit.
* The actual parameters are evaluated before starting the new goroutine. Thus they need to be evaluated explicitly before wg.Add(1), so that possibly panicking code does not leave the counter increased.
Use
param := f(x)
wg.Add(1)
go g(param)
instead of
wg.Add(1)
go g(f(x))
Thanks for the very nice and detailed explanation Grzegorz Żur.
One thing I want to point out is that typically the function that needs to run concurrently won't be in main(), so we would have something like this:
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// VERY IMPORTANT to declare this globally; otherwise one would hit
// "fatal error: all goroutines are asleep - deadlock!"
var wg sync.WaitGroup

func doSomething(arg1 int) {
	defer wg.Done() // signal completion, no matter how we return
	// cured cancer
}

func main() {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	randTime := r.Intn(10)
	wg.Add(1)
	go doSomething(randTime)
	wg.Wait()
	fmt.Println("Waiting for all threads to finish")
}
The thing I want to point out is that the global declaration of wg is crucial for all goroutines to finish before main() returns.
Try this code snippet:
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"sync"
	"time"
)

func connect(host string, wg *sync.WaitGroup) {
	defer wg.Done()
	cmd := exec.Command("ssh", host, "uptime")
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		fmt.Println(err)
	}
	fmt.Printf("%s: %q\n", host, out.String())
	time.Sleep(time.Second * 2)
	fmt.Printf("%s: DONE\n", host)
}

func listener(c chan string, wg *sync.WaitGroup) {
	for {
		host, ok := <-c
		// check whether the channel has been closed
		if !ok {
			break
		}
		go connect(host, wg)
	}
}

func main() {
	var wg sync.WaitGroup
	hosts := [2]string{"user1@111.79.154.111", "user2@111.79.190.222"}
	var c chan string = make(chan string)
	go listener(c, &wg)
	for i := 0; i < len(hosts); i++ {
		wg.Add(1)
		c <- hosts[i]
	}
	close(c)
	wg.Wait()
}