Can't overcome 4000 RPS threshold on localhost HTTP Golang server

I tried to measure the throughput of Go's default HTTP server implementation on my local machine. The server just accepts any HTTP request, increments a counter using sync/atomic, and sends a 200 OK response. It also collects the number of requests received every second, prints it, and resets the counter to zero:
type hand struct {
	cnt int32
}

func (h *hand) ServeHTTP(rsp http.ResponseWriter, req *http.Request) {
	atomic.AddInt32(&h.cnt, 1)
	rsp.WriteHeader(200)
	if req.Body != nil {
		req.Body.Close()
	}
}
func main() {
	h := new(hand)
	s := &http.Server{
		Addr:    ":8080",
		Handler: h,
	}
	ticker := time.NewTicker(1 * time.Second)
	go func() {
		for tick := range ticker.C {
			val := atomic.SwapInt32(&h.cnt, 0)
			fmt.Printf("(%v) %d RPS\n", tick, val)
		}
	}()
	log.Fatal(s.ListenAndServe())
}
The client tries to send 100000 GET requests concurrently:
const total = 100000

func main() {
	var r int32
	var rsp int32
	r = total
	rsp = r
	for r > 0 {
		go func() {
			p, err := http.Get("http://localhost:8080")
			atomic.AddInt32(&rsp, -1)
			if err != nil {
				fmt.Printf("error: %s\n", err)
				return
			}
			if p.StatusCode != 200 {
				fmt.Printf("status %d\n", p.StatusCode)
			}
		}()
		r--
	}
	for {
		x := atomic.LoadInt32(&rsp)
		fmt.Printf("sent : %d\n", total-x)
		if x == 0 {
			return
		}
		time.Sleep(1 * time.Second)
	}
}
I'm using a Linux machine with a 5.3.2-gentoo kernel. I raised the nofile ulimit (both soft and hard) to 100000, and all other user applications were stopped while I ran these tests.
I'm not expecting accurate results; I just need to know the order of magnitude of the threshold: X000, X0000, or X00000.
But the server can't process more than 4000 requests per second, which looks too low:
# removed timestamps
0 RPS
0 RPS
0 RPS
3953 RPS
3302 RPS
387 RPS
37 RPS
1712 RPS
How can I raise the throughput of the HTTP server? Or is there an issue with my testing method or local configuration?

The problem was in the testing method:
It's not correct to run the client and server on the same machine; the target server should run on a dedicated host, with a fast enough network between client and target.
Custom scripts are not a good option for network testing: for simple cases wrk can be used, and for more complex scenarios JMeter or other frameworks.
When I tested this server on a dedicated host using wrk, it showed 285900.73 RPS.

Related

Benchmark of a function with and without a worker pool - why is it faster without workers?

I wrote a function in two variants, with a worker pool and without, then created benchmark tests to compare them; surprisingly, the version with the worker pool takes longer than the one without.
Here is the result:
goos: linux
goarch: amd64
BenchmarkWithoutWorker-4    4561    228291 ns/op    13953 B/op    1744 allocs/op
BenchmarkWithWorker-4       1561    651845 ns/op    54429 B/op    2746 allocs/op
The worker pool is simple, and I followed the example from this Stack Overflow question.
Here is the scenario with and without the worker pool:
var wg sync.WaitGroup

// the data will come from the DB; say its length is about 1000
const dataFromDB int = 1000

// numOfProduce in the benchmark is the dataFromDB value defined above
func WithoutWorker(numOfProduce int) {
	for i := 0; i < numOfProduce; i++ {
		if doSomething(fmt.Sprintf("data %d", i)) != nil {
			fmt.Println("error")
		}
	}
}

func WithWorker(numWorker int) {
	jobs := make(chan *Job, dataFromDB)
	result := make(chan *Result, 10)
	for i := 0; i < numWorker; i++ {
		wg.Add(1)
		go consume(i, jobs, result)
	}
	go produce(jobs)
	wg.Wait()
	// I might analyze the result channel here later
	// to return any errors to the client
}
func doSomething(str string) error {
	if str == "" {
		return errors.New("empty")
	}
	return nil
}
func consume(workerID int, jobs <-chan *Job, result chan<- *Result) {
	defer wg.Done()
	for job := range jobs {
		//log.Printf("worker %d", workerID)
		//log.Printf("job %v", job.ValueJob)
		err := doSomething(job.ValueJob)
		if err != nil {
			result <- &Result{Err: err}
		}
	}
}

func produce(jobs chan<- *Job) {
	for i := 1; i < dataFromDB; i++ {
		jobs <- &Job{
			Id:       i,
			ValueJob: fmt.Sprintf("data %d", i),
		}
	}
	close(jobs)
}
Am I missing something in my worker pool?
As for the benchmark test code, it looks like the code from the tutorials out there :) - just simple code calling the functions, and I added b.ReportAllocs() as well.
If the work you are splitting up across several goroutines / workers is smaller than the overhead of the communication needed to send the job to the goroutine and receive the result, then it is faster to do the work in a single goroutine.
In your example you are doing (almost) no work:
func doSomething(str string) error {
if str == "" {
return errors.New("empty")
}
return nil
}
Splitting that up on multiple goroutines is going to slow things down.
Example to illustrate:
If you have a job that takes 5 ns (nanoseconds) and you run it 1000 times, you get
0.005 ms on a single core.
If you distribute it across 10 cores, you add communication overhead for each job. Let's say the communication overhead is 1 microsecond (1000 ns). Now you have 1000 jobs * (5 ns + 1000 ns) / 10 cores =
0.1005 ms on 10 cores.
This is just an example with made-up numbers and the math is not exact, but it should illustrate the point: there is a cost to communication, and it is only worth paying if it is (significantly) smaller than the cost of the job itself.

Why does HTTP request always take as long as the full timeout?

I am making a golang git bruteforcer. It's acting a bit weird; I guess it has something to do with concurrency.
Here's the code : https://dpaste.org/vO7y
package main
import ( <snipped for brevity> )
// ReadFile : reads a file and returns its contents
func ReadFile(fileName string) []string { <snipped for brevity> }
func joinString(strs ...string) string { <snipped for brevity> }
// MakeRequest : makes requests concurrently
func MakeRequest(client *http.Client, url string, useragent string, ch chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	// start := time.Now()
	request, err := http.NewRequest("GET", url, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	request.Header.Set("User-Agent", useragent)
	response, err := client.Do(request)
	if err != nil {
		return
	}
	defer response.Body.Close() // close the body on every path, not just 2xx/3xx responses
	// secs := time.Since(start).Seconds()
	if response.StatusCode < 400 {
		// fmt.Printf("Time elapsed %f", secs)
		bodyBytes, err := ioutil.ReadAll(response.Body)
		if err != nil {
			log.Fatal(err)
		}
		bodyString := string(bodyBytes)
		notGit, err := regexp.MatchString("<html>", strings.ToLower(bodyString))
		if !notGit && len(bodyString) > 0 { // empty pages and HTML pages shouldn't be included
			fmt.Println(bodyString)
			ch <- fmt.Sprintf(" %s ", Green(url))
		}
	}
}
func main() {
	start := time.Now()
	useragent := "Mozilla/10.0 (Windows NT 10.0) AppleWebKit/538.36 (KHTML, like Gecko) Chrome/69.420 Safari/537.36"
	gitEndpoint := []string{"/.git/", "/.git/HEAD", "/.gitignore", "/.git/description", "/.git/index"}
	timeout := 10 * time.Second
	var tr = &http.Transport{
		MaxIdleConns:      30,
		IdleConnTimeout:   time.Second,
		DisableKeepAlives: true,
		TLSClientConfig:   &tls.Config{InsecureSkipVerify: true},
		DialContext: (&net.Dialer{
			Timeout:   timeout,
			KeepAlive: time.Second,
		}).DialContext,
	}
	re := func(req *http.Request, via []*http.Request) error {
		return http.ErrUseLastResponse
	}
	client := &http.Client{
		Transport:     tr,
		CheckRedirect: re,
		Timeout:       timeout,
	}
	output := ReadFile(os.Args[1])
	ch := make(chan string)
	var wg sync.WaitGroup
	for _, url := range output {
		for _, endpoint := range gitEndpoint {
			wg.Add(1)
			go MakeRequest(client, "https://"+url+endpoint, useragent, ch, &wg)
		}
	}
	go func() {
		wg.Wait()
		close(ch)
	}()
	f, err := os.OpenFile("git_finder.txt", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		fmt.Println(Red(err)) // check the open error once, before the loop
	}
	for val := range ch {
		_, err = fmt.Fprintln(f, val)
		if err != nil {
			fmt.Println(Red(err))
		}
		fmt.Println(val)
	}
	f.Close()
	fmt.Printf("Total time taken %.2fs elapsed\n", time.Since(start).Seconds())
}
Working:
It reads the URLs from a file and checks for /.git/, /.git/HEAD, /.git/description and /.git/index on the web server.
Problem:
If I change the http.Client timeout to 2 seconds it finishes in 2 seconds; if it's 50 seconds it waits the full 50 seconds. It doesn't matter whether the input file contains 10 URLs or 500.
My understanding is that with more URLs, it will wait until the timeout of the last URL passed to a goroutine.
Update 1 :
As Adrian mentioned in the comments, it doesn't look like a concurrency problem. The main issue is that I can't put my finger on what exactly the problem is here.
In your code, you are reading URLs from a file, then firing requests in parallel to all those URLs, then waiting for all the parallel requests to finish.
So this actually makes sense and would not indicate an issue:
If I change the http.Client timeout to 2 seconds it will finish in 2 seconds, if it's 50 seconds it will wait till 50 seconds, it doesn't matter if the input file contains 10 urls or 500 urls.
Let's say your file has 500 URLs.
You fire the 500 requests in parallel... then wait for all of them to finish (remember, they are all executing in parallel). How long would that take?
In the worst case (all of the requests timeout at 50 seconds), it will just take 50 seconds in total (since they are all waiting for those 50 seconds in parallel).
In the best case (all requests go through successfully with no timeouts) it should take a few seconds.
In the average case, which you are probably seeing (a few requests time out at 50 seconds), it takes 50 seconds: you are waiting for those few requests to run out their 50 seconds in parallel, just as in the worst case.

Missing milliseconds in the Go code performing redis operation

Below is a sample snippet for getting a value from Redis. I'm pipelining 3 Redis commands and reading the values back. The problem is "missing milliseconds": the time taken by the Redis pipeline is significantly lower (less than 5 ms), but the overall time taken by the Get operation is more than 10 ms. I'm not sure which operation takes the time; unmarshalling is not the issue, as I measured len(bytes) and its timing. Any help is much appreciated.
Requests/second = 300, running on 3 AWS large instances against a powerful 25 GB Redis instance, using the default 10 connections.
func Get(params ...) <-chan CacheResult {
	start := time.Now()
	var res CacheResult
	defer func() {
		resCh <- res
	}()

	type timers struct {
		total     time.Duration
		pipeline  time.Duration
		unmarshal time.Duration
	}
	t := timers{}

	startPipeTime := time.Now()
	// pipeline commands
	pipe := c.client.Pipeline()
	// 3 commands pipelined (HGET, HGET, GET)
	if _, res.Err = pipe.Exec(); res.Err != nil && res.Err != redis.Nil {
		return resCh
	}
	t.pipeline = time.Since(startPipeTime)

	// read query values like below for HGET & GET
	if val, res.Err = cachedValue.Bytes(); res.Err != nil {
		return resCh
	}

	// unmarshal the query value
	startUnmarshalTime := time.Now()
	var cv common.CacheValue
	if res.Err = json.Unmarshal(val, &cv); res.Err != nil {
		return resCh
	}
	t.unmarshal = time.Since(startUnmarshalTime)

	t.total = time.Since(start)
	xlog.Infof("Timings total:%s, "+
		"pipeline(redis):%s, unmarshaling(%vB):%s", t.total, t.pipeline, len(val), t.unmarshal)
	return resCh
}
The time to execute a Redis command includes:
1. App server pre-processing
2. Round-trip time between the app server and the Redis server
3. Redis server processing time
In normal operation, (2) takes the most significant time.

Golang http handler - time taken for request

I am trying to set a timer to measure how much time my server needs to finish a request, and I want the timer to stop after the last byte of the response is sent.
I found that the HTTP server will only send the response after the handler function returns.
Is there any way to add a callback after the response is sent?
Or is there a better way to count the time from the first byte of the request coming in until the last byte of the response being sent?
The easier but less accurate way is to wrap your handler function in a middleware:

func timer(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		startTime := time.Now()
		h.ServeHTTP(w, r)
		duration := time.Since(startTime)
		log.Printf("request took %v", duration) // use the duration, or the compiler rejects the unused variable
	})
}
Then
http.Handle("/route",timer(yourHandler))
This is, more accurately, the time taken to process the request and form the response, not the time until the last byte is sent.
If you absolutely need a more accurate duration, then the code you're looking to change resides in the net/http package, around here.
The highlighted line go c.serve(ctx) is where the goroutine serving the request is spawned.
for {
	rw, e := l.Accept()
	if e != nil {
		if ne, ok := e.(net.Error); ok && ne.Temporary() {
			if tempDelay == 0 {
				tempDelay = 5 * time.Millisecond
			} else {
				tempDelay *= 2
			}
			if max := 1 * time.Second; tempDelay > max {
				tempDelay = max
			}
			srv.logf("http: Accept error: %v; retrying in %v", e, tempDelay)
			time.Sleep(tempDelay)
			continue
		}
		return e
	}
	tempDelay = 0
	c := srv.newConn(rw)
	c.setState(c.rwc, StateNew) // before Serve can return
	go func() {
		startTime := time.Now()
		c.serve(ctx)
		duration := time.Since(startTime)
		_ = duration // record the duration however you need
	}()
}
Note: the request actually gets read from the net.Conn somewhere inside l.Accept(), but the highlighted point is the only place where we can take the approximate start and end times within the same scope in the code.

Why does benchmarking with wrk give weird results when working with Go and MySQL?

It started out with me trying to see how much of a difference there was between prepared statements and non prepared statements. I ended up sitting for 5 hours trying to figure out what was going on. And what I found out makes no sense.
I am using CentOS 7 with supervisord to run Go 1.4.2 in the background as a daemon. I am using MariaDB as the SQL database.
The program I am benchmarking is really simple.
I tried to do this in 3 different ways.
1 (the most incorrect way, which for some reason works best):
package main

import (
	"database/sql"
	"fmt"
	"log"
	"net/http"

	_ "github.com/go-sql-driver/mysql"

	"github.com/julienschmidt/httprouter"
)

type User struct {
	Id       int64
	Email    string
	Username string
}

var DB *sql.DB

func getUser(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
	query := `SELECT id, email, username FROM user WHERE id=3 LIMIT 1;`
	rows, _ := DB.Query(query)
	var users []*User
	for rows.Next() {
		u := new(User)
		rows.Scan(&u.Id, &u.Email, &u.Username)
		users = append(users, u)
	}
	fmt.Fprint(w, users[0])
}

func main() {
	// runtime.GOMAXPROCS(4)
	var err error
	DB, err = sql.Open("mysql", "root:pass@unix(/var/lib/mysql/mysql.sock)/dbname?charset=utf8")
	if err != nil {
		log.Fatal(err)
	}
	if err = DB.Ping(); err != nil {
		log.Fatal(err)
	}
	router := httprouter.New()
	router.GET("/api/user", getUser)
	log.Fatal(http.ListenAndServe(":80", router))
}
I did not check for any errors here, did not close the rows, and so on.
This method gives great results with wrk:
Running 5s test @ http://127.0.0.1:80/api/user
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 18.85ms 17.73ms 238.81ms 91.57%
Req/Sec 7.19k 1.58k 8.45k 96.00%
35776 requests in 5.04s, 5.02MB read
Requests/sec: 7096.50
Transfer/sec: 0.99MB
I seem to get over 7000 r/s consistently - a great result when fetching data from the MySQL database.
For the second method, I just replace the content of the getUser function like this:
func getUser(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
	query := `SELECT id, email, username FROM user WHERE id=3 LIMIT 1;`
	rows, err := DB.Query(query)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	var users []*User
	for rows.Next() {
		u := new(User)
		err := rows.Scan(&u.Id, &u.Email, &u.Username)
		if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}
	if err = rows.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Fprint(w, users[0])
}
This is the recommended way of doing a query, as I understand it:
http://go-database-sql.org/retrieving.html
This gives really weird and awful results, consistently:
Running 5s test @ http://127.0.0.1:80/api/user
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 11.80ms 3.68ms 14.40ms 100.00%
Req/Sec 0.00 0.00 0.00 100.00%
2 requests in 5.01s, 294.00B read
Socket errors: connect 0, read 1490, write 159932, timeout 0
Requests/sec: 0.40
Transfer/sec: 58.64B
Only when I decrease the number of connections do I seem to get better results, but not consistently; the results sometimes drop all the way down to around 1500 r/s. It makes no sense to me.
The third way I tried is this (just a simple one-row query):
func getUser(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
	query := `SELECT id, email, username FROM user WHERE id=3 LIMIT 1;`
	user := new(User)
	err := DB.QueryRow(query).Scan(&user.Id, &user.Email, &user.Username)
	if err != nil {
		if err != sql.ErrNoRows {
			log.Fatal(err)
		}
		http.Error(w, http.StatusText(http.StatusNotFound), http.StatusNotFound)
		return
	}
	fmt.Fprint(w, user)
}
Here I get the same awful results: less than 1 request per second.
What is going on? This has been bugging me for hours now. I hope someone has an explanation and a fix for this.
EDIT:
Now I have removed all the lines with log.Fatal(err), replaced them with log.Println(err), and set
runtime.GOMAXPROCS(4)
Now the single-query version gives me results like these, consistently:
[root@centos7 main]# wrk -t1 -c1000 -d5s http://127.0.0.1:80/api/user
Running 5s test @ http://127.0.0.1:80/api/user
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 22.24ms 12.11ms 176.58ms 79.83%
Req/Sec 38.01k 4.62k 48.75k 66.67%
188080 requests in 5.04s, 24.09MB read
Non-2xx or 3xx responses: 183505
Requests/sec: 37350.28
Transfer/sec: 4.78MB
But is this not too good to be true? There must be something wrong here.
Another thing: are you not supposed to avoid log.Fatal(err) in a production server and use it only during development? All the examples for the MySQL driver call log.Fatal(err) on every error when fetching results.
