Golang API giving higher response time with increasing number of concurrent users - go

I am having some problems with concurrent HTTP connections in Go. Kindly read the whole question; since the actual code is quite long, I am using pseudocode.
In short, I have to create a single API which internally calls 5 other APIs, unifies their responses, and sends them back as a single response.
I am using goroutines (with a timeout) to call those 5 internal APIs, and channels to ensure that every goroutine has completed; then I unify their responses and return the result.
Things go fine when I test locally: my response time is around 300ms, which is pretty good.
The problem arises when I run a locust load test with 200 users: my response time goes as high as 7-8 seconds. I am thinking it has to do with the HTTP client waiting for resources, as we are running a high number of goroutines.
For example, 1 API call spins up 5 goroutines, so if each of the 200 users makes requests at a rate of, say, 5 req/sec, the total number of goroutines goes way higher. Again, this is my assumption only.
P.S. Normally the API I am building has a pretty good response time; I am using all the caching and such, and any response greater than 400ms should not be the case.
So can anyone please tell me how I can tackle this problem of response time increasing as the number of concurrent users increases?
Locust test report
Pseudocode
Simple route:
group.POST("/test", controller.testHandler)
Controller:
type Worker struct {
	NumWorker int
	Data      chan structures.Placement
}

e := Worker{
	NumWorker: 5,                                  // number of worker goroutines
	Data:      make(chan structures.Placement, 5), // buffer size
}

// spin up the worker goroutines (wg, params and resChan are declared elsewhere)
for i := 0; i < e.NumWorker; i++ {
	wg.Add(1)
	go ad.GetResponses(params, resChan, &wg) // makes the HTTP call and sends the response on the channel
}

// unify all the responses and return them as our single response
for v := range resChan {
	switch v.Type {
	case A:
		finalResponse.A = v
	case B:
		finalResponse.B = v
	}
}
return finalResponse
Request HTTP client
// I am using a global HTTP client with a custom transport so that I can
// use resources (connections) effectively.
var client *http.Client

func init() {
	tr := &http.Transport{
		MaxIdleConns:        100,
		MaxConnsPerHost:     100,
		MaxIdleConnsPerHost: 100,
		TLSHandshakeTimeout: 0, // no handshake timeout
	}
	client = &http.Client{Transport: tr, Timeout: 10 * time.Second}
}

func GetResponses(params, resChan, wg) { // pseudocode signature
	res := client.Do(req)
	resChan <- res
}
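For reference, here is a self-contained sketch of the fan-out/unify pattern the pseudocode above describes; the URLs and the result shape are placeholders, not the actual APIs:

```
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
	"time"
)

var client = &http.Client{Timeout: 10 * time.Second}

type result struct {
	URL  string
	Body []byte
	Err  error
}

// fanOut calls every URL concurrently and collects all responses.
func fanOut(urls []string) []result {
	resChan := make(chan result, len(urls)) // buffered so no goroutine blocks on send
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			resp, err := client.Get(u)
			if err != nil {
				resChan <- result{URL: u, Err: err}
				return
			}
			defer resp.Body.Close()
			body, err := io.ReadAll(resp.Body)
			resChan <- result{URL: u, Body: body, Err: err}
		}(u)
	}
	wg.Wait()
	close(resChan) // safe: all sends have completed, so the range below terminates
	var out []result
	for r := range resChan {
		out = append(out, r)
	}
	return out
}

func main() {
	for _, r := range fanOut([]string{"https://example.com/a", "https://example.com/b"}) {
		if r.Err != nil {
			fmt.Println(r.URL, "error:", r.Err)
			continue
		}
		fmt.Println(r.URL, "bytes:", len(r.Body))
	}
}
```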

So I did some debugging and span monitoring, and it turns out Redis was the culprit. See https://stackoverflow.com/a/70902382/9928176 to get an idea of how I solved it.

Related

Golang Handlefunc (Runtime output display on browser) [duplicate]

I am trying to send a page response as soon as a request is received, and then process something, but I found that the response does not get sent out "first" even though it is first in the code sequence. In real life I have a page for uploading an Excel sheet which gets saved into the database, which takes time (500,000+ rows), and I would like to update the user on the progress. Here is a simplified example (depending on how much RAM you have, you may need to add a couple of zeros to the counter to see the result):
package main

import (
	"fmt"
	"net/http"
)

func writeAndCount(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Starting to count"))
	for i := 0; i < 1000000; i++ {
		if i%1000 == 0 {
			fmt.Println(i)
		}
	}
	w.Write([]byte("Finished counting"))
}

func main() {
	http.HandleFunc("/", writeAndCount)
	http.ListenAndServe(":8080", nil)
}
The original concept of the HTTP protocol is a simple request-response server-client computation model. There was no streaming or "continuous" client-update support. It is (was) always the client who first contacted the server, should it need some kind of information.
Also, since most web servers cache the response until it is fully ready (or until a certain limit is reached, which is typically the buffer size), data you write (send) to the client won't be transmitted immediately.
Several techniques were "developed" to get around this "limitation" so that the server is able to notify the client about changes or progress, such as HTTP Long polling, HTTP Streaming, HTTP/2 Server Push or Websockets. You can read more about these in this answer: Is there a real server push over http?
So to achieve what you want, you have to step around the original "borders" of the HTTP protocol.
If you want to send data periodically, or stream data to the client, you have to tell this to the server. The easiest way is to check if the http.ResponseWriter handed to you implements the http.Flusher interface (using a type assertion), and if it does, calling its Flusher.Flush() method will send any buffered data to the client.
Using http.Flusher is only half of the solution. Since this is a non-standard usage of the HTTP protocol, usually client support is also needed to handle this properly.
First, you have to let the client know about the "streaming" nature of the response by setting the Content-Type: text/event-stream response header.
Next, to avoid clients caching the response, be sure to also set Cache-Control: no-cache.
And last, to let the client know that you might not send the response as a single unit (but rather as periodic updates or as a stream), and so that the client should keep the connection alive and wait for further data, set the Connection: keep-alive response header.
Once the response headers are set as the above, you may start your long work, and whenever you want to update the client about the progress, write some data and call Flusher.Flush().
Let's see a simple example that does everything "right":
package main

import (
	"fmt"
	"net/http"
	"time"
)

func longHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "Server does not support Flusher!",
			http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Connection", "keep-alive")
	start := time.Now()
	for rows, max := 0, 50*1000; rows < max; {
		time.Sleep(time.Second) // Simulating work...
		rows += 10 * 1000
		fmt.Fprintf(w, "Rows done: %d (%d%%), elapsed: %v\n",
			rows, rows*100/max, time.Since(start).Truncate(time.Millisecond))
		flusher.Flush()
	}
}

func main() {
	http.HandleFunc("/long", longHandler)
	panic(http.ListenAndServe("localhost:8080", nil))
}
Now if you open http://localhost:8080/long in your browser, you will see an output "growing" by every second:
Rows done: 10000 (20%), elapsed: 1s
Rows done: 20000 (40%), elapsed: 2s
Rows done: 30000 (60%), elapsed: 3s
Rows done: 40000 (80%), elapsed: 4.001s
Rows done: 50000 (100%), elapsed: 5.001s
Also note that when using SSE you should "pack" updates into SSE frames: start each frame with the "data:" prefix and end it with 2 newline characters: "\n\n".
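For example, the progress write from the handler above becomes the following when packed as an SSE frame:

```
fmt.Fprintf(w, "data: Rows done: %d (%d%%)\n\n", rows, rows*100/max)
flusher.Flush()
```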
"Literature" and further reading / tutorials
Read more about Server-sent events on Wikipedia.
See a Golang HTML5 SSE example.
See Golang SSE server example with client codes using it.
See w3school.com's tutorial on Server-Sent Events - One Way Messaging.
You can check if the ResponseWriter is a http.Flusher, and if so, force the flush to network:
if f, ok := w.(http.Flusher); ok {
	f.Flush()
}
However, bear in mind that this is a very unconventional HTTP handler. Streaming out progress messages to the response as if it were a terminal presents a few problems, particularly if the client is a web browser.
You might want to consider something more fitting with the nature of HTTP, such as returning a 202 Accepted response immediately, with a unique identifier the client can use to check on the status of processing using subsequent calls to your API.
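A minimal sketch of that 202 Accepted pattern (the in-memory job store and the handler names are illustrative only; it assumes fmt, net/http, strconv, sync and time are imported):

```
var (
	jobs   = make(map[string]string) // job ID -> status
	jobsMu sync.Mutex
)

func startHandler(w http.ResponseWriter, r *http.Request) {
	id := strconv.FormatInt(time.Now().UnixNano(), 10) // crude unique ID
	jobsMu.Lock()
	jobs[id] = "processing"
	jobsMu.Unlock()
	go func() { // the long-running work happens in the background
		time.Sleep(5 * time.Second) // stand-in for the real processing
		jobsMu.Lock()
		jobs[id] = "done"
		jobsMu.Unlock()
	}()
	w.WriteHeader(http.StatusAccepted) // 202
	fmt.Fprintf(w, "accepted; poll /status?id=%s\n", id)
}

func statusHandler(w http.ResponseWriter, r *http.Request) {
	jobsMu.Lock()
	status, ok := jobs[r.URL.Query().Get("id")]
	jobsMu.Unlock()
	if !ok {
		http.NotFound(w, r)
		return
	}
	fmt.Fprintln(w, status)
}
```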

Go Concurrency Circular Logic

I'm just getting into concurrency in Go, and I'm trying to create a dispatch goroutine that sends jobs to a worker pool listening on the jobchan channel. If a message comes into my dispatch function via the dispatchchan channel while my other goroutines are busy, the message is appended onto the stack slice in the dispatcher, and the dispatcher will try to send it again later when a worker becomes available and/or no more messages are received on dispatchchan. This is because dispatchchan and jobchan are unbuffered, and the goroutine the workers run appends other messages to the dispatcher up to a certain point; I don't want the workers blocked waiting on the dispatcher and creating a deadlock. Here's the dispatcher code I've come up with so far:
func dispatch() {
	var stack []string
	acount := 0
	for {
		select {
		case d := <-dispatchchan:
			stack = append(stack, d)
		case c := <-mw:
			acount = acount + c
		case jobchan <- stack[0]:
			if len(stack) > 1 {
				stack[0] = stack[len(stack)-1]
				stack = stack[:len(stack)-1]
			} else {
				stack = nil
			}
		default:
			if acount == 0 && len(stack) == 0 {
				close(jobchan)
				close(dispatchchan)
				close(mw)
				wg.Done()
				return
			}
		}
	}
}
Complete example at https://play.golang.wiki/p/X6kXVNUn5N7
The mw channel is a buffered channel with the same length as the number of worker goroutines. It acts as a semaphore for the worker pool. If a worker routine is doing [m]eaningful [w]ork, it throws int 1 on the mw channel, and when it finishes its work and goes back into the for loop listening on the jobchan, it throws int -1 on mw. This way the dispatcher knows whether any work is being done by the worker pool or the pool is idle. If the pool is idle and there are no more messages on the stack, the dispatcher closes the channels and returns control to the main func.
This is all good, but the issue I have is that the stack itself could be zero length, so in the case where I attempt to send stack[0] to the jobchan, if the stack is empty, I get an out-of-bounds error. What I'm trying to figure out is how to ensure that when I hit that case, stack[0] has a value in it. I don't want that case to send an empty string to the jobchan.
Any help is greatly appreciated. If there's a more idiomatic concurrency pattern I should consider, I'd love to hear about it. I'm not 100% sold on this solution, but this is the farthest I've gotten so far.
This is all good, but the issue I have is that the stack itself could be zero length, so in the case where I attempt to send stack[0] to the jobchan, if the stack is empty, I get an out-of-bounds error.
I can't reproduce it with your playground link, but it's believable, because at least one gofunc worker might have been ready to receive on that channel.
My output has been Msgcnt: 0, which is also easily explained: the gofunc might not have been ready to receive on jobchan when dispatch() runs its select. The order of these operations is not defined.
trying to create a dispatch go routine that will send jobs to a worker pool listening on the jobchan channel
A channel needs no dispatcher. A channel is the dispatcher.
If a message comes into my dispatch function via the dispatchchan channel and my other go routines are busy, the message is [...] will [...] send again later when a worker becomes available, [...] or no more messages are received on the dispatchchan.
With a few creative edits, it was easy to turn that into something close to the definition of a buffered channel. It can be read from immediately, or it can take up to some "limit" of messages that can't be immediately dispatched. You do define limit, though it's not used elsewhere within your code.
In any function, defining a variable you don't read will result in a compile-time error like limit declared but not used. This stricture improves code quality and helps identify typos. But at package scope, you've gotten away with defining the unused limit as a "global" and thus avoided a useful error: you haven't limited anything.
Don't use globals. Use passed parameters to define scope, because the definition of scope is tantamount to functional concurrency as expressed with the go keyword. Pass the relevant channels defined in local scope to functions defined at package scope so that you can easily track their relationships. And use directional channels to enforce the producer/consumer relationship between your functions. More on this later.
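For example, directional channel parameters make the producer/consumer roles explicit, and the compiler enforces them (a two-function sketch, assuming fmt is imported):

```
func produce(out chan<- string) { out <- "job" }      // send-only: a receive here won't compile
func consume(in <-chan string)  { fmt.Println(<-in) } // receive-only: a send here won't compile
```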
Going back to "limit", it makes sense to limit the quantity of jobs you're queueing because all resources are limited, and accepting more messages than you have any expectation of processing requires more durable storage than process memory provides. If you don't feel obligated to fulfill those requests no matter what, don't accept "too many" of them in the first place.
So then, what function has dispatchchan and dispatch()? To store a limited number of pending requests, if any, before they can be processed, and then to send them to the next available worker? That's exactly what a buffered channel is for.
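In the question's terms, a buffered channel replaces both dispatchchan and the stack, and limit finally limits something (a runnable sketch):

```
package main

import (
	"fmt"
	"sync"
)

var limit = 10

func main() {
	jobchan := make(chan string, limit) // holds up to limit pending jobs
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ { // a small worker pool
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobchan { // drains until jobchan is closed
				fmt.Println("did", job)
			}
		}()
	}
	jobchan <- "a" // blocks only once limit jobs are already pending
	jobchan <- "b"
	close(jobchan) // the producer knows when input is done
	wg.Wait()
}
```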
Circular Logic
Who "knows" when your program is done? main() provides the initial input, but you close all 3 channels in `dispatch():
close(jobchan)
close(dispatchchan)
close(mw)
Your workers write to their own job queue so only when the workers are done writing to it can the incoming job queue be closed. However, individual workers also don't know when to close the jobs queue because other workers are writing to it. Nobody knows when your algorithm is done. There's your circular logic.
The mw channel is a buffered channel the same length as the number of worker go routines. It acts as a semaphore for the worker pool.
There's a race condition here. Consider the case where all n workers have just received the last n jobs. They've each read from jobchan and they're checking the value of ok. The dispatcher proceeds to run its select. Nobody is writing to dispatchchan or reading from jobchan right now, so the default case is immediately matched. len(stack) is 0 and there's no current job, so the dispatcher closes all channels, including mw. At some point thereafter, a worker tries to write to a closed channel and panics.
So finally I'm ready to provide some code, but I have one more problem: I don't have a clear problem statement to write code around.
I'm just getting into concurrency in Go and trying to create a dispatch go routine that will send jobs to a worker pool listening on the jobchan channel.
Channels between goroutines are like the teeth that synchronize gears. But to what end do the gears turn? You're not trying to keep time, nor construct a wind-up toy. Your gears could be made to turn, but what would success look like? Their turning?
Let's try to define a more specific use case for channels: given an arbitrarily long set of durations coming in as strings on standard input*, sleep that many seconds in one of n workers. So that we actually have a result to return, we'll say each worker will return the start and end time the duration was run for.
So that it can run in the playground, I'll simulate standard input with a hard-coded byte buffer.
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
	"strings"
	"sync"
	"time"
)

type SleepResult struct {
	worker_id int
	duration  time.Duration
	start     time.Time
	end       time.Time
}

func main() {
	var num_workers = 2
	workchan := make(chan time.Duration)
	resultschan := make(chan SleepResult)
	var wg sync.WaitGroup
	var resultswg sync.WaitGroup

	resultswg.Add(1)
	go results(&resultswg, resultschan)

	for i := 0; i < num_workers; i++ {
		wg.Add(1)
		go worker(i, &wg, workchan, resultschan)
	}

	// playground doesn't have stdin
	var input = bytes.NewBufferString(
		strings.Join([]string{
			"3ms",
			"1 seconds",
			"3600ms",
			"300 ms",
			"5s",
			"0.05min"}, "\n") + "\n")

	var scanner = bufio.NewScanner(input)
	for scanner.Scan() {
		text := scanner.Text()
		if dur, err := time.ParseDuration(text); err != nil {
			fmt.Fprintln(os.Stderr, "Invalid duration", text)
		} else {
			workchan <- dur
		}
	}

	close(workchan) // we know when our inputs are done
	wg.Wait()       // and when our jobs are done
	close(resultschan)
	resultswg.Wait()
}

func results(wg *sync.WaitGroup, resultschan <-chan SleepResult) {
	for res := range resultschan {
		fmt.Printf("Worker %d: %s : %s => %s\n",
			res.worker_id, res.duration,
			res.start.Format(time.RFC3339Nano), res.end.Format(time.RFC3339Nano))
	}
	wg.Done()
}

func worker(id int, wg *sync.WaitGroup, jobchan <-chan time.Duration, resultschan chan<- SleepResult) {
	var res = SleepResult{worker_id: id}
	for dur := range jobchan {
		res.duration = dur
		res.start = time.Now()
		time.Sleep(res.duration)
		res.end = time.Now()
		resultschan <- res
	}
	wg.Done()
}
Here I use 2 wait groups: one for the workers, one for the results. This makes sure I'm done writing all the results before main() ends. I keep my functions simple by having each function do exactly one thing at a time: main reads inputs, parses durations from them, and sends them off to the next worker. The results function collects results and prints them to standard output. The worker does the sleeping, reading from jobchan and writing to resultschan.
workchan can be buffered (or not, as in this case); it doesn't matter because the input will be read at the rate it can be processed. We can buffer as much input as we want, but we can't buffer an infinite amount. I've set channel sizes as big as 1e6 - but a million is a lot less than infinite. For my use case, I don't need to do any buffering at all.
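If buffering were wanted, the only change would be the channel's capacity:

```
workchan := make(chan time.Duration, 1e6) // big, but a million is a lot less than infinite
```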
main knows when the input is done and can close workchan. main also knows when the jobs are done (wg.Wait()) and can close the results channel. Closing these channels is an important signal to the worker and results goroutines: it lets them distinguish between a channel that is empty and a channel that is guaranteed not to have any new additions.
for job := range jobchan {...} is shorthand for your more verbose:
for {
	job, ok := <-jobchan
	if !ok {
		wg.Done()
		return
	}
	...
}
Note that this code creates 2 workers, but it could create 20 or 2000, or even 1. The program functions regardless of how many workers are in the pool. It can handle any volume of input (though interminable input of course leads to an interminable program). It does not create a cyclic loop of output to input. If your use case requires jobs to create more jobs, that's a more challenging scenario that can typically be avoided with careful planning.
I hope this gives you some good ideas about how you can better use concurrency in your Go applications.
https://play.golang.wiki/p/cZuI9YXypxI

Rate limit with golang.org/x/time/rate api request

I have already created a function for limiting API logins to 50 requests in one day.
package middleware

import (
	"log"
	"net"
	"net/http"
	"sync"
	"time"

	"golang.org/x/time/rate"
)

var limit = 50

// Create a custom request struct which holds the rate limiter for each
// visitor and the last time that the request was seen.
type request struct {
	limiter  *rate.Limiter
	lastSeen time.Time
}

// Change the map to hold values of the type request.
// defaultTime using 3 minutes
var requests = make(map[string]*request)
var mu sync.Mutex
func getRequest(ip string, limit int) *rate.Limiter {
	mu.Lock()
	defer mu.Unlock()
	v, exists := requests[ip]
	if !exists {
		limiter := rate.NewLimiter(1, limit)
		requests[ip] = &request{limiter, time.Now()}
		return limiter
	}
	// Update the last seen time for the visitor.
	v.lastSeen = time.Now()
	return v.limiter
}

func throttle(next http.Handler, limit int) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			log.Println(err.Error())
			http.Error(w, "Internal Server Error", http.StatusInternalServerError)
			return
		}
		limiter := getRequest(ip, limit)
		// call Allow() only once per request; each call consumes a token
		if !limiter.Allow() {
			http.Error(w, http.StatusText(http.StatusTooManyRequests), http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
Is it correct? Because when I try it, requests still pass; the limit is not working. I have doubts about NewLimiter():
limiter := rate.NewLimiter(1, limit)
Does it mean one user can only make 50 login requests per day? (I have already read the docs, but I do not understand them.)
From the rate docs:
func NewLimiter(r Limit, b int) *Limiter
NewLimiter returns a new Limiter that allows events up to rate r and
permits bursts of at most b tokens.
So the first parameter is the rate-limit, not the second. Burst is the number of requests you want to allow that occur faster than the rate-limit - typically one uses a value of 1 to disallow bursting, anything higher will let this number of requests in before the regular rate-limit kicks in. Anyway...
To create the rate.Limit for your needs, you can use the helper function rate.Every():
rt := rate.Every(24*time.Hour / 50)
limiter := rate.NewLimiter(rt, 1)
NewLimiter(1, 50) means 1 request/second with a burst of up to 50 requests. It's a token bucket, which means that there are 50 tokens; each accepted API call uses up one token, and the tokens are regenerated at the given rate, up to the burst. Your code is creating a limiter per IP address, so that's a limit per IP address (which I guess you are approximating as one IP address being one user).
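To see the token-bucket behaviour concretely: a limiter with rate 1/sec and burst 3 allows 3 calls immediately and then roughly one per second (a small sketch, assuming fmt and golang.org/x/time/rate are imported):

```
lim := rate.NewLimiter(1, 3) // refills 1 token/sec, bucket holds 3
for i := 0; i < 5; i++ {
	fmt.Println(i, lim.Allow()) // in a tight loop, typically: true, true, true, false, false
}
```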
If you're running on a single persistent server, and the server and code never restarts, then you may be able to get something like 50 requests/day per user by specifying a rate of 50 / (3600*24) and a burst of 50. (Note: 3600*24 is the number of seconds in a day). But the rate limiting package you're using is not designed for such coarse rate-limiting (on the order of requests per day) -- it's designed to prevent server overload under heavy traffic in the short term (on the order of requests per second).
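A sketch of that coarse configuration, using rate.Every to convert an interval into a rate.Limit:

```
// one token roughly every 28.8 minutes (24h/50), with a bucket of 50
perDay := rate.NewLimiter(rate.Every(24*time.Hour/50), 50)
```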
You probably want a rate-limiter that works with a database or similar (perhaps using a token bucket scheme, since that can be implemented efficiently). There's probably a package somewhere for that, but I don't know of one off the top of my head.

How do I properly post a message to a simple Go Chatserver using REST API

I am currently building a simple chat server that supports posting messages through a REST API.
example:

```
curl -X POST -H "Content-Type: application/json" --data '{"user":"alex", "text":"this is a message"}' http://localhost:8081/message
{
  "ok": true
}
```
Right now, I'm just storing the messages in an array of messages. I'm pretty sure this is an inefficient way. So is there a simple, better way to get and store the messages using goroutines and channels that will make it thread-safe?
Here is what I currently have:
type Message struct {
	Text      string
	User      string
	Timestamp time.Time
}

var Messages = []Message{}

func messagePost(c http.ResponseWriter, req *http.Request) {
	decoder := json.NewDecoder(req.Body)
	var m Message
	err := decoder.Decode(&m)
	if err != nil {
		panic(err)
	}
	if m.Timestamp == (time.Time{}) {
		m.Timestamp = time.Now()
	}
	addUser(m.User)
	Messages = append(Messages, m)
}
Thanks!
It could be made thread-safe using a mutex, as @ThunderCat suggested, but I think this does not add concurrency. If two or more requests are made simultaneously, one will have to wait for the other to complete first, slowing the server down.
Adding concurrency: you make it faster and handle more concurrent requests by using a queue (which is a Go channel) and a worker that listens on that channel; it'll be a simple implementation. Every time a message comes in through a POST request, you add it to the queue (this is instantaneous and the HTTP response can be sent immediately). In another goroutine, you detect that a message has been added to the queue, take it out, and append it to your Messages slice. While you're appending to Messages, the HTTP requests don't have to wait.
Note: You can make it even better by having multiple goroutines listen on the queue, but we can leave that for later.
This is how the code will somewhat look like:
type Message struct {
	Text      string
	User      string
	Timestamp time.Time
}

var Messages = []Message{}

// messageQueue is the queue that holds new messages until they are processed
var messageQueue chan Message

func init() { // need the init function to initialize the channel, and the listeners
	// initialize the queue, choosing the buffer size as 8
	// (the number of messages the channel can hold at once)
	messageQueue = make(chan Message, 8)
	// start a goroutine that listens on the queue/channel for new messages
	go listenForMessages()
}

func listenForMessages() {
	// whenever we detect a message in the queue, append it to Messages
	for m := range messageQueue {
		Messages = append(Messages, m)
	}
}

func messagePost(c http.ResponseWriter, req *http.Request) {
	decoder := json.NewDecoder(req.Body)
	var m Message
	err := decoder.Decode(&m)
	if err != nil {
		panic(err)
	}
	if m.Timestamp == (time.Time{}) {
		m.Timestamp = time.Now()
	}
	addUser(m.User)
	// add the message to the channel; it'll only wait if the channel is full
	messageQueue <- m
}
Storing messages: as other users have suggested, storing messages in memory may not be the right choice, since the messages won't persist if the application is restarted. If you're working on a small, proof-of-concept type project and don't want to figure out the DB, you could save the Messages variable as a flat file on the server and then read from it every time the application starts (note: this should not be done on a production system, of course; for that you should set up a database). But yeah, a database should be the way to go.
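A rough sketch of that flat-file fallback (proof of concept only; it assumes encoding/json and os are imported, and ignores the locking discussed below):

```
func saveMessages(path string) error {
	b, err := json.Marshal(Messages)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0644)
}

func loadMessages(path string) error {
	b, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return json.Unmarshal(b, &Messages)
}
```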
Use a mutex to make the program threadsafe.
var Messages = []Message{}
var messageMu sync.Mutex
...
messageMu.Lock()
Messages = append(Messages, m)
messageMu.Unlock()
There's no need to use channels and goroutines to make the program threadsafe.
A database is probably a better choice for storing messages than the in-memory slice used in the question. Asking how to use a database to implement a chat program is too broad a question for SO.

Why is my webserver in golang not handling concurrent requests?

This simple HTTP server contains a call to time.Sleep() that makes
each request take five seconds. When I try quickly loading multiple
tabs in a browser, it is obvious that each request
is queued and handled sequentially. How can I make it handle concurrent requests?
package main

import (
	"fmt"
	"net/http"
	"time"
)

func serve(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "Hello, world.")
	time.Sleep(5 * time.Second)
}

func main() {
	http.HandleFunc("/", serve)
	http.ListenAndServe(":1234", nil)
}
Actually, I just found the answer to this after writing the question, and it is very subtle. I am posting it anyway, because I couldn't find the answer on Google. Can you see what I am doing wrong?
Your program already handles the requests concurrently. You can test it with ab, a benchmark tool which is shipped with Apache 2:
ab -c 500 -n 500 http://localhost:1234/
On my system, the benchmark takes a total of 5043ms to serve all 500 concurrent requests. It's just your browser which limits the number of connections per website.
Benchmarking Go programs isn't that easy by the way, because you need to make sure that your benchmark tool isn't the bottleneck and that it is also able to handle that many concurrent connections. Therefore, it's a good idea to use a couple of dedicated computers to generate load.
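If you'd rather verify it without ab, a small concurrent Go client (not part of the original answer) makes the parallelism visible:

```
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < 10; i++ { // fire 10 requests in parallel
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get("http://localhost:1234/")
			if err != nil {
				log.Println(err)
				return
			}
			resp.Body.Close()
		}()
	}
	wg.Wait()
	// ~5s total if requests are served concurrently, ~50s if sequentially
	fmt.Println("10 requests took", time.Since(start))
}
```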
From server.go in the net/http package, the goroutine is spawned in the Serve function when a connection is accepted. Below is the snippet:
// Serve accepts incoming connections on the Listener l, creating a
// new service goroutine for each. The service goroutines read requests and
// then call srv.Handler to reply to them.
func (srv *Server) Serve(l net.Listener) error {
	for {
		rw, e := l.Accept()
		if e != nil {
			// ...
		}
		c, err := srv.newConn(rw)
		if err != nil {
			continue
		}
		c.setState(c.rwc, StateNew) // before Serve can return
		go c.serve()
	}
}
If you use XHR requests, make sure the xhr instance is a local variable.
For example, xhr = new XMLHttpRequest() creates a global variable. When you make parallel requests with the same xhr variable, you receive only one result. So you must declare xhr locally, like this: var xhr = new XMLHttpRequest().
