GoLang net/http memory keeps increasing on continuous requests - go

I have the following code in GoLang
package main
import (
"bytes"
"encoding/json"
"io/ioutil"
"log"
"net/http"
"time"
)
func httpClient() *http.Client {
var transport http.RoundTripper = &http.Transport{
DisableKeepAlives: false,
}
client := &http.Client{Timeout: 60 * time.Second, Transport: transport}
return client
}
func sendRequest(client *http.Client, method string) []byte {
endpoint := "https://httpbin.org/post"
values := map[string]string{"foo": "baz"}
jsonData, err := json.Marshal(values)
req, err := http.NewRequest(method, endpoint, bytes.NewBuffer(jsonData))
if err != nil {
log.Fatalf("Error Occurred. %+v", err)
}
resp, err := client.Do(req)
if err != nil {
// resp is nil when client.Do returns an error, so there is no body to close here
log.Fatalf("Error sending request to API endpoint. %+v", err)
}
// Close the connection to reuse it
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Fatalf("Couldn't parse response body. %+v", err)
}
return body
}
func main() {
// c should be re-used for further calls
c := httpClient()
for i := 1; i <= 60; i++ {
response := sendRequest(c, http.MethodPost)
log.Println("Response Body:", string(response))
response = nil
time.Sleep(time.Millisecond * 1000)
}
}
When executed, the memory usage keeps increasing, and the growth reaches as much as 90 MB in one hour. Is the GC not working properly? Even though I am using the same http.Client for multiple requests, something still seems to be increasing the memory footprint.

I advise you to use tools like pprof, which are very useful for troubleshooting precisely this kind of issue.
You have set the DisableKeepAlives field to false, which means the transport keeps connections open after the requests have been made so that they can be reused; those idle connections and their buffers stay in memory until they are closed. You should also make sure resp.Body.Close() is always called once client.Do has succeeded, which is what the deferred call is for: a body that is never closed keeps its connection from being reused or released. GC does not mean absolute memory safety.
Also, avoid using log.Fatal outside of main. Return the error, call plain panic, or use a leveled logger such as zap or zerolog instead: log.Fatal calls os.Exit(1) immediately, which means your deferred calls never run. See Should a Go package ever use log.Fatal and when?

Related

How can I get a size of request.Header in bytes in Golang?

I need to find the size of request.Header where request has *http.Request type:
req, err := http.NewRequest("GET", "/", nil)
cookie := &http.Cookie{Name: "foo", Value: "bar"}
req.AddCookie(cookie)
I tried
len(request.Header) // returned the number of elements in the map, essentially the number of headers
and
for k, v := range req.Header {
bytesSize += len(k) + len(v)
}
that didn't work either, since v is a slice of strings, so len(v) counts values rather than bytes.
I found Computing the memory footprint (or byte length) of a map question but the answer seems pretty complicated (and their map values are integers which is not the case here).
Update: actually here's the definition of type Header map[string][]string so we don't have to use recursion.
https://pkg.go.dev/net/http#Server.MaxHeaderBytes can handle this for you.
This demo doesn't work reliably in the Playground (dial or connect timeouts). It seems to work reliably locally though, which makes me guess it's an artifact of the Playground's behavior.
We'll start an HTTP server with a low MaxHeaderBytes and then surpass it greatly.
package main
import (
"context"
"fmt"
"io"
"net"
"net/http"
"strings"
"time"
)
func main() {
res := make(chan error)
mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "%+v", r.Header)
})
s := &http.Server{
Addr: "127.0.0.1:8103",
Handler: mux,
ReadTimeout: 1 * time.Second,
WriteTimeout: 1 * time.Second,
MaxHeaderBytes: 2048,
}
if l, err := net.Listen("tcp", "127.0.0.1:8103"); err != nil {
panic(fmt.Errorf("Couldn't listen: %w", err))
} else {
go func() {
res <- s.Serve(l)
}()
}
client := &http.Client{
Timeout: 3 * time.Second,
}
req, err := http.NewRequest("GET", "http://127.0.0.1:8103", nil)
if err != nil {
panic(err)
}
req.Header.Add("X-Long-Header", strings.Repeat("long ", 2048)+"header")
resp, err := client.Do(req)
if err != nil {
panic(fmt.Errorf("HTTP Request failed: %w", err))
}
fmt.Println(resp)
body, err := io.ReadAll(resp.Body)
if err != nil {
panic(fmt.Errorf("Could not read response body: %w", err))
}
fmt.Println("Body:", string(body))
s.Shutdown(context.Background())
<-res
}
Here, I'm setting MaxHeaderBytes to a fairly small value. I am passing far more than that value in my X-Long-Header: long long long .... header. If you can get the playground to work (just run it a few times) or run it locally, you'll get:
&{431 Request Header Fields Too Large 431 HTTP/1.1 1 1 map[Content-Type:[text/plain; charset=utf-8]] 0xc00001a180 -1 [] true false map[] 0xc000176000 <nil>}
Body: 431 Request Header Fields Too Large
As you can see, the 431 is generated automatically if the headers as a whole are too large.
It might be appropriate for your handler itself to respond with a 431 if particular headers were too long, but by the time your handler has been passed an http.Request, the headers have been received. It doesn't make sense to try to compute the total length of the headers yourself and then respond with a 431 based on that.
Besides, standard headers may come and go, so it would be unwise to restrict overall header size too closely.
Instead, check whatever individual headers you're concerned about.

Trouble figuring out data race in goroutine

I started learning Go recently and I've been chipping away at this for a while now, but figured it was time to ask for some specific help. My program requests paginated data from an API, and there are about 160 pages of data. That seems like a good use of goroutines, except I have race conditions and I can't figure out why. It's probably because I'm new to the language, but my impression was that parameters for a function are passed as a copy of the data in the function calling it unless it's a pointer.
According to what I think I know, this should be making copies of my data, which leaves me free to change it in the main function, but I end up requesting some pages multiple times and other pages just once.
My main.go
package main
import (
"bufio"
"encoding/json"
"log"
"net/http"
"net/url"
"os"
"strconv"
"sync"
"github.com/joho/godotenv"
)
func main() {
err := godotenv.Load()
if err != nil {
log.Fatalln(err)
}
httpClient := &http.Client{}
baseURL := "https://api.data.gov/ed/collegescorecard/v1/schools.json"
filters := make(map[string]string)
page := 0
filters["school.degrees_awarded.predominant"] = "2,3"
filters["fields"] = "id,school.name,school.city,2018.student.size,2017.student.size,2017.earnings.3_yrs_after_completion.overall_count_over_poverty_line,2016.repayment.3_yr_repayment.overall"
filters["api_key"] = os.Getenv("API_KEY")
outFile, err := os.Create("./out.txt")
if err != nil {
log.Fatalln(err)
}
writer := bufio.NewWriter(outFile)
requestURL := getRequestURL(baseURL, filters)
response := requestData(requestURL, httpClient)
wg := sync.WaitGroup{}
for (page+1)*response.Metadata.ResultsPerPage < response.Metadata.TotalResults {
page++
filters["page"] = strconv.Itoa(page)
wg.Add(1)
go func() {
defer wg.Done()
requestURL := getRequestURL(baseURL, filters)
response := requestData(requestURL, httpClient)
_, err = writer.WriteString(response.TextOutput())
if err != nil {
log.Fatalln(err)
}
}()
}
wg.Wait()
}
func getRequestURL(baseURL string, filters map[string]string) *url.URL {
requestURL, err := url.Parse(baseURL)
if err != nil {
log.Fatalln(err)
}
query := requestURL.Query()
for key, value := range filters {
query.Set(key, value)
}
requestURL.RawQuery = query.Encode()
return requestURL
}
func requestData(url *url.URL, httpClient *http.Client) CollegeScoreCardResponseDTO {
request, _ := http.NewRequest(http.MethodGet, url.String(), nil)
resp, err := httpClient.Do(request)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
var parsedResponse CollegeScoreCardResponseDTO
err = json.NewDecoder(resp.Body).Decode(&parsedResponse)
if err != nil {
log.Fatalln(err)
}
return parsedResponse
}
I know another issue I will be running into is writing to the output file in the correct order, but I believe using channels to tell each routine what request finished writing could solve that. If I'm incorrect on that I would appreciate any advice on how to approach that as well.
Thanks in advance.
Goroutines do not receive copies of data. The function literal is a closure, so it refers to the same variables as the enclosing function; when the compiler detects that a variable "escapes" the current function, it allocates that variable on the heap. In this case, filters is one such variable. When the goroutine starts, the filters it accesses is the same map as in the main goroutine. Since you keep modifying filters in the main goroutine without locking, there is no guarantee of what the goroutine sees.
I suggest you keep filters read-only, create a new map in the goroutine by copying all items from the filters, and add the "page" in the goroutine. You have to be careful to pass a copy of the page as well:
go func(page int) {
flt:=make(map[string]string)
for k,v:=range filters {
flt[k]=v
}
flt["page"]=strconv.Itoa(page)
...
} (page)

ioutil.ReadAll leads to goroutine leak

Why do I have more than one goroutine before termination, even though I closed resp.Body and only used blocking calls? If I do not consume resp.Body, it terminates with only one goroutine.
package main
import (
"fmt"
"io/ioutil"
"net/http"
"runtime"
"time"
)
func fetch() {
client := http.Client{Timeout: time.Second * 10}
url := "http://example.com"
req, err := http.NewRequest("POST", url, nil)
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
_, err = ioutil.ReadAll(resp.Body)
if err != nil {
panic(err)
}
}
func main() {
fmt.Println("#Goroutines:", runtime.NumGoroutine())
fetch()
// runtime.GC()
time.Sleep(time.Second * 5)
fmt.Println("#Goroutines:", runtime.NumGoroutine())
}
Outputs:
#Goroutines: 1
#Goroutines: 3
The default http transport maintains a connection pool.
DefaultTransport is the default implementation of Transport and is used by DefaultClient. It establishes network connections as needed and caches them for reuse by subsequent calls.
Each connection is managed by at least one goroutine. This is not a leak though, you are just impatient. If you wait long enough, you will see that the connection is closed eventually and the goroutine goes away. The default idle timeout is 90 seconds.
If you want to close connections asap, set either of http.Request.Close or http.Transport.DisableKeepAlives to true.

Why does fasthttp behave like a single process?

requestHandler := func(ctx *fasthttp.RequestCtx) {
time.Sleep(time.Second*time.Duration(10))
fmt.Fprintf(ctx, "Hello, world! Requested path is %q", ctx.Path())
}
s := &fasthttp.Server{
Handler: requestHandler, // trailing comma is required in a multi-line composite literal
}
if err := s.ListenAndServe("127.0.0.1:82"); err != nil {
log.Fatalf("error in ListenAndServe: %s", err)
}
When I send multiple requests, the total time is about X*10s.
Is fasthttp single process?
After two days...
I am sorry for this question; I did not describe my problem well. The problem was caused by the browser: it sends requests for the same URL one after another, which misled me into thinking that the fasthttp web server handles requests synchronously.
I think that instead of "fasthttp is single process?", you're asking whether fasthttp handles client requests concurrently or not.
I'm pretty sure that any server package (including fasthttp) will handle client requests concurrently. You should write a test/benchmark instead of manually accessing the server through several browsers. The following is an example of such test code:
package main_test
import (
"io/ioutil"
"net/http"
"sync"
"testing"
"time"
)
func doRequest(uri string) error {
resp, err := http.Get(uri)
if err != nil {
return err
}
defer resp.Body.Close()
_, err = ioutil.ReadAll(resp.Body)
if err != nil {
return err
}
return nil
}
func TestGet(t *testing.T) {
N := 1000
wg := sync.WaitGroup{}
wg.Add(N)
start := time.Now()
for i := 0; i < N; i++ {
go func() {
if err := doRequest("http://127.0.0.1:82"); err != nil {
t.Error(err)
}
wg.Done()
}()
}
wg.Wait()
t.Logf("Total duration for %d concurrent request(s) is %v", N, time.Since(start))
}
And the result (on my computer) is
fasthttp_test.go:42: Total duration for 1000 concurrent request(s) is 10.6066411s
You can see that the answer to your question is No, it handles the request concurrently.
UPDATE:
In case the requested URL is the same, your browser may perform the request sequentially. See Multiple Ajax requests for same URL. This explains why the response times are X*10s.

Go Error handling on a REST API

I am very new to go and have deployed a small service with an API endpoint.
I have heard/read that go doesn't use try/catch so I am trying to figure out how I can "catch" any problems happening from my service call from my API and make sure that the resource server doesn't go down.
My code for my API looks like the following..
I have a routes.go file with the following
package main
import (
"net/http"
"github.com/gorilla/mux"
)
type Route struct {
Name string
Method string
Pattern string
HandlerFunc http.HandlerFunc
}
type Routes []Route
func NewRouter() *mux.Router {
router := mux.NewRouter().StrictSlash(true)
for _, route := range routes {
router.
Methods(route.Method).
Path(route.Pattern).
Name(route.Name).
Handler(route.HandlerFunc)
}
return router
}
var routes = Routes{
Route{
"CustomerLocationCreate",
"POST",
"/tracking/customer",
CustomerLocationCreate,
},
}
I have a handlers.go
package main
import (
"encoding/json"
"net/http"
"io"
"io/ioutil"
)
//curl -H "Content-Type: application/json" -d '{"userId":"1234"}' http://localhost:8181/tracking/customer
func CustomerLocationCreate(w http.ResponseWriter, r *http.Request) {
var location CustomerLocation
body, err := ioutil.ReadAll(io.LimitReader(r.Body, 1048576))
if err != nil {
panic(err)
}
if err := r.Body.Close(); err != nil {
panic(err)
}
if err := json.Unmarshal(body, &location); err != nil {
w.Header().Set("Content-Type", "application/json; charset=UTF-8")
w.WriteHeader(422) // unprocessable entity
if err := json.NewEncoder(w).Encode(err); err != nil {
panic(err)
}
return // stop here: the request could not be decoded
}
c := RepoCreateCustomerLocation(location)
w.Header().Set("Content-Type", "application/json; charset=UTF-8")
w.WriteHeader(http.StatusCreated)
if err := json.NewEncoder(w).Encode(c); err != nil {
panic(err)
}
HandleCustomerLocationChange(c);
}
and I have a bus.go which has the HandleCustomerLocationChange(...) function.
func HandleCustomerLocationChange(custLoc CustomerLocation) {
endpoint := os.Getenv("RABBIT_ENDPOINT")
conn, err := amqp.Dial("amqp://guest:guest@" + endpoint)
failOnError(err, "Failed to connect to RabbitMQ")
defer conn.Close()
ch, err := conn.Channel()
failOnError(err, "Failed to open a channel")
defer ch.Close()
topic := "locationChange"
err = ch.ExchangeDeclare(
topic, // name
"topic", // type
true, // durable
false, // auto-deleted
false, // internal
false, // no-wait
nil, // arguments
)
failOnError(err, "Failed to declare an exchange")
// Create JSON from the instance data.
body, _ := json.Marshal(custLoc)
// Convert bytes to string.
err = ch.Publish(
topic, // exchange
"", // routing key
false, // mandatory
false, // immediate
amqp.Publishing{
ContentType: "text/plain",
Body: body,
})
failOnError(err, "Failed to publish a message")
log.Printf(" [x] Sent %s", body)
}
My question is: how should I modify both the HandleCustomerLocationChange(...) function and, if necessary, the CustomerLocationCreate(...) handler to handle errors properly, so that if an error occurs my entire API doesn't go down?
Go suggests a different approach, that errors are not exceptional, they're normal events, just less common.
Taking an example from the code above:
body, err := ioutil.ReadAll(io.LimitReader(r.Body, 1048576))
if err != nil {
panic(err)
}
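Here, a panic (without recovery) terminates the process, shutting down the web server. net/http does recover panics raised inside a handler, so the server keeps running in that case, but the request is aborted and the client gets no useful response. Either way, it seems an overly severe response to not fully reading a request.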
What do you want to do? It may be appropriate to tell the client who made the request:
body, err := ioutil.ReadAll(io.LimitReader(r.Body, 1048576))
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
You might want to return a JSON-encoded response, or give a generic message to the client to avoid exposing too much, and log the specific error details.
For general functions it's idiomatic to return the error as the last return parameter. In the specific example you mentioned:
func HandleCustomerLocationChange(custLoc CustomerLocation)
...
conn, err := amqp.Dial(...)
failOnError(err, "Failed to connect to RabbitMQ")
Instead, check if the connection failed, and return the error to the caller. Handle it in the calling function, or add information and propagate it up the call stack.
func HandleCustomerLocationChange(custLoc CustomerLocation) error
...
conn, err := amqp.Dial(...)
if err != nil {
return fmt.Errorf("failed to connect to RabbitMQ: %s", err)
}
Propagating the error in this way gives a concise explanation of the root cause, like the 5 whys technique, eg:
"did not update client location: did not connect to rabbitmq: network address 1.2.3 unreachable"
Another convention is to deal with errors first and return early. This helps to reduce nesting.
See also the many error handling resources, like error handling in a web application, Go by Example, Error Handling and Go, errors are values and Defer, Panic & Recover. The source code of the error package is interesting, as is Russ Cox's comment on error handling, and Nathan Youngman's To Err is Human.
Also interesting is Upspin's concept of an operational trace, rather than a stack trace.

Resources