What is Go's http.Request Body in terms of computer science?

In the http.Request type, Body is closed once the request has been sent by the client. Why does it need to be closed? Why can't it be a string, which you can read over and over?

This is called a stream. It's useful because it lets you process data without having the whole data set in memory. It also lets you produce results sooner: you don't wait for the whole set to arrive before you start computing.
As soon as you want to handle big data or care about performance, you need streams.
It's also a convenient abstraction that lets you handle data piece by piece, even when the whole set is available, without having to maintain an offset to iterate over it.
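As a minimal sketch of what that buys you (my example, not from the question), here is a handler that hashes the request body in fixed-size chunks, so memory use stays constant no matter how large the body is:

// assumes imports: crypto/sha256, fmt, io, net/http
func hashHandler(w http.ResponseWriter, r *http.Request) {
    h := sha256.New()
    buf := make([]byte, 32*1024) // work on 32 KB at a time
    for {
        n, err := r.Body.Read(buf)
        if n > 0 {
            h.Write(buf[:n]) // process this chunk immediately
        }
        if err == io.EOF {
            break
        }
        if err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
    }
    fmt.Fprintf(w, "%x\n", h.Sum(nil))
}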

You can store the request stream as a string using the bytes and io packages:
func handler(w http.ResponseWriter, r *http.Request) {
    b := new(bytes.Buffer)
    if _, err := io.Copy(b, r.Body); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    bodyAsString := b.String()
    fmt.Fprint(w, bodyAsString) // bodyAsString can now be read as many times as you like
}

Related

Logging and decoding response body from golang net/http library

I'm writing a webhook in Go that parses a JSON payload. I'm attempting to log the raw payload and then decode it immediately after, but that fails. If I perform the actions separately, they both work fine independently.
Can someone explain why I can't use ioutil.ReadAll and json.NewDecoder together?
func webhook(w http.ResponseWriter, r *http.Request) {
    body, _ := ioutil.ReadAll(r.Body)
    log.Printf("incoming message - %s", body)
    var p payload
    decoder := json.NewDecoder(r.Body)
    err := decoder.Decode(&p)
    if err != nil {
        // Returns EOF
        log.Printf("invalid payload - %s", err)
    }
    defer r.Body.Close()
}
Can someone explain why I can't use ioutil.ReadAll and json.NewDecoder together?
The request body is an io.ReadCloser that reads bytes, more or less, directly from a network connection. The contents of the Body aren't stored in memory by default. That's why, after you've read the Body once, the next read will return EOF.
So if you need to process the request Body more than once, you yourself will have to store the contents into memory, which is what you are already doing with:
body, _ := ioutil.ReadAll(r.Body)
You can then reuse body as many times as you like, and since you have the Body contents at your disposal as a []byte value, you can use json.Unmarshal instead of json.NewDecoder(...).Decode.
This is unrelated to your question, but please do not ignore the error returned from ioutil.ReadAll.
Also you can drop the defer r.Body.Close() line, because you do not have to close the request body in your server handlers. (emphasis mine)
For server requests the Request Body is always non-nil but will return
EOF immediately when no body is present. The Server will close the
request body. The ServeHTTP Handler does not need to.
r.Body is meant to be read exactly once.
When you use the ioutil.ReadAll function you read all the data from the body. That's why the decoder, which also reads from r.Body, gets nothing to decode.
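A minimal sketch of the handler rewritten that way (assuming the payload type from the question), reading the body exactly once and reusing the bytes:

func webhook(w http.ResponseWriter, r *http.Request) {
    body, err := ioutil.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "cannot read body", http.StatusBadRequest)
        return
    }
    log.Printf("incoming message - %s", body)

    var p payload
    if err := json.Unmarshal(body, &p); err != nil {
        log.Printf("invalid payload - %s", err)
        http.Error(w, "invalid payload", http.StatusBadRequest)
        return
    }
    // use p ...
}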
Minor additional point about json.Decoder and json.Unmarshal: at first glance it looks like the only difference between the two is just that the former operates on a stream and the latter on a []byte, but they actually have different semantics.
json.Unmarshal will return an error if the data contains more than one json object. So, for example, it will parse {}, but it will not parse {}{}.
json.Decoder parses one complete object per call to Decode, so if you give it {}{}, it will parse those two objects and then the third call will return io.EOF, and its More method will return false.
In a normal http body, you probably only want a single object, so you'd want to use Unmarshal if you're not worried about loading all the data into memory at once. You can also use Decoder and manually check that there is only one object if you care to do so.
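Here is a quick standalone sketch of that difference (my example, not from the answer):

// assumes imports: bytes, encoding/json, fmt
func main() {
    data := []byte(`{}{}`)

    var v interface{}
    fmt.Println(json.Unmarshal(data, &v)) // error: invalid character '{' after top-level value

    dec := json.NewDecoder(bytes.NewReader(data))
    fmt.Println(dec.Decode(&v)) // <nil> (first object)
    fmt.Println(dec.Decode(&v)) // <nil> (second object)
    fmt.Println(dec.Decode(&v)) // EOF
    fmt.Println(dec.More())     // false
}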

Extending Golang's http.Resp.Body to handle large files

I have a client application which reads the full body of an http response into a buffer and performs some processing on it:
body, _ = ioutil.ReadAll(containerObject.Resp.Body)
The problem is that this application runs on an embedded device, so responses that are too large fill up the device RAM, causing Ubuntu to kill the process.
To avoid this, I check the content-length header and bypass processing if the document is too large. However, some servers (I'm looking at you, Microsoft) send very large html responses without setting content-length and crash the device.
The only way I can see of getting around this is to read the response body up to a certain length. If it reaches this limit, then a new reader could be created which first streams the in-memory buffer, then continues reading from the original Resp.Body. Ideally, I would assign this new reader to the containerObject.Resp.Body so that callers would not know the difference.
I'm new to Go and am not sure how to go about coding this. Any suggestions or alternative solutions would be greatly appreciated.
Edit 1: The caller expects a Resp.Body object, so the solution needs to be compatible with that interface.
Edit 2: I cannot parse small chunks of the document. Either the entire document is processed or it is passed unchanged to the caller, without loading it into memory.
If you need to read part of the response body, then reconstruct it in place for other callers, you can use a combination of an io.MultiReader and ioutil.NopCloser
resp, err := http.Get("http://google.com")
if err != nil {
    return err
}
defer resp.Body.Close()

part, err := ioutil.ReadAll(io.LimitReader(resp.Body, maxReadSize))
if err != nil {
    return err
}

// do something with part

// recombine the buffered part of the body with the rest of the stream
resp.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), resp.Body))

// do something with the full Response.Body as an io.Reader
If you can't defer resp.Body.Close() because you intend to return the response before it's read in its entirety, you will need to augment the replacement body so that the Close() method applies to the original body. Rather than using ioutil.NopCloser as the io.ReadCloser, create your own type that forwards Close to the original body.
type readCloser struct {
    io.Closer
    io.Reader
}

resp.Body = readCloser{
    Closer: resp.Body,
    Reader: io.MultiReader(bytes.NewReader(part), resp.Body),
}
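To decide between the two cases in the question (process the whole document, or pass it through untouched), one possible sketch is to check whether the limited read hit the cap; maxReadSize, containerObject and process are placeholders taken from or assumed around the question:

if int64(len(part)) < maxReadSize {
    // the whole body fit within the limit: safe to process it in memory
    process(part)
} else {
    // too large to buffer (note: a body of exactly maxReadSize bytes also
    // lands here): hand the recombined stream back to the caller unchanged
    containerObject.Resp.Body = readCloser{
        Closer: resp.Body,
        Reader: io.MultiReader(bytes.NewReader(part), resp.Body),
    }
}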

Golang high cpu usage on simple webserver unable to understand why?

So I have a simple net/http webserver. All it does is deliver 100MB of random bytes, which I intend to use for network speed testing. My handler for the 100MB endpoint is really simple (pasted below). The code works fine and I get my random byte file, but when someone downloads those 100 megabytes, the CPU for this program shoots up to 150% and stays there until the handler finishes running. Am I doing something very wrong here? What could I do to improve this handler's performance?
func downloadHandler(w http.ResponseWriter, r *http.Request) {
    str := RandStringBytes(8192) // generates 8192 bytes of randomness
    sz := 1000 * 1000 * 100      // 100 megabytes
    iter := sz/len(str) + 1
    w.Header().Set("Content-Type", "application/octet-stream")
    w.Header().Set("Content-Length", strconv.Itoa(sz))
    for i := 0; i < iter; i++ {
        fmt.Fprintf(w, str)
    }
}
The problem is that fmt.Fprintf() expects a format string:
func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error)
And you pass it a big, 8 KB format string. The fmt package has to analyze the format string; it is not something that gets copied to the output as-is. Most definitely this is what is eating your CPU.
If the random string contains % signs, that makes your case even worse, as fmt.Fprintf() then expects further arguments which you don't supply, so the fmt package also includes error messages in the output, such as:
fmt.Fprintf(os.Stdout, "aaa%bbb%d")
Output:
aaa%!b(MISSING)bb%!d(MISSING)
Use fmt.Fprint() instead which does not expect a format string:
fmt.Fprint(w, str)
Or even better, convert your random string to a byte slice once, and just keep writing that:
data := []byte(str)
for i := 0; i < iter; i++ {
    if _, err := w.Write(data); err != nil {
        // Handle error, e.g. return
    }
}
When delivering a large amount of data, you won't get a faster solution than writing a prepared byte slice in a loop (perhaps marginally faster if you vary the size of the slice). If your solution is still "slow", that might be due to your RandStringBytes() function, which we don't know anything about, or the output might be compressed (gzipped) by other handlers or some framework you use, which costs relatively much CPU. Also, if the client that receives the response is on the same computer (e.g. a browser), it, or a firewall / antivirus program, may analyze the response for malicious code, which may also be resource intensive.
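Putting those pieces together, a minimal sketch of the corrected handler (RandStringBytes is assumed from the question; trimming the last chunk so exactly Content-Length bytes are written is an added detail, not discussed above):

func downloadHandler(w http.ResponseWriter, r *http.Request) {
    data := []byte(RandStringBytes(8192)) // 8192 random bytes, as in the question
    const sz = 1000 * 1000 * 100          // 100 megabytes

    w.Header().Set("Content-Type", "application/octet-stream")
    w.Header().Set("Content-Length", strconv.Itoa(sz))

    for remaining := sz; remaining > 0; {
        chunk := data
        if remaining < len(chunk) {
            chunk = chunk[:remaining] // don't exceed the declared Content-Length
        }
        if _, err := w.Write(chunk); err != nil {
            return // client went away; stop writing
        }
        remaining -= len(chunk)
    }
}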

How to perform html minify with Revel framework

My first idea was to get the response body within the filter, then use one of the minify libraries like tdewolff/minify and write the result to the response, but I can't find a way to get the response body.
Are there any better solutions?
It seems, by looking at the docs, that a filter can access the Controller type, which contains the Response. This response contains Out which is a ResponseWriter (and thus also an io.Writer). We need to replace only the Write method to redirect the write to the minifier, which then writes to the response writer. We need to use io.Pipe and a goroutine for this.
type MinifyResponseWriter struct {
    http.ResponseWriter
    io.Writer
}

func (f MinifyResponseWriter) Write(b []byte) (int, error) {
    return f.Writer.Write(b)
}

func MinifyFilter(c *Controller, fc []Filter) {
    pr, pw := io.Pipe()
    go func(w io.Writer) {
        m := minify.New()
        m.AddFunc("text/css", css.Minify)
        m.AddFunc("text/html", html.Minify)
        m.AddFunc("text/javascript", js.Minify)
        m.AddFunc("image/svg+xml", svg.Minify)
        m.AddFuncRegexp(regexp.MustCompile("[/+]json$"), json.Minify)
        m.AddFuncRegexp(regexp.MustCompile("[/+]xml$"), xml.Minify)
        if err := m.Minify("mimetype", w, pr); err != nil {
            panic(err)
        }
    }(c.Response.Out)
    c.Response.Out = MinifyResponseWriter{c.Response.Out, pw}
}
Something along those lines (not tested). Here we take the existing io.Writer (which is part of the ResponseWriter) and wrap a struct around it. The struct keeps the original methods of the response writer, but the Write method is overridden and replaced by the PipeWriter. This means that any write to the new response writer goes to the PipeWriter, which is coupled to the PipeReader. The minifier reads from that reader and writes to the original response writer.
Because we change the value of c.Response.Out, we need to pass the original writer explicitly to the goroutine. Make sure you obtain the correct mimetype (through the extension?) or call the appropriate minify function directly.
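One more thing to keep in mind (my assumption about Revel's filter convention, not part of the original answer): a filter normally has to invoke the next filter in the chain, and the pipe writer should be closed once rendering is done so the minify goroutine sees EOF. Roughly, the end of MinifyFilter might look like:

// at the end of MinifyFilter, after c.Response.Out has been replaced
fc[0](c, fc[1:]) // run the rest of the filter chain so the action renders through the pipe
pw.Close()       // signal EOF so the minify goroutine can finish writing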

Is it advisable to (further) limit the size of forms when using golang?

I searched around and as far as I can tell, POST form requests are already limited to 10MB (http://golang.org/src/net/http/request.go#L721).
If I were to go about reducing this in my ServeHTTP method, I'm not sure how to properly do it. I would try something like this:
r.Body = http.MaxBytesReader(w, r.Body, MaxFileSize)
err := r.ParseForm()
if err != nil {
//redirect to some error page
return
}
But would returning upon error close the connection as well? How would I prevent having to read everything? I found this: https://stackoverflow.com/a/26393261/2202497, but what if the content length is not set, and in the middle of reading I realize that the file is too big?
I'm using this as a security measure to prevent someone from hogging my server's resources.
The correct way to limit the size of the request body is to do as you suggested:
r.Body = http.MaxBytesReader(w, r.Body, MaxFileSize)
err := r.ParseForm()
if err != nil {
    // redirect or set error status code.
    return
}
MaxBytesReader sets a flag on the response when the limit is reached. When this flag is set, the server does not read the remainder of the request body and the server closes the connection on return from the handler.
If you are concerned about malicious clients, then you should also set Server.ReadTimeout, Server.WriteTimeout and possibly Server.MaxHeaderBytes.
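For example, a minimal sketch (the concrete values are placeholders, and mux stands for whatever root handler you use):

s := &http.Server{
    Addr:           ":8080",
    Handler:        mux,
    ReadTimeout:    10 * time.Second,
    WriteTimeout:   10 * time.Second,
    MaxHeaderBytes: 1 << 16, // 64 KB of headers is plenty for most handlers
}
log.Fatal(s.ListenAndServe())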
If you want to set the request body limit for all of your handlers, then wrap root handler with a handler that sets the limit before delegating to the root handler:
type maxBytesHandler struct {
    h http.Handler
    n int64
}

func (h *maxBytesHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    r.Body = http.MaxBytesReader(w, r.Body, h.n)
    h.h.ServeHTTP(w, r)
}
Wrap the root handler when calling ListenAndServe:
log.Fatal(http.ListenAndServe(":8080", &maxBytesHandler{h: mux, n: 4096}))
or when configuring a server:
s := http.Server{
    Addr:    ":8080",
    Handler: &maxBytesHandler{h: mux, n: 4096},
}
log.Fatal(s.ListenAndServe())
There's no need for a patch as suggested in another answer. MaxBytesReader is the official way to limit the size of the request body.
Edit: As others have said, MaxBytesReader is the supported way. It is interesting that, by default, ParseForm instead wraps the body in an io.LimitReader, after type-asserting that it isn't already a MaxBytesReader.
Submit a patch to the Go source code and make it configurable! You are working with an open source project, after all. Adding a setter to http.Request and some unit tests for it is probably only 20 minutes' worth of work. Having a hardcoded value here is a bit clunky; give back and fix it :).
You can of course implement your own ParseForm(r *http.Request) function if you really need to override this. Go is essentially BSD-licensed, so you can copy-paste the library's ParseForm and change the limit, but that's a bit ugly, no?
