I'm letting users upload a file using FormFile. At what point should I check whether the file size is too large? When I do
file, header, fileErr := r.FormFile("file")
A file object is already created. So have I incurred the cost of reading in the entire file already?
https://golang.org/pkg/net/http#Request.FormFile
Use http.MaxBytesReader to limit the number of bytes read from the request. Before calling ParseMultipartForm or FormFile, execute this line:
r.Body = http.MaxBytesReader(w, r.Body, max)
where r is the *http.Request and w is the http.ResponseWriter.
MaxBytesReader limits the bytes read for the entire request body and not an individual file. A limit on the request body size can be a good approximation of a limit on the file size when there's only one file upload. If you need to enforce a specific limit for one or more files, then set the MaxBytesReader limit large enough for all expected request data and check FileHeader.Size for each file.
When the http.MaxBytesReader limit is breached, the server stops reading from the request and closes the connection after the handler returns.
If you want to limit the amount of memory used instead of the request body size, then call r.ParseMultipartForm(maxMemory) before calling r.FormFile(). This will use up to maxMemory bytes for file parts, with the remainder stored in temporary files on disk. This call does not limit the total number of bytes read from the client or the size of an uploaded file.
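Putting the previous paragraphs together, a minimal sketch of the flow (the "file" field name and the 10 MB / 32 MB limits are arbitrary choices for this example):
func uploadHandler(w http.ResponseWriter, r *http.Request) {
    // Cap the whole request body; 32 MB here is an arbitrary ceiling
    // chosen to cover all expected form data.
    r.Body = http.MaxBytesReader(w, r.Body, 32<<20)

    // Keep at most 10 MB of file parts in memory; the remainder
    // spills to temporary files on disk.
    if err := r.ParseMultipartForm(10 << 20); err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }

    file, header, err := r.FormFile("file")
    if err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }
    defer file.Close()

    // Enforce the per-file limit using the parsed size.
    if header.Size > 10<<20 {
        http.Error(w, "file too large", http.StatusRequestEntityTooLarge)
        return
    }
}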
Checking the request Content-Length header does not work for two reasons:
The content length is not set for chunked request bodies.
The server may read the entire request body to support connection keep-alive. Breaching the MaxBytesReader limit is the only way to ensure that the server stops reading the request body.
Some people suggest relying on the Content-Length header, and I have to warn you not to use it at all. This header can be set to any number by the client, regardless of the actual file size.
Use MaxBytesReader because:
MaxBytesReader prevents clients from accidentally or maliciously sending a large request and wasting server resources.
Here is an example:
r.Body = http.MaxBytesReader(w, r.Body, 2 * 1024 * 1024) // 2 MB
clientFile, handler, err := r.FormFile(formDataKey)
if err != nil {
log.Println(err)
return
}
If your request body is bigger than 2 MB, you will see something like this: multipart: NextPart: http: request body too large
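If you want to turn that failure into a proper HTTP response rather than just logging it, one option (a sketch assuming Go 1.19 or later, which added the http.MaxBytesError type, plus the errors and log imports) is:
clientFile, handler, err := r.FormFile(formDataKey)
if err != nil {
    var maxErr *http.MaxBytesError
    if errors.As(err, &maxErr) {
        // The MaxBytesReader limit was breached.
        http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
        return
    }
    log.Println(err)
    http.Error(w, "bad request", http.StatusBadRequest)
    return
}
defer clientFile.Close()
log.Printf("got %s (%d bytes)", handler.Filename, handler.Size)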
Calling FormFile calls ParseMultipartForm, which will parse the entire request body, using up to 32 MB of memory by default before storing the contents in temporary files. You can call ParseMultipartForm yourself before calling FormFile to control how much memory to consume, but the body will still be parsed.
The client may provide a Content-Length header in the multipart.FileHeader, which you could use, but that is dependent on the client.
If you want to limit the incoming request size, wrap the request.Body with MaxBytesReader in your handler before parsing any of the Body.
The request struct has an r.ContentLength int64 field, and you can also read the raw header via r.Header.Get("Content-Length"). Maybe that can help.
func (handler Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
var content string
...
w.Write([]byte(content))
}
If len(content) <= 2048, the Content-Length header is added to the response automatically. And if it's over 2048, there is no Content-Length, and Transfer-Encoding: chunked is added instead.
I can't find where the 2048 is determined; I'm asking for help finding the place in the source code where this threshold is defined.
Let's look at the documentation of this feature in the http.ResponseWriter interface just for clarity:
[I]f the total size of all written data is under a few KB and there are no Flush calls, the Content-Length header is added automatically.
First, we can see that the number may not be exactly 2048 (2 KB), but that is in the range we would expect for "a few KB". Second, we can see that this behaviour has something to do with the Flush method, which is documented in the Flusher interface:
Flush sends any buffered data to the client.
The Flusher interface is implemented by ResponseWriters that allow an HTTP handler to flush buffered data to the client.
The default HTTP/1.x and HTTP/2 ResponseWriter implementations support Flusher, but ResponseWriter wrappers may not. Handlers should always test for this ability at runtime.
As it says, your ResponseWriter may support data buffering and flushing. What this means is that when you write data to the response writer, it does not immediately get transmitted over the connection. Instead, it is first written into a buffer. Whenever the buffer fills up, and again when the ServeHTTP method returns, the buffered data gets transmitted. This ensures that data gets transmitted efficiently even when you do lots of tiny writes, and that all data gets transmitted in the end. You also have the option of proactively emptying the buffer at any time with the Flush method. The HTTP headers must be sent before the body data, but there's no need to send them until the first time the buffer is emptied.
Putting all this together, you'll see that if the total amount written is no more than the buffer size, and we never call Flush, then the headers do not need to be sent until all the data is ready, at which point we know the content length. If the total amount written is more than the buffer size, then the headers must be sent before the content length is known, and so the ResponseWriter can't determine it automatically.
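For example, a handler that writes only a few bytes but calls Flush will still produce a chunked response, because the headers must go out before the total size is known. A quick sketch:
func flushingHandler(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("tiny")) // well under the buffer size
    if f, ok := w.(http.Flusher); ok {
        f.Flush() // headers are sent now, so Content-Length can no longer be set
    }
    w.Write([]byte(" more"))
}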
This is implemented in the source in net/http/server.go. Specifically, here are the declarations of the buffer size, and the chunkedWriter which implements part of the buffered writing behaviour:
// This should be >= 512 bytes for DetectContentType,
// but otherwise it's somewhat arbitrary.
const bufferBeforeChunkingSize = 2048
// chunkWriter writes to a response's conn buffer, and is the writer
// wrapped by the response.w buffered writer.
//
// chunkWriter also is responsible for finalizing the Header, including
// conditionally setting the Content-Type and setting a Content-Length
// in cases where the handler's final output is smaller than the buffer
// size. It also conditionally adds chunk headers, when in chunking mode.
//
// See the comment above (*response).Write for the entire write flow.
type chunkWriter struct {
Link to the source code for 1.19.5. Please note that the source code is subject to change with each Go release.
The value is defined here:
// This should be >= 512 bytes for DetectContentType,
// but otherwise it's somewhat arbitrary.
const bufferBeforeChunkingSize = 2048
The Life of a Write explains what happens:
If the handler didn't declare a Content-Length up front, we either go into chunking mode or, if the handler finishes running before the chunking buffer size, we compute a Content-Length and send that in the header instead.
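A small self-contained sketch (the 100- and 4096-byte sizes are arbitrary, chosen to land on either side of the 2048-byte buffer) that shows both behaviours:
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
)

func main() {
    srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        n := 100 // fits in the 2048-byte buffer: Content-Length is set
        if r.URL.Query().Get("big") == "1" {
            n = 4096 // exceeds the buffer: chunked encoding instead
        }
        w.Write(make([]byte, n))
    }))
    defer srv.Close()

    for _, url := range []string{srv.URL, srv.URL + "?big=1"} {
        resp, err := http.Get(url)
        if err != nil {
            panic(err)
        }
        resp.Body.Close()
        fmt.Printf("Content-Length: %d, Transfer-Encoding: %v\n",
            resp.ContentLength, resp.TransferEncoding)
    }
}
For the small response this should print Content-Length: 100 with no transfer encoding; for the large one, Content-Length: -1 and Transfer-Encoding: [chunked].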
I am currently learning to use Go as a server-side language. I'm learning how to handle forms, and I wanted to see how I could prevent a malicious client from sending a very large file (in the case of a form with multipart/form-data) and causing the server to run out of memory. For now this is my code, which I found in a question here on Stack Overflow:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
In my code, r is the *http.Request. I think that code should work, but what happens is that when I send a file, regardless of its size (according to my code, the maximum should be 8M), my code still receives the entire file, so I doubt that my code actually works. So my questions are: does my code really work incorrectly? Is there a concept I am missing that explains why it misbehaves? How can I limit the size of an HTTP request correctly?
Update
I tried to run the code that was shown in the answers, I mean, this code:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(bytes.NewReader(part))
But when I run that code and send a file larger than 8M, I get this message from my web browser:
The connection was reset
The connection to the server was reset while the page was loading.
How can I solve that? How can I read only 8M maximum but without getting that error?
I would ask the question: "How is your service intended/expected to behave if it receives a request greater than the maximum size?"
Perhaps you could simply check the ContentLength of the request and immediately return a 400 Bad Request if it exceeds your maximum?
func MyHandler(rw http.ResponseWriter, rq *http.Request) {
if rq.ContentLength > 8388608 {
rw.WriteHeader(http.StatusBadRequest)
rw.Write([]byte("request content limit exceeded"))
return
}
// ... normal processing
}
This has the advantage of not reading anything and deciding not to proceed at the earliest possible opportunity (short of some throttling on the ingress itself), minimising CPU and memory load on your process.
It also simplifies your normal processing, which then does not have to cater for circumstances where a partial request might be involved, or abort and possibly clean up if the request content limit is reached before all content has been processed.
Your code reads:
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
This means that you are assigning a new io.MultiReader to your body, one that:
reads at most 8388608 bytes from a byte slice in memory
and then reads the rest of the body after those 8388608 bytes
To ensure that you only read 8388608 bytes at most, replace that line with:
r.Body = ioutil.NopCloser(bytes.NewReader(part))
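Regarding the update: the connection reset happens because the server stops reading while the browser is still sending. If you want to report the problem to the client instead, one sketch (keeping your 8388608 limit) is to read one byte past the limit so you can detect the overflow and answer with a 413:
const maxBody = 8388608

// Read at most maxBody+1 bytes; the extra byte tells us whether the
// client sent more than the limit.
part, err := ioutil.ReadAll(io.LimitReader(r.Body, maxBody+1))
if err != nil {
    http.Error(w, "error reading request body", http.StatusInternalServerError)
    return
}
if len(part) > maxBody {
    http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
    return
}
r.Body = ioutil.NopCloser(bytes.NewReader(part))
The browser may still show a reset if it keeps streaming data the server never reads; http.MaxBytesReader, discussed above, exists precisely to make that cut-off deliberate.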
Is there a way to do a client.Do("POST", "example.com", body) and read the response headers before the entire response body has been received/closed? This would be similar to how JavaScript XHR requests emit an event that the headers have been received and you can read them before the rest of the request arrives.
What I'm trying to accomplish is making a sort of "smart client" that uses information in the headers from my server to determine what to upload in the request body. So I need to start the request, read the response headers, then start writing to the request body. Because of the nature of my system, I can't split it across separate requests. I believe it's possible at the protocol level, but I'm not sure if go's http libraries support it.
The http client's Do function doesn't block until the whole response body is returned. If you don't want to read the full response, why not just call res.Body.Close() after you have examined the headers? I think that should give you roughly the behavior you want. According to the docs:
The response body is streamed on demand as the Body field is read. If the network connection fails or the server terminates the response, Body.Read calls return an error.
Note, though, that the DefaultTransport used by the default http.Client (an http.Transport) doesn't guarantee that it won't read any bytes of the body before you ask for them.
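A sketch illustrating that, assuming req and client have already been set up:
resp, err := client.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

// Do has returned, but the body has not been consumed yet: the
// response headers are already available here.
fmt.Println("Content-Type:", resp.Header.Get("Content-Type"))

// Decide whether to stream the body or close it without reading.
if _, err := io.Copy(io.Discard, resp.Body); err != nil {
    log.Fatal(err)
}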
You can fulfill your requirements by sending an OPTIONS request to the URL before sending the actual request and reading the response headers.
The response header will contain all the necessary headers to perform the preferred request.
req, _ := http.NewRequest("OPTIONS", "example.com", nil)
resp, _ := client.Do(req)
I have a simple multipart form which uploads to a Go app. I wanted to set a restriction on the upload size, so I did the following:
func myHandler(rw http.ResponseWriter, request *http.Request) {
    request.Body = http.MaxBytesReader(rw, request.Body, 1024)
    err := request.ParseMultipartForm(1024)
    if err != nil {
        // Some response.
    }
}
Whenever an upload exceeds the maximum size, I get a connection reset, and yet the code continues executing. I can't seem to provide any feedback to the user. Instead of severing the connection, I'd prefer to say "You've exceeded the size limit". Is this possible?
This code works as intended. From the description of http.MaxBytesReader:
MaxBytesReader is similar to io.LimitReader but is intended for limiting the size of incoming request bodies. In contrast to io.LimitReader, MaxBytesReader's result is a ReadCloser, returns a non-EOF error for a Read beyond the limit, and closes the underlying reader when its Close method is called.
MaxBytesReader prevents clients from accidentally or maliciously sending a large request and wasting server resources.
You could use io.LimitReader to read just N bytes and then do the handling of the HTTP request on your own.
The only way to force a client to stop sending data is to forcefully close the connection, which is what you're doing with http.MaxBytesReader.
You could use an io.LimitReader wrapped in an ioutil.NopCloser, and notify the client of the error state. You could then check for more data and try to drain the connection up to another limit to keep it open. However, clients that aren't responding correctly to MaxBytesReader may not work in this case either.
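A sketch of that drain-and-notify idea (the limits are arbitrary):
const bodyLimit = 1 << 20  // how much we are willing to process
const drainLimit = 4 << 20 // how much more we will discard to keep the connection alive

body, err := ioutil.ReadAll(io.LimitReader(r.Body, bodyLimit+1))
if err != nil {
    http.Error(w, "error reading request body", http.StatusInternalServerError)
    return
}
if len(body) > bodyLimit {
    // Notify the client instead of resetting the connection...
    http.Error(w, "You've exceeded the size limit", http.StatusRequestEntityTooLarge)
    // ...then drain (up to a point) what the client is still sending,
    // so the connection can stay usable for keep-alive.
    io.CopyN(ioutil.Discard, r.Body, drainLimit)
    return
}
r.Body = ioutil.NopCloser(bytes.NewReader(body))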
The graceful way to handle something like this is using Expect: 100-continue, but that only really applies to clients other than web browsers.
If I have a basic http handler for POST requests, how can I stop processing if the payload is larger than 100 KB?
From what I understand, in my POST handler, behind the scenes the server is streaming the POSTed data. But if I try to access it, it will block, correct?
I want to stop processing if it is over 100 KB in size.
Use http.MaxBytesReader to limit the amount of data read from the client. Execute this line of code
r.Body = http.MaxBytesReader(w, r.Body, 100000)
before calling r.ParseForm, r.FormValue or any other request method that reads the body.
Wrapping the request body with io.LimitedReader limits the amount of data read by the application, but does not necessarily limit the amount of data read by the server on behalf of the application.
Checking the request content length is unreliable because the field is not set to the actual request body size when chunked encoding is used.
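Putting that together, a minimal sketch of such a handler:
func postHandler(w http.ResponseWriter, r *http.Request) {
    // Stop reading from the client after 100 KB.
    r.Body = http.MaxBytesReader(w, r.Body, 100000)

    if err := r.ParseForm(); err != nil {
        // The limit was exceeded, or the body was otherwise unreadable.
        http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
        return
    }
    // ... normal processing using r.FormValue(...)
}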
I believe you can simply check the http.Request.ContentLength field to learn the size of the posted request, and use it to decide whether to go ahead or return an error if it's larger than expected.