Request body too large causing connection reset in Go - go

I have a simple multipart form which uploads to a Go app. I wanted to set a restriction on the upload size, so I did the following:
func myHandler(rw http.ResponseWriter, request *http.Request) {
    request.Body = http.MaxBytesReader(rw, request.Body, 1024)
    err := request.ParseMultipartForm(1024)
    if err != nil {
        // Some response.
    }
}
Whenever an upload exceeds the maximum size, I get a connection reset, and yet the code continues executing. I can't seem to provide any feedback to the user. Instead of severing the connection I'd prefer to say "You've exceeded the size limit". Is this possible?

This code works as intended. From the documentation of http.MaxBytesReader:
MaxBytesReader is similar to io.LimitReader but is intended for
limiting the size of incoming request bodies. In contrast to
io.LimitReader, MaxBytesReader's result is a ReadCloser, returns a
non-EOF error for a Read beyond the limit, and closes the underlying
reader when its Close method is called.
MaxBytesReader prevents clients from accidentally or maliciously
sending a large request and wasting server resources.
You could use io.LimitReader to read just N bytes and then do the handling of the HTTP request on your own.

The only way to force a client to stop sending data is to forcefully close the connection, which is what you're doing with http.MaxBytesReader.
You could use an io.LimitReader wrapped in an ioutil.NopCloser, and notify the client of the error state. You could then check for more data, and try to drain the connection up to another limit to keep it open. However, clients that aren't responding correctly to MaxBytesReader may not work in this case either.
The graceful way to handle something like this is using Expect: 100-continue, but that only really applies to clients other than web browsers.
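If the goal is simply to tell the user what went wrong, the handler can inspect the error it gets when reading the capped body and answer with a 413 before returning. Below is a minimal sketch, not a definitive implementation. It assumes Go 1.19 or later (for http.MaxBytesError) and reads the body with io.ReadAll instead of ParseMultipartForm to keep the example short. The names limitedHandler, statusFor and maxBody are made up for illustration. Whether a browser actually displays the response is a separate matter: as noted above, the server may close the connection while the client is still sending, and the browser can then show a reset instead.

```go
package main

import (
    "errors"
    "io"
    "net/http"
    "net/http/httptest"
    "strings"
)

// maxBody is a hypothetical limit chosen for illustration.
const maxBody = 1024

// limitedHandler caps the body with http.MaxBytesReader and reports an
// oversized upload as 413 instead of leaving the error unhandled.
func limitedHandler(w http.ResponseWriter, r *http.Request) {
    r.Body = http.MaxBytesReader(w, r.Body, maxBody)
    if _, err := io.ReadAll(r.Body); err != nil {
        var tooLarge *http.MaxBytesError // available since Go 1.19
        if errors.As(err, &tooLarge) {
            http.Error(w, "You've exceeded the size limit", http.StatusRequestEntityTooLarge)
            return
        }
        http.Error(w, "could not read request body", http.StatusBadRequest)
        return
    }
    io.WriteString(w, "ok")
}

// statusFor runs limitedHandler against an in-memory request whose body
// is n bytes long and returns the resulting status code.
func statusFor(n int) int {
    body := strings.NewReader(strings.Repeat("x", n))
    req := httptest.NewRequest(http.MethodPost, "/upload", body)
    rec := httptest.NewRecorder()
    limitedHandler(rec, req)
    return rec.Code
}
```

Calling statusFor with a body within the limit exercises the normal path, and a larger body exercises the 413 path, without opening a real connection.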

Related

Is this code ok to avoid a big HTTP request? Golang

I am currently learning to use Go as a server-side language. I'm learning how to handle forms, and I wanted to see how I could prevent a malicious client from sending a very large file (in a multipart/form-data form) and causing the server to run out of memory. For now this is my code, which I found in a question here on Stack Overflow:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
In my code r is a *http.Request. I think that code works, but when I send a file, regardless of its size (according to my code, the maximum is 8M), my code still receives the entire file, so I doubt that it actually works. So my questions are: Is my code really wrong? Is there a concept I am missing that makes me think it is malfunctioning? How can I limit the size of an HTTP request correctly?
Update
I tried to run the code that was shown in the answers, I mean, this code:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(bytes.NewReader(part))
But when I run that code, and when I send a file larger than 8M I get this message from my web browser:
The connection was reset
The connection to the server was reset while the page was loading.
How can I solve that? How can I read only 8M maximum but without getting that error?
I would ask the question: "How is your service intended/expected to behave if it receives a request greater than the maximum size?"
Perhaps you could simply check the ContentLength of the request and immediately return a 400 Bad Request if it exceeds your maximum?
func MyHandler(rw http.ResponseWriter, rq *http.Request) {
    if rq.ContentLength > 8388608 {
        rw.WriteHeader(http.StatusBadRequest)
        rw.Write([]byte("request content limit exceeded"))
        return
    }
    // ... normal processing
}
This has the advantage of not reading anything and deciding not to proceed at the earliest possible opportunity (short of some throttling on the ingress itself), minimising CPU and memory load on your process. Note that ContentLength is -1 when the length is unknown (for example, with chunked transfer encoding), so such requests pass this check and the body still needs a limit when it is read.
It also simplifies your normal processing, which then does not have to cater for a partial request, or abort and possibly clean up if the content limit is reached before all content has been processed.
Your code reads:
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
This means that you are assigning a new io.MultiReader to your body that:
reads at most 8388608 bytes from a byte slice in memory
and then reads the rest of the body after those 8388608 bytes
To ensure that you only read 8388608 bytes at most, replace that line with:
r.Body = ioutil.NopCloser(bytes.NewReader(part))

How to un-wedge go gRPC bidi-streaming server from the blocking Recv() call?

When serving a bidirectional stream in gRPC in golang, the canonical stream handler looks something like this:
func (s *MyServer) MyBidiRPC(stream somepb.MyServer_MyBidiServer) error {
    for {
        data, err := stream.Recv()
        if err == io.EOF {
            return nil // clean close
        }
        if err != nil {
            return err // some other error
        }
        // do things with data here
    }
}
Specifically, when the handler for the bidi RPC returns, that is the signal to consider the server side closed.
This is a synchronous programming model -- the server stays blocked inside this goroutine (created by the grpc library) while waiting for messages from the client.
Now, I would like to unblock this Recv() call (which ends up calling RecvMsg() on an underlying grpc.ServerStream,) and return/close the stream, because the server process has decided that it is done with this client.
Unfortunately, I can find no obvious way to do this:
There's no Close() or CloseSend() or CloseRecv() or Shutdown()-like function on the bidi server interface generated for my service
The context inside the stream, which I can get at with stream.Context(), doesn't expose a user-accessible cancel function
I can't find a way to pass in a context on the "starting side" for a new connection accepted by the grpc.Server, where I could inject my own cancel function
I could close the entire grpc.Server by calling Stop(), but that's not what I want to do -- only this particular client connection (grpc.ServerStream) should be finished.
I could send a message to the client that makes the client in turn shut down the connection. However, this doesn't work if the client has fallen off the network, which would be solved with a timeout, which has to be pretty long to be generally robust. I want it now because I'm impatient, and, more importantly, at scale, dangling unresponsive clients can be a high cost.
I could (perhaps) dig through the grpc.ServerStream with reflection until I find the transportStream, and then dig out the cancel function out of that and call it. Or dig through the stream.Context() with reflection, and make my own cancel function reference to call. Neither of these seem well advised for future maintainers.
But surely these can't be the only options? Deciding that a particular client no longer needs to be connected is not magic space-alien science. How do I close this stream such that the Recv() call un-blocks, from the server process side, without involving a round-trip to the client?
Unfortunately I don't think there is a great way to do what you are asking. Depending on your goal, I think you have two options:
Run Recv in a goroutine and return from the bidi handler when you need it to return. This will close the context and unblock Recv. This is obviously suboptimal, as it requires care because you now have code executing outside the scope of the handler's execution. It is, however, the closest answer I can seem to find.
If you are trying to mitigate the impact of misbehaving clients by instituting timeouts, you might be able to offload the work of this to the framework with KeepaliveEnforcementPolicy and/or KeepaliveParams. This is probably preferable if this aligns with the reason you are hoping to close the connection, but otherwise isn't of much use.

Unbuffered bidirectional data streaming with gRPC: how to get the size of the client-side buffer?

I am streaming data from a server to a client and I would like the server not to read and send more data than the client's buffer size.
Given:
service StreamService {
    rpc Stream(stream Buffer) returns (stream Buffer);
}
message Buffer {
    bytes data = 1;
}
My client's program basically looks like:
func ReadFromServer(stream StreamService_StreamClient, buf []byte) (n int, err error) {
    // I actually don't need more than len(buf)...
    // How could I send len(buf) while stream is bidirectional...?
    buffer, err := stream.Recv()
    if err != nil {
        return 0, err
    }
    n = copy(buf, buffer.Data)
    // buf could also be smaller than buffer.Data...
    return n, nil
}
So how could I send len(buf) while the RPC's stream is bidirectional, i.e. the send direction is used by another independent stream of data? Note that I don't use client or server-side buffering to avoid losing data when one of them is terminated (my data-source is an I/O).
gRPC provides no mechanism for this. It only provides push-back when a sender needs to slow down. But there will still be buffering happening internally and that is not exposed because gRPC is message-based, not byte-based.
There are really only two options in your case:
Server chunks responses arbitrarily. The client Recv()s when necessary and any extra is manually managed for later.
The client sends a request asking for a precise amount to be returned, and then waits for the response.
Note that I don't use client or server-side buffering to avoid losing data when one of them is terminated (my data-source is an I/O).
This isn't how it works. When you do a Send() there is no guarantee it is received when the call returns. When you do a Recv() there is no guarantee that the message was received after the recv call (it could have been received before the call). There is buffering going on, period.
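For the first option, "any extra is manually managed for later" can look something like the sketch below: an io.Reader that keeps whatever did not fit in the caller's buffer and serves it on the next call. This is only an illustration. The recv field is a hypothetical stand-in for a call such as stream.Recv() returning the Data of the next Buffer message, and sliceSource is a made-up in-memory source for trying it out.

```go
package main

import "io"

// messageReader adapts a message-based source to io.Reader.
type messageReader struct {
    recv     func() ([]byte, error) // hypothetical: returns the next message's bytes
    leftover []byte                 // received but not yet handed to the caller
}

// Read serves leftover bytes first and only asks for a new message
// when nothing is left over.
func (m *messageReader) Read(buf []byte) (int, error) {
    if len(buf) == 0 {
        return 0, nil
    }
    for len(m.leftover) == 0 {
        data, err := m.recv()
        if err != nil {
            return 0, err
        }
        m.leftover = data
    }
    n := copy(buf, m.leftover)
    m.leftover = m.leftover[n:]
    return n, nil
}

// sliceSource returns a recv function that yields msgs in order and
// then io.EOF.
func sliceSource(msgs ...[]byte) func() ([]byte, error) {
    return func() ([]byte, error) {
        if len(msgs) == 0 {
            return nil, io.EOF
        }
        next := msgs[0]
        msgs = msgs[1:]
        return next, nil
    }
}
```

Note that this is exactly the client-side buffering the question wanted to avoid: bytes held in leftover are lost if the client terminates before reading them.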
I think there's no built-in solution for that. The use case looks a little odd: why must the server care about the client's state at all? If it really needs to, you should extend your bidirectional stream: the client must request byte slices of a particular size (according to its own buffer size and other factors).
By the way, you may find the message size limit settings for the gRPC client and server useful:
https://godoc.org/google.golang.org/grpc#MaxMsgSize https://godoc.org/google.golang.org/grpc#WithMaxMsgSize

How can I orchestrate concurrent request-response flow?

I'm new to concurrent programming, and have no idea what concepts to start with, so please be gentle.
I am writing a webservice as a front-end to a TCP server. This server listens to the port I give it, and returns the response to the TCP connection for each request.
Here is why I'm writing a web-service front-end for this server:
The server can handle one request at a time, and I'm trying to make it be able to process several inputs concurrently, by launching multiple processes and giving them a different port to listen on. For example, I want to launch 30 instances and tell them to listen on ports 20000-20029.
Our team uses PHP, and PHP does not have the capacity to launch server instances and maintain them concurrently, so I'm trying to write an API they can just send HTTP requests to.
So, here is the structure I have thought of.
I will have a main() function. This function launches the processes concurrently, then starts an HTTP server on port 80 and listens.
I have an http.Handler that adds the content of a request to a channel.
I will have goroutines, one per server instance, that are in an infinite loop.
The code for the function mentioned in item three would be something like this:
func handleRequest(queue chan string) {
    for {
        request := <-queue
        conn, err := connectToServer()
        err = sendRequestToServer(conn)
        response, err := readResponseFromServer(conn)
    }
}
So, my http.Handler can simply do something like queue<- request to add the request to the queue, and handleRequest, which has blocked, waiting for the channel to have something to get, will simply get the request and continue on. When done, the loop finishes, execution comes back to the request := <-queue, and the same thing continues.
My problem starts in the http.Handler. It makes perfect sense to put requests in a channel, because multiple goroutines are all listening to it. However, how can these goroutines return the result to my http.Handler?
One way is to use a channel, let's call it responseQueue, that all of these goroutines would then write to. The problem is that when a response is added to the channel, I don't know which request it belongs to. In other words, when multiple http.Handlers send requests, each executing handler will not know which response the current message in the channel belongs to.
Is there a best practice, or a pattern, to send data to a goroutine from another goroutine and receive the data back?
Create a per request response channel and include it in the value sent to the worker. The handler receives from the channel. The worker sends the result to the channel.
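A minimal sketch of that pattern might look like this. The type and function names (job, worker, submit, startWorkers) are invented for the example, and strings.ToUpper is just a placeholder for the round trip to the TCP server.

```go
package main

import "strings"

// job pairs a request with the channel its result must be sent on.
type job struct {
    request string
    reply   chan string
}

// worker stands in for the goroutine that owns one server instance.
// strings.ToUpper is a placeholder for talking to the TCP server.
func worker(queue <-chan job) {
    for j := range queue {
        j.reply <- strings.ToUpper(j.request)
    }
}

// startWorkers launches n workers that share one queue.
func startWorkers(n int) chan<- job {
    queue := make(chan job)
    for i := 0; i < n; i++ {
        go worker(queue)
    }
    return queue
}

// submit is what an http.Handler would call: it enqueues the request
// and waits on its own reply channel, so responses cannot get mixed up.
func submit(queue chan<- job, request string) string {
    // Buffered so a worker never blocks if the handler stops waiting.
    reply := make(chan string, 1)
    queue <- job{request: request, reply: reply}
    return <-reply
}
```

Each handler call creates its own reply channel, so there is no shared responseQueue and no need to match responses to requests.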

using Go redis client (Redigo)

I'm using the Go Redis client Redigo to write images to ~20 Redis servers.
Speed is an important factor here, and I'm just sending SET commands to Redis, so I'm using Send and Flush without calling Receive.
After a few hours I'm getting "connection reset by peer" on the client.
I was wondering, does it have something to do with the fact that I don't call Receive?
Maybe my RX queue is just reaching its maximum capacity because I don't empty it with Receive?
Thank you.
An application must call Receive to clear the responses from the server and to check for errors. If the application is not pipelining commands, then it's best to call Do. Do combines Send, Flush and Receive.
If you don't care about errors, then start a goroutine to read the responses:
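go func(c redis.Conn) {
    // Drain replies until the connection reports an error.
    for c.Err() == nil {
        c.Receive()
    }
}(c) // c is the connection used for Send and Flush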
