How can I orchestrate concurrent request-response flow? - go

I'm new to concurrent programming, and have no idea what concepts to start with, so please be gentle.
I am writing a webservice as a front-end to a TCP server. This server listens to the port I give it, and returns the response to the TCP connection for each request.
Here is why I'm writing a web-service front-end for this server:
The server can handle one request at a time, and I'm trying to make it be able to process several inputs concurrently, by launching multiple processes and giving them a different port to listen on. For example, I want to launch 30 instances and tell them to listen on ports 20000-20029.
Our team uses PHP, and PHP does not have the capacity to launch server instances and maintain them concurrently, so I'm trying to write an API they can just send HTTP requests to.
So, here is the structure I have thought of.
1. I will have a main() function. This function launches the processes concurrently, then starts an HTTP server on port 80 and listens.
2. I have an http.Handler that adds the content of a request to a channel.
3. I will have goroutines, one per server instance, that are in an infinite loop.
The code for the function mentioned in item three would be something like this:
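func handleRequest(queue chan string) {
	for {
		request := <-queue // blocks until a handler sends a request
		conn, err := connectToServer()
		// error handling omitted
		err = sendRequestToServer(conn, request)
		response, err := readResponseFromServer(conn)
		// response still has to get back to the handler somehow
	}
}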
So, my http.Handler can simply do something like queue<- request to add the request to the queue, and handleRequest, which has blocked, waiting for the channel to have something to get, will simply get the request and continue on. When done, the loop finishes, execution comes back to the request := <-queue, and the same thing continues.
My problem starts in the http.Handler. It makes perfect sense to put requests in a channel, because multiple goroutines are all listening to it. However, how can these goroutines return the result to my http.Handler?
One way is to use a channel, let's call it responseQueue, that all of these goroutines would then write to. The problem is that when a response is added to the channel, I don't know which request it belongs to. In other words, when multiple http.Handlers send requests, each executing handler will not know which response the current message in the channel belongs to.
Is there a best practice, or a pattern, to send data to a goroutine from another goroutine and receive the data back?

Create a per request response channel and include it in the value sent to the worker. The handler receives from the channel. The worker sends the result to the channel.
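As a minimal sketch of that pattern: each request carries its own reply channel, so a worker can only ever answer the handler that sent the request. The names job, worker and handle are hypothetical, and strings.ToUpper stands in for the real round trip to the TCP server.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// job pairs a request payload with the channel its answer must be sent on.
type job struct {
	payload string
	reply   chan string // per-request channel, buffered so the worker never blocks
}

// worker stands in for one goroutine that talks to one backend instance.
// strings.ToUpper is a placeholder for the real TCP round trip.
func worker(queue <-chan job) {
	for j := range queue {
		j.reply <- strings.ToUpper(j.payload)
	}
}

// handle is what an http.Handler would call: enqueue, then wait for its own reply.
func handle(queue chan<- job, payload string) string {
	reply := make(chan string, 1)
	queue <- job{payload: payload, reply: reply}
	return <-reply
}

func main() {
	queue := make(chan job)
	for i := 0; i < 3; i++ {
		go worker(queue)
	}

	var wg sync.WaitGroup
	for _, p := range []string{"foo", "bar", "baz"} {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			fmt.Println(p, "->", handle(queue, p))
		}(p)
	}
	wg.Wait()
}
```

Because the reply channel travels with the request, no shared responseQueue is needed and responses cannot be mixed up, whatever order the workers finish in.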

Related

How to deal with back pressure in GO GRPC?

I have a scenario where clients connect to a server via gRPC and I would like to implement backpressure on it, meaning that I would like to accept many simultaneous requests (say 10000) but have only 50 simultaneous threads executing them (this is inspired by the behaviour of the Apache Tomcat NIO interface). I would also like the communication to be asynchronous, in a reactive manner, meaning that the client sends the request but does not wait on it, the server sends the response back later, and the client then executes some function registered for that purpose.
How can I do that in GO GRPC? Should I use streams? Is there any example?
The Go API is a synchronous API; this is how Go usually works. You block in an infinite loop until an event happens, and then you proceed to handle that event. With respect to having more simultaneous threads executing requests, we don't control that on the client side. On the client side, at the application layer above gRPC, you can start more goroutines, each executing requests. The server side already starts a goroutine for each accepted connection, and even for each stream on the connection, so there is already inherent multithreading on the server side.
Note that you do not work with threads in Go. Go uses goroutines.
The behavior described is already built into the gRPC server. For example, see this option.
// NumStreamWorkers returns a ServerOption that sets the number of worker
// goroutines that should be used to process incoming streams. Setting this to
// zero (default) will disable workers and spawn a new goroutine for each
// stream.
//
// # Experimental
//
// Notice: This API is EXPERIMENTAL and may be changed or removed in a
// later release.
func NumStreamWorkers(numServerWorkers uint32) ServerOption {
	// TODO: If/when this API gets stabilized (i.e. stream workers become the
	// only way streams are processed), change the behavior of the zero value to
	// a sane default. Preliminary experiments suggest that a value equal to the
	// number of CPUs available is most performant; requires thorough testing.
	return newFuncServerOption(func(o *serverOptions) {
		o.numServerWorkers = numServerWorkers
	})
}
The workers are at some point initialized.
// initServerWorkers creates worker goroutines and channels to process incoming
// connections to reduce the time spent overall on runtime.morestack.
func (s *Server) initServerWorkers() {
	s.serverWorkerChannels = make([]chan *serverWorkerData, s.opts.numServerWorkers)
	for i := uint32(0); i < s.opts.numServerWorkers; i++ {
		s.serverWorkerChannels[i] = make(chan *serverWorkerData)
		go s.serverWorker(s.serverWorkerChannels[i])
	}
}
I suggest you read the server code yourself, to learn more.
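The general shape of "accept many, run only a few at a time" can be sketched with the standard library alone. This is not gRPC code: runBounded is a hypothetical helper, the buffered channel plays the role of the queue of accepted requests, and time.Sleep is a placeholder for real work.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runBounded pushes `requests` jobs through `workers` goroutines and reports
// the highest number of jobs that were ever running at the same time.
func runBounded(requests, workers int) int64 {
	jobs := make(chan int, requests) // queue of accepted but not yet running work
	var inFlight, peak int64
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				n := atomic.AddInt64(&inFlight, 1)
				// Record a new peak if this worker pushed the count higher.
				for {
					p := atomic.LoadInt64(&peak)
					if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
						break
					}
				}
				time.Sleep(time.Millisecond) // placeholder for real work
				atomic.AddInt64(&inFlight, -1)
			}
		}()
	}

	for i := 0; i < requests; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrency:", runBounded(1000, 50))
}
```

The reported peak can never exceed the number of workers, because only the worker goroutines take jobs off the queue. If the queue were smaller than the number of incoming requests, senders would block, which is the backpressure the question asks about.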

Difference between NewChannel vs Request in ssh sftp server

I'm looking at the Go SFTP server example code:
https://github.com/pkg/sftp/blob/master/examples/go-sftp-server/main.go
There is a section of the code that is unclear to me:
_, chans, reqs, err := ssh.NewServerConn(nConn, config)
if err != nil {
	log.Fatal("failed to handshake", err)
}
fmt.Fprintf(debugStream, "SSH server established\n")
// The incoming Request channel must be serviced.
go ssh.DiscardRequests(reqs)
// Service the incoming Channel channel.
for newChannel := range chans {
	...
}
First: with ssh.NewServerConn, if a NewChannel (chans) represents a new request to open a channel, what is a Request (reqs)? In other words, what is the difference between chans and reqs here?
Second: why is ssh.DiscardRequests(reqs) needed?
Looking at the documentation for ssh.NewServerConn it appears that it returns the following:
*ServerConn
<-chan NewChannel
<-chan *Request
error
The second returned value, NewChannel, "represents an incoming request to a channel".
The third returned value, Request, "is a request sent outside of the normal stream of data".
This doesn't really answer your questions, but it does provide helpful clues as to where to look.
So, to answer your questions:
chans receives requests to open new channels on the connection. Using a value received from chans, you can either accept the channel and communicate over it, or reject it. Think of multiple people logging into a remote machine via ssh, with the server handling multiple sessions.
reqs holds global requests (which are defined here) sent to either the server or the client that are not addressed to any specific channel. RFC 4254 gives "start TCP/IP forwarding for a specific port" as an example of such a request.
You can see the internal usage of how the ssh package uses the incomingRequests channel here.
The documentation for ssh.NewServerConn explicitly states
The Request and NewChannel channels must be serviced, or the connection will hang.
In the event that this server does receive a global request it needs to be handled appropriately if the request is asking for a reply.
Apart from @will7200's answer, I just want to add a couple of things I found while reading around this.
SSH has two kinds of request message: SSH_MSG_GLOBAL_REQUEST and SSH_MSG_CHANNEL_REQUEST.
A channel is a specific stream within the connection, such as a terminal session, over which data is sent between the SSH server and client.
So reqs here carries the global requests, while channel requests are delivered through the channel they belong to.
Global requests are requests that are not specific to a channel, like TCPKeepAlive (as mentioned in ssh_config) or "start TCP/IP forwarding for a specific port".
DiscardRequests consumes these requests and rejects them, sending a failure reply to any request that asks for a reply.

Relay data between two different tcp clients in golang

I'm writing a TCP server which simultaneously accepts multiple connections from mobile devices and some WiFi devices (IoT). The connections need to be maintained once established, with a 30-second timeout if no heartbeat is received. So it is something like the following:
// clientsMap map[string] conn
func someFunction() {
	conn, err := s.listener.Accept()
	// I store the conn in clientsMap
	// so I can access it, for brevity not
	// shown here, then:
	go serve(conn)
}
func serve(conn net.Conn) {
	timeoutDuration := 30 * time.Second
	conn.SetReadDeadline(time.Now().Add(timeoutDuration))
	for {
		msgBuffer := make([]byte, 2048)
		msgBufferLen, err := conn.Read(msgBuffer)
		// do something with the stuff
	}
}
So there is one goroutine for each client. And each client, once connected to the server, is pending on the read. The server then processes the stuff read.
The problem is that I sometimes need to read things off one client, and then pass data to another (Between a mobile device and a WiFi device). I have stored the connections in clientsMap. So I can always access that. But since each client is handled by one goroutine, shall I be passing the data from one client to another by using a channel? But if the goroutine is blocked waiting for a pending read, how do I make it also wait for data from a channel? Or shall I just obtain the connection for the other party from the clientsMap and write to it?
The documentation for net.Conn clearly states:
Multiple goroutines may invoke methods on a Conn simultaneously.
So yes, it is okay to simply Write to the connections. You should take care to issue a single Write call per message you want to send. If you call Write more than once you risk interleaving messages from different mobile devices. This implies calling Write directly and not via some other API (in other words don't wrap the connection). For instance, the following would not be safe:
json.NewEncoder(conn).Encode(myValue) // use json.Marshal(myValue) instead
io.Copy(conn, src) // use io.ReadAll(src) instead

Is this example tcp socket programming sequence of events safe?

I plan on having two services.
HTTP REST service written in Ruby
JSON RPC service written in Go
The Ruby service will open a TCP socket connection to a Go JSON RPC service. It'll do this for each incoming HTTP request it receives. It will send some data over the socket to the Go service and that service will subsequently send back the corresponding data back down the socket.
Go code
The Go service would look something like this (simplified):
srv := new(service.App) // this would expose a Process method
rpc.Register(srv)
listener, err := net.Listen("tcp", ":8080")
if err != nil {
	// handle error
}
for {
	conn, err := listener.Accept()
	if err != nil {
		// handle error
	}
	go jsonrpc.ServeConn(conn)
}
Notice we serve the incoming connection using a goroutine, so we can handle requests concurrently.
Ruby code
Below is a simple snippet of Ruby code that demonstrates (in theory) the way I would send data to the Go service:
require "socket"
require "json"
socket = TCPSocket.new "localhost", "8080"
b = {
:method => "App.Process",
:params => [{ :Config => JSON.generate({ :foo => :bar }) }],
:id => "0"
}
socket.write(JSON.dump(b))
response = JSON.load socket.readline
My concern is: will this be a safe sequence of events?
I'm not asking if this will be 'thread safe', because I'm not worried about manipulating shared memory across the goroutines. I'm more concerned about whether my Ruby HTTP service will get back the data it's expecting.
If I have two parallel requests coming into my HTTP service (or maybe the Ruby app is hosted behind a load balancer, so different instances of the HTTP service are handling multiple requests), then I could have instance A send the message Foo to the Go service while instance B sends the message Bar.
The business logic inside the Go service will return different responses depending on its input so I want to be sure that Ruby instance A gets back the correct response for Foo, and B gets back the correct response for Bar.
I assume a socket connection is more like a queue in that if instance A makes a request to the Go service first and then B does, but B is quicker responding for whatever reason, then the Go service will write the response for B to the socket and instance A of the Ruby app will end up reading in the wrong socket data (this is obviously just one possible scenario considering that I could get lucky and have instance B read the socket data before instance A does).
Solutions?
I'm not sure if there is simple solution to this problem. Unless I don't use a TCP socket or RPC and instead rely on standard HTTP in the Go service. But I wanted the performance and less overhead of TCP.
I'm worried the design could get more complicated by maybe having to implement an external queue as a way of synchronising the responses with the Ruby service.
It may be that, because my Ruby service is fundamentally synchronous (HTTP request/response), I have no option but to switch to HTTP for the Go service.
But wanted to double check with the community first just in case I'm missing something obvious.
Yes this is safe if you create a new connection every time.
That said there are latent issues with your approach:
TCP connections are rather expensive to establish, so you probably want to re-use connections with a connection pool
If you make too many simultaneous requests you will exhaust ports/open file descriptors which will cause your program to crash
You don't have any timeouts in place, so it's possible to end up with orphaned TCP connections which never complete (either because of something bad on the Go side, or network problems)
I think you'd be better off using HTTP (despite the overhead) since libraries are already written to cope with these problems. HTTP is also much more debuggable since you can just curl an endpoint to test it.
Personally I'd probably go with gRPC.
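The claim that one connection per request keeps the responses apart can be checked with a small sketch. App here is a hypothetical stand-in for service.App, the client is written in Go rather than Ruby, and the server listens on a free local port instead of :8080.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"net/rpc/jsonrpc"
	"strings"
	"sync"
)

// App is a hypothetical stand-in for the question's service.App.
type App struct{}

// Process echoes its input in upper case so each reply is tied to its request.
func (App) Process(in string, out *string) error {
	*out = strings.ToUpper(in)
	return nil
}

// startServer listens on a free local port and serves each connection
// in its own goroutine, as in the question's accept loop.
func startServer() (string, error) {
	srv := rpc.NewServer()
	if err := srv.Register(App{}); err != nil {
		return "", err
	}
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return
			}
			go srv.ServeCodec(jsonrpc.NewServerCodec(conn))
		}
	}()
	return ln.Addr().String(), nil
}

// call opens a fresh connection, makes one request, and closes it.
func call(addr, msg string) (string, error) {
	client, err := jsonrpc.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer client.Close()
	var reply string
	err = client.Call("App.Process", msg, &reply)
	return reply, err
}

func main() {
	addr, err := startServer()
	if err != nil {
		fmt.Println("server failed:", err)
		return
	}
	var wg sync.WaitGroup
	for _, msg := range []string{"foo", "bar"} {
		wg.Add(1)
		go func(msg string) {
			defer wg.Done()
			reply, err := call(addr, msg)
			fmt.Println(msg, "->", reply, err)
		}(msg)
	}
	wg.Wait()
}
```

Each caller reads only from the socket it opened, so a slow reply for one request cannot be delivered to another caller. The sketch does not address the connection cost, descriptor limits, or timeouts listed above.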

Request body too large causing connection reset in Go

I have a simple multipart form which uploads to a Go app. I wanted to set a restriction on the upload size, so I did the following:
func myHandler(rw http.ResponseWriter, request *http.Request) {
	request.Body = http.MaxBytesReader(rw, request.Body, 1024)
	err := request.ParseMultipartForm(1024)
	if err != nil {
		// Some response.
	}
}
Whenever an upload exceeds the maximum size, I get a connection reset, and yet the code continues executing. I can't seem to provide any feedback to the user. Instead of severing the connection I'd prefer to say "You've exceeded the size limit". Is this possible?
This code works as intended. From the documentation of http.MaxBytesReader:
MaxBytesReader is similar to io.LimitReader but is intended for
limiting the size of incoming request bodies. In contrast to
io.LimitReader, MaxBytesReader's result is a ReadCloser, returns a
non-EOF error for a Read beyond the limit, and closes the underlying
reader when its Close method is called.
MaxBytesReader prevents clients from accidentally or maliciously
sending a large request and wasting server resources.
You could use io.LimitReader to read just N bytes and then do the handling of the HTTP request on your own.
The only way to force a client to stop sending data is to forcefully close the connection, which is what you're doing with http.MaxBytesReader.
You could use an io.LimitReader wrapped in an ioutil.NopCloser, and notify the client of the error state. You could then check for more data, and try to drain the connection up to another limit to keep it open. However, clients that aren't responding correctly to MaxBytesReader may not work in this case either.
The graceful way to handle something like this is using Expect: 100-continue, but that only really applies to clients other than web browsers.
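On the handler side, the error from reading past the limit can at least be turned into an explicit status. The sketch below shows this under some assumptions: limited and status are hypothetical names, io.ReadAll replaces ParseMultipartForm to keep the example short, and any read error is treated as "too large". It runs against an in-memory request, so it says nothing about whether a browser that is still uploading will display the message rather than a reset.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxBody = 1024 // bytes; same limit as in the question

// limited rejects bodies larger than maxBody with an explicit 413 status.
func limited(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBody)
	if _, err := io.ReadAll(r.Body); err != nil {
		// Assumption: any read error here is treated as "too large".
		http.Error(w, "You've exceeded the size limit", http.StatusRequestEntityTooLarge)
		return
	}
	fmt.Fprintln(w, "ok")
}

// status runs the handler against an in-memory request with an n-byte body
// and returns the response status code.
func status(n int) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/upload", body)
	rec := httptest.NewRecorder()
	limited(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("small body:", status(10))
	fmt.Println("large body:", status(4096))
}
```

Clients that read the response before they finish sending, such as curl or another service, can act on the 413.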

Resources