Scheduler-worker cluster without port forwarding - Go

Hello Stack Overflow!
TL;DR: I would like to recreate https://github.com/KorayGocmen/scheduler-worker-grpc without port forwarding on the worker.
I am trying to build a competitive programming judge server for evaluation of submissions as a project for my school where I teach programming to kids.
Because the evaluation is computationally heavy I would like to have multiple worker nodes.
The scheduler would receive submissions and hand them out to the worker nodes. For ease of worker deployment (as workers will change often) I would like a worker to be able to subscribe to the scheduler, thus registering itself and starting to receive jobs.
The workers may not be on the same network as the scheduler, and each worker resides in a VM (it may later be ported to Docker, but currently there are issues with that).
The scheduler should be able to know resource usage of the worker, send different types of jobs to the worker and receive a stream of results.
I am currently thinking of using grpc to address my requirements of communication between workers and the scheduler.
I could create multiple scheduler service methods like:
register worker, receive a stream of jobs
stream job results, receive nothing
stream worker state periodically, receive nothing
However, I would prefer the following, but I don't know whether it is possible:
The scheduler gRPC API:
register a worker (making the worker gRPC API available to the scheduler)
The worker gRPC API:
start a job (returns a stream of job status)
cancel a job ???
get resource usage
The worker should unregister automatically if the connection is lost.
So my question is: is it possible to create a gRPC worker API that can be registered with the scheduler for later use if the worker is behind NAT, without port forwarding?
Additional possibly unnecessary information:
Making matters worse, I have multiple radically different types of jobs (streaming an interactive console, executing code against prepared test cases). I may just create different workers for different job types.
Sometimes jobs involve large files on the local filesystem (up to 500 MB) that are usually kept near the scheduler, so I would like to send a job to a worker which has already downloaded the specific files from the scheduler, and otherwise download the large files to one of the workers. Keeping all files on a worker at the same time would take more than 20 GB, so I would like to avoid that.
A worker can run multiple jobs ( up to 16 ) at the same time.
I am writing the system in go.

As long as only the workers initiate the connections you don't have to worry about NAT. gRPC supports streaming in either direction (or both). This means that all of your requirements can be implemented using just one server on the scheduler; there is no need for the scheduler to connect back to the workers.
Given your description your service could look something like this:
syntax = "proto3";
import "google/protobuf/empty.proto";
service Scheduler {
rpc GetJobs(GetJobsRequest) returns (stream GetJobsResponse) {}
rpc ReportWorkerStatus(stream ReportWorkerStatusRequest) returns (google.protobuf.Empty) {}
rpc ReportJobStatus(stream JobStatus) returns (stream JobAction) {}
}
enum JobType {
JOB_TYPE_UNSPECIFIED = 0;
JOB_TYPE_CONSOLE = 1;
JOB_TYPE_EXEC = 2;
}
message GetJobsRequest {
// List of job types this worker is willing to accept.
repeated JobType types = 1;
}
message GetJobsResponse {
string jobId = 0;
JobType type = 1;
string fileName = 2;
bytes fileContent = 3;
// etc.
}
message ReportWorkerStatusRequest {
float cpuLoad = 0;
uint64 availableDiskSpace = 1;
uint64 availableMemory = 2;
// etc.
// List of filenames or file hashes, or whatever else you need to precisely
// report the presence of files.
repeated string haveFiles = 2;
}
Much of this is a matter of preference (you can use oneof instead of enums, for instance), but hopefully it's clear that a single connection from client to server is sufficient for your requirements.
Maintaining the set of available workers is quite simple:
func (s *Server) GetJobs(req *pb.GetJobsRequest, stream pb.Scheduler_GetJobsServer) error {
    ctx := stream.Context()
    s.scheduler.AddWorker(req)
    defer s.scheduler.RemoveWorker(req)

    for {
        job, err := s.scheduler.GetJob(ctx, req)
        switch {
        case ctx.Err() != nil: // client disconnected
            return nil
        case err != nil:
            return err
        }
        if err := stream.Send(job); err != nil {
            return err
        }
    }
}
The Basics tutorial includes examples for all types of streaming, including server and client implementations in Go.
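For completeness, the worker side of the GetJobs stream could look roughly like the sketch below. This is an illustration only: the scheduler address, the insecure credentials, and the runJob helper are placeholders, and pb stands for the package generated from the proto above.

func runWorker(ctx context.Context, addr string) error {
    // The worker dials out to the scheduler, so NAT on the worker side is irrelevant.
    conn, err := grpc.Dial(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        return err
    }
    defer conn.Close()

    client := pb.NewSchedulerClient(conn)
    stream, err := client.GetJobs(ctx, &pb.GetJobsRequest{
        Types: []pb.JobType{pb.JobType_JOB_TYPE_EXEC},
    })
    if err != nil {
        return err
    }
    for {
        job, err := stream.Recv()
        if err != nil {
            return err // io.EOF or a transport error ends the stream
        }
        go runJob(ctx, job) // runJob is a placeholder for the actual evaluation logic
    }
}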
As for registration, that usually just means creating some sort of credential that a worker will use when communicating with the server. This might be a randomly generated token (which the server can use to load associated metadata), or a username/password combination, or a TLS client certificate, or similar. Details will depend on your infrastructure and desired workflow when setting up workers.
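As one hypothetical example of the token approach: the worker could attach its credential as gRPC metadata on every call, and the scheduler could check it in a stream interceptor. The metadata key and tokenIsValid are made up for this sketch; the relevant packages are google.golang.org/grpc/metadata, codes and status.

// Worker side: attach the registration token to the outgoing context.
ctx = metadata.AppendToOutgoingContext(ctx, "worker-token", token)
stream, err := client.GetJobs(ctx, &pb.GetJobsRequest{ /* ... */ })

// Scheduler side: reject streams that don't carry a known token.
func authInterceptor(srv interface{}, ss grpc.ServerStream,
    info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
    md, _ := metadata.FromIncomingContext(ss.Context())
    tokens := md.Get("worker-token")
    if len(tokens) == 0 || !tokenIsValid(tokens[0]) {
        return status.Error(codes.Unauthenticated, "unknown worker")
    }
    return handler(srv, ss)
}

// Installed when constructing the scheduler's server:
//   grpc.NewServer(grpc.StreamInterceptor(authInterceptor))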

Related

How to deal with back pressure in Go gRPC?

I have a scenario where clients can connect to a server via gRPC, and I would like to implement backpressure on it, meaning that I would like to accept many simultaneous requests (say 10000) but have only 50 simultaneous threads executing the requests (this is inspired by the Apache Tomcat NIO interface behaviour). I would also like the communication to be asynchronous, in a reactive manner, meaning that the client sends the request but does not wait on it; the server sends the response back later and the client then executes some function registered to be executed.
How can I do that in Go gRPC? Should I use streams? Is there any example?
The Go API is a synchronous API; this is how Go usually works. You block in a loop until an event happens, and then you proceed to handle that event. With respect to having more simultaneous threads executing requests, we don't control that on the client side. On the client side, at the application layer above gRPC, you can fork more goroutines, each executing requests. The server side already forks a goroutine for each accepted connection, and even for each stream on the connection, so there is already inherent multithreading on the server side.
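If you want to bound the client-side concurrency yourself, one common pattern is a semaphore channel around the goroutines issuing calls. This is only a sketch (imports: context, sync); SomeClient, Request, Response and the Do method stand in for whatever your generated stub exposes.

func issueAll(ctx context.Context, client SomeClient, reqs []*Request,
    maxInFlight int, done func(*Response, error)) {
    sem := make(chan struct{}, maxInFlight) // counting semaphore
    var wg sync.WaitGroup
    for _, req := range reqs {
        req := req
        sem <- struct{}{} // blocks once maxInFlight calls are outstanding
        wg.Add(1)
        go func() {
            defer func() { <-sem; wg.Done() }()
            done(client.Do(ctx, req)) // "callback" invoked when the reply arrives
        }()
    }
    wg.Wait()
}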
Note that there are no threads in Go; Go uses goroutines.
The behavior described is already built into the gRPC server. For example, see this option.
// NumStreamWorkers returns a ServerOption that sets the number of worker
// goroutines that should be used to process incoming streams. Setting this to
// zero (default) will disable workers and spawn a new goroutine for each
// stream.
//
// # Experimental
//
// Notice: This API is EXPERIMENTAL and may be changed or removed in a
// later release.
func NumStreamWorkers(numServerWorkers uint32) ServerOption {
    // TODO: If/when this API gets stabilized (i.e. stream workers become the
    // only way streams are processed), change the behavior of the zero value to
    // a sane default. Preliminary experiments suggest that a value equal to the
    // number of CPUs available is most performant; requires thorough testing.
    return newFuncServerOption(func(o *serverOptions) {
        o.numServerWorkers = numServerWorkers
    })
}
The workers are initialized at some point:
// initServerWorkers creates worker goroutines and channels to process incoming
// connections to reduce the time spent overall on runtime.morestack.
func (s *Server) initServerWorkers() {
    s.serverWorkerChannels = make([]chan *serverWorkerData, s.opts.numServerWorkers)
    for i := uint32(0); i < s.opts.numServerWorkers; i++ {
        s.serverWorkerChannels[i] = make(chan *serverWorkerData)
        go s.serverWorker(s.serverWorkerChannels[i])
    }
}
I suggest you read the server code yourself, to learn more.
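For the server side of the question, capping the number of goroutines that process streams is then just a matter of passing that option when constructing the server. A minimal sketch: the option is experimental, 50 mirrors the limit from the question, and the registration call is a placeholder for your generated code.

srv := grpc.NewServer(
    grpc.NumStreamWorkers(50), // at most 50 worker goroutines process incoming streams
)
pb.RegisterYourServiceServer(srv, &yourServer{}) // placeholder

lis, err := net.Listen("tcp", ":50051")
if err != nil {
    log.Fatal(err)
}
log.Fatal(srv.Serve(lis))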

What is the best way to route tasks in machinery (Go)?

I'm trying to use machinery as a distributed task queue and would like to deploy separate workers for different groups of tasks. E.g. have a worker next to the database server running database-related tasks and a number of workers on different servers running cpu/memory-intensive tasks. However, the documentation isn't really clear on how one would do this.
I initially tried running the workers without registering the unwanted tasks on them, but this resulted in the worker repeatedly consuming the unregistered task and requeuing it with the following message:
INFO: 2022/01/27 08:33:13 redis.go:342 Task not registered with this worker. Requeuing message: {"UUID":"task_7026263a-d085-4492-8fa8-e4b83b2c8d59","Name":"add","RoutingKey":"","ETA":null,"GroupUUID":"","GroupTaskCount":0,"Args":[{"Name":"","Type":"int32","Value":2},{"Name":"","Type":"int32","Value":4}],"Headers":{},"Priority":0,"Immutable":false,"RetryCount":0,"RetryTimeout":0,"OnSuccess":null,"OnError":null,"ChordCallback":null,"BrokerMessageGroupId":"","SQSReceiptHandle":"","StopTaskDeletionOnError":false,"IgnoreWhenTaskNotRegistered":false}
I suspect this can be fixed by setting IgnoreWhenTaskNotRegistered to True however this doesn't seem like a very elegant solution.
Task signatures also have a RoutingKey field but there was no info in the docs on how to configure a worker to only consume tasks from a specific routing key.
Also, one other solution would be to have separate machinery task servers but this would take away the ability to use workflows and orchestrate tasks between workers.
Found the solution through some trial and error.
Setting IgnoreWhenTaskNotRegistered to true isn't a correct solution since, unlike what I initially thought, the worker still consumes the unregistered task and then discards it instead of requeuing it.
The correct way to route tasks is to set RoutingKey in the task's signature to the desired queue's name, and to use taskserver.NewCustomQueueWorker to get a queue-specific worker object instead of taskserver.NewWorker.
Sending a task to a specific queue:
task := tasks.Signature{
    Name:       "<TASKNAME>",
    RoutingKey: "<QUEUE>",
    Args: []tasks.Arg{
        // args...
    },
}

res, err := taskserver.SendTask(&task)
if err != nil {
    // handle error
}
And starting a worker to consume from a specific queue:
worker := taskserver.NewCustomQueueWorker("<WORKERNAME>", concurrency, "<QUEUE>")
if err := worker.Launch(); err != nil {
    // handle error
}
I'm still not quite sure how to tell a worker to consume from a set of queues, as NewCustomQueueWorker only accepts a single string as its queue name, but that's a relatively minor detail.
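One workaround (untested; just a sketch built on the same API shown above, with made-up queue names and an assumed log import) would be to run one NewCustomQueueWorker per queue inside the same process:

queues := []string{"db_tasks", "cpu_tasks"} // hypothetical queue names
for _, q := range queues {
    q := q
    go func() {
        w := taskserver.NewCustomQueueWorker("worker_"+q, concurrency, q)
        if err := w.Launch(); err != nil {
            log.Println(err) // handle error
        }
    }()
}
select {} // keep the process alive while the workers run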

How to create a shared queue in Go?

I am trying to implement the least connections algorithm for a load balancer. I am using priority queue to keep the count of connections per server in a sorted order.
Here is the code:
server = spq[0]
serverNumber = server.value
updatedPriority = server.priority + 1 // Increment connection count for server
spq.update(server, serverNumber, updatedPriority)
targetUrl, err := url.Parse(configuration.Servers[serverNumber])
if err != nil {
log.Fatal(err)
}
// Send the request to the selected server
httputil.NewSingleHostReverseProxy(targetUrl).ServeHTTP(w, r)
updatedPriority = server.priority - 1 // Decrement connection count for server
spq.update(server, serverNumber, updatedPriority)
where spq is my priority queue.
This code will run for every request the balancer will receive.
But I am not getting correct results after logging the state of queue for every request.
For example in one case I saw the queue contained the same server twice with different priorities.
I am sure this has something to do with synchronising and locking the queue across the requests. But I am not sure what is the correct approach in this particular case.
If this is really your code that runs in multiple goroutines, then you clearly have a race.
I do not understand spq.update. At first it looks like it is a function that reorders the queue to have the server with minimum number of calls at element 0, but then why does it need both server and serverNumber? serverNumber appears to be a unique ID for the server, and since you already have the server, why do you need that?
In any case, you should have a sync.Mutex shared by all goroutines: lock the mutex before the first line and unlock it after spq.update, then lock it again after the proxy call and unlock when done. Also, the line that subtracts 1 from server.priority will only work if server is a pointer. If it is not a pointer, you're losing all the updates to server that happened during the call.
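To make the locking concrete, here is a self-contained sketch of a mutex-protected least-connections proxy. It replaces the priority queue with a plain slice scan purely to keep the example short, and the backend addresses are placeholders.

package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "sync"
)

type leastConn struct {
    mu      sync.Mutex
    servers []*url.URL
    active  []int // active[i] = open connections to servers[i]
}

// acquire picks the backend with the fewest active connections and
// increments its counter; the mutex guards both the scan and the update.
func (lc *leastConn) acquire() int {
    lc.mu.Lock()
    defer lc.mu.Unlock()
    best := 0
    for i := range lc.active {
        if lc.active[i] < lc.active[best] {
            best = i
        }
    }
    lc.active[best]++
    return best
}

func (lc *leastConn) release(i int) {
    lc.mu.Lock()
    lc.active[i]--
    lc.mu.Unlock()
}

func (lc *leastConn) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    i := lc.acquire()
    defer lc.release(i)
    httputil.NewSingleHostReverseProxy(lc.servers[i]).ServeHTTP(w, r)
}

func main() {
    lc := &leastConn{}
    for _, b := range []string{"http://127.0.0.1:9001", "http://127.0.0.1:9002"} {
        u, err := url.Parse(b)
        if err != nil {
            log.Fatal(err)
        }
        lc.servers = append(lc.servers, u)
        lc.active = append(lc.active, 0)
    }
    log.Fatal(http.ListenAndServe(":8080", lc))
}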

sql.DB on AWS Lambda: too many connections

As I understand it, in Go the DB handle is meant to be long-lived and shared between many goroutines.
But when I use Go with AWS Lambda, it's a very different story, since Lambda stops the function when it's finished.
I am using defer db.Close() in the Lambda invoke function, but it has no effect. MySQL still keeps the connection around as a Sleep query. As a result, it causes too many connections on MySQL.
Currently, I have to set wait_timeout in MySQL to a small number. But that's not the best solution, in my opinion.
Is there any way to close the connection when using the Go SQL driver with Lambda?
Thanks,
There are two problems that we need to address:
Correctly managing state between lambda invocations
Configuring a connection pool
Correctly managing state
Let us understand a bit of how the container is managed by AWS. From the AWS docs:
After a Lambda function is executed, AWS Lambda maintains the
execution context for some time in anticipation of another Lambda
function invocation. In effect, the service freezes the execution
context after a Lambda function completes, and thaws the context for
reuse, if AWS Lambda chooses to reuse the context when the Lambda
function is invoked again.
This execution context reuse approach has the following implications:
Any declarations in your Lambda function code (outside the handler
code, see Programming Model) remains initialized, providing additional
optimization when the function is invoked again. For example, if your
Lambda function establishes a database connection, instead of
reestablishing the connection, the original connection is used in
subsequent invocations. We suggest adding logic in your code to check
if a connection exists before creating one.
Each execution context provides 500MB of additional disk space in the
/tmp directory. The directory content remains when the execution
context is frozen, providing transient cache that can be used for
multiple invocations. You can add extra code to check if the cache has
the data that you stored. For information on deployment limits, see
AWS Lambda Limits.
Background processes or callbacks initiated by your Lambda function
that did not complete when the function ended resume if AWS Lambda
chooses to reuse the execution context. You should make sure any
background processes or callbacks (in case of Node.js) in your code
are complete before the code exits.
This first bullet point says that state is maintained between executions. Let us see this in action:
let counter = 0
module.exports.handler = (event, context, callback) => {
counter++
callback(null, { count: counter })
}
If you deploy this and call multiple times consecutively you will see that the counter will be incremented between calls.
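The same experiment in Go, for reference (any handler signature accepted by lambda.Start works here):

package main

import "github.com/aws/aws-lambda-go/lambda"

// counter is package-level, so it survives warm invocations of the same container.
var counter int

func handler() (int, error) {
    counter++
    return counter, nil
}

func main() {
    lambda.Start(handler)
}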
Now that you know that, you should not call defer db.Close(); instead you should reuse the database instance. You can do that by simply making db a package-level variable.
First, create a database package that will export an Open function:
package database

import (
    "fmt"
    "os"

    _ "github.com/go-sql-driver/mysql"
    "github.com/jinzhu/gorm"
)

var (
    host = os.Getenv("DB_HOST")
    port = os.Getenv("DB_PORT")
    user = os.Getenv("DB_USER")
    name = os.Getenv("DB_NAME")
    pass = os.Getenv("DB_PASS")
)

func Open() (db *gorm.DB) {
    args := fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
    // Initialize a new db connection.
    db, err := gorm.Open("mysql", args)
    if err != nil {
        panic(err)
    }
    return
}
Then use it in your handler.go file:
package main

import (
    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/jinzhu/gorm"

    "github.com/<username>/<name-of-lib>/database"
)

var db *gorm.DB

func init() {
    db = database.Open()
}

func Handler() (events.APIGatewayProxyResponse, error) {
    // You can use db here.
    return events.APIGatewayProxyResponse{
        StatusCode: 201,
    }, nil
}

func main() {
    lambda.Start(Handler)
}
Note: don't forget to replace github.com/<username>/<name-of-lib>/database with the right path.
Now, you might still see the too many connections error. If that happens you will need a connection pool.
Configuring a connection pool
From Wikipedia:
In software engineering, a connection pool is a cache of database
connections maintained so that the connections can be reused when
future requests to the database are required. Connection pools are
used to enhance the performance of executing commands on a database.
You will need a connection pool where the number of allowed connections is matched to the number of Lambdas running in parallel. You have two choices:
MySQL Proxy
MySQL Proxy is a simple program that sits between your client and
MySQL server(s) and that can monitor, analyze or transform their
communication. Its flexibility allows for a wide variety of uses,
including load balancing, failover, query analysis, query filtering
and modification, and many more.
AWS Aurora:
Amazon Aurora Serverless is an on-demand, auto-scaling configuration
for Amazon Aurora (MySQL-compatible edition), where the database will
automatically start up, shut down, and scale capacity up or down based
on your application's needs. It enables you to run your database in
the cloud without managing any database instances. It's a simple,
cost-effective option for infrequent, intermittent, or unpredictable
workloads.
Regardless of your choice, there are plenty of tutorials on the internet on how to configure both.
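Independently of an external pool, you can also cap what each container opens from the Go side. A minimal sketch, assuming the gorm v1 setup from the Open function above: gorm's DB() exposes the underlying *sql.DB, the limits shown are arbitrary, and "time" must be added to the imports.

func Open() (db *gorm.DB) {
    args := fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
    db, err := gorm.Open("mysql", args)
    if err != nil {
        panic(err)
    }
    // Keep each container's footprint small; tune to your MySQL max_connections.
    db.DB().SetMaxOpenConns(2)
    db.DB().SetMaxIdleConns(1)
    db.DB().SetConnMaxLifetime(5 * time.Minute)
    return
}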

Named pipes efficient asynchronous design

The problem:
To design an efficient and very fast named-pipes client server framework.
Current state:
I already have a battle-proven, production-tested framework. It is fast; however, it uses one thread per pipe connection, and if there are many clients the number of threads could quickly become too high. I already use a smart thread pool (a task pool, in fact) that can scale as needed.
I already use OVERLAPPED mode for pipes, but then I block with WaitForSingleObject or WaitForMultipleObjects, so that is why I need one thread per connection on the server side.
Desired solution:
The client is fine as it is, but on the server side I would like to use one thread per client request rather than per connection. So instead of using one thread for the whole lifecycle of a client (connect/disconnect) I would use one thread per task, i.e. only while the client is actually requesting data and no longer.
I saw an example on MSDN that uses an array of OVERLAPPED structures and then uses WaitForMultipleObjects to wait on them all. I find this a bad design. I see two problems here. First, you have to maintain an array that can grow quite large, and deletions will be costly. Second, you have a lot of events, one for each array member.
I also saw completion ports, like CreateIoCompletionPort and GetQueuedCompletionStatus, but I don't see how they are any better.
What I would like is something like what ReadFileEx and WriteFileEx do: they call a callback routine when the operation is completed. This is a true async style of programming. But the problem is that ConnectNamedPipe does not support that, and furthermore I saw that the thread needs to be in an alertable state and you need to call some of the *Ex functions to have that.
So how is such a problem best solved?
Here is how MSDN does it: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365603(v=vs.85).aspx
The problem I see with this approach is that I can't see how you could have 100 clients connected at once if the limit of WaitForMultipleObjects is 64 handles. Sure, I can disconnect the pipe after each request, but the idea is to have a permanent client connection, just like in a TCP server, and to track the client through its whole life-cycle, with each client having a unique ID and client-specific data.
The ideal pseudo code should be like this:
repeat
  // wait for a connection or for one client to send data
  Result = ConnectNamedPipe or ReadFile or Disconnect;
  case Result of
    CONNECTED:  CreateNewClient;        // we create a new client
    DATA:       AssignWorkerThread;     // here we process the client request in a thread
    DISCONNECT: CleanupAndDeleteClient; // release the client object and data
  end;
until Aborted;
This way we have only one listener thread that accepts connect / disconnect / onData events. The thread pool (worker threads) only processes the actual requests. This way 5 worker threads can serve a lot of connected clients.
P.S.
My current code should not be important. I code this in Delphi, but it's pure WinAPI so the language does not matter.
EDIT:
For now IOCP looks like the solution:
I/O completion ports provide an efficient threading model for
processing multiple asynchronous I/O requests on a multiprocessor
system. When a process creates an I/O completion port, the system
creates an associated queue object for requests whose sole purpose is
to service these requests. Processes that handle many concurrent
asynchronous I/O requests can do so more quickly and efficiently by
using I/O completion ports in conjunction with a pre-allocated thread
pool than by creating threads at the time they receive an I/O request.
If the server must handle more than 64 events (reads/writes) then any solution using WaitForMultipleObjects becomes unfeasible. This is the reason Microsoft introduced I/O completion ports to Windows. They can handle a very high number of I/O operations using the most appropriate number of threads (usually the number of processors/cores).
The problem with IOCP is that it is very difficult to implement right. Hidden issues are spread like mines in a field: [1], [2] (section 3.6). I would recommend using some framework. A little googling suggests something called Indy for Delphi developers. There are maybe others.
At this point I would disregard the requirement for named pipes if that means coding my own IOCP implementation. It's not worth the grief.
I think what you're overlooking is that you only need a few listening named pipe instances at any given time. Once a pipe instance has connected, you can spin that instance off and create a new listening instance to replace it.
With MAXIMUM_WAIT_OBJECTS (or fewer) listening named pipe instances, you can have a single thread dedicated to listening using WaitForMultipleObjectsEx. The same thread can also handle the rest of the I/O using ReadFileEx and WriteFileEx and APCs. The worker threads would queue APCs to the I/O thread in order to initiate I/O, and the I/O thread can use the task pool to return the results (as well as letting the worker threads know about new connections).
The I/O thread main function would look something like this:
DWORD index, result, byte_count, err;

create_events();
for (index = 0; index < MAXIMUM_WAIT_OBJECTS; index++) new_pipe(index);
for (;;)
{
if (service_stopping && active_instances == 0) break;
result = WaitForMultipleObjectsEx(MAXIMUM_WAIT_OBJECTS, connect_events,
FALSE, INFINITE, TRUE);
if (result == WAIT_IO_COMPLETION)
{
continue;
}
else if (result >= WAIT_OBJECT_0 &&
result < WAIT_OBJECT_0 + MAXIMUM_WAIT_OBJECTS)
{
index = result - WAIT_OBJECT_0;
ResetEvent(connect_events[index]);
if (GetOverlappedResult(
connect_handles[index], &connect_overlapped[index],
&byte_count, FALSE))
{
err = ERROR_SUCCESS;
}
else
{
err = GetLastError();
}
connect_pipe_completion(index, err);
continue;
}
else
{
fail();
}
}
The only real complication is that when you call ConnectNamedPipe it may return ERROR_PIPE_CONNECTED to indicate that the call succeeded immediately or an error other than ERROR_IO_PENDING if the call failed immediately. In that case you need to reset the event and then handle the connection:
void new_pipe(ULONG_PTR dwParam)
{
DWORD index = dwParam;
DWORD err;
connect_handles[index] = CreateNamedPipe(
pipe_name,
PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED,
PIPE_TYPE_MESSAGE | PIPE_WAIT | PIPE_ACCEPT_REMOTE_CLIENTS,
MAX_INSTANCES,
512,
512,
0,
NULL);
if (connect_handles[index] == INVALID_HANDLE_VALUE) fail();
ZeroMemory(&connect_overlapped[index], sizeof(OVERLAPPED));
connect_overlapped[index].hEvent = connect_events[index];
if (ConnectNamedPipe(connect_handles[index], &connect_overlapped[index]))
{
err = ERROR_SUCCESS;
}
else
{
err = GetLastError();
if (err == ERROR_SUCCESS) err = ERROR_INVALID_FUNCTION;
if (err == ERROR_PIPE_CONNECTED) err = ERROR_SUCCESS;
}
if (err != ERROR_IO_PENDING)
{
ResetEvent(connect_events[index]);
connect_pipe_completion(index, err);
}
}
The connect_pipe_completion function would create a new task in the task pool to handle the newly connected pipe instance, and then queue an APC to call new_pipe to create a new listening pipe at the same index.
It is possible to reuse existing pipe instances once they are closed but in this situation I don't think it's worth the hassle.
