nodejs + grpc-node server much slower than REST - protocol-buffers

I have implemented 2 services A and B, where A can talk to B via both gRPC (using grpc-node with Mali) and pure HTTP REST calls.
The request size is negligible.
The response size is 1000 items that look like this:
{
  "productId": "product-0",
  "description": "some-text",
  "price": {
    "currency": "GBP",
    "value": "12.99"
  },
  "createdAt": "2020-07-12T18:03:46.443Z"
}
Both A and B are deployed in GKE as services, and they communicate over the internal network using kube-proxy.
What I discovered is that the REST version is a lot faster than gRPC: the REST call's p99 sits at < 1 s, while the gRPC call's p99 can go over 30 s.
Details
Node version and OS: node:14.7.0-alpine3.12
Dependencies:
"google-protobuf": "^3.12.4",
"grpc": "^1.24.3",
"mali": "^0.21.0",
I have even enabled client-side TCP pooling by setting the gRPC option grpc.use_local_subchannel_pool=1, but this did not seem to help.
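For illustration only, here is roughly how such a channel option is passed when a channel is created. This sketch uses the Python grpc package rather than grpc-node, and the target address is a placeholder, not taken from the setup above:

import grpc

# Placeholder target; in the real setup service B would be reached via its cluster DNS name.
channel = grpc.insecure_channel(
    "service-b:50051",
    options=[("grpc.use_local_subchannel_pool", 1)],  # same channel argument as mentioned above
)

In grpc-node the equivalent is the channel-options object passed to the generated client's constructor.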
The problem seems to be on the server side: I can see from the logs that the grpc library's call.startBatch call took many seconds to send ~51 KB of data. This is way slower than the REST version.
I also checked that the CPU and network of the services are healthy. The REST version could send > 2 Mbps, whereas the gRPC version only manages ~150 kbps.
Running netstat on service B (in gRPC) shows a number of ESTABLISHED TCP connections (as expected because of TCP pooling).
My suspicion is that the grpc-core C++ code is somehow less optimal than REST, but I have no proof.
Any ideas where I should look next? Thanks for any help.
Update 1
Here're some benchmarks:
Setup
Blazemeter --REST--> service A --gRPC/REST--> service B
request body (both legs) is negligible
service A is a node service + Koa
service B has 3 options:
grpc-node: node with grpc-node
gRPC + Go: Go implementation of the same gRPC service
REST + Koa: node with Koa
Blazemeter --> service A: response payload is negligible, and the same for all tests
service A --> service B: the gRPC/REST response payload is 1000 ProductPrice messages:
message ProductPrice {
  string product_id = 1;                    // Hard coded to "product-x", x in [0 ... 999]
  string description = 2;                   // Hard coded to random string, length = 10
  Money price = 3;
  google.protobuf.Timestamp created_at = 4; // Hard coded
}

message Money {
  Currency currency = 1; // Hard coded to GBP
  string value = 2;      // Hard coded to "12.99"
}

enum Currency {
  CURRENCY_UNKNOWN = 0;
  GBP = 1;
}
The services are deployed to Kubernetes on GCP:
instance type: n1-highcpu-4
5 pods each service
2 CPU, 1 GB memory each pod
kube-proxy using cluster IP (not going via the internet); I've also tested a headless service with clusterIP: None, which gave similar results
Load
50rps
Results
(Benchmark charts omitted: service B using grpc-node; service B using Go gRPC; service B using REST with Koa; Network IO)
Observations
gRPC + Go is roughly on par with REST (I thought gRPC would be faster)
grpc-node is 4x slower than REST
Network isn't the bottleneck

Related

Why does Triton serving with shared memory fail when running multiple uvicorn workers in order to send multiple requests concurrently to the models?

I run a model in Triton serving with shared memory and it works correctly.
To simulate the backend structure, I wrote a FastAPI app for my model and ran it with gunicorn with 6 workers. Then I wrote another FastAPI app to route locust requests to my first FastAPI app, as in the image below (pseudo code). My second FastAPI app runs with uvicorn. The problem is that when I use multiple workers for uvicorn, Triton serving fails on the shared memory.
Note: without shared memory everything works, but my response time is much longer than with the shared memory option, so I need to use shared memory.
Here is my Triton client code. My client code has a function named predict which uses requestGenerator to share the input_simple and output_simple spaces.
This is my requestGenerator generator:
def requestGenerator(self, triton_client, batched_img_data, input_name, output_name, dtype, batch_data):
    triton_client.unregister_system_shared_memory()
    triton_client.unregister_cuda_shared_memory()

    output_simple = "output_simple"
    input_simple = "input_simple"

    input_data = np.ones(
        shape=(batch_data, 3, self.width, self.height), dtype=np.float32)
    input_byte_size = input_data.size * input_data.itemsize
    output_byte_size = input_byte_size * 2

    shm_op0_handle = shm.create_shared_memory_region(
        output_name, output_simple, output_byte_size)
    triton_client.register_system_shared_memory(
        output_name, output_simple, output_byte_size)

    shm_ip0_handle = shm.create_shared_memory_region(
        input_name, input_simple, input_byte_size)
    triton_client.register_system_shared_memory(
        input_name, input_simple, input_byte_size)

    inputs = []
    inputs.append(
        httpclient.InferInput(input_name, batched_img_data.shape, dtype))
    inputs[0].set_data_from_numpy(batched_img_data, binary_data=True)

    outputs = []
    outputs.append(
        httpclient.InferRequestedOutput(output_name,
                                        binary_data=True))

    inputs[-1].set_shared_memory(input_name, input_byte_size)
    outputs[-1].set_shared_memory(output_name, output_byte_size)

    yield inputs, outputs, shm_ip0_handle, shm_op0_handle
This is my predict function:
def predict(self, triton_client, batched_data, input_layer, output_layer, dtype):
    responses = []
    results = None

    for inputs, outputs, shm_ip_handle, shm_op_handle in self.requestGenerator(
            triton_client, batched_data, input_layer, output_layer, dtype,
            len(batched_data)):
        self.sent_count += 1
        shm.set_shared_memory_region(shm_ip_handle, [batched_data])
        responses.append(
            triton_client.infer(model_name=self.model_name,
                                inputs=inputs,
                                request_id=str(self.sent_count),
                                model_version="",
                                outputs=outputs))

    output_buffer = responses[0].get_output(output_layer)
    if output_buffer is not None:
        results = shm.get_contents_as_numpy(
            shm_op_handle, triton_to_np_dtype(output_buffer['datatype']),
            output_buffer['shape'])

    triton_client.unregister_system_shared_memory()
    triton_client.unregister_cuda_shared_memory()
    shm.destroy_shared_memory_region(shm_ip_handle)
    shm.destroy_shared_memory_region(shm_op_handle)

    return results
Any help would be appreciated on how to use multiple uvicorn workers to send multiple requests concurrently to my Triton code without it failing.
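One direction to check, stated purely as an assumption rather than a confirmed cause: every uvicorn worker process runs this same requestGenerator, so all workers register system shared-memory regions under the same fixed names and keys (input_simple, output_simple) and also call the blanket unregister_system_shared_memory(), which lets them clash with or invalidate each other's regions. A minimal sketch of giving each worker process its own region names, reusing the same tritonclient calls as above (the helper name is hypothetical):

import os
import tritonclient.utils.shared_memory as shm

def register_worker_regions(triton_client, input_byte_size, output_byte_size):
    # Hypothetical helper: suffix the region name/key with the worker PID so that
    # several uvicorn worker processes do not reuse the same registration.
    suffix = "_%d" % os.getpid()
    input_region = "input_simple" + suffix
    output_region = "output_simple" + suffix

    shm_ip_handle = shm.create_shared_memory_region(
        input_region, input_region, input_byte_size)
    triton_client.register_system_shared_memory(
        input_region, input_region, input_byte_size)

    shm_op_handle = shm.create_shared_memory_region(
        output_region, output_region, output_byte_size)
    triton_client.register_system_shared_memory(
        output_region, output_region, output_byte_size)

    return input_region, shm_ip_handle, output_region, shm_op_handle

The set_shared_memory(...) calls on the inputs and outputs would then reference these per-worker region names, and each worker would unregister and destroy only its own regions instead of calling the global unregister methods.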

Backpressure with Reactor's ParallelFlux + timeouts

I'm currently working on using parallelism in a Flux. Right now I'm having problems with the backpressure. In our case we have a fast-producing service we want to consume, but we are much slower.
With a normal Flux this works so far, but we want to have parallelism. What I see when I'm using the approach with
.parallel(2)
.runOn(Schedulers.parallel())
is that there is a big request at the beginning, which takes quite a long time to process. A different problem also occurs here: if we take too long to process, we somehow seem to generate a cancel event in the producer service (we consume it via a WebFlux REST call), but no cancel event is seen in the consumer.
But back to problem 1: how is it possible to bring this thing back into sync? I know of the prefetch parameter on the .parallel() method, but it does not work as I expect.
A minimal example would be something like this:
fun main() {
    val atomicInteger = AtomicInteger(0)
    val receivedCount = AtomicInteger(0)
    val processedCount = AtomicInteger(0)

    Flux.generate<Int> {
        it.next(atomicInteger.getAndIncrement())
        println("Emitted ${atomicInteger.get()}")
    }.doOnEach { it.get()?.let { receivedCount.addAndGet(1) } }
        .parallel(2, 1)
        .runOn(Schedulers.parallel())
        .flatMap {
            Thread.sleep(200)
            log("Work on $it")
            processedCount.addAndGet(1)
            Mono.just(it * 2)
        }.subscribe {
            log("Received ${receivedCount.get()} and processed ${processedCount.get()}")
        }

    Thread.sleep(25000)
}
where I can observe logs like this
...
Emitted 509
Emitted 510
Emitted 511
Emitted 512
Emitted 513
2022-02-02T14:12:58.164465Z - Thread[parallel-1,5,main] Work on 0
2022-02-02T14:12:58.168469Z - Thread[parallel-2,5,main] Work on 1
2022-02-02T14:12:58.241966Z - Thread[parallel-1,5,main] Received 513 and processed 2
2022-02-02T14:12:58.241980Z - Thread[parallel-2,5,main] Received 513 and processed 2
2022-02-02T14:12:58.442218Z - Thread[parallel-2,5,main] Work on 3
2022-02-02T14:12:58.442215Z - Thread[parallel-1,5,main] Work on 2
2022-02-02T14:12:58.442315Z - Thread[parallel-2,5,main] Received 513 and processed 3
2022-02-02T14:12:58.442338Z - Thread[parallel-1,5,main] Received 513 and processed 4
So how could I adjust this so that I can use the parallelism but stay in backpressure/sync with my producer? The only way I got it to work is with a semaphore acquired before the parallel Flux and released after the work, but that is not really a nice solution.
OK, for this scenario it turned out to be crucial that the prefetch of parallel and runOn be set very low, here to 1.
With the default of 256, we requested too much from our producer, so that a cancel event already occurred because of the long gap between the first block of requests (filling the prefetch) and the next one, when the Flux decided to fill the buffer again.

Why and how is the quota "critical read requests" exceeded when using batchCreateContacts

I'm programming a contacts export from our database to Google Contacts using the Google People API. I'm programming the requests over URL via Google Apps Script.
The code below - using https://people.googleapis.com/v1/people:batchCreateContacts - works for 13 to about 15 single requests, but then Google returns this error message:
Quota exceeded for quota metric 'Critical read requests (Contact and Profile Reads)' and limit 'Critical read requests (Contact and Profile Reads) per minute per user' of service 'people.googleapis.com' for consumer 'project_number:***'.
For speed, I send the requests in batches of 10 parallel requests.
I have the following two questions regarding this problem:
Why, for creating contacts, would I hit a quota regarding read requests?
Given the picture linked below, why would sending 2 batches of 10 simultaneous requests (more precisely: 13 to 15 single requests) hit that quota limit anyway?
(Screenshot: quota limit of 90 read requests per user per minute, as displayed on console.cloud.google.com)
Thank you for any clarification!
Further reading: https://developers.google.com/people/api/rest/v1/people/batchCreateContacts
let payloads = [];
let lengthPayloads;
let limitPayload = 200;

/* Break up contacts into payload limits */
contacts.forEach(function (contact, index) /* contacts is an array of objects for the API */
{
  if(!(index%limitPayload))
  {
    lengthPayloads = payloads.push(
      {
        'readMask': "userDefined",
        'sources': ["READ_SOURCE_TYPE_CONTACT"],
        'contacts': []
      }
    );
  }
  payloads[lengthPayloads-1]['contacts'].push(contact);
}
);
Logger.log("which makes "+payloads.length+" payloads");

let parallelRequests = [];
let lengthParallelRequests;
let limitParallelRequest = 10;

/* Break up payloads into parallel request limits */
payloads.forEach(function (payload, index)
{
  if(!(index%limitParallelRequest))
    lengthParallelRequests = parallelRequests.push([]);
  parallelRequests[lengthParallelRequests-1].push(
    {
      'url': "https://people.googleapis.com/v1/people:batchCreateContacts",
      'method': "post",
      'contentType': "application/json",
      'payload': JSON.stringify(payload),
      'headers': { 'Authorization': "Bearer " + token }, /* token is a token of a single user */
      'muteHttpExceptions': true
    }
  );
}
);
Logger.log("which makes "+parallelRequests.length+" parallelrequests");

let responses;
parallelRequests.forEach(function (parallelRequest)
{
  responses = UrlFetchApp.fetchAll(parallelRequest); /* error occurs here */
  responses = responses.map(function (response) { return JSON.parse(response.getContentText()); });
  responses.forEach(function (response)
  {
    if(response.error)
    {
      Logger.log(JSON.stringify(response));
      throw response;
    }
    else Logger.log("ok");
  }
  );
}
);
Output of logs:
which makes 22 payloads
which makes 3 parallelrequests
ok (15 times)
(the error message)
I had raised the same issue in Google's issue tracker.
It seems that a single BatchCreateContacts or BatchUpdateContacts call consumes six (6) "Critical Read Requests" quota units per request; with the limit of 90 such units per user per minute, that allows roughly 15 calls per minute, which matches the failure after 13 to 15 requests. I still did not get an answer as to why we are hitting a critical read request limit for creating/updating contacts.
Quota exceeded for quota metric 'Critical read requests (Contact and Profile Reads)' and limit 'Critical read requests (Contact and Profile Reads) per minute per user' of service 'people.googleapis.com' for consumer 'project_number:***'.
There are two types of quotas: project-based quotas and user-based quotas. Project-based quotas are limits placed upon your project itself. User-based quotas are more like flood protection; they limit the number of requests a single user can make over a period of time.
When you send a batch request with 10 requests in it, it counts as ten requests, not as a single batch request. If you are trying to run this in parallel, then you are definitely going to overflow the requests-per-minute-per-user quota.
Slow down, this is not a race.
Why, for creating contacts, would I hit a quota regarding read requests?
I would chalk it up to a bad error message.
Given the picture linked below, why would sending 13 to 15 requests hit that quota limit anyway? (There are 3 read requests before this code.) Quota limit of 90 read requests per user per minute as displayed on console.cloud.google.com.
Well, you are sending 13 * 10 = 130 requests per minute, which would exceed the requests-per-minute limit. There is also no way of knowing how fast your system is running; it could be going faster, and it will depend upon what else the server is doing at the time it gets your requests and in which minute they are actually being recorded.
My advice is to just respect the quota limits and not try to understand why; there are too many variables on Google's servers to be able to track down what exactly a minute is. You could send 100 requests in 10 seconds and then try to send another 100 at 55 seconds and get the error; you could also get the error after 65 seconds, depending on when they hit the server and when the server finished processing your initial 100 requests.
Again, slow down.
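For what it's worth, a minimal sketch of that "slow down" advice as plain client-side throttling. Python is used here purely for illustration (the original code is Apps Script, where Utilities.sleep() would play the same role), and the per-minute budget is an assumption you would read off your own quota page, e.g. 90 quota units per minute divided by 6 units per call is roughly 15 calls per minute:

import time

def send_throttled(batches, send_one, max_per_minute=15):
    # batches: iterable of request payloads (placeholder)
    # send_one: callable that performs one API call (placeholder)
    interval = 60.0 / max_per_minute          # minimum spacing between calls
    last_sent = 0.0
    for batch in batches:
        wait = interval - (time.monotonic() - last_sent)
        if wait > 0:
            time.sleep(wait)                  # pace the calls instead of firing them in parallel
        last_sent = time.monotonic()
        send_one(batch)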

zmq.error.ZMQError: Cannot assign requested address

I have the following pull - publisher ZMQ scheme on an Amazon EC2 machine:
I am working with the public IP address of my EC2 machine.
I am trying to send data via a ZMQ PUSH socket on the client side to a ZMQ PULL socket on the server side, which is this:
import zmq
from zmq.log.handlers import PUBHandler
import logging
# from zmq.asyncio import Context

def main():
    ctx = zmq.Context()

    publisher = ctx.socket(zmq.PUB)
    # publisher.bind("tcp://*:5557")
    publisher.bind("tcp://54.89.25.43:5557")

    handler = PUBHandler(publisher)
    logger = logging.getLogger()
    logger.addHandler(handler)

    print("Network Manager CNVSS Broker listening")

    collector = ctx.socket(zmq.PULL)
    # collector.bind("tcp://*:5558")
    collector.bind("tcp://54.89.25.43:5558")

    while True:
        message = collector.recv()
        print("Publishing update %s" % message)
        publisher.send(message)

if __name__ == '__main__':
    main()
But when I execute this script, I get this error:
(cnvss_nm) ubuntu#ip-172-31-55-72:~/cnvss_nm$ python pull_pub-nm.py
Traceback (most recent call last):
File "pull_pub-nm.py", line 28, in <module>
main()
File "pull_pub-nm.py", line 10, in main
publisher.bind("tcp://54.89.25.43:5557")
File "zmq/backend/cython/socket.pyx", line 547, in zmq.backend.cython.socket.Socket.bind
File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc
zmq.error.ZMQError: Cannot assign requested address
(cnvss_nm) ubuntu#ip-172-31-55-72:~/cnvss_nm$
I've changed the addresses to publisher.bind("tcp://*:5557") and collector.bind("tcp://*:5558") on the server side, and my script runs:
(cnvss_nm) ubuntu#ip-x-x-x-x:~/cnvss_nm$ python pull_pub-nm.py
Network Manager CNVSS Broker listening
But with my client-side code (added recently), no data is sent.
#include <zmq.hpp>
#include <zmq.h>
#include <iostream>
#include "zhelpers.hpp"

using namespace std;

int main(int argc, char *argv[])
{
    zmq::context_t context(1);

    /* std::cout << "Sending message to NM Server…\n" << std::endl; */

    zmq::socket_t subscriber(context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5557");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);

    zmq::socket_t sender(context, ZMQ_PUSH);
    sender.connect("tcp://localhost:5558");

    string firstMessage = "Hola, soy el cliente 1";

    while (1)
    {
        // Wait for next request from client
        std::string string = s_recv(subscriber);
        std::cout << "Received request: " << string << std::endl;

        // Do some 'work'
        // sleep(1);

        // Send reply back to client
        // zmq::message_t message(firstMessage.size() + 1);
        // Either of the two can be used
        // memcpy(message.data(), firstMessage.c_str(), firstMessage.size() + 1);
        // s_send(sender, "Hola soy un responder 1");
        // sender.send(message);
    }
}
I think my problem is in my EC2 machine's network configuration or in the way the server's IP address is set up.
When I test the clients and server locally, it all works perfectly.
Is there any possibility of performing some forwarding or NAT operation on my EC2 machine?
My clients do not reach the server.
I have security group rules for the above-mentioned ports 5557 and 5558.
How do I solve this?
I had a similar situation where I was using ZMQ on EC2 and getting "Cannot assign requested address." I was also using Elastic IP as suggested in the answer, but it did not work for me. It turned out that on EC2, the sending side (ZMQ.PUSH) needs to bind to the private IP rather than to the public, while the receiving side needs to bind to the public IP, so trying to bind the server to Elastic IP caused the error. After I changed it to bind the server ZMQ.PUSH side to Private IP and the client ZMQ.PULL to Elastic IP (on the same port), it worked.
How to solve this inconvenience?
1) If in doubt about the EC2 addresses, first try to test the reversed .bind() / .connect(), so that the EC2-side localhost address assignments are out of the game and your connectivity proof towards a known IP address does not depend on the EC2-side settings.
2) Next, given there are no details about the client-side part of the MCVE, I may have got the scenario wrong, so bear with me; there are only these compatible ZeroMQ Scalable Formal Communication Archetype socket matches available, up to API v4.2.x in 2018/Q2:
{ PUB:  [ SUB,
          XSUB,
          None
        ],
  PULL: [ PUSH,
          None
        ],
  ...
}
3) It is good engineering practice not to let unhandled exceptions happen, all the more so if a Context() instance may still hold possession of an IP:PORT# (b)locked resource (sometimes even beyond the Python process termination; I have had many incidents with my own naive and thus deadlocked experiments in my past dark history :o).
Each step in the infrastructure setup ought to be wrapped in an error-handling clause, ideally including a finally: section where the resources created so far get dismantled gracefully when exception(s) spring out, as sketched below. This way your code will not leave behind forever-hanging orphans whose only remedy is to reboot the platform to get rid of these otherwise unsalvageable hostages.
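A minimal sketch of that cleanup discipline (the port and socket type are placeholders, not the code from the question):

import zmq

def main():
    ctx = zmq.Context()
    collector = None
    try:
        collector = ctx.socket(zmq.PULL)
        collector.setsockopt(zmq.LINGER, 0)   # do not block on close with unsent data
        collector.bind("tcp://*:5558")        # placeholder endpoint
        while True:
            print(collector.recv())
    except zmq.ZMQError as exc:
        print("ZMQ setup/recv failed:", exc)
    finally:
        # Always release the port and the context, even on errors,
        # so no orphaned socket keeps the address locked.
        if collector is not None:
            collector.close()
        ctx.term()

if __name__ == "__main__":
    main()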
Problem solved, a final summary:
The initially indicated problem (diagnosed at the .bind() / .connect() phase) was, as depicted earlier, related to the Amazon EC2 instance's IP address mapping, i.e. the localhost:port# term needed for any transport-class endpoint setup.
camdebu on Nov 1, 2012 5:07 PM explained all the steps needed: Set up an Elastic IP for your EC2 instance. You will then have a static IP address. There's no cost for the Elastic IP as long as you have it pointed at an EC2 instance.
You should then have no problem connecting to your new IP address and port, as long as your security group is set up correctly.
-Cam-
Check your Security Group rules. Make sure you allow the port to communicate from outside the instance. (Enable All TCP and check.) [added by Yesu Jeya Bensh.P]
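For completeness, a minimal sketch of the working pattern described above: the server side binds to all local interfaces (the public/Elastic IP is not a local interface, hence the original "Cannot assign requested address"), and the remote client connects to the public address. The address shown is a placeholder:

import zmq

# Server side (on the EC2 instance): bind to all local interfaces.
ctx = zmq.Context()
collector = ctx.socket(zmq.PULL)
collector.bind("tcp://*:5558")
publisher = ctx.socket(zmq.PUB)
publisher.bind("tcp://*:5557")

# Client side (running elsewhere): connect to the instance's public/Elastic IP.
# sender = client_ctx.socket(zmq.PUSH)
# sender.connect("tcp://<elastic-ip>:5558")        # placeholder address
# subscriber = client_ctx.socket(zmq.SUB)
# subscriber.connect("tcp://<elastic-ip>:5557")
# subscriber.setsockopt_string(zmq.SUBSCRIBE, "")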
The recently posted client code, however, shows another issue: a mutual block, generated by a non-cooperating zmq::socket_t sender(context, ZMQ_PUSH) which actually never sends a single message.
Given the client goes into the while(1) loop as posted above, the associated peer will inadvertently get into an unsalvageable blocked state inside the Python-made main(), since:
def main():
    ...
    collector = ctx.socket( zmq.PULL )
    #ollector.bind( "tcp://*:5558" )
    collector.bind( "tcp://54.89.25.43:5558" )
    while True:
        message = collector.recv() # THIS SLOC WILL BLOCK FOREVER HERE,
        ...                        # GIVEN <sender> NEVER SENDS...
so more care needs to be taken to make the flow of events robust enough never to fall into this or a similar unsalvageable mutual block.

Akka actors and clustering: I'm having trouble with ClusterSingletonManager - unhandled event in state Start

I've got a system that uses Akka 2.2.4 which creates a bunch of local actors and sets them as the routees of a Broadcast Router. Each worker handles some segment of the total work, according to some hash range we pass it. It works great.
Now, I've got to cluster this application for failover. Based on the requirement that only one worker per hash range exist/be triggered in the cluster, it seems to me that setting up each one as a ClusterSingletonManager would make sense; however, I'm having trouble getting it working. The actor system starts up, it creates the ClusterSingletonManager, it adds the path in the code cited below to a Broadcast Router, but it never instantiates my actual worker actor to handle my messages for some reason. All I get is a log message: "unhandled event ${my message} in state Start". What am I doing wrong? Is there something else I need to do to start up this single-instance cluster? Am I sending the wrong actor a message?
Here's my Akka config (I use the default config as a fallback):
akka {
  cluster {
    roles = ["workerSystem"]
    min-nr-of-members = 1
    role {
      workerSystem.min-nr-of-members = 1
    }
  }
  daemonic = true
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "127.0.0.1"
      port = ${akkaPort}
    }
  }
  actor {
    provider = akka.cluster.ClusterActorRefProvider
    single-message-bound-mailbox {
      # FQCN of the MailboxType. The Class of the FQCN must have a public
      # constructor with
      # (akka.actor.ActorSystem.Settings, com.typesafe.config.Config) parameters.
      mailbox-type = "akka.dispatch.BoundedMailbox"
      # If the mailbox is bounded then it uses this setting to determine its
      # capacity. The provided value must be positive.
      # NOTICE:
      # Up to version 2.1 the mailbox type was determined based on this setting;
      # this is no longer the case, the type must explicitly be a bounded mailbox.
      mailbox-capacity = 1
      # If the mailbox is bounded then this is the timeout for enqueueing
      # in case the mailbox is full. Negative values signify infinite
      # timeout, which should be avoided as it bears the risk of dead-lock.
      mailbox-push-timeout-time = 1
    }
    worker-dispatcher {
      type = PinnedDispatcher
      executor = "thread-pool-executor"
      # Throughput defines the number of messages that are processed in a batch
      # before the thread is returned to the pool. Set to 1 for as fair as possible.
      throughput = 500
      thread-pool-executor {
        # Keep alive time for threads
        keep-alive-time = 60s
        # Min number of threads to cap factor-based core number to
        core-pool-size-min = ${workerCount}
        # The core pool size factor is used to determine thread pool core size
        # using the following formula: ceil(available processors * factor).
        # Resulting size is then bounded by the core-pool-size-min and
        # core-pool-size-max values.
        core-pool-size-factor = 3.0
        # Max number of threads to cap factor-based number to
        core-pool-size-max = 64
        # Minimum number of threads to cap factor-based max number to
        # (if using a bounded task queue)
        max-pool-size-min = ${workerCount}
        # Max no of threads (if using a bounded task queue) is determined by
        # calculating: ceil(available processors * factor)
        max-pool-size-factor = 3.0
        # Max number of threads to cap factor-based max number to
        # (if using a bounded task queue)
        max-pool-size-max = 64
        # Specifies the bounded capacity of the task queue (< 1 == unbounded)
        task-queue-size = -1
        # Specifies which type of task queue will be used, can be "array" or
        # "linked" (default)
        task-queue-type = "linked"
        # Allow core threads to time out
        allow-core-timeout = on
      }
      fork-join-executor {
        # Min number of threads to cap factor-based parallelism number to
        parallelism-min = 1
        # The parallelism factor is used to determine thread pool size using the
        # following formula: ceil(available processors * factor). Resulting size
        # is then bounded by the parallelism-min and parallelism-max values.
        parallelism-factor = 3.0
        # Max number of threads to cap factor-based parallelism number to
        parallelism-max = 1
      }
    }
  }
}
Here's where I create my actors (it's written in Groovy):
Props clusteredProps = ClusterSingletonManager.defaultProps("worker".toString(), PoisonPill.getInstance(), "workerSystem",
    new ClusterSingletonPropsFactory() {
        @Override
        Props create(Object handOverData) {
            log.info("called in ClusterSingetonManager")
            Props.create(WorkerActorCreator.create(applicationContext, it.start, it.end)).withDispatcher("akka.actor.worker-dispatcher").withMailbox("akka.actor.single-message-bound-mailbox")
        }
    })
ActorRef manager = system.actorOf(clusteredProps, "worker-${it.start}-${it.end}".toString())
String path = manager.path().child("worker").toString()
path
When I try to send a message to the actual worker actor, should the path above resolve? Currently it does not.
What am I doing wrong? Also, these actors live within a Spring application, and the worker actors are set up with some @Autowired dependencies. While this Spring integration worked well in a non-clustered environment, are there any gotchas in a clustered environment I should be looking out for?
Thank you.
FYI: I've also posted this in the akka-user Google group. Here's the link.
The path in your code is to the ClusterSingletonManager actor that you start on each node with role "workerSystem". It will create a child actor (WorkerActor) with name "worker-${it.start}-${it.end}" on the oldest node in the cluster, i.e. singleton within the cluster.
You should also define the name of the ClusterSingletonManager, e.g. system.actorOf(clusteredProps, "workerSingletonManager").
You can't send the messages to the ClusterSingletonManager. You must send them to the path of the active worker, i.e. including the address of the oldest node. That is illustrated by the ConsumerProxy in the documentation.
I'm not sure you should use a singleton at all for this. All workers will be running on the same node, the oldest. I would prefer to discuss alternative solutions to your problem at the akka-user google group.
