I get a rejected-execution exception on the index thread pool when indexing documents concurrently:
rejected execution of org.elasticsearch.transport.TcpTransport$RequestHandler@6d1cb827
on EsThreadPoolExecutor
[
index,
queue capacity = 200,
org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@5d6lae0c
[
Running,
pool size = 32,
active threads = 32,
queued tasks = 312,
completed tasks = 32541513
]
]
I tried querying the URL below, but the queue field for the index pool is always 0.
/_cat/thread_pool?v&h=id,type,name,size,largest,active,queue_size,queue
Question 1: the queue capacity is 200, so why is the number of queued tasks 312 (over 200)?
Question 2: how can I view the number of tasks currently in the queue?
Related
I am using SockJS and Stomp to send files to a backend API to be processed.
My problem is that the upload function gets stuck if more than two uploads run at the same time.
Ex:
User 1 -> upload a file -> backend is receiving correctly the file
User 2 -> upload a file -> backend is receiving correctly the file
User 3 -> upload a file -> the backend is not called until one of the
previous uploads has completed.
(after a minute User 1 completes its upload and the third upload starts)
The error I can see through the log is the following:
2021-06-28 09:43:34,884 INFO [MessageBroker-1] org.springframework.web.socket.config.WebSocketMessageBrokerStats.lambda$initLoggingTask$0: WebSocketSession[11 current WS(5)-HttpStream(6)-HttpPoll(0), 372 total, 26 closed abnormally (26 connect failure, 0 send limit, 16 transport error)], stompSubProtocol[processed CONNECT(302)-CONNECTED(221)-DISCONNECT(0)], stompBrokerRelay[null], **inboundChannel[pool size = 2, active threads = 2**, queued tasks = 263, completed tasks = 4481], outboundChannel[pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 607], sockJsScheduler[pool size = 1, active threads = 1, queued tasks = 14, completed tasks = 581444]
It seems clear that the pool is full:
inboundChannel[pool size = 2, active threads = 2
but I really cannot find a way to increase the size.
This is the code:
Client side
ws = new SockJS(host + "/createTender");
stompClient = Stomp.over(ws);
Server side configuration
@EnableWebSocketMessageBroker
public class WebSocketBrokerConfig extends AbstractWebSocketMessageBrokerConfigurer {
...
...
@Override
public void configureWebSocketTransport(WebSocketTransportRegistration registration) {
registration.setMessageSizeLimit(100240 * 10240);
registration.setSendBufferSizeLimit(100240 * 10240);
registration.setSendTimeLimit(20000);
}
I've already tried changing the configureWebSocketTransport parameters, but it did not work.
How can I increase the pool size of the socket?
The thread pool of the WebSocket inbound channel can be configured by overriding this method:
@Override
public void configureClientInboundChannel(ChannelRegistration registration) {
registration.taskExecutor().corePoolSize(4);
registration.taskExecutor().maxPoolSize(4);
}
The official documentation suggests a pool size equal to the number of cores. Once all threads in the pool are busy, further requests wait in an internal queue. With this configuration, 4 requests can be processed concurrently.
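The queueing behaviour described above can be illustrated outside Spring. The following is a minimal Python sketch, not the Spring API: `POOL_SIZE`, `NUM_REQUESTS` and `handle_request` are hypothetical names, and the sleep stands in for real upload handling. It submits more requests than there are threads and records the highest number that ran at the same time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4      # mirrors corePoolSize(4) / maxPoolSize(4) above
NUM_REQUESTS = 10  # hypothetical number of simultaneous uploads

lock = threading.Lock()
running = 0        # requests being handled right now
peak = 0           # highest concurrency observed so far


def handle_request(request_id: int) -> int:
    """Pretend to process one upload and track how many run at once."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)  # simulated work
    with lock:
        running -= 1
    return request_id


with ThreadPoolExecutor(max_workers=POOL_SIZE) as executor:
    # All requests are submitted at once; at most POOL_SIZE run,
    # the remainder wait in the executor's internal queue.
    results = list(executor.map(handle_request, range(NUM_REQUESTS)))

print(f"handled {len(results)} requests, peak concurrency {peak}")
```

Every request is eventually handled, but the observed peak never exceeds the pool size, which is the effect seen in the inboundChannel statistics.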
I have the following code snippet
for self.step in range(0, num_steps):
    with torch.no_grad():
        pool = mp.Pool(4)
        self.step_iter = np.full(shape=len(self.env.agents),
                                 fill_value=self.step, dtype=np.int)
        # select action for every agent
        action_vector = pool.starmap(Trajectory.select_action,
                                     zip(self.replay_buffer,
                                         self.actors,
                                         self.critics,
                                         self.step_iter,
                                         self.action_size))
        pool.close()
        pool.join()
The problem is that a new pool of processes is created for every timestep (which takes some time).
Is there a way to reuse the same pool for every iteration until self.step==num_steps?
(moving comment to answer)
The pool is a collection of task handlers that will handle tasks in the queue. Think of it as (4) cashiers in a store that handle the line of customers. When a cashier is available, they begin processing the next customer. The pool is the cashiers and the starmap creates the line of customers to be processed.
You only need to create the pool (the cashiers) once:
pool = mp.Pool(4)  # process handlers, created once before the loop
for self.step in range(0, num_steps):
    with torch.no_grad():
        # np.int was removed in NumPy 1.24; the builtin int is equivalent
        self.step_iter = np.full(shape=len(self.env.agents),
                                 fill_value=self.step, dtype=int)
        # select action for every agent
        action_vector = pool.starmap(Trajectory.select_action,
                                     zip(self.replay_buffer,
                                         self.actors,
                                         self.critics,
                                         self.step_iter,
                                         self.action_size))
# shut the pool down once, after the last step
pool.close()
pool.join()
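The same pattern can be tried in isolation. This is a minimal sketch under stated assumptions: it uses `multiprocessing.pool.ThreadPool`, which exposes the same `starmap` interface as `mp.Pool`, so that the example runs without process start-up concerns. `select_action` and `run_steps` are hypothetical stand-ins, not the asker's `Trajectory` code.

```python
from multiprocessing.pool import ThreadPool


def select_action(agent_id: int, step: int) -> tuple:
    """Hypothetical stand-in for Trajectory.select_action."""
    return (agent_id, step)


def run_steps(num_steps: int, num_agents: int, workers: int = 4) -> list:
    """Create the pool once and reuse it for every step."""
    actions_per_step = []
    # The context manager tears the pool down after the loop has finished.
    with ThreadPool(workers) as pool:
        for step in range(num_steps):
            step_iter = [step] * num_agents
            # starmap blocks until every agent's task for this step is done
            actions = pool.starmap(select_action,
                                   zip(range(num_agents), step_iter))
            actions_per_step.append(actions)
    return actions_per_step


history = run_steps(num_steps=3, num_agents=5)
print(history[-1])
```

Swapping `ThreadPool` for `mp.Pool` keeps the loop unchanged; with real processes, the task function must be importable at module level and the pool should be created under an `if __name__ == "__main__":` guard.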
I'm trying to insert 1M hash keys into Redis using batch insertion.
When I do that, several thousand keys are not inserted, and I get a RedisTimeoutException.
Here is my code:
IDatabase db = RedisDB.Instance;
List<Task> tasks = new List<Task>();
var batch = db.CreateBatch();
foreach (var itemKVP in items)
{
HashEntry[] hash = RedisConverter.ToHashEntries(itemKVP.Value);
tasks.Add(batch.HashSetAsync(itemKVP.Key, hash));
}
batch.Execute();
Task.WaitAll(tasks.ToArray());
And then I get this exception:
RedisTimeoutException: Timeout awaiting response (outbound=463KiB, inbound=10KiB, 100219ms elapsed, timeout is 5000ms), command=HMSET, next: HMSET *****:*****:1390194, inst: 0, qu: 0, qs: 110, aw: True, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 1045, out-pipe: 0, serverEndpoint: 10.7.3.36:6379, mgr: 9 of 10 available, clientName: DataCachingService:DEV25S, IOCP: (Busy=0,Free=1000,Min=8,Max=1000), WORKER: (Busy=4,Free=32763,Min=8,Max=32767), Local-CPU: 0%, v: 2.0.601.3402 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
I read the article, but I could not solve the problem.
I have a .NET Core application that uses Hangfire.
There is a recurring job that runs every minute, as below:
RecurringJob.AddOrUpdate<IS2SScheduledJobs>(x => x.ProcessInput(), Cron.MinuteInterval(1));
var hangfireOptions = new BackgroundJobServerOptions
{
WorkerCount = 20,
};
_server = new BackgroundJobServer(hangfireOptions);
ProcessInput() internally checks a BlockingCollection of IDs to process and keeps processing continuously.
At some point the first ten ProcessInput() jobs keep running on 10 workers, whereas new ProcessInput() jobs are enqueued.
For this reason, I want to increase the worker count to around 50, so that 50 ProcessInput() jobs are processed in parallel.
Please suggest.
Thanks.
In the .NET Core version you can set the worker count when adding the Hangfire Server:
services.AddHangfireServer(options => options.WorkerCount = 50);
You can try increasing the number of workers this way. You need to know the number of processors on your servers. For example, to get 100 workers on a machine with 4 processors, you can try:
var options = new BackgroundJobServerOptions { WorkerCount = Environment.ProcessorCount * 25 };
app.UseHangfireServer(options);
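The arithmetic behind that multiplier can be checked quickly. This is a small Python sketch, not Hangfire code: `worker_count` and `WORKERS_PER_CORE` are hypothetical names, and `os.cpu_count()` plays the role of `Environment.ProcessorCount`.

```python
import os

WORKERS_PER_CORE = 25  # multiplier taken from the snippet above


def worker_count(processor_count: int, per_core: int = WORKERS_PER_CORE) -> int:
    """Return the worker count that ProcessorCount * per_core would yield."""
    return processor_count * per_core


# With 4 logical processors the multiplier yields 100 workers.
print(worker_count(4))

# On the current machine; os.cpu_count() may return None, so fall back to 1.
print(worker_count(os.cpu_count() or 1))
```

Because the result scales with the machine, a fixed value such as `WorkerCount = 50` is more predictable when a specific number of parallel jobs is required.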
I've got a system that uses Akka 2.2.4 which creates a bunch of local actors and sets them as the routees of a Broadcast Router. Each worker handles some segment of the total work, according to some hash range we pass it. It works great.
Now, I've got to cluster this application for failover. Based on the requirement that only one worker per hash range exist/be triggered on the cluster, it seems to me that setting up each one as a ClusterSingletonManager would make sense. However, I'm having trouble getting it working. The actor system starts up, it creates the ClusterSingletonManager, it adds the path in the code cited below to a Broadcast Router, but it never instantiates my actual worker actor to handle my messages for some reason. All I get is a log message: "unhandled event ${my message} in state Start". What am I doing wrong? Is there something else I need to do to start up this single instance cluster? Am I sending the wrong actor a message?
Here's my Akka config (I use the default config as a fallback):
akka{
cluster{
roles=["workerSystem"]
min-nr-of-members = 1
role {
workerSystem.min-nr-of-members = 1
}
}
daemonic = true
remote {
enabled-transports = ["akka.remote.netty.tcp"]
netty.tcp {
hostname = "127.0.0.1"
port = ${akkaPort}
}
}
actor{
provider = akka.cluster.ClusterActorRefProvider
single-message-bound-mailbox {
# FQCN of the MailboxType. The Class of the FQCN must have a public
# constructor with
# (akka.actor.ActorSystem.Settings, com.typesafe.config.Config) parameters.
mailbox-type = "akka.dispatch.BoundedMailbox"
# If the mailbox is bounded then it uses this setting to determine its
# capacity. The provided value must be positive.
# NOTICE:
# Up to version 2.1 the mailbox type was determined based on this setting;
# this is no longer the case, the type must explicitly be a bounded mailbox.
mailbox-capacity = 1
# If the mailbox is bounded then this is the timeout for enqueueing
# in case the mailbox is full. Negative values signify infinite
# timeout, which should be avoided as it bears the risk of dead-lock.
mailbox-push-timeout-time = 1
}
worker-dispatcher{
type = PinnedDispatcher
executor = "thread-pool-executor"
# Throughput defines the number of messages that are processed in a batch
# before the thread is returned to the pool. Set to 1 for as fair as possible.
throughput = 500
thread-pool-executor {
# Keep alive time for threads
keep-alive-time = 60s
# Min number of threads to cap factor-based core number to
core-pool-size-min = ${workerCount}
# The core pool size factor is used to determine thread pool core size
# using the following formula: ceil(available processors * factor).
# Resulting size is then bounded by the core-pool-size-min and
# core-pool-size-max values.
core-pool-size-factor = 3.0
# Max number of threads to cap factor-based number to
core-pool-size-max = 64
# Minimum number of threads to cap factor-based max number to
# (if using a bounded task queue)
max-pool-size-min = ${workerCount}
# Max no of threads (if using a bounded task queue) is determined by
# calculating: ceil(available processors * factor)
max-pool-size-factor = 3.0
# Max number of threads to cap factor-based max number to
# (if using a bounded task queue)
max-pool-size-max = 64
# Specifies the bounded capacity of the task queue (< 1 == unbounded)
task-queue-size = -1
# Specifies which type of task queue will be used, can be "array" or
# "linked" (default)
task-queue-type = "linked"
# Allow core threads to time out
allow-core-timeout = on
}
fork-join-executor {
# Min number of threads to cap factor-based parallelism number to
parallelism-min = 1
# The parallelism factor is used to determine thread pool size using the
# following formula: ceil(available processors * factor). Resulting size
# is then bounded by the parallelism-min and parallelism-max values.
parallelism-factor = 3.0
# Max number of threads to cap factor-based parallelism number to
parallelism-max = 1
}
}
}
}
Here's where I create my actors (it's written in Groovy):
Props clusteredProps = ClusterSingletonManager.defaultProps("worker".toString(), PoisonPill.getInstance(), "workerSystem",
new ClusterSingletonPropsFactory(){
@Override
Props create(Object handOverData) {
log.info("called in ClusterSingetonManager")
Props.create(WorkerActorCreator.create(applicationContext, it.start, it.end)).withDispatcher("akka.actor.worker-dispatcher").withMailbox("akka.actor.single-message-bound-mailbox")
}
} )
ActorRef manager = system.actorOf(clusteredProps, "worker-${it.start}-${it.end}".toString())
String path = manager.path().child("worker").toString()
path
When I try to send a message to the actual worker actor, should the path above resolve? Currently it does not.
What am I doing wrong? Also, these actors live within a Spring application, and the worker actors are set up with some @Autowired dependencies. While this Spring integration worked well in a non-clustered environment, are there any gotchas in a clustered environment I should be looking out for?
Thank you.
FYI: I've also posted this in the akka-user google group. Here's the link.
The path in your code is to the ClusterSingletonManager actor that you start on each node with role "workerSystem". It will create a child actor (WorkerActor) with name "worker-${it.start}-${it.end}" on the oldest node in the cluster, i.e. singleton within the cluster.
You should also define the name of the ClusterSingletonManager, e.g. system.actorOf(clusteredProps, "workerSingletonManager").
You can't send the messages to the ClusterSingletonManager. You must send them to the path of the active worker, i.e. including the address of the oldest node. That is illustrated by the ConsumerProxy in the documentation.
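To make the addressing concrete, here is a minimal Python sketch of how such a path is composed. It is not Akka code: `Member`, `up_number`, `singleton_path` and the system name `ClusterSystem` are hypothetical, and a real proxy would track the oldest node from cluster membership events instead of a static list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical view of one cluster member."""
    host: str
    port: int
    up_number: int  # lower number = joined earlier = older


def singleton_path(members, system, manager_name, singleton_name):
    """Build the full actor path of the singleton on the oldest member."""
    oldest = min(members, key=lambda m: m.up_number)
    return (f"akka.tcp://{system}@{oldest.host}:{oldest.port}"
            f"/user/{manager_name}/{singleton_name}")


members = [
    Member("127.0.0.1", 2552, up_number=2),
    Member("127.0.0.1", 2551, up_number=1),  # oldest node
]
path = singleton_path(members, "ClusterSystem",
                      "workerSingletonManager", "worker")
print(path)
```

The point of the sketch is that the path includes the address of the oldest node, so it changes whenever the singleton is handed over to another node.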
I'm not sure you should use a singleton at all for this. All workers will be running on the same node, the oldest. I would prefer to discuss alternative solutions to your problem at the akka-user google group.