How many simultaneous requests can I send to an Elasticsearch cluster? - elasticsearch

I want to send multiple bulk requests to an Elasticsearch cluster, and I came across this issue: EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
I have a cluster of 4 Elasticsearch instances (version 1.3.4). When I sent this request to check the size of its bulk thread pool:
GET /_cat/thread_pool?v&h=host,bulk.active,bulk.queueSize
I got
host bulk.active bulk.queueSize
1D4HPY1 0 50
1D4HPY2 0 50
1D4HPY3 0 50
1D4HPY4 0 50
So how many simultaneous bulk operation requests can I send to that cluster? 50 or 200?

I would suggest having a look at this section from the documentation.
Also, you need to be more specific when you say "simultaneous requests that you can send" because, as you see in the page above, there are various thread pools that handle various jobs. You give an example in your post for "bulk" operations.
In my opinion, the correct request to see the number of simultaneously running "bulk" threads (as per this piece of documentation) is GET /_cat/thread_pool?v&h=host,bulk.queueSize,bulk.min,bulk.max. So, you are allowed bulk.max active threads in that thread pool, with up to bulk.queueSize tasks waiting in its queue. When a request comes in and there is no thread to handle it, the request is put in the queue to wait.
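Not from the original answer, but to make the rejection behaviour concrete: when the bulk queue overflows, the failure shows up per item inside the bulk response, so the client can detect it and back off before resending. A minimal sketch against the Java client of that era, where client, index, type, and docs are placeholder names:

import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;

// Send one bulk request and report items the cluster rejected (e.g. when the bulk queue is full).
public static void sendBulk(Client client, String index, String type, Iterable<String> docs) {
    BulkRequestBuilder bulk = client.prepareBulk();
    for (String json : docs) {
        bulk.add(client.prepareIndex(index, type).setSource(json));
    }
    BulkResponse response = bulk.execute().actionGet();
    if (response.hasFailures()) {
        for (BulkItemResponse item : response.getItems()) {
            if (item.isFailed()) {
                // Queue rejections end up here; wait briefly and resubmit only the failed documents.
                System.err.println("item " + item.getItemId() + " failed: " + item.getFailureMessage());
            }
        }
    }
}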

Related

Is there any way to check the current bulk queue size in OpenSearch?

My OpenSearch cluster sometimes hits the error "429 Too Many Requests" when writing data. I know there is a queue, and when the queue is full it will return that error. So is there any API to check that bulk queue's status and current size...? Example: queue 150/200 (nearly full)
Yes, you can use the following API call
GET _cat/thread_pool?v
You will get something like this, where you can see the node name, the thread pool name (look for write), the number of active requests currently being carried out, the number of requests waiting in the queue and finally the number of rejected requests.
node_name name active queue rejected
node01 search 0 0 0
node01 write 8 2 0
The write thread pool can run as many requests at the same time as it has threads, which is roughly 1 + the number of CPUs. If all those threads are busy and new requests come in, they go into the queue (default size 10000). If both the threads and the queue are full, requests start to be rejected.
Your mileage may vary, but when optimizing this, you're looking at:
keeping rejected at 0
minimizing the number of requests in the queue
making sure that active requests get carried out as fast as possible.
Instead of increasing the queue, it's usually preferable to increase the number of CPUs. If you have heavy ingest pipelines kicking in, it's often a good idea to add dedicated ingest nodes whose job is to execute those pipelines instead of running them on the data nodes.
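If you'd rather watch these numbers from code than from the command line, here is a rough sketch using the low-level REST client (class names are from the Elasticsearch client; the OpenSearch fork offers the same API under the org.opensearch.client package, and localhost:9200 is just an example address):

import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

// Poll the same _cat endpoint and print active/queued/rejected counts for the write pool.
try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
    Request request = new Request("GET", "/_cat/thread_pool/write");
    request.addParameter("v", "true");
    request.addParameter("h", "node_name,name,active,queue,rejected");
    Response response = client.performRequest(request);
    System.out.println(EntityUtils.toString(response.getEntity()));
}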

What happens if I give a large value to server.tomcat.max-threads to handle load on my application?

There are around 1000+ jobs running through our service in a day, with around 70-80 jobs starting at the same time and running in parallel.
To handle this, we figured that setting the server.tomcat.max-threads property of our Spring application to a large number should work, but I am not fully confident about what the side effects of giving this property a huge value like 800 might be.
Can you please help here?
The default installation of Tomcat sets the maximum number of HTTP servicing threads at 200. Effectively, this means that the system can handle a maximum of 200 simultaneous HTTP requests. When the number of simultaneous HTTP requests exceeds this count, the unhandled requests are placed in a queue, and the requests in this queue are serviced as processing threads become available. This default queue length is 100. At these default settings, a large web load that can generate over 300 simultaneous requests will surpass the thread availability, resulting in service unavailable (HTTP 503).
More reference: https://docs.bmc.com/docs/brid91/en/tomcat-container-workload-configuration-825210082.html
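For reference, raising those limits in a Spring Boot application can be done either via the server.tomcat.* properties mentioned in the question or programmatically. A hedged sketch for Spring Boot 2.x follows; the values 800 and 100 simply echo the numbers discussed above and are not recommendations:

import org.apache.coyote.AbstractProtocol;
import org.apache.coyote.ProtocolHandler;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TomcatThreadConfig {

    // Raise the Tomcat worker-thread limit and the accept queue on the embedded connector.
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> tomcatCustomizer() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            ProtocolHandler handler = connector.getProtocolHandler();
            if (handler instanceof AbstractProtocol) {
                AbstractProtocol<?> protocol = (AbstractProtocol<?>) handler;
                protocol.setMaxThreads(800);   // maximum simultaneous worker threads
                protocol.setAcceptCount(100);  // requests queued once all threads are busy
            }
        });
    }
}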
How to run multiple servlet executions in parallel in Tomcat?
If this is a batch-job-like setup, you can use Spring Batch.

Creation of parallel threads for bulk request handling?

I have a REST service and want to handle almost 100 requests in parallel for this service. I have set the number of threads and the number of connections to create to 100 in my application.yml, yet I did not see 100 connections created to handle requests.
Here is what I did in my application.yml:
server.tomcat.max-threads=100
server.tomcat.max-connections=100
I am using YourKit to see the internals, but when I start the application only 10 connections are created to handle requests, and even when I send multiple requests the count of request-handling threads does not increase; it remains at 10. See the attachment I took from YourKit.
You're setting the maximum number of threads, not the minimum. Tomcat in this case has decided the minimum should be 10 (its default minSpareThreads); it only creates additional threads as concurrent load actually requires them, up to the configured maximum.

BigQuery streaming inserts: persistent or new HTTP connection on every insert?

I am using google-api-ruby-client for streaming data into BigQuery, so whenever there is a request it is pushed into Redis as a queue, and then a new Sidekiq worker tries to insert it into BigQuery. I think this involves opening a new HTTPS connection to BigQuery for every insert.
The way I have it set up is:
Events post every 1 second or when the batch size reaches 1 MB (one megabyte), whichever occurs first. This is per worker, so the BigQuery API may receive tens of HTTP posts per second over multiple HTTPS connections.
This is done using the provided API client by Google.
Now the question: for streaming inserts, what is the better approach?
A persistent HTTPS connection. If yes, should it be a global connection that is shared across all requests, or something else?
Opening a new connection on every insert, like we are doing now with google-api-ruby-client.
I think it's pretty much too early to talk about these optimizations. Other context is also missing, such as whether you have exhausted the kernel's TCP connections or not, or how many connections are in the TIME_WAIT state, and so on.
Until the worker pool reaches 1000 connections per second on the same machine, you should stick with the default mode the library offers.
Otherwise this would need a lot of additional context and a deep understanding of how this works in order to optimize anything here.
On the other hand, you can batch more rows into the same streaming insert request (a short sketch of this follows below the limits); the limits are:
Maximum row size: 1 MB
HTTP request size limit: 10 MB
Maximum rows per second: 100,000 rows per second, per table.
Maximum rows per request: 500
Maximum bytes per second: 100 MB per second, per table
Read my other recommendations
Google BigQuery: Slow streaming inserts performance
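To make the batching point concrete: pack many rows into one insertAll call instead of sending one HTTP request per row, staying under the per-request limits listed above. The question uses the Ruby client, so this is only an illustration in Java with the google-cloud-bigquery library; dataset, table, and rows are placeholder names.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import java.util.List;
import java.util.Map;

// One streaming-insert HTTP request carrying a whole batch of rows.
public static void insertBatch(String dataset, String table, List<Map<String, Object>> rows) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    InsertAllRequest.Builder builder = InsertAllRequest.newBuilder(TableId.of(dataset, table));
    for (Map<String, Object> row : rows) {
        builder.addRow(row);  // keep the batch under 500 rows / 10 MB per request
    }
    InsertAllResponse response = bigquery.insertAll(builder.build());
    if (response.hasErrors()) {
        response.getInsertErrors().forEach((index, errors) ->
                System.err.println("row " + index + " failed: " + errors));
    }
}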
I will also try to give some context to better understand the complicated situation when ports are exhausted:
Let's say on a machine you have a pool of 30,000 ports and 500 new connections per second (typical):
1 second goes by, you now have 29,500
10 seconds go by, you now have 25,000
30 seconds go by, you now have 15,000
at 59 seconds you get to 500,
at 60 you get back 500 and stay at using 29,500, and that keeps rolling at 29,500. Everyone is happy.
Now say that you're seeing an average of 550 connections a second.
Suddenly there aren't any available ports to use.
So, your first option is to bump up the range of allowed local ports; easy enough, but even if you open it up as much as you can and go from 1025 to 65535, that's still only 64,000 ports; with your 60-second TCP_TIMEWAIT_LEN, you can sustain an average of 1,000 connections a second. Still no persistent connections are in use.
This port exhaustion issue is discussed in more detail here: http://www.gossamer-threads.com/lists/nanog/users/158655
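A quick back-of-the-envelope check of the arithmetic above (just restating the quoted numbers, not new data): the number of ports stuck in TIME_WAIT at steady state is roughly the connection rate multiplied by the TIME_WAIT length.

// Steady-state ports in TIME_WAIT = new connections per second * TIME_WAIT seconds.
public class PortMath {
    public static void main(String[] args) {
        int timeWaitSeconds = 60;  // the 60-second TCP_TIMEWAIT_LEN from the quote
        for (int availablePorts : new int[]{30_000, 64_000}) {   // default pool vs widened 1025-65535 range
            for (int connsPerSec : new int[]{500, 550, 1_000, 1_100}) {
                int inTimeWait = connsPerSec * timeWaitSeconds;  // ports tied up at steady state
                System.out.printf("%d ports, %d conn/s -> %d in TIME_WAIT (%s)%n",
                        availablePorts, connsPerSec, inTimeWait,
                        inTimeWait <= availablePorts ? "OK" : "exhausted");
            }
        }
    }
}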

ElasticSearch gives error about queue size

RemoteTransportException[[Death][inet[/172.18.0.9:9300]][bulk/shard]]; nested: EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1#12ae9af];
Does this mean I'm doing too many operations in one bulk at one time, or too many bulks in a row, or what? Is there a setting I should be increasing or something I should be doing differently?
One thread suggests "I think you need to increase your 'threadpool.bulk.queue_size' (and possibly 'threadpool.index.queue_size') setting due to recent defaults." However, I don't want to arbitrarily increase a setting without understanding the fault.
I lack the reputation to reply to the comment as a comment.
It's not exactly the number of bulk requests made; it is actually the total number of shards that will be updated on a given node by the bulk calls. This means the contents of the actual bulk operations inside the bulk request matter. For instance, if you have a single node with a single index, running on an 8-core box, with 60 shards, and you issue a bulk request with indexing operations that affect all 60 shards, you will get this error message from a single bulk request.
If anyone wants to change this, you can see the splitting happening inside of org.elasticsearch.action.bulk.TransportBulkAction.executeBulk() near the comment "go over all the request and create a ShardId". The individual requests happen a few lines down around line 293 on version 1.2.1.
You want to up the number of bulk threads available in the thread pool. ES sets aside threads in several named pools for use on various tasks. These pools have a few settings: type, size, and queue size.
from the docs:
The queue_size allows to control the size of the queue of pending requests that have no threads to execute them. By default, it is set to -1 which means its unbounded. When a request comes in and the queue is full, it will abort the request.
To me that means you have more bulk requests queued up, waiting for a thread from the pool to execute one of them, than your current queue size allows. The documentation seems to indicate the queue size is defaulted to both -1 (the text above says that) and 50 (the call-out for bulk in the doc says that). You could take a look at the source to be sure for your version of ES, or set a higher number and see if your bulk issues simply go away.
ES thread pool settings doco
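Another client-side option, not mentioned in the answers above but worth noting: instead of raising the server-side queue, the Java client's BulkProcessor can batch documents and cap how many bulk requests are in flight, so the cluster's bulk queue is less likely to overflow. A rough sketch (the batch size of 500 is only an example):

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Client;

// Batches added documents and sends at most one bulk request at a time.
public static BulkProcessor buildProcessor(Client client) {
    return BulkProcessor.builder(client, new BulkProcessor.Listener() {
        @Override public void beforeBulk(long id, BulkRequest request) { }
        @Override public void afterBulk(long id, BulkRequest request, BulkResponse response) {
            if (response.hasFailures()) {
                System.err.println("bulk had failures: " + response.buildFailureMessage());
            }
        }
        @Override public void afterBulk(long id, BulkRequest request, Throwable failure) {
            System.err.println("bulk failed: " + failure);
        }
    })
    .setBulkActions(500)       // flush after 500 queued documents
    .setConcurrentRequests(1)  // at most one bulk request in flight per client
    .build();
}
// Usage: processor.add(new IndexRequest("my-index", "my-type").source(json)); ... processor.close();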
Elasticsearch 1.3.4
Our system: 8 cores * 2
4 bulk workers, each inserting 300,000 messages per minute => 20,000 per second
I also hit that exception! Then I set this config:
elasticsearch.yml
threadpool.bulk.type: fixed
threadpool.bulk.size: 8 # availableProcessors
threadpool.bulk.queue_size: 500
Source:
BulkRequestBuilder bulkRequest = es.getClient().prepareBulk();
// async replication; only wait for one copy to acknowledge each write
bulkRequest.setReplicationType(ReplicationType.ASYNC).setConsistencyLevel(WriteConsistencyLevel.ONE);
for (String document : documents) {  // documents: the batch of JSON payloads to index
    bulkRequest.add(es.getClient().prepareIndex(esIndexName, esTypeName)
            .setSource(document.getBytes("UTF-8")));
}
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
With 4 cores => bulk.size: 4
Then there was no error.
I was having this issue, and my solution ended up being increasing ulimit -Sn and ulimit -Hn for the elasticsearch user. I went from 1024 (the default) to 99999 and things cleaned right up.
