I'd like to optimize rsyslog to handle the maximum number of messages per second.
I am working with rsyslog 8.27.0. With my current configuration, it gives me about 72k messages/second.
Do you have any other configuration suggestions to reach my goal?
Thank you in advance
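For context, throughput tuning in rsyslog 8 usually revolves around the main queue and its worker settings. A minimal sketch in RainerScript, with purely illustrative values (an assumption of the kind of tuning meant, not a tested recommendation):

# illustrative queue tuning for throughput -- all values are assumptions
main_queue(
  queue.type="LinkedList"              # in-memory linked-list queue
  queue.size="1000000"                 # maximum number of queued messages
  queue.dequeueBatchSize="4096"        # dequeue messages in larger batches
  queue.workerThreads="4"              # parallel worker threads
  queue.workerThreadMinimumMessages="100000"  # spawn extra workers only under load
)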
My OpenSearch sometimes returns the error "429 Too Many Requests" when writing data. I know there is a queue, and when the queue is full it returns that error. So is there any API to check that bulk queue status, current size...? Example: queue 150/200 (nearly full)
Yes, you can use the following API call:
GET _cat/thread_pool?v
You will get something like this, where you can see the node name, the thread pool name (look for write), the number of active requests currently being carried out, the number of requests waiting in the queue and finally the number of rejected requests.
node_name name active queue rejected
node01 search 0 0 0
node01 write 8 2 0
The write thread pool can run as many requests as 1 + the number of CPUs, i.e. that many can be active at the same time. If all active slots are taken and new requests come in, they go into the queue (default size 10000). If both the active slots and the queue are full, requests start to be rejected.
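To get the queue 150/200 style view asked about above, the same cat API can be filtered to the write pool and given explicit columns, including the queue capacity:

GET _cat/thread_pool/write?v&h=node_name,name,active,queue,queue_size,rejected

Here queue is the current number of queued requests and queue_size is the maximum the queue can hold, so queue/queue_size gives the "nearly full" picture.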
Your mileage may vary, but when optimizing this, you're looking at:
keeping rejected at 0
minimizing the number of requests in the queue
making sure that active requests get carried out as fast as possible.
Instead of increasing the queue, it's usually preferable to increase the number of CPUs. If you have heavy ingest pipelines kicking in, it's often a good idea to add ingest nodes, so that the pipelines execute there instead of on the data nodes.
I have currently implemented Spring Batch Remote Chunking with Kafka. I have one manager and 21 copies of the worker.
Currently I am facing the issues below.
First, I want to know if it is possible to run two instances of the same remote chunking step with different parameters in parallel at the same time. The problem I see is that since I am using the same reply channel, the responses of the two instances of the same chunk step get mixed up.
The second problem is that I have more than 250,000 records to process, my chunk size is 1000, the number of workers is 21, and I also have the throttle limit set to 20 and maxtimeoutcount set to 10000. I want to know what the throttle limit and maxtimeoutcount should be for processing this many records.
Please advise on the above issues, as I am stuck.
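On the first problem, one commonly suggested direction (not an answer from this thread) is to give each concurrent step instance its own request/reply channels and Kafka topics, so replies cannot mix. A rough sketch using Spring Batch's RemoteChunkingManagerStepBuilderFactory; the bean names, topic names, and item type are assumptions:

// Sketch: dedicated channels per step instance so replies never share a channel.
import org.springframework.batch.core.step.tasklet.TaskletStep;
import org.springframework.batch.integration.chunk.RemoteChunkingManagerStepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.integration.channel.QueueChannel;

@Configuration
public class ManagerStepAConfig {

    @Bean
    public DirectChannel requestsA() {   // bridge this to a dedicated Kafka topic, e.g. chunkRequests-A
        return new DirectChannel();
    }

    @Bean
    public QueueChannel repliesA() {     // bridge a dedicated Kafka topic, e.g. chunkReplies-A, to this
        return new QueueChannel();
    }

    @Bean
    public TaskletStep managerStepA(RemoteChunkingManagerStepBuilderFactory factory,
                                    ItemReader<String> reader) {
        return factory.get("managerStepA")
                .chunk(1000)                 // chunk size from the question
                .reader(reader)
                .outputChannel(requestsA())  // requests leave on this instance's own channel
                .inputChannel(repliesA())    // replies come back only to this instance
                .build();
    }
}

A second step instance would get its own requestsB/repliesB channels and topics, so the two parallel runs never read each other's replies.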
I have a Kubernetes system in Azure and used the following instructions to install fluentd, Elasticsearch, and Kibana: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch I am able to see my pods' logs in Kibana, but when I send a log of more than 16k characters, it just gets split.
If I send 35k characters, it is split into 3 log entries.
How can I increase the limit of a single log entry? I want to be able to see the whole 35k characters in one log entry.
https://github.com/fluent-plugins-nursery/fluent-plugin-concat did the job and combined them back into one log entry. It solves:
Docker's max log line size of 16KB
long lines in container logs getting split into multiple lines
the 16KB cap on message size, because of which an 85KB message ended up as 6 messages in different chunks
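As a reference point, a minimal filter based on the plugin's own Docker example; the kubernetes.** tag pattern is an assumption that depends on how your logs are tagged:

<filter kubernetes.**>
  @type concat
  key log
  # chunks split by docker's json-file driver do not end with a newline,
  # so keep concatenating until a line ending in \n is seen
  multiline_end_regexp /\n$/
  separator ""
</filter>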
I dug into this topic half a year ago. In short, this is expected behavior.
Docker chunks log messages at 16K because of a 16K buffer for log messages. If a message exceeds 16K, it is split by the json-file logger and has to be merged again at the endpoint.
It looks like the Docker_mode option for Fluent Bit might help, but I'm not sure how exactly you are parsing container logs.
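For illustration, that option lives on Fluent Bit's tail input; a minimal sketch, assuming the container log files are tailed directly:

[INPUT]
    Name         tail
    Path         /var/log/containers/*.log
    Parser       docker
    # Docker_Mode rejoins lines that the json-file driver split at 16K
    Docker_Mode  On
    Tag          kube.*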
I am using Elasticsearch 6.2.1 with a single-node cluster. It works fine with my small indices and medium traffic. But when I test it with a large number of concurrent requests using Apache JMeter, ES goes down with an error message like the one below.
My requirement is to prevent ES from crashing even in such high-traffic situations. It should discard requests after a certain point, but not stop working. Is there any option by which I can achieve this? Please advise.
If the requests only spike for a few seconds, you can increase the queue size of the affected thread pool (for example, the search thread pool). Otherwise, you should add more nodes to the cluster.
(Please add some logs of Elasticsearch crashing. Do you have an out-of-memory exception?)
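For reference, in 6.x the queue size is a static setting in elasticsearch.yml and requires a restart; the value below is only an example (the 6.x default for the search pool is 1000):

thread_pool.search.queue_size: 2000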
Are you sure Elasticsearch is crashing? Here, it's saying the search thread pool is full.
Read more at https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html.
I have configured Nutch 2.3.1 with the Hadoop/HBase ecosystem. I have not changed gora.buffer.read.limit or gora.buffer.write.limit, i.e., I am using their default values, which are 10000 in both cases. At the generate phase, I set topN to 100,000. During the generate job I get the following information:
org.apache.gora.mapreduce.GoraRecordWriter: Flushing the datastore after 60000 records
After job completion, I found that 100,000 URLs are marked to be fetched, which is what I want. But I am confused about what the above message shows. What is the impact of gora.buffer.read.limit on my crawling?
Can someone guide me?
That log message is written by GoraRecordWriter. By default, the buffer is flushed after writing 10000 records, so you must have configured gora.buffer.write.limit to 60000 somewhere (in core-site.xml, mapred-site.xml, or code?).
It is not important, since it is at INFO level. It only notifies that the write buffer is going to be written into the storage.
The writing process happens each time you call store.flush(), or in batches of gora.buffer.write.limit size.
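For reference, a sketch of what that override might look like in one of the Hadoop config files; the 60000 matches the value seen in your log:

<!-- core-site.xml or mapred-site.xml: raises Gora's default write buffer of 10000 -->
<property>
  <name>gora.buffer.write.limit</name>
  <value>60000</value>
</property>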