The best memory configuration for Elasticsearch

I have one Linux server with 128 GB of memory and 32 CPU cores. I want to run an Elasticsearch instance on it, and the server is dedicated exclusively to ES. How much memory should I configure for ES, and how can I get the best performance out of it? Is the server overpowered for ES? Thanks!

I suggest you run two ES instances on the server. Since your Linux server is pretty powerful, setting the ES heap to 60 GB or 80 GB may cause GC problems. Try running two or three ES instances on the one server and monitor the CPU and memory usage. By the way, give each instance a different HTTP port when running multiple nodes on one server.
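The sizing advice above can be turned into a quick calculation. This is only a rough sketch under two assumptions that are not stated in the answer itself: each node's heap gets at most half of that node's share of RAM (the rest is left to the OS file-system cache), and the heap stays just under the roughly 32 GB ceiling from the ES heap documentation. The function name and the 31 GB cap are illustrative, not an official formula.

```python
# Hypothetical rule-of-thumb helper, not an official Elasticsearch formula.
# Assumptions: heap <= 50% of the RAM share of each node, and heap kept
# just below ~32 GB.

HEAP_CAP_GB = 31  # assumed safe value under the ~32 GB ceiling


def heap_per_node_gb(total_ram_gb: int, nodes: int) -> int:
    """Return a suggested JVM heap size in GB for each node on one host."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    ram_per_node = total_ram_gb // nodes
    # Leave at least half of each node's share to the OS file-system cache.
    return min(ram_per_node // 2, HEAP_CAP_GB)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} node(s) on 128 GB: heap {heap_per_node_gb(128, n)} GB each")
```

With one node on the 128 GB server, most of the memory can only be used as file-system cache; with two or three nodes, more of it is available as heap, which is the motivation for running several instances.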

Related

Creating Elasticsearch cluster from three servers

We have three physical servers. Each server has 2 CPUs (32 cores), 96 TB HDD, and 768 GB RAM. We would like to use these servers in an Elasticsearch cluster.
Each server will be located in a different data center, and the servers will be linked by a private connection.
How can we optimize our configuration for high performance? Also, how should we best run Elasticsearch on these machines? For example, should we use virtualization to create multiple nodes per machine, or not?
Each physical server has a huge amount of RAM (768 GB), and according to the ES documentation on heap settings the heap shouldn't exceed 32 GB, so you will have to use virtualization to create multiple nodes per physical server to make better use of your infrastructure.
Apart from this, there are various cluster settings and node settings that you can optimize, but as you have not provided them, it's difficult to make recommendations on them.
Another thing to note is that you have a huge amount of RAM and disk, but the CPU is not in proportion to it, so it would be good to increase the CPU capacity as well if you can.

What are the resource requirements to run Logstash in a k8s pod?

I was running an ELK stack on a Raspberry Pi running a Kubernetes cluster, and I noticed that it didn't have the resources to run all three containers. I read that Kubernetes lets you put limits and requests on CPU and memory, and it got me thinking: what are the minimum requirements? To me, applications are greedy, so is there a way to cut down the requirements for Logstash, to prioritize resources for Elasticsearch?
Right now, I am running a Raspberry Pi 4 with 4 GB RAM and a 32 GB disk.
If I can put min and max requirements on the containers, it will let me manage the resources better. The thing I noticed, though, is that as far as I can tell there is no guidance on minimum requirements for the different containers.
https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-managing-compute-resources.html
I believe the above link tells me that CPU consumption is greedy, but that the default memory for Elasticsearch and Kibana is 2Gi and 1Gi respectively. It mentions nothing about Logstash, though, or whether there is a minimum requirement for CPUs.
I wasn't sure if I should set each ELK container to 1 CPU and 1Gi RAM. I can try it to see if it functions, but the idea of it throttling down makes me curious what the happy medium would be.
Logstash is not part of the Elastic Cloud, which is why there is no mention of it in the Elastic Cloud on Kubernetes documentation link that you shared.
Logstash is far more CPU bound than memory bound, but how much memory it needs depends entirely on your pipelines.
In Logstash, memory use depends on the pipelines, the batch size, the filters used, the number of events per second, the queue type, and so on. If you are running a dev or lab environment, I think you can try giving Logstash 1 CPU and 512 MB of RAM and see if it fits your use case.
But I would say that 4 GB is pretty small for a full stack, since you need memory for the applications and still have to leave some for the system.

Elasticsearch replication cluster with different OS

If I am running one ES on Windows server, another ES on Linux, and the third ES on Unix, can I cluster them and make them replicate each other? Is it possible?
Server A Windows 192.168.0.100
Server B Linux 192.168.0.101
Server C Unix 192.168.0.102
Well, it is possible as long as they can "see" each other on the network.
I think it is more important that they have similar configurations in terms of memory and CPU, but again, it depends on the performance you are looking for.

Will Redis get faster with sharding on multiple masters when using no persistence?

My tests with standalone (single-threaded) Redis show that load from a number of parallel clients can drive Redis CPU usage to 100% (in my memory cache use case).
Starting it in cluster mode and sharding the content to multiple masters is a possible approach for speeding it up, if persistence is turned on.
I have a configuration without persistence (RDB and AOF turned off). Would starting multiple masters help performance (still using the same cumulative amount of RAM)?
Redis is single-threaded, so the performance of a standalone instance is limited by the processing power of a single CPU core and the network bandwidth of a single machine. However, Redis is very, very fast, so normally the bottleneck is network bandwidth, unless you run lots of slow commands or Lua scripts.
If you deploy Redis Cluster on multiple machines, performance should improve whether persistence is turned on or off, since you have more CPU cores and more network bandwidth.
If you deploy Redis Cluster on a single machine (each node listening on a unique port), performance might improve. It depends: if the bottleneck is network bandwidth, it won't improve; if the bottleneck is CPU processing power, it should. So in this case you should run some benchmarks with your specific data, environment, and commands or Lua scripts.
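To see how sharding spreads work across masters, it may help to look at how Redis Cluster assigns keys: each key is hashed with CRC16 and mapped to one of 16384 hash slots, and each master serves a subset of the slots. The sketch below is an illustration, not client code from the thread. `key_slot` follows the documented slot calculation (including `{...}` hash tags), while `owner` is a hypothetical even split of the slot range that a real cluster is not obliged to use.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Map a key to its Redis Cluster hash slot (CRC16/XMODEM mod 16384)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the XMODEM CRC16 variant.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def owner(slot: int, masters: int) -> int:
    """Hypothetical even split of the slot range across `masters` nodes."""
    return slot * masters // HASH_SLOTS


if __name__ == "__main__":
    for key in (b"session:1", b"session:2", b"{user1000}.following"):
        slot = key_slot(key)
        print(key.decode(), "-> slot", slot, "-> master", owner(slot, 3))
```

Because different keys land on different masters, a CPU-bound workload can use one core per master, which is why the single-machine case helps only when CPU rather than network bandwidth is the limit.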

How to increase the request per second on amazon EC2 T2.micro instance?

I recently launched an Amazon EC2 t2.micro instance. After installing WildFly 8.2.0.Final, I tried to load test the web server. I tested a static page of less than 500 bytes and a dynamic page that writes to and reads from MySQL. To my surprise, I got similar results: both tests came out at around 1000 RPS. I monitored the system using top -d 1; the CPU hasn't reached its maximum, and there is free memory. I think either EC2 has some limitation on concurrent connections, or my setup needs improvement.
My setup is CentOS 7, WildFly/JBoss 8.2.0.Final, MariaDB 5.5. The test tool is JMeter in distributed mode or command-line mode. Tests were performed remotely, on the same subnet, and on localhost. All give the same result.
Can you please help identify where the bottleneck is? Are there any limitations on Amazon EC2 instances that could affect this? Thanks.
Yes, there are some limitations depending on the EC2 instance type, and one of them is network performance.
Amazon doesn't publish the exact limitations of each type of instance, but in the Instance Types Matrix you can see that t2.micro has a low to moderate network performance. If you need better network performance, you can check on the AWS instance types page where it shows which instances have enhanced networking:
Enhanced Networking
Enhanced Networking enables you to get significantly higher packet per second (PPS) performance, lower network jitter and lower latencies. This feature uses a new network virtualization stack that provides higher I/O performance and lower CPU utilization compared to traditional implementations. In order to take advantage of Enhanced Networking, you should launch an HVM AMI in VPC, and install the appropriate driver. Enhanced Networking is currently supported in C4, C3, R3, I2, M4, and D2 instances. For instructions on how to enable Enhanced Networking on EC2 instances, see the Enhanced Networking on Linux and Enhanced Networking on Windows tutorials. To learn more about this feature, check out the Enhanced Networking FAQ section.
You have more information in these SO and SF questions:
Bandwidth limits for Amazon EC2
Does anyone know the bandwidth available for different EC2 Instances?
EC2 Instance Types's EXACT Network Performance?
You're right that 1000 RPS feels awfully low for Wildfly, given that the Undertow server powering it is one of the fastest in Java land and among the 10 fastest, period.
Starting points to optimize:
Make sure that you do not have request logging on (that could cause an I/O bottleneck), use the latest stable JVM, and it's probably worth using the most recent Wildfly version that your app works with.
With that done, you're almost certainly being bottlenecked by connection creation, not your AWS instance. This could be within JMeter, or within the Wildfly subsystem.
To eliminate JMeter as a culprit, try ApacheBench ("ab") at the same concurrency level, and then try it with the -k option on (to allow connection reuse).
If the first ApacheBench number is much higher than JMeter's, the issue is the thread-based networking model that JMeter uses (another load-testing tool, such as Gatling or locust.io, may be needed).
If the second number is much higher than the first, the bottleneck is proven to be connection creation. This may be solved by tuning the Undertow server settings.
As far as WildFly goes, I'd have to see the config.xml, but you may be able to improve performance by tweaking the Undertow subsystem settings. The defaults are usually solid, but you want a very low number of I/O threads (either 1, or the number of CPUs, no more).
I have seen a trivial Wildfly 10 application far exceed the performance you're seeing on a t2.micro instance.
Benchmark results, with Wildfly 10 + docker + Java 8:
Server setup (EC2 t2.micro running latest amazon linux, in US-east-1, different AZs)
sudo yum install docker
sudo service docker start
sudo docker run --rm -it -p 8080:8080 svanoort/jboss-demo-app:0.7-lomem
Client (another t2.micro, minimal load, different AZ):
ab -c 16 -k -n 1000 http://$SERVER_PRIVATE_IP:8080/rest/cached/500
16 concurrent connections with keep-alive, serving 500 bytes of cached randomly pre-generated data
Results over multiple runs:
430 requests per second (RPS), 1171 RPS, 1527 RPS, 1686 RPS, 1977 RPS, 2471 RPS, 3339 RPS, eventually peaking at ~6500 RPS after hundreds of thousands of requests.
Notice how that goes up over time? It's important to prewarm the server before benchmarking, to allow for enough handler threads to be created, and to allow for JIT compilation. 10,000 requests is a good starting point.
If I turn off connection keep-alive? It peaks at about 1450 RPS with concurrency 16. BUT WAIT! With a single thread (concurrency 1), it only gives ~340-350 RPS. Increasing concurrency beyond 16 does not give higher performance; it remains fairly stable (even up to 512 concurrent connections).
If I increase the request data size to 2000 bytes, by using http://$SERVER_PRIVATE_IP:8080/rest/cached/2000 then it still hits 1367 RPS, showing that almost all of the time is spent on connection handling.
With very large (300k) requests and connection keep-alive, I hit about 50 MB/s between hosts, but I've seen up to 90 MB/s in optimal situations.
Very impressive performance for JBoss/Wildfly there, I'd say. Note that higher concurrency may be needed if there is more latency between hosts, to allow for the impact of round-trip time on connection creation.
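One way to read the numbers above is through Little's law: with a closed-loop tool such as ab keeping every connection busy, average time per request is roughly concurrency divided by throughput. The sketch below applies that to the figures quoted in this answer; the function name is hypothetical, and 345 RPS is an assumed midpoint of the ~340-350 range.

```python
def mean_latency_ms(concurrency: int, rps: float) -> float:
    """Average time each request spends in flight, via Little's law (L = lambda * W)."""
    if rps <= 0:
        raise ValueError("rps must be positive")
    return concurrency / rps * 1000.0


# (label, concurrent connections, approximate RPS quoted in the answer)
RUNS = [
    ("no keep-alive, concurrency 1", 1, 345),   # assumed midpoint of 340-350
    ("no keep-alive, concurrency 16", 16, 1450),
    ("keep-alive, concurrency 16", 16, 6500),
]

if __name__ == "__main__":
    for label, connections, rps in RUNS:
        print(f"{label}: ~{mean_latency_ms(connections, rps):.1f} ms per request")
```

At concurrency 1 without keep-alive, each request takes roughly 3 ms end to end, so a single connection cannot exceed a few hundred RPS no matter how fast the server is. That is consistent with the point that connection creation and round-trip time, not the instance itself, set the ceiling.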
