Recently, when I try to index a document, the request responds with a max-timeout-reached error; after a certain point in time, indexing starts working again.
Now I'm trying to find the root cause of that issue. The only thing I'm able to find is that one of my master nodes was down at that time. Would that result in the timeout issue?
The infra details of my Elasticsearch cluster are:
runs in Kubernetes
3 data nodes - each node spec: RAM 64 GB, memory limit 32 GB, heap size 28 GB, disk size 1 TB
3 master nodes - each node spec: RAM 16 GB, memory limit 4 GB, heap size 4 GB, disk size 10 GB
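For concreteness, this is roughly how the data-node sizing above maps onto a Kubernetes container spec (a sketch, not my actual manifest; the image tag and names are placeholders):

    containers:
    - name: elasticsearch
      image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0   # placeholder version
      env:
      - name: ES_JAVA_OPTS
        value: "-Xms28g -Xmx28g"        # heap size 28 GB
      resources:
        limits:
          memory: 32Gi                  # memory limit 32 GB
      volumeMounts:
      - name: data
        mountPath: /usr/share/elasticsearch/data   # backed by a 1 TB volume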
Found the cause of it:
all the masters were down at that time,
because of multiple issues:
heap dump saving ran the storage out of space
because the storage is shared, the nodes tried to write heap dumps with the same file name (which throws a "file already exists" error)
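One way to avoid that collision is to give every node its own heap-dump directory. A sketch of a pod-spec fragment, assuming the stock image honours ES_JAVA_OPTS; POD_NAME and the path are placeholders, and the target directory has to exist up front:

    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: ES_JAVA_OPTS
      # Kubernetes expands $(POD_NAME), so each pod dumps into its own directory
      value: "-XX:HeapDumpPath=/usr/share/elasticsearch/data/heapdumps/$(POD_NAME)"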
I would like to grow my cluster. I have 20 slave nodes with dual quad-core CPUs and 12 TB of hard drive space each, and I would like to add 10 additional slave nodes. Do I have to care about the disk space on my new nodes? Can my nodes have any amount of hard drive space?
I have a three-node cluster, each node with 114 GB of disk capacity. I am pushing syslog events from around 190 devices to this cluster. The events flow in at a rate of 100k per 15 minutes. The index has 5 shards and uses the best_compression codec. In spite of this the disk fills up quickly, so I was forced to remove the replica for this index. The index size is 170 GB and each shard is 34.1 GB. Now, if I get additional disk space and reindex this data into a new index with 3 shards and a replica, will it save disk space?
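In case it helps, this is roughly what I plan to run, sketched in Python with the requests library against localhost:9200 (old-syslog and new-syslog are placeholder index names):

    import requests

    ES = "http://localhost:9200"

    # Target index: 3 primary shards, 1 replica, best_compression codec
    requests.put(f"{ES}/new-syslog", json={
        "settings": {
            "index": {
                "number_of_shards": 3,
                "number_of_replicas": 1,
                "codec": "best_compression"
            }
        }
    }).raise_for_status()

    # Copy the documents over; wait_for_completion=false returns a task id to poll
    resp = requests.post(f"{ES}/_reindex?wait_for_completion=false", json={
        "source": {"index": "old-syslog"},
        "dest": {"index": "new-syslog"}
    })
    print(resp.json())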
I am new to Elasticsearch. Suppose we have a two-node cluster and a config of 2 primary shards and one replica for our single index, so node 1 has P0 and R1 and node 2 has P1 and R0. Now suppose later on I reduce the number of replicas to 0. Will the shards P0 and P1 automatically resize themselves to occupy the disk space vacated by the replicas, giving me more disk space for indexing than I had when I still had the replicas?
A replica shard takes more or less the same space as its primary, since both contain the same documents. So, say you have indexed 1 million documents into your index: each primary shard then contains more or less half that amount, i.e. 500K documents, and each replica contains the same number of documents as well.
If each document weighs 1KB, then:
The primary shard P0 has 500K documents weighing 500MB
The replica shard R0 has 500K documents weighing 500MB
The primary shard P1 has 500K documents weighing 500MB
The replica shard R1 has 500K documents weighing 500MB
Which means that your index occupies 2GB of disk space across your nodes. If you later reduce the number of replicas to 0, then that will indeed free up 1GB of space that your primary shards will be able to grow into.
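For illustration, dropping the replicas is a single settings update; a minimal sketch in Python with the requests library, with my-index as a placeholder index name:

    import requests

    ES = "http://localhost:9200"

    # Setting number_of_replicas to 0 deletes the replica copies and
    # frees the disk space they were using.
    requests.put(f"{ES}/my-index/_settings", json={
        "index": {"number_of_replicas": 0}
    }).raise_for_status()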
However, note that by doing so, you certainly gain disk space, but you won't have any redundancy anymore and you will not be able to spread your index over two nodes, which is the main idea behind replicas to begin with.
The other thing is that the size of a shard is bounded by a physical limit that it will not be able to cross. That limit depends on many factors, among which are the amount of heap and the total physical memory you have. If you have 2GB of heap and 50GB of disk space, you cannot expect to index 50GB of data into your index; it either won't work, or it will be very slow and unstable.
=> So disk space alone should not be the main driver for sizing your shards. Having enough disk space is a necessary condition but not a sufficient one; you also need to look at the RAM and the heap allocated to your ES nodes.
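To see how your shard sizes line up with the heap, RAM and disk on each node, the _cat APIs give a quick overview; same assumptions as above (Python + requests, localhost:9200):

    import requests

    ES = "http://localhost:9200"

    # On-disk size of every shard and which node it lives on
    print(requests.get(f"{ES}/_cat/shards", params={
        "v": "true", "h": "index,shard,prirep,store,node"
    }).text)

    # Heap, RAM and disk per node
    print(requests.get(f"{ES}/_cat/nodes", params={
        "v": "true", "h": "name,heap.max,ram.max,disk.total,disk.used_percent"
    }).text)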
We are seeing frequent full GCs even though memory utilization is only 30-40%. We have not changed the default JVM settings.
Cluster details
One master node
Two data nodes
All the replicas are allocated on one node and all the primaries are on the other node. That should not be a problem, right? Also, frequent full GC is observed on one of the two data nodes.
Shards -
We have around 75 shards with a replication factor of 1. I guess the problem is because of shard over-allocation.
Most of the queries we hit are aggregation queries.
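For reference, this is how I've been checking heap usage and old-generation GC activity per node (a sketch in Python with the requests library against localhost:9200):

    import requests

    ES = "http://localhost:9200"

    stats = requests.get(f"{ES}/_nodes/stats/jvm").json()
    for node in stats["nodes"].values():
        jvm = node["jvm"]
        old_gc = jvm["gc"]["collectors"]["old"]
        print(
            node["name"],
            f"heap_used={jvm['mem']['heap_used_percent']}%",
            f"old_gc_count={old_gc['collection_count']}",
            f"old_gc_time_ms={old_gc['collection_time_in_millis']}",
        )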