Elasticsearch path.data with multiple disks, adding more

When I originally set up my Elasticsearch cluster, it was recommended to "stripe" the data across multiple disks like this:
path.data: [ /disk1, /disk2, /disk3 ]
Which I did previously, and it has worked fine, but now I need to add more space (more disks), which I plan to do like this:
path.data: [ /disk1, /disk2, /disk3, /disk4, /disk5 ]
I have not been able to find any authoritative reference that indicates how the data will be re-balanced (or not). It seems that the behavior has changed somewhat over the years/versions, so googling has been difficult.
All the docs say about it is: "path.data settings can be set to multiple paths, in which case all paths will be used to store data" which is rather vague.
I am running Elasticsearch 5.6.
I would like to understand what will happen when disks 1, 2, and 3 are above the 85% "low watermark" (but not yet at the 90% high watermark) and I introduce the 2 new disks to the mix. Will new indices go to the 2 new disks only?
The docs say: "ES will not allocate new shards to nodes once they have more than 85% disk used". Does this mean the whole node, or just the disks that are at 85% on that node?
My indices hold daily logging data and are pruned with Curator every N days, so I imagine things will even out at some point, but it may take a while. Is there any way to proactively relocate shards to a different disk, or should I just let it self-balance over time?

Using multiple disks (via multiple data paths) is NOT striping. Data is distributed across the paths by shard count, not by disk space usage. Even if only a single disk goes past the watermark, the whole node is affected. So adding new disks to path.data won't redistribute existing data onto the new disks.
If you want actual striping, use at least RAID 0 (or another RAID level, depending on your data-safety requirements).
Refer to the data storage architecture documentation.
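If you want to verify how space is actually being used per node, the cat allocation and node filesystem stats APIs are handy; for example (host is illustrative):

curl -s 'localhost:9200/_cat/allocation?v'
# shard count, disk used and disk available per node

curl -s 'localhost:9200/_nodes/stats/fs?pretty'
# per-path filesystem stats, one entry for each directory in path.data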

Related

Why does Elasticsearch allow you to specify multiple disk partitions in the .yml file if it doesn't balance shards across partitions?

This is a follow-up to a question I asked previously here.
I have a cluster with three data nodes and one head node. The hard drive on each data node has three partitions: /data1, /data2 and /data3. I configured my elasticsearch.yml on the head node like this:
path.data: /data1/elasticsearch, /data2/elasticsearch_2, /data3/elasticsearch_3
My existing index is stored in /data1/elasticsearch on each node. However, when I disable replication and try to load the data for my new index, I trigger the low watermark cluster setting; /data1 doesn't have enough space.
Looking through the Elasticsearch documentation I found this warning:
Elasticsearch does not balance shards across a node’s data paths. High disk usage in a single path can trigger a high disk usage watermark for the entire node. If triggered, Elasticsearch will not add shards to the node, even if the node’s other paths have available disk space. If you need additional disk space, we recommend you add a new node rather than additional data paths.
So my question is: why does Elasticsearch allow you to specify multiple paths for data storage if it doesn't allocate shards to the next empty path on the node?
The option to use multiple data paths won't be allowed for much longer; this feature has several problems, for example the one you mentioned, and the fact that Kibana can show the wrong free space when using multiple disks on the same node.
The use of multiple data paths is planned to be deprecated in version 7.13 and removed in version 8.0, according to this GitHub issue.
According to the same issue:
(...) multiple-data-paths is a feature that has a very high cost (numerous bugs and deficiencies in its design), relatively few users, and most importantly, better alternatives that are standard outside of Elasticsearch.
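The alternative usually suggested instead of multiple data paths is to run one node per disk on the same machine (or present a single volume via RAID/LVM). A rough sketch of the one-node-per-disk approach, assuming disks mounted at /disk1 and /disk2 (names are illustrative):

# elasticsearch.yml for the first node on the host
cluster.name: my-cluster
node.name: node-a
path.data: /disk1/elasticsearch

# elasticsearch.yml for the second node on the same host
cluster.name: my-cluster
node.name: node-b
path.data: /disk2/elasticsearch

# optional, cluster-wide: keep a primary and its replica off the same physical host
cluster.routing.allocation.same_shard.host: true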

Setting up a basic Elasticsearch cluster

I'm new to Elasticsearch and would like someone to help me clarify a few concepts.
I'm designing a small cluster with the following requirements:
everything should still work when restarting one of the machines, one at a time (eg: OS updates)
a single disk failure is ok
heavy indexing should not impact query performance
How many master, data, ingest nodes should I have?
or do I need 2 clusters?
The indexing workload is purely indexing structured text documents, with no processing/rules... do I even need an ingest node?
Also, does each node have a complete copy of all the data, or does only the cluster as a whole have the complete copy?
Be sure to read the documentation about Elasticsearch terminology at the very least.
With the default of 1 replica (primary shard and one replica shard) you can survive the failure of 1 Elasticsearch node (failed disk, restart, upgrade,...).
"heavy indexing should not impact query performance": You'll need to size your cluster correctly to handle both the indexing and searching. If you want to read current data and you do heavy updates, that will take up resources and you won't be able to fully decouple it.
By default every node is a data, ingest, and master-eligible node. A minimal HA setup needs 3 nodes. If you don't use ingest, that's fine; it won't take up resources when you're not using it.
To understand which node has which data, you need to read up on the concept of shards. Basically, every index is broken up into 1 to N shards (the current default is 5), and there is one primary and one replica copy of each of them (by default).
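To make that concrete, here is a minimal sketch of what each node's elasticsearch.yml could look like for a 3-node cluster keeping the default roles (names, addresses, and the pre-7.x discovery settings are illustrative):

cluster.name: logging-cluster
node.name: node-1                      # node-2 / node-3 on the other machines
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
discovery.zen.minimum_master_nodes: 2  # quorum for 3 master-eligible nodes

With the default of 1 replica per shard, such a cluster keeps serving reads and writes while any one node is down.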

Using multiple node clients in elasticsearch

I'm trying to think of ways to scale our Elasticsearch setup. Do people use multiple node clients on an Elasticsearch cluster and put a load balancer/reverse proxy like Nginx in front of them? Other ideas would be great.
So I'd start by recapping the three different kinds of nodes you can configure in Elasticsearch:
Data node - node.data set to true and node.master set to false - these are the core nodes of an Elasticsearch cluster, where the data is stored.
Dedicated master node - node.data set to false and node.master set to true - these are responsible for managing the cluster state.
Client node - node.data set to false and node.master set to false - these respond to client data requests, querying the data nodes and gathering the results to return to the client.
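In elasticsearch.yml terms (using the older node.master/node.data style settings this answer refers to), those three types would be configured roughly like this:

# Data node
node.master: false
node.data: true

# Dedicated master node
node.master: true
node.data: false

# Client node
node.master: false
node.data: false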
By splitting the functions into 3 different base node types you have a great degree of granularity and control in managing the scale of your cluster. As each node type handles a more isolated set of responsibilities you are better able to tune each one and to scale appropriately.
For data nodes, it's a matter of handling indexing and query responses, along with making certain you have enough storage allocated to each node. You'll want to monitor storage usage and disk throughput for each node, along with CPU and memory usage. You want to avoid configurations where you run out of disk or saturate disk throughput while still having substantial excess CPU and memory, or the reverse, where memory and CPU max out but you have lots of disk available. The best way to determine this is through some benchmarking of typical indexing and querying activities.
For master nodes, you should always have at least 3, and always an odd number. The quorum should be set to N/2 + 1, where N is the number of master-eligible nodes; this way you don't run into split-brain issues with your cluster. Dedicated master nodes tend not to be heavily loaded, so they can be quite small.
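For example, with 3 master-eligible nodes the quorum is 3/2 + 1 = 2. On the pre-7.x versions this answer targets, that would be set in elasticsearch.yml or dynamically, roughly like this (host is illustrative):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": { "discovery.zen.minimum_master_nodes": 2 }
}'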
For client nodes, you can indeed put them behind a load balancer, or use DNS entries to point to them. They are easily scaled up and down by just adding more to the cluster, and should be added both for redundancy and as you see CPU and memory usage climb. There's not much need for a lot of disk.
No matter what your configuration, in addition to benchmarking likely loads ahead of time, I'd strongly advise close monitoring of CPU, memory, and disk - ES is easy to start rolling out, but it does need watching as you scale into larger numbers of transactions and more nodes. Dealing with a yellow or red cluster status due to node failures from memory or disk exhaustion is not a lot of fun.
I'd take a close read of this article for some background:
http://elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
Plus this series of articles:
http://elastic.co/guide/en/elasticsearch/guide/current/distributed-cluster.html

Elasticsearch one storage for all nodes

In Oracle we use RMAN with one shared storage for all cluster nodes. Can we have multiple data nodes but only one storage disk?
Can all the storage nodes act like RAID 5?
You can always mount the same disk on different nodes and store data there, but that defies the philosophy of replicas and sharding in Elasticsearch. The idea of replicas is to keep the cluster alive even if a hard disk or other piece of hardware goes down; here, if the single disk goes down, we lose the entire cluster's data. And with sharding, we try to apply parallel computation on independent hardware to improve performance, but here we'd be using the same disk. So I don't think this is a good idea.
If you are still going ahead with this plan, make sure you use forced awareness so that unwanted replicas are not created.
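For reference, forced awareness is driven by a node attribute plus two cluster settings. A rough sketch, assuming an attribute named zone with two values (the attribute name and values are made up; newer versions spell the node setting node.attr.zone):

# elasticsearch.yml on each node (older style)
node.zone: zone1            # zone2 on the other group of nodes

# cluster-wide settings
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2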

Elasticsearch cluster leaving shards unassigned

We're running an Elasticsearch cluster for logging, indexing logs from multiple locations using Logstash. We recently added two additional nodes for extra capacity while we await further hardware for the cluster's expansion. Ultimately we aim to have 2 nodes for "realtime" data running on SSDs to provide fast access to recent data, ageing the data over to HDDs for older indices. The new nodes we put in had a lot less storage than the existing boxes (700 GB vs 5 TB), but given this would be similar to the situation we'd have once we implemented SSDs, I didn't foresee it being much of a problem.
As a first attempt, I threw the nodes into the cluster trusting that the new disk-space-based allocation rules would mean they wouldn't instantly get filled up. Unfortunately that wasn't the case: I awoke to find the cluster had merrily reallocated shards onto the new nodes, filling them in excess of 99%. After some jigging of settings I managed to remove all data from these nodes and return the cluster to its previous state (all shards assigned, cluster state green).
As a next approach I tried to implement index/node tagging similar to my plans for when we implement SSDs. This left us with the following configuration:
Node 1 - 5TB, tags: realtime, archive
Node 2 - 5TB, tags: realtime, archive
Node 3 - 5TB, tags: realtime, archive
Node 4 - 700GB, tags: realtime
Node 5 - 700GB, tags: realtime
(all nodes running Elasticsearch 1.3.1 and Oracle Java 7u55)
Using Curator I then tagged indices older than 10 days as "archive" and more recent ones as "realtime". Behind the scenes this sets the index shard allocation "require" setting, which, as I understand it, requires the node to have the tag, but not ONLY that tag.
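For reference, the relevant pieces look roughly like this (simplified; the index name is just an example):

# elasticsearch.yml on nodes 4 and 5
node.tag: realtime

# what Curator effectively applies to an index older than 10 days
curl -XPUT 'localhost:9200/logstash-2014.07.20/_settings' -d '{
  "index.routing.allocation.require.tag": "archive"
}'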
Unfortunately this doesn't appear to have had the desired effect. Most worryingly, none of the indices tagged as archive are allocating their replica shards, leaving 295 unassigned shards. Additionally the realtime-tagged indices are only using nodes 4, 5 and, oddly, 3. Node 3 has no shards except the very latest index and some kibana-int shards.
If I remove the tags and use exclude._ip to pull shards off the new nodes, I can (slowly) return the cluster to green, as this is the approach I took when the new nodes had filled up completely, but I'd really like to get this setup sorted so I can have confidence the SSD configuration will work when the new kit arrives.
I have attempted to set cluster.routing.allocation.allow_rebalance to always, on the theory that the cluster wasn't rebalancing due to the unassigned replicas.
I've also tried setting cluster.routing.allocation.enable to all, but again, this has had no discernible impact.
Have I done something obviously wrong? Or are there diagnostics of some sort I could use? I've been visualising the allocation of shards using the Elasticsearch Head plugin.
Any assistance would be appreciated, hopefully it's just a stupid mistake that I can fix easily!
Thanks in advance
This probably doesn't fully answer your question, but seeing as I was looking at these docs this morning:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk
You should be able to set watermarks on disk usage in your version to avoid this reoccurring.
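Something along these lines (values are illustrative; these are the dynamic disk-threshold settings described in that doc):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "cluster.routing.allocation.disk.threshold_enabled": true,
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}'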
For (manual) monitoring of clusters I quite like
https://github.com/lmenezes/elasticsearch-kopf
Currently watching my cluster sort out its shards again (so slow) after a similar problem, but I'm still running an ancient version.
