Im new to elasticsearch and would like someone to help me clarify a few concepts
Im designing a small cluster with the following requirements
everything should still work when restarting one of the machines, one at a time (eg: OS updates)
a single disk failure is ok
heavy indexing should not impact query performance
How many master, data, ingest nodes should I have?
or do I need 2 clusters?
the indexing workload is purely indexing structured text documents, no processing/rules... do I even need an ingest node?
Also, does each node have a complete copy of the all the data? or only a cluster has the complete copy?
Be sure to read the documentation about Elasticsearch terminology at the very least.
With the default of 1 replica (primary shard and one replica shard) you can survive the failure of 1 Elasticsearch node (failed disk, restart, upgrade,...).
"heavy indexing should not impact query performance": You'll need to size your cluster correctly to handle both the indexing and searching. If you want to read current data and you do heavy updates, that will take up resources and you won't be able to fully decouple it.
By default every node is a data, ingest, and master-eligible node. The minimum HA setting needs 3 nodes. If you don't use ingest that's fine; it won't take up resources when you're not using it.
To understand which node has which data, you need to read up on the concept of shards. Basically every index is broken up into 1 to N shards (current default is 5) and there is one primary and one replica copy of each one of them (by default).
Related
Background
We're designing the architecture of a new system using Elasticsearch now, and we plan to use Elastic Cloud based on reviews contrasting their service with AWS's, and self-hosting on an EC2 instance. As we design the system, I'm trying to learn from a small test project my team deployed on Elastic Cloud 6 months ago. While I've spent a lot of time reading the Elasticsearch Docs, Elasticsearch: The Definitive Guide, and Elastic Cloud's Docs, there are some concepts here that I'm still not understanding.
Our Test Project's issues
Our test project uses the default of 5 primary shards and 1 replica shard per primary. It was configured using the default deployment options on Elastic Cloud with a single one node, currently with 2GB of memory. Because there is only one node, and because replica shards are never assigned to the same node as their primary shard (reason 2), none of the replicas are getting assigned. Also, this project uses time-based data, and is creating one index per account per day, resulting in about 10 indexes per day (or 100 shards), and over time, the proverbial Kagillion Shards. This system was only ever meant to have several months of data on it at a time, so the solution has been to manually delete old data when memory on this deployment runs out.
The New System
Our new system is meant to have 5 years worth of time based-data on it, which is projected to grow to 250 GB in size. The current implementation uses a single index for the time-based data, with 6 primary shards and 1 replica per primary. This decision was made based on reading that a single shard should aim for a maximum of 30GB in size.
Questions
Our old system had one node with too many indexes (over 100) and too many shards (over 1000), and it seems like our new one is being designed with too few (one index for 5+ years of data). It seems a better indexing strategy according to the time-based data recommendations would be to create one index per week or month? That being said, according to another answer on SO the optimal number of indexes per node is 1, so what is the utility in creating multiple indices for time-based data in the first place if we're only running on one node?
How does one add a node to an ES deployment in Elastic Cloud? Currently all of the replica nodes in the test project are unassigned, because the deployment only has one node. There is a slider which allows you to easily choose the memory of each node in a deployment (between 1GB and 250B), however I see no way to add multiple nodes, which is confusing because it seems like basic functionality for Elasticsearch.
Our test project's node has restarted several times, always when there is lots of old data on the node, and therefore memory pressure. The solution has been to delete old data (as the test project was only meant to have several months of data at a time), but it appears the node didn't lose data when it restarted. Why would this be?
Our test project has taken no snapshots, which are supposed to happen automatically on Elastic Cloud every 30 minutes. I've asked their support about this, but just curious to see if anyone knows what could cause this and how to resolve it?
Our test project uses the default of 5 primary shards and 1 replica shard per primary. It was configured using the default deployment options on Elastic Cloud with a single one node
Clearly, on a single node, you cannot have replicas. So your index should have been configured with 0 replicas and you can do it dynamically to get your cluster back to green (PUT index/_settings {"index.number_of_replicas": 0}), simple as that.
Also, this project uses time-based data, and is creating one index per account per day, resulting in about 10 indexes per day (or 100 shards)
I cannot tell if 50 new primary shards (10 index) per day were reasonable or not because you don't give any information regarding the volume of data in your test project. But it's probably too many.
It seems a better indexing strategy according to the time-based data recommendations would be to create one index per week or month?
Having five years worth of data in a single index is perfectly possible, it doesn't really depend on how old the data is, but on how big it grows. You mention 250GB and also that you know a shard shouldn't grow over 30GB (and that again depends on the spec of your hardware underneath, more on that later), but since you have only 6 shards for that index, it means that each shard will grow over 40GB (which is ok according to this), but to be on the safe side, you should probably increase to 8-9 shards, or you split your data into yearly/monthly indices.
The 30GB-ish limit per shard is also dependent on how much heap your nodes have. If you have nodes with 2GB heap, then having 30GB shards is clearly too big. Since you're on ES Cloud and you plan to have 250GB of data, you must have chosen a node capacity of 16GB heap + 384GB storage (or bigger). So with 16GB heap, it's reasonable to have 30GB shards, but you'll need several nodes in my opinion. You can verify how many nodes you have using GET _cat/nodes?v.
That being said, according to another answer on SO the optimal number of indexes per node is 1...
What Chris is saying is a theoretical/ideal setting, which is almost never possible/advisable/desired to do in reality. You do want to have several shards in your index and the reason is that when your data grows, you want to be able to scale to more than one node, that's the whole point of ES, otherwise you'd be better off embedding the Lucene library directly in your project.
..., so what is the utility in creating multiple indices for time-based data in the first place if we're only running on one node?
First check how many nodes you have in your cluster using GET _cat/nodes?v, but clearly if you're assigned a single node for 250GB of data split on 6-8 shards, a single node is not ideal, indeed.
How does one add a node to an ES deployment in Elastic Cloud?
Right now, you can't. However, at the last Elastic{ON} conference, Elastic announced that it will be possible to pick the number of nodes or the kind of deployment (hot/warm, etc) you want to set up.
Currently all of the replica nodes in the test project are unassigned, because the deployment only has one node.
You don't really need replicas in a test project, right?
The solution has been to delete old data (as the test project was only meant to have several months of data at a time), but it appears the node didn't lose data when it restarted. Why would this be?
How did you delete the data? Between the time you deleted the data and before the node restarted, did you witness that the data was indeed gone?
Our test project has taken no snapshots, which are supposed to happen automatically on Elastic Cloud every 30 minutes.
This is weird, since on ES cloud your cluster generally gets snapshotted every 30 minutes. What do you see under Deployments > cluster-id > Elasticsearch > Snapshots? What does the ES Cloud support say about it? What do you get when running GET _cat/repositories?v and GET _cat/snapshots/found-snapshots?v? (update your question with the results)
We're running an elasticsearch cluster for logging, indexing logs from multiple locations using logstash. We recently added two additional nodes for additional capacity whilst we await further hardware for the cluster's expansion. Ultimately we aim to have 2 nodes for "realtime" data running on SSDs to provide fast access to recent data, and ageing the data over to HDDs for older indicies. The new nodes we put in had a lot less memory than the existing boxes (700GB vs 5TB), but given this will be similar to the situation we'd have when we implemented SSDs, I didn't forsee it being much of a problem.
As a first attempt, I threw the nodes into the cluster trusting the new Disk spaced based allocation rules would mean they wouldn't instantly get filled up. This unfortunately wasn't the case, I awoke to find the cluster had merrily reallocated shards onto the new nodes, in excess of 99%. After some jigging of settings I managed to remove all data from these nodes and return the cluster to it's previous state (all shards assigned, cluster state green).
As a next approach I tried to implement index/node tagging similar to my plans for when we implement SSDs. This left us with the following configuration:
Node 1 - 5TB, tags: realtime, archive
Node 2 - 5TB, tags: realtime, archive
Node 3 - 5TB, tags: realtime, archive
Node 4 - 700GB, tags: realtime
Node 5 - 700GB, tags: realtime
(all nodes running elasticsearch 1.3.1 and oracle java 7 u55)
Using curator I then tagged indicies older than 10days as "archive" and more recent ones "realtime". This in the background sets the index shard allocation "Require". Which my understanding is it will require the node to have the tag, but not ONLY that tag.
Unfortunately this doesn't appeared to have had the desired effect. Most worryingly, no indices tagged as archive are allocating their replica shards, leaving 295 unassigned shards. Additionally the realtime tagged indicies are only using nodes 4, 5 and oddly 3. Node 3 has no shards except the very latest index and some kibana-int shards.
If I remove the tags and use exclude._ip to pull shards off the new nodes, I can (slowly) return the cluster to green, as this is the approach I took when the new nodes had filled up completely, but I'd really like to get this setup sorted so I can have confidence the SSD configuration will work when the new kit arrives.
I have attempted to enable: cluster.routing.allocation.allow_rebalance to always, on the theory the cluster wasn't rebalancing due to the unassigned replicas.
I've also tried: cluster.routing.allocation.enable to all, but again, this has had no discernable impact.
Have I done something obviously wrong? Or is there disagnostics of some sort I could use? I've been visualising the allocation of shards using Elasticsearch Head plugin.
Any assistance would be appreciated, hopefully it's just a stupid mistake that I can fix easily!
Thanks in advance
This probably doesn't fully answer your question, but seeing as I was looking at these docs this morning:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk
You should be able to set watermarks on disk usage in your version to avoid this reoccurring.
For (manual) monitoring of clusters I quite like
https://github.com/lmenezes/elasticsearch-kopf
Currently watching my cluster sort out it's shards again (so slow) after a similar problem, but I'm still running an ancient version.
We have java application with embedded Elasticsearch in a cluster of 14 nodes. All the data resides in a central database, and they are indexed in elasticsearch for querying. A full reindex can be done at any time.
The system are very query-heavy, the amount of writes are small. The number of documents will not be higher than, say, 300.000.
The size of each document varies greatly, from just a couple of ids, to extracted text from e.g word-documents of several pages.
I want to make sure that in case of a total breakdown, it should be sufficient that one or two nodes are available for the system to work.
Write consistency should not be a problem since the master copy of the data is in the database, and it seems that ES is capable of resolving conflicting data by using the newest version (which should be all right in our case)
My first though is to use 1 shard, and 13 replicas. This will naturally ensure that all nodes have access to all data. This could also be accomplished by having 2 shards / 13 replicas, so this yield that to ensure that all data is available, the number of replicas should be the number of nodes - 1, not depending on the number of shards (which could be anything).
If the requirement of number of nodes are reduced to "2 nodes should be up at any time", then a shards / replica distribution of "x/number of nodes - 2" should be sufficient.
So, for the question:
Asserting the above setup and that my thoughts is correct, would a setup with 1 shard / 13 replicas make sense or would there be anything to gain by adding more shards and run e.g a 4 shards/13 replicas setup?
After a good bit of research and talking to ES-gurus;
As long as the shard size is small enough, the most efficient way of setting up this cluster would indeed be 1 shard only, with 13 replicas. I have not been able to pinpoint the threshold size of the shard for this starting to perform worse.
If the index is big... you will need more than one shard (if you want perfomance). Do You really need 13 replica? When you put only 2 replicas, ES manage that to keep it that way, if the principal node fail, ES will create a new reply. May be you will need a balancer node too.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
I'm in the middle of attempting to replace a Solr setup with Elasticsearch. This is a new setup, which has not yet seen production, so I have lots of room to fiddle with things and get them working well.
I have very, very large amounts of data. I'm indexing some live data and holding onto it for 7 days (by using the _ttl field). I do not store any data in the index (and disabled the _source field). I expect my index to stabilize around 20 billion rows. I will be putting this data into 2-3 named indexes. Search performance so far with up to a few billion rows is totally acceptable, but indexing performance is an issue.
I am a bit confused about how ES uses shards internally. I have created two ES nodes, each with a separate data directory, each with 8 indexes and 1 replica. When I look at the cluster status, I only see one shard and one replica for each node. Doesn't each node keep multiple indexes running internally? (Checking the on-disk storage location shows that there is definitely only one Lucene index present). -- Resolved, as my index setting was not picked up properly from the config. Creating the index using the API and specifying the number of shards and replicas has now produced exactly what I would've expected to see.
Also, I tried running multiple copies of the same ES node (from the same configuration), and it recognizes that there is already a copy running and creates its own working area. These new instances of nodes also seem to only have one index on-disk. -- Now that each node is actually using multiple indices, a single node with many indices is more than sufficient to throttle the entire system, so this is a non-issue.
When do you start additional Elasticsearch nodes, for maximum indexing performance? Should I have many nodes each running with 1 index 1 replica, or fewer nodes with tons of indexes? Is there something I'm missing with my configuration in order to have single nodes doing more work?
Also: Is there any metric for knowing when an HTTP-only node is overloaded? Right now I have one node devoted to HTTP only, but aside from CPU usage, I can't tell if it's doing OK or not. When is it time to start additional HTTP nodes and split up your indexing software to point to the various nodes?
Let's clarify the terminology a little first:
Node: an Elasticsearch instance running (a java process). Usually every node runs on its own machine.
Cluster: one or more nodes with the same cluster name.
Index: more or less like a database.
Type: more or less like a database table.
Shard: effectively a lucene index. Every index is composed of one or more shards. A shard can be a primary shard (or simply shard) or a replica.
When you create an index you can specify the number of shards and number of replicas per shard. The default is 5 primary shards and 1 replica per shard. The shards are automatically evenly distributed over the cluster. A replica shard will never be allocated on the same machine where the related primary shard is.
What you see in the cluster status is weird, I'd suggest to check your index settings using the using the get settings API. Looks like you configured only one shard, but anyway you should see more shards if you have more than one index. If you need more help you can post the output that you get from elasticsearch.
How many shards and replicas you use really depends on your data, the way you access them and the number of available nodes/servers. It's best practice to overallocate shards a little in order to redistribute them in case you add more nodes to your cluster, since you can't (for now) change the number of shards once you created the index. Otherwise you can always change the number of shards if you are willing to do a complete reindex of your data.
Every additional shard comes with a cost since each shard is effectively a Lucene instance. The maximum number of shards that you can have per machine really depends on the hardware available and your data as well. Good to know that having 100 indexes with each one shard or one index with 100 shards is really the same since you'd have 100 lucene instances in both cases.
Of course at query time if you want to query a single elasticsearch index composed of 100 shards elasticsearch would need to query them all in order to get proper results (unless you used a specific routing for your documents to then query only a specific shard). This would have a performance cost.
You can easily check the state of your cluster and nodes using the Cluster Nodes Info API through which you can check a lot of useful information, all you need in order to know whether your nodes are running smoothly or not. Even easier, there are a couple of plugins to check those information through a nice user interface (which internally uses the elasticsearch APIs anyway): paramedic and bigdesk.
I have an Oracle Database with around 7 millions of records/day and I want to switch to MongoDB. (~300Gb)
To setup a POC, I'd like to know how many nodes I need? I think 2 replica of 3 node in 2 shard will be enough but I want to know your thinking about it :)
I'd like to have an HA setup :)
Thanks in advance!
For MongoDB to work efficiently, you need to know your working set size..You need to know how much data does 7 million records/day amounts to. This is active data that will need to stay in RAM for high performance.
Also, be very sure WHY you are migrating to Mongo. I'm guessing..in your case, it is scalability..
but know your data well before doing so.
For your POC, keeping two shards means roughly 150GB on each.. If you have that much disk available, no problem.
Give some consideration to your sharding keys, what fields does it make sense for you to shared your data set on? This will impact on the decision of how many shards to deploy, verses the capacity of each shard. You might go with relatively few shards maybe two or three big deep shards if your data can be easily segmented into half or thirds, or several more lighter thinner shards if you can shard on a more diverse key.
It is relatively straightforward to upgrade from a MongoDB replica set configuration to a sharded cluster (each shard is actually a replica set). Rather than predetermining that sharding is the right solution to start with, I would think about what your reasons for sharding are (eg. will your application requirements outgrow the resources of a single machine; how much of your data set will be active working set for queries, etc).
It would be worth starting with replica sets and benchmarking this as part of planning your architecture and POC.
Some notes to get you started:
MongoDB's journaling, which is enabled by default as of 1.9.2, provides crash recovery and durability in the storage engine.
Replica sets are the building block for high availability, automatic failover, and data redundancy. Each replica set needs a minimum of three nodes (for example, three data nodes or two data nodes and an arbiter) to enable failover to a new primary via an election.
Sharding is useful for horizontal scaling once your data or writes exceed the resources of a single server.
Other considerations include planning your documents based on your application usage .. for example, if your documents will be updated frequently and grow in size over time, you may want to consider manual padding to prevent excessive document moves.
If this is your first MongoDB project you should definitely read the FAQs on Replica Sets and Sharding with MongoDB, as well as for Application Developers.
Note that choosing a good shard key for your use case is an important consideration. A poor choice of shard key can lead to "hot spots" for data writes, or unbalanced shards if you plan to delete large amounts of data.