Elasticsearch node is created automatically

I work with a single elasticsearch node in development.
When I do about 10,000 PUT requests (about 64 MB of data) in a short time period, another node is created automatically.
Why does this happen?
After this happens, the cluster stays in a yellow state until I shut down this node manually.
I'm using logstash to put the data into elasticsearch.
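A quick way to see what that extra node actually is (a sketch, assuming a default local install listening on port 9200 and an Elasticsearch version with the cat API, i.e. 1.0 or later):

curl 'localhost:9200/_cat/nodes?v'   # lists every node that has joined the cluster, with host and node name

If a second entry shows up on the same host while logstash is running, it may be logstash itself joining the cluster as a client node (the older elasticsearch output's node protocol behaves that way), or a second Elasticsearch process that was started by accident.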

Related

Can you run an elasticsearch data node after deleting the data folder?

I am running a three-node Elasticsearch (ELK) cluster. All nodes have the same roles (data, master, etc.). The disk on node 3 that holds the data folder became corrupt, and that data is probably unrecoverable. The other nodes are running normally and one of them has assumed the master role instead.
Will the cluster work normally if I replace the disk and make the empty directory available to elastic again, or am I risking crashing the whole cluster?
EDIT: As this is not explicitly mentioned in the answer, yes, if you add your node with an empty data folder, the cluster will continue normally as if you added a new node to the cluster, but you have to deal with the missing data. In my case, I lost the data as I do not have replicas.
Let me try to explain this in a simple way.
Your data got corrupted on node-3, so if you add that node again, it will not have the older data, i.e. the shards stored on node-3 will remain unavailable to the cluster.
Did you have replica shards configured for the indexes?
What is the current status (yellow/red) of the cluster with node-3 removed?
If a primary shard isn't available, the master node promotes one of the active replicas to become the new primary. If there are currently no active replicas, the status of the cluster will remain red.
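To see exactly which shards are affected, a quick check along these lines can help (a sketch, assuming the cluster listens on localhost:9200 and a version with the cat API):

curl 'localhost:9200/_cluster/health?pretty'            # overall status plus the number of unassigned shards
curl 'localhost:9200/_cat/shards?v' | grep UNASSIGNED   # which index/shard copies currently have no node

If only replica copies show up as UNASSIGNED the cluster will be yellow; if any primary is unassigned it will be red, matching the promotion rule described above.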

Why Druid segments become unavailable after data ingestion

The Druid cluster shows certain segments of a data source as unavailable after data ingestion.
Example: 72.4% available (2352 segments, 647 segments unavailable)
We have a clustered deployment with 3 nodes:
Master node (coordinator and overlord)
Data node (historical and middlemanager)
Query node (broker and router)
Is there any specific reason why this is happening?
The issue was resolved after a clean restart of the master and data nodes. However, just restarting the nodes without cleaning the data did not work.
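For diagnosing this kind of situation, the Druid coordinator exposes a load-status endpoint that reports, per datasource, how much of the data is being served (a sketch, assuming the coordinator listens on its default port 8081 and a hypothetical host name coordinator-host):

curl 'http://coordinator-host:8081/druid/coordinator/v1/loadstatus'   # percentage of segments loaded, per datasource

If the percentages stop climbing, the historicals are not picking the segments up (for example because their segment cache is full or they cannot reach deep storage), which would also fit the observation that a clean restart of the master and data nodes fixed it.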

Elasticsearch: how many clusters and indexes do I need for 8 applications?

I have an ELK Stack set up and accepting log data from 2 of my applications, and everything is working OK. It's been running for 25 days and I have nearly 4 GB of data/documents on a 25 GB server.
My questions
I have 8 applications in total that I would like to hook up to my ELK Stack.
Is the one cluster OK for this, or do I need to add more clusters, say a cluster for each application's data? If so, how do I do that without having to re-index my data?
Why does cluster health say "yellow (244 of 488)"?
Should I index each application into its own index rather than the default "logstash-{todays-date}", like my-app-1-{todays-date}, my-app-2-{todays-date}, etc.?
Your help is greatly appreciated.
Your cluster is yellow because your logstash-* indices are configured with 1 replica and you probably have a single node. 244 of 488 means that you have 488 shards in all your indices but only 244 are assigned on your single node and 244 remain to be assigned to new nodes. This is not a problem per se, but if your current node were to fail for some reason, you'd probably lose some data, whereas if you had 2+ nodes, the data would be replicated on other nodes, your cluster would be green (and you'd see 488 of 488) and you'd have a lower risk of losing data.
As for your second question, nothing prevents you from storing all the logs from your eight applications in the same daily logstash indices. You just need to make sure that your logstash configuration accounts for each different app and adds a field with the application name (e.g. app: app1, app: app2, etc.) to the indexed log events, so that you can then distinguish within Kibana which app each log event came from.
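Once that field is indexed, you can filter on it from Kibana or directly against Elasticsearch, for example (a sketch, assuming the field is named app as above):

curl 'localhost:9200/logstash-*/_search?q=app:app1&size=5&pretty'   # only log events emitted by app1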
I have only used Elasticsearch and not the complete ELK stack, but I can give some ideas and guess what is going on. 488 = 2 x 244, so I guess there are unassigned replica shards in the single-machine cluster. You can update this setting ad hoc and set it to zero:
curl -XPUT 'localhost:9200/my_index/_settings' -d '
{"index" : {"number_of_replicas" : 0}}'
You should update the logstash index template not to use replicas when you are running just a single machine. Also, your shards seem to be only about 20 MB in size, so I'd recommend each index use just one shard instead of five; each shard consumes extra resources. Having multiple shards increases indexing speed but slows down queries, so you should check whether one shard is sufficient.
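A sketch of such a template change, assuming the stock logstash-* index pattern and the legacy _template API of the same era as the curl call above (the template name single_node_logstash is just an example):

curl -XPUT 'localhost:9200/_template/single_node_logstash' -d '
{
  "order": 1,
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}'

This only affects indices created after the template is stored; the shard count of existing indices cannot be changed, and their replica count still needs the _settings call shown above.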
An index per application per day would speed up querying if dashboards are mostly application-specific, and you can create a day-specific alias to be used by cross-application queries.
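A sketch of such an alias, assuming hypothetical per-application daily indices named my-app-1-2015.11.18 and my-app-2-2015.11.18:

curl -XPOST 'localhost:9200/_aliases' -d '
{
  "actions": [
    { "add": { "index": "my-app-1-2015.11.18", "alias": "all-apps-2015.11.18" } },
    { "add": { "index": "my-app-2-2015.11.18", "alias": "all-apps-2015.11.18" } }
  ]
}'

Cross-application dashboards can then query all-apps-2015.11.18, while application-specific dashboards keep hitting the smaller per-application indices.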

Elasticsearch replica automatically taking over a shard

Below I have 2 nodes on two servers, with 2 indexes each. The indexes are distributed across 2 shards with 1 replica.
The "Thor" node had some downtime, so "Iron_man" took over. That's fine.
As you can see, events_v1 is an index created before the downtime and venue_v1 was created after the downtime. Shouldn't "Thor", after coming back alive, automatically take over one shard, in the same way it handles the newly created venue index?
If yes, how should I configure the settings?
You do not need to configure anything for the above scenario, because Elasticsearch's default behavior is exactly what you are asking for. A replica shard is never allocated on the same node as its primary, so on a single node it stays unassigned; as soon as you add a new node, the replica shard will be allocated to that newly added node.
For more information, watch this video.
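A quick way to verify this once "Thor" rejoins (a sketch, assuming a version with the cat API and the index name events_v1 from the question):

curl 'localhost:9200/_cat/shards/events_v1?v'   # shows the primary (p) and replica (r) copies and which node holds each

With both nodes up you should see the primary on one node and the replica on the other; while only one node is up, the replica stays UNASSIGNED because a replica is never placed on the same node as its primary.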

Removal of the data folder is not synced in Elasticsearch upon index delete

We have an ES cluster with 2 nodes. When we delete an index, not all folders in the cluster (on the filesystem) are deleted, which causes some problems when restarting one server.
Then our deleted indices get redistributed in some weird state, and the cluster health is not green.
Example: we delete an index named someIndex, and after the deletion we check the filesystem and see this:
Node1
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\
Node2
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\someIndex (<-- still present)
Anyone know what's causing this?
ES-version: 0.90.5
There are two node directories on each of your machines (nodes\0 and nodes\1).
When you start Elasticsearch, you start up a node (in ES lingo). Your machine can host multiple nodes, which happens if you start Elasticsearch multiple times. The default setting for the HTTP port is the range 9200-9300; that means ES looks for a free port in that range and binds its node to it (the same is true for the transport module with 9300-9400).
So, if you start an ES process while another is still running, that is, while it's still bound to a port, you start a second node and ES will create a new directory for it. Maybe this happened when you issued a restart and ES couldn't shut down in time before the new node started up.
But now you have a third node in your cluster and ES will assign shards to it. Then you do a cluster restart or something similar and start one node on each of your machines. ES cannot find the shards that were assigned to the third node, because it isn't spun up, and it will show you a red or yellow state, depending on which shards live on that third node. If you delete your index data, you won't delete the data from this missing node.
If you don't care about the data, you can just shut down ES and delete these directories, or start two ES nodes on each of your machines and then delete the index again.
Then you could change the port settings to one specific port; that would prevent a second process from starting up, since it wouldn't be able to bind to a free port.
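A quick check before cleaning up (a sketch, assuming the node listens on localhost:9200; the _nodes API is available on 0.90.x as well):

curl 'localhost:9200/_nodes?pretty'   # lists every node currently in the cluster, with its host and ports

If this shows only the nodes you expect while a nodes\1 directory still exists on disk, that directory belongs to a node that is no longer running, and it can be deleted by hand once Elasticsearch is shut down, as described above.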
