Create two nodes on an Elasticsearch instance run as a service - elasticsearch

I'm using a single Linux machine and running Elasticsearch as a service (I start ES with the command service elasticsearch start). My configuration has only one node, so I would like to add a new node to my cluster (as a failover) and place the replicas there.
I'm trying to follow the solution to this question, but the answers are very messy and I can't find a way to achieve my goal.
Can someone explain to me (as clearly as possible, since I'm quite a newbie on Linux) what I have to do to add a new node to my Elasticsearch cluster?
Thanks in advance.

Just open a terminal and start another Elasticsearch instance.
The new node will be set up. You can run as many instances as you want.

Related

How to set up 2 nodes on Elasticsearch?

Hello enthusiastic people.
I am a student trying to learn the Elastic Stack.
I have 1 node installed on my local machine. I have also successfully installed Beats on another local machine to collect data and deliver it to my Logstash.
My question is: if I add another node, do I still need to install Kibana and Elasticsearch on it, and then connect it to my first node?
I have read in many places that a single node is prone to data loss.
Sorry for my noob question.
Any answer is much appreciated.
Thanks in advance.
A cluster with at least 3 nodes is a good way to keep your data safe and consistent.
A cluster can have one or more nodes.
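One reason 3 is the usual minimum is that electing a master needs a majority of the master-eligible nodes. The arithmetic can be sketched as below. This is only an illustration and not part of the original answer: the function names are hypothetical, and it assumes every node in the cluster is master-eligible.

```python
def majority(master_eligible_nodes):
    """Smallest number of nodes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many nodes can be lost while a majority is still reachable."""
    return master_eligible_nodes - majority(master_eligible_nodes)


# Compare cluster sizes of 1, 2 and 3 nodes.
for nodes in (1, 2, 3):
    print(
        nodes, "node(s): majority =", majority(nodes),
        "tolerated failures =", tolerated_failures(nodes),
    )
```

With 1 or 2 nodes no failure is tolerated under this rule; with 3 nodes one node can be lost.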
An example scenario:
While you are learning and developing, it will be easier to install with Docker. I recommend following the link below, which explains how to set up an Elasticsearch cluster with 3 nodes on Docker.
Start a multi-node cluster with Docker Compose

When configuring snapshots for an Elasticsearch cluster, do I do that on every node?

Sorry for what may be an obvious question. I have a 3-node Elasticsearch cluster, and I want it to take a nightly snapshot that is sent to S3 for recovery. I have done this for my test cluster, which is a single node. I was starting to do it for my 3-node production cluster when I began wondering whether I have to configure the repository and snapshot on each node separately, or whether I can do it on one node via Kibana and have it apply across the cluster. I have looked through the documentation but didn't see anything about this.
Thank you!
Yes, part of it has to be configured on every node.
First, you need to install the repository-s3 plugin on every node; this is explained in the documentation.
After that, you also need to add the access and secret keys to the elasticsearch-keystore of every node (see the documentation).
The rest of the configuration, creating the repository and setting up the snapshots, is done once through Kibana.
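The "done once" step amounts to a single REST call that any one node can receive. As a rough sketch of what registering the repository looks like outside Kibana, the snippet below only builds the request and does not send it. The cluster address, repository name, and bucket name are hypothetical placeholders, and it assumes the repository-s3 plugin and keystore credentials are already in place on every node.

```python
import json
import urllib.request


def build_s3_repository_request(base_url, repo_name, bucket):
    """Build (but do not send) the request that registers an S3 snapshot repository."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical values; replace them with your own cluster address and bucket.
request = build_s3_repository_request(
    "http://localhost:9200", "nightly_s3", "my-es-backups"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually register the repository, send the request to any one node:
# urllib.request.urlopen(request)
```

Because the repository definition is stored in the cluster state, sending this to one node is enough; only the plugin and the keys are per-node work.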

Unknown source of daily cleanup of indices

I have two separate Elasticsearch clusters. Each Elasticsearch node is a Docker container running in Docker Swarm. I aggregate logs from various microservices into indices, and one group of them is named in the format "logs-timestamp".
In one cluster I have those indices from previous days; in the other I only have the one from the current day.
This affects only the indices in the "logs-timestamp" format.
Do you have any idea what causes this, or a point from which I can start looking?
Does Elasticsearch have some form of built-in garbage collector?
PS: I didn't start this project, so I have fairly limited knowledge of the whole infrastructure.
You should check the ILM (index lifecycle management) policies documentation (here); ILM is one way of automatically removing old indices.
In short, check the result of this command in Kibana:
GET _ilm/policy
It will tell you whether any policy is configured.
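If the response is long, the part that matters for disappearing indices is whether a policy has a delete phase. A minimal sketch of filtering for that is shown below. The sample response is made up for illustration (the policy names are hypothetical), and the helper name is an assumption, not something from Elasticsearch itself.

```python
def policies_that_delete(ilm_response):
    """Return {policy_name: min_age} for every policy that has a delete phase."""
    found = {}
    for name, entry in ilm_response.items():
        phases = entry.get("policy", {}).get("phases", {})
        delete_phase = phases.get("delete")
        if delete_phase is not None:
            # min_age is how old an index must be before the phase applies.
            found[name] = delete_phase.get("min_age")
    return found


# Hypothetical response in the shape returned by GET _ilm/policy.
sample_response = {
    "logs-cleanup": {
        "version": 1,
        "policy": {
            "phases": {
                "hot": {"min_age": "0ms", "actions": {}},
                "delete": {"min_age": "1d", "actions": {"delete": {}}},
            }
        },
    },
    "keep-forever": {
        "version": 1,
        "policy": {"phases": {"hot": {"min_age": "0ms", "actions": {}}}},
    },
}

print(policies_that_delete(sample_response))
```

A policy that shows up here with a short min_age and is attached to the "logs-timestamp" indices would be a likely candidate for the daily cleanup.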
The other way I know of to curate indices automatically is Curator (see here and here). You should check whether Curator is installed somewhere in your infrastructure and, if so, check its configuration.
Hope it helps.

Elasticsearch snapshots to S3

I have an Elasticsearch 5.6.2 cluster with one master and two data nodes, and I am using Kibana for visualization. I want to enable automatic snapshots of the cluster to Amazon S3 every 30 minutes. How can I accomplish this? There is no proper documentation. I have also referred to the Curator docs and I have a question: do I need to configure Curator on one node or on each node?
Please help, guys.
Curator is an external process.
You only need to put it on one machine. It can be a node or any other machine.
It will send REST requests to Elasticsearch when needed.
Put it in your crontab and you should be fine.
You can also call the snapshot endpoint yourself from a shell script every 30 minutes and not use Curator at all.
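As a sketch of the do-it-yourself route (written in Python here rather than shell), the script below builds a timestamped snapshot request that cron could trigger every 30 minutes. The cluster address, repository name, naming scheme, and script path are all hypothetical, and it assumes a snapshot repository has already been registered.

```python
import urllib.request
from datetime import datetime, timezone


def snapshot_name(now):
    """Lowercase, timestamped name so each run creates a distinct snapshot."""
    return now.strftime("snapshot-%Y%m%d-%H%M")


def build_snapshot_request(base_url, repo_name, now):
    """Build (but do not send) the request that starts a snapshot."""
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}/{snapshot_name(now)}",
        method="PUT",
    )


# Hypothetical cluster address and repository name.
request = build_snapshot_request(
    "http://localhost:9200", "s3_backups", datetime.now(timezone.utc)
)
print(request.get_method(), request.full_url)

# To actually start the snapshot, send the request:
# urllib.request.urlopen(request)

# Example crontab entry (hypothetical path) to run this every 30 minutes:
# */30 * * * * /usr/bin/python3 /opt/scripts/take_snapshot.py
```

Like Curator, this only needs to run on one machine that can reach the cluster over HTTP.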
Elastic Cloud takes a backup every 30 minutes (in case you don't want to manage the cluster yourself and want advanced features such as rolling upgrades, Kibana, security...).

Strategy to persist the node's data for dynamic Elasticsearch clusters

I'm sorry that this is probably a rather broad question, but I haven't found a solution to this problem yet.
I'm trying to run an Elasticsearch cluster on Mesos through Marathon with Docker containers. To that end, I built a Docker image that can start on Marathon and scale dynamically via either the frontend or the API.
This works great for test setups, but the question remains how to persist the data, so that if the cluster is scaled down (I know this is also about the index configuration itself) or stopped, I can restart it later (or scale up) with the same data.
The thing is that Marathon decides where (on which Mesos slave) the nodes run, so from my point of view it's not predictable whether all the data is available to the "new" nodes upon restart when I persist the data to the Docker hosts via Docker volumes.
The only things that come to mind are:
Using a distributed file system like HDFS or NFS, with mounted volumes either on the Docker host or in the Docker images themselves. Still, that would leave the question of how to load all the data during the new cluster's startup if the "old" cluster had, for example, 8 nodes and the new one only has 4.
Using the Snapshot API of Elasticsearch to save to a common drive somewhere on the network. I assume that this will have performance penalties...
Are there any other ways to approach this? Are there any recommendations? Unfortunately, I didn't find a good resource on this topic. Thanks a lot in advance.
Elasticsearch and NFS are not the best of pals ;-). You don't want to run your cluster on NFS: it's much too slow, and Elasticsearch works better the faster the storage is. If you introduce the network into this equation, you'll get into trouble. I have no idea about Docker or Mesos, but I definitely recommend against NFS. Use snapshot/restore.
The first snapshot will take some time, but subsequent snapshots should take less space and less time, because they are incremental. Also, note that "incremental" means incremental at file level, not document level.
The snapshot itself needs all the nodes that hold the primaries of the indices you want snapshotted. Those nodes all need access to the common location (the repository) so that they can write to it. This shared access to the same location is usually not that obvious, which is why I'm mentioning it.
The best way to run Elasticsearch on Mesos is to use a specialized Mesos framework. The first effort in this area is https://github.com/mesosphere/elasticsearch-mesos. There is a more recent project which is, AFAIK, currently under development: https://github.com/mesos/elasticsearch. I don't know what its status is, but you may want to give it a try.

Resources