Elasticsearch in production with kubernetes - elasticsearch

I am working on product in which we are using elasticsearch for search. Our production setup is in K8S (1.7.7) and we are able to scale it pretty well. Only thing I am not sure about is whether we should be hosting elasticsearch in k8s (it can go on dedicated host as well using label selector nodes) or it is advisable to host elasticsearch on VM than docker.
Our data set size is 2-3 GB and would go further. But this is the benchmark we can consider.
And elasticsearch cluster I am planning to have ti is - 3 master (with 2 as eligible master), one client node, and one data node. We can scale datanode and client node as data increases.
Is anyone did this before? thanks in advance.

IMO the best resource for Elasticsearch on Kubernetes is https://github.com/pires/kubernetes-elasticsearch-cluster
Note that while there are official Docker containers, no official solution for orchestration is being provided at the moment. This is currently covered by the community only.
3 master (with 2 as eligible master)
This doesn't make much sense. You'll want 3 master eligible nodes with the setting discovery.zen.minimum_master_nodes: 2 and one of the 3 nodes will be the actual master.

Related

How to setup 2 nodes on elasticsearch?

Hello enthusiastic people.
I am a student trying to learn Elastic stack.
I have 1 node installed on my local machine. I have also successfully installed beats on my other local machine to get data and deliver it to my logstash.
My question is, what if I add another node, do I still need to install kibana and elasticsearch? Then connect it from my first node?
I just read a lot that a single node is prone to data loss.
Sorry for my noob question.
Your answer is very appreciated.
Thanks in advance.
Having a cluster with at least 3 nodes would be good to ensure data security and integrity.
A cluster can have one or more nodes.
An example scenario:
It will be easier for you to install with docker during the learning and development process. I recommend you follow the link below. This link explains how to set up an elasticsearch cluster with 3 nodes on docker.
Start a multi-node cluster with Docker Compose

Infinispan Server Topology

I am currently analyzing a cluster environment with a Distributed cache.
I have 5 nodes, each one with an application that must cache a value. 2 nodes are in one datacenter and the other 3 in another. I was thinking of installing an ISPN instance on each node (where is the application hosted) to build the cluster.
Do you have any suggestions for me for further analysis?
Many thanks!

Can I change GCP Dataproc cluster from Standard (1 master, N workers) to High Availability?

I have created a GCP Dataproc cluster with Standard (1 master, N workers). Now I want to upgrade it to High Availability (3 masters, N workers) - Is it possible?
I tried GCP, GCP alpha and GCP beta commands. For example GCP beta documented here: https://cloud.google.com/sdk/gcloud/reference/beta/dataproc/clusters/update.
It has option to scale worker nodes, however does not have option to switch from standard to high availability mode. Am I correct?
You can upgrade the master node by going into VM Instances section under your cluster , Stop your Master VM and Edit the configuration to
You may always upgrade your master node machine type and also add more worker node.
While that would improve your cluster job performance but noting to do with HA.
The answer is - no. Once HA cluster is created, it can't be downgraded and vice versa. You can add worker nodes, however master node can't be altered.
Yes, you can always do that, for changing the machine type of master node
you first need to stop the master VM instance, then you can change the machine type
Even the machine type of worker node can be changed, only we need to do is stop the machine and edit the machine configuration.

Elasticsearch upgrade order based on node roles

I looked for this in many websites, including Elastic official documentation, without success.
I have one Elasticsearch cluster with:
3 Master nodes
4 Data nodes
4 Ingest nodes
2 Client nodes
I must perform a rolling upgrade (from 5.x to 5.x) but the official docs do not explain the order based on node roles.
Should I upgrade Master nodes at first? What next? Data nodes?
I mean, I need to know which is the best way to get the whole cluster upgraded.
Thanks,
Best regards
We had a similar situation and elastic recommended upgrading master first, followed by data and then client nodes.
It's worth checking if your 5.x version has an upgrade assistant available (You see that in Kibana).

How to deploy a Cassandra cluster on two ec2 machines?

It's a known fact that it is not possible to create a cluster in a single machine by changing ports. The workaround is to add virtual Ethernet devices to our machine and use these to configure the cluster.
I want to deploy a cluster of , let's say 6 nodes, on two ec2 instances. That means, 3 nodes on each machine. Is it possible? What should be the seed nodes address, if it's possible?
Is it a good idea for production?
You can use Datastax AMI on AWS. Datastax Enterprise is a suitable solution for production.
I am not sure about your cluster, because each node need its own config files and it is default. I have no idea how to change it.
There are simple instructions here. When you configure instances settings, you have to write advanced settings for cluster, like --clustername yourCluster --totalnodes 6 --version community etc. You also can install Cassandra manually by installing latest version java and cassandra.
You can build cluster by modifying /etc/cassandra/cassandra.yaml (Ubuntu 12.04) fields like cluster_name, seeds, listener_address, rpc_broadcast and token. Cluster_name have to be same for whole cluster. Seed is master node, which IP you should add for every node. I am confused about tokens

Resources