I'm writing an automation script that is supposed to create 4 instances in AWS and deploy a RethinkDB cluster on them without any human interaction. According to the documentation I need to either use the --join parameter on the command line or put join statements in the configuration file. However, what I don't understand is whether I need to specify join only once in order to create the cluster, or every time I restart any of the cluster nodes.
My current understanding is that I only need to issue it once: the cluster configuration is somehow stored in metadata, and next time I can just start rethinkdb without the --join parameter and it will reconnect to the rest of the cluster on its own. But when would I need the join option in the configuration file then?
If this is true, do I then need to start rethinkdb with the --join option in my script, shut it down, and then start it again without --join? Is this the right way to do it, or are there better alternatives?
You're right that on subsequent restarts you don't need to specify --join on the command line; the node will discover the cluster and attempt to reconnect. Part of the cluster state is stored in the system table server_config.
Even if you wiped out the data directory on this node, it may still be able to rejoin the cluster, because other nodes may have information about it and will attempt to connect to it. But if no other node stores information about this particular server, or if this particular node is restarted with a new IP address for some reason and its data directory is wiped as well, then the cluster doesn't know about it (with the new IP address).
So I would always specify --join. It doesn't hurt, and in the worst case it is what allows the new node to join the cluster again.
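For an automation script that means you can simply pass --join for every peer on every start. A minimal sketch, where node1 through node4 are placeholder hostnames for your 4 instances and 29015 is the default cluster port:

    rethinkdb --bind all --daemon \
              --join node1:29015 --join node2:29015 \
              --join node3:29015 --join node4:29015

The equivalent configuration file (e.g. /etc/rethinkdb/instances.d/cluster.conf) would contain:

    bind=all
    join=node1:29015
    join=node2:29015
    join=node3:29015
    join=node4:29015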
One of the master nodes in my Dataproc cluster got deleted accidentally. Is there any way to recover that master node, or can I spin up a new master node and add it to my cluster? The reason for the deletion is still unknown.
Any help is really appreciated.
After realizing that I didn't have many options, I tried the steps below and they worked.
1. Determine the current active NameNode (hdfs haadmin -getServiceState nn0/nn1).
2. Create an AMI of the current active NameNode.
3. Launch a new instance from that AMI with exactly the same name as the deleted master node. (This is crucial, as all HDFS properties inside hdfs-site.xml are configured using this hostname. So make sure every detail of this instance is exactly the same as the lost one.)
4. Our AMI contains every required configuration and service, so as soon as the new instance starts, Dataproc automatically identifies the node and adds it to the cluster.
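A rough sketch of those steps on the command line. This is purely illustrative: the instance IDs, names, and the use of the AWS CLI are assumptions based on the AMI terminology above, and the exact flags for your environment may differ:

    # 1. find out which NameNode is currently active (nn0/nn1 are the configured NameNode IDs)
    hdfs haadmin -getServiceState nn0
    hdfs haadmin -getServiceState nn1

    # 2. image the active NameNode (hypothetical instance ID)
    aws ec2 create-image --instance-id i-0123456789abcdef0 --name namenode-recovery-image

    # 3. launch the replacement from that image, matching the lost node's details
    #    (same subnet, same private IP / hostname, same instance type, etc.)
    aws ec2 run-instances --image-id ami-0123456789abcdef0 \
        --instance-type m5.xlarge --subnet-id subnet-0123456789abcdef0 \
        --private-ip-address 10.0.0.12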
If it has been deleted, I don't think it can be restored to whatever state you had before the deletion. However, you can guard against future accidental deletion by making sure the node doesn't get scheduled for deletion.
I wanted to know if I can temporarily shut down my EMR EC2 instances to avoid extra charges. I also wanted to know if I can take a snapshot of my cluster and stop the EC2 instances temporarily.
You cannot currently terminate/stop your master instance without losing everything on your cluster, including data in HDFS, but one thing you might be able to do is shrink your core/task node instance groups when you don't need them. You must still keep at least one core instance (or more if you have a lot of data in HDFS that you want to keep), but you can resize your task instance groups down to zero if your cluster is not in use.
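A minimal sketch of resizing a task instance group with the AWS CLI (the cluster and instance group IDs below are placeholders):

    # find the instance group IDs of your cluster
    aws emr list-instance-groups --cluster-id j-XXXXXXXXXXXXX

    # shrink the task instance group to zero while the cluster is idle
    aws emr modify-instance-groups --cluster-id j-XXXXXXXXXXXXX \
        --instance-groups InstanceGroupId=ig-XXXXXXXXXXXXX,InstanceCount=0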
On the other hand, unless you really have something on your cluster that you need to keep, you might just want to terminate your cluster when you no longer need it, then clone it to a new cluster when you require it again.
For instance, if you only ever store your output data in S3 or another service external to your EMR cluster, there is no reason you need to keep your EMR cluster running while idle and no reason to need to "snapshot" it because your cluster is essentially stateless.
If you do have any data/configuration stored on your cluster that you don't want to lose, you might want to consider moving it off the cluster so that you can shut down your cluster when not in use. (Of course, how you would do this depends upon what exactly we're talking about.)
Can you help me with installing cosmos-gui? I think you are one of the developers behind Cosmos, am I right?
We have already installed Cosmos, and now we want to install cosmos-gui.
In the link below, I found the install guide:
https://github.com/telefonicaid/fiware-cosmos/blob/develop/cosmos-gui/README.md#prerequisites
Under the subchapter “Prerequisites” it is written:
A couple of sudoer users, one within the storage cluster and another one within the computing cluster, are required. Through these users, the cosmos-gui will remotely run certain administration commands such as new users creation, HDFS userspaces provision, etc. The access through these sudoer users will be authenticated by means of private keys.
What is meant by the above? Must I create a sudo user for the computing and storage clusters? And for that, do I need to install a MySQL DB?
And under the subchapter “Installing the GUI”:
Before continuing, remember to add the RSA key fingerprints of the Namenodes accessed by the GUI. These fingerprints are automatically added to /home/cosmos-gui/.ssh/known_hosts if you try an ssh access to the Namenodes for the first time.
I can’t make any sense of the above. Can you give a step-by-step plan?
I hope you can help me.
JH
First of all, a reminder about the Cosmos architecture:
There is a storage cluster based on HDFS.
There is a computing cluster based on shared Hadoop or based on Sahara; that's up to the administrator.
There is a services node for the storage cluster, a special node not storing data but exposing storage-related services such as HttpFS for data I/O. It is the entry point to the storage cluster.
There is a services node for the computing cluster, a special node not involved in the computations but exposing computing-related services such as Hive or Oozie. It is the entry point to the computing cluster.
There is another machine hosting the GUI, not belonging to any cluster.
That being said, the paragraphs you mention try to explain the following:
Since the GUI needs to perform certain sudo operations on the storage and computing clusters for user account creation purposes, a sudoer user must be created on both services nodes. These sudoer users will be used by the GUI in order to remotely perform the required operations on top of ssh.
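A minimal sketch of creating such a sudoer user on a services node (the user name cosmos-sudoer is a placeholder, and whether you use the sudo group, the wheel group, or an /etc/sudoers.d/ entry depends on your distribution):

    # create the user and give it sudo rights
    sudo adduser cosmos-sudoer
    sudo usermod -aG sudo cosmos-sudoer        # on RHEL/CentOS the group is usually "wheel"

    # install the GUI machine's public key so it can log in with its private key
    sudo -u cosmos-sudoer mkdir -p /home/cosmos-sudoer/.ssh
    cat gui_id_rsa.pub | sudo -u cosmos-sudoer tee -a /home/cosmos-sudoer/.ssh/authorized_keys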
Regarding the RSA fingerprints: since the operations the GUI performs on the services nodes are executed on top of ssh, the fingerprints the servers send back when you ssh into them must be included in the .ssh/known_hosts file. You may do this manually, or simply by ssh'ing into the services nodes for the first time (you will be prompted whether to add the fingerprints to the file).
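For example, from the GUI machine as the cosmos-gui user (the hostname below is a placeholder for your storage Namenode):

    # interactive: answer "yes" when prompted, and the fingerprint is stored
    ssh cosmos-sudoer@storage-namenode.example.com

    # or non-interactive: append the host keys directly to known_hosts
    ssh-keyscan storage-namenode.example.com >> /home/cosmos-gui/.ssh/known_hosts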
MySQL appears in the requirements because that section is about all the prerequisites in general, so they are all listed there; there is not necessarily a relation among them. In this particular case, MySQL is needed in order to store the account information.
We are always improving the documentation; we'll try to explain this better in the next release.
I'm sorry that this is probably a kind of broad question, but I haven't found a solution for this problem yet.
I'm trying to run an Elasticsearch cluster on Mesos through Marathon with Docker containers. Therefore, I built a Docker image that can start on Marathon and scale dynamically via either the frontend or the API.
This works great for test setups, but the question remains how to persist the data so that the cluster can be scaled down (I know this also depends on the index configuration itself) or stopped, and I can later restart (or scale up) with the same data.
The thing is that Marathon decides where (on which Mesos slave) the nodes are run, so from my point of view it's not predictable whether all the data will be available to the "new" nodes upon restart when I try to persist the data to the Docker hosts via Docker volumes.
The only things that come to my mind are:
Using a distributed file system like HDFS or NFS, with mounted volumes either on the Docker host or in the Docker images themselves. Still, that would leave the question of how to load all the data during the new cluster startup if the "old" cluster had, for example, 8 nodes and the new one only has 4.
Using the Snapshot API of Elasticsearch to save to a common drive somewhere in the network. I assume that this will have performance penalties...
Is there any other way to approach this? Are there any recommendations? Unfortunately, I didn't find a good resource on this kind of topic. Thanks a lot in advance.
Elasticsearch and NFS are not the best of pals ;-). You don't want to run your cluster on NFS; it's much too slow, and Elasticsearch works better when the storage is faster. If you introduce the network into this equation you'll get into trouble. I have no idea about Docker or Mesos, but I definitely recommend against NFS. Use snapshot/restore.
The first snapshot will take some time, but the rest of the snapshots should take less space and less time. Also, note that "incremental" means incremental at file level, not document level.
The snapshot itself needs all the nodes that hold the primaries of the indices you want snapshotted, and those nodes all need access to the common location (the repository) so that they can write to it. This common access to the same location is usually not that obvious, which is why I'm mentioning it.
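A minimal sketch of snapshot/restore against a shared filesystem repository (the repository name my_backup and the path /mnt/es-backups are placeholders; the location must be reachable and writable from every data node, and on newer Elasticsearch versions it must also be whitelisted via path.repo):

    # register the repository
    curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
      "type": "fs",
      "settings": { "location": "/mnt/es-backups" }
    }'

    # take a snapshot of all indices
    curl -XPUT 'http://localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

    # restore it later, e.g. on the new (possibly smaller) cluster
    curl -XPOST 'http://localhost:9200/_snapshot/my_backup/snapshot_1/_restore'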
The best way to run Elasticsearch on Mesos is to use a specialized Mesos framework. The first effort in this area is https://github.com/mesosphere/elasticsearch-mesos. There is a more recent project which is, AFAIK, currently under development: https://github.com/mesos/elasticsearch. I don't know what its status is, but you may want to give it a try.
How do I move Elasticsearch data from one server to another?
I have server A running Elasticsearch 1.4.2 on one local node with multiple indices. I would like to copy that data to server B running Elasticsearch of the same version. The lucene_version is also the same on both servers. But when I copy all the files to server B, the data is not migrated; it only shows the mappings of all the nodes. I tried the same procedure on my local computer and it worked perfectly. Am I missing something on the server end?
This can be achieved in multiple ways. The easiest and safest way is to create a replica on the new node. A replica can be created by starting a new node on the new server with the same cluster name (if you have changed other network configuration, you might need to adjust that as well). If you initialized your index with no replicas, you can change the number of replicas online using the update settings API.
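A minimal sketch of that update settings call (the index name my_index is a placeholder; this syntax works on the 1.x API mentioned in the question):

    # give every shard of the index one replica so it gets copied to the new node
    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index": { "number_of_replicas": 1 }
    }'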
Your cluster will be in a yellow state until your data is in sync. Normal operations won't be affected.
Once your cluster state is green, you can shut down the server you no longer wish to keep. At this stage your cluster state will go back to yellow. You can use the update settings API to change the replica count to 0, or add other nodes, to bring the cluster back to a green state.
This approach is recommended only if both your servers are on the same network; otherwise data syncing will take a lot of time.
Another way is to use a snapshot. You can create a snapshot on your old server, copy the snapshot files from the old server to the new server in the same location, and then register the same snapshot repository at the same location on the new server. You will find the snapshot you copied there and can restore it. Doing this from the command line can be a bit cumbersome; you can use a plugin like kopf, which makes taking a snapshot and restoring it as easy as a button click.