Is private networking necessary for Mesos and Marathon?

I am working through this tutorial: http://mesosphere.io/docs/getting-started/cloud-install/
Just learning on an Ubuntu instance on Digital Ocean, I let the master process bind to the public IP, and the Mesos and Marathon web interfaces became publicly accessible. No surprises there.
Do Mesos and Marathon rely on ZooKeeper to create private IPs between instances? Could you skip using ZooKeeper by manually setting up a private network between instances? Is the proper way to start the master and slave processes then to bind to the secondary, private IPs of each instance?
Digital Ocean can set up private IPs automatically, but this is kind of a learning exercise for me. I am aware of the broad rule that administrator access to a server shouldn't come through a public IP. Another way of phrasing this question is: does private networking provide the security for Mesos and Marathon?
I'm only starting with one Ubuntu instance, running both master and slave, for now. I realize that binding to the loopback address would fix this issue for just one machine.

ZooKeeper is used for a few different things for both Marathon and Mesos:
1. Leader election
2. Storing state
3. Resolving the Mesos masters
At the moment, you can't skip ZooKeeper entirely because of 2 and 3 (although later versions of Mesos have their own registry which keeps track of state). AFAIK, Mesos doesn't rely on ZooKeeper for creation of private IPs - it'll bind to whatever is available (but you can force this via the ip parameter). So, you won't be able to forgo ZooKeeper entirely with a private network.
Private networking will provide some security for Mesos and Marathon - assuming you firewall off their access to the external world.
A good (although not necessarily the best) solution for keeping the instances on a private network is to set up an OpenVPN (or similar) network to one of the masters. Then, launch each instance on its private IP and make sure you also set the hostname parameter to that IP. Connect to the Mesos/Marathon web consoles via their private IP and the VPN, and everything should resolve correctly.
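As a rough sketch of the "launch each instance on its private IP" step: the snippet below picks an RFC 1918 address from a list of interface addresses and assembles a mesos-master command line with the ip and hostname parameters both set to it. The helper names, the sample addresses, and the work directory are assumptions made for illustration, not taken from the tutorial.

```python
import ipaddress

# RFC 1918 ranges; loopback and documentation ranges are deliberately excluded.
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def pick_private_ip(addresses):
    """Return the first IPv4 address that falls in an RFC 1918 range, or None."""
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 4 and any(addr in net for net in RFC1918):
            return text
    return None


def master_args(private_ip, zk_url, quorum, work_dir="/var/lib/mesos"):
    """Build a mesos-master argument list bound to the private IP.

    work_dir is a hypothetical default; adjust it to your install.
    """
    return [
        "mesos-master",
        f"--ip={private_ip}",
        f"--hostname={private_ip}",
        f"--zk={zk_url}",
        f"--quorum={quorum}",
        f"--work_dir={work_dir}",
    ]


# Hypothetical addresses as they might appear on a droplet with private networking.
ip = pick_private_ip(["127.0.0.1", "203.0.113.10", "10.132.0.5"])
print(" ".join(master_args(ip, "zk://10.132.0.5:2181/mesos", quorum=1)))
```

The resulting list can be handed to subprocess.Popen or pasted into an init script; the point is only that ip and hostname carry the same private address.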

Mesos and Marathon don't create private IPs between instances.
For that, I suggest you use tinc, or directly a tinc Docker image.
Using this, I was able to set up the configuration you want in 5 minutes. It's easier to configure than OpenVPN, and each host can connect directly to the others, so there is no need for a VPN server to route all the traffic.
Each node stores a private and public key for connecting to each server of the private network.
You should set up a private network for using Mesos.
After that, you can add all the hosts to /etc/hosts with their internal network IPs.
You will then be able to point to ZooKeeper over the private network:
zk://master-1:2181,master-2:2181,master-3:2181
Then the proper way to start the master and slave processes is to bind to the secondary private IPs of each instance.
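A minimal sketch of how the zk:// string above can be assembled from the hostnames listed in /etc/hosts. The function name is made up for illustration, and the optional path argument (for example /mesos) is an assumption; the answer itself shows the URL without a path.

```python
def zk_url(hosts, port=2181, path=""):
    """Join ZooKeeper hosts into a zk:// connection string.

    `path` is optional and empty by default, matching the URL in the answer.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    servers = ",".join(f"{host}:{port}" for host in hosts)
    return f"zk://{servers}{path}"


print(zk_url(["master-1", "master-2", "master-3"]))
```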

Related

How to activate two network interfaces at a time?

I have two network interfaces. Each interface has two private IPs, for a total of 4. All of them are associated with Elastic IPs.
I can ping only two of the public IPs.
How can I activate all 4 IPs at the same time?
You need to configure the secondary IPs in the EC2 instance's operating system too.
From AWS Documentation:
Configuring the Operating System on Your Instance to Recognize the Secondary Private IPv4 Address
After you assign a secondary private IPv4 address to your instance,
you need to configure the operating system on your instance to
recognize the secondary private IP address.
If you are using Amazon Linux, the ec2-net-utils package can take care
of this step for you. It configures additional network interfaces that
you attach while the instance is running, refreshes secondary IPv4
addresses during DHCP lease renewal, and updates the related routing
rules. You can immediately refresh the list of interfaces by using the
command sudo service network restart and then view the up-to-date list
using ip addr li. If you require manual control over your network
configuration, you can remove the ec2-net-utils package. For more
information, see Configuring Your Network Interface Using
ec2-net-utils. If you are using another Linux distribution, see the
documentation for your Linux distribution. Search for information
about configuring additional network interfaces and secondary IPv4
addresses. If the instance has two or more interfaces on the same
subnet, search for information about using routing rules to work
around asymmetric routing. For information about configuring a Windows
instance, see Configuring a Secondary Private IP Address for Your
Windows Instance in a VPC in the Amazon EC2 User Guide for Windows
Instances
For Linux here is the post where the steps are explained:
https://bobcares.com/blog/an-easy-guide-to-setup-amazon-ec2-multiple-ips/

Marathon loses control over Mesos when Marathon and Mesos leaders mismatch

When the Mesos or Marathon service restarts for some reason and the Mesos and Marathon leaders are not on the same machine, deployments get stuck in Marathon and nothing happens in Mesos. This leads to terrible results: Marathon cannot restart failed services and does nothing with deployments until the leaders match again.
Our cluster has 3 masters (installed through the Mesosphere website) and this situation happens quite often. Is there any way to fix that?
Marathon v.0.9.0
Mesos v0.22.1
It sounds like either Mesos or Marathon is using a loopback address (localhost/127.0.0.1), so they aren't able to talk to each other across machines.
You should be able to solve your issue by setting an IP that the other machines can reach, using the respective --ip command line flag or the LIBPROCESS_IP environment variable.
One particularly useful setting is LIBPROCESS_IP, which tells the master and slave binaries which IP address to bind to; in some installations, the default interface that the hostname resolves to is not the machine’s external IP address, so you can set the right IP through this variable.
Source: http://mesos.apache.org/documentation/latest/deploy-scripts/
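A hedged sketch of the LIBPROCESS_IP approach: build the environment for the Mesos or Marathon process and refuse a loopback address, since that is the failure mode described above. The function name and the sample address are hypothetical.

```python
import ipaddress
import os


def libprocess_env(ip, base_env=None):
    """Return a copy of the environment with LIBPROCESS_IP set to `ip`.

    Rejects loopback and unspecified addresses, which other machines
    cannot use to reach this process.
    """
    addr = ipaddress.ip_address(ip)
    if addr.is_loopback or addr.is_unspecified:
        raise ValueError(f"{ip} is not reachable from other machines")
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = str(addr)
    return env


# Pass the result as env= to subprocess.Popen when launching the service.
env = libprocess_env("10.0.0.5", base_env={})
print(env["LIBPROCESS_IP"])
```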

I suddenly cannot connect to my EC2 instance. Why? How can I mitigate this?

I had a running instance, and then I became unable to connect to it via http(80) and ssh(22). I tried to reboot the instance, but nothing went up. This has happened to me twice in the past month.
Why does it happen? Can I do anything to fix and/or prevent it from happening?
If I launch a new instance in the same region, it works.
Things to check when trying to connect to an Amazon EC2 instance:
Security Group: Make sure the security group allows inbound access on the desired ports (eg 80, 22) for the appropriate IP address range (eg 0.0.0.0/0). This solves the majority of problems.
Public IP Address: Check that you're using the correct Public IP address for the instance. If the instance is stopped and started, it might receive a new Public IP address (depending on how it has been configured).
VPC Configuration: Accessing an EC2 instance that is launched inside a Virtual Private Cloud (VPC) requires:
An Internet Gateway
A routing table connecting the subnet to the Internet Gateway
NACLs (Network ACLs) that permit through-traffic
If you are able to launch and connect to another instance in the same subnet, then the VPC configuration would appear to be correct.
The other thing to check would be the actual configuration of the operating system on the instance itself. Some software may be affecting the configuration so that the web server / ssh daemon is not working correctly. Of course, that is hard to determine without connecting to the instance.
If you are launching from a standard Amazon Linux AMI, SSH should work correctly out of the box. The web server (port 80) requires installation and configuration of software on the instance, which is your responsibility to maintain.
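To narrow down which of the items above is at fault, it can help to test the two ports from outside the instance. This is a small sketch using only the standard library; the function names are made up for illustration, and a False result does not say whether the security group, the NACL, or the daemon itself is the cause.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=(22, 80)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}
```

For example, check_ports("<public ip>") can be run from your workstation; a timeout on both ports usually points at network-level filtering rather than a single broken service.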

How to restart single node hadoop cluster on ec2

I have installed a single node Hadoop cluster using Hortonworks/Ambari on an Amazon EC2 host.
Since I don't want this cluster running 24/7, I stop the instance when done. When I start the instance again later, I get a new IP address and then Ambari is no longer able to start the Hadoop-related services.
Is there a way other than completely redeploying to reconfigure the cluster so the services will start?
It looks like the IP address lives in various xml files under /etc, in the postgres database table ambari, and possibly other places I haven't found yet.
I tried updating the xml files and postgres database with updated versions of the ip address, internal and external dns names as I could find them, but to no avail. I have not been able to restart the services.
The reason I am doing this is to possibly save the deployment time and data configuration on hdfs and other project specific setup each time I restart the host.
Any suggestions?
Thanks!
Elastic IP can be used. Also, since you mentioned it being a single node cluster - you can use localhost or private IP.
If you use an Elastic IP, your UIs will always be on the same public IP. However, if you use the private IP or localhost and do not associate your instance with an Elastic IP, you will have to look up the public IP every time you start the instance and then connect to the web UI using that IP.
Thanks for the help, both Harman and TJ are correct. I haven't used an Elastic IP because I might have more than one of these running at a time, and for now at least, I don't mind looking up the public IP address.
Harman's suggestion of using "localhost" as the fqdn when setting up ambari in the first place is a really good idea in retrospect. Unless I go through the whole setup again, that's water under the bridge for me, but I recommend this to others who might read this post.
In my case, I figured this out on my own before coming back to the page. The specific step I took was insanely simple after all, thanks to Occam's Razor.
I added the following line in /etc/hosts:
<new internal IP> <old internal dns name>
and then ran
ambari-server restart
from the command line. After that, I was able to restart all services after logging into Ambari.
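The /etc/hosts change above can be sketched as a small text transformation: drop any stale line for the old internal DNS name and append one that maps it to the new internal IP. The function name and the sample IP and hostname are hypothetical; the sketch works on a string and leaves writing the file (which needs root) to the caller.

```python
def upsert_hosts_entry(hosts_text, ip, name):
    """Return hosts file content where `name` maps only to `ip`."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if name in fields[1:]:
            continue  # drop the stale mapping for this name
        kept.append(line)
    kept.append(f"{ip} {name}")
    return "\n".join(kept) + "\n"


# Hypothetical values: the old internal DNS name now points at the new internal IP.
old = "127.0.0.1 localhost\n10.0.0.1 ip-10-0-0-1.ec2.internal\n"
print(upsert_hosts_entry(old, "10.0.0.9", "ip-10-0-0-1.ec2.internal"), end="")
```

After writing the result back to /etc/hosts, restart ambari-server as described above.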

Elasticsearch on EC2

I've spent some time looking for elasticsearch.yml settings that make my single-instance Elasticsearch (on a Windows Server 2012 EC2 instance) accessible via its public IP, but every time I uncomment one or both of the following settings, the only thing that changes is that calling the private IP results in an error as well.
network.publish_host: <public ip>
network.bind_host: <private ip>
Is this correct and are there any other settings that have to be defined? Shouldn't it run with the default values?
This is more of a general answer as to how networking works within EC2 instead of a specific answer to your question. But it should help inform how to configure your application.
EC2 has 1:1 NAT between a public and private IP address. Because of this, only the private IP address is visible to the instance directly.
If you are binding a service to a network interface, it would be the one with the private IP.
Some services do require knowledge of the external IP address in order to function properly. The only one I have run into is FTP in a passive configuration, likely due to the fact that it needs to open a separate socket for data transfer.
In the case of Elasticsearch, there appears to be a special plugin that helps configure it for the AWS environment: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.html
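The NAT point can be checked from the instance itself: because the public IP is not assigned to any local interface, an attempt to bind a socket to it fails, while binding to the private IP succeeds. A minimal sketch, with a made-up function name:

```python
import socket


def can_bind(ip):
    """Return True if a TCP socket can be bound to `ip` on an ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, 0))  # port 0 lets the OS pick a free port
            return True
        except OSError:
            # Typically "cannot assign requested address" for non-local IPs.
            return False
```

Running can_bind with the instance's public and private addresses shows which one a service such as Elasticsearch can actually bind to.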
I had the same problem.
I installed only one instance of ES on AWS EC2 and wanted to give it public access.
On Ubuntu 16.04 this is what works for me:
In /etc/elasticsearch/elasticsearch.yml add this line:
network.host: <ec2 instance private ip>
The private IP should be something like 172.x.x.x
Also, do not forget to allow access in the security group in your AWS console for port 9200 (the default) from the IP address you will be sending requests from.
So the difference was setting the private IP address shown in the AWS console, not the public one.
Also note that this can be dangerous, as there is no user/password or other access control.

Resources