OpenShift New Installation - ansible

I am new to OpenShift and ansible. I in the process of on a POC for an OpenShift installation.
1.What a typical installation of OpenShift in a POC environment looks like? I was thinking one master, one infrastructure, and one application node.
2.How long does a normal installation take for first timers?
3.Where would the registry reside?
4.Will the route on the master?

This may answer some of your questions. Maybe go though the actual install process yourself so you can see what is involved and how long it takes.
https://developers.redhat.com/blog/2017/02/23/openshift-for-developers-set-up-a-full-cluster-in-under-30-minutes/
Specifically
1 - The pre-requisites will probably answer how many you will need as this can vary based on need.
https://docs.openshift.org/latest/install_config/install/prerequisites.html
So 3-3-3 for redundancy, failover etc can be a good setup.
2 - It can really depend since there are many different variables like how many masters, how many nodes etc etc. In the video he does a simple installation with ansible in about 30 minutes.
3- For us our registry is running on our infrastructure nodes with storage to s3
4 - not sure I understand this one sorry.

Related

How many Marathon clusters / frameworks can Mesos handle?

Marathon does not support task configuration template which can establish command patterns and avoid redundancy. We are trying to find a way around it, otherwise we need to create 100,000s tasks and it would be very difficult to manage those config files. One approach, we are thinking is running multiple marathon clusters inside mesos. Now the question is can we run multiple marathon clusters inside mesos? And is there a limit number of frameworks mesos can handle?
Yes, multiple Marathon frameworks is not only possible but actually considered a best practices. There are many use cases, from scaling to Chinese Wall setups (esp. in financial services area).
For example, in the DCOS we're installing a 'system Marathon' per default and you can then install as many 'application' or 'project' or 'group' Marathons as you like.
I'm not aware of a theoretical limit of the number of frameworks, but hey, this might actually be a good load test to run, I'll look into it.

Marathon vs Aurora and their purposes

Both Marathon and Aurora are built on Mesos and supposedly are engineered for running long running services. My questions are:
What are their differences? I have struggled in finding any good explanations regarding their key differences
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Thanks!
Disclaimer: I am the VP of Apache Aurora, and have been the tech lead of the Aurora team at Twitter for ~5 years. My likely-biased opinions are my own and do not necessarily represent those of Twitter or the ASF.
Do these frameworks run anything that runs on Linux? For Marathon they
state that it can run anything that "is executable in a shell" but
this is sort of vague :)
Essentially, yes. Ultimately these systems are sophisticated machinery to execute shell code somewhere in a cluster :-)
What are their differences? I have struggled in finding any good
explanations regarding their key differences
Aurora and Marathon do indeed offer similar feature sets, both being classified as "service schedulers". In other words, you hand us instructions for how to run your application servers, and we do our best to keep them up.
I'll offer some differences in broad strokes. When it comes to shortcomings mentioned in each, I think it's safe to say that the communities are aware and intend to fix them.
Ease of use
Aurora is not easy to install. It will likely feel like you are trailblazing while setting it up. It exposes a thrift API, which means you'll need a thrift client to interact with it programmatically (a REST-like API is coming, but is vaporware at the moment), or use our command line client. Aurora has a DSL for configuration which can be daunting, but allows you to easily share templates and common patterns as you use the system more.
Marathon, on the other hand, helps you to run 'Hello World' as quickly as possible. It has great docs to do this in many environments and there's little overhead to get going. It has a REST API, making it easier to adapt to custom tools. It uses JSON for configuration, which is easy to start with but more prone to cargo culting.
Targeted use cases
Aurora has always been designed to handle a large engineering organization. The clusters at Twitter have tens of thousands of machines and hundreds of engineers using them. It is critical to Twitter's business. As a result, we take our requirements of scale, stability, and security very seriously. We make sure to only condone features that we believe are trustworthy at scale in production (for example, we have our Docker support labeled as beta because of known issues with Docker itself and the Mesos-Docker integration). We also have features like preemption that make our clusters suitable for mixing business-critical services with prototypes and experiments.
I can't make any claim for or against Marathon's scalability. On the feature front, Marathon has build out features quickly, but this can feel bleeding edge in practice (Docker support is a good example). This is not always due to Marathon itself, but also layers down the stack. Marathon does not provide preemption.
Ownership
To some, ownership and governance of a project is important. It feel that in practice it does not define the openness of a project, but for some people/companies the legal fine print can be a deal-breaker.
Marathon is owned by a company (Mesosphere)
To some, this is beneficial, to others is is not. It means that you can pay for support and features. It also means that there is something to be sold, and the project direction is ultimately decided by Mesosphere's interests.
Aurora is owned by the Apache Software Foundation
This means it is subject to the governance model of the ASF, driven by the community. Aurora does not have paying customers, and there is not currently a software shop that you can pay for development.
tl;dr If you are just getting your feet wet with running services on Mesos, I would suggest Marathon as your first port of call. It will be easier for you to get running and poke around the ecosystem. If you are forming the 'private cloud strategy' for a company, I suggest seriously considering Aurora, as it is proven and specifically designed for that.
So I've been evaluating both and this is my summary.
Aurora
[+] also handles recurring jobs
[+] finer grained, extensive file-based configuration
[+] has namespaces so multiple environments can co-exist
[-] read-only UI, no official API
[~] file based configuration and cli based execution brings overhead (which can be justified with more extensive feature set)
Marathon
[+] very easy to setup and use
[+] UI that provides control and extensive API (even with features missing from UI at the moment)
[+] event bus to listen in on api calls
[-] handles only long-running jobs
[-] does not have separate deployment-run-cleanup steps, these if necessary need to be combined in a script of one-liner
Even though Aurora has better capabilities, I prefer Marathon due to Auroras complexity/overhead and lack of UI (for control) & API
I have more experience with Marathon.
Ideological:
Marathon is a relatively tested product that is used in production at AirBnB. Aurora is an early Apache project (so YMMV).
Both are open source and active. Feel free to contribute pull requests or file issues!
Technical:
Marathon doesn't schedule batch tasks or cron jobs
Marathon has a friendly UI and better health indicators (in 0.8.x)
In regards to your second question, you can run any command or docker container, and Mesos will do the resource isolation for you. If you have 50% CentOS nodes and 50% Ubuntu nodes and you run a task that executes apt-get, the task will have a 50% chance of failure. Mesos and Marathon have no awareness of the actual machines.
Disclaimer: I don't have hands-on experience with Aurora, only with Marathon.
ad Q1: In a nutshell Apache Aurora is capable of doing what Marathon + Chronos can provide, that is, schedule both long-running services and recurring (batch) jobs; see also Aurora user guide.
ad Q2: Yes, anything. Currently based on cgroups and Docker but hey, you can roll your own.

Docker-Swarm, Kubernetes, Mesos & Core-OS Fleet

I am relatively new to all these, but I'm having troubles getting a clear picture among the listed technologies.
Though, all of these try to solve different problems, but do have things in common too. I would like to understand what are the things that are common and what is different. It is likely that the combination of few would be great fit, if so what are they?
I am listing a few of them along with questions, but it would be great if someone lists all of them in detail and answers the questions.
Kubernetes vs Mesos:
This link
What's the difference between Apache's Mesos and Google's Kubernetes
provides a good insight into the differences, but I'm unable to understand as to why Kubernetes should run on top of Mesos. Is it more to do with coming together of two opensource solutions?
Kubernetes vs Core-OS Fleet:
If I use kubernetes, is fleet required?
How does Docker-Swarm fit into all the above?
Disclosure: I'm a lead engineer on Kubernetes
I think that Mesos and Kubernetes are largely aimed at solving similar problems of running clustered applications, they have different histories and different approaches to solving the problem.
Mesos focuses its energy on very generic scheduling, and plugging in multiple different schedulers. This means that it enables systems like Hadoop and Marathon to co-exist in the same scheduling environment. Mesos is less focused on running containers. Mesos existed prior to widespread interest in containers and has been re-factored in parts to support containers.
In contrast, Kubernetes was designed from the ground up to be an environment for building distributed applications from containers. It includes primitives for replication and service discovery as core primitives, where-as such things are added via frameworks in Mesos. The primary goal of Kubernetes is a system for building, running and managing distributed systems.
Fleet is a lower-level task distributor. It is useful for bootstrapping a cluster system, for example CoreOS uses it to distribute the kubernetes agents and binaries out to the machines in a cluster in order to turn-up a kubernetes cluster. It is not really intended to solve the same distributed application development problems, think of it more like systemd/init.d/upstart for your cluster. It's not required if you run kubernetes, you can use other tools (e.g. Salt, Puppet, Ansible, Chef, ...) to accomplish the same binary distribution.
Swarm is an effort by Docker to extend the existing Docker API to make a cluster of machines look like a single Docker API. Fundamentally, our experience at Google and elsewhere indicates that the node API is insufficient for a cluster API. You can see a bunch of discussion on this here: https://github.com/docker/docker/pull/8859 and here: https://github.com/docker/docker/issues/8781
Join us on IRC # #google-containers if you want to talk more.
I think the simplest answer is that there is no simple answer. The swift rise to power of containers, and Docker in particular has left a power vacuum for "container scheduling and orchestration", whatever that might mean. In reality, that means you have a number of technologies that can work in harmony on some levels, but with certain aspects in competition. For example, Kubernetes can be used as a one stop shop for deploying and managing containers on a compute cluster (as Google originally designed it), but could also sit atop Fleet, making use of the resilience tier that Fleet provides on CoreOS.
As this Google vid states Kubernetes is not a complete out the box container scaling solution, but is a good statement to start from. In the same way, you would at some stage expect Apache Mesos to be able to work with Kubernetes, but not with Marathon, in as much as Marathon appears to fulfil the same role as Kubernetes. Somewhere I think I've read these could become part of the same effort, but I could be wrong about that - it's really about the strategic direction of Mesosphere and the corresponding adoption of Kubernetes principles.
In the DockerCon keynote, Solomon Hykes suggested Swarm would be a tier that could provide a common interface onto the many orchestration and scheduling frameworks. From what I can see, Swarm is designed to provide a smooth Docker deployment workflow, working with some existing container workflow frameworks such as Deis, but flexible enough to yield to "heavyweight" deployment and resource management such as Mesos.
Hope this helps - this could be an enormous post. I think the key is that these are young, evolving services that will likely merge and become interoperable, but we need to ride out the next 12 months to see how it plays out. There's some very clever people on the problem, so the future looks very bright.
As far as I understand it:
Mesos, Kubernetes and Fleet are all trying to solve a very similar problem. The idea is that you abstract away all your hardware from developers and the 'cluster management tool' sorts it all out for you. Then all you need to do is give a container to the cluster, give it some info (keep it running permanently, scale up if X happens etc) and the cluster manager will make it happen.
With Mesos, it does all the cluster management for you, but it doesn't include the scheduler. The scheduler is the bit that says, ok this process needs 2 procs and 512MB RAM, and I have a machine over there with that free, so I'll run it on that machine. There are some plugin schedulers available for Mesos: Marathon and Chronos and you can write your own. This gives you a lot of power of resource distribution and cluster scaling etc.
Fleet and Kubernetes seem to abstract away those sorts of details (so you don't have to write your own scheduler basically). This means you have to define your tasks and submit them in the format/manner defined by Fleet or Kubernetes and then they take over and schedule the tasks (containers) for you.
So I guess: Using Mesos may mean a bit more work in writing your own scheduler, but potentially provides more flexibility if required.
I think the idea of running Kubernetes on top of Mesos is that Kubernetes acts as the scheduler for Mesos. Personally I'm not sure what benefits this brings over running one or the other on its own though (hopefully someone will jump in and explain!)
As MikeB said.. it's early days, and it's all up for grabs (keep an eye on Amazon's ECS as well) so there are many competing standards and a lot of overlap!
-edit- I didn't mention Docker swarm as I don't really have much experience with it.
For anyone coming to this after 2017 fleet is deprecated. Do not use it anymore.
Fleet docs say "fleet is no longer actively developed or maintained by CoreOS" and link to Container orchestration: Moving from fleet to Kubernetes. Fleet was removed from Container Linux (formerly known as CoreOS Linux) and replaced with Kubernetes kubelet (agent). This coincided with a corporate pivot to offer Tectonic (a Kubernetes distro) as their primary product.

How to use a "Rocks" cluster

I've just joined a research lab at my University and been given access to a Cluster to compile and run the c++ code that I write. I use SSH to access it and simply use the cluster like a Linux terminal.
I often have to wait a relatively long time while my code runs. I'm trying to figure out if there's a more efficient way use the Cluster. For example, there are different CPUs/Nodes in the cluster, some of which are more in use and others less in use. How do I access a specific CPU? I have access to the "Ganglia" overview page which gives information about the different Nodes.
Also, if I run 2 processes in a different SSH windows will it automatically use different processors or nodes, or do I have to manually specify that.
I couldn't find any documentation to help me with these issues, so I'd appreciate a little help.
Thanks
Simply running something on a cluster does not mean it is taking advantage of the cluster at all. By default, it will probably just run on the head node. Software needs to be written specifically for a cluster.
There is likely to be some kind of scheduler running that you need to interface with. Perhaps you could also see if distcc is installed and configured for your particular cluster (for doing the compilation across multiple machines). There may also be a particular flavour of MPI running to allow processes on different nodes to communicate.
Clusters software setups tend to be very specialised to the hardware and computing environment. Really, I would recommend that you ask someone who has used the machine before these kinds of questions, because any advice you receive here is unlikely to be completely accurate for your particular cluster.

Managing server instance identity on EC2

I recently brought up a cluster on EC2, and I felt like I had to invent a lot of things. I'm wondering what kinds of tools, patterns, ideas are out there for how to deal with this.
Some context:
I had 3 different kinds of servers, so first I created AMIs for each of them. The first AMI had zookeeper, so step one in deploying the system was to get the zookeeper server running.
My script then made a note of the mapping between EC2's completely arbitrary and unpredictable hostnames, and the zookeeper server.
Then as I brought up new instances of the other 2 kinds of servers, the first thing I would do is ssh to the new server, and add the zookeeper server to its /etc/hosts file. Then as the server software on each instance starts up, it can find zookeeper.
Obviously this is a problem that lots of people have to solve, and it probably works a little bit differently in different clouds.
Are there products that address this concept? I was pretty surprised that EC2 didn't provide some kind of way to tie your own name to its name.
Thanks for any ideas.
How to do some service discovery on Amazon EC2 seems to have some good options.
I think you might want to look at http://puppetlabs.com/mcollective/introduction/ and the suite of tools from http://puppetlabs.com in general.
From the site:
The Marionette Collective AKA MCollective is a framework to build server orchestration or parallel job execution systems.
Primarily we’ll use it as a means of programmatic execution of Systems Administration actions on clusters of servers. In this regard we operate in the same space as tools like Func, Fabric or Capistrano.
I am fairly certain mcollective was built to solve exactly the problem you are trying to address. But, be forewarned, it's not a DNS-based solution, it's a method of addressing arbitrarily large and arbitrarily tagged groups of hosts.

Resources