I have a Go application running in a Kubernetes cluster that needs to read files from a large MapR cluster. The two clusters are separate, and the Kubernetes cluster does not permit us to use the CSI driver. All I can do is run userspace apps in Docker containers inside Kubernetes pods, and I am given MapR tickets to connect to the MapR cluster.
I'm able to use the com.mapr.hadoop maprfs jar to write a Java app that can connect and read files using a MapR ticket, but we need to integrate this into a Go app, which, ideally, shouldn't require a Java sidecar process.
This is a good question because it highlights the way that some environments impose limits that violate the assumptions external software may hold.
And just for reference, MapR was acquired by HPE, so a MapR cluster is now an HPE Ezmeral Data Fabric cluster. I am still training myself to say that.
Anyway, the accepted method for a generic program in language X to communicate with the Ezmeral Data Fabric (the filesystem formerly known as MapR FS) is to mount the file system and just talk to it using file APIs like open/read/write and such. This applies to Go, Python, C, Julia or whatever. Inside Kubernetes, the normal way to do this mount is to use a CSI driver that has some kind of operator working in the background. That operator isn't particularly magical ... it just does what is needful. In the case of data fabric, the operator mounts the data fabric using NFS or FUSE and then bind mounts[1] part of that into the pod's awareness.
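To make that concrete: once the data fabric is mounted somewhere the pod can see (conventionally under /mapr/<cluster-name>/ for NFS or FUSE mounts), a Go program needs nothing MapR-specific. Here is a minimal sketch, assuming a hypothetical mount point and file path:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	// Hypothetical path: assumes the data fabric is mounted (NFS or FUSE)
	// at /mapr/<cluster-name> and bind mounted into the pod.
	f, err := os.Open("/mapr/my.cluster.com/data/input.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// From here on it is ordinary file I/O; nothing MapR-specific is needed.
	n, err := io.Copy(io.Discard, f)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d bytes\n", n)
}
```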
But this question is cool because it precludes all of that. If you can't install an operator, then this other stuff is just a dead letter.
There are four alternative approaches that may work.
NFS mounts were included in Kubernetes as a native capability before the CSI plugin approach was standardized. It might still be possible to use that on a very vanilla Kubernetes cluster, and that could give you access to the data cluster.
It is possible to integrate a container into your pod that does the necessary FUSE mount in an unprivileged way. This will be kind of painful because you would have to tease apart the FUSE driver from the data fabric install and get it to work. That would let you see the data fabric inside the pod. Even then, there is no guarantee Kubernetes or the OS will allow this to work.
There is an unpublished Go file system client that uses the low level data fabric API directly. We don't yet release that separately. For more information on that, folks should ping me directly (my contact info is everywhere ... email to ted.dunning hpe.com or gmail.com works)
The data fabric allows you to access data via S3. With the 7.0 release of Ezmeral Data Fabric, this capability has been heavily revamped to give massive performance, especially since you can scale up the number of gateways essentially without limit (I have heard numbers like 3-5 GB/s per stateless connection to a gateway, but YMMV). This will require the least futzing and should give plenty of performance. You can even access files as if they were S3 objects.
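To give a feel for that route: from Go, any S3-compatible client can talk to such a gateway. Here is a minimal sketch using the minio-go client; the gateway endpoint, credentials, bucket, and object names are all hypothetical placeholders:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Hypothetical endpoint and credentials for a data fabric S3 gateway.
	client, err := minio.New("s3-gateway.example.com:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Read an object exactly as you would from any S3-compatible store.
	obj, err := client.GetObject(context.Background(), "my-bucket", "data/input.csv",
		minio.GetObjectOptions{})
	if err != nil {
		log.Fatal(err)
	}
	defer obj.Close()

	n, err := io.Copy(io.Discard, obj)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d bytes\n", n)
}
```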
[1] https://unix.stackexchange.com/questions/198590/what-is-a-bind-mount#:~:text=A%20bind%20mount%20is%20an,the%20same%20as%20the%20original.
Related
There is a requirement to have an Object Storage server (preferably open source) to be run on an Embedded Linux platform. We are thinking of using Minio as the Object Storage server. The question is whether the Minio Object Storage server can be ported to/run on an embedded Linux platform like a Xilinx-based Linux system?
I don't think it's a good idea to run a product like Minio on embedded hardware. Various CPU- and RAM-intensive optimization techniques (e.g. compression, deduplication, erasure coding, journaling) are likely to be used in this kind of software. Are you sure your embedded system is powerful enough to do this job? Also (in most cases) you'll have to pay attention to fault tolerance and high availability, which means that you will need to run a Minio cluster on independent devices interacting over the network.
So I think it's a problem of the architecture of your product rather than of compilation.
I use the node application purely for socket.io channels with Redis PubSub, and at the moment I have it spread across 3 machines, backed by nginx load balancing on one of the machines.
I want to replace this node application with a Phoenix application, and I'm still new to the Erlang/Elixir world, so I haven't figured out how a single Phoenix application can span more than one machine. Googling all possible scaling and load balancing terms yielded nothing.
The 1.0 release notes mention this regarding channels:
Even on a cluster of machines, your messages are broadcasted across the nodes automatically
1) So I basically deploy my application to N servers, starting the Cowboy server on each one of them, similarly to how I do with node, and then tie them together with nginx/HAProxy?
2) If that is the case, how are channel messages broadcast across all nodes, as mentioned in the release notes?
EDIT 3: Taking Theston's answer, which clarifies that there is no such thing as a Phoenix application, but rather an Elixir/Erlang application, I updated my search terms and found some interesting results regarding scaling and load balancing.
A free extensive book: Stuff Goes Bad: Erlang in Anger
Erlang pooling libraries recommendations
EDIT 2: Found this from Elixir's creator:
Elixir provides conveniences for process grouping and global processes (shared between nodes) but you can still use external libraries like Consul or Zookeeper for service discovery or rely on HAProxy for load balancing for the HTTP based frontends.
EDITED: Connecting Elixir nodes on the same LAN is the first result that mentions inter-Elixir communication, but it isn't related to Phoenix itself, and it is not clear how it relates to load balancing and to Phoenix nodes communicating with one another.
Phoenix isn't the application; when you generate a Phoenix project you create an Elixir application with Phoenix being just a dependency (effectively a bunch of things that make building the web part of your application easier).
Therefore any Node distribution you need to do can still happen within your Elixir application.
You could just use Phoenix for the web routing and then pass the data on to your underlying Elixir app to handle the distribution across nodes.
It's worth reading http://www.phoenixframework.org/v1.0.0/docs/channels (if you haven't already), where it explains how Phoenix channels use PubSub to distribute messages (and PubSub can be configured to use different adapters).
Also, are you spinning up Cowboy on your deployment servers by running mix phoenix.server?
If so, then I'd recommend looking at EXRM https://github.com/bitwalker/exrm
This will bundle your Elixir application into a self-contained file that you can simply deploy to your production servers (with Capistrano if you like) and then start your application.
It also means you don't need any Erlang/Elixir dependencies installed on the production machines either.
In short, Phoenix is not like Rails, Phoenix is not the application, not the stack. It's just a dependency that provides useful functionality to your Elixir application.
Unless I am misunderstanding your use case, you can still use the exact scaling technique your node version of the application uses. Simply deploy the Phoenix application to more than one machine and use an Nginx load balancer configured to forward requests to one of the many application machines.
The built-in node communication features of Erlang are used for applications that scale in a different way than a web app, for instance distributed databases or queues.
Look at Phoenix.PubSub
It's where Phoenix internally has the Channel communication bits.
It currently has two adapters:
Phoenix.PubSub.PG2 - uses Distributed Elixir, directly exchanging notifications between servers. (This requires that you deploy your application as an Elixir/Erlang distributed cluster.)
Phoenix.PubSub.Redis - uses Redis to exchange data between servers. (This should be similar to the solutions found in socket.io and others.)
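Language aside, the fan-out mechanism the Redis adapter relies on is the same one your socket.io/Redis PubSub setup already uses: every server node subscribes to a shared channel and publishes its local broadcasts to it. A minimal sketch of that pattern, written in Go with the go-redis client purely to illustrate the mechanism (the address and channel name are hypothetical):

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// Hypothetical Redis address and channel name.
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	sub := rdb.Subscribe(ctx, "room:lobby")
	defer sub.Close()

	// Wait for the subscription to be confirmed before publishing.
	if _, err := sub.Receive(ctx); err != nil {
		log.Fatal(err)
	}

	// Every node publishes its local broadcasts to the shared channel,
	// so every subscribed node (including this one) receives them.
	if err := rdb.Publish(ctx, "room:lobby", "hello from this node").Err(); err != nil {
		log.Fatal(err)
	}

	msg, err := sub.ReceiveMessage(ctx)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("received:", msg.Payload)
}
```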
I am relatively new to all of these, but I'm having trouble getting a clear picture of the listed technologies.
All of these try to solve different problems, but they do have things in common too. I would like to understand what is common and what is different between them. It is likely that a combination of a few would be a great fit; if so, which ones?
I am listing a few of them along with questions, but it would be great if someone could list all of them in detail and answer the questions.
Kubernetes vs Mesos:
This link, What's the difference between Apache's Mesos and Google's Kubernetes, provides good insight into the differences, but I'm unable to understand why Kubernetes should run on top of Mesos. Is it more about the coming together of two open-source solutions?
Kubernetes vs Core-OS Fleet:
If I use Kubernetes, is Fleet required?
How does Docker-Swarm fit into all the above?
Disclosure: I'm a lead engineer on Kubernetes
I think that Mesos and Kubernetes are largely aimed at solving similar problems of running clustered applications; they have different histories and different approaches to solving the problem.
Mesos focuses its energy on very generic scheduling, and plugging in multiple different schedulers. This means that it enables systems like Hadoop and Marathon to co-exist in the same scheduling environment. Mesos is less focused on running containers. Mesos existed prior to widespread interest in containers and has been re-factored in parts to support containers.
In contrast, Kubernetes was designed from the ground up to be an environment for building distributed applications from containers. It includes replication and service discovery as core primitives, whereas such things are added via frameworks in Mesos. The primary goal of Kubernetes is a system for building, running and managing distributed systems.
Fleet is a lower-level task distributor. It is useful for bootstrapping a cluster system; for example, CoreOS uses it to distribute the Kubernetes agents and binaries out to the machines in a cluster in order to turn up a Kubernetes cluster. It is not really intended to solve the same distributed application development problems; think of it more like systemd/init.d/upstart for your cluster. It's not required if you run Kubernetes; you can use other tools (e.g. Salt, Puppet, Ansible, Chef, ...) to accomplish the same binary distribution.
Swarm is an effort by Docker to extend the existing Docker API to make a cluster of machines look like a single Docker API. Fundamentally, our experience at Google and elsewhere indicates that the node API is insufficient for a cluster API. You can see a bunch of discussion on this here: https://github.com/docker/docker/pull/8859 and here: https://github.com/docker/docker/issues/8781
Join us on IRC at #google-containers if you want to talk more.
I think the simplest answer is that there is no simple answer. The swift rise to power of containers, and Docker in particular, has left a power vacuum for "container scheduling and orchestration", whatever that might mean. In reality, that means you have a number of technologies that can work in harmony on some levels, but with certain aspects in competition. For example, Kubernetes can be used as a one-stop shop for deploying and managing containers on a compute cluster (as Google originally designed it), but could also sit atop Fleet, making use of the resilience tier that Fleet provides on CoreOS.
As this Google video states, Kubernetes is not a complete out-of-the-box container scaling solution, but it is a good place to start from. In the same way, you would at some stage expect Apache Mesos to be able to work with Kubernetes, but not with Marathon, in as much as Marathon appears to fulfil the same role as Kubernetes. Somewhere I think I've read these could become part of the same effort, but I could be wrong about that - it's really about the strategic direction of Mesosphere and the corresponding adoption of Kubernetes principles.
In the DockerCon keynote, Solomon Hykes suggested Swarm would be a tier that could provide a common interface onto the many orchestration and scheduling frameworks. From what I can see, Swarm is designed to provide a smooth Docker deployment workflow, working with some existing container workflow frameworks such as Deis, but flexible enough to yield to "heavyweight" deployment and resource management such as Mesos.
Hope this helps - this could be an enormous post. I think the key is that these are young, evolving services that will likely merge and become interoperable, but we need to ride out the next 12 months to see how it plays out. There are some very clever people on the problem, so the future looks very bright.
As far as I understand it:
Mesos, Kubernetes and Fleet are all trying to solve a very similar problem. The idea is that you abstract away all your hardware from developers and the 'cluster management tool' sorts it all out for you. Then all you need to do is give a container to the cluster, give it some info (keep it running permanently, scale up if X happens etc) and the cluster manager will make it happen.
Mesos does all the cluster management for you, but it doesn't include a scheduler. The scheduler is the bit that says: OK, this process needs 2 CPUs and 512 MB of RAM, and I have a machine over there with that free, so I'll run it on that machine. There are some plugin schedulers available for Mesos (Marathon and Chronos), and you can write your own. This gives you a lot of power over resource distribution, cluster scaling, etc.
Fleet and Kubernetes seem to abstract away those sorts of details (so you don't have to write your own scheduler basically). This means you have to define your tasks and submit them in the format/manner defined by Fleet or Kubernetes and then they take over and schedule the tasks (containers) for you.
So I guess: Using Mesos may mean a bit more work in writing your own scheduler, but potentially provides more flexibility if required.
I think the idea of running Kubernetes on top of Mesos is that Kubernetes acts as the scheduler for Mesos. Personally I'm not sure what benefits this brings over running one or the other on its own though (hopefully someone will jump in and explain!)
As MikeB said.. it's early days, and it's all up for grabs (keep an eye on Amazon's ECS as well) so there are many competing standards and a lot of overlap!
-edit- I didn't mention Docker swarm as I don't really have much experience with it.
For anyone coming to this after 2017: Fleet is deprecated. Do not use it anymore.
Fleet docs say "fleet is no longer actively developed or maintained by CoreOS" and link to Container orchestration: Moving from fleet to Kubernetes. Fleet was removed from Container Linux (formerly known as CoreOS Linux) and replaced with Kubernetes kubelet (agent). This coincided with a corporate pivot to offer Tectonic (a Kubernetes distro) as their primary product.
I am trying to build a prototype of Elasticsearch as a Service. I have thought of two different approaches, and I'd like to get opinions towards one or the other implementation.
One single installation of Elasticsearch, and a proxy layer on top to add user validation (http basic authentication + user account to validate the usage).
This approach would be relatively straightforward, and the main challenge would be configuring the cluster properly to handle the load, as well as the permissions, so that there are no data leaks and users don't have access to the cluster management APIs. A minimal sketch of such a proxy is shown after the second option.
Use Docker as a container and have one instance of Elasticsearch for each user. In this case I would be providing isolation by using Linux containers (Docker). I'd still need to manage authentication.
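To illustrate the first option, here is a minimal sketch of such a proxy in Go, using only the standard library. The upstream address and the credential check are hypothetical placeholders; a real implementation would validate users against an account store and decide exactly which management endpoints to block:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// Hypothetical address of the shared Elasticsearch cluster.
	upstream, err := url.Parse("http://localhost:9200")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, pass, ok := r.BasicAuth()
		// Placeholder check; a real implementation would look the user up
		// in an account store and verify a hashed password.
		if !ok || user != "demo" || pass != "secret" {
			w.Header().Set("WWW-Authenticate", `Basic realm="elasticsearch"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		// Crude example of blocking cluster management APIs; deciding which
		// endpoints each user may reach is the hard part.
		if strings.HasPrefix(r.URL.Path, "/_cluster") {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8080", handler))
}
```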
It probably would be good to implement both, play around and see how things behave. Any opinions about pros and cons of each approach?
Thanks!
Disclaimer: I am the founder of the Elasticsearch service provider Facetflow, which currently offers shared clusters.
I think that both approaches have merit, but are maybe suited to different types of customers.
Looking at other SaaS providers, like MongoDB provider MongoLab, they essentially ended up offering both setups (although not using Docker).
So, pros and cons as I see them:
Shared Cluster
Most Elasticsearch as a Service providers operate this way.
Pros:
Far more affordable for the majority of users just looking for good search and analytics.
Simpler maintenance, fewer clusters for you to monitor.
Potentially fewer versions of Elasticsearch to integrate with. If you need to communicate with other systems (which you do) and write your own plugins (we did, for authentication, silos, entitlements, stats etc.), fewer versions will be far easier to maintain.
Cons:
Noisy neighbours have to be monitored and you have to scale and relocate indices to handle this.
Users have to choose from a limited list of versions of Elasticsearch, usually a single version.
Users don't get full cluster admin control.
Private Clusters using Docker
One provider that works this way is Found.
Pros:
Users could potentially be able to deploy a variety of versions of Elasticsearch
Users can have complete cluster admin access
Noisy neighbours don't affect their cluster, less manual intervention from you
Cons:
Complex monitoring and support. If people can do whatever they want (shut down the cluster over the api), you have to be clear where your responsibility as a provider ends, and what wakes you up at night.
Complex integration with multiple versions, see shared cluster pros.
More expensive since you have to allocate resources that might not always be used.
We want to use an ActiveMQ master/slave configuration based on a shared file system on Amazon EC2 - that's the final goal. Operating system should be Ubuntu 12.04, but that shouldn't make too much difference.
Why not a master/slave configuration based on RDS? We've tried that and it's easy to set up (including multi-AZ). However, it is relatively slow and the failover takes approximately three minutes - so we want to find something else.
Which shared file system should we use? We did some research and came to the following conclusion (which might be wrong, so please correct me):
GlusterFS is often suggested and should support multi-AZ setups fine.
NFSv4 should work (while NFSv3 is said to corrupt the file system), but I didn't see too many references to it on EC2 (rather: we asked for NFS and got the suggestion to use GlusterFS). Is there any particular reason for that?
Ubuntu's Ceph isn't stable yet.
Hadoop Distributed File System (HDFS) sounds like overkill to me and the NameNode would again be a single point of failure.
So GlusterFS it is? We found hardly any success stories. Instead, there are rather dissuasive entries in the bug tracker without any real explanation: "I would not recommend using GlusterFS with master/slave shared file system with more than probably 15 enqueued/dequeued messages per second." Does anyone know why, or is anyone successfully using ActiveMQ on GlusterFS with a lot of messages?
EBS or ephemeral storage? Since GlusterFS should replicate all data, we could use ephemeral storage - or are there any advantages to using EBS (IMHO snapshots are not relevant for our scenario)?
We'll probably try out GlusterFS, but according to Murphy's Law we'll run into problems at the worst possible moment. So we'd rather try to avoid that by (hopefully) getting a few more opinions on this. Thanks in advance!
PS: Why didn't I post this on ServerFault? It would be a better fit, but on SO there are 10 times more posts about this topic, so I stuck with the flock.
Just an idea.... but with ActiveMQ 5.7 (or maybe already 5.6) you can have pluggable lockers (http://activemq.apache.org/pluggable-storage-lockers.html). So it might be an option to use the filesystem as storage and RDS as just a locking mechanism. Note that I have never tried this before.