Clustering Microservice Components - cluster-computing

We have a set of Microservices collaborating with each other in the ecosystem. We used to have occasional problems where one or more of these Microservices would go down unexpectedly. Thankfully, we have some monitoring built around them that detects this and takes corrective action.
Now, we would like to have redundancy built in for each of those Microservices. I'm thinking of a master / slave approach where a slave is always on standby, and when the master goes down, the slave takes over.
Should we consider using a framework as a service registry, where we register each of those Microservices and allow them to be controlled? Any other suggestions on how to achieve this kind of master / slave architecture with the Microservices, so that we get failover redundancy?

I thought about this for a couple of minutes and this is what I currently think is the best method, based on experience.
There are a couple of problems you will face with availability. First is always having at least one endpoint up. This is easy enough to do by installing on multiple servers. In the enterprise space, you would use a name for the endpoint and then have it resolve to multiple servers (virtual or hardware). You would also load balance it.
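To make the "one name, many servers" idea concrete, here is a minimal sketch of a round-robin reverse proxy in Go. It is only an illustration under assumptions of my own (the backend addresses and ports are placeholders); in practice you would more likely use an off-the-shelf load balancer or a DNS entry resolving to several hosts:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}

func main() {
	// Hypothetical backend instances of the same microservice.
	backends := []*url.URL{
		mustParse("http://10.0.0.11:8080"),
		mustParse("http://10.0.0.12:8080"),
	}

	var counter uint64
	proxy := &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			// Pick the next backend in round-robin order.
			target := backends[atomic.AddUint64(&counter, 1)%uint64(len(backends))]
			r.URL.Scheme = target.Scheme
			r.URL.Host = target.Host
		},
	}

	// All clients talk to this single, stable endpoint.
	log.Fatal(http.ListenAndServe(":80", proxy))
}
```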
The second is the registry. This is a very easy problem to solve with API management software. The really good software in this space is not cheap, so this is not weekend-hobbyist territory. But there are open source API management solutions out there. As I work in the enterprise space, I am very familiar with options like Apigee, CA, Mashery, etc., so I cannot recommend an open source option and feel good about myself.
You could build your own registry, if you desire. Just be careful how you design it, as a "registry of all interface points" can lead to a system that becomes more tightly coupled.
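If you do decide to roll your own, the core of a registry can be very small. Below is a hedged sketch of an in-memory registry with register and lookup endpoints; everything here (paths, port, no persistence, no authentication, no health checking) is a simplifying assumption, not a recommendation:

```go
package main

import (
	"encoding/json"
	"net/http"
	"sync"
)

// registry maps a service name to the addresses of its live instances.
var (
	mu       sync.RWMutex
	registry = map[string][]string{}
)

// register adds one instance address under a service name.
func register(w http.ResponseWriter, r *http.Request) {
	name := r.URL.Query().Get("name")
	addr := r.URL.Query().Get("addr")
	mu.Lock()
	registry[name] = append(registry[name], addr)
	mu.Unlock()
	w.WriteHeader(http.StatusNoContent)
}

// lookup returns all known addresses for a service name as JSON.
func lookup(w http.ResponseWriter, r *http.Request) {
	mu.RLock()
	defer mu.RUnlock()
	json.NewEncoder(w).Encode(registry[r.URL.Query().Get("name")])
}

func main() {
	http.HandleFunc("/register", register)
	http.HandleFunc("/lookup", lookup)
	http.ListenAndServe(":7000", nil)
}
```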

Related

What is the difference between Microservices and Decentralized applications

I am new to decentralized applications; after going through some articles I am confused about the difference between microservices and decentralized applications. Can someone help me understand the difference between them? I know that microservices can be built using Spring Boot and Docker. Is there any other technology for building them? I think Ethereum is used to develop decentralized applications. Can someone help me understand the difference?
A microservice application still runs on your infrastructure, you still control all of its nodes, state and infrastructure. So, despite being distributed (and even though the infrastructure might not be yours such as a 3rd party cloud), you still have the power to interfere in all of its aspects.
The main selling point of a decentralized application is that theoretically no one can actually interfere in its infrastructure, since it's not owned by a single entity. Theoretically anyone in the world (and the larger its user base, the more resilient the decentralized application becomes) can become a node in the infrastructure, and the "current valid state" is calculated based on a kind of agreement between the nodes (so, unless you can interfere with a majority of the nodes, which you don't own, you can't change the application's state on your own).
In a certain sense, you're right that they seem similar, since they're both distributed applications. The decentralized ones just go a step further to be not "owned" and "controlled" by a single entity and be the product of an anonymous community.
EDIT
So suppose you/your company makes a very cool microservice application and you host it on a bunch of 3rd party clouds around the world to make sure it's very redundant and always available. A change of heart on your part (or maybe being forced by government regulations) can shut down the application out of the blue, ban certain users from it, or edit/censor content currently being published on it. You're in full control since it's your app. As good as your intentions might be, you are a liability, a single point of failure in the ecosystem.
Now, if your app is decentralized... there's no specific person/entity to be hunted down to force such behavior. You would need to hunt down thousands/millions of owners of independent nodes providing infrastructure to the app and enforcing its agreed set of rules. So how would you go about banning users / censoring content / etc.? You (theoretically) can't... unless you can reach a majority of its nodes, and that has already proven to be quite difficult; even brute force might be nearly impossible to achieve.
Microservice
A microservice is rather a software architecture style. The idea is that you have many small applications - microservices - each focused on addressing only a single goal, but doing it really well.
A specific instance of a microservice could be, for example, an application running an HTTP server for managing users. It could have HTTP endpoints for adding, viewing and deleting users in a database. You would then deploy such an application together with its database on some server.
With a fair degree of simplification, we could say that a microservice is not all that different from the web browser you are running on your computer. The difference between your web browser and a microservice is that the microservice runs on a server and exposes some sort of network interface, whereas your browser runs on your personal computer and doesn't expose a network interface for others to interact with.
The bottom line is that a single microservice is just an application running on a server; you can modify its code anytime, you can stop it anytime, and you can change the data in the database it's using.
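As a rough sketch of what such a user-managing microservice could look like in Go (the in-memory map stands in for the real database, and the endpoint path, field names and port are made up for illustration):

```go
package main

import (
	"encoding/json"
	"net/http"
	"sync"
)

type User struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

var (
	mu    sync.Mutex
	users = map[string]User{}
)

// usersHandler exposes add (POST), view (GET) and delete (DELETE) on /users.
func usersHandler(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	defer mu.Unlock()
	switch r.Method {
	case http.MethodPost:
		var u User
		if err := json.NewDecoder(r.Body).Decode(&u); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		users[u.ID] = u
		w.WriteHeader(http.StatusCreated)
	case http.MethodGet:
		json.NewEncoder(w).Encode(users)
	case http.MethodDelete:
		delete(users, r.URL.Query().Get("id"))
		w.WriteHeader(http.StatusNoContent)
	}
}

func main() {
	http.HandleFunc("/users", usersHandler)
	http.ListenAndServe(":8080", nil)
}
```

You would then put this binary on a server, point it at a real database, and expose its port to the other services that need it.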
Decentralized application
A decentralized application is deployed to a blockchain. A blockchain is a network of computers (the Ethereum MainNet has tens of thousands of nodes), all running the same program. When you write a decentralized application (called a smart contract, in Ethereum terms) and you "deploy it", what happens is that you basically insert your code into this network of computers and each of them will have it available.
Once the code of your application is in the network, you can interact with it - you can call the interface you defined in your decentralized application by sending JSON-RPC requests to a server which is part of this blockchain network.
It then takes some time until your request for execution is picked up by the network. If everything goes right, your request is eventually distributed to the network and executed by every single computer connected to the blockchain.
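To give a feel for what "sending JSON-RPC requests" looks like, here is a small sketch in Go that asks an Ethereum node for the latest block number using the standard eth_blockNumber method; the node URL is a placeholder for whatever node or hosted gateway you have access to:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Standard Ethereum JSON-RPC payload asking for the latest block number.
	payload := []byte(`{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}`)

	// Placeholder node URL; a local node typically listens on port 8545.
	resp, err := http.Post("http://localhost:8545", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // e.g. {"jsonrpc":"2.0","id":1,"result":"0x10d4f"}
}
```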
The consequence of that is that if some computer in the network tries to lie about the result, the fraudulent attempt would be noticed by the rest of the network.
The bottom line here is that a decentralized application is not executed on one computer, but on many (possibly thousands), and even as its creator you cannot modify its code or data (or only to a limited degree).

Looking for help on how to manage microservices in Golang

Currently, I deal with microservices on a daily basis at my 9-5. Most everything that I touch is written in PHP, and since I am only a software engineer, SysOps manages everything that has to do with running the apps, etc. I have a little familiarity with how the infrastructure and build pipeline are set up, but I am still not a SysOps or DevOps guy.
With that said, I love Golang, and for a side project I am creating a fairly large web application with a lot of moving parts. Writing and designing the code is easy, as I have learned a lot from my day job, but deploying and managing Golang web apps (as they are executables) is quite different from updating files for Apache to serve.
I have researched a lot on how I would build and deploy my microservice apps, but I keep on thinking of more problems that will need to be solved along the way. I have tinkered with the idea of using Docker for all of this, but I would rather not have the added complexity of learning that and managing storage for all of the images as that could be large.
Is there a best practice or a good way to manage Golang applications after they have been deployed? I would need a way to keep track of all the microservice processes to be able to see if they are still up and to be able to stop them when a new build is going to be deployed.
As for the setup, just assume that all the microservices will be run on the server, not in a container or in a VM. They will all need to be managed, but also be able to be acted upon independently. Jenkins will be used for building and deploying. I will be using Consul for service discovery and possibly configuration, and most likely health checks on the services. I'm thinking of having each microservice register itself with Consul when started and deregister when stopping.
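As a rough sketch of that register-on-start / deregister-on-stop idea, using the official Consul API client for Go (github.com/hashicorp/consul/api) - the service name, ID, port and health-check path below are all made up for illustration:

```go
package main

import (
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"

	consul "github.com/hashicorp/consul/api"
)

func main() {
	client, err := consul.NewClient(consul.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Register this instance with a simple HTTP health check.
	reg := &consul.AgentServiceRegistration{
		ID:   "orders-1", // hypothetical instance ID
		Name: "orders",   // hypothetical service name
		Port: 8080,
		Check: &consul.AgentServiceCheck{
			HTTP:     "http://localhost:8080/health",
			Interval: "10s",
		},
	}
	if err := client.Agent().ServiceRegister(reg); err != nil {
		log.Fatal(err)
	}

	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	go http.ListenAndServe(":8080", nil)

	// Deregister cleanly when the process is stopped for a new deploy.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
	<-sig
	client.Agent().ServiceDeregister("orders-1")
}
```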
Again, I am looking for a solution that is hopefully not just "Docker". I also had thoughts into creating a deploy service that manages the services (add and remove), as well as registering them in Consul. So if I cannot find a better solution, I might go that path. Any help is appreciated.
** Sorry if my question was confusing, but since a couple of people answered on the wrong topic, I will try to clarify. I don't need any help making the microservices, or learning anything about them. I brought that point up to explain why I need to ask my question. Basically what I need is just the ability to manage the running Go processes of all my microservices so I can do deployments and be able to stop and start processes to update the code. It is easier when you only have to worry about one app, but when you can have up to 10-15 different microservices they become harder to keep track of. After my own research, it seems that Supervisord is what I am looking for, but I'm not sure. That is the direction I am going in with this question. Thanks.
Golang is great to use for microservices, but I would say there is not such a big difference between managing Go microservices and microservices written in other languages.
What I would say is Go-specific:
you don't need to install anything on the servers, since Go compiles to a single static binary
you can take advantage of the standard library's rpc package and gob binary encoding, instead of using a 3rd party solution (gorpc, Protocol Buffers, etc.)
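For that second point, a minimal net/rpc server from the standard library might look like the sketch below (rpc.ServeConn uses gob encoding by default; the Calc service, Multiply method and port are purely illustrative):

```go
package main

import (
	"log"
	"net"
	"net/rpc"
)

type Args struct{ A, B int }

// Calc is an illustrative RPC service.
type Calc struct{}

// Multiply follows the net/rpc method signature convention: (args, *reply) error.
func (c *Calc) Multiply(args *Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

func main() {
	rpc.Register(new(Calc))
	ln, err := net.Listen("tcp", ":9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// ServeConn speaks gob over the raw connection.
		go rpc.ServeConn(conn)
	}
}
```

A client would then use rpc.Dial("tcp", addr) and call client.Call("Calc.Multiply", &Args{6, 7}, &reply).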
Other than that, you need to use your own judgement. There are plenty of ways of doing things in the microservices world; one day you will implement solution A, but if after 3 months you see that it's better to do B, do that.
On the internet there is a lot of reading about microservices. I will recommend two good resources: https://books.google.co.uk/books/about/Building_Microservices.html?id=RDl4BgAAQBAJ&source=kp_cover&redir_esc=y&hl=en
And article: http://highscalability.com/blog/2014/4/8/microservices-not-a-free-lunch.html
Remember, microservices are not a silver bullet. They can often help make an application easier to maintain and grow, but on the other hand they require a lot of additional work, consistency in specifying API contracts, and a strong DevOps culture.

Must Microservices based systems be all in the same network?

I have a web application that is separated into several components. For some reasons (pricing), I'm considering deploying future components on different clouds.
Does anybody have references or experience to tell me whether this is definitely not a good idea? I know that components being in different networks will decrease performance. At the same time, I do not like the idea of losing the power to choose where the new components will live.
Must Microservices based systems be all in the same network? How do you handle this problem?
Having worked with multiple services in the past, I can tell you that services are made to work across separate networks. This is why there are security protocols like CAS, SAML, OAuth, HTTPS, and HMAC, to name a few.
So as long as you are able to deal with the management of the networks, and you have good security around your services (and I assume you do), then I would not be worried about breaking some unspoken microservices rule. Remember that microservices, if written well and are useful, are expected to be used across the Internet, especially for the Internet of Things, so they are expected to be used across multiple networks.
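As a small illustration of the HMAC option from that list, one service can sign the request body with a shared secret and the receiving service can verify it. This is only a sketch under my own assumptions (how the secret is distributed and which header carries the signature are left out):

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign computes an HMAC-SHA256 signature of the request body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the signature and compares it in constant time.
func verify(secret, body []byte, signature string) bool {
	expected, err := hex.DecodeString(signature)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(mac.Sum(nil), expected)
}

func main() {
	secret := []byte("shared-secret") // placeholder; use real key management
	body := []byte(`{"order":42}`)    // example payload

	sig := sign(secret, body)              // sender attaches this, e.g. in a signature header
	fmt.Println(verify(secret, body, sig)) // receiver checks it: true
}
```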
When you start trying this, I would pay very close attention to the bandwidth charges. With AWS, for example, you are OK if you are in the same region; bandwidth between services will not cost much, if anything. But let's say you use AWS and Google Cloud - now you will be paying for the bandwidth between the two providers.
As a suggestion I would look at Docker as a possible solution to your problem/concern of vendor lock in.
You would be restricted to providers that support Docker, but in theory you could migrate between providers easily, since your application would be abstracted from each cloud provider's architecture.
Performance will take a hit with anything leaving the provider's data center. I suppose with some investigation you might try researching providers that use a common internet exchange. That would help minimize a few hops at least.

Marathon vs Aurora and their purposes

Both Marathon and Aurora are built on Mesos and supposedly are engineered for running long running services. My questions are:
What are their differences? I have struggled in finding any good explanations regarding their key differences
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Thanks!
Disclaimer: I am the VP of Apache Aurora, and have been the tech lead of the Aurora team at Twitter for ~5 years. My likely-biased opinions are my own and do not necessarily represent those of Twitter or the ASF.
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Essentially, yes. Ultimately these systems are sophisticated machinery to execute shell code somewhere in a cluster :-)
What are their differences? I have struggled in finding any good explanations regarding their key differences
Aurora and Marathon do indeed offer similar feature sets, both being classified as "service schedulers". In other words, you hand us instructions for how to run your application servers, and we do our best to keep them up.
I'll offer some differences in broad strokes. When it comes to shortcomings mentioned in each, I think it's safe to say that the communities are aware and intend to fix them.
Ease of use
Aurora is not easy to install. It will likely feel like you are trailblazing while setting it up. It exposes a thrift API, which means you'll need a thrift client to interact with it programmatically (a REST-like API is coming, but is vaporware at the moment), or use our command line client. Aurora has a DSL for configuration which can be daunting, but allows you to easily share templates and common patterns as you use the system more.
Marathon, on the other hand, helps you to run 'Hello World' as quickly as possible. It has great docs to do this in many environments and there's little overhead to get going. It has a REST API, making it easier to adapt to custom tools. It uses JSON for configuration, which is easy to start with but more prone to cargo culting.
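To illustrate that JSON / REST point, deploying an application to Marathon is essentially a single POST of an app definition to its /v2/apps endpoint. A hedged sketch in Go, where the Marathon URL and the app definition values are placeholders:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// A minimal Marathon app definition in JSON (values are illustrative).
	app := []byte(`{
		"id": "/hello-world",
		"cmd": "python3 -m http.server 8080",
		"cpus": 0.1,
		"mem": 64,
		"instances": 2
	}`)

	// Placeholder URL for your Marathon installation.
	resp, err := http.Post("http://marathon.example.com:8080/v2/apps",
		"application/json", bytes.NewReader(app))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("Marathon responded with:", resp.Status)
}
```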
Targeted use cases
Aurora has always been designed to handle a large engineering organization. The clusters at Twitter have tens of thousands of machines and hundreds of engineers using them. It is critical to Twitter's business. As a result, we take our requirements of scale, stability, and security very seriously. We make sure to only endorse features that we believe are trustworthy at scale in production (for example, we have our Docker support labeled as beta because of known issues with Docker itself and the Mesos-Docker integration). We also have features like preemption that make our clusters suitable for mixing business-critical services with prototypes and experiments.
I can't make any claim for or against Marathon's scalability. On the feature front, Marathon has built out features quickly, but this can feel bleeding edge in practice (Docker support is a good example). This is not always due to Marathon itself, but also to layers further down the stack. Marathon does not provide preemption.
Ownership
To some, ownership and governance of a project is important. I feel that in practice it does not define the openness of a project, but for some people/companies the legal fine print can be a deal-breaker.
Marathon is owned by a company (Mesosphere)
To some, this is beneficial; to others it is not. It means that you can pay for support and features. It also means that there is something to be sold, and the project direction is ultimately decided by Mesosphere's interests.
Aurora is owned by the Apache Software Foundation
This means it is subject to the governance model of the ASF, driven by the community. Aurora does not have paying customers, and there is not currently a software shop that you can pay for development.
tl;dr If you are just getting your feet wet with running services on Mesos, I would suggest Marathon as your first port of call. It will be easier for you to get running and poke around the ecosystem. If you are forming the 'private cloud strategy' for a company, I suggest seriously considering Aurora, as it is proven and specifically designed for that.
So I've been evaluating both and this is my summary.
Aurora
[+] also handles recurring jobs
[+] finer grained, extensive file-based configuration
[+] has namespaces so multiple environments can co-exist
[-] read-only UI, no official API
[~] file based configuration and cli based execution brings overhead (which can be justified with more extensive feature set)
Marathon
[+] very easy to setup and use
[+] UI that provides control and extensive API (even with features missing from UI at the moment)
[+] event bus to listen in on api calls
[-] handles only long-running jobs
[-] does not have separate deployment-run-cleanup steps; if necessary, these need to be combined in a script or one-liner
Even though Aurora has better capabilities, I prefer Marathon due to Aurora's complexity/overhead and lack of UI (for control) & API
I have more experience with Marathon.
Ideological:
Marathon is a relatively tested product that is used in production at AirBnB. Aurora is an early Apache project (so YMMV).
Both are open source and active. Feel free to contribute pull requests or file issues!
Technical:
Marathon doesn't schedule batch tasks or cron jobs
Marathon has a friendly UI and better health indicators (in 0.8.x)
In regards to your second question, you can run any command or docker container, and Mesos will do the resource isolation for you. If you have 50% CentOS nodes and 50% Ubuntu nodes and you run a task that executes apt-get, the task will have a 50% chance of failure. Mesos and Marathon have no awareness of the actual machines.
Disclaimer: I don't have hands-on experience with Aurora, only with Marathon.
ad Q1: In a nutshell Apache Aurora is capable of doing what Marathon + Chronos can provide, that is, schedule both long-running services and recurring (batch) jobs; see also Aurora user guide.
ad Q2: Yes, anything. Currently based on cgroups and Docker but hey, you can roll your own.

Why would you not want to use Cloud Computing [closed]

Our company is considering moving from hosting our own servers to EC2 and I was wondering if this was a good idea.
I have seen a lot of stuff about can cloud computing (and specifically EC2) do x, or can it do y, but my real question is why would you NOT want to use it?
If you were setting up a business, what are the reasons (outside of cost) that you would choose to go through the trouble of managing your own servers?
I know there are a lot of cost calculations you can put in regarding bandwidth, disk usage etc, but there are of course, other costs regarding maintenance of your own server. For the sake of this discussion I am willing to consider the costs roughly equal.
I seem to remember that Joel Spolsky wrote a little blurb on this at one time, but I was unable to find it.
Anyone have any reasons?
Thanks!
I can think of several reasons not to use EC2 (and I am talking about EC2, not grid computing in general):
Reliability: Amazon makes no guarantee as to the availability / down time / safety of EC2
Security: Amazon does not make any guarantee as to whom it will disclose your data
Persistence: ensuring persistence of your data (and that includes the effort to set up the system) is complicated on EC2
Management: there are very few integrated management tools for a cloud deployed on EC2
Network: the virtual network that allows EC2 instances to communicate has some quite painful limitations (latency, no multicast, arbitrary topological location)
And to finish that:
Cost: in the long run, if you are not using EC2 to absorb peak traffic, it is going to be much more costly than investing in your own servers (cheap servers like Supermicro cost just a couple of hundred bucks...)
On the other side, I still think EC2 is a great way to soak up non-sensitive peak traffic, if your architecture allows it.
Some questions to ask:
What is the expected uptime, and how does downtime affect your business? What sort of service level agreement can you get, what are the penalties for missing it, and how confident are you that the SLA uptime goals will be met? (They may be better or worse at keeping the systems up than you are.)
How sensitive is the data you're proposing to put into the cloud? Again, we get into the questions of how secure the provider promises to be, what the contractual penalties and indemnities are, and how confident you are that the provider will live up to the agreement. Further, there may be external requirements. If you deal with health-related data in the US, you are subject to very strict requirements. If you deal with credit card data, you also have responsibilities (contractual, not legal).
How easy will it be to back out of the arrangement, should service not be what was expected, or if you find a better deal elsewhere? This includes not only getting your data back, but also some version of the applications you've been using. Consider the possibilities of your provider going bankrupt (Amazon isn't going to go bankrupt any time soon, but they could split off a cloud provider which could then go bankrupt), or having an internal reorganization. Bear in mind that a company in serious trouble may not be able to live up to your expectations of service.
How much independence are you going to have? Are you going to be running their software or software you pick? How easy will it be to reconfigure?
What is the pricing scheme? Is it possible for the bills to hit unacceptable levels without adequate warning?
What is the disaster plan? Ideally, it's running your software on servers in a different location from where the disaster hit.
What does your legal department (or retained corporate attorney) think of the contract? Is there a dispute resolution mechanism, and, if so, is it fair to you?
Finally, what do you expect to get out of moving to the cloud? What are you willing to pay? What can you compromise on, and what do you need?
Highly sensitive data might be better to control yourself. And there's legislation; some privacy-sensitive information, for example, might not be allowed to leave the country.
Also, except for Microsoft Azure in combination with SDS, the data stores tend to be not relational, which is a nuisance in certain cases.
Maybe the concern that such a big company is more likely to be approached by an Agent Smith from the government to spy on everyone than a small provider somewhere.
A big company - more customers - more data to aggregate and recognize patterns in - more resources to organize a sophisticated surveillance system.
Maybe it's more of a fantasy, but who knows?
Just because you aren't paranoid doesn't mean you aren't being watched.
The big one is: if Amazon goes down, there's nothing you can do to bring it back up.
I'm not talking about doomsday scenarios where the company disappears. I mean that you're at the mercy of their downtime, with little recourse of your own.
Security -- you don't know what is being done to your data
Dependency -- your business is now directly intertwined with the provider
There are different kinds of cloud computing, with lots of different vendors providing it. It would make me nervous to code my apps to work with a single cloud vendor that you specifically have to code for. With Amazon and Microsoft, I believe you need to code specifically for their platforms - maybe Google too.
That said, I recently jettisoned my own dedicated servers and moved to Rackspace's Mosso Cloud platform (which requires no proprietary coding) and I am really, really pleased with it so far. It cut my costs in half, and performance is way better than before. My SQL Server databases are now running on 64-bit Enterprise SQL Server with 32 GB of RAM - that would have cost me a fortune on my previous provider's infrastructure.
As for being out of luck when the cloud is down, that was just as true of my dedicated server - it never went down, but if there had been a hardware crash on it, I am not sure it would have been back online any quicker than Rackspace could bring their cloud back up.
Lack of control.
Putting your software on someone else's cloud represents handing over some control. They might institute a file upload size limit, or memory limits which could ruin your application. A security vulnerability in their control panel could get your site hacked.
Security issues are not relevant if your application does its own encryption. Amazon is then storing encrypted data that they have no way of decrypting.
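A sketch of that do-it-yourself encryption idea in Go, sealing data with AES-GCM before it ever leaves your own machines; key management is deliberately left out here and would be the hard part in practice:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// encrypt seals plaintext with AES-256-GCM; only ciphertext ever leaves your servers.
func encrypt(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // key must be 32 bytes for AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so the holder of the key can decrypt later.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32) // in practice this comes from your own key management
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	ct, err := encrypt(key, []byte("customer record"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("store this in the cloud: %x\n", ct)
}
```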
But in addition to the uptime issues, Amazon could decide to increase their prices to whatever they want. If you're dependent on them, you'll just have to pay it.
It depends how much you trust your own infrastructure in comparison to a 3rd party cloud service. In my opinion, most businesses (at least those not IT-related) should choose the latter.
Another thing you lose with the cloud is the ability to choose exactly what operating system you want to run. For example, the latest Fedora Linux kernel available on EC2 is FC8, and the latest Windows version is Server 2003.
Besides the issues raised regarding dependability, reliability, and cost is the issue of data ownership. When you locate data on someone else's server, you no longer control who views, accesses, modifies, or uses that data. While the cloud operators can limit your access, you possess no way of limiting theirs or limiting who they give access to. Yes, you can encrypt all the data on the server but you lack any way of knowing who possesses root access to the server itself and any means to stop others from downloading your encrypted data and cracking it open. You lose control over your data; depending on what type of apps you are running and the proprietary nature of the data involved, this could engender corporate security and/or liability risks.
The other factor to consider is what would happen to your company if Amazon and/or EC2 were to suddenly vanish overnight. While a seemingly preposterous position, it could happen. Would you be able to quickly fill the hole and restore service, or would your potentially revenue generating apps languish while the IT staff scramble to obtain servers and bandwidth to get them back online? Also, what would happen to your data? The cloud hard drive holding all your information still exists, somewhere, and could pose a potential liability risk depending on the information you stored there--items such as personal information, business transaction records etc.
If I were starting my own business now, I would go through the hassle of purchasing and maintaining my own servers so I retained data ownership. I could control root access to the hardware, as well as control who can access and modify the data.
Unanswered security questions.
Really, do you want your IP out there, where you're not the one in control of it?
Most cloud computing environment are at least partially vendor specific. There's no good way to move stuff from one cloud to another without having to do a lot of rewriting. That sort of lock-in puts you at the mercy of one vendor when it comes to downtime, price increases, etc. If you rent or own your own servers, hosting providers and colos are pretty much interchangeable. You always have the option of moving somewhere else.
This may change in the future, as these things become standardized, but for now tying yourself to the cloud means tying yourself to a specific vendor.
This is kind of like the "Why would you use Linux" comment I received from management many years ago. The response I got was that it is a solution in search of a problem.
So what are your goals and objectives in moving to EC2?
I'd be interested to know if you'd still want to move to a cloud, if it was your own.
Cloud computing has brought parallel programming a little closer to the masses, but you still have to understand how best to use it - otherwise you're going to waste compute cycles and bandwidth.
Re-architecting your application for most efficient use of a cloud computing service is non-trivial.
Besides what has already been said here, we have to consider uniformity across the business. Are all of your applications going to be hosted in the cloud, or only most? Is most enough to pull the trigger on using the cloud when you still have to have personnel to handle a few special servers?
In particular, there might be special hardware that you need to communicate with, such as modems to accept incoming data, or voice cards that make automated phone calls. I don't know how such things could be handled in a cloud environment.

Resources