Spring Boot application restarts automatically when Cloud Foundry updates/upgrades - spring

I am using Cloud Foundry and have deployed my Spring Boot application on it. Whenever an update/upgrade happens on Cloud Foundry, my application gets restarted, and some requests fail to reach it because the application takes a while to come back up.
Is there any way in CF to keep some instances of the application running during an upgrade/restart so that they can process requests?
I would also like to know whether CF provides services from different locations/regions. Suppose my application is deployed on 2 CF containers in different regions. Whenever an update/upgrade is available, could it be applied to one region at a time, so that CF in the other region stays available and some application instances keep running to serve requests, and vice versa?
-Thank you.

What you're describing is the intended behavior of CF.
If you have two or more instances of your application, they should never all go down at the same time, i.e. one will be taken down, and only after it has restarted successfully will the next one be taken down and restarted.
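The one-at-a-time behavior can be illustrated with a toy simulation. This is only a sketch of the idea, not how Diego is implemented; the function name `rolling_restart` is made up for the example, and it assumes every instance comes back healthy before the next one is stopped.

```python
def rolling_restart(instance_count):
    """Restart instances one at a time; return the lowest number that stayed up."""
    running = [True] * instance_count
    min_available = instance_count
    for i in range(instance_count):
        running[i] = False  # this instance is taken down for the update
        min_available = min(min_available, sum(running))
        running[i] = True   # assumed: it passes its health check and is back
    return min_available


# With a single instance there is a window with nothing serving requests.
print(rolling_restart(1))  # 0
# With two or more, at least one instance is always up.
print(rolling_restart(2))  # 1
```

The single-instance case is the one that produces the failed requests described in the question.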
If your operator has configured multiple availability zones for the foundation that you've targeted, then application instances will be distributed across those AZs to help facilitate HA and best possible availability.
If you're not seeing this behavior then you should take a look at the following as these items can affect uptime of your apps:
Do you have more than one application instance? If you only have one application instance, then you can expect to see some small windows of downtime when updates are applied to the foundation and under other scenarios. This happens because at times Diego will need to evict applications running on a Diego Cell. It makes an attempt to start your app on another Cell before stopping the current instance, but there are no guarantees provided around this. Thus you can end up with some downtime if, for example, your app is slow to start or your app does not have a good health check configured (i.e. it passes the health check before the app is really up).
Did your operator set up multiple AZs? As a developer, you cannot really tell. This is abstracted away, so you would need to ask your platform operations team and confirm if there are more than one and if so how many. For best possible uptime, have at least as many app instances as you have AZs.
The other thing often overlooked, does your application depend on any services? If so, it is also possible that you will see downtime when services are being updated. That all depends on the services you are using and if there will be associated downtime for management and upgrades of those services. You may be able to tell if this is the case by looking more closely at your application logs when it fails to see if there are connection failures or errors like that. You might also be able to tell by looking at the plan defined in the CF Marketplace. Often the description will say if there are stipulations regarding the plan, like it is or isn't clustered or HA.
UPDATE
One other thing which can cause downtime:
If your operator has the "max in flight" value too high for the number of Diego Cells, this can also cause downtime. Essentially, "max in flight" dictates how many Diego Cells will be taken out of service at the same time during an upgrade. If this value is too high, you can run into a situation where there is not enough capacity in the remaining Cells to host all of your applications. This results in downtime for app instances, as they cannot be rescheduled on another Cell in a timely manner. As a developer, I don't think this is something you can troubleshoot; you would need to work with your platform operators to investigate further.
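A rough way to picture the capacity problem is the sketch below. It is deliberately simplified: capacity is counted in app instances per Cell rather than memory or disk, and the function name and all numbers are hypothetical, not values from any real foundation.

```python
def can_reschedule(cells, instances_per_cell, total_instances, max_in_flight):
    """True if the Cells left in service can still host every app instance."""
    remaining_cells = cells - max_in_flight
    return remaining_cells * instances_per_cell >= total_instances


# 10 Cells, room for 20 instances each, 150 instances to host (made-up numbers).
print(can_reschedule(10, 20, 150, max_in_flight=1))  # True: 9 * 20 = 180
print(can_reschedule(10, 20, 150, max_in_flight=3))  # False: 7 * 20 = 140
```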
That is probably a theme here. If you are an app developer, you should be talking to your platform operations team to debug this.
Hope that helps!

Related

Dockerized spring boot app with multiple (dynamic) datasources

I'm working on a Spring Boot application, using AbstractRoutingDatasource in order to provide a multitenancy feature. New tenants can be added dynamically, and each one has its own datasource and pool configuration. Everything is working well, since we only have around 10 tenants right now.
But I'm wondering: since the application is running in a Docker container with limited resources, as the number of tenants grows, more and more threads will be allocated for the connections (considering a pool of 1 to 30 threads for each tenant), and at some point (with 50 tenants, for example) the container will be killed due to the memory limit defined at container startup.
It appears to me that, this multitenancy solution (using AbstractRoutingDatasource) is not suitable to an application designed to be containerized since I can't simply scale it horizontally to deal with more tenants.
Am I missing something? Should I be worried about that?
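To put the concern in numbers, here is a small back-of-the-envelope sketch. It assumes the pool sizes from the question (up to 30 per tenant); the function names and the connection budget of 300 are made up for illustration.

```python
def worst_case_connections(tenants, max_pool_size):
    """Upper bound on open connections if every tenant pool is fully used."""
    return tenants * max_pool_size


def max_pool_size_for_budget(tenants, connection_budget):
    """Largest per-tenant pool size that keeps the total within a fixed budget."""
    return max(1, connection_budget // tenants)


print(worst_case_connections(10, 30))     # 300
print(worst_case_connections(50, 30))     # 1500
print(max_pool_size_for_budget(50, 300))  # 6
```

The worst case grows linearly with the number of tenants, which is why a fixed per-tenant maximum eventually collides with a fixed container limit.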
The point made in the post is about the system resource exhaustion that might arise from the increasing volume of requests as more tenants are added to the system. I would like to address a few points:
The whole infrastructure can be managed efficiently using ECS & AWS Fargate, so that new containers are spun up automatically to take the load when it is high. If you run separate servers instead, ELB might be of help. There will be no issues when spinning up new containers / servers as your services are stateless.
Regarding the number of active connections to a database from your application, you should profile your app and understand its data access patterns. Master data or static information should NOT be fetched from the database every time (only the first time); it should be cached instead. There are many managed cache services that can help you scale better.
In regards to the database, it is understood that tenants have their own databases, in case of a very large tenant, try to scale out the databases as well.
Focus on building the entire suite of features using the async features in Java or using RxJava, so that the async nature helps with managing threads.
Since you have not mentioned which cloud your applications will be deployed on, I have used AWS as the example. However, most of these features are available across Azure, GCP or AWS.
There are a lot of strategies for scaling; a proper understanding of the business needs, data usage patterns etc. will help decide on the right approach.
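The caching point above can be sketched as a tiny read-through cache: the database is only hit the first time a key is requested. This is an in-process illustration with made-up names (`ReadThroughCache`, `load_country`); a managed cache service would play the same role across containers.

```python
class ReadThroughCache:
    """Returns cached values and calls the loader only on a cache miss."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}

    def get(self, key):
        if key not in self._values:
            # First request for this key: go to the backing store once.
            self._values[key] = self._loader(key)
        return self._values[key]


db_reads = []  # records every (simulated) database read


def load_country(code):
    """Stand-in for a database query on static master data."""
    db_reads.append(code)
    return {"code": code, "name": code.upper()}


cache = ReadThroughCache(load_country)
cache.get("de")
cache.get("de")
print(len(db_reads))  # 1
```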
Hope this clarifies.

What is difference between Microservices and Decentralized applications

I am new to decentralized applications, and after going through some articles I am confused about the difference between microservices and decentralized applications. I know that microservices can be built using Spring Boot & Docker. Is there any other technology available to build them? I think Ethereum is used to develop decentralized applications. Can someone help me understand the difference?
A microservice application still runs on your infrastructure, you still control all of its nodes, state and infrastructure. So, despite being distributed (and even though the infrastructure might not be yours such as a 3rd party cloud), you still have the power to interfere in all of its aspects.
The main selling point of a decentralized application is that theoretically no one can actually interfere in its infrastructure, since it's not owned by a single entity. Theoretically anyone in the world can become a node in the infrastructure (and the larger its user base, the more resilient the decentralized application becomes), and the "current valid state" is calculated based on a kind of agreement between the nodes (so, unless you can interfere in a majority of the nodes, which you don't own, you can't change the application's state on your own).
In a certain sense, you're right that they seem similar, since they're both distributed applications. The decentralized ones just go a step further to be not "owned" and "controlled" by a single entity and be the product of an anonymous community.
EDIT
So suppose you/your company makes a very cool microservice application and you host it on a bunch of 3rd party clouds around the world to make sure it's very redundant and always available. A change of heart on your part (or maybe being forced to do so by government regulations) can shut down the application out of the blue, ban certain users from it, or edit/censor content currently being published on it. You're in full control since it's your app. As good as your intentions might be, you are a liability, a single point of failure in the ecosystem.
Now, if your app is decentralized... there's no specific person/entity to be hunted down to force such a behavior. You would need to hunt down thousands/millions of owners of single independent nodes providing infrastructure to the app and enforcing its agreed set of rules. So how would you go about banning users / censoring content / etc.? You (theoretically) can't... unless you can reach a majority of its nodes, and that has already proven to be quite difficult; even with brute force it might be nearly impossible to achieve.
Microservice
Microservices are rather a software architecture. The idea is that you have many small applications (microservices), each focused on addressing only a single goal, but doing it really well.
A specific instance of a microservice can be, for example, an application running an HTTP server for managing users. It could have HTTP endpoints for adding, viewing and deleting users in a database. You would then deploy such an application together with its database on some server.
With a fair degree of simplification, we could say that a microservice is not all that different from a web browser you are running on your computer. The difference between your web browser and a microservice is that the microservice will be running on a server, exposing some sort of network interface, whereas your browser runs on your personal computer and doesn't expose a network interface for others to interact with.
The bottom line is that a single microservice is just an application running on a server: you can modify its code at any time, you can stop it at any time, and you can change data in the database it's using.
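As a rough illustration of how small such a service can be, here is a sketch of the three operations. The HTTP layer and the database are left out on purpose (a plain dict stands in for the database), and the class and method names are made up for the example.

```python
class UserService:
    """In-memory stand-in for a user-management microservice."""

    def __init__(self):
        self._users = {}   # plays the role of the database table
        self._next_id = 1

    def add_user(self, name):
        """Would back something like POST /users; returns the new id."""
        user_id = self._next_id
        self._users[user_id] = {"id": user_id, "name": name}
        self._next_id += 1
        return user_id

    def get_user(self, user_id):
        """Would back something like GET /users/<id>; None if unknown."""
        return self._users.get(user_id)

    def delete_user(self, user_id):
        """Would back something like DELETE /users/<id>; True if removed."""
        return self._users.pop(user_id, None) is not None


service = UserService()
alice_id = service.add_user("alice")
print(service.get_user(alice_id))
print(service.delete_user(alice_id))
```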
Decentralized application
A decentralized application is deployed to a blockchain. A blockchain is a network of computers (Ethereum MainNet has tens of thousands of nodes), all running the same program. When you write a decentralized application (called a smart contract, in terms of the Ethereum blockchain) and you "deploy it", what happens is that you basically insert your code into this network of computers and each of them will have it available.
Once the code of your application is in the network, you can interact with it: you can call the interface you defined in your decentralized application by sending JSON-RPC requests to a server which is part of this blockchain network.
It then takes some time until your request for execution is picked up by the network. If everything goes right, your request is eventually distributed to the network and executed by every single computer connected to the blockchain.
The consequence of that is that if some computer in the network tries to lie about the result, the fraudulent attempt would be noticed by the rest of the network.
The bottom line here is that a decentralized application is not executed on one computer, but on many (possibly thousands), and even as its creator, you cannot freely modify its code or data (you can, but only to a limited degree).
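The "noticed by the rest of the network" part can be pictured with a toy majority check. This is a heavy simplification: real blockchains such as Ethereum use dedicated consensus protocols rather than a plain vote, and the function name is made up for the example.

```python
from collections import Counter


def agreed_result(reported_results):
    """Return the value reported by a strict majority of nodes, else None."""
    if not reported_results:
        return None
    value, votes = Counter(reported_results).most_common(1)[0]
    if votes * 2 > len(reported_results):
        return value
    return None


# Four honest nodes report 42, one node lies and reports 7.
print(agreed_result([42, 42, 7, 42, 42]))  # 42
```

A single lying node cannot change the outcome; it would have to control more than half of the reports.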

Clustering Microservice Components

We have a set of Microservices collaborating with each other in the ecosystem. We used to have occasional problems where one or more of these Microservices would go down accidentally. Thankfully, we have some monitoring built around them which detects this and takes corrective action.
Now, we would like to have redundancy built around each of those Microservices. I'm thinking more like a master / slave approach where a slave is always on stand by and when the master goes off, the slave picks it up.
Should we consider using any framework that we could use as service registry, where we register each of those Microservices and allow them to be controlled? Any other suggestions on how to achieve the kind of master / slave architecture with the Microservices that would enable us to have failover redundancy?
I thought about this for a couple of minutes and this is what I currently think is the best method, based on experience.
There are a couple of problems you will face with availability. First is always having at least one endpoint up. This is easy enough to do by installing on multiple servers. In the enterprise space, you would use a name for the endpoint and then have it resolve to multiple servers (virtual or hardware). You would also load balance it.
The second is registry. This is a very easy problem with API management software. The really good software in this space is not cheap, so this is not a weekend hobbyist type of software. But there are open source API Management solutions out there. As I work in the Enterprise space, I am very familiar with options like Apigee, CA, Mashery, etc. so I cannot recommend an open source option and feel good about myself.
You could build your own registry, if you desire. Just be careful how you design it, as a "registry of all interface points" leads to a service that becomes more tightly coupled.
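If you do go the build-your-own route, the core of a registry with primary/standby failover is small. The sketch below is only an illustration under simple assumptions: instances are tried in registration order, health is flipped by an external monitor calling `mark_down`, and all names and addresses are made up.

```python
class ServiceRegistry:
    """Maps a service name to its instances and resolves the first healthy one."""

    def __init__(self):
        self._instances = {}  # service name -> list of instance records

    def register(self, service, address):
        record = {"address": address, "healthy": True}
        self._instances.setdefault(service, []).append(record)

    def mark_down(self, service, address):
        """Called by monitoring when an instance stops responding."""
        for record in self._instances.get(service, []):
            if record["address"] == address:
                record["healthy"] = False

    def resolve(self, service):
        """Return the first healthy instance (primary first), or None."""
        for record in self._instances.get(service, []):
            if record["healthy"]:
                return record["address"]
        return None


registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")  # primary
registry.register("orders", "10.0.0.2:8080")  # standby
registry.mark_down("orders", "10.0.0.1:8080")
print(registry.resolve("orders"))  # 10.0.0.2:8080
```

Callers only ever ask the registry for a service name, so the switch to the standby needs no change on their side.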

Camunda: architecture and decisions for a high-performance, dynamic multi tenant app

I'm in charge of building a complex system for my company, and after some research I decided that Camunda fits most of my requirements. But some of my requirements are not common, and after reading the user guide I realized there are many ways of doing the same thing, so I hope this question will clarify my thoughts and also serve as a base question for anyone else looking to build something similar.
First of all, I'm planning to build a specific App on top of Camunda BPM. It will use workflow and BPM, but not necessarily all the stuff BPM/Camunda provides. This means I do not plan to use most of the web apps that come bundled with Camunda (tasks, modeler...), at least not for end users. And to make things more complicated, it must support multiple tenants... dynamically.
So, I will try to specify all of my requirements and then hopefully someone with more experience than me could explain which is the best architecture/solution to make this work.
Here we go:
Single App built on top of Camunda BPM
High-performance
Workload (10k new process instances/day after few months).
Users (starting with 1k, expected to be ~ 50k).
Multiple tenants (starting with 10, expected to be ~ 1k)
Tenants dynamically managed (creation, deploy of process definitions)
It will be deployed on cluster
PostgreSQL
WildFly 8.1 preferably
After some research, these are my thoughts
One Process Application
One Process Engine per tenant
Multi tenancy data isolation: schema or table level.
Clustering (2 nodes) at first for high availability, and adding more nodes when amount of tenants and workload start to rise.
Doubts
Should I let camunda manage my users/groups, or better manage this on my app? In this case, can I say to Camunda “User X completed Task Y”, even if camunda does not know about the existence of user X?
What about dynamic multi tenancy? Is it possible to create tenants on the fly and make those tenants persist over time even after restarting the application server? What about re-deployment of processes after restarting?
After which point should I think on partitioning of engines on nodes? It’s hard to figure out how I’m going to do this with dynamic multi tenancy, but moreover... Is this the right way to deal with high workload and growing number of tenants?
With my setup of just one process application, should I take care of something else in a cluster environment?
I'm not ruling out using only one tenant, one process engine and handle everything related to tenants logically within my app, but I understand that this can be very (VERY!) cumbersome.
All answers are welcome, hopefully we'll achieve a good approach to this problem.
1. Should I let camunda manage my users/groups, or better manage this on my app? In this case, can I say to Camunda “User X completed Task Y”, even if camunda does not know about the existence of user X?
Yes, you can have your app manage the users and tell Camunda that a task was completed by a user whom Camunda doesn't know about. In the same way, you can make Camunda assign tasks to users it doesn't know at all. This is done by implementing their org.camunda.bpm.engine.impl.identity.ReadOnlyIdentityProvider interface and letting the configuration know about your implementation.
PS: If you don't need all the applications that come with Camunda, I would even suggest embedding the Camunda engine in your app. It is easily achievable, and they have good documentation for their Java APIs.
2. What about dynamic multi tenancy? Is it possible to create tenants on the fly and make those tenants persist over time even after restarting the application server? What about re-deployment of processes after restarting?
Yes. It's possible to add tenants dynamically. When restarting the engine or your application, you can choose either to redeploy or to just use the already deployed processes. Even when you redeploy a process, it is possible to have Camunda create a new version of the process only if the process has changed. See the enableDuplicateFiltering property of their DeploymentBuilder.
3. After which point should I think on partitioning of engines on nodes? It’s hard to figure out how I’m going to do this with dynamic multi tenancy, but moreover... Is this the right way to deal with high workload and growing number of tenants?
In my experience, it is possible. You need to keep track of various parameters here, like memory, number of requests being served, number of open connections available etc., and then add or remove nodes accordingly. With AWS, this will be much easier, as they already have some of the tools for dynamically scaling nodes in / out. That said, I have done this only with Camunda as an embedded engine in the application(s).
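The idea behind the duplicate filtering mentioned in answer 2 can be pictured like this: a redeploy only creates a new version when the process content differs from the latest deployed one. The sketch below is not Camunda's API or implementation; `DeploymentStore` and its method are hypothetical names used only to show the concept.

```python
import hashlib


class DeploymentStore:
    """Keeps process versions and skips redeploys of unchanged content."""

    def __init__(self):
        self._versions = {}  # process name -> list of content hashes

    def deploy(self, name, bpmn_xml):
        """Deploy a process and return the version number now in effect."""
        digest = hashlib.sha256(bpmn_xml.encode("utf-8")).hexdigest()
        versions = self._versions.setdefault(name, [])
        if versions and versions[-1] == digest:
            return len(versions)  # unchanged: keep the existing version
        versions.append(digest)
        return len(versions)


store = DeploymentStore()
print(store.deploy("invoice", "<process id='invoice'/>"))     # 1
print(store.deploy("invoice", "<process id='invoice'/>"))     # 1 (restart, no change)
print(store.deploy("invoice", "<process id='invoice-v2'/>"))  # 2
```

This is why redeploying on every application restart does not have to pile up process versions.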

Basic AWS questions

I'm a newbie on AWS, and it has so many products (EC2, Load Balancer, EBS, S3, SimpleDB etc.) and so many docs that I can't figure out where to start.
My goal is to be ready for scalability.
Suppose I want to set up a simple webserver, which access a database in mongolab. I suppose I need one EC2 instance to run it. At this point, do I need something more (EBS, S3, etc.)?
Suppose that at some point my app has reached enough traffic and I must scale it. I was thinking of starting a new copy (instance) of my EC2 machine. But then it will have another IP. So how is traffic distributed between both EC2 instances? Is that done automatically? Must I pay for a Load Balancer service to distribute the traffic? And then will I have to pay for 2 EC2 instances and 1 LB? At this point, do I need something more (e.g. Elastic IP)?
Welcome to the club Sony Santos,
AWS is a very powerful architecture, but with this power comes responsibility. I, and presumably many others, have learned the hard way building applications using AWS's services.
You ask, where do I start? This is actually a very good question, but you probably won't like my answer. You need to read and do research about all the technologies offered by Amazon and even other providers such as Rackspace, GoGrid, Google's Cloud and Azure. Amazon is not easy to get going with, but it's not really meant to be; its focus is more on being very customizable and having a very extensive API. But let's get back to your question.
To run a simple webserver you would need to start an EC2 instance. By default this instance runs on a disk drive called EBS. Essentially an EBS drive is a normal hard drive, except that you can do lots of other cool stuff with it, like take it off one server and move it to another. S3 is really more of a file storage system; it's more useful if you have a bunch of images or if you want to store a lot of backups of your databases etc., but it's not a requirement for a simple webserver. Just running an EC2 instance is all you need; everything else will happen behind the scenes.
If your app reaches a lot of traffic you have two options. You can scale your machine up by shutting it off and starting it with a larger instance. Generally speaking this is the easiest thing to do, but you'll get to a point where you either cannot handle all the traffic with 1 instance even at the larger size and you'll decide you need two, OR you'll want a more fault-tolerant application that will still be online in the event of a failure or update.
If you create a second instance you will need to do some form of load balancing. I recommend using Amazon's Elastic Load Balancer, as it's easy to configure and its integration with the cloud is better than using Round Robin DNS or an application like haproxy. Elastic Load Balancers are not expensive; I believe they cost around $18 / month + the data that's passed through the load balancer.
But no, you don't need anything else to scale up your site. 2 EC2 instances and an ELB will do the trick.
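To make the "how is traffic distributed" part concrete, here is a toy version of what a load balancer does, using the simplest possible strategy (round robin). It is not a description of how ELB works internally, and the instance names are made up.

```python
from itertools import cycle


def distribute(requests, backends):
    """Assign each request to the next backend in turn (round robin)."""
    assignment = {backend: [] for backend in backends}
    targets = cycle(backends)
    for request in requests:
        assignment[next(targets)].append(request)
    return assignment


# Five requests spread over two EC2 instances.
print(distribute(range(5), ["ec2-a", "ec2-b"]))
# {'ec2-a': [0, 2, 4], 'ec2-b': [1, 3]}
```

Clients only see the load balancer's address, so the second instance having a different IP does not matter to them.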
Additional questions you didn't ask but probably should have.
How often does an EC2 instance experience hardware failure and crash my server? What can I do if this happens?
It happens frequently, usually in batches. Sometimes I go months without any problems, then I will get a few servers crashing at a time. But it's definitely something you should plan for. I didn't in the beginning and I paid for it. Make sure you create scripts and have backups and a backup plan ready in case your server fails. Be OK with it being down or have a load-balanced solution from day 1.
What's the hardest part about scalability?
Testing testing testing testing... Don't ever assume anything. Also be prepared for sudden spikes in your traffic. You have to be prepared for anything: if your page goes from 1 to 1000 people overnight, are you prepared to handle it? Have you tested what you "think" will happen?
Best of luck and have fun... I know I have :)

Resources