Changing spring-cloud-stream instance index/count at runtime - spring

In spring-cloud-stream, is there a way to change the instance count and instance index of an application without restarting it?
Also, is there any recommended way to automatically populate these values? In the microservices world, this seems like it would prohibitively difficult, since services are starting and stopping all the time.

In spring-cloud-stream, is there a way to change the instance count and instance index of an application without restarting it?
Not in the current version, but open to discuss this in the context of a GitHub issue.
Also, is there any recommended way to automatically populate these values? In the microservices world, this seems like it would prohibitively difficult, since services are starting and stopping all the time.
My recommendation would be to look at http://cloud.spring.io/spring-cloud-dataflow/ which helps with the orchestration of complex microservice topologies (and is designed to work in conjunction with Spring Cloud Stream for streaming scenarios)

Related

Spring boot application restart automatically when Cloud foundry updates/upgrade

I am using Cloud Foundry and I deployed my Spring boot application on Cloud. Whenever there is some updates/upgrade happens on Cloud foundry, my application got restart and some request got failed to reach to application as restart of application takes more time to get up.
Is there any way in CF that some instances of application will be running while upgrade/restart of application to process requests.
Also I want to know, if CF provides services from different locations/regions, so consider my application will be deployed on 2 CF containers available on different region. Wherever there is some updates/upgrade available, proceed upgrade on one region for Cf so other CF service from another region will be available and some application instances will be running to serve requests and vice versa.
-Thank you.
What you're describing is the intended behavior of CF.
If you have two or more instance of your application, they should never both go down at the same time. i.e. one will be taken down, then after it's restarted successfully, then the other will be taken down and restarted.
If your operator has configured multiple availability zones for the foundation that you've targeted, then application instances will be distributed across those AZs to help facilitate HA and best possible availability.
If you're not seeing this behavior then you should take a look at the following as these items can affect uptime of your apps:
Do you have more than one application instance? If you only have one application instance, then you can expect to see some small windows of downtime when updates are applied to the foundation and under other scenarios. This happens because a times Diego will need to evict applications running on a Diego Cell. It makes an attempt to start your app on another Cell before stopping the current instance, but there are no guarantees provided around this. Thus you can end up with some downtime, if for example your app is slow to start or your app does not have a good health check configured (like it passes the health check before the app is really up).
Did your operator set up multiple AZs? As a developer, you cannot really tell. This is abstracted away, so you would need to ask your platform operations team and confirm if there are more than one and if so how many. For best possible uptime, have at least as many app instances as you have AZs.
The other thing often overlooked, does your application depend on any services? If so, it is also possible that you will see downtime when services are being updated. That all depends on the services you are using and if there will be associated downtime for management and upgrades of those services. You may be able to tell if this is the case by looking more closely at your application logs when it fails to see if there are connection failures or errors like that. You might also be able to tell by looking at the plan defined in the CF Marketplace. Often the description will say if there are stipulations regarding the plan, like it is or isn't clustered or HA.
UPDATE
One other thing which can cause downtime:
If your operator has the "max in flight" value too high for the number of Diego Cells this can also cause downtime. Essentially, "max in flight" dictates how many Diego Cells will be taken out of service during an upgrade. If this value is too high, you can run into a situation where there is not enough capacity in the remaining Cells to host all of your applications. This ends up resulting in downtime for app instances as they cannot be rescheduled on another Cell in a timely manner. As a developer, I don't think this is something you can troubleshoot, you would need to work with your platform operators to investigate further.
That is probably a theme here. If you are an app developer, you should be talking to your platform operations team to debug this.
Hope that helps!

Ways to update config properties for a Spring boot rest service

I have a spring boot rest service where configuration values are stored in git and fetched using a config server. Deployment is done in a docker swarm cluster where this service would run across multiple containers. So one thing I had to keep in mind is that when actuator's refresh endpoint is called, it refreshes all the containers for this service seamlessly and not just any random container. This is quite an obvious ask I believe.
I can implement updating the config values for a service as and when it's config changes in git using a message broker. However, that would take time and time is not with me at the moment.
I have come up with two quick solutions and would like your help based on your experience as to which one is better than the other. Keep in mind that both work and I tested them both.
Solution 1
Create a scheduler using #Scheduled in the Application.java and keep pinging actuator's refresh endpoint every 5 seconds. I think this is really expensive and resource intensive in production.
Solution 2
Call actuator's refresh endpoint in the controller method itself. This way, I called refresh endpoint on demand and don't keep polling it like solution 1 and be wasteful. It will also ensure that whatever container is picked for servicing a request, it refreshes itself as refresh endpoint call would refresh the properties referred by that container only.
Do you have any preference on one over the other ? Do you see any pros and cons with these solutions ? which one would you pick and why ?
Please let me know what your thoughts are.
This sounds like an interesting problem. Also, like you pointed out Solution1 is resource intensive and should not be used in production. If you are running out of time, I would suggest you go ahead with Solution2, its smarter than the prior.
However, I think the optimal way to solve this problem can be using webhooks in github. This way github will make an API call to your predefined endpoint when a specific event is generated. Events are the core of Github Webhooks. Here is the list of all github events. Choose the one that best suits your requirement. https://developer.github.com/webhooks/#events

How to share database connection between in spring cloud

How can I share database connection aong in spring cloud module microservices. If there are many microservices how can i use same db connection or should i use db connection per microservices?
In my opinion, the thing that you've asked for is impossible only because each microservice is a dedicated process and it runs inside its own JVM (probably in more than one server). When you create a connection to the database (assuming you use connection pool) its always at the level of a single JVM.
I understand that the chances are that you meant something different but I had to put it on because it directly answers your question
Now, you can share the same database between microservices (the same schema, tables, etc) so that each JVM will have a set of connections opened (in accordance with connection pool definitions).
However, this is a really bad practice - you don't want to share the databases between microservice. The reason is the cost of change: if you (as a maintainer of microservice A) decide to, say, alter one of the tables, now all microservices will have to support this, and this is not a trivial thing to do.
So, a better approach is to have a service that has a "sole responsibility" for your data in some domain. Now, all the services could contact this service and ask for the required data through well-established APIs that should never be broken. In this approach, the cost of change is much "cheaper" since only this "data service" should be changed in a way that it doesn't break existing APIs.
Now regarding the database connection thing: you usually will have more than one JVM that runs the same microservice (like data microservice) so, it's not that you share connections between them, but rather you share the same way of working with database (because after all its the same code).
When dealing with a mircoservice architecture it is usually the case that you have a distributed system.
Most microservices that communicate with each other are not on the same machine, instance or container. Communication between them is most commonly done via http, though there are many other ways.
I would suggest designing mircoservices around a single concern of your application. For example, in your case, you could have a "persistence microservice" that would be responsible for dealing with data persistence operations on a single or multiple types data-stores. It could possibly deal with relational DBs, noSQL, file storage etc. Then, via REST endpoints, you can expose any persistence functionality to the mircoservices that deal with business logic.
A very easy way to build a REST service like this would be with the help of Spring Data REST project.
To answer your actual question, I'm not aware of any way to share actual connections between processes. Beyond that, having many microservices running on the same instance is not a good practice most of the time.
Mircoservices are very popular these days and everybody is trying to transition to them. My advice would be to make sure you don't "over-engineer" your project.
Hope I didn't misunderstand your question, but to be fair it is a little vague. If you could provide a longer more detailed description of your architecture and use case I can suggest more tools/frameworks you can use to achieve your cloudy goals.
First and most important - your microservice should be responsible for handling all data in a given business domain/bounded context. So the question is - 'Why do you need to share database connection between microservices and isn't this a sign you went too far with slicing your system?' Microservice is a tool and word 'micro' may be misleading a bit :)
For more reading I would suggest e.g. https://learn.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/identify-microservice-domain-model-boundaries (don' t worry, it's general enough to be applicable also to Spring).

Clustering Spring Boot applications

I have an adapter (written in Spring Boot and Spring Integration) retrieving currency reates from two different sources (via REST and proprietary library). I filter unnecesary things, create instances of class known in my system and send rates to JMS cluster. I want this adapter to be replicated. Only one instance should be running at the same time. When one crashes (I know it from health endpoint) another one should start publishing rates. How can I achieve such effect? I know that available services can be registered using Eureka but how to turn one of them on automatically?
The solution to the problem is using spring-cloud-cluster. One can use either zookeeper or hazelcast to negotiate leadership. From few instances only one is given a leader role. If it crashes, another one takes its role (it is informed via event propagation). You can also use yieldLeadership method to manually relinquish leadership (if health indicator says something is wrong with the application).
Without knowing more details it is hard to give you a recommendation.
I'd personally say Eureka is not build for what you are trying to achieve. But it sounds more like you want to have a look into ZooKeeper. Also see Eureka FAQ for reference. ZooKeeper was exactly build for doing what you are trying to achieve: leader election.
On the other hand, if you can survive also with having the service down for a few seconds I'd suggest you use either your script that monitors the /health endpoint already to restart the service or use systems who already have this build in like Systemd or Docker, where you can define Restart policies.

Camunda: architecture and decisions for a high-performance, dynamic multi tenant app

I'm in charge of building a complex system for my company, and after some research decided that Camunda fits most of my requirements. But some of my requirements are not common, and after reading the user guide I realized there are many ways of doing the same thing, so I hope this question will clarify my thoughts and also will serve as a base questión for everyone else looking for building something similar.
First of all, I'm planning to build a specific App on top of Camunda BPM. It will use workflow and BPM, but not necessarily all the stuff BPM/Camunda provides. This means it is not in my plans to use mostly of the web apps that came bundled with Camunda (tasks, modeler...), at least not for end users. And to make things more complicated it must support multiple tenants... dynamically.
So, I will try to specify all of my requirements and then hopefully someone with more experience than me could explain which is the best architecture/solution to make this work.
Here we go:
Single App built on top of Camunda BPM
High-performance
Workload (10k new process instances/day after few months).
Users (starting with 1k, expected to be ~ 50k).
Multiple tenants (starting with 10, expected to be ~ 1k)
Tenants dynamically managed (creation, deploy of process definitions)
It will be deployed on cluster
PostgreSQL
WildFly 8.1 preferably
After some research, this are my thoughts
One Process Application
One Process Engine per tenant
Multi tenancy data isolation: schema or table level.
Clustering (2 nodes) at first for high availability, and adding more nodes when amount of tenants and workload start to rise.
Doubts
Should I let camunda manage my users/groups, or better manage this on my app? In this case, can I say to Camunda “User X completed Task Y”, even if camunda does not know about the existence of user X?
What about dynamic multi tenancy? Is it possible to create tenants on the fly and make those tenants persist over time even after restarting the application server? What about re-deployment of processes after restarting?
After which point should I think on partitioning of engines on nodes? It’s hard to figure out how I’m going to do this with dynamic multi tenancy, but moreover... Is this the right way to deal with high workload and growing number of tenants?
With my setup of just one process application, should I take care of something else in a cluster environment?
I'm not ruling out using only one tenant, one process engine and handle everything related to tenants logically within my app, but I understand that this can be very (VERY!) cumbersome.
All answers are welcome, hopefully we'll achieve a good approach to this problem.
1. Should I let camunda manage my users/groups, or better manage this on my app? In this case, can I say to Camunda “User X completed Task Y”, even if camunda does not know about the existence of user X?
Yes, you can choose your app to manage the users and tell Camunda that a task is completed by a user whom Camunda doesn't know about. And same way, you can make Camunda to assign task to users which it doesn't know at all. This is done by implementing their org.camunda.bpm.engine.impl.identity.ReadOnlyIdentityProvider interface and let the configuration know about your implementation.
PS: If you doesn't need all the application that comes with Camunda, I would even suggest you to embed the Camunda engine in your app. It can be done easily and they have good documentation for thier java APIs. And it is easily achievable.
2. What about dynamic multi tenancy? Is it possible to create tenants on the fly and make those tenants persist over time even after restarting the application server? What about re-deployment of processes after restarting?
Yes. Its possible to dynamically add Tenants. While restarting the engine or your application, you can either choose to redeploy / or just use the existing deployed processes. Even when you redeploy a process, if you want Camunda to create a new version of the process only if there is a change in the process, that's also possible. See enableDuplicateFiltering property for their DeploymentBuilder.
3. After which point should I think on partitioning of engines on nodes? It’s hard to figure out how I’m going to do this with dynamic multi tenancy, but moreover... Is this the right way to deal with high workload and growing number of tenants?
In my experience, it is possible. You need to keep track of various parameters here, like memory, number of requests being served, number of open connections available etc., then accordingly add more or remove nodes. With AWS, this will be much easier as they have some of these tools already available for dynamic scaling in / out nodes. But that said, I have done this only with Camunda as embedded engine application(s).

Resources