I have been working with Spring Cloud Data Flow and I have created a Composed Task that involves 3 tasks.
My question is: is it possible to share information between those tasks? What would be a good pattern for doing this?
I know that Spring Cloud Data Flow executes composed tasks as Spring Batch jobs, so I was thinking it might be feasible to share information using the Context.
As answered in the other thread, we don't provide an out-of-the-box opinion on this, for the reason described there.
However, if you have a requirement for it, you could use a shared database or a pub/sub broker to share information.
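For illustration, here is a minimal sketch of the shared-database approach, assuming the tasks already share a datasource; the table name task_shared_state and its two-column schema are assumptions made up for this example:

    import org.springframework.jdbc.core.JdbcTemplate;

    public class SharedStateRepository {

        private final JdbcTemplate jdbcTemplate;

        public SharedStateRepository(JdbcTemplate jdbcTemplate) {
            this.jdbcTemplate = jdbcTemplate;
        }

        // Called by the producing task before it completes.
        public void put(String key, String value) {
            jdbcTemplate.update(
                    "INSERT INTO task_shared_state (state_key, state_value) VALUES (?, ?)",
                    key, value);
        }

        // Called by the consuming task when it starts.
        public String get(String key) {
            return jdbcTemplate.queryForObject(
                    "SELECT state_value FROM task_shared_state WHERE state_key = ?",
                    String.class, key);
        }
    }

Each task in the composed graph would call put(...) before finishing and get(...) when starting, so downstream tasks can pick up what upstream tasks produced.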
I have an application split into various OSGi bundles which run on a single Apache Karaf instance. However, I want to migrate to a microservice framework because
Apache Karaf is pretty tough to set up due to its dependency mechanism and
I want to be able to bring the application later to the cloud (AWS, GCloud, whatever)
I did some research, had a look at various frameworks, and concluded that Quarkus might be the right choice due to its container-based approach, its performance, and possible cloud integration opportunities.
Now, I am struggling at one point and I haven't found a solution so far, but maybe I also have a misunderstanding here: my plan is to migrate almost every OSGi bundle of my application into a separate microservice. That way, I would be able to horizontally scale only the services for which this is necessary, and I could also update/deploy them separately without having to restart the whole application. Thus, I assume that every service needs to run in a separate Quarkus instance. However, Quarkus does not seem to support this out of the box. Instead, I would need to create a separate configuration for each Quarkus instance.
Is this really the way to go? How can the services discover each other? And is there a way that service A can communicate with service B not only via REST calls, but also use objects, classes, and methods of service B, incorporating a dependency on service B into service A?
Thanks a lot for any ideas on this!
I think you are mixing some points between microservices and OSGi-based applications. With microservices you usually have an independent process running each microservice, which can be deployed on the same or other machines. Because of that you can scale as you said and gain benefits. But the communication model is not process-to-process. It has to use a different approach, and it's highly recommended that you use a standard integration mechanism: REST, JSON-RPC, SOAP, or queues/topics for event-driven communication. Through these mechanisms you invoke the 'other' service's operations as you do in OSGi, but via a different interface: instead of a local invocation you do a remote invocation.
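As a minimal sketch of what that remote invocation can look like in Quarkus, here is a MicroProfile REST Client interface (Quarkus supports this via its REST Client extension); the service name "greeting-service" and the /greetings endpoint are hypothetical, not from your application:

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import org.eclipse.microprofile.rest.client.inject.RegisterRestClient;

    @Path("/greetings")
    @RegisterRestClient(configKey = "greeting-service")
    public interface GreetingServiceClient {

        // Called like a local method, but executed as an HTTP GET against
        // service B. The base URL is bound in application.properties, e.g.
        // greeting-service/mp-rest/url=http://greeting-service:8080
        // (which is also where a Kubernetes DNS name would go).
        @GET
        @Path("/{name}")
        String greet(@PathParam("name") String name);
    }

Service A injects this interface and calls it like a local bean, while the actual invocation crosses the process boundary over HTTP.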
Service discovery is something that you can do with just virtual IPs, accessing other services through a common DNS name and a load balancer, or using Kubernetes DNS if you go for Kubernetes as your platform. You could also use a central configuration service, or let each service register itself in a central registry. There are already plenty of different flavours of solutions to tackle this complexity.
Also, more importantly, you will have to be aware of your new complexities, though some of them you already have:
Contract versioning and design
Synchronous or asynchronous communication between services.
How to deal with security at the boundary of the services / Do I even need security in most of my services, or do I just need information about the user's identity?
Increased maintenance cost and redundant boilerplate code for common features (here Quarkus helps you a lot with its extensions, and you also have MicroProfile compatibility).
...
Deciding to go with microservices is not an easy decision, and not one that should be taken in a single step. My recommendation is that you analyse your application domain and check whether your design is suitable for microservices (in terms of separation of concerns and model cohesion), and extract small parts of your OSGi platform into microservices. Otherwise you will mostly be forced to make changes in your service interfaces, which is more difficult to do, due to the service-to-service contract dependency, than changing a method and some invocations.
I am a newbie to Micrometer. Could anyone let me know how to manage microservice metrics centrally in Spring Boot?
Where can I get all registered service information and metrics, and store the metrics in InfluxDB?
Assuming that you're asking "How to use Micrometer with Spring Boot for collecting metrics from heterogeneous services which have multiple instances on multiple hosts" as there is nothing special with the microservice architecture compared to the assumed environment, you need to add dimensions to metrics for hosts, application instances, and so on. You can achieve it with the common tags support. See the section for it in the Spring Boot reference guide.
UPDATED:
To answer the additional question on the below comment, I created a sample showing how to use common tags with environment variables. Note that it's on the branch common-tags-2.1.x-with-env, not the master.
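As a rough illustration of the idea (this is not the linked sample itself), a MeterRegistryCustomizer bean can apply common tags taken from environment variables; the HOSTNAME variable and the "host" tag name here are assumptions you would adapt to your own dimensions:

    import io.micrometer.core.instrument.MeterRegistry;
    import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class MetricsConfig {

        // HOSTNAME is commonly set in containerized environments; every metric
        // published by this instance will then carry a "host" dimension.
        @Bean
        public MeterRegistryCustomizer<MeterRegistry> commonTags() {
            String host = System.getenv().getOrDefault("HOSTNAME", "unknown");
            return registry -> registry.config().commonTags("host", host);
        }
    }

Alternatively, static tags can be declared with the management.metrics.tags.* properties, as described in the same section of the reference guide.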
I have created a project with two REST APIs that launch different jobs. My project is connected to a MySQL database. I would like to monitor both jobs in Spring Cloud Data Flow. Please help me out with how to configure SCDF against MySQL so that both jobs will be monitored. Additionally, I would like to know whether, if we launch a job by firing the API, SCDF will monitor those job instances. If not, please let me know how we can do that.
Thanks in advance
Please take a moment to read the Spring Batch Admin to SCDF migration guide. It is a requirement that the jobs are wrapped with the Spring Cloud Task programming model.
Once you have the batch-job wrapped as a Task, you can register them in SCDF to build Task/Batch pipelines using SCDF's DSL or the GUI.
As for the datasource, all you need to make sure is that the same datasource is shared between SCDF and the batch jobs. With this, SCDF's Dashboard will automatically list the jobs and their execution details.
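As a rough sketch of that setup (assuming spring-cloud-starter-task and spring-boot-starter-batch are on the classpath; the class name is hypothetical), wrapping the batch job looks like this:

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.task.configuration.EnableTask;

    // @EnableTask records this application's executions in the task repository.
    // Point both this app and the SCDF server at the same MySQL database via
    // the standard spring.datasource.* properties so the Dashboard can surface
    // the job executions.
    @EnableTask
    @EnableBatchProcessing
    @SpringBootApplication
    public class BatchJobApplication {

        public static void main(String[] args) {
            SpringApplication.run(BatchJobApplication.class, args);
        }
    }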
Here are a few examples for your reference.
And additionally, I would like to know whether, if we launch the job by firing the API, SCDF will monitor those job instances
Assuming you're referring to SCDF's Task launch API (e.g., a scheduled trigger or other means): if triggered that way, yes, the job executions will be captured in the database, as long as SCDF and the batch jobs share a common datasource, as explained previously.
We are planning to retire the existing legacy Java batch applications and recreate them with the latest available batch framework.
Given that we have a large number of batch jobs to be modernised, we are looking for a framework or architecture that would allow us to
Develop a batch solution that would allow us to dynamically deploy new batches as and when they are created, without disturbing the existing deployed applications. Does Spring Cloud Task provide any such feature? Note: we are looking only to deploy the apps to our local server, and have nothing to do with the cloud.
If Spring Batch/Boot can provide us the features we typically expect from a batch application, what is the special value-add of going for Spring Cloud Task? I wasn't able to completely understand this from the Spring documentation available online.
From the documentation of Spring Cloud Task, I understand that it allows an application to have many tasks within it. What should I do if each of the tasks has its own library dependencies, which might conflict with the dependencies of other tasks? In that case, should each of these tasks be moved to a new application, or is there a workaround?
To answer your questions:
Does Spring Cloud Task handle orchestration - No. Spring Cloud Task does not handle orchestration of tasks or jobs. The component in this ecosystem that handles the deployment/orchestration of tasks or jobs is really Spring Cloud Data Flow (which is why I asked if you use any type of cloud platform including YARN, Cloud Foundry, Kubernetes, or Mesos...the environments supported by Spring Cloud Data Flow).
What added value does Spring Cloud Task provide over Spring Boot/Spring Batch - Spring Cloud Task is designed to provide a few things:
Similar abilities to Spring Batch with regards to state management without needing to create a batch job. When running a Boot application on a cloud environment, there is no standard way of getting the results from environment to environment (YARN handles job results differently from tasks on Cloud Foundry, which is different from jobs on Kubernetes, etc.). Spring Batch provides this, but then all short-lived processes need the overhead of the Batch API, so Spring Cloud Task provides a lighter touch for those use cases (see the sketch after this list).
Automatically adds informational listeners. With Spring XD, when you ran a job in an XD container, the XD container automatically added a number of informational listeners that broadcast events that you could listen for. Spring Cloud Task brings the same functionality without the need for the XD container.
Integration with Spring Cloud Stream. Spring Cloud Task provides the ability to launch tasks from messages received from Spring Cloud Stream. Also, the informational messages previously mentioned (both Batch events as well as Task events) are sent via Spring Cloud Stream channels.
The DeployerPartitionHandler. When working in a cloud environment, this PartitionHandler implementation allows you to launch workers for a partitioned batch job as tasks. This allows for the dynamic scaling of partitioned batch jobs instead of the traditional option of pre-deploying workers that listen for work which wastes resources in a modern cloud environment.
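To make the "lighter touch" in the first point concrete, a minimal Spring Cloud Task is just a Boot application annotated with @EnableTask; everything in this sketch beyond that annotation is illustrative:

    import org.springframework.boot.CommandLineRunner;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.task.configuration.EnableTask;
    import org.springframework.context.annotation.Bean;

    @EnableTask
    @SpringBootApplication
    public class SimpleTaskApplication {

        public static void main(String[] args) {
            SpringApplication.run(SimpleTaskApplication.class, args);
        }

        // The whole application run is the task; its start time, end time, and
        // exit code are recorded in the task repository without any Batch job.
        @Bean
        public CommandLineRunner runner() {
            return args -> System.out.println("Hello from a Spring Cloud Task");
        }
    }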
How does the packaging of multiple tasks work with dependencies - In short, this is not recommended. The idea of a Spring Cloud Task is that the execution of the Spring Boot application is the task. While you could package up multiple tasks and, using different methods, have them execute based on different stimuli, that goes against the twelve-factor application concepts, which are essential for the correct use of Spring Cloud Task.
My two cents
For the best option for a modern batch platform, you really need to look into some form of platform first, and that begins at the Cloud Foundry/Kubernetes/Mesos/YARN layer. Without that, you end up building a large part of the infrastructure yourself. That is why Spring XD evolved into Spring Cloud Data Flow. The added complexity that lived in the containers of Spring XD is removed by requiring a modern platform to run on (since they all handle those guarantees themselves). Without that piece, you're going to spend a lot of time managing the deployment and orchestration of applications that most modern platforms handle for you.
From there, the choice becomes pretty easy IMHO with Spring Cloud Task for simple tasks, Spring Batch for batch jobs, and Spring Cloud Data Flow for orchestration.
Thanks for your attention. I have defined a combined Spring Batch and Spring Integration project that communicates with an FTP server to retrieve files, processes them, and writes the results back to FTP. I am looking for a good architecture for my project, and I designed an architecture with Spring Integration as in the diagram below:
When a file is retrieved from the server, it is processed, and files are routed based on a condition to the mvChannel and toGet channels. I have many processing scenarios for the files retrieved from the server, so I defined a router that handles the scenario, routes to the job channels, and runs Spring Batch.
Now, my question is: is this architecture right?
Really looks good. It is a typical pattern to gain from both the Spring Integration and Spring Batch worlds by integrating them.
However, I don't see a reason for the last <aggregator>. All your jobs can send their results to the <int-ftp:outbound-gateway command="PUT">. There looks to be no need to wait for all results in order to do some analysis on them in a single place.
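For illustration, the same suggestion expressed in the Spring Integration Java DSL would look roughly like this; the FTP host, credentials, and the "jobResults" channel name are assumptions, not taken from your diagram:

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.dsl.IntegrationFlow;
    import org.springframework.integration.dsl.IntegrationFlows;
    import org.springframework.integration.file.remote.gateway.AbstractRemoteFileOutboundGateway;
    import org.springframework.integration.ftp.dsl.Ftp;
    import org.springframework.integration.ftp.session.DefaultFtpSessionFactory;

    @Configuration
    public class FtpUploadFlow {

        @Bean
        public DefaultFtpSessionFactory ftpSessionFactory() {
            DefaultFtpSessionFactory factory = new DefaultFtpSessionFactory();
            factory.setHost("ftp.example.com"); // assumption
            factory.setUsername("user");        // assumption
            factory.setPassword("pass");        // assumption
            return factory;
        }

        @Bean
        public IntegrationFlow uploadFlow() {
            // Every job sends its result file reference to "jobResults";
            // the gateway PUTs it to the remote server immediately, with no
            // aggregator waiting for the other jobs to finish.
            return IntegrationFlows.from("jobResults")
                    .handle(Ftp.outboundGateway(ftpSessionFactory(),
                            AbstractRemoteFileOutboundGateway.Command.PUT, "payload"))
                    .get();
        }
    }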