spring-cloud-task: how to pass messages or flags between two apps

I have already built an ingestion job using Spring Batch that reads an XML file and ingests it into AEM, and it is working fine.
Now I am trying to convert this app into a Spring Cloud Task. I want to split it into four separate apps, connect them in a Spring Cloud Data Flow workflow, and pass some data and flags between them so that the next step in the flow executes based on those values.
Is this possible with Spring Cloud Task? If yes, how can I bind the apps together? Please point me to a programming tutorial.

In the recent 1.2.0.RELEASE, we released a new feature called "Composed Tasks". With it, you can define a directed graph made up of several Spring Cloud Task (SCT) applications.
Each step in your flow can be an independent SCT application, which you can develop, test, and run through CI/CD in isolation. Once you're ready to orchestrate them as a composed graph, you register the apps and wire them together using the purpose-built composed-task DSL or the drag & drop GUI, as in the sketch below.
Check out this screencast for more details.
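For illustration only, here is roughly what that could look like from the Data Flow shell, assuming four hypothetical task apps named ingest, transform, enrich, and publish (the names, Maven coordinates, and versions are placeholders, not anything from the original job):

```
dataflow:> app register --name ingest --type task --uri maven://com.example:ingest-task:0.0.1
dataflow:> app register --name transform --type task --uri maven://com.example:transform-task:0.0.1
dataflow:> app register --name enrich --type task --uri maven://com.example:enrich-task:0.0.1
dataflow:> app register --name publish --type task --uri maven://com.example:publish-task:0.0.1

dataflow:> task create ingestion-flow --definition "ingest && transform && enrich && publish"
dataflow:> task launch ingestion-flow
```

The && operator chains tasks sequentially; exit-code transitions (e.g. ingest 'FAILED' -> notify) and splits (<transform || enrich>) let the graph branch on the exit status each app reports, which is how "pass a flag and decide the next step" is typically modeled in a composed task.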

Related

Data Migration using Spring

We are beginning the process of re-architecting the systems within our company.
One of the key components of the work is a new data model which better meets our requirements.
A major part of the initial phase of the work is to design and build a data migration tool.
This will take data from one or more existing systems and migrate it to the new model.
Some requirements:
Transformation of data to the new model
Enrichment of data, with default values or according to business rules
Integration with existing systems to pull data
Integration with Salesforce CRM which is being introduced into the company.
Logging and notification about failures
Within the Spring world, which is the best Spring project to use as the underlying framework for such a data migration tool?
My initial thoughts are to look at implementing the tool using Spring Integration.
This would provide:
Through the XML or Java DSL, a high-level view of the data flow that can be seen, understood, and edited (possibly using a visual tool such as an STS plugin). Being able to view the high-level flow in such a way is a big advantage.
Connectors to work with different data sources.
Transformer components to be built to migrate data formats.
Routers to route the data in the new model to endpoints which connect with the target systems.
However, are there other Spring projects, such as Spring Data or Spring Batch, which are a better match for the requirements?
Very much appreciate feedback and ideas.
I would certainly start with Spring Integration, which provides bare-bones implementations of the Enterprise Integration Patterns that are at the core of most, if not all, of the requirements you listed.
It is also an excellent problem-modelling tool: it helps you understand the problem better and then envision its implementation as one cohesive integration flow.
Later on, once you have a clear understanding of how things work, it would be straightforward to take it to the next level by introducing the "other frameworks" you mentioned/tagged, such as #spring-cloud-data-flow and #spring-cloud-stream.
Overall this question is rather broad, so consider following the pointers above (and the sketch below) to get started, and then raise more concrete questions.
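As a rough sketch of what such a flow could look like with the Spring Integration Java DSL - the record types, channel names, polling interval, and target systems are illustrative assumptions (Java 17-style records are used for brevity), not a prescribed design:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.integration.config.EnableIntegration;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.dsl.Pollers;
import org.springframework.messaging.MessageChannel;

// High-level migration flow: poll the legacy source, transform each record
// into the new model (with enrichment), then route it to the channel of the
// target system. LegacyRecord, NewModelRecord and the channels are hypothetical.
@Configuration
@EnableIntegration
public class MigrationFlowConfig {

    record LegacyRecord(String id, String rawValue) {}
    record NewModelRecord(String id, String value, String targetSystem) {}

    @Bean
    public IntegrationFlow migrationFlow() {
        return IntegrationFlows
                .fromSupplier(this::fetchNextLegacyRecord,
                        e -> e.poller(Pollers.fixedDelay(30000)))          // poll every 30s
                .<LegacyRecord, NewModelRecord>transform(this::toNewModel)
                .<NewModelRecord, String>route(NewModelRecord::targetSystem, r -> r
                        .channelMapping("salesforce", "salesforceChannel")
                        .channelMapping("warehouse", "warehouseChannel"))
                .get();
    }

    @Bean
    public MessageChannel salesforceChannel() {
        return new DirectChannel();   // downstream flow would push records to Salesforce
    }

    @Bean
    public MessageChannel warehouseChannel() {
        return new DirectChannel();   // downstream flow for another target system
    }

    private LegacyRecord fetchNextLegacyRecord() {
        // In a real flow this would pull from the existing systems (JDBC, file, web service, ...)
        return new LegacyRecord("42", "  raw value  ");
    }

    private NewModelRecord toNewModel(LegacyRecord legacy) {
        // Transformation plus enrichment with defaults or business rules
        return new NewModelRecord(legacy.id(), legacy.rawValue().trim(), "salesforce");
    }
}
```

The same flow could equally be expressed in XML; the point is that the transform/route structure of the migration stays visible at a glance.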

Is it suitable to use Spring Cloud Data Flow to orchestrate long-running external batch jobs inside always-running apps?

We have Spring Batch applications with triggers defined in each app.
Each Batch application runs tens of similar jobs with different parameters, and is able to do that within 1400 MiB per app.
We use Spring Batch Admin, which was deprecated years ago, to launch individual jobs and to get a brief overview of what is going on in them. The migration guide recommends replacing Spring Batch Admin with Spring Cloud Data Flow.
The Spring Cloud Data Flow docs talk about grabbing a jar from a Maven repo and running it with some parameters. I don't like the idea of waiting 20 seconds for the application to download and 2 minutes for it to launch, plus all the security/certificate/firewall issues (how can I download a proprietary jar across intranets?).
I'd like to register existing applications in Spring Cloud Data Flow via IP/port, pass job definitions to the Spring Batch applications, and monitor executions (including the ability to stop a job). Is Spring Cloud Data Flow usable for that?
A few things to unpack here; here's an attempt at it.
The Spring Cloud Data Flow docs talk about grabbing a jar from a Maven repo and running it with some parameters. I don't like the idea of waiting 20 seconds for the application to download and 2 minutes for it to launch, plus all the security/certificate/firewall issues
Yes, there's an app resolution process. However, once downloaded, the app is reused from the local Maven cache.
As for the 2-minute bootstrap window, that depends on Boot, the number of configuration objects, and of course your business logic; perhaps in your case all of that adds up to 2 minutes.
how can I download a proprietary jar across intranets?
There's an option to resolve artifacts from a Maven repository hosted behind the firewall through proxies - we have users on this model for proprietary JARs.
Each Batch application runs tens of similar jobs with different parameters and is able to do that within 1400 MiB per app.
You may want to consider the Composed Task feature. Not only does it provide the ability to launch child tasks as directed acyclic graphs, it also allows transitions based on the exit code at each node, to further split and branch into more tasks. All of this, of course, is automatically recorded at each execution level for further tracking and monitoring from the SCDF Dashboard.
I'd like to register existing applications in Spring Cloud Data Flow via IP/port, pass job definitions to the Spring Batch applications, and monitor executions (including the ability to stop a job).
As long as the batch jobs are wrapped as Spring Cloud Task apps, yes, you'd be able to register them in SCDF and use them in the DSL, or drag & drop them onto the visual canvas, to create coherent data pipelines. We have a few "batch-job as task" samples here and here; a minimal wrapper looks roughly like the sketch below.
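A minimal sketch, assuming the Spring Batch 4.x builder factories and made-up class/job/step names:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;
import org.springframework.context.annotation.Bean;

// @EnableTask records every launch in the task repository, which is what SCDF
// uses to track, monitor, and correlate the Spring Batch job executions.
@EnableTask
@EnableBatchProcessing
@SpringBootApplication
public class BatchTaskApplication {

    public static void main(String[] args) {
        SpringApplication.run(BatchTaskApplication.class, args);
    }

    @Bean
    public Job importJob(JobBuilderFactory jobs, StepBuilderFactory steps) {
        Step step = steps.get("importStep")
                .tasklet((contribution, chunkContext) -> {
                    // the existing batch logic goes here
                    return RepeatStatus.FINISHED;
                })
                .build();
        return jobs.get("importJob").start(step).build();
    }
}
```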

Spring Batch or core Spring libraries for building a file-processing workflow

I'm dipping my toes into microservices. Is Spring Boot with Spring Batch applicable to the following requirements?
One or more files are read from a specific directory on Linux.
Several operations are applied, such as regex matching, building new files, writing them out, and FTPing them to a location.
Send an email when the process fails.
Using Spring Boot is confirmed; now the question is:
Should I use Spring Batch or just the core Spring Framework?
I need to integrate with Control-M to trigger the job. Can Control-M be removed completely by using the Spring Batch library, given that we don't know when to expect the files in the directory?
I've not seen a POC with these requirements. Could someone provide an example POC, or confirm that this can be achieved with Spring Batch?
I would use Spring Batch for that use case. Not only does it provide out-of-the-box components for reading, processing, and writing files, it adds a lot more for error handling, scalability, etc. Without Spring Batch, you'd probably end up wiring all of that up yourself.
As for being launched via Control-M, yes, MANY large customers use Control-M to launch their jobs. Unfortunately, I've never done it myself, so I cannot provide details on the mechanics, but if Control-M can either launch a script or call a REST API, you can launch a job with it; a rough sketch of such an endpoint follows.
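Purely as a sketch (the endpoint path, job name, and request parameter are assumptions for illustration, not part of Spring Batch or Control-M):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Exposes a REST endpoint that launches the batch job; an external scheduler
// such as Control-M (or a wrapper script calling curl) can trigger it on demand.
@RestController
public class JobLaunchController {

    private final JobLauncher jobLauncher;
    private final Job fileProcessingJob;

    public JobLaunchController(JobLauncher jobLauncher, Job fileProcessingJob) {
        this.jobLauncher = jobLauncher;
        this.fileProcessingJob = fileProcessingJob;
    }

    @PostMapping("/jobs/file-processing")
    public ResponseEntity<String> launch(@RequestParam String inputDir) throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addString("inputDir", inputDir)
                .addLong("launchTime", System.currentTimeMillis()) // make each run's parameters unique
                .toJobParameters();
        JobExecution execution = jobLauncher.run(fileProcessingJob, params);
        return ResponseEntity.ok("Started execution " + execution.getId()
                + " with status " + execution.getStatus());
    }
}
```

Control-M would then invoke the endpoint as the scheduled action; in production you would secure the endpoint and validate the parameters.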
I would suggest you go for Spring Batch, as it has a lot of built-in functionality for reading files and writing them to your required location, and it also covers the record-skipping requirement (see the sketch below). Your mail-triggering requirement can be handled by Control-M: decide on one exit code for your handled exception, and based on that exit code trigger the mail to the respective members. There are many other features that will be helpful if you go with Spring Batch.
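For example, a hedged sketch of a chunk-oriented step with built-in file reading/writing and record skipping - the paths, bean names, and skip limit are arbitrary assumptions, and the Spring Batch 4.x builder factories are assumed:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.FlatFileParseException;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
@EnableBatchProcessing
public class FileProcessingJobConfig {

    @Bean
    public FlatFileItemReader<String> reader() {
        return new FlatFileItemReaderBuilder<String>()
                .name("inputReader")
                .resource(new FileSystemResource("/data/in/input.txt"))
                .lineMapper(new PassThroughLineMapper())
                .build();
    }

    @Bean
    public FlatFileItemWriter<String> writer() {
        return new FlatFileItemWriterBuilder<String>()
                .name("outputWriter")
                .resource(new FileSystemResource("/data/out/output.txt"))
                .lineAggregator(new PassThroughLineAggregator<>())
                .build();
    }

    @Bean
    public Step processFileStep(StepBuilderFactory steps) {
        return steps.get("processFileStep")
                .<String, String>chunk(100)                 // commit every 100 lines
                .reader(reader())
                .processor((ItemProcessor<String, String>) line -> line.trim().toUpperCase())
                .writer(writer())
                .faultTolerant()
                .skip(FlatFileParseException.class)         // skip malformed records...
                .skipLimit(10)                              // ...up to a limit
                .build();
    }

    @Bean
    public Job fileProcessingJob(JobBuilderFactory jobs, Step processFileStep) {
        return jobs.get("fileProcessingJob").start(processFileStep).build();
    }
}
```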

What modules to use for a synchronization service in Java/Spring?

I want to build a synchronization service in Java. The use case is that I fetch data from an Exchange service (via Exchange Web Services), normalize the data a bit (a processing step, probably), and then write it to a backend via GraphQL. I've already had a look around the Spring modules, but I'm not quite sure which to use. I found Spring Batch and Spring Quartz.
The synchronization will have to trigger every X seconds, fetch information from Exchange, check what's already in the backend, and update what's needed.
Do you have any suggestions? I started implementing this in Node.js, but since it has to run on both Windows Server and Docker/Linux, it has been a real pain to keep it running smoothly (mostly because bundling Node.js into an application for Windows is painful).
Difference between Spring Batch & Quartz:
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data and Quartz provides functionality for scheduling tasks.
So Quartz can complement Spring Batch; a common combination is to use Quartz as the trigger for a Spring Batch job via a cron expression.
Conclusion: Spring Batch defines what should be done, Quartz defines when it should be done.
Quartz is a scheduling framework, as in "execute something every hour or on the last Friday of the month".
Spring Batch is a framework that defines the "something" that will be executed.
You can define a job that consists of steps. Usually a step consists of an item reader, an optional item processor, and an item writer, but you can also define a custom step. You can also tell Spring Batch to commit every 10 items, and a lot of other things.
You can use Quartz to start Spring Batch jobs, for example:
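A hedged sketch of that combination - the class, bean, and job names are made up, and it assumes Spring Boot's Quartz support so that @Autowired works inside the Quartz job:

```java
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.QuartzJobBean;

// Quartz decides *when*: this QuartzJobBean fires on the trigger below and
// hands off to Spring Batch, which defines *what* the synchronization does.
public class LaunchSyncJob extends QuartzJobBean {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job synchronizationJob;   // the Spring Batch job to run

    @Override
    protected void executeInternal(JobExecutionContext context) throws JobExecutionException {
        try {
            JobParameters params = new JobParametersBuilder()
                    .addLong("runTime", System.currentTimeMillis()) // unique parameters per run
                    .toJobParameters();
            jobLauncher.run(synchronizationJob, params);
        } catch (Exception e) {
            throw new JobExecutionException(e);
        }
    }
}

@Configuration
class QuartzScheduleConfig {

    @Bean
    public JobDetail syncJobDetail() {
        return JobBuilder.newJob(LaunchSyncJob.class)
                .withIdentity("syncJobDetail")
                .storeDurably()
                .build();
    }

    @Bean
    public Trigger syncTrigger(JobDetail syncJobDetail) {
        return TriggerBuilder.newTrigger()
                .forJob(syncJobDetail)
                .withIdentity("syncTrigger")
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInSeconds(30)            // the "every X seconds" requirement
                        .repeatForever())
                .build();
    }
}
```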
Recommended for your use case:
Quartz scheduling, since you want to trigger after a specific interval.
Reference: https://projects.spring.io/spring-batch/faq.html

Activiti vs Spring Batch

I have a use case to implement. It's basically a workflow kind of use case. Below are the requirements:
Extract and import data from an external DB to an internal DB
Convert the imported data into different formats, supply it to multiple external systems, and invoke a script there. The external interfaces are SFTP, SOAP, JDBC, and Python over CORBA. There are around 14 external systems, each using one of these interfaces.
Interface transactions are executed in around 15 steps, with the ability to run some steps in parallel
These steps should be configurable, i.e. a particular flow may execute 10 of the 15 steps while another flow executes all 15
Should have the ability to restart each step individually or restart from a particular step
Some steps are manual, and completion of a manual step should trigger the next step
The volume of data is not that large: total data size is around 400k records, but the process executes for around 30k records at a time. Development time is short, and we are looking for a lightweight solution that is easy to learn and implement.
We are looking for Spring-based solutions, or solutions that can integrate with Spring.
The solutions we considered are
For workflow:
Activiti, Spring Batch
For interfaces:
Spring Integration
My questions are:
Can Spring Batch be considered for managing a workflow kind of use case? I don't think it's the best-fit use case for Spring Batch, but since it is simple and easy to implement I looked into its scope. We considered implementing each interface interaction as a step in a batch job and doing the Spring Integration work for the external interfaces inside the tasklet. The issues, as far as I understand, are:
a) Dynamic step configuration can be done with Java configuration, but how flexible is it, and is it recommended?
b) Manual step processing is not possible in Spring Batch.
Is there any workaround for this? Are there any other issues or performance impacts with this approach?
Activiti seems to be a solution. Can you please provide some feedback on Activiti with Spring and Spring Integration for this use case, how easy it is to implement, and the level of support for Activiti?
Can Activiti workflows be restarted from a particular task? Can a task be rolled back?
Any suggestions are welcome!
1) For managing workflows, Activiti would be a great choice. It is a really good process engine that should meet your needs for delegating your tasks as well as calling your custom logic. Moreover, it integrates closely with the Spring Framework, so integration with your logic would be easy.
2) I've addressed this in the first answer.
3) No, you will have to create a new workflow for that; and yes, a task can be rolled back.
