Spring Cloud DataFlow - getting Execution ID after running task

Currently I'm moving from Spring XD as my workflow and runtime environment to Spring Cloud DataFlow and Apache Airflow. I want to create workflows in Airflow and use a custom Airflow operator to run Spring Cloud Tasks on the Spring Cloud DataFlow server via its REST API.
It's possible using:
curl -X GET http://SERVER:9393/tasks/deployments/...
Unfortunately, DataFlow doesn't return the job execution ID in the response to this request, which would give a simple way to monitor the app. Is there a way to get this ID synchronously? Getting the last execution of a specific job can lead to mistakes, e.g. missing a job execution if I run many identical jobs at the same time.
On Spring Cloud DataFlow I am running Spring Batch jobs, so maybe a better approach is to somehow set the job execution ID myself and pass it as an input parameter?

Try to use the following annotations to collect the task information from your bean:
import org.springframework.cloud.task.listener.annotation.AfterTask;
import org.springframework.cloud.task.listener.annotation.BeforeTask;
import org.springframework.cloud.task.listener.annotation.FailedTask;
import org.springframework.cloud.task.repository.TaskExecution;

public class MyBean {

    @BeforeTask
    public void methodA(TaskExecution taskExecution) {
        // called before the task runs; taskExecution.getExecutionId() is already populated here
    }

    @AfterTask
    public void methodB(TaskExecution taskExecution) {
        // called after the task completes successfully
    }

    @FailedTask
    public void methodC(TaskExecution taskExecution, Throwable throwable) {
        // called when the task throws an exception
    }
}
https://docs.spring.io/spring-cloud-task/docs/current-SNAPSHOT/reference/htmlsingle/#features-task-execution-listener
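In those callbacks, TaskExecution exposes the task execution ID via getExecutionId(). As a rough, hedged sketch (the class name and the idea of logging the ID are illustrations, not part of the original answer, and the class must be registered as a Spring bean like MyBean above), you could surface the ID so an external orchestrator such as Airflow can pick it up:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.task.listener.annotation.BeforeTask;
import org.springframework.cloud.task.repository.TaskExecution;

public class ExecutionIdReporter {

    private static final Logger log = LoggerFactory.getLogger(ExecutionIdReporter.class);

    @BeforeTask
    public void reportExecutionId(TaskExecution taskExecution) {
        // getExecutionId() returns the Spring Cloud Task execution ID assigned to this run;
        // logging it (or pushing it to an external store) is one hedged way to make it visible to Airflow
        log.info("Task execution id = {}", taskExecution.getExecutionId());
    }
}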

Related

Download files to a local folder by using Spring Integration in spring boot application

I am new to the Spring Integration framework. Currently I am working on a project which has a requirement to download files to a local directory.
My goal is to complete the tasks below:
1. Download the files to a local directory using Spring Integration.
2. Trigger a batch job, i.e. read the file and extract a specific column's information.
I am able to connect to the SFTP server, but I am facing difficulty with how to use the Spring Integration Java DSL to download the files and trigger a batch job.
Below is the code to connect to the SFTP session factory:
@Bean
public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory() {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(sftpHost);
    factory.setPort(sftpPort);
    factory.setUser(sftpUser);
    if (sftpPrivateKey != null) {
        factory.setPrivateKey(sftpPrivateKey);
        factory.setPrivateKeyPassphrase(privateKeyPassPhrase);
    } else {
        // pass the configured password property, not the literal string "sftpPassword"
        factory.setPassword(sftpPassword);
    }
    // accept unknown host keys before opening a session
    factory.setAllowUnknownKeys(true);
    logger.info("Connecting to SFTP server " + factory.getSession());
    return new CachingSessionFactory<>(factory);
}
Below is the code to download the files from remote to local:
@Bean
public IntegrationFlowBuilder integrationFlow() {
    return IntegrationFlows.from(Sftp.inboundAdapter(sftpSessionFactory()));
}
I am using the Spring Integration DSL and I am not able to figure out what to code here.
I have tried many possible ways to do this but cannot work out how to proceed with this requirement.
Can anyone help me with how to approach this and, if possible, share some sample code for reference?
The Sftp.inboundAdapter() produces messages with a File as a payload. So, having that IntegrationFlows.from(Sftp.inboundAdapter(sftpSessionFactory())), you can treat the first task as done.
Your problem from there is that you don't build an IntegrationFlow, but instead return the IntegrationFlowBuilder and register it as a @Bean. That's why it doesn't work for you.
You need to continue the flow definition and call its get() at the end to return an IntegrationFlow instance, which is what has to be registered as a bean. If this code flow is a bit confusing, consider implementing an IntegrationFlowAdapter as a @Component instead.
To trigger a batch job you need to consider using a FileMessageToJobRequest in a .transform() EIP-method and then a JobLaunchingGateway in a .handle() EIP-method.
See more info in docs:
https://docs.spring.io/spring-integration/reference/html/dsl.html#java-dsl
https://docs.spring.io/spring-integration/reference/html/sftp.html#sftp-inbound
https://docs.spring.io/spring-batch/docs/4.3.x/reference/html/spring-batch-integration.html#spring-batch-integration-configuration
BTW, the last one has a flow sample exactly for your use-case.
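Roughly adapting that sample to the SFTP inbound adapter, inside your existing @Configuration class a sketch could look like the following. FileMessageToJobRequest is the small helper class shown in the Spring Batch Integration docs (you write it yourself); the directories, poller interval and bean names here are assumptions for illustration:

import java.io.File;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.integration.launch.JobLaunchingGateway;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.dsl.Pollers;
import org.springframework.integration.sftp.dsl.Sftp;

@Bean
public JobLaunchingGateway jobLaunchingGateway(JobLauncher jobLauncher) {
    return new JobLaunchingGateway(jobLauncher);
}

@Bean
public IntegrationFlow sftpToBatchFlow(FileMessageToJobRequest fileMessageToJobRequest,
        JobLaunchingGateway jobLaunchingGateway) {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory())
                            .remoteDirectory("/remote/dir")            // assumed remote directory
                            .localDirectory(new File("/local/dir"))    // assumed local directory
                            .autoCreateLocalDirectory(true),
                    e -> e.poller(Pollers.fixedDelay(5000)))
            .transform(fileMessageToJobRequest)   // File payload -> JobLaunchRequest
            .handle(jobLaunchingGateway)          // launches the Spring Batch job
            .get();                               // get() builds the IntegrationFlow that is registered as the bean
}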

Running scheduler in Spring boot is spawning a process external to Spring boot application context

I am scheduling a task that runs at a fixed rate in Spring Boot. The method that I am using to schedule the task is below:
private void scheduleTask(Store store, int frequency) {
    final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    Runnable task = store::scan;
    scheduler.scheduleAtFixedRate(task, 0, frequency, TimeUnit.MILLISECONDS);
}
This works fine, but if there is an exception at application startup, the application should exit. What happens instead is that I get the exception in the log and the message "Application failed to start", yet the scheduler keeps running; it looks like only the scheduled thread is still alive.
Any hints on how to properly schedule an asynchronous task in a Spring Boot application? I tried the @Scheduled annotation but it does not run at all.
The @Scheduled annotation should work. Have you added the @EnableScheduling annotation to a @Configuration class or the @SpringBootApplication class? The Scheduling Getting Started guide explains it in detail.
Regarding the scheduleTask method: what calls it? Is it started outside the Spring context? If so, Spring won't stop it; you have to take care of its lifecycle yourself.
You should try to use @Scheduled, as it will manage the thread pools/executors for you, and most people will find it easier to understand.
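A minimal sketch of that annotation-based approach, assuming the Store bean from your question (the class names and the fixed rate value are illustrative only):

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Configuration
@EnableScheduling   // enables processing of @Scheduled annotations; put it on any @Configuration class
class SchedulingConfig {
}

@Component
class StoreScanTask {

    private final Store store;

    StoreScanTask(Store store) {
        this.store = store;
    }

    // Spring owns the executor, so the task stops when the application context shuts down
    @Scheduled(fixedRate = 60000)   // assumed frequency; could also be fixedRateString = "${scan.frequency}"
    public void scan() {
        store.scan();
    }
}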

Spring with Apache Beam

I want to use Spring with Apache Beam running on the Google Cloud Dataflow runner. The Dataflow job should be able to use the Spring runtime application context while executing the pipeline steps. I want to use Spring features in my Apache Beam pipeline for DI and other things. After hours of searching, I couldn't find any post or documentation which shows Spring integration with Apache Beam. So, if anyone has tried Spring with Apache Beam, please let me know.
In the main class I have initialised the Spring application context, but it is not available during execution of the pipeline steps. I get a NullPointerException for autowired beans. I guess the problem is that at runtime the context is not available to the worker threads.
public static void main(String[] args) {
    initSpringApplicationContext();
    GcmOptions options = PipelineOptionsFactory.fromArgs(args)
            .withValidation()
            .as(GcmOptions.class);
    Pipeline pipeline = Pipeline.create(options);
    // pipeline definition
}
I want to inject the Spring application context into each of the ParDo functions.
The problem here is that the ApplicationContext is not available on any worker, because the main method is only called when constructing the job, not on the worker machines. Therefore, initSpringApplicationContext is never called on any worker.
I've never tried to use Spring within Apache Beam, but I guess moving initSpringApplicationContext into a static initializer block will lead to your expected result.
import org.springframework.context.ApplicationContext;

public class ApplicationContextHolder {

    private static final ApplicationContext CTX;

    static {
        // runs once per JVM, i.e. also on each worker when the class is first loaded
        CTX = initApplicationContext();
    }

    public static ApplicationContext getContext() {
        return CTX;
    }
}
Please be aware that this alone shouldn't be considered a best practice for using Spring within Apache Beam, since it doesn't integrate well with Apache Beam's lifecycle. For example, if an error happens during the initialization of the application context, it will surface at the first place where the ApplicationContextHolder happens to be used. Therefore, I'd recommend extracting initApplicationContext out of the static initializer block and calling it explicitly with regard to Apache Beam's lifecycle. The @Setup phase of a DoFn would be a good place for this.
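As a hedged sketch of that last suggestion (MyDoFn, MySpringConfig and MyService are placeholder names, not anything from the original post), building the context in a DoFn's @Setup method might look like this:

import org.apache.beam.sdk.transforms.DoFn;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;

public class MyDoFn extends DoFn<String, String> {

    // transient: the context is rebuilt on each worker instead of being serialized with the DoFn
    private transient ApplicationContext ctx;
    private transient MyService service;

    @Setup
    public void setup() {
        // runs on the worker, once per DoFn instance, before any bundle is processed
        ctx = new AnnotationConfigApplicationContext(MySpringConfig.class);
        service = ctx.getBean(MyService.class);
    }

    @ProcessElement
    public void processElement(@Element String element, OutputReceiver<String> out) {
        out.output(service.process(element));
    }
}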

Spring Cloud Task - launch task from maven repository in docker container

I am learning Spring Cloud Task and I am writing a simple application that is divided into 3 services. The first is a TaskApplication that has only a main() method and implements CommandLineRunner; the second is a TaskIntakeApplication that receives requests and sends them to RabbitMQ; the third is a TaskLauncherApplication that receives messages from RabbitMQ and runs the task with the received parameters.
@Component
@EnableBinding(Source.class)
public class TaskProcessor {

    @Autowired
    private Source source;

    public void publishRequest(String arguments) {
        // Maven coordinates of the task artifact to launch
        final String url = "maven://groupId:artifactId:jar:version";
        final List<String> args = Arrays.asList(arguments.split(","));
        final TaskLaunchRequest request = new TaskLaunchRequest(url, args, null, null, "TaskApplication");
        final GenericMessage<TaskLaunchRequest> message = new GenericMessage<>(request);
        source.output().send(message);
    }
}
As you can see, I point at my built artifact with a Maven URL, but I wonder how I can launch the artifact from another Docker container instead?
If you intend to launch a task application from an upstream event (e.g., a new file event, a new DB record event, a new message in RabbitMQ, etc.), you'd simply use the respective out-of-the-box applications and then launch the task via the Task Launcher.
Follow this example on how the 3 steps are orchestrated via SCDF's DSL.
Perhaps you could consider reusing the existing apps instead of reinventing them, unless you have a completely different requirement that these apps cannot meet. I'd suggest getting the example mentioned above working locally before you consider extending the behavior.

How to run Spring batch app using CommandLineJobRunner (spring + hibernate and/or war deployment)

I need to create batch jobs using Spring Batch.
The job will access an Oracle DB, fetch records, process them in a tasklet and commit the results.
I am planning to use Hibernate with Spring to deal with the data.
Jobs will be executed via AutoSys. I am using CommandLineJobRunner as the entry point.
(Extra info - I am using a DynamicWebProject converted to Gradle, STS, Spring 4.0, Hibernate 5.0, NO Spring Boot)
I have a few queries/doubts about this application; they are mostly about the environment/deployment.
Do I need to deploy the whole app as a war in Tomcat (or any other server) to instantiate all the beans (Spring and Hibernate)?
If yes, how can I start jobs using CommandLineJobRunner?
If no, I will have to manually instantiate the beans in the main method using ClassPathXmlApplicationContext. In that case, how should I execute jobs? Do I need to create a jar (is this mandatory)?
How can I test these jobs on the command line? Do I need to pass jars (Spring, Hibernate and other dependencies) when using CommandLineJobRunner to execute jobs?
I am new to batch jobs and all your comments would be of great help.
Thanks
No server is needed for Spring Batch applications.
You can launch a job using the jobLauncher bean. Below is sample code:
public class MyJobLauncher {

    public static void main(String[] args) throws Exception {
        GenericApplicationContext context = new AnnotationConfigApplicationContext(MyBatchConfiguration.class);
        JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
        Job job = (Job) context.getBean("myJobName"); // this is the bean name of your job
        // job parameters must be supplied; an empty set is enough for a first run
        JobParameters jobParameters = new JobParametersBuilder().toJobParameters();
        JobExecution execution = jobLauncher.run(job, jobParameters);
    }
}
You will need to create a jar. All the other jars that are needed (Spring, Hibernate, etc.) are also required on the classpath; you can use the Maven Assembly Plugin for this.
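For reference, if you prefer to stick with CommandLineJobRunner rather than your own main class, the command-line invocation looks roughly like this (the jar name, lib directory, configuration class and job name are placeholders; an XML context file path can be passed instead of a configuration class if you use XML configuration):

java -cp "myapp.jar:lib/*" org.springframework.batch.core.launch.support.CommandLineJobRunner com.example.MyBatchConfiguration myJobName someParam=someValue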
