In my Spring Boot project, I need to upload data as a .csv file.
Later, when the job executes, it will update some of the tables, and if there is any data mismatch the job will fail.
So I want a "dry run": without making any data changes, check whether the uploaded data would cause any issues.
To avoid that job failure, how can I verify right after uploading the file that the data is OK, so the job won't fail later?
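One possible approach (a sketch only, not from the original thread): run a validation-only pass over the uploaded file before the real job is launched. The column count and the numeric-field rule below are hypothetical placeholders for whatever your job actually expects.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-check that parses the uploaded CSV and collects problems
// without touching any table.
public class CsvDryRunValidator {

    private static final int EXPECTED_COLUMNS = 3; // assumption: adjust to your file layout

    public List<String> validate(Path csvFile) throws IOException {
        List<String> errors = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(csvFile)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                String[] fields = line.split(",", -1);
                if (fields.length != EXPECTED_COLUMNS) {
                    errors.add("Line " + lineNumber + ": expected " + EXPECTED_COLUMNS
                            + " columns but found " + fields.length);
                    continue;
                }
                try {
                    // Assumption: the second column must be numeric
                    Long.parseLong(fields[1].trim());
                } catch (NumberFormatException e) {
                    errors.add("Line " + lineNumber + ": '" + fields[1] + "' is not a number");
                }
            }
        }
        return errors; // an empty list means the file passed the dry run
    }
}
```

Only if the returned list is empty would the file be handed to the real job. The same checks could instead live in the processor of a separate validation-only Spring Batch job whose writer does nothing, so the dry run reuses the real parsing logic.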
Related
I have a Spring Boot application (which reads data from a txt file --> inserts into a database --> executes a few SQL queries --> writes to a csv file). It runs on a cron schedule and is deployed in namespace prod-1. I want to deploy the same application in one more namespace, prod-2, for backup (disaster recovery). I want to implement a failover mechanism such that whenever the job in prod-1 fails, the job in prod-2 starts automatically and picks up from where the failed run left off. Any ideas regarding the approach would be appreciated.
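A hedged idea, assuming both namespaces can point at the same Spring Batch job repository (an assumption, not something stated in the question): before starting, the prod-2 job could check whether the latest prod-1 run failed and only take over in that case. The JobExplorer calls below are standard Spring Batch API; the take-over rule itself is just one possible policy.

```java
import java.util.List;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;

// Sketch: prod-2 consults the shared job repository and only takes over
// when the latest job instance has a failed execution and no completed one.
public class FailoverGuard {

    private final JobExplorer jobExplorer;

    public FailoverGuard(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    public boolean shouldTakeOver(String jobName) {
        List<JobInstance> latest = jobExplorer.getJobInstances(jobName, 0, 1);
        if (latest.isEmpty()) {
            return false; // prod-1 has not started anything yet
        }
        List<JobExecution> executions = jobExplorer.getJobExecutions(latest.get(0));
        boolean failed = executions.stream()
                .anyMatch(e -> e.getStatus() == BatchStatus.FAILED);
        boolean completed = executions.stream()
                .anyMatch(e -> e.getStatus() == BatchStatus.COMPLETED);
        return failed && !completed;
    }
}
```

If both deployments share the job repository and the prod-2 launch reuses the same identifying job parameters, Spring Batch's normal restart behaviour applies: the relaunched execution resumes at the failed step rather than from scratch, which is what the "pick up from where it failed" requirement asks for.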
I have a working Spring Boot application which embeds a Spring Batch job. The job is not run on a schedule; instead we kick it with an endpoint. It is working as it should. The basics of the batch are:
Kick the endpoint to start the job
Reader reads from input file
Processor reads from an Oracle database using a JPA repository and a simple Spring datasource config
Writer writes to output file
However there are new requirements:
The schema of the repository database is, from here on, unknown at application startup. The tables are the same; it is just an unknown schema name. This fact is out of our control and you might think it is stupid, but there are reasons for it and it can't be changed. With the current functionality this means we need to reconfigure the datasource once we know the new schema name and restart the application. This is a job that we will run a number of times when migrating from one system to another, so it has a limited lifecycle and we just need a "quick fix" to be able to use it without rewriting the whole app. So what I would like to do is:
Send the schema name as a query param to the application, put it in the job parameters, and then get a new datasource when the processor reads from the repository. Would this be doable at all using Spring Batch? Any help appreciated!
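This should be doable with a step-scoped bean, since step-scoped beans are only created when the step runs, at which point the job parameters (including the schema name) are available. A minimal sketch, assuming the processor can be switched from the JPA repository to a JdbcTemplate with schema-qualified SQL; the table, column, and parameter names are placeholders.

```java
import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class DynamicSchemaProcessorConfig {

    // Step-scoped: the bean is created when the step starts, so the schema
    // name passed in the job parameters is already known at that point.
    @Bean
    @StepScope
    public ItemProcessor<Long, String> lookupProcessor(
            DataSource dataSource,
            @Value("#{jobParameters['schema']}") String schema) {

        JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
        // Placeholder SQL: the only point is that the table is qualified with
        // the schema from the job parameters. Validate the schema value before
        // concatenating it, since it ends up in the SQL text.
        return id -> jdbcTemplate.queryForObject(
                "SELECT some_column FROM " + schema + ".some_table WHERE id = ?",
                String.class, id);
    }
}
```

The endpoint that kicks the job would copy the query parameter into the job parameters, for example via new JobParametersBuilder().addString("schema", schemaParam).toJobParameters(). If you need to stay on the JPA repository instead, routing between datasources per schema (e.g. with an AbstractRoutingDataSource) is another option, but the JdbcTemplate route is usually the smaller change for a short-lived migration tool.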
I have a Spring Batch Maven project deployed on a Unix server.
I know that there are a couple of questions on this topic already and I have tried all those solutions. I have tried adding the date, and even the time in milliseconds, as a job parameter to keep it unique. I am testing something and I have to keep triggering the job manually many times a day. I have created a folder on the Unix server, compiled my Spring Batch Maven project into a jar file, and moved it to the server. But whenever I run the Spring Batch job there, giving a time in application.properties, it gives me a JobInstanceAlreadyCompleteException.
One more surprising thing is that this issue does not happen locally, only on the server.
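For reference, a minimal sketch (class and parameter names are placeholders) of launching the job programmatically with an identifying parameter that is guaranteed to differ on every run, which is what normally avoids JobInstanceAlreadyCompleteException.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

// Sketch: give every manual launch a fresh identifying parameter so each run
// creates a new JobInstance instead of reusing an already completed one.
public class ManualJobRunner {

    private final JobLauncher jobLauncher;
    private final Job job;

    public ManualJobRunner(JobLauncher jobLauncher, Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    public void runOnce() throws Exception {
        JobParameters parameters = new JobParametersBuilder()
                .addLong("run.timestamp", System.currentTimeMillis()) // differs on every call
                .toJobParameters();
        jobLauncher.run(job, parameters);
    }
}
```

One thing worth checking: if the "unique" time is read from application.properties, it stays fixed until the file is edited, so the same value is reused for every launch on the server. That could explain why the exception shows up there but not locally, where the value or the whole metadata database changes between runs.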
I have a Spring Batch job configured in my Java application, and the application runs in a cluster. Hence the same job gets executed twice, which I don't want.
So I want to configure a step within the job which checks whether a CREATE_DATE entry is present for that day in the BATCH_JOB_EXECUTION table and then either continues or fails over.
How can this be configured within a Spring Batch step?
Use a JobExecutionDecider.
From the Javadoc:
Interface allowing for programmatic access to the decision on what the status of a flow should be. For example, if some condition that's stored in the database indicates that the job should stop for a manual check, a decider implementation could check that value to determine the status of the flow.
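A minimal sketch of such a decider, assuming the standard Spring Batch metadata schema (where the column is named CREATE_TIME) and a JdbcTemplate pointed at the same database; the status name ALREADY_RAN is just an example.

```java
import java.sql.Date;
import java.time.LocalDate;
import javax.sql.DataSource;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;
import org.springframework.jdbc.core.JdbcTemplate;

// Sketch: continue only if no other execution of this job was created today
// in the Spring Batch metadata tables.
public class AlreadyRanTodayDecider implements JobExecutionDecider {

    private final JdbcTemplate jdbcTemplate;

    public AlreadyRanTodayDecider(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        String jobName = jobExecution.getJobInstance().getJobName();
        // Exclude the current execution, which already has a row with today's CREATE_TIME.
        Integer count = jdbcTemplate.queryForObject(
                "SELECT COUNT(*) FROM BATCH_JOB_EXECUTION e "
                        + "JOIN BATCH_JOB_INSTANCE i ON e.JOB_INSTANCE_ID = i.JOB_INSTANCE_ID "
                        + "WHERE i.JOB_NAME = ? AND e.CREATE_TIME >= ? AND e.JOB_EXECUTION_ID <> ?",
                Integer.class, jobName, Date.valueOf(LocalDate.now()), jobExecution.getId());
        return (count != null && count > 0)
                ? new FlowExecutionStatus("ALREADY_RAN")
                : FlowExecutionStatus.COMPLETED;
    }
}
```

In the job flow you would map the custom status to whatever ending you want, for example routing ALREADY_RAN to a fail or end transition and COMPLETED to the real step. Note that two truly simultaneous starts could still each see the other's row, so this acts as a guard rather than a strict lock.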
I am able to import data from MS SQL to HDFS using JDBCHDFS Spring Batch jobs. But if that container fails, the job does not shift to another container. How do I proceed to make the job fault tolerant?
I am using the Spring XD 1.0.1 release.
You don't mention which version of Spring XD you're currently using so I can't verify the exact behavior. However, on a container failure with a batch job running in the current version, the job should be re-deployed to a new eligible container. That being said, it will not restart the job automatically. We are currently looking at options for how to allow a user to specify if they want it restarted (there are scenarios that fall into both camps so we need to allow a user to configure that).