Performing two read operations in Spring Batch - spring

I need to read from a DB and, based on that result, fetch data from another DB on a different server, then write the combined result to a file. The solution that came to mind is to use a Spring Batch reader to read from the first DB, and to read from the second DB in the processor.
But my concern with reading in the processor is that it handles a single item at a time, so we would make one remote call per item. (Please correct me if I am wrong.)
Is there another way to do this so that we can perform the task efficiently?
Thanks in advance.
Please suggest what the options could be.
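One common way to reduce the per-item cost is to do the second-DB lookup for a whole chunk at once (e.g. with an `IN` clause) instead of one query per item in the processor. The sketch below is plain Java with illustrative names, not Spring Batch API; it only demonstrates the chunked-enrichment idea under the assumption that the second DB supports a bulk fetch by key:

```java
import java.util.*;
import java.util.stream.*;

// Simplified stand-in for a chunk-oriented step, showing the "driving query"
// pattern: read keys from DB1, then enrich a whole chunk against DB2 in one
// round trip, so the remote lookup is one call per chunk rather than one call
// per item. All class and method names here are illustrative, not Spring Batch API.
public class DrivingQuerySketch {

    // Pretend DB1: the driving table of IDs.
    static final List<Integer> DB1_IDS = List.of(1, 2, 3, 4, 5);

    // Pretend DB2 on the other server: one bulk fetch (e.g. WHERE id IN (...)).
    static Map<Integer, String> db2BulkFetch(List<Integer> ids) {
        return ids.stream().collect(Collectors.toMap(id -> id, id -> "detail-" + id));
    }

    static List<String> runJob(int chunkSize) {
        List<String> fileLines = new ArrayList<>();
        for (int start = 0; start < DB1_IDS.size(); start += chunkSize) {
            List<Integer> chunk = DB1_IDS.subList(start, Math.min(start + chunkSize, DB1_IDS.size()));
            Map<Integer, String> details = db2BulkFetch(chunk); // enrich per chunk, not per item
            for (Integer id : chunk) {
                fileLines.add(id + "," + details.get(id));      // "write" one file line
            }
        }
        return fileLines;
    }
}
```

In a real job the per-chunk enrichment would typically live in a custom ItemWriter (which receives the whole chunk) wrapping the file writer, since an ItemProcessor only ever sees one item.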

Related

Database locking in Spring Batch

I'm working on a Spring Batch application that was running into a DB2 deadlock while using a default JdbcCursorItemReader. When the batch job ran into an error, we had set up a SkipListener to write an "Error" status to the relevant row, which is when the deadlock occurred.
We found that by using a default JdbcPagingItemReader, we were able to avoid the deadlock scenario, though we aren't exactly sure why this is the case.
My understanding of Spring Batch is that either reader should have released the database lock once the ResultSet was read in from the query, but this did not appear to happen with the JdbcCursorItemReader.
Would anyone be able to help me understand why this is the case?
Thanks!
The JdbcCursorItemReader maintains a position (cursor) within the database so it knows where to read from next, and that cursor is held open, along with its associated lock, for the life of the step. The JdbcPagingItemReader, by contrast, submits a separate query for each page, requesting data between a known start and end point, so it only reads the data between those two points and does not hold locks between calls.
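The difference can be illustrated with a toy paging loop (plain Java, not the Spring Batch API): each page is a fresh, self-contained query with an explicit start position and page size, so nothing stays open between pages the way a cursor does.

```java
import java.util.*;

// Toy illustration of why a paging reader holds no lock between chunks:
// each page is an independent, bounded query, analogous to
// "SELECT ... WHERE sort_key > :last ORDER BY sort_key LIMIT :pageSize",
// and the connection/cursor is released after every page.
public class PagingSketch {
    static final int[] TABLE = {10, 20, 30, 40, 50, 60, 70};

    static List<Integer> readPage(int startIndex, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int i = startIndex; i < Math.min(startIndex + pageSize, TABLE.length); i++) {
            page.add(TABLE[i]);
        }
        return page; // query finished; no cursor left open, no lock held
    }

    static List<Integer> readAll(int pageSize) {
        List<Integer> all = new ArrayList<>();
        List<Integer> page;
        int index = 0;
        while (!(page = readPage(index, pageSize)).isEmpty()) {
            all.addAll(page);
            index += page.size();
        }
        return all;
    }
}
```

A cursor-based reader, by contrast, keeps one long-running query open across all chunks, which is why a write to the same row from a SkipListener can deadlock against it.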

Avoid processing same file twice in Spring Batch

I have to write a Spring Batch job as follows:
Step 1: Load an XML file from the file system and write its contents to a database staging table
Step 2: Call Oracle PL/SQL procedure to process the staging table.
(Comments on that job structure are welcome, but not the question).
In Step 1, I want to move the XML file to another directory after I have loaded it. I want this, as much as possible, to be "transactional" with the write to the staging table. That is, either both the writes to staging and the file move succeed, or neither does.
I feel this necessary because if (A) the staging writes happen but the file does not move, the next run will pick up the file again and process it again and (B) if the file gets moved but the staging writes do not happen, then we will have missed that file's processing.
This interface's requirements are all about robustness. I know I could just put a step execution listener to move all the files at the end, but I want the approach that is going to guarantee that we never miss processing data and never process the same file twice.
Part of the difficulty is that I am using a MultiResourceItemReader. I read that ChunkListener.beforeChunk() happens as part of the chunk transaction, so I tried to make a custom chunk CompletionPolicy to force chunks to complete after each change of resource (file) name, but I could not get it to work. In any case, I would have needed an afterChunk() listener, which is not part of the transaction anyway.
I'll take any guidance on my specific questions or an expert explanation of how to robustly process files in Spring Batch (which I am only just learning). Thanks!
I have a pretty similar Spring Batch process right now, and Spring Batch fits your requirement well.
I would recommend using Spring Integration here. In Spring Integration you can configure a poller to monitor your folder and trigger the batch job when a file arrives. There is a good example in the official documentation.
Then you should use a powerful Spring Batch concept: identifying job parameters. A Spring Batch job instance is keyed by its unique parameters, and if you mark a parameter as identifying, no second job instance can be spawned with the same value (though you can restart the original job).
/**
 * Add a new String parameter for the given key.
 *
 * @param key - parameter accessor.
 * @param parameter - runtime parameter
 * @param identifying - indicates if the parameter is used as part of identifying a job instance
 * @return a reference to this object.
 */
public JobParametersBuilder addString(String key, String parameter, boolean identifying) {
    parameterMap.put(key, new JobParameter(parameter, identifying));
    return this;
}
So here you need to ask yourself: what uniquely identifies a run of your batch job? I would suggest the full file path, but then you need to be sure that nobody provides different files with the same filename.
Spring Integration can also detect that a file has already been seen by the application and ignore it; please check the documentation on AcceptOnceFileListFilter.
If you want guaranteed 'transactional-like' logic in the batch, don't put it into listeners; create a dedicated step that moves the file. Listeners are good for supplemental logic.
That way, if the move step fails for any reason, you will still be able to fix the issue and retry the job.
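The effect of identifying parameters can be modeled in a few lines (this is a simplified illustration, not the actual Spring Batch implementation): a job instance's identity is derived only from the parameters flagged as identifying, so relaunching with the same file path but a different timestamp maps to the same instance.

```java
import java.util.*;

// Simplified model of identifying job parameters: only parameters marked
// identifying contribute to the job instance key, so a non-identifying
// parameter (e.g. a run timestamp) does not create a new instance.
// Names here are illustrative, not Spring Batch API.
public class IdentifyingParamsSketch {
    record Param(String value, boolean identifying) {}

    static String instanceKey(Map<String, Param> params) {
        return new TreeMap<>(params).entrySet().stream()
                .filter(e -> e.getValue().identifying())
                .map(e -> e.getKey() + "=" + e.getValue().value())
                .reduce("", (a, b) -> a + ";" + b);
    }
}
```

With the full file path as the identifying parameter, a second launch for the same path resolves to the same instance key and would be rejected, while a new file path yields a new instance.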
This kind of process can easily be done with a job consisting of two steps and one listener:
1. A standard (read from XML -> process -> write to DB) step; you don't need to worry about restartability because Spring Batch is smart enough to avoid re-reading data
2. A listener attached to step 1 that moves the file after successful step execution (example 1, example 2 or example 3)
3. A second step that processes the staged data
The processing in #3 may also be folded into step 1 as its process phase.

Aggregating processor or aggregating reader

I have a requirement like this: I read items from a DB, ideally in a paging way where the page size matches the later "batch size". I do some processing steps, like filtering, and then I want to accumulate the items so I can send them to a REST service in batches of n at a time instead of one by one.
I am already parallelising at the step level, but I am not sure how to get the batching to work. Do I need to implement a reader that returns a list and a processor that receives a list? If so, I have read that the count of items processed will no longer be accurate.
I am trying to find the most appropriate Spring Batch way to do this rather than hacking a fix. I also assume that I need to keep state in the reader, and wondered whether there is a better way to avoid that.
You cannot have something like an aggregating processor; every single item that is read is processed as a single item.
However, you can implement a reader that groups items and forwards each whole group as one item. To get an idea of how this can be done, have a look at my answer to this question: Spring Batch Processor, or Dean Clark's answer here: Spring Batch - How to process multiple records at the same time in the processor?
Both use Spring Batch's SingleItemPeekableItemReader.
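The core idea can be sketched in plain Java (illustrative names, not the actual Spring Batch API): keep a one-item lookahead buffer, exactly what SingleItemPeekableItemReader provides, and close the current group as soon as the peeked item belongs to a new group.

```java
import java.util.*;

// Grouping reader built on one-item lookahead, the same idea as wrapping a
// delegate reader in SingleItemPeekableItemReader: peek at the next item, and
// return the accumulated group once the peeked item's key changes. The writer
// then receives whole groups instead of single records.
public class GroupingReaderSketch {
    private final Iterator<String> delegate;
    private String peeked; // lookahead buffer, like SingleItemPeekableItemReader.peek()

    GroupingReaderSketch(List<String> items) { this.delegate = items.iterator(); }

    private String peek() {
        if (peeked == null && delegate.hasNext()) peeked = delegate.next();
        return peeked;
    }

    private String take() { String p = peek(); peeked = null; return p; }

    // Group key: everything before the ':' (e.g. an order id on an order line).
    private static String keyOf(String item) { return item.split(":")[0]; }

    // Returns one whole group per call, or null when the input is exhausted.
    public List<String> read() {
        String first = take();
        if (first == null) return null;
        List<String> group = new ArrayList<>(List.of(first));
        while (peek() != null && keyOf(peek()).equals(keyOf(first))) {
            group.add(take());
        }
        return group;
    }
}
```

To batch by a fixed size n rather than by key, the loop condition becomes `group.size() < n` instead of the key comparison.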

How to post to multiple queues using a single job / JmsItemWriter in Spring Batch

I am a newbie to Spring Batch and have recently started using it.
I have a requirement where I need to post/write the messages read from each DB record to different queues using a single job: a reader reads the messages from the DB, and a processor decides which queue each message should be posted to.
So my question is: can I use a single JmsItemWriter to post the messages to different queues, given that I have to use a single job and DB reader?
Thanks in advance.
As far as I know, JmsItemWriter does not support this (it writes to the default destination of its jmsTemplate).
But you can implement your own ItemWriter: inject all the jmsTemplates into it and write custom decision logic to select the appropriate destination and write to it.
Another way is to use ClassifierCompositeItemWriter: put a set of JmsItemWriters into it and select one with your classifier.
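The classifier approach can be sketched in plain Java (illustrative names and a map standing in for real JMS queues, not Spring Batch or JMS API): a classifier picks the destination per item, and the writer routes each message of the chunk accordingly.

```java
import java.util.*;

// Plain-Java sketch of the ClassifierCompositeItemWriter idea: a classifier
// maps each item to a destination, and the writer delivers every message in
// the chunk to its destination. The map below stands in for real JMS queues.
public class ClassifierWriterSketch {
    // Pretend JMS queues: queue name -> messages posted so far.
    static final Map<String, List<String>> QUEUES = new LinkedHashMap<>();

    // The "classifier": decide the destination from the item itself.
    static String classify(String message) {
        return message.startsWith("ERR") ? "errorQueue" : "mainQueue";
    }

    // The "writer": route each message of the chunk to its classified queue.
    static void write(List<String> chunk) {
        for (String message : chunk) {
            QUEUES.computeIfAbsent(classify(message), k -> new ArrayList<>()).add(message);
        }
    }
}
```

In the real ClassifierCompositeItemWriter, `classify` would return a pre-configured JmsItemWriter (one per queue) instead of a queue name, but the routing logic is the same.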

Chunk processing in Spring Batch effectively limits you to one step?

I am writing a Spring Batch application to do the following: There is an input table (PostgreSQL DB) to which someone continually adds rows - that is basically work items being added. For each of these rows, I need to fetch more data from another DB, do some processing, and then do an output transaction which can be multiple SQL queries touching multiple tables (this needs to be one transaction for consistency reasons).
Now, the part between the input and output should be modular - it already has 3-4 logically separated things, and in the future there will be more. This flow need not be linear - what processing is done next can depend on the result of the previous step. In short, this is basically like the flow you can set up using steps inside a job.
My main problem is this: Normally a single chunk processing step has both ItemReader and ItemWriter, i.e., input to output in a single step. So, should I include all the processing steps as part of a single ItemProcessor? How would I make a single ItemProcessor a stateful workflow in itself?
The other option is to make each step a Tasklet implementation, and write two tasklets myself to behave as ItemReader and ItemWriter.
Any suggestions?
Found an answer - yes, you are effectively limited to a single step. But:
1) For linear workflows, you can "chain" ItemProcessors: create a composite ItemProcessor and provide it the ItemProcessors that do the actual work through applicationContext.xml. The composite ItemProcessor just runs them one by one. This is what I'm doing right now.
2) You can always create the internal subflow as a separate Spring Batch workflow and call it through code in an ItemProcessor, similar to the composite ItemProcessor above. I might move to this in the future.
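The chaining in option 1 amounts to function composition, which can be shown in a few self-contained lines (plain Java; in a real job each stage would implement org.springframework.batch.item.ItemProcessor and be wired into a CompositeItemProcessor):

```java
import java.util.*;
import java.util.function.UnaryOperator;

// Minimal sketch of a composite ItemProcessor for a linear flow: each stage is
// a function item -> item (null meaning "filter this item out"), and the
// composite runs the stages in order, stopping early if any stage filters.
public class CompositeProcessorSketch {
    static <T> UnaryOperator<T> compose(List<UnaryOperator<T>> stages) {
        return item -> {
            T current = item;
            for (UnaryOperator<T> stage : stages) {
                if (current == null) return null; // a previous stage filtered the item
                current = stage.apply(current);
            }
            return current;
        };
    }
}
```

This mirrors Spring Batch's own CompositeItemProcessor contract: returning null from any delegate drops the item from the chunk, which is how filtering stages fit into the chain.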
