Spring Batch exception handling

I'm currently working with Spring Batch for the first time. I've set the commit interval to 1000, which gave me better performance, but now I have trouble identifying the corrupt or exception-causing item. We need to send a mail update with the record line (item number) and the exception details.
I tried the item, chunk, step, and job listeners, but I can't figure out how to pull that information from the execution context when generating the mail in the job listener. I can get the exception itself, but I can't track which record caused the issue or its position within the chunk.
For example, say I have 1000 lines in a file or database and a commit interval of 100, and item 165 fails. I need to get the line number 165 in some listener so I can attach it to the context and populate the logging info, giving a quick turnaround to fix the issue before reprocessing.
I searched but couldn't find a suggestion or idea. I believe this must be a common problem whenever the commit interval is greater than 1. Please suggest a better way to handle it.
Thanks in advance.

You'll want to perform the checks that can fail in the processor and turn any failure into an error item that gets persisted to its own table or file. Some errors are unavoidable, though, and for those you'll unfortunately need to do manual debugging within the chunk.
Edit:
To find the commit range, you need to preserve order. If you're using a FlatFileItemReader, it will store the line number for you if your POJO implements ItemCountAware. If running against a database, make sure the query preserves order with an ORDER BY on a unique index. Then you can track the chunk down by checking the READ_COUNT column in the BATCH_STEP_EXECUTION table.
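For illustration, here's a minimal sketch of the ItemCountAware approach, assuming a hypothetical Transaction POJO mapped by a FlatFileItemReader; the reader calls setItemCount with the item's position in the input, which you can later surface in a listener or error report.

import org.springframework.batch.item.ItemCountAware;

public class Transaction implements ItemCountAware {

    private int itemCount; // 1-based position in the input, set by the reader
    private String payload; // placeholder field for the mapped data

    @Override
    public void setItemCount(int count) {
        this.itemCount = count;
    }

    public int getItemCount() {
        return itemCount;
    }

    public String getPayload() {
        return payload;
    }

    public void setPayload(String payload) {
        this.payload = payload;
    }
}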

You can enable skipping. After a chunk fails due to a skippable exception, Spring Batch reprocesses each item of the chunk in its own transaction; that is how it detects which item caused the exception.
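As a sketch of what that looks like in Spring Batch 4-style Java config (the step name, Transaction type, and limits are illustrative, not from the original post), a SkipListener registered on a fault-tolerant step receives exactly the failed item after the item-by-item re-scan:

import org.springframework.batch.core.SkipListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;

@Bean
public Step loadStep(StepBuilderFactory steps,
                     ItemReader<Transaction> reader,
                     ItemWriter<Transaction> writer) {
    return steps.get("loadStep")
            .<Transaction, Transaction>chunk(1000)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .skip(Exception.class)
            .skipLimit(100)
            .listener(new SkipListener<Transaction, Transaction>() {
                @Override
                public void onSkipInRead(Throwable t) {
                    // a FlatFileParseException here already carries the line number
                }

                @Override
                public void onSkipInProcess(Transaction item, Throwable t) {
                }

                @Override
                public void onSkipInWrite(Transaction item, Throwable t) {
                    // exactly the item that failed during the re-scan; stash it
                    // (e.g. in the ExecutionContext) for the mail in the job listener
                }
            })
            .build();
}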

Related

Spring batch skipping ValidationException

I have a job that processes items in chunks (of 1000). The items are marshalled into a single JSON payload and posted to a remote service as a batch (all 1000 in one HTTP POST). Sometimes the remote service bogs down and the connection times out. I set up skip for this:
return steps.get("sendData")
.<DataRecord, DataRecord> chunk(1000)
.reader(reader())
.processor(processor())
.writer(writer())
.faultTolerant()
.skipLimit(10)
.skip(IOException.class)
.build();
If a chunk fails, Batch retries the chunk, but one item at a time (in order to find out which item caused the failure). In my case, though, no single item causes the failure; the entire chunk succeeds or fails as a unit and should be retried as a unit. In fact, dropping to single-item mode causes the remote service to get very angry, and it refuses to accept the data. We do not control the remote service.
What's my best way out of this? I was trying to see if I could disable single-item retry mode, but I don't even fully understand where that happens. Is there a custom SkipPolicy or something I can implement? (The methods there didn't look that helpful.)
Or is there some way to have the item reader read the 1000 records but pass them to the writer as a single List (1000 input items => one output item)?
Let me walk through this in two parts. First I'll explain why it works the way it does, then I'll propose an option for addressing your issue.
Why the Retry Is Item by Item
In your configuration, you've specified that the step be fault tolerant. With that, when an exception is thrown in the ItemWriter, we don't know which item caused it, so we don't have a way to skip or retry it. That's why, when we do begin the skip/retry logic, we go item by item.
How To Handle Retry By The Chunk
What this comes down to is that you need to get to a chunk size of 1 for this to work. That means that instead of relying on Spring Batch to iterate over the items within a chunk, you do it yourself: your ItemReader returns a List<DataRecord>, your ItemProcessor loops over that list, and your ItemWriter takes a List<List<DataRecord>>. I'd recommend creating a decorator for an ItemWriter that unwraps the outer list before passing it to the main ItemWriter, as sketched below.
This does remove the ability to do true skipping of a single item within that list but it sounds like that's ok for your use case.
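Here's a minimal sketch of that approach against Spring Batch 4's List-based ItemWriter contract; DataRecord and the delegate come from the question, while the class names and batchSize handling are illustrative.

import java.util.ArrayList;
import java.util.List;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;

// Groups items from a delegate reader into lists of up to batchSize.
class BatchingItemReader implements ItemReader<List<DataRecord>> {

    private final ItemReader<DataRecord> delegate;
    private final int batchSize;

    BatchingItemReader(ItemReader<DataRecord> delegate, int batchSize) {
        this.delegate = delegate;
        this.batchSize = batchSize;
    }

    @Override
    public List<DataRecord> read() throws Exception {
        List<DataRecord> batch = new ArrayList<>(batchSize);
        DataRecord item;
        while (batch.size() < batchSize && (item = delegate.read()) != null) {
            batch.add(item);
        }
        return batch.isEmpty() ? null : batch; // null signals end of input
    }
}

// Unwraps the outer list added by chunk processing; with chunk size 1,
// each chunk contains a single List<DataRecord>.
class UnwrappingItemWriter implements ItemWriter<List<DataRecord>> {

    private final ItemWriter<DataRecord> delegate;

    UnwrappingItemWriter(ItemWriter<DataRecord> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(List<? extends List<DataRecord>> lists) throws Exception {
        List<DataRecord> flattened = new ArrayList<>();
        for (List<DataRecord> list : lists) {
            flattened.addAll(list);
        }
        delegate.write(flattened);
    }
}

The step is then declared with .<List<DataRecord>, List<DataRecord>>chunk(1), so a retry or skip applies to the whole list rather than to individual records.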

Spring Kafka error handling for multiple record in the same poll

I'm playing with Spring Kafka and error handling (org.springframework.kafka.listener.ErrorHandler), but what's not clear to me is what happens when a poll returns multiple records and only one of them causes an exception. As far as I understand, the other records are skipped. How can I instead achieve precise error handling (e.g. skipping only the affected record and continuing with the others)?
See the SeekToCurrentErrorHandler - it performs a seek on the failed record as well as any other partitions that follow the failed one in the poll results.
When retries are exhausted, the failed record is skipped.
A RemainingRecordsErrorHandler (a sub-interface of ErrorHandler, of which SeekToCurrentErrorHandler is an implementation) is given the list of remaining records.
With the simple error handler, which only gets the failed record, the remaining records are passed to the listener (as long as transactions are not being used).
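For example, a minimal sketch (Spring Kafka 2.3+ API; the bean wiring and String key/value types are illustrative) of configuring a SeekToCurrentErrorHandler with a bounded back-off, after which the failed record is skipped:

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // Re-seek and retry the failed record twice (1s apart), then skip it;
    // records after the failure in the poll are re-delivered, not lost.
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 2L)));
    return factory;
}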

Database locking in Spring Batch

I'm working on a Spring Batch application that ran into a DB2 deadlock while using a default JdbcCursorItemReader. When the batch job hit an error, a SkipListener we had set up wrote an "Error" status to the relevant row, which is when the deadlock occurred.
We found that by using a default JdbcPagingItemReader, we were able to avoid the deadlock scenario, though we aren't exactly sure why this is the case.
My understanding of Spring Batch is that either reader should have freed the lock on the database once the ResultSet was read in from the query, but this didn't appear to happen with the JdbcCursorItemReader.
Would anyone be able to help me understand why this is the case?
Thanks!
The JdbcCursorItemReader maintains a position (cursor) within the database so it knows where to read from next, and that cursor is held open by a lock. The JdbcPagingItemReader instead submits queries requesting data between a known start and end point, so it only reads the data between those two points and does not hold locks between calls.
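To illustrate the paging behavior, here's a sketch using Spring Batch 4's builder API; the table, columns, and MyRow type are made up. Each page is fetched with a bounded, ordered query, so no cursor stays open between chunks:

import java.util.Collections;
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.jdbc.core.BeanPropertyRowMapper;

@Bean
public JdbcPagingItemReader<MyRow> pagingReader(DataSource dataSource) {
    return new JdbcPagingItemReaderBuilder<MyRow>()
            .name("pagingReader")
            .dataSource(dataSource)
            .selectClause("SELECT ID, STATUS, PAYLOAD")
            .fromClause("FROM MY_TABLE")
            .whereClause("WHERE STATUS = 'NEW'")
            // order by the unique key so each page is a stable, bounded range
            .sortKeys(Collections.singletonMap("ID", Order.ASCENDING))
            .pageSize(1000)
            .rowMapper(new BeanPropertyRowMapper<>(MyRow.class))
            .build();
}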

Spring Batch, Chunk Size and Skip Listener together

I have a Spring Batch application which works well. It just reads from a text file and writes to an Oracle table, performing the loading in chunks; currently I have configured a chunk size of 2000. The issue is, when I implement the skip listener for this job, Spring ignores the chunk size I have given and inserts just one record at a time into the database. The skip listener just writes the invalid record to a text file. Is this how Spring Batch works?
Within a chunk, the ItemWriter will always first attempt to write the entire list of items. However, if a skippable exception is thrown, the framework needs to figure out which item(s) caused the error.
To do this, the transaction is rolled back and then the items are retried one-by-one. This allows any item(s) that may have caused the issue to be passed to your skip listener. Unfortunately, it also removes the batch-iness of the chunk.
In general, it is preferable (and will perform better) to do upfront validation in the processor, so you can "filter" the items out rather than throwing an exception and retrying the items individually.
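A minimal sketch of that upfront-validation idea (the LineItem type and the isValid check are placeholders): returning null from the processor filters the item out of the chunk without triggering the one-by-one scan.

import org.springframework.batch.item.ItemProcessor;

public class ValidatingProcessor implements ItemProcessor<LineItem, LineItem> {

    @Override
    public LineItem process(LineItem item) {
        if (!isValid(item)) {
            // optionally write the bad record to your error file/table here
            return null; // filtered out: never reaches the writer, chunk stays intact
        }
        return item;
    }

    private boolean isValid(LineItem item) {
        // placeholder validation; real checks go here
        return item != null && item.getAmount() != null;
    }
}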

MDB CLIENT_ACKNOWLEDGEMENT mode with max-messages-in-transaction >1

I have a need to group messages received from a system based on certain criteria. For performance reasons, I want to avoid persisting the individual messages before I can group them. I've seen that JMS implementations provide transaction batching over a set of messages, as described in:
Document 1
Document 2
But I also want the acknowledgement of the batch to be controlled by my code, so that if something goes wrong while grouping, I can roll back the batch I'm reading and process the messages again on the next try.
From the above links, since the transaction is managed by the container over a set of onMessage calls, I would not control the transaction commit and rollback.
Can someone let me know if I'm misreading this, and what would be the way to achieve it?
