Spring Batch Processor Exception Listener? - spring

I have a partitioned Spring Batch job that reads several split-up CSV files, processes each in its own thread, and then writes the results to a corresponding output file.
If an item fails to process (an exception is thrown), I want to write that result to an error file instead. Is there a way to add a writer or listener that can handle this?
Taking this one step further, is there a way to split this up by exception type and write the different exceptions to different files?

You can achieve this by specifying a SkipPolicy. Implement this interface and add your own logic.
public class MySkipper implements SkipPolicy {
    @Override
    public boolean shouldSkip(Throwable exception, int skipCount) throws SkipLimitExceededException {
        if (exception instanceof XYZException) {
            // doSomething
        }
        ......
    }
}
You can specify this skip policy in your step definition.
this.stepBuilders.get("importStep").<X, Y>chunk(10)
    .reader(this.getItemReader()).faultTolerant().skipPolicy(....)
    .processor(this.getItemProcessor())
    .writer(this.getItemWriter())
    .build();

One way that I have seen this done is through a combination of a SkipPolicy and a SkipListener.
The policy would allow you to skip over items that threw an exception, such as a FlatFileParseException (skippable exceptions can be configured).
The listener gives you access to the Throwable and the item that caused it (or just Throwable in the case of reads). The skip listener also lets you differentiate between skips in the read/processor/writer if you wanted to handle those separately.
public class ErrorWritingSkipListener<T, S> implements SkipListener<T, S> {
    @Override
    public void onSkipInRead(final Throwable t) {
        // custom logic
    }
    @Override
    public void onSkipInProcess(final T itemThatFailed, final Throwable t) {
        // custom logic
    }
    @Override
    public void onSkipInWrite(final S itemThatFailed, final Throwable t) {
        // custom logic
    }
}
I would recommend using the SkipPolicy only to identify the exceptions you want to write out to your various files, and leveraging the SkipListener to perform the actual file writing logic. That would match up nicely with their intended use as defined by their interfaces.
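To make the "different exceptions to different files" part concrete, here is a minimal sketch of the file-writing logic such a listener could delegate to. It uses only the JDK; the class name ErrorFileRouter, the "errors-<ExceptionName>.txt" naming scheme and the "item | message" line format are assumptions made for illustration, not part of Spring Batch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not a Spring Batch class): appends each failed item to a
// file named after the exception type, e.g. errors-IllegalStateException.txt.
class ErrorFileRouter {
    private final Path directory;

    ErrorFileRouter(Path directory) {
        this.directory = directory;
    }

    // Returns the error file that belongs to the given exception type.
    Path fileFor(Throwable t) {
        return directory.resolve("errors-" + t.getClass().getSimpleName() + ".txt");
    }

    // synchronized because a partitioned step may report skips from several threads.
    synchronized void record(Object item, Throwable t) {
        String line = item + " | " + t.getMessage() + System.lineSeparator();
        try {
            Files.write(fileFor(t), line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A skip listener could then call router.record(itemThatFailed, t) from onSkipInProcess, so the listener itself stays free of file-handling details.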

Related

Spring Batch Single Reader Multiple Processers and Multiple Writers [duplicate]

In Spring Batch I need to pass the items read by an ItemReader to two different processors and writers. What I'm trying to achieve is this:
                        +---> ItemProcessor#1 ---> ItemWriter#1
                        |
ItemReader ---> item ---+
                        |
                        +---> ItemProcessor#2 ---> ItemWriter#2
This is needed because items written by ItemWriter#1 should be processed in a completely different way compared to the ones written by ItemWriter#2.
Moreover, the ItemReader reads items from a database, and the queries it executes are so computationally expensive that executing the same query twice should be avoided.
Any hint about how to achieve such a setup? Or, at least, a logically equivalent setup?
This solution is valid if every item should be processed by both processor #1 and processor #2.
You have to create a processor #0 with this signature:
class Processor0 implements ItemProcessor<Item, CompositeResultBean>
where CompositeResultBean is a bean defined as
class CompositeResultBean {
    Processor1ResultBean result1;
    Processor2ResultBean result2;
    // setters omitted
}
In your processor #0, just delegate the work to processors #1 and #2 and put the results in a CompositeResultBean:
public CompositeResultBean process(Item item) throws Exception {
    final CompositeResultBean r = new CompositeResultBean();
    r.setResult1(processor1.process(item));
    r.setResult2(processor2.process(item));
    return r;
}
Your writer is then a CompositeItemWriter whose delegates write CompositeResultBean.result1 and CompositeResultBean.result2 respectively (have a look at PropertyExtractingDelegatingItemWriter, which may help).
I followed Luca's suggestion to use PropertyExtractingDelegatingItemWriter as writer and I was able to work with two different entities in one single step.
First of all what I did was to define a DTO that stores the two entities/results from the processor
public class DatabaseEntry {
    private AccessLogEntry accessLogEntry;
    private BlockedIp blockedIp;
    public AccessLogEntry getAccessLogEntry() {
        return accessLogEntry;
    }
    public void setAccessLogEntry(AccessLogEntry accessLogEntry) {
        this.accessLogEntry = accessLogEntry;
    }
    public BlockedIp getBlockedIp() {
        return blockedIp;
    }
    public void setBlockedIp(BlockedIp blockedIp) {
        this.blockedIp = blockedIp;
    }
}
Then I passed this DTO to the writer, a PropertyExtractingDelegatingItemWriter class where I define two customized methods to write the entities into the database, see my writer code below:
@Configuration
public class LogWriter extends LogAbstract {
    @Autowired
    private DataSource dataSource;
    @Bean
    public PropertyExtractingDelegatingItemWriter<DatabaseEntry> itemWriterAccessLogEntry() {
        PropertyExtractingDelegatingItemWriter<DatabaseEntry> writer = new PropertyExtractingDelegatingItemWriter<>();
        // These two DTO properties are extracted and passed as arguments to saveTransaction
        writer.setFieldsUsedAsTargetMethodArguments(new String[]{"accessLogEntry", "blockedIp"});
        writer.setTargetObject(this);
        writer.setTargetMethod("saveTransaction");
        return writer;
    }
    public void saveTransaction(AccessLogEntry accessLogEntry, BlockedIp blockedIp) throws SQLException {
        writeAccessLogTable(accessLogEntry);
        if (blockedIp != null) {
            writeBlockedIp(blockedIp);
        }
    }
    private void writeBlockedIp(BlockedIp entry) throws SQLException {
        String sql = "INSERT INTO blocked_ips (ip,threshold,startDate,endDate,comment) VALUES (?,?,?,?,?)";
        // try-with-resources closes the statement and releases the connection after the insert
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, entry.getIp());
            // threshold, startDate and endDate are not declared in this class; presumably inherited from LogAbstract
            statement.setInt(2, threshold);
            statement.setTimestamp(3, Timestamp.valueOf(startDate));
            statement.setTimestamp(4, Timestamp.valueOf(endDate));
            statement.setString(5, entry.getComment());
            statement.execute();
        }
    }
    private void writeAccessLogTable(AccessLogEntry entry) throws SQLException {
        String sql = "INSERT INTO log_entries (date,ip,request,status,userAgent) VALUES (?,?,?,?,?)";
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setTimestamp(1, Timestamp.valueOf(entry.getDate()));
            statement.setString(2, entry.getIp());
            statement.setString(3, entry.getRequest());
            statement.setString(4, entry.getStatus());
            statement.setString(5, entry.getUserAgent());
            statement.execute();
        }
    }
}
With this approach you get the behaviour originally asked for: a single reader feeds the processing of multiple entities, and they are all saved in a single step.
You can use a CompositeItemProcessor and CompositeItemWriter
It won't look exactly like your schema, it will be sequential, but it will do the job.
This is the solution I came up with.
The idea is to code a new writer that "contains" both an ItemProcessor and an ItemWriter. Just to give you an idea, we called it PreprocessorWriter, and this is the core code.
private ItemWriter<O> writer;
private ItemProcessor<I, O> processor;
@Override
public void write(List<? extends I> items) throws Exception {
    List<O> toWrite = new ArrayList<O>();
    for (I item : items) {
        toWrite.add(processor.process(item));
    }
    writer.write(toWrite);
}
A lot is left aside here, management of ItemStream for instance, but in our particular scenario this was enough.
So you can just combine multiple PreprocessorWriter instances with a CompositeItemWriter.
There is another solution if you have a reasonable amount of data (say, less than 1 GB): you can cache the result of your select in a collection wrapped in a Spring bean.
Then you can read the collection twice without running the query again.
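The caching idea can be sketched with plain JDK types. The holder below runs the expensive query once and hands an independent iterator to each reader; the class name CachedQueryResult and the use of a Supplier to stand in for the database query are assumptions for illustration only. In a Spring application the holder would be registered as a bean and shared by both readers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical holder: runs the expensive query once and lets several
// readers iterate over the cached rows independently.
class CachedQueryResult<T> {
    private final Supplier<List<T>> query; // stands in for the expensive database query
    private List<T> rows;                  // filled on first access

    CachedQueryResult(Supplier<List<T>> query) {
        this.query = query;
    }

    // Each caller gets its own iterator over the same cached, read-only rows.
    synchronized Iterator<T> newIterator() {
        if (rows == null) {
            rows = Collections.unmodifiableList(new ArrayList<>(query.get()));
        }
        return rows.iterator();
    }
}
```

Because the whole result set stays in memory, this only suits data volumes that comfortably fit in the heap, as the answer above already notes.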

Spring batch - Processor or Writer?

In my Spring Boot and Spring Batch application, I have a step like this:
@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .<FileInfo, FileInfo>chunk(10)
            .reader(FileInfoItemReader)
            .processor(processor())
            .writer(writer())
            .build();
}
My writer is empty, as shown below:
public class BlankWriter<T> implements ItemWriter<T> {
    @Override
    public void write(List<? extends T> items) throws Exception {
        // intentionally empty
    }
}
Now, in my processor I have this:
public class FileInfoItemProcessor implements ItemProcessor<FileInfo, FileInfo> {
    .....
    @Override
    public FileInfo process(final FileInfo fileInfo) throws Exception {
        myCustomStuff();
        ......
    }
    public static void myCustomStuff() {
        ......
        ......
    }
}
Question: Since every object is passed to the processor, I can do all the work in the processor itself instead of only transforming items there. My purpose is served by the processor alone, but is that good practice, or must I use a writer (or a custom writer) to get the job done?
I think doing the REST POST call in the writer is more appropriate than doing it in the processor. A REST POST call is a kind of write operation to a remote location.
So you can omit the processor (since it is optional) and move that code to the item writer (instead of using a NoOp item writer with an empty write method).

Send data to Spring Batch Item Reader (or Tasklet)

I have the following requirement:
An endpoint http://localhost:8080/myapp/jobExecution/myJobName/execute receives a CSV and uses univocity to apply some validations and generate a List of POJOs.
Send that list to a Spring Batch job for some processing.
Multiple users could do this.
I want to know whether I can achieve this with Spring Batch.
I was thinking of using a queue: put the data on it and execute a job that pulls objects from that queue. But if another person calls the endpoint while another job is executing, how can I be sure that Spring Batch knows which item belongs to which execution?
You can use a queue, or you can take the list of values generated by the validation step and store it in the job execution context.
Below is a snippet that stores the list in the job context and reads it back using an ItemReader.
The first snippet implements StepExecutionListener in a Tasklet step to store the list that was constructed:
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
    // tenantNames is a List<String> which was constructed as an output of an evaluation logic
    stepExecution.getJobExecution().getExecutionContext().put("listOfTenants", tenantNames);
    return ExitStatus.COMPLETED;
}
The "listOfTenants" entry is then read in a step that has a reader (synchronized, so that only one thread reads at a time), a processor and a writer. You could also store the list in a queue and fetch it in the reader. Snippet for reference:
public class ReaderStep implements ItemReader<String>, StepExecutionListener {
    private List<String> tenantNames;
    @Override
    public void beforeStep(StepExecution stepExecution) {
        try {
            tenantNames = (List<String>) stepExecution.getJobExecution().getExecutionContext()
                    .get("listOfTenants");
            logger.debug("Successfully fetched the tenant list from the context");
        } catch (Exception e) {
            // Exception block
        }
    }
    @Override
    public synchronized String read() throws Exception {
        // Hand out one tenant name per call; returning null tells Spring Batch the input is exhausted
        if (!tenantNames.isEmpty()) {
            return tenantNames.remove(0);
        }
        logger.info("Completed reading all tenant names");
        return null;
    }
    // Rest of the overridden methods of this class..
}
Yes. Spring Boot would execute these jobs in different threads, and because the list is stored in the execution context of its own job execution, Spring Batch knows which items belong to which execution.
Note: You can also use a logging correlation ID. This will help you filter the logs for a particular request. https://dzone.com/articles/correlation-id-for-logging-in-microservices

Get failure exception in @HystrixCommand fallback method

Is there a way to get the reason a HystrixCommand failed when using the @HystrixCommand annotation within a Spring Boot application? It looks like if you implement your own HystrixCommand, you have access to getFailedExecutionException, but how can you get access to this when using the annotation? I would like to be able to do different things in the fallback method based on the type of exception that occurred. Is this possible?
I saw a note about HystrixRequestContext.initializeContext() but the HystrixRequestContext doesn't give you access to anything, is there a different way to use that context to get access to the exceptions?
Simply add a Throwable parameter to the fallback method and it will receive the exception which the original command produced.
From https://github.com/Netflix/Hystrix/tree/master/hystrix-contrib/hystrix-javanica
@HystrixCommand(fallbackMethod = "fallback1")
User getUserById(String id) {
    throw new RuntimeException("getUserById command failed");
}
@HystrixCommand(fallbackMethod = "fallback2")
User fallback1(String id, Throwable e) {
    assert "getUserById command failed".equals(e.getMessage());
    throw new RuntimeException("fallback1 failed");
}
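Since the question asks about doing different things per exception type, here is a minimal sketch of how a fallback with a Throwable parameter could branch. The @HystrixCommand annotation is deliberately left out so the sketch compiles without Hystrix on the classpath; the class name UserFallbacks, the returned strings and the choice of exception types are all hypothetical.

```java
import java.util.concurrent.TimeoutException;

class UserFallbacks {
    // In hystrix-javanica this would be the method named by fallbackMethod;
    // the annotation is omitted here so the sketch needs only the JDK.
    String fallback(String id, Throwable e) {
        if (e instanceof TimeoutException) {
            // hypothetical handling for slow downstream calls
            return "timeout:" + id;
        } else if (e instanceof IllegalArgumentException) {
            // hypothetical handling for bad input
            return "invalid:" + id;
        }
        // anything else falls through to a generic default
        return "unknown:" + id;
    }
}
```

Which exception types actually reach the fallback depends on the command and on how Hystrix wraps failures, so it is worth logging e.getClass() once before relying on specific branches.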
I haven't found a way to get the exception with Annotations either, but creating my own Command worked for me like so:
public static class DemoCommand extends HystrixCommand<String> {
    protected DemoCommand() {
        super(HystrixCommandGroupKey.Factory.asKey("Demo"));
    }
    @Override
    protected String run() throws Exception {
        throw new RuntimeException("failed!");
    }
    @Override
    protected String getFallback() {
        System.out.println("Events (so far) in Fallback: " + getExecutionEvents());
        return getFailedExecutionException().getMessage();
    }
}
Hopefully this helps someone else as well.
According to the Hystrix documentation, the getFallback() method is invoked in the following cases:
Whenever a command execution fails: when an exception is thrown by construct() or run()
When the command is short-circuited because the circuit is open
When the command’s thread pool and queue or semaphore are at capacity
When the command has exceeded its timeout length
So you can easily find out what triggered your fallback method by assigning the execution exception to a Throwable object.
Assuming your HystrixCommand returns a String
public class ExampleTask extends HystrixCommand<String> {
//Your class body
}
do as follows:
@Override
protected String getFallback() {
    Throwable t = getExecutionException();
    if (circuitBreaker.isOpen()) {
        // Log or something
    } else if (t instanceof RejectedExecutionException) {
        // Log and get the thread pool name, could be useful
    } else {
        // Maybe something else happened
    }
    return "A default String"; // Avoid making any HTTP request here, or you will need to wrap it in a HystrixCommand as well
}
I couldn't find a way to obtain the exception with the annotations, but I found HystrixPlugins. With it you can register a HystrixCommandExecutionHook and get the exact exception there, like this:
HystrixPlugins.getInstance().registerCommandExecutionHook(new HystrixCommandExecutionHook() {
    @Override
    public <T> void onFallbackStart(final HystrixInvokable<T> commandInstance) {
    }
});
The command instance is a GenericCommand.
Most of the time just using getFailedExecutionException().getMessage() gave me null values.
Exception errorFromThrowable = getExceptionFromThrowable(getExecutionException());
String errMessage = (errorFromThrowable != null) ? errorFromThrowable.getMessage()
This gives me better results all the time.

Dropwizard intercept bad json and return custom error message

I want to intercept bad JSON input and return custom error messages in a Dropwizard application. I followed the approach of defining a custom exception mapper as described here: http://gary-rowe.com/agilestack/2012/10/23/how-to-implement-a-runtimeexceptionmapper-for-dropwizard/ . But it did not work for me. The same question has been asked here https://groups.google.com/forum/#!topic/dropwizard-user/r76Ny-pCveA but it went unanswered.
Any help would be highly appreciated.
My code is below; I register it in Dropwizard with environment.jersey().register(RuntimeExceptionMapper.class);
@Provider
public class RuntimeExceptionMapper implements ExceptionMapper<RuntimeException> {
    private static final Logger logger = LoggerFactory.getLogger(RuntimeExceptionMapper.class);
    @Override
    public Response toResponse(RuntimeException runtime) {
        logger.error("API invocation failed. Runtime : {}, Message : {}", runtime, runtime.getMessage());
        return Response.serverError().type(MediaType.APPLICATION_JSON).entity(new Error()).build();
    }
}
Problem 1:
The exception thrown by Jackson doesn't extend RuntimeException, but it does extend Exception. This doesn't matter, though (see Problem 2).
Problem 2:
DropwizardResourceConfig registers its own JsonProcessingExceptionMapper, so you should already see results similar to
{
    "message": "Unrecognized field \"field\" (class d.s.h.c.MyClass),..."
}
Now if you want to override this, you should create a more specific exception mapper. When working with exception mappers, the most specific one is chosen. JsonProcessingException is subclassed by JsonMappingException and JsonParseException, so you will want to create an exception mapper for each of these and then register them. I am not sure how to unregister the Dropwizard JsonProcessingExceptionMapper; otherwise we could just create a single mapper for JsonProcessingException, which would save us the hassle of creating both.
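The "most specific mapper wins" rule can be illustrated with a small JDK-only sketch: look for a mapper registered for the exception's own class, then walk up the superclass chain until one is found. This is only a model of the selection rule, not Jersey's actual implementation; the class NearestMapperLookup and the string mapper names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: picks the mapper registered for the nearest
// superclass of the thrown exception.
class NearestMapperLookup {
    private final Map<Class<? extends Throwable>, String> mappers = new HashMap<>();

    void register(Class<? extends Throwable> type, String mapperName) {
        mappers.put(type, mapperName);
    }

    // Walks up the superclass chain and returns the first registered mapper, or null.
    String find(Throwable t) {
        for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
            String name = mappers.get(c);
            if (name != null) {
                return name;
            }
        }
        return null;
    }
}
```

Under this rule a mapper for RuntimeException is never consulted for a checked exception, and a mapper for a subclass takes precedence over one registered for its parent, which matches the behaviour described above.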
Update
So you can remove the Dropwizard mapper, if you want, with the following:
Set<Object> providers = environment.jersey().getResourceConfig().getSingletons();
Iterator<Object> it = providers.iterator();
while (it.hasNext()) {
    Object val = it.next();
    if (val instanceof JsonProcessingExceptionMapper) {
        it.remove();
        break;
    }
}
Then you are free to use your own mapper for JsonProcessingException.
