org.apache.http.nio.reactor.IOReactorException: I/O dispatch worker terminated abnormally - apache-httpcomponents

I have a service which uses Apache HttpAsyncClient (versions: httpasyncclient-4.0.2.jar, httpcore-4.4.3.jar, httpcore-nio-4.3.3.jar).
All requests start failing some time after the async client starts, with the following being the initial exception:
[#|2016-03-16T22:31:59.376-0700|SEVERE|glassfish3.1.2|org.apache.http.impl.nio.client.InternalHttpAsyncClient|_ThreadID=564;_ThreadName=Thread-6;|I/O reactor terminated abnormally
org.apache.http.nio.reactor.IOReactorException: I/O dispatch worker terminated abnormally
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:357)
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:189)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.doExecute(CloseableHttpAsyncClientBase.java:67)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.access$000(CloseableHttpAsyncClientBase.java:38)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:57)
at java.lang.Thread.run(Unknown Source)
Caused by: RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=null)
at com.notificationservice.analytics.client.AsyncResponse$2.failed(AsyncResponse.java:178)
at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:134)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.failed(DefaultClientExchangeHandlerImpl.java:258)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.exception(HttpAsyncRequestExecutor.java:127)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:68)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:37)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:154)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:180)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:342)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
at java.lang.Thread.run(Unknown Source)
)
at com.notificationservice.client.AsyncResponse$2.failed(AsyncResponse.java:178)
at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:134)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.failed(DefaultClientExchangeHandlerImpl.java:258)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.exception(HttpAsyncRequestExecutor.java:127)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:68)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:37)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:154)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:180)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:342)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
... 1 more
The same problem happens with newer versions: httpasyncclient-4.1.1.jar, httpcore-4.4.4.jar, httpcore-nio-4.4.4.jar.
Any insight would be highly appreciated. Is there some IOReactorConfig parameter which needs to be changed?

I would say something is wrong with your REST parameters. A status code of 500 comes from the server, so your requests are reaching it:
Caused by: RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=null
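Separately, if the goal is to keep the reactor alive when an I/O dispatch worker dies, the relevant hook is an IOReactorExceptionHandler rather than a plain IOReactorConfig parameter. A minimal sketch of wiring both in when building the client (the thread count and timeout values are assumptions, not taken from the question):

import java.io.IOException;

import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;
import org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager;
import org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor;
import org.apache.http.impl.nio.reactor.IOReactorConfig;
import org.apache.http.nio.reactor.IOReactorException;
import org.apache.http.nio.reactor.IOReactorExceptionHandler;

public class AsyncClientFactory {

    public static CloseableHttpAsyncClient create() throws IOReactorException {
        IOReactorConfig config = IOReactorConfig.custom()
                .setIoThreadCount(Runtime.getRuntime().availableProcessors()) // assumed tuning value
                .setSoTimeout(30000)      // assumed: 30 s socket timeout
                .setConnectTimeout(30000) // assumed: 30 s connect timeout
                .build();

        DefaultConnectingIOReactor ioReactor = new DefaultConnectingIOReactor(config);
        // Returning true tells the reactor to log the exception and continue
        // instead of terminating the dispatch worker abnormally
        ioReactor.setExceptionHandler(new IOReactorExceptionHandler() {
            @Override
            public boolean handle(IOException ex) {
                return true;
            }

            @Override
            public boolean handle(RuntimeException ex) {
                return true;
            }
        });

        CloseableHttpAsyncClient client = HttpAsyncClients.custom()
                .setConnectionManager(new PoolingNHttpClientConnectionManager(ioReactor))
                .build();
        client.start();
        return client;
    }
}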

Related

Delayed execution for 'Explain' queries on Prestosql clusters

I have two types of PrestoSQL clusters: one on AWS instances and one on Kubernetes. PrestoSQL on K8s has a weird issue with 'EXPLAIN' queries: they take a long time, ~2-3 minutes, compared to 2-3 seconds on the instance-based cluster.
The query stays in "WAITING_FOR_RESOURCES" for about 2 minutes and then executes very quickly.
There is also an exception in the server logs:
2020-12-23T05:25:01.930Z ERROR Query-20201223_052431_00004_pxqak-276 io.prestosql.cost.CachingStatsProvider Error occurred when computing stats for query 20201223_052431_00004_pxqak
io.prestosql.spi.PrestoException: HIVE_METASTORE_ERROR
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.getMetastorePartitionColumnStatistics(ThriftHiveMetastore.java:461)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.getPartitionColumnStatistics(ThriftHiveMetastore.java:438)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.getPartitionStatistics(ThriftHiveMetastore.java:389)
at io.prestosql.plugin.hive.metastore.thrift.BridgingHiveMetastore.getPartitionStatistics(BridgingHiveMetastore.java:110)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.lambda$loadPartitionColumnStatistics$6(CachingHiveMetastore.java:360)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.loadPartitionColumnStatistics(CachingHiveMetastore.java:353)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.access$100(CachingHiveMetastore.java:89)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore$1.loadAll(CachingHiveMetastore.java:179)
at com.google.common.cache.CacheLoader$1.loadAll(CacheLoader.java:207)
at io.prestosql.cost.JoinStatsRule.doCalculate(JoinStatsRule.java:81)
at io.prestosql.cost.JoinStatsRule.doCalculate(JoinStatsRule.java:48)
at io.prestosql.cost.SimpleStatsRule.calculate(SimpleStatsRule.java:39)
at io.prestosql.cost.ComposableStatsCalculator.calculateStats(ComposableStatsCalculator.java:82)
at io.prestosql.cost.ComposableStatsCalculator.calculateStats(ComposableStatsCalculator.java:70)
at io.prestosql.cost.CachingStatsProvider.getGroupStats(CachingStatsProvider.java:103)
at io.prestosql.cost.CachingStatsProvider.getStats(CachingStatsProvider.java:72)
at io.prestosql.cost.JoinStatsRule.doCalculate(JoinStatsRule.java:81)
at io.prestosql.cost.JoinStatsRule.doCalculate(JoinStatsRule.java:48)
at io.prestosql.cost.SimpleStatsRule.calculate(SimpleStatsRule.java:39)
at io.prestosql.cost.ComposableStatsCalculator.calculateStats(ComposableStatsCalculator.java:82)
at io.prestosql.cost.ComposableStatsCalculator.calculateStats(ComposableStatsCalculator.java:70)
at io.prestosql.cost.CachingStatsProvider.getGroupStats(CachingStatsProvider.java:103)
at io.prestosql.cost.CachingStatsProvider.getStats(CachingStatsProvider.java:72)
at io.prestosql.cost.CostCalculatorWithEstimatedExchanges.calculateJoinExchangeCost(CostCalculatorWithEstimatedExchanges.java:233)
at io.prestosql.cost.CostCalculatorWithEstimatedExchanges.calculateJoinCostWithoutOutput(CostCalculatorWithEstimatedExchanges.java:208)
at io.prestosql.sql.planner.iterative.rule.DetermineJoinDistributionType.getJoinNodeWithCost(DetermineJoinDistributionType.java:180)
at io.prestosql.sql.planner.iterative.rule.DetermineJoinDistributionType.addJoinsWithDifferentDistributions(DetermineJoinDistributionType.java:116)
at io.prestosql.sql.planner.iterative.rule.DetermineJoinDistributionType.getCostBasedJoin(DetermineJoinDistributionType.java:98)
at io.prestosql.sql.planner.iterative.rule.DetermineJoinDistributionType.apply(DetermineJoinDistributionType.java:74)
at io.prestosql.sql.planner.iterative.rule.DetermineJoinDistributionType.apply(DetermineJoinDistributionType.java:49)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.transform(IterativeOptimizer.java:165)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreNode(IterativeOptimizer.java:140)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:105)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:190)
at com.google.common.cache.LocalCache.loadAll(LocalCache.java:4058)
at com.google.common.cache.LocalCache.getAll(LocalCache.java:4021)
at com.google.common.cache.LocalCache$LocalLoadingCache.getAll(LocalCache.java:4972)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.getAll(CachingHiveMetastore.java:255)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.getPartitionStatistics(CachingHiveMetastore.java:330)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.lambda$loadPartitionColumnStatistics$6(CachingHiveMetastore.java:360)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.loadPartitionColumnStatistics(CachingHiveMetastore.java:353)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.access$100(CachingHiveMetastore.java:89)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore$1.loadAll(CachingHiveMetastore.java:179)
at com.google.common.cache.CacheLoader$1.loadAll(CacheLoader.java:207)
at com.google.common.cache.LocalCache.loadAll(LocalCache.java:4058)
at com.google.common.cache.LocalCache.getAll(LocalCache.java:4021)
at com.google.common.cache.LocalCache$LocalLoadingCache.getAll(LocalCache.java:4972)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.getAll(CachingHiveMetastore.java:255)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.getPartitionStatistics(CachingHiveMetastore.java:330)
at io.prestosql.plugin.hive.HiveMetastoreClosure.getPartitionStatistics(HiveMetastoreClosure.java:88)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.getPartitionStatistics(SemiTransactionalHiveMetastore.java:256)
at io.prestosql.plugin.hive.statistics.MetastoreHiveStatisticsProvider.getPartitionsStatistics(MetastoreHiveStatisticsProvider.java:126)
at io.prestosql.plugin.hive.statistics.MetastoreHiveStatisticsProvider.lambda$new$0(MetastoreHiveStatisticsProvider.java:104)
at io.prestosql.plugin.hive.statistics.MetastoreHiveStatisticsProvider.getTableStatistics(MetastoreHiveStatisticsProvider.java:146)
at io.prestosql.plugin.hive.HiveMetadata.getTableStatistics(HiveMetadata.java:695)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:107)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:190)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:107)
at io.prestosql.sql.planner.iterative.IterativeOptimizer.optimize(IterativeOptimizer.java:96)
at io.prestosql.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:196)
at io.prestosql.sql.analyzer.QueryExplainer.getLogicalPlan(QueryExplainer.java:182)
at io.prestosql.sql.analyzer.QueryExplainer.getPlan(QueryExplainer.java:121)
at io.prestosql.sql.rewrite.ExplainRewrite$Visitor.getQueryPlan(ExplainRewrite.java:137)
at io.prestosql.sql.rewrite.ExplainRewrite$Visitor.visitExplain(ExplainRewrite.java:115)
at io.prestosql.sql.rewrite.ExplainRewrite$Visitor.visitExplain(ExplainRewrite.java:65)
at io.prestosql.sql.tree.Explain.accept(Explain.java:80)
at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
at io.prestosql.sql.rewrite.ExplainRewrite.rewrite(ExplainRewrite.java:62)
at io.prestosql.sql.rewrite.StatementRewrite.rewrite(StatementRewrite.java:57)
at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:80)
at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:75)
at io.prestosql.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:221)
at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:180)
at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:97)
at io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:732)
at io.prestosql.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:119)
at io.prestosql.$gen.Presto_330____20201223_050837_2.call(Unknown Source)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: MetaException(message:null)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_statistics_req_result$get_partitions_statistics_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_statistics_req_result$get_partitions_statistics_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_statistics_req_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_statistics_req(ThriftHiveMetastore.java:4013)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_statistics_req(ThriftHiveMetastore.java:4000)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastoreClient.getPartitionColumnStatistics(ThriftHiveMetastoreClient.java:227)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.lambda$getPartitionColumnStatistics$16(FailureAwareThriftMetastoreClient.java:191)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.runWithHandle(FailureAwareThriftMetastoreClient.java:394)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.getPartitionColumnStatistics(FailureAwareThriftMetastoreClient.java:191)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.lambda$getMetastorePartitionColumnStatistics$15(ThriftHiveMetastore.java:453)
at io.prestosql.plugin.hive.metastore.thrift.ThriftMetastoreApiStats.lambda$wrap$0(ThriftMetastoreApiStats.java:42)
at io.prestosql.plugin.hive.util.RetryDriver.run(RetryDriver.java:130)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.getMetastorePartitionColumnStatistics(ThriftHiveMetastore.java:451)
... 156 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
Suppressed: MetaException(message:null)
... 170 more
I tried changing the values of hive.metastore.partition-batch-size.max and hive.metastore-cache-ttl, but it did not help.
It seems that in your "slow" deployment the metastore call get_partitions_statistics_req fails for some reason and is getting retried. The retries likely consume all the "waiting" time. Since Presto by default ignores stats calculation failures like this, the query eventually works.
The failure is on the Hive side, so you need to check metastore logs to understand the cause of the failure, since it's not getting propagated on the Presto side.
On the Presto side you can still apply some configuration changes as a workaround (a sketch of the relevant properties follows the list):
- disable stats for the Hive connector with the hive.table-statistics-enabled configuration property
- reduce the time spent retrying metastore calls with the hive.metastore.thrift.client.max-retry-time configuration property
- make your queries fail loudly with the global config property optimizer.ignore-stats-calculator-failures=false (unlikely what you want)
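For reference, a sketch of where these settings would live, assuming a standard file-based deployment with the Hive catalog defined in etc/catalog/hive.properties (the file names and the 10s retry budget are assumptions; the property names are from the points above):

# etc/catalog/hive.properties -- catalog-level workarounds
hive.table-statistics-enabled=false
hive.metastore.thrift.client.max-retry-time=10s

# etc/config.properties -- global: make stats-calculation failures fatal
optimizer.ignore-stats-calculator-failures=false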

Exception 'Cannot get a connection, pool error Timeout waiting for idle object' when using 'DBCPConnectionPoolLookup' service in Nifi

I'm trying to use the 'DBCPConnectionPoolLookup' service in 'ExecuteGroovyScript' to dynamically query the required database based on the 'database.name' attribute of the input flow file.
The processor is successfully able to get the corresponding 'DBCPConnectionPool' service for querying, but I'm getting the exception java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object. In contrast, if I use the 'DBCPConnectionPool' service directly, without the 'Lookup' service and without changing any configuration, it works fine.
I access the service as follows:
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
Then use the 'clientDb' object to query as:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
I have tried increasing Max Wait Time and Max Total Connections in the 'DBCPConnectionPool' service; it does not help.
Please find below links to images of the code, error and configuration:
Exception
Configuration of 'ExecuteGroovyScript'
Configuration of 'DBCPConnectionPool' service
Configuration of 'DBCPConnectionPoolLookup' service
Script Code
import org.apache.nifi.distributed.cache.client.Deserializer
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.exception.DeserializationException
import org.apache.nifi.distributed.cache.client.exception.SerializationException
import groovy.sql.Sql
import java.time.*
try {
    def flowFile = session.get()
    def isBootstrap = flowFile."isBootstrap"
    def timseriesSqlQuery = 'SELECT id FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def timseriesSqlCountQuery = 'SELECT count(id) as c FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def spaceSqlQuery = 'select id from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def spaceSqlCountQuery = 'select count(id) as c from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def cache = CTL.lastIngestTimeMap
    def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes()) //SQL.staticService
    int numRowsTimeSeries = 0
    int numRowsSpace = 0
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
    clientDb.rows(spaceSqlCountQuery).eachWithIndex { row, idx -> numRowsSpace = row.c }
} catch (Exception e) {
    log.error('Script failed', e) // catch added so the try block is valid Groovy
}
Exception from Nifi logs
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-3] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-9] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-9] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-10] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=a1ec4496-dca3-38ab-a47b-43d7ff95e40f] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-2] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=54d1e251-88f2-33f3-0489-722879a802bd] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object (same stack trace as above)
Finally, after bringing Nifi down twice, I found the solution. The problem was in the code I was using: the object returned by CTL.index.getConnection(flowFile.getAttributes()), which I queried the SQL table with directly, is actually a Connection object. Because of this, Nifi used up all available connections to SQL, so even when I reverted to using the 'DBCPConnectionPool' service instead of the 'Lookup' one I still got the above error. Restarting Nifi would make it work fine again for a while.
The actual code to use in your script with the 'Lookup' service is:
def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
Now use the 'clientDb' object to query your table
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
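A closely related point (my assumption about why the pool stayed exhausted until restart): the borrowed connection is only returned to the pool when it is closed, so it is worth wrapping the queries in a try/finally:

def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
try {
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
} finally {
    clientDb.close() // closes the underlying pooled connection, returning it to the pool
}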

class org.apache.ignite.IgniteClientDisconnectedException

I am trying to populate a GridGain cache which is running on a cluster on another machine. I'm getting a ClientDisconnectedException while calling methods on the GridGain cache; it happens after calling the put method on the cache.
Here is my cache Configuration:
// DPH cache
CacheConfiguration<K,V> DPHCacheCfg = new CacheConfiguration<>(DPH_CACHE);
DPHCacheCfg.setCacheMode(CacheMode.PARTITIONED); // Default.
DPHCacheCfg.setIndexedTypes(String.class, DPH.class);
DPHCacheCfg.setOffHeapMaxMemory(10 * 1024L * 1024L * 1024L);
DPHCacheCfg.setMemoryMode(CacheMemoryMode.ONHEAP_TIERED);
FifoEvictionPolicy evctPolicy = new FifoEvictionPolicy();
DPHCacheCfg.setEvictionPolicy(evctPolicy);
Here is how I put data in the cache:
DPHCache.put(K, V); where V is some object. After a certain number of puts, I get the exception below.
javax.cache.CacheException: class org.apache.ignite.IgniteClientDisconnectedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1615)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:1955)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1155)
at com.elsevier.elssie.datafabric.LoadQuetzal.populateDPH(LoadQuetzal.java:246)
at com.elsevier.elssie.datafabric.LoadQuetzal.main(LoadQuetzal.java:163)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteClientDisconnectedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.util.IgniteUtils$14.apply(IgniteUtils.java:829)
at org.apache.ignite.internal.util.IgniteUtils$14.apply(IgniteUtils.java:827)
... 11 more
Caused by: class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.disconnectedError(GridCacheMvccManager.java:406)
at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.onDisconnected(GridCacheMvccManager.java:382)
at org.apache.ignite.internal.processors.cache.GridCacheSharedContext.onDisconnected(GridCacheSharedContext.java:151)
at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onDisconnected(GridCacheProcessor.java:934)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:3023)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:588)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2058)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2039)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1435)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
This usually happens when all server nodes stop. You should check their logs to understand what exactly happened.
Also note that the client will reconnect automatically when at least one server is back. IgniteClientDisconnectedException has the reconnectFuture() method which returns the future that will be completed when the reconnection happens, so you can block the client until the cluster is functional.
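For example, a minimal sketch of that reconnect-and-retry pattern (the helper name and generics are mine, standing in for your DPHCache.put calls):

import javax.cache.CacheException;

import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteClientDisconnectedException;

public final class ReconnectRetry {

    // Retries a put once after blocking on the reconnect future
    public static <K, V> void putWithReconnect(IgniteCache<K, V> cache, K key, V value) {
        try {
            cache.put(key, value);
        } catch (CacheException e) {
            if (e.getCause() instanceof IgniteClientDisconnectedException) {
                IgniteClientDisconnectedException cause =
                        (IgniteClientDisconnectedException) e.getCause();
                cause.reconnectFuture().get(); // blocks until the client reconnects
                cache.put(key, value);         // retry the failed operation
            } else {
                throw e;
            }
        }
    }
}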

Coherence Exception in NamedCache.putall after updating coherence to 12.1.3

I use Coherence in my project with the following configuration.
Everything is OK with Coherence 12.1.2, but after updating Coherence to version 12.1.3 I have a problem. When I put, for example, 1000 items in the cache one by one using NamedCache.put(), there is no error, but when I put 1000 items in the cache by calling NamedCache.putAll() once, Coherence raises an exception.
My project is in Java and is deployed on JBoss.
cache configuration:
<distributed-scheme>
  <scheme-name>my-map</scheme-name>
  <service-name>MyMap</service-name>
  <serializer>
    <instance>
      <class-name>com.tangosol.io.pof.ConfigurablePofContext</class-name>
      <init-params>
        <init-param>
          <param-type>String</param-type>
          <param-value>pof-config.xml</param-value>
        </init-param>
      </init-params>
    </instance>
  </serializer>
  <thread-count>100</thread-count>
  <local-storage>true</local-storage>
  <backup-count>0</backup-count>
  <backing-map-scheme>
    <read-write-backing-map-scheme>
      <internal-cache-scheme>
        <ramjournal-scheme/>
      </internal-cache-scheme>
      <cachestore-scheme>
        <class-scheme>
          <class-name>myPackage.loader</class-name>
        </class-scheme>
      </cachestore-scheme>
    </read-write-backing-map-scheme>
  </backing-map-scheme>
  <autostart>true</autostart>
</distributed-scheme>
Raised exception details:
Exception in thread "pool-7-thread-1" com.tangosol.net.RequestIncompleteException: Partial failure
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$BinaryMap.validatePartialResponse(PartitionedCache.CDB:56)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$BinaryMap.putAll(PartitionedCache.CDB:123)
at com.oracle.common.collections.ConverterCollections$ConverterMap.putAll(ConverterCollections.java:1553)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$ViewMap.putAll(PartitionedCache.CDB:5)
at com.tangosol.coherence.component.util.SafeNamedCache.putAll(SafeNamedCache.CDB:1)
at mypackage.CacheEnabledDataProvider.putOnCache(CacheEnabledDataProvider.java:66)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: Portable(com.tangosol.util.WrapperException): (Wrapped: Failed request execution for myMap service on Member(Id=1, Timestamp=2014-10-20 18:00:25.737, Address=192.168.70.101:8088, MachineId=5642, Location=site:,machine:PS,process:9876,member:Mem_1, Role=CoherenceServer) (Wrapped: Failed to store keys="9112791815, 9165497418, 9199193873, 9192139970, 9392020128, ") Assertion failed:) Assertion failed:
at com.tangosol.util.Base.ensureRuntimeException(Base.java:289)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:50)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPartialCommit(PartitionedCache.CDB:7)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutAllRequest(PartitionedCache.CDB:85)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PutAllRequest$PutJob.run(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:32)
at com.tangosol.coherence.component.util.DaemonPool$Daemon.onNotify(DaemonPool.CDB:65)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Unknown Source)
at <process boundary>
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:57)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.ConfigurablePofContext.deserialize(ConfigurablePofContext.java:376)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.readObject(Service.CDB:1)
at com.tangosol.coherence.component.net.Message.readObject(Message.CDB:1)
at com.tangosol.coherence.component.net.message.responseMessage.DistributedPartialResponse.read(DistributedPartialResponse.CDB:12)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PartialValueResponse.read(PartitionedCache.CDB:4)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.deserializeMessage(Grid.CDB:20)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:21)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:745)
Caused by: Portable(com.tangosol.util.AssertionException): Assertion failed:
at com.tangosol.util.Base.azzertFailed(Base.java:209)
at com.tangosol.util.Base.azzert(Base.java:166)
at com.tangosol.net.cache.ReadWriteBackingMap$CacheStoreWrapper.storeAllInternal(ReadWriteBackingMap.java:5943)
at com.tangosol.net.cache.ReadWriteBackingMap$StoreWrapper.storeAll(ReadWriteBackingMap.java:5067)
at com.tangosol.net.cache.ReadWriteBackingMap.putAll(ReadWriteBackingMap.java:840)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putAllPrimaryResource(PartitionedCache.CDB:7)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.postPutAll(PartitionedCache.CDB:27)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putAll(PartitionedCache.CDB:14)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutAllRequest(PartitionedCache.CDB:62)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PutAllRequest$PutJob.run(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:32)
at com.tangosol.coherence.component.util.DaemonPool$Daemon.onNotify(DaemonPool.CDB:65)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Unknown Source)
at <process boundary>
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:57)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.PortableException.readExternal(PortableException.java:150)
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:59)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.ConfigurablePofContext.deserialize(ConfigurablePofContext.java:376)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.readObject(Service.CDB:1)
at com.tangosol.coherence.component.net.Message.readObject(Message.CDB:1)
at com.tangosol.coherence.component.net.message.responseMessage.DistributedPartialResponse.read(DistributedPartialResponse.CDB:12)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PartialValueResponse.read(PartitionedCache.CDB:4)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.deserializeMessage(Grid.CDB:20)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:21)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:745)
I found that if the loader class defined in the cachestore-scheme element implements CacheStore instead of CacheLoader and leaves store() and storeAll() empty, no exception is raised. According to the Oracle documents, implementing CacheStore is not necessary for a read-only cache store; implementing CacheLoader is sufficient. This might be a bug.
I am not sure whether implementing CacheStore with empty store() and storeAll() methods has a performance penalty when we only need a read-only cache store.
I would only implement CacheLoader (not CacheStore) if you are not writing back to the database.
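For illustration, a minimal read-only loader along those lines; the class name and the lookup body are hypothetical stand-ins for myPackage.loader (Coherence's CacheLoader interface uses raw types):

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import com.tangosol.net.cache.CacheLoader;

public class ReadOnlyLoader implements CacheLoader {

    @Override
    public Object load(Object key) {
        // replace with the real database lookup; null means "not found"
        return null;
    }

    @Override
    public Map loadAll(Collection keys) {
        Map results = new HashMap();
        for (Object key : keys) {
            Object value = load(key);
            if (value != null) {
                results.put(key, value);
            }
        }
        return results;
    }
}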

Spring Batch UNKNOWN state

I have a batch job which does some heavy operations. It runs for approximately 11-12 hours.
After that it moves to the UNKNOWN state.
I have a question: when would a batch job move to the UNKNOWN state?
Following is the stack trace:
org.springframework.transaction.TransactionSystemException: Could not roll back JDBC transaction; nested exception is java.sql.SQLException: Protocol violation
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:285)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:845)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:161)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:134)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:284)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:282)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:121)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:909)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.sql.SQLException: Protocol violation
at oracle.jdbc.driver.T4CTTIfun.receive
Thanks
Aditya
A batch job will move into the UNKNOWN state only when a rollback is unsuccessful, leaving the job in an uncertain state, which is what it looks like happened here. The real question is: why was the rollback unsuccessful?
Check org.springframework.batch.core.step.AbstractStep. It is full of:
try {
    getJobRepository().updateExecutionContext(stepExecution);
}
catch (Exception e) {
    stepExecution.setStatus(BatchStatus.UNKNOWN);
    exitStatus = exitStatus.and(ExitStatus.UNKNOWN);
    stepExecution.addFailureException(e);
}
So it happens on the update* operations of getJobRepository() (the same cause as @Minella stated).
