CPU Starvation detected because of MQ exception - ibm-mq

**0002bc2d SibMessage W [:] CWSJY0003W: JMSCC3036: An exception has been delivered to the
connections exception listener: '
Message : com.ibm.msg.client.jms.DetailedJMSException: JMSWMQ1107: A problem with this connection has o
ccurred. An error has occurred with the WebSphere MQ JMS connection. Use the linked exception to determine the cause of this e
rror.
Class : class com.ibm.msg.client.jms.DetailedJMSException
Stack : com.ibm.msg.client.wmq.common.internal.Reason.reasonToException(Reason.java:608)
: com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:236)
: com.ibm.msg.client.wmq.internal.WMQConnection.consumer(WMQConnection.java:851)
: com.ibm.mq.jmqi.remote.internal.RemoteAsyncConsume.callEventHandler(RemoteAsyncConsume.java:1
023)
: com.ibm.mq.jmqi.remote.internal.RemoteAsyncConsume.driveEventsEH(RemoteAsyncConsume.java:1381
)
: com.ibm.mq.jmqi.remote.internal.RemoteDispatchThread.run(RemoteDispatchThread.java:310)
: com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.runTask(WorkQueueItem.java:209)
: com.ibm.msg.client.commonservices.workqueue.SimpleWorkQueueItem.runItem(SimpleWorkQueueItem.j
ava:100)
: com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.run(WorkQueueItem.java:224)
: com.ibm.ws.wmqcsi.workqueue.WorkQueueManagerImpl$WorkQueueRunnable.run(WorkQueueManagerImpl.j
ava:648)
: java.lang.Thread.run(Thread.java:738)
Caused by [1] --> Message : com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED'
) reason '2009' ('MQRC_CONNECTION_BROKEN').
Class : class com.ibm.mq.MQException
Stack : com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:223)
: com.ibm.msg.client.wmq.internal.WMQConnection.consumer(WMQConnection.java:851)
: com.ibm.mq.jmqi.remote.internal.RemoteAsyncConsume.callEventHandler(RemoteAsyncConsume.java:1
023)
: com.ibm.mq.jmqi.remote.internal.RemoteAsyncConsume.driveEventsEH(RemoteAsyncConsume.java:1381)
: com.ibm.mq.jmqi.remote.internal.RemoteDispatchThread.run(RemoteDispatchThread.java:310)
: com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.runTask(WorkQueueItem.java:209)
: com.ibm.msg.client.commonservices.workqueue.SimpleWorkQueueItem.runItem(SimpleWorkQueueItem.j
ava:100)
: com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.run(WorkQueueItem.java:224)
: com.ibm.ws.wmqcsi.workqueue.WorkQueueManagerImpl$WorkQueueRunnable.run(WorkQueueManagerImpl.j
ava:648)
: java.lang.Thread.run(Thread.java:738)
'.
Please can anybody explain what this exception means and why has this caused CPU starvation**

Did you review your own logfile?
JMSWMQ1107: A problem with this connection has occurred. An error has
occurred with the WebSphere MQ JMS connection. Use the linked
exception to determine the cause of this error.
And further down it says:
java.lang.Thread.run(Thread.java:738) Caused by [1] --> Message :
com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with
compcode '2' ('MQCC_FAILED' ) reason '2009'
('MQRC_CONNECTION_BROKEN').
Did you look up MQ reason code 2009 in the MQ Information Center? Basically, it means the connection to the queue manager was lost and you need to reconnect.
and why has this caused CPU starvation
Any CPU starvation you are seeing has nothing to do with error but rather your code is not properly handling the error and is probably in a loop racing around and around in circles doing nothing.

Related

Radis Caching Issue (Time Out issue)

I am facing radis issue. I have implemented centeralized radis cluster approch. But whenever the maximum number of services data get cached in the radis it causes an issue, "Read time out" occured.
Please anyone can give me suggestions what can i do?
Please find below properties:
spring :
redis :
database : ${redis.database}
#host : ${redis.host}
#port : ${redis.port}
password : ${redis.password}
pool :
max-active : 8
max-wait : 10000
max-idle : 8
min-idle : 1
timeout : 10000
redis :
cluster :
nodes : 10.xxx.x.xx:1234,10.xxx.x.xx:12365
database : 0
password : 'xxxxxx'
Error Logs:
2022-02-11 13:59:59,996 ERROR [nio-9120-exec-1008] c.h.l.framework.aop.ControllerAspect - [AGW2202111359590622605456]:
redis.clients.jedis.exceptions.JedisDataException: ERR Error running script (call to f_e78e961698f118a64316964cac1ca9e6008c29da): #user_script:34: #user_script: 34: -OOM command not allowed when used memory > 'maxmemory'.
at redis.clients.jedis.Protocol.processError(Protocol.java:127)
at redis.clients.jedis.Protocol.process(Protocol.java:161)
at redis.clients.jedis.Protocol.read(Protocol.java:215)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
at redis.clients.jedis.Connection.getOne(Connection.java:322)
at redis.clients.jedis.BinaryJedis.evalsha(BinaryJedis.java:3142)
at redis.clients.jedis.BinaryJedis.evalsha(BinaryJedis.java:3135)
at redis.clients.jedis.BinaryJedisCluster$119.execute(BinaryJedisCluster.java:1270)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:120)
at redis.clients.jedis.JedisClusterCommand.runBinary(JedisClusterCommand.java:81)
at redis.clients.jedis.BinaryJedisCluster.evalsha(BinaryJedisCluster.java:1272)
at com.hisun.lemon.framework.jedis.JedisClusterConnection.evalSha(JedisClusterConnection.java:67)
at sun.reflect.GeneratedMethodAccessor1795.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.data.redis.core.CloseSuppressingInvocationHandler.invoke(CloseSuppressingInvocationHandler.java:57)
at com.sun.proxy.$Proxy537.evalSha(Unknown Source)

Getting an XAException

I'm having this problem and I am hoping someone can help me out.
Error message :
[6/24/21 5:20:01:846 GMT-03:00] 000014df WSRdbXaResour E DSRA0304E: XAException occurred. XAException contents and details are:
The XA Error is : -7
The XA Error message is : Resource manager is unavailable.
The Oracle Error code is : 17410
The Oracle Error message is: Internal XA Error
.
[6/24/21 5:20:01:849 GMT-03:00] 000014df WSRdbXaResour E DSRA0302E: XAException occurred. Error code is: XAER_RMFAIL (-7). Exception is: <null>
[6/24/21 5:20:01:852 GMT-03:00] 000014df ConnectionEve W J2CA0206W: A connection error occurred. To help determine the problem, enable the Diagnose Connection Usage option on the Connection Factory or Data Source. This is the multithreaded access detection option. Alternatively check that the Database or MessageProvider is available.
[6/24/21 5:20:01:853 GMT-03:00] 000014df ConnectionEve A J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adapter for resource MegaBroker_DataSource. The exception is: oracle.jdbc.xa.OracleXAException
[6/24/21 5:20:01:853 GMT-03:00] 000014df RegisteredRes E WTRN0078E: An attempt by the transaction manager to call start on a transactional resource has resulted in an error. The error code was XAER_RMFAIL. The exception stack trace follows: oracle.jdbc.xa.OracleXAException
at oracle.jdbc.xa.OracleXAResource.checkError(OracleXAResource.java:1188)
.
.
.
[6/24/21 5:20:01:859 GMT-03:00] 000014df XATransaction E J2CA0030E: Method enlist caught javax.transaction.SystemException: XAResource start association error:XAER_RMFAIL
at com.ibm.tx.jta.impl.RegisteredResources.startRes(RegisteredResources.java:1088)
.
.
.
Caused by: oracle.jdbc.xa.OracleXAException
.
.
.
while trying to enlist resources from DataSource MegaBroker_DataSource with the Transaction Manager for the current transaction, and threw a ResourceException.
System Details:
WAS Version: 8.5.5.3 Oracle version: 12c

Exception 'Cannot get a connection, pool error Timeout waiting for idle object' when using 'DBCPConnectionPoolLookup' service in Nifi

I'm trying to use 'DBCPConnectionPoolLookup' service in 'ExecuteGroovyScript' to dynamically query the required database based on 'database.name' parameter in the input flow file.
The processor is successfully able to get the corresponding 'DBCPConnectionPool' service for querying but I'm getting the an exception java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object. As opposed to if I directly use the 'DBCPConnectionPool' service without the 'Lookup' service without changing any configuration it works fine.
I access the service as follows:
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
Then use the 'clientDb' object to query as:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
I have tried increasing the values of Max Wait Time and Max Total Connections to higher values in 'DBCPConnectionPool' service, it does not help.
Please find below detail links of images for code,error and configuration
Exception
Configuration of 'ExecuteGroovyScript'
Configuration of 'DBCPConnectionPool' service
Configuration of 'DBCPConnectionPoolLookup' service
Script Code
import org.apache.nifi.distributed.cache.client.Deserializer
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.exception.DeserializationException
import org.apache.nifi.distributed.cache.client.exception.SerializationException
import groovy.sql.Sql
import java.time.*
try {
def flowFile = session.get()
def isBootstrap=flowFile."isBootstrap"
def timseriesSqlQuery='SELECT id FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
def timseriesSqlCountQuery='SELECT count(id) as c FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
def spaceSqlQuery='select id from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
def spaceSqlCountQuery='select count(id) as c from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
def cache = CTL.lastIngestTimeMap
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())//SQL.staticService
int numRowsTimeSeries=0
int numRowsSpace=0
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
clientDb.rows(spaceSqlCountQuery).eachWithIndex { row, idx ->numRowsSpace= row.c}
}
Exception from Nifi logs
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-3] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-9] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-9] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-10] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=a1ec4496-dca3-38ab-a47b-43d7ff95e40f] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-2] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=54d1e251-88f2-33f3-0489-722879a802bd] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
Finally after bring down Nifi twice I have found the solution. The problem seemed to be in the code which I was using, I used the object returned by CTL.index.getConnection(flowFile.getAttributes()) to query the SQL table which actually is a connection table, now due to this Nifi used up all available connections to SQL, due to which even if I reverted to using 'DBCPConnectionPool' service instead if 'Lookup' I was getting the above error. When I used to restart Nifi it used to work fine.
The actual code to be used in your script for using 'Lookup' Service is
def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
Now use the 'clientDb' object to query your table
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}

Explanation more on org.apache.hadoop.hdfs.leaserenewer

The message from kafka topic is not written to hdfs.
Reason :
DEBUG LeaseRenewer:pratim_ghosh#localhost:9000, clients=[DFSClient_NONMAPREDUCE_1103459197_28], created at java.lang.Throwable: TRACE
at org.apache.hadoop.hdfs.LeaseRenewer.<init>(LeaseRenewer.java:202)
at org.apache.hadoop.hdfs.LeaseRenewer.<init>(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$Factory.get(LeaseRenewer.java:143)
at org.apache.hadoop.hdfs.LeaseRenewer$Factory.access$100(LeaseRenewer.java:90)
at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:80)
at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:805)
at org.apache.hadoop.hdfs.DFSClient.beginFileLease(DFSClient.java:811)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1672)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1593)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:775)
at io.confluent.connect.hdfs.avro.AvroRecordWriterProvider.getRecordWriter(AvroRecordWriterProvider.java:55)
Could anybody please explain.

Spring Batch UNKNOWN state

Have a batch which does some heavy operations. It runs for approximately 11-12 hours.
After that it moves to UNKNOWN state.
I have a question when would a batch move to UNKNOWN state?
Following is stack Trace.
org.springframework.transaction.TransactionSystemException: Could not roll back JDBC transaction; nested exception is java.sql.SQLException: Protocol violation
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:285)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:845)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:161)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:134)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:284)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:282)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:121)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:909)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.sql.SQLException: Protocol violation
at oracle.jdbc.driver.T4CTTIfun.receive
Thanks
Aditya
A batch job will move into the UNKNOWN state only when a rollback is unsuccessful, leaving the job in an uncertain state which is what it looks like happened here. The real question here is…why was the rollback unsuccessful?
Check org.springframework.batch.core.step.AbstractStep. It is full of:
try {
getJobRepository().updateExecutionContext(stepExecution);
}
catch (Exception e) {
stepExecution.setStatus(BatchStatus.UNKNOWN);
exitStatus = exitStatus.and(ExitStatus.UNKNOWN);
stepExecution.addFailureException(e);
}
So it happens on update* operations of getJobRepository() (same as #Minella stated).

Resources