Spring Batch UNKNOWN state

I have a batch job which does some heavy operations. It runs for approximately 11-12 hours and after that it moves to the UNKNOWN state.
My question: when would a batch job move to the UNKNOWN state?
Following is the stack trace:
org.springframework.transaction.TransactionSystemException: Could not roll back JDBC transaction; nested exception is java.sql.SQLException: Protocol violation
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:285)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:845)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:161)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:134)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:284)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:282)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:121)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:909)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.sql.SQLException: Protocol violation
at oracle.jdbc.driver.T4CTTIfun.receive
Thanks
Aditya

A batch job will move into the UNKNOWN state only when a rollback is unsuccessful, leaving the job in an uncertain state, which is what it looks like happened here. The real question is: why was the rollback unsuccessful?

Check org.springframework.batch.core.step.AbstractStep. It is full of:
try {
    getJobRepository().updateExecutionContext(stepExecution);
}
catch (Exception e) {
    stepExecution.setStatus(BatchStatus.UNKNOWN);
    exitStatus = exitStatus.and(ExitStatus.UNKNOWN);
    stepExecution.addFailureException(e);
}
So it happens on the update* operations of getJobRepository() (as @Minella stated).
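For illustration, here is a minimal sketch (not from the original post; the job name and the JobExplorer wiring are assumptions) of how you might detect an execution that was left in UNKNOWN before deciding whether it is safe to restart it:
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;

public class UnknownStateCheck {

    private final JobExplorer jobExplorer; // injected/configured elsewhere (assumption)

    public UnknownStateCheck(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    /** Returns true if any execution of the most recent instance of the job ended in UNKNOWN. */
    public boolean lastInstanceUnknown(String jobName) {
        for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
            for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                if (execution.getStatus() == BatchStatus.UNKNOWN) {
                    // The repository could not record the real outcome (for example, the
                    // rollback failed as above), so the execution needs manual inspection
                    // before a restart.
                    return true;
                }
            }
        }
        return false;
    }
}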

Related

Exception 'Cannot get a connection, pool error Timeout waiting for idle object' when using 'DBCPConnectionPoolLookup' service in Nifi

I'm trying to use the 'DBCPConnectionPoolLookup' service in 'ExecuteGroovyScript' to dynamically query the required database, based on the 'database.name' attribute of the incoming flow file.
The processor is able to look up the corresponding 'DBCPConnectionPool' service for querying, but I'm getting the exception java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object. By contrast, if I use the 'DBCPConnectionPool' service directly, without the 'Lookup' service and without changing any configuration, it works fine.
I access the service as follows:
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
Then use the 'clientDb' object to query as:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
I have tried increasing Max Wait Time and Max Total Connections to higher values in the 'DBCPConnectionPool' service; it does not help.
Please find below links to images of the code, error, and configuration:
Exception
Configuration of 'ExecuteGroovyScript'
Configuration of 'DBCPConnectionPool' service
Configuration of 'DBCPConnectionPoolLookup' service
Script Code
import org.apache.nifi.distributed.cache.client.Deserializer
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.exception.DeserializationException
import org.apache.nifi.distributed.cache.client.exception.SerializationException
import groovy.sql.Sql
import java.time.*
try {
    def flowFile = session.get()
    def isBootstrap = flowFile."isBootstrap"
    def timseriesSqlQuery = 'SELECT id FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def timseriesSqlCountQuery = 'SELECT count(id) as c FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def spaceSqlQuery = 'select id from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def spaceSqlCountQuery = 'select count(id) as c from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def cache = CTL.lastIngestTimeMap
    def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes()) //SQL.staticService
    int numRowsTimeSeries = 0
    int numRowsSpace = 0
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
    clientDb.rows(spaceSqlCountQuery).eachWithIndex { row, idx -> numRowsSpace = row.c }
}
Exception from Nifi logs
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-3] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-9] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-9] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-10] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=a1ec4496-dca3-38ab-a47b-43d7ff95e40f] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-2] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=54d1e251-88f2-33f3-0489-722879a802bd] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
Finally, after bringing Nifi down twice, I have found the solution. The problem was in the code I was using: the object returned by CTL.index.getConnection(flowFile.getAttributes()) is actually a Connection, and I was using it directly to query the SQL table. Because of this, Nifi used up all the available connections to SQL, so even after I reverted to using the 'DBCPConnectionPool' service instead of the 'Lookup' service I kept getting the above error. Restarting Nifi would make it work again for a while.
The actual code to use in your script with the 'Lookup' service is:
def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
Now use the 'clientDb' object to query your table:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }

Try catch ignored?

The following code creates a new table in the db. I would like it to catch any SQL errors and continue running should the table already exist. However, when I execute the code and the table already exists, I get an exception as expected, but the code exits during compilation. Is the try/catch being ignored?
Code:
(ns app.storage
  (:import com.mchange.v2.c3p0.ComboPooledDataSource
           (clojure.lang ExceptionInfo))
  (:require [clojure.java.jdbc :refer :all]))

(def db {:classname   "org.sqlite.JDBC"
         :subprotocol "sqlite"
         :subname     "src/storage/journal.db"})

(defn create-db []
  (try
    (db-do-commands db
      (create-table-ddl :entry
                        [:id :primary :key]
                        [:account "varchar(255)"]
                        [:timestamp "timestamp"]
                        [:debt "double(9,2)"]
                        [:credit "double(9,2)"]))
    (catch ExceptionInfo e
      (println e))))

(create-db)
Exception:
Exception in thread "main" java.sql.BatchUpdateException: batch entry 0: [SQLITE_ERROR] SQL error or missing database (table entry already exists), compiling:(app/storage.clj:29:21)
at clojure.lang.Compiler.load(Compiler.java:7142)
at clojure.lang.RT.loadResourceScript(RT.java:370)
at clojure.lang.RT.loadResourceScript(RT.java:361)
at clojure.lang.RT.load(RT.java:440)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5066.invoke(core.clj:5641)
at clojure.core$load.doInvoke(core.clj:5640)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$load_one.invoke(core.clj:5446)
at clojure.core$load_lib$fn__5015.invoke(core.clj:5486)
at clojure.core$load_lib.doInvoke(core.clj:5485)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$load_libs.doInvoke(core.clj:5524)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:628)
at clojure.core$use.doInvoke(core.clj:5618)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at app.entry$eval2001$loading__4958__auto____2002.invoke(entry.clj:1)
at app.entry$eval2001.invoke(entry.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.eval(Compiler.java:6692)
at clojure.lang.Compiler.load(Compiler.java:7130)
at clojure.lang.RT.loadResourceScript(RT.java:370)
at clojure.lang.RT.loadResourceScript(RT.java:361)
at clojure.lang.RT.load(RT.java:440)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5066.invoke(core.clj:5641)
at clojure.core$load.doInvoke(core.clj:5640)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$load_one.invoke(core.clj:5446)
at clojure.core$load_lib$fn__5015.invoke(core.clj:5486)
at clojure.core$load_lib.doInvoke(core.clj:5485)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$load_libs.doInvoke(core.clj:5524)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:628)
at clojure.core$use.doInvoke(core.clj:5618)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at app.handler$eval1687$loading__4958__auto____1688.invoke(handler.clj:1)
at app.handler$eval1687.invoke(handler.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.eval(Compiler.java:6692)
at clojure.lang.Compiler.load(Compiler.java:7130)
at clojure.lang.RT.loadResourceScript(RT.java:370)
at clojure.lang.RT.loadResourceScript(RT.java:361)
at clojure.lang.RT.load(RT.java:440)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5066.invoke(core.clj:5641)
at clojure.core$load.doInvoke(core.clj:5640)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$load_one.invoke(core.clj:5446)
at clojure.core$load_lib$fn__5015.invoke(core.clj:5486)
at clojure.core$load_lib.doInvoke(core.clj:5485)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$load_libs.doInvoke(core.clj:5524)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$require.doInvoke(core.clj:5607)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at user$eval5.invoke(form-init1139882819344867506.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.eval(Compiler.java:6692)
at clojure.lang.Compiler.load(Compiler.java:7130)
at clojure.lang.Compiler.loadFile(Compiler.java:7086)
at clojure.main$load_script.invoke(main.clj:274)
at clojure.main$init_opt.invoke(main.clj:279)
at clojure.main$initialize.invoke(main.clj:307)
at clojure.main$null_opt.invoke(main.clj:342)
at clojure.main$main.doInvoke(main.clj:420)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at clojure.lang.Var.invoke(Var.java:383)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.Var.applyTo(Var.java:700)
at clojure.main.main(main.java:37)
Caused by: java.sql.BatchUpdateException: batch entry 0: [SQLITE_ERROR] SQL error or missing database (table entry already exists)
at org.sqlite.Stmt.executeBatch(Stmt.java:226)
at clojure.java.jdbc$execute_batch.invoke(jdbc.clj:400)
at clojure.java.jdbc$db_do_commands$fn__2331.invoke(jdbc.clj:671)
at clojure.java.jdbc$db_transaction_STAR_.doInvoke(jdbc.clj:580)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at clojure.java.jdbc$db_do_commands.doInvoke(jdbc.clj:670)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:628)
at clojure.java.jdbc$db_do_commands.doInvoke(jdbc.clj:677)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:630)
at clojure.java.jdbc$db_do_commands.doInvoke(jdbc.clj:664)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at app.storage$create_db.invoke(storage.clj:21)
at app.storage$eval2445.invoke(storage.clj:33)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.load(Compiler.java:7130)
... 76 more
Subprocess failed
I think the problem is that the catch is trying to catch exceptions of type ExceptionInfo. Try changing that to just Exception or Throwable and see if it catches the exception.
(catch Exception e
  (println e))
Notice in your stack trace the type of the exception being thrown is java.sql.BatchUpdateException, which doesn't inherit from clojure.lang.ExceptionInfo.
ExceptionInfo exceptions are typically created by calling ex-info, docs here. Your catch will only catch exceptions of the specified type (or sub-types of it).

class org.apache.ignite.IgniteClientDisconnectedException

I am trying to populate a GridGain cache which is running in a cluster on another machine. I'm getting a ClientDisconnectedException while calling a method on the GridGain cache; it happens after calling the put method on the cache.
Here is my cache configuration:
// DPH cache
CacheConfiguration<K,V> DPHCacheCfg = new CacheConfiguration<>(DPH_CACHE);
DPHCacheCfg.setCacheMode(CacheMode.PARTITIONED); // Default.
DPHCacheCfg.setIndexedTypes(String.class, DPH.class);
DPHCacheCfg.setOffHeapMaxMemory(10 * 1024L * 1024L * 1024L);
DPHCacheCfg.setMemoryMode(CacheMemoryMode.ONHEAP_TIERED);
FifoEvictionPolicy evctPolicy = new FifoEvictionPolicy();
DPHCacheCfg.setEvictionPolicy(evctPolicy);
Here is how I put data into the cache:
DPHCache.put(K, V), where V is some object. After a certain number of puts, I get the exception below.
javax.cache.CacheException: class org.apache.ignite.IgniteClientDisconnectedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1615)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:1955)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1155)
at com.elsevier.elssie.datafabric.LoadQuetzal.populateDPH(LoadQuetzal.java:246)
at com.elsevier.elssie.datafabric.LoadQuetzal.main(LoadQuetzal.java:163)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteClientDisconnectedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.util.IgniteUtils$14.apply(IgniteUtils.java:829)
at org.apache.ignite.internal.util.IgniteUtils$14.apply(IgniteUtils.java:827)
... 11 more
Caused by: class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Operation has been cancelled (client node disconnected).
at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.disconnectedError(GridCacheMvccManager.java:406)
at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.onDisconnected(GridCacheMvccManager.java:382)
at org.apache.ignite.internal.processors.cache.GridCacheSharedContext.onDisconnected(GridCacheSharedContext.java:151)
at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onDisconnected(GridCacheProcessor.java:934)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:3023)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:588)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2058)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2039)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1435)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
This usually happens when all server nodes stop. You should check their logs to understand what exactly happened.
Also note that the client will reconnect automatically when at least one server is back. IgniteClientDisconnectedException has a reconnectFuture() method which returns a future that completes when the reconnection happens, so you can block the client until the cluster is functional again.
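For illustration, a minimal sketch of that pattern (the cache name handling and key/value types are assumptions, not from the original post):
import javax.cache.CacheException;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteClientDisconnectedException;

public class SafePut {

    /** Puts a value, waiting for the client to reconnect if the cluster went away. */
    static <K, V> void putWithReconnect(Ignite ignite, String cacheName, K key, V value) {
        IgniteCache<K, V> cache = ignite.cache(cacheName);
        try {
            cache.put(key, value);
        } catch (CacheException e) {
            if (e.getCause() instanceof IgniteClientDisconnectedException) {
                IgniteClientDisconnectedException cause =
                        (IgniteClientDisconnectedException) e.getCause();
                cause.reconnectFuture().get(); // blocks until at least one server is back
                ignite.cache(cacheName).put(key, value); // retry after reconnect
            } else {
                throw e;
            }
        }
    }
}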

org.apache.http.nio.reactor.IOReactorException: I/O dispatch worker terminated abnormally

I have a service which uses Apache HttpAsyncClient. (versions: httpasyncclient-4.0.2.jar, httpcore-4.4.3.jar, httpcore-nio-4.3.3.jar)
All requests start failing some time after starting the async client, with the following being the initial exception -
[#|2016-03-16T22:31:59.376-0700|SEVERE|glassfish3.1.2|org.apache.http.impl.nio.client.InternalHttpAsyncClient|_ThreadID=564;_ThreadName=Thread-6;|I/O reactor terminated abnormally
org.apache.http.nio.reactor.IOReactorException: I/O dispatch worker terminated abnormally
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:357)
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:189)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.doExecute(CloseableHttpAsyncClientBase.java:67)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.access$000(CloseableHttpAsyncClientBase.java:38)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:57)
at java.lang.Thread.run(Unknown Source)
Caused by: RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=null)
at com.notificationservice.analytics.client.AsyncResponse$2.failed(AsyncResponse.java:178)
at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:134)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.failed(DefaultClientExchangeHandlerImpl.java:258)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.exception(HttpAsyncRequestExecutor.java:127)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:68)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:37)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:154)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:180)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:342)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
at java.lang.Thread.run(Unknown Source)
)
at com.notificationservice.client.AsyncResponse$2.failed(AsyncResponse.java:178)
at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:134)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.failed(DefaultClientExchangeHandlerImpl.java:258)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.exception(HttpAsyncRequestExecutor.java:127)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:68)
at org.apache.http.impl.nio.client.InternalIODispatch.onException(InternalIODispatch.java:37)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:154)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:180)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:342)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
... 1 more
The same problem happens with newer versions - httpasyncclient-4.1.1.jar, httpcore-4.4.4.jar, httpcore-nio-4.4.4.jar.
Any insight would be highly appreciated. Is there some IOReactorConfig parameter which needs to be changed?
I would say something is wrong with your REST parameters. Status code 500 comes from the server, so your requests are reaching it.
Caused by: RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=RestException(statusCode=500, code=null, message=I/O operation failed, developerMessage=null
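Since the question asks about IOReactorConfig, here is a minimal sketch of how the reactor can be tuned when the client is built (the thread count and timeout values are illustrative assumptions, not a known fix for this particular failure):
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;
import org.apache.http.impl.nio.reactor.IOReactorConfig;

public class AsyncClientFactory {

    static CloseableHttpAsyncClient newClient() {
        // Tune the NIO reactor; the defaults are usually reasonable.
        IOReactorConfig ioConfig = IOReactorConfig.custom()
                .setIoThreadCount(Runtime.getRuntime().availableProcessors())
                .setConnectTimeout(30000) // ms
                .setSoTimeout(30000)      // ms
                .build();

        CloseableHttpAsyncClient client = HttpAsyncClients.custom()
                .setDefaultIOReactorConfig(ioConfig)
                .build();
        client.start();
        return client;
    }
}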

Coherence exception in NamedCache.putAll after updating Coherence to 12.1.3

I use Coherence in my project with the following configuration.
Everything is OK with Coherence 12.1.2, but after updating Coherence to version 12.1.3 I have a problem. When I put, for example, 1000 items into the cache one by one using NamedCache.put(), there is no error, but when I put 1000 items into the cache with a single call to NamedCache.putAll(), Coherence raises an exception.
My project is in Java and is deployed on JBoss.
Cache configuration:
<distributed-scheme>
  <scheme-name>my-map</scheme-name>
  <service-name>MyMap</service-name>
  <serializer>
    <instance>
      <class-name>com.tangosol.io.pof.ConfigurablePofContext</class-name>
      <init-params>
        <init-param>
          <param-type>String</param-type>
          <param-value>pof-config.xml</param-value>
        </init-param>
      </init-params>
    </instance>
  </serializer>
  <thread-count>100</thread-count>
  <local-storage>true</local-storage>
  <backup-count>0</backup-count>
  <backing-map-scheme>
    <read-write-backing-map-scheme>
      <internal-cache-scheme>
        <ramjournal-scheme/>
      </internal-cache-scheme>
      <cachestore-scheme>
        <class-scheme>
          <class-name>myPackage.loader</class-name>
        </class-scheme>
      </cachestore-scheme>
    </read-write-backing-map-scheme>
  </backing-map-scheme>
  <autostart>true</autostart>
</distributed-scheme>
Raised exception details:
Exception in thread "pool-7-thread-1" com.tangosol.net.RequestIncompleteException: Partial failure
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$BinaryMap.validatePartialResponse(PartitionedCache.CDB:56)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$BinaryMap.putAll(PartitionedCache.CDB:123)
at com.oracle.common.collections.ConverterCollections$ConverterMap.putAll(ConverterCollections.java:1553)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$ViewMap.putAll(PartitionedCache.CDB:5)
at com.tangosol.coherence.component.util.SafeNamedCache.putAll(SafeNamedCache.CDB:1)
at mypackage.CacheEnabledDataProvider.putOnCache(CacheEnabledDataProvider.java:66)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: Portable(com.tangosol.util.WrapperException): (Wrapped: Failed request execution for myMap service on Member(Id=1, Timestamp=2014-10-20 18:00:25.737, Address=192.168.70.101:8088, MachineId=5642, Location=site:,machine:PS,process:9876,member:Mem_1, Role=CoherenceServer) (Wrapped: Failed to store keys="9112791815, 9165497418, 9199193873, 9192139970, 9392020128, ") Assertion failed:) Assertion failed:
at com.tangosol.util.Base.ensureRuntimeException(Base.java:289)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:50)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPartialCommit(PartitionedCache.CDB:7)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutAllRequest(PartitionedCache.CDB:85)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PutAllRequest$PutJob.run(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:32)
at com.tangosol.coherence.component.util.DaemonPool$Daemon.onNotify(DaemonPool.CDB:65)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Unknown Source)
at <process boundary>
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:57)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.ConfigurablePofContext.deserialize(ConfigurablePofContext.java:376)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.readObject(Service.CDB:1)
at com.tangosol.coherence.component.net.Message.readObject(Message.CDB:1)
at com.tangosol.coherence.component.net.message.responseMessage.DistributedPartialResponse.read(DistributedPartialResponse.CDB:12)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PartialValueResponse.read(PartitionedCache.CDB:4)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.deserializeMessage(Grid.CDB:20)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:21)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:745)
Caused by: Portable(com.tangosol.util.AssertionException): Assertion failed:
at com.tangosol.util.Base.azzertFailed(Base.java:209)
at com.tangosol.util.Base.azzert(Base.java:166)
at com.tangosol.net.cache.ReadWriteBackingMap$CacheStoreWrapper.storeAllInternal(ReadWriteBackingMap.java:5943)
at com.tangosol.net.cache.ReadWriteBackingMap$StoreWrapper.storeAll(ReadWriteBackingMap.java:5067)
at com.tangosol.net.cache.ReadWriteBackingMap.putAll(ReadWriteBackingMap.java:840)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putAllPrimaryResource(PartitionedCache.CDB:7)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.postPutAll(PartitionedCache.CDB:27)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putAll(PartitionedCache.CDB:14)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutAllRequest(PartitionedCache.CDB:62)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PutAllRequest$PutJob.run(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:32)
at com.tangosol.coherence.component.util.DaemonPool$Daemon.onNotify(DaemonPool.CDB:65)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Unknown Source)
at <process boundary>
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:57)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.PortableException.readExternal(PortableException.java:150)
at com.tangosol.io.pof.ThrowablePofSerializer.deserialize(ThrowablePofSerializer.java:59)
at com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:3316)
at com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2604)
at com.tangosol.io.pof.ConfigurablePofContext.deserialize(ConfigurablePofContext.java:376)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.readObject(Service.CDB:1)
at com.tangosol.coherence.component.net.Message.readObject(Message.CDB:1)
at com.tangosol.coherence.component.net.message.responseMessage.DistributedPartialResponse.read(DistributedPartialResponse.CDB:12)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PartialValueResponse.read(PartitionedCache.CDB:4)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.deserializeMessage(Grid.CDB:20)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:21)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:745)
I found that if the loader class defined in the cachestore-scheme element implements CacheStore instead of CacheLoader, and leaves store() and storeAll() empty, no exception is raised. According to the Oracle documentation, implementing CacheStore is not necessary for a read-only cache store; implementing CacheLoader is sufficient. This might be a bug.
I am not sure whether implementing CacheStore with empty store() and storeAll() methods, when we only need a read-only cache store, carries a performance penalty or not.
I would only implement CacheLoader (not CacheStore) if you are not writing back to the database.
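For reference, a rough sketch of the no-op CacheStore workaround described in the question (the class name and the load logic are placeholders, not from the original post):
import com.tangosol.net.cache.CacheStore;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Workaround sketch: satisfies the CacheStore contract for a read-only store
// by leaving the write operations empty. Normally a plain CacheLoader should suffice.
public class ReadOnlyLoader implements CacheStore {

    public Object load(Object key) {
        // Placeholder: fetch the value for this key from the backing database.
        return null;
    }

    public Map loadAll(Collection keys) {
        Map result = new HashMap();
        for (Object key : keys) {
            result.put(key, load(key));
        }
        return result;
    }

    public void store(Object key, Object value) { /* intentionally empty */ }

    public void storeAll(Map entries) { /* intentionally empty */ }

    public void erase(Object key) { /* intentionally empty */ }

    public void eraseAll(Collection keys) { /* intentionally empty */ }
}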
