Spring-XD Curator Connection Timeout

In Spring-XD the Curator Connection times out:
WARN ConnectionStateManager-0 curator.ConnectionState - Connection
attempt unsuccessful after 63021 (greater than max timeout of 60000).
Resetting connection and trying again with a new connection.
Curator tries to re-establish the connection, but fails. Please check the logs below. Has anyone faced a similar issue? Please let me know if you know of any way to resolve it, or of any workarounds.
Also, the default Curator connection timeout is 60000 ms. Is there a way to increase it? Does Spring XD expose a property that can be set?
2014-12-10 01:24:41,003 WARN ConnectionStateManager-0 server.ContainerRegistrar - >>> disconnected container: 1c8a234d-4b8d-4d65-b374-xxxxe8619
2014-12-10 01:24:41,004 INFO DeploymentsPathChildrenCache-0 server.ContainerRegistrar - Path cache event: null, type: CONNECTION_SUSPENDED
2014-12-10 01:24:41,005 INFO ConnectionStateManager-0 server.ContainerRegistrar - Undeploying module [ModuleDescriptor#350920b1 moduleName = 'rabbit', moduleLabel = 'rabbit', group = 'xxx-ingestion-2', sourceChannelName = [null], sinkChannelName = [null], index = 0, type = source, parameters = map['vhost' -> 'xxx_virtual_host', 'requeue' -> 'false', 'outputType' -> 'text/plain', 'queues' -> 'xx.xxx.queue', 'addresses' -> 'xxxmq.xx.xxxx.com'], children = list[[empty]]]
2014-12-10 01:24:46,022 ERROR pool-22-thread-1 connection.CachingConnectionFactory - Channel shutdown: clean connection shutdown; protocol method: method<connection.close>(reply-code=200, reply-text=OK, class-id=0, method-id=0)
2014-12-10 01:24:56,007 ERROR CuratorFramework-0 curator.ConnectionState - Connection timed out for connection string (514.xx.93.xxx:2181,504.58.xxx.xx:2181) and timeout (15000) / elapsed (15004)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:793)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:779)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:58)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:265)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-12-10 01:24:56,161 ERROR main-EventThread curator.ConnectionState - Connection timed out for connection string (514.xx.93.xxx:2181,504.58.xxx.xx:2181) and timeout (15000) / elapsed (15159)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:474)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:287)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
at org.springframework.xd.dirt.server.ContainerRegistrar$StreamModuleWatcher.process(ContainerRegistrar.java:744)
at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2014-12-10 01:25:03,014 ERROR CuratorFramework-0 imps.CuratorFrameworkImpl - Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:793)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:779)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:58)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:265)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Is this reproducible? Are you running in clustered or single-node mode?
The Curator connection timeout (in milliseconds) can be set via the system property curator-default-connection-timeout.
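For example, a minimal sketch (this assumes your launch script passes JAVA_OPTS through to the container JVM; the value is in milliseconds):
export JAVA_OPTS="-Dcurator-default-connection-timeout=120000"
xd/bin/xd-container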

Related

Storm supervisor can't connect to ZooKeeper

The error message in supervisor.log says the Storm supervisor can't create stormClusterState. At the same time, the /storm/supervisor directory in ZooKeeper is empty. The nimbus process can be started, but the supervisor cannot. Why?
The error message in supervisor.log:
java.lang.Error: java.lang.RuntimeException: org.apache.storm.shade.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /storm
at org.apache.storm.utils.Utils.handleUncaughtException(Utils.java:663) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.utils.Utils.handleUncaughtException(Utils.java:667) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.utils.Utils.lambda$createDefaultUncaughtExceptionHandler$2(Utils.java:1047) [storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.utils.Utils$$Lambda$17/00000000F826AC00.uncaughtException(Unknown Source) [storm-client-2.3.0.jar:2.3.0]
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:868) [?:1.8.0_242]
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:866) [?:1.8.0_242]
at java.lang.Thread.uncaughtException(Thread.java:1335) [?:1.8.0_242]
Caused by: java.lang.RuntimeException: org.apache.storm.shade.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /storm
at org.apache.storm.utils.Utils.wrapInRuntime(Utils.java:493) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.existsNode(ClientZookeeper.java:147) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.mkdirsImpl(ClientZookeeper.java:288) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.mkdirs(ClientZookeeper.java:70) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ZKStateStorage.<init>(ZKStateStorage.java:65) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ZKStateStorageFactory.mkStore(ZKStateStorageFactory.java:30) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStateStorageImpl(ClusterUtils.java:318) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStormClusterStateImpl(ClusterUtils.java:301) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStormClusterState(ClusterUtils.java:286) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.<init>(Supervisor.java:160) ~[storm-server-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.<init>(Supervisor.java:127) ~[storm-server-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.main(Supervisor.java:200) ~[storm-server-2.3.0.jar:2.3.0]
Caused by: org.apache.storm.shade.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /storm
at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1111) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:268) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:257) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:254) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:247) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:206) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.shade.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:35) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.existsNode(ClientZookeeper.java:144) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.mkdirsImpl(ClientZookeeper.java:288) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.zookeeper.ClientZookeeper.mkdirs(ClientZookeeper.java:70) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ZKStateStorage.<init>(ZKStateStorage.java:65) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ZKStateStorageFactory.mkStore(ZKStateStorageFactory.java:30) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStateStorageImpl(ClusterUtils.java:318) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStormClusterStateImpl(ClusterUtils.java:301) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.cluster.ClusterUtils.mkStormClusterState(ClusterUtils.java:286) ~[storm-client-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.<init>(Supervisor.java:160) ~[storm-server-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.<init>(Supervisor.java:127) ~[storm-server-2.3.0.jar:2.3.0]
at org.apache.storm.daemon.supervisor.Supervisor.main(Supervisor.java:200) ~[storm-server-2.3.0.jar:2.3.0]
The /storm node already exists in ZooKeeper, so the Storm cluster cannot reconnect to it. You can log in with the ZooKeeper client, delete the /storm node, and then restart the Storm cluster processes.
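For example, with ZooKeeper's bundled CLI (a sketch; <zk-host> is a placeholder, and deleteall requires ZooKeeper 3.5+, older releases use rmr instead):
bin/zkCli.sh -server <zk-host>:2181
deleteall /storm
Then restart nimbus and the supervisors so Storm can recreate the node.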

Exception 'Cannot get a connection, pool error Timeout waiting for idle object' when using 'DBCPConnectionPoolLookup' service in NiFi

I'm trying to use the 'DBCPConnectionPoolLookup' service in 'ExecuteGroovyScript' to dynamically query the required database, based on the 'database.name' attribute of the input flow file.
The processor successfully gets the corresponding 'DBCPConnectionPool' service for querying, but I'm getting the exception java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object. By contrast, if I use the 'DBCPConnectionPool' service directly, without the 'Lookup' service and without changing any configuration, it works fine.
I access the service as follows:
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
Then use the 'clientDb' object to query as:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
I have tried increasing the values of Max Wait Time and Max Total Connections in the 'DBCPConnectionPool' service; it does not help.
Please find below links to images of the code, error, and configuration:
Exception
Configuration of 'ExecuteGroovyScript'
Configuration of 'DBCPConnectionPool' service
Configuration of 'DBCPConnectionPoolLookup' service
Script Code
import org.apache.nifi.distributed.cache.client.Deserializer
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.exception.DeserializationException
import org.apache.nifi.distributed.cache.client.exception.SerializationException
import groovy.sql.Sql
import java.time.*
try {
def flowFile = session.get()
def isBootstrap=flowFile."isBootstrap"
def timseriesSqlQuery='SELECT id FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
def timseriesSqlCountQuery='SELECT count(id) as c FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
def spaceSqlQuery='select id from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
def spaceSqlCountQuery='select count(id) as c from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
def cache = CTL.lastIngestTimeMap
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())//SQL.staticService
int numRowsTimeSeries=0
int numRowsSpace=0
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
clientDb.rows(spaceSqlCountQuery).eachWithIndex { row, idx ->numRowsSpace= row.c}
}
Exception from NiFi logs
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-3] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-9] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-9] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-10] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=a1ec4496-dca3-38ab-a47b-43d7ff95e40f] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-2] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=54d1e251-88f2-33f3-0489-722879a802bd] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
Finally, after bringing down NiFi twice, I found the solution. The problem was in the code I was using: the object returned by CTL.index.getConnection(flowFile.getAttributes()) is actually a connection, and I was using it directly to query the SQL table. Because of this, NiFi used up all available connections to SQL, so even after I reverted to the 'DBCPConnectionPool' service instead of the 'Lookup' service I still got the above error. Restarting NiFi would make it work again for a while.
The correct code to use in your script with the 'Lookup' service is:
def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
Now use the 'clientDb' object to query your table
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
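A slightly fuller sketch of the same pattern (the try/finally is my addition, not part of the original answer; closing the connection is what returns it to the pool and prevents the exhaustion described above):
def connectionObj = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
try {
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
} finally {
    connectionObj.close() // hand the borrowed connection back to the DBCP pool
}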

Nutch Elasticsearch Integration

I'm following this tutorial for setting up Nutch along with Elasticsearch. Whenever I try to index the data into ES, it returns an error. Here are the logs:
Command:
bin/nutch index elasticsearch -all
Logs when I set elastic.port to 9200 in conf/nutch-site.xml:
2016-05-05 13:22:49,903 INFO basic.BasicIndexingFilter - Maximum title length for indexing set to: 100
2016-05-05 13:22:49,904 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2016-05-05 13:22:49,904 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2016-05-05 13:22:49,904 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2016-05-05 13:22:49,905 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.metadata.MetadataIndexer
2016-05-05 13:22:49,906 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
2016-05-05 13:22:49,961 INFO elastic.ElasticIndexWriter - Processing remaining requests [docs = 0, length = 0, total docs = 0]
2016-05-05 13:22:49,961 INFO elastic.ElasticIndexWriter - Processing to finalize last execute
2016-05-05 13:22:54,898 INFO client.transport - [Peggy Carter] failed to get node info for [#transport#-1][ubuntu][inet[localhost/127.0.0.1:9200]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9200]][cluster:monitor/nodes/info] request_id [1] timed out after [5000ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:366)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-05-05 13:22:55,682 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.elastic.ElasticIndexWriter
2016-05-05 13:22:55,683 INFO indexer.IndexingJob - Active IndexWriters :
ElasticIndexWriter
elastic.cluster : elastic prefix cluster
elastic.host : hostname
elastic.port : port (default 9300)
elastic.index : elastic index command
elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
elastic.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
2016-05-05 13:22:55,711 INFO elasticsearch.plugins - [Adrian Toomes] loaded [], sites []
2016-05-05 13:23:00,763 INFO client.transport - [Adrian Toomes] failed to get node info for [#transport#-1][ubuntu][inet[localhost/127.0.0.1:9200]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9200]][cluster:monitor/nodes/info] request_id [0] timed out after [5000ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:366)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-05-05 13:23:00,766 INFO indexer.IndexingJob - IndexingJob: done.
Logs when the default port 9300 is used:
2016-05-05 13:58:44,584 INFO elasticsearch.plugins - [Mentallo] loaded [], sites []
2016-05-05 13:58:44,673 WARN transport.netty - [Mentallo] Message not fully read (response) for [0] handler future(org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler$1#3c80f1dd), error [true], resetting
2016-05-05 13:58:44,674 INFO client.transport - [Mentallo] failed to get node info for [#transport#-1][ubuntu][inet[localhost/127.0.0.1:9300]], disconnecting...
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize exception response from stream
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize exception response from stream
at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:173)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.StreamCorruptedException: Unsupported version: 1
at org.elasticsearch.common.io.ThrowableObjectInputStream.readStreamHeader(ThrowableObjectInputStream.java:46)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
at org.elasticsearch.common.io.ThrowableObjectInputStream.<init>(ThrowableObjectInputStream.java:38)
at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:170)
... 23 more
2016-05-05 13:58:44,676 INFO indexer.IndexingJob - IndexingJob: done.
I've configured everything correctly, and I have looked at various threads as well, to no avail. Also, the Java version is the same for both ES and the JVM running Nutch. Is there a bug here?
I'm using Nutch 2.3.1 and have tried both ES 1.4.4 and 2.3.2. I can see data in Mongo, but I cannot index data into ES. Why?
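Two things worth checking (hedged suggestions, not from the original thread). First, the elastic.* keys listed in the log above are read from conf/nutch-site.xml; a minimal sketch with placeholder values:
<property>
  <name>elastic.host</name>
  <value>localhost</value>
</property>
<property>
  <name>elastic.port</name>
  <value>9300</value>
</property>
<property>
  <name>elastic.cluster</name>
  <value>elasticsearch</value>
</property>
Second, the TransportClient speaks the binary transport protocol on 9300 (9200 is the HTTP port, which would explain the first timeout), and a StreamCorruptedException: Unsupported version on 9300 is typically what a client/server version mismatch looks like, consistent with Nutch 2.3.1's ES 1.x client talking to an ES 2.3.2 server.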

Failed to connect to hadoop cluster when accessing file from pyspark

I'm running the following code:
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("basicRegressionUbuntu").setMaster("spark://MyCUSTOMIP:7077")
sc = SparkContext(conf=conf)
rdd = sc.textFile("hdfs://MYHADOOPMASTERNODE:8020/sampleData/Sacramentorealestatetransactions.csv")
It throws the following:
16/03/25 10:01:11 WARN security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Failed to connect to /10.0.2.15:42939
Exception in thread "main" java.io.IOException: Failed to connect to /10.0.2.15:42939
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out: /10.0.2.15:42939
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
I know that file path exists because when I SSH into MYHADOOPMASTERNODE and do an hdfs dfs -ls /sampleData/ it shows me the file.
Any help would be much appreciated!
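A hedged observation: 10.0.2.15 is the default VirtualBox NAT address, and the failing connection in the trace is back to the driver (port 42939), not to HDFS. If that matches your setup, try advertising a driver address the workers can actually route to; a sketch (DRIVER_REACHABLE_IP is a placeholder):
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("basicRegressionUbuntu")
        .setMaster("spark://MyCUSTOMIP:7077")
        .set("spark.driver.host", "DRIVER_REACHABLE_IP"))  # an address reachable from the cluster
sc = SparkContext(conf=conf)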

java.io.IOException: Job status not available about hive

When I use Hive with select * from table_name;, it works.
When I use select t.a from table_name t or select * from table_name where ..., the following error happens:
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1414949555870_118360, Tracking URL = N/A
Kill Command = /usr/local/hadoop-2.5.1/bin/hadoop job -kill job_1414949555870_118360
java.io.IOException: Job status not available
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:322)
at org.apache.hadoop.mapreduce.Job.getJobState(Job.java:347)
at org.apache.hadoop.mapred.JobClient$NetworkedJob.getJobState(JobClient.java:295)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.isJobPreparing(HadoopShimsSecure.java:104)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:242)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:541)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:431)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1485)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1263)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1091)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Ended Job = job_1414949555870_118360 with exception java.io.IOException(Job status not available )
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
So, what's wrong with Hive?
The DEBUG info follows; maybe it's useful. Please help me!
15/04/14 16:53:48 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client#60cb201b
15/04/14 16:53:48 DEBUG mapred.ClientServiceDelegate: Failed to contact AM/History for job job_1414949555870_118441 retrying..
java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "slave109":43759; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
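The UnknownHostException suggests the Hive client cannot resolve slave109, the node hosting the MapReduce ApplicationMaster it polls for job status. A first check (a suggestion, not from the original thread; 192.0.2.109 is a placeholder for slave109's real address):
getent hosts slave109
echo "192.0.2.109  slave109" | sudo tee -a /etc/hosts   # only if resolution fails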
