Spring-Batch - MultiFileResourcePartitioner - Caused by: java.net.SocketException: Connection reset - spring

I am using FlatFileItemReader and have extended AbstractResource to return a stream from an Amazon S3 object:
S3Object amazonS3Object = s3client.getObject(new GetObjectRequest(bucket, file));
InputStream stream = amazonS3Object.getObjectContent();
return stream;
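For context, here is a minimal sketch of such a resource, assuming the AWS SDK v1 client and the bucket/file names from the snippet above (my reconstruction, not the poster's actual class):

import java.io.IOException;
import java.io.InputStream;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import org.springframework.core.io.AbstractResource;

// Sketch: a Spring Resource backed by an S3 object.
public class S3ObjectResource extends AbstractResource {

    private final AmazonS3 s3client;
    private final String bucket;
    private final String file;

    public S3ObjectResource(AmazonS3 s3client, String bucket, String file) {
        this.s3client = s3client;
        this.bucket = bucket;
        this.file = file;
    }

    @Override
    public String getDescription() {
        return "S3 object s3://" + bucket + "/" + file;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // Opens a fresh stream per call; note that a slow consumer can still
        // cause the server to reset the connection mid-stream.
        S3Object amazonS3Object = s3client.getObject(new GetObjectRequest(bucket, file));
        return amazonS3Object.getObjectContent();
    }
}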
In my batch job I have also implemented a MultiFileResourcePartitioner, to which I gave the bucket so that it partitions all the files. I am able to read only part of a few files, after which I get a socket reset error. See the relevant pieces of the error below:
.ResourcelessTransactionManager$ResourcelessTransaction@122ba881]
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=9
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:68 - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@252ce07a
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:76 - Chunk execution starting: queue size=0
2015-08-24 23:24:03 DEBUG ResourcelessTransactionManager:367 - Creating new transaction with name [null]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
2015-08-24 23:24:03 DEBUG RepeatTemplate:464 - Starting repeat context.
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=1
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=2
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=3
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=4
2015-08-24 23:24:03 DEBUG DefaultClientConnection:160 - Connection 0.0.0.0:58171<->10.37.135.39:8099 shut down
2015-08-24 23:24:03 DEBUG DefaultClientConnection:176 - Connection 0.0.0.0:58171<->10.37.135.39:8099 closed
Caused by: org.springframework.batch.item.file.NonTransientFlatFileException: Unable to read from resource: [null]
at org.springframework.batch.item.file.FlatFileItemReader.readLine(FlatFileItemReader.java:220)
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:173)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:83)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:133)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:121)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207)
at com.sun.proxy.$Proxy17.read(Unknown Source)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)
... 22 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:196)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
My requirement is to process files with millions of records out of an S3 bucket, and the application runs on AWS. I have configured the S3 client with retries and more open connections, which didn't help much.

As @Michael Minella mentioned, it might be worth using the Spring Cloud AWS project to resolve the resources:
@Autowired
private ResourcePatternResolver resourcePatternResolver;

public void resolveAndLoad() throws IOException {
    Resource[] allTxtFilesInFolder = this.resourcePatternResolver.getResources("s3://bucket/name/*.txt");
    Resource[] allTxtFilesInBucket = this.resourcePatternResolver.getResources("s3://bucket/**/*.txt");
    Resource[] allTxtFilesGlobally = this.resourcePatternResolver.getResources("s3://**/*.txt");
}
And then pass the resources to your MultiFileResourcePartitioner to see whether the exception goes away.
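For instance, the resolved resources could be handed to Spring Batch's built-in MultiResourcePartitioner (a sketch only; I'm assuming your custom MultiFileResourcePartitioner exposes a similar setResources method):

import java.io.IOException;
import org.springframework.batch.core.partition.support.MultiResourcePartitioner;
import org.springframework.context.annotation.Bean;
import org.springframework.core.io.support.ResourcePatternResolver;

@Bean
public MultiResourcePartitioner partitioner(ResourcePatternResolver resolver) throws IOException {
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
    // Each partition's ExecutionContext carries its resource under this key.
    partitioner.setKeyName("fileName");
    partitioner.setResources(resolver.getResources("s3://bucket/name/*.txt"));
    return partitioner;
}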

Related

Jdbi select with insanely long list as parameter to bindList

I have a query where I'd like to retrieve a huge number of rows based on their pk ids, and use select .... where id in (<ids>) via the fluent API in JDBI like this:
jdbi(db).withHandle(handle -> handle.createQuery(SQL).bindList("ids", listOfIds).mapToMap().list());
This works as long as the number of ids doesn't exceed what the database (DB2) can handle in an in-clause. Obviously, in my case, the list of ids gets longer than DB2 can handle. So I split it into many lists in a List<List<Integer>> listOfIdLists and create a List<Map<String, Object>> result.
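For reference, a helper along these lines would do the chopping (my sketch, since the poster's chopListToLists is not shown):

import java.util.ArrayList;
import java.util.List;

// Splits ids into sublists of at most chunkSize elements.
static List<List<Integer>> chopListToLists(List<Integer> ids, int chunkSize) {
    List<List<Integer>> chunks = new ArrayList<>();
    for (int i = 0; i < ids.size(); i += chunkSize) {
        chunks.add(ids.subList(i, Math.min(i + chunkSize, ids.size())));
    }
    return chunks;
}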
Now I have to somehow iterate over listOfIdLists and for each iteration add the result to result. Here is one of many tested variants:
List<Map<String, Object>> result = new ArrayList<>();
List<List<Integer>> listOfIdLists = chopListToLists(ids, 10);
Iterator<List<Integer>> oneChopIterator = listOfIdLists.iterator();
while (oneChopIterator.hasNext()) {
    result.addAll(jdbi(db).withHandle(handle -> handle.createQuery(SQL).bindList("ids", oneChopIterator.next()).mapToMap().list()));
}
Obviously, variants with chops.forEach and try (Handle h = jdbi(db).open()) { /* iterate and addAll */ } have been tried as well.
All of this runs in a Quarkus app, and I get exceptions from Arjuna while iterating. For testing, I can add to result without errors when there is no iteration and I instead just pick the first list in listOfIdLists.
The exception is:
2022-11-02 00:13:15,219 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check processing TX 0:ffff7f000101:925d:6361a7cf:0 in state RUN
2022-11-02 00:13:15,239 WARN [com.arj.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012095: Abort of action id 0:ffff7f000101:925d:6361a7cf:0 invoked while multiple threads active within it.
2022-11-02 00:13:15,242 WARN [com.arj.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012381: Action id 0:ffff7f000101:925d:6361a7cf:0 completed with multiple threads - thread Quarkus Main Thread was in progress with java.base@17.0.4/sun.nio.ch.SocketDispatcher.read0(Native Method)
java.base@17.0.4/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
java.base@17.0.4/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:261)
...
2022-11-02 00:13:15,244 WARN [com.arj.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012108: CheckedAction::check - atomic action 0:ffff7f000101:925d:6361a7cf:0 aborting with 1 threads active!
2022-11-02 00:13:15,315 WARN [io.agr.pool] (Transaction Reaper Worker 0) Datasource 'fs': JDBC resources leaked: 0 ResultSet(s) and 1 Statement(s)
2022-11-02 00:13:15,340 ERROR [no.cen.bat.dbs.Runner] (Quarkus Main Thread) dbsync 2 failed with throwable Unable to advance result set [statement:"select ...
...
2022-11-02 00:13:15,360 ERROR [no.cen.bat.dbs.Runner] (Quarkus Main Thread) org.jdbi.v3.core.result.ResultSetException: Unable to advance result set [statement:"select ...
...
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][10120][10898][4.31.10] Invalid operation: result set is closed. ERRORCODE=-4470, SQLSTATE=null
at com.ibm.db2.jcc.am.b7.a(b7.java:794)
...
2022-11-02 00:13:15,739 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check processing TX 0:ffff7f000101:925d:6361a7cf:0 in state CANCEL
2022-11-02 00:13:15,741 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012378: ReaperElement appears to be wedged: java.base@17.0.4/sun.nio.ch.SocketDispatcher.read0(Native Method)
java.base@17.0.4/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
...
2022-11-02 00:13:16,242 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check processing TX 0:ffff7f000101:925d:6361a7cf:0 in state CANCEL_INTERRUPTED
2022-11-02 00:13:16,244 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012120: TransactionReaper::check worker Thread[Transaction Reaper Worker 0,5,main] not responding to interrupt when cancelling TX 0:ffff7f000101:925d:6361a7cf:0 -- worker marked as zombie and TX scheduled for mark-as-rollback
2022-11-02 00:13:21,478 WARN [com.arj.ats.arjuna] (Transaction Reaper) ARJUNA012110: TransactionReaper::check successfuly marked TX 0:ffff7f000101:925d:6361a7cf:0 as rollback only
2022-11-02 00:13:21,479 WARN [com.arj.ats.arjuna] (Quarkus Main Thread) ARJUNA012077: Abort called on already aborted atomic action 0:ffff7f000101:925d:6361a7cf:0
2022-11-02 00:13:21,480 WARN [com.arj.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012113: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] missed interrupt when cancelling TX 0:ffff7f000101:925d:6361a7cf:0 -- exiting as zombie (zombie count decremented to 0)
This runs on Java 17 with the latest Quarkus (2.13.3) and the IBM drivers that come with quarkus-jdbc-db2; Jdbi version 3.34.0. Not running a native image.
The reason for the parameterized jdbi(db) is that the application has two datasources. It replaces a DB job where two databases are linked and rows are copied from one database to the other with statements like insert into db1.mytable(a,b,c...) select a,b,c ... from db2.mytable where not exists (...);
The source DB is on Linux and the target on z/OS. The app runs on Ubuntu 20.04.
So basically, the code retrieves all pk ids from each table in each database, uses CollectionUtils.subtract(list1, list2) to identify the ids missing from the target table, and then uses the resulting list of ids to retrieve the full rows via a select ... from ... where id in (<ids>) query as described above. The resulting list of Map<String, Object> rows would then be inserted into the other table where they are missing.
The question is; How to get this working without exceptions? I can brute-force this by deleting and inserting all rows, but I'd rather not.
The stacktrace element:
2022-11-02 00:13:15,244 WARN [com.arj.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012108: CheckedAction::check - atomic action 0:ffff7f000101:925d:6361a7cf:0 aborting with 1 threads active!
shows the Narayana Transaction Reaper thread, which is responsible for timing out transactions, attempting to roll back the transaction. The text "atomic action 0:ffff7f000101:925d:6361a7cf:0 aborting with 1 threads active!" is saying that an application thread is still running with the transaction context bound to it. This is normal application behavior, and the usual fix for this problem is to either extend the timeout, do less work inside the transaction, or add more compute/networking resources to the setup.
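For the extend-the-timeout option, Quarkus lets you raise the default via quarkus.transaction-manager.default-transaction-timeout in application.properties, or per method with @TransactionConfiguration. A minimal sketch (the class and method names are illustrative, not the poster's code):

import io.quarkus.narayana.jta.runtime.TransactionConfiguration;
import javax.enterprise.context.ApplicationScoped;
import javax.transaction.Transactional;

@ApplicationScoped
public class DbSyncRunner { // illustrative name

    @Transactional
    @TransactionConfiguration(timeout = 600) // seconds; the default is 60
    public void syncMissingRows() {
        // iterate over the chopped id lists and insert the missing rows here
    }
}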

apache nifi Stateless - not able to set parm of controller service (DBCPConnectionPool 1.10.0)

I am following the NiFi 1.10 stateless guideline to create a simple process group that executes a SQL statement against a MySQL DB. I have put the necessary DB controller service parameters into a parameter context.
It works well on the NiFi canvas. Then I add it to the registry and prepare a JSON parameter file, stateless-simpledb.json:
{
  "registryUrl": "http://localhost:18080",
  "bucketId": "cac8f127-e328-45c1-a4cb-0e03dc837ceb",
  "flowId": "cc2753f2-78f3-4449-a2fd-343dfeaafe15",
  "flowVersion": "3",
  "parameters": {
    "lastIngestId": "20000",
    "mysql-jdbc-driver-name": "com.mysql.jdbc.Driver",
    "db-user": "root",
    "db-password": "password",
    "db-con-url": "jdbc:mysql://localhost:3306/mms",
    "jdbc-jar-path": "/program/jdbc/mysql-connector-java.jar"
  }
}
and run the one-off command:
/program/nifi/bin/nifi.sh stateless RunFromRegistry Once --file /app/poc/nifi-stateless/conf/stateless-simpledb.json
It raises an error:
=== FlowFileRepository Type ===
org.apache.nifi.controller.repository.RocksDBFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.VolatileFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
=== End FlowFileRepository types ===
23:32:32.626 [main] INFO org.apache.nifi.stateless.bootstrap.ExtensionDiscovery - Successfully discovered extensions in 4411 milliseconds
23:32:32.633 [main] DEBUG org.apache.nifi.stateless.core.ComponentFactory - Setting context class loader to org.apache.nifi.nar.InstanceClassLoader@50fa5938 (parent = org.apache.nifi.nar.NarClassLoader[/program/nifi-1.10.0/work/stateless-nars/nifi-dbcp-service-nar-1.10.0.nar-unpacked]) to create org.apache.nifi.dbcp.DBCPConnectionPool
23:32:32.647 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input #{jdbc-jar-path} found 1 Parameter references: [org.apache.nifi.parameter.StandardParameterReference@2d3eecda]
23:32:32.650 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input /program/jdbc/mysql-connector-java.jar found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 500 millis found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 0 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 30 mins found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.bootstrap.RunStatelessNiFi.main(RunStatelessNiFi.java:69)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.StatelessNiFi.main(StatelessNiFi.java:103)
... 5 more
Caused by: java.lang.RuntimeException: Failed to enable Controller Service {id=691ecc97-ff46-3a5e-8aad-37dc568bc247, name=MYSQL-MMS-stateless-test, type=class org.apache.nifi.dbcp.DBCPConnectionPool} because validation failed: ['Database Connection URL' is invalid because Database Connection URL is required, 'Database Driver Class Name' is invalid because Database Driver Class Name is required]
at org.apache.nifi.stateless.core.StatelessControllerServiceLookup.enableControllerServices(StatelessControllerServiceLookup.java:133)
at org.apache.nifi.stateless.core.StatelessFlow.<init>(StatelessFlow.java:153)
at org.apache.nifi.stateless.core.StatelessFlow.createAndEnqueueFromJSON(StatelessFlow.java:469)
at org.apache.nifi.stateless.runtimes.Program.runLocal(Program.java:133)
at org.apache.nifi.stateless.runtimes.Program.launch(Program.java:67)
... 10 more
It seems the Apache NiFi stateless runtime failed to set the controller service parameters even though the service is in "process group" scope.
Does anyone have any advice?
As mentioned in the comments, this appears to be a known problem with the validation of controller services.
This can be avoided by using NiFi 1.12 and above, as it was fixed in the following JIRA: https://issues.apache.org/jira/plugins/servlet/mobile#issue/NIFI-7380
Though I am not entirely sure of this, it may also simply be that your controller service is not configured correctly. That would be worth double-checking.

Exception 'Cannot get a connection, pool error Timeout waiting for idle object' when using 'DBCPConnectionPoolLookup' service in Nifi

I'm trying to use the 'DBCPConnectionPoolLookup' service in 'ExecuteGroovyScript' to dynamically query the required database based on the 'database.name' attribute of the input flow file.
The processor successfully gets the corresponding 'DBCPConnectionPool' service for querying, but I get the exception java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object. In contrast, if I use the 'DBCPConnectionPool' service directly, without the 'Lookup' service and without changing any configuration, it works fine.
I access the service as follows:
def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes())
Then use the 'clientDb' object to query as:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx ->numRowsTimeSeries= row.c}
I have tried increasing the values of Max Wait Time and Max Total Connections to higher values in 'DBCPConnectionPool' service, it does not help.
Please find below links to images of the code, error, and configuration:
Exception
Configuration of 'ExecuteGroovyScript'
Configuration of 'DBCPConnectionPool' service
Configuration of 'DBCPConnectionPoolLookup' service
Script Code
import org.apache.nifi.distributed.cache.client.Deserializer
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.exception.DeserializationException
import org.apache.nifi.distributed.cache.client.exception.SerializationException
import groovy.sql.Sql
import java.time.*

try {
    def flowFile = session.get()
    def isBootstrap = flowFile."isBootstrap"
    def timseriesSqlQuery = 'SELECT id FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def timseriesSqlCountQuery = 'SELECT count(id) as c FROM [dbo].[Points] where ([MappedToEquipment] = \'Mapped\' or PointStatus = \'Mapped\')'
    def spaceSqlQuery = 'select id from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def spaceSqlCountQuery = 'select count(id) as c from (select id from dbo.organization union select id from dbo.facility union select id from dbo.building union select id from dbo.floor union select id from dbo.wing union select id from dbo.room union select id from dbo.systems) tmp'
    def cache = CTL.lastIngestTimeMap
    def clientDb = CTL.SQLLookupService.getConnection(flowFile.getAttributes()) // SQL.staticService
    int numRowsTimeSeries = 0
    int numRowsSpace = 0
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
    clientDb.rows(spaceSqlCountQuery).eachWithIndex { row, idx -> numRowsSpace = row.c }
} catch (Exception e) {
    // catch added so the try block compiles; the original script is truncated here
    log.error('query failed', e)
}
Exception from Nifi logs
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-3] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=b435c079-ee6c-3c42-a6ea-020968267ecf] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 ERROR [Timer-Driven Process Thread-9] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] failed to process session due to java.lang.ClassCastException; Processor Administratively Yielded for 1 sec: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,629 WARN [Timer-Driven Process Thread-9] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ExecuteGroovyScript[id=9b81ca15-93a5-3953-9f40-d0874cfe2531] due to uncaught Exception: java.lang.ClassCastException
java.lang.ClassCastException: null
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-10] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=a1ec4496-dca3-38ab-a47b-43d7ff95e40f] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
2019-09-12 06:18:33,708 ERROR [Timer-Driven Process Thread-2] o.a.n.p.groovyx.ExecuteGroovyScript ExecuteGroovyScript[id=54d1e251-88f2-33f3-0489-722879a802bd] org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object: org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:308)
at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:84)
at com.sun.proxy.$Proxy89.getConnection(Unknown Source)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onInitSQL(ExecuteGroovyScript.java:339)
at org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:439)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:305)
... 19 common frames omitted
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:451)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
... 21 common frames omitted
Finally, after bringing down NiFi twice, I have found the solution. The problem was in the code I was using: I queried the SQL table with the object returned by CTL.index.getConnection(flowFile.getAttributes()), but that object is actually a connection, not a groovy.sql.Sql instance. Because of this, NiFi used up all available connections to SQL, so even when I reverted to using the 'DBCPConnectionPool' service instead of the 'Lookup' service I was still getting the above error. Whenever I restarted NiFi it would work fine again for a while.
The actual code to use in your script with the 'Lookup' service is:
def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
def clientDb = new Sql(connectionObj)
Now use the 'clientDb' object to query your table:
clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
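One detail worth adding (my note, not part of the original answer): the connection borrowed from the lookup service should be returned to the pool when the script is done, otherwise the pool can still run dry over time:

def connectionObj = CTL.index.getConnection(flowFile.getAttributes())
try {
    def clientDb = new Sql(connectionObj)
    clientDb.rows(timseriesSqlCountQuery).eachWithIndex { row, idx -> numRowsTimeSeries = row.c }
} finally {
    connectionObj.close() // for a DBCP pooled connection, close() returns it to the pool
}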

Spring-XD Curator Connection Timeout

In Spring-XD the Curator Connection times out:
WARN ConnectionStateManager-0 curator.ConnectionState - Connection
attempt unsuccessful after 63021 (greater than max timeout of 60000).
Resetting connection and trying again with a new connection.
Curator tries to re-establish the connection but fails. Please check the logs below. Has anyone faced a similar issue? Please let me know if you know of any way to resolve the issue or of any workarounds.
Also, the default Curator connection timeout is 60000 ms. Is there a way to increase it? Does Spring XD expose a property that can be set?
2014-12-10 01:24:41,003 WARN ConnectionStateManager-0
server.ContainerRegistrar - >>> disconnected container:
1c8a234d-4b8d-4d65-b374-xxxxe8619 2014-12-10 01:24:41,004 INFO
DeploymentsPathChildrenCache-0 server.ContainerRegistrar - Path cache
event: null, type: CONNECTION_SUSPENDED 2014-12-10 01:24:41,005 INFO
ConnectionStateManager-0 server.ContainerRegistrar - Undeploying
module [ModuleDescriptor@350920b1 moduleName = 'rabbit', moduleLabel =
'rabbit', group = 'xxx-ingestion-2', sourceChannelName = [null],
sinkChannelName = [null], index = 0, type =
source, parameters = map['vhost' -> 'xxx_virtual_host', 'requeue' ->
'false', 'outputType' -> 'text/plain', 'queues' -> 'xx.xxx.queue',
'addresses' -> 'xxxmq.xx.xxxx.com'], children = list[[empty]]]
2014-12-10 01:24:46,022 ERROR pool-22-thread-1
connection.CachingConnectionFactory - Channel shutdown: clean
connection shutdown; protocol method:
method<connection.close>(reply-code=200, reply-text=OK, class-id=0, method-id=0)
2014-12-10 01:24:56,007 ERROR CuratorFramework-0
curator.ConnectionState - Connection timed out for connection string
(514.xx.93.xxx:2181,504.58.xxx.xx:2181) and timeout (15000) / elapsed
(15004) org.apache.curator.CuratorConnectionLossException:
KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:793)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:779)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:58)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:265)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744) 2014-12-10 01:24:56,161
ERROR main-EventThread curator.ConnectionState - Connection timed out
for connection string (514.xx.93.xxx:2181,504.58.xxx.xx:2181) and
timeout (15000) / elapsed (15159)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:474)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:287)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
at org.springframework.xd.dirt.server.ContainerRegistrar$StreamModuleWatcher.process(ContainerRegistrar.java:744)
at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2014-12-10 01:25:03,014 ERROR CuratorFramework-0 imps.CuratorFrameworkImpl - Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:793)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:779)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:58)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:265)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Is this reproducible? Are you running in clustered or single node mode?
The Curator connection timeout (in milliseconds) can be set via the system property curator-default-connection-timeout.
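For example, when launching the container (a sketch; I'm assuming the xd-container script honors JAVA_OPTS, otherwise pass the flag directly on the java command line):

# raise the Curator connection timeout to 120 seconds (the value is in milliseconds)
export JAVA_OPTS="-Dcurator-default-connection-timeout=120000"
xd/bin/xd-container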

apache thrift transport TTransportException

Hive version: 0.13.1
Pig version: 0.13.0
I was trying to read Hive tables using Pig with the command below:
grunt> DATA = LOAD 'dev.profile' USING org.apache.hcatalog.pig.HCatLoader();
I get the following log:
2014-07-16 22:44:58,986 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
2014-07-16 22:44:59,037 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:10000
2014-07-16 22:44:59,057 [main] INFO hive.metastore - Connected to metastore.
2014-07-16 22:45:02,019 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
2014-07-16 22:45:02,166 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
When I describe the relation, the results come back as expected.
grunt> describe DATA
2014-07-16 22:46:42,189 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
DATA: {name: chararray,age: int,salary: int}
but when I dump the data I get a SocketTimeoutException:
2014-07-16 22:47:25,146 [main] ERROR hive.log - Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:276)
at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:146)
at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:548)
at org.apache.hcatalog.pig.PigHCatUtil.getHiveMetaClient(PigHCatUtil.java:158)
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:200)
at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:195)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:885)
at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
at org.apache.pig.PigServer.storeEx(PigServer.java:1004)
at org.apache.pig.PigServer.store(PigServer.java:974)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:542)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 40 more
2014-07-16 22:47:25,148 [main] ERROR hive.log - Converting exception to MetaException
2014-07-16 22:47:25,151 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:10000
2014-07-16 22:47:25,152 [main] INFO hive.metastore - Connected to metastore.
2014-07-16 22:47:45,173 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Failed to parse: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader@1342464f
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:198)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
at org.apache.pig.PigServer.storeEx(PigServer.java:1004)
at org.apache.pig.PigServer.store(PigServer.java:974)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:542)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader@1342464f
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:91)
at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:885)
at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 17 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:179)
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
... 24 more
Caused by: java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:205)
at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:195)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
... 25 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
at org.apache.hcatalog.common.HCatUtil.getTable(HCatUtil.java:194)
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:201)
... 27 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 37 more
2014-07-16 22:47:45,176 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Even though I am able to connect to the metastore, I am not able to retrieve the data. What could be the reason for the read failure?
At times the process also fails with java.lang.OutOfMemoryError: Java heap space.
Any help would be greatly appreciated.
Edit the hive-site.xml.
Replace hive.metastore.ds.retry with hive.hmshandler.retry:
vim /usr/local/Cellar/hive/0.13.1/libexec/conf/hive-site.xml
:%s/hive.metastore.ds.retry/hive.hmshandler.retry/g
:wq
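If you prefer to set the renamed properties explicitly, a hedged example with illustrative values (hive.metastore.client.socket.timeout is my addition and may also help with the read timeout itself):

<property>
  <name>hive.hmshandler.retry.attempts</name>
  <value>5</value>
</property>
<property>
  <name>hive.hmshandler.retry.interval</name>
  <value>2000</value>
</property>
<property>
  <!-- assumption: raise the metastore client read timeout (seconds in Hive 0.13) -->
  <name>hive.metastore.client.socket.timeout</name>
  <value>300</value>
</property>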
