I have a Neo4j database with about 130K nodes and roughly 17M relationships. My computer runs Windows 10 and has 16GB of RAM, of which at most 10GB is allocated to the Neo4j-shell heap.
I want to run a query using the neo4j-shell from the command prompt and redirect the results to a csv file. The command I’m using for that is:
Neo4jShell -v -file query.cql > results.csv
Where the query is of the form:
MATCH (subject)-[:type1]->(intNode1:label)<-[:type2]-(intNode2:label)<-[:type3]-(object) RETURN subject.property1, object.property1;
The problem is that whenever I run this query, I get an OutOfMemory error (see error message at the bottom).
Does anyone have advice for how to run a query like this successfully? I feel like 10GB of RAM should be enough for a query like this given the size of the graph DB.
The error message I get is:
ERROR (-v for expanded information):
Error occurred in server thread; nested exception is:
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.rmi.ServerError: Error occurred in server thread; nested exception is:
java.lang.OutOfMemoryError: GC overhead limit exceeded
at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
at sun.rmi.transport.Transport$2.run(Unknown Source)
at sun.rmi.transport.Transport$2.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(Unknown Source)
at sun.rmi.transport.StreamRemoteCall.executeCall(Unknown Source)
at sun.rmi.server.UnicastRef.invoke(Unknown Source)
at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(Unknown Source)
at java.rmi.server.RemoteObjectInvocationHandler.invoke(Unknown Source)
at com.sun.proxy.$Proxy1.interpretLine(Unknown Source)
at org.neo4j.shell.impl.AbstractClient.evaluate(AbstractClient.java:149)
at org.neo4j.shell.impl.AbstractClient.evaluate(AbstractClient.java:133)
at org.neo4j.shell.StartClient.executeCommandStream(StartClient.java:393)
at org.neo4j.shell.StartClient.grabPromptOrJustExecuteCommand(StartClient.java:372)
at org.neo4j.shell.StartClient.startRemote(StartClient.java:330)
at org.neo4j.shell.StartClient.start(StartClient.java:196)
at org.neo4j.shell.StartClient.main(StartClient.java:135)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
The solution when I had this problem was to add these lines at the end of neo4j.properties:
# Default values for the low-level graph engine
cache_type=none
neostore.nodestore.db.mapped_memory=50M
neostore.relationshipstore.db.mapped_memory=500M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M
Also try to increase the Java heap in neo4j-wrapper.conf by uncommenting and raising these lines:
#wrapper.java.initmemory=2048
#wrapper.java.maxmemory=2048
One more thing: do you have an index on the nodes that you query?
You can provide more heap for Neo4jShell (set the JAVA_OPTS environment variable, e.g. JAVA_OPTS=-Xmx4G -Xms4G -Xmn1G).
Did you try to run your query with PROFILE? I presume you spin up billions of paths, as you don't have any restrictions.
You missed the labels on subject and object, which causes the query planner to run a full graph scan.
MATCH (subject:label)-[:type1]->(intNode1:label)
<-[:type2]-(intNode2:label)
<-[:type3]-(object:label)
WITH distinct subject, object
RETURN subject.property1, object.property1;
You should also reduce the intermediate cardinality as well as the output cardinality:
MATCH (subject:label)-[:type1]->(intNode1:label)
<-[:type2]-(intNode2:label)
WITH distinct subject, intNode2
MATCH (intNode2)<-[:type3]-(object:label)
WITH distinct subject, object
RETURN subject.property1, object.property1;
even better would be:
MATCH (subject:label)-[:type1]->(intNode1:label)
<-[:type2]-(intNode2:label)
WITH intNode2, collect(distinct subject) as subjects
MATCH (intNode2)<-[:type3]-(object:label)
WITH distinct object, subjects
UNWIND subjects as subject
RETURN subject.property1, object.property1;
Related
The Elasticsearch server doesn't start on a new node. It fails with the following error:
[2019-06-27T00:16:01,471][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-10] fatal error in thread [main], exiting
java.lang.ExceptionInInitializerError: null
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_212]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_212]
at org.elasticsearch.painless.Definition.addStruct(Definition.java:753) ~[?:?]
at org.elasticsearch.painless.Definition.<init>(Definition.java:566) ~[?:?]
at org.elasticsearch.painless.PainlessScriptEngine.<init>(PainlessScriptEngine.java:106) ~[?:?]
at org.elasticsearch.painless.PainlessPlugin.getScriptEngine(PainlessPlugin.java:59) ~[?:?]
at org.elasticsearch.script.ScriptModule.<init>(ScriptModule.java:69) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.node.Node.<init>(Node.java:327) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.node.Node.<init>(Node.java:246) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:323) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.2.2.jar:6.2.2]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[elasticsearch-6.2.2.jar:6.2.2]
Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
at java.time.chrono.JapaneseEra.<clinit>(JapaneseEra.java:179) ~[?:1.8.0_212]
... 19 more
I have a 5-node cluster already running in production in GCP. Since the load has increased, I tried to add a few more nodes to that cluster. To create the new nodes, I used the "Create similar" option provided by GCP. I updated all configurations such as the node name, deleted the /var/lib/elasticsearch/nodes folder, and tried to start the ES server on the new node, but it always fails with the error mentioned above.
The node uses OpenJDK 1.8. I enabled trace logging for the root logger, but I couldn't identify what is wrong with the new node.
Please help me identify the root cause of this problem.
Theory explanation:
public class ArrayIndexOutOfBoundsException extends IndexOutOfBoundsException
Thrown to indicate that an array has been accessed with an illegal index. The index is either negative or greater than or equal to the size of the array.
How to avoid:
Always remember that array is zero-based index, the first element is at 0th index and the last element is at length - 1 index.
So, for example, accessing index 0 of an empty (zero-length) array gives you a java.lang.ArrayIndexOutOfBoundsException: 0 error in Java.
In your case it is Caused by: java.lang.ArrayIndexOutOfBoundsException: 4, i.e. index 4 was accessed in an array with 4 or fewer elements.
You should always pay attention to off-by-one errors while looping over an array in Java. Programmers often make mistakes that result in either skipping the first or last element, or running one index past the end of the array, by using the <, >, >= or <= operators incorrectly in for loops.
Give special attention to the start and end conditions of the loop.
Add bounds checks (if/else blocks) where the index might go out of range.
As an illustration, here is a minimal, generic example that reproduces the same type of error (a simple off-by-one loop, not your actual JapaneseEra code):
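public class ReproduceArrayIndexOutOfBounds {
    public static void main(String[] args) {
        int[] values = new int[4];                    // valid indexes are 0..3
        for (int i = 0; i <= values.length; i++) {    // bug: should be i < values.length
            // when i reaches 4 this line throws java.lang.ArrayIndexOutOfBoundsException: 4
            System.out.println(values[i]);
        }
    }
}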
I'm working on a custom load function to load data from Bigtable using Pig on Dataproc. I compile my Java code against the list of jar files below, which I grabbed from Dataproc. When I run the Pig script below, it fails when it tries to establish a connection with Bigtable.
The error message is:
Bigtable does not support managed connections.
Questions:
Is there a work around for this problem?
Is this a known issue and is there a plan to fix or adjust?
Is there a different way of implementing multi scans as a load function for Pig that will work with Bigtable?
Details:
Jar files:
hadoop-common-2.7.3.jar
hbase-client-1.2.2.jar
hbase-common-1.2.2.jar
hbase-protocol-1.2.2.jar
hbase-server-1.2.2.jar
pig-0.16.0-core-h2.jar
Here's a simple Pig script using my custom load function:
%default gte '2017-03-23T18:00Z'
%default lt '2017-03-23T18:05Z'
%default SHARD_FIRST '00'
%default SHARD_LAST '25'
%default GTE_SHARD '$gte\_$SHARD_FIRST'
%default LT_SHARD '$lt\_$SHARD_LAST'
raw = LOAD 'hbase://events_sessions'
USING com.eduboom.pig.load.HBaseMultiScanLoader('$GTE_SHARD', '$LT_SHARD', 'event:*')
AS (es_key:chararray, event_array);
DUMP raw;
My custom load function HBaseMultiScanLoader creates a list of Scan objects to perform multiple scans on different ranges of data in the table events_sessions determined by the time range between gte and lt and sharded by SHARD_FIRST through SHARD_LAST.
HBaseMultiScanLoader extends org.apache.pig.LoadFunc so it can be used in the Pig script as load function.
When Pig runs my script, it calls LoadFunc.getInputFormat().
My implementation of getInputFormat() returns an instance of my custom class MultiScanTableInputFormat which extends org.apache.hadoop.mapreduce.InputFormat.
MultiScanTableInputFormat initializes an org.apache.hadoop.hbase.client.HTable object to set up the connection to the table.
Digging into the hbase-client source code, I see that org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal() calls org.apache.hadoop.hbase.client.ConnectionManager.createConnection() with the attribute “managed” hardcoded to “true”.
You can see from the stack trace below that my code (MultiScanTableInputFormat) tries to initialize an HTable object, which invokes getConnectionInternal(), which does not provide an option to set managed to false.
Going down the stack trace, you get to AbstractBigtableConnection, which does not accept managed=true and therefore causes the connection to Bigtable to fail.
Here’s the stack trace showing the error:
2017-03-24 23:06:44,890 [JobControl] ERROR com.turner.hbase.mapreduce.MultiScanTableInputFormat - java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:431)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:424)
at org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal(ConnectionManager.java:302)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:185)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:151)
at com.eduboom.hbase.mapreduce.MultiScanTableInputFormat.setConf(Unknown Source)
at com.eduboom.pig.load.HBaseMultiScanLoader.getInputFormat(Unknown Source)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:264)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 26 more
Caused by: java.lang.IllegalArgumentException: Bigtable does not support managed connections.
at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:123)
at com.google.cloud.bigtable.hbase1_2.BigtableConnection.<init>(BigtableConnection.java:55)
... 31 more
The original problem was caused by the use of outdated and deprecated HBase client jars and classes.
I updated my code to use the newest HBase client jars provided by Google and the original problem was fixed.
I still get stuck with some ZK issue that I still did not figure out, but that's a conversation for a different question.
This one is answered!
I have encountered the same error message:
Bigtable does not support managed connections.
However, according to my research, the root cause is that the HTable class cannot be constructed explicitly. After changing the code to obtain the table via connection.getTable(...), the problem was resolved.
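For illustration, a minimal sketch of that approach, assuming the hbase-client 1.x / bigtable-hbase API and the events_sessions table from the question (adapt this to wherever your InputFormat currently calls new HTable(...)):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;

public class UnmanagedConnectionExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // ConnectionFactory creates an unmanaged connection, which Bigtable accepts,
        // instead of the managed connection that new HTable(...) requests internally.
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("events_sessions"))) {
            // use table.getScanner(scan), table.get(get), etc.
        }
    }
}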
When running a load test for 50 users with a steady-state load of 15 minutes, the samples do not go into the next loop. In other words, with a load of 50 users, the first 50 samples in the sample table show no errors, but all the requests after that fail.
On logout we receive an authentication token:
BDT3-CHE8-GKA5-BWA1%7Cd67830e7c46bc1011d76e69de76c59c57c4f5956%7Clin
and in the previous requests the token is
BDT3-CHE8-GKA5-BWA1|d67830e7c46bc1011d76e69de76c59c57c4f5956|lin
I noticed that the pipe (|) character in the previous token is replaced by %7C.
Also, the session ID is generated on the URL launch page, but it is not captured in JMeter parameters and not used in further requests.
Please provide more insight into this issue, or a possible solution for how to decode the token so it can be passed to the next request.
The exception on the logout page is:
java.net.URISyntaxException: Illegal character in query at index 113: http://www.siteunderprogress.com/secure/WorkflowUIDispatcher.jspa?id=17116&action=11&atl_token=BDT3-CHE8-GKA5-BWA1|d67830e7c46bc1011d76e69de76c59c57c4f5956|lin&decorator=dialog&inline=true&_=1422286605586
at java.net.URI$Parser.fail(Unknown Source)
at java.net.URI$Parser.checkChars(Unknown Source)
at java.net.URI$Parser.parseHierarchical(Unknown Source)
at java.net.URI$Parser.parse(Unknown Source)
at java.net.URI.<init>(Unknown Source)
at java.net.URL.toURI(Unknown Source)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:283)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:74)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1141)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1130)
at org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:431)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:258)
at java.lang.Thread.run(Unknown Source)
You have to extract the token from wherever you are most comfortable (usually from the page) and then use it in all 3 places.
Use a Regular Expression Extractor post-processor. See:
http://jmeter.apache.org/usermanual/component_reference.html#Regular_Expression_Extractor
When inserting it into the URL, remember to URL-encode it.
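For example, a plain-Java sketch of the encoding the sampler expects (in JMeter itself you can also wrap the extracted variable in the __urlencode function, if your version provides it):

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeToken {
    public static void main(String[] args) throws Exception {
        String token = "BDT3-CHE8-GKA5-BWA1|d67830e7c46bc1011d76e69de76c59c57c4f5956|lin";
        // The pipe characters become %7C, which is the form a URL query parameter needs.
        String encoded = URLEncoder.encode(token, StandardCharsets.UTF_8.name());
        System.out.println(encoded);
        // prints: BDT3-CHE8-GKA5-BWA1%7Cd67830e7c46bc1011d76e69de76c59c57c4f5956%7Clin
    }
}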
We are testing our Spring Integration application from within Eclipse.
The application polls a DB, uses a rowMapper to create domain objects, these are then marshalled into XML(using Castor), and the XML is sent as a message downstream.
The application works well, but at some point we get heap errors.
The VM arguments are set as follows: -Xms512M -Xmx1024M
It appears that the OutOfMemoryErrors are thrown when the poller has to handle very large result sets.
This is the configuration of the poller:
<task:executor id="inboundAdapterPollerPool" pool-size="1000" queue-capacity="100000" />
<int-jdbc:inbound-channel-adapter id="jdbcPollingChannelAdapter"
channel="input" auto-startup="true"
query="${sql}" row-mapper="entryRowMapper" update=""
select-sql-parameter-source="timeRangeSqlParameterSource"
jdbc-operations="myJdbcTemplate">
<int:poller fixed-rate="50" task-executor="inboundAdapterPollerPool" error-channel="error" >
<int:transactional transaction-manager="transactionManager" isolation="DEFAULT" timeout="-1" synchronization-factory="syncFactory"/>
</int:poller>
</int-jdbc:inbound-channel-adapter>
The following are the two exceptions thrown during different executions:
INFO org.springframework.integration.endpoint.SourcePollingChannelAdapter: started jdbcPollingChannelAdapter
INFO org.springframework.context.support.DefaultLifecycleProcessor: Starting beans in phase -2147483648
INFO org.springframework.context.support.DefaultLifecycleProcessor: Starting beans in phase 2147483647
ERROR org.springframework.integration.handler.LoggingHandler: org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor#1ac364e] did not accept task: org.springframework.integration.util.ErrorHandlingTaskExecutor$1#1215a83
at org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.execute(ThreadPoolTaskExecutor.java:244)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:49)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$Poller.run(AbstractPollingEndpoint.java:231)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.RejectedExecutionException
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
at org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.execute(ThreadPoolTaskExecutor.java:241)
... 12 more
and
INFO org.springframework.integration.endpoint.SourcePollingChannelAdapter: started jdbcPollingChannelAdapter
INFO org.springframework.context.support.DefaultLifecycleProcessor: Starting beans in phase -2147483648
INFO org.springframework.context.support.DefaultLifecycleProcessor: Starting beans in phase 2147483647
ERROR org.springframework.transaction.support.TransactionSynchronizationUtils: TransactionSynchronization.afterCompletion threw exception
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at org.springframework.integration.message.GenericMessage.toString(GenericMessage.java:82)
at java.lang.String.valueOf(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at org.springframework.integration.transaction.ExpressionEvaluatingTransactionSynchronizationProcessor.doProcess(ExpressionEvaluatingTransactionSynchronizationProcessor.java:107)
at org.springframework.integration.transaction.ExpressionEvaluatingTransactionSynchronizationProcessor.processAfterRollback(ExpressionEvaluatingTransactionSynchronizationProcessor.java:99)
at org.springframework.integration.transaction.DefaultTransactionSynchronizationFactory$DefaultTransactionalResourceSynchronization.afterCompletion(DefaultTransactionSynchronizationFactory.java:93)
at org.springframework.transaction.support.TransactionSynchronizationUtils.invokeAfterCompletion(TransactionSynchronizationUtils.java:168)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.invokeAfterCompletion(AbstractPlatformTransactionManager.java:993)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerAfterCompletion(AbstractPlatformTransactionManager.java:968)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:872)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:410)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:114)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy15.call(Unknown Source)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$Poller$1.run(AbstractPollingEndpoint.java:236)
at org.springframework.integration.util.ErrorHandlingTaskExecutor$1.run(ErrorHandlingTaskExecutor.java:52)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
ERROR org.springframework.transaction.support.TransactionSynchronizationUtils: TransactionSynchronization.afterCompletion threw exception
java.lang.OutOfMemoryError: Java heap space
ERROR org.springframework.transaction.support.TransactionSynchronizationUtils: TransactionSynchronization.afterCompletion threw exception
java.lang.OutOfMemoryError: Java heap space
Is there anything obviously wrong with the adapter or poller configuration, or what could I do to overcome the memory issues?
Thanks very much
Well, your app clearly can't keep up with the pace of polling every 50ms.
You could try setting the rejection policy to CALLER_RUNS, but that will cause messages to be processed out of order (when the queue is full, the poller thread itself will process the new messages); at least it will throttle the poller.
You could also write a custom RejectedExecutionHandler that blocks the poller thread until there's space in the queue.
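A minimal sketch of such a handler (the class name is illustrative; only the standard java.util.concurrent API is assumed):

import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

// Blocks the submitting (poller) thread until the executor's queue has room,
// instead of throwing a RejectedExecutionException.
public class BlockingCallerPolicy implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        try {
            executor.getQueue().put(task);   // waits for free space in the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("Interrupted while waiting to enqueue task", e);
        }
    }
}

Note that the task:executor namespace only exposes the standard rejection policies, so to plug in a handler like this you would typically define the executor as a ThreadPoolTaskExecutor bean and set its rejectedExecutionHandler property.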
That said, your queue size is rather large.
Probably the easiest fix is to reduce the poller frequency to a value that your downstream processes can keep up with.
This seems to be an issue with your task executor configuration, where you have configured the thread pool and queue with very large values (100000).
I suggest tuning it with smaller values to avoid memory issues, for example:
<task:executor id="inboundAdapterPollerPool" pool-size="100-500" queue-capacity="1000" rejection-policy="CALLER_RUNS" />
By default, the rejection policy is AbortPolicy, which throws an exception if the queue is full. However, if you need to throttle the tasks under heavy load, you can use CallerRunsPolicy. This allows the executor to "catch up" on the tasks it is handling and thereby free up some capacity in the queue, in the pool, or both.
In our Hadoop setup, when a datanode crashes (or Hadoop doesn't respond on the datanode), the reduce task fails because it is unable to read from the failed node (exception below). I thought Hadoop handles datanode failures, and that this is one of the main purposes of Hadoop. Is anybody facing a similar problem with their clusters? If you have a solution, please let me know.
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1547)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1483)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1391)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1302)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1234)
When a task of a MapReduce job fails, Hadoop will retry it on another node.
You can take a look at the JobTracker (:50030/jobtracker.jsp) and see the blacklisted nodes (nodes that have problems with their keep-alive), or drill into a running/completed job and see the number of killed tasks/retries as well as dead nodes, decommissioned nodes, etc.
I've had a similar problem on a cluster where executing tasks failed on some nodes due to "out of memory" problems. They were definitely restarted on other nodes. The computation eventually failed because it was badly designed, causing all nodes to run out of memory, and eventually the threshold for cancelling the job was reached.