JMeter loop count - StackOverflowError: null

I have the following test plan
thread-group ( threads: "1", loops: "-1", ramp-up: "1" )
http request defaults (i.e url)
csv data set config (containing 4 lines of input data)
user defined variables (initialize variables e.g isLast=false)
simple controller
while controller
http request
json extractor
json extractor
JSR223 PostProcessor
The JSON extractors and the post-processor are there because I want to handle pagination and increment the page number accordingly.
The first issue is that although I set up the thread group to loop forever, the thread does not run again once it finishes, and I get the following error:
2021-08-27 14:15:38,686 INFO o.a.j.e.J.increment page: isLast = true
2021-08-27 14:15:38,738 INFO o.a.j.t.JMeterThread: Thread finished: dba-data-exporter-users 1-1
2021-08-27 14:15:38,738 ERROR o.a.j.JMeter: Uncaught exception in thread Thread[dba-data-exporter-users 1-1,6,main]
java.lang.StackOverflowError: null
at java.lang.Module.isExported(Module.java:456) ~[?:?]
at jdk.internal.reflect.Reflection.verifyModuleAccess(Reflection.java:212) ~[?:?]
at jdk.internal.reflect.Reflection.verifyMemberAccess(Reflection.java:125) ~[?:?]
at java.lang.reflect.AccessibleObject.slowVerifyAccess(AccessibleObject.java:633) ~[?:?]
at java.lang.reflect.AccessibleObject.verifyAccess(AccessibleObject.java:626) ~[?:?]
at java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:590) ~[?:?]
at java.lang.reflect.Constructor.newInstance(Constructor.java:481) ~[?:?]
at org.codehaus.groovy.runtime.InvokerHelper.newScript(InvokerHelper.java:503) ~[groovy-3.0.7.jar:3.0.7]
at org.codehaus.groovy.runtime.InvokerHelper.createScript(InvokerHelper.java:461) ~[groovy-3.0.7.jar:3.0.7]
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:266) ~[groovy-jsr223-3.0.7.jar:3.0.7]
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:155) ~[groovy-jsr223-3.0.7.jar:3.0.7]
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233) ~[java.scripting:?]
at org.apache.jmeter.functions.Groovy.execute(Groovy.java:120) ~[ApacheJMeter_functions.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:138) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.property.FunctionProperty.getStringValue(FunctionProperty.java:100) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.getCondition(WhileController.java:142) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.endOfLoop(WhileController.java:62) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.next(WhileController.java:112) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.nextIsNull(LoopController.java:166) ~[ApacheJMeter_core.jar:5.4.1]
Does it have something to do with my while loop, maybe? The condition I have in my While Controller is this:
${__groovy("false".equals(vars.get("isLast")))}
The other issue I have noticed is that although I specify the thread lifetime through the GUI, JMeter does not take it into account at all (for example, when I generate the schematic view).
I don't believe the CSV file is the issue, since I have already set "Recycle on EOF" to true, "Stop thread on EOF" to false and the sharing mode to "Current thread group".

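The thread goes into an infinite loop after completing the while loop: with no sampler left to run, the controllers keep calling each other (LoopController.next and LoopController.nextIsNull in your trace) until the stack overflows.
The isLast flag is not reset to false when you leave the While Controller, so its condition is already false on the second iteration and the thread never enters the loop again after the first cycle.
You can reset the flag to false at the end of the thread group or just before entering the While Controller.
Add a JSR223 Sampler with the following script: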
vars.put("isLast","false")
SampleResult.setIgnore()
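To see why the reset matters, here is a minimal plain-Java model of the test plan's control flow. It is only a sketch: it uses no JMeter classes, and the class name PaginationLoopModel, the method runIteration and the page count are hypothetical. It assumes that the initial isLast value is set once at test start and not again on each thread-group iteration.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the plan above; no JMeter API is used.
class PaginationLoopModel {
    // Hypothetical stand-in for the per-thread JMeter variables ("vars").
    private final Map<String, String> vars = new HashMap<>();

    PaginationLoopModel() {
        // Assumption: the initial value is set once, not on every iteration.
        vars.put("isLast", "false");
    }

    // Models one thread-group iteration and returns how many requests were sent.
    int runIteration(int pages, boolean resetFlagFirst) {
        if (resetFlagFirst) {
            vars.put("isLast", "false"); // what the JSR223 Sampler script does
        }
        int requests = 0;
        // Same test as the While Controller condition.
        while ("false".equals(vars.get("isLast"))) {
            requests++;
            if (requests >= pages) {
                vars.put("isLast", "true"); // the post-processor saw the last page
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        PaginationLoopModel withoutReset = new PaginationLoopModel();
        System.out.println(withoutReset.runIteration(3, false)); // 3
        System.out.println(withoutReset.runIteration(3, false)); // 0: loop is skipped

        PaginationLoopModel withReset = new PaginationLoopModel();
        System.out.println(withReset.runIteration(3, true)); // 3
        System.out.println(withReset.runIteration(3, true)); // 3
    }
}
```

Without the reset, every iteration after the first sends no requests at all, which matches the situation where the thread group has nothing left to sample.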

Related

JMeter If Controller jexl3

In an IfController
${__jexl3("${usecase}" == "month")} # works (true) when $usecase = months
${__jexl3("${usecase}" == "month")} # does not work when $usecase = quarter?
Instead, I get
2021-05-18 16:17:57,863 ERROR o.a.j.JMeter: Uncaught exception in thread Thread[MonthTable 1-1,6,main]
java.lang.StackOverflowError: null
at java.lang.ThreadLocal.get(ThreadLocal.java:163) ~[?:?]
at org.apache.jmeter.threads.JMeterContextService.getContext(JMeterContextService.java:59) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.SimpleVariable.getVariables(SimpleVariable.java:64) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.SimpleVariable.toString(SimpleVariable.java:50) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:144) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.functions.Jexl3Function.execute(Jexl3Function.java:72) ~[ApacheJMeter_functions.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:138) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.property.FunctionProperty.getStringValue(FunctionProperty.java:100) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.AbstractTestElement.getPropertyAsString(AbstractTestElement.java:280) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.IfController.getCondition(IfController.java:170) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.IfController.next(IfController.java:230) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.nextIsNull(LoopController.java:166) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:170) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:225) ~[ApacheJMeter_core.jar:5.4.1]
/Best regards, Mats
Use a Debug Sampler together with a View Results Tree listener to check that your ${usecase} variable really has the value you expect; judging by the error, it does not.
I cannot reproduce your issue using the same JMeter version.
It cannot be reproduced with an undefined variable either.
Try a clean, vanilla JMeter installation without any plugins. If the issue is still reproducible, it may be connected with your Java version; in that case, seeing your jmeter.log file would be very useful.

java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor

I have a Spring Batch job which reads data from a DB and indexes it into Solr. After deploying the WAR, the first run works fine. But if I run the job a second time, it throws the exception below:
java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$5793/1735313067@63e1224 rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@3411eba2[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 157]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:194)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.addRunner(ConcurrentUpdateSolrClient.java:429)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.request(ConcurrentUpdateSolrClient.java:527)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
at org.apache.solr.client.solrj.SolrClient.addBeans(SolrClient.java:357)
at org.apache.solr.client.solrj.SolrClient.addBeans(SolrClient.java:329)
at sun.reflect.GeneratedMethodAccessor1557.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
at com.sun.proxy.$Proxy1112.write(Unknown Source)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.writeItems(SimpleChunkProcessor.java:188)
I think the connection to Solr is refused the second time for some reason. Please help.
This means you closed the Solr client and then tried to add more work to it after it was closed. Your trace shows the executor of a ConcurrentUpdateSolrClient already in the Terminated state.
So you probably have some code like this:
try (CloudSolrClient solrClient = new CloudSolrClient.Builder()
        .withZkHost(SOLR_ZK_HOSTS)
        .withZkChroot(SOLR_ZK_CHROOT)
        .withConnectionTimeout(10000)
        .withSocketTimeout(60000)
        .build()) {
    // ... start all threads that do the work here ...
}
// wait for threads to complete
You should have done this instead:
try (CloudSolrClient solrClient = new CloudSolrClient.Builder()
        .withZkHost(SOLR_ZK_HOSTS)
        .withZkChroot(SOLR_ZK_CHROOT)
        .withConnectionTimeout(10000)
        .withSocketTimeout(60000)
        .build()) {
    // ... start all threads that do the work here ...
    // wait for threads to complete
}
Notice that the wait for the threads to complete is now inside the try-with-resources block. If you do not wait for the threads to complete there, the solrClient is closed before everything has been added.
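The same pattern can be tried out with the JDK alone. The sketch below is not SolrJ code: FakeClient is a hypothetical stand-in that, like the client in the stack trace, owns an executor and rejects work once it has been closed. The class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CloseAfterWorkersDemo {
    // Hypothetical stand-in for a Solr client: it owns an executor and
    // rejects new work once close() has shut that executor down.
    static class FakeClient implements AutoCloseable {
        private final ExecutorService pool = Executors.newSingleThreadExecutor();
        private final AtomicInteger added;

        FakeClient(AtomicInteger added) {
            this.added = added;
        }

        void add(String doc) {
            // Throws RejectedExecutionException if the pool is already shut down.
            pool.execute(() -> added.incrementAndGet());
        }

        @Override
        public void close() {
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Starts the workers and joins them inside the try block, so close()
    // only runs after every worker has handed over its document.
    static int indexAll(int workers) {
        AtomicInteger added = new AtomicInteger();
        try (FakeClient client = new FakeClient(added)) {
            Thread[] threads = new Thread[workers];
            for (int i = 0; i < workers; i++) {
                String doc = "doc-" + i;
                threads[i] = new Thread(() -> client.add(doc));
                threads[i].start();
            }
            for (Thread t : threads) {
                t.join(); // wait while the client is still open
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return added.get();
    }

    // Reusing a client after close() reproduces the rejection.
    static boolean addAfterCloseIsRejected() {
        FakeClient client = new FakeClient(new AtomicInteger());
        client.close();
        try {
            client.add("late-doc");
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(indexAll(4));               // 4
        System.out.println(addAfterCloseIsRejected()); // true
    }
}
```

The second method mirrors a job that runs fine once and fails on the next run: if the same client instance is closed at the end of the first run and then reused, every later add is rejected.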

apache nifi Stateless - not able to set parm of controller service (DBCPConnectionPool 1.10.0)

I am following the NiFi 1.10 stateless guideline to create a simple process group that executes a SQL statement against a MySQL database. I have put the necessary parameters of the DB controller service into a parameter context.
It works well on the NiFi canvas. Then I added it to the registry and prepared a JSON parameter file, stateless-simpledb.json:
{
  "registryUrl": "http://localhost:18080",
  "bucketId": "cac8f127-e328-45c1-a4cb-0e03dc837ceb",
  "flowId": "cc2753f2-78f3-4449-a2fd-343dfeaafe15",
  "flowVersion": "3",
  "parameters": {
    "lastIngestId": "20000",
    "mysql-jdbc-driver-name": "com.mysql.jdbc.Driver",
    "db-user": "root",
    "db-password": "password",
    "db-con-url": "jdbc:mysql://localhost:3306/mms",
    "jdbc-jar-path": "/program/jdbc/mysql-connector-java.jar"
  }
}
and run the one-off command:
/program/nifi/bin/nifi.sh stateless RunFromRegistry Once --file /app/poc/nifi-stateless/conf/stateless-simpledb.json
It raises an error:
=== FlowFileRepository Type ===
org.apache.nifi.controller.repository.RocksDBFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.VolatileFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
=== End FlowFileRepository types ===
23:32:32.626 [main] INFO org.apache.nifi.stateless.bootstrap.ExtensionDiscovery - Successfully discovered extensions in 4411 milliseconds
23:32:32.633 [main] DEBUG org.apache.nifi.stateless.core.ComponentFactory - Setting context class loader to org.apache.nifi.nar.InstanceClassLoader@50fa5938 (parent = org.apache.nifi.nar.NarClassLoader[/program/nifi-1.10.0/work/stateless-nars/nifi-dbcp-service-nar-1.10.0.nar-unpacked]) to create org.apache.nifi.dbcp.DBCPConnectionPool
23:32:32.647 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input #{jdbc-jar-path} found 1 Parameter references: [org.apache.nifi.parameter.StandardParameterReference@2d3eecda]
23:32:32.650 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input /program/jdbc/mysql-connector-java.jar found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 500 millis found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 0 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 30 mins found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.bootstrap.RunStatelessNiFi.main(RunStatelessNiFi.java:69)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.StatelessNiFi.main(StatelessNiFi.java:103)
... 5 more
Caused by: java.lang.RuntimeException: Failed to enable Controller Service {id=691ecc97-ff46-3a5e-8aad-37dc568bc247, name=MYSQL-MMS-stateless-test, type=class org.apache.nifi.dbcp.DBCPConnectionPool} because validation failed: ['Database Connection URL' is invalid because Database Connection URL is required, 'Database Driver Class Name' is invalid because Database Driver Class Name is required]
at org.apache.nifi.stateless.core.StatelessControllerServiceLookup.enableControllerServices(StatelessControllerServiceLookup.java:133)
at org.apache.nifi.stateless.core.StatelessFlow.<init>(StatelessFlow.java:153)
at org.apache.nifi.stateless.core.StatelessFlow.createAndEnqueueFromJSON(StatelessFlow.java:469)
at org.apache.nifi.stateless.runtimes.Program.runLocal(Program.java:133)
at org.apache.nifi.stateless.runtimes.Program.launch(Program.java:67)
... 10 more
It seems that NiFi stateless failed to set the controller service properties even though the parameters are in the process group's scope.
Does anyone have any advice?
As mentioned in the comments, this appears to be a known problem with the validation of controller services.
It can be avoided by using NiFi 1.12 or above, as it was fixed in the following Jira: https://issues.apache.org/jira/plugins/servlet/mobile#issue/NIFI-7380
Though I am not entirely sure of this, it may also simply indicate that your controller service is not configured correctly. This would be worth double-checking.

Spring ldaptemplate update group with large membership issue

I have issues updating groups in Active Directory with > 1500 members. It's only trying to modify the member attribute.
I have no issues updating groups with fewer members. I can also add a new group with many members.
However, if the group is too large, the update fails. Even if I try to update the large group to just one member, it still fails with the same error.
Code fails on the modifyAttributes line:
ModificationItem[] modList = nameContext.getDirContextAdapter().getModificationItems();
writeADTemplate.modifyAttributes(nameContext.getName(), modList);
StackTrace Below:
org.springframework.ldap.NameAlreadyBoundException: [LDAP: error code 68 - 00000562: UpdErr: DSID-031A122A, problem 6005 (ENTRY_EXISTS), data 0
nested exception is javax.naming.NameAlreadyBoundException: [LDAP: error code 68 - 00000562: UpdErr: DSID-031A122A, problem 6005 (ENTRY_EXISTS), data 0
remaining name 'cn=Atlassian Users,ou=Groups'
at org.springframework.ldap.support.LdapUtils.convertLdapException(LdapUtils.java:169)
at org.springframework.ldap.core.LdapTemplate.executeWithContext(LdapTemplate.java:810)
at org.springframework.ldap.core.LdapTemplate.executeReadWrite(LdapTemplate.java:802)
at org.springframework.ldap.core.LdapTemplate.modifyAttributes(LdapTemplate.java:967)
more ...
Caused by: javax.naming.NameAlreadyBoundException: [LDAP: error code 68 - 00000562: UpdErr: DSID-031A122A, problem 6005 (ENTRY_EXISTS), data 0
remaining name 'cn=Atlassian Users,ou=Groups'
at com.sun.jndi.ldap.LdapCtx.mapErrorCode(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.c_modifyAttributes(Unknown Source)
at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_modifyAttributes(Unknown Source)
at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.modifyAttributes(Unknown Source)
at javax.naming.directory.InitialDirContext.modifyAttributes(Unknown Source)
at org.springframework.ldap.core.LdapTemplate$19.executeWithContext(LdapTemplate.java:969)
at org.springframework.ldap.core.LdapTemplate.executeWithContext(LdapTemplate.java:807)
... 88 more
OK, my real issue is that Active Directory will not return a multi-valued attribute such as member in a single response when it has more than 1500 values.
When I read the current group members, zero values came back, so my code tried to add all the members back to the group, which is why the server answered ENTRY_EXISTS.
It looks like I'll have to figure out how to use DefaultIncrementalAttributesMapper to get all the members.
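The idea behind incremental (ranged) retrieval is to keep asking for the next slice of the attribute until the server marks a slice as the final one. The following is a minimal sketch of that loop in plain Java. It does not call Spring LDAP or a real directory: requestRange is a hypothetical stub, the 1500 page size is taken from the limit described above, and all names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RangedMemberFetch {
    // Assumed per-response cap, matching the 1500 limit described above.
    static final int PAGE_SIZE = 1500;

    // One server response: a slice of values plus a flag for the final range.
    static class Slice {
        final List<String> values;
        final boolean last;

        Slice(List<String> values, boolean last) {
            this.values = values;
            this.last = last;
        }
    }

    // Hypothetical directory stub: answers a request for the values starting
    // at index "from" and reports whether this slice reaches the end.
    static Slice requestRange(List<String> stored, int from) {
        int to = Math.min(from + PAGE_SIZE, stored.size());
        return new Slice(new ArrayList<>(stored.subList(from, to)), to == stored.size());
    }

    // Collects every value by requesting consecutive ranges until the last one.
    static List<String> fetchAll(List<String> stored) {
        List<String> members = new ArrayList<>();
        int from = 0;
        while (true) {
            Slice slice = requestRange(stored, from);
            members.addAll(slice.values);
            if (slice.last) {
                return members;
            }
            from += slice.values.size();
        }
    }

    // Builds made-up member DNs for the demo.
    static List<String> sampleMembers(int count) {
        List<String> members = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            members.add("cn=user" + i + ",ou=People");
        }
        return members;
    }

    public static void main(String[] args) {
        List<String> stored = sampleMembers(3200);
        System.out.println(fetchAll(stored).size()); // 3200, fetched in three slices
    }
}
```

Only once the complete member list is known can the code compute which members actually need to be added or removed, which avoids re-adding entries that already exist.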

slave lost and very slow join in spark

I joined two dataframes on one common column and then ran the show method:
df = df1.join(df2, df1.col1 == df2.col2, 'inner')
df.show()
The join ran very slowly and finally raised an error: slave lost.
Py4JJavaError: An error occurred while calling o109.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 : ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Slave lost
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:212)
at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1375)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1456)
at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
After some searching, it seems this is a memory-related issue. I increased the number of partitions to 3000 and increased the executor memory and memoryOverhead, but still no luck: I got the same slave lost error. During df.show(), I noticed that the shuffle write size of one executor was very high, while the others were not so high.
Any clue?
Thanks
If using Scala, try:
val df = df1.join(df2, Seq("column name"))
If using PySpark:
df = df1.join(df2, ["columnname"])
or
df = df1.join(df2, df1.columnname == df2.columnname)
df.show()
To do the same with SQL from PySpark (Spark 2.0 or later, where spark is assumed to be your SparkSession):
df1.createOrReplaceTempView("left_test_table")
df2.createOrReplaceTempView("right_test_table")
# join the two temporary views on the common column
df = spark.sql("SELECT * FROM left_test_table l JOIN right_test_table r ON l.columnname = r.columnname")
df.show()
