In an If Controller:
${__jexl3("${usecase}" == "month")} # works (evaluates to true) when ${usecase} = month
${__jexl3("${usecase}" == "month")} # does not work when ${usecase} = quarter?
Instead, I get
2021-05-18 16:17:57,863 ERROR o.a.j.JMeter: Uncaught exception in thread Thread[MonthTable 1-1,6,main]
java.lang.StackOverflowError: null
at java.lang.ThreadLocal.get(ThreadLocal.java:163) ~[?:?]
at org.apache.jmeter.threads.JMeterContextService.getContext(JMeterContextService.java:59) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.SimpleVariable.getVariables(SimpleVariable.java:64) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.SimpleVariable.toString(SimpleVariable.java:50) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:144) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.functions.Jexl3Function.execute(Jexl3Function.java:72) ~[ApacheJMeter_functions.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:138) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.property.FunctionProperty.getStringValue(FunctionProperty.java:100) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.AbstractTestElement.getPropertyAsString(AbstractTestElement.java:280) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.IfController.getCondition(IfController.java:170) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.IfController.next(IfController.java:230) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.nextIsNull(LoopController.java:166) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:170) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:225) ~[ApacheJMeter_core.jar:5.4.1]
Best regards, Mats
Make sure that your ${usecase} variable really has the expected value, using the Debug Sampler and View Results Tree listener combination; judging from the error, it does not.
I cannot reproduce your issue using the same JMeter version.
It cannot be reproduced with an undefined variable either.
Try a clean vanilla JMeter installation without any plugins; if the issue is still reproducible, it may be connected with your Java version, in which case seeing your jmeter.log file would be very useful.
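One pattern worth trying (a suggestion, not a confirmed fix): read the variable through vars inside the expression instead of inlining ${usecase}. That avoids re-interpolating the variable while the condition itself is being evaluated, which is the kind of recursion the repeated CompoundVariable.execute frames in the trace point to:
${__jexl3(vars.get("usecase") == "month")}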
Related
I have the following test plan
thread-group ( threads: "1", loops: "-1", ramp-up: "1" )
http request defaults (i.e. url)
csv data set config (containing 4 lines of input data)
user defined variables (initialize variables, e.g. isLast=false)
simple controller
while controller
http request
json extractor
json extractor
JSR223 PostProcessor
The JSON extractors and the post-processor are there because I want to handle pagination and increment the page number accordingly.
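A minimal sketch of that post-processor (the variable name pageNumber is an assumption; isLast is filled by one of the JSON Extractors):
// JSR223 PostProcessor (Groovy): advance the page counter for the next request
int page = (vars.get("pageNumber") ?: "0") as int  // pageNumber is a hypothetical variable name
vars.put("pageNumber", String.valueOf(page + 1))
log.info("increment page: isLast = " + vars.get("isLast"))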
The first issue is that although I set up the thread group to run infinitely, when the thread finishes it does not run again and I get the following error.
2021-08-27 14:15:38,686 INFO o.a.j.e.J.increment page: isLast = true
2021-08-27 14:15:38,738 INFO o.a.j.t.JMeterThread: Thread finished: dba-data-exporter-users 1-1
2021-08-27 14:15:38,738 ERROR o.a.j.JMeter: Uncaught exception in thread Thread[dba-data-exporter-users 1-1,6,main]
java.lang.StackOverflowError: null
at java.lang.Module.isExported(Module.java:456) ~[?:?]
at jdk.internal.reflect.Reflection.verifyModuleAccess(Reflection.java:212) ~[?:?]
at jdk.internal.reflect.Reflection.verifyMemberAccess(Reflection.java:125) ~[?:?]
at java.lang.reflect.AccessibleObject.slowVerifyAccess(AccessibleObject.java:633) ~[?:?]
at java.lang.reflect.AccessibleObject.verifyAccess(AccessibleObject.java:626) ~[?:?]
at java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:590) ~[?:?]
at java.lang.reflect.Constructor.newInstance(Constructor.java:481) ~[?:?]
at org.codehaus.groovy.runtime.InvokerHelper.newScript(InvokerHelper.java:503) ~[groovy-3.0.7.jar:3.0.7]
at org.codehaus.groovy.runtime.InvokerHelper.createScript(InvokerHelper.java:461) ~[groovy-3.0.7.jar:3.0.7]
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:266) ~[groovy-jsr223-3.0.7.jar:3.0.7]
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:155) ~[groovy-jsr223-3.0.7.jar:3.0.7]
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233) ~[java.scripting:?]
at org.apache.jmeter.functions.Groovy.execute(Groovy.java:120) ~[ApacheJMeter_functions.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:138) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:113) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.testelement.property.FunctionProperty.getStringValue(FunctionProperty.java:100) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.getCondition(WhileController.java:142) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.endOfLoop(WhileController.java:62) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.WhileController.next(WhileController.java:112) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.nextIsAController(GenericController.java:222) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.GenericController.next(GenericController.java:175) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.4.1]
at org.apache.jmeter.control.LoopController.nextIsNull(LoopController.java:166) ~[ApacheJMeter_core.jar:5.4.1]
Could it be something to do with my While loop? The condition I have in the While Controller is this:
${__groovy("false".equals(vars.get("isLast")))}
The other issue I have noticed is that although I specify the thread lifetime through the GUI, JMeter does not take it into account at all (when I generate the schematic view, for example).
I don't believe the CSV file is the issue, since I have already set Recycle on EOF to true, Stop thread on EOF to false, and Sharing mode to Current thread group.
The thread goes into an infinite loop after completing the while loop.
The isLast flag is not reset to false when you exit the While loop; because it stays true, the thread never re-enters the While loop after the first cycle.
You can reset the flag to false at the end of the thread group or before entering the While loop.
Add a JSR223 Sampler with the following:
vars.put("isLast","false")
SampleResult.setIgnore()
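Place the reset either as the last element in the thread group or just before the While Controller; either way, each new thread-group iteration then starts with isLast=false and the While condition holds again.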
ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, code: 1002, host: xxx.xxx.xxx.xxx, port: 8123; xxx.xxx.xxx.xxx:8123 failed to respond
at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.getException(ClickHouseExceptionSpecifier.java:91)
at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:55)
at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:24)
at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:633)
at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
Caused by: org.apache.http.NoHttpResponseException: xxx.xxx.xxx.xxx:8123 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614)
... 6 more
It's a well-known bug in the JDBC driver: https://github.com/ClickHouse/clickhouse-jdbc/issues/290
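Until the driver is fixed, one common mitigation (my suggestion, not something from the issue thread) is to retry the query when the stale keep-alive connection fails. A minimal plain-JDBC sketch in Scala, with placeholder connection details:
import java.sql.{DriverManager, ResultSet}

// Placeholder URL; substitute your real host and database.
val conn = DriverManager.getConnection("jdbc:clickhouse://xxx.xxx.xxx.xxx:8123/default")

def queryWithRetry(sql: String, attempts: Int = 3): ResultSet =
  try conn.createStatement().executeQuery(sql)
  catch {
    // The "failed to respond" error reaches the caller as an SQLException subclass;
    // retry on a fresh statement a few times before giving up.
    case e: java.sql.SQLException if attempts > 1 => queryWithRetry(sql, attempts - 1)
  }

val rs = queryWithRetry("SELECT 1")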
I am trying to ripple-start my cluster, but I am getting the error below:
[10/25/18 20:30:31:311 CEST] 00000001 WsServerImpl E WSVR0009E: Error occurred during startup
com.ibm.ws.exception.RuntimeError: org.omg.CORBA.INTERNAL: CREATE_LISTENER_FAILED_4 vmcid: 0x49421000 minor code: 56 completed: No
at com.ibm.ws.runtime.component.ORBImpl.start(ORBImpl.java:490)
at com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:540)
at com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
at com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
at com.ibm.ws.runtime.component.ServerImpl.start(ServerImpl.java:555)
at com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:311)
at com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:224)
at com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:697)
at com.ibm.ws.runtime.WsServer.main(WsServer.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:90)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at com.ibm.wsspi.bootstrap.WSLauncher.launchMain(WSLauncher.java:234)
at com.ibm.wsspi.bootstrap.WSLauncher.main(WSLauncher.java:101)
at com.ibm.wsspi.bootstrap.WSLauncher.run(WSLauncher.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:90)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:354)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:181)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:90)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at org.eclipse.core.launcher.Main.invokeFramework(Main.java:340)
at org.eclipse.core.launcher.Main.basicRun(Main.java:282)
at org.eclipse.core.launcher.Main.run(Main.java:981)
at com.ibm.wsspi.bootstrap.WSPreLauncher.launchEclipse(WSPreLauncher.java:413)
at com.ibm.wsspi.bootstrap.WSPreLauncher.main(WSPreLauncher.java:174)
Caused by: org.omg.CORBA.INTERNAL: CREATE_LISTENER_FAILED_4 vmcid: 0x49421000 minor code: 56 completed: No
at com.ibm.ws.orbimpl.transport.WSTransport.createListener(WSTransport.java:867)
at com.ibm.ws.orbimpl.transport.WSTransport.initTransports(WSTransport.java:605)
at com.ibm.rmi.iiop.TransportManager.initTransports(TransportManager.java:157)
at com.ibm.rmi.corba.ORB.set_parameters(ORB.java:1362)
at com.ibm.CORBA.iiop.ORB.set_parameters(ORB.java:1697)
at org.omg.CORBA.ORB.init(ORB.java:473)
at com.ibm.ws.orb.GlobalORBFactory.init(GlobalORBFactory.java:95)
at com.ibm.ejs.oa.EJSORBImpl.initializeORB(EJSORBImpl.java:169)
at com.ibm.ejs.oa.EJSServerORBImpl.&lt;init&gt;(EJSServerORBImpl.java:88)
at com.ibm.ejs.oa.EJSORB.init(EJSORB.java:50)
at com.ibm.ws.runtime.component.ORBImpl.start(ORBImpl.java:482)
... 34 more
As per other posts on Stack Overflow, this is a problem with the bootstrap port. As I am using a cluster, I assume the default ports will be 9809/9810, but there is no entry for these ports in /etc/services.
How can I proceed here? I also tried to start the individual servers, but got the same error. netstat does not show any entry for these ports either.
/etc/services does have an entry for 2809, but since I am using a cluster I was not concerned about 2809.
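CREATE_LISTENER_FAILED generally means the ORB could not open its listener socket, which usually comes down to the port already being held by another process or the configured host address not resolving. A quick way to check whether a suspected bootstrap port can be bound at all (a sketch in Scala; 9809 is only the assumed cluster default from the question):
import java.net.ServerSocket

val port = 9809 // assumed bootstrap port; adjust to your configuration
try {
  new ServerSocket(port).close() // bind and release immediately
  println(s"port $port can be bound")
} catch {
  case e: java.io.IOException => println(s"cannot bind port $port: ${e.getMessage}")
}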
I did a join of two dataframes on one common column and then ran a show method:
df = df1.join(df2, df1.col1 == df2.col2, 'inner')
df.show()
The join ran very slowly and finally raised an error: slave lost.
Py4JJavaError: An error occurred while calling o109.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 : ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Slave lost
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:212)
at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1375)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1456)
at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
After some searching, it seems this is a memory-related issue. I then increased the repartition count to 3000, increased executor memory, and increased memoryOverhead, but still no luck: I got the same slave-lost error. During df.show(), I found that one executor's shuffle write size was very high while the others were not.
Any clue?
Thanks
If using Scala, try:
val df = df1.join(df2, Seq("column name"))
If using PySpark:
df = df1.join(df2, ["columnname"])
or
df = df1.join(df2, df1.columnname == df2.columnname)
display(df)  # display() is Databricks-specific; use df.show() elsewhere
If trying to do the same through temporary views and Spark SQL in PySpark:
df1.createOrReplaceTempView("left_test_table")
df2.createOrReplaceTempView("right_test_table")
left = spark.sql("SELECT * FROM left_test_table")
right = spark.sql("SELECT * FROM right_test_table")
# join on the common column, then drop the duplicated column from one side
left.join(right, left.name == right.name).drop(left.name).show()
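Also, since one executor's shuffle write was much larger than the rest (a classic sign of skew on the join key), a broadcast join is worth trying when one side is small enough to fit in executor memory; it removes the shuffle from the join entirely. A sketch in Scala, assuming df2 is the small side:
import org.apache.spark.sql.functions.broadcast

// Ship the smaller DataFrame to every executor so the large side
// does not have to be shuffled for the join.
val joined = df1.join(broadcast(df2), df1("col1") === df2("col2"), "inner")
joined.show()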
Did somebody manage to write files (and especially CSV) using Spark's DataFrame on Windows?
Many answers on SO are outdated (e.g. this one) because of Spark's native capability to write CSV (and a unified write() method) since version 2.0. Also, I downloaded and added winutils.exe as proposed here.
Code:
// reading works just fine
val df = spark.read
.option("header", true)
.option("inferSchema", true)
.csv("file:///C:/tmp/in.csv")
// writing fails, none of these work
df.write.csv("file:///C:/tmp/out.csv")
df.write.csv("C:/tmp/out.csv")
Error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:551)
at prost.ebtl.load.DataSourceCSV$.loadFromFilesystem(DataSourceCSV.scala:12)
at TestScala$$anonfun$main$2.apply(TestScala.scala:98)
at TestScala$$anonfun$main$2.apply(TestScala.scala:80)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at TestScala$.main(TestScala.scala:80)
at TestScala.main(TestScala.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in stage 3.0 (TID 13, 192.168.56.1): java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor;
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileOutputStreamWithMode(NativeIO.java:559)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:219)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:305)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:294)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:326)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:393)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)
at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.<init>(CSVRelation.scala:191)
at org.apache.spark.sql.execution.datasources.csv.CSVOutputWriterFactory.newInstance(CSVRelation.scala:169)
at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:131)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:247)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1450)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1438)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1437)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1437)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1659)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1871)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1884)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1904)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:143)
... 27 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor;
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileOutputStreamWithMode(NativeIO.java:559)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:219)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:305)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:294)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:326)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:393)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)
at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.<init>(CSVRelation.scala:191)
at org.apache.spark.sql.execution.datasources.csv.CSVOutputWriterFactory.newInstance(CSVRelation.scala:169)
at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:131)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:247)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Note: a folder named out.csv is created, though.
Setup: Hadoop 2.7.3, Spark 2.0.1, IntelliJ IDEA 2016.2, Scala 2.11.8, test cluster on a Windows 7 workstation
I tried this and it's working. You need to set the warehouse directory configuration; that's the only thing missing from your code. Also, do you have write access to the directory you are trying to write to?
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark SQL CSV example")
  .master("local")
  // point the warehouse at a local directory you can write to
  .config("spark.sql.warehouse.dir", "file:///C:/IJava/")
  .getOrCreate()

val df = spark.read
  .option("header", true)
  .option("inferSchema", true)
  .csv("file:///C:/Users/sankar/Downloads/FLinsurancesample.csv")

df.write.csv("file:///C:/Users/sankar/Downloads/out.csv")