The constructor ClientArguments(String[], SparkConf) is undefined

I am trying to run spark-submit through Java code, following this example:
https://github.com/mahmoudparsian/data-algorithms-book/blob/master/misc/how-to-submit-spark-job-to-yarn-from-java-code.md
But I am getting:
The constructor ClientArguments(String[], SparkConf) is undefined
My version of spark-yarn is spark-yarn_2.11-2.0.0.
I saw the topic:
spark-submit through java code
but it does not work for me. Any help?

The constructor arguments changed between Spark 1.x and 2.0. In 1.x it was:
ClientArguments(args: Array[String], sparkConf: SparkConf)
In 2.0 it is:
ClientArguments(args: Array[String])
See: https://github.com/apache/spark/blob/branch-2.0/yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala
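For example, here is a minimal sketch of the 2.0-style call from Java. The jar path, main class and app name are placeholders, and a real submission will likely need more SparkConf settings than shown here:

import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;

public class SubmitToYarn {
    public static void main(String[] argv) {
        // Placeholder jar and class; replace with your application's values.
        String[] args = new String[] {
            "--jar", "/path/to/your-app.jar",
            "--class", "com.example.YourMainClass"
        };
        SparkConf conf = new SparkConf()
            .setMaster("yarn")
            .setAppName("submit-from-java");
        // 1.x: new ClientArguments(args, conf)
        // 2.0: the SparkConf moved out of ClientArguments and into Client.
        new Client(new ClientArguments(args), conf).run();
    }
}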

Related

Is it possible to read a file using a SparkSession object in Scala on Windows?

I've been trying to read from a .csv file in many ways, using a SparkContext object. I found it possible through the scala.io.Source.fromFile function, but I want to use the spark object. Every time I run textFile on org.apache.spark.SparkContext I get the same error:
scala> sparkSession.read.csv("file://C:\\Users\\184229\\Desktop\\bigdata.csv")
21/12/29 16:47:32 WARN streaming.FileStreamSink: Error while looking for metadata directory.
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
.....
As mentioned in the title, I run the code on Windows in IntelliJ.
[Edit]
My build.sbt has no redundant or overlapping dependencies. I use hadoop-tools, spark-sql and hadoop-xz.
Have you tried running your spark-shell in local mode?
spark-shell --master=local
Also take care not to use both Hadoop-code and Hadoop-commons as dependencies, since you may run into conflicting-jar issues.
I've found the solution; more precisely, one of my colleagues did. In the build.sbt dependencies I changed hadoop-tools to hadoop-commons and it worked.
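For reference, here is a minimal sketch of reading the file once the dependencies are sorted out, using local mode as suggested above. Note that a Windows path inside a file: URI needs a third slash and forward slashes, unlike the file://C:\... form in the question:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadCsvOnWindows {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[*]")   // local mode, per the suggestion above
            .appName("read-csv")
            .getOrCreate();
        // Three slashes and forward slashes for a Windows file: URI.
        Dataset<Row> df = spark.read()
            .option("header", "true")   // assumption: the csv has a header row
            .csv("file:///C:/Users/184229/Desktop/bigdata.csv");
        df.show(5);
        spark.stop();
    }
}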

org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo

I found a similar error at https://issues.apache.org/jira/browse/KYLIN-2511.
env:
hadoop-2.7.1
hbase-1.3.2
apache-hive-2.1.1-bin
apache-kylin-1.6.0-hbase1.x-bin
I've tried copying all the Hive libs to Kylin, but I get another error:
org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
The missing class should be in hive-exec-.jar. Check and debug "bin/find-hive-dependency.sh" to see why it wasn't able to locate this jar on your server; you can manually add it to the "hive_exec_path" variable.
BTW, Kylin 1.6 is quite old; try upgrading to a 2.x version.
Why not just try the method mentioned in https://issues.apache.org/jira/browse/KYLIN-2511? You'd better prepare the environment according to the v1.6 documentation. It is also better to use the latest version of Kylin, which has more features and fixes some bugs.

Beam / DataFlow unexpected error ProtocolMessageEnum not implemented when using DataFlowRunner

When running my Beam pipeline locally it all works as expected, but when trying to run it on the DataflowRunner I suddenly get the error below. Honestly, I don't even know where to start evaluating this, because the DataflowRunner seems to be a black box.
Jan 14, 2019 11:26:51 AM org.apache.beam.runners.dataflow.DataflowRunner fromOptions
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 165 files. Enable logging at DEBUG level to see which files will be staged.
Exception in thread "main" java.lang.IncompatibleClassChangeError: Class org.apache.beam.model.pipeline.v1.RunnerApi$StandardPTransforms$Primitives does not implement the requested interface com.google.protobuf.ProtocolMessageEnum
at org.apache.beam.runners.core.construction.BeamUrns.getUrn(BeamUrns.java:27)
at org.apache.beam.runners.core.construction.PTransformTranslation.<clinit>(PTransformTranslation.java:58)
at org.apache.beam.runners.core.construction.UnconsumedReads$1.visitValue(UnconsumedReads.java:49)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:666)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:311)
at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:245)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:458)
at org.apache.beam.runners.core.construction.UnconsumedReads.ensureAllReadsConsumed(UnconsumedReads.java:40)
at org.apache.beam.runners.dataflow.DataflowRunner.replaceTransforms(DataflowRunner.java:868)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:660)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at (my code: pipe.run().waitUntilFinish();)
Check the versions of Beam etc. and upgrade your dependencies where possible.
I had the same error, and after seeing you get it too, I figured it must be a dependency conflict, as it didn't exist before.
I'm using scio to deploy to Dataflow and just referenced the versions they're using: https://github.com/spotify/scio/blob/v0.7.1/build.sbt
I also updated guava and protobuf.
I know you're using Java, but try updating Beam to 2.9.0 and maybe guava and protobuf as well.
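If you want to confirm that it really is a classpath conflict before touching versions, a small diagnostic sketch like the one below (the two class names are taken from the stack trace above) prints which jar each class is actually loaded from:

import java.security.CodeSource;

public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        String[] names = {
            "com.google.protobuf.ProtocolMessageEnum",
            "org.apache.beam.model.pipeline.v1.RunnerApi"
        };
        for (String name : names) {
            // An old protobuf jar shadowing the one Beam was compiled against
            // is a typical cause of IncompatibleClassChangeError.
            CodeSource src = Class.forName(name).getProtectionDomain().getCodeSource();
            System.out.println(name + " -> " + (src == null ? "bootstrap" : src.getLocation()));
        }
    }
}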

RxJS-DOM - Cannot read property 'AbstractObserver' of undefined

I've just included RxJS 5.4.0 and RxJS-DOM 7.0.3 on a page and got this error in the console:
TypeError: Cannot read property 'AbstractObserver' of undefined
How can I resolve this error?
Rx-DOM is not fully compatible with RxJS 5.
See the open tickets on GitHub:
https://github.com/ReactiveX/rxjs/issues/1223
https://github.com/Reactive-Extensions/RxJS-DOM/issues/105
To avoid this error, you need to use RxJS 4.

Configuring Spring XD with Hortonworks

I am trying to configure Spring XD to use Hadoop (Hortonworks), but when I execute "./xd-singlenode --hadoopDistro hadoop11" in a terminal I get the error: 'hadoop11' is not a valid value for option --hadoopDistro. Possible values are [cdh5, hdp22, phd21, hadoop27, phd30].
The error message is pretty clear: the --hadoopDistro value you are trying to use is wrong. For a Hortonworks configuration you should use hdp22:
./xd-singlenode --hadoopDistro hdp22
Regards
