Spark can't find Guava Classes - maven

I'm running Spark's JavaPageRank example, but from a copy that I compiled separately with Maven into a new jar. I keep getting this error:
ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.NoClassDefFoundError: com/google/common/collect/Iterables
This happens despite the fact that Guava is listed as one of Spark's dependencies. I'm running Spark 1.6, downloaded pre-built from the Apache website.
Thanks!

The error means that the jar containing the com.google.common.collect.Iterables class is not on the classpath, so your application cannot find the required class at runtime.
If you are using Maven or Gradle, try to clean, build, and refresh the project. Then check your classes folder and make sure the Guava jar is in the lib folder.
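If it still isn't picked up, one option (a minimal sketch; the Guava version below is only an example, check mvn dependency:tree to see what your build actually resolves) is to declare Guava explicitly in your pom.xml and rebuild:

<!-- Sketch: declare Guava explicitly; pick the version your Spark build expects -->
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>14.0.1</version> <!-- example version only -->
</dependency>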
Hope this will help.
Good luck!

Related

JNI Error followed by Java Error when opening executable jar

Good evening,
I encountered a problem while building an executable jar with IntelliJ. I never had problems with it before, but this time I get a "JNI Error occurred" followed by a "Java Error occurred" message. I already tried the tips I found on Stack Overflow. It won't work with the Maven Assembly Plugin or the Maven Dependency Plugin. Moreover, I tried creating the MANIFEST in main/java and in main/resources. When I create it in main/java, nothing happens when I open the jar; with the second method, the error mentioned earlier occurs.
The only way it works is using this option in Artifacts
But using this option, all of the libraries appear in the file path individually and the jar size is 75 KB. My goal was to create ONE fat executable jar. By the way, another Maven project I created works fine after creating the jar. Thanks in advance.
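For reference, one common way to produce a single executable fat jar with Maven (a sketch using the Maven Shade Plugin; the plugin version and main class below are placeholders to adapt) is:

<!-- Sketch: maven-shade-plugin bundles all dependencies into one jar and sets Main-Class -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version> <!-- example version -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- com.example.Main is a placeholder for your actual main class -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.example.Main</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

After mvn package, the shaded jar in target/ should run with java -jar.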

How to deploy assembly jar and use it as provided dependency?

Using Spark over HBase and Hadoop on YARN, an assembly library, among other libraries, is provided server side
(named something like spark-looongVersion-haddop-looongVersion.jar);
it includes numerous libraries.
When the Spark jar is sent as a job to the server for execution, conflicts may arise between the libraries included in the job and the server libraries (the assembly jar and possibly other libraries).
I need to include this assembly jar as a "provided" Maven dependency to avoid conflicts between client dependencies and the server classpath.
How can I deploy and use this assembly jar as a provided dependency?
An assembly jar is a regular jar file, and so, like any other jar file, it can be a library dependency if it's available in an artifact repository to download it from, e.g. Nexus, Artifactory, or similar.
The quickest way to do this is to "install" it in your local Maven repository (see Maven's Guide to installing 3rd party JARs). That, however, binds you to what you have locally available and so will quickly get out of sync with what other teams are using.
The recommended way is to deploy the dependency using the Apache Maven Deploy Plugin.
Once it's deployed, declaring it as a dependency is no different from declaring other dependencies.
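For example, assuming the assembly jar was deployed under coordinates of your choosing (the groupId, artifactId, and version below are placeholders), the declaration is an ordinary dependency entry:

<!-- Placeholder coordinates: use whatever you chose when deploying the assembly jar -->
<dependency>
  <groupId>org.mycompany.spark</groupId>
  <artifactId>spark-hadoop-assembly</artifactId>
  <version>looongVersion</version>
  <scope>provided</scope>
</dependency>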
Provided dependencies scope
Spark dependencies must be excluded from the assembled JAR. If not, you should expect weird errors from the Java classloader during application startup. An additional benefit of an assembly without Spark dependencies is faster deployment. Please remember that the application assembly must be copied over the network to a location accessible by all cluster nodes (e.g. HDFS or S3).
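As an illustration, the usual way to keep Spark out of the assembly is to mark the Spark dependency itself as provided (the artifact and version below are only an example, matching a Spark 1.6 / Scala 2.10 build; adjust to yours):

<!-- Example only: provided scope keeps Spark out of the assembled jar -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.6.0</version>
  <scope>provided</scope>
</dependency>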

Jar hell - force load one version

Is there a way to force a Java application to load a particular version of a jar, and not include the other version in the classpath?
Many other users here have noted that Elasticsearch unit tests throw jar hell errors. I have tried all the suggestions, and nothing has worked. I have set up tons and tons of 'exclusions' in my pom.xml, and this has not helped. How exactly is this jar being loaded?
java.lang.RuntimeException: found jar hell in test classpath
..
Caused by: java.lang.IllegalStateException: jar hell!
class: javax.servlet.annotation.WebFilter
jar1: /Users/xx/.m2/repository/org/jboss/spec/javax/servlet/jboss-servlet-api_3.0_spec/1.0.1.Final/jboss-servlet-api_3.0_spec-1.0.1.Final.jar
jar2: /Users/xx/.m2/repository/org/jboss/spec/javax/servlet/jboss-servlet-api_3.1_spec/1.0.0.Final/jboss-servlet-api_3.1_spec-1.0.0.Final.jar
How do I find which jar is linked to jboss-servlet-api_3.0_spec-1.0.1.Final.jar? I have tried using tattletale (this did not generate reports) and maven-duplicate-finder (this did not find anything). I could hack it to set the jar-check option to false, but then I get shadowing errors.
These jars do not show up in my maven dependency tree.
I tried deleting the two versions from my .m2 directory, but they just pop up again when I am running the app. Is there a way to find out where they are being loaded from?
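Not from the original thread, but one thing worth noting: the plain dependency tree hides duplicates that were omitted for conflict, so running mvn dependency:tree -Dverbose -Dincludes=org.jboss.spec.javax.servlet may reveal which direct dependency drags these jars in. The offending path can then be cut with an exclusion; the outer coordinates below are placeholders for whatever the tree output points at:

<!-- Sketch: exclude the transitive servlet spec jar from whichever dependency pulls it in -->
<dependency>
  <groupId>some.group</groupId> <!-- placeholder -->
  <artifactId>some-artifact</artifactId> <!-- placeholder -->
  <version>x.y.z</version> <!-- placeholder -->
  <exclusions>
    <exclusion>
      <groupId>org.jboss.spec.javax.servlet</groupId>
      <artifactId>jboss-servlet-api_3.0_spec</artifactId>
    </exclusion>
  </exclusions>
</dependency>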

Submit spark application in a Jar file separate from uber Jar containing all dependencies

I am building a Spark application which has several heavy dependencies (e.g. Stanford NLP with language models), so the uber jar that contains the application code plus dependencies is ~500 MB. Uploading this fat jar to my test cluster takes a lot of time, so I decided to build my app and its dependencies into separate jar files.
I've created two modules in my parent pom.xml and build the app jar and the uber jar separately with mvn package and mvn assembly:assembly respectively.
However, after I upload these separate jars to my YARN cluster, the application fails with the following error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.net.unix.DomainSocketWatcher.<init>(I)V
    at org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager.<init>(DfsClientShmManager.java:415)
    at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.<init>(ShortCircuitCache.java:379)
    at org.apache.hadoop.hdfs.ClientContext.<init>(ClientContext.java:100)
    at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:151)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:690)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
When running the application on Spark it also fails with a similar error.
The jar with dependencies is included in the YARN classpath:
<property>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,
$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,
$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,
$HADOOP_MAPRED_HOME/lib/*,
$YARN_HOME/*,
$YARN_HOME/lib/*,
/usr/local/myApp/org.myCompany.myApp-dependencies.jar
</value>
</property>
Is it actually possible to run a Spark application this way? Or do I have to put all dependencies on the YARN (or Spark) classpath as individual jar files?
I encountered the same issue with my Spark job. This is a dependency issue for sure. You have to make sure the correct versions are picked up at runtime. The best way to do this was adding the correct version of hadoop-common (2.6) to my application jar. I also upgraded the hadoop-hdfs version in my application jar. This resolved my issue.
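As a sketch (versions are illustrative; align them with what the cluster actually runs), pinning the Hadoop client artifacts in the application pom would look like:

<!-- Illustrative versions: match these to the Hadoop version deployed on the cluster -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.6.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.6.0</version>
</dependency>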

dependency issues with app while deploying in tomcat-server

I am using HBase 0.94.7, Hadoop 1.0.4, and Tomcat 7.
I wrote a small REST-based application which performs CRUD operations on HBase.
Earlier I used to run the app using the Maven Tomcat plugin.
Now I am trying to deploy the war in a Tomcat server.
Since the Hadoop and HBase jars already contain org.mortbay.jetty, jsp-api, and servlet-api jars of older versions,
I am getting AbstractMethodError exceptions.
Here's the exception log.
So then I added an exclusion of org.mortbay.jetty to both the hadoop and hbase dependencies in pom.xml, but it started showing more and more issues of this kind, e.g. with Jasper.
So then I added scope provided to the hadoop and hbase dependencies.
Now Tomcat is unable to find the hadoop and hbase jars.
Can someone help me fix these dependency issues?
Thanks.
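For reference, the exclusions described above typically look something like this (coordinates are illustrative; Hadoop 1.0.4 is published as hadoop-core, and the exact artifacts to exclude depend on what your dependency tree shows):

<!-- Sketch: strip the old Jetty jars that hadoop-core drags in; add further exclusions as needed -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.0.4</version>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>jetty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>jetty-util</artifactId>
    </exclusion>
  </exclusions>
</dependency>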
Do one thing:
- Right-click on the project
- Go to Properties
- Type "Java Build Path"
- Go to the third tab, Libraries
- Remove the lib and Maven dependencies
- Clean and build your project.
That might solve your problem.
