Flink error - org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4 - maven

I am trying to run a Flink job using a file from HDFS. I have created a data set as follows:
DataSource<Tuple2<LongWritable, Text>> visits = env.readHadoopFile(new TextInputFormat(), LongWritable.class,Text.class, Config.pathToVisits());
I am using Flink's latest version, 0.9.0-milestone-1-hadoop1
(I have also tried 0.9.0-milestone-1),
whereas my Hadoop version is 2.6.0.
But I get the following exception when I try to execute the job. I have searched for similar problems, and they point to a version incompatibility between the client and HDFS.
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Can you please let me know what changes I should make in my pom so that it points to the correct Hadoop/HDFS version, or what to change elsewhere?
Or do I need to downgrade my Hadoop installation?

Have you tried the Hadoop-2 build of Flink? Have a look at the downloads page. There is a build called flink-0.9.0-milestone-1-bin-hadoop2.tgz that should work with Hadoop 2.
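If you are pulling Flink in through Maven, the equivalent change in the pom is to drop the -hadoop1 suffix from the Flink version; as far as I know the unsuffixed 0.9.0-milestone-1 artifacts are the ones built against Hadoop 2 (treat that as an assumption and verify against the downloads page). A minimal sketch:
<!-- Sketch only: assumes the unsuffixed 0.9.0-milestone-1 artifacts target Hadoop 2 -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>0.9.0-milestone-1</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients</artifactId>
  <version>0.9.0-milestone-1</version>
</dependency>
If the error persists, run mvn dependency:tree and check whether an old Hadoop 1.x client is still being pulled in transitively.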

Related

Installing splice machine on HDP 2.6.5

We have tried almost all the guides we could find online for installing Splice Machine as an Ambari service.
But every time we run sqlshell.sh, it just says there is no server running and that it is unable to connect to port 1527 on localhost.
We have a simple HDP sandbox, version 2.6.5, and a three-node 2.6.5 cluster. We are trying to install version 2.87 of Splice Machine.
These are the guides we followed.
https://github.com/splicemachine/spliceengine/blob/branch-2.8/platforms/hdp2.6.5/docs/HDP-installation.md
That did not work on our three-node cluster.
Then we tried the sandbox with this tutorial:
https://github.com/splicemachine/splice-ambari-service
Again the same result.
Please let us know if there is anything we have missed in the guides, or if there are any extra steps.
Can you confirm that you are running sqlshell.sh on a server that is also running a region server? I would look at the region server log files for 'Ready to accept JDBC connections on 0.0.0.0:1527'; that line is the indicator that a region server is up and running. If you do not see it, check the log file for any error messages.
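For example, a quick way to check is to grep the region server log for that line; the path below is an assumption (it varies by distribution, this is a typical HDP location):
grep "Ready to accept JDBC connections" /var/log/hbase/hbase-hbase-regionserver-*.log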
Found out the answer. Apart from what is mentioned in the doc here - https://github.com/splicemachine/spliceengine/blob/branch-2.8/platforms/hdp2.6.5/docs/HDP-installation.md -
there were multiple NoClassDefFoundError errors being thrown in the HBase master logs for classes found in the spark2 jars.
One of them was
master.HMaster: The coprocessor com.splicemachine.hbase.SpliceMasterObserver threw java.io.IOException: Unexpected exception
java.io.IOException: Unexpected exception
at com.splicemachine.si.data.hbase.coprocessor.CoprocessorUtils.getIOException(CoprocessorUtils.java:30)
at com.splicemachine.hbase.SpliceMasterObserver.start(SpliceMasterObserver.java:111)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$Environment.startup(CoprocessorHost.java:415)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:256)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadSystemCoprocessors(CoprocessorHost.java:159)
at org.apache.hadoop.hbase.master.MasterCoprocessorHost.<init>(MasterCoprocessorHost.java:93)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:774)
at org.apache.hadoop.hbase.master.HMaster.access$900(HMaster.java:225)
at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:2038)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/spark_project/guava/util/concurrent/ThreadFactoryBuilder
at com.splicemachine.timestamp.impl.TimestampClient.<init>(TimestampClient.java:108)
at com.splicemachine.timestamp.hbase.ZkTimestampSource.initialize(ZkTimestampSource.java:62)
at com.splicemachine.timestamp.hbase.ZkTimestampSource.<init>(ZkTimestampSource.java:48)
at com.splicemachine.si.data.hbase.coprocessor.HBaseSIEnvironment.<init>(HBaseSIEnvironment.java:146)
at com.splicemachine.si.data.hbase.coprocessor.HBaseSIEnvironment.loadEnvironment(HBaseSIEnvironment.java:100)
at com.splicemachine.hbase.SpliceMasterObserver.start(SpliceMasterObserver.java:81)
... 8 more
Caused by: java.lang.ClassNotFoundException: org.spark_project.guava.util.concurrent.ThreadFactoryBuilder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
We found that this class was in the spark-network-common jar, so we symlinked that jar into one of the folders on HBASE_CLASSPATH. But then it threw another NoClassDefFoundError for a different class.
We ended up symlinking all the jars in the spark2/jars folder into one of the folders on HBASE_CLASSPATH.
After this the HBase master came up successfully, started the Splice DB process, and we were able to connect with sqlshell.sh.
Note: make sure to do this symlinking on all nodes that have a region server.
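As a rough sketch, the symlinking looked something like the following; the source and target directories here are assumptions and depend on your HDP layout and on what HBASE_CLASSPATH actually contains on your cluster:
# Assumed paths; verify against your own spark2 install and HBASE_CLASSPATH.
for jar in /usr/hdp/current/spark2-client/jars/*.jar; do
  sudo ln -s "$jar" /usr/hdp/current/hbase-regionserver/lib/
done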
What helped us figure this out were these two Java files in the Splice Machine source:
https://github.com/splicemachine/spliceengine/blob/master/hbase_sql/src/main/java/com/splicemachine/hbase/SpliceMasterObserver.java
and
https://github.com/splicemachine/spliceengine/blob/ee61cadf17c97a0c5d866e2b764142f8e55311a5/splice_timestamp_api/src/main/java/com/splicemachine/timestamp/impl/TimestampServer.java

File Does Not Exist in Amazon EMR even though it tries to upload it

I used Amazon EMR to create an emr-4.0.0 cluster.
However, whenever I try to submit a Spark application on it, it fails and gives the following error:
File does not exist: hdfs://ip-xx-xx-xxx-xx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1441035668468_0001/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
This is even though earlier in the log it uploads this exact same file without issuing any error message:
2015-08-31 15:43:29,070 INFO [main] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar -> hdfs://ip-xx-xx-xxx-xx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1441035668468_0001/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
(I've verified that the source file indeed exists at /usr/lib/spark/lib/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar on the master machine).
The command I use is:
spark-submit --deploy-mode cluster --master yarn-cluster --class com.sundaysky.ads.spark.cluster.TrackingLogsAnalysis /tmp/oz/AdsTests-1.0-SNAPSHOT.jar
BTW, I've noticed that this uses Java 1.7 (even though it's the newest EMR version by Amazon), but I don't think that is relevant.
Do you have any ideas what could be the issue, or alternatively, how to debug the problem? I've tried many ways of adding parameters to the spark-submit command to get TRACE-level messages from yarn-client, but without success.
Thanks,
Oz
So, after talking to Amazon support, in case anyone ever comes across a similar issue:
The specific problem in my case was that my logic jar (not the spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar, which is provided by Amazon) was compiled with Java 8, while the machine only supported Java 7.
This was not reflected in the error log for the step, but rather in the stderr log for the step's container, where the following message appeared:
15/08/31 15:43:41 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
Exception in thread "main" java.lang.UnsupportedClassVersionError: com/xxxxxx/xxxx/xxxxx/xxxxx/MyClass : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
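If you hit the same mismatch, one straightforward fix is to compile the application jar for Java 7. A minimal sketch of the Maven compiler settings, assuming the jar is built with Maven (the question does not say, so this is an assumption):
<!-- Assumes a Maven build; target the Java version available on the cluster -->
<properties>
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
</properties>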
If you encounter a similar problem, and the step's log files do not provide an answer, you should also look in the container's log:
Go to Amazon's EMR web page.
Click your cluster to open the Cluster Details screen.
Near the "Log URI" there should be a folder icon; click it to open the logs.
Go to "containers" and drill down to the one matching your task.
Check the stderr.gz and stdout.gz for issues.
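The same container logs are also synced to the S3 location configured as the cluster's Log URI, so you can fetch them from the command line instead of the console; the bucket name, cluster id, and container id below are placeholders:
# Placeholders: substitute your own log bucket, cluster id, and container id.
aws s3 ls --recursive s3://my-emr-log-bucket/logs/j-XXXXXXXXXXXXX/containers/application_1441035668468_0001/
aws s3 cp s3://my-emr-log-bucket/logs/j-XXXXXXXXXXXXX/containers/application_1441035668468_0001/<container-id>/stderr.gz .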
HTH,
Oz

Interfacing Hadoop to Cassandra on Amazon AWS - netty version conflict?

I've got a Hadoop MapReduce class that runs on Amazon EMR and outputs to an HDFS flat file. All well and good, but now I need to output to a Cassandra DB, also running on AWS. I built and ran a local client and got that working, then moved the Cassandra writing code into my Hadoop project. The problem, it seems, is that Amazon draws in /home/hadoop/lib/netty-3.2.4.Final.jar for Hadoop 1.0.3, but the Cassandra that runs on AWS is 1.2.6 and uses netty-3.5.9.Final.jar.
What can I do to prevent or circumvent this conflict? Can I draw in my newer version of netty alongside the one Amazon EMR draws in?
The error I get from running the jar on EMR is as follows:
Exception in thread "main" java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.<init>(IIIIIZ)V
at org.apache.cassandra.transport.Frame$Decoder.<init>(Frame.java:147)
at com.datastax.driver.core.Connection$PipelineFactory.getPipeline(Connection.java:616)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:212)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:188)
at com.datastax.driver.core.Connection.<init>(Connection.java:111)
at com.datastax.driver.core.Connection.<init>(Connection.java:56)
at com.datastax.driver.core.Connection$Factory.open(Connection.java:387)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:211)
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:174)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:87)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:609)
at com.datastax.driver.core.Cluster$Manager.access$100(Cluster.java:553)
at com.datastax.driver.core.Cluster.<init>(Cluster.java:67)
at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:94)
at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:534)
at GraphAnalysis.MatrixBuilder.compileOutput(MatrixBuilder.java:282)
at GraphAnalysis.MatrixBuilder.main(MatrixBuilder.java:205)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
I had a similar problem with one of our internal modules: it used netty-3.2.4.jar, which was missing a function we needed.
Ultimately I had to do open-heart surgery on the jar file.
Expand the jar file that embeds the old netty library: jar -xvf oldlibrary.jar
Delete the folder org/jboss/netty.
Then expand the new netty jar file into a separate folder.
Copy the new org/jboss/netty folder, along with its contents, to the old location: cp -prf netty .
Create a new JAR package: jar -cvf ../new_jarfile.jar *
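Put together, the whole sequence looks roughly like this; the jar names are placeholders for your application jar and the newer netty jar, not the exact files from the question:
# Illustrative sketch only; substitute your real jar names.
mkdir repack newnetty
(cd repack   && jar -xvf ../oldlibrary.jar)            # 1. unpack the application jar
rm -rf repack/org/jboss/netty                          # 2. drop the old netty classes
(cd newnetty && jar -xvf ../netty-3.5.9.Final.jar)     # 3. unpack the newer netty jar separately
cp -prf newnetty/org/jboss/netty repack/org/jboss/     # 4. copy the new classes into place
(cd repack   && jar -cvf ../new_jarfile.jar *)         # 5. repackage into a new jar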
This is a rare case, but you can apply this method to overcome library incompatibilities whenever they occur. Make sure that your program still runs: if you change the underlying library this way, there is a chance that your original program will no longer run. A well-designed library should be agnostic of such changes, though.

HBase error: Server IPC version 8 cannot communicate with client version 4

I'm using hbase-0.94.9. I tried to follow the instructions from the HBase online book, but I got the error:
org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException
Then I found on the web that I had to set up Hadoop first. I used start-dfs.sh from Hadoop 2.0.5-alpha, but now I get this error when I try to run start-hbase.sh:
2013-07-09 17:27:40,706 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: Server IPC version 8 cannot communicate with client version 4
You are trying to use an HBase release that was built against Hadoop 1.0.x with Hadoop 2.0.x. Either use an HBase release built against Hadoop 2.0.x, or rebuild your HBase with hadoop.profile set to 2.0:
-Dhadoop.profile=2.0
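For example, a rebuild from the HBase source tree would look roughly like this (sketch only; check the HBase build documentation for the exact steps for your release):
mvn clean install -DskipTests -Dhadoop.profile=2.0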
If you need help on how to build HBase, you can visit this link.
HTH

mahout won't start up. Anything to do with version compatibility between hadoop and mahout?

I am new to Hadoop, not to mention Mahout. I hope someone can assist me in getting through this; I have been trying for two days.
I already have a Hadoop cluster running.
I am using hadoop-2.0.0-alpha.
I installed Mahout (mahout-distribution-0.7) and maven-2.2.1 (the latest maven-3.0.4 doesn't work).
Now I would just like to run mahout to get an idea of what it is.
I learned that typing "mahout" should print out a list of options (algorithms) available in Mahout, but when I typed mahout, it just gave me a Java exception.
[hadoop@localhost bin]$ mahout
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /home/hadoop/hadoop/bin/hadoop and HADOOP_CONF_DIR=/home/hadoop/hadoop/conf
MAHOUT-JOB: /home/hadoop/mahout/examples/target/mahout-examples-0.7-job.jar
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.driver([Ljava/lang/String;)V
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
From what I googled online, most of the answers required me to use a lower version of Hadoop, i.e. hadoop-0.20. Does my problem have something to do with my Hadoop version?
Thank you.
======== NEWLY EDITED ========
I changed my Hadoop version to hadoop-1.0.3 and now it works when I type "mahout" (my Mahout is version 0.7).
But it fails again with a similar error when I try to run an example:
$ hadoop /home/hadoop/mahout/core/target/mahout-core-0.7-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.output.dir=output -Dmapred.input.dir=input/prefs.txt --usersFile input/users.txt --similarityClassname SIMILARITY_PEARSON_CORRELATION
Caused by: java.lang.ClassNotFoundException: .home.hadoop.mahout.core.target.mahout-core-0.7-job.jar
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: /home/hadoop/mahout/core/target/mahout-core-0.7-job.jar. Program will exit.
Hmm..
Yes, it looks like you need to use a different version of Hadoop (or build the latest Mahout from source) if you want this to work. You got a NoSuchMethodError, so the first thing to do is check whether ProgramDriver is in the distribution of Hadoop you're using.
Looking at the API docs for the various versions, you can see that it is in 0.20.x but has been removed from the newer versions.
http://hadoop.apache.org/common/docs/r0.20.205.0/api/index.html
http://hadoop.apache.org/common/docs/r2.0.0-alpha/api/index.html (your version)
Looking at the Mahout JIRA, you can see that a bug was submitted for a similar problem on July 11th and has been fixed in version 0.8.
[MAHOUT-1044] https://issues.apache.org/jira/browse/MAHOUT-1044
Update:
Shouldn't your command have jar after the hadoop command? Something like:
$ hadoop jar /home/hadoop/mahout/core/target/mahout-core-0.7-job.jar
etc
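With the arguments from your original command, that would read:
hadoop jar /home/hadoop/mahout/core/target/mahout-core-0.7-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.output.dir=output -Dmapred.input.dir=input/prefs.txt --usersFile input/users.txt --similarityClassname SIMILARITY_PEARSON_CORRELATION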
@BinaryNerd is right. There is a bug in Mahout, as detailed in:
[MAHOUT-1044] https://issues.apache.org/jira/browse/MAHOUT-1044
The Mahout 0.7 command line gives the NoClassDef ProgramDriver error described in the first part of your question. This will be fixed in 0.8, or you can edit your bin/mahout as detailed in the bug report to fix the brace placement.
I ran into the same issue using mahout under Cloudera CDH 4.1.2. Changing the brace in my bin/mahout fixed it.
