Installing Splice Machine on HDP 2.6.5 - hadoop

We have tried almost all the guides we could find online for installing Splice Machine as an Ambari service.
But every time we run sqlshell.sh, it just says there is no server running and that it is unable to connect to port 1527 on localhost.
We have a simple HDP sandbox, version 2.6.5, and a three-node 2.6.5 cluster. We are trying to install version 2.87 of Splice Machine.
These are the guides we followed:
https://github.com/splicemachine/spliceengine/blob/branch-2.8/platforms/hdp2.6.5/docs/HDP-installation.md
That did not work on our three-node cluster.
Then we tried the sandbox with this tutorial:
https://github.com/splicemachine/splice-ambari-service
Again, the same result.
Please let us know if there is anything we have missed in the guides, or whether there are any extra steps.

Can you confirm you are running sqlshell.sh on a server that is running a region server? I would look at the region server log files. You want to look for 'Ready to accept JDBC connections on 0.0.0.0:1527'; that is the indicator that a region server is up and running. If you do not see that, can you look in the log file and see whether there are any error messages?
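For example, something along these lines will search the region server logs (the log path is an assumption based on a typical HDP layout; adjust it for your installation):

# On each node that runs an HBase region server
grep "Ready to accept JDBC connections" /var/log/hbase/hbase-*-regionserver-*.log
# If nothing matches, look for errors instead
grep -i "ERROR" /var/log/hbase/hbase-*-regionserver-*.log | tail -n 50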

We found out the answer. Apart from what is mentioned in the doc here (https://github.com/splicemachine/spliceengine/blob/branch-2.8/platforms/hdp2.6.5/docs/HDP-installation.md), there were multiple NoClassDefFoundError errors being thrown in the HBase master logs for classes found in the spark2 jars.
One of them was:
master.HMaster: The coprocessor com.splicemachine.hbase.SpliceMasterObserver threw java.io.IOException: Unexpected exception
java.io.IOException: Unexpected exception
at com.splicemachine.si.data.hbase.coprocessor.CoprocessorUtils.getIOException(CoprocessorUtils.java:30)
at com.splicemachine.hbase.SpliceMasterObserver.start(SpliceMasterObserver.java:111)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$Environment.startup(CoprocessorHost.java:415)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:256)
at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadSystemCoprocessors(CoprocessorHost.java:159)
at org.apache.hadoop.hbase.master.MasterCoprocessorHost.<init>(MasterCoprocessorHost.java:93)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:774)
at org.apache.hadoop.hbase.master.HMaster.access$900(HMaster.java:225)
at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:2038)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/spark_project/guava/util/concurrent/ThreadFactoryBuilder
at com.splicemachine.timestamp.impl.TimestampClient.<init>(TimestampClient.java:108)
at com.splicemachine.timestamp.hbase.ZkTimestampSource.initialize(ZkTimestampSource.java:62)
at com.splicemachine.timestamp.hbase.ZkTimestampSource.<init>(ZkTimestampSource.java:48)
at com.splicemachine.si.data.hbase.coprocessor.HBaseSIEnvironment.<init>(HBaseSIEnvironment.java:146)
at com.splicemachine.si.data.hbase.coprocessor.HBaseSIEnvironment.loadEnvironment(HBaseSIEnvironment.java:100)
at com.splicemachine.hbase.SpliceMasterObserver.start(SpliceMasterObserver.java:81)
... 8 more
Caused by: java.lang.ClassNotFoundException: org.spark_project.guava.util.concurrent.ThreadFactoryBuilder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
We found out that this class was in the spark-network-common jar and symlinked that jar into one of the folders that was part of HBASE_CLASSPATH. But then it threw another NoClassDefFoundError for a different class.
We ended up symlinking all the jars in the spark2/jars folder into one of the folders that was part of HBASE_CLASSPATH.
After this the HBase master came up successfully and started the Splice DB process, and we were able to connect with sqlshell.sh.
Note: make sure to do this symlinking on all nodes that have a region server (a rough sketch of what we mean is shown below).
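As a hedged sketch only: the exact source and destination directories depend on your HDP layout and on what HBASE_CLASSPATH actually contains on your cluster, so treat these paths as assumptions to verify, not as the documented procedure.

# Run on every node that hosts an HBase master or region server
SPARK_JARS=/usr/hdp/current/spark2-client/jars        # assumed Spark2 jar location
HBASE_LIB=/usr/hdp/current/hbase-regionserver/lib     # assumed directory on HBASE_CLASSPATH
for jar in "$SPARK_JARS"/*.jar; do
  ln -sf "$jar" "$HBASE_LIB/"
done
# Then restart the HBase master and region servers from Ambari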
What helped us figure this out were these two Java files in the Splice Machine source:
https://github.com/splicemachine/spliceengine/blob/master/hbase_sql/src/main/java/com/splicemachine/hbase/SpliceMasterObserver.java
and
https://github.com/splicemachine/spliceengine/blob/ee61cadf17c97a0c5d866e2b764142f8e55311a5/splice_timestamp_api/src/main/java/com/splicemachine/timestamp/impl/TimestampServer.java

Related

Hadoop: ERROR BlockSender.sendChunks() exception

I have a Hadoop cluster (one master that works as both NameNode and DataNode, and two slaves). I saw these error messages in the log files:
hadoop-hduser-datanode-master.log file:
2017-05-15 13:02:55,303 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception:
java.io.IOException: Tubería rota ("Broken pipe")
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:570)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:739)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:527)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:239)
at java.lang.Thread.run(Thread.java:748)
That happened only on the master node, after a while of inactivity. Fifteen minutes earlier, I had run a wordcount example successfully.
The OS on each node is Ubuntu 16.04. The cluster was created using VirtualBox.
Could you help me, please?
I followed this link:
https://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
to configure some memory-related parameters, and my problem was resolved! (The kind of settings involved are sketched below.)
Note: in some posts I read that this error could be caused by a lack of disk space (not in my case), and in others the cause was the OS version (they recommended downgrading Ubuntu 16.04 to 14.04).
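As a rough illustration only: the linked guide walks through sizing properties such as yarn.nodemanager.resource.memory-mb and yarn.scheduler.minimum/maximum-allocation-mb, and the right values depend on your node sizes. A quick way to see what is currently set (typical config path assumed; adjust if yours differs):

# Show the YARN memory settings currently in effect
grep -A1 -E "yarn.nodemanager.resource.memory-mb|yarn.scheduler.minimum-allocation-mb|yarn.scheduler.maximum-allocation-mb" \
  /etc/hadoop/conf/yarn-site.xml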

File Does Not Exist in Amazon EMR even though it tries to upload it

I used Amazon EMR to create an emr-4.0.0 cluster.
However, whenever I try to submit a Spark application to it, it fails and gives the following error:
File does not exist: hdfs://ip-xx-xx-xxx-xx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1441035668468_0001/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
This is even though, earlier in the log, it uploads this exact same file without issuing any error message:
2015-08-31 15:43:29,070 INFO [main] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar -> hdfs://ip-xx-xx-xxx-xx.ec2.internal:8020/user/hadoop/.sparkStaging/application_1441035668468_0001/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
(I've verified that the source file indeed exists at /usr/lib/spark/lib/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar on the master machine).
The command I use is:
spark-submit --deploy-mode cluster --master yarn-cluster --class com.sundaysky.ads.spark.cluster.TrackingLogsAnalysis /tmp/oz/AdsTests-1.0-SNAPSHOT.jar
BTW, I've noticed that this uses Java 1.7 (even though it's the newest EMR version from Amazon), but I don't think that is relevant.
Do you have any idea what the issue could be, or alternatively, how to debug the problem? I've tried many ways of adding parameters to the spark-submit command to get TRACE-level messages from yarn-client, but without success.
Thanks,
Oz
So, after talking to Amazon support, in case anyone ever comes across a similar issue:
The specific problem in my case was that my logic jar (not the spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar, which is provided by Amazon) was compiled with Java 8, while the machine only supported Java 7.
This was not reflected in the error log for the step, but rather in the stderr log for the step's container, where the following message appeared:
15/08/31 15:43:41 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread Exception in thread "main" java.lang.UnsupportedClassVersionError: com/xxxxxx/xxxx/xxxxx/xxxxx/MyClass : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
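A quick way to confirm which Java version a jar was actually compiled for (the class path inside the jar is a placeholder for one of your own classes):

# Class file major version 52 = Java 8, 51 = Java 7
unzip -p /tmp/oz/AdsTests-1.0-SNAPSHOT.jar com/example/MyClass.class | file -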
If you encounter a similar problem, and the step's log files do not provide an answer, you should also look in the container's log (a command-line alternative is sketched after these steps):
Go to Amazon's EMR web page.
Click your cluster to open the Cluster Details screen.
Near the "Log URI" there should be a folder icon; click it to open the logs.
Go to "containers" and keep drilling down to the one matching your task.
Check the stderr.gz and stdout.gz for issues.
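If you prefer the command line, the same container logs usually land under the cluster's S3 log URI; this is a hedged sketch with placeholder bucket, cluster and container IDs:

# Substitute your own log bucket, cluster ID and container ID
aws s3 cp s3://your-log-bucket/logs/j-XXXXXXXXXXXXX/containers/application_1441035668468_0001/container_XXXX/stderr.gz .
zcat stderr.gz | less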
HTH,
Oz

Hive - issues while starting

I have been using Hive for some time now on Ubuntu, with Hadoop in pseudo-distributed mode. However, today, out of nowhere, I am getting an error while starting the Hive shell. I have not made any changes to the configuration at all:
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
The Hive metastore service is not running. You can start the service with the command below. This command is for installations made using packages:
service hive-metastore start
For tarball installations, you can start the Hive metastore using the command below:
hive --service metastore &
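As a quick sanity check afterwards (assuming the default metastore Thrift port 9083), you can verify that the metastore is actually listening:

# Is anything listening on the metastore port?
netstat -tlnp | grep 9083
# Is the metastore process running?
ps -ef | grep -i "[m]etastore"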

Flink error - org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4

I am trying to run a Flink job using a file from HDFS. I have created a dataset as follows:
DataSource<Tuple2<LongWritable, Text>> visits = env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, Config.pathToVisits());
I am using Flink's latest version, 0.9.0-milestone-1-hadoop1
(I have also tried 0.9.0-milestone-1),
whereas my Hadoop version is 2.6.0.
But I get the following exception when I try to execute the job. I have searched for similar problems, and it is related to a version incompatibility between the client and HDFS.
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Can you please let me know what changes I should make in my pom so that it points to the correct Hadoop/HDFS version, or changes elsewhere?
Or do I need to downgrade my Hadoop installation?
Have you tried the Hadoop-2 build of Flink? Have a look at the downloads page. There is a build called flink-0.9.0-milestone-1-bin-hadoop2.tgz that should work with Hadoop 2.
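Something along these lines, with a placeholder download URL (take the real link from the Flink downloads page), would swap in the Hadoop 2 build:

# Placeholder URL - use the link from the Flink downloads page
wget https://example.org/flink/flink-0.9.0-milestone-1-bin-hadoop2.tgz
tar xzf flink-0.9.0-milestone-1-bin-hadoop2.tgz
# then point your job submission at this distribution instead of the hadoop1 build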

Can't run Pig with single-node Hadoop server

I have set up a VM with Ubuntu. It runs Hadoop as a single node. Later I installed Apache Pig on it. Apache Pig runs great in local mode, but it always prompts ERROR 2999: Unexpected internal error. Failed to create DataStorage.
I am missing something very obvious. Can someone help me get this running, please?
More details:
1. I assume that Hadoop is running fine because I could run MapReduce jobs in Python.
2. pig -x local runs as I expect.
3. When I just type pig, it gives me the following error:
Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Failed to create DataStorage
java.lang.RuntimeException: Failed to create DataStorage
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:214)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:134)
at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
at org.apache.pig.PigServer.<init>(PigServer.java:226)
at org.apache.pig.PigServer.<init>(PigServer.java:215)
at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
at org.apache.pig.Main.run(Main.java:452)
at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:54310 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
at org.apache.hadoop.ipc.Client.call(Client.java:743)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
... 9 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
================================================================================
This link helped me understand a possible cause of the failure.
Here is what fixed my problem:
1. Recompile Pig without Hadoop.
2. Update PIG_CLASSPATH to include all the jars from $HADOOP_HOME/lib.
3. Run Pig.
Thanks.
Set your PIG_CLASSPATH to point at your correct HADOOP_HOME installation so that Pig can pick up your cluster information from core-site.xml, mapred-site.xml and hdfs-site.xml; it is better to follow the link for a correct installation. For example:
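A minimal sketch, assuming a tarball Hadoop install under /usr/local/hadoop with its *-site.xml files in $HADOOP_HOME/conf (adjust the paths to your layout):

export HADOOP_HOME=/usr/local/hadoop          # assumed install location
export PIG_CLASSPATH=$HADOOP_HOME/conf        # directory holding core-site.xml, hdfs-site.xml, mapred-site.xml
pig                                           # should now connect to HDFS instead of failing with ERROR 2999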
Just install Cygwin, then add the Cygwin path to the PATH environment variable.
For details see here.
