Hortonworks Ambari cannot start data node - Cannot find Java VM / JVM library file - java-6

I've just added a new datanode to my Hortonworks cluster (machines running RHEL 7), but clearly I must have missed something when I installed the Java JDK 1.8 on it. All of the node's roles are installed, but DataNode, Metrics Monitor and NodeManager show up as stopped in Ambari. Whenever I run 'DataNode start' it fails with the following message:
==> /var/log/hadoop/hdfs/jsvc.out <==
==> /var/log/hadoop/hdfs/jsvc.err <==
Cannot find any VM in Java Home /usr/java/jdk1.8.0_77
Cannot locate JVM library file
Output when running java -version (logged in as root):
java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
Java HotSpot(TM) Server VM (build 25.77-b03, mixed mode)
I figure it must be something along the lines of exporting JAVA_HOME or setting PATH so that it looks inside the JDK's bin folder. I can't make it work, though. Maybe that's because I'm exporting it in root's bash profile instead of whichever account Ambari uses to run 'DataNode start'? Any ideas?
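In case it helps with diagnosis: the jsvc messages suggest it is looking for the JVM shared library (libjvm.so) under that Java home and not finding it. A quick sanity check along these lines (the paths simply mirror the output above and may differ on your box) shows whether the library is present and whether the JDK is 32-bit or 64-bit; note that the java -version banner above says "Server VM" rather than "64-Bit Server VM", which is usually the 32-bit build, and a 64-bit jsvc cannot load a 32-bit libjvm.so:
# does the JVM shared library exist where jsvc expects it?
ls /usr/java/jdk1.8.0_77/jre/lib/amd64/server/libjvm.so   # 64-bit layout
ls /usr/java/jdk1.8.0_77/jre/lib/i386/server/libjvm.so    # 32-bit layout
# is the installed JDK itself 32-bit or 64-bit?
file /usr/java/jdk1.8.0_77/bin/java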

It turned out Ambari doesn't automatically 'see' the changes you make to the JDK (if, like me, you have been messing with it). To solve this I recommissioned the datanode and then restarted it. It worked right away after that.
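A hedged aside for anyone else who has been swapping JDKs around: the JDK path Ambari hands to its services is recorded on the Ambari server host, so it is worth confirming it still points at a real install before restarting components. Assuming the default install locations, something like:
# on the Ambari server host: which JDK does Ambari think it should use?
grep java.home /etc/ambari-server/conf/ambari.properties
# if it is stale, re-run setup with a custom JDK path (-j), then restart
ambari-server setup -j /usr/java/jdk1.8.0_77
ambari-server restart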

Related

Windows/Drillbit Error: Could not find or load main class org.apache.drill.exec.server.Drillbit

I have set up a single-node Hadoop cluster in pseudo-distributed mode, with YARN running. I am able to use the Spark Java API to run queries as a YARN client. I wanted to go one step further and try Apache Drill on this "cluster". I installed ZooKeeper, which is running smoothly, but I am not able to start Drill and I get this log:
nohup: ignoring input
Error: Could not find or load main class
org.apache.drill.exec.server.Drillbit
Any idea?
I am on Windows 10 with JDK 1.8.
The Drill classpath is not initialized by the way you are starting the drillbit on your machine.
To start Drill on a Windows machine you need to run the sqlline.bat script, for example:
C:\bin\sqlline> sqlline.bat -u "jdbc:drill:zk=local;schema=dfs"
See more info: https://drill.apache.org/docs/starting-drill-on-windows/
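As a quick follow-up sketch (same prompt as above; the exact install path depends on where you unpacked Drill): once sqlline.bat starts in embedded mode, you can confirm the drillbit works with one of the sample queries against the classpath storage plugin that ships with Drill.
C:\bin\sqlline> sqlline.bat -u "jdbc:drill:zk=local;schema=dfs"
0: jdbc:drill:zk=local> SELECT full_name FROM cp.`employee.json` LIMIT 3;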

Changing JDK on cluster deployed with ./spark-ec2

I have deployed an Amazon EC2 cluster with Spark like so:
~/spark-ec2 -k spark -i ~/.ssh/spark.pem -s 2 --region=eu-west-1 --spark-version=1.3.1 launch spark-cluster
I copy a file I need first to the master and then from master to HDFS using:
ephemeral-hdfs/bin/hadoop fs -put ~/ANTICOR_2_10000.txt ~/user/root/ANTICOR_2_10000.txt
I have a jar I want to run which was compiled with JDK 8 (I am using a lot of Java 8 features) so I copy it over with scp and run it with:
spark/bin/spark-submit --master spark://public_dns_with_port --class package.name.to.Main job.jar -f hdfs://public_dns:~/ANTICOR_2_10000.txt
The problem is that spark-ec2 loads the cluster with JDK 7, so I am getting the "Unsupported major.minor version 52.0" error.
My question is, which are all the places where I need to change JDK7 to JDK8?
The steps I am doing thus far on master are:
Install JDK8 with yum
Use sudo alternatives --config java and change the preferred java to java-8
export JAVA_HOME=/usr/lib/jvm/openjdk-8
Do I have to do that on all the nodes? Also, do I need to change the Java path that Hadoop uses in ephemeral-hdfs/conf/hadoop-env.sh, or are there any other spots I missed?
Unfortunately, Amazon doesn't offer out-of-the-box Java 8 installations yet: see available versions.
Have you seen this post on how to install it on running instances?
Here is what I have been doing for all Java installations that differ from the versions provided by the default installation:
Configure the JAVA_HOME environment variable on each machine/node:
export JAVA_HOME=/home/ec2-user/softwares/jdk1.7.0_25
Modify the default PATH and place the "java/bin" directory before the rest of the PATH on all nodes/machines:
export PATH=/home/ec2-user/softwares/jdk1.7.0_25/bin:$M2:$SCALA_HOME/bin:$HIVE_HOME/bin:$PATH
The above needs to be done as the same OS user that runs/owns the Spark master/worker process.
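To the "all the nodes?" part of the question: yes, every worker needs the same JDK and the same JAVA_HOME, and ephemeral-hdfs/conf/hadoop-env.sh should point at it as well. A minimal sketch of pushing that out from the master, assuming the usual spark-ec2 layout with passwordless ssh and a slaves file listing one worker hostname per line (the package name and every path here are examples, so adjust them to what yum actually installed on your AMI):
JDK8=/usr/lib/jvm/java-1.8.0-openjdk        # example path, check your install
while read host; do
  ssh "$host" "yum install -y java-1.8.0-openjdk-devel &&
               echo 'export JAVA_HOME=$JDK8' >> ~/.bashrc &&
               echo 'export JAVA_HOME=$JDK8' >> ephemeral-hdfs/conf/hadoop-env.sh"
done < spark-ec2/slaves                     # slaves list on the master; adjust the path if your AMI differs
# same change on the master itself (the last definition in hadoop-env.sh wins)
echo "export JAVA_HOME=$JDK8" >> ephemeral-hdfs/conf/hadoop-env.sh
Then restart Spark and ephemeral-hdfs so the daemons pick up the new JAVA_HOME.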

DSE with Hadoop: Error in getting started

I am facing a problem in DSE with Hadoop.
Let me describe the setup, including steps in some details, for you to be able to help me.
I set up a three-node cluster of DSE, with the cluster name as 'training'. All three machines are running Ubuntu 14.04, 64-bit, 4 GB RAM.
DSE was installed using the GUI installer (with sudo). After the installation, the cassandra.yaml file was modified to set:
rpc_address: 0.0.0.0
One by one, the three nodes were started. A keyspace with replication_factor = 3 was created. Data was inserted and accessed successfully from any other node.
Then DSE was installed on a fourth machine (let's call it HadoopMachine), again with the same configuration, using the GUI installer (sudo).
/etc/default/dse was modified as follows:
HADOOP_ENABLED=1
Then, on this HadoopMachine, the following command is run:
sudo service dse start
So far so good.
Then from the installation directory:
bin/dse hadoop fs -mkdir /user/hadoop/wordcount
This fails and gives a very long series of error messages, running into hundreds of lines, ending with:
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
at com.datastax.bdp.loader.SystemClassLoader.tryLoadClassInBackground(SystemClassLoader.java:163)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:117)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:81)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:75)
at java.util.ResourceBundle$RBClassLoader.loadClass(ResourceBundle.java:503)
at java.util.ResourceBundle$Control.newBundle(ResourceBundle.java:2640)
at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1501)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:1465)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:1419)
at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1361)
at java.util.ResourceBundle.getBundle(ResourceBundle.java:890)
at sun.util.resources.LocaleData$1.run(LocaleData.java:164)
at sun.util.resources.LocaleData$1.run(LocaleData.java:160)
FATAL ERROR in native method: processing of -javaagent failed
bin/dse: line 192: 12714 Aborted (core dumped) "$HADOOP_BIN/hadoop" "$HADOOP_CMD" $HADOOP_CREDENTIALS "${#:2}"
I don't know what the problem is, and how to fix it.
Will appreciate any help. Thanks.
I managed to find the solution after a lot of struggle. I had been guessing all this time that the problem would be one misstep somewhere that was not very obvious, at least to me, and that's how it turned out.
So for the benefit of anybody else who may face the same problem, what the problem was and what worked is as follows.
The DSE documentation specifies that for DSE with integrated Hadoop you must have Oracle JRE 7. I, perhaps foolishly, assumed that meant Oracle JRE 7 or higher. So I had JRE 8 on my machine and never realized that that would be the issue. When I removed JRE 8 and installed JRE 7, bingo, it worked.
I am amazed. Now I realize that since DSE uses Hadoop 1.0.4 (an ancient version), it works with JRE 7 only. JRE 8 must have come after Hadoop 1.0.4, and something in JRE 8 must be incompatible with it, I guess.
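For anyone repeating this on Ubuntu 14.04, a hedged sketch of the switch (the JRE path is an example; point it at wherever your Oracle JRE 7 is actually unpacked):
# register the Oracle JRE 7 binary with the alternatives system (example path)
sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jre1.7.0_80/bin/java 1000
# select it as the default, then confirm
sudo update-alternatives --config java
java -version            # should now report 1.7.x
# make sure the DSE process sees the same JRE, e.g. via JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/jre1.7.0_80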

Spark Standalone Mode: Worker not starting properly in Cloudera

I am new to Spark. I installed Spark using the parcels available in Cloudera Manager.
I have configured the files as shown in the link below from Cloudera Enterprise:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.8.1/Cloudera-Manager-Installation-Guide/cmig_spark_installation_standalone.html
After this setup, I started all the Spark nodes by running /opt/cloudera/parcels/SPARK/lib/spark/sbin/start-all.sh, but I couldn't get the worker nodes running; I got the error below.
[root@localhost sbin]# sh start-all.sh
org.apache.spark.deploy.master.Master running as process 32405. Stop it first.
root@localhost.localdomain's password:
localhost.localdomain: starting org.apache.spark.deploy.worker.Worker, logging to /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain: failed to launch org.apache.spark.deploy.worker.Worker:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
localhost.localdomain: at gnu.java.lang.MainThread.run(libgcj.so.10)
localhost.localdomain: full log in /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain:starting org.apac
When I run the jps command, I get:
23367 Jps
28053 QuorumPeerMain
28218 SecondaryNameNode
32405 Master
28148 DataNode
7852 Main
28159 NameNode
I couldn't get the worker node running properly. My intention is a standalone Spark setup where the master and the worker run on a single machine. In the slaves file of the Spark directory I gave the address as "localhost.localdomain", which is my hostname. I am not familiar with this settings file. Could anyone please help me out with this installation process? I can start the master node, but I couldn't get the worker nodes running.
Thanks & Regards,
bips
Please notice the error info below:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
I met the same error when I installed and started Spark masters/workers on CentOS 6.2 x86_64, even after making sure that libgcj.x86_64 and libgcj.i686 were installed on my server. I finally solved it; below is my solution, and I hope it helps you.
It seems your JAVA_HOME environment variable is not set correctly.
Maybe your JAVA_HOME points to the system's bundled Java, e.g. java version "1.5.0" (the gcj runtime that appears as libgcj.so.10 in the log above).
Spark needs Java version >= 1.6.0. If you use Java 1.5.0 to start Spark, you will see this error.
Try export JAVA_HOME="your java home path", then start Spark again.
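A short sketch of how that usually looks in practice for a standalone setup like this one (the JDK path is an example, and in a Cloudera parcel install the conf directory may be managed for you, so treat this as the generic approach rather than the Cloudera-blessed one):
# check which java the start scripts currently pick up
readlink -f "$(which java)"
java -version                      # gcj / 1.5.x here would explain the libgcj error
# point Spark at a real JDK in conf/spark-env.sh (path mirrors the sbin path above), then restart
echo 'export JAVA_HOME=/usr/java/jdk1.7.0_67' >> /opt/cloudera/parcels/SPARK/lib/spark/conf/spark-env.sh
cd /opt/cloudera/parcels/SPARK/lib/spark/sbin && ./stop-all.sh && ./start-all.sh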

Not able to run Hadoop job remotely

I want to run a Hadoop job remotely from a Windows machine. The cluster is running on Ubuntu.
Basically, I want to do two things:
Execute the hadoop job remotely.
Retrieve the result from hadoop output directory.
I don't have any idea how to achieve this. I am using Hadoop version 1.1.2.
I tried passing the JobTracker/NameNode URL in the job configuration, but it fails.
I have tried the following example : Running java hadoop job on local/remote cluster
Result: I consistently get a "cannot load directory" error. It is similar to this post:
Exception while submitting a mapreduce job from remote system
Welcome to a world of pain. I've just implemented this exact use case, but using Hadoop 2.2 (the current stable release) patched and compiled from source.
What I did, in a nutshell, was:
Download the Hadoop 2.2 sources tarball to a Linux machine and decompress it to a temp dir.
Apply these patches which solve the problem of connecting from a Windows client to a Linux server.
Build it from source, using these instructions. That will also ensure that you have 64-bit native libs if you have a 64-bit Linux server. Make sure you fix the build files as the post instructs or the build will fail. Note that after installing protobuf 2.5 you have to run sudo ldconfig; see this post.
Deploy the resulting dist tar from hadoop-2.2.0-src/hadoop-dist/target on the server node(s) and configure it. I can't help you with that since you need to tweak it to your cluster topology.
Install Java on your client Windows machine. Make sure that the path to it has no spaces in it, e.g. c:\java\jdk1.7.
Deploy the same Hadoop dist tar you built on your Windows client. It will contain the crucial fix for the Windows/Linux connection problem.
Compile winutils and Windows native libraries as described in this Stackoverflow answer. It's simpler than building entire Hadoop on Windows.
Set up JAVA_HOME, HADOOP_HOME and PATH environment variables as described in these instructions
Use a text editor or unix2dos (from Cygwin or standalone) to convert all .cmd files in the bin and etc\hadoop directories to Windows (CRLF) line endings; otherwise you'll get weird errors about labels when running them (see the sketch after this list).
Configure the connection properties to your cluster in your config XML files, namely fs.default.name, mapreduce.jobtracker.address, yarn.resourcemanager.hostname and the like.
Add the rest of the configuration required by the patches from item 2. This is required for the client side only. Otherwise the patch won't work.
If you've managed all of that, you can start your Linux Hadoop cluster and connect to it from your Windows command prompt. Joy!
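As referenced in the unix2dos step above, a minimal sketch of the line-ending conversion from a Cygwin shell, run from the root of the Hadoop directory deployed on the Windows client (unix2dos comes from Cygwin's dos2unix package):
# convert every .cmd script to CRLF line endings so cmd.exe can parse the labels
find bin etc/hadoop -name '*.cmd' -exec unix2dos {} \;
After that, and with JAVA_HOME, HADOOP_HOME and PATH set as described above, running hadoop version from a Windows command prompt is a quick way to confirm the client install before submitting a real job.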
