I have installed Hadoop 2.x and it is running fine on Windows 8; jps shows:
10916 NameNode
1588 DataNode
3332 Jps
4200 ResourceManager
2444 NodeManager
I have also installed Hive on Windows, but when I start Hive it throws this error:
"Missing hadoop installation: G:\hadoop\winutils must be set"
HADOOP_HOME is already set to G:\hadoop\winutils in the environment variables.
Please help here.

You have set HADOOP_HOME incorrectly. It should point to the Hadoop installation directory, the one that contains the bin folder.
Try the following.
In User variables, set HADOOP_HOME to the following value.
In System variables, add the following value to the existing Path value.
If that configuration does not work for you, try this instead.
Open a command prompt and set HADOOP_HOME (and the path) for that session:
C:\>set HADOOP_HOME=D:\Hadoop-2.8.1
Now start the Hadoop services from the same command prompt, then open the Hive shell.
Hope this helps.
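If you have a POSIX shell on the Windows machine (Git Bash or Cygwin, which is an assumption on my part), a quick sanity check on the directory layout could look like the sketch below. The function name check_hadoop_home is made up for illustration, and it assumes you pass the path with forward slashes (for example D:/Hadoop-2.8.1).

```shell
# Hypothetical helper: succeeds only if the given directory looks like a
# Hadoop home on Windows, i.e. it contains bin/winutils.exe.
check_hadoop_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "HADOOP_HOME is empty" >&2
    return 1
  fi
  if [ ! -f "$dir/bin/winutils.exe" ]; then
    echo "no bin/winutils.exe under: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}
```

Calling check_hadoop_home with the value you put in HADOOP_HOME should print "ok: ..." when the variable points at the installation root; if it complains about a missing bin/winutils.exe, the variable is probably one level too deep or too shallow.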


"JAVA_HOME is not set and could not be found" error when installing Hadoop

I'm new to Hadoop. During installation I set a JAVA_HOME path in hadoop-env.sh, but when I run hdfs namenode -format it says that JAVA_HOME is not set. When I checked again, the value was still saved in hadoop-env.sh. I can't bring HDFS up because of this. A detailed explanation would be much appreciated.
Thank you. I've attached a screenshot for reference as well.
Can you restart the HDFS service after adding JAVA_HOME to hadoop-env.sh?
Also try running echo $JAVA_HOME before running the hadoop namenode -format command.
Make sure you have set the environment variable correctly.
Reference: Hadoop-Pseudo Distributed Mode
Hope this helps.
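To see what hadoop-env.sh actually exports, rather than trusting the editor, something like the following rough sketch may help. The function name java_home_from_env_file is hypothetical, and the file location used in the note afterwards ($HADOOP_HOME/etc/hadoop/hadoop-env.sh) assumes a typical Hadoop 2.x layout.

```shell
# Hypothetical helper: print the value of the last uncommented
# "export JAVA_HOME=..." line in the given file (raw text, quotes included).
java_home_from_env_file() {
  sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$1" | tail -n 1
}
```

Running java_home_from_env_file "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" and comparing the result with echo $JAVA_HOME shows whether the file holds a literal path or nothing useful. An empty result means no uncommented export line was found.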

spark-shell throws an error in Apache Spark

I have installed Hadoop on Ubuntu in VirtualBox (host OS Windows 7). I have also installed Apache Spark, configured SPARK_HOME in .bashrc, and added HADOOP_CONF_DIR to spark-env.sh. Now when I start spark-shell, it throws an error and does not initialize the Spark context or the SQL context. Am I missing something in the installation? I would also like to run it on a cluster (a 3-node Hadoop cluster is set up).
I had the same issue when trying to install Spark locally on Windows 7. Please make sure the paths below are correct, and I am sure it will work for you. I answered the same question in this link, so you can follow the steps below.
Create JAVA_HOME variable: C:\Program Files\Java\jdk1.8.0_181
Add the following part to your path: ;%JAVA_HOME%\bin
Create SPARK_HOME variable: C:\spark-2.3.0-bin-hadoop2.7
Add the following part to your path: ;%SPARK_HOME%\bin
The most important part: the Hadoop path must contain a bin folder that holds winutils.exe, for example C:\Hadoop\bin. Make sure winutils.exe is located inside this folder.
Create HADOOP_HOME Variable: C:\Hadoop
Add the following part to your path: ;%HADOOP_HOME%\bin
Now open cmd and run spark-shell; it should work.

Could not format the NameNode in Hadoop 2.6?

I have installed Hadoop 2.6 on Ubuntu 14.04. I just followed this blog.
When I try to format the NameNode, I hit the error below:
hduser@data1:~$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
/usr/local/hadoop/bin/hdfs: line 276: /home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
/home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
This error occurs because the JAVA_HOME you have provided does not contain a java binary.
Just add this line in hadoop-env.sh and /home/hduser/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
I think you have already set $JAVA_HOME, but incorrectly (just a guess).
It should be:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
You probably added ~ before the path when you exported JAVA_HOME, and that prepended the home directory /home/hduser.
To check, run java -version to see whether Java works, then run echo $JAVA_HOME and verify the path manually.
I figured it out. The entry we made was for amd64, but the machine is really i386. Please verify the path; that should fix the issue.
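All three answers come down to JAVA_HOME naming a directory that has no bin/java in it. A small sketch for listing the directories that would be valid candidates is below; the function name list_java_homes is invented for illustration, and /usr/lib/jvm in the note is simply the base directory seen in the paths above.

```shell
# Hypothetical helper: print every subdirectory of the given base directory
# that contains an executable bin/java, one per line.
list_java_homes() {
  for d in "$1"/*; do
    if [ -x "$d/bin/java" ]; then
      echo "$d"
    fi
  done
}
```

Running list_java_homes /usr/lib/jvm should make an amd64 versus i386 mismatch obvious, because only the directories that really exist are printed. Any line of the output can be used as the value for export JAVA_HOME=.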

Hadoop+HBase cluster on windows: winutils not found

I'm trying to set up a fully-distributed 4-node dev cluster with Hadoop 2.20 and HBase 0.98 on Windows. I've built Hadoop on Windows successfully, and more recently also built HBase on Windows.
We have successfully run the wordcount example from the Hadoop installation guide, as well as a custom WebHDFS job. As fully-distributed HBase on Windows isn't supported yet, I'm running HBase under cygwin.
When trying to start hbase from my master (./bin/start-hbase.sh), I get the following error:
2014-04-17 16:22:08,599 ERROR [main] util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1514)
at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:113)
at org.apache.hadoop.hbase.zookeeper.ZKServerTool.main(ZKServerTool.java:46)
Looking at the Shell.java source, the value that appears as null here seems to come from the HADOOP_HOME environment variable. With Hadoop under D:/hadoop and HBase under the cygwin root at C:/cygwin/root/usr/local/hbase, the cygwin $HADOOP_HOME variable is /cygdrive/d/hadoop/ and the Windows system environment variable %HADOOP_HOME% is D:\hadoop. It seems to me that with those two variables set, the path should be found correctly...
Also potentially relevant: I'm running Windows Server 2012 x64.
Edit: I have verified that there actually is a winutils.exe in D:\hadoop\bin\ .
We've found it. So, in Hadoop's Shell.java, you'll find that there are two options to communicate the Hadoop-path.
// first check the Dflag hadoop.home.dir with JVM scope
String home = System.getProperty("hadoop.home.dir");
// fall back to the system/user-global env variable
if (home == null) {
  home = System.getenv("HADOOP_HOME");
}
After trial and error, we found that in the HBase options (HBase's hbase-env.sh, HBASE_OPTS variable), you'll need to add in this option with the Windows(!) path to Hadoop. In our case, we needed to add -Dhadoop.home.dir=D:/hadoop .
Good luck to anyone else who happens to stumble across this ;).
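For reference, the line we ended up with in hbase-env.sh looks roughly like the fragment below. D:/hadoop is the path from our setup; the commented-out cygpath variant is an untested assumption for anyone who would rather derive the Windows path from the cygwin $HADOOP_HOME.

```shell
# hbase-env.sh (fragment)
# Pass the Windows-style Hadoop path to the JVM as hadoop.home.dir,
# keeping whatever options were already present in HBASE_OPTS.
export HBASE_OPTS="${HBASE_OPTS:-} -Dhadoop.home.dir=D:/hadoop"

# Assumption, not from our setup: under cygwin, cygpath -m converts a
# /cygdrive/... path to the mixed form (drive letter, forward slashes).
# export HBASE_OPTS="${HBASE_OPTS:-} -Dhadoop.home.dir=$(cygpath -m "$HADOOP_HOME")"
```

Because the system property is checked before the environment variable in the snippet quoted above, this takes effect even when the JVM does not see HADOOP_HOME.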

Spark Standalone Mode: Worker not starting properly in Cloudera

I am new to Spark. I installed Spark using the parcels available in Cloudera Manager.
I configured the files as shown in the link below from Cloudera Enterprise:
After this setup, I started all the Spark nodes by running /opt/cloudera/parcels/SPARK/lib/spark/sbin/start-all.sh. However, the worker nodes would not run, and I got the error below.
[root@localhost sbin]# sh start-all.sh
org.apache.spark.deploy.master.Master running as process 32405. Stop it first.
root@localhost.localdomain's password:
localhost.localdomain: starting org.apache.spark.deploy.worker.Worker, logging to /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain: failed to launch org.apache.spark.deploy.worker.Worker:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
localhost.localdomain: at gnu.java.lang.MainThread.run(libgcj.so.10)
localhost.localdomain: full log in /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain:starting org.apac
When I run the jps command, I get:
23367 Jps
28053 QuorumPeerMain
28218 SecondaryNameNode
32405 Master
28148 DataNode
7852 Main
28159 NameNode
I couldn't get the worker node to run properly. I intended to install a standalone Spark where the master and worker run on a single machine. In the slaves file in the Spark directory, I gave the address as "localhost.localdomain", which is my host name. I am not familiar with this settings file. Could anyone please help me with this installation? In short, I can start the master node, but not the worker nodes.
Thanks & Regards,
Please note the error line below:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
I hit the same error when I installed and started the Spark master and workers on CentOS 6.2 x86_64, even after making sure that libgcj.x86_64 and libgcj.i686 were installed on my server. I finally solved it; my solution is below, and I hope it helps you.
It seems that your JAVA_HOME environment variable is not set correctly.
Your JAVA_HOME may point to the Java bundled with the system, e.g. java version "1.5.0".
Spark needs Java version >= 1.6.0. If you are using Java 1.5.0 to start Spark, you will see this error info.
Try to export JAVA_HOME="your java home path", then start Spark again.
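If you want to check which Java the start scripts would pick up before launching Spark, a rough sketch along these lines might help. Both function names are made up for illustration; the parser assumes the first line of java -version looks like java version "1.5.0", as in the example above.

```shell
# Hypothetical helper: read "java -version" output on stdin and print the
# quoted version string from its first line, e.g. 1.5.0
parse_java_version() {
  sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Hypothetical helper: succeed if a version string such as 1.5.0,
# 1.8.0_181 or 11.0.2 is at least 1.6.
java_version_at_least_1_6() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of the rest
  if [ "$major" -gt 1 ]; then
    return 0
  fi
  [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 6 ]
}
```

Since java -version writes to stderr, the version can be captured with "$JAVA_HOME/bin/java" -version 2>&1 | parse_java_version and then passed to java_version_at_least_1_6. If that check fails, JAVA_HOME is most likely still pointing at the system Java.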
