I want to run a JAR in Hadoop on Google Cloud using yarn-client.
I use this command on the Hadoop master node:
spark-submit --class find --master yarn-client find.jar
but it returns this error:
15/06/17 10:11:06 INFO client.RMProxy: Connecting to ResourceManager at hadoop-m-on8g/
15/06/17 10:11:07 INFO ipc.Client: Retrying connect to server: hadoop-m-on8g/ Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
What is the problem? In case it is useful, this is my yarn-site.xml:
<?xml version="1.0" ?>
<!-- Site specific YARN configuration properties -->
The remote path, on the default FS, to store logs.

In your case, it looks like the YARN ResourceManager may be unhealthy for unknown reasons; you can try to fix yarn with the following:
sudo sudo -u hadoop /home/hadoop/hadoop-install/sbin/
sudo sudo -u hadoop /home/hadoop/hadoop-install/sbin/
However, it looks like you're using the Click-to-Deploy solution; Click-to-Deploy's Spark + Hadoop 2 deployment actually doesn't support Spark on YARN at the moment, due to some bugs and lack of memory configs. You'd normally run into something like this if you just try to run it with --master yarn-client out-of-the-box:
15/06/17 17:21:08 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1434561664937
yarnAppState: ACCEPTED
15/06/17 17:21:09 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1434561664937
yarnAppState: ACCEPTED
15/06/17 17:21:10 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: 0
appStartTime: 1434561664937
yarnAppState: RUNNING
15/06/17 17:21:15 ERROR cluster.YarnClientSchedulerBackend: Yarn application already ended: FAILED
15/06/17 17:21:15 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
15/06/17 17:21:15 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
The well-supported way to deploy a cluster on Google Compute Engine with Hadoop 2 and Spark configured to run on YARN is to use bdutil. You'd run something like:
./bdutil -P <instance prefix> -p <project id> -b <bucket> -z <zone> -d \
-e extensions/spark/ generate_config
./bdutil -e deploy
# Shorthand for logging in to the master
./bdutil -e shell
# Handy way to run a socks proxy to make it easy to access the web UIs
./bdutil -e socksproxy
# When done, delete your cluster
./bdutil -e delete
With this deployment, Spark should default to yarn-client, though you can always re-specify --master yarn-client if you want. You can see a more detailed explanation of the flags available in bdutil with ./bdutil --help. Here are the help entries just for the flags I included above:
-b, --bucket
Google Cloud Storage bucket used in deployment and by the cluster.
-d, --use_attached_pds
If true, uses additional non-boot volumes, optionally creating them on
deploy if they don't exist already and deleting them on cluster delete.
-e, --env_var_files
Comma-separated list of bash files that are sourced to configure the cluster
and installed software. Files are sourced in order with later files being
sourced last. is always sourced first. Flag arguments are
set after all sourced files, but before the evaluate_late_variable_bindings
method of see for more information.
-P, --prefix
Common prefix for cluster nodes.
-p, --project
The Google Cloud Platform project to use to create the cluster.
-z, --zone
Specify the Google Compute Engine zone to use.


Why am I unable to connect to yarn?

I'm trying to connect to yarn by doing yarn application -list. But I cannot because it says:
<date> <time> INFO client.RMProxy: Connecting to ResourceManager at /
<date> <time> INFO ipc.Client: Retrying connecting to server: Already tried 0 time(s): retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
<date> <time> INFO ipc.Client: Retrying connecting to server: Already tried 1 time(s): retry policy is RetryUpToMaximumCount
<date> <time> INFO ipc.Client: Retrying connecting to server: Already tried 2 time(s): retry policy is RetryUpToMaximumCount
I have a file under /etc/hadoop/conf.empty/yarn-site.xml, which I assume is related to this in some way. I have a file at /etc/hadoop/conf.empty/ called I tried running this file, but it didn't change anything.
Am I doing something wrong? Or maybe something is not correctly configured? How do I start yarn?
yarn-site.xml configures the YARN daemons: ResourceManager, NodeManager and ApplicationMaster. The properties relating to these services go in here. And the environment settings for YARN can be modified with
Start the YARN services. (Judging from the path of the yarn-site.xml file posted, the installation does not appear to have been done from tarballs, so the startup scripts might not be available.)
On the ResourceManager host:
sudo service hadoop-yarn-resourcemanager start
And on each NodeManager host:
sudo service hadoop-yarn-nodemanager start
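Once the services are started, one quick way to see whether the "Retrying connect to server" loop is simply a case of nothing listening is to probe the ResourceManager address directly. Here is a minimal sketch; the helper name can_connect is hypothetical, and port 8032 is an assumption (it is the stock default for yarn.resourcemanager.address, but your yarn-site.xml may override it):

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # "localhost" and 8032 are assumptions; use the values from yarn-site.xml.
    print("ResourceManager reachable:", can_connect("localhost", 8032))
```

If this prints False on the ResourceManager host itself, the daemon is most likely not running or is bound to a different address. If it prints True there but False from a client machine, look at the network or the configured hostname instead.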
Note: Make sure the preliminary configuration properties are set for both HDFS and YARN, and that the HDFS daemons (NameNode and DataNode) are started and running.
Additionally, configure MapReduce to use YARN in mapred-site.xml.
You need to start the Hadoop services; at least you need to start:
These shell scripts are located in the Hadoop bin folder.
Depending on the installation, you may also need to start the history server.
If this is the first time you start Hadoop, you need to format the namenode; otherwise the DFS service will not start.

Why does my yarn application not have logs even with logging enabled?

I have enabled logs in the xml file: yarn-site.xml, and I restarted yarn by doing:
sudo service hadoop-yarn-resourcemanager restart
sudo service hadoop-yarn-nodemanager restart
I ran my application, and then I see the applicationID in yarn application -list. So, I do this: yarn logs -applicationId <application ID>, and I get the following:
hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/ does not have any log files
Do I need to change some other configuration? Or am I accessing the logs the wrong way?
Thank you.
yarn application -list
will list only the applications that are either in SUBMITTED, ACCEPTED or RUNNING state.
Log aggregation collects each container's logs and moves them into the directory configured in yarn.nodemanager.remote-app-log-dir only after the application has completed. Refer to the description of the yarn.log-aggregation-enable property.
So the application listed by the command has not completed yet and its logs have not been collected. Hence the response when trying to access the logs of a running application:
hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/ does not have any log files
You can try the same command yarn logs -applicationId <application ID> to view the logs once the application has completed.
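As a rough illustration of that rule (a sketch only, not part of the yarn CLI), the check below treats an application as having aggregated logs available only once it is in one of YARN's terminal states. The function name is hypothetical, and the state string is whatever the State column of yarn application -list -appStates ALL shows:

```python
# Terminal YARN application states; anything else is still in progress.
FINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def aggregated_logs_expected(app_state):
    """Return True if the application has completed, so aggregation can have run."""
    return app_state.strip().upper() in FINAL_STATES


if __name__ == "__main__":
    for state in ("ACCEPTED", "RUNNING", "FINISHED"):
        print(state, aggregated_logs_expected(state))
```

This assumes the default behaviour, where logs are uploaded once at the end of the application.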
To list all the FINISHED applications, use
yarn application -list -appStates FINISHED
Or to list all the applications
yarn application -list -appStates ALL
Enable Log Aggregation
Log aggregation is enabled in the yarn-site.xml file. The yarn.log-aggregation-enable property enables log aggregation for running applications.
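To double-check what your yarn-site.xml actually sets, you can read its properties programmatically. The sketch below is an assumption-laden example: read_properties is a hypothetical helper, and the sample document (including the /var/log/hadoop-yarn/apps value) is made up for illustration rather than taken from the question.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <property> name to its value in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Illustrative sample only; in practice pass the contents of your yarn-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/var/log/hadoop-yarn/apps</value>
  </property>
</configuration>"""

props = read_properties(SAMPLE)
# Fall back to "false", the stock default, when the property is absent.
print("aggregation enabled:", props.get("yarn.log-aggregation-enable", "false"))
```

Keep in mind that the file on disk only tells you what the daemons will read on their next start; the NodeManagers must be restarted to pick up a change.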
In version 2.3.2 of hadoop and higher you can get log aggregation to occur hourly on running jobs using this configuration in yarn-site.xml:
It was probably saved with another appOwner. You can try specifying the application owner in your command:
yarn logs -appOwner <owner> -applicationId <application ID>
ROOT CAUSE: When log aggregation has been enabled, each user's application logs will, by default, be placed in the directory hdfs:///app-logs/<USER>/logs/<APPLICATION_ID>. By default, only the user that submitted the job and members of the hadoop group have access to read the log files. In the example directory listing below you can see that the permissions are 770: no access for anyone other than the owner and members of the hadoop group.
[root@mycluster ~]$ hdfs dfs -ls /app-logs
Found 3 items
drwxrwx--- - hive hadoop 0 2017-03-10 15:33 /app-logs/hive
drwxrwx--- - user1 hadoop 0 2017-03-10 15:37 /app-logs/user1
drwxrwx--- - spark hadoop 0 2017-03-10 15:39 /app-logs/spark
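To make the 770 reasoning concrete, here is a small sketch that applies the owner/group/other bits from one line of that listing to a given user. The helper name can_read_dir is hypothetical, and the sketch deliberately ignores HDFS ACLs, the sticky bit and the HDFS superuser, all of which can change the real answer.

```python
def can_read_dir(ls_line, user, groups):
    """Decide from one `hdfs dfs -ls` line whether user may list the directory.

    Only the classic owner/group/other permission bits are considered.
    """
    fields = ls_line.split()
    perms, owner, group = fields[0], fields[2], fields[3]
    if user == owner:
        bits = perms[1:4]
    elif group in groups:
        bits = perms[4:7]
    else:
        bits = perms[7:10]
    # Listing a directory needs read (r); entering it needs execute (x).
    return bits[0] == "r" and bits[2] == "x"


LINE = "drwxrwx--- - user1 hadoop 0 2017-03-10 15:37 /app-logs/user1"
print(can_read_dir(LINE, "user1", {"user1"}))   # the owner
print(can_read_dir(LINE, "hive", {"hadoop"}))   # a member of the hadoop group
print(can_read_dir(LINE, "alice", {"users"}))   # anyone else
```

With the listing above, only the last call comes back False, which matches the "no access for anyone else" reading of 770.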
SOLUTION: The message above can be deceiving and does not necessarily indicate that log aggregation has not been enabled. To obtain yarn logs for an application the 'yarn logs' command must be executed as the user that submitted the application. In the example below the application was submitted by user1. If we execute the same command as above as the user 'user1' we should get the following output if log aggregation has been enabled.
yarn logs -applicationId application_1473860344791_0001
16/09/19 23:10:33 INFO impl.TimelineClientImpl: Timeline service address:
16/09/19 23:10:33 INFO client.RMProxy: Connecting to ResourceManager at
16/09/19 23:10:34 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/09/19 23:10:34 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
Container: container_e03_1473860344791_0001_01_000001 on mycluster.somedomain.com_45454
Log Upload Time:Wed Sep 14 09:44:15 -0400 2016
Log Contents:
End of LogType:stderr
REFERENCE: The following document describes how to use log aggregation to collect logs for long-running YARN applications.

Running an Oozie job

I'm trying to configure Oozie to work on my hadoop-2.7.1 cluster. Everything else seems to work fine: YARN, Hue, MapReduce and Spark. Jobs sent with the yarn jar... command finish correctly, but when I submit a job with Oozie, either from the CLI (oozie job ... -run) or from Hue, the job gets stuck at 33% and the node logs show this:
2015-11-06 06:08:56,121 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at localhost/
2015-11-06 06:08:57,165 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/ Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
I don't use port 18030 anywhere in my configuration. I should probably change the hostname from localhost to the network hostname, but where do I configure it? I've tried changing yarn.resourcemanager.scheduler.address, but that wasn't it.
I run oozie job -config examples/apps/shell/ -run with containing:
The error is occurring while trying to contact the Resource Manager.
That log line is printed by the class named in it, org.apache.hadoop.yarn.client.RMProxy, which logs "Connecting to ResourceManager at " + rmAddress.
When you use Oozie with MRv1, the value of jobTracker in the job's properties file is set to the JobTracker's address:
jobTracker={JobTracker Host}:{JobTracker Port}
But when you migrate your Oozie job to MRv2, you need to change that file so that the jobTracker value points to the ResourceManager address:
jobTracker={RM Host}:{RM Port}
Please refer to the link here:
jobTracker = Variable to define the resource manager address in case of Yarn implementation. Format: <resourcemanager_hostname>:<port>
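If you want to sanity-check the value before submitting, a small script can confirm that jobTracker really has the <host>:<port> shape. This is only a sketch: both helper names are hypothetical, the parser handles plain key=value lines only, and the host names and ports in the sample are placeholders rather than values from the question.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def resource_manager_endpoint(props):
    """Split jobTracker into (host, port); raise ValueError if malformed."""
    host, sep, port = props.get("jobTracker", "").rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("jobTracker must look like <host>:<port>")
    return host, int(port)


# Placeholder values for illustration only.
SAMPLE = (
    "nameNode=hdfs://nn-host.example.com:8020\n"
    "jobTracker=rm-host.example.com:8032\n"
)
print(resource_manager_endpoint(parse_properties(SAMPLE)))
```

A value such as plain localhost, with no port, raises ValueError here. The check cannot tell you whether the host and port are the right ones for your cluster; it only catches malformed entries.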
I went through the Hadoop source code. The only place where port "18030" is being used is in "SLS" (Yarn Scheduler Load Simulator).
SLS has a yarn-site.xml file (present at location: \hadoop-tools\hadoop-sls\src\main\sample-conf\yarn-site.xml), which has following configuration:
<description>The address of the scheduler interface.</description>
From your description, it seems the yarn-site.xml that is being used, is similar to the one used by SLS.

HBase fails to start in single node cluster mode on Mac OSX

I am trying to get a personal HBase development environment set up. I have hdfs and yarn running, but cannot get HBase to start.
I have started up hadoop 2.7.1, by running and I have verified these are running by testing hdfs dfs -mkdir /test and running a sample MR job bundled in the examples, I have browsed HDFS at port 50070.
I have started zookeeper 3.4.6 on port 2181 and set its dataDir. My zoo.cfg has:
I observe its zookeeper_server.PID file in the dataDir I chose, and when I run jps I see the below:
51074 NodeManager
50743 DataNode
50983 ResourceManager
50856 SecondaryNameNode
57848 QuorumPeerMain
58731 Jps
50653 NameNode
QuorumPeerMain above matches the PID in zookeeper_server.PID, as I would expect. Is this expectation correct? From what I have done so far, should it be expected that any more processes should be showing here?
I installed hbase-1.1.2. I configure hbase-site.xml. I set the hbase.rootDir to be hdfs://localhost:8200/hbase, my hdfs is running at localhost:8200. I set to my zookeeper's dataDir, with the expectation that it will use this property to find the PID of a running zookeeper. Is this expectation correct or have I misunderstood? The config in hbase-site.xml is:
When I run my server fails to start. I see this log message:
2015-09-26 19:32:43,617 ERROR [main] master.HMasterCommandLine: Master exiting
To investigate I ran hbase master start and get more detail:
2015-09-26 19:41:26,403 INFO [Thread-1] server.NIOServerCnxn: Stat command output
2015-09-26 19:41:26,405 INFO [Thread-1] server.NIOServerCnxn: Closed socket connection for client / (no session established for client)
2015-09-26 19:41:26,406 INFO [main] zookeeper.MiniZooKeeperCluster: Started MiniZooKeeperCluster and ran successful 'stat' on client port=2182
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
2015-09-26 19:41:26,406 ERROR [main] master.HMasterCommandLine: Master exiting Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(
at org.apache.hadoop.hbase.master.HMaster.main(
So I have a few questions:
Should I be trying to set up a zookeeper before running HBase?
Why when I have started a zookeeper and told HBase where its dataDir is, does HBase try to start its own zookeeper?
Anything obviously stupid/misguided in the above?
The script you are using to start hbase will try to start the following components, in order:
hbase master
hbase regionserver
hbase master-backup
So you could either stop the ZooKeeper instance that you started yourself, or start the daemons individually:
# start hbase master
bin/ --config ${HBASE_CONF_DIR} start master
# start region server
bin/ --config ${HBASE_CONF_DIR} --hosts ${HBASE_CONF_DIR}/regionservers start regionserver
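The "Could not start ZK at requested port of 2181" message in the question means something already holds that port. Before starting HBase you can check this yourself; the sketch below does it by attempting a bind. The helper name port_is_free is hypothetical, and probing 127.0.0.1 is an assumption about where your ZooKeeper listens.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is bound to host:port, judged by trying to bind."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
        return True


if __name__ == "__main__":
    # 2181 is the client port mentioned in the question.
    print("port 2181 free:", port_is_free(2181))
```

If this reports that 2181 is taken while your own ZooKeeper is running, that is consistent with HBase's bundled ZooKeeper falling back to 2182 and the master then aborting.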
Standalone HBase starts its own ZooKeeper (if you run, but if it fails to start or keep running, the other HBase daemons that need it won't work.
Make sure you explicitly set the properties for your interface lo0 in the hbase-site.xml file:
I found that when my Wi-Fi was on, ZooKeeper failed to start if these entries were missing.

Couldn't start hadoop datanode normally

I am trying to install Hadoop 2.2.0 and I am getting the following error while starting the DataNode service. Please help me resolve this issue. Thanks in advance.
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627@prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968- (storage id DS-1960076343- service to localhost/ Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968- (storage id DS-1960076343- service to localhost/
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968- (storage id DS-1960076343-
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down DataNode at prassanna-Studio-1558/
Make sure your configuration and paths are correct.
This is a link for running Hadoop on Ubuntu.
I used it to set up Hadoop on my machine and it works fine.
That simply shows that the datanode tried to startup but took some exception and died.
Please check the datanode log under the logs folder in the hadoop installation folder (unless you changed that config) for exceptions. It usually points to a configuration issue of some kind, esp. network settings (/etc/hosts) related but there are quite a few possibilities.
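In the log posted in the question, the exception is already visible: the FATAL line reports "Incompatible clusterIDs", with one ID for the namenode and a different one for the datanode directory. When a log is long, a small script can pull that pair out for you. This is a sketch with a hypothetical helper name, and the pattern is written against the wording of the FATAL line in the question, so other Hadoop versions may phrase it differently.

```python
import re

# Matches the "namenode clusterID = ...; datanode clusterID = ..." fragment.
CLUSTER_ID_RE = re.compile(
    r"namenode clusterID = (?P<namenode>CID-[0-9a-f-]+); "
    r"datanode clusterID = (?P<datanode>CID-[0-9a-f-]+)"
)


def find_cluster_id_mismatch(log_text):
    """Return (namenode_id, datanode_id) from the first matching line, or None."""
    match = CLUSTER_ID_RE.search(log_text)
    if match is None:
        return None
    return match.group("namenode"), match.group("datanode")


# Excerpt of the FATAL line from the question.
SAMPLE = (
    "Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: "
    "namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; "
    "datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9"
)
print(find_cluster_id_mismatch(SAMPLE))
```

Two different IDs, as here, mean the datanode storage directory was initialised against a different namenode format than the one currently running.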
Refer to this:
1. Check JAVA_HOME:
readlink -f $(which java)
2. If Java is not available, install it with:
sudo apt-get install default-jdk
Then repeat step 1 and check in the terminal:
java -version
javac -version
3. Configure SSH
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it (which is what we want to do in this short tutorial). For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the hadoop user.
sudo apt-get install ssh
sudo su hadoop
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/ >> $HOME/.ssh/authorized_keys
ssh localhost
Download and extract hadoop-2.7.3 (choose a directory with read and write permission)
Set Environment Variable
sudo gedit .bashrc
source .bashrc
Setup Configuration Files
The following files will have to be modified to complete the Hadoop setup:
~/.bashrc (Already done)
gedit (PATH)/etc/hadoop/
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
gedit (PATH)/etc/hadoop/core-site.xml:
The (HOME)/etc/hadoop/core-site.xml file contains configuration properties that Hadoop uses when starting up.
This file can be used to override the default settings that Hadoop starts with.
($ sudo mkdir -p /app/hadoop/tmp)
Open the file and enter the following in between the <configuration></configuration> tag:
gedit /usr/local/hadoop/etc/hadoop/core-site.xml
<description>A base for other temporary directories.</description>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
By default, the (PATH)/etc/hadoop/ folder contains (PATH)/etc/hadoop/mapred-site.xml.template file which has to be renamed/copied with the name mapred-site.xml:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
The mapred-site.xml file is used to specify which framework is being used for MapReduce.
We need to enter the following content in between the <configuration></configuration> tag:
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
The (PATH)/etc/hadoop/hdfs-site.xml file needs to be configured for each host in the cluster that is being used.
It is used to specify the directories which will be used as the namenode and the datanode on that host.
Before editing this file, we need to create two directories which will contain the namenode and the datanode for this Hadoop installation.
This can be done using the following commands:
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
Open the file and enter the following content in between the <configuration></configuration> tag:
gedit (PATH)/etc/hadoop/hdfs-site.xml
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
Format the New Hadoop Filesystem
Now, the Hadoop file system needs to be formatted so that we can start to use it. The format command should be issued with write permission since it creates current directory under /usr/local/hadoop_store/ folder:
bin/hadoop namenode -format
bin/hdfs namenode -format
Now start the hdfs
CHECK URL: http://localhost:50070/
