I am a newbie to Hadoop. I have successfully configured a Hadoop setup in pseudo-distributed mode. I want to have multiple reducers with the option -D mapred.reduce.tasks=2 (with hadoop-streaming). However, there is still only one reducer.
According to what I found on Google, I'm fairly sure that mapred.LocalJobRunner limits the number of reducers to 1. But I wonder, is there any workaround to get more reducers?
My Hadoop configuration files:
[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/admin/hadoop-data/tmp</value>
</property>
</configuration>
[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/admin/hadoop-data/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/admin/hadoop-data/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
The way I start the job:
[admin@localhost string-count-hadoop]$ cat hadoop-startjob.sh
#!/bin/sh
~/hadoop-1.1.2/bin/hadoop jar ~/hadoop-1.1.2/contrib/streaming/hadoop-streaming-1.1.2.jar \
-D mapred.job.name=string-count \
-D mapred.reduce.tasks=2 \
-mapper mapper \
-file mapper \
-reducer reducer \
-file reducer \
-input $1 \
-output $2
[admin@localhost string-count-hadoop]$ ./hadoop-startjob.sh /z/programming/testdata/items_sequence /z/output
packageJobJar: [mapper, reducer] [] /tmp/streamjob837249979139287589.jar tmpDir=null
13/07/17 20:21:10 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/07/17 20:21:10 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/17 20:21:10 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/17 20:21:11 WARN mapred.LocalJobRunner: LocalJobRunner does not support symlinking into current working dir.
...
...
Try modifying core-site.xml's property
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
to:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000/</value>
</property>
Put an extra / after 9000 and restart all the daemons.
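In case it helps, here is a rough sketch of the restart-and-verify steps, reusing the paths from the question (stop-all.sh and start-all.sh are the standard Hadoop 1.x control scripts):
# restart the daemons so the core-site.xml change takes effect
~/hadoop-1.1.2/bin/stop-all.sh
~/hadoop-1.1.2/bin/start-all.sh
# re-submit the streaming job
./hadoop-startjob.sh /z/programming/testdata/items_sequence /z/output
# once the job no longer falls back to LocalJobRunner, the two reducers
# should produce two output files (part-00000 and part-00001)
~/hadoop-1.1.2/bin/hadoop fs -ls /z/output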
I am trying to set up Hadoop on my local machine. However, when I run the wordcount MapReduce example (I did hdfs namenode -format earlier), it fails.
This may be hard to read, but I end up with: "Job failed with state FAILED due to: Application failed 2 times due to AM Container exited with exitCode: -1000 Failing this attempt. Diagnostics: No space available in any of the local directories."
I don't understand why I have this error. This is what my applications and attempts look like (screenshot not included):
I followed several tutorials, ending up with these parameters:
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>
%HADOOP_HOME%\etc\hadoop,
%HADOOP_HOME%\share\hadoop\common\*,
%HADOOP_HOME%\share\hadoop\common\lib\*,
%HADOOP_HOME%\share\hadoop\hdfs\*,
%HADOOP_HOME%\share\hadoop\hdfs\lib\*,
%HADOOP_HOME%\share\hadoop\mapreduce\*,
%HADOOP_HOME%\share\hadoop\mapreduce\lib\*,
%HADOOP_HOME%\share\hadoop\yarn\*,
%HADOOP_HOME%\share\hadoop\yarn\lib\*
</value>
</property>
</configuration>
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/hadoop-3.3.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///C:/hadoop-3.3.0/data/datanode</value>
</property>
</configuration>
Can you help me with this, please? I already tried what is mentioned in these questions:
Hadoop errorcode -1000, No space available in any of the local directories (except for the part about emptying the namenode cache).
Hadoop Windows setup. Error while running WordCountJob: "No space available in any of the local directories"
What do you think?
Thank you!
I have set up a cluster using this guide: https://medium.com/@jootorres_11979/how-to-set-up-a-hadoop-3-2-1-multi-node-cluster-on-ubuntu-18-04-2-nodes-567ca44a3b12
Currently I have one datanode and one master node.
What happens when I run a Hadoop job is that the datanode's network activity indicates it is sending a lot of data, and the namenode receives that data. Also, the namenode's CPU is fully utilized while the datanode's CPU is not used at all. See the figure (not included here).
The nodes are VMs on the same machine. This happens for several different scripts; the figure is from running a WordCount algorithm.
Why is the work not being performed on the datanode? What could cause such a behavior?
Any help is appreciated.
According to the guide, mapred-site.xml was not changed, so the default values are used. The default for mapreduce.framework.name is "local", which means all calculations are performed locally. This must be changed to "yarn".
I created the following mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value> $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
</value>
</property>
</configuration>
I also had to change yarn-site.xml to:
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
After restarting YARN and Hadoop, everything worked as expected. The work is now performed on the datanodes.
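As a quick sanity check, a sketch (assuming the two-node layout from the guide and that the Hadoop sbin scripts are on the PATH):
# restart YARN so the new configuration is picked up
stop-yarn.sh
start-yarn.sh
# the datanode should now register as a NodeManager
yarn node -list
# running applications (whose containers run on the datanodes) are listed by
yarn application -list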
I am new to Hadoop and I am trying to start the YARN daemons using start-yarn.sh.
Below are my config files:
core-site.xml:
<?xml version="1.0"?>
<!-- core-site.xml -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<?xml version="1.0"?>
<!-- hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?>
<!-- mapred-site.xml -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml:
<?xml version="1.0"?>
<!-- yarn-site.xml -->
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
I can start HDFS and the history server properly with:
start-dfs.sh --config $HADOOP_CONF_DIR (my config files)
mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
Both http://localhost:50070/ and http://localhost:19888 give me the correct pages. When I run start-yarn.sh --config $HADOOP_CONF_DIR, here is the output in the console:
start-yarn.sh --config $HADOOP_CONF_DIR
starting yarn daemons
starting resourcemanager, logging to /usr/lib/hadoop-2.5.2/logs/yarn-yyang-resourcemanager-yyang-ubuntu.out
2017-03-26 17:37:31,051 INFO [main] resourcemanager.ResourceManager (StringUtils.java:startupShutdownMessage(619)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting ResourceManager
STARTUP_MSG: host = yyang-ubuntu/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.5.2
STARTUP_MSG: classpath = /usr/lib/hadoop-2.5.2/conf_local/hadoop:... (long list of jar entries under /usr/lib/hadoop-2.5.2/share/hadoop/{common,hdfs,yarn,mapreduce} omitted) ...:/usr/lib/hadoop-2.5.2/conf_local/hadoop/rm-config/log4j.properties
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0; compiled by 'jenkins' on 2014-11-14T23:45Z
STARTUP_MSG: java = 1.8.0_121
************************************************************/
The output seems OK to me (maybe I missed the error). The ResourceManager's web UI does not give me the correct page (the site cannot be reached). But jps gives me:
6081 Jps
5554 JobHistoryServer
4443 SecondaryNameNode
4237 NameNode
which does not include the ResourceManager.
I am using the configuration from the book Hadoop: The Definitive Guide, 4th Edition.
Please help me fix the problem.
Refer to this for the installation issue:
https://stackoverflow.com/questions/22240488/couldnt-start-hadoop-datanode-normally/45671270#45671270
Meanwhile, put only this under your yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
and mapred-site.xml should be:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
and restart Hadoop.
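A sketch of the restart-and-check sequence, based on the Hadoop 2.5.2 layout shown in the question:
# restart the YARN daemons
stop-yarn.sh
start-yarn.sh
# ResourceManager (and NodeManager) should now appear in the process list
jps
# if ResourceManager is still missing, the cause is usually near the end of its log
tail -n 50 /usr/lib/hadoop-2.5.2/logs/yarn-*-resourcemanager-*.log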
I'm trying to run a wordcount jar on a Hadoop 2.7.1 cluster (one master and 4 slaves), but the MapReduce job gets stuck at:
$ hadoop jar wc.jar WordCount /input /output_hocine
17/03/13 09:41:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/13 09:41:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/03/13 09:41:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/03/13 09:41:44 INFO input.FileInputFormat: Total input paths to process : 3
17/03/13 09:41:44 INFO mapreduce.JobSubmitter: number of splits:3
17/03/13 09:41:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1489393376058_0003
17/03/13 09:41:44 INFO impl.YarnClientImpl: Submitted application application_1489393376058_0003
17/03/13 09:41:44 INFO mapreduce.Job: The url to track the job: http://ibnbadis21:8088/proxy/application_1489393376058_0003/
17/03/13 09:41:44 INFO mapreduce.Job: Running job: job_1489393376058_0003
The output, as seen in the browser, is shown in this image (not included here).
Here is the content of the configuration files:
core-site.xml:
<configuration>
<!-- <property>
<name>fs.defaultFS</name>
<value>hdfs://ibnbadis21:9000</value>
</property>-->
<property>
<name>fs.default.name</name>
<value>hdfs://ibnbadis21:9000</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
yarn-site.xml:
<?xml version="1.0"?> <configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>ibnbadis21:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>ibnbadis21:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user/app</value>
</property>
</configuration>
hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:/usr/local/hadoop_data/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_data/hdfs/datanode</value>
</property>
</configuration>
Can anyone tell me how I can solve this problem, please?
Connecting to ResourceManager at /0.0.0.0:8032
0.0.0.0 (the default) is not a valid hostname.
So, add this in yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value> YOUR VALUE HERE </value> <!-- Needs Fully Qualified Domain Name -->
</property>
There are many values that you probably didn't set.
Refer Hadoop | Configuring the Hadoop Daemons
By the way, fs.defaultFS is the correct property to use.
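A small sketch of how to fill in and verify the value (the hostname shown is an assumption; use your master's actual FQDN):
# on the master: the fully qualified domain name to put in the property
hostname -f
# after editing yarn-site.xml on all nodes, restart YARN
stop-yarn.sh
start-yarn.sh
# clients should now connect to <master-fqdn>:8032 instead of 0.0.0.0:8032
yarn node -list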
Finally, the problem was about access rights. The framework did not have permission to read my yarn-site.xml file; that's why it used the default value 0.0.0.0:8030. So when I executed the command with privileges (sudo):
sudo hadoop jar wc.jar WordCount /input /output
my MapReduce job executed successfully!
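Instead of running every job with sudo, it is usually enough to make the configuration file readable by the submitting user; a sketch (the path is an assumption, adjust it to your HADOOP_CONF_DIR):
# check, then fix, the permissions on the config file
ls -l /usr/local/hadoop/etc/hadoop/yarn-site.xml    # hypothetical path
chmod 644 /usr/local/hadoop/etc/hadoop/yarn-site.xml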
I am trying to install and start Hadoop 1.1.2 in Cygwin on Windows 7. I get the following issue when attempting to run a simple job:
bin/hadoop jar hadoop-*-examples.jar pi 10 100
13/04/26 17:56:10 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/username/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
My configuration is as follows:
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
The answer is in the logs. There will be a specific exception detailed there, most likely a file access problem requiring you to chmod -R 755 some directory and its contents.
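A sketch of that log check, assuming the Hadoop 1.1.2 tarball layout from the question:
# find and inspect the DataNode log
ls ~/hadoop-1.1.2/logs/
tail -n 100 ~/hadoop-1.1.2/logs/hadoop-*-datanode-*.log
# if it reports a permission problem on a storage directory, fix it, e.g.:
chmod -R 755 /path/to/dfs/data    # hypothetical path; use the one named in the log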
This error is not because you are trying to run Hadoop on Windows. It's because there is some problem with your DataNode. Along with the point Chris Gerken made, there could be some other reasons as well. I answered a similar question recently; you should have a look at it: Upload data to HDFS running in Amazon EC2 from local non-Hadoop Machine
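Two quick checks for the "could only be replicated to 0 nodes" symptom, as a sketch using standard Hadoop 1.x commands:
# is a DataNode process running at all?
jps
# does the NameNode see any live datanodes?
~/hadoop-1.1.2/bin/hadoop dfsadmin -report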