Giraph job always running in local mode - hadoop

I ran Giraph 1.1.0 on Hadoop 2.6.0.
The mapred-site.xml looks like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs. Can be one of
local, classic or yarn.</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>8192</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx3072m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx6144m</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>4</value>
</property>
</configuration>
The giraph-site.xml looks like this:
<configuration>
<property>
<name>giraph.SplitMasterWorker</name>
<value>true</value>
</property>
<property>
<name>giraph.logLevel</name>
<value>error</value>
</property>
</configuration>
I do not want to run the job in local mode. I have also set the environment variable MAPRED_HOME to HADOOP_HOME. This is the command I use to run the program:
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 1
When I run this code, which computes the betweenness centrality of the vertices in a graph, I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:168)
at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:236)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.runMain(BetweennessComputation.java:214)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.main(BetweennessComputation.java:218)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
What should I do to ensure that the job does not run in local mode?

I met this problem just a few days ago. Fortunately, I solved it by doing the following.
Modify the configuration file mapred-site.xml: make sure the value of the property 'mapreduce.framework.name' is 'yarn', and add the property 'mapreduce.jobtracker.address' with the value 'yarn' if it is not there.
The mapred-site.xml looks like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>yarn</value>
</property>
</configuration>
Restart Hadoop after modifying mapred-site.xml. Then run your program, setting the value after '-w' to more than 1 and 'giraph.SplitMasterWorker' to 'true'. It will probably work.
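On Hadoop 2.x, restarting usually means cycling both HDFS and YARN; a minimal sketch, assuming the standard $HADOOP_HOME/sbin layout:
$HADOOP_HOME/sbin/stop-yarn.sh && $HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh && $HADOOP_HOME/sbin/start-yarn.sh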
As for the cause of the problem, I will just quote somebody else's explanation:
These properties are designed for single-node executions and will have to be
changed when executing things in a cluster of nodes. In such a situation, the
jobtracker has to point to one of the machines that will be executing a
NodeManager daemon (a Hadoop slave). As for the framework, it should be
changed to 'yarn'.

We can see in the stack trace that the configuration check fails in LocalJobRunner. This is a bit misleading, because it makes us assume we are running in local mode. You already found the responsible configuration option, giraph.SplitMasterWorker, and in your case you set it to true. However, with the last command-line parameter, 1, you specify that only a single worker should be used, so the framework decides that you MUST be running in local mode. As a solution you have two options:
Set giraph.SplitMasterWorker to false, although you are running on a cluster.
Increase the number of workers by changing the last parameter of the command-line call:
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 4
Please also refer to my other answer on SO (Apache Giraph master / worker mode) for details on the problem concerning local mode.

If you want to split the master from the workers, you can use:
-ca giraph.SplitMasterWorker=true
To specify the number of workers, you can use:
-w #
where "#" is the number of workers you want to use.

Related

ERROR datanode.DataNode: Exception in secureMain

I was trying to install Hadoop on Windows.
The NameNode works fine, but the DataNode does not. The following error is displayed again and again, even after several retries.
This is the error shown on CMD for the DataNode:
2021-12-16 20:24:32,624 INFO checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/C:/Users/mtalha.umair/datanode
2021-12-16 20:24:32,624 ERROR datanode.DataNode: Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value configured for dfs.datanode.failed.volumes.tolerated - 1. Value configured is >= to the number of configured volumes (1).
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:176)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2714)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2756)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2900)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924)
2021-12-16 20:24:32,640 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value configured for dfs.datanode.failed.volumes.tolerated - 1. Value configured is >= to the number of configured volumes (1).
2021-12-16 20:24:32,640 INFO datanode.DataNode: SHUTDOWN_MSG:
I have referred to many different articles, but to no avail. I have tried another version of Hadoop, but the problem remains. As I am just starting out, I can't fully understand the problem, so I need help.
These are my configurations:
For core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
For mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
For yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
For hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/D:/big-data/hadoop-3.1.3/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>datanode</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>1</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Well, unfortunately, the reason this is failing is exactly what the message says. Let me try to say it another way.
dfs.datanode.failed.volumes.tolerated = 1
The number of data directories (dfs.datanode.data.dir) you have configured is 1.
You are saying you will tolerate the failure of one data drive while only one drive is configured, i.e. that you will tolerate having no data drives at all. This does not make sense, which is why it is being raised as an issue: the DataNode needs (configured volumes - tolerated failures) >= 1 to keep running.
You need a gap of at least 1 between the two numbers, so that a failure still leaves the DataNode a working volume.
Here are your options:
1. Configure more data volumes (2), with dfs.datanode.failed.volumes.tolerated set to 1. For example, store data on both your C: and D: drives (see the sketch below).
2. Set dfs.datanode.failed.volumes.tolerated to 0 and keep your data volumes as they are (1).
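For example, option 1 could look like this in hdfs-site.xml; the exact directories are illustrative (one per physical drive):
<property>
<name>dfs.datanode.data.dir</name>
<value>/C:/big-data/datanode,/D:/big-data/datanode</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>1</value>
</property>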

How to limit the number of map tasks that run simultaneously on each DataNode

Env:
Hadoop 3.0.0
1 NameNode, 5 DataNodes
I configured mapred-site.xml as follows, to limit each node to only 3 simultaneously running map tasks:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>3</value>
<description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>3</value>
<description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
</property>
But when I run the TestDFSIO benchmark using the following command, the actual number of running map tasks peaks at 8, so the setting does not seem to work:
yarn jar /opt/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-tests.jar \
TestDFSIO -storagePolicy HOT -write \
-nrFiles 500 -fileSize 1000MB -resFile /tmp/DFSIO-write.out
Any help will be appreciated.
That config parameter is from old Hadoop 1.x, and as far as I can see you are using 3.0.0. Try this one instead:
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>3</value>
</property>
You should set it in yarn-site.xml on every host that runs a NodeManager.
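Note that YARN admits containers by memory as well as CPU (the latter only when the DominantResourceCalculator is enabled), so the number of concurrent map containers on a node is roughly min(yarn.nodemanager.resource.cpu-vcores / mapreduce.map.cpu.vcores, yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb). A sketch of capping via memory instead, with an illustrative value that allows three map containers at the default mapreduce.map.memory.mb of 1024:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>3072</value>
</property>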

Hadoop 1.2.1 is running in local mode despite mapred.job.tracker being set

I am trying to submit a giraph job to a hadoop 1.2.1 cluster. The cluster has a name node master, a map reduce master, and four slaves. The job is failing with the following exception:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have only one worker since only 1 task at a time!
However, here is my mapred-site.xml file:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>job.tracker.private.ip:9001</value>
</property>
<property>
<name>mapreduce.job.counters.limit</name>
<value>1000</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>50</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>50</value>
</property>
</configuration>
and my core-site.xml file:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://name.node.private.ip:9000</value>
</property>
</configuration>
Additionally, my job tracker's masters file contains its private IP, and its slaves file contains the private IPs of the four slaves. The name node's masters file contains its private IP, and its slaves file contains the private IPs of the four slaves.
I thought that setting the mapred.job.tracker field to the IP of the map reduce master would make Hadoop boot with a remote job runner, but apparently not. How can I fix this?
The problem wasn't that Hadoop was running in local job mode; the problem was that giraph, configured on another machine, assumed that Hadoop was running in local job mode.
I was submitting the job via gremlin, and I needed to add the following line to its configuration file:
mapred.job.tracker=job.tracker.private.ip:9001
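For context, a sketch of the relevant lines of that properties file; the fs.default.name line is my assumption, mirroring the core-site.xml above, and only the mapred.job.tracker line comes from the actual fix:
fs.default.name=hdfs://name.node.private.ip:9000
mapred.job.tracker=job.tracker.private.ip:9001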

Configure job memory in Hadoop 1.2.0

I need to set the -Xmx property of a job running on a data node.
On the task tracker node I tried putting the properties
<property>
<name>mapred.map.java.opts</name>
<value>-Xmx64m</value>
</property>
<property>
<name>mapred.reduce.java.opts</name>
<value>-Xmx64m</value>
</property>
into conf/core-site.xml,
but it doesn't have any effect on submitted jobs; I still see the java process with -Xmx200m in the process list.
Please advise.
Try using:
<property>
<name>mapred.map.child.java.opts</name>
<value>-Xmx64m</value>
</property>
<property>
<name>mapred.reduce.child.java.opts</name>
<value>-Xmx64m</value>
</property>
in your conf/mapred-site.xml on each data node.
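Incidentally, the -Xmx200m you still see is the Hadoop 1.x default of mapred.child.java.opts, which applies to both map and reduce JVMs; overriding that single property is an alternative sketch:
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx64m</value>
</property>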

Failed to get system directory - hadoop

Using a Hadoop multi-node setup (1 master, 1 slave).
After starting start-mapred.sh on the master, I found the error below in the TaskTracker logs on the slave:
org.apache.hadoop.mapred.TaskTracker: Failed to get system directory
Can someone help me understand what can be done to avoid this error?
I am using
Hadoop 1.2.0
jetty-6.1.26
java version "1.6.0_23"
mapred-site.xml file
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
<description>
define mapred.map tasks to be number of slave hosts
</description>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>1</value>
<description>
define mapred.reduce tasks to be number of slave hosts
</description>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hduser/workspace</value>
</property>
</configuration>
It seems that you just added hadoop.tmp.dir and started the job. You need to restart the Hadoop daemons after adding any property to the configuration files. You have specified in your comment that you added this property at a later stage. This means that all the data and metadata, along with other temporary files, is still in the /tmp directory. Copy all of it from there into your /home/hduser/workspace directory, restart Hadoop, and re-run the job.
Do let me know the result. Thank you.
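A minimal sketch of those steps, assuming the Hadoop 1.x default layout of hadoop.tmp.dir under /tmp/hadoop-<user> (adjust the source path to what is actually in your /tmp):
stop-all.sh
cp -r /tmp/hadoop-hduser/* /home/hduser/workspace/
start-all.sh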
If it is a Windows PC and you are using Cygwin to run Hadoop, then the task tracker will not work.
