We have a 4-datanode HDFS cluster. There is a large amount of space available on each data node, about 98 GB, but when I look at the datanode information,
it is only using about 10 GB.
How can we make it use all 98 GB and not run out of space, as indicated in the image?
This is the hdfs-site.xml on the name node:
This is the hdfs-site.xml on the data node:
The 98 GB is under /test.
Please let us know if we missed anything in the configuration.
Look at dfs.datanode.data.dir in hdfs-site.xml. This property controls which directories are used to store DFS blocks.
So on your machines, run "df -h", which should list all the mount points that make up the 98 GB. Then, for each mount point, decide which directory can be used to store HDFS block data and add those directories, comma-separated, to dfs.datanode.data.dir in hdfs-site.xml. Then restart the namenode and all datanode services.
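As a rough illustration of that last step, the Python sketch below joins a list of candidate directories into the comma-separated value and renders the property element. The directory names (`/test/hdfs/data`, `/data2/hdfs/data`) and the helper names are hypothetical; substitute the directories you picked from the `df -h` output.

```python
import xml.etree.ElementTree as ET


def build_data_dir_value(directories):
    """Join the chosen block-storage directories into one comma-separated value."""
    cleaned = [d.strip() for d in directories if d.strip()]
    return ",".join(cleaned)


def render_property(name, value):
    """Render a single <property> element as it would appear in hdfs-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical directories, one per mount point that should hold HDFS blocks.
    dirs = ["/test/hdfs/data", "/data2/hdfs/data"]
    print(render_property("dfs.datanode.data.dir", build_data_dir_value(dirs)))
```

The value must not contain spaces around the commas, which is why each entry is stripped before joining.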
And from your edited post:
It should not be file://. It should look like:
Same for other properties.
I have configured my Hadoop system in WSL and run the wordcount example. But when I try to view the history of the job, the tracking URL cannot be accessed.
The job runs fine, and the JobHistory server is running as well.
The history tracking url is my wsl hostname:8088/proxy/application_1585482453915_0002/.
You can see the url above.
But I can still access localhost:19888/jobhistory to see my job history.
How does this problem occur? Is it a configuration problem?
My Hadoop version is 2.7.1.
My core-site.xml
<description>Abase for other temporary directories.</description>
My hdfs-site.xml
My mapred-site.xml
My yarn-site.xml
<description>Whether virtual memory limits will be enforced for containers</description>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
My /etc/hosts localhost DESKTOP-U1EOV4J.localdomain DESKTOP-U1EOV4J
The JobHistoryServer daemon is running on localhost, whereas the tracking URL is constructed with the hostname, thus redirecting to DESKTOP-U1EOV4J.localdomain.
For a Pseudo distributed cluster, it is safer to leave the host of JobHistoryServer to be
Update the job history server properties in mapred-site.xml
and restart the JobHistoryServer.
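A minimal sketch of the two properties involved, written in Python so the output can be pasted into mapred-site.xml. `mapreduce.jobhistory.address` and `mapreduce.jobhistory.webapp.address` are the standard property names, and 10020 and 19888 are their default ports. The `localhost` host in the example is an assumption based on the URL that already works in the question, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def jobhistory_properties(host, ipc_port=10020, web_port=19888):
    """Return the JobHistoryServer address properties as (name, value) pairs."""
    return [
        ("mapreduce.jobhistory.address", f"{host}:{ipc_port}"),
        ("mapreduce.jobhistory.webapp.address", f"{host}:{web_port}"),
    ]


def render_properties(pairs):
    """Render (name, value) pairs as <property> elements, one per line."""
    lines = []
    for name, value in pairs:
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        lines.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(lines)


if __name__ == "__main__":
    # "localhost" is an assumed host for a pseudo-distributed setup.
    print(render_properties(jobhistory_properties("localhost")))
```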
I have looked through this StackOverflow post, but it hasn't helped me much.
I am trying to get YARN working on an existing cluster. So far we have been using the Spark standalone manager as our resource allocator, and it has been working as expected.
This is a basic overview of our architecture. Everything in the white boxes runs in Docker containers.
From master-machine I can run the following command from within the yarn resource manager container and get a spark-shell running that uses yarn: ./pyspark --master yarn --driver-memory 1G --executor-memory 1G --executor-cores 1 --conf "spark.yarn.am.memory=1G"
However, if I try to run the same command from client-machine within the jupyter container I get the following error in the YARN-UI.
Application application_1512999329660_0001 failed 2 times due to AM
Container for appattempt_1512999329660_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://master-machine:5000/proxy/application_1512999329660_0001/Then, click on links to logs of each attempt.
Diagnostics: File file:/sparktmp/spark-58732bb2-f513-4aff-b1f0-27f0a8d79947/__spark_libs__5915104925224729874.zip does not exist
java.io.FileNotFoundException: File file:/sparktmp/spark-58732bb2-f513-4aff-b1f0-27f0a8d79947/__spark_libs__5915104925224729874.zip does not exist
I can find file:/sparktmp/spark-58732bb2-f513-4aff-b1f0-27f0a8d79947/ on the client-machine, but I am unable to find spark-58732bb2-f513-4aff-b1f0-27f0a8d79947 on the master-machine.
As a note, spark-shell works from the client-machine when it points to the standalone spark manager on the master machine.
No logs are printed to the yarn log directories on the worker-machines either.
If I run a spark-submit on spark/examples/src/main/python/pi.py I get the same error as above.
Here is the yarn-site.xml
<description>YARN hostname</description>
<!-- <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value> -->
<!-- <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value> -->
<description>The address of the RM web application.</description>
<description>The address of the scheduler interface.</description>
<description>The address of the applications manager interface in the RM.</description>
<description>The address of the RM admin interface.</description>
<description>Set to false, to avoid ip check</description>
<description>Maximum number of applications in the system which
can be concurrently active both running and pending</description>
<description>Whether to use preemption. Note that preemption is experimental
in the current version. Defaults to false.</description>
<description>Whether to allow multiple container assignments in one
heartbeat. Defaults to false.</description>
And here is the spark.conf:
# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
spark.driver.port 7011
spark.fileserver.port 7021
spark.broadcast.port 7031
spark.replClassServer.port 7041
spark.akka.threads 6
spark.driver.cores 4
spark.driver.memory 32g
spark.master yarn
spark.deploy.mode client
spark.blockManager.port 7051
spark.executor.port 7101
spark.port.maxRetries 10
spark.local.dir /sparktmp
spark.scheduler.mode FAIR
spark.ui.port 4140
# http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation
spark.dynamicAllocation.enabled false
spark.shuffle.service.enabled false
spark.shuffle.service.port 7061
spark.dynamicAllocation.initialExecutors 5
spark.dynamicAllocation.minExecutors 0
spark.dynamicAllocation.maxExecutors 8
spark.dynamicAllocation.executorIdleTimeout 60s
spark.executor.logs.rolling.maxRetainedFiles 5
spark.executor.logs.rolling.strategy size
spark.executor.logs.rolling.maxSize 100000000
# Testing
# spark.driver.extraJavaOptions -Dcom.sun.management.jmxremote.port=8897 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
# Spark Yarn Configs
spark.hadoop.yarn.resourcemanager.address <master-machine IP>:8032
spark.hadoop.yarn.resourcemanager.hostname master-machine
And this shell script is run on all the machines:
# The main ones
export CONDA_DIR=/cluster/conda
export HADOOP_HOME=/usr/hadoop
export SPARK_HOME=/usr/spark
export JAVA_HOME=/usr/java/latest
export PATH=$PATH:$SPARK_HOME/bin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$CONDA_DIR/bin:/cluster/libs-python:/cluster/batch
export PYTHONPATH=/cluster/libs-python:$SPARK_HOME/python:$PY4JPATH:$PYTHONPATH
export SPARK_CLASSPATH=/cluster/libs-java/*:/cluster/libs-python:$SPARK_CLASSPATH
# Core spark configuration
export PYSPARK_PYTHON="/cluster/conda/bin/python"
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Duser.timezone=UTC+02:00"
export SPARK_WORKER_DIR="/sparktmp"
export SPARK_LOCAL_IP=$(hostname -I | cut -f1 -d " ")
export SPARK_PUBLIC_DNS=$(hostname -I | cut -f1 -d " ")
export SPARK_MASTER_OPTS="-Duser.timezone=UTC+02:00"
This is the hdfs-site.xml on the master-machine (name node):
<!-- 1000Mbit/s -->
And this is the hdfs-site.xml on the worker-machines (data-node):
<!-- 1000Mbit/s -->
This is the core-site.xml on the worker-machines (datanodes)
This is the core-site.xml on the master-machine (name node):
After a lot of debugging I was able to identify that for some reason the jupyter container was not looking in the correct hadoop conf directory even though the HADOOP_HOME environment variable was pointing to the correct location. All I had to do to resolve the above problem was to point HADOOP_CONF_DIR to the correct directory and everything started working again.
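A quick way to check this from inside the Jupyter container is sketched below. It reads `HADOOP_CONF_DIR` and reports which expected site files are missing from that directory. The list of required files and the helper name are assumptions for illustration, not part of Spark or Hadoop.

```python
import os

# Files a YARN client normally needs; this list is an assumption for illustration.
REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")


def missing_conf_files(conf_dir, required=REQUIRED_FILES):
    """Return the required site files that are absent from conf_dir.

    An unset or nonexistent directory means every required file is reported.
    """
    if not conf_dir or not os.path.isdir(conf_dir):
        return list(required)
    return [name for name in required
            if not os.path.isfile(os.path.join(conf_dir, name))]


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR")
    missing = missing_conf_files(conf_dir)
    if missing:
        print(f"HADOOP_CONF_DIR={conf_dir!r} is missing: {', '.join(missing)}")
    else:
        print(f"HADOOP_CONF_DIR={conf_dir!r} looks complete")
```

When the client cannot find core-site.xml, Hadoop falls back to the local file system as the default file system, which would be consistent with the `file:/sparktmp/...` path in the error: the Spark libraries are staged on the client's local disk, where the YARN nodes cannot read them.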
I have configured high availability in my cluster, which consists of three nodes:
hadoop-master (name node)
hadoop-slave-1 (another name node)
hadoop-slave-2 (data node)
I did this without formatting the name node (that is, by converting a non-HA cluster to an HA-enabled one), as described here.
However, both name nodes came up as standby,
so I tried to transition one of them to active with the following command:
hdfs haadmin -transitionToActive mycluster --forcemanual
with the following output:
17/04/03 08:07:35 WARN ha.HAAdmin: Proceeding with manual HA state management even though
automatic failover is enabled for NameNode at hadoop-master/
17/04/03 08:07:36 WARN ha.HAAdmin: Proceeding with manual HA state management even though
automatic failover is enabled for NameNode at hadoop-slave-1/
Illegal argument: Unable to determine service address for namenode 'mycluster'
My core-site.xml is:
My hdfs-site.xml is:
What should the service address value be? And what possible solutions can I apply
to switch one of the two name nodes to the active state?
Note: the ZooKeeper server is stopped on all three nodes.
I hit the same issue, and it turned out that I hadn't formatted ZooKeeper and started ZKFC.
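For reference, in Hadoop 2.x those two steps are usually `hdfs zkfc -formatZK` (run once from one NameNode host, with ZooKeeper running) and `hadoop-daemon.sh start zkfc` on each NameNode host. Separately, `hdfs haadmin -transitionToActive` expects a NameNode ID listed under `dfs.ha.namenodes.<nameservice>`, not the nameservice name itself, which would explain the "Unable to determine service address for namenode 'mycluster'" message. The Python sketch below reads those IDs from an hdfs-site.xml document. The sample XML, the IDs `nn1` and `nn2`, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(site_xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(site_xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def namenode_ids(site_xml_text, nameservice):
    """Return the NameNode IDs configured for the given nameservice."""
    raw = read_properties(site_xml_text).get(f"dfs.ha.namenodes.{nameservice}", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


# Hypothetical hdfs-site.xml fragment; the IDs nn1 and nn2 are placeholders.
SAMPLE = """
<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
</configuration>
"""

if __name__ == "__main__":
    for nn_id in namenode_ids(SAMPLE, "mycluster"):
        # Prints the command form only; nothing is executed here.
        print(f"hdfs haadmin -getServiceState {nn_id}")
```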
I will paste all my configuration below. I have a cluster of 3 computers. Configuration of namenode 1 (impc2361)
I have copied the same configuration to the other nodes as well, that is, namenode2 (impc2359) and the datanode (impc2391).
I don't get the web page of namenode1 (impc2361) when I enter impc2361.htcitmr:50070 in the browser.
It throws an error:
Problem accessing /dfshealth.jsp.
I do get the web page of namenode2 (impc2359) when I enter impc2359.htcitmr:50070, but I can't find the folder /home1 that was set in core-site.xml.
I am not able to perform any operations on the cluster from my terminal, because it throws an error saying it is read-only:
hadoop fs -mkdir /a
mkdir: InternalDir of ViewFileSystem is readonly; operation=mkdirsPath=/a
Please kindly help
I want to use GridGain with Hadoop 2.4.0.
My Hadoop config is below:
After finishing the setup, I started HDFS.
Then I ran:
hadoop fs -ls /
ls: No FileSystem for scheme: ggfs
What should I do?
Add the following to core-site.xml:
The second version of the Hadoop FileSystem API is rarely used. Most parts of the Hadoop ecosystem work through the first version of the API.
And if you want to use only GGFS, you don't need to start the HDFS services.
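Hadoop reports `No FileSystem for scheme: ggfs` when it finds no `fs.ggfs.impl` binding (first API version), even if `fs.AbstractFileSystem.ggfs.impl` (second API version) is set. The Python sketch below checks a core-site.xml document for both bindings. The helper name and the class name in the sample are placeholders, not GridGain's actual class names.

```python
import xml.etree.ElementTree as ET


def filesystem_bindings(core_site_text, scheme):
    """Report which FileSystem API bindings are configured for a URI scheme."""
    root = ET.fromstring(core_site_text)
    names = {(p.findtext("name") or "").strip() for p in root.findall("property")}
    return {
        "v1": f"fs.{scheme}.impl" in names,
        "v2": f"fs.AbstractFileSystem.{scheme}.impl" in names,
    }


# Hypothetical core-site.xml with only the second-version binding configured.
SAMPLE = """
<configuration>
  <property>
    <name>fs.AbstractFileSystem.ggfs.impl</name>
    <value>example.PlaceholderV2FileSystem</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    bindings = filesystem_bindings(SAMPLE, "ggfs")
    if not bindings["v1"]:
        print("fs.ggfs.impl is not set; 'hadoop fs' commands cannot resolve ggfs://")
```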