I am using Hadoop to process a large set of data. I set up a Hadoop node to use multiple volumes: one of these volumes is a NAS with a 10 TB disk, and the other is the server's local disk with a storage capacity of 400 GB.
The problem, if I understand correctly, is that data nodes attempt to place an equal amount of data on each volume. Thus, when I run a job on a large set of data, the 400 GB disk quickly fills up while the 10 TB disk still has plenty of space remaining. Then the MapReduce jobs produced by Hive freeze because my cluster enters safe mode...
I tried to set the property that limits the data node's disk usage, but it did not help: I still have the same problem.
Hope that someone could help me.
Well, it seems that my MapReduce job triggers safe mode because:
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990.
I saw that error on the NameNode web interface. I want to relax this check with the dfs.safemode.threshold.pct property, but I do not know if that is a good way to solve it.
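For reference, this is roughly what overriding that threshold would look like in hdfs-site.xml (dfs.safemode.threshold.pct is the Hadoop 1.x property name, and 0.999f is its shipped default; a lower value lets the NameNode leave safe mode with fewer blocks reported):
<property>
<name>dfs.safemode.threshold.pct</name>
<value>0.999f</value>
</property>
You can also force the NameNode out of safe mode manually with hadoop dfsadmin -safemode leave. Note, though, that a reported ratio of 0.0000 usually means no DataNode has reported any blocks at all, so either workaround may only mask the underlying problem.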
I think you can turn to dfs.datanode.fsdataset.volume.choosing.policy for help.
<property>
<name>dfs.datanode.fsdataset.volume.choosing.policy</name>
<value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
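If memory serves, the policy can also be tuned with two companion properties; the values below are what I believe to be the shipped defaults, so adjust them to your volumes:
<property>
<name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
<value>10737418240</value>
<description>Volumes whose free space differs by less than this many bytes (10 GB here) are treated as balanced.</description>
</property>
<property>
<name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
<value>0.75</value>
<description>Fraction of new block allocations sent to the volumes with more available space.</description>
</property>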
Use the dfs.datanode.du.reserved configuration setting in $HADOOP_HOME/conf/hdfs-site.xml for limiting disk usage.
Reference
<property>
<name>dfs.datanode.du.reserved</name>
<!-- cluster variant -->
<value>182400</value>
<description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
</description>
</property>
Related
I have a 3-node Hadoop 2.7.3 cluster which can be described as follows:
Node A: 25gb, DataNode, NameNode
Node B: 50gb, DataNode
Node C: 25gb, DataNode
The problem is that node A has high disk usage (about 95%). What I would like to achieve is to limit the disk usage so that it never exceeds 85%.
I tried setting the dfs.namenode.resource.du.reserved property to about 3 GB, but it does not solve my problem: as soon as the available disk space drops below that value, my Hadoop enters safe mode immediately.
I know that all required resources must be available for the NN to continue operating and that the NN will continue to operate as long as any redundant resource is available.
Also, I know about the dfs.namenode.edits.dir.required property, which defines required resources, but I don't think that making an NN resource redundant instead of required is a good idea.
So my question is as in the title. How can I tell Hadoop: "Hey, listen. This is a DataNode, put anything you want here, but if the disk usage goes above 85%, don't panic; just stop putting anything here and continue doing your thing on the rest of the DNs"?
Am I missing something? Is it even possible? If not, then what would you guys suggest me to do?
There is a process called Namenode resource checker which scans the Namenode storage volumes for available free disk space. Whenever the available free space falls below the value specified in the dfs.namenode.resource.du.reserved property (default 100MB), it forces the Namenode to enter safemode.
Setting it to 3 GB tells the resource checker to expect that much free space on this node. But the DataNode will happily consume all of the available free space for its own block storage, without regard for the NameNode's disk requirements.
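For reference, a minimal sketch of the setting the question mentions; 104857600 bytes is the 100 MB default:
<property>
<name>dfs.namenode.resource.du.reserved</name>
<value>104857600</value>
<description>The amount of free space the NameNode resource checker requires on each NameNode storage volume; if any required volume falls below this, the NN enters safe mode.</description>
</property>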
To limit the DataNode's disk usage on this particular node, add this property to hdfs-site.xml:
<property>
<name>dfs.datanode.du.reserved</name>
<value>3221225472</value>
<description>3GB of disk space reserved for non DFS usage.
This space will be left unconsumed by the Datanode.
</description>
</property>
Change the reserved space value as per your required threshold.
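As a rough illustration of how to translate the 85% target into this setting (my arithmetic, not from the original answer): capping usage at 85% of node A's 25 GB disk means reserving the remaining 15%, about 3.75 GB, i.e. 4026531840 bytes:
<property>
<name>dfs.datanode.du.reserved</name>
<value>4026531840</value>
<description>About 3.75 GB (15% of a 25 GB disk) reserved for non-DFS usage, capping DataNode usage at roughly 85%.</description>
</property>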
I will have 200 million files in my HDFS cluster. We know each file will occupy about 150 bytes in NameNode memory, plus about 150 bytes for each of its 3 blocks, so a file takes roughly 600 bytes in the NN in total.
So I gave my NN 250 GB of memory to comfortably handle the 200 million files. My question is: will such a big heap of 250 GB put too much pressure on GC? Is a 250 GB heap for the NN feasible at all?
The ideal NameNode memory size is roughly the total space used by the metadata of the data, plus the OS, plus the size of the daemons, plus 20-30% headroom for processing-related data.
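As a rough sanity check using the question's own figures (150 bytes per NameNode object, with 1 file object plus 3 block objects per file; my arithmetic, for illustration only):
# objects = 200M files * (1 + 3) = 800M objects
metadata ≈ 800M * 150 bytes ≈ 120 GB
with 20-30% headroom ≈ 144-156 GB
So a 250 GB heap leaves a comfortable margin for the OS and daemons.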
You should also consider the rate at which data comes into your cluster. If data arrives at 1 TB/day, you must plan for more memory or you will soon run out.
It's always advised to keep at least 20% of memory free at any point in time; this helps avoid the NameNode going into a full garbage collection.
As Marco mentioned earlier, you may refer to NameNode Garbage Collection Configuration: Best Practices and Rationale for GC configuration.
In your case 256 GB looks good if you aren't going to ingest a lot of new data or run lots of operations on the existing data.
Refer: How to Plan Capacity for Hadoop Cluster?
Also refer: Select the Right Hardware for Your New Hadoop Cluster
You can have a physical memory of 256 GB in your NameNode. If your data grows in huge volumes, consider HDFS federation. I assume you already have multiple cores (with or without hyperthreading) in the NameNode host. The link below should address your GC concerns:
https://community.hortonworks.com/articles/14170/namenode-garbage-collection-configuration-best-pra.html
I have a machine with 24 GB of memory and want to run HDFS, MapReduce, and HBase on it, but I want to split the memory between MapReduce and HBase.
I want HBase to use a maximum of 15 GB of RAM and MapReduce to get 8 GB max. What is the best way to achieve this?
In your configuration files you can specify how much memory each process is allowed to use.
HBase is controlled by hbase-env.sh.
MapReduce is controlled by mapred-site.xml.
There are good comments in each of these files to help you locate the exact property you are looking for.
The trickiest one is MapReduce: depending on how many task slots you want, you have to divide the maximum memory you are willing to give MapReduce by the number of slots you make available on the machine. So, if you only want one slot, you can set the max child memory to 8 GB; 2 slots gives you 4 GB each; and so on.
You can also mark the child memory setting as final in mapred-site.xml so that the tasks themselves can't override it.
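A minimal sketch of both settings, assuming two slots and the classic (MRv1) property names; treat the exact numbers as illustrative:
In hbase-env.sh (value in MB, roughly 15 GB):
export HBASE_HEAPSIZE=15000
In mapred-site.xml (8 GB split across 2 slots):
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx4096m</value>
<final>true</final>
</property>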
I have a single-node instance of Apache Hadoop 1.1.1 with default parameter values (see e.g. [1] and [2]) on a machine with a lot of RAM and very limited free disk space. I notice that this Hadoop instance wastes a lot of disk space during map tasks. Which configuration parameters should I pay attention to in order to take advantage of the high RAM capacity and decrease disk space usage?
You can use several of the mapred.* params to compress map output, which will greatly reduce the amount of disk space needed to store mapper output. See this question for some good pointers.
Note that different compression codecs come with different trade-offs (e.g. GZip needs more CPU than LZO, but you have to install LZO yourself). This page has a good discussion of compression issues in Hadoop, although it is a bit dated.
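On a 1.x release such as 1.1.1, map-output compression is enabled in mapred-site.xml along these lines (a sketch; the built-in GzipCodec is just one possible choice):
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>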
The amount of RAM you need depends upon what you are doing in your MapReduce jobs, although you can increase your heap size via the following setting:
conf/mapred-site.xml: mapred.map.child.java.opts
See cluster setup for more details on this.
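Something like this, where the 1 GB value is only an example:
<property>
<name>mapred.map.child.java.opts</name>
<value>-Xmx1024m</value>
</property>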
You can use dfs.datanode.du.reserved in hdfs-site.xml to specify an amount of disk space the DataNode won't use. I don't know whether Hadoop is able to compensate with higher memory usage.
You'll have a problem, though, if you run a MapReduce job that's disk-I/O intensive. I don't think any amount of configuring will help you then.
Can anyone give a detailed analysis of the memory consumption of the NameNode? Or is there some reference material? I cannot find anything on the net. Thank you!
I suppose the memory consumption depends on your HDFS setup: it scales with the overall size of the HDFS and is relative to the block size.
From the Hadoop NameNode wiki:
Use a good server with lots of RAM. The more RAM you have, the bigger the file system, or the smaller the block size.
From https://twiki.opensciencegrid.org/bin/view/Documentation/HadoopUnderstanding:
Namenode: The core metadata server of Hadoop. This is the most critical piece of the system, and there can only be one of these. This stores both the file system image and the file system journal. The namenode keeps all of the filesystem layout information (files, blocks, directories, permissions, etc) and the block locations. The filesystem layout is persisted on disk and the block locations are kept solely in memory. When a client opens a file, the namenode tells the client the locations of all the blocks in the file; the client then no longer needs to communicate with the namenode for data transfer.
The same site recommends the following:
Namenode: We recommend at least 8GB of RAM (minimum is 2GB RAM), preferably 16GB or more. A rough rule of thumb is 1GB per 100TB of raw disk space; the actual requirements is around 1GB per million objects (files, directories, and blocks). The CPU requirements are any modern multi-core server CPU. Typically, the namenode will only use 2-5% of your CPU.
As this is a single point of failure, the most important requirement is reliable hardware rather than high performance hardware. We suggest a node with redundant power supplies and at least 2 hard drives.
For a more detailed analysis of memory usage, check this link out:
https://issues.apache.org/jira/browse/HADOOP-1687
You also might find this question interesting: Hadoop namenode memory usage
There are several technical limits of the NameNode (NN), and hitting any of them will limit your scalability.
Memory. The NN consumes about 150 bytes per block. From this you can calculate how much RAM you need for your data. There is a good discussion: Namenode file quantity limit.
IO. The NN performs 1 IO for each change to the filesystem (create, delete block, etc.), so your local storage should keep up. It is harder to estimate how much you need, but since the number of blocks is already limited by memory, you will not hit this limit unless your cluster is very big. If it is, consider SSDs.
CPU. The NameNode carries a considerable load keeping track of the health of all blocks on all DataNodes. Each DataNode periodically reports the state of all its blocks. Again, unless the cluster is very big, this should not be a problem.
Example calculation
200 node cluster
24TB/node
128MB block size
Replication factor = 3
How much NameNode memory is required?
# blocks = 200 * 24 * 2^20 / (128 * 3)
≈ 13 million blocks
Applying the common rule of thumb of 1 GB of heap per million blocks, that is roughly 13,000 MB of memory.
I guess we should make the distinction between how namenode memory is consumed by each namenode object and general recommendations for sizing the namenode heap.
For the first case (consumption), AFAIK, each NameNode object holds an average of 150 bytes of memory. NameNode objects are files, blocks (not counting the replicated copies), and directories. So a file taking 3 blocks uses 4 objects (1 file and 3 blocks) x 150 bytes = 600 bytes.
For the second case, the recommended heap size for a NameNode, it is generally recommended that you reserve 1 GB per 1 million blocks. If you calculate this (150 bytes per block for 1 million blocks) you get 150 MB of memory consumption. You can see this is much less than the 1 GB per 1 million blocks, but you should also take into account the file and directory objects, not just the blocks.
I guess it is a safe-side recommendation. Check the following two links for a more general discussion and examples:
Sizing NameNode Heap Memory - Cloudera
Configuring NameNode Heap Size - Hortonworks
Namenode Memory Structure Internals