Request clarification on some HDFS concepts - hadoop

I am not sure if this question belongs here; if not, I apologize. I am reading the HDFS paper and am finding it difficult to understand a few of its terms. Please find my questions below.
1) As per the paper, "The HDFS namespace is a hierarchy of files and directories. Files and directories are represented on the NameNode by inodes, which record attributes like permissions, modification and access times, namespace and disk space quotas."
What exactly does the namespace information in an inode mean? Does it mean the complete path of the file? I ask because the previous statement says "The HDFS namespace is a hierarchy of files and directories".
2) As per the paper "The NameNode maintains the namespace tree and the mapping of file blocks to DataNodes
(the physical location of file data)." Are both namespace tree and namespace the same? Please refer to point 1 about definition of the namespace. How is the namespace tree information stored? Is it stored as part of inodes where each inode will also have a parent inode pointer?
3) As per the paper, "HDFS keeps the entire namespace in RAM. The inode data and the list of blocks belonging to each file comprise the metadata of the name system called the image." Does the image also contain the namespace?
4) What is the use of a namespace id? Is it used to distinguish between two different file system instances?
Thanks,
Venkat

What exactly does the namespace information in an inode mean? Does it mean the complete path of the file? I ask because the previous statement says "The HDFS namespace is a hierarchy of files and directories".
It means that you can browse your files the way you do on your local system (via commands like hadoop dfs -ls) and you will see results like /user/hadoop/myFile.txt, but physically this file is distributed across your cluster in several blocks, each replicated according to your replication factor.
Are both namespace tree and namespace the same? Please refer to point 1 about definition of the namespace. How is the namespace tree information stored? Is it stored as part of inodes where each inode will also have a parent inode pointer?
When you copy a file onto your HDFS with commands like hadoop dfs -copyFromLocal myfile.txt /user/hadoop/myfile.txt, the file is split according to the dfs.block.size value (the default is 64MB). The blocks are then distributed across your datanodes (the nodes used for storage). The namenode keeps a map of all blocks in order to verify your data integrity when it starts (or on demand, with commands like hadoop fsck /).
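To make the splitting concrete, here is a minimal sketch (not Hadoop code; the sizes are just example values) of how a 150MB file divides into 64MB blocks under the default dfs.block.size:

```java
// Toy illustration of HDFS block splitting: a file larger than
// dfs.block.size is cut into full-size blocks plus one final,
// possibly partial, block.
public class BlockSplitDemo {
    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // dfs.block.size default (64MB)
        long fileSize  = 150L * 1024 * 1024;  // hypothetical 150MB file

        long fullBlocks = fileSize / blockSize;       // complete 64MB blocks
        long lastBlock  = fileSize % blockSize;       // bytes in the final block

        System.out.println(fullBlocks + " full blocks + "
                + lastBlock + " bytes in the last block");
        // 2 full blocks + 23068672 bytes in the last block
    }
}
```

Note that the final block occupies only as much datanode disk space as it actually contains, not a full 64MB.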
Does the image also contain the namespace?
For this one I am not sure, but I think the namespace is kept in RAM too.
What is the use of a namespace id? Is it used to distinguish between two different file system instances?
Yes, the namespace ID is just an identifier; datanodes check it so that a node belonging to a different file system instance cannot join the cluster, which keeps the data coherent.
I hope that helps, even if it is far from an exhaustive explanation.

Related

Some info needed for Hadoop namenode

I am trying to understand Hadoop and I am referring to this book: "Hadoop: The Definitive Guide".
I have some doubts in understanding the data which Namenode manages, please refer to the image below:
Based on this, I have the following questions:
Q1) What is the meaning of filesystem namespace?
Q2) what is the meaning of filesystem tree?
Q3) What is meta-data? Are meta-data and namespace two different things?
Q4) what is namespace image?
Q5) what is edit logs?
Can anyone please help me understand this?
There are many terms involved and little clarity provided about what each one means.
Filesystem tree... /, /home, /tmp, etc. The filesystem. HDFS is an abstraction layer over the physical disks it runs on.
Metadata... File xyz is located at /tmp, is 5KB large, and is read-only. Metadata is the stored data that identifies any file - location, size, permissions, etc.
The namespace is the combination of these items.
An edit log is a transcript of actions performed against that image, to be fault tolerant and provide checkpoints at which data consistency is known. This mechanism has less overhead than comparing raw files within a distributed system.
The rest of your questions are answered by the namespace image and the edit log described above.
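A toy illustration may help here (this is not Hadoop code; the operation names and the map-of-paths model are simplifications): the image is a snapshot of the namespace, and replaying the edit log over it reproduces the current state without comparing raw files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of "image + edit log": the image is the namespace as of the
// last checkpoint; the edit log records actions performed since then.
public class EditLogDemo {
    public static void main(String[] args) {
        // "Image": namespace snapshot (path -> size in bytes).
        Map<String, Long> image = new HashMap<>();
        image.put("/tmp/xyz", 5L * 1024);

        // "Edit log": actions performed after the checkpoint.
        List<String[]> editLog = new ArrayList<>();
        editLog.add(new String[]{"CREATE", "/home/a", "1024"});
        editLog.add(new String[]{"DELETE", "/tmp/xyz", ""});

        // Replaying the log over the image yields the current namespace.
        for (String[] op : editLog) {
            if (op[0].equals("CREATE")) image.put(op[1], Long.parseLong(op[2]));
            else if (op[0].equals("DELETE")) image.remove(op[1]);
        }
        System.out.println(image);  // {/home/a=1024}
    }
}
```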

Hadoop inode to path

I used the 'hdfs oiv' command to read the fsimage into a xml file.
hdfs oiv -p XML -i /../dfs/nn/current/fsimage_0000000003132155181 -o fsimage.out
Based on my understanding, the fsimage is supposed to store the "blockmap" - how the files were broken into blocks, and where each block is stored. However, here is how an inode record looks in the output file.
<inode>
  <id>37749299</id>
  <type>FILE</type>
  <name>a4467282506298f8-e21f864f16b2e7c1_468511729_data.0.</name>
  <replication>3</replication>
  <mtime>1442259468957</mtime>
  <atime>1454539092207</atime>
  <perferredBlockSize>134217728</perferredBlockSize>
  <permission>impala:hive:rw-r--r--</permission>
  <blocks>
    <block>
      <id>1108336288</id>
      <genstamp>35940487</genstamp>
      <numBytes>16187048</numBytes>
    </block>
  </blocks>
</inode>
However, I was expecting something like the HDFS path to a file, how that file was broken down into smaller pieces, and where each piece is stored (which machine, which local fs path, etc.).
Is there a mapping anywhere on the name server containing:
the HDFS path to inode mapping
the blockid to local file system path / disk location mapping?
A bit late, but I am looking into this now and stumbled across your question.
First of all, a bit of context.
(I am working with Hadoop 2.6)
The Name node is responsible for maintaining the INodes, which are the in-memory representation of the (virtual) filesystem structure, while the blocks are maintained by the data nodes. I believe there are several reasons for the Name node not to maintain the rest of the information, such as links inside each INode to the data nodes where the data is stored:
It would require more memory to represent all that information (memory is the resource which actually limits the number of files that can be written into an HDFS cluster, since the whole structure is maintained in RAM for faster access)
It would induce more workload on the name node: for example, if a file is moved from one node to another, or a new node is installed and the file needs to be replicated to it, the Name node would need to update its state each time
Flexibility: since the INode is an abstraction, adding the link would bind it to a specific technology and communication protocol
Now coming back to your questions:
The fsimage file already contains the mapping to the HDFS path. If you look more carefully at the XML, each INode, regardless of its type, has an ID (in your case it is 37749299). If you look further in the file, you can find the section <INodeDirectorySection>, which holds the mapping between parents and children, and it is this ID field which is used to determine the relation. Through the <name> attribute you can easily determine the structure you see, for example, in the HDFS explorer.
Furthermore, you have the <blocks> section, which has the block ID (in your case it is 1108336288). If you look carefully into the sources of Hadoop, you can find the method idToBlockDir in DatanodeUtil, which gives you a hint of how the files are organized on disk and how the block ID mapping is performed.
Basically, the original ID is shifted twice (by 16 and by 8 bits):
int d1 = (int)((blockId >> 16) & 0xff);
int d2 = (int)((blockId >> 8) & 0xff);
And the final directory is built using obtained values:
String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP + DataStorage.BLOCK_SUBDIR_PREFIX + d2;
The block itself is stored inside that directory, in a file named using the blk_<block_id> format.
I'm not a Hadoop expert, so if someone who understands this better could correct any flaws in my logic, please do so. I hope this helps.
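Putting the two fragments above together into a runnable sketch (it mirrors the shift logic of DatanodeUtil.idToBlockDir in Hadoop 2.6, with "subdir" and "/" hard-coded in place of DataStorage.BLOCK_SUBDIR_PREFIX and SEP for simplicity), applied to the block ID from the fsimage excerpt:

```java
// Sketch of the block-id-to-directory computation, modeled on
// DatanodeUtil.idToBlockDir (Hadoop 2.6); constants are inlined here.
public class BlockDirDemo {
    static String idToBlockDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xff);
        int d2 = (int) ((blockId >> 8) & 0xff);
        return "subdir" + d1 + "/" + "subdir" + d2;
    }

    public static void main(String[] args) {
        // Block id taken from the fsimage excerpt above.
        System.out.println(idToBlockDir(1108336288L)); // subdir15/subdir222
    }
}
```

So block 1108336288 would live under .../subdir15/subdir222/ in a file named blk_1108336288.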

Write Path HDFS

Introduction
Follow-up question to this question.
A File has been provided to HDFS and has been subsequently replicated to three DataNodes.
If the same file is going to be provided again, HDFS indicates that the file already exists.
Based on this answer, a file will be split into blocks of 64MB (depending on the configuration settings). A mapping of the filename to the blocks will be created in the NameNode. The NameNode knows in which DataNodes the blocks of a certain file reside. If the same file is provided again, the NameNode knows that blocks of this file exist on HDFS and will indicate that the file already exists.
If the content of a file is changed and the file is provided again, does the NameNode update the existing file, or is the check restricted to the mapping of filename to blocks - in particular, the filename? Which process is responsible for this?
Which process is responsible for splitting a file into blocks?
Example Write path:
According to this documentation the Write Path of HBase is as follows:
Possible Write Path HDFS:
file provided to HDFS, e.g. hadoop fs -copyFromLocal ubuntu-14.04-desktop-amd64.iso /
FileName checked in the FSImage for whether it already exists. If so, the message "file already exists" is displayed
file split into blocks of 64MB (depending on the configuration setting). Question: what is the name of the process responsible for block splitting?
blocks replicated on DataNodes (the replication factor can be configured)
mapping of FileName to blocks (metadata) stored in the EditLog located on the NameNode
Question
How does the HDFS' Write Path look like?
If the content of a file is changed and the file is provided again, does the NameNode update the existing file, or is the check restricted to the mapping of filename to blocks - in particular, the filename?
No, it does not update the file. The name node only checks if the path (file name) already exists.
How does the HDFS' Write Path look like?
This is explained in detail in this paper: "The Hadoop Distributed File System" by Shvachko et al. In particular, read Section 2.C (and check Figure 1):
"When a client writes, it first asks the NameNode to choose DataNodes to host replicas of the first block of the file. The client organizes a pipeline from node-to-node and sends the data. When the first block is filled, the client requests new DataNodes to be chosen to host replicas of the next block. A new pipeline is organized, and the client sends the further bytes of the file. Choice of DataNodes for each block is likely to be different. The interactions among the client, the NameNode and the DataNodes are illustrated in Fig. 1."
NOTE: A book chapter based on this paper is available online too. And a direct link to the corresponding figure (Fig. 1 on the paper and 8.1 on the book) is here.
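The quoted block-by-block pipeline can be simulated with a toy sketch (illustrative only: the round-robin placement and host names are made up, not the NameNode's real replica-placement policy):

```java
import java.util.Arrays;
import java.util.List;

// Toy simulation of the write path described above: for each block, the
// client asks the (here, simulated) NameNode for a set of DataNodes, then
// streams the block through them as a pipeline. Each block may get a
// different set of nodes.
public class WritePathSketch {
    // Stand-in for replica placement: round-robin over 5 hypothetical hosts.
    static List<String> chooseDataNodes(int blockIndex, int replication) {
        String[] hosts = {"dn1", "dn2", "dn3", "dn4", "dn5"};
        String[] chosen = new String[replication];
        for (int r = 0; r < replication; r++) {
            chosen[r] = hosts[(blockIndex + r) % hosts.length];
        }
        return Arrays.asList(chosen);
    }

    public static void main(String[] args) {
        long fileSize  = 150L * 1024 * 1024;  // example 150MB file
        long blockSize = 64L * 1024 * 1024;   // dfs.block.size
        int blocks = (int) ((fileSize + blockSize - 1) / blockSize); // 3 blocks

        for (int i = 0; i < blocks; i++) {
            // 1. Ask the (simulated) NameNode where replicas of block i go.
            List<String> pipeline = chooseDataNodes(i, 3);
            // 2. The real client would now stream block i node-to-node.
            System.out.println("block " + i + " -> " + pipeline);
        }
    }
}
```

Running it prints a different pipeline per block, matching the paper's point that the "choice of DataNodes for each block is likely to be different".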

Does hadoop use folders and subfolders

I have started learning Hadoop and have just completed setting up a single node as demonstrated in the hadoop 1.2.1 documentation.
Now I was wondering:
When files are stored in this type of FS, should I use a hierarchical mode of storage - like folders and sub-folders as I do in Windows - or can files just be written in as long as they have a unique name?
Is it possible to add new nodes to the single-node setup if, say, somebody were to use it in a production environment? Or can a single node simply be converted to a cluster, without loss of data, by adding more nodes and editing the configuration?
This one I can google, but what the hell! I am asking anyway, sue me. What is the maximum number of files I can store in HDFS?
When files are stored in this type of FS, should I use a hierarchical mode of storage - like folders and sub-folders as I do in Windows - or can files just be written in as long as they have a unique name?
Yes, use directories to your advantage. Generally, when you run jobs in Hadoop, if you pass along a path to a directory, it will process all files in that directory. So... you really have to use them anyway.
Is it possible to add new nodes to the single-node setup if, say, somebody were to use it in a production environment? Or can a single node simply be converted to a cluster, without loss of data, by adding more nodes and editing the configuration?
You can add/remove nodes as you please (unless by single-node, you mean pseudo-distributed... that's different)
This one I can google, but what the hell! I am asking anyway, sue me. What is the maximum number of files I can store in HDFS?
Lots
To expand on climbage's answer:
The maximum number of files is a function of the amount of memory available to your Name Node server. There is some loose guidance that each metadata entry in the Name Node requires somewhere between 150 and 200 bytes of memory (it varies by version).
From this you'll need to extrapolate out to the number of files and the number of blocks you have for each file (which can vary depending on file and block size), and you can then estimate, for a given memory allocation (2G / 4G / 20G etc.), how many metadata entries (and therefore files) you can store.
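As a back-of-the-envelope sketch of that extrapolation (the byte figure and objects-per-file count are rough assumptions from the guidance above, not exact Hadoop numbers):

```java
// Rough NameNode capacity estimate: heap size divided by the per-file
// metadata cost (file entry + block entries). All figures are ballpark.
public class NameNodeMemoryEstimate {
    public static void main(String[] args) {
        long heapBytes     = 4L * 1024 * 1024 * 1024; // 4G NameNode heap
        long bytesPerEntry = 150;                     // low end of the guidance
        long entriesPerFile = 2;                      // assume 1 file + ~1 block

        long maxFiles = heapBytes / (bytesPerEntry * entriesPerFile);
        System.out.println("~" + maxFiles + " files"); // roughly 14.3 million
    }
}
```

Files with many blocks (or the 200-byte end of the range) would lower this estimate accordingly.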

How to find blockName to DataNode mapping in Hadoop

Is there a programmatic interface to find out, given a blockID, which data node it exists on?
I.e., the ability to read the fsImage and return this info.
One crude way I know of is to look for a file with the blockName in the dfs data dir.
This is, however, an O(n) solution, and I am pretty sure there should be an O(1) solution to this.
Similar to how to find file from blockName in HDFS hadoop, there is no public interface to the namenode which will allow you to look up information from a block ID (only by file name).
You can look at opening the fsImage, but this will only give you a mapping from block ID to filename, as the actual locations (DataNodes) which host the blocks are not stored in this file - the data nodes walk their data directories and report to the NameNode what blocks they have.
I guess if you could attach a debugger to the name node, you might be able to inspect the block map, but because there is no map from ID to filename, it's still going to be an O(n) operation.
