Switch a disk containing Cloudera Hadoop / HDFS / HBase data

We have a Cloudera 5 installation based on a single node on a single server. Before adding 2 additional nodes to the cluster, we want to increase the size of the partition using a fresh new disk.
We have the following services installed:
YARN with 1 NodeManager, 1 JobHistory Server and 1 ResourceManager
HDFS with 1 DataNode, 1 NameNode and 1 Secondary NameNode
HBase with 1 Master and 1 RegionServer
ZooKeeper with 1 Server
All data currently sits on a single partition. The amount of data to be collected has grown, so we need to use another disk to store all the information.
All the data live on a partition mounted at the folder /dfs.
The working partition is:
df -h
hadoop-dfs-partition
119G 9.8G 103G 9% /dfs
df -i
hadoop-dfs-partition
7872512 18098 7854414 1% /dfs
The content of this folder is the following:
drwxr-xr-x 11 root root 4096 May 8 2014 dfs
drwx------. 2 root root 16384 May 7 2014 lost+found
drwxr-xr-x 5 root root 4096 May 8 2014 yarn
Under dfs there are these folders:
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn2
Under yarn there are these folders:
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm1
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm2
How can we achieve this? I have only found ways to migrate data between clusters with the distcp command, and no way to move the raw data.
Is stopping all services, shutting down the entire cluster and then running
cp -Rp /dfs/* /dfs-new/
a viable option?
(/dfs-new is the folder where the fresh new ext4 partition of the new disk is mounted.)
Any better way of doing this?
Thank you in advance
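For reference, a raw-copy sketch along those lines (assuming all services are stopped and /dfs-new is mounted) could also use rsync, which preserves ownership and can be re-run to check for differences:
# run only with all cluster services stopped
# -a preserves permissions, owners and timestamps; -H preserves hard links
rsync -aH /dfs/ /dfs-new/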

I resolved it this way:
Stop all services except HDFS.
Export the data out of HDFS. In my case the interesting part was in HBase:
su - hdfs
hdfs dfs -ls /
The command showed me the following data:
drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase
drwxr-xr-x - hdfs supergroup 0 2015-02-26 19:58 /tmp
drwxr-xr-x - hdfs supergroup 0 2015-02-26 19:38 /user
hdfs dfs -copyToLocal / /a_backup_folder/
to export all data from HDFS to a normal file system.
Press Ctrl-D to return to the root user.
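A sanity check worth doing before the copyToLocal step (a sketch; /a_backup_folder is the example path from above): make sure the local filesystem has enough free space to hold everything in HDFS:
# total HDFS usage vs. free space on the local backup target
hdfs dfs -du -s -h /
df -h /a_backup_folder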
Stop ALL services on Cloudera (HDFS included).
Now you can umount the "old" and the "new" partition.
Mount the "new" partition in place of the "old" one's path (in my case /dfs).
Mount the "old" partition in a new place, in my case /dfs-old (remember to mkdir /dfs-old); this way you can check the old structure.
Make this change permanent by editing /etc/fstab. Check that everything is correct by repeating step 3, then try
mount -a
df -h
to check that /dfs and /dfs-old are mapped to the proper partitions (the "new" and the "old" one, respectively).
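For example, the relevant /etc/fstab entries might look like this (the device names here are assumptions; use your actual devices or UUIDs):
# hypothetical entries: the new disk takes over /dfs, the old one moves to /dfs-old
/dev/sdb1   /dfs       ext4   defaults   0 2
/dev/sda3   /dfs-old   ext4   defaults   0 2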
Format the NameNode by going into
Services > HDFS > NameNode > Actions > Format NameNode
In my case, doing
ls -l /dfs/dfs
I have:
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn1
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn2
Start the HDFS service in Cloudera.
You should now have new folders:
ls -l /dfs/dfs
I have:
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn1
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn2
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn1
drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn2
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn1
drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn2
Now copy the data back into the new partition:
hdfs dfs -copyFromLocal /a_backup_folder/user/* /user
hdfs dfs -copyFromLocal /a_backup_folder/tmp/* /tmp
hdfs dfs -copyFromLocal /a_backup_folder/hbase/* /hbase
The hbase folder needs to have the proper ownership, hbase:hbase as user:group:
hdfs dfs -chown -R hbase:hbase /hbase
If you forget this step, you will get permission denied errors in the HBase log file later.
Check the result with
hdfs dfs -ls /hbase
You should see something like this:
drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase/.tmp
drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase/WALs
drwxr-xr-x - hbase hbase 0 2015-02-27 11:38 /hbase/archive
drwxr-xr-x - hbase hbase 0 2015-02-25 15:18 /hbase/corrupt
drwxr-xr-x - hbase hbase 0 2015-02-25 15:18 /hbase/data
-rw-r--r-- 3 hbase hbase 42 2015-02-25 15:18 /hbase/hbase.id
-rw-r--r-- 3 hbase hbase 7 2015-02-25 15:18 /hbase/hbase.version
drwxr-xr-x - hbase hbase 0 2015-02-27 11:42 /hbase/oldWALs
(The important part here is that files and folders have the proper user and group.)
Now start all services and check that HBase is working with
hbase shell
list
You should see all the tables you had before the migration. Try
count 'a_table_name'

Related

HDFS NFS locations using weird numerical username values for directory permissions

Seeing nonsense values for user names in folder permissions for NFS-mounted HDFS locations, while the HDFS locations themselves (using Hortonworks HDP 3.1) appear fine. E.g.
➜ ~ ls -lh /nfs_mount_root/user
total 6.5K
drwx------. 3 accumulo hdfs 96 Jul 19 13:53 accumulo
drwxr-xr-x. 3 92668751 hadoop 96 Jul 25 15:17 admin
drwxrwx---. 3 ambari-qa hdfs 96 Jul 19 13:54 ambari-qa
drwxr-xr-x. 3 druid hadoop 96 Jul 19 13:53 druid
drwxr-xr-x. 2 hbase hdfs 64 Jul 19 13:50 hbase
drwx------. 5 hdfs hdfs 160 Aug 26 10:41 hdfs
drwxr-xr-x. 4 hive hdfs 128 Aug 26 10:24 hive
drwxr-xr-x. 5 h_etl hdfs 160 Aug 9 14:54 h_etl
drwxr-xr-x. 3 108146 hdfs 96 Aug 1 15:43 ml1
drwxrwxr-x. 3 oozie hdfs 96 Jul 19 13:56 oozie
drwxr-xr-x. 3 882121447 hdfs 96 Aug 5 10:56 q_etl
drwxrwxr-x. 2 spark hdfs 64 Jul 19 13:57 spark
drwxr-xr-x. 6 zeppelin hdfs 192 Aug 23 15:45 zeppelin
➜ ~ hadoop fs -ls /user
Found 13 items
drwx------ - accumulo hdfs 0 2019-07-19 13:53 /user/accumulo
drwxr-xr-x - admin hadoop 0 2019-07-25 15:17 /user/admin
drwxrwx--- - ambari-qa hdfs 0 2019-07-19 13:54 /user/ambari-qa
drwxr-xr-x - druid hadoop 0 2019-07-19 13:53 /user/druid
drwxr-xr-x - hbase hdfs 0 2019-07-19 13:50 /user/hbase
drwx------ - hdfs hdfs 0 2019-08-26 10:41 /user/hdfs
drwxr-xr-x - hive hdfs 0 2019-08-26 10:24 /user/hive
drwxr-xr-x - h_etl hdfs 0 2019-08-09 14:54 /user/h_etl
drwxr-xr-x - ml1 hdfs 0 2019-08-01 15:43 /user/ml1
drwxrwxr-x - oozie hdfs 0 2019-07-19 13:56 /user/oozie
drwxr-xr-x - q_etl hdfs 0 2019-08-05 10:56 /user/q_etl
drwxrwxr-x - spark hdfs 0 2019-07-19 13:57 /user/spark
drwxr-xr-x - zeppelin hdfs 0 2019-08-23 15:45 /user/zeppelin
Notice the difference for users ml1 and q_etl: they have numerical user values when running ls on the NFS locations, rather than their user names.
Even doing something like...
[hdfs@HW04 ml1]$ hadoop fs -chown ml1 /user/ml1
does not change the NFS permissions. Even more annoying, when trying to change the NFS mount permissions as root, we see
[root@HW04 ml1]# chown ml1 /nfs_mount_root/user/ml1
chown: changing ownership of ‘/nfs_mount_root/user/ml1’: Permission denied
This causes real problems, since the differing uid means that I can't write to these dirs even as the "correct" user. Not sure what to make of this. Anyone with more Hadoop experience have any debugging suggestions or fixes?
UPDATE:
Doing a bit more testing / debugging, I found that the rules appear to be...
If the NFS server node has no uid (or gid?) that matches the uid of the user on the node accessing the NFS mount, we get the weird uid values seen here.
If there is a uid associated with the requesting user's username on the NFS server node, then that is the uid we see assigned to the location when accessing via NFS (even if that uid on the NFS server node is not actually for the requesting user), e.g.
[root@HW01 ~]# clush -ab id ml1
---------------
HW[01,04] (2)
---------------
uid=1025(ml1) gid=1025(ml1) groups=1025(ml1)
---------------
HW[02-03] (2)
---------------
uid=1027(ml1) gid=1027(ml1) groups=1027(ml1)
---------------
HW05
---------------
uid=1026(ml1) gid=1026(ml1) groups=1026(ml1)
[root@HW01 ~]# exit
logout
Connection to hw01 closed.
➜ ~ ls -lh /hdpnfs/user
total 6.5K
...
drwxr-xr-x. 6 atlas hdfs 192 Aug 27 12:04 ml1
...
➜ ~ hadoop fs -ls /user
Found 13 items
...
drwxr-xr-x - ml1 hdfs 0 2019-08-27 12:04 /user/ml1
...
[root@HW01 ~]# clush -ab id atlas
---------------
HW[01,04] (2)
---------------
uid=1027(atlas) gid=1005(hadoop) groups=1005(hadoop)
---------------
HW[02-03] (2)
---------------
uid=1024(atlas) gid=1005(hadoop) groups=1005(hadoop)
---------------
HW05
---------------
uid=1005(atlas) gid=1006(hadoop) groups=1006(hadoop)
If you're wondering why I have users with varying uids across the cluster nodes, see the problem posted here: How to properly change uid for HDP / ambari-created user? (note that these odd uid settings for hadoop service users were set up by Ambari by default).
After talking with someone more knowledgeable in HDP hadoop, I found that the problem is that when Ambari was set up and run to initially install the hadoop cluster, there may have been other preexisting users on the designated cluster nodes.
Ambari creates its various service users by giving them the next available UID from a node's block of user UIDs. However, prior to installing Ambari and HDP on the nodes, I had created some users on the to-be namenode (and others) in order to do some initial maintenance checks and tests. I should have just done this as root. Adding these extra users offset the UID counter on those nodes, so as Ambari created users on the nodes and incremented the UIDs, it was starting from different counter values. Thus the UIDs did not sync across nodes and caused problems with HDFS NFS.
To fix this, I...
Used Ambari to stop all running HDP services
Went to Service Accounts in Ambari and copied all of the expected service user name strings
Ran something like id <service username> for each user to get its group(s). For service groups (which may have multiple members), you can do something like grep 'group-name-here' /etc/group. I recommend doing it this way, as the Ambari docs of default users and groups do not have some of the info that you can get here.
Used userdel and groupdel to remove all the Ambari service users and groups
Then recreated all the groups across the cluster
Then recreated all the users across the cluster (you may need to specify the UID if some nodes have users that others don't)
Restarted the HDP services (hopefully everything still runs as if nothing happened, since HDP should be looking for the literal username strings, not the UIDs)
For the last parts, you can use something like clustershell, e.g.
# remove user
$ clush -ab userdel <service username>
# check that the UID you want to use is actually available on all nodes
$ clush -ab id <some specific UID you want to use>
# assign that UID to a new service user
$ clush -ab useradd --uid <the specific UID> --gid <groupname> <service username>
To get the lowest common available UID from each node, I used...
# for UID
getent passwd | awk -F: '($3>1000) && ($3<10000) && ($3>maxuid) { maxuid=$3; } END { print maxuid+1; }'
# for GID (note: scan getent group rather than passwd, so non-primary groups are counted too)
getent group | awk -F: '($3>1000) && ($3<10000) && ($3>maxgid) { maxgid=$3; } END { print maxgid+1; }'
Ambari also creates some /home dirs for users. Once you are done recreating the users, you will need to fix the ownership of those dirs (you can use something like clush there as well).
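As a sketch (the user and group names here are placeholders), fixing those home dirs across the cluster could look like:
# hypothetical: restore ownership of a recreated service user's home dir on every node
clush -ab 'chown -R <service username>:<groupname> /home/<service username>'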
* Note that this was a huge pain, and you would need to manually correct the UIDs of users whenever you added another cluster node. I did this for a test cluster, but for production (or even a larger test) you should just use Kerberos or SSSD + Active Directory.

How does Namenode reconstruct the full block information after restart?

I am trying to understand the Namenode; I have referred to online material and to the book Hadoop: The Definitive Guide as well.
I understand that the Namenode has concepts like "edit logs" and "fsimage", and I can see the following files in my Namenode.
========================================================================
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 23 22:53 edits_0000000000000000001-0000000000000000001
-rw-r--r-- 1 root root 1048576 Nov 23 23:42 edits_0000000000000000002-0000000000000000002
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 00:07 edits_0000000000000000003-0000000000000000003
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 21:03 edits_0000000000000000004-0000000000000000004
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 22:59 edits_0000000000000000005-0000000000000000005
-rw-r--r-- 1 root root 1048576 Nov 24 23:00 edits_0000000000000000006-0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:15 edits_0000000000000000007-0000000000000000007
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:34 edits_0000000000000000008-0000000000000000008
-rw-r--r-- 1 root root 1048576 Nov 26 02:13 edits_inprogress_0000000000000000009
-rw-rw-r-- 1 vevaan24 vevaan24 355 Nov 25 21:15 fsimage_0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24 62 Nov 25 21:15 fsimage_0000000000000000006.md5
-rw-r--r-- 1 root root 355 Nov 26 00:12 fsimage_0000000000000000008
-rw-r--r-- 1 root root 62 Nov 26 00:12 fsimage_0000000000000000008.md5
-rw-r--r-- 1 root root 2 Nov 26 00:12 seen_txid
-rw-rw-r-- 1 vevaan24 vevaan24 201 Nov 26 00:12 VERSION
In that book it was mentioned that the fsimage doesn't store the block locations.
I have the following questions:
1) Do the edit logs store the block locations as well (for the new transactions)?
2) When the Namenode and Datanodes are restarted, how does the Namenode get the block addresses? My doubt is that the NN reads the fsimage to reconstruct the filesystem info, but the fsimage doesn't have the block locations, so how is this information reconstructed?
3) Is it true that the fsimage stores only the BLOCK ID, and if so, is the BLOCK ID unique across Datanodes? Is the BLOCK ID the same as the BLOCK address?
Block locations, i.e. the datanodes on which the blocks are stored, are neither persisted in the fsimage file nor in the edit log. The Namenode keeps this mapping only in memory.
It is the responsibility of each datanode to hold the information about the list of blocks it is storing.
During a restart, the Namenode loads the fsimage file into memory and applies the edits from the edit log; the missing block location information is obtained from the datanodes as they check in with their block lists. From these block lists, the Namenode constructs the mapping of blocks to their locations in its memory.
The fsimage has more than the Block ID. It holds information like the blocks of each file, the block size, the replication factor, access time, modification time and file permissions, but not the locations of the blocks.
Yes, Block IDs are unique. The block address would refer to the addresses of the datanodes on which the block resides.
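Since the block-to-datanode mapping lives only in the Namenode's memory, one way to see it (a sketch; the path is just an example) is fsck, which queries the running Namenode:
# list each file's blocks and the datanodes holding every replica
hdfs fsck /user -files -blocks -locations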

What information Namenode stores in Hard disk and in memory?

I am trying to understand the Namenode; I have referred to online material and to the book Hadoop: The Definitive Guide as well.
I understand that the Namenode has concepts like "edit logs" and "fsimage", and I can see the following files in my Namenode.
========================================================================
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 23 22:53 edits_0000000000000000001-0000000000000000001
-rw-r--r-- 1 root root 1048576 Nov 23 23:42 edits_0000000000000000002-0000000000000000002
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 00:07 edits_0000000000000000003-0000000000000000003
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 21:03 edits_0000000000000000004-0000000000000000004
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 22:59 edits_0000000000000000005-0000000000000000005
-rw-r--r-- 1 root root 1048576 Nov 24 23:00 edits_0000000000000000006-0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:15 edits_0000000000000000007-0000000000000000007
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:34 edits_0000000000000000008-0000000000000000008
-rw-r--r-- 1 root root 1048576 Nov 26 02:13 edits_inprogress_0000000000000000009
-rw-rw-r-- 1 vevaan24 vevaan24 355 Nov 25 21:15 fsimage_0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24 62 Nov 25 21:15 fsimage_0000000000000000006.md5
-rw-r--r-- 1 root root 355 Nov 26 00:12 fsimage_0000000000000000008
-rw-r--r-- 1 root root 62 Nov 26 00:12 fsimage_0000000000000000008.md5
-rw-r--r-- 1 root root 2 Nov 26 00:12 seen_txid
-rw-rw-r-- 1 vevaan24 vevaan24 201 Nov 26 00:12 VERSION
=========================================================================
As expected, I see all these files in my Namenode. However, I haven't understood the concept, so I have the following questions; can anyone please help me understand this?
Q1) What are the fsimage files? Why are many fsimage files present?
Q2) What are the edits_000... files? Why are many edits_000... files present?
Q3) What are the .md5 files? What purpose do they serve?
I also read that the Namenode keeps some data in MEMORY and some data on the HARD DISK, but it is a bit confusing to understand what kind of information is stored on the hard disk and what remains in memory.
Q4) Does the Namenode's memory hold information taken from the fsimage, the edit log, or both?
Q5) When the Namenode and Datanodes are restarted, how is the metadata constructed (that is, which file is stored in which datanode, which blocks, etc.)?
OK, I'll try to explain:
EditLog
The EditLog is a transactional log that records every change that occurs to the file system metadata, for example creating a new file or renaming a file. Every such change generates an entry in the EditLog.
FsImage
This file contains the entire file system namespace, including the mapping of files to blocks and file system properties: which file consists of which blocks, and so on. (Note that it does not record which datanodes hold each block; as explained in the previous answer, that mapping lives only in the Namenode's memory.)
When you start your NameNode, Hadoop loads the complete FsImage file into memory. After that it applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes this new version out into a new FsImage on disk. This only happens once (on startup). After that, Hadoop works only with the in-memory representation; the FsImage on your HDD is not touched.
Some of your questions:
Q1) Why are many fsimage files present?
As explained, the FsImage is loaded, the EditLog is applied, and then a new version is saved.
Q2) Why are many edits_000... files present?
After Hadoop applies the EditLog and persists a new version of the FsImage, it starts a new EditLog. This is called a checkpoint in Hadoop.
Q3) What are there .md5 files? What purpose do they serve?
The .md5 file holds a hash that is used to check that the corresponding FsImage is not corrupted.
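As a sketch (file names taken from the listing above), you can verify this yourself by recomputing the hash and comparing it with the stored one:
# recompute the checksum of the image and compare it against the stored .md5
md5sum fsimage_0000000000000000008
cat fsimage_0000000000000000008.md5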
Q5) When Namenode and Datanode is restarted, how is the meta-data constructed (that is, which file stored in which datanode, block etc.).
The file-to-block mapping is persisted in the FsImage; the block-to-datanode locations are not persisted, but are rebuilt from the block lists the datanodes report when they check in with the Namenode.
I hope I could help.
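If you want to inspect these files yourself, Hadoop ships offline viewers for both (a sketch; the file names are from the listing above):
# dump an fsimage checkpoint to XML with the Offline Image Viewer
hdfs oiv -p XML -i fsimage_0000000000000000008 -o fsimage.xml
# dump an edits segment to XML with the Offline Edits Viewer
hdfs oev -i edits_0000000000000000007-0000000000000000008 -o edits.xml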

hive script file not found exception

I am running the command below. The file is in my local directory, but I am getting the following error while running it.
[hdfs@ip-xxx-xxx-xx-xx scripts]$ ls -lrt
total 28
-rwxrwxrwx. 1 root root 17 Apr 1 15:53 hive.hive
-rwxrwxrwx 1 hdfs hadoop 88 May 7 11:53 shell_fun
-rwxrwxrwx 1 hdfs hadoop 262 May 7 12:23 first_hive
-rwxrwxrwx 1 root root 88 May 7 16:59 311_cust_shell
-rwxrwxrwx 1 root root 822 May 8 20:29 script_1
-rw-r--r-- 1 hdfs hadoop 31 May 8 20:30 script_1.log
-rwxrwxrwx 1 hdfs hdfs 64 May 8 22:07 hql2.sql
[hdfs@ip-xxx-xxx-xx-xx scripts]$ hive -f hql2.sql
WARNING: Use "yarn jar" to launch YARN applications.
Logging initialized using configuration in file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties
Could not open input file for reading.
(File file:/home/ec2-user/scripts/hive/scripts/hql2.sql does not exist)
[hdfs@ip-xxx-xxx-xx-xx scripts]$
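Note that the error shows Hive resolving hql2.sql against /home/ec2-user/scripts/hive/scripts rather than the directory the listing above was taken in; a hedged workaround is to pass the script's absolute path (adjust to wherever it actually lives):
hive -f /absolute/path/to/hql2.sql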

Hadoop archive file cannot be looked up using hadoop fs -ls har://hdfs-master/tank/zoo.har/

Here are my files on HDFS:
hadoop fs -ls /
Found 5 items
-rw-r--r-- 3 hadoop supergroup 25 2016-04-18 11:29 /abc.txt
drwxr-xr-x - hadoop supergroup 0 2016-04-17 11:39 /hbase
drwxr-xr-x - hadoop supergroup 0 2016-04-18 11:49 /tank
drwx------ - hadoop supergroup 0 2016-04-18 11:30 /tmp
-rw-r--r-- 3 hadoop supergroup 66 2016-04-18 11:29 /user.txt
hadoop fs -ls /tank/
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2016-04-18 11:49 /tank/zoo.har
While I am typing
hadoop fs -ls har://hdfs-master/zoo.har/
I get this response:
ls: Invalid path for the Har Filesystem. No index file in
har://hdfs-master/zoo.har
please help me out! Thanks!
I guess there are two formats to access these files or directories.
The first one is the following:
hadoop fs -lsr har:///tank/zoo.har/
The other:
hadoop fs -lsr har://hdfs-master/tank/zoo.har/
By the way, are you sure your host is master and the HDFS daemon is listening on the default port? Because the second format means har://hdfs-host:port/path/to/somewhere.
I forgot to add my parent path to the har URL; it should be har:///parent-path/har-path!
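For reference, a minimal sketch of creating and then listing an archive (the source names here are placeholders; hadoop archive runs a MapReduce job):
# create /tank/zoo.har from <src> entries given relative to the parent path
hadoop archive -archiveName zoo.har -p /some/parent somefile somedir /tank
# list it: the archive's HDFS parent path (/tank) must be part of the har:/// URL
hadoop fs -ls har:///tank/zoo.har/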
