NameNode in a Hadoop cluster and the fsimage / edit logs concept - Hadoop

I want to give a short background about the NameNode, fsimage, and edit logs, and how the NameNode works in Hadoop clusters.
The NameNode stores modifications to the file system as a log appended to a native file system file, edits.
When a NameNode starts up, it reads HDFS state from an image file, fsimage, and then applies edits from the edits log file.
It then writes new HDFS state to the fsimage and starts normal operation with an empty edits file.
The fsimage is a file stored on the OS filesystem that contains the complete directory structure (namespace) of HDFS, including the mapping of files to blocks. (Block-to-DataNode locations are not persisted in the fsimage; the NameNode rebuilds them at runtime from DataNode block reports.)
The edit log is a transaction log that records every change to the HDFS file system, such as the addition of a new block, a replication change, a deletion, etc. It records the changes made since the last fsimage was created; at checkpoint time those changes are merged into the fsimage to produce a new fsimage file.
When the NameNode starts, the latest fsimage file is loaded into memory, and the edit log files are replayed on top of it so that the in-memory state is up to date.
The NameNode keeps this metadata in memory in order to serve the many client requests as fast as possible; otherwise it would have to read the metadata from disk for every operation, and every operation would pay the extra disk seek time.
So, to summarize:
Persistence of HDFS metadata broadly consists of two categories of files:
fsimage
Contains the complete state of the file system at a point in time. Every file system modification is assigned a unique, monotonically increasing transaction ID. An fsimage file represents the file system state after all modifications up to a specific transaction ID.
edits file
Contains a log that lists each file system change (file creation, deletion or modification) that was made after the most recent fsimage.
Checkpointing
is the process of merging the content of the most recent fsimage with all edits applied after that fsimage was written, to create a new fsimage. Checkpointing is triggered automatically by configuration policies or manually by HDFS administration commands.
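As a rough illustration of those configuration policies (a sketch only, using the stock Hadoop property names; your distribution's values may differ), the automatic checkpoint triggers can be read back from the live configuration:
# Seconds between automatic checkpoints (stock default 3600)
hdfs getconf -confKey dfs.namenode.checkpoint.period
# Number of un-checkpointed transactions that also forces a checkpoint (stock default 1000000)
hdfs getconf -confKey dfs.namenode.checkpoint.txns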
That is the brief background on the NameNode, fsimage, and edit logs.
So let's talk now about our cluster (it is based on HDP version 2.6.5).
In the folder /var/hadoop/hdfs/namenode/current of each NameNode, we have the following fsimage files:
fsimage_0000000000000031788 100% 104KB 104.1KB/s 00:00
fsimage_0000000000000031788.md5 100% 62 0.1KB/s 00:00
fsimage_0000000000000041641 100% 104KB 104.1KB/s 00:00
fsimage_0000000000000041641.md5 100% 62 0.1KB/s 00:00
and also the edit log files (many earlier files omitted):
...
-rw-r--r-- 1 hdfs hadoop 328138542 Jan 23 12:37 edits_0000000022056979997-0000000022059239786
-rw-r--r-- 1 hdfs hadoop 301415558 Jan 23 13:07 edits_0000000022059239787-0000000022061345588
-rw-r--r-- 1 hdfs hadoop 311747850 Jan 23 13:37 edits_0000000022061345589-0000000022063490851
-rw-r--r-- 1 hdfs hadoop 12 Jan 23 13:37 seen_txid
-rw-r--r-- 1 hdfs hadoop 330301440 Jan 24 07:10 edits_0000000022063490852-0000000022065448335
Now, we start both NameNodes.
In the NameNode logs we see that the NameNode replays each of the edit logs (so if, for example, we have 1965 edit log files, the NameNode replays them all one by one...).
Example:
2020-01-27 06:20:37,306 INFO namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2072759/2282427 transactions completed. (91%)
2020-01-27 06:20:38,307 INFO namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2214991/2282427 transactions completed. (97%)
So the NameNodes completely started in active/standby state after replaying all 1965 edit log files,
and this takes almost 17 hours.
So after we restart both NameNodes, we expect to get up-to-date fsimage files.
For example:
-rw-r--r-- 1 hdfs hadoop 445716 Jan 31 08:11 fsimage_0000000000000132222
-rw-r--r-- 1 hdfs hadoop 62 Jan 31 08:11 fsimage_0000000000000132222.md5
But in our case, after both NameNodes restart, we get this (fsimage not updated - timestamp still from Jan 03):
-rw-r--r-- 1 hdfs hadoop 445716 Jan 03 07:11 fsimage_0000000000000132222
-rw-r--r-- 1 hdfs hadoop 62 Jan 03 07:11 fsimage_0000000000000132222.md5
So we can see that the fsimage was not updated, even though both NameNodes completely started (after 17 hours) and reached active/standby state.
Any suggestions as to why the fsimage is not updated with the current time?

You can create a new fsimage file by running a checkpoint manually with these commands:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave
IMPORTANT: while running these commands Hadoop is not available, so make sure you have HA active and that your clients are aware of this pause (it can take around 5 minutes or more to complete).
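To confirm that the manual checkpoint actually produced a fresh image, you can check the NameNode metadata directory afterwards; a minimal sketch, reusing the /var/hadoop/hdfs/namenode/current path from the question (adjust it to your dfs.namenode.name.dir):
# A new fsimage_<txid> with a current timestamp should appear after -saveNamespace
ls -lt /var/hadoop/hdfs/namenode/current/fsimage_* | head -4
# Optionally dump the newest image with the Offline Image Viewer to sanity-check it
# (substitute the real file name for the example txid below)
hdfs oiv -p XML -i /var/hadoop/hdfs/namenode/current/fsimage_0000000000000041641 -o /tmp/fsimage.xml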

Related

What is the preferred solution for corrupted namenode metadata

We have an HDP cluster, version 2.6.5.
The cluster includes two NameNodes (one is active and the other is standby)
and 65 DataNode machines.
We have a problem with the standby NameNode, which does not start; in the NameNode logs we can see the following:
2021-01-01 15:19:43,269 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:693)
From Ambari we can see that the standby is down.
For now the active NameNode is up but the standby NameNode is down, and the root cause of this issue is that the NameNode metadata is damaged/corrupted.
So we have two solutions - A or B:
A)
Run the following recovery on the standby NameNode:
su
hadoop namenode -recover
B)
Put Active NN in safemode
su hdfs
hdfs dfsadmin -safemode enter
Do a saveNamespace operation on the Active NN
su hdfs
hdfs dfsadmin -saveNamespace
Leave Safemode
su hdfs
hdfs dfsadmin -safemode leave
Log in to the Standby NN
Run the command below on the standby NameNode to fetch the latest fsimage that we saved in the steps above.
su hdfs
hdfs namenode -bootstrapStandby -force
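Whichever option is chosen, afterwards we can check that both NameNodes come back and report active/standby with hdfs haadmin (a rough sketch; nn1 and nn2 are assumed NameNode IDs, the real ones are listed under dfs.ha.namenodes.<nameservice> in hdfs-site.xml):
# nn1/nn2 are placeholders - use the IDs from dfs.ha.namenodes.<nameservice>
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2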
What is the preferred solution for our problem?

Unable to start datanode, and datanode file permissions change when start-dfs.sh is run

I was facing issues deploying local files to HDFS and found that I should have "drwx------" on the datanode and namenode directories.
Initial permissions of the datanode and namenode directories:
drwx------ 3 hduser hadoop 4096 Mar 2 16:45 datanode
drwxr-xr-x 3 hduser hadoop 4096 Mar 2 17:30 namenode
The permissions of the datanode directory are changed to 755:
hduser@pradeep:~$ chmod -R 755 /usr/local/hadoop_store/hdfs/
hduser@pradeep:~$ ls -l /usr/local/hadoop_store/hdfs/
total 8
drwxr-xr-x 3 hduser hadoop 4096 Mar 2 16:45 datanode
drwxr-xr-x 3 hduser hadoop 4096 Mar 2 17:30 namenode
After running start-dfs.sh, the datanode didn't start and the permissions of the datanode directory were restored to the original state.
hduser@pradeep:~$ $HADOOP_HOME/sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop- hduser-namenode-pradeep.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-pradeep.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-pradeep.out
hduser@pradeep:~$ jps
4385 Jps
3903 NameNode
4255 SecondaryNameNode
hduser@pradeep:~$ ls -l /usr/local/hadoop_store/hdfs/
total 8
drwx------ 3 hduser hadoop 4096 Mar 2 22:34 datanode
drwxr-xr-x 3 hduser hadoop 4096 Mar 2 22:34 namenode
As the datanode is not running, I am not able to deploy data to HDFS from the local file system. I couldn't understand or find any reason why the file permissions are restored to the previous state only for the datanode folder.
It appears the namespaceID generated by the NameNode is different from the one recorded by your DataNode.
Solution:
Go to the path where your Hadoop files are stored on the local file system, for example /usr/local/hadoop. Go down to /usr/local/hadoop/tmp/dfs/name/current/VERSION and copy the namespaceID, then open /usr/local/hadoop/tmp/dfs/data/current/VERSION and replace the namespaceID there with that value.
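A quick way to compare the IDs on both sides is something like the following (a sketch only; the paths are taken from the hadoop_store layout in the question and are assumptions, so adjust them to your dfs.namenode.name.dir / dfs.datanode.data.dir):
# Compare the IDs recorded by the namenode and the datanode
grep -E 'namespaceID|clusterID' /usr/local/hadoop_store/hdfs/namenode/current/VERSION
grep -E 'namespaceID|clusterID' /usr/local/hadoop_store/hdfs/datanode/current/VERSION
# On Hadoop 2.x the datanode's namespaceID lives in the block pool's VERSION file
grep -E 'namespaceID|clusterID' /usr/local/hadoop_store/hdfs/datanode/current/BP-*/current/VERSION
# If the values differ, the datanode refuses to start until they are aligned with the namenode's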
I hope this helps.

Cannot load a file from Hadoop HDFS from Pig Latin

I am having trouble trying to load a CSV file. I keep getting the following error:
Input(s):
Failed to read data from "hdfs://localhost:9000/user/der/1987.csv"
Output(s):
Failed to produce result in "hdfs://localhost:9000/user/der/totalmiles3"
Looking at the Hadoop HDFS installed on my local machine, I can see the file. In fact the file is located at multiple locations such as /, /user/, etc.
hdfs dfs -ls /user/der
Found 1 items
-rw-r--r-- 1 der supergroup 127162942 2015-05-28 12:42
/user/der/1987.csv
My pig script is as follows:
records = LOAD '1987.csv' USING PigStorage(',') AS
(Year, Month, DayofMonth, DayOfWeek, DepTime, CRSDepTime, ArrTime,
CRSArrTime, UniqueCarrier, FlightNum, TailNum,ActualElapsedTime,
CRSElapsedTime,AirTime,ArrDelay, DepDelay, Origin, Dest,
Distance:int, TaxIn, TaxiOut, Cancelled,CancellationCode,
Diverted, CarrierDelay, WeatherDelay, NASDelay, SecurityDelay,
lateAircraftDelay);
milage_recs= GROUP records ALL;
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);
STORE tot_miles INTO 'totalmiles3';
I ran Pig with the -x local option and was able to read the file from my local hard disk. I got the right answer, and tail -f on the Hadoop NameNode log did not scroll, which proves the job ran entirely against the local disk:
pig -x local totalmiles.pig
Now I am getting errors. It seems the Hadoop name server is getting the request, because when I use tail -f I can see the logs scroll.
pig totalmiles.pig
records = LOAD '/user/der/1987.csv' USING PigStorage(',') AS
I get the following error:
Failed Jobs:
JobId Alias Feature Message Outputs
job_local602774674_0001 milage_recs,records,tot_miles GROUP_BY,COMBINER Message: ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:724)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:600)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:94)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
...blah...
Input(s):
Failed to read data from "/user/der/1987.csv"
Output(s):
Failed to produce result in "hdfs://localhost:9000/user/der/totalmiles3"
I used hdfs to check the permissions by creating a directory, and that seems OK:
hdfs dfs -mkdir /user/der/temp2
hdfs dfs -ls /user/der
Found 3 items
-rw-r--r-- 1 der supergroup 127162942 2015-05-28 12:42
/user/der/1987.csv
drwxr-xr-x - der supergroup 0 2015-05-28 16:21
/user/der/temp2
drwxr-xr-x - der supergroup 0 2015-05-28 15:57
/user/der/test
I tried Pig with the mapreduce option and still get the same type of error:
pig -x mapreduce totalmiles.pig
2015-05-28 20:58:44,608 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:totalmiles.pig while submitting
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:724)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:600)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:94)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
My core-site.xml has the temp dir as follows:
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop</value>
<description>A base for other temporary directories.
</description>
</property>
and my hdfs-site.xml has the namenode and datanode directories as follows:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/dfs/datanode</value>
</property>
I've gotten a bit further in debugging the issue. It seems my namenode is misconfigured as I cannot reformat it:
[hadoop hdfs formatting gets error failed for Block pool ]
We have to give the full HDFS path, /user/der/1987.csv:
records = LOAD '/user/der/1987.csv' USING PigStorage(',') AS
(Year, Month, DayofMonth, DayOfWeek, DepTime, CRSDepTime, ArrTime,
CRSArrTime, UniqueCarrier, FlightNum, TailNum,ActualElapsedTime,
CRSElapsedTime,AirTime,ArrDelay, DepDelay, Origin, Dest,
Distance:int, TaxIn, TaxiOut, Cancelled,CancellationCode,
Diverted, CarrierDelay, WeatherDelay, NASDelay, SecurityDelay,
lateAircraftDelay);
If it is just for testing, you can keep the file 1987.csv in the path from which you are executing the pig script, i.e. have 1987.csv and the .pig file in the same location.
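Before re-running in mapreduce mode it can also help to confirm that the full URI resolves exactly as Pig will see it (a small sketch reusing the host, port, and path already shown in the question):
# Confirm the file is visible under the same URI that appears in the error message
hdfs dfs -ls hdfs://localhost:9000/user/der/1987.csv
# Then run the script against the cluster
pig -x mapreduce totalmiles.pig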

Cloudera hdfs another namenode already locked the storage directory

I am running CDH-5.3.2-1.cdh5.3.2.p0.10 with ClouderaManager on Centos 6.6.
My HDFS service was working on the cluster, but I wanted to change the mount point for the Hadoop data. That did not succeed, so I decided to roll back all the changes, but the previous configuration does not work either, which is discouraging.
I have two nodes within the cluster. One data node is reported as bad: DataNodes Health Bad.
In the log I have a few errors:
1:40:10.821 PM ERROR org.apache.hadoop.hdfs.server.common.Storage
It appears that another namenode 931@spark1.xxx.xx has already locked the storage directory
1:40:10.821 PM INFO org.apache.hadoop.hdfs.server.common.Storage
Cannot lock storage /dfs/nn. The directory is already locked
1:40:10.821 PM WARN org.apache.hadoop.hdfs.server.common.Storage
java.io.IOException: Cannot lock storage /dfs/nn. The directory is already locked
1:40:10.822 PM FATAL org.apache.hadoop.hdfs.server.datanode.DataNode
Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to spark1.xxx.xx/10.10.10.10:8022. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:463)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1318)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1288)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:320)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:221)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
at java.lang.Thread.run(Thread.java:745)
I have been trying many possible solutions, but without any luck:
formatting hadoop namenode -format
stopping cluster and rm -rf /dfs/* [and reformatting]
some adjustments to /dfs/nn/current/VERSION file
removing in_use.lock file and starting only a lacking node
removing a file in /tmp/hsperfdata_hdfs/ with name like the pid locking the directory.
These are the files in the directory:
[root@spark1 dfs]# ll
total 8
drwxr-xr-x 3 hdfs hdfs 4096 Apr 28 13:39 nn
drwx------ 3 hdfs hadoop 4096 Apr 28 13:40 snn
There is no dn dir, which is a bit interesting.
I perform all operations on HDFS files as the hdfs user.
In the file /etc/hadoop/conf/hdfs-site.xml there is
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///dfs/nn</value>
</property>
Here is a similar thread from the CDH users Google group which might help you: https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/FYu0gZcdXuE
Also, did you do the namenode format from Cloudera Manager or from the command line? Ideally you should be doing it through Cloudera Manager and not the command line.
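It is also worth checking which process actually holds the lock before deleting anything: the in_use.lock file records the owning JVM as pid@host (a rough sketch; the 931 PID comes from the error message above):
# Show which JVM claims the lock on the namenode storage directory
cat /dfs/nn/in_use.lock
# Check whether that PID (931 in the log above) is still running and what it is
ps -fp 931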

Running Apache Pig tutorial problems

I am having some difficulties running the "standard" Pig tutorial - the script1-hadoop.pig script.
However, because of the cluster setup (users), I had to modify the example a bit. The standard tutorial expects all files in / on HDFS, which I cannot use in my case, so I created a /pig dir for that purpose:
drwxrwxrwx - hdfs hdfs 0 2014-03-31 11:15 /pig
with the uploaded content
-rw-r--r-- 3 jakub hdfs 10408717 2014-03-31 10:41 /pig/excite.log.bz2
I also modified the pig script script1-hadoop.pig to respect those changes, as follows (mainly just the LOAD and STORE commands):
raw = LOAD '/pig/excite.log.bz2' USING PigStorage('\t') AS (user, time, query);
...
STORE ordered_uniq_frequency INTO '/pig/script1-hadoop-results' USING PigStorage();
I run the pig script:
[jakub#hadooptools pigtmp]$ pig script1-hadoop.pig
but with no luck, and I get the error:
2014-03-31 10:15:11,896 [main] ERROR org.apache.pig.tools.grunt.Grunt - You don't have permission to perform the operation. Error from the server: Permission denied: user=jakub, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5202)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5184)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5158)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3405)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3375)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3349)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59598)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
I am not quite sure why the Pig script is trying to write into / on HDFS. I know that Pig can store some intermediate results on HDFS, so I modified the pig.temp.dir property (/etc/pig/conf/pig.properties) and created the location /pig/tmp on HDFS:
drwxrwxrwx - jakub hdfs 0 2014-03-31 11:15 /pig/tmp
Any idea what might be wrong? Pig in local mode is ok.
Sorted.
The user running the Pig script has to have permission to write to the tmp directory created, and /user/pig_user_running has to be present on the cluster as well, with permissions allowing that user to write there.
The super-user on HDFS is the user under which the NameNode process runs, which is typically hdfs.
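For example, creating the missing HDFS home directory as the hdfs super-user looks roughly like this (a sketch for the jakub user from the question; the hdfs group is an assumption, pick whatever group fits your cluster):
# Create the HDFS home directory for the user running Pig and hand it over to that user
sudo -u hdfs hdfs dfs -mkdir -p /user/jakub
sudo -u hdfs hdfs dfs -chown jakub:hdfs /user/jakub
# The Pig temp location must be writable by that user as well
hdfs dfs -ls -d /pig/tmp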
