I am using an Apache Hadoop 2.7.1 cluster that consists of 4 data nodes and two name nodes, because it is highly available.
It is deployed on CentOS 7
and has been running since 01-08-2017.
We know that logs are generated for each service,
so let's take the current logs as an example:
hadoop-root-datanode-dn1.log
hadoop-root-datanode-dn2.log
where root (in the hadoop-root prefix) is the user I am logged in as.
My problem is:
the dn1 log contains entries from 01-08-2017 until today,
but the dn2 log is incomplete: it is emptied every day, so it only contains entries from today.
Is there a property that controls this behavior, or is it a CentOS problem?
Any help, please?
By default, the .log files are rotated daily by log4j. This is configurable with /etc/hadoop/conf/log4j.properties.
https://blog.cloudera.com/blog/2009/09/apache-hadoop-log-files-where-to-find-them-in-cdh-and-what-info-they-contain/
Not to suggest you're running a Cloudera cluster, but if you were, those files are not deleted. They're rolled and renamed.
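To check whether dn2's older entries were rolled into dated files rather than lost, you can look for siblings of the active log in the log directory. Here is a minimal sketch, assuming the daily appender renames rolled files by appending a date suffix such as .2017-08-01 (the usual log4j DailyRollingFileAppender pattern; your log4j.properties may use a different one). The function name and the sample file names are hypothetical.

```python
import re
from datetime import date

# Assumption: rolled files are named "<active log name>.YYYY-MM-DD".
ROLLED_SUFFIX = re.compile(r"\.(\d{4})-(\d{2})-(\d{2})")


def rolled_logs(filenames, active_log):
    """Return (date, filename) pairs for rolled copies of active_log, oldest first."""
    found = []
    for name in filenames:
        if not name.startswith(active_log):
            continue
        # Only the part after the active log name may be the date suffix.
        match = ROLLED_SUFFIX.fullmatch(name[len(active_log):])
        if match is None:
            continue
        year, month, day = (int(part) for part in match.groups())
        try:
            found.append((date(year, month, day), name))
        except ValueError:
            # Suffix looked like a date but is not a valid calendar day.
            continue
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical listing; in practice pass os.listdir() of your log directory.
    listing = [
        "hadoop-root-datanode-dn2.log",
        "hadoop-root-datanode-dn2.log.2017-08-02",
        "hadoop-root-datanode-dn2.log.2017-08-01",
    ]
    for day, name in rolled_logs(listing, "hadoop-root-datanode-dn2.log"):
        print(day.isoformat(), name)
```

If this finds dated files on dn2 but none on dn1, the two nodes are probably using different appenders in their log4j.properties, which would explain the difference you see.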
Oh, and I would suggest not running your daemons as root. Most Hadoop installation guides explicitly have you create an hdfs or hadoop user.
Hadoop-HA cluster - 4 nodes
As soon as I start the Hadoop services, unnecessary YARN applications get launched and no application logs are generated. I am not able to debug the problem without logs. Can anyone help me resolve this issue?
https://i.stack.imgur.com/RjvkB.png
I have never come across such an issue, but it seems that some script, or maybe some Oozie job, is triggering these apps. Try Yarn-Clean in case it helps.
Yarn-Clean
I deleted multiple old files (HiveLogs/MR-Job intermediate files) from HDFS location /temp/hive-user/hive_2015*.
After that, I noticed that my four-node cluster was responding very slowly and had the following issues.
I restarted my cluster; it worked fine for 3-4 hours, and then the same issues started again:
The Hadoop DFS health page loads very slowly.
File browsing is very slow.
The NameNode logs are filling up with "Blocks does not belongs to any File".
All operations on my cluster are slow.
I found that this could be a result of deleting the HDFS files. According to HDFS JIRA issues 7815 and 7480, because I deleted a huge number of files, the NameNode could not delete the blocks properly, as it was busy with many deletion tasks. This is an existing bug in older versions of Hadoop (older than 2.6.0).
Can anyone please suggest a quick fix that does not require upgrading my Hadoop cluster or installing a patch?
How can I identify those orphan blocks and delete them from Hadoop FS?
I need to track what is happening when I run a job or upload a file to HDFS. In SQL Server I do this using SQL Profiler. However, I have not found such a tool for Hadoop, so I am assuming that I can get some information from the logs. I think all logs are stored at /var/logs/hadoop/, but I am confused about which file I need to look at and how to configure that file to capture detailed information.
I am using HDP2.2.
Thanks,
Sree
'Hadoop' represents an entire ecosystem of different products. Each one has its own logging.
HDFS consists of NameNode and DataNode services. Each has its own log. Location of logs is distribution dependent. See File Locations for Hortonworks or Apache Hadoop Log Files: Where to find them in CDH, and what info they contain for Cloudera.
In Hadoop 2.2, MapReduce ('jobs') is a specific application in YARN, so you are talking about the ResourceManager and NodeManager services (the YARN components), each with its own log, and then there is the MRApplication (the M/R component), which is a YARN application yet has its own log.
Jobs consist of tasks, and tasks themselves have their own logs.
In Hadoop 2 there is a dedicated Job History service tasked with collecting and storing the logs from the jobs executed.
Higher-level components (e.g. Hive, Pig, Kafka) have their own logs, aside from the logs produced by the jobs they submit (which log as any job does).
The good news is that vendor-specific distributions (Cloudera, Hortonworks etc.) provide some specific UI to expose the most common logs for easy access. Usually they expose the logs collected by the JobHistory service from the UI that shows job status and job history.
I cannot point you to anything equivalent to SQL Profiler, because the problem space is orders of magnitude more complex, with many different products, versions and vendor-specific distributions involved. I recommend starting by reading about how the Job History server runs and how it can be accessed.
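As a starting point for accessing the Job History server programmatically, here is a minimal sketch that lists finished MapReduce jobs through its REST interface. It assumes the server's web port is 19888 (the stock default for mapreduce.jobhistory.webapp.address; your distribution may configure another) and that the response has the documented {"jobs": {"job": [...]}} shape. The function names are hypothetical.

```python
import json
from urllib.request import Request, urlopen


def history_jobs_url(host, port=19888):
    """Build the URL of the JobHistory server's job list (port is an assumption)."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def summarize_jobs(payload):
    """Return (id, name, state) for each job in a decoded 'jobs' response."""
    jobs = (payload.get("jobs") or {}).get("job") or []
    return [(job.get("id"), job.get("name"), job.get("state")) for job in jobs]


def fetch_jobs(host, port=19888, timeout=10):
    """Query the JobHistory server and summarize the jobs it knows about."""
    request = Request(
        history_jobs_url(host, port),
        headers={"Accept": "application/json"},  # ask for JSON explicitly
    )
    with urlopen(request, timeout=timeout) as response:
        return summarize_jobs(json.load(response))
```

Calling fetch_jobs with the hostname of your own history server returns a list of (id, name, state) tuples; the job id can then be used to find the matching task logs.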
I have installed CDH 5.5.1 with Hue, Hadoop, Spark, Hive, Oozie, Yarn and ZooKeeper.
When I run a Spark job or MapReduce job, Hue displays an issue in the job history. The problem is that when I restart the CDH services (not the physical nodes), all the job histories from before the restart are removed.
On Hadoop there are several files that I suspect have information about the task and might be the ones that hold the job information. Their hadoop paths are:
/tmp/logs/user/logs/
/user/history/done/2016/
I have looked for it in the Cloudera Manager configuration page, Hue configuration page and some configuration files with no success. I don't know how to prevent this removal. Am I missing something?
If you really just need to see job history on a Hadoop cluster, the YARN History Server should have a history of all YARN jobs run on the cluster.
Hue has a JIRA ticket for the issue you describe, titled "Job browser should talk to the YARN history server to display old jobs": https://issues.cloudera.org/browse/HUE-2558. Basically, Hue needs to talk to the YARN History Server (not just the Resource Manager) to get the information you're looking for.
The good news is that the task appears to have been completed and included with the release of Hue 4.0, which occurred on 5/11/2017. The bad news is that Cloudera has not yet done a release with that version of Hue rolled in.
I'm working on a cluster with hbase.
One node crashed a couple of days ago. I restarted the cluster; since then, the root region has been in transition despite all my efforts.
70236052 -ROOT-,,0.70236052 state=CLOSING, ts=Wed Apr 10 15:06:04 CEST 2013 (417729s ago), server=NODE09...
I tried to:
restart HBase
remove the service and reinstall it
remove the service and install the master on another node
install 2 different HBase
format the HDFS NameNode
delete the HBase files from HDFS
It can still find this region in transition.
I tried to access the .META. table:
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for after 7 tries
I attempted to run the command /bin/hbase hbck:
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for after 10 tries.
I'm out of ideas for solving this issue.
Does someone have any suggestions?
Regards
It might be that the clock on the node hosting the problematic regionserver is not synchronized with the rest of the cluster. Check that NTP is configured correctly.
In any event, check the log of the problematic regionserver.
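To make the clock check concrete, here is a minimal sketch that flags nodes whose clock differs too much from a reference. It assumes you have already collected one clock reading per node at roughly the same moment (for example by running date on each host, which is not shown here). The 30-second threshold is an assumption modelled on what I understand to be the default of HBase's hbase.master.maxclockskew; check your own hbase-site.xml. The function name and node readings are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: 30 s, modelled on the default of hbase.master.maxclockskew.
MAX_SKEW = timedelta(seconds=30)


def skewed_nodes(clock_readings, reference, max_skew=MAX_SKEW):
    """Return {node: skew in seconds} for nodes off by more than max_skew."""
    offenders = {}
    for node, reading in clock_readings.items():
        skew = (reading - reference).total_seconds()
        if abs(skew) > max_skew.total_seconds():
            offenders[node] = skew
    return offenders


if __name__ == "__main__":
    # Hypothetical readings taken at (nearly) the same instant.
    reference = datetime(2013, 4, 15, 12, 0, 0)
    readings = {
        "NODE08": reference + timedelta(seconds=2),
        "NODE09": reference - timedelta(minutes=5),
    }
    for node, skew in skewed_nodes(readings, reference).items():
        print(f"{node} is off by {skew:+.0f} s")
```

A node reported here is a candidate for an NTP misconfiguration; a negative value means its clock is behind the reference.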