Getting "User [dr.who] is not authorized to view the logs for application <AppID>" while running a YARN application

I'm running a custom YARN application using Apache Twill in an HDP 2.5 cluster, but I'm not able to see my own container logs (syslog, stderr and stdout) when I go to my container web page.
Also, the logged-in user changes from my Kerberos principal to "dr.who" when I navigate to this page.
I can, however, see the logs of MapReduce jobs. The Hadoop version is 2.7.3, and YARN ACLs are enabled on the cluster.

I had this issue with the Hadoop UI. According to the documentation, hadoop.http.staticuser.user is set to dr.who by default, so you need to set it explicitly in the relevant configuration file (core-site.xml in my case).
This answer is late, but I hope it is useful.
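To illustrate the answer above, here is a minimal sketch that reads the current value of hadoop.http.staticuser.user from a core-site.xml file. The function name get_hadoop_property and the path in the usage comment are assumptions for illustration, and the parsing is deliberately simplified: it expects each <name> and <value> element to sit on a single line.

```shell
# Print the <value> of one property from a Hadoop *-site.xml file.
# Hypothetical helper; assumes <name>...</name> and <value>...</value>
# each fit on one line, as in typical Hadoop configuration files.
get_hadoop_property() {
  conf_file="$1"
  prop_name="$2"
  awk -v name="$prop_name" '
    # Remember that we have reached the requested property.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> that follows it, then stop.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$conf_file"
}

# Usage (the path is an assumption; adjust it to your installation):
#   get_hadoop_property /etc/hadoop/conf/core-site.xml hadoop.http.staticuser.user
```

If the function prints nothing, the property is not set in that file and the built-in default (dr.who) applies.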

Related

Where do YARN application logs get stored in EMR before being sent to S3

I have a requirement to write YARN application logs from EMR to a destination other than S3. Can you please help me find where application logs get saved on the EMR master instance?
If the application is submitted to EMR as a step, then the logs will reside in:
/var/log/hadoop/steps/<<step-id>>/<<log-file>>
Most logs for EMR can be found under the /var/log directory on the master node.
You could also use the yarn CLI to get the application logs and redirect the returned log stream to a file, which you can then process however you want:
yarn logs -applicationId <<application_id>> > application_log_file.log
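As a hedged sketch of the redirect approach above, the function below wraps the same yarn logs -applicationId command and writes the output to a file named after the application. The function name save_yarn_logs, the output directory argument, and the application ID in the usage comment are all hypothetical.

```shell
# Save the aggregated logs of one YARN application to <out_dir>/<app_id>.log.
# Hypothetical helper built around the command from the answer above.
save_yarn_logs() {
  app_id="$1"
  out_dir="${2:-.}"            # defaults to the current directory
  out_file="$out_dir/$app_id.log"

  mkdir -p "$out_dir" || return 1

  # Redirect the returned log stream to the file.
  if yarn logs -applicationId "$app_id" > "$out_file"; then
    echo "$out_file"           # print the path so callers can pick it up
  else
    echo "yarn logs failed for $app_id" >&2
    return 1
  fi
}

# Usage (illustrative application ID and directory):
#   save_yarn_logs application_1234567890123_0001 /tmp/emr-logs
```

From there the file can be copied to whatever destination you need instead of S3.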
YARN logs are found at /var/log/hadoop-yarn/, and YARN container logs are found at /var/log/hadoop-yarn/container
Links:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-debugging.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-view-web-log-files.html

HDP 2.5: Zeppelin won't run Notebook in Kerberos-enabled cluster

I set up a Hadoop cluster with Hortonworks Data Platform 2.5 and Ambari 2.4. I also added the Zeppelin service to the cluster installation via Ambari UI.
Since I enabled Kerberos, I can't run the Zeppelin Notebooks anymore. When I click "Run paragraph" or "Run all paragraphs" nothing seems to happen. I also don't get any new entries in my logs in /var/log/zeppelin/. Before enabling Kerberos I was able to run the paragraphs.
I tried some example notebooks and also some of my own, and got the same problem: nothing happens. I tried with both admin and non-admin users.
The problem affects my "Spark" and "sh" interpreters, and other paragraphs (e.g. %sql) also don't work.
The tutorial below captures the configuration of Ambari and Hadoop Kerberos:
Configuring Ambari and Hadoop for Kerberos

Apache Ranger-admin not showing active plugin

I have set up Apache Ranger authorization for Apache Hadoop.
ranger-admin and ranger-usersync are running without any errors.
I have also enabled ranger-hdfs-plugin and restarted Hadoop, but the active plugin list is empty in the ranger-admin UI.
I don't see any errors in any of the log files. Can someone guide me on how to resolve this issue?
Ranger version: 0.5

I cannot see the running applications in Hadoop 2.5.2 (YARN)

I installed Hadoop 2.5.2, and I can run the wordcount sample successfully. However, when I want to see the application running on YARN (the running job), I cannot, because the "All Applications" interface is always empty.
Is there any way to make the jobs visible?
Please try localhost:19888, or check the value of the job history web URL property (mapreduce.jobhistory.webapp.address) in your Hadoop configuration (it is normally set in mapred-site.xml).

Where is the Hadoop task manager UI

I installed the Hadoop 2.2 system on my Ubuntu box using this tutorial:
http://codesfusion.blogspot.com/2013/11/hadoop-2x-core-hdfs-and-yarn-components.html
Everything worked fine for me and now when I do
http://localhost:50070
I can see the management UI for HDFS. Very good!!
But I am now going through another tutorial, which tells me that there must be a task manager UI running at http://mymachine.com:50030 and http://mymachine.com:50060.
On my machine I cannot open these ports.
I have already run:
start-dfs.sh
start-yarn.sh
start-all.sh
Is something wrong? Why can't I see the task manager UI?
You have installed YARN (MRv2) which runs the ResourceManager. The URL http://mymachine.com:50030 is the web address for the JobTracker daemon that comes with MRv1 and hence you are not able to see it.
To see the ResourceManager UI, check your yarn-site.xml file for the following property:
yarn.resourcemanager.webapp.address
By default, it should point to: resource_manager_hostname:8088
Assuming your ResourceManager runs on mymachine, you should see the ResourceManager UI at http://mymachine.com:8088/
Make sure all your daemons are up and running before you visit the URL for the ResourceManager.
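The two checks in this answer (daemons running, then the ResourceManager URL) can be sketched roughly as follows. This assumes a single-node setup where the ResourceManager and NodeManager run on the machine where you execute the check, that jps (shipped with the JDK) and curl are on the PATH, and that the default port 8088 is in use. The function name check_yarn_ui is hypothetical.

```shell
# Check that the YARN daemons are running locally, then probe the
# ResourceManager web UI. Hypothetical helper for a single-node setup.
check_yarn_ui() {
  rm_host="${1:-localhost}"
  rm_port="${2:-8088}"    # default port of yarn.resourcemanager.webapp.address

  # jps lists the Java processes of the current user on this host.
  running="$(jps)"
  for daemon in ResourceManager NodeManager; do
    if ! printf '%s\n' "$running" | grep -qw "$daemon"; then
      echo "$daemon is not running on this host" >&2
      return 1
    fi
  done

  # Any HTTP response counts as reachable; only connection errors fail.
  if curl -s -o /dev/null --max-time 5 "http://$rm_host:$rm_port/"; then
    echo "ResourceManager UI reachable at http://$rm_host:$rm_port/"
  else
    echo "ResourceManager UI not reachable at http://$rm_host:$rm_port/" >&2
    return 1
  fi
}

# Usage (host name taken from the question):
#   check_yarn_ui mymachine.com
```

If a daemon is reported missing, start it (for example with start-yarn.sh) before retrying the URL.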
For Hadoop 2 (aka YARN/MRv2): in any Hadoop installation versioned 2.x or higher, the ResourceManager UI is on port 8088, e.g. localhost:8088
For Hadoop 1: in any Hadoop installation versioned lower than 2.x (e.g. 1.x or 0.x), the JobTracker UI is on port 50030, e.g. localhost:50030
By default, the HDFS (NameNode) UI is at:
http://mymachine.com:50070
