How can I access YARN job logs via the web UI?
I can view job logs on the YARN manager web UI, but every time YARN restarts, the application list in the YARN manager is empty. (A screenshot, not reproduced here, showed the populated list from before the restart.)
I can still access the application logs via the CLI, even after restarting YARN:
$HADOOP_HOME/bin/yarn logs -applicationId application_1499949542308_0020
The JobHistory Server web UI is empty all the time.
My log settings in yarn-site.xml and mapred-site.xml:
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/home/hadoop/hadoop/nodemanager-logs</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/app-logs</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://hdp03.hp.sp.prd.bmsre.com:19888/jobhistory/logs</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hdp03.hp.sp.prd.bmsre.com:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hdp03.hp.sp.prd.bmsre.com:19888</value>
</property>
On the DataNode machines, you can check this folder:
${HADOOP_HOME}/logs/userlogs
You just need to go to the folder that has the same name as the application ID.
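For example, a minimal check using the application ID from the question (a sketch, assuming the default local log layout under $HADOOP_HOME):
# List the per-container logs kept locally for one application
ls ${HADOOP_HOME}/logs/userlogs/application_1499949542308_0020/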
Yes, you can access retired YARN jobs from the web UI.
Access the JobTracker web UI at http://<jobtracker>:50030 (the JobTracker's default port; 50070 is the NameNode UI) to see the retired jobs.
Regarding your restart: when you restart YARN, a new log-aggregation thread wakes up and uploads the logs to the configured location.
But does the /app-logs path from your configuration actually exist in your file system? Please check.
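A quick way to verify (a sketch, assuming the HDFS CLI is on your PATH):
# Succeeds only if /app-logs exists as a directory in HDFS
hdfs dfs -test -d /app-logs && echo "/app-logs exists" || echo "/app-logs is missing"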
There is a retention period for how long the logs are kept in that path; it is defined by the yarn.log-aggregation.retain-seconds property.
To my understanding, the JobTracker UI, by default available at http://<jobtracker>:50030, exposes information on all currently running as well as retired MapReduce jobs, and YARN's JobHistory Server exposes a REST service with details on finished applications.
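As a sketch of that REST service (using the JobHistory Server host from the question's config; the job ID is the application ID with the application_ prefix replaced by job_):
# List finished MapReduce jobs known to the JobHistory Server
curl http://hdp03.hp.sp.prd.bmsre.com:19888/ws/v1/history/mapreduce/jobs
# Details for one job
curl http://hdp03.hp.sp.prd.bmsre.com:19888/ws/v1/history/mapreduce/jobs/job_1499949542308_0020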
We are usually able to see YARN container logs in the "/var/log/hadoop-yarn/containers" path. Though I am able to see logs for successful jobs, I am not able to see the logs for failed jobs. The NodeManager log shows the logs getting deleted.
Log:
2017-07-13 14:16:04,170 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (DeletionService #1): Deleting path : /var/log/hadoop-yarn/containers/application_1234567890_12345/container_11234567890_12345_11_000001/stdout
2017-07-13 14:16:04,180 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl (LogAggregationService #6093): renaming /var/log/hadoop-yarn/apps/hadoop/logs/application_1234567890_12345/xx.xx.xx.xx_8041.tmp to /var/log/hadoop-yarn/apps/hadoop/logs/application_1234567890_12345/xx.xx.xx.xx_8041
2017-07-13 14:16:04,181 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (DeletionService #3): Deleting path : /var/log/hadoop-yarn/containers/application_1234567890_12345
2017-07-13 14:16:06,048 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Stopping resource-monitoring for container_11234567890_12345_11_0000
Here's a snippet of my yarn-site.xml.
Can someone please advise on what config needs to be modified to retain logs for failed jobs?
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://ip-XX.XX.XX.XX:19888/jobhistory/logs</value>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/mnt/yarn</value>
  <final>true</final>
</property>
<property>
  <description>Where to store container logs.</description>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/var/log/hadoop-yarn/containers</value>
</property>
<property>
  <description>Where to aggregate logs to.</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/log/hadoop-yarn/apps</value>
</property>
<property>
  <name>yarn.log-aggregation.enable-local-cleanup</name>
  <value>true</value>
</property>
<property>
  <name>yarn.scheduler.increment-allocation-mb</name>
  <value>32</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
The logs get moved to HDFS when log aggregation completes; usually this is /app-logs on HDFS.
Check the settings below against the documentation:
yarn.nodemanager.remote-app-log-dir
This is normally /app-logs on HDFS, but in your case it is set to /var/log/hadoop-yarn/apps. Does this directory exist on HDFS? It looks like a local directory value was put here by mistake.
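A quick check (a sketch, using the path from your yarn-site.xml):
# If this fails, the aggregation target does not exist in HDFS
hdfs dfs -ls /var/log/hadoop-yarn/apps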
Other settings that may be useful:
yarn.log-aggregation-enable:
If ${yarn.log-aggregation-enable} is enabled, the NodeManager will immediately concatenate all of the container logs into one file, upload them into HDFS under ${yarn.nodemanager.remote-app-log-dir}/${user.name}/logs/, and delete them from the local userlogs directory.
yarn.nodemanager.delete.debug-delay-sec:
The number of seconds after an application finishes before the NodeManager's DeletionService will delete the application's localized file directory and log directory. To diagnose YARN application problems, set this property's value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. After changing the property's value, you must restart the NodeManager for it to take effect.
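As a sketch, the corresponding yarn-site.xml entry (the 600-second value is the illustrative one from the description above):
<property>
  <!-- keep local container dirs for 10 minutes after the app finishes -->
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>600</value>
</property>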
I'm currently running the Apache Ignite Hadoop accelerator for MapReduce. The jobs run, but I am unable to see them in the JobHistoryServer. I wouldn't expect to see the jobs in YARN's ResourceManager (and don't).
I'm running my MapReduce jobs like:
hadoop --config path/to/config/ jar path/to/jar ....
In mapred-site.xml, I've added:
<property>
  <name>mapreduce.framework.name</name>
  <value>ignite</value>
</property>
<property>
  <name>mapreduce.jobtracker.address</name>
  <value>[your_host]:11211</value>
</property>
My mapreduce.jobhistory.* settings have not been changed.
In core-site.xml, I've added:
<property>
  <name>fs.default.name</name>
  <value>igfs://igfs#/</value>
</property>
<property>
  <name>fs.igfs.impl</name>
  <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.igfs.impl</name>
  <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
</property>
I've also added ignite-core-1.6.0.jar, ignite-hadoop-1.6.0.jar, and ignite-shmem-1.0.0.jar to the $HADOOP_HOME path. Similarly, I've exported HADOOP_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, and HADOOP_MAPRED_HOME.
Is this functionality not supported by Ignite or am I doing something wrong?
Also, is there a way to track the MapReduce job running on Ignite?
Currently, Ignite does not integrate with the Hadoop History Server in any way; issue https://issues.apache.org/jira/browse/IGNITE-3766 requests that functionality.
Configuring Hadoop 2.7.1 to retain YARN jobs for longer
I have enabled log aggregation and the jobhistory/timeline server. When a job completes in the ResourceManager, it does show up in the JobHistory Server (if you use the correct URL); however, the JobHistory Server only lists M/R jobs, not YARN applications.
The problem is that the job is not visible in the timeline server; in fact, no jobs show up in the timeline server at all.
Current yarn-site.xml configuration:
<property>
  <name>yarn.timeline-service.hostname</name>
  <value>host1</value>
</property>
<property>
  <name>yarn.timeline-service.address</name>
  <value>${yarn.timeline-service.hostname}:10200</value>
</property>
<property>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://${yarn.timeline-service.hostname}:19888/jobhistory/logs/</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/var/vm/apps/hadoop/logs</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/vm/apps/hadoop/logs</value>
</property>
Am I providing conflicting configuration by using the JobHistory Server AND the timeline server?
At the end of the day, I want the YARN logs persisted to HDFS for viewing in the web UI over the following days/weeks.
You need to set the mapreduce.job.emit-timeline-data property to true in mapred-site.xml.
This will enable MapReduce jobs to push events to the timeline server.
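For example, the mapred-site.xml entry this answer describes:
<property>
  <name>mapreduce.job.emit-timeline-data</name>
  <value>true</value>
</property>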
I am using the latest Hadoop version, 3.0.0, built from source. I have my timeline service up and running and have configured Hadoop to use it for job history as well. But when I click on History in the ResourceManager UI, I get the error below:
HTTP ERROR 404
Problem accessing /jobhistory/job/job_1444395439959_0001. Reason:
NOT_FOUND
Can someone please point out what I am missing here? The following is my yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <description>The hostname of the Timeline service web application.</description>
    <name>yarn.timeline-service.hostname</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <description>Address for the Timeline server to start the RPC server.</description>
    <name>yarn.timeline-service.address</name>
    <value>${yarn.timeline-service.hostname}:10200</value>
  </property>
  <property>
    <description>The http address of the Timeline service web application.</description>
    <name>yarn.timeline-service.webapp.address</name>
    <value>${yarn.timeline-service.hostname}:8188</value>
  </property>
  <property>
    <description>The https address of the Timeline service web application.</description>
    <name>yarn.timeline-service.webapp.https.address</name>
    <value>${yarn.timeline-service.hostname}:8190</value>
  </property>
  <property>
    <description>Handler thread count to serve the client RPC requests.</description>
    <name>yarn.timeline-service.handler-thread-count</name>
    <value>10</value>
  </property>
  <property>
    <description>Indicate to ResourceManager as well as clients whether
    history-service is enabled or not. If enabled, ResourceManager starts
    recording historical data that Timeline service can consume. Similarly,
    clients can redirect to the history service when applications
    finish if this is enabled.</description>
    <name>yarn.timeline-service.generic-application-history.enabled</name>
    <value>true</value>
  </property>
  <property>
    <description>Store class name for history store, defaulting to file system
    store</description>
    <name>yarn.timeline-service.generic-application-history.store-class</name>
    <value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
  </property>
  <property>
    <description>URI pointing to the location of the FileSystem path where the history will be persisted.</description>
    <name>yarn.timeline-service.generic-application-history.fs-history-store.uri</name>
    <value>/tmp/yarn/system/history</value>
  </property>
  <property>
    <description>T-file compression types used to compress history data.</description>
    <name>yarn.timeline-service.generic-application-history.fs-history-store.compression-type</name>
    <value>none</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
and my mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>localhost:10200</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>localhost:8188</value>
  </property>
  <property>
    <name>mapreduce.job.emit-timeline-data</name>
    <value>true</value>
  </property>
</configuration>
JPS output:
6022 NameNode
27976 NodeManager
27859 ResourceManager
6139 DataNode
6310 SecondaryNameNode
28482 ApplicationHistoryServer
29230 Jps
If you want to see the logs through the YARN RM web UI, you need to enable log aggregation. For that, set the following parameters in yarn-site.xml:
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/app-logs</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
If you do not enable log aggregation, the NodeManagers will store the logs locally. With the above settings, the logs are aggregated in HDFS at "/app-logs/{username}/logs/". Under this folder, you can find the logs for all the applications run so far. Again, log retention is determined by the configuration parameter "yarn.log-aggregation.retain-seconds" (how long to retain the aggregated logs).
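For example, a sketch of browsing the aggregated logs (assuming jobs were submitted by a hypothetical user named hadoop; the application ID is taken from the 404 error above):
hdfs dfs -ls /app-logs/hadoop/logs
hdfs dfs -ls /app-logs/hadoop/logs/application_1444395439959_0001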
While MapReduce applications are running, you can access the logs from YARN's web UI. Once an application is completed, the logs are served through the JobHistory Server.
Also, set the following configuration parameter in yarn-site.xml:
<property>
  <name>yarn.log.server.url</name>
  <value>http://{job-history-hostname}:8188/jobhistory/logs</value>
</property>
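Note that the jps output above does not list a MapReduce JobHistoryServer process; if you rely on it to serve completed-application logs, it has to be started separately (a sketch, assuming the standard daemon scripts shipped with Hadoop):
# Hadoop 2.x
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
# Hadoop 3.x
mapred --daemon start historyserver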
I enabled permission management in my Hadoop cluster, but I'm facing a problem submitting jobs with Pig. This is the scenario:
1 - I have the hadoop/hadoop user
2 - I have the myuserapp/myuserapp user that runs the PIG script.
3 - We set up the path /myapp to be owned by myuserapp
4 - We set pig.temp.dir to /myapp/pig/tmp
But when Pig tries to run the jobs, we get the following error:
job_201303221059_0009 all_actions,filtered,raw_data DISTINCT Message: Job failed! Error - Job initialization failed: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=realtime, access=EXECUTE, inode="system":hadoop:supergroup:rwx------
The Hadoop JobTracker requires this permission to start up its server.
My Hadoop policy looks like:
<property>
  <name>security.client.datanode.protocol.acl</name>
  <value>hadoop,myuserapp supergroup,myuserapp</value>
</property>
<property>
  <name>security.inter.tracker.protocol.acl</name>
  <value>hadoop,myuserapp supergroup,myuserapp</value>
</property>
<property>
  <name>security.job.submission.protocol.acl</name>
  <value>hadoop,myuserapp supergroup,myuserapp</value>
</property>
My hdfs-site.xml:
<property>
  <name>dfs.permissions</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.data.dir.perm</name>
  <value>755</value>
</property>
<property>
  <name>dfs.web.ugi</name>
  <value>hadoop,supergroup</value>
</property>
My core-site.xml:
...
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
...
And finally, my mapred-site.xml:
...
<property>
  <name>mapred.local.dir</name>
  <value>/tmp/mapred</value>
</property>
<property>
  <name>mapreduce.jobtracker.jobhistory.location</name>
  <value>/opt/logs/hadoop/history</value>
</property>
Is there a missing configuration? How can I deal with multiple users running jobs in a restricted HDFS cluster?
Your problem is probably the staging directory. Try adding this property to mapred-site.xml:
<property>
  <name>mapreduce.jobtracker.staging.root.dir</name>
  <value>/user</value>
</property>
Then make sure that the submitting user (e.g. 'realtime') has a home directory (e.g. '/user/realtime') and that they own it.
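For example (a sketch, run as the HDFS superuser; the realtime user comes from the error message above):
hadoop fs -mkdir /user/realtime
hadoop fs -chown realtime:realtime /user/realtime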
The Fair Scheduler is designed to run MapReduce jobs as the submitting user, and it creates separate pools for users/groups while still sharing resources. At first look, there are some issues with this scheduler related to permissions on certain directories, not allowing other users to execute/write in places that are necessary for the job to run.
So, one solution is to use the Capacity Scheduler:
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
The Capacity Scheduler uses a number of named queues, where each queue has a configurable number of map and reduce slots. One good thing about the Capacity Scheduler is the ability to place a limit on the percentage of running tasks per user, so that users share a cluster with a quota.
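As an illustrative sketch only (property names taken from the Hadoop 1.x Capacity Scheduler documentation; the queue names and percentages are hypothetical, so verify them against your version), queues are declared in mapred-site.xml and sized in capacity-scheduler.xml:
<!-- mapred-site.xml: declare the queues (queue names here are hypothetical) -->
<property>
  <name>mapred.queue.names</name>
  <value>default,app</value>
</property>
<!-- capacity-scheduler.xml: give each queue a share of the cluster slots -->
<property>
  <name>mapred.capacity-scheduler.queue.default.capacity</name>
  <value>50</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.app.capacity</name>
  <value>50</value>
</property>
<property>
  <!-- each active user in the queue is guaranteed at least this percentage -->
  <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
  <value>25</value>
</property>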