Application (job) list empty on Hadoop 2.x - hadoop

I have a Hadoop 2.8.1 installation on macOS Sierra (Darwin kernel version 16.7.0) and it's working fine, except for application/task tracking.
1) At first, I thought it was a problem with the Resource Manager web interface. So:
I copied the yarn-site.xml template to the etc/yarn-site.xml file, but it didn't help.
I tried changing the default 'dr.who' user to my Hadoop user on the Resource Manager (http://localhost:18088/cluster/apps/RUNNING?user.name=myUser), but that didn't help either.
2) I can't track my applications (jobs) from the command line either: yarn application -list always returns an empty list.
3) One more detail: the job's INFO output shows the following lines, but I can't access that URL.
INFO mapreduce.Job: The url to track the job: http://localhost:8080/
INFO mapreduce.Job: Running job: job_local2009332672_0001
Is this a YARN problem? Should I change another configuration file? Thanks!

Look at mapreduce.framework.name in mapred-site.xml in your HADOOP_CONF_DIR and set its value to yarn. (The job_local prefix in your job ID means the job is running with the local job runner rather than being submitted to YARN, which is why nothing shows up in the Resource Manager.)
If you don't have a mapred-site.xml, copy the mapred-default XML file and rename it.
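For reference, a minimal sketch of the relevant mapred-site.xml entry (inside your HADOOP_CONF_DIR; the file may of course contain other properties as well):
<configuration>
<property>
<!-- run MapReduce jobs on YARN instead of the local job runner -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>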

Thanks for the answer; I had been looking for this without success and had changed etc/hosts for nothing.
The answer is to set mapreduce.framework.name in mapred-site.xml to yarn, as stated by cricket_007.
This sets YARN as the default framework for MapReduce operations.

Related

Why is the MR2 map task running under the 'yarn' user and not under the user I ran the hadoop job as?

I'm trying to run a MapReduce job on MR2, Hadoop 2.6.0-cdh5.8.0. The job uses a relative path to a directory containing many files to be compressed based on some criteria (not really relevant to this question). I'm running my job as follows:
sudo -u my_user hadoop jar my_jar.jar com.example.Main
There is a folder with files on HDFS under the path /user/my_user/. But when I run the job, I get the following exception:
java.io.FileNotFoundException: File /user/yarn/<path_from_job> does not exist.
I'm migrating this job from MR1, where it works correctly. My guess is that this happens because of YARN, since each container is started under the yarn user. In my job configuration I've tried to set mapreduce.job.user.name="my_user", but that didn't help.
I've found ${user.home} used in my job configuration, but I don't know where it is set or whether it can be changed.
The only solution I've found so far is to provide an absolute path to the folder. Is there any other way around this? I feel this is not the correct approach.
Thank you

I cannot see the running applications in hadoop 2.5.2 (yarn)

I installed Hadoop 2.5.2, and I can run the wordcount sample successfully. However, when I want to see the applications running on YARN (running jobs), I can't: the All Applications interface is always empty (shown in the following screenshot).
Is there any way to make the jobs visible?
Please try localhost:19888, or check the value of the web UI property for the job history server (mapreduce.jobhistory.webapp.address) configured in your mapred-site.xml.
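If that property isn't set anywhere, a hedged sketch of a mapred-site.xml entry for it (localhost only makes sense for a single-node setup):
<!-- web UI of the MapReduce JobHistory server -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
</property>
Note that the JobHistory server usually has to be started separately (e.g. sbin/mr-jobhistory-daemon.sh start historyserver) before this UI is reachable.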

Hadoop 2.2.0 Web UI not showing Job Progress

I have installed single-node Hadoop 2.2.0 from this link. When I run a job from the terminal, it works fine and produces output. The web UIs I used are:
- Resource Manager: http://localhost:8088
- Namenode Daemon: http://localhost:50070
But from the Resource Manager's web UI (shown above) I can't see job progress, such as submitted jobs, running jobs, etc.
My /etc/hosts file is as follows:
127.0.0.1 localhost
127.0.1.1 meitpict
My system has the IP 192.168.2.96 (I tried removing this IP, but it still didn't work).
The only host:port I mention is in core-site.xml, and that is:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:54310</value>
</property>
I also get these problems while executing MapReduce jobs.
The best option I found: while a MapReduce job is executing, the console on the box from which you submitted the job prints a link for checking the job's progress, something like this:
http://node1:19888/jobhistory/job/job_1409462238335_0002/mapreduce/app
Here node1 is mapped to 192.168.56.101 in my hosts file, and that is the NameNode box.
So while your MapReduce job is running, you can open the UI link provided by the framework.
Once it is open, keep it open; there you can also find details about other jobs, when they started, when they finished, etc.
So next time, check your console (e.g. PuTTY) output after submitting the MapReduce job; you will see a link for the current job that lets you check its status from the browser UI.
In Hadoop 2.x this problem can also be related to memory issues; see MapReduce in Hadoop 2.2.0 not working.
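If memory is indeed the issue, the knobs usually involved are the YARN container limits and the per-task MapReduce memory requests. A rough, illustrative sketch only; the values below are placeholders, not recommendations:
<!-- yarn-site.xml: total memory a NodeManager can allocate, and the largest single container -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
<!-- mapred-site.xml: memory requested per map/reduce task; must fit within the limits above -->
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>2048</value>
</property>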

Where is the Hadoop task manager UI?

I installed Hadoop 2.2 on my Ubuntu box using this tutorial:
http://codesfusion.blogspot.com/2013/11/hadoop-2x-core-hdfs-and-yarn-components.html
Everything worked fine for me and now when I do
http://localhost:50070
I can see the management UI for HDFS. Very good!!
But then I went through another tutorial, which tells me that there must be a task manager UI running at http://mymachine.com:50030 and http://mymachine.com:50060, yet on my machine I cannot open these ports.
I have already done
start-dfs.sh
start-yarn.sh
start-all.sh
Is something wrong? Why can't I see the task manager UI?
You have installed YARN (MRv2) which runs the ResourceManager. The URL http://mymachine.com:50030 is the web address for the JobTracker daemon that comes with MRv1 and hence you are not able to see it.
To see the ResourceManager UI, check your yarn-site.xml file for the following property:
yarn.resourcemanager.webapp.address
By default, it should point to resource_manager_hostname:8088.
Assuming your ResourceManager runs on mymachine, you should see the ResourceManager UI at http://mymachine.com:8088/
Make sure all your daemons are up and running before you visit the URL for the ResourceManager.
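For illustration, a yarn-site.xml entry for that property might look like the following (the hostname is a placeholder; if the property is absent, the default port 8088 applies):
<!-- address of the ResourceManager web UI -->
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>mymachine.com:8088</value>
</property>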
For Hadoop 2 (aka YARN/MRv2) - for any Hadoop installation versioned 2.x or higher, the UI is at port 8088, e.g. localhost:8088.
For Hadoop 1 - for any Hadoop installation versioned lower than 2.x (e.g. 1.x or 0.x), it is at port 50030, e.g. localhost:50030.
By default, the HDFS (NameNode) web UI is at:
http://mymachine.com:50070

Cloudera Manager: Where do I put Java ClassPath for MapReduce jobs?

I've got Hadoop-Lzo working happily on my local pseudo-cluster, but the second I try the same jar file in production, I get:
java.lang.RuntimeException: native-lzo library not available
The libraries are verified to be on the DataNodes, so my question is:
In what screen / setting do I specify the location of the native-lzo library?
For MapReduce, you need to add the entries to the MapReduce Client Environment Safety Valve. You can find it by going to the View and Edit tab under Configuration. Then add these lines there:
HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
Also add the LZO codecs to the io.compression.codecs property under the MapReduce service. To do that, go to io.compression under the View and Edit tab under Configuration and add these lines:
com.hadoop.compression.lzo.LzoCodec
com.hadoop.compression.lzo.LzopCodec
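In the generated client configuration this ends up as a comma-separated io.compression.codecs list; a hedged sketch of what the final property might look like (the codecs already present on your cluster may differ):
<!-- io.compression.codecs: existing codecs plus the two LZO codecs -->
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>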
Do not forget to restart your MR daemons after making the changes. Once restarted, redeploy your MR client configuration.
For detailed help on how to use LZO, you can visit this link:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/cmig_install_LZO_Compression.html
HTH
Try sudo apt-get install lzop on your TaskTracker nodes.
