Pig MR job failing when run by the same user who started the cluster

I am seeing this exception intermittently for some mappers and reducers in my Pig MapReduce job. Most of the time the task is retried on another node and succeeds, but sometimes all 4 attempts fail and the MapReduce job fails.
The interesting thing, however, is that the jobcache folder does have permissions 700; I don't understand why it cannot create the job directory inside it.
Error initializing attempt_201212101828_0396_m_000028_0:
java.io.IOException: Failed to set permissions of path: /apollo/env/TrafficAnalyticsHadoop/var/hadoop/mapred/local_data/taskTracker/trafanly/jobcache/job_201212101828_0396 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:682)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:671)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.createJobDirs(JobLocalizer.java:221)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:184)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1226)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1201)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1116)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2404)
at java.lang.Thread.run(Thread.java:662)
I am using Hadoop 1.0.1, if that helps. One more thing I found while searching online is https://issues.apache.org/jira/browse/MAPREDUCE-890. In my case the user who started the MapReduce cluster is indeed the one running the job, and that is when it fails; for any other user the job runs just fine.
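For reference, this is how I verified the jobcache permissions mentioned above (the path is copied from the stack trace):
# Show owner, group and mode of the per-user local dir and its jobcache subdir.
ls -ld /apollo/env/TrafficAnalyticsHadoop/var/hadoop/mapred/local_data/taskTracker/trafanly
ls -ld /apollo/env/TrafficAnalyticsHadoop/var/hadoop/mapred/local_data/taskTracker/trafanly/jobcache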
Any help would be appreciated.

Change the permissions of the directories you have used as property values in your .xml configuration files to 755.
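A minimal sketch of that fix, assuming the path from the stack trace above is the one configured as mapred.local.dir (adjust it to whatever your configuration files actually point at):
# Loosen the configured local directories so the TaskTracker can create and
# chmod the per-user and per-job subdirectories underneath them.
chmod 755 /apollo/env/TrafficAnalyticsHadoop/var/hadoop/mapred/local_data
chmod 755 /apollo/env/TrafficAnalyticsHadoop/var/hadoop/mapred/local_data/taskTracker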

Related

Hadoop java.io.IOException: Mkdirs failed to create /some/path, when running mapreduce job on mac osx

When I run my MR job on Mac OS X, I get the following exception:
Exception in thread "main" java.io.IOException: Mkdirs failed to create /var/folders/9m/w_vzzmtx0rq0tt9whf_r4yhr0000gn/T/hadoop-unjar7688811202881231043/META-INF/license
at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:128)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:104)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
According to another post, people suggested removing META-INF/LICENSE from the jar file as a workaround, but that feels like a temporary solution to me.
I think it would be resolved if I could change the path where it tries to store the tmp files, which is currently:
/var/folders/9m/.../META-INF/license
I checked the permissions and tried changing the "hadoop.tmp.dir" value in core-site.xml, but that didn't work for me.
PS: I know the issue is caused by the case-insensitive file system on OS X; that is why I am working with a directory on a disk image mounted as case sensitive.
Thanks in advance!
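One thing that might be worth trying (an assumption on my part, based on the fact that the unjar directory in the trace above is the JVM's java.io.tmpdir under /var/folders/.../T) is pointing the client JVM at a temp directory on the case-sensitive volume:
# HADOOP_CLIENT_OPTS is picked up by client commands such as 'hadoop jar',
# so RunJar would unpack the jar on the case-sensitive disk image instead of
# the default /var/folders/... location.
mkdir -p /Volumes/CaseSensitive/tmp   # /Volumes/CaseSensitive is a placeholder mount point
export HADOOP_CLIENT_OPTS="-Djava.io.tmpdir=/Volumes/CaseSensitive/tmp $HADOOP_CLIENT_OPTS"
hadoop jar my-job.jar                 # my-job.jar stands in for the real job jar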

Why is the MR2 map task running under the 'yarn' user and not under the user I ran the hadoop job as?

I'm trying to run a MapReduce job on MR2, Hadoop 2.6.0-cdh5.8.0. The job uses a relative path to a directory that contains a lot of files to be compressed based on some criteria (not really relevant to this question). I'm running my job as follows:
sudo -u my_user hadoop jar my_jar.jar com.example.Main
There is a folder with the files on HDFS under /user/my_user/, but when I run my job I get the following exception:
java.io.FileNotFoundException: File /user/yarn/<path_from_job> does not exist.
I'm migrating this job from MR1, where it works correctly. My guess is that this is happening because of YARN, since each container is started under the yarn user. In my job configuration I've tried to set mapreduce.job.user.name="my_user", but this didn't help.
I've found ${user.home} being used in my job configuration, but I don't know where it is set or whether it is possible to change it.
The only solution I've found so far is to provide an absolute path to the folder. Is there any other way around this? I feel like this is not the correct approach.
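To illustrate what I mean about the relative path (some_dir below is just a placeholder, not the real directory from my job):
# Relative HDFS paths are resolved against the home directory of the user the
# process effectively runs as, so the same path points at different locations:
sudo -u my_user hadoop fs -ls some_dir   # resolves to /user/my_user/some_dir
sudo -u yarn hadoop fs -ls some_dir      # resolves to /user/yarn/some_dir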
Thank you

Pig, Oozie, and HBase - java.io.IOException: No FileSystem for scheme: hbase

My Pig script works fine on its own, but when I put it in an Oozie workflow I receive the following error:
ERROR 2043: Unexpected error during execution.
org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
...
Caused by: java.io.IOException: No FileSystem for scheme: hbase
I registered the HBase and Zookeeper jars successfully, but received the same error.
I also attempted to set the ZooKeeper quorum by adding variations of this line to the Pig script:
SET hbase.zookeeper.quorum 'vm-myhost-001,vm-myhost-002,vm-myhost-003'
Some searching on the internet instructed me to add this to the beginning of my workflow.xml:
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false
This solved the problem. I was even able to remove the registration of the HBase and Zookeeper jars and the Zookeeper quorum.
Now, after double checking, I noticed that my jobs actually do their work: they store the results in HBase as expected. But Oozie claims that a failure occurred when it didn't.
I don't think that setting mapreduce.fileoutputcommitter.marksuccessfuljobs to false constitutes a real solution.
Are there any other solutions?
It seems that there is currently no real solution for this.
However, this answer to a different question seems to indicate that the best workaround is to create the success flag 'manually'.
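A minimal sketch of that workaround, assuming the flag in question is the usual _SUCCESS marker checked for in the job's output directory (the path below is a placeholder):
# Create the success marker by hand once you have verified the output yourself,
# e.g. as a final fs action in the Oozie workflow.
hadoop fs -touchz /path/to/pig/output/_SUCCESS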

Hadoop 1.0.4 - file permission issue in running map reduce jobs

I am new to Hadoop and need to set up a sandbox environment on Windows to showcase to a client. I have followed the steps below:
Install Cygwin on all machines
Set up SSH
Install Hadoop 1.0.4
Configure Hadoop
Apply the patch for the HADOOP-7682 bug
After a lot of trial and error I was able to run all the components (namenode, datanode, tasktracker and jobtracker). But now I am facing a problem while running map-reduce jobs: I get a permission error on the tmp directory. When I run the word count example using the following command:
bin/hadoop jar hadoop*examples*.jar wordcount wcountjob wcountjob/gutenberg-output
13/03/28 23:43:29 INFO mapred.JobClient: Task Id : attempt_201303282342_0001_m_000003_2, Status : FAILED
Error initializing attempt_201303282342_0001_m_000003_2:
java.io.IOException: Failed to set permissions of path: c:\cygwin\usr\local\tmp\taskTracker\uswu50754 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.createLocalDirs(JobLocalizer.java:144)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:182)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
I have tried setting the permissions manually, but that doesn't work either. My understanding is that this is due to the Java libraries being used, which try to reset the permissions and fail. The permission patch that solved the tasktracker problem doesn't seem to solve this one.
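For completeness, this is roughly what I tried by hand (assuming the Cygwin root is c:\cygwin, so the path from the error maps to /usr/local/tmp inside Cygwin):
# Open up the TaskTracker's local tmp tree from a Cygwin shell.
chmod -R 755 /usr/local/tmp/taskTracker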
Has anybody found a solution for this?
Can anybody point me to a download location for Hadoop 0.20.2, which seems to be immune to this problem?

Hadoop local job directory gets deleted before the job is completed on task nodes

We are having a strange issue in our Hadoop cluster. We have noticed that some of our jobs fail with a file-not-found exception (see below). Basically the files in the "attempt_*" directory, and the directory itself, are getting deleted while the task is still running on the host. Looking through some of the Hadoop documentation, I see that the job directory gets wiped out when it gets a KillJobAction; however, I am not sure why it gets wiped out while the job is still running.
My question is: what could be deleting it while the job is running? Any thoughts or pointers on how to debug this would be helpful.
Thanks!
java.io.FileNotFoundException: <dir>/hadoop/mapred/local_data/taskTracker/<user>/jobcache/job_201211030344_15383/attempt_201211030344_15383_m_000169_0/output/spill29.out (Permission denied)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:71)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:107)
at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:177)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:400)
at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1692)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1322)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:698)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:253)
I had a similar error and found that this permission error was caused by the Hadoop program not being able to create or access the files.
Are the files being created on HDFS or on a local file system?
If they are on a local file system, try setting permissions on that folder; if they are on HDFS, try setting permissions on that folder.
If you are running it on Ubuntu, then try running:
chmod -R a=rwx <dir>/hadoop/mapred/local_data/taskTracker/<user>/jobcache/job_201211030344_15383/attempt_201211030344_15383_m_000169_0/output/
