Missing block in Hadoop

I am trying to run a wordcount job in Hadoop. Due to a previous error, I had to turn off the safe mode for the NameNode. Now, however, when trying to run the job, I am getting the following error:
14/08/06 14:49:08 INFO mapreduce.Job: map 1% reduce 0%
14/08/06 14:49:25 INFO mapreduce.Job: Task Id : attempt_1407336345567_0002_m_000158_0, Status : FAILED
Error: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-971868671-192.168.50.2-1406571670535:blk_1073743276_2475 file=/wikidumps/enwiki-20130904-pages-meta-history3.xml-p000032706p000037161
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
The log files are not showing any particular error. Does anyone know the reason for this error? Thanks in advance!
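A sketch of the usual first diagnostics for a BlockMissingException (not a confirmed fix): ask HDFS whether the block is really missing on every replica, and whether all DataNodes are up. The file path below is taken from the error message above.
# check the specific file named in the error
hdfs fsck /wikidumps/enwiki-20130904-pages-meta-history3.xml-p000032706p000037161 -files -blocks -locations
# list every file with missing or corrupt blocks cluster-wide
hdfs fsck / -list-corruptfileblocks
# confirm all DataNodes are registered and reporting
hdfs dfsadmin -report
If fsck reports the block missing on all replicas, the file itself is damaged and re-uploading it is likely the only option.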

Related

job.jar does not exist while running map reduce job

I have a Hortonworks distribution (2.2.6.0-2800) of Hadoop that runs MapReduce jobs on YARN, and I have a simple MapReduce job that reads compressed data files from HDFS, does some processing on them, and then bulk-loads the result into HBase.
Here is the program that does it:
final Configuration hadoopConfiguration = new Configuration();
hadoopConfiguration.set("yarn.resourcemanager.address", "XXXXXX");
hadoopConfiguration.set("yarn.resourcemanager.scheduler.address", "XXXXXX");
hadoopConfiguration.set("mapreduce.framework.name", "yarn");
hadoopConfiguration.set("mapreduce.jobtracker.staging.root.dir", "XXXXXXXX");
final Job job = Job.getInstance(hadoopConfiguration, "migration");
job.setJarByClass(BlitzService.class);
job.setMapperClass(DataMigrationMapper.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);
job.setReducerClass(DataMigrationReducer.class);
job.setCombinerClass(DataMigrationReducer.class);
HFileOutputFormat2.configureIncrementalLoad(job, hTable);
FileInputFormat.setInputPaths(job, filesToProcess.toArray(new Path[filesToProcess.size()]));
HFileOutputFormat2.setOutputPath(job, new Path(SOME PATH));
job.waitForCompletion(true);
This should be a very simple job to run, but I am facing this exception while running it:
INFO [2015-07-23 23:53:20,222] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /172.30.0.147:8032
WARN [2015-07-23 23:53:20,383] org.apache.hadoop.mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO [2015-07-23 23:53:20,492] org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to process : 16
INFO [2015-07-23 23:53:20,561] org.apache.hadoop.mapreduce.JobSubmitter: number of splits:16
INFO [2015-07-23 23:53:20,719] org.apache.hadoop.mapreduce.JobSubmitter: Submitting tokens for job: job_1437695344326_0002
INFO [2015-07-23 23:53:20,842] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1437695344326_0002
INFO [2015-07-23 23:53:20,867] org.apache.hadoop.mapreduce.Job: The url to track the job: http://ip-172-30-0-147.us-west-2.compute.internal:8088/proxy/application_1437695344326_0002/
INFO [2015-07-23 23:53:20,868] org.apache.hadoop.mapreduce.Job: Running job: job_1437695344326_0002
INFO [2015-07-23 23:53:35,994] org.apache.hadoop.mapreduce.Job: Job job_1437695344326_0002 running in uber mode : false
INFO [2015-07-23 23:53:35,995] org.apache.hadoop.mapreduce.Job: map 0% reduce 0%
INFO [2015-07-23 23:53:43,053] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1437695344326_0002_m_000001_1000, Status : FAILED
File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
INFO [2015-07-23 23:53:44,075] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1437695344326_0002_m_000002_1000, Status : FAILED
File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
This could be similar to this issue. Once you have verified that the jar exists in the directory above (/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/), check whether the same permissions are set as mentioned in the link; otherwise, you can add a similar authorization property.
I've met the same problem. It has nothing to do with the jar directory. Make sure your input paths are right; check the file paths in filesToProcess:
FileInputFormat.setInputPaths(job, filesToProcess.toArray(new Path[filesToProcess.size()]));
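A minimal sketch of that check, done client-side before calling setInputPaths (assuming filesToProcess is the same list of Path objects as in the question):
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Fail fast on the client if any input path is wrong, instead of
// letting tasks fail remotely after the job has been submitted.
final FileSystem fs = FileSystem.get(hadoopConfiguration);
for (final Path p : filesToProcess) {
    if (!fs.exists(p)) {
        throw new IllegalArgumentException("Input path does not exist: " + p);
    }
}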

Issue in Map Reduce Program

I am executing a Hadoop MapReduce job for a simple word count problem using PuTTY.
I have configured Hadoop on a VM and verified that all Hadoop components are running using jps.
When I execute the code using the command
hadoop jar Untitled.jar
I am getting this error:
15/06/20 19:36:48 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
15/06/20 19:37:09 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/06/20 19:37:09 WARN snappy.LoadSnappy: Snappy native library not loaded
15/06/20 19:37:09 INFO mapred.FileInputFormat: Total input paths to process : 0
15/06/20 19:37:10 INFO mapred.JobClient: Running job: job_201506201820_0004
15/06/20 19:37:11 INFO mapred.JobClient: map 0% reduce 0%
15/06/20 19:37:12 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000001_0, Status : FAILED
Error initializing attempt_201506201820_0004_m_000001_0:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:701)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:656)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1311)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:745)
15/06/20 19:37:13 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_0&filter=stdout
15/06/20 19:37:13 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_0&filter=stderr
15/06/20 19:37:14 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000001_1, Status : FAILED
Error initializing attempt_201506201820_0004_m_000001_1:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:701)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:656)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1311)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:745)
15/06/20 19:37:15 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_1&filter=stdout
15/06/20 19:37:15 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_1&filter=stderr
15/06/20 19:37:16 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000001_2, Status : FAILED
15/06/20 19:37:16 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_2&filter=stdout
15/06/20 19:37:16 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000001_2&filter=stderr
15/06/20 19:37:17 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000000_0, Status : FAILED
Error initializing attempt_201506201820_0004_m_000000_0:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:701)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:656)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1311)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:745)
15/06/20 19:37:17 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_0&filter=stdout
15/06/20 19:37:17 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_0&filter=stderr
15/06/20 19:37:18 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000000_1, Status : FAILED
Error initializing attempt_201506201820_0004_m_000000_1:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:701)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:656)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1311)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:745)
15/06/20 19:37:19 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_1&filter=stdout
15/06/20 19:37:19 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_1&filter=stderr
15/06/20 19:37:20 INFO mapred.JobClient: Task Id : attempt_201506201820_0004_m_000000_2, Status : FAILED
Error initializing attempt_201506201820_0004_m_000000_2:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:701)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:656)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1311)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:745)
15/06/20 19:37:20 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_2&filter=stdout
15/06/20 19:37:20 WARN mapred.JobClient: Error reading task outputhttp://ankit-VirtualBox:50060/tasklog?plaintext=true&attemptid=attempt_201506201820_0004_m_000000_2&filter=stderr
15/06/20 19:37:21 INFO mapred.JobClient: Job complete: job_201506201820_0004
15/06/20 19:37:21 INFO mapred.JobClient: Counters: 4
15/06/20 19:37:21 INFO mapred.JobClient: Job Counters
15/06/20 19:37:21 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
15/06/20 19:37:21 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
15/06/20 19:37:21 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=0
15/06/20 19:37:21 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/06/20 19:37:21 INFO mapred.JobClient: Job Failed: JobCleanup Task Failure, Task: task_201506201820_0004_m_000000
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at WordCount.main(WordCount.java:32)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
What am I missing?
You don't seem to be providing any input path. Look at the log:
15/06/20 19:37:09 INFO mapred.FileInputFormat: Total input paths to process : 0
From the logs, you can see the issue occurred on the very first map execution, since both map and reduce sit at 0%. Next comes the actual error: "No such file or directory". The only files a MapReduce job deals with directly are its input and output. Given that we are at the start of the map phase, I think the issue is with the input path and its permissions; please look into them. Also, the output directory must not exist before running the job; the job creates it itself. Happy coding.
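Concretely, a sketch of a correct invocation, assuming the WordCount main class from the stack trace reads its input and output paths from the command line (the /user/ankit/... paths are placeholders):
# stage some input so there is at least one path to process
hadoop fs -mkdir /user/ankit/input
hadoop fs -put sample.txt /user/ankit/input
# the output directory must not already exist; the job creates it
hadoop fs -rmr /user/ankit/output
# run the jar with the main class plus input and output paths
hadoop jar Untitled.jar WordCount /user/ankit/input /user/ankit/output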

reducer always fails and map succeeds

I am running a simple wordcount job on a 1 GB text file. My cluster has 8 DataNodes and 1 NameNode, each with a storage capacity of 3 GB.
When I run wordcount, I can see that map always succeeds, but the reducer throws an error and fails. Please find the error message below.
14/10/05 15:42:02 INFO mapred.JobClient: map 100% reduce 31%
14/10/05 15:42:07 INFO mapred.JobClient: Task Id : attempt_201410051534_0002_m_000016_0, Status : FAILED
FSError: java.io.IOException: No space left on device
14/10/05 15:42:14 INFO mapred.JobClient: Task Id : attempt_201410051534_0002_r_000000_0, Status : FAILED
java.io.IOException: Task: attempt_201410051534_0002_r_000000_0 - The reduce copier failed
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/app/hadoop/tmp/mapred/local/taskTracker/hduser/jobcache/job_201410051534_0002/attempt_201410051534_0002_r_000000_0/output/map_18.out
Could you please tell me how I can fix this problem?
Thanks,
Navaz
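The trace itself points at local disk rather than HDFS: the reduce copier spills map output under mapred.local.dir (here file:/app/hadoop/tmp/mapred/local/...), and "No space left on device" means that local volume filled up. A sketch of the first things to check, with the path taken from the trace:
# how full is the volume holding the task-local directory?
df -h /app/hadoop/tmp
# how much of it is intermediate map output?
du -sh /app/hadoop/tmp/mapred/local
If that volume is small, pointing mapred.local.dir at a larger disk (or cleaning it between runs) is the usual remedy.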

mapreduce working on single-node cluster but not on multi-node cluster

I am running a MapReduce program which works fine on my CDH QuickStart VM, but when I try it on a multi-node cluster, it gives the error below:
WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/02/12 00:23:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/02/12 00:23:06 INFO input.FileInputFormat: Total input paths to process : 1
14/02/12 00:23:07 INFO mapred.JobClient: Running job: job_201401221117_5777
14/02/12 00:23:08 INFO mapred.JobClient: map 0% reduce 0%
14/02/12 00:23:16 INFO mapred.JobClient: Task Id : attempt_201401221117_5777_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class Mappercsv not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:191)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: Class Mappercsv not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1680)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1772)
... 8 more
Please help.
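Note the warning near the top of the log: "No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String)." On the single-node VM the classes happen to be on the local classpath, but on a real cluster the job jar has to be shipped to the task nodes. A minimal sketch of the fix that warning suggests (the job name is a placeholder; Mappercsv is the class from the trace):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

final Configuration conf = new Configuration();
final Job job = Job.getInstance(conf, "csv-job");
// Resolve and ship the jar containing the user classes; without this,
// remote tasks cannot load Mappercsv and fail with ClassNotFoundException.
job.setJarByClass(Mappercsv.class);
job.setMapperClass(Mappercsv.class);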

Hadoop error stalling job reduce process

I have been running a Hadoop job (word count example) a few times on my two-node cluster setup, and it's been working fine up until now. I keep getting a RuntimeException which stalls the reduce process at 19%:
2013-04-13 18:45:22,191 INFO org.apache.hadoop.mapred.Task: Task:attempt_201304131843_0001_m_000000_0 is done. And is in the process of commiting
2013-04-13 18:45:22,299 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201304131843_0001_m_000000_0' done.
2013-04-13 18:45:22,318 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-13 18:45:23,181 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Error while running command to get file permissions : org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:710)
at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:267)
at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:267)
at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Has anyone any ideas of what might be causing this?
Edit: Solved it myself.
If anyone else runs into the same problem: this was caused by the /etc/hosts file on the master node. I hadn't entered the hostname and address of the slave node.
This is how my hosts file is structured on the master node:
127.0.0.1 MyUbuntuServer
192.xxx.x.xx2 master
192.xxx.x.xx3 MySecondUbuntuServer
192.xxx.x.xx3 slave
A similar problem is described here:
http://comments.gmane.org/gmane.comp.apache.mahout.user/8898
The info there may relate to a different version of Hadoop. It says:
java.lang.RuntimeException: Error while running command to
get file permissions : java.io.IOException: Cannot run program
"/bin/ls": error=12, Not enough space
The solution there was to raise the heap size through mapred.child.java.opts (-Xmx1200M).
See also: https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/BHGYJDNKMGE
HTH,
Avner
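For reference, a sketch of applying that heap-size change per job from the driver (property name and value as quoted above; cluster-wide it would go in mapred-site.xml instead):
import org.apache.hadoop.conf.Configuration;

final Configuration conf = new Configuration();
// Raise the heap available to child task JVMs (maps and reduces),
// per the suggestion in the linked thread.
conf.set("mapred.child.java.opts", "-Xmx1200m");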
