Not able to prevent local job runner from running - Hadoop

I am trying to populate an HBase table from a Java program using HTable and LoadIncrementalHFiles on Hadoop 1.
I have a fully distributed 3-node cluster with 1 master and 2 slaves.
The NameNode and JobTracker run on the master; DataNodes and TaskTrackers run on all 3 nodes.
ZooKeeper runs on all 3 nodes.
The HMaster runs on the master node, with RegionServers on all 3 nodes.
My core-site.xml contains:
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/TMPDIR/</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310/</value>
</property>
My mapred-site.xml contains:
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
</property>
But when I run the program, it gives me the error below:
15/08/06 00:11:14 INFO mapred.TaskRunner: Creating symlink: /usr/local/hadoop/TMPDIR/mapred/local/archive/328189779182527451_-1963144838_2133510842/192.168.72.1/user/hduser/partitions_736cc0de-3c15-4a3d-8ae3-e4d239d73f93 <- /usr/local/hadoop/TMPDIR/mapred/local/localRunner/_partition.lst
15/08/06 00:11:14 WARN fs.FileUtil: Command 'ln -s /usr/local/hadoop/TMPDIR/mapred/local/archive/328189779182527451_-1963144838_2133510842/192.168.72.1/user/hduser/partitions_736cc0de-3c15-4a3d-8ae3-e4d239d73f93 /usr/local/hadoop/TMPDIR/mapred/local/localRunner/_partition.lst' failed 1 with: ln: failed to create symbolic link `/usr/local/hadoop/TMPDIR/mapred/local/localRunner/_partition.lst': No such file or directory
15/08/06 00:11:14 WARN mapred.TaskRunner: Failed to create symlink: /usr/local/hadoop/TMPDIR/mapred/local/archive/328189779182527451_-1963144838_2133510842/192.168.72.1/user/hduser/partitions_736cc0de-3c15-4a3d-8ae3-e4d239d73f93 <- /usr/local/hadoop/TMPDIR/mapred/local/localRunner/_partition.lst
15/08/06 00:11:14 INFO mapred.JobClient: Running job: job_local_0001
15/08/06 00:11:15 INFO util.ProcessTree: setsid exited with exit code 0
15/08/06 00:11:15 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@35506f5f
15/08/06 00:11:15 INFO mapred.MapTask: io.sort.mb = 100
15/08/06 00:11:15 INFO mapred.JobClient: map 0% reduce 0%
15/08/06 00:11:17 INFO mapred.MapTask: data buffer = 79691776/99614720
15/08/06 00:11:17 INFO mapred.MapTask: record buffer = 262144/327680
15/08/06 00:11:17 WARN mapred.LocalJobRunner: job_local_0001
java.lang.IllegalArgumentException: Can't read partitions file
at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:796)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:301)
at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:88)
... 6 more
A few lines from my code:
Path input = new Path(args[0]);
input = input.makeQualified(input.getFileSystem(conf));
Path partitionFile = new Path(input, "_partitions.lst");
TotalOrderPartitioner.setPartitionFile(conf, partitionFile);
InputSampler.Sampler<IntWritable, Text> sampler = new InputSampler.RandomSampler<IntWritable, Text>(0.1, 100);
InputSampler.writePartitionFile(job, sampler);
job.setNumReduceTasks(2);
job.setPartitionerClass(TotalOrderPartitioner.class);
job.setJarByClass(TextToHBaseTransfer.class);
Why is it still running the local job runner and giving me "Can't read partitions file"?
What am I missing in the cluster configuration?
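A hedged note that may frame the question (this is an editorial sketch, not part of the original post): in Hadoop 1, JobClient falls back to the LocalJobRunner whenever the Configuration it submits with resolves mapred.job.tracker to its default value, local, which typically happens when mapred-site.xml is not on the driver's classpath. A minimal way to check and, if needed, force the cluster settings from the driver; the file paths below are assumptions for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class ClusterConfCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // If the cluster XML files are not on the classpath, load them explicitly
        // (these paths are assumptions; adjust to your installation).
        conf.addResource(new Path("/usr/local/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/usr/local/hadoop/conf/mapred-site.xml"));

        // "local" here means the job will run in the LocalJobRunner.
        System.out.println("fs.default.name    = " + conf.get("fs.default.name"));
        System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker", "local"));

        // Last resort: set the values programmatically before building the Job.
        conf.set("fs.default.name", "hdfs://master:54310/");
        conf.set("mapred.job.tracker", "master:54311");
    }
}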

Related

Not a usable net address error when integrating ML 9 with Connector-for-Hadoop2-2.2.3?

Following this ML documentation, I am running the sample marklogic-hello-world.xml using the configuration given in the documentation. My localhost name is ubuntu.localdomain. When I give the same name in my configuration file, it throws an error like this:
18/01/04 22:39:54 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/01/04 22:39:54 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/01/04 22:39:54 INFO mapred.MapTask: soft limit at 83886080
18/01/04 22:39:54 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/01/04 22:39:54 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/01/04 22:39:54 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/01/04 22:40:05 INFO mapred.MapTask: Starting flush of map output
18/01/04 22:40:05 INFO mapred.LocalJobRunner: map task executor complete.
18/01/04 22:40:05 WARN mapred.LocalJobRunner: job_local196795803_0001
java.lang.Exception: java.lang.IllegalArgumentException: Default provider - Not a usable net address: ubuntu.localdomain:8000
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IllegalArgumentException: Default provider - Not a usable net address: ubuntu.localdomain:8000
at com.marklogic.xcc.ContentSourceFactory.defaultConnectionProvider(ContentSourceFactory.java:453)
at com.marklogic.xcc.ContentSourceFactory.newContentSource(ContentSourceFactory.java:264)
at com.marklogic.xcc.ContentSourceFactory.newContentSource(ContentSourceFactory.java:321)
at com.marklogic.mapreduce.utilities.InternalUtilities.getInputContentSource(InternalUtilities.java:127)
at com.marklogic.mapreduce.MarkLogicRecordReader.init(MarkLogicRecordReader.java:348)
at com.marklogic.mapreduce.MarkLogicRecordReader.initialize(MarkLogicRecordReader.java:247)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:548)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:786)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
18/01/04 22:40:05 INFO mapreduce.Job: Job job_local196795803_0001 failed with state FAILED due to: NA
18/01/04 22:40:05 INFO mapreduce.Job: Counters: 0
My configuration file looks like this:
<configuration>
<property>
<name>mapreduce.marklogic.input.username</name>
<value>admin</value>
</property>
<property>
<name>mapreduce.marklogic.input.password</name>
<value>admin</value>
</property>
<property>
<name>mapreduce.marklogic.input.host</name>
<value>ubuntu.localdomain</value>
</property>
<property>
<name>mapreduce.marklogic.input.port</name>
<value>8000</value>
</property>
<property>
<name>mapreduce.marklogic.input.mode</name>
<value>basic</value>
</property>
<property>
<name>mapreduce.marklogic.input.valueclass</name>
<value>com.marklogic.mapreduce.DatabaseDocument</value>
</property>
<property>
<name>mapreduce.marklogic.output.username</name>
<value>admin</value>
</property>
<property>
<name>mapreduce.marklogic.output.password</name>
<value>admin</value>
</property>
<property>
<name>mapreduce.marklogic.output.host</name>
<value>ubuntu.localdomain</value>
</property>
<property>
<name>mapreduce.marklogic.output.port</name>
<value>8000</value>
</property>
<property>
<name>mapreduce.marklogic.output.content.type</name>
<value>TEXT</value>
</property>
</configuration>
I have tried giving various values for mapreduce.marklogic.input.host; I tried 127.0.0.1 and localhost, but by default it keeps using ubuntu.localdomain.
I don't know why it takes the default one rather than the value I specified in the configuration XML file (i.e. 127.0.0.1, etc.).
I used the command below to run this:
hadoop jar \
$CONNECTOR_HOME/lib/marklogic-mapreduce-examples-version.jar \
com.marklogic.mapreduce.examples.HelloWorld -libjars $LIBJARS \
-conf marklogic-hello-world.xml
as specified in the documentation.
How can I overcome this? Any help is appreciated.
Thanks
Resolved the issue by changing the localhost name on the MarkLogic configuration page from ubuntu.localdomain to localhost; after that, the above configuration worked well. But I still can't figure out why it does not pick up the hostname from the configuration files instead of going to ML.
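As a hedged side note added here (not from the original poster or answerers): when the connector reports "Not a usable net address", a quick sanity check is whether the configured host/port resolves and accepts TCP connections at all. A minimal sketch using plain java.net, with the host and port taken from the configuration above:

import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NetAddressCheck {
    public static void main(String[] args) throws IOException {
        String host = "ubuntu.localdomain"; // mapreduce.marklogic.input.host
        int port = 8000;                    // mapreduce.marklogic.input.port

        // Does the name resolve at all?
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());

        // Can we open a TCP connection to the app server?
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(addr, port), 5000);
            System.out.println("Connected to " + host + ":" + port);
        }
    }
}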

Yarn can't connect to Hadoop HDFS?

I am running one of the examples (pi) that comes with Hadoop. The program doesn't respond; it looks like it gets no response back, possibly due to a problem connecting to HDFS?
yarn jar hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 100
16/07/27 06:32:38 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/07/27 06:32:38 INFO input.FileInputFormat: Total input paths to process : 10
16/07/27 06:32:38 INFO mapreduce.JobSubmitter: number of splits:10
16/07/27 06:32:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1469626018898_0001
16/07/27 06:32:39 INFO impl.YarnClientImpl: Submitted application application_1469626018898_0001
16/07/27 06:32:39 INFO mapreduce.Job: The url to track the job: http://IP_ADDRESS/proxy/application_14696260188001/
16/07/27 06:32:39 INFO mapreduce.Job: Running job: job_1469626018898_0001
I ran telnet IP_ADDRESS 9000 and the connection was successful.
I have already set up hdfs-site.xml with the following (to listen on both private and public addresses):
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
And core-site.xml is set up with:
<property>
<name>fs.defaultFS</name>
<value>hdfs://IP_ADDRESS:9000</value>
</property>
Any ideas why the YARN job does not seem to reach the HDFS service and therefore does not complete?
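A hedged suggestion added here (not from the original post): it can help to separate HDFS reachability from YARN scheduling. The sketch below talks to the NameNode directly through the HDFS client API, using the same URI as core-site.xml above; IP_ADDRESS is left as a placeholder:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReachabilityCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Same endpoint that fs.defaultFS points at (IP_ADDRESS is a placeholder).
        FileSystem fs = FileSystem.get(new URI("hdfs://IP_ADDRESS:9000"), conf);

        // If this returns, the NameNode RPC endpoint is reachable from this machine.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
    }
}

If this check passes but the job still hangs at "Running job", the problem is more likely on the YARN side (for example, no NodeManagers registered or no free containers) than with HDFS.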

hadoop/hdfs/name is in an inconsistent state: storage directory(hadoop/hdfs/data/) does not exist or is not accessible

I have tried all the different solutions provided on Stack Overflow for this topic, but to no avail.
I am asking again with the specific log and details.
Any help is appreciated.
I have one master node and 5 slave nodes in my Hadoop cluster. The ubuntu user and ubuntu group own the ~/hadoop folder.
Both the ~/hadoop/hdfs/data and ~/hadoop/hdfs/name folders exist,
and permissions for both folders are set to 755.
I successfully formatted the NameNode before running the start-all.sh script.
The script fails to launch the NameNode.
These are running on the master node:
ubuntu#master:~/hadoop/bin$ jps
7067 TaskTracker
6914 JobTracker
7237 Jps
6834 SecondaryNameNode
6682 DataNode
ubuntu#slave5:~/hadoop/bin$ jps
31438 TaskTracker
31581 Jps
31307 DataNode
Below is the log from the NameNode log file.
..........
..........
.........
2014-12-03 12:25:45,460 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2014-12-03 12:25:45,461 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2014-12-03 12:25:45,532 INFO org.apache.hadoop.hdfs.util.GSet: Computing capacity for map BlocksMap
2014-12-03 12:25:45,532 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2014-12-03 12:25:45,532 INFO org.apache.hadoop.hdfs.util.GSet: 2.0% max memory = 1013645312
2014-12-03 12:25:45,532 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2014-12-03 12:25:45,532 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2014-12-03 12:25:45,588 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=ubuntu
2014-12-03 12:25:45,588 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2014-12-03 12:25:45,588 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2014-12-03 12:25:45,622 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2014-12-03 12:25:45,623 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2014-12-03 12:25:45,716 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2014-12-03 12:25:45,777 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
2014-12-03 12:25:45,777 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2014-12-03 12:25:45,785 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/ubuntu/hadoop/file:/home/ubuntu/hadoop/hdfs/name does not exist
2014-12-03 12:25:45,787 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/ubuntu/hadoop/file:/home/ubuntu/hadoop/hdfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:304)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2014-12-03 12:25:45,801 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/ubuntu/hadoop/file:/home/ubuntu/hadoop/hdfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:304)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
Removed the "file:" prefix from the hdfs-site.xml file.
[WRONG HDFS-SITE.XML]
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/mydata/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/mydata/hdfs/datanode</value>
</property>
[CORRECT HDFS-SITE.XML]
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hduser/mydata/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hduser/mydata/hdfs/datanode</value>
</property>
Thanks to Erik for the help.
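A hedged illustration (added editorially, not part of the original answer) of why the file: prefix produced the odd path seen in the NameNode log: if the storage directory value ends up being handled as a plain java.io.File path, the scheme is not understood and file:/... is resolved as a relative path under the working directory. The exact code path inside Hadoop 1 is an assumption here, and depending on the version file: URIs may be accepted, which is presumably why a later answer below keeps the prefix. The File behavior itself is standard Java:

import java.io.File;

public class FilePrefixDemo {
    public static void main(String[] args) {
        // java.io.File has no notion of URI schemes, so "file:" is just the
        // first component of a relative path, resolved against the current
        // working directory (e.g. /home/ubuntu/hadoop).
        File dir = new File("file:/home/hduser/mydata/hdfs/namenode");
        System.out.println(dir.getAbsolutePath());
        // Prints something like:
        //   /home/ubuntu/hadoop/file:/home/hduser/mydata/hdfs/namenode
        // which matches the shape of the "inconsistent state" path in the log above.
    }
}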
Follow the steps below:
1. Stop all services.
2. Format your NameNode.
3. Delete your DataNode directory.
4. Start all services.
Run these commands in a terminal:
$ cd ~
$ mkdir -p mydata/hdfs/namenode
$ mkdir -p mydata/hdfs/datanode
Give both directories 755 permissions.
Then add these properties in conf/hdfs-site.xml:
<property>
 <name>dfs.namenode.name.dir</name>
 <value>file:/home/hduser/mydata/hdfs/namenode</value>
</property>
<property>
 <name>dfs.datanode.data.dir</name>
 <value>file:/home/hduser/mydata/hdfs/datanode</value>
</property>
If that does not work, restart the daemons:
stop-all.sh
start-all.sh
1) You should be the owner of the NameNode directory, and give it chmod 750 as appropriate.
2) Stop all services.
3) Use hadoop namenode -format to format the NameNode.
4) Add this to hdfs-site.xml:
<property>
<name>dfs.data.dir</name>
<value>path/to/hadooptmpfolder/dfs/name/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>path/to/hadooptmpfolder/dfs/name</value>
<final>true</final>
</property>
5) To run hadoop namenode -format, add export PATH=$PATH:/usr/local/hadoop/bin/ to ~/.bashrc.
Wherever Hadoop is unpacked, add that directory to the PATH.
I had a similar problem; I formatted the NameNode and then started it:
hadoop namenode -format
hadoop-daemon.sh start namenode
You can follow the steps below to remove this error:
Stop all Hadoop daemons.
Delete all files from the directories below:
/tmp/hadoop-{user}/dfs/name/current and /tmp/hadoop-{user}/dfs/data/current
where {user} is the user you are logged into the box as.
Format the NameNode.
Start all services.
You will now see a new VERSION file created in the directory /tmp/hadoop-{user}/dfs/name/current.
One thing to notice here is that the value of Cluster ID in the file /tmp/hadoop-eip/dfs/name/current/VERSION must be the same as in /tmp/hadoop-eip/dfs/data/current/VERSION.
-Hitesh

Debugging a Tutorial Hadoop Pipes-Project

I am working through this tutorial
and got to the very last part (with some small changes).
Now I am stuck with an error message I can't make sense of.
damian#damian-ThinkPad-T61:~/hadoop-1.1.2$ bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input dft1 -output dft1-out -program bin/word_count
13/06/09 20:17:01 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/06/09 20:17:01 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/06/09 20:17:01 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/09 20:17:01 INFO mapred.FileInputFormat: Total input paths to process : 1
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Creating word_count in /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin-work-1867423021697266227 with rwxr-xr-x
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Cached bin/word_count as /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin/word_count
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Cached bin/word_count as /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin/word_count
13/06/09 20:17:02 INFO mapred.JobClient: Running job: job_local_0001
13/06/09 20:17:02 INFO util.ProcessTree: setsid exited with exit code 0
13/06/09 20:17:02 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4200d3
13/06/09 20:17:02 INFO mapred.MapTask: numReduceTasks: 1
13/06/09 20:17:02 INFO mapred.MapTask: io.sort.mb = 100
13/06/09 20:17:02 INFO mapred.MapTask: data buffer = 79691776/99614720
13/06/09 20:17:02 INFO mapred.MapTask: record buffer = 262144/327680
13/06/09 20:17:02 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NullPointerException
at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:103)
at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
13/06/09 20:17:03 INFO mapred.JobClient: map 0% reduce 0%
13/06/09 20:17:03 INFO mapred.JobClient: Job complete: job_local_0001
13/06/09 20:17:03 INFO mapred.JobClient: Counters: 0
13/06/09 20:17:03 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1327)
at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)
Does anyone see where the error hides? What is a straightforward way to debug Hadoop Pipes programs?
Thanks!
The exception:
at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:103)
is caused by the following lines in the source:
//Add token to the environment if security is enabled
Token<JobTokenIdentifier> jobToken = TokenCache.getJobToken(conf
.getCredentials());
// This password is used as shared secret key between this application and
// child pipes process
byte[] password = jobToken.getPassword();
The actual NPE is thrown in the final line because jobToken is null.
Since you're using local mode (local job tracker and local file system), I'm not sure that security should be 'enabled'. Do you have either of the following properties configured in your core-site.xml or hdfs-site.xml configuration files (and if so, what are their values)?
hadoop.security.authentication
hadoop.security.authorization
This is possibly because your cluster is running in local mode. Do you have the following property in your mapred-site.xml file?
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>
Let the MapReduce jobs run with the yarn framework.
</description>
</property>
If you don't have this property, your cluster will run in local mode by default. I used to have exactly the same problem in local mode. After I added this property, the cluster ran in distributed mode and the problem was gone.
HTH,
Shumin
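As a hedged addendum (not part of the original answers): a quick way to see which runner a driver will actually use is to print the relevant properties from the resolved Configuration. mapreduce.framework.name applies to Hadoop 2/YARN, while Hadoop 1 keys off mapred.job.tracker; both default to local:

import org.apache.hadoop.conf.Configuration;

public class RunnerCheck {
    public static void main(String[] args) {
        // Loads core-site.xml, mapred-site.xml, etc. from the classpath.
        Configuration conf = new Configuration();

        // Hadoop 2 / YARN: "local" means LocalJobRunner, "yarn" means the cluster.
        System.out.println("mapreduce.framework.name = "
                + conf.get("mapreduce.framework.name", "local"));

        // Hadoop 1: "local" means LocalJobRunner, host:port means a JobTracker.
        System.out.println("mapred.job.tracker = "
                + conf.get("mapred.job.tracker", "local"));
    }
}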

Job Token file not found when running Hadoop wordcount example

I just installed Hadoop successfully on a small cluster. Now I'm trying to run the wordcount example but I'm getting this error:
****hdfs://localhost:54310/user/myname/test11
12/04/24 13:26:45 INFO input.FileInputFormat: Total input paths to process : 1
12/04/24 13:26:45 INFO mapred.JobClient: Running job: job_201204241257_0003
12/04/24 13:26:46 INFO mapred.JobClient: map 0% reduce 0%
12/04/24 13:26:50 INFO mapred.JobClient: Task Id : attempt_201204241257_0003_m_000002_0, Status : FAILED
Error initializing attempt_201204241257_0003_m_000002_0:
java.io.IOException: Exception reading file:/tmp/mapred/local/ttprivate/taskTracker/myname/jobcache/job_201204241257_0003/jobToken
at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:135)
at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:165)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1179)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1116)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2404)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.FileNotFoundException: File file:/tmp/mapred/local/ttprivate/taskTracker/myname/jobcache/job_201204241257_0003/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:129)
... 5 more
Any help?
I just worked through this same error. Setting the permissions recursively on my Hadoop directory didn't help. Following Mohyt's recommendation here, I modified core-site.xml (in the hadoop/conf/ directory) to remove the place where I specified the temp directory (hadoop.tmp.dir in the XML). After letting Hadoop create its own temp directory, I'm running error-free.
It is better to create your own temp directory.
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/unmesha/mytmpfolder/tmp</value>
<description>A base for other temporary directories.</description>
</property>
.....
And give it permissions:
unmesha@unmesha-virtual-machine:~$ chmod 750 /home/unmesha/mytmpfolder/tmp
Check this for the core-site.xml configuration.
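As a hedged footnote added here: whichever approach you take, it can be worth printing the values the job actually resolves, since the jobToken path in the error comes from the TaskTracker's local directories, which in Hadoop 1 default to ${hadoop.tmp.dir}/mapred/local:

import org.apache.hadoop.conf.Configuration;

public class TmpDirCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // The TaskTracker keeps private job files (including jobToken) under
        // mapred.local.dir, which defaults to ${hadoop.tmp.dir}/mapred/local.
        System.out.println("hadoop.tmp.dir   = " + conf.get("hadoop.tmp.dir"));
        System.out.println("mapred.local.dir = " + conf.get("mapred.local.dir"));
    }
}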
