Unable to find SASL server implementation? - hadoop

There's no issue with the Java version.
The mapper phase has begun; if there were a version-related issue, it would have thrown earlier.
It's throwing some SASL exception.
Here are the errors.
The mapper phase has already started, but it's not able to proceed further because of SASL:
2018-06-17 11:15:54,420 INFO mapreduce.Job: map 0% reduce 0%
2018-06-17 11:15:54,440 INFO mapreduce.Job: Job job_1529225370089_0093 failed with state FAILED due to: Application application_1529225370089_0093 failed 2 times due to Error launching appattempt_1529225370089_0093_000002. Got exception: org.apache.hadoop.security.AccessControlException: Unable to find SASL server implementation for DIGEST-MD5
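One thing worth checking on the node that fails to launch the attempt is whether the JVM there registers a DIGEST-MD5 SASL server at all (it normally comes from the SunSASL provider). A minimal diagnostic sketch, with an illustrative class name, that lists the registered SASL server mechanisms:

import java.util.Enumeration;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslServerFactory;

// Illustrative check: list every SASL server mechanism the running JVM
// registers; DIGEST-MD5 normally appears via the SunSASL provider.
public class ListSaslMechanisms {
    public static void main(String[] args) {
        Enumeration<SaslServerFactory> factories = Sasl.getSaslServerFactories();
        while (factories.hasMoreElements()) {
            SaslServerFactory factory = factories.nextElement();
            for (String mechanism : factory.getMechanismNames(null)) {
                System.out.println(factory.getClass().getName() + " -> " + mechanism);
            }
        }
    }
}

Run it with the same java binary the NodeManager uses; if DIGEST-MD5 is missing from the output, the problem is in that JVM's security providers rather than in the job itself.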

Related

Unknown issue in Nutch elastic indexer with nutch REST api

I was trying to expose Nutch using REST endpoints and ran into an issue in the indexer phase. I'm using the Elasticsearch index writer to index docs to ES. I've used the $NUTCH_HOME/runtime/deploy/bin/nutch startserver command. While indexing, an unknown exception is thrown.
Error:
com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:47 INFO mapreduce.Job: map 100% reduce 0%
16/10/07 16:01:49 INFO mapreduce.Job: Task Id : attempt_1475748314769_0107_r_000000_1, Status : FAILED
Error: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:53 INFO mapreduce.Job: Task Id : attempt_1475748314769_0107_r_000000_2, Status : FAILED
Error: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:58 INFO mapreduce.Job: map 100% reduce 100%
16/10/07 16:01:59 INFO mapreduce.Job: Job job_1475748314769_0107 failed with state FAILED due to: Task failed task_1475748314769_0107_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
ERROR indexer.IndexingJob: Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
Failed with exit code 255.
Any help would be appreciated.
PS: After debugging using the stack trace, I think the issue is due to a mismatch in the guava version. I've tried changing the build.xml of the plugins (parse-tika and parsefilter-naivebayes) but it didn't work.
I have found a solution for this issue. It is caused by the version compatibility of the guava dependency. Hadoop uses guava-11.0.2.jar as a dependency, but the Elasticsearch indexer plugin in Nutch requires guava 18.0. That's why it throws an exception when trying to run on distributed Hadoop. So we just need to update the guava version to 18.0 in the Hadoop libs (found at $HADOOP_HOME/share/hadoop/common/lib/).
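If in doubt about which guava jar actually wins on the task classpath, a small check like the following can help; the class name WhichGuava is made up for illustration, and it must be run with the same classpath the job uses.

import com.google.common.util.concurrent.MoreExecutors;

// Illustrative diagnostic: print which jar MoreExecutors was loaded from.
// Calling directExecutor() throws NoSuchMethodError when an older guava
// (such as 11.0.2) was loaded, which is the same signature as the error above;
// directExecutor() only exists from guava 18.0 onwards.
public class WhichGuava {
    public static void main(String[] args) {
        System.out.println(MoreExecutors.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation());
        System.out.println(MoreExecutors.directExecutor());
    }
}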

hadoop Input path does not exist

I am trying to get hadoop set up on my laptop. I have followed a few tutorials on setting up hadoop.
I ran this command:
bin/hdfs dfs -mkdir /user/<username>
If I run it again, it says the directory already exists.
I try to run the test jar file with this command:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar grep input output 'dfs[a-z.]+'
and receive this exception
16/01/22 15:11:06 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/<username>/.staging/job_1453492366595_0006
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/<username>/grep-temp-891167560
I did not realize that I receive this output before that error:
16/01/22 15:51:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/22 15:51:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/01/22 15:51:51 INFO input.FileInputFormat: Total input paths to process : 33
16/01/22 15:51:52 INFO mapreduce.JobSubmitter: number of splits:33
16/01/22 15:51:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453492366595_0009
16/01/22 15:51:52 INFO impl.YarnClientImpl: Submitted application application_1453492366595_0009
16/01/22 15:51:52 INFO mapreduce.Job: The url to track the job: http://Marys-MacBook-Pro.local:8088/proxy/application_1453492366595_0009/
16/01/22 15:51:52 INFO mapreduce.Job: Running job: job_1453492366595_0009
16/01/22 15:51:56 INFO mapreduce.Job: Job job_1453492366595_0009 running in uber mode : false
16/01/22 15:51:56 INFO mapreduce.Job: map 0% reduce 0%
16/01/22 15:51:56 INFO mapreduce.Job: Job job_1453492366595_0009 failed with state FAILED due to: Application application_1453492366595_0009 failed 2 times due to AM Container for appattempt_1453492366595_0009_000002 exited with exitCode: 127
For more detailed output, check the application tracking page: http://Marys-MacBook-Pro.local:8088/cluster/app/application_1453492366595_0009
Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1453492366595_0009_02_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
There is a stack trace that follows this.
I am on a Mac.
I use Hadoop 2.7.2, and while following the official docs I also encountered this problem at first.
The reason was that I forgot to follow "Prepare to Start the Hadoop Cluster" chapter.
I solved it by setting JAVA_HOME in etc/hadoop/hadoop-env.sh.
For me, it was because of using the wrong JDK version with Hadoop. I used Hadoop 2.6.5. At first I started Hadoop using Oracle JDK 1.8.0_131, ran the example jar, and the error occurred. After I switched to JDK 1.7.0_80, the example worked like a charm.
There is a page about HadoopJavaVersions.
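If it is not obvious which JDK the daemons actually pick up, a trivial check (illustrative class name) compiled and run with the same java binary that hadoop-env.sh points at will print it, and the result can be compared against that page.

// Illustrative check: print the version and install path of the JDK
// this java binary belongs to.
public class WhichJdk {
    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
    }
}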

Unauthorized request to start container. This token is expired

I am getting "Unauthorized request to start container. This token is expired."
How do I resolve it? The problem is reported on different forums, but I could not find a solution to it.
Below is the execution log:
15/02/26 16:41:02 INFO impl.YarnClientImpl: Submitted application application_1424968835929_0001
15/02/26 16:41:02 INFO mapreduce.Job: The url to track the job: http://101-master15:8088/proxy/application_1424968835929_0001/
15/02/26 16:41:02 INFO mapreduce.Job: Running job: job_1424968835929_0001
15/02/26 16:41:04 INFO mapreduce.Job: Job job_1424968835929_0001 running in uber mode : false
15/02/26 16:41:04 INFO mapreduce.Job: map 0% reduce 0%
15/02/26 16:41:04 INFO mapreduce.Job: Job job_1424968835929_0001 failed with state FAILED due to: Application application_1424968835929_0001 failed 2 times due to Error launching appattempt_1424968835929_0001_000002. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1424969604829 found 1424969463686
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:122)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
. Failing the application.
15/02/26 16:41:04 INFO mapreduce.Job: Counters: 0
Time taken: 0 days, 0 hours, 0 minutes, 9 seconds.
This exception occurs when your nodes have different time settings. Make sure that all 3 of your nodes have the same time and timezone settings, and then restart them.
This worked for me. Hope it helps you as well!
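As a side note, the two numbers in the error message are epoch milliseconds (the NodeManager's current time and the expiry time carried in the token), so decoding them shows the token had already expired by roughly 141 seconds when the NodeManager checked it. A tiny sketch using the values from the log above:

import java.time.Instant;

// The literals below are copied from the error message above
// ("current time is ... found ...").
public class TokenSkew {
    public static void main(String[] args) {
        long current = 1424969604829L; // NodeManager's current time
        long expiry  = 1424969463686L; // token expiry time it found
        System.out.println("current: " + Instant.ofEpochMilli(current));
        System.out.println("expiry : " + Instant.ofEpochMilli(expiry));
        System.out.println("gap    : " + (current - expiry) / 1000 + " seconds");
    }
}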

Mapreduce job failed because of container failed

The MapReduce job failed because a container failed, with the log below.
15/03/21 20:18:25 INFO mapreduce.Job: Job job_1426295876693_0015 failed with state FAILED due to: Application application_1426295876693_0015 failed 2 times due to Error launching appattempt_1426295876693_0015_000002. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1426996344559 found 1426969281613
It means that your cluster nodes are not synced to the same system time. Install an NTP server; it will fix your issue.

Hadoop mapreduce container exited with a non-zero exit code 1

I'm trying to run a Hadoop program to extract keywords from some abstracts on Ubuntu. When I run my program using Hadoop, I get the following error.
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO input.FileInputFormat: Total input paths to process : 1
INFO mapreduce.JobSubmitter: number of splits:1
INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1404812840999_0001
INFO impl.YarnClientImpl: Submitted application application_1404812840999_0001
INFO mapreduce.Job: The url to track the job: http://shiva-VirtualBox:8088/proxy/application_1404812840999_0001/
INFO mapreduce.Job: Running job: job_1404812840999_0001
INFO mapreduce.Job: Job job_1404812840999_0001 running in uber mode : false
INFO mapreduce.Job: map 0% reduce 0%
INFO mapreduce.Job: Job job_1404812840999_0001 failed with state FAILED due to: Application application_1404812840999_0001 failed 2 times due to AM Container for appattempt_1404812840999_0001_000002 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.
14/07/08 14:21:44 INFO mapreduce.Job: Counters: 0
What's the cause of this error?
Note that I converted my MapReduce project to a Maven project in order to use the Lucene library in my code.
Is your ResourceManager really at 0.0.0.0:8032? It also seems you are not using ToolRunner, so try to rewrite your MapReduce driver as described in "Hadoop: Implementing the Tool interface for MapReduce driver".
Hope it helps
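A minimal driver sketch along those lines, assuming nothing about the asker's actual job (the class name, job name, and mapper/reducer wiring are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Placeholder driver implementing Tool, so ToolRunner handles the generic
// options (-D, -conf, -libjars, ...) before run() is called.
public class KeywordDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "keyword extraction");
        job.setJarByClass(KeywordDriver.class);
        // job.setMapperClass(...); job.setReducerClass(...);
        // job.setOutputKeyClass(...); job.setOutputValueClass(...);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new KeywordDriver(), args));
    }
}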
The number of threads has increased, and JVM memory and CPU are fully utilised. Increase the JVM size and the memory limits of the mapper and reducer tasks, for example:
conf.set("mapreduce.map.memory.mb", "4096");
conf.set("mapreduce.map.java.opts", "-Xmx3500m");
