I am using Hive 0.11.0, Hadoop 2.0.3, and MySQL 5.6 for the metastore.
I can successfully run a statement like SELECT * FROM records, which does not create a map/reduce task.
But when I try to run SELECT * FROM records WHERE year='1949', the map/reduce task always fails.
Hadoop gives me these diagnostics:
Application application_1382680612829_0136 failed 1 times due to AM Container for appattempt_1382680612829_0136_000001 exited with exitCode: -1000 due to:
java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:996)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:150)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:187)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:730)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:726)
at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2379)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:726)
at org.apache.hadoop.yarn.util.FSDownload.createDir(FSDownload.java:88)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:274)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Failing this attempt. Failing the application.
What should I do? Thanks.
In summary, the solution they recommend is to create the parent folders; there is also a patch, which is supposed to be fixed in 2.0.3 [1]:
Tom White added a comment - 30/Nov/12 21:36: This fixes the problem by creating parent directories if they don't already exist. Without the fix the test would fail about 4 times in 10; with the fix I didn't see a failure.
This looks like the most similar issue I could find in the Hadoop bug database.
It is also related to [2] and [3], if you want to take a look.
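As a concrete sketch of that workaround (the path comes from your stack trace; adjust it if your yarn.nodemanager.local-dirs setting differs), you can pre-create the missing directory on each NodeManager host:
# run as the user the NodeManager runs as, on every node
mkdir -p /tmp/hadoop-hadoop/nm-local-dir/filecache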
I am using Sqoop to import data from Oracle into HDFS. When the job starts, it gets stuck at 5% progress for about an hour, and this info is output:
INFO mapreduce.Job: Task Id : attempt_1535519556038_0015_m_000037_0, Status : FAILED
Container launch failed for container_1535519556038_0015_01_000043 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1536133107764 found 1536133094775
Note: System times on machines may be out of sync. Check system time and time zones.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:375)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
It then continues until the job terminates successfully and all the data is imported. So my questions are: what causes the job to hang at 5% progress? Why does it correct itself? Is this normal? If not, could it be related to the message above? How can I fix it?
The error message explains it clearly: "Unauthorized request to start container. This token is expired." Note that the two timestamps in the log differ by only about 13 seconds (1536133107764 - 1536133094775 = 12989 ms), i.e. the launch was attempted roughly 13 seconds after the container token expired, which points to slow container launch rather than badly skewed clocks.
One option would be to increase the lifespan of the container by setting yarn.resourcemanager.rm.container-allocation.expiry-interval-ms, which defaults to 10 minutes.
Note: the jobs should work once you increase yarn.resourcemanager.rm.container-allocation.expiry-interval-ms in the yarn-site.xml config file:
<property>
<name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
<value>1000000</value>
</property>
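For scale: 1000000 ms is roughly 16.7 minutes, so this raises the default 10-minute window only modestly. The setting is read by the ResourceManager, so restart the ResourceManager after changing yarn-site.xml for it to take effect.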
I upload Scala/Spark jars to HDFS to test them on our cluster. After running, I frequently realize there are changes that need to be made. So I make the changes locally, then push the new jar back up to HDFS. However, often (not always) when I do this, Hadoop throws an error essentially saying that this jar is not the same as the old jar (duh).
I tried clearing my Trash, .staging, and .sparkstaging directories, but that doesn't do anything. I tried renaming the jar, which works sometimes and other times doesn't (it's still ridiculous that I have to do this in the first place).
Does anyone know why this is occurring and how I can prevent it? Thanks for any help. Here are some logs, if that helps (some paths edited out):
Application application_1475165877428_124781 failed 2 times due to AM Container for appattempt_1475165877428_124781_000002 exited with exitCode: -1000
For more detailed output, check application tracking page: http://examplelogsite/ Then, click on links to logs of each attempt.
Diagnostics: Resource MYJARPATH/EXAMPLE.jar changed on src filesystem (expected 1475433291946, was 1475433292850
java.io.IOException: Resource MYJARPATH/EXAMPLE.jar changed on src filesystem (expected 1475433291946, was 1475433292850
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
I haven't seen that exit code before, so by itself it doesn't tell me anything. I would suggest checking the application logs, like this:
yarn logs -applicationId <your_application_ID>
According to your log, I'm sure the error comes from the YARN side.
As a workaround, you can modify YARN yourself to skip this exception.
I found this thread because of the "changed on src filesystem" error; I ran into the same issue and skipped it by modifying the YARN source code.
For more details, you can refer to how-to-fix-resource-changed-on-src-filesystem-issue.
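As a quick check (my suggestion, assuming a reasonably recent hdfs CLI), you can compare the jar's actual modification time with the "expected" value YARN recorded at submission:
# prints the modification time in milliseconds since the epoch, plus the file name
hdfs dfs -stat "%Y %n" MYJARPATH/EXAMPLE.jar
If the printed time matches the "was" value in the diagnostics, the jar really was replaced between submission and localization.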
I ran into a problem on a Kerberized cluster where Hive would not start.
Symptoms:
Services started successfully (and did not stop)
An alert appeared in Ambari saying that the Hive metastore had failed
Starting Hive on the command line did not succeed (it just kept hanging)
Via beeline I was able to see metadata, but not get actual data
I found the following error in /var/log/hive/hivemetastore.log:
2016-08-29 10:12:49,047 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5934)) - Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hive/cluster/delegation/METASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:469)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server.startDelegationTokenSecretManager(HadoopThriftAuthBridge.java:444)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6015)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5930)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hive/cluster/delegation/METASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
Note that I actually tried several things, so I am not sure whether this is the full solution, but here is the final step, which I believe to be the critical one:
After a long search I indirectly found this site: https://community.hortonworks.com/articles/49040/hive-metastore-crashes-on-nullpointerexception-wit.html
Here is the relevant fragment that helped me resolve the issue:
This is a known issue being tracked in the following Hortonworks bug:
https://hortonworks.jira.com/browse/BUG-42602
WORKAROUND:
Set the hive.cluster.delegation.token.store.class to the following:
hive.cluster.delegation.token.store.class=org.apache.hadoop.hive.thrift.DBTokenStore
If using Ambari, this setting can be changed by clicking on the Hive
service on the Ambari Dashboard, navigating to the "Configs" tab, and
modifying the parameter in the "Advanced Hive-site" section of the
Hive configs. Save the changes and restart Hive from the Ambari User
Interface when prompted.
If not using Ambari, this setting can be found in the /etc/hive/conf/hive-site.xml file. Make sure this change is made on all applicable nodes in the cluster. Once the changes are made, the Hive services must be restarted.
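For reference, the same workaround expressed as a hive-site.xml snippet (same property and value as above, just in XML form):
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.DBTokenStore</value>
</property>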
I have installed Oozie on my system, and I have also installed Pig. Now I want Oozie to run the workflow using the Pig that is installed on my system, not the one from the Oozie sharelib. Please help, as I get the following error:
2015-08-19 17:15:25,724 WARN PigActionExecutor:523 - SERVER[edb-node1] USER[hduser] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000002-150819170943510-oozie-hdus-W] ACTION[0000002-150819170943510-oozie-hdus-W#pig-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exception invoking main(), java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
2015-08-19 17:15:25,728 WARN PigActionExecutor:523 - SERVER[edb-node1] USER[hduser] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000002-150819170943510-oozie-hdus-W] ACTION[0000002-150819170943510-oozie-hdus-W#pig-node] Launcher exception: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:234)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
... 13 more
Your error messages show clearly that you have an incomplete CLASSPATH.
That's because the pig command-line tool does a lot of things, among them silently setting up the appropriate Pig JARs on the CLASSPATH before invoking the PigMain Java class. But Oozie calls the Java class directly; the CLASSPATH issues are supposed to be handled either...
by the Pig ShareLib, when active
or by you, as a knowledgeable Java developer -- after all, it's your
choice not to use the default way to run Pig, so you know what you
are doing, right?
Before asking your question, did you try a search on StackOverflow (and/or Google) with the following keywords? The results could prove useful.
oozie pig custom classpath
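For what it's worth, here is a minimal sketch of the "handle it yourself" route in job.properties (the HDFS path is an assumption; point oozie.libpath at wherever you upload the JARs from your local Pig install):
# don't use the Oozie ShareLib
oozie.use.system.libpath=false
# hypothetical HDFS directory containing your own Pig JARs
oozie.libpath=/user/hduser/pig-libs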
In the above case, have you checked whether you can access the Pig CLI (the grunt shell), or run a Pig script manually, without using Oozie?
What happened is that the Pig JAR was not accessible when you tried to run it via Oozie. The best way to avoid this kind of issue is to use the Oozie ShareLib, but since you mention you don't want to use the sharelib, try an alternative like the following (a sketch is shown after this list):
Update the Pig home path in the Hadoop classpath. This will help MapReduce fetch the list of JARs from the HDFS temp location every time Oozie submits a request to MapReduce.
Update the Pig home in your bash profile (if Pig shows a "command not found" error when you try to open the grunt shell).
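A minimal sketch of those two updates, assuming a Pig install under /usr/local/pig (an assumed location; adjust the paths to your system), added to ~/.bashrc:
export PIG_HOME=/usr/local/pig                                            # assumed install location
export PATH="$PATH:$PIG_HOME/bin"                                         # so the pig / grunt command is found
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$PIG_HOME/*:$PIG_HOME/lib/*"   # expose the Pig JARs to Hadoop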
Hope this helps.
I am trying to set up Hadoop with Kerberos.
I am following the CDH3 Security Guide.
Things have gone pretty well so far (HDFS works OK, etc.), but I am getting the following error when I try to submit a job.
I run the HDFS server as the hdfs user and Hadoop as a user called mapred. I submit the job as a user called bob, who is in the mapred group.
Following are values I have for taskcontroller.cfg
mapred.local.dir=/opt/hadoop-work/local/
hadoop.log.dir=/opt/hadoop-1.0.3/logs
mapreduce.tasktracker.group=mapred
min.user.id=1000
The error I am getting is:
java.io.IOException: Job initialization failed (24) with output: Reading task controller config from /etc/hadoop/taskcontroller.cfg
Can't get group information for mapred - Success.
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:192)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:185)
... 8 more
The error always involves the value given for "mapreduce.tasktracker.group=mapred" in taskcontroller.cfg.
I have been debugging and digging in, and I think the problem is that I have set up the permissions among the different users and groups incorrectly.
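For reference, these are the standard Linux commands I have been using to check the group setup that the error complains about (nothing Hadoop-specific):
getent group mapred   # does the mapred group resolve on this node, and who is in it?
id bob                # is bob actually a member of the mapred group?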
Any help is greatly appreciated.