MapReduce ERROR UserGroupInformation - PriviledgedActionException - hadoop

I am trying to run the following MapReduce code on my local machine:
https://github.com/Jeffyrao/warcbase/blob/extract-links/src/main/java/org/warcbase/data/ExtractLinks.java
However, I ran into this exception:
[main] ERROR UserGroupInformation - PriviledgedActionException as:jeffy (auth:SIMPLE) cause:java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Resource file:/Users/jeffy/Documents/Eclipse/warcbase/map_backup.txt is not publicly accessable and as such cannot be part of the public cache.
Exception in thread "main" java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Resource file:/Users/jeffy/Documents/Eclipse/warcbase/map_backup.txt is not publicly accessable and as such cannot be part of the public cache.
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:144)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:155)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:625)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:391)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.warcbase.data.ExtractLinks.run(ExtractLinks.java:254)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.warcbase.data.ExtractLinks.main(ExtractLinks.java:270)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Resource file:/Users/jeffy/Documents/Eclipse/warcbase/map_backup.txt is not publicly accessable and as such cannot be part of the public cache.
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:140)
... 14 more
I think this problem occurs because I am trying to add a file to the DistributedCache (look at my code at lines 81-86 and line 235). Any suggestion is welcome. Thanks!

I ran into a similar problem while running a Hadoop 2 job with files added to the DistributedCache in a local environment. The cause turned out to be that Hadoop 2 not only verifies that the path itself has public read and execute permissions, but also that all of its ancestor directories have execute permission. So if "/" or "/Users" does not have 755 permissions, the file will still fail to be added to the public cache.
See the method static boolean ancestorsHaveExecutePermissions(FileSystem fs, Path path, LoadingCache<Path, Future<FileStatus>> statCache) in the Hadoop class FSDownload.java.
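For illustration, here is a rough sketch of what that check amounts to, written against the public Hadoop FileSystem API; it is not the actual FSDownload source, the class name is made up, and the statCache parameter is left out:

import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class AncestorCheckSketch {
  /** Returns true only if every ancestor directory of 'path' grants execute to "other" users. */
  static boolean ancestorsHaveExecutePermissions(FileSystem fs, Path path) throws IOException {
    Path current = path.getParent();
    while (current != null) {
      FileStatus status = fs.getFileStatus(current);
      if (!status.getPermission().getOtherAction().implies(FsAction.EXECUTE)) {
        // e.g. /Users with mode 750 fails here, so the file cannot enter the public cache
        return false;
      }
      current = current.getParent();
    }
    return true;
  }
}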
One solution is to grant execute permission on all of those directories (which sounds unsafe).
A better solution is to make sure that any resource files to be cached live under /tmp, or under another folder whose ancestors already have at least 755 permissions by default.
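A minimal sketch of that workaround in a job driver, assuming the map_backup.txt resource from the question; the class name, paths, and job name are illustrative only:

import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CacheFromTmpSketch {
  public static void main(String[] args) throws Exception {
    // Copy the resource under /tmp, whose ancestors are world-executable by default,
    // so the public-cache permission check can succeed.
    Files.copy(Paths.get("/Users/jeffy/Documents/Eclipse/warcbase/map_backup.txt"),
               Paths.get("/tmp/map_backup.txt"),
               StandardCopyOption.REPLACE_EXISTING);

    Job job = Job.getInstance(new Configuration(), "extract-links");
    job.addCacheFile(new URI("file:///tmp/map_backup.txt"));
    // ... configure mapper/reducer, input and output paths, then:
    // System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}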

I ran into a similar problem.
I was running mahout seq2sparse with tfidf in local mode and it raised this error:
Exception in thread "main" java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Resource file:/root/title.tfidf/dictionary.file-0 is not publicly accessable and as such cannot be part of the public cache.
I found that the permissions of /root were 750 by default:
drwxr-x---. 12 root root 4096 16:04 root
So I changed the permissions of /root:
chmod 755 /root
Then it worked, so thanks, Yitong.

I had to change the permissions of only my home directory, as follows:
chmod go+rx /home/hadoop
This fixed the problem, since / and /home already have rx permissions for group and other users on my system. Here 'hadoop' is my Linux login/user name.

Related

FileNotFoundException dfs.include - Hortonworks for windows

I tried to install Hortonworks 2.2.0.2.0.6.0-0009 for Windows on a Windows Server 2012.
Everything seems clean during the installation, except when launching "start_local_hdp_services.cmd" to start the Hadoop services: the namenode and historyserver services fail to start and generate the following logs.
For "hadoop-namenode-M1BY1HADOOP.log":
2014-03-06 09:39:06,755 ERROR org.apache.hadoop.hdfs.server.namenode.HostFileManager: failed to read include file 'c:\hdp\hadoop-2.2.0.2.0.6.0-0009/etc/hadoop/dfs.include'. Continuing to use previous include list.
java.io.FileNotFoundException: c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop\dfs.include (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.util.HostsFileReader.readFileToSet(HostsFileReader.java:54)
at org.apache.hadoop.hdfs.server.namenode.HostFileManager$MutableEntrySet.readFile(HostFileManager.java:265)
at org.apache.hadoop.hdfs.server.namenode.HostFileManager.refresh(HostFileManager.java:284)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.<init>(DatanodeManager.java:176)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:237)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:609)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:567)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:684)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:669)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
For "hadoop-historyserver-M1BY1HADOOP.log":
2014-03-06 09:39:20,130 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory: [hdfs://VMHADOOP:8020/mapred/history/done]
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory: [hdfs://VMHADOOP:8020/mapred/history/done]
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:503)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:88)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:93)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:155)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:165)
Does anybody know the reason for this error and can you help me solve it, please?
Thank you.
It is a bug in Hadoop (https://issues.apache.org/jira/browse/AMBARI-2355); create empty dfs.include and dfs.exclude files and the problem will vanish.
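If you prefer doing it programmatically, here is a throwaway sketch that creates the two empty files; the directory is taken from the FileNotFoundException in the namenode log, and the class name is made up:

import java.io.File;
import java.io.IOException;

public class CreateDfsIncludeFiles {
  public static void main(String[] args) throws IOException {
    // Directory taken from the FileNotFoundException in the namenode log.
    File confDir = new File("c:\\hdp\\hadoop-2.2.0.2.0.6.0-0009\\etc\\hadoop");
    // Empty files are enough; per the linked JIRA, the services only need them to exist.
    new File(confDir, "dfs.include").createNewFile();
    new File(confDir, "dfs.exclude").createNewFile();
  }
}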

Accumulo on Cloudera CDH4 - Access denied when starting components

I have a small cluster up and running with Cloudera CDH4 Hadoop and MapReduce v1. The Namenode, Secondary Namenode, and Jobtracker are all on different machines. My three servers are also acting as Zookeeper servers.
I'm trying to install Accumulo 1.4.4 on top of this cluster. I get the same behavior with Accumulo 1.5.0. I am able to run bin/accumulo init and initialize Accumulo, but starting the individual components fails. I'm trying to make my Namenode the Accumulo master.
Running bin/start-server.sh localhost monitor spits out a very encouraging "Starting monitor on localhost", but nothing actually gets started. If I examine logs/monitor_localhost.err, I find a stack trace:
-bash-4.1$ cat logs/monitor_localhost.err
Thread "monitor" died null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.accumulo.start.Main$1.run(Main.java:91)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.ExceptionInInitializerError
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2464)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2456)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2323)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:163)
at org.apache.accumulo.core.file.FileUtil.getFileSystem(FileUtil.java:554)
at org.apache.accumulo.core.client.ZooKeeperInstance.getInstanceIDFromHdfs(ZooKeeperInstance.java:258)
at org.apache.accumulo.server.conf.ZooConfiguration.getInstance(ZooConfiguration.java:65)
at org.apache.accumulo.server.conf.ServerConfiguration.getZooConfiguration(ServerConfiguration.java:49)
at org.apache.accumulo.server.conf.ServerConfiguration.getSystemConfiguration(ServerConfiguration.java:58)
at org.apache.accumulo.server.monitor.Monitor.run(Monitor.java:440)
at org.apache.accumulo.server.monitor.Monitor.main(Monitor.java:433)
... 6 more
Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessDeclaredMembers)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:399)
at java.security.AccessController.checkPermission(AccessController.java:557)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.Class.checkMemberAccess(Class.java:2237)
at java.lang.Class.getDeclaredFields(Class.java:1805)
at org.apache.hadoop.util.ReflectionUtils.getDeclaredFieldsIncludingInherited(ReflectionUtils.java:315)
at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.initRegistry(MetricsSourceBuilder.java:92)
at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.<init>(MetricsSourceBuilder.java:56)
at org.apache.hadoop.metrics2.lib.MetricsAnnotations.newSourceBuilder(MetricsAnnotations.java:42)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:212)
at org.apache.hadoop.metrics2.MetricsSystem.register(MetricsSystem.java:54)
at org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
... 18 more
The AccessControlException: access denied looks like the important line to me, but I can't imagine what access is being restricted. I'm running everything as the hdfs user, which owns the entire /opt/accumulo-1.4.4/ directory where Accumulo is untarred. The /accumulo directory in HDFS is also owned by the hdfs user. SELinux is permissive. Searching online has proved fruitless; has anyone dealt with this error before?
Many thanks.
I started browsing the Apache accumulo-users mailing list archive and came across the solution.
http://mail-archives.apache.org/mod_mbox/accumulo-user/201312.mbox/%3CB9CB2B2BF27F0F46B8ECF781831E00E710970A9F%400015-its-exmb10.us.saic.com%3E
I had copied accumulo.policy.example to accumulo.policy because I thought I needed it in my configuration. Once I deleted the accumulo.policy file, my issues went away and I've been able to stand up Accumulo (1.5.0 at least; 1.4.4 still has some issues for me).
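For anyone curious why the policy file produced that exact trace, here is a small standalone demo, not Accumulo or Hadoop code, assuming a pre-Java-17 JDK where a SecurityManager can still be installed: a manager that denies accessDeclaredMembers makes reflective calls like Class.getDeclaredFields() fail the same way Hadoop's metrics code did while initializing UserGroupInformation.

import java.security.AccessControlException;
import java.security.Permission;

public class PolicyDemo {
  public static void main(String[] args) {
    // Deny only the permission named in the Accumulo stack trace; allow everything else.
    System.setSecurityManager(new SecurityManager() {
      @Override
      public void checkPermission(Permission perm) {
        if ("accessDeclaredMembers".equals(perm.getName())) {
          throw new AccessControlException("access denied " + perm);
        }
      }
    });
    // Hadoop's metrics code does the equivalent of this while UserGroupInformation
    // is being initialized, which is where the monitor thread died.
    String.class.getDeclaredFields();
  }
}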

Installing Hue, permission denied error?

I'm getting the following while trying to build Hue:
(6211) *** Controller starting at Thu Aug 8 11:29:50 2013
Should start 1 new children
Controller.spawn_children(number=1)
$HADOOP_HOME=
$HADOOP_BIN=/usr/local/hadoop/bin/hadoop
$HIVE_CONF_DIR=~/hive-0.10.0/conf
$HIVE_HOME=~/hive-0.10.0
find: `~/hive-0.10.0/lib': No such file or directory
$HADOOP_CLASSPATH=:
$HADOOP_OPTS=-Dlog4j.configuration=log4j.properties
$HADOOP_CONF_DIR=~/hive-0.10.0/conf:/usr/local/hadoop/conf
$HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
CWD=/usr/local/hue/desktop/conf
Executing /usr/local/hadoop/bin/hadoop jar /usr/local/hue/apps/beeswax/src/beeswax/../../java-lib/BeeswaxServer.jar --beeswax 8002 --desktop-host 127.0.0.1 --desktop-port 8888 --query-lifetime 604800000 --metastore 8003
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:1879)
at org.apache.hadoop.util.RunJar.main(RunJar.java:119)
I've changed the configuration file so it doesn't use the hue user but the user I'm logged in as, which has read and write permissions in the Hadoop DFS, Hadoop, Hive, etc. I'm not sure why it's doing this...
It seems that it is trying to start Beeswax from /usr/local/hue/desktop/conf. Beeswax runs as the 'hue' user by default (https://github.com/cloudera/hue/blob/master/desktop/core/src/desktop/supervisor.py#L67), so that directory needs to be writable by 'hue'.
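One way to confirm which directory the 'hue' user cannot write to is a small standalone probe; the class name is made up, and the directories you pass are whatever you suspect (for example /usr/local/hue/desktop/conf or your hadoop.tmp.dir):

import java.io.File;
import java.io.IOException;

public class TempFileProbe {
  public static void main(String[] args) {
    for (String dir : args) {
      try {
        File f = File.createTempFile("hue-probe", ".tmp", new File(dir));
        System.out.println("writable:     " + dir);
        f.delete();
      } catch (IOException e) {
        // This mirrors the "Permission denied" thrown from RunJar in the output above.
        System.out.println("NOT writable: " + dir + " (" + e.getMessage() + ")");
      }
    }
  }
}

Run it as the 'hue' user, e.g. sudo -u hue java TempFileProbe /usr/local/hue/desktop/conf /tmp, and fix ownership or permissions on whichever directory reports NOT writable.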

How to set User/Group permission with Hadoop/Kerberos Setup?

I am trying to set up Hadoop with Kerberos.
I am following the CDH3 Security Guide.
Things have gone pretty well so far (HDFS works OK, etc.), but I am getting the following error when I try to submit a job.
I run the HDFS server as the hdfs user and Hadoop as a user called mapred. I submit the job as a user called bob, who is in the mapred group.
The following are the values I have in taskcontroller.cfg:
mapred.local.dir=/opt/hadoop-work/local/
hadoop.log.dir=/opt/hadoop-1.0.3/logs
mapreduce.tasktracker.group=mapred
min.user.id=1000
The error I am getting is:
java.io.IOException: Job initialization failed (24) with output: Reading task controller config from /etc/hadoop/taskcontroller.cfg
Can't get group information for mapred - Success.
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:192)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:185)
... 8 more
The error always involves the value given for "mapreduce.tasktracker.group=mapred" in taskcontroller.cfg.
I have been debugging and digging in, and I think the problem is that I have set up the permissions among the different users and groups incorrectly.
Any help is greatly appreciated.

"hudson.util.IOException2: Failed to create a temp file"

I came to work today and found my Hudson with this problem! I've tried to research it, but I didn't find anything that helped me.
Here is the full stack trace:
hudson.util.IOException2: Failed to create a temp file on /home/cpcaserver5/.hudson/jobs/SVN/workspace
at hudson.FilePath.createTextTempFile(FilePath.java:966)
at hudson.tasks.CommandInterpreter.createScriptFile(CommandInterpreter.java:124)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:68)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:630)
at hudson.model.Build$RunnerImpl.build(Build.java:175)
at hudson.model.Build$RunnerImpl.doRun(Build.java:137)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:429)
at hudson.model.Run.run(Run.java:1366)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:145)
Caused by: hudson.util.IOException2: Failed to create a temporary directory in /etc/tomcat6/apache-tomcat-6.0.35/temp
at hudson.FilePath$12.invoke(FilePath.java:955)
at hudson.FilePath$12.invoke(FilePath.java:944)
at hudson.FilePath.act(FilePath.java:758)
at hudson.FilePath.act(FilePath.java:740)
at hudson.FilePath.createTextTempFile(FilePath.java:944)
... 12 more
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1716)
at java.io.File.createTempFile(File.java:1804)
at hudson.FilePath$12.invoke(FilePath.java:953)
... 16 more
Email was triggered for: Failure
Sending email for trigger: Failure
It looks like you have a permissions problem. Make sure you run Jenkins/Tomcat with appropriate user permissions. The same applies if this happens on a slave: check that the slave process runs as a user that has the appropriate permissions.
