Accumulo: There are no tablet servers - hadoop

./bin/accumulo shell -u root
Password: ******
2015-02-14 15:18:28,503 [impl.ServerClient] WARN : There are no tablet servers: check that zookeeper and accumulo are running.
2015-02-14 13:58:52,878 [tserver.NativeMap] ERROR: Tried and failed to load native map library from /home/hduser/hadoop/lib/native::/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
java.lang.UnsatisfiedLinkError: no accumulo in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at org.apache.accumulo.tserver.NativeMap.<clinit>(NativeMap.java:80)
at org.apache.accumulo.tserver.TabletServerResourceManager.<init>(TabletServerResourceManager.java:155)
at org.apache.accumulo.tserver.TabletServer.config(TabletServer.java:3560)
at org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:3671)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:141)
at java.lang.Thread.run(Thread.java:745)
2015-02-14 13:58:52,915 [tserver.TabletServer] ERROR: Uncaught exception in TabletServer.main, exiting
java.lang.IllegalArgumentException: Maximum tablet server map memory 83,886,080 and block cache sizes 28,311,552 is too large for this JVM configuration 48,693,248
at org.apache.accumulo.tserver.TabletServerResourceManager.<init>(TabletServerResourceManager.java:166)
at org.apache.accumulo.tserver.TabletServer.config(TabletServer.java:3560)
at org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:3671)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:141)
at java.lang.Thread.run(Thread.java:745)
The above error is shown in tserver_localhost.log. Can anyone help me with this issue?
I have Hadoop running in single-node mode, ZooKeeper running, and I followed the instructions in the Accumulo README file.
I don't know how to start a tablet server. There was no explanation of this in the README. Could anyone help me with this?

This is the confluence of two problems.
First, your Accumulo can't find the native libraries it would use to keep the in-memory map for live edits off the Java heap. Diagnosing why it failed would require knowing your version of Accumulo, how you deployed it, and what your accumulo-env.sh looks like (asking on the user mailing list would be best). Take a look at the README for your version, under the Building section, for "native map support".
For example, the passage for version 1.6.1 gives the following advice for building them yourself without a full source tree:
Alternatively, you can manually unpack the accumulo-native tarball in the
$ACCUMULO_HOME/lib directory. Change to the accumulo-native directory in
the current directory and issue make. Then, copy the resulting 'libaccumulo'
library into the $ACCUMULO_HOME/lib/native/map.
$ mkdir -p $ACCUMULO_HOME/lib/native/map
$ cp libaccumulo.* $ACCUMULO_HOME/lib/native/map
Normally, not having the native libraries available is a soft failure; Accumulo will happily issue a WARN and then rely on a pure-java implementation.
Your second problem is caused by incorrect memory configuration. Accumulo relies on a single configuration parameter to tune memory use for both the native in-memory map and the Java one. The memory for the native implementation is allocated outside of the JVM heap and can be substantial (in the 1-16GB range depending on target workload). When running with the Java implementation, that same configuration value is carved out of the max heap size instead.
Based on your log output, you have configured a total max heap for tablet servers of ~46MB. You have allocated 27MB for the block cache and 80MB for the in-memory map. The error you see is because those two values together (107MB) exceed the heap and would result in an OOM.
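The arithmetic behind that error can be checked by hand. A minimal sketch in Python, using only the three byte counts from the IllegalArgumentException in the log (the variable names are just for illustration):

```python
MIB = 1024 * 1024

# Values copied from the IllegalArgumentException in the log.
map_memory = 83_886_080      # "Maximum tablet server map memory"
block_caches = 28_311_552    # "block cache sizes"
jvm_max_heap = 48_693_248    # "this JVM configuration"

requested = map_memory + block_caches
shortfall = requested - jvm_max_heap

print(f"in-memory map : {map_memory / MIB:.1f} MiB")
print(f"block caches  : {block_caches / MIB:.1f} MiB")
print(f"JVM max heap  : {jvm_max_heap / MIB:.1f} MiB")
print(f"requested     : {requested / MIB:.1f} MiB")
print(f"over by       : {shortfall / MIB:.1f} MiB")
```

The requested total is more than twice the available heap, so the tablet server refuses to start rather than run out of memory later.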
You can increase the total Java Heap in accumulo-env.sh:
# Probably looks like this
test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx48m -Xms48m "
# change this part to give it more memory --^^^^^^
And/or you can tune how much space should be used for the in-memory map, block cache, and index cache in accumulo-site.xml:
<!-- Amount of space to hold incoming random writes -->
<property>
<name>tserver.memory.maps.max</name>
<value>80M</value>
</property>
<!-- Amount of space for holding blocks of data read out of HDFS -->
<property>
<name>tserver.cache.data.size</name>
<value>7M</value>
</property>
<!-- Amount of space for holding indexes read out of HDFS -->
<property>
<name>tserver.cache.index.size</name>
<value>20M</value>
</property>
How you should balance these three will depend on how much memory you have and what your workload looks like. Keep in mind that more than just the in-memory map and the caches needs to fit into your total Java heap (for example, at least one copy of the current cell being written or read on each RPC).
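Before restarting, it can help to check whether a candidate set of values fits at all. The following is a rough sketch, not anything Accumulo ships: the helper names are made up, the K/M/G suffix handling is an assumption about how the values are written, and it only covers the pure-Java map case, where the map memory counts against the heap.

```python
# Hypothetical helper: binary multipliers for size suffixes (an assumption).
UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a value such as '80M' or '1G' into a number of bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)  # no suffix: treat as a plain byte count

def heap_headroom(xmx, maps_max, data_cache, index_cache):
    """Bytes left in the heap after the in-memory map and both caches."""
    used = parse_size(maps_max) + parse_size(data_cache) + parse_size(index_cache)
    return parse_size(xmx) - used

# The settings from this thread: -Xmx48m with 80M / 7M / 20M.
print(heap_headroom("48m", "80M", "7M", "20M"))    # negative: does not fit

# A hypothetical larger heap with the same map and cache sizes.
print(heap_headroom("256m", "80M", "7M", "20M"))   # positive: room to spare
```

A positive result only means the three settings fit; the remaining headroom still has to cover everything else the tablet server keeps on the heap.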

I have found the solution to this.
I removed all the config files from the Accumulo config folder and ran the bootstrap_config.sh script in the bin folder, which created new config files based on the input I gave. After that I initialized Accumulo again, and I was able to open the shell and the error was gone.
Thanks for the help.

Related

Change server location for HDFS

I'm trying to follow the tutorial here: https://www.quickprogrammingtips.com/big-data/how-to-install-hadoop-on-mac-os-x-el-capitan.html, but I'm getting a strange error when trying to run the line
sbin/start-dfs.sh
It doesn't raise any complaints when I run the script, but the namenode is not actually started. When I went to inspect the logs, I saw this error:
2020-01-30 13:30:52,700 INFO org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: censoredsite.com:0
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:995)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:932)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:834)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:898)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:877)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1603)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1671)
Caused by: java.net.BindException: Can't assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:990)
Which was preceded by this line earlier:
2020-01-30 13:30:52,359 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://censoredsite.com/archive:50070
It seems that somehow the web server for HDFS has been set to something it shouldn't be. I searched around online, but I couldn't find what this value should properly be (I assume localhost?) or how to actually change it in the config files.
The other interesting thing is that this "censoredsite" is actually a uh... lewd site I used to visit a few years ago. I have absolutely no idea how it managed to get into my HDFS configuration details; it's pretty worrying that it somehow worked its way into my computer. Does anyone know how to explicitly change the location used by org.apache.hadoop.hdfs.DFSUtil? Thanks.
It sounds like it ended up in your /etc/hosts file as a site mapping...
The way to change the address, though, is the dfs.namenode.http-address property in hdfs-site.xml; see
https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Alternatively, install Hadoop in a VM or download one of the Cloudera QuickStart VMs, where it's all pre-configured.
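To test the /etc/hosts theory, the file can be scanned for the unexpected name. A minimal sketch in Python, assuming the standard hosts-file layout (an address followed by one or more names, with # starting a comment); the function name and the sample entries are made up for illustration:

```python
def hosts_entries_for(hosts_text, name):
    """Return (address, names) pairs from hosts-file text that mention `name`."""
    matches = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if any(name.lower() in n.lower() for n in names):
            matches.append((address, names))
    return matches

# Hypothetical file contents; on a real machine read /etc/hosts instead:
#   hosts_text = open("/etc/hosts").read()
sample = """\
127.0.0.1   localhost
# a stray mapping of the kind suspected here
203.0.113.7 unexpected.example.com myhost
"""
print(hosts_entries_for(sample, "unexpected.example.com"))
```

If the name turns up next to the machine's own hostname or address, removing that mapping is the first thing to try before touching hdfs-site.xml.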

Hive does not start: Error creating path /hive/cluster/delegation/METASTORE/keys

I ran into a problem on a kerberized cluster where hive would not start.
Symptoms:
Services started successfully (and did not stop)
In Ambari an alert appeared which mentioned that the Hive metastore failed
Starting hive on the command line did not succeed (it just kept hanging)
Via beeline I was able to see metadata, but not get actual data
I found the following error in /var/log/hive/hivemetastore.log
2016-08-29 10:12:49,047 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5934)) - Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hive/cluster/delegation/METASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:469)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server.startDelegationTokenSecretManager(HadoopThriftAuthBridge.java:444)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6015)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5930)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hive/cluster/delegation/METASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
Note that I actually tried several things, so I am not sure whether this is the full solution, but here is the final step, which I believe to be the critical one:
After a long search I indirectly found this site: https://community.hortonworks.com/articles/49040/hive-metastore-crashes-on-nullpointerexception-wit.html
Here is the relevant fragment that helped me resolve the issue:
This is a known issue being tracked in the following Hortonworks bug:
https://hortonworks.jira.com/browse/BUG-42602
WORKAROUND:
Set the hive.cluster.delegation.token.store.class to the following:
hive.cluster.delegation.token.store.class=org.apache.hadoop.hive.thrift.DBTokenStore
If using Ambari, this setting can be changed by clicking on the Hive
service on the Ambari Dashboard, navigating to the "Configs" tab, and
modifying the parameter in the "Advanced Hive-site" section of the
Hive configs. Save the changes and restart Hive from the Ambari User
Interface when prompted.
If not using ambari, this setting can be located in the
/etc/hive/conf/hive-site.xml file. Make sure this change is made on
all applicable nodes on the cluster. Once the changes are made, the
Hive services must be restarted.

Set LD_LIBRARY_PATH or java.library.path for YARN / Hadoop2 Jobs

I have a Hadoop FileSystem which uses native libraries via JNI.
Apparently I have to include the shared object independently of the currently executed job, but I can't find a way to tell Hadoop/YARN where it should look for the shared object.
I had partial success with the following solutions while starting the wordcount example with YARN.
Setting export JAVA_LIBRARY_PATH=/path when starting the resource manager and the node manager.
This helps with the resource manager and the node manager, but the actual job/application fails. Printing LD_LIBRARY_PATH and java.library.path while executing the wordcount example yields the following result:
/logs/userlogs/application_x/container_x_001/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x
Setting yarn.app.mapreduce.am.env="LD_LIBRARY_PATH=/path"
This helped with some of the jobs. The actual map/reduce job worked (at least I got the correct results), but the call failed with the error no jni-xtreemfs in java.library.path.
Somehow the first application/job worked and shows
/logs/userlogs/application_x/container_x_001/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/path:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/path
But the second and the rest did fail with:
/logs/userlogs/application_x/container_x_002/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_002:/opt/hadoop-2.7.1/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_002/opt/hadoop-2.7.1/lib/native
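The difference between the two containers is easier to see when the two java.library.path values printed above are compared entry by entry. A small sketch in Python, assuming the paths are exactly as they appear in the stdout excerpts (the helper name is made up):

```python
# Entries common to both containers, copied from the stdout excerpts.
SYSTEM = ("/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:"
          "/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib")
APP = "/tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/"

first = APP + "container_x_001:/path:" + SYSTEM
second = APP + "container_x_002:/opt/hadoop-2.7.1/lib/native:" + SYSTEM

def entries(path):
    """Split a library path, skipping the per-container working directory."""
    return [p for p in path.split(":") if p and "/appcache/" not in p]

only_first = [p for p in entries(first) if p not in entries(second)]
only_second = [p for p in entries(second) if p not in entries(first)]

print(only_first)    # ['/path']
print(only_second)   # ['/opt/hadoop-2.7.1/lib/native']
```

So the custom /path reaches the first container only; the second one gets the Hadoop native directory in its place.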
The stack trace for the latter shows that the error occurred while executing YarnChild:
2015-08-03 15:24:03,851 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.UnsatisfiedLinkError: no jni-xtreemfs in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at org.xtreemfs.common.libxtreemfs.jni.NativeHelper.loadLibrary(NativeHelper.java:54)
at org.xtreemfs.common.libxtreemfs.jni.NativeClient.<clinit>(NativeClient.java:41)
at org.xtreemfs.common.libxtreemfs.ClientFactory.createClient(ClientFactory.java:72)
at org.xtreemfs.common.libxtreemfs.ClientFactory.createClient(ClientFactory.java:51)
at org.xtreemfs.common.clients.hadoop.XtreemFSFileSystem.initialize(XtreemFSFileSystem.java:191)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Supplying libjni-xtreemfs.so via the command-line argument -files
This does work. I assume the .so is copied to the tmp directory. But it is not a feasible solution, because it would require users to supply the path to the .so on every call.
Does anybody know how I can set LD_LIBRARY_PATH or java.library.path globally, or can anyone suggest which configuration options I probably missed? I'd be very thankful!
Short Answer: in your mapred-site.xml put the following
<property>
<name>mapred.child.java.opts</name>
<value>-Djava.library.path=$PATH_TO_NATIVE_LIBS</value>
</property>
Explanation:
The jobs/applications are executed not by YARN itself but by a mapred (map/reduce) container, whose configuration is controlled by the mapred-site.xml file. Specifying custom Java parameters there causes the actual workers to start with the correct path.
Use mapreduce.map.env in your job or site configuration.
Usage is as follows:
<property>
<name>mapreduce.map.env</name>
<value>LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/my/libs</value>
</property>
Note:
Hadoop docs encourage the use of mapreduce.map.env for this over mapred.child.java.opts. "Usage of -Djava.library.path can cause programs to no longer function if hadoop native libraries are used."

Accumulo on Cloudera CDH4 - Access denied when starting components

I have a small cluster up and running with Cloudera CDH4 Hadoop and Map Reduce v1. Namenode/Secondary Namenode/Jobtracker all on different machines. My three servers are also acting as Zookeeper servers.
I'm trying to install Accumulo 1.4.4 on top of this cluster. I get the same behavior with Accumulo 1.5.0. I am able to bin/accumulo init and initialize Accumulo, but starting the individual components fail. I'm trying to make my Namenode the Accumulo master.
bin/start-server.sh localhost monitor spits out a very encouraging Starting monitor on localhost, but nothing gets started. If I examine logs/monitor_localhost.err I find a stacktrace:
-bash-4.1$ cat logs/monitor_localhost.err
Thread "monitor" died null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.accumulo.start.Main$1.run(Main.java:91)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.ExceptionInInitializerError
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2464)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2456)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2323)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:163)
at org.apache.accumulo.core.file.FileUtil.getFileSystem(FileUtil.java:554)
at org.apache.accumulo.core.client.ZooKeeperInstance.getInstanceIDFromHdfs(ZooKeeperInstance.java:258)
at org.apache.accumulo.server.conf.ZooConfiguration.getInstance(ZooConfiguration.java:65)
at org.apache.accumulo.server.conf.ServerConfiguration.getZooConfiguration(ServerConfiguration.java:49)
at org.apache.accumulo.server.conf.ServerConfiguration.getSystemConfiguration(ServerConfiguration.java:58)
at org.apache.accumulo.server.monitor.Monitor.run(Monitor.java:440)
at org.apache.accumulo.server.monitor.Monitor.main(Monitor.java:433)
... 6 more
Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessDeclaredMembers)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:399)
at java.security.AccessController.checkPermission(AccessController.java:557)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.Class.checkMemberAccess(Class.java:2237)
at java.lang.Class.getDeclaredFields(Class.java:1805)
at org.apache.hadoop.util.ReflectionUtils.getDeclaredFieldsIncludingInherited(ReflectionUtils.java:315)
at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.initRegistry(MetricsSourceBuilder.java:92)
at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.<init>(MetricsSourceBuilder.java:56)
at org.apache.hadoop.metrics2.lib.MetricsAnnotations.newSourceBuilder(MetricsAnnotations.java:42)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:212)
at org.apache.hadoop.metrics2.MetricsSystem.register(MetricsSystem.java:54)
at org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
... 18 more
The AccessControlException: access denied looks like the important line to me, but I can't imagine what access is being restricted. I'm running everything as the hdfs user, which owns the entire /opt/accumulo-1.4.4/ directory where Accumulo is un-tarred. The /accumulo directory in HDFS is also owned by the hdfs user. SELinux is permissive. Searching online has proved fruitless. Has anyone dealt with this error before?
Much thanks.
I started browsing the Apache accumulo-users mailing list archive and came across the solution.
http://mail-archives.apache.org/mod_mbox/accumulo-user/201312.mbox/%3CB9CB2B2BF27F0F46B8ECF781831E00E710970A9F%400015-its-exmb10.us.saic.com%3E
I was copying the accumulo.policy.example to accumulo.policy because I thought I needed it in my configuration. Once I deleted the accumulo.policy file my issues went away and I've been able to stand up Accumulo (1.5.0 at least, 1.4.4 still has some issues for me)

Hadoop safemode recovery - taking too long!

I have a Hadoop cluster with 18 data nodes.
I restarted the name node over two hours ago and the name node is still in safe mode.
I have been searching for why this might be taking too long and I cannot find a good answer.
The posting here:
Hadoop safemode recovery - taking lot of time
is relevant but I'm not sure if I want/need to restart the name node after making a change to this setting as that article mentions:
<property>
<name>dfs.namenode.handler.count</name>
<value>3</value>
<final>true</final>
</property>
In any case, this is what I've been getting in 'hadoop-hadoop-namenode-hadoop-name-node.log':
2011-02-11 01:39:55,226 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020, call delete(/tmp/hadoop-hadoop/mapred/system, true) from 10.1.206.27:54864: error: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode.
The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode.
The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1711)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1691)
at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:565)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
Any advice is appreciated.
Thanks!
I had it once, where some blocks were never reported in. I had to forcefully let the namenode leave safemode (hadoop dfsadmin -safemode leave) and then run an fsck to delete missing files.
Check the dfs.namenode.handler.count property in hdfs-site.xml.
It specifies the number of threads the NameNode uses for its processing; the default value is 10. Too low a value for this property might cause the issue described.
Also check for missing or corrupt blocks:
hdfs fsck / | egrep -v '^\.+$' | grep -v replica
hdfs fsck /path/to/corrupt/file -locations -blocks -files
If corrupt blocks are found, remove the affected file:
hdfs dfs -rm /file-with-missing-corrupt-blocks
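The numbers in the safe mode message from the question can be reproduced by hand, which shows how far the NameNode is from leaving safe mode on its own. A minimal sketch in Python, assuming the block threshold is the total multiplied by the threshold fraction and rounded down (that rounding is an assumption, chosen because it matches the logged figures):

```python
# Figures copied from the SafeModeException message in the NameNode log.
reported = 319_128     # "The reported blocks"
total = 326_638        # "total blocks"
threshold = 0.9990     # "the threshold"

needed_total = int(total * threshold)   # assumed to be rounded down
additional = needed_total - reported

print(needed_total, additional)                       # 326311 7183
print(f"reported so far: {reported / total:.2%} of all blocks")
```

The result matches the "needs additional 7183 blocks" in the log. If that count stops shrinking while all data nodes are up, the remaining blocks are probably never going to be reported, which is the case the fsck commands above are meant to find.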

Resources