nodemanager error in yarn - hadoop

I have 100 nodemanagers in my cluster,with 1000 cores and 4.8T Mem.
but, some thing driver me crazy on nodemanager recently,and it occurrence once in a while, not every day.
In Cloudera Management,the health test show me that "GC Duration Unknown" and "Web Server Status Bad":
Then,my application in cluster will be error with timeout or thread interrupt.
and nodemanager errorlog is like below:
java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS); Host Details : local host is: "lfh-R720-20/10.1.0.20"; destination host is: "lfh-R720-20":8040;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy40.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:129)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1132)
Caused by: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:855)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:626)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 8 more
Caused by: java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:853)
... 13 more
java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS); Host Details : local host is: "lfh-R720-20/10.1.0.20"; destination host is: "lfh-R720-20":8040;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy40.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:129)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1132)
Caused by: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:855)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:626)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 8 more
Caused by: java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:853)
... 13 more
java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS); Host Details : local host is: "lfh-R720-20/10.1.0.20"; destination host is: "lfh-R720-20":8040;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy40.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:129)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1132)
Caused by: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=1000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:855)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:626)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 8 more
Caused by: java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:853)
... 13 more
thank you!

Related

hbase import module don't succeed

I have to move some hbase tables from one hadoop cluster to another. I have extracted the tables using
bin/hbase org.apache.hadoop.hbase.mapreduce.Export \ <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
and I've put the return files into HDFS on my new cluster.
But when I try bin/hbase org.apache.hadoop.hbase.mapreduce.Import , I have the strange following logs:
hadoop#edgenode:~$ hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Import ADCP /hbase/backup_hbase/ADCP/2022-07-04_1546/ADCP/
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
2022-10-03 11:19:09,689 INFO [main] mapreduce.Import: writing directly to table from Mapper.
2022-10-03 11:19:09,847 INFO [main] client.RMProxy: Connecting to ResourceManager at /172.16.42.42:8032
2022-10-03 11:19:09,983 INFO [main] Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 02/10/2020 11:30 GMT
2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:host.name=edgenode
2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_342
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.vendor=Private Build
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: hadoop-yarn-client-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-services-core-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-common-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-router-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-registry-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-3.3.3.jar
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop/lib/native
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.name=Linux
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.version=5.15.0-1018-kvm
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.name=hadoop
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.free=174MB
2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.max=3860MB
2022-10-03 11:19:10,045 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.total=237MB
2022-10-03 11:19:10,048 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Initiating client connection, connectString=namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$15/257950720#1124fc36
2022-10-03 11:19:10,054 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] common.X509Util: Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2022-10-03 11:19:10,061 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxnSocket: jute.maxbuffer value is 4194304 Bytes
2022-10-03 11:19:10,069 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxn: zookeeper.request.timeout value is 0. feature enabled=
2022-10-03 11:19:10,077 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Opening socket connection to server namenode/172.16.42.42:2181. Will not attempt to authenticate using SASL (unknown error)
2022-10-03 11:19:10,084 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.16.42.187:48598, server: namenode/172.16.42.42:2181
2022-10-03 11:19:10,120 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Session establishment complete on server namenode/172.16.42.42:2181, sessionid = 0x1b000002cb790005, negotiated timeout = 40000
2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Session: 0x1b000002cb790005 closed
2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x1b000002cb790005
2022-10-03 11:19:15,366 INFO [main] input.FileInputFormat: Total input files to process : 32
2022-10-03 11:19:15,660 INFO [main] mapreduce.JobSubmitter: number of splits:32
2022-10-03 11:19:15,902 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1664271607293_0002
2022-10-03 11:19:16,225 INFO [main] conf.Configuration: resource-types.xml not found
2022-10-03 11:19:16,225 INFO [main] resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE
2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE
2022-10-03 11:19:16,293 INFO [main] impl.YarnClientImpl: Submitted application application_1664271607293_0002
2022-10-03 11:19:16,328 INFO [main] mapreduce.Job: The url to track the job: http://namenode:8088/proxy/application_1664271607293_0002/
2022-10-03 11:19:16,329 INFO [main] mapreduce.Job: Running job: job_1664271607293_0002
2022-10-03 11:19:31,513 INFO [main] mapreduce.Job: Job job_1664271607293_0002 running in uber mode : false
2022-10-03 11:19:31,514 INFO [main] mapreduce.Job: map 0% reduce 0%
2022-10-03 11:19:31,534 INFO [main] mapreduce.Job: 2-10-03 11:19:31.345]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[2022-10-03 11:19:31.346]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
For more detailed output, check the application tracking page: http://namenode:8088/cluster/app/application_1664271607293_0002 Then click on links to logs of each attempt.
. Failing the application.
2022-10-03 11:19:31,552 INFO [main] mapreduce.Job: Counters: 0
I don't understand what the problem could be. I went to http://namenode:8088/cluster/app/application_1664271607293_0002 but i didn't found nothing interesting. I've tried the command with different tables but get the same result. The two clusters are not one the same version but I read that it wasn't a problem. Every service works well on my clusters and I can use hbase commands on the hbase shell without any problem. Also, map reduce programs works well on my new cluster. I've also tested the copyTable and snapchot methods for the data migration, which didn't work either.
Any idea of what should be the problem? Thanks! :)
update :
I found this on a datanode syslog in the hadoop web interface, may be useful?
2022-10-04 14:12:39,341 INFO [main] org.apache.hadoop.security.SecurityUtil: Updating Configuration
2022-10-04 14:12:39,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2022-10-04 14:12:39,493 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 7 cluster_timestamp: 1664271607293 } attemptId: 2 } keyId: -896624238)
2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.conf.Configuration: resource-types.xml not found
2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-10-04 14:12:39,636 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73)
at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36)
at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70)
... 10 more
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual
Reason:
Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: #36
flags: { }
locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
Bytecode:
0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7
0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a
0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5
0x0000030: 0013 2ab0
Stackmap Table:
same_frame(#19)
same_frame(#31)
same_frame(#40)
at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017)
at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37)
... 15 more
2022-10-04 14:12:39,641 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73)
at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36)
at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70)
... 10 more
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual
Reason:
Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: #36
flags: { }
locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
Bytecode:
0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7
0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a
0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5
0x0000030: 0013 2ab0
Stackmap Table:
same_frame(#19)
same_frame(#31)
same_frame(#40)
at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017)
at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37)
... 15 more
2022-10-04 14:12:39,643 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException

Issue running spark-shell with yarn client, ERROR client.TransportClient: Failed to send RPC

I am trying to setup hadoop 3.1.2 with spark in windows. i have started hdfs cluster and i am able to create,copy files in hdfs. When i try to start spark-shell with yarn i am facing
ERROR cluster.YarnClientSchedulerBackend: Diagnostics message: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:227)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:544)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:264)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:875)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:874)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:874)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:906)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 7406367420263248997 to DESKTOP-TVBSANL.bbrouter/192.168.1.38:49691: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:362)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:339)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
2020-03-28 11:32:11,608 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
when checked with yarn logs
2020-03-28 11:32:11,487 ERROR client.TransportClient: Failed to send RPC RPC 7406367420263248997 to DESKTOP-TVBSANL.bbrouter/192.168.1.38:49691: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
2020-03-28 11:32:11,494 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:227)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:544)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:264)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:875)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:874)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:874)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:906)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 7406367420263248997 to DESKTOP-TVBSANL.bbrouter/192.168.1.38:49691: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:362)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:339)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
2020-03-28 11:32:11,497 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:227)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:544)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:264)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:875)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:874)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:874)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:906)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 7406367420263248997 to DESKTOP-TVBSANL.bbrouter/192.168.1.38:49691: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:362)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:339)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
)
2020-03-28 11:32:11,505 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:227)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:544)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:264)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:875)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:874)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:874)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:906)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 7406367420263248997 to DESKTOP-TVBSANL.bbrouter/192.168.1.38:49691: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:362)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:339)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
)
2020-03-28 11:32:11,526 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
2020-03-28 11:32:11,729 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://localhost:9000/user/Andrew/.sparkStaging/application_1585375241853_0002
2020-03-28 11:32:12,225 INFO util.ShutdownHookManager: Shutdown hook called
I have even added the following properties in spark conf
spark.driver.extraJavaOptions -Dhdp.version=3.1.2
spark.yarn.am.extraJavaOptions -Dhdp.version=3.1.2
these in yarn-site
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>5</value>
</property>
My cluster is a single node cluster. Windows os with 16gb ram and 500 GB HDD. The following is my hdfs report
Configured Capacity: 1000203087872 (931.51 GB)
Present Capacity: 252250412093 (234.93 GB)
DFS Remaining: 252011880448 (234.70 GB)
DFS Used: 238531645 (227.48 MB)
DFS Used%: 0.09%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:9866 (127.0.0.1)
Hostname: ##################
Decommission Status : Normal
Configured Capacity: 1000203087872 (931.51 GB)
DFS Used: 238531645 (227.48 MB)
Non DFS Used: 747952675779 (696.59 GB)
DFS Remaining: 252011880448 (234.70 GB)
DFS Used%: 0.02%
DFS Remaining%: 25.20%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 28 11:48:14 IST 2020
Last Block Report: Sat Mar 28 11:30:44 IST 2020
Num of Blocks: 248
I have been at this for 2 days now. would appreciate help.
Thanks in advance.

org.apache.hadoop.fs.ParentNotDirectoryException: /tmp (is not a directory)

I am new to Hadoop and running wordCount2 example. however i am getting below error
Exception in thread "main" org.apache.hadoop.fs.ParentNotDirectoryException: /tmp (is not a directory)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkIsDirectory(FSPermissionChecker.java:570)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSimpleTraverse(FSPermissionChecker.java:562)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:537)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1702)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:641)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:51)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1096)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:652)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2474)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2447)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1248)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1245)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1245)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1237)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:161)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:112)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:150)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at WordCount2.main(WordCount2.java:128)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.ParentNotDirectoryException): /tmp (is not a directory)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkIsDirectory(FSPermissionChecker.java:570)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSimpleTraverse(FSPermissionChecker.java:562)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:537)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1702)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:641)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:51)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1096)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:652)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1507)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:583)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2472)
... 23 mor
I can see the /tmp folder both in hdfs command as below
hadoopusr#LAPTOP:~$ hdfs dfs -ls /
19/02/03 11:02:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
drwxr-xr-x - hadoopusr supergroup 0 2019-02-03 08:34 /hadoopinput
drwxr-xr-x - hadoopusr supergroup 0 2019-02-03 08:42 /sampledata
-rwxrwxrwx 1 hadoopusr supergroup 22594 2019-01-29 10:26 /tmp
and even i can access the folder
hadoopusr#LAPTOP:~$ cd /tmp/
hadoopusr#LAPTOP:/tmp$
I have installed hadoop 2.9.2 on Ubuntu 18.04 app in windows 10
As you can see in your ls:
drwxr-xr-x - hadoopusr supergroup 0 2019-02-03 08:34 /hadoopinput
drwxr-xr-x - hadoopusr supergroup 0 2019-02-03 08:42 /sampledata
-rwxrwxrwx 1 hadoopusr supergroup 22594 2019-01-29 10:26 /tmp
/tmp is not a directory because it is not prefixed by a d.
So rm this file (if not important) and make a directory both with the dfs command.

PIG creates file on Hadoop but cannot write to it

I am learning hadoop and created a simple pig script.
Reading a file works, but writing to another file does not.
My script runs fine, the DUMP f command shows me 10 records, as expected. But when I store the same relation to a file (store f into 'result.csv';), there are some odd messages on the console, and for some reason, in the end I have a result file with only the first 3 records.
My questions are:
What's the matter with the IOException, when reading worked and
writing worked at least partly?
Why does the console tell me Total records written : 0, when actually 3 records have been written?
Why didn't it store the 10 records, as expected?
My Script (it's just some sandbox playing)
cd /user/samples
c = load 'crimes.csv' using PigStorage(',')
as (ID:int,Case_Number:int,Date:chararray,Block:chararray,IUCR:chararray,Primary_Type,Description,LocationDescription,Arrest:boolean,Domestic,Beat,District,Ward,CommunityArea,FBICode,XCoordinate,YCoordinate,Year,UpdatedOn,Latitude,Longitude,Location);
c = LIMIT c 1000;
t = foreach c generate ID, Date, Arrest, Year;
f = FILTER t by Arrest==true;
f = LIMIT f 10;
dump f;
store f into 'result.csv';
part of the console output:
2016-07-21 15:55:07,435 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-21 15:55:07,537 [main] WARN org.apache.pig.tools.pigstats.mapreduce.MRJobStats - Unable to get job counters
java.io.IOException: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:132)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addCounters(MRJobStats.java:284)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:235)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:165)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:360)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:308)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1474)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1459)
at org.apache.pig.PigServer.execute(PigServer.java:1448)
at org.apache.pig.PigServer.access$500(PigServer.java:118)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1773)
at org.apache.pig.PigServer.registerQuery(PigServer.java:707)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1075)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:564)
at org.apache.pig.Main.main(Main.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:343)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:428)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:572)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:184)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:126)
... 24 more
Caused by: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy14.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:324)
... 28 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 36 more
2016-07-21 15:55:07,540 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-07-21 15:55:07,571 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.2 0.16.0 hadoop 2016-07-21 15:50:17 2016-07-21 15:55:07 FILTER,LIMIT
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1469130571595_0001 3 1 n/a n/a n/a n/a n/a n/a n/a n/a c
job_1469130571595_0002 1 1 n/a n/a n/a n/a n/a n/a n/a n/a c,f,t hdfs://localhost:9000/user/samples/result.csv,
Input(s):
Successfully read 0 records from: "hdfs://localhost:9000/user/samples/crimes.csv"
Output(s):
Successfully stored 0 records in: "hdfs://localhost:9000/user/samples/result.csv"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1469130571595_0001 -> job_1469130571595_0002,
job_1469130571595_0002
2016-07-21 15:55:07,573 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-21 15:55:07,585 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-21 15:55:08,592 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

error starting my hadoop namenode

I want to implement a Pseudo-Distributed hadoop system on my ubuntu machine.But I cannot start the namenode(others like jobtracker can be started normally).
my start command is :
./hadoop namenode -format
./start-all.sh
I checked the namenode log located in logs/hadoop-mongodb-namenode-mongodb.log
65 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec cl ock time, 1 cycles
66 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
67 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec c lock time, 1 cycles
68 2013-12-25 13:44:39,799 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source FSNamesystemMetrics registered.
69 2013-12-25 13:44:39,809 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
70 2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort9000 registered.
71 2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort9000 registered.
72 2013-12-25 13:44:39,812 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:9000
73 2013-12-25 13:44:39,847 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
74 2013-12-25 13:44:39,878 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
75 2013-12-25 13:44:39,884 INFO org.apache.hadoop.http.HttpServer: dfs.webhdfs.enabled = false
76 2013-12-25 13:44:39,888 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
77 2013-12-25 13:44:39,889 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mongodb cause:java.net.BindException: Address already in use
78 2013-12-25 13:44:39,889 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received InterruptedExceptionjava.lang.InterruptedException: sleep interrupted
79 2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.DecommissionManager: Interrupted Monitor
80 java.lang.InterruptedException: sleep interrupted
81 at java.lang.Thread.sleep(Native Method)
82 at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
83 at java.lang.Thread.run(Thread.java:701)
84 2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number o f syncs: 0 SyncTimes(ms): 0
85 2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: closing edit log: position=4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
86 2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: close success: truncate to 4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
87 2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
88 2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
89 2013-12-25 13:44:39,909 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Address already in use
90 at sun.nio.ch.Net.bind0(Native Method)
91 at sun.nio.ch.Net.bind(Net.java:174)
92 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
93 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
94 at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
95 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:602)
96 at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:517)
97 at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:395)
98 at java.security.AccessController.doPrivileged(Native Method)
99 at javax.security.auth.Subject.doAs(Subject.java:416)
100 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
101 at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:395)
102 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:337)
103 at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
104 at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
105 at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
106
107 2013-12-25 13:44:39,910 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
108 /************************************************************
109 SHUTDOWN_MSG: Shutting down NameNode at mongodb/192.168.10.2
110 ************************************************************/
110,1 Bot
63 2013-12-25 13:44:39,796 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
64 2013-12-25 13:44:39,796 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
65 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec cl ock time, 1 cycles
66 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
67 2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec c lock time, 1 cycles
68 2013-12-25 13:44:39,799 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source FSNamesystemMetrics registered.
69 2013-12-25 13:44:39,809 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
70 2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort9000 registered.
71 2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort9000 registered.
72 2013-12-25 13:44:39,812 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:9000
73 2013-12-25 13:44:39,847 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
74 2013-12-25 13:44:39,878 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
75 2013-12-25 13:44:39,884 INFO org.apache.hadoop.http.HttpServer: dfs.webhdfs.enabled = false
76 2013-12-25 13:44:39,888 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
77 2013-12-25 13:44:39,889 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mongodb cause:java.net.BindException: Address already in use
78 2013-12-25 13:44:39,889 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received InterruptedExceptionjava.lang.InterruptedException: sleep interrupted
79 2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.DecommissionManager: Interrupted Monitor
80 java.lang.InterruptedException: sleep interrupted
81 at java.lang.Thread.sleep(Native Method)
82 at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
83 at java.lang.Thread.run(Thread.java:701)
84 2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number o f syncs: 0 SyncTimes(ms): 0
85 2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: closing edit log: position=4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
86 2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: close success: truncate to 4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
87 2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
88 2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
89 2013-12-25 13:44:39,909 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Address already in use
90 at sun.nio.ch.Net.bind0(Native Method)
91 at sun.nio.ch.Net.bind(Net.java:174)
92 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
93 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
94 at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
95 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:602)
96 at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:517)
97 at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:395)
98 at java.security.AccessController.doPrivileged(Native Method)
99 at javax.security.auth.Subject.doAs(Subject.java:416)
100 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
101 at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:395)
102 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:337)
103 at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
104 at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
105 at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
106
107 2013-12-25 13:44:39,910 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
This is the error message.It seems obviously , port number went wrong!
And below is my conf file:
core-site.xml
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
3 <configuration>
4 <property>
5 <name>fs.default.name</name>
6 <value>hdfs://localhost:9000</value>
7 </property>
8 </configuration>
hdfs-site.xml
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
3
4 <!-- Put site-specific property overrides in this file. -->
5 <configuration>
6 <property>
7 <name>dfs.replication</name>
8 <value>1</value>
9 </property>
10
11 <property>
12 <name>dfs.name.dir</name>
13 <value>/var/hadoop/hadoop-1.2.1/dfs.name.dir</value>
14 </property>
15
16 <property>
17 <name>dfs.data.dir</name>
18 <value>/var/hadoop/hadoop-1.2.1/dfs.data.dir</value>
19 </property>
20 </configuration>
No matter how I change the port to others and restart hadoop ,the error exists all the same!
Anyone can help me ?
Try to remove hdfs data directory and instead of formatting namenode before starting hdfs, start hdfs first and check jps output. If everything was OK, then try to format namenode and recheck. If still there was a problem give me the log details.
P.S: Do not kill the processes. Just use stop-all.sh or whatever you should to stop hadoop.
The datanode on one of the slave machines in my cluster was throwing a similar exception with port binding:
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Address already in use
I noticed that the default web interface port of datanode i.e. 50075 was already bind to another application:
[ap2]-> netstat -an | grep -i 50075
tcp 0 0 10.0.1.1:45674 10.0.1.1:50075 ESTABLISHED
tcp 0 0 10.0.1.1:50075 10.0.1.1:45674 ESTABLISHED
[ap2]->
I have changed the Datanode web interface in conf/hdfs-site.xml:
<property>
<name>dfs.datanode.http.address</name>
<value>10.0.1.1:50080</value>
<description>Datanode http port</description>
</property>
This helped to resolve the issue, similarly you can change the default address and port where the web interface listens by setting dfs.http.address in conf/hadoop-site.xml, e.g. localhost:9090, but ensure that port is available.

Resources