Error starting my Hadoop NameNode - hadoop
I want to set up a pseudo-distributed Hadoop system on my Ubuntu machine, but I cannot start the NameNode (the other daemons, such as the JobTracker, start normally).
My start commands are:
./hadoop namenode -format
./start-all.sh
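(A minimal sanity check, not from the original post: jps, which ships with the JDK, lists the Hadoop daemons that actually came up after start-all.sh.)
jps
# a healthy pseudo-distributed Hadoop 1.x install normally shows something like:
# NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker (plus Jps itself)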
I checked the NameNode log located at logs/hadoop-mongodb-namenode-mongodb.log:
2013-12-25 13:44:39,796 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2013-12-25 13:44:39,796 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec clock time, 1 cycles
2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
2013-12-25 13:44:39,797 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec clock time, 1 cycles
2013-12-25 13:44:39,799 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source FSNamesystemMetrics registered.
2013-12-25 13:44:39,809 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort9000 registered.
2013-12-25 13:44:39,810 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort9000 registered.
2013-12-25 13:44:39,812 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:9000
2013-12-25 13:44:39,847 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-12-25 13:44:39,878 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-12-25 13:44:39,884 INFO org.apache.hadoop.http.HttpServer: dfs.webhdfs.enabled = false
2013-12-25 13:44:39,888 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2013-12-25 13:44:39,889 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mongodb cause:java.net.BindException: Address already in use
2013-12-25 13:44:39,889 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received InterruptedExceptionjava.lang.InterruptedException: sleep interrupted
2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.DecommissionManager: Interrupted Monitor
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
    at java.lang.Thread.run(Thread.java:701)
2013-12-25 13:44:39,890 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: closing edit log: position=4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
2013-12-25 13:44:39,905 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: close success: truncate to 4, editlog=/var/hadoop/hadoop-1.2.1/dfs.name.dir/current/edits
2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
2013-12-25 13:44:39,909 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2013-12-25 13:44:39,909 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:174)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
    at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
    at org.apache.hadoop.http.HttpServer.start(HttpServer.java:602)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:517)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:395)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:395)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:337)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)

2013-12-25 13:44:39,910 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at mongodb/192.168.10.2
************************************************************/
This is the error message. It seems obvious that the port number is the problem!
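(The log shows the RPC server on 9000 coming up and the failure happening while opening the HTTP listener on 50070, so something is already bound to that web UI port. A minimal sketch for finding the owner, using standard Linux tools and not taken from the original post; the -p flag usually needs root.)
sudo netstat -tlnp | grep -E ':50070|:9000'
# or, per port:
sudo lsof -i :50070
# a leftover NameNode from an earlier start is a common owner; stop it with stop-all.sh rather than killing it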
And below are my conf files:
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>/var/hadoop/hadoop-1.2.1/dfs.name.dir</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/var/hadoop/hadoop-1.2.1/dfs.data.dir</value>
  </property>
</configuration>
No matter which other port I change it to before restarting Hadoop, the error stays exactly the same!
Can anyone help me?
Try removing the HDFS data directory, and instead of formatting the namenode before starting HDFS, start HDFS first and check the jps output. If everything is OK, then try formatting the namenode and recheck. If there is still a problem, post the log details.
P.S.: Do not kill the processes. Just use stop-all.sh, or whatever you normally use, to stop Hadoop.
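(A rough sketch of that sequence, assuming the Hadoop 1.2.1 bin directory used in the question and the dfs.data.dir path from the question's hdfs-site.xml; adjust the paths to your own layout.)
./stop-all.sh                                     # stop all daemons cleanly
rm -rf /var/hadoop/hadoop-1.2.1/dfs.data.dir/*    # clear the HDFS data directory
./start-dfs.sh                                    # start HDFS without formatting
jps                                               # confirm NameNode/DataNode/SecondaryNameNode are up
./stop-dfs.sh                                     # only if that worked: stop, format, then start again
./hadoop namenode -format
./start-all.sh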
The datanode on one of the slave machines in my cluster was throwing a similar exception with port binding:
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Address already in use
I noticed that the default web interface port of the datanode, i.e. 50075, was already bound to another application:
[ap2]-> netstat -an | grep -i 50075
tcp 0 0 10.0.1.1:45674 10.0.1.1:50075 ESTABLISHED
tcp 0 0 10.0.1.1:50075 10.0.1.1:45674 ESTABLISHED
[ap2]->
I changed the datanode web interface port in conf/hdfs-site.xml:
<property>
  <name>dfs.datanode.http.address</name>
  <value>10.0.1.1:50080</value>
  <description>Datanode http port</description>
</property>
This helped resolve the issue. Similarly, you can change the default address and port on which the web interface listens by setting dfs.http.address in conf/hadoop-site.xml, e.g. localhost:9090, but make sure that port is available.
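(Before settling on a replacement port such as 50080 or 9090, it is worth confirming it is actually free; a minimal check in the same netstat style as above, not part of the original answer.)
netstat -an | grep 50080    # no output means nothing is using the port
netstat -an | grep 9090
Any LISTEN or ESTABLISHED lines mean the port is already taken and the daemon would hit the same BindException.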
Related
hbase import module doesn't succeed
I have to move some hbase tables from one hadoop cluster to another. I have extracted the tables using bin/hbase org.apache.hadoop.hbase.mapreduce.Export \ <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]] and I've put the return files into HDFS on my new cluster. But when I try bin/hbase org.apache.hadoop.hbase.mapreduce.Import , I have the strange following logs: hadoop#edgenode:~$ hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Import ADCP /hbase/backup_hbase/ADCP/2022-07-04_1546/ADCP/ SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory] 2022-10-03 11:19:09,689 INFO [main] mapreduce.Import: writing directly to table from Mapper. 2022-10-03 11:19:09,847 INFO [main] client.RMProxy: Connecting to ResourceManager at /172.16.42.42:8032 2022-10-03 11:19:09,983 INFO [main] Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 02/10/2020 11:30 GMT 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:host.name=edgenode 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_342 2022-10-03 11:19:10,044 INFO 
[ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.vendor=Private Build 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: hadoop-yarn-client-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-services-core-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-common-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-router-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-registry-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-3.3.3.jar 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop/lib/native 2022-10-03 11:19:10,044 INFO 
[ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.name=Linux 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.arch=amd64 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.version=5.15.0-1018-kvm 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.name=hadoop 2022-10-03 11:19:10,044 INFO 
[ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.free=174MB 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.max=3860MB 2022-10-03 11:19:10,045 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.total=237MB 2022-10-03 11:19:10,048 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Initiating client connection, 
connectString=namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$15/257950720#1124fc36 2022-10-03 11:19:10,054 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] common.X509Util: Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation 2022-10-03 11:19:10,061 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxnSocket: jute.maxbuffer value is 4194304 Bytes 2022-10-03 11:19:10,069 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxn: zookeeper.request.timeout value is 0. feature enabled= 2022-10-03 11:19:10,077 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Opening socket connection to server namenode/172.16.42.42:2181. 
Will not attempt to authenticate using SASL (unknown error) 2022-10-03 11:19:10,084 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.16.42.187:48598, server: namenode/172.16.42.42:2181 2022-10-03 11:19:10,120 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Session establishment complete on server namenode/172.16.42.42:2181, sessionid = 0x1b000002cb790005, negotiated timeout = 40000 2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Session: 0x1b000002cb790005 closed 2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x1b000002cb790005 2022-10-03 11:19:15,366 INFO [main] input.FileInputFormat: Total input files to process : 32 2022-10-03 11:19:15,660 INFO [main] mapreduce.JobSubmitter: number of splits:32 2022-10-03 11:19:15,902 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1664271607293_0002 2022-10-03 11:19:16,225 INFO [main] conf.Configuration: resource-types.xml not found 2022-10-03 11:19:16,225 INFO [main] resource.ResourceUtils: Unable to find 'resource-types.xml'. 
2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE 2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE 2022-10-03 11:19:16,293 INFO [main] impl.YarnClientImpl: Submitted application application_1664271607293_0002 2022-10-03 11:19:16,328 INFO [main] mapreduce.Job: The url to track the job: http://namenode:8088/proxy/application_1664271607293_0002/ 2022-10-03 11:19:16,329 INFO [main] mapreduce.Job: Running job: job_1664271607293_0002 2022-10-03 11:19:31,513 INFO [main] mapreduce.Job: Job job_1664271607293_0002 running in uber mode : false 2022-10-03 11:19:31,514 INFO [main] mapreduce.Job: map 0% reduce 0% 2022-10-03 11:19:31,534 INFO [main] mapreduce.Job: 2-10-03 11:19:31.345]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. [2022-10-03 11:19:31.346]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. For more detailed output, check the application tracking page: http://namenode:8088/cluster/app/application_1664271607293_0002 Then click on links to logs of each attempt. . Failing the application. 2022-10-03 11:19:31,552 INFO [main] mapreduce.Job: Counters: 0 I don't understand what the problem could be. I went to http://namenode:8088/cluster/app/application_1664271607293_0002 but i didn't found nothing interesting. I've tried the command with different tables but get the same result. The two clusters are not one the same version but I read that it wasn't a problem. Every service works well on my clusters and I can use hbase commands on the hbase shell without any problem. Also, map reduce programs works well on my new cluster. I've also tested the copyTable and snapchot methods for the data migration, which didn't work either. Any idea of what should be the problem? Thanks! :) update : I found this on a datanode syslog in the hadoop web interface, may be useful? 2022-10-04 14:12:39,341 INFO [main] org.apache.hadoop.security.SecurityUtil: Updating Configuration 2022-10-04 14:12:39,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2022-10-04 14:12:39,493 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 7 cluster_timestamp: 1664271607293 } attemptId: 2 } keyId: -896624238) 2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.conf.Configuration: resource-types.xml not found 2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'. 
2022-10-04 14:12:39,636 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73) at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36) at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70) ... 10 more Caused by: java.lang.VerifyError: Bad type on operand stack Exception Details: Location: org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual Reason: Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage' Current Frame: bci: #36 flags: { } locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } Bytecode: 0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7 0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a 0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5 0x0000030: 0013 2ab0 Stackmap Table: same_frame(#19) same_frame(#31) same_frame(#40) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37) ... 
15 more 2022-10-04 14:12:39,641 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73) at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36) at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70) ... 10 more Caused by: java.lang.VerifyError: Bad type on operand stack Exception Details: Location: org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual Reason: Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage' Current Frame: bci: #36 flags: { } locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } Bytecode: 0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7 0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a 0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5 0x0000030: 0013 2ab0 Stackmap Table: same_frame(#19) same_frame(#31) same_frame(#40) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37) ... 15 more 2022-10-04 14:12:39,643 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
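(Not from the original post: a java.lang.VerifyError complaining that YarnProtos$ApplicationIdProto is not assignable to com.google.protobuf.GeneratedMessage usually indicates mismatched jars on the classpath, for example HBase's bundled Hadoop/protobuf jars clashing with the cluster's Hadoop 3.3.3 jars. A hedged way to look for the clash is to dump both classpaths and compare versions.)
hbase classpath | tr ':' '\n' | grep -Ei 'protobuf|hadoop-(common|yarn|mapreduce)' | sort -u
hadoop classpath | tr ':' '\n' | grep -Ei 'protobuf|hadoop-(common|yarn|mapreduce)' | sort -u
# duplicate or differently-versioned jars across the two lists are the usual suspects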
Pig creates a file on Hadoop but cannot write to it
I am learning hadoop and created a simple pig script. Reading a file works, but writing to another file does not. My script runs fine, the DUMP f command shows me 10 records, as expected. But when I store the same relation to a file (store f into 'result.csv';), there are some odd messages on the console, and for some reason, in the end I have a result file with only the first 3 records. My questions are: What's the matter with the IOException, when reading worked and writing worked at least partly? Why does the console tell me Total records written : 0, when actually 3 records have been written? Why didn't it store the 10 records, as expected? My Script (it's just some sandbox playing) cd /user/samples c = load 'crimes.csv' using PigStorage(',') as (ID:int,Case_Number:int,Date:chararray,Block:chararray,IUCR:chararray,Primary_Type,Description,LocationDescription,Arrest:boolean,Domestic,Beat,District,Ward,CommunityArea,FBICode,XCoordinate,YCoordinate,Year,UpdatedOn,Latitude,Longitude,Location); c = LIMIT c 1000; t = foreach c generate ID, Date, Arrest, Year; f = FILTER t by Arrest==true; f = LIMIT f 10; dump f; store f into 'result.csv'; part of the console output: 2016-07-21 15:55:07,435 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-07-21 15:55:07,537 [main] WARN org.apache.pig.tools.pigstats.mapreduce.MRJobStats - Unable to get job counters java.io.IOException: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:132) at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addCounters(MRJobStats.java:284) at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:235) at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:165) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:360) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:308) at org.apache.pig.PigServer.launchPlan(PigServer.java:1474) at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1459) at org.apache.pig.PigServer.execute(PigServer.java:1448) at org.apache.pig.PigServer.access$500(PigServer.java:118) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1773) at org.apache.pig.PigServer.registerQuery(PigServer.java:707) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1075) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66) at org.apache.pig.Main.run(Main.java:564) at org.apache.pig.Main.main(Main.java:176) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:343) at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:428) at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:572) at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:184) at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:126) ... 24 more Caused by: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) at org.apache.hadoop.ipc.Client.call(Client.java:1479) at org.apache.hadoop.ipc.Client.call(Client.java:1412) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy14.getJobReport(Unknown Source) at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:324) ... 28 more Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712) at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528) at org.apache.hadoop.ipc.Client.call(Client.java:1451) ... 36 more 2016-07-21 15:55:07,540 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 2016-07-21 15:55:07,571 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 2.7.2 0.16.0 hadoop 2016-07-21 15:50:17 2016-07-21 15:55:07 FILTER,LIMIT Success! 
Job Stats (time in seconds): JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs job_1469130571595_0001 3 1 n/a n/a n/a n/a n/a n/a n/a n/a c job_1469130571595_0002 1 1 n/a n/a n/a n/a n/a n/a n/a n/a c,f,t hdfs://localhost:9000/user/samples/result.csv, Input(s): Successfully read 0 records from: "hdfs://localhost:9000/user/samples/crimes.csv" Output(s): Successfully stored 0 records in: "hdfs://localhost:9000/user/samples/result.csv" Counters: Total records written : 0 Total bytes written : 0 Spillable Memory Manager spill count : 0 Total bags proactively spilled: 0 Total records proactively spilled: 0 Job DAG: job_1469130571595_0001 -> job_1469130571595_0002, job_1469130571595_0002 2016-07-21 15:55:07,573 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032 2016-07-21 15:55:07,585 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2016-07-21 15:55:08,592 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
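(Not from the original post: the ConnectException to 0.0.0.0:10020 comes from Pig trying to read job counters from the MapReduce JobHistory server after the job finished; 0.0.0.0:10020 is the fallback when mapreduce.jobhistory.address is unset and no history server is running, and the missing counters are likely why the summary reports 0 records written even though the job itself succeeded. A minimal sketch of the usual check and fix on Hadoop 2.7.x.)
netstat -tln | grep 10020                                       # is a history server listening?
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver   # if not, start the one bundled with Hadoop
# then set mapreduce.jobhistory.address (host:10020) in mapred-site.xml so clients stop falling back to 0.0.0.0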
Hadoop 2.6.4 MR job quick freeze
Hadoop 2.6.4: 1 master + 2 slaves on AWS EC2 master: namenode, secondary namenode, resource manager slave: datanode, node manager When running a test MR job (wordcount), it freezes right away: hduser#ip-172-31-4-108:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /data/shakespeare /data/out1 16/03/21 10:45:19 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-4-108/172.31.4.108:8032 16/03/21 10:45:21 INFO input.FileInputFormat: Total input paths to process : 5 16/03/21 10:45:21 INFO mapreduce.JobSubmitter: number of splits:5 16/03/21 10:45:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458556970596_0001 16/03/21 10:45:22 INFO impl.YarnClientImpl: Submitted application application_1458556970596_0001 16/03/21 10:45:22 INFO mapreduce.Job: The url to track the job: http://ip-172-31-4-108:8088/proxy/application_1458556970596_0001/ 16/03/21 10:45:22 INFO mapreduce.Job: Running job: job_1458556970596_0001 When running start-dfs.sh and start-yarn.sh on master, all daemons run succesfully (jps command) on corresponding EC2 instance. Below Resource Manager log when launching MR job: 2016-03-21 10:45:20,152 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 1 2016-03-21 10:45:22,784 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user hduser 2016-03-21 10:45:22,785 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1458556970596_0001 2016-03-21 10:45:22,787 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hduser IP=172.31.4.108 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1458556970596_0001 2016-03-21 10:45:22,788 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from NEW to NEW_SAVING 2016-03-21 10:45:22,805 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1458556970596_0001 2016-03-21 10:45:22,807 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from NEW_SAVING to SUBMITTED 2016-03-21 10:45:22,809 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1458556970596_0001 user: hduser leaf-queue of parent: root #applications: 1 2016-03-21 10:45:22,810 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1458556970596_0001 from user: hduser, in queue: default 2016-03-21 10:45:22,825 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from SUBMITTED to ACCEPTED 2016-03-21 10:45:22,866 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1458556970596_0001_000001 2016-03-21 10:45:22,867 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1458556970596_0001_000001 State change from NEW to SUBMITTED 2016-03-21 10:45:22,896 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue, it is likely set too low. 
skipping enforcement to allow at least one application to start 2016-03-21 10:45:22,896 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue for user, it is likely set too low. skipping enforcement to allow at least one application to start 2016-03-21 10:45:22,897 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1458556970596_0001 from user: hduser activated in queue: default 2016-03-21 10:45:22,898 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1458556970596_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User#1d51055, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1 2016-03-21 10:45:22,898 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1458556970596_0001_000001 to scheduler from user hduser in queue default 2016-03-21 10:45:22,900 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1458556970596_0001_000001 State change from SUBMITTED to SCHEDULED Below NameNode log when launching MR job: 2016-03-21 10:45:03,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-03-21 10:45:03,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-03-21 10:45:20,613 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 3 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 7 2016-03-21 10:45:20,760 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar. 
BP-1804768821-172.31.4.108-1458553823105 blk_1073741834_1010{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} 2016-03-21 10:45:21,290 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1073741834_1010{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} has not reached minimal replication 1 2016-03-21 10:45:21,292 INFO org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream: Nothing to flush 2016-03-21 10:45:21,297 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741834_1010{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 270356 2016-03-21 10:45:21,297 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741834_1010 size 270356 2016-03-21 10:45:21,706 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar is closed by DFSClient_NONMAPREDUCE_-18612056_1 2016-03-21 10:45:21,714 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar 2016-03-21 10:45:21,812 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split 2016-03-21 10:45:21,823 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split. 
BP-1804768821-172.31.4.108-1458553823105 blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]} 2016-03-21 10:45:21,849 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]} size 0 2016-03-21 10:45:21,853 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]} size 0 2016-03-21 10:45:21,855 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split is closed by DFSClient_NONMAPREDUCE_-18612056_1 2016-03-21 10:45:21,865 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.splitmetainfo. BP-1804768821-172.31.4.108-1458553823105 blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} 2016-03-21 10:45:21,876 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0 2016-03-21 10:45:21,877 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0 2016-03-21 10:45:21,880 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_-18612056_1 2016-03-21 10:45:22,277 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.xml. 
BP-1804768821-172.31.4.108-1458553823105 blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} 2016-03-21 10:45:22,327 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0 2016-03-21 10:45:22,328 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0 2016-03-21 10:45:22,332 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.xml is closed by DFSClient_NONMAPREDUCE_-18612056_1 2016-03-21 10:45:33,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds 2016-03-21 10:45:33,747 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-03-21 10:46:03,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds 2016-03-21 10:46:03,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-03-21 10:46:33,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds 2016-03-21 10:46:33,749 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-03-21 10:47:03,749 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-03-21 10:47:03,750 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
Any ideas? Thank you in advance for your support. The content of my *-site.xml files is below. Note: I have applied some values from my sizing calculations to several properties, but I had the EXACT SAME issue with a minimal configuration (only the mandatory properties set).
core-site.xml
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://ip-172-31-4-108:8020</value></property>
</configuration>

hdfs-site.xml
<configuration>
  <property><name>dfs.replication</name><value>2</value></property>
  <property><name>dfs.namenode.name.dir</name><value>file:///xvda1/dfs/nn</value></property>
  <property><name>dfs.datanode.data.dir</name><value>file:///xvda1/dfs/dn</value></property>
</configuration>

mapred-site.xml
<configuration>
  <property><name>mapreduce.jobhistory.address</name><value>ip-172-31-4-108:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>ip-172-31-4-108:19888</value></property>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.map.memory.mb</name><value>512</value></property>
  <property><name>mapreduce.reduce.memory.mb</name><value>1024</value></property>
  <property><name>mapreduce.map.java.opts</name><value>410</value></property>
  <property><name>mapreduce.reduce.java.opts</name><value>820</value></property>
</configuration>

yarn-site.xml
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.resourcemanager.hostname</name><value>ip-172-31-4-108</value></property>
  <property><name>yarn.nodemanager.local-dirs</name><value>file:///xvda1/nodemgr/local</value></property>
  <property><name>yarn.nodemanager.log-dirs</name><value>/var/log/hadoop-yarn/containers</value></property>
  <property><name>yarn.nodemanager.remote-app-log-dir</name><value>/var/log/hadoop-yarn/apps</value></property>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.app.mapreduce.am.resource.mb</name><value>1024</value></property>
  <property><name>yarn.app.mapreduce.am.command-opts</name><value>820</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>6291456</value></property>
  <property><name>yarn.scheduler.minimum_allocation-mb</name><value>524288</value></property>
  <property><name>yarn.scheduler.maximum_allocation-mb</name><value>6291456</value></property>
</configuration>
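For reference, a minimal sketch of how these memory-related entries are usually written (the figures below are placeholders for illustration, not tuned recommendations): the *-mb properties are interpreted as megabytes, the scheduler allocation keys are normally spelled yarn.scheduler.minimum-allocation-mb / yarn.scheduler.maximum-allocation-mb (with dashes), and the *.java.opts / am.command-opts entries take JVM flags such as -Xmx410m rather than bare numbers.

yarn-site.xml (sketch)
<configuration>
  <!-- total memory a NodeManager may hand out, in MB (placeholder value) -->
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>6144</value></property>
  <!-- smallest and largest single container, in MB (placeholder values) -->
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>6144</value></property>
</configuration>

mapred-site.xml (sketch)
<configuration>
  <!-- JVM options, not bare numbers -->
  <property><name>mapreduce.map.java.opts</name><value>-Xmx410m</value></property>
  <property><name>mapreduce.reduce.java.opts</name><value>-Xmx820m</value></property>
  <property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx820m</value></property>
</configuration>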
Some map() tasks fail when I run the job on AWS
I was running PageRank on the s3://aws-publicdatasets/common-crawl/parse-output/segment/1346876860819/metadata-XXXX dataset. The program worked when I used 10 files (about 1 GB) with 2 m1.medium instances, but when I used 300 files (20 GB) with 5 m3.xlarge instances, it failed at map 39%, reduce 4%. Could you please help me find the possible reason for the failure? Here are the logs.
stderr: AttemptID:attempt_1411372099942_0001_m_000010_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000014_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000015_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000057_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000103_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000094_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000109_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000108_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000133_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000136_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000010_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000151_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000014_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000168_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000167_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000015_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000174_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000175_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000057_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000181_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000182_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000190_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000103_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000109_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000094_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000200_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000108_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000133_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000199_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000136_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000010_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000151_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000206_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000207_0 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000014_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000168_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000175_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000167_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000174_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000015_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000057_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000181_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000182_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000190_1 
Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000103_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000094_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000200_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000109_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000108_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000133_2 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000199_1 Timed out after 600 secs AttemptID:attempt_1411372099942_0001_m_000136_2 Timed out after 600 secs part of syslog: 08:24:24,791 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000168_1, Status : FAILED 2014-09-22 08:24:46,873 INFO org.apache.hadoop.mapreduce.Job (main): map 39% reduce 4% 2014-09-22 08:24:54,903 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000175_1, Status : FAILED 2014-09-22 08:24:54,904 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000167_1, Status : FAILED 2014-09-22 08:24:54,904 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000174_1, Status : FAILED 2014-09-22 08:24:55,908 INFO org.apache.hadoop.mapreduce.Job (main): map 38% reduce 4% 2014-09-22 08:25:13,968 INFO org.apache.hadoop.mapreduce.Job (main): map 39% reduce 4% 2014-09-22 08:25:25,007 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000015_2, Status : FAILED 2014-09-22 08:26:24,210 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000057_2, Status : FAILED 2014-09-22 08:26:54,322 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000181_1, Status : FAILED 2014-09-22 08:27:24,432 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000182_1, Status : FAILED 2014-09-22 08:27:25,435 INFO org.apache.hadoop.mapreduce.Job (main): map 38% reduce 4% 2014-09-22 08:27:54,543 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000190_1, Status : FAILED 2014-09-22 08:28:54,751 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000103_2, Status : FAILED 2014-09-22 08:29:24,851 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000094_2, Status : FAILED 2014-09-22 08:29:24,852 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000200_1, Status : FAILED 2014-09-22 08:29:24,853 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000109_2, Status : FAILED 2014-09-22 08:29:48,931 INFO org.apache.hadoop.mapreduce.Job (main): map 39% reduce 4% 2014-09-22 08:29:54,954 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000108_2, Status : FAILED 2014-09-22 08:30:24,066 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000133_2, Status : FAILED 2014-09-22 08:32:54,599 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000199_1, Status : FAILED 2014-09-22 08:32:54,600 INFO org.apache.hadoop.mapreduce.Job (main): Task Id : attempt_1411372099942_0001_m_000136_2, Status : FAILED 2014-09-22 08:34:25,910 INFO org.apache.hadoop.mapreduce.Job (main): map 100% reduce 100% 2014-09-22 08:34:25,915 INFO org.apache.hadoop.mapreduce.Job (main): Job job_1411372099942_0001 failed with state FAILED due to: Task 
failed task_1411372099942_0001_m_000010 Job failed as tasks failed. failedMaps:1 failedReduces:0 Attempts for: s-1W7C8YIFC87Y8, Job 1411372099942_0001, Task 2014-09-22 08:18:27,238 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-09-22 08:18:27,322 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-09-22 08:18:28,462 INFO main org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2014-09-22 08:18:28,496 INFO main org.apache.hadoop.metrics2.sink.cloudwatch.CloudWatchSink: Initializing the CloudWatchSink for metrics. 2014-09-22 08:18:28,795 INFO main org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink file started 2014-09-22 08:18:28,967 INFO main org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 300 second(s). 2014-09-22 08:18:28,967 INFO main org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 2014-09-22 08:18:28,982 INFO main org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2014-09-22 08:18:28,983 INFO main org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1411372099942_0001, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier#3fc15856) 2014-09-22 08:18:29,157 INFO main org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2014-09-22 08:18:29,880 INFO main org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1411372099942_0001,/mnt1/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1411372099942_0001,/mnt2/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1411372099942_0001 2014-09-22 08:18:30,164 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-09-22 08:18:30,182 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-09-22 08:18:31,063 INFO main org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 2014-09-22 08:18:32,100 INFO main org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ] 2014-09-22 08:18:32,605 INFO main org.apache.hadoop.mapred.MapTask: Processing split: s3://aws-publicdatasets/common-crawl/parse-output/segment/1346876860819/metadata-00122:0+67108864 2014-09-22 08:18:32,810 INFO main amazon.emr.metrics.MetricsSaver: MetricsSaver YarnChild root:hdfs:///mnt/var/em/ period:120 instanceId:i-ec84e7c1 jobflow:j-27XODJ8WMW4VP 2014-09-22 08:18:33,205 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-09-22 08:18:33,219 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-09-22 08:18:33,221 INFO main com.amazon.ws.emr.hadoop.fs.guice.EmrFSBaseModule: Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as FileSystem implementation. 
2014-09-22 08:18:35,024 INFO main com.amazon.ws.emr.hadoop.fs.EmrFileSystem: Using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation 2014-09-22 08:18:36,001 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-09-22 08:18:36,002 WARN main org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-09-22 08:18:36,024 INFO main org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 2014-09-22 08:18:36,514 INFO main org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 52428796(209715184) 2014-09-22 08:18:36,514 INFO main org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 200 2014-09-22 08:18:36,514 INFO main org.apache.hadoop.mapred.MapTask: soft limit at 167772160 2014-09-22 08:18:36,514 INFO main org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 209715200 2014-09-22 08:18:36,514 INFO main org.apache.hadoop.mapred.MapTask: kvstart = 52428796; length = 13107200 2014-09-22 08:18:36,597 INFO main com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem: Opening 's3://aws-publicdatasets/common-crawl/parse-output/segment/1346876860819/metadata-00122' for reading 2014-09-22 08:18:36,716 INFO main org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library 2014-09-22 08:18:36,720 INFO main org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor ht t p: //. gz 2014-09-22 08:18:36,726 INFO main org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor 2014-09-22 08:18:36,726 INFO main org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor 2014-09-22 08:18:36,727 INFO main org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor Edited by: paraxx on Sep 22, 2014 10:25 AM
task_1411372099942_0001_m_000010 has timed out. Try increasing the task timeout configuration parameter, for example: mapreduce.task.timeout=12000000
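A minimal sketch of where that setting can live, assuming you can edit the job's mapred-site.xml (the value is in milliseconds; the 600-second default matches the "Timed out after 600 secs" messages above):

mapred-site.xml (sketch)
<configuration>
  <!-- kill a task attempt that neither reads input, writes output, nor reports
       status for this many milliseconds; the default is 600000 (600 s) -->
  <property><name>mapreduce.task.timeout</name><value>12000000</value></property>
</configuration>

It can also be passed per job, e.g. hadoop jar myjob.jar MyDriver -D mapreduce.task.timeout=12000000 ... (myjob.jar and MyDriver are placeholders, and the -D form is only picked up if the driver uses ToolRunner/GenericOptionsParser). Keep in mind that a longer timeout only helps if the mappers are slow rather than genuinely hung.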
Pig Not Making Any Progress
I just wrote my first Pig script, and it does not seem to be making any progress. Some background info: I'm running CDH4.5 on a CentOS 6.4 VM, all installed from Cloudera's yum repo, and it is all configured to run in pseudo-distributed mode. Everything is running as a service and appears to be configured correctly (thank heaven!). Here is my Pig script:

A = LOAD '/user/msknapp/county_insurance_pp.txt' AS (fips:int,st:chararray,stfips:int,name:chararray,a:int,b:int,c:int,d:int,e:int,f:int,g:int);
DUMP A;

The input file was taken from data.gov; it's some insurance data that I pre-processed. Here is some useful info:

[msknapp#localhost data]$ cat county_insurance_pp.txt | grep BUTLER
1013 AL 1 BUTLER 54480 129 3287 57895
19023 IA 19 BUTLER 27291 29659 3386 25150 85486
20015 KS 20 BUTLER 233855 10028 456 29278 5759 279376
21031 KY 21 BUTLER 4164 453 4617
29023 MO 29 BUTLER 48240 5217 738 2042 25081 81317
31023 NE 31 BUTLER 4406 153 609 5168
39017 OH 39 BUTLER 856205 103041 3854 38648 203328 19832 1224910
42019 PA 42 BUTLER 1072941 19131 190 60648 68692 50230 1271832

[msknapp#localhost data]$ hadoop fs -cat /user/msknapp/county_insurance_pp.txt | head
1001 AL 1 AUTAUGA 215624 37156 46 130 53237 140420 446614
1003 AL 1 BALDWIN 1060297 95925 3284 31096 99241 200581 1490424
1005 AL 1 BARBOUR 37893 132 246 811 39082
1007 AL 1 BIBB 3127 70 241 34403 37841
1009 AL 1 BLOUNT 32311 135 11884 19392 4200 67922
1011 AL 1 BULLOCK 4301 336 274 186 5098
1013 AL 1 BUTLER 54480 129 3287 57895
1015 AL 1 CALHOUN 469959 92702 5373 2130 17069 532033 1119265
1017 AL 1 CHAMBERS 37238 3189 292 1953 42672
1019 AL 1 CHEROKEE 37984 190 117 1081 1277 40649
cat: Unable to write to output stream.

When I run the Pig script on the command line I get a whole bunch of log statements, and it looks like it is running, but once it starts it never makes any progress, no matter how long I wait. These are the last couple of lines:

2014-01-05 15:10:41,113 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1388936205793_0006
2014-01-05 15:10:41,511 [JobControl] INFO org.apache.hadoop.yarn.client.YarnClientImpl - Submitted application application_1388936205793_0006 to ResourceManager at /0.0.0.0:8032
2014-01-05 15:10:41,564 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8088/proxy/application_1388936205793_0006/
2014-01-05 15:10:41,653 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete

I modified the Pig script to point to the file on my local filesystem and ran it in local mode, and the job finished successfully in seconds. The local copy of the file is identical to the one in HDFS. I think for some reason Pig can't make a solid connection to my HDFS. Would somebody please tell me what I'm doing wrong?
Maybe try:

A = LOAD '/user/msknapp/county_insurance_pp.txt' USING PigStorage('\t') AS (fips:int,st:chararray,stfips:int,name:chararray,a:int,b:int,c:int,d:int,e:int,f:int,g:int);
DUMP A;
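If the delimiter or schema is the worry, a quick sanity check that avoids dumping the whole relation is to DESCRIBE it and DUMP a small LIMIT — a sketch for the Grunt shell using the same path and schema as above (DESCRIBE, LIMIT and DUMP are standard Pig operators):

grunt> A = LOAD '/user/msknapp/county_insurance_pp.txt' USING PigStorage('\t') AS (fips:int,st:chararray,stfips:int,name:chararray,a:int,b:int,c:int,d:int,e:int,f:int,g:int);
grunt> DESCRIBE A;   -- prints the declared schema without launching a job
grunt> B = LIMIT A 5;
grunt> DUMP B;       -- still launches a MapReduce job in mapreduce mode

Even this small DUMP goes through YARN in mapreduce mode, so if it also sits at 0% complete, the tracking URL from the log (the ResourceManager UI) should show whether the application ever received a container.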