Hadoop 2.6.4 MR job quick freeze - hadoop
Hadoop 2.6.4: 1 master + 2 slaves on AWS EC2
master: namenode, secondary namenode, resource manager
slave: datanode, node manager
When running a test MR job (wordcount), it freezes right away:
hduser#ip-172-31-4-108:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /data/shakespeare /data/out1
16/03/21 10:45:19 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-4-108/172.31.4.108:8032
16/03/21 10:45:21 INFO input.FileInputFormat: Total input paths to process : 5
16/03/21 10:45:21 INFO mapreduce.JobSubmitter: number of splits:5
16/03/21 10:45:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458556970596_0001
16/03/21 10:45:22 INFO impl.YarnClientImpl: Submitted application application_1458556970596_0001
16/03/21 10:45:22 INFO mapreduce.Job: The url to track the job: http://ip-172-31-4-108:8088/proxy/application_1458556970596_0001/
16/03/21 10:45:22 INFO mapreduce.Job: Running job: job_1458556970596_0001
When running start-dfs.sh and start-yarn.sh on master, all daemons run succesfully (jps command) on corresponding EC2 instance.
Below Resource Manager log when launching MR job:
2016-03-21 10:45:20,152 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 1
2016-03-21 10:45:22,784 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user hduser
2016-03-21 10:45:22,785 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1458556970596_0001
2016-03-21 10:45:22,787 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hduser IP=172.31.4.108 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1458556970596_0001
2016-03-21 10:45:22,788 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from NEW to NEW_SAVING
2016-03-21 10:45:22,805 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1458556970596_0001
2016-03-21 10:45:22,807 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from NEW_SAVING to SUBMITTED
2016-03-21 10:45:22,809 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1458556970596_0001 user: hduser leaf-queue of parent: root #applications: 1
2016-03-21 10:45:22,810 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1458556970596_0001 from user: hduser, in queue: default
2016-03-21 10:45:22,825 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1458556970596_0001 State change from SUBMITTED to ACCEPTED
2016-03-21 10:45:22,866 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1458556970596_0001_000001
2016-03-21 10:45:22,867 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1458556970596_0001_000001 State change from NEW to SUBMITTED
2016-03-21 10:45:22,896 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue, it is likely set too low. skipping enforcement to allow at least one application to start
2016-03-21 10:45:22,896 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue for user, it is likely set too low. skipping enforcement to allow at least one application to start
2016-03-21 10:45:22,897 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1458556970596_0001 from user: hduser activated in queue: default
2016-03-21 10:45:22,898 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1458556970596_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User#1d51055, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2016-03-21 10:45:22,898 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1458556970596_0001_000001 to scheduler from user hduser in queue default
2016-03-21 10:45:22,900 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1458556970596_0001_000001 State change from SUBMITTED to SCHEDULED
Below NameNode log when launching MR job:
2016-03-21 10:45:03,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2016-03-21 10:45:03,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2016-03-21 10:45:20,613 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 3 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 7
2016-03-21 10:45:20,760 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar. BP-1804768821-172.31.4.108-1458553823105 blk_1073741834_1010{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]}
2016-03-21 10:45:21,290 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1073741834_1010{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} has not reached minimal replication 1
2016-03-21 10:45:21,292 INFO org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream: Nothing to flush
2016-03-21 10:45:21,297 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741834_1010{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 270356
2016-03-21 10:45:21,297 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741834_1010 size 270356
2016-03-21 10:45:21,706 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar is closed by DFSClient_NONMAPREDUCE_-18612056_1
2016-03-21 10:45:21,714 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.jar
2016-03-21 10:45:21,812 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split
2016-03-21 10:45:21,823 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split. BP-1804768821-172.31.4.108-1458553823105 blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]}
2016-03-21 10:45:21,849 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]} size 0
2016-03-21 10:45:21,853 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741835_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW], ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW]]} size 0
2016-03-21 10:45:21,855 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.split is closed by DFSClient_NONMAPREDUCE_-18612056_1
2016-03-21 10:45:21,865 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.splitmetainfo. BP-1804768821-172.31.4.108-1458553823105 blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]}
2016-03-21 10:45:21,876 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0
2016-03-21 10:45:21,877 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741836_1012{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0
2016-03-21 10:45:21,880 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_-18612056_1
2016-03-21 10:45:22,277 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.xml. BP-1804768821-172.31.4.108-1458553823105 blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]}
2016-03-21 10:45:22,327 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.14.198:50010 is added to blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0
2016-03-21 10:45:22,328 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.31.13.117:50010 is added to blk_1073741837_1013{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-5c350bcc-f752-43cd-80c1-80f68e2db73e:NORMAL:172.31.13.117:50010|RBW], ReplicaUnderConstruction[[DISK]DS-a1e2988f-2ef7-4005-8129-0ca18c95b2cb:NORMAL:172.31.14.198:50010|RBW]]} size 0
2016-03-21 10:45:22,332 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1458556970596_0001/job.xml is closed by DFSClient_NONMAPREDUCE_-18612056_1
2016-03-21 10:45:33,746 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2016-03-21 10:45:33,747 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2016-03-21 10:46:03,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2016-03-21 10:46:03,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2016-03-21 10:46:33,748 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2016-03-21 10:46:33,749 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2016-03-21 10:47:03,749 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2016-03-21 10:47:03,750 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
Any ideas ? thank you in advance for your support !.
Below *-site.xml files content. Note: I've indeed applied some dimensioning results values to properties, but I still had the EXACT SAME issue with minimal configuration (only mandatory properties).
core-site.xml
<configuration>
<property><name>fs.defaultFS</name><value>hdfs://ip-172-31-4-108:8020</value></property>
</configuration>
hdfs-site.xml
<configuration>
<property><name>dfs.replication</name><value>2</value></property>
<property><name>dfs.namenode.name.dir</name><value>file:///xvda1/dfs/nn</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:///xvda1/dfs/dn</value></property>
</configuration>
mapred-site.xml
<configuration>
<property><name>mapreduce.jobhistory.address</name><value>ip-172-31-4-108:10020</value></property>
<property><name>mapreduce.jobhistory.webapp.address</name><value>ip-172-31-4-108:19888</value></property>
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
<property><name>mapreduce.map.memory.mb</name><value>512</value></property>
<property><name>mapreduce.reduce.memory.mb</name><value>1024</value></property>
<property><name>mapreduce.map.java.opts</name><value>410</value></property>
<property><name>mapreduce.reduce.java.opts</name><value>820</value></property>
</configuration>
yarn-site.xml
<configuration>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
<property><name>yarn.resourcemanager.hostname</name><value>ip-172-31-4-108</value></property>
<property><name>yarn.nodemanager.local-dirs</name><value>file:///xvda1/nodemgr/local</value></property>
<property><name>yarn.nodemanager.log-dirs</name><value>/var/log/hadoop-yarn/containers</value></property>
<property><name>yarn.nodemanager.remote-app-log-dir</name><value>/var/log/hadoop-yarn/apps</value></property>
<property><name>yarn.log-aggregation-enable</name><value>true</value></property>
<property><name>yarn.app.mapreduce.am.resource.mb</name><value>1024</value></property>
<property><name>yarn.app.mapreduce.am.command-opts</name><value>820</value></property>
<property><name>yarn.nodemanager.resource.memory-mb</name><value>6291456</value></property>
<property><name>yarn.scheduler.minimum_allocation-mb</name><value>524288</value></property>
<property><name>yarn.scheduler.maximum_allocation-mb</name><value>6291456</value></property>
</configuration>
Related
hbase import module don't succeed
I have to move some hbase tables from one hadoop cluster to another. I have extracted the tables using bin/hbase org.apache.hadoop.hbase.mapreduce.Export \ <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]] and I've put the return files into HDFS on my new cluster. But when I try bin/hbase org.apache.hadoop.hbase.mapreduce.Import , I have the strange following logs: hadoop#edgenode:~$ hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Import ADCP /hbase/backup_hbase/ADCP/2022-07-04_1546/ADCP/ SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory] 2022-10-03 11:19:09,689 INFO [main] mapreduce.Import: writing directly to table from Mapper. 2022-10-03 11:19:09,847 INFO [main] client.RMProxy: Connecting to ResourceManager at /172.16.42.42:8032 2022-10-03 11:19:09,983 INFO [main] Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 02/10/2020 11:30 GMT 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:host.name=edgenode 2022-10-03 11:19:10,043 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_342 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.vendor=Private Build 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: hadoop-yarn-client-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-services-core-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-common-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-router-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-registry-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.3.3.jar:/home/hadoop/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-3.3.3.jar 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop/lib/native 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.name=Linux 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.arch=amd64 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.version=5.15.0-1018-kvm 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.name=hadoop 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.free=174MB 2022-10-03 11:19:10,044 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.max=3860MB 2022-10-03 11:19:10,045 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Client environment:os.memory.total=237MB 2022-10-03 11:19:10,048 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Initiating client connection, connectString=namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$15/257950720#1124fc36 2022-10-03 11:19:10,054 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] common.X509Util: Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation 2022-10-03 11:19:10,061 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxnSocket: jute.maxbuffer value is 4194304 Bytes 2022-10-03 11:19:10,069 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ClientCnxn: zookeeper.request.timeout value is 0. feature enabled= 2022-10-03 11:19:10,077 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Opening socket connection to server namenode/172.16.42.42:2181. Will not attempt to authenticate using SASL (unknown error) 2022-10-03 11:19:10,084 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.16.42.187:48598, server: namenode/172.16.42.42:2181 2022-10-03 11:19:10,120 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-SendThread(namenode:2181)] zookeeper.ClientCnxn: Session establishment complete on server namenode/172.16.42.42:2181, sessionid = 0x1b000002cb790005, negotiated timeout = 40000 2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7] zookeeper.ZooKeeper: Session: 0x1b000002cb790005 closed 2022-10-03 11:19:11,001 INFO [ReadOnlyZKClient-namenode:2181,datanode1:2181,datanode2:2181,datanode3:2181,datanode4:2181,datanode5:2181,datanode6:2181,datanode7:2181,datanode8:2181,datanode9:2181,datanode10:2181,datanode11:2181,datanode12:2181,datanode13:2181,datanode14:2181,datanode15:2181,datanode16:2181,datanode17:2181,datanode18:2181,datanode19:2181,datanode20:2181,datanode21:2181,datanode22:2181,datanode23:2181,datanode24:2181,datanode25:2181,datanode26:2181,edgenode:2181#0x05b970f7-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x1b000002cb790005 2022-10-03 11:19:15,366 INFO [main] input.FileInputFormat: Total input files to process : 32 2022-10-03 11:19:15,660 INFO [main] mapreduce.JobSubmitter: number of splits:32 2022-10-03 11:19:15,902 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1664271607293_0002 2022-10-03 11:19:16,225 INFO [main] conf.Configuration: resource-types.xml not found 2022-10-03 11:19:16,225 INFO [main] resource.ResourceUtils: Unable to find 'resource-types.xml'. 2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE 2022-10-03 11:19:16,231 INFO [main] resource.ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE 2022-10-03 11:19:16,293 INFO [main] impl.YarnClientImpl: Submitted application application_1664271607293_0002 2022-10-03 11:19:16,328 INFO [main] mapreduce.Job: The url to track the job: http://namenode:8088/proxy/application_1664271607293_0002/ 2022-10-03 11:19:16,329 INFO [main] mapreduce.Job: Running job: job_1664271607293_0002 2022-10-03 11:19:31,513 INFO [main] mapreduce.Job: Job job_1664271607293_0002 running in uber mode : false 2022-10-03 11:19:31,514 INFO [main] mapreduce.Job: map 0% reduce 0% 2022-10-03 11:19:31,534 INFO [main] mapreduce.Job: 2-10-03 11:19:31.345]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. [2022-10-03 11:19:31.346]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. For more detailed output, check the application tracking page: http://namenode:8088/cluster/app/application_1664271607293_0002 Then click on links to logs of each attempt. . Failing the application. 2022-10-03 11:19:31,552 INFO [main] mapreduce.Job: Counters: 0 I don't understand what the problem could be. I went to http://namenode:8088/cluster/app/application_1664271607293_0002 but i didn't found nothing interesting. I've tried the command with different tables but get the same result. The two clusters are not one the same version but I read that it wasn't a problem. Every service works well on my clusters and I can use hbase commands on the hbase shell without any problem. Also, map reduce programs works well on my new cluster. I've also tested the copyTable and snapchot methods for the data migration, which didn't work either. Any idea of what should be the problem? Thanks! :) update : I found this on a datanode syslog in the hadoop web interface, may be useful? 2022-10-04 14:12:39,341 INFO [main] org.apache.hadoop.security.SecurityUtil: Updating Configuration 2022-10-04 14:12:39,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2022-10-04 14:12:39,493 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 7 cluster_timestamp: 1664271607293 } attemptId: 2 } keyId: -896624238) 2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.conf.Configuration: resource-types.xml not found 2022-10-04 14:12:39,536 INFO [main] org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'. 2022-10-04 14:12:39,636 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73) at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36) at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70) ... 10 more Caused by: java.lang.VerifyError: Bad type on operand stack Exception Details: Location: org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual Reason: Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage' Current Frame: bci: #36 flags: { } locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } Bytecode: 0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7 0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a 0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5 0x0000030: 0013 2ab0 Stackmap Table: same_frame(#19) same_frame(#31) same_frame(#40) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37) ... 15 more 2022-10-04 14:12:39,641 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73) at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36) at org.apache.hadoop.mapreduce.v2.util.MRBuilderUtils.newJobId(MRBuilderUtils.java:39) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:298) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1745) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70) ... 10 more Caused by: java.lang.VerifyError: Bad type on operand stack Exception Details: Location: org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual Reason: Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage' Current Frame: bci: #36 flags: { } locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' } Bytecode: 0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7 0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a 0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5 0x0000030: 0013 2ab0 Stackmap Table: same_frame(#19) same_frame(#31) same_frame(#40) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37) ... 15 more 2022-10-04 14:12:39,643 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
Worried: Corrupt HDFS on single node - how to resolve
I am running CDH 5.6 (Hadoop 2.6, HBase 1.0.0) on a single machine. There is only Hadoop and HBase running. Hadoop is configured for replication factor 1, Hbase running on top of HDFS, no pseudo-distributed mode. For the last three days, I had a simple program running that would insert rows into HBase using 10 threads in parallel. Checking on it now I find that HDFS has gone corrupt and all but one inserting thread have failed. Running hdfs fsck / | grep CORRUPT I see that there are some corrupted blocks. hbase hbck says everything if fine. When restarting, all the sudden hdfs fsck says its HEALTHY again. Starting the insertion gets me checksum errors again in the region server log (as below). Finally I ran hdfs fsck / -delete and only after restarting everything, the insert works again. There are some details below. The question is: How can HDFS possibly get corrupted even on a single node that is doing nothing but simply trying to insert data into Hbase? and How can I prevent that in the future? If hbase hbck said everything is OK, does that mean there was no data loss? DETAILS: I checked logs of region server, data node, namenode, hmaster, zookeeper. There was no out of memory or anything like that. All processes have been up all the time and kept responding until the very end. There is no disk space shortage. Application log shows the following failure at around 05/16 08:31:32: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions: java.net.SocketTimeoutException: callTimeout=60000, callDuration=64720: row 'a899ca26-45a4-4af6-af34-72c035b4b7da/f6f88f36-9263-4dbb-9588-aaf126be9117/9223370573623181733' on table 'tt_items' at region=tt_items,a899ca26-45a4-4af6-af34-72c035b4b7da/b50993c1-7ff4-4169-b58f-f53878697709/9223370573954024736,1462900815686.08255086d13380bd559a87dd93cc15ba., hostname=hb-desktop,16201,1463231294049, seqNum=51380656 Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions: java.net.SocketTimeoutException: callTimeout=60000, callDuration=64720: row 'a899ca26-45a4-4af6-af34-72c035b4b7da/f6f88f36-9263-4dbb-9588-aaf126be9117/9223370573623181733' on table 'tt_items' at region=tt_items,a899ca26-45a4-4af6-af34-72c035b4b7da/b50993c1-7ff4-4169-b58f-f53878697709/9223370573954024736,1462900815686.08255086d13380bd559a87dd93cc15ba., hostname=hb-desktop,16201,1463231294049, seqNum=51380656 Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=64720: row 'a899ca26-45a4-4af6-af34-72c035b4b7da/f6f88f36-9263-4dbb-9588-aaf126be9117/9223370573623181733' on table 'tt_items' at region=tt_items,a899ca26-45a4-4af6-af34-72c035b4b7da/b50993c1-7ff4-4169-b58f-f53878697709/9223370573954024736,1462900815686.08255086d13380bd559a87dd93cc15ba., hostname=hb-desktop,16201,1463231294049, seqNum=51380656 Caused by: java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://localhost:9000/hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3, compression=snappy, cacheConf=blockCache=LruBlockCache{blockCount=7248, currentSize=543459280, freeSize=15851184, maxSize=559310464, heapSize=543459280, minSize=531344928, minFactor=0.95, multiSize=265672464, multiFactor=0.5, singleSize=132836232, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false, firstKey=a899ca26-45a4-4af6-af34-72c035b4b7da/b5099cdb-2148-4f95-866a-f6eba2813bab/9223370573623181721/d:co/1463268017037/Put, lastKey=a899ca26-45a4-4af6-af34-72c035b4b7da/fffffc52-8969-472a-a17b-65a4edd6fa66/9223370573623181744/d:r.tt.EPC/1463255430913/Put, avgKeyLen=118, avgValueLen=29, entries=20184807, length=465347831, cur=null] to key a899ca26-45a4-4af6-af34-72c035b4b7da/f6f88f36-9263-4dbb-9588-aaf126be9117/9223370573623181733/d:/LATEST_TIMESTAMP/DeleteFamily/vlen=0/seqid=0 Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3 at 140550656 exp: -935597690 got: -1004657115 Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://localhost:9000/hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3, compression=snappy, cacheConf=blockCache=LruBlockCache{blockCount=7248, currentSize=543459280, freeSize=15851184, maxSize=559310464, heapSize=543459280, minSize=531344928, minFactor=0.95, multiSize=265672464, multiFactor=0.5, singleSize=132836232, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false, firstKey=a899ca26-45a4-4af6-af34-72c035b4b7da/b5099cdb-2148-4f95-866a-f6eba2813bab/9223370573623181721/d:co/1463268017037/Put, lastKey=a899ca26-45a4-4af6-af34-72c035b4b7da/fffffc52-8969-472a-a17b-65a4edd6fa66/9223370573623181744/d:r.tt.EPC/1463255430913/Put, avgKeyLen=118, avgValueLen=29, entries=20184807, length=465347831, cur=null] to key a899ca26-45a4-4af6-af34-72c035b4b7da/f6f88f36-9263-4dbb-9588-aaf126be9117/9223370573623181733/d:/LATEST_TIMESTAMP/DeleteFamily/vlen=0/seqid=0 At around the same time region server says: 2016-05-16 08:30:37,390 WARN [B.defaultRpcServer.handler=8,queue=2,port=16201] hfile.HFile: HBase checksum verification failed for file hdfs://localhost:9000/hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3 at offset 408984102 filesize 465347831. Retrying read with HDFS checksums turned on... 2016-05-16 08:30:37,433 WARN [B.defaultRpcServer.handler=8,queue=2,port=16201] hdfs.DFSClient: Found Checksum error for BP-130837870-192.168.178.29-1462900512452:blk_1073746899_6086 from DatanodeInfoWithStorage[127.0.0.1:50010,DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017,DISK] at 140550656 2016-05-16 08:30:37,434 INFO [B.defaultRpcServer.handler=8,queue=2,port=16201] hdfs.DFSClient: Could not obtain BP-130837870-192.168.178.29-1462900512452:blk_1073746899_6086 from any node: java.io.IOException: No live nodes contain block BP-130837870-192.168.178.29-1462900512452:blk_1073746899_6086 after checking nodes = [DatanodeInfoWithStorage[127.0.0.1:50010,DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[127.0.0.1:50010,DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017,DISK] Dead nodes: DatanodeInfoWithStorage[127.0.0.1:50010,DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017,DISK]. Will get new block locations from namenode and retry... 2016-05-16 08:30:37,434 WARN [B.defaultRpcServer.handler=8,queue=2,port=16201] hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 2146.2290363411184 msec. .. some repetition... 2016-05-16 08:32:40,975 WARN [B.defaultRpcServer.handler=26,queue=2,port=16201] hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 2767.123497826818 msec. 2016-05-16 08:32:42,496 ERROR [regionserver/hb-desktop/192.168.178.29:16201-shortCompactions-1463232518856] regionserver.CompactSplitThread: Compaction failed Request = regionName=tt_items,a899ca26-45a4-4af6-af34-72c035b4b7da/b50993c1-7ff4-4169-b58f-f53878697709/9223370573954024736,1462900815686.08255086d13380bd559a87dd93cc15ba., storeName=d, fileCount=8, fileSize=1.0 G (443.8 M, 239.2 M, 124.8 M, 101.9 M, 78.5 M, 33.0 M, 32.9 M, 10.9 M), priority=2, time=152707213570012 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader reader=hdfs://localhost:9000/hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3, compression=snappy, cacheConf=blockCache=LruBlockCache{blockCount=7130, currentSize=535445256, freeSize=23865208, maxSize=559310464, heapSize=535445256, minSize=531344928, minFactor=0.95, multiSize=265672464, multiFactor=0.5, singleSize=132836232, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false, firstKey=a899ca26-45a4-4af6-af34-72c035b4b7da/b5099cdb-2148-4f95-866a-f6eba2813bab/9223370573623181721/d:co/1463268017037/Put, lastKey=a899ca26-45a4-4af6-af34-72c035b4b7da/fffffc52-8969-472a-a17b-65a4edd6fa66/9223370573623181744/d:r.tt.EPC/1463255430913/Put, avgKeyLen=118, avgValueLen=29, entries=20184807, length=465347831, cur=a899ca26-45a4-4af6-af34-72c035b4b7da/f6f881d7-076a-4aec-82f0-17e33d8f4f84/9223370573623181728/d:pi/1463267434824/Put/vlen=73/seqid=0] at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:146) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:596) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:273) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:105) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:124) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1233) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1770) at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:520) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/d/9bad8381ba89430daf008c28befedec3 at 140550656 exp: -935597690 got: -1004657115 at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method) at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59) at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301) at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:237) at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:156) at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:744) at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:800) at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:860) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:903) Datanode says: 2016-05-16 08:26:01,750 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:46618, dest: /127.0.0.1:50010, bytes: 114196960, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747594_6781, duration: 14642101541 2016-05-16 08:26:01,750 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747594_6781, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:26:17,445 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow flushOrSync took 304ms (threshold=300ms), isSync:false, flushTotalNanos=304174121ns 2016-05-16 08:26:22,832 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747595_6782 src: /127.0.0.1:47502 dest: /127.0.0.1:50010 2016-05-16 08:26:23,385 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:432ms (threshold=300ms) 2016-05-16 08:26:24,684 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:47502, dest: /127.0.0.1:50010, bytes: 11485542, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747595_6782, duration: 1851995254 2016-05-16 08:26:24,685 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747595_6782, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:26:45,373 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:405ms (threshold=300ms) 2016-05-16 08:27:16,631 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747596_6783 src: /127.0.0.1:49517 dest: /127.0.0.1:50010 2016-05-16 08:27:17,729 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:49517, dest: /127.0.0.1:50010, bytes: 3455293, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747596_6783, duration: 1096622112 2016-05-16 08:27:17,729 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747596_6783, type=LAST_IN_PIPELINE, downstreams=0:[] terminating ...some Slow BlockReceiver messages... org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/home/hb/seritrack-mts/nosql/data/data, DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017): no suitable block pools found to scan. Waiting 1334756421 ms. 2016-05-16 08:30:07,254 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:375ms (threshold=300ms) 2016-05-16 08:31:32,604 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073747556_6743 file /home/hb/seritrack-mts/nosql/data/data/current/BP-130837870-192.168.178.29-1462900512452/current/finalized/subdir0/subdir22/blk_1073747556 for deletion ... some repetition... 2016-05-16 08:31:39,276 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747597_6784 src: /127.0.0.1:59092 dest: /127.0.0.1:50010 2016-05-16 08:31:40,380 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:59092, dest: /127.0.0.1:50010, bytes: 11480560, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747597_6784, duration: 1093600168 2016-05-16 08:31:40,380 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747597_6784, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:31:41,563 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747598_6785 src: /127.0.0.1:59147 dest: /127.0.0.1:50010 2016-05-16 08:31:44,761 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:408ms (threshold=300ms) 2016-05-16 08:31:48,238 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:411ms (threshold=300ms) 2016-05-16 08:31:51,398 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/home/hb/seritrack-mts/nosql/data/data, DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017): no suitable block pools found to scan. Waiting 1334619581 ms. 2016-05-16 08:31:51,529 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:389ms (threshold=300ms) ...some more of those ... 2016-05-16 08:31:57,547 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:59147, dest: /127.0.0.1:50010, bytes: 268435456, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747598_6785, duration: 15718340373 2016-05-16 08:31:57,547 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747598_6785, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:31:57,749 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747599_6786 src: /127.0.0.1:59440 dest: /127.0.0.1:50010 2016-05-16 08:31:58,657 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:343ms (threshold=300ms) ...some more of those ... 2016-05-16 08:32:13,559 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:59440, dest: /127.0.0.1:50010, bytes: 268435456, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747599_6786, duration: 15809336101 2016-05-16 08:32:13,559 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747599_6786, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:32:14,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747600_6787 src: /127.0.0.1:59678 dest: /127.0.0.1:50010 2016-05-16 08:32:16,251 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:493ms (threshold=300ms) ...some more of those ... 2016-05-16 08:32:29,087 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:59678, dest: /127.0.0.1:50010, bytes: 268435456, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747600_6787, duration: 15015849046 2016-05-16 08:32:29,087 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747600_6787, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:32:29,171 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-130837870-192.168.178.29-1462900512452:blk_1073747601_6788 src: /127.0.0.1:59907 dest: /127.0.0.1:50010 2016-05-16 08:32:30,420 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:322ms (threshold=300ms) ...some more of those ... 2016-05-16 08:32:42,447 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:59907, dest: /127.0.0.1:50010, bytes: 205882206, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1749190019_1, offset: 0, srvID: 7bfbedce-3f6d-4f95-8eb4-d96438fedc36, blockid: BP-130837870-192.168.178.29-1462900512452:blk_1073747601_6788, duration: 13275544186 2016-05-16 08:32:42,447 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-130837870-192.168.178.29-1462900512452:blk_1073747601_6788, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2016-05-16 08:33:46,104 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Likely the client has stopped reading, disconnecting it (hb-desktop:50010:DataXceiver error processing READ_BLOCK operation src: /127.0.0.1:46586 dst: /127.0.0.1:50010); java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:50010 remote=/127.0.0.1:46586] Namenode says: 2016-05-16 08:30:39,662 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:39,662 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:30:41,844 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:41,844 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:30:43,495 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:43,495 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:30:47,435 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:47,436 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:30:51,842 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:51,842 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:30:57,476 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:30:57,476 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:31:06,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-05-16 08:31:06,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-05-16 08:31:08,643 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:31:08,643 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:31:21,129 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:31:21,130 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:31:28,993 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747591_6778 127.0.0.1:50010 2016-05-16 08:31:28,993 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 60 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 36 SyncTimes(ms): 11435 2016-05-16 08:31:29,186 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747556_6743 127.0.0.1:50010 2016-05-16 08:31:29,321 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747574_6761 127.0.0.1:50010 2016-05-16 08:31:29,674 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747593_6780 127.0.0.1:50010 2016-05-16 08:31:29,713 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747558_6745 127.0.0.1:50010 2016-05-16 08:31:29,796 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073747575_6762 127.0.0.1:50010 2016-05-16 08:31:30,237 INFO BlockStateChange: BLOCK* BlockManager: ask 127.0.0.1:50010 to delete [blk_1073747556_6743, blk_1073747558_6745, blk_1073747574_6761, blk_1073747575_6762, blk_1073747591_6778, blk_1073747593_6780] 2016-05-16 08:31:32,007 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:31:32,007 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:31:36,540 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-05-16 08:31:36,540 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s). 2016-05-16 08:31:38,849 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/9652e091531943848ee523a60bc5baa5. BP-130837870-192.168.178.29-1462900512452 blk_1073747597_6784{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} 2016-05-16 08:31:40,381 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073747597_6784{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} size 0 2016-05-16 08:31:40,745 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/9652e091531943848ee523a60bc5baa5 is closed by DFSClient_NONMAPREDUCE_-1749190019_1 2016-05-16 08:31:41,294 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/fb79c8ab90a5498089841431191c03ca. BP-130837870-192.168.178.29-1462900512452 blk_1073747598_6785{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} 2016-05-16 08:31:57,547 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073747598_6785{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} size 0 2016-05-16 08:31:57,551 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/fb79c8ab90a5498089841431191c03ca. BP-130837870-192.168.178.29-1462900512452 blk_1073747599_6786{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} 2016-05-16 08:32:06,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-05-16 08:32:06,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-05-16 08:32:13,559 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073747599_6786{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} size 0 2016-05-16 08:32:13,875 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/fb79c8ab90a5498089841431191c03ca. BP-130837870-192.168.178.29-1462900512452 blk_1073747600_6787{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} 2016-05-16 08:32:29,087 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073747600_6787{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} size 0 2016-05-16 08:32:29,088 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/fb79c8ab90a5498089841431191c03ca. BP-130837870-192.168.178.29-1462900512452 blk_1073747601_6788{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} 2016-05-16 08:32:29,088 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 91 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 57 SyncTimes(ms): 14475 2016-05-16 08:32:36,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds 2016-05-16 08:32:36,539 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s). 2016-05-16 08:32:42,043 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks 2016-05-16 08:32:42,044 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_1073746899 to add as corrupt on 127.0.0.1:50010 by /127.0.0.1 because client machine reported it 2016-05-16 08:32:42,447 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073747601_6788{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-9cc4b81b-dbe3-4da1-a394-9ca30db55017:NORMAL:127.0.0.1:50010|RBW]]} size 0 2016-05-16 08:32:42,495 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/default/tt_items/08255086d13380bd559a87dd93cc15ba/.tmp/fb79c8ab90a5498089841431191c03ca is closed by DFSClient_NONMAPREDUCE_-1749190019_1
As there was no further response as it seems that HDFS-8809 is indeed the source of the problem - that is more of a confusion than a real problem, I am suggesting that as an answer.
PIG latin - DUMP command not displaying
I am just trying to display the result of GROUPed records using DUMP, but instead of displaying the data, there are lots of log data. I am just playing with 10 records. The details: grunt> DUMP grouped_records; 2016-02-21 17:34:24,338 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER 2016-02-21 17:34:24,339 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, DuplicateForEachColumnRewrite, GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier, PartitionFilterOptimizer]} 2016-02-21 17:34:24,354 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false 2016-02-21 17:34:24,374 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1 2016-02-21 17:34:24,374 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1 2016-02-21 17:34:24,434 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032 2016-02-21 17:34:24,440 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job 2016-02-21 17:34:24,527 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3 2016-02-21 17:34:24,530 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 2016-02-21 17:34:24,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator 2016-02-21 17:34:24,541 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=142 2016-02-21 17:34:24,541 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1 2016-02-21 17:34:25,128 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job662989067023626482.jar 2016-02-21 17:34:31,290 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job662989067023626482.jar created 2016-02-21 17:34:31,335 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job 2016-02-21 17:34:31,338 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code. 2016-02-21 17:34:31,338 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache 2016-02-21 17:34:31,338 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [] 2016-02-21 17:34:31,549 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission. 2016-02-21 17:34:31,550 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2016-02-21 17:34:31,556 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032 2016-02-21 17:34:31,607 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS 2016-02-21 17:34:31,918 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1 2016-02-21 17:34:31,918 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1 2016-02-21 17:34:31,921 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1 2016-02-21 17:34:31,979 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1 2016-02-21 17:34:32,092 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1454294818944_0034 2016-02-21 17:34:32,192 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1454294818944_0034 2016-02-21 17:34:32,198 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://quickstart.cloudera:8088/proxy/application_1454294818944_0034/ 2016-02-21 17:34:32,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1454294818944_0034 2016-02-21 17:34:32,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases filtered_records,grouped_records,records 2016-02-21 17:34:32,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: records[1,10],records[-1,-1],filtered_records[2,19],grouped_records[3,18] C: R: 2016-02-21 17:34:32,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_1454294818944_0034 2016-02-21 17:34:32,428 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2016-02-21 17:35:02,623 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete 2016-02-21 17:35:23,469 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 2016-02-21 17:35:23,470 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 2.6.0-cdh5.5.0 0.12.0-cdh5.5.0 cloudera 2016-02-21 17:34:24 2016-02-21 17:35:23 GROUP_BY,FILTER Success! Job Stats (time in seconds): JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs job_1454294818944_0034 1 1 12 12 12 12 16 16 16 16 filtered_records,grouped_records,records GROUP_BY hdfs://quickstart.cloudera:8020/tmp/temp-1703423271/tmp-988597361, Input(s): Successfully read 10 records (525 bytes) from: "/user/hduser/input/maxtemppig.tsv" Output(s): Successfully stored 0 records in: "hdfs://quickstart.cloudera:8020/tmp/temp-1703423271/tmp-988597361" Counters: Total records written : 0 Total bytes written : 0 Spillable Memory Manager spill count : 0 Total bags proactively spilled: 0 Total records proactively spilled: 0 Job DAG: job_1454294818944_0034 2016-02-21 17:35:23,646 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success! 2016-02-21 17:35:23,648 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS 2016-02-21 17:35:23,648 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2016-02-21 17:35:23,649 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code. 2016-02-21 17:35:23,660 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1 2016-02-21 17:35:23,660 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1 Commands that I tried: records = LOAD '/user/hduser/input/maxtemppig.tsv' AS (year:chararray, temperature:int, quality:int); filtered_records = FILTER records BY temperature IN (-10,19) AND quality IN (0,1,4,5,9); DUMP filtered_records; grouped_records = GROUP filtered_records BY year; DUMP grouped_records; max_temp = FOREACH grouped_records GENERATE group, MAX(filtered_records.temperature); DUMP max_temp; My input tsv file... 1950 32 01459 1951 33 01459 1950 21 01459 1940 24 01459 1950 33 01459 2000 30 01459 2010 44 01459 2014 -10 01459 2016 -20 01459 2011 19 01459 What am I missing?
There is a high chance that the parsing is not working and you are filtering all records. Try records = LOAD '/user/hduser/input/maxtemppig.tsv' USING PigStorage('\t') AS (year:chararray, temperature:int, quality:int);
My MapReduce job become Fails
a have a mapreduce program in Eclipse. and I want to run it.. I follow the program from below url: http://www.orzota.com/step-by-step-mapreduce-programming/ I do all things that the page says and run the program. but it show me error and my job fails.. the program create output folder but it is empty.. here is my cod: package org.orzota.bookx.mappers; import java.io.IOException; import org.apache.hadoop.io.*; import org.apache.hadoop.mapred.MapReduceBase; import org.apache.hadoop.mapred.Mapper; import org.apache.hadoop.mapred.OutputCollector; import org.apache.hadoop.mapred.Reporter; public class MyHadoopMapper extends MapReduceBase implements Mapper <LongWritable, Text, Text, IntWritable>{ private final static IntWritable one = new IntWritable(1); public void map(LongWritable _key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { String st = value.toString(); String[] bookdata = st.split("\";\""); output.collect(new Text(bookdata[3]), one); } } public class MyHadoopReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable>{ public void reduce(Text _key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { Text key = _key; int freq = 0; while (values.hasNext()){ IntWritable value = (IntWritable) values.next(); freq += value.get(); } output.collect(key, new IntWritable(freq)); } } public class MyHadoopDriver { public static void main(String[] args) { JobClient client = new JobClient(); JobConf conf = new JobConf( org.orzota.bookx.mappers.MyHadoopDriver.class); conf.setJobName("BookCrossing1.0"); // TODO: specify output types conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(IntWritable.class); // TODO: specify a mapper conf.setMapperClass(org.orzota.bookx.mappers.MyHadoopMapper.class); // TODO: specify a reducer conf.setReducerClass(org.orzota.bookx.mappers.MyHadoopReducer.class); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(TextOutputFormat.class); FileInputFormat.setInputPaths(conf, new Path(args[0])); FileOutputFormat.setOutputPath(conf, new Path(args[1])); client.setConf(conf); try { JobClient.runJob(conf); } catch (Exception e) { e.printStackTrace(); } } } and here is the errors: 13/09/03 12:19:11 INFO util.ProcessTree: setsid exited with exit code 0 13/09/03 12:19:11 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#3c2378 13/09/03 12:19:11 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclip/Runs/input/BX-Books.csv:0+33554432 13/09/03 12:19:11 INFO mapred.MapTask: numReduceTasks: 1 13/09/03 12:19:12 INFO mapred.MapTask: io.sort.mb = 100 13/09/03 12:19:12 INFO mapred.MapTask: data buffer = 79691776/99614720 13/09/03 12:19:12 INFO mapred.MapTask: record buffer = 262144/327680 13/09/03 12:19:12 INFO mapred.JobClient: map 0% reduce 0% 13/09/03 12:19:13 INFO mapred.MapTask: Starting flush of map output 13/09/03 12:19:14 INFO mapred.MapTask: Finished spill 0 13/09/03 12:19:14 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000000_0 is done. And is in the process of commiting 13/09/03 12:19:14 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:0+33554432 13/09/03 12:19:14 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000000_0' done. 13/09/03 12:19:14 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000000_0 13/09/03 12:19:14 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000001_0 13/09/03 12:19:14 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#15dd910 13/09/03 12:19:14 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:33554432+33554432 13/09/03 12:19:14 INFO mapred.MapTask: numReduceTasks: 1 13/09/03 12:19:14 INFO mapred.MapTask: io.sort.mb = 100 13/09/03 12:19:14 INFO mapred.MapTask: data buffer = 79691776/99614720 13/09/03 12:19:14 INFO mapred.MapTask: record buffer = 262144/327680 13/09/03 12:19:14 INFO mapred.JobClient: map 20% reduce 0% 13/09/03 12:19:15 INFO mapred.MapTask: Starting flush of map output 13/09/03 12:19:15 INFO mapred.MapTask: Finished spill 0 13/09/03 12:19:15 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000001_0 is done. And is in the process of commiting 13/09/03 12:19:15 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:33554432+33554432 13/09/03 12:19:15 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000001_0' done. 13/09/03 12:19:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000001_0 13/09/03 12:19:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000002_0 13/09/03 12:19:15 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#7c3885 13/09/03 12:19:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Book-Ratings.csv:0+30682276 13/09/03 12:19:15 INFO mapred.MapTask: numReduceTasks: 1 13/09/03 12:19:15 INFO mapred.MapTask: io.sort.mb = 100 13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720 13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680 13/09/03 12:19:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000003_0 13/09/03 12:19:16 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#11d2572 13/09/03 12:19:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Users.csv:0+12284157 13/09/03 12:19:16 INFO mapred.MapTask: numReduceTasks: 1 13/09/03 12:19:16 INFO mapred.MapTask: io.sort.mb = 100 13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720 13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680 13/09/03 12:19:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000004_0 13/09/03 12:19:16 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#164b09c 13/09/03 12:19:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:67108864+10678575 13/09/03 12:19:16 INFO mapred.MapTask: numReduceTasks: 1 13/09/03 12:19:16 INFO mapred.MapTask: io.sort.mb = 100 13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720 13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680 13/09/03 12:19:16 INFO mapred.JobClient: map 40% reduce 0% 13/09/03 12:19:17 INFO mapred.MapTask: Starting flush of map output 13/09/03 12:19:17 INFO mapred.MapTask: Finished spill 0 13/09/03 12:19:17 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000004_0 is done. And is in the process of commiting 13/09/03 12:19:17 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:67108864+10678575 13/09/03 12:19:17 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000004_0' done. 13/09/03 12:19:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000004_0 13/09/03 12:19:17 INFO mapred.LocalJobRunner: Map task executor complete. 13/09/03 12:19:17 WARN mapred.LocalJobRunner: job_local1379860058_0001 java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 3 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354) Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 at org.orzota.bookx.mappers.MyHadoopMapper.map(MyHadoopMapper.java:17) at org.orzota.bookx.mappers.MyHadoopMapper.map(MyHadoopMapper.java:1) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 13/09/03 12:19:17 INFO mapred.JobClient: map 60% reduce 0% 13/09/03 12:19:17 INFO mapred.JobClient: Job complete: job_local1379860058_0001 13/09/03 12:19:17 INFO mapred.JobClient: Counters: 16 13/09/03 12:19:17 INFO mapred.JobClient: File Input Format Counters 13/09/03 12:19:17 INFO mapred.JobClient: Bytes Read=77795631 13/09/03 12:19:17 INFO mapred.JobClient: FileSystemCounters 13/09/03 12:19:17 INFO mapred.JobClient: FILE_BYTES_READ=178484057 13/09/03 12:19:17 INFO mapred.JobClient: FILE_BYTES_WRITTEN=6981917 13/09/03 12:19:17 INFO mapred.JobClient: Map-Reduce Framework 13/09/03 12:19:17 INFO mapred.JobClient: Map output materialized bytes=2971356 13/09/03 12:19:17 INFO mapred.JobClient: Map input records=271380 13/09/03 12:19:17 INFO mapred.JobClient: Spilled Records=271380 13/09/03 12:19:17 INFO mapred.JobClient: Map output bytes=2428578 13/09/03 12:19:17 INFO mapred.JobClient: Total committed heap usage (bytes)=883687424 13/09/03 12:19:17 INFO mapred.JobClient: CPU time spent (ms)=0 13/09/03 12:19:17 INFO mapred.JobClient: Map input bytes=77787439 13/09/03 12:19:17 INFO mapred.JobClient: SPLIT_RAW_BYTES=306 13/09/03 12:19:17 INFO mapred.JobClient: Combine input records=0 13/09/03 12:19:17 INFO mapred.JobClient: Combine output records=0 13/09/03 12:19:17 INFO mapred.JobClient: Physical memory (bytes) snapshot=0 13/09/03 12:19:17 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0 13/09/03 12:19:17 INFO mapred.JobClient: Map output records=271380 13/09/03 12:19:17 INFO mapred.JobClient: Job Failed: NA java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357) at org.orzota.bookx.mappers.MyHadoopDriver.main(MyHadoopDriver.java:44) I think the error is from this line: output.collect(new Text(bookdata[3]), one); but I don't know what it says.. can anyone help me please? thanks..
I checked the link you provided. I think the best thing you can do is do a system.out.println() of your input key value pairs (on a small subset of your input dataset), just to be sure. If the input file you are using contains a '\n' then it might be possible that the csv record is broken into 2 seperate records which contain fewer than 8 substrings. The ArrayOutOfBoundsException seems to point in this direction. I don't think it is a mapreduce error. You could also add the following line to your map function: if (bookdata.length!=8){ System.out.println("Warning, bad entry"); return; } If the simulation survives you have isolated the problem..
Most probably the input file you are reading has a row that doesn't have 4 columns. So when you split the row into an Array, String[] bookdata = st.split("\";\""); And you want to access the 4th element output.collect(new Text(bookdata[3]), one); It fails.
hadoop: reduce happened between flush map output and finish spill before maps done
I'm new to hadoop, and i'm trying the examples wordcount/secondsort in src/examples. wordcount test environment: input: file01.txt file02.txt secondsort test environment: input: sample01.txt sample02.txt Which means both the two test would have 2 paths to process. I print some log info trying to understand the process of map/reduce. See what's between Starting flush of map output and Finished spill 0: the wordcount program has another two reduce task before a final reduce while the secondsort program just do the reduce once and it's done. Since these programs are so "small", i dont think the io.sort.mb/io.sort.refactor would affect this. Can anybody explain this? Thanks for your patience for my broken Englisth and the long log. These are the log info (i cut some useless info to make it short): wordcount log: [hadoop#localhost ~]$ hadoop jar test.jar com.abc.example.test wordcount output 13/08/07 18:14:05 INFO mapred.FileInputFormat: Total input paths to process : 2 13/08/07 18:14:06 INFO mapred.JobClient: Running job: job_local_0001 13/08/07 18:14:06 INFO util.ProcessTree: setsid exited with exit code 0 ... 13/08/07 18:14:06 INFO mapred.MapTask: numReduceTasks: 1 13/08/07 18:14:06 INFO mapred.MapTask: io.sort.mb = 100 13/08/07 18:14:06 INFO mapred.MapTask: data buffer = 79691776/99614720 13/08/07 18:14:06 INFO mapred.MapTask: record buffer = 262144/327680 Mapper: 0 | Hello Hadoop GoodBye Hadoop 13/08/07 18:14:06 INFO mapred.MapTask: **Starting flush of map output** Reduce: GoodBye Reduce: GoodBye | 1 Reduce: Hadoop Reduce: Hadoop | 1 Reduce: Hadoop | 1 Reduce: Hello Reduce: Hello | 1 13/08/07 18:14:06 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting 13/08/07 18:14:06 INFO mapred.LocalJobRunner: hdfs://localhost:8020/user/hadoop/wordcount/file02.txt:0+28 13/08/07 18:14:06 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done. 13/08/07 18:14:06 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#4d16ffed 13/08/07 18:14:06 INFO mapred.MapTask: numReduceTasks: 1 13/08/07 18:14:06 INFO mapred.MapTask: io.sort.mb = 100 13/08/07 18:14:06 INFO mapred.MapTask: data buffer = 79691776/99614720 13/08/07 18:14:06 INFO mapred.MapTask: record buffer = 262144/327680 13/08/07 18:14:06 INFO mapred.MapTask: **Starting flush of map output** Reduce: Bye Reduce: Bye | 1 Reduce: Hello Reduce: Hello | 1 Reduce: world Reduce: world | 1 Reduce: world | 1 13/08/07 18:14:06 INFO mapred.MapTask: **Finished spill 0** 13/08/07 18:14:06 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting 13/08/07 18:14:06 INFO mapred.LocalJobRunner: hdfs://localhost:8020/user/hadoop/wordcount/file01.txt:0+22 13/08/07 18:14:06 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done. 13/08/07 18:14:06 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#1f3c0665 13/08/07 18:14:06 INFO mapred.LocalJobRunner: 13/08/07 18:14:06 INFO mapred.Merger: Merging 2 sorted segments 13/08/07 18:14:06 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 77 bytes 13/08/07 18:14:06 INFO mapred.LocalJobRunner: Reduce: Bye Reduce: Bye | 1 Reduce: GoodBye Reduce: GoodBye | 1 Reduce: Hadoop Reduce: Hadoop | 2 Reduce: Hello Reduce: Hello | 1 Reduce: Hello | 1 Reduce: world Reduce: world | 2 13/08/07 18:14:06 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting ... 13/08/07 18:14:07 INFO mapred.JobClient: Reduce input groups=5 13/08/07 18:14:07 INFO mapred.JobClient: Combine output records=6 13/08/07 18:14:07 INFO mapred.JobClient: Physical memory (bytes) snapshot=0 13/08/07 18:14:07 INFO mapred.JobClient: Reduce output records=5 13/08/07 18:14:07 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0 13/08/07 18:14:07 INFO mapred.JobClient: Map output records=8 secondsort log info: [hadoop#localhost ~]$ hadoop jar example.jar com.abc.example.example secondsort output 13/08/07 17:00:11 INFO input.FileInputFormat: Total input paths to process : 2 13/08/07 17:00:11 WARN snappy.LoadSnappy: Snappy native library not loaded 13/08/07 17:00:12 INFO mapred.JobClient: Running job: job_local_0001 13/08/07 17:00:12 INFO util.ProcessTree: setsid exited with exit code 0 13/08/07 17:00:12 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#57d94c7b 13/08/07 17:00:12 INFO mapred.MapTask: io.sort.mb = 100 13/08/07 17:00:12 INFO mapred.MapTask: data buffer = 79691776/99614720 13/08/07 17:00:12 INFO mapred.MapTask: record buffer = 262144/327680 Map: 0 | 5 49 Map: 5 | 9 57 Map: 10 | 19 46 Map: 16 | 3 21 Map: 21 | 9 48 Map: 26 | 7 57 ... 13/08/07 17:00:12 INFO mapred.MapTask: **Starting flush of map output** 13/08/07 17:00:12 INFO mapred.MapTask: **Finished spill 0** 13/08/07 17:00:12 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting 13/08/07 17:00:12 INFO mapred.LocalJobRunner: 13/08/07 17:00:12 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done. 13/08/07 17:00:12 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#f3a1ea1 13/08/07 17:00:12 INFO mapred.MapTask: io.sort.mb = 100 13/08/07 17:00:12 INFO mapred.MapTask: data buffer = 79691776/99614720 13/08/07 17:00:12 INFO mapred.MapTask: record buffer = 262144/327680 Map: 0 | 20 21 Map: 6 | 50 51 Map: 12 | 50 52 Map: 18 | 50 53 Map: 24 | 50 54 ... 13/08/07 17:00:12 INFO mapred.MapTask: **Starting flush of map output** 13/08/07 17:00:12 INFO mapred.MapTask: **Finished spill 0** 13/08/07 17:00:12 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting 13/08/07 17:00:12 INFO mapred.LocalJobRunner: 13/08/07 17:00:12 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done. 13/08/07 17:00:12 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#cee4e92 13/08/07 17:00:12 INFO mapred.LocalJobRunner: 13/08/07 17:00:12 INFO mapred.Merger: Merging 2 sorted segments 13/08/07 17:00:12 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 1292 bytes 13/08/07 17:00:12 INFO mapred.LocalJobRunner: Reduce: 0:35 ----------------- Reduce: 0:35 | 35 Reduce: 0:54 ----------------- ... 13/08/07 17:00:12 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting 13/08/07 17:00:12 INFO mapred.LocalJobRunner: 13/08/07 17:00:12 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now 13/08/07 17:00:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to output 13/08/07 17:00:12 INFO mapred.LocalJobRunner: reduce > reduce 13/08/07 17:00:12 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done. 13/08/07 17:00:13 INFO mapred.JobClient: map 100% reduce 100% 13/08/07 17:00:13 INFO mapred.JobClient: Job complete: job_local_0001 13/08/07 17:00:13 INFO mapred.JobClient: Counters: 22 13/08/07 17:00:13 INFO mapred.JobClient: File Output Format Counters 13/08/07 17:00:13 INFO mapred.JobClient: Bytes Written=4787 ... 13/08/07 17:00:13 INFO mapred.JobClient: SPLIT_RAW_BYTES=236 13/08/07 17:00:13 INFO mapred.JobClient: Reduce input records=92 PS: The main()s for others to check out. wordcount: public static void main(String[] args) throws Exception { JobConf conf = new JobConf(test.class); conf.setJobName("wordcount"); conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(IntWritable.class); conf.setMapperClass(Map.class); conf.setCombinerClass(Reduce.class); conf.setReducerClass(Reduce.class); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(TextOutputFormat.class); FileInputFormat.setInputPaths(conf, new Path(args[0])); FileOutputFormat.setOutputPath(conf, new Path(args[1])); JobClient.runJob(conf); } secondsort: public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException { Configuration conf = new Configuration(); Job job = new Job(conf, "secondarysort"); job.setJarByClass(example.class); job.setMapperClass(Map.class); job.setReducerClass(Reduce.class); job.setPartitionerClass(FirstPartitioner.class); job.setGroupingComparatorClass(GroupingComparator.class); job.setMapOutputKeyClass(IntPair.class); job.setMapOutputValueClass(IntWritable.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); job.setInputFormatClass(TextInputFormat.class); job.setOutputFormatClass(TextOutputFormat.class); FileInputFormat.setInputPaths(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); System.exit(job.waitForCompletion(true) ? 0 : 1); }
Combine output records=6 This says it all: the reduce function is used both as a combiner and a reducer. So what you are seeing is output from the combiner. The combiner is (sometimes) invoked when output is spilled. I think you should have added your code, at least the part in the main() to show us how your job is set up. This would make it easier to answer your questions.
I think the lines such as Reduce: GoodBye Reduce: GoodBye | 1 are println(...)in your source codes, and you need to check the source code.