I want to test out a cluster of a few computers, each with 2 cores and 256 MB of RAM. Following Cloudera's tutorial, I have tried to tell Hadoop 2.6.0 about my low-resource NodeManagers (Ubuntu 14.04). I have the following configuration:
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoop-master:54311</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-master:19888</value>
</property>
<property>
<name>mapred.task.profile</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>200</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>200</value>
</property>
<property>
<name>mapreduce.map.java.opts.max.heap</name>
<value>160</value>
</property>
<property>
<name>mapreduce.reduce.java.opts.max.heap</name>
<value>160</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>200</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>200</value>
</property>
<property>
<name>yarn.scheduler.increment-allocation-mb</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-master:8050</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/app-logs</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:///usr/local/hadoop/local</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>200</value>
</property>
</configuration>
But when I try to run a small pi estimation example, I get this error:
yarn jar hadoop-mapreduce-examples-2.6.0.jar pi 1 1
Number of Maps = 1
Samples per Map = 1
Wrote input for Map #0
Starting Job
16/01/28 19:23:24 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/10.0.3.100:8050
16/01/28 19:23:25 INFO input.FileInputFormat: Total input paths to process : 1
16/01/28 19:23:25 INFO mapreduce.JobSubmitter: number of splits:1
16/01/28 19:23:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1454008935455_0001
16/01/28 19:23:26 INFO impl.YarnClientImpl: Submitted application application_1454008935455_0001
16/01/28 19:23:26 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1454008935455_0001/
16/01/28 19:23:26 INFO mapreduce.Job: Running job: job_1454008935455_0001
16/01/28 19:23:34 INFO mapreduce.Job: Job job_1454008935455_0001 running in uber mode : false
16/01/28 19:23:34 INFO mapreduce.Job: map 0% reduce 0%
16/01/28 19:23:34 INFO mapreduce.Job: Job job_1454008935455_0001 failed with state FAILED due to: Application application_1454008935455_0001 failed 2 times due to AM Container for appattempt_1454008935455_0001_000002 exited with exitCode: -103
For more detailed output, check application tracking page:http://hadoop-master:8088/proxy/application_1454008935455_0001/Then, click on links to logs of each attempt.
Diagnostics: Container [pid=847,containerID=container_1454008935455_0001_02_000001] is running beyond virtual memory limits. Current usage: 210.8 MB of 200 MB physical memory used; 1.3 GB of 420.0 MB virtual memory used. Killing container.
Dump of the process-tree for container_1454008935455_0001_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 855 847 847 847 (java) 466 16 1410424832 53695 /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/hadoop/logs/userlogs/application_1454008935455_0001/container_1454008935455_0001_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster
|- 847 845 847 847 (bash) 0 0 5431296 276 /bin/bash -c /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/hadoop/logs/userlogs/application_1454008935455_0001/container_1454008935455_0001_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/usr/local/hadoop/logs/userlogs/application_1454008935455_0001/container_1454008935455_0001_02_000001/stdout 2>/usr/local/hadoop/logs/userlogs/application_1454008935455_0001/container_1454008935455_0001_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
16/01/28 19:23:34 INFO mapreduce.Job: Counters: 0
Job Finished in 9.962 seconds
java.io.FileNotFoundException: File does not exist: hdfs://hadoop-master:9000/user/hduser/QuasiMonteCarlo_1454009003268_765740795/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1750)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Is there an error in this configuration, or is Hadoop simply not made for resources this low? I'm only doing this for learning purposes.
Yes, you will run into trouble with resources that low. For testing purposes, disable the memory checks:
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
For yarn.scheduler.minimum-allocation-mb you might go even lower, because memory is actually reserved in increments of that value: if you set it to 100 and a container requests 101 MB, YARN rounds the allocation up to 200 MB.
The vmem check is unreliable and, in my opinion, should really be disabled in YARN by default.
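If you prefer to keep the checks enabled, note that the container being killed in your log is the MapReduce ApplicationMaster, which still launches with its default -Xmx1024m heap. A minimal sketch for mapred-site.xml (the property names are standard; the values are only guesses for a 200 MB container) would be:
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>200</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx160m</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx160m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx160m</value>
</property>
mapreduce.map.java.opts and mapreduce.reduce.java.opts are the standard names for the task heaps; the *.java.opts.max.heap keys in your mapred-site.xml are Cloudera Manager settings, and as far as I know plain Hadoop ignores them.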
Related
I have a problem with Sqoop; I would really appreciate any help.
I am running a Sqoop command from my local computer to export data from HDFS to an Oracle database. I use Hadoop 3.3.0 and Sqoop 1.4.7 on my local computer.
The error is:
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
sqoop command:
sqoop export --connect "jdbc:oracle:thin:#(description=(address=(protocol=tcp)(host=172.16.49.30)(port=1521))(connect_data=(service_name=stgdb)))" --table CORE_ETL.DEPOSIT_TURNOVER --username username --password password --export-dir /tmp/merged_deposit_turnover/sqoop/ --input-fields-terminated-by "," --input-lines-terminated-by '\n'
yarn-site.xml:
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>true</value>
</property>
<property>
<name>yarn.admin.acl</name>
<value>*</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cluster.com:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cluster.com:8033</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cluster.com:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cluster.com:8031</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cluster.com:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>cluster.com:8090</value>
</property>
<property>
<name>yarn.resourcemanager.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.resourcemanager.admin.client.thread-count</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.increment-allocation-mb</name>
<value>512</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.increment-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
</property>
<property>
<name>yarn.resourcemanager.amliveliness-monitor.interval-ms</name>
<value>1000</value>
</property>
<property>
<name>yarn.am.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>2</value>
</property>
<property>
<name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
<value>1000</value>
</property>
<property>
<name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>
<property>
<name>yarn.resourcemanager.max-completed-applications</name>
<value>10000</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
</property>
</configuration>
environment variables:
export HADOOP_HOME=/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export YARN_CONF_DIR=/etc/hadoop/etc/hadoop
export HADOOP_CONF_DIR=/etc/hadoop/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/dfs/nn</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address</name>
<value>cluster.com:8022</value>
</property>
<property>
<name>dfs.https.address</name>
<value>cluster.com:9871</value>
</property>
<property>
<name>dfs.https.port</name>
<value>9871</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>cluster.com:9870</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>67108864</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>fs.permissions.umask-mode</name>
<value>022</value>
</property>
<property>
<name>dfs.client.block.write.locateFollowingBlock.retries</name>
<value>7</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>false</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hdfs-sockets/dn</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.skip.checksum</name>
<value>false</value>
</property>
<property>
<name>dfs.client.domain.socket.data.traffic</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/etc/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/etc/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/etc/hadoop</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/common/*,$HADOOP_MAPRED_HOME/share/hadoop/common/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/lib/*</value>
</property>
</configuration>
sqoop error:
Warning: /usr/lib/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/lib/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/lib/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
2020-08-22 17:56:24,879 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2020-08-22 17:56:25,173 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2020-08-22 17:56:25,492 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
2020-08-22 17:56:25,579 INFO manager.SqlManager: Using default fetchSize of 1000
2020-08-22 17:56:25,579 INFO tool.CodeGenTool: Beginning code generation
2020-08-22 17:56:27,694 INFO manager.OracleManager: Time zone has been set to GMT
2020-08-22 17:56:27,883 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM CORE_ETL.DEPOSIT_TURNOVER t WHERE 1=0
2020-08-22 17:56:28,188 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /etc/hadoop
Note: /tmp/sqoop-hatef/compile/dc629ada72d032251eb72d68f8f68c85/CORE_ETL_DEPOSIT_TURNOVER.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2020-08-22 17:56:33,829 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hatef/compile/dc629ada72d032251eb72d68f8f68c85/CORE_ETL.DEPOSIT_TURNOVER.jar
2020-08-22 17:56:33,902 INFO mapreduce.ExportJobBase: Beginning export of CORE_ETL.DEPOSIT_TURNOVER
2020-08-22 17:56:33,902 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2020-08-22 17:56:34,381 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2020-08-22 17:56:36,685 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-08-22 17:56:38,545 INFO manager.OracleManager: Time zone has been set to GMT
2020-08-22 17:56:38,638 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2020-08-22 17:56:38,645 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2020-08-22 17:56:38,647 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2020-08-22 17:56:38,996 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at hdp-name1-esxi12.sdb247.com/172.16.49.10:8032
2020-08-22 17:56:40,130 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/airflow/.staging/job_1597060731030_0459
2020-08-22 18:01:01,798 INFO input.FileInputFormat: Total input files to process : 1
2020-08-22 18:01:01,885 INFO input.FileInputFormat: Total input files to process : 1
2020-08-22 18:01:02,817 INFO mapreduce.JobSubmitter: number of splits:4
2020-08-22 18:01:02,999 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2020-08-22 18:01:05,962 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1597060731030_0459
2020-08-22 18:01:05,962 INFO mapreduce.JobSubmitter: Executing with tokens: []
2020-08-22 18:01:08,561 INFO conf.Configuration: resource-types.xml not found
2020-08-22 18:01:08,562 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2020-08-22 18:01:08,901 INFO impl.YarnClientImpl: Submitted application application_1597060731030_0459
2020-08-22 18:01:09,086 INFO mapreduce.Job: The url to track the job: http://hdp-name1-esxi12.sdb247.com:8088/proxy/application_1597060731030_0459/
2020-08-22 18:01:09,088 INFO mapreduce.Job: Running job: job_1597060731030_0459
2020-08-22 18:01:11,442 INFO mapreduce.Job: Job job_1597060731030_0459 running in uber mode : false
2020-08-22 18:01:11,444 INFO mapreduce.Job: map 0% reduce 0%
2020-08-22 18:01:11,671 INFO mapreduce.Job: Job job_1597060731030_0459 failed with state FAILED due to: Application application_1597060731030_0459 failed 2 times due to AM Container for appattempt_1597060731030_0459_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2020-08-22 18:03:19.337]Exception from container-launch.
Container id: container_1597060731030_0459_02_000001
Exit code: 1
[2020-08-22 18:03:19.338]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
[2020-08-22 18:03:19.339]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
For more detailed output, check the application tracking page: http://cluster.com:8088/cluster/app/application_1597060731030_0459 Then click on links to logs of each attempt.
. Failing the application.
2020-08-22 18:01:11,780 INFO mapreduce.Job: Counters: 0
2020-08-22 18:01:11,916 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2020-08-22 18:01:11,921 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 273.1812 seconds (0 bytes/sec)
2020-08-22 18:01:12,013 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2020-08-22 18:01:12,015 INFO mapreduce.ExportJobBase: Exported 0 records.
2020-08-22 18:01:12,015 ERROR mapreduce.ExportJobBase: Export job failed!
2020-08-22 18:01:12,016 ERROR tool.ExportTool: Error during export:
Export job failed!
at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:445)
at org.apache.sqoop.manager.OracleManager.exportTable(OracleManager.java:465)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
You mention you have a cluster installed with Cloudera, but it is not clear where Sqoop is running or where you got those XML files.
If you have a fully installed Cloudera cluster, Sqoop should already be installed and configured there for you to run without much trouble (you might need extra JDBC drivers, but that should be it).
Otherwise, if you are trying to set up Sqoop (and Hadoop) externally, you'll want to grab a copy of the $HADOOP_HOME/conf folder from a worker node in the Hadoop cluster to make sure all the client configurations are the same.
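For example, assuming the workers keep their Hadoop install under /opt/hadoop (a placeholder path) and HADOOP_CONF_DIR is the directory from your environment variables, something along these lines would sync the client configuration:
# hostname and paths are placeholders; adjust them to your cluster layout
scp -r hduser@worker-node:/opt/hadoop/etc/hadoop/* "$HADOOP_CONF_DIR"/
# then check that HADOOP_MAPRED_HOME really points at a full Hadoop distribution,
# i.e. that the MapReduce jars referenced by mapreduce.application.classpath exist
ls "$HADOOP_MAPRED_HOME"/share/hadoop/mapreduce/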
I've set up a Hadoop 2.7.5 HA cluster and I'm running Flink 1.4.0 applications using the default YARN queue. I decided to categorize applications and run them on dedicated NodeManagers, so I labeled three nodes (each with 4 cores and 2 GB RAM) as stream in queue streamQ, and three nodes (each with 1 core and 1 GB RAM) as online in queue onlineQ. All the settings are displayed in the YARN web UI as desired and the nodes are identified.
Here is the capacity-scheduler.xml:
<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
</property>
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.1</value>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>
<property>
<name>yarn.scheduler.capacity.node-locality-delay</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings</name>
<value></value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>streamQ,onlineQ</value>
</property>
<!-- streamQ settings -->
<property>
<name>yarn.scheduler.capacity.root.streamQ.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_submit_applications</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_administer_queue</name>
<value>*</value>
</property>
<!-- onlineQ settings -->
<property>
<name>yarn.scheduler.capacity.root.onlineQ.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_submit_applications</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_administer_queue</name>
<value>*</value>
</property>
I run the following command to start the Flink session on an edge node whose Hadoop configuration is the same as the cluster's:
yarn-session.sh -n 2 -jm 768 -tm 768 -nm flink -z flink_zoo -s 3 -qu streamQ
It successfully uploads the Flink libraries to HDFS, and I can see the application in the YARN web UI, but when it tries to get resources, it says:
2018-01-28 10:02:04,087 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
Here is the full log:
2018-01-28 10:00:09,648 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2018-01-28 10:00:09,649 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 768
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 768
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2018-01-28 10:00:09,650 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: web.port, 8081
2018-01-28 10:00:10,003 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-01-28 10:00:10,069 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to manager (auth:SIMPLE)
2018-01-28 10:00:10,377 INFO org.apache.flink.yarn.YarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=768, numberTaskManagers=2, slotsPerTaskManager=3}
2018-01-28 10:00:10,747 WARN org.apache.flink.yarn.YarnClusterDescriptor - The configuration directory ('/opt/flink/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2018-01-28 10:00:10,751 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/conf/log4j.properties to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/log4j.properties
2018-01-28 10:00:11,123 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/log4j-1.2.17.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/lib/log4j-1.2.17.jar
2018-01-28 10:00:11,384 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/flink-dist_2.11-1.4.0.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/lib/flink-dist_2.11-1.4.0.jar
2018-01-28 10:00:30,986 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/flink-shaded-hadoop2-uber-1.4.0.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/lib/flink-shaded-hadoop2-uber-1.4.0.jar
2018-01-28 10:00:40,852 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/flink-python_2.11-1.4.0.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/lib/flink-python_2.11-1.4.0.jar
2018-01-28 10:00:41,017 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/slf4j-log4j12-1.7.7.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/lib/slf4j-log4j12-1.7.7.jar
2018-01-28 10:00:41,250 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/conf/logback.xml to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/logback.xml
2018-01-28 10:00:41,386 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/flink-dist_2.11-1.4.0.jar to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/flink-dist_2.11-1.4.0.jar
2018-01-28 10:01:02,966 INFO org.apache.flink.yarn.Utils - Copying from /tmp/application_1517118829753_0002-flink-conf.yaml285707454205346702.tmp to hdfs://ha-cluster/user/manager/.flink/application_1517118829753_0002/application_1517118829753_0002-flink-conf.yaml285707454205346702.tmp
2018-01-28 10:01:03,601 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1517118829753_0002
2018-01-28 10:01:03,782 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1517118829753_0002
2018-01-28 10:01:03,783 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
2018-01-28 10:01:03,796 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
What is the problem?
Editing capacity-scheduler.xml solved the problem:
<!-- configuration of queue-root -->
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>streamQ,onlineQ</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.default-node-label-expression</name>
<value>*</value>
</property>
<!-- configuration of queue-streamQ -->
<property>
<name>yarn.scheduler.capacity.root.streamQ.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.online.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name>
<value>stream</value>
</property>
<!-- configuration of queue-onlineQ -->
<property>
<name>yarn.scheduler.capacity.root.onlineQ.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.stream.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name>
<value>online</value>
</property>
</configuration>
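After editing capacity-scheduler.xml, the queue and label changes can normally be applied to a running ResourceManager without a restart:
yarn rmadmin -refreshQueues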
Please check your Flink application logs to see if there is some issue when connecting to the YARN ResourceManager. I also encountered this issue when I used Flink on YARN with HA; I am not sure if I was the only one.
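A quick way to see why an attempt stays in ACCEPTED is to ask YARN for the application's diagnostics (the id below is the one from the log above):
yarn application -status application_1517118829753_0002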
I am running Hadoop with Oozie in pseudo-distributed mode (I am not using any Hadoop distribution per se, such as CDH or Hortonworks). My setup is a Fedora 22 VM running on VirtualBox with 4 GB of RAM allocated, Hadoop 2.7, and Oozie 4.2.
After I submit the example MapReduce job for Oozie, it gets SUSPENDED with the job error below:
2015-10-29 15:44:59,048 WARN ActionStartXCommand:523 - SERVER[hadoop] USER[hadoop] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000000-151029154441128-OOZIE-VB-W] ACTION[0000000-151029154441128-OOZIE-VB-W#mr-node] Error starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1024
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)]
org.apache.oozie.action.ActionExecutorException: JA009: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1024
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:456)
at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:440)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1132)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1286)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
at org.apache.oozie.command.XCommand.call(XCommand.java:286)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I think this has something to do with the memory allocated to the MapReduce jobs, but I am not able to figure out the exact math behind it. Help on this is much appreciated.
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>http://localhost:50031</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>http://localhost:50030</value>
</property>
<property>
<name>mapreduce.jobtracker.jobhistory.location</name>
<value>/home/osboxes/hadoop/logs/jobhistory</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>http://localhost:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/home/osboxes/hadoop/mr-history/temp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/home/osboxes/hadoop/mr-history/done</value>
</property>
<property>
<name>mapreduce.cluster.local.dir</name>
<value>/home/osboxes/hadoop/dfs/local</value>
</property>
<property>
<name>mapreduce.jobtracker.system.dir</name>
<value>/home/osboxes/hadoop/dfs/system</value>
</property>
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description> Execution Framework </description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>1024</value>
</property>
EDITED 30-Oct-2015
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
Looks like a user/group permissions issue. Try running it as the Oozie user.
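If the InvalidResourceRequestException comes back even then, another reading of requestedMemory=2048 vs maxMemory=1024 in the trace is that the Oozie launcher simply asks for more memory than YARN is allowed to hand out on this node. A sketch for yarn-site.xml (the values assume the 4 GB VM can spare about 2 GB for containers):
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
Alternatively, the launcher's own request can be lowered from the workflow action (the oozie.launcher.* properties) instead of raising the node limits.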
I have installed all the components of Cloudera 5 on one machine: NameNode, DataNode, Hue, Pig, Oozie, YARN, HBase, and so on.
I start Pig from the shell:
sudo -u hdfs pig
and then, in the Pig shell, run:
data = LOAD '/user/test/text.txt' as (text:CHARARRAY) ;
DUMP data;
The script works fine.
But when I run the same script from Hue's Query Editor / Pig Editor in the browser, it gets stuck, and below is the log:
2015-09-14 14:07:06,847 [uber-SubtaskRunner] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://HadoopTestEnv:50030/jobdetails.jsp?jobid=job_1442214247855_0002
2015-09-14 14:07:06,884 [uber-SubtaskRunner] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-09-14 14:07:07,512 [communication thread] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Progress of TaskAttempt attempt_1442214247855_0001_m_000000_0 is : 1.0
Heart beat
2015-09-14 14:07:37,545 [communication thread] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Progress of TaskAttempt attempt_1442214247855_0001_m_000000_0 is : 1.0
Heart beat
2015-09-14 14:08:07,571 [communication thread] INFO org.apache.hadoop.mapred.TaskAttemptListenerImpl - Progress of TaskAttempt attempt_1442214247855_0001_m_000000_0 is : 1.0
Heart beat
I have used the yarn-utils script to help me configure yarn-site.xml and mapred-site.xml:
python yarn-utils.py -c 6 -m 16 -d 1 -k True
Using cores=4 memory=16GB disks=1 hbase=True
Profile: cores=6 memory=12288MB reserved=4GB usableMem=12GB disks=1
Num Container=3
Container Ram=4096MB
Used Ram=12GB
Unused Ram=4GB
yarn.scheduler.minimum-allocation-mb=4096
yarn.scheduler.maximum-allocation-mb=12288
yarn.nodemanager.resource.memory-mb=12288
mapreduce.map.memory.mb=2048
mapreduce.map.java.opts=-Xmx1638m
mapreduce.reduce.memory.mb=4096
mapreduce.reduce.java.opts=-Xmx3276m
yarn.app.mapreduce.am.resource.mb=2048
yarn.app.mapreduce.am.command-opts=-Xmx1638m
mapreduce.task.io.sort.mb=819
The script still hangs and heartbeats forever. Can anyone please help me?
Here is my config:
yarn-site.xml
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>12288</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>6</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>12288</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx1638m</value>
</property>
mapred-site.xml
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx768m</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1638m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx3276m</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>819</value>
</property>
<property>
<name>mapreduce.map.cpu.vcores</name>
<value>2</value>
</property>
<property>
<name>mapreduce.reduce.cpu.vcores</name>
<value>2</value>
</property>
The Pig app in Hue submits an Oozie job, which uses one extra MR slot in addition to what the script itself needs.
The hang is usually due to a submission deadlock (like gotcha #5) or to having only one available task slot in your cluster.
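One way to read the second point against your configuration: with yarn.scheduler.minimum-allocation-mb at 4096 on a 12288 MB NodeManager, at most three containers can exist at once, so the Oozie launcher, the Pig job's ApplicationMaster and its tasks can easily starve each other. A smaller minimum (an illustrative value, not a tuned one) leaves more slots free:
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>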
I am trying to test my Hadoop installation by running a wordcount job. My problem is that the job gets stuck in the ACCEPTED state and seems to run forever. I am using Hadoop 2.3.0 and tried to fix the problem by following an answer to this question here, but it didn't work for me.
This is what I have:
C:\hadoop-2.3.0>yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar wordcount /data/test.txt /data/output
15/03/15 15:36:07 INFO client.RMProxy: Connecting to ResourceManager at/0.0.0.0:8032
15/03/15 15:36:09 INFO input.FileInputFormat: Total input paths to process : 1
15/03/15 15:36:10 INFO mapreduce.JobSubmitter: number of splits:1
15/03/15 15:36:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1426430101974_0001
15/03/15 15:36:11 INFO impl.YarnClientImpl: Submitted application application_1426430101974_0001
15/03/15 15:36:11 INFO mapreduce.Job: The url to track the job: http://Agata-PC:8088/proxy/application_1426430101974_0001/
15/03/15 15:36:11 INFO mapreduce.Job: Running job: job_1426430101974_0001
This is my mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>127.0.0.1:9001</value>
</property>
<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>/user</value>
</property>
<property>
<name>mapreduce.history.server.http.address</name>
<value>127.0.0.1:51111</value>
<description>Http address of the history server</description>
<final>false</final>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx768m</value>
</property>
<property>
<name>mapreduce.map.cpu.vcores</name>
<value>1</value>
<description>The number of virtual cores required for each map task.</description>
</property>
<property>
<name>mapreduce.reduce.cpu.vcores</name>
<value>1</value>
<description>The number of virtual cores required for each reduce task.</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
<description>Larger resource limit for maps.</description>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx768m</value>
<description>Heap-size for child jvms of maps.</description>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1024</value>
<description>Larger resource limit for reduces.</description>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx768m</value>
<description>Heap-size for child jvms of reduces.</description>
</property>
</configuration>
And this is my yarn-site.xml:
<configuration>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
<description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>
</configuration>
Any help is much appreciated.
Did you try restarting your Hadoop processes or the cluster? There might be some jobs still running.
It may also help to look at the logs by following the job's tracking URL or by going through the Hadoop web UI.
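The same information is available from the command line as well; for example (the id is the one from your output, and yarn logs only shows container logs once an attempt has actually started and log aggregation is enabled):
yarn application -status application_1426430101974_0001
yarn logs -applicationId application_1426430101974_0001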
Cheers.
I have run into a similar issue before; you might have an infinite loop in the mapper or the reducer. Check whether your reducer is handling the Iterable properly.