Flume Exception in thread "main" java.lang.OutOfMemoryError: Java heap space - hadoop

arun@arun-admin:/usr/lib/apache-flume-1.6.0-bin/bin$ ./flume-ng agent --conf ./conf/ -f /usr/lib/apache-flume-1.6.0properties -Dflume.root.logger=DEBUG,console -n agent
Info: Including Hadoop libraries found via (/usr/share/hadoop/bin/hadoop) for HDFS access
Info: Excluding /usr/share/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
Info: Excluding /usr/share/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
Info: Including Hive libraries found via (/usr/lib/apache-hive-3.1.2-bin) for Hive access
+ exec /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx20m -Dflume.root.logger=DEBUG,console -cp './conf/:/usr/lib/apache-flume-1.6.0-bin/lib/:/usr/share/hadoop/etc/hadoop:/usr/share/hadoop/share/hadoop/common/lib/activation-1.1.jar:/usr/share/hadoop/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/share/hadoop/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/share/hadoop/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/share/hadoop/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/share/hadoop/share/hadoop/common/lib/asm-3.2.jar:/usr/share/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-collections-3.2.2.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-io-2.4.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/share/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/usr/share/hadoop/share/hadoop/common/lib/curator-client-2.7.1.jar:/usr/share/hadoop/share/hadoop/common/lib/curator-framework-2.7.1.jar:/usr/share/hadoop/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/usr/share/hadoop/share/hadoop/common/lib/gson-2.2.4.jar:/usr/share/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/usr/share/hadoop/share/hadoop/common/lib/hadoop-annotations-2.7.3.jar:/usr/share/hadoop/share/hadoop/common/li
b/hadoop-auth-2.7.3.jar:/usr/share/hadoop/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/share/hadoop/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/usr/share/hadoop/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/share/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/share/hadoop/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/share/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/share/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/share/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/share/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/share/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/share/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/share/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/share/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/share/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/usr/share/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/share/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/share/hadoop/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/share/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/share/hadoop/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/share/hadoop/share/hadoop/common/lib/junit-4.11.jar:/usr/share/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/share/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/share/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/share/hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/usr/share/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/share/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/share/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/share/hadoop/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/share/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/share/hadoop/share/hado
op/common/lib/xz-1.0.jar:/usr/share/hadoop/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/share/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar:/usr/share/hadoop/share/hadoop/common/hadoop-common-2.7.3-tests.jar:/usr/share/hadoop/share/hadoop/common/hadoop-nfs-2.7.3.jar:/usr/share/hadoop/share/hadoop/common/jdiff:/usr/share/hadoop/share/hadoop/common/lib:/usr/share/hadoop/share/hadoop/common/sources:/usr/share/hadoop/share/hadoop/common/templates:/usr/share/hadoop/share/hadoop/hdfs:/usr/share/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/usr/share/hadoop/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/usr/share/hadoop/shar
e/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/share/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.3.jar:/usr/share/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.3-tests.jar:/usr/share/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.3.jar:/usr/share/hadoop/share/hadoop/hdfs/jdiff:/usr/share/hadoop/share/hadoop/hdfs/lib:/usr/share/hadoop/share/hadoop/hdfs/sources:/usr/share/hadoop/share/hadoop/hdfs/templates:/usr/share/hadoop/share/hadoop/hdfs/webapps:/usr/share/hadoop/share/hadoop/yarn/lib/activation-1.1.jar:/usr/share/hadoop/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/asm-3.2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/share/hadoop/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/share/hadoop/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/guice-3.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/usr/share/hadoop/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/share/hadoop/share/had
oop/yarn/lib/jettison-1.1.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/share/hadoop/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/share/hadoop/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/share/hadoop/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/share/hadoop/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/share/hadoop/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/share/hadoop/share/hadoop/yarn/lib/xz-1.0.jar:/usr/share/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/usr/share/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-api-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-client-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-registry-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.3.jar:/usr/share/hadoop/share/hadoop/yarn/lib:/usr/share/hadoop/share/hadoop/yarn/sources:/usr/share/hadoop/share/hadoop/yarn/test:/usr/share/hadoop/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/share/had
oop/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/hadoop-annotations-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3-tests.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoo
p-mapreduce-client-shuffle-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar:/usr/share/hadoop/share/hadoop/mapreduce/lib:/usr/share/hadoop/share/hadoop/mapreduce/lib-examples:/usr/share/hadoop/share/hadoop/mapreduce/sources:/usr/share/hadoop/contrib/capacity-scheduler/.jar:/usr/lib/apache-hive-3.1.2-bin/lib/*' -Djava.library.path=:/ usr/share/hadoop/lib org.apache.flume.node.Application -f

The JVM is running out of heap space. The exec line shows that it was started with -Xmx20m, that is, a maximum heap of 20 MB, which is probably not enough for this agent. Raise the value, for example to -Xmx1000m, and see if that helps.
The right value depends on the volume of data that has to flow through the agent. If you cannot estimate that in advance, trial and error is the only option.

You can try increasing the heap size in your flume command by passing -Xmx512m. If you still get the same error, try raising it further, to -Xmx1000m.
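If you would rather not pass the flag on every run, the heap can also be set once in flume-env.sh. A minimal sketch, assuming the stock flume-ng launcher (which, as far as I know, starts from a default of -Xmx20m and picks up JAVA_OPTS from this file); the 512m figure is only an illustrative starting point:

```shell
# conf/flume-env.sh
# Assumption: flume-ng sources this file from the directory given to --conf.
# The sizes below are illustrative starting points, not measured requirements.
export JAVA_OPTS="-Xms100m -Xmx512m"
```

flume-ng only looks for this file in the directory passed to --conf, so that path has to point at the directory that really contains flume-env.sh. After restarting the agent, the "+ exec" line should show the new -Xmx value instead of -Xmx20m.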

Related

Oracle data export using Liquibase causing OOM

I am getting an exception while exporting data from an Oracle database using Liquibase.
I used the Maven command below:
**mvn -e -X liquibase:generateChangeLog -Dliquibase.diffTypes=data -DargLine= "-Xms10G -Xmx 20G -XX:-UseGCOverheadLimit"**
The last line I can see in the log before it fails is:
[DEBUG] Executing with the 'jdbc' executor
When I checked the process in Java VisualVM, I found that the liquibase.change.ColumnConfig class is consuming 80% of the heap, so this looks like a memory leak.
On top of this, when I checked the Java memory spaces, the Old Gen space is 100% utilized and never goes down, even after 50 minutes of execution.
I would like to know whether there is a way to export data for just a few specific tables; if so, please help with the Maven command. I tried the command below, but it did not work for me.
**mvn -e -X liquibase:generateChangeLog -Dliquibase.diffTypes=data,table -Dliquibase.includeObjects="table:myTable" -DargLine= "-Xms10G -Xmx 20G -XX:-UseGCOverheadLimit"**
The version of Liquibase I am using is 4.2.0
Any help is highly appreciated.

set spark vm options

I'm trying to build a spark application which uses zookeeper and kafka. Maven is being used for build. The project I'm trying to build is here. On executing:
mvn clean package exec:java -Dexec.mainClass="com.iot.video.app.spark.processor.VideoStreamProcessor"
It shows
ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 253427712 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
I tried adding spark.driver.memory 4g to spark-defaults.conf but I still get the error. How can I fix it?
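You can size your workers and pass them extra JVM options with dedicated spark-submit arguments. Set the executor heap with spark.executor.memory; the Spark configuration documentation says the maximum heap (-Xmx) must not be set through spark.executor.extraJavaOptions, so use that option only for other JVM flags:
spark-submit --conf 'spark.executor.memory=1g' \
--conf 'spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError'
Similarly, for your driver (useful if your application is submitted in cluster mode, or launched by spark-submit), set the heap with --driver-memory or spark.driver.memory rather than with -Xmx in spark.driver.extraJavaOptions:
--driver-memory 2g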
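The numbers in the error message can be reproduced by hand. As far as I can tell, Spark's unified memory manager reserves 300 MiB for itself and refuses to start unless the JVM has at least 1.5 times that. The sketch below redoes that arithmetic and converts the reported figure to MiB. Because the program in the question is started with mvn exec:java, it runs inside Maven's own JVM, so one plausible fix (an assumption, not something confirmed in this thread) is to raise that JVM's heap through MAVEN_OPTS:

```shell
# Minimum heap Spark asks for: 1.5 x the 300 MiB it reserves for itself.
reserved_bytes=$((300 * 1024 * 1024))
min_system_bytes=$((reserved_bytes * 3 / 2))
echo "required:  $min_system_bytes bytes"

# Heap the JVM actually had, copied from the error message, in MiB.
reported_bytes=253427712
reported_mib=$((reported_bytes / 1024 / 1024))
echo "available: $reported_mib MiB"

# Assumption: exec:java runs in the Maven JVM, so its heap comes from MAVEN_OPTS.
# The 1g figure is illustrative only.
export MAVEN_OPTS="-Xmx1g"
```

The first value matches the 471859200 in the error. With MAVEN_OPTS exported, rerunning the same mvn command should start the driver with the larger heap.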

JMETER : ERROR - jmeter.JMeter: Uncaught exception: java.lang.OutOfMemoryError: GC overhead limit exceeded

I am running a 6450-user test in a distributed environment on AWS Ubuntu machines.
I get the following error when the test reaches peak load:
ERROR - jmeter.JMeter: Uncaught exception: java.lang.OutOfMemoryError: GC overhead limit exceeded
Machine Details:
m4.4xlarge
HEAP="-Xms512m -Xmx20480m" (jmeter.sh file)
I allocated 20 GB for the heap in jmeter.sh.
But when I run the ps -eaf|grep java command, it gives the following response:
root 11493 11456 56 15:47 pts/9 00:00:03 java -server -XX:+HeapDumpOnOutOfMemoryError -Xms512m -Xmx512m -XX:MaxTenuringThreshold=2 -XX:PermSize=64m -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled -jar ./ApacheJMeter.jar
I have no idea what I need to change now.
Make the change in the jmeter file, not in jmeter.sh; the ps output shows that your setting is not being applied.
Also, with a heap this large you may need to add:
-XX:-UseGCOverheadLimit
and switch to the G1 garbage collector.
Also check that you follow these recommendations:
http://jmeter.apache.org/usermanual/best-practices.html
http://www.ubik-ingenierie.com/blog/jmeter_performance_tuning_tips/
First of all, the answer is in your question: you say that ps -eaf|grep java shows this:
-XX:+HeapDumpOnOutOfMemoryError -Xms512m -Xmx512m
That is, the heap is still very small. So either you changed jmeter.sh but start JMeter through some other shell script, or the change was not made in a valid way, so JMeter falls back to its defaults.
On top of that, I really doubt you can run 6450 users on one machine unless your script is very light. An untuned machine can usually handle 200-400 users, and a well-configured machine can probably handle up to 2000.
You need to amend the line in the jmeter file, not the jmeter.sh file. Locate the HEAP="-Xms512m -Xmx512m" line and update the Xmx value accordingly.
Also ensure that you start JMeter using the jmeter file.
If your environment explicitly relies on the jmeter.sh file, set the heap size a little differently, like:
export JVM_ARGS="-Xms512m -Xmx20480m" && ./jmeter.sh
or add the relevant line to the jmeter.sh file.
See the JMeter Best Practices and 9 Easy Solutions for a JMeter Load Test “Out of Memory” Failure articles for comprehensive information on tuning JMeter.
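Whichever file you edit, it is worth confirming that the running JVM actually received the new value. A minimal sketch that pulls the -Xmx setting out of a java command line such as the one ps prints; xmx_of is a hypothetical helper name, and taking the last occurrence assumes the JVM honors the last -Xmx it is given:

```shell
# xmx_of CMDLINE: print the value of the last -Xmx flag in CMDLINE
# (prints nothing if the flag is absent).
xmx_of() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^-Xmx//p' | tail -n 1
}

# Command line taken from the ps output in the question.
cmdline='java -server -XX:+HeapDumpOnOutOfMemoryError -Xms512m -Xmx512m -jar ./ApacheJMeter.jar'
echo "effective max heap: $(xmx_of "$cmdline")"
```

Against a live process you could feed it the output of ps -o args= -p followed by the JMeter process id. If it still reports 512m after your change, the script you edited is not the one launching JMeter.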

Error: Failed to create Data Storage while running embedded pig in java

I wrote a simple program to test the embedded pig in java to run in mapreduce mode.
The hadoop version in the server I am running is 0.20.2-cdh3u4a, and pig version is 0.10.0-cdh3u4a.
When I try to run in local mode, it runs successfully. But when I try to run in mapreduce mode, it gives me the error.
I run my program using the following commands as shown in http://pig.apache.org/docs/r0.9.1/cont.html#embed-java
javac -cp pig.jar EmbedPigTest.java
javac -cp pig.jar:.:/etc/hadoop/conf EmbedPigTest.java input.txt
My program gives error as:
Exception in thread "main" java.lang.RuntimeException: Failed to create DataStorage
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:214)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:134)
at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
at org.apache.pig.PigServer.<init>(PigServer.java:226)
at org.apache.pig.PigServer.<init>(PigServer.java:215)
at org.apache.pig.PigServer.<init>(PigServer.java:211)
at org.apache.pig.PigServer.<init>(PigServer.java:207)
at WordCount.main(EmbedPigTest.java:9)
Some online resources say that this problem is caused by a Hadoop version mismatch, but I don't understand what I should do about it. Any suggestions?
This is happening because you are linking against the wrong jar. The link below describes the issue very well.
http://localsteve.wordpress.com/2012/09/30/embedding-pig-for-cdh4-java-apps-fer-realz/
I faced the same kind of issue when I tried to use Pig in MapReduce mode without starting the Hadoop services.
Check with jps that all services are running before using Pig in MapReduce mode.

Error in Hadoop MapReduce

When I run a mapreduce program using Hadoop, I get the following error.
10/01/18 10:52:48 INFO mapred.JobClient: Task Id : attempt_201001181020_0002_m_000014_0, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
10/01/18 10:52:48 WARN mapred.JobClient: Error reading task outputhttp://ubuntu.ubuntu-domain:50060/tasklog?plaintext=true&taskid=attempt_201001181020_0002_m_000014_0&filter=stdout
10/01/18 10:52:48 WARN mapred.JobClient: Error reading task outputhttp://ubuntu.ubuntu-domain:50060/tasklog?plaintext=true&taskid=attempt_201001181020_0002_m_000014_0&filter=stderr
What is this error about?
One reason Hadoop produces this error is that the directory containing the log files holds too many entries. The ext3 filesystem allows at most 32000 links per inode, which caps the number of subdirectories a single directory can hold.
Check how full your logs directory, hadoop/userlogs, is.
A simple test for this problem is to try to create a directory from the command line, for example: $ mkdir hadoop/userlogs/testdir
If userlogs already holds too many directories, the OS will fail to create the directory and report that there are too many links.
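Instead of waiting for mkdir to fail, you can also count the subdirectories directly and compare the result with the ext3 limit. A minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find both do); count_subdirs is a hypothetical helper name and hadoop/userlogs is the path used above:

```shell
# count_subdirs DIR: print the number of directories directly under DIR.
count_subdirs() {
  find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Report the count for the task log directory, if it exists here.
logdir="hadoop/userlogs"
if [ -d "$logdir" ]; then
  echo "$logdir holds $(count_subdirs "$logdir") subdirectories"
fi
```

A count close to 32000 points to the link limit; clearing out old task log directories should then let new attempts start.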
I had the same issue when the disk holding the log directory ran out of space.
Another cause can be a JVM error: the task JVM asks for more heap than your machine can provide.
Sample code:
conf.set("mapred.child.java.opts", "-Xmx4096m");
Error message:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Solution: Replace the -Xmx value with an amount of memory that your machine can actually give the JVM (e.g. "-Xmx1024m").
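A rough way to pick a value that fits is to compare the requested heap with the machine's total memory first. A minimal sketch, assuming a Linux host with a /proc/meminfo-style file; mem_total_mb and heap_fits are hypothetical helper names, and total RAM is only an upper bound (a 32-bit JVM or a ulimit can impose a lower one):

```shell
# mem_total_mb FILE: print MemTotal from a /proc/meminfo-style file, in MiB.
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "$1"
}

# heap_fits REQUESTED_MB FILE: succeed if the requested heap is below total RAM.
heap_fits() {
  [ "$1" -lt "$(mem_total_mb "$2")" ]
}

# Check the 4096 MB from the sample code against this machine, if possible.
if [ -r /proc/meminfo ]; then
  if heap_fits 4096 /proc/meminfo; then
    echo "4096 MB is below total RAM"
  else
    echo "4096 MB exceeds total RAM; choose a smaller -Xmx"
  fi
fi
```

Keep in mind that every concurrently running task JVM gets its own heap, so the per-task value has to leave room for all of them.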
Increase your ulimit to unlimited. Alternatively, reduce the allocated memory.
If you create a runnable JAR file in Eclipse, it gives that error on a Hadoop system. You should extract the runnable part. That solved my problem.
