ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/python/google/common/collect/Lists - hadoop

I'm just getting started with Pig and I'm facing lots of issues with running my first program. Any help is much appreciated.
I've already tried the solutions from these questions:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
and
Pig Installation error: ERROR pig.Main: ERROR 2998: Unhandled internal error
but none of them seems to work. Could someone give a more detailed explanation of what needs to be done?
Pig version: 0.17.0
Stack Trace:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/python/google/common/collect/Lists
java.lang.NoClassDefFoundError: org/python/google/common/collect/Lists
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.getTaskReports(MRJobStats.java:533)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addMapReduceStatistics(MRJobStats.java:355)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:232)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:164)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:379)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
at org.apache.pig.PigServer.storeEx(PigServer.java:1119)
at org.apache.pig.PigServer.store(PigServer.java:1082)
at org.apache.pig.PigServer.openIterator(PigServer.java:995)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:782)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:630)
at org.apache.pig.Main.main(Main.java:175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.ClassNotFoundException: org.python.google.common.collect.Lists
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 24 more
================================================================================

I think this is a bug in some versions of Pig.
It seems that MRJobStats.java uses the org.python.google.common.collect.* classes instead of com.google.common.collect.* (the actual Guava package), and the former are not on the classpath.
The bug was fixed in this commit:
https://github.com/apache/pig/commit/6dd3ca4deb84edd9edd7765aa1d12f89a31b1283
in July 2017. Unfortunately, that is after the Pig 0.17.0 release you are using; see https://github.com/apache/pig/blob/trunk/CHANGES.txt.
So you will most likely need to check out and build Pig yourself. There is a link to the build instructions in the README.txt file on GitHub: https://github.com/apache/pig
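A rough sketch of that workflow, assuming git and Apache Ant are installed (the authoritative targets and flags are in the linked build instructions):
# clone the Pig sources; trunk contains the July 2017 fix
git clone https://github.com/apache/pig.git
cd pig
# build pig.jar with Ant; depending on your Hadoop setup you may need additional -D options
ant clean jar
# then point PIG_HOME and PATH at this freshly built tree instead of the 0.17.0 release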

Related

Unable to start elasticsearch 2.2.0

I have installed Elasticsearch 2.2.0 and it worked fine for a week, but now it doesn't start. I have set both JAVA_HOME and JRE_HOME, and I use Java 1.8 (i.e., jdk1.8.0_201 and jre1.8.0_202). When I try to start elasticsearch.bat, it terminates with an error message of:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/CommandLineParser
Likely root cause: java.lang.ClassNotFoundException: org.apache.commons.cli.CommandLineParser
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:241)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Refer to the log for complete error details.
But no logs have been generated.

Pig script running on MapReduce but not on Tez

I am using Pig 0.16.0, and the Tez version is 0.9.0. The Pig script runs fine on MapReduce, but not on Tez. I have also tried Tez 0.8.3 through 0.8.5, and it still doesn't work. Could this be a version mismatch problem? Please have a look at the logs:
ERROR 2017: Internal error creating job configuration.
org.apache.pig.backend.hadoop.executionengine.JobCreationException: ERROR 2017: Internal error creating job configuration.
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.getJob(TezJobCompiler.java:137)
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.compile(TezJobCompiler.java:78)
at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.launchPig(TezLauncher.java:198)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:308)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1474)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1459)
at org.apache.pig.PigServer.execute(PigServer.java:1448)
at org.apache.pig.PigServer.executeBatch(PigServer.java:488)
at org.apache.pig.PigServer.executeBatch(PigServer.java:471)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:501)
at org.apache.pig.Main.main(Main.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.NoSuchMethodException: org.apache.tez.dag.api.DAG.setCallerContext(org.apache.tez.client.CallerContext)
at java.lang.Class.getMethod(Class.java:1786)
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.getJob(TezJobCompiler.java:128)
... 20 more
================================================================================

EMR Hadoop Pig job error "Internal error creating job configuration"

I have a Pig job running on Amazon EMR, and suddenly it stopped working, giving the following error:
Pig Stack Trace
---------------
ERROR 2017: Internal error creating job configuration.
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:855)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:294)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
at org.apache.pig.PigServer.execute(PigServer.java:1239)
at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:479)
at org.apache.pig.Main.main(Main.java:159)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
Caused by: java.lang.NullPointerException
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.adjustNumReducers(JobControlCompiler.java:875)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:480)
... 17 more
================================================================================
Does anyone know why, or what the problem might be? This is one of the vaguest errors I have ever seen.
The problem actually turned out to be that Pig was unable to locate one of the input files to be processed, yet the error doesn't even remotely suggest a missing-file issue.
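To rule this kind of thing out quickly, here is a minimal sanity check as a sketch (the paths below are hypothetical placeholders, not from the original job): confirm that every path the script LOADs actually exists before launching Pig.
# list the expected inputs; a missing or misspelled path shows up here immediately
hadoop fs -ls s3://my-bucket/pig/input/
hadoop fs -ls /user/hadoop/pig/input/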

Hadoop 2.6.0: Basic error "starting MRAppMaster" after installing

I have just started to work with Hadoop 2.
After installing it with a basic configuration, I have not been able to run any of the examples. Has anyone seen this problem? Please help me.
The error is something like:
Error starting MRAppMaster
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
This is the log:
2015-01-06 11:56:23,194 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1420510526926_0002_000001
2015-01-06 11:56:23,362 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.security.Groups.<init>(Groups.java:70)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:299)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1473)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
... 7 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:49)
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:39)
... 12 more
2015-01-06 11:56:23,366 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1
Error messages: "Error starting MRAppMaster", "InvocationTargetException", "UnsatisfiedLinkError"
Root cause: failure to execute the native function "anchorNative" in the class "org.apache.hadoop.security.JniBasedUnixGroupsMapping".
Description: the function "anchorNative" calls into the native library "libhadoop.so". The path to this library is specified by these environment variables:
export JAVA_LIBRARY_PATH
export HADOOP_COMMON_LIB_NATIVE_DIR
In the Hadoop source code, you can print the Java library path with:
System.err.println(System.getProperty("java.library.path"));
# result
/home/maidinh/hadoop2/build/hadoop-2.6.0-src/hadoop-dist/target/hadoop-2.6.0/lib/native:
/usr/java/packages/lib/amd64:
/usr/lib64:
/lib64:
/lib:
/usr/lib
Different versions of "libhadoop.so" can be found in these locations, which creates a conflict.
Solution: keep the correct native library path (hadoop-2.6.0/lib/native) and delete every copy of "libhadoop.so" found in the other directories.
Note: delete all of Hadoop's related native libraries:
rm -r libhadoop*
rm -r libhdfs*
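Before running those rm commands, a short sketch for listing every stray copy first, using the directories from the java.library.path output above:
# show all libhadoop/libhdfs copies outside the Hadoop install; review the list before deleting
find /usr/java/packages/lib/amd64 /usr/lib64 /lib64 /lib /usr/lib \
     \( -name 'libhadoop*' -o -name 'libhdfs*' \) 2>/dev/null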

Installing a Spark Cluster, problems with Hive

I am trying to get a Spark/Shark cluster up but keep running into the same problem.
I have followed the instructions on https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster and addressed Hive as stated.
I think that the Shark Driver is picking up another version of Hadoop jars but am unsure why.
Here are the details, any help would be great.
Spark/Shark 0.9.0
Apache Hadoop 2.3.0
Amplabs Hive 0.11
Scala 2.10.3
Java 7
I have everything installed, but I get some deprecation warnings and then an exception:
14/03/14 11:24:47 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/03/14 11:24:47 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
Exception:
Exception in thread "main" org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1072)
at shark.memstore2.TableRecovery$.reloadRdds(TableRecovery.scala:49)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:275)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1139)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2288)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2299)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1070)
... 4 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1137)
... 9 more
Caused by: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
I had this same problem, and I think it's caused by incompatible versions of Hadoop/Hive and Spark/Shark.
You need to do one of the following:
Either remove hadoop-core-1.0.x.jar from shark/lib_managed/jars/org.apache.hadoop/hadoop-core/ (a command sketch is given below),
or, when building Shark, explicitly set SHARK_HADOOP_VERSION as follows:
cd shark;
SHARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0 ./sbt/sbt clean
SHARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0 ./sbt/sbt package
The second method solved other issues for me as well. You can also see this topic for more details: https://groups.google.com/forum/#!msg/shark-users/lTNPcxHJiOQ/EqzyByZrzQMJ
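For completeness, the first option as a command sketch (the exact jar version under lib_managed depends on your build, so the wildcard is an assumption):
# remove the bundled Hadoop 1.x core jar so Shark resolves your Hadoop 2.3.0 jars instead
rm shark/lib_managed/jars/org.apache.hadoop/hadoop-core/hadoop-core-1.0.*.jar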
