Hive 2.1.0 Issue org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source - hadoop

I am using hadoop 2.7.2 , hive 2.1.0 have enabled tez and but reverted back to use mr as execution engine and facing below error while trying to run a Insert Select query on orc and parquet tables,
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://.../user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18/.hive-staging_hive_2016-10-20_14-46-33_718_5449118624054166228-1/-ext-10000 to destination hdfs://..../user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2959)
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1532)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1461)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:497)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: rename for src path: hdfs:.....:9000/user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18/.hive-staging_hive_2016-10-20_14-46-33_718_5449118624054166228-1/-ext-10000/000000_0 to dest:hdfs://.....:9000/user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18 returned false
Any clue about the issue, would be highly appriciated.
Regards.

Related

Pig script running on MapReduce but not on Tez

I am using version of Pig(0.16.0) and Tez version is 0.9.0. The pig script is running fine on MapReduce, but not with Tez. I had tried change tez-0.8.(3-5) still not work. Can this be a version mismatch problem? Please have a look at the logs:
ERROR 2017: Internal error creating job configuration.
org.apache.pig.backend.hadoop.executionengine.JobCreationException: ERROR 2017: Internal error creating job configuration.
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.getJob(TezJobCompiler.java:137)
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.compile(TezJobCompiler.java:78)
at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.launchPig(TezLauncher.java:198)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:308)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1474)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1459)
at org.apache.pig.PigServer.execute(PigServer.java:1448)
at org.apache.pig.PigServer.executeBatch(PigServer.java:488)
at org.apache.pig.PigServer.executeBatch(PigServer.java:471)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:501)
at org.apache.pig.Main.main(Main.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.NoSuchMethodException: org.apache.tez.dag.api.DAG.setCallerContext(org.apache.tez.client.CallerContext)
at java.lang.Class.getMethod(Class.java:1786)
at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.getJob(TezJobCompiler.java:128)
... 20 more
================================================================================

Gobblin Kafka to HDFS pull job error

I'm trying to pull data from Kafka to HDFS using Gobblin.
Gobblin version (compiled from github source code with command sudo ./gradlew clean build -PuseHadoop2 -PhadoopVersion=2.7.1 -x test):
0.6.2-546-g431188b
Hadoop version:
Hadoop 2.7.1.2.4.2.0-258
Subversion git#github.com:hortonworks/hadoop.git -r 13debf893a605e8a88df18a7d8d214f571e05289
Compiled by jenkins on 2016-04-24T16:02Z
Compiled with protoc 2.5.0
From source with checksum 2a2d95f05ec6c3ac547ed58cab713ac
This command was run using /usr/hdp/2.4.2.0-258/hadoop/hadoop-common-2.7.1.2.4.2.0-258.jar
Gobblin job:
job.name=GobblinKafkaQuickStart
job.group=GobblinKafka
job.description=Gobblin quick start job for Kafka
job.lock.enabled=false
job.schedule=0 0/2 * * * ?
kafka.brokers=hd-mgt03:6667,hd-mgt02:6667,hd-mgt04:6667
source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
extract.namespace=gobblin.extract.kafka
writer.builder.class=gobblin.writer.AvroHdfsDataWriter
writer.file.path.type=tablename
writer.destination.type=HDFS
writer.output.format=AVRO
data.publisher.type=gobblin.publisher.BaseDataPublisher
mr.job.max.mappers=1
metrics.reporting.file.enabled=true
metrics.log.dir=/gobblin-kafka/metrics
metrics.reporting.file.suffix=txt
bootstrap.with.offset=earliest
fs.uri=hdfs://hdfs:8020
writer.fs.uri=hdfs://hdfs:8020
state.store.fs.uri=hdfs://hdfs:8020
mr.job.root.dir=/kafka/working
state.store.dir=/kafka/state-store
task.data.root.dir=/kafka/task-data
data.publisher.final.dir=/kafka/job-output
I'm trying to run gobblin-mapreduce.sh from gobblin-dist/bin folder, but getting error:
Exception in thread "main" gobblin.runtime.JobException: Job job_GobblinKafkaQuickStart_1464962113982 failed
at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:363)
at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:84)
at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:61)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Log file contains error:
2016-06-03 16:55:17 MSK ERROR [main] gobblin.runtime.AbstractJobLauncher 321 - Failed to launch and run job job_GobblinKafkaQuickStart_1464962113982: java.lang.NoSuchFieldError: DEFAULT_MR_AM_ADMIN_USER_ENV
java.lang.NoSuchFieldError: DEFAULT_MR_AM_ADMIN_USER_ENV
at org.apache.hadoop.mapred.YARNRunner.createApplicationSubmissionContext(YARNRunner.java:470)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:285)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at gobblin.runtime.mapreduce.MRJobLauncher.runWorkUnits(MRJobLauncher.java:198)
at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:296)
at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:84)
at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:61)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
What could be the reason for this error? How can I fix it?
From you error I can tell it might be the problem of JAR.
Usually, this error (java.lang.NoSuchFieldError: DEFAULT_MR_AM_ADMIN_USER_ENV) is caused by jar conflicts. You can check your class path to see if there are any version conflicts.

Phoenix Error while trying to create table

When I either log in to sqlline.py in phoenix or I try to create table in phoenix through api's I get an exception.
With my limited knowledge of phenix I am unable to figure out why phoenix is checking for System.Catalog table even before it creates it.Any help will be greatly appreciated.
StackTrace:
*4/11/18 06:07:18 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: SYSTEM.CATALOG, row=SYSTEM.CATALOG,\x00SYSTEM\x00CATALOG,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:151)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1059)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1121)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:914)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1053)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1156)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:422)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:226)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:908)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1351)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:112)
at sqlline.SqlLine$DatabaseConnection.connect(SqlLine.java:4650)
at sqlline.SqlLine$DatabaseConnection.getConnection(SqlLine.java:4701)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3942)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3851)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.SqlLine$ReflectiveCommandHandler.execute(SqlLine.java:2810)
at sqlline.SqlLine.dispatch(SqlLine.java:817)
at sqlline.SqlLine.initArgs(SqlLine.java:633)
at sqlline.SqlLine.begin(SqlLine.java:680)
at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
at sqlline.SqlLine.main(SqlLine.java:424)
Error: SYSTEM.CATALOG (state=08000,code=101)
org.apache.phoenix.exception.PhoenixIOException: SYSTEM.CATALOG
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:97)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:935)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1053)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1156)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:422)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:226)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:908)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1351)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:112)
at sqlline.SqlLine$DatabaseConnection.connect(SqlLine.java:4650)
at sqlline.SqlLine$DatabaseConnection.getConnection(SqlLine.java:4701)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3942)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3851)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.SqlLine$ReflectiveCommandHandler.execute(SqlLine.java:2810)
at sqlline.SqlLine.dispatch(SqlLine.java:817)
at sqlline.SqlLine.initArgs(SqlLine.java:633)
at sqlline.SqlLine.begin(SqlLine.java:680)
at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
at sqlline.SqlLine.main(SqlLine.java:424)
Caused by: org.apache.hadoop.hbase.TableNotFoundException: SYSTEM.CATALOG
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1139)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:914)
... 23 more*
Check your hdfs://.../hbase/data/default/ is exist SYSTEM.CATALOG ?
if there is nothing, you must try to use bin/hbase clean --cleanZk
before you use the command, you must stop hbase Master and regionServers,but still keep ZK alive.

Hive and Cassandra integration using CqlStorageHandler

I referred this git project for integrating cassandra data using hive table.I copied the appropriate cassandra jars into the hive lib folder.But while running the query against cassandra I'm getting the following error.Please help me to resolve it.
https://github.com/milliondreams/hive/tree/cas-support-cql/cassandra-handler
hive> CREATE EXTERNAL TABLE messages(row_key string, col1 string, col2 string)
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES("cql.primarykey" = "row_key")
TBLPROPERTIES ("cassandra.ks.name" = "mycqlks", "cassandra.ks.stratOptions"="'DC':1, 'DC2':1",
"cassandra.ks.strategy"="NetworkTopologyStrategy");
java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema(Lorg/apache/hadoop/hive/metastore/api/Table;)Ljava/util/Properties;
at org.apache.hadoop.hive.cassandra.cql.CqlManager.createColumnFamily(CqlManager.java:238)
at org.apache.hadoop.hive.cassandra.cql.CqlManager.createCFIfNotFound(CqlManager.java:189)
at org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler.preCreateTable(CqlStorageHandler.java:247)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:462)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:455)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at com.sun.proxy.$Proxy11.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3776)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask
Which version of hive are you using?
It has to be hive 0.9 as per https://github.com/milliondreams/hive/tree/cas-support-cql/cassandra-handler
I think you are using version >= 0.11.0
Version 0.9.0: http://svn.apache.org/repos/asf/hive/tags/release-0.9.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Version 0.10.0: http://svn.apache.org/repos/asf/hive/tags/release-0.10.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Version 0.11.0: http://svn.apache.org/repos/asf/hive/tags/release-0.11.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Single argument method - org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema is missing in 0.11.0

Trying to load indexed LZO file using LzoPigStorage and elephant-bird

I've got a log file with default LZO compression and an .index file generated using Hadoop-LZO, but when I run a simple Pig file to retrieve the top 100 records using LzoPigStorage, I get the following exception:
Message: Unexpected System Error Occured: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:130)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:191)
at java.lang.Thread.run(Thread.java:724)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
... 3 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at com.twitter.elephantbird.mapreduce.input.LzoInputFormat.listStatus(LzoInputFormat.java:55)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:269)
at com.twitter.elephantbird.mapreduce.input.LzoInputFormat.getSplits(LzoInputFormat.java:111)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:452)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:469)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
I am running Hadoop 2.0, Pig 0.11, and elephant-bird 2.2.3
I don't use elephant-bird, so I'm not entirely sure that this is the problem.
But looking at their build for v2.2.3, it was compiled against hadoop-0.20.2 and pig-0.9.2. I've seen problems with UDFs in Pig when running on a newer version than what the UDF was compiled against.
Are you able to upgrade elephant-bird to a newer version or recompile against the correct libraries?

Resources