Including a Third-Party JAR in Hadoop 1.2.1

I have a custom (Mahout) class that I am trying to run via Hadoop. I have followed the instructions here (as well as a comment suggesting I put the jars in HDFS), but I keep getting the following error:
ClassNotFoundException: org.apache.mahout.common.AbstractJob
My setup is as follows:
REP_ROOT=/home/user/.m2/repository
MAHOUT_CORE=$REP_ROOT/org/apache/mahout/mahout-core/0.9/mahout-core-0.9.jar
MAHOUT_CLI=$REP_ROOT/org/apache/mahout/commons/commons-cli/2.0-mahout/commons-cli-2.0-mahout.jar
MAHOUT_INTEGRATION=$REP_ROOT/org/apache/mahout/mahout-integration/0.9/mahout-integration-0.9.jar
LANG3=$REP_ROOT/org/apache/commons/commons-lang3/3.1/commons-lang3-3.1.jar
GOOGLE_LIST=$REP_ROOT/com/google/guava/guava/14.0.1/guava-14.0.1.jar
export LIBJARS=$MAHOUT_CORE,$MAHOUT_CLI,$MAHOUT_INTEGRATION,$LANG3,$GOOGLE_LIST
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$MAHOUT_CORE:$MAHOUT_CLI:$MAHOUT_INTEGRATION:$LANG3:$GOOGLE_LIST
hadoop jar /home/user/bin/CustomJar.jar com.company.Mahout.LDAJob \
-libjars ${LIBJARS} \
-DbaseLocation=/user/hadoop/test \
-Dinput=input \
-Doutput=output
The missing class (AbstractJob) is included in MAHOUT_CORE. LDAJob starts as:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.common.AbstractJob;

public class LDAJob extends AbstractJob {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // ToolRunner parses the generic options (-libjars, -D...) into conf
        ToolRunner.run(conf, new LDAJob(), args);
    }

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration populated by ToolRunner
        Configuration conf = super.getConf();
        String baseFileLocation = conf.get("baseLocation");
        // .....
    }
}
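As a sanity check (a sketch using the variables and paths from the setup above; adjust to your environment), it is worth confirming that the class really is packaged in the jar being passed and that the input location exists, since -libjars only ships exactly what it is given:
# confirm AbstractJob is actually inside the jar referenced by $MAHOUT_CORE
jar tf "$MAHOUT_CORE" | grep AbstractJob
# confirm the HDFS base location the job reads from exists
hadoop fs -ls /user/hadoop/test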
Details of error
14/02/26 01:29:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/26 01:29:44 WARN snappy.LoadSnappy: Snappy native library not loaded
14/02/26 01:29:47 INFO mapred.JobClient: Running job: job_201402260118_0004
14/02/26 01:29:48 INFO mapred.JobClient: map 0% reduce 0%
14/02/26 01:29:53 INFO mapred.JobClient: Task Id : attempt_201402260118_0004_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:488)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
... 10 more
Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:61)
... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.common.AbstractJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 28 more
Any help would be much appreciated.

Related

Spark ThriftServer failed to start in secured mode

Configured Spark 1.4.1 and Hive 1.2.1 on a Hadoop 2.7.1 cluster secured with Kerberos. Started the external metastore with SASL disabled. I can do basic operations in HiveServer2 with Beeline.
When trying to start the Spark Thrift Server, I get an exception related to delegation tokens.
Command
spark-submit --class org.apache.spark.deploy.history.HistoryServer --master yarn-client C:\Spark\lib\spark-core_2.10-1.4.0.jar
Exception in Spark
15/07/28 16:07:31 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/07/28 16:07:31 INFO cluster.YarnClientSchedulerBackend: Stopped
15/07/28 16:07:31 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/07/28 16:07:31 ERROR spark.SparkContext: Error stopping SparkContext after init error.
java.lang.NullPointerException
at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:152)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1216)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1659)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:565)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:73)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" java.lang.RuntimeException: Unexpected exception
at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$obtainTokenForHiveMetastore(Client.scala:1162)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:263)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:561)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:115)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:73)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$obtainTokenForHiveMetastore(Client.scala:1142)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.NullPointerException)
at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2575)
... 23 more
Caused by: MetaException(message:java.lang.NullPointerException)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_delegation_token(ThriftHiveMetastore.java:3293)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_delegation_token(ThriftHiveMetastore.java:3279)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy23.getDelegationToken(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2572)
... 23 more
15/07/28 16:07:31 INFO storage.DiskBlockManager: Shutdown hook called
Exception in Hive Logs
2015-07-28 16:05:04,543 ERROR [pool-3-thread-5]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(155)) - MetaException(message:java.lang.NullPointerException)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5393)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_delegation_token(HiveMetaStore.java:5272)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy5.get_delegation_token(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:11488)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:11472)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.metastore.HiveMetaStore.getDelegationToken(HiveMetaStore.java:5784)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_delegation_token(HiveMetaStore.java:5256)
... 15 more
Please help me solve this.
This issue was fixed in Spark 1.5.0; see the JIRA issue below.
https://issues.apache.org/jira/browse/SPARK-5111

ClassNotFoundException with no class name

I added a UDF in a jar and tried to LOAD with it. The following is my snippet:
register 'target/warcbase-0.1.0-SNAPSHOT-fatjar.jar';
DEFINE WarcLoader org.warcbase.pig.WarcLoader();
warc = LOAD '/raw/' USING WarcLoader AS (url: chararray, date: chararray, mime: chararray, content: bytearray);
STORE warc INTO '/raw/proc/';
I got the following exception. Unfortunately, it does not tell me which class was not found. The entire stack trace follows:
Backend error message
---------------------
Error: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
... 10 more
Pig Stack Trace
---------------
ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:819)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:624)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
Please help me on how to proceed from here.
Make sure that your UDF jar is on your Pig CLASSPATH.
Add the following environment variable:
export PIG_CLASSPATH=/to/jar/location:$PIG_HOME/pig-0.15.0-h2.jar:$PIG_CLASSPATH
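For example (the jar path below is just a placeholder; point it at your actual fat jar), and, as far as I know, the pig.additional.jars property can also be used to hand the jar to Pig directly:
# hypothetical path -- use the real location of the fat jar
export PIG_CLASSPATH=/path/to/warcbase-0.1.0-SNAPSHOT-fatjar.jar:$PIG_CLASSPATH
# alternatively, pass the jar via pig.additional.jars (colon-separated list of jars)
pig -Dpig.additional.jars=/path/to/warcbase-0.1.0-SNAPSHOT-fatjar.jar yourscript.pig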

Hadoop: Cascading FlowException

I installed Hadoop 1.0.4 and Hive 0.12. When I run the Cascading Pattern example on this setup, it throws a Cascading flow exception. I run it with the following Hadoop command:
hadoop jar bulid/libs/pattern-example*.jar
and get the exception shown below. For reference, I include the Cascading code:
Tap inputTap = new Hfs(new TextDelimited(true, "\t"),
    "hdfs://hdmaster:54310/user/hive/warehouse/temp/Dataformated/finalformated");
String classifyPath = /* output path */;
String hdfsPath = classifyPath + "/" + /* PMML file name */;
Tap classifyTap = new Hfs(new TextDelimited(true, "\t"), hdfsPath);
String formatLocalHdfsData = classifyPath + "/" + /* PMML file name */;
FlowDef flowDef = FlowDef.flowDef().setName("classify")
    .addSource("input", inputTap)   // input is an LFS or HFS tap
    .addSink("classify", classifyTap);
flowDef.addAssemblyPlanner(pmmlPlanner);
Flow classifyFlow = flowConnector.connect(flowDef);
classifyFlow.writeDOT("dot/classify.dot");
classifyFlow.complete();
Cascading Flow Exception
Exception in thread "main" cascading.flow.FlowException: step failed: (1/1) ...eg_Nocoerce20150513093050, with job id: job_201505130921_0003, please see cluster logs for failure messages
at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:221)
at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:149)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:124)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:43)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Log File Exception
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: cascading.flow.FlowException: internal error during mapper configuration
at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:99)
... 14 more
Caused by: java.io.InvalidClassException: cascading.tap.hadoop.Hfs; local class incompatible: stream classdesc serialVersionUID = -2723557385578774808, local class serialVersionUID = -4246440312226820384
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:560)
Please help me to solve this issue.
I resolved the issue. The log file showed a serialVersionUID compatibility problem. I generated a new serialVersionUID and it worked.
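To see where the two sides disagree (a sketch; the jar paths below are assumptions for a typical setup), the JDK's serialver tool prints the serialVersionUID that a given classpath resolves for a class, so you can compare the Cascading jar you build against with the one deployed on the cluster:
# serialVersionUID as seen by the job jar you submit (path is an example)
serialver -classpath build/libs/pattern-example.jar cascading.tap.hadoop.Hfs
# serialVersionUID as seen by the Cascading jar installed on the cluster (path is an example)
serialver -classpath /usr/lib/cascading/cascading-hadoop.jar cascading.tap.hadoop.Hfs
# if the two values differ, the build and the cluster are on different Cascading versions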

Storm 0.9.3 cluster on a local machine

ZooKeeper is installed and running successfully, but Storm Nimbus will not start and throws exceptions like the following:
Exception in thread "main" java.lang.ExceptionInInitializerError
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at backtype.storm.config$loading__4910__auto__.invoke(config.clj:17)
at backtype.storm.config__init.load(Unknown Source)
at backtype.storm.config__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at clojure.lang.RT.loadClassForName(RT.java:2098)
at clojure.lang.RT.load(RT.java:430)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5018.invoke(core.clj:5530)
at clojure.core$load.doInvoke(core.clj:5529)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$load_one.invoke(core.clj:5336)
at clojure.core$load_lib$fn__4967.invoke(core.clj:5375)
at clojure.core$load_lib.doInvoke(core.clj:5374)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:619)
at clojure.core$load_libs.doInvoke(core.clj:5417)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:621)
at clojure.core$use.doInvoke(core.clj:5507)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at backtype.storm.command.config_value$loading__4910__auto__.invoke(config_value.clj:16)
at backtype.storm.command.config_value__init.load(Unknown Source)
at backtype.storm.command.config_value__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at clojure.lang.RT.loadClassForName(RT.java:2098)
at clojure.lang.RT.load(RT.java:430)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5018.invoke(core.clj:5530)
at clojure.core$load.doInvoke(core.clj:5529)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.lang.Var.invoke(Var.java:415)
at backtype.storm.command.config_value.<clinit>(Unknown Source)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
at backtype.storm.utils.Utils.findAndReadConfigFile(Utils.java:141)
at backtype.storm.utils.Utils.readStormConfig(Utils.java:188)
at backtype.storm.utils.Utils.<clinit>(Utils.java:71)
... 36 more
storm.yaml
storm.zookeeper.servers:"localhost"
nimbus.host:"localhost"
Make it two lines in your storm.yaml, like below (note the space after each colon and the list syntax for storm.zookeeper.servers):
storm.zookeeper.servers:
  - "localhost"
nimbus.host: "localhost"
And read about how YAML syntax is used here.

Data ingestion with Flume & Hadoop doesn't work

I'm using Flume 1.4.0 and Hadoop 2.2.0.
When I start Flume and write to HDFS, I get the following exception:
(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2442)
at java.lang.Class.privateGetPublicMethods(Class.java:2562)
at java.lang.Class.privateGetPublicMethods(Class.java:2572)
at java.lang.Class.getMethods(Class.java:1427)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
at java.lang.reflect.Proxy.getProxyClass(Proxy.java:521)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:601)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:207)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:53)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:172)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:364)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
The hdfs-sink part of my flume.conf looks like this:
# Define a sink that outputs to HDFS
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.sinks.hdfs-sink.hdfs.rollCount = 10
agent.sinks.hdfs-sink.hdfs.batchSize = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0
I hope someone can help me.
If somebody has the same problem, here's the solution:
Replace the older Hadoop-related jars in the flume/lib directory by copying in the newer ones from your Hadoop installation.
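For example (the directory layouts below are assumptions for default Hadoop 2.2.0 and Flume 1.4.0 installs), this particular VerifyError is typically the protobuf version clash: Flume 1.4.0 bundles an older protobuf-java, while Hadoop 2.2.0's HDFS client needs protobuf 2.5.0:
# see which potentially conflicting jars Flume bundles
ls $FLUME_HOME/lib | egrep 'protobuf|guava|hadoop'
# copy the newer protobuf from the Hadoop installation into Flume's lib directory
cp $HADOOP_HOME/share/hadoop/common/lib/protobuf-java-2.5.0.jar $FLUME_HOME/lib/
# move the old protobuf out of the way so only one version is on the classpath
mv $FLUME_HOME/lib/protobuf-java-2.4*.jar /tmp/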
