Data ingestion with Flume & Hadoop doesn't work

I'm using Flume 1.4.0 and Hadoop 2.2.0.
When I start Flume and write to HDFS, I get the following exception:
(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2442)
at java.lang.Class.privateGetPublicMethods(Class.java:2562)
at java.lang.Class.privateGetPublicMethods(Class.java:2572)
at java.lang.Class.getMethods(Class.java:1427)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
at java.lang.reflect.Proxy.getProxyClass(Proxy.java:521)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:601)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:207)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:53)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:172)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:364)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
The hdfs-sink part of my flume.conf looks like this:
# Define a sink that outputs to HDFS
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.sinks.hdfs-sink.hdfs.rollCount = 10
agent.sinks.hdfs-sink.hdfs.batchSize = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0
I hope someone can help me.

If somebody has the same problem, here's the solution:
Replace all of the older JARs in the flume/lib directory by copying the newer ones over from your Hadoop installation.
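As a rough sketch of what that swap can look like on a default Flume 1.4.0 / Hadoop 2.2.0 layout (the exact filenames and paths below are assumptions; check your own lib directories): the getUnknownFields() VerifyError typically comes from Flume's bundled protobuf 2.4.x clashing with the protobuf 2.5 that Hadoop 2.2.0 is compiled against.
# Move the stale JARs that Flume 1.4.0 ships aside, then copy in Hadoop's newer ones.
cd $FLUME_HOME/lib
mkdir -p backup
mv protobuf-java-2.4.1.jar guava-10.0.1.jar backup/
cp $HADOOP_HOME/share/hadoop/common/lib/protobuf-java-2.5.0.jar .
cp $HADOOP_HOME/share/hadoop/common/lib/guava-11.0.2.jar .
Restart the Flume agent afterwards so the new classpath takes effect.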

Related

Can't connect to Bigtable using ConnectionFactory

I am using custom code to load HFiles directly into Bigtable via the following APIs:
LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(inputPath), admin, table, regionLocator);
For the connection, I am using the following ConnectionFactory call:
Connection connection = ConnectionFactory.createConnection(conf, null, null);
which always passes the managed parameter as false, but I am still getting the following error when running it:
java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:431)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:424)
at org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal(ConnectionManager.java:302)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:235)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.initialize(LoadIncrementalHFiles.java:154)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.<init>(LoadIncrementalHFiles.java:144)
at com.example.bigtable.sample.BigtableLoader.main(BigtableLoader.java:122)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at com.example.bigtable.sample.BigtableDriver.main(BigtableDriver.java:18)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:424)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunJarShim.main(HadoopRunJarShim.java:12)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 22 more
Caused by: java.lang.IllegalArgumentException: Bigtable does not support managed connections.
at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:130)
at com.google.cloud.bigtable.hbase1_2.BigtableConnection.<init>(BigtableConnection.java:55)
... 27 more
Cloud Bigtable does not support LoadIncrementalHFiles. If you want to load data, your best bet is to use these instructions.

ClassNotFoundException with no class name

I added a UDF in a jar and tried the LOAD. Here is my snippet:
register 'target/warcbase-0.1.0-SNAPSHOT-fatjar.jar';
DEFINE WarcLoader org.warcbase.pig.WarcLoader();
warc = LOAD '/raw/' USING WarcLoader AS (url: chararray, date: chararray, mime: chararray, content: bytearray);
STORE warc INTO '/raw/proc/';
I got the following exception. Unfortunately, it does not tell me which class was not found. Here is the entire stack trace:
Backend error message
---------------------
Error: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
... 10 more
Pig Stack Trace
---------------
ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:819)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:624)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
Please help me with how to proceed from here.
Make sure that your UDF jar is on your Pig CLASSPATH.
Add the following environment variable:
export PIG_CLASSPATH=/to/jar/location:$PIG_HOME/pig-0.15.0-h2.jar:$PIG_CLASSPATH
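If the error persists, it is worth confirming that the UDF class actually made it into the jar you registered; a quick check (using the jar and class names from the question):
# List the fat jar's contents and look for the loader class.
jar tf target/warcbase-0.1.0-SNAPSHOT-fatjar.jar | grep WarcLoader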

Hadoop: Cascading FlowException

I installed Hadoop 1.0.4 and Hive 0.12. When I run the Cascading Pattern example on this setup, it gives a Cascading flow exception. I run it with the following hadoop command:
hadoop jar build/libs/pattern-example*.jar
I get the exception mentioned above; for reference, I include my Cascading code:
Tap inputTap = new Hfs(new TextDelimited(true, "\t"),
    "hdfs://hdmaster:54310/user/hive/warehouse/temp/Dataformated/finalformated");
String classifyPath = outputPath;                     // placeholder for the output path
String hdfsPath = classifyPath + "/" + pmmlFileName;  // placeholder for the PMML file name
Tap classifyTap = new Hfs(new TextDelimited(true, "\t"), hdfsPath);
String formatLocalHdfsData = hdfsPath;
FlowDef flowDef = FlowDef.flowDef().setName("classify")
    .addSource("input", inputTap)   // input is an Lfs or Hfs tap
    .addSink("classify", classifyTap);
flowDef.addAssemblyPlanner(pmmlPlanner);
Flow classifyFlow = flowConnector.connect(flowDef);
classifyFlow.writeDOT("dot/classify.dot");
classifyFlow.complete();
Cascading Flow Exception
Exception in thread "main" cascading.flow.FlowException: step failed: (1/1) ...eg_Nocoerce20150513093050, with job id: job_201505130921_0003, please see cluster logs for failure messages
at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:221)
at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:149)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:124)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:43)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Log file exception:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: cascading.flow.FlowException: internal error during mapper configuration
at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:99)
... 14 more
Caused by: java.io.InvalidClassException: cascading.tap.hadoop.Hfs; local class incompatible: stream classdesc serialVersionUID = -2723557385578774808, local class serialVersionUID = -4246440312226820384
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:560)
Please help me solve this issue.
I resolved the issue: the log file showed a serialVersionUID compatibility problem. I regenerated the serialVersionUID and it worked.
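A mismatch like this usually means the job jar was built against a different Cascading version than the one on the cluster. One way to confirm before rebuilding is the JDK's serialver tool (the jar paths below are assumptions; point them at your job jar and the cluster's Cascading jar):
# Print the serialVersionUID each jar yields for cascading.tap.hadoop.Hfs;
# if the two numbers differ, the Cascading versions are out of sync.
serialver -classpath build/libs/pattern-example.jar cascading.tap.hadoop.Hfs
serialver -classpath /path/to/cluster/cascading-hadoop.jar cascading.tap.hadoop.Hfs
Note that serialver needs the class's dependencies to load, so you may have to add further Cascading/Hadoop jars to the -classpath.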

Storm 0.9.3 cluster on a local machine

ZooKeeper is installed and running successfully, but Storm nimbus will not run; it throws an exception like the one below:
Exception in thread "main" java.lang.ExceptionInInitializerError
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at backtype.storm.config$loading__4910__auto__.invoke(config.clj:17)
at backtype.storm.config__init.load(Unknown Source)
at backtype.storm.config__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at clojure.lang.RT.loadClassForName(RT.java:2098)
at clojure.lang.RT.load(RT.java:430)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5018.invoke(core.clj:5530)
at clojure.core$load.doInvoke(core.clj:5529)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$load_one.invoke(core.clj:5336)
at clojure.core$load_lib$fn__4967.invoke(core.clj:5375)
at clojure.core$load_lib.doInvoke(core.clj:5374)
at clojure.lang.RestFn.applyTo(RestFn.java:142)
at clojure.core$apply.invoke(core.clj:619)
at clojure.core$load_libs.doInvoke(core.clj:5417)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:621)
at clojure.core$use.doInvoke(core.clj:5507)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at backtype.storm.command.config_value$loading__4910__auto__.invoke(config_value.clj:16)
at backtype.storm.command.config_value__init.load(Unknown Source)
at backtype.storm.command.config_value__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at clojure.lang.RT.loadClassForName(RT.java:2098)
at clojure.lang.RT.load(RT.java:430)
at clojure.lang.RT.load(RT.java:411)
at clojure.core$load$fn__5018.invoke(core.clj:5530)
at clojure.core$load.doInvoke(core.clj:5529)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.lang.Var.invoke(Var.java:415)
at backtype.storm.command.config_value.<clinit>(Unknown Source)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
at backtype.storm.utils.Utils.findAndReadConfigFile(Utils.java:141)
at backtype.storm.utils.Utils.readStormConfig(Utils.java:188)
at backtype.storm.utils.Utils.<clinit>(Utils.java:71)
... 36 more
storm.yaml:
storm.zookeeper.servers:"localhost"
nimbus.host:"localhost"
Make the servers entry a two-liner in your storm.yaml, like below (note the space after each colon and the list syntax for the servers):
storm.zookeeper.servers:
  - "localhost"
nimbus.host: "localhost"
And read up on how YAML syntax works here.

Including a third-party JAR in Hadoop 1.2.1

I have a custom (Mahout) class that I am trying to run via Hadoop. I have followed the instructions here (as well as one comment suggesting putting the jars in HDFS), but I keep getting the following error:
ClassNotFoundException: org.apache.mahout.common.AbstractJob
My setup is as follows:
REP_ROOT=/home/user/.m2/repository
MAHOUT_CORE=$REP_ROOT/org/apache/mahout/mahout-core/0.9/mahout-core-0.9.jar
MAHOUT_CLI=$REP_ROOT/org/apache/mahout/commons/commons-cli/2.0-mahout/commons-cli-2.0-mahout.jar
MAHOUT_INTEGRATION=$REP_ROOT/org/apache/mahout/mahout-integration/0.9/mahout-integration-0.9.jar
LANG3=$REP_ROOT/org/apache/commons/commons-lang3/3.1/commons-lang3-3.1.jar
GOOGLE_LIST=$REP_ROOT/com/google/guava/guava/14.0.1/guava-14.0.1.jar
export LIBJARS=$MAHOUT_CORE,$MAHOUT_CLI,$MAHOUT_INTEGRATION,$LANG3,$GOOGLE_LIST
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$MAHOUT_CORE:$MAHOUT_CLI:$MAHOUT_INTEGRATION:$LANG3:$GOOGLE_LIST
hadoop jar /home/user/bin/CustomJar.jar com.company.Mahout.LDAJob \
-libjars ${LIBJARS} \
-DbaseLocation=/user/hadoop/test \
-Dinput=input \
-Doutput=output
The missing class (AbstractJob) is included in MAHOUT_CORE. LDAJob starts as:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.common.AbstractJob;

public class LDAJob extends AbstractJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        ToolRunner.run(conf, new LDAJob(), args);
    }

    public int run(String[] args) throws Exception {
        Configuration conf = super.getConf();
        String baseFileLocation = conf.get("baseLocation");
        // ...
    }
}
Details of the error:
14/02/26 01:29:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/26 01:29:44 WARN snappy.LoadSnappy: Snappy native library not loaded
14/02/26 01:29:47 INFO mapred.JobClient: Running job: job_201402260118_0004
14/02/26 01:29:48 INFO mapred.JobClient: map 0% reduce 0%
14/02/26 01:29:53 INFO mapred.JobClient: Task Id : attempt_201402260118_0004_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:488)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
... 10 more
Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:61)
... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.common.AbstractJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 28 more
Any help would be much appreciated.
