Hadoop: Cascading FlowException

I installed Hadoop 1.0.4 and Hive 0.12. When I run Cascading Pattern on this setup, it throws a Cascading FlowException. I run it with the following hadoop command:
hadoop jar bulid/libs/pattern-example*.jar
I get the exception mentioned above. For reference, I have included the Cascading code:
// Source tap: tab-delimited text with a header row, read from HDFS.
Tap inputTap = new Hfs(new TextDelimited(true, "\t"),
    "hdfs://hdmaster:54310/user/hive/warehouse/temp/Dataformated/finalformated");
// outputPath and pmmlFileName are placeholders for the real output path and PMML file name.
String classifyPath = outputPath;
hdfsPath = classifyPath + "/" + pmmlFileName;
// Sink tap: tab-delimited text with a header row, written under the output path.
Tap classifyTap = new Hfs(new TextDelimited(true, "\t"),
    classifyPath + "/" + pmmlFileName);
String formatLocalHdfsData = classifyPath + "/" + pmmlFileName;
FlowDef flowDef = FlowDef.flowDef().setName("classify")
    .addSource("input", inputTap) // input is Lfs or Hfs
    .addSink("classify", classifyTap);
flowDef.addAssemblyPlanner(pmmlPlanner);
Flow classifyFlow = flowConnector.connect(flowDef);
classifyFlow.writeDOT("dot/classify.dot");
classifyFlow.complete();
Cascading flow exception:
Exception in thread "main" cascading.flow.FlowException: step failed: (1/1) ...eg_Nocoerce20150513093050, with job id: job_201505130921_0003, please see cluster logs for failure messages
at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:221)
at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:149)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:124)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:43)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Log file exception:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: cascading.flow.FlowException: internal error during mapper configuration
at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:99)
... 14 more
Caused by: java.io.InvalidClassException: cascading.tap.hadoop.Hfs; local class incompatible: stream classdesc serialVersionUID = -2723557385578774808, local class serialVersionUID = -4246440312226820384
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:560)
Please help me solve this issue.

I resolved the issue. The log file showed a serialVersionUID compatibility problem (the InvalidClassException for cascading.tap.hadoop.Hfs). I generated a new serialVersionUID and it worked.
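For anyone hitting the same InvalidClassException: the message means the serialized Hfs object carried one serialVersionUID while the Hfs class loaded on the task side reported another. That is commonly a sign that the client and the cluster load different versions of the class, though the post above does not confirm the cause here. The sketch below is a minimal, stand-alone illustration of how Java reports the UID of a class; SerialUidDemo, Pinned and Derived are hypothetical names, not Cascading classes.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialUidDemo {
    /** Declares an explicit UID, so the value is fixed by the source code. */
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 1L;
        String path;
    }

    /** No explicit UID: the JVM derives one from the structure of the class. */
    static class Derived implements Serializable {
        String path;
    }

    /** Returns the serialVersionUID that Java serialization uses for cls. */
    static long uidOf(Class<?> cls) {
        // lookup() returns null for non-serializable classes; both demo classes are serializable.
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:  " + uidOf(Pinned.class)); // prints "Pinned:  1"
        System.out.println("Derived: " + uidOf(Derived.class)); // value depends on the class shape
    }
}
```

Running the same kind of check against cascading.tap.hadoop.Hfs on the client and on a cluster node, each with its own classpath, is one way to see whether the two sides really load different versions of the class.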

Related

Issue when writing to elasticsearch using es-hadoop

I'm getting this exception when I try to write to Elasticsearch from a MapReduce program using es-hadoop. I'm trying to write to index=employee and type=basic, both of which already exist in my Elasticsearch cluster.
My stack trace:
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: No resource ['es.resource'] (index/query/location) specified
at org.elasticsearch.hadoop.util.Assert.hasText(Assert.java:30)
at org.elasticsearch.hadoop.mr.EsOutputFormat.init(EsOutputFormat.java:257)
at org.elasticsearch.hadoop.mr.EsOutputFormat.checkOutputSpecs(EsOutputFormat.java:233)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at com.mstack.mapreduce.DIGDriver.main(DIGDriver.java:22)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Driver class:
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "es-hadoop");
job.setJarByClass(DIGDriver.class);
conf.set("es.nodes", "localhost:9200");
conf.set("es.port", "9200");
conf.set("es.resource", "employee/basic");
job.setNumReduceTasks(0);
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapperClass(DIGMapper.class);
job.setMapOutputValueClass(MapWritable.class);
conf.setBoolean("mapreduce.map.speculative", false);
conf.setBoolean("mapreduce.reduce.speculative", false);
boolean status = job.waitForCompletion(true);
if (status) {
    System.exit(0);
} else {
    System.out.println("Job Failed : Some error!");
    System.exit(1);
}
I resolved this myself by changing the configuration:
conf.set("es.nodes", "localhost");
conf.set("es.port", "9200");
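One more thing that may be relevant, separate from the change above: according to the Hadoop Javadoc, Job.getInstance(Configuration) makes a copy of the Configuration it is given. The driver in the question calls conf.set("es.resource", ...) after creating the job, so that value may never have reached the job, which would be consistent with the "No resource ['es.resource']" message. The stand-alone sketch below mimics that copy-on-construction behaviour with java.util.Properties; SnapshotJob is a hypothetical stand-in, not a Hadoop class.

```java
import java.util.Properties;

final class SnapshotDemo {
    /** Hypothetical stand-in for a job that copies its configuration when created. */
    static final class SnapshotJob {
        private final Properties conf = new Properties();

        SnapshotJob(Properties source) {
            conf.putAll(source); // snapshot: later changes to source are not seen
        }

        Properties getConfiguration() {
            return conf;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        SnapshotJob job = new SnapshotJob(conf);

        // Set on the original object after the job was created: the job does not see it.
        conf.setProperty("es.resource", "employee/basic");
        System.out.println(job.getConfiguration().getProperty("es.resource")); // prints "null"

        // Set on the job's own configuration: the job sees it.
        job.getConfiguration().setProperty("es.resource", "employee/basic");
        System.out.println(job.getConfiguration().getProperty("es.resource")); // prints "employee/basic"
    }
}
```

In a real driver the equivalent would be to set the es.* properties before calling Job.getInstance(conf), or to set them on job.getConfiguration().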

ClassNotFoundException with no class name

I added a UDF in a jar and tried to use it with LOAD. The following is my snippet:
register 'target/warcbase-0.1.0-SNAPSHOT-fatjar.jar';
DEFINE WarcLoader org.warcbase.pig.WarcLoader();
warc = LOAD '/raw/' USING WarcLoader AS (url: chararray, date: chararray, mime: chararray, content: bytearray);
STORE warc INTO '/raw/proc/';
I got the following exception. Unfortunately, it does not tell me which class was not found. The following is the entire stack trace:
Backend error message
---------------------
Error: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
... 10 more
Pig Stack Trace
---------------
ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:819)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:624)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.lang.ClassNotFoundException: Class not found
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:236)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:224)
Please help me on how to proceed from here.
Make sure that your UDF jar is on your Pig classpath (PIG_CLASSPATH).
Set the following environment variable:
export PIG_CLASSPATH=/to/jar/location:$PIG_HOME/pig-0.15.0-h2.jar:$PIG_CLASSPATH

org.apache.hadoop.net.StandardSocketFactory not found

configuration = new Configuration();
configuration.set("fs.default.name",NAME_NODE_URL);
hdfs = FileSystem.get(configuration);
I am getting the exception below when using the code above:
java.lang.RuntimeException: Socket Factory class not found: java.lang.ClassNotFoundException: Class org.apache.hadoop.net.StandardSocketFactory not found
at org.apache.hadoop.net.NetUtils.getSocketFactoryFromProperty(NetUtils.java:142)
at org.apache.hadoop.net.NetUtils.getDefaultSocketFactory(NetUtils.java:122)
at org.apache.hadoop.net.NetUtils.getSocketFactory(NetUtils.java:100)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:477)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
at com.arista.cvp.commons.db.HdfsClient.copyfromLocaltoHdfs(HdfsClient.java:55)
at com.arista.cvp.services.hadoop.HDFSService.copyFromLocal(HDFSService.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
Could anyone help resolve this issue?
You definitely need either the hadoop-common-2.x jar or the hadoop-core-1.x jar on the classpath!
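A quick way to confirm which jar, if any, actually provides the class is to ask the JVM where it loads it from. The following is a minimal sketch, not part of the original answer; ClasspathProbe and locate are hypothetical names. Run it with the same classpath as the failing application.

```java
import java.security.CodeSource;

final class ClasspathProbe {
    /** Describes where className is loaded from, or reports that it is missing. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "JDK runtime (no code source)";
            }
            return src.getLocation().toString(); // typically the jar or directory URL
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.net.StandardSocketFactory";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the output is NOT FOUND, neither jar is visible to the application's class loader, which matches the "Socket Factory class not found" error in the question.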

Including ThirdParty JAR in Hadoop 1.2.1

I have a custom (Mahout) class that I am trying to run via Hadoop. I have followed the instructions here (as well as one comment to put the jars in HDFS) but I keep getting the following error
ClassNotFoundException: org.apache.mahout.common.AbstractJob
My setup is as follows:
REP_ROOT=/home/user/.m2/repository
MAHOUT_CORE=$REP_ROOT/org/apache/mahout/mahout-core/0.9/mahout-core-0.9.jar
MAHOUT_CLI=$REP_ROOT/org/apache/mahout/commons/commons-cli/2.0-mahout/commons-cli-2.0-mahout.jar
MAHOUT_INTEGRATION=$REP_ROOT/org/apache/mahout/mahout-integration/0.9/mahout-integration-0.9.jar
LANG3=$REP_ROOT/org/apache/commons/commons-lang3/3.1/commons-lang3-3.1.jar
GOOGLE_LIST=$REP_ROOT/com/google/guava/guava/14.0.1/guava-14.0.1.jar
export LIBJARS=$MAHOUT_CORE,$MAHOUT_CLI,$MAHOUT_INTEGRATION,$LANG3,$GOOGLE_LIST
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$MAHOUT_CORE:$MAHOUT_CLI:$MAHOUT_INTEGRATION:$LANG3:$GOOGLE_LIST
hadoop jar /home/user/bin/CustomJar.jar com.company.Mahout.LDAJob \
-libjars ${LIBJARS} \
-DbaseLocation=/user/hadoop/test \
-Dinput=input \
-Doutput=output
The missing class (AbstractJob) is included in MAHOUT_CORE. The LDAJob starts as:
public class LDAJob extends AbstractJob {
    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        ToolRunner.run(conf, new LDAJob(), args);
    }
    public int run(String[] args) throws Exception {
        Configuration conf = super.getConf();
        String baseFileLocation = conf.get("baseLocation");
        .....
    }
Details of error
14/02/26 01:29:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/26 01:29:44 WARN snappy.LoadSnappy: Snappy native library not loaded
14/02/26 01:29:47 INFO mapred.JobClient: Running job: job_201402260118_0004
14/02/26 01:29:48 INFO mapred.JobClient: map 0% reduce 0%
14/02/26 01:29:53 INFO mapred.JobClient: Task Id : attempt_201402260118_0004_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:488)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
... 10 more
Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:61)
... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.common.AbstractJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 28 more
Any help would be much appreciated.

Data ingestion with Flume & Hadoop doesn't work

I'm using Flume 1.4.0 and Hadoop 2.2.0.
When I start Flume and write to HDFS, I get the following exception:
(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2442)
at java.lang.Class.privateGetPublicMethods(Class.java:2562)
at java.lang.Class.privateGetPublicMethods(Class.java:2572)
at java.lang.Class.getMethods(Class.java:1427)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
at java.lang.reflect.Proxy.getProxyClass(Proxy.java:521)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:601)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:207)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:53)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:172)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:364)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
The hdfs-sink part of my flume.conf looks like this:
# Define a sink that outputs to hdfs
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.sinks.hdfs-sink.hdfs.rollCount = 10
agent.sinks.hdfs-sink.hdfs.batchSize = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0
I hope someone can help me.
If somebody has the same problem, here's the solution:
Replace the older jars in the flume/lib directory with the newer versions from Hadoop.
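Before swapping jars, it can help to see which ones clash. The VerifyError above mentions com.google.protobuf.UnknownFieldSet, which suggests (an assumption, not stated in the answer) that Flume's lib directory holds a different protobuf version from the one the Hadoop client classes expect. The sketch below lists every jar in a directory that contains a given class; JarClassFinder and jarsContaining are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

final class JarClassFinder {
    /** Lists the jars directly under libDir that contain the given class. */
    static List<Path> jarsContaining(Path libDir, String className) throws IOException {
        // Class files are stored in jars under their slash-separated name.
        String entry = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getEntry(entry) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: JarClassFinder <lib-dir> [class-name]");
            return;
        }
        String cls = args.length > 1 ? args[1] : "com.google.protobuf.UnknownFieldSet";
        for (Path jar : jarsContaining(Paths.get(args[0]), cls)) {
            System.out.println(jar);
        }
    }
}
```

Running it once against the Flume lib directory and once against the Hadoop lib directories shows which jars provide the class on each side, and therefore which ones need to be replaced.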
