Flume to HBase dependency failure - hadoop

I have installed HBase and Flume using Apache Cloudera. I have a Flume agent running on a Linux server, where the current HBase master is running.
I'm trying to write from a spooldir to HBase but I get the following error:
...
ERROR org.apache.flume.node.PollingPropertiesFileConfigurationProvider: Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at org.apache.flume.sink.hbase.HBaseSink.<init>(HBaseSink.java:116)
...
Flume configuration:
...
#Sinks
tier1.sinks.hbase-sink.channel = memory-channel
tier1.sinks.hbase-sink.type = org.apache.flume.sink.hbase.HBaseSink
tier1.sinks.hbase-sink.table = FlumeTable
tier1.sinks.hbase-sink.columnFamily = FlumeColumn
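For reference, a complete tier1 definition for a spooldir-to-HBase pipeline would look roughly like the sketch below; the source name, channel settings, and spool directory are illustrative, since only the sink section is quoted above.
# Illustrative names; only the sink settings come from the configuration above
tier1.sources = spool-source
tier1.channels = memory-channel
tier1.sinks = hbase-sink
tier1.sources.spool-source.type = spooldir
tier1.sources.spool-source.spoolDir = /var/flume/spool
tier1.sources.spool-source.channels = memory-channel
tier1.channels.memory-channel.type = memory
tier1.channels.memory-channel.capacity = 10000
tier1.sinks.hbase-sink.channel = memory-channel
tier1.sinks.hbase-sink.type = org.apache.flume.sink.hbase.HBaseSink
tier1.sinks.hbase-sink.table = FlumeTable
tier1.sinks.hbase-sink.columnFamily = FlumeColumn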
I tried to modify flume-env.sh and set HBASE_HOME and HADOOP_HOME, but it changed nothing.
I have succeeded in writing to HDFS, but HBase is causing problems.

I was able to resolve this problem by adding the path of the HBase libraries to FLUME_CLASSPATH in conf/flume-env.sh; in my case the file looked like:
FLUME_CLASSPATH="/home/USERNAME/hbase-1.0.1.1/lib/*"
Hope it helps.
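To double-check the fix, a quick hedged sanity check (using the same illustrative path as above) is to confirm the HBase client jars really exist at that location and then start the agent with --conf pointing at the directory that contains flume-env.sh, so the file actually gets sourced:
# Verify the HBase jars exist at the path placed on FLUME_CLASSPATH
ls /home/USERNAME/hbase-1.0.1.1/lib/hbase-*.jar
# Start the agent; --conf must point at the directory holding flume-env.sh
bin/flume-ng agent --conf ./conf --conf-file ./conf/flume.properties --name tier1 -Dflume.root.logger=INFO,console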

Related

Error: Could not find or load main class backup

I have set up HBase on top of Hadoop on my Linux system. I created a sample table in the hbase shell and it works fine. However, when I try to run the backup command I get the following error in the terminal:
> hbase backup create full hdfs://localhost:8020/data/backup
> Error: Could not find or load main class backup
OR
> hbase backup help
> Error: Could not find or load main class backup
I have installed Apache Hadoop 2.7.3 and HBase 2.1.4. The HBase is the Apache distribution, not Cloudera or Hortonworks.
I see in the docs (http://hbase.apache.org/book.html#_backup_and_restore_commands) that the hbase command can be used. Please help here.
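As a quick diagnostic (assuming a standard tarball layout under $HBASE_HOME), it is worth checking whether this HBase build ships the backup tooling at all; the hbase launcher treats unrecognized commands as class names, so if no jar contains the backup driver class, the backup command cannot resolve:
# Look for a dedicated backup jar first
ls "$HBASE_HOME"/lib/hbase-backup*.jar 2>/dev/null
# Otherwise scan the shipped jars for the backup driver class
for j in "$HBASE_HOME"/lib/hbase-*.jar; do
  unzip -l "$j" | grep -q 'hbase/backup/BackupDriver' && echo "backup classes found in $j"
done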

Jar file not found exception when running a MapReduce job copying data from HBase

I tried to execute the following command to copy data from HBase to another cluster, from within an HBase client environment. The command I ran is:
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=[destination zk]:/hbase [source table name]
I got this error:
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://servername:8020/opt/hbase-1.2.10/lib/metrics-core-2.2.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1072)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
The /opt/hbase-1.2.10/lib/metrics-core-2.2.0.jar is on my local path, but it does not exist in HDFS. It seems the CopyTable utility is submitting a MapReduce job without the dependency jars. I read a few articles, and it seems the only solution is to upload the jar to HDFS at the same path (sketched below). This is really an ugly solution.
Please kindly advise. Thanks!
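For what it's worth, the workaround those articles describe amounts to mirroring the missing jar into HDFS at the same absolute path the job resolves; a rough sketch using the path from the error above:
# Ugly but unblocking: copy the local jar into HDFS at the same absolute path
hdfs dfs -mkdir -p /opt/hbase-1.2.10/lib
hdfs dfs -put /opt/hbase-1.2.10/lib/metrics-core-2.2.0.jar /opt/hbase-1.2.10/lib/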

Flume agent with syslog source and HBase sink

I am trying to use Flume with a syslog source and an HBase sink.
When I run the Flume agent I get this error: Failed to start agent because dependencies were not found in classpath. Error follows. java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration. This means (from that question) that some HBase libraries are missing. To solve it, I need to set the path to these libraries in the flume-env.sh file, which is what I did, but when I ran Flume the error persisted. Here is the command I used to run the Flume agent:
bin/flume-ng agent --conf ./conf --conf-file ./conf/flume.properties --name agent -Dflume.root.logger=INFO,console
So my question is: if the solution I used is correct (I need to add the libraries to Flume), why do I still get the same error? And if it is not, how do I solve the problem?
EDIT
From the docs I read: The flume-ng executable looks for and sources a file named "flume-env.sh" in the conf directory specified by the --conf/-c commandline option.
I haven't tested it yet, but I think that is the solution (I just need confirmation).
I would recommend downloading the full HBase tarball and setting environment variables such as HBASE_HOME to the right locations. Flume can then automatically pick up the libraries from the HBase installation.
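A minimal sketch of that suggestion, assuming the flume-ng launcher picks HBase up from the environment as described (the install path is illustrative):
# Point the environment at a full HBase install and make the hbase binary visible
export HBASE_HOME=/opt/hbase-1.2.6
export PATH="$HBASE_HOME/bin:$PATH"
# Then start the agent exactly as before
bin/flume-ng agent --conf ./conf --conf-file ./conf/flume.properties --name agent -Dflume.root.logger=INFO,console
# If the error persists, the explicit fallback in conf/flume-env.sh is:
# FLUME_CLASSPATH="$HBASE_HOME/lib/*"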

Executing Mahout against a Hadoop cluster

I have a jar file which contains the Mahout jars as well as other code I wrote.
It works fine on my local machine.
I would like to run it in a cluster that has Hadoop already installed.
When I do
$HADOOP_HOME/bin/hadoop jar myjar.jar args
I get the error
Exception in thread "main" java.io.IOException: Mkdirs failed to create /some/hdfs/path (exists=false, cwd=file:local/folder/where/myjar/is)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
...
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
I checked that I can access and create the directory in HDFS.
I have also run Hadoop code (without Mahout) without a problem.
I am running this on a Linux machine.
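Two hedged checks that line up with the error above: the stack trace reports cwd as a file: URI and goes through ChecksumFileSystem, which suggests the client may be resolving the path against the local filesystem rather than HDFS (paths below assume a Hadoop 2.x layout; /some/hdfs/path is the placeholder from the trace):
# Confirm the client defaults to HDFS, not the local filesystem
grep -A1 'fs.default' $HADOOP_HOME/etc/hadoop/core-site.xml   # expect an hdfs:// URI
# Confirm this user can create the target directory in HDFS
$HADOOP_HOME/bin/hadoop fs -mkdir -p /some/hdfs/path
$HADOOP_HOME/bin/hadoop fs -ls /some/hdfs/path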
Check that the Mahout user and the Hadoop user are the same, and also check Mahout and Hadoop version compatibility.
Regards,
Jyoti Ranjan Panda

Issue with pseudo-distributed mode configuration of Hadoop

I am trying to set up a pseudo-distributed configuration of Hadoop 2.0.4. The start-dfs.sh script works fine. However, start-mapred.sh fails to start the JobTracker and TaskTracker. Below is the error I am getting. Looking at the error, it seems it is not able to pick up the jar file. Please let me know if you have any idea about this issue. Thanks.
FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NoSuchMethodError: org/apache/hadoop/mapred/JobACLsManager.<init>(Lorg/apache/hadoop/mapred/JobConf;)V
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2182)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1895)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1889)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:311)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:302)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:297)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4820)
It seems I was using incorrect jars, so first I replaced those. Then I created a new directory with the Hadoop conf files and formatted the NameNode. Finally it worked. :)
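Roughly, the sequence described above looks like this; directory names are illustrative, and the MR1-style scripts match the ones mentioned in the question:
# Point Hadoop at a clean conf directory with consistent jars and config
export HADOOP_CONF_DIR=/home/user/hadoop-conf-clean
# Re-format the NameNode (note: this wipes any existing HDFS data)
hadoop namenode -format
# Restart HDFS and the MR1 daemons
start-dfs.sh
start-mapred.sh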
