nutch 1.14 deduplication failed - hadoop

I have integrated Nutch 1.14 with Solr 6.6.0 on CentOS Linux release 7.3.1611. I gave about 10 URLs in the seed list, which is at /usr/local/apache-nutch-1.13/urls/seed.txt, and I followed the tutorial.
[root@localhost apache-nutch-1.14]# bin/nutch dedup http://ip:8983/solr/
DeduplicationJob: starting at 2018-01-09 15:07:52
DeduplicationJob: java.io.IOException: No FileSystem for scheme: http
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:258)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:320)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
at org.apache.nutch.crawl.DeduplicationJob.run(DeduplicationJob.java:326)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.DeduplicationJob.main(DeduplicationJob.java:369)
Everything up to the Solr-related commands works. Please help.
Where is the Hadoop element they are talking about in the Nutch tutorial? Do we have to install anything other than Java for Hadoop, Nutch and Solr to work together to build a search engine?

Try passing the Solr URL as a property rather than as a positional argument:
bin/nutch dedup -Dsolr.server.url=http://ip:8983/solr/
The dedup job operates on the crawldb, so its positional argument is expected to be a crawldb path; when you pass http://ip:8983/solr/ there, Hadoop tries to open it as a filesystem path and fails with "No FileSystem for scheme: http".
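If your crawl data follows the tutorial layout, the full command looks something like the one below (the crawldb path crawl/crawldb and the core name nutch are assumptions; adjust them to your setup):
bin/nutch dedup crawl/crawldb -Dsolr.server.url=http://ip:8983/solr/nutch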

I was reading the same guide and ran into the same problem. This might help:
(Step-by-Step: Deleting Duplicates)
$ bin/nutch dedup crawl/crawldb/ -Dsolr.server.url=http://localhost:8983/solr/nutch
DeduplicationJob: starting at 2018-02-23 14:27:34
Deduplication: 1 documents marked as duplicates
Deduplication: Updating status of duplicate urls into crawl db.
Deduplication finished at 2018-02-23 14:27:37, elapsed: 00:00:03

Related

Google Cloud connector for Hadoop doesn't work with Pig

I'm using Hadoop with HDFS 2.7.1.2.4 and Pig 0.15.0.2.4 (Hortonworks HDP 2.4) and trying to use Google Cloud Storage Connector for Spark and Hadoop (bigdata-interop on GitHub). It works correctly when I try, say,
hadoop fs -ls gs://bucket-name
But when I try the following in Pig (in mapreduce mode):
data = LOAD 'gs://softline/o365.avro' USING AvroStorage();
data = STORE data INTO 'gs://softline/o366.avro' USING AvroStorage();
Pig fails with the following errors:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Wrong FS scheme: hdfs, in path: hdfs://hdp.slweb.ru:8020/user/root, expected scheme: gs
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:279)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
Caused by: java.lang.IllegalArgumentException: Wrong FS scheme: hdfs, in path: hdfs://hdp.slweb.ru:8020/user/root, expected scheme: gs
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.checkPath(GoogleHadoopFileSystemBase.java:741)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.checkPath(GoogleHadoopFileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:466)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.makeQualified(GoogleHadoopFileSystemBase.java:701)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.getGcsPath(GoogleHadoopFileSystem.java:163)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.setWorkingDirectory(GoogleHadoopFileSystemBase.java:1094)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:235)
... 18 more
If needed, I could post the logs of the GC connectors.
Has somebody used Pig with these connectors? Any help would be appreciated.
TL;DR: explicitly set mapreduce.job.working.dir=/user/root/ when starting the Pig job.
If a working directory has not been explicitly set during job submission then Hadoop will set the working directory to be the working directory of the default filesystem. When using HDFS as your default FS the working directory will generally be something like 'hdfs://namenode:port/user/<your username>'.
When PigInputFormat#getSplits is called, it fetches the FileSystem associated with the path of the input that it is operating on. In this case the filesystem is an instance of GoogleHadoopFileSystem. Pig then inspects the path of its input and if the path is non-local calls FileSystem#setWorkingDirectory(job.getWorkingDirectory()). The problem here is that the job's working directory is 'hdfs://namenode:port/user/<your username>' which GoogleHadoopFileSystem will reject as a path to set as its own working directory (as it only supports 'gs://' paths).
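One way to pass that property is on the Pig command line as a Java system property; a minimal sketch, assuming -D options are accepted before the script name (myscript.pig is a placeholder):
pig -Dmapreduce.job.working.dir=/user/root/ myscript.pig
The same property can also be set inside the script itself with Pig's SET command.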

Apache Nutch 2.3: throwing Error Failed with exit value 255

I'm using Apache Nutch version 2.3.
My Hadoop version is 2.6.0, and Hadoop is running on a single node.
When I run the following Nutch command:
./crawl --index ~/test/seed ~/test -1
The output of the above command is the following:
InjectorJob: starting at 2016-01-04 12:03:26
InjectorJob: Injecting urlDir: --index
InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
InjectorJob: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/usr/local/nutch/runtime/local/bin/--index
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:50)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:231)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Error running:
/usr/local/nutch/runtime/local/bin/nutch inject --index -crawlId /home/jalaj/test/seed
Failed with exit value 255.
What is the problem with Nutch? Do I need to install Apache Gora?
The problem is here: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/usr/local/nutch/runtime/local/bin/--index
Nutch is treating --index as the URL directory and trying to read the seed list from that path, which does not exist. Please make sure your command and its argument order are correct.
Hope this helps,
Le Quoc Do
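For reference, a corrected invocation could look roughly like the following, assuming the stock Nutch 2.3 crawl script, whose arguments are the seed directory, a crawl ID, and the number of rounds (testCrawl is a placeholder ID):
./crawl ~/test/seed testCrawl 1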

org.apache.hadoop.hbase.TableNotFoundException: SYSTEM.CATALOG exception with phoenix 4.5.2

I've been trying to integrate Phoenix 4.5.2 into my existing Hadoop cluster.
Hadoop version: 2.7.1
HBase version: 1.1.2
When I try to create a table from my Phoenix client, I get the following exception. But I'm able to create a table successfully from the HBase console.
org.apache.phoenix.exception.PhoenixIOException: SYSTEM.CATALOG
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1051)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1014)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1259)
at org.apache.phoenix.query.DelegateConnectionQueryServices.createTable(DelegateConnectionQueryServices.java:113)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1937)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:751)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:186)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:320)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:312)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:310)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1422)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1927)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1896)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1896)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
at org.apache.tools.ant.taskdefs.JDBCTask.getConnection(JDBCTask.java:370)
at org.apache.tools.ant.taskdefs.SQLExec.getConnection(SQLExec.java:940)
at org.apache.tools.ant.taskdefs.SQLExec.execute(SQLExec.java:612)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:435)
at org.apache.tools.ant.Target.performTasks(Target.java:456)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:440)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:435)
at org.apache.tools.ant.Target.performTasks(Target.java:456)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
at org.apache.tools.ant.Main.runBuild(Main.java:851)
at org.apache.tools.ant.Main.startAnt(Main.java:235)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
Caused by: org.apache.hadoop.hbase.TableNotFoundException: SYSTEM.CATALOG
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1257)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1139)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1096)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:931)
at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:83)
at org.apache.hadoop.hbase.client.HTable.getRegionLocation(HTable.java:496)
at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:736)
at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:706)
at org.apache.hadoop.hbase.client.HTable.getStartKeysInRange(HTable.java:1760)
at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1715)
at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1695)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1034)
... 49 more
Please suggest what's going wrong here, and whether Phoenix 4.5.2 is compatible with HBase 1.1.2 or not.
Clearing ZooKeeper will solve the problem.
bin/hbase clean --cleanZk
Note: you need to shut down your master and region servers before using the above command.
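A rough sequence, assuming a setup using the standard HBase daemon scripts (keep ZooKeeper itself running so the clean command can reach it):
bin/hbase-daemon.sh stop master
bin/hbase-daemons.sh stop regionserver
bin/hbase clean --cleanZk
bin/start-hbase.sh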
Here's what you need to do:
Log into the HBase ZooKeeper CLI: hbase zkcli
Check whether SYSTEM.CATALOG, SYSTEM.SEQUENCE, SYSTEM.STATS, and SYSTEM.FUNCTION exist by issuing: ls /hbase/table
If so, then this is the solution:
Remove all the aforementioned znodes, i.e. from the same CLI issue rmr /hbase/table/SYSTEM.CATALOG. Do this for all four of the aforementioned tables.
Retry sqlline... and smile
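Put together, the zkcli session from the steps above looks roughly like this (assuming the default /hbase znode root):
hbase zkcli
ls /hbase/table
rmr /hbase/table/SYSTEM.CATALOG
rmr /hbase/table/SYSTEM.SEQUENCE
rmr /hbase/table/SYSTEM.STATS
rmr /hbase/table/SYSTEM.FUNCTION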
I have an embedded HBase. I got this error after manually deleting HBase data, so I had to clean ZooKeeper and the rest of HBase with:
bin/hbase clean --cleanAll
That solved the problem for me.
According to your logs, Phoenix was trying to create a table called SYSTEM.CATALOG in HBase but, due to some problem in HBase, it wasn't able to create it.
The table SYSTEM.CATALOG is created the first time a connection is made from Phoenix to HBase.
Therefore, I would recommend checking your Phoenix configuration for connectivity issues such as a wrong IP address mapping, forgetting to set up passwordless SSH, etc.
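As a quick connectivity check, you can point the Phoenix sqlline client directly at your ZooKeeper quorum (zk-host:2181 is a placeholder for your actual quorum address):
./sqlline.py zk-host:2181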
The above suggestions of cleaning HBase data with only ZooKeeper running did not help me. What finally helped me overcome this error, and allowed creation of the SYSTEM.CATALOG table and the other associated tables, was copying the hbase-site.xml from the HBase installation into the bin folder of the apache-phoenix installation.
In my case the HBase conf was in /usr/local/hbase/conf, which I copied into /usr/local/apache-phoenix/bin. I just renamed the original hbase-site.xml that was already present in the bin folder to something else, as a backup.
This solved the problem of connectivity from sqlline.py as well as the SQuirreL client.
This was apart from ensuring the Phoenix client libraries were in the /squirrel/lib folder for SQuirreL to work.
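In shell terms, that amounts to something like the following (paths taken from the description above; adjust them to your installation):
mv /usr/local/apache-phoenix/bin/hbase-site.xml /usr/local/apache-phoenix/bin/hbase-site.xml.bak
cp /usr/local/hbase/conf/hbase-site.xml /usr/local/apache-phoenix/bin/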
Check whether SYSTEM.CATALOG exists under hdfs://.../hbase/data/default/.
If there is nothing there, try bin/hbase clean --cleanZk. Before you use the command, you must stop the HBase master and region servers, but still keep ZooKeeper alive.
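A quick way to check from the command line, assuming hbase.rootdir is the default /hbase directory on HDFS:
hadoop fs -ls /hbase/data/default/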

Apache Pig - ERROR 2118: For input string: "4f8:0:a111::add:9898"

We recently upgraded the cluster to use Hadoop 2.0.0-cdh4.4.0.
After the change, we needed to reinstall pig, which used to work absolutely fine. After the installation as described here, the simplest HBase job does not get created.
raw_protobuffer = LOAD 'hbase://data_table'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'external_data:downloaded', '-limit=1 -gte=0 -lte=1')
    AS (data:bytearray);
Which fails with the magical:
Failed Jobs:
JobId   Alias             Feature    Message   Outputs
N/A     raw_protobuffer   MAP_ONLY   Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a111::add:9898"
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
at java.lang.Thread.run(Thread.java:744)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: java.lang.NumberFormatException: For input string: "4f8:0:a111::add:9898"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at com.sun.jndi.dns.DnsClient.<init>(DnsClient.java:122)
at com.sun.jndi.dns.Resolver.<init>(Resolver.java:61)
at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570)
at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430)
at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231)
at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139)
at com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103)
at javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142)
at org.apache.hadoop.net.DNS.reverseDns(DNS.java:85)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:219)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:184)
at org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
... 16 more
We suspected permissions on the tmp folder, but they seem to be fine (i.e. the job directory gets created with the pig runner (!) as its owner). Any suggestion as to what we might have missed would be much appreciated.
Looks like an IPv6 address to me - I'd suggest you investigate disabling IPv6 functionality on your cluster.
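A lighter-weight alternative to disabling IPv6 system-wide is to make the Hadoop JVMs prefer IPv4. A sketch, assuming you can edit hadoop-env.sh on the cluster nodes (java.net.preferIPv4Stack is a standard JVM flag, not something specific to this error):
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"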

Hbase Hadoop cluster.. java.io.IOException: java.lang.NoSuchMethodException

I am trying to set up an HBase cluster running on top of a Hadoop cluster.
Both clusters are up and running, but when I try to create a table from the HBase client, I see the following error in the logs:
compute-0-11 is the name node for the Hadoop cluster.
2012-03-18 01:18:54,696 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to create version file at hdfs://compute-0-11:9000/hbase, retrying:
java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String, org.apache.hadoop.fs.permission.FsPermission, java.lang.String, boolean, boolean, short, long)
at java.lang.Class.getMethod(Class.java:1605)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
Please help..
Usually this happens when there is a mismatch between the versions of software in your cluster and your job / client.
Errors like "class was expected / interface found" usually happen when different versions of the Hadoop APIs are used (mapred vs mapreduce).
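A quick way to spot such a mismatch is to compare the cluster's Hadoop version with the Hadoop jars the HBase client ships with ($HBASE_HOME is a placeholder for your HBase installation directory):
hadoop version
ls $HBASE_HOME/lib/hadoop-*.jar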
