HBase client - server’s version compatibility - hadoop

How can I tell whether my HBase client jar is compatible with my HBase server version? Is there a place that documents which HBase server versions a given client jar supports?
In my case I want to use the newest HBase client jar (2.4.5) with a fairly old HBase server (version 1.2). Where can I check whether this combination is possible and supported?
I'd like to know if there is a compatibility table like the ones other databases have. Something like:
https://docs.mongodb.com/drivers/java/sync/current/compatibility/

Perhaps you can use the checkcompatibility.py script provided with HBase itself to generate a client API compatibility report between 1.2 and 2.4. I haven't used 2.4 myself, but based on prior history I would expect breaking changes across two different major versions.
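As a quick sanity check before generating the full report, the major-version comparison can be sketched in a few lines of Java. This is only an illustration: it assumes both versions are available as plain strings, and the names `VersionGap`, `majorOf` and `crossesMajorVersion` are hypothetical, not part of HBase. It flags a pair that spans a major version and says nothing about whether a specific API call still works.

```java
final class VersionGap {
    /** Returns the leading numeric component of a version string such as "2.4.5" or "1.2". */
    static int majorOf(String version) {
        String head = version.trim().split("\\.")[0];
        return Integer.parseInt(head);
    }

    /** True when client and server differ in their major version number. */
    static boolean crossesMajorVersion(String clientVersion, String serverVersion) {
        return majorOf(clientVersion) != majorOf(serverVersion);
    }

    public static void main(String[] args) {
        // Versions taken from the question; replace with your own.
        String client = "2.4.5";
        String server = "1.2";
        if (crossesMajorVersion(client, server)) {
            System.out.println("Client " + client + " and server " + server
                    + " differ in major version; check the compatibility report first.");
        } else {
            System.out.println("Client " + client + " and server " + server
                    + " share a major version.");
        }
    }
}
```

A pair such as 1.1.2 and 1.0.0-cdh5.4.7 passes this check, but that only means the two share a major version, not that every call is guaranteed to work.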

Related

Apache Sqoop moved into the Attic in 2021-06

I have installed Hadoop 3.3.1 and Sqoop 1.4.7, which don't seem to be compatible: I am getting a deprecated API error while importing an RDBMS table.
When I searched for compatible versions, I found that Apache Sqoop has been moved to the Apache Attic, and the documentation for 1.4.7, the last stable version, says: "Sqoop is currently supporting 4 major Hadoop releases - 0.20, 0.23, 1.0 and 2.0."
Could you please explain what this means and what I should do?
Could you also suggest alternatives to Sqoop?
It means just what the board minutes say: Sqoop has become inactive and is now moved to the Apache Attic. This doesn't mean Sqoop is deprecated in favor of some other project, but for practical purposes you should probably not build new implementations using it.
Much of the same functionality is available in other tools, including other Apache projects. Possible options are Spark, Kafka, Flume. Which one to use is very dependent on the specifics of your use case, since none of these quite fill the same niche as Sqoop. The database connectivity capabilities of Spark make it the most flexible solution, but it also could be the most labor-intensive to set up. Kafka might work, although it's not quite as ad-hoc friendly as Sqoop (take a look at Kafka Connect). I probably wouldn't use Flume, but it might be worth a look (it is mainly meant for shipping logs).

MapR 5.2.2 clients

I have a task which requires me to create a Go program that reads from an HBase table.
HBase is installed in a MapR cluster.
Every other application (Java) uses a MapR client to connect to the MapR cluster and retrieve the data.
However, I am unable to find a way to connect to HBase from a Go application.
I have found an HBase package for Go, but it does not support integration with MapR.
It would be great if anyone could guide me in this situation.
I have also seen that MapR 6 and above has Go support through OJAI, but sadly, upgrading MapR is not an option.
Can someone advise me on how to proceed?
If you are actually running HBase in MapR, then the Go package for HBase should work (assuming version match and such).
If you are actually using the MapR DB Binary tables (which are roughly HBase compatible) the likely best approach would be to use the Thrift API or REST.
The OJAI lightweight client should work well in Go since it uses gRPC to talk to the underlying table (and thus gains lots of portability). The problem in your case is not so much that you would need to upgrade the platform as that the lightweight client only works with MapR DB JSON (the document-oriented version of MapR DB).
Ping me directly if you would like more information.

Database versions for Oozie

I would like to change my Oozie installation from a MySQL db to an Oracle db.
My cluster is running CDH 5.4.7 with Oozie 4.1. The Oracle db that I have access to is version 12c.
In the Cloudera documentation it states that Oracle db 12c is only supported by Cloudera Manager and CDH 5.6 and newer.
My question is therefore: is there any reason why my Oozie installation should not be able to use this database, even though the Cloudera components do not support it? As far as I have found, the Oozie documentation does not state anything version related.
I am lacking a non-production system to test this on, but looking into setting one up currently.
Any answers, including speculation, are appreciated.
If any information is missing, I will gladly append.
Thanks
Oozie in CDH 5.4.7 uses a quite old OpenJPA version, 2.2.2.
OpenJPA 2.2.2 does not support Oracle 12c.
However, CDH 5.8.0, which falls in the range the Cloudera documentation lists as supporting Oracle 12c, still uses OpenJPA 2.2.2, so my guess is that it will probably work but was never tested. Make sure to create a backup of your DB before the migration. Also, you might try the DB migration tool developed in OOZIE-2632.

HBase shaded client 1.1.x on Cloudera 5.4.7 (HBase 1.0.0)

I'm running into trouble creating a Java-based client that queries data from both Cloudera HBase 1.0.0 (CDH 5.4.7) and Elasticsearch 2.1.0. The issue is a dependency conflict on the Guava library.
This bug describes pretty much the same issue I hit:
https://issues.apache.org/jira/browse/HBASE-14126
In my case, using a lower version of Guava (lower than 17.0, where the break happens) is off the table because it would make the Elasticsearch Java search API fail. So I'm now trying the Apache HBase Shaded Client 1.1.2 (which I suppose is designed for HBase 1.1.0); so far the simple HBase operations I have tried (only get/scan) have all succeeded.
http://mvnrepository.com/artifact/org.apache.hbase/hbase-shaded-client
I'm wondering whether there is any known risk or issue in using the Apache HBase Shaded Client 1.1.x against HBase 1.0.0 or even earlier versions.
Also, are there any design "rules" for forward/backward compatibility of the Hadoop/HBase client libraries (native API)? E.g., is it necessary to upgrade all the applications that use those client libraries when the server side is updated?
Let me answer my own question :)
After 6 months of running, testing and development, we can confirm that the backward compatibility of HBase client v1.1.x is quite good: it works smoothly with an older HBase server such as 1.0.0 (CDH 5.4.7).
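For dependency conflicts like the Guava one described in the question, it can help to see which jar a class is actually loaded from at runtime. The following is a minimal sketch using only the JDK; the class name `ClassOrigin` and the method `whereIs` are hypothetical, and `com.google.common.base.Stopwatch` is just a Guava class picked for illustration.

```java
import java.security.CodeSource;

final class ClassOrigin {
    /** Returns the location (usually a jar URL) a class was loaded from, or a note if unknown. */
    static String whereIs(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                    ? "(JDK or bootstrap class loader)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Assumption: a Guava class chosen for illustration; pass your own class name as an argument.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Stopwatch";
        System.out.println(name + " -> " + whereIs(name));
    }
}
```

Run with the application's own classpath, this shows whether the Guava classes come from the jar you expect or from one pulled in transitively by another dependency.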

Cascading HBase Tap

I am trying to write Scalding jobs which have to connect to HBase, but I have trouble using the HBase tap. I have tried using the tap provided by Twitter Maple, following this example project, but it seems that there is some incompatibility between the Hadoop/HBase version that I am using and the one that was used as client by Twitter.
My cluster is running Cloudera CDH4 with HBase 0.92 and Hadoop 2.0.0-cdh4.1.3. Whenever I launch a Scalding job connecting to HBase, I get the exception
java.lang.NoSuchMethodError: org.apache.hadoop.net.NetUtils.getInputStream(Ljava/net/Socket;)Ljava/io/InputStream;
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:363)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1046)
...
It seems that the HBase client used by Twitter Maple is expecting some method on NetUtils that does not exist on the version of Hadoop deployed on my cluster.
How do I track down what exactly is the mismatch - what version would the HBase client expect and so on? Is there in general a way to mitigate these issues?
It seems to me that often client libraries are compiled with hardcoded version of the Hadoop dependencies, and it is hard to make those match the actual versions deployed.
The method actually exists but has changed its signature. Basically, it boils down to having different versions of Hadoop libraries on your client and server. If your server is running Cloudera, you should be using the HBase and Hadoop libraries from Cloudera. If you're using Maven, you can use Cloudera's Maven repository.
It seems like library dependencies are handled in Build.scala. I haven't used Scala yet, so I'm not entirely sure how to fix it there.
The change that broke compatibility was committed as part of HADOOP-8350. Take a look at Ted Yu's comments and the responses. He works on HBase and had the same issue. Later versions of the HBase libraries should automatically handle this issue, according to his comment.
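To track down this kind of mismatch yourself, one option is to list the signatures a class really exposes on your classpath and compare them with the descriptor in the NoSuchMethodError message. The sketch below uses only JDK reflection; `MethodDescriptors` and `describe` are hypothetical names, and `java.net.Socket` is used as a default only so the example runs without Hadoop on the classpath.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class MethodDescriptors {
    /** Lists public methods with the given name as name + JVM descriptor, e.g. "foo(I)V". */
    static List<String> describe(String className, String methodName) {
        Class<?> c;
        try {
            c = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Class not on classpath: " + className, e);
        }
        List<String> out = new ArrayList<>();
        for (Method m : c.getMethods()) {
            if (m.getName().equals(methodName)) {
                // Builds the same descriptor format the JVM prints in NoSuchMethodError.
                String descriptor = MethodType
                        .methodType(m.getReturnType(), m.getParameterTypes())
                        .toMethodDescriptorString();
                out.add(methodName + descriptor);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // On the cluster classpath you would pass: org.apache.hadoop.net.NetUtils getInputStream
        String cls = args.length > 1 ? args[0] : "java.net.Socket";
        String method = args.length > 1 ? args[1] : "getInputStream";
        for (String s : describe(cls, method)) {
            System.out.println(s);
        }
    }
}
```

If none of the printed descriptors matches `getInputStream(Ljava/net/Socket;)Ljava/io/InputStream;`, the Hadoop jar on the classpath is not the version the HBase client was compiled against.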
