Is it possible to connect Tableau to Cloudera Hive on Windows 7? - windows

I downloaded and installed the Cloudera Hive drivers provided at http://www.tableausoftware.com/support/drivers. But when I try to add the driver in ODBC connections, it is not listed there. I read somewhere that the Cloudera Hive driver works only with Windows 2008. I am using Windows 7. Kindly help me.

A little late in the day, but here are some more detailed articles from the Tableau Knowledge Base that may be of interest to you or anyone else looking into this question:
Connecting to Hadoop Hive
Extra Capabilities for Hadoop Hive
Designing for Performance Using Hadoop Hive
Administering Hadoop and Hive for Tableau Connectivity
Failing that, if you are still unable to connect to Cloudera Hive and you're a registered customer or have downloaded a trial, you can always email support@tableausoftware.com and ask for help there. :)

Yes, it is possible to connect Tableau to Cloudera Hive on Windows 7.
The steps are:
1. Start the Thrift server for Hive:
HIVE_PORT=10000 nohup hive --service hiveserver &
2. Install the Hive ODBC driver from https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Tableau+License+Agreement
3. Open Tableau and go to Connect to Data -> Cloudera Hadoop Hive, then enter the server IP and port 10000. (You can change the Thrift server port by setting HIVE_PORT to another value when starting the Hive server.)
The rest is straightforward.
Also make sure that the required port (10000, or whichever you chose) is open in the firewall.
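Before opening Tableau, it can help to confirm from the Windows machine that the Thrift port is actually reachable through the firewall. The following is a minimal sketch, not part of the original answer; the function name `port_is_open` is made up for illustration, and the host you pass in is whatever address your Hive server has.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `port_is_open("192.0.2.10", 10000)` (a placeholder address) returns False if the firewall blocks the port or the Thrift server is not running. A True result only shows that something is listening on that port, not that it is Hive.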

Please make sure that you create the ODBC connection in the 32-bit ODBC administrator, since the drivers and Tableau Desktop are 32-bit applications. You can open the 32-bit ODBC panel by running odbcad32.exe from the command line; on 64-bit Windows, the 32-bit version is the one under C:\Windows\SysWOW64.

Related

How to configure Hue to connect to a remote Hive server?

I'm trying to use Hue Beeswax to connect to my company's Hive database. First, is it possible to use Hue installed on my Mac to connect to a remote Hive server? If so, how do I find the address of the Hive server running on our private server? The only thing I can do is type 'hive' and run some SQL queries in the Hive shell. I have already installed Hue but can't figure out how to connect it to the remote Hive server. Any tips would be much appreciated.
If all you want is a desktop connection to Hive, you only need a JDBC client, not a full web app like Hue.
In any case, the Hive CLI is deprecated and Beeline is preferred. To use Beeline and Hue, you need a running HiveServer2.
To find the address of HiveServer2, if you have one, locate the hive-site.xml file on the Hadoop cluster and export it. This information is also available in Ambari or Cloudera Manager (but if you're using a Cloudera CDH cluster, you already have Hue). The Thrift interface is what you want; the default port is 10000.
When you set up Hue, find the hue.ini file, edit the section that starts with [beeswax], and fill in the necessary values. Personally, I find that section fairly straightforward.
You can read the Hue GitHub repository to find the requirements for running it on a Mac.
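As a rough illustration of the hive-site.xml and hue.ini steps, here is a minimal sketch that reads the HiveServer2 Thrift settings from an exported hive-site.xml and prints the matching [beeswax] lines. It assumes the standard property names hive.server2.thrift.port and hive.server2.thrift.bind.host and the hue.ini keys hive_server_host and hive_server_port. The function names, the sample XML, and the host hs2.example.internal are invented for the example. The bind host is often not set, in which case you need to supply the name of the machine that runs HiveServer2 yourself.

```python
import xml.etree.ElementTree as ET


def read_hive_properties(xml_text):
    """Return the <property> name/value pairs of a hive-site.xml document as a dict."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def hiveserver2_endpoint(xml_text, fallback_host, default_port=10000):
    """Return (host, port) for HiveServer2, using fallbacks for unset properties."""
    props = read_hive_properties(xml_text)
    host = props.get("hive.server2.thrift.bind.host") or fallback_host
    port = int(props.get("hive.server2.thrift.port") or default_port)
    return host, port


def beeswax_section(host, port):
    """Format the two [beeswax] settings that point Hue at HiveServer2."""
    return "[beeswax]\nhive_server_host={}\nhive_server_port={}\n".format(host, port)


# Illustrative input only; a real hive-site.xml contains many more properties.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
</configuration>"""

host, port = hiveserver2_endpoint(SAMPLE_HIVE_SITE, fallback_host="hs2.example.internal")
print(beeswax_section(host, port))
```

The printed lines can be compared against the [beeswax] section of your own hue.ini; other settings in that section (authentication, for instance) depend on how the cluster is configured and are not covered here.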

HiveServer2 and Beeline: not able to understand

Q1: What is HiveServer2?
Q2: What is the use of JDBC or ODBC with HiveServer2? For what purpose is HiveServer2 used with JDBC or ODBC?
Q3: If I want to connect to HiveServer2 through JDBC or ODBC, how can I connect? Can I connect on my Cloudera installation, which is single node? Please guide me on how to connect.
Q4: How do I connect with Beeline in Cloudera? Are the Beeline commands the same, or is there any difference? How do I connect Beeline with JDBC and ODBC?
Please help me with these questions. I searched the internet but was unable to understand it. Thanks in advance.
Please find the answers below:
A1. HiveServer2 is simply version 2 of the Hive server. The enhanced server is designed for multi-client concurrency and improved authentication, and it encourages clients to connect through JDBC and ODBC rather than through the Thrift protocol directly.
A2. JDBC/ODBC is the standard, recommended way to interact with SQL engines from programming languages. Apart from using the command line (i.e. Beeline), clients can interact with Hive programmatically or through external applications such as Tableau or Qlik, which need the corresponding JDBC/ODBC drivers. The process should be the same whether it's a single node or a distributed cluster.
A3. Please refer to the Cloudera documentation on how to set up and run Hive commands using JDBC/ODBC. See the link below:
http://www.cloudera.com/documentation/other/connectors/hive-jdbc/latest/Cloudera-JDBC-Driver-for-Apache-Hive-Install-Guide.pdf
A4. See this link for a complete example: http://hadooptutorial.info/hiveserver2-beeline-introduction/
Hope that helps!!
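To make A3 and A4 a little more concrete: Beeline is itself a JDBC client, so connecting with Beeline and connecting over JDBC use the same jdbc:hive2:// URL. The sketch below only builds that URL and the matching Beeline command line. The helper names are made up, and the host localhost, the database default, and the user cloudera are assumptions for a single-node setup, not values taken from the answer.

```python
import shlex


def hive2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database."""
    return "jdbc:hive2://{}:{}/{}".format(host, port, database)


def beeline_command(host, user, port=10000, database="default"):
    """Return the argument list for a Beeline connection (-u = JDBC URL, -n = user name)."""
    return ["beeline", "-u", hive2_jdbc_url(host, port, database), "-n", user]


# Print a shell-ready command line for a hypothetical single-node setup.
print(" ".join(shlex.quote(arg) for arg in beeline_command("localhost", "cloudera")))
# beeline -u jdbc:hive2://localhost:10000/default -n cloudera
```

The same URL string is what a JDBC program would pass to its driver. Secured clusters usually need extra URL parameters, which are described in the Cloudera driver guide linked above.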

Set up IBM Open Platform with an external Oracle Database

I'm a little confused when I try to install a single-node IBM Open Platform cluster using an Oracle database as the RDBMS.
First, I understand that the Hadoop part of IBM BigInsights is not a modified version of the corresponding Apache version (as Hortonworks does), so when Ambari (from the IBM repo) offers to use an external Oracle database, I suppose it should work. I may be wrong, and I can't find any Oracle reference in the crappy IBM installation guide to set it up correctly (only that it should work with Oracle 11g R2).
So, as I would with an equivalent Hortonworks distribution (but using the binaries from IBM), I set up my ambari-server with all the Oracle parameters (--jdbc-db=oracle --jdbc-driver=path/to/ojdbc6.jar; I'm using Oracle 11g XE on CentOS 6.5, which is supposed to be supported by IOP) and specified everything needed to use Ambari with Oracle (service name, host, port, ...).
I created the ambari user, loaded the corresponding Oracle DDL (packaged with Ambari) and created my Hive and Oozie users, as specified in the... Hortonworks installation guide.
Ambari seems to work well with Oracle, and I can set up my cluster until the last step:
If I configure Hive and/or Oozie to work with Oracle (the Oracle connection validates OK from the service configuration tab), the "review" step (step 8) doesn't show anything (or sometimes only the IOP repos; it seems to be arbitrary). Trying to deploy starts the task preparation and leaves the installation in a blocked state: I can't do anything other than drop the database and reload the entire DDL to try again (otherwise I get lots of unexpected NullPointerExceptions).
If I configure Hive AND Oozie to work with an embedded MySQL (the default choice), keeping Ambari on Oracle, everything works fine.
Am I doing something wrong? Or is there a limitation on configuring (IBM Open Platform) Hive and Oozie to use Oracle 11, when it works with Hortonworks (same Apache version) and the Cloudera distribution?
Of course, the log files don't tell me anything...
UPDATE:
I tried to install IOP 4.1, first using MySQL as my Ambari, Hive and Oozie database, and everything was fine.
Next I tried to install IOP 4.1 with Oracle 11 XE as the external database. I configured Oracle, created the ambari, hive and oozie Oracle users, loaded the Ambari Oracle schema shipped with IOP 4.1, and configured the same cluster as the first time, specifying the Oracle particulars for Hive, Oozie and Sqoop (Oracle driver). Before deploying the services to all the nodes, Ambari is supposed to summarize what it is going to install, but it doesn't: sometimes it shows nothing, sometimes only the IOP repo URLs. Then, when I try to deploy, it starts the preparation tasks but never finishes, and that's it. No message, no log, nothing; it just gets stuck.
As the desired components of IOP 4.1 are at the same versions in HDP 2.3 (Ambari 2.1, Hive 1.2.1, Oozie 4.2.0, Hadoop 2.7.1, Pig 0.15.0, Sqoop 1.4.6 and ZooKeeper 3.4.6), I tried to configure exactly the same cluster with HDP 2.3, Oracle 11 XE, ... and everything worked. I noticed that HDP 2.3 forces me to use SSL, while IOP does not. HDP works with Oracle JDK 1.8 by default, while IOP offers to use OpenJDK 1.8 instead. I don't know if it matters; I'll try to find out. I'll take pictures of the Ambari screen when it blocks and copy the log traces, even if there's no error message...
If anyone has an idea, please share it!
Thanks!
Running the same installation with Oracle JDK 1.8, everything works fine.
I don't know whether there is any restriction on using the Oracle JDBC driver with OpenJDK 1.8, but Oracle 11 XE with IOP 4.1 + Oracle JDK 1.8 works.
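Since the only difference that mattered here was the JDK, it may be worth confirming which JDK a given java binary belongs to before retrying the installation. The sketch below classifies the text printed by `java -version`, relying on the fact that OpenJDK builds identify themselves as "OpenJDK" while Oracle JDK 8 prints "Java(TM) SE Runtime Environment". The function names are made up. Note also that Ambari uses the JDK selected during ambari-server setup, which may not be the java on your PATH, so `java_binary` should point at that one.

```python
import subprocess


def jdk_vendor(version_output):
    """Classify `java -version` text as 'openjdk', 'oracle' or 'unknown'."""
    text = version_output.lower()
    if "openjdk" in text:
        return "openjdk"
    if "java(tm) se runtime environment" in text:
        return "oracle"
    return "unknown"


def current_jdk_vendor(java_binary="java"):
    """Run `<java_binary> -version` and classify what it prints."""
    result = subprocess.run(
        [java_binary, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # `java -version` normally writes to stderr; stdout is included to be safe.
    return jdk_vendor(result.stderr + result.stdout)
```

Calling `current_jdk_vendor("/path/to/jdk/bin/java")` with the JDK path that Ambari was configured with (the path shown is a placeholder) tells you whether the setup is running on OpenJDK or the Oracle JDK.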

Browsing HBase data in Hue through Phoenix

I am using CDH 5.4.4 and installed the Phoenix parcel to be able to run SQL on HBase tables. Has anyone tried to browse that data using Hue?
Since we can connect to Phoenix using a JDBC connection, I assume there must be a way for Hue to connect to it too.
The current status is that we would need to add HUE-2745, and then it would show up in DBQuery / Notebook.
The latest https://phoenix.apache.org/server.html is brand new and JDBC only.
If there were a HiveServer2 Thrift API or ODBC for Phoenix, it would work almost out of the box in the SQL or DB Query apps. Hue could work with JDBC, but it will go through a JDBC connector that is GPL (so it has to be installed separately).
The Hue JIRA for integrating Phoenix is https://issues.cloudera.org/browse/HUE-2121.

Cloudera beeswax server and hive server

I have a fundamental question regarding the two servers mentioned in the context of the Cloudera CDH4 distribution.
Are the two interchangeable, i.e. could you run Beeswax in place of Hive Server?
I'm trying to use a Thrift client to connect, and in my setup only Beeswax is running, not Hive Server. In that case, can I connect to the Beeswax server?
Hive Server is the default process, and Beeswax is a newer process designed to better support concurrency and provide authentication using Kerberos. You should run one or the other.
And yes, you should definitely be able to connect to Beeswax using Thrift. You can find clients for Beeswax and Hive Server here.
What is the difference between HiveServer2 and Beeswax? They are both designed to better support concurrency and security.
