Error when connecting Tableau to Cloudera Impala

I am using Tableau to connect to Cloudera Hadoop. I provide the server and port details and connect using Impala. I can connect successfully, select the default schema and choose the required table(s).
After this, when I drag and drop either a dimension or a measure to Rows/Columns on the grid, I get the error below:
[Cloudera][Hardy] (22) Error from ThriftHiveClient:
Query returned non-zero code: 10025, cause: FAILED:
SemanticException [Error 10025]: Line 1:7 Expression not in GROUP BY key ''.
I saw several similar problems on the forum, but none of them had a solution. Any help on this is very much appreciated.

I encountered the same problem before. The error occurs when Tableau tries to run something like this:
SELECT `table`.`param_1` AS `param_1`,
       SUM(`table`.`param_2`) AS `sum_all`
FROM `db_name`.`table` `table`
GROUP BY 1
Since you can already browse the schema and tables, the connection itself works, and it is this aggregation that may be causing the problem.
I think you might need to check a few things:
Is your ODBC driver version correct? Cloudera ODBC driver 2.5.28 does not support Tableau with Impala.
Did you choose the right port number and login type? The Impala port numbers are 21000 and 21050; Hive uses 10000.
In my setup I use port 21050 and choose Impala as the Type, with no authentication. You can also choose Type HiveServer2 and log in using the Impala port number, but that didn't work in my case.
Hope that helps.
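To make the aggregation issue concrete, here is a minimal Python sketch that builds the same query with the grouping column named explicitly instead of `GROUP BY 1`. As far as I know, Hive only treats the `1` as a column position when `hive.groupby.orderby.position.alias` is enabled, which would explain the "Expression not in GROUP BY key" message; treat that as an assumption to verify on your cluster. The helper name `explicit_group_by` and its parameters are made up for illustration and are not part of Tableau or any driver.

```python
def explicit_group_by(db, table, group_cols, aggregates):
    """Build a SELECT whose GROUP BY names its columns instead of using positions.

    aggregates maps an output alias to a (function, column) pair,
    e.g. {"sum_all": ("SUM", "param_2")}.
    """
    def q(name):
        # Quote an identifier with backticks, as in the query above.
        return "`" + name + "`"

    select_parts = [f"{q(table)}.{q(c)} AS {q(c)}" for c in group_cols]
    select_parts += [
        f"{func}({q(table)}.{q(col)}) AS {q(alias)}"
        for alias, (func, col) in aggregates.items()
    ]
    group_parts = [f"{q(table)}.{q(c)}" for c in group_cols]
    return (
        "SELECT " + ", ".join(select_parts)
        + f" FROM {q(db)}.{q(table)} {q(table)}"
        + " GROUP BY " + ", ".join(group_parts)
    )


query = explicit_group_by(
    "db_name", "table", ["param_1"], {"sum_all": ("SUM", "param_2")}
)
print(query)
```

Running a query of this shape as Custom SQL is one way to check whether the positional `GROUP BY` is what the server rejects.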

Related

LibreOffice Base JDBC connection to Hive returns “Method not supported” when executing valid select statement

I'm trying to get LibreOffice's Base v5.1.4.2, running on Ubuntu v16.04 to connect to a Hive v1.2.1 database via JDBC. I added the following jars, downloaded from Maven Central, to LibreOffice's classpath ('Tools -> LibreOffice -> Advanced -> Class Path'):
hive-common-1.2.1.jar
hive-jdbc-1.2.1.jar
hive-metastore-1.2.1.jar
hive-service-1.2.1.jar
hadoop-common-2.6.2.jar
httpclient-4.4.jar
httpcore-4.4.jar
libthrift-0.9.2.jar
commons-logging-1.1.3.jar
slf4j-api-1.7.5.jar
I then restarted LibreOffice, opened Base, selected 'Connect to an existing database' -> 'JDBC' and set the following properties:
I entered the credentials and clicked the 'Test Connection' button, which returned a "the connection was established successfully" message. Great!
In the LibreOffice Base UI, the options under the 'Tables' panel were grayed out. The options in the queries tab were not, so I tried to connect to Hive.
The 'Use Wizard to Create Query' option prompts for a password and then returns "The field names from 'airline.on_time_performance' could not be retrieved."
The JDBC connection is able to connect to Hive and list the tables, though it seems to have problems retrieving the columns. When I try to execute a simple select statement, the 'Create Query in SQL View' option returns a somewhat cryptic "Method not supported" message:
The error message is a bit vague. I suspect that I may be missing a dependency since I am able to connect to Hive from Java using JDBC.
I'm curious to know if anyone in the community has LibreOffice Base working with Hive. If so, what am I missing?
The Apache JDBC driver reports "Method not supported" for most features, just because the Apache committers did not bother to handle the list of simple yes/no API calls. Duh.
If you want to see for yourself, just download DbVisualizer Free, configure the Apache Hive driver, open a connection, and check the Database Info tab.
Now, DBVis is quite permissive with lame drivers, but it seems that LibreOffice is not.
You can try the Cloudera Hive JDBC driver as an alternative. You just have to "register" -- i.e. leave your e-mail address -- to access the download URL; it's simpler to deploy than the Apache thing (based on the Simba SDK, all Hive-specific JARs are bundled) and it works with about any BI tool. So hopefully it works with LibreThing too.
Disclaimer: I wish the Apache distro had a proper JDBC driver, and anyone could use it instead of relying on "free" commercial software. But for now it's just a wish.

Impala Query Editor always shows AnalysisException

I am running the Cloudera QuickStart VM on a Windows 7 computer with 8 GB of RAM, 4 GB of which is dedicated to the VM.
I loaded tables from a SQL database into Hive, using Sqoop (Cloudera VM tutorial exercise 1). Using the Hive Query Editor OR Impala Shell, everything works fine (i.e. "show tables" shows me the tables that were imported).
Using the Impala Query Editor, whatever I type, I get the same error message:
AnalysisException: Syntax error in line 1: USE `` ^ Encountered: EMPTY IDENTIFIER Expected: IDENTIFIER CAUSED BY...
I get the same error if I type "show tables;" ...
I checked that the Impala services were up and running, which they were, and everything works fine in the Impala shell:
I googled around but could not find an answer. Many thanks in advance!
You need to use the Hive Query Editor. The error shows up if you use the Impala or another query editor because you're using a library written for Hive.
Query -> Editor -> Hive
Yes, try selecting a database. If none appears, try clearing your browser cache and reloading the page, and also verify that your user has permission to view the default database. That said, since the Hive query editor works fine for you, it sounds like permissions are not the issue.
I solved this issue by clearing the history in Firefox. After that I signed in to Hue again and the databases were shown in the Impala Query Editor again.
Impala does not support the ORC file format. I changed to a sequence file and it works.

How to check connection of cassandra with pentaho data integrator

I'm trying to load data from an Oracle table into a Cassandra table using Pentaho Data Integration 5.1 (Community Edition), but I can't tell whether a connection has been established between Oracle and Cassandra. I'm using Cassandra 2.2.3 and Oracle 11gR2.
I've added the following jars to the lib folder of data-integration:
--cassandra-thrift-1.0.0
--apache-cassandra-cql-1.0.0
--libthrift-0.6.jar
--guava-r08.jar
--cassandra_driver.jar
Can anyone help me figure out how to check whether a connection has been established in Pentaho?
There are several ways to check whether a connection to a database has been established. I don't know if all of them are valid for Cassandra, but I'll add one specifically for it.
1) The test button
Simply click the test button on the connection edit screen.
2) Logs with high detail may help
Another way to test is to run your transformation with a high log detail level:
sh pan.sh -file=my_cassandra_transformation.ktr -level=Rowlevel
3) The input preview
For Cassandra specifically, I would try creating a simple read operation using the Cassandra Input step and clicking the 'preview' button.
4) The controlled output test
Or you can try a simpler transformation first, to make sure it runs fine. E.g.
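Independently of the checks listed above, and not as the original poster's own example, a quick check outside PDI can tell you whether the Cassandra host is reachable at all. The sketch below is a plain TCP probe in Python. The host name is a placeholder, and 9160 (Thrift) and 9042 (native CQL) are only the usual Cassandra defaults, so adjust them to your configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host; the ports are Cassandra's usual defaults (assumption).
    for port in (9160, 9042):
        print(port, can_connect("localhost", port))
```

A True result only shows that something is listening on that port. It does not prove that the PDI step can authenticate or read data, so the preview in step 3 is still worth doing.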

"Unexpected Error" when joining 2 simple tables

I have created a hive database. I have created an ODBC Data source to Hive using Hortonworks ODBC Driver for Hive.
I use this data source from Tableau 9 (desktop).
I can query table DimA and I can query table FactA, but if I try to join them in Tableau I get this error:
[Hortonworks][HiveODBC] (35) Error from Hive: error code: '0' error message: 'ExecuteStatement finished with operation state: ERROR_STATE'.
Unexpected Error
I can go to my cluster and issue the same query in the Hive shell without any problems, and it returns results.
I searched the Internet and found people with a permission problem that is solved by "grant". In my case, though, I can query the two tables (dima, facta) individually from Tableau without trouble; the error above appears only when I join them.
I tried "New Custom SQL" and pasted in the SQL that worked in the Hive shell, but Tableau threw this error:
[Hortonworks][HiveODBC] (35) Error from Hive: error code: '40000' error message: 'Error while compiling statement: FAILED: ParseException line 1:11 cannot recognize input near 'TOP' '1' '*' in select expression'.
I fixed the issue. I had chosen the user "hue" to connect to Hive.
I did this because a tutorial showed these steps for connecting to Hive:
http://hortonworks.com/hadoop-tutorial/how-to-install-and-configure-the-hortonworks-odbc-driver-on-windows-7/
But the tutorial is wrong to suggest the user hue. It should use hdfs instead, because the hue user does not have the rights to launch the MR jobs that are required to run joins on Hive.
Possible fix:
This SQL error is a known issue when using Hadoop Hive driver 1.4.8 to 1.4.13. It can be resolved by rolling the client driver back to 1.3. The most recent drivers produce issues when using CASE statements in Tableau, and Hortonworks is in the process of repairing this functionality. (http://community.tableau.com/thread/150002)

I can read data from Hive tables through the Hive client, but I cannot read from tools like Talend

I have installed CDH 4.4. The Hive client is working properly, and I am able to create and display all the Hive tables.
But when I use tools like Talend, I get error 10001, table not found.
Can anybody tell me where I am going wrong?
This problem occurs because Talend searches the default database.
So give database.tablename in the table field. This will solve the problem.
Regards,
Nagaraj
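As a small illustration of the database.tablename form, here is a Python sketch. The helper `qualify` and the names `sales` and `orders` are hypothetical; in Talend you would simply type the qualified name into the table field.

```python
def qualify(table, database="default"):
    """Return database.table, leaving an already-qualified name untouched."""
    if "." in table:
        return table
    return f"{database}.{table}"


# Hypothetical database and table names.
print(qualify("orders", "sales"))
print(qualify("sales.orders"))
```

Both calls yield the same qualified name, so the lookup no longer depends on which database the tool treats as current.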
