"Unexpected Error" on Join 2 simple tables - hadoop

I have created a Hive database, and an ODBC data source to Hive using the Hortonworks ODBC Driver for Hive.
I use this data source from Tableau 9 (Desktop).
I can query table DimA and I can query table FactA. But if I try to do a join in Tableau, I get the error
[Hortonworks][HiveODBC] (35) Error from Hive: error code: '0' error message: 'ExecuteStatement finished with operation state: ERROR_STATE'.
Unexpected Error
I can easily go to my cluster and issue the same query in the hive shell without any problems, and it returns results.
I searched the Internet; people often hit a permission problem that gets solved by "grant". But in this case I am able to query the two individual tables (dima, facta) easily from Tableau; it is ONLY when I JOIN the tables that it throws the above error.
I tried "New Custom SQL" and copy-pasted the SQL that worked in the hive shell, but Tableau threw the error
[Hortonworks][HiveODBC] (35) Error from Hive: error code: '40000' error message: 'Error while compiling statement: FAILED: ParseException line 1:11 cannot recognize input near 'TOP' '1' '*' in select expression'.

I fixed the issue. I had chosen the user "hue" to connect to Hive, because a tutorial showed me those connection steps:
http://hortonworks.com/hadoop-tutorial/how-to-install-and-configure-the-hortonworks-odbc-driver-on-windows-7/
But the tutorial is wrong in suggesting the user hue; it should use hdfs instead. The hue user does not have the rights to launch the MapReduce jobs that Hive needs in order to run joins, whereas simple single-table SELECTs can often be served as plain fetch tasks without launching a job, which is why querying each table on its own still worked.
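A quick way to see the difference from the hive shell is EXPLAIN, which shows whether a statement compiles to map-reduce stages. A minimal sketch using the question's table names (the join columns are hypothetical):
-- typically runs as a simple fetch, no job submission needed
EXPLAIN SELECT * FROM dima;
-- compiles to map-reduce stages, so the connecting user must be allowed to submit jobs
EXPLAIN SELECT f.*, d.* FROM facta f JOIN dima d ON f.dim_id = d.id;
The later ParseException is a separate, purely syntactic issue: HiveQL has no TOP keyword, so a SQL Server-style SELECT TOP 1 * fails to parse. The Hive equivalent is LIMIT:
SELECT * FROM dima LIMIT 1;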

Possible fix:
This SQL error is a known issue when using Hadoop Hive driver 1.4.8 to 1.4.13. It can be resolved by rolling the client driver back to 1.3. The most recent drivers produce issues when using CASE statements in Tableau, and Hortonworks is in the process of repairing this functionality. (http://community.tableau.com/thread/150002)

Related

Oracle DB to BigQuery using Apache NiFi. Error while querying the same tables

We are migrating all the tables from Oracle to BigQuery through Apache NiFi. After successfully loading the tables into BigQuery, we get the following error while trying to query them in the BQ CLI or GUI.
Query Failed
Error: Not found: Dataset cloudcover-191610:mesdata. Please verify that the dataset exists and the correct location was used for the job.
Does anyone have an idea about this?
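The error message itself suggests the first checks; a minimal sketch with the bq command-line tool, using the project and dataset names from the error:
# list the datasets visible in the project
bq ls --project_id=cloudcover-191610
# show the dataset's metadata, including its location
bq show cloudcover-191610:mesdata
If the dataset exists but lives in a non-default location, the query job has to be run with a matching --location flag.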

Hive Transactions + Remote Metastore Error

I'm running Hive 2.1.1 on EMR 5.5.0 with a remote MySQL metastore DB. I need to enable transactions on Hive, but when I follow the configuration here and run any query, I get the following error:
FAILED: Error in acquiring locks: Error communicating with the metastore
Settings on the metastore:
hive.compactor.worker.threads = 0
hive.compactor.initiator.on = true
Settings in the hive client:
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
This only happens when I set hive.txn.manager, so my hive metastore is definitely online.
I've tried some of the old suggestions of turning Hive's test features on, which didn't work, but I don't think this is a test feature anymore. I can't turn off concurrency, as a similar post on SO suggests, because I need concurrency. It seems the problem is that either DbTxnManager isn't getting the remote metastore connection info properly from the Hive config, or the MySQL DB is missing some tables required by DbTxnManager. I have datanucleus.autoCreateTables=true.
It looks like Hive wasn't properly creating the tables needed for the transaction manager. I'm not sure where it was getting its schema from, but it was definitely wrong.
So we just ran the hive-txn-schema script to set up the schema manually. We'll do this at the start of any of our clusters from now on.
https://github.com/apache/hive/blob/master/metastore/scripts/upgrade/mysql/hive-txn-schema-2.1.0.mysql.sql
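A minimal sketch of applying it, assuming the metastore database is named hive and the script was downloaded to /tmp (both names are assumptions):
-- from the mysql client, connected to the metastore server
USE hive;
SOURCE /tmp/hive-txn-schema-2.1.0.mysql.sql;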
The error
FAILED: Error in acquiring locks: Error communicating with the metastore
is sometimes raised when there is no data yet; you need to initialize some data in your tables, for example:
CREATE TABLE t1 (id INT, name STRING)
CLUSTERED BY (id) INTO 8 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');
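Once such a transactional table exists, a quick way to confirm that DbTxnManager is acquiring locks (a sketch; the row values are arbitrary):
INSERT INTO t1 VALUES (1, 'alice');
SHOW LOCKS;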

Cloudera Impala connect to Tableau Error

I am working on using Tableau to connect to Cloudera Hadoop. I provide the server and port details and connect using Impala. I can successfully connect, select the default schema, and choose the required table(s).
After this, when I drag and drop either a dimension or a measure onto Rows/Columns on the grid, I get the error below:
[Cloudera][Hardy] (22) Error from ThriftHiveClient:
Query returned non-zero code: 10025, cause: FAILED:
SemanticException [Error 10025]: Line 1:7 Expression not in GROUP BY key ''.
I saw several similar problems on the forum, but none of them had a solution. Any help on this is very much appreciated.
I encountered the same problem before. The error occurs when Tableau tries to run something like this:
SELECT `table`.`param_1` AS `param_1`,
       SUM(`table`.`param_2`) AS `sum_all`
FROM `db_name`.`table` `table`
GROUP BY 1
Since you can already browse the schema and tables, this aggregation is the likely source of the problem.
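A plausible explanation, given that Hardy is Cloudera's Hive ODBC driver component: the query may be reaching Hive rather than Impala, and Hive, unlike Impala, does not treat GROUP BY 1 as a positional reference by default; it groups by the literal 1, so param_1 is then "not in GROUP BY key". Naming the column explicitly sidesteps this (a sketch reusing the example's identifiers):
SELECT `table`.`param_1` AS `param_1`,
       SUM(`table`.`param_2`) AS `sum_all`
FROM `db_name`.`table` `table`
GROUP BY `table`.`param_1`;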
I think you might need to check a few things:
Is your ODBC driver version correct? Cloudera ODBC driver 2.5.28 does not support Tableau with Impala.
Did you choose the right port number and login type? Impala's ports are 21000 and 21050; Hive's is 10000.
For me the working setup is port 21050 with Impala as the type and no authentication. You can also choose the type HiveServer2 and use the Impala port number to log in, but that didn't work in my case.
Hope that helps.

Impala Query Editor always shows AnalysisException

I am running a Cloudera Quickstart VM on a Windows 7 computer with 8 GB of RAM, 4 GB of which are dedicated to the VM.
I loaded tables from a SQL database into Hive using Sqoop (Cloudera VM tutorial, exercise 1). Using the Hive Query Editor OR the Impala shell, everything works fine (i.e. "show tables" shows me the tables that were imported).
Using the Impala Query Editor, whatever I type, I get the same error message:
AnalysisException: Syntax error in line 1: USE `` ^ Encountered: EMPTY IDENTIFIER Expected: IDENTIFIER CAUSED BY...
I get the same error if I type "show tables;".
I checked that the Impala services were up and running, and they were; everything works fine in the Impala shell.
I googled around but could not find any answer. Many thanks in advance for your answers!
You need to use the Hive Query Editor. The error shows up if you use the Impala or another query editor, because you're using a library written for Hive.
Query -> Editor -> Hive
Yes, try selecting a database; if none appears, try clearing your browser cache and reloading the page, and also verify that your user has permission to view the default database. Although, since you said the Hive query editor works fine, it sounds like permissions are not the issue.
I solved this issue by clearing the history in Firefox. After that I signed in to HUE again, and the databases in the Impala Query Editor showed up again.
Impala does not support the ORC file format; I changed to sequence file and it works.
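One way to do that conversion from the Hive side (the table names here are hypothetical):
-- in Hive: copy the ORC-backed table into a SequenceFile-backed one
CREATE TABLE mytable_seq STORED AS SEQUENCEFILE AS SELECT * FROM mytable_orc;
Afterwards run INVALIDATE METADATA; in impala-shell so Impala picks up the new table.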

Tables not found when hive cli called from different directory

I am facing a weird problem with Hive tables. I have HIVE_HOME set in my environment, and it is also in my search path, so I can invoke hive directly.
Now I invoke hive from a directory, let's say /a/b/c, and create some tables. I can see the tables.
Then I change to a directory, e.g. /a/b, and invoke hive from there. Here is the problem: either I am unable to see the tables, or I get this error:
hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Failed to start
database 'metastore_db', see the next exception for details.
NestedThrowables:
java.sql.SQLException: Failed to start database 'metastore_db', see the next exception
for details.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Why are tables tied to the directory from which the hive CLI was invoked? Any pointers?
I think you are using the embedded Derby database, which Hive uses by default for storing the metadata. With the default configuration, Derby creates its metastore_db folder relative to the directory hive is started from, so every working directory gets its own, initially empty metastore; the "Failed to start database" error typically means another session still holds Derby's lock on that folder. As a workaround you can delete everything inside the metastore_db folder, restart Hadoop, and try again, but the best advice is to use MySQL as the metastore.
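The two usual fixes go in hive-site.xml (the paths and hostnames below are placeholders): pin Derby to an absolute path so every working directory sees the same metastore, or point Hive at a shared MySQL metastore.
<!-- Option 1: embedded Derby at a fixed, absolute location -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/var/lib/hive/metastore_db;create=true</value>
</property>
<!-- Option 2: shared MySQL metastore (needs the MySQL JDBC driver on Hive's classpath) -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>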
