Hadoop view created with CTE misbehaves

Here is the view definition (it runs fine and the view gets created):
CREATE OR REPLACE VIEW my_view
AS WITH Q1
AS (SELECT MAX(LOAD_DT) AS LOAD_DT FROM load_table WHERE UCASE(TBL_NM) = 'FACT_TABLE')
SELECT F.COLUMN1
, F.COLUMN2
FROM Q1, FACT_TABLE F
WHERE Q1.LOAD_DT = F.TRAN_DT
;
However, when I run
SELECT * FROM my_view;
I get the following error message:
FAILED: SemanticException Line N:M Table not found 'Q1' in definition of view my_view....etc..
It looks like Hive is trying to treat Q1 (which is a CTE) as a physical table. Any ideas how to work around this?
Thank You,
Natalia

We have faced a similar issue in our environment. To answer your question, it's a bug in Hive. Fortunately, there is a workaround: if you are using Impala and Hive with the same metastore, create the view in Impala and it will work in both Hive and Impala.
Reason:
Hive appends your database name to the CTE reference in the stored view definition, which causes the issue.
Thanks,
Neo
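
If creating the view from Impala is not an option, a possible alternative (not part of the answer above, and untested on the affected Hive version, so treat it as a sketch) is to drop the CTE and inline it as a derived table, so there is no CTE name for Hive to qualify:
CREATE OR REPLACE VIEW my_view
AS SELECT F.COLUMN1
     , F.COLUMN2
FROM (SELECT MAX(LOAD_DT) AS LOAD_DT
      FROM load_table
      WHERE UCASE(TBL_NM) = 'FACT_TABLE') Q1
   , FACT_TABLE F
WHERE Q1.LOAD_DT = F.TRAN_DT
;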

Related

SubQuery works in IMPALA but not HIVE

I'm trying to understand why the following subquery works in Impala but not in Hive.
select * from MySchema.MyTable where identifier not in
(select identifier from schema.table where status_code in (1,2,3));
EDIT:
Added the error
Error while compiling statement: FAILED: SemanticException [Error 10249]: line 1:55 Unsupported SubQuery Expression 'identifier': Correlating expression cannot contain unqualified column references.
The issue could be because 'identifier' appears in both the main query and the inner subquery. Explicitly stating which 'identifier' you are referring to, e.g. 'mytable.identifier', should resolve the issue.
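For example, a minimal sketch of the qualified form, reusing the table names from the question (the aliases t and s are placeholders):
select t.* from MySchema.MyTable t where t.identifier not in
(select s.identifier from schema.table s where s.status_code in (1,2,3));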
This is probably an issue with Hive that has been fixed in recent versions; it is not reproducible in Hive 3.1.0.
If you are still facing the issue, let us know the Hive version you are using and the DDL statements used to create the tables.

Input file name in Spark 1.6.0 View

I cannot use the input_file_name() function in Spark 1.6.0 views. It works in select statements or in df.withColumn("path", input_file_name()), but not in a view.
For example:
CREATE VIEW v_test AS SELECT *, input_file_name() FROM table
fails. It also fails when I use INPUT__FILE__NAME instead. Just:
SELECT *, input_file_name() FROM table
works as expected. Is this a known bug, or am I doing something wrong?
PS: I can create the view in Hive, but cannot access it from Spark as it fails with the same error: unknown function...
UPDATE:
I use Zeppelin with the Livy interpreter and the Scala API.
The error I get from the above query to create the view is:
invalid function input_file_name
I also tried to import the function, but it has no effect.
You have to create a temp view, as below:
df.registerTempTable("table")
and then use input_file_name(); it works as expected:
sqlContext.sql("select *, input_file_name() from table")
For newer versions of Spark, you can use the following API to create a temp view:
df.createOrReplaceTempView("table")
I hope the answer is helpful.

expecting KW_EXCHANGE near 'table' in alter exchange partition

I am dealing with a Hive table that has no partitions and uses TextInputFormat as its input format. It is not an external table, and I created it using a "CREATE TABLE AS SELECT" statement.
I use the ALTER TABLE statement to rename the table, as given below:
ALTER TABLE testdb.temptable RENAME TO testdb.newtable;
I get the following error:
Error: Error while compiling statement: FAILED: ParseException line 1:32 mismatched input 'RENAME' expecting KW_EXCHANGE near 'temptable' in alter exchange partition (state=42000,code=40000)
Closing: org.apache.hive.jdbc.HiveConnection
It seems to be a bug in Hive. I am using this version:
Hive 0.12.0-cdh5.1.4
How do I go about fixing this issue? Thanks in advance for the help!
It's not exactly a bug, just a side effect of Open Source when it's done by a motley crew of people all around the world with no "product owner" and no incentives to use a common programming style (or run extensive regression tests, or <insert your complaint here>).
Aaaaaaah, now that it's said, I feel better... Let's get to the point.
In HiveQL, the ALTER command does not use the same semantics as CREATE or SELECT; specifically, you cannot use the "ALTER DATABASE.TABLE" notation. If you try, the HQL parser just fails with a queer error message, as you can see for yourself.
That's the way it is. You must run a USE command first, then your ALTER command with just the table name, as shown below. Yes, it sucks. But that's the way it is. And I see no reason why it should improve any time soon.
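Applied to the table from the question, the workaround would look like this (a sketch only, not re-tested on CDH 5.1.4):
USE testdb;
ALTER TABLE temptable RENAME TO newtable;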
[Update Jun-2017] It looks like ALTER finally supports the DB.TABLE syntax on recent Cloudera distros (tested on CDH 5.10 with Hive 1.1.0; but since they usually include a number of back-ports in their distro, maybe it's a feature of Hive 1.2+).
I had a similar error message; it went away after using the alternative syntax, selecting the schema first and referencing the table by its short name:
USE mydb;
ALTER TABLE mytable RECOVER PARTITIONS;

Oracle 11g Materialized View hangs

I'm attempting to create a materialized view within Oracle using a pre-built view.
create materialized view bfb_rpt_sch01.mvw_base_sales
as select * from bfb_rpt_sch01.vw_base_sales;
This command will not complete and hangs. I figured perhaps this had something to do with the view not being properly written, so I ran the following query on the view:
select count(*) from bfb_rpt_sch01.vw_base_sales
This query takes about 6 minutes to execute and returns about 2.7 million rows. This tells me the view is not the issue, but I could be wrong.
I managed to figure out my issue. My CREATE MATERIALIZED VIEW ... AS was using a different execution plan than my CREATE TABLE ... AS. If my code contained the following line, it would run completely fine as CREATE TABLE ... AS, but it would hang for 48+ hours and then fail when run as CREATE MATERIALIZED VIEW ... AS.
WHERE a.column_name NOT IN (SELECT b.column_name FROM B) --culprit
I changed the code using the following and it now works fine.
WHERE NOT EXISTS (SELECT NULL FROM B WHERE a.column_name = b.column_name) --works
I'm not sure why this happens; perhaps a bug? I don't know enough about Oracle to make the call.
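For reference, a minimal sketch of the rewrite using the placeholder names from the snippets above (A, B, and column_name are not the real schema). Note that the two forms are not strictly equivalent when b.column_name can be NULL: NOT IN then returns no rows at all, while NOT EXISTS behaves as an ordinary anti-join.
-- original form (the culprit)
SELECT a.* FROM A a
WHERE a.column_name NOT IN (SELECT b.column_name FROM B b);
-- rewritten as an anti-join with NOT EXISTS
SELECT a.* FROM A a
WHERE NOT EXISTS (SELECT NULL FROM B b WHERE a.column_name = b.column_name);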

Does hsqldb support table alias in oracle compatible mode

We're using hsqldb-2.2.9 in DAO tests. HSQLDB was made compatible with Oracle (used in production) by setting SET DATABASE SQL SYNTAX ORA TRUE;, and we use iBATIS SQL maps.
It fails when the SQL contains table aliases, something like select a.name, b.code form t_a a, t_b b where a.id = b.a_id, which reports unexpected token a. We tried adding 'as' between the table name and the alias; that doesn't work either. Am I missing something?
Yes, HSQLDB supports table aliases.
If you use the exact query you reported, you would get:
unexpected token: T_A
If you correct the query as commented by a_horse_with_no_name it should work. If one of the tables does not exist, you would get:
user lacks privilege or object not found: T_A
BTW, try using the latest 2.3.0 snapshot jar for better Oracle compatibility in your tests. You can find it on the support page of the website.
Uh... I think I've found the issue... of my own. It suddenly occurred to me that I used 'do' (the table name is t_delivery_order) as the table alias, which happens to be a keyword in HSQLDB (or in SQL). Just replacing 'do' with 'd' fixed it. Thank you all.
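To illustrate the fix (the column name delivery_id is a placeholder, not from the original post):
-- fails: 'do' is treated as a keyword, per the finding above
select do.delivery_id from t_delivery_order do;
-- works: use a non-keyword alias
select d.delivery_id from t_delivery_order d;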