Why can't hive recognize alias named in select part? - hadoop

Here's the scenario: When I invoke hql as follows, it tells me that it cannot find alias for u1.
hive> select user as u1, url as u2 from rank_test where u1 != "";
FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'u1': (possible column names are: user, url)
This problem is the same as when I try to use count(*) as cnt. Could anyone give me some hint on how to use alias in where clause? Thanks a lot!
hive> select user, count(*) as cnt from rank_test where cnt >= 2 group by user;
FAILED: ParseException line 1:58 missing EOF at 'where' near 'user'

The where clause is evaluated before the select clause, which is why you can't refer to select aliases in your where clause.
You can however refer to aliases from a derived table.
select * from (
select user as u1, url as u2 from rank_test
) t1 where u1 <> "";
select * from (
select user, count(*) as cnt from rank_test group by user
) t1 where cnt >= 2;
Side note: a more efficient way to write the last query would be
select user, count(*) as cnt from rank_test group by user
having count(*) >= 2
If I remember correctly, you can refer to the alias in having i.e. having cnt >= 2

I was able to use Alias in my Hive select statement using backtick symbol ``.
SELECT COL_01 AS `Column_A`;
The above solution worked for Hive version 1.2.1.
reference link

Related

Count of rows from all views in Oracle with a condition

I am trying to get count of all rows from views in oracle schema and my code is working fine. But when i try to add a condition like where actv_ind = 'Y', i am unable to get it working. Any suggestions on how to modify the code to make it working?
SELECT view_name,
TO_NUMBER(extractvalue(xmltype(dbms_xmlgen.getxml('select count(*) cnt from '||owner||'.'||view_name||
'where'||view_name||'.'||'actv_ind= Y')),'/ROWSET/ROW/CNT')) as VIEW_CNT
FROM all_views
WHERE owner = 'ABC' AND view_name not in ('LINK$')
I am getting the error ACTV_IND : Invalid Identifier.
The error messages from DBMS_XMLGEN are not very helpful so you need to be very careful with the syntax of the SQL statement. Add a space before and after the where, and add quotation marks around the Y:
SELECT view_name,
TO_NUMBER(extractvalue(xmltype(dbms_xmlgen.getxml('select count(*) cnt from '||owner||'.'||view_name||
' where '||view_name||'.'||'actv_ind= ''Y''')),'/ROWSET/ROW/CNT')) as VIEW_CNT
FROM all_views
WHERE owner = 'ABC' AND view_name not in ('LINK$');
The query still assumes that every view contains the column ACTV_IND. If that is not true, you might want to base the query off of DBA_TAB_COLUMNS WHERE COLUMN_NAME = 'ACTV_IND'.
Here's a simple sample schema I used for testing:
create user abc identified by abc;
create or replace view abc.view1 as select 1 id, 'Y' actv_ind from dual;
create or replace view abc.view2 as select 2 id, 'N' actv_ind from dual;

Oracle Invalid Identifier (with Inner Join)

I'm having an "Invalid Identifier" in Oracle because of the "B.username" (username column does exist in USER table). When i remove this, it's working fine. How to resolve this issue? I came from a MySQL background.
SELECT * FROM (SELECT qNA.assignment, qNA.regDate, B.username, (
SELECT DISTINCT NVL(idx, 0)
FROM EK_USERGRADE
WHERE year = (SELECT DISTINCT userGradeNo FROM EK_USER WHERE ID = qNA.userIdx)
) AS userGradeIdx
FROM EK_NEWTESTAPPLICANT qNA
WHERE IDX = :idx ) A
INNER JOIN EK_USER B ON (A.userIdx = B.ID)
Let's try this with a simplified version of your query:
-- test tables
create table NEWTESTAPPLICANT as select 1 useridx from dual ;
create table B as select 1 id, 'name1' username from dual ;
-- query
select *
from (
select B.username
from NEWTESTAPPLICANT qNA
) A join B on A.useridx = B.id ;
-- ORA-00904: "B"."USERNAME": invalid identifier
There's no "username" column in the NEWTESTAPPLICANT table, which causes the error. A LATERAL inline view (examples see here) may do the trick ...
-- query
select
*
from B, lateral (
select B.username
from NEWTESTAPPLICANT qNA
) A ;
-- result
ID USERNAME USERNAME
1 name1 name1
This works with Oracle 12c.
The problem is, that both your virtual table A and users B have the same column name "username". Specify alias in the main select, like "Select A.* , B.* from(...".
Is it ORA-00903?
User is a reserved word are you sure you created this table? Table name cannot be a reserved word.

ora 22992 on a LOB field i am not using

I have the following simple oracle query:
select A.field
from table1 A
left join table2#remotedb B on A.id = B.id
Where table B has a BLOB field
It runs fine
If i add a concat to the select:
select A.field||'x'
from table1 A
left join table2#remotedb B on A.id = B.id
I get the following error:
"ora-22992 cannot use lob locators selected from remote tables"
Why adding a concat to a filed which isn't the LOB file is giving me this error?!?
What can i do to avoid it?
check this
with sub1 as
(
select /*+ materialize */ A.field
from table1 A
left join table2#remotedb B on A.id = B.id
)
select field || 'x'
from sub1
I just ran into similar issue. It seems Oracle requires it must be guaranteed any work with clob is avoided at remote side.
Assuming #remotedb is db link to another Oracle db, consider this minimized case:
select dummy from dual; -- works
select to_clob(dummy) from dual; -- works
select dummy from dual#remotedb; -- works
select to_clob(dummy) from dual#remotedb; -- fails - ORA-22992
with m as (select /*+ MATERIALIZE */ dummy from dual#remotedb)
select to_clob(dummy) from m; -- works again, fails without hint
I also tried to find workaround based on forcing computation to local db (select /*+DRIVING_SITE(local)*/ to_clob(r.dummy) from dual local, dual#remotedb r) but with no success.

Convert Oracle (Cross Join?) to Netezza when using comma separated table list instead of JOIN keywords

Below is is some Oracle PL/SQL code to join tables without using actual JOIN keywords. This looks like a cross join? How would I convert to Netezza SQL code? That's where I'm stuck.
SELECT COUNT(*)
FROM TABLE_A A, TABLE_A B
WHERE A.X = 'Y' AND A.PATH LIKE '/A/A/A'
AND B.X = 'Z' AND B.PATH LIKE '/B/B/B';
Oracle Cross Join:
http://www.sqlguides.com/sql_cross_join.php
Here's what I tried so far:
SELECT *
from TABLE_A A
cross join (
select * from TABLE_A
) B
WHERE
A.X = 'Y' AND A.PATH LIKE '/A/A/A'
AND B.X = 'Z' AND B.PATH LIKE '/B/B/B';
EDIT:
a_horse_with_no_name:
When I use either syntax in Netezza for the COUNT(*) in the very beginning, it works and returns a count of 60, which matches the first query above when running in Oracle. Without the WHERE clause in Netezza returns 125316 results, which matches the first query above when running in Oracle. When I use either syntax in Netezza for the SELECT * in the very beginning, I get error
ERROR [HY000] ERROR: Record size 70418 exceeds internal limit of 65535 bytes'
Had to use explicit columns in Netezza when doing a CROSS JOIN. Using SELECT * throws the error as indicated in my question EDIT. Also had to escape the '%' character by escaping nothing. Thank you a_horse_with_no_name. Cheers! "Where everybody knows your name." ;-)
select A.CODE, B.CODE, LOWER(A.DIM), LOWER(B.DIM)
FROM TABLE_A A
cross join TABLE_A B
WHERE A.PATH LIKE '\A\A\A%' ESCAPE '' AND A.X = 'Y'
AND B.PATH LIKE '\B\B\B%' ESCAPE '' AND B.X = 'Y'

Hadoop - error message when declaring variable within query

I have tried the following query within HUE's Beeswax Query Editor:
SET MAXDATE=(SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE);
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > ${HIVECONF:MAXDATE};
This query will not run and produces the following error message:
FAILED: ParseException line 1:4 missing KW_ROLE at 'MAXDATE' near 'MAXDATE' line 1:11 missing EOF at '=' near 'MAXDATE'
Any advice on what the problem is? I don't understand what the KW_ROLE message means.
I come from a SQL Server background and would just run the following within SQL Server, but am trying to find a functional Hadoop/Hive equivalent.
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > (SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE)
Query which you have tried contains syntax issue. HiveConf should surrounded by single quotes.
SET MAXDATE=(SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE);
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > '${HIVECONF:MAXDATE}';
As far as I know, hive support the following syntax too.
SELECT COUNT(*) FROM DB2.SOURCE_TABLE a
JOIN
(SELECT MAX(DATA_DAY) AS max_date FROM DB1.DESTINATION_TABLE) b
WHERE YEAR(a.DATA_DAY) >= '2015'
AND a.DATA_DAY > b.max_date;
But it's not a good implementation if there are bunches of data on DB1.DESTINATION_TABLE.
In such case each query would task lots of sub-querys in SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE.
If possible, you could store your SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE result in another table, maybe Max_table.
Then the sql would be like this:
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
JOIN Max_table
WHERE YEAR(DB2.SOURCE_TABLE.DATA_DAY) >= '2015' and
DB2.SOURCE_TABLE.DATA_DAY > (Max_table.DATA_DAY)

Resources