Hive doesn't support in, exists. How do I write the following query? - hadoop

I have two tables A and B that both have a column id. I wish to obtain ids from A that are not present in B. The obvious way is:
SELECT id FROM A WHERE id NOT IN (SELECT id FROM B)
Unfortunately, Hive doesn't support in, exists or subqueries. Is there a way to achieve the above using joins?
I thought of the following
SELECT A.id FROM A,B WHERE A.id<>B.id
But it seems like this will return the entirety of A, since there always exists an id in B that is not equal to any id in A.

You can do the same with a LEFT OUTER JOIN in Hive:
SELECT A.id
FROM A
LEFT OUTER JOIN B
ON (B.id = A.id)
WHERE B.id IS null

Hive seems to support IN, NOT IN, EXIST and NOT EXISTS from 0.13.
select count(*)
from flight a
where not exists(select b.tailnum from plane b where b.tailnum = a.tailnum);
The subqueries in EXIST and NOT EXISTS should have correlated predicates (like b.tailnum = a.tailnum in above sample)
For more, refer Hive Wiki > Subqueries in the WHERE Clause

Should you ever want to do an IN as so:
SELECT id FROM A WHERE id IN (SELECT id FROM B)
Hive has this covered with a LEFT SEMI JOIN:
SELECT a.key, a.val
FROM a LEFT SEMI JOIN b on (a.key = b.key)

if you can using spark sql you can use left anti join.
ex: SELECT A.id FROM A left anti join B on a.id=b.id

Related

how to use full outer join writing query on multiple tables?

I have a below query
How to use full outer join for TABLE T4 for getting all records?
WHERE
(DB.T4.AUTH_REV_NO=DB.T2.AUTH_REV_NO
AND DB.T4.AUTH_NO=DB.T2.AUTH_NO)
AND (DB.T2.AUTH_CURR_IN='Y' )
AND (DB.T3.AUTH_NO=DB.T2.AUTH_NO)
AND (DB.T3.AUTH_REV_NO=DB.T2.AUTH_REV_NO )
AND (DB.T6.FNC_ID=DB.T4.FNC_ID)
AND (DB.T7.FNC_SEG_ID=DB.T6.FNC_SEG_ID)
AND (DB.T1.SCT_ID(+)=DB.T7.SCT_ID
AND DB.T1.FNC_SEG_ID(+)=DB.T7.FNC_SEG_ID)
AND (DB.T8.NDE_ID=DB.T12.NDE_ID)
AND (DB.T7.FNC_SEG_ID=DB.T8.FNC_SEG_ID)
AND (DB.T7.SCT_ID=DB.T8.SCT_ID)
AND ((DB.T12.NDE_ID=DB.T6.NDE_STRT_ID)
OR (DB.T12.NDE_ID=DB.T6.NDE_END_ID))
AND (DB.T5.FNC_ID(+)=DB.T4.FNC_ID)
AND (T13_A4.REF_ID(+)=DB.T5.REF_TONE_TYP_ID)
AND (fne.FNC_SEG_ID=DB.T8.FNC_SEG_ID)
AND (fne.NDE_ID=DB.T8.NDE_ID)
AND (fne.SCT_ID=DB.T8.SCT_ID)
AND fnode.NDE_ID=DB.T6.NDE_STRT_ID
AND tnode.NDE_ID=DB.T6.NDE_END_ID
AND (DB.T4.REF_FNC_TYP_ID=T13_A1.REF_ID)
AND (ne_port.NDE_EQP_ID=fne.NDE_EQP_ID)
AND (ne_port.NDE_EQP_PRN_ID=ne_card.NDE_EQP_ID)
AND (ne_card.NDE_EQP_PRN_ID=ne_shelf.NDE_EQP_ID)
AND (ne_shelf.NDE_EQP_PRN_ID=ne_rack.NDE_EQP_ID)
AND (eq.EQP_ID=ne_card.EQP_ID)
AND (eq.REF_EQP_CLS_ID=T13_A2.REF_ID)
AND (DB.T3.REF_AUTH_STS_ID=T13_A3.REF_ID)
AND (DB.T3.AUTH_STS_ID
IN (SELECT MAX(DB.T3.AUTH_STS_ID) FROMDB.T3
WHERE (DB.T3.AUTH_NO,DB.T3.AUTH_REV_NO)
IN
(SELECT
DB.T3.AUTH_NO,
MAX(DB.T3.AUTH_REV_NO)
FROM
DB.T3
GROUP BY
DB.T3.AUTH_NO)
GROUP BY
DB.T3.AUTH_NO))
How to use full outer join for TABLE T4 and for COLUMN FNC_TONE_LVL_QT to get all records.
Please help.
You posted a whole lot of "joins". I'm not going to rewrite it for you, but - I'd suggest you to switch to a more recent explicit JOIN syntax which makes things somewhat simpler and easier to understand as you'd separate joins from conditions. Moreover, it allows you to outer join the same table to more than just one another table, which is impossible with the old (+) Oracle's outer join operator.
Something like this
select ...
from table_1 a left join table_2 b on a.id = b.id
full outer join table_3 c on c.id = a.id
...

Left outer join is null- additional conditions

I'm trying to find all entries in table a, where there is no matching entry in table b for one specific column (order). I'm using the following:
SELECT *
FROM a
LEFT OUTER JOIN b
ON a.id = b.id
WHERE b.order IS NULL
AND a.result>10
However, the last condition for result doesn't seems to work. It simply lists all the entries from table a, regardless whether result is more than 10 or not.
Any way around this?
Shouldn't your query be as below?
SELECT * FROM a LEFT OUTER JOIN b ON a.id = b.id WHERE b.id IS NULL AND a.result>10

Select value from grandchildren

I have the following table structure:
I want to select:
all TableA entries + the Identifier column from Table C that have:
a special value in TableBType ("TableBTypeValue")
a special value in TableCType ("TableCTypeValue")
The problem that I have is that the linq queries seem to fail when there are TableA entries that have no TableB entry or if there are TableB entries without a TableC (TableBType and TableCType is mandatory so they don't have that problem).
With SQL this would not be a big problem, but as I am new to linq I could not find the correct way to create this query.
I think this is what you are looking for:
from c in db.TableC
where c.TableCType == TableCTypeValue
join b in db.TableB on c.TableBId equals b.Id
where b.TableBType == TableBTypeValue
join a in db.TableA on b.TableAId equals a.Id
select new { a, c.Identifier };
Hope it helps.

Linq query only returning 1 row

Dim ds = From a In db.Model
Join b In db.1 On a.id Equals b.ID
Join c In db.2 On a.id Equals c.ID
Join d In db.3 On a.id Equals d.ID
Join f In db.4 On a.id Equals f.ID
Select a.id, a.Ref, a.Type, a.etc
Above is my linq query. At the moment I am only getting the first row from the db returned when there are currently 60 rows. Please can you tell me where I am going wrong and how to select all records.
Thanks in advance!
UPDATE:
When I take out all the joins like so:
Dim ds = From a In db.1, b In db.2, c In db.3, d In db.4, f In db.5
Select a.id, a.Ref, a.type, b.etc, c.etc, d.etc
I get a system.outofmemory exception!
You're only going to get a row produced when all of the joins match - in other words, when there's a row from Model with an AP, an Option, a Talk and an Invoice. My guess is that there's only one of those.
LINQ does an inner join by default. If you're looking for a left outer join (i.e. where a particular row may not have an Invoice, or a Talk etc) then you need to use a group join, usually in conjunction with DefaultIfEmpty.
I'm not particularly hot on VB syntax, but this article looks like it's what you're after.

Oracle : How to use if then in a select statement

select ma.TITLE,ma.ID as aid,ur.USER_ID
from LEO_MENU_ACTIVITY_RELATION mr
inner join LEO_MENU_MASTER mm on mm.ID=mr.MENU_ID
INNER join LEO_MENUACTIVITY ma on mr.ACTIVITY_ID=ma.ID
LEFT OUTER JOIN LEO_USER_RIGHTS ur on ma.ID=ur.MENU_RELATION_ID and ur.MENU_ID=mm.ID and ur.USER_ID='141'
where mm.ID='1'
UNION (SELECT
'List' as TITLE,
1 as ID,
case (WHEN ur.MENU_RELATION_ID=1 THEN NULL ELSE USER_ID END)as USER_ID
from
LEO_USER_RIGHTS)
In the UNION i want perform a conditional select like if ur.MENU_RELATION_ID=1 then the USER_ID should be selected as NULL otherwise the the original value from the 'LEO_USER_RIGHTS' table must be retrieved.
How can i do this ? Please help
Krishnik
If you want to combine in a UNION something based on the first table I think you can only do it by repeating the whole thing like this:
select ma.TITLE,ma.ID as aid,ur.USER_ID
from LEO_MENU_ACTIVITY_RELATION mr
inner join LEO_MENU_MASTER mm on mm.ID=mr.MENU_ID
INNER join LEO_MENUACTIVITY ma on mr.ACTIVITY_ID=ma.ID
LEFT OUTER JOIN LEO_USER_RIGHTS ur on ma.ID=ur.MENU_RELATION_ID and ur.MENU_ID=mm.ID and ur.USER_ID='141'
where mm.ID='1'
UNION (SELECT
'List' as TITLE,
1 as ID,
case (WHEN ur.MENU_RELATION_ID=1 THEN NULL ELSE USER_ID END)as USER_ID
from
LEO_MENU_ACTIVITY_RELATION mr
inner join LEO_MENU_MASTER mm on mm.ID=mr.MENU_ID
INNER join LEO_MENUACTIVITY ma on mr.ACTIVITY_ID=ma.ID
LEFT OUTER JOIN LEO_USER_RIGHTS ur on ma.ID=ur.MENU_RELATION_ID and ur.MENU_ID=mm.ID and ur.USER_ID='141'
where mm.ID='1'
)
If this is used often I would create a view to avoid duplicate code. In ORACLE (I do not know for other SQL dialects) there is a WITH statement enables you to make a sort of "temporary view".

Resources