HIve join without common filed - hadoop

I have the following tables:
Table1:
user_name Url
Rahul www.cric.info.com
ranbir www.rogby.com
sahil www.google.com
banit www.yahoo.com
Table2:
Keyword category
cric sports
footbal sports
google search
I want to search Table1 by matching the keyword in Table2. I can perform the same using case statement and the query works but it is not the right approach because each time I have to add the case statement when I will add new search keyword.
select user_name from table1
case when url like '%cric%' then sports
else 'undefined'
end as category
from table1;

Thanks find the soluntions for this approach. FIrst we need to do the Join and after that we need to filter the record.
select user_name,url,Keyword,catagory from(select table1.user_name,table1.url ,table2.keyword,table2.catagory from table1 left outer join table2)a where a.url like (concat('%',a.phrase,'%')

Not sure about more current versions, but I've run into a similar problem... the primary issue is that Hive only supports equi-join statements... when you apply logic to either side of the join, it has difficulty translating into a Map Reduce function.
The alternative method, if you have a reliably structured field, is that you can create a matching key from the larger field. For example, if you know that you're looking for your keyword to exist in the second position of a dot-delimited URI, you could do something like:
select
Uri
, split(Uri, "\\.")[1] as matchKey
from
Table1
join Table2 on Table2.keyword = Table1.matchKey
;

Related

Spoon lookup obtaining two rows instead of duplicate columns

I have this two tables
If I use the lookup step using the EMPLOYEEID and YEAR I get this columns:
But I need to get this instead:
I have prepare a solution Here.
Using this you will get your result.
Please let me know if its ok with you.
A lookup is designed to retrieve fields from a matching record in another table and append them to the "master" record.
If you want to combine records from 2 tables into a single dataset then you need to UNION them together.
Update following comment
Something like this pseudo-code ought to work:
SELECT T1.*
FROM TABLE1 T1
UNION
SELECT T2.*
FROM TABLE1 T2
INNER JOIN TABLE1 T1A ON T2.KEY1 = T1A.KEY1 AND T2.KEY2 = T1A.KEY2 AND ...

How to retrieve data from 3 tables using sub query oracle SQL

I want to retrieve users name and there responsibility_key where there end_date is null and i want to convert it to (sysdate+1) using nvl but i am only able to retrieve the responsibility_key not the name please help.
The error in the image says "column ambiguously defined". Take a close look. Your last END_DATE could refer to either the u alias or the table from the subquery. Change it to match the rest of your subquery (FIND_USER_GROUPS_DIRECT.END_DATE)
EDIT
Your query is
select u.USER_NAME, d.responsibility_key from FND_USER u,FND_RESPONSIBILITY_VL d
where responsibility_id in(
select responsibility_id from
FND_USER_RESP_GROUPS_DIRECT WHERE END_USER_RESP_GROUPS_DIRECT.END_DATE=nvl(END_DATE,sysdate+1)) and
u.END_DATE=nvl(END_DATE,SYSDATE + 1)
;
The query isn't formatted, which makes it hard to read.
Not all columns are qualified with table name (or aliases), as mentioned in the comments.
The query currently uses an implicit join.
The query is impossible to understand without seeing the table definitions (desc [table_name]).
For points 1 and 2, a properly formatted query will look something like
select u.user_name, d.responsibility_key
from
fnd_user u,
fnd_responsibility_vl d
where
d.responsibility_id in (
select urgd.responsibility_id
from
fnd_user_resp_groups_direct urgd
where
urgd.end_date = nvl(u.end_date, sysdate+1)
) and
u.end_date = nvl(urgd.end_date, sysdate + 1)
;
This makes it easier to read and in addition to this, you can see that without table definitions I guessed (see point 4) as to which tables the end_date column belongs in your query. If I had to guess, so does Oracle. That means you have an ambiguity problem. To fix it, take a close look at the end_date column as it appears in your original query and where you do not prefix it with anything, you need to prefix it with the appropriate alias (after you have aliased all your tables).
For point 3, you can write your query more clearly with an explicit join and by using aliases for all columns. As for the explicit join I have no idea what your tables look like but one possibility is something like
select u.user_name, d.responsibility_key
from fnd_user u
join fnd_responsibility_vl d
on u.id = d.user_id
where
d.responsibility_id in (
select responsibility_id
from fnd_user_resp_groups_direct urgd
where
urgd.end_date = nvl(u.end_date, sysdate+1)
) and
u.end_date = nvl(urgd.end_date, sysdate+1)
;
If you follow these points you will get to the root of the error.

Better solution than left join subqueries?

TablePatient.Patient_ID(PK)
TableProviders.Encounter (joins to PK)
TableProviders.Provider_Type
TableProviders.Provider_ID
TableNames.Full_Name
TableNames.Provider_ID (joins to Table Names)
I want a query that will give, for all the Patient_IDs, the Full_Name of the provider for every Provider ID.
There are about 30 provider_types.
I have made this already using a left join a ton of left joins. It takes a long time to run and I am thinking there is a trick I am missing.
Any help?
Ok, my previous answer didn't match at all what you meant. You want to pivot the table to have on each line one Patient_ID with every Full_name for every provider_type. I assume that each patient has only one provider for one type and not more ; if more, you will have more than one row for each patient, and anyway I don't think it's really possible.
Here is my solution with pivot. The first part is to make it more understandable, so I create a table named TABLE_PATIENT in a subquery.
WITH TABLE_PATIENT AS
(
SELECT TablePatient.Patient_ID,
TableProviders.Provider_Type,
TableNames.Full_Name
FROM TablePatient LEFT JOIN
TableProviders on TablePatient.Patient_ID = TableProviders.Encounter
LEFT JOIN
TableNames on TableNames.Provider_ID = TableProviders.Provider_ID
group by TablePatient.Patient_ID,
TableProviders.Provider_Type,
TableNames.Full_Name
)
SELECT *
FROM TABLE_PATIENT
PIVOT
(
min(Full_name)
for Provider_type in ([type1], [type2],[type3])
) AS PVT
So TABLE_PATIENT just has many rows for each patient, with one provider each row, and the pivot puts everything on a single row. Tell me if something doesn't work.
You need to write every type you want in the [type1],[type2] etc. Just put them inside [], no other character needed as ' or anything else.
If you put only some types, then the query will not show providers of other types.
Tell me if something doesn't work.
If I understand what you mean, you just want to group the answer by Patient Id and then Provider ID. A full name is unique on a provider id right ?
This should be something like
SELECT TablePatient.Patient_ID,
TableProviders.Provider_ID,
TableNames.Full_Name
FROM TablePatient LEFT JOIN
TableProviders on TablePatient.Patient_ID = TableProviders.Encounter
LEFT JOIN
TableNames on TableNames.Provider_ID = TablerProviders.Provider_ID
group by TablePatient.Patient_ID,
TableProviders.Provider_ID,
TableNames.Full_Name
You can either group by TableNames.Full_Name or select First(TableNames.Full_Name) for example if indeed a full name is unique to a provider ID.
Note : I used the SQL server Syntax, there can be différences with Oracle ..

cascading Input Control sql query return error: "ORA-01427: single-row subquery returns more than one row"

looking for solution on my sql query error.I'm trying to create second cascading Input Control in JaspersoftServer. The first Input Control works fine, however when I try to create a second cascade IC it returns with the error. I have 3 tables (user, client, user_client), many to many, so 1 linked table (user_client) between them.The 1st Input Control (client) - works well, end user will select the client, the client can have many users, so cascade is the key. Also, as the output, I would like to get not the user_id, but user's firstname and the lastname as one column field. And here is where i'm stuck. I'm pretty sure it is simple syntaxis error, but spent a good couple of hours to figure out what is wrong with it. Is anyone can have a look at it please and indicate where is the problem in my query ?! So far I've done:
select distinct
u.user_id,(
SELECT CONCAT(first_name, surname) AS user_name from tbl_user ),
c.client_id
FROM tbl_user u
left join tbl_user_client uc
on uc.user_id = u.user_id
left join tbl_client c
on c.client_id = uc.client_id
where c.client_id = uc.client_id
order by c.client_id
Thank you in advance.
P.S. JasperServer + Oracle 11g
You're doing an uncorrelated subquery to get the first/last name from the user table. There is no relationship between that subquery:
SELECT CONCAT(first_name, surname) AS user_name from tbl_user
... and the user ID in the main query, so the subquery will attempt to return every first/last name for all users, for every row your joins find.
You don't need to do a subquery at all as you already have the tbl_user information available:
select u.user_id,
CONCAT(u.first_name, u.surname) AS user_name
c.client_id
FROM tbl_user u
left join tbl_user_client uc
on uc.user_id = u.user_id
left join tbl_client c
on c.client_id = uc.client_id
where c.client_id = uc.client_id
order by c.client_id
If you want to put a space between the first and last name you'll either need nested concat() calls, since that function only takes two arguments:
select u.user_id,
CONCAT(u.first_name, CONCAT(' ', u.surname)) AS user_name
...
... or perhaps more readably use the concatenation operator instead:
select u.user_id,
u.first_name ||' '|| u.surname AS user_name
...
If the first control has selected a client and this query is supposed to find the users related to that client, you're joining the tables the wrong way round, aren't you? And you aren't filtering on the selected client - but no idea how that's actually implemented in Jasper. Maybe you do want the entire list and will filter it on the Jasper side.

Efficient Alternative to Outer Join

The RIGHT JOIN on this query causes a TABLE ACCESS FULL on lims.operator. A regular join runs quickly, but of course, the samples 'WHERE authorised_by IS NULL' do not show up.
Is there a more efficient alternative to a RIGHT JOIN in this case?
SELECT full_name
FROM (SELECT operator_id AS authorised_by, full_name
FROM lims.operator)
RIGHT JOIN (SELECT sample_id, authorised_by
FROM lims.sample
WHERE sample_template_id = 200)
USING (authorised_by)
NOTE: All columns shown (except full_name) are indexed and the primary key of some table.
Since you're doing an outer join, it could easily be that it actually is more efficient to do a full table scan rather than use the index.
If you are convinced the index should be used, force it with a hint:
SELECT /*+ INDEX (lims.operator operator_index_name)*/ ...
then see what happens...
No need to nest queries. Try this:
select s.full_name
from lims.operator o, lims.sample s
where o.operator_id = s.authorised_by(+)
and s.sample_template_id = 200
I didn't write sql for oracle since a while, but i would write the query like this:
SELECT lims.operator.full_name
FROM lims.operator
RIGHT JOIN lims.sample
on lims.operator.operator_id = lims.sample.authorized_by
and sample_template_id = 200
Does this still perform that bad?

Resources