Unfortunately this database has a ton of duplicate email addresses in it. I need to do a query and return only unique emails, doesn't really matter which one.
The query I have looks like this, can't really figure out what to add to not get duplicate emails returned. Can anyone think of anything?
select c.cid, c.email, c.uuid, e.code
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i');
-- Adding some additional info on request
The above query returns the following results, where you can see there are duplicates. I'm interested in only returning one row per unique email address.
3478|cust1#cust1.net|ouskns;dhf|1
3488|cust2#cust2.net|jlsudo;uff|0
3598|cust3#cust3.net|dl;udjffff|1
3798|cust1#cust1.net|osuosdujff|1
3888|cust1#cust1.net|odsos7jfff|1
-- Solution, thanks Mathguy
select cid, email, uuid, code
from
(select c.cid, c.email, c.uuid, e.code, row_number() over (partition by
c.email order by null) as rn
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i')
)
where rn = 1;
If it works as is and the only problem is the duplicates, you can change c.email to MAX(c.email) as email in the select clause, and add a group by clause to group by the other columns included in select.
EDIT: (actually I should delete the original answer since the OP clarified his question was quite different from what he seemed to ask originally - but that would also delete the comments... so editing instead)
If your query produces the desired results, but now you must pick just one random row per email address, you can try this:
select cid, email, uuid, code
from
( -- .... copy your select query here
-- ADD one column to the select line like so:
-- select c.cid, c.uuid, c.email, e.code,
-- row_number() over (partition by c.email order by null) as rn
-- ....
)
where rn = 1;
Using DISTINCT :
select DISTINCT c.email
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i');
Or using GROUP BY (and you get the number of dups in the cnt column)
select c.email, count(*) as cnt
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i')
GROUP BY c.email;
Related
I am trying to select values from three different tables.
When I select all columns it works well, but if I select specific column, the SQL Error [42000]: JDBC-8027:Column name is ambiguous. appear.
this is the query that selected all that works well
SELECT
*
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
and this is the error query
SELECT DISTINCT
x.POLICY_NO,
x.POLICY_TITLE,
policy_no_count ,
B.SMALL_CATEGORY_SID,
C.SMALL_CATEGORY_TITLE
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
I am trying to select if A.POLICY_NO values duplicate rows more than 18, want to change C.SMALL_CATEGORY_TITLE values to "ZZ" and also want to cahge B.SMALL_CATEGORY_SID values to null.
that is why make 2 select in query like this
SELECT DISTINCT
x.POLICY_NO,
CASE WHEN (policy_no_count > 17) THEN 'ZZ' ELSE C.SMALL_CATEGORY_TITLE END AS C.SMALL_CATEGORY_TITLE,
CASE WHEN (policy_no_count > 17) THEN NULL ELSE B.SMALL_CATEGORY_SID END AS B.SMALL_CATEGORY_SID,
x.POLICY_TITLE
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
If i use that query, I got SQL Error [42000]: JDBC-8006:Missing FROM keyword. ¶at line 3, column 80 of null error..
I know I should solve it step by step. Is there any way to select specific columns?
That's most probably because of SELECT x.*, B.*,C.* - avoid asterisks - explicitly name all columns you need, and then pay attention to possible duplicate column names; if you have them, use column aliases.
For example, if that select (which is in a subquery) evaluates to
select x.id, x.name, b.id, b.name
then outer query doesn't know which id you want as two columns are named id (and also two names), so you'd have to
select x.id as x_id,
x.name as x_name,
b.id as b_id,
b.name as b_name
from ...
and - in outer query - select not just id, but e.g. x_id.
I am creating a view that needs to select only one random row for each customer. Something like:
select c.name, p.number
from customers c, phone_numbers p
where p.customer_id = c.id
If this query returns:
NAME NUMBER
--------- ------
Customer1 1
Customer1 2
Customer1 3
Customer2 4
Customer2 5
Customer3 6
I need it to be something like:
NAME NUMBER
--------- ------
Customer1 1
Customer2 4
Customer3 6
Rownum wont work because it will select only the first from all 6 records, and i need the first from each customer. I need solution that won't affect performance much, because the query that selects the data is pretty complex, this is just an example to explain what I need. Thanks in advance.
Use the ROW_NUMBER() analytic function:
SELECT name,
number
FROM (
SELECT c.name,
p.number,
ROW_NUMBER() OVER ( PARTITION BY c.id ORDER BY DBMS_RANDOM.VALUE ) AS rn
FROM customers c
INNER JOIN phone_numbers p
ON ( p.customer_id = c.id )
)
WHERE rn = 1
You can use group by clause to return only one phone number:
select c.name, MAX(p.number) as phone
from customers c, phone_numbers p
where p.customer_id = c.id
group by c.name
You can also use the min or max aggregate function with the keep dense_rank syntax:
select c.name,
min(p.number) keep (dense_rank last order by dbms_random.value) as number
from customers c
join phone_numbers p on p.customer_id = c.id
group by c.id, c.name
order by c.name;
(number isn't a valid column or alias name as it's a reserved word, so use your own real name of course).
If the phone number needs to be arbitrary rather than actually random, you can order by something else:
select c.name,
min(p.number) keep (dense_rank last order by null) as number
from customers c
join phone_numbers p on p.customer_id = c.id
group by c.id, c.name
order by c.name;
You'll probably get the same number back for each customer each time, but not always, and data/stats/plan changes will affect which you see. It seems like you don't care though. But then, using a plain aggregate might work just as well for you, as in #under's answer.
If you're getting lots of columns from that random row, rather than just the phone number, MTO's subquery might be simpler; for one or two values I find this a bit clearer though.
I have query in which i used partition of avoid the duplicate value for particular column , but still it is giving duplicate row below i am mention my query in which i used partition
SELECT iol.M_product_id as faultyProduct , iol.SERIALNO,iol.M_product_id as newproduct, ma.Description,
mp.M_Product_category_id ,mi.issotrx, co.C_BPartner_ID,
ROW_NUMBER() OVER(PARTITION BY ma.Description ORDER BY iol.M_product_id DESC) rn
FROM M_inoutline iol
inner join M_inout mi ON (iol.m_inout_id = mi.m_inout_id)
inner join C_Order co ON (co.c_order_id = mi.c_order_id )
inner Join M_AttributeSetInstance ma ON (ma.m_attributesetinstance_id =iol.m_attributesetinstance_id)
inner join M_Product mp ON (mp.m_product_id = iol.m_product_id)
where mp.m_product_category_id= 1000447 AND mi.issotrx = 'Y';
Please help me out
For me it looks you want to do:
select * from (/*YOUR QUERY*/) where rn = 1;
I'd like to isplay the name, address and phone number for the clients who has not made any reservations for the past two months.
The tables are :
CLIENT (
ClientNo,
Name,
Sex,
DOB,
Address,
Phone,
Email,
Occupation,
MaritalStatus,
Spouse,
Anniversary
)
RESERVATION
ResNo,
ResDate,
NoOfGuests,
StartDate,
EndDate,
ClientNo,
Status
)
I tried:
SELECT Name, Address, Phone, ResNo
FROM Client C, Reservation R
WHERE Date_Column >= ResDate R(MONTH, -3, GETDATE())
ORDER BY Name DESC
You need an outer join, with all conditions in the join, and then filter out all successful joins:
SELECT C.*
FROM Client C
LEFT JOIN Reservation R ON C.ID = R.ID -- outer join
AND R.ResDate >= add_months(TRUNC(SYSDATE) + 1, 2) -- condition in join
WHERE R.ID IS NULL -- only return missed joins
ORDER BY Name ASC
You want the missed joins - ones where there are no reservations at all given the conditions. Such missed joins have all-nulls in the joined table's columns, so filtering on (one of) those columns being null will return clients who don't have any such reservations.
Try to make your where condition like
ResDate >= add_months(TRUNC(SYSDATE) + 1, 2)
Also you are missing the JOIN for your table. So you need to put the JOIN like
SELECT C.Name, C.Address, C.Phone, R.ResNo, R.ResDate
FROM Client C INNER JOIN Reservation R ON C.ID = R.ID --Change the column for Joing as per your table structure
WHERE R.ResDate >= add_months(TRUNC(SYSDATE) + 1, 2))
ORDER BY Name ASC;
For Oracle,
I have 2 tables; first is Store and another is Book whereas they are connected by store ID (PK FK).
I would like to lists the name of the store which has the highest numbers of books.
However, the result showed every store in orders but I just want the highest.
SELECT STORE.STORE_NAME
FROM STORE, BOOK
WHERE STORE.STORE_ID=BOOK.BOOK_STOREID
GROUP BY STORE.STORE_NAME
ORDER BY COUNT(BOOK.BOOK_STOREID) DESC;
the result is
Store:
D
E
F
B
A
C
It should be only 'D'. What should I do? Thank you.
Try
SELECT STORE_NAME
FROM
(SELECT STORE.STORE_NAME
FROM STORE, BOOK
WHERE STORE.STORE_ID=BOOK.BOOK_STOREID
GROUP BY STORE.STORE_NAME
ORDER BY COUNT(BOOK.BOOK_STOREID) DESC)
WHERE rownum = 1
Here is a sqlfiddle demo
BTW, you can also use row_number() function
select STORE_NAME
from
(SELECT STORE.STORE_NAME,
row_number() over( order by COUNT(BOOK.BOOK_STOREID)desc) rn
FROM STORE join BOOK on STORE.STORE_ID=BOOK.BOOK_STOREID
group by STORE.STORE_NAME)
where rn = 1;
UPDATE If you want to see all stores which have the max number of books you can use rank instead of row_number:
select STORE_NAME
from
(SELECT STORE.STORE_NAME,
rank() over( order by COUNT(BOOK.BOOK_STOREID)desc) rn
FROM STORE join BOOK on STORE.STORE_ID=BOOK.BOOK_STOREID
group by STORE.STORE_NAME)
where rn = 1;
Just for fun, here's another formulation:
with store_counts as (
select store_name,
count(*) books
from store join book on store_id=book_storeid
group by store_name)
select *
from store_counts
where books = (
select max(books)
from store_counts)