I have a table whose records are updated frequently, and I am trying to write a query that pulls only the latest record. Ideally I would use the updated_date column for this, but in many rows updated_date is blank, so I create a new column with the following expression:
CASE WHEN UPDATED_DATE IS NULL THEN CREATED_DATE ELSE UPDATED_DATE END latest_date
In other words, when updated_date is blank the new column falls back to created_date. My data now looks like this:
updated_date  audit_id  created_date  latest_date
------------  --------  ------------  -----------
2021-04-02    006       2018-06-06    2021-04-02
NULL          006       2018-06-06    2018-06-06
2020-03-01    006       2018-06-06    2020-03-01
NULL          007       2018-07-07    2018-07-07
2020-04-01    007       2018-07-07    2020-04-01
2019-09-08    007       2018-07-07    2019-09-08
What I would like to retrieve is the latest info only:
updated_date  audit_id  created_date  latest_date
------------  --------  ------------  -----------
2021-04-02    006       2018-06-06    2021-04-02
2020-04-01    007       2018-07-07    2020-04-01
If someone could please let me know how this data can be retrieved, that would be great.
Thank you.
I assume your intention is to get the most recent row for each audit_id. If so, you can do something like
select *
from (select t.*,
row_number() over (partition by audit_id
order by coalesce( updated_date, created_date ) desc ) rn
from your_table t)
where rn = 1;
If two rows can share the same audit_id and latest_date, you could use the rank or dense_rank function instead of row_number to return all of the tied rows. I'm guessing that is not an issue you'll need to deal with, in which case letting row_number break the tie arbitrarily is sufficient.
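The row_number approach above can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption, and it needs SQLite 3.25 or later for window functions), stores the dates as ISO text, and reuses the placeholder table name your_table from the answer.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (assumption); dates are ISO text,
# which sorts correctly as plain strings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (updated_date TEXT, audit_id TEXT, created_date TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("2021-04-02", "006", "2018-06-06"),
        (None, "006", "2018-06-06"),
        ("2020-03-01", "006", "2018-06-06"),
        (None, "007", "2018-07-07"),
        ("2020-04-01", "007", "2018-07-07"),
        ("2019-09-08", "007", "2018-07-07"),
    ],
)

# Number the rows of each audit_id from newest to oldest and keep number 1.
latest_rows = conn.execute(
    """
    SELECT updated_date, audit_id, created_date, latest_date
    FROM (SELECT t.*,
                 COALESCE(updated_date, created_date) AS latest_date,
                 ROW_NUMBER() OVER (PARTITION BY audit_id
                                    ORDER BY COALESCE(updated_date, created_date) DESC) AS rn
          FROM your_table t)
    WHERE rn = 1
    ORDER BY audit_id
    """
).fetchall()

for row in latest_rows:
    print(row)
```

With the sample data from the question this keeps one row per audit_id, the one with the greatest latest_date.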
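You can use COALESCE and FETCH FIRST n ROWS ONLY (Oracle 12c and later). Note that this returns the n most recent rows overall rather than one row per audit_id, so it matches the expected output only when the newest rows belong to different audit_ids, as they do in the sample data: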
SELECT *
FROM table_name
ORDER BY COALESCE(UPDATED_DATE, CREATED_DATE) DESC
FETCH FIRST 2 ROWS ONLY;
I am working on an Oracle database.
We load customer data into a source table, from which it eventually migrates to a target table.
Every time customer data is loaded into the source table, it gets a unique batch_id.
If we want to update some field in the customer table, we load the same customer into the source table again, this time with a different batch_id.
Now I want to know the customer's batch_id just before the latest batch_id.
The batch_id we use is usually the current date.
Use the ROW_NUMBER analytic function.
Your sample data:
select * from tab
order by 1,2
CUSTOMER_ID BATCH_ID
----------- -------------------
1 09.12.2019 00:00:00
1 10.12.2019 00:00:00
2 10.12.2019 00:00:00
ROW_NUMBER assigns a sequence number starting from 1 to each customer's rows, ordered by BATCH_ID descending. You are interested in the one before the latest, i.e. the rows with the number 2.
with cust as (
select
customer_id, batch_id,
row_number() over (partition by customer_id order by batch_id desc) rn
from tab)
select CUSTOMER_ID, BATCH_ID
from cust
where rn = 2;
CUSTOMER_ID BATCH_ID
----------- -------------------
1 09.12.2019 00:00:00
It seems that you're basically looking for the second biggest value in the SOURCE table.
In this example code the SOURCE_TABLE represents the table containing the same CUSTOMER_NO with different BATCH_NO values:
create table source_table (customer_no integer, batch_no date);
insert into source_table values (1, SYSDATE-2);
insert into source_table values (1, SYSDATE-1);
insert into source_table values (1, SYSDATE);
SELECT batch_no
FROM (
SELECT batch_no, row_number() over (order by batch_no desc) as row_num
FROM source_table
) t
WHERE row_num = 2
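Here row_num = 2 picks the second biggest value in the whole table, which is what you want when the table holds a single customer; with several customers, add PARTITION BY customer_no to the OVER clause.
The query returns SYSDATE-1.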
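To illustrate how the un-partitioned ranking behaves once a second customer is present, here is a small sketch. It runs on Python's built-in sqlite3 rather than Oracle (an assumption; SQLite 3.25 or later is needed), and both the ISO-text dates and the rows for customer 2 are hypothetical values added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_table (customer_no INTEGER, batch_no TEXT)")
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?)",
    [
        (1, "2019-12-08"), (1, "2019-12-09"), (1, "2019-12-10"),
        (2, "2019-12-05"), (2, "2019-12-10"),  # hypothetical second customer
    ],
)

# Ranking over the whole table: row 2 is simply the second row overall.
global_second = conn.execute(
    """
    SELECT batch_no
    FROM (SELECT batch_no,
                 ROW_NUMBER() OVER (ORDER BY batch_no DESC) AS row_num
          FROM source_table) t
    WHERE row_num = 2
    """
).fetchall()

# Ranking restarted for every customer: row 2 is each customer's previous batch.
per_customer = conn.execute(
    """
    SELECT customer_no, batch_no
    FROM (SELECT customer_no, batch_no,
                 ROW_NUMBER() OVER (PARTITION BY customer_no
                                    ORDER BY batch_no DESC) AS row_num
          FROM source_table) t
    WHERE row_num = 2
    ORDER BY customer_no
    """
).fetchall()

print(global_second)
print(per_customer)
```

In this data both customers share the latest batch date, so the global query returns that latest date again instead of anyone's previous batch, while the partitioned query returns one previous batch per customer.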
I have the following tables
f_orders
ORDER_NUMBER NUMBER(5,0)
ORDER_DATE DATE
ORDER_TOTAL NUMBER(8,2)
CUST_ID NUMBER(5,0)
STAFF_ID NUMBER(5,0)
with the following data
ORDER_NUMBER ORDER_DATE ORDER_TOTAL CUST_ID STAFF_ID
5678 10-Dec-2017 103.02 123 12
9999 10-Dec-2017 10 456 19
9997 09-Dec-2017 3 123 19
9989 10-Dec-2016 3 123 19
and
f_customers
ID NUMBER(5,0)
FIRST_NAME VARCHAR2(25)
LAST_NAME VARCHAR2(35)
ADDRESS VARCHAR2(50)
with the following data
ID FIRST_NAME LAST_NAME ADDRESS
123 Cole Bee 123 Main Street
456 Zoe Twee 1009 Oliver Avenue
I'm supposed to display the name of the customer with the most orders placed in the year 2017.
My query looks like this
SELECT f_customers.first_name,
f_customers.last_name,
count(order_total)
FROM f_orders JOIN f_customers
ON f_customers.id = f_orders.CUST_ID
WHERE TO_CHAR(order_date, 'DD-Mon-YYYY') LIKE '%2017'
GROUP BY f_customers.first_name, f_customers.last_name
HAVING count(order_total) = (SELECT max(count(cust_id))
FROM f_orders
GROUP BY cust_id)
The problem is that whenever I add the WHERE clause the query returns no data found, even though it should return the name Cole Bee with 2 orders.
If I remove the WHERE clause it shows that Cole Bee has placed 3 orders.
I can't figure out why I get the no data found result. Any ideas?
Your main query is filtering on the year; the subquery on the right hand side of the having clause is not. The max(count()) is 3 if you run that subquery on its own, and you’re comparing that with the filtered list which (as you expect) only finds 2 rows for that customer.
Run the whole query with just the having part removed (rather than the where clause), and run just the subquery; and compare the results.
The simple answer is to repeat the filter:
SELECT f_customers.first_name,
f_customers.last_name,
count(order_total)
FROM f_orders JOIN f_customers
ON f_customers.id = f_orders.CUST_ID
WHERE TO_CHAR(order_date, 'DD-Mon-YYYY') LIKE '%2017'
GROUP BY f_customers.first_name, f_customers.last_name
HAVING count(order_total) = (SELECT max(count(cust_id))
FROM f_orders
WHERE TO_CHAR(order_date, 'DD-Mon-YYYY') LIKE '%2017'
GROUP BY cust_id)
Both filters could be written more simply as:
WHERE TO_CHAR(order_date, 'YYYY') = '2017'
or even:
WHERE EXTRACT(YEAR FROM order_date) = 2017
You can avoid hitting the table twice by using analytic queries and other tricks, but as this seems to be an assignment, that may be going beyond what you've been taught and are expected to know/use.
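For readers curious about the analytic alternative, one possible shape is to count the 2017 orders per customer once and then rank those counts. The sketch below is an assumption-laden illustration, not the expected assignment answer: it runs on Python's built-in sqlite3 instead of Oracle (SQLite 3.25 or later), stores order_date as ISO text, and filters the year with a date range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE f_orders (order_number INTEGER, order_date TEXT, "
    "order_total REAL, cust_id INTEGER, staff_id INTEGER)"
)
conn.execute(
    "CREATE TABLE f_customers (id INTEGER, first_name TEXT, last_name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO f_orders VALUES (?, ?, ?, ?, ?)",
    [
        (5678, "2017-12-10", 103.02, 123, 12),
        (9999, "2017-12-10", 10, 456, 19),
        (9997, "2017-12-09", 3, 123, 19),
        (9989, "2016-12-10", 3, 123, 19),
    ],
)
conn.executemany(
    "INSERT INTO f_customers VALUES (?, ?, ?, ?)",
    [
        (123, "Cole", "Bee", "123 Main Street"),
        (456, "Zoe", "Twee", "1009 Oliver Avenue"),
    ],
)

# f_orders is read once: count per customer, rank the counts, keep rank 1.
top_customers = conn.execute(
    """
    SELECT first_name, last_name, order_count
    FROM (SELECT first_name, last_name, order_count,
                 RANK() OVER (ORDER BY order_count DESC) AS rnk
          FROM (SELECT c.first_name, c.last_name, COUNT(*) AS order_count
                FROM f_orders o
                JOIN f_customers c ON c.id = o.cust_id
                WHERE o.order_date >= '2017-01-01'
                  AND o.order_date < '2018-01-01'
                GROUP BY c.id, c.first_name, c.last_name))
    WHERE rnk = 1
    """
).fetchall()

print(top_customers)
```

Because RANK gives tied counts the same rank, customers who share the highest count would all be returned, which matches the behaviour of the HAVING version.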
I have a table in Hive called purchase_data that lists all the purchases made.
I need to query this table and find the cust_id, product_id and price of the most expensive product purchased by a customer.
The data in purchase_data table looks like:
cust_id product_id price purchase_data
--------------------------------------------------------
aiman_sarosh apple_iphone5s 55000 01-01-2014
aiman_sarosh apple_iphone6s 65000 01-01-2017
jeff_12 apple_iphone6s 65000 01-01-2017
jeff_12 dell_vostro 70000 01-01-2017
missy_el lenovo_thinkpad 70000 01-02-2017
I have written the code below, but it is not fetching the right rows.
Some rows are getting repeated:
select master.cust_id, master.product_id, master.price
from
(
select cust_id, product_id, price
from purchase_data
) as master
join
(
select cust_id, max(price) as price
from purchase_data
group by cust_id
) as max_amt_purchase
on max_amt_purchase.price = master.price;
output:
aiman_sarosh apple_iphone6s 65000.0
jeff_12 apple_iphone6s 65000.0
jeff_12 dell_vostro 70000.0
jeff_12 dell_vostro 70000.0
missy_el lenovo_thinkpad 70000.0
missy_el lenovo_thinkpad 70000.0
Time taken: 21.666 seconds, Fetched: 6 row(s)
Is there something wrong with the code?
Use row_number():
select pd.*
from (select pd.*,
row_number() over (partition by cust_id order by price desc) as seqnum
from purchase_data pd
) pd
where seqnum = 1;
This returns one row per cust_id, even if there are ties. If you want multiple rows when there are ties, then use rank() or dense_rank() instead of row_number().
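The difference between row_number() and rank() only shows up when a customer has two purchases at the same top price. The sketch below adds one hypothetical tied row for missy_el to the question's data to make that visible. It runs on Python's built-in sqlite3 rather than Hive (an assumption; SQLite 3.25 or later), and the helper name top_purchases is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_data (cust_id TEXT, product_id TEXT, price REAL, purchase_date TEXT)"
)
conn.executemany(
    "INSERT INTO purchase_data VALUES (?, ?, ?, ?)",
    [
        ("aiman_sarosh", "apple_iphone5s", 55000, "01-01-2014"),
        ("aiman_sarosh", "apple_iphone6s", 65000, "01-01-2017"),
        ("jeff_12", "apple_iphone6s", 65000, "01-01-2017"),
        ("jeff_12", "dell_vostro", 70000, "01-01-2017"),
        ("missy_el", "lenovo_thinkpad", 70000, "01-02-2017"),
        ("missy_el", "dell_vostro", 70000, "01-03-2017"),  # hypothetical tie
    ],
)


def top_purchases(ranking_function):
    """Return each customer's top-priced purchase(s) using the given window function."""
    if ranking_function not in ("ROW_NUMBER", "RANK"):
        raise ValueError("unsupported ranking function")
    query = f"""
        SELECT cust_id, product_id, price
        FROM (SELECT pd.*,
                     {ranking_function}() OVER (PARTITION BY cust_id
                                                ORDER BY price DESC) AS seqnum
              FROM purchase_data pd) pd
        WHERE seqnum = 1
        ORDER BY cust_id, product_id
    """
    return conn.execute(query).fetchall()


one_per_customer = top_purchases("ROW_NUMBER")  # tie broken arbitrarily
with_ties = top_purchases("RANK")               # both tied rows kept

print(len(one_per_customer), len(with_ties))
```

With row_number() exactly one of the two tied missy_el rows survives, and which one is not defined; with rank() both are returned.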
I changed the code, and it's working now:
select master.cust_id, master.product_id, master.price
from
purchase_data as master,
(
select cust_id, max(price) as price
from purchase_data
group by cust_id
) as max_price
where master.cust_id=max_price.cust_id and master.price=max_price.price;
output:
aiman_sarosh apple_iphone6s 65000.0
missy_el lenovo_thinkpad 70000.0
jeff_12 dell_vostro 70000.0
Time taken: 55.788 seconds, Fetched: 3 row(s)
I'm using an Oracle database.
How do I get all columns while grouping by only one column (EMP_ID)?
For example, I have the table ESD_RESULTS:
FIRST_NAME | LAST_NAME | EMP_ID | WRIST_STATUS | LFOOT_STATUS | DATE
Dodo | A | 0101 | Pass | Pass | 2016-01-18 10:00
Wedi | Wil | 0105 | Pass | Pass | 2016-01-18 10:05
Dodo | A | 0101 | Pass | Fail | 2016-01-18 10:11
What I want displayed is the latest row by date for each EMP_ID:
FIRST_NAME | LAST_NAME | EMP_ID | WRIST_STATUS | LFOOT_STATUS | DATE
Dodo | A | 0101 | Pass | Fail | 2016-01-18 10:11
Wedi | Wil | 0105 | Pass | Pass | 2016-01-18 10:05
I tried using DISTINCT and GROUP BY, but all the rows are still shown.
One option is to use ROW_NUMBER() to identify the latest record for each employee:
SELECT t.FIRST_NAME,
t.LAST_NAME,
t.EMP_ID,
t.WRIST_STATUS,
t.LFOOT_STATUS,
t.DATE
FROM
(
SELECT FIRST_NAME, LAST_NAME, EMP_ID, WRIST_STATUS, LFOOT_STATUS, DATE,
ROW_NUMBER() OVER (PARTITION BY EMP_ID ORDER BY DATE DESC) rn
FROM ESD_RESULTS
) t
WHERE t.rn = 1
Since presumably the first name and the last name are determined by the emp_id (they don't change from one row to another), you might as well group by all three columns - resulting in less work. (On the other hand, it would make more sense to normalize your table design; one table shows the associated first name and last name for each emp_id, there is no need to repeat the first name and last name in "this" table, which you show in your post.)
Then: you can use the FIRST/LAST function, with keep (dense_rank ...), as demonstrated below, to eliminate the need for a subquery and an outer query. If there is the possibility of two rows having the exact same date and time for an emp_id, you may refine the query to accommodate "tie-breaks" of some kind. If there are no ties, then the query will work without modification.
DATE is a reserved word in Oracle; it shouldn't be used for table or column names. I changed it to DT.
with
test_data ( first_name, last_name, emp_id, wrist_status, lfoot_status, dt ) as (
select 'Dodo', 'A' , 0101, 'Pass', 'Pass', to_date('2016-01-18 10:00', 'yyyy-mm-dd hh24:mi') from dual union all
select 'Wedi', 'Wil', 0105, 'Pass', 'Pass', to_date('2016-01-18 10:05', 'yyyy-mm-dd hh24:mi') from dual union all
select 'Dodo', 'A' , 0101, 'Pass', 'Fail', to_date('2016-01-18 10:11', 'yyyy-mm-dd hh24:mi') from dual
)
-- end of test data (NOT part of the solution); SQL query begins BELOW THIS LINE
select first_name, last_name, emp_id,
min(wrist_status) keep (dense_rank last order by dt) as wrist_status,
min(lfoot_status) keep (dense_rank last order by dt) as lfoot_status,
max(dt) as dt
from test_data
group by first_name, last_name, emp_id
;
FIRST_NAME LAST_NAME EMP_ID WRIST_STATUS LFOOT_STATUS DT
---------- --------- ---------- ------------ ------------ ----------------
Dodo A 101 Pass Fail 2016-01-18 10:11
Wedi Wil 105 Pass Pass 2016-01-18 10:05
2 rows selected.
Problem statement:
Select the name, status, phone number and effective date of all stores
whose phone number has changed between 2003 and the present date.
The schema is:
store_name, phone_number, start_date, status
Sample rows:
abc 1234 30-DEC-2011 open
abc 3433 04-Jan-2012 close
bbb 4444 30-Jan-2010 open
bbb 4444 31-Jan-2011 open
Output:
abc 1234 open 30-DEC-2011 till 3-Jan-2012
abc 3433 close 04-Jan-2012 till date
I am also fine with two rows in the output, sorted by start date, like:
abc 1234 30-DEC-2011 open
abc 3433 04-Jan-2012 close
bbb should not be reported, as its phone number did not change. We should report only those stores whose phone number was changed.
Can someone help me with this query on Oracle? I guess it can be done with correlated subqueries, but I am not sure how to construct one.
Please note that my table has around 3154953 records, so I also need to make sure that the correlated query doesn't lock the table for a long time. Is this even possible with Oracle?
Thanks!
APC's answer works for me, except that I am seeing a lot of repetitions in my result.
For this input:
select store_name, phone_number, start_date, status from your_table where store_name = 'abc';
which returns
STORE_name Phone number start_date STATUS
---------------- ---------------- ----------- ----------
abc 122 18-JAN-2011 open
abc 122 18-JAN-2011 open
abc 122 18-JAN-2011 close
running your query gives me the following output:
abc 122 open from 18-JAN-2011 to 17-JAN-2011
abc 122 open from 18-JAN-2011 to 17-JAN-2011
abc 122 close from 18-JAN-2011 to date
Can you explain why, and what I am missing?
I'm presuming that this is for Oracle rather than MySQL, as my solution uses a couple of magic tricks which I'm pretty certain are not available in MySQL. The first is the Common Table Expression to get a result set which we can use more than once. The second is the use of the LEAD() analytic function to "predict" values in the next row.
So, here's the query:
with a as ( select store_name
, phone_number
, status
, start_date
, lead (start_date, 1, trunc(sysdate)) over (partition by store_name
order by start_date) as next_date
, lead (phone_number, 1, null) over (partition by store_name
order by start_date) as next_number
from your_table
where start_date >= date '2003-01-01' )
select a.store_name
, a.phone_number
, case when a.next_date != trunc(sysdate) then
a.status||' from '|| a.start_date ||' to '||to_char(a.next_date - 1)
else a.status||' from '||a.start_date ||' to date'
end as status_text
from a
where a.store_name in (
select store_name
from a
where phone_number != next_number)
order by a.store_name, a.start_date
/
And here's its output:
SQL> r
1 with a as ( select store_name
...
22 order by a.store_name, a.start_date
23 /
STORE_NAME PHONE_NUMBER STATUS_TEXT
-------------------- ------------ --------------------------------
abc 1234 open from 30-DEC-11 to 03-JAN-12
abc 3433 close from 04-JAN-12 to date
2 rows selected.
SQL>
As for this remark:
"so I also need to make sure that correlated query doesn't lock the
table for whole lot of time"
It doesn't matter in Oracle, because reads don't block other reads. Nor do they block writes, come to that.
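The LEAD() technique from this answer can be tried on the question's four sample rows. The sketch below is a simplified variant, not the exact Oracle query: it runs on Python's built-in sqlite3 (an assumption; SQLite 3.25 or later), stores start_date as ISO text, and leaves next_date as NULL for the open-ended row instead of defaulting it to today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (store_name TEXT, phone_number INTEGER, "
    "start_date TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        ("abc", 1234, "2011-12-30", "open"),
        ("abc", 3433, "2012-01-04", "close"),
        ("bbb", 4444, "2010-01-30", "open"),
        ("bbb", 4444, "2011-01-31", "open"),
    ],
)

# LEAD() looks one row ahead within each store, ordered by start_date.
changed_stores = conn.execute(
    """
    WITH a AS (
        SELECT store_name, phone_number, status, start_date,
               LEAD(start_date)   OVER (PARTITION BY store_name
                                        ORDER BY start_date) AS next_date,
               LEAD(phone_number) OVER (PARTITION BY store_name
                                        ORDER BY start_date) AS next_number
        FROM your_table
        WHERE start_date >= '2003-01-01'
    )
    SELECT store_name, phone_number, status, start_date, next_date
    FROM a
    WHERE store_name IN (SELECT store_name
                         FROM a
                         WHERE phone_number <> next_number)
    ORDER BY store_name, start_date
    """
).fetchall()

for row in changed_stores:
    print(row)
```

Store bbb drops out because its number never differs from the next row's number, and the last row of each store never matches the inner filter because a comparison with NULL is not true.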
It will be something along the lines of
select q0.*, q2.first_date, q3.last_date
from table0 as q0
join
(
select store_name, min(date) as first_date from table0 group by store_name
) as q2 on q2.store_name = q0.store_name
left join
(
select store_name, max(date) as last_date from table0 group by store_name
) as q3 on q3.store_name = q0.store_name
That's not quite right as I don't have MySQL in front of me, but it's something along these lines.