grouping while querying oracle database [duplicate] - oracle

This question already has answers here:
ORA-00979 not a group by expression
(10 answers)
Closed 5 years ago.
Below is the query that i am using. but
Query:
SELECT c.session_id session_number,
u.first_name first_name,
u.last_name last_name,
Min(c.timestamp) session_start_ts,
Max(c.timestamp) session_end_ts,
Count(
CASE
WHEN c.message_type='IN' THEN 1
ELSE NULL
END) message_in_count,
Count(c.message_type) total_messages
FROM users u
join chatlog c
ON u.user_id = c.user_id
WHERE Trunc(c.timestamp) BETWEEN To_date('2017-10-11','YYYY-MM-DD') AND To_date('2017-11-09','YYYY-MM-DD')
GROUP BY c.session_id,
order by timestamp;
The problem is that it gives an error stating "not a GROUP BY expression". But instead of just grouping by session id if i use :
group by c.session_id, u.first_name, u.last_name, c.timestamp
It works though the values of first_name and last_name are same for a particulat session_id and timestamp also i am only taking the max. so cant understand why i am unable to group by session_id only.

"cant understand why i am unable to group by session_id only"
The rule of Oracle aggregation functions is that we need to group by all the non-aggregated columns in the projection. In your case that is
group by c.session_id, u.first_name, u.last_name
" how do we overcome that. i mean there must be a way that you can overcome that. so that if you want to select multiple columns and just group by a particular column?"
This doesn't apply in your case. You say:
"the values of first_name and last_name are same for a particulat (sic) session_id"
which means grouping by session_id,first_name,last_name is the same as grouping by session_id alone. But as a general point, we can use analytical functions to aggregate values in a different window from the result set. Find out more.

Related

get the newest order for each group [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Closed 5 months ago.
I am trying to get the newest order for each phone number so I used in oracle
select * from (select * from orders where phonenum ='914780' order by order_date)
where rownum<=1
it works for one number only if I used it of several number it give me wrong results
as each number has several orders
Use window function for ordering phone number wise orders and then pick each phone number latest record.
Select *
from (
Select o.*,
row_number() over (partition by phonenum order by order_id desc) AS rowno
from order o
Where phonenum = '1233'
) t
Where t.rowno = 1;
N.B.: use table name and columns according to your DB objects.

Hive Getting error on group by column while using case statements and aggregations

I am working on a query in hive. In that I am using aggregations like sum and case statements and group by clause. I have changed the column names and table names but my logic is same which I was using in my project
select
empname,
empsal,
emphike,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike,
case when tot_sal > 1000 then exp(tot_hike)
else 0
end as manager
from employee
group by
empname,
empsal,
emphike
For the above query I was getting error as "Expression not in group by key '1000'".
So I have slightly modified the query and tried again My other query is
select
empname,
empsal,
emphike,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike,
case when sum(empsal) > 1000 then exp(sum(emphike))
else 0
end as manager
from employee
group by
empname,
empsal,
emphike
For above query its putting me error as "Expression not in group by key 'Manager'".
When I add manager in the group by its showing invalid alias.
Please help me out here
I see three issues in your query:
1.) Hive cannot group by a variable you defined in the select block by the name you gave it right away. You will probably need a subquery for that.
2.) Hive tends to show errors when sum or count operations are not at the end of the query.
3.) Although I do not know what your goal is, I think that your query will not deliver the desired result. If you group by empsal there would be no difference between empsal and sum(empsal) by design. Same goes for emphike and sum(emphike).
I think the following query might solve these issues:
select
a.empname,
a.tot_sal,
a.tot_hike,
if(a.tot_sal > 1000, exp(a.tot_hike), 0) as manager
from
(select
empname,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike,
from employee
group by
empname
)a
The if statement is equivalent to your case statement, however I find it a bit easier to read.
In this example you wouldn't need to group by after the subquery because the grouping is done in the subquery a.

Function returning Last record

I don't often use ORACLE PL/SQL by the way but i need to understand what if anything in this function created by someone else
in the company before me is wrong as for it is not returning the latest record i've been told. I found out in some other forum issues that they
suggested to use the max(dateColumn) instead of "row_numer = 1" for example but not quite sure how to and where to incorporate that.
-- Knowing that --
We use Oracle version 12,
CustomObjectTypeA is an custom Oracle OBJECT TYPE defined by some old employee not longer in here,
V_OtherView is of Table_Mnd type beeing defined by some old employee not longer in here,
V_ABC_123 is a view created by some old employee not longer in here as well.
CREATE OR REPLACE FUNCTION F_TABLE_APPROVED (NUMBER_F_UPD number, NUMBER_F_GET VARcHAR2)
RETURN Table_Mnd
IS
V_OtherView Table_Mnd
BEGIN
SELECT CustomObjectTypeA (FromT.NUMBER_F,
FromT.OP_CODE,
FromT.CATG_CODE,
FromT.CATG_NAME,
FromT.CATG_SORT,
FromT.ORG_CODE,
FromT.ORG_NAME
FromT.DATA_ENTRY_VALID,
FromT.NUMBER_RECEIVED,
FromT.YEAR_1,
FromT.YEAR_2)
BULK COLLECT INTO V_OtherView
FROM (SELECT NUMBER_F,
OP_CODE,
CATG_CODE,
CATG_NAME,
CATG_SORT,
ORG_CODE,
ORG_NAME
DATA_ENTRY_VALID,
NUMBER_RECEIVED,
YEAR_1,
YEAR_2,
ROW_NUMBER() OVER (PARTITION BY BY ORG_CODE ORDER BY NUMBER_RECEIVED DESC, LOAD_DATE DESC) AS ROW_NUMBER
FROM V_ABC_123
WHERE NUMBER_F = NUMBER_F_UPD AND DATA_ENTRY_VALID <> 'OnGoing'
AND LOAD_DATE >= (SELECT sysdate-10 FROM dual)
AND LOAD_DATE <= (SELECT DISTINCT LOAD_DATE
FROM V_ABC_123
WHERE NUMBER_RECEIVED = NUMBER_F_GET)) FromT
WHERE FromT.ROW_NUMBER=1;
RETURN V_OtherView;
END F_TABLE_APPROVED;
The important bits of the query are:
SELECT ...
FROM (select ...,
ROW_NUMBER()
OVER (PARTITION BY ORG_CODE
ORDER BY NUMBER_RECEIVED DESC,
LOAD_DATE DESC) AS ROW_NUMBER
...) FromT
WHERE FromT.ROW_NUMBER = 1;
The "ROW_NUMBER" column is computed according to the following window clause:
PARTITION BY ORG_CODE
ORDER BY NUMBER_RECEIVED DESC, LOAD_DATE DESC
Which means that for each ORG_CODE, it will sort all the records by NUMBER_RECEVED,LOAD_DATE in descending order. Note that if the columns are Oracle DATEs, they will only be accurate to the nearest second; so if there are multiple records with date/times in the exact same 1-second interval, this sort order will not be guaranteed unique. The logic of ROW_NUMBER will therefore pick one of them arbitrarily (i.e. whichever record happens to be emitted first) and assign it the value "1", and this will be deemed the "latest". Subsequent executions of the same SQL could (in theory) return a different record.
The suspicious part is NUMBER_RECEIVED which sounds like it's a number, not a date? Sorting by this means that the records with the highest NUMBER_RECEIVED will be preferred. Was this intentional?
I'm not sure why the PARTITION is there, this would cause the query to return one "latest" record for each value of ORG_CODE that it finds. I can only assume this was intentional.
The problem is that the query can only determine the "latest record" as well as it can based on the data provided to it. In this case, it's possible the data is simply not granular enough to be able to decide which record is the actual "latest" record.

SQL query for latest record [duplicate]

This question already has answers here:
Oracle select most recent date record
(4 answers)
Closed 5 years ago.
I want to select the latest record from a table as based on a date field (crtn_dt). The query below does not work. Does anyone have an idea how it should be fixed?
select * from parcels
order by crtn_dt desc
where rownum = 1
You'd need to order data in the subquery and filter them in an outer query.
select *
from (
select *
from parcels
order by crtn_dt desc
)
where rownum = 1
order by clause is among last operations to perform.
What your query does, apart from being semantically incorrect, it returns one (thanks to rownum = 1 predicate) arbitrary row, and then applies order by clause to that one row.

Oracle tuning for query with query annidate

i am trying to better a query. I have a dataset of ticket opened. Every ticket has different rows, every row rappresent an update of the ticket. There is a field (dt_update) that differs it every row.
I have this indexs in the st_remedy_full_light.
IDX_ASSIGNMENT (ASSIGNMENT)
IDX_REMEDY_INC_ID (REMEDY_INC_ID)
IDX_REMDULL_LIGHT_DTUPD (DT_UPDATE)
Now, the query is performed in 8 second. Is high for me.
WITH last_ticket AS
( SELECT *
FROM st_remedy_full_light a
WHERE a.dt_update IN
( SELECT MAX(dt_update)
FROM st_remedy_full_light
WHERE remedy_inc_id = a.remedy_inc_id
)
)
SELECT remedy_inc_id, ASSIGNMENT FROM last_ticket
This is the plan
How i could to better this query?
P.S. This is just a part of a big query
Additional information:
- The table st_remedy_full_light contain 529.507 rows
You could try:
WITH last_ticket AS
( SELECT remedy_inc_id, ASSIGNMENT,
rank() over (partition by remedy_inc_id order by dt_update desc) rn
FROM st_remedy_full_light a
)
SELECT remedy_inc_id, ASSIGNMENT FROM last_ticket
where rn = 1;
The best alternative query, which is also much easier to execute, is this:
select remedy_inc_id
, max(assignment) keep (dense_rank last order by dt_update)
from st_remedy_full_light
group by remedy_inc_id
This will use only one full table scan and a (hash/sort) group by, no self joins.
Don't bother about indexed access, as you'll probably find a full table scan is most appropriate here. Unless the table is really wide and a composite index on all columns used (remedy_inc_id,dt_update,assignment) would be significantly quicker to read than the table.

Resources