Top 10 products for dates between - clickhouse

I'm trying to figure out the best way to write a single query that returns the top 10 products for each date. I have a table with two columns, PID (int) and EventDate (date), with one row for each click.
Can you suggest how I can get the top 10 clicked products for each date in a date range? I'm stuck on how the sub-query must be constructed. I can get the part with GROUP BY date, but then I get stuck on count() and aggregation issues.
Here's my query for an individual date, but I want to run it for a range of dates. I can, of course, write a sub-query that generates the event dates, but I want to figure out how to do it more elegantly.
SELECT TOP 10 COUNT() as count, PID
FROM view_product
WHERE EventDate = toDate('2020-05-11')
GROUP BY PID
ORDER BY count DESC
The expected output is something like this:
PID Count Date
1 123 2020-02-04
21 101 2020-02-04
1332 99 2020-02-04
11 51 2020-02-04
634 49 2020-02-04
1332 43 2020-02-04
1 24 2020-02-04
21 23 2020-02-04
1332 6 2020-02-04
11 3 2020-02-04
1 266 2020-02-02
21 241 2020-02-02
1332 232 2020-02-02
11 179 2020-02-02
634 163 2020-02-02
1332 159 2020-02-02
1 144 2020-02-02
21 100 2020-02-02
1332 99 2020-02-02
11 74 2020-02-02

Use the LIMIT BY clause, which keeps the top 10 rows for each day:
SELECT
PID,
EventDate,
count() AS Count
FROM view_product
WHERE EventDate >= '2020-05-01' AND EventDate < '2020-06-01'
GROUP BY EventDate, PID
ORDER BY EventDate, Count DESC
LIMIT 10 BY EventDate;
The test example:
SELECT
PID,
EventDate,
count() AS Count
FROM (
/* emulate test set */
SELECT test_data.1 AS PID, toDate(test_data.2) AS EventDate
FROM (
SELECT arrayJoin([
(1, '2020-02-04'),
(21, '2020-02-04'),
(1332, '2020-02-04'),
(11, '2020-02-04'),
(634, '2020-02-04'),
(1, '2020-02-04'),
(1, '2020-02-04'),
(21, '2020-02-04'),
(1, '2020-02-04'),
(1, '2020-02-02'),
(21, '2020-02-02'),
(11, '2020-02-02'),
(1332, '2020-02-02'),
(1332, '2020-02-02'),
(1332, '2020-02-02'),
(11, '2020-02-02')]) test_data))
GROUP BY EventDate, PID
ORDER BY EventDate, Count DESC
LIMIT 2 BY EventDate;
/* result
┌──PID─┬──EventDate─┬─Count─┐
│ 1332 │ 2020-02-02 │     3 │
│   11 │ 2020-02-02 │     2 │
│    1 │ 2020-02-04 │     4 │
│   21 │ 2020-02-04 │     2 │
└──────┴────────────┴───────┘
*/
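If it helps to see what GROUP BY plus ORDER BY plus LIMIT n BY does step by step, here is a rough Python sketch of the same logic on the same emulated clicks. The helper name top_n_per_date is made up for illustration; it is not part of ClickHouse.

```python
from collections import Counter
from itertools import groupby, islice

# Same emulated clicks as the test query: (PID, EventDate).
clicks = [
    (1, "2020-02-04"), (21, "2020-02-04"), (1332, "2020-02-04"),
    (11, "2020-02-04"), (634, "2020-02-04"), (1, "2020-02-04"),
    (1, "2020-02-04"), (21, "2020-02-04"), (1, "2020-02-04"),
    (1, "2020-02-02"), (21, "2020-02-02"), (11, "2020-02-02"),
    (1332, "2020-02-02"), (1332, "2020-02-02"), (1332, "2020-02-02"),
    (11, "2020-02-02"),
]


def top_n_per_date(rows, n):
    """Hypothetical helper: return (PID, EventDate, Count), top n per date."""
    # GROUP BY EventDate, PID with count()
    counts = Counter((event_date, pid) for pid, event_date in rows)
    # ORDER BY EventDate, Count DESC (ISO date strings sort chronologically)
    ordered = sorted(counts.items(), key=lambda item: (item[0][0], -item[1]))
    result = []
    # LIMIT n BY EventDate: keep only the first n rows of each date group
    for _, group in groupby(ordered, key=lambda item: item[0][0]):
        for (event_date, pid), count in islice(group, n):
            result.append((pid, event_date, count))
    return result


for row in top_n_per_date(clicks, 2):
    print(row)
```

The point is that the limit is applied per date group after sorting, not to the result as a whole.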
To get just the top n items without their counts, use the topK aggregate function (note that topK returns approximate results):
SELECT
EventDate,
topK(10)(PID)
FROM (
/* emulate test set */
SELECT test_data.1 AS PID, toDate(test_data.2) AS EventDate
FROM (
SELECT arrayJoin([
(1, '2020-02-04'),
(21, '2020-02-04'),
(1332, '2020-02-04'),
(11, '2020-02-04'),
(634, '2020-02-04'),
(1, '2020-02-04'),
(1, '2020-02-04'),
(21, '2020-02-04'),
(1, '2020-02-04'),
(1, '2020-02-02'),
(21, '2020-02-02'),
(11, '2020-02-02'),
(1332, '2020-02-02'),
(1332, '2020-02-02'),
(1332, '2020-02-02'),
(11, '2020-02-02')]) test_data))
GROUP BY EventDate;
/* result
┌──EventDate─┬─topK(10)(PID)──────┐
│ 2020-02-02 │ [1332,11,1,21]     │
│ 2020-02-04 │ [1,21,1332,11,634] │
└────────────┴────────────────────┘
*/
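For comparison, an exact version of "most frequent PIDs per date" can be sketched in Python with collections.Counter. This is only an illustration of the idea: ClickHouse's topK is approximate, while this sketch counts exactly, and the helper name top_k_per_date is hypothetical.

```python
from collections import Counter, defaultdict

# Same emulated clicks as the test query: (PID, EventDate).
clicks = [
    (1, "2020-02-04"), (21, "2020-02-04"), (1332, "2020-02-04"),
    (11, "2020-02-04"), (634, "2020-02-04"), (1, "2020-02-04"),
    (1, "2020-02-04"), (21, "2020-02-04"), (1, "2020-02-04"),
    (1, "2020-02-02"), (21, "2020-02-02"), (11, "2020-02-02"),
    (1332, "2020-02-02"), (1332, "2020-02-02"), (1332, "2020-02-02"),
    (11, "2020-02-02"),
]


def top_k_per_date(rows, k):
    """Hypothetical helper: exact k most frequent PIDs for each date."""
    per_date = defaultdict(Counter)
    for pid, event_date in rows:
        per_date[event_date][pid] += 1  # one counter per EventDate
    # most_common(k) returns the k PIDs with the highest counts
    return {
        event_date: [pid for pid, _ in counter.most_common(k)]
        for event_date, counter in sorted(per_date.items())
    }


print(top_k_per_date(clicks, 10))
```

Like the topK query, this returns only the items and drops the counts.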

Related

Find handicap based on Team Average

Good day. I have a new Apex database for a dart league.
I have a lot of work to do, but I can't combine the averages with code (I get the right numbers on the interactive grid by using the sum function). The following code and results work for each player, which I need, but I cannot get the sum of the averages of the players on each team. This matters when creating a handicap.
select Team, name,
COUNT(WEEK) * 3 GAMES,
sum(game_1) + sum(game_2) + sum(game_3) points,
Round(((sum(game_1) + sum(game_2) + sum(game_3)) / ((COUNT(WEEK) * 3))), 2) Average
from score_tbl
group by Team, name
order by 1
TEAM NAME       GAMES POINTS AVERAGE
1    B Tyler    6     142    23.67
1    Blind      6     108    18
1    Jim V      6     53     8.83
1    KC M       6     82     13.67
2    J Spass    6     102    17
2    Randy B    6     105    17.5
2    Tim Ketz   6     74     12.33
2    Todd Lapan 6     51     8.5
I am trying to figure out the code to sum the Averages for each player by team.
Team Average Handicap
Team 1 64.17
Team 2 55.33
etc..
Then, if possible, compare those averages to find the highest one, and take Highest avg - (each Team) * 90%.
Wrap the expression for generating the average in an analytic SUM for each team partition:
SELECT team,
name,
COUNT(WEEK) * 3 AS games,
SUM(game_1 + game_2 + game_3) AS points,
ROUND(AVG(game_1 + game_2 + game_3) / 3, 2) AS average,
ROUND(SUM(AVG(game_1 + game_2 + game_3) / 3) OVER (PARTITION BY team), 2)
AS total_average
FROM score_tbl
GROUP BY
team, name
ORDER BY
team;
Which, for the sample data:
CREATE TABLE score_tbl(team, name, week, game_1, game_2, game_3) AS
SELECT 1, 'B Tyler', 1, 24, 24, 23 FROM DUAL UNION ALL
SELECT 1, 'Blind', 1, 18, 18, 18 FROM DUAL UNION ALL
SELECT 1, 'Jim V', 1, 9, 8, 9.5 FROM DUAL UNION ALL
SELECT 1, 'KC M', 1, 14, 14, 13 FROM DUAL UNION ALL
SELECT 2, 'J Spass', 1, 17, 17, 17 FROM DUAL UNION ALL
SELECT 2, 'Randy B', 1, 18, 17, 17.5 FROM DUAL UNION ALL
SELECT 2, 'Tim Ketz', 1, 13, 12, 12 FROM DUAL UNION ALL
SELECT 2, 'Todd Lapan', 1, 9, 8, 8.5 FROM DUAL UNION ALL
SELECT 1, 'B Tyler', 1, 24, 24, 23 FROM DUAL UNION ALL
SELECT 1, 'Blind', 1, 18, 18, 18 FROM DUAL UNION ALL
SELECT 1, 'Jim V', 1, 9, 8, 9.5 FROM DUAL UNION ALL
SELECT 1, 'KC M', 1, 14, 14, 13 FROM DUAL UNION ALL
SELECT 2, 'J Spass', 1, 17, 17, 17 FROM DUAL UNION ALL
SELECT 2, 'Randy B', 1, 18, 17, 17.5 FROM DUAL UNION ALL
SELECT 2, 'Tim Ketz', 1, 13, 12, 12 FROM DUAL UNION ALL
SELECT 2, 'Todd Lapan', 1, 9, 8, 8.5 FROM DUAL;
Outputs:
TEAM NAME       GAMES POINTS AVERAGE TOTAL_AVERAGE
1    B Tyler    6     142    23.67   64.17
1    Blind      6     108    18      64.17
1    Jim V      6     53     8.83    64.17
1    KC M       6     82     13.67   64.17
2    J Spass    6     102    17      55.33
2    Randy B    6     105    17.5    55.33
2    Tim Ketz   6     74     12.33   55.33
2    Todd Lapan 6     51     8.5     55.33
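The handicap step from the question is not covered by the query, so here is a rough Python sketch of the whole calculation on the same sample data. It assumes the handicap means (highest team total - this team's total) * 90%; the question's wording is ambiguous, so treat that formula, and the helper names team_totals and handicaps, as assumptions.

```python
from collections import defaultdict

# (team, name, game_1, game_2, game_3), one row per player per week,
# matching the sample data (two identical weeks per player).
week = [
    (1, "B Tyler", 24, 24, 23), (1, "Blind", 18, 18, 18),
    (1, "Jim V", 9, 8, 9.5), (1, "KC M", 14, 14, 13),
    (2, "J Spass", 17, 17, 17), (2, "Randy B", 18, 17, 17.5),
    (2, "Tim Ketz", 13, 12, 12), (2, "Todd Lapan", 9, 8, 8.5),
]
scores = week + week


def team_totals(rows):
    """Sum of the per-player game averages for each team."""
    points = defaultdict(float)
    games = defaultdict(int)
    for team, name, g1, g2, g3 in rows:
        points[(team, name)] += g1 + g2 + g3
        games[(team, name)] += 3  # three games per week
    totals = defaultdict(float)
    for (team, name), total in points.items():
        totals[team] += total / games[(team, name)]  # player average
    return dict(totals)


def handicaps(totals, factor=0.9):
    """ASSUMPTION: handicap = (highest team total - team total) * factor."""
    best = max(totals.values())
    return {team: (best - total) * factor for team, total in totals.items()}


totals = team_totals(scores)
for team, handicap in handicaps(totals).items():
    print(team, round(totals[team], 2), round(handicap, 2))
```

With this interpretation the strongest team gets a handicap of 0 and every other team gets 90% of its gap to the strongest team.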

Multiple joins display student course name

I have the following setup, which is working perfectly. I am having difficulty figuring out the syntax to display the course name in the output. In my test case all the rows should have the value Geometry.
In addition, how could I use RANK or DENSE_RANK to limit the output to display only 1 row with the highest average?
CREATE TABLE students(student_id, first_name, last_name) AS
SELECT 1, 'Faith', 'Aaron' FROM dual UNION ALL
SELECT 2, 'Lisa', 'Saladino' FROM dual UNION ALL
SELECT 3, 'Leslee', 'Altman' FROM dual UNION ALL
SELECT 4, 'Patty', 'Kern' FROM dual UNION ALL
SELECT 5, 'Betty', 'Bowers' FROM dual;
CREATE TABLE courses(course_id, course_name) AS
SELECT 1, 'Geometry' FROM dual UNION ALL
SELECT 2, 'Trigonometry' FROM dual UNION ALL
SELECT 3, 'Calculus' FROM DUAL;
CREATE TABLE grades(student_id,
course_id, grade) AS
SELECT 1, 1, 75 FROM dual UNION ALL
SELECT 1, 1, 81 FROM dual UNION ALL
SELECT 1, 1, 76 FROM dual UNION ALL
SELECT 2, 1, 100 FROM dual UNION ALL
SELECT 2, 1, 95 FROM dual UNION ALL
SELECT 2, 1, 96 FROM dual UNION ALL
SELECT 3, 1, 80 FROM dual UNION ALL
SELECT 3, 1, 85 FROM dual UNION ALL
SELECT 3, 1, 86 FROM dual UNION ALL
SELECT 4, 1, 88 FROM dual UNION ALL
SELECT 4, 1, 85 FROM dual UNION ALL
SELECT 4, 1, 91 FROM dual UNION ALL
SELECT 5, 1, 98 FROM dual UNION ALL
SELECT 5, 1, 74 FROM dual UNION ALL
SELECT 5, 1, 81 FROM dual;
/* average grade of each student */
select s.student_id
, s.first_name
, s.last_name
, round(avg(g.grade), 1) as student_avg
from students s
join grades g
on s.student_id = g.student_id
group by s.student_id, s.first_name, s.last_name
ORDER BY avg(g.grade) DESC;
Something like this?
with temp as
  (select s.student_id
        , s.first_name
        , s.last_name
        , c.course_name
        , round(avg(g.grade), 1) as student_avg
        , rank() over (order by avg(g.grade) desc) rnk
   from students s join grades g on s.student_id = g.student_id
                   join courses c on c.course_id = g.course_id
   group by s.student_id, s.first_name, s.last_name, c.course_name
  )
select student_id, first_name, last_name, course_name, student_avg
from temp
where rnk <= 3
order by rnk;

STUDENT_ID FIRST_ LAST_NAM COURSE_NAME  STUDENT_AVG
---------- ------ -------- ------------ -----------
         2 Lisa   Saladino Geometry              97
         4 Patty  Kern     Geometry              88
         5 Betty  Bowers   Geometry            84.3
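To make the ranking step concrete, here is a small Python sketch of the same join, average and rank on the question's data. The function name ranked_averages is hypothetical, and the rank is computed the way RANK() behaves: 1 plus the number of rows with a strictly higher average, so ties share a rank.

```python
from collections import defaultdict

students = {
    1: ("Faith", "Aaron"), 2: ("Lisa", "Saladino"), 3: ("Leslee", "Altman"),
    4: ("Patty", "Kern"), 5: ("Betty", "Bowers"),
}
courses = {1: "Geometry", 2: "Trigonometry", 3: "Calculus"}
# (student_id, course_id, grade)
grades = [
    (1, 1, 75), (1, 1, 81), (1, 1, 76), (2, 1, 100), (2, 1, 95),
    (2, 1, 96), (3, 1, 80), (3, 1, 85), (3, 1, 86), (4, 1, 88),
    (4, 1, 85), (4, 1, 91), (5, 1, 98), (5, 1, 74), (5, 1, 81),
]


def ranked_averages(students, courses, grades):
    """Return (rank, student_id, first, last, course, avg), best first."""
    buckets = defaultdict(list)
    for student_id, course_id, grade in grades:
        buckets[(student_id, course_id)].append(grade)  # group by
    rows = []
    for (student_id, course_id), values in buckets.items():
        first, last = students[student_id]               # join students
        course = courses[course_id]                      # join courses
        rows.append((student_id, first, last, course, sum(values) / len(values)))
    rows.sort(key=lambda row: -row[4])                   # order by avg desc
    # rank = 1 + number of rows with a strictly higher average
    return [
        (1 + sum(1 for other in rows if other[4] > row[4]),) + row
        for row in rows
    ]


ranked = ranked_averages(students, courses, grades)
best_only = [row for row in ranked if row[0] == 1]
print(best_only)
```

Filtering on rank 1 instead of rank <= 3 gives only the row (or tied rows) with the highest average, which is what the question asked for.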

LISTAGG function in Oracle giving duplicate values

I have two tables User_details and Level_details.
User_details table:
ID Name
1 A
2 B
3 C
4 D
5 E
Level_details table:
trns_id Lvl usr_id
66 1 1
66 1 5
77 1 2
77 2 3
66 2 4
66 2 3
77 2 3
66 2 4
I am getting the result like:
trns_id Lvl name
66 1 A, E
66 2 D, C, D
77 1 B
77 2 C, C
I am using the LISTAGG function to get the names:
LISTAGG(( SELECT name FROM User_details l WHERE l.usr_id = id and trns_id=t1.trns_id and lvl=t1.lvl ), ',') WITHIN GROUP( ORDER BY lvl ) AS Name
You can use the distinct modifier in a listagg function call:
SELECT trns_id, lvl, LISTAGG(DISTINCT name, ', ') WITHIN GROUP (ORDER BY name)
FROM level_details l
JOIN user_details u ON l.usr_id = u.id
GROUP BY trns_id, lvl
If your database version doesn't support DISTINCT within LISTAGG (it was added in Oracle 19c), you'll have to select the distinct values first (lines #21 - 23) and then aggregate them (line #20). Lines #1 - 17 are sample data; you already have those tables, so you don't need to type them. The query you need begins at line #18.
SQL> with user_details (usr_id, name) as
2 (select 1, 'A' from dual union all
3 select 2, 'B' from dual union all
4 select 3, 'C' from dual union all
5 select 4, 'D' from dual union all
6 select 5, 'E' from dual
7 ),
8 level_details (trns_id, lvl, usr_id) as
9 (select 66, 1, 1 from dual union all
10 select 66, 1, 5 from dual union all
11 select 77, 1, 2 from dual union all
12 select 77, 2, 3 from dual union all
13 select 66, 2, 4 from dual union all
14 select 66, 2, 3 from dual union all
15 select 77, 2, 3 from dual union all
16 select 66, 2, 4 from dual
17 )
18 select x.trns_id,
19 x.lvl,
20 listagg(x.name, ', ') within group (order by x.lvl) name
21 from (select distinct u.usr_id, u.name, d.trns_id, d.lvl
22 from user_details u join level_details d on d.usr_id = u.usr_id
23 ) x
24 group by x.trns_id,
25 x.lvl;
   TRNS_ID        LVL NAME
---------- ---------- ---------------
        66          1 A, E
        66          2 C, D
        77          1 B
        77          2 C
SQL>
LISTAGG returns duplicate values when the rows it aggregates contain duplicates, for example:
trns_id Lvl usr_id
77 2 3
77 2 3
You can remove duplicates first:
select trns_id, Lvl, LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name) as name
from (
select distinct l.trns_id, l.Lvl, u.name
from User_details u
join Level_details l on l.usr_id = u.ID
)
group by trns_id, Lvl
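The "remove duplicates first, then aggregate" idea can be sketched in a few lines of Python on the question's data. The helper name names_per_level is made up for illustration; a set per (trns_id, lvl) plays the role of SELECT DISTINCT.

```python
from collections import defaultdict

user_details = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}
# (trns_id, lvl, usr_id), including the duplicate rows from the question
level_details = [
    (66, 1, 1), (66, 1, 5), (77, 1, 2), (77, 2, 3),
    (66, 2, 4), (66, 2, 3), (77, 2, 3), (66, 2, 4),
]


def names_per_level(users, levels):
    """Hypothetical helper: distinct names per (trns_id, lvl), comma-joined."""
    distinct_names = defaultdict(set)
    for trns_id, lvl, usr_id in levels:
        # a set drops repeated names, like SELECT DISTINCT in the subquery
        distinct_names[(trns_id, lvl)].add(users[usr_id])
    # joining the sorted names mirrors LISTAGG ... WITHIN GROUP (ORDER BY name)
    return {
        key: ", ".join(sorted(names))
        for key, names in sorted(distinct_names.items())
    }


for key, names in names_per_level(user_details, level_details).items():
    print(key, names)
```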

How to aggregate array type in clickhouse

This is the example table
exampleTable:
id | weeklyNumber |
---- -------------
1 | [2,5,9] |
------------------
2 | [1,10,4] |
The expected results should be the aggregation result of weeklyNumber array which is
[3,15,13] (2+1, 5+10, 9+4)
I have no idea how to do this.
----- update ----
In addition, we have many rows in the table below:
exampleTable:
id | weeklyNumber | monthlyNumber
---- ------------- -------------
1 | [2,5,9] | [20,50,90]
--------------------------------
2 | [1,10,4] | [10,100,40]
the result should be [2/20 + 1/10, 5/50 + 10/100, 9/90 + 4/40]. How to do that?
Use the -ForEach aggregate function combinator:
SELECT sumForEach(weeklyNumber)
FROM
(
SELECT
1 AS id,
[2, 5, 9] AS weeklyNumber
UNION ALL
SELECT
2 AS id,
[1, 10, 4] AS weeklyNumber
)
/*
┌─sumForEach(weeklyNumber)─┐
│ [3,15,13]                │
└──────────────────────────┘
*/
In some cases this query can be used instead:
SELECT arrayReduce('sumForEach', groupArray(weeklyNumber))
FROM
(
SELECT
1 AS id,
[2, 5, 9] AS weeklyNumber
UNION ALL
SELECT
2 AS id,
[1, 10, 4] AS weeklyNumber
)
/*
┌─arrayReduce('sumForEach', groupArray(weeklyNumber))─┐
│ [3,15,13]                                           │
└─────────────────────────────────────────────────────┘
*/
UPDATE: for the updated question, divide the arrays element by element with arrayMap before summing:
SELECT sumForEach(arrayMap((x, y) -> (x / y), weeklyNumber, monthlyNumber)) AS result
FROM
(
SELECT
1 AS id,
[2, 5, 9] AS weeklyNumber,
[20, 50, 90] AS monthlyNumber
UNION ALL
SELECT
2 AS id,
[1, 10, 4] AS weeklyNumber,
[10, 100, 40] AS monthlyNumber
)
/*
┌─result────────┐
│ [0.2,0.2,0.2] │
└───────────────┘
*/
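As a plain illustration of what the position-by-position aggregation does, here is a short Python sketch covering both the simple sum and the ratio case. The function name sum_for_each is only a stand-in for the ClickHouse aggregate, and padding shorter arrays with 0 is my assumption about how unequal lengths should be handled.

```python
from itertools import zip_longest


def sum_for_each(arrays):
    """Sum arrays position by position (stand-in for sumForEach)."""
    # ASSUMPTION: shorter arrays contribute 0 at the missing positions
    return [sum(column) for column in zip_longest(*arrays, fillvalue=0)]


weekly = [[2, 5, 9], [1, 10, 4]]
monthly = [[20, 50, 90], [10, 100, 40]]

# Element-wise division per row, like arrayMap((x, y) -> x / y, ...)
ratios = [
    [w / m for w, m in zip(week_row, month_row)]
    for week_row, month_row in zip(weekly, monthly)
]

print(sum_for_each(weekly))
print(sum_for_each(ratios))
```

The division has to happen per row before the aggregation; summing the arrays first and dividing afterwards would give a different result.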

select min values of time for each date in a range and getting the average in oracle

I have a table in oracle user_transn(userid, resourceid, transid, act_timestamp) with values like
(21, 14, 123321, 28-NOV-11 13:30:21)
(21, 14, 123321, 28-NOV-11 14:29:28)
(21, 14, 123321, 29-NOV-11 18:44:22)
(21, 14, 123321, 30-NOV-11 11:30:55)
(21, 14, 123321, 30-NOV-11 16:56:11)
(21, 14, 123321, 30-NOV-11 19:32:31)
(21, 14, 123321, 31-NOV-11 09:22:51)
(21, 14, 123321, 31-NOV-11 12:22:49)
(21, 14, 123321, 31-NOV-11 13:11:17)
(21, 14, 123321, 31-NOV-11 16:41:21)
The query should take the minimum time of the act_timestamp field of each distinct date and calculate the average minimum time over the given date range (which in this case is 28-31 nov)
So for above the result should be: 13:30:21 + 11:30:55 + 9:22:51 /3 = 11:27:42
as the average min time
and similarly for max time.
Thanks in advance
Select the minimum timestamp for each day, average the time-of-day parts, and add the result to the current day to convert it back to a date:
SELECT
TO_CHAR(TRUNC(SYSDATE) + AVG(min_timestamp - TRUNC(min_timestamp)), 'HH24:MI:SS')
FROM
(
SELECT MIN(act_timestamp) AS min_timestamp
FROM user_transn
GROUP BY TRUNC(act_timestamp)
)
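The same three steps (minimum per day, time-of-day part, average) can be sketched in Python. This is only an illustration: it uses the rows for 28 to 30 November from the question and leaves out the 31-NOV rows, since that is not a valid calendar date, and the helper name average_min_time is hypothetical.

```python
from datetime import datetime

# act_timestamp values for 28-30 Nov 2011 (31-NOV rows omitted: invalid date)
stamps = [
    datetime(2011, 11, 28, 13, 30, 21), datetime(2011, 11, 28, 14, 29, 28),
    datetime(2011, 11, 29, 18, 44, 22),
    datetime(2011, 11, 30, 11, 30, 55), datetime(2011, 11, 30, 16, 56, 11),
    datetime(2011, 11, 30, 19, 32, 31),
]


def average_min_time(timestamps):
    """Average of each day's earliest time of day, as HH:MM:SS."""
    earliest = {}
    for ts in timestamps:
        day = ts.date()                       # like TRUNC(act_timestamp)
        if day not in earliest or ts < earliest[day]:
            earliest[day] = ts                # like MIN(act_timestamp)
    # seconds since midnight, like min_timestamp - TRUNC(min_timestamp)
    seconds = [
        ts.hour * 3600 + ts.minute * 60 + ts.second
        for ts in earliest.values()
    ]
    average = int(sum(seconds) / len(seconds))  # fractional seconds dropped
    hours, rest = divmod(average, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(average_min_time(stamps))
```

For the maximum time, keep the latest timestamp per day instead of the earliest (MAX instead of MIN in the query).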
