Find most occurring value in a group - clickhouse

I want to find the most occurring value per group.
I tried using top(k)(column), but I get the error below:
Column class is not under aggregate function and not in GROUP BY.
For example:
If I have table test_date with columns(pid, value)
pid, value
----------
1,a
1,b
1,a
1,c
I want result :
pid, value
----------
1,a
I tried SELECT pid,top(1)(value) top_value FROM test_data group by pid
I get the error:
Column value is not under aggregate function and not in GROUP BY
I also tried anyHeavy(), but it only works for values that occur in more than half the cases.

This query should help you:
SELECT
pid,
/*
Decompose the query in parts:
1. groupArray((value, count)): convert the group of rows with the same 'pid' to the array of tuples (value, count)
2. arrayReverseSort: make reverse sorting by 'count' ('x.2' is 'count')
3. [1].1: take the 'value' from the first item of the sorted array
*/
arrayReverseSort(x -> x.2, groupArray((value, count)))[1].1 AS value
FROM
(
SELECT
pid,
value,
count() AS count
FROM test_date
GROUP BY
pid,
value
)
GROUP BY pid
ORDER BY pid ASC
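The decomposition in the query above (count each (pid, value) pair, then keep the pair with the highest count per pid) can be tried outside ClickHouse. The snippet below is only a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the table name test_data, the extra pid = 2 rows, and the tie-breaking rule (first value seen wins) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (pid INTEGER, value TEXT)")
# Rows for pid = 1 come from the question; pid = 2 is made-up sample data.
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (1, "c"), (2, "x"), (2, "y"), (2, "y")],
)

# Step 1: count occurrences of every (pid, value) pair (the inner query).
counts = conn.execute(
    "SELECT pid, value, count(*) AS cnt FROM test_data GROUP BY pid, value"
).fetchall()

# Step 2: per pid, keep the value with the highest count
# (the role played by arrayReverseSort(...)[1].1 in the ClickHouse query).
best = {}
for pid, value, cnt in counts:
    if pid not in best or cnt > best[pid][1]:
        best[pid] = (value, cnt)

top_values = {pid: value for pid, (value, _) in best.items()}
print(top_values)  # {1: 'a', 2: 'y'}
```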

SELECT pid,topK(1)(value) top_value FROM test_data group by pid

Related

Trying to display top 3 amount from a table using sql query in oracle 11g..column is of varchar type

I am trying to list the top 3 records from a table based on an amount stored in a column FTE_TMUSD, which is of varchar datatype.
Below is the query I tried:
SELECT * FROM
(
SELECT * FROM FSE_TM_ENTRY
ORDER BY FTE_TMUSD desc
)
WHERE rownum <= 3
ORDER BY FTE_TMUSD DESC ;
The output I got:
972,9680,963 --> FTE_TMUSD values, which are not displayed in descending order
I am expecting output that displays the top 3 records by value.
That should work; inline view is ordered by FTE_TMUSD in descending order, and you're selecting values from it.
What looks suspicious are values you specified as the result. It appears that FTE_TMUSD's datatype is VARCHAR2 (ah, yes - it is, you said so). It means that values are sorted as strings, not numbers - and it seems that you expect numbers. So, apply TO_NUMBER to that column. Note that it'll fail if column contains anything but numbers (for example, if there's a value 972C).
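The difference between string and numeric ordering can be reproduced with the three values from the question. This is just an illustrative sketch in Python, not Oracle code; it only shows why 972 sorts ahead of 9680 when the values are compared as text.

```python
# Values as they would be stored in a VARCHAR2 column (taken from the question).
raw = ["963", "9680", "972"]

# Strings are compared character by character, so "972" > "9680" because '7' > '6'.
as_strings = sorted(raw, reverse=True)

# Converting to numbers first gives the ordering the asker expected.
as_numbers = sorted(raw, key=int, reverse=True)

print(as_strings)  # ['972', '9680', '963']
print(as_numbers)  # ['9680', '972', '963']
```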
Also, an alternative to your query might be use of analytic functions, such as row_number:
with temp as
(select f.*,
row_number() over (order by to_number(f.fte_tmusd) desc) rn
from fse_tm_entry f
)
select *
from temp
where rn <= 3;

Hadoop Hive MAX gives multiple results

I am trying to get the maximum value of a count while selecting two columns, srcip and max, but every time I include srcip I have to add group by srcip at the end, and the result looks as if the max were not there at all.
When I write the query like this it gives me the correct max value but I want to select srcip as well.
Select max(count1) as maximum
from (SELECT srcip,count(srcip) as count1 from data group by srcip)t;
But when I include srcip in the select, I get a result as if there were no max function:
Select srcip,max(count1) as maximum
from (SELECT srcip,count(srcip) as count1 from data group by srcip)t
group by srcip;
I would expect a single result from this, but I get multiple.
Does anyone have any ideas?
You can use ORDER BY count1 DESC with LIMIT 1 to get the srcip with the maximum count.
SELECT srcip, count(srcip) as count1
from data group by srcip
ORDER BY count1 DESC LIMIT 1
Let's say you have data like this.
Table
Let's see what happens to the data when you run the following query.
Query
SELECT srcip,count(srcip) as count1 from data group by srcip
Output: table1
Now let's see what happens when you run your outer query on the table above.
Select srcip,max(count1) as maximum from table1 group by srcip
Same output
The reason is that your query asks for srcip and the maximum count within each srcip group. We have 3 groups, so we get 3 rows.
The query below returns exactly one row: the srcip with the highest count. It is written for the expected result stated in the question; it is worth reading up on SQL basics and the earlier comments first, then moving on to Hive analytic queries.
There may be more efficient ways to get this result, but this should give you some motivation to look further into Hive analytic queries.
select srcip, count1 as maximum from (select srcip, count1, row_number() over (order by count1 desc) as row_num from (select srcip, count(srcip) as count1 from data group by srcip) t) q1 where row_num = 1;
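The behaviour discussed in this thread can be checked on a small data set. The following is a rough sketch using Python's sqlite3 module in place of Hive; the IP addresses and row counts are invented sample data. It shows that grouping the outer query by srcip returns one row per address, while ordering by the count and limiting to one row returns only the address with the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (srcip TEXT)")
# Hypothetical sample: three distinct addresses with 3, 1 and 2 rows.
conn.executemany(
    "INSERT INTO data VALUES (?)",
    [("10.0.0.1",)] * 3 + [("10.0.0.2",)] + [("10.0.0.3",)] * 2,
)

# The asker's second query: max() is computed inside each srcip group,
# so every group simply reports its own count.
per_group = conn.execute(
    """
    SELECT srcip, max(count1) AS maximum
    FROM (SELECT srcip, count(srcip) AS count1 FROM data GROUP BY srcip) t
    GROUP BY srcip
    ORDER BY srcip
    """
).fetchall()

# Ordering by the count and keeping one row yields the single top address.
top_row = conn.execute(
    """
    SELECT srcip, count(srcip) AS count1
    FROM data
    GROUP BY srcip
    ORDER BY count1 DESC
    LIMIT 1
    """
).fetchone()

print(per_group)  # [('10.0.0.1', 3), ('10.0.0.2', 1), ('10.0.0.3', 2)]
print(top_row)    # ('10.0.0.1', 3)
```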

Oracle Select if value is smaller than before

I have a table:
How do I select the IDs from this table where a value is less than the previous value (expected result is ID = B2 and C1)? Thank you.
You can use the window function lag to get the previous row's value and check whether the current row's value is less than that. You can use lead in a similar way to get the value from the next row.
select distinct ID
from (
select t.*,
lag(value) over (
partition by ID order by time
) as last_value
from your_table t
) t
where value < last_value;
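Since the original table is not shown, here is a small sketch of the same lag-based check on invented data, run through Python's sqlite3 module rather than Oracle (window functions need SQLite 3.25 or later). The rows, and the alias prev_value, are assumptions chosen so that B2 and C1 each contain a drop in value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, time INTEGER, value INTEGER)")
# Hypothetical rows: A1 only increases, B2 and C1 each drop once.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("A1", 1, 10), ("A1", 2, 20),
        ("B2", 1, 30), ("B2", 2, 25),
        ("C1", 1, 5), ("C1", 2, 7), ("C1", 3, 6),
    ],
)

# lag(value) fetches the value of the previous row within the same id,
# ordered by time; the first row of each id gets NULL and is filtered out.
rows = conn.execute(
    """
    SELECT DISTINCT id
    FROM (
        SELECT t.*,
               lag(value) OVER (PARTITION BY id ORDER BY time) AS prev_value
        FROM your_table t
    )
    WHERE value < prev_value
    ORDER BY id
    """
).fetchall()

ids = [r[0] for r in rows]
print(ids)  # ['B2', 'C1']
```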

How to check that group has a value in Oracle?

For example, I have a table tbl like
values
10
20
30
40
on this table by the condition I have GROUP BY like this:
SELECT ???
FROM tbl
GROUP BY values
I need to check that group has some value, for example 30
UPD:
In the real task I have a table with many columns and other operations on them, and for one column I need to check whether a given value occurs in each group of that column.
UPD2:
I need something like this:
select
min(created_timestamp),
max(resource_id),
max(price),
CASE WHEN event_type has (1704 or 1701 or 1703) THEN return found value END
CASE WHEN event_type has (1707) THEN return 1707 END
from subscriptions
group by guid
SELECT
MIN(created_timestamp),
MAX(resource_id),
MAX(price),
MIN(CASE WHEN event_type IN (1704, 1701, 1703)
THEN event_type
WHEN event_type = 1707
THEN 1707
ELSE NULL
END)
FROM subscriptions
GROUP BY guid ;
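The idea behind this kind of query is conditional aggregation: a CASE expression yields the value for matching rows and NULL otherwise, and the aggregate ignores the NULLs. Below is a minimal sketch with one aggregate per condition, using Python's sqlite3 module instead of Oracle. The sample rows, the event type 1800, and the column aliases are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscriptions (guid TEXT, event_type INTEGER)")
# Hypothetical rows: g1 has 1701 and 1707, g2 has neither, g3 has 1703 and 1704.
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?)",
    [("g1", 1701), ("g1", 1707), ("g1", 1800),
     ("g2", 1800),
     ("g3", 1704), ("g3", 1703)],
)

# Each CASE returns NULL for non-matching rows; MAX skips NULLs, so the
# column is NULL only when no row in the group matched.
rows = conn.execute(
    """
    SELECT guid,
           MAX(CASE WHEN event_type IN (1701, 1703, 1704) THEN event_type END) AS found_type,
           MAX(CASE WHEN event_type = 1707 THEN 1707 END) AS has_1707
    FROM subscriptions
    GROUP BY guid
    ORDER BY guid
    """
).fetchall()

print(rows)  # [('g1', 1701, 1707), ('g2', None, None), ('g3', 1704, None)]
```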
I did not understand what you have in the select clause, but if you also want to see the values in the output when you run the group by query, try this:
select function(), values from tbl group by values
function() can be any aggregate function, such as count or sum.
If you only want the rows for the value 30, add a where clause: values = 30.
You don't need a group by clause if your aim is only to find out whether some value exists.
Use this method if performance also matters.
SELECT DECODE (COUNT(1),0,'Not Exist','Yes has some values')
FROM dual
WHERE EXISTS ( SELECT 1
FROM tbl
WHERE VALUES='&Your_Value_To_Check'
)

Oracle aggregate function to return a random value for a group?

The standard SQL aggregate function max() will return the highest value in a group; min() will return the lowest.
Is there an aggregate function in Oracle to return a random value from a group? Or some technique to achieve this?
E.g., given the table foo:
group_id value
1 1
1 5
1 9
2 2
2 4
2 8
The SQL query
select group_id, max(value), min(value), some_aggregate_random_func(value)
from foo
group by group_id;
might produce:
group_id max(value), min(value), some_aggregate_random_func(value)
1 9 1 1
2 8 2 4
with, obviously, the last column being any random value in that group.
You can try something like the following
select deptno,max(sal),min(sal),max(rand_sal)
from(
select deptno,sal,first_value(sal)
over(partition by deptno order by dbms_random.value) rand_sal
from emp)
group by deptno
/
The idea is to sort the values within each group in random order and pick the first. I can think of other ways, but none as efficient.
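The same sort-randomly-and-take-the-first idea can be tried on the foo table from the question. This is a sketch using Python's sqlite3 module, with SQLite's random() standing in for dbms_random.value (window functions need SQLite 3.25 or later); the alias rand_value is an assumption. The last column changes from run to run, so no fixed output is shown for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (group_id INTEGER, value INTEGER)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [(1, 1), (1, 5), (1, 9), (2, 2), (2, 4), (2, 8)],
)

# Within each group, order the rows randomly and take the first value;
# the outer max() just collapses that per-row value to one per group.
rows = conn.execute(
    """
    SELECT group_id, max(value), min(value), max(rand_value)
    FROM (
        SELECT group_id, value,
               first_value(value) OVER (
                   PARTITION BY group_id ORDER BY random()
               ) AS rand_value
        FROM foo
    )
    GROUP BY group_id
    ORDER BY group_id
    """
).fetchall()

for group_id, max_value, min_value, random_value in rows:
    print(group_id, max_value, min_value, random_value)
```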
You might prepend a random string to the column you want to extract the random element from, and then select the min() element of the column and take out the prepended string.
select group_id, max(value), min(value), substr(min(random_value),11)
from (select dbms_random.string('A', 10)||value random_value,foo.* from foo)
group by group_id
In this way you can avoid using the analytic function and specifying the grouping columns twice, which might be useful when your query is very complicated, or when you are just exploring the data and manually entering queries with a lengthy and changing list of group by columns.
