Using CASE statements in ORDER BY - sql-order-by

I have a query that needs to be ordered differently based on other fields. It is possible to use a CASE statement in an ORDER BY, and so far that is working fine. The only thing that doesn't seem possible is to change the ascending/descending part.
My query:
SELECT articleid, MIN(createtime) min, MAX(createtime) max
FROM items
GROUP BY articleid
ORDER BY
CASE WHEN logic='foo' THEN min ELSE '0001-01-01 00:00:00' END ASC,
CASE WHEN logic='bar' THEN max ELSE '0001-01-01 00:00:00' END DESC;
works, but I was wondering if a form like this could be made (I know this doesn't work):
SELECT articleid, MIN(createtime) min, MAX(createtime) max
FROM items
GROUP BY articleid
ORDER BY
CASE WHEN logic='foo' THEN min ASC ELSE max DESC END;

ASC and DESC are only allowed at the end of each sort expression. However, you could put ASC at the end and negate max in the ELSE clause (assuming there are no NULLs complicating the matter).
ORDER BY
CASE WHEN logic='foo' THEN min ELSE -max END ASC;
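
To try the negation trick out, here is a minimal sketch using Python's built-in sqlite3 module. It rests on assumptions not in the original: createtime is stored as an integer (so it can be negated), logic is passed in as a query parameter, and the table contents and the helper name sorted_article_ids are made up for illustration.

```python
import sqlite3

def sorted_article_ids(logic):
    """Order by min ascending when logic == 'foo', otherwise by max descending."""
    conn = sqlite3.connect(":memory:")
    # Assumption: createtime is an integer (e.g. epoch seconds) so it can be negated.
    conn.execute("CREATE TABLE items (articleid INTEGER, createtime INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(1, 10), (1, 50), (2, 20), (2, 30), (3, 5), (3, 40)],
    )
    rows = conn.execute(
        """
        SELECT articleid FROM (
            SELECT articleid,
                   MIN(createtime) AS min_t,
                   MAX(createtime) AS max_t
            FROM items
            GROUP BY articleid
        )
        -- A single ASC sort key; negating max_t reverses its direction.
        ORDER BY CASE WHEN ? = 'foo' THEN min_t ELSE -max_t END ASC
        """,
        (logic,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(sorted_article_ids("foo"))  # ordered by min ascending
print(sorted_article_ids("bar"))  # ordered by max descending
```

If the column is a real date or timestamp type, the negation has to be applied to a numeric form of the value, and how to get one depends on the database.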

Related

Oracle - Sorting Conditionally on String and Number

I have a query where based on a column value, my sort will be dynamic. So, it is something like this:
ROW_NUMBER() OVER (PARTITION BY last_action
ORDER BY CASE
WHEN last_action = 'Insert' THEN company_name
ELSE percent_change
END DESC
Issue here is that they are different data types, so it throws an error. If I convert the "percent_change" to character, then it does not sort numerically. And, to complicate things, they want the "percent_change" in DESC and the "company_name" in ASC.
So, I was wondering if there is a way to convert the actual "company_name" into some numerical value and subtract it from 0, so that I can do the numerical sort in DESC order.
Any ideas would be helpful.
Sounds like you're after something like this, then:
ROW_NUMBER() OVER (PARTITION BY last_action
ORDER BY CASE
WHEN last_action = 'Insert' THEN company_name
END,
CASE
WHEN last_action = 'Insert' THEN NULL
ELSE percent_change
END DESC NULLS LAST)
This works by splitting the ordering out into two expressions, but because they're conditional on the column in the partition by clause, only one of them is going to impact the ordering at any one time.
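The two-expression ordering can be checked on a small table. The sketch below uses Python's sqlite3 module as a stand-in for Oracle (an assumption; it needs an SQLite build with window functions, 3.25 or later). The table name changes, the sample rows, and the helper row_numbers are hypothetical, and NULLS LAST is left out because the sample data has no NULL percent_change values.

```python
import sqlite3

def row_numbers():
    """Return {company_name: row_number} using two conditional sort keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changes (last_action TEXT, company_name TEXT, percent_change REAL)"
    )
    conn.executemany(
        "INSERT INTO changes VALUES (?, ?, ?)",
        [
            ("Insert", "Zeta", 1.0),
            ("Insert", "Acme", 9.0),
            ("Update", "Beta", 2.5),
            ("Update", "Cora", 10.0),
            ("Update", "Alto", 7.0),
        ],
    )
    rows = conn.execute(
        """
        SELECT company_name,
               ROW_NUMBER() OVER (
                   PARTITION BY last_action
                   -- First key: only non-NULL inside the 'Insert' partition (text, ASC).
                   ORDER BY CASE WHEN last_action = 'Insert' THEN company_name END,
                   -- Second key: only non-NULL in the other partitions (numeric, DESC).
                            CASE WHEN last_action = 'Insert' THEN NULL
                                 ELSE percent_change END DESC
               ) AS rn
        FROM changes
        """
    ).fetchall()
    conn.close()
    return dict(rows)

print(row_numbers())
```

Because each key is NULL for every row of the partition it does not apply to, the two data types never meet in one expression, and 10.0 sorts ahead of 2.5 numerically rather than as text.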
What if you take the CASE outside of the window function:
CASE WHEN last_action = 'Insert' THEN
ROW_NUMBER() OVER(PARTITION BY last_action ORDER BY company_name)
ELSE ROW_NUMBER() OVER(PARTITION BY last_action ORDER BY percent_change DESC)
END
Have you looked at using the DECODE function?
Decode is kind of like a simplified CASE statement.
You can use it in the order by clause (or anywhere you can use an expression)
DECODE(expr, key1, value1, key2, value2, default)
For example:
ORDER BY
DECODE(tbl.companyName, 'name1', 1, 'name2', 2, 'name3', 3, 9999)
The last value is the default if the first expression doesn't match.
In your example, I think you can combine CASE and DECODE to yield the final expression for the order by clause.
Another trick for sorting numbers as characters is to pad the string with leading zeros. You can use LPAD, or concatenation and SUBSTR (taking the rightmost characters):
LPAD(expr, 9, '0')
SUBSTR('000000000' || expr, -9)
I use something like the above to fix SSNs that have had leading 0 dropped (for example when they come through Excel)
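As a rough illustration of the zero-padding idea, the following sketch sorts numeric strings with and without padding. It uses Python's sqlite3 module rather than Oracle (an assumption); SQLite has no LPAD, so only the concatenation plus substr form is shown. The codes table and the sort_codes helper are hypothetical.

```python
import sqlite3

def sort_codes(padded):
    """Sort numeric strings as text, optionally left-padded with zeros to 9 characters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE codes (code TEXT)")
    conn.executemany("INSERT INTO codes VALUES (?)", [("40",), ("5",), ("300",)])
    # substr(x, -9) keeps the last 9 characters, i.e. the zero-padded value.
    key = "substr('000000000' || code, -9)" if padded else "code"
    rows = conn.execute(f"SELECT code FROM codes ORDER BY {key}").fetchall()
    conn.close()
    return [r[0] for r in rows]

print(sort_codes(padded=False))  # plain text order
print(sort_codes(padded=True))   # numeric order via padding
```

The padding width has to be at least as long as the longest value, otherwise the leftmost digits are cut off.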

How to SELECT the MAX Time Difference Between Any 2 Consecutive Rows Per Value?

Just had a user answer this correctly for TSQL, but wondering how best to achieve this now in SQL Developer/PLSQL seeing as there is no DATEDIFF function.
Table I want to query on has some 'CODE' values, which can naturally have multiple primary key records ('OccsID') in a table 'Occs'. There is also a datetime column called 'CreateDT' for each OccsID.
Just want to find the maximum possible time variance between any 2 consecutive rows in 'Occs', per 'CODE'.
If you subtract the "next" date and "this" date (using the LEAD analytic function), you'll get the date difference. Then fetch the maximum difference per code. Something like this:
with diff as
(select occsid,
code,
nvl(lead(createdt) over (partition by code order by createdt), createdt) - createdt date_diff
from test
)
select code,
max(date_diff)
from diff
group by code;
Assuming that this T-SQL version works for you (from the prior question)
SELECT x.code, MAX(x.diff_sec) FROM
(
SELECT
code,
DATEDIFF(
SECOND,
CreateDT,
LEAD(CreateDT) OVER(PARTITION BY CODE ORDER BY CreateDT) --next row's createdt
) as diff_sec
FROM Occs
)x
GROUP BY x.code
The simplest option is just to subtract the two dates to get a difference in days. You can then multiply to get the difference in hours, minutes, or seconds
SELECT x.code, MAX(x.diff_day), MAX(x.diff_sec)
FROM
(
SELECT
code,
-- next row's CreateDT minus this row's, so the gap is positive
LEAD(CreateDT) OVER(PARTITION BY CODE ORDER BY CreateDT) - CreateDT as diff_day,
24*60*60* (LEAD(CreateDT) OVER(PARTITION BY CODE ORDER BY CreateDT) - CreateDT) as diff_sec
FROM Occs
)x
GROUP BY x.code
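
The LEAD approach from the answers above can be tried on a few rows. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), dates are stored as text, and julianday() plays the role that plain date subtraction plays in Oracle. The sample rows and the helper max_gap_seconds are made up.

```python
import sqlite3

def max_gap_seconds():
    """Return {code: largest gap in seconds between consecutive rows}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE occs (occsid INTEGER PRIMARY KEY, code TEXT, createdt TEXT)"
    )
    conn.executemany(
        "INSERT INTO occs (code, createdt) VALUES (?, ?)",
        [
            ("A", "2020-01-01 00:00:00"),
            ("A", "2020-01-01 00:10:00"),
            ("A", "2020-01-01 01:00:00"),
            ("B", "2020-01-01 00:00:00"),
            ("B", "2020-01-02 00:00:00"),
        ],
    )
    rows = conn.execute(
        """
        SELECT code,
               CAST(ROUND(MAX(gap_days) * 86400) AS INTEGER) AS max_gap_sec
        FROM (
            SELECT code,
                   -- Next row's timestamp minus this row's, in days.
                   -- The last row per code has no next row, so its gap is NULL.
                   julianday(LEAD(createdt) OVER (PARTITION BY code ORDER BY createdt))
                     - julianday(createdt) AS gap_days
            FROM occs
        )
        GROUP BY code
        """
    ).fetchall()
    conn.close()
    return dict(rows)

print(max_gap_seconds())
```

MAX ignores the NULL gap of the last row in each group, so no NVL-style default is needed here.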

Hadoop Hive MAX gives multiple results

I am trying to get the maximum value of a count while selecting two columns, srcip and max. But every time I include srcip I have to add GROUP BY srcip at the end, and the result looks as if the max weren't even there.
When I write the query like this it gives me the correct max value but I want to select srcip as well.
Select max(count1) as maximum
from (SELECT srcip,count(srcip) as count1 from data group by srcip)t;
But when I do include srcip in the select, I get a result as if there were no max function:
Select srcip,max(count1) as maximum
from (SELECT srcip,count(srcip) as count1 from data group by srcip)t
group by srcip;
I would expect a single result from this, but I get multiple.
Does anyone have any ideas?
You can use ORDER BY count1 DESC with LIMIT 1 to get the srcip with the maximum count.
SELECT srcip, count(srcip) as count1
from data group by srcip
ORDER BY count1 DESC LIMIT 1
Let's say you have data like this.
Table
Let's see what happens to the data when you run the following query.
Query
SELECT srcip,count(srcip) as count1 from data group by srcip
Output: table1
Now let's see what happens when you run your outer query on the table above.
Select srcip,max(count1) as maximum from table1 group by srcip
Same Output
The reason is that your query says to select srcip and the maximum count from each group of srcip. We have 3 groups, so 3 rows.
The query below returns exactly one row with the max count and the associated srcip. It is written for the expected result; it is worth reviewing the SQL and the earlier comments first, then moving on to Hive analytic queries.
Some people could argue that there is a better way to optimize this query for your expected result, but this should give you a motivation to look more into Hive analytic queries.
select srcip, count1 as maximum from (select srcip, count1, row_number() over (order by count1 desc) as row_num from (select srcip, count(srcip) as count1 from data group by srcip) t) q1 where row_num = 1;
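The difference between grouping again and taking the top row can be seen on a tiny data set. The sketch below runs on SQLite through Python's sqlite3 module instead of Hive (an assumption), with made-up rows and a hypothetical helper top_srcip.

```python
import sqlite3

def top_srcip():
    """Return (rows from the re-grouped query, the single top row)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (srcip TEXT)")
    conn.executemany(
        "INSERT INTO data VALUES (?)",
        [("a",), ("a",), ("a",), ("b",), ("c",), ("c",)],
    )
    # Grouping by srcip again yields one row per srcip, so MAX changes nothing.
    per_group = conn.execute(
        """
        SELECT srcip, MAX(count1) AS maximum
        FROM (SELECT srcip, COUNT(srcip) AS count1 FROM data GROUP BY srcip) t
        GROUP BY srcip
        """
    ).fetchall()
    # Sorting the counts and keeping the first row gives the single maximum.
    top = conn.execute(
        """
        SELECT srcip, COUNT(srcip) AS count1
        FROM data
        GROUP BY srcip
        ORDER BY count1 DESC
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return per_group, top

per_group, top = top_srcip()
print(len(per_group), top)
```

If several srcip values tie for the maximum, LIMIT 1 returns only one of them.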

Return Boolean value when table has data in the specified range

I need a query that returns a boolean indicating whether the table has data in the given range.
Assume table
Customer
[User ID, Name, Date, Products_Purchased]
I'm trying to do:
select case when exists(
select Date, count(*)
from Customer
where date between '2015-08-03' and '2015-08-05'
)
then cast(1 as BIT)
else case(0 as BIT)end;
This throws an error near "select Date".
However, the weird part is that the inner query runs perfectly fine on its own.
I'm wondering if I'm missing something here.
What about something more straightforward e.g.
select case when count(*) >0 then 1 else 0 end as HIT
from ... where ...
That way you don't have to bother about Hive assuming that EXISTS implies a correlated sub-query, automagically translated into a MapJoin, i.e. a Java HashMap shuffled to the 2nd line of Mappers jobs, etc. Not exactly your use case.
Then it's not useful to compute the exact count, so the query could be refined as
select case when count(*) >0 then 1 else 0 end as HIT
from
(select ... from ... where ... limit 1) X
[Edit] There is no "bit" datatype in Hive. But the default "int" should be OK if you just want a return flag (zero / non-zero)
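A minimal sketch of the refined query, filled in for the Customer table from the question. It runs on SQLite via Python's sqlite3 module rather than Hive (an assumption); the column names are lower-cased, and the sample rows and the helper has_rows_between are hypothetical.

```python
import sqlite3

def has_rows_between(start, end):
    """Return 1 if any customer row falls in [start, end], else 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (user_id INTEGER, name TEXT, date TEXT, products_purchased INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?, ?)",
        [(1, "Ann", "2015-08-04", 2), (2, "Bob", "2015-09-01", 1)],
    )
    hit = conn.execute(
        """
        SELECT CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS hit
        -- LIMIT 1 stops after the first match; the exact count is not needed.
        FROM (SELECT 1 FROM customer WHERE date BETWEEN ? AND ? LIMIT 1) x
        """,
        (start, end),
    ).fetchone()[0]
    conn.close()
    return hit

print(has_rows_between("2015-08-03", "2015-08-05"))
print(has_rows_between("2016-01-01", "2016-01-31"))
```

The comparison works on text here only because the dates are stored in ISO yyyy-mm-dd form, which sorts in date order.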

Oracle pagination ROWNUM column>=value challenge

Having some trouble with oracle pagination. Case:
Table with > 1 billion rows:
Measurement(Id Number, Classification VARCHAR, Value NUMBER)
Index:
ON Measurement(Value)
I need a query that gets the first match and the following 2000 matches ordered by Value. I also would like to use the index.
First idea:
SELECT * FROM Measurement WHERE Value >= 1234567890
AND ROWNUM <= 2000 ORDER BY Value ASC
Result:
The query just returns the first 2000 cases it can find in the table, starting from the top, where Value is higher or equal to 1234567890, and then orders that resultset ascending.
Second idea:
SELECT * FROM
(SELECT * FROM Measurement WHERE Value >= 1234567890 ORDER BY Value ASC)
WHERE ROWNUM <= 2000
Result:
Oracle does not understand that ROWNUM should limit the amount from the inner query, so oracle decides to get all rows where Value is greater or equal to 1234567890 first, and then order that giant resultset before returning the first 2000 rows. Because Oracle is guessing that most of the data in the table will be returned, it ignores any use of index as well.
Neither of these approaches is acceptable: the first one gives the wrong results, and the second one takes hours.
Is pagination supported at all in Oracle?
You can use the following
SELECT * FROM
(SELECT Id, Classification, Value, ROWNUM Rank FROM Measurement WHERE Value >= 1234567890)
WHERE Rank <= 2000
order by Rank
You do not need to order in the sub-query. Simply unnecessary.
The above is not pagination, just the first page, I suppose.
Not sure if you got the solution for your problem, but to put my two cents:
The first query will not answer your requirements as it will fetch 2000 random records that satisfy your query and then do an order by.
Coming to the second query :
Oracle will first do the execution of the second query and will then only move to the outer query. So, the rownum filter will be applied only after the inner query is executed.
You can try the approach below, which does an INDEX FAST FULL SCAN. I tested it on a table with 2.76 million rows and it has a lower cost than the other approach:
SELECT * from Measurement
where value in ( SELECT VALUE FROM
(SELECT Value FROM Measurement
WHERE Value >= 1234567890 ORDER BY Value ASC)
WHERE ROWNUM <= 2000)
Hope it Helps
Vishad
I think I have found a potential solution. However, it's not a query.
declare
cursor c is
SELECT * FROM Measurement WHERE Value >= 1234567890 ORDER BY Value ASC;
l_rec c%rowtype;
begin
open c;
for i in 1 .. 2000
loop
fetch c into l_rec;
exit when c%notfound;
end loop;
close c;
end;
/
Kindly experiment with more options
SELECT *
FROM( SELECT /*+ FIRST_ROWS(2000) */
Id,
Classification,
Value,
ROW_NUMBER() OVER (ORDER BY Value) AS rn
FROM Measurement
where Value > 1234567889
)
WHERE rn <=2000;
Update 1: Force the use of the index on Value. Here IDX_ON_VALUE is the name of the index on Value in Measurement.
SELECT * FROM
(SELECT /*+ INDEX(a IDX_ON_VALUE) */ a.* FROM Measurement a
WHERE a.Value >= 1234567890
ORDER BY a.Value ASC)
WHERE ROWNUM <= 2000
