Programming Percentages in PL/SQL - oracle

--56/25000*100 = 0.244 = 0.2 % as percentage column answer
I am trying to write out this math problem as a query in pl/sql.
The table looks like this
Test_Name , Total# Pass, Fail, Percent, View_name
I need to write out a query that divides the total number of records that fail (56) against the total number of records with in the table (25000) times 100.
How can this be queried?

Sounds like you want something like this
select 100.0 * count(case when fail = true then 1 else null end) / count(*)
from table_name;

Related

Referancing value from select column in where clause : Oracle

My tables are as below
MS_ISM_ISSUE
ISSUE_ID ISSUE_DUE_DATE ISSUE_SOURCE_TYPE
I1 25-11-2018 1
I2 25-12-2018 1
I3 27-03-2019 2
MS_ISM_SOURCE_SETUP
SOURCE_ID MODULE_NAME
1 IT-Compliance
2 Risk Assessment
I have written following query.
with rs as
(select
count(ISSUE_ID) as ISSUE_COUNT, src.MODULE_NAME,
case
when ISSUE_DUE_DATE<sysdate then 'Overdue'
when ISSUE_DUE_DATE between sysdate and sysdate + 90 then 'Within 3 months'
when ISSUE_DUE_DATE>sysdate+90 then 'Beyond 90 days'
end as date_range
from MS_ISM_ISSUE issue, MS_ISM_SOURCE_SETUP src
where issue.Issue_source_type = src.source_id
group by src.MODULE_NAME, case
when ISSUE_DUE_DATE<sysdate then 'Overdue'
when ISSUE_DUE_DATE between sysdate and sysdate + 90 then 'Within 3 months'
when ISSUE_DUE_DATE>sysdate+90 then 'Beyond 90 days'
end)
select ISSUE_COUNT,MODULE_NAME, DATE_RANGE,
(select count(ISSUE_COUNT) from rs where rs.MODULE_NAME=MODULE_NAME) as total from rs;
The output of the code is as below.
ISSUE_COUNT MODULE_NAME DATE_RANGE Total
1 IT-Compliance Overdue 3
1 IT-Compliance Within 3 months 3
1 Risk Assessment Beyond 90 days 3
The result is correct till 3rd column. In 4th column what I want is, total of Issue count for given module name. Hence in above case Total column will have value as 2 for first and second row (since there are 2 Issues for IT-Compliance) and value 1 for the third row (since one issue is present for Risk Assessment).
Essentially, I want to achieve is to replace current row's MODULE_NAME in last where clause. How do I achieve this using query?
OK, this condition
where rs.MODULE_NAME=MODULE_NAME
is essentially the same as if you wrote
where MODULE_NAME = MODULE_NAME
which is simply always true (if there are no nulls in module_name).
Try using different table alias for inner query and outer query, e.g.
select count(ISSUE_COUNT) from rs rs2 where rs2.MODULE_NAME=rs.MODULE_NAME
You can also try to use analytic function here, something like
select ISSUE_COUNT,
MODULE_NAME,
DATE_RANGE,
COUNT(ISSUE_COUNT) OVER (PARTITION BY RS.MODULE_NAME) AS TOTAL
from rs
instead of your subquery

Return Boolean value when table has data in the specified range

I need a query to return boolean when there's table has data in the given range.
Assume table
Customer
[User ID, Name, Date, Products_Purchased]
I'm trying to do:
select case when exists(
select Date, count(*)
from Customer
where date between '2015-08-03' and '2015-08-05'
)
then cast(1 as BIT)
else case(0 as BIT)end;
This is throwing an error near "select Date".
However, weird part is the inner query is running perfectly fine.
Im wondering if im missing out something here !
What about something more straightforward e.g.
select case when count(*) >0 then 1 else 0 end as HIT
from ... where ...
That way you don't have to bother about Hive assuming that EXISTS implies a correlated sub-query, automagically translated into a MapJoin, i.e. a Java HashMap shuffled to the 2nd line of Mappers jobs, etc. Not exactly your use case.
Then it's not useful to compute the exact count, so the query could be refined as
select case when count(*) >0 then 1 else 0 end as HIT
from
(select ... from ... where ... limit 1) X
[Edit] There is no "bit" datatype in Hive. But the default "int" should be OK if you just want a return flag (zero / non-zero)

Oracle pagination ROWNUM column>=value challenge

Having some trouble with oracle pagination. Case:
Table with > 1 billion rows:
Measurement(Id Number, Classification VARCHAR, Value NUMBER)
Index:
ON Measurement(Value)
I need a query that gets the first match and the following 2000 matches ordered by Value. I also would like to use the index.
First idea:
SELECT * FROM Measurement WHERE Value >= 1234567890
AND ROWNUM <= 2000 ORDER BY Value ASC
Result:
The query just returns the first 2000 cases it can find in the table, starting from the top, where Value is higher or equal to 1234567890, and then orders that resultset ascending.
Second idea:
SELECT * FROM
(SELECT * FROM Measurement WHERE Value >= 1234567890 ORDER BY Value ASC)
WHERE ROWNUM <= 2000
Result:
Oracle does not understand that ROWNUM should limit the amount from the inner query, so oracle decides to get all rows where Value is greater or equal to 1234567890 first, and then order that giant resultset before returning the first 2000 rows. Because Oracle is guessing that most of the data in the table will be returned, it ignores any use of index as well.
None of these approaches are acceptable as the first one gives the wrong results, and the second one takes hours.
Is pagination supported at all in Oracle?
You can use the following
SELECT * FROM
(SELECT Id, Classification, Value, ROWNUM Rank FROM Measurement WHERE Value >= 1234567890)
WHERE Rank <= 2000
order by Rank
You do not need to order in the sub-query. Simply unnecessary.
The above is not pagination but the firs page I would suppose.
Not sure if you got the solution for your problem, but to put my two cents:
The first query will not answer your requirements as it will fetch 2000 random records that satisfy your query and then do an order by.
Coming to the second query :
Oracle will first do the execution of the second query and will then only move to the outer query. So, the rownum filter will be applied only after the inner query is executed.
You can try the below approach, to do INDEX FAST FULL SCAN, i have tested it on a table with 2.76 million rows and it is having lesser cost than the other approach:
SELECT * from Measurement
where value in ( SELECT VALUE FROM
(SELECT Value FROM Measurement
WHERE Value >= 1234567890 ORDER BY Value ASC)
WHERE ROWNUM <= 2000)
Hope it Helps
Vishad
I think I have fond a potential solution. However, it's not a query.
declare
cursor c is
SELECT * FROM Measurement WHERE Value >= 1234567890 ORDER BY Value ASC;
l_rec c%rowtype;
begin
open c;
for i in 1 .. 2000
loop
fetch c into l_rec;
exit when c%notfound;
end loop;
close c;
end;
/
Kindly experiment with more options
SELECT *
FROM( SELECT /*+ FIRST_ROWS(2000) */
Id,
Classification,
Value,
ROW_NUMBER() OVER (ORDER BY Value) AS rn
FROM Measurement
where Value > 1234567889
)
WHERE rn <=2000;
Update1:- Force the use of index on Value.Here IDX_ON_VALUE is the Name of the index on Value in Measurement
SELECT * FROM
(SELECT /*+ INDEX(a IDX_ON_VALUE) */* FROM Measurement
a WHERE value >=1234567890 )
ORDER BY a.Value ASC)
WHERE ROWNUM <= 2000

How to get records randomly from the oracle database?

I need to select rows randomly from an Oracle DB.
Ex: Assume a table with 100 rows, how I can randomly return 20 of those records from the entire 100 rows.
SELECT *
FROM (
SELECT *
FROM table
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum < 21;
SAMPLE() is not guaranteed to give you exactly 20 rows, but might be suitable (and may perform significantly better than a full query + sort-by-random for large tables):
SELECT *
FROM table SAMPLE(20);
Note: the 20 here is an approximate percentage, not the number of rows desired. In this case, since you have 100 rows, to get approximately 20 rows you ask for a 20% sample.
SELECT * FROM table SAMPLE(10) WHERE ROWNUM <= 20;
This is more efficient as it doesn't need to sort the Table.
SELECT column FROM
( SELECT column, dbms_random.value FROM table ORDER BY 2 )
where rownum <= 20;
In summary, two ways were introduced
1) using order by DBMS_RANDOM.VALUE clause
2) using sample([%]) function
The first way has advantage in 'CORRECTNESS' which means you will never fail get result if it actually exists, while in the second way you may get no result even though it has cases satisfying the query condition since information is reduced during sampling.
The second way has advantage in 'EFFICIENT' which mean you will get result faster and give light load to your database.
I was given an warning from DBA that my query using the first way gives loads to the database
You can choose one of two ways according to your interest!
In case of huge tables standard way with sorting by dbms_random.value is not effective because you need to scan whole table and dbms_random.value is pretty slow function and requires context switches. For such cases, there are 3 additional methods:
1: Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
ie get 1% of all blocks, then sort them randomly and return just 1 row.
2: if you have an index/primary key on the column with normal distribution, you can get min and max values, get random value in this range and get first row with a value greater or equal than that randomly generated value.
Example:
--big table with 1 mln rows with primary key on ID with normal distribution:
Create table s1(id primary key,padding) as
select level, rpad('x',100,'x')
from dual
connect by level<=1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
3: get random table block, generate rowid and get row from the table by this rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select/*+ rule */ file#,block#,objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);
To randomly select 20 rows I think you'd be better off selecting the lot of them randomly ordered and selecting the first 20 of that set.
Something like:
Select *
from (select *
from table
order by dbms_random.value) -- you can also use DBMS_RANDOM.RANDOM
where rownum < 21;
Best used for small tables to avoid selecting large chunks of data only to discard most of it.
Here's how to pick a random sample out of each group:
SELECT GROUPING_COLUMN,
MIN (COLUMN_NAME) KEEP (DENSE_RANK FIRST ORDER BY DBMS_RANDOM.VALUE)
AS RANDOM_SAMPLE
FROM TABLE_NAME
GROUP BY GROUPING_COLUMN
ORDER BY GROUPING_COLUMN;
I'm not sure how efficient it is, but if you have a lot of categories and sub-categories, this seems to do the job nicely.
-- Q. How to find Random 50% records from table ?
when we want percent wise randomly data
SELECT *
FROM (
SELECT *
FROM table_name
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum <= (select count(*) from table_name) * 50/100;

T-SQL Use Table Variable or Sum Against Parent Table

The scenario is this, I am creating a log table that will end up being quite large once it is all said and done and I want to create a status table that will query from the table with different date ranges and sum the results into multiple total fields.
I plan on writing this into a Stored Procedure but my question would I gain the best performance from reading all my records from the log table into a temp table before doing the sum operations.
IE I have this table:
SummaryValues
90DayValues
60DayValues
30DayValues
14DayValues
7DayValues
1DayValues
Would it be logical to make a take all values for the previous 90 days and then insert them into a table value before then calculating my sum for my 6 fields in my summary table or would it be just as fast to execute 6 sum statements from the log table?
Sometimes you are better reading into a temp table first. Sometimes not. This makes sense if you have multiple passes of processing on the same data
However, if you want "last 90 days", "last 60 day" etc then it can be done in one query
Reading the question again, I'd just run one query and calculate all values in one go. And not bother with any intermediate tables
SELECT
Stuff,
SUM(CASE WHEN dayDiff <= 90 THEN SomeValue ELSE 0 END) AS SumValue90,
SUM(CASE WHEN dayDiff <= 60 THEN SomeValue ELSE 0 END) AS SumValue60,
SUM(CASE WHEN dayDiff <= 30 THEN SomeValue ELSE 0 END) AS SumValue30
FROM
(
SELECT
Stuff,
DATEDIFF(day, SomeData, GETDATE()) AS dayDiff
FROM
Mytable
WHERE
...
) foo
GROUP BY
...

Resources