stddev function in plsql is giving me the wrong value - oracle

I need to calculate the standard deviation in PL/SQL, but the value I get differs from the one my Java program produces.
For this set of values (100,104,105,103,110,115,130,95,91,105,106,101,65,91,95), PL/SQL gives 14.032 and Java gives 13.557. Could you please help me get the correct value in Oracle PL/SQL?

Here is an explicit SQL query which takes your numbers and steps them through the Standard Deviation calculation.
with t23 as (
select column_value as val
from table (sys.odcinumberlist(100,104,105,103,110,115,130,95,91,105,106,101,65,91,95))
)
, mn as (
select avg(val) as avg_val
, count(*) as cnt
from t23
) , inp as (
select t23.val
, t23.val - mn.avg_val as diff
, power(t23.val - mn.avg_val , 2) diff_sq
, mn.cnt
from t23 cross join mn
)
select sum(diff_sq) as sum_diff_sq
, sqrt( sum(diff_sq) / (cnt-1) ) as sd
from inp
group by cnt
;
The value of SD is 14.0329544118054 which suggests the Oracle stddev() function is correct and whatever you're running in Java is incorrect.

Great code there @APC - I like the step-through nature of the code :)
I came across this same issue today, and the answer is that Oracle has two built-in functions for standard deviation, corresponding to the two distinct calculations:
a) stddev - the sample standard deviation (divides by n-1); and
b) stddev_pop - the population standard deviation (divides by n)
select stddev(column_value), stddev_pop(column_value)
from table (sys.odcinumberlist(100,104,105,103,110,115,130,95,91,105,106,101,65,91,95))
Example output:
STDDEV(COLUMN_VALUE) STDDEV_POP(COLUMN_VALUE)
-------------------- ------------------------
          14.0329544               13.5571219
So use the one that suits your needs...
Some info on standard deviations and which one to use here.
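As a cross-check outside the database, both figures can be reproduced with a short Python sketch. Python is used here purely for illustration and is not part of the original question; `statistics.stdev` divides by n-1 (like Oracle's stddev) and `statistics.pstdev` divides by n (like stddev_pop).

```python
import statistics

# The values from the question
values = [100, 104, 105, 103, 110, 115, 130, 95, 91, 105, 106, 101, 65, 91, 95]

# Sample standard deviation: sum of squared differences divided by n - 1
sample_sd = statistics.stdev(values)

# Population standard deviation: sum of squared differences divided by n
population_sd = statistics.pstdev(values)

print(f"sample:     {sample_sd:.4f}")
print(f"population: {population_sd:.4f}")
```

The sample figure matches the 14.032 reported from PL/SQL and the population figure matches the 13.557 reported from Java, which suggests the Java program was dividing by n rather than n-1.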


dbms_random.value() in Snowflake - Oracle to snowflake conversion

Below is the Oracle SQL query that I have to convert to Snowflake. I am stuck on reproducing dbms_random.value() in Snowflake.
select emp_id, emp_name, emp_mob,
(case when dbms_random.value() >= 0.85 then 'Y' else 'N' end) as tag
from eds_dwg.employee_data
Can someone help me on this?
Thanks
You can use Snowflake Data generation functions: https://docs.snowflake.com/en/sql-reference/functions-data-generation.html
NORMAL() returns a floating-point number drawn from a normal distribution with a specified mean and standard deviation. Something like this, with the parameters adapted appropriately, could do the trick: https://docs.snowflake.com/en/sql-reference/functions/normal.html
An alternative is UNIFORM(), which is the closer match because dbms_random.value() returns uniformly distributed values in the range [0, 1): https://docs.snowflake.com/en/sql-reference/functions/uniform.html
Example from docs to generate a value between 0 and 1:
select uniform(0::float, 1::float, random()) from table(generator(rowcount => 5));
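To illustrate what the original CASE expression does with a uniform draw, here is a minimal Python sketch (not Snowflake code). The function name `tag_rows` and its parameters are made up for this example; it only shows that comparing a uniform value in [0, 1) against 0.85 tags roughly 15% of rows with 'Y'.

```python
import random

def tag_rows(n, threshold=0.85, seed=None):
    """Return n tags, 'Y' when a uniform draw in [0, 1) is >= threshold."""
    rng = random.Random(seed)  # seeded only so that the sketch is repeatable
    return ['Y' if rng.random() >= threshold else 'N' for _ in range(n)]

tags = tag_rows(100_000, seed=42)
share_y = tags.count('Y') / len(tags)
print(f"share tagged 'Y': {share_y:.3f}")
```

A normally distributed value would not give this 15% share unless its mean and standard deviation were chosen to match, which is why a uniform generator is the more direct translation.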

Trying to return dataset from SQL statement with function

I've been digging on this for most of the day but have so far been unable to find the right answer. I'm trying to find a way to return the results of a SQL query in a custom function. All data in our system is both transaction and effective dated, and we are commonly called upon to compare data between two points of time. Right now I use a WITH clause to pull two different datasets ("Before" and "After"). The problem is that the queries used to create these datasets are very long, and each CTE is basically the same thing just with different effective dates. I'd like to find a way to create a function that I can pass the effective/transaction dates to for the comparison so I don't have so much redundant logic in my SQL. Here's the catch - I have read-only access and cannot create any objects in the DB. I've read that I can use a Declare statement to get around this but just haven't been able to get it right so far.
Here's an example of what I have now. I've simplified the query greatly so this isn't a complete mess.
WITH effective_date AS (
SELECT to_date(:EFFDT) AS effdt,
to_date(:REPORT_DATE_BEFORE) AS report_dt_before,
to_date(:REPORT_DATE_AFTER) AS report_dt_after
FROM dual),
election_data_before AS (
SELECT *
FROM effective_date efd
CROSS JOIN elections e
WHERE efd.effdt >= e.start_dt
AND efd.effdt < e.until_dt
AND efd.report_dt_before >= e.tran_start_dt
AND efd.report_dt_before < e.tran_until_dt),
election_data_after AS (
SELECT *
FROM effective_date efd
CROSS JOIN elections e
WHERE efd.effdt >= e.start_dt
AND efd.effdt < e.until_dt
AND efd.report_dt_after >= e.tran_start_dt
AND efd.report_dt_after < e.tran_until_dt)
SELECT ...
FROM election_data_before edb
INNER JOIN election_data_after eda
ON edb.employee_id = eda.employee_id
AND edb.benefit_type = eda.benefit_type
WHERE ...
This doesn't look so bad, but like I said this is extremely simplified. Here's what I'd like to be able to do. I know this is garbage code, just trying to illustrate what I'm picturing.
FUNCTION elections ( effdt date, report_dt date )
RETURN (
SELECT *
FROM elections e
WHERE effdt >= e.start_dt
AND effdt < e.until_dt
AND report_dt >= e.tran_start_dt
AND report_dt < e.tran_until_dt)
SELECT ...
FROM elections(:EFFDT, :REPORT_DT_BEFORE) edb
ON pp.employee_id = edb.employee_id
INNER JOIN elections(:EFFDT, :REPORT_DT_AFTER) eda
ON pp.employee_id = eda.employee_id
AND edb.benefit_type = eda.benefit_type
WHERE ...
I've been reading about pipelined functions and anonymous blocks all day but haven't been able to put it all together. If anyone can point me in the right direction or let me know if I'm better off just using two different CTEs I'd appreciate it. Thanks!
In order to create and use a pipelined table function, you must first create an object type.
create or replace type election_type as object
(
col1 varchar2(100),
col2 date,
col3 number -- Here you define the name and datatype of the columns you want
-- to return from the select query.
);
/
create or replace type election_type_tab as table of election_type;
-- You need a collection (nested table) type to return multiple records
-- of the type defined above from your function
Now, define your function using an implicit cursor for loop to extract and pass rows to the caller.
CREATE OR REPLACE FUNCTION fn_elections (
effdt DATE,
report_dt DATE
) RETURN election_type_tab
PIPELINED
AS
BEGIN
FOR rec IN (
SELECT * --This should return the same columns as that of election_type
FROM elections e --,some_othertable s
WHERE effdt >= e.start_dt AND
effdt < e.until_dt
AND report_dt >= e.tran_start_dt
AND report_dt < e.tran_until_dt
) LOOP
PIPE ROW ( election_type(rec.col1,rec.col2,rec.col3) );
END LOOP;
return;
END;
/
You may then call it like this.
select * from TABLE(fn_elections(sysdate,sysdate+1)); --or some other date argument

Oracle Spatial - SDO_BUFFER does not work?

I have a table containing SDO_GEOMETRY values. I query all the geometries to find their start and end points, then insert those points into another table called ORAHAN. My main goal is, for each point in ORAHAN, to find out whether it intersects another point in ORAHAN when a 2 cm buffer is applied to the points.
So I wrote some PL/SQL using the relate and buffer functions, but when I checked some records in MapInfo, I saw points within 1 cm of each other that have no record in the intersections table, ORAHANCROSSES.
Am I using these functions incorrectly?
Note: I am using Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production and
PL/SQL Release 11.2.0.1.0 - and SDO_PACKAGE
ORAHAN has approximately 400 thousand records (points and other columns).
declare
BEGIN
for curs in (select * from ORAHAN t) loop
for curs2 in (select *
from ORAHAN t2
where SDO_RELATE(t2.geoloc,SDO_GEOM.SDO_BUFFER(curs.geoloc,0.02,0.5) ,
'mask=ANYINTERACT') = 'TRUE'
and t2.mi_prinx <> curs.mi_prinx) loop
Insert INTO ORAHANCROSSES
values
(curs.Mip, curs.Startmi, curs2.Mip, curs2.Startmi);
commit;
end loop;
end loop;
END;
And this is a MapInfo map image that shows 3 points which are approximately 1 centimeter from each other. But in ORAHANCROSSES there is no record matching these 3.
Note: 0,00001000km equals 1cm
Orahan Metadata:
select * from user_sdo_geom_metadata where table_name = 'ORAHAN';
And diminfo:
What is the coordinate system of your data ? And, most important, what tolerance have you set in your metadata ?
Some other comments:
1) Don't use a relate with buffer approach. Just use a within-distance approach.
2) You don't need a PL/SQL loop for that sort of query just use a simple CTAS:
create table orahancrosses as
select c1.mip mip_1, c1.startmi startmi_1, c2.mip mip_2, c2.startmi startmi_2
from orahan c1, orahan c2
where sdo_within_distance (c2.geoloc, c1.geoloc, 'distance=2 unit=cm') = 'TRUE'
and c2.mi_prinx <> c1.mi_prinx;
3) As written, pairs of points A and B that are within 2 cm of each other will be returned twice: once as (A,B) and once again as (B,A). To avoid that (and return only one of the two), write the query like this:
create table orahancrosses as
select c1.mip mip_1, c1.startmi startmi_1, c2.mip mip_2, c2.startmi startmi_2
from orahan c1, orahan c2
where sdo_within_distance (c2.geoloc, c1.geoloc, 'distance=2 unit=cm') = 'TRUE'
and c1.rowid < c2.rowid;
4) Processing the number of points you mention (400000+) should run better using the SDO_JOIN technique, like this:
create table orahancrosses as
select c1.mip mip_1, c1.startmi startmi_1, c2.mip mip_2, c2.startmi startmi_2
from table (
sdo_join (
'ORAHAN','GEOLOC',
'ORAHAN','GEOLOC',
'DISTANCE=2 UNIT=CM'
)
) j,
orahan c1,
orahan c2
where j.rowid1 < j.rowid2
and c1.rowid = j.rowid1
and c2.rowid = j.rowid2;
This will probably still take time to process, depending on the capacity of your database server. If you are licensed for Oracle Enterprise Edition and your hardware has the proper capacity (# of cores), then parallelism can reduce the elapsed time.
5) You say you are using Oracle 11g. What exact version? Version 11.2.0.4 is the terminal release for 11gR2. Anything older is no longer supported. By now you should really be on 12cR1 (12.1.0.2). The major benefit of 12.1.0.2 in your case is the Vector Performance Accelerator feature, which speeds up a number of spatial functions and operators (only if you own the proper Oracle Spatial licenses; it is not available with the free Oracle Locator feature).
======================================
Using the two points in your example. Let's compute the distance:
select sdo_geom.sdo_distance(
sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.08336913,null),null,null),
sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.07336716,null),null,null),
0.005
) distance
from dual;
DISTANCE
----------
.01000197
1 row selected.
Notice I don't specify any SRID. Assuming the coordinates are expressed in meters, the distance between them is indeed a little more than 1 cm.
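The same figure can be checked by hand with plain Euclidean arithmetic. The Python sketch below is only an illustration outside the database; the helper name `planar_distance` is made up here, and it assumes, as above, that the coordinates are planar and expressed in meters.

```python
import math

# The two points from the SDO_DISTANCE example (assumed to be in meters)
p1 = (521554.782174622, 4230983.08336913)
p2 = (521554.782174622, 4230983.07336716)

def planar_distance(a, b):
    """Euclidean distance between two (x, y) points in a planar system."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

distance_m = planar_distance(p1, p2)
print(f"{distance_m:.8f} m")

# The points are about 1 cm apart, so they fall within a 2 cm (0.02 m) search distance
print(distance_m < 0.02)
```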
======================================
The reason why your original syntax does not work is, as you noticed, because of the tolerance you specify for the SDO_BUFFER() call. You pass it as 0.5 (=50cm) to produce a buffer with a radius of 0.02 (2cm). The effect is that the buffer produced effectively dissolves into the point itself.
For example at tolerance 0.5:
select sdo_geom.sdo_buffer(sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.08336913,null),null,null),0.02,0.5) from dual;
Produces:
SDO_GEOMETRY(2001, NULL, SDO_POINT_TYPE(521554.782, 4230983.08, NULL), NULL, NULL)
At tolerance 0.005:
select sdo_geom.sdo_buffer(sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.08336913,null),null,null),0.02,0.005) from dual;
You get the proper buffer:
SDO_GEOMETRY(2003, NULL, NULL, SDO_ELEM_INFO_ARRAY(1, 1003, 2), SDO_ORDINATE_ARRAY(521554.782, 4230983.06, 521554.802, 4230983.08, 521554.782, 4230983.1, 521554.762, 4230983.08, 521554.782, 4230983.06))
And the very close point now matches with that buffer:
select sdo_geom.relate(
sdo_geom.sdo_buffer(sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.08336913,null),null,null),0.02,0.005),
'determine',
sdo_geometry (2001,null,sdo_point_type(521554.782174622,4230983.07336716,null),null,null),
0.005
) relation
from dual;
RELATION
-------------------------
CONTAINS
1 row selected.
======================================
Now the fact that your data does not have a proper explicit SRID means that the use of explicit units in measurements or distance-based searches will not work. Because the database does not know what coordinate system your data is in, it does not know how to determine that two points are less than a set number of cm or m apart. All you can do is assume the coordinates are in meters.
So in the examples I give above, replace 'DISTANCE=2 UNIT=CM' with 'DISTANCE=0.02'

Select from a loop in Oracle

In Oracle 11g, I want to execute a query like the one described below.
In this case, I am not allowed to use a function or procedure.
I tried to Google it, but I couldn't find a good solution. Almost every result shows how to do it with a function or stored procedure.
Table X has columns (A,B,C).
For each row in table X, I want to compute:
Count = B - A;
for(i=0;i<Count;i++)
{
C++;
D = C * A;
}
Expect result : table Y with columns (A,B,C,D)
You are thinking like a 3GL developer. Java (or whatever) only has arrays, so everything is an iteration. But SQL is a set-oriented language: we don't need loops to work on sets of data. Oracle SQL has built-in aggregation functions which allow us to compute values from sets of records.
For instance, this query calculates total remuneration (salary plus commission), number of employees and average salary:
select sum(sal + nvl(comm,0)) as total_renum
, count(*) as total_emps
, avg(sal) as average_salary
from emp
/
Oracle has a comprehensive range of such functions, some of them are really powerful. Find out more. Be sure to check out analytic functions too.
Hmmm, so you subsequently posted a cryptic snippet of code. It's still not clear exactly what you want, but this might produce the outcome for your table Y:
select a
, b
, c
, 0 + ((c+level) * a) as d
from x
connect by level <= (b-a)
/
For each row in table X it will generate (b-a) rows, with a derived value of d. I have assumed a start of 0 for d.
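To make the mapping from the loop to the query explicit, here is a small Python sketch of what the query computes for a single row of X. The function name `expand_row` is made up for this illustration, and it assumes b > a so that at least one level is generated.

```python
def expand_row(a, b, c):
    """Return one (a, b, c, d) tuple per level, for level = 1 .. (b - a).

    Mirrors the expression (c + level) * a from the query: each pass of the
    original loop increments c once and then multiplies it by a.
    """
    return [(a, b, c, (c + level) * a) for level in range(1, (b - a) + 1)]

# One sample row with a=2, b=5, c=10 yields three rows, one per loop pass
for row in expand_row(2, 5, 10):
    print(row)
```

If only the final value of the loop is wanted rather than every intermediate row, it is simply (c + (b - a)) * a, which needs no row generation at all.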

convert from time in format h.mm to minutes

I have time stored as a number in an Oracle database in the format hh.mm. I want to calculate the sum and then write it back to another column in the same hh.mm format using a stored procedure.
Is there any function in oracle for this type of summation or I have to do it from scratch?
Thanks in advance.
You are storing values in a custom format, so obviously Oracle won't have a built-in function to handle it.
Doing this requires a series of steps, which could be wrapped into a user-defined function.
1) Convert the numeric column into a string. It is important to specify the format mask, otherwise the next step will produce the wrong result.
2) Use regular expressions to extract the hour and minute values from the column.
3) Derive the total number of minutes with simple arithmetic.
4) Derive the new total number of hours and remainder of minutes with simple arithmetic.
5) Derive the final number as a pseudo-decimal value.
Here is the SQL ...
select hh + (mi/100) as final_result
from (
select trunc(step3.tot_mins/60) as hh
, step3.tot_mins - (trunc(step3.tot_mins/60)*60) as mi
from (
select sum((step2.hh*60)+step2.mi) as tot_mins
from (
with step1 as (select to_char(ctime, '00000000000.99') ctime
from your_table)
select to_number(regexp_substr(ctime, '([0-9]+)', 1,1)) as hh
, to_number(regexp_substr(ctime, '([0-9]+)', 1, 2)) as mi
from step1
) step2
) step3
) step4
/
... and here is the obligatory SQL Fiddle.
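The arithmetic in those steps can be sanity-checked outside the database with a short Python sketch. The function name `sum_hhmm` is made up for this illustration; it assumes non-negative values with exactly two digits of minutes, and uses Decimal so that a value like 1.30 is not distorted by binary floating point.

```python
from decimal import Decimal

def sum_hhmm(values):
    """Sum times given in pseudo-decimal hh.mm form and return hh.mm."""
    total_minutes = 0
    for value in values:
        d = Decimal(str(value))
        hours = int(d)                    # whole part is the hour count
        minutes = int((d - hours) * 100)  # two digits after the point
        total_minutes += hours * 60 + minutes
    hh, mi = divmod(total_minutes, 60)    # back to hours and leftover minutes
    return Decimal(hh) + Decimal(mi) / 100

# 1h30 + 2h45 = 255 minutes = 4h15
print(sum_hhmm(["1.30", "2.45"]))
```

Passing the values as strings keeps the two minute digits intact, which plays the same role as the explicit format mask in the SQL version.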
