Why doesn't Oracle use banker's rounding?
Accurate decimal arithmetic is a large and complex subject.
Google 'Mike Cowlishaw decimal rounding' if you want to read the, ahem, Oracle on the subject.
Basically there are many possible rounding schemes:
Round everything down - the default in most languages, including C. As Oracle is written in C, this is probably why they do this.
Round everything up - rarely seen but occasionally needs to be implemented because of obscure market and tax rules.
Basic Half Rounding - anything above .5 rounds up, everything else rounds down.
Generous Half Rounding - anything below .5 rounds down, everything else rounds up.
Banker's Rounding - even numbers follow the Basic Half Rounding rule, odd numbers the Generous Half Rounding rule. This is rarely seen in actual banks, which prefer rounding up if the money's coming their way and rounding down when it's going the client's way.
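As a rough illustration of how these schemes differ on ties, here is a small Python sketch using the standard decimal module. The mapping of the scheme names above onto Python's rounding constants is my own reading of the descriptions (for positive numbers), not something the answer states, and round_with is a made-up helper name.

```python
from decimal import (Decimal, ROUND_DOWN, ROUND_UP,
                     ROUND_HALF_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN)

# Assumed mapping of the schemes listed above onto decimal's rounding modes.
SCHEMES = {
    "round everything down": ROUND_DOWN,      # truncates toward zero
    "round everything up": ROUND_UP,          # rounds away from zero
    "basic half rounding": ROUND_HALF_DOWN,   # ties go toward zero
    "generous half rounding": ROUND_HALF_UP,  # ties go away from zero
    "banker's rounding": ROUND_HALF_EVEN,     # ties go to the even neighbour
}

def round_with(value: str, scheme: str) -> Decimal:
    """Round a decimal string to a whole number using the named scheme."""
    return Decimal(value).quantize(Decimal("1"), rounding=SCHEMES[scheme])

if __name__ == "__main__":
    for name in SCHEMES:
        results = [str(round_with(v, name)) for v in ("22.5", "23.5", "-22.5")]
        print(f"{name}: {results}")
```

Only the tie cases (exactly .5) separate the three "half" schemes; for any other input they agree.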
Oracle NUMBER is actually a pretty good decimal arithmetic implementation and is accurate as far as it goes.
Oracle has implemented round half away from zero:
SQL> select round(22.5) from dual
2 /
ROUND(22.5)
-----------
23
SQL> select round(23.5) from dual
2 /
ROUND(23.5)
-----------
24
SQL> select round(-23.5) from dual
2 /
ROUND(-23.5)
------------
-24
SQL> select round(-22.5) from dual
2 /
ROUND(-22.5)
------------
-23
SQL>
Why don't they change it to Bankers' Rounding? Well, for most purposes round half away from zero is good enough. Plus there's that old fallback, changing it would likely break too much of the existing codebase - Oracle's own as well as all their customers.
Old thread, but someone may still need this. Oracle's binary floats and binary doubles follow the banker's rounding rule when rounding to a whole number. So you can use that. It's ugly but it works:
given : price = 2.445
SQL> select round(to_binary_float(price * 100)) / 100 as price_rounded from dual;
price_rounded
-------------
2.44
given : price = 2.435
SQL> select round(to_binary_float(price * 100)) / 100 as price_rounded from dual;
price_rounded
-------------
2.44
The multiply and divide by 100 are necessary in this example. I haven't been able to figure out the specifics of the behavior, but select round(to_binary_float(price), 2) for some decimal, price, does not seem to consistently round up or down by the same rules. I have found, however, that rounding to a whole number consistently gives me what I need.
You can always implement your own function for banker's rounding as described here.
Banker's rounding rounds 0.5 to 0: it rounds towards even numbers.
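For anyone who wants to see what such a hand-written function might look like, here is a minimal Python sketch. It is not the implementation the answer links to (that link did not survive); bankers_round is a hypothetical name, it assumes places is zero or positive, and it takes the value as a decimal string so no binary floating point error creeps in.

```python
import math
from fractions import Fraction

def bankers_round(value: str, places: int = 0) -> Fraction:
    """Round a decimal string to `places` digits (places >= 0), ties to even."""
    scaled = Fraction(value) * 10 ** places   # exact rational arithmetic
    lower = math.floor(scaled)                # integer at or below the value
    remainder = scaled - lower                # always in [0, 1)
    if remainder > Fraction(1, 2):
        rounded = lower + 1
    elif remainder < Fraction(1, 2):
        rounded = lower
    else:
        # Exact tie: keep whichever neighbour is even.
        rounded = lower if lower % 2 == 0 else lower + 1
    return Fraction(rounded, 10 ** places)
```

With this sketch both 2.445 and 2.435 come out as 2.44 at two places, and 0.5 rounds to 0.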
Not sure if this is possible, but here's a working version of a form I want to reproduce in REDCap: http://extubation.net/
Provided equation that is calculated-
Gestational age, oxygen, respiratory score, day of life, pH, and weight at extubation are the input fields.
Is this possible to reproduce within a REDCap form?
Yes, it should be straightforward. I just plugged some random numbers into a calculation I defined based on your image and this popped out:
I don't know these metrics and the numbers I put in might be way off, so this might be off by some orders of magnitude, but the important points are:
In REDCap calculations, base and exponent are both wrapped in parentheses, so (3)^(2) yields 9, while 3^2 is not valid syntax.
To get Euler's number, you can either hardcode it to some degree of precision yourself in the calculation (2.718281828459045)^(-47.3379 + 0.40635*[gage]..., or you can use the JavaScript Math library and use (Math.E)^(-47.3379 + 0.40635*[gage].... If you do the latter, you must have your admin add the calculation for you, as JavaScript in calculations is reserved for administrators. It might only give you so much precision anyway, so you might as well hard code it.
My calculation was:
(Math.E) ^ (-47.3379 + 0.40635*[gage] - 0.0701*[o2] - 0.1273*[rss] + 0.04202*[edol] + 0.52154*10*[bgph] + 0.0016*[ext_weight_g])
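To sanity-check a calculation like this outside REDCap, the same expression can be evaluated in a few lines of Python. This is only a sketch: the function name is made up, the input values are arbitrary placeholders rather than clinically meaningful numbers, and the coefficients are copied as they appear in the calculation above.

```python
import math

def linear_exponent(gage, o2, rss, edol, bgph, ext_weight_g):
    """The expression inside the parentheses of the calculation above."""
    return (-47.3379 + 0.40635 * gage - 0.0701 * o2 - 0.1273 * rss
            + 0.04202 * edol + 0.52154 * 10 * bgph + 0.0016 * ext_weight_g)

# Hypothetical inputs, chosen only to exercise the formula.
x = linear_exponent(gage=28, o2=30, rss=2, edol=10, bgph=7.3, ext_weight_g=1000)

via_exp = math.exp(x)                  # library exponential
via_literal = 2.718281828459045 ** x   # hard-coded Euler's number, as in the answer

print(x, via_exp, via_literal)
```

The hard-coded literal and the library function agree to well within double precision here, which supports the point that hard-coding e loses nothing in practice.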
How can I change the precision of the ln and power functions in Oracle? I'm getting very precise results - 40 digits. The problem is that I have a huge table, so the calculations take very long, and I don't need that kind of precision. The standard 7 or 16 digits would be fine, and would probably speed up the computation. Note that I'm not asking about the round function, because it would only change the format of the result and would not influence the computation itself.
Edit
My real query is complicated, so to keep things simple, let us consider
select ln(2) from dual;
As a result, I'm getting
.6931471805599453094172321214581765680782
whereas, I would like to get, e.g., .69314718, but not by rounding the final result .6931471805599453094172321214581765680782. I want to avoid the computation of those additional digits.
Just trunc the ln to avoid rounding.
select trunc(ln(2),7),ln(2) from dual;
Outputs:
0.6931471 0.693147180559945
It turned out that converting the arguments to binary_double is the perfect solution for my efficiency problems. For binary_double arguments, the power and ln functions produce binary_double results. Now both of my queries are evaluated in a couple of minutes instead of 1 hour 15 minutes and 40 minutes.
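The same trade-off can be seen outside Oracle. As a loose analogy (not a statement about Oracle internals), Python's float is an IEEE 754 double like BINARY_DOUBLE, while its decimal module computes in software to whatever precision is asked for, much as NUMBER does. The function names below are made up for the sketch.

```python
import math
from decimal import Decimal, localcontext

def ln2_double() -> float:
    """ln(2) in hardware double precision (roughly 16 significant digits)."""
    return math.log(2.0)

def ln2_decimal(digits: int = 40) -> Decimal:
    """ln(2) computed in software to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(2).ln()

if __name__ == "__main__":
    print(ln2_double())
    print(ln2_decimal())
```

The double result carries far fewer digits but comes from a single hardware-friendly library call, which is the effect the binary_double conversion exploits.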
I have a situation where I'm performing a calculation over a huge number of rows, and I can really increase the performance if I can eschew a conditional statement.
What I need is, for a given positive, zero, or negative integer, the result 1, 0, or -1 respectively.
So if I do col/ABS(col), I will get 1 for a positive number, and -1 for a negative number, but of course if col equals 0 then I'll get an error. I can't get an error.
This seems simple enough, but I can't wrap my head around it.
Assuming either two's complement 32-bit integers, or one's complement with no negative-zero to worry about, then the following works well:
(x>>31) - (-x>>31);
Replace 31 with 63 for 64-bit integers, and so on.
col/max(1, abs(col))
Ugly but works. For integers, that is. For floating point values where there's no well-defined smallest positive value, you're stuck unless the language allows you to look into it as a bit sequence, then you can just do the same with the sign flag and the significand.
Whether this helps optimising anything is highly debatable though. It certainly makes things harder to read.
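Both branch-free tricks above are easy to check in a few lines. The sketch below uses Python only to verify the arithmetic: Python integers are unbounded, so the shift version simply assumes the input would fit in a signed 32-bit integer (and is not -2**31). The function names are made up.

```python
def sign_shift(x: int) -> int:
    """Shift trick; assumes -2**31 < x < 2**31."""
    # x >> 31 is -1 for negative x and 0 otherwise (arithmetic shift).
    return (x >> 31) - (-x >> 31)

def sign_divide(x: int) -> int:
    """Division trick: max(1, ...) keeps the denominator away from zero."""
    return x // max(1, abs(x))

if __name__ == "__main__":
    for value in (-7, -1, 0, 1, 42):
        print(value, sign_shift(value), sign_divide(value))
```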
I am doing this in an SSAS tabular model, which doesn't have a MAX function like the one in @bizclop's answer (which helps in many other applications such as Excel).
I ended up doing the following which was inspired by the accepted answer:
ROUND([col] / (ABS([col]) + 1), 0)
This ended up reducing my query time quite significantly (40%) versus IF([col] <> 0, [col]/ABS([col]), 0).
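One thing worth noting about the ROUND formula: for col = 1 or col = -1 the ratio is exactly 0.5 or -0.5, so it only works if ROUND sends ties away from zero. The Python sketch below makes the tie rule explicit to show the difference; sign_via_round is a hypothetical helper, and the claim about which rule a given engine uses is an assumption to verify on your own platform.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def sign_via_round(col: int, rounding=ROUND_HALF_UP) -> int:
    """Evaluate ROUND(col / (ABS(col) + 1), 0) under an explicit tie rule."""
    ratio = Decimal(col) / (abs(Decimal(col)) + 1)
    return int(ratio.quantize(Decimal("1"), rounding=rounding))

if __name__ == "__main__":
    for col in (-5, -1, 0, 1, 5):
        print(col,
              sign_via_round(col),                    # ties away from zero
              sign_via_round(col, ROUND_HALF_EVEN))   # banker's rounding
```

Under banker's rounding the result for 1 and -1 collapses to 0, so the trick should not be ported blindly to an environment whose ROUND uses that rule.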
I'm having problems with a mammoth legacy PL/SQL procedure which has the following logic:
l_elapsed := dbms_utility.get_time - l_timestamp;
where l_elapsed and l_timestamp are of type PLS_INTEGER and l_timestamp holds the result of a previous call to get_time.
This line suddenly started failing during a batch run with an ORA-01426: numeric overflow.
The documentation on get_time is a bit vague, possibly deliberately so, but it strongly suggests that the return value has no absolute significance, and can be pretty much any numeric value. So I was suspicious to see it being assigned to a PLS_INTEGER, which can only support 32 bit integers. However, the interweb is replete with examples of people doing exactly this kind of thing.
The smoking gun is found when I invoke get_time manually: it is returning a value of -214512572, which is suspiciously close to the min value of a 32 bit signed integer. I'm wondering if, during the time elapsed between the first call to get_time and the next, Oracle's internal counter rolled over from its max value to its min value, resulting in an overflow when trying to subtract one from the other.
Is this a likely explanation? If so, is this an inherent flaw in the get_time function? I could just wait and see if the batch fails again tonight, but I'm keen to get an explanation for this behaviour before then.
Maybe late, but this may benefit someone searching on the same question.
The underlying implementation is a simple 32 bit binary counter, which is incremented every 100th of a second, starting from when the database was last started.
This binary counter is being mapped onto a PL/SQL BINARY_INTEGER type - which is a signed 32-bit integer (there is no sign of it being changed to 64-bit on 64-bit machines).
So, presuming the clock starts at zero it will hit the +ve integer limit after about 248 days, and then flip over to become a -ve value falling back down to zero.
The good news is that provided both numbers are the same sign, you can do a simple subtraction to find duration - otherwise you can use the 32-bit remainder.
IF SIGN(:now) = SIGN(:then) THEN
    RETURN :now - :then;
ELSE
    RETURN MOD(:now - :then + POWER(2,32), POWER(2,32));
END IF;
Edit: This code will blow the int limit and fail if the gap between the times is too large (248 days), but you shouldn't be using GET_TIME to compare durations measured in days anyway (see below).
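The wrap-around arithmetic can be checked with a short Python sketch. It models the counter as described in this answer (a signed 32-bit value ticking 100 times a second), which is the answer's description rather than anything I have verified against Oracle itself; the names are made up.

```python
TICKS_PER_SECOND = 100   # GET_TIME counts hundredths of a second
WRAP = 2 ** 32           # number of distinct values of a 32-bit counter

def elapsed_ticks(now: int, then: int) -> int:
    """Ticks between two signed 32-bit readings, allowing for one wrap-around."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so a single expression covers both the same-sign and wrapped cases.
    return (now - then) % WRAP

def days_until_positive_limit() -> float:
    """How long a counter starting at zero takes to reach 2**31 ticks."""
    return 2 ** 31 / TICKS_PER_SECOND / 86_400

if __name__ == "__main__":
    print(elapsed_ticks(-2147483638, 2147483642))   # reading taken across the wrap
    print(days_until_positive_limit())
```

Because Python integers do not overflow, the intermediate subtraction is safe here; in PL/SQL the operands would need to be NUMBER rather than PLS_INTEGER for the same expression to be safe.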
Lastly - there's the question of why you would ever use GET_TIME.
Historically, it was the only way to get a sub-second time, but since the introduction of SYSTIMESTAMP, the only reason you would ever use GET_TIME is because it's fast - it is a simple mapping of a 32-bit counter, with no real type conversion, and doesn't make any hit on the underlying OS clock functions (SYSTIMESTAMP seems to).
As it only measures relative time, its only use is for measuring the duration between two points. For any task that takes a significant amount of time (you know, over 1/1000th of a second or so) the cost of using a timestamp instead is insignificant.
The number of occasions where it is actually useful is minimal (the only one I've found is checking the age of data in a cache, where doing a clock hit for every access becomes significant).
From the 10g doc:
Numbers are returned in the range -2147483648 to 2147483647 depending on platform and machine, and your application must take the sign of the number into account in determining the interval. For instance, in the case of two negative numbers, application logic must allow that the first (earlier) number will be larger than the second (later) number which is closer to zero. By the same token, your application should also allow that the first (earlier) number be negative and the second (later) number be positive.
So while it is safe to assign the result of dbms_utility.get_time to a PLS_INTEGER it is theoretically possible (however unlikely) to have an overflow during the execution of your batch run. The difference between the two values would then be greater than 2^31.
If your job takes a lot of time (therefore increasing the chance that the overflow will happen), you may want to switch to a TIMESTAMP datatype.
Assigning a value outside the PLS_INTEGER range to your PLS_INTEGER variable does raise an ORA-01426:
SQL> l
1 declare
2 a pls_integer;
3 begin
4 a := -power(2,33);
5* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01426: numeric overflow
ORA-06512: at line 4
However, you seem to suggest that -214512572 is close to -2^31, but it's not, unless you forgot to type a digit. Are we looking at a smoking gun?
Regards,
Rob.
I have an array of numbers that potentially have up to 8 decimal places and I need to find the smallest common number I can multiply them by so that they are all whole numbers. I need this so all the original numbers can all be multiplied out to the same scale and be processed by a sealed system that will only deal with whole numbers, then I can retrieve the results and divide them by the common multiplier to get my relative results.
Currently we do a few checks on the numbers and multiply by 100 or 1,000,000, but the processing done by the *sealed system can get quite expensive when dealing with large numbers, so multiplying everything by a million just for the sake of it isn’t really a great option. As an approximation, let’s say that the sealed algorithm gets 10 times more expensive every time you multiply by a factor of 10.
What is the most efficient algorithm that will also give the best possible result to accomplish what I need, and is there a mathematical name and/or formula for what I need?
*The sealed system isn’t really sealed. I own/maintain the source code for it, but it’s 100,000-odd lines of proprietary magic and it has been thoroughly bug and performance tested; altering it to deal with floats is not an option for many reasons. It is a system that creates a grid of X by Y cells, then rects that are X by Y are dropped into the grid, “proprietary magic” occurs and results are spat out – obviously this is an extremely simplified version of reality, but it’s a good enough approximation.
So far there are quite a few good answers and I wondered how I should go about choosing the ‘correct’ one. To begin with I figured the only fair way was to create each solution and performance test it, but I later realised that pure speed wasn’t the only relevant factor – a more accurate solution is also very relevant. I wrote the performance tests anyway, but currently I’m choosing the correct answer based on speed as well as accuracy using a ‘gut feel’ formula.
My performance tests process 1000 different sets of 100 randomly generated numbers.
Each algorithm is tested using the same set of random numbers.
Algorithms are written in .Net 3.5 (although thus far would be 2.0 compatible)
I tried pretty hard to make the tests as fair as possible.
Greg – Multiply by large number and then divide by GCD – 63 milliseconds
Andy – String Parsing – 199 milliseconds
Eric – Decimal.GetBits – 160 milliseconds
Eric – Binary search – 32 milliseconds
Ima – sorry I couldn’t figure out how to implement your solution easily in .Net (I didn’t want to spend too long on it)
Bill – I figure your answer was pretty close to Greg’s so didn’t implement it. I’m sure it’d be a smidge faster but potentially less accurate.
So Greg’s “Multiply by large number and then divide by GCD” solution was the second fastest algorithm and it gave the most accurate results, so for now I’m calling it correct.
I really wanted the Decimal.GetBits solution to be the fastest, but it was very slow; I’m unsure if this is due to the conversion of a Double to a Decimal or the bit masking and shifting. There should be a similar usable solution for a straight Double using BitConverter.GetBytes and some knowledge contained here: http://blogs.msdn.com/bclteam/archive/2007/05/29/bcl-refresher-floating-point-types-the-good-the-bad-and-the-ugly-inbar-gazit-matthew-greig.aspx but my eyes just kept glazing over every time I read that article and I eventually ran out of time to try to implement a solution.
I’m always open to other solutions if anyone can think of something better.
I'd multiply by something sufficiently large (100,000,000 for 8 decimal places), then divide by the GCD of the resulting numbers. You'll end up with a pile of smallest integers that you can feed to the other algorithm. After getting the result, reverse the process to recover your original range.
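A minimal Python sketch of this idea follows. It is my own illustration rather than the answerer's code: it takes the inputs as decimal strings so the scaling is exact, and to_smallest_integers is a made-up name. Folding the pairwise gcd over the list is all that is needed for many numbers, so the cost grows linearly with the list length.

```python
from decimal import Decimal
from fractions import Fraction
from functools import reduce
from math import gcd

SCALE = 10 ** 8  # enough for up to 8 decimal places, as in the question

def to_smallest_integers(values):
    """Return (integers, multiplier) with value * multiplier == integer."""
    scaled = [int(Decimal(v) * SCALE) for v in values]   # exact for <= 8 decimals
    common = reduce(gcd, scaled, 0) or 1                  # gcd of the whole list
    multiplier = Fraction(SCALE, common)                  # may be a fraction
    return [s // common for s in scaled], multiplier

if __name__ == "__main__":
    print(to_smallest_integers(["0.25", "0.75", "1.5"]))
```

Note that the multiplier is not always a whole number: for inputs 3 and 6 it comes out as 1/3, which is fine if the results are simply divided by it again afterwards.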
Multiply all the numbers by 10 until you have integers.
Divide by 2, 3, 5, 7 while you still have all integers.
I think that covers all cases.
2.1 * 10/7 -> 3
0.008 * 10^3/2^3 -> 1
That's assuming your multiplier can be a rational fraction.
If you want to find some integer N so that N*x is an exact integer for every float x in a given set, then you have a basically unsolvable problem. Suppose x = the smallest positive float your type can represent, say it's 10^-30. If you multiply all your numbers by 10^30, and then try to represent them in binary (otherwise, why are you even trying so hard to make them ints?), then you'll lose basically all the information of the other numbers due to overflow.
So here are two suggestions:
If you have control over all the related code, find another approach. For example, if you have some function that takes only ints, but you have floats, and you want to stuff your floats into the function, just rewrite or overload this function to accept floats as well.
If you don't have control over the part of your system that requires ints, then choose a precision that you care about, accept that you will simply have to lose some information sometimes (but it will always be "small" in some sense), and then just multiply all your floats by that constant, and round to the nearest integer.
By the way, if you're dealing with fractions rather than floats, then it's a different game. If you have a bunch of fractions a/b, c/d, e/f in lowest terms, and you want a least common multiplier N such that N*(each fraction) = an integer, then N = lcm(b, d, f), where lcm(b, d, f) = lcm(b, lcm(d, f)) and lcm(x, y) = x*y / gcd(x, y). You can use Euclid's algorithm to find the gcd of any two numbers.
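For exact fractions, the smallest integer multiplier is the least common multiple of the reduced denominators. Here is a short Python sketch of that, with made-up function names; Python's Fraction type already stores every fraction in lowest terms, and math.gcd is Euclid's algorithm under the hood.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(x: int, y: int) -> int:
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)

def least_common_multiplier(fractions) -> int:
    """Smallest positive N such that N * f is an integer for every f."""
    return reduce(lcm, (f.denominator for f in fractions), 1)

if __name__ == "__main__":
    fs = [Fraction(1, 4), Fraction(5, 6), Fraction(7, 10)]
    n = least_common_multiplier(fs)
    print(n, [f * n for f in fs])
```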
Greg: Nice solution, but won't calculating a GCD that's common in an array of 100+ numbers get a bit expensive? And how would you go about that? It's easy to do GCD for two numbers but for 100 it becomes more complex (I think).
Evil Andy: I'm programming in .Net and the solution you pose is pretty much a match for what we do now. I didn't want to include it in my original question because I was hoping for some outside the box (or my box anyway) thinking and I didn't want to taint people's answers with a potential solution. While I don't have any solid performance statistics (because I haven't had any other method to compare it against) I know the string parsing would be relatively expensive and I figured a purely mathematical solution could potentially be more efficient.
To be fair the current string parsing solution is in production and there have been no complaints about its performance yet (it's even in production in a separate system in a VB6 format and no complaints there either). It's just that it doesn't feel right, I guess it offends my programming sensibilities - but it may well be the best solution.
That said I'm still open to any other solutions, purely mathematical or otherwise.
What language are you programming in? Something like
myNumber.ToString().Substring(myNumber.ToString().IndexOf(".")+1).Length
would give you the number of decimal places for a double in C#. You could run each number through that and find the largest number of decimal places(x), then multiply each number by 10 to the power of x.
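The same string-based idea, sketched in Python for comparison. This is an illustration under the assumption that the numbers are available as plain decimal strings; the function names are made up, and trailing zeros are deliberately ignored so that "2.500" counts as one decimal place.

```python
from decimal import Decimal

def decimal_places(text: str) -> int:
    """Digits after the decimal point in a plain decimal string such as '2.125'."""
    exponent = Decimal(text).normalize().as_tuple().exponent
    return max(0, -exponent)   # whole numbers have a non-negative exponent

def power_of_ten_multiplier(texts) -> int:
    """10 to the power of the largest number of decimal places seen."""
    return 10 ** max(decimal_places(t) for t in texts)

if __name__ == "__main__":
    print(power_of_ten_multiplier(["0.5", "0.125", "3"]))
```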
Edit: Out of curiosity, what is this sealed system which you can pass only integers to?
In a loop, get the mantissa and exponent of each number as integers. You can use frexp for the exponent, but I think a bit mask will be required for the mantissa. Find the minimal exponent. Find the most significant digits in the mantissa (loop through the bits looking for the last "1") - or simply use a predefined number of significant digits.
Your multiple is then something like 2^(numberOfDigits-minMantissa). "Something like" because I don't remember the biases/offsets/ranges, but I think the idea is clear enough.
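To get a feel for what a power-of-two multiplier looks like in practice, here is a Python sketch. It does not do the bit masking described above; it swaps in float.as_integer_ratio(), which exposes the same information (the exact binary fraction stored in the float). The function name is made up.

```python
def binary_multiplier(values) -> int:
    """Smallest power of two that turns every float in `values` into an integer."""
    # as_integer_ratio() returns the exact fraction stored in the float; its
    # denominator is always a power of two, so the largest one covers them all.
    return max(v.as_integer_ratio()[1] for v in values)

if __name__ == "__main__":
    print(binary_multiplier([0.5, 0.75, 3.0]))   # values exact in binary
    print(binary_multiplier([0.1]))              # a value that is not
```

Values such as 0.1 have no finite binary expansion, so their exact multiplier is enormous (2**55 for 0.1); that is why the answer's fallback of a predefined number of significant digits matters.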
So basically you want to determine the number of digits after the decimal point for each number.
This would be rather easier if you had the binary representation of the number. Are the numbers being converted from rationals or scientific notation earlier in your program? If so, you could skip the earlier conversion and have a much easier time. Otherwise you might want to pass each number to a function in an external DLL written in C, where you could work with the floating point representation directly. Or you could cast the numbers to decimal and do some work with Decimal.GetBits.
The fastest approach I can think of in-place and following your conditions would be to find the smallest necessary power-of-ten (or 2, or whatever) as suggested before. But instead of doing it in a loop, save some computation by doing binary search on the possible powers. Assuming a maximum of 8, something like:
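int NumDecimals( double d )
{
    // make d positive for clarity; it won't change the result
    if( d<0 ) d=-d;
    // now do binary search on the possible numbers of post-decimal digits to
    // determine the actual number as quickly as possible
    // (the multiplier is then 10 to the power of the returned value):
    if( NeedsMore( d, 1e4 ) )
    {
        // more than 4 decimals
        if( NeedsMore( d, 1e6 ) )
        {
            // > 6 decimal places
            if( NeedsMore( d, 1e7 ) ) return 8;
            return 7;
        }
        else
        {
            // <= 6 decimal places
            if( NeedsMore( d, 1e5 ) ) return 6;
            return 5;
        }
    }
    else
    {
        // <= 4 decimal places: same pattern for the lower half
        if( NeedsMore( d, 1e2 ) )
        {
            // 3 or 4 decimal places
            if( NeedsMore( d, 1e3 ) ) return 4;
            return 3;
        }
        else
        {
            // <= 2 decimal places
            if( NeedsMore( d, 1e1 ) ) return 2;
            if( NeedsMore( d, 1e0 ) ) return 1;
            return 0;
        }
    }
}
bool NeedsMore( double d, double e )
{
    // check whether the representation of d has more decimal places than the
    // power of 10 represented in e.
    // note: d*e is computed in binary floating point, so a value that looks
    // exact in decimal can still leave a tiny non-zero remainder here.
    return (d*e - Math.Floor( d*e )) > 0;
}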
PS: you wouldn't be passing security prices to an option pricing engine would you? It has exactly the flavor...