I was wondering how I can get the best precision in Ruby. Someone told me that the best precision is probably between 0 and 1, because as you go to larger numbers the step between representable values increases as well.
I suppose a way to find out would be to know what the minimum float number is and what the next float number is; then the precision would be the difference, right? If I'm correct, how could I do this in Ruby?
I am not sure how to use this http://ruby.wikia.com/wiki/Float to find that information.
Any help appreciated.
In terms of significant digits, the precision is the same regardless of scale. That is, if you scale your range from [0.0, 1000.0] down to [0.0, 1.0] just by dividing numbers in the natural range by 1000.0, this will have no discernible effect on the precision of your values. In fact, the larger range has marginally more representable values, since it fully contains the smaller range.
As for discovering the absolute precision, you have two problems:
The absolute precision depends on the magnitude, which varies "infinitely" within the range [0, 1] (log(x) → -∞ as x → 0). So there is no one precision for numbers in that range. You can only derive absolute precision at a given point in the range.
The common technique for discovering the minimum step — known as the ulp — is to interpret the bit-representation of the float as an integer, increment it by one, and reinterpret the result as a float. Ruby doesn't, AFAIK, let you do this.
There is, however, an iterative solution. Simply add 1.0 to the number and subtract ((x + 1.0) - x). If the difference is zero, double the addend ((x + 2.0) - x) and repeat until the difference is non-zero. Otherwise, halve the addend (to 0.5) and repeat until the difference is zero. Whenever you stop, the lowest addend that produces a non-zero difference is the ulp. (I described this from vague memory, so it might be NQR.)
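A rough Ruby rendering of that search (a sketch of my own; like the description above, it only pins the ulp down to the nearest power of two):

def approx_ulp(x)
  step = 1.0
  if (x + step) - x == 0.0
    step *= 2.0 while (x + step) - x == 0.0          # grow until the addition becomes visible
  else
    step /= 2.0 while (x + step / 2.0) - x != 0.0    # shrink while it is still visible
  end
  step   # smallest power-of-two addend that still changes x
end

approx_ulp(1.0)      # => 2.220446049250313e-16, i.e. Float::EPSILON
approx_ulp(1.0e12)   # => a correspondingly larger step

(If memory serves, Ruby 2.2 and later also provide Float#next_float, so x.next_float - x gives the exact ulp without any searching.)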
You can use the Rational class - it stores non-integer numbers as a fraction of two Integers, which (as far as I know) will be automatically converted to Bignum when needed.
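A tiny illustration of the difference (my own example):

0.1 + 0.2                           # => 0.30000000000000004 (binary Float rounding)
Rational(1, 10) + Rational(2, 10)   # => (3/10), exact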
The Flt ruby library provides arbitrary floating point precision.
Related
This is something that's been on my mind for years, but I never took the time to ask before.
Many (pseudo) random number generators generate a random number between 0.0 and 1.0. Mathematically there are infinite numbers in this range, but double is a floating point number, and therefore has a finite precision.
So the questions are:
Just how many double numbers are there between 0.0 and 1.0?
Are there just as many numbers between 1 and 2? Between 100 and 101? Between 10^100 and 10^100+1?
Note: if it makes a difference, I'm interested in Java's definition of double in particular.
Java doubles are in IEEE-754 format, therefore they have a 52-bit fraction; between any two adjacent powers of two (inclusive of one and exclusive of the next one), there will therefore be 2 to the 52nd power different doubles (i.e., 4503599627370496 of them). For example, that's the number of distinct doubles between 0.5 included and 1.0 excluded, and exactly that many also lie between 1.0 included and 2.0 excluded, and so forth.
Counting the doubles between 0.0 and 1.0 is harder than doing so between powers of two, because there are many powers of two included in that range, and also one gets into the thorny issues of denormalized numbers. Ten of the eleven exponent bits cover the range in question, so, including denormalized numbers (and, I think, a few kinds of NaN), you'd have 1024 times as many doubles as lie between adjacent powers of two -- no more than 2**62 in total anyway. Excluding denormalized numbers etc., I believe the count would be 1023 times 2**52.
For an arbitrary range like "100 to 100.1" it's even harder because the upper bound cannot be exactly represented as a double (not being an exact multiple of any power of two). As a handy approximation, since the progression between powers of two is linear, you could say that said range is 0.1 / 64th of the span between the surrounding powers of two (64 and 128), so you'd expect about
(0.1 / 64) * 2**52
distinct doubles -- which comes to 7036874417766.4004... give or take one or two;-).
Every double value whose representation is between 0x0000000000000000 and 0x3ff0000000000000 lies in the interval [0.0, 1.0]. That's (2^62 - 2^52) distinct values (plus or minus a couple depending on whether you count the endpoints).
The interval [1.0, 2.0] corresponds to representations between 0x3ff0000000000000 and 0x4000000000000000; that's 2^52 distinct values.
The interval [100.0, 101.0] corresponds to representations between 0x4059000000000000 and 0x4059400000000000; that's 2^46 distinct values.
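Those counts can be cross-checked from Ruby (the language of the earlier question) by reinterpreting each endpoint's bit pattern as a 64-bit integer and subtracting; a small sketch (the bits helper is my own):

def bits(x)
  [x].pack('G').unpack('Q>').first   # IEEE-754 double, big-endian, reread as an unsigned 64-bit integer
end

bits(2.0)   - bits(1.0)     # => 4503599627370496    (2**52)
bits(101.0) - bits(100.0)   # => 70368744177664      (2**46)
bits(1.0)   - bits(0.0)     # => 4607182418800017408 (2**62 - 2**52)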
There are no doubles between 10^100 and 10^100 + 1. Neither one of those numbers is representable in double precision, and there are no doubles that fall between them. The closest two double precision numbers are:
99999999999999982163600188718701095...
and
10000000000000000159028911097599180...
Others have already explained that there are around 2^62 doubles in the range [0.0, 1.0].
(Not really surprising: there are almost 2^64 distinct finite doubles; of those, half are positive, and roughly half of those are < 1.0.)
But you mention random number generators: note that a random number generator generating numbers between 0.0 and 1.0 cannot in general produce all these numbers; typically it'll only produce numbers of the form n/2^53 with n an integer (see e.g. the Java documentation for nextDouble). So there are usually only around 2^53 (+/-1, depending on which endpoints are included) possible values for the random() output. This means that most doubles in [0.0, 1.0] will never be generated.
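For what it's worth, Ruby's Kernel#rand appears to behave the same way; a quick, hedged check (this assumes MRI's 53-bit generator):

1_000.times.all? { (rand * 2**53) % 1 == 0 }   # => true if every result is of the form n / 2**53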
The article Java's new math, Part 2: Floating-point numbers from IBM offers the following code snippet to solve this (in floats, but I suspect it works for doubles as well):
public class FloatCounter {
    public static void main(String[] args) {
        float x = 1.0F;
        int numFloats = 0;
        while (x <= 2.0) {
            numFloats++;
            System.out.println(x);
            x = Math.nextUp(x);
        }
        System.out.println(numFloats);
    }
}
They have this comment about it:
It turns out there are exactly 8,388,609 floats between 1.0 and 2.0 inclusive; large but hardly the uncountable infinity of real numbers that exist in this range. Successive numbers are about 0.0000001 apart. This distance is called an ULP for unit of least precision or unit in the last place.
2^53, from the size of the significand/mantissa of a 64-bit floating-point number, including the hidden bit.
Roughly yes, as the significand is fixed but the exponent changes.
See the wikipedia article for more information.
The Java double is an IEEE 754 binary64 number.
This means that we need to consider:
The mantissa is 52 bits
The exponent is an 11-bit number with a 1023 bias (i.e. with 1023 added to it)
If the exponent is all zeros and the mantissa is non-zero, then the number is said to be non-normalized
This basically means there is a total of 2^62 - 2^52 + 1 possible double representations that, according to the standard, lie between 0 and 1. Note that the 2^52 + 1 adjustment removes the cases of the non-normalized numbers.
Remember that if the mantissa is positive but the exponent is negative, the number is positive but less than 1 :-)
For other ranges it is a bit harder, because the endpoint integers may not be exactly representable in IEEE 754, and because more of the exponent's range is spent on larger magnitudes, so the larger the numbers, the fewer distinct values lie between them.
Could someone explain how calculators (such as Casio pocket ones) handle expressions such as '500/12' and are able to return '125/3' as the result? Alternatively, can someone name some algorithms which do this?
By imprecise numbers I mean numbers which cannot be represented in a fixed number of decimal places, such as 0.333 recurring.
Windows calculator is able to demonstrate this, if you perform '1/3' you will get '0.3333333333333333' as the answer, but then if you multiply this by 3 you will arrive back at '1'.
My HP's fraction display lets you set several modes for fraction display:
Set a maximum denominator. The displayed fraction is the n/d closest to the internal floating-point value without d exceeding the maximum. For example, if the maximum is set to 10, the floating-point number for pi is nearest the fraction 22/7. However, if the maximum is 1000, then the nearest fraction is 355/113.
Set an exact denominator and reduce the result. The displayed fraction is the n/d closest to the internal floating-point value where d is equal to the exact denominator. Having computed n, the fraction is then reduced by the greatest common divisor. For example, if the denominator is fixed to be 32, then the floating-point number 0.51 is nearest to 16/32, which gets reduced to 1/2. Likewise, the floating-point number 0.516 is nearest to 17/32, which is irreducible.
Set an exact denominator and do not reduce the result. For example, 0.51 is shown as 16/32, an unreduced fraction.
The algorithm for the maximum-denominator approach uses continued fractions. An easy to follow example in Python can be found in the limit_denominator method at http://hg.python.org/cpython/file/2.7/Lib/fractions.py#l206 .
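For reference, here is a rough Ruby port of that Python method (limit_denominator is CPython's name, not something in Ruby's standard library; treat this as a sketch of the continued-fraction technique rather than production code):

# Find the fraction closest to x whose denominator does not exceed max_den.
def limit_denominator(x, max_den)
  r = x.to_r
  return r if r.denominator <= max_den

  p0, q0, p1, q1 = 0, 1, 1, 0
  n, d = r.numerator, r.denominator
  loop do
    a = n / d                     # next continued-fraction term (integer division)
    q2 = q0 + a * q1
    break if q2 > max_den
    p0, q0, p1, q1 = p1, q1, p0 + a * p1, q2
    n, d = d, n - a * d
  end

  k = (max_den - q0) / q1
  candidate1 = Rational(p0 + k * p1, q0 + k * q1)
  candidate2 = Rational(p1, q1)
  (candidate2 - r).abs <= (candidate1 - r).abs ? candidate2 : candidate1
end

limit_denominator(Math::PI, 10)     # => (22/7)
limit_denominator(Math::PI, 1000)   # => (355/113)
limit_denominator(500.0 / 12, 100)  # => (125/3), as in the question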
The method for the exact-denominator approach is easier. Given a denominator d and a floating point number x, the numerator is just d * x rounded to the nearest integer. Then reduce the fraction n/d by computing the greatest common divisor.
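A minimal Ruby sketch of that exact-denominator method (the helper name is my own):

# Round d * x to the nearest integer numerator; Rational then reduces by the gcd for us.
def snap_to_denominator(x, d)
  Rational((x * d).round, d)
end

snap_to_denominator(0.51, 32)    # => (1/2), i.e. 16/32 reduced
snap_to_denominator(0.516, 32)   # => (17/32), already irreducible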
Optionally, the original floating point number can be replaced by the displayed fraction. This is known as a snap-to-grid. That way, you can enter 0.333 to create a fraction that is exactly equal to 1/3. This lets you do exact fractional arithmetic without round-off.
Hope this answer clears everything up for you :-) Let me know if any part needs elaboration or further explanation.
I'd suggest you look at the GMP library's rational number functions. At some point, you will need to accept finite precision in your calculations, unless the sequence of operations is particularly simple. The irrationals (transcendental functions / constants) can only be approximated, e.g., as continued fractions.
I would like to know the best way to get a pseudorandom float number in a closed interval using Ruby's Kernel#rand function (not the Random module, please).
To take an example I will use the closed interval [0.0, 7.7] (both 0.0 and 7.7 included in the interval), but any other float interval should be valid too.
For the interval [0.0, 7.7] the following solution is not valid:
rand * 7.7
Why?
If you call rand without arguments you will get a pseudorandom floating-point number greater than or equal to 0.0 and less than 1.0. So what is the range of float numbers that the previous solution can give us?
rand will return a pseudorandom float number in the range [0.0, 0.9999999...]
0.0 * 7.7
=> 0.0 # Correct!
0.9999999 * 7.7
=> 7.69999923 # Incorrect!
The interval does not match with [0.0, 7.7].
Does anyone know an elegant solution to this problem?
Thank you!
There's a Random class that can do what you want:
generator = Random.new # You need to instance it
generator.rand 0.0..7.7
(The documentation states the difference between 0.0..7.7 and 0.0...7.7 will be taken into account.)
In the upcoming 1.9.3, you'll be able to pass a range to Kernel#rand and Random.rand (you can already do that in the preview version).
I would do something like this:
Fineness = 2**64
puts rand(Fineness+1)*7.7/Fineness
Whenever rand returns its maximum possible value, you will get Fineness*7.7/Fineness which turns out to equal 7.7 exactly (but I'm not totally sure this will always be the case, because floats are inexact).
As long as Fineness has more bits in it than a double on your computer, then I believe you will not notice any strangeness in the distribution of your results.
How about:
(rand/0.9999999999999999...)*7.7
Basically, normalize the random number by the largest possible random number. That way you will create the range [0..1].
However, I am unsure how to get that maximum number, the largest float that is less than 1.0, in Ruby.
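For what it's worth, newer Rubies (2.2 and later, as far as I know) expose that value directly via Float#prev_float:

max_below_one = 1.0.prev_float   # largest double strictly below 1.0
max_below_one                    # => 0.9999999999999999
(rand / max_below_one) * 7.7     # can reach 7.7 exactly when rand returns its maximum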
Why do you need this? I don't know of a case where there would be a need for this as a true single or double precision number. On the other hand, there are real cases where you might need numbers between 0.0 and 7.7 in increments of 0.1. In that case you could use well established techniques to go from 0 to 77 and then divide by 10.
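A sketch of that 0-to-77 idea in Ruby:

steps = 77               # number of 0.1 increments in [0.0, 7.7]
rand(steps + 1) / 10.0   # => one of 0.0, 0.1, 0.2, ..., 7.7; both endpoints are reachable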
Depending on the number of digits of precision you need, you could also use a round-half-to-even approach to snap values to the boundaries of the interval. Hope this helps.
Here is the text from Wikipedia
Round half to even
A tie-breaking rule that is even less biased is round half to even, namely:
If the fraction of y is 0.5, then q is the even integer nearest to y.
Thus, for example, +23.5 becomes +24, +22.5 becomes +22, −22.5 becomes −22, and −23.5 becomes −24.
This method also treats positive and negative values symmetrically, and therefore is free of overall bias if the original numbers are positive or negative with equal probability. In addition, for most reasonable distributions of y values, the expected (average) value of the rounded numbers is essentially the same as that of the original numbers, even if the latter are all positive (or all negative). However, this rule will still introduce a positive bias for even numbers (including zero), and a negative bias for the odd ones.
This variant of the round-to-nearest method is also called unbiased rounding (ambiguously, and a bit abusively), convergent rounding, statistician's rounding, Dutch rounding, Gaussian rounding, or bankers' rounding. It is widely used in bookkeeping.
This is the default rounding mode used in IEEE 754 computing functions and operators.
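As far as I know, Ruby 2.4 and later expose this tie-breaking rule directly through round's half: keyword:

2.5.round(half: :even)    # => 2
3.5.round(half: :even)    # => 4
-2.5.round(half: :even)   # => -2
2.5.round                 # => 3 (the default rounds halves away from zero)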
I am wondering if this is true: When I take the square root of a squared integer, like in
f = Math.sqrt(123*123)
I will get a floating point number very close to 123. Due to floating point representation precision, this could be something like 122.99999999999999999999 or 123.000000000000000000001.
Since floor(122.999999999999999999) is 122, I should get 122 instead of 123. So I expect that floor(sqrt(i*i)) == i-1 in about 50% of the cases. Strangely, for all the numbers I have tested, floor(sqrt(i*i)) == i. Here is a small Ruby script to test the first 100 million numbers:
100_000_000.times do |i|
  puts i if Math.sqrt(i*i).floor != i
end
The above script never prints anything. Why is that so?
UPDATE: Thanks for the quick reply; this seems to be the explanation. According to Wikipedia:
Any integer with absolute value less than or equal to 2^24 can be exactly represented in the single-precision format, and any integer with absolute value less than or equal to 2^53 can be exactly represented in the double-precision format.
Math.sqrt(i*i) starts to behave as I expected starting from i = 9007199254740993, which is 2^53 + 1.
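A quick check of that cutover (assuming, as explained below, that Math.sqrt is correctly rounded):

i = 2**53                      # 9007199254740992: still exactly representable as a double
Math.sqrt(i * i).floor == i    # => true
i = 2**53 + 1                  # 9007199254740993: not representable as a double
Math.sqrt(i * i).floor == i    # => false, the floor comes out one too small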
Here's the essence of your confusion:
Due to floating point representation
precision, this could be something
like 122.99999999999999999999 or
123.000000000000000000001.
This is false. It will always be exactly 123 on an IEEE-754 compliant system, which is almost all systems these days. Floating-point arithmetic does not have "random error" or "noise". It has precise, deterministic rounding, and many simple computations (like this one) do not incur any rounding at all.
123 is exactly representable in floating-point, and so is 123*123 (so are all modest-sized integers). So no rounding error occurs when you convert 123*123 to a floating-point type. The result is exactly 15129.
Square root is a correctly rounded operation, per the IEEE-754 standard. This means that if there is an exact answer, the square root function is required to produce it. Since you are taking the square root of exactly 15129, which is exactly 123, that's exactly the result you get from the square root function. No rounding or approximation occurs.
Now, for how large of an integer will this be true?
Double precision can exactly represent all integers up to 2^53. So as long as i*i is less than 2^53, no rounding will occur in your computation, and the result will be exact for that reason. This means that for all i smaller than 94906265, we know the computation will be exact.
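A quick sanity check of that threshold (94906265 is the integer part of sqrt(2^53)):

94906265**2 < 2**53   # => true, i*i still fits in 53 bits
94906266**2 < 2**53   # => false, one step further and rounding can kick in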
But you tried i larger than that! What's happening?
For the largest i that you tried, i*i is just barely larger than 2^53 (1.1102... * 2^53, actually). Because conversions from integer to double (or multiplication in double) are also correctly rounded operations, i*i will be the representable value closest to the exact square of i. In this case, since i*i is 54 bits wide, the rounding will happen in the very lowest bit. Thus we know that:
i*i as a double = the exact value of i*i + rounding
where rounding is either -1, 0, or 1. If rounding is zero, then the square is exact, so the square root is exact, so we already know you get the right answer. Let's ignore that case.
So now we're looking at the square root of i*i +/- 1. Using a Taylor series expansion, the infinitely precise (unrounded) value of this square root is:
i * (1 +/- 1/(2i^2) + O(1/i^4))
Now this is a bit fiddly to see if you haven't done any floating point error analysis before, but if you use the fact that i^2 > 2^53, you can see that the:
1/(2i^2) + O(1/i^4)
term is smaller than 2^-54. Since square root is correctly rounded, and the exact square root therefore lies within half an ulp of i, the rounded result of the sqrt function is exactly i.
It turns out that (with a similar analysis), for any exactly representable floating point number x, sqrt(x*x) is exactly x (assuming that the intermediate computation of x*x doesn't over- or underflow), so the only way you can encounter rounding for this type of computation is in the representation of x itself, which is why you see it starting at 2^53 + 1 (the smallest unrepresentable integer).
For "small" integers, there is usually an exact floating-point representation.
It's not too hard to find cases where this breaks down as you'd expect:
Math.sqrt(94949493295293425**2).floor
# => 94949493295293424
Math.sqrt(94949493295293426**2).floor
# => 94949493295293424
Math.sqrt(94949493295293427**2).floor
# => 94949493295293424
Ruby's Float is a double-precision floating-point number, which means that it can accurately represent numbers with (as a rule of thumb) about 16 significant decimal digits. For regular single-precision floating-point numbers it's about 7 significant digits.
You can find more information here:
What Every Computer Scientist Should Know About Floating-Point Arithmetic:
http://docs.sun.com/source/819-3693/ncg_goldberg.html
This is more of a numerical analysis rather than programming question, but I suppose some of you will be able to answer it.
In the sum of two floats, is there any precision lost? Why?
In the sum of a float and an integer, is there any precision lost? Why?
Thanks.
In the sum of two floats, is there any precision lost?
If both floats have differing magnitude and both are using the complete precision range (of about 7 decimal digits) then yes, you will see some loss in the last places.
Why?
This is because floats are stored in the form (sign) × (mantissa) × 2^(exponent). If two values have differing exponents and you add them, then the smaller value will get reduced to fewer digits in the mantissa (because it has to adapt to the larger exponent):
PS> [float]([float]0.0000001 + [float]1)
1
In the sum of a float and an integer, is there any precision lost?
Yes, a normal 32-bit integer can exactly represent values which do not fit exactly into a float. A float can still store approximately the same number, but no longer exactly. Of course, this only applies to numbers that are large enough, i.e. wider than 24 bits.
Why?
Because float has 24 bits of precision and (32-bit) integers have 32. float will still be able to retain the magnitude and most of the significant digits, but the last places may likely differ:
PS> [float]2100000050 + [float]100
2100000100
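The same two effects can be reproduced in Ruby, where Float is a 64-bit double (examples of my own):

1.0e-16 + 1.0            # => 1.0, the small addend falls below 1.0's ulp and vanishes
(2**53 + 1.0) == 2**53   # => true, the extra 1 is too fine for the 53-bit significand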
The precision depends on the magnitude of the original numbers. In floating point, the computer represents the number 312 internally as scientific notation:
3.12000000000 * 10 ^ 2
The number of decimal places in the mantissa (the left-hand side) is fixed. The exponent also has an upper and lower bound. This allows it to represent very large or very small numbers.
If you try to add two numbers which are the same in magnitude, the result should remain the same in precision, because the decimal point doesn't have to move:
312.0 + 643.0 <==>
3.12000000000 * 10 ^ 2 +
6.43000000000 * 10 ^ 2
-----------------------
9.55000000000 * 10 ^ 2
If you tried to add a very big and a very small number, you would lose precision because they must be squeezed into the above format. Consider 312 + 12,300,000,000,000. First you have to scale the smaller number to line up with the bigger one, then add:
1.23000000000 * 10 ^ 13 +
0.00000000003 * 10 ^ 13
-----------------------
1.23000000003 * 10 ^ 13 <-- precision lost here!
Floating point can handle very large, or very small numbers. But it can't represent both at the same time.
As for ints and doubles being added, the int gets turned into a double immediately, then the above applies.
When adding two floating point numbers, there is generally some error. D. Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" describes the effect and the reasons in detail, and also how to calculate an upper bound on the error, and how to reason about the precision of more complex calculations.
When adding a float to an integer, the integer is first converted to a float by C++, so two floats are being added and error is introduced for the same reasons as above.
The precision available for a float is limited, so of course there is always the risk that any given operation drops precision.
The answer for both your questions is "yes".
If you try adding a very large float to a very small one, you will for instance have problems.
Or if you try to add an integer to a float, where the integer uses more bits than the float has available for its mantissa.
The short answer: a computer represents a float with a limited number of bits, which is often done with mantissa and exponent, so only a few bytes are used for the significant digits, and the others are used to represent the position of the decimal point.
If you were to try to add (say) 10^23 and 7, then it won't be able to accurately represent that result. A similar argument applies when adding a float and integer -- the integer will be promoted to a float.
In the sum of two floats, is there any precision lost?
In the sum of a float and an integer, is there any precision lost? Why?
Not always. If the sum is representable with the available precision, you won't get any precision loss.
Example: 0.5 + 0.75 => no precision loss
x * 0.5 => no precision loss (unless x is so small that the result underflows)
In the general case, one adds floats of slightly different magnitudes, so there is some precision loss, which actually depends on the rounding mode; i.e. if you're adding numbers with totally different magnitudes, expect precision problems.
Denormals are there to give extra precision in extreme cases, at the expense of CPU time.
Depending on how your compiler handles floating-point computation, results can vary.
With strict IEEE semantics, adding two 32-bit floats should not give better accuracy than 32 bits.
In practice it may require more instructions to ensure that, so you shouldn't rely on accurate and repeatable results with floating point.
In both cases yes:
assert( 1E+36f + 1.0f == 1E+36f );
assert( 1E+36f + 1 == 1E+36f );
The case float + int is the same as float + float, because a standard conversion is applied to the int. In the case of float + float, this is implementation dependent, because an implementation may choose to do the addition at double precision. There may be some loss when you store the result, of course.
In both cases, the answer is "yes". When adding an int to a float, the integer is converted to floating point representation before the addition takes place anyway.
To understand why, I suggest you read this gem: What Every Computer Scientist Should Know About Floating-Point Arithmetic.