Follow-up to this question:
I want to calculate 1/1048576 and get the correct result, i.e. 0.00000095367431640625.
Using BigDecimal's / truncates the result:
require 'bigdecimal'
a = BigDecimal.new(1)
#=> #<BigDecimal:7fd8f18aaf80,'0.1E1',9(27)>
b = BigDecimal.new(2**20)
#=> #<BigDecimal:7fd8f189ed20,'0.1048576E7',9(27)>
n = a / b
#=> #<BigDecimal:7fd8f0898750,'0.9536743164 06E-6',18(36)>
n.to_s('F')
#=> "0.000000953674316406" <- should be ...625
This really surprised me, because I was under the impression that BigDecimal would just work.
To get the correct result, I have to use div with an explicit precision:
n = a.div(b, 100)
#=> #<BigDecimal:7fd8f29517a8,'0.9536743164 0625E-6',27(126)>
n.to_s('F')
#=> "0.00000095367431640625" <- correct
But I don't really understand that precision argument. Why do I have to specify it and what value do I have to use to get un-truncated results?
Does this even qualify as "arbitrary-precision floating point decimal arithmetic"?
Furthermore, if I calculate the above value via:
a = BigDecimal.new(5**20)
#=> #<BigDecimal:7fd8f20ab7e8,'0.9536743164 0625E14',18(27)>
b = BigDecimal.new(10**20)
#=> #<BigDecimal:7fd8f2925ab8,'0.1E21',9(36)>
n = a / b
#=> #<BigDecimal:7fd8f4866148,'0.9536743164 0625E-6',27(54)>
n.to_s('F')
#=> "0.00000095367431640625"
I do get the correct result. Why?
BigDecimal can perform arbitrary-precision floating point decimal arithmetic; however, it cannot automatically determine the "correct" precision for a given calculation.
For example, consider
BigDecimal.new(1)/BigDecimal.new(3)
# <BigDecimal:1cfd748, '0.3333333333 33333333E0', 18(36)>
Arguably, there is no correct precision in this case; the right value to use depends on the accuracy required in your calculations. It's worth noting that in a mathematical sense†, almost all whole number divisions result in a number with an infinite decimal expansion, thus requiring rounding. A fraction only has a finite representation if, after reducing it to lowest terms, the denominator's only prime factors are 2 and 5.
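To check that rule programmatically, here is a minimal sketch (terminating? is a made-up helper name):
# A fraction has a finite decimal expansion only if, in lowest terms,
# its denominator has no prime factors other than 2 and 5.
def terminating?(numerator, denominator)
  d = Rational(numerator, denominator).denominator  # Rational reduces to lowest terms
  d /= 2 while d % 2 == 0
  d /= 5 while d % 5 == 0
  d == 1
end

terminating?(1, 1048576)  # => true  (1048576 = 2**20)
terminating?(1, 3)        # => false (0.333... forever)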
So you have to specify the precision. Unfortunately the precision argument is a little weird, because it seems to be both the number of significant digits and the number of digits after the decimal point. Here's 1/1048576 for varying precision:
 1  0.000001
 2  0.00000095
 3  0.000000953
 9  0.000000953
10  0.0000009536743164
11  0.00000095367431641
12  0.000000953674316406
18  0.000000953674316406
19  0.00000095367431640625
For any precision value less than 10, BigDecimal truncates the result to 9 digits, which is why you get a sudden spike in accuracy at precision 10: at that point it switches to truncating to 18 digits (and then rounds to 10 significant digits). That also explains your second example: the default precision / uses grows with the internal precision of its operands (compare the 18(36) and 27(54) in the two inspect outputs), and since BigDecimal.new(5**20) already needs 14 significant digits to store, the quotient is allocated enough digits to come out exact.
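For reference, the table can be reproduced with a loop along these lines (a sketch, using the question's operands):
require 'bigdecimal'
a = BigDecimal.new(1)
b = BigDecimal.new(2**20)
(1..19).each do |prec|
  puts "#{prec}\t#{a.div(b, prec).to_s('F')}"
end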
† Depending on how comfortable you are comparing the sizes of countably infinite sets.
I want to multiply more than 10K probability estimates (values between 0 and 1).
I am using Ruby, and I used BigDecimal to store the small numbers, like:
prod = BigDecimal.new("1")
prod = prod * BigDecimal.new("#{ngramlm[key]}")
but after a few iterations prod becomes zero. Could you please help me store the final product in prod (which will be a very small number, close to zero)?
What you are describing sounds like a typical case for using log probabilities (http://en.wikipedia.org/wiki/Log_probability). Use of log(y)=log(x1)+log(x2) instead of y=x1*x2 (turn your multiplications into additions of log probabilities) will result in improved speed and numerical stability.
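A minimal sketch of that idea, assuming ngramlm is the asker's Hash of probability estimates:
# Sum log-probabilities instead of multiplying probabilities:
# log(x1 * x2) == log(x1) + log(x2), and a sum of logs does not underflow to zero.
log_prod = 0.0
ngramlm.each_value do |p|
  log_prod += Math.log(p.to_f)
end
# log_prod holds the log of the product; only convert back if you really must,
# because Math.exp(log_prod) will underflow to 0.0 for very small products.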
You may use Ruby's native Rational class. A rational number is represented as a pair of integers, a/b (b > 0).
e.g.
Rational(0.3) #=> (5404319552844595/18014398509481984)
Rational('0.3') #=> (3/10)
Rational('2/3') #=> (2/3)
0.3.to_r #=> (5404319552844595/18014398509481984)
'0.3'.to_r #=> (3/10)
'2/3'.to_r #=> (2/3)
0.3.rationalize #=> (3/10)
So if your numbers are converted to rationals, you keep full precision: multiplying a rational by a rational gives an exact rational. E.g.
Rational(2, 3) * Rational(2, 3) #=> (4/9)
Rational(900) * Rational(1) #=> (900/1)
Rational(-2, 9) * Rational(-9, 2) #=> (1/1)
Rational(9, 8) * 4 #=> (9/2)
So you are basically multiplying integers in the numerator and the denominator, and that is exact.
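Applied to the running product from the question, a sketch (assuming ngramlm is the Hash of probability estimates):
prod = Rational(1)
ngramlm.each_value do |p|
  prod *= p.to_s.to_r   # parses the decimal string exactly, e.g. "0.3" -> (3/10)
end
# prod is exact, although its numerator and denominator can grow very large
# over 10K multiplications.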
Somehow I can't find the answer to this using google or SO...
Consider:
require 'bigdecimal'
puts (BigDecimal.new(1)/BigDecimal.new(3)).to_s
#=> 0.333333333333333333E0
I want the ability to specify a precision of 100 or 200 or 1000, which would print out "0." followed by 100 threes, 200 threes, or 1000 threes, respectively.
How can I accomplish this? The answer should also work for non-repeating decimals, in which case the extra digits of precision would be filled with zeros.
Thanks!
I think the problem is that the BigDecimal objects don't have their precision set high enough. I can get 1000 fractional digits printed if I explicitly specify the required precision of the operation by using div instead of /:
require 'bigdecimal'
puts (BigDecimal.new(1).div(BigDecimal.new(3), 1000)).to_s
#=> 0.3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333E0
After that, you can limit the number of fractional digits with round.
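For example, a sketch combining div, round and to_s('F'):
require 'bigdecimal'
n = BigDecimal.new(1).div(BigDecimal.new(3), 1000)  # ask for plenty of fractional digits
puts n.round(100).to_s('F')                         # "0." followed by 100 threes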
I'm getting a strange behaviour with BigDecimal in ruby. Why does this print false?
require 'bigdecimal'
a = BigDecimal.new('100')
b = BigDecimal.new('5.1')
c = a / b
puts c * b == a #false
BigDecimal doesn't claim to have infinite precision; it just provides support for precisions outside the normal floating point ranges:
BigDecimal provides similar support for very large or very accurate floating point numbers.
But BigDecimal values still have a finite number of significant digits, hence the precs method:
precs
Returns an Array of two Integer values.
The first value is the current number of significant digits in the BigDecimal. The second value is the maximum number of significant digits for the BigDecimal.
You can see things starting to go awry if you look at your c:
>> c.to_s
=> "0.19607843137254901960784313725E2"
That's a nice, clean rational number, but BigDecimal doesn't know that; it is still stuck seeing c as a finite string of digits.
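Continuing with the question's a, b and c, a quick check:
puts((c * b).to_s)   # very close to 0.1E3, but not exactly 100
puts c * b == a      # => false, as in the question
puts c.precs.inspect # the pair of Integers described in the docs excerpt above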
If you use Rational instead, you'll get the results you're expecting:
>> a = Rational(100)
>> b = Rational(51, 10)
>> c * b == a
=> true
Of course, this trickery only applies if you are working with Rational numbers, so anything fancy (such as roots or trigonometry) is out of bounds.
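For example, a quick illustration of where the exactness stops:
Math.sqrt(Rational(2, 3))  # returns a plain Float, so the exactness is gone
Rational(2, 3) ** 2        # => (4/9), integer powers stay exact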
This is normal behaviour, and not at all strange.
BigDecimal does not guarantee infinite accuracy; it allows you to specify arbitrary accuracy, which is not the same thing. The value 100/5.1 cannot be expressed with complete precision using a floating point internal representation, no matter how many digits are used.
A "big rational" approach could achieve it - but would not give you access to some functions e.g. square roots.
See http://ruby-doc.org/core-1.9.3/Rational.html
# require 'rational' necessary only in Ruby 1.8
a = 100.to_r
b = '5.1'.to_r
c = a / b
c * b == a
# => true
Can somebody explain why multiplying by 100 here gives a less accurate result but multiplying by 10 twice gives a more accurate result?
± % sc
Loading development environment (Rails 3.0.1)
>> 129.95 * 100
12994.999999999998
>> 129.95*10
1299.5
>> 129.95*10*10
12995.0
If you do the calculations by hand in double-precision binary, which is limited to 53 significant bits, you'll see what's going on:
129.95 = 1.0000001111100110011001100110011001100110011001100110 x 2^7
129.95*100 = 1.1001011000010111111111111111111111111111111111111111011 x 2^13
This is 56 significant bits long, so rounded to 53 bits it's
1.1001011000010111111111111111111111111111111111111111 x 2^13, which equals
12994.999999999998181010596454143524169921875
Now 129.95*10 = 1.01000100110111111111111111111111111111111111111111111 x 2^10
This is 54 significant bits long, so rounded to 53 bits it's 1.01000100111 x 2^10 = 1299.5
Now 1299.5 * 10 = 1.1001011000011 x 2^13 = 12995.
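If you'd rather not do the binary arithmetic by hand, you can print the exact values from plain Ruby; a sketch:
puts format('%a', 129.95)          # the stored 53-bit significand, in hexadecimal
puts format('%a', 129.95 * 100)    # the rounded product from the walkthrough above
puts((129.95 * 100).to_r)          # its exact value as a Rational
puts 129.95 * 10 * 10 == 12995.0   # => true, matching the hand calculation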
First off: you are looking at the string representation of the result, not the actual result itself. If you really want to compare the two results, you should format both explicitly using String#%, and format them the same way.
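For example, a sketch:
puts '%.20f' % (129.95 * 100)      # shows the ...999999999998... tail
puts '%.20f' % (129.95 * 10 * 10)  # => 12995.00000000000000000000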
Secondly, that's just how binary floating point numbers work. They are inexact, they are finite, and they are binary. All three mean that you get rounding errors, which generally look totally random unless you happen to have memorized the entirety of IEEE 754 and can recite it backwards in your sleep.
There is no floating point number exactly equal to 129.95. So your language uses a value which is close to it instead. When that value is multiplied by 100, the result is close to 12995, but it just so happens to not equal 12995. (It is also not exactly equal to 100 times the original value it used in place of 129.95.) So your interpreter prints a decimal number which is close to (but not equal to) the value of 129.95 * 100 and which shows you that it is not exactly 12995. It also just so happens that the result 129.95 * 10 is exactly equal to 1299.5. This is mostly luck.
Bottom line is, never expect equality out of any floating point arithmetic, only "closeness".
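"Closeness" is usually expressed in code as a comparison against a small tolerance rather than ==; a minimal sketch (roughly_equal? is a made-up helper name):
# Compare floats with a relative tolerance instead of exact equality.
def roughly_equal?(x, y, relative_tolerance = 1e-9)
  (x - y).abs <= relative_tolerance * [x.abs, y.abs].max
end

roughly_equal?(129.95 * 100, 12995.0)  # => true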