Handling very small numbers in Ruby

I want to multiply more than 10K probability estimates (values between 0 and 1).
I am using Ruby, and I used BigDecimal to store the small numbers, like this:
prod = BigDecimal.new("1")
prod = prod * BigDecimal.new("#{ngramlm[key]}")
But after a few iterations prod becomes zero. How can I store the final product in prod (which would be a very small number, close to zero)?

What you are describing sounds like a typical case for using log probabilities (http://en.wikipedia.org/wiki/Log_probability). Use of log(y)=log(x1)+log(x2) instead of y=x1*x2 (turn your multiplications into additions of log probabilities) will result in improved speed and numerical stability.
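As a minimal sketch of that idea (the input values and the helper name log_product are made up for illustration, standing in for the ngramlm values from the question), the product can be accumulated as a sum of natural logarithms and converted back only for display:

```ruby
# Sum of natural logs of the probabilities; equals log(p1 * p2 * ... * pn).
# Assumes every probability is strictly greater than 0.
def log_product(probs)
  probs.sum { |p| Math.log(p) }
end

# Hypothetical input: 10,000 estimates of 0.001 each.
probs = Array.new(10_000, 0.001)

naive = probs.reduce(1.0) { |acc, p| acc * p } # underflows to 0.0
log_prod = log_product(probs)                   # about -69077.55

# For display only: rewrite the product as mantissa * 10**exponent.
log10 = log_prod / Math.log(10)
exponent = log10.floor
mantissa = 10**(log10 - exponent)
puts "naive: #{naive}, log: #{log_prod}, product ~ #{mantissa}e#{exponent}"
```

If you only need to compare products (for example, to pick the most likely sequence), you can compare the log sums directly and never convert back.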

You can use Ruby's built-in Rational class. A rational number is represented as a pair of integers, a/b (b > 0).
For example:
Rational(0.3) #=> (5404319552844595/18014398509481984)
Rational('0.3') #=> (3/10)
Rational('2/3') #=> (2/3)
0.3.to_r #=> (5404319552844595/18014398509481984)
'0.3'.to_r #=> (3/10)
'2/3'.to_r #=> (2/3)
0.3.rationalize #=> (3/10)
Once your numbers are converted to rationals, you keep full precision, because the product of two rationals is again an exact rational. E.g.
Rational(2, 3) * Rational(2, 3) #=> (4/9)
Rational(900) * Rational(1) #=> (900/1)
Rational(-2, 9) * Rational(-9, 2) #=> (1/1)
Rational(9, 8) * 4 #=> (9/2)
So you are essentially multiplying integers in the numerator and the denominator, and integer multiplication is exact.
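A minimal sketch of how this applies to the original problem, assuming the probabilities are available as decimal strings (the sample values and the name exact_product are hypothetical). Note that the numerator and denominator grow with every factor, so this may be noticeably slower than summing logarithms.

```ruby
# Multiply probabilities exactly by converting each one to a Rational first.
# Parsing from strings avoids the binary rounding seen in Rational(0.3).
def exact_product(prob_strings)
  prob_strings.map { |s| Rational(s) }.reduce(Rational(1), :*)
end

exact_product(%w[0.3 0.25 0.001])        #=> (3/40000)

tiny = exact_product(['0.001'] * 10_000) # 10,000 factors, still exact
tiny.zero?                               #=> false
tiny == Rational(1, 10**30_000)          #=> true
```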

Related

How to prevent BigDecimal from truncating results?

Follow up to this question:
I want to calculate 1/1048576 and get the correct result, i.e. 0.00000095367431640625.
Using BigDecimal's / truncates the result:
require 'bigdecimal'
a = BigDecimal.new(1)
#=> #<BigDecimal:7fd8f18aaf80,'0.1E1',9(27)>
b = BigDecimal.new(2**20)
#=> #<BigDecimal:7fd8f189ed20,'0.1048576E7',9(27)>
n = a / b
#=> #<BigDecimal:7fd8f0898750,'0.9536743164 06E-6',18(36)>
n.to_s('F')
#=> "0.000000953674316406" <- should be ...625
This really surprised me, because I was under the impression that BigDecimal would just work.
To get the correct result, I have to use div with an explicit precision:
n = a.div(b, 100)
#=> #<BigDecimal:7fd8f29517a8,'0.9536743164 0625E-6',27(126)>
n.to_s('F')
#=> "0.00000095367431640625" <- correct
But I don't really understand that precision argument. Why do I have to specify it and what value do I have to use to get un-truncated results?
Does this even qualify as "arbitrary-precision floating point decimal arithmetic"?
Furthermore, if I calculate the above value via:
a = BigDecimal.new(5**20)
#=> #<BigDecimal:7fd8f20ab7e8,'0.9536743164 0625E14',18(27)>
b = BigDecimal.new(10**20)
#=> #<BigDecimal:7fd8f2925ab8,'0.1E21',9(36)>
n = a / b
#=> #<BigDecimal:7fd8f4866148,'0.9536743164 0625E-6',27(54)>
n.to_s('F')
#=> "0.00000095367431640625"
I do get the correct result. Why?
BigDecimal can perform arbitrary-precision floating point decimal arithmetic; however, it cannot automatically determine the "correct" precision for a given calculation.
For example, consider
BigDecimal.new(1)/BigDecimal.new(3)
# <BigDecimal:1cfd748, '0.3333333333 33333333E0', 18(36)>
Arguably, there is no correct precision in this case; the right value to use depends on the accuracy required in your calculations. It's worth noting that in a mathematical sense†, almost all whole number divisions result in a number with an infinite decimal expansion, thus requiring rounding. A fraction only has a finite representation if, after reducing it to lowest terms, the denominator's only prime factors are 2 and 5.
So you have to specify the precision. Unfortunately, the precision argument is a little weird, because it seems to be both the number of significant digits and the number of digits after the decimal point. Here's 1/1048576 for varying precision:
precision  result
1          0.000001
2          0.00000095
3          0.000000953
9          0.000000953
10         0.0000009536743164
11         0.00000095367431641
12         0.000000953674316406
18         0.000000953674316406
19         0.00000095367431640625
For any value less than 10, BigDecimal truncates the result to 9 digits, which is why you get a sudden jump in accuracy at precision 10: at that point it switches to truncating to 18 digits (and then rounds to 10 significant digits).
† Depending on how comfortable you are comparing the sizes of countably infinite sets.
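To make the rule about prime factors concrete, here is a small sketch. The helper names terminating? and exact_decimal are made up for illustration, and the code uses the Kernel#BigDecimal() constructor, since recent versions of the bigdecimal library no longer provide BigDecimal.new. It checks whether a fraction has a finite decimal expansion and then divides with a precision that is simply assumed to be generous enough (100 digits here).

```ruby
require 'bigdecimal'

# True if num/den has a finite decimal expansion, i.e. the reduced
# denominator has no prime factors other than 2 and 5.
def terminating?(num, den)
  d = Rational(num, den).denominator
  d /= 2 while (d % 2).zero?
  d /= 5 while (d % 5).zero?
  d == 1
end

# Divide with an explicit precision (significant digits) and print plainly.
def exact_decimal(num, den, digits = 100)
  BigDecimal(num).div(BigDecimal(den), digits).to_s('F')
end

terminating?(1, 2**20)   #=> true
terminating?(1, 3)       #=> false
exact_decimal(1, 2**20)  #=> "0.00000095367431640625"
```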

Is there an IDIOMATIC way to get a random Fixnum in Ruby?

I'm playing with an algorithm which uses random numbers. It would be nice to be able to maximize the randomness I can get while keeping the number a nice reasonably-performant integer, so ideally they'd be in the range Fixnum::MIN .. Fixnum::MAX, but 0..Fixnum::MAX ought to be fine too.
OH WAIT. Those constants aren't actually things that exist. So when you read that Random.rand returns a float unless you pass it an integer argument the only obvious course of action is to resort to terrible hacks like these.
Is there any more-idiomatic way to get a random integer in Ruby, or does Yukihiro just expect me to make my code hideous and duplicate dubious integer-size exponentiation if I want this sort of capability?
Random Values from 0..FIXNUM_MAX
When Fixnum overflows, Ruby will just convert to Bignum. However, this related answer shows how to calculate the minimum and maximum values of Fixnum for your platform. Using that as a starting point, you can get a positive integer in the desired range with:
FIXNUM_MAX = (2**(0.size * 8 - 2) - 1)
Random.rand FIXNUM_MAX # integer in 0...FIXNUM_MAX (upper bound excluded)
Negative Integers
If you insist on having negative numbers too, then the following may be "close enough" for your purposes, even though FIXNUM_MIN.abs == FIXNUM_MAX may be false on your platform:
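FIXNUM_MAX = (2**(0.size * 8 - 2) - 1)
random_num = Random.rand FIXNUM_MAX
# Choose the sign independently of the magnitude; tying it to parity
# would make every negative result odd and every positive result even.
rand(2).zero? ? random_num : -random_num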
See Also
Kernel#rand
Random#rand
SecureRandom#random_number
This should get you a fairly large number of sample integers:
require "securerandom"
exponent = rand(1..15)
puts (SecureRandom.random_number * 10**exponent).to_i
A faster variant uses the non-cryptographic Random class (SecureRandom is the stronger source of randomness, so this trades quality for speed):
r = Random.new
exponent = rand(1..15)
puts (r.rand * 10**exponent).to_i
Or, even simpler:
FIXNUM_MAX = (2**(0.size * 8 -2) -1)
FIXNUM_MIN = -(2**(0.size * 8 -2))
p rand(FIXNUM_MIN..FIXNUM_MAX)
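Putting these pieces together, a minimal sketch (the helper names fixnum_bounds and random_fixnum are hypothetical, and the bounds follow the word-size formula used above). Since Ruby 2.4, Fixnum and Bignum are unified into Integer, so these bounds only matter as a performance detail. Passing a seeded Random instance is optional and makes runs reproducible.

```ruby
# Bounds of the old Fixnum range, derived from the machine word size
# with the same formula as above.
def fixnum_bounds
  bits = 0.size * 8 - 2
  [-(2**bits), 2**bits - 1]
end

# Uniform random Integer between the two bounds (both ends included).
# rng may be the Random class itself or a seeded Random instance.
def random_fixnum(rng = Random)
  min, max = fixnum_bounds
  rng.rand(min..max)
end

random_fixnum                 # different on every call
random_fixnum(Random.new(42)) # same value every time for the same seed
```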

Cubic root of large number

I'm trying to identify the cubic root of a large number. I found a solution which works for smaller numbers, but not in this case:
require 'openssl'
q = OpenSSL::BN::generate_prime(2048)
ti = q.to_i #=> 3202718747...
ti3 = ti ** 3 #=> 328515909...
m = ti3 ** (1/3.0) #=> Infinity
I was hoping to see m = the original output of ti. Yes, this is a part of a Matasano challenge. I've put a lot of effort into not seeking help thus far, but I've reached a point where it's just a "how do I do something otherwise simple, in Ruby". Any assistance appreciated.
In Ruby, operations on integers are automatically promoted to bignums (arbitrary-precision integers), so you never get an overflow.
The same is not true of floating point operations: you end up with Infinity because raising to the power 1/3 is a floating point operation, and the first thing it does is try to convert your number to a float. The biggest number a float in Ruby can represent is about 10^308, whereas your number is probably around the 10^1800 mark, so it bails out and returns Infinity.
Ruby has a BigDecimal class for this. You might therefore be tempted to do
BigDecimal.new(ti3) ** (1/3.0)
This gives a wildly wrong answer for me, I suspect because (1/3.0) is a Float and therefore only approximately 1/3. On the other hand,
BigDecimal.new(ti3) ** Rational(1,3)
produces the correct result for me (with negligible error). Rational is Ruby's class for representing fractions exactly. From Ruby 2.1 on you can shorten this to
BigDecimal.new(ti3) ** (1r/3)
The docs do say that only integer exponents are supported, but this seems to be a hangover from the Ruby 1.8 days.
The following code was put forward based on the two pieces of advice given.
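# Integer n-th root (rounded down) of a non-negative Integer a, by Newton's method.
def nthroot(n, a)
  return a if a < 2
  x = a
  loop do
    y = ((n - 1) * x + a / (x ** (n - 1))) / n
    # Stop once the estimate no longer decreases. Waiting for two equal
    # iterates is not safe with integer division: for n = 3, a = 7 the
    # iterates would cycle through 3, 2, 1 forever.
    return x if y >= x
    x = y
  end
end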
It was based on an implementation of Newton's method which dealt with floats, but also just returned infinity. This version deals with integers only, but works for large integers.
Of course, nthroot may be called with n = 3 to get the cube root.
I don't know what the Matasano challenge is, but what comes to mind is Newton's method.
The Wikipedia page on cube roots also suggests using Newton's method.
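Here is a minimal sketch of Newton's method for this case, restricted to Integer arithmetic so that nothing is ever converted to a Float. The name integer_cbrt and the bit_length-based starting guess are illustrative choices, not taken from the answers above. The input is assumed to be a non-negative Integer, and the result is the cube root rounded down.

```ruby
def integer_cbrt(a)
  raise ArgumentError, 'expected a non-negative Integer' unless a.is_a?(Integer) && a >= 0
  return a if a < 2

  # Starting guess: a power of two that is at least the true cube root.
  x = 1 << ((a.bit_length + 2) / 3)
  loop do
    # Newton step for f(x) = x**3 - a, with integer division throughout.
    y = (2 * x + a / (x * x)) / 3
    return x if y >= x # no longer decreasing: x is the floor of the cube root
    x = y
  end
end

q = 2**2048 + 981          # stand-in for the 2048-bit prime in the question
integer_cbrt(q**3) == q    #=> true
integer_cbrt(7)            #=> 1
```

Starting from a power of two close to the root, rather than from a itself, keeps the number of iterations small even for inputs with thousands of digits.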

How do I square a number without using multiplication?

I was wondering whether there is a way to write a method that squares a number (integer or decimal/float) without using the multiplication operator (*). For example, the square of 2 is 4, the square of 2.5 is 6.25, and the square of 3.5 is 12.25.
Here is my approach:
def square(num)
  number = num
  number2 = number
  (1...(number2.floor)).each { num += number }
  num
end
puts square(2) #=> 4 [Correct]
puts square(16) #=> 256 [Correct]
puts square(2.5) #=> 5.0 [Wrong]
puts square(3.5) #=> 10.5 [Wrong]
The code works for integers, but not with floats/decimals. What am I doing wrong here? Also, if anybody has a fresh approach to this problem then please share. Algorithms are also welcome. Also, considering performance of the method will be a plus.
There are a few tricks you could use, arranged here in order of increasing trickery.
Logarithms
Observe that k * k = e^log(k*k) = e^(log(k) + log(k)) for k > 0, and use that rule:
Math.exp(Math.log(5.2) + Math.log(5.2))
# => 27.04 (up to floating-point rounding)
No multiplication here!
Division
As another commenter suggested, you could take the reciprocal operation, division: k/(1.0/k) == k^2. However, this introduces additional floating-point errors, since k / (1.0 / k) is two floating-point operations, whereas k * k is only one.
Exponentiation
Or, since this is Ruby, if you want exactly the same value as the floating-point operation and you don't want to use the multiplication operator, you can use the exponentiation operator: k**2 == k * k.
Call a web service
It's not multiplying if you don't do it yourself!
require 'wolfram' # https://github.com/cldwalker/wolfram
query = 'Square[5.2]'
result = Wolfram.fetch(query)
Blatant cheating
Finally, if you're feeling really cheap, you could avoid actually employing the literal "*" operation, and use something equivalent:
n = ...
require 'base64'
n.send (Base64.decode64 'Kg==').to_sym, n # => n * n
Didn't use any operation sign.
def square(num)
  num.send 42.chr, num # 42.chr is "*"
end
Well, the inverse of multiplication is division, so you can get the same result* by dividing by its inverse. That is: square(n) = n / (1.0 / n). Just make sure you don't inadvertently do integer division.
*Technically dividing twice introduces a second opportunity for rounding error in floating-point arithmetic since it performs two operations. So, this will not produce exactly the same result as floating-point multiplication - but this was also not a requirement in the question.
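Returning to the attempt in the question: the loop there runs only floor(num) - 1 times, so the fractional part of the multiplier is dropped (for 2.5 it computes 2.5 + 2.5 = 5.0). A minimal sketch of one way to repair it without the * operator follows; the method name and the mix of techniques are illustrative choices, not something proposed in the answers. It adds num to itself by repeated doubling for the whole part, which needs only about log2(num) additions, and covers the fractional part with the division trick described above.

```ruby
def square_without_star(num)
  num = num.abs            # the square of -x equals the square of x
  whole = num.floor        # integer part of the multiplier
  frac = num - whole       # fractional part, in [0, 1)

  result = 0
  addend = num
  while whole > 0
    result += addend if whole.odd?
    addend += addend       # doubling by addition
    whole >>= 1            # move on to the next binary digit
  end

  # Remaining term num * frac, written as a division.
  result += num / (1.0 / frac) unless frac.zero?
  result
end

square_without_star(2)    #=> 4
square_without_star(2.5)  #=> 6.25
square_without_star(3.5)  #=> 12.25
```

For arbitrary floats the result may differ from k * k in the last bits, because several rounded additions and two divisions replace a single multiplication.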

BigDecimal loses precision after multiplication

I'm getting a strange behaviour with BigDecimal in ruby. Why does this print false?
require 'bigdecimal'
a = BigDecimal.new('100')
b = BigDecimal.new('5.1')
c = a / b
puts c * b == a #false
BigDecimal doesn't claim to have infinite precision, it just provides support for precisions outside the normal floating point ranges:
BigDecimal provides similar support for very large or very accurate floating point numbers.
But BigDecimal values still have a finite number of significant digits, hence the precs method:
precs
Returns an Array of two Integer values.
The first value is the current number of significant digits in the BigDecimal. The second value is the maximum number of significant digits for the BigDecimal.
You can see things starting to go awry if you look at your c:
>> c.to_s
=> "0.19607843137254901960784313725E2"
That's a nice clean rational number but BigDecimal doesn't know that, it is still stuck seeing c as a finite string of digits.
If you use Rational instead, you'll get the results you're expecting:
>> a = Rational(100)
>> b = Rational(51, 10)
>> c = a / b
>> c * b == a
=> true
Of course, this trickery only applies if you are working with Rational numbers so anything fancy (such as roots or trigonometry) is out of bounds.
This is normal behaviour, and not at all strange.
BigDecimal does not guarantee infinite accuracy; it allows you to specify arbitrary accuracy, which is not the same thing. The value 100/5.1 cannot be expressed exactly with any finite number of decimal digits, no matter how many are used.
A "big rational" approach could achieve it - but would not give you access to some functions e.g. square roots.
See http://ruby-doc.org/core-1.9.3/Rational.html
# require 'rational' necessary only in Ruby 1.8
a = 100.to_r
b = '5.1'.to_r
c = a / b
c * b == a
# => true
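To see how large the discrepancy actually is, here is a small sketch (the method and variable names are illustrative, and it uses the Kernel#BigDecimal() constructor because recent bigdecimal versions no longer provide BigDecimal.new). Converting the BigDecimal values to Rational with to_r makes the round-trip error visible as an exact fraction.

```ruby
require 'bigdecimal'

# Exact error left after dividing a by b and multiplying back, as a Rational.
def round_trip_error(a, b)
  c = a / b                   # quotient, cut off after a finite number of digits
  a.to_r - c.to_r * b.to_r    # exact arithmetic on the stored values
end

err = round_trip_error(BigDecimal('100'), BigDecimal('5.1'))
err.zero?                     #=> false: 1000/51 has no finite decimal form

# The same round trip in pure Rational arithmetic is exact.
exact = Rational(100) / Rational('5.1')
exact * Rational('5.1') == 100 #=> true
```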
