I have been wondering what the maximum length of an Integer could be before it gets to Float::INFINITY.
On my 64 bit (Arch Linux) system:
> 1023.**(3355446).bit_length
# => 33549731
> 1023.**(3355446).+(1000000 ** 1000000).+(1000 ** 100).bit_length
# => 33549731
In fact:
> a = 1023.**(3355446) ; ''
# => ""
> b = 1023.**(3355446).+(1000000 ** 1000000).+(1000 ** 100) ; ''
# => ""
> a.to_s.length == b.to_s.length
# => true
The above takes some time, but this one doesn't:
a, b, length_of = 1023.**(3355446), 1023.**(3355446).+(1000000 ** 1000000).+(1000 ** 100), lambda { |x| Math.log10(x).to_i.next } ; ''
# => ""
length_of.(a).eql?(length_of.(b))
# => true
Thus, if you are running a program that has an infinite loop and a counter which increments many hundreds or thousands of times a second, and you have to run it 24 hours a day, 365 days a year, I think that may cause bugs.
So the question is: what determines the maximum length of an Integer in Ruby? Does it differ between 32-bit and 64-bit systems?
Edit:
On my Raspberry Pi 3 Model B:
2.**(31580669).bit_length
# => 31580670
2.**(31580669).next.bit_length
# => 31580670
> l = ->(x) { Math.log10(x).to_i.next }
# => #<Proc:0x00a46df0@(irb):1 (lambda)>
> l === 2.**(31580669)
# => 9506729
> l === 2.**(31580669) + 100 ** 100
# => 9506729
So on Ruby 2.3 and older, the question would be: how big can a Bignum be? From Ruby 2.4 onward, the question is: how big can an Integer be?
What could be the maximum length of an Integer before it gets to Float::INFINITY?
Integer operations in Ruby will (almost) never return Infinity. An Integer can be as big as you have memory to hold it.
Float is implemented as a classic double-precision floating-point number with an upper limit of about 1.7976931348623157e+308, and will return Float::INFINITY if you go too high.
1.7976931348623157e+308.to_f + 10**307
=> Infinity
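Ruby exposes this limit directly as Float::MAX, and anything past it overflows:
Float::MAX
# => 1.7976931348623157e+308
Float::MAX * 2
# => Infinity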
Some languages, like Perl 5, upgrade integers to doubles to get more space to work. So you will get Infinity if you go too high.
$ perl -wle 'printf "%f\n", 10**308'
100000000000000001097906362944045541740492309677311846336810682903157585404911491537163328978494688899061249669721172515611590283743140088328307009198146046031271664502933027185697489699588559043338384466165001178426897626212945177628091195786707458122783970171784415105291802893207873272974885715430223118336.000000
$ perl -wle 'printf "%f\n", 10**308 + 10**308'
Inf
But Ruby's Integers have no limit except your memory. When Integers get too large, they switch to using the GNU Multiple Precision Arithmetic Library, which supports arbitrary-precision arithmetic.
There are a few operations which can result in Infinity, like power.
10**10000000
(irb):5: warning: in a**b, b may be too big
=> Infinity
But multiplication has no such limit.
a = 10**1000000
...
a *= a
...
a *= a
...
a *= a
...
a.bit_length
=> 26575425
Thus, if you are running a program that has an infinite loop and a counter which increments many hundreds or thousands of times a second, and you have to run it 24 hours a day, 365 days a year, I think that may cause bugs.
This is a real-world concern for 32-bit integers, which becomes a pressing problem as 2038 approaches, but not for 64-bit integers. If we incremented a 64-bit counter a million times a second, it would take almost 300,000 years to overflow. What I've just described is 64-bit time with microsecond resolution.
But in Ruby you can make a simple counter effectively as large as you want.
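A quick back-of-the-envelope check of that 300,000-year figure (a sketch using the numbers from the claim):
seconds_to_overflow = 2**63 / 1_000_000.0       # signed 64-bit limit at 1M increments/sec
years = seconds_to_overflow / (60 * 60 * 24 * 365)
years.round
# => 292471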
Related
I'm solving some Project Euler problems using Ruby, and specifically here I'm talking about problem 25 (What is the index of the first term in the Fibonacci sequence to contain 1000 digits?).
At first, I was using Ruby 2.2.3 and I coded the problem as such:
number = 3
a = 1
b = 2
while b.to_s.length < 1000
  a, b = b, a + b
  number += 1
end
puts number
But then I found out that version 2.4.2 has a method called digits, which is exactly what I needed. I transformed the code to:
while b.digits.length < 1000
And when I compared the two methods, digits was much slower.
Time
./025/problem025.rb 0.13s user 0.02s system 80% cpu 0.190 total # to_s version
./025/problem025.rb 2.19s user 0.03s system 97% cpu 2.275 total # digits version
Does anyone have an idea why?
Ruby's digits
... is implemented in rb_int_digits.
Which for non-tiny numbers (i.e., most of your numbers) uses rb_int_digits_bigbase.
Which extracts digit after digit naively with division/modulo by base.
So it should take quadratic time (at least with a small base such as 10).
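To make the quadratic behavior concrete, here is roughly what that extraction looks like transcribed into Ruby (a sketch of the idea, not the actual C code): each divmod on an n-digit number costs on the order of n digit operations, and it runs once per digit.
def naive_digits(num, base = 10)
  digits = []
  while num > 0
    num, rem = num.divmod(base)   # one O(n) division per extracted digit
    digits << rem
  end
  digits
end
naive_digits(4096)
# => [6, 9, 0, 4]  (least significant digit first, like Integer#digits)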
Ruby's to_s
... is implemented in int_to_s.
Which uses rb_int2str.
Which for non-tiny numbers uses rb_big2str.
Which uses rb_big2str1.
Which might use big2str_gmp if available (which sounds/looks like it uses the fast GMP library) or ...
... uses big2str_generic.
Which uses big2str_karatsuba (sweet, I recognize that name!).
Which looks like it has something to do with ...
... Karatsuba's algorithm, which is a fast multiplication algorithm. If you multiply two n-digit numbers the naive way you learned in school, you take n^2 single-digit products. Karatsuba on the other hand only needs about n^1.585, which is quite a lot better. And I didn't read into this further, but I suspect what Ruby does here is also this efficient. Eric Lippert's answer with a base conversion algorithm uses Karatsuba multiplication and says "this [base conversion] algorithm is utterly dominated by the cost of the multiplication".
Comparing quadratic to n^1.585 over the number lengths from 1 digit to 1000 digits gives factor 15:
(1..1000).sum { |i| i**2 } / (1..1000).sum { |i| i**1.585 }
=> 15.150583254950678
Which is roughly the factor you observed as well. Of course that's a rather naive comparison, but, well, why not.
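If you're curious, here is a toy Ruby transcription of Karatsuba (purely illustrative; Ruby's real implementation lives in C and is far more careful). It replaces the schoolbook method's four half-size multiplications with three, which is where the n^1.585 (that is, n^log2(3)) exponent comes from.
def karatsuba(x, y)
  return x * y if x < 10 || y < 10              # small enough: multiply directly
  m = [x.to_s.length, y.to_s.length].max / 2
  pow = 10**m
  a, b = x.divmod(pow)                           # x = a*pow + b
  c, d = y.divmod(pow)                           # y = c*pow + d
  ac = karatsuba(a, c)
  bd = karatsuba(b, d)
  mid = karatsuba(a + b, c + d) - ac - bd        # equals a*d + b*c, one multiplication saved
  ac * pow**2 + mid * pow + bd
end
karatsuba(1234, 5678)
# => 7006652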
GMP by the way apparently uses/used a "near O(n * log(n)) FFT-based multiplication algorithm".
Thanks to @Drenmi's answer for motivating me to dig into the source after all. I hope I did this right; no guarantees, I'm a Ruby beginner. But that's why I left all the links there for you to check for yourself :-P
Integer#digits doesn't just "split" the number. From the documentation:
Returns the array including the digits extracted by place-value
notation with radix base of int.
This extraction is done even if a base argument is omitted. The relevant source:
# ruby/numeric.c:4809
while (!FIXNUM_P(num) || FIX2LONG(num) > 0) {
    VALUE qr = rb_int_divmod(num, base);
    rb_ary_push(digits, RARRAY_AREF(qr, 1));
    num = RARRAY_AREF(qr, 0);
}
As you can see, this process includes repeated modulo arithmetics, which likely accounts for the additional runtime.
Many Ruby methods create objects (strings, arrays, etc.), and in Ruby object creation is "expensive".
For instance, to_s creates a string and digits creates an array every time the while condition is evaluated.
If you want to optimize your example, you can do the following:
# create the smallest possible 1000-digit number
max = 10**999
number = 3
a = 1
b = 2
# do not create objects in while condition
while b < max
  a, b = b, a + b
  number += 1
end
puts number
I have not answered your question, but wish to suggest an improved algorithm for the problem you have addressed. For a given number of decimal digits, n, I have implemented the following algorithm.
estimate the number f of Fibonacci numbers ("FNs") that have n or fewer decimal digits.
compute the fth and (f-1)st FNs, and the number of digits m in the fth FN.
if m >= n, work back down from the (f-1)st FN until the (f-1)st FN has fewer than n decimal digits, at which time the fth FN is the smallest FN to have n decimal digits.
if m < n, step up from the fth FN until it has n decimal digits, at which time it is the smallest FN to have n decimal digits.
The key is to compute a close estimate f in the first step.
Code
AVG_FNs_PER_DIGIT = 4.784971966781667

def first_fibonacci_with_n_digits(n)
  return [1, 1] if n == 1
  idx = (n * AVG_FNs_PER_DIGIT).round
  fn, prev_fn = fib(idx)
  fn.to_s.size >= n ? fib_down(n, fn, prev_fn, idx) : fib_up(n, fn, prev_fn, idx)
end

def fib(idx)
  a = 1
  b = 2
  (idx - 2).times { a, b = b, a + b }
  [b, a]
end

def fib_up(n, b, a, idx)
  loop do
    a, b = b, a + b
    idx += 1
    break [idx, b] if b.to_s.size == n
  end
end

def fib_down(n, b, a, idx)
  loop do
    a, b = b - a, a
    break [idx, b] if a.to_s.size == n - 1
    idx -= 1
  end
end
Benchmarks
In computing each Fibonacci number two operations are typically performed:
compute the number of digits in the last-computed Fibonacci number and if that number is equal to the target number of digits, terminate (for reasons made clear in the Explanation section below, it cannot be larger than the target number); else
compute the next number in the Fibonacci sequence.
By contrast, the method I have proposed performs the first step a relatively small number of times.
How important is the first step relative to the second and how does the use of n.digits.size compare with that of n.to_s.size in the first step? Let's run some benchmarks to find out.
def use_to_s(ndigits)
  case ndigits
  when 1
    [1, 1]
  else
    a = 1
    b = 2
    idx = 3
    loop do
      break [idx, b] if b.to_s.length == ndigits
      a, b = b, a + b
      idx += 1
    end
  end
end
def use_digits(ndigits)
  case ndigits
  when 1
    [1, 1]
  else
    a = 1
    b = 2
    idx = 3
    loop do
      break [idx, b] if b.digits.size == ndigits
      a, b = b, a + b
      idx += 1
    end
  end
end
require 'fruity'
def test(ndigits)
  nfibs, last_fib = use_to_s(ndigits)
  puts "\nndigits = #{ndigits}, nfibs=#{nfibs}, last_fib=#{last_fib}"
  compare do
    try_use_to_s { use_to_s(ndigits) }
    try_use_digits { use_digits(ndigits) }
    try_estimate { first_fibonacci_with_n_digits(ndigits) }
  end
end
test 20
ndigits = 20, nfibs=93, last_fib=12200160415121876738
Running each test 128 times. Test will take about 1 second.
try_estimate is faster than try_use_to_s by 2x ± 0.1
try_use_to_s is faster than try_use_digits by 80.0% ± 10.0%
test 100
ndigits = 100, nfibs=476, last_fib=13447...37757 (90 digits omitted)
Running each test 16 times. Test will take about 4 seconds.
try_estimate is faster than try_use_to_s by 5x ± 0.1
try_use_to_s is faster than try_use_digits by 10x ± 1.0
test 500
ndigits = 500, nfibs=2390, last_fib=13519...63145 (490 digits omitted)
Running each test 2 times. Test will take about 27 seconds.
try_estimate is faster than try_use_to_s by 9x ± 0.1
try_use_to_s is faster than try_use_digits by 60x ± 1.0
test 1000
ndigits = 1000, nfibs=4782, last_fib=10700...27816 (990 digits omitted)
Running each test once. Test will take about 1 minute.
try_estimate is faster than try_use_to_s by 12x ± 10.0
try_use_to_s is faster than try_use_digits by 120x ± 100.0
There are two main take-aways from these results:
"try_estimate" is the fastest because it performs the first step relatively few times; and
the use of to_s is much faster than that of digits.
Further to the first of these observations, note that the initial estimates of the index of the first FN having a given number of digits, compared to the actual index, are as follows:
for 20 digits: 96 est. vs 93 actual
for 100 digits: 479 est. vs 476 actual
for 500 digits: 2392 est. vs 2390 actual
for 1000 digits: 4785 est. vs 4782 actual
The deviation was at most 3, meaning the number of digits had to be calculated for at most 3 FNs to obtain the desired result.
Explanation
The only aspect of the methods given in the Code section above that needs explanation is the derivation of the constant AVG_FNs_PER_DIGIT, which is used to calculate an estimate of the index of the first FN having the specified number of digits.
The derivation of this constant comes from the question and selected answer given here. (The Wikipedia article on Fibonacci numbers provides a good overview of their mathematical properties.)
It is known that the first 7 FNs (including zero) have one digit; thereafter the FNs gain an additional digit every 4 or 5 FNs (i.e., sometimes 4, else 5). Therefore, as a very crude estimate, the first FN with n digits, n >= 2, will be no earlier than the 4*nth FN. For n = 1000, that would be the 4,000th. (In fact, the 4,782nd is the smallest to have 1,000 digits.) In other words, we don't need to calculate the number of digits in the first 4,000 FNs. We can improve on this estimate, however.
As n approaches infinity, the ratio of ranges 10**n...10**(n+1) (n-digit intervals) that contain 5 FNs to those that contain 4 FNs can be computed as follows.
LOG_10 = Math.log(10)
#=> 2.302585092994046
GR = (1 + Math.sqrt(5))/2
#=> 1.618033988749895
LOG_GR = Math.log(GR)
#=> 0.48121182505960347
RATIO_5to4 = (LOG_10 - 4*LOG_GR)/(5*LOG_GR - LOG_10)
#=> 3.6505564183095474
where GR is the Golden Ratio.
Over a large number of n-digit intervals let n4 be the number of those intervals containing 4 FNs and n5 be the number containing 5 FNs. The average number of FNs per interval is therefore (n4*4 + n5*5)/(n4 + n5). Since n5/n4 converges to RATIO_5to4, n5 approaches RATIO_5to4 * n4 in the limit (discarding roundoff error). If we substitute out n5, and let
b = 1/(1 + RATIO_5to4)
#=> 0.21502803321833364
we find the average number of FNs per n-digit interval converges to
avg = b*4 + (1-b)*5
#=> 4.784971966781667
If fn is the first FN to have n decimal digits, the number of FNs in the sequence up to and including fn can therefore be approximated as
n * avg
For example, the estimate of the index of the first FN to have 1000 decimal digits would be (1000 * 4.784971966781667).round #=> 4785.
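As a quick sanity check of the estimate, a brute-force count (a sketch) finds the actual index:
a, b, idx = 1, 1, 2
idx, a, b = idx + 1, b, a + b until b.to_s.size == 1000
idx # => 4782, versus the estimate of 4785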
In ruby, some large numbers are larger than infinity. Through binary search, I discovered:
(1.0/0) > 10**9942066.000000001 # => false
(1.0/0) > 10**9942066 # => true
RUBY_VERSION # => "2.3.0"
Why is this? What is special about 10^9942066? It doesn't seem to be an arbitrary number like 9999999, and it is not close to any power of two (it's approximately equivalent to 2^33026828.36662442).
Why isn't Ruby's infinity infinite? How is 10^9942066 involved?
I now realize that anything greater than 10^9942066 will overflow to infinity:
10**9942066.000000001 #=> Infinity
10**9942067 #=> Infinity
But that still leaves the question: why 10^9942066?
TL;DR
I did the calculations done inside numeric.c's int_pow manually, checking where an integer overflow (and a propagation to Bignums, including a call to rb_big_pow) occurs. Once the call to rb_big_pow happens, there is a check whether the two intermediate values you've got in int_pow are too large or not, and the cutoff value seems to be just around 9942066 (if you're using a base of 10 for the power). This value is approximately
BIGLEN_LIMIT / ceil(log2(base^n)) * n ==
32*1024*1024 / ceil(log2(10^16)) * 16 ==
32*1024*1024 / 54 * 16 ~=
9942054
where BIGLEN_LIMIT is an internal limit in ruby which is used as a constant to check if a power calculation would be too big or not, and is defined as 32*1024*1024. base is 10, and n is the largest power-of-2 exponent for the base that would still fit inside a Fixnum.
Unfortunately I can't find a better way than this approximation, due to the algorithm used to calculate powers of big numbers, but it might be good enough to use as an upper limit if your code needs to check validity before doing exponentiation on big numbers.
Original question:
The problem is not with 9942066, but with one of your numbers being an integer and the other one being a float. So
(10**9942066).class # => Bignum
(10**9942066.00000001).class # => Float
The first one is representable by a specific number internally, which is smaller than Infinity. The second one, being a float, is not representable by an actual number and is simply replaced by Infinity, which is of course not larger than Infinity.
Updated question:
You are right that there seem to be some difference around 9942066 (if you're using a 64-bit ruby under Linux, as the limits might be different under other systems). While ruby does use the GMP library to handle big numbers, it does some precheck before even going to GMP, as shown by the warnings you can receive. It will also do the exponentiation manually using GMP's mul commands, without calling GMP's pow functions.
Fortunately the warnings are easy to catch:
irb(main):010:0> (10**9942066).class
=> Bignum
irb(main):005:0> (10**9942067).class
(irb):5: warning: in a**b, b may be too big
=> Float
And then you can actually check where these warnings are emitted inside ruby's bignum.c library.
But first we need to get to the Bignum realm, as both of our numbers are simple Fixnums. The initial part of the calculation, and the "upgrade" from Fixnum to Bignum, is done inside numeric.c. Ruby does quick exponentiation, and at every step it checks whether the result would still fit into a Fixnum (which is 2 bits less than the system bitsize: 62 bits on a 64-bit machine). If not, it will convert the values to the Bignum realm and continue the calculations there. We are interested in the point where this conversion happens, so let's try to figure out when it does in our 10^9942066 example (I'm using the x, y, z variables as present inside Ruby's numeric.c code):
x = 10^1 z = 10^0 y = 9942066
x = 10^2 z = 10^0 y = 4971033
x = 10^2 z = 10^2 y = 4971032
x = 10^4 z = 10^2 y = 2485516
x = 10^8 z = 10^2 y = 1242758
x = 10^16 z = 10^2 y = 621379
x = 10^16 z = 10^18 y = 621378
x = OWFL
At this point x will overflow (10^32 > 2^62-1), so the process will continue in the Bignum realm by calculating x**y, which is (10^16)^621378 (both of which are actually still Fixnums at this stage)
If you now go back to bignum.c and check how it determines whether a number is too large or not, you can see that it will check the number of bits required to hold x, and multiply this number by y. If the result is larger than 32*1024*1024, it will fail (emit a warning and do the calculations using basic floats).
(10^16) is 54 bits (ceil(log_2(10^16)) == 54), 54*621378 is 33554412. This is only slightly smaller than 33554432 (by 20), the limit after which ruby will not do Bignum exponentiation, but simply convert y to double, and hope for the best (which will obviously fail, and just return Infinity)
Now let's try to check this with 9942067:
x = 10^1 z = 10^0 y = 9942067
x = 10^1 z = 10^1 y = 9942066
x = 10^2 z = 10^1 y = 4971033
x = 10^2 z = 10^3 y = 4971032
x = 10^4 z = 10^3 y = 2485516
x = 10^8 z = 10^3 y = 1242758
x = 10^16 z = 10^3 y = 621379
x = 10^16 z = OWFL
Here, at the point z overflows (10^19 > 2^62-1), the calculation will continue in the Bignum realm and will calculate x**y. Note that here it will calculate (10^16)^621379, and while (10^16) is still 54 bits, 54*621379 is 33554466, which is larger than 33554432 (by 34). As it's larger, you'll get the warning, and Ruby will do the calculation using doubles only, hence the result is Infinity.
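Both of those bit-count checks are easy to reproduce from irb:
(10**16).bit_length
# => 54
54 * 621378   # the 10^9942066 case
# => 33554412  (just under 32*1024*1024 = 33554432)
54 * 621379   # the 10^9942067 case
# => 33554466  (over the limit, so you get the warning and Infinity)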
Note that these checks are only done if you are using the power function. That's why you can still do (10**9942066)*10, as similar checks are not present when doing plain multiplication, meaning you could implement your own quick exponentiation method in ruby, in which case it will still work with larger values, although you won't have this safety check anymore. See for example this quick implementation:
def unbounded_pow(x, n)
  if n < 0
    x = 1.0 / x
    n = -n
  end
  return 1 if n == 0
  y = 1
  while n > 1
    if n.even?
      x = x * x
      n = n / 2
    else
      y = x * y
      x = x * x
      n = (n - 1) / 2
    end
  end
  x * y
end
puts (10**9942066) == (unbounded_pow(10,9942066)) # => true
puts (10**9942067) == (unbounded_pow(10,9942067)) # => false
puts ((10**9942066)*10) == (unbounded_pow(10,9942067)) # => true
But how would I know the cutoff for a specific base?
My math is not exactly great, but I can tell a way to approximate where the cutoff value will be. If you check the above calls you can see the conversion between Fixnum and Bignum happens when the intermediate base reaches the limit of Fixnum. The intermediate base at this stage will always have an exponent which is a power of 2, so you just have to maximize this value. For example let's try to figure out the maximum cutoff value for 12.
First we have to check the highest power of the base that we can store in a Fixnum:
ceil(log2(12^1)) = 4
ceil(log2(12^2)) = 8
ceil(log2(12^4)) = 15
ceil(log2(12^8)) = 29
ceil(log2(12^16)) = 58
ceil(log2(12^32)) = 115
We can see 12^16 is the max we can store in 62 bits; if we're using a 32-bit machine, 12^8 will fit into 30 bits (Ruby's Fixnums can store values up to two bits less than the machine word size).
For 12^16 we can easily determine the cutoff value. It will be 32*1024*1024 / ceil(log2(12^16)), which is 33554432 / 58 ~= 578525. We can easily check this in ruby now:
irb(main):004:0> ((12**16)**578525).class
=> Bignum
irb(main):005:0> ((12**16)**578526).class
(irb):5: warning: in a**b, b may be too big
=> Float
Now we have to go back to our original base of 12. There the cutoff will be around 578525*16 (16 being the exponent of the new base), which is 9256400. If you check in Ruby, the values are actually quite close to this number:
irb(main):009:0> (12**9256401).class
=> Bignum
irb(main):010:0> (12**9256402).class
(irb):10: warning: in a**b, b may be too big
=> Float
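If you want this check programmatically, here is a small helper wrapping the recipe above (my own sketch: the method name is mine, it assumes a 64-bit Ruby with 62 usable Fixnum bits, and it only approximates the true cutoff, as discussed):
BIGLEN_LIMIT = 32 * 1024 * 1024  # ruby's internal limit from bignum.c

def approx_pow_cutoff(base)
  n = 1
  # find the largest power-of-two exponent whose result still fits in a Fixnum
  n *= 2 while (base**(n * 2)).bit_length <= 62
  (BIGLEN_LIMIT.to_f / (base**n).bit_length * n).round
end

approx_pow_cutoff(10) # => 9942054 (observed cutoff: 9942066)
approx_pow_cutoff(12) # => 9256395 (the estimate above was 9256400; observed: around 9256401)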
Note that the problem is not with the number but with the operation, as told by the warning you get.
$ ruby -e 'puts (1.0/0) > 10**9942067'
-e:1: warning: in a**b, b may be too big
false
The problem is 10**9942067 breaks Ruby's power function. Instead of throwing an exception, which would be a better behavior, it erroneously results in infinity.
$ ruby -e 'puts 10**9942067'
-e:1: warning: in a**b, b may be too big
Infinity
The other answer explains why this happens near 10**9942067.
10**9942067 is not greater than infinity, it is erroneously resulting in infinity. This is a bad habit of a lot of math libraries that makes mathematicians claw their eyeballs out in frustration.
Infinity is not greater than infinity, they're equal, so your greater than check is false. You can see this by checking if they're equal.
$ ruby -e 'puts (1.0/0) == 10**9942067'
-e:1: warning: in a**b, b may be too big
true
Contrast this with specifying the number directly using scientific notation. Now Ruby doesn't have to do math on huge numbers, it just knows that any real number is less than infinity.
$ ruby -e 'puts (1.0/0) > 10e9942067'
false
Now you can put on as big an exponent as you like.
$ ruby -e 'puts (1.0/0) > 10e994206700000000000000000000000000000000'
false
The code below outputs 0.0. Is this because of overflow? How can I avoid it? If not, why does it happen?
p ((1..100000).map {rand}).reduce :*
I was hoping to speed up this code:
p r.reduce(0) {|m, v| m + (Math.log10 v)}
and use this instead:
p Math.log10 (r.reduce :*)
but apparently this is not always possible...
The values produced by rand are all between 0.0 and 1.0. This means that on each multiplication, your number gets smaller. So by the time you have multiplied 1000 of them, it is probably indistinguishable from 0.
At some point, Ruby will take your number to be so small that it is 0. For instance: 2.0e-1000 # => 0.0
Every multiplication reduces your number by about 1/2^1, so after about 50 of them you are down by about 1/2^50, and after 100000 (actually, after about 700) you have underflowed the FP format itself; see here.
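You can watch this happen with a quick experiment (a sketch; the exact count varies from run to run):
product = 1.0
count = 0
until product == 0.0
  product *= rand
  count += 1
end
count # => typically somewhere around 750; the product has underflowed to exactly 0.0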
Ruby provides the BigDecimal class, which implements accurate floating point arithmetic.
require 'bigdecimal'
n = 100
decimals = n.times.map { BigDecimal.new rand.to_s }
result = decimals.reduce :*
result.nonzero?.nil? # returns nil if zero, self otherwise
# => false
result.precs # [significant_digits, maximum_significant_digits]
# => [1575, 1764]
Math.log10 result
# => -46.8031931083014
It is a lot slower than native floating point numbers, however. With n = 100_000, the decimals.reduce :* call went on for minutes on my computer before I finally interrupted it.
What's the best way to calculate if a byte has odd or even parity in Ruby? I've got a version working:
result = "AB".to_i(16).to_s(2).count('1').odd?
=> true
Converting a number to a string and counting the "1"s seems a poor way of calculating parity though. Any better methods?
I want to be able to calculate the parity of a 3DES key. Eventually, I'll want to convert even bytes to odd.
Thanks,
Dan
Unless what you have is not fast enough, keep it. It's clear and succinct, and its performance is better than you think.
We'll benchmark everything against array lookup, the fastest method I tested:
ODD_PARITY = [
  false,
  true,
  true,
  ...
  true,
  false,
]

def odd_parity?(hex_string)
  ODD_PARITY[hex_string.to_i(16)]
end
Array lookup computes the parity at a rate of 640,000 bytes per second.
Bowsersenior's C code computes parity at a rate of 640,000 bytes per second.
Your code computes parity at a rate of 284,000 bytes per second.
Bowsersenior's native code computes parity at a rate of 171,000 bytes per second.
Theo's shortened code computes parity at a rate of 128,000 bytes per second.
Have you taken a look at the RubyDES library? That may remove the need to write your own implementation.
To calculate parity, you can use something like the following:
require 'rubygems'
require 'inline' # RubyInline (install with `gem install RubyInline`)
class Fixnum
  # native ruby version: simpler but slow
  # algorithm from:
  # http://graphics.stanford.edu/~seander/bithacks.html#ParityParallel
  def parity_native
    (((self * 0x0101010101010101) & 0x8040201008040201) % 0x1FF) & 1
  end

  class << self
    # inline c version using RubyInline to create a c extension
    # 4-5 times faster than the native version
    # use as a class method:
    #   Fixnum.parity_c(0xAB)
    inline :C do |builder|
      builder.c <<-EOC
        int parity_c(int num) {
          return (
            ((num * 0x0101010101010101ULL) & 0x8040201008040201ULL) % 0x1FF
          ) & 1;
        }
      EOC
    end
  end

  def parity
    self.class.parity_c(self)
  end

  def parity_odd?
    1 == parity
  end

  def parity_even?
    0 == parity
  end
end
0xAB.parity # => 1
0xAB.parity_odd? # => true
0xAB.parity_even? # => false
(0xAB + 1).parity # => 0
According to simple benchmarks, the inline c version is 3-4 times faster than the native ruby version
require 'benchmark'
n = 10000
Benchmark.bm do |x|
  x.report("inline c") do
    n.times do
      (0..255).map { |num| num.parity }
    end
  end
  x.report("native ruby") do
    n.times do
      (0..255).map { |num| num.parity_native }
    end
  end
end
# inline c 1.982326s
# native ruby 7.044330s
Probably a lookup table (an Array with 256 entries) would be the fastest pure-Ruby solution.
In C I would mask and shift. Or if I have SSE4 I would use the POPCNT instruction with inline assembler. If you need this to be high performance write a native extension in C which does either of the above.
http://en.wikipedia.org/wiki/SSE4
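For reference, here is one way to build such a table in Ruby (a sketch; it precomputes the same bit-counting trick once for each of the 256 byte values):
ODD_PARITY = (0..255).map { |byte| byte.to_s(2).count('1').odd? }

def odd_parity?(byte)
  ODD_PARITY[byte]
end

odd_parity?(0xAB)
# => true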
How about using your original solution with memoization? This will only calculate once for each integer value.
class Fixnum
  # Using a class variable for simplicity, and because subclasses of
  # Fixnum (while very uncommon) would likely want to share it.
  @@parity = ::Hash.new { |h, i| h[i] = i.to_s(2).count('1').odd? }

  def odd_parity?
    @@parity[self]
  end

  def even_parity?
    !@@parity[self]
  end
end
"AB".to_i(16).odd_parity?
#=> true
x = 'AB'.to_i(16)
p = 0
until x == 0
  p += x & 1
  x = x >> 1
end
puts p # => 5
which can be shortened to
x = 'AB'.to_i(16)
p = x & 1
p += x & 1 until (x >>= 1) == 0
if you want something that is unreadable ☺
I would construct a single table of 16 entries (as a 16-character table), corresponding to each nibble (half) of a byte. The entries are 0,1,1,2,1,2,...,4.
To test your byte:
Mask out the left nibble and do a lookup, memorizing the number.
Do a shift to the right by 4 and do a second lookup, adding the resulting number to the previous one to produce a sum.
Then test the low-order bit of the sum. If it is 1, the byte is odd; if it is 0, the byte is even. If the result is even, you flip the high-order bit using the xor instruction.
This lookup method is much faster than adding up the bits in a byte with single shifts; a Ruby sketch follows.
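Here is a hedged Ruby sketch of those steps (the table contents and names are mine; it flips the high-order bit when parity comes out even, exactly as described above):
NIBBLE_ONES = [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4]  # 1-bits per nibble value

def force_odd_parity(byte)
  ones = NIBBLE_ONES[byte >> 4] + NIBBLE_ONES[byte & 0x0F]  # sum of both nibble lookups
  ones.odd? ? byte : byte ^ 0x80                            # flip the high-order bit if even
end

force_odd_parity(0xAB).to_s(16)
# => "ab" (five 1-bits: already odd, unchanged)
force_odd_parity(0xAA).to_s(16)
# => "2a" (four 1-bits: even, so the high bit is flipped)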
Email me for a simple function to do the parity for 8 bytes. 3DES uses 3 groups of 8 bytes.
Do you know how to fix the following issue with math precision?
p RUBY_VERSION # => "1.9.1"
p 0.1%1 # => 0.1
p 1.1%1 # => 0.1
p 90.0%1 # => 0.0
p 90.1%1 # => 0.0999999999999943
p 900.1%1 # => 0.100000000000023
p RUBY_VERSION # => "1.9.2"
p 0.1%1 # => 0.1
p 1.1%1 # => 0.10000000000000009
p 90.0%1 # => 0.0
p 90.1%1 # => 0.09999999999999432
p 900.1%1 # => 0.10000000000002274
BigDecimal
As the man said:
Squeezing infinitely many real numbers into a finite number of bits requires an approximate representation.
I have, however, had great success using the BigDecimal class. To quote its intro:
Ruby provides built-in support for arbitrary precision integer arithmetic. For example:
42**13 -> 1265437718438866624512
BigDecimal provides similar support for very large or very accurate floating point numbers.
Taking one of your examples:
>> x = BigDecimal.new('900.1')
=> #<BigDecimal:101113be8,'0.9001E3',8(8)>
>> x % 1
=> #<BigDecimal:10110b498,'0.1E0',4(16)>
>> y = x % 1
=> #<BigDecimal:101104760,'0.1E0',4(16)>
>> y.to_s
=> "0.1E0"
>> y.to_f
=> 0.1
As you can see, ensuring decent precision is possible but it requires a little bit of effort.
This is true of all computer languages, not just Ruby. It's a feature of representing floating point numbers on binary computers:
What Every Computer Scientist Should Know About Floating Point Arithmetic
Writing 0.1 into a floating-point number will always result in rounding errors. If you want a "precise" decimal representation, you should use a decimal type (in Ruby, BigDecimal).