Bit interleaving optimized for Ruby - ruby

Granted, optimizing bit twiddling in Ruby is a bit of a mismatch to begin with. That aside, I'm looking for a snippet or a gem that can interleave two arbitrary integer coords optimized as best can be for MRI (1.9) or a native gem.
Some approaches in C are: http://graphics.stanford.edu/~seander/bithacks.html#InterleaveTableObvious
As an example or starting point, here's "Interleave bits the obvious way" in Ruby, somewhat uglified to keep it from creating temp arrays (which increase the runtime by about 2X per array) and with a binary length method inlined for a further 6% decrease. (If you know neither input is ever zero, you can omit that check for a few percent more.)
def interleave(y)
  z = 0
  bl = self > 0 ? Math.log2(self) : 1
  ybl = y > 0 ? Math.log2(y) : 1
  ((((bl <=> ybl) == -1) ? ybl : bl).floor + 1).times{|i| z |= (self & 1 << i) << i | (y & 1 << i) << (i + 1)}
  return z
end
Results from a 2.66 GHz i5 with Ruby 1.9.2p180:
x = y = 0b11111111_11111111_11111111_11111111
Benchmark.bm{|bm| bm.report{1000000.times{x.interleave(y)}}}
       user     system      total        real
  18.360000   0.010000  18.370000 ( 18.356196)
Surely there's a better way?
Update
I included the zero fix from @Wayne Conrad, albeit far uglier than his and only marginally faster. I also moved the floor and + 1 so that they are executed once instead of twice per call.
Here is a Gist of this with matching de-interleave.

Here's a quick & cheesy implementation to get you going until a good one comes along:
def mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end
As you might expect, it's no speed demon. On my box, with MRI 1.8.7, it computes about 35,000 16-bit results per second. Yours computes 68,000 16-bit results per second. Or, see the next algorithm for 256,000 16-bit results per second.
If you're willing to trade a little memory and startup time for speed, then:
def base_mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end

MORTON_TABLE_X = 256.times.map do |x|
  base_mortanize(x, 0)
end

MORTON_TABLE_Y = 256.times.map do |y|
  base_mortanize(0, y)
end

def mortanize(x, y)
  z = []
  while (x > 0 || y > 0)
    z << (MORTON_TABLE_X[x & 0xff] | MORTON_TABLE_Y[y & 0xff])
    x >>= 8
    y >>= 8
  end
  z.reverse.inject(0) do |result, word|
    result << 16 | word
  end
end
This one computes 256,000 16-bit results per second.
There's a bug in your code if either argument is zero: Math.log2(0) is -Infinity, and calling floor on that raises FloatDomainError. Here's one possible fix for it. First define this function:
def bit_size(x)
  return 1 if x == 0
  Math.log2(x).floor + 1
end
And then, inside interleave, replace:
z, bl, ybl = 0, (Math.log2(self)).floor + 1, (Math.log2(y)).floor + 1
with:
z = 0
bl = bit_size(self)
ybl = bit_size(y)
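On Ruby 2.1 or later (an assumption, since the question targets 1.9), Integer#bit_length can stand in for the Math.log2 arithmetic. Here is a minimal sketch of the same helper without floating point, which should also avoid rounding surprises for integers too large to be represented exactly as a Float:

```ruby
# Sketch of bit_size without floating point.
# Assumes Ruby >= 2.1 (Integer#bit_length) and a non-negative integer.
def bit_size(x)
  # 0.bit_length is 0, so keep the convention above of one bit for zero.
  [x.bit_length, 1].max
end

p bit_size(0)          # zero still counts as one bit
p bit_size(0xff)       # 8
p bit_size(2**64 - 1)  # 64
```

The helper keeps the same name so it can be dropped into interleave unchanged.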
Here is the rspec test case I used:
describe "mortanize" do
  it "should interleave integers" do
    mortanize(0, 0).should eql 0
    mortanize(0, 1).should eql 2
    mortanize(1, 0).should eql 1
    mortanize(0xf, 0x3).should eql 0x5f
    mortanize(0x3, 0xf).should eql 0xaf
    mortanize(0xf, 0x0).should eql 0x55
    mortanize(0x0, 0xf).should eql 0xaa
    mortanize(0x3, 0xc).should eql 0xa5
    mortanize(0xf, 0xf).should eql 0xff
    mortanize(0x1234, 0x4321).should eql 0x210e0d12
  end
end

Here's another solution, benchmarked about 50% faster than the accepted one, and for 16-bit integers (where the first one only does 8-bit):
Magic = [0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF]

# Interleave lower 16 bits of x and y, so the bits of x
# are in the even positions and bits from y in the odd;
# z gets the resulting 32-bit Morton Number.
# x and y must initially be less than 65536.
# Rubyfied from http://graphics.stanford.edu/~seander/bithacks.html
def _interleave_bits_16b(x,y)
  x = (x | (x << 8)) & Magic[3]
  x = (x | (x << 4)) & Magic[2]
  x = (x | (x << 2)) & Magic[1]
  x = (x | (x << 1)) & Magic[0]
  y = (y | (y << 8)) & Magic[3]
  y = (y | (y << 4)) & Magic[2]
  y = (y | (y << 2)) & Magic[1]
  y = (y | (y << 1)) & Magic[0]
  z = x | (y << 1)
end
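Since the question also mentions a matching de-interleave, here is a rough sketch of the inverse using the same masks applied in the opposite order. The helper names (spread_bits_16b, compact_bits_16b, interleave_16b, deinterleave_16b) are made up for this example, and it assumes both inputs are below 65536:

```ruby
# Spread the lower 16 bits of v so they land on the even bit positions.
def spread_bits_16b(v)
  v = (v | (v << 8)) & 0x00FF00FF
  v = (v | (v << 4)) & 0x0F0F0F0F
  v = (v | (v << 2)) & 0x33333333
  (v | (v << 1)) & 0x55555555
end

# Inverse of spread_bits_16b: gather the even-position bits back together.
def compact_bits_16b(v)
  v &= 0x55555555
  v = (v | (v >> 1)) & 0x33333333
  v = (v | (v >> 2)) & 0x0F0F0F0F
  v = (v | (v >> 4)) & 0x00FF00FF
  (v | (v >> 8)) & 0x0000FFFF
end

# x goes to the even positions, y to the odd ones.
def interleave_16b(x, y)
  spread_bits_16b(x) | (spread_bits_16b(y) << 1)
end

# Returns [x, y] recovered from a 32-bit Morton number.
def deinterleave_16b(z)
  [compact_bits_16b(z), compact_bits_16b(z >> 1)]
end

z = interleave_16b(0x1234, 0x4321)
puts z.to_s(16)        # 210e0d12
p deinterleave_16b(z)  # [4660, 17185], i.e. [0x1234, 0x4321]
```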

If you already have an implementation in C, you can use FFI; otherwise, you can write it directly with the help of RubyInline.

Related

For given two integers A and B, find a pair of numbers X and Y such that A = X*Y and B = X xor Y

I'm struggling with this problem, which I found in a competitive programming book that gives no solution for it. Given two integers A and B (each fits in a 64-bit integer type), where A is odd, find a pair of numbers X and Y such that A = X*Y and B = X xor Y.
My approach was to list all divisors of A and try pairing numbers under sqrt(A) with numbers over sqrt(A) that multiply up to A and see if their xor is equal to B. But I don't know if that's efficient enough.
What would be a good solution/algorithm to this problem?
You know that at least one factor is <= sqrt(A). Let's make that one X.
The length of X in bits will be about half the length of A.
The upper bits of X, therefore -- the ones higher in value than sqrt(A) -- are all 0, and the corresponding bits in B must have the same value as the corresponding bits in Y.
Knowing the upper bits of Y gives you a pretty small range for the corresponding factor X = A/Y. Calculate Xmin and Xmax corresponding to the largest and smallest possible values for Y, respectively. Remember that Xmax must also be <= sqrt(A).
Then just try all the possible Xs between Xmin and Xmax. There won't be too many, so it won't take very long.
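A rough Ruby sketch of that range search, to make the steps concrete. The name solve_by_range is made up, and the code assumes Ruby 2.5+ for Integer.sqrt and a positive odd A. How many candidates end up in the range depends on the inputs: if B has no set bits above the width of sqrt(A), the high bits of Y tell you nothing and the range can be large.

```ruby
# Returns [x, y] with x * y == a and (x ^ y) == b, or nil if none is found.
# Assumes a is a positive odd integer and b is non-negative.
def solve_by_range(a, b)
  root = Integer.sqrt(a)       # X is taken to be the factor <= sqrt(A)
  k = root.bit_length          # X fits in k bits
  low_mask = (1 << k) - 1

  # Bits of Y above position k-1 must match B, since X is zero there.
  y_high = b & ~low_mask
  y_min = [y_high, 1].max
  y_max = y_high | low_mask

  # Largest Y gives smallest X and vice versa; X must also stay <= sqrt(A).
  x_min = [(a + y_max - 1) / y_max, 1].max
  x_max = [a / y_min, root].min

  (x_min..x_max).each do |x|
    next unless (a % x).zero?
    y = a / x
    return [x, y] if (x ^ y) == b
  end
  nil
end

p solve_by_range(15, 6)   # [3, 5]
p solve_by_range(15, 5)   # nil
```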
The other straightforward way to solve this problem relies on the fact that the lower n bits of XY and X xor Y depend only on the lower n bits of X and Y. Therefore, you can use the possible answers for the lower n bits to restrict the possible answers for the lower n+1 bits, until you're done.
I've worked out that, unfortunately, there can be more than one possibility for a single n. I don't know how often there will be a lot of possibilities, but it's probably not too often if at all, so this may be fine in a competitive context. Probabilistically, there will only be a few possibilities, since a solution for n bits will provide either 0 or two solutions for n+1 bits, with equal probability.
It seems to work out pretty well for random input. Here's the code I used to test it:
// Requires: import java.util.ArrayList; import java.util.List; import java.util.Random;
public static void solve(long A, long B)
{
    List<Long> sols = new ArrayList<>();
    List<Long> prevSols = new ArrayList<>();
    sols.add(0L);
    long tests=0;
    System.out.print("Solving "+A+","+B+"... ");
    // Extend the candidate values of X one bit at a time, lowest bit first
    for (long bit=1; (A/bit)>=bit; bit<<=1)
    {
        tests += sols.size();
        {
            List<Long> t = prevSols;
            prevSols = sols;
            sols = t;
        }
        final long mask = bit|(bit-1);
        sols.clear();
        for (long prevx : prevSols)
        {
            // Try the new bit of X cleared
            long prevy = (prevx^B) & mask;
            if ((((prevx*prevy)^A)&mask) == 0)
            {
                sols.add(prevx);
            }
            // Try the new bit of X set
            long x = prevx | bit;
            long y = (x^B)&mask;
            if ((((x*y)^A)&mask) == 0)
            {
                sols.add(x);
            }
        }
    }
    tests += sols.size();
    {
        List<Long> t = prevSols;
        prevSols = sols;
        sols = t;
    }
    sols.clear();
    // Check the surviving candidates against the full values of A and B
    for (long testx: prevSols)
    {
        if (A/testx >= testx)
        {
            long testy = B^testx;
            if (testx * testy == A)
            {
                sols.add(testx);
            }
        }
    }
    System.out.println("" + tests + " checks -> X=" + sols);
}

public static void main(String[] args)
{
    Random rand = new Random();
    for (int range=Integer.MAX_VALUE; range > 32; range -= (range>>5))
    {
        long A = rand.nextLong() & Long.MAX_VALUE;
        long X = (rand.nextInt(range)) + 2L;
        X|=1;
        long Y = A/X;
        if (Y==0)
        {
            Y = rand.nextInt(65536);
        }
        Y|=1;
        solve(X*Y, X^Y);
    }
}
You can see the results here: https://ideone.com/cEuHkQ
Looks like it usually only takes a couple thousand checks.
Here's a simple recursion that observes the rules we know: (1) the least significant bits of both X and Y are set, since only odd factors yield an odd product; (2) if we set X to have the highest set bit of B, Y cannot be greater than sqrt(A); and (3) set bits in X or Y according to the current bit in B.
The following Python code resulted in under 300 iterations for all but one of the random pairs I picked from Matt Timmermans' example code. But the first one took 231,199 iterations :)
from math import sqrt

def f(A, B):
    i = 64
    while not ((1<<i) & B):
        i = i - 1
    X = 1 | (1 << i)
    sqrtA = int(sqrt(A))
    j = 64
    while not ((1<<j) & sqrtA):
        j = j - 1
    if (j > i):
        i = j + 1
    memo = {"it": 0, "stop": False, "solution": []}

    def g(b, x, y):
        memo["it"] = memo["it"] + 1
        if memo["stop"]:
            return []
        if y > sqrtA or y * x > A:
            return []
        if b == 0:
            if x * y == A:
                memo["solution"].append((x, y))
                memo["stop"] = True
                return [(x, y)]
            else:
                return []
        bit = 1 << b
        if B & bit:
            return g(b - 1, x, y | bit) + g(b - 1, x | bit, y)
        else:
            return g(b - 1, x | bit, y | bit) + g(b - 1, x, y)

    g(i - 1, X, 1)
    return memo

vals = [
    (6872997084689100999, 2637233646), # 1048 checks with Matt's code
    (3461781732514363153, 262193934464), # 8756 checks with Matt's code
    (931590259044275343, 5343859294), # 4628 checks with Matt's code
    (2390503072583010999, 22219728382), # 5188 checks with Matt's code
    (412975927819062465, 9399702487040), # 8324 checks with Matt's code
    (9105477787064988985, 211755297373604352), # 3204 checks with Matt's code
    (4978113409908739575,67966612030), # 5232 checks with Matt's code
    (6175356111962773143,1264664368613886), # 3756 checks with Matt's code
    (648518352783802375, 6) # B smaller than sqrt(A)
]

for A, B in vals:
    memo = f(A, B)
    [(x, y)] = memo["solution"]
    print("x, y: %s, %s" % (x, y))
    print("A: %s" % A)
    print("x*y: %s" % (x * y))
    print("B: %s" % B)
    print("x^y: %s" % (x ^ y))
    print("%s iterations" % memo["it"])
    print("")
Output:
x, y: 4251585939, 1616572541
A: 6872997084689100999
x*y: 6872997084689100999
B: 2637233646
x^y: 2637233646
231199 iterations
x, y: 262180735447, 13203799
A: 3461781732514363153
x*y: 3461781732514363153
B: 262193934464
x^y: 262193934464
73 iterations
x, y: 5171068311, 180154313
A: 931590259044275343
x*y: 931590259044275343
B: 5343859294
x^y: 5343859294
257 iterations
x, y: 22180179939, 107776541
A: 2390503072583010999
x*y: 2390503072583010999
B: 22219728382
x^y: 22219728382
67 iterations
x, y: 9399702465439, 43935
A: 412975927819062465
x*y: 412975927819062465
B: 9399702487040
x^y: 9399702487040
85 iterations
x, y: 211755297373604395, 43
A: 9105477787064988985
x*y: 9105477787064988985
B: 211755297373604352
x^y: 211755297373604352
113 iterations
x, y: 68039759325, 73164771
A: 4978113409908739575
x*y: 4978113409908739575
B: 67966612030
x^y: 67966612030
69 iterations
x, y: 1264664368618221, 4883
A: 6175356111962773143
x*y: 6175356111962773143
B: 1264664368613886
x^y: 1264664368613886
99 iterations
x, y: 805306375, 805306369
A: 648518352783802375
x*y: 648518352783802375
B: 6
x^y: 6
59 iterations

My code results in an infinite loop

puts "enter a number"
x = gets.chomp.to_i
y = 0
while x != 1
  y += 1
  if x % 2 == 0
    x = x / 2
  else
    x = x*3 + 1
  end
  print "#{x} "
end
puts "\nThe number of sequence is #{y+1}"
Hi, if I key in a negative number or 0, I get an infinite loop. How do I avoid entering the loop if my number is 0 or negative?
You can use x > 1 as the loop condition, i.e.:
puts "enter a number"
x = gets.chomp.to_i
# if you want to consider negative as positive then x = gets.chomp.to_i.abs
y = 0
while (x > 1)
  y += 1
  if x % 2 == 0
    x = x / 2
  else
    x = x*3 + 1
  end
  print "#{x} "
end
puts "\nThe number of sequence is #{y+1}"
Hope it helps
To answer your question:
Your code works perfectly well and does exactly what it is told to do:
while x is not 1, run this code block.
If you set x to 0, it stays 0 (0 / 2 is 0), and if you set x to a negative number, it stays negative after every step. Either way x never becomes 1, so the loop runs forever.
So the code does what it says, but there is a flaw in the logic behind it :)
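Another option is to reject bad input before the loop starts. A minimal sketch, with a made-up method name (collatz_sequence) and the input passed as an argument instead of read with gets:

```ruby
# Returns the sequence from start down to 1.
# Raises ArgumentError for 0 or negative input instead of looping forever.
def collatz_sequence(start)
  unless start.is_a?(Integer) && start > 0
    raise ArgumentError, "start must be a positive integer"
  end

  seq = [start]
  x = start
  while x != 1
    x = x.even? ? x / 2 : x * 3 + 1
    seq << x
  end
  seq
end

seq = collatz_sequence(6)
puts seq.join(" ")                              # 6 3 10 5 16 8 4 2 1
puts "The number of sequence is #{seq.size}"    # 9
```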

Multiplying with divide and conquer

Below I've posted the code for a non-working "divide and conquer" multiplication method in Ruby (with debug prints). I cannot tell if it's broken code, or a quirk in Ruby such as the left-shift (<<) operator not pushing bits into the bit bucket; this is unexpected compared to similar operations in C++.
Is it broken code (doesn't match the original algorithm) or unexpected behavior?
Pseudo code for original algorithm
def multiply(x,y,n, level)
  #print "Level #{level}\n"
  if n == 1
    #print "\tx[#{x.to_s(2)}][#{y.to_s(2)}]\n\n"
    return x*y
  end
  mask = 2**n - 2**(n/2)
  xl = x >> (n / 2)
  xr = x & ~mask
  yl = y >> (n / 2)
  yr = y & ~mask
  print " #{n} | x = #{x.to_s(2)} = L[#{xl.to_s(2)}][#{xr.to_s(2)}]R \n"
  print " #{n} | y = #{y.to_s(2)} = L[#{yl.to_s(2)}][#{yr.to_s(2)}]R \n"
  #print "\t[#{xl.to_s(2)}][#{yr.to_s(2)}]\n"
  #print "\t[#{xr.to_s(2)}][#{yr.to_s(2)}]\n"
  #print "\t([#{xl.to_s(2)}]+[#{xr.to_s(2)}])([#{yl.to_s(2)}]+[#{yr.to_s(2)}])\n\n"
  p1 = multiply( xl, yl, n/2, level+1)
  p2 = multiply( xr, yr, n/2, level+1)
  p3 = multiply( xl+xr, yl+yr, n/2, level+1)
  return p1 * 2**n + (p3 - p1 - p2) * 2**(n/2) + p2
end

x = 21
y = 22
print "x = #{x} = #{x.to_s(2)}\n"
print "y = #{y} = #{y.to_s(2)}\n"
print "\nDC_multiply\t#{x}*#{y} = #{multiply(x,y,8, 1)} \nregular\t#{x}*#{y} = #{x*y}\n\n "
I am not familiar with the divide and conquer algorithm, but I don't think it contains parts you can't do in Ruby.
Here is a quick attempt:
def multiplb(a,b)
  #Break recursion when a or b has one digit
  if a < 10 || b < 10
    a * b
  else
    #Max number of digits of a and b
    n = [a.to_s.length, b.to_s.length].max
    # Steps to split numbers to high and low digits sub-numbers
    # (1) to_s.split('') => Converting digits to string then arrays to ease splitting numbers digits
    # (2) each_slice => Splitting both numbers to high(left) and low(right) digits groups
    # (3) to_a , map, join, to_i => Simply returning digits to numbers
    # Caveat: this split is only right when both numbers have exactly n digits and n is even
    al, ar = a.to_s.split('').each_slice(n/2).to_a.map(&:join).map(&:to_i)
    bl, br = b.to_s.split('').each_slice(n/2).to_a.map(&:join).map(&:to_i)
    #Recursion
    p1 = multiplb(al, bl)
    p2 = multiplb(al + ar, bl + br)
    p3 = multiplb(ar, br)
    p1 * (10**n) + (p2 - p1 - p3) * (10**(n/2)) + p3
  end
end

#Test
puts multiplb(1980, 2315)
# => 4583700 yeah that's correct :)
Here are some references to further explain part of the code:
Finding max of numbers => How do you find a min / max with Ruby?
Splitting an array in half => Splitting an array into equal parts in ruby
Turning a fixnum into array => Turning long fixed number to array Ruby
Hope it helps!
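For the binary version in the question, one likely culprit (my reading of the code, not something either post states) is the mask. Ruby integers are arbitrary precision, so ~mask has every bit above position n set, and x & ~mask keeps those high bits whenever xl + xr no longer fits in n/2 bits. A minimal sketch that masks only the low half instead, under a made-up name (dc_multiply) and assuming non-negative inputs:

```ruby
# Divide-and-conquer multiply on bit halves.
# n is the nominal bit width; it only controls the recursion depth here.
def dc_multiply(x, y, n)
  return x * y if n <= 1

  half = n / 2
  low_mask = (1 << half) - 1   # keeps ONLY the low `half` bits

  xl, xr = x >> half, x & low_mask
  yl, yr = y >> half, y & low_mask

  p1 = dc_multiply(xl, yl, half)
  p2 = dc_multiply(xr, yr, half)
  p3 = dc_multiply(xl + xr, yl + yr, half)

  # Shift by 2 * half instead of n so that odd n also recombines correctly.
  (p1 << (2 * half)) + ((p3 - p1 - p2) << half) + p2
end

puts dc_multiply(21, 22, 8)   # 462
```

Because xl is whatever remains after shifting off the low half, x == (xl << half) + xr holds at every level even when the sums carry into an extra bit.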

How does ConstantTimeByteEq work?

In Go's cryptography library, I found this function ConstantTimeByteEq. What does it do, and how does it work?
// ConstantTimeByteEq returns 1 if x == y and 0 otherwise.
func ConstantTimeByteEq(x, y uint8) int {
	z := ^(x ^ y)
	z &= z >> 4
	z &= z >> 2
	z &= z >> 1
	return int(z)
}
x ^ y is x XOR y, where the result is 1 when the arguments are different and 0 when the arguments are the same:
x = 01010011
y = 00010011
x ^ y = 01000000
^(x ^ y) negates this, i.e., you get 0 when the arguments are different and 1 otherwise:
^(x ^ y) = 10111111 => z
Then we start shifting z to the right for masking its bits by itself. A shift pads the left side of the number with zero bits:
z >> 4 = 00001011
With the goal of propagating any zeros in z to the result, start ANDing:
z = 10111111
z >> 4 = 00001011
z & (z >> 4) = 00001011
Also fold the new value to move any zero to the right:
z = 00001011
z >> 2 = 00000010
z & (z >> 2) = 00000010
Further fold to the last bit:
z = 00000010
z >> 1 = 00000001
z & (z >> 1) = 00000000
On the other hand, if you have x == y initially, it goes like this:
z = 11111111
z & (z >> 4) = 00001111
z & (z >> 2) = 00000011
z & (z >> 1) = 00000001
So it really returns 1 when x == y, 0 otherwise.
Generally, an ordinary comparison can take less time in some cases than in others (for example, if both x and y are zero). This function tries to make all calls take the same time regardless of the values of its inputs. This way, an attacker can't use timing-based attacks.
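To play with the folding steps outside Go, here is a small Ruby port of the same bit logic. It is only meant to illustrate the arithmetic: the method name is made up, the explicit & 0xff stands in for Go's uint8 width, and nothing here claims constant-time behavior in a Ruby interpreter.

```ruby
# Returns 1 if the bytes x and y are equal, 0 otherwise, without branching.
# Illustration of the bit folding only; NOT a timing-safe primitive in Ruby.
def constant_time_byte_eq(x, y)
  z = ~(x ^ y) & 0xff   # all ones iff x == y (masked to 8 bits)
  z &= z >> 4           # fold the high nibble onto the low nibble
  z &= z >> 2           # fold 4 bits down to 2
  z &= z >> 1           # fold 2 bits down to 1
  z
end

p constant_time_byte_eq(0b01010011, 0b00010011)   # 0
p constant_time_byte_eq(0b01010011, 0b01010011)   # 1
```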
It does exactly what the documentation says: It checks if x and y are equal. From a functional point it is just x == y, dead simple.
Doing x == y in this cryptic bit-fiddling way prevents timing side-channel attacks on algorithms: a plain x == y may get compiled to code which performs faster if x == y and slower if x != y (or the other way around) due to branch prediction in CPUs. An attacker can use this to learn something about the data handled by the cryptographic routines and thus compromise security.

Optimizing the damerau version of the levenshtein algorithm to better than O(n*m)

Here is the algorithm (in ruby)
#http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
def self.dameraulevenshtein(seq1, seq2)
  oneago = nil
  thisrow = (1..seq2.size).to_a + [0]
  seq1.size.times do |x|
    twoago, oneago, thisrow = oneago, thisrow, [0] * seq2.size + [x + 1]
    seq2.size.times do |y|
      delcost = oneago[y] + 1
      addcost = thisrow[y - 1] + 1
      subcost = oneago[y - 1] + ((seq1[x] != seq2[y]) ? 1 : 0)
      thisrow[y] = [delcost, addcost, subcost].min
      if (x > 0 and y > 0 and seq1[x] == seq2[y-1] and seq1[x-1] == seq2[y] and seq1[x] != seq2[y])
        thisrow[y] = [thisrow[y], twoago[y-2] + 1].min
      end
    end
  end
  return thisrow[seq2.size - 1]
end
My problem is that with a seq1 of length 780, and seq2 of length 7238, this takes about 25 seconds to run on an i7 laptop. Ideally, I'd like to get this reduced to about a second, since it's running as part of a webapp.
I found that there is a way to optimize the vanilla levenshtein distance such that the runtime drops from O(n*m) to O(n + d^2) where n is the length of the longer string, and d is the edit distance. So, my question becomes, can the same optimization be applied to the damerau version I have (above)?
Yes, the optimization can be applied to the Damerau version. Here is some Haskell code to do this (I don't know Ruby):
distd :: Eq a => [a] -> [a] -> Int
distd a b
    = last (if lab == 0 then mainDiag
            else if lab > 0 then lowers !! (lab - 1)
                 else{- < 0 -}   uppers !! (-1 - lab))
    where mainDiag = oneDiag a b (head uppers) (-1 : head lowers)
          uppers = eachDiag a b (mainDiag : uppers) -- upper diagonals
          lowers = eachDiag b a (mainDiag : lowers) -- lower diagonals
          eachDiag a [] diags = []
          eachDiag a (bch:bs) (lastDiag:diags) = oneDiag a bs nextDiag lastDiag : eachDiag a bs diags
              where nextDiag = head (tail diags)
          oneDiag a b diagAbove diagBelow = thisdiag
              where doDiag [_] b nw n w = []
                    doDiag a [_] nw n w = []
                    doDiag (apr:ach:as) (bpr:bch:bs) nw n w = me : (doDiag (ach:as) (bch:bs) me (tail n) (tail w))
                        where me = if ach == bch then nw else if ach == bpr && bch == apr then nw else 1 + min3 (head w) nw (head n)
                    firstelt = 1 + head diagBelow
                    thisdiag = firstelt : doDiag a b firstelt diagAbove (tail diagBelow)
          lab = length a - length b
          min3 x y z = if x < y then x else min y z

distance :: [Char] -> [Char] -> Int
distance a b = distd ('0':a) ('0':b)
The code above is an adaptation of this code.
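For readers who want something in Ruby, here is a rough sketch of a simpler, related idea. It is not the O(n + d^2) diagonal algorithm and not a port of the Haskell above. It only fills the cells within k of the main diagonal and doubles k until the answer fits, which is worthwhile when the edit distance is small compared to the string lengths. The method names are made up, and it computes the same restricted (adjacent transposition) distance as the recurrence in the question.

```ruby
# Distance restricted to the band |i - j| <= k.
# Returns the distance if it is <= k, otherwise nil.
def banded_osa(a, b, k)
  n, m = a.size, b.size
  return nil if (n - m).abs > k

  inf = k + 1                      # stands for "more than k"
  prev2 = nil
  prev = Array.new(m + 1) { |j| j <= k ? j : inf }

  (1..n).each do |i|
    # Allocating a full row keeps the sketch simple; only the band is computed.
    cur = Array.new(m + 1, inf)
    cur[0] = i if i <= k
    lo = [1, i - k].max
    hi = [m, i + k].min
    (lo..hi).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      best = [prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost].min
      # Adjacent transposition, as in the question's code.
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        best = [best, prev2[j - 2] + 1].min
      end
      cur[j] = [best, inf].min
    end
    prev2, prev = prev, cur
  end

  prev[m] <= k ? prev[m] : nil
end

# Doubles the band width until the distance fits inside it.
def osa_distance(a, b)
  k = [1, (a.size - b.size).abs].max
  loop do
    d = banded_osa(a, b, k)
    return d if d
    k *= 2
  end
end

p osa_distance("kitten", "sitting")   # 3
p osa_distance("ab", "ba")            # 1
```

If the two sequences really are far apart, the band grows to cover the whole table and this ends up no faster than the full O(n*m) version.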

Resources