Ruby—integer substring (subbit?) - ruby

Substrings work where "hello"[0..2] returns "hel"
Is there an integer equivalent, that returns the sub-bits, as in 9[1..3] returns 4 or "100"?
Context:
I'm trying to do a genetic algorithm, and there is a cross over stage where you split two chromosomes in two, then swap halves, e.g. "10101010" crossed with "00000000" at i = 4 would be "00001010", "10100000"
However, I'm doing this a few million times, so using strings is inefficient, so is there a bitwise way to do this?

You might consider refining the class Fixnum to permit Fixnum#[]'s argument to be either a bit index or a range of bit indices.
module StabBitRange
def [](arg)
case arg
when Range
mask = arg.reduce(0) { |n, idx| n |= (1 << idx) }
(self & mask) >> arg.first
else
super
end
end
end
module StabBits
refine Fixnum do
prepend StabBitRange
end
end
See Refinements and Module#prepend. Prepend StabBitRange moves the methods contained therein (here just one) before the methods with the same names in Fixnum in the ancestor chain. The alternative (commonly used before Prepend was introduced in Ruby v2.0) is to alias Fixnum#[] and then redefine Fixnum[] to be the new method that takes either a bit index or a range of bit indices. (See the edit history of this answer to see how that is done.) I expect readers will agree that Prepend is a more sensible way of doing that.
To make this code available for use we need only invoke the keyword using.
using StabBits
n = 682
n.to_s(2) #=> "1010101010"
n[0] #=> 0
n[1] #=> 1
n[2] #=> 0
n[0..7] #=> 170 170.to_s(2) => "10101010"
n[1..7] #=> 85 85.to_s(2) => "1010101"
n[2..6] #=> 10 10.to_s(2) => "1010"
When StabBitRange#\[\] is called and the argument is not a range, super invokes Fixnum#[] and passes the argument. That method handles argument errors as well as returning the desired bit when there are no errors.
When this code is run in IRB, the following exception is raised: RuntimeError: main.using is permitted only at top level. That's because IRB is running at the top level, but code run within IRB is running at a higher level from the perspective of the Ruby parser. To run this in IRB it is necessary to enclose using StabBits and the code following in a module.

Related

Why would #each_with_object and #inject switch the order of block parameters?

#each_with_object and #inject can both be used to build a hash.
For example:
matrix = [['foo', 'bar'], ['cat', 'dog']]
some_hash = matrix.inject({}) do |memo, arr|
memo[arr[0]] = arr
memo # no implicit conversion of String into Integer (TypeError) if commented out
end
p some_hash # {"foo"=>["foo", "bar"], "cat"=>["cat", "dog"]}
another_hash = matrix.each_with_object({}) do |arr, memo|
memo[arr[0]] = arr
end
p another_hash # {"foo"=>["foo", "bar"], "cat"=>["cat", "dog"]}
One of the key differences between the two is #each_with_object keeps track of memo through the entire iteration while #inject sets memo equal to the value returned by the block on each iteration.
Another difference is the order or the block parameters.
Is there some intention being communicated here? It doesn't make sense to reverse the block parameters of two similar methods.
They have a different ancestry.
each_with_object has been added to Ruby 1.9 in 2007
inject goes back to Smalltalk in 1980
I guess if the language were designed with both methods from the begin they would likely expect arguments in the same order. But this is not how it happened. inject has been around since the begin of Ruby whereas each_with_object has been added 10 years later only.
inject expects arguments in the same order as Smalltalk's inject:into:
collection inject: 0 into: [ :memo :each | memo + each ]
which does a left fold. You can think of the collection as a long strip of paper that is folded up from the left and the sliding window of the fold function is always the part that has already been folded plus the next element of the remaining paper strip
# (memo = 0 and 1), 2, 3, 4
# (memo = 1 and 2), 3, 4
# (memo = 3 and 3), 4
# (memo = 6 and 4)
Following the Smalltalk convention made sense back then since all the initial methods in Enumerable are taken from Smalltalk and Matz did not want to confuse people who are familiar with Smalltalk.
Nor could anyone have the foresight to know that would happen in 2007 when each_with_object was introduced to Ruby 1.9 and the order of argument reflects the lexical order of the method name, which is each ... object.
And hence these two methods expect arguments in different orders for historical reasons.

What does the bracket operator do to a FixNum in Ruby?

Coming from Python I find the following behaviour very surprising:
irb(main):211:0> x= 33
=> 33
irb(main):212:0> x[0]
=> 1
irb(main):213:0> x[1]
=> 0
irb(main):214:0> x[2]
=> 0
Is there a rationale/philosophy for not raising an error in this example?
The bracket operator gives you the nth bit of the binary representation:
http://ruby-doc.org/core-2.1.2/Fixnum.html#method-i-5B-5D
You might be a little confused about what this is doing internally, but that's normal when dealing with Ruby because it's quite unlike other scripting languages. Unless you've used SmallTalk it might even seem insane.
When Ruby sees the following code:
x = 6
x[1]
What it's actually doing is this:
x.send(:[], 6) # Send :[] method call to x with arguments [ 6 ]
The x object is free to interpret that however it wants, and the behaviour is typically (though not always) defined by the class x belongs to if x is a normal instance.
In this case it's returning the bit at a given index, equivalent to x & (1 << 6) >> 6.
Sometimes the [] method does several things:
string = "brackets"
# Retrieve a single character
string[1]
# => "r"
# Retrieve a substring
string[5,2]
# => "et"
# Perform a pattern match
string[/[^aeiou]+/]
# => "br"
This also does some pretty crazy things since you can apply it to a Proc as well:
fn = lambda { |x| x + 1 }
# Conventional (explicit) call
fn.call(2)
# => 3
# Square bracket method
fn[5]
# => 6
Since Ruby leans very heavily on Duck Typing, this means that you can write a Proc to fill in where you'd normally have a Hash or an Array and the method receiving your object is none the wiser.
It's this flexibility in leaving the meaning of x[...] for your own class instance x up to you that makes it pretty powerful.
For example:
class Bracketeer
def [](i)
"%d brackets" % i
end
end
bracketeer = Bracketeer.new
bracketeer[6]
# => "6 brackets"
This simple notation often comes in handy when you're trying to create a minimal interface for a class of yours. In many cases you can use something simple like [] to replace what would be a more verbose method like find_by_id or cache_fetch.
Certainly. You'll find that the manual is quite illuminating.
This is returning the binary bit for the bit position value as a zero or one.
It is returning the n'th bit as rightfully observed by #msergeant.
What this means is, for the number 33, its binary representation is:
Index : [7][6][5][4] [3][2][1][0]
Bits : 0 0 1 0 0 0 0 1
Which explains the output:
irb(main):212:0> x[0]
=> 1
irb(main):213:0> x[1]
=> 0
irb(main):214:0> x[2]
=> 0

ruby bitwise product

I've got an assignment where I have had to write a function, which takes two integers and returns it's bitwise product. Bitwise product equals to bitwise sum (&) of all numbers between these two.
For example: bitwise product of 251 and 253 is:
irb(main):164:0> 251 & 252
=> 248
irb(main):165:0> 252 & 253
=> 252
irb(main):166:0> 248 & 252
=> 248 # this a bitwise & of these two between 251 & 253
My function:
def solution(m,n)
(m..n).to_a.inject{|sum, x| sum &= x}
end
Test:
irb(main):160:0> (251..253).to_a.inject{|sum, x| sum &= x}
=> 248 #same result
The program's evaluation:
correctness 100%
performance 0%
Can anyone, please explain what is with performance of this function? Thanks in advance!
EDIT
Since performance is undefined here, can you provide an analysis of the function for, for example, really big inputs and criticise/suggest better/more efficient solution?
p.s. thx for comments guys, i will follow the recommendations!
Following up on my comment, you can experiment using Benchmark:
def solution(m,n)
(m..n).to_a.inject{|sum, x| sum &= x}
end
def solution2(m,n)
(m..n).reduce(:&)
end
n = 50000
Benchmark.bm(7) do |b|
b.report("orig:") { n.times do; solution(128,253); end }
b.report("new: ") { n.times do; solution2(128,253); end }
end
Here's what I get:
user system total real
orig: 1.560000 0.000000 1.560000 ( 1.557156)
new: 0.640000 0.000000 0.640000 ( 0.634063)
You appear not to have any choice but to do the above, experiment to find a faster algorithm, and run the test until you register on the performance meter.
The most obvious performance problem here is a conversion to array (to_a), it's not necassary. You can call reduce and inject (appear to be the same) an any Enumerable, Range includes it as well. Enumerable is basically everything that can produce a finite (it's arguable, but mostly true) sequence of elements. Range seems to fit, if treated as a set of integers.
In order to iterate over an array with elements in this range, you first create an array and fill it with elements (by iterating over Range). Then you iterate on the resulting array applying your operations. So you create an array for the purpose of iterating on it, fill it, use it, toss it away. And it takes quite a memory allocation to do that if the range is large.
The next one, which is not critical, but reduces the amount of code that you have, requires some knowledge on the internal implementation of operators in Ruby. When you write a&b you are actually doing a.&(b): calling a & method on a number.
The sum you get in the block is not really some sort of accumulator, you don't actually need to assign anything there, sum is an intermediate value, the result of adding a new element should actually be a return value. Since an assignment returns the assigned value, it works even this way. Proof of that is this:
(251..253).inject{|sum, x| sum & x } # => 248, as expected
...and it turns out, this block is trivial: take one value, call a method on it with another value. Since inject takes a pair of values on each iteration, you can just give it a method name and let it handle the situation. Method names in Ruby are typically referenced with symbols like this:
(251..253).inject(:&) # => 248, as expected
All right, about twice as less code, less actions done and less objects made, nevertheless same result.
def solution(m,n)
(m..n).inject(:&)
end
We haven't looked closely at inject. Can you beat inject performance-wise? It's unlikely, it's a C method and is used here exactly for its purpose, you can look it up easily using gems pry and pry-doc (install, launch pry and type in show-source Range#inject):
[11] pry(main)> show-source Range#inject
From: enum.c (C Method):
Owner: Enumerable
Visibility: public
Number of lines: 34
...a bunch of C code here

What is prefered way to loop in Ruby?

Why is each loop preferred over for loop in Ruby? Is there a difference in time complexity or are they just syntactically different?
Yes, these are two different ways of iterating over, But hope this calculation helps.
require 'benchmark'
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
a.each { |x| sum += x }
}
This takes 5.866932 sec
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
for x in a
sum += x
end
}
This takes 6.146521 sec.
Though its not a right way to do the benchmarking, there are some other constraints too. But on a single machine, each seems to be a bit faster than for.
The variable referencing an item in iteration is temporary and does not have significance outside of the iteration. It is better if it is hidden from outside of the iteration. With external iterators, such variable is located outside of the iteration block. In the following, e is useful only within do ... end, but is separated from the block, and written outside of it; it does not look easy to a programmer:
for e in [:foo, :bar] do
...
end
With internal iterators, the block variable is defined right inside the block, where it is used. It is easier to read:
[:foo, :bar].each do |e|
...
end
This visibility issue is not just for a programmer. With respect to visibility in the sense of scope, the variable for an external iterator is accessible outside of the iteration:
for e in [:foo] do; end
e # => :foo
whereas in internal iterator, a block variable is invisible from outside:
[:foo].each do |e|; end
e # => undefined local variable or method `e'
The latter is better from the point of view of encapsulation.
When you want to nest the loops, the order of variables would be somewhat backwards with external iterators:
for a in [[:foo, :bar]] do
for e in a do
...
end
end
but with internal iterators, the order is more straightforward:
[[:foo, :bar]].each do |a|
a.each do |e|
...
end
end
With external iterators, you can only use hard-coded Ruby syntax, and you also have to remember the matching between the keyword and the method that is internally called (for calls each), but for internal iterators, you can define your own, which gives flexibility.
each is the Ruby Way. Implements the Iterator Pattern that has decoupling benefits.
Check also this: "for" vs "each" in Ruby
An interesting question. There are several ways of looping in Ruby. I have noted that there is a design principle in Ruby, that when there are multiple ways of doing the same, there are usually subtle differences between them, and each case has its own unique use, its own problem that it solves. So in the end you end up needing to be able to write (and not just to read) all of them.
As for the question about for loop, this is similar to my earlier question whethe for loop is a trap.
Basically there are 2 main explicit ways of looping, one is by iterators (or, more generally, blocks), such as
[1, 2, 3].each { |e| puts e * 10 }
[1, 2, 3].map { |e| e * 10 )
# etc., see Array and Enumerable documentation for more iterator methods.
Connected to this way of iterating is the class Enumerator, which you should strive to understand.
The other way is Pascal-ish looping by while, until and for loops.
for y in [1, 2, 3]
puts y
end
x = 0
while x < 3
puts x; x += 1
end
# same for until loop
Like if and unless, while and until have their tail form, such as
a = 'alligator'
a.chop! until a.chars.last == 'g'
#=> 'allig'
The third very important way of looping is implicit looping, or looping by recursion. Ruby is extremely malleable, all classes are modifiable, hooks can be set up for various events, and this can be exploited to produce most unusual ways of looping. The possibilities are so endless that I don't even know where to start talking about them. Perhaps a good place is the blog by Yusuke Endoh, a well known artist working with Ruby code as his artistic material of choice.
To demonstrate what I mean, consider this loop
class Object
def method_missing sym
s = sym.to_s
if s.chars.last == 'g' then s else eval s.chop end
end
end
alligator
#=> "allig"
Aside of readability issues, the for loop iterates in the Ruby land whereas each does it from native code, so in principle each should be more efficient when iterating all elements in an array.
Loop with each:
arr.each {|x| puts x}
Loop with for:
for i in 0..arr.length
puts arr[i]
end
In the each case we are just passing a code block to a method implemented in the machine's native code (fast code), whereas in the for case, all code must be interpreted and run taking into account all the complexity of the Ruby language.
However for is more flexible and lets you iterate in more complex ways than each does, for example, iterating with a given step.
EDIT
I didn't come across that you can step over a range by using the step() method before calling each(), so the flexibility I claimed for the for loop is actually unjustified.

Ruby min max assignment operators

When programming ruby I always find myself doing this:
a = [a, b].min
This means compare a and b and store the smallest value in a. I don't like writing the code above as I have to write a twice.
I know that some non-standard dialects of C++ had an operator which did exactly this
a <?= b
Which I find very convenient. But I'm not really interested in the operator as much as I'm in the feature of avoiding repetition. I would also be happy if I could write
a.keep_max(b)
a can be a quite long variable, like my_array[indice1][indice2], and you don't want to write that twice.
I did alot of googling on this and found no result, hopefully this question will pop up and be useful for others aswell.
So, is there any non-repeitive way to express what I want in ruby?
What you would like to do is in fact not possible in ruby (see this question). I think the best you can do is
def max(*args)
args.max
end
a = max a, b
I don't understand your question. You can always do something like this ...
module Comparable
def keep_min(other)
(self <=> other) <= 0 ? self : other
end
def keep_max(other)
(self <=> other) >= 0 ? self : other
end
end
1.keep_min(2)
=> 1
1.keep_max(2)
=> 2
Well, that won't work for all objects with <=> because not all of them are implementing Comparable, so you could monkey-patch Object.
Personally I prefer clarity and tend to avoid monkey-patching. Plus, this clearly is a binary predicate, just like "+", therefore method-chaining doesn't necessarily make sense so I prefer something like this to get rid of that array syntax:
def min(*args)
args.min
end
def max(*args)
args.max
end
min(1, 2)
=> 1
max(1, 2)
=> 2
But hey, I'm also a Python developer :-)
You can define your own method for it:
class Object
def keep_max(other)
[self, other].max
end
end
a = 3
b = 7
puts a.keep_max(b)
But you should be careful defining methods on Object as it can have unpredictable behaviour (for example, if objects cannot be compared).
def keep_max(var, other, binding)
eval "#{var} = [#{var}, #{other}].max", binding
end
a = 5
b = 78
keep_max(:a, :b, binding)
puts a
#=> 78
This basically does what you want. Take a look at Change variable passed in a method

Resources