Slow Ruby for simple array operations - ruby

I created a simple insertion sort implementation in Ruby, following the pseudocode from Cormen's "Introduction to Algorithms":
def sort_insert(array)
(1 ... array.length).each do |item_index|
key = array[item_index]
i = item_index - 1
while i >= 0 && array[i] > key do
array[i + 1] = array[i]
i -= 1
end
array[i + 1] = key
end
array
end
It works, but it performs very slowly. For an array of about 20k elements, array = ((0..10_000).to_a * 2).shuffle, it takes about 20 seconds to sort. I measure the time of this method call only, with no data preparation. In JavaScript, a very similar solution takes about 1 second. Why is Ruby (v. 2.2.2p95) so slow here?
Edit:
JS version of this sorting, which I use:
function SortMethods() {
}
SortMethods.prototype.sortInsert = function(array) {
for(let itemIndex = 1; itemIndex < array.length; itemIndex++) {
let key = array[itemIndex];
let i = itemIndex - 1;
while( i >= 0 && array[i] > key) {
array[i + 1] = array[i];
i--;
}
array[i + 1] = key;
}
return array;
}

I'm going to disagree with the premise of your question—Ruby is turning in more than respectable performance on my machine.
To illustrate this, I created a file with 100k random numbers:
$ ruby -e '100_000.times {printf "%22.20f\n", rand}' > rand100k.csv
I then sorted this with the system sort utility, saving the results for later comparison (as a correctness check):
$ time sort -n < rand100k.csv > foo
real 0m0.067s
user 0m0.056s
sys 0m0.011s
I wrote a quicksort algorithm (which flips over to insertion sort when the sublist size gets small enough) in pure ruby, ran it, saved the results, and diff'ed the system sort output and ruby output files:
$ time ruby quicksort_w_insertion.rb < rand100k.csv > bar
real 0m0.546s
user 0m0.537s
sys 0m0.008s
$ diff foo bar
$
As you can see, both sorting runs produce identical output, and very quickly. In my opinion, a pure Ruby program which is only 8-10 times slower than the corresponding system utility is doing mighty darn fine speed-wise.
These runs were made with ruby 2.2.2p95 (2015-04-13 revision 50293) [x86_64-darwin14] on a MacBook Pro.
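The script quicksort_w_insertion.rb itself is not shown in this answer. The following is only a sketch of what such a hybrid could look like, assuming a middle-element pivot and a cutoff of 16 elements. Both choices, and the names INSERTION_CUTOFF, insertion_sort! and hybrid_quicksort!, are illustrative assumptions and are not taken from the original script.

```ruby
# Hypothetical cutoff: sublists shorter than this are finished by insertion sort.
INSERTION_CUTOFF = 16

# Sorts a[lo..hi] in place with insertion sort (same loop as in the question).
def insertion_sort!(a, lo, hi)
  (lo + 1..hi).each do |j|
    key = a[j]
    i = j - 1
    while i >= lo && a[i] > key
      a[i + 1] = a[i]
      i -= 1
    end
    a[i + 1] = key
  end
end

# Quicksort that hands small sublists to insertion sort.
def hybrid_quicksort!(a, lo = 0, hi = a.length - 1)
  while hi - lo >= INSERTION_CUTOFF
    pivot = a[lo + (hi - lo) / 2]   # assumed pivot choice: middle element
    i = lo
    j = hi
    while i <= j
      i += 1 while a[i] < pivot
      j -= 1 while a[j] > pivot
      if i <= j
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
      end
    end
    hybrid_quicksort!(a, lo, j)     # recurse into the left part
    lo = i                          # keep looping on the right part
  end
  insertion_sort!(a, lo, hi)
  a
end
```

The point of the design is that the quadratic insertion sort only ever runs on short sublists, where its low overhead helps, while the overall work stays around n log n on typical input.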

Related

Deleting duplicate characters from array

Got asked this question in an interview and couldn't find a solution.
Given an array of characters, delete all the characters that are repeated k or more times consecutively, and add '#' at the end of the array for every deleted character.
Example:
"xavvvarrrt"->"xaat######"
O(1) memory and O(n) time without writing to the same cell twice.
The tricky part for me was that I am not allowed to overwrite a cell more than once, which means I need to know exactly where each character will move after deleting the duplicates.
The best I could come up with was to iterate over the array once, saving the number of occurrences of each character in a map, and then iterate again: if the current character is not deleted, move it to its new position according to an offset; if it is deleted, update the offset variable.
The problem with this approach is that it won't work in this scenario:
"aabbaa" because 'a' appears at two different places.
I then thought about saving an array of occurrences in the map, but then it would no longer use O(1) memory.
Thanks
This seems to work with your examples, although it seems a little complicated to me :) I wonder if we could simplify it. The basic idea is to traverse from left to right, keeping a record of how many places in the current block of duplicates are still available to replace, while the right pointer looks for more blocks to shift over.
JavaScript code:
function f(str){
str = str.split('')
let r = 1
let l = 0
let to_fill = 0
let count = 1
let fill = function(){
while (count > 0 && (to_fill > 0 || l < r)){
str[l] = str[r - count]
l++
count--
to_fill--
}
}
for (; r<str.length; r++){
if (str[r] == str[r-1]){
count++
} else if (count < 3){
if (to_fill)
fill()
count = 1
if (!to_fill)
l = r
} else if (!to_fill){
to_fill = count
count = 1
} else {
count = 1
}
}
if (count < 3)
fill()
while (l < str.length)
str[l++] = '#'
return str.join('')
}
var str = "aayyyycbbbee"
console.log(str)
console.log(f(str)) // "aacee#######"
str = "xavvvarrrt"
console.log(str)
console.log(f(str)) // "xaat######"
str = "xxaavvvaarrrbbsssgggtt"
console.log(str)
console.log(f(str))
Here is a version similar to the other JS answer, but a bit simpler:
function repl(str) {
str = str.split("");
var count = 1, write = 0;
for (var read = 0; read < str.length; read++) {
if (str[read] == str[read+1])
count++;
else {
if (count < 3) {
for (var i = 0; i < count; i++)
str[write++] = str[read];
}
count = 1;
}
}
while (write < str.length)
str[write++] = '#';
return str.join("");
}
function demo(str) {
console.log(str + " ==> " + repl(str));
}
demo("a");
demo("aa");
demo("aaa");
demo("aaaaaaa");
demo("aayyyycbbbee");
demo("xavvvarrrt");
demo("xxxaaaaxxxaaa");
demo("xxaavvvaarrrbbsssgggtt");
/*
Output:
a ==> a
aa ==> aa
aaa ==> ###
aaaaaaa ==> #######
aayyyycbbbee ==> aacee#######
xavvvarrrt ==> xaat######
xxxaaaaxxxaaa ==> #############
xxaavvvaarrrbbsssgggtt ==> xxaaaabbtt############
*/
The idea is to keep the current index for reading the next character and one for writing, as well as the number of consecutive repeated characters. If the following character is equal to the current, we just increase the counter. Otherwise we copy all characters below a count of 3, increasing the write index appropriately.
At the end of reading, anything from the current write index up to the end of the array is the number of repeated characters we have skipped. We just fill that with hashes now.
As we only store 3 values, memory consumption is O(1); we read each array cell twice, so O(n) time (the extra reads on writing could be eliminated by another variable); and each write index is accessed exactly once.
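For comparison, the read/write-index idea described above could be sketched in Ruby roughly as follows. The method name collapse_runs and the parameter k (defaulting to 3, as hard-coded in the JavaScript answers) are assumptions made for this illustration.

```ruby
# Removes every run of k or more equal consecutive characters from the array,
# compacts the survivors to the front, and pads the tail with '#'.
# Works in place, in a single left-to-right pass.
def collapse_runs(chars, k = 3)
  write = 0   # next cell to write a surviving character into
  count = 1   # length of the run that ends at the current read index
  chars.each_index do |read|
    if chars[read] == chars[read + 1]
      count += 1
    else
      if count < k
        # The run is short enough to keep: copy it to the write position.
        count.times do
          chars[write] = chars[read]
          write += 1
        end
      end
      count = 1
    end
  end
  # Everything from write to the end corresponds to deleted characters.
  (write...chars.length).each { |i| chars[i] = '#' }
  chars
end
```

For example, collapse_runs("xavvvarrrt".chars).join gives "xaat######". As in the JavaScript version, runs that only become adjacent after a deletion are not merged and re-checked.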

Insertion sort not sorting correctly

I have this insertion sort written in Scala:
def insertionSort(array:Array[Double]):Unit ={
for(i <- 1 until array.length){
var current = i - 1
var currentTemp = array(i)
do{
array(current + 1) = array(current)
current -= 1
}while((current >= 0 && currentTemp < array(current)))
array(current + 1) = currentTemp
}
}
I noticed that when I run the function, I get back a sorted list except for the last number. Here is the output:
Unsorted Array: 0.447256485852796,0.5849012754409237,0.4132227916845631,0.5850539272085894,0.38731662469810446
Insertion Sorted Array: 0.38731662469810446,0.4132227916845631,0.5849012754409237,0.5850539272085894,0.447256485852796
I'm not sure what I can change so the last number is in its proper position. Any help would be great.
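No answer is recorded here, but judging only from the code shown, one likely cause is the do/while loop: its body shifts an element once before the condition is ever tested, so an element gets moved even when it is already in the right place. The sketch below is a Ruby transliteration (an assumption made for illustration, since the question is in Scala) that compares the test-after loop with a test-before loop on the data from the question. The method names are hypothetical.

```ruby
# Transliteration of the loop from the question: begin/end-while runs the body
# once before the condition is checked, like Scala's do/while.
def insertion_sort_do_while(array)
  (1...array.length).each do |i|
    current = i - 1
    current_temp = array[i]
    begin
      array[current + 1] = array[current]   # shifts even if array[current] <= current_temp
      current -= 1
    end while current >= 0 && current_temp < array[current]
    array[current + 1] = current_temp
  end
  array
end

# Same algorithm, but the condition is tested before every shift.
def insertion_sort_while(array)
  (1...array.length).each do |i|
    current = i - 1
    current_temp = array[i]
    while current >= 0 && current_temp < array[current]
      array[current + 1] = array[current]
      current -= 1
    end
    array[current + 1] = current_temp
  end
  array
end
```

On the five numbers from the question, the first version reproduces the reported output with 0.447256485852796 stuck at the end, while the second returns them fully sorted. The corresponding change in Scala would be to use a plain while loop with the same condition.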

Scala PriorityQueue on Array[Int] performance issue with complex comparison function (caching is needed)

The problem involves the Scala PriorityQueue[Array[Int]] performance on large data set. The following operations are needed: enqueue, dequeue, and filter. Currently, my implementation is as follows:
For every element of type Array[Int], there is a complex evaluation function: (I'm not sure how to write it in a more efficient way, because it excludes the position 0)
def eval_fun(a : Array[Int]) =
if(a.size < 2) 3
else {
var ret = 0
var i = 1
while(i < a.size) {
if((a(i) & 0x3) == 1) ret += 1
else if((a(i) & 0x3) == 3) ret += 3
i += 1
}
ret / a.size
}
The ordering with a comparison function is based on the evaluation function: (Reversed, descendent order)
val arr_ord = new Ordering[Array[Int]] {
def compare(a : Array[Int], b : Array[Int]) = eval_fun(b) compare eval_fun(a) }
The PriorityQueue is defined as:
val pq: scala.collection.mutable.PriorityQueue[Array[Int]] = PriorityQueue()
Question:
Is there a more elegant and efficient way to write such an evaluation function? I'm thinking of using fold, but fold cannot exclude the position 0.
Is there a data structure to generate a priorityqueue with unique elements? Applying filter operation after each enqueue operation is not efficient.
Is there a caching method to reduce the evaluation cost? When a new element is added to the queue, every element may need to be evaluated by eval_fun again, which would be unnecessary if the evaluated value of every element were cached. Also, I should mention that two distinct elements may have the same evaluated value.
Is there a more efficient data structure without using generic type? Because if the size of elements reaches 10,000 and the size of size reaches 1,000, the performance is terribly slow.
Thank you.
(1) If you want maximum performance here, I would stick to the while loop, even if it is not terribly elegant. Otherwise, if you use a view on Array, you can easily drop the first element before going into the fold:
a.view.drop(1).foldLeft(0)( (sum, a) => sum + ((a & 0x03) match {
case 0x01 => 1
case 0x03 => 3
case _ => 0
})) / a.size
(2) You can maintain two structures, the priority queue, and a set. Both combined give you a sorted-set... So you could use collection.immutable.SortedSet, but there is no mutable variant in the standard library. Do you want equality based on the priority function, or on the actual array contents? Because in the latter case, you won't get around comparing arrays element by element for each insertion, undoing the effect of caching the priority function value.
(3) Just put the calculated priority along with the array in the queue. I.e.
implicit val ord = Ordering.by[(Int, Array[Int]), Int](_._1)
val pq = new collection.mutable.PriorityQueue[(Int, Array[Int])]
pq += eval_fun(a) -> a
Well, you can use a tail recursive loop (generally these are more "idiomatic"):
def eval(a: Array[Int]): Int =
if (a.size < 2) 3
else {
@annotation.tailrec
def loop(ret: Int = 0, i: Int = 1): Int =
if (i >= a.size) ret / a.size
else {
val mod3 = (a(i) & 0x3)
if (mod3 == 1) loop(ret + 1, i + 1)
else if (mod3 == 3) loop(ret + 3, i + 1)
else loop(ret, i + 1)
}
loop()
}
Then you can use that to initialise a cached priority value:
case class PriorityArray(a: Array[Int]) {
lazy val priority = if (a.size < 2) 3 else {
@annotation.tailrec
def loop(ret: Int = 0, i: Int = 1): Int =
if (i >= a.size) ret / a.size
else {
val mod3 = (a(i) & 0x3)
if (mod3 == 2) loop(ret, i + 1)
else loop(ret + mod3, i + 1)
}
loop()
}
}
You may note also that I removed a redundant & op and have only the single conditional (for when it equals 2, rather than two checks for 1 && 3) – these should have some minimal effect.
There is not a huge difference from 0__'s proposal that just came through.
My answers:
If evaluation is critical, keep it as it is. You might get better performance with recursion (not sure why, but it happens), but you'll certainly get worse performance with pretty much any other approach.
No, there isn't, but you can come pretty close to it just modifying the dequeue operation:
def distinctDequeue[T](q: PriorityQueue[T]): T = {
val result = q.dequeue
while (q.nonEmpty && q.head == result) q.dequeue() // stop when the queue runs empty
result
}
Otherwise, you'd have to keep a second data structure just to keep track of whether an element has been added or not. Either way, that equals sign is pretty heavy, but I have a suggestion to make it faster in the next item.
Note, however, that this requires that ties on the cost function get solved in some other way.
Like 0__ suggested, put the cost on the priority queue. But you can also keep a cache on the function if that would be helpful. I'd try something like this:
import scala.collection.mutable.WrappedArray
val evalMap = scala.collection.mutable.HashMap[WrappedArray[Int], Int]()
def eval_fun(a : Array[Int]) =
if(a.size < 2) 3
else evalMap.getOrElseUpdate(a, {
var ret = 0
var i = 1
while(i < a.size) {
if((a(i) & 0x3) == 1) ret += 1
else if((a(i) & 0x3) == 3) ret += 3
i += 1
}
ret / a.size
})
import scala.math.Ordering.Implicits._
val pq = new collection.mutable.PriorityQueue[(Int, WrappedArray[Int])]
pq += eval_fun(a) -> (a : WrappedArray[Int])
Note that I did not create a special Ordering -- I'm using the standard Ordering so that the WrappedArray will break the ties. There's little cost to wrap the Array, and you get it back with .array, but, on the other hand, you'll get the following:
Ties will be broken by comparing the array themselves. If there aren't many ties in the cost, this should be good enough. If there are, add something else to the tuple to help break ties without comparing the arrays.
That means all equal elements will be kept together, which will enable you to dequeue all of them at the same time, giving the impression of having kept only one.
And that equals will actually work, because WrappedArray compares like Scala sequences do.
I don't understand what you mean by that fourth point.

Number crunching in Ruby (optimisation needed)

Ruby may not be the optimal language for this but I'm sort of comfortable working with this in my terminal so that's what I'm going with.
I need to process the numbers from 1 to 666666 so that I pick out all the numbers that contain 6 but don't contain 7, 8 or 9. The first number will be 6, the next 16, then 26 and so forth.
Then I needed it printed like this (6=6) (16=6) (26=6) and when I have ranges like 60 to 66 I need it printed like (60 THRU 66=6) (SPSS syntax).
I have this code and it works but it's neither beautiful nor very efficient so how could I optimize it?
(silly code may follow)
class Array
def to_ranges
array = self.compact.uniq.sort
ranges = []
if !array.empty?
# Initialize the left and right endpoints of the range
left, right = array.first, nil
array.each do |obj|
# If the right endpoint is set and obj is not equal to right's successor
# then we need to create a range.
if right && obj != right.succ
ranges << Range.new(left,right)
left = obj
end
right = obj
end
ranges << Range.new(left,right) unless left == right
end
ranges
end
end
write = ""
numbers = (1..666666).to_a
# split each number into an array containing its digits
numbers = numbers.map { |i| i.to_s.split(//) }
# delete the arrays that don't contain 6 and the ones that contain 6 but also 7, 8 or 9
numbers = numbers.delete_if { |i| !i.include?('6') }
numbers = numbers.delete_if { |i| i.include?('7') }
numbers = numbers.delete_if { |i| i.include?('8') }
numbers = numbers.delete_if { |i| i.include?('9') }
# join the ciphers back into the original numbers
numbers = numbers.map { |i| i.join }
numbers = numbers.map { |i| i = Integer(i) }
# rangify consecutive numbers
numbers = numbers.to_ranges
# edit the ranges that go from 1..1 into just 1
numbers = numbers.map do |i|
if i.first == i.last
i = i.first
else
i = i
end
end
# string stuff
numbers = numbers.map { |i| i.to_s.gsub(".."," thru ") }
numbers = numbers.map { |i| "(" + i.to_s + "=6)"}
numbers.each { |i| write << " " + i }
File.open('numbers.txt','w') { |f| f.write(write) }
As I said, it works for numbers even in the millions, but I'd like some advice on how to make it prettier and more efficient.
I deleted my earlier attempt to parlez-vous-ruby? and made up for that. I now have an optimized version of x3ro's excellent example.
$,="\n"
puts ["(0=6)", "(6=6)", *(1.."66666".to_i(7)).collect {|i| i.to_s 7}.collect do |s|
s.include?('6')? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end ]
Compared to x3ro's version
... It is down to three lines
... 204.2 x faster (to 66666666)
... has byte-identical output
It uses all my ideas for optimization
gen numbers based on modulo 7 digits (so base-7 numbers)
generate the last digit 'smart': this is what compresses the ranges
So... what are the timings? This was testing with 8 digits (to 66666666, or 823544 lines of output):
$ time ./x3ro.rb > /dev/null
real 8m37.749s
user 8m36.700s
sys 0m0.976s
$ time ./my.rb > /dev/null
real 0m2.535s
user 0m2.460s
sys 0m0.072s
Even though the performance is actually good, it isn't even close to the C optimized version I posted before: I couldn't run my.rb to 6666666666 (6x10) because of OutOfMemory. When running to 9 digits, this is the comparative result:
sehe@meerkat:/tmp$ time ./my.rb > /dev/null
real 0m21.764s
user 0m21.289s
sys 0m0.476s
sehe@meerkat:/tmp$ time ./t2 > /dev/null
real 0m1.424s
user 0m1.408s
sys 0m0.012s
The C version is still some 15x faster... which is only fair considering that it runs on the bare metal.
Hope you enjoyed it, and can I please have your votes if only for learning Ruby for the purpose :)
(Can you tell I'm proud? This is my first encounter with ruby; I started the ruby koans 2 hours ago...)
Edit by @johndouthat:
Very nice! The use of base7 is very clever and this is a great job for your first ruby trial :)
Here's a slight modification of your snippet that will let you test 10+ digits without getting an OutOfMemory error:
puts ["(0=6)", "(6=6)"]
(1.."66666666".to_i(7)).each do |i|
s = i.to_s(7)
puts s.include?('6') ? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end
# before:
real 0m26.714s
user 0m23.368s
sys 0m2.865s
# after
real 0m15.894s
user 0m13.258s
sys 0m1.724s
Exploiting patterns in the numbers, you can short-circuit lots of the loops, like this:
If you define a prefix as the 100s place and everything before it,
and define the suffix as everything in the 10s and 1s place, then, looping
through each possible prefix:
If the prefix is blank (i.e. you're testing 0-99), then there are 13 possible matches
elsif the prefix contains a 7, 8, or 9, there are no possible matches.
elsif the prefix contains a 6, there are 49 possible matches (a 7x7 grid)
else, there are 13 possible matches. (see the image below)
(the code doesn't yet exclude numbers that aren't specifically in the range, but it's pretty close)
number_range = (1..666_666)
prefix_range = ((number_range.first / 100)..(number_range.last / 100))
for p in prefix_range
ps = p.to_s
# TODO: if p == prefix_range.last or p == prefix_range.first,
# TODO: test to see if number_range.include?("#{ps}6".to_i), etc...
if ps == '0'
puts "(6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 thru 66) "
elsif ps =~ /7|8|9/
# there are no candidate suffixes if the prefix contains 7, 8, or 9.
elsif ps =~ /6/
# If the prefix contains a 6, then there are 49 candidate suffixes
for i in (0..6)
print "(#{ps}#{i}0 thru #{ps}#{i}6) "
end
puts
else
# If the prefix doesn't contain 6, 7, 8, or 9, then there are only 13 candidate suffixes.
puts "(#{ps}06=6) (#{ps}16=6) (#{ps}26=6) (#{ps}36=6) (#{ps}46=6) (#{ps}56=6) (#{ps}60 thru #{ps}66) "
end
end
Which prints out the following:
(6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 thru 66)
(106=6) (116=6) (126=6) (136=6) (146=6) (156=6) (160 thru 166)
(206=6) (216=6) (226=6) (236=6) (246=6) (256=6) (260 thru 266)
(306=6) (316=6) (326=6) (336=6) (346=6) (356=6) (360 thru 366)
(406=6) (416=6) (426=6) (436=6) (446=6) (456=6) (460 thru 466)
(506=6) (516=6) (526=6) (536=6) (546=6) (556=6) (560 thru 566)
(600 thru 606) (610 thru 616) (620 thru 626) (630 thru 636) (640 thru 646) (650 thru 656) (660 thru 666)
(1006=6) (1016=6) (1026=6) (1036=6) (1046=6) (1056=6) (1060 thru 1066)
(1106=6) (1116=6) (1126=6) (1136=6) (1146=6) (1156=6) (1160 thru 1166)
(1206=6) (1216=6) (1226=6) (1236=6) (1246=6) (1256=6) (1260 thru 1266)
(1306=6) (1316=6) (1326=6) (1336=6) (1346=6) (1356=6) (1360 thru 1366)
(1406=6) (1416=6) (1426=6) (1436=6) (1446=6) (1456=6) (1460 thru 1466)
(1506=6) (1516=6) (1526=6) (1536=6) (1546=6) (1556=6) (1560 thru 1566)
(1600 thru 1606) (1610 thru 1616) (1620 thru 1626) (1630 thru 1636) (1640 thru 1646) (1650 thru 1656) (1660 thru 1666)
etc...
Note: I don't speak Ruby, but I have done a Ruby version later, just for speed comparison :)
If you just iterate all numbers from 0 to 117648 (ruby <<< 'print "666666".to_i(7)') and print them in base-7 notation, you'll at least have discarded any numbers containing 7,8,9. This includes the optimization suggestion by MrE, apart from lifting the problem to simple int arithmetic instead of char-sequence manipulations.
All that remains is to check for the presence of at least one 6. This would make the algorithm skip at most 6 items in a row, so I deem it less important (the average number of skippable items on the total range is 40%).
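As a small check of the base-7 idea, here is a Ruby sketch that compares it with the straightforward digit filter. The method names brute_force and base7_candidates are made up for this illustration, and the comparison is done on a 3-digit range only to keep it quick.

```ruby
# Reference: scan every number and keep those that contain a 6 but no 7, 8 or 9.
def brute_force(limit)
  (1..limit).select do |n|
    s = n.to_s
    s.include?('6') && s !~ /[789]/
  end
end

# Base-7 idea: counting in base 7 only ever produces the digits 0..6,
# so the 7/8/9 filter disappears and only the "has a 6" test remains.
def base7_candidates(digits)
  upper = ('6' * digits).to_i(7)   # e.g. "666" in base 7 is 342
  (1..upper).map { |i| i.to_s(7) }
            .select { |s| s.include?('6') }
            .map(&:to_i)
end
```

With these definitions, base7_candidates(3) and brute_force(666) yield the same list, while the base-7 loop visits 342 values instead of 666.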
Simple benchmark to 6666666666
(Note that this means outputting 222,009,073 (222M) lines of 6-y numbers)
Staying close to this idea, I wrote this quite highly optimized C code (I don't speak ruby) to demonstrate the idea. I ran it to 282475248 (which is 6666666666 in base 7), so it was more of a benchmark to measure: 0m26.5s
#include <stdio.h>
static char buf[11];
char* const bufend = buf+10;
char* genbase7(int n)
{
char* it = bufend; int has6 = 0;
do
{
has6 |= 6 == (*--it = n%7);
n/=7;
} while(n);
return has6? it : 0;
}
void asciify(char* rawdigits)
{
do { *rawdigits += '0'; }
while (++rawdigits != bufend);
}
int main()
{
*bufend = 0; // init
long i;
for (i=6; i<=282475248; i++)
{
char* b7 = genbase7(i);
if (b7)
{
asciify(b7);
puts(b7);
}
}
}
I also benchmarked another approach, which unsurprisingly ran in less than half the time because
this version directly manipulates the results in ascii string form, ready for display
this version shortcuts the has6 flag for deeper recursion levels
this version also optimizes the 'twiddling' of the last digit when it is required to be '6'
the code is simply shorter...
Running time: 0m12.8s
#include <stdio.h>
#include <string.h>
inline void recursive_permute2(char* const b, char* const m, char* const e, int has6)
{
if (m<e)
for (*m = '0'; *m<'7'; (*m)++)
recursive_permute2(b, m+1, e, has6 || (*m=='6'));
else
if (has6)
for (*e = '0'; *e<'7'; (*e)++)
puts(b);
else /* optimize for last digit must be 6 */
puts((*e='6', b));
}
inline void recursive_permute(char* const b, char* const e)
{
recursive_permute2(b, b, e-1, 0);
}
int main()
{
char buf[] = "0000000000";
recursive_permute(buf, buf+sizeof(buf)/sizeof(*buf)-1);
}
Benchmarks measured with:
gcc -O4 t6.c -o t6
time ./t6 > /dev/null
$range_start = -1
$range_end = -1
$f = File.open('numbers.txt','w')
def output_number(i)
if $range_end == i-1
$range_end = i
elsif $range_start < $range_end
$f.puts "(#{$range_start} thru #{$range_end})"
$range_start = $range_end = i
else
$f.puts "(#{$range_start}=6)" if $range_start > 0 # no range, print out previous number
$range_start = $range_end = i
end
end
'1'.upto('666') do |n|
next unless n =~ /6/ # keep only numbers that contain 6
next if n =~ /[789]/ # remove numbers that contain 7, 8 or 9
output_number n.to_i
end
if $range_start < $range_end
$f.puts "(#{$range_start} thru #{$range_end})"
end
$f.close
puts "Ruby is beautiful :)"
I came up with this piece of code, which I tried to keep more or less in FP-styling. Probably not much more efficient (as it has been said, with basic number logic you will be able to increase performance, for example by skipping from 19xx to 2000 directly, but that I will leave up to you :)
def check(n)
n = n.to_s
n.include?('6') and
not n.include?('7') and
not n.include?('8') and
not n.include?('9')
end
def spss(ranges)
ranges.each do |range|
if range.first === range.last
puts "(" + range.first.to_s + "=6)"
else
puts "(" + range.first.to_s + " THRU " + range.last.to_s + "=6)"
end
end
end
range = (1..666666)
range = range.select { |n| check(n) }
range = range.inject([0..0]) do |ranges, n|
temp = ranges.last
if temp.last + 1 === n
ranges.pop
ranges.push(temp.first..n)
else
ranges.push(n..n)
end
end
spss(range)
My first answer was trying to be too clever. Here is a much simpler version
class MutablePrintingCandidateRange < Struct.new(:first, :last)
def to_s
if self.first == nil and self.last == nil
''
elsif self.first == self.last
"(#{self.first}=6)"
else
"(#{self.first} thru #{self.last})"
end
end
def <<(x)
if self.first == nil and self.last == nil
self.first = self.last = x
elsif self.last == x - 1
self.last = x
else
puts(self) # print the candidates
self.first = self.last = x # reset the range
end
end
end
and how to use it:
numer_range = (1..666_666)
current_range = MutablePrintingCandidateRange.new
for i in numer_range
candidate = i.to_s
if candidate =~ /6/ and candidate !~ /7|8|9/
# number contains a 6, but not a 7, 8, or 9
current_range << i
end
end
puts current_range
Basic observation: If the current number is (say) 1900 you know that you can safely skip up to at least 2000...
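A sketch of that skipping idea in Ruby might look like this. The helper names skip_forbidden and each_candidate are hypothetical, and the jump rule is one possible reading of the observation: find the leftmost 7, 8 or 9 and jump to the next number in which that position has rolled over.

```ruby
# If n contains a 7, 8 or 9, returns the next number in which the leftmost such
# digit has rolled over (1900 -> 2000, 67 -> 70, 70 -> 100); otherwise returns n.
def skip_forbidden(n)
  s = n.to_s
  idx = s.index(/[789]/)
  return n if idx.nil?
  prefix = idx.zero? ? 0 : s[0, idx].to_i
  (prefix + 1) * 10**(s.length - idx)
end

# Yields every number in 1..limit that contains a 6 but no 7, 8 or 9,
# jumping over whole blocks of excluded numbers instead of testing each one.
def each_candidate(limit)
  return enum_for(:each_candidate, limit) unless block_given?
  n = 1
  while n <= limit
    jumped = skip_forbidden(n)
    if jumped != n
      n = jumped          # may still contain a forbidden digit, so re-check
      next
    end
    yield n if n.to_s.include?('6')
    n += 1
  end
end
```

For instance, each_candidate(66).to_a gives 6, 16, 26, 36, 46, 56 and then 60 through 66, which matches the plain filter over the same range.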
(I didn't bother updating my C solution for formatting. Instead I went with x3ro's excellent ruby version and optimized that)
Undeleted:
I am still not sure whether the changed range-notation behaviour isn't actually what the OP wants: this version changes the behaviour of breaking up ranges that are actually contiguous modulo 6; I wouldn't be surprised if the OP actually expected
.
....
(555536=6)
(555546=6)
(555556 THRU 666666=6)
instead of
....
(666640 THRU 666646=6)
(666650 THRU 666656=6)
(666660 THRU 666666=6)
I'll let the OP decide, and here is the modified version, which runs in 18% of the time of x3ro's version (3.2s instead of 17.0s when generating up to 6666666 (7x6)).
def check(n)
n.to_s(7).include?('6')
end
def spss(ranges)
ranges.each do |range|
if range.first === range.last
puts "(" + range.first.to_s(7) + "=6)"
else
puts "(" + range.first.to_s(7) + " THRU " + range.last.to_s(7) + "=6)"
end
end
end
range = (1..117648)
range = range.select { |n| check(n) }
range = range.inject([0..0]) do |ranges, n|
temp = ranges.last
if temp.last + 1 === n
ranges.pop
ranges.push(temp.first..n)
else
ranges.push(n..n)
end
end
spss(range)
My answer below is not complete, but just to show a path (I might come back and continue the answer):
There are only two cases:
1) All the digits besides the lowest one are either absent or not 6
6, 16, ...
2) At least one digit besides the lowest one includes 6
60--66, 160--166, 600--606, ...
Cases in (1) do not include any continuous numbers because they all have 6 in the lowest digit, and are different from one another. Cases in (2) all appear as continuous ranges where the lowest digit continues from 0 to 6. Any single continuation in (2) is not continuous with another one in (2) or with anything from (1) because a number one less than xxxxx0 will be xxxxy9, and a number one more than xxxxxx6 will be xxxxxx7, and hence be excluded.
Therefore, the question reduces to the following:
3)
Get all strings between "" to "66666" that do not include "6"
For each of them ("xxx"), output the string "(xxx6=6)"
4)
Get all strings between "" to "66666" that include at least one "6"
For each of them ("xxx"), output the string "(xxx0 THRU xxx6=6)"
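The two cases can be checked empirically with a short Ruby sketch. The method names matching_numbers and spss_clauses are assumptions for this example. It groups the matching numbers into runs of consecutive values and formats each run, so one can confirm that every run has length 1 (case 1) or length 7 (case 2).

```ruby
# Numbers in 1..limit that contain a 6 but no 7, 8 or 9.
def matching_numbers(limit)
  (1..limit).select do |n|
    s = n.to_s
    s.include?('6') && s !~ /[789]/
  end
end

# Groups consecutive matches into runs and renders each run in SPSS syntax.
def spss_clauses(limit)
  matching_numbers(limit).slice_when { |a, b| b != a + 1 }.map do |run|
    if run.size == 1
      "(#{run.first}=6)"                    # case 1: isolated number ending in 6
    else
      "(#{run.first} THRU #{run.last}=6)"   # case 2: x0 THRU x6
    end
  end
end
```

For a limit of 66 this yields (6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 THRU 66=6), in line with the reasoning above.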
The killer here is
numbers = (1..666666).to_a
Range supports iterations so you would be better off by going over the whole range and accumulating numbers that include your segments in blocks. When one block is finished and supplanted by another you could write it out.

How do I optimize this bit of ruby code to go faster?

It's an implementation of Sieve of Eratosthenes.
class PrimeGenerator
def self.get_primes_between( x, y)
sieve_array = Array.new(y) {|index|
(index == 0 ? 0 : index+1)
}
position_when_we_can_stop_checking = Math.sqrt(y).to_i
(2..position_when_we_can_stop_checking).each{|factor|
sieve_array[(factor).. (y-1)].each{|number|
sieve_array[number-1] = 0 if isMultipleOf(number, factor)
}
}
sieve_array.select{|element|
( (element != 0) && ( (x..y).include? element) )
}
end
def self.isMultipleOf(x, y)
return (x % y) == 0
end
end
Now I did this for a 'submit solutions to problems since you have time to kill' site. I chose Ruby as my implementation language. However, my submission was declared timed out.
I did some benchmarking
require 'benchmark'
Benchmark.bmbm do |x|
x.report ("get primes") { PrimeGenerator.get_primes_between(10000, 100000)}
end
ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
L:\Gishu\Ruby>ruby prime_generator.rb
Rehearsal ----------------------------------------------
get primes 33.953000 0.047000 34.000000 ( 34.343750)
------------------------------------ total: 34.000000sec
user system total real
get primes 33.735000 0.000000 33.735000 ( 33.843750)
ruby 1.8.6 (2007-03-13 patchlevel 0) [i386-mswin32]
Rehearsal ----------------------------------------------
get primes 65.922000 0.000000 65.922000 ( 66.110000)
------------------------------------ total: 65.922000sec
user system total real
get primes 67.359000 0.016000 67.375000 ( 67.656000)
So I redid the thing in C# 2.0 / VS 2008 -->
722 milliseconds
So now this prods me into thinking is it a problem with my implementation or is the perf diff between languages this wide? (I was amazed with the 1.9 Ruby VM... until I had to go compare it with C# :)
UPDATE:
Turned out to be my "put-eratosthenes-to-shame-adaptation" after all :) Eliminating unnecessary loop iterations was the major optimization. In case anyone is interested in the details.. you can read it here; this question is too long anyways.
I'd start by looking at your inner loop. sieve_array[(factor).. (y-1)] is going to create a new array each time it's executed. Instead, try replacing it with a normal indexing loop.
Obviously each computer is going to benchmark this differently, but I was able to make this run approximately 50x faster on my machine (Ruby 1.8.6) by removing the looping on the array with an each block, and by causing the inner loop to check fewer numbers.
factor = 2
while factor <= position_when_we_can_stop_checking
  # Start at the first multiple above factor, so that factor itself is not crossed out.
  number = factor * 2
  while number <= y
    sieve_array[number-1] = 0 # every number visited here is a multiple of factor
    number = number + factor # Was incrementing by 1, causing too many checks
  end
  factor = factor + 1
end
I don't know how it compares for speed, but this is a fairly small and simple SoE implementation that works fine for me:
def sieve_to(n)
s = (0..n).to_a
s[0]=s[1]=nil
s.each do |p|
next unless p
break if p * p > n
(p*p).step(n, p) { |m| s[m] = nil }
end
s.compact
end
There are a few further little speedups possible, but I think it's pretty good.
They're not exactly equivalent, so your 10_000 to 1_000_000 would equate to
sieve_to(1_000_000) - sieve_to(9_999)
or something closely approximate.
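To get a range of primes directly, one option is a small wrapper around sieve_to. The wrapper name primes_between is an assumption for this sketch, and sieve_to is repeated from above only so that the snippet runs on its own.

```ruby
# Sieve from the answer above, repeated here for self-containment.
def sieve_to(n)
  s = (0..n).to_a
  s[0] = s[1] = nil
  s.each do |p|
    next unless p
    break if p * p > n
    (p * p).step(n, p) { |m| s[m] = nil }
  end
  s.compact
end

# Hypothetical wrapper: primes p with x <= p <= y.
# Sieves once up to y and filters, instead of sieving twice and subtracting.
def primes_between(x, y)
  sieve_to(y).select { |p| p >= x }
end
```

For example, primes_between(10, 30) returns [11, 13, 17, 19, 23, 29], the same list as sieve_to(30) - sieve_to(9).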
Anyhow, on WinXP, with Ruby 1.8.6 (and fairly hefty Xeon CPUs) I get this:
require 'benchmark'
Benchmark.bm(30) do |r|
r.report("Mike") { a = sieve_to(10_000) - sieve_to(1_000) }
r.report("Gishu") { a = PrimeGenerator.get_primes_between( 1_000, 10_000) }
end
which gives
user system total real
Mike 0.016000 0.000000 0.016000 ( 0.016000)
Gishu 1.641000 0.000000 1.641000 ( 1.672000)
(I stopped running the one million case because I got bored waiting).
So I'd say it was your algorithm. ;-)
The C# solution is pretty much guaranteed to be orders of magnitude faster though.
The Sieve of Eratosthenes works fine as an illustrative way to find primes, but I would implement it a little bit differently. The essence is that you don't have to check numbers which are multiples of already known primes. Now, instead of using an array to store this information, you can also create a list of all sequential primes up to the square root of the number you are checking, and then it suffices to go through the list of primes to check for primes.
If you think of it, this does what you do on the image, but in a more "virtual" way.
Edit: Quickly hacked implementation of what I mean (not copied from the web ;) ):
public class Sieve {
private readonly List<int> primes = new List<int>();
private int maxProcessed;
public Sieve() {
primes.Add(maxProcessed = 2); // one could add more to speed things up a little, but one is required
}
public bool IsPrime(int i) {
// first check if we can compare against known primes
if (i <= primes[primes.Count-1]) {
return primes.BinarySearch(i) >= 0;
}
// if not, make sure that we got all primes up to the square root of i
int maxFactor = (int)Math.Sqrt(i);
while (maxProcessed < maxFactor) {
maxProcessed++;
bool isPrime = true;
for (int primeIndex = 0; primeIndex < primes.Count; primeIndex++) {
int prime = primes[primeIndex];
if (maxProcessed % prime == 0) {
isPrime = false;
break;
}
}
if (isPrime) {
primes.Add(maxProcessed);
}
}
// now apply the sieve to the number to check
foreach (int prime in primes) {
if (i % prime == 0) {
return false;
}
if (prime > maxFactor) {
break;
}
}
return true;
}
}
Uses about 67ms on my slow machine.... test app:
class Program {
static void Main(string[] args) {
Stopwatch sw = new Stopwatch();
sw.Start();
Sieve sieve = new Sieve();
for (int i = 10000; i <= 100000; i++) {
sieve.IsPrime(i);
}
sw.Stop();
Debug.WriteLine(sw.ElapsedMilliseconds);
}
}
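Since the question is about Ruby, here is a rough Ruby rendering of the same idea: keep a growing list of known primes and test a number by trial division against primes up to its square root. The class name IncrementalSieve is made up for this sketch, which is a loose port of the C# code above rather than a tuned implementation. It assumes Integer.sqrt is available (Ruby 2.5 or later).

```ruby
# Keeps a list of known primes and extends it on demand.
class IncrementalSieve
  def initialize
    @primes = [2]        # known primes, in ascending order
    @max_processed = 2   # every number up to this value has been classified
  end

  def prime?(i)
    return false if i < 2
    # Small numbers: look them up in the list of known primes.
    return !@primes.bsearch { |p| i <=> p }.nil? if i <= @primes.last

    # Make sure all primes up to the square root of i are known.
    max_factor = Integer.sqrt(i)
    while @max_processed < max_factor
      @max_processed += 1
      @primes << @max_processed if @primes.none? { |p| (@max_processed % p).zero? }
    end

    # Trial division by the known primes up to the square root.
    @primes.each do |p|
      break if p > max_factor
      return false if (i % p).zero?
    end
    true
  end
end
```

A call such as (10_000..100_000).count { |n| sieve.prime?(n) } on a fresh IncrementalSieve instance mirrors the C# test loop. No timing is claimed here, since it was not measured.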
Benchmark it with ruby-prof. It can produce output that tools like kcachegrind can read, so you can see where your code is slow.
Then once you make the ruby fast, use RubyInline to optimize the method for you.
I would also note that Ruby, in my experience, is a lot slower on Windows systems than on *nix. I'm not sure what speed processor you have, of course, but running this code on my Ubuntu box in Ruby 1.9 took around 10 seconds, and 1.8.6 took 30.
