Is high-speed sorting impossible with Ruby?

I've studied the poignant guide and it really has helped me pick up the
language pretty fast. After that, I started solving some coding puzzles
using Ruby. I feel it helps a lot in getting used to the language.
I'm stuck on one such puzzle. It is pretty straightforward and I solved it
easily, but the host website rejects my solution with the error
'Time Exceeded'! I know that Ruby cannot compete with the speed of C/C++,
but surely it can handle a tiny puzzle on a website that accepts
solutions in Ruby?
The puzzle is just a normal sort.
This is my solution:
array = []
gets.to_i.times do
  array << gets
end
puts array.sort
My question is: is there any other way I can achieve high-speed sorting with Ruby? I'm using the basic Array#sort here, but is there a way to do it faster, even if it means a lot more lines of code?

I've solved that problem, and let me tell you, passing it with an O(n log n) algorithm is almost impossible unless you are using a very optimized C/assembly version of it.
You need to explore other algorithms. Hint: an O(n) algorithm will do the trick, even in Ruby.
Good luck.

You're sorting strings when you should be sorting ints. Try:
array << gets.to_i

If there is no need for duplicate values to be repeated:
h = {}
gets.to_i.times{h[gets.to_i] = true}
(0..100000).each{|n| puts(n) if h[n]}
If duplicate values must be repeated:
h = Hash.new(0)
gets.to_i.times{h[gets.to_i] += 1}
(0..100000).each{|n| h[n].times{puts(n)}}
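To make the O(n) hint concrete: the tallies kept in a hash above can also live in a plain array, which is the classic counting sort. This is only a minimal sketch, assuming every input is an integer between 0 and 100000 (the range used in the snippets above); counting_sort and max_value are names made up for illustration.

```ruby
# Counting sort sketch. Assumes every value is an integer in 0..max_value
# (the default range is taken from the answer above; the method name is
# hypothetical).
def counting_sort(values, max_value = 100_000)
  counts = Array.new(max_value + 1, 0)      # one tally slot per possible value
  values.each { |v| counts[v] += 1 }        # single pass over the input
  sorted = []
  counts.each_with_index do |count, n|      # single pass over the value range
    count.times { sorted << n }             # duplicates are repeated
  end
  sorted
end

p counting_sort([5, 3, 5, 0, 2], 5)  # prints [0, 2, 3, 5, 5]
```

For the puzzle itself, the input could be read with something like Array.new(gets.to_i) { gets.to_i } and the result printed with puts, but whether that passes the time limit depends on the judge.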

Related

In a lua for loop what is a # used for?

I know how for loops work and I use them quite often but also seem to often come across a # in others' code and I want to know what it is for and how to use it. An example of this would be:
for i = 1, #npc do local v = npc[i]
I can't seem to find anything online about this. Maybe my searches just aren't good, but it would be nice if someone could explain it to me. Thanks.
In Lua, # is the length operator. for i = 1, #npc essentially loops from 1 to the length of the npc array.
As was already pointed out, it gets the length of a list. However, there's another thing worth pointing out: that for loop is suboptimal and unidiomatic. It would be better written as for i, v in ipairs(npc) do. In general, using # in a for loop is almost always the wrong thing to do.

Bubble Sort method

I am just learning Ruby, and KevinC's response (in this link) makes sense to me with one exception. I don't understand why the code is wrapped in arr.each do |i| #while... end. That part seems redundant to me, as the 'while' loop is already hitting each of the positions. Can someone explain?
The inner loop finds a bubble and carries it up; if it finds another, lighter bubble, it switches them around and carries the lighter one. So you need several passes through the array to find all the bubbles and carry them to the correct place, since you can't float several bubbles at the same time.
EDIT:
The each is really misused in KevinC's code, since it is not used for its normal purpose: yielding elements of the collection. Instead of arr.each, it would be better to use arr.size.times - as it would be more informative to the reader. Redefining the i within the block is adding insult to injury. While none of this will cause the code to be wrong as such, it is misleading.
The other problem with the code is that it has no early termination condition (the swapped flag in most other answers on that question). In theory, bubble sort could find the array sorted in the first pass; the other size - 1 passes are then unnecessary. KevinC's code would still dry-hump the already sorted array, never realising it is done.
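For illustration, here is roughly what a bubble sort with that early exit could look like. It is a sketch rather than a rewrite of KevinC's code (which is not reproduced here); bubble_sort and the swapped flag are my own names, and it returns a sorted copy instead of sorting in place.

```ruby
# Bubble sort with an early-termination flag (hypothetical helper).
def bubble_sort(arr)
  sorted = arr.dup
  (sorted.size - 1).times do                  # at most size - 1 passes are needed
    swapped = false
    (0...(sorted.size - 1)).each do |i|       # one pass: float one bubble upwards
      if sorted[i] > sorted[i + 1]
        sorted[i], sorted[i + 1] = sorted[i + 1], sorted[i]
        swapped = true
      end
    end
    break unless swapped                      # no swaps in this pass: already sorted
  end
  sorted
end

p bubble_sort([3, 1, 2])  # prints [1, 2, 3]
```

On an already sorted array the outer loop stops after a single pass.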
As for rewriting this into block-less code, it is certainly possible, but you need to understand that block syntax is very idiomatic in Ruby, and non-block loops are almost unheard of in the Ruby world. While Ruby has for, it is pretty much never used. But...
arr.each do |i|
  ...
end
is equivalent to
for i in arr
  ...
end
which is, again, at least for the array case, equivalent to
index = 0
while index < arr.size
  i = arr[index]
  ...
  index += 1
end
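A quick way to convince yourself of this is to run all three forms side by side. The sketch below uses a made-up three-element array and collects what each loop visits; it also shows the one practical difference I know of, namely that for leaves its loop variable defined afterwards while a block parameter stays local to the block.

```ruby
arr = [10, 20, 30]   # example data, purely for illustration

# Block form: the block parameter i exists only inside the block.
via_each = []
arr.each { |i| via_each << i }

# for form: i remains defined after the loop ends.
via_for = []
for i in arr
  via_for << i
end

# Explicit index form.
via_while = []
index = 0
while index < arr.size
  via_while << arr[index]
  index += 1
end

p via_each == via_for && via_for == via_while  # prints true
p i                                            # prints 30, left over from the for loop
```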

Ruby: two-dimensional arrays syntax

In Ruby, suppose we have a 2-dimensional array, why is this syntax fine:
array.each do |x|
  x.each do |y|
    puts y
  end
end
But this is not:
array.each{|x|.each{|y| puts y}}
Any ideas? Thanks
This should be fine: array.each{|x| x.each{|y| puts y}}
You forgot to reference x first.
That is, . is a left-associative operator. If there is nothing on its left side, it is an error.
If you replace your do...end blocks with {...} carefully you'll find that your second form works the same as your first. But puts array accomplishes the same thing as this whole double loop.
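To see both points in one place, here is a small sketch with a made-up 2x2 array. The brace version collects the elements into lines so the result is easy to inspect, and the last line relies on puts flattening nested arrays.

```ruby
array = [[1, 2], [3, 4]]   # example data, purely for illustration

# Brace-block version of the nested loop; note the explicit receiver x.
lines = []
array.each { |x| x.each { |y| lines << y.to_s } }
p lines        # prints ["1", "2", "3", "4"]

# puts flattens nested arrays and prints one element per line: 1, 2, 3, 4.
puts array
```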
If I may offer some polite meta-advice, your two Ruby questions today seem like you maybe were asked to do some things in a language you don't know, and are frustrated. This is understandable. But the good news is that, compared to many other languages, Ruby is built on a very small number of pieces. If you spend a little time getting really familiar with Array and Hash, you'll find the going much smoother thereafter.

fixnum and prime numbers in ruby

Before I set about to writing this myself, has anyone seen a ruby implementation of the following behavior?
puts 7.nextprime(); #=> 11
puts 7.previousprime(); #=> 5
puts 7.isprime(); #=> true
Obviously this kind of thing would be ugly for large numbers but for integers never exceeding a few thousand (the common instance for me) a sensible implementation is doable, hence the question.
Ruby comes with a built-in Prime class that allows you to iterate through primes starting at 1, but I see no way to initialize it with a starting value other than 1, nor a predicate check to determine whether or not a number is prime. I'd say go for it, though you should keep in mind that math in Ruby can be slow and if performance is a factor you may be better off considering writing it as a C or Java extension. Here's an example of how to use RubyInline to generate primes in C.
Also, I suggest you avoid using the method name 7.isprime - the convention in Ruby is 7.prime?.
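For integers that never exceed a few thousand, plain trial division is probably enough. The following is a minimal sketch of that idea, not an existing library; SmallPrimes and its method names are hypothetical, and it assumes Ruby 2.5 or later for Integer.sqrt.

```ruby
# Hypothetical helper module: trial division, adequate for small integers.
module SmallPrimes
  module_function

  # True if n is prime (tests divisors up to the integer square root).
  def prime?(n)
    return false if n < 2
    (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
  end

  # Smallest prime strictly greater than n.
  def next_prime(n)
    candidate = n + 1
    candidate += 1 until prime?(candidate)
    candidate
  end

  # Largest prime strictly less than n, or nil if there is none.
  def previous_prime(n)
    candidate = n - 1
    candidate -= 1 while candidate >= 2 && !prime?(candidate)
    candidate >= 2 ? candidate : nil
  end
end

p SmallPrimes.next_prime(7)      # prints 11
p SmallPrimes.previous_prime(7)  # prints 5
p SmallPrimes.prime?(7)          # prints true
```

If you really want the 7.prime? spelling, the same methods could be added to Integer. In Ruby 1.9 and later, require 'prime' also provides Integer#prime? out of the box.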
Take a look at the snippets found here. They could give you a headstart.

What's the most efficient way to partition a large hash into N smaller hashes in Ruby?

The Problem
I'm working on a problem that involves sharding. As part of the problem I need to find the fastest way to partition a large Ruby hash (> 200,000 entries) into two or more pieces.
Are there any non O(n) approaches?
Is there a non-Ruby i.e. C/C++ implementation?
Please don't reply with examples using the trivial approach of converting the hash to an array and rebuilding N distinct hashes.
My concern is that Ruby is too slow to do this kind of work.
The initial approach
This was the first solution I tried. What was appealing about it was:
it didn't need to loop slavishly across the hash
it didn't need to manage a counter to allocate the members evenly among the shards.
it's short and neat looking
Ok, it isn't O(n) but it relies on methods in the standard library which I figured would be faster than writing my own Ruby code.
pivot = s.size / 2
slices = s.each_slice(pivot)
s1 = Hash[*slices.entries[0].flatten]
s2 = Hash[*slices.entries[1].flatten]
A better solution
Mark and Mike were kind enough to suggest approaches. I have to admit that Mark's approach felt wrong: it did exactly what I didn't want, looping over all of the members of the hash and evaluating a conditional as it went. But since he'd taken the time to do the evaluation, I figured that I should try a similar approach and benchmark that. This is my adapted version of his approach (my keys aren't numbers, so I can't take his approach verbatim):
def split_shard(s)
  shard1 = {}
  shard2 = {}
  t = Benchmark.measure do
    n = 0
    pivot = s.size / 2
    s.each_pair do |k,v|
      if n < pivot
        shard1[k] = v
      else
        shard2[k] = v
      end
      n += 1
    end
  end
  $b += t.real
  $e += s.size
  return shard1, shard2
end
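The same idea extends from two shards to N. The sketch below is one way it might be done, assuming Ruby 2.1 or later for Array#to_h; partition_hash is a hypothetical helper and was not part of the benchmark in this post, so its speed relative to the loop above is untested.

```ruby
# Hypothetical helper: split a hash into at most n shards of near-equal size,
# keeping the hash's insertion order.
def partition_hash(hash, n)
  return [] if hash.empty?
  slice_size = (hash.size.to_f / n).ceil      # entries per shard, rounded up
  hash.each_slice(slice_size).map(&:to_h)     # each slice is an array of [key, value] pairs
end

shards = partition_hash({ a: 1, b: 2, c: 3, d: 4, e: 5 }, 2)
p shards.map(&:size)  # prints [3, 2]
```

Rounding the slice size up means a request for n shards can come back with fewer than n when the hash is small.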
The results
In both cases, a large number of hashes are split into shards. The total number of elements across all of the hashes in the test data set was 1,680,324.
My initial solution - which had to be faster because it uses methods in the standard library and minimises the amount of Ruby code (no loop, no conditional) - runs in just over 9s
Mark's approach runs in just over 5s
That's a significant win
Take away
Don't be fooled by 'intuition' - measure the performance of competing algorithms
Don't worry about Ruby's performance as a language - my initial concern was that if I'm doing ten million of these operations, it could take a significant amount of time in Ruby, but it doesn't really.
Thanks to Mark and Mike who both get points from me for their help.
Thanks!
I don't see how you can achieve this using an unmodified "vanilla" Hash - I'd expect that you'd need to get into the internals in order to make partitioning into some kind of bulk memory-copying operation. How good is your C?
I'd be more inclined to look into partitioning instead of creating a Hash in the first place, especially if the only reason for the 200K-item Hash existing in the first place is to be subdivided.
EDIT: After thinking about it at the gym...
The problem with finding some existing solution is that someone else needs to have (a) experienced the pain, (b) had the technical ability to address it and (c) felt community-friendly enough to have released it into the wild. Oh, and for your OS platform.
What about using a B-Tree instead of a Hash? Hold your data sorted by key and it can be traversed by memcpy(). B-Tree retrieval is O(log N), which isn't much of a hit against Hash most of the time.
I found something here which might help, and I'd expect there'd only be a little duck-typing wrapper needed to make it quack like a Hash.
Still gonna need those C/C++ skills, though. (Mine are hopelessly rusty).
This probably isn't fast enough for your needs (which sound like they'll require an extension in C), but perhaps you could use Hash#select?
I agree with Mike Woodhouse's idea. Is it possible for you to construct your shards at the same place where the original 200k-item hash is being constructed? If the items are coming out of a database, you could split your query into multiple disjoint queries, based either on some aspect of the key or by repeatedly using something like LIMIT 10000 to grab a chunk at a time.
Additional
Hi Chris, I just compared your approach to using Hash#select:
require 'benchmark'
s = {}
1.upto(200_000) { |i| s[i] = i }
Benchmark.bm do |x|
  x.report {
    pivot = s.size / 2
    slices = s.each_slice(pivot)
    s1 = Hash[*slices.entries[0].flatten]
    s2 = Hash[*slices.entries[1].flatten]
  }
  x.report {
    s1 = {}
    s2 = {}
    s.each_pair do |k,v|
      if k < 100_001
        s1[k] = v
      else
        s2[k] = v
      end
    end
  }
end
It looks like Hash#select is much faster, even though it goes through the entire large hash for each one of the sub-hashes:
# ruby test.rb
       user     system      total        real
   0.560000   0.010000   0.570000 (  0.571401)
   0.320000   0.000000   0.320000 (  0.323099)
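For reference, a Hash#select split that walks the whole hash once per sub-hash might look like the sketch below. It assumes the same integer keys 1 to 200_000 as the benchmark above and Ruby 1.9 or later, where Hash#select returns a Hash; pairing select with reject is my own guess, not the code that produced the timings above.

```ruby
require 'benchmark'

s = {}
1.upto(200_000) { |i| s[i] = i }
pivot = s.size / 2

s1 = s2 = nil
elapsed = Benchmark.realtime do
  s1 = s.select { |k, _v| k <= pivot }   # first full pass: lower half
  s2 = s.reject { |k, _v| k <= pivot }   # second full pass: upper half
end

puts format('select/reject split took %.3fs', elapsed)
```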
Hope this helps.
