How to override Enumerable's sort method - ruby

I'm trying to create a method that uses the functionality of Enumerable's sort method.
Imagine I have this data
data = [{project: 'proj', version: '1.1'}, {project: 'proj2', version: '1.11'}, {project: 'proj3', version: '1.2'}]
I want to be able to call the method like this:
data.natural_sort{|a,b| b[:version] <=> a[:version] }
The actual call that happens would achieve something like this:
data.sort{|a,b| MyModule.naturalize_str(b[:version]) <=> MyModule.naturalize_str(a[:version]) }
Here's my current broken code:
Enumerable.module_eval do
def natural_sort(&block)
if !block_given?
block = Proc.new{|a,b| Rearmed.naturalize_str(a[:version]) <=> Rearmed.naturalize_str(b[:version])}
end
sort do |a,b|
a = Rearmed.naturalize_str(a)
b = Rearmed.naturalize_str(b)
block.call(a,b)
end
end
end
It throws an error because a and b are the hashes instead of the versions I wanted.

You're working at odds with yourself here. In your natural_sort block you're expecting hash objects, yet within the implementation you've explicitly converted a and b to naturalized strings before the block ever sees them.
In Ruby there are two ways to sort: the sort method, which compares a,b pairs, and the sort_by method, which maps each element to an intermediate sort key and compares those keys. The sort_by approach is usually faster when the transform is expensive, since it applies the transform to each object once, while a sort block repeats it on every comparison.
Here's a rewrite:
def natural_sort_by(&block)
if (block_given?)
sort_by do |o|
Rearmed.naturalize_str(yield(o))
end
else
sort_by do |o|
Rearmed.naturalize_str(o)
end
end
end
Then you can call it this way:
data.natural_sort_by { |o| o[:version] }
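For a self-contained version to experiment with, the sketch below places natural_sort_by inside Enumerable. Rearmed.naturalize_str is not shown in the question, so NaturalKey.for is a hypothetical stand-in that splits a string into text and integer chunks; it is an assumption for illustration, not the library's implementation.

```ruby
# Hypothetical stand-in for Rearmed.naturalize_str (an assumption, not the real library).
module NaturalKey
  # Splits "1.11" into tagged chunks so digit runs compare as integers.
  # The leading 0/1 tag keeps text chunks and number chunks from being
  # compared with each other directly.
  def self.for(value)
    value.to_s.scan(/\d+|\D+/).map do |chunk|
      chunk.match?(/\A\d+\z/) ? [1, chunk.to_i] : [0, chunk]
    end
  end
end

module Enumerable
  # Sorts by the naturalized form of each element, or of the block's result.
  def natural_sort_by
    return sort_by { |o| NaturalKey.for(o) } unless block_given?

    sort_by { |o| NaturalKey.for(yield(o)) }
  end
end

data = [
  { project: 'proj',  version: '1.1' },
  { project: 'proj2', version: '1.11' },
  { project: 'proj3', version: '1.2' }
]

sorted = data.natural_sort_by { |o| o[:version] }
puts sorted.map { |o| o[:version] }.inspect
```

To get descending order, as in the question's b <=> a comparison, calling reverse on the result is probably the simplest option.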

Related

Array difference by explicitly specified method or block

If I have Arrays a and b, the expression a-b returns an Array with all those elements in a which are not in b. "Not in" means inequality (!=) here.
In my case, both arrays only contain elements of the same type (or, from the duck-typing perspective, only elements which understand an "equality" method f).
Is there an easy way to specify this f as a criterion of equality, similar to the way I can provide my own comparator when doing sort? Currently, I implement this explicitly:
# Get the difference a-b, based on 'f':
a.select { |ael| b.all? {|bel| ael.f != bel.f} }
This works, but I wonder if there is an easier way.
UPDATE: From the comments to this question, I get the impression, that a concrete example would be appreciated. So, here we go:
class Dummy; end
# Create an Array of Dummy objects.
a = Array.new(99) { Dummy.new }
# Pick some of them at random
b = Array.new(10) { a.sample }
# Now I want to get those elements from a, which are not in b.
diff = a.select { |ael| b.all? {|bel| ael.object_id != bel.object_id} }
Of course in this case, I could also have said ! ael eql? bel, but in my general solution, this is not the case.
The "normal" object equality for e.g. Hashes and set operations on Arrays (such as the - operation) uses the output of the Object#hash method of the contained objects along with the semantics of the a.eql?(b) comparison.
This can be used to improve performance. Ruby first compares the return values of the objects' hash methods: two objects with different hash values are assumed not to be eql?, and only objects with the same hash value are then checked with eql?.
For a normal a - b operation, Ruby can thus calculate the hash value of each object once and look elements up by that value instead of comparing every pair. This is quite fast.
Now, if you have a custom equality, your best bet would be to override the object's hash and eql? methods so that they return suitable values for those semantics.
A common approach is to build an array containing all data that is part of the object's identity and use its hash, e.g.
class MyObject
  #...
  attr_accessor :foo, :bar
  def hash
    [self.class, foo, bar].hash
  end
  # eql? has to agree with hash, otherwise equal hashes are never confirmed as equal
  def eql?(other)
    other.class == self.class && other.foo == foo && other.bar == bar
  end
  alias_method :==, :eql?
end
In your object's hash method, you would then include all data that is currently considered by your f comparison method. Instead of actually using f then, you are using the default semantics of all Ruby objects and again can achieve quick set operations with your objects.
If however this is not feasible (e.g. because you need different equality semantics based on use-case), you could emulate what ruby does on your own.
With your f method, you could then perform your set operation as follows:
def f_difference(a, b)
a_map = a.each_with_object({}) do |a_el, hash|
hash[a_el.f] = a_el
end
b.each do |b_el|
a_map.delete b_el.f
end
a_map.values
end
With this approach, you only need to calculate the f value of each of your objects once. We first build a hash map with all f values and elements from a and remove the matching elements from b according to their f values. The remaining values are the result.
This approach saves you from having to loop over b for each object in a, which can be slow if you have a lot of objects. If, however, you only have a few objects in each of your arrays, your original approach should already be fine.
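One thing to keep in mind with f_difference: because the f values are used as hash keys, two elements of a with the same f value collapse into one entry, and the result follows hash insertion order. If that matters, a variant along the following lines may be closer to what a - b does. This is only a sketch; difference_by is a hypothetical name, and the block plays the role of f.

```ruby
require 'set'

# Returns the elements of a whose key (as computed by the block) does not
# occur among the keys of b. Keeps the order and the duplicates of a.
def difference_by(a, b)
  b_keys = b.map { |el| yield(el) }.to_set
  a.reject { |el| b_keys.include?(yield(el)) }
end

a = %w[apple Banana cherry apple]
b = %w[BANANA]

# Case-insensitive difference: the block is the custom equality criterion.
puts difference_by(a, b) { |s| s.downcase }.inspect
```

The key for each element of b is still computed only once; the keys of a are computed once per element, so the overall work stays linear in the size of both arrays.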
Let's have a look at a benchmark where I use the standard hash method in place of your custom f to have a comparable result.
require 'benchmark/ips'
def question_diff(a, b)
a.select { |ael| b.all? {|bel| ael.hash != bel.hash} }
end
def answer_diff(a, b)
a_map = a.each_with_object({}) do |a_el, hash|
hash[a_el.hash] = a_el
end
b.each do |b_el|
a_map.delete b_el.hash
end
a_map.values
end
A = Array.new(100) { rand(10_000) }
B = Array.new(10) { A.sample }
Benchmark.ips do |x|
x.report("question") { question_diff(A, B) }
x.report("answer") { answer_diff(A, B) }
x.compare!
end
With Ruby 2.7.1, I get the following result on my machine, showing that the original approach from the question is about 5.9 times slower than the optimized version from my answer:
Warming up --------------------------------------
question 1.304k i/100ms
answer 7.504k i/100ms
Calculating -------------------------------------
question 12.779k (± 2.0%) i/s - 63.896k in 5.002006s
answer 74.898k (± 3.3%) i/s - 375.200k in 5.015239s
Comparison:
answer: 74898.0 i/s
question: 12779.3 i/s - 5.86x (± 0.00) slower

Ruby Array Chainables Combining Map & Inject

Is there a nice way to create a Ruby array chainable on the fly that combines map and inject?
Here's what I mean. Let a be an array of integers, then to get all sums of 2 adjacent elements we can do:
a.each_cons(2).map(&:sum)
We can also get the product of all the elements of an array a by:
a.inject(1,&:*)
But we can't do:
a.each_cons(2).map(&:inject(1,&:*))
We can, however, define an array chainable:
class Array
def prod
return self.inject(1,&:*)
end
end
Then a.each_cons(2).map(&:prod) works fine.
If you use this weird Symbol patch shown here:
https://stackoverflow.com/a/23711606/2981429
class Symbol
def call(*args, &block)
->(caller, *rest) { caller.send(self, *rest, *args, &block) }
end
end
This allows you to pass arguments to the proc shorthand by means of Currying:
[[1,2],[3,4]].map(&:inject.(1, &:*))
# => [2, 12]
I'm sure this has been requested in Ruby core many times, unfortunately I don't have a link to the Ruby forums right now but I promise you it's on there.
I doubt that this is what you're looking for, but don't forget that you can still call map with a normal block.
a.each_cons(2).map { |n1, n2| n1 * n2 }
Since you didn't mention it in the question I thought you might have overlooked the easiest option.
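A middle ground between patching Array and writing the block out each time is to keep the reduction in a lambda and pass it with &. The sketch below assumes Ruby 2.4 or later for Array#sum; PROD is just an illustrative name.

```ruby
# A lambda holding the "product of all elements" logic; no core class is patched.
PROD = ->(xs) { xs.inject(1, :*) }

a = [1, 2, 3, 4]

# each_cons(2) yields [1, 2], [2, 3], [3, 4]
sums     = a.each_cons(2).map(&:sum)
products = a.each_cons(2).map(&PROD)

puts sums.inspect      # [3, 5, 7]
puts products.inspect  # [2, 6, 12]
```

Because the lambda is an ordinary object, it can be reused across call sites without adding a method to every Array in the program.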

Ruby Enumerable#find returning mapped value

Does Ruby's Enumerable offer a better way to do the following?
output = things
.find { |thing| thing.expensive_transform.meets_condition? }
.expensive_transform
Enumerable#find is great for finding an element in an enumerable, but returns the original element, not the return value of the block, so any work done is lost.
Of course there are ugly ways of accomplishing this...
Side effects
def costly_find(things)
output = nil
things.each do |thing|
expensive_thing = thing.expensive_transform
if expensive_thing.meets_condition?
output = expensive_thing
break
end
end
output
end
Returning from a block
This is the alternative I'm trying to refactor
def costly_find(things)
things.each do |thing|
expensive_thing = thing.expensive_transform
return expensive_thing if expensive_thing.meets_condition?
end
nil
end
each.lazy.map.find
def costly_find(things)
things
.each
.lazy
.map(&:expensive_transform)
.find(&:meets_condition?)
end
Is there something better?
Of course there are ugly ways of accomplishing this...
If you had a cheap operation, you'd just use:
collection.map(&:operation).find(&:condition?)
To make Ruby call operation only "on a as-needed basis" (as the documentation says), you can simply prepend lazy:
collection.lazy.map(&:operation).find(&:condition?)
I don't think this is ugly at all. Quite the contrary, it looks elegant to me.
Applied to your code:
def costly_find(things)
things.lazy.map(&:expensive_transform).find(&:meets_condition?)
end
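To see the effect of lazy, it can help to count how often the transform runs. The sketch below uses plain integers and a doubling step in place of expensive_transform; find_doubled and its lazy: keyword are hypothetical names for illustration only.

```ruby
# Doubles each number and returns the first result equal to 12,
# together with the number of times the doubling step ran.
def find_doubled(numbers, lazy:)
  calls = 0
  source = lazy ? numbers.lazy : numbers
  found = source.map { |n| calls += 1; n * 2 }.find { |v| v == 12 }
  [found, calls]
end

puts find_doubled([1, 3, 6, 4], lazy: false).inspect  # [12, 4]
puts find_doubled([1, 3, 6, 4], lazy: true).inspect   # [12, 3]
```

The eager version transforms all four elements before find starts; the lazy version stops after the third element, which is the first one that meets the condition.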
I would be inclined to create an enumerator that generates values thing.expensive_transform and then make that the receiver for find with meets_condition? in find's block. For one, I like the way that reads.
Code
def costly_find(things)
Enumerator.new { |y| things.each { |thing| y << thing.expensive_transform } }.
find(&:meets_condition?)
end
Example
class Thing
attr_reader :value
def initialize(value)
@value = value
end
def expensive_transform
self.class.new(value*2)
end
def meets_condition?
value == 12
end
end
things = [1,3,6,4].map { |n| Thing.new(n) }
#=> [#<Thing:0x00000001e90b78 @value=1>, #<Thing:0x00000001e90b28 @value=3>,
# #<Thing:0x00000001e90ad8 @value=6>, #<Thing:0x00000001e90ab0 @value=4>]
costly_find(things)
#=> #<Thing:0x00000001e8a3b8 @value=12>
In the example I have assumed that expensive_things and things are instances of the same class, but if that is not the case the code would need to be modified in the obvious way.
I don't think there is an "obvious best general solution" for your problem that is also simple to use. You have two procedures involved (expensive_transform and meets_condition?), and, if this were a library method, you would also need a third parameter: the value to return if no transformed element meets the condition. You return nil in this case, but in a general solution, expensive_transform might also yield nil, and only the caller knows what unique value would indicate that the condition has not been met.
Hence, a possible solution within Enumerable would have the signature
module Enumerable
def find_transformed(default_return_value, transform_proc, condition_proc)
...
end
end
or something similar, so this is not particularly elegant either.
You could do it with a single block, if you agree to merge the semantics of the two procedures into one: You have only one procedure, which calculates the transformed value and tests it. If the test succeeds, it returns the transformed value, and if it fails, it returns the default value:
module Enumerable
  def find_by(default_value, &block)
    result = default_value
    each do |element|
      result = block.call(element)
      # Stop at the first element for which the block signals a match
      break if result != default_value
    end
    result
  end
end
You would use it in your case like this:
my_collection.find_by(nil) do |el|
transformed_value = expensive_transform(el)
meets_condition?(transformed_value) ? transformed_value : nil
end
I'm not sure whether this is really intuitive to use...
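If nil and false are never valid transformed values, the single-block idea can also be written with lazy and filter_map, which needs no patch to Enumerable. This is a sketch assuming Ruby 2.7 or later (where filter_map was introduced); first_transformed is a hypothetical name, and the integers stand in for the expensive objects.

```ruby
# Returns the first truthy block result, evaluating the block lazily.
# Falsy block results (nil/false) are treated as "condition not met".
def first_transformed(items)
  items.lazy.filter_map { |item| yield(item) }.first
end

calls = 0
result = first_transformed([1, 3, 6, 4]) do |n|
  calls += 1
  doubled = n * 2            # stand-in for expensive_transform
  doubled if doubled > 5     # stand-in for meets_condition?
end

puts result  # 6
puts calls   # 2
```

The trade-off is the one described above: there is no way to tell "no match" apart from a transform that legitimately returns nil, so this only fits when that case cannot occur.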

Ruby Hash destructive vs. non-destructive method

Could not find a previous post that answers my question...I'm learning how to use destructive vs. non-destructive methods in Ruby. I found an answer to the exercise I'm working on (destructively adding a number to hash values), but I want to be clear on why some earlier solutions of mine did not work. Here's the answer that works:
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each { |k, v| the_hash[k] = v + number_to_add_to_each_value}
end
These two solutions come back as non-destructive. Since they all use "each", I cannot figure out why. Is it the equals sign above that makes the method destructive?
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each_value { |v| v + number_to_add_to_each_value}
end
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each { |k, v| v + number_to_add_to_each_value}
end
The terms "destructive" and "non-destructive" are a bit misleading here. Better is to use the conventional "in-place modification" vs. "returns a copy" terminology.
Generally methods that modify in-place have ! at the end of their name to serve as a warning, like gsub! for String. Some methods that pre-date this convention do not have them, like push for Array.
The = performs an assignment within the loop. Your other examples don't actually do anything useful since each returns the original object being iterated over regardless of any results produced.
If you wanted to return a copy you'd do this:
def modify_a_hash(the_hash, number_to_add)
Hash[
the_hash.collect do |k, v|
[ k, v + number_to_add ]
end
]
end
That would return a copy. The inner operation collect transforms key-value pairs into new key-value pairs with the adjustment applied. No = is required since there's no assignment.
The outer method Hash[] transforms those key-value pairs into a proper Hash object. This is then returned and is independent of the original.
Generally a non-destructive or "return a copy" method needs to create a new, independent version of the thing it's manipulating for the purpose of storing the results. This applies to String, Array, Hash, or any other class or container you might be working with.
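On Ruby 2.4 and later, Hash#transform_values and its in-place counterpart transform_values! express the same two behaviours directly. The sketch below is one way the exercise could be written with them; the method names add_to_values and add_to_values! are made up for illustration.

```ruby
# Returns a new hash; the_hash is left untouched.
def add_to_values(the_hash, number_to_add)
  the_hash.transform_values { |v| v + number_to_add }
end

# Modifies the_hash in place and returns it.
def add_to_values!(the_hash, number_to_add)
  the_hash.transform_values! { |v| v + number_to_add }
end

scores = { a: 1, b: 2 }

copy = add_to_values(scores, 10)
puts copy.inspect    # {:a=>11, :b=>12} on older Rubies; newer ones print {a: 11, b: 12}
puts scores.inspect  # original is unchanged

add_to_values!(scores, 10)
puts scores.inspect  # original now holds the updated values
```

The ! suffix follows the naming convention mentioned above: the bang version is the one that changes its receiver.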
Maybe this slightly different example will be helpful.
We have a hash:
2.0.0-p481 :014 > hash
=> {1=>"ann", 2=>"mary", 3=>"silvia"}
Then we iterate over it and change all the letters to the uppercase:
2.0.0-p481 :015 > hash.each { |key, value| value.upcase! }
=> {1=>"ANN", 2=>"MARY", 3=>"SILVIA"}
The original hash has changed because we used upcase! method.
Compare this with the method without the ! sign, which doesn't modify the hash values:
2.0.0-p481 :017 > hash.each { |key, value| value.downcase }
=> {1=>"ANN", 2=>"MARY", 3=>"SILVIA"}

How does one populate an array in Ruby?

Here is the code I'm working with:
class Trader
  def initialize(ticker ="GLD")
    @ticker = ticker
  end
  def yahoo_data(days=12)
    require 'yahoofinance'
    YahooFinance::get_historical_quotes_days( @ticker, days ) do |row|
      puts "#{row.join(',')}" # this is where a solution is required
    end
  end
end
The yahoo_data method gets data from Yahoo Finance and puts the price history on the console. But instead of a simple puts that evaporates into the ether, how would you use the preceding code to populate an array that can later be manipulated as an object?
Something along the lines of :
do |row| populate_an_array_method(row.join(',')) end
If you don't give a block to get_historical_quotes_days, you'll get an array back. You can then use map on that to get an array of the results of join.
In general, since Ruby 1.8.7 most iterator methods return an enumerator when they're called without a block. So if foo.bar {|x| puts x} would print the values 1,2,3, then enum = foo.bar will return an enumerator that yields the values 1,2,3. And if you do arr = foo.bar.to_a, you'll get the array [1,2,3].
If you have an iterator method that does not do this (from some library perhaps, which does not adhere to this convention), you can use foo.enum_for(:bar) to get an enumerator that yields all the values yielded by bar.
So hypothetically, if get_historical_quotes_days did not already return an array, you could use YahooFinance.enum_for(:get_historical_quotes_days, @ticker, days).map {|row| row.join(",") } to get what you want.
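The enum_for technique can be tried without the yahoofinance gem. In the sketch below, Quotes.each_quote is a hypothetical block-only iterator standing in for a library method that yields rows; it is not part of any real API. Both ways of filling an array are shown: appending inside the block, and wrapping the method with enum_for.

```ruby
# Hypothetical stand-in for a library method that only yields rows to a block.
module Quotes
  def self.each_quote(days)
    days.times { |i| yield ["day#{i + 1}", 100 + i] }
  end
end

# Option 1: collect inside the block instead of printing.
collected = []
Quotes.each_quote(3) { |row| collected << row.join(',') }

# Option 2: turn the iterator into an enumerator and use map.
# Extra arguments to enum_for are passed on to each_quote.
mapped = Quotes.enum_for(:each_quote, 3).map { |row| row.join(',') }

puts collected.inspect  # ["day1,100", "day2,101", "day3,102"]
puts mapped.inspect     # ["day1,100", "day2,101", "day3,102"]
```

Either array can then be stored in an instance variable or returned from yahoo_data for later use.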
