I am working with data structures fundamentals in Ruby for learning at the CS sophomore/junior level.
My question: Given the following code, does anyone see any design issues with this approach to a data structures library in Ruby? Especially the Module#abstract_method. Is it okay to do this in terms of duck typing philosophy? Does this make the code clearer for people from static languages and give some semblance of an interface?
class Module
  def abstract_method(symbol)
    module_eval <<-"end_eval"
      def #{symbol.id2name}(*args)
        raise MethodNotImplementedError
      end
    end_eval
  end
end
class AbstractObject < Object
  abstract_method :compare_to
  protected :compare_to
  class MethodNotImplementedError < StandardError; end
  def initialize
    super
  end
  include Comparable
  def <=>(other)
    if is_a?(other.class)
      return compare_to(other)
    elsif other.is_a?(self.class)
      return -other.compare_to(self)
    else
      return self.class <=> other.class
    end
  end
end
# Methods for insertion/deletion should be provided by concrete implementations, as this behavior
# is unique to the type of data structure. Also, concrete classes should override purge to discard
# all the contents of the container.
class Container < AbstractObject
  include Enumerable
  def initialize
    super
    @count = 0
  end
  attr_reader :count
  alias :size :count
  # should return an iterator
  abstract_method :iter
  # depends on iterator object returned from iter method
  # layer of abstraction for how to iterate a structure
  def each
    i = iter
    while i.more?
      yield i.succ
    end
  end
  # a visitor provides another layer of abstraction for additional
  # extensible and re-usable traversal operations
  def accept(visitor)
    raise ArgumentError, "Argument must be a visitor" unless visitor.is_a?(Visitor)
    each do |obj|
      break if visitor.done?
      visitor.visit(obj)
    end
  end
  # expected to override this in derived classes to clear container
  def purge
    @count = 0
  end
  def empty?
    count == 0
  end
  def full?
    false
  end
  def to_s
    s = ""
    each do |obj|
      s << ", " if not s.empty?
      s << obj.to_s
    end
    # Class does not define +, so convert the class name to a String first
    self.class.to_s + "{" + s + "}"
  end
end
class List < Container
  def initialize
    super
  end
  def compare_to(obj)
    "fix me"
  end
end
A few remarks:
Defining a method that only raises a not-implemented error is somewhat redundant, since Ruby raises NoMethodError anyway when a method that does not exist is called. The code you wrote there is just as useful as simply putting a comment to say "You must implement a method called compare_to". In fact, that is what the Enumerable module in Ruby's core library does: its documentation specifically says that in order to use the functionality in Enumerable you must define an each method.
A compare_to method is also redundant, since that is precisely what the <=> operator is for.
Using an actual iterator object is a bit overkill in Ruby, since blocks tend to offer a much more elegant and simple approach. The same goes for your visitor pattern: you don't need a visitor for "extensible and re-usable traversal operations" when you can just pass a block to a traversal method. Enumerable has many examples: each, each_with_index, map, inject, select, delete_if, partition, etc. All of these use a block in a different way to provide a different type of functionality, and other functionality can be added in a fairly simple and consistent way (especially when you have open classes).
Regarding interfaces, in Ruby (and pretty much any other dynamic language, like Python) people usually use interfaces that are implicit, which means that you don't actually define the interface in code. Instead you typically rely on documentation and proper testing suites to ensure that code works well together.
I think that your code may be more coherent to someone coming from a Java world because it sticks to the "Java way" of doing things. However to other Ruby programmers your code would be confusing and difficult to work with since it doesn't really stick to the "Ruby way" of doing things. For example, an implementation of a select function using an iterator object:
it = my_list.iter
results = []
while it.has_next?
  obj = it.next
  results << obj if some_condition?
end
is much less clear to a Ruby programmer than:
results = my_list.select do |obj|
  some_condition?
end
If you would like to see an example of a data structures library in Ruby, you can see the algorithms gem here: http://rubydoc.info/gems/algorithms/0.3.0/frames
Also take a look at what is provided by default in the Enumerable module: http://www.ruby-doc.org/core/classes/Enumerable.html. When you include Enumerable you receive all of these functions for free.
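To make the block-based approach concrete, here is a minimal sketch of a container that defines only each and gets the rest from Enumerable. The LinkedList and Node names are hypothetical and not part of your library; treat this as one possible shape rather than the definitive design.

```ruby
# Hypothetical singly linked list; LinkedList and Node are illustrative names.
class LinkedList
  include Enumerable

  Node = Struct.new(:value, :next_node)

  def initialize
    @head = nil
    @tail = nil
  end

  # Append a value and return self so calls can be chained.
  def <<(value)
    node = Node.new(value, nil)
    if @tail
      @tail.next_node = node
    else
      @head = node
    end
    @tail = node
    self
  end

  # The only traversal method the class has to define itself.
  def each
    return enum_for(:each) unless block_given?

    node = @head
    while node
      yield node.value
      node = node.next_node
    end
    self
  end
end

list = LinkedList.new
list << 3 << 1 << 2

list.select(&:odd?)            # => [3, 1]
list.sort                      # => [1, 2, 3]
list.take_while { |n| n > 1 }  # => [3]
```

Here take_while plays the role of the visitor's done? check: traversal stops as soon as the block returns false, without a separate visitor class.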
I hope this helps!
Related
Would it be bad practice to have a method that returns self on block_given? and a different type if a block was not provided?
The example:
Config#item will return the item if a block is not given, and will return Config if it is given.
class Item
  attr_reader :key
  def initialize(key)
    @key = key
  end
  def do_stuff
    puts "#{key} doing some stuff"
    self
  end
end
class Config
  attr_reader :items
  def initialize
    @items = {}
  end
  def item(key)
    itm = @items[key] ||= Item.new(key)
    if block_given?
      yield(itm)
      self
    else
      itm
    end
  end
end
Usage:
cnf = Config.new
cnf.item("foo") do |itm|
  itm.do_stuff
end
.item("bar") do |itm|
  itm.do_stuff
end
foo = cnf.item("foo").do_stuff
cnf.item("baz").do_stuff
foo.do_stuff
The model is meant to use the same method item as a getter and as a way to refer to an item that needs to be configured or which configuration needs to be reopened.
Would it be bad practice to have a method that returns self on block_given? and a different type if a block was not provided?
No. In fact, there is an extremely well-known example of a method that has this exact signature: each. each returns self if a block is given, and an Enumerator when no block is given. In fact, many methods in Enumerable return an Enumerator when no block is given and something else if there is a block.
(I am actually surprised that you haven't encountered each or Enumerable so far.)
Not at all, as long as the users of your method have adequate understanding of this. Documentation helps quite significantly in these situations.
Consider the Ruby Standard Library. Many methods return different types based on their inputs and block_given?, such as Enumerable#map, Hash#each, and Range#step.
Like the standard library authors, you have to decide whether you prefer a compact interface to your class/model or consistent behavior from your methods. There are always tradeoffs to make, and you have numerous strong examples of each of these to draw from within the Ruby Standard Library.
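As a quick illustration of the precedent in the core library, the sketch below calls Array#each with and without a block. The variable names are only for illustration.

```ruby
numbers = [1, 2, 3]

with_block    = numbers.each { |n| n * 2 }  # block given: returns the receiver
without_block = numbers.each                # no block: returns an Enumerator

with_block.equal?(numbers)        # => true
without_block.class               # => Enumerator
without_block.with_index(1).to_a  # => [[1, 1], [2, 2], [3, 3]]
```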
Simple question:
In Java you can define:
<T> void myFunction(T value) { /* do stuff */ }
Is there an equivalent in Ruby, and if not, how can I achieve a similar result (passing class types)?
You can pass a class to a method just like passing normal objects. For example
def create_object(klass, *args)
  klass.new(*args)
end
create_object(String) #=> ""
create_object(Hash) #=> {}
create_object(Array, 3, :hello) #=> [:hello, :hello, :hello]
First a few definitions
Generics is an abstraction over types
Polymorphism is a sum-type pattern
Composition is a product-type pattern
Most OO languages lean towards polymorphism
Ruby is an OO language, and polymorphism is at the core of its design. Ruby only started to gain optional type signatures (RBS) with Ruby 3, so we may see more interesting generics from that point on; until now, I haven't heard of generics being a language feature.
To achieve this, we technically need to figure out a way of applying a method to separate types without knowing the type. A lot of code duplication is possible.
Your Java example…
<T> void myFunction(T value) { /* do stuff */ }
…can be translated into Ruby as
def myFunction(value)
  raise "Only works with T types" unless value.is_a? T
  # do stuff
end
Where the magic now has to happen is in defining the possible set of T. I'm thinking something like…
class T
  def _required_for_MyFunction()
    raise "T is abstract!"
  end
end
class Something < T
  def _required_for_MyFunction()
    # does something
  end
end
class Nothing < T
  def _required_for_MyFunction()
    # does nothing
  end
end
The painful part of polymorphism is that you have to define your type space from the get-go. The good part is that you have total control of the domain space.
Ruby follows duck typing. You can pass arguments of any class to any method (which is the original reason why you might need generics). If you want to get the class of said argument, you can still use #class
def foo(bar)
  bar.class
end
foo 'baz' # => String
foo 42 # => Integer (Fixnum before Ruby 2.4)
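To show what duck typing buys you in place of a type parameter, here is a small sketch. The total_length helper is hypothetical; it accepts any object that responds to each, whatever its class.

```ruby
require "set"

# Hypothetical helper: works with any argument that responds to #each.
def total_length(items)
  unless items.respond_to?(:each)
    raise ArgumentError, "expected something that responds to each"
  end

  total = 0
  items.each { |item| total += item.to_s.length }
  total
end

total_length(["ab", "cde"])       # => 5
total_length(1..3)                # => 3
total_length(Set.new([:a, :bc]))  # => 3
```

The respond_to? guard is optional; without it, an unsuitable argument would simply raise NoMethodError at the each call.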
Say I have a parent class:
class Stat
  def val
    raise "method must be implemented by subclass"
  end
end
And a subclass:
class MyStat < Stat
  def val
    # performs costly calculation and returns value
  end
end
By virtue of extending the parent class, I would like the subclass to not have to worry about caching the return value of the "val" method.
There are many patterns one could employ here to this effect, and I've tried several on for size, but none of them feel right to me and I know this is a solved problem so it feels silly to waste the time and effort. How is this most commonly dealt with?
Also, it's occurred to me that I may be asking the wrong questions. Maybe I shouldn't be using inheritance at all, but composition instead.
Any and all thoughts appreciated.
Edit:
Solution I went with can be summed up as follows:
class Stat
  def value
    @value ||= build_value
  end
  def build_value
    # to be implemented by subclass
  end
end
Typically I use a simple pattern regardless of the presence of inheritance:
class Parent
  def val
    @val ||= calculate_val
  end
  def calculate_val
    fail "Implementation missing"
  end
end
class Child < Parent
  def calculate_val
    # some expensive computation
  end
end
I always prefer to wrap the complex and expensive logic in its own method or methods that have no idea that their return value will be memoized. It gives you a cleaner separation of concerns; one method is for caching, one method is for computing.
It also happens to give you a nice way of overriding the logic, without overriding the caching logic.
In the simple example above, the memoized method val is pretty redundant. But the pattern also lets you memoize methods that accept arguments, or handle cases where the actual caching is less trivial, while maintaining that separation of responsibilities between caching and computing:
def is_prime(n)
  @is_prime ||= {}
  @is_prime[n] ||= compute_is_prime(n)
end
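One caveat with ||= is that it recomputes whenever the cached value is false or nil, which matters for a predicate like a primality test. The sketch below is one way around that, using Hash#key? to check the cache; the PrimeChecker class, its computations counter, and the trial-division helper are assumptions made for illustration.

```ruby
class PrimeChecker
  attr_reader :computations

  def initialize
    @is_prime = {}
    @computations = 0
  end

  # Caching only: key? distinguishes "cached false" from "not cached yet".
  def prime?(n)
    return @is_prime[n] if @is_prime.key?(n)

    @is_prime[n] = compute_prime(n)
  end

  private

  # Computing only: plain trial division, unaware of the cache.
  def compute_prime(n)
    @computations += 1
    return false if n < 2

    (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
  end
end

checker = PrimeChecker.new
checker.prime?(9)     # => false
checker.prime?(9)     # => false, served from the cache
checker.prime?(7)     # => true
checker.computations  # => 2
```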
If you want to keep the method names the same and not create new methods to put the logic in, then prepend a module instead of using parent/child inheritance.
module MA
  def val
    puts("module's method")
    @_val ||= super
  end
end
class CA
  def val
    puts("class's method")
    1
  end
  prepend MA
end
ca = CA.new
ca.val # will print "module's method" and "class's method". will return 1.
ca.val # will print "module's method". will return 1.
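The reason this works is that a prepended module sits in front of the class in the method lookup chain, so its val runs first and super reaches the class's own implementation. A small sketch, with hypothetical Cached and Expensive names and a call counter added only to make the caching visible:

```ruby
module Cached
  def val
    @_val ||= super
  end
end

class Expensive
  prepend Cached

  attr_reader :calls

  def initialize
    @calls = 0
  end

  def val
    @calls += 1 # counts how often the real computation runs
    42
  end
end

Expensive.ancestors.first(2)  # => [Cached, Expensive]

e = Expensive.new
e.val    # => 42
e.val    # => 42
e.calls  # => 1
```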
Due to the fact that Ruby doesn't support overloading (because of several trivial reasons), I am trying to find a way to 'simulate' it.
In statically typed languages, you shouldn't use instanceof (except in some particular cases, of course) to guide the application.
So, keeping this in mind, is this the correct way to overload a method in which I do care about the type of the variable? (In this case, I don't care about the number of parameters)
class User
  attr_reader :name, :car
end
class Car
  attr_reader :id, :model
end
class UserComposite
  attr_accessor :users
  # f could be a name, or a car id
  def filter(f)
    if (f.class == Car)
      filter_by_car(f)
    else
      filter_by_name(f)
    end
  end
  private
  def filter_by_name(name)
    # filtering by name...
  end
  def filter_by_car(car)
    # filtering by car id...
  end
end
There are cases where this is a good approach, and Ruby gives you the tools to deal with it.
However, your case is unclear because your example contradicts itself. If f.class == Car, then filter_by_car accepts a car, not a car id.
I'm assuming that you're actually passing instances of the class around, and if so you can do this:
# f could be a name, or a car
def filter(f)
  case f
  when Car
    filter_by_car(f)
  else
    filter_by_name(f)
  end
end
case [x] looks at each of its when [y] clauses and executes the first one for which [y] === [x]
Effectively this is running Car === f. When you call #=== on a class object, it returns true if the argument is an instance of the class.
This is quite a powerful construct because different classes can define different "case equality". For example the Regexp class defines case equality to be true if the argument matches the expression, so the following works:
case "foo"
when Integer
  # Doesn't run, the string isn't an instance of Integer
when /bar/
  # Doesn't run, Regexp doesn't match
when /o+/
  # Does run
end
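To see case equality outside of a case statement, you can call === directly. The describe helper below is hypothetical and only collects the dispatch in one place.

```ruby
Integer === 42    # => true  (42 is an instance of Integer)
Integer === "42"  # => false
/o+/ === "foo"    # => true  (the Regexp matches)
(1..10) === 5     # => true  (the Range covers 5)

# Hypothetical helper: each when clause is tested as `pattern === value`.
def describe(value)
  case value
  when Integer then "integer"
  when /\A\d+\z/ then "numeric string"
  else "something else"
  end
end

describe(42)    # => "integer"
describe("42")  # => "numeric string"
describe(4.2)   # => "something else"
```

Note that the receiver of === is the pattern in the when clause, not the value being tested, which is why Car === f works but f === Car does not.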
Personally, I don't see a big problem in branching that way. Although it would look cleaner with a case
def filter(f)
  case f
  when Car
    filter_by_car(f)
  else
    filter_by_name(f)
  end
end
A slightly more complicated example involves replacing branching with objects (Ruby is an OOP language, after all :) ). Here we define handlers for specific formats (classes) of data and then look up those handlers by the incoming data's class. Something along these lines:
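class UserComposite
  def filter(f)
    handler(f).filter
  end
  private
  def handler(f)
    klass_name = "#{f.class}Handler"
    # const_get and const_defined? are Module methods, so call them on the class
    klass = self.class.const_get(klass_name) if self.class.const_defined?(klass_name)
    klass ||= DefaultHandler
    klass.new(f)
  end
  class CarHandler
    def initialize(car)
      @car = car
    end
    def filter
      # ...
    end
  end
  class DefaultHandler # filter by name or whatever
    def initialize(obj)
      @obj = obj
    end
    def filter
      # ...
    end
  end
end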
There could be a problem lurking in your architecture - UserComposite needs to know too much about Car and User. Suppose you need to add more types? UserComposite would gradually become bloated.
However, it's hard to give specific advice because the business logic behind filtering isn't clear (architecture should always adapt to your real-world use-cases).
Is there really a common action you need to do to both Cars and Users?
If not, don't conflate the behavior into a single UserComposite class.
If so, you should use decorators with a common interface. Roughly like this:
class Filterable
  # common public methods for filtering, to be called by UserComposite
  def filter
    filter_impl # to be implemented by subclasses
  end
end
class FilterableCar < Filterable
  def initialize(car)
    @car = car
  end
  private
  def filter_impl
    # do specific stuff with @car
  end
end
class DefaultFilterable < Filterable
  # Careful, how are you expecting this generic_obj to behave?
  # It might be better to replace the default subclass with a FilterableUser.
  def initialize(generic_obj)
    # ...
  end
  private
  def filter_impl
    # generic behavior
  end
end
Then UserComposite only needs to care that it gets passed a Filterable, and all it has to do is call filter on that object. Having the common filterable interface keeps your code predictable, and easier to refactor.
I recommend that you avoid dynamically generating the filterable subclass name, because if you ever decide to rename the subclass, it'll be much harder to find the code doing the generating.
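As a rough sketch of what avoiding generated class names could look like, the example below uses an explicit lookup table instead. Everything here is an assumption for illustration: the Struct-based Car and User, the CarFilter and NameFilter classes, and the FILTERS constant are not from the question.

```ruby
# Hypothetical domain classes, only for illustration.
Car  = Struct.new(:id, :model)
User = Struct.new(:name, :car)

class CarFilter
  def initialize(car)
    @car = car
  end

  def filter(users)
    users.select { |user| user.car && user.car.id == @car.id }
  end
end

class NameFilter
  def initialize(name)
    @name = name
  end

  def filter(users)
    users.select { |user| user.name == @name }
  end
end

class UserComposite
  # Explicit lookup table: renaming a filter class means editing this line,
  # so a plain text search finds every place the class is used.
  FILTERS = { Car => CarFilter }.freeze

  def initialize(users)
    @users = users
  end

  def filter(criterion)
    filter_class = FILTERS.fetch(criterion.class, NameFilter)
    filter_class.new(criterion).filter(@users)
  end
end

mini = Car.new(1, "Mini")
users = [User.new("ann", mini), User.new("bob", nil)]
composite = UserComposite.new(users)

composite.filter(mini).map(&:name)   # => ["ann"]
composite.filter("bob").map(&:name)  # => ["bob"]
```

The table is keyed on the exact class, so a subclass of Car would fall through to NameFilter; whether that is acceptable depends on your domain.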
I have an object Results that contains an array of result objects along with some cached statistics about the objects in the array. I'd like the Results object to be able to behave like an array. My first cut at this was to add methods like this
def <<(val)
  @result_array << val
end
This feels very C-like and I know Ruby has a better way.
I'd also like to be able to do this
Results.each do |result|
  result.do_stuff
end
but am not sure what the each method is really doing under the hood.
Currently I simply return the underlying array via a method and call each on it, which doesn't seem like the most elegant solution.
Any help would be appreciated.
For the general case of implementing array-like methods, yes, you have to implement them yourself. Vava's answer shows one example of this. In the case you gave, though, what you really want to do is delegate the task of handling each (and maybe some other methods) to the contained array, and that can be automated.
require 'forwardable'
class Results
  include Enumerable
  extend Forwardable
  def_delegators :@result_array, :each, :<<
end
This class will get all of Array's Enumerable behavior as well as the Array << operator and it will all go through the inner array.
Note that when you switch your code from Array inheritance to this trick, your << method will return the inner array rather than the object itself, as the real Array#<< did. This can cost you an extra variable declaration every time you use <<.
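The difference in return values can be seen in a short sketch. Both class names are hypothetical: one delegates << with Forwardable, the other defines it by hand so that chaining keeps returning the wrapper.

```ruby
require "forwardable"

class DelegatingResults
  extend Forwardable
  def_delegators :@result_array, :<<, :size

  def initialize
    @result_array = []
  end
end

class ChainableResults
  def initialize
    @result_array = []
  end

  # Written by hand so that the wrapper, not the inner array, is returned.
  def <<(val)
    @result_array << val
    self
  end
end

(DelegatingResults.new << 1).class  # => Array
(ChainableResults.new << 1).class   # => ChainableResults
```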
each just goes through the array and calls the given block with each element; it is that simple. Since you are using an array inside the class anyway, you can just redirect your each method to the array's one, which is fast and easy to read and maintain.
class Result
  include Enumerable
  def initialize
    @results_array = []
  end
  def <<(val)
    @results_array << val
  end
  def each(&block)
    @results_array.each(&block)
  end
end
r = Result.new
r << 1
r << 2
r.each { |v|
  p v
}
# prints:
# 1
# 2
Note that I have mixed in Enumerable. That will give you a bunch of array methods like all?, map, etc. for free.
BTW, with Ruby you can forget about inheritance. You don't need interface inheritance because duck typing doesn't really care about the actual type, and you don't need code inheritance because mixins are just better for that sort of thing.
Your << method is perfectly fine and very Ruby like.
To make a class act like an array, without actually inheriting directly from Array, you can mix-in the Enumerable module and add a few methods.
Here's an example (including Chuck's excellent suggestion to use Forwardable):
# You have to require forwardable to use it
require "forwardable"
class MyArray
  include Enumerable
  extend Forwardable
  def initialize
    @values = []
  end
  # Map some of the common array methods to our internal array
  def_delegators :@values, :<<, :[], :[]=, :last
  # I want a custom method "add" available for adding values to our internal array
  def_delegator :@values, :<<, :add
  # You don't need to specify the block variable, yield knows to use a block if passed one
  def each
    # "each" is the base method called by all the iterators so you only have to define it
    @values.each do |value|
      # change or manipulate the values in your value array inside this block
      yield value
    end
  end
end
m = MyArray.new
m << "fudge"
m << "icecream"
m.add("cake")
# Notice I didn't create an each_with_index method but since
# I included Enumerable it knows how and uses the proper data.
m.each_with_index{|value, index| puts "m[#{index}] = #{value}"}
puts "What about some nice cabbage?"
m[0] = "cabbage"
puts "m[0] = #{m[0]}"
puts "No! I meant in addition to fudge"
m[0] = "fudge"
m << "cabbage"
puts "m.first = #{m.first}"
puts "m.last = #{m.last}"
Which outputs:
m[0] = fudge
m[1] = icecream
m[2] = cake
What about some nice cabbage?
m[0] = cabbage
No! I meant in addition to fudge
m.first = fudge
m.last = cabbage
This feels very C-like and I know Ruby has a better way.
If you want an object to 'feel' like an array, then defining << is a good idea and very 'Ruby'-ish.
but am not sure what the each method is really doing under the hood.
The each method for Array just loops through all the elements (using a for loop, I think). If you want to add your own each method (which is also very 'Ruby'-ish), you could do something like this:
def each
  0.upto(@result_array.length - 1) do |x|
    yield @result_array[x]
  end
end
If you create a class Results that inherits from Array, you will inherit all the functionality.
You can then supplement the methods that need change by redefining them, and you can call super for the old functionality.
For example:
class Results < Array
  # Additional functionality
  def best
    find {|result| result.is_really_good? }
  end
  # Array functionality that needs change
  def compact
    delete(uninteresting_result)
    super
  end
end
Alternatively, you can use the builtin library forwardable. This is particularly useful if you can't inherit from Array because you need to inherit from another class:
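require 'forwardable'
class Results
  extend Forwardable
  # def_delegators (plural) forwards several methods; def_delegator takes a single method plus an alias
  def_delegators :@result_array, :<<, :each, :concat # etc...
  def initialize
    @result_array = []
  end
  def best
    @result_array.find {|result| result.is_really_good? }
  end
  # Array functionality that needs change
  def compact
    @result_array.delete(uninteresting_result)
    @result_array.compact! # compact! modifies the inner array in place
    self
  end
end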
In both of these forms, you can use it as you want:
r = Results.new
r << some_result
r.each do |result|
  # ...
end
r.compact
puts "Best result: #{r.best}"
Not sure I'm adding anything new, but I decided to show some very short code that I wish I could have found in the answers, to quickly show the available options. Here it is without the enumerator that @shelvacu talks about.
class Test
  def initialize
    @data = [1,2,3,4,5,6,7,8,9,0,11,12,12,13,14,15,16,172,28,38]
  end
  # approach 1
  def each_y
    @data.each{ |x| yield(x) }
  end
  # approach 2
  def each_b(&block)
    @data.each(&block)
  end
end
Let's check performance:
require 'benchmark'
test = Test.new
n=1000*1000*100
Benchmark.bm do |b|
  b.report { 1000000.times{ test.each_y{|x| @foo=x} } }
  b.report { 1000000.times{ test.each_b{|x| @foo=x} } }
end
Here's the result:
user system total real
1.660000 0.000000 1.660000 ( 1.669462)
1.830000 0.000000 1.830000 ( 1.831754)
This means yield is marginally faster than &block, which we already knew, by the way.
UPDATE: This is IMO the best way to create an each method that also takes care of returning an enumerator:
class Test
  def each
    if block_given?
      @data.each{|x| yield(x)}
    else
      return @data.each
    end
  end
end
If you really do want to make your own #each method, and assuming you don't want to forward, you should return an Enumerator if no block is given:
class MyArrayLikeClass
  include Enumerable
  def each(&block)
    return enum_for(__method__) if block.nil?
    @arr.each do |ob|
      block.call(ob)
    end
  end
end
This will return an Enumerator if no block is given, allowing Enumerable method chaining.
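A short usage sketch of that pattern follows. The Playlist class and its contents are hypothetical, and returning self when a block is given is an added convention that mirrors Array#each rather than something the snippet above does.

```ruby
class Playlist
  include Enumerable

  def initialize(*songs)
    @songs = songs
  end

  def each(&block)
    return enum_for(__method__) if block.nil?

    @songs.each(&block)
    self # mirror Array#each, which returns its receiver
  end
end

playlist = Playlist.new("a", "b")

playlist.each.class                                            # => Enumerator
playlist.each.with_index(1).map { |song, i| "#{i}. #{song}" }  # => ["1. a", "2. b"]
playlist.each { |song| song }.equal?(playlist)                 # => true
```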