How can I make the set difference insensitive to case? - ruby

I have a class whose data is stored as a set, and I want to compare objects of that class so that the letter case of the elements does not matter. For example, if the set contains strings, "a" and "A" should be treated as the same element.
To do this I tried defining the eql? method of the set members to be case-insensitive, but this has no effect on the - method (alias difference) in Set. So how should I make - case-insensitive?
The following code illustrates the problem:
require 'set'
class SomeSet
  include Enumerable
  def initialize; @elements = Set.new; end
  def add(o)
    @elements.add(o)
    self
  end
  def each(&block) # To enable +Enumerable+
    @elements.each(&block)
  end
  def difference(compared_list)
    @elements - compared_list
  end
end
class Element
  attr_reader :element
  def initialize(element); @element = element; end
  # This seems to have no effect on +difference+
  def eql?(other_element)
    element.casecmp(other_element.element) == 0
  end
end
set1 = SomeSet.new
set2 = SomeSet.new
set1.add("a")
set2.add("A")
# The following turns out false but I want it to turn out true as case
# should not matter.
puts set1.difference(set2).empty?

OK, firstly, SomeSet#add is only storing plain strings; you need to store an instance of Element, like so:
def add(o)
  @elements.add(Element.new(o))
  self
end
You also need to implement a hash method in your Element class.
You can downcase the element and return that string's hash:
def hash
  element.downcase.hash
end
Full code and demo: http://codepad.org/PffThml2
Edit: Regarding my O(n) insertion comment above:
Insertions are O(1). From what I can see, eql? is only used when the hashes of two elements are the same. Because we hash the downcased version of the element, the hashes will be fairly well distributed, and eql? shouldn't be called much (if at all) for elements that differ by more than case.
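A minimal sketch of the point above, assuming a hypothetical CountingElement class (the call counter is added purely for illustration and is not part of the answer). It shows that once hash is defined on the downcased value, adding "A" after "a" makes the set consult eql? and keep only one of them:

```ruby
require 'set'

# Hypothetical element class; the eql_calls counter is an assumption
# added for illustration only.
class CountingElement
  attr_reader :element

  @eql_calls = 0
  class << self
    attr_accessor :eql_calls
  end

  def initialize(element)
    @element = element
  end

  # Elements that differ only by case get the same hash.
  def hash
    element.downcase.hash
  end

  # Consulted by the underlying Hash when two hashes match.
  def eql?(other)
    self.class.eql_calls += 1
    element.casecmp(other.element) == 0
  end
end

set = Set.new
set << CountingElement.new("a")
set << CountingElement.new("A") # same hash as "a", so eql? decides
set << CountingElement.new("b")

puts set.size                      # 2
puts CountingElement.eql_calls >= 1 # true
```

The exact number of eql? calls depends on the interpreter's hash table internals, so the sketch only checks that it was called at least once.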

From the docs:
The equality of each couple of elements is determined according to Object#eql? and Object#hash, since Set uses Hash as storage.
Perhaps you need to implement Object#hash as well.
require 'set'
class String2
  attr_reader :value
  def initialize v
    @value = v
  end
  def eql? v
    value.casecmp(v.value) == 0
  end
  def hash
    value.downcase.hash
  end
end
set1 = Set.new
set2 = Set.new
set1.add(String2.new "a")
set2.add(String2.new "A")
puts set1.difference(set2).empty?

Related

How does Set in ruby compare elements?

I am trying to put custom objects in a set. I tried this:
require 'set'
class Person
  include Comparable
  def initialize(name, age)
    @name = name
    @age = age
  end
  attr_accessor :name, :age
  def ==(other)
    @name == other.name
  end
  alias eql? ==
end
a = Person.new("a", 18)
b = Person.new("a", 18)
people = Set[]
people << a
people << b
puts a == b # true
It seems that Set does not identify same objects with Object#eql? or == methods:
puts people # #<Set: {#<Person:0x00007f9e09843df8 @name="a", @age=18>, #<Person:0x00007f9e09843da8 @name="a", @age=18>}>
How does Set identify two same objects?
From the docs:
Set uses Hash as storage, so you must note the following points:
Equality of elements is determined according to Object#eql? and Object#hash. [...]
That said: If you want two people to be equal when they have the same name, then you must implement hash accordingly:
def hash
  @name.hash
end
Ruby's built-in Set stores items in a Hash. So for your objects to be treated as the "same" by Set, you also need to define a custom hash method. Something like this would work:
def hash
  @name.hash
end
Use gem which set.rb to see where the source code for Set is stored, and try reading through it. It's clear and well-written.
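Putting both answers together, a complete version of the Person class might look like the following sketch. The is_a? guard in == is an addition for safety, not something the answers above require:

```ruby
require 'set'

class Person
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  # Two people are equal when their names match.
  def ==(other)
    other.is_a?(Person) && name == other.name
  end
  alias eql? ==

  # Must agree with eql?: equal people need equal hashes.
  def hash
    name.hash
  end
end

people = Set[]
people << Person.new("a", 18)
people << Person.new("a", 18)
people << Person.new("b", 30)

puts people.size # 2
```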

Is there a good way to pass "array addresses" to a method?

How can I refactor the following? I have some values stored in my YAML file as nested arrays, but I want to pull all my transactions into two get and set methods. This works, but is obviously limited and bulky. It feels wrong.
require 'yaml/store'
module Persistance
  @store = YAML::Store.new('store.yml')
  def self.get_transaction(key)
    @store.transaction { @store[key] }
  end
  def self.get_nested_transaction(key, sub)
    @store.transaction { @store[key][sub] }
  end
end
Bonus credit: I also have an additional method for incrementing values in my YAML file. Is there a further way to refactor this code? Does it make sense to just pass blocks to a single database accessing method?
Hey I remember thinking about this when I was practicing PStore a little while ago. I didn't figure out a working approach then but I managed to get one now. By the way, yaml/store is pretty cool and you can take credit for introducing me to it.
Anyway, on with the code. Basically here's a couple important concepts:
The @store is similar to a hash in that you can use [] and []=, but it's not actually a hash; it's a YAML::Store.
Ruby 2.3 has a method Hash#dig which is kind of the missing puzzle piece here. You provide a list of keys and it looks each one up in turn, descending one level per key. You can use this for both get and set, as my code shows.
If @store were a true hash that would be the end of it, but it's not, so for this answer I added a YAML::Store#dig method which has the same usage as the original.
require 'yaml/store'
class YAML::Store
  def dig(*keys)
    first_val = self[keys.shift]
    if keys.empty?
      first_val
    else
      # Descend one level per remaining key
      keys.reduce(first_val) do |result, key|
        result[key]
      end
    end
  end
end
class YamlStore
  attr_reader :store
  def initialize filename
    @store = YAML::Store.new filename
  end
  def get *keys
    @store.transaction do
      @store.dig(*keys)
    end
  end
  def set *keys, val
    @store.transaction do
      final_key = keys.pop
      hash_to_set = keys.empty? ? @store : @store.dig(*keys)
      hash_to_set.send :[]=, final_key, val
    end
  end
end
filename = 'store.yml'
db = YamlStore.new filename
db.set :a, {}
puts db.get :a
# => {}
db.set :a, :b, 1
puts db.get :a, :b
# => 1
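To see the Hash#dig idea in isolation, here is a small sketch on a plain Hash, with no YAML::Store involved. The helper name set_nested is hypothetical and only mirrors the "dig to the parent, then assign the last key" step used in the set method above:

```ruby
# Hypothetical helper (not part of Ruby or YAML::Store): dig down to the
# parent container, then assign the final key.
def set_nested(hash, *keys, value)
  *path, last = keys
  target = path.empty? ? hash : hash.dig(*path)
  target[last] = value
  hash
end

data = { a: { b: { c: 1 } } }

puts data.dig(:a, :b, :c) # 1
p data.dig(:a, :x, :c)    # nil (dig stops at the first missing key)

set_nested(data, :a, :b, :c, 2)
set_nested(data, :d, 3)
p data
```

Note that set_nested, like the set method above, assumes every intermediate key already exists; it raises NoMethodError if the path leads through a missing key.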

a set of strings and reopening String

In an attempt to answer this question: How can I make the set difference insensitive to case?, I was experimenting with sets and strings, trying to have a case-insensitive set of strings. But for some reason when I reopen String class, none of my custom methods are invoked when I add a string to a set. In the code below I see no output, but I expected at least one of the operators that I overloaded to be invoked. Why is this?
EDIT: If I create a custom class, say, String2, where I define a hash method, etc, these methods do get called when I add my object to a set. Why not String?
require 'set'
class String
alias :compare_orig :<=>
def <=> v
p '<=>'
downcase.compare_orig v.downcase
end
alias :eql_orig :eql?
def eql? v
p 'eql?'
eql_orig v
end
alias :hash_orig :hash
def hash
p 'hash'
downcase.hash_orig
end
end
Set.new << 'a'
Looking at the source code for Set, it uses a simple hash for storage:
def add(o)
  @hash[o] = true
  self
end
So it looks like what you need to do instead of opening String is open Set. I haven't tested this, but it should give you the right idea:
class MySet < Set
  def add(o)
    if o.is_a?(String)
      @hash[o.downcase] = true
    else
      @hash[o] = true
    end
    self
  end
end
Edit
As noted in the comments, this can be implemented in a much simpler way:
class MySet < Set
def add(o)
super(o.is_a?(String) ? o.downcase : o)
end
end
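A short usage sketch of the subclass approach, under the assumption that both sets are built through add. The class name CaseInsensitiveSet is hypothetical. One caveat worth checking on your Ruby version: Set defines << as an alias of add, and an alias keeps pointing at the original method, so the sketch re-declares the alias in the subclass to make << go through the override too.

```ruby
require 'set'

# Hypothetical name; same idea as MySet above.
class CaseInsensitiveSet < Set
  def add(o)
    super(o.is_a?(String) ? o.downcase : o)
  end
  # Re-alias so that << uses the overridden add.
  alias << add
end

s1 = CaseInsensitiveSet.new
s1.add("a").add("B")

s2 = CaseInsensitiveSet.new
s2.add("A").add("b")

diff = s1 - s2
puts diff.empty? # true
```

Because the strings are normalized on the way in, the original casing is lost; if you need to keep it, wrapping the strings in a class with custom hash and eql? (as in the String2 example) is the alternative.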

Uniqueness of Ruby Instances

If I create two String instances with the same content separately they are identical. This is not the case with custom classes by default (see example below).
If I have my own class (Test below) and I have a variable (@v below) which is unique, i.e. two Test instances with the same @v should be treated as identical, then how would I go about telling Ruby this is the case?
Consider this example:
class Test
  def initialize(v)
    @v = v
  end
end
a = {Test.new('a') => 1, Test.new('b') => 2}
a.delete(Test.new('a'))
p a
# Desired output:
# => {#<Test:0x100124ef8 @v="b">=>2}
You need to define an == method that defines what equality means for your class. In this case, you would want:
class Test
  def initialize(v)
    @v = v
  end
  def ==(other)
    @v == other.instance_variable_get(:@v)
  end
end
You are using objects of class Test as keys for the hash. In order for that to function properly (and consequently a.delete), you need to define two methods inside Test: Test#hash and Test#eql?
From: http://ruby-doc.org/core/classes/Hash.html
Hash uses key.eql? to test keys for equality. If you need to use instances of your own classes as keys in a Hash, it is recommended that you define both the eql? and hash methods. The hash method must have the property that a.eql?(b) implies a.hash == b.hash.
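As a sketch of what the documentation describes, the class below defines both eql? and hash so that delete finds the key. It is named TestKey here (a hypothetical stand-in for the question's Test class) and adds an attr_reader for convenience:

```ruby
class TestKey
  attr_reader :v

  def initialize(v)
    @v = v
  end

  def ==(other)
    other.is_a?(TestKey) && v == other.v
  end
  alias eql? ==

  # a.eql?(b) must imply a.hash == b.hash
  def hash
    v.hash
  end
end

a = { TestKey.new('a') => 1, TestKey.new('b') => 2 }
a.delete(TestKey.new('a'))

puts a.size # 1
```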
I found a different way to tackle this, by keeping track of all the instances of Test internally I can return the premade instance rather than making a new one and telling ruby they're equivalent:
class Test
  def self.new(v)
    begin
      return @@instances[v] if @@instances[v]
    rescue
    end
    new_test = self.allocate
    new_test.instance_variable_set(:@v, v)
    (@@instances ||= {})[v] = new_test
  end
end
Now Test.new('a') == Test.new('a') and Test.new('a') === Test.new('a') :)
Most of the time, an object you need to be comparable and/or hashable is composed of member variables which are either primitives (integers, strings, etc) or are themselves comparable/hashable. In those cases, this module:
module Hashable
include Comparable
def ==(other)
other.is_a?(self.class) && other.send(:parts) == parts
end
alias_method :eql?, :==
def hash
parts.hash
end
end
can simply be included in your class to take care of all of the busywork. All you have to do is define a "parts" method that returns all of the values that comprise the object's state:
class Foo
  include Hashable
  def initialize(a, b)
    @a = a
    @b = b
  end
  private
  def parts
    [@a, @b]
  end
end
Objects built this way support ==, != and eql?, and they can be hash keys. The ordering operators that Comparable provides (<, <=, >=, >) additionally require the class to define <=>.
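Here is a self-contained sketch of the module in use. The Version class and its <=> method are assumptions added for illustration; <=> is what lets the operators from Comparable work, and it simply compares the parts arrays:

```ruby
module Hashable
  include Comparable

  def ==(other)
    other.is_a?(self.class) && other.send(:parts) == parts
  end
  alias_method :eql?, :==

  def hash
    parts.hash
  end
end

# Hypothetical example class.
class Version
  include Hashable

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # Needed for <, <=, >=, > from Comparable.
  def <=>(other)
    parts <=> other.send(:parts)
  end

  private

  def parts
    [@major, @minor]
  end
end

releases = { Version.new(1, 2) => "stable" }
puts releases[Version.new(1, 2)]           # stable
puts Version.new(1, 2) < Version.new(1, 3) # true
```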

How do I add 'each' method to Ruby object (or should I extend Array)?

I have an object Results that contains an array of result objects along with some cached statistics about the objects in the array. I'd like the Results object to be able to behave like an array. My first cut at this was to add methods like this
def <<(val)
  @result_array << val
end
This feels very C-like and I know Ruby has a better way.
I'd also like to be able to do this
Results.each do |result|
result.do_stuff
end
but am not sure what the each method is really doing under the hood.
Currently I simply return the underlying array via a method and call each on it which doesn't seem like the most-elegant solution.
Any help would be appreciated.
For the general case of implementing array-like methods, yes, you have to implement them yourself. Vava's answer shows one example of this. In the case you gave, though, what you really want to do is delegate the task of handling each (and maybe some other methods) to the contained array, and that can be automated.
require 'forwardable'
class Results
  include Enumerable
  extend Forwardable
  def_delegators :@result_array, :each, :<<
end
This class will get all of Array's Enumerable behavior as well as the Array << operator and it will all go through the inner array.
Note that when you switch your code from inheriting from Array to this approach, << no longer returns the object itself, as Array#<< does; it returns the inner array instead. That can cost you an extra variable every time you use <<.
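If chaining on the wrapper matters, one option is to delegate only the read-side methods and define << by hand so that it returns self. This is a sketch under that assumption; the initialize method and the delegated size are additions for the sake of a runnable example:

```ruby
require 'forwardable'

class Results
  include Enumerable
  extend Forwardable

  # Read-only methods are forwarded to the inner array.
  def_delegators :@result_array, :each, :size

  def initialize
    @result_array = []
  end

  # Defined by hand so that it returns the Results object, not the array.
  def <<(val)
    @result_array << val
    self
  end
end

r = Results.new
r << 1 << 2

p r.to_a             # [1, 2]
p r.map { |x| x * 2 } # [2, 4]
```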
each just goes through the array and calls the given block with each element; it is that simple. Since you are already using an array inside the class, you can simply redirect your each method to the array's, which is fast and easy to read and maintain.
class Result
  include Enumerable
  def initialize
    @results_array = []
  end
  def <<(val)
    @results_array << val
  end
  def each(&block)
    @results_array.each(&block)
  end
end
r = Result.new
r << 1
r << 2
r.each { |v|
  p v
}
# prints:
# 1
# 2
Note that I have mixed in Enumerable. That will give you a bunch of array methods like all?, map, etc. for free.
BTW, with Ruby you can forget about inheritance. You don't need interface inheritance because duck typing doesn't really care about the actual type, and you don't need code inheritance because mixins are just better for that sort of thing.
Your << method is perfectly fine and very Ruby like.
To make a class act like an array, without actually inheriting directly from Array, you can mix-in the Enumerable module and add a few methods.
Here's an example (including Chuck's excellent suggestion to use Forwardable):
# You have to require forwardable to use it
require "forwardable"
class MyArray
  include Enumerable
  extend Forwardable
  def initialize
    @values = []
  end
  # Map some of the common array methods to our internal array
  def_delegators :@values, :<<, :[], :[]=, :last
  # I want a custom method "add" available for adding values to our internal array
  def_delegator :@values, :<<, :add
  # You don't need to specify the block variable, yield knows to use a block if passed one
  def each
    # "each" is the base method called by all the iterators so you only have to define it
    @values.each do |value|
      # change or manipulate the values in your value array inside this block
      yield value
    end
  end
end
m = MyArray.new
m << "fudge"
m << "icecream"
m.add("cake")
# Notice I didn't create an each_with_index method but since
# I included Enumerable it knows how and uses the proper data.
m.each_with_index{|value, index| puts "m[#{index}] = #{value}"}
puts "What about some nice cabbage?"
m[0] = "cabbage"
puts "m[0] = #{m[0]}"
puts "No! I meant in addition to fudge"
m[0] = "fudge"
m << "cabbage"
puts "m.first = #{m.first}"
puts "m.last = #{m.last}"
Which outputs:
m[0] = fudge
m[1] = icecream
m[2] = cake
What about some nice cabbage?
m[0] = cabbage
No! I meant in addition to fudge
m.first = fudge
m.last = cabbage
This feels very C-like and I know Ruby has a better way.
If you want an object to 'feel' like an array, then overriding << is a good idea and very 'Ruby'-ish.
but am not sure what the each method is really doing under the hood.
The each method for Array just loops through all the elements (using a for loop, I think). If you want to add your own each method (which is also very 'Ruby'-ish), you could do something like this:
def each
  0.upto(@result_array.length - 1) do |x|
    yield @result_array[x]
  end
end
If you create a class Results that inherits from Array, you will inherit all the functionality.
You can then supplement the methods that need change by redefining them, and you can call super for the old functionality.
For example:
class Results < Array
  # Additional functionality
  def best
    find {|result| result.is_really_good? }
  end
  # Array functionality that needs change
  def compact
    delete(uninteresting_result)
    super
  end
end
Alternatively, you can use the builtin library forwardable. This is particularly useful if you can't inherit from Array because you need to inherit from another class:
require 'forwardable'
class Results
  extend Forwardable
  # def_delegators (plural) forwards several methods at once
  def_delegators :@result_array, :<<, :each, :concat # etc...
  def best
    @result_array.find {|result| result.is_really_good? }
  end
  # Array functionality that needs change
  def compact
    @result_array.delete(uninteresting_result)
    @result_array.compact
    self
  end
end
In both of these forms, you can use it as you want:
r = Results.new
r << some_result
r.each do |result|
# ...
end
r.compact
puts "Best result: #{r.best}"
Not sure I'm adding anything new, but I decided to show some very short code that I wish I had found in the answers, to quickly show the available options. Here it is without the enumerator that @shelvacu talks about.
class Test
  def initialize
    @data = [1,2,3,4,5,6,7,8,9,0,11,12,12,13,14,15,16,172,28,38]
  end
  # approach 1
  def each_y
    @data.each{ |x| yield(x) }
  end
  # approach 2
  def each_b(&block)
    @data.each(&block)
  end
end
Let's check performance:
require 'benchmark'
test = Test.new
n=1000*1000*100
Benchmark.bm do |b|
  b.report { 1000000.times{ test.each_y{|x| @foo=x} } }
  b.report { 1000000.times{ test.each_b{|x| @foo=x} } }
end
Here's the result:
user system total real
1.660000 0.000000 1.660000 ( 1.669462)
1.830000 0.000000 1.830000 ( 1.831754)
In this run, yield is marginally faster than &block, which is what we already knew, by the way.
UPDATE: This is IMO the best way to create an each method which also takes care of returning an enumerator
class Test
  def each
    if block_given?
      @data.each{|x| yield(x)}
    else
      return @data.each
    end
  end
end
If you really do want to make your own #each method, and assuming you don't want to forward, you should return an Enumerator if no block is given
class MyArrayLikeClass
  include Enumerable
  def each(&block)
    return enum_for(__method__) if block.nil?
    @arr.each do |ob|
      block.call(ob)
    end
  end
end
This returns an Enumerator when no block is given, which allows Enumerable method chaining.
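A brief usage sketch of that pattern follows. The constructor taking a list of items and each returning self are assumptions made so the example runs on its own:

```ruby
class MyArrayLikeClass
  include Enumerable

  # Hypothetical constructor for the sake of a runnable example.
  def initialize(*items)
    @arr = items
  end

  def each(&block)
    # Without a block, hand back an Enumerator over this same method.
    return enum_for(__method__) if block.nil?
    @arr.each(&block)
    self
  end
end

c = MyArrayLikeClass.new(:a, :b)

p c.each.is_a?(Enumerator) # true
p c.each.with_index(1).map { |x, i| "#{i}:#{x}" } # ["1:a", "2:b"]
```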

Resources