Uniq of ruby array fails to work - ruby

I have an array of my Country objects, which have the attributes "code" and "name".
The array could contain a country more than once, so I want to remove the duplicates.
This is my Country class:
class Country
include Mongoid::Fields::Serializable
attr_accessor :name, :code
FILTERS = ["Afghanistan","Brunei","Iran", "Kuwait", "Libya", "Saudi Arabia", "Sudan", "Yemen", "Britain (UK)", "Antarctica", "Bonaire Sint Eustatius & Saba", "British Indian Ocean Territory", "Cocos (Keeling) Islands", "St Barthelemy", "St Martin (French part)", "Svalbard & Jan Mayen","Vatican City"]
EXTRAS = {
'eng' => 'England',
'wal' => 'Wales',
'sco' => 'Scotland',
'nlr' => 'Northern Ireland'
}
def initialize(name, code)
@name = name
@code = code
end
def deserialize(object)
return nil unless object
Country.new(object['name'], object['code'])
end
def serialize(country)
{:name => country.name, :code => country.code}
end
def self.all
add_extras(filter(TZInfo::Country.all.map{|country| to_country country})).sort! {|c1, c2| c1.name <=> c2.name}
end
def self.get(code)
begin
to_country TZInfo::Country.get(code)
rescue TZInfo::InvalidCountryCode => e
'InvalidCountryCode' unless EXTRAS.has_key? code
Country.new EXTRAS[code], code
end
end
def self.get_by_name(name)
all.select {|country| country.name.downcase == name.downcase}.first
end
def self.filter(countries)
countries.reject {|country| FILTERS.include?(country.name)}
end
def self.add_extras(countries)
countries + EXTRAS.map{|k,v| Country.new v, k}
end
private
def self.to_country(country)
Country.new country.name, country.code
end
end
And this is my request for the array, which is called from another class:
def countries_ive_drunk
(had_drinks.map {|drink| drink.beer.country }).uniq
end
If I throw the array I can see the structure is:
[
#<Country:0x5e3b4c8 @name="Belarus", @code="BY">,
#<Country:0x5e396e0 @name="Britain (UK)", @code="GB">,
#<Country:0x5e3f350 @name="Czech Republic", @code="CZ">,
#<Country:0x5e3d730 @name="Germany", @code="DE">,
#<Country:0x5e43778 @name="United States", @code="US">,
#<Country:0x5e42398 @name="England", @code="eng">,
#<Country:0x5e40f70 @name="Aaland Islands", @code="AX">,
#<Country:0x5e47978 @name="England", @code="eng">,
#<Country:0x5e46358 @name="Portugal", @code="PT">,
#<Country:0x5e44d38 @name="Georgia", @code="GE">,
#<Country:0x5e4b668 @name="Germany", @code="DE">,
#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">,
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
]
This is the same whether or not I call .uniq, and you can see there are two "Anguilla" entries.

As pointed out by others, the problem is that uniq uses hash to distinguish between countries and that by default, Object#hash is different for all objects. It will also use eql? in case two objects return the same hash value, to be sure if they are eql or not.
The best solution is to make your class correct in the first place!
class Country
# ... your previous code, plus:
include Comparable
def <=>(other)
return nil unless other.is_a?(Country)
(code <=> other.code).nonzero? || (name <=> other.name)
# or less fancy:
# [code, name] <=> [other.code, other.name]
end
def hash
[name, code].hash
end
alias eql? ==
end
Country.new("Canada", "CA").eql?(Country.new("Canada", "CA")) # => true
Now you can sort arrays of Countries, use countries as key for hashes, compare them, etc...
I've included the above code to show how it's done in general, but in your case, you get all this for free if you subclass Struct.new(:name, :code)...
class Country < Struct.new(:name, :code)
# ... the rest of your code, without the `attr_accessor` nor the `initialize`
# as Struct provides these and `hash`, `eql?`, `==`, ...
end
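As a rough, standalone illustration of the Struct approach (a minimal sketch that leaves out the Mongoid and TZInfo parts of the original class, so this `Country` is only a stand-in), value-based `hash` and `eql?` are enough for `uniq` to drop the repeated entry:

```ruby
# Stand-in for the real class: a Struct compares and hashes by its member values.
Country = Struct.new(:name, :code)

countries = [
  Country.new("Anguilla", "AI"),
  Country.new("Germany", "DE"),
  Country.new("Anguilla", "AI")
]

# uniq relies on #hash and #eql?, both of which Struct defines from the members.
unique = countries.uniq
puts unique.map(&:name).inspect
```

The two "Anguilla" objects are still distinct instances, but they now hash alike and are eql? to each other, which is all `uniq` looks at.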

Array#uniq considers objects in an array duplicates only if their #hash values are equal (and they are eql? to each other), which is not the case in this code. You need a different approach to do what you intended, like this:
def countries_ive_drunk
had_drinks.map {|drink| drink.beer.country.code }
.uniq
.map { |code| Country.get code}
end

This boils down to what does equality mean? When is an object a duplicate of another? The default implementations of ==, eql? just compare the ruby object_id which is why you don't get the results you want.
You could implement ==, eql? and hash in a way that makes sense for your class, for example by comparing the countries' codes.
An alternative is to use uniq_by. This is an Active Support addition to Array, but Mongoid depends on Active Support anyway, so you wouldn't be adding a dependency.
some_list_of_countries.uniq_by {|c| c.code}
This uses the countries' codes to uniq them. You can shorten that to
some_list_of_countries.uniq_by(&:code)

Each element in the array is a separate class instance.
#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
The object ids are unique.

#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">,
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
Array#uniq thinks these are different objects (different instances of Country class), because the objects' ids are different.
Obviously you need to change your strategy.

At least as early as 1.9.3, Array#uniq would take a block just like uniq_by. uniq_by is now deprecated.
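A small sketch of the block form, using a simplified stand-in for the Country class (no custom hash or eql?, so plain uniq keeps every instance):

```ruby
# Simplified stand-in: no #hash / #eql? overrides, so instances compare by identity.
class Country
  attr_reader :name, :code

  def initialize(name, code)
    @name = name
    @code = code
  end
end

countries = [
  Country.new("Anguilla", "AI"),
  Country.new("Germany", "DE"),
  Country.new("Anguilla", "AI")
]

plain   = countries.uniq                 # identity-based: nothing is removed
by_code = countries.uniq { |c| c.code }  # uniqueness decided by the block's value

puts plain.size
puts by_code.size
```

With the block, the first element for each distinct code is kept, so the class itself does not have to change.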

Related

Ruby Set with custom class to equal basic strings

I want to be able to find a custom class in my set given just a string. Like so:
require 'set'
Rank = Struct.new(:name, keyword_init: true) {
def hash
name.hash
end
def eql?(other)
hash == other.hash
end
def ==(other)
hash == other.hash
end
}
one = Rank.new(name: "one")
two = Rank.new(name: "two")
set = Set[one, two]
But while one == "one" and one.eql?("one") are both true, set.include?("one") is still false. What am I missing?
thanks!
Set is built upon Hash, and Hash considers two objects the same if:
[...] their hash value is identical and the two objects are eql? to each other.
What you are missing is that eql? isn't necessarily commutative. Making Rank#eql? recognize strings doesn't change the way String#eql? works:
one.eql?('one') #=> true
'one'.eql?(one) #=> false
Therefore it depends on which object is the hash key and which is the argument to include?:
Set['one'].include?(one) #=> true
Set[one].include?('one') #=> false
In order to make two objects a and b interchangeable hash keys, 3 conditions have to be met:
a.hash == b.hash
a.eql?(b) == true
b.eql?(a) == true
But don't try to modify String#eql? – fiddling with Ruby's core classes isn't recommended and monkey-patching probably won't work anyway because Ruby usually calls the C methods directly for performance reasons.
In fact, making both hash and eql? mimic name doesn't seem like a good idea in the first place. It makes the object's identity ambiguous which can lead to very strange behavior and hard to find bugs:
h = { one => 1, 'one' => 1 }
#=> {#<struct Rank name="one">=>1, "one"=>1}
# vs
h = { 'one' => 1, one => 1 }
#=> {"one"=>1}
what am i missing?
What you are missing is that "one" isn't in your set. one is in your set, but "one" isn't.
Therefore, the answer Ruby is giving you is perfectly correct.
All that you have done with your implementation of Rank is that any two ranks with the same name are considered to be the same by a Hash, Set, or Array#uniq. But, a Rank is not the same as a String.
If you want to be able to have a set-like data structure where you can look up things by one of their attributes, you will have to write it yourself.
Something like (untested):
class RankSet < Set
def [](*args)
super(*args.map(&:name))
end
def each
return enum_for(__callee__) unless block_given?
super {|e| yield e.name }
end
end
might get you started.
Or, instead of writing your own set, you can just use the fact that any arbitrary rank with the right name can be used for lookup:
set.include?(Rank.new(name: "one"))
#=> true
# even though it is a *different* `Rank` object
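Another way to get lookup by name, without touching eql? or hash at all, is to keep the ranks in a Hash keyed by name. The sketch below is only one possible shape; `RankIndex` and its methods are hypothetical names, not part of Ruby or of the question's code:

```ruby
Rank = Struct.new(:name, keyword_init: true)

# Hypothetical helper: stores ranks under their name so a plain String finds them.
class RankIndex
  def initialize(ranks = [])
    @by_name = {}
    ranks.each { |rank| add(rank) }
  end

  def add(rank)
    @by_name[rank.name] = rank
    self
  end

  # Membership test by name rather than by Rank object.
  def include?(name)
    @by_name.key?(name)
  end

  def [](name)
    @by_name[name]
  end
end

index = RankIndex.new([Rank.new(name: "one"), Rank.new(name: "two")])
puts index.include?("one")
puts index.include?("three")
puts index["two"].name
```

Here the String is the key and the Rank is the value, so a Rank never has to pretend to be a String.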

Does Ruby have a `Pair` data type?

Sometimes I need to deal with key / value data.
I dislike using Arrays, because they are not constrained in size (it's too easy to accidentally add more than 2 items, plus you end up needing to validate size later on). Furthermore, indexes of 0 and 1 become magic numbers and do a poor job of conveying meaning ("When I say 0, I really mean head...").
Hashes are also not appropriate, as it is possible to accidentally add an extra entry.
I wrote the following class to solve the problem:
class Pair
attr_accessor :head, :tail
def initialize(h, t)
@head, @tail = h, t
end
end
It works great and solves the problem, but I am curious to know: does the Ruby standard library come with such a class already?
No, Ruby doesn't have a standard Pair class.
You could take a look at "Using Tuples in Ruby?".
The solutions involve using a class similar to yours, the Tuples gem, or OpenStruct.
Python has tuple, but even Java doesn't have one: "A Java collection of value pairs? (tuples?)".
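If writing the class by hand feels like too much, a Struct can generate something close to it. This is only a sketch: `Pair` here is still a class you define yourself (Ruby does not ship one), and note that Struct allows omitted members (they default to nil) while rejecting extra ones:

```ruby
# Struct builds a class with exactly these two named members.
Pair = Struct.new(:head, :tail)

pair = Pair.new("cabeza", "cola")
puts pair.head
puts pair.tail

# Destructuring still works through to_a.
head, tail = pair.to_a

# A third value is rejected, so the pair cannot silently grow.
begin
  Pair.new("a", "b", "c")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Struct also supplies ==, eql? and hash based on the two members, so such pairs can be compared and used as hash keys.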
You can also use the OpenStruct data type. It is probably not exactly what you wanted, but here is an implementation ...
require 'ostruct'
foo = OpenStruct.new
foo.head = "cabeza"
foo.tail = "cola"
Finally,
puts foo.head
=> "cabeza"
puts foo.tail
=> "cola"
No, there is no such class in the Ruby core library or standard libraries. It would be nice to have core library support (as well as literal syntax) for tuples, though.
I once experimented with a class very similar to yours, in order to replace the array that gets yielded by Hash#each with a pair. I found that monkey-patching Hash#each to return a pair instead of an array actually breaks surprisingly little code, provided that the pair class responds appropriately to to_a and to_ary:
class Pair
attr_reader :first, :second
def to_ary; [first, second] end
alias_method :to_a, :to_ary
private
attr_writer :first, :second
def initialize(first, second)
self.first, self.second = first, second
end
class << self; alias_method :[], :new end
end
You don't need a special type, you can use a 2-element array with a little helper to give the pairs a consistent order. E.g.:
def pair(a, b)
(a.hash < b.hash) ? [a, b] : [b, a]
end
distances = {
pair("Los Angeles", "New York") => 2_789.6,
pair("Los Angeles", "Sydney") => 7_497,
}
distances[ pair("Los Angeles", "New York") ] # => 2789.6
distances[ pair("New York", "Los Angeles") ] # => 2789.6
distances[ pair("Sydney", "Los Angeles") ] # => 7497

How can I create a method that takes a hash (with or without an assigned value) as an argument?

So I am working through test first and am a little stuck. Here is my code so far:
class Dictionary
attr_accessor :entries, :keywords, :item
def initialize
@entries = {}
end
def add(item)
item.each do |words, definition|
@entries[words] = definition
end
end
def keywords
@entries.keys
end
end#class
I am stuck at the rspec test right here:
it 'add keywords (without definition)' do
@d.add('fish')
@d.entries.should == {'fish' => nil}
@d.keywords.should == ['fish']
end
How can I switch my add method around to take either a key/value pair, or just a key with the value set to nil? The first test specifies that the hash is empty when it is created so I cant give it default values there.
One could check the type of the parameter passed to the add method. If it is not an Enumerable (a mixin included in Array, Hash, etc.), just set its value to nil:
def add(item)
case item
when Enumerable
item.each do |words, definition|
@entries[words] = definition
end
else
@entries[item] = nil
end
end
Please note that case uses “case equality” to check argument type.
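To make the "case equality" remark concrete: `when Enumerable` calls `Enumerable === item`, which for a module is true when the item's class includes it. A small sketch (the `collection?` helper is a made-up name for illustration):

```ruby
# Hypothetical helper mirroring the case/when dispatch used in add.
def collection?(item)
  case item
  when Enumerable then true   # same as Enumerable === item
  else false
  end
end

puts Enumerable === { 'fish' => 'aquatic animal' }
puts Enumerable === 'fish'   # String does not include Enumerable in Ruby 1.9+
puts collection?(['fish', 'bird'])
puts collection?('fish')
```

This is why a bare string falls through to the else branch while a hash is iterated.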
If you are always passing Strings to the method, you could just have a default value for the second string... Something like the following:
def add(word, definition = nil)
@entries[word] = definition
end
So your code might look something like this:
class Dictionary
attr_accessor :entries, :keywords, :item
def initialize
@entries = {}
end
def add(word, definition = nil)
@entries[word] = definition
end
def keywords
@entries.keys
end
end#class
If you want multiple additions (i.e. add key: "word", with: "many", options: nil), that design might not work for you and you would need a solution along the lines of what @mudasobwa suggested. Perhaps:
def add(word, definition = nil)
return @entries[word] = definition unless word.is_a?(Enumerable)
return @entries.update word if word.is_a?(Hash)
raise "What?!"
end
Update, as per request
I updated the method above to allow for words that aren't strings (as you pointed out).
When passing a hash to a method, it is considered as a single parameter.
Key => Value pairs are an implied hash, so when passing a hash to a method, the following are generally the same:
Hash.new.update key: :value
Hash.new.update({key: :value})
Consider the following:
def test(a,b = nil)
puts "a = #{a}"
puts "b = #{b}"
end
test "string"
# => a = string
# => b =
test "string", key: :value, key2: :value2
# => a = string
# => b = {:key=>:value, :key2=>:value2}
test key: :value, key2: :value2, "string"
# Wrong Ruby Syntax due to implied Hash, would raise exception:
# => SyntaxError: (irb):8: syntax error, unexpected '\n', expecting =>
test({key: :value, key2: :value2}, "string")
# correct syntax.
This is why, when you pass add 'fish' => 'aquatic', it's considered only one parameter, a hash - as opposed to add 'fish', 'aquatic' which passes two parameters to the method.
If your method must accept different types of parameters (strings, hashes, numerals, symbols, arrays), you will need to deal with each option in a different way.
This is why @mudasobwa suggested checking the first parameter's type. His solution is pretty decent.
My version is a bit shorter to code, but it runs on the same idea.
def add(word, definition = nil)
return @entries[word] = definition unless word.is_a?(Enumerable)
return @entries.update word if word.is_a?(Hash)
raise "What?!"
end
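Put together, a self-contained version of the class might look like the sketch below. It follows the same idea (a Hash argument is merged, anything else becomes a key with a nil definition); returning self from add is an extra assumption made here for chaining, not something the spec asks for:

```ruby
class Dictionary
  attr_reader :entries

  def initialize
    @entries = {}
  end

  # Accepts either a Hash of word => definition pairs or a single bare word.
  def add(item)
    if item.is_a?(Hash)
      @entries.update(item)
    else
      @entries[item] = nil
    end
    self
  end

  def keywords
    @entries.keys
  end
end

d = Dictionary.new
d.add('fish')
d.add('bird' => 'feathered animal')
p d.keywords
```

Because `add 'bird' => 'feathered animal'` arrives as one Hash argument, the method only ever needs a single parameter.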

How do I access the elements in a hash which is itself a value in a hash?

I have this hash $chicken_parts, which consists of symbol/hash pairs (many more than shown here):
$chicken_parts = { :beak => {"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}, :claws => {"name"=>"Claws", "color"=>"Dirty", "function"=>"Scratching"} }
Then I have a class Embryo which has two class-specific hashes:
class Embryo
@parts_grown = Hash.new
@currently_developing = Hash.new
Over time, new pairs from $chicken_parts will be .merge!ed into @parts_grown. At various times, @currently_developing will be declared equal to one of the symbol/hash pairs from @parts_grown.
I'm creating Embryo class functions and I want to be able to access the "name", "color", and "function" values in @currently_developing, but I don't seem to be able to do it.
def grow_part(part)
@parts_grown.merge!($chicken_parts[part])
end
def develop_part(part)
@currently_developing = @parts_grown[part]
end
seems to populate the hashes as expected, but
puts @currently_developing["name"]
does not work. Is this whole scheme a bad idea? Should I just make the Embryo hashes into arrays of symbols from $chicken_parts, and refer to it whenever needed? That seemed like cheating to me for some reason...
There's a little bit of confusion here. When you merge! in grow_part, you aren't adding a :beak => {etc...} pair to @parts_grown. Rather, you are merging the hash that is pointed to by the part name, and adding all of the fields of that hash directly to @parts_grown. So after one grow_part, @parts_grown might look like this:
{"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}
I don't think that's what you want. Instead, try this for grow_part:
def grow_part(part)
@parts_grown[part] = $chicken_parts[part]
end
class Embryo
@parts_grown = {a: 1, b: 2}
def show
p @parts_grown
end
def self.show
p @parts_grown
end
end
embryo = Embryo.new
embryo.show
Embryo.show
--output:--
nil
{:a=>1, :b=>2}
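The output above shows that an instance variable assigned in the class body belongs to the class object, not to its instances. Combining that with the grow_part fix, one possible arrangement is to set the hashes up in initialize. This is a sketch under assumptions: it swaps the global for a constant named `CHICKEN_PARTS` and adds readers, neither of which is in the original code:

```ruby
# Assumed stand-in for the question's $chicken_parts global.
CHICKEN_PARTS = {
  beak:  { "name" => "Beak",  "color" => "Yellowish orange", "function" => "Pecking" },
  claws: { "name" => "Claws", "color" => "Dirty",            "function" => "Scratching" }
}.freeze

class Embryo
  attr_reader :parts_grown, :currently_developing

  def initialize
    # Set per instance, so every Embryo gets its own hashes.
    @parts_grown = {}
    @currently_developing = nil
  end

  def grow_part(part)
    # Store the inner hash under its symbol instead of merging its fields.
    @parts_grown[part] = CHICKEN_PARTS.fetch(part)
  end

  def develop_part(part)
    @currently_developing = @parts_grown[part]
  end
end

embryo = Embryo.new
embryo.grow_part(:beak)
embryo.develop_part(:beak)
puts embryo.currently_developing["name"]
```

With the part stored under its symbol, the nested "name", "color" and "function" values are reachable through the inner hash.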

Search ruby hash for empty value

I have a ruby hash like this
h = {"a" => "1", "b" => "", "c" => "2"}
Now I have a Ruby function which evaluates this hash and should return true if it finds a key with an empty value. The following function always returns true, even when none of the values in the hash are empty:
def hash_has_blank(hsh)
hsh.each do |k,v|
if v.empty?
return true
end
end
return false
end
What am I doing wrong here?
Try this:
def hash_has_blank hsh
hsh.values.any? &:empty?
end
Or:
def hash_has_blank hsh
hsh.values.any?{|i|i.empty?}
end
Use the second form if you are using an old 1.8.x Ruby.
I hope you're ready to learn some Ruby magic here. I wouldn't define such a function globally like you did. If it's an operation on a hash, then it should be an instance method on the Hash class. You can do it like this:
class Hash
def has_blank?
self.reject{|k,v| !v.nil? && v.length > 0}.size > 0
end
end
reject will return a new hash containing only the nil and empty values, and then the size of this new hash is checked.
A possibly more efficient way (it stops at the first blank value instead of traversing the whole hash):
class Hash
def has_blank?
self.values.any?{|v| v.nil? || v.length == 0}
end
end
But this will still traverse the whole hash if there is no empty value.
I've changed the empty? to !nil? && length > 0 because I don't know how your empty method works.
If you just want to check if any of the values is an empty string you could do
h.has_value?('')
but your function seems to work fine.
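As a quick check of that last remark, the sketch below runs the question's function (lightly tidied, same logic) against a hash with a blank value and one without, next to the shorter alternatives. The two sample hashes are made up for the comparison:

```ruby
# Same logic as the function in the question.
def hash_has_blank(hsh)
  hsh.each do |_key, value|
    return true if value.empty?
  end
  false
end

with_blank    = { "a" => "1", "b" => "", "c" => "2" }
without_blank = { "a" => "1", "b" => "x", "c" => "2" }

puts hash_has_blank(with_blank)
puts hash_has_blank(without_blank)
puts with_blank.values.any?(&:empty?)
puts with_blank.has_value?("")
puts without_blank.has_value?("")
```

If the function still always returns true in practice, the input is probably not what it appears to be, for example a value that really is an empty string.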
I'd consider refactoring your model domain. Obviously the hash represents something tangible. Why not make it an object? If the item can be completely represented by a hash, you may wish to subclass Hash. If it's more complicated, the hash can be an attribute.
Secondly, the reason for which you are checking blanks can be named to better reflect your domain. You haven't told us the "why", but let's assume that your Item is only valid if it doesn't have any blank values.
class MyItem < Hash
def valid?
!invalid?
end
def invalid?
values.any?{|i| i.empty?}
end
end
The point is, if you can establish a vocabulary that makes sense in your domain, your code will be cleaner and more understandable. Using a Hash is just a means to an end and you'd be better off using more descriptive, domain-specific terms.
Using the example above, you'd be able to do:
my_item = MyItem["a" => "1", "b" => "", "c" => "2"]
my_item.valid? #=> false

Resources