Subclassing Ruby Hash, object has no methods of Hash? - ruby

I'm creating a hash-like object for a little script that reads a file one line at a time and collects values into arrays stored in my hash class. I get wildly different results depending on whether I subclass Hash or not, and using super changes things in ways I don't understand.
My main issue is that without subclassing Hash (< Hash) it works perfectly, but I get none of the methods of Hash (for example, to iterate over the keys and get things out of it). Subclassing Hash lets me do those things, but then it seems that only the last element of each array is ever stored. Any insight into how to get the methods of Hash would be appreciated. The Dictionary class is a great example I found on this site, and it does exactly what I want, so I'm trying to understand how to use it properly.
filename = 'inputfile.txt.'
# ??? class Dictionary < Hash
class Dictionary
  def initialize()
    @data = Hash.new { |hash, key| hash[key] = [] }
  end
  def [](key)
    @data[key]
  end
  def []=(key,words)
    @data[key] += [words].flatten
    @data[key]
    # super(key,words)
  end
end
listData = Dictionary.new
File.open(filename, 'r').each_line do |line|
line = line.strip.split(/[^[:alpha:]|#|\.]/)
puts "LIST-> #{line[0]} SUB-> #{line[1]} "
listData[line[0]] = ("#{line[1]}")
end
puts '====================================='
puts listData.inspect
puts '====================================='
print listData.reduce('') {|s, (k, v)|
s << "The key is #{k} and the value is #{v}.\n"
}
If anyone understands what is going on when subclassing Hash and has some pointers, that would be excellent.
Running without explicit < Hash:
./list.rb:34:in `<main>': undefined method `reduce' for #<Dictionary:0x007fcf0a8879e0> (NoMethodError)
That is the typical error I see when I try and iterate in any way over my hash.
Here is a sample input file:
listA billg#microsoft.com
listA ed#apple.com
listA frank#lotus.com
listB evanwhite#go.com
listB joespink#go.com
listB fredgrey#stop.com

I can't reproduce your problem using your code:
d = Dictionary.new #=> #<Dictionary:0x007f903a1adef8 @data={}>
d[4] << 5 #=> [5]
d[5] << 6 #=> [6]
d #=> #<Dictionary:0x007f903a1adef8 @data={4=>[5], 5=>[6]}>
d.instance_variable_get(:@data) #=> {4=>[5], 5=>[6]}
But of course you won't get reduce if you don't subclass or include a class/module that defines it, or define it yourself!
The way you have implemented Dictionary is bound to have problems. You should call super instead of reimplementing wherever possible. For example, simply this works:
class Dictionary < Hash
def initialize
super { |hash, key| hash[key] = [] }
end
end
d = Dictionary.new #=> {}
d['answer'] << 42 #=> [42]
d['pi'] << 3.14 #=> [3.14]
d #=> {"answer"=>[42], "pi"=>[3.14]}
If you want to reimplement how and where the internal hash is stored (i.e., using @data), you'd have to reimplement at least each (since that is what almost all Enumerable methods call) and the getters/setters. Not worth the effort when you can just change one method instead.
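For completeness, the composition route described above could be sketched roughly as follows. This is only an illustration, assuming that appending values and iterating are all the script needs; the each method and the Enumerable include are additions that are not in the question's code.

```ruby
# Sketch: keep the hash in @data and expose iteration through Enumerable.
class Dictionary
  include Enumerable

  def initialize
    # Missing keys start out as an empty array that is stored in the hash.
    @data = Hash.new { |hash, key| hash[key] = [] }
  end

  def [](key)
    @data[key]
  end

  # Appends instead of replacing; accepts a single value or an array.
  def []=(key, words)
    @data[key].concat(Array(words))
  end

  # Enumerable builds reduce, map, select, to_a and friends on top of each.
  def each(&block)
    return enum_for(:each) unless block
    @data.each(&block)
    self
  end
end

list_data = Dictionary.new
list_data['listA'] = 'first'
list_data['listA'] = 'second'
list_data['listB'] = 'third'
summary = list_data.reduce('') { |s, (k, v)| s << "The key is #{k} and the value is #{v}.\n" }
puts summary
```

Subclassing Hash and overriding one method is still less code; this variant mainly makes sense if the rest of the Hash API should stay hidden.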

While Andrew Marshall's answer
is already correct, you could also try the alternative below.
Going from your code, we can assume that you want to create an object that
acts like a Hash, but with slightly different behaviour. Hence our first
line of code will be this.
class Dictionary < Hash
Assigning a new value to a key in the dictionary is done differently
here. Going by your example above, the assignment doesn't replace the previous
value with the new one; instead it pushes the new value onto the existing array,
or stores a new array initialized with the new value if the key doesn't exist yet.
Here I use the << operator as shorthand for Array's push method.
Also, the method returns the value, since that's what super does (see the if branch).
def []=(key, value)
if self[key]
self[key] << value
return value # here we mimic what super do
else
super(key, [value])
end
end
The advantage of using our own class is that we can add new methods to it,
and they will be accessible to all of its instances. Hence we don't need to
monkeypatch the Hash class, which is considered dangerous.
def size_of(key)
return self[key].size if self[key]
return 0 # the case for non existing key
end
Now, if we combine all above we will get this code
class Dictionary < Hash
def []=(key, value)
if self[key]
self[key] << value
return value
else
super(key, [value])
end
end
def size_of(key)
return self[key].size if self[key]
return 0 # the case for non existing key
end
end
player_emails = Dictionary.new
player_emails["SAO"] = "kirito#sao.com" # note no << operator needed here
player_emails["ALO"] = "lyfa#alo.com"
player_emails["SAO"] = "lizbeth#sao.com"
player_emails["SAO"] = "asuna#sao.com"
player_emails.size_of("SAO") #=> 3
player_emails.size_of("ALO") #=> 1
player_emails.size_of("GGO") #=> 0
p player_emails
#=> {"SAO" => ["kirito#sao.com", "lizbeth#sao.com", "asuna#sao.com"],
#=> "ALO" => ["lyfa#alo.com"] }
But, surely, the class definition could be replaced with this single line
player_emails = Hash.new { |hash, key| hash[key] = [] }
# (a bare Hash.new { [] } would not work: the default array
# is returned but never stored in the hash)
#
# note that we won't use
#
# player_emails[key] = value
#
# but instead
#
# player_emails[key] << value
#
# Oh, if you count the comments,
# it is no longer a single line
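As a small check of how Hash default blocks behave, the sketch below compares a block that only returns a fresh array with one that also stores it under the key. The variable names non_storing and storing are made up for this illustration.

```ruby
# The block's return value is handed back, but nothing is stored.
non_storing = Hash.new { [] }
non_storing['SAO'] << 'kirito'   # appends to a throwaway array
p non_storing                    # the hash is still empty

# Assigning inside the block stores the array under the missing key.
storing = Hash.new { |hash, key| hash[key] = [] }
storing['SAO'] << 'kirito'
storing['SAO'] << 'asuna'
p storing
```

Only the second form keeps the appended values, which is why the `<<` style of insertion depends on it.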
With the answer finished, I want to comment on some of your example code:
filename = 'inputfile.txt.'
# Maybe it's better to use ARGF instead,
# so you could supply the filename in the command line
# and, is the filename ended with a dot? O.o;
File.open(filename, 'r').each_line do |line|
# This line opens the file without keeping a reference to it,
# then iterates over each line of the file.
# The file is not closed explicitly here; it stays open until the
# File object is garbage collected.
# Safer version:
File.open(filename, 'r') do |file|
  file.each_line do |line|
    # ...
  end
end # the file is closed when we reach here
# ARGF version:
ARGF.each_line do |line|
# ...
end
# Inside the each_line block
line = line.strip.split(/[^[:alpha:]|#|\.]/)
# I don't know what you intended with that line,
# but using that regex will result in
#
# ["listA", "", "", "billg#microsoft.com"]
#
# Hence, your example will fail since
# line[0] == "listA" and line[1] == ""
# Also note that your regex means
#
# any character except:
# letters, '|', '#', '|', '\.'
#
# If you want to split on one or more
# whitespace characters, use \s+ instead.
# Hence we could replace it with:
line = line.strip.split(/\s+/)
puts "LIST-> #{line[0]} SUB-> #{line[1]} "
# OK, is this supposed to debug the line?
# Tip: the simplest way to debug is:
#
# p line
#
# that's all.
listData[line[0]] = ("#{line[1]}")
# Why use (), then "", then #{}?
# I suggest:
listData[line[0]] = line[1]
# But to make it simpler, you could actually do this instead
key, value = line.strip.split(/\s+/)
listData[key] = value
# Outside the block:
puts '====================================='
# OK, that's too loooooooooong...
puts '=' * 30
# or better assign it to a variable since you use it twice
a = '=' * 30
puts a
p listData # better way to debug
puts a
# next:
print listData.reduce('') { |s, (k, v)|
s << "The key is #{k} and the value is #{v}.\n"
}
# Why use reduce?
# For debugging you could use `p listData` instead.
# But since you are printing it, why not iterate over
# each element and print it?
listData.each do |k, v|
puts "The key is #{k} and the value is #{v}."
end
OK, sorry for blabbering so much. Hope it helps.

Related

Syntax error, unexpected tIDENTIFIER, expecting ')' Ruby

I get the following error when running a simple method that takes in a proper noun string and returns the string properly capitalized.
def format_name(str)
parts = str.split
arr = []
parts.map do |part|
if part[0].upcase
else part[1..-1].downcase
arr << part
end
end
return arr.join(" ")
end
Test cases:
puts format_name("chase WILSON") # => "Chase Wilson"
puts format_name("brian CrAwFoRd scoTT") # => "Brian Crawford Scott"
The only way the above code returns blank output is if arr ends up empty, and it does in your case because of this line of code:
if part[0].upcase
This condition is always truthy: part[0].upcase returns a new String (it does not modify part), and every String is truthy in Ruby.
Hence, your else branch, which is the only place that appends to arr, never gets executed. Even if it were executed, it would give back the same string as the input, because you are pushing the plain part into arr without any formatting applied.
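A minimal sketch of that truthiness point, using a made-up sample word: the condition evaluates to a String, so the if branch is always taken and part itself is left untouched.

```ruby
part = "wILSON"                  # sample input, chosen for illustration

upcased = part[0].upcase         # returns a new String, "W"

branch = if part[0].upcase       # a String is always truthy
           :if_branch
         else
           :else_branch          # never reached
         end

p upcased
p branch
p part                           # unchanged
```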
There are some ways you can get the above code working. I'll put two cases:
# one where your map method could work
def format_name(str)
parts = str.split
arr = []
arr = parts.map do |part|
part.capitalize
end
return arr.join(" ")
end
# one where your loop code logic works
def format_name(str)
parts = str.split
arr = []
parts.map do |part|
arr << "#{part[0].upcase}#{part[1..-1].downcase}"
end
return arr.join(" ")
end
There are numerous other ways this could work. I'll also put the one I prefer if I am using just plain ruby:
def format_name(str)
  # join turns the array of capitalized words back into a single string
  str.split(' ').map(&:capitalize).join(' ')
end
You could also read more about the open classes concept to put this into Ruby's String class.
Also, check out the camelize method if you're using Rails.

How do I get this block of ruby to add each individual hash into an array instead of just adding one hash multiple times?

@session is formatted as
[
['time','action','user'],
['time','action','user'],
...
]
and I'm trying to create an array that has those array elements but as hashes of {:time=>"time", :action=>"action", :user=>"user"}. The puts sessions line outputs each line as I desire, but when I try to capture those hashes into sessions_array I receive an array containing one hash repeated many times instead of the unique hashes that puts is outputting.
sessions = Hash.new
sessions_array = Array.new
@session.each_with_index { |element, index|
  next_element = @session[index+1]
  sessions[:time] = element[0]
  sessions[:action] = element[1]
  sessions[:user] = element[2]
  sessions_array << sessions
  puts sessions
}
puts sessions_array
Create sessions inside of the each_with_index block instead of outside. Otherwise every iteration mutates and appends the same Hash object:
sessions_array = []
@session.each do |element|
  sessions = {
    time: element[0],
    action: element[1],
    user: element[2],
  }
  sessions_array << sessions
end
puts sessions_array
However, this can be done much more succinctly. When you're turning an array into another array with the same number of elements you almost always want to use map. Also, in a Ruby block you can extract the elements from an array by specifying multiple names in its arguments (|foo, bar, ...|).
This code is equivalent to the above:
sessions_array = @session.map do |time, action, user|
  { time: time, action: action, user: user }
end
You can see both of these snippets in action on repl.it here: https://repl.it/#jrunning/NavyImmaculateShockwave
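The root cause in the question's code can be seen in a small sketch: when one Hash is created outside the loop, every array slot ends up pointing at that same object, so all entries show the last row. The sample rows and the names shared, aliased and fresh are invented for this illustration.

```ruby
rows = [
  %w[09:00 login ann],
  %w[09:05 logout bob]
]

# One hash reused across iterations: both slots reference the same object.
shared = {}
aliased = rows.map do |time, action, user|
  shared[:time] = time
  shared[:action] = action
  shared[:user] = user
  shared
end

# A new hash per iteration: each slot is its own object.
fresh = rows.map do |time, action, user|
  { time: time, action: action, user: user }
end

p aliased.map(&:object_id).uniq.size  # number of distinct objects
p fresh.map { |h| h[:user] }
```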
Perhaps you are looking for something like the following.
Code
def hashify(data, keys)
data.map { |row| keys.zip(row).to_h }
end
Example
data = [
%w| 11:00 pummel Billy-Bob |,
%w| 02:00 maim Trixie |,
%w| 19:00 kill Bill |
]
#=> [["11:00", "pummel", "Billy-Bob"],
# ["02:00", "maim", "Trixie"],
# ["19:00", "kill", "Bill"]]
keys = [:time, :action, :user]
hashify(data, keys)
#=> [{:time=>"11:00", :action=>"pummel", :user=>"Billy-Bob"},
# {:time=>"02:00", :action=>"maim", :user=>"Trixie"},
# {:time=>"19:00", :action=>"kill", :user=>"Bill"}]
I have chosen to make data and keys arguments of the method so that those parameters can be modified without affecting the method itself.
Note that each of the three elements of:
data.map { |row| keys.zip(row) }
#=> [[[:time, "11:00"], [:action, "pummel"], [:user, "Billy-Bob"]],
# [[:time, "02:00"], [:action, "maim"], [:user, "Trixie"]],
# [[:time, "19:00"], [:action, "kill"], [:user, "Bill"]]]
is converted to a hash using the method Array#to_h. See also Array#zip.

How to use reduce/inject in Ruby without getting Undefined variable

When using an accumulator, does the accumulator exist only within the reduce block or does it exist within the function?
I have a method that looks like:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
str.split.reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
end
return true if (new_array == new_array.sort)
end
When I execute this code I get the error
"undefined variable new_array in line 11 (the return statement)"
I also tried assigning the new_array value to another variable as an else statement inside my reduce block but that gave me the same results.
Can someone explain to me why this is happening?
The problem is that new_array only exists during the call to reduce; the reference is lost afterwards. Block parameters in Ruby are scoped to the block they belong to. In your case the array can be returned from reduce, so you can use that return value instead. However, you need to fix a couple of things:
str.split with no arguments splits on whitespace; it does not break a string into characters. You should use str.chars or str.split('').
The accumulator for each new iteration of reduce is whatever the block returned on the previous iteration. The simplest way to keep the array is to put new_array as the last expression in your block.
Thus:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
crazy_only = str.split('').reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (crazy_only == crazy_only.sort)
end
Note that your function is not very efficient, and not very idiomatic. Here's a shorter version of the function that is more idiomatic, but not much more efficient:
def my_useless_function(str)
crazy_letters = %w[a s d f g h]
crazy_only = str.chars.select{ |c| crazy_letters.include?(c) }
crazy_only == crazy_only.sort # evaluates to true or false
end
And here's a version that's more efficient:
def efficient_useless(str)
crazy_only = str.scan(/[asdfgh]/) # use regex to search for the letters you want
crazy_only == crazy_only.sort
end
Block local variables
new_array doesn't exist outside the block of your reduce call. It's a "block local variable".
reduce does return an object, though, and you should use it inside your method.
sum = [1, 2, 3].reduce(0){ |acc, elem| acc + elem }
puts sum
# 6
puts acc
# undefined local variable or method `acc' for main:Object (NameError)
Your code
Here's the least amount of change for your method :
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
new_array = str.split(//).reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (new_array == new_array.sort)
end
Notes:
return isn't needed at the end.
true if ... isn't needed either.
for loops are rarely used in idiomatic Ruby; prefer each and the other Enumerable methods.
reduce uses the value of the last expression in the block as the next accumulator. In your code that was the for loop, which evaluates to the range it iterated over.
If you always need to return the same object in reduce, it might be a sign you could use each_with_object.
"test".split is just ["test"].
String and Enumerable have methods that could help you. Using them, you could write a much cleaner and more efficient method, as in @Phrogz's answer.
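As a rough sketch of the each_with_object suggestion in the notes above: the accumulator is passed to every iteration automatically, so the block does not have to return it. The method name crazy_letters_sorted? and the block parameter found are made up for this example.

```ruby
def crazy_letters_sorted?(str)
  crazy_letters = %w[a s d f g h]

  # each_with_object returns the accumulator (the array) when it finishes,
  # regardless of what the block itself evaluates to.
  crazy_only = str.chars.each_with_object([]) do |letter, found|
    found << letter if crazy_letters.include?(letter)
  end

  crazy_only == crazy_only.sort
end

p crazy_letters_sorted?("adfgh")
p crazy_letters_sorted?("sad")
```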

Reading strings from one file and adding to another file with suffix to make unique

I am processing documents in ruby.
I have a document from which I am extracting specific strings using a regexp and then adding them to another file. When added to the destination file they must be made unique, so if a string already exists in the destination file I'm adding a simple suffix, e.g. <word>_1. Eventually I want to reference the strings by name, so random number generation or a string derived from the date is no good.
At present I am storing each added word in an array, and every time I add a word I check that the string doesn't already exist in the array. That is fine if there is only one duplicate, but there might be two or more, so I need to check for the initial string and then loop, incrementing the suffix until the result doesn't exist. (I have simplified my code, so there may be bugs.)
def add_word(word)
  if @added_words.include? word
    suffix = 1
    suffixed_word = word
    while @added_words.include? suffixed_word
      suffixed_word = word + "_" + suffix.to_s
      suffix += 1
    end
    word = suffixed_word
  end
  @added_words << word
end
It looks messy, is there a better algorithm or ruby way of doing this?
Make @added_words a Set (don't forget to require 'set'). This makes for faster lookup as sets are implemented with hashes, while still using include? to check for set membership. It's also easy to extract the highest used suffix:
>> require 'set'; s = Set.new
#=> #<Set: {}>
>> s << 'foo'
#=> #<Set: {"foo"}>
>> s << 'foo_1'
#=> #<Set: {"foo", "foo_1"}>
>> word = 'foo'
#=> "foo"
>> s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1.to_i } # to_i compares suffixes as numbers
#=> "foo_1"
>> s << 'foo_12'
#=> #<Set: {"foo", "foo_1", "foo_12"}>
>> s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1.to_i }
#=> "foo_12"
Now to get the next value you can insert, you could just do the following (imagine you already had 12 foos, so the next should be a foo_13):
>> s << s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1.to_i }.next
#=> #<Set: {"foo", "foo_1", "foo_12", "foo_13"}>
Sorry if the examples are a bit confused, I had anesthesia earlier today. It should be enough to give you an idea of how sets could potentially help you though (most of it would work with array too, but sets have faster lookup).
Change @added_words to a Hash with a default of zero. Then you can do:
@added_words = Hash.new(0)
def add_word( word)
  @added_words[word] += 1
end
# put it to work:
list = %w(test foo bar test bar bar)
names = list.map do |w|
  "#{w}_#{add_word(w)}"
end
p @added_words
#=> {"test"=>2, "foo"=>1, "bar"=>3}
p names
#=>["test_1", "foo_1", "bar_1", "test_2", "bar_2", "bar_3"]
In that case, I'd probably use a set or hash:
#in your class:
require 'set'
require 'forwardable'
extend Forwardable #I'm just including this to keep your previous api
#elsewhere you're setting up your instance_var, it's probably [] at the moment
def initialize
  @added_words = Set.new
end
#then instead of `def add_word(word); @added_words.add(word); end`:
#(def_delegator takes the accessor, the target method, then the alias)
def_delegator :@added_words, :add, :add_word
#or just change whatever loop to use @added_words.add('word') rather than self#add_word('word')
#@added_words.add('word') does nothing if 'word' already exists in the set.
If you've got some attributes that you're grouping via these sections, then a hash might be better:
#elsewhere you're setting up your instance_var, it's probably [] at the moment
def initialize
  @added_words = {}
end
def add_word(word, attrs={})
  @added_words[word] ||= []
  @added_words[word].push(attrs)
end
Doing it the "wrong way", but in slightly nicer code:
def add_word(word)
  if @added_words.include? word
    suffixed_word = 1.upto(1.0/0.0) do |suffix|
      candidate = [word, suffix].join("_")
      break candidate unless @added_words.include?(candidate)
    end
    word = suffixed_word
  end
  @added_words << word
end
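One way the Set and counter ideas from the answers above might be combined is sketched below. The class name WordRegistry and the @next_suffix counter are hypothetical; the sketch assumes the first occurrence should keep its plain name and later duplicates get _1, _2 and so on, as in the question.

```ruby
require 'set'

class WordRegistry
  def initialize
    @added_words = Set.new
    @next_suffix = Hash.new(1)   # next suffix to try for each base word
  end

  # Returns the unique name that was actually stored.
  def add_word(word)
    candidate = word
    # Set#add? returns nil when the element is already present.
    until @added_words.add?(candidate)
      candidate = "#{word}_#{@next_suffix[word]}"
      @next_suffix[word] += 1
    end
    candidate
  end
end

registry = WordRegistry.new
names = %w[foo bar foo foo].map { |w| registry.add_word(w) }
p names
```

Remembering the next suffix per word avoids rescanning from _1 each time, while the loop still guards against a suffixed name that was added literally.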

Can I reject objects which do not meet my criteria as they are entered into an array?

I know there are a number of ways to create new elements in an existing ruby array.
e.g.
myArray = []
myArray + other_array
myArray << obj
myArray[index] = obj
I'm also pretty sure I could use .collect, .map, .concat, .fill, .replace, .insert, .join, .pack and .push as well to add to or otherwise modify the contents of myArray.
However, I want to ensure that myArray only ever includes valid HTTP/HTTPS URLs.
Can anyone explain how I can enforce that kind of behaviour?
I would create a module that allows you to specify an acceptance block for an array, and then override all the methods you mention (and more, like concat) to pre-filter the argument before calling super. For example:
module LimitedAcceptance
  def only_allow(&block)
    @only_allow = block
  end
  def <<( other )
    super if @only_allow[ other ]
  end
  def +( other_array )
    super( other_array.select(&@only_allow) )
  end
end
require 'uri'
my_array = []
my_array.extend LimitedAcceptance
my_array.only_allow do |item|
uri = item.is_a?(String) && URI.parse(item) rescue nil
uri.class <= URI::HTTP
end
my_array << "http://phrogz.net/"
my_array << "ftp://no.way"
my_array += %w[ ssh://bar http://ruby-lang.org http:// ]
puts my_array
#=> http://phrogz.net/
#=> http://ruby-lang.org
Create a class to encapsulate the behavior you want. Then you can write your own << method that does the verifications you need.
Put all the logic that handles this data into methods of this domain class. You will probably discover code scattered around the uses of this data that can move into the new class.
My 2 cents.
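A minimal sketch of such a domain class, under stated assumptions: the name UrlList, the http_url? helper and the requirement that a URL has a host are all choices made for this illustration, not something the answer specifies.

```ruby
require 'uri'

class UrlList
  include Enumerable

  def initialize
    @urls = []
  end

  # Silently ignores anything that is not an HTTP or HTTPS URL.
  def <<(candidate)
    @urls << candidate if self.class.http_url?(candidate)
    self
  end

  def each(&block)
    @urls.each(&block)
  end

  def self.http_url?(candidate)
    return false unless candidate.is_a?(String)
    uri = URI.parse(candidate)
    # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end
end

list = UrlList.new
list << "http://phrogz.net/" << "ftp://no.way" << "https://ruby-lang.org" << 42
p list.to_a
```

Because the array is private to the class, callers cannot bypass the check through +, insert or the other Array methods listed in the question.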
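Use this to insert. (untested).
require 'uri'
def insert_to_array(first_array, second_array)
  second_array.each do |i|
    # URI::HTTPS is a subclass of URI::HTTP, so is_a? accepts both schemes.
    # URI.parse raises URI::InvalidURIError on malformed input.
    first_array << i if URI.parse(i).is_a?(URI::HTTP)
  end
  first_array
end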
