Explanation of Ruby code for building Trie data structures - ruby

So I have this ruby code I grabbed from wikipedia and I modified a bit:
@trie = Hash.new()
def build(str)
  node = @trie
  str.each_char { |ch|
    cur = ch
    prev_node = node
    node = node[cur]
    if node == nil
      prev_node[cur] = Hash.new()
      node = prev_node[cur]
    end
  }
end
build('dogs')
puts @trie.inspect
I first ran this in irb, and each time I output node it just gives me an empty hash, {}. But when I actually call build with the string 'dogs', it works and outputs {"d"=>{"o"=>{"g"=>{"s"=>{}}}}}, which is correct.
This is probably more of a Ruby question than a question about how the algorithm works. I don't have enough Ruby knowledge to decipher what is going on there.

You're probably getting lost inside that mess of code which takes an approach that seems a better fit for C++ than for Ruby. Here's the same thing in a more concise format that uses a special case Hash for storage:
class Trie < Hash
  def initialize
    # Ensure that this is not a special Hash by disallowing
    # initialization options.
    super
  end
  def build(string)
    string.chars.inject(self) do |h, char|
      h[char] ||= { }
    end
  end
end
It works exactly the same but doesn't have nearly the same mess with pointers and such:
trie = Trie.new
trie.build('dogs')
puts trie.inspect
Ruby's Enumerable module is full of amazingly useful methods like inject which is precisely what you want for a situation like this.
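To see how inject threads the current node through each character, here is a minimal sketch that uses plain hashes. The helper names insert and prefix? are hypothetical and are not part of the answer above; the lookup is just one way the same pattern could be reused.

```ruby
# Hypothetical helpers (not from the original answer).

# Walk the trie one character at a time, creating child nodes as needed.
# inject passes the block's return value (the child node) to the next step.
def insert(trie, word)
  word.each_char.inject(trie) { |node, ch| node[ch] ||= {} }
  trie
end

# Follow the same path without creating nodes; a nil anywhere along
# the way means the prefix is absent.
def prefix?(trie, prefix)
  node = prefix.each_char.inject(trie) { |n, ch| n && n[ch] }
  !node.nil?
end

trie = {}
insert(trie, 'dogs')
insert(trie, 'dot')

p trie
p prefix?(trie, 'do')
p prefix?(trie, 'cat')
```

Because `node[ch] ||= {}` evaluates to the child hash, each step of inject descends one level, which is what the prev_node/node bookkeeping in the question does by hand.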

I think you are just using irb incorrectly. You should type the whole function in, then run it, and see if you get correct results. If it doesn't work, how about you post your entire IRB session here.
Also here is a simplified version of your code:
def build(str)
  node = @trie
  str.each_char do |ch|
    node = (node[ch] ||= {})
  end
  # not sure what the return value should be
end

Related

Ruby Enumerable#find returning mapped value

Does Ruby's Enumerable offer a better way to do the following?
output = things
.find { |thing| thing.expensive_transform.meets_condition? }
.expensive_transform
Enumerable#find is great for finding an element in an enumerable, but returns the original element, not the return value of the block, so any work done is lost.
Of course there are ugly ways of accomplishing this...
Side effects
def costly_find(things)
  output = nil
  things.each do |thing|
    expensive_thing = thing.expensive_transform
    if expensive_thing.meets_condition?
      output = expensive_thing
      break
    end
  end
  output
end
Returning from a block
This is the alternative I'm trying to refactor
def costly_find(things)
  things.each do |thing|
    expensive_thing = thing.expensive_transform
    return expensive_thing if expensive_thing.meets_condition?
  end
  nil
end
each.lazy.map.find
def costly_find(things)
  things
    .each
    .lazy
    .map(&:expensive_transform)
    .find(&:meets_condition?)
end
Is there something better?
You wrote: "Of course there are ugly ways of accomplishing this..."
If you had a cheap operation, you'd just use:
collection.map(&:operation).find(&:condition?)
To make Ruby call operation only "on an as-needed basis" (as the documentation says), you can simply prepend lazy:
collection.lazy.map(&:operation).find(&:condition?)
I don't think this is ugly at all; quite the contrary, it looks elegant to me.
Applied to your code:
def costly_find(things)
things.lazy.map(&:expensive_transform).find(&:meets_condition?)
end
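A small sketch can make the difference visible by counting how often the transform runs. The counter and the doubling lambda below are stand-ins invented for illustration; they are not from the question.

```ruby
calls = 0

# Stand-in for an expensive transform; it records how often it runs.
transform = lambda do |n|
  calls += 1
  n * 2
end

numbers = [1, 3, 6, 4]

# Eager: map transforms every element before find looks at any of them.
eager = numbers.map { |n| transform.call(n) }.find { |v| v == 12 }
eager_calls = calls

# Lazy: map only transforms elements as find asks for them.
calls = 0
lazy = numbers.lazy.map { |n| transform.call(n) }.find { |v| v == 12 }
lazy_calls = calls

puts "eager: result=#{eager}, transform calls=#{eager_calls}"
puts "lazy:  result=#{lazy}, transform calls=#{lazy_calls}"
```

Both versions return the same value, but the lazy chain stops calling the transform as soon as find has its match.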
I would be inclined to create an enumerator that generates values thing.expensive_transform and then make that the receiver for find with meets_condition? in find's block. For one, I like the way that reads.
Code
def costly_find(things)
Enumerator.new { |y| things.each { |thing| y << thing.expensive_transform } }.
find(&:meets_condition?)
end
Example
class Thing
  attr_reader :value
  def initialize(value)
    @value = value
  end
  def expensive_transform
    self.class.new(value*2)
  end
  def meets_condition?
    value == 12
  end
end
things = [1,3,6,4].map { |n| Thing.new(n) }
  #=> [#<Thing:0x00000001e90b78 @value=1>, #<Thing:0x00000001e90b28 @value=3>,
  #    #<Thing:0x00000001e90ad8 @value=6>, #<Thing:0x00000001e90ab0 @value=4>]
costly_find(things)
  #=> #<Thing:0x00000001e8a3b8 @value=12>
In the example I have assumed that expensive_things and things are instances of the same class, but if that is not the case the code would need to be modified in the obvious way.
I don't think there is an "obvious best general solution" for your problem that is also simple to use. You have two procedures involved (expensive_transform and meets_condition?), and, if this were a library method, you would also need a third parameter: the value to return if no transformed element meets the condition. You return nil in this case, but in a general solution expensive_transform might also yield nil, and only the caller knows what unique value would indicate that the condition has not been met.
Hence, a possible solution within Enumerable would have the signature
module Enumerable
  def find_transformed(default_return_value, transform_proc, condition_proc)
    ...
  end
end
or something similar, so this is not particularly elegant either.
You could do it with a single block, if you agree to merge the semantics of the two procedures into one: You have only one procedure, which calculates the transformed value and tests it. If the test succeeds, it returns the transformed value, and if it fails, it returns the default value:
module Enumerable
  def find_by(default_value, &block)
    result = default_value
    each do |element|
      result = block.call(element)
      break if result != default_value
    end
    result
  end
end
You would use it in your case like this:
my_collection.find_by(nil) do |el|
transformed_value = expensive_transform(el)
meets_condition?(transformed_value) ? transformed_value : nil
end
I'm not sure whether this is really intuitive to use...
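For comparison, the same single-block idea can be sketched without reopening Enumerable, assuming Ruby 2.7 or later where Enumerator::Lazy#filter_map is available. The method name find_mapped is hypothetical. The sketch hard-codes nil/false as the "no match" value, so it shares the limitation discussed above: a transform that legitimately returns nil or false cannot be found this way.

```ruby
# Hypothetical helper: returns the first truthy block result, or nil.
# filter_map drops nil/false results; lazy makes it stop at the first hit.
def find_mapped(enum)
  enum.lazy.filter_map { |element| yield(element) }.first
end

calls = 0
result = find_mapped([1, 3, 6, 4]) do |n|
  calls += 1
  doubled = n * 2            # stand-in for the expensive transform
  doubled if doubled == 12   # stand-in for the condition
end

missing = find_mapped([1, 2, 3]) { |n| n if n > 10 }

puts "result=#{result.inspect}, block calls=#{calls}, missing=#{missing.inspect}"
```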

I am getting an "Undefined method 'new' for.... (A number that changes each time)"

I made a simple program with a single method and I'm trying to test it, but I keep getting this weird error, and I have no idea why it keeps happening.
Here's my code for the only method I wrote:
def make_database(lines)
i = 0
foods = hash.new()
while i < lines.length do
lines[i] = lines[i].chomp()
words = lines[i].split(',')
if(words[1].casecmp("b") == 0)
foods[words[0]] = words[3]
end
end
return foods
end
And then here's what I have for calling the method (Inside the same program).
if __FILE__ == $PROGRAM_NAME
lines = []
$stdin.each { |line| lines << line}
foods = make_database(lines).new
puts foods
end
I am painfully confused, especially since it gives me a different random number for each "Undefined method 'new' for (Random number)".
It's a simple mistake. Lowercase hash calls the hash method on the current object, which returns an integer that the Hash structure uses for indexing entries; that integer is the number you see in the error message. Hash, with a capital H, is the class you're probably intending:
foods = Hash.new()
Or more succinctly:
foods = { }
It's ideal to use { } in place of Hash.new unless you need to specify things like defaults, as is the case with:
Hash.new(0)
Where all values are initialized to 0 by default. This can be useful when creating simple counters.
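A short sketch of both points, using made-up sample data: a counter built on Hash.new(0), and what happens when lowercase hash is used where the Hash class was meant.

```ruby
# Counter: missing keys read as 0, so += works without initialization.
counts = Hash.new(0)
%w[apple pear apple].each { |word| counts[word] += 1 }

# A plain {} has no default, so a missing key reads as nil.
plain_default = {}[:missing]

# Lowercase hash is a method that returns an Integer, and an Integer
# instance has no method called new.
error = begin
  Object.new.hash.new
rescue NoMethodError => e
  e
end

p counts
p plain_default
puts error.class
```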
Ruby classes are identified by leading capital letters to avoid confusion like this. Once you get used to the syntax you'll have an easier time spotting mistakes like that.
Note that when writing Ruby code you will almost always omit the parentheses on empty argument lists. That is, x() is written simply as x. This keeps code more readable, especially when chaining: x.y.z instead of x().y().z().
Other things to note include being able to read in all lines with readlines instead of what you have there where you manually compose it. Try:
make_database($stdin.readlines.map(&:chomp))
A more aggressive refactoring of your code looks like this:
def make_database(lines)
  # Define a Hash based on key/value pairs in an Array...
  Hash[
    # ...where these pairs are based on the input lines...
    lines.map do |line|
      # ...which have comma-separated components.
      line.split(',')
    end.select do |key, flag, _, value|
      # Pick out only those that have the right flag.
      flag.downcase == 'b'
    end.map do |key, flag, _, value|
      # Convert to a simple key/value pair array
      [ key, value ]
    end
  ]
end
That might be a little hard to follow, but once you get the hang of chaining together a series of otherwise simple operations your Ruby code will be a lot more flexible and far easier to read.

Functionally find mapping of first value that passes a test

In Ruby, I have an array of simple values (possible encodings):
encodings = %w[ utf-8 iso-8859-1 macroman ]
I want to keep reading a file from disk until the results are valid. I could do this:
good = encodings.find{ |enc| IO.read(file, "r:#{enc}").valid_encoding? }
contents = IO.read(file, "r:#{good}")
...but of course this is dumb, since it reads the file twice for the good encoding. I could program it in gross procedural style like so:
contents = nil
encodings.each do |enc|
if (s=IO.read(file, "r:#{enc}")).valid_encoding?
contents = s
break
end
end
But I want a functional solution. I could do it functionally like so:
contents = encodings.map{|e| IO.read(f, "r:#{e}")}.find{|s| s.valid_encoding? }
…but of course that keeps reading files for every encoding, even if the first was valid.
Is there a simple pattern that is functional, but does not keep reading the file after the first success is found?
If you sprinkle a lazy in there, map will only consume those elements of the array that are used by find - i.e. once find stops, map stops as well. So this will do what you want:
possible_reads = encodings.lazy.map {|e| IO.read(f, "r:#{e}")}
contents = possible_reads.find {|s| s.valid_encoding? }
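The following self-contained sketch exercises that pattern against a temporary file. The helper name read_first_valid and the attempts array are hypothetical additions used only to record which encodings were actually tried, and File.read with the encoding: option stands in for the IO.read calls in the question.

```ruby
require 'tempfile'

# Hypothetical helper: lazily read the file once per encoding and
# return the first result whose bytes are valid in that encoding.
def read_first_valid(path, encodings, attempts = [])
  encodings.lazy.map do |enc|
    attempts << enc                      # record each real read
    File.read(path, encoding: enc)
  end.find(&:valid_encoding?)
end

attempts = []
contents = nil

Tempfile.create('sample') do |tmp|
  tmp.binmode
  tmp.write("caf\xE9".b)                 # 0xE9 alone is not valid UTF-8
  tmp.flush
  contents = read_first_valid(tmp.path, %w[utf-8 iso-8859-1 macroman], attempts)
end

puts "encoding used: #{contents.encoding}"
puts "encodings tried: #{attempts.inspect}"
```

The third encoding is never tried, because find stops pulling from the lazy map once a read is valid.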
Hopping on sepp2k's answer: If you can't use 2.0, lazy enums can be easily implemented in 1.9:
class Enumerator
  def lazy_find
    self.class.new do |yielder|
      self.each do |element|
        if yield(element)
          yielder.yield(element)
          break
        end
      end
    end
  end
end
a = (1..100).to_enum
p a.lazy_find { |i| i.even? }.first
# => 2
You want to use the break statement:
contents = encodings.each do |e|
s = IO.read( f, "r:#{e}" )
s.valid_encoding? and break s
end
The best I can come up with is with our good friend inject:
contents = encodings.inject(nil) do |s,enc|
  s || ((c = File.open(f, "r:#{enc}", &:read)).valid_encoding? && c)
end
This is still sub-optimal because it continues to loop through encodings after finding a match, though it doesn't do anything with them, so it's a minor ugliness. Most of the ugliness comes from...well, the code itself. :/

How do I see if a multi-dimensional hash has a value in ruby?

Currently I am doing the following, but I am sure there must be a better way:
def birthday_defined?(map)
map && map[:extra] && map[:extra][:raw_info] && map[:extra][:raw_info][:birthday]
end
There may be cases where only map[:extra] is defined, and then I end up getting undefined method errors on nil because map[:extra][:raw_info] doesn't exist, unless I use the checked code above.
If you're using Rails, then you can use try (and NilClass#try):
value = map.try(:[], :extra).try(:[], :raw_info).try(:[], :birthday)
That looks a bit repetitive: it is just doing the same thing over and over again while feeding the result of one step into the next step. That code pattern means that we have a hidden injection:
value = [:extra, :raw_info, :birthday].inject(map) { |h, k| h.try(:[], k) }
This approach nicely generalizes to any path into map that you have in mind:
path = [ :some, :path, :of, :keys, :we, :care, :about ]
value = path.inject(map) { |h, k| h.try(:[], k) }
Then you can look at value.nil?.
Of course, if you're not using Rails then you'll need a replacement for try but that's not difficult.
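Outside Rails, one possible replacement on Ruby 2.3 or later is the built-in Hash#dig together with the safe navigation operator (&.). This is a sketch with made-up sample data; note that dig raises a TypeError if an intermediate value is something that cannot be dug into, such as a String.

```ruby
# Returns the birthday or nil if any level of the path is missing.
# &. guards against map itself being nil.
def birthday(map)
  map&.dig(:extra, :raw_info, :birthday)
end

full    = { extra: { raw_info: { birthday: '1990-01-01' } } }
partial = { extra: {} }

p birthday(full)
p birthday(partial)
p birthday(nil)

# The same works for any path stored in an array.
path = [:extra, :raw_info, :birthday]
p full.dig(*path)
```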
I have two ways. Both have the same code but subtly different:
# Method 1
def birthday_defined?(map)
map[:extra][:raw_info][:birthday] rescue nil # rescues current line
end
# Method 2
def birthday_defined?(map)
map[:extra][:raw_info][:birthday]
rescue # rescues whole method
nil
end
Use a begin/rescue block.
begin
map[:extra][:raw_info][:birthday]
rescue Exception => e
'No birthday! =('
end
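That's the idiomatic way to do it. And yes, it can be a little cumbersome.
If you want to extend Hash a bit though, you can do some cool stuff with something like a key path. See Access Ruby Hash Using Dotted Path Key String. Note that the dig used here is the custom method from that answer, which takes a dotted string; Ruby's built-in Hash#dig takes separate keys instead.
def birthday_defined?(map)
  map.dig('extra.raw_info.birthday')
end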
This is a little hacky but it will work:
def birthday_defined?(map)
map.to_s[":birthday"]
end
If map contains :birthday then it will return the string which will evaluate to true in a conditional statement while if it doesn't contain :birthday, it will return nil.
Note: This assumes the key :birthday does not appear at potentially multiple locations in map.
This should work for you:
def birthday_defined?(map)
map
.tap{|x| (x[:extra] if x)
.tap{|x| (x[:raw_info] if x)
.tap{|x| (x[:birthday] if x)
.tap{|x| return x}}}}
end

Access Ruby accessors using block variables

I have an application that uses Active Record to access a database. I'm trying to take the information and put it into a hash for later use.
Here is basically what I am doing.
require 'active_support'
@emailhash = Hash.new
emails = Email.find(:all)
emails.each do |email|
  email.attributes.keys.each do |@column|
    @emailhash[email.ticketno] || Hash.new
    @emailhash[email.ticketno] = email.@column
  end
end
The line that doesn't work is:
@emailhash[email.ticketno] = email.@column
Is there any way that I can do this properly? Basically my goal is to build a hash off of the values that are stored in the table columns, any input is appreciated.
Ruby programmers usually indent by 2 spaces.
Your code was squishing all of the emails into one hash entry, rather than one entry per email.
If you want to call a method dynamically, use Object#send.
@emailhash[email.ticketno] || Hash.new doesn't do anything.
Something like this might do it:
require 'active_support'
@emailhash = {}
Email.find(:all).each do |email|
  @emailhash[email.ticketno] = {}
  email.attributes.keys.each do |key|
    @emailhash[email.ticketno][key] = email.send(key)
  end
end
The key piece is "send", which calls the method identified by a string or symbol.
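Here is a runnable sketch of the same idea without a database. A Struct stands in for the Active Record model, and its members list stands in for attributes.keys; both are assumptions made for illustration. It uses public_send, which behaves like send but refuses to call private methods.

```ruby
# Stand-in for the Email model (hypothetical fields and data).
email_class = Struct.new(:ticketno, :subject, :sender)
emails = [
  email_class.new(101, 'Printer jam', 'ann@example.com'),
  email_class.new(102, 'VPN down', 'bob@example.com')
]

email_hash = {}
emails.each do |email|
  email_hash[email.ticketno] = {}
  email.members.each do |key|
    # public_send calls the method whose name is stored in key.
    email_hash[email.ticketno][key] = email.public_send(key)
  end
end

p email_hash[101][:subject]
p email_hash[102][:sender]
```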
You cannot have a block variable starting with an @. Try something like this:
require 'active_support'
@emailhash = Hash.new
emails = Email.find(:all)
emails.each do |email|
  @emailhash[email.ticketno] = email.attributes.keys.collect { |key| email.send(key) }
end
In addition to blinry's comment, the line
@emailhash[email.ticketno] || Hash.new
looks suspicious. Are you sure you don't want
@emailhash[email.ticketno] ||= Hash.new
?
Besides the previous accurate observations, I would like to add the following:
Point 1.
It is important to note that @ivars may not work as formal block parameters. I believe this is invalid:
collection.each { |@not_valid| }
It is also bad practice to use @ivars inside blocks, since you can't know for sure in which context the block will be executed (as you may know, self inside the block may differ from self outside it).
Point 2.
Another point to keep in mind: if you don't assign the result of the || operator, it modifies nothing (it just wastes time). However, you could use:
mail_hash[email.ticketno] = mail_hash[email.ticketno] || Hash.new
That can be easily rewritten to:
mail_hash[email.ticketno] ||= Hash.new
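The difference between the two operators is easy to check in a few lines. The key :a and the variable names are arbitrary choices for this sketch.

```ruby
mail_hash = {}

# || alone builds a new Hash and throws it away; nothing is stored.
mail_hash[:a] || Hash.new
stored_after_or = mail_hash.key?(:a)

# ||= stores the new Hash because the key was missing.
mail_hash[:a] ||= Hash.new
first = mail_hash[:a]

# A second ||= leaves the existing Hash in place.
mail_hash[:a] ||= Hash.new
same_object = mail_hash[:a].equal?(first)

p stored_after_or
p same_object
```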
Point 3.
Even if email.attributes.keys is a cheap call, it is not free. I would suggest moving it outside the iteration block (the keys will be the same for every record, assuming we are not using a document database).
Final Result
require 'active_support'
mails = Email.all
@mailshash = mails.inject(Hash.new) do |hsh, mail|
  # mail.attributes is already a representation of the
  # email record in a hash
  hsh[mail.ticketno] = mail.attributes
  hsh
end

Resources