I am trying to understand this code snippet:
while row = input.gets
row.strip!
next if row.empty?
valuesplit = row.split("---")
a, b = valuesplit[1..2]
unless a == b
$hash1[a] ||= {} <--------------What is this line doing? How is the whole loop
$hash1[a][b] = true being traversed?
if $hash1[b] && $hash1[b][a] <-----Can you please describe this if() loop
$hash2[a] ||= []
$hash2[a] << b
$hash2[b] ||= []
$hash2[b] << a
end
end
end
NOTE: $hash1 = {}
$hash2 = {}
Thanks!
UPDATE
Input:
junkdata1 value1 value2
junkdata2 value3 value4
junkdata3 value5 value6
and so on.
Updated the code lines with comments too.
# loops by reading in every line of the input
# (input can be a file or another I/O object)
# every line is stored successively in a variable
# called "row"
while row = input.gets
# removes leading and trailing whitespace from
# the string that is stored in the "row" variable
row.strip!
# if the string is empty, continue to the next
# line (go back to beginning of loop)
next if row.empty?
# split the string into an array of substrings
# based on the "---" delimiter
valuesplit = row.split("---")
# assign the second substring in the valuesplit
# array to a variable called a, and the third to
# a variable called b
a, b = valuesplit[1..2]
# if a and b are different
unless a == b
# initialize the hash1 dictionary's a entry
# to an empty sub-dictionary if it is null
$hash1[a] ||= {}
# in the hash1 dictionary, set a's entry
# to a dictionary that has b as the entry
# and true as the value
$hash1[a][b] = true
# if the value for the b entry in the hash1
# dictionary is true (not false or null) AND the value for a's
# entry of the dictionary found at the b
# entry of the hash1 dictionary is true
if $hash1[b] && $hash1[b][a]
# initialize the hash2 dictionary's a entry
# to an empty arraylist if it is null or false
$hash2[a] ||= []
# add b to this arraylist
$hash2[a] << b
# initialize the hash2 dictionary's b entry
# to an empty arraylist if it is null or false
$hash2[b] ||= []
# add a to this arraylist
$hash2[b] << a
end # end of the if $hash1[b]... statement
end # end of the unless a == b statement
end # end of the gets loop
I still feel the question is a little vague. You should also note that I have ignored your example data. Given your example data, the results of both $hash1 and $hash2 are empty hashes.
For your first question:
$hash1[a] ||= {}
The above is a combination of two things
First is an index into a hash which I'll assume you're familiar with.
The second is sort of a conditional assignment. As an example:
blah ||= 1
The above says: assign the value 1 to blah if blah is nil or false. If blah holds any other value, the assignment is not performed.
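To make the nil/false behaviour concrete, here is a minimal sketch. Note that with_default is a hypothetical helper name used only for illustration; it does not appear in the original code.

```ruby
# Hypothetical helper: returns value, or default when value is nil or false.
def with_default(value, default)
  value ||= default   # assignment happens only for nil or false
  value
end

p with_default(nil, 1)    # => 1
p with_default(false, 1)  # => 1
p with_default(0, 1)      # => 0, because 0 is truthy in Ruby

# The same idiom applied to a hash entry, as in the question's loop.
hash1 = {}
hash1["a"] ||= {}         # creates the inner hash on first use
hash1["a"]["b"] = true
hash1["a"] ||= {}         # no effect, the entry already exists
p hash1["a"]["b"]         # => true
```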
For the if statement we'll need some context:
if $hash1[b] && $hash1[b][a] #if the pair exists reversed
$hash2[a] ||= [] #initialize the array for the second
$hash2[a] << b #append the third to the second's array
$hash2[b] ||= [] #likewise for the reverse
$hash2[b] << a
end
If we assume that the initial values of $hash1 and $hash2 are {} as you note, and if we assume the input is a series of --- delimited values, then given the following data set:
foo---b---c
foo---c---a
foo---a---b
foo---b---a
foo---a---d
foo---d---a
The value of $hash1 would be:
{"a"=>{"b"=>true, "d"=>true}, "b"=>{"a"=>true, "c"=>true}, "c"=>{"a"=>true}, "d"=>{"a"=>true}}
Where $hash2 would be:
{"a"=>["b", "d"], "b"=>["a"], "d"=>["a"]}
Given this, I can make an educated guess that the block of code is generating a dictionary of relationships. In the above, $hash1 lists whether a given value refers to other values. A sort of a truth test. If you wanted to know if A referred to B you could just use:
$hash1['a']['b']
If the result is true then the answer is yes.
$hash2 is a sort of dictionary of two way relationships.
If you checked:
$hash2['a']
You would find an array of all the things to which A refers which also refer to A.
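If you want to try this yourself, the loop can be run against the sample rows above. This is only a sketch under a few assumptions: build_relations is a hypothetical wrapper name, it uses local hashes in place of the $hash1 and $hash2 globals, and a StringIO stands in for whatever input is in the original program.

```ruby
require 'stringio'

# Hypothetical wrapper around the loop from the question.
def build_relations(input)
  one_way = {}   # plays the role of $hash1
  two_way = {}   # plays the role of $hash2
  while (row = input.gets)
    row.strip!
    next if row.empty?
    a, b = row.split("---")[1..2]
    next if a == b
    one_way[a] ||= {}
    one_way[a][b] = true
    # The link is mutual only if the reverse pair was recorded earlier.
    if one_way[b] && one_way[b][a]
      (two_way[a] ||= []) << b
      (two_way[b] ||= []) << a
    end
  end
  [one_way, two_way]
end

data = StringIO.new(<<~ROWS)
  foo---b---c
  foo---c---a
  foo---a---b
  foo---b---a
  foo---a---d
  foo---d---a
ROWS

one_way, two_way = build_relations(data)
p one_way["a"]["b"]   # => true
p two_way["a"]        # => ["b", "d"]
```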
Cheers.
foo ||= bar
This is shorthand for
foo = foo || bar
It works like the compound assignment shorthands (such as +=) found in Java, although Java itself has no ||= operator. It basically means "set foo to the default of bar if foo is currently nil or false, otherwise leave it alone" (nil is treated as false in Ruby).
I have two hashes of the same length:
hash1 = {"1"=>"val", "2"=>"val", "3"=>"", "4"=>""}
hash2 = {"1"=>[""], "2"=>["value"], "3"=>["val1", "val2"], "4"=>[""]}
I need to compare them. The corresponding keys need to either both have a value (for hash1, this means non-blank, and for hash2, this means there must be a non-blank value in the array) or both have a blank value (for hash2, this means the value is [""]).
Key "1" fails (array has one value and that value is blank)
Key "2" passes (both have values)
Key "3" fails (hash1 is blank)
Key "4" passes (hash1 is blank and hash2 has one value in the array and that value is blank)
If one of these comparisons fails, then I should get false returned. I'm not sure how to do a comparison like this.
Assuming that hashes are already ordered:
hash1 = {"1"=>"val", "2"=>"val", "3"=>"", "4"=>""}
hash2 = {"1"=>[""], "2"=>["value"], "3"=>["val1", "val2"], "4"=>[""]}
hash1.zip(hash2).all? do |(_, fv), (_, lv)|
fv.empty? ^ !lv.all?(&:empty?)
end
Here we take advantage of XOR. If the hashes are not ordered the same way, they need to be sorted first.
According to @sawa’s and @CarySwoveland’s comments, for unsorted hashes:
hash1.sort.zip(hash2.sort).all? do |(fk, fv), (lk, lv)|
# ⇓ consistency ⇓ true if one operand is truthy, other is falsey
(fk == lk) && (fv.empty? ^ !lv.all?(&:empty?))
end
hash1.merge(hash2){|_, v1, v2| v2.dup.push(v1)}
.all?{|_, v| v.all?(&:empty?) or v.none?(&:empty?)}
Or following @mudasobwa's suggestion:
hash2.merge(hash1){|_, v2, *v1| v1 + v2}
.all?{|_, v| v.all?(&:empty?) or v.none?(&:empty?)}
Edit: better, I think:
hash1.all? { |k,v| !(v.empty? ^ (hash2[k]==[""])) }
#=> false
Original answer:
keys = hash1.keys
#=> ["1", "2", "3", "4"]
hash1.values_at(*keys).zip(hash2.values_at(*keys)).all? do |v1,v2|
!(v1.empty? ^ (v2==[""]))
end
#=> false
^ is Ruby's XOR operator.
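To see what the XOR test decides for each key, here is a small sketch using the hashes from the question. key_matches? is a hypothetical helper name, and it treats an array as blank when all of its strings are empty, which is my reading of the requirements rather than something the answers above spell out.

```ruby
hash1 = {"1"=>"val", "2"=>"val", "3"=>"", "4"=>""}
hash2 = {"1"=>[""], "2"=>["value"], "3"=>["val1", "val2"], "4"=>[""]}

# Hypothetical helper: true when both sides are blank or both are non-blank.
def key_matches?(string_value, array_value)
  string_blank = string_value.empty?
  array_blank  = array_value.all?(&:empty?)
  !(string_blank ^ array_blank)   # XOR is true when exactly one side is blank
end

hash1.each do |key, value|
  puts "#{key}: #{key_matches?(value, hash2[key]) ? 'passes' : 'fails'}"
end
# 1: fails
# 2: passes
# 3: fails
# 4: passes
```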
Another approach:
def compare_hashes(hash1, hash2)
(1..hash1.length).each do |n|
n = n.to_s # can easily swap this if you want to use integers or symbols
# a key fails when exactly one of the two sides is blank
return false if hash1[n].empty? != hash2[n].all?(&:empty?)
end
true
end
The following is more-or-less a direct translation of the requirements.(*)
# similar(h1,h2) assumes that h1 is a string-valued hash,
# and that h2 is a hash with values that are all arrays of strings.
#
def similar(h1,h2)
return false if h1.length != h2.length
h1.all? {|key, v1|
v2=h2[key]
v2 != nil and
((v1=="" and v2==[""]) or
(v1 != "" and !v2.all?{|x| x.length==0} ))
}
end
(*) The OP stipulated that the two hashes have the same number of keys, so the check that that's the case could perhaps be omitted, but there is no harm in checking, and the method as written is likely to be a tiny bit more useful (or at least more realistic) with the check included. Feel free to omit it.
Is there any way of grouping the strings in an array by their common leading letters?
For example:
array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]
so that when I do
array.group_by{ |string| some_logic_with_string }
The result should be,
{
'hello' => ['hello', 'hello you'],
'people' => ['people'],
'fin' => ['finally', 'finland']
}
NOTE: Some test cases are ambiguous and their expectations conflict with other tests; you need to fix them.
I guess a plain group_by will not work on its own; further processing is needed.
I have come up with the code below, which seems to handle all the given test cases consistently.
I have left notes in the code to explain the logic. The only way to fully understand it is to inspect the value of h and follow the flow for a simple test case.
def group_by_common_chars(array)
# We will iteratively group by as many times as there are characters
# in the longest possible key, which is the max length of all strings
max_len = array.max_by {|i| i.size}.size
# First group by first character.
h = array.group_by{|i| i[0]}
# Now iterate remaining (max_len - 1) times
(1...max_len).each do |c|
# Let's perform a group by next set of starting characters.
t = h.map do |k,v|
h1 = v.group_by {|i| i[0..c]}
end.reduce(&:merge)
# We need to merge the previously generated hash
# with the hash generated in this iteration. Here things get tricky.
# If previously, we had
# {"a" => ["a"], "ab" => ["ab", "abc"]},
# and now, we have
# {"a"=>["a"], "ab"=>["ab"], "abc"=>["abc"]},
# We need to merge the two hashes such that we have
# {"a"=>["a"], "ab"=>["ab", "abc"], "abc"=>["abc"]}.
# Note that `Hash#merge`'s block is called only for common keys, so, "abc"
# will get merged, we can't do much about it now. We will process
# it later in the loop
h = h.merge(t) do |k, o, n|
if (o.size != n.size)
diff = [o,n].max - [o,n].min
if diff.size == 1 && t.value?(diff)
[o,n].max
else
[o,n].min
end
else
o
end
end
end
# Sort by key length, smallest in the beginning.
h = h.sort {|i,j| i.first.size <=> j.first.size }.to_h
# Get rid of those key-value pairs, where value is single element array
# and that single element is already part of another key-value pair, and
# that value array has more than one element. This step will allow us
# to get rid of key-value like "abc"=>["abc"] in the example discussed
# above.
h = h.tap do |h|
keys = h.keys
keys.each do |k|
v = h[k]
if (v.size == 1 &&
h.key?(v.first) &&
h.values.flatten.count(v.first) > 1) then
h.delete(k)
end
end
end
# Get rid of those keys whose value array consists only of elements that
# are already part of some other key. Since the hash is ordered by key's string
# size, this process allows us to get rid of those keys which are smaller
# in length but consists of only elements that are present somewhere else
# with a key of larger length. For example, it lets us to get rid of
# "a"=>["aba", "abb", "aaa", "aab"] from a hash like
# {"a"=>["aba", "abb", "aaa", "aab"], "ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}
h.tap do |h|
keys = h.keys
keys.each do |k|
values = h[k]
other_values = h.values_at(*(h.keys-[k])).flatten
already_present = values.all? do |v|
other_values.include?(v)
end
h.delete(k) if already_present
end
end
end
Sample Run:
p group_by_common_chars ['hello', 'hello you', 'people', 'finally', 'finland']
#=> {"fin"=>["finally", "finland"], "hello"=>["hello", "hello you"], "people"=>["people"]}
p group_by_common_chars ['a', 'ab', 'abc']
#=> {"a"=>["a"], "ab"=>["ab", "abc"]}
p group_by_common_chars ['aba', 'abb', 'aaa', 'aab']
#=> {"ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}
p group_by_common_chars ["Why", "haven't", "you", "answered", "the", "above", "questions?", "Please", "do", "so."]
#=> {"a"=>["answered", "above"], "do"=>["do"], "Why"=>["Why"], "you"=>["you"], "so."=>["so."], "the"=>["the"], "Please"=>["Please"], "haven't"=>["haven't"], "questions?"=>["questions?"]}
Not sure if you can group by all common letters. But if you only want to group by the first letter, here it is:
array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]
result = {}
array.each { |st| result[st[0]] = result.fetch(st[0], []) + [st] }
pp result
{"h"=>["hello", "hello you"], "p"=>["people"], "f"=>["finally", "finland"]}
Now result contains your desired hash.
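The same first-letter grouping can also be written with Enumerable#group_by. A minimal sketch, where group_by_first_letter is just a hypothetical name for illustration:

```ruby
# Hypothetical helper: groups strings by their first character.
def group_by_first_letter(strings)
  strings.group_by { |s| s[0] }
end

groups = group_by_first_letter(['hello', 'hello you', 'people', 'finally', 'finland'])
p groups.keys   # => ["h", "p", "f"]
p groups["f"]   # => ["finally", "finland"]
```

Like the each version, this only looks at the first character, so it does not produce keys such as 'fin' or 'hello'.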
Hmm, you're trying to do something that's pretty custom. I can think of two classical approaches that sort of do what you want: 1) Stemming and 2) Levenshtein Distance.
With stemming you're finding the root word to a longer word. Here's a gem for it.
Levenshtein is a famous algorithm which calculates the difference between two strings. There is a gem for it that runs pretty fast due to a native C extension.
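For reference, the distance itself is short to write in plain Ruby. The sketch below is the textbook dynamic-programming version, not the gem's API; the method name levenshtein is my own choice and it will be much slower than a native extension.

```ruby
# Plain-Ruby sketch of Levenshtein distance (row-by-row dynamic programming).
def levenshtein(a, b)
  previous = (0..b.length).to_a            # distances from "" to prefixes of b
  a.each_char.with_index(1) do |char_a, i|
    current = [i]                          # distance from a[0, i] to ""
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,         # deletion
                  current[j - 1] + 1,      # insertion
                  previous[j - 1] + cost   # substitution (or match)
                 ].min
    end
    previous = current
  end
  previous.last
end

p levenshtein("kitten", "sitting")  # => 3
```

How to turn pairwise distances into groups (for example, which threshold to use) is a separate decision that depends on your data.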
I want to define a method which can take an optional number of arguments and hashes, like so
def foo(*b, **c)
2.times.map.with_index { |i|
new_hash, new_array = {}, b
c.map { |key, value| new_hash[key] = value[i] unless value[i].nil? }
new_array << new_hash if new_hash.length > 0
send(:bar, new_array)
}
end
def bar(*b)
p b
end
If I've understood the splat and double splat operators correctly (which I doubt), then this should send the array b to the bar method, adding new_hash from foo only if it contains something. However, something weird happens. I'll try to illustrate with some snippets below:
# invoking #foo
foo(a, key: 'value')
# first iteration of loop in #foo
# i is 0
# b is []
# c is { :key => ['value1'] }
# send(:bar, new_array) => send(:bar, [{:key => 'value1'}])
# bar yields: [{:key => 'value1'}]
Now, however, something happens
# second iteration of loop in #foo
# i is 1
# b is [:key => 'value1'] <---- why?
# c is { :key => ['value1'] }
Why has the value of b changed inside the loop of foo?
edit Updated the code to reflect a new array is created for each iteration
new_hash, new_array = {}, b
This doesn't create a copy of b. Now new_array and b point to the same object. Modifying one in-place will modify the other.
new_array << new_hash
That modifies new_array (and thus b) in place, so the new element remains on the next iteration. Use something like +, which creates a copy:
send(:bar, *(b + (new_hash.empty? ? [] : [new_hash])))
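The difference between sharing and copying is easy to check in isolation. A minimal sketch, where append_in_place and append_to_copy are hypothetical helper names and not part of the code in the question:

```ruby
# Hypothetical helpers contrasting in-place mutation with copying.
def append_in_place(list, item)
  same = list      # no copy: same and list are one object
  same << item     # mutates the caller's array as well
  same
end

def append_to_copy(list, item)
  list + [item]    # Array#+ builds a new array
end

b = []
append_to_copy(b, :x)
p b                              # => []
result = append_in_place(b, :y)
p b                              # => [:y]
p result.equal?(b)               # => true
```

Calling b.dup before << would also work if you prefer to keep the append style.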
I am currently learning Ruby and I'm trying to write a simple Ruby grocery_list method. Here are the instructions:
We want to write a program to help keep track of a grocery list. It takes a grocery item (like "eggs") as an argument, and returns the grocery list (that is, the item names with the quantities of each item). If you pass the same argument twice, it should increment the quantity.
def grocery_list(item)
array = []
quantity = 1
array.each {|x| quantity += x }
array << "#{quantity}" + " #{item}"
end
puts grocery_list("eggs", "eggs")
So I'm trying to figure out how to return "2 eggs" by passing eggs twice.
To help you count the different items you can use a Hash. A Hash is similar to an Array, but it can use Strings instead of Integers as an index:
a = Array.new
a[0] = "this"
a[1] = "that"
h = Hash.new
h["sonja"] = "asecret"
h["brad"] = "beer"
In this example the Hash might be used for storing passwords for users. But for your
example you need a hash for counting. Calling grocery_list("eggs", "beer", "milk", "eggs")
should lead to the following commands being executed:
h = Hash.new(0) # empty hash {} created, 0 will be default value
h["eggs"] += 1 # h is now {"eggs"=>1}
h["beer"] += 1 # {"eggs"=>1, "beer"=>1}
h["milk"] += 1 # {"eggs"=>1, "beer"=>1, "milk"=>1}
h["eggs"] += 1 # {"eggs"=>2, "beer"=>1, "milk"=>1}
You can work through all the keys and values of a Hash with the each-loop:
h.each{|key, value| .... }
and build up the string we need as a result, adding
the number of items if needed, and the name of the item.
Inside the loop we always add a comma and a blank at the end.
This is not needed for the last element, so after the
loop is done we are left with
"2 eggs, beer, milk, "
To get rid of the last comma and blank we can use chop!, which "chops off"
one character at the end of a string:
output.chop!.chop!
One more thing is needed to get the complete implementation of your grocery_list:
you specified that the function should be called like so:
puts grocery_list("eggs", "beer", "milk","eggs")
So the grocery_list function does not know how many arguments it's getting. We can handle
this by specifying one argument with a star in front, then this argument will
be an array containing all the arguments:
def grocery_list(*items)
# items is an array
end
So here it is: I did your homework for you and implemented grocery_list.
I hope you actually go to the trouble of understanding the implementation,
and don't just copy-and-paste it.
def grocery_list(*items)
hash = Hash.new(0)
items.each {|x| hash[x] += 1}
output = ""
hash.each do |item,number|
if number > 1 then
output += "#{number} "
end
output += "#{item}, "
end
output.chop!.chop!
return output
end
puts grocery_list("eggs", "beer", "milk","eggs")
# output: 2 eggs, beer, milk
def grocery_list(*item)
item.group_by{|i| i}
end
p grocery_list("eggs", "eggs","meat")
#=> {"eggs"=>["eggs", "eggs"], "meat"=>["meat"]}
def grocery_list(*item)
item.group_by{|i| i}.flat_map{|k,v| [k,v.length]}
end
p grocery_list("eggs", "eggs","meat")
#=>["eggs", 2, "meat", 1]
def grocery_list(*item)
Hash[*item.group_by{|i| i}.flat_map{|k,v| [k,v.length]}]
end
grocery_list("eggs", "eggs","meat")
#=> {"eggs"=>2, "meat"=>1}
grocery_list("eggs", "eggs","meat","apple","apple","apple")
#=> {"eggs"=>2, "meat"=>1, "apple"=>3}
or as @Lee said:
def grocery_list(*item)
item.each_with_object(Hash.new(0)) {|a, h| h[a] += 1 }
end
grocery_list("eggs", "eggs","meat","apple","apple","apple")
#=> {"eggs"=>2, "meat"=>1, "apple"=>3}
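On newer Rubies the counting step can be even shorter. This sketch assumes Ruby 2.7 or later, where Enumerable#tally is available; grocery_counts is a hypothetical name chosen so it does not clash with the methods above.

```ruby
# Assumes Ruby 2.7+ (Enumerable#tally). Returns a hash of item => count.
def grocery_counts(*items)
  items.tally
end

counts = grocery_counts("eggs", "eggs", "meat")
p counts["eggs"]  # => 2
p counts["meat"]  # => 1
```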
Use a Hash Instead of an Array
When you want an easy way to count things, you can use a hash key to hold the name of the thing you want to count, and the value of that key is the quantity. For example:
#!/usr/bin/env ruby
class GroceryList
attr_reader :list
def initialize
# Specify hash with default quantity of zero.
@list = Hash.new(0)
end
# Increment the quantity of each item in the @list, using the name of the item
# as a hash key.
def add_to_list(*items)
items.each { |item| @list[item] += 1 }
@list
end
end
if $0 == __FILE__
groceries = GroceryList.new
groceries.add_to_list('eggs', 'eggs')
puts 'Grocery list correctly contains 2 eggs.' if groceries.list['eggs'] == 2
end
Here's a more verbose, but perhaps more readable, solution to your challenge.
def grocery_list(*items) # Notice the asterisk in front of items. It means "put all the arguments into an array called items"
my_grocery_hash = {} # Creates an empty hash
items.each do |item| # Loops over the argument array and passes each argument into the loop as item.
if my_grocery_hash[item].nil? # Returns true if the item is not a present key in the hash...
my_grocery_hash[item] = 1 # Adds the key and sets the value to 1.
else
my_grocery_hash[item] = my_grocery_hash[item] + 1 # Increments the value by one.
end
end
my_grocery_hash # Returns a hash object with the grocery name as the key and the number of occurrences as the value.
end
This will create an empty hash (called dictionaries or maps in other languages) where each grocery is added as a key with the value set to one. In case the same grocery appears multiple times as a parameter to your method, the value is incremented.
If you want to build a text string and return that instead of the hash object, you can do it like this after the iteration:
grocery_list_string = "" # Creates an empty string
my_grocery_hash.each do |key, value| # Loops over the hash object and passes two local variables into the loop with the current entry. Key being the name of the grocery and value being the amount.
grocery_list_string << "#{value} units of #{key}\n" # Appends the grocery_list_string. Uses string interpolation, so #{value} becomes 3 and #{key} becomes eggs. The remaining \n is a newline character.
end
return grocery_list_string # Explicitly declares the return value. You can omit return.
Updated answer to comment:
If you use the first method without adding the hash iteration you will get a hash object back which can be used to look up the amount like this.
my_hash_with_grocery_count = grocery_list("Lemonade", "Milk", "Eggs", "Lemonade", "Lemonade")
my_hash_with_grocery_count["Milk"]
--> 1
my_hash_with_grocery_count["Lemonade"]
--> 3
Enumerable#each_with_object can be useful for things like this:
def list_to_hash(*items)
items.each_with_object(Hash.new(0)) { |item, list| list[item] += 1 }
end
def hash_to_grocery_list_string(hash)
hash.each_with_object([]) do |(item, number), result|
result << (number > 1 ? "#{number} #{item}" : item)
end.join(', ')
end
def grocery_list(*items)
hash_to_grocery_list_string(list_to_hash(*items))
end
p grocery_list('eggs', 'eggs', 'bread', 'milk', 'eggs')
# => "3 eggs, bread, milk"
It iterates an array or hash to enable building another object in a convenient way. The list_to_hash method uses it to build a hash from the items array (the splat operator converts the method arguments to an array); the hash is created so that each value is initialized to 0. The hash_to_grocery_list_string method uses it to build an array of strings that is joined to a comma-separated string.
I have an array, let's say
array1 = ["abc", "a", "wxyz", "ab",......]
How do I make sure neither for example "a" (any 1 character), "ab" (any 2 characters), "abc" (any 3 characters), nor words like "that", "this", "what" etc nor any of the foul words are saved in array1?
This removes elements with less than 4 characters and words like this, that, what from array1 (if I got it right):
array1.reject! do |el|
el.length < 4 || ['this', 'that', 'what'].include?(el)
end
This changes array1. If you use reject (without !), it'll return the result and not change array1.
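A quick sketch of the non-mutating variant, in case you want to keep the original array around. STOP_WORDS and filter_words are hypothetical names, and the word list is just the three examples from the question.

```ruby
# Hypothetical stop list, taken from the examples in the question.
STOP_WORDS = ['this', 'that', 'what'].freeze

# Returns a new array; the argument is left untouched.
def filter_words(words)
  words.reject { |w| w.length < 4 || STOP_WORDS.include?(w) }
end

original = ["abc", "a", "wxyz", "ab", "this", "hello"]
filtered = filter_words(original)
p filtered         # => ["wxyz", "hello"]
p original.length  # => 6
```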
You can open and add a new interface to the Array class which will disallow certain words. Example:
class Array
def add(ele)
unless rejects.include?(ele)
self.push ele
end
end
def rejects
['this', 'that', 'what']
end
end
arr = []
arr.add "one"
puts arr
arr.add "this"
puts arr
arr.add "aslam"
puts arr
Output would be:
one
one
one
aslam
And notice the word "this" was not added.
You could create a stop list. Using a hash for this would be more efficient than an array, as lookup time is roughly constant with a hash, whereas with an array the lookup time is proportional to the number of elements in the array. If you are going to check for stop words a lot, I suggest using a hash that contains all the stop words. Using your code, you could do the following:
badwords_a = ["abc", "a", "wxyz", "ab"] # Your array of bad words
badwords_h = {} # Initialize an empty hash
badwords_a.each{|word| badwords_h[word] = nil} # Fill the hash
goodwords = []
words_to_process = ["abc","a","Foo","Bar"] # a list of words you want to process
words_to_process.each do |word| # Process new words
# Add the word only if it did not match the bad list
goodwords << word unless badwords_h.key?(word)
end
puts goodwords.join(", ")
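If you only need membership tests, Ruby's standard library Set gives the same hash-backed lookup without the dummy nil values. A minimal sketch, with BAD_WORDS and good_words as hypothetical names:

```ruby
require 'set'

# Hypothetical stop list stored in a Set for hash-based lookup.
BAD_WORDS = Set.new(["abc", "a", "wxyz", "ab"]).freeze

# Returns only the words that are not in the stop list.
def good_words(words)
  words.reject { |word| BAD_WORDS.include?(word) }
end

p good_words(["abc", "a", "Foo", "Bar"])  # => ["Foo", "Bar"]
```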