How to make dynamic multi-dimensional array in ruby? - ruby

I have a beginner ruby question about multi dimensional arrays.
I want to sort entries by year and month. So I want to create a multi-dimensional array that would contain years -> months -> entries of month
So the array would be like:
2009 ->
08
-> Entry 1
-> Entry 2
09
-> Entry 3
2007 ->
10
-> Entry 5
Now I have:
@years = []
@entries.each do |entry|
  timeobj = Time.parse(entry.created_at.to_s)
  year = timeobj.strftime("%Y").to_i
  month = timeobj.strftime("%m").to_i
  tmparr = []
  tmparr << {month=>entry}
  @years.push(year)
  @years << tmparr
end
but when I try to iterate through the years array, I get: "undefined method `each' for 2009:Fixnum"
Tried also:
@years = []
@entries.each do |entry|
  timeobj = Time.parse(entry.created_at.to_s)
  year = timeobj.strftime("%Y").to_i
  month = timeobj.strftime("%m").to_i
  @years[year][month] << entry
end

You are getting the error because a Fixnum (that is, an integer) is pushed onto the array in the line that reads @years.push(year); iterating over @years later calls each on that number.
Your approach of using Arrays to start with is a bit flawed; an array is perfect to hold an ordered list of items. In your case, you have a mapping from keys to values, which is perfect for a Hash.
In the first level, the keys are years, the values are hashes. The second level's hashes contain keys of months, and values of arrays of entries.
In this case, a typical output of your code would look something like (based on your example):
{ 2009 => { 8 => [Entry1, Entry2], 9 => [Entry3] }, 2007 => { 10 => [Entry5] }}
Note, however, that the years and months are not guaranteed to come out sorted: Ruby 1.8 hashes have no defined order, and Ruby 1.9+ hashes keep insertion order, which is chronological only if the entries arrive that way. The usual solution is to sort the keys whenever you access them. Here is code that generates such a structure (following the layout of your code, although it could be made much more Ruby-like):
@years = {}
@entries.each do |entry|
  timeobj = Time.parse(entry.created_at.to_s)
  year = timeobj.strftime("%Y").to_i
  month = timeobj.strftime("%m").to_i
  @years[year] ||= {} # Create a sub-hash unless it already exists
  @years[year][month] ||= []
  @years[year][month] << entry
end
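Sorting the keys at the point of use, as suggested above, might look like the following sketch. The Entry struct, its sample dates and the listing format are assumptions made up for illustration; real entries only need to respond to created_at.

```ruby
# Hypothetical stand-in for the question's entry objects.
Entry = Struct.new(:title, :created_at)

entries = [
  Entry.new("Entry 1", Time.new(2009, 8, 3)),
  Entry.new("Entry 2", Time.new(2009, 8, 17)),
  Entry.new("Entry 3", Time.new(2009, 9, 1)),
  Entry.new("Entry 5", Time.new(2007, 10, 12))
]

# Build the years -> months -> entries hash as in the answer.
years = {}
entries.each do |entry|
  year  = entry.created_at.year
  month = entry.created_at.month
  years[year] ||= {}
  years[year][month] ||= []
  years[year][month] << entry
end

# Sort the keys when reading: newest year first, months ascending.
listing = years.keys.sort.reverse.flat_map do |year|
  years[year].keys.sort.map do |month|
    titles = years[year][month].map(&:title)
    format("%d-%02d: %s", year, month, titles.join(", "))
  end
end

puts listing
```

The hash itself stays unsorted; only the traversal decides the order, so the same data can be listed oldest-first elsewhere without rebuilding it.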

You can get a nested structure in one line by combining group_by and map (the result is an array of [year, months_hash] pairs):
@entries.group_by {|entry| entry.created_at.year }.map { |year, entries| [year, entries.group_by {|entry| entry.created_at.month }] }
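To make the shape of that result concrete, here is a small sketch. The Post struct and its sample dates are hypothetical stand-ins for the real entries, and the trailing to_h is an optional extra step (not part of the one-liner) that turns the pairs into a hash keyed by year.

```ruby
# Hypothetical stand-in for the question's entry objects.
Post = Struct.new(:title, :created_at)

posts = [
  Post.new("Entry 1", Time.new(2009, 8, 3)),
  Post.new("Entry 2", Time.new(2009, 8, 17)),
  Post.new("Entry 3", Time.new(2009, 9, 1)),
  Post.new("Entry 5", Time.new(2007, 10, 12))
]

# group_by + map yields pairs such as [2009, { 8 => [...], 9 => [...] }].
pairs = posts.group_by { |post| post.created_at.year }
             .map { |year, group| [year, group.group_by { |post| post.created_at.month }] }

# Optional: convert the pairs into a hash for lookup by year.
by_year = pairs.to_h

p pairs.map(&:first)                 # => [2009, 2007]
p by_year[2009][8].map(&:title)      # => ["Entry 1", "Entry 2"]
```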

I'm using hash tables instead of arrays, because I think it probably makes more sense here. However, it's fairly trivial to change back to using arrays if that's what you prefer.
entries = [
[2009, 8, 1],
[2009, 8, 2],
[2009, 9, 3],
[2007, 10, 5]
]
years = Hash.new
entries.each { |e|
  year = e[0]
  month = e[1]
  entry = e[2]
  # Add to years hash
  years[year] ||= Hash.new
  years[year][month] ||= Array.new
  years[year][month] << entry
}
puts years.inspect
The output is: {2007=>{10=>[5]}, 2009=>{8=>[1, 2], 9=>[3]}}

# create a hash of hashes of arrays
@years = Hash.new do |h,k|
  h[k] = Hash.new do |sh, sk|
    sh[sk] = []
  end
end
@entries.each do |entry|
  timeobj = Time.parse(entry.created_at.to_s)
  year = timeobj.year
  month = timeobj.month
  @years[year][month] << entry
end

Related

How to count matching characters at the same index and at a non-matching index

I have built a version of mastermind that checks a user's input and provides feedback based on how close the user's guess was to the winning sequence. If you're not familiar with the game, you get feedback indicating how many of your characters were guessed correctly at the same index and how many characters guessed are in the sequence, but at the wrong index. If there are duplicates in the guess, then you would not count the extra values unless they correspond to the same number of duplicates in the secret code.
Example: If the sequence is ["G","G","G","Y"] and the user guesses ["G", "Y","G","G"] then you'd want to return 2 for items at the same index and 2 for items at different indexes that are included in the secret sequence.
Another example: If the sequence is ["X","R","Y","T"] and the user guesses ["T","T","Y","Y"] then you'd return 1 for items at the same index and 1 for the character guessed that is in the sequence but at the wrong index.
Anyway, to me this is not a simple problem to solve. Here's the code I used to get it to work, but it's not elegant. There must be a better way. I was hoping someone can tell me what I'm missing here?? New to Ruby...
def index_checker(input, sequence)
  count = 0
  leftover_input = []
  leftover_sequence = []
  input.each_with_index do |char, idx|
    if char == sequence[idx]
      count += 1
    else
      leftover_input << char
      leftover_sequence << sequence[idx]
    end
  end
  diff_index_checker(leftover_input, leftover_sequence, count)
end
def diff_index_checker(input, sequence, count)
  count2 = 0
  already_counted = []
  input.each do |char|
    if sequence.include?(char) && !already_counted.include?(char)
      count2 += 1
      already_counted << char
    end
  end
  [count, count2]
end
Here's a clean Ruby solution, written in idiomatic Ruby object-oriented style:
class Mastermind
  def initialize(input_array, sequence_array)
    @input_array = input_array
    @sequence_array = sequence_array
  end
  def matches
    [index_matches, other_matches]
  end
  def results
    [index_matches.size, other_matches.size]
  end
  private
  attr_reader :input_array, :sequence_array
  def index_matches
    input_array.select.with_index { |e, i| e == sequence_array[i] }
  end
  def other_matches
    non_exact_input & non_exact_sequence
  end
  def non_exact_input
    array_difference(input_array, index_matches)
  end
  def non_exact_sequence
    array_difference(sequence_array, index_matches)
  end
  # This method is based on https://stackoverflow.com/a/3852809/5961578
  def array_difference(array_1, array_2)
    counts = array_2.inject(Hash.new(0)) { |h, v| h[v] += 1; h }
    array_1.reject { |e| counts[e] -= 1 unless counts[e].zero? }
  end
end
You would use this class as follows:
>> input_array = ["G","G","G","Y"]
>> sequence_array = ["G", "Y","G","G"]
>> guess = Mastermind.new(input_array, sequence_array)
>> guess.results
#> [2, 2]
>> guess.matches
#> [["G", "G"], ["G", "Y"]]
Here's how it works. First everything goes into a class called Mastermind. We create a constructor for the class (which in Ruby is a method called initialize) and we have it accept two arguments: input array (the user guess), and sequence array (the answer).
We set each of these arguments to an instance variable, which is indicated by its beginning with @. Then we use attr_reader to create getter methods for @input_array and @sequence_array, which allows us to get the values by calling input_array and sequence_array from any instance method within the class.
We then define two public methods: matches (which returns an array of exact matches and an array of other matches, the ones that match but at the wrong index), and results (which returns the size of each of these two arrays).
Now, within the private portion of our class, we can define the guts of the logic. Each method has a specific job, and each is named to (hopefully) help a reader understand what it is doing.
index_matches returns a subset of the input_array whose elements match the sequence_array exactly.
other_matches returns a subset of the input_array whose elements do not match the sequence_array exactly, but do match at the wrong index.
other_matches relies on non_exact_input and non_exact_sequence, each of which is computed using the array_difference method, which I copied from another SO answer. (There is no convenient Ruby method that allows us to subtract one array from another without deleting duplicates).
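The difference between Ruby's built-in array subtraction and the duplicate-preserving kind can be seen in a short sketch. The name multiset_difference is a hypothetical helper written for this illustration; it follows the same counting idea as array_difference but is not the class's own method.

```ruby
# Hypothetical helper: removes one occurrence from `array` for each
# element of `to_remove`, leaving any further duplicates in place.
def multiset_difference(array, to_remove)
  counts = to_remove.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  array.reject do |e|
    next false if counts[e].zero? # nothing left to remove for this value
    counts[e] -= 1
    true
  end
end

guess = ["G", "G", "G", "Y"]
exact = ["G", "G"]

p guess - exact                      # => ["Y"]       Array#- drops every "G"
p multiset_difference(guess, exact)  # => ["G", "Y"]  only two "G"s are removed
```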
Code
def matches(hidden, guess)
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
guess_counts = counting_hash(guess.values_at *indices_wo_match)
[hidden.size - indices_wo_match.size, guess_counts.reduce(0) { |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min }]
end
def counting_hash(arr)
arr.each_with_object(Hash.new(0)) { |s, h| h[s] += 1 }
end
Examples
matches ["G","G","G","Y"], ["G", "Y","G","G"]
#=> [2, 2]
matches ["X","R","Y","T"] , ["T","T","Y","Y"]
#=> [1, 1]
Explanation
The steps are as follows.
hidden = ["G","G","G","Y"]
guess = ["G", "Y","G","G"]
Save the indices i for which hidden[i] != guess[i].
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
#=> [1, 3]
Note that the number of indices for which the values are equal is as follows.
hidden.size - indices_wo_match.size
#=> 2
Now compute the numbers of remaining elements of guess that pair with one of the remaining values of hidden by having the same value. Begin by counting the numbers of instances of each unique element of hidden and then do the same for guess.
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
#=> {"G"=>1, "Y"=>1}
guess_counts = counting_hash(guess.values_at *indices_wo_match)
#=> {"Y"=>1, "G"=>1}
To understand how counting_hash works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(3), then if h does not have a key k, h[k] returns the default value, here 3 (the hash is not changed).
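A quick sketch of that default-value behaviour, using throwaway variable names chosen for this example:

```ruby
h = Hash.new(3)
p h[:missing]       # => 3, the default value
p h.key?(:missing)  # => false, reading did not add the key
p h.size            # => 0

# With a default of 0, `counts[s] += 1` works even for unseen keys,
# because it expands to counts[s] = counts[s] + 1.
counts = Hash.new(0)
%w[G Y G].each { |s| counts[s] += 1 }
p counts["G"]       # => 2
p counts["Y"]       # => 1
```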
Now compute the numbers of matches of elements of guess that were not equal to the value of hidden at the same index and which pair with an element of hidden that have the same value.
val_matches = guess_counts.reduce(0) do |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min
end
#=> 2
Lastly, return the values of interest.
[hidden.size - indices_wo_match.size, val_matches]
#=> [2, 2]
In the code presented above I have substituted out the variable val_matches.
With Ruby 2.4+ one can use Enumerable#sum to replace
guess_counts.reduce(0) { |tot, (k, cnt)| tot + [hidden_counts[k], cnt].min }
with
guess_counts.sum { |k, cnt| [hidden_counts[k], cnt].min }
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
semi = secret.uniq.sum { |s| [secret.count(s), guess.count(s)].min } - full
[full, semi]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]
A shorter alternative, though I find it less clear:
full = secret.zip(guess).count(&:uniq!)
I prefer my other answer for its simplicity, but this one would be faster if someone wanted to use this for arrays larger than Mastermind's.
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
pool = secret.group_by(&:itself)
[full, guess.count { |g| pool[g]&.pop } - full]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]

Parse nested indented list with Ruby by using "slice_before"

I want to parse the formal list from https://www.loc.gov/marc/bibliographic/ecbdlist.html into a nested structure of hashes and arrays.
At first, I used a recursive approach - but ran into the problem that Ruby (and BTW also Python) can handle only less than 1000 recursive calls (stack level too deep).
I found "slice_before" and it seemed great:
require 'pp'
# read list into array and get rid of unnecessary lines
marc = File.readlines('marc21.txt', 'r:utf-8')[0].lines.map(&:chomp).select { |line| line if !line.match(/^\s*$/) && !line.match(/^--.+/) }
# magic starts here
marc = marc.slice_before { |line| line[/^ */].size == 0 }.to_a
marc = marc.inject({}) { |hash, arr| hash = hash.merge( arr[0] => arr[1..-1] ) }
I now want to iterate these steps throughout the array. As the indentation levels in the list vary ([0, 2, 3, 4, 5, 6, 8, 9, 10, 12] not all of them always present), I use a helper method get_indentation_map to use only the smallest amount of indentation in each iteration.
But adding only one level (far from the goal of turning the whole array into the new structure), I get the error "no implicit conversion of Regexp into Integer", the reason for which I fail to see:
def get_indentation_map( arr )
arr.map { |line| line[/^ */].size }
end
# starting again after slice_before of the unindented lines (== 0)
marc = marc.inject({}) do |hash, arr|
hash = hash.merge( arr[0] => arr[1..-1] ) # so far like above
# now trying to do the same on the next level
hash = hash.inject({}) do |h, a|
indentation_map = get_indentation_map( a ).uniq.sort
# only slice before smallest indentation
a = a.slice_before { |line| line[/^ */].size == indentation_map[0] }.to_a
h = h.merge( a[0] => a[1..-1] )
end
hash
end
I would be very grateful for hints how to best parse this list. I aim at a json-like structure in which every entry is the key for the further indented lines (if there are). Thanks in advance.
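One possible approach, sketched under assumptions: the error above appears to come from hash.inject yielding [key, value] pairs, so line[/^ */] ends up being called on the value Array rather than on a String. An explicit stack avoids both that and deep recursion. The method name parse_indented and the sample lines are made up for illustration and are not taken from the MARC list; the sketch also assumes indentation uses spaces only.

```ruby
require 'pp'

# Builds nested hashes from indented lines without recursion.
# Each line becomes a key whose value is a hash of its more deeply
# indented lines (an empty hash for leaves).
def parse_indented(lines)
  root = {}
  stack = [[-1, root]] # pairs of [indentation, children hash]
  lines.each do |line|
    next if line.strip.empty?
    indent = line[/\A */].size
    # Drop every level that cannot be a parent of this line.
    stack.pop while stack.last[0] >= indent
    children = {}
    stack.last[1][line.strip] = children
    stack.push([indent, children])
  end
  root
end

# Hypothetical sample with irregular indentation widths (0, 2, 5).
sample = [
  "field A",
  "  group A1",
  "     item A1a",
  "     item A1b",
  "  group A2",
  "field B"
]

tree = parse_indented(sample)
pp tree
```

Because only relative depth matters, the varying indentation widths need no special handling. Note that two identical lines under the same parent would collapse into one key; switching the children to arrays of [text, children] pairs would avoid that.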

Comparing values of one hash to many hashes to get inverse document frequency in ruby

I'm trying to find the inverse document frequency for a categorization algorithm and am having trouble getting it the way that my code is structured (with nested hashes), and generally comparing one hash to many hashes.
My training code looks like this so far:
def train!
  @data = {}
  @all_books.each do |category, books|
    @data[category] = {
      words: 0,
      books: 0,
      freq: Hash.new(0)
    }
    books.each do |filename, tokens|
      @data[category][:words] += tokens.count
      @data[category][:books] += 1
      tokens.each do |token|
        @data[category][:freq][token] += 1
      end
    end
    @data[category][:freq].map { |k, v| v = (v / @data[category][:freq].values.max) }
  end
end
Basically, I have a hash with 4 categories (subject to change), and for each have word count, book count, and a frequency hash which shows term frequency for the category. How do I get the frequency of individual words from one category compared against the frequency of the words shown in all categories? I know how to do the comparison for one set of hash keys against another, but am not sure how to loop through a nested hash to get the frequency of terms against all other terms, if that makes sense.
Edit to include predicted outcome -
I'd like to return a hash of nested hashes (one for each category) that shows the word as the key, and the number of other categories in which it appears as the value, i.e. {:category1 => {:word => 3, :other => 2, :third => 1}, :category2 => {:another => 1, ...}}. Alternatively, an array of category names as the value, instead of the number of categories, would also work.
I've tried creating a new hash as follows, but it's turning up empty:
def train!
  @data = {}
  @all_words = Hash.new([]) # new hash for all words, default value is empty array
  @all_books.each do |category, books|
    @data[category] = {
      words: 0,
      books: 0,
      freq: Hash.new(0)
    }
    books.each do |filename, tokens|
      @data[category][:words] += tokens.count
      @data[category][:books] += 1
      tokens.each do |token|
        @data[category][:freq][token] += 1
        @all_words[token] << category # should insert category name if the word appears, right?
      end
    end
    @data[category][:freq].map { |k, v| v = (v / @data[category][:freq].values.max) }
  end
end
If someone can help me figure out why the @all_words hash is empty when the code is run, I may be able to get the rest.
I haven't gone through it all, but you certainly have an error:
@all_words[token] << category # should insert category name if the word appears, right?
Nope. @all_words[token] returns the hash's default array but does not store it under token, as you're assuming. The << then mutates that one shared default array, so the statement never adds a key to the @all_words hash.
Try these 2 changes and see if it helps:
@all_words = {} # ditch the default value
...
(@all_words[token] ||= []) << category # lazy-init the array, and append
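The pitfall can be reproduced in isolation. The sketch below uses made-up words and category names; it also shows the block form of Hash.new as an alternative to the ||= fix (a swapped-in technique, not what the answer proposes), plus one way to derive the per-word category counts the question asks for.

```ruby
# Hash.new([]) hands out the same default array for every missing key
# and never stores it in the hash.
shared = Hash.new([])
shared[:word]  << :category1
shared[:other] << :category2
p shared.empty?      # => true, no key was ever added
p shared[:anything]  # => [:category1, :category2], the mutated default

# The block form creates and stores a fresh array per missing key.
per_key = Hash.new { |hash, key| hash[key] = [] }
per_key[:word]  << :category1
per_key[:word]  << :category2
per_key[:other] << :category2
p per_key.keys       # => [:word, :other]

# Number of distinct categories in which each word appears.
category_counts = per_key.map { |word, cats| [word, cats.uniq.size] }.to_h
p category_counts[:word]   # => 2
p category_counts[:other]  # => 1
```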

Find nearest lower number in string array (ruby)

New to ruby. I have an array created by nokogiri like this :
Array = ["10:31 Main title", ...]
It is a schedule in the format hour:minute title.
Now I have a time, say 10:35 and I want to find the entry in the array with the nearest lower number (time and title). It is like what is playing now?
How can I do this in ruby? I am at a blank here...
Thank you
You can achieve this using bsearch like below
a = [1, 4, 8, 11, 97]
a.bsearch {|x| x >= 7} # which results 8
You're going to have to walk the array and parse each entry. You'll also have to take into consideration whether the times are 12-hour or 24-hour, e.g. does "10:31 Main Title" mean 10:31 AM or PM (on a 12-hour clock)? If it's a 24-hour clock then 10:31 is 10:31 [am] and you'll also have 22:31 to reflect 10:31 [pm].
So you could walk the array, parsing each entry and then building a new structure which you can sort by. Ultimately you can get the lowest value and then just find the index of that entry in the original array.
require 'date'
a1 = ["10:31 The Beastmaster", "10:36 C.H.U.D.", "11:30 Goonies", "11:30 Krull", "11:59 Batteries Not Included"]
#=> ["10:31 The Beastmaster", "10:36 C.H.U.D.", "11:30 Goonies", "11:30 Krull", "11:59 Batteries Not Included"]
h1 = {}; a1.each {|x| m = x.match(/(\d{1,2}:\d{2})\s+(\w.*)/); h1[m[1]] ||= []; h1[m[1]] << m[2]}; h1 # => hash with times as keys and array of titles as corresponding values
#=> {"10:31"=>["The Beastmaster"], "10:36"=>["C.H.U.D."], "11:30"=>["Goonies", "Krull"], "11:59"=>["Batteries Not Included"]}
t1 = DateTime.rfc3339('2014-02-03T10:35:00-08:00').to_time.to_i
#=> 1391452500
within_an_hour = 60 * 60
#=> 3600
t2 = t1 + within_an_hour
#=> 1391456100
a2 = h1.keys.partition {|x| x > Time.at(t1).strftime("%I:%M")}[0] # => all upcoming times
#=> ["10:36", "11:30", "11:59"]
h2 = {}; a2.each {|x| h2[x] = h1[x]}; h2 # => all upcoming show times with corresponding titles
#=> {"10:36"=>["C.H.U.D."], "11:30"=>["Goonies", "Krull"], "11:59"=>["Batteries Not Included"]}
a3 = a2.partition {|x| x < Time.at(t2).strftime("%I:%M")}[0] # => upcoming times within an hour
#=> ["10:36", "11:30"]
h3 = {}; a3.each {|x| h3[x] = h1[x]}; h3 # => upcoming show times with corresponding titles within an hour
#=> {"10:36"=>["C.H.U.D."], "11:30"=>["Goonies", "Krull"]}
Using the above code in a method:
require 'date'
def what_is_playing_now(time, a1=["10:31 The Beastmaster", "10:36 C.H.U.D.", "11:30 Goonies", "11:30 Krull", "11:59 Batteries Not Included"])
h1 = {}; a1.each {|x| m = x.match(/(\d{1,2}:\d{2})\s+(\w.*)/); h1[m[1]] ||= []; h1[m[1]] << m[2]}; h1 # => hash with times as keys and array of titles as corresponding values
t1 = DateTime.rfc3339("2014-02-03T#{time}:00-08:00").to_time.to_i
a2 = h1.keys.partition {|x| x > Time.at(t1).strftime("%I:%M")}[0] # => all upcoming times
h2 = {}; a2.each {|x| h2[x] = h1[x]}; h2 # => all upcoming show times with corresponding titles
"#{a2.first} #{h2[a2.first].sort.first}"
end
what_is_playing_now("10:35")
#=> "10:36 C.H.U.D."
sources:
https://www.ruby-forum.com/topic/129755
http://ruby-doc.org/core-1.9.3/Enumerable.html#method-i-partition
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/date/rdoc/Date.html
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/date/rdoc/Time.html
Since your array contains strings starting with numbers, these can be nicely sorted naturally.
my_array.sort.reverse.find{ |i| i < "10:35" }
This will sort your array in ascending order, then reverse it, and finally return the first item for which the block returns true.
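If you are on Ruby 2.0 or later, you can also use Array#bsearch on the reversed array:
my_array.sort.reverse.bsearch{ |i| i < "10:35" }
This will sort your array in descending order and then use a binary search to find the desired item (Thanks @ala for pointing this out). The reverse matters: in this mode bsearch needs the block to be false for all leading elements and true for the rest.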
These simple lines of code expect the time to be in 24h format with leading zeros (i.e. hh:mm), since it depends on comparing the lines lexicographically.
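A variant that compares only the hh:mm prefix, so that a show starting exactly at the given time also counts as playing, could look like this sketch. The method name playing_at is hypothetical, the titles are borrowed from the other answer's sample data, and the same 24h, leading-zero assumption applies.

```ruby
# Returns the latest entry whose "hh:mm" prefix is not after `now`,
# or nil if nothing has started yet. Assumes 24h times with leading zeros.
def playing_at(schedule, now)
  schedule.select { |line| line[0, 5] <= now }
          .max_by { |line| line[0, 5] }
end

schedule = ["11:30 Goonies", "10:31 The Beastmaster", "10:36 C.H.U.D."]

p playing_at(schedule, "10:35")  # => "10:31 The Beastmaster"
p playing_at(schedule, "10:36")  # => "10:36 C.H.U.D."
p playing_at(schedule, "09:00")  # => nil
```

This is a linear scan, so no prior sorting is needed; for a schedule of a few dozen entries that is unlikely to matter.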

How do I process multiple variables in the same way?

It's highly probable this question has been asked, but I can't find the answer.
I have four variables:
a,b,c,d = [a,b,c,d].map{|myvar| myvar+1 }
How can I make this line more DRY (keeping it compact), i.e., achieve the same changes without repeating variable names?
Don't create separate variables, put the values in an Array or Hash from the beginning.
data = []
data << 1
data << 2
data << 3
data << 4
data = data.map { |value| value + 1 }
data.inspect # => [2, 3, 4, 5]
or
data = {}
data[:a] = 1
data[:b] = 2
data[:c] = 3
data[:d] = 4
data.each { |key, value| data[key] = value + 1}
data.inspect # => {:a=>2, :b=>3, :c=>4, :d=>5}
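On Ruby 2.4 or later, Hash#transform_values expresses the hash variant without mutating during iteration. The following is a minimal sketch with the same sample values; the final destructuring line is only there to show how individual names could be recovered if they are really needed.

```ruby
data = { a: 1, b: 2, c: 3, d: 4 }

# Returns a new hash with the block applied to every value.
incremented = data.transform_values { |value| value + 1 }

# Optional: pull the values back out into separate variables.
a, b, c, d = incremented.values_at(:a, :b, :c, :d)

p [a, b, c, d]  # => [2, 3, 4, 5]
```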
I have a growing suspicion that the short answer (for this specific example with integers) is "no way",
for the same reason as described in the answer to my previous question:
replacing referenced Integer value in Ruby like String#replace does
Update:
If the variables we operate on are Arrays, Hashes or Strings, and they keep the same datatype after the operation, it is DRYer, more compact and saves memory to use replace:
[a,b,c,d].each{|v| v.replace(v + [1])} #example for an array
