How to use collect and include for multidimensional array - ruby

I have:
array1 = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
#student_ids = [1,2,3]
I want to replace elements in array1 that are included in #student_ids with 'X'. I want to see:
[['X','X','X',4,5],[7,8,9,10],[11,12,13,14]]
I have code that is intended to do this:
array1.collect! do |i|
if i.include?(#student_ids) #
i[i.index(#student_ids)] = 'X'; i # I want to replace all with X
else
i
end
end
If #student_ids is 1, then it works, but if #student_ids has more than one element such as 1,2,3, it raises errors. Any help?

It's faster to use a hash or a set than to repeatedly test [1,2,3].include?(n).
arr = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
ids = [1,2,3]
Use a hash
h = ids.product(["X"]).to_h
#=> {1=>"X", 2=>"X", 3=>"X"}
arr.map { |a| a.map { |n| h.fetch(n, n) } }
#=> [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
See Hash#fetch.
Use a set
require 'set'
ids = ids.to_set
#=> #<Set: {1, 2, 3}>
arr.map { |a| a.map { |n| ids.include?(n) ? "X" : n } }
#=> [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
Replace both maps with map! if the array is to be modified in place (mutated).

Try following, (taking #student_ids = [1, 2, 3])
array1.inject([]) { |m,a| m << a.map { |x| #student_ids.include?(x) ? 'X' : x } }
# => [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]

You can use each_with_index and replace the item you want:
array1 = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
#student_ids = [1,2,3]
array1.each_with_index do |sub_array, index|
sub_array.each_with_index do |item, index2|
array1[index][index2] = 'X' if #student_ids.include?(item)
end
end

You can do the following:
def remove_student_ids(arr)
arr.each_with_index do |value, index|
arr[index] = 'X' if #student_ids.include?(value) }
end
end
array1.map{ |sub_arr| remove_student_ids(sub_arr)}

Related

Ruby: How to find the most frequent substring of length n? [duplicate]

I have this program with a class DNA. The program counts the most frequent k-mer in a string. So, it is looking for the most common substring in a string with a length of k.
An example would be creating a dna1 object with a string of AACCAATCCG. The count k-mer method will look for a subtring with a length of k and output the most common answer. So, if we set k = 1 then 'A' and 'C' will be the most occurrence in the string because it appears four times. See example below:
dna1 = DNA.new('AACCAATCCG')
=> AACCAATCCG
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
Here is my DNA class :
class DNA
def initialize (nucleotide)
#nucleotide = nucleotide
end
def length
#nucleotide.length
end
protected
attr_reader :nucleotide
end
Here is my count kmer method that I am trying to implement:
# I have k as my only parameter because I want to pass the nucleotide string in the method
def count_kmer(k)
# I created an array as it seems like a good way to split up the nucleotide string.
counts = []
#this tries to count how many kmers of length k there are
num_kmers = self.nucleotide.length- k + 1
#this should try and look over the kmer start positions
for i in num_kmers
#Slice the string, so that way we can get the kmer
kmer = self.nucleotide.split('')
end
#add kmer if its not present
if !kmer = counts
counts[kmer] = 0
#increment the count for kmer
counts[kmer] +=1
end
#return the final count
return counts
end
#end dna class
end
I'm not sure where my method went wrong.
Something like this?
require 'set'
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
EDIT: Here's the full text of the class:
require 'set'
class DNA
def initialize (nucleotide)
#nucleotide = nucleotide
end
def length
#nucleotide.length
end
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
protected
attr_reader :nucleotide
end
This produces the following output, using Ruby 2.2.1, using the class and method you specified:
>> dna1 = DNA.new('AACCAATCCG')
=> #<DNA:0x007fe15205bc30 #nucleotide="AACCAATCCG">
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
As a bonus, you can also do:
>> dna1.kmers(2)
=> ["AA", "AC", "CC", "CA", "AA", "AT", "TC", "CC", "CG"]
Code
def most_frequent_substrings(str, k)
(0..str.size-k).each_with_object({}) do |i,h|
b = []
str[i..-1].scan(Regexp.new str[i,k]) { b << Regexp.last_match.begin(0) + i }
(h[b.size] ||= []) << b
end.max_by(&:first).last.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
end
Example
str = "ABBABABBABCATSABBABB"
most_frequent_substrings(str, 4)
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
This shows that the most frequently-occurring 4-character substring of strappears 3 times. There are two such substrings: "ABBA" and "BBAB". "ABBA" begins at offsets (into str) 0, 5 and 14, "BBAB" substrings begin at offsets 1, 6 and 15.
Explanation
For the example above the steps are as follows.
k = 4
n = str.size - k
#=> 20 - 4 => 16
e = (0..n).each_with_object([])
#<Enumerator: 0..16:each_with_object([])>
We can see the values that will be generated by this enumerator by converting it to an array.
e.to_a
#=> [[0, []], [1, []], [2, []], [3, []], [4, []], [5, []], [6, []], [7, []], [8, []],
# [9, []], [10, []], [11, []], [12, []], [13, []], [14, []], [15, []], [16, []]]
Note the empty array contained in each element will be modified as the array is built. Continuing, the first element of e is passed to the block and the block variables are assigned using parallel assignment:
i,a = e.next
#=> [0, []]
i #=> 0
a #=> []
We are now considering the substring of size 4 that begins at str offset i #=> 0, which is seen to be "ABBA". Now the block calculation is performed.
b = []
r = Regexp.new str[i,k]
#=> Regexp.new str[0,4]
#=> Regexp.new "ABBA"
#=> /ABAB/
str[i..-1].scan(r) { b << Regexp.last_match.begin(0) + i }
#=> "ABBABABBABCATSABBABB".scan(r) { b << Regexp.last_match.begin(0) + i }
b #=> [0, 5, 14]
We next have
(h[b.size] ||= []) << b
which becomes
(h[b.size] = h[b.size] || []) << b
#=> (h[3] = h[3] || []) << [0, 5, 14]
Since h has no key 3, h[3] on the right side equals nil. Continuing,
#=> (h[3] = nil || []) << [0, 5, 14]
#=> (h[3] = []) << [0, 5, 14]
h #=> { 3=>[[0, 5, 14]] }
Notice that we throw away scan's return value. All we need is b
This tells us the "ABBA" appears thrice in str, beginning at offsets 0, 5 and 14.
Now observe
e.to_a
#=> [[0, [[0, 5, 14]]], [1, [[0, 5, 14]]], [2, [[0, 5, 14]]],
# ...
# [16, [[0, 5, 14]]]]
After all elements of e have been passed to the block, the block returns
h #=> {3=>[[0, 5, 14], [1, 6, 15]],
# 1=>[[2], [3], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]],
# 2=>[[4, 16], [5, 14], [6, 15]]}
Consider substrings that appear just once: h[1]. One of those is [2]. This pertains to the 4-character substring beginning at str offset 2:
str[2,4]
#=> "BABA"
That is found to be the only instance of that substring. Similarly, among the substrings that appear twice is str[4,4] = str[16,4] #=> "BABB", given by h[2][0] #=> [4, 16].
Next we determine the greatest frequency of a substring of length 4:
c = h.max_by(&:first)
#=> [3, [[0, 5, 14], [1, 6, 15]]]
(which could also be written c = h.max_by { |k,_| k }).
d = c.last
#=> [[0, 5, 14], [1, 6, 15]]
For convenience, convert d to a hash:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
and return that hash from the method.
There is one detail that deserves mention. It is possible that d will contain two or more arrays that reference the same substring, in which case the value of the associated key (the substring) will equal the last of those arrays. Here's a simple example.
str = "AAA"
k = 2
In this case the array d above will equal
d = [[0], [1]]
Both of these reference str[0,2] #=> str[1,2] #=> "AA". In building the hash the first is overwritten by the second:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"AA"=>[1]}

How to write a method that counts the most common substring in a string in ruby?

I have this program with a class DNA. The program counts the most frequent k-mer in a string. So, it is looking for the most common substring in a string with a length of k.
An example would be creating a dna1 object with a string of AACCAATCCG. The count k-mer method will look for a subtring with a length of k and output the most common answer. So, if we set k = 1 then 'A' and 'C' will be the most occurrence in the string because it appears four times. See example below:
dna1 = DNA.new('AACCAATCCG')
=> AACCAATCCG
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
Here is my DNA class :
class DNA
def initialize (nucleotide)
#nucleotide = nucleotide
end
def length
#nucleotide.length
end
protected
attr_reader :nucleotide
end
Here is my count kmer method that I am trying to implement:
# I have k as my only parameter because I want to pass the nucleotide string in the method
def count_kmer(k)
# I created an array as it seems like a good way to split up the nucleotide string.
counts = []
#this tries to count how many kmers of length k there are
num_kmers = self.nucleotide.length- k + 1
#this should try and look over the kmer start positions
for i in num_kmers
#Slice the string, so that way we can get the kmer
kmer = self.nucleotide.split('')
end
#add kmer if its not present
if !kmer = counts
counts[kmer] = 0
#increment the count for kmer
counts[kmer] +=1
end
#return the final count
return counts
end
#end dna class
end
I'm not sure where my method went wrong.
Something like this?
require 'set'
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
EDIT: Here's the full text of the class:
require 'set'
class DNA
def initialize (nucleotide)
#nucleotide = nucleotide
end
def length
#nucleotide.length
end
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
protected
attr_reader :nucleotide
end
This produces the following output, using Ruby 2.2.1, using the class and method you specified:
>> dna1 = DNA.new('AACCAATCCG')
=> #<DNA:0x007fe15205bc30 #nucleotide="AACCAATCCG">
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
As a bonus, you can also do:
>> dna1.kmers(2)
=> ["AA", "AC", "CC", "CA", "AA", "AT", "TC", "CC", "CG"]
Code
def most_frequent_substrings(str, k)
(0..str.size-k).each_with_object({}) do |i,h|
b = []
str[i..-1].scan(Regexp.new str[i,k]) { b << Regexp.last_match.begin(0) + i }
(h[b.size] ||= []) << b
end.max_by(&:first).last.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
end
Example
str = "ABBABABBABCATSABBABB"
most_frequent_substrings(str, 4)
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
This shows that the most frequently-occurring 4-character substring of strappears 3 times. There are two such substrings: "ABBA" and "BBAB". "ABBA" begins at offsets (into str) 0, 5 and 14, "BBAB" substrings begin at offsets 1, 6 and 15.
Explanation
For the example above the steps are as follows.
k = 4
n = str.size - k
#=> 20 - 4 => 16
e = (0..n).each_with_object([])
#<Enumerator: 0..16:each_with_object([])>
We can see the values that will be generated by this enumerator by converting it to an array.
e.to_a
#=> [[0, []], [1, []], [2, []], [3, []], [4, []], [5, []], [6, []], [7, []], [8, []],
# [9, []], [10, []], [11, []], [12, []], [13, []], [14, []], [15, []], [16, []]]
Note the empty array contained in each element will be modified as the array is built. Continuing, the first element of e is passed to the block and the block variables are assigned using parallel assignment:
i,a = e.next
#=> [0, []]
i #=> 0
a #=> []
We are now considering the substring of size 4 that begins at str offset i #=> 0, which is seen to be "ABBA". Now the block calculation is performed.
b = []
r = Regexp.new str[i,k]
#=> Regexp.new str[0,4]
#=> Regexp.new "ABBA"
#=> /ABAB/
str[i..-1].scan(r) { b << Regexp.last_match.begin(0) + i }
#=> "ABBABABBABCATSABBABB".scan(r) { b << Regexp.last_match.begin(0) + i }
b #=> [0, 5, 14]
We next have
(h[b.size] ||= []) << b
which becomes
(h[b.size] = h[b.size] || []) << b
#=> (h[3] = h[3] || []) << [0, 5, 14]
Since h has no key 3, h[3] on the right side equals nil. Continuing,
#=> (h[3] = nil || []) << [0, 5, 14]
#=> (h[3] = []) << [0, 5, 14]
h #=> { 3=>[[0, 5, 14]] }
Notice that we throw away scan's return value. All we need is b
This tells us the "ABBA" appears thrice in str, beginning at offsets 0, 5 and 14.
Now observe
e.to_a
#=> [[0, [[0, 5, 14]]], [1, [[0, 5, 14]]], [2, [[0, 5, 14]]],
# ...
# [16, [[0, 5, 14]]]]
After all elements of e have been passed to the block, the block returns
h #=> {3=>[[0, 5, 14], [1, 6, 15]],
# 1=>[[2], [3], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]],
# 2=>[[4, 16], [5, 14], [6, 15]]}
Consider substrings that appear just once: h[1]. One of those is [2]. This pertains to the 4-character substring beginning at str offset 2:
str[2,4]
#=> "BABA"
That is found to be the only instance of that substring. Similarly, among the substrings that appear twice is str[4,4] = str[16,4] #=> "BABB", given by h[2][0] #=> [4, 16].
Next we determine the greatest frequency of a substring of length 4:
c = h.max_by(&:first)
#=> [3, [[0, 5, 14], [1, 6, 15]]]
(which could also be written c = h.max_by { |k,_| k }).
d = c.last
#=> [[0, 5, 14], [1, 6, 15]]
For convenience, convert d to a hash:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
and return that hash from the method.
There is one detail that deserves mention. It is possible that d will contain two or more arrays that reference the same substring, in which case the value of the associated key (the substring) will equal the last of those arrays. Here's a simple example.
str = "AAA"
k = 2
In this case the array d above will equal
d = [[0], [1]]
Both of these reference str[0,2] #=> str[1,2] #=> "AA". In building the hash the first is overwritten by the second:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"AA"=>[1]}

Ruby reducing a number array into start end range array

I have an array of numbers as below:
[11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
I would like to reduce this array to:
[[11,14], [19,21], [29,30], [33,33]]
Identify consequent numbers in an array and push only the start and end of its ranges.
How to achieve this?
Exactly some problem is solved to give an example for slice_before method in ruby docs:
a = [0, 2, 3, 4, 6, 7, 9]
prev = a[0]
p a.slice_before { |e|
prev, prev2 = e, prev
prev2 + 1 != e
}.map { |es|
es.length <= 2 ? es.join(",") : "#{es.first}-#{es.last}"
}.join(",")
In your case you need to tweak it a little:
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
prev = a[0]
p a.slice_before { |e|
prev, prev2 = e, prev
prev2 + 1 != e
}.map { |es|
[es.first, es.last]
}
Here's another way, using an enumerator with Enumerator#next and Enumerator#peek. It works for any collection that implements succ (aka next).
Code
def group_consecs(a)
enum = a.each
pairs = [[enum.next]]
loop do
if pairs.last.last.succ == enum.peek
pairs.last << enum.next
else
pairs << [enum.next]
end
end
pairs.map { |g| (g.size > 1) ? g : g*2 }
end
Note that Enumerator#peek raises a StopInteration exception if the enumerator enum is already at the end when enum.peek is invoked. That exception is handled by Kernel#loop, which breaks the loop.
Examples
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
group_consecs(a)
#=> [[11, 12, 13, 14], [19, 20, 21], [29, 30], [33, 33]]
a = ['a','b','c','f','g','i','l','m']
group_consecs(a)
#=> [["a", "b", "c"], ["f", "g"], ["i", "i"], ["l", "m"]]
a = ['aa','ab','ac','af','ag','ai','al','am']
group_consecs(a)
#=> [["aa", "ab", "ac"], ["af", "ag"], ["ai, ai"], ["al", "am"]]
a = [:a,:b,:c,:f,:g,:i,:l,:m]
group_consecs(a)
#=> [[:a, :b, :c], [:f, :g], [:i, :i], [:l, :m]]
Generate an array of seven date objects for an example, then group consecutive dates:
require 'date'
today = Date.today
a = 10.times.map { today = today.succ }.values_at(0,1,2,5,6,8,9)
#=> [#<Date: 2014-08-07 ((2456877j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-08 ((2456878j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-09 ((2456879j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-12 ((2456882j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-13 ((2456883j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-15 ((2456885j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-16 ((2456886j,0s,0n),+0s,2299161j)>]
group_consecs(a)
#=> [[#<Date: 2014-08-07 ((2456877j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-08 ((2456878j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-09 ((2456879j,0s,0n),+0s,2299161j)>
# ],
# [#<Date: 2014-08-12 ((2456882j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-13 ((2456883j,0s,0n),+0s,2299161j)>
# ],
# [#<Date: 2014-08-15 ((2456885j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-16 ((2456886j,0s,0n),+0s,2299161j)>
# ]]
This is some code I wrote for a project a while ago:
class Array
# [1,2,4,5,6,7,9,13].to_ranges # => [1..2, 4..7, 9..9, 13..13]
# [1,2,4,5,6,7,9,13].to_ranges(true) # => [1..2, 4..7, 9, 13]
def to_ranges(non_ranges_ok=false)
self.sort.each_with_index.chunk { |x, i| x - i }.map { |diff, pairs|
if (non_ranges_ok)
pairs.first[0] == pairs.last[0] ? pairs.first[0] : pairs.first[0] .. pairs.last[0]
else
pairs.first[0] .. pairs.last[0]
end
}
end
end
if ($0 == __FILE__)
require 'awesome_print'
ary = [1, 2, 4, 5, 6, 7, 9, 13, 12]
ary.to_ranges(false) # => [1..2, 4..7, 9..9, 12..13]
ary.to_ranges(true) # => [1..2, 4..7, 9, 12..13]
ary = [1, 2, 4, 8, 5, 6, 7, 3, 9, 11, 12, 10]
ary.to_ranges(false) # => [1..12]
ary.to_ranges(true) # => [1..12]
end
It's easy to change that to only return the start/end pairs:
class Array
def to_range_pairs(non_ranges_ok=false)
self.sort.each_with_index.chunk { |x, i| x - i }.map { |diff, pairs|
if (non_ranges_ok)
pairs.first[0] == pairs.last[0] ? [pairs.first[0]] : [pairs.first[0], pairs.last[0]]
else
[pairs.first[0], pairs.last[0]]
end
}
end
end
if ($0 == __FILE__)
require 'awesome_print'
ary = [1, 2, 4, 5, 6, 7, 9, 13, 12]
ary.to_range_pairs(false) # => [[1, 2], [4, 7], [9, 9], [12, 13]]
ary.to_range_pairs(true) # => [[1, 2], [4, 7], [9], [12, 13]]
ary = [1, 2, 4, 8, 5, 6, 7, 3, 9, 11, 12, 10]
ary.to_range_pairs(false) # => [[1, 12]]
ary.to_range_pairs(true) # => [[1, 12]]
end
Here's an elegant solution:
arr = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
output = []
# Sort array
arr.sort!
# Loop through each element in the list
arr.each do |element|
# Set defaults - for if there are no consecutive numbers in the list
start = element
endd = element
# Loop through consecutive numbers and check if they are inside the list
i = 1
while arr.include?(element+i) do
# Set element as endd
endd = element+i
# Remove element from list
arr.delete(element+i)
# Increment i
i += 1
end
# Push [start, endd] pair to output
output.push([start, endd])
end
[Edit: Ha! I misunderstood the question. In your example, for the array
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
you showed the desired array of pairs to be:
[[11,14], [19,21], [29,30], [33,33]]
which correspond to the following offsets in a:
[[0,3], [4,6], [7,8], [9,9]]
These pairs respective span the first 4 elements, the next 3 elements, then next 2 elements and the next element (by coincidence, evidently). I thought you wanted such pairs, each with a span one less than the previous, and the span of the first being as large as possible. If you have a quick look at my examples below, my assumption may be clearer. Looking back I don't know why I didn't understand the question correctly (I should have looked at the answers), but there you have it.
Despite my mistake, I'll leave this up as I found it an interesting problem, and had the opportunity to use the quadratic formula in the solution.
tidE]
This is how I would do it.
Code
def pull_pairs(a)
n = ((-1 + Math.sqrt(1.0 + 8*a.size))/2).to_i
cum = 0
n.downto(1).map do |i|
first = cum
cum += i
[a[first], a[cum-1]]
end
end
Examples
a = %w{a b c d e f g h i j k l}
#=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"]
pull_pairs(a)
#=> [["a", "d"], ["e", "g"], ["h", "i"], ["j", "j"]]
a = [*(1..25)]
#=> [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
# 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
pull_pairs(a)
#=> [[1, 6], [7, 11], [12, 15], [16, 18], [19, 20], [21, 21]]
a = [*(1..990)]
#=> [1, 2,..., 990]
pull_pairs(a)
#=> [[1, 44], [45, 87],..., [988, 989], [990, 990]]
Explanation
First, we'll compute the the number of pairs of values in the array we will produce. We are given an array (expressed algebraically):
a = [a0,a1,...a(m-1)]
where m = a.size.
Given n > 0, the array to be produced is:
[[a0,a(n-1)], [a(n),a(2n-2)],...,[a(t),a(t)]]
These elements span the first n+(n-1)+...+1 elements of a. As this is an arithmetic progession, the sum equals n(n+1)/2. Ergo,
t = n(n+1)/2 - 1
Now t <= m-1, so we maximize the number of pairs in the output array by choosing the largest n such that
n(n+1)/2 <= m
which is the float solution for n in the quadratic:
n^2+n-2m = 0
rounded down to an integer, which is
int((-1+sqrt(1^1+4(1)(2m))/2)
or
int((-1+sqrt(1+8m))/2)
Suppose
a = %w{a b c d e f g h i j k l}
Then m (=a.size) = 12, so:
n = int((-1+sqrt(97))/2) = 4
and the desired array would be:
[['a','d'],['e','g'],['h','i'],['j','j']]
Once n has been computed, constructing the array of pairs is straightforward.

Grouping consecutive numbers in an array

I need to add consecutive numbers to a new array and, if it is not a consecutive number, add only that value to a new array:
old_array = [1, 2, 3, 5, 7, 8, 9, 20, 21, 23, 29]
I want to get this result:
new_array = [
[1,2,3],
[5],
[7,8,9]
[20,21]
[23],
[29]
]
Is there an easier way to do this?
A little late to this party but:
old_array.slice_when { |prev, curr| curr != prev.next }.to_a
# => [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
This is the official answer given in RDoc (slightly modified):
actual = old_array.first
old_array.slice_before do
|e|
expected, actual = actual.next, e
expected != actual
end.to_a
A couple other ways:
old_array = [1, 2, 3, 5, 7, 8, 9, 20, 21, 23, 29]
#1
a, b = [], []
enum = old_array.each
loop do
b << enum.next
unless enum.peek.eql?(b.last.succ)
a << b
b = []
end
end
a << b if b.any?
a #=> [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
#2
def pull_range(arr)
b = arr.take_while.with_index { |e,i| e-i == arr.first }
[b, arr[b.size..-1]]
end
b, l = [], a
while l.any?
f, l = pull_range(l)
b << f
end
b #=> [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
Using chunk you could do:
old_array.chunk([old_array[0],old_array[0]]) do |item, block_data|
if item > block_data[1]+1
block_data[0] = item
end
block_data[1] = item
block_data[0]
end.map { |_, i| i }
# => [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
Some answers seem unnecessarily long, it is possible to do this in a very compact way:
arr = [1, 2, 3, 5, 7, 8, 9, 20, 21, 23, 29]
arr.inject([]) { |a,e| (a[-1] && e == a[-1][-1] + 1) ? a[-1] << e : a << [e]; a }
# [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
Alternatively, starting with the first element to get rid of the a[-1] condition (needed for the case when a[-1] would be nil because a is empty):
arr[1..-1].inject([[arr[0]]]) { |a,e| e == a[-1][-1] + 1 ? a[-1] << e : a << [e]; a }
# [[1, 2, 3], [5], [7, 8, 9], [20, 21], [23], [29]]
Enumerable#inject iterates all elements of the enumerable, building up a result value which starts with the given object. I give it an empty Array or an Array with the first value wrapped in an Array respectively in my solutions. Then I simply check if the next element of the input Array we are iterating is equal to the last value of the last Array in the resulting Array plus 1 (i.e, if it is the next consecutive element). If it is, I append it to the last list. Otherwise, I start a new list with that element in it and append it to the resulting Array.
You could also do it like this:
old_array=[1, 2, 3, 5, 7, 8, 9, 20, 21, 23, 29]
new_array=[]
tmp=[]
prev=nil
for i in old_array.each
if i != old_array[0]
if i - prev == 1
tmp << i
else
new_array << tmp
tmp=[i]
end
if i == old_array[-1]
new_array << tmp
break
end
prev=i
else
prev=i
tmp << i
end
end
Using a Hash you can do:
counter = 0
groups = {}
old_array.each_with_index do |e, i|
groups[counter] ||= []
groups[counter].push old_array[i]
counter += 1 unless old_array.include? e.next
end
new_array = groups.keys.map { |i| groups[i] }

How to quickly print Ruby hashes in a table format?

Is there a way to quickly print a ruby hash in a table format into a file?
Such as:
keyA keyB keyC ...
123 234 345
125 347
4456
...
where the values of the hash are arrays of different sizes. Or is using a double loop the only way?
Thanks
Try this gem I wrote (prints hashes, ruby objects, ActiveRecord objects in tables): http://github.com/arches/table_print
Here's a version of steenslag's that works when the arrays aren't the same size:
size = h.values.max_by { |a| a.length }.length
m = h.values.map { |a| a += [nil] * (size - a.length) }.transpose.insert(0, h.keys)
nil seems like a reasonable placeholder for missing values but you can, of course, use whatever makes sense.
For example:
>> h = {:a => [1, 2, 3], :b => [4, 5, 6, 7, 8], :c => [9]}
>> size = h.values.max_by { |a| a.length }.length
>> m = h.values.map { |a| a += [nil] * (size - a.length) }.transpose.insert(0, h.keys)
=> [[:a, :b, :c], [1, 4, 9], [2, 5, nil], [3, 6, nil], [nil, 7, nil], [nil, 8, nil]]
>> m.each { |r| puts r.map { |x| x.nil?? '' : x }.inspect }
[:a, :b, :c]
[ 1, 4, 9]
[ 2, 5, ""]
[ 3, 6, ""]
["", 7, ""]
["", 8, ""]
h = {:a => [1, 2, 3], :b => [4, 5, 6], :c => [7, 8, 9]}
p h.values.transpose.insert(0, h.keys)
# [[:a, :b, :c], [1, 4, 7], [2, 5, 8], [3, 6, 9]]
No, there's no built-in function. Here's a code that would format it as you want it:
data = { :keyA => [123, 125, 4456], :keyB => [234000], :keyC => [345, 347] }
length = data.values.max_by{ |v| v.length }.length
widths = {}
data.keys.each do |key|
widths[key] = 5 # minimum column width
# longest string len of values
val_len = data[key].max_by{ |v| v.to_s.length }.to_s.length
widths[key] = (val_len > widths[key]) ? val_len : widths[key]
# length of key
widths[key] = (key.to_s.length > widths[key]) ? key.to_s.length : widths[key]
end
result = ""
data.keys.each {|key| result += key.to_s.ljust(widths[key]) + " " }
result += "\n"
for i in 0.upto(length)
data.keys.each { |key| result += data[key][i].to_s.ljust(widths[key]) + " " }
result += "\n"
end
# TODO write result to file...
Any comments and edits to refine the answer are very welcome.

Resources