Find all repeating non-overlapping substrings and cycles - ruby

I have a complex problem of string manipulation at hand.
I have a string in which I will have cycles, as well as recurrences which I need to identify and list down.
'abcabcabcabcabcdkkabclilabcoabcdieabcdowabcdppabzabx'
Following are the possible patterns ->
(The indexes below are illustrative, not the actual ones.)
abc -> 0, 3, 6, 9, 12, 15, 17, ... (occurrence indexes of the recurring string); 0, 3, 6, 9 (unique occurrence indexes; 12, 15, 17 are disqualified because there abc is part of a longer repeating substring)
abcd -> 12, 15, 17 (occurrence indexes); 12, 15, 17 (unique occurrence indexes)
bcda -> 13, 16, 18, ... (occurrence indexes); no unique occurrence indexes, because it overlaps the string abcd, so it is not required
ab -> 0, 3, 6, 9, 12, 15, 17, 25, 27, ... (occurrence indexes); 25, 27 (unique occurrence indexes)
...
I want to find all unique recurrences, i.e. all unique, non-overlapping occurrences of each recurring string, as described above. The input string may contain:
All cyclic patterns (abcabcabcdefdefdeflkjlkjlkj => abc, def, lkj are recurrences in a cycle, but bc, ab, bcab are not expected, as they are false positives)
OR
Separately recurring patterns (abcxabcdabcm => abc is a recurrence but not a cycle, i.e. the occurrences are not adjacent)
OR
A mix of both (abcabcabcabcabclkabcdokabcdhuabcd => abc is a cyclic recurrence and abcd is a non-cyclic recurrence, and we need to find both; only abcd and abc are recurring, not bc, ab, bcda, etc.)
Can someone propose an algorithm for this problem? I am trying suffix arrays, but that approach does not find the overlapping results either.

A hash is constructed whose keys consist of all unique substrings of a given string that appear at least twice in the string (not overlapping) and, for each key, the value is an array of all offsets into the string where the value of the key (a substring) begins.
Code
def recurring_substrings(str)
  arr = str.chars
  (1..str.size/2).each_with_object({}) do |n,h|
    arr.each_cons(n).map { |b| b.join }.uniq.each do |s|
      str.scan(Regexp.new(s)) { (h[s] ||= []) << Regexp.last_match.begin(0) }
    end
  end.reject { |_,v| v.size == 1 }
end
Examples
recurring_substrings 'abjkabrjkab'
#=> {"a"=>[0, 4, 9], "b"=>[1, 5, 10], "j"=>[2, 7], "k"=>[3, 8], "ab"=>[0, 4, 9],
# "jk"=>[2, 7], "ka"=>[3, 8], "jka"=>[2, 7], "kab"=>[3, 8], "jkab"=>[2, 7]}
recurring_substrings "abcabcabcabcabcdkkabclilabcoabcdieabcdowabcdppabzabx"
#=> {"a"=>[0, 3, 6, 9, 12, 18, 24, 28, 34, 40, 46, 49],
# "b"=>[1, 4, 7, 10, 13, 19, 25, 29, 35, 41, 47, 50],
# "c"=>[2, 5, 8, 11, 14, 20, 26, 30, 36, 42], "d"=>[15, 31, 37, 43],
# "k"=>[16, 17], "l"=>[21, 23], "i"=>[22, 32], "o"=>[27, 38], "p"=>[44, 45],
# "ab"=>[0, 3, 6, 9, 12, 18, 24, 28, 34, 40, 46, 49],
# "bc"=>[1, 4, 7, 10, 13, 19, 25, 29, 35, 41], "ca"=>[2, 5, 8, 11],
# "cd"=>[14, 30, 36, 42],
# "abc"=>[0, 3, 6, 9, 12, 18, 24, 28, 34, 40], "bca"=>[1, 4, 7, 10],
# "cab"=>[2, 5, 8, 11], "bcd"=>[13, 29, 35, 41],
# "abca"=>[0, 6], "bcab"=>[1, 7], "cabc"=>[2, 8], "abcd"=>[12, 28, 34, 40],
# "abcab"=>[0, 6], "bcabc"=>[1, 7], "cabca"=>[2, 8],
# "abcabc"=>[0, 6], "bcabca"=>[1, 7], "cabcab"=>[2, 8]}
Explanation
For the first example above, the steps are as follows.
str = 'abjkabrjkab'
arr = str.chars
#=> ["a", "b", "j", "k", "a", "b", "r", "j", "k", "a", "b"]
q = str.size/2 # max size for string to repeat at least once
#=> 5
b = (1..q).each_with_object({})
#=> #<Enumerator: 1..5:each_with_object({})>
We can see which elements will be generated by this enumerator by converting it to an array. (I will do this a few more times below.)
b.to_a
#=> [[1, {}], [2, {}], [3, {}], [4, {}], [5, {}]]
The empty hashes will be built up as calculations progress.
Next pass the first element to the block and set the block variables to it using parallel assignment (sometimes called multiple assignment).
n,h = b.next
#=> [1, {}]
n #=> 1
h #=> {}
c = arr.each_cons(n)
#=> #<Enumerator: ["a", "b", "j", "k", "a", "b", "r", "j", "k", "a", "b"]:each_cons(1)>
c is an enumerator that generates all substrings of length 1 (each as an array of characters). At the next iteration it will generate all substrings of length 2, and so on. See Enumerable#each_cons.
c.to_a # Let's see which elements will be generated.
#=> [["a"], ["b"], ["j"], ["k"], ["a"], ["b"], ["r"], ["j"], ["k"], ["a"], ["b"]]
d = c.map { |b| b.join }
#=> ["a", "b", "j", "k", "a", "b", "r", "j", "k", "a", "b"]
e = d.uniq
#=> ["a", "b", "j", "k", "r"]
At the next iteration this will be
r = arr.each_cons(2)
#=> #<Enumerator: ["a", "b", "j", "k", "a", "b", "r", "j", "k", "a", "b"]:
# each_cons(2)>
r.to_a
#=> [["a", "b"], ["b", "j"], ["j", "k"], ["k", "a"], ["a", "b"],
# ["b", "r"], ["r", "j"], ["j", "k"], ["k", "a"], ["a", "b"]]
s = r.map { |b| b.join }
#=> ["ab", "bj", "jk", "ka", "ab", "br", "rj", "jk", "ka", "ab"]
s.uniq
#=> ["ab", "bj", "jk", "ka", "br", "rj"]
Continuing,
f = e.each
#=> #<Enumerator: ["a", "b", "j", "k", "r"]:each>
f.to_a # Let's see which elements will be generated.
#=> ["a", "b", "j", "k", "r"]
s = f.next
#=> "a"
r = (Regexp.new(s))
#=> /a/
str.scan(r) { (h[s] ||= []) << Regexp.last_match.begin(0) }
If h does not yet have a key s, h[s] #=> nil. h[s] ||= [], which expands to h[s] = h[s] || [], converts h[s] to an empty array before executing h[s] << Regexp.last_match.begin(0). That is, h[s] = h[s] || [] #=> nil || [] #=> [].
Within the block the MatchData object is retrieved with the class method Regexp::last_match. (Alternatively, one could substitute the global variable $~ for Regexp.last_match. For details, search for "special global variables" at Regexp.) MatchData#begin returns the index of str at which the current match begins.
Now
h #=> {"a"=>[0, 4, 9]}
The remaining calculations are similar, adding key-value pairs to h until the hash given in the example has been constructed.
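One caveat with building the pattern via Regexp.new(s): if the input string contains regexp metacharacters such as . or (, they are interpreted as patterns rather than as literal text. Below is a minimal sketch of a variant that escapes each substring first. The method name recurring_substrings_escaped is made up for this illustration and is not part of the answer above.

```ruby
# Hypothetical variant of recurring_substrings (name assumed for illustration):
# each substring is passed through Regexp.escape before becoming a pattern,
# so characters such as "." are matched literally.
def recurring_substrings_escaped(str)
  arr = str.chars
  (1..str.size / 2).each_with_object({}) do |n, h|
    arr.each_cons(n).map(&:join).uniq.each do |s|
      pattern = Regexp.new(Regexp.escape(s))
      str.scan(pattern) { (h[s] ||= []) << Regexp.last_match.begin(0) }
    end
  end.reject { |_, v| v.size == 1 }
end

p recurring_substrings_escaped('a.ba.b')
#=> {"a"=>[0, 3], "."=>[1, 4], "b"=>[2, 5], "a."=>[0, 3], ".b"=>[1, 4], "a.b"=>[0, 3]}
```

Without the escape, the substring "a." would also match "ab", "ax" and so on, which would add offsets that do not belong to it.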

For further processing after @CarySwoveland's excellent answer:
def ignore_smaller_substrings(hash)
  found_indices = []
  new_hash = {}
  hash.sort_by{|s,_| [-s.size,s]}.each{|s,indices|
    indices -= found_indices
    found_indices |= indices
    new_hash[s]=indices unless indices.empty?
  }
  new_hash
end
pp ignore_smaller_substrings(recurring_substrings('abcabcabcabcabcdkkabclilabcoabcdieabcdowabcdppabzabx'))
Hash is sorted by decreasing string length (and then alphabetically), and indices are only allowed to appear once.
It outputs
{"abcabc"=>[0, 6],
"bcabca"=>[1, 7],
"cabcab"=>[2, 8],
"abcd"=>[12, 28, 34, 40],
"abc"=>[3, 9, 18, 24],
"bca"=>[4, 10],
"bcd"=>[13, 29, 35, 41],
"cab"=>[5, 11],
"ab"=>[46, 49],
"bc"=>[19, 25],
"cd"=>[14, 30, 36, 42],
"b"=>[47, 50],
"c"=>[20, 26],
"d"=>[15, 31, 37, 43],
"i"=>[22, 32],
"k"=>[16, 17],
"l"=>[21, 23],
"o"=>[27, 38],
"p"=>[44, 45]}
It doesn't answer the question exactly, but it comes a bit closer.
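One reason it only comes closer is that the filter compares starting offsets only. Here is a small sketch with a hand-written input hash (the hash is an assumption chosen for illustration; the method is copied from above so that the snippet runs on its own).

```ruby
# Copy of ignore_smaller_substrings from the answer, repeated so the snippet
# is self-contained.
def ignore_smaller_substrings(hash)
  found_indices = []
  new_hash = {}
  hash.sort_by { |s, _| [-s.size, s] }.each do |s, indices|
    indices -= found_indices       # drop offsets already claimed by a longer string
    found_indices |= indices       # claim the remaining offsets
    new_hash[s] = indices unless indices.empty?
  end
  new_hash
end

# Illustrative input: "ab" starts at 0, 4, 9, so "a" shares those offsets,
# while "b" starts one position later.
sample = { "ab" => [0, 4, 9], "a" => [0, 4, 9], "b" => [1, 5, 10], "jk" => [2, 7] }
p ignore_smaller_substrings(sample)
#=> {"ab"=>[0, 4, 9], "jk"=>[2, 7], "b"=>[1, 5, 10]}
```

"a" is dropped because it starts where "ab" starts, but "b" survives even though every one of its occurrences lies inside an "ab". Removing those too would require comparing covered ranges rather than start offsets.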

Related

Ruby inbuilt method to get the position of letter in the alphabet series?

Input: str = "stackoverflow"
Output: [19 20 1 3 11 15 22 5 18 6 12 15 23]
Do we have any method to get the position of the letters in ruby?
So that I can use something like str.chars.map { |al| al.some_method }.
str.chars = ["s", "t", "a", "c", "k", "o", "v", "e", "r", "f", "l", "o", "w"]
You can do this. I'd use String#bytes, which returns the byte value (the ASCII code, for ASCII characters) of each character in the string.
'abcdggg'.bytes
# => [97, 98, 99, 100, 103, 103, 103]
As you can see, the alphabet is sequential: each letter is one higher than the previous one. You can get its position in the alphabet by subtracting 96 from the number.
Note that upper-case letters have different codes, but we can handle this using String#downcase.
To get all the alphabetical positions in a string (if it only has letters) we can write this method.
def alphabet_positions(string)
  string.downcase.bytes.map{|b| b - 96}
end
This will give unexpected results if any characters aren't letters, though.
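If the input may contain spaces, digits or punctuation, one option is to filter those out before converting. This is a minimal sketch; the method name letter_positions is an assumption for illustration, and it only treats the ASCII letters a to z as letters.

```ruby
# Hypothetical helper: returns alphabet positions for ASCII letters only,
# silently skipping every other character.
def letter_positions(string)
  string.downcase.each_char
        .select { |c| c.between?('a', 'z') }  # keep a..z only
        .map { |c| c.ord - 'a'.ord + 1 }      # 'a' => 1, 'b' => 2, ...
end

p letter_positions("Stack Overflow!")
#=> [19, 20, 1, 3, 11, 15, 22, 5, 18, 6, 12, 15, 23]
```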
You can build a hash with position of a letter in an alphabet and then query this hash:
indexes = ('a'..'z').each_with_index.map{|l,i| [l, i+1]}.to_h
"stackoverflow".chars.map{|l| indexes[l]}
# => [19, 20, 1, 3, 11, 15, 22, 5, 18, 6, 12, 15, 23]
You can do that :
def position(letter)
  letter.upcase.ord - 'A'.ord + 1
end
And then :
chars = ["s", "t", "a", "c", "k", "o", "v", "e", "r", "f", "l", "o", "w"]
chars.map do |char| position(char) end
=> [19, 20, 1, 3, 11, 15, 22, 5, 18, 6, 12, 15, 23]
See ord method for more information or this question
Below will give you the result you want.
str = "stackoverflow"
def conversion(str)
  arr = []
  str.upcase.gsub(/[A-Z]/){|m| arr << m.ord-64}
  return arr
end
It is better to use each_char than chars because the latter creates an array that is immediately thrown out.
str.each_char.map{|al| al.ord - ?a.ord + 1}
# => [19, 20, 1, 3, 11, 15, 22, 5, 18, 6, 12, 15, 23]

How does one create a loop with indefinite nested loops?

Say you have a list [1, 2, 3, ..., n].
If you needed to compare two elements, you would write something like:
list = (0..9999).to_a
idx = 0
while idx < list.length
  idx2 = idx
  while idx2 < list.length
    puts list[idx] + list[idx2] if (list[idx] + list[idx2]).odd?
    idx2 += 1
  end
  idx += 1
end
But what if the number of comparisons is not constant and increases?
This code hard codes the comparison by having one loop inside another, but if you needed to compare 4 or more elements how does one write a loop or something that achieves this if you don't know the maximum number of comparisons?
We have a helpful method in ruby to do this, and that is Array#combination:
def find_odd_sums(list, num_per_group)
  list.combination(num_per_group).to_a.map(&:sum).select(&:odd?)
end
You can re-implement combination, if you choose to. There are many versions of this function available at Algorithm to return all combinations of k elements from n
This question is not clear. Firstly, the title, which is vague, asks how a particular approach to an unstated problem can be implemented. What you need, at the beginning, is a statement in words of the problem.
I will make a guess as to what that statement might be and then propose a solution.
Given
an array arr;
a positive integer n, 1 <= n <= arr.size; and
a method m having n arguments that are distinct elements of arr that returns true or false,
what combinations of n elements of arr cause m to return true?
We can use the following method combined with a definition of the method m.
def combos(arr, n, m)
  arr.combination(n).select { |x| public_send(m, *x) }
end
The key, of course, is the method Array#combination. See also the docs for the methods Enumerable#select and Object#public_send.
Here is its use with the example given in the question.
def m(*x)
  x.sum.odd?
end
arr = [1,2,3,4,5,6]
combos(arr, 2, :m)
#=> [[1, 2], [1, 4], [1, 6], [2, 3], [2, 5], [3, 4], [3, 6], [4, 5], [5, 6]]
combos(arr, 3, :m)
#=> [[1, 2, 4], [1, 2, 6], [1, 3, 5], [1, 4, 6], [2, 3, 4], [2, 3, 6],
# [2, 4, 5], [2, 5, 6], [3, 4, 6], [4, 5, 6]]
combos(arr, 4, :m)
#=> [[1, 2, 3, 5], [1, 2, 4, 6], [1, 3, 4, 5], [1, 3, 5, 6], [2, 3, 4, 6], [2, 4, 5, 6]]
See the doc for Array#sum (which made its debut in Ruby v2.4).
Here's a second example: given an array of letters, which combinations of five letters have two vowels?
VOWEL_COUNTER = %w| a e i o u |.product([1]).to_h.tap { |h| h.default=0 }
#=> {"a"=>1, "e"=>1, "i"=>1, "o"=>1, "u"=>1}
VOWEL_COUNTER['a']
#=> 1
By setting the hash's default value to zero, VOWEL_COUNTER[k] will return zero if it does not have a key k. For example,
VOWEL_COUNTER['r']
#=> 0
def m(*x)
  x.sum { |c| VOWEL_COUNTER[c] } == 2
end
arr = %w| a r t u e v s |
combos(arr, 5, :m)
#=> [["a", "r", "t", "u", "v"], ["a", "r", "t", "u", "s"],
# ["a", "r", "t", "e", "v"], ["a", "r", "t", "e", "s"],
# ["a", "r", "u", "v", "s"], ["a", "r", "e", "v", "s"],
# ["a", "t", "u", "v", "s"], ["a", "t", "e", "v", "s"],
# ["r", "t", "u", "e", "v"], ["r", "t", "u", "e", "s"],
# ["r", "u", "e", "v", "s"], ["t", "u", "e", "v", "s"]]
Note that VOWEL_COUNTER is constructed as follows.
a = %w| a e i o u |
#=> ["a", "e", "i", "o", "u"]
b = a.product([1])
#=> [["a", 1], ["e", 1], ["i", 1], ["o", 1], ["u", 1]]
c = b.to_h
#=> {"a"=>1, "e"=>1, "i"=>1, "o"=>1, "u"=>1}
With this hash,
c['r']
#=> nil
so we need to set the default value to zero.
VOWEL_COUNTER = c.tap { |h| h.default=0 }
#=> {"a"=>1, "e"=>1, "i"=>1, "o"=>1, "u"=>1}
c['r']
#=> 0
Alternatively, we could have omitted the last step (setting the hash's default to zero), and written
x.sum { |c| VOWEL_COUNTER[c].to_i } == 2
because NilClass#to_i converts nil to zero.
See also the docs for the methods #select, #public_send
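As an alternative to passing a method name and dispatching with public_send, the predicate can be passed as a block. The sketch below is one possible shape, not the method from this answer; the name combos_where is an assumption for illustration.

```ruby
# Hypothetical block-based variant of combos: the block receives the n
# elements of each combination and decides whether to keep it.
def combos_where(arr, n)
  arr.combination(n).select { |x| yield(*x) }
end

arr = [1, 2, 3, 4, 5, 6]
p combos_where(arr, 2) { |*x| x.sum.odd? }
#=> [[1, 2], [1, 4], [1, 6], [2, 3], [2, 5], [3, 4], [3, 6], [4, 5], [5, 6]]
```

This keeps the predicate next to the call site and avoids defining a top-level method m for each condition.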
I feel like everyone is making this more complicated than it is. You sure got pointed in the right direction (Array#combination, Array#repeated_combination, Array#permutation, Array#repeated_permutation). To accomplish the exact thing you are doing, you can simply do:
list.repeated_combination(2) { |c| puts c.sum if c.sum.odd? }
Check the links above to see the difference between them.
If you want to create a helper method you can, but in my opinion it's not really needed in this case. Replace 2 with the number you are looking for and you got your answer.
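The while loops in the question start idx2 at idx, so each element is also paired with itself. That corresponds to repeated_combination rather than combination. A short sketch on a three-element list (chosen here only to keep the output readable) shows the difference:

```ruby
list = [1, 2, 3]

# combination never pairs an element with itself
pairs_without_self = list.combination(2).to_a
p pairs_without_self  #=> [[1, 2], [1, 3], [2, 3]]

# repeated_combination also yields [1, 1], [2, 2], [3, 3]
pairs_with_self = list.repeated_combination(2).to_a
p pairs_with_self     #=> [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 3]]

# Odd sums, as in the question's nested loops
odd_sums = list.repeated_combination(2).map(&:sum).select(&:odd?)
p odd_sums            #=> [3, 5]
```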

Strange Ruby 2+ Behavior with "select!"

I'm having an issue that I can't seem to find documented or explained anywhere so I'm hoping someone here can help me out. I've verified the unexpected behavior on three versions of Ruby, all 2.1+, and verified that it doesn't happen on an earlier version (though it's through tryruby.org and I don't know which version they're using). Anyway, for the question I'll just post some code with results and hopefully someone can help me debug it.
arr = %w( r a c e c a r ) #=> ["r","a","c","e","c","a","r"]
arr.select { |c| arr.count(c).odd? } #=> ["e"]
arr.select! { |c| arr.count(c).odd? } #=> ["e","r"] <<<<<<<<<<<<<<< ??????
I think the confusing part for me is clearly marked and if anyone can explain if this is a bug or if there's some logic to it, I'd greatly appreciate it. Thanks!
You're modifying the array while you read from it and iterate over it. I'm not sure the result is defined behavior. The algorithm isn't required to keep the object in any kind of sane state while it's running.
Some debug printing during the iteration shows why your particular result happens:
irb(main):005:0> x
=> ["r", "a", "c", "e", "c", "a", "r"]
irb(main):006:0> x.select! { |c| p x; x.count(c).odd? }
["r", "a", "c", "e", "c", "a", "r"]
["r", "a", "c", "e", "c", "a", "r"]
["r", "a", "c", "e", "c", "a", "r"]
["r", "a", "c", "e", "c", "a", "r"] # "e" is kept...
["e", "a", "c", "e", "c", "a", "r"] # ... and moved to the start of the array
["e", "a", "c", "e", "c", "a", "r"]
["e", "a", "c", "e", "c", "a", "r"] # now "r" is kept
=> ["e", "r"]
You can see by the final iteration, there is only one r, and that the e has been moved to the front of the array. Presumably the algorithm modifies the array in-place, moving matched elements to the front, overwriting elements that have already failed your test. It keeps track of how many elements are matched and moved, and then truncates the array down to that many elements.
So, instead, use select.
A longer example that matches more elements makes the problem a little clearer:
irb(main):001:0> nums = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
irb(main):002:0> nums.select! { |i| p nums; i.even? }
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 6, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 6, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 6, 8, 5, 6, 7, 8, 9, 10]
[2, 4, 6, 8, 5, 6, 7, 8, 9, 10]
=> [2, 4, 6, 8, 10]
You can see that it does indeed move matched elements to the front of the array, overwriting non-matched elements, and then truncate the array.
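The behaviour inferred above (move kept elements to the front, count them, then truncate) can be modelled in plain Ruby. This is only a sketch of the presumed algorithm, not the interpreter's actual implementation, and the name compacting_select! is made up for illustration.

```ruby
# Hypothetical model of an in-place select: kept elements are written to the
# front of the same array while iteration continues, then the tail is cut off.
def compacting_select!(ary)
  kept = 0
  ary.each_index do |i|
    v = ary[i]
    if yield(v)
      ary[kept] = v   # overwrite an earlier slot with the kept element
      kept += 1
    end
  end
  ary.slice!(kept..-1)  # truncate to the number of kept elements
  ary
end

x = %w( r a c e c a r )
p compacting_select!(x) { |c| x.count(c).odd? }
#=> ["e", "r"]
```

By the time the final "r" is tested, the first "r" has been overwritten by "e", so x.count("r") is 1 and the element is kept, which reproduces the surprising result.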
Just to give you some other ways of accomplishing what you're doing:
arr = %w( r a c e c a r )
arr.group_by{ |c| arr.count(c).odd? }
# => {false=>["r", "a", "c", "c", "a", "r"], true=>["e"]}
arr.group_by{ |c| arr.count(c).odd? }.values
# => [["r", "a", "c", "c", "a", "r"], ["e"]]
arr.partition{ |c| arr.count(c).odd? }
# => [["e"], ["r", "a", "c", "c", "a", "r"]]
And if you want more readable keys:
arr.group_by{ |c| arr.count(c).odd? ? :odd : :even }
# => {:even=>["r", "a", "c", "c", "a", "r"], :odd=>["e"]}
partition and group_by are basic building blocks for separating elements in an array into some sort of grouping, so it is good to be familiar with them.

Finding similar objects located in same index position of arrays in Ruby

I have the following hash:
hash = {"1"=>[ 5, 13, "B", 4, 10],
"2"=>[27, 19, "B", 18, 20],
"3"=>[45, 41, "B", 44, 31],
"4"=>[48, 51, "B", 58, 52],
"5"=>[70, 69, "B", 74, 73]}
Here is my code:
if hash.values.all? { |array| array[0] == "B" } ||
   hash.values.all? { |array| array[1] == "B" } ||
   hash.values.all? { |array| array[2] == "B" } ||
   hash.values.all? { |array| array[3] == "B" } ||
   hash.values.all? { |array| array[4] == "B" }
  puts "Hello World"
end
My code iterates through the arrays so that, if the same element appears at the same index position of each array, it outputs the string "Hello World" (since "B" is in the [2] position of each array, it will puts the string). Is there a way to condense my current code without having a bunch of or's connecting each index of the array?
Assuming all arrays are always of the same length, the following gives you the column indexes where all values are equal:
hash.values.transpose.each_with_index.map do |column, index|
  index if column.all? {|x| x == column[0] }
end.compact
The result is [2] for your hash. So you know that for all arrays the index 2 has the same values.
You can print "Hello World" if the resulting array has at least one element.
How does it work?
hash.values.transpose gives you all the arrays, but with transposed (all rows are now columns) values:
hash.values.transpose
=> [[5, 27, 45, 48, 70],
[13, 19, 41, 51, 69],
["B", "B", "B", "B", "B"],
[4, 18, 44, 58, 74],
[10, 20, 31, 52, 73]]
.each_with_index.map goes over every row of the transposed array while providing an inner array and its index.
We look at every inner array, yielding the column index only if all elements are equal using all?.
hash.values.transpose.each_with_index.map {|column, index| index if column.all? {|x| x == column[0] } }
=> [nil, nil, 2, nil, nil]
Finally, we compact the result to get rid of the nil values.
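The same steps can be wrapped in a small method. This is a sketch under the same assumption that all arrays have equal length; the name uniform_column_indexes is made up for illustration, and it uses uniq.size == 1 as an alternative way to test that a column holds a single value.

```ruby
# Hypothetical helper: indexes of the columns whose values are all equal.
def uniform_column_indexes(hash)
  columns = hash.values.transpose
  columns.each_index.select { |i| columns[i].uniq.size == 1 }
end

hash = {"1"=>[ 5, 13, "B",  4, 10],
        "2"=>[27, 19, "B", 18, 20],
        "3"=>[45, 41, "B", 44, 31],
        "4"=>[48, 51, "B", 58, 52],
        "5"=>[70, 69, "B", 74, 73]}

indexes = uniform_column_indexes(hash)
p indexes  #=> [2]
puts "Hello World" unless indexes.empty?
```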
Edit: First, I used reduce to find the column with identical elements. @Nimir pointed out that I had re-implemented all?, so I edited my answer to use all?.
From @tessi's brilliant answer I thought of this way:
hash.values.transpose.each_with_index do |column, index|
  puts "Index:#{index} Repeated value:#{column.first}" if column.all? {|x| x == column[0]}
end
#> Index:2 Repeated value:B
How?
Well, the transpose already solves the problem:
hash.values.transpose
=> [[5, 27, 45, 48, 70],
[13, 19, 41, 51, 69],
["B", "B", "B", "B", "B"],
[4, 18, 44, 58, 74],
[10, 20, 31, 52, 73]
]
We can do:
column.all? {|x| x == column[0]}
To find column with identical items
Assuming that all the values of the hash will be arrays of the same size, how about something like:
hash
=> {"1"=>[5, 13, "B", 4, 10], "2"=>[27, 19, "B", 18, 20], "3"=>[45, 41, "B", 44, 31], "4"=>[48, 51, "B", 58, 52], "5"=>[70, 69, "B", 74, 73]}
arr_of_arrs = hash.values
=> [[5, 13, "B", 4, 10], [27, 19, "B", 18, 20], [45, 41, "B", 44, 31], [48, 51, "B", 58, 52], [70, 69, "B", 74, 73]]
first_array = arr_of_arrs.shift
=> [5, 13, "B", 4, 10]
first_array.each_with_index.any? do |element, index|
  arr_of_arrs.map {|arr| arr[index] == element }.all?
end
=> true
This is not really different from what you have now, as far as performance - in fact, it may be a bit slower. However, it allows for a dynamic number of incoming key/value pairs.
I ended up using the following:
fivebs = ["B","B","B","B","B"]
if hash.values.transpose.any? {|array| array == fivebs}
  puts "Hello World"
end
If efficiency, rather than readability, is most important, I expect this decidedly un-Ruby-like and uninteresting solution probably would do well:
arr = hash.values
arr.first.size.times.any? { |i| arr.all? { |e| e[i] == ?B } }
#=> true
Only one intermediate array (arr) is constructed (e.g, no transposed array), and it quits if and when a match is found.
More Ruby-like is the solution I mentioned in a comment on your question:
hash.values.transpose.any? { |arr| arr.all? { |e| e == ?B } }
As you asked for an explanation of @Phrogz's solution to the earlier question, which is similar to this one, let me explain the above line of code, by stepping through it:
a = hash.values
#=> [[ 5, 13, "B", 4, 10],
# [27, 19, "B", 18, 20],
# [45, 41, "B", 44, 31],
# [48, 51, "B", 58, 52],
# [70, 69, "B", 74, 73]]
b = a.transpose
#=> [[ 5, 27, 45, 48, 70],
# [ 13, 19, 41, 51, 69],
# ["B", "B", "B", "B", "B"],
# [ 4, 18, 44, 58, 74],
# [ 10, 20, 31, 52, 73]]
In the last step:
b.any? { |arr| arr.all? { |e| e == ?B } }
#=> true
(where ?B is shorthand for the one-character string "B") an enumerator is created:
c = b.to_enum(:any?)
#=> #<Enumerator: [[ 5, 27, 45, 48, 70],
# [ 13, 19, 41, 51, 69],
# ["B", "B", "B", "B", "B"],
# [ 4, 18, 44, 58, 74],
# [ 10, 20, 31, 52, 73]]:any?>
When the enumerator (any enumerator) is acting on an array, the elements of the enumerator are passed into the block (and assigned to the block variable, here arr) by Array#each. The first element passed into the block is:
arr = [5, 27, 45, 48, 70]
and the following is executed:
arr.all? { |e| e == ?B }
#=> [5, 27, 45, 48, 70].all? { |e| e == ?B }
#=> false
Notice that false is returned to each right after:
5 == ?B
#=> false
is evaluated. Since false is returned, we move on to the second element of the enumerator:
[13, 19, 41, 51, 69].all? { |e| e == ?B }
#=> false
so we continue. But
["B", "B", "B", "B", "B"].all? { |e| e == ?B }
#=> true
so when true is returned to each, the latter returns true and we are finished.

Ruby reducing a number array into start end range array

I have an array of numbers as below:
[11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
I would like to reduce this array to:
[[11,14], [19,21], [29,30], [33,33]]
Identify consecutive numbers in an array and push only the start and end of each range.
How to achieve this?
Exactly the same problem is solved as an example for the slice_before method in the Ruby docs:
a = [0, 2, 3, 4, 6, 7, 9]
prev = a[0]
p a.slice_before { |e|
  prev, prev2 = e, prev
  prev2 + 1 != e
}.map { |es|
  es.length <= 2 ? es.join(",") : "#{es.first}-#{es.last}"
}.join(",")
In your case you need to tweak it a little:
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
prev = a[0]
p a.slice_before { |e|
  prev, prev2 = e, prev
  prev2 + 1 != e
}.map { |es|
  [es.first, es.last]
}
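On Ruby 2.2 or later, Enumerable#slice_when passes each adjacent pair to the block, so the external prev variable is not needed. This is a different method from the slice_before used above, shown as a minimal sketch:

```ruby
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]

# Start a new slice whenever the next element is not the successor of the
# current one, then keep only the first and last element of each slice.
ranges = a.slice_when { |x, y| y != x + 1 }.map { |es| [es.first, es.last] }
p ranges
#=> [[11, 14], [19, 21], [29, 30], [33, 33]]
```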
Here's another way, using an enumerator with Enumerator#next and Enumerator#peek. It works for any collection that implements succ (aka next).
Code
def group_consecs(a)
  enum = a.each
  pairs = [[enum.next]]
  loop do
    if pairs.last.last.succ == enum.peek
      pairs.last << enum.next
    else
      pairs << [enum.next]
    end
  end
  pairs.map { |g| (g.size > 1) ? g : g*2 }
end
Note that Enumerator#peek raises a StopIteration exception if the enumerator enum is already at the end when enum.peek is invoked. That exception is handled by Kernel#loop, which breaks the loop.
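The interaction between peek and loop can be seen in isolation with a two-element array (the variable names here are chosen only for illustration):

```ruby
enum = [1, 2].each
seen = []

loop do
  seen << enum.next
  enum.peek  # raises StopIteration once nothing is left; loop rescues it
end

p seen  #=> [1, 2]
```

No explicit rescue or break is needed, because Kernel#loop treats StopIteration as the signal to finish.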
Examples
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
group_consecs(a)
#=> [[11, 12, 13, 14], [19, 20, 21], [29, 30], [33, 33]]
a = ['a','b','c','f','g','i','l','m']
group_consecs(a)
#=> [["a", "b", "c"], ["f", "g"], ["i", "i"], ["l", "m"]]
a = ['aa','ab','ac','af','ag','ai','al','am']
group_consecs(a)
#=> [["aa", "ab", "ac"], ["af", "ag"], ["ai", "ai"], ["al", "am"]]
a = [:a,:b,:c,:f,:g,:i,:l,:m]
group_consecs(a)
#=> [[:a, :b, :c], [:f, :g], [:i, :i], [:l, :m]]
Generate an array of seven date objects for an example, then group consecutive dates:
require 'date'
today = Date.today
a = 10.times.map { today = today.succ }.values_at(0,1,2,5,6,8,9)
#=> [#<Date: 2014-08-07 ((2456877j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-08 ((2456878j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-09 ((2456879j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-12 ((2456882j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-13 ((2456883j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-15 ((2456885j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-16 ((2456886j,0s,0n),+0s,2299161j)>]
group_consecs(a)
#=> [[#<Date: 2014-08-07 ((2456877j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-08 ((2456878j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-09 ((2456879j,0s,0n),+0s,2299161j)>
# ],
# [#<Date: 2014-08-12 ((2456882j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-13 ((2456883j,0s,0n),+0s,2299161j)>
# ],
# [#<Date: 2014-08-15 ((2456885j,0s,0n),+0s,2299161j)>,
# #<Date: 2014-08-16 ((2456886j,0s,0n),+0s,2299161j)>
# ]]
This is some code I wrote for a project a while ago:
class Array
  # [1,2,4,5,6,7,9,13].to_ranges       # => [1..2, 4..7, 9..9, 13..13]
  # [1,2,4,5,6,7,9,13].to_ranges(true) # => [1..2, 4..7, 9, 13]
  def to_ranges(non_ranges_ok=false)
    self.sort.each_with_index.chunk { |x, i| x - i }.map { |diff, pairs|
      if (non_ranges_ok)
        pairs.first[0] == pairs.last[0] ? pairs.first[0] : pairs.first[0] .. pairs.last[0]
      else
        pairs.first[0] .. pairs.last[0]
      end
    }
  end
end
if ($0 == __FILE__)
  require 'awesome_print'
  ary = [1, 2, 4, 5, 6, 7, 9, 13, 12]
  ary.to_ranges(false) # => [1..2, 4..7, 9..9, 12..13]
  ary.to_ranges(true) # => [1..2, 4..7, 9, 12..13]
  ary = [1, 2, 4, 8, 5, 6, 7, 3, 9, 11, 12, 10]
  ary.to_ranges(false) # => [1..12]
  ary.to_ranges(true) # => [1..12]
end
It's easy to change that to only return the start/end pairs:
class Array
  def to_range_pairs(non_ranges_ok=false)
    self.sort.each_with_index.chunk { |x, i| x - i }.map { |diff, pairs|
      if (non_ranges_ok)
        pairs.first[0] == pairs.last[0] ? [pairs.first[0]] : [pairs.first[0], pairs.last[0]]
      else
        [pairs.first[0], pairs.last[0]]
      end
    }
  end
end
if ($0 == __FILE__)
  require 'awesome_print'
  ary = [1, 2, 4, 5, 6, 7, 9, 13, 12]
  ary.to_range_pairs(false) # => [[1, 2], [4, 7], [9, 9], [12, 13]]
  ary.to_range_pairs(true) # => [[1, 2], [4, 7], [9], [12, 13]]
  ary = [1, 2, 4, 8, 5, 6, 7, 3, 9, 11, 12, 10]
  ary.to_range_pairs(false) # => [[1, 12]]
  ary.to_range_pairs(true) # => [[1, 12]]
end
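Both methods above rely on the same observation: in a sorted list of distinct integers, value minus index stays constant within a run of consecutive numbers and jumps at every gap, so chunking on x - i groups each run. A short sketch, with a sample array chosen for illustration, makes the keys visible:

```ruby
sorted = [1, 2, 4, 5, 6, 7, 9, 12, 13]

# value - index for each element: equal keys mark one consecutive run
keys = sorted.each_with_index.map { |x, i| x - i }
p keys  #=> [1, 1, 2, 2, 2, 2, 3, 5, 5]

# chunk groups adjacent [value, index] pairs that share a key
runs = sorted.each_with_index.chunk { |x, i| x - i }
             .map { |_key, pairs| pairs.map(&:first) }
p runs  #=> [[1, 2], [4, 5, 6, 7], [9], [12, 13]]
```

Note the assumption of distinct values: a duplicate would shift the key and split a run.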
Here's an elegant solution:
arr = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
output = []
# Sort array
arr.sort!
# Loop through each element in the list
arr.each do |element|
  # Set defaults - for if there are no consecutive numbers in the list
  start = element
  endd = element
  # Loop through consecutive numbers and check if they are inside the list
  i = 1
  while arr.include?(element+i) do
    # Set element as endd
    endd = element+i
    # Remove element from list
    arr.delete(element+i)
    # Increment i
    i += 1
  end
  # Push [start, endd] pair to output
  output.push([start, endd])
end
[Edit: Ha! I misunderstood the question. In your example, for the array
a = [11, 12, 13, 14, 19, 20, 21, 29, 30, 33]
you showed the desired array of pairs to be:
[[11,14], [19,21], [29,30], [33,33]]
which correspond to the following offsets in a:
[[0,3], [4,6], [7,8], [9,9]]
These pairs respectively span the first 4 elements, the next 3 elements, the next 2 elements and the next element (by coincidence, evidently). I thought you wanted such pairs, each with a span one less than the previous, and the span of the first being as large as possible. If you have a quick look at my examples below, my assumption may be clearer. Looking back I don't know why I didn't understand the question correctly (I should have looked at the answers), but there you have it.
Despite my mistake, I'll leave this up as I found it an interesting problem, and had the opportunity to use the quadratic formula in the solution.
tidE]
This is how I would do it.
Code
def pull_pairs(a)
  n = ((-1 + Math.sqrt(1.0 + 8*a.size))/2).to_i
  cum = 0
  n.downto(1).map do |i|
    first = cum
    cum += i
    [a[first], a[cum-1]]
  end
end
Examples
a = %w{a b c d e f g h i j k l}
#=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"]
pull_pairs(a)
#=> [["a", "d"], ["e", "g"], ["h", "i"], ["j", "j"]]
a = [*(1..25)]
#=> [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
# 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
pull_pairs(a)
#=> [[1, 6], [7, 11], [12, 15], [16, 18], [19, 20], [21, 21]]
a = [*(1..990)]
#=> [1, 2,..., 990]
pull_pairs(a)
#=> [[1, 44], [45, 87],..., [988, 989], [990, 990]]
Explanation
First, we'll compute the number of pairs of values in the array we will produce. We are given an array (expressed algebraically):
a = [a0,a1,...a(m-1)]
where m = a.size.
Given n > 0, the array to be produced is:
[[a0,a(n-1)], [a(n),a(2n-2)],...,[a(t),a(t)]]
These elements span the first n+(n-1)+...+1 elements of a. As this is an arithmetic progression, the sum equals n(n+1)/2. Ergo,
t = n(n+1)/2 - 1
Now t <= m-1, so we maximize the number of pairs in the output array by choosing the largest n such that
n(n+1)/2 <= m
which is the float solution for n in the quadratic:
n^2+n-2m = 0
rounded down to an integer, which is
int((-1+sqrt(1^2+4(1)(2m)))/2)
or
int((-1+sqrt(1+8m))/2)
Suppose
a = %w{a b c d e f g h i j k l}
Then m (=a.size) = 12, so:
n = int((-1+sqrt(97))/2) = 4
and the desired array would be:
[['a','d'],['e','g'],['h','i'],['j','j']]
Once n has been computed, constructing the array of pairs is straightforward.
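The formula for n can also be evaluated without floating point by using Integer.sqrt (available from Ruby 2.5). The sketch below is an alternative to the Math.sqrt call in pull_pairs, not the code from this answer; the method name pair_count is an assumption for illustration.

```ruby
# Hypothetical helper: largest n with n(n+1)/2 <= m, using integer arithmetic.
def pair_count(m)
  (Integer.sqrt(1 + 8 * m) - 1) / 2
end

p pair_count(12)   #=> 4
p pair_count(25)   #=> 6
p pair_count(990)  #=> 44
```

For each of these sizes, n(n+1)/2 is at most m while (n+1)(n+2)/2 exceeds it, which is the condition derived above.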

Resources