Ruby: How to find the most frequent substring of length n? [duplicate]

I have this program with a class DNA. The program counts the most frequent k-mer in a string. So, it is looking for the most common substring in a string with a length of k.
An example would be creating a dna1 object with the string AACCAATCCG. The count_kmer method looks for substrings of length k and outputs the most common ones. So, if we set k = 1, then 'A' and 'C' are the most frequent substrings because each appears four times. See the example below:
dna1 = DNA.new('AACCAATCCG')
=> AACCAATCCG
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
Here is my DNA class :
class DNA
  def initialize(nucleotide)
    @nucleotide = nucleotide
  end

  def length
    @nucleotide.length
  end

  protected

  attr_reader :nucleotide
end
Here is my count kmer method that I am trying to implement:
# I have k as my only parameter because I want to pass the nucleotide string in the method
def count_kmer(k)
# I created an array as it seems like a good way to split up the nucleotide string.
counts = []
#this tries to count how many kmers of length k there are
num_kmers = self.nucleotide.length- k + 1
#this should try and look over the kmer start positions
for i in num_kmers
#Slice the string, so that way we can get the kmer
kmer = self.nucleotide.split('')
end
#add kmer if its not present
if !kmer = counts
counts[kmer] = 0
#increment the count for kmer
counts[kmer] +=1
end
#return the final count
return counts
end
#end dna class
end
I'm not sure where my method went wrong.

Something like this?
require 'set'
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
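To see what each link in that chain produces, the steps can be run one at a time on the string from the question. This is only an illustrative sketch: the local variable names (kmers, counts, grouped, best) are mine and are not part of the answer.

```ruby
require 'set'

nucleotide = 'AACCAATCCG'
k = 2

# Every window of k consecutive characters, joined back into a string.
kmers = nucleotide.chars.each_cons(k).map(&:join)

# Count occurrences; Hash.new(0) makes a missing key start at 0.
counts = kmers.each_with_object(Hash.new(0)) { |kmer, count| count[kmer] += 1 }

# Group the [kmer, count] pairs by count, then take the entry with the largest count.
grouped = counts.group_by { |_, v| v }
best = grouped.max

# best[0] is the winning count, best[1] the [kmer, count] pairs that reached it.
result = [Set.new(best[1].map { |e| e[0] }), best[0]]
p result
```

Calling max on the grouped hash compares [count, pairs] arrays, and since the counts are distinct keys, the comparison is decided by the count alone.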
EDIT: Here's the full text of the class:
require 'set'
class DNA
  def initialize(nucleotide)
    @nucleotide = nucleotide
  end

  def length
    @nucleotide.length
  end

  def count_kmer(k)
    max_kmers = kmers(k)
                .each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
                .group_by { |_,v| v }
                .max
    [Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
  end

  def kmers(k)
    nucleotide.chars.each_cons(k).map(&:join)
  end

  protected

  attr_reader :nucleotide
end
This produces the following output, using Ruby 2.2.1, using the class and method you specified:
>> dna1 = DNA.new('AACCAATCCG')
=> #<DNA:0x007fe15205bc30 @nucleotide="AACCAATCCG">
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
As a bonus, you can also do:
>> dna1.kmers(2)
=> ["AA", "AC", "CC", "CA", "AA", "AT", "TC", "CC", "CG"]

Code
def most_frequent_substrings(str, k)
(0..str.size-k).each_with_object({}) do |i,h|
b = []
str[i..-1].scan(Regexp.new str[i,k]) { b << Regexp.last_match.begin(0) + i }
(h[b.size] ||= []) << b
end.max_by(&:first).last.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
end
Example
str = "ABBABABBABCATSABBABB"
most_frequent_substrings(str, 4)
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
This shows that the most frequently occurring 4-character substrings of str appear 3 times. There are two such substrings: "ABBA" and "BBAB". "ABBA" begins at offsets (into str) 0, 5 and 14; "BBAB" begins at offsets 1, 6 and 15.
Explanation
For the example above the steps are as follows.
k = 4
n = str.size - k
#=> 20 - 4 => 16
e = (0..n).each_with_object({})
#=> #<Enumerator: 0..16:each_with_object({})>
We can see the values that will be generated by this enumerator by converting it to an array.
e.to_a
#=> [[0, {}], [1, {}], [2, {}], [3, {}], [4, {}], [5, {}], [6, {}], [7, {}], [8, {}],
# [9, {}], [10, {}], [11, {}], [12, {}], [13, {}], [14, {}], [15, {}], [16, {}]]
Note that the empty hash contained in each element is one and the same object, and it will be modified as the hash is built. Continuing, the first element of e is passed to the block and the block variables are assigned using parallel assignment:
i,h = e.next
#=> [0, {}]
i #=> 0
h #=> {}
We are now considering the substring of size 4 that begins at str offset i #=> 0, which is seen to be "ABBA". Now the block calculation is performed.
b = []
r = Regexp.new str[i,k]
#=> Regexp.new str[0,4]
#=> Regexp.new "ABBA"
#=> /ABBA/
str[i..-1].scan(r) { b << Regexp.last_match.begin(0) + i }
#=> "ABBABABBABCATSABBABB".scan(r) { b << Regexp.last_match.begin(0) + i }
b #=> [0, 5, 14]
We next have
(h[b.size] ||= []) << b
which becomes
(h[b.size] = h[b.size] || []) << b
#=> (h[3] = h[3] || []) << [0, 5, 14]
Since h has no key 3, h[3] on the right side equals nil. Continuing,
#=> (h[3] = nil || []) << [0, 5, 14]
#=> (h[3] = []) << [0, 5, 14]
h #=> { 3=>[[0, 5, 14]] }
Notice that we throw away scan's return value. All we need is b.
This tells us that "ABBA" appears three times in str, beginning at offsets 0, 5 and 14.
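One caveat, offered as an aside rather than as part of the original answer: Regexp.new treats its argument as a pattern, so a substring containing characters such as . or * can match more than itself. That cannot happen with DNA strings, but for arbitrary text Regexp.escape is probably the safer choice. A small sketch with a made-up string and a hypothetical helper named offsets:

```ruby
# Collect the start offset of every (non-overlapping) match of pattern in str.
def offsets(str, pattern)
  found = []
  str.scan(pattern) { found << Regexp.last_match.begin(0) }
  found
end

text = "A.BAXB"   # made-up input containing a regex metacharacter
sub  = text[0, 3] # "A.B"

# Unescaped: "." matches any character, so "AXB" is counted as well.
unescaped = offsets(text, Regexp.new(sub))

# Escaped: only the literal substring "A.B" matches.
escaped = offsets(text, Regexp.new(Regexp.escape(sub)))

p unescaped
p escaped
```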
Now observe
e.to_a
#=> [[0, {3=>[[0, 5, 14]]}], [1, {3=>[[0, 5, 14]]}], [2, {3=>[[0, 5, 14]]}],
# ...
# [16, {3=>[[0, 5, 14]]}]]
After all elements of e have been passed to the block, the block returns
h #=> {3=>[[0, 5, 14], [1, 6, 15]],
# 1=>[[2], [3], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]],
# 2=>[[4, 16], [5, 14], [6, 15]]}
Consider h[1], which holds the offsets whose substring is found only once when scanning from that offset onward. One of those is [2]. This pertains to the 4-character substring beginning at str offset 2:
str[2,4]
#=> "BABA"
That is found to be the only instance of that substring. Similarly, among the substrings that appear twice is str[4,4] = str[16,4] #=> "BABB", given by h[2][0] #=> [4, 16].
Next we determine the greatest frequency of a substring of length 4:
c = h.max_by(&:first)
#=> [3, [[0, 5, 14], [1, 6, 15]]]
(which could also be written c = h.max_by { |k,_| k }).
d = c.last
#=> [[0, 5, 14], [1, 6, 15]]
For convenience, convert d to a hash:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
and return that hash from the method.
There is one detail that deserves mention. It is possible that d will contain two or more arrays that reference the same substring, in which case the value of the associated key (the substring) will equal the last of those arrays. Here's a simple example.
str = "AAA"
k = 2
In this case the array d above will equal
d = [[0], [1]]
Both of these reference str[0,2] #=> str[1,2] #=> "AA". In building the hash the first is overwritten by the second:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"AA"=>[1]}
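If losing offsets in that situation is undesirable, one alternative is to skip the regular expression and record every start offset per substring in a single pass. This is my own sketch, not the answer's method, and the method name is hypothetical. Unlike scan, it also counts overlapping occurrences.

```ruby
# Map each most-frequent substring of length k to all of its start offsets.
def most_frequent_substrings_by_offset(str, k)
  # Each new key starts with its own empty array of offsets.
  positions = Hash.new { |hash, key| hash[key] = [] }
  (0..str.size - k).each { |i| positions[str[i, k]] << i }

  # Largest number of offsets recorded for any substring (nil if there are none).
  best = positions.values.map(&:size).max
  positions.select { |_, offsets| offsets.size == best }
end

p most_frequent_substrings_by_offset("ABBABABBABCATSABBABB", 4)
p most_frequent_substrings_by_offset("AAA", 2)
```

For "AAA" with k = 2 this keeps both offsets 0 and 1 under "AA" instead of overwriting the first with the second.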

Related

How to use collect and include for multidimensional array

I have:
array1 = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
@student_ids = [1,2,3]
I want to replace elements in array1 that are included in @student_ids with 'X'. I want to see:
[['X','X','X',4,5],[7,8,9,10],[11,12,13,14]]
I have code that is intended to do this:
array1.collect! do |i|
  if i.include?(@student_ids)
    i[i.index(@student_ids)] = 'X'; i # I want to replace all with X
  else
    i
  end
end
If @student_ids is 1, then it works, but if @student_ids has more than one element, such as 1, 2, 3, it raises errors. Any help?
It's faster to use a hash or a set than to repeatedly test [1,2,3].include?(n).
arr = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
ids = [1,2,3]
Use a hash
h = ids.product(["X"]).to_h
#=> {1=>"X", 2=>"X", 3=>"X"}
arr.map { |a| a.map { |n| h.fetch(n, n) } }
#=> [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
See Hash#fetch.
Use a set
require 'set'
ids = ids.to_set
#=> #<Set: {1, 2, 3}>
arr.map { |a| a.map { |n| ids.include?(n) ? "X" : n } }
#=> [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
Replace both maps with map! if the array is to be modified in place (mutated).
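The difference between map and map! can be checked directly. A short sketch using the same data as above; the names copy and untouched are mine:

```ruby
require 'set'

arr = [[1, 2, 3, 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
ids = Set[1, 2, 3]

# map builds new arrays; arr itself is left as it was.
copy = arr.map { |a| a.map { |n| ids.include?(n) ? "X" : n } }
untouched = arr.first.dup

# map! replaces elements in place, so arr now holds the "X" values.
arr.map! { |a| a.map! { |n| ids.include?(n) ? "X" : n } }

p copy
p untouched
p arr
```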
Try the following (taking @student_ids = [1, 2, 3]):
array1.inject([]) { |m,a| m << a.map { |x| @student_ids.include?(x) ? 'X' : x } }
# => [["X", "X", "X", 4, 5], [7, 8, 9, 10], [11, 12, 13, 14]]
You can use each_with_index and replace the item you want:
array1 = [[1,2,3,4,5],[7,8,9,10],[11,12,13,14]]
@student_ids = [1,2,3]
array1.each_with_index do |sub_array, index|
  sub_array.each_with_index do |item, index2|
    array1[index][index2] = 'X' if @student_ids.include?(item)
  end
end
You can do the following:
def remove_student_ids(arr)
  # Overwrite each matching element in place; each_with_index returns arr itself.
  arr.each_with_index do |value, index|
    arr[index] = 'X' if @student_ids.include?(value)
  end
end
array1.map { |sub_arr| remove_student_ids(sub_arr) }

How to merge hash of hashes and set default value if value don't exists

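I need to turn the values of hash a into the array out shown below: for each key of a, the inner hash's values ordered by the sorted inner keys, with 0 filled in where a key is missing.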
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
out = [
{"X": [4, 1]},
{"Y": [5, 0]},
{"Z": [0, 5]},
]
I would do something like this:
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
sorted_keys = a.values.flat_map(&:keys).uniq.sort
#=> [11, 12]
a.map { |k, v| { k => v.values_at(*sorted_keys).map(&:to_i) } }
#=> [ { "X" => [4, 1] }, { "Y" => [5, 0] }, { "Z" => [0, 5] }]
Code
def modify_values(g)
sorted_keys = g.reduce([]) {|arr,(_,v)| arr | v.keys}.sort
g.each_with_object({}) {|(k,v),h| h[k] = Hash.new(0).merge(v).values_at(*sorted_keys)}
end
Example
g = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
modify_values(g)
#=> {"X"=>[4, 1], "Y"=>[5, 0], "Z"=>[0, 5]}
Explanation
The steps are as follows, writing a for the hash passed to the method (g in the example). First obtain an array of the unique keys from a's values (see Enumerable#reduce and Array#|), then sort that array.
b = a.reduce([]) {|arr,(_,v)| arr | v.keys}
#=> [12, 11]
sorted_keys = b.sort
#=> [11, 12]
The first key-value pair of a, together with an empty hash, is passed to each_with_object's block. The block variables are computed using parallel assignment:
(k,v),h = [["X", {12=>1, 11=>4}], {}]
k #=> "X"
v #=> {12=>1, 11=>4}
h #=> {}
The block calculation is then performed. First an empty hash with a default value 0 is created:
f = Hash.new(0)
#=> {}
The hash v is then merged into f. The result is a hash with the same key-value pairs as v but with a default value of 0. The significance of the default value is that if the hash does not have a given key, looking that key up returns the default value. See Hash::new.
g = f.merge(v)
#=> {12=>1, 11=>4}
g.default
#=> 0 (yup)
Then extract the values corresponding to sorted_keys:
h[k] = g.values_at(*sorted_keys)
#=> {12=>1, 11=>4}.values_at(11, 12)
#=> [4, 1]
When a's next key-value pair is passed to the block, the calculations are as follows.
(k,v),h = [["Y", {11=>5}], {"X"=>[4, 1]}] # Note `h` has been updated
k #=> "Y"
v #=> {11=>5}
h #=> {"X"=>[4, 1]}
f = Hash.new(0)
#=> {}
g = f.merge(v)
#=> {11=>5}
h[k] = g.values_at(*sorted_keys)
#=> {11=>5}.values_at(11, 12)
#=> [5, 0] (Note g[12] equals g's default value)
and now
h #=> {"X"=>[4, 1], "Y"=>[5, 0]}
The calculation for the third key-value pair of a is similar.
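The same default-filling idea can also be expressed with Hash#fetch, which takes a fallback value for missing keys. This is a sketch of an alternative, not the answer's code, and the method name is hypothetical:

```ruby
# For each outer key, list the inner values in sorted-key order, using 0 when a key is absent.
def modify_values_with_fetch(g)
  sorted_keys = g.values.flat_map(&:keys).uniq.sort
  g.map { |k, v| [k, sorted_keys.map { |key| v.fetch(key, 0) }] }.to_h
end

g = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
p modify_values_with_fetch(g)
```

This avoids building a temporary hash with a default for every outer key, at the cost of one fetch call per sorted key.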


Checking arrays and implementing bool methods

You have an array. If any two numbers add to zero in the array, return true. It doesn't matter how many pairs there are—as long as there is one pair that adds to zero, return true. If there is a zero, it can only return true if there is more than one.
I wrote two functions, one to check for each, and a final one to combine both, and return false if either aren't met.
def checkZero(array)
zerocount = 0
for j in 0..array.count
if array[j] == 0
zerocount += 1
end
end
if zerocount > 1 #this part seems to not be working, not sure why
return true
else
return false
end
end
def checkNegative(array)
for j in 0..array.count
neg = -array[j] #set a negative value of the current value
if array.include?(neg) #check to see whether the negative exists in the array
return true
else
return false
end
end
end
def checkArray(array)
if checkZero(array) == true or checkNegative(array) == true
return true
else
return false
end
end
Then run something like
array = [1,2,3,4,0,1,-1]
checkArray(array)
So far, Ruby isn't returning anything. I just get a blank. I have a feeling my return isn't right.
The problem may be that you didn't output the result.
array = [1,2,3,4,0,1,-1]
puts checkArray(array)
The checkArray method can be written like the following, if performance (O(n^2)) is not a great concern:
def check_array(array)
array.combination(2).any?{|p| p.reduce(:+) == 0}
end
The more efficient (O(n log n)) solution is:
def check_array(array)
array.sort! # `array = array.sort` if you need the original array unchanged
i, j = 0, array.size - 1
while i < j
sum = array[i] + array[j]
if sum > 0
j -= 1
elsif sum < 0
i += 1
else
return true
end
end
return false
end
Here are a few relatively efficient ways to check if any two values sum to zero:
Solution #1
def checkit(a)
return true if a.count(&:zero?) > 1
b = a.uniq.map(&:abs)
b.uniq.size < b.size
end
Solution #2
def checkit(a)
return true if a.sort_by(&:abs).each_cons(2).find { |x,y| x == -y }
false
end
Solution #3
def checkit(a)
return true if a.count(&:zero?) > 1
pos, non_pos = a.group_by { |n| n > 0 }.values
(pos & non_pos.map { |n| -n }).any?
end
Solution #4
require 'set'
def checkit(a)
a.each_with_object(Set.new) do |n,s|
return true if s.include?(-n)
s << n
end
false
end
Examples
checkit([1, 3, 4, 2, 2,-3,-5,-7, 0, 0]) #=> true
checkit([1, 3, 4, 2, 2,-3,-5,-7, 0]) #=> true
checkit([1, 3, 4, 2,-3, 2,-3,-5,-7, 0]) #=> true
checkit([1, 3, 4, 2, 2,-5,-7, 0]) #=> false
Explanations
The following all refer to the array:
a = [1,3,4,2,2,-3,-5,-7,0]
#1
Zeroes present a bit of a problem, so let's first see if there is more than one, in which case we are finished. Since a.count(&:zero?) #=> 1, a.count(&:zero?) > 1 #=> false, so
return true if a.count(&:zero?) > 1
does not cause us to return. Next, we remove any duplicates:
a.uniq #=> [1, 3, 4, 2, -3, -5, -7, 0]
Then convert all the numbers to their absolute values:
b = a.uniq.map(&:abs) #=> [1, 3, 4, 2, 3, 5, 7, 0]
Lastly, see if b contains any dups, meaning the original array contained at least two non-zero numbers with the same magnitude and opposite signs:
b.uniq.size < b.size #=> true
#2
b = a.sort_by(&:abs)
#=> [0, 1, 2, 2, 3, -3, 4, -5, -7]
c = b.each_cons(2)
#=> #<Enumerator: [0, 1, 2, 2, 3, -3, 4, -5, -7]:each_cons(2)>
To see the contents of the enumerator:
c.to_a
#=> [[0, 1], [1, 2], [2, 2], [2, 3], [3, -3], [-3, 4], [4, -5], [-5, -7]]
c.find { |x,y| x == -y }
#=> [3, -3]
so true is returned.
#3
return true if a.count(&:zero?) > 1
#=> return true if 1 > 1
h = a.group_by { |n| n > 0 }
#=> {true=>[1, 3, 4, 2, 2], false=>[-3, -5, -7, 0]}
b = h.values
#=> [[1, 3, 4, 2, 2], [-3, -5, -7, 0]]
pos, non_pos = b
pos
#=> [1, 3, 4, 2, 2]
non_pos
#=> [-3, -5, -7, 0]
c = non_pos.map { |n| -n }
#=> [3, 5, 7, 0]
d = pos & c
#=> [3]
d.any?
#=> true
#4
require 'set'
enum = a.each_with_object(Set.new)
#=> #<Enumerator: [1, 3, 4, 2, 2, -3, -5, -7, 0]:each_with_object(#<Set: {}>)>
enum.to_a
#=> [[1, #<Set: {}>],
# [3, #<Set: {}>],
# ...
# [0, #<Set: {}>]]
Values are passed into the block, assigned to the block variables and the block is executed, as follows:
n, s = enum.next
#=> [1, #<Set: {}>]
s.include?(-n)
#=> #<Set: {}>.include?(-1)
#=> false
s << n
#=> #<Set: {1}>
n, s = enum.next
#=> [3, #<Set: {1}>]
s.include?(-3)
#=> false
s << n
#=> #<Set: {1, 3}>
...
n, s = enum.next
#=> [2, #<Set: {1, 3, 4, 2}>]
s.include?(-n)
#=> false
s << n
#=> #<Set: {1, 3, 4, 2}> # no change
n, s = enum.next
#=> [-3, #<Set: {1, 3, 4, 2}>]
s.include?(-n)
#=> true
causing true to be returned.
I can’t reproduce any problem with your code, but you can express the solution very succinctly using combination to get all possible pairs, then summing each pair with reduce, and finally checking if any are zero?:
[1,2,3,4,0,1,-1].combination(2).map { |pair| pair.reduce(:+) }.any?(&:zero?)
This is a bit of a code review. Let's start with the first method:
def checkZero(array)
Ruby naming convention is snake_case rather than camelCase. This should be def check_zero(array)
Now the loop:
zerocount = 0
for j in 0..array.count
if array[j] == 0
zerocount += 1
end
end
As @AndrewMarshall said, for is not idiomatic; each is preferable. However, in Ruby, initializing a variable before a loop is almost never needed thanks to all the methods available to you on Array and Enumerable (which is included in Array). I highly recommend committing these methods to memory. The zero count above can be written
array.count { |number| number.zero? }
or equivalently
array.count(&:zero?)
Now, this part:
if zerocount > 1 #this part seems to not be working, not sure why
return true
else
return false
end
end
Whenever you have the pattern
if (expr that returns true or false)
return true
else
return false
end
it can be simplified to simply return (expr that returns true or false). And you can even omit the return if it is the last statement of a method.
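As a small illustration of that simplification, here are the verbose and the condensed form side by side. Both method names are made up for this sketch:

```ruby
# Verbose form: the if/else only repeats what the condition already evaluates to.
def more_than_one_zero_verbose?(array)
  if array.count(&:zero?) > 1
    return true
  else
    return false
  end
end

# Condensed form: the comparison is the last expression, so its value is returned.
def more_than_one_zero?(array)
  array.count(&:zero?) > 1
end

samples = [[0, 1, 0], [0, 1, 2], [1, 2, 3], []]
samples.each do |sample|
  p [sample, more_than_one_zero_verbose?(sample), more_than_one_zero?(sample)]
end
```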
Putting it all together:
def check_zero(array)
array.count(&:zero?) > 1
end
def check_zero_sum(array)
array.combination(2).any?{|a,b| a + b == 0}
end
def check_array(array)
check_zero(array) || check_zero_sum(array)
end
(Note I borrowed @AndrewMarshall's code for check_zero_sum, which I think is easy to follow, but @CarySwoveland's answer will be faster.)
Edit
I missed the fact that check_zero isn't even necessary because you want at least a pair, in which case check_zero_sum is all you need.
def check_array(array)
array.combination(2).any?{|a,b| a + b == 0}
end
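A quick check of the edge cases shows why the separate zero test is not needed: combination(2) pairs distinct positions, so a lone zero is never paired with itself. The sample arrays other than the one from the question are made up:

```ruby
def check_array(array)
  array.combination(2).any? { |a, b| a + b == 0 }
end

single_zero = check_array([1, 2, 0])               # one zero, no opposite pair
two_zeros   = check_array([0, 5, 0])               # the two zeros form a pair
opposites   = check_array([1, 2, 3, 4, 0, 1, -1])  # array from the question: 1 and -1
no_pair     = check_array([1, 2, 3])

p [single_zero, two_zeros, opposites, no_pair]
```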

Ruby: Collect index from Array/String Matchdata

I'm new to Ruby, here's my problem : I would like to iterate through either an Array or String to obtain the index of characters that match a Regex.
Sample Array/String
a = %q(A B A A C C B D A D)
b = %w(A B A A C C B D A D)
What I need is something for variable a or b like ;
#index of A returns;
[0, 2, 3,8]
#index of B returns
[1,6]
#index of C returns
[4,5]
#etc
I've tried to be a little sly with
z = %w()
a =~ /\w/.each_with_index do |x, y|
puts z < y
end
but that didn't work out so well.
Any solutions ?
For array, you could use
b.each_index.select { |i| b[i] == 'A' }
For string, you could split it to an array first (a.split(/\s/)).
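Put together for both inputs from the question, that might look like the following sketch (the names from_array, chars and from_string are mine):

```ruby
a = %q(A B A A C C B D A D) # a single String
b = %w(A B A A C C B D A D) # an Array of one-letter Strings

# Array: keep the indices whose element equals 'A'.
from_array = b.each_index.select { |i| b[i] == 'A' }

# String: split on whitespace first, then do the same.
chars = a.split(/\s/)
from_string = chars.each_index.select { |i| chars[i] == 'A' }

p from_array
p from_string
```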
If you want to get each character's index as a hash, this would work:
b = %w(A B A A C C B D A D)
h = {}
b.each_with_index { |e, i|
h[e] ||= []
h[e] << i
}
h
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
Or as a "one-liner":
b.each_with_object({}).with_index { |(e, h), i| (h[e] ||= []) << i }
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
If you want to collect the indices at which each letter occurs, you can define a helper method:
def occurrences(collection)
collection = collection.split(/\s/) if collection.is_a? String
collection.uniq.inject({}) do |result, letter|
result[letter] = collection.each_index.select { |index| collection[index] == letter }
result
end
end
# And use it like this. This will return you a hash something like this:
# {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
occurrences(a)
occurrences(b)
This should work either for String or Array.
