Ruby: Collect index from Array/String Matchdata

I'm new to Ruby, here's my problem : I would like to iterate through either an Array or String to obtain the index of characters that match a Regex.
Sample Array/String
a = %q(A B A A C C B D A D)
b = %w(A B A A C C B D A D)
What I need, for either variable a or b, is something like:
#index of A returns
[0, 2, 3, 8]
#index of B returns
[1, 6]
#index of C returns
[4, 5]
#etc
I've tried to be a little sly with
z = %w()
a =~ /\w/.each_with_index do |x, y|
puts z < y
end
but that didn't work out so well.
Any solutions?

For array, you could use
b.each_index.select { |i| b[i] == 'A' }
For string, you could split it to an array first (a.split(/\s/)).
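Since the question asks about matching a regex rather than an exact value, here is a minimal sketch that combines the two suggestions above. The helper name indices_matching is hypothetical, and it assumes the string form is whitespace-separated like the sample.

```ruby
a = %q(A B A A C C B D A D)
b = %w(A B A A C C B D A D)

# Hypothetical helper: returns the positions of the elements that match
# the given pattern. A String is first split on whitespace.
def indices_matching(collection, pattern)
  items = collection.is_a?(String) ? collection.split : collection
  items.each_index.select { |i| pattern.match?(items[i]) }
end

indices_matching(a, /A/) # => [0, 2, 3, 8]
indices_matching(b, /B/) # => [1, 6]
indices_matching(b, /C/) # => [4, 5]
```

Regexp#match? needs Ruby 2.4 or later; on older versions items[i] =~ pattern should behave the same inside select.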

If you want to get each character's index as a hash, this would work:
b = %w(A B A A C C B D A D)
h = {}
b.each_with_index { |e, i|
h[e] ||= []
h[e] << i
}
h
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
Or as a "one-liner":
b.each_with_object({}).with_index { |(e, h), i| (h[e] ||= []) << i }
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
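As a possible alternative, the same hash can be built with group_by. This is only a sketch, and it assumes Ruby 2.4 or later for Hash#transform_values; the variable name positions is made up for illustration.

```ruby
b = %w(A B A A C C B D A D)

# Pair each element with its index, group the pairs by element,
# then keep only the index from each pair.
positions = b.each_with_index
             .group_by { |element, _index| element }
             .transform_values { |pairs| pairs.map { |_element, index| index } }

positions
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
```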

If you want to collect the indices at which each letter occurs, you can define a helper method:
def occurrences(collection)
collection = collection.split(/\s/) if collection.is_a? String
collection.uniq.inject({}) do |result, letter|
result[letter] = collection.each_index.select { |index| collection[index] == letter }
result
end
end
# And use it like this. This will return you a hash something like this:
# {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
occurrences(a)
occurrences(b)
This should work for either a String or an Array.

Related

How do Ruby multiple code blocks work in conjunction/when chained?

Here's a function in Ruby to find whether 2 distinct numbers in an array add up to a given sum:
def sum_eq_n? (arr, n)
return true if arr.empty? && n == 0
p "first part array:" + String(arr.product(arr).reject { |a,b| a == b })
puts "\n"
p "first part bool:" + String(arr.product(arr).reject { |a,b| a == b }.any?)
puts "\n"
p "second part:" + String(arr.product(arr).reject { |a,b| a + b == n } )
puts "\n"
result = arr.product(arr).reject { |a,b| a == b }.any? { |a,b| a + b == n }
return result
end
#define inputs
l1 = [1, 2, 3, 4, 5, 5]
n = 10
#run function
print "Result is: " + String(sum_eq_n?(l1, n))
I'm confused how the calculation works to produce result. As you can see I've broken the function down into a few parts to visualize this. I've researched and understand the .reject and the .any? methods individually.
However, I'm still confused on how it fits all together in the 1 liner. How are the 2 blocks evaluated in combination? I've only found examples with .reject with 1 code block afterwards. Is .reject applied to both? I also thought there might be an implicit AND in between the 2 code blocks, but I tried to add a 3rd dummy block and it failed, so at this point I'm just not really sure how it works at all.
You can interpret the expression via these equivalent substitutions:
# orig
arr.product(arr).reject { |a,b| a == b }.any? { |a,b| a + b == n }
# same as
pairs = arr.product(arr)
pairs.reject { |a,b| a == b }.any? { |a,b| a + b == n }
# same as
pairs = arr.product(arr)
different_pairs = pairs.reject { |a,b| a == b }
different_pairs.any? { |a,b| a + b == n }
Each block is an argument for the respective method -- one for reject, and one for any?. They are evaluated in order, and are not combined. The parts that make up the expression can be wrapped in parentheses to show this:
((arr.product(arr)).reject { |a,b| a == b }).any? { |a,b| a + b == n }
# broken up lines:
(
(
arr.product(arr) # pairs
).reject { |a,b| a == b } # different_pairs
).any? { |a,b| a + b == n }
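To make the intermediate values concrete, here is a small sketch with a three-element array. The sample values are assumptions chosen to keep the output short.

```ruby
arr = [1, 2, 3]

# Stage 1: every ordered pair, including an element paired with itself.
pairs = arr.product(arr)

# Stage 2: reject's block runs once per pair and drops pairs of equal values.
different_pairs = pairs.reject { |a, b| a == b }

# Stage 3: any?'s block only ever sees what reject returned.
sums_to_four = different_pairs.any? { |a, b| a + b == 4 }
sums_to_six  = different_pairs.any? { |a, b| a + b == 6 }

pairs.size           # => 9
different_pairs.size # => 6
sums_to_four         # => true  (for example [1, 3])
sums_to_six          # => false ([3, 3] was already rejected)
```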
Blocks in Ruby Are Method Arguments
Blocks in Ruby are first-class syntax structures for passing closures as arguments to methods. If you're more familiar with object-oriented concepts than functional ones, here is an example of an object (kind of) acting as a closure:
class MultiplyPairStrategy
def perform(a, b)
a * b
end
end
def convert_using_strategy(pairs, strategy)
new_array = []
for pair in pairs do
new_array << strategy.perform(*pair)
end
new_array
end
pairs = [
[2, 3],
[5, 4],
]
multiply_pair = MultiplyPairStrategy.new
convert_using_strategy(pairs, multiply_pair) # => [6, 20]
Which is the same as:
multiply_pair = Proc.new { |a, b| a * b }
pairs.map(&multiply_pair)
Which is the same as the most idiomatic:
pairs.map { |a, b| a * b }
The return value of the first method becomes the receiver of the second method.
This:
result = arr.product(arr).reject { |a,b| a == b }.any? { |a,b| a + b == n }
is functionally equivalent to:
results = arr.product(arr).reject { |a,b| a == b} # matrix of array pairs with identical values rejected
result = results.any? { |a,b| a + b == n } #true/false
This might be best visualized in pry (comments mine)
[1] pry(main)> arr = [1, 2, 3, 4, 5, 5]
=> [1, 2, 3, 4, 5, 5]
[2] pry(main)> n = 10
=> 10
[3] pry(main)> result_reject = arr.product(arr).reject { |a,b| a == b } # all combinations of array elements, with identical ones removed
=> [[1, 2],
[1, 3],
[1, 4],
[1, 5],
[1, 5],
[2, 1],
[2, 3],
[2, 4],
[2, 5],
[2, 5],
[3, 1],
[3, 2],
[3, 4],
[3, 5],
[3, 5],
[4, 1],
[4, 2],
[4, 3],
[4, 5],
[4, 5],
[5, 1],
[5, 2],
[5, 3],
[5, 4],
[5, 1],
[5, 2],
[5, 3],
[5, 4]]
[4] pry(main)> result_reject.any? { |a,b| a + b == n } # do any of the pairs of elements add together to equal `n`?
=> false
[5] pry(main)> arr.product(arr).reject { |a,b| a == b }.any? { |a,b| a + b == n } # the one liner
=> false
Each operation "chains" into the next, which visualized looks like:
arr.product(arr).reject { |a,b| a == b }.any? { |a,b| a + b == n }
|--|------A----->-----------B----------->-------------C----------|
Where part A, calling .product(arr), evaluates to an object. This object has a reject method that's called subsequently, and the result of that call in turn has an any? method that's called last. It's a fancy version of a.b.c.d where one call is used to generate an object for a subsequent call.
One thing that's not apparent from that: here product and reject each return a plain Array, so every stage runs to completion before the next one starts. Methods such as repeated_permutation, when called without a block, instead return an Enumerator, which is an object that can be used to fetch the results, but is not the actual results per se. It's more like an intent to return the results, and an ability to fetch them in a multitude of ways. These can be chained together to get the desired end product.
As a note this code can be reduced to:
arr.repeated_permutation(2).map(&:sum).include?(n)
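Where the repeated_permutation method gives you every ordered pair of elements, including each element paired with itself. That makes it slightly broader than the original, which rejects pairs of equal values, so a pair such as [5, 5] counts here but not there. This can be easily scaled up to N elements by changing that parameter. include? tests if the target is present.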
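A small sketch of how the shortened form compares with the original chain; the sample values are assumptions chosen for illustration. Because repeated_permutation(2) also pairs an element with itself, the two can disagree:

```ruby
arr = [1, 2, 3]
n = 6

# Shortened form: [3, 3] is among the pairs, so 6 is found.
short_form = arr.repeated_permutation(2).map(&:sum).include?(n)

# Original chain: pairs of equal values are rejected first, so 6 is not found.
original = arr.product(arr).reject { |a, b| a == b }.any? { |a, b| a + b == n }

short_form # => true
original   # => false
```

Array#sum is available from Ruby 2.4.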
If you're working with large arrays you may want to slightly optimize this:
arr.repeated_permutation(2).lazy.map(&:sum).include?(n)
Where that will stop on the first match found and avoid further sums. The lazy call has the effect of propagating individual values through to the end of the chain instead of each stage of the chain running to completion before forwarding to the next.
The idea of lazy is one of the interesting things about Enumerable. You can control how the values flow through those chains.
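One way to see the difference is to count how many sums each version computes. This is a sketch with made-up counter names, not a benchmark:

```ruby
arr = [1, 2, 3, 4, 5, 5]
n = 3

# Eager: map builds the full array of sums before include? looks at any of them.
eager_sums = 0
eager_found = arr.repeated_permutation(2)
                 .map { |pair| eager_sums += 1; pair.sum }
                 .include?(n)

# Lazy: each pair flows through map and into include? one at a time,
# so the chain stops at the first sum equal to n.
lazy_sums = 0
lazy_found = arr.repeated_permutation(2)
                .lazy
                .map { |pair| lazy_sums += 1; pair.sum }
                .include?(n)

eager_sums # => 36
lazy_sums  # => 2  ([1, 1] then [1, 2])
```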

How to merge hash of hashes and set default value if value doesn't exist

I need to merge the values of hash a into out, ordered by the sorted inner keys of a, with 0 where a key is missing.
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
out = [
{"X": [4, 1]},
{"Y": [5, 0]},
{"Z": [0, 5]},
]
I would do something like this:
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
sorted_keys = a.values.flat_map(&:keys).uniq.sort
#=> [11, 12]
a.map { |k, v| { k => v.values_at(*sorted_keys).map(&:to_i) } }
#=> [ { "X" => [4, 1] }, { "Y" => [5, 0] }, { "Z" => [0, 5] }]
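A variant of the same idea that makes the default explicit with Hash#fetch instead of relying on nil.to_i. It is only a sketch, and it keeps the string keys of a rather than the symbol keys that the "X": syntax in the question would produce.

```ruby
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}

# All inner keys that occur anywhere, in ascending order.
sorted_keys = a.values.flat_map(&:keys).uniq.sort

# For each outer key, look up every inner key and fall back to 0 when absent.
out = a.map do |name, counts|
  { name => sorted_keys.map { |key| counts.fetch(key, 0) } }
end

out
#=> [{"X"=>[4, 1]}, {"Y"=>[5, 0]}, {"Z"=>[0, 5]}]
```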
Code
def modify_values(g)
sorted_keys = g.reduce([]) {|arr,(_,v)| arr | v.keys}.sort
g.each_with_object({}) {|(k,v),h| h[k] = Hash.new(0).merge(v).values_at(*sorted_keys)}
end
Example
g = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
modify_values(g)
#=> {"X"=>[4, 1], "Y"=>[5, 0], "Z"=>[0, 5]}
Explanation
The steps are as follows, using the question's hash a (the same hash as g in the example). First obtain an array of the unique keys from a's values (see Enumerable#reduce and Array#|), then sort that array.
b = a.reduce([]) {|arr,(_,v)| arr | v.keys}
#=> [12, 11]
sorted_keys = b.sort
#=> [11, 12]
The first key-value pair of a, together with an empty hash, is passed to each_with_object's block. The block variables are computed using parallel assignment:
(k,v),h = [["X", {12=>1, 11=>4}], {}]
k #=> "X"
v #=> {12=>1, 11=>4}
h #=> {}
The block calculation is then performed. First an empty hash with a default value 0 is created:
f = Hash.new(0)
#=> {}
The hash v is then merged into f. The result is hash with the same key-value pairs as v but with a default value of 0. The significance of the default value is that if f does not have a key k, f[k] returns the default value. See Hash::new.
g = f.merge(v)
#=> {12=>1, 11=>4}
g.default
#=> 0 (yup)
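A short sketch of why the default matters for the next step: a missing key reads as 0 without being stored, and values_at goes through the same lookup. The sample hash is the second value from the question.

```ruby
f = Hash.new(0)
g = f.merge({ 11 => 5 })

present  = g[11]               # the stored value
missing  = g[12]               # the default, since 12 is not a key
stored   = g.key?(12)          # reading the default does not add the key
both     = g.values_at(11, 12) # values_at also falls back to the default

present #=> 5
missing #=> 0
stored  #=> false
both    #=> [5, 0]
```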
Then extract the values corresponding to sorted_keys:
h[k] = g.values_at(*sorted_keys)
#=> {12=>1, 11=>4}.values_at(11, 12)
#=> [4, 1]
When a's next key-value pair is passed to the block, the calculations are as follows.
(k,v),h = [["Y", {11=>5}], {"X"=>[4, 1]}] # Note `h` has been updated
k #=> "Y"
v #=> {11=>5}
h #=> {"X"=>[4, 1]}
f = Hash.new(0)
#=> {}
g = f.merge(v)
#=> {11=>5}
h[k] = g.values_at(*sorted_keys)
#=> {11=>5}.values_at(11, 12)
#=> [5, 0] (Note g[12] equals g's default value)
and now
h #=> {"X"=>[4, 1], "Y"=>[5, 0]}
The calculation for the third key-value pair of a is similar.

Ruby: How to find the most frequent substring of length n? [duplicate]

I have this program with a class DNA. The program counts the most frequent k-mer in a string. So, it is looking for the most common substring in a string with a length of k.
An example would be creating a dna1 object with the string AACCAATCCG. The count_kmer method will look for substrings of length k and output the most common ones. So, if we set k = 1, then 'A' and 'C' will be the most frequent in the string because each appears four times. See the example below:
dna1 = DNA.new('AACCAATCCG')
=> AACCAATCCG
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
Here is my DNA class :
class DNA
def initialize (nucleotide)
@nucleotide = nucleotide
end
def length
@nucleotide.length
end
protected
attr_reader :nucleotide
end
Here is my count kmer method that I am trying to implement:
# I have k as my only parameter because I want to pass the nucleotide string in the method
def count_kmer(k)
# I created an array as it seems like a good way to split up the nucleotide string.
counts = []
#this tries to count how many kmers of length k there are
num_kmers = self.nucleotide.length- k + 1
#this should try and look over the kmer start positions
for i in num_kmers
#Slice the string, so that way we can get the kmer
kmer = self.nucleotide.split('')
end
#add kmer if its not present
if !kmer = counts
counts[kmer] = 0
#increment the count for kmer
counts[kmer] +=1
end
#return the final count
return counts
end
#end dna class
end
I'm not sure where my method went wrong.
Something like this?
require 'set'
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
EDIT: Here's the full text of the class:
require 'set'
class DNA
def initialize (nucleotide)
@nucleotide = nucleotide
end
def length
@nucleotide.length
end
def count_kmer(k)
max_kmers = kmers(k)
.each_with_object(Hash.new(0)) { |value, count| count[value] += 1 }
.group_by { |_,v| v }
.max
[Set.new(max_kmers[1].map { |e| e[0] }), max_kmers[0]]
end
def kmers(k)
nucleotide.chars.each_cons(k).map(&:join)
end
protected
attr_reader :nucleotide
end
This produces the following output, using Ruby 2.2.1, using the class and method you specified:
>> dna1 = DNA.new('AACCAATCCG')
=> #<DNA:0x007fe15205bc30 @nucleotide="AACCAATCCG">
>> dna1.count_kmer(1)
=> [#<Set: {"A", "C"}>, 4]
>> dna1.count_kmer(2)
=> [#<Set: {"AA", "CC"}>, 2]
As a bonus, you can also do:
>> dna1.kmers(2)
=> ["AA", "AC", "CC", "CA", "AA", "AT", "TC", "CC", "CG"]
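For comparison, here is a loop-based sketch that stays closer to the shape of the question's attempt: it walks the start positions, slices a k-mer at each one, and counts it in a hash rather than an array. It is one possible reading of what the question intended, not the answer's method.

```ruby
require 'set'

class DNA
  def initialize(nucleotide)
    @nucleotide = nucleotide
  end

  def count_kmer(k)
    # A hash with default 0 replaces the "add kmer if not present" step.
    counts = Hash.new(0)
    # Number of start positions for a substring of length k.
    num_kmers = nucleotide.length - k + 1
    num_kmers.times do |i|
      # Slice k characters starting at offset i and count them.
      counts[nucleotide[i, k]] += 1
    end
    max_count = counts.values.max
    most_common = counts.select { |_kmer, count| count == max_count }.keys
    [Set.new(most_common), max_count]
  end

  protected

  attr_reader :nucleotide
end

dna1 = DNA.new('AACCAATCCG')
dna1.count_kmer(1) # => [#<Set: {"A", "C"}>, 4]
dna1.count_kmer(2) # => [#<Set: {"AA", "CC"}>, 2]
```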
Code
def most_frequent_substrings(str, k)
(0..str.size-k).each_with_object({}) do |i,h|
b = []
str[i..-1].scan(Regexp.new str[i,k]) { b << Regexp.last_match.begin(0) + i }
(h[b.size] ||= []) << b
end.max_by(&:first).last.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
end
Example
str = "ABBABABBABCATSABBABB"
most_frequent_substrings(str, 4)
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
This shows that the most frequently-occurring 4-character substring of str appears 3 times. There are two such substrings: "ABBA" and "BBAB". "ABBA" begins at offsets (into str) 0, 5 and 14, "BBAB" substrings begin at offsets 1, 6 and 15.
Explanation
For the example above the steps are as follows.
k = 4
n = str.size - k
#=> 20 - 4 => 16
e = (0..n).each_with_object({})
#<Enumerator: 0..16:each_with_object({})>
We can see the values that will be generated by this enumerator by converting it to an array.
e.to_a
#=> [[0, {}], [1, {}], [2, {}], [3, {}], [4, {}], [5, {}], [6, {}], [7, {}], [8, {}],
# [9, {}], [10, {}], [11, {}], [12, {}], [13, {}], [14, {}], [15, {}], [16, {}]]
Note the empty hash contained in each element will be modified as the hash is built. Continuing, the first element of e is passed to the block and the block variables are assigned using parallel assignment:
i,h = e.next
#=> [0, {}]
i #=> 0
h #=> {}
We are now considering the substring of size 4 that begins at str offset i #=> 0, which is seen to be "ABBA". Now the block calculation is performed.
b = []
r = Regexp.new str[i,k]
#=> Regexp.new str[0,4]
#=> Regexp.new "ABBA"
#=> /ABBA/
str[i..-1].scan(r) { b << Regexp.last_match.begin(0) + i }
#=> "ABBABABBABCATSABBABB".scan(r) { b << Regexp.last_match.begin(0) + i }
b #=> [0, 5, 14]
We next have
(h[b.size] ||= []) << b
which becomes
(h[b.size] = h[b.size] || []) << b
#=> (h[3] = h[3] || []) << [0, 5, 14]
Since h has no key 3, h[3] on the right side equals nil. Continuing,
#=> (h[3] = nil || []) << [0, 5, 14]
#=> (h[3] = []) << [0, 5, 14]
h #=> { 3=>[[0, 5, 14]] }
Notice that we throw away scan's return value. All we need is b.
This tells us that "ABBA" appears thrice in str, beginning at offsets 0, 5 and 14.
Now observe
e.to_a
#=> [[0, {3=>[[0, 5, 14]]}], [1, {3=>[[0, 5, 14]]}], [2, {3=>[[0, 5, 14]]}],
# ...
# [16, {3=>[[0, 5, 14]]}]]
After all elements of e have been passed to the block, the block returns
h #=> {3=>[[0, 5, 14], [1, 6, 15]],
# 1=>[[2], [3], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]],
# 2=>[[4, 16], [5, 14], [6, 15]]}
Consider substrings that appear just once: h[1]. One of those is [2]. This pertains to the 4-character substring beginning at str offset 2:
str[2,4]
#=> "BABA"
That is found to be the only instance of that substring. Similarly, among the substrings that appear twice is str[4,4] = str[16,4] #=> "BABB", given by h[2][0] #=> [4, 16].
Next we determine the greatest frequency of a substring of length 4:
c = h.max_by(&:first)
#=> [3, [[0, 5, 14], [1, 6, 15]]]
(which could also be written c = h.max_by { |k,_| k }).
d = c.last
#=> [[0, 5, 14], [1, 6, 15]]
For convenience, convert d to a hash:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
and return that hash from the method.
There is one detail that deserves mention. It is possible that d will contain two or more arrays that reference the same substring, in which case the value of the associated key (the substring) will equal the last of those arrays. Here's a simple example.
str = "AAA"
k = 2
In this case the array d above will equal
d = [[0], [1]]
Both of these reference str[0,2] #=> str[1,2] #=> "AA". In building the hash the first is overwritten by the second:
d.each_with_object({}) { |a,h| h[str[a.first,k]] = a }
#=> {"AA"=>[1]}
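If overlapping occurrences should be counted, which scan does not do, one possible variant collects the start offset of every window of length k instead. This is a sketch under that assumption, with a hypothetical method name, and it deliberately differs from the method above for inputs like "AAA".

```ruby
# Hypothetical variant: counts overlapping occurrences by recording the
# start offset of every k-character window.
def most_frequent_substrings_overlapping(str, k)
  offsets = Hash.new { |hash, key| hash[key] = [] }
  str.chars.each_cons(k).with_index do |chars, i|
    offsets[chars.join] << i
  end
  max = offsets.values.map(&:size).max
  # Keep only the substrings that occur the maximum number of times.
  offsets.select { |_substring, starts| starts.size == max }
end

most_frequent_substrings_overlapping("ABBABABBABCATSABBABB", 4)
#=> {"ABBA"=>[0, 5, 14], "BBAB"=>[1, 6, 15]}
most_frequent_substrings_overlapping("AAA", 2)
#=> {"AA"=>[0, 1]}
```

Each substring appears as a key only once here, so no entry is overwritten.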

Removing elements from array Ruby

Let's say I am trying to remove elements from array a = [1,1,1,2,2,3]. If I perform the following:
b = a - [1,3]
Then I will get:
b = [2,2]
However, I want the result to be
b = [1,1,2,2]
i.e. I only remove one instance of each element in the subtracted vector not all cases. Is there a simple way in Ruby to do this?
You may do:
a= [1,1,1,2,2,3]
delete_list = [1,3]
delete_list.each do |del|
a.delete_at(a.index(del))
end
result : [1, 1, 2, 2]
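Note that a.index(del) returns nil when del is not in a, and delete_at(nil) raises a TypeError. A minimal sketch that guards against that and leaves the original array untouched; the method name remove_once is made up for illustration.

```ruby
# Hypothetical helper: removes one occurrence per entry in removals,
# skipping values that are not present, and returns a new array.
def remove_once(array, removals)
  result = array.dup
  removals.each do |value|
    i = result.index(value)
    result.delete_at(i) if i
  end
  result
end

a = [1, 1, 1, 2, 2, 3]
remove_once(a, [1, 3]) # => [1, 1, 2, 2]
remove_once(a, [4])    # => [1, 1, 1, 2, 2, 3]
a                      # => [1, 1, 1, 2, 2, 3]
```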
[1,3].inject([1,1,1,2,2,3]) do |memo, element|
memo.tap do |array|
i = array.find_index(element)
array.delete_at(i) if i
end
end
Not very simple but:
a = [1,1,1,2,2,3]
b = a.group_by {|n| n}.each {|k,v| v.pop [1,3].count(k)}.values.flatten
=> [1, 1, 2, 2]
Also handles the case for multiples in the 'subtrahend':
a = [1,1,1,2,2,3]
b = a.group_by {|n| n}.each {|k,v| v.pop [1,1,3].count(k)}.values.flatten
=> [1, 2, 2]
EDIT: this is more an enhancement combining Norm212 and my answer to make a "functional" solution.
b = [1,1,3].each.with_object( a ) { |del| a.delete_at( a.index( del ) ) }
Put it in a lambda if needed:
subtract = lambda do |minuend, subtrahend|
subtrahend.each.with_object( minuend ) { |del| minuend.delete_at( minuend.index( del ) ) }
end
then:
subtract.call a, [1,1,3]
A simple solution I frequently use:
arr = ['remove me',3,4,2,45]
arr[1..-1]
=> [3,4,2,45]
a = [1,1,1,2,2,3]
a.slice!(0) # remove first index
a.slice!(-1) # remove last index
# a = [1,1,2,2] as desired
For speed, I would do the following, which requires only one pass through each of the two arrays. This method preserves order. I will first present code that does not mutate the original array, then show how it can be easily modified to mutate.
arr = [1,1,1,2,2,3,1]
removals = [1,3,1]
h = removals.group_by(&:itself).transform_values(&:size)
#=> {1=>2, 3=>1}
arr.each_with_object([]) { |n,a|
h.key?(n) && h[n] > 0 ? (h[n] -= 1) : a << n }
#=> [1, 2, 2, 1]
arr
#=> [1, 1, 1, 2, 2, 3, 1]
To mutate arr write:
h = removals.group_by(&:itself).transform_values(&:count)
arr.replace(arr.each_with_object([]) { |n,a|
h.key?(n) && h[n] > 0 ? (h[n] -= 1) : a << n })
#=> [1, 2, 2, 1]
arr
#=> [1, 2, 2, 1]
This uses the 21st century method Hash#transform_values (new in MRI v2.4), but one could instead write:
h = Hash[removals.group_by(&:itself).map { |k,v| [k,v.size] }]
or
h = removals.each_with_object(Hash.new(0)) { |n,h| h[n] += 1 }
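On Ruby 2.7 or later, Enumerable#tally builds the same counting hash directly. The sketch below wraps the single-pass idea in a method; the name subtract_once is an assumption, not part of the answer above.

```ruby
# Hypothetical helper: one pass over each array, order preserved,
# original array not mutated.
def subtract_once(arr, removals)
  remaining = removals.tally # e.g. [1, 3, 1] => {1=>2, 3=>1}
  arr.reject do |n|
    # Keep n unless there is still a pending removal for it.
    next false unless remaining.fetch(n, 0) > 0
    remaining[n] -= 1
    true
  end
end

arr = [1, 1, 1, 2, 2, 3, 1]
subtract_once(arr, [1, 3, 1]) #=> [1, 2, 2, 1]
arr                           #=> [1, 1, 1, 2, 2, 3, 1]
```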
