Ruby extract subarray from integer - ruby

I need to extract part of an array based on an integer and, if there are not enough values, pad the result with a specific value until its size matches that integer.
For example, I have an array like this:
[[1,2], [2,1], [3,3]]
If my integer is 2 I need this :
[[1,2], [2,1]]
If my integer is 4 I need this :
[[1,2], [2,1], [3,3], [nil, nil]]

You can do this using the Integer#times method:
a = [[1,2], [2,1], [3,3]]
def extract_sub_array array, size
size.times.map { |i| array.fetch(i, [nil, nil]) }
end
extract_sub_array a, 2
# => [[1, 2], [2, 1]]
extract_sub_array a, 4
# => [[1, 2], [2, 1], [3, 3], [nil, nil]]

def convert(a,n)
Array.new(n) { |i| a[i] || [nil,nil] }
end
a = [[1,2], [2,1], [3,3]]
convert(a,2) #=> [[1, 2], [2, 1]]
convert(a,3) #=> [[1, 2], [2, 1], [3, 3]]
convert(a,4) #=> [[1, 2], [2, 1], [3, 3], [nil, nil]]
convert(a,5) #=> [[1, 2], [2, 1], [3, 3], [nil, nil], [nil, nil]]

Assuming desired_length is given and the array is named arr:
arr = [[1,2], [2,1], [3,3]]
# Will shrink an array or fill it with nils
# @param arr [Array] an array
# @param desired_length [Integer] the target length
def yo arr, desired_length
arr[0...desired_length] + [[nil,nil]]*[0,desired_length-arr.length].max
end
yo arr, 2
#⇒ [[1,2], [2,1]]
yo arr, 4
#⇒ [[1,2], [2,1], [3,3], [nil, nil]]
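As a further sketch (not taken from the answers above), the same truncate-or-pad behaviour can be built from Array#first and Array.new. The method name pad_or_truncate and the filler parameter are hypothetical names chosen for illustration:

```ruby
# Hypothetical helper: returns exactly `size` elements,
# padding with copies of `filler` when the input is too short.
def pad_or_truncate(array, size, filler = [nil, nil])
  taken = array.first(size)                             # at most `size` elements
  taken + Array.new(size - taken.length) { filler.dup } # pad any shortfall
end

a = [[1, 2], [2, 1], [3, 3]]
pad_or_truncate(a, 2) # => [[1, 2], [2, 1]]
pad_or_truncate(a, 4) # => [[1, 2], [2, 1], [3, 3], [nil, nil]]
```

The block form of Array.new gives each padded slot its own array, so mutating one filler pair does not affect the others.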

Related

How to find indices of identical sub-sequences in two strings in Ruby?

Here each instance of the class DNA corresponds to a string such as 'GCCCAC'. Arrays of substrings containing k-mers can be constructed from these strings. For this string there are 1-mers, 2-mers, 3-mers, 4-mers, 5-mers and one 6-mer:
6 1-mers: ["G", "C", "C", "C", "A", "C"]
5 2-mers: ["GC", "CC", "CC", "CA", "AC"]
4 3-mers: ["GCC", "CCC", "CCA", "CAC"]
3 4-mers: ["GCCC", "CCCA", "CCAC"]
2 5-mers: ["GCCCA", "CCCAC"]
1 6-mers: ["GCCCAC"]
The pattern should be evident. See the Wiki for details.
The problem is to write the method shared_kmers(k, dna2) of the DNA class which returns an array of all pairs [i, j] where this DNA object (that receives the message) shares with dna2 a common k-mer at position i in this dna and at position j in dna2.
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
class DNA
attr_accessor :sequencing
def initialize(sequencing)
@sequencing = sequencing
end
def kmers(k)
@sequencing.each_char.each_cons(k).map(&:join)
end
def shared_kmers(k, dna)
kmers(k).each_with_object([]).with_index do |(kmer, result), index|
dna.kmers(k).each_with_index do |other_kmer, other_kmer_index|
result << [index, other_kmer_index] if kmer.eql?(other_kmer)
end
end
end
end
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.kmers(2)
#=> ["GC", "CC", "CC", "CA", "AC"]
dna2.kmers(2)
#=> ["CC", "CA", "AC", "CG", "GC"]
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
I will address only the crux of your problem, without reference to a class DNA. It should be easy to reorganize what follows to fit the class.
Code
def match_kmers(s1, s2, k)
h1 = dna_to_index(s1, k)
h2 = dna_to_index(s2, k)
h1.flat_map { |k,_| h1[k].product(h2[k] || []) }
end
def dna_to_index(dna, k)
dna.each_char.
with_index.
each_cons(k).
with_object({}) {|arr,h| (h[arr.map(&:first).join] ||= []) << arr.first.last}
end
Examples
dna1 = 'GCCCAC'
dna2 = 'CCACGC'
match_kmers(dna1, dna2, 2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
match_kmers(dna2, dna1, 2)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
match_kmers(dna1, dna2, 3)
#=> [[2, 0], [3, 1]]
match_kmers(dna2, dna1, 3)
#=> [[0, 2], [1, 3]]
match_kmers(dna1, dna2, 4)
#=> [[2, 0]]
match_kmers(dna2, dna1, 4)
#=> [[0, 2]]
match_kmers(dna1, dna2, 5)
#=> []
match_kmers(dna2, dna1, 5)
#=> []
match_kmers(dna1, dna2, 6)
#=> []
match_kmers(dna2, dna1, 6)
#=> []
Explanation
Consider dna1 = 'GCCCAC'. This contains 5 2-mers (k = 2):
dna1.each_char.each_cons(2).to_a.map(&:join)
#=> ["GC", "CC", "CC", "CA", "AC"]
Similarly, for dna2 = 'CCACGC':
dna2.each_char.each_cons(2).to_a.map(&:join)
#=> ["CC", "CA", "AC", "CG", "GC"]
These are the keys of the hashes produced by dna_to_index for dna1 and dna2, respectively. The hash values are arrays of indices of where the corresponding key begins in the DNA string. Let's compute those hashes for k = 2:
h1 = dna_to_index(dna1, 2)
#=> {"GC"=>[0], "CC"=>[1, 2], "CA"=>[3], "AC"=>[4]}
h2 = dna_to_index(dna2, 2)
#=> {"CC"=>[0], "CA"=>[1], "AC"=>[2], "CG"=>[3], "GC"=>[4]}
h1 shows that:
"GC" begins at index 0 of dna1
"CC" begins at indices 1 and 2 of dna1
"CA" begins at index 3 of dna1
"AC" begins at index 4 of dna1
h2 has a similar interpretation. See Enumerable#flat_map and Array#product.
The method match_kmers is then used to construct the desired array of pairs of indices [i, j] such that the k-mer beginning at index i of dna1 equals the k-mer beginning at index j of dna2.
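To make the pairing step concrete, here is a small sketch that starts from the two k = 2 hashes shown above, written out as literals, and combines them with flat_map and product. The variable names kmer, starts and pairs are illustrative only:

```ruby
# The index hashes for k = 2, copied from the output above.
h1 = { "GC" => [0], "CC" => [1, 2], "CA" => [3], "AC" => [4] }
h2 = { "CC" => [0], "CA" => [1], "AC" => [2], "CG" => [3], "GC" => [4] }

# For every k-mer in h1, pair each of its start indices with each
# start index of the same k-mer in h2 (none if h2 lacks the k-mer).
pairs = h1.flat_map { |kmer, starts| starts.product(h2.fetch(kmer, [])) }
pairs # => [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
```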
Now let's look at the hashes produced for 3-mers (k = 3):
h1 = dna_to_index(dna1, 3)
#=> {"GCC"=>[0], "CCC"=>[1], "CCA"=>[2], "CAC"=>[3]}
h2 = dna_to_index(dna2, 3)
#=> {"CCA"=>[0], "CAC"=>[1], "ACG"=>[2], "CGC"=>[3]}
We see that the first 3-mer in dna1 is "GCC", beginning at index 0. This 3-mer does not appear in dna2, however, so there are no elements [0, X] in the array returned (X being just a placeholder). Nor is "CCC" a key in the second hash. "CCA" and "CAC" are present in the second hash, however, so the array returned is:
h1["CCA"].product(h2["CCA"]) + h1["CAC"].product(h2["CAC"])
#=> [[2, 0]] + [[3, 1]]
#=> [[2, 0], [3, 1]]
I would start by writing a method to enumerate subsequences of a given length (i.e. the k-mers):
class DNA
def initialize(sequence)
@sequence = sequence
end
def each_kmer(length)
return enum_for(:each_kmer, length) unless block_given?
0.upto(@sequence.length - length) { |i| yield @sequence[i, length] }
end
end
DNA.new('GCCCAC').each_kmer(2).to_a
#=> ["GC", "CC", "CC", "CA", "AC"]
On top of this you can easily collect the indices of identical k-mers using a nested loop:
class DNA
# ...
def shared_kmers(length, other)
indices = []
each_kmer(length).with_index do |k, i|
other.each_kmer(length).with_index do |l, j|
indices << [i, j] if k == l
end
end
indices
end
end
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
Unfortunately, the above code traverses other.each_kmer for each k-mer in the receiver. We can optimize this by building a hash containing all indices for each k-mer in other up-front:
class DNA
# ...
def shared_kmers(length, other)
hash = Hash.new { |h, k| h[k] = [] }
other.each_kmer(length).with_index { |k, i| hash[k] << i }
indices = []
each_kmer(length).with_index do |k, i|
hash[k].each { |j| indices << [i, j] }
end
indices
end
end
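For reference, the pieces above can be assembled into one self-contained class and checked against the expected output from the question. This is only a sketch of the hash-based approach; it uses fetch with a default rather than a default block so that lookups for unknown k-mers do not add keys to the index:

```ruby
class DNA
  def initialize(sequence)
    @sequence = sequence
  end

  # Yields every substring of the given length, left to right.
  def each_kmer(length)
    return enum_for(:each_kmer, length) unless block_given?
    0.upto(@sequence.length - length) { |i| yield @sequence[i, length] }
  end

  # Returns all [i, j] where the k-mer at i here equals the k-mer at j in other.
  def shared_kmers(length, other)
    index = Hash.new { |h, k| h[k] = [] }
    other.each_kmer(length).with_index { |kmer, j| index[kmer] << j }
    each_kmer(length).with_index.flat_map do |kmer, i|
      index.fetch(kmer, []).map { |j| [i, j] }
    end
  end
end

dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.shared_kmers(2, dna2) # => [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1) # => [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2) # => [[2, 0], [3, 1]]
```

Building the index costs one pass over other, after which each k-mer in the receiver is a single hash lookup instead of a full scan.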

Ruby, remove super-arrays

If I have an array of arrays, A, and want to get rid of every array in A that has a sub-array in A, how would I do that? In this context, array_1 is a sub-array of array_2 if array_1 - array_2 = []. In the case that multiple arrays are simply rearranged versions of the same elements, bonus points if you can get rid of all but one of them, but you can handle this however you want if it's easier.
In Python, I could easily use a comprehension, with A being a set of frozen sets:
A = {a for a in A if all(b-a for b in A-{a})}
Is there a simple way to write this in Ruby? I don't care whether the order of A or of its arrays is preserved. Also, in my program, none of the arrays have duplicate elements, if that makes things any easier/faster.
Example
A = [[1,6],[1,2],[2,4],[3,5],[1,3,6],[2,3,6]]
# [1,6] is a subarray of [1,3,6], so [1,3,6] should be removed
remove_super_arrays(A)
> A = [[1,6],[1,2],[2,4],[3,5],[2,3,6]]
A = [[1,2,4],[2,3,4],[1,4,5],[2,6]]
# although there is overlap, there are no subarrays, so nothing should be removed
remove_super_arrays(A)
> A = [[1,2,4],[2,3,4],[1,4,5],[2,6]]
A = [[1],[2,1,3],[2,4],[1,4]]
# [1] is a subarray of [2,1,3] and [1,4]
remove_super_arrays(A)
> A = [[1],[2,4]]
Code
def remove_super_arrays(arr)
order = arr.each_with_index.to_a.to_h
sorted = arr.sort_by(&:size)
# i indexes into sorted, so the candidates for sub-arrays are sorted[0,i]
sorted.reject.with_index do |a,i|
sorted[0,i].any? { |aa| (aa.size < a.size) && (aa-a).empty? }
end.sort_by { |a| order[a] }
end
Examples
remove_super_arrays([[1,6],[1,2],[2,4],[3,5],[1,3,6],[2,3,6]] )
#=> [[1,6],[1,2],[2,4],[3,5],[2,3,6]]
remove_super_arrays([[1,2,4],[2,3,4],[1,4,5],[2,6]])
#=> [[1,2,4],[2,3,4],[1,4,5],[2,6]]
remove_super_arrays([[1],[2,1,3],[2,4],[1,4]])
#=> [[1],[2,4]]
Explanation
Consider the first example.
arr = [[1,6],[1,2],[2,4],[3,5],[1,3,6],[2,3,6]]
We first save the positions of the elements of arr:
order = arr.each_with_index.to_a.to_h # save original order
#=> {[1, 6]=>0, [1, 2]=>1, [2, 4]=>2, [3, 5]=>3, [1, 3, 6]=>4, [2, 3, 6]=>5}
Then reject elements of arr:
b = arr.sort_by(&:size)
#=> [[1, 6], [1, 2], [2, 4], [3, 5], [1, 3, 6], [2, 3, 6]]
c = b.reject.with_index do |a,i|
b[0,i].any? { |aa| (aa.size < a.size) && (aa-a).empty? }
end
#=> [[1, 6], [1, 2], [2, 4], [3, 5], [2, 3, 6]]
Lastly, reorder c to correspond to the original ordering of the elements of arr.
c.sort_by { |a| order[a] }
#=> [[1, 6], [1, 2], [2, 4], [3, 5], [2, 3, 6]]
which in this case happens to be the same order as the elements of c.
Let's look more carefully at the calculation of c:
enum1 = b.reject
#=> #<Enumerator: [[1, 6], [1, 2], [2, 4], [3, 5], [1, 3, 6],
# [2, 3, 6]]:reject>
enum2 = enum1.with_index
#=> #<Enumerator: #<Enumerator: [[1, 6], [1, 2], [2, 4], [3, 5],
# [1, 3, 6], [2, 3, 6]]:reject>:with_index>
The first element is generated by the enumerator enum2 and passed to the block and assigned as values of the block variables:
a, i = enum2.next
#=> [[1, 6], 0]
a #=> [1, 6]
i #=> 0
The block calculation is then performed:
d = b[0,i]
#=> []
d.any? { |aa| (aa.size < a.size) && (aa-a).empty? }
#=> false
so a ([1, 6]) is not rejected. The next pair passed to the block by enum2 is [[1, 2], 1]. That value is retained as well, but let's skip ahead to the first element that is rejected, [1, 3, 6]:
a, i = enum2.next
#=> [[1, 2], 1]
a, i = enum2.next
#=> [[2, 4], 2]
a, i = enum2.next
#=> [[3, 5], 3]
a, i = enum2.next
#=> [[1, 3, 6], 4]
a #=> [1, 3, 6]
i #=> 4
Perform the block calculation:
d = b[0,i]
#=> [[1, 6], [1, 2], [2, 4], [3, 5]]
d.any? { |aa| (aa.size < a.size) && (aa-a).empty? }
#=> true
As true is returned, a is rejected. In the last calculation the first element of d is passed to the block and the following calculation is performed:
aa = [1, 6]
(aa.size < a.size)
#=> 2 < 3 => true
(aa-a).empty?
#=> ([1, 6] - [1, 3, 6]).empty? => [].empty? => true
As true && true #=> true, a ([1, 3, 6]) is rejected.
Alternative calculation
The following is a closer match to the OP's Python equivalent, but less efficient:
def remove_super_arrays(arr)
arr.select do |a|
(arr-[a]).all? { |aa| aa.size > a.size || (aa-a).any? }
end
end
or
def remove_super_arrays(arr)
arr.reject do |a|
(arr-[a]).any? { |aa| (aa.size < a.size) && (aa-a).empty? }
end
end
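Since the Python version works on frozen sets, a Set-based variant may be an even closer match. The following is a sketch under that assumption, not part of the answer above; the method name remove_super_sets is hypothetical. It relies on Set#proper_subset? from Ruby's standard library, and the uniq call collapses rearranged copies of the same elements into one:

```ruby
require 'set'

# Hypothetical helper: drops every array that strictly contains
# another array of the collection, treating arrays as sets.
def remove_super_sets(arrays)
  sets = arrays.map(&:to_set).uniq # permutations of one array become one set
  sets.reject { |s| sets.any? { |t| t.proper_subset?(s) } }.map(&:to_a)
end

remove_super_sets([[1, 6], [1, 2], [2, 4], [3, 5], [1, 3, 6], [2, 3, 6]])
# => [[1, 6], [1, 2], [2, 4], [3, 5], [2, 3, 6]]
remove_super_sets([[1], [2, 1, 3], [2, 4], [1, 4]])
# => [[1], [2, 4]]
remove_super_sets([[1, 2], [2, 1], [1, 2, 3]])
# => [[1, 2]]
```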
This was a nice exercise for me. I have used the logic from here.
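My code iterates over each sub-array (except the first) and subtracts it from the first array; when the result is empty, the sub-array contains every element of the first array and is dropped. Note that this only compares against the first array, so it works for the example below but not in general.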
def remove_super_arrays(arr)
arr.each_with_index.with_object([]) do |(sub_array, index), result|
next if index == 0
result << sub_array unless (arr.first - sub_array).empty?
end.unshift(arr.first)
end
arr = [[1,6],[1,2],[2,4],[3,5],[1,3,6],[2,3,6]]
p remove_super_arrays(arr)
#=> [[1, 6], [1, 2], [2, 4], [3, 5], [2, 3, 6]]

How to search within a two-dimensional array

I'm trying to learn how to search within a two-dimensional array; for example:
array = [[1,1], [1,2], [1,3], [2,1], [2,4], [2,5]]
I want to know how to search within the array for the arrays that are of the form [1, y] and then show what the other y numbers are: [1, 2, 3].
If anyone can help me understand how to search only with numbers (as a lot of the examples I found include strings or hashes) and even where to look for the right resources even, that would be helpful.
Ruby allows you to look into an element by using parentheses in the block argument. select and map only assign a single block argument, but you can look into the element:
array.select{|(x, y)| x == 1}
# => [[1, 1], [1, 2], [1, 3]]
array.select{|(x, y)| x == 1}.map{|(x, y)| y}
# => [1, 2, 3]
You can omit the parentheses that correspond to the entire expression between |...|:
array.select{|x, y| x == 1}
# => [[1, 1], [1, 2], [1, 3]]
array.select{|x, y| x == 1}.map{|x, y| y}
# => [1, 2, 3]
As a matter of coding style, it is customary to name unused variables _:
array.select{|x, _| x == 1}
# => [[1, 1], [1, 2], [1, 3]]
array.select{|x, _| x == 1}.map{|_, y| y}
# => [1, 2, 3]
You can use Array#select and Array#map methods:
array = [[1,1], [1,2], [1,3], [2,1], [2,4], [2,5]]
#=> [[1, 1], [1, 2], [1, 3], [2, 1], [2, 4], [2, 5]]
array.select { |el| el[0] == 1 }
#=> [[1, 1], [1, 2], [1, 3]]
array.select { |el| el[0] == 1 }.map {|el| el[1] }
#=> [1, 2, 3]
For more methods on arrays, explore the docs.
If you first select and then map, you can use grep to do it all in one call:
p array.grep(->(x) { x[0] == 1 }, &:last) #=> [1, 2, 3]
Another way of doing the same thing is to use Array#map together with Array#compact. This has the benefit of only requiring one block and a trivial operation, which makes it a bit easier to comprehend.
array.map { |a, b| b if a == 1 }
#=> [1, 2, 3, nil, nil, nil]
array.map { |a, b| b if a == 1 }.compact
#=> [1, 2, 3]
You can use each_with_object:
array.each_with_object([]) { |(x, y), a| a << y if x == 1 }
#=> [1, 2, 3]
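On Ruby 2.7 or later, Enumerable#filter_map combines the select and map steps in a single pass. This is a sketch assuming that Ruby version is available; note that filter_map drops every falsy block result, so it would also discard a y that is itself nil or false:

```ruby
array = [[1, 1], [1, 2], [1, 3], [2, 1], [2, 4], [2, 5]]

# Keep y for every pair [x, y] whose first element is 1.
ys = array.filter_map { |x, y| y if x == 1 }
ys # => [1, 2, 3]
```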

How do I implement Common Lisp's mapcar in Ruby?

I want to implement Lisp's mapcar in Ruby.
Wishful syntax:
mul = -> (*args) { args.reduce(:*) }
mapcar(mul, [1,2,3], [4,5], [6]) would yield [24, nil, nil].
Here is the solution I could think of:
arrs[0].zip(arrs[1], arrs[2]) => [[1, 4, 6], [2, 5, nil], [3, nil, nil]]
Then I could:
[[1, 4, 6], [2, 5, nil], [3, nil, nil]].map do |e|
e.reduce(&mul) unless e.include?(nil)
end
=> [24, nil, nil]
But I'm stuck on the zip part. If the input is [[1], [1,2], [1,2,3], [1,2,3,4]], the zip part would need to change to:
arrs[0].zip(arrs[1], arrs[2], arrs[3])
For two input arrays I could write something like this:
def mapcar2(fn, *arrs)
return [] if arrs.empty? or arrs.include? []
arrs[0].zip(arrs[1]).map do |e|
e.reduce(&fn) unless e.include? nil
end.compact
end
But I do not know how to go beyond two arrays:
def mapcar(fn, *arrs)
# Do not know how to abstract this
# zipped = arrs[0].zip(arrs[1], arrs[2]..., arrs[n-1])
# where n is the size of arrs
zipped.map do |e|
e.reduce(&fn) unless e.include?(nil)
end.compact
end
Does anyone have any advice?
If I got your question properly you just need:
arrs = [[1,2], [3,4], [5,6]]
zipped = arrs[0].zip(*arrs[1..-1])
# => [[1, 3, 5], [2, 4, 6]]
Or a nicer alternative, IMHO:
zipped = arrs.first.zip(*arrs.drop(1))
If all arrays inside arrs are of the same length you can use the transpose method:
arrs = [[1,2], [3,4], [5,6]]
arrs.transpose
# => [[1, 3, 5], [2, 4, 6]]
Following toro2k's answer, one possible implementation of mapcar in Ruby:
def mapcar(fn, *arrs)
return [] if arrs.empty? or arrs.include? []
transposed = if arrs.all? { |a| arrs.first.size == a.size }
arrs.transpose
else
arrs[0].zip(*arrs.drop(1))
end
transposed.map do |e|
e.reduce(&fn) unless e.include? nil
end.compact
end
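An alternative sketch avoids zip and the nil check altogether by stopping at the shortest list, which is how Common Lisp's mapcar behaves, and by calling fn with one element from each list. This is an illustration rather than the approach of the answers above; the name mapcar_shortest is hypothetical:

```ruby
# Hypothetical variant: applies fn to the i-th elements of all lists,
# for as many positions as the shortest list has.
def mapcar_shortest(fn, *lists)
  return [] if lists.empty?
  shortest = lists.map(&:length).min
  Array.new(shortest) { |i| fn.call(*lists.map { |list| list[i] }) }
end

mul = ->(*args) { args.reduce(:*) }
mapcar_shortest(mul, [1, 2, 3], [4, 5], [6])                 # => [24]
mapcar_shortest(->(a, b) { a + b }, [1, 2, 3], [10, 20, 30]) # => [11, 22, 33]
```

Because it never tests for nil, this version also works when nil is a legitimate element of the input lists.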

Array of arrays into array of hashes

I want to convert, in Ruby,
[[1, 1], [2, 3], [3, 5], [4, 1], [1, 2], [2, 3], [3, 5], [4, 1]]
into
[{1=>1}, {2=>3}, {3=>5}, {4=>1}, {1=>2}, {2=>3}, {3=>5}, {4=>1}]
and after this to obtain sum of all different keys:
{1=>3,2=>6,3=>10,4=>2}
For the second question:
sum = Hash.new(0)
original_array.each{|x, y| sum[x] += y}
sum # => {1 => 3, 2 => 6, 3 => 10, 4 => 2}
Functional approach:
xs = [[1, 1], [2, 3], [3, 5], [4, 1], [1, 2], [2, 3], [3, 5], [4, 1]]
Hash[xs.group_by(&:first).map do |k, pairs|
[k, pairs.map { |x, y| y }.inject(:+)]
end]
#=> {1=>3, 2=>6, 3=>10, 4=>2}
Using Facets is much simpler thanks to the abstractions map_by (a variation of group_by) and mash (map + Hash):
require 'facets'
xs.map_by { |k, v| [k, v] }.mash { |k, vs| [k, vs.inject(:+)] }
#=> {1=>3, 2=>6, 3=>10, 4=>2}
You don't need the intermediate form.
arrays = [[1, 1], [2, 3], [3, 5], [4, 1], [1, 2], [2, 3], [3, 5], [4, 1]]
aggregate = arrays.each_with_object Hash.new do |(key, value), hash|
hash[key] = hash.fetch(key, 0) + value
end
aggregate # => {1=>3, 2=>6, 3=>10, 4=>2}
arr= [[1, 1], [2, 3], [3, 5], [4, 1], [1, 2], [2, 3], [3, 5], [4, 1]]
final = Hash.new(0)
second_step = arr.inject([]) do |acc,inner|
acc << Hash[*inner]
final[inner.first] += inner.last
acc
end
second_step
#=> [{1=>1}, {2=>3}, {3=>5}, {4=>1}, {1=>2}, {2=>3}, {3=>5}, {4=>1}]
final
#=> {1=>3, 2=>6, 3=>10, 4=>2}
If you only need the last step:
arr.inject(Hash.new(0)){|hash,inner| hash[inner.first] += inner.last;hash}
#=> {1=>3, 2=>6, 3=>10, 4=>2}
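If the intermediate array of single-pair hashes is wanted for its own sake, both steps can be written quite directly. The following is a sketch with illustrative variable names (pairs, hashes, totals), using Hash#merge with a block to add the values of colliding keys:

```ruby
pairs = [[1, 1], [2, 3], [3, 5], [4, 1], [1, 2], [2, 3], [3, 5], [4, 1]]

# Step 1: one single-pair hash per inner array.
hashes = pairs.map { |key, value| { key => value } }
# => [{1=>1}, {2=>3}, {3=>5}, {4=>1}, {1=>2}, {2=>3}, {3=>5}, {4=>1}]

# Step 2: merge them all, summing the values whenever a key repeats.
totals = hashes.reduce({}) do |acc, h|
  acc.merge(h) { |_key, old, new| old + new }
end
totals # => {1=>3, 2=>6, 3=>10, 4=>2}
```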
