Iterate two collections at the same time - ruby

a = [1,2,3]
b = [4,5 ]
What I want is to iterate over these two collections at the same time and do something with each pair of elements. The pseudocode would look like:
for i in a
for j in b
collect i * j
When one collection runs out of elements, the loop stops.
The result will be [4, 10].
What I have is this:
a = [1,2,3]
b = [4,5 ]
a.zip(b).reject { |c| c.any? { |d| d.nil? } }.map { |e| e.reduce(&:*) }
Any better solution? Thanks!
The perfect solution I am looking for would match the intent of my pseudocode.

You can do this:
a, b = b, a if b.length < a.length
a.zip(b).map { |ia, ib| ia * ib }
# => [4, 10]
The first line makes sure that array a has at most as many elements as array b. This matters because zip returns one entry per element of the receiver and pads with nil when the argument is shorter. Making a the shorter array ensures that there are no nils.
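To see why the swap matters, here is a small sketch of how zip behaves depending on which array is the receiver. It reuses the question's a and b; the names long_first, short_first and products are only illustrative.

```ruby
a = [1, 2, 3]
b = [4, 5]

# zip yields one entry per element of the receiver,
# padding with nil when the argument runs out.
long_first = a.zip(b)    # => [[1, 4], [2, 5], [3, nil]]

# With the shorter array as the receiver, the extra
# elements of the argument are simply ignored.
short_first = b.zip(a)   # => [[4, 1], [5, 2]]

products = short_first.map { |x, y| x * y }   # => [4, 10]
```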

Here is another way to do it:
[a.length, b.length].min.times.map {|i| a[i]*b[i] }
The idea is that you take the shorter of the two array lengths, [a.length, b.length].min, and you iterate that many times over an integer, i, which you use as an index into the arrays.
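If you would rather not swap or index by hand, the same idea can be wrapped in a small method. This is only a sketch; pairwise_products is a hypothetical helper name, not something from the answers above.

```ruby
# Hypothetical helper: multiplies elements pairwise and stops
# at the end of the shorter array, without swapping a and b.
def pairwise_products(a, b)
  n = [a.length, b.length].min
  # Truncate the receiver first so zip never has to pad with nil.
  a.first(n).zip(b).map { |x, y| x * y }
end

pairwise_products([1, 2, 3], [4, 5])   # => [4, 10]
pairwise_products([4, 5], [1, 2, 3])   # => [4, 10]
```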

Related

How to determine whether an array is contained in another array

The question is, given [1,2,3,4,5] and [2,4,5], to determine whether (every element in) the second array is contained in the first one. The answer is true.
What's the most succinct and efficient way to do better than:
arr2.reject { |e| arr1.include?(e) } .empty?
Array subtraction should work, as in
(arr2 - arr1).empty?
Description of method:
Returns a new array that is a copy of the original array, removing any
items that also appear in [the second array]. The order is preserved from the
original array.
It compares elements using their hash and eql? methods for efficiency.
I don't consider myself an expert on efficiency, but @Ryan indicated in comments to his answer that it's reasonably efficient at scale.
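A quick sketch of the subtraction check on the arrays from the question, plus one caveat that is easy to miss: subtraction does not care how often a value occurs. The variable names are only illustrative.

```ruby
arr1 = [1, 2, 3, 4, 5]
arr2 = [2, 4, 5]

contained     = (arr2 - arr1).empty?     # => true
not_contained = ([2, 6] - arr1).empty?   # => false, 6 is left over

# Caveat: every 2 is removed, even though arr1 holds only one.
duplicates = ([2, 2] - arr1).empty?      # => true
```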
The bad O(n²) one-liner would look like this:
arr2.all? { |x| arr1.include? x }
arr2.all? &arr1.method(:include?) # alternative
If your objects are hashable, you can make this O(n) by making a set out of the first array:
require 'set'
arr2.all? &Set.new(arr1).method(:include?)
If your objects are totally, like, ordered, you can make it O(n log n) with a sort and a binary search:
arr1.sort!
arr2.all? { |x| arr1.bsearch { |y| x <=> y } }
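Note that sort! reorders arr1 in place. If the caller still needs the original order, a sorted copy works just as well. The following is a sketch under that assumption, with illustrative variable names:

```ruby
arr1 = [5, 3, 1, 4, 2]
arr2 = [2, 4, 5]

sorted = arr1.sort   # sorted copy; arr1 itself is left untouched

# bsearch in find-any mode: the block returns 0 on a match,
# a negative number to search left and a positive one to search right.
all_found    = arr2.all? { |x| sorted.bsearch { |y| x <=> y } }     # => true
some_missing = [2, 6].all? { |x| sorted.bsearch { |y| x <=> y } }   # => false
```

Because the result of bsearch is used as a truth value here, this only works when the elements themselves are neither nil nor false.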
As mentioned by @Ryan, you can use sets. In that case Set#subset? is available to you, which is pretty readable (note the two different ways of defining a set from an array):
require 'set'
s1 = Set.new([1, 2, 3])
s2 = [1, 2].to_set
s3 = [1, 3].to_set
s4 = [1, 4].to_set
s1.subset? s1 #=> true
s2.subset? s1 #=> true
s3.subset? s1 #=> true
s4.subset? s1 #=> false
Also consider using Set#proper_subset? if required.
s1.proper_subset? s1 #=> false
s2.proper_subset? s1 #=> true
NB A set contains no duplicate elements e.g. Set.new([1,2,3,3]) #=> #<Set: {1, 2, 3}>
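If duplicates do matter for your definition of "contained", neither sets nor array subtraction will notice them. One possible approach is to count occurrences. This is a sketch, and contains_with_counts? is a hypothetical helper, not part of Set or Array:

```ruby
require 'set'

# Hypothetical helper: true if arr1 contains every element of arr2
# at least as many times as arr2 does.
def contains_with_counts?(arr1, arr2)
  available = Hash.new(0)
  arr1.each { |e| available[e] += 1 }
  arr2.all? do |e|
    available[e] -= 1     # use up one occurrence
    available[e] >= 0     # fails once arr2 needs more than arr1 has
  end
end

[2, 2].to_set.subset?([1, 2, 3].to_set)       # => true, duplicates collapse
contains_with_counts?([1, 2, 3], [2, 2])      # => false
contains_with_counts?([1, 2, 2, 3], [2, 2])   # => true
```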

Merge sort algorithm using recursion

I'm doing The Odin Project. The practice problem is: create a merge sort algorithm using recursion. The following is modified from someone's solution:
def merge_sort(arry)
  # kick out the odds or kick out of the recursive splitting?
  # I wasn't able to get the recombination to work within the same method.
  return arry if arry.length == 1
  arry1 = merge_sort(arry[0...arry.length/2])
  arry2 = merge_sort(arry[arry.length/2..-1])
  f_arry = []
  index1 = 0 # placekeeper for iterating through arry1
  index2 = 0 # placekeeper for iterating through arry2
  # stops when f_arry is as long as combined subarrays
  while f_arry.length < (arry1.length + arry2.length)
    if index1 == arry1.length
      # pushes remainder of arry2 to f_arry
      # not sure why it needs to be flatten(ed)!
      (f_arry << arry2[index2..-1]).flatten!
    elsif index2 == arry2.length
      (f_arry << arry1[index1..-1]).flatten!
    elsif arry1[index1] <= arry2[index2]
      f_arry << arry1[index1]
      index1 += 1
    else
      f_arry << arry2[index2]
      index2 += 1
    end
  end
  return f_arry
end
Is the first line return arry if arry.length == 1 kicking it out of the recursive splitting of the array(s) and then bypassing the recursive splitting part of the method to go back to the recombination section? It seems like it should then just keep resplitting it once it gets back to that section as it recurses through.
Why must it be flatten-ed?
The easiest way to understand the first line is to understand that the only contract that merge_sort is bound to is to "return a sorted array" - if the array has only one element (arry.length == 1) it is already sorted - so nothing needs to be done! Simply return the array itself.
In recursion, this is known as a "Stop condition". If you don't provide a stop condition - the recursion will never end (since it will always call itself - and never return)!
The reason you need to flatten the result is that you are pushing an array as a single element into your resulting array:
arr = [1]
arr << [2, 3]
# => [1, [2, 3]]
If you try to flatten the resulting array only at the end of the iteration, and not as you are adding the elements, you'll have a problem, since its length will be skewed:
arr = [1, [2, 3]]
arr.length
# => 2
Although arr contains three numbers it has only two elements - and that will break your solution.
You want all the elements in your array to be numbers, not arrays. flatten! makes sure that all elements in your array are atoms, and if they are not, it adds the child array's elements to itself instead of the child array:
arr.flatten!
# => [1, 2, 3]
Another option you might want to consider (and which will be more efficient) is to use concat instead:
arr = [1]
arr.concat([2, 3])
# => [1, 2, 3]
This method adds all the elements of the array passed as a parameter to the array it is called on.
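Putting the two points together, here is a sketch of how the method could look with concat in place of << and flatten!. It is a variant, not the asker's original code: merge_sort_concat and its local names are made up for this example, and the stop condition is widened to length <= 1 so that an empty array also returns immediately.

```ruby
def merge_sort_concat(arry)
  # Stop condition: arrays of 0 or 1 elements are already sorted.
  return arry if arry.length <= 1

  mid   = arry.length / 2
  left  = merge_sort_concat(arry[0...mid])
  right = merge_sort_concat(arry[mid..-1])

  merged = []
  i = 0 # position in left
  j = 0 # position in right
  # Take the smaller head element until one side runs out.
  while i < left.length && j < right.length
    if left[i] <= right[j]
      merged << left[i]
      i += 1
    else
      merged << right[j]
      j += 1
    end
  end

  # concat appends the elements themselves, so nothing needs flattening.
  merged.concat(left[i..-1]).concat(right[j..-1])
end

merge_sort_concat([5, 3, 8, 1, 9, 2])   # => [1, 2, 3, 5, 8, 9]
```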

Efficient way of removing similar arrays in an array of arrays

I am trying to analyze some documents and find similarities in them. After analysis, I have an array, the elements of which are arrays of data from documents considered similar. But sometimes I have two nearly identical elements, and naturally I want to keep the biggest of them. For simplification:
data = [[1,2,3,4,5,6], [7,8,9,10], [1,2,3,5,6]...]
How do I efficiently process the data so that I get:
data = [[1,2,3,4,5,6], [7,8,9,10]...]
I suppose I could intersect every pair of arrays, and if the intersection matches one of the original arrays, I ignore it. Here is some quick code I wrote:
data = [[1,2,3,4,5,6], [7,8,9,10], [1,2,3,5,6], [7,9,10]]
cleaned = []
data.each_index do |i|
  similar = false
  data.each_index do |j|
    if i == j
      next
    elsif data[i]&data[j] == data[i]
      similar = true
      break
    end
  end
  unless similar
    cleaned << data[i]
  end
end
puts cleaned.inspect
Is this an efficient way to go? Also, the current behaviour only drops arrays that are a few elements short of another one, and I might want to merge similar arrays if they occur:
[[1,2,3,4,5], [1,3,4,5,6]] => [[1,2,3,4,5,6]]
You can delete any element in the list if it is fully contained in another element:
data.delete_if do |arr|
  data.any? { |a2| !a2.equal?(arr) && arr - a2 == [] }
end
# => [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]]
This is a bit more efficient than your suggestion since once you decide that an element should be removed, you don't check against it in the next iterations.
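If you would rather not modify data while iterating over it, a non-destructive variant is possible. The sketch below uses a hypothetical helper, drop_contained, and removes exact duplicates with uniq first so that two equal arrays do not eliminate each other.

```ruby
# Hypothetical helper: returns a new list without the arrays whose
# elements all appear in some other array of the list.
def drop_contained(data)
  unique = data.uniq   # collapse exact duplicates first
  unique.reject do |arr|
    unique.any? { |other| !other.equal?(arr) && (arr - other).empty? }
  end
end

data = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10], [1, 2, 3, 5, 6], [7, 9, 10]]
cleaned = drop_contained(data)
# cleaned => [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]]
# data is unchanged
```

One limitation to be aware of: two arrays holding the same values in a different order (say [1, 2] and [2, 1]) contain each other, so both would be dropped.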

Get index of array element faster than O(n)

Suppose I have a HUGE array and a value from it, and I want to get the index of that value in the array. Is there any other way than calling Array#index to get it? The problem comes from the need to keep a really huge array and call Array#index an enormous number of times.
After a couple of tries I found that caching indexes inside elements, by storing structs with (value, index) fields instead of the value itself, gives a huge performance gain (about 20x).
Still, I wonder if there's a more convenient way of finding the index of an element without caching (or a good caching technique that will boost performance).
Why not use index or rindex?
array = %w( a b c d e)
# get FIRST index of element searched
puts array.index('a')
# get LAST index of element searched
puts array.rindex('a')
index: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-index
rindex: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-rindex
Convert the array into a hash. Then look for the key.
array = ['a', 'b', 'c']
hash = Hash[array.map.with_index.to_a] # => {"a"=>0, "b"=>1, "c"=>2}
hash['b'] # => 1
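The gain only materialises if the hash is built once and reused for many lookups. Here is a small sketch of that pattern; first_index is an illustrative name, and the explicit key? check keeps the first occurrence when a value is repeated (the Hash[...] form above would keep the last one).

```ruby
array = ['a', 'b', 'c', 'a']

# Build the lookup table once, which costs O(n) ...
first_index = {}
array.each_with_index do |value, i|
  first_index[value] = i unless first_index.key?(value)  # keep the first occurrence
end

# ... then every lookup is a hash access instead of a scan.
first_index['a']   # => 0
first_index['c']   # => 2
first_index['z']   # => nil
```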
Other answers don't take into account the possibility of an entry listed multiple times in an array. This will return a hash where each key is a unique object in the array and each value is an array of indices that corresponds to where the object lives:
a = [1, 2, 3, 1, 2, 3, 4]
=> [1, 2, 3, 1, 2, 3, 4]
indices = a.each_with_index.inject(Hash.new { Array.new }) do |hash, (obj, i)|
  hash[obj] += [i]
  hash
end
=> { 1 => [0, 3], 2 => [1, 4], 3 => [2, 5], 4 => [6] }
This allows for a quick search for duplicate entries:
indices.select { |k, v| v.size > 1 }
=> { 1 => [0, 3], 2 => [1, 4], 3 => [2, 5] }
Is there a good reason not to use a hash? Lookups are O(1) vs. O(n) for the array.
If your array has a natural order use binary search.
Use binary search.
Binary search has O(log n) access time.
Here are the steps on how to use binary search,
What is the ordering of your array? For example, is it sorted by name?
Use bsearch to find elements or indices
Code example
# assume array is sorted by name!
array.bsearch { |each| "Jamie" <=> each.name } # returns element
(0...array.size).bsearch { |n| "Jamie" <=> array[n].name } # returns index
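A self-contained version of the example above, for trying it out. Person is a hypothetical record type introduced only for this sketch; Array#bsearch_index (Ruby 2.3 and later) is shown as a shorter way to get the index.

```ruby
# Person is a hypothetical record type used only for this example.
Person = Struct.new(:name)
people = %w[Alice Bob Jamie Zoe].map { |n| Person.new(n) }  # already sorted by name

element = people.bsearch { |person| "Jamie" <=> person.name }          # the Person named Jamie
index   = (0...people.size).bsearch { |n| "Jamie" <=> people[n].name } # => 2
same    = people.bsearch_index { |person| "Jamie" <=> person.name }    # => 2
missing = people.bsearch_index { |person| "Carol" <=> person.name }    # => nil
```

The exclusive range (0...people.size) matters: people[people.size] is nil, so an inclusive range could call name on nil.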
If it's a sorted array you could use a Binary search algorithm (O(log n)). For example, extending the Array-class with this functionality:
class Array
  def b_search(value, lower_index = 0, upper_index = length - 1)
    return if lower_index > upper_index # value is not in the array
    midpoint_index = (lower_index + upper_index) / 2
    return midpoint_index if self[midpoint_index] == value
    if value < self[midpoint_index]
      # continue in the lower half
      b_search(value, lower_index, midpoint_index - 1)
    else
      # continue in the upper half
      b_search(value, midpoint_index + 1, upper_index)
    end
  end
end
Taking a combination of @sawa's answer and the comment listed there, you could implement a "quick" index and rindex on the Array class.
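class Array
  def quick_index el
    # Built from the reversed array so that the first occurrence wins on duplicates
    hash = Hash[self.reverse.map.with_index.to_a]
    i = hash[el]
    i && length - 1 - i # nil if el is not present
  end
  def quick_rindex el
    # Later pairs overwrite earlier ones, so the last occurrence wins
    hash = Hash[self.map.with_index.to_a]
    hash[el]
  end
end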
Still, I wonder if there's a more convenient way of finding the index of an element without caching (or a good caching technique that will boost performance).
You can use binary search (if your array is ordered and the values you store in the array are comparable in some way). For that to work you need to be able to tell the binary search whether it should be looking "to the left" or "to the right" of the current element. But I believe there is nothing wrong with storing the index at insertion time and then using it if you are getting the element from the same array.
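As a sketch of what "storing the index at insertion time" could look like, here is a small wrapper class. IndexedList and its methods are hypothetical names for this example, and the sketch assumes elements are only ever appended; removing or inserting in the middle would require updating the stored positions.

```ruby
# Hypothetical wrapper that records each value's position when it is appended.
class IndexedList
  def initialize
    @items = []
    @positions = {}   # value => index of its first occurrence
  end

  # Append a value and remember where it went.
  def <<(value)
    @positions[value] = @items.length unless @positions.key?(value)
    @items << value
    self
  end

  # Hash lookup instead of a linear scan.
  def index(value)
    @positions[value]
  end

  def [](i)
    @items[i]
  end
end

list = IndexedList.new
list << "x" << "y" << "x"
list.index("y")   # => 1
list.index("x")   # => 0
list.index("z")   # => nil
```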

How to count in a loop?

I'm new to Ruby, how can I count elements in a loop?
In Java I would write it like this
int[] tablica = { 23,53,23,13 };
int sum = 0;
for (int i = 0; i <= 1; i++) { // **only first two**
sum += tablica[i];
}
System.out.println(sum);
EDIT: I want only first two
You can sum all the elements in an array like this:
arr = [1,2,3,4,5,6]
arr.inject(:+)
# => 21
# any binary operator can be used here; it is
# applied between the elements (if you use - for example
# you will get 1-2-3-4-5-6)
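For instance, a short sketch with three different operators on the same array (the variable names are only illustrative):

```ruby
arr = [1, 2, 3, 4, 5, 6]

sum        = arr.inject(:+)   # 1+2+3+4+5+6 => 21
difference = arr.inject(:-)   # 1-2-3-4-5-6 => -19
product    = arr.inject(:*)   # 1*2*3*4*5*6 => 720
```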
Or, if you want to iterate over the elements:
arr.each do |element|
  do_something_with(element)
end
Or, if you need the index too:
arr.each_with_index do |element, index|
  puts "#{index}: #{element}"
end
tablica.take(2).reduce(:+)
But seriously? What's wrong with just
tablica[0] + tablica[1]
Hey, it even works in Ruby and Java … and C, C++, Objective-C, Objective-C++, D, C#, ECMAScript, PHP, Python. Without changes.
There are many ways, but if you want the current object and a counter use the each_with_index method
some_collection.each_with_index do |o, i|
  # 'o' is your object, 'i' is your index
end
EDIT: Oops, read that too quickly. You can do this
sum = 0
some_collection.each { |i| sum += i }
With Enumerable#inject:
tablica = [23, 53, 23, 13]
tablica.inject(0, :+) # 112
If you just need a sum, here is a simple way:
tablica = [ 23,53,23,13 ]
puts tablica.inject(0){|sum,current_number| sum+current_number}
For first two elements (or whatever contiguous range) you can use a range:
tablica = [ 23,53,23,13 ]
puts tablica[0..1].inject(0){|sum,current_number| sum+current_number}
What this does:
The block (the statement within {...}) is called internally by inject, once for each element in the array.
At the first iteration, sum has the initial value 0 (that we passed to inject)
And current_number contains the 0th element in the array.
We add the two values (0 and 23) and this value gets assigned to sum when the block returns.
Then on the next iteration, we get sum variable as 23 and current_number as 53. And the process repeats.
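The walk-through above can be checked by recording what the block receives on each call. This is a sketch; steps and total are names introduced for the example.

```ruby
tablica = [23, 53, 23, 13]

steps = []
total = tablica[0..1].inject(0) do |sum, current_number|
  steps << [sum, current_number]   # remember what the block was given
  sum + current_number             # becomes sum on the next call
end

steps   # => [[0, 23], [23, 53]]
total   # => 76
```

On Ruby 2.4 or later, tablica.first(2).sum gives the same 76 without an explicit block.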
