Ruby - Find element not in common for two arrays

I've been thinking about the following problem: there are two arrays, and I need to find the elements that are not common to both. For example:
a = [1,2,3,4]
b = [1,2,4]
And the expected answer is [3].
So far I've been doing it like this:
a.select { |elem| !b.include?(elem) }
But it gives me O(N ** 2) time complexity. I'm sure it can be done faster ;)
Also, I've been thinking about getting it somehow like this (using some method opposite to & which gives common elements of 2 arrays):
a !& b #=> doesn't work of course
Another way might be to add two arrays and find the unique element with some method similar to uniq, so that:
[1,1,2,2,3,4,4].some_method #=> would return 3

The simplest (in terms of using only the arrays already in place and stock array methods, anyway) solution is the union of the differences:
a = [1,2,3,4]
b = [1,2,4]
(a-b) | (b-a)
=> [3]
This may or may not be better than O(n**2). There are other options which are likely to give better performance (see other answers/comments).
Edit: Here's a quick-ish implementation of the sort-and-iterate approach (this assumes no array has repeated elements; otherwise it will need to be modified depending on what behavior is wanted in that case). If anyone can come up with a shorter way to do it, I'd be interested. The limiting factor is the sort used. I assume Ruby uses some sort of Quicksort, so complexity averages O(n log n) with possible worst-case of O(n**2); if the arrays are already sorted, then of course the two calls to sort can be removed and it will run in O(n).
def diff a, b
  a = a.sort
  b = b.sort
  result = []
  bi = 0
  ai = 0
  while (ai < a.size && bi < b.size)
    if a[ai] == b[bi]
      ai += 1
      bi += 1
    elsif a[ai]<b[bi]
      result << a[ai]
      ai += 1
    else
      result << b[bi]
      bi += 1
    end
  end
  result += a[ai, a.size-ai] if ai<a.size
  result += b[bi, b.size-bi] if bi<b.size
  result
end

As @iamnotmaynard noted in the comments, this is traditionally a set operation (called the symmetric difference). Ruby's Set class includes this operation, so the most idiomatic way to express it would be with a Set:
require 'set'
Set.new(a) ^ b
That should give O(n) performance on average (since a hash-based set membership test is expected constant-time).
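A minimal sketch of the Set approach wrapped in a helper. The method name `symmetric_difference` is made up for illustration and is not part of Ruby. A Set does not promise to keep the original array order, so the result is sorted here to make it predictable.

```ruby
require 'set'

# Hypothetical helper: elements that appear in exactly one of the two arrays.
def symmetric_difference(a, b)
  # Set#^ accepts any enumerable on the right-hand side.
  (Set.new(a) ^ b).to_a.sort
end

symmetric_difference([1, 2, 3, 4], [1, 2, 4]) # => [3]
symmetric_difference([1, 2, 3], [2, 3, 4])    # => [1, 4]
```

Sorting assumes the elements are mutually comparable; drop the `sort` call if they are not.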

a = [1, 2, 3]
b = [2, 3, 4]
a + b - (a & b)
# => [1, 4]
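The idea of concatenating both arrays and keeping whatever occurs only once can also be sketched with a hash of counts, which makes a single pass over the combined data. This assumes neither array contains duplicates of its own; `elements_seen_once` is a hypothetical name.

```ruby
# Hypothetical helper: count every element of a + b, keep those seen once.
# Assumes a and b each contain no repeated elements.
def elements_seen_once(a, b)
  counts = Hash.new(0)
  (a + b).each { |elem| counts[elem] += 1 }
  counts.select { |_elem, count| count == 1 }.keys
end

elements_seen_once([1, 2, 3], [2, 3, 4])    # => [1, 4]
elements_seen_once([1, 2, 3, 4], [1, 2, 4]) # => [3]
```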

The elements on which the two arrays diverge can be found like this:
a = [1, 2, 3]
b = [2, 3, 4]
(a - b) | (b - a)
# => [1, 4]
You can also read my blog post about Array coherences

Related

JAVA - Recursive function to determine all prime factors of n

What I have problems understanding is how can I write a recursive method that adds to the array, if I can only have n as parameter to my function.
You have two cases: base case and recursion case. For your problem, the high-level logic looks like this:
if n is prime
    return array(n) // return a one-element array
else {
    find a prime divisor, p
    // return an array of p and the factorisation of n/p
    return array(p, FACTORIZATION(n/p) )
}
Does that get you moving? You'll need to know how to make and append to arrays in your chosen language, but those are implementation details.
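The high-level logic above could be written out as the following Ruby sketch, with n as the only parameter. The method name `prime_factors` is an assumption, and plain trial division up to the square root stands in for "find a prime divisor" (the smallest divisor greater than 1 is always prime).

```ruby
# Hypothetical sketch of the base case / recursion case split.
def prime_factors(n)
  return [] if n < 2 # nothing to factor

  # Smallest divisor in 2..sqrt(n); nil means n itself is prime.
  divisor = (2..Integer.sqrt(n)).find { |d| (n % d).zero? }

  return [n] if divisor.nil?            # base case: one-element array
  [divisor] + prime_factors(n / divisor) # recursion case
end

prime_factors(12)                            # => [2, 2, 3]
prime_factors(3 * 5 * 5 * 7 * 7 * 31 * 101)  # => [3, 5, 5, 7, 7, 31, 101]
```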
It would look like either this:
def factorize(n):
    factors = list()
    found = False
    t = 2
    while t*t <= n and not found:
        while (n % t) == 0:
            # divisible by t
            factors.append(t)
            found = True
            n //= t
        t += 1
    if found:
        # n == 1 means everything has already been divided out
        if n > 1:
            factors.extend(factorize(n))
    else:
        factors.append(n)
    return factors
factorize(3*5*5*7*7*31*101)
# --> [3, 5, 5, 7, 7, 31, 101]
Which is a naive approach, to keep it simple. Or you allow some more (named) arguments to your recursive function, which would also allow passing a list. Like:
def factorize2(n, result=None, t=2):
    if result is not None:
        factors = result
    else:
        factors = list()
    found = False
    while t*t <= n and not found:
        while (n % t) == 0:
            factors.append(t)
            found = True
            n //= t
        t += 1
    if found:
        # t has already been advanced past the factor just found;
        # n == 1 means there is nothing left to factor
        if n > 1:
            factorize2(n, factors, t)
    else:
        factors.append(n)
    return factors
The basic difference is that here you reuse the list the top-level call created. This way you might give the garbage collector a little less work (for a factorization function this probably doesn't make much of a difference, but in other cases I think it does). The second point is that you have already tested some factors and don't have to retest them. This is why I pass t.
Of course this is still naive. You can easily improve the performance by avoiding the t*t <= n check in each iteration, by only testing t when t is -1 or 1 mod 6, and so on.
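As a rough illustration of the "-1 or 1 mod 6" idea, here is a Ruby sketch of trial division that handles 2 and 3 first and then only tries candidates of the form 6k - 1 and 6k + 1. The name `factorize_wheel` is hypothetical, and this is one possible reading of the optimisation rather than a tuned implementation.

```ruby
# Hypothetical sketch: trial division that skips multiples of 2 and 3.
def factorize_wheel(n)
  factors = []
  return factors if n < 2

  # Divide out 2 and 3 up front.
  [2, 3].each do |prime|
    while (n % prime).zero?
      factors << prime
      n /= prime
    end
  end

  # Remaining candidates are 5, 7, 11, 13, ... (6k - 1 and 6k + 1).
  t = 5
  while t * t <= n
    [t, t + 2].each do |candidate|
      while (n % candidate).zero?
        factors << candidate
        n /= candidate
      end
    end
    t += 6
  end

  factors << n if n > 1 # whatever is left is prime
  factors
end

factorize_wheel(3 * 5 * 5 * 7 * 7 * 31 * 101) # => [3, 5, 5, 7, 7, 31, 101]
```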
Another approach to have in your toolbox is to not return an array, but rather a linked list. That is a data structure where each piece of data links to the next, links to the next, and so on. Factorization doesn't really show the power of it, but here it is anyways:
def factorize(n, start=2):
    i = start
    while i < n:
        if n % i == 0:
            return [i, factorize(n//i, i)]
        elif n < i*i:
            break
        i = i + 1
    if 1 < n:
        return [n, None]
    else:
        return None
print(factorize(3*5*5*7*7*31*101)) # [3, [5, [5, [7, [7, [31, [101, None]]]]]]]
The win with this approach is that it does not modify the returned data structure. So if you're doing something like searching for an optimal path through a graph, you can track multiple next moves without conflict. I find this particularly useful when modifying dynamic programming algorithms to actually find the best solution, rather than to report how good it is.
The one difficulty is that you wind up with a nested data structure. But you can always flatten it as follows:
def flatten_linked_list (ll):
    answer = []
    while ll is not None:
        answer.append(ll[0])
        ll = ll[1]
    return answer
# prints [3, 5, 5, 7, 7, 31, 101]
print(flatten_linked_list( factorize(3*5*5*7*7*31*101) ))
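The point about tracking several next moves without conflict can be illustrated with a small Ruby sketch in which two lists share the same tail. The helper names `cons` and `list_to_array` are invented for this example.

```ruby
# Hypothetical helpers for a linked list built from frozen [head, tail] pairs.
def cons(head, tail)
  [head, tail].freeze
end

def list_to_array(list)
  result = []
  until list.nil?
    result << list[0]
    list = list[1]
  end
  result
end

shared = cons(5, cons(7, nil)) # common tail
path_a = cons(2, shared)       # one candidate move
path_b = cons(3, shared)       # another candidate move, same tail

list_to_array(path_a) # => [2, 5, 7]
list_to_array(path_b) # => [3, 5, 7]
```

Neither path copies or mutates the shared tail, which is why the two candidates cannot interfere with each other.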
Here are two recursive methods. The first has one parameter and uses a while loop to find a divisor, then uses divide and conquer to recursively query the factors for each of the two results of that division. (This method is more of an exercise since we can easily do this much more efficiently.) The second relies on a second parameter as a pointer to the current prime factor, which allows for a much more efficient direct enumeration.
JavaScript code:
function f(n){
  let a = ~~Math.sqrt(n)
  while (n % a)
    a--
  if (a < 2)
    return n == 1 ? [] : [n]
  let b = n / a
  let [fa, fb] = [f(a), f(b)]
  return (fa.length > 1 ? fa : [a]).concat(
    fb.length > 1 ? fb : [b])
}
function g(n, d=2){
  if (n % d && d*d < n)
    return g(n, d == 2 ? 3 : d + 2)
  else if (d*d <= n)
    return [d].concat(g(n / d, d))
  return n == 1 ? [] : [n]
}
console.log(f(567))
console.log(g(12345))

Merge sort algorithm using recursion

I'm doing The Odin Project. The practice problem is: create a merge sort algorithm using recursion. The following is modified from someone's solution:
def merge_sort(arry)
  # kick out the odds or kick out of the recursive splitting?
  # I wasn't able to get the recombination to work within the same method.
  return arry if arry.length == 1
  arry1 = merge_sort(arry[0...arry.length/2])
  arry2 = merge_sort(arry[arry.length/2..-1])
  f_arry = []
  index1 = 0 # placekeeper for iterating through arry1
  index2 = 0 # placekeeper for iterating through arry2
  # stops when f_arry is as long as combined subarrays
  while f_arry.length < (arry1.length + arry2.length)
    if index1 == arry1.length
      # pushes remainder of arry2 to f_arry
      # not sure why it needs to be flatten(ed)!
      (f_arry << arry2[index2..-1]).flatten!
    elsif index2 == arry2.length
      (f_arry << arry1[index1..-1]).flatten!
    elsif arry1[index1] <= arry2[index2]
      f_arry << arry1[index1]
      index1 += 1
    else
      f_arry << arry2[index2]
      index2 += 1
    end
  end
  return f_arry
end
Is the first line return arry if arry.length == 1 kicking it out of the recursive splitting of the array(s) and then bypassing the recursive splitting part of the method to go back to the recombination section? It seems like it should then just keep resplitting it once it gets back to that section as it recurses through.
Why must it be flatten-ed?
The easiest way to understand the first line is to understand that the only contract that merge_sort is bound to is to "return a sorted array" - if the array has only one element (arry.length == 1) it is already sorted - so nothing needs to be done! Simply return the array itself.
In recursion, this is known as a "Stop condition". If you don't provide a stop condition - the recursion will never end (since it will always call itself - and never return)!
The reason you need to flatten the result is that you are pushing an array as an element into your resulting array:
arr = [1]
arr << [2, 3]
# => [1, [2, 3]]
If you try to flatten the resulting array only at the end of the iteration, and not as you are adding the elements, you'll have a problem, since its length will be skewed:
arr = [1, [2, 3]]
arr.length
# => 2
Although arr contains three numbers it has only two elements - and that will break your solution.
You want all the elements in your array to be numbers, not arrays. flatten! makes sure that all elements in your array are atoms, and if they are not, it adds the child array's elements to itself instead of the child array:
arr.flatten!
# => [1, 2, 3]
Another option you might want to consider (and one that will be more efficient) is to use concat instead:
arr = [1]
arr.concat([2, 3])
# => [1, 2, 3]
This method adds all the elements of the array passed as a parameter to the array it is called on.
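Putting the stop condition and concat together, a merge sort might look like the sketch below. It is not the original poster's code: the name `merge_sort_concat` is made up, and the stop condition is widened to `length <= 1` on the assumption that an empty array should also be returned as-is.

```ruby
# Hypothetical variant of the merge sort using concat instead of << and flatten!.
def merge_sort_concat(list)
  return list if list.length <= 1 # stop condition: already sorted

  mid = list.length / 2
  left = merge_sort_concat(list[0...mid])
  right = merge_sort_concat(list[mid..-1])

  merged = []
  i = 0
  j = 0
  # Take the smaller head element until one side runs out.
  while i < left.length && j < right.length
    if left[i] <= right[j]
      merged << left[i]
      i += 1
    else
      merged << right[j]
      j += 1
    end
  end

  # Append whatever remains; concat adds elements, not a nested array.
  merged.concat(left[i..-1]).concat(right[j..-1])
end

merge_sort_concat([5, 3, 8, 1, 9, 2]) # => [1, 2, 3, 5, 8, 9]
```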

The most idiomatic way to iterate through a Ruby array, exiting when an arbitrary condition met?

I want to iterate through an array, each element of which is an array of two integers (e.g. [3,5]); for each of these elements, I want to calculate the sum of the two integers, exiting the loop when any of these sums exceeds a certain arbitrary value. The source array is quite large, and I will likely find the desired value near the beginning, so looping through all of the unneeded elements is not a good option.
I have written three loops to do this, all of which produce the desired result. My question is: which is more idiomatic Ruby? Or, better yet, is there a better way? I try not to use non-local loop variables, but break statements look kind of hackish to my (admittedly novice) eye.
# Loop A
pairs.each do |pair|
  pair_sum = pair.inject(:+)
  arr1 << pair_sum
  break if pair_sum > arr2.max
end
# Loop B - (just A condensed)
pairs.each { |pair| arr1.last <= arr2.max ? arr1 << pair.inject(:+) : break }
# Loop C
i = 0
pair_sum = 0
begin
  pair_sum = pairs[i].inject(:+)
  arr1 << pair_sum
  i += 1
end until pair_sum > arr2.max
A similar question was asked at escaping the .each { } iteration early in Ruby, but the responses were essentially that, while using .each or .each_with_index and exiting with break when the target index was reached would work, .take(num_elements).each is more idiomatic. In my situation, however, I don't know in advance how many elements I'll have to iterate through, presenting me with what appears to be a boundary case.
This is from a project Euler-type problem I've already solved, btw. Just wondering about the community-preferred syntax. Thanks in advance for your valuable time.
take and drop have variants, take_while and drop_while, where instead of providing a fixed number of elements you provide a block. Ruby will accumulate values from the receiver (in the case of take_while) as long as the block returns true. Your code could be rewritten as
array.take_while {|pair| pair.sum < foo}.map(&:sum)
This does mean that you calculate the sum of some of these pairs twice.
In Ruby 2.0 there's Enumerable#lazy which returns a lazy enumerator:
sums = pairs.lazy.map { |a, b| a + b }.take_while { |pair_sum| pair_sum < some_max_value }.force
This avoids calculating the sums twice.
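To see that the lazy chain really stops early, the sketch below counts how many sums are computed. The helper name `lazy_sums_below` and the counter are assumptions added for the demonstration.

```ruby
# Hypothetical helper: returns the sums below limit and how many sums were computed.
def lazy_sums_below(pairs, limit)
  computed = 0
  sums = pairs.lazy
              .map { |pair| computed += 1; pair.sum }
              .take_while { |sum| sum < limit }
              .to_a
  [sums, computed]
end

sums, computed = lazy_sums_below([[1, 2], [3, 4], [5, 6], [7, 8]], 10)
# sums     => [3, 7]
# computed => 3 (the pair [7, 8] is never summed)
```

Note that, like take_while on a plain array, this drops the first sum that fails the condition, so it is not an exact replacement for loops that keep that last value.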
[[1, 2], [3, 4], [5, 6]].find{|x, y| x + y > 6}
# => [3, 4]
[[1, 2], [3, 4], [5, 6]].find{|x, y| x + y > 6}.inject(:+)
#=> 7

Ruby: how to know depth of multidimensional array

This is a problem I ran into in my assignment.
Array A has two elements: array B and array C.
Array B has two elements: array D and array E
At some point, array X just contains two elements: string a and string b.
I don't know how to determine how deep array A is. For example:
arrA = [
  [
    [1,2]
  ]
]
I have tested with A[0][0][0] == nil, which returns false. Moreover, A[0][0]..[0] == nil always returns false. So I cannot use this approach to find out how deep array A is.
If this is not what you're looking for, it should be a good starting point:
def depth (a)
  return 0 unless a.is_a?(Array)
  return 1 + depth(a[0])
end
> depth(arrA)
=> 3
Please note that this only measures the depth of the first branch.
My solution below finds the maximum depth of any array:
Example: for arr=[ [[1],[2,3]], [[[ 3,4 ]]] ], the maximum depth of arr is 4 for 3,4.
Approach: flatten by one level and compare
b, depth = arr.dup, 1
until b==arr.flatten
  depth+=1
  b=b.flatten(1)
end
puts "Array depth: #{depth}" #=> 4
Hope it answers your question.
A simple pure functional recursive solution:
def depth(xs, n=0)
  return case
  when xs.class != Array
    n
  when xs == []
    n + 1
  else
    xs.collect{|x| depth x, n+1}.max
  end
end
Examples:
depth([]) == 1
depth([['a']]) == 2
depth([1, 2, 3, 4, [1, 2, 3, [[2, 2],[]], 4, 5, 6, 7], 5, 5, [[[[[3, 4]]]]], [[[[[[[[[1, 2]]]]]]]]]]) == 10
Here's a one-liner similar to kiddorails' solution extracted into a method:
def depth(array)
  array.to_a == array.flatten(1) ? 1 : depth(array.flatten(1)) + 1
end
It will flatten the array one dimension at a time until it can't flatten anymore, while counting the dimensions.
Why is this better than other solutions out there?
- doesn't require modification to native classes (avoid that if possible)
- doesn't use metaprogramming (is_a?, send, respond_to?, etc.)
- fairly easy to read
- works on hashes as well (notice array.to_a)
- actually works (unlike only checking the first branch, and other silly stuff)
There is also a one-liner, if you want to use it. Note that it counts every opening bracket, so it only equals the depth when the array has a single nested branch:
def depth (a)
  a.to_s.count("[")
end
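The bracket-counting one-liner counts every "[" in the printed array, so it overcounts as soon as there are sibling sub-arrays. The sketch below contrasts it with a variant that tracks the running nesting level instead; both method names are hypothetical, and both assume the array holds no strings that themselves contain brackets.

```ruby
# The one-liner, renamed for the comparison: total number of "[" characters.
def bracket_count(a)
  a.to_s.count("[")
end

# Hypothetical variant: deepest nesting level reached while scanning the text.
def max_nesting(a)
  level = 0
  deepest = 0
  a.to_s.each_char do |ch|
    if ch == "["
      level += 1
      deepest = level if level > deepest
    elsif ch == "]"
      level -= 1
    end
  end
  deepest
end

bracket_count([[[1, 2]]])   # => 3
max_nesting([[[1, 2]]])     # => 3
bracket_count([[1], [2, 3]]) # => 3 (overcounts)
max_nesting([[1], [2, 3]])   # => 2
```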

puzzle using arrays

My first array has size M + N and my second array has size N.
Let us say m=4, n=5:
a[ ]= 1,3,5,7,0,0,0,0,0
b[ ]= 2,4,6,8,10
Now, how can I merge these two arrays without using external sorting algorithms or any temporary array (an in-place merge), while keeping the complexity at O(n)? The resulting array must be in sorted order.
Provided a is exactly the right size and arrays are already sorted (as seems to be the case), the following pseudo-code should help:
#    0 1 2 3 4 5 6 7 8
a = [1,3,5,7,0,0,0,0,0]
b = [2,4,6,8,10]
afrom = 3
bfrom = 4
ato = 8
while bfrom >= 0:
    if afrom == -1:
        a[ato] = b[bfrom]
        ato = ato - 1
        bfrom = bfrom - 1
    else:
        if b[bfrom] > a[afrom]:
            a[ato] = b[bfrom]
            ato = ato - 1
            bfrom = bfrom - 1
        else:
            a[ato] = a[afrom]
            ato = ato - 1
            afrom = afrom - 1
print a
It's basically a merge of the two lists into one, starting at the ends. Once bfrom hits -1, there are no more elements in b so the remainder in a were less than the lowest in b. Therefore the rest of a can remain unchanged.
If a runs out first, then it's a matter of transferring the rest of b since all the a elements have been transferred above ato already.
This is O(n) as requested and would result in something like:
[1, 2, 3, 4, 5, 6, 7, 8, 10]
Understanding that pseudo-code and translating it to your specific language is a job for you, now that you've declared it homework :-)
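For reference, the merge-from-the-end idea can be sketched in Ruby as follows. The method name `merge_into!` and its parameters are assumptions: `a` must already have room for all elements, its first `m` entries and all of `b` must be sorted.

```ruby
# Hypothetical sketch: merge sorted b into the sorted first m slots of a, in place.
def merge_into!(a, m, b)
  from_a = m - 1           # last real element of a
  from_b = b.length - 1    # last element of b
  to = m + b.length - 1    # last free slot of a

  # Once b is exhausted, the rest of a is already in place.
  while from_b >= 0
    if from_a >= 0 && a[from_a] > b[from_b]
      a[to] = a[from_a]
      from_a -= 1
    else
      a[to] = b[from_b]
      from_b -= 1
    end
    to -= 1
  end
  a
end

merge_into!([1, 3, 5, 7, 0, 0, 0, 0, 0], 4, [2, 4, 6, 8, 10])
# => [1, 2, 3, 4, 5, 6, 7, 8, 10]
```

Each element is written exactly once, so the work is linear in the combined length and no temporary array is needed.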
for (i = 0; i < N; i++) {
    a[m+i] = b[i];
}
This will do an in-place merge (concatenation).
If you're asking for an ordered merge of unsorted input, that's not possible in O(N). If it were possible, you could use it to sort in O(N). And of course O(N log N) is the best achievable for general-case comparison sorting...
I've got to ask, though, looking at your last few questions: are you just asking us for homework help? You do know that it's OK to say "this is homework", and nobody will laugh at you, right? We'll even still do our best to help you learn.
Do you want a sorted array? If not, this should do:
for (int i = a.length - 1, j = 0; j < b.length; i--, j++)
{
    a[i] = b[j];
}
You can take a look at in-place counting sort that works provided you know the input range. Effectively O(n).

Resources