How to do sane "set-difference" in Ruby?

Demo (I expect result [3]):
[1,2] - [1,2,3] => [] # Hmm
[1,2,3] - [1,2] => [3] # I see
a = [1,2].to_set => #<Set: {1, 2}>
b = [1,2,3].to_set => #<Set: {1, 2, 3}>
a - b => #<Set: {}> WTF!
And:
[1,2,9] - [1,2,3] => [9] # Hmm. Would like [[9],[3]]
How is one to perform a real set difference regardless of order of the inputs?
Ps. As an aside, I need to do this for two 2000-element arrays. Usually, array #1 will have fewer elements than array #2, but this is not guaranteed.

The - operator applied to two arrays a and b gives the relative complement of b in a (items that are in a but not in b).
What you are looking for is the symmetric difference of two sets (the union of both relative complements between the two). This will do the trick:
a = [1, 2, 9]
b = [1, 2, 3]
a - b | b - a # => [9, 3]
If you are operating on Set objects, you may use the overloaded ^ operator:
c = Set[1, 2, 9]
d = Set[1, 2, 3]
c ^ d # => #<Set: {3, 9}>
For extra fun, you could also find the relative complement of the intersection in the union of the two sets:
(c | d) - (c & d) # => #<Set: {9, 3}>
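The question also asked for the two sides kept apart ([[9],[3]]) rather than merged into one list. A minimal sketch of that, assuming a hypothetical helper named split_difference (it is not part of Ruby's core library), might look like this:

```ruby
require 'set'

# Hypothetical helper: returns [elements only in a, elements only in b].
def split_difference(a, b)
  [a - b, b - a]
end

a = [1, 2, 9]
b = [1, 2, 3]

only_a, only_b = split_difference(a, b)
p only_a   # => [9]
p only_b   # => [3]

# Merging both sides gives the symmetric difference,
# i.e. the same elements that Set#^ returns.
same = (only_a | only_b).to_set == (a.to_set ^ b.to_set)
p same     # => true
```

Because each side is computed separately, the result is the same whichever array happens to be the shorter one.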

Related

How to stop shift from affecting other instances of an array

I need to shift through an array and keep a copy of the original array for future.
I tried creating another variable using a = b, but both are affected when I shift a.
irb(main):001:0> a = [1,2,3,4,5]
# => [1, 2, 3, 4, 5]
irb(main):002:0> b = a
# => [1, 2, 3, 4, 5]
irb(main):003:0> c = a.shift
# => 1
irb(main):004:0> a
# => [2, 3, 4, 5]
irb(main):005:0> b
# => [2, 3, 4, 5]
irb(main):006:0> c
# => 1
Is there a way to keep this from happening?
In Ruby it's important to remember variables are object references which behave a lot like pointers, so b = a does not make a copy, it is another reference to the same object.
To make a copy you must be explicit and use dup or clone to achieve this:
b = a.dup
If you're ever confused by Ruby's behaviour, stop and look at the objects you're dealing with:
a = [ 1 ]
b = a
a.object_id == b.object_id
# => true
They're exactly the same object, but when duplicated:
b = a.dup
a.object_id == b.object_id
# => false
Now they're independent, at least on the top-level.
Note that this comes with some caveats, as this is only a shallow copy:
a = [ [ 1 ] ]
b = a.dup
b[0].object_id == a[0].object_id
# => true
This is where deep-copy tools come in handy if you need a complete clone. These are available from various gems, most popularly ActiveSupport from Rails, which adds deep_dup.
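If pulling in a gem is not an option, one commonly used standard-library approach is a Marshal round trip. The sketch below assumes a hypothetical helper named deep_copy and assumes that everything inside the array can be serialized by Marshal (procs and IO objects, for example, cannot):

```ruby
# Hypothetical helper: deep-copies any object that Marshal can serialize.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

a = [[1]]
shallow = a.dup
deep = deep_copy(a)

shallow[0] << 2   # mutates the inner array that is shared with a
deep[0] << 3      # mutates an independent inner array

p a      # => [[1, 2]]
p deep   # => [[1, 3]]
```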
One thing you'll find in Ruby is it tends to steer towards a more functional style, as in if you wanted to strip an element from a and avoid mangling b:
a = [ 1, 2, 3, 4, 5 ]
b = a
a = a.drop(1)
# => [2, 3, 4, 5]
Where drop skips over the first N entries and returns the rest as a copy:
b
# => [1, 2, 3, 4, 5]
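If you do want the return value of shift as well as the remaining elements, you can shift a copy instead. A small sketch, using a hypothetical helper name (shift_from_copy is not a built-in method):

```ruby
# Hypothetical helper: behaves like shift, but operates on a copy
# and returns both the removed element and the remaining elements.
def shift_from_copy(arr)
  rest = arr.dup
  first = rest.shift
  [first, rest]
end

a = [1, 2, 3, 4, 5]
first, rest = shift_from_copy(a)

p first  # => 1
p rest   # => [2, 3, 4, 5]
p a      # => [1, 2, 3, 4, 5]
```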

What does a comma followed by an equals sign mean in Ruby?

Just saw something like this in some Ruby code:
def getis;gets.split.map(&:to_i);end
k,=getis # What is this line doing?
di=Array::new(k){Array::new(k)}
It assigns the array's first element using Ruby's multiple assignment:
a, = [1, 2, 3]
a #=> 1
Or:
a, b = [1, 2, 3]
a #=> 1
b #=> 2
You can use * to fetch the remaining elements:
a, *b = [1, 2, 3]
a #=> 1
b #=> [2, 3]
Or:
*a, b = [1, 2, 3]
a #=> [1, 2]
b #=> 3
It works like this. If the left-hand side (lhs) is a single variable and the right-hand side (rhs) has multiple values, then the variable is assigned an array of those values:
a = 1,2,3 #=> a = [1,2,3]
Whereas if the lhs has more variables than the rhs has values, the excess variables on the lhs are set to nil:
a,b,c = 1,2 #=> a = 1, b = 2, c = nil
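The trailing comma turns the lhs into a list of variables, so the values are spread across it and any excess values on the rhs are discarded. Therefore:
a, = 1,2,3 #=> a = 1. The rest, i.e. 2 and 3, are discarded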
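The cases above can be tried side by side in one short script. The method readings is a hypothetical stand-in for getis from the question, returning a fixed array instead of reading input:

```ruby
# Hypothetical stand-in for getis: returns a fixed array.
def readings
  [10, 20, 30]
end

first, = readings        # trailing comma: take the first value, discard the rest
p first                  # => 10

whole = readings         # no comma: the whole array is assigned
p whole                  # => [10, 20, 30]

x, y, z = 1, 2           # more variables than values: the extra variable is nil
p([x, y, z])             # => [1, 2, nil]

head, *tail = readings   # splat collects whatever is left over
p head                   # => 10
p tail                   # => [20, 30]
```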

How to drop the end of an array in Ruby

Array#drop removes the first n elements of an array. What is a good way to remove the last m elements of an array? Alternately, what is a good way to keep the middle elements of an array (greater than n, less than m)?
This is exactly what Array#pop is for:
x = [1,2,3]
x.pop(2) # => [2,3]
x # => [1]
You can also use Array#slice method, e.g.:
[1,2,3,4,5,6].slice(1..4) # => [2, 3, 4, 5]
or
a = [1,2,3,4,5,6]
a.take 3 # => [1, 2, 3]
a.first 3 # => [1, 2, 3]
a.first a.size - 1 # to get rid of the last one
The most direct opposite of drop (drop the first n elements) would be take, which keeps the first n elements (there's also take_while which is analogous to drop_while).
Slice allows you to return a subset of the array either by specifying a range or an offset and a length. Array#[] behaves the same when passed a range as an argument or when passed 2 numbers.
This will get rid of the last n elements:
a = [1,2,3,4,5,6]
n = 4
p a[0, (a.size-n)]
#=> [1, 2]
n = 2
p a[0, (a.size-n)]
#=> [1, 2, 3, 4]
Regarding "middle" elements, this filters by value rather than by position:
min, max = 2, 5
p a.select {|v| (min..max).include? v }
#=> [2, 3, 4, 5]
I wanted the return value to be the array without the dropped elements. I found a couple solutions here to be okay:
count = 2
[1, 2, 3, 4, 5].slice 0..-(count + 1) # => [1, 2, 3]
[1, 2, 3, 4, 5].tap { |a| a.pop count } # => [1, 2, 3]
But I found another solution to be more readable if the order of the array isn't important (in my case I was deleting files):
count = 2
[1, 2, 3, 4, 5].reverse.drop count # => [3, 2, 1]
You could tack another .reverse on there if you need to preserve order but I think I prefer the tap solution at that point.
You can achieve the same as Array#pop in a non-destructive way, and without needing to know the length of the array:
a = [1, 2, 3, 4, 5, 6]
b = a[0..-2]
# => [1, 2, 3, 4, 5]
n = 3 # to drop the last n elements
c = a[0..-(n+1)]
# => [1, 2, 3]
Array#delete_at() is the simplest way to delete the last element of an array, like so:
arr = [1,2,3,4,5,6]
arr.delete_at(-1)
p arr # => [1,2,3,4,5]
For deleting a segment, or segments, of an array use methods in the other answers.
You can also add some methods:
class Array
  # Using slice
  def cut(n)
    slice(0..-n-1)
  end

  # Using pop
  def cut2(n)
    dup.tap { |x| x.pop(n) }
  end

  # Using take
  def cut3(n)
    length - n >= 0 ? take(length - n) : []
  end
end
[1,2,3,4,5].cut(2)
# => [1, 2, 3]
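For the second half of the question (keeping the middle elements by position), a non-destructive sketch that avoids reopening Array could look like the following. The names drop_last and middle are hypothetical helpers, not core methods:

```ruby
# Hypothetical helpers, not part of Ruby's core API.
# Neither one mutates its argument.

# Returns arr without its last m elements.
def drop_last(arr, m)
  arr.take([arr.size - m, 0].max)
end

# Returns arr without its first n and last m elements.
def middle(arr, n, m)
  arr.drop(n).take([arr.size - n - m, 0].max)
end

a = [1, 2, 3, 4, 5, 6]
p drop_last(a, 2)    # => [1, 2, 3, 4]
p middle(a, 1, 2)    # => [2, 3, 4]
p drop_last(a, 10)   # => []
p a                  # => [1, 2, 3, 4, 5, 6]
```

Clamping the count at zero means that asking for more elements than the array holds returns an empty array instead of raising an error.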

Ruby Logical Operators - Elements in one but not both arrays

Let's say I have two arrays:
a = [1,2,3]
b = [1,2]
I want a logical operation to perform on both of these arrays that returns the elements that are not in both arrays (i.e. 3). Thanks!
Arrays in Ruby very conveniently overload some math and bitwise operators.
Elements that are in a, but not in b
a - b # [3]
Elements that are both in a and b
a & b # [1, 2]
Elements that are in a or b
a | b # [1, 2, 3]
Sum of arrays (concatenation)
a + b # [1, 2, 3, 1, 2]
You get the idea.
p (a | b) - (a & b) #=> [3]
Or use sets
require 'set'
a.to_set ^ b
There is a third way of looking at this solution, which directly answers the question and does not require the use of sets:
r = (a-b) | (b-a)
(a-b) will give you what is in array a but not b:
a-b
=> [3]
(b-a) will give you what is in array b but not a:
b-a
=> []
OR-ing the two array subtractions will give you final result of anything that is not in both arrays:
r = (a-b) | (b-a)
=> [3]
Another example might make this even more clear:
a = [1,2,3]
=> [1, 2, 3]
b = [2,3,4]
=> [2, 3, 4]
a-b
=> [1]
b-a
=> [4]
r = (a-b) | (b-a)
=> [1, 4]
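As a quick check that the three formulations in these answers agree, here is a small sketch. The method names are made up for illustration, and the Set result is sorted only so it can be compared with the array results:

```ruby
require 'set'

# Hypothetical helper names, one per formulation.
def via_differences(a, b)
  (a - b) | (b - a)
end

def via_union(a, b)
  (a | b) - (a & b)
end

def via_set(a, b)
  (a.to_set ^ b.to_set).to_a.sort
end

a = [1, 2, 3]
b = [2, 3, 4]

p via_differences(a, b)  # => [1, 4]
p via_union(a, b)        # => [1, 4]
p via_set(a, b)          # => [1, 4]
```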

Reordering an array based on its existence in another array

a=[1,2,3,4,5]
b=[4,3]
array_wanted=[4,3,1,2,5]
I could do this via a mapping and pushing, but I would love to know more elegant ways of doing this.
(b & a) + (a - b)
# => [4, 3, 1, 2, 5]
And if you are sure that all elements from b are present in a, union operator | seems to return the proper order:
b | a
# => [4, 3, 1, 2, 5]
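The difference between the two expressions shows up when b contains something that a does not. A short sketch, wrapping the first expression in a hypothetical helper named reorder:

```ruby
# Hypothetical helper: elements of priority that also exist in items
# come first (in priority order), followed by the remaining items.
def reorder(items, priority)
  (priority & items) + (items - priority)
end

a = [1, 2, 3, 4, 5]

p reorder(a, [4, 3])      # => [4, 3, 1, 2, 5]
p reorder(a, [4, 3, 7])   # => [4, 3, 1, 2, 5]

# The union keeps 7 even though it is not in a.
p([4, 3, 7] | a)          # => [4, 3, 7, 1, 2, 5]
```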
