Check if array already contains new set regardless of order - ruby

I have an array of arrays containing objects:
[ [A, B, C],
[A, B, D],
[B, C, D] ]
I want to check that a value like [B, A, C] can't be added since it's not unique for my purposes. The existing arrays within the array shouldn't have any duplicates (I'm already handling that).
I tried the following code but it's not working:
#if false, don't add to existing array
!big_array.sort.include? new_array.sort
What am I doing wrong?

require 'set'
a = [['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd']]
as = a.map(&:to_set)
as.include? ['b', 'a', 'c'].to_set #=> true
as.include? ['b', 'a', 'e'].to_set #=> false
Use:
(as << row.to_set) unless as.include? row.to_set
then when finished:
as.map(&:to_a)
In view of your comment, if you add all your rows to a:
a = [['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd'],
['a', 'c', 'b'],
['c', 'a', 'b'],
['e', 'a', 'b'],
['c', 'b', 'd']]
then:
a.reverse
.map(&:to_set)
.uniq
.map(&:to_a)
#=> [["c", "b", "d"],
# ["e", "a", "b"],
# ["c", "a", "b"],
# ["a", "b", "d"]]
uniq keeps the first occurrence it encounters, so with reverse the most recently added ordering of each set is the one that survives, and the result comes out in reverse order. If you wish to keep your original arrays and the ordering of the modified a:
a.each_with_object(Set.new) { |row,set| set << row.to_set }
.map(&:to_a)
#=> [["a", "b", "c"],
# ["a", "b", "d"],
# ["b", "c", "d"],
# ["e", "a", "b"]]

You should be sorting the arrays inside your big array, not the big array itself:
!big_array.map(&:sort).include? new_array.sort
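To make the difference visible, here is a minimal sketch with made-up data in which one row is deliberately stored out of order (the variable names other than big_array and new_array are only for illustration). Sorting the outer array just reorders the rows, whereas map(&:sort) normalises each row before the comparison.

```ruby
# Hypothetical data: the first row is deliberately stored out of order.
big_array = [%w[c a b], %w[a b d], %w[b c d]]
new_array = %w[b a c]

# Sorting the outer array only reorders the rows; each row keeps its own order.
outer_sorted = big_array.sort
found_with_outer_sort = outer_sorted.include?(new_array.sort)

# Sorting each inner array puts every row into a canonical order first.
inner_sorted = big_array.map(&:sort)
found_with_inner_sort = inner_sorted.include?(new_array.sort)

p found_with_outer_sort # false, ["a", "b", "c"] is not literally a row
p found_with_inner_sort # true
```

This assumes the elements are mutually comparable, since sort needs <=> to work on them.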

a = [
['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd']
]
class Array
def add_only_if_combination_does_not_exist_in(double_array)
if double_array.map(&:sort).include?(self.sort)
puts "Won't be added since it already exists!"
else
puts 'Will be added'
double_array << self
end
end
end
['b', 'a', 'c'].add_only_if_combination_does_not_exist_in(a) #=> Won't be added since it already exists!
['b', 'a', 'f'].add_only_if_combination_does_not_exist_in(a) #=> Will be added
p a #=> [["a", "b", "c"], ["a", "b", "d"], ["b", "c", "d"], ["b", "a", "f"]]

If you don't care about the order of the elements, consider using the Set class.
require 'set'
big_set = Set.new
big_set << Set.new(['a', 'b', 'c'])
# => #<Set: {#<Set: {"a", "b", "c"}>}>
big_set << Set.new(['c', 'b', 'a'])
# => #<Set: {#<Set: {"a", "b", "c"}>}>
big_set << Set.new(['d', 'a', 'b'])
# => #<Set: {#<Set: {"a", "b", "c"}>, #<Set: {"d", "a", "b"}>}>
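If you still need the rows as ordinary arrays afterwards, one possible sketch is to keep a Set of Sets purely as a "seen" index next to the array of rows. It relies on Set#add?, which returns nil when the element is already present. The names seen, rows and candidates are made up for this example.

```ruby
require 'set'

seen = Set.new # index of member sets already accepted
rows = []      # the accepted rows, kept as plain arrays in insertion order

candidates = [%w[a b c], %w[c b a], %w[d a b]]
candidates.each do |row|
  # add? returns nil if an equal set is already in `seen`
  rows << row if seen.add?(row.to_set)
end

p rows # [["a", "b", "c"], ["d", "a", "b"]]
```

The first ordering encountered is the one that is kept; later permutations of the same members are skipped.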

Related

processing array with duplicates

I have an array
a = ['A', 'B', 'B', 'C', 'D', 'D']
and I have to go through all the elements, do something depending on whether this is the last occurrence or not, and remove the element after processing it.
The elements are already sorted if that matters.
I'm looking for something efficient. Any suggestions?
Here's what I have so far. This works as expected, but I'm not sure it is very efficient.
a = ['A', 'B', 'B', 'C', 'D', 'D']
while !a.empty?
b = a.shift
unless a.count(b) > 0
p "unique #{b}"
else
p "duplicate #{b}"
end
end
and it produces
"unique A"
"duplicate B"
"unique B"
"unique C"
"duplicate D"
"unique D"
Thanks
Simple way:
array = ["A", "B", "B", "C", "D", "D"]
array.group_by{|e| e}.each do |key,value|
*duplicate, uniq = value
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
Following Stefan's comment and suggestion, a shorter way is:
array.chunk_while(&:==).each do |*duplicate, uniq|
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
Both of the above give the same output:
Unique A
Duplicate B
Unique B
Unique C
Duplicate D
Unique D
Based on your code and expected output, I think this is an efficient way to do what you're looking for:
a = ['A', 'B', 'B', 'C', 'D', 'D']
a.each_index do |i|
if i < a.length - 1 && a[i+1] == a[i]
puts "This is not the last occurrence of #{a[i]}"
else
puts "This is the last occurrence of #{a[i]}"
end
end
# Output:
# This is the last occurrence of A
# This is not the last occurrence of B
# This is the last occurrence of B
# This is the last occurrence of C
# This is not the last occurrence of D
# This is the last occurrence of D
But I want to reiterate the importance of the wording in my output versus yours. This is not about whether the value is unique or not in the input. It seems to be about whether the value is the last occurrence within the input or not.
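If the input were not sorted, comparing neighbours would no longer be enough. A rough sketch of a single-pass "last occurrence" check is to count the elements up front and decrement as you go. This assumes Ruby 2.7 or later for Enumerable#tally; remaining and labels are names invented for the example.

```ruby
a = ['A', 'B', 'B', 'C', 'D', 'D']

# How many occurrences of each value are still ahead of us (including the current one).
remaining = a.tally

labels = a.map do |e|
  remaining[e] -= 1
  # When nothing is left, this element was the last occurrence.
  remaining[e].zero? ? "unique #{e}" : "duplicate #{e}"
end

puts labels
```

Each element costs one hash lookup, in contrast to calling count on the rest of the array for every element.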
Quite similar to the answer of @GaganGami but using chunk_while.
a.chunk_while { |a,b| a == b }
.each do |*list,last|
list.each { |e| puts "duplicate #{e}" }
puts "unique #{last}"
end
chunk_while splits the array into sub-arrays when the element changes.
['A', 'B', 'B', 'C', 'D', 'D'].chunk_while { |a,b| a == b }.to_a
# => [["A"], ["B", "B"], ["C"], ["D", "D"]]
The OP stated that the elements of a are sorted, but that is not required by the method I propose. It also maintains array-order, which could be important for the "do something" code performed for each element to be removed. It does so with no performance penalty over the case where the array is already sorted.
For the array
['A', 'B', 'D', 'C', 'B', 'D']
I assume that some code is to be executed for 'A', 'C', the second 'B' and the second 'D', in that order, after which a new array
['B', 'D']
is returned.
Code
def do_something(e) end
def process_last_dup(a)
a.dup.
tap do |b|
b.each_with_index.
reverse_each.
uniq(&:first).
reverse_each { |_,i| do_something(a[i]) }.
each { |_,i| b.delete_at(i) }
end
end
Example
a = ['A', 'B', 'B', 'C', 'D', 'D']
process_last_dup(a)
#=> ["B", "D"]
Explanation
The steps are as follows.
b = a.dup
#=> ["A", "B", "B", "C", "D", "D"]
c = b.each_with_index
#=> #<Enumerator: ["A", "B", "B", "C", "D", "D"]:each_with_index>
d = c.reverse_each
#=> #<Enumerator: #<Enumerator: ["A",..., "D"]:each_with_index>:reverse_each>
Notice that d can be thought of as a "compound" enumerator. We can convert it to an array to see the elements it will generate and pass to uniq.
d.to_a
#=> [["D", 5], ["D", 4], ["C", 3], ["B", 2], ["B", 1], ["A", 0]]
Continuing,
e = d.uniq(&:first)
#=> [["D", 5], ["C", 3], ["B", 2], ["A", 0]]
e.reverse_each { |_,i| do_something(a[i]) }
reverse_each is used so that do_something is first executed for 'A', then for the second 'B', and so on.
e.each { |_,i| b.delete_at(i) }
b #=> ["B", "D"]
If a is to be modified in place, replace a.dup. with a. in the method.
Readers may have noticed that the code I gave at the beginning uses Object#tap, so that tap's block variable b, which initially equals a.dup, is returned after it has been modified within the block. The alternative is to set b = a.dup explicitly at the beginning and return b at the end, as I've done in my step-by-step explanation. Both approaches yield the same result, of course.
The doc for Enumerable#uniq does not specify whether the first element is kept, but it does reference Array#uniq, which does keep the first. If there is any uneasiness about that, one could always replace reverse_each with to_a.reverse so that Array#uniq is used.
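As a check on the unsorted case described earlier, here is a sketch of a variant that records the order in which elements would be processed instead of calling do_something. The method name process_last_dup_with_log and the log array are hypothetical, and it uses the to_a.reverse form so that Array#uniq is applied.

```ruby
# Returns [remaining_elements, processing_order] without modifying the input.
def process_last_dup_with_log(a)
  log = []
  b = a.dup
  # [element, index] pairs for the last occurrence of each value, highest index first
  last = b.each_with_index.to_a.reverse.uniq(&:first)
  last.reverse_each { |e, _| log << e }  # "process" in original array order
  last.each { |_, i| b.delete_at(i) }    # delete from the back so indices stay valid
  [b, log]
end

rest, order = process_last_dup_with_log(['A', 'B', 'D', 'C', 'B', 'D'])
p rest  # ["B", "D"]
p order # ["A", "C", "B", "D"]
```

Deleting from the highest index downwards matters: removing index 0 first would shift every later index by one.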

Sort array by other array

I have two arrays:
a = [ 1, 0, 2, 1, 0]
b = ['a', 'b', 'c', 'd', 'e']
I want to order the b array according to a's elements values.
I can do this by merging the two arrays into a Hash and then ordering by value:
h = Hash[b.zip a]
=> {"a"=>1, "b"=>0, "c"=>2, "d"=>1, "e"=>0}
h2 = Hash[h.sort_by{|k, v| v}]
=> {"b"=>0, "e"=>0, "a"=>1, "d"=>1, "c"=>2}
array = h2.keys
=> ["b", "e", "a", "d", "c"]
Where there is a tie, the order may be chosen arbitrarily.
Is there a way (maybe more compact) to achieve this without using the hash?
a.zip(b).sort.map(&:last)
In parts:
p a.zip(b) # => [[1, "a"], [0, "b"], [2, "c"], [1, "d"], [0, "e"]]
p a.zip(b).sort # => [[0, "b"], [0, "e"], [1, "a"], [1, "d"], [2, "c"]]
p a.zip(b).sort.map(&:last) # => ["b", "e", "a", "d", "c"]
a = [ 1, 0, 2, 1, 0]
b = ['a', 'b', 'c', 'd', 'e']
p b.sort_by.each_with_index{|el,i| a[i]}
# => ["b", "e", "a", "d", "c"]
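The question allows ties to fall in any order, but if you ever want them to be deterministic, a possible sketch is to add the original index as a tie-breaker. Ruby's documentation notes that sort_by is not guaranteed to be stable, and the zip/sort version compares the elements of b on ties, which assumes they are comparable with each other. The name sorted is just for this example.

```ruby
a = [1, 0, 2, 1, 0]
b = ['a', 'b', 'c', 'd', 'e']

# Sort by the key from `a`, then by original position to break ties.
sorted = b.each_with_index.sort_by { |_el, i| [a[i], i] }.map(&:first)

p sorted # ["b", "e", "a", "d", "c"]
```

With the index in the sort key, elements of b are never compared against each other, so they do not need to define <=>.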

Find all occurrences of 1 or 2 letters in a string using ruby

I have a string such as 'abcde' and I want to get a 2D array of all combinations of 1 or 2 letters.
[ ['a', 'b', 'c', 'd', 'e'], ['ab', 'c', 'de'], ['a', 'bc', 'd', 'e'] ...
How would I go about doing so?
I want to do this in ruby, and think I should be using a regex. I've tried using
strn = 'abcde'
strn.scan(/[a-z][a-z]/)
but this is only going to give me the distinct sets of 2 characters
['ab', 'cd']
I think this should do it (haven't tested yet):
def find_letter_combinations(str)
return [[]] if str.empty?
combinations = []
find_letter_combinations(str[1..-1]).each do |c|
combinations << c.unshift(str[0])
end
return combinations if str.length == 1
find_letter_combinations(str[2..-1]).each do |c|
combinations << c.unshift(str[0..1])
end
combinations
end
Regular expressions will not help for this sort of problem. I suggest using the handy Array#combination(n) function in Ruby 1.9:
def each_letter_and_pair(s)
letters = s.split('')
letters.combination(1).to_a + letters.combination(2).to_a
end
ss = each_letter_and_pair('abcde')
ss # => [["a"], ["b"], ["c"], ["d"], ["e"], ["a", "b"], ["a", "c"], ["a", "d"], ["a", "e"], ["b", "c"], ["b", "d"], ["b", "e"], ["c", "d"], ["c", "e"], ["d", "e"]]
No, regex is not suitable here. Sure you can match either one or two chars like this:
strn.scan(/[a-z][a-z]?/)
# matches: ['ab', 'cd', 'e']
but you can't use regex to generate a (2d) list of all combinations.
A functional recursive approach:
def get_combinations(xs, lengths)
return [[]] if xs.empty?
lengths.take(xs.size).flat_map do |n|
get_combinations(xs.drop(n), lengths).map { |ys| [xs.take(n).join] + ys }
end
end
get_combinations("abcde".chars.to_a, [1, 2])
#=> [["a", "b", "c", "d", "e"], ["a", "b", "c", "de"],
# ["a", "b", "cd", "e"], ["a", "bc", "d", "e"],
# ["a", "bc", "de"], ["ab", "c", "d", "e"],
# ["ab", "c", "de"], ["ab", "cd", "e"]]

Find all subsets of size N in an array using Ruby

Given an array ['a', 'b', 'c', 'd', 'e', 'f'], how would I get a list of all subsets containing two, three, and four elements?
I'm quite new to Ruby (moving from C#) and am not sure what the 'Ruby Way' would be.
Check out Array#combination
Then something like this:
2.upto(4) { |n| p array.combination(n).to_a }
Tweaking basicxman's a little bit:
2.upto(4).flat_map { |n| array.combination(n).to_a }
#=> [["a", "b"], ["a", "c"], ["a", "d"], ..., ["c", "d", "e", "f"]]
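For a quick sanity check, here is a self-contained sketch using the array from the question. A 6-element array has 15 + 20 + 15 = 50 subsets of sizes two to four; the name subsets is only for illustration.

```ruby
array = ['a', 'b', 'c', 'd', 'e', 'f']

# Collect every 2-, 3- and 4-element combination into one flat list.
subsets = (2..4).flat_map { |n| array.combination(n).to_a }

p subsets.size  # 50
p subsets.first # ["a", "b"]
p subsets.last  # ["c", "d", "e", "f"]
```

If the list is only going to be iterated once, passing a block to combination avoids building the intermediate arrays.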

Swapping array elements using parallel assignment

Intrigued by this question, I have played a bit with parallel assignment with arrays and method calls. So here's a paradigmatic example, trying to swap two members in an array, by their value:
deck = ['A', 'B', 'C']
#=> ["A", "B", "C"]
deck[deck.index("A")], deck[deck.index("B")] = deck[deck.index("B")], deck[deck.index("A")]
#=> ["B", "A"]
deck
#=> ["A", "B", "C"]
The array hasn't changed. But if we change the order of arguments, it works:
deck[deck.index("B")], deck[deck.index("A")] = deck[deck.index("A")], deck[deck.index("B")]
#=> ["A", "B"]
deck
#=> ["B", "A", "C"]
I guess it has to do with the order of calling the index methods within the assignment, but I don't see it clearly. Can someone please explain the order of things underneath, and why the first example doesn't swap the members and the second does?
It is expected. It follows from how ruby evaluates expressions.
deck[deck.index("A")], deck[deck.index("B")] = deck[deck.index("B")], deck[deck.index("A")]
Implies
deck[deck.index("A")], deck[deck.index("B")] = 'B', 'A'
Note: the strings 'A' and 'B' here are for illustration only; Ruby doesn't create new string objects. The assignment is then essentially carried out as:
deck[deck.index("A")] = 'B' -> deck[0] = 'B' (deck = ['B', 'B', 'C'])
deck[deck.index("B")] = 'A' -> deck[0] = 'A' (deck = ['A', 'B', 'C'])
Array#index returns when it finds the first match.
Now,
deck[deck.index("B")], deck[deck.index("A")] = deck[deck.index("A")], deck[deck.index("B")]
-> deck[deck.index("B")], deck[deck.index("A")] = 'A', 'B'
-> deck[deck.index("B")] = 'A' -> deck[1] = 'A' (deck = ['A', 'A', 'C'])
-> deck[deck.index("A")] = 'B' -> deck[0] = 'B' (deck = ['B', 'A', 'C'])
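One way to sidestep the ordering question altogether is to look up both indexes before assigning anything, so that neither lookup can see a half-updated array. This is only a sketch; i and j are names chosen for the example.

```ruby
deck = ['A', 'B', 'C']

# Resolve both positions up front, while the array is still unmodified.
i = deck.index('A')
j = deck.index('B')

# Swap by index; the result no longer depends on when index is called.
deck[i], deck[j] = deck[j], deck[i]

p deck # ["B", "A", "C"]
```

Since the indexes are plain integers by the time the assignment runs, the swap behaves the same whichever pair is written first.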
Just as an example, compare the machinations used to search the array, find the correct indexes then swap the values, with what you could do using a Hash:
h = { "cat" => "feline", "dog" => "canine", "cow" => "bovine" }
h['dog'], h['cat'] = h.values_at('cat', 'dog')
h #=> {"cat"=>"canine", "dog"=>"feline", "cow"=>"bovine"}
Now, if Ruby had an assignable values_at= Hash method it could be even cleaner:
h.values_at('dog', 'cat') = h.values_at('cat', 'dog')
but, alas, we don't. Hash slicing is a very powerful tool in Perl and something I miss about Ruby.
And, yes, I know I can add my own assignable values_at=.
M Rajesh is correct, but he actually had to think in order to work it out. I'm too lazy for that!
Here's a printf-debugging way of showing what happened.
deck = ['A', 'B', 'C']
#=> ["A", "B", "C"]
deck[deck.index("A").tap {|index|
STDERR.puts "Result of indexing for #{"A".inspect} is #{index.inspect}"
}],
deck[deck.index("B").tap {|index|
STDERR.puts "Result of indexing for #{"B".inspect} is #{index.inspect}"
}] =
deck[deck.index("B")], deck[deck.index("A")]
# Result of indexing for "A" is 0
# Result of indexing for "B" is 0
#=> ["B", "A"]
deck
#=> ["A", "B", "C"]

Resources