processing array with duplicates - ruby

I have an array
a = ['A', 'B', 'B', 'C', 'D', 'D']
and I have to go through all the elements, do something depending on whether it is the last occurrence or not, and remove the element after processing it.
The elements are already sorted if that matters.
I'm looking for something efficient. Any suggestions?
Here's what I have so far. THIS WORKS AS EXPECTED, but I'm not sure it is very efficient.
a = ['A', 'B', 'B', 'C', 'D', 'D']
while !a.empty?
b = a.shift
unless a.count(b) > 0
p "unique #{b}"
else
p "duplicate #{b}"
end
end
and it produces
"unique A"
"duplicate B"
"unique B"
"unique C"
"duplicate D"
"unique D"
Thanks

Simple way:
array = ["A", "B", "B", "C", "D", "D"]
array.group_by{|e| e}.each do |key,value|
*duplicate, uniq = value
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
As per Stefan's comment and suggestion, a shorter way is:
array.chunk_while(&:==).each do |*duplicate, uniq|
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
# Both of the above produce the same output:
Unique A
Duplicate B
Unique B
Unique C
Duplicate D
Unique D

Based on your code and expected output, I think this is an efficient way to do what you're looking for:
a = ['A', 'B', 'B', 'C', 'D', 'D']
a.each_index do |i|
if i < a.length - 1 && a[i+1] == a[i]
puts "This is not the last occurrence of #{a[i]}"
else
puts "This is the last occurrence of #{a[i]}"
end
end
# Output:
# This is the last occurrence of A
# This is not the last occurrence of B
# This is the last occurrence of B
# This is the last occurrence of C
# This is not the last occurrence of D
# This is the last occurrence of D
But I want to reiterate the importance of the wording in my output versus yours. This is not about whether the value is unique or not in the input. It seems to be about whether the value is the last occurrence within the input or not.
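Since the question also asks for each element to be removed after processing, here is a minimal sketch that consumes the array as the original loop does but avoids rescanning it with count. It assumes the input is sorted, as the question states, so another occurrence can only be the very next element. The helper name label_occurrences! is made up for illustration, and it collects the messages instead of printing them.

```ruby
# Hypothetical helper (not from the question): empties a SORTED array and
# returns one label per element, in order.
def label_occurrences!(sorted)
  labels = []
  until sorted.empty?
    current = sorted.shift
    # Assumption: the array is sorted, so an equal element, if any, is next.
    if !sorted.empty? && sorted.first == current
      labels << "duplicate #{current}"
    else
      labels << "unique #{current}"
    end
  end
  labels
end

a = ['A', 'B', 'B', 'C', 'D', 'D']
labels = label_occurrences!(a)
# labels => ["unique A", "duplicate B", "unique B", "unique C", "duplicate D", "unique D"]
# a      => []
```

For unsorted input this check is not enough, because a later occurrence may not be adjacent.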

Quite similar to the answer of @GaganGami, but using chunk_while.
a.chunk_while { |a,b| a == b }
.each do |*list,last|
list.each { |e| puts "duplicate #{e}" }
puts "unique #{last}"
end
chunk_while splits the array into sub-arrays whenever the element changes.
['A', 'B', 'B', 'C', 'D', 'D'].chunk_while { |a,b| a == b }.to_a
# => [["A"], ["B", "B"], ["C"], ["D", "D"]]

The OP stated that the elements of a are sorted, but that is not required by the method I propose. It also maintains array-order, which could be important for the "do something" code performed for each element to be removed. It does so with no performance penalty over the case where the array is already sorted.
For the array
['A', 'B', 'D', 'C', 'B', 'D']
I assume that some code is to be executed for 'A', 'C' the second 'B' and the second 'D', in that order, after which a new array
['B', 'D']
is returned.
Code
def do_something(e) end
def process_last_dup(a)
  a.dup.
    tap do |b|
      b.each_with_index.
        reverse_each.
        uniq(&:first).
        reverse_each { |_,i| do_something(a[i]) }.
        each { |_,i| b.delete_at(i) }
    end
end
Example
a = ['A', 'B', 'B', 'C', 'D', 'D']
process_last_dup(a)
#=> ["B", "D"]
Explanation
The steps are as follows.
b = a.dup
#=> ["A", "B", "B", "C", "D", "D"]
c = b.each_with_index
#=> #<Enumerator: ["A", "B", "B", "C", "D", "D"]:each_with_index>
d = c.reverse_each
#=> #<Enumerator: #<Enumerator: ["A",..., "D"]:each_with_index>:reverse_each>
Notice that d can be thought of as a "compound" enumerator. We can convert it to an array to see the elements it will generate and pass to uniq.
d.to_a
#=> [["D", 5], ["D", 4], ["C", 3], ["B", 2], ["B", 1], ["A", 0]]
Continuing,
e = d.uniq(&:first)
#=> [["D", 5], ["C", 3], ["B", 2], ["A", 0]]
e.reverse_each { |_,i| do_something(a[i]) }
reverse_each is used so that do_something is first executed for 'A', then for the second 'B', and so on.
e.each { |_,i| b.delete_at(i) }
b #=> ["B", "D"]
If a is to be modified in place, replace a.dup. with a. in the method.
Readers may have noticed that the code I gave at the beginning used Object#tap so that tap's block variable b, which initially equals a.dup, will be returned after it has been modified within tap's block, rather than explicitly setting b = a.dup at the beginning and returning b at the end, as I've done in my step-by-step explanation. Both approaches yield the same result, of course.
The doc for Enumerable#uniq does not specify whether the first element is kept, but it does reference Array#uniq, which does keep the first. If there is any uneasiness about that, one could always replace reverse_each with reverse so that Array#uniq would be used.
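For comparison, here is a sketch of a different technique that does not depend on which element uniq keeps: record the last index of each value in a hash, then walk the array once in order. This is not the method described above. The name process_last_occurrences is hypothetical, and the block stands in for do_something.

```ruby
# Hypothetical alternative (not the answer's method): yields every last
# occurrence in array order and returns a new array of the remaining elements.
def process_last_occurrences(a)
  last_index = {}
  # Later indices overwrite earlier ones, so each value maps to its last index.
  a.each_with_index { |e, i| last_index[e] = i }

  remaining = []
  a.each_with_index do |e, i|
    if last_index[e] == i
      yield e          # stands in for do_something(e)
    else
      remaining << e
    end
  end
  remaining
end

processed = []
rest = process_last_occurrences(['A', 'B', 'D', 'C', 'B', 'D']) { |e| processed << e }
# processed => ["A", "C", "B", "D"]
# rest      => ["B", "D"]
```

Like the original, this works on unsorted input and leaves the argument untouched.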

Related

How do I move an element of an array one place up/down with Ruby

Let's say I have this array
array = ['a', 'b', 'c', 'd']
What is a good way to target an element (for example 'b') and switch it with the next element in line (in this case 'c') so the outcome becomes:
=> ['a', 'c', 'b', 'd']
array[1], array[2] = array[2], array[1]
array #=> ["a", "c", "b", "d"]
or
array[1, 2] = array.values_at(2, 1)
array #=> ["a", "c", "b", "d"]
There is no built-in function to do this. You can swap the values like so:
array = %w[a b c d]
array[1..2] = array[1..2].reverse
array #=> ["a", "c", "b", "d"]
You could add some helper methods to the core array class.
class Array
def move_up(index)
self[index, 2] = self[index, 2].reverse
self
end
def move_down(index)
move_up(index - 1)
end
end
Note: Keep in mind that this solution mutates the original array. You could also opt for a version that creates a new array. For this version you can call #dup (result = dup), then work with result instead of self.
References:
Array#[]
Array#[]=
Array#reverse
Object#dup
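The non-mutating variant mentioned in the note could look roughly like this. It is only a sketch: it is written as a standalone method rather than a core extension, and the name swapped_with_next is made up for illustration.

```ruby
# Hypothetical non-mutating helper: returns a copy in which the element at
# `index` has traded places with the element after it.
def swapped_with_next(array, index)
  result = array.dup                           # work on a copy, not the receiver
  result[index, 2] = result[index, 2].reverse
  result
end

original = %w[a b c d]
moved = swapped_with_next(original, 1)
# moved    => ["a", "c", "b", "d"]
# original => ["a", "b", "c", "d"]
```

When index points at the last element, the two-element slice contains only one item, so the copy comes back unchanged.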
Try this for swapping
array[0],array[1] = array[1],array[0]
or in general
array[i],array[i+1] = array[i+1],array[i]
Assuming that you want to target the elements by their indices, a combination of insert and delete_at would work:
array = %w[a b c d]
array.insert(2, array.delete_at(1))
array
#=> ["a", "c", "b", "d"]

Sort array a by value in array b in Ruby

I have an array
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
and another array:
b = [0, 3, 6, 3, 4, 0, 1]
Is it possible to sort array a according to values in array b?
The result should be:
a = ['c', 'e', 'b', 'd', 'g', 'a', 'f']
Something like this doesn't seem to exist in Ruby/Rails:
a.sort_with_index{ |elem, index| b[index] }
Edit: in response to the duplicate marking: the question being referred to has an array with elements having an ID, and the other array references the ID's directly. This is not the same, and the same solution will not apply...
a.sort_by.with_index { |_,i| [-b[i], i] }
#=> ["c", "e", "b", "d", "g", "a", "f"]
This uses the indices of elements in a to break ties. I see from a comment on @tadman's answer that this is desired, though it is not a requirement given in the statement of the question. See the third paragraph of the doc for Array#<=> for an explanation of how arrays are ordered in sorting operations.
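To make the tie-breaking concrete, here is a small sketch of how the two-element keys compare. The variable names are illustrative only.

```ruby
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
b = [0, 3, 6, 3, 4, 0, 1]

# Arrays are compared element by element, so the index (second entry)
# only decides the order when the negated b values (first entry) are equal.
tie    = [-3, 1] <=> [-3, 3]   # keys for 'b' and 'd': equal weight, lower index first
no_tie = [-6, 2] <=> [-4, 4]   # keys for 'c' and 'e': decided by the weight alone

sorted = a.sort_by.with_index { |_, i| [-b[i], i] }
# tie    => -1
# no_tie => -1
# sorted => ["c", "e", "b", "d", "g", "a", "f"]
```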
You can just combine the two, sort, and strip out the original a values:
a.zip(b).sort_by { |_a, _b| -_b }.map { |_a,_| _a }

Check if array already contains new set regardless of order

I have an array of arrays containing objects:
[ [A, B, C],
[A, B, D],
[B, C, D] ]
I want to check that a value like [B, A, C] can't be added since it's not unique for my purposes. The existing arrays within the array shouldn't have any duplicates (I'm already handling that).
I tried the following code but it's not working:
#if false, don't add to existing array
!big_array.sort.include? new_array.sort
What am I doing wrong?
require 'set'
a = [['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd']]
as = a.map(&:to_set)
as.include? ['b', 'a', 'c'].to_set #=> true
as.include? ['b', 'a', 'e'].to_set #=> false
Use:
(as << row.to_set) unless as.include? row.to_set
then when finished:
as.to_a
In view of your comment, if you add all your rows to a:
a = [['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd'],
['a', 'c', 'b'],
['c', 'a', 'b'],
['e', 'a', 'b'],
['c', 'b', 'd']]
then:
a.reverse
.map(&:to_set)
.uniq
.map(&:to_a)
#=> [["b", "c", "d"],
# ["e", "a", "b"],
# ["a", "b", "c"],
# ["a", "b", "d"]]
reverse is needed to keep your original arrays, but note that ordering is not preserved in the result. If you wish to keep the ordering of the modified a:
a.each_with_object(Set.new) { |row,set| set << row.to_set }
.map(&:to_a)
#=> [["a", "b", "c"],
# ["a", "b", "d"],
# ["b", "c", "d"],
# ["e", "a", "b"]]
You should be sorting the arrays inside your big array, not the big array itself:
!big_array.map(&:sort).include? new_array.sort
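A short sketch of the difference, using made-up data in which one stored row is not already in sorted order (with rows that happen to be sorted, both checks would agree):

```ruby
# Illustrative data: the first row is deliberately stored unsorted.
big_array = [%w[c b a], %w[a b d]]
new_array = %w[b a c]

# Sorting the outer array only reorders the rows; each row keeps its own order.
outer_sorted_hit = big_array.sort.include?(new_array.sort)
# Sorting every row normalizes the element order before comparing.
inner_sorted_hit = big_array.map(&:sort).include?(new_array.sort)

# outer_sorted_hit => false
# inner_sorted_hit => true
```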
a = [
['a', 'b', 'c'],
['a', 'b', 'd'],
['b', 'c', 'd']
]
class Array
def add_only_if_combination_does_not_exist_in(double_array)
if double_array.map(&:sort).include?(self.sort)
puts "Won't be added since it already exists!"
else
puts 'Will be added'
double_array << self
end
end
end
['b', 'a', 'c'].add_only_if_combination_does_not_exist_in(a)
['b', 'a', 'f'].add_only_if_combination_does_not_exist_in(a) #=> Will be added
p a #=> [["a", "b", "c"], ["a", "b", "d"], ["b", "c", "d"], ["b", "a", "f"]]
If you don't care about the order of the elements, consider using the Set class.
require 'set'
big_set = Set.new
big_set << Set.new(['a', 'b', 'c'])
# => #<Set: {#<Set: {"a", "b", "c"}>}>
big_set << Set.new(['c', 'b', 'a'])
# => #<Set: {#<Set: {"a", "b", "c"}>}>
big_set << Set.new(['d', 'a', 'b'])
# => #<Set: {#<Set: {"a", "b", "c"}>, #<Set: {"d", "a", "b"}>}>
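If the original rows (and the order of their elements) should be kept, one possible sketch is to track the combinations seen so far in a Set and let Set#add? decide whether a row is new. Set#add? returns nil when the element is already present. The names seen, rows and accepted are illustrative.

```ruby
require 'set'

seen = Set.new
rows = [%w[a b c], %w[c b a], %w[d a b]]

# add? returns the set itself (truthy) when the combination is new,
# and nil when an equal set has already been stored.
accepted = rows.select { |row| seen.add?(row.to_set) }
# accepted => [["a", "b", "c"], ["d", "a", "b"]]
```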

Count sequential occurrences of element in ruby array

Given some array such as the following:
x = ['a', 'b', 'b', 'c', 'a', 'a', 'a']
I want to end up with something that shows how many times each element repeats sequentially. So maybe I end up with the following:
[['a', 1], ['b', 2], ['c', 1], ['a', 3]]
The structure of the results isn't that important... it could be some other data type if needed.
1.9 has Enumerable#chunk for just this purpose:
x.chunk{|y| y}.map{|y, ys| [y, ys.length]}
This is not a general solution, but if you only need to match single characters, it can be done like this:
x.join.scan(/(\w)(\1*)/).map{|x| [x[0], x.join.length]}
Here's a one-line solution. The logic is the same as Matt suggested, but it works fine with nils at the front of x:
x.each_with_object([]) { |e, r| r[-1] && r[-1][0] == e ? r[-1][-1] +=1 : r << [e, 1] }
Here's my approach:
# Starting array
arr = [nil, nil, "a", "b", "b", "c", "a", "a", "a"]
# Array to hold final values as requested
counts = []
# Array of previous `count` element
previous = nil
arr.each do |letter|
# If this letter matches the last one we checked, increment count
if previous and previous[0] == letter
previous[1] += 1
# Otherwise push a new array for letter/count
else
previous = [letter, 1]
counts.push previous
end
end
I should note that this doesn't suffer from the same problem that Matt Sanders describes, since we're mindful of our first time through the iteration.
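As another option for input that contains nil, here is a sketch using Enumerable#chunk_while (available from Ruby 2.3), which groups adjacent equal elements by comparing neighbours. Note that Enumerable#chunk drops elements for which its block returns nil, so chunk { |y| y } would silently skip the leading nils in this array.

```ruby
arr = [nil, nil, "a", "b", "b", "c", "a", "a", "a"]

# Each run is an array of adjacent equal elements; keep its value and its length.
counts = arr.chunk_while { |left, right| left == right }
            .map { |run| [run.first, run.size] }
# counts => [[nil, 2], ["a", 1], ["b", 2], ["c", 1], ["a", 3]]
```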

Swapping array elements using parallel assignment

Intrigued by this question, I have played a bit with parallel assignment with arrays and method calls. So here's a paradigmatic example, trying to swap two members in an array, by their value:
deck = ['A', 'B', 'C']
#=> ["A", "B", "C"]
deck[deck.index("A")], deck[deck.index("B")] = deck[deck.index("B")], deck[deck.index("A")]
#=> ["B", "A"]
deck
#=> ["A", "B", "C"]
The array hasn't changed. But if we change the order of arguments, it works:
deck[deck.index("B")], deck[deck.index("A")] = deck[deck.index("A")], deck[deck.index("B")]
#=> ["A", "B"]
deck
#=> ["B", "A", "C"]
I guess it has to do with the order of calling the index methods within the assignment, but I don't see it clearly. Can someone please explain the order of things underneath, and why the first example doesn't swap the members, and the second does?
It is expected. It follows from how ruby evaluates expressions.
deck[deck.index("A")], deck[deck.index("B")] = deck[deck.index("B")], deck[deck.index("A")]
Implies
deck[deck.index("A")], deck[deck.index("B")] = 'B', 'A'
Note: the strings 'A' and 'B' here are for illustration only; Ruby doesn't create new string objects here. The assignment is then essentially carried out as:
deck[deck.index("A")] = 'B' -> deck[0] = 'B' (deck = ['B', 'B', 'C'])
deck[deck.index("B")] = 'A' -> deck[0] = 'A' (deck = ['A', 'B', 'C'])
Array#index returns the index of the first match it finds, so after the first assignment the lookup for "B" also returns 0.
Now,
deck[deck.index("B")], deck[deck.index("A")] = deck[deck.index("A")], deck[deck.index("B")]
-> deck[deck.index("B")], deck[deck.index("A")] = 'A', 'B'
-> deck[deck.index("B")] = 'A' -> deck[1] = 'A' (deck = ['A', 'A', 'C'])
-> deck[deck.index("A")] = 'B' -> deck[0] = 'B' (deck = ['B', 'A', 'C'])
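One way to make the swap independent of the order on the left-hand side is to look up both indices before assigning anything. A minimal sketch (the variable names i and j are illustrative):

```ruby
deck = ['A', 'B', 'C']

# Look up both positions first, while the array is still unmodified.
i = deck.index('A')
j = deck.index('B')

# The left-hand side now uses fixed indices, so no lookup sees a half-swapped array.
deck[i], deck[j] = deck[j], deck[i]
# deck => ["B", "A", "C"]
```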
Just as an example, compare the machinations used to search the array, find the correct indexes then swap the values, with what you could do using a Hash:
h = { "cat" => "feline", "dog" => "canine", "cow" => "bovine" }
h['dog'], h['cat'] = h.values_at('cat', 'dog')
h #=> {"cat"=>"canine", "dog"=>"feline", "cow"=>"bovine"}
Now, if Ruby had an assignable values_at= Hash method it could be even cleaner:
h.values_at('dog', 'cat') = h.values_at('cat', 'dog')
but, alas, we don't. Hash slicing is a very powerful tool in Perl and something I miss about Ruby.
And, yes, I know I can add my own assignable values_at=.
M Rajesh is correct, but he actually had to think in order to work it out. I'm too lazy for that!
Here's a printf-debugging way of showing what happened.
deck = ['A', 'B', 'C']
#=> ["A", "B", "C"]
deck[deck.index("A").tap {|index|
STDERR.puts "Result of indexing for #{"A".inspect} is #{index.inspect}"
}],
deck[deck.index("B").tap {|index|
STDERR.puts "Result of indexing for #{"B".inspect} is #{index.inspect}"
}] =
deck[deck.index("B")], deck[deck.index("A")]
# Result of indexing for "A" is 0
# Result of indexing for "B" is 0
#=> ["B", "A"]
deck
#=> ["A", "B", "C"]

Resources