How do you merge consecutive repeating elements in an array? - ruby

I need to merge consecutive repeating elements in an array, such that
[1, 2, 2, 3, 1]
becomes
[1, 2, 3, 1]
#uniq doesn't work for this purpose. Why? Because #uniq will produce this:
[1, 2, 3]

There is an abstraction in the core library that pretty much does the job, Enumerable#chunk:
xs = [1, 2, 2, 3, 3, 3, 1]
xs.chunk { |x| x }.map(&:first)
#=> [1, 2, 3, 1]
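For comparison, here is a minimal sketch of the same idea using Enumerable#chunk_while (available since Ruby 2.3), which keeps neighbours in one group while the block returns true. The method name squeeze_runs is only an illustrative choice, not something from the answer.

```ruby
# Hypothetical helper name; collapses each run of equal neighbours to one element.
def squeeze_runs(xs)
  # chunk_while keeps adjacent elements in the same chunk while the block is true
  xs.chunk_while { |a, b| a == b }.map(&:first)
end

squeeze_runs([1, 2, 2, 3, 3, 3, 1]) #=> [1, 2, 3, 1]
squeeze_runs([])                    #=> []
```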

def remove_consecutive_duplicates(xs)
[xs.first] + xs.each_cons(2).select do |x,y|
x != y
end.map(&:last)
end
remove_consecutive_duplicates([1, 2, 2, 3, 1])
#=> [1,2,3,1]
This returns a new array like uniq does and works in O(n) time.
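One edge case worth noting: as written, the method returns [nil] for an empty array, because xs.first is nil. A hedged sketch with a guard clause follows (the name remove_consecutive_duplicates_safe is made up for illustration):

```ruby
# Illustrative variant of the each_cons approach with an explicit empty-array guard.
def remove_consecutive_duplicates_safe(xs)
  return [] if xs.empty?
  # Keep the first element, then every element that differs from its predecessor
  [xs.first] + xs.each_cons(2).select { |x, y| x != y }.map(&:last)
end

remove_consecutive_duplicates_safe([1, 2, 2, 3, 1]) #=> [1, 2, 3, 1]
remove_consecutive_duplicates_safe([])              #=> []
```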

sepp2k's answer is already accepted, but here are some alternatives:
# Because I love me some monkeypatching
class Array
def remove_consecutive_duplicates_2
# Because no solution is complete without inject
inject([]){ |r,o| r << o unless r.last==o; r }
end
def remove_consecutive_duplicates_3
# O(2n)
map.with_index{ |o,i| o if i==0 || self[i-1]!=o }.compact
end
def remove_consecutive_duplicates_4
# Truly O(n)
result = []
last = nil
each do |o|
result << o unless last==o
last = o
end
result
end
end
And although performance is not everything, here are some benchmarks:
Rehearsal --------------------------------------------
sepp2k     2.740000   0.010000   2.750000 (  2.734665)
Phrogz_2   1.410000   0.000000   1.410000 (  1.420978)
Phrogz_3   1.520000   0.020000   1.540000 (  1.533197)
Phrogz_4   1.000000   0.000000   1.000000 (  0.997460)
----------------------------------- total: 6.700000sec
               user     system      total        real
sepp2k     2.780000   0.000000   2.780000 (  2.782354)
Phrogz_2   1.450000   0.000000   1.450000 (  1.440868)
Phrogz_3   1.530000   0.020000   1.550000 (  1.539190)
Phrogz_4   1.020000   0.000000   1.020000 (  1.025331)
Benchmarks run on removing duplicates from orig = (0..1000).map{ rand(5) } 10,000 times.

Does uniq! not work for what you are doing?
http://ruby-doc.org/docs/ProgrammingRuby/html/ref_c_array.html

Related

How can I remove duplicates in an array without using `uniq`?

The object of my coding exercise is to get rid of duplicates in an array without using the uniq method. Here is my code:
numbers = [1, 4, 2, 4, 3, 1, 5]
def my_uniq(array)
sorted = array.sort
count = 1
while count <= sorted.length
while true
sorted.delete_if {|i| i = i + count}
count += 1
end
end
return sorted
end
When I run this, I get an infinite loop. What is wrong?
Can I use delete the way that I am doing with count?
How will it execute? Will count continue until the end of the array before the method iterates to the next index?
I did this with each or map, and got the same results. What is the best way to do this using each, delete_if, map, or a while loop (with a second loop that compares against the first one)?
Here is a clearly written example.
numbers = [1, 4, 2, 4, 3, 1, 5]
def remove_duplicates(array)
response = Array.new
array.each do |number|
response << number unless response.include?(number)
end
return response
end
remove_duplicates(numbers)
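Note that response.include? scans the result array on every iteration, so the loop above is quadratic in the worst case. A possible variation, sketched here under the assumption that the elements are usable as hash keys, tracks already-seen values in a Hash (the method name is hypothetical):

```ruby
def remove_duplicates_with_hash(array)
  seen = {}        # value => true for everything already emitted
  response = []
  array.each do |number|
    next if seen.key?(number)
    seen[number] = true
    response << number
  end
  response
end

remove_duplicates_with_hash([1, 4, 2, 4, 3, 1, 5]) #=> [1, 4, 2, 3, 5]
```

Hash lookup compares keys with eql? rather than ==, so 1 and 1.0 count as different values in this version.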
As others pointed out, your inner loop is infinite. Here's a concise solution with no loops:
numbers.group_by{|n| n}.keys
You can sort it if you want, but this solution doesn't require it.
The problem is that the inner loop is infinite, because while true never exits without a break:
while true
sorted.delete_if {|i| i = i + count}
count += 1
end #while
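You can probably do something along those lines, but the code as written does not eliminate duplicates: i = i + count is an assignment that always evaluates to a truthy number, so delete_if removes every element.
One way to do this would be: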
numbers = [1, 4, 2, 4, 3, 1, 5]
target = []
numbers.each {|x| target << x unless target.include?(x) }
puts target.inspect
To add it to the Array class:
class ::Array
def my_uniq
target = []
self.each {|x| target << x unless target.include?(x) }
target
end
end
Now you can do:
numbers = [1, 4, 2, 4, 3, 1, 5]
numbers.my_uniq
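If the exercise calls for a while loop over the sorted array, as in the question, one possible shape is sketched below. It compares each element with the last one kept; note that the result comes out in sorted order rather than in the original order. The name my_uniq_sorted is illustrative only.

```ruby
def my_uniq_sorted(array)
  sorted = array.sort
  result = []
  index = 0
  while index < sorted.length
    # After sorting, duplicates are adjacent, so comparing with the last kept element is enough
    result << sorted[index] if result.empty? || result.last != sorted[index]
    index += 1 # always advance, otherwise the loop never terminates
  end
  result
end

my_uniq_sorted([1, 4, 2, 4, 3, 1, 5]) #=> [1, 2, 3, 4, 5]
```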
You could use Set, which acts like an array but does not allow duplicates:
require 'set'
numbers = [1, 4, 2, 4, 3, 1, 5]
Set.new(numbers).to_a
#=> [1, 4, 2, 3, 5]
Try using Array#& passing the array itself as parameter:
x = [1,2,3,3,3]
x & x #=> [1,2,3]
Here is another answer, using a while loop. I do not know how well it performs on large arrays, since include? rescans the result on every step.
def my_uniq(ints)
i = 0
uniq = []
while i < ints.length
# keep the element only if it has not been collected yet
uniq.push(ints[i]) unless uniq.include?(ints[i])
i += 1
end
return uniq
end

Removing elements from array Ruby

Let's say I am trying to remove elements from array a = [1,1,1,2,2,3]. If I perform the following:
b = a - [1,3]
Then I will get:
b = [2,2]
However, I want the result to be
b = [1,1,2,2]
i.e. I want to remove only one instance of each element in the subtracted array, not all occurrences. Is there a simple way in Ruby to do this?
You may do:
a= [1,1,1,2,2,3]
delete_list = [1,3]
delete_list.each do |del|
a.delete_at(a.index(del))
end
result : [1, 1, 2, 2]
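Note that a.index(del) returns nil when the value is absent, and delete_at(nil) then raises a TypeError. A non-mutating sketch that skips missing values could look like this (subtract_once is a hypothetical name):

```ruby
def subtract_once(array, delete_list)
  result = array.dup # leave the caller's array untouched
  delete_list.each do |del|
    i = result.index(del)
    result.delete_at(i) if i # ignore values that are not present
  end
  result
end

subtract_once([1, 1, 1, 2, 2, 3], [1, 3])    #=> [1, 1, 2, 2]
subtract_once([1, 1, 1, 2, 2, 3], [1, 3, 9]) #=> [1, 1, 2, 2]
```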
[1,3].inject([1,1,1,2,2,3]) do |memo,element|
memo.tap do |memo|
i = memo.find_index(element)
memo.delete_at(i) if i
end
end
Not very simple but:
a = [1,1,1,2,2,3]
b = a.group_by {|n| n}.each {|k,v| v.pop [1,3].count(k)}.values.flatten
=> [1, 1, 2, 2]
Also handles the case for multiples in the 'subtrahend':
a = [1,1,1,2,2,3]
b = a.group_by {|n| n}.each {|k,v| v.pop [1,1,3].count(k)}.values.flatten
=> [1, 2, 2]
EDIT: this is more an enhancement combining Norm212 and my answer to make a "functional" solution.
b = [1,1,3].each.with_object( a ) { |del| a.delete_at( a.index( del ) ) }
Put it in a lambda if needed:
subtract = lambda do |minuend, subtrahend|
subtrahend.each.with_object( minuend ) { |del| minuend.delete_at( minuend.index( del ) ) }
end
then:
subtract.call a, [1,1,3]
A simple solution I frequently use:
arr = ['remove me',3,4,2,45]
arr[1..-1]
=> [3,4,2,45]
a = [1,1,1,2,2,3]
a.slice!(0) # remove first index
a.slice!(-1) # remove last index
# a = [1,1,2,2] as desired
For speed, I would do the following, which requires only one pass through each of the two arrays. This method preserves order. I will first present code that does not mutate the original array, then show how it can be easily modified to mutate.
arr = [1,1,1,2,2,3,1]
removals = [1,3,1]
h = removals.group_by(&:itself).transform_values(&:size)
#=> {1=>2, 3=>1}
arr.each_with_object([]) { |n,a|
h.key?(n) && h[n] > 0 ? (h[n] -= 1) : a << n }
#=> [1, 2, 2, 1]
arr
#=> [1, 1, 1, 2, 2, 3, 1]
To mutate arr write:
h = removals.group_by(&:itself).transform_values(&:count)
arr.replace(arr.each_with_object([]) { |n,a|
h.key?(n) && h[n] > 0 ? (h[n] -= 1) : a << n })
#=> [1, 2, 2, 1]
arr
#=> [1, 2, 2, 1]
This uses the 21st century method Hash#transform_values (new in MRI v2.4), but one could instead write:
h = Hash[removals.group_by(&:itself).map { |k,v| [k,v.size] }]
or
h = removals.each_with_object(Hash.new(0)) { | n,h| h[n] += 1 }
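On Ruby 2.7 or later, Enumerable#tally builds the same count hash directly. Below is a sketch that wraps the single-pass approach in a method; the name remove_each_once is an assumption made here, not part of the answer.

```ruby
def remove_each_once(arr, removals)
  remaining = removals.tally # e.g. [1, 3, 1] => {1=>2, 3=>1}
  arr.each_with_object([]) do |n, kept|
    if remaining.fetch(n, 0) > 0
      remaining[n] -= 1 # consume one pending removal
    else
      kept << n
    end
  end
end

remove_each_once([1, 1, 1, 2, 2, 3, 1], [1, 3, 1]) #=> [1, 2, 2, 1]
```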

How to interpolate an array?

I would like to do something like join with an Array, but instead of getting the result as a String, I would like to get an Array. I will call this interpolate. For example, given:
a = [1, 2, 3, 4, 5]
I expect:
a.interpolate(0) # => [1, 0, 2, 0, 3, 0, 4, 0, 5]
a.interpolate{Array.new} # => [1, [], 2, [], 3, [], 4, [], 5]
What is the best way to get this? The reason I need it to take a block is because when I use it with a block, I want different instances for each interpolator that comes in between.
After getting great answers from many, I came up with some modified ones.
This one is a modification of tokland's answer. I made it accept nil for conj1, and moved the if conj2 condition outside of the flat_map loop to make it faster.
class Array
def interpolate conj1 = nil, &conj2
return [] if empty?
if conj2 then first(length - 1).flat_map{|e| [e, conj2.call]}
else first(length - 1).flat_map{|e| [e, conj1]}
end << last
end
end
This one is a modification of Victor Moroz's answer. I added the functionality to accept a block.
class Array
def interpolate conj1 = nil, &conj2
return [] if empty?
first, *rest = self
if conj2 then rest.inject([first]) {|a, e| a.push(conj2.call, e)}
else rest.inject([first]) {|a, e| a.push(conj1, e)}
end
end
end
After benchmarking, the second one looks faster. It seems that flat_map, although it looks beautiful, is slow.
Use zip:
a.zip(Array.new(a.size) { 0 }).flatten(1)[0...-1]
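The same zip approach can produce a distinct separator object for each gap, because Array.new evaluates its block once per slot. A small sketch, assuming an array a like the one in the question:

```ruby
a = [1, 2, 3]
# Array.new with a block runs the block for every slot, so each separator is a new object
result = a.zip(Array.new(a.size) { [] }).flatten(1)[0...-1]
result                      #=> [1, [], 2, [], 3]
result[1].equal?(result[3]) #=> false
```

flatten(1) removes only the pairing level introduced by zip, so array-valued separators stay intact.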
Another way
class Array
def interpolate(pol=nil)
new_ary = self.inject([]) do |memo, orig_item|
pol = yield if block_given?
memo += [orig_item, pol]
end
new_ary.pop
new_ary
end
end
[1,2,3].interpolate("A")
#=> [1, "A", 2, "A", 3]
[1,2,3].interpolate {Array.new}
#=> [1, [], 2, [], 3]
class Array
def interpolate_with val
res = []
self.each_with_index do |el, idx|
res << val unless idx == 0
res << el
end
res
end
end
Usage:
ruby-1.9.3-p0 :021 > [1,2,3].interpolate_with 0
=> [1, 0, 2, 0, 3]
ruby-1.9.3-p0 :022 > [1,2,3].interpolate_with []
=> [1, [], 2, [], 3]
Not really sure what you want to do with a block, but I would do it this way:
class Array
def interpolate(sep)
h, *t = self
t.empty? ? [h] : t.inject([h]) { |a, e| a.push(sep, e) }
end
end
UPDATE:
Benchmarks (array size = 100):
user system total real
inject 0.730000 0.000000 0.730000 ( 0.767565)
zip 1.030000 0.000000 1.030000 ( 1.034664)
Actually I am a bit surprised, I thought zip would be faster.
UPDATE2:
zip is faster, flatten is not.
Here's a simple version (which can handle multiple values and/or a block) using flat_map and each_cons:
class Array
def interpolate *values
each_cons(2).flat_map do |e, _|
[e, *values, *(block_given? ? yield(e) : [])]
end << last
end
end
[1,2,3].interpolate(0, "") # => [1, 0, "", 2, 0, "", 3]
[1,2,3].interpolate(&:even?) # => [1, false, 2, true, 3]
This does it in place:
class Array
def interpolate(t = nil)
each_with_index do |e, i|
t = yield if block_given?
insert(i, t) if i % 2 == 1
end
end
end
This works because t is inserted before the element with the current index, which makes the just inserted t the element with the current index, which means that the iteration can continue normally.
So many ways to do this. For example (Ruby 1.9):
class Array
def intersperse(item = nil)
return clone if self.empty?
take(self.length - 1).flat_map do |x|
[x, item || yield]
end + [self.last]
end
end
p [].intersperse(0)
#=> []
p [1, 2, 3, 4, 5].intersperse(0)
#=> [1, 0, 2, 0, 3, 0, 4, 0, 5]
p [1, 2, 3, 4, 5].intersperse { 0 }
#=> [1, 0, 2, 0, 3, 0, 4, 0, 5]
(I use the Haskell function name: intersperse.)
Here is one way:
theArray.map {|element| [element, interpolated_obj]}.flatten(1)[0...-1] # [0...-1] drops the trailing separator

Remove from the array elements that are repeated

What is the best way to remove from an array the elements that are repeated?
For example, from the array
a = [4, 3, 3, 1, 6, 6]
I need to get
a = [4, 1]
My method works too slowly with a large number of elements.
arr = [4, 3, 3, 1, 6, 6]
puts arr.join(" ")
nouniq = []
l = arr.length
uniq = nil
for i in 0..(l-1)
for j in 0..(l-1)
if (arr[j] == arr[i]) and ( i != j )
nouniq << arr[j]
end
end
end
arr = (arr - nouniq).compact
puts arr.join(" ")
a = [4, 3, 3, 1, 6, 6]
a.select{|b| a.count(b) == 1}
#=> [4, 1]
More complicated but faster solution (O(n log n) because of the sort):
a = [4, 3, 3, 1, 6, 6]
ar = []
add = proc{|to, from| to << from[1] if from.uniq.size == from.size }
a.sort!.each_cons(3){|b| add.call(ar, b)}
ar << a[0] if a[0] != a[1]; ar << a[-1] if a[-1] != a[-2]
arr = [4, 3, 3, 1, 6, 6]
arr.
group_by {|e| e }.
map {|e, es| [e, es.length] }.
reject {|e, count| count > 1 }.
map(&:first)
# [4, 1]
Without introducing the need for a separate copy of the original array and using inject:
[4, 3, 3, 1, 6, 6].inject({}) {|s,v| s[v] ? s.merge({v=>s[v]+1}) : s.merge({v=>1})}.select {|k,v| k if v==1}.keys
=> [4, 1]
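On Ruby 2.7+, the counting step can be written with Enumerable#tally, which takes one pass to count and one to filter. A hedged sketch (singletons is an illustrative name):

```ruby
def singletons(arr)
  counts = arr.tally                # {4=>1, 3=>2, 1=>1, 6=>2} for the example below
  arr.select { |e| counts[e] == 1 } # preserves the original order
end

singletons([4, 3, 3, 1, 6, 6]) #=> [4, 1]
```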
I needed something like this, so tested a few different approaches. These all return an array of the items that are duplicated in the original array:
module Enumerable
def dups
inject({}) {|h,v| h[v]=h[v].to_i+1; h}.reject{|k,v| v==1}.keys
end
def only_duplicates
duplicates = []
self.each {|each| duplicates << each if self.count(each) > 1}
duplicates.uniq
end
def dups_ej
inject(Hash.new(0)) {|h,v| h[v] += 1; h}.reject{|k,v| v==1}.keys
end
def dedup
duplicates = self.dup
self.uniq.each { |v| duplicates[self.index(v)] = nil }
duplicates.compact.uniq
end
end
Benchmark results for 100,000 iterations, first with an array of integers, then with an array of strings. Performance will vary depending on the number of duplicates found, but these tests use a fixed number of duplicates (about half of the array entries are duplicates):
test_benchmark_integer
user system total real
Enumerable.dups 2.560000 0.040000 2.600000 ( 2.596083)
Enumerable.only_duplicates 6.840000 0.020000 6.860000 ( 6.879830)
Enumerable.dups_ej 2.300000 0.030000 2.330000 ( 2.329113)
Enumerable.dedup 1.700000 0.020000 1.720000 ( 1.724220)
test_benchmark_strings
user system total real
Enumerable.dups 4.650000 0.030000 4.680000 ( 4.722301)
Enumerable.only_duplicates 47.060000 0.150000 47.210000 ( 47.478509)
Enumerable.dups_ej 4.060000 0.030000 4.090000 ( 4.123402)
Enumerable.dedup 3.290000 0.040000 3.330000 ( 3.334401)
..
Finished in 73.190988 seconds.
So of these approaches, it seems the Enumerable.dedup algorithm is the best:
dup the original array so that the original is left unmodified
get the uniq array elements
for each unique element: nil the first occurrence in the dup array
compact the result
If only (array - array.uniq) worked correctly! (it doesn't - it removes everything)
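The reason is that Array#- removes every element that appears in the other array, no matter how many times it occurs. A short illustration follows, plus a count-based alternative using Enumerable#tally (Ruby 2.7+; the variable names are arbitrary):

```ruby
array = [4, 3, 3, 1, 6, 6]

array - array.uniq #=> []

# Count occurrences, then keep the values seen more than once
dups = array.tally.select { |_value, count| count > 1 }.keys
dups #=> [3, 6]
```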
Here's my spin on a solution used by Perl programmers using a hash to accumulate counts for each element in the array:
ary = [4, 3, 3, 1, 6, 6]
ary.inject({}) { |h,a|
h[a] ||= 0
h[a] += 1
h
}.select{ |k,v| v == 1 }.keys # => [4, 1]
It could be on one line, if that's at all important, by judicious use of semicolons between the lines in the inject block.
A little different way is:
ary.inject({}) { |h,a| h[a] ||= 0; h[a] += 1; h }.map{ |k,v| k if (v==1) }.compact # => [4, 1]
It replaces the select{...}.keys with map{...}.compact so it's not really an improvement, and, to me is a bit harder to understand.

In Ruby, what is the cleanest way of obtaining the index of the largest value in an array?

If a is the array, I want a.index(a.max), but something more Ruby-like. It should be obvious, but I'm having trouble finding the answer on SO and elsewhere. Obviously, I am new to Ruby.
For Ruby 1.8.7 or above:
a.each_with_index.max[1]
It does one iteration. Not entirely the most semantic thing ever, but if you find yourself doing this a lot, I would wrap it in an index_of_max method anyway.
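Such a wrapper might look like the following sketch. The name index_of_max comes from the sentence above; defining it as a standalone method rather than on Array is an arbitrary choice made here.

```ruby
def index_of_max(a)
  # each_with_index yields [element, index] pairs; max compares them as arrays,
  # so ties on the element are broken by the larger index
  pair = a.each_with_index.max
  pair && pair[1] # nil for an empty array
end

index_of_max([4, 23, 56, 7]) #=> 2
index_of_max([3, 1, 2, 3])   #=> 3
index_of_max([])             #=> nil
```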
In Ruby 1.9.2 I can do this:
arr = [4, 23, 56, 7]
arr.rindex(arr.max) #=> 2
Here is my take on this question:
a = (1..12).to_a.shuffle
# => [8, 11, 9, 4, 10, 7, 3, 6, 5, 12, 1, 2]
a.each_index.max_by { |i| a[i] }
# => 9
Just wanted to note a behavioral and performance difference for some of the solutions here. The "tie breaking" behavior of duplicate max elements:
a = [3,1,2,3]
a.each_with_index.max[1]
# => 3
a.index(a.max)
# => 0
Out of curiosity I ran them both in Benchmark.bm (for the a above):
user system total real
each_with_index.max 0.000000 0.000000 0.000000 ( 0.000011)
index.max 0.000000 0.000000 0.000000 ( 0.000003)
Then I generated a new a with Array.new(10_000_000) { Random.rand } and reran the test:
user system total real
each_with_index.max 2.790000 0.000000 2.790000 ( 2.792399)
index.max 0.470000 0.000000 0.470000 ( 0.467348)
This makes me think unless you specifically need to choose the higher index max, a.index(a.max) is the better choice.
Here is a way to get all the index values of the max values if more than one.
Given:
> a
=> [1, 2, 3, 4, 5, 6, 7, 9, 9, 2, 3]
You can find the index of all the max values (or any given value) by:
> a.each_with_index.select {|e, i| e==a.max}.map &:last
=> [7, 8]
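In that snippet a.max is re-evaluated inside the block for every element. A variant that computes it once is sketched here with Array#each_index (the method name is hypothetical):

```ruby
def indices_of_max(a)
  max = a.max # computed a single time
  a.each_index.select { |i| a[i] == max }
end

indices_of_max([1, 2, 3, 4, 5, 6, 7, 9, 9, 2, 3]) #=> [7, 8]
```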
a = [1, 4, 8]
a.inject(a[0]) {|max, item| item > max ? item : max } #=> 8 (the maximum value itself, not its index)
At least it's Ruby-like :)
Using #each_with_index and #each_with_object. Only one pass required.
def index_of_first_max(e)
e.each_with_index.each_with_object({:max => nil, :idx => nil}) { |x, m|
x, i = x
if m[:max].nil? then m[:max] = x; m[:idx] = i
elsif m[:max] < x then m[:max] = x; m[:idx] = i
end
}[:idx]
end
Or combining #each_with_index with #inject:
def index_of_first_max(e)
e.each_with_index.inject([nil, 0]) { |m, x|
x, i = x
m, mi = m
if m.nil? || m < x then [x, i]
else [m, mi]
end
}.last
end

Resources