Selecting certain keys in a Hash - ruby

I have a Hash h and want to have an Array of those keys, where their values fulfil a certain condition. My naive approach is:
result = h.select { |_, val| val.fulfils_condition? }.keys
This works, of course, but looks unnecessarily inefficient to me, as it requires an intermediate Hash to be constructed, from which then the result Array is calculated.
Of course I could do an explicit loop:
result = []
h.each do
|key, val|
result << key if val.fulfils_condition?
end
but this I consider ugly due to the explicit handling of result. I also was contemplating this one:
result = h.reduce([]) do
|memo, pair|
memo << pair.first if pair.last.fulfils_condition?
memo
end
but this is not really more readable, and requires the construction of the intermediate pair arrays, each holding a key-value-pair.
Is there an alternative approach, which is compact, and does not need to calculate a temporary Hash?

Given:
h = {}; (1..100).each {|v| h[v.to_s] = v }
You can use the memory_profiler gem to measure allocations. Use something like MemoryProfiler.report { code }.total_allocated.
If memory allocations are really at a premium here, your approach of preallocating the result and then enumerating with #each is what you want. The reason for this is that Ruby optimizes Hash#each for blocks with an arity of 2, so that a new array isn't constructed per loop. The only allocation in this approach is the results array.
MemoryProfiler.report { r = [] ; h.each {|k, v| r << k if v > 1 } }.total_allocated
# => 1
Using #reduce, OTOH, results in an allocation per loop because you break the arity rule:
MemoryProfiler.report { h.reduce([]) {|agg, (k, v)| agg << k if v > 1 ; agg } }.total_allocated
# => 101
If you want something more "self-contained" and are willing to sacrifice an extra allocation, you'll want to use #each_key (which, called without a block, creates an intermediate enumerator over the keys) and then index into the hash to test each value.
h.each_key.select {|k| h[k] > 1 }
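As a rough sketch of the #each approach described above, the explicit handling of the result array can be hidden in a small helper. The name keys_where and the sample hash are assumptions of mine, not something from the question:

```ruby
# Hypothetical helper (the name keys_where is an assumption).
# Collects the keys whose values satisfy the block, using Hash#each
# with a two-argument block as described above.
def keys_where(hash)
  result = []
  hash.each { |key, value| result << key if yield(value) }
  result
end

h = { "a" => 1, "b" => 5, "c" => 10 }
big = keys_where(h) { |value| value > 1 }
p big
```

The call site stays compact while the loop itself is unchanged.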

Simple is best I think. How about this?
h.keys.select { |k| k.meets_condition }

I love @ChrisHeald's solution
h.each_key.filter_map { |k| k if condition(h[k]) }
I would have gone with a more verbose
h.each_with_object([]) { |(k, v), arr| arr << k if condition(v) }
or even
h.map { |k, v| k if condition(v) }.compact
which are clearly constructing more than needed, but are still quite clear.

Riffing off Peter Camilleri's proposal, I think the following does what you want:
h = {:a => 1, :b => -1, :c => 2, :d => -2}
h.keys.select { |k| h[k] < 0 } # [:b, :d]
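Another compact variant, assuming Ruby 2.7 or later, is to call filter_map on the hash itself so that no second lookup is needed. This is only a sketch; I have not measured its allocations, and it silently drops keys that are nil or false:

```ruby
h = { a: 1, b: -1, c: 2, d: -2 }

# filter_map keeps the block's truthy results, so returning the key
# only when the value matches yields the wanted keys directly.
negative_keys = h.filter_map { |key, value| key if value < 0 }
p negative_keys
```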

Related

Finding the mode of an array in Ruby

When creating a method to find the mode of an array, I see people iterating over the array through a hash with default value 0:
def mode(array)
hash = Hash.new(0)
array.each do |i|
hash[i]+=1
end
end
or
freq = arr.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
Can someone explain the following part of the block?
hash[i] = hash[i] + 1 or h[v] = h[v] + 1
How does the iterator know to add +1 to each unique key of the hash? For example:
array = [1,1,1,2,3]
freq = array.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
#=> {1=>3, 2=>1, 3=>1}
If someone can explain how to find the mode of an array, I would be grateful.
In your first example, you need the method to return the hash that is created, or do some manipulation of the hash to compute the mode. Let's try it, just returning the hash (so I've added hash as the last line):
def hash_for_mode(array)
hash = Hash.new(0)
array.each do |i|
hash[i]+=1
end
hash
end
array = [1,3,1,4,3]
hash_for_mode(array) #=> {1=>2, 3=>2, 4=>1}
With hash_for_mode you can easily compute the mode.
By defining the hash h = Hash.new(0), we are telling Ruby that the default value is zero. By that, we mean that if a calculation is performed that depends on h[k] when k is not a key of h, h[k] will evaluate to the default value (without k being added to h).
Consider, for example, when the first value of array (1 in my example) is passed into the block and assigned to the block variable i. hash does not have a key 1. (It has no keys yet.) hash[1] += 1 is shorthand for hash[1] = hash[1] + 1, so Ruby will replace hash[1] on the right side of the equality with the default value, zero, resulting in hash[1] => 1.
When the third value of array (another 1) is passed into the block, hash[1] already exists (and equals 1) so we just add one to give it a new value 2.
In case you were wondering, if we have:
hash = Hash.new(0)
hash[1] += 1
hash #=> {1=>1}
puts hash[2] # prints 0, the default value
hash #=> {1=>1}
That is, merely referencing a key that is not in the hash (here puts hash[2]) does not add a key-value pair to the hash.
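The difference between reading and assigning can be checked with Hash#key?. This is a small sketch; the key name :missing is just an illustration:

```ruby
counts = Hash.new(0)

read_value = counts[:missing]               # only reads; returns the default
stored_after_read = counts.key?(:missing)   # the read stored nothing

counts[:missing] += 1                       # reads the default, then assigns 1
stored_after_write = counts.key?(:missing)  # the assignment created the key

p [read_value, stored_after_read, stored_after_write, counts]
```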
Another common way to do the same thing is:
def hash_for_mode(array)
array.each_with_object({}) { |i,hash| hash[i] = (hash[i] || 0) + 1 }
end
hash_for_mode(array) #=> {1=>2, 3=>2, 4=>1}
This relies on the fact that:
(hash[i] || 0) #=> hash[i] if hash already has a key i
(hash[i] || 0) #=> 0 if hash does not have a key i, so hash[i] => nil
(This requires that your hash does not contain any pairs k=>nil.)
Also, notice that rather than having the first statement:
hash = {}
and the last statement:
hash
I've used the method Enumerable#each_with_object, which returns the hash. This is preferred here to using Enumerable#inject (a.k.a. reduce) because you don't need to return hash to the iterator (no ; h needed).
array = [1,3,1,4,3]
array.group_by(&:itself).transform_values(&:count)
# => {1=>2, 3=>2, 4=>1}
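Once the frequency hash exists, the mode itself is the key (or keys) with the highest count. A minimal sketch, assuming that ties should all be returned and that an empty array has no mode; the method name modes is my own:

```ruby
# Hypothetical helper: returns every value that occurs most often.
def modes(array)
  counts = array.each_with_object(Hash.new(0)) { |item, hash| hash[item] += 1 }
  highest = counts.values.max
  # For an empty array counts is empty, so select returns {} and keys is [].
  counts.select { |_, count| count == highest }.keys
end

p modes([1, 3, 1, 4, 3])
p modes([1, 1, 1, 2, 3])
```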

Ruby - Return duplicates in an array using hashes, is this efficient?

I have solved the problem using normal loops and now using hashes, however I am not confident I used the hashes as well as I could have. Here is my code:
# 1-100 whats duplicated
def whats_duplicated?(array)
temp = Hash.new
output = Hash.new
# Write the input array numbers to a hash table and count them
array.each do |element|
if temp[element] >= 1
temp[element] += 1
else
temp[element] = 1
end
end
# Another hash, of only the numbers who appeared 2 or more times
temp.each do |hash, count|
if count > 1
output[hash] = count
end
end
# Return our sorted and formatted list as a string for screen
output.sort.inspect
end
### Main
# array_1 is an array 1-100 with duplicate numbers
array_1 = []
for i in 0..99
array_1[i] = i+1
end
# seed 10 random indexes which will likely be duplicates
for i in 0..9
array_1[rand(0..99)] = rand(1..100)
end
# print to screen the duplicated numbers & their count
puts whats_duplicated?(array_1)
My question is really what to improve? This is a learning exercise for myself, I am practising some of the typical brain-teasers you may get in an interview and while I can do this easily using loops, I want to learn an efficient use of hashes. I re-did the problem using hashes hoping for efficiency but looking at my code I think it isn't the best it could be. Thanks to anyone who takes an interest in this!
The easiest way to find duplicates in Ruby is to group the elements and then count how many are in each group:
def whats_duplicated?(array)
array.group_by { |x| x }.select { |_, xs| xs.length > 1 }.keys
end
whats_duplicated?([1,2,3,3,4,5,3,2])
# => [2, 3]
def whats_duplicated?(array)
array.each_with_object(Hash.new(0)) { |val, hsh| hsh[val] += 1 }.select { |k,v| v > 1 }.keys
end
I would do it this way:
def duplicates(array)
counts = Hash.new { |h,k| h[k] = 0 }
array.each do |number|
counts[number] += 1
end
counts.select { |k,v| v > 1 }.keys
end
array = [1,2,3,4,4,5,6,6,7,8,8,9]
puts duplicates(array)
# => [4,6,8]
Some comments about your code: The block if temp[element] == 1 seems not correct. I think that will fail if a number occurs three or more times in the array. You should at least fix it to:
if temp[element] # check if element exists in hash
temp[element] += 1 # if it does increment
else
temp[element] = 1 # otherwise init hash at that position with `1`
end
Furthermore I recommend not to use the for x in foo syntax. Use foo.each do |x| instead. Hint: I like to ask in interviews about the difference between both versions.
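One difference between the two forms that I am sure of is variable scope: the loop variable of for stays visible after the loop, while a block parameter of each does not. A small sketch (the variable names are arbitrary):

```ruby
for i in 1..3
  # nothing to do here; we only care about i afterwards
end
leaked = i                # i is still defined after the loop

[1, 2, 3].each { |j| }    # j is local to the block
block_var = defined?(j)   # nil, because j does not exist out here

p [leaked, block_var]
```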

Rank hash keys based on value

I'm trying to determine a rank for each key in a hash against the other keys based on its value. The value is numeric. Ranks can be repeated (i.e. 3 keys can tie for first place). This works, but is ugly.
standings.sort_by {|k, v| v}.reverse!
prev_k = nil
standings.each_with_index do |(k, v), i|
if i == 0
k.rank = 1
elsif v == standings[prev_k]
k.rank = prev_k.rank
else
k.rank = prev_k.rank + 1
end
prev_k = k
end
Give this a try:
ranks = Hash[standings.values.sort.uniq.reverse.each_with_index.to_a]
standings.each { |k, v| k.rank = ranks[v] + 1 }
I'm not sure it's any prettier, but it's a bit more compact, carries fewer loop variables, and has no conditionals.
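To see the idea run on its own, here is a sketch that returns the ranks as a hash instead of assigning k.rank. The sample standings and the symbol keys are assumptions of mine, since the question does not show its data:

```ruby
# Hypothetical data: name => score.
standings = { alice: 30, bob: 20, carol: 30, dave: 10 }

# Map each distinct score to its rank, with the highest score at rank 1.
rank_of_score = standings.values.uniq.sort.reverse
                         .each_with_index
                         .map { |score, index| [score, index + 1] }
                         .to_h

# Tied scores share a rank; the next distinct score gets the next rank.
ranks = standings.transform_values { |score| rank_of_score[score] }
p ranks
```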

Ruby #select, but only select a certain number

What's the best way in Ruby to do something like my_array.select(n){ |elem| ... }, where the n means "I only want n elements returned, and stop evaluating after that number is reached"?
This should do the trick:
my_array.select { |elem| elem.meets_condition? }.take(n)
However, this will still evaluate all items.
If you have a lazy enumerator, you could do this in a more efficient manner.
https://github.com/ruby/ruby/pull/100 shows an attempt at enabling this feature.
You can easily implement lazy_select:
module Enumerable
def lazy_select
Enumerator.new do |yielder|
each do |e|
yielder.yield(e) if yield(e)
end
end
end
end
Then things like
(1..10000000000).to_enum.lazy_select{|e| e % 3 == 0}.take(3)
# => [3, 6, 9]
execute instantly.
Looks like there's no avoiding a traditional loop if you're using stock 1.8.7 or 1.9.2...
result = []
num_want = 4
i = 0
while (elem = my_array[i]) && result.length < num_want
result << elem if elem.some_condition
i += 1
end
You could make an Enumerable-like extension which has your desired selectn semantics:
module SelectN
def selectn(n)
out = []
each do |e|
break if n <= 0
if yield e
out << e
n -= 1
end
end
out
end
end
a = (0..9).to_a
a.select{ |e| e%3 == 0 } # [0, 3, 6, 9]
a.extend SelectN
a.selectn(1) { |e| e%3 == 0 } # [0]
a.selectn(3) { |e| e%3 == 0 } # [0, 3, 6]
# for convenience, you could inject this behavior into all Arrays
# the usual caveats about monkey-patching std library behavior apply
class Array; include SelectN; end
(0..9).to_a.selectn(2) { |e| e%3 == 0 } # [0,3]
(0..9).to_a.selectn(99) { |e| e%3 == 0 } # [0,3, 6, 9]
Why not flip this around and do the #take before the #select:
my_array.take(n).select { |elem| ... }
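That will ensure you only do your computation for n items. Note, however, that it only examines the first n elements of the array, so it can return fewer than n matches.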
EDIT:
Enumerable::Lazy is known to be slower, but if your computation is known to be more computationally expensive than the lazy slowness, you can use the Ruby 2.0 feature:
my_array.lazy.select { |elem| ... }.first(n)
See: http://blog.railsware.com/2012/03/13/ruby-2-0-enumerablelazy/
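A small sketch of the lazy approach that also counts how many elements the block actually sees. The range and the counter are only there for illustration. Note that on a lazy enumerator take(n) itself stays lazy, so first(n) (or take(n).to_a) is used here to get an array:

```ruby
evaluated = 0

result = (1..1_000_000).lazy
                       .select { |e| evaluated += 1; e % 3 == 0 }
                       .first(3)

p result      # the first three multiples of 3
p evaluated   # how many elements the block was called for
```

Evaluation stops as soon as the third match is found, so the block runs for only a handful of elements rather than the whole range.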
I guess broken loop can be done in old-fashioned loop style with break or something like this:
n = 5
[1,2,3,4,5,6,7].take_while { |e| n -= 1; n >= 0 && e < 7 }
In functional language this would be recursion, but without TCO it doesn't make much sense in Ruby.
UPDATE
take_while was stupid idea as dbenhur pointed out, so I don't know anything better than a loop.

How to find and return a duplicate value in array

arr is array of strings:
["hello", "world", "stack", "overflow", "hello", "again"]
What would be an easy and elegant way to check if arr has duplicates, and if so, return one of them (no matter which)?
Examples:
["A", "B", "C", "B", "A"] # => "A" or "B"
["A", "B", "C"] # => nil
a = ["A", "B", "C", "B", "A"]
a.detect{ |e| a.count(e) > 1 }
I know this isn't a very elegant answer, but I love it. It's a beautiful one-liner, and it works perfectly fine unless you need to process a huge data set.
Looking for faster solution? Here you go!
def find_one_using_hash_map(array)
map = {}
dup = nil
array.each do |v|
map[v] = (map[v] || 0 ) + 1
if map[v] > 1
dup = v
break
end
end
return dup
end
It's linear, O(n), but now needs to manage multiple lines-of-code, needs test cases, etc.
If you need an even faster solution, maybe try C instead.
And here is the gist comparing different solutions: https://gist.github.com/naveed-ahmad/8f0b926ffccf5fbd206a1cc58ce9743e
You can do this in a few ways, with the first option being the fastest:
ary = ["A", "B", "C", "B", "A"]
ary.group_by{ |e| e }.select { |k, v| v.size > 1 }.map(&:first)
ary.sort.chunk{ |e| e }.select { |e, chunk| chunk.size > 1 }.map(&:first)
And an O(N^2) option (i.e. less efficient):
ary.select{ |e| ary.count(e) > 1 }.uniq
Simply find the first instance where the index of the object (counting from the left) does not equal the index of the object (counting from the right).
arr.detect {|e| arr.rindex(e) != arr.index(e) }
If there are no duplicates, the return value will be nil.
I believe this is the fastest solution posted in the thread so far, as well, since it doesn't rely on the creation of additional objects, and #index and #rindex are implemented in C. The big-O runtime is N^2 and thus slower than Sergio's, but the wall time could be much faster due to the fact that the "slow" parts run in C.
detect only finds one duplicate. find_all will find them all:
a = ["A", "B", "C", "B", "A"]
a.find_all { |e| a.count(e) > 1 }
Here are two more ways of finding a duplicate.
Use a set
require 'set'
def find_a_dup_using_set(arr)
s = Set.new
arr.find { |e| !s.add?(e) }
end
find_a_dup_using_set arr
#=> "hello"
Use select in place of find to return an array of all duplicates.
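For instance, a sketch of that select variant could look like this. The method name all_dups_using_set is my own, and uniq is added so that an element occurring three times is reported once:

```ruby
require 'set'

# Hypothetical variant of find_a_dup_using_set that returns every duplicate.
def all_dups_using_set(arr)
  seen = Set.new
  # Set#add? returns nil when the element is already present,
  # so the block is true exactly for repeated occurrences.
  arr.select { |e| !seen.add?(e) }.uniq
end

p all_dups_using_set(%w[A B C B A A])
```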
Use Array#difference
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
def find_a_dup_using_difference(arr)
arr.difference(arr.uniq).first
end
find_a_dup_using_difference arr
#=> "hello"
Drop .first to return an array of all duplicates.
Both methods return nil if there are no duplicates.
I proposed that Array#difference be added to the Ruby core. More information is in my answer here.
Benchmark
Let's compare suggested methods. First, we need an array for testing:
CAPS = ('AAA'..'ZZZ').to_a.first(10_000)
def test_array(nelements, ndups)
arr = CAPS[0, nelements-ndups]
arr = arr.concat(arr[0,ndups]).shuffle
end
and a method to run the benchmarks for different test arrays:
require 'fruity'
def benchmark(nelements, ndups)
arr = test_array nelements, ndups
puts "\n#{ndups} duplicates\n"
compare(
Naveed: -> {arr.detect{|e| arr.count(e) > 1}},
Sergio: -> {(arr.inject(Hash.new(0)) {|h,e| h[e] += 1; h}.find {|k,v| v > 1} ||
[nil]).first },
Ryan: -> {(arr.group_by{|e| e}.find {|k,v| v.size > 1} ||
[nil]).first},
Chris: -> {arr.detect {|e| arr.rindex(e) != arr.index(e)} },
Cary_set: -> {find_a_dup_using_set(arr)},
Cary_diff: -> {find_a_dup_using_difference(arr)}
)
end
I did not include @JjP's answer because only one duplicate is to be returned, and when his/her answer is modified to do that it is the same as @Naveed's earlier answer. Nor did I include @Marin's answer, which, while posted before @Naveed's answer, returned all duplicates rather than just one (a minor point, but there's no point evaluating both, as they are identical when returning just one duplicate).
I also modified other answers that returned all duplicates to return just the first one found, but that should have essentially no effect on performance, as they computed all duplicates before selecting one.
The results for each benchmark are listed from fastest to slowest:
First suppose the array contains 100 elements:
benchmark(100, 0)
0 duplicates
Running each test 64 times. Test will take about 2 seconds.
Cary_set is similar to Cary_diff
Cary_diff is similar to Ryan
Ryan is similar to Sergio
Sergio is faster than Chris by 4x ± 1.0
Chris is faster than Naveed by 2x ± 1.0
benchmark(100, 1)
1 duplicates
Running each test 128 times. Test will take about 2 seconds.
Cary_set is similar to Cary_diff
Cary_diff is faster than Ryan by 2x ± 1.0
Ryan is similar to Sergio
Sergio is faster than Chris by 2x ± 1.0
Chris is faster than Naveed by 2x ± 1.0
benchmark(100, 10)
10 duplicates
Running each test 1024 times. Test will take about 3 seconds.
Chris is faster than Naveed by 2x ± 1.0
Naveed is faster than Cary_diff by 2x ± 1.0 (results differ: AAC vs AAF)
Cary_diff is similar to Cary_set
Cary_set is faster than Sergio by 3x ± 1.0 (results differ: AAF vs AAC)
Sergio is similar to Ryan
Now consider an array with 10,000 elements:
benchmark(10000, 0)
0 duplicates
Running each test once. Test will take about 4 minutes.
Ryan is similar to Sergio
Sergio is similar to Cary_set
Cary_set is similar to Cary_diff
Cary_diff is faster than Chris by 400x ± 100.0
Chris is faster than Naveed by 3x ± 0.1
benchmark(10000, 1)
1 duplicates
Running each test once. Test will take about 1 second.
Cary_set is similar to Cary_diff
Cary_diff is similar to Sergio
Sergio is similar to Ryan
Ryan is faster than Chris by 2x ± 1.0
Chris is faster than Naveed by 2x ± 1.0
benchmark(10000, 10)
10 duplicates
Running each test once. Test will take about 11 seconds.
Cary_set is similar to Cary_diff
Cary_diff is faster than Sergio by 3x ± 1.0 (results differ: AAE vs AAA)
Sergio is similar to Ryan
Ryan is faster than Chris by 20x ± 10.0
Chris is faster than Naveed by 3x ± 1.0
benchmark(10000, 100)
100 duplicates
Cary_set is similar to Cary_diff
Cary_diff is faster than Sergio by 11x ± 10.0 (results differ: ADG vs ACL)
Sergio is similar to Ryan
Ryan is similar to Chris
Chris is faster than Naveed by 3x ± 1.0
Note that find_a_dup_using_difference(arr) would be much more efficient if Array#difference were implemented in C, which would be the case if it were added to the Ruby core.
Conclusion
Many of the answers are reasonable but using a Set is the clear best choice. It is fastest in the medium-hard cases, joint fastest in the hardest and only in computationally trivial cases - when your choice won't matter anyway - can it be beaten.
The one very special case in which you might pick Chris' solution would be if you want to use the method to separately de-duplicate thousands of small arrays and expect to find a duplicate typically less than 10 items in. This will be a bit faster as it avoids the small additional overhead of creating the Set.
Alas most of the answers are O(n^2).
Here is an O(n) solution,
a = %w{the quick brown fox jumps over the lazy dog}
h = Hash.new(0)
a.find { |each| (h[each] += 1) == 2 } # => "the"
What is the complexity of this?
Runs in O(n) and breaks on first match
Uses O(n) memory, but only the minimal amount
Now, depending on how frequent duplicates are in your array, these runtimes might actually become even better. For example, if the array of size n has been sampled from a population of only k << n different elements, the complexity for both runtime and space becomes O(k). However, it is more likely that the original poster is validating input and wants to make sure there are no duplicates. In that case both runtime and memory complexity are O(n), since we expect the elements to have no repetitions for the majority of inputs.
Ruby Array objects have a great method, select.
select {|item| block } → new_ary
select → an_enumerator
The first form is what interests you here. It allows you to select objects which pass a test.
Ruby Array objects have another method, count.
count → int
count(obj) → int
count { |item| block } → int
In this case, you are interested in duplicates (objects which appear more than once in the array). The appropriate test is a.count(obj) > 1.
If a = ["A", "B", "C", "B", "A"], then
a.select{|item| a.count(item) > 1}.uniq
=> ["A", "B"]
You state that you only want one object. So pick one.
find_all() returns an array containing all elements of enum for which block is not false.
To get duplicate elements
>> arr = ["A", "B", "C", "B", "A"]
>> arr.find_all { |x| arr.count(x) > 1 }
=> ["A", "B", "B", "A"]
Or duplicate uniq elements
>> arr.find_all { |x| arr.count(x) > 1 }.uniq
=> ["A", "B"]
Something like this will work
arr = ["A", "B", "C", "B", "A"]
arr.inject(Hash.new(0)) { |h,e| h[e] += 1; h }.
select { |k,v| v > 1 }.
collect { |x| x.first }
That is, put all values into a hash where the key is the element of the array and the value is the number of occurrences. Then select all elements which occur more than once. Easy.
Ruby 2.7 introduced Enumerable#tally
And you can use it this way:
ary = ["A", "B", "C", "B", "A", "A"]
ary.tally.select { |_, count| count > 1 }.keys
# => ["A", "B"]
ary = ["A", "B", "C"]
ary.tally.select { |_, count| count > 1 }.keys
# => []
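Since the question only asks for one duplicate (or nil), tally can be combined with find. This is a sketch assuming Ruby 2.7 or later; the method name first_duplicate is my own. Unlike the Set-based answer it counts the whole array before looking for a match:

```ruby
# Hypothetical helper: returns one duplicated element, or nil if there is none.
def first_duplicate(ary)
  pair = ary.tally.find { |_, count| count > 1 }
  pair&.first   # pair is [element, count], or nil when nothing repeats
end

p first_duplicate(["A", "B", "C", "B", "A"])
p first_duplicate(["A", "B", "C"])
```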
I know this thread is about Ruby specifically, but I landed here looking for how to do this within the context of Ruby on Rails with ActiveRecord and thought I would share my solution too.
class ActiveRecordClass < ActiveRecord::Base
#has two columns, a primary key (id) and an email_address (string)
end
ActiveRecordClass.group(:email_address).having("count(*) > 1").count.keys
The above returns an array of all email addresses that are duplicated in this example's database table (which in Rails would be "active_record_classes").
a = ["A", "B", "C", "B", "A"]
a.each_with_object(Hash.new(0)) {|i,hash| hash[i] += 1}.select{|_, count| count > 1}.keys
This is a O(n) procedure.
Alternatively you can do either of the following lines. Also O(n) but only one iteration
a.each_with_object(Hash.new(0).merge dup: []){|x,h| h[:dup] << x if (h[x] += 1) == 2}[:dup]
a.inject(Hash.new(0).merge dup: []){|h,x| h[:dup] << x if (h[x] += 1) == 2;h}[:dup]
This code will return list of duplicated values. Hash keys are used as an efficient way of checking which values have already been seen. Based on whether value has been seen, the original array ary is partitioned into 2 arrays: first containing unique values and second containing duplicates.
ary = ["hello", "world", "stack", "overflow", "hello", "again"]
hash={}
ary.partition { |v| hash.has_key?(v) ? false : hash[v]=0 }.last.uniq
=> ["hello"]
You can further shorten it - albeit at a cost of slightly more complex syntax - to this form:
hash={}
ary.partition { |v| !hash.has_key?(v) && hash[v]=0 }.last.uniq
Here is my take on it on a big set of data - such as a legacy dBase table to find duplicate parts
# Assuming ps is an array of 20000 part numbers & we want to find duplicates
# actually had to do it recently.
# having a result hash with part number and number of times part is
# duplicated is much more convenient in the real world application
# Takes about 6 seconds to run on my data set
# - not too bad for an export script handling 20000 parts
h = {};
# or for readability
h = {} # result hash
ps.select{ |e|
ct = ps.count(e)
h[e] = ct if ct > 1
}; nil # so that the huge result of select doesn't print in the console
r = [1, 2, 3, 5, 1, 2, 3, 1, 2, 1]
r.group_by(&:itself).map { |k, v| v.size > 1 ? [k] + [v.size] : nil }.compact.sort_by(&:last).map(&:first)
each_with_object is your friend!
input = [:bla,:blubb,:bleh,:bla,:bleh,:bla,:blubb,:brrr]
# to get the counts of the elements in the array:
> input.each_with_object({}){|x,h| h[x] ||= 0; h[x] += 1}
=> {:bla=>3, :blubb=>2, :bleh=>2, :brrr=>1}
# to get only the counts of the non-unique elements in the array:
> input.each_with_object({}){|x,h| h[x] ||= 0; h[x] += 1}.reject{|k,v| v < 2}
=> {:bla=>3, :blubb=>2, :bleh=>2}
a = ["A", "B", "C", "B", "A"]
b = a.select {|e| a.count(e) > 1}.uniq
c = a - b
d = b + c
Results
d
=> ["A", "B", "C"]
If you are comparing two different arrays (instead of one against itself) a very fast way is to use the intersect operator & provided by Ruby's Array class.
# Given
a = ['a', 'b', 'c', 'd']
b = ['e', 'f', 'c', 'd']
# Then this...
a & b # => ['c', 'd']
This runs very quickly (iterated through 2.3mil ids, took less than a second to push dups into their own array)
I had to do this at work with 2.3 million IDs imported from a file. I imported the list already sorted, but it can also be sorted in Ruby.
list = CSV.read(path).flatten.sort
dup_list = []
list.each_with_index do |id, index|
dup_list.push(id) if id == list[index +1]
end
dup_list.to_set.to_a
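The same neighbour comparison can be written with each_cons, which avoids the manual index arithmetic. This is a sketch on a tiny in-memory list rather than a CSV file; the method name dups_in_sorted is my own, and the input is assumed to be sorted already:

```ruby
# Hypothetical helper: expects a sorted list, so equal values are adjacent.
def dups_in_sorted(list)
  list.each_cons(2)                       # every pair of neighbours
      .select { |left, right| left == right }
      .map(&:first)
      .uniq                               # report each duplicate once
end

p dups_in_sorted([3, 1, 2, 3, 2, 3, 4].sort)
```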
def duplicates_in_array(array)
hash = {}
duplicates_hash = {}
array.each do |v|
hash[v] = (hash[v] || 0 ) + 1
end
hash.keys.each do |hk|
duplicates_hash[hk] = hash[hk] if hash[hk] > 1
end
return duplicates_hash
end
This will return a hash containing each duplicate in the array and the number of times it occurs
for example:
array = [1,2,2,4,5,6,7,7,7,7]
duplicates_in_array(array)
=> {2=>2, 7=>4}
I needed to find out how many duplicates there were and what they were so I wrote a function building off of what Naveed had posted earlier:
def print_duplicates(array)
puts "Array count: #{array.count}"
map = {}
total_dups = 0
array.each do |v|
map[v] = (map[v] || 0 ) + 1
end
map.each do |k, v|
if v != 1
puts "#{k} appears #{v} times"
total_dups += 1
end
end
puts "Total items that are duplicated: #{total_dups}"
end
Let's create a duplication method that takes an array of elements as input.
In the method body, we create two new arrays: seen_objects for the objects seen so far and duplication_objects for the duplicates.
We then iterate over the given array and, on each iteration, check whether the object is already in seen_objects.
If it is, the object is a duplicate and is pushed onto duplication_objects.
If it is not, the object is unique so far and is pushed onto seen_objects.
Here is the implementation:
def duplication given_array
seen_objects = []
duplication_objects = []
given_array.each do |element|
duplication_objects << element if seen_objects.include?(element)
seen_objects << element
end
duplication_objects
end
Now call duplication method and output return result -
dup_elements = duplication [1,2,3,4,4,5,6,6]
puts dup_elements.inspect
[1,2,3].uniq!.nil? => true
[1,2,3,3].uniq!.nil? => false
Notice the above is destructive
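If the array must not be modified, a non-destructive check is to compare lengths. A minimal sketch; the method name duplicates? is my own:

```ruby
# Hypothetical helper: true if any element occurs more than once.
# Array#uniq returns a new array, so the receiver is left untouched.
def duplicates?(array)
  array.uniq.length != array.length
end

original = [1, 2, 3, 3]
p duplicates?(original)   # the array has a repeated 3
p original                # still contains both 3s
```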
