Removing duplicates in array of arrays in Ruby

I have an array of arrays, like this:
aa = [ [a,d], [a,d1], [a,d], [b,d], [b,d2], [b,d3], [b,d2], [a,d2] ]
I would like to have a unique array of arrays, not just on the first element - which I can do by doing something like aa.uniq(&:first) - but rather remove the inner arrays if BOTH values match. So the result would be:
aa = [ [a,d], [a,d1], [a,d2], [b,d], [b,d2], [b,d3] ]
Can anyone point me to an efficient way of doing this? I have a large number of arrays, on the order of 1 million, that I need to process.
Any help appreciated! John

If you need to maintain a collection of elements where each element is unique and their order is not important, you should use a Set. For instance,
require 'set'
my_set = Set.new
my_set << [1, 'a']
my_set << [1, 'a']
my_set << [1, 'b']
my_set.each { |elem| puts "#{elem}" }
It will give you
[1, "a"]
[1, "b"]
If the order is important, then use uniq! on your array
aa.uniq!
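As a rough sketch with the question's data (assuming the elements are strings, since the bare a and d in the question are not defined anywhere), Array#uniq compares the inner arrays by value and keeps the first occurrence of each. The final sort is only needed if the grouped order shown in the question matters.

```ruby
# Assumed sample data: the question's array with its elements quoted as strings.
aa = [%w[a d], %w[a d1], %w[a d], %w[b d], %w[b d2], %w[b d3], %w[b d2], %w[a d2]]

# uniq keeps the first occurrence of each inner array, in original order.
unique = aa.uniq
# => [["a", "d"], ["a", "d1"], ["b", "d"], ["b", "d2"], ["b", "d3"], ["a", "d2"]]

# Sorting afterwards gives the grouped order from the question.
sorted_unique = unique.sort
# => [["a", "d"], ["a", "d1"], ["a", "d2"], ["b", "d"], ["b", "d2"], ["b", "d3"]]
```

Unlike uniq!, uniq returns a new array and leaves aa untouched.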

If you want to get the unique elements of an array, removing any duplicates, you can try this:
a = [[1, 2], [2, 3], [1, 2], [2, 3], [3, 4]]
a & a #=> [[1, 2], [2, 3], [3, 4]]

Try like this:
aa = [ ["a","d"], ["a","d1"], ["a","d"], ["b","d"] ]
aa.uniq
#=> [["a", "d"], ["a", "d1"], ["b", "d"]]
You missed the double quotation marks ("). Inside the array, a, d, d1, etc. are meant to be strings, so you should put them inside double quotes ("").

Related

How can I sort this hash with an array as a value?

I have the following hash
{"f"=>[0, 1], "i"=>[1, 2], "n"=>[2, 2], "d"=>[3, 1], "g"=>[6, 1]}
I ultimately want to grab this value out of it "i"=>[1, 2]. I want to grab this value because the logic is that out of all these characters in some string, i has the highest value of occurrences 2 but appears first in a string given by its index in the array 1.
So for a string 'finding' the i character would be the first returned. I've made it so I generated this hash, now I just need to sort it so that the character that has the lowest index but the highest count will be first. Does anyone have an elegant solution to this?
In the Ruby sort_by block, Ruby can compare arrays, which is done in element order. In your case, you want reverse order by the second element of the array, then order by first element. So you can construct your sort block as follows:
arr = {"f"=>[0, 1], "i"=>[1, 2], "n"=>[2, 2], "d"=>[3, 1], "g"=>[6, 1]}
arr.sort_by { |a| [-a[1][1], a[1][0]] }.first
Ruby first converts arr to an array that looks like this:
[["f", [0, 1]], ["i", [1, 2]], ["n", [2, 2]], ["d", [3, 1]], ["g", [6, 1]]]
Then for each element that looks like [letter, [position, count]] (represented by the sort block argument, a), it is comparing [-count, position] for the sort.
This will give:
["i", [1,2]]
You can then do with that form whatever you wish.
Note, you can use max_by ... instead of sort_by ... .first in the above. I completely forgot about max_by, but Jörg W Mittag's nice answer reminded me.
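A minimal sketch of that variant, assuming the same hash as above (the variable names stats and best are made up for illustration). Because the sort key is [-count, position], the entry that sort_by would put first is the minimum, so min_by with the same key returns it without sorting everything:

```ruby
stats = {"f"=>[0, 1], "i"=>[1, 2], "n"=>[2, 2], "d"=>[3, 1], "g"=>[6, 1]}

# Smallest [-count, position] means highest count, then lowest position.
best = stats.min_by { |_char, (position, count)| [-count, position] }
# => ["i", [1, 2]]
```

Using max_by with the key [count, -position] should select the same entry.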
If I understand the question correctly, there is no need to sort the hash at all to get the answer you are looking for, since you never actually need the sorted hash, you only need the maximum.
So, something like this should do the trick:
hash.max_by {|(_, (idx, frequency))| [frequency, -idx] }
#=> [?i, [1, 2]]
The entire thing would then look something like this:
str = 'Hello Nett'
str.
  each_char.
  with_index.
  each_with_object(Hash.new {|h, k| h[k] = [0, nil]}) do |(char, idx), acc|
    next unless /\p{Alphabetic}/ === char
    char = char.downcase
    acc[char][1] ||= idx
    acc[char][0] += 1
  end.
  max_by {|(_, (frequency, idx))| [frequency, -idx] }
#=> [?e, [2, 1]]
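For comparison, here is a small sketch that builds the hash in the question's own [index, count] layout for the string 'finding' and then picks the winner. The helper name char_stats is hypothetical, and this version does not filter or downcase characters.

```ruby
# Hypothetical helper: maps each character to [first_index, count].
def char_stats(str)
  str.each_char.with_index.each_with_object({}) do |(char, idx), acc|
    acc[char] ||= [idx, 0] # remember where the character first appeared
    acc[char][1] += 1      # count every occurrence
  end
end

stats = char_stats('finding')
# => {"f"=>[0, 1], "i"=>[1, 2], "n"=>[2, 2], "d"=>[3, 1], "g"=>[6, 1]}

winner = stats.max_by { |_char, (idx, count)| [count, -idx] }
# => ["i", [1, 2]]
```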

Generate a filtered subset of repeated permutations of an array of objects (with given length k)

I'm new to Ruby. I need to generate all combinations of objects based on a length.
For example, array = [obj1, obj2, obj3], length = 2, then combinations are:
[
[obj1, obj1],
[obj1, obj2],
[obj1, obj3],
# ...
[obj3, obj3]
]
I know I can use repeated_permutation method for this problem, but I need also to be able to filter some permutations. For example, to filter out permutations where 2 identical objects are one after another, i.e. like this [obj1, obj1].
If all you need is to remove any pairs that are the same obj, you can simply use the permutation method.
arr = [1,2,3]
arr.permutation(2).to_a
#=> [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]
Given an arbitrary input array:
a = [1, 2, 3, 3, 4]
If you only wish to generate the unique permutations, then you can simply do:
a.uniq.permutation(2)
(uniq is not needed, if you know the initial array contains unique elements!)
However, as a more general solution, you must do:
a.repeated_permutation(2).reject { |permutation| ** FILTER RULE GOES HERE ** }
So for example, if you wish to keep only the results that do not have two consecutive repeated values, then you can do:
a.repeated_permutation(2).reject do |permutation|
  permutation.each_cons(2).any? { |x, y| x == y }
end
Taking this to the extreme, here is a generalised method:
def filtered_permutations(array, length)
  array.repeated_permutation(length).reject { |permutation| yield(permutation) }
end
# Or, if you prefer:
def filtered_permutations(array, length, &block)
  array.repeated_permutation(length).reject(&block)
end
# Usage:
a = [1, 2, 3, 3, 4]
filtered_permutations(a, 2) {|permutation| permutation.each_cons(2).any? {|x, y| x == y} }
# Or, if you prefer:
filtered_permutations(a, 2) {|permutation| permutation.each_cons(2).any? {|consecutive| consecutive.uniq.one?} }
Pass a block where you perform your "filtering". So to remove those with identical elements you'd go with:
a = [1,2,3]
a.repeated_permutation(2).reject { |permutation| permutation.uniq.one? }
#=> [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]
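To show where the two approaches differ, here is a short sketch (the method name without_adjacent_repeats is made up for illustration). For length 2 the adjacency filter and permutation agree, but for longer lengths permutation forbids any reuse of an element, while the filter only forbids neighbours that are equal.

```ruby
# Hypothetical helper: repeated permutations with no two equal neighbours.
def without_adjacent_repeats(array, length)
  array.repeated_permutation(length).reject do |perm|
    perm.each_cons(2).any? { |x, y| x == y }
  end
end

pairs = without_adjacent_repeats([1, 2, 3], 2)
# => [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]

triples = without_adjacent_repeats([1, 2], 3)
# => [[1, 2, 1], [2, 1, 2]]

# permutation cannot reuse elements, so it yields nothing here.
no_reuse = [1, 2].permutation(3).to_a
# => []
```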

Method to sort strings in descending order (in complex keys)

In order to descend-sort an array a of strings, reverse can be used.
a.sort.reverse
But when you want to use a string among multiple sort keys, that cannot be done. Suppose items is an array of items that have attributes attr1 (String), attr2 (String), attr3 (Integer). Sort can be done like:
items.sort_by{|item| [item.attr1, item.attr2, item.attr3]}
Switching from ascending to descending can be done independently for Integer by multiplying it with -1:
items.sort_by{|item| [item.attr1, item.attr2, -item.attr3]}
But such method is not straightforward for String. Can such method be defined? When you want to do descending sort with respect to attr2, it should be written like:
items.sort_by{|item| [item.attr1, item.attr2.some_method, item.attr3]}
I think you can always convert your strings into an array of integers (ord). Like this:
strings = [["Hello", "world"], ["Hello", "kitty"], ["Hello", "darling"]]
strings.sort_by do |s1, s2|
  [
    s1,
    s2.chars.map(&:ord).map{ |n| -n }
  ]
end
PS:
As @CarySwoveland caught, there is a corner case with the empty string, which could be solved with this inelegant solution:
strings.sort_by do |s1, s2|
  [
    s1,
    s2.chars.
      map(&:ord).
      tap{|chars| chars << -Float::INFINITY if chars.empty? }.
      map{ |n| -n }
  ]
end
And @Jordan kindly mentioned that sort_by uses a Schwartzian transform, so you don't need preprocessing at all.
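Another way to get a descending key inside sort_by, sketched here on the assumption that a small helper class is acceptable (the name Desc is made up for illustration and is not part of Ruby), is to wrap the value in an object whose <=> is reversed. Array comparison calls <=> on each element, so the wrapper can sit next to ordinary keys.

```ruby
# Hypothetical wrapper: compares in the opposite direction of the wrapped value.
class Desc
  include Comparable
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Swap the operands to reverse the ordering.
  def <=>(other)
    other.value <=> value
  end
end

arr = [[3, "dog"], [4, "cat"], [3, "cat"], [4, "dog"]]
sorted = arr.sort_by { |n, s| [n, Desc.new(s)] }
# => [[3, "dog"], [3, "cat"], [4, "dog"], [4, "cat"]]

# Prefixes and the empty string need no special handling.
words = ["a", "", "ab"].sort_by { |s| Desc.new(s) }
# => ["ab", "a", ""]
```

This relies only on the wrapped values responding to <=>, so it is not limited to strings.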
The following supports all objects that respond to <=>.
def generalized_array_sort(arr, inc_or_dec)
  arr.sort do |a,b|
    comp = 0
    a.zip(b).each_with_index do |(ae,be),i|
      next if (ae<=>be).zero?
      comp = (ae<=>be) * (inc_or_dec[i]==:inc ? 1 : -1)
      break
    end
    comp
  end
end
Example
arr = [[3, "dog"], [4, "cat"], [3, "cat"], [4, "dog"]]
inc_or_dec = [:inc, :dec]
generalized_array_sort(arr, inc_or_dec)
#=> [[3, "dog"], [3, "cat"], [4, "dog"], [4, "cat"]]
Another example
class A; end
class B<A; end
class C<B; end
[A,B,C].sort #=> [C, B, A]
arr = [[3, A], [4, B], [3, B], [4, A], [3, C], [4,C]]
inc_or_dec = [:inc, :dec]
generalized_array_sort(arr, inc_or_dec)
#=> [[3, A], [3, B], [3, C], [4, A], [4, B], [4, C]]
I'm not sure either of these passes your straightforwardness test, but I think both work correctly. Using @CarySwoveland's test data:
arr = [[3, "dog"], [4, "cat"], [3, "cat"], [4, "dog"]]
arr.sort_by {|a, b| [ a, *b.codepoints.map(&:-@) ] }
# => [[3, "dog"], [3, "cat"], [4, "dog"], [4, "cat"]]
Alternatively, here's a solution that works regardless of the type (i.e. it needn't be a string):
arr.sort do |a, b|
  c0 = a[0] <=> b[0]
  next c0 unless c0.zero?
  -(a[1] <=> b[1])
end
# => [[3, "dog"], [3, "cat"], [4, "dog"], [4, "cat"]]
The latter could be generalized as a method like so:
def arr_cmp(a, b, *dirs)
  return 0 if a.empty? && b.empty?
  return a <=> b if dirs.empty?
  a0, *a = a
  b0, *b = b
  dir, *dirs = dirs
  c0 = a0 <=> b0
  return arr_cmp(a, b, *dirs) if c0.zero?
  dir * c0
end
This works just like <=>, but as its final arguments it takes a list of 1s or -1s indicating the sort direction for each respective array element, e.g.:
a = [3, "dog"]
b = [3, "cat"]
arr_cmp(a, b, 1, 1) # => 1
arr_cmp(a, b, 1, -1) # => -1
Like <=> it's most useful in a sort block:
arr.sort {|a, b| arr_cmp(a, b, 1, -1) }
# => [[3, "dog"], [3, "cat"], [4, "dog"], [4, "cat"]]
I haven't tested it much, though, so there are probably edge cases for which it fails.
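While I have no idea about a generic academic implementation, in real life I would go with:
class String
  # Maps the first `precision` codepoints to one integer, five digits per codepoint.
  def hash_for_sort(precision = 5)
    (@h_f_p ||= {})[precision] ||= self[0...precision].codepoints.map do |cp|
      # Zero-pad on the left (rjust) so numeric order matches codepoint order.
      [cp, 99999].min.to_s.rjust(5, '0')
    end.join.to_i
  end
end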
Now feel free to sort by -item.attr2.hash_for_sort.
The approach above has some glitches:
strings that differ only after the first precision characters are not ordered correctly;
the initial call to the function is O(self.length);
codepoints above 99999 are considered equal (sorting is not accurate).
But taking the real circumstances into account, I cannot imagine when this won’t suffice.
P.S. If I were to solve this task precisely, I would search for an algorithm, converting strings to floats in a one-to-one manner.

Three ways to create a range, hash, array in ruby

I am doing a tutorial course on ruby and it asks for 3 ways to create range, hash, array.
I can only think of 2: (1..3) and Range.new(1,3) (and similarly for hash and array).
What is the third way?
The tutorial in question is The Odin Project
Ranges may be constructed using the s..e and s...e literals, or with ::new.
Ranges constructed using .. run from the beginning to the end inclusively.
Those created using ... exclude the end value. When used as an iterator, ranges return each value in the sequence.
(0..2) == (0..2) #=> true
(0..2) == Range.new(0,2) #=> true
(0..2) == (0...2) #=> false
For Arrays there's Array::[] (example taken directly from the docs):
Array.[]( 1, 'a', /^A/ ) # => [1, "a", /^A/]
Array[ 1, 'a', /^A/ ] # => [1, "a", /^A/]
[ 1, 'a', /^A/ ] # => [1, "a", /^A/]
Similarly there's Hash::[]. Not sure about Ranges; in fact, the docs (as far as I can tell) only mention literals and Range::new.
I can't see why you'd use these over a literal, but there you go.
You can also make an exclusive range, using (1...4), which if turned into an array would become [1, 2, 3]
(1..3) is an inclusive range, so it contains all numbers, from 1 to 3, but if you used (1...3), having 3 dots instead of 2 makes it exclusive, so it contains all numbers from 1, up to but not including 3.
As for arrays and hashes, #to_a, Array.[], #to_h, and Hash.[] will work.
(1..3).to_a
=> [1, 2, 3]
Array[1, 2, 3]
=> [1, 2, 3]
[[1, 2], [3, 4], [5, 6]].to_h
=> {1=>2, 3=>4, 5=>6}
Hash[ [[1, 2], [3, 4], [5, 6]] ]
=> {1=>2, 3=>4, 5=>6}
But they are probably looking for Array.[] and Hash.[] for the array and hash parts.
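Pulling the answers together, here is one possible set of three constructions per type. This is only a sketch of options that exist in core Ruby, not necessarily the trio the tutorial has in mind, and the variable names are arbitrary.

```ruby
# Arrays: literal, Array.[] and Array.new with a block.
a1 = [1, 2, 3]
a2 = Array[1, 2, 3]
a3 = Array.new(3) { |i| i + 1 }

# Hashes: literal, Hash.[] and Array#to_h.
h1 = { 1 => 2, 3 => 4 }
h2 = Hash[[[1, 2], [3, 4]]]
h3 = [[1, 2], [3, 4]].to_h

# Ranges: inclusive literal, exclusive literal and Range.new
# (a third argument of true excludes the end).
r1 = (1..3)
r2 = (1...4)
r3 = Range.new(1, 4, true)

r1.to_a # => [1, 2, 3]
r2.to_a # => [1, 2, 3]
r2 == r3 # => true
```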

Find all subsets in an array

I need help with solving this ruby array question.
Get all the subsets of an array. Unique set only. No repeats of any number. num_subset([1,2,3]) ==> result should be [[], ["1"], ["1", "2"], ["1", "2", "3"], ["1", "3"], ["2"], ["2", "3"], ["3"]]
def num_subset(arr)
  holder =[]
  order_subset = [[]]
  (arr.length).times do |m|
    arr.map do |n|
      holder += [n]
      order_subset << holder
    end
    holder =[] # resets holder
    arr.shift # takes the first element out
  end
  order_subset
end
My result ==> [[], ["1"], ["1", "2"], ["1", "2", "3"], ["2"], ["2", "3"], ["3"]]. My problem is that I am missing one result: ["1", "3"]
Need some help pointing me to the right direction. Spent hours on this already. Do not use #combination short cut. I need to work this out manually.
a = [1, 2, 3]
arr = []
for i in 0..(a.length) do
  arr = arr + a.combination(i).to_a
end
> arr
# [[], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]]
I believe this is the most rubyish solution to find combinations
a = [1,2,3,4]
p (0..a.length).collect { |i|
  a.combination(i).to_a
}.flatten(1)
# [[], [1], [2], [3], [4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4], [1, 2, 3, 4]]
Looks like you're looking at a starting point somewhere in the array and then looking at all sub arrays from that starting point on, after which you move the starting point down. That way, you're missing the sub arrays with gaps. For [1,2,3], the only sub array with a gap is [1,3].
For example (ignoring [] since you've hardcoded that)
[(1),2,3,4] -> [1]
[(1,2),3,4] -> [1,2]
[(1,2,3),4] -> [1,2,3]
[(1,2,3,4)] -> [1,2,3,4]
[1,(2),3,4] -> [2]
[1,(2,3),4] -> [2,3]
[1,(2,3,4)] -> [2,3,4]
[1,2,(3),4] -> [3]
[1,2,(3,4)] -> [3,4]
[1,2,3,(4)] -> [4]
So I'd expect your output for [1,2,3,4] to be [[],[1],[1,2],[1,2,3],[1,2,3,4],[2],[2,3],[2,3,4],[3],[3,4],[4]].
You really need to rethink your algorithm. You could try recursion. Take the head of your array (1), construct all possible sub arrays of the tail ([2,3]), duplicate that, and prefix half of it with the head. Of course, to construct the sub arrays, you call the same function, all the way down to an empty array.
[1,2,3] ->
....[2,3] ->
........[3] ->
............[] ->
................# an empty array is its own answer
................[]
............# duplicating the empty array and prefixing one with 3
............[3], []
........# duplicating the result from the last step and prefixing half with 2
........[2,3], [2], [3], []
....# duplicating the result from the last step and prefixing half with 1
....[1,2,3], [1,2], [1,3], [1], [2,3], [2], [3], []
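The recursion traced above could be written roughly like this. It is a sketch that follows the description rather than anyone's posted code; the method name subsets_head_tail is made up, and it does not use #combination.

```ruby
# Hypothetical helper: all subsets via head/tail recursion.
def subsets_head_tail(arr)
  # An empty array is its own answer: the only subset is [].
  return [[]] if arr.empty?

  head, *tail = arr
  rest = subsets_head_tail(tail)

  # One copy of the tail's subsets prefixed with the head, one copy as-is.
  rest.map { |subset| [head] + subset } + rest
end

result = subsets_head_tail([1, 2, 3])
# => [[1, 2, 3], [1, 2], [1, 3], [1], [2, 3], [2], [3], []]
```

The input array is not modified, because the head and tail are taken by destructuring instead of shift or pop.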
I have created a method to find all subsets of an array. It uses binary numbers as bitmasks, so each subset is built in a single pass over the array.
def find_subset(input_array)
  no_of_subsets = 2**input_array.length - 1
  all_subsets = []
  expected_length_of_binary_no = input_array.length
  for i in 1..(no_of_subsets) do
    binary_string = i.to_s(2)
    binary_string = binary_string.rjust(expected_length_of_binary_no, '0')
    binary_array = binary_string.split('')
    subset = []
    binary_array.each_with_index do |bin, index|
      if bin.to_i == 1
        subset.push(input_array[index])
      end
    end
    all_subsets.push(subset)
  end
  all_subsets
end
Output for [1,2,3] would be (the empty subset is not included, because the loop starts at 1):
[[3], [2], [2, 3], [1], [1, 3], [1, 2], [1, 2, 3]]
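A more compact sketch of the same bitmask idea, assuming the empty subset should be included as the question asks (the method name subsets_by_bitmask is made up). It tests bits with Integer#[] instead of building a binary string, so bit i of the mask decides whether element i is in the subset.

```ruby
# Hypothetical helper: one subset per integer mask from 0 to 2**n - 1.
def subsets_by_bitmask(arr)
  (0...(1 << arr.length)).map do |mask|
    # Keep the elements whose index bit is set in the mask.
    arr.each_index.select { |i| mask[i] == 1 }.map { |i| arr[i] }
  end
end

masks = subsets_by_bitmask([1, 2, 3])
# => [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```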
My solution.
The basic idea here is that the subsets of an array are:
the subsets of the array with one element removed (let's call these the old subsets), plus
each of the old subsets with the removed element added to it.
For example, Subsets([1, 2, 3]) are:
Subsets([1, 2]), the old_subsets, plus
each of the old_subsets with 3 tacked on.
(The code below removes the first element rather than the last; the idea is the same.)
def subsets(arr)
  return [[]] if arr.empty?
  old_subsets = subsets(arr.drop(1))
  new_subsets = []
  old_subsets.each do |subset|
    new_subsets << subset + [arr.first]
  end
  old_subsets + new_subsets
end
Recursive solution
def subsets(arr)
  (l = arr.pop) ? subsets(arr).map{|s| [s,s+[l]]}.flatten(1) : [[]]
end
or in a more descriptive way
def subsets(arr)
  return [[]] if arr.empty?
  last = arr.pop
  subsets(arr).map{|set| [set, set + [last]]}.flatten(1)
end
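Note that both pop-based versions empty the array that is passed in, since every recursive call pops one element off the caller's object. If that matters, a non-mutating variant of the same doubling idea can be sketched with reduce (the method name subsets_iterative is made up for illustration):

```ruby
# Hypothetical helper: start from [[]] and, for each element,
# add a copy of every subset built so far with that element appended.
def subsets_iterative(arr)
  arr.reduce([[]]) do |acc, element|
    acc + acc.map { |subset| subset + [element] }
  end
end

input = [1, 2, 3]
all = subsets_iterative(input)
# => [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
input
# => [1, 2, 3]
```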

Resources