Comparing two arrays ignoring element order in Ruby - ruby

I need to check whether two arrays contain the same data in any order.
Using the imaginary compare method, I would like to do:
arr1 = [1,2,3,5,4]
arr2 = [3,4,2,1,5]
arr3 = [3,4,2,1,5,5]
arr1.compare(arr2) #true
arr1.compare(arr3) #false
I used arr1.sort == arr2.sort, which appears to work, but is there a better way of doing this?

The easiest way is to use intersections:
@array1 = [1,2,3,4,5]
@array2 = [2,3,4,5,1]
So the statement
@array2 & @array1 == @array2
Will be true. This is the best solution if you want to check whether array1 contains array2 or the opposite (that is different). You're also not fiddling with your arrays or changing the order of the items.
You can also compare the length of both arrays if you want them to be identical in size:
@array1.size == @array2.size && @array1 & @array2 == @array1
It's also the fastest way to do it (correct me if I'm wrong)

Sorting the arrays prior to comparing them is O(n log n). Moreover, as Victor points out, you'll run into trouble if the array contains non-sortable objects. It's faster to compare histograms, O(n).
You'll find Enumerable#frequency in Facets, but if you prefer to avoid adding more dependencies you can implement it yourself, which is pretty straightforward:
require 'facets'
[1, 2, 1].frequency == [2, 1, 1].frequency
#=> true
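On Ruby 2.7 and later the same histogram comparison can be done without Facets, because Enumerable#tally is built in. The following is a minimal sketch; the helper name same_elements? is made up for illustration and is not part of any library:

```ruby
# Hypothetical helper; relies on Enumerable#tally (Ruby 2.7+).
# Two arrays hold the same data in any order when every element
# occurs the same number of times in both.
def same_elements?(a, b)
  a.size == b.size && a.tally == b.tally
end

p same_elements?([1, 2, 3, 5, 4], [3, 4, 2, 1, 5])     # prints true
p same_elements?([1, 2, 3, 5, 4], [3, 4, 2, 1, 5, 5])  # prints false
p same_elements?([1, nil, "a"], ["a", 1, nil])         # prints true
```

Because tally counts elements by hash key rather than sorting them, this also works for mixed or non-sortable elements. The size check is only a cheap early exit.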

If you know that there are no repetitions in any of the arrays (i.e., all the elements are unique or you don't care), using sets is straightforward and readable:
require 'set' # only needed on Ruby versions before 3.2
Set.new(array1) == Set.new(array2)

You can actually implement this #compare method by monkey patching the Array class like this:
class Array
  def compare(other)
    sort == other.sort
  end
end
Keep in mind that monkey patching is rarely considered a good practice and you should be cautious when using it.
There's probably a better way to do this, but that's what came to mind. Hope it helps!

The most elegant way I have found:
arr1 = [1,2,3,5,4]
arr2 = [3,4,2,1,5]
arr3 = [3,4,2,1,5,5]
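(arr1 - arr2).empty?
=> true
(arr3 - arr2).empty?
=> true
Note that Array#- removes every occurrence of a matching element, so the duplicate 5 in arr3 goes unnoticed. This check only tells you that each element of the left array also appears in the right one.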

You can open the Array class and define a method like this:
require 'set'
class Array
  def compare(comparate)
    to_set == comparate.to_set
  end
end
arr1.compare(arr2)
irb => true
OR use simply
arr1.to_set == arr2.to_set
irb => true

Here is a version that will work on unsortable arrays:
class Array
  def unordered_hash
    unless @_compare_o && @_compare_o == hash
      p = Hash.new(0)
      each{ |v| p[v] += 1 }
      @_compare_p = p.hash
      @_compare_o = hash
    end
    @_compare_p
  end
  def compare(b)
    unordered_hash == b.unordered_hash
  end
end
a = [ 1, 2, 3, 2, nil ]
b = [ nil, 2, 1, 3, 2 ]
puts a.compare(b)

Use the difference method if the arrays have the same length:
https://ruby-doc.org/core-2.7.0/Array.html#method-i-difference
arr1 = [1,2,3]
arr2 = [1,2,4]
arr1.difference(arr2) # => [3]
arr2.difference(arr1) # => [4]
# to check that arrays are equal:
arr2.difference(arr1).empty?
Otherwise you could use
# to check that arrays are equal:
arr1.sort == arr2.sort

Related

Best way to interleave two enums in ruby?

I'm looking for a more elegant way of blending together two SQL resultsets with a given ratio. Within each of them I want them to be worked through in the same order they come in, but I want to interleave the processing to achieve a desired blend.
I realised this can be made into a very generic method working with two enums and yielding items to process, so I've written this method which I'm simultaneously quite proud of (nice generic solution) and quite ashamed of.
def combine_enums_with_ratio(enum_a, enum_b, desired_ratio)
  a_count = 1
  b_count = 1
  a_finished = false
  b_finished = false
  loop do
    ratio_so_far = a_count / b_count.to_f
    if !a_finished && (b_finished || ratio_so_far <= desired_ratio)
      begin
        yield enum_a.next
        a_count += 1
      rescue StopIteration
        a_finished = true
      end
    end
    if !b_finished && (a_finished || ratio_so_far > desired_ratio)
      begin
        yield enum_b.next
        b_count += 1
      rescue StopIteration
        b_finished = true
      end
    end
    break if a_finished && b_finished
  end
end
Ashamed because it's clearly written in a very imperative style and doesn't look very Rubyish. Maybe there's a way of using one of Ruby's nice declarative looping methods, except they don't seem to work when holding two enums open like this. So I believe I'm left having to rescue an exception as part of control flow, which feels very dirty. I'm missing Java's hasNext() method.
Is there a better way?
I did find a similar question about comparing enums: Ruby - Compare two Enumerators elegantly. It has some compact answers, but they don't particularly solve this, and my problem, involving unequal lengths and unequal yielding, seems trickier.
Here's a shorter and more general approach:
def combine_enums_with_ratio(ratios)
  return enum_for(__method__, ratios) unless block_given?
  counts = ratios.transform_values { |value| Rational(1, value) }
  until counts.empty?
    begin
      enum, _ = counts.min_by(&:last)
      yield enum.next
      counts[enum] += Rational(1, ratios[enum])
    rescue StopIteration
      counts.delete(enum)
    end
  end
end
Instead of two enums, it takes a hash of enum => ratio pairs.
At first, it creates a counts hash using the ratio's reciprocal, i.e. enum_a => 3, enum_b => 2 becomes:
counts = { enum_a => 1/3r, enum_b => 1/2r }
Then, within a loop, it fetches the entry with the hash's minimum value, which is enum_a in the above example. It yields that enum's next value and increments its counts value:
counts[enum_a] += 1/3r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/2)}
On the next iteration, enum_b has the smallest value, so its next value will be yielded and its counts value incremented:
counts[enum_b] += 1/2r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/1)}
If you keep incrementing enum_a by (1/3) and enum_b by (1/2), the yield ratio of their elements will be 3:2.
Finally, the rescue clause handles enums running out of elements. If this happens, that enum is removed from the counts hash.
Once the counts hash is empty, the loop stops.
Example usage with 3 enums:
enum_a = (1..10).each
enum_b = ('a'..'f').each
enum_c = %i[foo bar baz].each
combine_enums_with_ratio(enum_a => 3, enum_b => 2, enum_c => 1).to_a
#=> [1, "a", 2, 3, "b", :foo, 4, "c", 5, 6, "d", :bar, 7, "e", 8, 9, "f", :baz, 10]
# <---------------------> <---------------------> <--------------------->
# 3:2:1 3:2:1 3:2:1

Parse nested indented list with Ruby by using "slice_before"

I want to parse the formal list from https://www.loc.gov/marc/bibliographic/ecbdlist.html into a nested structure of hashes and arrays.
At first, I used a recursive approach - but ran into the problem that Ruby (and BTW also Python) can handle only less than 1000 recursive calls (stack level too deep).
I found "slice_before" and it seemed great:
require 'pp'
# read list into array and get rid of unnecessary lines
marc = File.readlines('marc21.txt', 'r:utf-8')[0].lines.map(&:chomp).select { |line| line if !line.match(/^\s*$/) && !line.match(/^--.+/) }
# magic starts here
marc = marc.slice_before { |line| line[/^ */].size == 0 }.to_a
marc = marc.inject({}) { |hash, arr| hash = hash.merge( arr[0] => arr[1..-1] ) }
I now want to iterate these steps throughout the array. As the indentation levels in the list vary ([0, 2, 3, 4, 5, 6, 8, 9, 10, 12] not all of them always present), I use a helper method get_indentation_map to use only the smallest amount of indentation in each iteration.
But adding only one level (far from the goal of turning the whole array into the new structure), I get the error "no implicit conversion of Regexp into Integer", and I fail to see the reason for it:
def get_indentation_map( arr )
  arr.map { |line| line[/^ */].size }
end
# starting again after slice_before of the unindented lines (== 0)
marc = marc.inject({}) do |hash, arr|
  hash = hash.merge( arr[0] => arr[1..-1] ) # so far like above
  # now trying to do the same on the next level
  hash = hash.inject({}) do |h, a|
    indentation_map = get_indentation_map( a ).uniq.sort
    # only slice before smallest indentation
    a = a.slice_before { |line| line[/^ */].size == indentation_map[0] }.to_a
    h = h.merge( a[0] => a[1..-1] )
  end
  hash
end
I would be very grateful for hints how to best parse this list. I aim at a json-like structure in which every entry is the key for the further indented lines (if there are). Thanks in advance.
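One way to get such a structure without recursion is to keep an explicit stack of the currently open ancestors. The sketch below is only an illustration under assumptions: the name parse_indented is made up, indentation is assumed to be spaces only, and repeated identical sibling lines would overwrite each other because every line is used as a hash key.

```ruby
# Hypothetical helper: turns indented lines into nested hashes.
# Every line becomes a key; its value is a hash holding the more deeply
# indented lines that follow it (an empty hash if there are none).
def parse_indented(lines)
  root = {}
  stack = [[-1, root]] # pairs of [indentation, children hash]
  lines.each do |line|
    next if line.strip.empty?
    indent = line[/\A */].size
    # Drop entries that are not strictly shallower than the current line,
    # so the top of the stack is always the current line's parent.
    stack.pop while stack.last[0] >= indent
    children = {}
    stack.last[1][line.strip] = children
    stack.push([indent, children])
  end
  root
end

sample = ["A", "  B", "     C", "  D", "E"]
p parse_indented(sample)
```

Because only the relative indentation matters, the varying levels in the list (0, 2, 3, 4, ...) need no special handling, and the depth of the data is not limited by the call stack.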

Iterating over each element of an array, except the first one

What is the idiomatic Ruby way to write this code?
Given an array, I would like to iterate through each element of that array, but skip the first one. I want to do this without allocating a new array.
Here are two ways I've come up with, but neither feels particularly elegant.
This works but seems way too verbose:
arr.each_with_index do |elem, i|
  next if i.zero? # skip the first
  ...
end
This works but allocates a new array:
arr[1..-1].each { ... }
Edit/clarification: I'd like to avoid allocating a second array. Originally I said I wanted to avoid "copying" the array, which was confusing.
Using the internal enumerator is certainly more intuitive, and you can do this fairly elegantly like so:
class Array
  def each_after(n)
    each_with_index do |elem, i|
      yield elem if i >= n
    end
  end
end
And now:
arr.each_after(1) do |elem|
  ...
end
I want to do this without creating a copy of the array.
1) Internal iterator:
arr = [1, 2, 3]
start_index = 1
(start_index...arr.size).each do |i|
  puts arr[i]
end
--output:--
2
3
2) External iterator:
arr = [1, 2, 3]
e = arr.each
e.next
loop do
  puts e.next
end
--output:--
2
3
OK, maybe this is bad form to answer my own question. But I've been racking my brain on this and poring over the Enumerable docs, and I think I've found a good solution:
arr.lazy.drop(1).each { ... }
Here's proof that it works :-)
>> [1,2,3].lazy.drop(1).each { |e| puts e }
2
3
Concise: yes. Idiomatic Ruby… maybe? What do you think?

In Ruby, is there an Array method that combines 'select' and 'map'?

I have a Ruby array containing some string values. I need to:
Find all elements that match some predicate
Run the matching elements through a transformation
Return the results as an array
Right now my solution looks like this:
def example
  matchingLines = @lines.select{ |line| ... }
  results = matchingLines.map{ |line| ... }
  return results.uniq.sort
end
Is there an Array or Enumerable method that combines select and map into a single logical statement?
I usually use map and compact together along with my selection criteria as a postfix if. compact gets rid of the nils.
jruby-1.5.0 > [1,1,1,2,3,4].map{|n| n*3 if n==1}
=> [3, 3, 3, nil, nil, nil]
jruby-1.5.0 > [1,1,1,2,3,4].map{|n| n*3 if n==1}.compact
=> [3, 3, 3]
Ruby 2.7+
There is now!
Ruby 2.7 introduced filter_map for this exact purpose. It's idiomatic and performant, and I'd expect it to become the norm very soon.
For example:
numbers = [1, 2, 5, 8, 10, 13]
numbers.filter_map { |i| i * 2 if i.even? }
# => [4, 16, 20]
Here's a good read on the subject.
Hope that's useful to someone!
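Applied to the shape of the question's example method, filter_map could look like the sketch below. The sample lines, the predicate (skip lines starting with "#") and the transformation (strip and upcase) are assumptions chosen for illustration, since the question elides them:

```ruby
# Requires Ruby 2.7+ for Enumerable#filter_map.
# The block returns the transformed value for lines to keep and nil
# (a falsy value) for lines to drop, so selection and mapping happen in one pass.
def example(lines)
  lines
    .filter_map { |line| line.strip.upcase unless line.start_with?("#") }
    .uniq
    .sort
end

p example(["  beta ", "# comment", "alpha", "beta"]) # prints ["ALPHA", "BETA"]
```

Keep in mind that filter_map drops every falsy block result, so it is not suitable when false or nil are legitimate transformed values.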
You can use reduce for this, which requires only one pass:
[1,1,1,2,3,4].reduce([]) { |a, n| a.push(n*3) if n==1; a }
=> [3, 3, 3]
In other words, initialize the state to be what you want (in our case, an empty list to fill: []), then always make sure to return this value with modifications for each element in the original list (in our case, the modified element pushed to the list).
This is the most efficient since it only loops over the list with one pass (map + select or compact requires two passes).
In your case:
def example
  results = @lines.reduce([]) do |lines, line|
    lines.push( ...(line) ) if ...
    lines
  end
  return results.uniq.sort
end
Another different way of approaching this is using the new (relative to this question) Enumerator::Lazy:
def example
  @lines.lazy
    .select { |line| line.property == requirement }
    .map { |line| transforming_method(line) }
    .uniq
    .sort
end
The .lazy method returns a lazy enumerator. Calling .select or .map on a lazy enumerator returns another lazy enumerator. The enumerator is only forced, and an array returned, once a method that needs the full result is called (here .sort; since Ruby 2.4, .uniq on a lazy enumerator is itself lazy). So what effectively happens is that your .select and .map calls are combined into one: you only iterate over @lines once to do both .select and .map.
My instinct is that Adam's reduce method will be a little faster, but I think this is far more readable.
The primary consequence of this is that no intermediate array objects are created for each subsequent method call. In a normal @lines.select.map situation, select returns an array which is then modified by map, again returning an array. By comparison, the lazy evaluation only creates an array once. This is useful when your initial collection object is large. It also empowers you to work with infinite enumerators - e.g. random_number_generator.lazy.select(&:odd?).take(10).
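To make the point about infinite enumerators concrete, here is a small sketch using an endless range instead of a random number generator. The helper name odd_squares is hypothetical:

```ruby
# Hypothetical helper: the first `count` squares of odd numbers.
# The source range never ends, so without .lazy the select call would
# never return; with .lazy, elements are only pulled as first(count) asks for them.
def odd_squares(count)
  (1..Float::INFINITY).lazy.select(&:odd?).map { |n| n * n }.first(count)
end

p odd_squares(5) # prints [1, 9, 25, 49, 81]
```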
If you have a select that can use the case operator (===), grep is a good alternative:
p [1,2,'not_a_number',3].grep(Integer){|x| -x } #=> [-1, -2, -3]
p ['1','2','not_a_number','3'].grep(/\D/, &:upcase) #=> ["NOT_A_NUMBER"]
If we need more complex logic we can create lambdas:
my_favourite_numbers = [1,4,6]
is_a_favourite_number = -> x { my_favourite_numbers.include? x }
make_awesome = -> x { "***#{x}***" }
my_data = [1,2,3,4]
p my_data.grep(is_a_favourite_number, &make_awesome) #=> ["***1***", "***4***"]
I'm not sure there is one. The Enumerable module, which adds select and map, doesn't show one.
You'd be required to pass in two blocks to the select_and_transform method, which would be a bit unintuitive IMHO.
Obviously, you could just chain them together, which is more readable:
transformed_list = lines.select{|line| ...}.map{|line| ... }
Simple Answer:
If you have n records, and you want to select and map based on condition then
records.map { |record| record.attribute if condition }.compact
Here, attribute is whatever you want from the record and condition you can put any check.
compact removes the unnecessary nils produced when the if condition is not met.
No, but you can do it like this:
lines.map { |line| do_some_action if check_some_property }.reject(&:nil?)
Or even better:
lines.inject([]) { |all, line| all << line if check_some_property; all }
I think that this way is more readable, because it splits the filter conditions and the mapped value while remaining clear that the actions are connected:
results = @lines.select { |line|
  line.should_include?
}.map do |line|
  line.value_to_map
end
And, in your specific case, eliminate the result variable altogether:
def example
  @lines.select { |line|
    line.should_include?
  }.map { |line|
    line.value_to_map
  }.uniq.sort
end
def example
  @lines.select {|line| ... }.map {|line| ... }.uniq.sort
end
In Ruby 1.9 and 1.8.7, you can also chain and wrap iterators by simply not passing a block to them:
enum.select.map {|bla| ... }
But it's not really possible in this case, since the types of the block return values of select and map don't match up. It makes more sense for something like this:
enum.inject.with_index {|(acc, el), idx| ... }
AFAICS, the best you can do is the first example.
Here's a small example:
%w[a b 1 2 c d].map.select {|e| if /[0-9]/ =~ e then false else e.upcase end }
# => ["a", "b", "c", "d"]
%w[a b 1 2 c d].select.map {|e| if /[0-9]/ =~ e then false else e.upcase end }
# => ["A", "B", false, false, "C", "D"]
But what you really want is ["A", "B", "C", "D"].
You should try using my library Rearmed Ruby, in which I have added the method Enumerable#select_map. Here's an example:
items = [{version: "1.1"}, {version: nil}, {version: false}]
items.select_map{|x| x[:version]} #=> [{version: "1.1"}]
# or without enumerable monkey patch
Rearmed.select_map(items){|x| x[:version]}
If you don't want to create two different arrays, you can use compact!, but be careful with it.
array = [1,1,1,2,3,4]
new_array = array.map{|n| n*3 if n==1}
new_array.compact!
Interestingly, compact! does an in-place removal of nil. The return value of compact! is the same array if there were changes but nil if there were no nils.
array = [1,1,1,2,3,4]
new_array = array.map{|n| n*3 if n==1}.tap { |array| array.compact! }
This would be a one-liner.
Your version:
def example
  matchingLines = @lines.select{ |line| ... }
  results = matchingLines.map{ |line| ... }
  return results.uniq.sort
end
My version:
def example
  results = {}
  @lines.each{ |line| results[line] = true if ... }
  return results.keys.sort
end
This will do 1 iteration (except the sort), and has the added bonus of keeping uniqueness (if you don't care about uniqueness, just make results an array and use results.push(line) if ...).
Here is an example. It is not the same as your problem, but it may be what you want, or may give you a clue to your solution:
def example
  lines.each do |x|
    new_value = do_transform(x)
    if new_value == some_thing
      return new_value # return from the example method immediately
    else
      next # continue with the next iteration
    end
  end
end

How do I test if all items in an array are identical?

I can generate a few lines of code that will do this but I'm wondering if there's a nice clean Rubyesque way of doing this. In case I haven't been clear, what I'm looking for is an array method that will return true if given (say) [3,3,3,3,3] or ["rabbits","rabbits","rabbits"] but will return false with [1,2,3,4,5] or ["rabbits","rabbits","hares"].
Thanks
You can use Enumerable#all? which returns true if the given block returns true for all the elements in the collection.
array.all? {|x| x == array[0]}
(If the array is empty, the block is never called, so doing array[0] is safe.)
class Array
  def same_values?
    self.uniq.length == 1
  end
end
[1, 1, 1, 1].same_values?
[1, 2, 3, 4].same_values?
What about this one? It returns false for an empty array, though; you can change the comparison to <= 1 and it will return true in that case, depending on what you need.
I too like the preferred answer best; it is short and sweet. If all elements are of the same Comparable class, such as Numeric or String, one could use
def all_equal?(array) array.max == array.min end
I would use:
array = ["rabbits","rabbits","hares", nil, nil]
array.uniq.compact.length == 1
I used to use:
array.reduce { |x,y| x == y ? x : nil }
It may fail when array contains nil.
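A variant that does not trip over nil is to compare each element with its neighbour. This is a minimal sketch; the method name all_identical? is made up for illustration:

```ruby
# Hypothetical helper: true when every adjacent pair of elements is equal.
# each_cons(2) yields nothing for arrays with fewer than two elements,
# so empty and single-element arrays count as identical.
def all_identical?(array)
  array.each_cons(2).all? { |a, b| a == b }
end

p all_identical?([3, 3, 3, 3, 3])                  # prints true
p all_identical?(["rabbits", "rabbits", "hares"])  # prints false
p all_identical?([nil, nil])                       # prints true
```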

Resources