avoid keys duplication to get a random hash key - ruby

I need to pick a hash entry at random, so I do
h = {1 => 'one', 2 => 'two', 3 => 'three'}
k = h.keys.sample
result = h[k]
Since h.keys creates a new array I do not like it. Is there a way to avoid creating a new array every time?

This will not generate another array. On average hash_random_value will iterate halfway through the given hash to produce a random value.
def hash_random_value(h)
i = rand(h.length)
h.each_with_index do |(_, v), i2|
return v if i == i2
end
end
h = {1 => 'one', 2 => 'two', 3 => 'three'}
hash_random_value(h)
This being said, you should optimize only when you are certain you need to. The only way to know is to profile your code; otherwise you are most likely doing premature optimisation, that is, complicating your code and increasing the chance of introducing bugs, sometimes even decreasing the performance of your program. Your original solution is much easier to understand than mine, and it is immediately obvious that it is correct.

I'd like to first reiterate what most people are saying: this probably doesn't matter.
Second, I'll point out that it sure seems like you want a random value, not a random key. Maybe that's just because your example snippet of code doesn't show what you're really doing.
If you very frequently need a random value, and very infrequently update the Hash, I'd recommend caching the values any time the Hash is modified and then taking a random value from the cache. One way to do that might be like this:
class RandomValueHash < Hash
  def []=(k, v)
    super(k, v)
    @values = self.values
  end
  def sample_value
    @values ||= self.values
    @values.sample
  end
end
rvh = RandomValueHash[{1 => 'one', 2 => 'two', 3 => 'three'}]
rvh.sample_value
# => "one"
rvh[4] = 'four'
rvh[5] = 'five'
rvh.sample_value
# => "four"
Of course, if you really do want a random key rather than value, the exact same concept applies. Either way, this avoids recreating the Array every time you get a value; it only creates it when necessary.

If you need to make the random sample a lot, and need it to be efficient, then perhaps a Ruby Hash is not the right data structure or storage for your problem. Even a wrapper class that maintained Hash and Array attributes together might work well - if for instance for every write to the hash you needed to read 20 random samples.
Whether or not that works for you not only depends on the ratio of reading and writing, it also relates to the logical structure of your problem data (as opposed to how you've chosen to represent it in your solution).
But before you set off on re-thinking your problem, you need to have a practical need for higher performance in the affected code. The hash would need to be pretty large in order to have a noticeable cost to fetching its keys. h.keys takes about 250ms when the hash has 1 million entries on my laptop.
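The wrapper idea mentioned above (a Hash and an Array kept together) could be sketched roughly as follows. The class name SampleableHash and its methods are hypothetical, made up for illustration; it assumes you only write through []= and delete, and it keeps an extra key-to-position index so that deletes stay cheap.

```ruby
# Hypothetical wrapper (not a standard class): a Hash plus an Array of its keys.
class SampleableHash
  def initialize(source = {})
    @hash  = {}
    @keys  = []  # all keys, in no particular order after deletes
    @index = {}  # key => position of that key in @keys
    source.each { |k, v| self[k] = v }
  end

  def [](key)
    @hash[key]
  end

  def []=(key, value)
    unless @hash.key?(key)
      @index[key] = @keys.length
      @keys << key
    end
    @hash[key] = value
  end

  # Removes a key without scanning @keys: the last key is moved into the gap.
  def delete(key)
    return nil unless @hash.key?(key)
    pos  = @index.delete(key)
    last = @keys.pop
    unless pos == @keys.length # the removed key was not the last one
      @keys[pos]   = last
      @index[last] = pos
    end
    @hash.delete(key)
  end

  def sample_key
    @keys.sample
  end

  def sample_value
    @hash[@keys.sample]
  end

  def size
    @keys.size
  end
end

h = SampleableHash.new({1 => 'one', 2 => 'two', 3 => 'three'})
h.sample_value # one of the three values, without building a new Array
```

Every write costs a little more, so this only pays off when samples heavily outnumber writes, as described above.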

How about...
h = {1 => 'one', 2 => 'two', 3 => 'three'}
k = h.keys
...
result = h[k.sample]
You can do the result = h[k.sample] step as often as you like, and it won't regenerate the k array. However, you should regenerate k any time h changes.
ADDENDUM: I'm throwing in benchmark code for several of the proposed solutions. Enjoy.
#!/usr/bin/env ruby
require 'benchmark'
NUM_ITERATIONS = 1_000_000
# Walks the hash until it reaches a randomly chosen position.
def hash_random_value(h)
  i = rand(h.length)
  h.each_with_index do |(_, v), i2|
    return v if i == i2
  end
end
# Caches the values array and refreshes it on every []= write.
class RandomValueHash < Hash
  def []=(k, v)
    super(k, v)
    @values = self.values
  end
  def sample_value
    @values ||= self.values
    @values.sample
  end
end
Benchmark.bmbm do |b|
  h = {1 => 'one', 2 => 'two', 3 => 'three'}
  b.report("original proposal") do
    NUM_ITERATIONS.times { k = h.keys.sample; result = h[k] }
  end
  b.report("hash_random_value") do
    NUM_ITERATIONS.times { result = hash_random_value(h) }
  end
  b.report("manual keyset") do
    k = h.keys
    NUM_ITERATIONS.times { result = h[k.sample] }
  end
  rvh = RandomValueHash[{1 => 'one', 2 => 'two', 3 => 'three'}]
  b.report("RandomValueHash") do
    NUM_ITERATIONS.times { result = rvh.sample_value }
  end
end

Not really. Hashes don't have an index, so you either convert them to an Array and pick a random index, or you enumerate your hash a random number of times. You should benchmark which method is fastest, but I doubt you can avoid creating a new object.
If you don't care about your object you could shift its keys a random number of times, but then you'd create Arrays for the return values.

Unless you have a gigantic hash, this is a pointless concern. Ruby is no efficiency powerhouse, and if you're that worried about this, you should be using C(++).

something like this:
h.each_with_index.reduce(nil) {|m, ((_, v), i)|
rand(i + 1) == 0 ? v : m
}
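The one-liner above looks like reservoir sampling with a reservoir of size one: the i-th entry (counting from zero) replaces the current pick with probability 1/(i + 1), which leaves every entry equally likely at the end. A spelled-out sketch of the same idea follows; the method name and the optional random: parameter are assumptions added for illustration.

```ruby
# Picks a uniformly random value in a single pass, without building an Array.
# `random:` is a hypothetical parameter so a seeded generator can be injected.
def reservoir_sample_value(hash, random: Random.new)
  chosen = nil
  hash.each_with_index do |(_, value), i|
    # rand(i + 1) is 0 with probability 1 / (i + 1)
    chosen = value if random.rand(i + 1).zero?
  end
  chosen
end

h = {1 => 'one', 2 => 'two', 3 => 'three'}
reservoir_sample_value(h) # one of "one", "two", "three"
```

Unlike hash_random_value, this always walks the whole hash, so it is not faster; its advantage is that it needs no length up front and works on any Enumerable of pairs.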

Related

Best way to interleave two enums in ruby?

I'm looking for a more elegant way of blending together two SQL resultsets with a given ratio. Within each of them I want them to be worked through in the same order they come in, but I want to interleave the processing to achieve a desired blend.
I realised this can be made into a very generic method working with two enums and yielding items to process, so I've written this method which I'm simultaneously quite proud of (nice generic solution) and quite ashamed of.
def combine_enums_with_ratio(enum_a, enum_b, desired_ratio)
a_count = 1
b_count = 1
a_finished = false
b_finished = false
loop do
ratio_so_far = a_count / b_count.to_f
if !a_finished && (b_finished || ratio_so_far <= desired_ratio)
begin
yield enum_a.next
a_count += 1
rescue StopIteration
a_finished = true
end
end
if !b_finished && (a_finished || ratio_so_far > desired_ratio)
begin
yield enum_b.next
b_count += 1
rescue StopIteration
b_finished = true
end
end
break if a_finished && b_finished
end
end
Ashamed because it's clearly written in a very imperative style. Not looking very rubyish. Maybe there's a way of using one of ruby's nice declarative looping methods, except they don't seem to work holding open two enums like this. So then I believe I'm left having to rescue an exception as part of control flow like this, which feels very dirty. I'm missing java's hasNext() method.
Is there a better way?
I did find a similar question about comparing enums: Ruby - Compare two Enumerators elegantly . Some compact answers, but not particularly solving it, and my problem involving unequal lengths and unequal yielding seems trickier.
Here's a shorter and more general approach:
def combine_enums_with_ratio(ratios)
return enum_for(__method__, ratios) unless block_given?
counts = ratios.transform_values { |value| Rational(1, value) }
until counts.empty?
begin
enum, _ = counts.min_by(&:last)
yield enum.next
counts[enum] += Rational(1, ratios[enum])
rescue StopIteration
counts.delete(enum)
end
end
end
Instead of two enums, it takes a hash of enum => ratio pairs.
At first, it creates a counts hash using the ratio's reciprocal, i.e. enum_a => 3, enum_b => 2 becomes:
counts = { enum_a => 1/3r, enum_b => 1/2r }
Then, within a loop, it fetches the entry with the minimum value in the hash, which is enum_a in the above example. It yields that enum's next value and increments its counts value:
counts[enum_a] += 1/3r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/2)}
On the next iteration, enum_b has the smallest value, so its next value will be yielded and its counts value incremented:
counts[enum_b] += 1/2r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/1)}
If you keep incrementing enum_a by (1/3) and enum_b by (1/2), the yield ratio of their elements will be 3:2.
Finally, the rescue clause handles enums running out of elements. If this happens, that enum is removed from the counts hash.
Once the counts hash is empty, the loop stops.
Example usage with 3 enums:
enum_a = (1..10).each
enum_b = ('a'..'f').each
enum_c = %i[foo bar baz].each
combine_enums_with_ratio(enum_a => 3, enum_b => 2, enum_c => 1).to_a
#=> [1, "a", 2, 3, "b", :foo, 4, "c", 5, 6, "d", :bar, 7, "e", 8, 9, "f", :baz, 10]
# <---------------------> <---------------------> <--------------------->
# 3:2:1 3:2:1 3:2:1

Array of strings Group by first common letters [closed]

Is there any way of grouping an array of strings by their common leading letters?
For example:
array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]
so when I do
array.group_by{ |string| some_logic_with_string }
The result should be,
{
'hello' => ['hello', 'hello you'],
'people' => ['people'],
'fin' => ['finally', 'finland']
}
NOTE: Some test cases are ambiguous and expectations conflict with other tests, you need to fix them.
I guess a plain group_by will not work; further processing is needed.
I have come up with below code that seems to work for all the given test cases in consistent manner.
I have left notes in the code to explain the logic. Only way to fully understand it will be to inspect value of h and see the flow for a simple test case.
def group_by_common_chars(array)
# We will iteratively group by as many time as there are characters
# in a largest possible key, which is max length of all strings
max_len = array.max_by {|i| i.size}.size
# First group by first character.
h = array.group_by{|i| i[0]}
# Now iterate remaining (max_len - 1) times
(1...max_len).each do |c|
# Let's perform a group by next set of starting characters.
t = h.map do |k,v|
h1 = v.group_by {|i| i[0..c]}
end.reduce(&:merge)
# We need to merge the previously generated hash
# with the hash generated in this iteration. Here things get tricky.
# If previously, we had
# {"a" => ["a"], "ab" => ["ab", "abc"]},
# and now, we have
# {"a"=>["a"], "ab"=>["ab"], "abc"=>["abc"]},
# We need to merge the two hashes such that we have
# {"a"=>["a"], "ab"=>["ab", "abc"], "abc"=>["abc"]}.
# Note that `Hash#merge`'s block is called only for common keys, so, "abc"
# will get merged, we can't do much about it now. We will process
# it later in the loop
h = h.merge(t) do |k, o, n|
if (o.size != n.size)
diff = [o,n].max - [o,n].min
if diff.size == 1 && t.value?(diff)
[o,n].max
else
[o,n].min
end
else
o
end
end
end
# Sort by key length, smallest in the beginning.
h = h.sort {|i,j| i.first.size <=> j.first.size }.to_h
# Get rid of those key-value pairs, where value is single element array
# and that single element is already part of another key-value pair, and
# that value array has more than one element. This step will allow us
# to get rid of key-value like "abc"=>["abc"] in the example discussed
# above.
h = h.tap do |h|
keys = h.keys
keys.each do |k|
v = h[k]
if (v.size == 1 &&
h.key?(v.first) &&
h.values.flatten.count(v.first) > 1) then
h.delete(k)
end
end
end
# Get rid of those keys whose value array consist of only elements that
# already part of some other key. Since, hash is ordered by key's string
# size, this process allows us to get rid of those keys which are smaller
# in length but consists of only elements that are present somewhere else
# with a key of larger length. For example, it lets us to get rid of
# "a"=>["aba", "abb", "aaa", "aab"] from a hash like
# {"a"=>["aba", "abb", "aaa", "aab"], "ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}
h.tap do |h|
keys = h.keys
keys.each do |k|
values = h[k]
other_values = h.values_at(*(h.keys-[k])).flatten
already_present = values.all? do |v|
other_values.include?(v)
end
h.delete(k) if already_present
end
end
end
Sample Run:
p group_by_common_chars ['hello', 'hello you', 'people', 'finally', 'finland']
#=> {"fin"=>["finally", "finland"], "hello"=>["hello", "hello you"], "people"=>["people"]}
p group_by_common_chars ['a', 'ab', 'abc']
#=> {"a"=>["a"], "ab"=>["ab", "abc"]}
p group_by_common_chars ['aba', 'abb', 'aaa', 'aab']
#=> {"ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}
p group_by_common_chars ["Why", "haven't", "you", "answered", "the", "above", "questions?", "Please", "do", "so."]
#=> {"a"=>["answered", "above"], "do"=>["do"], "Why"=>["Why"], "you"=>["you"], "so."=>["so."], "the"=>["the"], "Please"=>["Please"], "haven't"=>["haven't"], "questions?"=>["questions?"]}
Not sure if you can group by all common letters. But if you only want to group by the first letter, here it is:
array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]
result = {}
array.each { |st| result[st[0]] = result.fetch(st[0], []) + [st] }
pp result
{"h"=>["hello", "hello you"], "p"=>["people"], "f"=>["finally", "finland"]}
Now result contains your desired hash.
Hmm, you're trying to do something that's pretty custom. I can think of two classical approaches that sort of do what you want: 1) Stemming and 2) Levenshtein Distance.
With stemming you're finding the root word to a longer word. Here's a gem for it.
Levenshtein is a famous algorithm which calculates the difference between two strings. There is a gem for it that runs pretty fast due to a native C extension.

Ruby iterate through hash and compare value pairs

My Ruby assignment is to iterate through a hash and return the key associated with the lowest value, without using any of the following methods:
#keys #values #min #sort #min_by
I don't understand how to iterate through the hash and store each pair as it comes through, compare it to the last pair that came through, and return the lowest key. This is my code to show you my thought process, but it of course does not work. Any thoughts on how to do this? Thanks!
def key_for_min_value(name_hash)
index = 0
lowest_hash = {}
name_hash.collect do |key, value|
if value[index] < value[index + 1]
lowest = value
index = index + 1
key_for_min_value[value]
return lowest
end
end
end
Track min_value and key_for_min_value. Iterate through the hash, and any time the current value is lower than min_value, update both of these vars. At the end of the loop, return key_for_min_value.
I didn't include sample code because, hey, this is homework. :) Good luck!
One way to do it is to transform the hash into an array:
def key_for_min_value(name_hash)
# Convert hash to array
name_a = name_hash.to_a
# Default key value
d_value= 1000
d_key= 0
# Iterate new array
name_a.each do |i|
# If current value is lower than default, change value&key
if i[1] < d_value
d_value = i[1]
d_key = i[0]
end
end
return d_key
end
You might need to change d_value to something higher or find something more creative :)
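One way to avoid the hard-coded starting value, sketched here under the assumption that all values can be compared with <, is to start with "nothing seen yet" (nil) and take the first pair unconditionally. It uses only each, so it stays within the rules of the assignment; the local variable names are made up.

```ruby
def key_for_min_value(name_hash)
  min_key = nil
  min_value = nil
  name_hash.each do |key, value|
    # The first pair always wins; afterwards only strictly smaller values do.
    if min_value.nil? || value < min_value
      min_value = value
      min_key = key
    end
  end
  min_key # nil for an empty hash
end

key_for_min_value({"a" => 1500, "b" => 2000, "c" => 1200}) #=> "c"
```

With this version the result no longer depends on guessing an upper bound such as 1000.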
We can use the Enumerable#reduce method to compare entries and pick the one with the smallest value. Each hash entry is passed to reduce as a two-element array, so I am using Array#first and Array#last to access the key and the value.
h = {"a" => 1, "b" => 2, "c" => 0}
p h.reduce{ |f, s| f.last > s.last ? s : f }.first
#=> "c"

Comparing two arrays ignoring element order in Ruby

I need to check whether two arrays contain the same data in any order.
Using the imaginary compare method, I would like to do:
arr1 = [1,2,3,5,4]
arr2 = [3,4,2,1,5]
arr3 = [3,4,2,1,5,5]
arr1.compare(arr2) #true
arr1.compare(arr3) #false
I used arr1.sort == arr2.sort, which appears to work, but is there a better way of doing this?
The easiest way is to use intersections:
@array1 = [1,2,3,4,5]
@array2 = [2,3,4,5,1]
So the statement
@array2 & @array1 == @array2
will be true. This is the best solution if you want to check whether array1 contains array2 or the opposite (which is a different check). You're also not fiddling with your arrays or changing the order of the items.
You can also compare the length of both arrays if you want them to be identical in size:
@array1.size == @array2.size && @array1 & @array2 == @array1
It's also the fastest way to do it (correct me if I'm wrong)
Sorting the arrays prior to comparing them is O(n log n). Moreover, as Victor points out, you'll run into trouble if the array contains non-sortable objects. It's faster to compare histograms, O(n).
You'll find Enumerable#frequency in Facets, but if you prefer to avoid adding more dependencies you can implement it yourself, which is pretty straightforward:
require 'facets'
[1, 2, 1].frequency == [2, 1, 1].frequency
#=> true
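If you would rather not pull in Facets, the histogram comparison can be hand-rolled. This is a minimal sketch; the method name same_elements? is made up, and it assumes the elements behave sensibly as Hash keys (consistent hash and eql?).

```ruby
# Compares two arrays as multisets in O(n): count up for a, count down for b.
def same_elements?(a, b)
  return false unless a.size == b.size
  counts = Hash.new(0)
  a.each { |x| counts[x] += 1 }
  b.each { |x| counts[x] -= 1 }
  counts.each_value.all?(&:zero?)
end

same_elements?([1, 2, 3, 5, 4], [3, 4, 2, 1, 5])    #=> true
same_elements?([1, 2, 3, 5, 4], [3, 4, 2, 1, 5, 5]) #=> false
```

On Ruby 2.7 or later, Enumerable#tally builds the same histogram, so a.tally == b.tally is a shorter equivalent.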
If you know that there are no repetitions in any of the arrays (i.e., all the elements are unique or you don't care), using sets is straightforward and readable:
Set.new(array1) == Set.new(array2)
You can actually implement this #compare method by monkey patching the Array class like this:
class Array
def compare(other)
sort == other.sort
end
end
Keep in mind that monkey patching is rarely considered a good practice and you should be cautious when using it.
There's probably a better way to do this, but that's what came to mind. Hope it helps!
The most elegant way I have found:
arr1 = [1,2,3,5,4]
arr2 = [3,4,2,1,5]
arr3 = [3,4,2,1,5,5]
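(arr1 - arr2).empty?
=> true
(arr3 - arr2).empty?
=> true
Be careful: Array#- removes every occurrence of each element, so this check ignores duplicates (arr3 passes even though it has an extra 5) and only tests one direction.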
You can open array class and define a method like this.
class Array
def compare(comparate)
to_set == comparate.to_set
end
end
arr1.compare(arr2)
irb => true
OR use simply
arr1.to_set == arr2.to_set
irb => true
Here is a version that will work on unsortable arrays
class Array
  def unordered_hash
    unless @_compare_o && @_compare_o == hash
      p = Hash.new(0)
      each{ |v| p[v] += 1 }
      @_compare_p = p.hash
      @_compare_o = hash
    end
    @_compare_p
  end
  def compare(b)
    unordered_hash == b.unordered_hash
  end
end
a = [ 1, 2, 3, 2, nil ]
b = [ nil, 2, 1, 3, 2 ]
puts a.compare(b)
Use the difference method if the arrays have the same length:
https://ruby-doc.org/core-2.7.0/Array.html#method-i-difference
arr1 = [1,2,3]
arr2 = [1,2,4]
arr1.difference(arr2) # => [3]
arr2.difference(arr1) # => [4]
# to check that arrays are equal:
arr2.difference(arr1).empty?
Otherwise you could use
# to check that arrays are equal:
arr1.sort == arr2.sort

How to check if a value exists in an array in Ruby

I have a value 'Dog' and an array ['Cat', 'Dog', 'Bird'].
How do I check if it exists in the array without looping through it? Is there a simple way of checking if the value exists, nothing more?
You're looking for include?:
>> ['Cat', 'Dog', 'Bird'].include? 'Dog'
=> true
There is an in? method in ActiveSupport (part of Rails) since v3.1, as pointed out by @campeterson. So within Rails, or if you require 'active_support', you can write:
'Unicorn'.in?(['Cat', 'Dog', 'Bird']) # => false
OTOH, there is no in operator or #in? method in Ruby itself, even though it has been proposed before, in particular by Yusuke Endoh, a top-notch member of ruby-core.
As pointed out by others, the reverse method include? exists, for all Enumerables including Array, Hash, Set, Range:
['Cat', 'Dog', 'Bird'].include?('Unicorn') # => false
Note that if you have many values in your array, they will all be checked one after the other (i.e. O(n)), while the same lookup in a hash takes constant time (i.e. O(1)). So if your array is constant, for example, it is a good idea to use a Set instead. E.g.:
require 'set'
ALLOWED_METHODS = Set[:to_s, :to_i, :upcase, :downcase
# etc
]
def foo(what)
raise "Not allowed" unless ALLOWED_METHODS.include?(what.to_sym)
bar.send(what)
end
A quick test reveals that calling include? on a 10 element Set is about 3.5x faster than calling it on the equivalent Array (if the element is not found).
A final closing note: be wary when using include? on a Range, there are subtleties, so refer to the doc and compare with cover?...
Try
['Cat', 'Dog', 'Bird'].include?('Dog')
If you want to check by a block, you could try any? or all?.
%w{ant bear cat}.any? {|word| word.length >= 3} #=> true
%w{ant bear cat}.any? {|word| word.length >= 4} #=> true
[ nil, true, 99 ].any? #=> true
See Enumerable for more information.
My inspiration came from "evaluate if array has any items in ruby"
Use Enumerable#include?:
a = %w/Cat Dog Bird/
a.include? 'Dog'
Or, if a number of tests are done,[1] you can get rid of the loop (that even include? has) and go from O(n) to O(1) with:
h = Hash[[a, a].transpose]
h['Dog']
1. I hope this is obvious but to head off objections: yes, for just a few lookups, the Hash[] and transpose ops dominate the profile and are each O(n) themselves.
Ruby has eleven methods to find elements in an array.
The preferred one is include? or, for repeated access, create a Set and then call include? or member?.
Here are all of them:
array.include?(element) # preferred method
array.member?(element)
array.to_set.include?(element)
array.to_set.member?(element)
array.index(element) > 0
array.find_index(element) > 0
array.index { |each| each == element } > 0
array.find_index { |each| each == element } > 0
array.any? { |each| each == element }
array.find { |each| each == element } != nil
array.detect { |each| each == element } != nil
They all return a trueish value if the element is present.
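A caution on the index-based forms in that list: Array#index and Array#find_index return nil when nothing matches and 0 when the match is the first element, so comparing the result with > 0 misses the first element and raises NoMethodError when the element is absent. A nil-safe variant could look like this; the helper name is hypothetical.

```ruby
array = ['Cat', 'Dog', 'Bird']

array.index('Cat') # => 0   (first element: 0 > 0 would be false)
array.index('Emu') # => nil (nil > 0 would raise NoMethodError)

# Treats "found at any position, including 0" as present.
def present_by_index?(array, element)
  !array.index(element).nil?
end

present_by_index?(array, 'Cat') # => true
present_by_index?(array, 'Emu') # => false
```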
include? is the preferred method. It uses a C-language for loop internally that breaks when an element matches the internal rb_equal_opt/rb_equal functions. It cannot get much more efficient unless you create a Set for repeated membership checks.
VALUE
rb_ary_includes(VALUE ary, VALUE item)
{
long i;
VALUE e;
for (i=0; i<RARRAY_LEN(ary); i++) {
e = RARRAY_AREF(ary, i);
switch (rb_equal_opt(e, item)) {
case Qundef:
if (rb_equal(e, item)) return Qtrue;
break;
case Qtrue:
return Qtrue;
}
}
return Qfalse;
}
member? is not redefined in the Array class and uses an unoptimized implementation from the Enumerable module that literally enumerates through all elements:
static VALUE
member_i(RB_BLOCK_CALL_FUNC_ARGLIST(iter, args))
{
struct MEMO *memo = MEMO_CAST(args);
if (rb_equal(rb_enum_values_pack(argc, argv), memo->v1)) {
MEMO_V2_SET(memo, Qtrue);
rb_iter_break();
}
return Qnil;
}
static VALUE
enum_member(VALUE obj, VALUE val)
{
struct MEMO *memo = MEMO_NEW(val, Qfalse, 0);
rb_block_call(obj, id_each, 0, 0, member_i, (VALUE)memo);
return memo->v2;
}
Translated to Ruby code this does about the following:
def member?(value)
  memo = [value, false, 0]
  each_with_object(memo) do |each, memo|
    if each == memo[0]
      memo[1] = true
      break
    end
  end
  memo[1]
end
Both include? and member? have O(n) time complexity since they both search the array for the first occurrence of the expected value.
We can use a Set to get O(1) access time at the cost of having to create a Hash representation of the array first. If you repeatedly check membership on the same array this initial investment can pay off quickly. Set is not implemented in C but as a plain Ruby class; still, the O(1) access time of the underlying @hash makes this worthwhile.
Here is the implementation of the Set class:
module Enumerable
def to_set(klass = Set, *args, &block)
klass.new(self, *args, &block)
end
end
class Set
  def initialize(enum = nil, &block) # :yields: o
    @hash ||= Hash.new
    enum.nil? and return
    if block
      do_with_enum(enum) { |o| add(block[o]) }
    else
      merge(enum)
    end
  end
  def merge(enum)
    if enum.instance_of?(self.class)
      @hash.update(enum.instance_variable_get(:@hash))
    else
      do_with_enum(enum) { |o| add(o) }
    end
    self
  end
  def add(o)
    @hash[o] = true
    self
  end
  def include?(o)
    @hash.include?(o)
  end
  alias member? include?
  ...
end
As you can see, the Set class just creates an internal @hash instance, maps all objects to true and then checks membership using Hash#include?, which is implemented with O(1) access time in the Hash class.
I won't discuss the other seven methods as they are all less efficient.
There are actually even more methods with O(n) complexity beyond the 11 listed above, but I decided to not list them since they scan the entire array rather than breaking at the first match.
Don't use these:
# bad examples
array.grep(element).any?
array.select { |each| each == element }.size > 0
...
Several answers suggest Array#include?, but there is one important caveat: Looking at the source, even Array#include? does perform looping:
rb_ary_includes(VALUE ary, VALUE item)
{
long i;
for (i=0; i<RARRAY_LEN(ary); i++) {
if (rb_equal(RARRAY_AREF(ary, i), item)) {
return Qtrue;
}
}
return Qfalse;
}
The way to test the word presence without looping is by constructing a trie for your array. There are many trie implementations out there (google "ruby trie"). I will use rambling-trie in this example:
a = %w/cat dog bird/
require 'rambling-trie' # if necessary, gem install rambling-trie
trie = Rambling::Trie.create { |trie| a.each do |e| trie << e end }
And now we are ready to test the presence of various words in your array without looping over it, in O(log n) time, with same syntactic simplicity as Array#include?, using sublinear Trie#include?:
trie.include? 'bird' #=> true
trie.include? 'duck' #=> false
If you don't want to loop, there's no way to do it with Arrays. You should use a Set instead.
require 'set'
s = Set.new
100.times{|i| s << "foo#{i}"}
s.include?("foo99")
=> true
[1,2,3,4,5,6,7,8].to_set.include?(4)
=> true
Sets work internally like Hashes, so Ruby doesn't need to loop through the collection to find items, since as the name implies, it generates hashes of the keys and creates a memory map so that each hash points to a certain point in memory. The previous example done with a Hash:
fake_array = {}
100.times{|i| fake_array["foo#{i}"] = 1}
fake_array.has_key?("foo99")
=> true
The downside is that Sets and Hash keys can only include unique items and if you add a lot of items, Ruby will have to rehash the whole thing after certain number of items to build a new map that suits a larger keyspace. For more about this, I recommend you watch "MountainWest RubyConf 2014 - Big O in a Homemade Hash by Nathan Long".
Here's a benchmark:
require 'benchmark'
require 'set'
array = []
set = Set.new
10_000.times do |i|
array << "foo#{i}"
set << "foo#{i}"
end
Benchmark.bm do |x|
x.report("array") { 10_000.times { array.include?("foo9999") } }
x.report("set ") { 10_000.times { set.include?("foo9999") } }
end
And the results:
            user     system      total        real
array   7.020000   0.000000   7.020000 (  7.031525)
set     0.010000   0.000000   0.010000 (  0.004816)
This is another way to do this: use the Array#index method.
It returns the index of the first occurrence of the element in the array.
For example:
a = ['cat','dog','horse']
if a.index('dog')
puts "dog exists in the array"
end
index() can also take a block:
For example:
a = ['cat','dog','horse']
puts a.index {|x| x.match /o/}
This returns the index of the first word in the array that contains the letter 'o'.
Fun fact,
You can use * to check array membership in a case expressions.
case element
when *array
...
else
...
end
Notice the little * in the when clause, this checks for membership in the array.
All the usual magic behavior of the splat operator applies, so for example if array is not actually an array but a single element it will match that element.
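A small runnable sketch of the splat trick above; the method name in_list? is made up for illustration. Keep in mind that when compares with ===, not ==, so Ranges, Regexps and classes inside the array match by case equality.

```ruby
# Returns true if any entry of `array` matches `element` via ===.
def in_list?(element, array)
  case element
  when *array
    true
  else
    false
  end
end

in_list?('Dog', ['Cat', 'Dog', 'Bird']) # => true
in_list?('Emu', ['Cat', 'Dog', 'Bird']) # => false
in_list?(5, [1..3, 4..6])               # => true, because (4..6) === 5
```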
Check exists
Use include?
Example:
arr = [1, 2, 3]
arr.include?(1) # => true
arr.include?(4) # => false
Check does not exist
Use exclude? (added by ActiveSupport, so available in Rails; it is not part of core Ruby)
Example:
arr = %w(vietnam china japan)
arr.exclude?('usa') # => true
arr.exclude?('china') # => false
There are multiple ways to accomplish this. A few of them are as follows:
a = [1,2,3,4,5]
2.in? a #=> true
8.in? a #=> false
a.member? 1 #=> true
a.member? 8 #=> false
This will tell you not only that it exists but also how many times it appears:
a = ['Cat', 'Dog', 'Bird']
a.count("Dog")
#=> 1
You can try:
Example: if Cat and Dog exist in the array:
(['Cat','Dog','Bird'] & ['Cat','Dog'] ).size == 2 # or replace 2 with ['Cat','Dog'].size
Instead of:
['Cat','Dog','Bird'].member?('Cat') and ['Cat','Dog','Bird'].include?('Dog')
Note: member? and include? are the same.
This can do the work in one line!
If you need to check multiple times for any key, convert arr to a hash, and then check in O(1):
arr = ['Cat', 'Dog', 'Bird']
hash = arr.map {|x| [x,true]}.to_h
=> {"Cat"=>true, "Dog"=>true, "Bird"=>true}
hash["Dog"]
=> true
hash["Insect"]
=> nil
Performance of Hash#has_key? versus Array#include?
Parameter        | Hash#has_key?                                 | Array#include?
Time Complexity  | O(1) operation                                | O(n) operation
Access Type      | Accesses Hash[key]; if it returns any value,  | Iterates through each element of the
                 | true is returned to the Hash#has_key? call    | array until it finds the value
For a single check, using include? is fine.
For what it's worth, The Ruby docs are an amazing resource for these kinds of questions.
I would also take note of the length of the array you're searching through. The include? method will run a linear search with O(n) complexity which can get pretty ugly depending on the size of the array.
If you're working with a large (sorted) array, I would consider writing a binary search algorithm which shouldn't be too difficult and has a worst case of O(log n).
Or if you're using Ruby 2.0 or later, you can take advantage of bsearch.
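As a rough sketch of the bsearch route, assuming the array is already sorted in ascending order and contains no nil elements: in its find-any mode, Array#bsearch expects the block to return 0 on a match, a positive number when the target lies further right, and a negative number when it lies further left, which is what target <=> x gives. The helper name sorted_include? is made up.

```ruby
# O(log n) membership test; only valid on an ascending-sorted array.
def sorted_include?(sorted, target)
  !sorted.bsearch { |x| target <=> x }.nil?
end

odds = (1..99).step(2).to_a # [1, 3, 5, ..., 99]
sorted_include?(odds, 51)   # => true
sorted_include?(odds, 50)   # => false
```

If the array is not sorted the result is undefined, so this only helps when you can keep the data sorted or sort it once up front.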
If we want to not use include? this also works:
['cat','dog','horse'].select{ |x| x == 'dog' }.any?
How about this way?
['Cat', 'Dog', 'Bird'].index('Dog')
['Cat', 'Dog', 'Bird'].detect { |x| x == 'Dog'}
=> "Dog"
!['Cat', 'Dog', 'Bird'].detect { |x| x == 'Dog'}.nil?
=> true
If you're trying to do this in a MiniTest unit test, you can use assert_includes. Example:
pets = ['Cat', 'Dog', 'Bird']
assert_includes(pets, 'Dog') # -> passes
assert_includes(pets, 'Zebra') # -> fails
There's the other way around this.
Suppose the array is [ :edit, :update, :create, :show ], well perhaps the entire seven deadly/restful sins.
And further toy with the idea of pulling a valid action from some string:
"my brother would like me to update his profile"
Then:
[ :edit, :update, :create, :show ].select{|v| v if "my brother would like me to update his profile".downcase =~ /[,|.| |]#{v.to_s}[,|.| |]/}
I always find it interesting to run some benchmarks to see the relative speed of the various ways of doing something.
Finding an array element at the start, middle or end will affect any linear searches but barely affect a search against a Set.
Converting an Array to a Set is going to cause a hit in processing time, so create the Set from an Array once, or start with a Set from the very beginning.
Here's the benchmark code:
# frozen_string_literal: true
require 'fruity'
require 'set'
ARRAY = (1..20_000).to_a
SET = ARRAY.to_set
DIVIDER = '-' * 20
def array_include?(elem)
ARRAY.include?(elem)
end
def array_member?(elem)
ARRAY.member?(elem)
end
def array_index(elem)
ARRAY.index(elem) >= 0
end
def array_find_index(elem)
ARRAY.find_index(elem) >= 0
end
def array_index_each(elem)
ARRAY.index { |each| each == elem } >= 0
end
def array_find_index_each(elem)
ARRAY.find_index { |each| each == elem } >= 0
end
def array_any_each(elem)
ARRAY.any? { |each| each == elem }
end
def array_find_each(elem)
ARRAY.find { |each| each == elem } != nil
end
def array_detect_each(elem)
ARRAY.detect { |each| each == elem } != nil
end
def set_include?(elem)
SET.include?(elem)
end
def set_member?(elem)
SET.member?(elem)
end
puts format('Ruby v.%s', RUBY_VERSION)
{
'First' => ARRAY.first,
'Middle' => (ARRAY.size / 2).to_i,
'Last' => ARRAY.last
}.each do |k, element|
puts DIVIDER, k, DIVIDER
compare do
_array_include? { array_include?(element) }
_array_member? { array_member?(element) }
_array_index { array_index(element) }
_array_find_index { array_find_index(element) }
_array_index_each { array_index_each(element) }
_array_find_index_each { array_find_index_each(element) }
_array_any_each { array_any_each(element) }
_array_find_each { array_find_each(element) }
_array_detect_each { array_detect_each(element) }
end
end
puts '', DIVIDER, 'Sets vs. Array.include?', DIVIDER
{
'First' => ARRAY.first,
'Middle' => (ARRAY.size / 2).to_i,
'Last' => ARRAY.last
}.each do |k, element|
puts DIVIDER, k, DIVIDER
compare do
_array_include? { array_include?(element) }
_set_include? { set_include?(element) }
_set_member? { set_member?(element) }
end
end
Which, when run on my Mac OS laptop, results in:
Ruby v.2.7.0
--------------------
First
--------------------
Running each test 65536 times. Test will take about 5 seconds.
_array_include? is similar to _array_index
_array_index is similar to _array_find_index
_array_find_index is faster than _array_any_each by 2x ± 1.0
_array_any_each is similar to _array_index_each
_array_index_each is similar to _array_find_index_each
_array_find_index_each is faster than _array_member? by 4x ± 1.0
_array_member? is faster than _array_detect_each by 2x ± 1.0
_array_detect_each is similar to _array_find_each
--------------------
Middle
--------------------
Running each test 32 times. Test will take about 2 seconds.
_array_include? is similar to _array_find_index
_array_find_index is similar to _array_index
_array_index is faster than _array_member? by 2x ± 0.1
_array_member? is faster than _array_index_each by 2x ± 0.1
_array_index_each is similar to _array_find_index_each
_array_find_index_each is similar to _array_any_each
_array_any_each is faster than _array_detect_each by 30.000000000000004% ± 10.0%
_array_detect_each is similar to _array_find_each
--------------------
Last
--------------------
Running each test 16 times. Test will take about 2 seconds.
_array_include? is faster than _array_find_index by 10.000000000000009% ± 10.0%
_array_find_index is similar to _array_index
_array_index is faster than _array_member? by 3x ± 0.1
_array_member? is faster than _array_find_index_each by 2x ± 0.1
_array_find_index_each is similar to _array_index_each
_array_index_each is similar to _array_any_each
_array_any_each is faster than _array_detect_each by 30.000000000000004% ± 10.0%
_array_detect_each is similar to _array_find_each
--------------------
Sets vs. Array.include?
--------------------
--------------------
First
--------------------
Running each test 65536 times. Test will take about 1 second.
_array_include? is similar to _set_include?
_set_include? is similar to _set_member?
--------------------
Middle
--------------------
Running each test 65536 times. Test will take about 2 minutes.
_set_member? is similar to _set_include?
_set_include? is faster than _array_include? by 1400x ± 1000.0
--------------------
Last
--------------------
Running each test 65536 times. Test will take about 4 minutes.
_set_member? is similar to _set_include?
_set_include? is faster than _array_include? by 3000x ± 1000.0
Basically the results tell me to use a Set for everything if I'm going to search for inclusion unless I can guarantee that the first element is the one I want, which isn't very likely. There's some overhead when inserting elements into a hash, but the search times are so much faster I don't think that should ever be a consideration. Again, if you need to search it, don't use an Array, use a Set. (Or a Hash.)
The smaller the Array, the faster the Array methods will run, but they're still not going to keep up, though in small arrays the difference might be tiny.
"First", "Middle" and "Last" reflect the use of first, size / 2 and last for ARRAY for the element being searched for. That element will be used when searching the ARRAY and SET variables.
Minor changes were made for the methods that were comparing to > 0 because the test should be >= 0 for index type tests.
More information about Fruity and its methodology is available in its README.
There are many ways to find an element in an array, but the simplest is the in? method (provided by ActiveSupport, so available in Rails).
example:
arr = [1,2,3,4]
number = 1
puts "yes #{number} is present in arr" if number.in? arr
If you want to return the value not just true or false, use
array.find{|x| x == 'Dog'}
This will return 'Dog' if it exists in the list, otherwise nil.
If you don't want to use include?, you can first wrap the element in an array and then check whether the wrapped element is equal to the intersection of the array and the wrapped element. This returns a boolean value based on equality.
def in_array?(array, item)
item = [item] unless item.is_a?(Array)
item == array & item
end
Here is one more way to do this:
arr = ['Cat', 'Dog', 'Bird']
e = 'Dog'
present = arr.size != (arr - [e]).size
array = [ 'Cat', 'Dog', 'Bird' ]
array.include?("Dog")
Try below
(['Cat', 'Dog', 'Bird'] & ['Dog']).any?

Resources