Using a pair of values as a key - ruby

Very frequently I've had the need to hash a pair of values. Often, I just generate a range between num1 and num2 and hash that as a key, but that's pretty slow because the distance between those two numbers can be quite large.
How can one go about hashing a pair of values to a table? For example, say I'm iterating through an array and want to hash every single possible pair of values into a hash table, where the key is the pair of nums and the value is their sum. What's an efficient way to do this? I've also thought about hashing an array as the key, but that doesn't work.
Also, how would one go about extending this to 3, 4, or 5 numbers?
EDIT:
I'm referring to hashing for O(1) lookup in a hashtable.

Just do it.
You can simply hash on the array...
Verification
Let me show a little experiment:
array = [ [1,2], [3,4], ["a", "b"], ["c", 5] ]
hash = {}
array.each do |e|
e2 = e.clone
e << "dummy"
e2 << "dummy"
hash[e] = (hash[e] || 0) + 1
hash[e2] = (hash[e2] || 0) + 1
puts "e == e2: #{(e==e2).inspect}, e.id = #{e.object_id}, e.hash = #{e.hash}, e2.id = #{e2.object_id}, e2.hash = #{e2.hash}"
end
puts hash.inspect
As you see, I take a few arrays, clone them, modify them separately; after this, we are sure that e and e2 are different arrays (i.e. different object IDs); but they contain the same elements. After this, the two different arrays are used as hash keys; and since they have the same content, are hashed together.
e == e2: true, e.id = 19797864, e.hash = -769884714, e2.id = 19797756, e2.hash = -769884714
e == e2: true, e.id = 19797852, e.hash = -642596098, e2.id = 19797588, e2.hash = -642596098
e == e2: true, e.id = 19797816, e.hash = 104945655, e2.id = 19797468, e2.hash = 104945655
e == e2: true, e.id = 19797792, e.hash = -804444135, e2.id = 19797348, e2.hash = -804444135
{[1, 2, "dummy"]=>2, [3, 4, "dummy"]=>2, ["a", "b", "dummy"]=>2, ["c", 5, "dummy"]=>2}
As you see, you can not only use arrays as keys, but it also recognizes them as being the "same" (and not some weird object identity which it could also be).
Caveat
Obviously this works only to a point. The contents of the arrays must recursively be well-defined with regards to hashing. I.e., you can use sane things like strings, numbers, other arrays, even nil in there.
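To make the caveat concrete, here is a minimal sketch. Point and Opaque are hypothetical classes invented for this illustration: Point defines eql? and hash by value, Opaque keeps the defaults inherited from Object, which are based on object identity.

```ruby
# Hypothetical class: eql? and hash are defined in terms of the contents.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end

  def hash
    [x, y].hash
  end
end

# Hypothetical class: no overrides, so eql? and hash fall back to identity.
class Opaque
  def initialize(value)
    @value = value
  end
end

table = {}

# Array#hash and Array#eql? delegate to the elements, so this key is found again.
table[[Point.new(1, 2), "a"]] = :found
by_value = table[[Point.new(1, 2), "a"]]    # => :found

# Two distinct Opaque instances are never eql?, so the lookup misses.
table[[Opaque.new(1), "a"]] = :stored
by_identity = table[[Opaque.new(1), "a"]]   # => nil
```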
Reference
From http://ruby-doc.org/core-2.4.0/Hash.html :
Two objects refer to the same hash key when their hash value is identical and the two objects are eql? to each other.
From http://ruby-doc.org/core-2.4.0/Array.html#method-i-eql-3F :
eql?(other) → true or false
Returns true if self and other are the same object, or are both arrays with the same content (according to Object#eql?).
hash → integer
Compute a hash-code for this array.
Two arrays with the same content will have the same hash code (and will compare using eql?).
Emphasis mine.

If you are using a range or array, then you can also call hash on it and use that.
(num1..num2).hash
[num1, num2].hash
That will return an integer that you can use as a hash key. I have no idea if this is efficient. The source code is shown in the Range documentation and the Array documentation.
Another way I would do it is to turn the numbers into a string. This is the better solution if you are worried about hash collisions.
"#{num1}:#{num2}"
And the ruby-esque ways that I would solve your problem are:
number_array.combination(2).each { |arr| my_hash[arr.hash] = arr }
number_array.combination(2).each { |arr| my_hash[arr.join(":")] = arr }

A hash table, where the key is the pair of nums and the value is their sum:
h = {}
[1,4,6,8].combination(2){|ar| h[ar] = ar.sum}
p h #=>{[1, 4]=>5, [1, 6]=>7, [1, 8]=>9, [4, 6]=>10, [4, 8]=>12, [6, 8]=>14}
Note that using arrays as hash keys is no problem at all. To extend this to 3,4, or 5 numbers use combination(3) #or 4 or 5.
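As a rough sketch of the O(1) lookup side: the key is an ordered array, so [4, 1] and [1, 4] are different keys. Sorting the key on both insert and lookup is one way around that, assuming the order of the numbers does not matter for your use case. The names pair_sums and triple_sums are made up for this example.

```ruby
nums = [1, 4, 6, 8]

# Store each pair under a sorted key so lookups do not depend on order.
pair_sums = {}
nums.combination(2) { |pair| pair_sums[pair.sort] = pair.sum }

# The same idea with three numbers per key.
triple_sums = {}
nums.combination(3) { |triple| triple_sums[triple.sort] = triple.sum }

pair_sums[[4, 1].sort]        # => 5
triple_sums[[6, 1, 4].sort]   # => 11
pair_sums[[2, 3]]             # => nil, this pair was never stored
```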

Related

How to determine whether an array is contained in another array

The question is, given [1,2,3,4,5] and [2,4,5], to determine whether (every element in) the second array is contained in the first one. The answer is true.
What's the most succinct and efficient way to do better than:
arr2.reject { |e| arr1.include?(e) } .empty?
Array subtraction should work, as in
(arr2 - arr1).empty?
Description of method:
Returns a new array that is a copy of the original array, removing any
items that also appear in [the second array]. The order is preserved from the
original array.
It compares elements using their hash and eql? methods for efficiency.
I don't consider myself an expert on efficiency, but @Ryan indicated in comments to his answer that it's reasonably efficient at scale.
The bad O(n²) one-liner would look like this:
arr2.all? { |x| arr1.include? x }
arr2.all? &arr1.method(:include?) # alternative
If your objects are hashable, you can make this O(n) by making a set out of the first array:
require 'set'
arr2.all? &Set.new(arr1).method(:include?)
If your objects are totally, like, ordered, you can make it O(n log n) with a sort and a binary search:
arr1.sort!
arr2.all? { |x| arr1.bsearch { |y| x <=> y } }
As mentioned by @Ryan, you can use sets, in which case Set#subset? is available, which is pretty readable (note the two different ways of defining a set from an array):
require 'set'
s1 = Set.new([1, 2, 3])
s2 = [1, 2].to_set
s3 = [1, 3].to_set
s4 = [1, 4].to_set
s1.subset? s1 #=> true
s2.subset? s1 #=> true
s3.subset? s1 #=> true
s4.subset? s1 #=> false
Also consider using Set#proper_subset? if required.
s1.proper_subset? s1 #=> false
s2.proper_subset? s1 #=> true
NB A set contains no duplicate elements e.g. Set.new([1,2,3,3]) #=> #<Set: {1, 2, 3}>
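Because sets (and array subtraction) ignore how often an element occurs, [2, 2] counts as contained in [1, 2, 3] with the approaches above. If duplicates should count, a counting hash is one option. This is only a sketch, and multiset_subset? is a hypothetical helper name, not a library method.

```ruby
# Returns true if every element of small occurs in big at least as often
# as it occurs in small. Runs in O(n) for hashable elements.
def multiset_subset?(small, big)
  counts = Hash.new(0)
  big.each { |e| counts[e] += 1 }

  small.each do |e|
    return false if counts[e].zero?
    counts[e] -= 1
  end
  true
end

multiset_subset?([2, 4, 5], [1, 2, 3, 4, 5])  # => true
multiset_subset?([2, 2], [1, 2, 3])           # => false
multiset_subset?([2, 2], [2, 1, 2])           # => true
```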

Iterate two collections at the same time

a = [1,2,3]
b = [4,5 ]
What I want is to iterate over these two collections at the same time and do something with the iterator. The pseudocode would be like:
for i in a
for j in b
collect i * j
When one collection runs out of elements, the loop stops.
The result will be [4, 10].
What I have is this:
a = [1,2,3]
b = [4,5 ]
a.zip(b).reject { |c| c.any? { |d| d.nil? } }.map { |e| e.reduce(&:*) }
Any better solution? Thanks!
The perfect solution I am looking for would match the intent of my pseudocode.
You can do this:
a, b = b, a if b.length < a.length
a.zip(b).map { |ia, ib| ia * ib }
# => [4, 10]
The first line makes sure that array a has at most the same number of elements as array b. This is because zip creates an array of arrays of the length of the called array. Having a as the shortest array makes sure that there would be no nils.
Here is another way to do it:
[a.length, b.length].min.times.map {|i| a[i]*b[i] }
The idea is that you take the shorter of the two array lengths, [a.length, b.length].min, and you iterate that many times over an integer, i, which you use as an index into the arrays.
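Both ideas can be combined into a small helper that does not swap or modify the caller's arrays. This is just a sketch; pairwise_products is a made-up name.

```ruby
# Multiplies elements at the same position and stops at the shorter array.
def pairwise_products(a, b)
  n = [a.length, b.length].min
  # Truncate both sides first so zip never pads with nil.
  a.first(n).zip(b.first(n)).map { |x, y| x * y }
end

pairwise_products([1, 2, 3], [4, 5])  # => [4, 10]
pairwise_products([4, 5], [1, 2, 3])  # => [4, 10]
```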

Find the largest value for an array of hashes with common keys?

I have two arrays, each containing any number of hashes with identical keys but differing values:
ArrayA = [{value: "abcd", value_length: 4, type: 0},{value: "abcdefgh", value_length: 8, type: 1}]
ArrayB = [{value: "ab", value_length: 2, type: 0},{value: "abc", value_length: 3, type: 1}]
Although each array can hold any number of hashes, both arrays will always hold the same number.
How could I find the largest :value_length for every hash whose value is of a certain type?
For instance, the largest :value_length for a hash with a :type of 0 would be 4. The largest :value_length for a hash with a :type of 1 would be 8.
I just can't get my head around this problem.
A simple way:
all = ArrayA + ArrayB # Add them together if you want to search both arrays.
all.select{|x| x[:type] == 0}
.max_by{|x| x[:value_length]}
And if you wanna reuse it just create a function:
def find_max_of_my_array(arr,type)
arr.select{|x| x[:type] == type}
.max_by{|x| x[:value_length]}
end
p find_max_of_my_array(ArrayA, 0) # => {:value=>"abcd", :value_length=>4, :type=>0}
I'm not totally sure I know what output you want, but try this. I assume the arrays are ordered so that ArrayA[x][:type] == ArrayB[x][:type] and that you are looking for the max between (ArrayA[x], ArrayB[x]), not over the whole array. If that is not the case, then the other solutions that concatenate the two arrays first will work great.
filtered_by_type = ArrayA.zip(ArrayB).select{|x| x[0][:type] == type }
filtered_by_type.map {|a| a.max_by {|x| x[:value_length] } }
Here's how I approached it: You're looking for the maximum of something, so the Array#max method will probably be useful. You want the actual value itself, not the containing hash, so that gives us some flexibility. Getting comfortable with the functional programming style helps here. In my mind, I can see how select, map, and max fit together. Here's my solution which, as specified, returns the number itself, the maximum value:
def largest_value_length(type, hashes)
# Taking it slowly
right_type_hashes = hashes.select{|h| h[:type] == type}
value_lengths = right_type_hashes.map{|h| h[:value_length]}
maximum = value_lengths.max
# Or, in one line
#hashes.select{|h| h[:type] == type}.map{|h| h[:value_length]}.max
end
puts largest_value_length(1, ArrayA + ArrayB)
=> 8
You can also sort after filtering by type. That way you can get smallest, second largest etc.
all = ArrayA + ArrayB
all = all.select { |element| element[:type] == 1 }
.sort_by { |k| k[:value_length] }.reverse
puts all[0][:value_length]
#8
puts all[all.length-1][:value_length]
#3
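If the maximum is needed for every type at once, a single pass that keeps a running maximum per type avoids filtering once per type. This is a sketch using the question's sample data; the lowercase names array_a, array_b and max_by_type are chosen here for illustration.

```ruby
array_a = [{value: "abcd", value_length: 4, type: 0}, {value: "abcdefgh", value_length: 8, type: 1}]
array_b = [{value: "ab", value_length: 2, type: 0}, {value: "abc", value_length: 3, type: 1}]

# Maps each :type to the largest :value_length seen for it.
max_by_type = (array_a + array_b).each_with_object({}) do |h, acc|
  current = acc[h[:type]]
  acc[h[:type]] = h[:value_length] if current.nil? || h[:value_length] > current
end

p max_by_type  # => {0=>4, 1=>8}
```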

Ruby: How to find the key of the largest value in a hash?

Hello I'm trying to find the largest value in my hash.
I made a search in google and I found this code:
def largest_hash_key(hash)
key = hash.sort{|a,b| a[1] <=> b[1]}.last
puts key
end
hash = { "n" => 100, "m" => 100, "y" => 300, "d" => 200, "a" => 0 }
largest_hash_key(hash)
In this code, puts prints the largest key and value, e.g. y300.
So, how can I modify the code in order to find the largest value and put its key into a to_s variable?
This is O(n):
h = {"n" => 100, "m" => 100, "y" => 300, "d" => 200, "a" => 0}
key_with_max_value = h.max_by { |k, v| v }[0] #=> "y"
Here is another way of doing what you want. This will find all the keys with the maximum value:
h = {"n" => 100, "m" => 100, "y" => 300, "d" => 200, "a" => 0, "z" => 300}
max = h.values.max
output_hash = Hash[h.select { |k, v| v == max}]
puts "key(s) for the largest value: #{output_hash.keys}"
#=>key(s) for the largest value: ["y", "z"]
You can modify your method's first statement to
key = hash.sort{|a,b| a[1] <=> b[1]}.last[0]
Hash#sort returns an array of key-value pairs. last gets you the key-value pair with the largest value. Its first element is the corresponding key.
Sort the hash once instead of finding max. This way you can also get smallest etc.
def reverse_sort_hash_value(hash)
hash = hash.sort_by {|k,v| v}.reverse
end
h = reverse_sort_hash_value(h)
Key of largest value
max = h[0][0]
Get Key/Value of the smallest value
puts *h[h.length-1]
You can convert to hash using Hash[h.select { |k, v| v == max}] or using h.to_h
I think it is not a good idea to use something you find on google and tweak it until it somehow runs. If we develop software, we should do something that we understand.
A Hash is optimized to lookup a value by key. It is not optimized to sort the values or find by properties of the values. So the data structure is not helpful for your problem. Other data structures like trees or even arrays may be better.
But if you want to use a hash because of some other reasons, of course it is possible. Somehow you just need to loop over the whole hash.
The algorithm is quite easy: loop over the whole hash and check whether the value is bigger than the previous biggest value:
max_value = 0 # or -Float::INFINITY if you have negative values
key_for_max_value = nil
hash.each_pair do | key, value |
if value > max_value
max_value = value
key_for_max_value = key
end
end
puts "The largest value is #{max_value} and it has the key #{key_for_max_value}"
Some of the other solutions use tricks like to sort the array, but this only hides the complexity.
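A variant of the same loop, wrapped in a method and seeded from the first entry rather than from 0, so that it also works when all values are negative. This is a sketch; key_for_max_value is a name picked for the example.

```ruby
# Single O(n) pass; returns nil for an empty hash.
def key_for_max_value(hash)
  best_key = nil
  best_value = nil
  hash.each_pair do |key, value|
    # The first entry always becomes the initial candidate.
    if best_value.nil? || value > best_value
      best_key = key
      best_value = value
    end
  end
  best_key
end

h = { "n" => 100, "m" => 100, "y" => 300, "d" => 200, "a" => 0 }
key_for_max_value(h)                         # => "y"
key_for_max_value({ "a" => -5, "b" => -2 })  # => "b"
```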

Get index of array element faster than O(n)

Given that I have a HUGE array and a value from it, I want to get the index of the value in the array. Is there any other way to get it, rather than calling Array#index? The problem comes from the need to keep a really huge array and call Array#index an enormous number of times.
After a couple of tries I found that caching indexes inside elements, by storing structs with (value, index) fields instead of the value itself, gives a huge step in performance (a 20x win).
Still I wonder if there's a more convenient way of finding the index of an element without caching (or if there's a good caching technique that will boost the performance).
Why not use index or rindex?
array = %w( a b c d e)
# get FIRST index of element searched
puts array.index('a')
# get LAST index of element searched
puts array.rindex('a')
index: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-index
rindex: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-rindex
Convert the array into a hash. Then look for the key.
array = ['a', 'b', 'c']
hash = Hash[array.map.with_index.to_a] # => {"a"=>0, "b"=>1, "c"=>2}
hash['b'] # => 1
Other answers don't take into account the possibility of an entry listed multiple times in an array. This will return a hash where each key is a unique object in the array and each value is an array of indices that corresponds to where the object lives:
a = [1, 2, 3, 1, 2, 3, 4]
=> [1, 2, 3, 1, 2, 3, 4]
indices = a.each_with_index.inject(Hash.new { Array.new }) do |hash, (obj, i)|
hash[obj] += [i]
hash
end
=> { 1 => [0, 3], 2 => [1, 4], 3 => [2, 5], 4 => [6] }
This allows for a quick search for duplicate entries:
indices.select { |k, v| v.size > 1 }
=> { 1 => [0, 3], 2 => [1, 4], 3 => [2, 5] }
Is there a good reason not to use a hash? Lookups are O(1) vs. O(n) for the array.
If your array has a natural order use binary search.
Use binary search.
Binary search has O(log n) access time.
Here are the steps on how to use binary search,
What is the ordering of your array? For example, is it sorted by name?
Use bsearch to find elements or indices
Code example
# assume array is sorted by name!
array.bsearch { |each| "Jamie" <=> each.name } # returns element
(0...array.size).bsearch { |n| "Jamie" <=> array[n].name } # returns index (exclusive range, so array[n] is never nil)
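A self-contained version of the same idea, as a sketch. The Person struct and the sample names are assumptions made up for the example; the range is exclusive (0...size) so the block is never called with an out-of-bounds index.

```ruby
# Hypothetical record type with a name field.
Person = Struct.new(:name)

# The array must already be sorted by the search key (here: name).
people = %w[Alice Bob Jamie Zoe].map { |n| Person.new(n) }

# Find-any mode: the block returns 0 on a match, a positive number if the
# target lies to the right, and a negative number if it lies to the left.
element = people.bsearch { |person| "Jamie" <=> person.name }
index   = (0...people.size).bsearch { |i| "Jamie" <=> people[i].name }
missing = (0...people.size).bsearch { |i| "Carol" <=> people[i].name }

element.name  # => "Jamie"
index         # => 2
missing       # => nil
```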
If it's a sorted array you could use a Binary search algorithm (O(log n)). For example, extending the Array-class with this functionality:
class Array
  # Recursive binary search; assumes the array is sorted in ascending order.
  # Returns the index of value, or nil if it is not present.
  def b_search(value, lower_index = 0, upper_index = length - 1)
    return if lower_index > upper_index
    midpoint_index = (lower_index + upper_index) / 2
    return midpoint_index if self[midpoint_index] == value
    if value < self[midpoint_index]
      b_search(value, lower_index, midpoint_index - 1)
    else
      b_search(value, midpoint_index + 1, upper_index)
    end
  end
end
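Taking a combination of @sawa's answer and the comment listed there, you could implement a "quick" index and rindex on the Array class.
class Array
  # Note: the hash is rebuilt on every call, so this only pays off if it is cached.
  def quick_index(el)
    # Reverse the pairs so the first occurrence is written last and wins.
    hash = Hash[each_with_index.to_a.reverse]
    hash[el]
  end
  def quick_rindex(el)
    # Later pairs overwrite earlier ones, so the last occurrence wins.
    hash = Hash[each_with_index.to_a]
    hash[el]
  end
end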
Still I wonder if there's a more convenient way of finding the index of an element without caching (or if there's a good caching technique that will boost the performance).
You can use binary search (if your array is ordered and the values you store in the array are comparable in some way). For that to work you need to be able to tell the binary search whether it should be looking "to the left" or "to the right" of the current element. But I believe there is nothing wrong with storing the index at insertion time and then using it if you are getting the element from the same array.
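Storing the index at insertion time could look roughly like the sketch below. IndexedList is a hypothetical wrapper class, not part of Ruby, and it only supports appending; deleting or inserting in the middle would require updating the stored indexes.

```ruby
# Hypothetical wrapper: keeps an array and a value => index hash in sync.
class IndexedList
  def initialize
    @items = []
    @index_of = {}
  end

  # Appends value and records its position; the first occurrence wins,
  # which matches the behaviour of Array#index.
  def push(value)
    @index_of[value] ||= @items.length
    @items << value
    self
  end

  def [](i)
    @items[i]
  end

  # Average O(1) hash lookup instead of the O(n) scan done by Array#index.
  def index(value)
    @index_of[value]
  end
end

list = IndexedList.new
%w[a b c b].each { |v| list.push(v) }
list.index("b")  # => 1
list.index("z")  # => nil
```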
