Ruby: hash that doesn't remember key values

Is there a hash implementation around that doesn't remember key values? I have to make a giant hash but I don't care what the keys are.
Edit:
Ruby's hash implementation stores the key's value. I would like hash that doesn't remember the key's value. It just uses the hash function to store your value and forgets the key. The reason for this is that I need to make a hash for about 5 gb of data and I don't care what the key values are after creating it. I only want to be able to look up the values based on other keys.
Edit Edit:
The language is kind of confusing. By key's value I mean this:
hsh['value'] = data
I don't care what 'value' is after the hash function stores data in the hash.
Edit^3:
Okay so here's what I am doing: I am generating every 35-letter (nucleotide) kmer for a set of multiple genes. Each gene has an ID. The hash looks like this:
kmers = { 'A...G' => [1, 5, 3], 'G...T' => [4, 9, 9, 3] }
So the hash key is the kmer, and the value is an array containing IDs for the gene(s)/string(s) that have that kmer.
I am querying the hash for kmers in another dataset to quickly find matching genes. I don't care what the hash keys are, I just need to get the array of numbers from a kmer.
>> kmers['A...G']
=> [1, 5, 3]
>> kmers.keys.first
=> "Sorry Dave, I can't do that"

I guess you want a Set, although it stores only unique keys and no values. It has the fast lookup time of a hash.
Set is included in the standard library.
require 'set'
s = Set.new
s << 'aaa'
p s.merge(['ccc', 'ddd']) #=> #<Set: {"aaa", "ccc", "ddd"}>

Even if there was an oddball hash that just recorded existence (which is how I understand the question) you probably wouldn't want to use it, as the built-in Hash would be simpler, faster, not require a gem, etc. So just set...
h[k] = k
...and call it a day...

I assume the 5 gb string is a genome, and the kmers are 35 base pair nucleotide sequences.
What I'd probably do (slightly simplified) is:
require 'set'
human_genome = File.read("human_genome.txt")
human_kmers = Set.new
# String has no each_cons, so iterate over characters and join each window
human_genome.each_char.each_cons(35) do |potential_kmer|
  human_kmers << potential_kmer.join # a Set ignores duplicates on its own
end
unknown_gene = File.read("unknown_gene.txt")
related_to_humans = unknown_gene.each_char.each_cons(35).any? do |unknown_gene_kmer|
  human_kmers.include?(unknown_gene_kmer.join)
end
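The lookup the question describes (k-mer in, gene IDs out) can be sketched with a plain Hash whose default block creates an empty ID list on first use. This is only a sketch under assumptions: the method name build_kmer_index, the toy sequences, and k = 3 are made up for illustration (the question uses 35), and it does not address the memory cost of keeping every k-mer string as a key.

```ruby
# Build an index from each k-mer to the IDs of the genes that contain it.
# `genes` maps a gene ID to its sequence (hypothetical input shape).
def build_kmer_index(genes, k)
  index = Hash.new { |hash, kmer| hash[kmer] = [] }
  genes.each do |id, sequence|
    # each_cons yields arrays of characters, so join them back into a string
    sequence.each_char.each_cons(k) do |chars|
      kmer = chars.join
      index[kmer] << id unless index[kmer].include?(id)
    end
  end
  index
end

genes = { 1 => "ACGTAC", 5 => "CGTTT" } # toy data, not real genes
kmers = build_kmer_index(genes, 3)
p kmers["CGT"]            # => [1, 5]
p kmers.fetch("GGG", [])  # => [], fetch does not create an entry for a miss
```

Querying with fetch rather than [] matters here: with a default block, kmers["GGG"] would silently add an empty entry for every k-mer that misses.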

I have to make a giant hash but I don't care what the keys are.
That is called an array. Just use an array. A hash without keys is not a hash at all and loses its value. If you don't need key-value lookup then you don't need a hash.

Use an Array. An Array indexes by integers instead of keys. http://www.ruby-doc.org/core/classes/Array.html
a = []
a << "hello"
p a #=> ["hello"]

Related

How to get the index of a key in a hash?

I'm trying to get the index of a key in a hash.
I know how to do this in an array:
arr = ['Done', 13, 0.4, true]
a = arr.index('Done')
puts a
Is there a method, or some other way, to do something like this with a key in a hash? Thanks!
Hashes aren't usually treated as ordered structures; they simply have a list of keys and values corresponding to those keys.
It's true that in Ruby hashes are technically ordered, but there's very rarely an actual use case for treating them as such.
If what you want to do is find the key corresponding to a value in a hash, you can simply use the Hash#key method:
hash = { a: 1, b: 2 }
hash.key(1) # => :a
I suppose you could use hash.keys.index(hash.key(1)) to get 0 since it's the first key, but again, I wouldn't advise doing this because it's not typical use of the data structure.
There are at least a couple of ways to get this information. The two that come to mind are Enumerable's find_index method, which passes each key/value pair to a block where you can check for your key:
hash.find_index { |key, _| key == 'Done' }
or you could get all the keys from your hash as an array and then look up the index as you've been doing:
hash.keys.index('Done')
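Both approaches can be tried side by side in a small sketch. The helper name key_index and the sample hash are assumptions made for illustration; the result relies on Ruby hashes preserving insertion order (Ruby 1.9 and later).

```ruby
# Position of `key` in the hash's insertion order, or nil if it is absent.
def key_index(hash, key)
  hash.keys.index(key)
end

tasks = { 'Todo' => 3, 'Done' => 13 } # hypothetical data
p key_index(tasks, 'Done')                     # => 1
p tasks.find_index { |key, _| key == 'Done' }  # => 1
p key_index(tasks, 'Missing')                  # => nil
```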

How do symbols equal each other?

When I do :symbol == :symbol I find that it's true. They are the same.
If this is the case, how can we create arrays like this:
a = [{:name=>"Michael"},{:name=>"John"}]
Look at the code below:
a = [{:name=>"Michael"},{:name=>"John"}]
a.map(&:object_id) # => [70992070, 70992050]
This is because a is an array of hashes, and they are two different Hash objects. In Ruby, the keys within a single Hash must be unique, but two different hashes can use the same symbol as a key.
You seem to be confused about hash keys. One hash cannot contain the same key twice, but two different hashes can have the same object as a key. For example:
a_key = "hello"
spanish = { a_key => "hola" }
french = { a_key => "bonjour" }
some_array = [spanish, french]
On top of that, it is possible for arrays to contain duplicate objects (e.g. [1, 2, 1] is valid) -- but these aren't even duplicates. Two hashes that contain the same key are still different objects.
There's nothing at all unusual about an array like that. In fact, it's normal for hashes in an array to have keys in common, because usually if you want to put things in an array, it means they have something in common that you can use to deal with them in the same way.
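A short check makes the distinction concrete: the symbol used as a key is one shared object, while the hashes holding it are separate objects. The variable names are chosen only for illustration.

```ruby
a = [{ :name => "Michael" }, { :name => "John" }]

first_key  = a[0].keys.first
second_key = a[1].keys.first

p first_key.equal?(second_key)  # => true, both hashes use the same Symbol object
p a[0].equal?(a[1])             # => false, two distinct Hash objects
p a[0] == a[1]                  # => false, their values differ as well
```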

How to save an array of information coming from a hash in Ruby

I am new to Ruby and don't have much experience with hashes. I have a variable named tweets, and it is a hash like this:
{"statuses"=>[{"metadata"=>{"result_type"=>"recent", "iso_language_code"=>"tl"}, "lang"=>"tl"}]}
I would like to save the array of information as a separate variable in an array. How would I go about this?
Hashes have two very nice methods:
hash.values
hash.keys
in your case -
h = {"statuses"=>[{"metadata"=>{"result_type"=>"recent", "iso_language_code"=>"tl"}, "lang"=>"tl"}]}
p h.values
p h.keys
These output arrays of each type. This might be what you want.
Also, this question may well be closed. A single Google search turned up several Hash-to-Array SO questions:
Ruby Hash to array of values
Converting Ruby hashes to arrays
If you have a Hash like so:
hash = {:numbers => [1,2,3,4,5]}
And you need to capture the array in a new variable, you can just access the key and assign the result:
one_to_five = hash[:numbers]
However, note that the new variable refers to the same array that is in the hash, so altering the hash's array also alters what the new variable sees.
hash[:numbers] << 6
p one_to_five #=> [1, 2, 3, 4, 5, 6]
If you use dup, it creates a shallow copy of the array, so there are two separate arrays. Starting again from the original hash:
one_to_five = hash[:numbers].dup
hash[:numbers] << 6
p one_to_five #=> [1, 2, 3, 4, 5]
So, in your case:
hash = {'statuses' => [{"metadata"=>{"result_type"=>"recent", "iso_language_code"=>"tl"}, "lang"=>"tl"}]}
new_array = hash['statuses'].dup
However, it would be interesting to see what it is you are wishing to accomplish with your code, or at least get a little more context, because this may not be the best approach for your final goal. There are a great many things you can do with Arrays and Hashes (and Enumerable) and I would encourage you to read through the documentation on them.
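As one possible next step, here is a sketch of pulling individual fields out of the statuses array once it has been extracted. Which fields are actually wanted is an assumption; the sample hash is the one from the question. It also shows that dup copies only the outer array, so the inner hashes are still shared.

```ruby
tweets = { "statuses" => [{ "metadata" => { "result_type" => "recent", "iso_language_code" => "tl" }, "lang" => "tl" }] }

# fetch with a default avoids a nil if the key is missing
statuses = tweets.fetch("statuses", [])

langs        = statuses.map { |status| status["lang"] }
result_types = statuses.map { |status| status.dig("metadata", "result_type") }
p langs         # => ["tl"]
p result_types  # => ["recent"]

# dup is shallow: the copied array holds the same inner hashes
copy = statuses.dup
copy[0]["lang"] = "en"
p tweets["statuses"][0]["lang"]  # => "en"
```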

Behavior of altered array keys in hashes

Ruby allows for a mutable object to be used as a hash key, and I was curious how this worked when the object is updated. It seems like the referenced object is irretrievable from key requests if it's updated.
key = [1,2]
test = {key => 12}
test # => {[1, 2] => 12}
test[key] # => 12
test[[1,2]] # => 12
test[[1,2,3]] # => nil
key << 3
test # => {[1, 2, 3] => 12}
test[key] # => nil
test[[1,2]] # => nil
test[[1,2,3]] # => nil
Why does this work this way? Why can't I provide a key to the hash which will return the value associated with the list I originally used as a key?
According to the documentation:
Two objects refer to the same hash key when their hash value is identical and the two objects are eql? to each other.
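Mutating a key doesn't change the hash value it's stored under. After you mutate the key, indexing with [1,2] finds the slot for the stored hash value but fails the eql? check against the stored key (which is now [1,2,3]), while indexing with [1,2,3] would pass eql? but its hash value points to a different slot.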
See this article for a more elaborate explanation.
You can rehash test, however, to recalculate the hashes based on current key values:
test.rehash
test[[1,2,3]] # => 12
class D
end
p D.new.methods.include?(:hash) #=> true
# so the D instance has a hash method. What does it do?
p D.new.hash #=> -332308361 # just some number
(Almost) every object in Ruby has a hash method. The Hash calls this method when the object is used as a key, and uses the resulting number to store and retrieve the key. (There are smart procedures to handle duplicate numbers (hash collisions)). Retrieving goes like this:
a_hash[[1,2,3]]
# a_hash calls the hash method on the [1,2,3] object
# and checks if it has stored a value for the resulting number.
This number is only created once: when the key is added to the hash instance.
Problems arise when you start messing with the key after including it in a hash: the number returned by the object's hash method will then differ from the one stored in the hash.
Don't do that, or
consider not using mutable objects as keys, or
remember to do a timely:
a_hash.rehash
which will recalculate all hash numbers.
Note: For string keys, the hash stores its own frozen copy of the string (unless the string was already frozen), so modifying the original string afterwards won't matter.
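The string-key behaviour can be observed directly. This is a small sketch with made-up variable names; String.new is used so the starting string is definitely unfrozen and mutable.

```ruby
key = String.new("abc") # an unfrozen, mutable string
h = {}
h[key] = 1

stored = h.keys.first
p stored.frozen?      # => true
p stored.equal?(key)  # => false, the hash keeps its own copy

key << "d"            # mutate the original string
p h["abc"]            # => 1, lookup by the original content still works
p h[key]              # => nil, "abcd" was never stored
```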
It would be inconvenient if the identity of an array matters as the hash key. If you have a hash with a key [1, 2], you want to be able to access that with a different array object [1, 2] that has the same content. You want access by the content, not the identity. That would mean that what particular object (with the particular object id) is stored as a key does not matter for a hash. All that matters is the content of the key at the time it was assigned to the hash.
Therefore, after doing key << 3, it makes sense that test[key] or test[[1, 2, 3]] does not return the stored value anymore because key at the time of assignment to test was [1, 2].
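The tricky thing is that test[[1, 2]] also returns nil. That is a limitation of Ruby: the hash keeps a reference to the key object rather than a snapshot of its content, so the stored key no longer compares equal to [1, 2].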
If you want the hash to reflect the change made in the key objects, there is a method Hash#rehash.
test.rehash
test[key] # => 12
test[[1,2]] # => nil
test[[1,2,3]] # => 12
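Putting the pieces together, the sketch below repeats the mutate-then-rehash sequence and then shows one way to avoid the problem altogether: freezing the key before using it. Freezing is a suggestion added here, not something the answers above prescribe, and the variable names are illustrative.

```ruby
key = [1, 2]
test = { key => 12 }

key << 3       # mutate the key after insertion
p test[key]    # => nil, the stored hash number is stale
test.rehash    # recompute hash numbers from the current key contents
p test[key]    # => 12

# Freezing the key up front rules the problem out.
frozen_key = [1, 2].freeze
safe = { frozen_key => 12 }
begin
  frozen_key << 3
rescue FrozenError => e # FrozenError exists since Ruby 2.5
  puts "refused: #{e.class}"
end
p safe[[1, 2]] # => 12
```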

Ruby dynamically naming arrays

I want to iterate through a number of arrays and want to dynamically name them from an array of names. Something like this, replace name with the elements from the names array...
names=[a, b, c]
names.each{|name|
name_array1=[]
name_array2=[]
name_array[0][0].each{|i|
if i>0
name_array1.push([i])
end
if i<0
name_array2.push([i])
end
}
}
basically creating the arrays a_array1, a_array2, a_array[0][0], b_array1, b_array2, b_array[0][0], c_array1, c_array2, c_array[0][0]
Is this even possible?
Ruby does not support dynamic local variable names [1].
However, this can be easily represented using a Hash. A Hash maps a Key to a Value and, in this case, the Key represents a "name" and the Value is the Array:
# use Symbols for names, although Strings would work too
names = [:a, :b, :c]
# create a new hash
my_arrays = {}
# add some arrays to our hash
names.each_with_index { |name, index|
array = [index] * (index + 1)
my_arrays[name] = array
}
# see what we have
puts my_arrays
# access "by name"
puts my_arrays[:b]
(There are ways to write the above without side-effects, but this should be a start.)
[1] Dynamic instance/class variable names are a different story, but are best left as an "advanced topic" for now and are not applicable to the current task. In the past (Ruby 1.8.x), eval could be used to alter local variable bindings, but this was never a "good" approach and does not work in newer versions.
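Applied to the question's own goal (splitting each named array into positive and negative numbers), the hash approach might look like the sketch below. The input data and the method name split_by_sign are assumptions, since the question does not show where its arrays come from, and the numbers are stored directly rather than wrapped in one-element arrays as in the question's push([i]).

```ruby
# Hypothetical input: one array of numbers per name.
data = { a: [3, -1, 4], b: [-2, -7], c: [0, 5] }

# Returns a hash keyed by name, each holding :positive and :negative arrays.
def split_by_sign(data)
  data.each_with_object({}) do |(name, numbers), result|
    result[name] = {
      positive: numbers.select { |i| i > 0 },
      negative: numbers.select { |i| i < 0 }
    }
  end
end

split = split_by_sign(data)
p split[:a][:positive]  # => [3, 4]
p split[:b][:negative]  # => [-2, -7]
p split[:c][:negative]  # => []
```

Instead of a_array1 and a_array2, the arrays are reached as split[:a][:positive] and split[:a][:negative], so adding a new name needs no new variables.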

Resources