Is it possible to have multidimensional arrays in Ruby? - ruby

I have an array that stores banned IP addresses in my application:
bannedips = ["10.10.10.10", "20.20.20.20", "30.30.30.30"]
I want to add more information to each banned IP address (IP address, ban timestamp, ban reason).
How can I do this in Ruby?

In Ruby, multidimensional arrays are simply arrays of arrays:
bannedips = [["10.10.10.10", "more data", "etc"], ["20.20.20.20", ...]]
A better approach would be to use an array of hashes, so you can label values:
bannedips = [{ip: "10.10.10.10", timestamp: 89327414}, ...]

If there are a reasonable number of IPs to be tracked, I'd probably use a simple Hash:
banned_ips = {
"10.10.10.10" => {:timestamp => Time.now, :reason => 'foo'},
"20.20.20.20" => {:timestamp => Time.now, :reason => 'bar'},
"30.30.30.30" => {:timestamp => Time.now, :reason => nil}
}
A hash is a quick and dirty way to create a list that acts like an indexed database; lookups are extremely fast. And, since a particular key can only appear once, it keeps you from dealing with duplicate data:
banned_ips["20.20.20.20"] # => {:timestamp=>2015-01-02 12:33:19 -0700, :reason=>"bar"}
banned_ips.keys # => ["10.10.10.10", "20.20.20.20", "30.30.30.30"]
As a general programming tip for choosing between arrays and hashes:
If you have to quickly access a specific value, use a hash, which acts like a random-access database.
If you want a queue or list of values that you'll access sequentially, use an array.
So, for what you want, retrieving values tied to a specific IP, use a hash. An array, or array-of-arrays would cause the code to waste time looking for the particular value and would slow down as new items were added to the array because of those lookups.
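As a rough illustration of that difference, here is a minimal sketch contrasting the two lookups. The names `ban_list` and `ban_index` and the sample reasons are made up for the example.

```ruby
# Array of hashes: finding one IP means scanning the elements in turn.
ban_list = [
  { ip: "10.10.10.10", reason: "foo" },
  { ip: "20.20.20.20", reason: "bar" }
]
from_array = ban_list.find { |entry| entry[:ip] == "20.20.20.20" }

# Hash keyed by IP: the lookup goes straight to the entry.
ban_index = {
  "10.10.10.10" => { reason: "foo" },
  "20.20.20.20" => { reason: "bar" }
}
from_hash = ban_index["20.20.20.20"]

# Adding a ban and checking membership are one-liners with the hash.
ban_index["30.30.30.30"] = { reason: nil }
banned  = ban_index.key?("30.30.30.30")
unknown = ban_index["40.40.40.40"] # nil for an IP that is not banned
```

Both approaches return the same data; the hash just avoids walking the list on every check.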
There's a point where it becomes more sensible to store this sort of information into a database, and as a developer it's good to learn about them. They're one of many tools we need to have in our toolbox.

Yes, multidimensional arrays are possible in Ruby. Arrays can contain any value, so a multidimensional array is just an array which contains other arrays:
require 'date'

banned_ips = [
  ["10.10.10.10", Date.new(2015, 1, 2), "reason"],
  ["20.20.20.20", Date.new(2014, 12, 28), "reason"],
  ["30.30.30.30", Date.new(2014, 12, 29), "reason"],
]
Personally though I wouldn't recommend using a multidimensional array for this purpose. Instead, create a class which encapsulates information about the banned IP.
Simple example:
class BannedIP
  attr_reader :ip, :time, :reason

  def initialize(ip, time:, reason: "N/A")
    @ip = ip
    @time = time
    @reason = reason
  end
end
banned_ips = [
BannedIP.new("10.10.10.10", time: Date.new(2015, 1, 2)),
BannedIP.new("20.20.20.20", time: Date.new(2014, 12, 28)),
BannedIP.new("30.30.30.30", time: Date.new(2014, 12, 29), reason: "Spam"),
]
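If you also want fast lookups by address, the class-based approach can be combined with a hash. The following is only a sketch: the class is repeated so the snippet runs on its own, and the `bans` and `bans_by_ip` names are hypothetical.

```ruby
require 'date'

class BannedIP
  attr_reader :ip, :time, :reason

  def initialize(ip, time:, reason: "N/A")
    @ip = ip
    @time = time
    @reason = reason
  end
end

bans = [
  BannedIP.new("10.10.10.10", time: Date.new(2015, 1, 2)),
  BannedIP.new("30.30.30.30", time: Date.new(2014, 12, 29), reason: "Spam")
]

# Hypothetical index keyed by IP address, built once from the array.
bans_by_ip = bans.each_with_object({}) { |ban, index| index[ban.ip] = ban }

spam_ban = bans_by_ip["30.30.30.30"]
puts spam_ban.reason                  # prints "Spam"
puts bans_by_ip["10.10.10.10"].reason # prints "N/A" (the default)
```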

Related

Position of key/value pairs in a hash in Ruby (or any language)

I heard that the positions of the key value pairs in a hash are not fixed, and could be rearranged.
I would like to know if this is true, and if it is, could someone point me to some documentation? If it is wrong, it would be great to have some documentation to the contrary.
To illustrate, if I have the following hash:
NUMBERS = {
1000 => "M",
900 => "CM",
500 => "D",
400 => "CD",
100 => "C",
90 => "XC",
50 => "L",
40 => "XL",
10 => "X",
9 => "IX",
5 => "V",
4 => "IV",
1 => "I",
}
and iterate through it over and over again, would the first key/value pair possibly not be 1000 => 'M'? Or, are the positions of the key/value pairs fixed by definition, and would have to be manually changed in order for the positions to change?
This question is a more general and basic question about the qualities of hashes. I'm not asking how to get to a certain position in a hash.
Generally, hashes (also called dictionaries or associative arrays) are considered unordered data structures.
From Wikipedia
In addition, associative arrays may also include other operations such
as determining the number of bindings or constructing an iterator to
loop over all the bindings. Usually, for such an operation, the order
in which the bindings are returned may be arbitrary.
However, since Ruby 1.9, hashes maintain the order in which their keys were inserted.
The answer is right at the top of the Ruby documentation for Hash:
Hashes enumerate their values in the order that the corresponding keys
were inserted.
In Ruby you can test it yourself pretty easily
key_indices = {
1000 => 0,
900 => 1,
500 => 2,
400 => 3,
100 => 4,
90 => 5,
50 => 6,
40 => 7,
10 => 8,
9 => 9,
5 => 10,
4 => 11,
1 => 12
}
1_000_000.times do
key_indices.each_with_index do |key_val, i|
raise if key_val.last != i
end
end
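A smaller sketch of what "insertion order" means in practice, using a shortened version of the hash from the question (Ruby 1.9 or later assumed): updating a value leaves the key where it is, while deleting and re-adding a key moves it to the end.

```ruby
h = { 1000 => "M", 900 => "CM", 500 => "D" }

# Updating an existing key keeps its position.
h[900] = "cm"
after_update = h.keys   # [1000, 900, 500]

# Deleting a key and adding it again moves it to the end.
h.delete(1000)
h[1000] = "M"
after_reinsert = h.keys # [900, 500, 1000]

first_pair = h.first    # [900, "cm"]
```

So the positions do not change on their own; they only change when you remove and re-insert keys.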
A hash (also called an associative array) is, in general, an unordered data structure.
Since Ruby 1.9, however, Ruby keeps the keys in the order they were inserted.
You can find a whole lot more about this here: Is order of a Ruby hash literal guaranteed?
The Hash documentation also covers it: https://ruby-doc.org/core-2.4.1/Hash.html

An Efficient Numerically Keyed Data Structure in Ruby

One wants to store objects, allowing them to be retrieved via a numerical key. These keys can range from 0 to an arbitrary size (~100K, for instance), but not every natural number in the range has a corresponding object.
One might have the following:
structure[0] => some_obj_a
structure[3] => some_obj_b
structure[7] => some_obj_c
...
structure[100103] => some_obj_z
But all other keys (1, 2, 4, 5, 6, ...) do not have an associated object. The numerical keys are used for retrieval, such that an "ID" is provided to return an object associated to that ID:
ID = get_input_id
my_obj = structure[ID]
What is the most efficient data structure for this scenario in Ruby? And for what reasons? (So far, I can see it being a hash or an array.)
I define "efficient" in terms of:
Least memory used
Fastest lookup times
Fastest entry creation/updates (at arbitrary keys)
An initialization for this structure might be
hsh = Hash.new # or Array.new
hsh[0] = {:id => 0, :var => "a", :count => 45}
hsh[3] = {:id => 3, :var => "k", :count => 32}
hsh[7] = {:id => 7, :var => "e", :count => 2}
You've essentially described a sparse array or a hash.
Hashes are fast and memory efficient: they only use memory for the entries you actually store. There is no "magic" data structure for this that'll be any faster. Use a hash.
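To make the memory point concrete, here is a small sketch that stores the same four entries from the question in both structures. It only compares how many slots each one ends up with; it is not a benchmark.

```ruby
sparse_hash  = {}
sparse_array = []

{ 0 => "a", 3 => "k", 7 => "e", 100_103 => "z" }.each do |id, var|
  sparse_hash[id]  = var
  sparse_array[id] = var # the array is padded with nil up to this index
end

hash_size  = sparse_hash.size   # 4
array_size = sparse_array.size  # 100104

# Both return nil for an ID that has no object.
hash_miss  = sparse_hash[5]
array_miss = sparse_array[5]
```

The array has to hold a slot for every index up to the largest key, while the hash only holds the four entries.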

Efficient way to verify a large hash in Ruby tests

What is an efficient way to test that a hash contains specific keys and values?
By efficient I mean the following items:
easy to read output when failing
easy to read source of test
shortest test to still be functional
Sometimes in Ruby I must create a large hash. I would like to learn an efficient way to test these hashes:
expected_hash = {
:one => 'one',
:two => 'two',
:sub_hash1 => {
:one => 'one',
:two => 'two'
},
:sub_hash2 => {
:one => 'one',
:two => 'two'
}
}
I can test this hash in several ways. The two ways I use the most are testing the whole hash at once, or a single item:
assert_equal expected_hash, my_hash
assert_equal 'one', my_hash[:one]
These work for small hashes like our example hash, but for a very large hash these methods break down. The whole hash test will display too much information on a failure. And the single item test would make my test code too large.
I was thinking an efficient way would be to break up the tests into many smaller tests that validated only part of the hash. Each of these smaller tests could use the whole hash style test. Unfortunately I don't know how to get Ruby to do this for items not in a sub-hash. The sub-hashes can be done like the following:
assert_equal expected_hash[:sub_hash1], my_hash[:sub_hash1]
assert_equal expected_hash[:sub_hash2], my_hash[:sub_hash2]
How do I test the remaining parts of the hash?
When testing, you need to use manageable chunks. As you found, testing against huge hashes makes it difficult to track what is happening.
Consider converting your hashes to arrays using to_a. Then you can easily use the set operators to compare arrays, and find out what is missing/changed.
For instance:
[1,2] - [2,3]
=> [1]
[2,3] - [1,2]
=> [3]
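Applied to hashes, the same idea might look like the sketch below. The `expected` and `actual` hashes are made-up sample data.

```ruby
expected = { one: "one", two: "two", three: "three" }
actual   = { one: "one", two: "TWO", four: "four" }

# Pairs that are in expected but missing or different in actual.
missing_or_changed = expected.to_a - actual.to_a
# => [[:two, "two"], [:three, "three"]]

# Pairs that are in actual but not in expected.
unexpected = actual.to_a - expected.to_a
# => [[:two, "TWO"], [:four, "four"]]
```

Asserting that both differences are empty arrays gives a failure message that lists only the offending pairs instead of the whole hash.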
You can use the hash method on hashes to compare two of them, or parts of them:
h = {:test => 'test'}
=> {:test=>"test"}
h1 = {:test => 'test2'}
=> {:test=>"test2"}
h.hash
=> -1058076452551767024
h1.hash
=> 1300393442551759555
h.hash == h1.hash
=> false
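For the "remaining parts" the question asks about, one possible approach (a sketch, not the only way) is to filter out the sub-hashes and compare what is left as its own small hash. The variable names `expected_flat` and `actual_flat` are made up here, and the hashes are shortened versions of the one in the question.

```ruby
expected_hash = {
  :one => 'one',
  :two => 'two',
  :sub_hash1 => { :one => 'one', :two => 'two' }
}
my_hash = {
  :one => 'one',
  :two => 'changed',
  :sub_hash1 => { :one => 'one', :two => 'two' }
}

# Keep only the entries whose values are not themselves hashes.
expected_flat = expected_hash.reject { |_key, value| value.is_a?(Hash) }
actual_flat   = my_hash.reject { |_key, value| value.is_a?(Hash) }

flat_match = expected_flat == actual_flat                       # false
sub_match  = expected_hash[:sub_hash1] == my_hash[:sub_hash1]   # true
```

In a test, `assert_equal expected_flat, actual_flat` would then report only the top-level entries on failure.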

Convert array-of-hashes to a hash-of-hashes, indexed by an attribute of the hashes

I've got an array of hashes representing objects as a response to an API call. I need to pull data from some of the hashes, and one particular key serves as an id for the hash object. I would like to convert the array into a hash with the keys as the ids, and the values as the original hash with that id.
Here's what I'm talking about:
api_response = [
{ :id => 1, :foo => 'bar' },
{ :id => 2, :foo => 'another bar' },
# ..
]
ideal_response = {
1 => { :id => 1, :foo => 'bar' },
2 => { :id => 2, :foo => 'another bar' },
# ..
}
There are two ways I could think of doing this, plus a possible third:
1. Map the data to the ideal_response (below)
2. Use api_response.find { |x| x[:id] == i } for each record I need to access.
3. A method I'm unaware of, possibly involving a way of using map to build a hash, natively.
My method of mapping:
keys = data.map { |x| x[:id] }
mapped = Hash[*keys.zip(data).flatten]
I can't help but feel like there is a more performant, tidier way of doing this. Option 2 is very performant when only a few records need to be accessed. Mapping excels here, but it starts to break down when there are a lot of records in the response. Thankfully, I don't expect there to be more than 50-100 records, so mapping is sufficient.
Is there a smarter, tidier, or more performant way of doing this in Ruby?
Ruby <= 2.0
> Hash[api_response.map { |r| [r[:id], r] }]
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
However, Hash::[] is pretty ugly and breaks the usual left-to-right OOP flow. That's why Facets proposed Enumerable#mash:
> require 'facets'
> api_response.mash { |r| [r[:id], r] }
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
This basic abstraction (converting enumerables to hashes) was proposed for inclusion in Ruby long ago, alas, without luck.
Note that your use case is covered by Active Support: Enumerable#index_by
Ruby >= 2.1
[UPDATE] Still no love for Enumerable#mash, but now we have Array#to_h. It creates an intermediate array, but it's better than nothing:
> object = api_response.map { |r| [r[:id], r] }.to_h
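On newer Rubies the intermediate array can be avoided, because to_h accepts a block. This sketch assumes Ruby 2.6 or later; the `by_id` name is just for illustration.

```ruby
api_response = [
  { :id => 1, :foo => 'bar' },
  { :id => 2, :foo => 'another bar' }
]

# The block returns a [key, value] pair for each element.
by_id = api_response.to_h { |r| [r[:id], r] }

puts by_id[2][:foo] # prints "another bar"
```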
Something like:
ideal_response = api_response.group_by{|i| i[:id]}
#=> {1=>[{:id=>1, :foo=>"bar"}], 2=>[{:id=>2, :foo=>"another bar"}]}
It uses Enumerable's group_by, which works on collections, returning matches for whatever key value you want. Because it expects to find multiple occurrences of matching key-value hits it appends them to arrays, so you end up with a hash of arrays of hashes. You could peel back the internal arrays if you wanted but could run a risk of overwriting content if two of your hash IDs collided. group_by avoids that with the inner array.
Accessing a particular element is easy:
ideal_response[1][0] #=> {:id=>1, :foo=>"bar"}
ideal_response[1][0][:foo] #=> "bar"
The way you show at the end of the question is another valid way of doing it. Both are reasonably fast and elegant.
For this I'd probably just go:
ideal_response = api_response.each_with_object(Hash.new) { |o, h| h[o[:id]] = o }
Not super pretty with the multiple brackets in the block but it does the trick with just a single iteration of the api_response.
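The difference between the two approaches shows up when ids collide. A small sketch, using made-up `records` with a deliberately duplicated id:

```ruby
records = [
  { :id => 1, :foo => 'bar' },
  { :id => 1, :foo => 'duplicate' },
  { :id => 2, :foo => 'another bar' }
]

# group_by keeps every record, collected in an array per id.
grouped = records.group_by { |r| r[:id] }

# each_with_object with plain assignment keeps only the last record per id.
indexed = records.each_with_object({}) { |r, h| h[r[:id]] = r }

puts grouped[1].size  # prints 2
puts indexed[1][:foo] # prints "duplicate"
```

If the ids are guaranteed unique, the flat hash is more convenient; if not, group_by avoids silently dropping records.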

Is saving a hash in another hash common practice?

I'd like to save some hash objects to a collection (in the Java world think of it as a List). I searched online to see if there is a similar data structure in Ruby and have found none. For the time being I've been trying to save hash a[] into hash b[], but have been having issues trying to get data out of hash b[].
Are there any built-in collection data structures on Ruby? If not, is saving a hash in another hash common practice?
If it's accessing the hash in the hash that is the problem then try:
>> p = {:name => "Jonas", :pos => {:x=>100.23, :y=>40.04}}
=> {:pos=>{:y=>40.04, :x=>100.23}, :name=>"Jonas"}
>> p[:pos][:x]
=> 100.23
There shouldn't be any problem with that.
a = {:color => 'red', :thickness => 'not very'}
b = {:data => a, :reason => 'NA'}
Perhaps you could explain what problems you're encountering.
The question is not completely clear, but I think you want to have a list (array) of hashes, right?
In that case, you can just put them in one array, which is like a list in Java:
a = {:a => 1, :b => 2}
b = {:c => 3, :d => 4}
list = [a, b]
You can retrieve those hashes like list[0] and list[1]
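A slightly longer sketch of working with such a list: appending, searching, and collecting values across the hashes. The extra hash and the variable names `with_c` and `all_keys` are made up for the example.

```ruby
a = { :a => 1, :b => 2 }
b = { :c => 3, :d => 4 }
list = [a, b]

# Append another hash, as you would with List#add in Java.
list << { :e => 5 }

# Find the first hash that has a given key.
with_c = list.find { |h| h.key?(:c) }

# Collect all keys from every hash in the list.
all_keys = list.flat_map(&:keys)

puts list.size        # prints 3
puts with_c[:d]       # prints 4
puts all_keys.inspect # prints [:a, :b, :c, :d, :e]
```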
Lists in Ruby are arrays. You can convert a hash to an array with Hash#to_a.
If you are trying to combine hash a with hash b, you can use Hash#merge.
EDIT: If you are trying to insert hash a into hash b, you can do
b["Hash a"] = a;
All the answers here so far are about Hash in Hash, not Hash plus Hash, so for reasons of completeness, I'll chime in with this:
# Define two independent Hash objects
hash_a = { :a => 'apple', :b => 'bear', :c => 'camel' }
hash_b = { :c => 'car', :d => 'dolphin' }
# Combine two hashes with the Hash#merge method
hash_c = hash_a.merge(hash_b)
# The combined hash has all the keys from both sets
puts hash_c[:a] # => 'apple'
puts hash_c[:c] # => 'car', not 'camel' since B overwrites A
Note that when you merge B into A, any keys that A had that are in B are overwritten.
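If overwriting is not what you want, Hash#merge also accepts a block that is called only for keys present in both hashes and decides which value wins. A short sketch with the same two hashes (the `keep_a` and `both` names are illustrative):

```ruby
hash_a = { :a => 'apple', :b => 'bear', :c => 'camel' }
hash_b = { :c => 'car', :d => 'dolphin' }

# Keep A's value when both hashes have the key.
keep_a = hash_a.merge(hash_b) { |_key, a_value, _b_value| a_value }

# Or keep both values in an array.
both = hash_a.merge(hash_b) { |_key, a_value, b_value| [a_value, b_value] }

puts keep_a[:c]       # prints "camel"
puts both[:c].inspect # prints ["camel", "car"]
puts hash_a[:c]       # prints "camel"; merge returns a new hash and leaves A untouched
```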
