selecting hash values by value property in ruby - ruby

In Ruby, I have a hash of objects. Each object has a type and a value. I am trying to design an efficient function that can get the average of the values of all of objects of a certain type within the hash.
Here is an example of how this is currently implemented:
#the hash is composed of a number of objects of class Robot (example name)
class Robot
attr_accessor :type, :value
def initialize(type, value)
#type = type
#value = value
end
end
#this is the hash that inclues the Robot objects
hsh = { 56 => Robot.new(:x, 5), 21 => Robot.new(:x, 25), 45 => Robot.new(:x, 35), 31 => Robot.new(:y, 15), 0 => Robot.new(:y, 5) }
#this is the part where I find the average
total = 0
count = 0
hsh.each_value { |r|
if r.type == :x #is there a better way to get only objects of type :x ?
total += r.value
count += 1
end
}
average = total / count
So my question is:
is there a better way to do this that does not involve looping through the entire hash?
Note that I can't use the key values because there will be multiple objects with the same type in the same hash (and the key values are being used to signify something else already).
If there is an easy way to do this with arrays, that would also work (since I can convert the hash to an array easily).
Thanks!
EDIT: fixed error in my code.

hsh.values.select {|v| v.type == :x}.map(&:value).reduce(:+) / hsh.size
I am trying to design an efficient function that can get the average of the values of all of objects of a certain type within the hash
Unless I misunderstood what you are trying to say, this is not what the code you posted does. The average value of the :x robots is 21 (there's 3 :x robots, with values 5, 25 and 35; 5 + 25 + 35 == 65 and 65 divided by 3 robots is 21), but your code (and mine, since I modeled mine after yours) prints 13.
is there a better way to get only objects of type :x ?
Yes. To select elements, use the select method.
is there a better way to do this that does not involve looping through the entire hash?
No. If you want to find all objects with a given property, you have to look at all objects to see whether or not they have that property.
Do you have actual hard statistical evidence that this method is causing your performance bottlenecks?

You can also directly map the keys like this
hsh.values.map {|k| k[:x]}.reduce(:+) / hsh.size

Related

Reference to array cell in Ruby?

Can I have a reference to an array cell in Ruby? In C++, I can do something like:
int& ref = arr[x][y];
and later work with the variable ref without the need of typing the whole arr[x][y].
I want to do this as I need to access one and the same cell multiple times throughout a function (I'm doing memoization) and typing unnecessary indexes may only lead to errors.
All values in ruby are references, so this is certainly possible, but with some important limitations. One caveat is that ruby doesn't DIRECTLY support multidimensional arrays, but you can implement one as an array of arrays or as a hash keyed by tuples.
You can achieve this in cases where the value at (x, y) has already been set by assigning to the value at the given coordinates. If no value currently exists at that location, then you must initialize that value before you can have a reference to it:
# if x and y are indices and a is your "multidimensional array"
a[x][y] = 'First Value' # Initial value at (x, y)
ref = a[x][y] # take a reference to the value referenced by a[x][y]
ref.gsub! 'First', 'Second'
a[x][y] # => 'Second Value'
Keep in mind that the assignment operator in ruby generally means "make the reference on the left side refer to the value on the right". This means that if you use the assignment operator on your reference, then you're actually making it refer to a new value:
a[x][y] = 1 # Initialize value with 1
ref = a[x][y] # Take the reference
ref += 1 # Assignment
ref # => 2
a[x][y] # => 1
You might have better success by using a Hash and keying the hash with tuples of your coordinates, and then using these tuples to get references to specific locations:
a = {}
loc = [x, y]
a[loc] = 'First Value' # Initial value
a[[x,y]] # => 'First Value'
a[loc] = 'Second Value' # Assignment
a[[x,y]] # => 'Second Value'
a[loc] = 1 # Assignment
a[loc] += 1 # Assignment
a[[x,y]] # => '2'
Ruby is considered pass by value so to answer your question (not pass by reference like C++), it's not directly possible to do what you're asking.
There's a really good post in this answer by Abe that you should read through:
Is Ruby pass by reference or by value?
For ref to continue to point to the actual data of arr[x][y] at any given time, one possibiliy is to write it as a method :
def ref
ar[1][1]
end
In a high level language like ruby, all variables are references and there is no "pointers" or levels of indirections like C or C++, you should create objects to hold this references to get similar behavior
This is what I would do on ruby
Suppose you need to save a "pointer" to a ruby array, then you create a Class to access the array in a given index (there is no such thing like getting a "pointer" to a value in ruby)
class ArrayPointer
def initialize(array, index)
#array = array
#index = index
end
def read
#array[index]
end
def write(value)
#array[index] = value
end
end
Then, you use the clase this way
array = [1, 2, 3]
pointer = ArrayPointer.new(array, 1)
pointer.write(20)
puts array # [1, 20, 3]
You also can get "pointers" to local variables, but is too weird and uncommon in ruby world and it almost doesn't make sense
Note this kind of code is weird and not common in ruby, but it is interesting from the didactic point of view to compare two great languages like Ruby and C
In the Object Oriented nature of ruby, is preferable to design good abstractions (e.g. instead of using an array to represent your data, if preferable to define a class with methods like the ruby way) before only using elemental structures such as Array or Hash to represent the data used by your program (the last approach common in C, is not the ruby way)

Iterating through a Hash while Updating a Counter

I have a hash:
students = {
class1: 11,
class2: 24,
class3: 38,
class4: 62
}
I want there to be four lines of output:
1) 11
2) 35 #11 + 24
3) 73 #35 + 38
4) 135 #73 + 62
It goes through each element, and adds a value to a counter, printing each iteration as it goes. I need something like:
students.each do |key, value|
value + counter = total
puts total
end
but I have no idea how to do it. Please advise.
There are many ways to do this, but I will suggest a way that will teach you a few different things about Ruby. This is also a very Ruby-like way to address this problem.
Code
students = {
class1: 11,
class2: 24,
class3: 38,
class4: 62 }
students.reduce(0) do |tot, (k,v)|
tot += v
puts k[/\d+/] + ") #{tot}"
tot
end
1) 11
2) 35
3) 73
4) 135
Explanation
I've used Enumerable#reduce (a.k.a. inject) because that method is convenient for totaling a collection of numbers, using a variable (here tot) that maintains the running total within the block. That's just what you need.
Aside: you will learn a lot be reading the documentation for Ruby methods. Methods are referenced like this: SomeClass#method or SomeModule#method. Here, reduce is an instance method of the module Enumerable. students is an instance of the class Hash, but that class "mixes-in" (includes) the instance methods of the module Enumerable.
The object tot is a Fixnum that is initialized to reduce's argument, which here is zero. (If no initial value were given, the initial value from student--11--would be assigned to tot). Each time the code in the block is executed the value at the end of the block is returned to the enumerator (which is why tot is there). After all the elements of the receiver students have been enumerated, the value of tot is returned by reduce (though you will not be making use of that).
The first time the block is called the block variables are as follows:
tot => 0
k => :class1
v => 11
To print
1) 11
I presume you want the label 1) to be the right end of :class1. To exact 1 from the symbol k => :class1, you can use the method Symbol#[] with the regex /\d+/, which extracts a string of one or more digits 0-9 (as many as there are).
In reading the documentation for the method Symbol#[], you will see that it converts the symbol :class1 to the string "class1" and then invokes the method String[] on that string.
Since Ruby 1.9+, many prefer to use Enumerable#each_with_object rather than reduce. That method would be used like this:
students.each_with_object(0) do |(k,v),tot|
tot += v
puts k[/\d+/] + ") #{tot}"
end
Notice that with this method it is not necessary to return the value of the object (tot) to the enumerator, and that tot is at the end of the list of block variables, whereas it is at the beginning for reduce.
You were actually fairly close. You need to define the total variable outside of your block, just assign 0 to it using
total=0
this will make it stay through all the iterations. Then you just need a small change in
students.each do |key, value|
total=total+value
puts total
end
and it will do what you want.
Switching the order of value + counter = total is important, as assignments (through =) always assign to the variable on the left.
Using your code, made some small change:-- Try this:--
total = 0
students.each do |key, value|
total += value
puts total
end

How to use less memory generating Array permutation?

So I need to get all possible permutations of a string.
What I have now is this:
def uniq_permutations string
string.split(//).permutation.map(&:join).uniq
end
Ok, now what is my problem: This method works fine for small strings but I want to be able to use it with strings with something like size of 15 or maybe even 20. And with this method it uses a lot of memory (>1gb) and my question is what could I change not to use that much memory?
Is there a better way to generate permutation? Should I persist them at the filesystem and retrieve when I need them (I hope not because this might make my method slow)?
What can I do?
Update:
I actually don't need to save the result anywhere I just need to lookup for each in a table to see if it exists.
Just to reiterate what Sawa said. You do understand the scope? The number of permutations for any n elements is n!. It's about the most aggressive mathematical progression operation you can get. The results for n between 1-20 are:
[1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600,
6227020800, 87178291200, 1307674368000, 20922789888000, 355687428096000,
6402373705728000, 121645100408832000, 2432902008176640000]
Where the last number is approximately 2 quintillion, which is 2 billion billion.
That is 2265820000 gigabytes.
You can save the results to disk all day long - unless you own all the Google datacenters in the world you're going to be pretty much out of luck here :)
Your call to map(&:join) is what is creating the array in memory, as map in effect turns an Enumerator into an array. Depending on what you want to do, you could avoid creating the array with something like this:
def each_permutation(string)
string.split(//).permutation do |permutaion|
yield permutation.join
end
end
Then use this method like this:
each_permutation(my_string) do |s|
lookup_string(s) #or whatever you need to do for each string here
end
This doesn’t check for duplicates (no call to uniq), but avoids creating the array. This will still likely take quite a long time for large strings.
However I suspect in your case there is a better way of solving your problem.
I actually don't need to save the result anywhere I just need to lookup for each in a table to see if it exists.
It looks like you’re looking for possible anagrams of a string in an existing word list. If you take any two anagrams and sort the characters in them, the resulting two strings will be the same. Could you perhaps change your data structures so that you have a hash, with keys being the sorted string and the values being a list of words that are anagrams of that string. Then instead of checking all permutations of a new string against a list, you just need to sort the characters in the string, and use that as the key to look up the list of all strings that are permutations of that string.
Perhaps you don't need to generate all elements of the set, but rather only a random or constrained subset. I have written an algorithm to generate the m-th permutation in O(n) time.
First convert the key to a list representation of itself in the factorial number system. Then iteratively pull out the item at each index specified by the new list and of the old.
module Factorial
def factorial num; (2..num).inject(:*) || 1; end
def factorial_floor num
tmp_1 = 0
1.upto(1.0/0.0) do |counter|
break [tmp_1, counter - 1] if (tmp_2 = factorial counter) > num
tmp_1 = tmp_2 #####
end # #
end # #
end # returns [factorial, integer that generates it]
# for the factorial closest to without going over num
class Array; include Factorial
def generate_swap_list key
swap_list = []
key -= (swap_list << (factorial_floor key)).last[0] while key > 0
swap_list
end
def reduce_swap_list swap_list
swap_list = swap_list.map { |x| x[1] }
((length - 1).downto 0).map { |element| swap_list.count element }
end
def keyed_permute key
apply_swaps reduce_swap_list generate_swap_list key
end
def apply_swaps swap_list
swap_list.map { |index| delete_at index }
end
end
Now, if you want to randomly sample some permutations, ruby comes with Array.shuffle!, but this will let you copy and save permutations or to iterate through the permutohedral space. Or maybe there's a way to constrain the permutation space for your purposes.
constrained_generator_thing do |val|
Array.new(sample_size) {array_to_permute.keyed_permute val}
end
Perhaps I am missing the obvious, but why not do
['a','a','b'].permutation.to_a.uniq!

Subtraction of two arrays with incremental indexes of the other array to a maximum limit

I have lots of math to do on lots of data but it's all based on a few base templates. So instead of say, when doing math between 2 arrays I do this:
results = [a[0]-b[1],a[1]-b[2],a[2]-b[3]]
I want to instead just put the base template: a[0]-b[1] and make it automatically fill say 50 places in the results array. So I don't always have to manually type it.
What would be the ways to do that? And would a good way be to create 1 method that does this automatically. And I just tell it the math and it fills out an array?
I have no clue, I'm really new to programming.
a = [2,3,4]
b = [1,2,3,4]
results = a.zip(b.drop(1)).take(50).map { |v,w| v - w }
Custom
a = [2,3,4..............,1000]
b = [1,2,3,4,.............900]
class Array
def self.calculate_difference(arr1,arr2,limit)
begin
result ||= Array.new
limit.send(:times) {|index| result << arr1[index]-arr2[index+=1]}
result
rescue
raise "Index/Limit Error"
end
end
end
Call by:
Array.calculate_difference(a,b,50)

Some simple Ruby questions - iterators, blocks, and symbols

My background is in PHP and C#, but I'd really like to learn RoR. To that end, I've started reading the official documentation. I have some questions about some code examples.
The first is with iterators:
class Array
def inject(n)
each { |value| n = yield(n, value) }
n
end
def sum
inject(0) { |n, value| n + value }
end
def product
inject(1) { |n, value| n * value }
end
end
I understand that yield means "execute the associated block here." What's throwing me is the |value| n = part of the each. The other blocks make more sense to me as they seem to mimic C# style lambdas:
public int sum(int n, int value)
{
return Inject((n, value) => n + value);
}
But the first example is confusing to me.
The other is with symbols. When would I want to use them? And why can't I do something like:
class Example
attr_reader #member
# more code
end
In the inject or reduce method, n represents an accumulated value; this means the result of every iteration is accumulated in the n variable. This could be, as is in your example, the sum or product of the elements in the array.
yield returns the result of the block, which is stored in n and used in the next iterations. This is what makes the result "cumulative."
a = [ 1, 2, 3 ]
a.sum # inject(0) { |n, v| n + v }
# n == 0; n = 0 + 1
# n == 1; n = 1 + 2
# n == 3; n = 3 + 3
=> 6
Also, to compute the sum you could also have written a.reduce :+. This works for any binary operation. If your method is named symbol, writing a.reduce :symbol is the same as writing a.reduce { |n, v| n.symbol v }.
attr and company are actually methods. Under the hood, they dynamically define the methods for you. It uses the symbol you passed to work out the names of the instance variable and the methods. :member results in the #member instance variable and the member and member = methods.
The reason you can't write attr_reader #member is because #member isn't an object in itself, nor can it be converted to a symbol; it actually tells ruby to fetch the value of the instance variable #member of the self object, which, at class scope, is the class itself.
To illustrate:
class Example
#member = :member
attr_accessor #member
end
e = Example.new
e.member = :value
e.member
=> :value
Remember that accessing unset instance variables yields nil, and since the attr method family accepts only symbols, you get: TypeError: nil is not a symbol.
Regarding Symbol usage, you can sort of use them like strings. They make excellent hash keys because equal symbols always refer to the same object, unlike strings.
:a.object_id == :a.object_id
=> true
'a'.object_id == 'a'.object_id
=> false
They're also commonly used to refer to method names, and can actually be converted to Procs, which can be passed to methods. This is what allows us to write things like array.map &:to_s.
Check out this article for more interpretations of the symbol.
For the definition of inject, you're basically setting up chained blocks. Specifically, the variable n in {|value| n = yield(n, value)} is essentially an accumulator for the block passed to inject. So, for example, for the definition of product, inject(1) {|value| n * value}, let's assume you have an array my_array = [1, 2, 3, 4]. When you call my_array.product, you start by calling inject with n = 1. each yields to the block defined in inject, which in turns yields to the block passed to inject itself with n (1) and the first value in the array (1 as well, in this case). This block, {|n, value| n * value} returns 1 == 1 * 1, which is set it inject's n variable. Next, 2 is yielded from each, and the block defined in inject block yields as yield(1, 2), which returns 2 and assigns it to n. Next 3 is yielded from each, the block yields the values (2, 3) and returns 6, which is stored in n for the next value, and so forth. Essentially, tracking the overall value agnostic of the calculation being performed in the specialised routines (sum and product) allows for generalization. Without that, you'd have to declare e.g.
def sum
n = 0
each {|val| n += val}
end
def product
n = 1
each {|val| n *= val}
end
which is annoyingly repetitive.
For your second question, attr_reader and its family are themselves methods that are defining the appropriate accessor routines using define_method internally, in a process called metaprogramming; they are not language statements, but just plain old methods. These functions expect to passed a symbol (or, perhaps, a string) that gives the name of the accessors you're creating. You could, in theory, use instance variables such as #member here, though it would be the value to which #member points that would be passed in and used in define_method. For an example of how these are implemented, this page shows some examples of attr_* methods.
def inject(accumulator)
each { |value| accumulator = yield(accumulator, value) }
accumulator
end
This is just yielding the current value of accumulator and the array item to inject's block and then storing the result back into accumulator again.
class Example
attr_reader #member
end
attr_reader is just a method whose argument is the name of the accessor you want to setup. So, in a contrived way you could do
class Example
#ivar_name = 'foo'
attr_reader #ivar_name
end
to create an getter method called foo
Your confusion with the first example may be due to your reading |value| n as a single expression, but it isn't.
This reformatted version might be clearer to you:
def inject(n)
each do |value|
n = yield(n, value)
end
return n
end
value is an element in the array, and it is yielded with n to whatever block is passed to inject, the result of which is set to n. If that's not clear, read up on the each method, which takes a block and yields each item in the array to it. Then it should be clearer how the accumulation works.
attr_reader is less weird when you consider that it is a method for generating accessor methods. It's not an accessor in itself. It doesn't need to deal with the #member variable's value, just its name. :member is just the interned version of the string 'member', which is the name of the variable.
You can think of symbols as lighter weight strings, with the additional bonus that every equal label is the same object - :foo.object_id == :foo.object_id, whereas 'foo'.object_id != 'foo'.object_id, because each 'foo' is a new object. You can try that for yourself in irb. Think of them as labels, or primitive strings. They're surprisingly useful and come up a lot, e.g. for metaprogramming or as keys in hashes. As pointed out elsewhere, calling object.send :foo is the same as calling object.foo
It's probably worth reading some early chapters from the 'pickaxe' book to learn some more ruby, it will help you understand and appreciate the extra stuff rails adds.
First you need to understand where to use symbols and where its not..
Symbol is especially used to represent something. Ex: :name, :age like that. Here we are not going to perform any operations using this.
String are used only for data processing. Ex: 'a = name'. Here I gonna use this variable 'a' further for other string operations in ruby.
Moreover, symbol is more memory efficient than strings and it is immutable. That's why ruby developer's prefers symbols than string.
You can even use inject method to calculate sum as (1..5).to_a.inject(:+)

Resources