Extensible Dependency Caching - ruby

I'm working on developing a system for computing and caching probability models, and am looking for either software that does this (preferably in R or Ruby) or a design pattern to use as I implement my own.
I have a general pattern of the form function C depends on the output of function B, which depends on the output of function A. I have three models, call them 1, 2, and 3. Model 1 implements A, B and C. Model 2 only implements C, and Model 3 implements A and C.
I would like to be able to get the value 'C' from all models with minimal recomputation of the intermediate steps.
To make things less abstract, a simple example:
I have a dependency graph that looks like so:
A1 is Model 1's implementation of A, and A3 is model 3's implementation of A. C depends on B, and B depends on A in all of the models.
The actual functions are as follows (again, this is a toy example, in reality these functions are much more complex, and can take minutes to hours to compute).
The values should be as follows.
Without caching, this is fine in any framework. I can make a class for model 1, and make model 2 extend that class, and have A,B, and C be functions on that class. Or I can use a dependency injection framework, replacing model 1's A and C with model 2's. And similarly for Model 3.
However I get into problems with caching. I want to compute C on all of the models, in order to compare the results.
So I compute C on model 1, and cache the results, A, B and C. Then I compute C on model 2, and it uses the cached version of B from before, since it is extended from model 1.
However when I compute model 3, I need to not use the cached version of B, since even though the function is the same, the function it depends on, A, is different.
Is there a good way to handle this sort of caching with dependency problem?

My first pass at this would be to make sure that functions A, B, and C are all pure functions, i.e. referentially transparent. That helps, because you then know whether to recompute a cached value purely from whether its input has changed.
So, talking it through: when I'm computing C1, nothing has been computed yet, so compute everything.
When computing C2, check if B1 needs updating. So you ask B1 if it needs updating. B1 checks if its input, A2, has changed from A1. It hasn't, and because all the functions are referentially transparent, you're guaranteed that if the input hasn't changed, then the output is the same. So use the cached version of B1 to compute C2.
When computing C3, check if B1 needs updating. So we ask B1 if it needs updating. B1 checks to see if its input, A3 has changed from A2, the last time it computed something. It has, so we recompute B1, and then subsequently recompute C3.
As for the dependency injection, I currently see no reason to organize it under the classes, A, B, and C. I'm guessing you want to use the strategy pattern, so that you can use operation overloading in order to keep the algorithm the same, but vary the implementations.
If you guys are using a language that can pass around functions, I would simply chain functions together with a bit of glue code that checks for whether it should call the function or use the cached value. And every time you need a different computation, reassemble all the implementations of the algorithm that you need.
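The "glue code" idea above can be sketched roughly as follows. This is only an illustration: `StepCache` and the lambda names are hypothetical, and the toy bodies (1, 2, `+ 3`, powers) stand in for the expensive real functions. The cache key is the step name plus its input, so a step is rerun only when something upstream actually produced a different value.

```ruby
# Hypothetical helper: caches results of pure steps, keyed by step name and input.
class StepCache
  attr_reader :misses

  def initialize
    @store = {}
    @misses = 0
  end

  # Runs the block only when this (name, input) pair has not been seen before.
  def fetch(name, input)
    key = [name, input]
    return @store[key] if @store.key?(key)

    @misses += 1
    @store[key] = yield(input)
  end
end

cache = StepCache.new

# Toy implementations standing in for the expensive real functions.
a1 = -> { cache.fetch(:a1, nil) { 1 } }
a3 = -> { cache.fetch(:a3, nil) { 2 } }
b  = ->(a) { cache.fetch(:b, a) { |x| x + 3 } }
c1 = ->(bv) { cache.fetch(:c1, bv) { |x| x**2 } }
c2 = ->(bv) { cache.fetch(:c2, bv) { |x| x**3 } }
c3 = ->(bv) { cache.fetch(:c3, bv) { |x| x**4 } }

model1 = c1.(b.(a1.()))  # computes a1, b and c1
model2 = c2.(b.(a1.()))  # reuses a1 and b, computes only c2
model3 = c3.(b.(a3.()))  # a3 yields a new value, so b is recomputed
```

Each model is just a different composition of the same building blocks, and `cache.misses` shows how many steps were really executed.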

The key to caching the method calls is to know where the method is implemented. You can do this by using UnboundMethod#owner (and you can get an unbound method by using Module#instance_method and passing in a symbol). Using those would lead to something like this:
class Model
  def self.cache(id, input, &block)
    id = get_cache_id(id, input)
    @@cache ||= {}
    if !@@cache.has_key?(id)
      @@cache[id] = block.call(input)
      puts "Cache Miss: #{id}; Storing: #{@@cache[id]}"
    else
      puts "Cache Hit: #{id}; Value: #{@@cache[id]}"
    end
    @@cache[id]
  end
  def self.get_cache_id(sym, input)
    "#{instance_method(sym).owner}##{sym}(#{input})"
  end
end
class Model1 < Model
  def a
    self.class.cache(__method__, nil) { |input|
      1
    }
  end
  def b(_a = :a)
    self.class.cache(__method__, send(_a)) { |input|
      input + 3
    }
  end
  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 2
    }
  end
end
class Model2 < Model1
  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 3
    }
  end
end
class Model3 < Model2
  def a
    self.class.cache(__method__, nil) { |input|
      2
    }
  end
  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 4
    }
  end
end
puts "#{Model1.new.c}"
puts "Cache after model 1: #{Model.send(:class_variable_get, :@@cache).inspect}"
puts "#{Model2.new.c}"
puts "Cache after model 2: #{Model.send(:class_variable_get, :@@cache).inspect}"
puts "#{Model3.new.c}"
puts "Cache after model 3: #{Model.send(:class_variable_get, :@@cache).inspect}"
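To see in isolation why `owner` makes a usable cache key, here is a minimal sketch with two hypothetical classes, `Base` and `Derived`. An inherited method reports the class that defined it, while an overridden method reports the overriding class.

```ruby
class Base
  def a; 1; end
  def b; a + 3; end
end

class Derived < Base
  def a; 2; end   # overrides a, inherits b
end

b_owner_base    = Base.instance_method(:b).owner     # => Base
b_owner_derived = Derived.instance_method(:b).owner  # => Base (inherited)
a_owner_derived = Derived.instance_method(:a).owner  # => Derived (overridden)
```

Because `b` resolves to `Base` in both classes, the two would share a cache entry whenever the input is also the same; the input part of the key is what separates them when `a` differs.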

We ended up writing our own DSL in Ruby to support this problem.

Related

Algorithm to resolve circular dependencies?

I would like to create a module system whereby you can load dependencies in a circular fashion. The reason is, circular dependencies arise all over the place, like in Ruby on Rails, in the relationship between model classes:
# app/models/x.rb
class X < ActiveRecord::Base
  has_many :y
end
# app/models/y.rb
class Y < ActiveRecord::Base
  belongs_to :x
end
The way this generically works in Ruby on Rails models, at least, is it first loads all models, with the dependency being a symbol or "string" at first. Then a centralized manager takes all these objects and converts the strings into the corresponding classes, now that the classes are all defined. When you set the relationships of the first class to their corresponding classes, the rest of the classes are still relating to strings, while this first class relates to classes, so there is an interim period where some of the models are fully resolved to classes, while others are still as strings, as the algorithm is running and has not reached completion. Once all the strings are resolved to their corresponding classes, the algorithm is complete.
I can imagine a world where it gets much more complex, creating loops that are not one-to-one circular dependencies, but have many layers in between, creating a 5-node loop sort of thing. Or even one where there are back-and-forth dependencies between modules, such as:
# file/1.rb
class X
  depends_on A
  depends_on B
end
class Y
  depends_on B
end
class Z
  depends_on Q
end
# file/2.rb
class A
  depends_on Y
end
class B
  depends_on Z
end
# file/3.rb
class Q
  depends_on X
end
To resolve this:
Load all files *.rb, treat depends_on as strings. Now we have the definitions of each class in their corresponding modules, but with the depends_on as strings instead of the desired classes.
Iterate through files and resolve classes.
For file 1.rb, resolve X... etc.
X depends on A, so associate with 2/A.
X also depends on B, so associate with 2/B.
Y depends on B, so associate with 2/B.
Z depends on Q, so associate with 3/Q.
A depends on Y, so associate with 1/Y.
B depends on Z, so associate with 1/Z.
Q depends on X, so associate with 1/X.
Done.
So basically, there are two "rounds". The first round loads all the files and initializes the classes. The second round associates the class members with the corresponding other classes. The order doesn't matter.
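The two rounds described above could be sketched like this. It is a minimal illustration, not how Rails does it: `Node`, `declarations` and `registry` are hypothetical names, and the dependency table simply restates the X/Y/Z/A/B/Q example.

```ruby
# Hypothetical node type: `deps` starts out as names and ends up as objects.
Node = Struct.new(:name, :deps)

# Declarations as they might be read from the files: name => dependency names.
declarations = {
  'X' => %w[A B], 'Y' => %w[B], 'Z' => %w[Q],
  'A' => %w[Y],   'B' => %w[Z], 'Q' => %w[X]
}

# Round 1: create every node, keeping dependencies as plain strings.
registry = declarations.to_h { |name, deps| [name, Node.new(name, deps)] }

# Round 2: swap each string for the node it names. The order does not
# matter, because every node already exists after round 1.
registry.each_value do |node|
  node.deps = node.deps.map { |dep_name| registry.fetch(dep_name) }
end

x = registry['X']
back_to_x = x.deps[0].deps[0].deps[0].deps[0].deps[0].deps[0]  # X -> A -> Y -> B -> Z -> Q -> X
```

After round 2 every reference is a real object, and following the chain from X leads back to the very same X object.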
But can it get any more complex? Requiring more than two rounds to resolve such circular dependencies? I'm not sure. Take for example something like this:
# file/1.rb
class A < A_P
  class AA < A_P::AA_P
    class AAA < A_P::AA_P::AAA_P
    end
  end
end
class B < B_Q
  class BB < B_Q::BB_Q
    class BBB < B_Q::BB_Q::BBB_Q
    end
  end
end
# file/2.rb
class A_P
  class AA_P < B
    class AAA_P < B::BB
    end
  end
end
class B_Q
  class BB_Q < A
    class BBB_Q < A::AA
    end
  end
end
In this contrived case, you have:
A (file/1) depending on A_P (file/2), then:
AA (file/1) depending on A_P and AA_P, then:
AA_P (file/2) depending on B (file/1), then:
B (file/1) depending on B_Q (file/2), etc....
That is, it seems like something weird is going on. I'm not sure, my head starts going in knots.
You can't define what class A is extending until class A_P is fully resolved. You can't define what class AA is until class AA_P is fully resolved, which depends on B being resolved, which depends on B_Q being resolved. Etc..
Is it possible to resolve such circular dependencies? What is the general algorithm for resolving arbitrarily complex circular dependencies, such that in the end all the circular dependencies are wired up with the actual value, not a string or other such symbol representing the actual value? The end result of resolving circular dependencies is that every reference should refer to the actual object.
Is it always just a simple two-pass algorithm, first loading the base objects, then resolving their dependencies converting the "strings" of the dependencies to the base objects in the set?
Can you come up with an example where it requires more than a simple two-pass algorithm? And then describe how the algorithm should work in resolving the circular dependencies in that case? Or prove/explain how it is certain that only a simple two-pass algorithm is required?
Another example might be:
// ./p.jslike
import { method_x } from './q'
import { method_y } from './q'
function method_a() {
method_x()
}
function method_b() {
console.log('b')
}
function method_c() {
method_y()
}
function method_d() {
console.log('d')
}
// ./q.jslike
import { method_b } from './p'
import { method_d } from './p'
function method_x() {
method_b()
}
function method_y() {
method_b()
}
I guess that would also be two-pass.
The answer to your question depends on what you mean by "resolve".
Consider the following situation:
You have three classes, A, B and C, which depend on each other in the circular way.
After the first pass (loading), you have access to all the information about each class, that is written in the source files.
Now, let's say you want to fully "resolve" class A. Pretend you don't care about other classes for now.
Given just the information from files, is it possible to perform the desired "resolve" just for the A?
If the answer is no, then the answer to your question is no as well.
If the answer is yes, then it means that two passes are sufficient to fully resolve every class. You can do this either recursively or iteratively, and you'd probably want to cache fully resolved classes to avoid resolving them multiple times, although, it's not necessary for correctness (it's purely a performance optimization).
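A recursive, on-demand variant with a cache might look like the sketch below. `Resolver` and `Resolved` are hypothetical names. Note one assumption specific to this recursive form: the object is put into the cache before its dependencies are followed, so here the cache is also what stops a cycle from recursing forever.

```ruby
# Hypothetical resolver: resolves one name on demand and memoizes the result.
class Resolver
  Resolved = Struct.new(:name, :deps)

  def initialize(declarations)
    @declarations = declarations  # name => array of dependency names
    @cache = {}
  end

  def resolve(name)
    return @cache[name] if @cache.key?(name)

    # Register the object before following its dependencies, so a cycle
    # finds the partially built object instead of recursing forever.
    node = @cache[name] = Resolved.new(name, [])
    @declarations.fetch(name).each { |dep| node.deps << resolve(dep) }
    node
  end
end

resolver = Resolver.new({ 'A' => %w[B], 'B' => %w[C], 'C' => %w[A] })
a = resolver.resolve('A')
b_name = a.deps[0].name                 # => "B"
back_to_a = a.deps[0].deps[0].deps[0]   # A -> B -> C -> A
```

Resolving just `A` pulls in `B` and `C` as a side effect, and a second call to `resolve('A')` returns the cached object.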

Multiplying string by integer vs integer by string in ruby

I was playing around in irb, and noticed one cannot do
5 * "Hello".
Error
String can't be coerced into Fixnum
However "Hello"*5 provided "HelloHelloHelloHelloHello" as expected.
What is the exact reason for this? I've been looking around in the docs and could not find the exact reason for this behavior. Is this something the designers of Ruby decided?
Basically, you are asking "why is multiplication not commutative"? There are two possible answers for this. Or rather one answer with two layers.
The basic principle of OO is that everything happens as the result of one object sending a message to another object and that object responding to that message. This "messaging" metaphor is very important, because it explains a lot of things in OO. For example, if you send someone a message, all you can observe is what their response is. You don't know, and have no way of finding out, what they did to come up with that response. They could have just handed out a pre-recorded response (reference an instance variable). They could have worked hard to construct a response (execute a method). They could have handed the message off to someone else (delegation). Or, they just don't understand the message you are sending them (NoMethodError).
Note that this means that the receiver of the message is in total control. The receiver can respond in any way it wishes. This makes message sending inherently non-commutative. Sending message foo to a passing b as an argument is fundamentally different from sending message foo to b passing a as an argument. In one case, it is a and only a that decides how to respond to the message, in the other case it is b and only b.
Making this commutative requires explicit cooperation between a and b. They must agree on a common protocol and adhere to that protocol.
In Ruby, binary operators are simply message sends to the left operand. So, it is solely the left operand that decides what to do.
So, in
'Hello' * 5
the message * is sent to the receiver 'Hello' with the argument 5. In fact, you can alternately write it like this if you want, which makes this fact more obvious:
'Hello'.*(5)
'Hello' gets to decide how it responds to that message.
Whereas in
5 * 'Hello'
it is 5 which gets to decide.
So, the first layer of the answer is: Message sending in OO is inherently non-commutative, there is no expectation of commutativity anyway.
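A small sketch of this first layer, using only core Ruby: the operator form, the explicit method-call form and `public_send` are three spellings of the same message send to the left operand, and the Integer receiver rejects a String argument with a TypeError.

```ruby
s1 = 'Hello' * 2                  # operator syntax
s2 = 'Hello'.*(2)                 # the same call, written as a method call
s3 = 'Hello'.public_send(:*, 2)   # the same call, written as a message send

# With the operands swapped, the Integer is the receiver and refuses.
error = begin
  5 * 'Hello'
rescue TypeError => e
  e
end
```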
But, now the question becomes, why don't we design in some commutativity? For example, one possible way would be to interpret binary operators not as message sends to one of the operands but instead message sends to some third object. E.g., we could interpret
5 * 'Hello'
as
*(5, 'Hello')
and
'Hello' * 5
as
*('Hello', 5)
i.e. as message sends to self. Now, the receiver is the same in both cases and the receiver can arrange for itself to treat the two cases identically and thus make * commutative.
Another, similar possibility would be to use some sort of shared context object, e.g. make
5 * 'Hello'
equivalent to
Operators.*(5, 'Hello')
In fact, in mathematics, the meaning of a symbol is often dependent on context, e.g. in ℤ, 2 / 3 is undefined, in ℚ, it is 2/3, and in IEEE754, it is something close to, but not exactly identical to 0.666…. Or, in ℤ, 2 * 3 is 6, but in ℤ|5, 2 * 3 is 1.
So, it would certainly make sense to do this. Alas, it isn't done.
Another possibility would be to have the two operands cooperate using a standard protocol. In fact, for arithmetic operations on Numerics, there actually is such a protocol! If a receiver doesn't know what to do with an operand, it can ask that operand to coerce itself, the receiver, or both to something the receiver does know how to handle.
Basically, the protocol goes like this:
you call 5 * 'Hello'
5 doesn't know how to handle 'Hello', so it asks 'Hello' for a coercion. …
… 5 calls 'Hello'.coerce(5)
'Hello' responds with a pair of objects [a, b] (as an Array) such that a * b has the desired result
5 calls a * b
One common trick is to simply implement coerce to flip the operands, so that when 5 retries the operation, 'Hello' will be the receiver:
class String
  def coerce(other)
    [self, other]
  end
end
5 * 'Hello'
#=> 'HelloHelloHelloHelloHello'
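The same flip-the-operands trick also works without reopening String. The sketch below uses a hypothetical wrapper class, `Repeatable`, that opts into the protocol by defining `coerce` itself.

```ruby
# Hypothetical wrapper that takes part in the coerce protocol.
class Repeatable
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Called when the wrapper is the left operand.
  def *(count)
    text * count
  end

  # Called by Integer#* when it meets an operand it cannot handle.
  # Returning [self, number] makes the retry run as self * number.
  def coerce(number)
    [self, number]
  end
end

left  = Repeatable.new('ab') * 3   # => "ababab"
right = 3 * Repeatable.new('ab')   # => "ababab", via coerce
```

Only objects that choose to implement `coerce` are affected, so plain strings keep their usual behavior.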
Okay, OO is inherently non-commutative, but we can make it commutative using cooperation, so why isn't it done? I must admit, I don't have a clear-cut answer to this question, but I can offer two educated guesses:
coerce is specifically intended for numeric coercion in arithmetic operations. (Note the protocol is defined in Numeric.) A string is not a number, nor is string concatenation an arithmetic operation.
We just don't expect * to be commutative with wildly different types such as Integer and String.
Of course, just for fun, we can actually observe that there is a certain symmetry between Integers and Strings. In fact, you can implement a common version of Integer#* for both String and Integer arguments, and you will see that the only difference is in what we choose as the "zero" element:
class Integer
  def *(other)
    zero = case other
           when Integer then 0
           when String then ''
           when Array then []
           end
    times.inject(zero) {|acc, _| acc + other }
  end
end
5 * 6
#=> 30
5 * 'six'
#=> 'sixsixsixsixsix'
5 * [:six]
#=> [:six, :six, :six, :six, :six]
The reason for this is, of course, that the set of strings with the concatenation operation and the empty string as the identity element form a monoid, just like arrays with concatenation and the empty array and just like integers with addition and zero. Since all three are monoids, and our "multiplication as repeated addition" only requires monoid operations and laws, it will work for all monoids.
Note: Python has an interesting twist on this double-dispatch idea. Just like in Ruby, if you write
a * b
Python will re-write that into a message send:
a.__mul__(b)
However, if a can't handle the operation, instead of cooperating with b, it cooperates with Python by returning NotImplemented. Now, Python will try with b, but with a slight twist: it will call
b.__rmul__(a)
This allows b to know that it was on the right side of the operator. It doesn't matter much for multiplication (because multiplication is (usually but not always, see e.g. matrix multiplication) commutative), but remember that operator symbols are distinct from their operations. So, the same operator symbol can be used for operations that are commutative and ones that are non-commutative. Example: + is used in Ruby for addition (2 + 3 == 3 + 2) and also for concatenation ('Hello' + 'World' != 'World' + 'Hello'). So, it is actually advantageous for an object to know whether it was the right or left operand.
This is because operators are also methods (well, there are exceptions, as Cary has listed in the comments, which I wasn't aware of).
For example
array << 4 == array.<<(4)
array[2] == array.[](2)
array[2] = 'x' == array.[]=(2, 'x')
In your example:
5 * "Hello" => 5.*("Hello")
Meanwhile
"hello" * 5 => "hello".*(5)
An integer cannot take that method with a string param
If you ever dabble in Python, try 5 * "hello" and "hello" * 5; both work. Pretty interesting that Ruby behaves this way, to be honest.
As Muntasir Alam has already said, Fixnum does not have a method named * that takes a string as an argument, so 5*"Hello" produces that error. But, just for fun, we can make 5*"Hello" work by adding that missing behavior to the Fixnum class.
class Fixnum                 # open the class
  def * str                  # Override the *() method
    if str.is_a? String      # If argument is String
      temp = ""
      self.times do
        temp << str
      end
      temp
    else                     # If the argument is not String
      mul = 0
      self.times do
        mul += str
      end
      mul
    end
  end
end
now
puts 5*"Hello" #=> HelloHelloHelloHelloHello
puts 4*5 #=> 20
puts 5*10.4 #=> 52.0
Well, that was just to show that the opposite is also possible. But that will bring a lot of overhead. I think we should avoid that at all cost.

How does "Assignment Branch Condition size for index is too high" work?

RuboCop always reports this error:
app/controllers/account_controller.rb:5:3: C: Assignment Branch Condition size for index is too high. [30.95/24]
if params[:role]
  @users = @search.result.where(:role => params[:role])
elsif params[:q] && params[:q][:s].include?('count')
  @users = @search.result.order(params[:q][:s])
else
  @users = @search.result
end
How can I fix it? Does anyone have a good idea?
The ABC size [1][2] is
computed by counting the number of assignments, branches and conditions for a section of code. The counting rules in the original C++ Report article were specifically for the C, C++ and Java languages.
The links above detail what counts for A, B, and C. ABC size is a scalar magnitude, reminiscent of a triangulated relationship:
|ABC| = sqrt((A*A)+(B*B)+(C*C))
Actually, a quick google on the error shows that the first indexed page is the Rubocop docs for the method that renders that message.
Your repo or analysis tool will define a threshold amount when the warning is triggered.
To calculate it by hand, if you enjoy self-inflicted pain:
Your code calcs as
(1+1+1)^2 +
(1+1+1+1+1+1+1+1+1+1+1+1+1)^2 +
(1+1+1+1)^2
=> 194, and sqrt(194) is about 13.93
That's a 'blind' calculation with values I've made up (1s). However, you can see that the error states numbers that probably now make sense as your ABC and the threshold:
[30.95/24]
So the cop threshold is 24 and your ABC size is 30.95. This tells us that the RuboCop engine assigns different numbers for A, B, and C. Also, different kinds of assignments (or branches or conditions) could have different values, too. E.g. a 'normal' assignment x = y is perhaps scored lower than a chained assignment x = y = z = r.
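As a worked version of the formula, here is a tiny sketch. `abc_size` is a hypothetical helper, not part of RuboCop, and the counts 3, 13 and 4 are the made-up values from the hand calculation, not what RuboCop actually counted for this method.

```ruby
# Magnitude of the (assignments, branches, conditions) vector.
def abc_size(assignments, branches, conditions)
  Math.sqrt(assignments**2 + branches**2 + conditions**2)
end

size = abc_size(3, 13, 4)   # the made-up counts from the hand calculation
puts size.round(2)          # prints 13.93
```

The reported 30.95 is higher than this, which is consistent with RuboCop counting more, or weighting differently, than the blind estimate.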
tl;dr answer
At this point, you probably have a fairly clear idea of how to reduce your ABC size. If not:
a simple way is to take the conditional used for your elsif and place it in a helper method.
since you are assigning an @ variable (an instance variable), and largely calling from one as well, your code uses no encapsulation of memory. Thus, you can move the if and elsif block actions into their own load_search_users_by_role and load_search_users_by_order methods.
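A rough sketch of that refactoring, written as a plain Ruby class so it runs without Rails. Everything here is an assumption for illustration: `AccountIndex` and `sort_by_count?` are hypothetical names, `params` is a plain Hash, and `search` is any object whose `result` responds to `where` and `order`. Unlike the original condition, `sort_by_count?` also tolerates a missing `:s` key. Whether this brings the score under the threshold depends on how RuboCop counts the remaining code.

```ruby
# Plain-Ruby stand-in for the controller action (hypothetical class).
class AccountIndex
  def initialize(params, search)
    @params = params
    @search = search
  end

  # The former if/elsif/else body, now only choosing which loader to call.
  def users
    if @params[:role]
      load_search_users_by_role
    elsif sort_by_count?
      load_search_users_by_order
    else
      @search.result
    end
  end

  private

  # The elsif condition, extracted into a helper method.
  def sort_by_count?
    sort = @params.dig(:q, :s)
    !sort.nil? && sort.include?('count')
  end

  def load_search_users_by_role
    @search.result.where(role: @params[:role])
  end

  def load_search_users_by_order
    @search.result.order(@params.dig(:q, :s))
  end
end
```

In a real controller the same private helpers would sit below `index`, with `@users` assigned once from the result.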

Sketchup Ruby ComponentDefinition.count_instances only in selection

Given a Sketchup::ComponentDefinition object c_def, if I use c_def.count_instances or c_def.instances.length I get the total number of instances of my component in the whole model, just like the documentation says it should.
ComponentDefinition::count_instances
ComponentDefinition::instances
Unfortunately I need to count instances separating by groups or sub-components.
E.g. suppose I have two different components in a model that use the same basic component.
The first one has 3 basic component instances and the second one has 5.
c_def.count_instances will always return 8, as it is the total number of instances, but I need to be able to tell that the first component has only 3 and the second one only 5.
How to do that?
Thanks!
You would then need to recursively traverse the entities of the instance you're interested in. I'm afraid there is no API method for doing this.
module Example
  def self.count_definition_in_entities(entities, find_definition, count = 0)
    entities.each { |entity|
      definition = self.get_definition(entity)
      next if definition.nil?
      count += 1 if find_definition == definition
      count = self.count_definition_in_entities(definition.entities, find_definition, count)
    }
    count
  end
  def self.get_definition(entity)
    if entity.is_a?(Sketchup::ComponentInstance)
      entity.definition
    elsif entity.is_a?(Sketchup::Group)
      entity.entities.parent
    else
      nil
    end
  end
end # module
d = Sketchup.active_model.definitions["Sophie"]
Example.count_definition_in_entities(Sketchup.active_model.entities, d)
Also, beware that count_instances doesn't do a complete full-model count. If you have a component C1 placed two times in another component C2, then C1.count_instances returns 2. If you add another copy of C2, you might expect C1.count_instances to yield 4, but it doesn't; it still yields 2. The method only counts how many times the instance is placed in any Entities collection, but doesn't take the whole model tree into account.

Ruby: Resumable functions with arguments

I want a function that keeps local state in Ruby. Each time I call the function I want to return a result that depends both on a calling argument and on the function's stored state. Here's a simple example:
def inc_mult(factor)
  @state ||= 0 # initialize the state the first time.
  @state += 1 # adjust the internal state.
  factor * @state
end
Note that the state is initialized the first time, but subsequent calls access stored state. This is good, except that @state leaks into the surrounding context, which I don't want.
What is the most elegant way of rewriting this so that @state doesn't leak?
(Note: My actual example is much more
complicated, and initializing the
state is expensive.)
You probably want to encapsulate inc_mult into its own class, since you want to encapsulate its state separately from its containing object. This is how generators (the yield statement) work in Python and C#.
Something as simple as this would do it:
class Foo
  state = 0
  define_method(:[]) do |factor|
    state += 1
    factor * state
  end
end
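In the `Foo` version the local `state` is captured once in the class body, so every `Foo` instance shares it. If each object should carry its own state, a variant along these lines may fit better; `IncMult` is a hypothetical name, and the `0` stands in for the expensive initialization mentioned in the question.

```ruby
# Hypothetical class: one counter per object instead of one per class.
class IncMult
  def initialize
    @state = 0   # stands in for the expensive initialization
  end

  def call(factor)
    @state += 1
    factor * @state
  end
end

inc_mult = IncMult.new
first  = inc_mult.call(2)     # => 2
second = inc_mult.call(2)     # => 4
fresh  = IncMult.new.call(2)  # => 2, a new object starts over
```

Here `@state` lives inside the `IncMult` object, so nothing leaks into the caller's context.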
Philosophically, I think what you’re aiming for is incompatible with Ruby’s view of methods as messages, rather than as functions that can somewhat stand alone.
Functions are stateless. They are procedural code. Classes contain state as well as procedural code. The most elegant way to do this would be to follow the proper programming paradigm:
Class to maintain state
Function to manipulate state
Since you're using Ruby, it may seem a bit more elegant to you to put these things in a module that can be included. The module can handle maintaining state, and the method could simply be called via:
require 'incmodule'
IncModule::inc_mult(10)
Or something similar
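One way such a module might be written is sketched below. The module body is an assumption, since the answer only shows the call site; the state is kept in an instance variable of the module object itself, so it stays out of the caller's context.

```ruby
# Hypothetical contents of the module behind IncModule.inc_mult.
module IncModule
  @state = 0   # instance variable of the module object itself

  def self.inc_mult(factor)
    @state += 1
    factor * @state
  end
end

first  = IncModule.inc_mult(10)   # => 10
second = IncModule.inc_mult(10)   # => 20
```

The trade-off is that there is exactly one shared counter for the whole program.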
I want a function that keeps local state in Ruby.
That word "function" should immediately raise a big fat red flashing warning sign that you are using the wrong programming language. If you want functions, you should use a functional programming language, not an object-oriented one. In a functional programming language, functions usually close over their lexical environment, which makes what you are trying to do absolutely trivial:
var state;
function incMult(factor) {
if (state === undefined) {
state = 0;
}
state += 1;
return factor * state;
}
print(incMult(2)); // => 2
print(incMult(2)); // => 4
print(incMult(2)); // => 6
This particular example is in ECMAScript, but it looks more or less the same in any functional programming language.
[Note: I'm aware that it's not a very good example, because ECMAScript is actually also an object-oriented language and because it has broken scope semantics that actually mean that state leaks in this case, too. In a language with proper scope semantics (and in a couple of years, ECMAScript will be one of them), this'll work as intended. I used ECMAScript mainly for its familiar syntax, not as an example of a good functional language.]
This is the way that state is encapsulated in functional languages since, well, since there are functional languages, all the way back to lambda calculus.
However, in the 1960s some clever people noticed that this was a very common pattern, and they decided that this pattern was so common that it deserved its own language feature. And thus, the object was born.
So, in an object-oriented language, instead of using functional closures to encapsulate state, you would use objects. As you may have noticed, methods in Ruby don't close over their lexical environment, unlike functions in functional programming languages. And this is precisely the reason: because encapsulation of state is achieved via other means.
So, in Ruby you would use an object like this:
inc_mult = Object.new
def inc_mult.call(factor)
  @state ||= 0
  @state += 1
  factor * @state
end
p inc_mult.(2) # => 2
p inc_mult.(2) # => 4
p inc_mult.(2) # => 6
[Sidenote: This 1:1 correspondence is what functional programmers are talking about when they say "objects are just a poor man's closures". Of course, object-oriented programmers usually counter with "closures are just a poor man's objects". And the funny thing is, both of them are right and neither of them realize it.]
Now, for completeness' sake I want to point out that while methods don't close over their lexical environment, there is one construct in Ruby, which does: blocks. (Interestingly enough, blocks aren't objects.) And, since you can define methods using blocks, you can also define methods which are closures:
foo = Object.new
state = nil
foo.define_singleton_method :inc_mult do |factor|
  state ||= 0
  state += 1
  factor * state
end
p foo.inc_mult(2) # => 2
p foo.inc_mult(2) # => 4
p foo.inc_mult(2) # => 6
It seems like you could just use a global or a class variable in some other class, which would at least allow you to skip over the immediately surrounding context.
Well, you could play around a bit... What about a function that rewrites itself?
def imult(factor)
  state = 1;
  rewrite_with_state(state+1)
  factor*state
end
def rewrite_with_state(state)
  eval "def imult(factor); state = #{state}; rewrite_with_state(#{state+1}); factor*state; end;"
end
Warning: This is extremely ugly and should not be used in production code!
you can use lambda.
eg,
$ cat test.rb
def mk_lambda( init = 0 )
  state = init
  ->(factor=1, incr=nil){
    state += incr || 1;
    puts "state now is: #{state}"
    factor * state
  }
end
f = mk_lambda
p f[]
p f[1]
p f[2]
p f[100]
p f[100,50]
p f[100]
$ ruby test.rb
state now is: 1
1
state now is: 2
2
state now is: 3
6
state now is: 4
400
state now is: 54
5400
state now is: 55
5500
kind regards -botp
