Algorithm to resolve circular dependencies?

I would like to create a module system in which you can load dependencies in a circular fashion. The reason is that circular dependencies arise all over the place, like in Ruby on Rails, in the relationships between model classes:
# app/models/x.rb
class X < ActiveRecord::Base
  has_many :ys
end

# app/models/y.rb
class Y < ActiveRecord::Base
  belongs_to :x
end
The way this generically works in Ruby on Rails models, at least, is that all the models are loaded first, with each dependency recorded as a symbol or "string". Then a centralized manager walks over these objects and converts the strings into the corresponding classes, which is possible now that all the classes are defined. While this runs there is an interim period in which the models processed so far hold real class references while the rest still hold strings; once every string has been resolved to its class, the algorithm is complete.
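To sketch that idea concretely (this is a toy version of my own, not Rails' actual implementation; Model, belongs_to, and resolve_associations! here are stand-ins):

class Model
  def self.associations
    @associations ||= {}
  end

  # Phase 1: record only the name; the target class may not exist yet.
  def self.belongs_to(name)
    associations[name] = name
  end

  # Phase 2: every class is defined by now, so swap each symbol
  # for the class it names.
  def self.resolve_associations!
    associations.transform_values! { |sym| Object.const_get(sym.to_s.capitalize) }
  end
end

class X < Model
  belongs_to :y  # Y does not exist yet; stored as the symbol :y
end

class Y < Model
  belongs_to :x
end

[X, Y].each(&:resolve_associations!)
X.associations  # => { y: Y }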
I can imagine a world where it gets much more complex, with loops that are not direct one-to-one circular dependencies but have many layers in between (a 5-node cycle, say), or even back-and-forth dependencies between modules, such as:
# file/1.rb
class X
  depends_on :A
  depends_on :B
end

class Y
  depends_on :B
end

class Z
  depends_on :Q
end

# file/2.rb
class A
  depends_on :Y
end

class B
  depends_on :Z
end

# file/3.rb
class Q
  depends_on :X
end
To resolve this:
Load all the *.rb files, treating each depends_on argument as a string. Now we have the definition of each class in its corresponding module, but with the depends_on targets as strings instead of the desired classes.
Iterate through the files and resolve the classes:
For file 1.rb, resolve X, and so on:
X depends on A, so associate it with 2/A.
X also depends on B, so associate it with 2/B.
Y depends on B, so associate it with 2/B.
Z depends on Q, so associate it with 3/Q.
A depends on Y, so associate it with 1/Y.
B depends on Z, so associate it with 1/Z.
Q depends on X, so associate it with 1/X.
Done.
So basically there are two "rounds". The first round loads all the files and initializes the classes. The second round associates the class members with the corresponding classes; the order in which you visit them doesn't matter.
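Sketched in code (the Definition struct and the defs registry are my own stand-ins for the result of loading the files):

Definition = Struct.new(:name, :deps)

# Round 1: "load the files"; every dependency is recorded as a plain
# symbol, whether or not its target has been seen yet.
defs = {}
[
  Definition.new(:X, [:A, :B]),
  Definition.new(:Y, [:B]),
  Definition.new(:Z, [:Q]),
  Definition.new(:A, [:Y]),
  Definition.new(:B, [:Z]),
  Definition.new(:Q, [:X]),
].each { |d| defs[d.name] = d }

# Round 2: replace each symbol with the definition object itself.
# Iteration order doesn't matter: round 1 guaranteed that every
# name already has an entry in defs.
defs.each_value { |d| d.deps.map! { |sym| defs.fetch(sym) } }

defs[:X].deps.map(&:name)             # => [:A, :B]
defs[:Q].deps.first.equal?(defs[:X])  # => true: the cycle is wired up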
But can it get any more complex? Requiring more than two rounds to resolve such circular dependencies? I'm not sure. Take for example something like this:
# file/1.rb
class A < A_P
  class AA < A_P::AA_P
    class AAA < A_P::AA_P::AAA_P
    end
  end
end

class B < B_Q
  class BB < B_Q::BB_Q
    class BBB < B_Q::BB_Q::BBB_Q
    end
  end
end

# file/2.rb
class A_P
  class AA_P < B
    class AAA_P < B::BB
    end
  end
end

class B_Q
  class BB_Q < A
    class BBB_Q < A::AA
    end
  end
end
In this contrived case, you have:
A (file/1) depending on A_P (file/2), then:
AA (file/1) depending on A_P and AA_P, then:
AA_P (file/2) depending on B (file/1), then:
B (file/1) depending on B_Q (file/2), etc....
That is, it seems like something weird is going on; I'm not sure, my head starts tying itself in knots.
You can't define what class A extends until A_P is fully resolved. You can't define what class AA is until AA_P is fully resolved, which depends on B being resolved, which depends on B_Q being resolved. And so on.
Is it possible to resolve such circular dependencies? What is the general algorithm for resolving arbitrarily complex circular dependencies, such that in the end every dependency is wired up with the actual value, not a string or other symbol standing in for it? The end result of resolving the circular dependencies should be that every reference refers to the actual object.
Is it always just a simple two-pass algorithm: first load the base objects, then resolve their dependencies by converting the "strings" into the corresponding base objects in the set?
Can you come up with an example where it requires more than a simple two-pass algorithm? And then describe how the algorithm should work in resolving the circular dependencies in that case? Or prove/explain how it is certain that only a simple two-pass algorithm is required?
Another example might be:
// ./p.jslike
import { method_x } from './q'
import { method_y } from './q'

function method_a() {
  method_x()
}

function method_b() {
  console.log('b')
}

function method_c() {
  method_y()
}

function method_d() {
  console.log('d')
}

// ./q.jslike
import { method_b } from './p'
import { method_d } from './p'

function method_x() {
  method_b()
}

function method_y() {
  method_d()
}
I guess that would also be two-pass, since the imported functions are only called at run time, after both modules have loaded.

The answer to your question depends on what you mean by "resolve".
Consider the following situation:
You have three classes, A, B, and C, which depend on each other in a circular way.
After the first pass (loading), you have access to all the information about each class that is written in the source files.
Now, let's say you want to fully "resolve" class A. Pretend you don't care about the other classes for now.
Given just the information from the files, is it possible to perform the desired "resolve" for A alone?
If the answer is no, then the answer to your question is no as well.
If the answer is yes, then two passes are sufficient to fully resolve every class. You can do the second pass either recursively or iteratively, and you'd probably want to cache fully resolved classes to avoid resolving them multiple times, although that isn't necessary for correctness (it's purely a performance optimization).
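For instance, here is a sketch of that second pass done recursively in Ruby (loaded and resolve are illustrative names of mine, not from any library). The trick that makes cycles terminate is publishing the partially built object in the cache before recursing:

# loaded: name => { deps: [...] }, i.e. everything the first pass read
# from the files. The cache doubles as the cycle-breaker.
def resolve(name, loaded, cache = {})
  return cache[name] if cache.key?(name)
  resolved = { name: name, deps: [] }
  cache[name] = resolved  # publish before recursing, so cycles hit the cache
  loaded.fetch(name)[:deps].each do |dep|
    resolved[:deps] << resolve(dep, loaded, cache)
  end
  resolved
end

loaded = { A: { deps: [:B] }, B: { deps: [:C] }, C: { deps: [:A] } }
a = resolve(:A, loaded)
a[:deps].first[:deps].first[:deps].first.equal?(a)  # => true: real objects, no names left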

Related

Why is only contravariance allowed for method input parameters according to the Liskov Substitution Principle?

I was trying to find good examples of why contravariance is the only variance allowed for method input parameters according to the Liskov Substitution Principle, but so far none of the examples has completely resolved my doubts.
I was trying to build a counter-example that could prove the statement above, but I am not sure about it. Suppose we have the following classes:
class Z {...}
class X extends Z {...}
class Y extends X {...}

class A {
  void m(X x);
  ...
}

class B extends A {
  void m(Y y); // overriding the inherited method m covariantly (for contradiction)
  ...
}
Now, suppose I have the following situation:
B b = new B();
A a = b; // Allowed because b is also an A object (B extends A)
Now, since the static type of a is A, theoretically we should be able to pass X objects to the method m on a (not sure what the LSP says about this):
a.m(new X());
At runtime (not sure what the LSP says about runtime versus compile time), on the other hand, this would fail, because a actually points to a B object, and the method m overridden in B only accepts Y objects; since Y is a subtype of X, an arbitrary X is not necessarily a Y.
If we had used contravariance instead, for example by overriding m in B with the parameter typed as Z (or keeping it as X), none of this would happen.
This is for now my (and my friend's) only explanation of why we are only allowed to use contra-variance for method parameters.
Is my explanation correct? Are there other situations that can explain this concept more in detail? Concrete examples are appreciated!

Interface for manipulating element order of a list

I want a list of items (I really want a stack for FILO, but I guess it is irrelevant). Is there some implementation for manipulating the item order? Things like:
Move x to 2 positions ahead
Move y to 3 positions behind
Move z to the top
I want to see whether something similar already exists and what other functionality it offers; I would also like to use it and to know how such an interface works. I am doing this in Ruby, but examples in other languages are fine as well.
As far as I know, Ruby has no built-in method for moving elements inside an array, but you could add one directly to the Array class, like this:
class Array
  # Remove the element at +index+ and re-insert it +distance+ positions
  # later (a negative distance moves it toward the front).
  def move(index, distance)
    insert(index + distance, delete_at(index))
  end
end

a = [1, 2, 3]
a.move(0, 1)  # => [2, 1, 3]
Using delete_at plus insert shifts the elements in between, which is what "move x 2 positions ahead" requires; a plain swap would not.

Extensible Dependency Caching

I'm working on developing a system for computing and caching probability models, and am looking for either software that does this (preferably in R or Ruby) or a design pattern to use as I implement my own.
I have a general pattern: function C depends on the output of function B, which depends on the output of function A. I have three models; call them 1, 2, and 3. Model 1 implements A, B, and C. Model 2 only implements C, and Model 3 implements A and C.
I would like to be able to get the value 'C' from all models with minimal recomputation of the intermediate steps.
To make things less abstract, a simple example:
I have a dependency graph that looks like so:
A1 is Model 1's implementation of A, and A3 is model 3's implementation of A. C depends on B, and B depends on A in all of the models.
The actual functions are as follows (again, this is a toy example, in reality these functions are much more complex, and can take minutes to hours to compute).
The values should be as follows.
Without caching, this is fine in any framework. I can make a class for Model 1, make Model 2 extend that class, and have A, B, and C be methods on that class. Or I can use a dependency injection framework, replacing Model 1's A and C with Model 2's, and similarly for Model 3.
However I get into problems with caching. I want to compute C on all of the models, in order to compare the results.
So I compute C on Model 1 and cache the results: A, B, and C. Then I compute C on Model 2, and it uses the cached version of B from before, since Model 2 inherits it from Model 1.
However when I compute model 3, I need to not use the cached version of B, since even though the function is the same, the function it depends on, A, is different.
Is there a good way to handle this sort of caching with dependency problem?
Anyway, with this, my first pass at it is to make sure that functions A, B, and C are all pure functions, i.e., referentially transparent. That helps, because then you know whether to recompute a cached value based on whether its input has changed.
So, talking it through: when computing C1, nothing has been computed yet, so compute everything.
When computing C2, check whether B1 needs updating by asking B1 if its input has changed. Its input is still A1's output, and because all the functions are referentially transparent, you're guaranteed that if the input hasn't changed, the output is the same. So use the cached version of B1 to compute C2.
When computing C3, again ask B1 whether it needs updating. B1 checks whether its input, now A3's output, has changed since the last time it computed something. It has, so recompute B1, and then subsequently recompute C3.
As for the dependency injection, I currently see no reason to organize it under the classes A, B, and C. I'm guessing you want the strategy pattern, so that you can keep the algorithm the same while varying the implementations.
If your language can pass functions around, I would simply chain functions together with a bit of glue code that checks whether to call the function or use the cached value, and every time you need a different computation, reassemble the implementations of the algorithm that you need.
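A sketch of that glue in Ruby (the cached helper is my own illustration; it only works because each step is a pure function of its input):

# Wrap a pure function so it recomputes only when it sees a new input.
def cached(fn)
  memo = {}
  lambda do |input|
    memo[input] = fn.call(input) unless memo.key?(input)
    memo[input]
  end
end

a1 = cached(->(_) { 1 })       # Model 1's A
a3 = cached(->(_) { 2 })       # Model 3's A
b  = cached(->(x) { x + 3 })   # the shared B, keyed by its input
c1 = cached(->(x) { x ** 2 })  # Model 1's C

c1.call(b.call(a1.call(nil)))  # => 16, everything computed once
c1.call(b.call(a1.call(nil)))  # => 16, all cache hits
c1.call(b.call(a3.call(nil)))  # => 25; A's output changed, so B and C recompute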
The key to caching the method calls is knowing where each method is implemented. You can find that out with UnboundMethod#owner (and you can get an unbound method by calling Module#instance_method with a symbol). Using those leads to something like this:
class Model
  def self.cache(id, input, &block)
    id = get_cache_id(id, input)
    @@cache ||= {}
    if !@@cache.has_key?(id)
      @@cache[id] = block.call(input)
      puts "Cache Miss: #{id}; Storing: #{@@cache[id]}"
    else
      puts "Cache Hit: #{id}; Value: #{@@cache[id]}"
    end
    @@cache[id]
  end

  def self.get_cache_id(sym, input)
    "#{instance_method(sym).owner}##{sym}(#{input})"
  end
end

class Model1 < Model
  def a
    self.class.cache(__method__, nil) { |input|
      1
    }
  end

  def b(_a = :a)
    self.class.cache(__method__, send(_a)) { |input|
      input + 3
    }
  end

  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 2
    }
  end
end

class Model2 < Model1
  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 3
    }
  end
end

class Model3 < Model2
  def a
    self.class.cache(__method__, nil) { |input|
      2
    }
  end

  def c(_b = :b)
    self.class.cache(__method__, send(_b)) { |input|
      input ** 4
    }
  end
end

puts "#{Model1.new.c}"
puts "Cache after model 1: #{Model.send(:class_variable_get, :@@cache).inspect}"
puts "#{Model2.new.c}"
puts "Cache after model 2: #{Model.send(:class_variable_get, :@@cache).inspect}"
puts "#{Model3.new.c}"
puts "Cache after model 3: #{Model.send(:class_variable_get, :@@cache).inspect}"
We ended up writing our own DSL in Ruby to support this problem.

Conventions on creating constants in Python

I am writing an application which needs to find out the schema of a database, across engines. To that end, I am writing a small database adapter using Python. I decided to first write a base class that outlines the functionality I need, and then implement it using classes that inherit from this base. Along the way, I need to implement some constants which need to be accessible across all these classes. Some of these constants need to be combined using C-style bitwise OR.
My question is,
what is the standard way of sharing such constants?
what is the right way to create constants that can be combined? I am referring to MAP_FIXED | MAP_FILE | MAP_SHARED style code that C allows.
For the former, I came across threads where all the constants were put into a module first. For the latter, I briefly thought of using a dict of booleans. Both of these seemed too unwieldy. I imagine this is a fairly common requirement, so some good way must indeed exist!
what is the standard way of sharing such constants?
Throughout the standard library, the most common way is to define constants as module-level variables using UPPER_CASE_WITH_UNDERSCORES names.
what is the right way to create constants that can be combined? I am referring to MAP_FIXED | MAP_FILE | MAP_SHARED style code that C allows.
The same rules as in C apply. You have to make sure that each constant value corresponds to a single, unique bit, i.e. the powers of 2 (1, 2, 4, 8, 16, ...).
Most of the time, people use hex numbers for this:
OPTION_A = 0x01
OPTION_B = 0x02
OPTION_C = 0x04
OPTION_D = 0x08
OPTION_E = 0x10
# ...
Some prefer a more human-readable style, computing the constant values dynamically using shift operators:
OPTION_A = 1 << 0
OPTION_B = 1 << 1
OPTION_C = 1 << 2
# ...
In Python, you could also use binary notation to make this even more obvious:
OPTION_A = 0b00000001
OPTION_B = 0b00000010
OPTION_C = 0b00000100
OPTION_D = 0b00001000
But since binary literals are lengthy and hard to read, the hex or shift notation is probably preferable.
Constants generally go at the module level. From PEP 8:
Constants
Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.
If you want constants at class level, define them as class attributes.
The stdlib is a great source of examples; what you want can be found in the doctest module's source:
OPTIONS = {}

# A function to add (register) an option.
def register_option(name):
    return OPTIONS.setdefault(name, 1 << len(OPTIONS))

# A function to test whether an option is set.
def has_option(options, name):
    return bool(options & name)

# All my options are defined here.
FOO = register_option('FOO')
BAR = register_option('BAR')
FOOBAR = register_option('FOOBAR')

# Test which options are set in ARG.
ARG = FOO | BAR
print(has_option(ARG, FOO))
# True
print(has_option(ARG, BAR))
# True
print(has_option(ARG, FOOBAR))
# False
N.B.: the re module also uses this bit-wise flag style, if you want another example.
You often find constants at the global level, and they are some of the few variables that exist up there. There are also people who build constant namespaces using dicts or objects, like this:
class Const:
    x = 33

Const.x
Some people put them in modules and others attach them as class variables that instances access. Most of the time it's personal taste, but just a few global variables can't really hurt that much.
Naming is usually UPPERCASE_WITH_UNDERSCORE, and they are usually module level but occasionally they live in their own class. One good reason to be in a class is when the values are special -- such as needing to be powers of two:
class PowTwoConstants(object):
    def __init__(self, items):
        self.names = items
        enum = 1
        for name in items:
            setattr(self, name, enum)
            enum <<= 1

constants = PowTwoConstants('ignore_case multiline newline'.split())
print(constants.newline)  # prints 4
If you want to be able to export those constants to module level (or any other namespace) you can add the following to the class:
    def export(self, namespace):
        for name in self.names:
            setattr(namespace, name, getattr(self, name))
and then
import sys
constants.export(sys.modules[__name__])

Why are so few things @specialized in Scala's standard library?

I've searched for uses of @specialized in the source code of the standard library of Scala 2.8.1. It looks like only a handful of traits and classes use this annotation: Function0, Function1, Function2, Tuple1, Tuple2, Product1, Product2, AbstractFunction0, AbstractFunction1, AbstractFunction2.
None of the collection classes are @specialized. Why not? Would this generate too many classes?
This means that using collection classes with primitive types is very inefficient, because there will be a lot of unnecessary boxing and unboxing going on.
What's the most efficient way to have an immutable list or sequence (with IndexedSeq characteristics) of Ints, avoiding boxing and unboxing?
Specialization has a high cost in terms of generated class size, so it must be added with careful consideration. In the particular case of collections, I imagine the impact would be huge.
Still, it is an ongoing effort -- the Scala library has barely started to be specialized.
Specialization can be expensive (exponential) in both the number of generated classes and compile time. It's not just the size, like the accepted answer says.
Open your scala REPL and type this.
import scala.{specialized => sp}
trait S1[@sp A, @sp B, @sp C, @sp D] { def f(p1: A): Unit }
Sorry :-). It's like a compiler bomb.
Now, let's take a simple trait:
trait Foo[A] { }
The above results in two compiled classes: Foo, the pure interface, and Foo$class, the implementation class.
Now,
trait Foo[@specialized A] { }
A @specialized type parameter gets expanded/rewritten for 9 different primitive types (Unit, Boolean, Byte, Char, Int, Long, Short, Double, Float). So, counting the generic version too, you basically end up with 20 classes instead of 2.
Going back to the trait with four specialized type parameters: classes get generated for every combination of possible primitive types, i.e., it's exponential in complexity:
2 * 10 ^ (number of specialized parameters)
For the four-parameter trait above, that's 2 * 10^4 = 20,000 classes.
If you are specializing only for specific primitive types, you should say so explicitly, such as:
trait Foo[@specialized(Int) A, @specialized(Int, Double) B] { }
Understandably, one has to be frugal with @specialized when building general-purpose libraries.
Here is Paul Phillips ranting about it.
Partial answer to my own question: I can wrap an array in an IndexedSeq like this:
import scala.collection.immutable.IndexedSeq

def arrayToIndexedSeq[@specialized(Int) T](array: Array[T]): IndexedSeq[T] = new IndexedSeq[T] {
  def apply(idx: Int): T = array(idx)
  def length: Int = array.length
}
(Of course you could still modify the contents if you have access to the underlying array, but I would make sure that the array isn't passed to other parts of my program.)
