How to compute difference between two activerecord instances? - activerecord

Model structure:
class A < ActiveRecord::Base; end
class B < ActiveRecord::Base; has_many :as, class_name: "A"; end
Now I have two instances of B,
b1 = has 3 instances of A having IDs [1,2,3]
b2 = has 2 instances of A having IDs [1,2]
How do I compute differences between b1 and b2, which gives me differences in associations too?

Use array subtraction on the associated IDs. pluck already returns a plain Array, so no extra conversion of the ActiveRecord relation is needed.
b2_missing = b1.as.pluck(:id) - b2.as.pluck(:id)
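Array subtraction is one-directional, so it can help to compute both sides. A minimal sketch, using plain arrays as hypothetical stand-ins for what b1.as.pluck(:id) and b2.as.pluck(:id) would return (the variable names are assumptions, not from the question):

```ruby
# Hypothetical stand-ins for b1.as.pluck(:id) and b2.as.pluck(:id).
b1_a_ids = [1, 2, 3]
b2_a_ids = [1, 2]

# IDs associated with b1 but not with b2.
missing_from_b2 = b1_a_ids - b2_a_ids

# IDs associated with b2 but not with b1.
missing_from_b1 = b2_a_ids - b1_a_ids

puts "only in b1: #{missing_from_b2.inspect}"
puts "only in b2: #{missing_from_b1.inspect}"
```

This compares association membership only; differences in the attributes of b1 and b2 themselves would need a separate comparison.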


Problem of assigning some values to a set from multiple options algo

I have a problem where I have several sets, and each set has a list of options. A given number of options from that list needs to be assigned to each set.
Some options can be common to multiple sets, but no option can be assigned to more than one set. I need an algorithm to achieve this. A rough example is
Set1 has options [100,101,102,103] - 2 needs to be selected,
Set2 has options [101,102,103,104] - 2 needs to be selected,
Set3 has options [99,100,101] - 2 needs to be selected,
so the possible solution is
Set1 gets 100,102
Set2 gets 103,104
Set3 gets 99,101
Can anyone suggest an approach for getting a generic solution to this problem?
This can be modelled as an instance of the bipartite graph matching problem.
Let A be the numbers which appear in the lists (combined into one set with no duplicates), and let B be the lists themselves, each repeated according to how many elements need to be selected from them. There is an edge from a number a ∈ A to a list b ∈ B whenever the number a is in the list b.
Therefore this problem can be approached using an algorithm which finds a matching that covers every vertex in B (this is a "perfect matching" when A and B have the same size, as in the example). Wikipedia lists several algorithms which can be used to find matchings as large as possible, including the Ford–Fulkerson algorithm and the Hopcroft–Karp algorithm.
Thanks @kaya3, I couldn't remember the exact algorithm, and the reminder that this is a bipartite graph matching problem was really helpful.
At first it wasn't giving me the exact solution when I needed n options for each set, so I followed this approach:
A = [99,100,101,102,103,104]
B = [a1, a2, b1, b2, c1, c2]
# Each set is repeated because it needs 2 options (a = Set1, b = Set2, c = Set3).
# The graph then looks like this:
99 => [c1, c2]
100 => [a1, a2, c1, c2]
101 => [a1, a2, b1, b2, c1, c2]
102 => [a1, a2, b1, b2]
103 => [a1, a2, b1, b2]
104 => [b1, b2]
Now it gives a correct solution every time. I tried it with multiple use cases. Repeating
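The slot-duplication idea above can be sketched with a simple augmenting-path matcher (often called Kuhn's algorithm; it is a simpler relative of Hopcroft–Karp, not the algorithm named in the answer). The input format and the method names assign_options and try_assign are assumptions made for this illustration:

```ruby
# Try to give slot `slot` an option, re-homing other slots along an
# augmenting path when the option is already taken.
def try_assign(slot, slots, owner, seen)
  slots[slot][:options].each do |opt|
    next if seen.include?(opt)
    seen << opt
    if !owner.key?(opt) || try_assign(owner[opt], slots, owner, seen)
      owner[opt] = slot
      return true
    end
  end
  false
end

# sets: { name => { options: [...], pick: n } } (hypothetical format).
# Returns { name => [chosen options] }, or nil if no full assignment exists.
def assign_options(sets)
  # One slot per option a set still needs (a1, a2, b1, b2, ...).
  slots = sets.flat_map do |name, spec|
    Array.new(spec[:pick]) { { set: name, options: spec[:options] } }
  end
  owner = {} # option => index of the slot that currently holds it
  slots.each_index do |i|
    return nil unless try_assign(i, slots, owner, [])
  end
  result = Hash.new { |h, k| h[k] = [] }
  owner.each { |opt, i| result[slots[i][:set]] << opt }
  result
end

sets = {
  "Set1" => { options: [100, 101, 102, 103], pick: 2 },
  "Set2" => { options: [101, 102, 103, 104], pick: 2 },
  "Set3" => { options: [99, 100, 101], pick: 2 }
}
assignment = assign_options(sets)
puts assignment.inspect
```

The particular assignment returned depends on iteration order, so it may differ from the solution listed in the question while still being valid.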

Classification of multidimensional data

I would like to classify some multidimensional data:
The input data is as follows:
Data1: [[a1,b1,f1], [a2,b2,f2], ... [an,bn,fn]] where: fn = F(an,bn) --> ClassA
Data2: [[c1,d1,g1], [c2,d2,g2], ... [cn,dn,gn]] where: gn = G(cn,dn) --> ClassB
...
So, given Datax, as follows, we would like to classify it into one of the finite classes we have:
Datax: [[x1,y1,z1], [x2,y2,z2], ... [xn,yn,zn]] where: zn = Z(xn,yn) --> which class?
I could probably flatten the array for each record and train my classifier:
Data1: [a1,b1,f1,a2,b2,f2,...,an,bn,fn]
But I thought because the third values themselves are a function of the first two values (e.g. fn = F(an,bn)), I should consider that relationship in my training rather than going for a flat array.
Does it make any difference? If so, what is the best approach to solve this problem?
If the third value of each tuple is produced by the same deterministic function of the first two (the function can differ from row to row, but must be the same for every triple within a row),
then you can simply cut off zn because it does not bring any new information.
e.g.: z1 = 3x1 + 2y1 ; z2 = 3x2 + 2y2 ; [...] ; zn = 3xn + 2yn
If that is not the case, then you should keep zn.
That said, I think you can flatten the array, because many models can pick up this kind of dependency on their own.
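As a small illustration of the two flattening options, here is a sketch with made-up numbers in which the third value follows z = 3x + 2y, as in the example above. The variable names are assumptions for this illustration only:

```ruby
# One record of [x, y, z] triples where z = 3x + 2y (made-up values).
record = [[1, 2, 7], [3, 4, 17], [5, 6, 27]]

# Check that z really is determined by x and y for this record.
redundant = record.all? { |x, y, z| z == 3 * x + 2 * y }

# Option 1: flatten everything into one feature vector.
flat_full = record.flatten

# Option 2: drop the redundant third value before flattening.
flat_reduced = record.flat_map { |x, y, _z| [x, y] }

puts "z redundant: #{redundant}"
puts "full:    #{flat_full.inspect}"
puts "reduced: #{flat_reduced.inspect}"
```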

find the intersection of two array structs in Matlab

How can I find the intersection of two struct arrays in Matlab?
For example, I have two struct arrays a and b:
a(1)=struct('x',1,'y',1);
a(2)=struct('x',3,'y',2);
a(3)=struct('x',4,'y',3);
a(4)=struct('x',5,'y',4);
a(5)=struct('x',1,'y',5);
b(1)=struct('x',1,'y',1);
b(2)=struct('x',3,'y',5);
I want to find the intersection of a and b as follows:
c = intersect(a,b)
where c should be
c = struct('x',1,'y',1);
But this fails when I type intersect(a,b), since the elements of a and b are both structures. How can I get around this difficulty? Thanks.
The elegant solution would have been to supply intersect with a comparator (as in, e.g., C++).
Unfortunately, Matlab does not seem to support this kind of functionality/flexibility.
A workaround for your problem would be
% convert structs into matrices
A = [[a(:).x];[a(:).y]]';
B = [[b(:).x];[b(:).y]]';
% intersect the equivalent representation
[C, ia, ib] = intersect( A, B, 'rows' );
% map back to original structs
c = a(ia);
Alternatively, have you considered replacing your structs with class objects derived from the handle class? It might be possible to overload the relational operators of the class, and then it should be possible to sort the class objects directly (I haven't looked closely into this solution; it's just a proposal off the top of my head).
A more general variant of Shai's approach is:
A = cell2mat(permute(struct2cell(a), [3 1 2]));
B = cell2mat(permute(struct2cell(b), [3 1 2]));
[C, ia] = intersect(A, B, 'rows');
c = a(ia);
This way you don't need to explicitly specify all the struct fields. Of course, this won't work if the struct fields contain non-numeric values.
Generalized approach for fields of any type and dimensions
If you're uncertain about the type and size of the data stored in your structs, intersect won't cut it. Instead, you'll have to use isequal with a loop. I'm using arrayfun here for brevity:
[X, Y] = meshgrid(1:numel(a), 1:numel(b));
c = a(any(arrayfun(@(m, n)isequal(a(m), b(n)), X, Y), 1));
A systematic approach would be to produce a hash - and then use intersect:
hash_fun = @(x) sprintf('x:%g;y:%g',x.x,x.y);
ha = arrayfun(hash_fun,a,'UniformOutput',false);
hb = arrayfun(hash_fun,b,'UniformOutput',false);
[hi,ind_a,ind_b]=intersect(ha,hb)
res=a(ind_a) % result of intersection

Ruby - how to find a combination of results from picking 1 value from each array of a list of arrays

Let's say I have the following arrays in Ruby, contained in an array, and I don't know how many arrays there will be or how long they are. An example:
[["cat", "dog"],[1, 3, 5, 7],["morning", "afternoon", "evening"]]
What I want is every combination that results from picking 1 value from each array, returned as an array of these combinations. In this example, there should be 2*4*3, or 24, possible unique results.
The result would look like:
result = [["cat", 1, "morning"], ["cat", 1, "afternoon"], ["dog", 5, "evening"] ...]
How would I go about doing this in Ruby for a list of N arrays? I tried messing around with product, map and inject, but I can't get it working.
EDIT: Since you made it clear that you're dealing with not just three arrays a1, a2 and a3 but an array of arrays, I'm changing my solution to use product.
Like this?
a1.map{|x1| a2.map{|x2| a3.map{|x3| [x1, x2, x3] }}}.flatten(2)
Or with flat_map:
a1.flat_map{|x1| a2.flat_map{|x2| a3.map{|x3| [x1, x2, x3] }}}
Wow, or just:
a1.product(a2,a3)
If you have several arrays (not just the fixed number of 3 in your first example),
then:
input = [["cat", "dog"],[1, 3, 5, 7],["morning", "afternoon", "evening"]]
h,*rest = input
result = h.product(*rest)
Using Array#product:
xs[0].product(*xs.drop(1))
Ideally you would write Array.product(*xs), but Ruby core has no such class method (it is easy to write yourself, though arguably it should exist).
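Putting the product-based answers together, a minimal sketch for a list of N arrays might look like this. The helper name combinations_of is an assumption for this illustration, and returning an empty array for empty input is a design choice, not something the question specifies:

```ruby
# Returns every way of picking one value from each inner array.
def combinations_of(arrays)
  return [] if arrays.empty? # assumption: no arrays means no combinations
  first, *rest = arrays
  first.product(*rest)
end

input = [["cat", "dog"], [1, 3, 5, 7], ["morning", "afternoon", "evening"]]
result = combinations_of(input)

puts result.size
puts result.first.inspect
```

Array#product builds the whole result in memory, so for very large inputs an enumerator-based approach may be preferable.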

algorithm issue - find the least common subset

The a's are objects with multiple "categories", the b's. For instance, a1 has three categories: b1, b2, b3.
The problem is to reduce the number of categories (which can grow rather large) into groups that always occur together. A "largest common subset" thing.
So for instance, given the following data set:
a1{ b1,b2,b3 }
a2{ b2,b3 }
a3{ b1,b4 }
We can find that b2 and b3 always come together..
b23 = {b2,b3}
..and we can reduce the category set to this:
a1{ b1, b23 }
a2{ b23 }
a3{ b1,b4 }
So, my issue is to find some algorithm to solve this problem.
I have started to look at the Longest Common Subsequence problem, and it might be a solution, i.e. something like repeatedly grouping categories like this b' = LCS(set_of_As) until all categories have been traversed. However, this is not complete. I have to limit the input domain in some way to make this possible.
Am I missing something obvious? Any hints of a problem domain you can point me to? Does anyone recognize any other approach to such a problem?
Transform your sets to have sets of b's that include a's:
b1 { a1, a3 }
b2 { a1, a2 }
b3 { a1, a2 }
b4 { a3 }
Make sure the contents of the new b sets are sorted.
Sort your b sets by their contents.
Any two adjacent sets with the same elements are b's that occur in the same a sets.
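The inversion steps above can be sketched as follows. Instead of sorting the b sets and comparing neighbours, this version groups them with a hash keyed by the sorted list of a's, which finds the same groups. The method name group_categories and the input format are assumptions for this illustration:

```ruby
# items: { "a1" => ["b1", "b2", "b3"], ... } (hypothetical format).
# Returns groups of categories that occur in exactly the same items.
def group_categories(items)
  # Invert the mapping: category => list of items that contain it.
  owners = Hash.new { |h, k| h[k] = [] }
  items.each do |item, categories|
    categories.each { |cat| owners[cat] << item }
  end

  # Categories with identical (sorted) owner lists always occur together.
  owners.group_by { |_cat, its| its.sort }
        .values
        .map { |pairs| pairs.map(&:first) }
end

items = {
  "a1" => %w[b1 b2 b3],
  "a2" => %w[b2 b3],
  "a3" => %w[b1 b4]
}
groups = group_categories(items)
puts groups.inspect
```

Any group with more than one member, such as b2 and b3 here, can then be replaced by a single merged category.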
I think you're on the right track with the LCS if you can impose an ordering on the categories (if not, then the LCS algorithm can't recognize {b3, b4} and {b4, b3} as the same group). If you can impose an ordering and sort them, then I think something like this could work:
As = {a1={b1, b2},a2={b3},...}
while ((newgroup = LCS(As)) != empty) {
    for (a in As) {
        replace newgroup in a
    }
}
