Algorithms to create a tabular representation of a DAG? - algorithm

Given a DAG, in which each node belongs to a category, how can this graph be transformed into a table with a column for each category? The transformation doesn't have to be reversible, but should preserve useful information about the structure of the graph; and should be a 'natural' transformation, in the sense that a person looking at the graph and the table should not be surprised by any of the rows. It should also be compact, i.e. have few rows.
For example given a graph of nodes a1,b1,b2,c1 with edges a1->b1, a1->b2, b1->c1, b2->c1 (i.e. a diamond-shaped graph) I would expect to see the following table:
a b c
--------
a1 b1 c1
a1 b2 c1
I've thought about this problem quite a bit, but I'm having trouble coming up with an algorithm that gives intuitive results on certain graphs. Consider the graph a1,b1,c1 with edges a1->c1, b1->c1. I'd like the algorithm to produce this table:
a b c
--------
a1 b1 c1
But maybe it should produce this instead:
a b c
--------
a1 c1
a1 b1
I'm looking for creative ideas and insights into the problem. Feel free to vary, simplify, or constrain the problem if you think it will help.
Brainstorm away!
Edit:
The transformation should always produce the same set of rows, although the order of rows does not matter.
The table should behave nicely when sorting and filtering using, e.g., Excel. This means that multiple nodes cannot be packed into a single cell of the table: only one node per cell.

What you need is a variation of topological sorting. This is an algorithm that "sorts" graph vertices as if an edge a---->b meant a > b. Since the graph is a DAG, there are no cycles in it and this > relation is transitive, so at least one sorting order exists.
For your diamond-shaped graph two topological orders exist:
a1 b1 b2 c1
a1 b2 b1 c1
b1 and b2 items are not connected, even indirectly, therefore, they may be placed in any order.
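As a rough illustration of the sorting step, here is a minimal sketch of Kahn's algorithm in Python. The function name `topological_order` and the node/edge representation are assumptions made for this example, not part of the original answer. It returns one of the valid orders; which one depends on the order in which edges are listed.

```python
from collections import deque

def topological_order(nodes, edges):
    """Return one topological order of a DAG (Kahn's algorithm)."""
    successors = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1

    # Start with every vertex that has no incoming edge.
    ready = deque(n for n in nodes if in_degree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:  # all predecessors already emitted
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order

# The diamond-shaped graph from the question.
diamond_nodes = ["a1", "b1", "b2", "c1"]
diamond_edges = [("a1", "b1"), ("a1", "b2"), ("b1", "c1"), ("b2", "c1")]
order = topological_order(diamond_nodes, diamond_edges)
```

With the edges listed in this order the sketch yields a1, b1, b2, c1; listing a1->b2 first would yield the other valid order.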
Once the graph is sorted, you know an approximate order. My proposal is to fill the table in a straightforward way (one vertex per row) and then "compact" the table. Perform the sort and take the resulting sequence as input. Fill the table from top to bottom, assigning each vertex to its relevant column:
a  b  c
--------
a1
   b2
   b1
      c1
Now compact the table by walking from top to bottom (and then make a similar pass from bottom to top). On each iteration, look at the "current" row (marked with =>) and at the "next" row.
If the nodes in a column differ between the current row and the next row, do nothing for this column:
from ----> to
X b c X b c
-------- --------
=> X1 . . X1 . .
X2 . . => X2 . .
If in column X the next row has no vertex (the table cell is empty) and the current row has vertex X1, then you should sometimes fill this empty cell with the vertex from the current row. But not always: you want your table to be logical, don't you? So copy the vertex if and only if there is no edge b--->X1, c--->X1, etc., for all vertices in the current row.
from            --->    to
   X  b  c                 X  b  c
   --------                --------
=> X1 b  c                 X1 b  c
      b1 c1             => X1 b1 c1
(Edit:) After first (forward) and second (backward) passes, you'll have such tables:
first           second
a  b  c         a  b  c
--------        --------
a1              a1 b2 c1
a1 b2           a1 b2 c1
a1 b1           a1 b1 c1
a1 b1 c1        a1 b1 c1
Then, just remove equal rows and you're done:
a b c
--------
a1 b2 c1
a1 b1 c1
And you should get a nice table. O(n^2).

How about compacting all reachable nodes from one node together in one cell ? For example, your first DAG should look like:
a    b        c
---------------
a1   [b1,b2]
     b1       c1
     b2       c1

It sounds like a train system map with stations within zones (a,b,c).
You could be generating a table of all possible routes in one direction. In which case "a1, b1, c1" would seem to imply a1->b1 so don't format it like that if you have only a1->c1, b1->c1
You could decide to produce a table by listing the longest routes starting in zone a, using each edge only once, ending with the short leftover routes. Or allow edges to be reused only if they connect unused edges or extend a route.
In other words, do a depth first search, trying not to reuse edges (reject any path that doesn't include unused edges, and optionally trim used edges at the endpoints).

Here's what I ended up doing:
Find all paths emanating from a node without in-edges. (Could be expensive for some graphs, but works for mine)
Traverse each path to collect a row of values
Compact the rows
Compacting the rows is done as follows.
For each pair of columns x,y
Construct a map from every value of x to its possible values of y
Create another map for entries that have only one distinct value of y, mapping the value of x to its single value of y.
Fill in the blanks using these maps. When filling in a value, check for related blanks that can be filled.
This gives a very compact output and seems to meet all my requirements.
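The first two steps above (enumerate paths, then turn each path into a row) can be sketched as follows. This is only an illustration under assumptions that the answer does not state: paths are followed until a node without out-edges, each path holds at most one node per category, and the helper names `path_rows` and `category_of` are invented for the example. The compaction step is left out.

```python
def path_rows(nodes, edges, columns, category_of):
    """Build one table row per maximal path that starts at a source node."""
    successors = {n: [] for n in nodes}
    has_in_edge = set()
    for src, dst in edges:
        successors[src].append(dst)
        has_in_edge.add(dst)

    rows = set()

    def walk(node, path):
        path = path + [node]
        if not successors[node]:
            # Reached a sink: place each node of the path in its category column.
            cells = {category_of(n): n for n in path}
            rows.add(tuple(cells.get(c) for c in columns))
            return
        for nxt in successors[node]:
            walk(nxt, path)

    for node in nodes:
        if node not in has_in_edge:  # node without in-edges
            walk(node, [])
    return rows

# Assumption for the example: the category is the first letter of the node name.
first_letter = lambda name: name[0]

diamond = path_rows(
    ["a1", "b1", "b2", "c1"],
    [("a1", "b1"), ("a1", "b2"), ("b1", "c1"), ("b2", "c1")],
    ["a", "b", "c"],
    first_letter,
)
merge = path_rows(
    ["a1", "b1", "c1"],
    [("a1", "c1"), ("b1", "c1")],
    ["a", "b", "c"],
    first_letter,
)
```

For the diamond graph this gives the two expected rows. For the second graph it gives two rows with a blank (`None`) cell each, which is where the compaction step would come in.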

Related

How to use edge as constraint when drawing graph but not when computing rank

I have a dot graph, and I am using the constraint=false because for some of the edges, I don't want the edge to affect the rank of the nodes. Unfortunately, it appears that this means that the dot engine doesn't use that edge for layout of nodes within a rank, and it also seems to do a worse job at routing the edge itself. For example, see this graph:
digraph G {
subgraph G1 {
a1 -> b1
d1 -> b1
b1 -> c1
a1 -> c1 [constraint=false]
}
subgraph G2 {
a2 -> b2
d2 -> b2
b2 -> c2
a2 -> c2
}
}
The a1 -> c1 edge could be routed left of the b1 node, but isn't. I don't want a1 -> c1 to be used for computing rank.
Is there a way to get the "best of both worlds" here? I.e. a way to get the better layout (like the a2 -> c2 edge) but not use the edge for computing rank?
I don't know of a way to exclude an edge from ranking while still using it for the layout (without affecting the ranking). Edges with constraint=false seem to be laid out after the placement of the nodes is determined.
What's left are some "hacks", not sure whether they are applicable in a generic manner for all of your use cases.
For example, if you make sure that the nodes linked with such an edge are mentioned before the others, the resulting layout is - at least in this example - improved:
subgraph G1 {
d1, a1;
a1 -> b1
d1 -> b1
b1 -> c1
a1 -> c1 [constraint=false]
}

Show that cross product of a x b is perpendicular to b [closed]

How do I know that the cross product A x B is perpendicular to B?
I'm a little confused because there are 3 vectors instead of 2.
A = (0, -2, 5)
B = (2, 2, -5)
C= ( 7, -4, -5)
On the R2 plane, (a x b) * b = 0 proves that a x b is perpendicular to b, but how do I show that in R3?
So, after some research, I finally figured out how to prove that the vectors are perpendicular to each other in R3.
A= (a1, a2, a3)
B= (b1, b2, b3)
C= (c1, c2, c3)
(AB x AC )* AB = 0
(AB x AC )* AC = 0
I don't think you understand what the cross product does. It gives a vector orthogonal to the two vectors.
The cross product a × b is defined as a vector c that is perpendicular
(orthogonal) to both a and b, with a direction given by the right-hand
rule and a magnitude equal to the area of the parallelogram that the
vectors span.
You can show this simply by using the definition of orthogonality: two vectors are orthogonal when their dot product is zero.
Questions like this come down to precisely what you take to be your definitions.
For instance, one way to define the cross-product A x B is this:
1. By R^3 we mean three-dimensional real space with a fixed orientation.
2. Observe that two linearly independent vectors A and B in R^3 span a plane, so every vector perpendicular to them lies on the (unique) line through the origin perpendicular to this plane.
3. Observe that for any positive magnitude, there are precisely two vectors along this line with that magnitude.
4. Observe that if we consider the ordered basis {A, B, C} of R^3, where C is one of the two vectors from the previous step, then one choice matches the orientation of R^3 and the other does not.
5. Define A x B as the vector C from the previous step for which {A, B, C} matches the orientation of R^3.
For instance, this is how the cross product is defined in the Wikipedia article:
"The cross product a × b is defined as a vector c that is perpendicular (orthogonal) to both a and b, with a direction given by the right-hand rule and a magnitude equal to the area of the parallelogram that the vectors span."
If this is your definition, then there is literally nothing to prove, because the definition already has the word "perpendicular" in it.
Another definition might go like this:
1. By R^3 we mean three-dimensional real space with a fixed orientation.
2. For an ordered basis { e1, e2, e3 } of R^3 with the same orientation as R^3, we can write any two vectors A and B as A = a1 e1 + a2 e2 + a3 e3 and B = b1 e1 + b2 e2 + b3 e3.
3. Observe that, regardless of the choice of { e1, e2, e3 } we make in step 2, the vector C := (a2 b3 - b2 a3) e1 - (a1 b3 - b1 a3) e2 + (a1 b2 - b1 a2) e3 is always the same.
4. Take the vector C from the previous step as the definition of A x B.
This isn't a great definition, because step 3 is both a lot of work and complete black magic, but it's one you'll commonly see. If this is your definition, the best way to prove that A x B is perpendicular to A and B would be to show that the other definition gives you the same vector as this one, and then the perpendicularity comes for free.
A more direct way would be to show that vectors with a dot product of zero are perpendicular, and then to calculate the dot product by doing a bunch of algebra. This is, again, a fairly popular way to do it, but it's essentially worthless because it doesn't offer any insight into what's going on.
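For the concrete vectors in the question, the dot-product check can be done numerically. The short Python sketch below uses the standard component formula for the cross product; the helper names `cross` and `dot` are just for this example.

```python
def cross(a, b):
    """Cross product of two 3-component vectors."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def dot(u, v):
    """Dot product; zero means the vectors are perpendicular."""
    return sum(x * y for x, y in zip(u, v))

A = (0, -2, 5)
B = (2, 2, -5)
n = cross(A, B)          # (0, 10, 4)

print(n, dot(n, A), dot(n, B))
```

Both dot products come out as 0, so A x B is perpendicular to A and to B. The same cancellation happens symbolically for any A and B, which is the "bunch of algebra" mentioned above.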

How to understand `u=r÷s`, the division operator, in relational algebra?

Consider a database with the following relation schemes: R(A,B,D) and S(A,B), where attributes with the same name have the same domain, and with instances r and s respectively.
An instance of r
An instance of s
What is the scheme and what are the tuples of u=r÷s? How to define them in English with r and s?
My attempt
I know that
u=r÷s=
This leads me to think that the result would be a table with a single column, but I'm not sure what the tuples in that table would be.
Can you help me understand u=r÷s?
An intuitive property of the division operator of the relational algebra is simply that it is the inverse of the cartesian product. For example, if you have two relations R and S, then, if U is a relation defined as the cartesian product of them:
U = R x S
the division is the operator such that:
U ÷ R = S
and:
U ÷ S = R
So, you can think of the result of U ÷ R as: “the projection of U that, multiplied by R, produces U”, and of the operation ÷, as the operation that finds all the “parts” of U that are combined with all the tuples of R.
However, in order to be useful, we want this operation to be applicable to any pair of relations, that is, we also want to divide a relation which is not the result of a cartesian product. For this, the formal definition is more complex.
So, supposing that we have two relations R and S with attribute sets A and B respectively, their division can be defined as:
R ÷ S = π_{A-B}(R) - π_{A-B}((π_{A-B}(R) x S) - R)
that can be read in this way:
1. π_{A-B}(R) x S: project R onto the attributes of R which are not in S, and multiply (cartesian product) this relation with S. This produces a relation with the attributes A of R whose rows are all the possible combinations of rows of S with rows of the projection of R.
2. From the previous result subtract all the tuples originally in R, that is, perform (π_{A-B}(R) x S) - R. In this way we obtain the “extra” tuples, that is, the tuples in the cartesian product that were not present in the original relation.
3. Finally, subtract those extra tuples from the original relation (but, again, perform this operation only on the attributes of R which are not present in S). So, the final operation is: π_{A-B}(R) - π_{A-B}(the result of step 2).
So, coming to your example, the projection of r on D is equal to:
(D)
d1
d2
d3
d4
and the cartesian product with s is:
(A, B, D)
a1 b1 d1
a1 b1 d2
a1 b1 d3
a1 b1 d4
Now we can remove from this set the tuples that were also in the original relation r, i.e. the first two tuples and the last one, so that we obtain the following result:
(A, B, D)
a1 b1 d3
And finally, we can remove the previous tuples (projected on D), from the original relation (again projected on D), that is, we remove:
(D)
d3
from:
(D)
d1
d2
d3
d4
and we obtain the following result, which is the final result of the division:
(D)
d1
d2
d4
Finally, we could double-check the result by multiplying it with the original relation s (which consists only of the tuple (a1, b1)):
(A, B, D)
a1 b1 d1
a1 b1 d2
a1 b1 d4
And looking at the rows of the original relation r, you can see this fact, that should give you an important insight on the meaning of the division operator:
the only values of the column D in r that are present together with (a1, b1) (the only tuple of s), are d1, d2 and d4.
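The three steps of the formula can be mirrored with Python sets. This is a sketch only: the function name `divide` is invented here, and since the instance of r is not reproduced in the text, the relation `r` below is a hypothetical instance chosen to be consistent with the walkthrough (in particular the tuple pairing d3 with (a2, b2) is an assumption).

```python
def divide(r, s, r_attrs, s_attrs):
    """Relational division r / s on sets of tuples, following the formula step by step."""
    keep_idx = [i for i, a in enumerate(r_attrs) if a not in s_attrs]
    s_idx = [r_attrs.index(a) for a in s_attrs]

    def project(t, idx):
        return tuple(t[i] for i in idx)

    # Projection of r on the attributes that are not in s.
    candidates = {project(t, keep_idx) for t in r}
    # r rewritten as (kept part, s part) pairs, so it can be compared with the product.
    present = {(project(t, keep_idx), project(t, s_idx)) for t in r}
    # Step 1: cartesian product of the projection with s.
    product = {(c, row) for c in candidates for row in s}
    # Step 2: the "extra" tuples, in the product but not in r.
    extra = product - present
    # Step 3: remove the candidates that produced an extra tuple.
    return candidates - {c for c, _ in extra}

# Hypothetical instance of r (assumed); s is the single tuple (a1, b1).
r = {("a1", "b1", "d1"), ("a1", "b1", "d2"), ("a2", "b2", "d3"), ("a1", "b1", "d4")}
s = {("a1", "b1")}
u = divide(r, s, ("A", "B", "D"), ("A", "B"))
```

Here `u` holds the one-column tuples for d1, d2 and d4, matching the result derived by hand.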
You can also see another example in Wikipedia, and for a detailed explanation of the division, together with its translation into SQL, you could look at these slides.

Efficiently find lowest sum paths

This is a big ask but I'm a bit stuck!
I am wondering if there is a name for this problem, or a similar one.
I am probably over complicating finding the solution but I can't think of a way without a full brute force exhaustive search (my current implementation). This is not acceptable for the application involved.
I am wondering if there are any ways of simplifying this problem, or implementation strategies I could employ (language/tool choice is open).
Here is a quick description of the problem:
Given n sequences of length k:
a = [0, 1, 1] == [a1, a2, a3]
b = [1, 0, 2] == [b1, b2, b3]
c = [0, 0, 2] == [c1, c2, c3]
find paths of length k through the sequences as so (i'll give examples starting at a1, but hopefully you get the idea the same paths need to be derived from b1, c1)
a1 -> a2 -> a3
a1 -> b1 -> b2
a1 -> b1 -> a2
a1 -> b1 -> c1
a1 -> c1 -> c2
a1 -> c1 -> a2
a1 -> c1 -> b1
I want to know, which path(s) are going to have the lowest sum:
a1 -> a2 -> a3 == 2
a1 -> b1 -> b2 == 1
a1 -> b1 -> a2 == 2
a1 -> b1 -> c1 == 1
a1 -> c1 -> c2 == 0
a1 -> c1 -> a2 == 1
a1 -> c1 -> b1 == 1
So in this case, out of the sample a1 -> c1 -> c2 is the lowest.
EDIT:
Sorry, just to clear up the rules for deriving the path.
For example you can move from node a1 to b2 if you haven't already exhausted b2, and have exhausted the previous node in that sequence (b1).
An alternative solution using Dynamic Programming
Let's assume the arrays are given as a matrix A such that each row is identical to one of the original arrays. Your matrix will be of size (n+1)x(k+1), and make sure that A[_][0] = 0
Now, use DP to solve it:
f(x,y,z) = min { f(i,y,z-1) | x < i <= n} [union] { f(i+1,0,z) } + A[x][y]
f(_,_,0) = 0
f(n,k,z) = infinity for each z > 0
Idea: In each step you can choose to go to each of the following lines (same column) - or go to the next column, while decreasing the number of more nodes needed.
Moving to the next column is done via the dummy index A[_][0], without decreasing number of nodes needed to go more and without cost, since A[_][0] = 0.
Complexity:
This solution is basically brute force, but by memoizing each already-explored value of f(_,_,_) you only need to fill a matrix of size O(n*k^2). Each cell takes O(n) time to compute at first glance, but in practice it can be computed iteratively in O(1) per step, because you only need to minimize with the new element in the row (1). This gives you O(n*k^2), which is better than brute force.
(1) This is done by min{x_1,x_2,x_3,...,x_k} = min{x_k, min{x_1,...,x_{k-1}}}, and we already know min{x_1,...,x_{k-1}}.
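A different, simpler dynamic program can be sketched under one reading of the rules in the question. The assumption is that a path of length k may start at the first element of any sequence and always visits a set of elements made up of one prefix from each sequence; since the sum does not depend on visiting order, the task becomes choosing how many elements to take from each sequence. This is not the recurrence above, only an illustration with the same O(n*k^2) cost; the function name `min_path_sum` is invented for the example.

```python
from itertools import accumulate

def min_path_sum(sequences, k):
    """Lowest sum over choices of one prefix per sequence with k elements in total."""
    INF = float("inf")
    # best[t] = lowest sum using exactly t elements from the sequences seen so far.
    best = [0] + [INF] * k
    for seq in sequences:
        prefix = [0] + list(accumulate(seq[:k]))  # prefix[m] = sum of first m elements
        new_best = [INF] * (k + 1)
        for total in range(k + 1):
            for take in range(min(total, len(prefix) - 1) + 1):
                candidate = best[total - take] + prefix[take]
                if candidate < new_best[total]:
                    new_best[total] = candidate
        best = new_best
    return best[k]

a = [0, 1, 1]
b = [1, 0, 2]
c = [0, 0, 2]
lowest = min_path_sum([a, b, c], 3)
```

For the sample sequences this returns 0, the sum of the path a1 -> c1 -> c2 identified in the question. If the intended movement rules are stricter than assumed here, the set of allowed prefix combinations would have to be restricted accordingly.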
You can implement a modified version of the A* algorithm.
1. Copy the matrix and fill the copy with 0s.
2. For each secondary diagonal m, from the last to the first:
3. For each cell n in m:
4. New matrix's cell n = old matrix's cell n minus min(cell below n in the new matrix, cell to the right of n in the new matrix).
5. Cell 0,0 in the new matrix is the shortest path.
Then implement the A* algorithm over the pseudocode above.

How to convert n-ary CSP to binary CSP using dual graph transformation

While reading the book Artificial Intelligence: A Modern Approach, I came across the following passage describing a method for converting an n-ary Constraint Satisfaction Problem into a binary one:
Another way to convert an n-ary CSP to a binary one is the dual graph
transformation: create a new graph in which there will be one variable
for each constraint in the original graph, and one binary constraint
for each pair of constraints in the original graph that share
variables. For example, if the original graph has variables {X, Y, Z}
and constraints ⟨(X, Y, Z), C1⟩ and ⟨(X, Y ), C2⟩ then the dual graph
would have variables {C1, C2} with the binary constraint ⟨(X, Y ), R1
⟩, where (X, Y ) are the shared variables and R1 is a new relation
that defines the constraint between the shared variables, as specified
by the original C1 and C2.
I don't quite get the example provided in the book. Can anybody explain it in another way, ideally with a concrete example? Thanks :D
Let's say your problem has the following constraints:
C1, which involves x, y and z:
x + y = z
C2, which involves x and y:
x < y
with the following domains:
x :: [1,2,3]
y :: [1,2,3]
z :: [1,2,3]
The author says that you need to create 2 more variables, one for each constraint. They are defined as follows:
c1 = < x, y, z >
c2 = < x, y >
The domains of c1 and c2 are defined so that they don't violate C1 and C2, i.e.:
c1 :: [ <1,2,3>, <2,1,3>, <1,1,2>]
c2 :: [<1,2>, <2,3>, <1,3>]
c1 and c2 will be the nodes of the dual graph, but first you need to define a constraint between them, i.e. R1:
R1: "the 1st and the 2nd element of c1 (x and y) must be equal to the 1st and the 2nd element of c2 respectively" (actually you could split it in two simpler constraints)
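The example can be checked by brute force. The Python sketch below enumerates the domains of the dual variables c1 and c2 and then lists the pairs allowed by R1; the variable names are chosen for this illustration only.

```python
from itertools import product

domain = [1, 2, 3]

# Domain of c1: all <x, y, z> tuples that satisfy C1 (x + y = z).
c1_domain = [(x, y, z) for x, y, z in product(domain, repeat=3) if x + y == z]

# Domain of c2: all <x, y> tuples that satisfy C2 (x < y).
c2_domain = [(x, y) for x, y in product(domain, repeat=2) if x < y]

# R1: the shared variables x and y must agree between c1 and c2.
compatible = [(t1, t2) for t1 in c1_domain for t2 in c2_domain if t1[:2] == t2]

print(c1_domain)
print(c2_domain)
print(compatible)
```

Only one pair satisfies R1, namely c1 = <1,2,3> with c2 = <1,2>, which corresponds to the single solution x = 1, y = 2, z = 3 of the original problem.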

Resources