I have an adjacency matrix, let it be called A, of size n*n,
where A(k,j) = A(j,k) = 1 if k and j are connected in one hop.
Now it looks like if I take
Dist=double(A)*double(A)>0 %getting all two hops connectivity
Dist=double(Dist)*double(A)>0 %getting all three hops connectivity
Dist=double(Dist)*double(A)>0 %getting all four hops connectivity
Is this right at all? I tried it with some simple graphs and it looks legit.
Can I use this fact to create a distance matrix,
where the distance matrix will show the minimum number of hops from j to k?
P.S.:
If it is legit, I will be happy to understand why it is right; I did not find info on Google.
Yes, this is perfectly right: the entries of the adjacency matrix give you the connections between vertices. Powers of the adjacency matrix concatenate walks: the (i,j)th entry of the kth power of the adjacency matrix tells you the number of walks of length k from vertex i to vertex j.
This can be quite easily proven by induction: (A^k)(i,j) = sum over m of (A^(k-1))(i,m) * A(m,j), and every walk of length k from i to j is a walk of length k-1 from i to some vertex m, followed by an edge from m to j.
Be aware that the powers of the adjacency matrix count the number of i-to-j walks, not paths (a walk can repeat vertices, while a path cannot). So, to create a distance matrix you need to iteratively raise your adjacency matrix to higher powers, and as soon as the (i,j)th element first becomes non-zero, at power k, you assign the distance k in your distance matrix.
Here is a try:
% Random symmetric adjacency matrix with zero diagonal
A = rand(5) > 0.5;
A = triu(A,1); A = A | A.';
Ad = double(A);              % logical matrices cannot be used with *
n = size(A,1);
D = NaN(n);
D(1:n+1:end) = 0;            % a vertex is at distance 0 from itself
B = Ad;
k = 1;
while any(isnan(D(:))) && k < n
    % Check for new walks, and assign distance
    D(B>0 & isnan(D)) = k;
    % Iteration
    k = k+1;
    B = B*Ad;
end
% Now D contains the distance matrix (NaN marks unreachable pairs)
Note that if you are searching for shortest paths in a graph, you can also use Dijkstra's algorithm (or, since the hops here are unweighted, a plain breadth-first search).
Finally, note that this is completely compatible with sparse matrices. As adjacency matrices are often good candidates for sparse storage, using it may be highly beneficial in terms of performance.
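If you'd rather do this in Python, here is a rough equivalent using scipy.sparse (a sketch only; the function name hop_distances and the input format are my choices, not from the original post):

import numpy as np
from scipy.sparse import csr_matrix

def hop_distances(adj):
    # adj: symmetric 0/1 numpy array; returns minimum hop counts,
    # with np.inf marking unreachable pairs (mirrors the MATLAB loop above)
    n = adj.shape[0]
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)
    A = csr_matrix(adj, dtype=np.int64)
    B = A.copy()
    for k in range(1, n):
        reach = (B.toarray() > 0) & np.isinf(D)
        D[reach] = k
        if not np.isinf(D).any():
            break
        B = B @ A
        B.data = np.minimum(B.data, 1)  # clamp counts: we only need "is there a walk"
    return D

Clamping the walk counts to 1 after every multiplication keeps the integers from overflowing while preserving the "is there a walk of length k" information, which is all the distance matrix needs.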
Best,
so in one of my lectures I came across the proof for:
Theorem: Any algorithm that determines if a graph is bipartite, given as its input an undirected graph G = (V, E) represented
as an n x n adjacency matrix, has a running time of Ω(n^2).
We assume an algorithm ALG which tests for bipartiteness (returns either true or false). And we also assume we have a graph G0 = (V, E0) with V = {1, 2, ..., n} and E0 = { {1, i} : 2 <= i <= n } (as this is a star, it is a bipartite graph)
Within the proof there's a step saying:
"For a given algorithm ALG, we will construct another graph šŗ1 st: if ALG performs less than (šā1)C2 accesses to the adjacency matrix š“ of šŗ0,
then ALG will not distinguish between šŗ0 and šŗ1, and šŗ1 is not bipartite."
My question is: what does (n-1)C2 accesses mean? Is it saying that, for example, if we have a different V = {A, B, C, D}, then ALG will look at all node pairs except for the ones between D and the other nodes?
Sorry if this isn't clear; this proof really confused me.
(n-1)C2 is the binomial coefficient "(n-1) choose 2", i.e. the number of vertex pairs not involving vertex 1. G0 is an n-vertex star graph. It's bipartite, but if you add any other edge to it, the resulting graph is not: a new edge {i, j} with i, j >= 2 closes the odd cycle 1-i-j-1. There are (n-1) choose 2 = (n-1)(n-2)/2 = Ω(n^2) other edges that we can add. Every correct algorithm must check every single one of them in order to verify that G0 is bipartite: if ALG never reads the matrix entry for some pair {i, j}, an adversary can add exactly that edge to obtain G1, and ALG, having seen identical matrix entries on both inputs, cannot distinguish G1 from G0.
Suppose you are given an undirected graph G with n vertices and m
edges represented by an n x n adjacency matrix A, and you are also
given a subset of vertices S (represented by an array of size m).
How can you check whether S is a vertex cover of G with quadratic
time and space complexity?
By the definition of a vertex cover, I know that we require that every edge be incident to a vertex contained in S.
I can easily come up with a cubic algorithm: iterate over the adjacency matrix; each 1 represents an edge (u, v). Check whether u or v is in S. If neither is, the answer is no. If we get to the end of the adjacency matrix, the answer is yes.
But how can I do this in O(n^2) time? I guess the only real "observation" I've made so far is that we can possibly skip intermediate rows while iterating over the adjacency matrix if we've already found the vertex corresponding to that row in S. However, this has not helped me very much.
Can someone please help me (or point me in the correct direction)?
Thanks
Construct an array T containing all of the vertices NOT in S (a boolean membership array lets you build it in O(n)).
And then:
for i in T:
    for j in T:
        if A[i][j] == 1:   # an edge with both endpoints outside S is uncovered
            return False
return True
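A self-contained version of the same idea (a sketch; the function name and 0-indexed vertices are my assumptions). Building the membership array and T takes O(n), and since |T| <= n the double loop does at most n^2 iterations, so the whole check runs in O(n^2) time and well within quadratic space:

def is_vertex_cover(A, S):
    n = len(A)
    in_S = [False] * n                          # O(n) membership array
    for v in S:
        in_S[v] = True
    T = [v for v in range(n) if not in_S[v]]    # vertices outside S
    for i in T:
        for j in T:
            if A[i][j] == 1:                    # edge with both endpoints outside S
                return False
    return True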
I'm trying to come up with a reasonable algorithm for this problem:
Let's say we have a bunch of locations. We know the distance between each pair of locations. Each location also has a point value. The goal is to maximize the sum of the points while travelling from a starting location to a destination location without exceeding a given amount of distance.
Here is a simple example:
Starting location: C , Destination: B, Given amount of distance: 45
Solution: C-A-B route with 9 points
I'm just curious if there is some kind of dynamic programming algorithm for this type of problem. What would be the best, or rather the easiest, approach for this problem?
Any help is greatly appreciated.
Edit: You are not allowed to visit the same location more than once.
EDIT: Under the newly added restriction that every node can be visited only once, the problem is most definitely NP-hard, via reduction from Hamiltonian path: for a general undirected, unweighted graph, set all edge weights to zero and every vertex weight to 1. Then the maximum reachable score is n if and only if there is a Hamiltonian path in the original graph.
So it might be a good idea to look into integer linear programming solvers for instance families that are not constructed specifically to be hard.
The solution below assumes that a vertex can be visited more than once and makes use of the fact that node weights are bounded by a constant.
Let p(x) be the point value for vertex x and w(x,y) be the distance weight of the edge {x,y}, or w(x,y) = ∞ if x and y are not adjacent.
If we are allowed to visit a vertex multiple times and if we can assume that p(x) <= C for some constant C, we might get away with the following recurrence: let f(x,y,P) be the minimum distance we need to get from x to y while collecting exactly P points. We have
f(x,y,P) = ∞ for all P < 0
f(x,x,p(x)) = 0 for all x
f(x,y,P) = MIN over z of ( w(x,z) + f(z,y,P - p(x)) )
We can compute f using dynamic programming. Now we just need to find the largest P such that
f(start, end, P) <= distance upper bound
This P is the solution.
The complexity of this algorithm with a naive implementation is O(n^4 * C). If the graph is sparse, we can get O(n^2 * m * C) by using adjacency lists for the MIN aggregation.
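A straightforward Python rendering of this DP (a sketch with names of my own choosing; it additionally assumes every p(x) >= 1, so that P strictly decreases and the recursion is well-founded, and non-negative edge weights):

import math

def max_points(w, p, start, end, budget):
    # w[x][z]: edge distance, math.inf if x and z are not adjacent
    # p[x]: point value of vertex x (assumed >= 1)
    n = len(p)
    maxP = sum(p)                       # P never exceeds n*C
    INF = math.inf
    # f[P][x][y] = min distance from x to y collecting exactly P points
    f = [[[INF] * n for _ in range(n)] for _ in range(maxP + 1)]
    for x in range(n):
        f[p[x]][x][x] = 0.0             # base case: start and stop at x
    for P in range(1, maxP + 1):
        for x in range(n):
            rem = P - p[x]
            if rem < 0:
                continue                # f(x,y,P) = inf for P < p(x)
            for y in range(n):
                best = f[P][x][y]
                for z in range(n):
                    if w[x][z] < INF and f[rem][z][y] < INF:
                        best = min(best, w[x][z] + f[rem][z][y])
                f[P][x][y] = best
    for P in range(maxP, -1, -1):       # largest P within the budget wins
        if f[P][start][end] <= budget:
            return P
    return None                         # destination unreachable

The four nested loops match the O(n^4 * C) bound above (since maxP <= n*C); replacing the innermost z loop with an adjacency list gives the sparse O(n^2 * m * C) variant.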
I am thinking about an algorithm for the following problem (found on careercup):
Given a polygon with N vertices and N edges, there is an integer (possibly negative) on every vertex and an operation from the set {*, +} on every edge. In each step, we remove an edge E from the polygon and merge the two vertices it links, V1 and V2, into a new vertex with value V1 op(E) V2. The last case is two vertices joined by two edges; the result is the bigger of the two values.
Return the maximum result value that can be obtained from a given polygon.
I think we can use a greedy approach. I.e., for a polygon with k edges, find the pair (p, q) of adjacent vertices which produces the maximum number when collapsed: (p, q) = argmax({ i op j : i, j adjacent vertices })
Then just call a recursion on polygons:
1. Let the function CollapseMaxPair( P(k) ) take a polygon with k edges and return the 'collapsed' polygon with k-1 edges
2. Then our recursion is:
P = P(N);
Repeat until two edges are left
    P = CollapseMaxPair( P )
maxvalue = max ( the two remaining values )
What do you think?
I have answered this question here: Google Interview : Find the maximum sum of a polygon and it was pointed out to me that that question is a duplicate of this one. Since no one has answered this question fully yet, I have decided to add this answer here as well.
As you have identified (tagged) correctly, this indeed is very similar to the matrix chain multiplication problem (in what order do I multiply matrices in order to do it quickly).
This can be solved in polynomial time using dynamic programming.
I'm going to instead solve a similar, more classic (and essentially equivalent) problem: given a formula with numbers, addition and multiplication, what way of parenthesizing it gives the maximal value? For example,
6+1*2 becomes (6+1)*2 = 14, which is more than 6+(1*2) = 8.
Let us denote our input a1, ..., an (real numbers) and o(1), ..., o(n-1) (each either * or +). Our approach will work as follows: we will observe the subproblem F(i,j), which represents the maximal value (after parenthesizing) of the sub-formula ai, ..., aj. We will create a table of such subproblems and observe that F(1,n) is exactly the result we were looking for.
Define
F(i,j)
- If i>j return 0 // no sub-formula of negative length
- If i=j return ai // the maximal formula for one number is the number itself
- If i<j return the maximal value, over all m between i (inclusive) and j (exclusive), of:
  F(i,m) (o(m)) F(m+1,j) // check all places for possible parenthesis insertion
This goes through all possible top-level splits. Proof of correctness is done by induction on the size n = j - i and is pretty straightforward. One caveat: since the numbers may be negative, the product of two sub-formula minima can beat the product of the two maxima, so to be fully correct you should track the minimal value of each sub-formula as well (a second table G(i,j)) and, whenever o(m) is *, take the best of the four combinations of F and G values.
Let's go through the runtime analysis:
If we do not save the values of smaller subproblems, this runs pretty slowly; however, we can make this algorithm perform relatively fast, in O(n^3).
We create an n*n table T in which the cell at index (i,j) contains F(i,j). Filling F(i,i), and F(i,j) for j smaller than i, is done in O(1) per cell, since we can calculate these values directly. Then we proceed diagonal by diagonal: we fill all cells with j - i = 1, then all cells with j - i = 2, and so on (which we can do quickly, since we already know all the smaller subproblems appearing in the recursive formula). There are n diagonals in the table; filling each cell takes O(n), and since each diagonal has O(n) cells, we fill each diagonal in O(n^2), meaning we fill the whole table in O(n^3). After filling the table we know F(1,n), which is the solution to your problem.
Now back to your problem.
If you translate the polygon into n different formulas (one starting at each vertex, i.e., one for each edge at which you break the cycle) and run the formula-value algorithm on each of them, you get exactly the value you want, as sketched below.
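Here is a Python sketch of the whole approach (function names are my own). Per the caveat above, it tracks both the maximum and the minimum of every sub-formula, and then tries all n rotations of the polygon:

def best_formula_value(a, ops):
    # a: numbers; ops[m] in {'+', '*'} sits between a[m] and a[m+1]
    n = len(a)
    hi = [[0.0] * n for _ in range(n)]   # hi[i][j] = max value of a[i..j]
    lo = [[0.0] * n for _ in range(n)]   # lo[i][j] = min value of a[i..j]
    for i in range(n):
        hi[i][i] = lo[i][i] = a[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best, worst = float('-inf'), float('inf')
            for m in range(i, j):                     # top-level split at op m
                for x in (hi[i][m], lo[i][m]):
                    for y in (hi[m + 1][j], lo[m + 1][j]):
                        v = x + y if ops[m] == '+' else x * y
                        best = max(best, v)
                        worst = min(worst, v)
            hi[i][j], lo[i][j] = best, worst
    return hi[0][n - 1]

def best_polygon_value(values, edge_ops):
    # edge_ops[i] sits on the edge between vertex i and vertex (i+1) % n
    n = len(values)
    best = float('-inf')
    for k in range(n):                    # break the cycle at each edge in turn
        vals = values[k:] + values[:k]
        ops = (edge_ops[k:] + edge_ops[:k])[:-1]   # drop the removed edge's op
        best = max(best, best_formula_value(vals, ops))
    return best

Each rotation corresponds to choosing which edge of the polygon to remove first; with n rotations at O(n^3) each, the total is O(n^4).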
Here's a case where your greedy algorithm fails:
Imagine your polygon is a square with vertices A, B, C, D (top left, top right, bottom right, bottom left). This gives us edges (A,B), (A,D), (B,C), and (C, D).
Let the weights be A = -1, B = -1, C = -1, and D = 1,000,000, with + as the operation on every edge.
A (-1) ------ B (-1)
| |
| |
| |
| |
D(1000000) ---C (-1)
Clearly, the best strategy is to collapse (A,B) and then (B,C), so that you end up with D by itself. Your algorithm, however, will start with either (A,D) or (D,C), which will not be optimal.
A greedy algorithm that combines the min sums has a similar weakness, so we need to think of something else.
I'm starting to see how we want to try to get all positive numbers together on one side and all negatives on the other.
If we think about the initial polygon entirely as a state, then we can imagine all the possible child states to be the subsequent graphs where an edge is collapsed. This creates a tree-like structure. A BFS or DFS would eventually give us an optimal solution, but at the cost of traversing the entire tree in the worst case, which is probably not as efficient as you'd like.
What you are looking for is a greedy best-first approach to search down this tree that is provably optimal. Perhaps you could create an A*-like search through it, although I'm not sure what your admissible heuristic would be.
I don't think the greedy algorithm works. Let the vertices be A = 0, B = 1, C = 2, and the edges be AB = a - 5b, BC = b + c, CA = -20. The greedy algorithm selects BC to evaluate first, value 3. Then AB, value -15. However, there is a better sequence to use: evaluate AB first, value -5; then evaluate BC, value -3. I don't know of a better algorithm, though.
I was wondering if anybody could point me in the direction of some eigenvector centrality pseudocode, or an algorithm (for social network analysis). I've already bounced around Wikipedia and Googled for some information, but I can't find any description of a generalized algorithm or pseudocode.
Thanks!
The eigenvector centrality of a vertex v in a graph G just seems to be the v'th entry of the dominant eigenvector of G's adjacency matrix A scaled by the sum of the entries of that eigenvector.
The power iteration, starting from any strictly-positive vector, will tend to the dominant eigenvector of A.
Notice that the only operation that power iteration needs to do is multiply A by a vector repeatedly. This is easy to do; the i'th entry of Av is just the sum of the entries of v corresponding to vertices j to which vertex i is connected.
The rate of convergence of power iteration depends on the ratio of the largest eigenvalue to the eigenvalue whose absolute value is second largest. That is, if the largest eigenvalue is lambdamax and the second-largest-by-absolute-value eigenvalue is lambda2, the error in your estimate gets reduced by a factor of lambdamax / |lambda2| on each iteration.
Graphs that arise in practice (social network graphs, for instance) typically have a wide gap between lambdamax and lambda2, so power iteration will typically converge acceptably fast: within a few dozen iterations, and almost irrespective of the starting point, you will have an estimate that's accurate to within 10^-9.
So, with that theory in mind, here's some pseudocode:
Let v = [1, 1, 1, 1, ... 1].
Repeat 100 times {
    Let w = [0, 0, 0, 0, ... 0].
    For each person i in the social network
        For each friend j of i
            Set w[j] = w[j] + v[i].    // w = A*v: entry j accumulates v over j's neighbours
    Set v = w.
}
Let S be the sum of the entries of v.
Divide each entry of v by S.
(In practice, rescale v inside the loop as well, e.g. divide by its sum every iteration: the entries otherwise grow roughly like lambdamax^100 and can overflow, and rescaling does not change the direction of v.)
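Here is the same pseudocode as a small Python function (a sketch; it assumes an adjacency-list input friends, where friends[i] lists the neighbours of vertex i, and a graph with at least one edge so the rescaling never divides by zero):

def eigenvector_centrality(friends, iterations=100):
    n = len(friends)
    v = [1.0] * n
    for _ in range(iterations):
        w = [0.0] * n
        for i in range(n):
            for j in friends[i]:
                w[j] += v[i]         # (A*v)[j]: sum of v over j's neighbours
        s = sum(w)
        v = [x / s for x in w]       # rescale each round so entries stay bounded
    return v                         # entries sum to 1; v[i] is i's centrality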
I only know a little about it. This is the pseudocode I learned in class.
input: a diagonalizable matrix A
output: a scalar h, which is the greatest (in absolute value) eigenvalue of A, and a nonzero vector v, the corresponding eigenvector of h, such that Av = hv
begin
    initialization: initialize a vector b0, which may be an approximation to the dominant eigenvector or a random vector, and let k = 0
    while k is smaller than the maximum number of iterations
        calculate b(k+1) = A*b(k) / |A*b(k)|
        set k = k+1
    end
    // v is the final b(k); h can be recovered as the Rayleigh quotient b(k)'*A*b(k) / (b(k)'*b(k))
end
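And a compact numpy rendering of that class pseudocode (a sketch of mine; it also recovers h via the Rayleigh quotient, the step the pseudocode leaves implicit):

import numpy as np

def power_iteration(A, max_iter=100):
    b = np.random.rand(A.shape[0])
    for _ in range(max_iter):
        Ab = A @ b
        b = Ab / np.linalg.norm(Ab)   # b(k+1) = A*b(k) / |A*b(k)|
    h = b @ (A @ b)   # Rayleigh quotient; since ||b|| = 1 this estimates the eigenvalue
    return h, b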