Eigenvector Centrality Algorithm/Pseudocode

I was wondering if anybody could point me in the direction of some eigenvector centrality pseudocode, or an algorithm (for social network analysis). I've already bounced around Wikipedia and Googled for some information, but I can't find any description of a generalized algorithm or pseudocode.
Thanks!

The eigenvector centrality of a vertex v in a graph G just seems to be the v'th entry of the dominant eigenvector of G's adjacency matrix A, normalized so that the entries of that eigenvector sum to 1.
The power iteration, starting from any strictly-positive vector, will tend to the dominant eigenvector of A.
Notice that the only operation that power iteration needs to do is multiply A by a vector repeatedly. This is easy to do; the i'th entry of Av is just the sum of the entries of v corresponding to vertices j to which vertex i is connected.
Power iteration converges linearly, at a rate governed by the ratio of the largest eigenvalue to the eigenvalue whose absolute value is second largest. That is, if the largest eigenvalue is lambdamax and the second-largest-by-absolute-value eigenvalue is lambda2, each iteration reduces the error in your eigenvector estimate by a factor of lambdamax / |lambda2|.
Graphs that arise in practice (social network graphs, for instance) typically have a wide gap between lambdamax and |lambda2|, so power iteration will typically converge acceptably fast; within a few dozen iterations, and almost irrespective of the starting point, you will have an eigenvector estimate whose error is within 10^-9.
So, with that theory in mind, here's some pseudocode:
Let v = [1, 1, 1, 1, ... 1].
Repeat 100 times {
    Let w = [0, 0, 0, 0, ... 0].
    For each person i in the social network
        For each friend j of i
            Set w[j] = w[j] + v[i].
    Set v = w.
}
Let S be the sum of the entries of v.
Divide each entry of v by S.
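
For concreteness, here is a minimal Python sketch of that pseudocode (the adjacency-list input format and the function name are my own choices); it renormalizes v on every pass, which avoids floating-point overflow without changing the direction the vector converges to:

def eigenvector_centrality(friends, iterations=100):
    """Power iteration on an undirected social graph.

    friends: dict mapping each person to the list of their friends.
    Returns centrality scores that sum to 1.
    """
    people = list(friends)
    v = {i: 1.0 for i in people}
    for _ in range(iterations):
        w = {i: 0.0 for i in people}
        for i in people:
            for j in friends[i]:
                w[j] += v[i]                   # w = A*v, one edge at a time
        s = sum(w.values())
        v = {i: w[i] / s for i in people}      # renormalize each pass
    return v

# tiny example: a triangle a-b-c with a pendant vertex d attached to c
print(eigenvector_centrality(
    {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}))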

I only know a little about it; this is the pseudocode I learned in class.
input: a diagonalizable matrix A
output: a scalar h, which is the greatest (in absolute value) eigenvalue of A, and a nonzero vector v, the corresponding eigenvector of h, such that Av = hv
begin
    initialization: initialize a vector b_0, which may be an approximation to the dominant eigenvector or a random vector, and let k = 0
    while k is smaller than the maximum number of iterations
        calculate b_{k+1} = A*b_k / ||A*b_k||
        set k = k + 1
    end
    return v = b_k, recovering h as the Rayleigh quotient h = (b_k . A*b_k) / (b_k . b_k)
end
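
As a sanity check, here is a minimal NumPy sketch of that pseudocode (the Rayleigh-quotient line for recovering h is a standard addition, not something taken from the class notes):

import numpy as np

def power_iteration(A, max_iter=1000):
    # returns (h, v): dominant eigenvalue of A and a unit eigenvector
    b = np.random.rand(A.shape[0])       # b0: a random starting vector
    for _ in range(max_iter):
        b = A @ b
        b = b / np.linalg.norm(b)        # b_{k+1} = A*b_k / ||A*b_k||
    h = b @ A @ b                        # Rayleigh quotient; b has unit norm
    return h, b

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
h, v = power_iteration(A)
print(h)  # approximately 2.0, the dominant eigenvalue of the triangle graph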

Related

Integer partition weighted minimum

Given a non-negative integer $n$ and a positive real weight vector $w$ of dimension $m$, partition $n$ into a length-$m$ non-negative integer vector that sums to $n$ (call it $v$) such that $w\cdot v$ is smallest. There may be several such partitions, and we only want the value of $w\cdot v$.
It seems like this problem can be solved with a greedy algorithm: from an optimal vector for $n-1$, add 1 to each entry in turn and take the minimum among those $m$ candidate vectors. But I don't think that's correct. The intuition is that it might add "over" the minimum; that is, there may exist another partition, not yielded by the add-1 procedure, whose value falls between the "minimum" this greedy algorithm produces for $n-1$ and the one it produces for $n$. Can anyone prove whether this is correct or incorrect?
Without loss of generality, assume that the elements of w are non-decreasing. Let v be an m-vector whose values are non-negative integers that sum to n. Then the smallest inner product of v and w is achieved by setting v[0] = n and v[i] = 0 for i > 0.
This is easy to prove. Suppose v is any other vector with v[i] > 0 for some i > 0. Then we can increase v[0] by v[i] and reduce v[i] to zero. The elements of v will still sum to n, and the inner product of v and w will be reduced by v[i] * (w[i] - w[0]) >= 0.
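
In code the whole answer collapses to a one-liner; here is a small sketch with a brute-force check for tiny inputs (both function names are hypothetical):

from itertools import product

def min_weighted_partition(n, w):
    # by the argument above, put all n units on the smallest weight
    return n * min(w)

def brute_force(n, w):
    # enumerate every length-m non-negative integer vector summing to n
    best = float("inf")
    for v in product(range(n + 1), repeat=len(w)):
        if sum(v) == n:
            best = min(best, sum(wi * vi for wi, vi in zip(w, v)))
    return best

assert min_weighted_partition(4, [2.0, 3.5, 5.0]) == brute_force(4, [2.0, 3.5, 5.0])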

algorithm about unbound knapsack problem with possible negative weights?

I've run into an unbounded knapsack problem with possible negative weights: there are k items, with weights x1, x2, ..., xk (each xi can be positive or negative), and each item is available in unlimited quantity. The bag must be filled to weight exactly W > 0. How do we use as few items as possible to reach exactly weight W? If there is no solution, just return -1.
That is, minimize $n_1 + n_2 + \cdots + n_k$ subject to $n_1 x_1 + n_2 x_2 + \cdots + n_k x_k = W$, where each $n_i$ is a non-negative integer.
What's the algorithm to solve this problem?
Firstly, note that we cannot simply drop the items with negative weight. For example, take x_1 = 3, x_2 = -1, W = 2. If we drop the negative item, there is no solution; however, n_1 = 1, n_2 = 1 is a solution.
The naive idea of dynamic programming / recursion with memoization cannot handle negative weights with unbounded quantities:
dp[i][w] = minimum number of items to fill weight w using items 1, 2, ..., i
dp[i][w] = min(dp[i-1][w], dp[i][w - x_i] + 1)
Since x_i can be negative and quantities are unbounded, there are infinitely many states dp[i][w].
You can do a breadth-first search on the graph of achievable total weights, where there exists an edge from weight w to weight v if there is an item with weight v-w. Start at 0 and find the shortest path to W.
The trick is that you don't need to consider achievable weights less than -max(|xi|) or greater than W+max(|xi|). You don't need to consider anything else, because for every sequence of item weights that adds up to W, there is an order in which you can perform the additions so that the intermediate sums never go outside those bounds.
Assuming that the weights are integers, this makes the graph finite.
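
Here is a short Python sketch of that search (names are mine; integer weights assumed, per the answer):

from collections import deque

def min_items(weights, W):
    # BFS over achievable total weights: an edge w -> w + x exists for each
    # item weight x, and the shortest path from 0 to W is the answer
    if W == 0:
        return 0
    m = max(abs(x) for x in weights)
    lo, hi = -m, W + m               # intermediate sums can stay in this band
    dist = {0: 0}
    queue = deque([0])
    while queue:
        w = queue.popleft()
        for x in weights:
            v = w + x
            if v == W:
                return dist[w] + 1
            if lo <= v <= hi and v not in dist:
                dist[v] = dist[w] + 1
                queue.append(v)
    return -1                        # W is not reachable

print(min_items([3, -1], 2))  # 2: one item of weight 3 plus one of weight -1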

How to get a distance matrix from an adjacency matrix in MATLAB

I have an adjacency matrix, let it be called A, of size n*n,
where A(k,j) = A(j,k) = 1 if nodes k and j are connected in 1 hop.
Now it looks like if I take
Dist = double(A)*double(A) > 0 % getting all two-hop connectivity
Dist = double(Dist)*double(A) > 0 % getting all three-hop connectivity
Dist = double(Dist)*double(A) > 0 % getting all four-hop connectivity
Is this right at all?
I tried it with some simple graphs and it looks legit.
Can I use this fact to create a distance matrix,
where the distance matrix will show the minimum number of hops from j to k?
P.S.:
If it is legit, I will be happy to understand why it is right; I did not find info on Google.
Yes, this is perfectly right: the entries of the adjacency matrix give you the connections between vertices, and powers of the adjacency matrix concatenate walks. The ij-th entry of the k-th power of the adjacency matrix tells you the number of walks of length k from vertex i to vertex j.
This can be quite easily proven by induction.
Be aware that the powers of the adjacency matrix count the number of i→j walks, not paths (a walk can repeat vertices, while a path cannot). So, to create a distance matrix, you need to iteratively raise your adjacency matrix to higher powers, and as soon as the ij-th element becomes non-zero you assign the distance k in your distance matrix.
Here is a try:
% Adjacency matrix: random symmetric example with no self-loops
n = 5;
A = rand(n) > 0.5;
A = triu(A, 1); A = A | A';        % symmetrize
D = nan(n);                        % distances; NaN = not yet reached
D(logical(eye(n))) = 0;            % every vertex is 0 hops from itself
B = double(A);
k = 1;
while any(isnan(D(:))) && k < n    % after n-1 powers, the rest is unreachable
    % Check for new walks, and assign distance
    D(B > 0 & isnan(D)) = k;
    % Iteration
    k = k + 1;
    B = B * A;
end
% Now D contains the distance matrix (NaN marks disconnected pairs)
Note that if you are searching for the shortest paths in a graph, you can also use Dijkstra's algorithm.
Finally, note that this is completely compatible with sparse matrices. As adjacency matrices are often good candidates for sparse storage, it may be highly beneficial in terms of performance.

Optimum path in a graph to maximize a value

I'm trying to come up with a reasonable algorithm for this problem:
Let's say we have a bunch of locations, and we know the distances between each pair of them. Each location also has a point value. The goal is to maximize the sum of the points collected while travelling from a starting location to a destination location without exceeding a given amount of distance.
Here is a simple example:
Starting location: C, destination: B, given amount of distance: 45
Solution: C-A-B route with 9 points
I'm just curious if there is some kind of dynamic-programming algorithm for this type of problem. What would be the best, or rather the easiest, approach to it?
Any help is greatly appreciated.
Edit: You are not allowed to visit the same location more than once.
EDIT: Under the newly added restriction that every node can be visited only once, the problem is most definitely NP-hard, via reduction from Hamiltonian path: for a general undirected, unweighted graph, set all edge weights to zero and every vertex weight to 1. Then the maximum reachable score is n iff there is a Hamiltonian path in the original graph.
So it might be a good idea to look into integer linear programming solvers for instance families that are not constructed specifically to be hard.
The solution below assumes that a vertex can be visited more than once and makes use of the fact that node weights are bounded by a constant.
Let p(x) be the point value for vertex x and w(x,y) be the distance weight of the edge {x,y}, or w(x,y) = ∞ if x and y are not adjacent.
If we are allowed to visit a vertex multiple times and if we can assume that p(x) <= C for some constant C, we might get away with the following recurrence: Let f(x,y,P) be the minimum distance we need to get from x to y while collecting P points. We have
f(x,y,P) = ∞ for all P < 0
f(x,x,p(x)) = 0 for all x
f(x,y,P) = min over z of ( w(x,z) + f(z,y,P - p(x)) )
We can compute f using dynamic programming. Now we just need to find the largest P such that
f(start, end, P) <= distance upper bound
This P is the solution.
The complexity of this algorithm with a naive implementation is O(n^4 * C). If the graph is sparse, we can get O(n^2 * m * C) by using adjacency lists for the MIN aggregation.
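
Here is a memoized Python sketch of that recurrence with y fixed to the destination (the names, the cap p_max on point totals, and the assumptions that every p(x) is a positive integer and all distances are non-negative are mine):

import math
from functools import lru_cache

def best_score(dist, points, start, end, budget, p_max):
    # dist[x][z]: edge weight, math.inf if x and z are not adjacent
    # f(x, P): minimum distance from x to `end` collecting exactly P points
    n = len(points)

    @lru_cache(maxsize=None)
    def f(x, P):
        if P < 0:
            return math.inf                  # f(x, y, P) = inf for all P < 0
        best = 0.0 if (x == end and P == points[x]) else math.inf
        for z in range(n):                   # min over neighbours z
            if z != x and dist[x][z] < math.inf:
                best = min(best, dist[x][z] + f(z, P - points[x]))
        return best

    # the answer is the largest P whose minimum distance fits the budget
    for P in range(p_max, 0, -1):
        if f(start, P) <= budget:
            return P
    return -1                                # no walk fits the distance budget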

Finding a square of side length R in the 2D plane

I was at a high-frequency trading firm interview, and they asked me:
Find a square whose side length is R, given n points in the 2D plane,
conditions:
-- its sides are parallel to the axes
-- it contains at least 5 of the n points
-- the running time must not depend on R
They told me to give them an O(n) algorithm.
Interesting problem, thanks for posting! Here's my solution. It feels a bit inelegant but I think it meets the problem definition:
Inputs: R, P = {(x_0, y_0), (x_1, y_1), ..., (x_N-1, y_N-1)}
Output: (u,v) such that the square with corners (u,v) and (u+R, v+R) contains at least 5 points from P, or NULL if no such (u,v) exists
Constraint: asymptotic run time should be O(n)
Consider tiling the plane with RxR squares. Construct a sparse matrix B, defined as
B[i][j] = {(x,y) in P | floor(x/R) = i and floor(y/R) = j}
As you are constructing B, if you find an entry that contains at least five elements stop and output (u,v) = (i*R, j*R) for i,j of the matrix entry containing five points.
If the construction of B did not yield a solution then either there is no solution or else the square with side length R does not line up with our tiling. To test for this second case we will consider points from four adjacent tiles.
Iterate the non-empty entries in B. For each non-empty entry B[i][j], consider the collection of points contained in the tile represented by the entry itself and in the tiles above and to the right. These are the points in entries: B[i][j], B[i+1][j], B[i][j+1], B[i+1][j+1]. There can be no more than 16 points in this collection, since each entry must have fewer than 5. Examine this collection and test if there are 5 points among the points in this collection satisfying the problem criteria; if so stop and output the solution. (I could specify this algorithm in more detail, but since (a) such an algorithm clearly exists, and (b) its asymptotic runtime is O(1), I won't go into that detail).
If after iterating the entries in B no solution is found then output NULL.
The construction of B involves just a single pass over P and hence is O(N). B has no more than N elements, so iterating it is O(N). The algorithm for each element in B considers no more than 16 points and hence does not depend on N and is O(1), so the overall solution meets the O(N) target.
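
Here is a Python sketch of that tiling algorithm (names are mine; I anchor a candidate 2x2 tile block at every tile neighbouring a non-empty one, so no block containing points is missed):

from collections import defaultdict

def find_square(points, R):
    # returns (u, v) such that [u, u+R] x [v, v+R] holds at least 5 points,
    # or None; expected O(n) assuming O(1) dictionary operations
    B = defaultdict(list)                    # B[i, j]: points in tile (i, j)
    for (x, y) in points:
        key = (int(x // R), int(y // R))
        B[key].append((x, y))
        if len(B[key]) >= 5:                 # the tile itself already works
            return (key[0] * R, key[1] * R)
    # otherwise a valid square, if any, straddles some 2x2 block of tiles
    anchors = {(i + di, j + dj)
               for (i, j) in B for di in (-1, 0) for dj in (-1, 0)}
    for (i, j) in anchors:
        group = [p for di in (0, 1) for dj in (0, 1)
                 for p in B.get((i + di, j + dj), [])]
        # at most 16 points per group, so this scan is O(1); a square holding
        # 5 points can be slid so its left and bottom edges each touch a point
        for (u, _) in group:
            for (_, v) in group:
                inside = sum(1 for (x, y) in group
                             if u <= x <= u + R and v <= y <= v + R)
                if inside >= 5:
                    return (u, v)
    return None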
Run through the set once, keeping the 5 largest x values in a (sorted) local array. Maintaining the sorted local array is O(N) (constant time performed at most N times).
Define xMax and xMin as the x-coordinates of the points with the largest and 5th-largest x values respectively (i.e. a[0] and a[4]).
Sort a[] again on y value, and set yMax and yMin as above, again in constant time.
Define deltaX = xMax - xMin, deltaY = yMax - yMin, and R = the larger of deltaX and deltaY.
The square of side length R located with its upper-right corner at (xMax, yMax) meets the criteria.
Observation if R is fixed in advance:
O(N) complexity means no sort is allowed except on a fixed number of points: only a radix sort would meet the criteria, and it requires a constraint on the values of xMax - xMin and of yMax - yMin, which was not provided.
Perhaps the trick is to start with the point furthest down and left, and move up and right. The lower-left-most point can be determined in a single pass of the input.
Moving up and right in steps and counting points in the square requires sorting the points on X and Y in advance, which, to be done in O(N) time, requires that the radix-sort constraint be met.
