Sparse matrix - matrix multiplication - algorithm

How can a sparse matrix - matrix product be calculated? I know the 'classic' / mathematical way of doing it, but it seems pretty inefficient. Can it be improved?
I thought about storing the first matrix in CSR form and the second one in CSC form: since the row and column vectors are sorted, I wouldn't have to search for the specific row / column I need. But I guess that doesn't help much.

With the disclaimers that (i) you really don't want to implement your own sparse matrix package and (ii) if you need to anyway, you should read Tim Davis's book on sparse linear algebra, here's how to do a sparse matrix multiply.
The usual naive dense multiply looks like this.
C = 0
for i {
    for j {
        for k {
            C(i, j) = C(i, j) + (A(i, k) * B(k, j))
        }
    }
}
Since addition is commutative and associative, we can permute the loop indices any way we like. Let's put j outermost and i innermost.
C = 0
for j {
    for k {
        for i {
            C(i, j) = C(i, j) + (A(i, k) * B(k, j))
        }
    }
}
Store all matrices in CSC form. Since j is outermost, we're working column-at-a-time on B and C (but not A). The middle loop is over k, which is rows of B, and, conveniently enough, we don't need to visit the entries of B that are zero. That makes the outer two loops go over the nonzero entries of B in the natural order. The inner loop increments the jth column of C by the kth column of A times B(k, j). To make this easy, we store the current column of C densely, together with the set of indexes where this column is nonzero, as a list/dense Boolean array. We avoid writing all of C or the Boolean array via the usual implicit initialization tricks.
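As a rough illustration of the column-at-a-time scheme described above, here is a minimal Python sketch. The function name csc_matmul and the (indptr, indices, data) tuple layout are assumptions made for this example, not part of any particular library. The mark array, stamped with the current column number, is one common way to get the implicit initialization: neither it nor the dense work column is ever cleared between columns.

```python
def csc_matmul(n_rows, A, B):
    """Sketch of C = A * B with A, B and C all stored in CSC form.

    Each matrix is a tuple (indptr, indices, data); this layout is an
    assumption of the example. n_rows is the number of rows of A (and C).
    """
    a_ptr, a_idx, a_val = A
    b_ptr, b_idx, b_val = B
    n_cols = len(b_ptr) - 1

    work = [0.0] * n_rows   # dense copy of the current column of C
    mark = [-1] * n_rows    # mark[i] == j means row i already appears in column j
    c_ptr, c_idx, c_val = [0], [], []

    for j in range(n_cols):                        # outer loop: columns of B and C
        pattern = []                               # rows where column j of C is nonzero
        for p in range(b_ptr[j], b_ptr[j + 1]):    # middle loop: nonzeros B(k, j)
            k, b_kj = b_idx[p], b_val[p]
            for q in range(a_ptr[k], a_ptr[k + 1]):  # inner loop: column k of A
                i = a_idx[q]
                if mark[i] != j:                   # first touch of row i in this column
                    mark[i] = j
                    work[i] = 0.0
                    pattern.append(i)
                work[i] += a_val[q] * b_kj
        for i in sorted(pattern):                  # sorting keeps row indices ordered
            c_idx.append(i)
            c_val.append(work[i])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

The work done is proportional to the number of multiplications A(i, k) * B(k, j) with both factors nonzero, plus the per-column sort, which could be dropped if unsorted row indices are acceptable.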

Related

Efficiently sum max(Ai+Bj, Bi+Aj) over all i, j

You are given two integer arrays A and B of length N. You have to find the value of the double summation:
Z = Σi Σj max(Ai+Bj, Bi+Aj)
Here is my brute-force algorithm:
for loop (i to length)
    for loop (j to length)
        sum+=Math.max(A[i]+B[j], A[j]+B[i]);
Please tell me a more efficient algorithm for this.
Rewrite the sum as Z = Σi Σj [max(Ai−Bi, Aj−Bj) + Bi + Bj] by using the distributive property of plus over max. Then construct C = A−B, sort it, and return Σi (2i+1)Ci + 2n Σi Bi (using zero-based indexing).
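A small Python sketch of this rewrite, with a brute-force version for comparison. The function names are made up for the example. After sorting C ascending, the element at zero-based index i is the maximum in exactly 2i+1 ordered pairs, which is where the (2i+1) weight comes from.

```python
def sum_max_pairs(A, B):
    """O(n log n) evaluation of sum over all (i, j) of max(A[i]+B[j], B[i]+A[j])."""
    n = len(A)
    C = sorted(a - b for a, b in zip(A, B))  # C = A - B, ascending
    # Sum of max(C[i], C[j]) over all ordered pairs, plus the B[i] + B[j] terms.
    return sum((2 * i + 1) * c for i, c in enumerate(C)) + 2 * n * sum(B)


def sum_max_pairs_brute(A, B):
    """O(n^2) reference implementation of the same double sum."""
    n = len(A)
    return sum(max(A[i] + B[j], B[i] + A[j])
               for i in range(n) for j in range(n))
```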
A minor improvement I can think of is to skip the results that you have already computed. Instead of beginning the inner loop from 0, you can start with j = i, since the terms for j < i were already covered in earlier iterations.
To achieve this, you can change the instruction in the inner loop to the following:
if i != j
    sum += 2 * Math.max(A[i]+B[j], A[j]+B[i]);
else
    sum += Math.max(A[i]+B[j], A[j]+B[i]);
The reason is that every pair (i, j) with i != j is visited twice by the original loops.

Matrix chain Multiplication Different Recursive definition

Matrix Chain Multiplication has a dynamic programming solution where a recursive definition is used which works like this :
Problem : multiply i to j
Sub-problem : multiply i to k + multiply k+1 to j + multiplication cost
and this looks straightforward to memoize, due to the repeating (i,j) sub-problems. But I am having difficulty memoizing the following recursive definition, which is a bit different.
Can someone help memoize this algorithm for matrix chain multiplication?
P is the sequence of orders of the matrices.
For example, for A(2,3)*B(3,4)*C(4,5), P = {2,3,4,5}, i.e. the order of the ith matrix is P[i-1]*P[i].
P is also assumed to be 0-indexed.
Here I am multiplying adjacent matrices and recursing.
Pseudocode :
chain_mul(P, n) {
    if (n == 1) return 0
    min_cost = inf
    for (i = 1 to n-1) {
        cost = P[i-1]*P[i]*P[i+1] + chain_mul(P-{P[i]}, n-1);
        if (cost < min_cost) min_cost = cost
    }
    return min_cost
}
Here the repeating sub-problem is the structure of P, as I have shown below:
This cannot be memoized efficiently, because the argument P ranges over all the subsets of the initial set P, so the memory required would be O(2^n).
The algorithms that can be memoized call the function on sections of the matrix chain, where each section is characterized by two numbers, a start and an end index. The number of sections is about n * (n + 1) / 2, and it is easy to implement a data structure that stores and retrieves results indexed by two numbers (e.g. a matrix).
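For comparison, here is a minimal sketch of the section-based formulation in Python, memoized on the (start, end) pair. It uses functools.lru_cache as the two-index table; the function name chain_cost is an assumption for the example, and P follows the convention from the question (matrix i has order P[i-1] x P[i]).

```python
from functools import lru_cache


def chain_cost(P):
    """Minimum scalar multiplications to multiply matrices 1..len(P)-1."""
    P = tuple(P)

    @lru_cache(maxsize=None)        # cache keyed by the section (i, j)
    def cost(i, j):
        if i == j:                  # a single matrix needs no multiplication
            return 0
        # Split the section after matrix k and combine the two halves.
        return min(cost(i, k) + cost(k + 1, j) + P[i - 1] * P[k] * P[j]
                   for k in range(i, j))

    return cost(1, len(P) - 1) if len(P) > 1 else 0
```

There are O(n^2) distinct (i, j) sections and each takes O(n) work, so this runs in O(n^3) time with O(n^2) memory, in contrast to the subset-indexed recursion.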

Efficient algorithm to calculate the mode of a hidden array

I'm trying to solve the extension to a problem I described in my question: Efficient divide-and-conquer algorithm
For this extension, there are known to be representatives of 3 parties at the event, and more members of one party are attending than of any other. A formal description of the problem can be found below.
You are given an integer n. There is a hidden array A of size n, which contains elements that can take 1 of 3 values. There is a value, let this be m, that appears more often in the array than the other 2 values.
You are allowed queries of the form introduce(i, j), where i≠j, and 1 <= i, j <= n, and you will get a boolean value in return: You will get back 1, if A[i] = A[j], and 0 otherwise.
Output: B ⊆ [1, 2, ..., n] where the A-value of every element in B is m.
A brute-force solution could calculate B in O(n^2) by calling introduce(i, j) on all n(n-1) combinations of elements and creating 3 lists containing the A-indexes of elements for which a 1 was returned, then returning the largest list.
I understand the Boyer–Moore majority vote algorithm but can't find a way to modify it for this problem or find an efficient algorithm to solve it.
Scan for all i with A[i] = A[0], and make a list I[] of all i for which A[i] != A[0]. Then scan I[] for all j with A[I[j]] = A[I[0]], and so on. This requires one O(n) scan for each possible value in A[].
[I assume if introduce(i, j) = 1 and introduce(j, k) = 1, then introduce(i, k) = 1 -- so you don't need to check all combinations of elements.]
Of course, this doesn't tell you what 'm' is directly; it just makes one list per distinct value, each holding all the 'i' for which A[i] is the same. The largest list is then B.
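The scanning idea above could look roughly like this in Python. This is only a sketch: the name largest_group is made up, introduce is passed in as a callable taking 1-based indices as in the question, and equality is assumed to be transitive as noted in the answer.

```python
def largest_group(n, introduce):
    """Return the 1-based indices sharing the most frequent hidden value."""
    remaining = list(range(1, n + 1))
    groups = []
    while remaining:
        pivot = remaining[0]
        same, different = [pivot], []
        for i in remaining[1:]:
            # One query per element still unclassified.
            (same if introduce(pivot, i) else different).append(i)
        groups.append(same)
        remaining = different      # next scan only looks at the leftovers
    return max(groups, key=len)
```

With 3 possible values there are at most 3 scans, so the number of queries is linear in n. The last scan could be skipped, since everything left after two scans must hold the third value.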

Find kth number in sum array

Given an array A with N elements, I need to find the pair (i, j), with i not equal to j, whose sum A[i]+A[j] comes at the kth position when the sums for all pairs (i, j) are written out.
Example: let N=4 and A=[1 2 3 4]. If K=3 then the answer is 5, since the sum array becomes [3,4,5,5,6,7].
I can't go through all pairs of i and j, as N can go up to 100000. Please help me solve this problem.
I mean something like this :
int len=N*(N+1)/2;
int sum[len];
int count=0;
for(int i=0;i<N;i++){
    for(int j=i+1;j<N;j++){
        sum[count]=A[i]+A[j];
        count++;
    }
}
//Then just find kth element.
We can't go with this approach
A solution based on the fact that K <= 50: take the first K + 1 elements of the array in sorted order. Now we can just try all their combinations. Proof of correctness: assume that a pair (i, j) is the answer, where j > K + 1. But there are K pairs with the same or smaller sum: (1, 2), (1, 3), ..., (1, K + 1). Thus, it cannot be the K-th pair.
It is possible to achieve O(N + K^2) time complexity by choosing the K + 1 smallest numbers using a quickselect algorithm (it is possible to do even better, but it is not required). You can also just sort the array and get O(N * log N + K^2 * log K) complexity.
I assume that you got this question from http://www.careercup.com/question?id=7457663.
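A minimal Python sketch of the "K + 1 smallest elements" idea, assuming k is 1-based and no larger than the number of pairs. It uses heapq.nsmallest rather than quickselect to pick the smallest elements, so the selection step is O(N log K) instead of O(N); the function name is made up for the example.

```python
import heapq


def kth_pair_sum_small_k(A, k):
    """k-th smallest A[i] + A[j] over pairs i < j, intended for small k."""
    smallest = heapq.nsmallest(k + 1, A)   # only these can take part in the answer
    m = len(smallest)
    sums = sorted(smallest[i] + smallest[j]
                  for i in range(m) for j in range(i + 1, m))
    return sums[k - 1]
```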
If k is close to 0 then the accepted answer to How to find kth largest number in pairwise sums like setA + setB? can be adapted quite easily to this problem and be quite efficient. You need O(n log(n)) to sort the array, O(n) to set up a priority queue, and then O(k log(k)) to iterate through the elements. The reversed solution is also efficient if k is near n*n - n.
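One way the priority-queue approach might be adapted to a single array is sketched below in Python. This is an illustration under assumptions, not the code from the linked answer: k is 1-based and at most n(n-1)/2, and the heap holds one candidate per row i of the sorted array, namely the next unvisited pair (i, j) with j > i.

```python
import heapq


def kth_pair_sum_heap(A, k):
    """k-th smallest A[i] + A[j] over pairs i < j, efficient when k is small."""
    a = sorted(A)
    n = len(a)
    # The smallest sum in row i pairs a[i] with its right neighbour a[i + 1].
    heap = [(a[i] + a[i + 1], i, i + 1) for i in range(n - 1)]
    heapq.heapify(heap)
    for _ in range(k - 1):                 # discard the k - 1 smallest sums
        _, i, j = heapq.heappop(heap)
        if j + 1 < n:                      # advance within row i
            heapq.heappush(heap, (a[i] + a[j + 1], i, j + 1))
    return heap[0][0]
```

As written, this costs O(n log n) for the sort plus O(k log n) for the pops, since the heap never holds more than n - 1 entries.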
If k is close to n*n/2 then that won't be very good. But you can adapt the pivot approach of http://en.wikipedia.org/wiki/Quickselect to this problem. First in time O(n log(n)) you can sort the array. In time O(n) you can set up a data structure representing the various contiguous ranges of columns. Then you'll need to select pivots O(log(n)) times. (Remember, log(n*n) = O(log(n)).) For each pivot, you can do a binary search of each column to figure out where the pivot splits it, in time O(log(n)) per column, for a total cost of O(n log(n)) over all columns.
The resulting algorithm will be O(n log(n) log(n)).
Update: I do not have time to do the finger exercise of supplying code. But I can outline some of the classes you might have in an implementation.
The implementation will be a bit verbose, but that is sometimes the cost of a good general-purpose algorithm.
ArrayRangeWithAddend. This represents a range of an array, summed with one value. It has an array (a reference or pointer, so the underlying data can be shared between objects), a start and an end of the range, and a shiftValue for the value to add to every element in the range.
It should have a constructor, a method to give the size, a method partition(n) that splits it into a range less than n, the count equal to n, and a range greater than n, and value(i) to give the i'th value.
ArrayRangeCollection. This is a collection of ArrayRangeWithAddend objects. It should have methods to give its size, pick a random element, and a method to partition(n) it into an ArrayRangeCollection that is below n, count of those equal to n, and an ArrayRangeCollection that is larger than n. In the partition method it will be good to not include ArrayRangeWithAddend objects that have size 0.
Now your main program can sort the array, and create an ArrayRangeCollection covering all pairs of sums that you are interested in. Then the random and partition method can be used to implement the standard quickselect algorithm that you will find in the link I provided.
Here is how to do it (in pseudo-code). I have now confirmed that it works correctly.
//A is the original array, such as A=[1,2,3,4]
//k (an integer) is the element in the 'sum' array to find
N = A.length
//first we find i
i = -1
nl = N
k2 = k
while (k2 >= 0) {
    i++
    nl--
    k2 -= nl
}
//then we find j
j = k2 + nl + i + 1
//now compute the sum at index position k
kSum = A[i] + A[j]
EDIT:
I have now tested this works. I had to fix some parts... basically the k input argument should use 0-based indexing. (The OP seems to use 1-based indexing.)
EDIT 2:
I'll try to explain my theory then. I began with the concept that the sum array should be visualised as a 2D jagged array (diminishing in width as the height increases), with the coordinates (as mentioned in the OP) being i and j. So for an array such as [1,2,3,4,5] the sum array would be conceived as this:
3,4,5,6,
5,6,7,
7,8,
9.
The top row holds all values where i equals 0. The second row is where i equals 1. To find the value of 'j' we do the same but in the column direction.
... Sorry I cannot explain this any better!

How to generate a permutation?

My question is: given a list L of length n, and an integer i such that 0 <= i < n!, how can you write a function perm(L, i) to produce the ith permutation of L in O(n) time? What I mean by ith permutation is just the ith permutation in some implementation-defined ordering that must have the properties:
For any i and any 2 lists A and B, perm(A, i) and perm(B, i) must both map the jth element of A and B to an element in the same position for both A and B.
For any inputs (A, i), (A, j) perm(A, i)==perm(A, j) if and only if i==j.
NOTE: this is not homework. In fact, I solved this 2 years ago, but I've completely forgotten how, and it's killing me. Also, here is a broken attempt I made at a solution:
def perm(s, i):
    n = len(s)
    perm = [0]*n
    itCount = 0
    for elem in s:
        perm[i%n + itCount] = elem
        i = i / n
        n -= 1
        itCount+=1
    return perm
ALSO NOTE: the O(n) requirement is very important. Otherwise you could just generate the n! sized list of all permutations and just return its ith element.
def perm(sequence, index):
    sequence = list(sequence)
    result = []
    for x in range(len(sequence)):
        idx = index % len(sequence)
        index //= len(sequence)  # integer division
        result.append(sequence[idx])
        # constant-time, non-order-preserving removal
        sequence[idx] = sequence[-1]
        del sequence[-1]
    return result
Based on the algorithm for shuffling, but instead of a random number we take the least significant part of the index each time to decide which element to take. Alternatively, consider it like the problem of converting to some arbitrary base, except that the base shrinks for each additional digit.
Could you use factoradics? You can find an illustration via this MSDN article.
Update: I wrote an extension of the MSDN algorithm that finds i'th permutation of n things taken r at a time, even if n != r.
A computationally minimal approach (written in C-style pseudocode):
function perm(list,i){
    for(a=list.length;a;a--){
        list.switch(a-1,i mod a);
        i=i/a;  // integer division
    }
    return list;
}
Note that implementations relying on removing elements from the original list tend to run in O(n^2) time, at best O(n*log(n)) given a special tree style list implementation designed for quickly inserting and removing list elements.
The above code, rather than shrinking the original list and keeping it in order, just moves an element from the end to the vacant location. It still makes a perfect 1:1 mapping between index and permutation, just a slightly more scrambled one, but in pure O(n) time.
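The swap-based pseudocode translates fairly directly to Python. The sketch below uses the hypothetical name perm_swap and works on a copy so the caller's list is left untouched; it assumes 0 <= i < n!.

```python
def perm_swap(items, i):
    """Return the i-th permutation of items under a swap-based ordering."""
    result = list(items)
    for a in range(len(result), 0, -1):
        i, r = divmod(i, a)        # r is the next mixed-radix digit of i
        # Move the chosen element into position a - 1; O(1) per step.
        result[a - 1], result[r] = result[r], result[a - 1]
    return result
```

Each step is a constant-time swap, so apart from the arithmetic on i the whole function is linear in the length of the list. Enumerating i from 0 to n! - 1 yields every permutation exactly once.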
So, I think I finally solved it. Before I read any answers, I'll post my own here.
def perm(L, i):
    n = len(L)
    if n <= 1:
        return L
    else:
        split = i % n
        return [L[split]] + perm(L[:split] + L[split+1:], i // n)
There are n! permutations. The first element can be chosen from L in n ways. Each of those choices leaves (n-1)! permutations of the remaining elements. So this idea is enough for establishing an order. In general, you will figure out what part you are in, pick the appropriate element and then recurse / loop on the smaller L.
The argument that this works correctly is by induction on the length of the sequence. (sketch) For a length of 1, it is trivial. For a length of n, you use the above observation to split the problem into n parts, each with a question on an L' with length (n-1). By induction, all the L's are constructed correctly (and in linear time). Then it is clear we can use the IH to construct a solution for length n.
