Say we have a matrix of zeros and ones
0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
and we want to find all the submatrices (we just need the row indices and column indices of the corners) with these properties:
contain at least L ones and L zeros
contain at most H elements
For example, take the previous matrix with L=1 and H=5. The submatrix 1 2 1 4 (row indices 1 2 and column indices 1 4)
0 1 1 1
1 1 1 1
satisfies property 1 but has 8 elements (more than 5), so it is not good;
the matrix 4 5 1 2
0 1
0 0
is good because it satisfies both properties.
The objective is then to find all the submatrices with min area 2*L, max area H and containing at least L ones and L zeros.
If we consider a matrix as a rectangle, it is easy to find all the possible subrectangles with max area H and min area 2*L by looking at the divisors of all the numbers from H down to 2*L.
For example, with H=5 and L=1 all the possible subrectangles/submatrices are given by the divisors of
H=5 -> divisors [1 5] -> possible rectangles of area 5 are 1x5 and 5x1
4 -> divisors [1 2 4] -> possible rectangles of area 4 are 1x4 4x1 and 2x2
3 -> divisors [1 3] -> possible rectangles of area 3 are 3x1 and 1x3
2*L=2 -> divisors [1 2] -> possible rectangles of area 2 are 2x1 and 1x2
I wrote this code which, for each number, finds its divisors and cycles over them to find the submatrices. For example, for a 1x5 submatrix the code fixes the first row of the matrix and slides the submatrix one column at a time from the left edge to the right edge; then it fixes the second row and slides the submatrix from left to right again, and so on until it reaches the last row.
It does this for all the 1x5 submatrices, then it considers the 5x1 submatrices, then the 1x4, then the 4x1, then the 2x2, etc.
The code does the job in 2 seconds (it finds all the submatrices), but for big matrices, e.g. 200x200, it needs many minutes to find all the submatrices. So I wonder if there are more efficient ways to do the job, and if so, which is the most efficient.
This is my code:
clc;clear all;close all
%% INPUT
P= [0 1 1 1 0 0 0 ;
1 1 1 1 0 1 1 ;
0 0 1 0 0 1 0 ;
0 1 1 0 1 1 1 ;
0 0 0 0 0 0 1 ;
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
[R,C]=size(P); % rows and columns of P
sub=zeros(1,6); % initializing the matrix containing the indexes of each submatrix (columns 1-4), their area (5) and the counter (6)
counter=1; % no. of submatrices found
%% FIND ALL RECTANGLES OF AREA >= 2*L & <= H
%
% idea: all rectangles of a certain area can be found using the area's divisors
% e.g. divisors(6)=[1 2 3 6] -> rectangles: 1x6 6x1 2x3 and 3x2
tic
for sH = H:-1:2*L % find rectangles of area H, H-1, ..., 2*L
div_sH=divisors(sH); % find all divisors of sH
disp(['_______AREA ', num2str(sH), '_______'])
for i = 1:round(length(div_sH)/2) % cycle over all couples of divisors
div_small=div_sH(i);
div_big=div_sH(end-i+1);
if div_small <= R && div_big <= C % rectangle with long side <= C and short side <= R
for j = 1:R-div_small+1 % cycle over all possible rows
for k = 1:C-div_big+1 % cycle over all possible columns
no_of_ones=length(find(P(j:j-1+div_small,k:k-1+div_big))); % no. of ones in the current submatrix
if no_of_ones >= L && no_of_ones <= sH-L % if the submatrix contains at least L ones AND L zeros
% row indexes columns indexes area position
sub(counter,:)=[j,j-1+div_small , k,k-1+div_big , div_small*div_big , counter]; % save the submatrix
counter=counter+1;
end
end
end
disp([' [', num2str(div_small), 'x', num2str(div_big), '] submatrices: ', num2str(size(sub,1))])
end
if div_small~=div_big % if the submatrix is a square, skip this part (otherwise there will be duplicates in sub)
if div_small <= C && div_big <= R % rectangle with long side <= R and short side <= C
for j = 1:C-div_small+1 % cycle over all possible columns
for k = 1:R-div_big+1 % cycle over all possible rows
no_of_ones=length(find(P(k:k-1+div_big,j:j-1+div_small)));
if no_of_ones >= L && no_of_ones <= sH-L
sub(counter,:)=[k,k-1+div_big,j,j-1+div_small , div_big*div_small, counter];
counter=counter+1;
end
end
end
disp([' [', num2str(div_big), 'x', num2str(div_small), '] submatrices: ', num2str(size(sub,1))])
end
end
end
end
fprintf('\ntime: %2.2fs\n\n',toc)
Here is a solution centered around 2D matrix convolution. The rough idea is to convolve P, for each submatrix shape, with a second matrix such that each element of the resulting matrix indicates how many ones are in the submatrix whose top left corner is at that element. This way you get all solutions for a single shape in one go, without having to loop over rows/columns, which greatly speeds things up (it takes less than a second for a 200x200 matrix on my 8-year-old laptop).
P= [0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
submats = [];
for sH = H:-1:2*L
div_sH=divisors(sH); % find all divisors of sH
for i = 1:length(div_sH) % cycle over all couples of divisors
%number of rows of the current submatrix
nrows=div_sH(i);
% number of columns of the current submatrix
ncols=div_sH(end-i+1);
% prepare matrix to convolve P with
m = zeros(nrows*2-1,ncols*2-1);
m(1:nrows,1:ncols) = 1;
% for each element, get the number of ones in the submatrix whose top left corner is there
submatsums = conv2(P,m,'same');
% mark positions where the submatrix would extend outside P as invalid (-1)
validsums = zeros(size(P))-1;
validsums(1:(end-nrows+1),1:(end-ncols+1)) = submatsums(1:(end-nrows+1),1:(end-ncols+1));
% get the indexes where both the number of ones and the number of zeros are >= L
topLeftIdx = find(validsums >= L & validsums<=sH-L);
% save submatrices in the following format: [index, nrows, ncols]
% You can of course use something different, but it seemed the simplest way to me
submats = [submats ; [topLeftIdx bsxfun(@times,[nrows ncols],ones(length(topLeftIdx),1))]];
end
end
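For readers without conv2, a comparable way to get the number of ones of every window cheaply is a summed-area table (2D prefix sums). This is a swapped-in technique, not the convolution itself, and only a minimal sketch: the function name `find_submatrices` and the 1-based `(row1, row2, col1, col2)` output format are assumptions chosen to match the question's notation.

```python
def find_submatrices(P, L, H):
    """Return 1-based (row1, row2, col1, col2) corners of every submatrix
    with area <= H that contains at least L ones and at least L zeros."""
    R, C = len(P), len(P[0])
    # S[i][j] holds the number of ones in P[0:i][0:j] (summed-area table).
    S = [[0] * (C + 1) for _ in range(R + 1)]
    for i in range(R):
        for j in range(C):
            S[i + 1][j + 1] = P[i][j] + S[i][j + 1] + S[i + 1][j] - S[i][j]

    found = []
    for h in range(1, min(R, H) + 1):            # window height
        for w in range(1, min(C, H // h) + 1):   # window width, area <= H
            area = h * w
            if area < 2 * L:
                continue
            for i in range(R - h + 1):
                for j in range(C - w + 1):
                    # Four lookups give the count of ones in the window.
                    ones = S[i + h][j + w] - S[i][j + w] - S[i + h][j] + S[i][j]
                    if L <= ones <= area - L:
                        found.append((i + 1, i + h, j + 1, j + w))
    return found


P = [[0, 1, 1, 1, 0, 0, 0],
     [1, 1, 1, 1, 0, 1, 1],
     [0, 0, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 1, 1, 1],
     [0, 0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0, 0, 1]]
result = find_submatrices(P, L=1, H=5)
print(len(result), "submatrices found")
```

Each window costs a constant number of lookups regardless of its size, which is the same saving the convolution provides per shape.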
First, I suggest that you combine finding the allowable sub-matrix sizes.
for smaller = 1:floor(sqrt(H))
for larger = max(smaller, ceil(2*L/smaller)):floor(H/smaller)
% add smaller x larger and larger x smaller to your shapes list
end
end
Next, start with the smallest rectangles in the shapes. Note that any solution to a small rectangle can be extended in any direction, to the area limit of H, and the added elements will not invalidate the solution you found. This will identify many solutions without bothering to check the populations within.
Keep track of the solutions you've found. As you work your way toward larger rectangles, you can avoid checking anything already in your solutions set. If you keep that in a hash table, checking membership is O(1). All you'll need to check thereafter will be larger blocks of mostly-1 adjacent to mostly-0. This should speed up the processing somewhat.
Is that enough of a nudge to help?
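The shape enumeration suggested above can be sketched in Python as follows. This is an illustration under the assumption that a shape is allowed when its area lies between 2*L and H; the helper name `allowed_shapes` is hypothetical.

```python
def allowed_shapes(L, H):
    """List every (rows, cols) shape whose area is between 2*L and H."""
    shapes = set()
    smaller = 1
    while smaller * smaller <= H:
        # Smallest long side that reaches the minimum area 2*L (ceiling division).
        larger = max(smaller, -((-2 * L) // smaller))
        while smaller * larger <= H:
            shapes.add((smaller, larger))
            shapes.add((larger, smaller))
            larger += 1
        smaller += 1
    return sorted(shapes)


print(allowed_shapes(1, 5))
```

For L=1 and H=5 this yields the same nine shapes as the divisor-based list in the question.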
I have a large sparse adjacency matrix with around 10M nodes, which I am processing with MATLAB. I want to convert the matrix into adjacency list as efficiently as possible. As an example adjacency matrix to illustrate this:
adj =
1 0 1
0 0 1
0 1 1
And the output is:
ans =
0 0 2
1 2
2 1 2
I want to do it as efficiently as possible, is there any efficient way to do it?
The result needs to be a cell array of vectors, because the number of nodes connected to each node varies. Here's a way to do it:
[ii, jj] = find(adj); % row and col indices of connections
y = accumarray(ii, jj-1 , [], @(x){sort(x.')}); % get all nodes connected to each node,
% sorted. Subtract 1 for 0-based indexing
This gives
>> celldisp(y)
y{1} =
0 2
y{2} =
2
y{3} =
1 2
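For comparison, the same grouping can be sketched in Python. It assumes the nonzero entries are already available as two 1-based index sequences, as `find` returns them; the function name `adjacency_from_pairs` is hypothetical.

```python
def adjacency_from_pairs(n_nodes, rows, cols):
    """Group 1-based (row, col) pairs into sorted 0-based neighbour lists."""
    neighbours = [[] for _ in range(n_nodes)]
    for i, j in zip(rows, cols):
        neighbours[i - 1].append(j - 1)   # subtract 1 for 0-based indexing
    for lst in neighbours:
        lst.sort()
    return neighbours


# Nonzero entries of the 3x3 example, in the column-major order find() uses.
rows = [1, 3, 1, 2, 3]
cols = [1, 2, 3, 3, 3]
print(adjacency_from_pairs(3, rows, cols))
```

Only the nonzero entries are visited, so the cost depends on the number of edges rather than on the square of the number of nodes.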
There is an m x n matrix that contains only 0s and 1s. A 2x2 square submatrix containing only 0s can be cut from it. We have to find the maximum number of such square submatrices that can be cut from the original matrix. Cutting strictly means that no two square submatrices can overlap.
For example:
This is a 5x5 matrix
0 0 0 1 0
0 0 0 0 0
1 0 0 0 0
0 0 0 1 0
0 0 0 0 0
If we cut a square submatrix of 2x2 starting from (0,0) then the remaining matrix is
0 1 0
0 0 0
1 0 0 0 0
0 0 0 1 0
0 0 0 0 0
Further 2x2 square submatrices can be cut.
In this given input, a maximum of 3 such submatrices can be cut. If I mark them with 'a':
a a 0 1 0
a a a a 0
1 0 a a 0
a a 0 1 0
a a 0 0 0
I have tried the backtracking/recursive approach, but it only works for small inputs. Can anybody suggest a more efficient approach?
Edit: I have marked matrix elements with "a" to show that this is one submatrix which can be cut. We only have to report the maximum number of 2x2 submatrices (containing all 0s) which can be taken from this matrix.
Just for the sake of completeness, I changed the script to do some crude recursion. You were right: it's difficult not to resort to a recursive way of doing it...
The idea:
f(matrix,count)
    IF count > length THEN
        length = count
    add all options to L
    IF L is empty THEN
        return
    FOR each option in L
        FOR each position in option
            set position in matrix to 1
        f(matrix,count+1)
        FOR each position in option
            set position in matrix to 0
where options are all 2x2 submatrices with only 0s that are currently in matrix
length = 0
set M to the matrix with 1s and 0s
f(M,0)
In python:
import copy

def possibilities(y):
    l = len(y[0])  # horizontal length of matrix
    h = len(y)     # vertical length of matrix
    sub = 2        # side length of the square submatrix to shift, in this case 2x2
    length = l-sub+1
    height = h-sub+1
    # positions of the four cells of the current 2x2 window
    x = [[0,0],[0,1],
         [1,0],[1,1]]
    # add all 2x2 to list L
    L=[]
    for i in range(height):
        for j in range(length):
            if y[x[0][0]][x[0][1]]==0 and y[x[1][0]][x[1][1]]==0 and y[x[2][0]][x[2][1]]==0 and y[x[3][0]][x[3][1]]==0:
                # create a copy of x
                c = copy.deepcopy(x)
                L.append(c)
            for k in x:  # shift submatrix to the right 1
                k[1]+=1
        # move the window back to the left edge
        (x[0][1],x[1][1],x[2][1],x[3][1]) = (0,1,0,1)
        for k in x:  # shift submatrix down 1
            k[0]+=1
    return L

def f(matrix,count):
    global length
    if count > length:
        length = count
    L = possibilities(matrix)
    if not L:
        return
    for option in L:
        for position in option:
            matrix[position[0]][position[1]]=1
        f(matrix,count+1)
        # reset back to 0
        for position in option:
            matrix[position[0]][position[1]]=0

length = 0
# matrix
M = [[0,0,1,0,0,0],
     [0,0,0,0,0,0],
     [1,1,0,0,0,0],
     [0,1,1,0,0,0]]
f(M,0)
print(length)
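A possible way to avoid re-exploring the same situations is a memoized scan over the cells with a column bitmask (a "broken profile" dynamic programme). This is a different technique from the recursion above, offered only as a sketch: the name `max_zero_squares` is hypothetical, and the approach assumes the matrix width is small enough (roughly 15 to 20 columns) for 2^width masks to be affordable and that rows*cols stays below Python's recursion limit.

```python
from functools import lru_cache


def max_zero_squares(grid):
    """Maximum number of non-overlapping all-zero 2x2 squares in grid."""
    m, n = len(grid), len(grid[0])

    @lru_cache(maxsize=None)
    def best(pos, mask, carry):
        # Cells are visited in row-major order. Bit c of mask tells whether the
        # next unvisited cell of column c is already covered by a square.
        # carry is True when a square was placed at the previous cell, so the
        # current cell is its top-right corner.
        if pos == m * n:
            return 0
        i, j = divmod(pos, n)
        bit = 1 << j
        if carry:
            # (i, j) and the cell below it are covered by the square at (i, j-1).
            return best(pos + 1, mask | bit, False)
        # Option 1: no square starts here, so (i+1, j) stays free.
        result = best(pos + 1, mask & ~bit, False)
        # Option 2: start a square here if all four cells are zero and free.
        if (i + 1 < m and j + 1 < n
                and not mask & bit and not mask & (bit << 1)
                and grid[i][j] == grid[i][j + 1] == 0
                and grid[i + 1][j] == grid[i + 1][j + 1] == 0):
            result = max(result, 1 + best(pos + 1, mask | bit, True))
        return result

    return best(0, 0, False)


example = [[0, 0, 0, 1, 0],
           [0, 0, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 0, 1, 0],
           [0, 0, 0, 0, 0]]
print(max_zero_squares(example))
```

The number of distinct states is at most rows * cols * 2^cols * 2, so the running time is still exponential in the width, but no longer in the number of candidate squares.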
I am looking for most efficient IDL code to replace the IDL matrix multiply (#) operator for a specific, diagonally-oriented (not diagonal, or diagonally-symmetric) matrix with 3 distinct values: unity on the diagonal; unity plus a delta to the right of the diagonal; unity minus the same delta to the left.
Problem domain
IDL (fixed; non-negotiable; sorry); image smear on shutter-less CCD imaging system.
Basic problem statement
Given
a 1024x1024 matrix, "EMatrix," with unity on the diagonal; (1-delta)
to the left of the diagonal; (1+delta) to the right; delta = 0.044.
another 1024x1024 matrix, Image
Query
what is the fastest IDL code to calculate (Image # EMatrix)?
2014-09-16: See update below
Background
Larger problem statement (of which the matrix multiply seems to be only the slowest part, and optimizing the whole routine would not hurt):
http://pdssbn.astro.umd.edu/holdings/nh-j-lorri-3-jupiter-v1.1/document/soc_inst_icd.pdf
Section 9.3.1.2 (PDF page 47; internal page 34), and other documents in that same directory (sorry as a newbie I can only post two links)
My work so far
https://github.com/drbitboy/OptimizeDiagishIDLMatrixMultiply
That is now (2014-09-26) about an order of magnitude faster than the IDL # operator for a 1024x1024 matrix.
Details
The naive operation is O(n^3) and performs about a billion (2^30) double-precision multiplies and about the same number of additions; Wikipedia further tells me Strassen's algorithm cuts that down to O(n^2.807), or ~282M+ multiplies for n=1024.
Breaking it down for a simple 3x3 case, say image and EMatrix are
image EMatrix
[ 0 1 2 ] [ 1 p p ]
[ 3 4 5 ] # [ m 1 p ]
[ 6 7 8 ] [ m m 1 ]
where p represents 1+delta (1.044) and m represents 1-delta (0.956).
Because of the repetition of m's and p's, there should be a simplification available: looking at the middle column for image, the result for the three rows should be
[1,4,7] . [1,p,p] = m*(0) + 1*1 + p*(4+7)
[1,4,7] . [m,1,p] = m*(1) + 1*4 + p*(7)
[1,4,7] . [m,m,1] = m*(1+4) + 1*7 + p*(0)
where . represents the dot (inner?) product i.e. [1,4,7].[a,b,c] = (1a + 4b + 7c)
Based on that, here's what I did so far:
The middle term is just the column itself, and the sums multiplied by m and p look a lot like cumulative sums of contiguous sections of the column, possibly reversed (for m), shifted one and with the first element set to zero.
i.e. for m:
; shift image down one:
imgminusShift01 = shift(image,0,1)
; zero the top row:
imgminusShift01[*,0] = 0
; Make a cumulative sum:
imgminusShift01Cum = total( imgminusShift01, 2, /cumulative)
For p, imgplusShift01Cum follows essentially the same path but with a rotate(...,7) before and after to flip things up and down.
Once we have those three matrices (the original image, with its offspring imgminusShift01Cum and imgplusShift01Cum), the desired result is
m * imgminusShift01Cum + 1 * image + p * imgplusShift01Cum
To do the shifts and rotates, I use indices, shifted and rotated themselves and stored in a common block for subsequent calls which saves another 10-20%.
Summary, so far
2014-09-16: See update below
And that gives a speedup of 5+.
I was expecting a bit more, because I think I am down to 3M multiplies and 2M additions, so maybe the memory allocation is the expensive part and I should be doing something like REPLICATE_INPLACE (my IDL is rusty - I have not done much since 6.4 and early 7.0).
Or maybe IDL's matrix multiplication is better than theory?
Alternate approaches and other thoughts:
Can we do something with the fact that the EMatrix is equal to unity
plus a matrix with zeros on the diagonal and +/- delta in the upper
and lower triangles?
By summing over columns I am accessing the data sequentially the "wrong" way, but
would it actually save time to transpose Image first (it's only 8MB)?
Obviously choosing another language, getting a GPU array to help, or
writing a DLM, would be other options, but let's keep this to IDL for
now.
advTHANKSance (yes I am that old;-),
Brian Carcich 2014-07-20
Update 2014-09-16
Diego's approach brought me to a much simpler solution:
I think we got it; my first pass was too complicated, with all the rotations.
To use Diego's notation, but transposed, I am looking for
[K] = [IMG] # [E]
Since that multiplies columns of [IMG] by rows of [E], there is no interaction between columns of [IMG], so for analysis we only need to look at one column of [IMG], the dot (inner) products of which, with the rows of [E], become one column of the result [K]. Expanding that idea to one column of a 3x3 matrix with elements x, y and z:
[Kx]   [x]   [ 1   1+d 1+d]
[Ky] = [y] # [1-d  1   1+d]
[Kz]   [z]   [1-d 1-d  1  ]
Looking specifically at element Ky above (one element of [K], corresponding to y in [IMG], broken down to a formula using only scalars):
Ky = x * (1-d) + y * 1 + z * (1+d)
Generalizing for any y in a column of any length:
Ky = sumx * (1-d) + y * 1 + sumz * (1+d)
Where scalars sumx and sumz are sums of all values above and below, respectively, y in that column of [IMG]. N.B. sumx and sumz are values specific to element y.
Rearranging:
Ky = (sumx + y + sumz) - d * (sumx - sumz)
Ky = tot - d * (sumx - sumz)
where
tot = (sumx + y + sumz)
i.e. tot is the sum of all values in the column (e.g. in IDL: tot = total(IMG,2)).
So to this point I have basically duplicated Diego's work; the rest of this analysis converts that last equation for Ky into a form suitable for speedy evaluation in IDL.
Solving the tot equation for sumz:
sumz = tot - (y + sumx)
Substituting back into Ky:
Ky = tot - d * (sumx - (tot - (y + sumx)))
Ky = tot - d * ((2 * sumx) + y - tot)
Ky = tot + d * (tot - ((2 * sumx) + y))
Using sumxy to represent the sum of all values in the column from the top down to, and including, y (IDL: [SUMXY] = total([IMG],2,/CUMULATIVE))
sumxy = sumx + y
and
sumx = sumxy - y
Substituting back into Ky:
Ky = tot + d * (tot - ((2 * (sumxy - y)) + y))
Ky = tot + d * (tot + y - (2 * sumxy))
So if we can evaluate tot and sumxy for every element of [IMG], i.e. if we can evaluate the matrices [TOT] and [SUMXY], and we already have [IMG] as the matrix version of y, then it is a simple linear combination of those matrices.
In IDL, these are simply:
[SUMXY] = TOTAL([IMG],2,/CUMULATIVE)
[TOT] = [SUMXY][*,N-1] # REPLICATE(1D0,1,N)
I.e. [TOT] is the last row of [SUMXY], duplicated to form a matrix of N rows.
And the final code looks like this:
function lorri_ematrix_multiply,NxN,EMatrix
NROWS = (size(NxN,/DIM))[1]
SUMXY = TOTAL(NxN,2,/CUMULATIVE)
TOT = SUMXY[*,NROWS-1] # REPLICATE(1,NROWS,1d0)
RETURN, TOT + ((EMatrix[1,0] - 1d0) * (TOT + NxN - (2d0 * SUMXY)))
end
which on our system is just shy of an order of magnitude faster than [IMG] # [E].
N.B. delta = (EMatrix[1,0] - 1d0)
Woo hoo!
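As a sanity check of the derivation, the closed form Ky = tot + d * (tot + y - 2 * sumxy) can be compared against the direct weighted sum in a few lines of Python. This is only an illustrative sketch, not the IDL routine: the image is a plain list of rows, and the names `smear_naive` and `smear_fast` are hypothetical.

```python
def smear_naive(img, d):
    """Direct sum: rows above y weigh 1-d, y itself 1, rows below 1+d."""
    n, w = len(img), len(img[0])

    def weight(j, k):
        if k == j:
            return 1.0
        return 1.0 + d if k > j else 1.0 - d

    return [[sum(weight(j, k) * img[k][c] for k in range(n)) for c in range(w)]
            for j in range(n)]


def smear_fast(img, d):
    """Same result from a column total and a running (cumulative) column sum."""
    n, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(n)]
    for c in range(w):
        tot = sum(img[k][c] for k in range(n))
        sumxy = 0.0
        for j in range(n):
            y = img[j][c]
            sumxy += y                      # sum from the top down to, and including, y
            out[j][c] = tot + d * (tot + y - 2.0 * sumxy)
    return out


image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
slow = smear_naive(image, 0.044)
fast = smear_fast(image, 0.044)
print(max(abs(a - b) for ra, rb in zip(slow, fast) for a, b in zip(ra, rb)))
```

The fast version does a constant amount of work per pixel, which is where the speedup over the full matrix multiply comes from.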
Step 1:
I go straight on with math notation since I think that could be more clear than explaining by words:
[ +1 +d +d +d +d ] [ 1 0 0 0 0 ] [ 0 1 1 1 1 ] [ 0 0 0 0 0 ]
[ -d +1 +d +d +d ] [ 0 1 0 0 0 ] [ 0 0 1 1 1 ] [ 1 0 0 0 0 ]
[E] = [ -d -d +1 +d +d ] = [ 0 0 1 0 0 ] + d * [ 0 0 0 1 1 ] - d * [ 1 1 0 0 0 ]
[ -d -d -d +1 +d ] | [ 0 0 0 1 0 ] [ 0 0 0 0 1 ] [ 1 1 1 0 0 ]
[ -d -d -d -d +1 ] | [ 0 0 0 0 1 ] [ 0 0 0 0 0 ] [ 1 1 1 1 0 ]
|
| [ 1 0 0 0 0 ] [ 1 1 1 1 1 ] [ 1 0 0 0 0 ]
| [ 0 1 0 0 0 ] [ 0 1 1 1 1 ] [ 1 1 0 0 0 ]
= [ 0 0 1 0 0 ] + d * [ 0 0 1 1 1 ] - d * [ 1 1 1 0 0 ]
| [ 0 0 0 1 0 ] [ 0 0 0 1 1 ] [ 1 1 1 1 0 ]
| [ 0 0 0 0 1 ] [ 0 0 0 0 1 ] [ 1 1 1 1 1 ]
| [ID] [UT] [LT]
|
= [ID] + d * [UT] - d * [LT]
==>
[Img] # [E] = [E]##[Img] = [Img] + d * [UT] ## [Img] - d * [LT] ## [Img]
Now let's observe what is [LT] ## [Img]:
row 1 is the same as first line of [Img]
row 2 is the (columnwise) sum of row 1 and 2 of [Img]
row i is the (columnwise) sum of all the first i rows of [Img]
row n is the (columnwise) sum of all the rows of [Img]
so an efficient way to compute it is:
TOTAL(Image, 2, /CUMULATIVE)
Analogous but a bit different is the result of [UT] ## [Img]:
row 1 is the (columnwise) sum of all the rows of [Img]
row i is the (columnwise) sum of rows i to n of [Img], i.e. of the last n-i+1 rows
row n-1 is the (columnwise) sum of rows n-1 and n of [Img]
row n is the same as the last line of [Img]
so [UT] ## [Img] = REVERSE(TOTAL(REVERSE(Image,2), 2, /CUMULATIVE),2)
Then:
[Img] # [E] = [E]##[Img] = Image + d * (REVERSE(TOTAL(REVERSE(Image,2), 2, /CUMULATIVE),2) - TOTAL(Image, 2, /CUMULATIVE))
Note that we see that in the end the results on each column are coming only from data of that same column.
Step 2:
Now let's look to [K] = [UT] ## [Img] - [LT] ## [Img] and see how it looks like. If for each generic column we name the column elements r(1), r(2), r(3), .... ,r(i), ..., r(n) we can see that the corresponding [K] column elements R(i)
looks like this:
when n is even (that is the case of 1024)
row 1 => R(1) = +r(1) -r(1) -r(2) .... -r(n-1) -r(n) = -SUM(j=2, n, r(n))
row 2 => R(1) = +r(1) +r(2) -r(1) -r(2) .... -r(n-1) = -SUM(j=3, n-1, r(n))
row 3 => R(1) = +r(1) +r(2) +r(3) -r(1) -r(2) .... -r(n-2) = -SUM(j=4, n-2, r(n))
: : : : : : : : : : :
row i (i < n/2)
=> R(1) = +r(1) ... +r(i) -r(1) -r(2) .... -r(n-i+1) = -SUM(j=i+1, n-i+1, r(n))
: : : : : : : : : : :
row n/2 => R(1) = +r(1) ... +r(n/2) -r(1) -r(2) .... -r(n/2+1) = -r(n/2+1)
row n/2+1 => R(1) = +r(1) ... +r(n/2+1) -r(1) -r(2) .... -r(n/2) = +r(n/2+1)
: : : : : : : : : : :
row i (i > n/2)
=> R(1) = +r(1) ... + r(i) -r(1) -r(2) .... -r(n-i+1) = +SUM(j=n-i+2, i, r(n))
= -R(n-i+1)
: : : : : : : : : : :
row n => R(1) = +r(1) ... + r(n) -r(1) = +SUM(j=2, n, r(n))
= -R(1)
when n is odd
It is similar but R((n+1)/2) will be all 0. I will not go in detail with this.
What is important is that the matrix [K] = [UT] ## [Img] - [LT] ## [Img] is anti-symmetrical with respect to its horizontal halving line.
This means that we could calculate values only in half matrix (let's say the top part) and then populate the lower part by mirroring and changing the sign.
Note that efficient calculation of the top part can be done starting from R(n/2) = r(n/2+1) and going up reducing the index (R(n/2 -1), R(n/2 -2), R(n/2 -3)...) each time using R(i) = R(i+1) - r(i+1) - r(n-i+1) that can be well rewritten as R(i-1) = R(i) - r(i) - r(n-i+2).
As a matter of calculation involved this about halves the calculation done but as a matter of actual speed it needs to be tested in order to see if implementation with explicit operations are as quick as the internal implementations of built-in functions as TOTAL(/CUMULATIVE) and similar. There are good probabilities that it is quicker since we can here avoid also TRANSPOSE and/or REVERSE.
Let us know how it goes with a bit of profiling!
I want to know in how many ways we can represent a number x as a sum of numbers from a given set of numbers {a1, a2, a3, ...}. Each number can be taken more than once.
For example, if x=4 and a1=1,a2=2, then the ways of representing x=4 are:
1+1+1+1
1+1+2
1+2+1
2+1+1
2+2
Thus the number of ways =5.
I want to know if there exists a formula or some other fast method to do so. I can't brute force through it. I want to write code for it.
Note: x can be as large as 10^18. The number of terms a1,a2,a3,… can be up to 15, and each of a1,a2,a3,… can also be only up to 15.
Calculating the number of combinations can be done in O(log x), disregarding the time it takes to perform matrix multiplication on arbitrarily sized integers.
The number of combinations can be formulated as a recurrence. Let S(n) be the number of ways to make the number n by adding numbers from a set. The recurrence is
S(n) = a_1*S(n-1) + a_2*S(n-2) + ... + a_15*S(n-15),
where a_i is the number of times i occurs in the set. Also, S(n)=0 for n<0. This kind of recurrence can be formulated in terms of a matrix A of size 15*15 (or smaller if the largest number in the set is smaller). Then, if you have a column vector V containing
S(n-14) S(n-13) ... S(n-1) S(n),
then the result of the matrix multiplication A*V will be
S(n-13) S(n-12) ... S(n) S(n+1).
The A matrix is defined as follows:
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
a_15 a_14 a_13 a_12 a_11 a_10 a_9 a_8 a_7 a_6 a_5 a_4 a_3 a_2 a_1
where a_i is as defined above. That multiplying this matrix with a vector of S(n-14) ... S(n) works can be seen by performing the multiplication manually; the last element in the resulting vector equals the right hand side of the recurrence for n+1. Informally, the ones in the matrix shift the elements of the column vector one row up, and the last row of the matrix calculates the newest term.
To calculate an arbitrary term S(n) of the recurrence, calculate A^n * V, where V is equal to
S(-14) S(-13) ... S(-1) S(0) = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.
In order to get the runtime down to O(log x), one can use exponentiation by squaring to calculate A^n.
In fact, the column vector can be ignored altogether: the lower right element of A^n contains the desired value S(n).
In case the above explanation was hard to follow, I have provided a C program that calculates the number of combinations in the way I described above. Beware that it will overflow a 64-bit integer very quickly. You'll be able to get much further with a high-precision floating point type using GMP, though you won't get an exact answer.
Unfortunately, I can't see a fast way to get an exact answer for numbers such as x=10^18, since the answer generally grows exponentially with x, so even writing it down takes a number of digits proportional to x.
#include <stdio.h>
typedef unsigned long long ull;
/* highest number in set */
#define N 15
/* perform the matrix multiplication out=a*b */
void matrixmul(ull out[N][N],ull a[N][N],ull b[N][N]) {
ull temp[N][N];
int i,j,k;
for(i=0;i<N;i++) for(j=0;j<N;j++) temp[i][j]=0;
for(k=0;k<N;k++) for(i=0;i<N;i++) for(j=0;j<N;j++)
temp[i][j]+=a[i][k]*b[k][j];
for(i=0;i<N;i++) for(j=0;j<N;j++) out[i][j]=temp[i][j];
}
/* take the in matrix to the pow-th power, return to out */
void matrixpow(ull out[N][N],ull in[N][N],ull pow) {
ull sq[N][N],temp[N][N];
int i,j;
for(i=0;i<N;i++) for(j=0;j<N;j++) temp[i][j]=i==j;
for(i=0;i<N;i++) for(j=0;j<N;j++) sq[i][j]=in[i][j];
while(pow>0) {
if(pow&1) matrixmul(temp,temp,sq);
matrixmul(sq,sq,sq);
pow>>=1;
}
for(i=0;i<N;i++) for(j=0;j<N;j++) out[i][j]=temp[i][j];
}
void solve(ull n,int *a) {
ull m[N][N];
int i,j;
for(i=0;i<N;i++) for(j=0;j<N;j++) m[i][j]=0;
/* create matrix from a[] array above */
for(i=2;i<=N;i++) m[i-2][i-1]=1;
for(i=1;i<=N;i++) m[N-1][N-i]=a[i-1];
matrixpow(m,m,n);
printf("S(%llu): %llu\n",n,m[N-1][N-1]);
}
int main() {
int a[]={1,1,0,0,0,0,0,1,0,0,0,0,0,0,0};
int b[]={1,1,1,1,1,0,0,0,0,0,0,0,0,0,0};
solve(13,a);
solve(80,a);
solve(15,b);
solve(66,b);
return 0;
}
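The same exponentiation by squaring can be sketched in Python, whose integers do not overflow. This is an illustration rather than a port of the C program: the function name `count_ordered_sums` and the optional `mod` parameter are assumptions, the latter added because exact values for very large x are too long to store, while a result modulo some number remains cheap.

```python
def count_ordered_sums(x, parts, mod=None):
    """Number of ordered ways to write x as a sum of elements of parts."""
    k = max(parts)
    present = set(parts)
    # Companion matrix: ones above the diagonal, coefficients in the last row.
    A = [[0] * k for _ in range(k)]
    for i in range(k - 1):
        A[i][i + 1] = 1
    for a in range(1, k + 1):
        A[k - 1][k - a] = 1 if a in present else 0

    def mul(X, Y):
        Z = [[0] * k for _ in range(k)]
        for i in range(k):
            for t in range(k):
                if X[i][t]:
                    xit = X[i][t]
                    for j in range(k):
                        Z[i][j] += xit * Y[t][j]
        if mod is not None:
            Z = [[v % mod for v in row] for row in Z]
        return Z

    result = [[int(i == j) for j in range(k)] for i in range(k)]  # identity
    while x > 0:                       # exponentiation by squaring
        if x & 1:
            result = mul(result, A)
        A = mul(A, A)
        x >>= 1
    return result[k - 1][k - 1]        # lower right element is S(x)


print(count_ordered_sums(4, [1, 2]))
print(count_ordered_sums(10**18, [1, 2], mod=10**9 + 7))
```

Without `mod`, the call is only practical for moderate x, because the exact result itself becomes enormous.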
If you want to find all possible ways of representing a number N from a given set of numbers then you should follow a dynamic programming solution as already proposed.
But if you just want to know the number of ways, then you are dealing with the restricted partition function problem.
The restricted partition function p(n, dm) ≡ p(n, {d1, d2, ..., dm}) is a number of partitions of n into positive integers {d1, d2, ..., dm}, each not greater than n.
You should also check the wikipedia article on partition function without restrictions where no restrictions apply.
PS. If negative numbers are also allowed then there are probably (countably) infinitely many ways to represent your sum.
1+1+1+1-1+1
1+1+1+1-1+1-1+1
etc...
PS2. This is more a math question than a programming one
Since order in sum is important it holds:
S( n, {a_1, ..., a_k} ) = sum[ S( n - a_i, {a_1, ..., a_k} ) for i in 1, ..., k ].
That is enough for a dynamic programming solution. If the values S(i, set) are computed from 0 to n, then the complexity is O( n*k ).
Edit: Just an idea. Look at one summation as a sequence (s_1, s_2, ..., s_m). Sum of first part of sequence will be larger than n/2 at one point, let it be for index j:
s_1 + s_2 + ... + s_{j-1} < n / 2,
s_1 + s_2 + ... + s_j = S >= n / 2.
There are at most max{a_i} different sums S, and for each S there are at most k possible last elements s_j. Each possible pair (S,s_j) splits the sequence sum into 3 parts:
s_1 + s_2 + ... + s_{j-1} = L,
s_j,
s_{j+1} + ... + s_m = R.
Both L and R lie between n/2 - max{a_i} and n/2. With that, the formula above takes a more complicated form:
S( n, set ) = sum[ S( L, set )*S( R, set ) for all combinations of (S,s_j) ], where L = S - s_j and R = n - S.
I'm not sure, but I think that with each step it will be necessary to 'create' a range of
S(x,set) values, where the range grows linearly by a factor of max{a_i}.
Edit 2: @Andrew's samples. The first method is easy to implement and works for 'small' x. Here is Python code:
def S( x, ai_s ):
    # s[i] is the number of ordered sums that add up to i
    s = [0] * (x+1)
    s[0] = 1
    for i in range(1,x+1):
        s[i] = sum( s[i-ai] if i-ai >= 0 else 0 for ai in ai_s )
    return s[x]
S( 13, [1,2,8] )
S( 15, [1,2,3,4,5] )
This implementation runs into memory problems for large x (>10^5 in Python). Since only the last max(a_i) values are needed, it is possible to implement it with a circular buffer.
These values grow very fast, e.g. S(100000, [1,2,8] ) is ~ 10^21503.
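The circular-buffer idea mentioned above can be sketched with `collections.deque`, which discards the oldest value automatically once `maxlen` is reached. This is a minimal sketch assuming the elements of the set are distinct positive integers; the name `count_ways_buffered` is hypothetical.

```python
from collections import deque


def count_ways_buffered(x, parts):
    """Same recurrence as S(x, ai_s), keeping only the last max(parts) values."""
    k = max(parts)
    # window[-1] is S(0) = 1; the earlier slots stand for S(negative) = 0.
    window = deque([0] * (k - 1) + [1], maxlen=k)
    for _ in range(x):
        # window[-a] is S(i - a) while window[-1] is S(i - 1).
        window.append(sum(window[-a] for a in parts))
    return window[-1]


print(count_ways_buffered(4, [1, 2]))
print(count_ways_buffered(13, [1, 2, 8]))
```

Memory use no longer depends on x, although the values themselves still grow quickly and the running time stays linear in x.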