MATLAB: Fast calculation of Adamic-Adar Score - algorithm

I have an adjacency matrix of a network and want to calculate the Adamic-Adar score. It is defined in the following way: for each pair of nodes x and y, let z be one of their common neighbors, and let |z| denote the degree of that neighbor.
The score is then defined as a sum over all common neighbors z:
AA(x,y) = sum over z in N(x) ∩ N(y) of 1/log(|z|)
See for instance this paper, page 3.
I have written a small algorithm in MATLAB, but it uses two for-loops. I am convinced that it can be made much faster, but I don't know how. Could you please indicate ways to speed this up?
% the entries of nn will always be 0 or 1, and the diagonal will always be 0
nn=[0 0 0 0 1 0; ...
0 0 0 1 1 0; ...
0 0 0 0 1 0; ...
0 1 0 0 0 1; ...
1 1 1 0 0 0; ...
0 0 0 1 0 0];
deg=sum(nn>0);
AAScore=zeros(size(nn));
for ii=1:length(nn)-1
for jj=ii+1:length(nn)
NBs=nn(ii,:).*nn(jj,:);
B=NBs.*deg;
C=B(B>1);
AAScore(ii,jj)=sum(1./log(C));
end
end
AAScore
I would appreciate any suggestion, thank you!
Comparing runtimes
My nn has ~2% nonzero entries, so it can be approximated by:
kk=1500;
nn=(rand(kk)>0.98).*(1-eye(kk));
My double-for: 37.404445 seconds.
Divakar's first solution: 58.455826 seconds.
Divakar's updated solution: 22.333510 seconds.

First off, get the indices in the output array that would be set, i.e. non-zero. Looking at the code, we can notice that we are basically AND-ing each row of the input matrix nn against every other row. Given that we are dealing with 1s and 0s, this translates to a matrix multiplication. So, the non-zeros in the matrix-multiplication result indicate the places in the output array (the squared matrix) where computation is needed. This should be efficient, as we would be iterating over fewer elements. On top of that, since we are producing an upper-triangular output, the computations can be reduced further by using a mask with triu(...,1).
Following those ideas, here's an implementation -
[R,C] = find(triu(nn*nn.'>0,1));
vals = sum(1./log(bsxfun(@times,nn(R,:).*nn(C,:),deg)),2);
out=zeros(size(nn));
out(sub2ind(size(out),R,C)) = vals;
For a case where the input matrix nn is less sparse and really huge, you would feel the bottleneck at computing bsxfun(@times,nn(R,:).*nn(C,:),deg). So, for such a case, you can directly use those R,C indices to compute and update the respective selected places in the output array.
Thus, an alternative implementation would be -
[R,C] = find(triu(nn*nn.',1));
out=zeros(size(nn));
for ii =1:numel(R)
out(R(ii),C(ii)) = sum(1./log(nn(R(ii),:).*nn(C(ii),:).*deg));
end
A middle ground could probably be established between the two approaches mentioned above by starting off with the R,C indices, then selecting chunks of rows from nn(R,:) and the respective ones from nn(C,:), and applying the vectorized implementation to those chunks iteratively, as sketched below. Setting the chunk size could be tricky, as it would largely depend on the system resources, the size of the input array and its sparseness.
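A minimal sketch of that chunked middle ground could look like the following (the chunk size and the variable names are just illustrative; it assumes nn and deg are defined as above):
% process the (R,C) pairs in fixed-size chunks to bound memory use
[R,C] = find(triu(nn*nn.',1));
out = zeros(size(nn));
chunk_size = 1000;   % tune to the available memory
for k = 1:chunk_size:numel(R)
    idx = k:min(k+chunk_size-1,numel(R));
    common = nn(R(idx),:).*nn(C(idx),:);               % common-neighbour masks for this chunk
    vals = sum(1./log(bsxfun(@times,common,deg)),2);   % zero entries contribute 1/log(0) = 0
    out(sub2ind(size(out),R(idx),C(idx))) = vals;
end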


Matlab: Speed up loop applied to each of 820,000 elements

I have a set of rainfall data, with a value every 15 minutes over many years, giving 820,000 rows.
The aim (eventually) of my code is to create columns which categorise the data which can then be used to extract relevant chunks of data for further analysis.
I am a Matlab novice and would appreciate some help!
The first steps I have got working sufficiently fast. However, some steps are very slow.
I have tried pre-allocating arrays, and using the lowest intX (8 or 16 depending on situation) possible, but other steps are so slow they don't complete.
The slow ones are for loops, but I don't know if they can be vectorised/split into chunks/anything else to speed them up.
I have a variable "rain" which contains a value for every time step/row.
I have created a variable called "state" of 0 if no rain, and 1 if there is rain.
Also a variable called "begin" which has 1 if it is the first row of a storm, and 0 if not.
The first slow loop is to create a "spell" variable - to give each rain storm a number.
% Generate blank column for spell of size (rain) - preallocate
spell = zeros(size(state),1,'int16');
% Start row for analysis
x=1;
% Populate "spell" variable with a storm number in each row of rain, for the storm number it belongs to (storm number calculated by adding up the number of "begin" values up to that point)
for i=1:size(state)
if(state(x)==1)
spell(x) = sum(begin(1:x));
end
x=x+1;
end
The next stage is about length of each storm. The first steps are fast enough.
% List of storm numbers
spellnum = unique(spell);
% Length of each spell
spelllength = histc(spell,spellnum);
The last step below (the for loop) is too slow and just crashes.
% Generate blank column for length
length = zeros(size(state),1,'int16');
% Starting row
x = 1;
% For loop to output the total length of the storm for each row of rain within that storm
for i=1:size(state)
for j=1:size(state)
position = find(spell==x);
for k=1:size(state)
length(position) = spelllength(x+1);
end
end
x=x+1;
end
Is it possible to make this more efficient?
Apologies if examples already exist - I'm not sure what the process would be called!
Many thanks in advance.
Mem. allocation/reallocation tips:
try to create the results directly from an expression (possibly trimming another, more general result);
if 1. is not possible, try to pre-allocate whenever possible (when you have an upper limit for the result);
if 2. is not possible try to grow cell-arrays rather than massive matrices (because a matrix requires a contiguous memory area)
Type-choice tips:
try to always use double for intermediate results, because it is the basic numeric data type in MATLAB; this avoids conversions back and forth;
use other types for intermediate results only if there's a memory constraint that can be alleviated by using a smaller-size type.
Linearisation tips:
fastest linearisation uses matrix-wise or element-wise basic algebraic operations combined with logical indexing.
loops are not that bad starting with MATLAB R2008;
the worst-performing element-wise processing functions are arrayfun, cellfun and structfun with anonymous functions, because anon functions evaluate the slowest;
try not to calculate the same things twice, even if this gives you better linearisation.
First block:
% Just calculate the entire cumulative sum over begin, then
% trim the result. Check if the cumsum doesn't overflow.
spell = cumsum(begin);
spell(state==0) = 0;
Second block:
% The same, not sure how could you speed this up; changed
% the name of variables to my taste, though.
spell_num = unique(spell);
spell_length = histc(spell,spell_num);
Third block:
% Fix the following issues:
% - the most-inner "for" does not make sense because it rewrites
% several times the same thing;
% - the same looping variable "i" is re-used in three nested loops,
% - the name of the standard function "length" is obscured by declaring
% a variable named "length".
for x = 1:numel(spell_num)
storm_selector = (spell==spell_num(x));
storm_length(storm_selector) = spell_length(x);
end;
The combination of code I have ended up using is a mixture from @CST_Link and @Sifu. Thank you very much for your help! I don't think Stack Overflow lets me accept two answers, so for clarity, by putting it all together, here is the code which everyone's helped me create!
The only slow part is the for loop in block three, but this still runs in a few minutes, which is good enough for me, and infinitely better than my attempt.
First block:
%% Spell
%spell is cumulative sum of begin
spell = cumsum(begin);
%% start row
x=1;
%% Replace all rows of spell with no rain with 0
spell(state==0)=0;
Second block (unchanged except better variable names):
%% Spell number = all values of spell
spell_num = unique(spell);
%% Spell length = how many of each value of spell
spell_length = histc(spell,spell_num);
Third block:
%% Generate blank column for spell of size (state)
spell_length2 = zeros(length(state),1);
%%
for x=1:length(state)
position = find(spell==x);
spell_length2(position) = spell_length(x+1);
end
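For completeness, that remaining loop (which only needs to visit each storm number, not every row) can also be replaced by a single indexing step. A sketch, assuming spell, spell_length and state are as built above and that spell is 0 for rows without rain (so spell_length(1) counts those rows):
spell_length2 = zeros(length(state),1);
has_rain = (spell > 0);                                      % rows that belong to a storm
spell_length2(has_rain) = spell_length(spell(has_rain)+1);   % +1 skips the count of the 0 rows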
For the first part, if I am following what you are doing:
I created some data matching your description for testing.
Please tell me if I missed something.
state=[ 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 0];
begin=[ 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0];
spell = zeros(length(state),1,'int16');
%Start row for analysis
x=1;
% Populate "spell" variable with a storm number in each row of rain, for the storm number it belongs to (storm number calculated by adding up the number of "begin" values up to that point)
for i=1:length(state)
if(state(x)==1)
spell(x) = sum(begin(1:x));
end
x=x+1;
end
% can be accomplished by simply using cumsum ( no need for extra variables if you are short in memory)
spell2=cumsum(begin);
spell3=spell2.*(state==1);
and the output for both spell and spell3 is as shown:
[spell.'; spell3]
0 0 0 0 0 1 1 1 1 1 0 2 0 0 2 0 3 3 3 3 0
0 0 0 0 0 1 1 1 1 1 0 2 0 0 2 0 3 3 3 3 0
Why don't you do that instead?
% For loop to output the total length of the storm for each row of rain within that storm
for x=1:size(state)
position = find(spell==x);
length(position) = spelllength(x+1);
end
I replaced the i iterator with x, which removes 2 lines and some computation.
I then proceeded to remove the two nested loops, as they were literally useless (each loop would output the same thing).
That's already a good start.

reconstruct a matrix from its row and column sums

Take an n*m matrix filled with floating point values between 0 and 1.
Example:
0 0.5 0 0
0 0.5 1 0.4
0.2 1 0.3 0
0 1 0 0
The goal is to reconstruct the values in this matrix.
I do not have access to this matrix, so I do not know any of its values at the beginning.
There is a function to calculate each value, calc_value(m,n). So a simple way to reconstruct this matrix is to call calc_value(m,n) for each value.
But calling this function is a very expensive operation, so I would like to call this function as few times as possible.
I know the total sum of all values in the matrix, and the sum of values in each individual row and column. (calculating each of these sums is no more expensive than a call to calc_value(m,n))
Using the row and column sums as additional information, how can I fill all values in the matrix with the least amount of calls to calc_value()?
Is it possible with fewer than O(n*m) calls?
There is one additional constraint for the matrix that may help: the values in each row and column will be monotonically increasing up to a maximum, then monotonically decreasing after that maximum. So a single row could look like this:
0 0.5 0.5 1 1 0.5 0
but not like this:
0 1 0 1 0 1
i.e. more than one distinct local maximum is not allowed.
This is the status of my own attempts:
So far I discovered the following inequalities. For a given value of the matrix M(n,m):
M(n,m) <= Min ( sum_of_row_n, sum_of_column_m)
M(n,m) >= sum_of_row_n - sum_of_all_columns_except_m
M(n,m) >= sum_of_column_m - sum_of_all_rows_except_n
But these inequalities do not provide enough information to deduce the value M(n,m), except for some trivial cases.
From what you describe it seems that your matrix has m*n degrees of freedom. The range and monotonicity constraints do not reduce the degrees of freedom. Each sum (row, column, total) removes one degree of freedom, down to (m-1)*(n-1) degrees. (Since the sum of all row sums and the sum of all column sums both equal the total sum, you can only exploit m+n-1 of these constraints.)
So with the given information all you can do is:
calculate the matrix elements a_ij, 1 <= i < m, 1 <= j < n, with calc_value(i,j)
compute the missing element of each row/column and a_mn via the row/col sum properties
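As a small illustration of that counting argument, here is a MATLAB-style sketch (calc_value is the expensive function from the question; row_sum as an m-by-1 vector and col_sum as a 1-by-n vector are hypothetical names for the known sums):
M = zeros(m,n);
for i = 1:m-1
    for j = 1:n-1
        M(i,j) = calc_value(i,j);    % (m-1)*(n-1) expensive calls
    end
end
M(1:m-1,n) = row_sum(1:m-1) - sum(M(1:m-1,1:n-1),2);   % last column from the row sums
M(m,1:n)   = col_sum(1:n)   - sum(M(1:m-1,1:n),1);     % last row from the column sums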

Compare two arrays of points [closed]

I'm trying to find a way to find similarities in two arrays of different points. I drew circles around points that have similar patterns, and I would like to do some kind of automatic comparison in intervals of, let's say, 100 points, and tell what the coefficient of similarity is for each interval. As you can see, the two sets might not be perfectly aligned either, so a point-to-point comparison would not be a good solution (I suppose). Patterns that are slightly misaligned could still be considered as matching the pattern (but obviously with a smaller coefficient).
What similarity could mean (a coefficient of 1 is a perfect match, 0 or less is not a match at all):
Points 640 to 660 - Very similar (coefficient is ~0.8)
Points 670 to 690 - Quite similar (coefficient is ~0.5-~0.6)
Points 720 to 780 - Let's say quite similar (coefficient is ~0.5-~0.6)
Points 790 to 810 - Perfectly similar (coefficient is 1)
The coefficient is just my idea of what the final calculated result of the comparison function could look like with the given data.
I read many posts on SO but they didn't seem to solve my problem. I would appreciate your help a lot. Thank you.
P.S. The perfect answer would be one that provides pseudocode for a function which accepts two data arrays (intervals of data) as arguments and returns a coefficient of similarity.
I also think High Performance Mark has basically given you the answer (cross-correlation). In my opinion, most of the other answers are only giving you half of what you need (i.e., dot product plus compare against some threshold). However, this won't consider a signal to be similar to a shifted version of itself. You'll want to compute this dot product N + M - 1 times, where N, M are the sizes of the arrays. For each iteration, compute the dot product between array 1 and a shifted version of array 2. The amount you shift array 2 increases by one each iteration. You can think of array 2 as a window you are passing over array 1. You'll want to start the loop with the last element of array 2 only overlapping the first element in array 1.
This loop will generate numbers for different amounts of shift, and what you do with that number is up to you. Maybe you compare it (or the absolute value of it) against a threshold that you define to consider two signals "similar".
Lastly, in many contexts, a signal is considered similar to a scaled (in the amplitude sense, not time-scaling) version of itself, so there must be a normalization step prior to computing the cross-correlation. This is usually done by scaling the elements of the array so that the dot product with itself equals 1. Just be careful to ensure this makes sense for your application numerically, i.e., integers don't scale very well to values between 0 and 1 :-)
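A minimal MATLAB sketch of that procedure (my own variable names; x and y are the two arrays, assumed non-zero, and only base functions are used):
x = x(:)/norm(x);          % normalise so the dot product of x with itself is 1
y = y(:)/norm(y);
c = conv(x,flipud(y));     % all N + M - 1 shifts, starting with the last element of y over the first of x
similarity = max(abs(c));  % best match over all shifts, between 0 and 1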
I think HighPerformanceMark's suggestion is the standard way of doing the job.
A computationally lightweight alternative measure might be a dot product:
Split both arrays into the same predefined index intervals.
Consider the array elements in each interval as vector coordinates in high-dimensional space.
Compute the dot product of both vectors.
The dot product will not be negative. If the two vectors are perpendicular in their vector space, the dot product will be 0 (in fact that's how 'perpendicular' is usually defined in higher dimensions), and it will attain its maximum for identical vectors.
If you accept the geometric notion of perpendicularity as a (dis)similarity measure, here you go (a small sketch follows below).
Caveat:
This is an ad hoc heuristic chosen for computational efficiency. I cannot tell you about the mathematical/statistical properties of the process and its separation properties - if you need rigorous analysis, however, you'll probably fare better with correlation theory anyway and should perhaps forward your question to math.stackexchange.com.
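To make the interval-wise dot product concrete, a rough MATLAB-style sketch (A and B are the two full arrays, assumed to be vectors of equal length; the window length and the normalisation are my own choices):
win = 100;                             % interval length
nIv = floor(numel(A)/win);             % number of complete intervals
coeff = zeros(nIv,1);
for k = 1:nIv
    a = A((k-1)*win+1 : k*win);        % k-th interval of each array
    b = B((k-1)*win+1 : k*win);
    coeff(k) = dot(a,b)/(norm(a)*norm(b)+eps);   % 1 for identical direction, 0 for perpendicular
end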
My Attempt:
Total_sum=0
1. For each index i in the range (m,n)
2. sum=0
3. k=Array1[i]*Array2[i]; t1=magnitude(Array1[i]); t2=magnitude(Array2[i]);
4. k=k/(t1*t2)
5. sum=sum+k
6. Total_sum=Total_sum+sum
Coefficient=Total_sum/(m-n)
If all values are equal, then sum would return 1 in each case and total_sum would return (m-n)*(1). Hence, when the same is divided by (m-n) we get the value as 1. If the graphs are exact opposites, we get -1 and for other variations a value between -1 and 1 is returned.
This is not so efficient when the y range or the x range is huge. But, I just wanted to give you an idea.
Another option would be to perform an extensive xnor.
1. For each index i in the range (m,n)
2. sum=1
3. k=Array1[i] xnor Array2[i];
4. k=k/((pow(2,number_of_bits))-1) //This will scale k down to a value between 0 and 1
5. sum=(sum+k)/2
Coefficient=sum
Is this helpful ?
You can define a distance metric for two vectors A and B of length N containing numbers in the interval [-1, 1] e.g. as
sum = 0
for i in 0 to N - 1:
    d = (A[i] - B[i])^2   // this is in range 0 .. 4
    sum = sum + d
sum = (sum / 4) / N       // now in range 0 .. 1
This now returns distance 1 for vectors that are completely opposite (one is all 1, another all -1), and 0 for identical vectors.
You can translate this into your coefficient by
coeff = 1 - sum
However, this is a crude approach because it does not take into account the fact that there could be horizontal distortion or shift between the signals you want to compare, so let's look at some approaches for coping with that.
You can sort both your arrays (e.g. in ascending order) and then calculate the distance / coefficient. This returns more similarity than the original metric, and is agnostic towards permutations / shifts of the signal.
You can also calculate the differentials and calculate distance / coefficient for those, and then you can do that sorted also. Using differentials has the benefit that it eliminates vertical shifts. Sorted differentials eliminate horizontal shift but still recognize different shapes better than sorted original data points.
You can then e.g. average the different coefficients. Here is more complete code. The routine below calculates the coefficient for arrays A and B of a given size, taking d differentials (recursively) first. If sorted is true, the final (differentiated) arrays are sorted.
procedure calc(A, B, size, d, sorted):
    if (d > 0):
        A' = new array[size - 1]
        B' = new array[size - 1]
        for i in 0 to size - 2:
            A'[i] = (A[i + 1] - A[i]) / 2 // keep in range -1..1 by dividing by 2
            B'[i] = (B[i + 1] - B[i]) / 2
        return calc(A', B', size - 1, d - 1, sorted)
    else:
        if (sorted):
            A = sort(A)
            B = sort(B)
        sum = 0
        for i in 0 to size - 1:
            sum = sum + (A[i] - B[i]) * (A[i] - B[i])
        sum = (sum / 4) / size
        return 1 - sum // return the coefficient

procedure similarity(A, B, size):
    a = 0
    a = a + calc(A, B, size, 0, false)
    a = a + calc(A, B, size, 0, true)
    a = a + calc(A, B, size, 1, false)
    a = a + calc(A, B, size, 1, true)
    return a / 4 // take average
For something completely different, you could also run Fourier transform using FFT and then take a distance metric on the returning spectra.
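A tiny sketch of that last idea, again in MATLAB notation (A and B assumed to be equal-length vectors; comparing unit-normalised magnitude spectra makes the measure insensitive to shifts):
SA = abs(fft(A(:)));  SA = SA/(norm(SA)+eps);   % unit-norm magnitude spectrum of A
SB = abs(fft(B(:)));  SB = SB/(norm(SB)+eps);
d = sum((SA-SB).^2)/4;                          % in [0,1]; 0 for identical spectra
coeff = 1 - d;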

how to read all 1's in an Array of 1's and 0's spread all over the array randomly

I have an array with 1's and 0's spread over it randomly.
int arr[N] = {1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,1....................N}
Now I want to retrieve all the 1's in the array as fast as possible, but the condition is that I should not lose the exact (index-based) position in the array, so the sorting option is not valid.
So the only option left is linear searching, i.e. O(n); is there anything better than this?
The main problem with a linear scan is that I need to run the scan X times. So I feel I need some other data structure which maintains this list once the first linear scan happens, so that I need not run the linear scan again and again.
Let me be clear about the final expectation:
I just need to find the number of 1's in a certain range of the array; for example, the number of 1's within the index range 40-100. The range can be arbitrary, and I need to find the count of 1's within it. I can't just keep a single sum, because the ranges differ and I would have to iterate over the array again and again.
I'm surprised you considered sorting as a faster alternative to linear search.
If you don't know where the ones occur, then there is no better way than linear searching. Perhaps if you used bits or char datatypes you could do some optimizations, but it depends on how you want to use this.
The best optimization that you could do on this is to overcome branch prediction. Because each value is zero or one, you can use it to advance the index of the array that is used to store the one-indices.
Simple approach:
int end = 0;
int indices[N];
for( int i = 0; i < N; i++ )
{
if( arr[i] ) indices[end++] = i; // Slow due to branch prediction
}
Without branching:
int end = 0;
int indices[N];
for( int i = 0; i < N; i++ )
{
indices[end] = i;
end += arr[i];
}
[edit] I tested the above, and found the version without branching was almost 3 times faster (4.36s versus 11.88s for 20 repeats on a randomly populated 100-million element array).
Coming back here to post results, I see you have updated your requirements. What you want is really easy with a dynamic programming approach...
All you do is create a new array that is one element larger, which stores the number of ones from the beginning of the array up to (but not including) the current index.
arr   :   1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 1
count : 0 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 5 6 6 6 6 7
(I've offset arr above so it lines up better)
Now you can compute the number of 1s in any range in O(1) time. To compute the number of 1s between index A and B, you just do:
int num = count[B+1] - count[A];
Obviously you can still use the non-branch-prediction version to generate the counts initially. All this should give you a pretty good speedup over the naive approach of summing for every query:
int *count = new int[N+1];
int total = 0;
count[0] = 0;
for( int i = 0; i < N; i++ )
{
total += arr[i];
count[i+1] = total;
}
// to compute the ranged sum:
int range_sum( int *count, int a, int b )
{
if( b < a ) return range_sum( count, b, a );
return count[b+1] - count[a];
}
Well, one-time linear scanning is fine. Since you are looking for multiple scans across ranges of the array, I think each range query can be done in constant time. Here you go:
Scan the array and create a map keyed by the array position (1,2,3,4,5,6,...). The value stored would be a tuple <isOne, cumulativeSum>, where isOne says whether there is a one at that position and cumulativeSum is the running count of 1's as and when you encounter them.
Array = 1 1 0 0 1 0 1 1 1 0 1 0
Tuple: (1,1) (1,2) (0,2) (0,2) (1,3) (0,3) (1,4) (1,5) (1,6) (0,6) (1,7) (0,7)
CASE 1: When lower bound of cumulativeSum has a 0. Number of 1's [6,11] =
cumulativeSum at 11th position - cumulativeSum at 6th position = 7 - 3 = 4
CASE 2: When lower bound of cumulativeSum has a 1. Number of 1's [2,11] =
cumulativeSum at 11th position - cumulativeSum at 2nd position + 1 = 7-2+1 = 6
Step 1 is O(n)
Step 2 is O(1)
The total complexity is linear, no doubt, but for your task, where you have to query ranges several times, the above algorithm seems better if you have ample memory :)
Does it have to be a simple linear array data structure? Or can you create your own data structure which happens to have the desired properties, for which you're able to provide the required API, but whose implementation details can be hidden (encapsulated)?
If you can implement your own and if there is some guaranteed sparsity (of either 1s or 0s) then you might be able to offer better than linear performance. I see that you want to preserve (or be able to regenerate) the exact stream, so you'll have to store an array or bitmap or run-length encoding for that. (RLE will be useless if the stream is actually random rather than arbitrary, but could be quite useful if there is significant sparsity or there are patterns with long strings of one or the other. For example, a black&white raster of a bitmapped image is often a good candidate for RLE.)
Let's say that you're guaranteed that the stream will be sparse - that no more than 10%, for example, of the bits will be 1s (or, conversely, that more than 90% will be). If that's the case then you might model your solution on an RLE and maintain a count of all 1s (simply incremented as you set bits and decremented as you clear them). If there might be a need to quickly get the number of set bits for arbitrary ranges of these elements, then instead of a single counter you can have a conveniently sized array of counters for partitions of the stream. (Conveniently sized, in this case, means something which fits easily within memory, within your caches or register sets, but which offers a reasonable trade-off between computing a sum over all the partitions fully within the range and the linear scan.) The result for any arbitrary range is the sum of all the partitions fully enclosed by the range plus the results of linear scans for any fragments that are not aligned on your partition boundaries.
For a very, very, large stream you could even have a multi-tier "index" of partition sums --- traversing from the largest (most coarse) granularity down toward the "fragments" to either end (using the next layer of partition sums) and finishing with the linear search of only the small fragments.
Obviously such a structure represents trade offs between the complexity of building and maintaining the structure (inserting requires additional operations and, for an RLE, might be very expensive for anything other than appending/prepending) vs the expense of performing arbitrarily long linear search/increment scans.
If:
the purpose is to be able to find the number of 1s in the array at any time,
given that relatively few of the values in the array might change between one moment when you want to know the number and another moment, and
if you have to find the number of 1s in a changing array of n values m times,
... you can certainly do better than examining every cell in the array m times by using a caching strategy.
The first time you need the number of 1s, you certainly have to examine every cell, as others have pointed out. However, if you then store the number of 1s in a variable (say sum) and track changes to the array (by, for instance, requiring that all array updates occur through a specific update() function), every time a 0 is replaced in the array with a 1, the update() function can add 1 to sum and every time a 1 is replaced in the array with a 0, the update() function can subtract 1 from sum.
Thus, sum is always up-to-date after the first time that the number of 1s in the array is counted and there is no need for further counting.
(EDIT to take the updated question into account)
If the need is to return the number of 1s in a given range of the array, that can be done with a slightly more sophisticated caching strategy than the one I've just described.
You can keep a count of the 1s in each subset of the array and update the relevant subset count whenever a 0 is changed to a 1 or vice versa within that subset. Finding the total number of 1s in a given range within the array would then be a matter of adding the number of 1s in each subset that is fully contained within the range and then counting the number of 1s that are in the range but not in the subsets that have already been counted.
Depending on circumstances, it might be worthwhile to have a hierarchical arrangement in which (say) the number of 1s in the whole array is at the top of the hierarchy, the number of 1s in each 1/q th of the array is in the second level of the hierarchy, the number of 1s in each 1/(q^2) th of the array is in the third level of the hierarchy, etc. e.g. for q = 4, you would have the total number of 1s at the top, the number of 1s in each quarter of the array at the second level, the number of 1s in each sixteenth of the array at the third level, etc.
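A rough sketch of the single-level subset-count idea above, written in MATLAB-style notation with my own names (arr is the 0/1 vector of length N):
B = 64;                                        % partition (block) size
blockOf = @(p) floor((p-1)/B) + 1;             % which block a position belongs to
blockCount = accumarray(blockOf(1:N).', arr(:), [ceil(N/B) 1]);   % ones per block, one pass
% O(1) update when arr(p) is changed to v:
%   blockCount(blockOf(p)) = blockCount(blockOf(p)) + v - arr(p);  arr(p) = v;
% query: number of ones in positions a..b
lo = blockOf(a);  hi = blockOf(b);
if lo == hi
    cnt = sum(arr(a:b));                       % range inside a single block
else
    cnt = sum(arr(a:lo*B)) + sum(blockCount(lo+1:hi-1)) + sum(arr((hi-1)*B+1:b));
end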
Are you using C (or a derived language)? If so, can you control the encoding of your array? If, for example, you could use a bitmap, you could count with a lookup table: the nice thing about a bitmap is that you can use a lookup table to sum the counts, though if your subrange ends aren't divisible by 8 you'll have to deal with the partial end bytes specially, but the speedup will be significant.
If that's not the case, can you at least encode them as single bytes? In that case, you may be able to exploit sparseness if it exists (more specifically, the hope that there are often multi-index swaths of zeros).
So for:
u8 input[N] = {1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,1....................N};
You can write something like (untested):
uint countBytesBy1FromTo(u8 *input, uint start, uint stop)
{   // function for counting one byte at a time, use with ranges of less than 4;
    // use the function below for longer ranges
    // assume it's just ones and zeros, otherwise we have to test/branch
    uint sum = 0;
    u8 *end = input + stop;
    for (u8 *each = input + start; each < end; each++)
        sum += *each;
    return sum;
}

uint countBytesBy8FromTo(u8 *input, uint start, uint stop)
{
    u64 *chunks = (u64*)(input + start);
    u64 *end = chunks + ((stop - start) >> 3);     // whole 8-byte chunks within the range
    // count the partial tail (fewer than 8 bytes) one byte at a time
    uint sum = countBytesBy1FromTo((u8*)end, 0, (uint)((input + stop) - (u8*)end));
    for (; chunks < end; chunks++)
    {
        if (*chunks)                               // skip 8-byte blocks that are all zero
        {
            sum += countBytesBy1FromTo((u8*)chunks, 0, 8);
        }
    }
    return sum;
}
The basic trick is exploiting the ability to cast slices of your target array to single entities your language can look at in one swoop, and to test in one comparison whether all of the values in a slice are zero, skipping the whole block if they are. The more zeros, the better it will work. In the case where your large cast integer always has at least one 1, this approach just adds overhead. You might find that using a u32 is better for your data, or that adding a u32 test between the 1 and 8 helps. For datasets where zeros are much more common than ones, I've used this technique to great advantage.
Why is sorting invalid? You can clone the original array, sort the clone, and count and/or mark the locations of the 1s as needed.

Can we compute this in less than O(n*n) ...( nlogn or n)

This is a question asked to me by a very very famous MNC. The question is as follows ...
Input: a 2D N*N array of 0's and 1's. If A(i,j) = 1, then all the values in the ith row and the jth column become 1. If there is a 1 already, it remains a 1.
As an example , if we have the array
1 0 0 0 0
0 1 1 0 0
0 0 0 0 0
1 0 0 1 0
0 0 0 0 0
we should get the output as
1 1 1 1 1
1 1 1 1 1
1 1 1 1 0
1 1 1 1 1
1 1 1 1 0
The input matrix is sparsely populated.
Is this possible in less than O(N^2)?
Another condition was that no additional space is provided. I would like to know if there's a way to achieve the complexity using space <= O(N).
P.S: I don't need answers that give me a complexity of O(N*N). This is not a homework problem. I have tried a lot and couldn't get a proper solution, and thought I could get some ideas here. Leave the printing aside when counting the complexity.
My rough idea was to maybe dynamically reduce the number of elements traversed, restricting them to around 2N or so, but I couldn't work out a proper approach.
In the worst case, you may need to toggle N * N - N bits from 0 to 1 to generate the output. It would seem you're pretty well stuck with O(N*N).
I would imagine that you can optimize it for the best case, but I'm tempted to say that your worst case is still O(N*N): Your worst case will be an array of all 0s, and you will have to examine every single element.
The optimization would involve skipping a row or column as soon as you found a "1" (I can provide details, but you said you don't care about O(N*N)), but unless you have metadata to indicate that an entire row/column is empty, or unless you have a SIMD-style way to check multiple fields at once (say, if every row is aligned by 4 and you can read 32 bits worth of data, or if your data is in the form of a bitmask), you will always have to deal with the problem of an all-zero array.
Clearly, neither the output matrix nor its negated version has to be sparse (take a matrix with half of the first row set to 1 and everything else set to 0 to see this), so the time depends on what format you are allowed to use for the output. (I'm assuming the input is a list of elements or something equivalent, since otherwise you couldn't take advantage of the matrix being sparse.)
A simple solution for O(M+N) space and time (M is the number of ones in the input matrix): take two arrays of length N filled with ones, iterate through all ones in the input, and for each one drop (zero out) its X coordinate in the first array and its Y coordinate in the second one. The output is the two arrays, which clearly define the result matrix: its (X,Y) coordinate is 0 iff the X-th entry of the first array and the Y-th entry of the second are both still 1.
Update: depending on the language, you could use some trickery to return a normal 2D array by referencing the same row multiple times. For example in PHP:
// compute N-length arrays $X and $Y which have 1 at the column
// and row positions which contain at least one 1 in the input matrix
// this is O(M+N)
$result = array();
$row_one = array_fill(0,N,1);
for ($i=0; $i<N; $i++) {
if ($Y[$i]) {
$result[$i] = &$row_one;
} else {
$result[$i] = &$X;
}
}
return $result;
Of course this is a normal array only as long as you don't try to write to it.
Since every entry of the matrix has to be checked, your worst case is always going to be N*N.
With a small 2*N extra storage, you can perform the operation in O(N*N). Just create a mask for each row and another for each column - scan the array and update the masks as you go. Then scan again to populate the result matrix based on the masks.
If you're doing something where the input matrix is changing, you could store a count of non-zero entries for each row and column of the input (rather than a simple mask). Then when an entry in the input changes, you update the counts accordingly. At that point, I would drop the output matrix entirely and query the masks/counts directly rather than even maintaining the output matrix (which could also be updated as things change in less than N*N time if you really wanted to keep it around). So loading the initial matrix would still be O(N*N) but updates could be much less.
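Just to make the mask idea concrete, in MATLAB-style notation (my own sketch, not part of the answer above; A is the 0/1 input matrix):
row_mask = any(A,2);                              % true for rows that contain a 1
col_mask = any(A,1);                              % true for columns that contain a 1
out = double(bsxfun(@or,row_mask,col_mask));      % out(i,j) = 1 iff row i or column j had a 1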
The input matrix may be sparse, but unless you can get it in a sparse format (i.e. a list of (i,j) pairs that are initially set), just reading your input will consume Ω(n^2) time. Even with sparse input, it's easy to end up with O(n^2) output to write. As a cheat, if you were allowed to output a list of set rows and set columns, then you could get down to linear time. There's no magic to be had when your algorithm actually has to produce a result more substantial than 'yes' or 'no'.
Mcdowella's comment on another answer suggests another alternative input format: run-length encoding. For a sparse input, that clearly requires no more than O(n) time to read it (consider how many transitions there are between 0 and 1). However, from there it breaks down. Consider an input matrix structured as follows:
0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 . . .
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 . . .
. .
. .
. .
That is, alternating 0 and 1 on the first row, 0 everywhere else. Clearly sparse, since there are n/2 ones in total. However, the RLE output has to repeat this pattern in every row, leading to O(n^2) output.
You say:
we should get the output as...
So you need to output the entire matrix, which has N^2 elements. This is O(N*N).
The problem itself is not O(N*N): you don't have to compute and store the entire matrix: you only need two vectors, L and C, each of size N:
L[x] is 1 if line x is a line of ones, 0 otherwise;
C[x] is 1 if column x is a column of ones, 0 otherwise.
You can construct these vectors in O(N), because the initial matrix is sparse; your input data will not be a matrix, but a list containing the coordinates(line,column) of each non-zero element. While reading this list, you set L[line]=1 and C[column]=1, and the problem is solved: M[l,c] == 1 if L[l]==1 OR C[c]==1
Hi guys,
thanks to the comment from mb14 I think I could get it solved in less than O(N*N) time...
The worst case would take O(N*N)...
Actually, suppose we have the given array
1 0 0 0 1
0 1 0 0 0
0 1 1 0 0
1 1 1 0 1
0 0 0 0 0
Let's have 2 arrays of size N (this would be the worst case)... One is dedicated to indexing rows and the other to columns...
Put the i with a[i][1] = 0 in one array and the j with a[1][j] = 0 in the other...
Then take only those values and check the second row and column... In this manner, we get the rows and columns where there are only 0's entirely...
The number of values in the row array gives the number of 0's in the result array, and the points a[row-array value][column-array value] give you those points...
We could solve it in below O(N*N), and the worst is O(N*N)... As we can see, the arrays (of size N) diminish...
I did this for a few arrays and got the result for all of them... :)
Please correct me if I am wrong anywhere...
Thanks for all your comments guys... You are all very helpful and I did learn quite a few things along the way... :)
There is clearly up to O(N^2) work to do. In the matrix
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
all bits have to be set to 1, and N*(N-1) are not set to one (20, in this 5x5 case).
Conversely, you can come up with an algorithm that always does it in O(N^2) time: sum along the top row and left column, and if the row or column gets a nonzero answer, fill in the entire row or column; then solve the smaller (N-1)x(N-1) problem.
So there exist cases that must take at least N^2 and any case can be solved in N^2 without extra space.
If your matrix is sparse, the complexity depends very much on the input encoding; in particular it is not well measured in terms of N or N^2 or something like that, but rather in terms of N, your input size M_in and your output size M_out. I'd expect something like O(N + M_in + M_out), but much depends on the encoding and the tricks that you can play with it.
That depends entirely on your input data structure. If you pass your matrix (1s and 0s) as a 2D array you need to traverse it, and that is O(N^2). But as your data is sparse, if you only pass in the 1's, you can do it so the output is O(M), where M is not the number of cells but the number of 1 cells. It would be something similar to this (pseudocode below):
list f(list l) {
list rows_1;
list cols_1;
for each elem in l {
rows_1[elem.row] = 1;
cols_1[elem.col] = 1;
}
list result;
for each row in rows_1 {
for each col in cols_1 {
if (row == 1 || col == 1) {
add(result, new_elem(row, col));
}
}
}
return result;
}
Don't fill the center of the matrix when you're checking values. As you go through the elements, when you have 1 set the corresponding element in the first row and the first column. Then go back and fill down and across.
edit: Actually, this is the same as Andy's.
It depends on your data structure.
There are only two possible cases for rows:
A row i is filled with 1's if there is an element (i,_) in the input
All other rows are the same: i.e. the j-th element is 1 iff there is an element (_,j) in the input.
Hence the result could be represented compactly as an array of references to rows. Since we only need two rows the result would also only consume O(N) memory. As an example this could be implemented in python as follows:
def f(element_list, N):
    A = [1]*N
    B = [0]*N
    M = [B]*N
    for row, col in element_list:
        M[row] = A
        B[col] = 1
    return M
A sample call would be
f([(1,1),(2,2),(4,3)],5)
with the result
[[0, 1, 1, 1, 0], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
The important point is that the arrays are not copied here, i.e. M[row]=A is just an assignment of a reference. Hence the complexity is O(N+M), where M is the length of the input.
#include<stdio.h>
#include<conio.h>   /* for getch() */
int main()
{
int arr[5][5] = { {1,0,0,0,0},
{0,1,1,0,0},
{0,0,0,0,0},
{1,0,0,1,0},
{0,0,0,0,0} };
int var1=0,var2=0,i,j;
for(i=0;i<5;i++)
var1 = var1 | arr[0][i];
for(i=0;i<5;i++)
var2 = var2 | arr[i][0];
for(i=1;i<5;i++)
for(j=1;j<5;j++)
if(arr[i][j])
arr[i][0] = arr[0][j] = 1;
for(i=1;i<5;i++)
for(j=1;j<5;j++)
arr[i][j] = arr[i][0] | arr[0][j];
for(i=0;i<5;i++)
arr[0][i] = arr[0][i] | var1;   /* keep the column flags gathered above, OR in the first-row flag */
for(i=0;i<5;i++)
arr[i][0] = arr[i][0] | var2;   /* keep the row flags, OR in the first-column flag */
for(i=0;i<5;i++)
{
printf("\n");
for(j=0;j<5;j++)
printf("%d ",arr[i][j]);
}
getch();
}
This program makes use of only 4 temporary variables (var1, var2, i and j) and hence runs in constant space with time complexity O(n^2). I think it is not possible at all to solve this problem in less than O(n^2).
