Weird NetLogo behavior with big agentset and matrix extension - matrix

I’m running a NetLogo model with around 120,000 turtles. At some point while the program is running, NetLogo changes one entry of a matrix to a negative value. It always happens at the same entry, but the time and the value differ. Normally nothing should change in this matrix, and when I run the program with a reduced agentset, for example 100,000 turtles, everything works fine and the matrix is not changed.
Does anyone know why this is happening and perhaps has an answer to this issue?
Hi everyone, this is the code snippet where the failure occurs:
set mxH2_seg1_tpa matrix:times-element-wise mx_seg1_tpa mxH1_av
set mxH2_seg1_tpb matrix:times-element-wise mx_seg1_tpb mxH1_av
set mxH2_r_seg1_tp matrix:times-element-wise mx_r_seg1_tp mxH1_av
set row 0
set column 0 ;At this point everything is fine
while [row <= 9] [
  while [column <= 24] [
    if matrix:get mxH2_seg1_tpa column row != 0 [matrix:set mxH_seg1_tpa column row (ln matrix:get mxH2_seg1_tpa column row)]
    if matrix:get mxH2_seg1_tpb column row != 0 [matrix:set mxH_seg1_tpb column row (ln matrix:get mxH2_seg1_tpb column row)]
    ;After here matrix mx_r_seg1_tp is changed and partly filled with strange values
    if matrix:get mxH2_r_seg1_tp column row != 0 [matrix:set mxH_r_seg1_tp column row (ln matrix:get mxH2_r_seg1_tp column row)]
    set column column + 1
  ]
  set column 0
  set row row + 1
]
The complete code is already very long, so if the mistake is somewhere in there I need some advice on what to look for.

Try setting the random seed (to any number). Then run your simulation and see when it fails, which turtle is involved, and so on. Then run it again, stop it at the tick before the failure, print out the variables, and inspect the turtle that is about to cause the failure.

This line:
if matrix:get mxH2_r_seg1_tp column row != 0 [matrix:set mxH_r_seg1_tp column row (ln matrix:get mxH2_r_seg1_tp column row)]
sets an element of mxH_r_seg1_tp to the result of ln applied to an element of mxH2_r_seg1_tp. If you take the logarithm of a number less than 1 and greater than 0, the result will be a negative number. Is that what's happening? You can use #JenB's advice to check whether mxH2_r_seg1_tp has elements < 1 just before the failure.
Note that repeatedly taking the logarithm of the result of taking a logarithm will eventually produce a number less than 1. Perhaps it's also relevant that the first three lines of the posted code multiply matrix elements repeatedly, since repeated multiplication can generate numbers less than 1 as well. Or perhaps the multiplications cause the matrix elements to increase in magnitude, while taking the log decreases their magnitude, and it takes many ticks before any element is < 1? Since there are several matrices involved in repeated calculations, it's hard to know whether one of these factors might be relevant, but you will be able to check.
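For intuition, here is a quick check of the arithmetic (in Python rather than NetLogo; just a sketch illustrating why repeated multiplication followed by ln can produce negative entries):
import math

x = 0.9
for step in range(5):
    x = x * 0.9                  # repeated multiplication pushes x further below 1
    print(step, x, math.log(x))  # the log of a value in (0, 1) is negative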

After reading through the complete code I found my mistake. At one specific point I forgot to use matrix:copy. I am still not sure how the size of the agentset influences this problem, but I do not think it is an implementation bug. Thank you very much for your support; your tips helped me to find my mistake. So if anyone has a similar problem, check that matrix:copy was used appropriately. Cheers, Jan

Just an update on Jan's final answer: I encountered a similar issue with NetLogo 5.1 and the updated matrix extension. It seems that matrix:times-scalar will not only report the resulting matrix, but will also modify the original one. This can be demonstrated using this code snippet (enter it in the Command Center):
(let m matrix:from-row-list [[1 2 3] [4 5 6]]) (let m2 (matrix:times-scalar m -1)) (print m) (print m2)
Correct result (given by NetLogo 5.0.5) should be:
{{matrix: [ [ 1 2 3 ][ 4 5 6 ] ]}}
{{matrix: [ [ -1 -2 -3 ][ -4 -5 -6 ] ]}}
Wrong result (given by NetLogo 5.1 and 5.2RC3):
{{matrix: [ [ -1 -2 -3 ][ -4 -5 -6 ] ]}}
{{matrix: [ [ -1 -2 -3 ][ -4 -5 -6 ] ]}}
This is most probably a bug, because matrix:times-scalar is defined as a reporter and not a command (i.e. you cannot just do matrix:times-scalar m -1 on its own), and according to the NetLogo documentation, reporters should report (return) their results without modifying their inputs.
Only matrix:times-scalar is affected. To work around this issue, you can use matrix:times (which supports scalars as input since NetLogo 5.1), you can matrix:copy your matrix before using matrix:times-scalar, or you can use NetLogo 5.0.5, which is not affected by the issue (but newer primitives such as matrix:map won't be available).
For more info, see: https://github.com/NetLogo/Matrix-Extension/issues/12


Split array into four boxes such that sum of XOR's of the boxes is maximum

Given an array of integers, split it into four boxes such that the sum of the XORs of the boxes is maximum.
I/P -- [1,2,1,2,1,2]
O/P -- 9
Explanation: Box1--[1,2]
Box2--[1,2]
Box3--[1,2]
Box4--[]
I've tried using recursion but failed on larger test cases because the time complexity is exponential. I'm expecting a solution using dynamic programming.
def max_Xor(b1, b2, b3, b4, A, index, size):
    # b1..b4 hold the running XOR of each box; try A[index] in every box.
    if index == size:
        return b1 + b2 + b3 + b4
    m = max(max_Xor(b1 ^ A[index], b2, b3, b4, A, index + 1, size),
            max_Xor(b1, b2 ^ A[index], b3, b4, A, index + 1, size),
            max_Xor(b1, b2, b3 ^ A[index], b4, A, index + 1, size),
            max_Xor(b1, b2, b3, b4 ^ A[index], A, index + 1, size))
    return m

def main():
    A = [1, 2, 1, 2, 1, 2]
    print(max_Xor(0, 0, 0, 0, A, 0, len(A)))
Thanks in Advance!!
There are several things to speed up your algorithm:
Build in some start-up logic: it doesn't make sense to put anything into box 3 until boxes 1 & 2 are differentiated. In fact, you should generally have an order of precedence to keep you from repeating configurations in a different order.
Memoize your logic; this avoids repeating computations.
For large cases, take advantage of what value algebra exists.
This last item may turn out to be the biggest saving. For instance, if your longest numbers include several 5-bit and 4-bit numbers, it makes no sense to consider shorter numbers until you've placed those decently in the boxes, gaining maximum advantage for the leading bits. With only four boxes, you cannot build a value from 3-bit numbers that dominates a single misplaced 5-bit number.
Your goal is to place an odd number of 5-bit numbers into 3 or all 4 boxes; against this, check only whether this "pessimizes" bit 4 of the remaining numbers. For instance, given six 5-bit numbers (range 16-31) and a handful of small ones (0-7), your first consideration is to handle only combinations that partition the 5-bit numbers by (3, 1, 1, 1), as this leaves that valuable fifth bit turned on in each box.
With a more even mixture of values in your input, you'll also need to consider how to distribute the 4-bit numbers for a similar "keep it odd" heuristic. Note that, as you work from largest to smallest, you only need to worry about keeping it odd and watching the following bit.
These techniques should let you prune your recursion enough to finish in time.
We can use dynamic programming here: break the problem into smaller sets, store their results in a table, then use the already stored results to calculate the answer for the bigger set.
For example:
Input -- [1,2,1,2,1,2]
We need to divide the array consecutively into 4 boxes such that the sum of the XORs of all boxes is maximised.
Let's take your test case, break the problem into smaller sets, and start solving for the smallest set.
box = 1, num = [1,2,1,2,1,2]
ans = 1 3 2 0 1 3
Since we only have one box, all numbers go into it. We will store this answer in a table; let's call the matrix DP.
DP[0] = [1 3 2 0 1 3]
DP[i][j] stores the answer for distributing the first j+1 numbers into i+1 boxes (indices are 0-based).
Now let's take the case where we have two boxes and add the numbers one by one.
num = [1]: since we only have one number, it goes into the first box.
DP[1][0] = 1
Let's add another number.
num = [1 2]
Now there are two ways to put this new number into a box.
case 1: 2 goes into the first box. Since we already have the answer
for both numbers in one box, we just use that:
answer = DP[0][1] + 0 (the second box is empty)
case 2: 2 goes into the second box:
answer = DP[0][0] + 2 (only 2 is present in the second box)
Maximum of the two cases will be stored in DP[1][1].
DP[1][1] = max(3+0, 1+2) = 3.
Now for num = [1 2 1].
Again, for the new number we have three cases:
box1 = [1 2 1], box2 = [], DP[0][2] + 0
box1 = [1 2], box2 = [1], DP[0][1] + 1
box1 = [1 ], box2 = [2 1], DP[0][0] + 2^1
The maximum of these three is the answer for DP[1][2].
Similarly we can find the answer for num = [1 2 1 2 1 2] and box = 4; the rows of the DP table (1 to 4 boxes) are:
1 3 2 0 1 3
1 3 4 6 5 3
1 3 4 6 7 9
1 3 4 6 7 9
Also note that a xor b xor a = b. You can use this property (via prefix XORs) to get the XOR of any segment of the array in constant time, as suggested in the comments.
This way you can break the problem into smaller subsets and use the smaller answers to compute the bigger ones. Hope this helps. After understanding the concept you can go ahead and implement it in better than exponential time.
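Here is a minimal Python sketch of the consecutive-split DP described above (assuming, as this answer does, that the boxes are consecutive segments of the array; the function name and the prefix-XOR table are my own additions):
def max_xor_sum(nums, boxes=4):
    # prefix[j] is the XOR of nums[0:j], so the XOR of nums[i:j] is prefix[i] ^ prefix[j]
    n = len(nums)
    prefix = [0] * (n + 1)
    for i, v in enumerate(nums):
        prefix[i + 1] = prefix[i] ^ v
    NEG = float('-inf')
    # dp[b][j]: best sum of XORs when the first j numbers fill at most b consecutive boxes
    dp = [[NEG] * (n + 1) for _ in range(boxes + 1)]
    for b in range(boxes + 1):
        dp[b][0] = 0
    for b in range(1, boxes + 1):
        for j in range(1, n + 1):
            best = dp[b - 1][j]                        # box b stays empty
            for i in range(j):                         # nums[i:j] goes into box b
                best = max(best, dp[b - 1][i] + (prefix[i] ^ prefix[j]))
            dp[b][j] = best
    return dp[boxes][n]

print(max_xor_sum([1, 2, 1, 2, 1, 2]))  # 9, matching the example above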
I would go bit by bit from the highest bit to the lowest bit. For every bit, try all combinations that distribute the still unused numbers that have that bit set so that an odd number of them is in each box, nothing else matters. Pick the best path overall. One issue that complicates this greedy method is that two boxes with a lower bit set can equal one box with the next higher bit set.
Alternatively, memoize the boxes state in your recursion as an ordered tuple.
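For the second suggestion, here is a small sketch of memoizing the box state as a sorted tuple (the names are mine; it still visits many states, but far fewer than the plain recursion, because box order no longer matters):
from functools import lru_cache

def max_xor_memo(nums):
    n = len(nums)

    @lru_cache(maxsize=None)
    def go(index, boxes):
        if index == n:
            return sum(boxes)
        best = 0
        for i in range(4):
            updated = list(boxes)
            updated[i] ^= nums[index]
            # sorting merges states that differ only in the ordering of the boxes
            best = max(best, go(index + 1, tuple(sorted(updated))))
        return best

    return go(0, (0, 0, 0, 0))

print(max_xor_memo([1, 2, 1, 2, 1, 2]))  # 9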

How many times does a zero occur on an odometer

I am solving how many times a zero occurs on an odometer. I count +1 every time I see a zero.
10 -> +1
100 -> +2 because in 100 I see 2 zeros
10004 -> +3 because I see 3 zeros
So I get,
1 - 100 -> +11
1 - 500 -> +91
1 - 501 -> +92
0 - 4294967295-> +3825876150
I used rubydoctest for it. I am not doing anything with begin_number yet. Can anyone explain how to calculate it without a brute-force method?
I made many attempts. They work for numbers like 10, 1000, 10,000, 100,000,000, but not for numbers like 522 or 2280. If I run the rubydoctest, it fails on # >> algorithm_count_zero(1, 500)
# doctest: algorithm_count_zero(begin_number, end_number)
# >> algorithm_count_zero(1, 10)
# => 1
# >> algorithm_count_zero(1, 1000)
# => 192
# >> algorithm_count_zero(1, 10000000)
# => 5888896
# >> algorithm_count_zero(1, 500)
# => 91
# >> algorithm_count_zero(0, 4294967295)
# => 3825876150
def algorithm_count_zero(begin_number, end_number)
  power = Math::log10(end_number) - 1
  if end_number < 100
    return end_number/10
  else
    end_number > 100
    count = (9*(power)-1)*10**power+1
  end
  answer = ((((count / 9)+power)).floor) + 1
end
end_number = 20000
begin_number = 10000
puts "Algorithm #{algorithm_count_zero(begin_number, end_number)}"
As noticed in a comment, this is a duplicate of another question, where the solution gives you correct guidelines.
However, if you want to test your own solution for correctness, I'll put in here a one-liner in the parallel array processing language Dyalog APL (which, by the way, I think everyone modelling mathematics and numbers should use).
Using tryapl.org you'll be able to get a correct answer for any integer argument. TryAPL is a web page with a backend that executes simple APL code statements ("one-liners", which are very typical of the APL language and its extremely compact code).
The APL one-liner is here:
{+/(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 142857
Copy that and paste it into the edit row at tryapl.org, and press enter - you will quickly see an integer, which is the answer to your problem. In the code row above, you can see the argument rightmost; it is 142857 this time but you can change it to any integer.
As you have pasted the one-liner once, and executed it with Enter once, the easiest way to get it back for editing is to press [Up arrow]. This returns the most recently entered statement; then you can edit the number sitting rightmost (after the curly brace) and press Enter again to get the answer for a different argument.
Pasting the code row above will return 66765 - that many zeroes exist for 142857.
If you paste this 2 characters shorter row below, you will see the individual components of the result - the sum of these components make up the final result. You will be able to see a pattern, which possibly makes it easier to understand what happens.
Try for example
{(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 1428579376
0 100000000 140000000 142000000 142800000 142850000 142857000 142857900 142857930 142857937
... and see how the intermediate results contain segments of the argument 1428579376, starting from the left! There are as many intermediate results as there are digits in the argument (10 this time).
The result for 1428579376 will be 1239080767, ie. the sum of the 10 numbers above. This many zeroes appear in all numbers between 1 and 1428579376 :-).
Consider each odometer position separately. The position x places from the far right changes once every 10^x times. By looking at the numbers to its right, you know how long it will be until it next changes. It will then hold each value for 10^x times before changing, until it reaches the end of the range you are considering, when it will hold its value at that time for some number of times that you can work out given the value at the very end of the range.
Now you have a sequence of the form x...0123456789012...y where you know the length and you know the values of x and y. One way to count the number of 0s (or any other digit) within this sequence is to clip off the prefix from x up to just before the first 0, and clip off the suffix from just after the last 9 to y. Count the 0s in this suffix, and measure the length of the long run between prefix and suffix: its length is divisible by 10, and it contains each digit the same number of times.
Based on this you should be able to work out, for each position, how often within the range it will assume each of its 10 possible values. By summing up the values for 0 from each of the odometer positions you get the answer you want.
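As a sketch of this position-by-position counting (in Python rather than Ruby, and not the exact clipping procedure above, but the same idea of handling each decimal position separately):
def count_zero_digits(begin_number, end_number):
    # Zero digits written on the odometer for all readings from begin_number to end_number.
    def zeros_up_to(n):
        # Count zero digits in the decimal representations of 1..n.
        if n < 1:
            return 0
        count = 0
        power = 1                          # 10**x for the position being considered
        while power <= n:
            higher = n // (power * 10)     # value of the digits to the left of this position
            digit = (n // power) % 10      # digit at this position
            lower = n % power              # value of the digits to the right
            if digit == 0:
                # a zero here needs a non-zero left part: left parts 1..higher-1 each give
                # a full block of `power` readings, left part == higher gives lower + 1
                count += (higher - 1) * power + lower + 1
            else:
                count += higher * power
            power *= 10
        return count

    return zeros_up_to(end_number) - zeros_up_to(begin_number - 1)

print(count_zero_digits(1, 500))   # 91, matching the doctest above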

Matlab: Speed up loop applied to each of 820,000 elements

I have a set of rainfall data, with a value every 15 minutes over many years, giving 820,000 rows.
The aim (eventually) of my code is to create columns that categorise the data, which can then be used to extract relevant chunks of data for further analysis.
I am a Matlab novice and would appreciate some help!
The first steps I have got working sufficiently fast. However, some steps are very slow.
I have tried pre-allocating arrays, and using the lowest intX (8 or 16 depending on situation) possible, but other steps are so slow they don't complete.
The slow ones are for loops, but I don't know if they can be vectorised/split into chunks/anything else to speed them up.
I have a variable "rain" which contains a value for every time step/row.
I have created a variable called "state" of 0 if no rain, and 1 if there is rain.
Also a variable called "begin" which has 1 if it is the first row of a storm, and 0 if not.
The first slow loop is to create a "spell" variable - to give each rain storm a number.
% Generate blank column for spell of size (rain) - preallocate
spell = zeros(size(st),1,'int16');
% Start row for analysis
x = 1;
% Populate "spell" variable with a storm number in each row of rain, for the storm number
% it belongs to (storm number calculated by adding up the number of "begin" values up to that point)
for i = 1:size(state)
    if (state(x) == 1)
        spell(x) = sum(begin(1:x));
    end
    x = x + 1;
end
The next stage is about length of each storm. The first steps are fast enough.
% List of storm numbers
spellnum = unique(spell);
% Length of each spell
spelllength = histc(spell,spellnum);
The last step below (the for loop) is too slow and just crashes.
% Generate blank column for length
length = zeros(size(state),1,'int16');
% Starting row
x = 1;
% For loop to output the total length of the storm for each row of rain within that storm
for i = 1:size(state)
    for j = 1:size(state)
        position = find(spell == x);
        for k = 1:size(state)
            length(position) = spelllength(x+1);
        end
    end
    x = x + 1;
end
Is it possible to make this more efficient?
Apologies if examples already exist - I'm not sure what the process would be called!
Many thanks in advance.
Memory allocation/reallocation tips:
1. try to create the results directly from an expression (possibly trimming another, more general result);
2. if 1. is not possible, try to pre-allocate whenever possible (when you have an upper limit for the result);
3. if 2. is not possible, try to grow cell arrays rather than massive matrices (because a matrix requires a contiguous memory area).
Type-choice tips:
try to always use double for intermediate results, because it is the basic numeric data type in MATLAB; this avoids conversions back and forth;
use other types for intermediate results only if there's a memory constraint that can be alleviated by using a smaller-size type.
Linearisation tips:
fastest linearisation uses matrix-wise or element-wise basic algebraic operations combined with logical indexing.
loops are not that bad starting with MATLAB R2008;
the worst-performing element-wise processing functions are arrayfun, cellfun and structfun with anonymous functions, because anon functions evaluate the slowest;
try not to calculate the same things twice, even if this gives you better linearisation.
First block:
% Just calculate the entire cumulative sum over begin, then
% trim the result. Check if the cumsum doesn't overflow.
spell = cumsum(begin);
spell(state==0) = 0;
Second block:
% The same, not sure how could you speed this up; changed
% the name of variables to my taste, though.
spell_num = unique(spell);
spell_length = histc(spell,spell_num);
Third block:
% Fix the following issues:
% - the innermost "for" does not make sense because it rewrites
%   the same thing several times;
% - the loop variables of the three nested loops are never used;
%   a separate counter x does the indexing;
% - the name of the standard function "length" is obscured by declaring
%   a variable named "length".
for x = 1:numel(spell_num)
storm_selector = (spell==spell_num(x));
storm_length(storm_selector) = spell_length(x+1);
end;
The combination of code I ended up using is a mixture from #CST_Link and #Sifu. Thank you very much for your help! I don't think Stack Overflow lets me accept two answers, so for clarity I am putting it all together here; this is the code everyone helped me create!
The only slow part is the for loop in block three, but it still runs in a few minutes, which is good enough for me and infinitely better than my attempt.
First block:
%% Spell
%spell is cumulative sum of begin
spell = cumsum(begin);
%% start row
x=1;
%% Replace all rows of spell with no rain with 0
spell(state==0)=0
Second block (unchanged except better variable names):
%% Spell number = all values of spell
spell_num = unique(spell);
%% Spell length = how many of each value of spell
spell_length = histc(spell,spell_num);
Third block:
%% Generate blank column for spell of size (state)
spell_length2 = zeros(length(state),1);
%%
for x=1:length(state)
position = find(spell==x);
spell_length2(position) = spell_length(x+1);
end
For the first part, if I am following what you are doing,
I created some data matching your description for testing.
Please tell me if I missed something.
state=[ 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 0];
begin=[ 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0];
spell = zeros(length(state),1,'int16');
%Start row for analysis
x=1;
% Populate "spell" variable with a storm number in each row of rain, for the storm number it belongs to (storm number calculated by adding up the number of "begin" values up to that point
for i = 1:length(state)
    if (state(x) == 1)
        spell(x) = sum(begin(1:x));
    end
    x = x + 1;
end
% can be accomplished by simply using cumsum ( no need for extra variables if you are short in memory)
spell2=cumsum(begin);
spell3=spell2.*(state==1);
and the output for both spell and spell3 is shown below:
[spell.'; spell3]
0 0 0 0 0 1 1 1 1 1 0 2 0 0 2 0 3 3 3 3 0
0 0 0 0 0 1 1 1 1 1 0 2 0 0 2 0 3 3 3 3 0
Why don't you do this instead?
% For loop to output the total length of the storm for each row of rain within that storm
for x = 1:size(state)
    position = find(spell == x);
    length(position) = spelllength(x+1);
end
I replaced the i iterator with x; that removes 2 lines and some computation.
I then removed the two nested loops, as they were literally useless (each iteration would output the same thing).
That's already a good start..

octave matrix for loop performance

I am new to Octave. I have two matrices, each containing more than 5 variables (columns). I have to compare a particular column of one matrix with the other: if an element in column one of matrix A is equal to an element in column one of matrix B, then I have to use the third column of matrix B to compute certain values. I am doing this in Octave with a for loop, but it takes a lot of time to do the computation for a single day, and I have to do this for a whole year, because the matrices are very large. Please suggest an alternative way so that I can reduce the time and computation.
Thank you in advance.
Thanks for your quick response, hfs.
Continuation of the same problem:
Thank you, but this will work only if the elements in both rows are equal. For example, my matrices are like this:
A=[1 2 3;4 5 6;7 8 9;6 9 1]
B=[1 2 4; 4 2 6; 7 5 8;3 8 4]
Here column 1 of the first row of A is equal to column 1 of the first row of B, and so is the second column, hence I can take the third element of B. But for the second row, the elements in column 1 of A and B are equal while the elements in column 2 are different; in that case it should search for that element and print the corresponding element of the third column. I am doing this with a for loop, which is very slow because of the large dimensions. In my actual problem the for loop looks like this:
for k = 1:37651
  for j = 1:26018
    if (s(k,1:2) == l(j,1:2))
      z = sin((90 - s(k,3))*pi/180), break, end
  end
end
I want an alternative way to do this which should be faster than this.
You should work with complete matrices or vectors whenever possible. You should try commands and inspect intermediate results in the interactive shell to see how they fit together.
A(:,1)
selects the first column of a matrix. You can compare matrices/vectors and the result is a matrix/vector of 0/1 again:
> A(:,1) == B(:,1)
ans =
1
1
0
If you assign the result you can use it again to index into matrices:
I = A(:,1) == B(:,1)
B(I, 3)
This selects the third column of B of those rows where the first column of A and B is equal.
I hope this gets you started.
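If it helps to see the two-column case from the question expressed without loops, here is a sketch in Python/NumPy (the tiny s and l matrices are just placeholder data; in Octave you could get the same row matching with ismember(s(:,1:2), l(:,1:2), 'rows')):
import numpy as np

# placeholder data shaped like the question's s and l matrices
s = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [6, 9, 1]], dtype=float)
l = np.array([[1, 2, 4], [4, 2, 6], [7, 5, 8], [3, 8, 4]], dtype=float)

# compare the first two columns of every row of s against every row of l
pairwise = (s[:, None, :2] == l[None, :, :2]).all(axis=2)  # (rows of s) x (rows of l)
has_match = pairwise.any(axis=1)                           # rows of s matched somewhere in l
z = np.sin((90 - s[has_match, 2]) * np.pi / 180)
print(z)

For matrices as large as 37651 x 26018 the pairwise comparison array gets big, so sorting or matching on a merged key is preferable, but the principle of replacing the loops with whole-matrix operations is the same.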

Can we compute this in less than O(n*n) ...( nlogn or n)

This is a question asked to me by a very famous MNC. The question is as follows:
Input is a 2D N*N array of 0's and 1's. If A(i,j) = 1, then all the values in the ith row and the jth column become 1. If there is a 1 already, it remains a 1.
As an example , if we have the array
1 0 0 0 0
0 1 1 0 0
0 0 0 0 0
1 0 0 1 0
0 0 0 0 0
we should get the output as
1 1 1 1 1
1 1 1 1 1
1 1 1 1 0
1 1 1 1 1
1 1 1 1 0
The input matrix is sparsely populated.
Is this possible in less than O(N^2)?
Another condition was that no additional space is provided. I would like to know if there's a way to achieve the complexity using space <= O(N).
P.S.: I don't need answers that give me a complexity of O(N*N). This is not a homework problem. I have tried a lot and couldn't get a proper solution, and thought I could get some ideas here. Leave the printing aside for the complexity.
My rough idea was to maybe dynamically eliminate the number of elements traversed, restricting them to around 2N or so. But I couldn't get a proper idea.
In the worst case, you may need to toggle N * N - N bits from 0 to 1 to generate the output. It would seem you're pretty well stuck with O(N*N).
I would imagine that you can optimize it for the best case, but I'm tempted to say that your worst case is still O(N*N): Your worst case will be an array of all 0s, and you will have to examine every single element.
The optimization would involve skipping a row or column as soon as you found a "1" (I can provide details, but you said you don't care about O(N*N) solutions). However, unless you have metadata to indicate that an entire row/column is empty, or a SIMD-style way to check multiple fields at once (say, if every row is aligned by 4 and you can read 32 bits' worth of data, or if your data is in the form of a bitmask), you will always have to deal with the problem of an all-zero array.
Clearly, neither the output matrix nor its negated version has to be sparse (take a matrix with half of the first row set to 1 and everything else 0 to see this), so the time depends on what format you are allowed to use for the output. (I'm assuming the input is a list of elements or something equivalent, since otherwise you couldn't take advantage of the matrix being sparse.)
A simple solution in O(M+N) space and time (M is the number of ones in the input matrix): take two arrays of length N filled with ones, iterate through all ones in the input, and for each one zero out its X coordinate in the first array and its Y coordinate in the second one. The output is the two arrays, which clearly define the result matrix: its (X,Y) coordinate is 0 iff both the X coordinate of the first array and the Y coordinate of the second are still 1.
Update: depending on the language, you could use some trickery to return a normal 2D array by referencing the same row multiple times. For example in PHP:
// compute N-length arrays $X and $Y which have 1 at the column
// and row positions that contained a 1 in the input matrix
// (building them from the input is O(M+N))
$result = array();
$row_one = array_fill(0, N, 1);
for ($i = 0; $i < N; $i++) {
    if ($Y[$i]) {
        // this row contained a 1, so the whole output row is ones
        $result[$i] = &$row_one;
    } else {
        // otherwise the output row is 1 exactly where some column had a 1
        $result[$i] = &$X;
    }
}
return $result;
Of course this is a normal array only as long as you don't try to write it.
Since every entry of the matrix has to be checked, your worst case is always going to be N*N.
With a small 2*N extra storage, you can perform the operation in O(N*N). Just create a mask for each row and another for each column - scan the array and update the masks as you go. Then scan again to populate the result matrix based on the masks.
If you're doing something where the input matrix is changing, you could store a count of non-zero entries for each row and column of the input (rather than a simple mask). Then, when an entry of the input changes, you update the counts accordingly. At that point I would drop the output matrix entirely and query the masks/counts directly, rather than even maintaining the output matrix (which could also be updated in less than N*N time as things change, if you really wanted to keep it around). So loading the initial matrix would still be O(N*N), but updates could be much less.
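A small sketch of the two-pass mask idea in Python (the function name is mine; the masks are the 2*N extra storage and the two scans are the O(N*N) work):
def fill_rows_and_cols(a):
    # a is an N x N list of lists of 0/1; returns the filled matrix
    n = len(a)
    row_mask = [0] * n
    col_mask = [0] * n
    # pass 1: record which rows and columns contain a 1
    for i in range(n):
        for j in range(n):
            if a[i][j]:
                row_mask[i] = 1
                col_mask[j] = 1
    # pass 2: a cell becomes 1 if its row or its column was marked
    return [[1 if row_mask[i] or col_mask[j] else 0 for j in range(n)]
            for i in range(n)]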
The input matrix may be sparse, but unless you can get it in a sparse format (i.e. a list of (i,j) pairs that are initially set), just reading your input will consume Ω(n^2) time. Even with sparse input, it's easy to end up with O(n^2) output to write. As a cheat, if you were allowed to output a list of set rows and set columns, then you could get down to linear time. There's no magic to be had when your algorithm actually has to produce a result more substantial than 'yes' or 'no'.
Mcdowella's comment on another answer suggests another alternative input format: run-length encoding. For a sparse input, that clearly requires no more than O(n) time to read it (consider how many transitions there are between 0 and 1). However, from there it breaks down. Consider an input matrix structured as follows:
0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 . . .
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 . . .
. .
. .
. .
That is, alternating 0 and 1 on the first row, 0 everywhere else. Clearly sparse, since there are n/2 ones in total. However, the RLE output has to repeat this pattern in every row, leading to O(n^2) output.
You say:
we should get the output as...
So you need to output the entire matrix, which has N^2 elements. This is O(N*N).
The problem itself is not O(N*N): you don't have to compute and store the entire matrix; you only need two vectors, L and C, each of size N:
L[x] is 1 if line x is a line of ones, 0 otherwise;
C[x] is 1 if column x is a column of ones, 0 otherwise.
You can construct these vectors in O(N+M), because the initial matrix is sparse; your input data will not be a matrix, but a list containing the coordinates (line, column) of each non-zero element. While reading this list, you set L[line]=1 and C[column]=1, and the problem is solved: M[l,c] == 1 if L[l]==1 OR C[c]==1
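A sketch of this in Python, assuming the input really is a list of (line, column) coordinates of the non-zero cells (the names are mine):
def build_vectors(ones, n):
    # ones: iterable of (line, column) pairs of the non-zero input cells
    L = [0] * n          # L[l] == 1  <=>  line l contained a 1
    C = [0] * n          # C[c] == 1  <=>  column c contained a 1
    for line, column in ones:
        L[line] = 1
        C[column] = 1
    return L, C

# query any cell without ever storing the N*N result:
# M[l][c] == 1 iff L[l] == 1 or C[c] == 1
L, C = build_vectors([(0, 0), (1, 1), (1, 2), (3, 0), (3, 3)], 5)
print(L[2] or C[4])   # 0: line 2 and column 4 of the example contain no 1s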
Hi guys,
Thanks to the comment from mb14, I think I could get it solved in less than O(N*N) time...
The worst case would take O(N*N)...
Actually, suppose we have the given array
1 0 0 0 1
0 1 0 0 0
0 1 1 0 0
1 1 1 0 1
0 0 0 0 0
Let's have 2 arrays of size N (this would be the worst case)... one is dedicated to indexing rows and the other to columns.
Put those with a[i][1] = 0 in one array and those with a[1][j] = 0 in another.
Then take only those values and check the second row and column... In this manner, we get the rows and columns that are entirely 0.
The number of values in the row array gives the number of 0's in the result array, and the points a[row-array value][column-array value] give you those points...
We could solve it in below O(N*N), and the worst case is O(N*N)... As we can see, the arrays (of size N) diminish...
I did this for a few arrays and got the result for all of them... :)
Please correct me if I am wrong anywhere...
Thanks for all your comments guys... you are all very helpful and I did learn quite a few things along the way... :)
There is clearly up to O(N^2) work to do. In the matrix
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
all bits have to be set to 1, and N*(N-1) are not set to one (20, in this 5x5 case).
Conversely, you can come up with an algorithm that always does it in O(N^2) time: sum along the top row and left column, and if the row or column gets a nonzero answer, fill in the entire row or column; then solve the smaller (N-1)x(N-1) problem.
So there exist cases that must take at least N^2 and any case can be solved in N^2 without extra space.
If your matrix is sparse, the complexity depends a lot on the input encoding; in particular it is not well measured in terms of N or N^2, but in terms of N, your input complexity M_in and your output complexity M_out. I'd expect something like O(N + M_in + M_out), but much depends on the encoding and the tricks that you can play with it.
That depends entirely on your input data structure. If you pass your matrix (1s and 0s) as a 2D array, you need to traverse it, and that is O(N^2). But as your data is sparse, if you only pass the 1's as input, you can do it so the output is O(M), where M is not the number of cells but the number of 1 cells. It would be something similar to this (pseudocode below):
list f(list l) {
    list rows_1;
    list cols_1;
    for each elem in l {
        rows_1[elem.row] = 1;
        cols_1[elem.col] = 1;
    }
    list result;
    for each row in rows_1 {
        for each col in cols_1 {
            if (row == 1 || col == 1) {
                add(result, new_elem(row, col));
            }
        }
    }
    return result;
}
Don't fill the center of the matrix while you're checking values. As you go through the elements, when you find a 1, set the corresponding element in the first row and the first column. Then go back and fill down and across.
edit: Actually, this is the same as Andy's.
It depends on your data structure.
There are only two possible cases for rows:
A row i is filled with 1's if there is an element (i,_) in the input
All other rows are the same: i.e. the j-th element is 1 iff there is an element (_,j) in the input.
Hence the result can be represented compactly as an array of references to rows. Since we only need two distinct rows, the result also only consumes O(N) memory. As an example, this could be implemented in Python as follows:
def f(element_list, N):
    A = [1]*N      # the all-ones row, shared by every row that had an element
    B = [0]*N      # the "other" row: 1 exactly at columns that had an element
    M = [B]*N      # every row starts out as a reference to B
    for row, col in element_list:
        M[row] = A
        B[col] = 1
    return M
A sample call would be
f([(1,1),(2,2),(4,3)],5)
with the result
[[0, 1, 1, 1, 0], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
The important point is that the arrays are not copied here, i.e. M[row]=A is just an assignment of a reference. Hence the complexity is O(N+M), where M is the length of the input.
#include <stdio.h>
#include <conio.h>   /* for getch() */

int main()
{
    int arr[5][5] = { {1,0,0,0,0},
                      {0,1,1,0,0},
                      {0,0,0,0,0},
                      {1,0,0,1,0},
                      {0,0,0,0,0} };
    int var1 = 0, var2 = 0, i, j;

    /* remember whether the first row / first column themselves contain a 1 */
    for (i = 0; i < 5; i++)
        var1 = var1 | arr[0][i];
    for (i = 0; i < 5; i++)
        var2 = var2 | arr[i][0];

    /* use the first row and first column as column/row masks for the interior */
    for (i = 1; i < 5; i++)
        for (j = 1; j < 5; j++)
            if (arr[i][j])
                arr[i][0] = arr[0][j] = 1;

    /* fill the interior from the masks */
    for (i = 1; i < 5; i++)
        for (j = 1; j < 5; j++)
            arr[i][j] = arr[i][0] | arr[0][j];

    /* finally fix up the first row and first column themselves
       (OR with the masks so columns/rows marked above stay set) */
    for (i = 0; i < 5; i++)
        arr[0][i] = var1 | arr[0][i];
    for (i = 0; i < 5; i++)
        arr[i][0] = var2 | arr[i][0];

    for (i = 0; i < 5; i++)
    {
        printf("\n");
        for (j = 0; j < 5; j++)
            printf("%d ", arr[i][j]);
    }
    getch();
    return 0;
}
This program makes use of only 4 temporary variables (var1, var2, i and j) and hence runs in constant extra space with time complexity O(n^2). I think it is not possible to solve this problem in less than O(n^2).
