Time Complexity of a nested for loop that parses a matrix - time

Let's say I have a matrix that has X rows and Y columns. The total number of elements is X*Y, correct? So does that make n=X*Y?
for (int i = 0; i < X; i++)
{
    for (int j = 0; j < Y; j++)
    {
        print(matrix[i][j]);
    }
}
Then wouldn't that mean that this nested for loop is O(n)? Or am I misunderstanding how time complexities work?
Generally, I thought all nested for loops were O(n^2), but if it goes through X*Y calls to print(), doesn't that mean that the time complexity is O(X*Y) and X*Y is equal to n?

If you have a matrix of size rows*columns, then the inner loop (let's say) is O(columns), and the nested loops together are O(rows*columns).
You are confusing a problem size of N with a problem size of N^2. You can either say your matrix is of size N or of size N^2, though unless your matrix is square you should simply say that you have a matrix of size Rows*Columns.
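To make the counting concrete, here is a minimal sketch (with example dimensions chosen purely for illustration) that increments a counter in the body of the nested loop; the counter ends at exactly X*Y, one unit of work per element:

#include <stdio.h>

int main(void)
{
    int X = 3, Y = 5;              /* example dimensions */
    int matrix[3][5] = {{0}};      /* placeholder contents */
    long count = 0;

    for (int i = 0; i < X; i++)
    {
        for (int j = 0; j < Y; j++)
        {
            printf("%d ", matrix[i][j]);
            count++;               /* one unit of work per element */
        }
    }
    printf("\nloop body executed %ld times (X*Y = %d)\n", count, X * Y);
    return 0;
}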

You are right when you say n = X x Y, but wrong when you say the nested loops should be O(n). The meaning of a nested loop can be understood if you dry-run your code: you will notice that for each iteration of the outer loop, the inner loop runs its full number of iterations (Y here, or whatever its bound is). Hence, by simple math, you can deduce that it is O(n^2). But if you had just one loop iterating over the X x Y elements (e.g. for (i = 0; i < X*Y; i++)), then it would be O(n), because you are not restarting your iteration at any point.
Hope this makes sense.
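For comparison, here is a sketch of the single flattened loop mentioned above; it visits the same X*Y elements once each (the row/column arithmetic is just one illustrative way to recover the indices), so it performs the same amount of work as the nested version:

#include <stdio.h>

int main(void)
{
    int X = 3, Y = 5;              /* example dimensions */
    int matrix[3][5] = {{0}};      /* placeholder contents */

    /* one loop over all X*Y elements */
    for (int t = 0; t < X * Y; t++)
    {
        int i = t / Y;             /* row index */
        int j = t % Y;             /* column index */
        printf("%d ", matrix[i][j]);
    }
    printf("\n");
    return 0;
}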

This answer was written hastily and received a few downvotes, so I have decided to clarify and rewrite it.
Time complexity of an algorithm is an expression of the number of operations of the algorithm in terms of the size of the problem the algorithm is intended to solve.
There are two sizes involved here.
The first size is the number of elements of the matrix, X × Y. This corresponds to what is known in complexity theory as the size of the input. Let k = X × Y denote the number of elements in the matrix. Since the number of operations in the twin loop is X × Y, it is in O(k).
The second size is the number of columns and rows of the matrix. Let m = max(X,Y). The number of operations in the twin loop is in O(m^2). Usually in Linear Algebra this kind of size is used to characterize the complexity of matrix operations on m × m matrices.
When you talk about complexity you have to specify precisely how you encode an instance of the problem and what parameter you use to specify its size. In Complexity Theory we usually assume that the input to an algorithm is given as a string of characters coming from some finite alphabet, and measure the complexity of an algorithm in terms of an upper bound on the number of operations on an instance of a problem given by a string of length n. That is, in Complexity Theory n is usually the size of the input.
In practical Complexity Analysis of algorithms we often use other measures of the size of an instance that are more meaningful in specific context. For instance if A is a connectivity matrix of a graph, we may use the number of vertices V as a measure of complexity of an instance of a problem, or if A is a matrix of a linear operator acting on a vector space, we may use the dimension of a vector space as such a measure. For square matrices the convention is to specify the complexity in terms of the dimension of the matrix, that is to measure the complexity of algorithms acting upon n × n matrices in terms of n. It often makes practical sense and also agrees with the conventions of a specific application field even if it may contradict the conventions of Complexity Theory.
Let us give the name Matrix Scan to our twin loop. You may legitimately take the size of an instance of Matrix Scan to be the length of a string encoding of the matrix; assuming bounded size of the entries, this is the number of elements in the matrix, k. Then we can say the complexity of Matrix Scan is in O(k). On the other hand, if we take m = max(X, Y) as the parameter that characterizes the complexity of an instance, as is customary in many applications, then the complexity of Matrix Scan for an X×Y matrix is also in O(m^2). For a square matrix X = Y = m and O(k) = O(m^2).
Notice: Some people in the comments asked whether we can always find an encoding of the problem that reduces any polynomial problem to a linear one. This is not true. For some algorithms the number of operations grows faster than the length of the string encoding of their input. For instance, there is no algorithm to multiply two m×m matrices with Θ(m^2) operations. Here the size of the input grows as m^2; however, Ran Raz proved that the number of operations grows at least as fast as m^2 log m. If n is in O(m^2) then m^2 log m is in O(n log n), and the complexity of the best known algorithms grows as O(m^(2+c)) = O(n^(1+c/2)), where c ≈ 0.372 for refinements of the Coppersmith-Winograd algorithm and c = 1 for the common iterative algorithm.

Generally, I thought all nested for loops were O(n^2),
You are wrong about that. What confuses you, I guess, is that people often use a square matrix (X == Y) as the example, so the complexity is n*n (with X == n and Y == n).
If you want to practise your O(*) skills, try to figure out why matrix multiplication is O(n^3). If you don't know the algorithm for matrix multiplication, it is easy to find online.
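For reference, a sketch of the standard schoolbook algorithm for multiplying two n×n matrices is shown below; three nested loops of n iterations each give roughly n^3 multiply-add steps, which is where O(n^3) comes from:

/* Schoolbook multiplication of two n x n matrices, C = A * B.
   A minimal sketch; matrices are stored row-major in flat arrays. */
void matmul(int n, const double *A, const double *B, double *C)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
        {
            double sum = 0.0;
            for (int k = 0; k < n; k++)          /* n operations per (i, j) pair */
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    /* n * n output entries, each costing about n operations -> roughly n^3 total */
}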

Related

Would this n x n transpose algorithm be considered an in-place algorithm?

Based on my research, I am gaining conflicting information about this simple algorithm. This algorithm is a basic matrix transposition, that transposes an n x n matrix A.
My current understanding is that this algorithm would run in O(n^2) time and have a space complexity of O(1), as the matrix we manipulate is the same one we deal with.
But I have also been told it would actually run in O(n) time and have space complexity of O(n) as well, which would mean it isn't in-place, as it requires extra space for manipulation.
Which thought process is correct here for the transpose algo below?
Transpose(A)
    for i = 1 to n - 1
        for j = i + 1 to n
            exchange A[i, j] with A[j, i]
Some confusion might arise from the fact that the workload is proportional to the number of elements in the array, and that these elements occupy their own space. So by some abuse of notation or inattention, both might be said to be "O(n)".
But this is wrong because
n is clearly not the number of elements but the number of rows/columns of the array;
by definition the space complexity does not include the input and output data, only the auxiliary space that is required.
Hence we can confirm the time complexity O(n²) - in fact Θ(n²) - and space complexity O(1). The algorithm is in-place.
Final remark:
If we denote the number of elements as m, the time complexity is O(m), and there is no contradiction.
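As a concrete illustration, here is a sketch of the same transpose written in C (with 0-based indices); the only auxiliary storage is the single temporary used by the swap, which is why the auxiliary space is O(1) while the time is Θ(n²):

/* In-place transpose of an n x n matrix stored row-major in a flat array. */
void transpose(int n, double *A)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++)
        {
            double tmp = A[i * n + j];   /* single temporary: O(1) extra space */
            A[i * n + j] = A[j * n + i];
            A[j * n + i] = tmp;
        }
    /* about n*(n-1)/2 swaps -> Theta(n^2) time */
}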

Computational complexity (Big-O notation) of a geometrical weighted centroid among access points

I need to compute the computational complexity of the following equations using Big-O notations:
Here, m is the total number of access points (perhaps the number of iterations in terms of complexity; i denotes an individual access point). I learned about Big-O notation from this blog. Moreover, I found a similar question at this link. In the above equation, d is a distance computed with 4 operations (multiply, subtraction, division, and power). As seen in the above equation, w is computed with two operations (power and division). x_w and y_w are computed with two operations each (multiplication and division).
Hence, I've figured out the Big-O notation of above algorithm as:
4*[m]+2*[m]+2*[m]+2*[m]
Is it correct? Can it be approximated as O(m)?
Moreover, the above algorithm (equations) is combined with the next algorithm, whose computational complexity is O(N), N being the number of iterations. Here, N >> m. What will be the final computational complexity in terms of Big-O notation?
Thank you.
UPDATE:
The subscript w on x and y is just notation; it is not an iteration index. The only iteration is over m, e.g. i = 1, 2, 3, 4, 5, ..., m. The two algorithms operate in a pipeline fashion: first the algorithm with m iterations runs, and its output is fed (as input) to the next algorithm with N iterations. So the m iterations (algorithm 1) are completed first, followed by the N iterations (algorithm 2). My problem is similar to two loops that are not nested and have different iteration counts, where N >> m.
for (int i = 0; i < m; i++) {
    System.out.println(i);
}
for (int j = 0; j < N; j++) {
    System.out.println(j);
}
What will be the final computational complexity?
Yes, your sum from i = 1 to i = m takes O(m) time. All the other operations are constant; you don't have any nested sum inside the sum or anything like that.
About your N value, you did not provide enough information. We have to know how N is computed or how it is related to m.
You should also consider the following constraint: can you give some maximum value (even an incredibly big one) that the numbers in your equations can never exceed? Operations on numbers are usually considered constant time, because they are done on 32- or 64-bit numbers, which always take constant time.
However, if your equations involve extremely long numbers (hundreds of characters long or more), the size of the numbers has to be taken into account in the complexity. (You can probably imagine that multiplying two numbers that are a million characters long takes longer than doing the same with two-character numbers.)
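As for the combined cost of the two pipelined stages: because the loops are not nested, their costs add rather than multiply, so the total is O(m + N), which is simply O(N) when N >> m. A minimal sketch with an operation counter (example values only) makes this visible:

#include <stdio.h>

int main(void)
{
    int m = 10, N = 1000000;            /* example sizes with N >> m */
    long ops = 0;

    for (int i = 0; i < m; i++)         /* stage 1: the centroid equations, O(m) */
        ops++;
    for (int j = 0; j < N; j++)         /* stage 2: the second algorithm, O(N) */
        ops++;

    /* total work is m + N, dominated by N when N >> m */
    printf("operations: %ld (m + N = %d)\n", ops, m + N);
    return 0;
}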

Closest pair of points brute-force; Why O(n^2)?

I feel stupid for asking this question, but...
For the "closest pair of points" problem (see this if unfamiliar with it), why is the worst-case running time of the brute-force algorithm O(n^2)?
If, say, n = 4, then there would only be 12 possible pairs of points to compare in the search space, if we also consider comparing two points from either direction. If we don't compare two points twice, then it's going to be 6.
O(n^2) doesn't add up to me.
The actual number of comparisons is:
n(n-1)/2, or n²/2 - n/2.
But, in big-O notation, you are only concerned with the dominant term. At very large values of n, the n/2 term becomes less important, as does the 1/2 coefficient on the n² term. So we just say it's O(n²).
Big-O notation isn't meant to give you the exact formula for the time taken or number of steps. It only gives you the order of the complexity/time so you can get a sense of how it scales for large inputs.
Applying brute force, we are forced to check all the possible pairs. Assuming N points, for each point there are N-1 other points to which we need to calculate the distance. So the total number of distances calculated = N points * (N-1) other points. But in the process we double-counted distances: the distance between A and B is the same whether we compute it from A to B or from B to A. Hence N*(N-1)/2, and hence O(N^2).
In big-O notation, you can factor out multiplied constants, so:
O(k*(n^2)) = O(n^2)
The idea is that the constant (1/2 in the OP's example, since the distance relation is symmetric) doesn't really tell us anything new about the complexity. It still grows with the square of the input.
In the brute-force version of the algorithm you compare all possible pairs of points. For each of n points you have (n - 1) other points to compare against, and if we take every pair once we end up with (n * (n - 1)) / 2 comparisons. The pessimistic complexity of O(n^2) means that the number of operations is bounded by k * n^2 for some constant k. Big O notation can't tell you the exact number of operations, but gives a function to which it is proportional as the size of the data (n) increases.
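A sketch of the brute-force loop structure makes the counting explicit: the inner loop starts at i + 1, so each unordered pair is examined exactly once, giving n(n-1)/2 distance computations.

#include <float.h>
#include <math.h>

struct point { double x, y; };

/* Brute-force closest pair: returns the smallest pairwise distance.
   Exactly n*(n-1)/2 distance evaluations, hence O(n^2). */
double closest_pair(const struct point *p, int n)
{
    double best = DBL_MAX;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)      /* each pair visited once */
        {
            double dx = p[i].x - p[j].x;
            double dy = p[i].y - p[j].y;
            double d  = sqrt(dx * dx + dy * dy);
            if (d < best)
                best = d;
        }
    return best;
}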

Ambiguity about the Big-oh notation

I am currently trying to learn time complexity of algorithms, big-O notation and so on. However, one point confuses me a lot. I know that most of the time the input size of an array, or of whatever we are dealing with, determines the running time of the algorithm. Let's say I have an unsorted array with size N and I want to find the maximum element of this array without using any special algorithm. I just want to iterate over the array and find the maximum element. Since the size of my array is N, this process runs in O(N), i.e. linear time. Let M be an integer that is the square root of N. So N can be written as the square of M, that is M*M or M^2. So I think there is nothing wrong if I want to replace N with M^2. I know that M^2 is also the size of my array, so my big-O notation could be written as O(M^2). So my new running time looks like it runs in quadratic time. Why does this happen?
You are correct: if it happens that you have some variable M such that M^2 ≈ N always holds, you could say the algorithm runs in O(M^2).
But note that the algorithm is then quadratic in M, not quadratic in the size of the input; it is still linear in the size of the input.
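To make the distinction concrete, here is a sketch of the find-maximum loop from the question: it does one comparison per element, i.e. about N comparisons, and if the array happens to hold N = M*M elements that same count can equally well be written as M^2.

/* Find the maximum of an array of N elements: one pass, one comparison each.
   Linear in N; if N = M*M, the same count can be written as M^2. */
int find_max(const int *a, int N)
{
    int max = a[0];
    for (int i = 1; i < N; i++)     /* about N comparisons */
        if (a[i] > max)
            max = a[i];
    return max;
}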
The important thing here is defining linear/quadratic, etc.
More precisely, you have to detail linear/quadratic, etc. with respect to something (N or M for your example). The most natural choice is to study the complexity wrt. the size of the input (N for your example).
Another trap with big integers is that the size of n is log(n). So, for instance, if you loop over all smaller integers, your algorithm is not polynomial in the input size.
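To illustrate that last point, here is a sketch of a loop over all smaller integers (a naive trial-division primality test): measured against the magnitude of n it looks linear, but n is encoded in only about log2(n) bits, so the running time is exponential in the length of the input.

#include <stdbool.h>

/* Naive trial division: the loop runs up to n - 2 times.
   O(n) in the magnitude of n, but n takes only about log2(n) bits,
   so this is exponential in the input size. */
bool is_prime(unsigned long n)
{
    if (n < 2)
        return false;
    for (unsigned long d = 2; d < n; d++)   /* about n iterations */
        if (n % d == 0)
            return false;
    return true;
}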

Precise Input Size and Time Complexity

When talking about time complexity we usually use n as the input, which is not a precise measure of the actual input size. I am having trouble showing that, when using a specific size for the input (s), an algorithm remains in the same complexity class.
For instance, take a simple Sequential Search algorithm. In its worst case it takes W(n) time. If we apply specific input size (in base 2), the order should be W(lg L), where L is the largest integer.
How do I show that Sequential Search, or any algorithm, remains the same complexity class, in this case linear time? I understand that there is some sort of substitution that needs to take place, but I am shaky on how to come to the conclusion.
EDIT
I think I may have found what I was looking for, but I'm not entirely sure.
If you define worst case time complexity as W(s), the maximum number of steps done by an algorithm for an input size of s, then by definition of input size, s = lg n, where n is the input. Then, n = 2^s, leading to the conclusion that the time complexity is W(2^s), an exponential complexity. Therefore, the algorithm's performance with binary encoding is exponential, not linear as it is in terms of magnitude.
When talking about time complexity we usually use n as the input, which is not a precise measure of the actual input size. I am having trouble showing that, when using a specific size for the input (s), an algorithm remains in the same complexity class.
For instance, take a simple Sequential Search algorithm. In its worst case it takes W(n) time. If we apply specific input size (in base 2), the order should be W(lg L), where L is the largest integer.
L is a variable that represents the largest integer.
n is a variable that represents the size of the input.
L is not a specific value any more than n is.
When you apply a specific value, you aren't talking about a complexity class anymore, you are talking about an instance of that class.
Let's say you are searching a list of 500 integers. In other words, n = 500
The worst-case complexity class of Sequential Search is O(n)
The complexity is n
The specific instance of worst-case complexity is 500
Edit:
Your values will be uniform in the number of bits required to encode each value. If the input is a list of 32-bit integers, then c = 32, the number of bits per integer, and the complexity would be 32*n => O(n).
In terms of L, if L is the largest value, and lg L is the number of bits required to encode L, then lg L is the constant c. Your complexity in terms of bits is c*n = O(n), where c = lg L is a constant determined by the specific input.
What I know is that the maximum number of steps done by Sequential Search is, obviously, cn^2 + n lg L. cn^2 being the number of steps to increment loops and do branching.
That's not true at all. The maximum number of steps done by a sequential search is going to be c*n, where n is the number of items in the list and c is some constant. That's the worst case. There is no n^2 component or logarithmic component.
For example, a simple sequential search would be:
for (int i = 0; i < NumItems; ++i)
{
    if (Items[i] == query)
        return i;
}
return -1;
With that algorithm, if you search for each item, then half of the searches will require fewer than NumItems/2 iterations and half of the searches will require NumItems/2 or more iterations. If an item you search for isn't in the list, it will require NumItems iterations to determine that. The worst case running time is NumItems iterations. The average case is NumItems/2 iterations.
The actual number of operations performed is some constant, C, multiplied by the number of iterations. On average it's C*NumItems/2.
As Lucia Moura states: "Except for the unary encoding, all the other encodings for natural numbers have lengths that are polynomially related."
Here is the source. Take a look at page 19.
