Is there a way to get both rows and columns
from a matrix (2D) array in O(1) time?
Obviously, to get one or the other, it's trivial to just return the array.
Is there an easy way to get both in O(1) time?
I'm thinking I could have a second matrix, but that would double
my space requirements.
edit:
I could build "columns", but that would take O(n) time, assuming an n-by-n matrix, because by default the matrix is an array of arrays. My question, more concretely, is: is there a way to change my matrix data structure so that I can do both operations in O(1) time?
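One way to get O(1) access to both (a sketch, not the only option): keep a single flat row-major array and hand out lightweight view objects that do the index arithmetic lazily. Creating a row or column view is O(1); reading all n elements of a view is still O(n), which is unavoidable. The `Matrix`/`_View` names below are just illustrative:

```python
class Matrix:
    """Row-major flat storage; row and column views are created in O(1)."""
    def __init__(self, n, values):
        assert len(values) == n * n
        self.n = n
        self.data = list(values)

    def row(self, i):
        # A view over row i: element j lives at flat offset i*n + j.
        return _View(self.data, start=i * self.n, step=1, length=self.n)

    def col(self, j):
        # A view over column j: element i lives at flat offset i*n + j.
        return _View(self.data, start=j, step=self.n, length=self.n)

class _View:
    """A strided window over a flat list; no elements are copied."""
    def __init__(self, data, start, step, length):
        self.data, self.start, self.step, self.length = data, start, step, length

    def __getitem__(self, k):
        return self.data[self.start + k * self.step]

    def __len__(self):
        return self.length

m = Matrix(3, [1, 2, 3,
               4, 5, 6,
               7, 8, 9])
print([m.row(1)[k] for k in range(3)])  # [4, 5, 6]
print([m.col(1)[k] for k in range(3)])  # [2, 5, 8]
```

This is the same stride trick that lets libraries like NumPy transpose a matrix in O(1): only the (start, step) metadata changes, never the data.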
Related
I read this in the Introduction to Algorithms book. They didn't want to copy matrix entries because that would take Θ(n²) time; the trick they used was index calculation, and they said:
In fact, we can partition the matrices without copying entries. The trick is to use index calculations. We identify a submatrix by a range of row indices and a range of column indices of the original matrix. We end up representing a submatrix a little differently from how we represent the original matrix.
I want to know what index calculation is, because it's the first step of Strassen's method.
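A minimal sketch of the index-calculation trick (class and method names are mine, not CLRS's): a submatrix is identified by a row offset, a column offset, and a size over the parent's flat storage, so partitioning into quadrants copies nothing and costs O(1):

```python
class SubMatrix:
    """A view into a parent matrix: (row offset, col offset, size) over
    the parent's flat row-major storage. No entries are ever copied."""
    def __init__(self, data, n, row0=0, col0=0, size=None):
        self.data, self.n = data, n          # flat storage, full row width n
        self.row0, self.col0 = row0, col0
        self.size = size if size is not None else n

    def __getitem__(self, ij):
        # Translate local (i, j) into the parent's coordinates.
        i, j = ij
        return self.data[(self.row0 + i) * self.n + (self.col0 + j)]

    def quadrant(self, qi, qj):
        # Partition into 4 quadrants without copying: O(1) per quadrant.
        half = self.size // 2
        return SubMatrix(self.data, self.n,
                         self.row0 + qi * half, self.col0 + qj * half, half)

A = SubMatrix([ 1,  2,  3,  4,
                5,  6,  7,  8,
                9, 10, 11, 12,
               13, 14, 15, 16], n=4)
A22 = A.quadrant(1, 1)        # bottom-right quadrant
print(A22[0, 0], A22[1, 1])   # 11 16
```

Strassen's recursion can then operate on these views directly; only the base-case multiplications touch actual entries.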
I'm trying to balance a set of (Million +) 3D points using a KD-tree and I have two ways of doing it.
Way 1:
Use an O(n) algorithm to find the arraysize/2-th largest element along a given axis and store it at the current node
Iterate over all the elements in the vector and, for each, compare it to the element I just found, putting the smaller ones in newArray1 and the larger ones in newArray2
Recurse
Way 2:
Use quicksort (O(n log n)) to sort all the elements in the array along a given axis, take the element at position arraysize/2, and store it in the current node.
Then put all the elements from index 0 to arraysize/2-1 in newArray1, and those from arraysize/2 to arraysize-1 in newArray2
Recurse
Way 2 seems more "elegant", but Way 1 seems faster, since the median search and the iteration are both O(n), giving O(2n), which reduces to O(n). Then again, even though Way 2 takes O(n log n) to sort, splitting the array in two can be done in constant time afterwards; does that make up for the O(n log n) sorting?
What should I do? Or is there an even better way to do this that I'm not even seeing?
How about Way 3:
Use an O(n) algorithm such as QuickSelect to ensure that the element at position length/2 is the correct element, all elements before it are smaller, and all elements after it are larger (without sorting them completely!). This is probably the algorithm you used in step 1 of your Way 1 anyway...
Recurse into each half (except middle element) and repeat with next axis.
Note that you actually do not need to make "node" objects. You can actually keep the tree in a large array. When searching, start at length/2 with the first axis.
I've seen this trick being used by ELKI. It uses very little memory and code, which makes the tree quite fast.
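Here is a rough sketch of Way 3 in Python (my own illustration, not ELKI's actual code): a randomized quickselect puts the median of each span in its final position, and the "tree" is just the array itself, exactly as described above:

```python
import random

def quickselect(pts, lo, hi, k, axis):
    """Rearrange pts[lo:hi] so pts[k] is the k-th smallest along `axis`,
    with smaller elements before it and larger ones after (expected O(n))."""
    while hi - lo > 1:
        pivot = pts[random.randrange(lo, hi)][axis]
        i, j = lo, hi - 1
        while i <= j:                       # Hoare-style partition
            while pts[i][axis] < pivot: i += 1
            while pts[j][axis] > pivot: j -= 1
            if i <= j:
                pts[i], pts[j] = pts[j], pts[i]
                i += 1; j -= 1
        if k <= j:   hi = j + 1             # k-th element is in the left part
        elif k >= i: lo = i                 # ... in the right part
        else:        return                 # pts[k] == pivot: done

def build_kdtree(pts, lo=0, hi=None, depth=0):
    """Balance in place: the median of each span becomes that subtree's root."""
    if hi is None: hi = len(pts)
    if hi - lo <= 1: return
    mid = (lo + hi) // 2
    axis = depth % len(pts[0])              # cycle through the axes
    quickselect(pts, lo, hi, mid, axis)
    build_kdtree(pts, lo, mid, depth + 1)   # left half (excluding the median)
    build_kdtree(pts, mid + 1, hi, depth + 1)  # right half

pts = [(7, 2), (5, 4), (9, 6), (4, 7), (8, 1), (2, 3)]
build_kdtree(pts)
print(pts[len(pts) // 2])  # (7, 2): the median along axis 0 sits at the root
```

No node objects are allocated; a search starts at index length/2 with the first axis and halves the span at each step, just like the ELKI trick mentioned above.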
Another way:
Sort along each of the dimensions: O(K·N log N). This is performed only once; we then reuse the sorted lists for every dimension.
For the current dimension, find the median in O(1) time, split around the median in O(N) time, split the sorted arrays for each of the dimensions in O(K·N) time as well, and recurse on the next dimension.
This way, you perform the sorts only at the beginning, plus (K+1) splits/filterings for each subtree. For small K, this approach should be faster than the other approaches.
Note: The additional space needed for the algorithm can be decreased by the tricks pointed out by Anony-Mousse.
Notice that if the query hyper-rectangle contains many points (all of them for example) it does not matter if the tree is balanced or not. A balanced tree is useful if the query hyper-rects are small.
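A compact sketch of the presort-then-split idea above (assuming, for brevity, that coordinates along each axis are unique, so the median can be excluded with strict comparisons; all names are illustrative):

```python
def build(sorted_lists, depth=0):
    """sorted_lists[d] holds the same point set, sorted by dimension d.
    Returns a (median, left_subtree, right_subtree) tuple."""
    k = len(sorted_lists)
    n = len(sorted_lists[0])
    if n == 0:
        return None
    axis = depth % k
    median = sorted_lists[axis][n // 2]   # O(1): this list is presorted
    left, right = [], []
    for d in range(k):                    # O(K*N) stable filtering: order
        left.append([p for p in sorted_lists[d] if p[axis] < median[axis]])
        right.append([p for p in sorted_lists[d] if p[axis] > median[axis]])
    return (median, build(left, depth + 1), build(right, depth + 1))

pts = [(7, 2), (5, 4), (9, 6), (4, 7), (8, 1), (2, 3)]
presorted = [sorted(pts, key=lambda p: p[d]) for d in range(2)]
tree = build(presorted)
print(tree[0])  # (7, 2): the median along axis 0
```

The filtering preserves each list's sort order, so no re-sorting is ever needed below the root; only the initial O(K·N log N) sorts pay the log factor.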
Say we have two square matrices of the same size n, named A and B.
A and B share the property that every entry on their main diagonal is the same value (i.e., A[0,0] = A[1,1] = A[2,2] = ... = A[n-1,n-1] and B[0,0] = B[1,1] = B[2,2] = ... = B[n-1,n-1]).
Is there a way to represent A and B so that they can be added to each other in O(n) time, rather than O(n^2)?
In general: No.
For an n×n matrix, there are n² output values to populate; that takes O(n²) time.
In your case: No.
Even if O(n) of the input/output values are dependent, that leaves O(n^2) that are independent. So there is no representation that can reduce the overall runtime below O(n^2).
But...
In order to reduce the runtime, it is necessary (but not necessarily sufficient) to increase the number of dependent values to O(n^2). Obviously, whether or not this is possible is dictated by the particular scenario...
To complement Oli Charlesworth's answer, I'd like to point out that in the specific case of sparse matrices, you can often obtain an O(n) runtime.
For instance, if you happen to know that your matrices are diagonal, you also know that the resulting matrix will be diagonal, and hence you only need to compute n values.
Similarly, there are band matrices that can be added in O(n), as well as more "random" sparse matrices. In general, in a sparse matrix the number of non-zero elements per row is more or less constant (you obtain such matrices from finite-element computations, or from graph adjacency matrices, etc.). With an appropriate representation such as compressed row storage (CRS) or compressed column storage (CCS), you end up using O(n) operations to add your two matrices.
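As an illustration (a sketch using plain lists rather than a real sparse-matrix library), adding two matrices in compressed row storage is a per-row merge of sorted column indices, which runs in O(nnz_A + nnz_B), i.e. O(n) when each row has O(1) non-zeros:

```python
def csr_add(a, b):
    """Add two same-shape CSR matrices given as (values, col_indices, row_ptr).
    Runs in time linear in the total number of stored non-zeros."""
    av, ac, ap = a
    bv, bc, bp = b
    vals, cols, ptr = [], [], [0]
    nrows = len(ap) - 1
    for i in range(nrows):
        ia, ib = ap[i], bp[i]
        # Merge the two rows' sorted column lists.
        while ia < ap[i + 1] or ib < bp[i + 1]:
            ca = ac[ia] if ia < ap[i + 1] else float('inf')
            cb = bc[ib] if ib < bp[i + 1] else float('inf')
            if ca == cb:
                vals.append(av[ia] + bv[ib]); cols.append(ca); ia += 1; ib += 1
            elif ca < cb:
                vals.append(av[ia]); cols.append(ca); ia += 1
            else:
                vals.append(bv[ib]); cols.append(cb); ib += 1
        ptr.append(len(vals))
    return vals, cols, ptr

# 2x2 example: A = [[1, 0], [0, 2]],  B = [[3, 4], [0, 5]]
A = ([1, 2], [0, 1], [0, 1, 2])
B = ([3, 4, 5], [0, 1, 1], [0, 2, 3])
print(csr_add(A, B))  # ([4, 4, 7], [0, 1, 1], [0, 2, 3])
```

For purely diagonal matrices the representation degenerates to a single length-n vector per matrix, and addition is a single O(n) element-wise pass.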
A special mention also goes to sublinear randomized algorithms, which only promise a final value that is "not too far" from the real solution, up to random errors.
If an array contains duplicated elements, what data structure is better for sorting?
Could B tree work?
For a fixed and small range of element values you can use the counting sort algorithm, as described here. Its complexity is O(n + k), where n is the size of your array and k is the number of different possible element values.
The point is to count the occurrences of each element, and then emit the elements in the right order.
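A minimal sketch of counting sort (assuming non-negative integers below k; duplicates are handled naturally, since each value is emitted as many times as it was counted):

```python
def counting_sort(arr, k):
    """Sort integers in the range [0, k) in O(n + k) time, O(k) extra space."""
    counts = [0] * k
    for x in arr:
        counts[x] += 1               # tally each value
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)      # emit each value as often as it occurred
    return out

print(counting_sort([4, 2, 2, 8, 3, 3, 1], k=9))  # [1, 2, 2, 3, 3, 4, 8]
```

No special data structure (B-tree or otherwise) is needed; the counts array is the whole structure.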
Is it possible to compute the number of different elements in an array in linear time and constant space? Say it's an array of long integers, and you cannot allocate an array of length sizeof(long).
P.S. Not homework, just curious. I've got a book that sort of implies that it is possible.
This is the element uniqueness problem, for which the lower bound is Ω(n log n) in comparison-based models. The obvious hashing or bucket-sorting solutions all require linear space too, so I'm not sure this is possible.
You can't use constant space. You can use O(number of different elements) space; that's what a HashSet does.
You can use any sorting algorithm and count the number of different adjacent elements in the array.
I do not think this can be done in linear time. One algorithm that solves it in O(n log n) first sorts the array (the comparisons then become trivial).
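The sort-then-count-adjacent approach can be sketched as:

```python
def count_distinct(arr):
    """O(n log n) time: sort, then count positions where the value changes."""
    s = sorted(arr)
    return sum(1 for i in range(len(s)) if i == 0 or s[i] != s[i - 1])

print(count_distinct([3, 1, 4, 1, 5, 9, 2, 6, 5, 3]))  # 7
```

After sorting, equal elements are adjacent, so each distinct value contributes exactly one "change point" (counting the first element as one).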
If you are guaranteed that the numbers in the array are bounded above and below, by say a and b, then you could allocate an array of size b - a, and use it to keep track of which numbers have been seen.
That is, you would move through your input array, and for each number, mark true in your target array at that spot. You would increment a counter of distinct numbers only when you encounter a number whose position in your storage array is still false.
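A sketch of that bounded-range approach (the function name and the inclusive [lo, hi] bounds are my own convention):

```python
def count_distinct_bounded(arr, lo, hi):
    """O(n) time, O(hi - lo) space: mark each value the first time it's seen.
    Assumes every element x satisfies lo <= x <= hi."""
    seen = [False] * (hi - lo + 1)
    distinct = 0
    for x in arr:
        if not seen[x - lo]:     # first occurrence of this value
            seen[x - lo] = True
            distinct += 1
    return distinct

print(count_distinct_bounded([5, 3, 5, 7, 3], lo=3, hi=7))  # 3
```

This is linear time but not constant space: the space is proportional to the value range b - a, which is exactly the loophole the bounds provide.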
Assuming we can partially destroy the input, here's an algorithm for n words of O(log n) bits.
Find the element of rank sqrt(n) via linear-time selection. Partition the array around it (O(n)). Using brute force, count the number of distinct elements in the partition of length sqrt(n) (this is O(sqrt(n)²) = O(n)). Now run an in-place radix sort on the rest, where each "digit" is log(sqrt(n)) = log(n)/2 bits, using the first partition to store the digit counts.
If you consider streaming algorithms only ( http://en.wikipedia.org/wiki/Streaming_algorithm ), then it's impossible to get an exact answer with o(n) bits of storage via a communication complexity lower bound ( http://en.wikipedia.org/wiki/Communication_complexity ), but possible to approximate the answer using randomness and little space (Alon, Matias, and Szegedy).
This can be done with a bucket approach if you assume there are only a constant number of different values. Make a flag for each value (still constant space). Traverse the list and flag the values that occur. If you happen to flag an already-flagged value, you've found a duplicate. You have to check the buckets for each element in the list, but that is still linear time.