I guess it's rather simple, but it seems I'm confusing myself.
What's the complexity of the following?
    // let's say that Q has M initial items
    while Q not empty
        v <- Q.getFirst
        for each z in v    // here, every v cannot have more than 3 z's
            ...
            O(1) operations here
            ...
            Q.insert(z)
        end
    end
How many times this happens depends on the point at which the v's stop having more z's (let's call the total number N).
Is the complexity O(M * N^2), or am I wrong? It's like having a tree with M parent nodes, where each node can have at most three children, and N is the total number of nodes.
Your algorithm's complexity has an upper bound of O((M * v) - (parent nodes that are also child nodes)), which is much better stated as O(n), where n is the number of nodes in your tree, since you visit each node only once.
You also want to account for the runtime of your Q.insert(z) and Q.getFirst() operations, because depending on your data structure they may not be constant.
Assuming Q.insert() and Q.getFirst() run in O(1), you can say O(M * v) is an approximate bound, but since v elements can be repeated, you are better off stating that the runtime is just O(n): O(M * v) overestimates the upper bound, whereas O(n) is exact for every instance of the tree (n being the number of nodes).
I would say it's much safer to call it O(n), since I don't know the exact implementation of your insert, although with a linked list both insert and getFirst can be O(1) operations. (Most binary tree inserts will be O(log n) if properly implemented; sufficient information was not provided.)
It should not harm you to play it safe and consider your runtime analysis O(n), but depending on who you're pitching it to, that extra variable may seem unnecessary.
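For concreteness, here is a rough Python sketch of the loop (assuming getFirst also removes the front element and that children(v) yields v's at most three z's, which is my reading of your pseudocode). Every node is enqueued and dequeued exactly once, which is where the O(n) comes from:

    from collections import deque

    def process_all(roots, children):
        # roots: the M initial items in Q; children(v): the at-most-3 z's of v
        q = deque(roots)
        visited = 0
        while q:                      # while Q not empty
            v = q.popleft()           # v <- Q.getFirst (assumed to also remove v)
            visited += 1
            for z in children(v):     # every v has at most 3 z's
                # ... O(1) operations here ...
                q.append(z)           # Q.insert(z)
        return visited                # = n, the total number of nodes processed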
HTH
edited: clarity of problem in comments helped me understand the question better, fixed nonsense
Is there a data structure whose elements can be indexed and whose insertion runtime is O(1)? So, for example, I could index the data structure like so: a[4], and yet inserting an element at an arbitrary place in the data structure would still run in O(1)? Note that the data structure does not maintain sorted order, just the ability for each sequential element to have an index.
I don't think it's possible, since inserting anywhere other than the beginning or end of the ordered data structure would mean that all the indices after the insertion point must be updated to reflect that their index has increased by 1, which would take worst-case O(n) time. If the answer is no, could someone prove it mathematically?
EDIT:
To clarify, I want to maintain the order of insertion of elements, so upon inserting, the item inserted remains sequentially between the two elements it was placed between.
The problem that you are looking to solve is called the list labeling problem.
There are lower bounds on the cost that depend on the relationship between the maximum number of labels you need (n) and the number of possible labels (m).
If n is in O(log m), i.e., if the number of possible labels is exponential in the number of labels you need at any one time, then O(1) cost per operation is achievable... but this is not the usual case.
If n is in O(m), i.e., if they are proportional, then O(log^2 n) per operation is the best you can do, and the algorithm is complicated.
If n^2 <= m, then you can do O(log n). Amortized O(log n) is simple, and O(log n) worst case is hard. Both algorithms are described in this paper by Dietz and Sleator. The hard way makes use of the O(log^2 n) algorithm mentioned above.
HOWEVER, maybe you don't really need labels. If you just need to be able to compare the order of two items in the collection, then you are solving a slightly different problem called "list order maintenance". This problem can actually be solved in constant time -- O(1) cost per operation and O(1) cost to compare the order of two items -- although again O(1) amortized cost is a lot easier to achieve.
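For intuition on the first case above (label space exponential in the number of items), here is a toy midpoint-labeling helper of my own; it is not the paper's algorithm, just an illustration that insertion stays O(1) as long as a free label remains between the two neighbours:

    def label_between(a, b):
        # Toy sketch: give a new item the midpoint label between its
        # neighbours' labels a < b. O(1) while a free label exists; once the
        # gap closes, a relabeling pass is needed, which is where the lower
        # bounds above come in.
        if b - a < 2:
            raise ValueError("no free label between neighbours; relabel first")
        return (a + b) // 2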
When inserting into slot i, append the element which was first at slot i to the end of the sequence.
If the sequence capacity must be grown, then this growing may not necessarily be O(1).
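A minimal sketch of that trick on a Python list (my own illustration): both steps are amortized O(1), apart from occasional capacity growth, but the displaced element ends up at the end rather than next to its old neighbours:

    def insert_at(seq, i, value):
        # Move the element currently at slot i to the end, then drop the new
        # value into slot i (assumes 0 <= i < len(seq)).
        seq.append(seq[i])
        seq[i] = value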
I've an assignment to explain the time complexity of a ternary tree, and I find that info on the subject on the internet is a bit contradictory, so I was hoping I could ask here to get a better understanding.
So, with each search in the tree, we move to the left or right child a logarithmic number of times, log3(n), with n being the number of strings in the tree, correct? And no matter what, we would also have to traverse down the middle child L times, where L is the length of the prefix we are searching for.
Does the running time then come out to O(log3(n) + L)? I see many people simply saying that it runs in logarithmic time, but does linear time not grow faster, and hence dominate?
Hope I'm making sense, thanks for any answers on the subject!
If the tree is balanced, then yes, any search that needs to visit only one child per iteration will run in logarithmic time.
Notice that O(log_3(n)) = O(ln(n) / ln(3)) = O(ln(n) * c) = O(ln(n)),
so the base of the logarithm does not matter. We say logarithmic time, O(log n).
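A quick numeric check of that base-change identity in Python:

    import math

    n = 10 ** 6
    print(math.log(n, 3))                    # ~12.58
    print(math.log(n, 2) / math.log(3, 2))   # same value: log_3(n) = log_2(n) / log_2(3)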
Notice also that a balanced tree has a height of O(log(n)), where n is the number of nodes. So it looks like your L describes the height of the tree and is therefore also O(log n), so not linear w.r.t. n.
Does this answer your questions?
So far I think I understand the basics of finding algorithm complexity:
Basic operations like reads, writes, assignments and allocations have constant complexity O(k), which can be simplified to O(1).
For loops you have to think of the worst case, i.e. for which value of n the loop will take the longest time:
The complexity is O(n) if there are constant increments, for example if you have a variable i that starts from 0 and you increase or decrease it by one at each loop iteration until you reach n.
The complexity is O(log n) if you have a variable and you multiply or divide it by a constant factor at each iteration.
The complexity is O(n^2) if there are nested loops (rough examples of these loop shapes are sketched below).
If in a function there are multiple loops, the complexity of the function will be the loop with the worst complexity.
In case the value n doesn't change, and you always have to iterate n times, you use the Θ notation because there isn't a worst case or best case scenario.
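Rough Python versions of the loop shapes I mean (just my own illustrative examples):

    def linear(n):        # O(n): i changes by a constant amount each iteration
        i = 0
        while i < n:
            i += 1

    def logarithmic(n):   # O(log n): i changes by a constant factor each iteration
        i = 1
        while i < n:
            i *= 2

    def quadratic(n):     # O(n^2): two nested loops, each running n times
        count = 0
        for i in range(n):
            for j in range(n):
                count += 1
        return count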
Please correct me if anything I said so far is wrong.
For recursive functions the complexity depends on how many recursive calls there will be in the worst case scenario; you have to find a recurrence relation and solve it with one of the three standard methods (substitution, recursion tree, or the master theorem).
This is where the problems begin for me:
Example
Let's say I have a binary tree with this structure: each node has pointers to its left and right children and stores its own depth.
There is a function that initially takes the root and wants to perform an operation on the left child of every node that has an odd depth. To solve this with recursion, I'll have to check whether the node has an odd depth and a left child; if so, I perform the operation on the left child and then make the recursive call to the next node. In this case I think the complexity should be O(n), where n is the number of odd-depth nodes, and the worst case is that all odd-depth nodes have a left child.
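In code it would be roughly something like this (just a sketch of what I mean; operate() is a placeholder for the O(1) operation, and each node stores depth, left and right):

    def visit(node):
        if node is None:
            return
        if node.depth % 2 == 1 and node.left is not None:
            operate(node.left)   # the O(1) operation on the left child
        visit(node.left)
        visit(node.right)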
But what's the recurrence relation in a function like this?
I have been presented with a challenge to make the most effective algorithm that I can for a task. Right now I have arrived at a complexity of n * log n, and I was wondering if it is even possible to do better. So basically the task is: there are kids playing a counting-out game. You are given the number n, which is the number of kids, and m, which is how many kids you skip before you execute one. You need to return a list that gives the execution order. I tried to do it like this, using a skip list:
    current = m
    while table.size > 0:
        executed.add(table[current % table.size])
        table.remove(current % table.size)
        current += m
My questions are: is this correct? Is it n * log n? And can you do it better?
Is this correct?
No.
When you remove an element from the table, table.size decreases, and the expression current % table.size then generally points at a different, unintended element.
For example, 44 % 11 is 0 but 44 % 10 is 4, an element in a totally different place.
Is it n*logn?
No.
If table is just a random-access array, it can take n operations to remove an element.
For example, if m = 1, the program, after fixing the point above, would always remove the first element of the array.
When the array implementation is naive enough, it takes about table.size operations to shift the array after each removal, leading to roughly n^2 / 2 operations in total.
Now, it would be n log n if table were backed, for example, by a balanced binary search tree with implicit indexes instead of keys, together with split and merge primitives; a treap with implicit keys is one example.
Such a data structure could be used as an array with O(log n) costs for access, merge and split.
But nothing so far suggests this is the case, and there is no such data structure in most languages' standard libraries.
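For illustration, here is one way to reach the O(n log n) bound without a treap: a Fenwick tree (binary indexed tree) over "alive" flags, with a binary-lifting search that finds the k-th surviving kid in O(log n). This is my own substitute for the balanced-BST idea above, just to show the bound is attainable:

    def josephus_order(n, m):
        bit = [0] * (n + 1)                # Fenwick tree over alive flags

        def update(i, delta):              # 1-based point update, O(log n)
            while i <= n:
                bit[i] += delta
                i += i & (-i)

        def kth(k):                        # smallest 1-based index with prefix sum k
            pos = 0
            for p in range(n.bit_length(), -1, -1):
                nxt = pos + (1 << p)
                if nxt <= n and bit[nxt] < k:
                    pos = nxt
                    k -= bit[nxt]
            return pos + 1

        for i in range(1, n + 1):          # everybody starts alive
            update(i, 1)

        order, alive, idx = [], n, 0       # idx = surviving kids before the pointer
        for _ in range(n):
            idx = (idx + m) % alive        # skip m surviving kids
            victim = kth(idx + 1)
            order.append(victim - 1)       # report 0-based indices
            update(victim, -1)             # the victim is no longer alive
            alive -= 1
        return order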
Can you do it better?
Correction: partially, yes; fully, maybe.
If we solve the problem backwards, we have the following sub-problem.
Let there be a circle of k kids, and the pointer is currently at kid t.
We know that, just a moment ago, there was a circle of k + 1 kids, but we don't know where, at which kid x, the pointer was.
Then we counted to m, removed the kid, and the pointer ended up at t.
Whom did we just remove, and what is x?
Turns out the "what is x" part can be solved in O(1) (a drawing can be helpful here), so finding the last kid standing is doable in O(n).
As pointed out in the comments, the whole thing is called the Josephus Problem, and its variants are studied extensively, e.g., in Concrete Mathematics by Graham, Knuth, and Patashnik.
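For the last-kid-standing part, the standard O(n) recurrence looks like this (a sketch, assuming the convention that kid m, counted 0-based, is executed first):

    def last_kid_standing(n, m):
        # If the survivor sits at position pos in a circle of k - 1 kids,
        # shifting by the step size (m skips + 1 execution) modulo k gives
        # the survivor's position in a circle of k kids.
        pos = 0                      # survivor's index in a circle of one kid
        for k in range(2, n + 1):
            pos = (pos + m + 1) % k
        return pos                   # 0-based index among the original n kids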
However, in O(1) per step, this only finds the number of the last standing kid.
It does not automatically give the whole order of counting the kids out.
There certainly are ways to make it O(log(n)) per step, O(n log(n)) in total.
But as for O(1), I don't know at the moment.
The complexity of your algorithm depends on the complexity of the operations
executed.add(..) and table.remove(..).
If both of them have complexity of O(1), your algorithm has complexity of O(n) because the loop terminates after n steps.
While executed.add(..) can easily be implemented in O(1), table.remove(..) needs a bit more thinking.
You can make it in O(n):
Store your persons in a LinkedList and connect the last element with the first. Removing an element costs O(1).
Going to the next person to choose would cost O(m), but for constant m that is O(1).
This way the algorithm has the complexity of O(n*m) = O(n) (for constant m).
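A compact Python sketch of that circular-list idea, with a deque standing in for the linked ring (each rotate costs O(m), so the whole thing is O(n * m)):

    from collections import deque

    def execution_order(n, m):
        kids = deque(range(n))            # the circle of kids, 0-based
        order = []
        while kids:
            kids.rotate(-m)               # skip m kids, O(m)
            order.append(kids.popleft())  # execute the next one, O(1)
        return order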
I was wondering if there was a simple data structure that supports amortized log(n) lookup and insertion like a self balancing binary search tree but with constant memory overhead. (I don't really care about deleting elements).
One idea I had was to store everything in one contiguous block of memory divided into two contiguous blocks: an S part where all elements are sorted, and a U that isn't sorted.
To perform an insertion, we could add an element to U, and if the size of U exceeds log(size of S), then you sort the entire contiguous array (treat both S and U as one contiguous array), so that after the sort everything is in S and U is empty.
To perform lookup run binary search on S and just look through all of U.
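Roughly, in Python, this is what I have in mind (names are just for illustration):

    import bisect
    import math

    class SortedPlusBuffer:
        def __init__(self):
            self.s = []                        # S: kept sorted
            self.u = []                        # U: small unsorted overflow

        def insert(self, x):
            self.u.append(x)
            if len(self.u) > math.log2(len(self.s) + 2):   # +2 avoids log of 0
                self.s.extend(self.u)          # treat S and U as one array...
                self.s.sort()                  # ...sort everything, U becomes empty
                self.u.clear()

        def contains(self, x):
            i = bisect.bisect_left(self.s, x)  # binary search in S
            if i < len(self.s) and self.s[i] == x:
                return True
            return x in self.u                 # linear scan of the small U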
However, I am having trouble calculating the amortized insertion time of my algorithm.
Ultimately I would just appreciate some reasonably simple algorithm/datastructure with desired properties, and some guarantee that it runs reasonably fast in amortized time.
Thank you!
If by constant amount of memory overhead you mean that for N elements stored in the data structure the space consumption should be O(N), then any balanced tree will do -- in fact, any b-ary tree storing the elements in external leaves, where b > 1 and every external leaf contains an element, has this property.
This follows from the fact that any tree graph with N nodes has N - 1 edges.
If by constant amount of memory overhead you mean that for N elements the space consumption should be N + O(1), then neither the balanced trees nor the hash tables have this property -- both will use k * N memory, where k > 1 due to extra node pointers in the case of trees and the load factor in the case of hash tables.
I find your approach interesting, but I do not think it will work, even if you only sort U and then merge the two sets in linear time. You would need to do a sort (O(log N * log log N) operations) after every log N updates, followed by an O(N) merge of S and U (note that so far nobody actually knows how to do this in linear time in place, that is, without an extra array).
The amortized insertion time would be O(N / log N). But you could maybe use your approach to achieve something close to O(√N) if you allow the size of U to grow to the square root of the size of S.
Any hash table will do that. The only tricky part about it is how you resolve collisions (there are a few ways of doing it); the other tricky part is computing the hash correctly.
See:
http://en.wikipedia.org/wiki/Hash_table