Is there any data structure that can do random insert, random delete and interval summation in O(log n) time? - data-structures

Insert(k, x): insert element x at the k-th position
Delete(k): delete the element at the k-th position
Summation(l, r): calculate the sum of the elements from the l-th position to the r-th position
I used a segment tree to handle interval summation in O(log n), but it seems that a segment tree cannot do random insert and random delete elegantly.

Yes, there is. You can use something called an implicit treap.
https://cp-algorithms.com/data_structures/treap.html
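For concreteness, here is a minimal implicit-treap sketch in C++ with exactly these three operations, using 0-based positions (the node fields and function names are mine, not taken from the linked article). Each node stores its subtree size and subtree sum, so position-based insert, delete and range sum all run in expected O(log n).

#include <cstdio>
#include <cstdlib>

struct Node {
    long long value, sum;   // element value and sum of the whole subtree
    int priority, size;     // random heap priority and subtree size
    Node *left, *right;
    Node(long long v) : value(v), sum(v), priority(rand()), size(1),
                        left(nullptr), right(nullptr) {}
};

int size(Node* t) { return t ? t->size : 0; }
long long sum(Node* t) { return t ? t->sum : 0; }

void update(Node* t) {
    if (!t) return;
    t->size = 1 + size(t->left) + size(t->right);
    t->sum = t->value + sum(t->left) + sum(t->right);
}

// Split t into [first cnt elements, the rest] by position, not by key.
void split(Node* t, int cnt, Node*& l, Node*& r) {
    if (!t) { l = r = nullptr; return; }
    if (size(t->left) < cnt) {
        split(t->right, cnt - size(t->left) - 1, t->right, r);
        l = t;
    } else {
        split(t->left, cnt, l, t->left);
        r = t;
    }
    update(t);
}

Node* merge(Node* l, Node* r) {
    if (!l || !r) return l ? l : r;
    if (l->priority > r->priority) {
        l->right = merge(l->right, r);
        update(l);
        return l;
    }
    r->left = merge(l, r->left);
    update(r);
    return r;
}

// Insert x so that it becomes the element at position k.
Node* insertAt(Node* t, int k, long long x) {
    Node *a, *b;
    split(t, k, a, b);
    return merge(merge(a, new Node(x)), b);
}

// Delete the element at position k.
Node* eraseAt(Node* t, int k) {
    Node *a, *b, *c;
    split(t, k, a, b);
    split(b, 1, b, c);
    delete b;
    return merge(a, c);
}

// Sum of the elements at positions l..r (inclusive).
long long rangeSum(Node*& t, int l, int r) {
    Node *a, *b, *c;
    split(t, l, a, b);
    split(b, r - l + 1, b, c);
    long long res = sum(b);
    t = merge(a, merge(b, c));
    return res;
}

int main() {
    Node* root = nullptr;
    for (int i = 0; i < 5; i++) root = insertAt(root, i, i + 1); // array: 1 2 3 4 5
    printf("%lld\n", rangeSum(root, 1, 3)); // 2+3+4 = 9
    root = eraseAt(root, 2);                // remove the 3
    printf("%lld\n", rangeSum(root, 0, 3)); // 1+2+4+5 = 12
}

The only difference from an ordinary treap is that split cuts by position (using the stored subtree sizes) instead of by key, which is exactly what makes insert and delete at arbitrary positions work.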

Related

Given an n x n matrix where each of the rows and columns is sorted in ascending order, find the kth smallest element in the matrix

There is an O(k log N) solution to this using a heap. In the worst case k = N^2,
so the time complexity becomes O(N^2 log N). Is there any better algorithm for this problem?

Data structure representing a two-value array with 3 operations

Let's have n values. All values start as false and can be changed to true.
I want 3 operations on these values with certain time complexities:
val(i) -> returns the value at index i. Time complexity - O(1).
change(i) -> changes the value at index i to true. Time complexity -
amortized O(1).
find(i) -> returns the index closest to i that contains the value
false (if the value at i is false, return i). Time complexity - O(log n).
I don't really care about space complexity.
The structure is initialized at the beginning with a fixed length. It doesn't really matter how much time the initialization takes.
What should a data structure supporting these operations look like?
Set up a segment tree on [0, n) and, for each elementary interval [i 2^k, (i+1) 2^k), store the AND of the boolean values in that interval.
val(i) is constant-time because it's just an array lookup.
change(i) is amortized constant-time if we alter the usual rootward propagation algorithm to exit early if there is no change at a particular level. This is because at any given time, the number of writes to intervals k levels from the root is at most half of the number of writes to intervals k+1 levels from the root (prove this by induction on k).
find(i) can be implemented in logarithmic time as follows. First observe that it suffices to find the nearest false left neighbor and the nearest false right neighbor and take the nearer of the two. To find the nearest false right neighbor for position i, decompose [i, n) into elementary intervals. Find the leftmost of these intervals that contains a false (i.e., its AND is false). Proceed leafward from this interval, checking to see at each level if the left half has a false. If it does, descend to it; otherwise, use the right half.
On a unit-cost RAM (i.e., the theoretical version of real hardware), we can get find(i) down to O(log n / log log n) time with O(n / log n) words of storage by changing the tree fanout from 2 to the word size (Theta(log n)) and using bitwise operations.
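Here is a minimal C++ sketch of the segment-tree-of-ANDs approach described above (the class and function names are mine; the array is padded to a power-of-two size, with the padding leaves set to true so find never reports them).

#include <cstdio>
#include <vector>

struct FalseFinder {
    int n, sz;                 // n = logical length, sz = power of two >= n
    std::vector<bool> seg;     // seg[v] = AND of the values in v's interval

    FalseFinder(int n_) : n(n_) {
        sz = 1;
        while (sz < n) sz *= 2;
        seg.assign(2 * sz, false);
        for (int i = n; i < sz; i++) seg[sz + i] = true;   // padding leaves
        for (int v = sz - 1; v >= 1; v--) seg[v] = seg[2 * v] && seg[2 * v + 1];
    }

    bool val(int i) { return seg[sz + i]; }                // O(1) array read

    // Set position i to true, climbing only while a parent's AND actually changes.
    void change(int i) {
        int v = sz + i;
        seg[v] = true;
        for (v /= 2; v >= 1; v /= 2) {
            bool both = seg[2 * v] && seg[2 * v + 1];
            if (seg[v] == both) break;                     // early exit: ancestors unchanged
            seg[v] = both;
        }
    }

    // Leftmost false index >= i inside node v covering [lo, hi], or -1.
    int right(int v, int lo, int hi, int i) {
        if (hi < i || seg[v]) return -1;
        if (lo == hi) return lo;
        int mid = (lo + hi) / 2;
        int res = right(2 * v, lo, mid, i);
        return res != -1 ? res : right(2 * v + 1, mid + 1, hi, i);
    }
    // Rightmost false index <= i inside node v covering [lo, hi], or -1.
    int left(int v, int lo, int hi, int i) {
        if (lo > i || seg[v]) return -1;
        if (lo == hi) return lo;
        int mid = (lo + hi) / 2;
        int res = left(2 * v + 1, mid + 1, hi, i);
        return res != -1 ? res : left(2 * v, lo, mid, i);
    }
    // Nearest index holding false (i itself if it is false), or -1 if none remain.
    int find(int i) {
        if (!seg[sz + i]) return i;
        int l = left(1, 0, sz - 1, i), r = right(1, 0, sz - 1, i);
        if (l == -1) return r;
        if (r == -1) return l;
        return (i - l <= r - i) ? l : r;
    }
};

int main() {
    FalseFinder f(8);
    f.change(3); f.change(4); f.change(5);
    printf("%d\n", f.find(4));   // 2 and 6 are equally near; this code returns 2
    f.change(2);
    printf("%d\n", f.find(4));   // 6
}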
Use a combination of a hashmap and a binary search tree.
Example:
Suppose boolean[] A = { false, false, false, false }
Create a hashmap (a map with an integer key and an object value).
Iterate over each item:
1. Create an object with the index and value as attributes.
2. Add the object to the map.
3. Add the same object to a BST, using the object's index as its position in the BST.
Now do the following operations:
Val(i): get the object directly from the map and return its value. Complexity O(1).
Change(i, true/false): again get the object from the map and update its value. Complexity O(1).
Find(i): check whether the value of the object in the map is false. If it is false, return i; otherwise traverse the BST to find the nearest index whose value is false. Note that traversal of the BST based on the object's key can be done in O(log n). Hence complexity O(log n).
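Not exactly the map-plus-BST layout above, but a hedged C++ sketch of a closely related variant: keep the indices that are still false in an ordered set (std::set plays the role of the BST), so find(i) reduces to a lower_bound lookup on each side. Note that change(i) then costs O(log n) rather than O(1).

#include <cstdio>
#include <iterator>
#include <set>
#include <vector>

struct FalseSet {
    std::vector<bool> a;        // current values, for O(1) val(i)
    std::set<int> falses;       // indices whose value is still false

    FalseSet(int n) : a(n, false) {
        for (int i = 0; i < n; i++) falses.insert(i);
    }
    bool val(int i) { return a[i]; }
    void change(int i) { a[i] = true; falses.erase(i); }    // O(log n)
    int find(int i) {                                       // O(log n)
        auto r = falses.lower_bound(i);                     // nearest false at or after i
        if (r != falses.end() && *r == i) return i;
        int best = -1;
        if (r != falses.begin()) best = *std::prev(r);      // nearest false before i
        if (r != falses.end() && (best == -1 || *r - i < i - best)) best = *r;
        return best;                                        // -1 if everything is true
    }
};

int main() {
    FalseSet s(8);
    s.change(3); s.change(4); s.change(5);
    printf("%d\n", s.find(4));   // 2 and 6 are equally near; this code returns 2
    s.change(2);
    printf("%d\n", s.find(4));   // 6
}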

What is the complexity of this approach to finding K largest of N numbers

In this post on how to find the K largest of N elements, the second method proposed is:
Store the first k elements in a temporary array temp[0..k-1].
Find the smallest element in temp[], let the smallest element be min.
For each element x in arr[k] to arr[n-1]
If x is greater than the min then remove min from temp[] and insert x.
Print final k elements of temp[]
While I understand the approach, I do not understand their computed
Time Complexity of O((n-k)*k).
From my perspective, you are making a linear traversal of n-k elements and doing a single comparison on each element, and then perhaps replacing one element of the temporary array of K elements.
More specifically, where does the *k aspect of their computed
Time Complexity of O((n-k)*k) come from? Why do they multiply n-k by that?
Let's consider that at the k-th iteration:
arr[k] > min(temp[0..k-1])
Now you will replace min(temp[0..k-1]) with arr[k].
And now you again need to compute the updated min of temp[0..k-1], because it has changed. The new min can be any element of the updated temp[0..k-1], so finding it takes O(k).
So in the worst case you update the min on every iteration, and hence the O(k) factor.
Thus, time complexity = O((n-k)*k).
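A small C++ sketch of the quoted method (variable names are mine) that makes the *k factor visible: every time an element beats the current minimum of temp, the minimum has to be re-found with an O(k) scan.

#include <cstdio>
#include <vector>
#include <algorithm>

std::vector<int> kLargest(const std::vector<int>& arr, int k) {
    std::vector<int> temp(arr.begin(), arr.begin() + k);          // first k elements
    auto mn = std::min_element(temp.begin(), temp.end());         // current minimum of temp
    for (std::size_t i = k; i < arr.size(); i++) {                // n-k iterations
        if (arr[i] > *mn) {
            *mn = arr[i];                                         // replace the minimum
            mn = std::min_element(temp.begin(), temp.end());      // O(k) re-scan: the *k factor
        }
    }
    return temp;
}

int main() {
    std::vector<int> a = {7, 2, 9, 4, 11, 5, 1};
    for (int x : kLargest(a, 3)) printf("%d ", x);   // the 3 largest: 7 11 9
    printf("\n");
}

In the worst case (for example, an ascending input) the re-scan happens on every one of the n-k iterations, which is exactly where O((n-k)*k) comes from.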

Data structure to delete elements less than k in O(log N) where N is number of elements

How should a data structure with the below functions all in O(log N) be implemented?
insert(x) - add the integer x to the set
member(x) - check if the set contains the integer x
delete(x) - remove the integer x from the set
deleteLessThan(k) -
delete all numbers equal to or less than k
The only thing I can think of is using some kind of balanced BST to obtain the O(log N) for insert, member and delete.
The deleteLessThan() function would then look something like: find the smallest element larger than k, delete its left subtree and then rebalance. However, is it possible to rebalance a BST in O(log N) if you delete one of its subtrees?
Is amortised log N good enough? In that case, you can use a splay tree. All operations other than deleting the elements <= k are as explained on Wikipedia. For the remaining operation, splay the smallest element greater than k to the top, and delete its left subtree.
In case you allow amortisation, you can easily account for the deletion of M out of N nodes in O(M) time.
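This is not the splay tree from the answer, but a hedged C++ illustration of the same amortised accounting, with std::set standing in as the balanced BST: erasing the M elements <= k costs O(M + log N), and since each element can be erased at most once after it is inserted, the O(M) part is charged to the insertions, giving amortised O(log N) per operation.

#include <cstdio>
#include <set>

struct IntSet {
    std::set<int> s;
    void insert(int x) { s.insert(x); }               // O(log N)
    bool member(int x) { return s.count(x) > 0; }     // O(log N)
    void erase(int x) { s.erase(x); }                 // O(log N)
    void deleteLessThanOrEqual(int k) {
        s.erase(s.begin(), s.upper_bound(k));         // O(M + log N), amortised O(log N)
    }
};

int main() {
    IntSet t;
    for (int x : {5, 1, 9, 3, 7}) t.insert(x);
    t.deleteLessThanOrEqual(5);                       // removes 1, 3, 5
    printf("%d %d\n", t.member(5), t.member(7));      // prints 0 1
}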

Length of union of ranges

I need to find the length of the union of ranges in a one-dimensional coordinate system. I have many ranges of the form [a_i, b_i], and I need to find the length of their union. Ranges can be dynamically added or removed, and the length of the union can be queried at any state.
For example, if the ranges are:
[0-4]
[3-6]
[8-10]
The output should be 8.
Is there any suitable data structure for this purpose with the following upper bounds on complexity:
Insertion - O(log N)
Deletion - O(log N)
Query - O(log N)
For a moment, assume you have a sorted array, containing both start and end points, with the convention that a start point precedes an end point with the same coordinate. With your example, the array will contain
0:start, 3:start, 4:end, 6:end, 8:start, 10:end
(if there were an interval ending at 3, then 3:start would precede 3:end)
To do a query, perform a sweep from left to right, incrementing a counter on "start" and decrementing a counter on "end". Record as S the place where the counter increments from 0, and record as
E the place where the counter becomes zero again. At that point, add E - S to the total length. This is also a point where you could replace all the preceding intervals with the single interval [S, E].
Now, if you need O(log n) complexity for insertion/deletion, store the same elements (pairs of a coordinate and a start/end flag) in a balanced binary tree instead of an array.
The sweep is then performed according to the inorder traversal.
The query itself remains O(n).
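A hedged C++ sketch of that sweep (the names are mine): std::multiset stands in for the balanced tree, and events are (coordinate, flag) pairs with flag 0 for start and 1 for end, so a start sorts before an end at the same coordinate.

#include <cstdio>
#include <set>
#include <utility>

struct RangeUnion {
    std::multiset<std::pair<int, int>> events;   // (coordinate, 0 = start / 1 = end)

    void insert(int a, int b) {                  // O(log n)
        events.insert({a, 0});
        events.insert({b, 1});
    }
    void erase(int a, int b) {                   // O(log n); assumes [a, b] was inserted before
        events.erase(events.find({a, 0}));
        events.erase(events.find({b, 1}));
    }
    long long query() const {                    // O(n) sweep in sorted order
        long long total = 0;
        int depth = 0, start = 0;
        for (const auto& e : events) {
            if (e.second == 0) {                 // "start" event
                if (depth == 0) start = e.first; // a union interval opens here
                depth++;
            } else {                             // "end" event
                depth--;
                if (depth == 0) total += e.first - start;  // interval closes: add its length
            }
        }
        return total;
    }
};

int main() {
    RangeUnion u;
    u.insert(0, 4);
    u.insert(3, 6);
    u.insert(8, 10);
    printf("%lld\n", u.query());   // 8
}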
It's not quite O(lg n), but would an interval tree or segment tree suit your needs? You can keep the length of union in a variable, and when inserting or removing an interval, you can find in O(lg n + m) time what other m intervals intersect it, and then use that information to update the length variable in O(m) time.
Maintain a frequency array. Example: if your ranges are (0,2) and (1,3), your frequency array should be [1, 2, 2, 1]. Also maintain a count of the non-zero entries in the frequency array.
For insertion, increment the frequencies corresponding to that range. Update the count when an entry increases from 0 to 1 (but not from 1 to 2, etc.).
For deletion, decrement the frequencies. Update the count similarly.
For a query, output the count.
The complexity of insertion and deletion is proportional to the length of the range.
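A tiny C++ sketch of the frequency-array idea, treating each range as half-open [a, b) so that the count of covered positions equals the length of the union (the names and the integer-coordinate assumption are mine).

#include <cstdio>
#include <vector>

struct FreqUnion {
    std::vector<int> freq;   // freq[x] = how many ranges currently cover x
    int covered = 0;         // number of positions with freq[x] > 0

    FreqUnion(int maxCoord) : freq(maxCoord, 0) {}

    void insert(int a, int b) {                    // O(b - a)
        for (int x = a; x < b; x++)
            if (freq[x]++ == 0) covered++;         // 0 -> 1: newly covered
    }
    void erase(int a, int b) {                     // O(b - a)
        for (int x = a; x < b; x++)
            if (--freq[x] == 0) covered--;         // 1 -> 0: no longer covered
    }
    int query() const { return covered; }          // O(1)
};

int main() {
    FreqUnion u(16);
    u.insert(0, 4); u.insert(3, 6); u.insert(8, 10);
    printf("%d\n", u.query());   // 8
}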
