How does finding a Longest Increasing Subsequence that ends with a particular element lead to the solution of finding LIS - algorithm

I understand that to solve the LIS problem, we need to find, for every element of the array, the longest increasing subsequence of the prefix that ends with that particular element. What I don't understand is how that helps in finally finding an LIS of the whole unsorted array. I also understand that this leads to an optimal substructure property that lets the problem be solved, but, as mentioned, I don't see how finding LIS(j) that ends with arr[j] helps us.
Thanks.

Consider this sequence as an example:
a[] : 10 20 1 2 5 30 6 8 50 5 7
It produces the following sequence of LIS[i]:
a[] : 10 20 1 2 5 30 6 8 50 5 7
LIS[] : 1 2 1 2 3 4 4 5 6 3 5
Given this sequence, you can immediately find the length of the result, and its last element: the length is 6, and the last element is 50.
Now you can unfold the rest of the sequence, starting from the back: looking for LIS of 5 (one less than that of element 50) such that the number is less than 50 yields 8. Looking back further for 4 gives you 6 (there is no tie, because 30 is above 8). Next comes 5 with LIS of 3, and then a 2 with LIS of 2. Note that there is no tie again, even though 20 has the same LIS. This is because 20 is above 5. Finally, we find 1 with LIS of 1, completing the sequence:
50 8 6 5 2 1
Reversing this produces the longest increasing subsequence:
1 2 5 6 8 50
This is a common trick: given a table with the value of the function that you are maximizing (i.e. the length) you can produce the answer that yields this function (i.e. the sequence itself) by back-tracking the steps of the algorithm to the initial element.
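The table-then-backtrack idea can be sketched in Python as follows. This is only an illustration under the assumption of the simple O(n^2) table fill; the function names `lis_table` and `reconstruct` are made up for this example and are not from the answer.

```python
def lis_table(a):
    """lis[j] = length of the longest increasing subsequence ending at a[j]."""
    lis = [1] * len(a)
    for j in range(len(a)):
        for i in range(j):
            if a[i] < a[j] and lis[i] + 1 > lis[j]:
                lis[j] = lis[i] + 1
    return lis


def reconstruct(a, lis):
    """Walk backwards from the first maximum of lis, as described above."""
    if not a:
        return []
    j = max(range(len(a)), key=lis.__getitem__)  # first index with the largest value
    out = [a[j]]
    want = lis[j] - 1
    for i in range(j - 1, -1, -1):
        if want == 0:
            break
        # Take the nearest earlier element with the right length and a smaller value.
        if lis[i] == want and a[i] < out[-1]:
            out.append(a[i])
            want -= 1
    return out[::-1]


a = [10, 20, 1, 2, 5, 30, 6, 8, 50, 5, 7]
print(reconstruct(a, lis_table(a)))
```

For the example array this prints the subsequence 1 2 5 6 8 50 found by hand above.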

Related

Can you check for duplicates by taking the sum of the array and then the product of the array?

Let's say we have an array of size N with values from 1 to N inside it. We want to check if this array has any duplicates. My friend suggested two ways that I showed him were wrong:
Take the sum of the array and check it against the sum 1+2+3+...+N. I gave the example 1,1,4,4 which proves that this way is wrong since 1+1+4+4 = 1+2+3+4 despite there being duplicates in the array.
Next he suggested the same thing but with multiplication. i.e. check if the product of the elements in the array is equal to N!, but again this fails with an array like 2,2,3,2, where 2x2x3x2 = 1x2x3x4.
Finally, he suggested doing both checks, and if one of them fails, then there is a duplicate in the array. I can't help but feel that this is still incorrect, but I can't prove it to him by giving him an example of an array with duplicates that passes both checks. I understand that the burden of proof lies with him, not me, but I can't help but want to find an example where this doesn't work.
P.S. I understand there are many more efficient ways to solve such a problem, but we are trying to discuss this particular approach.
Is there a way to prove that doing both checks doesn't necessarily mean there are no duplicates?
Here's a counterexample: 1,3,3,3,4,6,7,8,10,10
Found by looking for a pair of composite numbers with factorizations that change the sum & count by the same amount.
I.e., 9 -> 3, 3 reduces the sum by 3 and increases the count by 1, and 10 -> 2, 5 does the same. So by converting 2,5 to 10 and 9 to 3,3, I leave both the sum and count unchanged. Also of course the product, since I'm replacing numbers with their factors & vice versa.
Here's a much longer one.
24 -> 2*3*4 increases the count by 2 and decreases the sum by 15
2*11 -> 22 decreases the count by 1 and increases the sum by 9
2*8 -> 16 decreases the count by 1 and increases the sum by 6.
We have a second 2 available because of the factorization of 24.
This gives us:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
Has the same sum, product, and count of elements as
1,3,3,4,4,5,6,7,9,10,12,13,14,15,16,16,17,18,19,20,21,22,22,23
In general you can find these by finding all factorizations of composite numbers, seeing how they change the sum & count (as above), and choosing changes in both directions (composite <-> factors) that cancel out.
I've just written a simple, not very efficient brute-force function, and it shows that there is, for example, the
1 2 4 4 4 5 7 9 9
sequence that has the same sum and product as
1 2 3 4 5 6 7 8 9
For n = 10 there are more such sequences:
1 2 3 4 6 6 6 7 10 10
1 2 4 4 4 5 7 9 9 10
1 3 3 3 4 6 7 8 10 10
1 3 3 4 4 4 7 9 10 10
2 2 2 3 4 6 7 9 10 10
My write-only c++ code is here: https://ideone.com/2oRCbh
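A brute-force search of this kind might look like the following Python sketch. It is not the linked C++ code; the function name `fooling_multisets` is an assumption for illustration. It enumerates every sorted multiset of n values from 1..n and keeps those that pass both the sum check and the product check without being 1..n itself.

```python
from itertools import combinations_with_replacement
from math import factorial, prod


def fooling_multisets(n):
    """Sorted n-tuples over 1..n with the sum and product of 1..n, excluding 1..n."""
    target_sum = n * (n + 1) // 2
    target_prod = factorial(n)
    identity = tuple(range(1, n + 1))
    return [
        c
        for c in combinations_with_replacement(range(1, n + 1), n)
        if c != identity and sum(c) == target_sum and prod(c) == target_prod
    ]


for candidate in fooling_multisets(10):
    print(*candidate)
```

The search space grows quickly with n, so this is only practical for small arrays.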

Which sorting algorithm produces these steps?

This was a multiple-choice question in an exam today, and (at least) one of the answers should be true, but to me they all look wrong.
The sorting steps are:
5 2 6 1 3 4
4 2 6 1 3 5
4 2 5 1 3 6
4 2 3 1 5 6
1 2 3 4 5 6
The available answers were: Bubble Sort, Insertion Sort, Selection Sort, Merge Sort and Quick Sort.
I think that it is Quick Sort. Here we can see the following steps:
A reference element (pivotValue) is selected at random, and the array is reordered with respect to it.
All values larger than the reference are moved to its right, and all values smaller than it are moved to its left.
The algorithm is repeated on the unsorted left and right parts of the array until every element is in its final position.
Why I think so:
It definitely isn't Bubble Sort, because Bubble Sort begins by comparing the first two elements of the array, so the first step would be 2 5 6 1 3 4.
It isn't Insertion Sort, because that is a sequential algorithm, and in the first step we see that the first and the last elements were compared.
It isn't Selection Sort, because Selection Sort finds the lowest value and moves it to the front, so the first step would be 1 5 2 6 3 4.
It isn't Merge Sort, because Merge Sort divides the array into two subarrays, and here we see interaction between the "first" and "second" parts.
None of them.
bubble sort: no. After k steps, the last k elements should be the k largest, sorted.
insertion sort: no. After k steps, the k first elements should be sorted.
selection sort: no. After k steps, the k first elements should be the k smallest, sorted.
merge sort: no. After k steps, a value can only have moved 2^k - 1 places. (5 moves 5 places at k=1)
quick sort: no. Whatever the pivot is, 1 and 6 being the extreme values, they cannot both stay in their initial positions.
On the quick sort: To make it clear that it is not possible, let's enumerate the results of each pivot for the first step:
5 : [2134] - 5 - [6]. (2134 may be in any order)
2 : [1] - 2 - [5634]
6 : [52134] - 6
1 : 1 - [52634]
3 : [21] - 3 - [564]
4 : [213] - 4 - [56]
One obvious way of seeing that all those are incompatible with the OP's output is that in each case, the 1 is before the 6, no matter how you implement the pivot or the partition.
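The invariants listed above for bubble, insertion, and selection sort can also be checked mechanically against the exam's steps. The following Python sketch is only an illustration; the helper names are invented here, and "step k" is taken to mean the k-th printed line after the initial array.

```python
# The four lines that follow the initial array 5 2 6 1 3 4 in the question.
STEPS = [
    [4, 2, 6, 1, 3, 5],
    [4, 2, 5, 1, 3, 6],
    [4, 2, 3, 1, 5, 6],
    [1, 2, 3, 4, 5, 6],
]


def consistent_with_bubble(steps):
    # After k steps the last k elements are the k largest, in sorted order.
    return all(s[-k:] == sorted(s)[-k:] for k, s in enumerate(steps, start=1))


def consistent_with_insertion(steps):
    # After k steps the first k elements are sorted among themselves.
    return all(s[:k] == sorted(s[:k]) for k, s in enumerate(steps, start=1))


def consistent_with_selection(steps):
    # After k steps the first k elements are the k smallest, in sorted order.
    return all(s[:k] == sorted(s)[:k] for k, s in enumerate(steps, start=1))


print(consistent_with_bubble(STEPS),
      consistent_with_insertion(STEPS),
      consistent_with_selection(STEPS))
```

All three checks fail for the given steps, which matches the reasoning above.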
To solve this, all you have to do is write a function for each sorting algorithm that prints the array out after each swap. Then apply your print-friendly sorting functions to the initial array [5 2 6 1 3 4] and see which sort method produces the same output. Additionally, this will help you compare all the different methods.
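As a rough illustration of that suggestion, here is a Python sketch that records the array after every swap for two of the candidates. The textbook variants chosen here (adjacent-swap bubble sort, swap-based selection sort) and the names `bubble_trace` and `selection_trace` are assumptions; other implementations would produce different traces.

```python
def bubble_trace(values):
    """Bubble sort that records the array after every swap."""
    a = list(values)
    trace = []
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                trace.append(list(a))
    return trace


def selection_trace(values):
    """Selection sort that records the array after every swap."""
    a = list(values)
    trace = []
    for i in range(len(a) - 1):
        smallest = min(range(i, len(a)), key=a.__getitem__)
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
            trace.append(list(a))
    return trace


start = [5, 2, 6, 1, 3, 4]
for name, trace in (("bubble", bubble_trace(start)), ("selection", selection_trace(start))):
    print(name)
    for row in trace:
        print(" ", *row)
```

Comparing the printed rows with the steps from the exam shows quickly whether a given variant can be ruled out.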

Find the number of non-decreasing and non-increasing subsequences in an array

I am attempting to complete a programming challenge from Quora on HackerRank: https://www.hackerrank.com/contests/quora-haqathon/challenges/upvotes
I have designed a solution that works with some test cases; however, for many of them the algorithm that I am using is incorrect.
Rather than seeking a solution, I am simply asking for an explanation of how the subsequences are formed, and then I will implement a solution myself.
For example, with the input:
6 6
5 5 4 1 8 7
the correct output is -5, but I fail to see how -5 is the answer. The subsequence would be [5 5 4 1 8 7] and I cannot for the life of me find a means to get -5 as the output.
Problem Statement
At Quora, we have aggregate graphs that track the number of upvotes we get each day.
As we looked at patterns across windows of certain sizes, we thought about ways to track trends such as non-decreasing and non-increasing subranges as efficiently as possible.
For this problem, you are given N days of upvote count data, and a fixed window size K. For each window of K days, from left to right, find the number of non-decreasing subranges within the window minus the number of non-increasing subranges within the window.
A window of days is defined as contiguous range of days. Thus, there are exactly N−K+1 windows where this metric needs to be computed. A non-decreasing subrange is defined as a contiguous range of indices [a,b], a<b, where each element is at least as large as the previous element. A non-increasing subrange is similarly defined, except each element is at least as large as the next. There are up to K(K−1)/2 of these respective subranges within a window, so the metric is bounded by [−K(K−1)/2,K(K−1)/2].
Constraints
1≤N≤100,000 days
1≤K≤N days
Input Format
Line 1: Two integers, N and K
Line 2: N positive integers of upvote counts, each integer less than or equal to 10^9
Output Format
Line 1..: N−K+1 integers, one integer for each window's result on each line
Sample Input
5 3
1 2 3 1 1
Sample Output
3
0
-2
Explanation
For the first window of [1, 2, 3], there are 3 non-decreasing subranges and 0 non-increasing, so the answer is 3. For the second window of [2, 3, 1], there is 1 non-decreasing subrange and 1 non-increasing, so the answer is 0. For the third window of [3, 1, 1], there is 1 non-decreasing subrange and 3 non-increasing, so the answer is -2.
Given a window size of 6, and the sequence
5 5 4 1 8 7
the non-decreasing subsequences are
5 5
1 8
and the non-increasing subsequences are
5 5
5 4
4 1
8 7
5 5 4
5 4 1
5 5 4 1
So that's +2 for the non-decreasing subsequences and -7 for the non-increasing subsequences, giving -5 as the final answer.
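The counting above can be reproduced with a short Python sketch. It is a straightforward per-window count, assumed here only to illustrate the definition; it takes O(N·K) time overall, so it is not meant to satisfy the contest constraints. The names `count_subranges`, `window_metric`, and `all_windows` are made up for this example.

```python
def count_subranges(values, in_order):
    """Count contiguous ranges [a, b], a < b, whose adjacent pairs all satisfy in_order."""
    total = 0
    run = 1  # length of the current maximal run ending at the current element
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if in_order(prev, cur) else 1
        total += run - 1  # number of qualifying ranges that end at cur
    return total


def window_metric(window):
    non_decreasing = count_subranges(window, lambda p, c: p <= c)
    non_increasing = count_subranges(window, lambda p, c: p >= c)
    return non_decreasing - non_increasing


def all_windows(counts, k):
    return [window_metric(counts[i:i + k]) for i in range(len(counts) - k + 1)]


print(all_windows([5, 5, 4, 1, 8, 7], 6))
print(all_windows([1, 2, 3, 1, 1], 3))
```

Note that an equal pair such as 5 5 counts as both non-decreasing and non-increasing, which is why it appears in both lists above.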

Array size in Cycle leader iteration Algorithm [duplicate]

The cycle leader iteration algorithm is an algorithm for shuffling an array by moving all even-numbered entries to the front and all odd-numbered entries to the back while preserving their relative order. For example, given this input:
a 1 b 2 c 3 d 4 e 5
the output would be
a b c d e 1 2 3 4 5
This algorithm runs in O(n) time and uses only O(1) space.
One unusual detail of the algorithm is that it works by splitting the array up into blocks of size 3^k + 1. Apparently this is critical for the algorithm to work correctly, but I have no idea why this is.
Why is the choice of 3^k + 1 necessary in the algorithm?
Thanks!
This is going to be a long answer. The answer to your question isn't simple and requires some number theory to fully answer. I've spent about half a day working through the algorithm and I now have a good answer, but I'm not sure I can describe it succinctly.
The short version:
Breaking the input into blocks of size 3^k + 1 essentially breaks the input apart into blocks of size 3^k - 1 surrounded by two elements that do not end up moving.
The remaining 3^k - 1 elements in the block move according to an interesting pattern: each element moves to the position given by dividing the index by two modulo 3^k.
This particular motion pattern is connected to a concept from number theory and group theory called primitive roots.
Because the number two is a primitive root modulo 3^k, beginning with the numbers 1, 3, 9, 27, etc. and running the pattern is guaranteed to cycle through all the elements of the array exactly once and put them into the proper place.
This pattern is highly dependent on the fact that 2 is a primitive root of 3^k for any k ≥ 1. Changing the size of the array to another value will almost certainly break this because the wrong property is preserved.
The Long Version
To present this answer, I'm going to proceed in steps. First, I'm going to introduce cycle decompositions as a motivation for an algorithm that will efficiently shuffle the elements around in the right order, subject to an important caveat. Next, I'm going to point out an interesting property of how the elements happen to move around in the array when you apply this permutation. Then, I'll connect this to a number-theoretic concept called primitive roots to explain the challenges involved in implementing this algorithm correctly. Finally, I'll explain why this leads to the choice of 3^k + 1 as the block size.
Cycle Decompositions
Let's suppose that you have an array A and a permutation of the elements of that array. Following the standard mathematical notation, we'll denote the permutation of that array as σ(A). We can line the initial array A up on top of the permuted array σ(A) to get a sense for where every element ended up. For example, here's an array and one of its permutations:
A 0 1 2 3 4
σ(A) 2 3 0 4 1
One way that we can describe a permutation is just to list off the new elements inside that permutation. However, from an algorithmic perspective, it's often more helpful to represent the permutation as a cycle decomposition, a way of writing out a permutation by showing how to form that permutation by beginning with the initial array and then cyclically permuting some of its elements.
Take a look at the above permutation. First, look at where the 0 ended up. In σ(A), the element 0 ended up taking the place of where the element 2 used to be. In turn, the element 2 ended up taking the place of where the element 0 used to be. We denote this by writing (0 2), indicating that 0 should go where 2 used to be, and 2 should go where 0 used to be.
Now, look at the element 1. The element 1 ended up where 4 used to be. The number 4 then ended up where 3 used to be, and the element 3 ended up where 1 used to be. We denote this by writing (1 4 3), that 1 should go where 4 used to be, that 4 should go where 3 used to be, and that 3 should go where 1 used to be.
Combining these together, we can represent the overall permutation of the above elements as (0 2)(1 4 3) - we should swap 0 and 2, then cyclically permute 1, 4, and 3. If we do that starting with the initial array, we'll end up at the permuted array that we want.
Cycle decompositions are extremely useful for permuting arrays in place because it's possible to permute any individual cycle in O(C) time and O(1) auxiliary space, where C is the number of elements in the cycle. For example, suppose that you have a cycle (1 6 8 4 2). You can permute the elements in the cycle with code like this:
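int[] cycle = {1, 6, 8, 4, 2};
int temp = array[cycle[0]];
for (int i = 1; i < cycle.length; i++) {
    // Drop the carried value into the next slot and pick up what was there.
    int displaced = array[cycle[i]];
    array[cycle[i]] = temp;
    temp = displaced;
}
// The last displaced value closes the cycle.
array[cycle[0]] = temp;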
This works by just swapping everything around until everything comes to rest. Aside from the space usage required to store the cycle itself, it only needs O(1) auxiliary storage space.
In general, if you want to design an algorithm that applies a particular permutation to an array of elements, you can usually do so by using cycle decompositions. The general algorithm is the following:
for (each cycle in the cycle decomposition) {
    apply the above algorithm to cycle those elements;
}
The overall time and space complexity for this algorithm depends on the following:
How quickly can we determine the cycle decomposition we want?
How efficiently can we store that cycle decomposition in memory?
To get an O(n)-time, O(1)-space algorithm for the problem at hand, we're going to show that there's a way to determine the cycle decomposition in O(1) time and space. Since everything will get moved exactly once, the overall runtime will be O(n) and the overall space complexity will be O(1). It's not easy to get there, as you'll see, but then again, it's not awful either.
The Permutation Structure
The overarching goal of this problem is to take an array of 2n elements and shuffle it so that even-positioned elements end up at the front of the array and odd-positioned elements end up at the end of the array. Let's suppose for now that we have 14 elements, like this:
0 1 2 3 4 5 6 7 8 9 10 11 12 13
We want to shuffle the elements so that they come out like this:
0 2 4 6 8 10 12 1 3 5 7 9 11 13
There are a couple of useful observations we can have about the way that this permutation arises. First, notice that the first element does not move in this permutation, because even-indexed elements are supposed to show up in the front of the array and it's the first even-indexed element. Next, notice that the last element does not move in this permutation, because odd-indexed elements are supposed to end up at the back of the array and it's the last odd-indexed element.
These two observations, put together, mean that if we want to permute the elements of the array in the desired fashion, we actually only need to permute the subarray consisting of the overall array with the first and last elements dropped off. Therefore, going forward, we are purely going to focus on the problem of permuting the middle elements. If we can solve that problem, then we've solved the overall problem.
Now, let's look at just the middle elements of the array. From our above example, that means that we're going to start with an array like this one:
Element 1 2 3 4 5 6 7 8 9 10 11 12
Index 1 2 3 4 5 6 7 8 9 10 11 12
We want to get the array to look like this:
Element 2 4 6 8 10 12 1 3 5 7 9 11
Index 1 2 3 4 5 6 7 8 9 10 11 12
Because this array was formed by taking a 0-indexed array and chopping off the very first and very last element, we can treat this as a one-indexed array. That's going to be critically important going forward, so be sure to keep that in mind.
So how exactly can we go about generating this permutation? Well, for starters, it doesn't hurt to take a look at each element and to try to figure out where it began and where it ended up. If we do so, we can write things out like this:
The element at position 1 ended up at position 7.
The element at position 2 ended up at position 1.
The element at position 3 ended up at position 8.
The element at position 4 ended up at position 2.
The element at position 5 ended up at position 9.
The element at position 6 ended up at position 3.
The element at position 7 ended up at position 10.
The element at position 8 ended up at position 4.
The element at position 9 ended up at position 11.
The element at position 10 ended up at position 5.
The element at position 11 ended up at position 12.
The element at position 12 ended up at position 6.
If you look at this list, you can spot a few patterns. First, notice that the final index of all the even-numbered elements is always half the position of that element. For example, the element at position 4 ended up at position 2, the element at position 12 ended up at position 6, etc. This makes sense - we pushed all the even elements to the front of the array, so half of the elements that came before them will have been displaced and moved out of the way.
Now, what about the odd-numbered elements? Well, there are 12 total elements. Each odd-numbered element gets pushed to the second half, so an odd-numbered element at position 2k+1 will get pushed to at least position 7. Its position within the second half is given by the value of k. Therefore, the element at an odd position 2k+1 gets mapped to position 7 + k.
We can take a minute to generalize this idea. Suppose that the array we're permuting has length 2n. An element at position 2x will be mapped to position x (again, even numbers get halved), and an element at position 2x+1 will be mapped to position n + 1 + x. Restating this:
The final position of an element at position p is determined as follows:
If p = 2x for some integer x, then 2x ↦ x
If p = 2x+1 for some integer x, then 2x+1 ↦ n + 1 + x
And now we're going to do something that's entirely crazy and unexpected. Right now, we have a piecewise rule for determining where each element ends up: we either divide by two, or we do something weird involving n + 1. However, from a number-theoretic perspective, there is a single, unified rule explaining where all elements are supposed to end up.
The insight we need is that in both cases, it seems like, in some way, we're dividing the index by two. For the even case, the new index really is formed by just dividing by two. For the odd case, the new index kinda looks like it's formed by dividing by two (notice that 2x+1 went to x + (n + 1)), but there's an extra term in there. In a number-theoretic sense, though, both of these really correspond to division by two. Here's why.
Rather than taking the source index and dividing by two to get the destination index, what if we take the destination index and multiply by two? If we do that, an interesting pattern emerges.
Suppose our original number was 2x. The destination is then x, and if we double the destination index to get back 2x, we end up with the source index.
Now suppose that our original number was 2x+1. The destination is then n + 1 + x. Now, what happens if we double the destination index? If we do that, we get back 2n + 2 + 2x. If we rearrange this, we can alternatively rewrite this as (2x+1) + (2n+1). In other words, we've gotten back the original index, plus an extra (2n+1) term.
Now for the kicker: what if all of our arithmetic is done modulo 2n + 1? In that case, if our original number was 2x + 1, then twice the destination index is (2x+1) + (2n+1) = 2x + 1 (modulo 2n+1). In other words, the destination index really is half of the source index, just done modulo 2n+1!
This leads us to a very, very interesting insight: the ultimate destination of each of the elements in a 2n-element array is given by dividing that number by two, modulo 2n+1. This means that there really is a nice, unified rule for determining where everything goes. We just need to be able to divide by two modulo 2n+1. It just happens to work out that in the even case, this is normal integer division, and in the odd case, it works out to taking the form n + 1 + x.
Consequently, we can reframe our problem in the following way: given a 1-indexed array of 2n elements, how do we permute the elements so that each element that was originally at index x ends up at position x/2 mod (2n+1)?
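The claim that the piecewise rule and "divide by two modulo 2n+1" agree can be spot-checked with a small Python sketch. It relies on the observation that n + 1 is the inverse of 2 modulo 2n + 1, since 2(n + 1) = (2n + 1) + 1. The function names are invented for this illustration.

```python
def piecewise_destination(p, n):
    """Destination of 1-indexed position p in a 2n-element array, by cases."""
    if p % 2 == 0:
        return p // 2          # 2x goes to x
    return n + 1 + p // 2      # 2x+1 goes to n + 1 + x


def modular_destination(p, n):
    """Same destination, computed as p / 2 modulo 2n + 1."""
    inverse_of_two = n + 1     # 2 * (n + 1) = 2n + 2, which is 1 mod 2n + 1
    return (p * inverse_of_two) % (2 * n + 1)


n = 6
print([modular_destination(p, n) for p in range(1, 2 * n + 1)])
```

For n = 6 this reproduces the list of destinations written out earlier for the 12-element example.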
Cycle Decompositions Revisited
At this point, we've made quite a lot of progress. Given any element, we know where that element should end up. If we can figure out a nice way to get a cycle decomposition of the overall permutation, we're done.
This is, unfortunately, where things get complicated. Suppose, for example, that our array has 10 elements. In that case, we want to transform the array like this:
Initial: 1 2 3 4 5 6 7 8 9 10
Final: 2 4 6 8 10 1 3 5 7 9
The cycle decomposition of this permutation is (1 6 3 7 9 10 5 8 4 2). If our array has 12 elements, we want to transform it like this:
Initial: 1 2 3 4 5 6 7 8 9 10 11 12
Final: 2 4 6 8 10 12 1 3 5 7 9 11
This has cycle decomposition (1 7 10 5 9 11 12 6 3 8 4 2). If our array has 14 elements, we want to transform it like this:
Initial: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Final: 2 4 6 8 10 12 14 1 3 5 7 9 11 13
This has cycle decomposition (1 8 4 2)(3 9 12 6)(5 10)(7 11 13 14). If our array has 16 elements, we want to transform it like this:
Initial: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Final: 2 4 6 8 10 12 14 16 1 3 5 7 9 11 13 15
This has cycle decomposition (1 9 13 15 16 8 4 2)(3 10 5 11 14 7 12 6).
The problem here is that these cycles don't seem to follow any predictable patterns. This is a real problem if we're going to try to solve this problem in O(1) space and O(n) time. Even though given any individual element we can figure out what cycle contains it and we can efficiently shuffle that cycle, it's not clear how we figure out what elements belong to what cycles, how many different cycles there are, etc.
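The cycle decompositions quoted above can be regenerated by following each position to its destination until the walk returns to where it started. The Python sketch below does this with a visited set, so it uses O(n) extra space and is only a way to inspect the cycles, not the O(1)-space algorithm being discussed. The function name `cycle_decomposition` is an assumption for this example.

```python
def cycle_decomposition(n):
    """Cycles of the map p -> p / 2 mod (2n + 1) on positions 1..2n."""
    modulus = 2 * n + 1
    inverse_of_two = n + 1
    seen = set()
    cycles = []
    for start in range(1, 2 * n + 1):
        if start in seen:
            continue
        cycle = []
        p = start
        while p not in seen:
            seen.add(p)
            cycle.append(p)
            p = (p * inverse_of_two) % modulus
        cycles.append(tuple(cycle))
    return cycles


for n in (5, 6, 7, 8):
    print(2 * n, cycle_decomposition(n))
```

Running it for other sizes makes the irregularity easy to see: the number and lengths of the cycles change from one array length to the next.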
Primitive Roots
This is where number theory comes in. Remember that each element's new position is formed by dividing that number by two, modulo 2n+1. Thinking about this backwards, we can figure out which number will take the place of each number by multiplying by two modulo 2n+1. Therefore, we can think of this problem by finding the cycle decomposition in reverse: we pick a number, keep multiplying it by two and modding by 2n+1, and repeat until we're done with the cycle.
This gives rise to a well-studied problem. Suppose that we start with the number k and think about the sequence k, 2k, 2^2·k, 2^3·k, 2^4·k, etc., all done modulo 2n+1. Doing this gives different patterns depending on what odd number 2n+1 you're modding by. This explains why the above cycle patterns seem somewhat arbitrary.
I have no idea how anyone figured this out, but it turns out that there's a beautiful result from number theory that talks about what happens if you take this pattern mod 3^k for some number k:
Theorem: Consider the sequence 3^s, 3^s·2, 3^s·2^2, 3^s·2^3, 3^s·2^4, etc. all modulo 3^k for some k ≥ s. This sequence cycles through every number between 1 and 3^k, inclusive, that is divisible by 3^s but not divisible by 3^(s+1).
We can try this out on a few examples. Let's work modulo 27 = 3^3. The theorem says that if we look at 3, 3 · 2, 3 · 4, etc. all modulo 27, then we should see all the numbers less than 27 that are divisible by 3 and not divisible by 9. Well, let's see what we get:
3 · 2^0 = 3 · 1 = 3 = 3 mod 27
3 · 2^1 = 3 · 2 = 6 = 6 mod 27
3 · 2^2 = 3 · 4 = 12 = 12 mod 27
3 · 2^3 = 3 · 8 = 24 = 24 mod 27
3 · 2^4 = 3 · 16 = 48 = 21 mod 27
3 · 2^5 = 3 · 32 = 96 = 15 mod 27
3 · 2^6 = 3 · 64 = 192 = 3 mod 27
We ended up seeing 3, 6, 12, 15, 21, and 24 (though not in that order), which are indeed all the numbers less than 27 that are divisible by 3 but not divisible by 9.
We can also try this working mod 27 and considering 1, 2, 2^2, 2^3, 2^4, etc. mod 27, and we should see all the numbers less than 27 that are divisible by 1 and not divisible by 3. In other words, this should give back all the numbers less than 27 that aren't divisible by 3. Let's see if that's true:
2^0 = 1 = 1 mod 27
2^1 = 2 = 2 mod 27
2^2 = 4 = 4 mod 27
2^3 = 8 = 8 mod 27
2^4 = 16 = 16 mod 27
2^5 = 32 = 5 mod 27
2^6 = 64 = 10 mod 27
2^7 = 128 = 20 mod 27
2^8 = 256 = 13 mod 27
2^9 = 512 = 26 mod 27
2^10 = 1024 = 25 mod 27
2^11 = 2048 = 23 mod 27
2^12 = 4096 = 19 mod 27
2^13 = 8192 = 11 mod 27
2^14 = 16384 = 22 mod 27
2^15 = 32768 = 17 mod 27
2^16 = 65536 = 7 mod 27
2^17 = 131072 = 14 mod 27
2^18 = 262144 = 1 mod 27
Sorting these, we got back the numbers 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26 (though not in that order). These are exactly the numbers between 1 and 26 that aren't multiples of three!
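Both hand computations can be repeated with a few lines of Python. This is just a sketch for experimenting with other moduli; the function name `doubling_orbit` is made up here.

```python
def doubling_orbit(start, modulus):
    """Values start, 2*start, 4*start, ... modulo modulus, until the first repeat."""
    orbit = []
    value = start % modulus
    while value not in orbit:
        orbit.append(value)
        value = (2 * value) % modulus
    return orbit


print(doubling_orbit(3, 27))          # multiples of 3 that are not multiples of 9
print(sorted(doubling_orbit(1, 27)))  # everything below 27 that is not a multiple of 3
```

Trying a modulus that is not a power of three, such as 15, shows the orbit of 1 covering only part of the non-multiples, which is the irregular behaviour seen in the earlier cycle decompositions.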
This theorem is crucial to the algorithm for the following reason: if 2n+1 = 3^k for some number k, then if we process the cycle containing 1, it will properly shuffle all numbers that aren't multiples of three. If we then start the cycle at 3, it will properly shuffle all numbers that are divisible by 3 but not by 9. If we then start the cycle at 9, it will properly shuffle all numbers that are divisible by 9 but not by 27. More generally, if we use the cycle shuffle algorithm on the numbers 1, 3, 9, 27, 81, etc., then we will properly reposition all the elements in the array exactly once and will not have to worry that we missed anything.
So how does this connect to 3^k + 1? Well, we need to have that 2n + 1 = 3^k, so we need to have that 2n = 3^k - 1. But remember - we dropped the very first and very last element of the array when we did this! Adding those back in tells us that we need blocks of size 3^k + 1 for this procedure to work correctly. If the blocks are this size, then we know for certain that the cycle decomposition will consist of a cycle containing 1, a nonoverlapping cycle containing 3, a nonoverlapping cycle containing 9, etc. and that these cycles will contain all the elements of the array. Consequently, we can just start cycling 1, 3, 9, 27, etc. and be absolutely guaranteed that everything gets shuffled around correctly. That's amazing!
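Putting the pieces together for a single block, a minimal Python sketch might look like this. It assumes the input list has length exactly 3^k + 1 and handles only that one block; splitting an arbitrary array into such blocks is not covered. The function name `shuffle_block` is hypothetical. Because the first element was dropped to form the 1-indexed middle, a 1-indexed middle position coincides with the 0-indexed position in the full block.

```python
def shuffle_block(a):
    """In place: move even-indexed entries to the front and odd-indexed to the back.

    Assumes len(a) == 3**k + 1 for some k >= 0.
    """
    modulus = len(a) - 1                 # this must be a power of three
    check = modulus
    while check > 1 and check % 3 == 0:
        check //= 3
    if check != 1:
        raise ValueError("block length must be 3**k + 1")

    inverse_of_two = (modulus + 1) // 2  # dividing by two modulo 3**k
    leader = 1
    while leader < modulus:              # cycle leaders 1, 3, 9, 27, ...
        carried = a[leader]
        position = (leader * inverse_of_two) % modulus
        while position != leader:
            # Put the carried element down and pick up the one it displaces.
            a[position], carried = carried, a[position]
            position = (position * inverse_of_two) % modulus
        a[leader] = carried
        leader *= 3


block = list(range(10))                  # 3**2 + 1 elements
shuffle_block(block)
print(block)
```

Only a constant number of scalar variables are used besides the array itself, and each middle element is moved once, in line with the O(n) time and O(1) space claim.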
And why is this theorem true? It turns out that a number k for which 1, k, k^2, k^3, etc. mod p^n cycles through all the numbers that aren't multiples of p (assuming p is prime) is called a primitive root of the number p^n. There's a theorem that says that 2 is a primitive root of 3^k for all numbers k, which is why this trick works. If I have time, I'd like to come back and edit this answer to include a proof of this result, though unfortunately my number theory isn't at a level where I know how to do this.
Summary
This problem was tons of fun to work on. It involves cute tricks with dividing by two modulo an odd number, cycle decompositions, primitive roots, and powers of three. I'm indebted to this arXiv paper which described a similar (though quite different) algorithm and gave me a sense for the key trick behind the technique, which then let me work out the details for the algorithm you described.
Hope this helps!
Here is most of the mathematical argument missing from templatetypedef’s
answer. (The rest is comparatively boring.)
Lemma: for all integers k >= 1, we have
2^(2*3^(k-1)) = 1 + 3^k mod 3^(k+1).
Proof: by induction on k.
Base case (k = 1): we have 2^(2*3^(1-1)) = 4 = 1 + 3^1 mod 3^(1+1).
Inductive case (k >= 2): if 2^(2*3^(k-2)) = 1 + 3^(k-1) mod 3^k,
then q = (2^(2*3^(k-2)) - (1 + 3^(k-1)))/3^k is an integer, and
2^(2*3^(k-1)) = (2^(2*3^(k-2)))^3
= (1 + 3^(k-1) + 3^k*q)^3
= 1 + 3*(3^(k-1)) + 3*(3^(k-1))^2 + (3^(k-1))^3
+ 3*(1+3^(k-1))^2*(3^k*q) + 3*(1+3^(k-1))*(3^k*q)^2 + (3^k*q)^3
= 1 + 3^k mod 3^(k+1).
Theorem: for all integers i >= 0 and k >= 1, we have
2^i = 1 mod 3^k if and only if i = 0 mod 2*3^(k-1).
Proof: the “if” direction follows from the Lemma. If
i = 0 mod 2*3^(k-1), then
2^i = (2^(2*3^(k-1)))^(i/(2*3^(k-1)))
= (1+3^k)^(i/(2*3^(k-1))) mod 3^(k+1)
= 1 mod 3^k.
The “only if” direction is by induction on k.
Base case (k = 1): if i != 0 mod 2, then i = 1 mod 2, and
2^i = (2^2)^((i-1)/2)*2
= 4^((i-1)/2)*2
= 2 mod 3
!= 1 mod 3.
Inductive case (k >= 2): if 2^i = 1 mod 3^k, then
2^i = 1 mod 3^(k-1), and the inductive hypothesis implies that
i = 0 mod 2*3^(k-2). Let j = i/(2*3^(k-2)). By the Lemma,
1 = 2^i mod 3^k
= (1+3^(k-1))^j mod 3^k
= 1 + j*3^(k-1) mod 3^k,
where the dropped terms are divisible by (3^(k-1))^2, so
j = 0 mod 3, and i = 0 mod 2*3^(k-1).
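The Lemma and the Theorem can be sanity-checked numerically for small k. The Python sketch below is not a proof, just a quick confirmation of the statements as written; the function names are invented for this illustration.

```python
def lemma_holds(k):
    """Check 2^(2*3^(k-1)) = 1 + 3^k mod 3^(k+1)."""
    return pow(2, 2 * 3 ** (k - 1), 3 ** (k + 1)) == 1 + 3 ** k


def order_of_two(modulus):
    """Smallest i >= 1 with 2^i = 1 mod modulus (modulus must be odd and > 1)."""
    value = 2 % modulus
    i = 1
    while value != 1:
        value = (2 * value) % modulus
        i += 1
    return i


for k in range(1, 7):
    print(k, lemma_holds(k), order_of_two(3 ** k), 2 * 3 ** (k - 1))
```

The Theorem says the order of 2 modulo 3^k is exactly 2*3^(k-1), which equals the number of residues not divisible by 3, so 2 is a primitive root of 3^k.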

Quick method to count the number of overlapping intervals in an array of intervals?

OK, this is a question I got for my advanced algorithms class. I already turned in my solution once, but it was rejected by my instructor because of efficiency issues. In other words, I have already made an effort on my part but could not get it even after his hint, so please be gentle. I will give his hint below.
Given an array of intervals, each with a start point and an end point, find for each interval the number of other intervals that fall within it. The number of intervals is less than 10^9 and their ids are distinct. Start and end are less than 10^18, and the input files don't contain duplicate numbers for start and end. All the numbers above are integers.
The hint is: consider a data structure with buckets. The algorithm should be faster than O(n^2).
sample input and output
input:
5 %% number of intervals
2 100 200 %% id, start,end. all lines below follows this
3 110 190
4 105 145
1 90 150
5 102 198
output:
3 0
4 0
1 1
5 2
2 3
The numbers are pretty big, so O(N log N) might be a little too much, but here's an idea.
First things first: normalize the values, which means making them smaller while keeping the same ordering. In your example the normalization would be
90 100 102 105 110 145 150 190 198 200
1 2 3 4 5 6 7 8 9 10
So your new intervals are:
5
2 2 10
3 5 8
4 4 6
1 1 7
5 3 9
Now the edges of the intervals are in the range of [1, 2N].
Now sort the intervals by their end:
5
4 4 6
1 1 7
3 5 8
5 3 9
2 2 10
When you reach an interval you can say that all the intervals that start before it and have not been encountered yet should have their answer increased by one. This can be done with a SegmentTree.
When you get an interval [x, y], you increase all values in the range [1, x - 1] by 1 and then compute its answer as the value at x in the segment tree. That's just addition on an interval and query on a point, a common segment tree problem.
I don't really think you can solve this problem with less than O(N log N) time and O(N) memory, so this solution should be the asymptotically best solution in both time and space.
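The approach described above can be sketched in Python as follows. Note one substitution: instead of a segment tree, this sketch uses a Fenwick (binary indexed) tree over a difference array, which supports the same "add on a range, query a point" operations. The function name `count_contained` and the tuple layout `(id, start, end)` are assumptions for this example.

```python
def count_contained(intervals):
    """For each (id, start, end), count the other intervals lying inside it.

    Assumes all start and end values are distinct, as stated in the question.
    """
    # Normalize: map every endpoint to its rank in 1..2N.
    coords = sorted(p for _, s, e in intervals for p in (s, e))
    rank = {value: i + 1 for i, value in enumerate(coords)}
    size = len(coords)
    tree = [0] * (size + 1)

    def add(i, delta):           # point update on the difference array
        while i <= size:
            tree[i] += delta
            i += i & -i

    def value_at(i):             # prefix sum of differences = current value at i
        total = 0
        while i > 0:
            total += tree[i]
            i -= i & -i
        return total

    result = {}
    for ident, start, _end in sorted(intervals, key=lambda t: t[2]):
        x = rank[start]
        result[ident] = value_at(x)
        if x > 1:                # add 1 to every position in [1, x - 1]
            add(1, 1)
            add(x, -1)
    return result


sample = [(2, 100, 200), (3, 110, 190), (4, 105, 145), (1, 90, 150), (5, 102, 198)]
print(count_contained(sample))
```

Sorting dominates the cost, so the sketch runs in O(N log N) time with O(N) extra memory, matching the bounds discussed above.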
