HackerEarth's Monk and Rotation: Time Limit Exceeded

I am trying to solve the following question from HackerEarth, which is Monk and Rotation.
Please refer to the link: https://www.hackerearth.com/practice/codemonk/ .
It's a beginner-level problem which I managed to solve, but unfortunately I am facing a Time Limit Exceeded issue on input #5.
I have tried to optimize the solution, and I think I managed to do so to some extent relative to my crude initial solution, but sadly that's not enough.
I suspect the flaw lies in how I handle the input and that it could be the bottleneck.
It may also have to do with how I am handling slices, but I don't see a better way than that. Before optimizing I was certain that slices were the culprit.
Edit:
I was not aware that the question is not accessible without login, so here is a brief explanation.
Input:
The first line consists of one integer T, denoting the number of test cases. For each test case:
The first line consists of two integers N and K, N being the number of elements in the array and K denoting the number of steps of rotation.
The next line consists of N space-separated integers, denoting the elements of the array A.
Output:
Print the required array.
Constraints:
1 ≤ T ≤ 20
1 ≤ N ≤ 10^5
0 ≤ K ≤ 10^6
0 ≤ A[i] ≤ 10^6
Sample Input:
1
5 2
1 2 3 4 5
Sample Output:
4 5 1 2 3
Explanation
Here T is 1, which means one test case.
N = 5 denoting the number of elements in the array and K = 2, denoting the number of steps of rotations.
The initial array is 1,2,3,4,5:
In first rotation, 5 will come in the first position and all other elements will move to one position ahead from their current position. Now, the resultant array will be 5,1,2,3,4.
In second rotation, 4 will come in the first position and all other elements will move to one position ahead from their current position. Now, the resultant array will be 4,5,1,2,3
Time Limit: 1.0 sec(s) for each input file (2x for Golang)
Memory Limit: 256 MB
Source Limit: 1024 KB.
My code:
package main

import (
    "fmt"
)

func main() {
    var testCases int
    var arrSize, rotation int
    fmt.Scanln(&testCases)
    for i := 0; i < testCases; i++ {
        fmt.Scanln(&arrSize, &rotation)
        arr := make([]int, arrSize)
        for i := range arr {
            fmt.Scanf("%d", &arr[i])
        }
        rotation = rotation % arrSize
        for i := arrSize - rotation; i < arrSize; i++ {
            fmt.Printf("%v ", arr[i])
        }
        for i := 0; i < arrSize-rotation; i++ {
            fmt.Printf("%v ", arr[i])
        }
        fmt.Print("\n")
    }
}

The time complexity of your solution is already O(n), so the algorithm itself is "correct". The bottleneck is I/O: fmt.Scanln, fmt.Scanf and fmt.Printf are expensive when called once per number. You could optimize by not constructing an array and reading number after number: read the whole line at once, find the index where to split the string, and print the two substrings in swapped order.
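For illustration, here is one way that advice can be put into practice in Go; this is my own untested sketch, not the answerer's code. The main win is buffered I/O: a bufio.Scanner splitting on words reads the input in large chunks instead of one fmt call per number, and a bufio.Writer batches the output. The readInt helper and the 1 MB buffer size are my own choices.

package main

import (
    "bufio"
    "os"
    "strconv"
)

func main() {
    reader := bufio.NewScanner(os.Stdin)
    reader.Buffer(make([]byte, 1024*1024), 1024*1024) // allow long input lines
    reader.Split(bufio.ScanWords)                     // token-at-a-time scanning
    writer := bufio.NewWriter(os.Stdout)
    defer writer.Flush()

    readInt := func() int {
        reader.Scan()
        n, _ := strconv.Atoi(reader.Text())
        return n
    }

    t := readInt()
    for ; t > 0; t-- {
        n, k := readInt(), readInt()
        arr := make([]int, n)
        for i := range arr {
            arr[i] = readInt()
        }
        k %= n
        // Print the rotated array without physically rotating it.
        for i := 0; i < n; i++ {
            writer.WriteString(strconv.Itoa(arr[(n-k+i)%n]))
            if i < n-1 {
                writer.WriteByte(' ')
            }
        }
        writer.WriteByte('\n')
    }
}

The slicing in the original program is fine; it is the per-number fmt.Scanf and fmt.Printf calls that dominate the runtime.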

Related

Inner workings of `rand.Intn` function - GoLang

Somehow, I happened to look at the Go source code to see how it implements the random function when passed the length of an array.
Here's the calling code
func randomFormat() string {
    formats := []string{
        "Hi, %v. Welcome!",
        "Great to see you, %v!",
        "Hail, %v! Well met!",
    }
    return formats[rand.Intn(len(formats))]
}
Go Source code: main part
func (r *Rand) Intn(n int) int {
    if n <= 0 {
        panic("invalid argument to Intn")
    }
    if n <= 1<<31-1 {
        return int(r.Int31n(int32(n)))
    }
    return int(r.Int63n(int64(n)))
}
Go source code: reference part (most devs already have this on their machines, or in the Go repo).
// Int31n returns, as an int32, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int31n(n int32) int32 {
    if n <= 0 {
        panic("invalid argument to Int31n")
    }
    if n&(n-1) == 0 { // n is power of two, can mask
        return r.Int31() & (n - 1)
    }
    max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
    v := r.Int31()
    for v > max {
        v = r.Int31()
    }
    return v % n
}

// Int63n returns, as an int64, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int63n(n int64) int64 {
    if n <= 0 {
        panic("invalid argument to Int63n")
    }
    if n&(n-1) == 0 { // n is power of two, can mask
        return r.Int63() & (n - 1)
    }
    max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
    v := r.Int63()
    for v > max {
        v = r.Int63()
    }
    return v % n
}

func (r *Rand) Int31() int32 { return int32(r.Int63() >> 32) }

func (r *Rand) Int63() int64 { return r.src.Int63() }

type Source interface {
    Int63() int64
    Seed(seed int64)
}
I want to understand how the random function works, encapsulating all the inner functions. I am overwhelmed by the code; if someone had to plan the steps out in plain English, what would those be?
For example, I don't get the logic for the minus 1 in
if n <= 1<<31-1
Then, I can't make head or tail of the Int31n function:
if n&(n-1) == 0 { // n is power of two, can mask
    return r.Int31() & (n - 1)
}
max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
v := r.Int31()
for v > max {
    v = r.Int31()
}
return v % n
This is more of a question about algorithms than it is about Go, but there are some Go parts. In any case I'll start with the algorithm issues.
Shrinking the range of a uniform random number generator
Suppose that we have a uniform-distribution random number generator that returns a number between, say, 0 and 7 inclusive. That is, it will, over time, return about the same number of 0s, 1s, 2s, ..., 7s, but with no apparent pattern between them.
Now, if we want a uniformly distributed random number between 0 and 7, this thing is perfect. That's what it returns. We just use it. But what if we want a uniformly distributed random number between 0 and 6 instead?
We could write:
func randMod7() int {
    return generate() % 7
}
so that if generate() returns 7 (which it has a 1 out of 8 chance of doing), we convert that value to zero. But then we'll get zero back 2 out of 8 times, instead of 1 out of 8 times. We'll get 1, 2, 3, 4, 5, and 6 back 1 out of 8 times, and zero 2 out of 8 times, on average: once for each actual zero, and once for each 7.
What we need to do, then, is throw away any occurrences of 7:
func randMod7() int {
    for {
        if i := generate(); i < 7 {
            return i
        }
        // oops, got 7, try again
    }
}
Now, if we had a uniform-random-number-generator named generate() that returned a value between 0 and (say) 11 (12 possible values) and we wanted a value between 0 and 3 (four possible values), we could just use generate() % 4, because the 12 possible results would fall into 3 groups of four with equal probability. If we wanted a value between 0 and 5 inclusive, we could use generate() % 6, because the 12 possible results would fall into two groups of 6 with equal probability. In fact, all we need to do is examine the prime factorization of the range of our uniform number generator to see what moduli work. The factors of 12 are 2, 2, 3; so 2, 3, 4, and 6 all work here. Any other modulus, such as generate() % 10, produces a biased result: 0 and 1 occur 2 out of 12 times, but 2 through 9 occur 1 out of 12 times. (Note: generate() % 12 also works, but is kind of pointless.)
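To see that bias concretely, here is a tiny Go check of my own (not part of the original answer) that tallies x % 10 over the 12 equally likely outcomes:

package main

import "fmt"

func main() {
    // 12 equally likely outcomes (0..11), reduced mod 10:
    // 0 and 1 each occur twice, 2..9 once, so the result is biased.
    counts := make([]int, 10)
    for x := 0; x < 12; x++ {
        counts[x%10]++
    }
    fmt.Println(counts) // [2 2 1 1 1 1 1 1 1 1]
}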
In our particular case, we have two different uniform random number generators available. One, Int31(), produces values between 0 and 0x7fffffff (2147483647 decimal, or 2^31 - 1, or 1<<31 - 1) inclusive. The other, Int63(), produces values between 0 and 0x7fffffffffffffff (9223372036854775807, or 2^63 - 1, or 1<<63 - 1). These ranges hold 2^31 and 2^63 values respectively, and hence their prime factorizations are 31 2s and 63 2s.
What this means is that we can compute Int31() mod 2^k, for any integer k in zero to 31 inclusive, without messing up our uniformity. With Int63(), we can do the same with k ranging all the way up to 63.
Introducing the computer
Now, mathematically-and-computer-ly speaking, given any nonnegative integer n in [0..0x7fffffff] or [0..0x7fffffffffffffff], and a non-negative integer k in the right range (no more than 31 or 63 respectively), computing n mod 2^k produces the same result as doing a bit-mask operation with k bits set. To get that number of set bits, we take 1<<k and subtract 1. If k is, say, 4, we get 1<<4 or 16. Subtracting 1, we get 15, or 0xf, which has four 1 bits in it.
So:
n % (1 << k)
and:
n & (1<<k - 1)
produce the same result. Concretely, when k==4, this is n%16 or n&0xf. When k==5 this is n%32 or n&0x1f. Try it for k==0 and k==63.
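Here is a quick Go verification of that equivalence (my own illustration). Note that in Go, 1<<k - 1 already parses as (1<<k) - 1, because shifts bind tighter than subtraction:

package main

import "fmt"

func main() {
    for _, k := range []uint{0, 4, 5} {
        for n := 0; n < 1000; n++ {
            if n%(1<<k) != n&(1<<k-1) {
                fmt.Println("mismatch:", k, n) // never prints
            }
        }
    }
    fmt.Println("n % (1<<k) == n & (1<<k - 1) held for every n and k tried")
}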
Introducing Go-the-language
We're now ready to consider doing all of this in Go. We note that int (plain, unadorned int) is guaranteed to be able to hold values between -2147483648 and +2147483647 (-0x80000000 through +0x7fffffff). It may extend all the way to -0x8000000000000000 through +0x7fffffffffffffff.
Meanwhile, int32 always handles the smaller range and int64 always handles the larger range. The plain int is a different type from these other two, but implements the same range as one of the two. We just don't know which one.
Our Int31 implementation returns a uniformly distributed random number in the 0..0x7fffffff range. (It does this by returning the upper 32 bits of r.Int63(), though this is an implementation detail.) Our Int63 implementation returns a uniformly distributed random number in the 0..0x7fffffffffffffff range.
The Intn function you show here:
func (r *Rand) Intn(n int) int {
    if n <= 0 {
        panic("invalid argument to Intn")
    }
    if n <= 1<<31-1 {
        return int(r.Int31n(int32(n)))
    }
    return int(r.Int63n(int64(n)))
}
just picks one of the two functions, based on the value of n: if it's less than or equal to 0x7fffffff (1<<31 - 1), the result fits in int32, so it uses int32(n) to convert n to int32, calls r.Int31n, and converts the result back to int. Otherwise, the value of n exceeds 0x7fffffff, implying that int has the larger range and we must use the larger-range generator, r.Int63n. The rest is the same except for types.
The code could just do:
return int(r.Int63n(int64(n)))
every time, but on 32-bit machines, where 64-bit arithmetic may be slow, this might be slow. (There's a lot of may and might here and if you were writing this yourself today, you should start by profiling / benchmarking the code. The Go authors did do this, though this was many years ago; at that time it was worth doing this fancy stuff.)
More bit-manipulation
The insides of both functions Int31n and Int63n are quite similar; the main difference is the types involved, and then in a few places, the maximum values. Again, the reason for this is at least partly historical: on some (mostly old now) computers, the Int63n variant is significantly slower than the Int31n variant. (In some non-Go language, we might write these as generics and then have the compiler generate a type-specific version automatically.) So let's just look at the Int63n variant:
func (r *Rand) Int63n(n int64) int64 {
    if n <= 0 {
        panic("invalid argument to Int63n")
    }
    if n&(n-1) == 0 { // n is power of two, can mask
        return r.Int63() & (n - 1)
    }
    max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
    v := r.Int63()
    for v > max {
        v = r.Int63()
    }
    return v % n
}
The argument n has type int64, so that its value will not exceed 2^63 - 1 or 0x7fffffffffffffff or 9223372036854775807. But it could be negative, and negative values won't work right, so the first thing we do is test for that and panic if so. We also panic if the input is zero (this is something of a choice, but it's useful to note it now).
Next we have the n&(n-1) == 0 test. This is a test for powers of two, with one slight flaw, and it works in many languages (those that have bit-masking):
A power of two is always represented as a single set bit in the binary representation of a number. For instance, 2 itself is 00000010₂, 4 is 00000100₂, 8 is 00001000₂, and so on, through 128 being 10000000₂. (Since I only "drew" eight bits, this series maxes out at 128.)
Subtracting 1 from that number causes a borrow: that bit goes to zero, and all the lesser bits become 1. For instance, 10000000₂ - 1 is 01111111₂.
AND-ing these two together produces zero if there was just the single bit set initially. If not (for instance, with the value 130, or 10000010₂: subtracting 1 produces 10000001₂), there's no borrow out of the top bit, so the top bit is set in both inputs and therefore is set in the AND-ed result.
The slight flaw is that if the initial value is zero, then we have 0-1, which produces all-1s; 0 & 0xffffffffffffffff is zero too, but zero is not an integer power of two. (2^0 is 1, not 0.) This minor flaw is not important for our purpose here, because we already made sure to panic for this case: it just doesn't happen.
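As a small self-contained illustration (mine, not from the Go source), here is the test and its zero flaw in isolation:

package main

import "fmt"

// isPowerOfTwo applies the n&(n-1) trick. As discussed above it wrongly
// reports true for n == 0, which is harmless in Int31n/Int63n because
// they panic on n <= 0 before reaching this test.
func isPowerOfTwo(n int64) bool {
    return n&(n-1) == 0
}

func main() {
    for _, n := range []int64{0, 1, 2, 3, 8, 130} {
        fmt.Println(n, isPowerOfTwo(n)) // 0 true, 1 true, 2 true, 3 false, 8 true, 130 false
    }
}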
Now we have the most complicated line of all:
max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
The recurring 63s here are because we have a value range going from zero to 2^63 - 1. 1<<63 - 1 is (still, again, always) 9223372036854775807 or 0x7fffffffffffffff. Meanwhile, 1<<63, without 1 subtracted from it, is 9223372036854775808 or 0x8000000000000000. This value does not fit into int64 but it does fit into uint64. So if we turn n into a uint64, we can compute uint64(9223372036854775808) % uint64(n), which is what the % expression does. By using uint64 for this calculation, we ensure that it doesn't overflow.
But: what is this calculation all about? Well, go back to our example with a generate() that produces values in [0..7]. When we wanted a number in [0..5], we had to discard both 6 and 7. That's what we're going for here: we want to find the value above which we should discard values.
If we were to take 8%6, we'd get 2. 8 is one bigger than the maximum that our 3-bit generate() would generate. 8%6 == 2 is the number of "high values" that we have to discard: 8-2 = 6 and we want to discard values that are 6 or more. Subtract 1 from this, and we get 7-2 = 5; we can accept numbers in this input range, from 0 to 5 inclusive.
So, this somewhat fancy calculation for setting max is just a way to find out what the maximum value we like is. Values that are greater than max need to be tossed out.
This particular calculation works nicely even if n is much less than our generator returns. For instance, suppose we had a four-bit generator, returning values in the [0..15] range, and we wanted a number in [0..2]. Our n is therefore 3 (to indicate that we want a number in [0..2]). We compute 16%3 to get 1. We then take 15 (one less than our maximum output value) - 1 to get 14 as our maximum acceptable value. That is, we would allow numbers in [0..14], but exclude 15.
With a 63-bit generator returning values in [0..9223372036854775807], and n==3, we would set max to 9223372036854775805. That's what we want: it throws out the two biasing values, 9223372036854775806 and 9223372036854775807.
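As a sanity check, here is a small Go sketch of my own that reproduces the max calculation for a generator of a given width:

package main

import "fmt"

// threshold mimics the max computation in Int63n: the generator produces
// values in [0 .. 1<<bits - 1], and we keep a value v only if v <= max.
func threshold(bits uint, n uint64) uint64 {
    span := uint64(1) << bits // number of distinct generator outputs
    return (span - 1) - span%n
}

func main() {
    fmt.Println(threshold(3, 6))  // 5: with a [0..7] generator, discard 6 and 7
    fmt.Println(threshold(4, 3))  // 14: with a [0..15] generator, discard only 15
    fmt.Println(threshold(63, 3)) // 9223372036854775805, as computed above
}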
The remainder of the code simply does that:
v := r.Int63()
for v > max {
    v = r.Int63()
}
return v % n
We pick one Int63-range number. If it exceeds max, we pick another one and check again, until we pick one that is in the [0..max] range, inclusive of max.
Once we get a number that is in range, we use % n to shrink the range if needed. For instance, if the range is [0..2], we use v % 3. If v is (say) 14, 14%3 is 2. Our actual max is, again, 9223372036854775805, and whatever v is, between 0 and that, v%3 is between 0 and 2 and remains uniformly distributed, with no slight bias to 0 and 1 (9223372036854775806 would give us that one extra 0, and 9223372036854775807 would give us that one extra 1).
(Now repeat the above with int32, 31, and 1<<31, for the Int31n function.)

Why is the complexity of binary search log base 2?

Considering the implementation of the iterative binary search code:
// Java implementation of iterative Binary Search
class BinarySearch {
    // Returns index of x if it is present in arr[],
    // else returns -1
    int binarySearch(int arr[], int x)
    {
        int l = 0, r = arr.length - 1;
        while (l <= r) {
            int m = l + (r - l) / 2;
            // Check if x is present at mid
            if (arr[m] == x)
                return m;
            // If x greater, ignore left half
            if (arr[m] < x)
                l = m + 1;
            // If x is smaller, ignore right half
            else
                r = m - 1;
        }
        // if we reach here, then element was not present
        return -1;
    }

    // Driver method to test the above
    public static void main(String args[])
    {
        BinarySearch ob = new BinarySearch();
        int arr[] = { 2, 3, 4, 10, 40 };
        int x = 10;
        int result = ob.binarySearch(arr, x);
        if (result == -1)
            System.out.println("Element not present");
        else
            System.out.println("Element found at index " + result);
    }
}
The GeeksforGeeks website says the following:
"For example Binary Search (iterative implementation) has O(Logn) time complexity."
My question is: what does the division by 2 have to do with the logarithm in base 2? What is the relationship between them? I will use the analogy of 1 pizza (the array) to facilitate the understanding of my question:
1 pizza - divided into 2 parts = 2 pieces of pizza
2 pieces of pizza - divide each piece in half = 4 pieces of pizza
4 pieces of pizza - divide each piece in half = 8 pieces of pizza
8 pieces of pizza - divide each piece in half = 16 pieces of pizza
Logₐb = x
b = the number we take the logarithm of
a = the base
x = the result of the logarithm
aˣ = b
The pizza-piece counts 1, 2, 4, 8 and 16 look like they relate to logarithms, but I still can't see the relationship. What is the relationship among the argument (b), the base (a) and the result (x) of the logarithm and the repeated division of an array (pizza) by 2? Would x be the final number of pieces into which I can divide my array (pizza)? Or is x the number of divisions of my array (pizza)?
Contrary to your belief, O(log(n)) is independent of any base.
If you have a pizza consisting of 16 slices of unit size, how often can you halve it (and throw away one of the halves) until you get a single slice of unit size?
Answer: log2(16) = 4 times
If you have an array of length n, how often can you halve it (and throw away one of the halves) until you get an array slice of length 1?
Answer: log2(n)
More generally, how does an n-ary search algorithm relate to logarithms?
Logₐb = x
b = the size of the array to search
a = the number of slices you get after one cut (all but one are thrown away)
x = the number of cuts you need to make until you get a slice of size 1
Let's use the same pizza analogy you have, and assume we have 1 whole pizza and we want 8 slices. Every time we cut, we divide by 2 as well.
The first cut means we will have 2 slices. The second cut gives us 4 slices. The third cut results in 8 slices. We made 3 cuts to get to 8 slices. Mathematically, it turns out that there is a relationship with the numbers 2, 3, and 8. The log function connects those numbers accordingly. When we are limited to how much we can divide, that is our base (base = 2). We have a quantity which is 8. The number of operations was lg(8) = 3 (using lg as log of base 2).
The same idea applies to binary search. We divide each section of the array we search by 2, the quantity is whatever our size of the array is, and the number of operations we perform is asymptotically lg(n).
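To make the iteration count visible, here is a small Go sketch of my own (a port of the Java code above) that counts the halving steps; for n = 2^20 the worst case is 21 comparisons, which is about log₂(n):

package main

import "fmt"

// binarySearch returns the index of x in the sorted slice arr (or -1),
// plus the number of halving steps it performed.
func binarySearch(arr []int, x int) (idx, steps int) {
    l, r := 0, len(arr)-1
    for l <= r {
        steps++
        m := l + (r-l)/2
        switch {
        case arr[m] == x:
            return m, steps
        case arr[m] < x:
            l = m + 1
        default:
            r = m - 1
        }
    }
    return -1, steps
}

func main() {
    arr := make([]int, 1<<20) // n = 1048576 = 2^20
    for i := range arr {
        arr[i] = i
    }
    _, steps := binarySearch(arr, 1<<20) // absent and larger than everything: worst case
    fmt.Println(steps)                   // 21, i.e. floor(log2(n)) + 1
}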
Considering the answers, comments and the following video:
StackOverflow response 1
StackOverflow response 2
Binary Search Video
@Mo B.'s comment:
The question is not: how many cuts are necessary to get 16 slices. But rather: how many cuts are necessary to get a slice of size 1? And that's 4. In other words, like in the algorithm, you cut in half and throw away one of the halves at each step. How often can you do that with an array of size 16?
@Yves Daoust's comment:
The logarithm of a number is roughly the number of times you can halve it until you reach 1.
My conclusions are:
The logarithm of an array of size n is approximately the number of times we can divide it in half (considering base 2) until it reaches the smallest unit of size 1.
If x = log₂(n), then 1·2ˣ = n.
So x = the number of times you can multiply 1 by 2 until you get to n.
Reversing the logic: x = the number of times you can divide n by 2 until you get to 1.
The example in the figure would be log₂10 = x, where x is not an exact result. However, if I had drawn the array with 16 positions, this would imply log₂16 = 4, where 4 is the number of levels or divisions.

How to turn integers into Fibonacci coding efficiently?

Fibonacci sequence is obtained by starting with 0 and 1 and then adding the two last numbers to get the next one.
All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition. For example: 13 can be the sum of the sets {13}, {5,8} or {2,3,8}. But, as we have seen, some numbers have more than one set whose sum is the number. If we add the constraint that the sets cannot contain two consecutive Fibonacci numbers, then we have a unique representation for each number.
We will use a binary sequence (just zeros and ones) to denote the representation. For example, 17 = 1 + 3 + 13, so 17 = 100101. See figure 2 for a detailed explanation.
I want to turn some integers into this representation, but the integers may be very big. How do I do this efficiently?
The problem itself is simple. You always pick the largest Fibonacci number not exceeding the remainder. You can ignore the constraint on consecutive numbers (if you ever needed two consecutive ones, the next number is their sum, so you should have picked that one instead of the two of them).
So the problem remains: how to quickly find the largest Fibonacci number not exceeding some number X.
There's a known trick: starting with the matrix (call it M)
1 1
1 0
you can compute Fibonacci numbers by matrix multiplication (the xth number is M^x). More details here: https://www.nayuki.io/page/fast-fibonacci-algorithms . The end result is that you can compute the number you're looking for in O(log N) matrix multiplications.
You'll need large-number arithmetic (multiplications and additions) if the values don't fit into existing types.
Also store the matrices corresponding to the powers of two you compute the first time, since you'll need them again for the results.
Overall this should be O((log N)^2 * large_number_multiplications/additions).
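For illustration, here is a minimal Go sketch of the M^x idea using exponentiation by squaring. This is my own sketch: it uses uint64, which overflows past F(93), so for the very big integers this answer is about you would swap in math/big while keeping the same structure:

package main

import "fmt"

type mat [2][2]uint64

func mul(a, b mat) mat {
    var c mat
    for i := 0; i < 2; i++ {
        for j := 0; j < 2; j++ {
            for k := 0; k < 2; k++ {
                c[i][j] += a[i][k] * b[k][j]
            }
        }
    }
    return c
}

// fib computes F(x) via M^x, where M = [[1,1],[1,0]] and
// M^x = [[F(x+1), F(x)], [F(x), F(x-1)]], in O(log x) multiplications.
func fib(x uint) uint64 {
    result := mat{{1, 0}, {0, 1}} // identity matrix
    m := mat{{1, 1}, {1, 0}}
    for ; x > 0; x >>= 1 { // exponentiation by squaring
        if x&1 == 1 {
            result = mul(result, m)
        }
        m = mul(m, m)
    }
    return result[0][1]
}

func main() {
    fmt.Println(fib(9), fib(10)) // 34 55
}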
First I want to tell you that I really liked this question. I didn't know that all positive integers can be represented as a sum of a set of Fibonacci numbers without repetition; I saw the proof by induction and it was awesome.
To answer your question, I think we have to figure out how the representation is created. The easy way to find it is, starting from the number, to repeatedly find the largest Fibonacci number below it.
For example, if we want to represent 40:
We have Fib(9) = 34 and Fib(10) = 55, so the first element in the representation is Fib(9).
Since 40 - Fib(9) = 6, and Fib(5) = 5 and Fib(6) = 8, the next element is Fib(5). So we have 40 = Fib(9) + Fib(5) + Fib(2).
Allow me to write this in C#
class Program
{
    static void Main(string[] args)
    {
        List<int> fibPresentation = new List<int>();
        int numberToPresent = Convert.ToInt32(Console.ReadLine());
        while (numberToPresent > 0)
        {
            int k = 1;
            while (CalculateFib(k) <= numberToPresent)
            {
                k++;
            }
            numberToPresent = numberToPresent - CalculateFib(k - 1);
            fibPresentation.Add(k - 1);
        }
    }

    static int CalculateFib(int n)
    {
        if (n == 1)
            return 1;
        int a = 0;
        int b = 1;
        // In N steps compute Fibonacci sequence iteratively.
        for (int i = 0; i < n; i++)
        {
            int temp = a;
            a = b;
            b = temp + b;
        }
        return a;
    }
}
Your result will be in fibPresentation
This encoding is more accurately called the "Zeckendorf representation": see https://en.wikipedia.org/wiki/Fibonacci_coding
A greedy approach works (see https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem) and here's some Python code that converts a number to this representation. It uses the first 100 Fibonacci numbers and works correctly for all inputs up to 927372692193078999175 (and incorrectly for any larger inputs).
fibs = [0, 1]
for _ in range(100):
    fibs.append(fibs[-2] + fibs[-1])

def zeck(n):
    i = len(fibs) - 1
    r = 0
    while n:
        if fibs[i] <= n:
            r |= 1 << (i - 2)
            n -= fibs[i]
        i -= 1
    return r

print(bin(zeck(17)))
The output is:
0b100101
As the greedy approach seems to work, it suffices to be able to invert the relation N = F(n).
By Binet's formula, F(n) = [φ^n/√5], where the brackets denote rounding to the nearest integer. Then, with n = floor(log_φ(√5·N)), you are very close to the solution.
17 => n = floor(7.5599...) => F7 = 13
4 => n = floor(4.5531) => F4 = 3
1 => n = floor(1.6722) => F1 = 1
(I do not exclude that some n values can be off by one.)
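A small Go sketch of that estimate (my own; as noted just above, the index can be off by one, so a real implementation should check the neighboring Fibonacci number as well):

package main

import (
    "fmt"
    "math"
)

// largestFibIndex estimates n such that F(n) is the largest Fibonacci
// number not exceeding x, using n = floor(log_phi(sqrt(5)*x)).
func largestFibIndex(x float64) int {
    phi := (1 + math.Sqrt(5)) / 2
    return int(math.Floor(math.Log(math.Sqrt(5)*x) / math.Log(phi)))
}

func main() {
    fmt.Println(largestFibIndex(17)) // 7  -> F(7) = 13
    fmt.Println(largestFibIndex(4))  // 4  -> F(4) = 3
    fmt.Println(largestFibIndex(1))  // 1  -> F(1) = 1
}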
I'm not sure if this is efficient enough for you, but you could simply use backtracking to find a (the) valid representation.
I would start the backtracking steps by taking the biggest possible Fibonacci number and only switch to smaller ones if the consecutive-numbers constraint or the only-once constraint is violated.

SPOJ "Card Trick": unable to understand how to apply binary index tree

Card Trick is a problem on Sphere online judge.
It states that
The magician shuffles a small pack of cards, holds it face down and performs the following procedure:
The top card is moved to the bottom of the pack. The new top card is dealt face up onto the table. It is the Ace of Spades.
Two cards are moved one at a time from the top to the bottom. The next card is dealt face up onto the table. It is the Two of Spades.
Three cards are moved one at a time…
This goes on until the nth and last card turns out to be the n of Spades.
This impressive trick works if the magician knows how to arrange the cards beforehand (and knows how to give a false shuffle). Your program has to determine the initial order of the cards for a given number of cards, 1 ≤ n ≤ 20000.
Input
On the first line of the input is a single positive integer, telling the number of test cases to follow. Each case consists of one line containing the integer n.
Output
For each test case, output a line with the correct permutation of the values 1 to n, space separated. The first number showing the top card of the pack, etc…
Example
Input:
2
4
5
Output:
2 1 4 3
3 1 4 5 2
Now the only solution I can think of is to use a queue and simulate the process, but that would be O(n^2). I read the comments and they suggested using a segment tree or a BIT.
I know both segment trees and BITs but am unable to understand how to apply them to this question.
Please suggest some way to do this.
I have no idea why this problem should be linked with a BIT or a segment tree; I solved it using a simple O(N^2) simulation.
First, the time limit for this problem is 11s, and N == 20000. This indicates that an O(kN) solution may pass. I believe you think this k has to be N, because the simple simulation requires that, but it can be optimized.
Let's see how we can construct the sequence when N == 5:
Round 1, count 1 space starting from first space after last position: _ 1 _ _ _
Round 2, count 2 spaces starting from first space after last position: _ 1 _ _ 2
Round 3, count 3 spaces starting from first space after last position: 3 1 _ _ 2
Round 4, count 4 spaces starting from first space after last position: 3 1 4 _ 2
Round 5, count 5 spaces starting from first space after last position: 3 1 4 5 2
We can see a nice pattern: in round i, we should count i spaces starting from the first space after the last position, and wrap around when necessary.
However, there is a crucial observation: after some rounds, the number of spaces left will be smaller than the count. In this case, we can take a mod to save time!
For example, in Round 4 of the previous example, we have only 2 spaces left but 4 spaces to count. Counting 4 steps is equivalent to counting 4 % 2 == 0 steps starting from the first space after the last position. You can verify this point yourself :)
Therefore, we can simulate this process using the code:
#include <cstdio>
#include <cstring>

// Assumed scaffolding (the original post only showed the loop): one test
// case, 1-indexed ans[], first card cur = 1, counting starts at pos = 1.
int ans[20001];

int main()
{
    int n, cur = 1, pos = 1;
    scanf("%d", &n);
    memset(ans, 255, sizeof(ans)); // -1 marks a free space
    while (cur <= n)
    {
        int i, cnt;
        int left = n - cur + 1; // count how many spaces left
        left = cur % left + 1;  // this line is critical, mod to save time!
        for (i = pos, cnt = 0; ; ++i) // simulate the process
        {
            if (i > n) i = 1;
            if (ans[i] == -1) ++cnt;
            if (cnt == left) break;
        }
        ans[i] = cur;
        pos = i;
        ++cur;
    }
    for (int i = 1; i <= n; ++i)
        printf("%d ", ans[i]);
    printf("\n");
}
If you want to use a Fenwick tree (BIT) to solve this problem, take a closer look at the solution that nevets posted, particularly this part (thanks for the drawing nevets):
Round 1, count 1 space starting from first space after last position: _ 1 _ _ _
Round 2, count 2 spaces starting from first space after last position: _ 1 _ _ 2
Round 3, count 3 spaces starting from first space after last position: 3 1 _ _ 2
Round 4, count 4 spaces starting from first space after last position: 3 1 4 _ 2
Round 5, count 5 spaces starting from first space after last position: 3 1 4 5 2
Finding the correct free space using the above approach has a time complexity of O(N), because we may have to go through all the spaces (total complexity O(N^2)). Notice that we can characterize the next position using:
free(next_pos) = (free(current_pos) + next_number) mod free(total) + 1
where free(x) tells us how many free spaces there are up to (and including) position x. This is not a direct formula for next_pos, but it tells us what next_pos must satisfy, so we can binary search for it.
The only thing left is the free-space bookkeeping, and this is where the BIT comes into play, as it gives us a time complexity of O(log N) for both query and update. The time complexity of finding a free space is then O(log^2 N), and the total time complexity is O(N log^2 N).
As for the running speed:
3.16s for the approach nevets suggested
1.18s using a queue to rotate the elements
0.60s using a linked list to rotate
0.02s using a BIT.
I must say I was quite surprised by the speed gain :-)
P.S. If you're not sure how to use the BIT, initialise by updating all values by +1. When marking a slot as taken, just update it by -1, that's it.
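To make that concrete, here is a hedged Go sketch of my own (not the answerer's actual code). Every position starts as +1 (free); kthFree binary-searches the prefix sums for the k-th free slot in O(log^2 N); dealing a card updates that slot by -1:

package main

import "fmt"

// bit is a Fenwick tree over positions 1..n, storing 1 for each free slot.
type bit struct {
    n    int
    tree []int
}

func newBIT(n int) *bit {
    b := &bit{n: n, tree: make([]int, n+1)}
    for i := 1; i <= n; i++ {
        b.add(i, 1) // every slot starts free
    }
    return b
}

func (b *bit) add(i, delta int) {
    for ; i <= b.n; i += i & -i {
        b.tree[i] += delta
    }
}

// freeUpTo returns the number of free slots in positions 1..i.
func (b *bit) freeUpTo(i int) int {
    s := 0
    for ; i > 0; i -= i & -i {
        s += b.tree[i]
    }
    return s
}

// kthFree binary-searches the position of the k-th free slot.
func (b *bit) kthFree(k int) int {
    lo, hi := 1, b.n
    for lo < hi {
        mid := (lo + hi) / 2
        if b.freeUpTo(mid) >= k {
            hi = mid
        } else {
            lo = mid + 1
        }
    }
    return lo
}

func cardTrick(n int) []int {
    ans := make([]int, n+1)
    b := newBIT(n)
    rank := 0 // free slots at or before the last dealt position
    for cur := 1; cur <= n; cur++ {
        left := n - cur + 1        // free slots remaining
        rank = (rank+cur)%left + 1 // skip cur free slots, wrapping around
        p := b.kthFree(rank)
        ans[p] = cur
        b.add(p, -1) // slot is now taken
        rank--       // the slot we just used no longer counts as free
    }
    return ans[1:]
}

func main() {
    fmt.Println(cardTrick(4)) // [2 1 4 3]
    fmt.Println(cardTrick(5)) // [3 1 4 5 2]
}

For n = 4 and n = 5 this reproduces the sample outputs above.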
We can also solve this using an indexed set (a normal set, but with the ability to access its elements by index, like arrays and vectors).
We can consider this a faster implementation of nevets' approach, with time complexity O(N log N).
Instead of looping through all elements to find the correct free space, we store all the free spaces in the indexed set, erase each free space from the set once we take it, and find each new correct free space in O(log N).
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace std;
using namespace __gnu_pbds;

typedef tree<int, null_type, less<int>, rb_tree_tag, tree_order_statistics_node_update> indexed_set_int;

int main() {
    int n;
    cin >> n;
    vector<int> a(n);
    indexed_set_int st;
    for (int i = 0; i < n; i++) {
        st.insert(i);
    }
    int ind = 0;
    for (int i = 0; i < n; i++) {
        ind += i + 1;
        ind %= st.size(); // mod by the set size to avoid going outside its boundary
        auto it = st.find_by_order(ind); // the correct position among the free positions, found in O(log N)
        a[*it] = i + 1;
        st.erase(it); // remove the free space from the set because we already used it
    }
    for (int i = 0; i < n; i++) {
        cout << a[i] << " ";
    }
}
This solution should be at least as fast as the BIT one. I didn't compare the actual speed of the two approaches, but from the time complexity, both are fast.

Removal of billboards from given ones

I came across this question
ADZEN is a very popular advertising firm in your city. On every road you can see their advertising billboards. Recently they have been facing a serious challenge: MG Road, the most used and beautiful road in your city, has been almost filled by the billboards, and this is having a negative effect on the natural view.
On people's demand, ADZEN has decided to remove some of the billboards in such a way that there are no more than K billboards standing together in any part of the road.
You may assume MG Road to be a straight line with N billboards. Initially there is no gap between any two adjacent billboards.
ADZEN's primary income comes from these billboards, so the billboard removing process has to be done in such a way that the billboards remaining at the end give the maximum possible profit among all possible final configurations. The total profit of a configuration is the sum of the profit values of all billboards present in that configuration.
Given N, K and the profit value of each of the N billboards, output the maximum profit that can be obtained from the remaining billboards under the conditions given.
Input description
The 1st line contains two space-separated integers N and K. Then follow N lines describing the profit value of each billboard, i.e. the ith line contains the profit value of the ith billboard.
Sample Input
6 2
1
2
3
1
6
10
Sample Output
21
Explanation
In the given input there are 6 billboards, and after the process no more than 2 should stand together. So remove the 1st and 4th billboards, giving the configuration _ 2 3 _ 6 10 with a profit of 21.
No other configuration has a profit of more than 21, so the answer is 21.
Constraints
1 <= N <= 100,000 (10^5)
1 <= K <= N
0 <= profit value of any billboard <= 2,000,000,000(2*10^9)
I think we have to select the minimum-profit billboard among the first K+1 billboards and then repeat the same until the end, but this greedy approach was not giving the correct answer for all cases.
I tried as far as my knowledge goes, but was unable to find a solution.
If anyone has an idea, please kindly share your thoughts.
It's a typical DP problem. Let's say that P(n,k) is the maximum profit of having k billboards up to position n on the road. Then you have the following formula:
P(n,k) = max(P(n-1,k), P(n-1,k-1) + C(n))
P(i,0) = 0 for i = 0..n
where C(n) is the profit from putting the nth billboard on the road. Using that formula to calculate P(n,k) bottom-up, you'll get the solution in O(nk) time.
I'll leave it up to you to figure out why that formula holds.
edit
Dang, I misread the question.
It still is a DP problem, just the formula is different. Let's say that P(v,i) means the maximum profit at point v where last cluster of billboards has size i.
Then P(v,i) can be described using the following formulas:
P(v,i) = P(v-1,i-1) + C(v) if i > 0
P(v,0) = max(P(v-1,i) for i = 0..min(k, v))
P(0,0) = 0
You need to find max(P(n,i) for i = 0..k).
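For illustration, here is a direct Go translation of that recurrence (my own sketch). It keeps only two DP rows at a time, an optimization also noted in a later answer, so it runs in O(N*K) time and O(K) space; on the sample input it returns 21:

package main

import "fmt"

// maxProfit implements P(v,i): the best profit for the first v billboards
// when the run of kept billboards ending at v has length exactly i
// (i = 0 means billboard v is removed).
func maxProfit(c []int64, k int) int64 {
    prev := make([]int64, k+1)
    cur := make([]int64, k+1)
    for v := 1; v <= len(c); v++ {
        // P(v,0) = max over i of P(v-1,i): remove billboard v.
        best := prev[0]
        for i := 1; i <= k && i < v; i++ {
            if prev[i] > best {
                best = prev[i]
            }
        }
        cur[0] = best
        // P(v,i) = P(v-1,i-1) + C(v): keep billboard v, the run grows to i.
        for i := 1; i <= k && i <= v; i++ {
            cur[i] = prev[i-1] + c[v-1]
        }
        prev, cur = cur, prev
    }
    // Answer: max(P(n,i) for i = 0..k).
    ans := prev[0]
    for i := 1; i <= k; i++ {
        if prev[i] > ans {
            ans = prev[i]
        }
    }
    return ans
}

func main() {
    fmt.Println(maxProfit([]int64{1, 2, 3, 1, 6, 10}, 2)) // 21
}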
This problem is one of the challenges posted on www.interviewstreet.com ...
I'm happy to say I got this down recently, but I'm not quite satisfied and wanted to see if there's a better method out there.
soulcheck's DP solution above is straightforward, but won't solve this completely, due to the fact that K can be as big as N, meaning the DP complexity will be O(NK) for both runtime and space.
Another solution is to do branch-and-bound, keeping track of the best sum so far, and pruning the recursion when, at some level, currSumSoFar + SUM(a[currIndex..n)) <= bestSumSoFar; then exit the function immediately, since there is no point in processing further when the upper bound can't beat the best sum so far.
The branch-and-bound above got accepted by the tester for all but 2 test cases.
Fortunately, I noticed that those 2 test cases use a small K (in my case, K < 300), so the O(NK) DP technique suffices there.
soulcheck's (second) DP solution is correct in principle. There are two improvements you can make, based on these observations:
1) It is unnecessary to allocate the entire DP table: you only ever look at two rows at a time.
2) For each row (the v in P(v,i)), you are only interested in the i's which most increase the max value, which is one more than each i that held the max value in the previous row, plus i = 1; otherwise you never consider blanks.
I coded it in C++ using DP in O(n log k).
The idea is to maintain a multiset with the next k values for a given position. This multiset will typically hold k values mid-processing; each time you move, you remove one element and push a new one. The art is how to maintain this list so that it holds profit[i] + answer[i+2]. More details in the comment:
/*
 * Observation 1: the ith state depends on the next k states i+2...i+2+k.
 * We maximize across these states with their "accumulative" sum added on.
 *
 * Say we have the list of numbers for state i+1, that is, the list of
 * {profit + state solution}. How do we get the states of the ith solution?
 *
 * Say we have the following data, k = 3:
 *
 * Indices:  0 1 2 3 4
 * Profits:  1 3 2 4 2
 * Solution: ? ? 5 3 1
 *
 * Answer for [1] = max(3+3, 5+1, 9+0) = 9
 *
 * Indices:  0 1 2 3 4
 * Profits:  1 3 2 4 2
 * Solution: ? 9 5 3 1
 *
 * Let's find the answer for [0], using the set of [1].
 *
 * First, the last entry should be removed. Then we have (3+3, 5+1).
 *
 * Now we should add 1+5, but the existing entries should be incremented by 1:
 * (1+5, 4+3, 6+1) -> then find the max.
 *
 * Could we do it another way instead of reprocessing the list? Yes, we can
 * simply add 1 to all elements:
 *
 * the answer is the same as: 1 + max(1-1+5, 3+3, 5+1)
 */
// Assumed scaffolding, not shown in the original post:
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
#define sz(x) ((int)(x).size())
#define lpd(i, a, b) for (int i = (a); i >= (b); --i)
ll profit[100005], mem[100005], added[100005];
int n, k;

ll dp()
{
    multiset<ll, greater<ll> > set;
    mem[n-1] = profit[n-1];
    ll sumSoFar = 0;
    lpd(i, n-2, 0)
    {
        if (sz(set) == k)
            set.erase(set.find(added[i+k]));
        if (i+2 < n)
        {
            added[i] = mem[i+2] - sumSoFar;
            set.insert(added[i]);
            sumSoFar += profit[i];
        }
        if (n-i <= k)
            mem[i] = profit[i] + mem[i+1];
        else
            mem[i] = max(mem[i+1], *set.begin() + sumSoFar);
    }
    return mem[0];
}
This looks like a linear programming problem. This problem would be linear, but for the requirement that no more than K adjacent billboards may remain.
See wikipedia for a general treatment: http://en.wikipedia.org/wiki/Linear_programming
Visit your university library to find a good textbook on the subject.
There are many, many libraries to assist with linear programming, so I suggest you do not attempt to code an algorithm from scratch. Here is a list relevant to Python: http://wiki.python.org/moin/NumericAndScientific/Libraries
Let P[i] (where i=1..n) be the maximum profit for billboards 1..i IF WE REMOVE billboard i. It is trivial to calculate the answer knowing all P[i]. The baseline algorithm for calculating P[i] is as follows:
for i = 1..N
{
    P[i] = -infinity;
    for j = max(1, i-k-1)..i-1
    {
        P[i] = max(P[i], P[j] + C[j+1] + .. + C[i-1]);
    }
}
Now the idea that allows us to speed things up. Let's say we have two different valid configurations of billboards 1 through i only, let's call these configurations X1 and X2. If billboard i is removed in configuration X1 and profit(X1) >= profit(X2) then we should always prefer configuration X1 for billboards 1..i (by profit() I meant the profit from billboards 1..i only, regardless of configuration for i+1..n). This is as important as it is obvious.
We introduce a doubly-linked list of tuples {idx,d}: {{idx1,d1}, {idx2,d2}, ..., {idxN,dN}}.
p->idx is the index of the last billboard removed. The idx values increase as we go through the list: p->idx < p->next->idx.
p->d is the sum of elements (C[p->idx] + C[p->idx+1] + .. + C[p->next->idx - 1]) if p is not the last element in the list. Otherwise it is the sum of elements up to the current position minus one: (C[p->idx] + C[p->idx+1] + .. + C[i-1]).
Here is the algorithm:
P[1] = 0;
list.AddToEnd( {idx=0, d=C[0]} );
// sum of elements starting from the index at top of the list
sum = C[0]; // C[list->begin()->idx]+C[list->begin()->idx+1]+...+C[i-1]
for i=2..N
{
if( i - list->begin()->idx > k + 1 ) // the head of the list is "too far"
{
sum = sum - list->begin()->d
list.RemoveNodeFromBeginning()
}
// At this point the list should containt at least the element
// added on the previous iteration. Calculating P[i].
P[i] = P[list.begin()->idx] + sum
// Updating list.end()->d and removing "unnecessary nodes"
// based on the criterion described above
list.end()->d = list.end()->d + C[i]
while(
(list is not empty) AND
(P[i] >= P[list.end()->idx] + list.end()->d - C[list.end()->idx]) )
{
if( list.size() > 1 )
{
list.end()->prev->d += list.end()->d
}
list.RemoveNodeFromEnd();
}
list.AddToEnd( {idx=i, d=C[i]} );
sum = sum + C[i]
}
//shivi..coding is adictive!!
#include <stdio.h>

long long int arr[100001];
long long int sum[100001];
long long int including[100001], excluding[100001];

long long int maxim(long long int a, long long int b)
{
    if (a > b) return a;
    return b;
}

int main()
{
    int N, K;
    scanf("%d%d", &N, &K);
    for (int i = 0; i < N; ++i) scanf("%lld", &arr[i]);
    sum[0] = arr[0];
    including[0] = sum[0];
    excluding[0] = sum[0];
    for (int i = 1; i < K; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        including[i] = sum[i];
        excluding[i] = sum[i];
    }
    long long int maxi = 0, temp = 0;
    for (int i = K; i < N; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        for (int j = 1; j <= K; ++j)
        {
            temp = sum[i] - sum[i-j];
            if (i-j-1 >= 0)
                temp += including[i-j-1];
            if (temp > maxi) maxi = temp;
        }
        including[i] = maxi;
        excluding[i] = including[i-1];
    }
    printf("%lld", maxim(including[N-1], excluding[N-1]));
}
//here is the code...passing all but 1 test case :) comment improvements...simple DP
