Debugging hackerrank week of code Lazy Sorting - algorithm

I am working on a HackerRank problem (https://www.hackerrank.com/contests/w21/challenges/lazy-sorting), and I am confused as to why my code doesn't fulfill the requirements. The question asks:
Logan is cleaning his apartment. In particular, he must sort his old favorite sequence, P, of N positive integers in nondecreasing order. He's tired from a long day, so he invented an easy way (in his opinion) to do this job. His algorithm can be described by the following pseudocode:
while isNotSorted(P) do {
WaitOneMinute();
RandomShuffle(P)
}
Can you determine the expected number of minutes that Logan will spend waiting for P to be sorted?
Input format:
The first line contains a single integer, N, denoting the size of the permutation. The second line contains N space-separated integers describing the respective elements in the sequence's current order, P_0, P_1 ... P_N-1.
Constraints:
2 <= N <= 18
1 <= P_i <= 100
Output format:
Print the expected number of minutes Logan must wait for P to be sorted, rounded to a scale of exactly 6 decimal places (i.e.,1.234567 format).
Sample input:
2
5 2
Sample output:
2.000000
Explanation
There are two permutations possible after a random shuffle, and each of them has probability 0.5. The probability that P is sorted after the first minute is 0.5. The probability that P is first sorted after the second minute is 0.25, after the third minute 0.125, and so on. The expected number of minutes hence equals:
summation of i*2^-i where i goes from 1 to infinity = 2
I wrote my code in C++ as follows:
#include <cmath>
#include <cstdio>
#include <vector>
#include <iostream>
#include <algorithm>
#include <map>
using namespace std;
int main() {
/* Enter your code here. Read input from STDIN. Print output to STDOUT */
map <int, int> m; //create a map to store the number of repetitions of each number
int N; //number of elements in list
//calculate the number of permutations
cin >> N;
int j;
int total_perm = 1;
int temp;
for (int i = 0; i < N; i++){
cin >> temp;
//if temp exists, add one to the value of m[temp], else initialize a new key value pair
if (m.find(temp) == m.end()){
m[temp] = 1;
}else{
m[temp] += 1;
}
total_perm *= i+1;
}
//calculate permutations taking into account of repetitions
for (map<int,int>::iterator iter = m.begin(); iter != m.end(); ++iter)
{
if (iter -> second > 1){
temp = iter -> second;
while (temp > 1){
total_perm = total_perm / temp;
temp -= 1;
}
}
}
float recur = 1 / float(total_perm);
float prev;
float current = recur;
float error = 1;
int count = 1;
//print expected number of minutes up to 6 sig fig
if (total_perm == 1){
printf("%6f", recur);
}else{
while (error > 0.0000001){
count += 1;
prev = current;
current = prev + float(count)*float(1-recur)*pow(recur,count-1);
error = abs(current - prev);
}
printf("%6f", prev);
}
return 0;
}
I don't really care about the competition, it's more about learning for me, so I would really appreciate it if someone can point out where I was wrong.

Unfortunately I am not familiar with C++, so I don't know exactly what your code is doing. I did, however, solve this problem. It's pretty cheeky, and I think they posed the problem the way they did just to be confusing. The important piece of knowledge here is that for an event with probability p, the expected number of trials until a success is 1/p. Since each trial here costs us a minute, we can find the expected number of trials and append ".000000" to the end.
So how do you do that? Each distinct permutation of the numbers is equally likely to occur, so if we can find how many distinct permutations there are, we can find p. Call that count M: the sorted order is exactly one of the M permutations, so p = 1/M and E[time] = 1/p = M. So really E[time] = number of distinct permutations. I leave the rest to you.

This is a simple problem.
It looks like bogosort.
How many unique permutations of the given array are possible? In the sample case there are two permutations possible, so the expected time for any one permutation to occur is 2.000000. Extend this approach to the general case, taking into account any repeated numbers.
In the question the numbers can be repeated. This reduces the number of unique permutations, and thus the answer.
Just find the number of unique permutations of the array and print it with 6 decimal places. That is your answer.
Also think about what happens if the array is already sorted.
E.g.
if the test case is
5 5
5 4 3 2 1
then the answer would be 120.000000 (5!/1!)
5 5
1 2 3 4 5
then the answer would be 0.000000, because the array is already sorted.
5 5
2 2 2 2 2
then the answer would also be 0.000000
5 5
5 1 2 2 3
then the answer is 60.000000 (5!/2!)
In general, if the array is not sorted, the answer is N!/(P! * Q! * ...), where P, Q, ... are the numbers of times each repeated value occurs.
Here is another useful link:
https://math.stackexchange.com/questions/1844133/expectation-over-sequencial-random-shuffles
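The counting argument from the answers above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the answers (0 minutes when the input is already sorted, otherwise the number of distinct arrangements); the function name `expected_minutes` is made up for this example.

```python
from collections import Counter
from math import factorial


def expected_minutes(p):
    """Expected waiting time for the shuffle-until-sorted loop."""
    # Already sorted: the while loop never runs, so no minutes pass.
    if all(a <= b for a, b in zip(p, p[1:])):
        return 0.0
    # Number of distinct arrangements: N! / (c1! * c2! * ...),
    # where c1, c2, ... are the multiplicities of each value.
    arrangements = factorial(len(p))
    for count in Counter(p).values():
        arrangements //= factorial(count)
    return float(arrangements)


print(f"{expected_minutes([5, 2]):.6f}")  # sample input from the question
```

Integer arithmetic is used until the very end, so N = 18 (18! is about 6.4 * 10^15) does not overflow the way a 32-bit `int` would.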

Related

Strange Bank(Atcoder Beginner contest 099)

To make it difficult to withdraw money, a certain bank allows its customers to withdraw only one of the following amounts in one operation:
1 yen (the currency of Japan)
6 yen, 6^2(=36) yen, 6^3(=216) yen, ...
9 yen, 9^2(=81) yen, 9^3(=729) yen, ...
At least how many operations are required to withdraw exactly N yen in total?
It is not allowed to re-deposit the money you withdrew.
Constraints
1≤N≤100000
N is an integer.
Input is given from Standard Input in the following format:
N
Output
If at least x operations are required to withdraw exactly N yen in total, print x.
Sample Input 1
127
Sample Output 1
4
By withdrawing 1 yen, 9 yen, 36(=6^2) yen and 81(=9^2) yen, we can withdraw 127 yen in four operations.
It seemed like a simple greedy problem to me, so that was the approach I used. But I got a different result for one of the samples and figured out that the greedy choice is not always optimal.
#include <iostream>
#include <queue>
#include <stack>
#include <algorithm>
#include <functional>
#include <cmath>
using namespace std;
int intlog(int base, long int x) {
return (int)(log(x) / log(base));
}
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(NULL);
long int n;cin>>n;
int result=0;
while(n>0)
{
int base_9=intlog(9,n);int base_6=intlog(6,n);
int val;
val=max(pow(9,base_9),pow(6,base_6));
//cout<<pow(9,base_9)<<" "<<pow(6,base_6)<<"\n";
val=max(val,1);
if(n<=14 && n>=12)
val=6;
n-=val;
//cout<<n<<"\n";
result++;
}
cout<<result;
return 0;
}
When n is between 12 and 14, we have to pick 6 rather than 9, because it takes fewer steps to reach zero.
It got AC on only 18 of 22 test cases. Please help me understand my mistake.
Greedy will not work here, because choosing the locally optimal amount at every step does not guarantee the best end result (you can see that in your example). Instead, you should try every possible choice at each step to find the overall optimum.
Let's see how we can do that. The maximum input is 10^5, and in one operation we can withdraw exactly one of the following 12 values:
[1, 6, 9, 36(=6^2), 81(=9^2), 216(=6^3), 729(=9^3), 1296(=6^4), 6561(=9^4), 7776(=6^5), 46656(=6^6), 59049(=9^5)]
because 6^7 and 9^6 are both more than 100000.
So at each step with value n we try each usable element arr[i] (i.e. less than or equal to n) from the above array and then recursively solve the subproblem for n-arr[i] until we reach zero.
solve(n)
    if n == 0
        return 0;
    ans = n;
    for (int i = 0; i < arr.length; i++)
        if (n >= arr[i])
            ans = min(ans, 1 + solve(n - arr[i]));
    return ans;
This recursive solution is very slow: every call branches up to 12 ways and the same subproblems get solved over and over, so the running time grows exponentially with n. We will try to optimize it. If you try some sample cases you will see that the subproblems overlap, i.e. the same subproblem shows up many times. This is where dynamic programming comes into the picture. You can store every subproblem's solution so that it can be reused later. So we can modify our solution as follows:
solve(n)
    if n == 0
        return 0;
    if (dp[n] is seen)
        return dp[n];
    ans = n;
    for (int i = 0; i < arr.length; i++)
        if (n >= arr[i])
            ans = min(ans, 1 + solve(n - arr[i]));
    return dp[n] = ans;
The time complexity of the DP solution is O(n*12).
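As a runnable counterpart to the pseudocode, here is a minimal bottom-up sketch in Python. It fills the table iteratively instead of recursing, which is a substitution on my part to avoid deep recursion for n close to 100000; the name `min_withdrawals` is made up for this example.

```python
def min_withdrawals(n):
    """Fewest operations needed to withdraw exactly n yen."""
    # Allowed amounts: 1 plus every power of 6 and 9 that does not exceed n.
    amounts = [1]
    for base in (6, 9):
        power = base
        while power <= n:
            amounts.append(power)
            power *= base

    # dp[t] = fewest operations to withdraw exactly t yen; dp[0] = 0.
    dp = [0] * (n + 1)
    for total in range(1, n + 1):
        dp[total] = 1 + min(dp[total - a] for a in amounts if a <= total)
    return dp[n]


print(min_withdrawals(127))  # sample input from the question
```

For n = 14 the table gives 4 (6 + 6 + 1 + 1), whereas always taking the largest amount first would need 6 operations (9 + 1 + 1 + 1 + 1 + 1).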

How to turn integers into Fibonacci coding efficiently?

Fibonacci sequence is obtained by starting with 0 and 1 and then adding the two last numbers to get the next one.
All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition. For example: 13 can be the sum of the sets {13}, {5,8} or {2,3,8}. But, as we have seen, some numbers have more than one set whose sum is the number. If we add the constraint that the sets cannot have two consecutive Fibonacci numbers, then we have a unique representation for each number.
We will use a binary sequence (just zeros and ones) to do that. For example, 17 = 1 + 3 + 13. Then, 17 = 100101. See figure 2 for a detailed explanation.
I want to turn some integers into this representation, but the integers may be very big. How do I do this efficiently?
The problem itself is simple. You always pick the largest Fibonacci number not exceeding the remainder. You can ignore the constraint about consecutive numbers (if you needed both, the next one is the sum of both, so you would have picked that one instead of the initial two).
So the problem remains how to quickly find the largest Fibonacci number not exceeding some number X.
There's a known trick that starts with the matrix (call it M)
1 1
1 0
You can compute Fibonacci numbers by matrix multiplication (the xth number is the off-diagonal entry of M^x). More details here: https://www.nayuki.io/page/fast-fibonacci-algorithms . The end result is that you can compute the number you're looking for in O(logN) matrix multiplications.
You'll need big-number arithmetic (multiplications and additions) if the values don't fit into existing types.
Also store the matrices corresponding to powers of two the first time you compute them, since you'll need them again for later results.
Overall this should be O((logN)^2) large-number multiplications/additions.
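A minimal Python sketch of this idea follows. Instead of multiplying 2x2 matrices explicitly it uses the fast-doubling identities, which follow from squaring the matrix M; that substitution, the doubling-then-bisecting search, and the names `fib_pair` and `largest_fib_at_most` are my own choices for illustration. Python integers are arbitrary precision, so no separate big-number library is needed here.

```python
def fib_pair(n):
    """Return (F(n), F(n+1)) using fast doubling."""
    if n == 0:
        return 0, 1
    a, b = fib_pair(n // 2)   # a = F(k), b = F(k+1) with k = n // 2
    c = a * (2 * b - a)       # F(2k)
    d = a * a + b * b         # F(2k+1)
    return (d, c + d) if n % 2 else (c, d)


def largest_fib_at_most(x):
    """Return (k, F(k)) for the largest Fibonacci number F(k) <= x, x >= 1."""
    # Double the index until F(hi) exceeds x.
    hi = 2
    while fib_pair(hi)[0] <= x:
        hi *= 2
    lo = hi // 2              # invariant: F(lo) <= x < F(hi)
    # Bisect on the index.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fib_pair(mid)[0] <= x:
            lo = mid
        else:
            hi = mid
    return lo, fib_pair(lo)[0]


print(largest_fib_at_most(17))
```

Indices start at 2 so that the value 1 is always reported as F(2), which matches the usual convention for the Zeckendorf representation.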
First I want to tell you that I really liked this question. I didn't know that all positive integers can be represented as a sum of a set of Fibonacci numbers without repetition; I saw the proof by induction and it was awesome.
To answer your question, I think we have to figure out how the representation is built. The easy way is to find, for the current number, the closest Fibonacci number that does not exceed it.
For example, if we want to represent 40:
We have Fib(9)=34 and Fib(10)=55, so the first element in the representation is Fib(9).
Since 40 - Fib(9) = 6 and (Fib(5)=5 and Fib(6)=8), the next element is Fib(5). The remainder is 1 = Fib(2), so we have 40 = Fib(9) + Fib(5) + Fib(2).
Allow me to write this in C#:
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        // Indices k of the Fibonacci numbers Fib(k) that sum to the input.
        List<int> fibPresentation = new List<int>();
        int numberToPresent = Convert.ToInt32(Console.ReadLine());
        while (numberToPresent > 0)
        {
            // Find the first k with Fib(k) > numberToPresent;
            // Fib(k-1) is then the largest Fibonacci number that still fits.
            int k = 1;
            while (CalculateFib(k) <= numberToPresent)
            {
                k++;
            }
            numberToPresent = numberToPresent - CalculateFib(k - 1);
            fibPresentation.Add(k - 1);
        }
    }

    static int CalculateFib(int n)
    {
        if (n == 1)
            return 1;
        int a = 0;
        int b = 1;
        // In n steps compute the Fibonacci sequence iteratively.
        for (int i = 0; i < n; i++)
        {
            int temp = a;
            a = b;
            b = temp + b;
        }
        return a;
    }
}
The result (the indices of the chosen Fibonacci numbers) will be in fibPresentation.
This encoding is more accurately called the "Zeckendorf representation": see https://en.wikipedia.org/wiki/Fibonacci_coding
A greedy approach works (see https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem) and here's some Python code that converts a number to this representation. It uses the first 100 Fibonacci numbers and works correctly for all inputs up to 927372692193078999175 (and incorrectly for any larger inputs).
fibs = [0, 1]
for _ in range(100):
    fibs.append(fibs[-2] + fibs[-1])

def zeck(n):
    i = len(fibs) - 1
    r = 0
    while n:
        if fibs[i] <= n:
            r |= 1 << (i - 2)
            n -= fibs[i]
        i -= 1
    return r

print(bin(zeck(17)))
The output is:
0b100101
As the greedy approach seems to work, it suffices to be able to invert the relation N = F_n.
By the Binet formula, F_n = [φ^n/√5], where the brackets denote the nearest integer. Then with n = floor(log_φ(√5·N)) you are very close to the solution.
17 => n = floor(7.5599...) => F7 = 13
4 => n = floor(4.5531) => F4 = 3
1 => n = floor(1.6722) => F1 = 1
(I do not exclude that some n values can be off by one.)
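To guard against the possible off-by-one, the logarithm can be used only as a starting guess and then corrected with exact integer arithmetic. The sketch below is one way to do that; the function name `largest_fib_index` is made up, indices below 2 are clamped to 2 (so 1 is reported as F2 rather than F1), and the floating-point logarithm limits it to inputs that fit in a double.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def largest_fib_index(n):
    """Return (k, F_k) with F_k the largest Fibonacci number <= n, for n >= 1."""
    # Starting guess from the inverted Binet formula.
    k = max(2, int(math.log(n * math.sqrt(5), PHI)))
    # Compute F_k and F_(k+1) exactly with integers.
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    # Correct the guess in either direction if rounding pushed it off.
    while a > n:
        a, b = b - a, a
        k -= 1
    while b <= n:
        a, b = b, a + b
        k += 1
    return k, a


print(largest_fib_index(17))
```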
I'm not sure if this is efficient enough for you, but you could simply use backtracking to find a valid representation.
I would try to start the backtracking steps by taking the biggest possible fib number and only switch to smaller ones if the consecutive or the only once constraint is violated.

2048 game: how many moves did I do?

2048 was quite popular a little while ago. Everybody played it and a lot of people posted nice screenshots of their accomplishments (myself among them). Then at some point I began to wonder whether it is possible to tell how long someone played to get to that score. I benchmarked, and it turns out that (at least on the Android application I have) no more than one move can be made per second. Thus, if you play long enough (and fast enough), the number of moves you've made is quite a good approximation of the number of seconds you've played. Now the question is: given a screenshot of a 2048 game, is it possible to compute how many moves were made?
Here is an example screenshot(actually my best effort on the game so far):
From the screenshot you can see the field layout at the current moment and the number of points that the player has earned. So: is this information enough to compute how many moves I've made and if so, what is the algorithm to do that?
NOTE: I would like to remind you that points are only scored when two tiles "combine" and the number of points scored is the value of the new tile(i.e. the sum of the values of the tiles being combined).
The short answer is it is possible to compute the number of moves using only this information. I will explain the algorithm to do that and I will try to post my answer in steps. Each step will be an observation targeted at helping you solve the problem. I encourage the reader to try and solve the problem alone after each tip.
Observation number one: after each move exactly one tile appears. This tile is either 4 or 2. Thus what we need to do is to count the number of tiles that appeared. At least on the version of the game I played the game always started with 2 tiles with 2 on them placed at random.
We don't care about the actual layout of the field. We only care about the numbers that are present on it. This will become more obvious when I explain my algorithm.
Seeing the values in the cells on the field we can compute what the score would be if 2 had appeared after each move. Call that value twos_score.
The number of fours that have appeared is equal to the difference of twos_score and actual_score divided by 4. This is true because for forming a 4 from two 2-s we would have scored 4 points, while if the 4 appears straight away we score 0. Call the number of fours fours.
We can compute the number of twos we needed to form all the numbers on the field. After that we need to subtract 2 * fours from this value as a single 4 replaces the need of two 2s. Call this twos.
Using these observations we are able to solve the problem. Now I will explain in more detail how to perform the separate steps.
How to compute the score if only two appeared?
I will prove that to form the number 2^n, the player would score 2^n*(n - 1) points (using induction).
The statement is obvious for 2, as it appears directly and therefore no points are scored for it.
Let's assume that for a fixed k, for the number 2^k the user will score 2^k*(k - 1).
For k + 1: 2^(k+1) can only be formed by combining two numbers of value 2^k. Thus the user will score 2^k*(k - 1) + 2^k*(k - 1) + 2^(k+1) (the score for the two numbers being combined plus the score for the new number).
This equals: 2^(k+1)*(k - 1) + 2^(k+1) = 2^(k+1)*(k - 1 + 1) = 2^(k+1)*k. This completes the induction.
Therefore, to compute the score if only twos appeared, we need to iterate over all numbers on the board and accumulate the score we get for them using the formula above.
How to compute the number of twos needed to form the numbers on the field?
It is much easier to notice that the number of twos needed to form 2^n is 2^(n-1). A strict proof can again be done using induction, but I will leave this to the reader.
The code
I will provide code for solving the problem in C++. However, I do not use anything too language-specific (apart from vector, which is simply a dynamically expanding array), so it should be very easy to port to many other languages.
/**
 * @param a - a vector containing the values currently in the field.
 *            A value of zero means "empty cell".
 * @param score - the score the player currently has.
 * @return a pair where the first number is the number of twos that appeared
 *         and the second number is the number of fours that appeared.
 */
pair<int,int> solve(const vector<vector<int> >& a, int score) {
vector<int> counts(20, 0);
for (int i = 0; i < (int)a.size(); ++i) {
for (int j = 0; j < (int)a[0].size(); ++j) {
if (a[i][j] == 0) {
continue;
}
int num;
for (int l = 1; l < 20; ++l) {
if (a[i][j] == 1 << l) {
num = l;
break;
}
}
counts[num]++;
}
}
// What the score would be if only twos appeared every time
int twos_score = 0;
for (int i = 1; i < 20; ++i) {
twos_score += counts[i] * (1 << i) * (i - 1);
}
// For each 4 that appears instead of a two the overall score decreases by 4
int fours = (twos_score - score) / 4;
// How many twos are needed for all the numbers on the field(ignoring score)
int twos = 0;
for (int i = 1; i < 20; ++i) {
twos += counts[i] * (1 << (i - 1));
}
// Each four replaces two 2-s
twos -= fours * 2;
return make_pair(twos, fours);
}
Now to answer how many moves we've made we should add the two values of the pair returned by this function and subtract two because two tiles with 2 appear straight away.
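For readers who prefer a shorter form, the same computation can be sketched in Python, including the final "subtract two" step. The function name `count_moves` is made up, and the sketch assumes, as the answer does, that the game started with two 2-tiles and that every non-empty cell holds a power of two.

```python
def count_moves(board, score):
    """Return (twos, fours, moves) for a 2048 board and its score.

    board is a list of rows; 0 marks an empty cell.
    """
    # Exponent e of every tile 2**e on the board.
    exps = [v.bit_length() - 1 for row in board for v in row if v]

    # Score we would have if only 2-tiles had ever appeared:
    # forming 2**e from twos scores 2**e * (e - 1).
    twos_score = sum((1 << e) * (e - 1) for e in exps)

    # Every 4 that appeared directly lowers the score by 4.
    fours = (twos_score - score) // 4

    # 2**e needs 2**(e - 1) twos; each spawned 4 replaces two of them.
    twos = sum(1 << (e - 1) for e in exps) - 2 * fours

    # The two initial tiles did not come from moves.
    return twos, fours, twos + fours - 2


print(count_moves([[4, 2, 0, 0]], 4))  # one merge of the two starting tiles
```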

Sieve optimization

A sequence is created from sequence of natural numbers:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
removing every 2nd number in the 2nd step:
1 3 5 7 9 11 13 15 17 19 21 23
removing every 3rd number in the 3rd step (from previous sequence):
1 3 7 9 13 15 19 21
removing every 4th number in the 4th step (from previous sequence):
1 3 7 13 19
and so forth...
Now, we're able to say, that the 4th number of the sequence will be 13.
Definition and the right solution for this is here: http://oeis.org/A000960
My task is to find the 1000th member of the sequence.
I have written an algorithm for this, but I think it's quite slow (when I try it with the 10,000th member it takes about 13 seconds). What it does is:
I have a number which increases by 2 in every step, since we know that there are no even numbers.
In the counters array I store an index for each step. If the number is the xth in the xth step, I have to remove it, e.g. number 5 in the 3rd step. And I initiate a counter for the next step.
ArrayList<Long> list = new ArrayList<Long>(10000);
long[] counters = new long[1002];
long number = -1;
int active_counter = 3;
boolean removed;
counters[active_counter] = 1;
int total_numbers = 1;
while (total_numbers <= 1000) {
number += 2;
removed = false;
for (int i = 3; i <= active_counter; i++) {
if ((counters[i] % i) == 0) {
removed = true;
if (i == active_counter) {
active_counter++;
counters[active_counter] = i;
}
counters[i]++;
break;
}
counters[i]++;
}
if (!removed) {
list.add(number);
total_numbers++;
}
}
Your link to OEIS gives us some methods for fast calculation (FORMULA etc)
Implementation of the second one:
function Flavius(n: Integer): Integer;
var
m, i: Integer;
begin
m := n * n;
for i := n - 1 downto 1 do
m := (m - 1) - (m - 1) mod i;
Result := m;
end;
P.S. Algorithm is linear (O(n)), and result for n=10000 is 78537769
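The Pascal function translates almost line for line into Python. The sketch below is only a port for readers who want to try it quickly; the logic is unchanged.

```python
def flavius(n):
    """n-th term of the sequence (OEIS A000960), n >= 1."""
    m = n * n
    # Walk back from step n - 1 to step 1, rounding (m - 1) down to a
    # multiple of i each time.
    for i in range(n - 1, 0, -1):
        m = (m - 1) - (m - 1) % i
    return m


print([flavius(k) for k in range(1, 7)])  # [1, 3, 7, 13, 19, 27]
```

The first five values match the list 1 3 7 13 19 left over in the question after the 4th step.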
No, this problem is not NP-hard...
My intuition is that it is O(n^2), and the link proves it:
Let F(n) = number of terms <= n. Andersson, improving results of Brun,
shows that F(n) = 2 sqrt(n/Pi) + O(n^(1/6)). Hence a(n) grows like Pi n^2 / 4.
I think O(n^2) should not take 15s for n = 10000. Yes, something is not right :(
Edit:
I measured the number of accesses to counters (for n = 10000) to get a rough idea of the complexity, and I got
F = 1305646150
F/n^2 = 13.05...
Your algorithm is between O(n^2) and O(n^2*log(n)), so you are doing things right... :)
Wow, that is a really interesting problem.
Thank you so much for that.
I just lost an hour of my life to this. I think the problem will turn out to be NP-hard. And I am at a loss to generate an equation to calculate the ith term in the jth step.
Your "brute force" solution seems fine unless there is some clever math trick to generate the final solution in one step. But I do not think there is.
From a programming standpoint, you could try making your initial array a linked list and just un-linking the terms you want to drop. That would save you some time, since you wouldn't be rebuilding your list every step.
One approach could be to keep an array of the numbers you are using to sieve, rather than the numbers being sieved. Basically, if you are looking for the Nth value in the sequence, you create an array of N counters and then iterate through the natural numbers. For each number, you loop through your counters, incrementing them until one gets to its "maximum" value, at which point you set that counter to zero and stop incrementing the remaining counters. (This represents removing the current number at that counter's step.) If you get through all of the counters without removing the current number, then this is one of the numbers that is left over.
Some sample (Java) code that seems to match the sequence given by OEIS:
public class Test {
public static void main(String[] args) {
int N=10000;
int n=0;
long c=0;
int[] counters = new int[N];
outer: while(n<N) {
c++;
for(int i=0;i<N;i++){
counters[i]++;
if(counters[i]==i+2){
counters[i]=0;
continue outer;
}
}
// c is the n'th leftover
System.out.println(n + " " + c);
n++;
}
}
}
I believe this runs in O(N^3).

Population segmentation algorithm

I have a population of 50 ordered integers (1,2,3,..,50) and I am looking for a generic way to slice it "n" ways ("n" is the number of cutoff points, ranging from 1 to 25) that maintains the order of the elements.
For example, for n=1 (one cutoff point) there are 49 possible grouping alternatives ([1,2-50], [1-2,3-50], [1-3,4-50],...). For n=2 (two cutoff points), the grouping alternatives are like: [1,2,3-50], [1,2-3,4-50],...
Could you recommend any general-purpose algorithm to complete this task in an efficient way?
Thanks,
Chris
Thanks everyone for your feedback. I reviewed all your comments and I am working on a generic solution that will return all combinations (e.g., [1,2,3-50], [1,2-3,4-50],...) for all numbers of cutoff points.
Thanks again,
Chris
Let sequence length be N, and number of slices n.
The problem becomes easier when you notice that choosing a slicing into n slices is equivalent to choosing n - 1 of the N - 1 possible split points (there is a split point between every two adjacent numbers in the sequence). Hence there are (N - 1 choose n - 1) such slicings.
To generate all slicings (to n slices), you have to generate all n - 1 element subsets of numbers from 1 to N - 1.
The exact algorithm for this problem is placed here: How to iteratively generate k elements subsets from a set of size n in java?
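In Python the subset generation is available out of the box, so the idea can be sketched with `itertools.combinations`. The function name `slicings` and its parameters are made up for this example, and `cuts` is the number of cutoff points as in the question (one less than the number of slices).

```python
from itertools import combinations
from math import comb


def slicings(size, cuts):
    """Yield every way to cut 1..size into cuts + 1 consecutive groups.

    Each slicing is a list of (first, last) pairs.
    """
    # A cut point p means "cut between element p and element p + 1".
    for points in combinations(range(1, size), cuts):
        bounds = (0, *points, size)
        yield [(lo + 1, hi) for lo, hi in zip(bounds, bounds[1:])]


for groups in slicings(4, 1):
    print(groups)

# The count for any size can be checked without enumerating anything.
print(comb(49, 2))
```

Because `slicings` is a generator, the groupings are produced one at a time and never all held in memory, which matters for the larger numbers of cutoff points.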
Do you need the cutoffs, or are you just counting them? If you're just going to count them, then it's simple (here n is the number of elements):
1 cutoff = (n-1) options
2 cutoffs = (n-1)*(n-2)/2 options
3 cutoffs = (n-1)*(n-2)*(n-3)/6 options
You can see the pattern here.
If you actually need the cutoffs, then you have to actually do the loops, but since n is so small, Emilio is right, just brute force it.
1 cutoff
for(i=1;i<n;++i)
    cout << i;
2 cutoffs
for(i=1;i<n;++i)
    for(j=i+1;j<n;++j)
        cout << i << " " << j;
3 cutoffs
for(i=1;i<n;++i)
    for(j=i+1;j<n;++j)
        for(k=j+1;k<n;++k)
            cout << i << " " << j << " " << k;
Again, you can see the pattern.
So you want to select 25 split points from 49 choices in all possible ways. There are a lot of well-known algorithms to do that.
I want to draw your attention to another side of this problem. There are 49!/(25!*(49-25)!) = 63 205 303 218 876 >= 2^45 ~= 3.5*10^13 different combinations. So if you want to store them, the required amount of memory is at least 32T * sizeof(Combination). I guess that it will pass the 1 PB mark.
Now let's assume that you want to process the generated data on the fly. Let's make the rather optimistic assumption that you can process 1 million combinations per second (here I assume that there is no parallelization). Even rounding the count down to 10^13, this task will take 10^7 seconds = 2777 hours = 115 days; with the exact count it is about 6.3*10^7 seconds, roughly two years.
This problem is more complicated than it seems at first glance. If you want to solve it at home in reasonable time, my suggestion is to change the strategy or wait for the advance of quantum computers.
This will generate an array of all the ranges, but I warn you, it'll take tons of memory, due to the large numbers of results (50 elements with 3 splits is 49*48*47=110544) I haven't even tried to compile it, so there's probably errors, but this is the general algorithm I'd use.
typedef std::vector<int>::iterator iterator_t;
typedef std::pair<iterator_t, iterator_t> range_t;
typedef std::vector<range_t> answer_t;
answer_t F(std::vector<int> integers, int slices) {
answer_t prev; //things to slice more
answer_t results; //thin
//initialize results for 0 slices
results.push_back(answer(range(integers.begin(), integers.end()), 1));
//while there's still more slicing to do
while(slices--) {
//move "results" to the "things to slice" pile
prev.clear();
prev.swap(results);
//for each thing to slice
for(int group=0; group<prev.size(); ++group) {
//for each range
for(int crange=0; crange<prev[group].size(); ++crange) {
//for each place in that range
for(int newsplit=0; newsplit<prev[group][crange].size(); ++newsplit) {
//copy the "result"
answer_t cur = prev[group];
//slice it
range_t L = range(cur[crange].first, cur[crange].first+newsplit);
range_t R = range(cur[crange].first+newsplit), cur[crange].second);
answer_t::iterator loc = cur.erase(cur.begin()+crange);
cur.insert(loc, R);
cur.insert(loc, L);
//add it to the results
results.push_back(cur);
}
}
}
}
return results;
}

Resources