How to analyze the complexity of this code? - algorithm

I am solving a problem from codeforces. According to the editorial, the complexity of the following code should be O(n).
for (int i = n - 1; i >= 0; --i) {
    r[i] = i + 1;
    while (r[i] < n && height[i] > height[r[i]])
        r[i] = r[r[i]];
    if (r[i] < n && height[i] == height[r[i]])
        r[i] = r[r[i]];
}
Here, height[i] is the height of the i-th hill and r[i] is the position of the first hill to the right that is higher than height[i]; height[0] is always the greatest value in the height array.
My question is: how can we guarantee that the code is O(n) despite the inner while loop?
The inner while loop keeps updating r[i] as long as height[i] > height[r[i]], and the number of updates depends on the height array. For example, the number of updates for a height array sorted in non-decreasing order differs from that for one sorted in non-increasing order. (In both cases we sort the array except height[0], because height[0] must always be the maximum in this problem.)
Is there a general method for analyzing an algorithm whose running time varies with the input data like this? Would amortized analysis be one answer?
PS. I would like to clarify my question. We set the array r[] in the loop. Consider height = {5,4,1,2,3} and i=1 (r[2]=3 and r[3]=4, because 2 is the first value greater than 1 and 3 is the first value greater than 2). We compare 4 with 1, and because 4>1 we go on to compare 4 with 2 (=height[r[2]]) and 4 with 3 (=height[r[3]]). In this case we have to compare 4 times to set r[1]. The number of comparisons differs from the case height = {5,1,2,3,4}. Can we still guarantee that the code is O(n)? If I am missing something, please let me know. Thank you.

I tried the mentioned algorithm with a simple example, but it seems nothing changed. Did I miss something?
Example:
n = 5
height = { 2, 4, 6, 8, 10 }
r = { 1, 2, 3, 4, 5 }
---- i:4 ----
r[4] < 5 ?
---- i:3 ----
8 > 10 ?
8 = 10 ?
---- i:2 ----
6 > 8 ?
6 = 8 ?
---- i:1 ----
4 > 6 ?
4 = 6 ?
---- i:0 ----
2 > 4 ?
2 = 4 ?
-------------
height = { 2, 4, 6, 8, 10 }
r = { 1, 2, 3, 4, 5 }

Your algorithm (I do not know whether it solves your problem) is actually O(n), even though there is an inner loop, because under the given conditions the inner loop does not execute in most cases. So in the worst case it runs about 2n times, which is O(n).
You can test this assumption with a method like this, where yourMethod returns the number of times its inner loop was executed:
int arr[] = {1, 2, 3, 4, 5};
do {
    int count = yourMethod(arr, 5);
} while (next_permutation(arr, arr + 5));
With this you will be able to check the worst case, average case, etc.
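One way to see the bound is amortized counting: once the while loop jumps over a hill, that hill is no longer on the chain of r links that any later (smaller) i follows, so each hill can be jumped over at most once. The sketch below is a Python port of the loop from the question with a counter added; count_jumps and the counter are names introduced here for illustration only. It enumerates small inputs and reports the largest number of inner-loop iterations seen.

```python
from itertools import permutations

def count_jumps(height):
    """Port of the loop from the question.

    Returns (r, jumps), where jumps is the total number of times the
    inner while loop body ran over the whole outer loop.
    """
    n = len(height)
    r = [0] * n
    jumps = 0
    for i in range(n - 1, -1, -1):
        r[i] = i + 1
        while r[i] < n and height[i] > height[r[i]]:
            r[i] = r[r[i]]
            jumps += 1
        if r[i] < n and height[i] == height[r[i]]:
            r[i] = r[r[i]]
    return r, jumps

# Exhaustive check over all orderings of six distinct heights.
worst = max(count_jumps(list(p))[1] for p in permutations(range(1, 7)))
print("largest total number of jumps for n = 6:", worst)
```

If the counting argument holds, the reported maximum stays below n for every input, which is what makes the whole loop linear rather than quadratic.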

Related

Minimum common remainder of division

I have n pairs of numbers: ( p[1], s[1] ), ( p[2], s[2] ), ... , ( p[n], s[n] )
where p[i] is an integer greater than 1 and s[i] is an integer with 0 <= s[i] < p[i].
Is there any way to determine the minimum positive integer a such that for each pair:
( s[i] + a ) mod p[i] != 0
Anything better than brute force ?
It is possible to do better than brute force. Brute force would be O(A·n), where A is the minimum valid value for a that we are looking for.
The approach described below uses a min-heap and achieves O(n·log(n) + A·log(n)) time complexity.
First, notice that replacing a with a value of the form (p[i] - s[i]) + k * p[i] leads to a remainder equal to zero in the i-th pair, for any non-negative integer k. Thus, the numbers of that form are invalid a values (the solution that we are looking for is different from all of them).
The proposed algorithm is an efficient way to generate the numbers of that form (for all i and k), i.e. the invalid values for a, in increasing order. As soon as the current value differs from the previous one by more than 1, it means that there was a valid a in-between.
The pseudocode below details this approach.
1. Construct a min-heap from all the pairs (p[i] - s[i], p[i]),
   where the heap comparator is based on the first element of the pairs.
2. a0 = 0 (we want a positive a, so we start just below 1); maxA = lcm(p[i])
3. Repeat
   3a. Retrieve and remove the root of the heap, (a, p[i]).
   3b. If a - a0 > 1 then the result is a0 + 1. Exit.
   3c. If a is at least maxA, then no solution exists. Exit.
   3d. Insert into the heap the value (a + p[i], p[i]).
   3e. a0 = a
Remark: it is possible for such an a to not exist. If a valid a is not found below LCM(p[1], p[2], ... p[n]), then it is guaranteed that no valid a exists.
I'll show below an example of how this algorithm works.
Consider the following (p, s) pairs: { (2, 1), (5, 3) }.
The first pair indicates that a should avoid values like 1, 3, 5, 7, ..., whereas the second pair indicates that we should avoid values like 2, 7, 12, 17, ... .
The min-heap initially contains the first element of each sequence (step 1 of the pseudocode), shown in brackets below:
[1], 3, 5, 7, ...
[2], 7, 12, 17, ...
We retrieve and remove the head of the heap, i.e., the minimum of the two bracketed values, which is 1. We add the next element from that sequence into the heap, so the heap now contains the elements 2 and 3:
1, [3], 5, 7, ...
[2], 7, 12, 17, ...
We again retrieve the head of the heap, which this time is the value 2, and add the next element of that sequence into the heap:
1, [3], 5, 7, ...
2, [7], 12, 17, ...
The algorithm continues: we next retrieve the value 3 and add 5 into the heap:
1, 3, [5], 7, ...
2, [7], 12, 17, ...
Finally, now we retrieve value 5. At this point we realize that the value 4 is not among the invalid values for a, thus that is the solution that we are looking for.
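The walkthrough above can be turned into a short Python sketch using the standard heapq module. The function name min_valid_a and the (p, s) tuple layout are choices made here for illustration, and the previous value starts at 0 on the assumption that a must be strictly positive.

```python
import heapq
from functools import reduce
from math import gcd

def min_valid_a(pairs):
    """pairs is a list of (p, s) with p > 1 and 0 <= s < p.

    Returns the smallest positive a with (s + a) % p != 0 for every pair,
    or None when no such a exists.
    """
    # Invalid values repeat with period lcm(p), so that is the search limit.
    limit = reduce(lambda x, y: x * y // gcd(x, y), (p for p, _ in pairs))
    # Each heap entry is (next invalid value for this pair, its step p).
    heap = [(p - s, p) for p, s in pairs]
    heapq.heapify(heap)
    prev = 0  # last value known not to be a valid positive answer
    while True:
        a, p = heapq.heappop(heap)
        if a - prev > 1:
            return prev + 1      # gap between consecutive invalid values
        if a >= limit:
            return None          # a whole period is invalid
        heapq.heappush(heap, (a + p, p))
        prev = a

print(min_valid_a([(2, 1), (5, 3)]))  # 4, as in the walkthrough
```

Duplicated invalid values (the same number produced by two pairs) simply pop with a difference of 0 and do not affect the gap test.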
I can think of two different solutions. First:
p_max = lcm (p[0],p[1],...,p[n]) - 1;
for a = 0 to p_max:
    zero_found = false;
    for i = 0 to n:
        if ( s[i] + a ) mod p[i] == 0:
            zero_found = true;
            break;
    if !zero_found:
        return a;
return -1;
I suppose this is the one you call "brute force". Notice that p_max represents Least Common Multiple of p[i]s - 1 (solution is either in the closed interval [0, p_max], or it does not exist). Complexity of this solution is O(n * p_max) in the worst case (plus the running time for calculating lcm!). There is a better solution regarding the time complexity, but it uses an additional binary array - classical time-space tradeoff. Its idea is similar to the Sieve of Eratosthenes, but for remainders instead of primes :)
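p_max = lcm (p[0],p[1],...,p[n]) - 1;
int remainders[p_max + 1] = {0};
for i = 0 to n:
    // start at 0 when s[i] == 0, because a = 0 is then invalid for this pair
    int rem = (s[i] == 0) ? 0 : s[i] - p[i];
    while rem >= -p_max:
        remainders[-rem] = 1;
        rem -= p[i];
// scan every candidate in [0, p_max], not only the first n of them
for i = 0 to p_max:
    if !remainders[i]:
        return i;
return -1;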
Explanation of the algorithm: first, we create an array remainders that indicates whether a certain negative remainder exists in the whole set. What is a negative remainder? It's simple: notice that 6 = 2 mod 4 is equivalent to 6 = -2 mod 4. If remainders[i] == 1, it means that adding i to one of the s[j] gives a multiple of p[j] (which is 0 modulo p[j], and that is what we want to avoid). The array is populated with all possible negative remainders, down to -p_max. Now all we have to do is search for the first i such that remainders[i] == 0 and return it, if it exists; notice that a solution does not have to exist. In the problem text you indicated that you are searching for the minimum positive integer; I don't see why zero would not fit (if all s[i] are positive). However, if that is a strong requirement, just change the for loop to start from 1 instead of 0, and increment p_max.
The complexity of this algorithm is n + sum (p_max / p[i]) = n + p_max * sum (1 / p[i]), where i goes from 0 to n. Since all p[i]s are at least 2, that is asymptotically better than the brute force solution.
An example for better understanding: suppose that the input is (5,4), (5,1), (2,0). p_max is lcm(5,5,2) - 1 = 10 - 1 = 9, so we create array with 10 elements, initially filled with zeros. Now let's proceed pair by pair:
from the first pair, we have remainders[1] = 1 and remainders[6] = 1
second pair gives remainders[4] = 1 and remainders[9] = 1
last pair gives remainders[0] = 1, remainders[2] = 1, remainders[4] = 1, remainders[6] = 1 and remainders[8] = 1.
Therefore, first index with zero value in the array is 3, which is a desired solution.
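For comparison, here is a minimal Python sketch of the same sieve idea. The function name min_valid_a_sieve and the start parameter (0 to allow a = 0, 1 to require a positive a) are assumptions added for illustration; the sketch marks invalid values directly instead of working with negative remainders.

```python
from functools import reduce
from math import gcd

def min_valid_a_sieve(pairs, start=0):
    """pairs is a list of (p, s) with p > 1 and 0 <= s < p.

    Returns the smallest a >= start with (s + a) % p != 0 for every pair,
    or None when every candidate in one full period is invalid.
    """
    period = reduce(lambda x, y: x * y // gcd(x, y), (p for p, _ in pairs))
    size = period + start            # covers one full period from start
    invalid = [False] * size
    for p, s in pairs:
        # (p - s) % p is the smallest a >= 0 that makes (s + a) % p == 0.
        for a in range((p - s) % p, size, p):
            invalid[a] = True
    for a in range(start, size):
        if not invalid[a]:
            return a
    return None

print(min_valid_a_sieve([(5, 4), (5, 1), (2, 0)]))  # 3, as in the example
```

The array costs memory proportional to the least common multiple, which is the space side of the tradeoff mentioned above.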

How to increase speed of algorithm performance for calculation minimal number of moves?

I participate in CodeFights and have a task: find the minimal number of moves required to obtain a strictly increasing sequence from the input. The input is an array of integers, and according to the rules I can increase exactly one element of the array by one per move.
inputArray: [1, 1, 1]
Expected Output:3
inputArray: [-1000, 0, -2, 0]
Expected Output:5
inputArray: [2, 1, 10, 1]
Expected Output:12
inputArray: [2, 3, 3, 5, 5, 5, 4, 12, 12, 10, 15]
Expected Output:13
There are also conditions for input and output:
[time limit] 4000ms (py3)
[input] array.integer inputArray
3 ≤ inputArray.length ≤ 10^5,
-10^5 ≤ inputArray[i] ≤ 10^5
[output] integer
I came up with the following solution:
def arrayChange(inputArray):
    k=0
    for i in range(len(inputArray)-1):
        if (inputArray[i]<inputArray[i+1]) == False:
            while inputArray[i+1]<=inputArray[i]:
                inputArray[i+1] = inputArray[i+1] + 1
                k +=1
    return k
However, apparently for some tests that I cannot observe, my algorithm exceeds the time limit:
6/8
Execution time limit exceeded on test 7: Program exceeded the execution time limit. Make sure that it completes execution in a few seconds for any possible input.
Sample tests: 4/4
Hidden tests: 2/4
How can I make my algorithm faster?
Right now you increase by 1 at a time. Currently you have this code snippet:
inputArray[i+1] = inputArray[i+1] + 1
Instead of incrementing by 1 each time, why not add all of the numbers at once? For example, if you have the list [1, 3, 0] it makes sense to add 4 to the last element. Doing this would go much more quickly than adding 1 4 times.
@fileyfood500 gave me a very useful hint, and here is my solution that works:
deltX=0
for i in range(len(a)-1):
    if (a[i]<a[i+1]) == False:
        deltX1 = abs(a[i+1]-a[i])+1
        a[i+1] = a[i+1] + deltX1
        deltX += deltX1
print(deltX)
Now I do not need while loop at all because I increase the item, that should be increased by necessary number in one step.
def arrayChange(inputArray):
    count = 0
    for i in range(1,len(inputArray)):
        if inputArray[i] <= inputArray[i-1]:
            c = inputArray[i]
            inputArray[i]= inputArray[i-1]+1
            count += inputArray[i] - c
    return count
This way you increment it directly and the time will be less.
Then just subtract the new value from the earlier one to get the number of times it increments.
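As a cross-check of the idea in these answers, the sketch below computes the same count in a single pass without modifying the input list. The name array_change is chosen here for illustration; the four calls use the sample inputs from the question.

```python
def array_change(values):
    """Minimal number of +1 moves to make values strictly increasing."""
    moves = 0
    prev = values[0]
    for x in values[1:]:
        if x <= prev:
            # Raise x to prev + 1 in one step and count the difference.
            moves += prev + 1 - x
            x = prev + 1
        prev = x
    return moves

print(array_change([1, 1, 1]))                               # 3
print(array_change([-1000, 0, -2, 0]))                       # 5
print(array_change([2, 1, 10, 1]))                           # 12
print(array_change([2, 3, 3, 5, 5, 5, 4, 12, 12, 10, 15]))   # 13
```

Each element is visited once, so the running time is linear in the array length regardless of how large the individual gaps are.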

Find the sum of least common multiples of all subsets of a given set

Given: set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
Asked: Find the sum of all least common multiples (LCM) of all subsets of A of size at least 2.
The LCM of a set B = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin, for all 0 ≤ i < k.
Example:
Let N = 3 and A = {2, 6, 7}, then:
LCM({2, 6}) = 6
LCM({2, 7}) = 14
LCM({6, 7}) = 42
LCM({2, 6, 7}) = 42
----------------------- +
answer 104
The naive approach would be to simply calculate the LCM for all O(2^N) subsets, which is not feasible for reasonably large N.
Solution sketch:
The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.
The solution reads (modulo some small fixed grammar issues):
The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:
2 8
3 5
5 3
7 3
11 2
13 2
17 2
19 2
Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.
I have the following concrete questions:
What is meant by these states?
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
How is dp[i] computed from dp[i-1]?
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
*The original problem description can be found from this source (problem F). This question is a simplified version of that description.
Discussion
After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.
The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.
Note that this ought to include the LCM of sets with only one item (in which case we'd assume the LCM to be the element itself). The solution sketch seems to sabotage, perhaps because they handled it in a less elegant manner.
What is meant by these states?
The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.
Does dp stand for dynamic programming?
I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.
If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?
All I can think is that in their solution, state i represents a time to failure , T(i), with the number of times this time to failure has been counted, dp[i]. The resulting sum would be to sum all dp[i] * T(i).
dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..
Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).
To populate dp[i][n] from dp[i][n-1] for each component c:
For each i, copy dp[i][n-1] into dp[i][n].
Add 1 to dp[T(c)][n].
For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].
What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A union B) = LCM(LCM(A), LCM(B)).
Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.
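The update rule described above can be sketched in Python with a dictionary keyed by LCM value instead of a fixed table. This is an illustration of the recurrence only, not the contest's intended solution: the names lcm_counts and sum_lcm_at_least_two are made up here, and the dictionary can grow very large for general inputs, which is the problem the bounded state space in the sketch is meant to avoid.

```python
from collections import defaultdict
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def lcm_counts(values):
    """Map each LCM value to the number of non-empty subsets producing it."""
    counts = defaultdict(int)
    for c in values:
        updated = defaultdict(int, counts)   # subsets that do not use c
        updated[c] += 1                      # the subset {c} on its own
        for value, k in counts.items():      # every old subset extended by c
            updated[lcm(value, c)] += k
        counts = updated
    return counts

def sum_lcm_at_least_two(values):
    """Sum of LCMs over all subsets of size >= 2."""
    counts = lcm_counts(values)
    total = sum(value * k for value, k in counts.items())
    return total - sum(values)               # drop the single-element subsets

print(sum_lcm_at_least_two([2, 6, 7]))  # 104, matching the question's example
```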
Why do the big primes not contribute to the number of states?
Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.
What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.
The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).
Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is
(1 + 2) (1 + 3) (1 + 5) (1 + 7),
because the LCM of a subset is exactly equal to the product here, so just multiply it out.
Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like
(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),
because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.
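The product formula for inputs that consist only of primes (possibly repeated) can be checked against direct enumeration. The sketch below is for illustration; both function names are invented here, and the empty subset is counted with LCM 1, as in the text.

```python
from collections import Counter
from itertools import combinations
from math import gcd

def sum_lcm_prime_multiset(primes):
    """Closed form: product over distinct primes p of (1 + (2^m - 1) * p),
    where m is the multiplicity of p in the input."""
    total = 1
    for p, m in Counter(primes).items():
        total *= 1 + (2 ** m - 1) * p
    return total

def sum_lcm_brute_force(values):
    """Sum of LCMs over all subsets by position, the empty subset giving 1."""
    total = 0
    for size in range(len(values) + 1):
        for subset in combinations(values, size):
            current = 1
            for v in subset:
                current = current * v // gcd(current, v)
            total += current
    return total

print(sum_lcm_prime_multiset([2, 3, 5, 7]), sum_lcm_brute_force([2, 3, 5, 7]))
print(sum_lcm_prime_multiset([2, 2, 3, 3, 3, 5]), sum_lcm_brute_force([2, 2, 3, 3, 3, 5]))
```

Each printed pair should agree, since choosing a subset amounts to deciding independently, for every distinct prime, whether at least one copy of it is included.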
Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.
For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM has a small portion that properly divides the state instead of being exactly equal. Maybe I'll work a complete example later.
What is meant by these states?
I think here, states refer to if the number is in set B = {b0, b1, ..., bk-1} of LCMs of set A.
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
dp in the solution sketch stands for dynamic programming, I believe.
How is dp[i] computed from dp[i-1]?
It's feasible that we can figure out the state of next group of LCMs from previous states. So, we only need array of 2, and toggle back and forth.
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
We can use Prime Factorization and exponents only to present the number.
Here is one example.
6 = (2^1)(3^1)(5^0) -> state "1 1 0" to represent 6
18 = (2^1)(3^2)(5^0) -> state "1 2 0" to represent 18
Here is how we can get the LCM of 6 and 18 using prime factorization:
LCM(6,18) = (2^max(1,1))(3^max(1,2))(5^max(0,0)) = (2^1)(3^2)(5^0) = 18
2^9 > 500, 3^6 > 500, 5^4 > 500, 7^4>500, 11^3 > 500, 13^3 > 500, 17^3 > 500, 19^3 > 500
we can use only count of exponents of prime number 2,3,5,7,11,13,17,19 to represent the LCMs in the set B = {b0, b1, ..., bk-1}
for the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 <= 70000, so we only need two of dp[9][6][4][4][3][3][3][3] to keep tracks of all LCMs' states. So, dp[70000][2] is enough.
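The state count and the exponent-vector view of the LCM can be reproduced with a few lines of Python. The helper names max_exponent and small_prime_exponents are made up for this sketch, and the limit of 500 is taken from the problem statement.

```python
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def max_exponent(p, limit=500):
    """Largest e such that p**e <= limit."""
    e = 0
    while p ** (e + 1) <= limit:
        e += 1
    return e

def small_prime_exponents(n):
    """Exponents of the small primes in n, as a tuple (the 'state' of n)."""
    state = []
    for p in SMALL_PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        state.append(e)
    return tuple(state)

exponents = {p: max_exponent(p) for p in SMALL_PRIMES}
states = 1
for e in exponents.values():
    states *= e + 1          # exponent may be anything in 0..e
print(exponents)
print(states)                # 69984

# The state of an LCM is the element-wise maximum of the two states.
merged = tuple(max(x, y) for x, y in
               zip(small_prime_exponents(6), small_prime_exponents(18)))
print(merged == small_prime_exponents(18))   # True
```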
I put together a small C++ program to illustrate how we can get sum of LCMs of the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500. In the solution sketch, we need to loop through 70000 max possible of LCMs.
#include <cstring>
#include <iostream>
using namespace std;

int gcd(int a, int b) {
    int remainder = 0;
    do {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (b != 0);
    return a;
}

int lcm(int a, int b) {
    if (a == 0 || b == 0) {
        return 0;
    }
    return (a * b) / gcd(a, b);
}

int sum_of_lcm(int A[], int N) {
    // get the max LCM from the array
    int max = A[0];
    for (int i = 1; i < N; i++) {
        max = lcm(max, A[i]);
    }
    max++;
    // two columns: the previous pass (pri) and the current pass (cur)
    int dp[max][2];
    memset(dp, 0, sizeof(dp));
    int pri = 0;
    int cur = 1;
    // loop through n x 70000
    for (int i = 0; i < N; i++) {
        for (int v = 1; v < max; v++) {
            int x = A[i];
            if (dp[v][pri] > 0) {
                x = lcm(A[i], v);
                dp[v][cur] = (dp[v][cur] == 0) ? dp[v][pri] : dp[v][cur];
                if ( x % A[i] != 0 ) {
                    dp[x][cur] += dp[v][pri] + dp[A[i]][pri];
                } else {
                    dp[x][cur] += ( x==v ) ? ( dp[v][pri] + dp[v][pri] ) : ( dp[v][pri] ) ;
                }
            }
        }
        dp[A[i]][cur]++;
        pri = cur;
        cur = (pri + 1) % 2;
    }
    // remove the single-element subsets
    for (int i = 0; i < N; i++) {
        dp[A[i]][pri] -= 1;
    }
    long total = 0;
    for (int j = 0; j < max; j++) {
        if (dp[j][pri] > 0) {
            total += dp[j][pri] * j;
        }
    }
    cout << "total:" << total << endl;
    return total;
}

int test() {
    int a[] = {2, 6, 7 };
    int n = sizeof(a)/sizeof(a[0]);
    int total = sum_of_lcm(a, n);
    return 0;
}
Output
total:104
The states are one more than the powers of primes. You have numbers up to 2^8, so the power of 2 is in [0..8], which is 9 states. Similarly for the other states.
"dp" could well stand for dynamic programming, I'm not sure.
The recurrence relation is the heart of the problem, so you will learn more by solving it yourself. Start with some small, simple examples.
For the large primes, try solving a reduced problem without using them (or their equivalents) and then add them back in to see their effect on the final result.

Maximum continuous achievable number

The problem
Definitions
Let's define a natural number N as a writable number (WN) for a number set U in the M numeral system if it can be written in this numeral system from members of U, using each member no more than once. More strict definition of 'written': - here CONCAT means concatenation.
Let's define a natural number N as a continuous achievable number (CAN) for a symbol set U in the M numeral system if it is a WN for U and M and N-1 is also a CAN for U and M (an equivalent definition: N is a CAN for U and M if all numbers 0 .. N are WN for U and M). More strict:
Issue
Suppose we have a set U of S natural numbers (we treat zero as a natural number) and a natural number M, M>1. The problem is to find the maximum CAN (MCAN) for the given U and M. The set U may contain duplicates, but each duplicate can be used no more than once, of course (i.e. if U contains {x, y, y, z}, then each y can be used 0 or 1 times, so y can be used 0..2 times in total). U is also expected to be valid in the M numeral system (i.e. it cannot contain the symbols 8 or 9 in any member if M=8). And, of course, members of U are numbers, not symbols for M (so 11 is valid for M=10); otherwise the problem would be trivial.
My approach
I have in mind a simple algorithm now, which is simply checking if current number is CAN via:
1. Check if 0 is WN for given U and M? Go to 2: We're done, MCAN is null
2. Check if 1 is WN for given U and M? Go to 3: We're done, MCAN is 0
...
So, this algorithm tries to build the whole sequence. I doubt this part can be improved, but maybe it can? Now, how to check whether a number is a WN: this is also a kind of 'substitution brute force'. I have an implementation of that for M=10 (in fact, since we're dealing with strings, any other M is not a problem) as a PHP function:
//$mNumber is our N, $rgNumbers is our U
function isWriteable($mNumber, $rgNumbers)
{
    if(in_array((string)$mNumber, $rgNumbers=array_map('strval', $rgNumbers), true))
    {
        return true;
    }
    for($i=1; $i<=strlen((string)$mNumber); $i++)
    {
        foreach($rgKeys = array_keys(array_filter($rgNumbers, function($sX) use ($mNumber, $i)
        {
            return $sX==substr((string)$mNumber, 0, $i);
        })) as $iKey)
        {
            $rgTemp = $rgNumbers;
            unset($rgTemp[$iKey]);
            if(isWriteable(substr((string)$mNumber, $i), $rgTemp))
            {
                return true;
            }
        }
    }
    return false;
}
So we try one piece and then check recursively whether the rest can be written. If it cannot, we try the next member of U. I think this is a point which can be improved.
Specifics
As you see, the algorithm tries to build all numbers before N and check whether they are WN. But the only goal is to find MCAN, so the questions are:
Maybe a constructive algorithm is excessive here? And if so, what other options could be used?
Is there a quicker way to determine whether a number is WN for given U and M? (This point may be moot if the previous point has a positive answer and we do not build and check all numbers before N.)
Samples
U = {4, 1, 5, 2, 0}
M = 10
then MCAN = 2 (3 couldn't be reached)
U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11}
M = 10
then MCAN = 21 (all before could be reached, for 22 there are no two 2 symbols total).
Hash the digit count for digits from 0 to m-1. Hash the numbers greater than m that are composed of one repeated digit.
MCAN is bound by the smallest digit for which all combinations of that digit for a given digit count cannot be constructed (e.g., X000,X00X,X0XX,XX0X,XXX0,XXXX), or (digit count - 1) in the case of zero (for example, for all combinations of four digits, combinations are needed for only three zeros; for a zero count of zero, MCAN is null). Digit counts are evaluated in ascending order.
Examples:
1. MCAN (10, {4, 1, 5, 2, 0})
3 is the smallest digit for which a digit-count of one cannot be constructed.
MCAN = 2
2. MCAN (10, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11})
2 is the smallest digit for which a digit-count of two cannot be constructed.
MCAN = 21
3. (from Alma Do Mundo's comment below) MCAN (2, {0,0,0,1,1,1})
1 is the smallest digit for which all combinations for a digit-count of four
cannot be constructed.
MCAN = 1110
4. (example from No One in Particular's answer)
MCAN (2, {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1111,11111111})
1 is the smallest digit for which all combinations for a digit-count of five
cannot be constructed.
MCAN = 10101
The recursion steps I've made are:
If the digit string is available in your alphabet, mark it used and return immediately
If the digit string is of length 1, return failure
Split the string in two and try each part
This is my code:
$u = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11];
echo ncan($u), "\n"; // 21
// the functions
function satisfy($n, array $u)
{
    if (!empty($u[$n])) { // step 1
        --$u[$n];
        return $u;
    } elseif (strlen($n) == 1) { // step 2
        return false;
    }
    // step 3
    for ($i = 1; $i < strlen($n); ++$i) {
        $u2 = satisfy(substr($n, 0, $i), $u);
        if ($u2 && satisfy(substr($n, $i), $u2)) {
            return true;
        }
    }
    return false;
}
function is_can($n, $u)
{
    return satisfy($n, $u) !== false;
}
function ncan($u)
{
    // count how many times each member occurs in the alphabet
    $umap = array_reduce($u, function(&$result, $item) {
        @$result[$item]++;
        return $result;
    }, []);
    $i = -1;
    while (is_can($i + 1, $umap)) {
        ++$i;
    }
    return $i;
}
Here is another approach:
1) Order the set U with regards to the usual numerical ordering for base M.
2) If there is a symbol between 0 and (M-1) which is missing, then that is the first number which is NOT MCAN.
3) Find the first symbol which has the least number of entries in the set U. From this we have an upper bound on the first number which is NOT MCAN. That number would be {xxxx} N times. For example, if M = 4 and U = { 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3}, then the number 333 is not MCAN. This gives us our upper bound.
4) So, if the first element of the set U which has the smallest number of occurrences is x and it has C occurrences, then we can clearly represent any number with C digits. (Since every element has at least C entries).
5) Now we ask if there is any number less than (C+1)x which can't be MCAN? Well, any (C+1) digit number can have either (C+1) of the same symbol or only at most (C) of the same symbol. Since x is minimal from step 3, (C+1)y for y < x can be done and (C)a + b can be done for any distinct a, b since they have (C) copies at least.
The above method works for set elements of only 1 symbol. However, we now see that it becomes more complex if multi-symbol elements are allowed. Consider the following case:
U = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1111,11111111}
Define c(A,B) = the number of 'A' symbols of 'B' length.
So for our example, c(0,1) = 15, c(0,2) = 0, c(0,3) = 0, c(0,4) = 0, ...
c(1,1) = 3, c(1,2) = 0, c(1,3) = 0, c(1,4) = 1, c(0,5) = 0, ..., c(1,8) = 1
The shortest 0 string we can't make has length 16. The shortest 1 string we can't make also has length 16.
1 = 1
11 = 1+1
111 = 1+1+1
1111 = 1111
11111 = 1+1111
111111 = 1+1+1111
1111111 = 1+1+1+1111
11111111 = 11111111
111111111 = 1+11111111
1111111111 = 1+1+11111111
11111111111 = 1+1+1+11111111
111111111111 = 1111+11111111
1111111111111 = 1+1111+11111111
11111111111111 = 1+1+1111+11111111
111111111111111 = 1+1+1+1111+11111111
But can we make the string 11111101111? We can't because the last 1 string (1111) needs the only set of 1's with the 4 in a row. Once we take that, we can't make the first 1 string (111111) because we only have an 8 (which is too big) or 3 1-lengths which are too small.
So for multi-symbols, we need another approach.
We know from sorting and ordering our strings what is the minimum length we can't do for a given symbol. (In the example above, it would be 16 zeros or 16 ones.) So this is our upper bound for an answer.
What we have to do now is start at 1 and count up in base M. For each number, we write it in base M and then determine whether we can make it from our set U. We do this using the same approach as in the coin change problem: dynamic programming. (See for example http://www.geeksforgeeks.org/dynamic-programming-set-7-coin-change/ for the algorithm.) The only difference is that in our case we have only a finite number of each element, not an infinite supply.
Instead of subtracting the amount we are using like in the coin change problem, we strip the matching symbol off of the front of the string we are trying to match. (This is the opposite of our addition - concatenation.)
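The prefix-stripping idea can be sketched in Python as plain backtracking over a multiset of pieces. This is an illustration rather than the dynamic programming version described above: it uses no memoization, assumes M <= 10 so that every digit is a single character, treats the members of U as already written in base M, and the names to_base, is_writable and mcan are invented for the sketch.

```python
from collections import Counter

def to_base(n, m):
    """Digit string of n in base m (assumes 2 <= m <= 10)."""
    digits = []
    while True:
        n, d = divmod(n, m)
        digits.append(str(d))
        if n == 0:
            break
    return "".join(reversed(digits))

def is_writable(target, stock):
    """Can target be split into pieces, each taken from stock at most as
    often as it occurs there? stock is a Counter of digit strings."""
    if target == "":
        return True
    for piece in list(stock):
        if stock[piece] > 0 and target.startswith(piece):
            stock[piece] -= 1                      # use one copy
            ok = is_writable(target[len(piece):], stock)
            stock[piece] += 1                      # put it back
            if ok:
                return True
    return False

def mcan(u, m=10):
    """Largest N such that every number 0..N is writable; None if 0 is not."""
    stock = Counter(str(x) for x in u)
    n = 0
    while is_writable(to_base(n, m), stock):
        n += 1
    return n - 1 if n > 0 else None

print(mcan([4, 1, 5, 2, 0]))                       # 2
print(mcan([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11]))    # 21
```

The loop always terminates because U is finite, so sufficiently long numbers cannot be assembled from it.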

puzzle using arrays

My first array has size M + N and my second array has size N.
Let us say m=4, n=5
a[ ]= 1,3,5,7,0,0,0,0,0
b[ ]= 2,4,6,8,10
Now, how can I merge these two arrays in place, without using external sorting algorithms or any temporary array, with O(n) complexity? The resultant array must be in sorted order.
Provided a is exactly the right size and arrays are already sorted (as seems to be the case), the following pseudo-code should help:
#    0 1 2 3 4 5 6 7 8
a = [1,3,5,7,0,0,0,0,0]
b = [2,4,6,8,10]
afrom = 3
bfrom = 4
ato = 8
while bfrom >= 0:
    if afrom == -1:
        a[ato] = b[bfrom]
        ato = ato - 1
        bfrom = bfrom - 1
    else:
        if b[bfrom] > a[afrom]:
            a[ato] = b[bfrom]
            ato = ato - 1
            bfrom = bfrom - 1
        else:
            a[ato] = a[afrom]
            ato = ato - 1
            afrom = afrom - 1
print(a)
It's basically a merge of the two lists into one, starting at the ends. Once bfrom hits -1, there are no more elements in b so the remainder in a were less than the lowest in b. Therefore the rest of a can remain unchanged.
If a runs out first, then it's a matter of transferring the rest of b since all the a elements have been transferred above ato already.
This is O(n) as requested and would result in something like:
[1, 2, 3, 4, 5, 6, 7, 8, 10]
Understanding that pseudo-code and translating it to your specific language is a job for you, now that you've declared it homework :-)
for (i = 0; i < N; i++) {
    a[m+i] = b[i];
}
This will do an in-place merge (concatenation).
If you're asking for an ordered merge, that's not possible in O(N). If it were to be possible, you could use it to sort in O(N). And of course O(N log N) is the best known general-case sorting algorithm...
I've got to ask, though, looking at your last few questions: are you just asking us for homework help? You do know that it's OK to say "this is homework", and nobody will laugh at you, right? We'll even still do our best to help you learn.
Do you want a sorted array? If not, this should do:
for (int i = a.length - 1, j = 0; j < b.length; i--, j++)
{
    a[i] = b[j];
}
You can take a look at in-place counting sort that works provided you know the input range. Effectively O(n).
