Google Interview: Arrangement of Blocks - algorithm

You are given N blocks of height 1…N. In how many ways can you arrange these blocks in a row such that when viewed from left you see only L blocks (rest are hidden by taller blocks) and when seen from right you see only R blocks? Example given N=3, L=2, R=1 there is only one arrangement {2, 1, 3} while for N=3, L=2, R=2 there are two ways {1, 3, 2} and {2, 3, 1}.
How should we solve this problem by programming? Any efficient ways?

This is a counting problem, not a construction problem, so we can approach it using recursion. Since the problem has two natural parts, looking from the left and looking from the right, break it up and solve for just one part first.
Let b(N, L, R) be the number of solutions, and let f(N, L) be the number of arrangements of N blocks so that L are visible from the left. First think about f because it's easier.
APPROACH 1
Let's get the initial conditions and then go for recursion. If all are to be visible, then they must be ordered increasingly, so
f(N, N) = 1
If there are suppose to be more visible blocks than available blocks, then nothing we can do, so
f(N, M) = 0 if N < M
If only one block should be visible, then put the largest first and then the others can follow in any order, so
f(N,1) = (N-1)!
Finally, for the recursion, think about the position of the tallest block, say N is in the kth spot from the left. Then choose the blocks to come before it in (N-1 choose k-1) ways, arrange those blocks so that exactly L-1 are visible from the left, and order the N-k blocks behind N it in any you like, giving:
f(N, L) = sum_{1<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * (N-k)!
In fact, since f(x-1,L-1) = 0 for x<L, we may as well start k at L instead of 1:
f(N, L) = sum_{L<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * (N-k)!
Right, so now that the easier bit is understood, let's use f to solve for the harder bit b. Again, use recursion based on the position of the tallest block, again say N is in position k from the left. As before, choose the blocks before it in N-1 choose k-1 ways, but now think about each side of that block separately. For the k-1 blocks left of N, make sure that exactly L-1 of them are visible. For the N-k blocks right of N, make sure that R-1 are visible and then reverse the order you would get from f. Therefore the answer is:
b(N,L,R) = sum_{1<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * f(N-k, R-1)
where f is completely worked out above. Again, many terms will be zero, so we only want to take k such that k-1 >= L-1 and N-k >= R-1 to get
b(N,L,R) = sum_{L <= k <= N-R+1} (N-1 choose k-1) * f(k-1, L-1) * f(N-k, R-1)
APPROACH 2
I thought about this problem again and found a somewhat nicer approach that avoids the summation.
If you work the problem the opposite way, that is think of adding the smallest block instead of the largest block, then the recurrence for f becomes much simpler. In this case, with the same initial conditions, the recurrence is
f(N,L) = f(N-1,L-1) + (N-1) * f(N-1,L)
where the first term, f(N-1,L-1), comes from placing the smallest block in the leftmost position, thereby adding one more visible block (hence L decreases to L-1), and the second term, (N-1) * f(N-1,L), accounts for putting the smallest block in any of the N-1 non-front positions, in which case it is not visible (hence L stays fixed).
This recursion has the advantage of always decreasing N, though it makes it more difficult to see some formulas, for example f(N,N-1) = (N choose 2). This formula is fairly easy to show from the previous formula, though I'm not certain how to derive it nicely from this simpler recurrence.
Now, to get back to the original problem and solve for b, we can also take a different approach. Instead of the summation before, think of the visible blocks as coming in packets, so that if a block is visible from the left, then its packet consists of all blocks right of it and in front of the next block visible from the left, and similarly if a block is visible from the right then its packet contains all blocks left of it until the next block visible from the right. Do this for all but the tallest block. This makes for L+R packets. Given the packets, you can move one from the left side to the right side simply by reversing the order of the blocks. Therefore the general case b(N,L,R) actually reduces to solving the case b(N,L,1) = f(N,L) and then choosing which of the packets to put on the left and which on the right. Therefore we have
b(N,L,R) = (L+R choose L) * f(N,L+R)
Again, this reformulation has some advantages over the previous version. Putting these latter two formulas together, it's much easier to see the complexity of the overall problem. However, I still prefer the first approach for constructing solutions, though perhaps others will disagree. All in all it just goes to show there's more than one good way to approach the problem.
What's with the Stirling numbers?
As Jason points out, the f(N,L) numbers are precisely the (unsigned) Stirling numbers of the first kind. One can see this immediately from the recursive formulas for each. However, it's always nice to be able to see it directly, so here goes.
The (unsigned) Stirling numbers of the First Kind, denoted S(N,L) count the number of permutations of N into L cycles. Given a permutation written in cycle notation, we write the permutation in canonical form by beginning the cycle with the largest number in that cycle and then ordering the cycles increasingly by the first number of the cycle. For example, the permutation
(2 6) (5 1 4) (3 7)
would be written in canonical form as
(5 1 4) (6 2) (7 3)
Now drop the parentheses and notice that if these are the heights of the blocks, then the number of visible blocks from the left is exactly the number of cycles! This is because the first number of each cycle blocks all other numbers in the cycle, and the first number of each successive cycle is visible behind the previous cycle. Hence this problem is really just a sneaky way to ask you to find a formula for Stirling numbers.

well, just as an empirical solution for small N:
blocks.py:
import itertools
from collections import defaultdict
def countPermutation(p):
n = 0
max = 0
for block in p:
if block > max:
n += 1
max = block
return n
def countBlocks(n):
count = defaultdict(int)
for p in itertools.permutations(range(1,n+1)):
fwd = countPermutation(p)
rev = countPermutation(reversed(p))
count[(fwd,rev)] += 1
return count
def printCount(count, n, places):
for i in range(1,n+1):
for j in range(1,n+1):
c = count[(i,j)]
if c > 0:
print "%*d" % (places, count[(i,j)]),
else:
print " " * places ,
print
def countAndPrint(nmax, places):
for n in range(1,nmax+1):
printCount(countBlocks(n), n, places)
print
and sample output:
blocks.countAndPrint(10)
1
1
1
1 1
1 2
1
2 3 1
2 6 3
3 3
1
6 11 6 1
6 22 18 4
11 18 6
6 4
1
24 50 35 10 1
24 100 105 40 5
50 105 60 10
35 40 10
10 5
1
120 274 225 85 15 1
120 548 675 340 75 6
274 675 510 150 15
225 340 150 20
85 75 15
15 6
1
720 1764 1624 735 175 21 1
720 3528 4872 2940 875 126 7
1764 4872 4410 1750 315 21
1624 2940 1750 420 35
735 875 315 35
175 126 21
21 7
1
5040 13068 13132 6769 1960 322 28 1
5040 26136 39396 27076 9800 1932 196 8
13068 39396 40614 19600 4830 588 28
13132 27076 19600 6440 980 56
6769 9800 4830 980 70
1960 1932 588 56
322 196 28
28 8
1
40320 109584 118124 67284 22449 4536 546 36 1
40320 219168 354372 269136 112245 27216 3822 288 9
109584 354372 403704 224490 68040 11466 1008 36
118124 269136 224490 90720 19110 2016 84
67284 112245 68040 19110 2520 126
22449 27216 11466 2016 126
4536 3822 1008 84
546 288 36
36 9
1
You'll note a few obvious (well, mostly obvious) things from the problem statement:
the total # of permutations is always N!
with the exception of N=1, there is no solution for L,R = (1,1): if a count in one direction is 1, then it implies the tallest block is on that end of the stack, so the count in the other direction has to be >= 2
the situation is symmetric (reverse each permutation and you reverse the L,R count)
if p is a permutation of N-1 blocks and has count (Lp,Rp), then the N permutations of block N inserted in each possible spot can have a count ranging from L = 1 to Lp+1, and R = 1 to Rp + 1.
From the empirical output:
the leftmost column or topmost row (where L = 1 or R = 1) with N blocks is the sum of the
rows/columns with N-1 blocks: i.e. in #PengOne's notation,
b(N,1,R) = sum(b(N-1,k,R-1) for k = 1 to N-R+1
Each diagonal is a row of Pascal's triangle, times a constant factor K for that diagonal -- I can't prove this, but I'm sure someone can -- i.e.:
b(N,L,R) = K * (L+R-2 choose L-1) where K = b(N,1,L+R-1)
So the computational complexity of computing b(N,L,R) is the same as the computational complexity of computing b(N,1,L+R-1) which is the first column (or row) in each triangle.
This observation is probably 95% of the way towards an explicit solution (the other 5% I'm sure involves standard combinatoric identities, I'm not too familiar with those).
A quick check with the Online Encyclopedia of Integer Sequences shows that b(N,1,R) appears to be OEIS sequence A094638:
A094638 Triangle read by rows: T(n,k) =|s(n,n+1-k)|, where s(n,k) are the signed Stirling numbers of the first kind (1<=k<=n; in other words, the unsigned Stirling numbers of the first kind in reverse order).
1, 1, 1, 1, 3, 2, 1, 6, 11, 6, 1, 10, 35, 50, 24, 1, 15, 85, 225, 274, 120, 1, 21, 175, 735, 1624, 1764, 720, 1, 28, 322, 1960, 6769, 13132, 13068, 5040, 1, 36, 546, 4536, 22449, 67284, 118124, 109584, 40320, 1, 45, 870, 9450, 63273, 269325, 723680, 1172700
As far as how to efficiently compute the Stirling numbers of the first kind, I'm not sure; Wikipedia gives an explicit formula but it looks like a nasty sum. This question (computing Stirling #s of the first kind) shows up on MathOverflow and it looks like O(n^2), as PengOne hypothesizes.

Based on #PengOne answer, here is my Javascript implementation:
function g(N, L, R) {
var acc = 0;
for (var k=1; k<=N; k++) {
acc += comb(N-1, k-1) * f(k-1, L-1) * f(N-k, R-1);
}
return acc;
}
function f(N, L) {
if (N==L) return 1;
else if (N<L) return 0;
else {
var acc = 0;
for (var k=1; k<=N; k++) {
acc += comb(N-1, k-1) * f(k-1, L-1) * fact(N-k);
}
return acc;
}
}
function comb(n, k) {
return fact(n) / (fact(k) * fact(n-k));
}
function fact(n) {
var acc = 1;
for (var i=2; i<=n; i++) {
acc *= i;
}
return acc;
}
$("#go").click(function () {
alert(g($("#N").val(), $("#L").val(), $("#R").val()));
});

Here is my construction solution inspired by #PengOne's ideas.
import itertools
def f(blocks, m):
n = len(blocks)
if m > n:
return []
if m < 0:
return []
if n == m:
return [sorted(blocks)]
maximum = max(blocks)
blocks = list(set(blocks) - set([maximum]))
results = []
for k in range(0, n):
for left_set in itertools.combinations(blocks, k):
for left in f(left_set, m - 1):
rights = itertools.permutations(list(set(blocks) - set(left)))
for right in rights:
results.append(list(left) + [maximum] + list(right))
return results
def b(n, l, r):
blocks = range(1, n + 1)
results = []
maximum = max(blocks)
blocks = list(set(blocks) - set([maximum]))
for k in range(0, n):
for left_set in itertools.combinations(blocks, k):
for left in f(left_set, l - 1):
other = list(set(blocks) - set(left))
rights = f(other, r - 1)
for right in rights:
results.append(list(left) + [maximum] + list(right))
return results
# Sample
print b(4, 3, 2) # -> [[1, 2, 4, 3], [1, 3, 4, 2], [2, 3, 4, 1]]

We derive a general solution F(N, L, R) by examining a specific testcase: F(10, 4, 3).
We first consider 10 in the leftmost possible position, the 4th ( _ _ _ 10 _ _ _ _ _ _ ).
Then we find the product of the number of valid sequences in the left and in the right of 10.
Next, we'll consider 10 in the 5th slot, calculate another product and add it to the previous one.
This process will go on until 10 is in the last possible slot, the 8th.
We'll use the variable named pos to keep track of N's position.
Now suppose pos = 6 ( _ _ _ _ _ 10 _ _ _ _ ). In the left of 10, there are 9C5 = (N-1)C(pos-1) sets of numbers to be arranged.
Since only the order of these numbers matters, we could look at 1, 2, 3, 4, 5.
To construct a sequence with these numbers so that 3 = L-1 of them are visible from the left, we can begin by placing 5 in the leftmost possible slot ( _ _ 5 _ _ ) and follow similar steps to what we did before.
So if F were defined recursively, it could be used here.
The only difference now is that the order of numbers in the right of 5 is immaterial.
To resolve this issue, we'll use a signal, INF (infinity), for R to indicate its unimportance.
Turning to the right of 10, there will be 4 = N-pos numbers left.
We first consider 4 in the last possible slot, position 2 = R-1 from the right ( _ _ 4 _ ).
Here what appears in the left of 4 is immaterial.
But counting arrangements of 4 blocks with the mere condition that 2 of them should be visible from the right is no different than counting arrangements of the same blocks with the mere condition that 2 of them should be visible from the left.
ie. instead of counting sequences like 3 1 4 2, one can count sequences like 2 4 1 3
So the number of valid arrangements in the right of 10 is F(4, 2, INF).
Thus the number of arrangements when pos == 6 is 9C5 * F(5, 3, INF) * F(4, 2, INF) = (N-1)C(pos-1) * F(pos-1, L-1, INF)* F(N-pos, R-1, INF).
Similarly, in F(5, 3, INF), 5 will be considered in a succession of slots with L = 2 and so on.
Since the function calls itself with L or R reduced, it must return a value when L = 1, that is F(N, 1, INF) must be a base case.
Now consider the arrangement _ _ _ _ _ 6 7 10 _ _.
The only slot 5 can take is the first, and the following 4 slots may be filled in any manner; thus F(5, 1, INF) = 4!.
Then clearly F(N, 1, INF) = (N-1)!.
Other (trivial) base cases and details could be seen in the C implementation below.
Here is a link for testing the code
#define INF UINT_MAX
long long unsigned fact(unsigned n) { return n ? n * fact(n-1) : 1; }
unsigned C(unsigned n, unsigned k) { return fact(n) / (fact(k) * fact(n-k)); }
unsigned F(unsigned N, unsigned L, unsigned R)
{
unsigned pos, sum = 0;
if(R != INF)
{
if(L == 0 || R == 0 || N < L || N < R) return 0;
if(L == 1) return F(N-1, R-1, INF);
if(R == 1) return F(N-1, L-1, INF);
for(pos = L; pos <= N-R+1; ++pos)
sum += C(N-1, pos-1) * F(pos-1, L-1, INF) * F(N-pos, R-1, INF);
}
else
{
if(L == 1) return fact(N-1);
for(pos = L; pos <= N; ++pos)
sum += C(N-1, pos-1) * F(pos-1, L-1, INF) * fact(N-pos);
}
return sum;
}

Related

How to approach and understand a math related DSA question

I found this question online and I really have no idea what the question is even asking. I would really appreciate some help in first understanding the question, and a solution if possible. Thanks!
To see if a number is divisible by 3, you need to add up the digits of its decimal notation, and check if the sum is divisible by 3.
To see if a number is divisible by 11, you need to split its decimal notation into pairs of digits (starting from the right end), add up corresponding numbers and check if the sum is divisible by 11.
For any prime p (except for 2 and 5) there exists an integer r such that a similar divisibility test exists: to check if a number is divisible by p, you need to split its decimal notation into r-tuples of digits (starting from the right end), add up these r-tuples and check whether their sum is divisible by p.
Given a prime int p, find the minimal r for which such divisibility test is valid and output it.
The input consists of a single integer p - a prime between 3 and 999983, inclusive, not equal to 5.
Example
input
3
output
1
input
11
output
2
This is a very cool problem! It uses modular arithmetic and some basic number theory to devise the solution.
Let's say we have p = 11. What divisibility rule applies here? How many digits at once do we need to take, to have a divisibility rule?
Well, let's try a single digit at a time. That would mean, that if we have 121 and we sum its digits 1 + 2 + 1, then we get 4. However we see, that although 121 is divisible by 11, 4 isn't and so the rule doesn't work.
What if we take two digits at a time? With 121 we get 1 + 21 = 22. We see that 22 IS divisible by 11, so the rule might work here. And in fact, it does. For p = 11, we have r = 2.
This requires a bit of intuition which I am unable to convey in text (I really have tried) but it can be proven that for a given prime p other than 2 and 5, the divisibility rule works for tuples of digits of length r if and only if the number 99...9 (with r nines) is divisible by p. And indeed, for p = 3 we have 9 % 3 = 0, while for p = 11 we have 9 % 11 = 9 (this is bad) and 99 % 11 = 0 (this is what we want).
If we want to find such an r, we start with r = 1. We check if 9 is divisible by p. If it is, then we found the r. Otherwise, we go further and we check if 99 is divisible by p. If it is, then we return r = 2. Then, we check if 999 is divisible by p and if so, return r = 3 and so on. However, the 99...9 numbers can get very large. Thankfully, to check divisibility by p we only need to store the remainder modulo p, which we know is small (at least smaller than 999983). So the code in C++ would look something like this:
int r(int p) {
int result = 1;
int remainder = 9 % p;
while (remainder != 0) {
remainder = (remainder * 10 + 9) % p;
result++;
}
return result;
}
I have no idea how they expect a random programmer with no background to figure out the answer from this.
But here is the brief introduction to modulo arithmetic that should make this doable.
In programming, n % k is the modulo operator. It refers to taking the remainder of n / k. It satisfies the following two important properties:
(n + m) % k = ((n % k) + (m % k)) % k
(n * m) % k = ((n % k) * (m % k)) % k
Because of this, for any k we can think of all numbers with the same remainder as somehow being the same. The result is something called "the integers modulo k". And it satisfies most of the rules of algebra that you're used to. You have the associative property, the commutative property, distributive law, addition by 0, and multiplication by 1.
However if k is a composite number like 10, you have the unfortunate fact that 2 * 5 = 10 which means that modulo 10, 2 * 5 = 0. That's kind of a problem for division.
BUT if k = p, a prime, then things become massively easier. If (a*m) % p = (b*m) % p then ((a-b) * m) % p = 0 so (a-b) * m is divisible by p. And therefore either (a-b) or m is divisible by p.
For any non-zero remainder m, let's look at the sequence m % p, m^2 % p, m^3 % p, .... This sequence is infinitely long and can only take on p values. So we must have a repeat where, a < b and m^a % p = m^b %p. So (1 * m^a) % p = (m^(b-a) * m^a) % p. Since m doesn't divide p, m^a doesn't either, and therefore m^(b-a) % p = 1. Furthermore m^(b-a-1) % p acts just like m^(-1) = 1/m. (If you take enough math, you'll find that the non-zero remainders under multiplication is a finite group, and all the remainders forms a field. But let's ignore that.)
(I'm going to drop the % p everywhere. Just assume it is there in any calculation.)
Now let's let a be the smallest positive number such that m^a = 1. Then 1, m, m^2, ..., m^(a-1) forms a cycle of length a. For any n in 1, ..., p-1 we can form a cycle (possibly the same, possibly different) n, n*m, n*m^2, ..., n*m^(a-1). It can be shown that these cycles partition 1, 2, ..., p-1 where every number is in a cycle, and each cycle has length a. THEREFORE, a divides p-1. As a side note, since a divides p-1, we easily get Fermat's little theorem that m^(p-1) has remainder 1 and therefore m^p = m.
OK, enough theory. Now to your problem. Suppose we have a base b = 10^i. The primality test that they are discussing is that a_0 + a_1 * b + a_2 * b^2 + a_k * b^k is divisible by a prime p if and only if a_0 + a_1 + ... + a_k is divisible by p. Looking at (p-1) + b, this can only happen if b % p is 1. And if b % p is 1, then in modulo arithmetic b to any power is 1, and the test works.
So we're looking for the smallest i such that 10^i % p is 1. From what I showed above, i always exists, and divides p-1. So you just need to factor p-1, and try 10 to each power until you find the smallest i that works.
Note that you should % p at every step you can to keep those powers from getting too big. And with repeated squaring you can speed up the calculation. So, for example, calculating 10^20 % p could be done by calculating each of the following in turn.
10 % p
10^2 % p
10^4 % p
10^5 % p
10^10 % p
10^20 % p
This is an almost direct application of Fermat's little theorem.
First, you have to reformulate the "split decimal notation into tuples [...]"-condition into something you can work with:
to check if a number is divisible by p, you need to split its decimal notation into r-tuples of digits (starting from the right end), add up these r-tuples and check whether their sum is divisible by p
When you translate it from prose into a formula, what it essentially says is that you want
for any choice of "r-tuples of digits" b_i from { 0, ..., 10^r - 1 } (with only finitely many b_i being non-zero).
Taking b_1 = 1 and all other b_i = 0, it's easy to see that it is necessary that
It's even easier to see that this is also sufficient (all 10^ri on the left hand side simply transform into factor 1 that does nothing).
Now, if p is neither 2 nor 5, then 10 will not be divisible by p, so that Fermat's little theorem guarantees us that
, that is, at least the solution r = p - 1 exists. This might not be the smallest such r though, and computing the smallest one is hard if you don't have a quantum computer handy.
Despite it being hard in general, for very small p, you can simply use an algorithm that is linear in p (you simply look at the sequence
10 mod p
100 mod p
1000 mod p
10000 mod p
...
and stop as soon as you find something that equals 1 mod p).
Written out as code, for example, in Scala:
def blockSize(p: Int, n: Int = 10, r: Int = 1): Int =
if n % p == 1 then r else blockSize(p, n * 10 % p, r + 1)
println(blockSize(3)) // 1
println(blockSize(11)) // 2
println(blockSize(19)) // 18
or in Python:
def blockSize(p: int, n: int = 10, r: int = 1) -> int:
return r if n % p == 1 else blockSize(p, n * 10 % p, r + 1)
print(blockSize(3)) # 1
print(blockSize(11)) # 2
print(blockSize(19)) # 18
A wall of numbers, just in case someone else wants to sanity-check alternative approaches:
11 -> 2
13 -> 6
17 -> 16
19 -> 18
23 -> 22
29 -> 28
31 -> 15
37 -> 3
41 -> 5
43 -> 21
47 -> 46
53 -> 13
59 -> 58
61 -> 60
67 -> 33
71 -> 35
73 -> 8
79 -> 13
83 -> 41
89 -> 44
97 -> 96
101 -> 4
103 -> 34
107 -> 53
109 -> 108
113 -> 112
127 -> 42
131 -> 130
137 -> 8
139 -> 46
149 -> 148
151 -> 75
157 -> 78
163 -> 81
167 -> 166
173 -> 43
179 -> 178
181 -> 180
191 -> 95
193 -> 192
197 -> 98
199 -> 99
Thank you andrey tyukin.
Simple terms to remember:
When x%y =z then (x%y)%y again =z
(X+y)%z == (x%z + y%z)%z
keep this in mind.
So you break any number into some r digits at a time together. I.e. break 3456733 when r=6 into 3 * 10 power(6 * 1) + 446733 * 10 power(6 * 0).
And you can break 12536382626373 into 12 * 10 power (6 * 2). + 536382 * 10 power (6 * 1) + 626373 * 10 power (6 * 0)
Observe that here r is 6.
So when we say we combine the r digits and sum them together and apply modulo. We are saying we apply modulo to coefficients of above breakdown.
So how come coefficients sum represents whole number’s sum?
When the “10 power (6* anything)” modulo in the above break down becomes 1 then that particular term’s modulo will be equal to the coefficient’s modulo. That means the 10 power (r* anything) is of no effect. You can check why it will have no effect by using the formulas 1&2.
And the other similar terms 10 power (r * anything) also will have modulo as 1. I.e. if you can prove that (10 power r)modulo is 1. Then (10 power r * anything) is also 1.
But the important thing is we should have 10 power (r) equal to 1. Then every 10 power (r * anything) is 1 that leads to modulo of number equal to sum of r digits divided modulo.
Conclusion: find r in (10 power r) such that the given prime number will leave 1 as reminder.
That also mean the smallest 9…..9 which is divisible by given prime number decides r.

Maxsubsequence - What are the key insights for this problem?

Below is the problem assignment using tree recursion approach:
Maximum Subsequence
A subsequence of a number is a series of (not necessarily contiguous) digits of the number. For example, 12345 has subsequences that include 123, 234, 124, 245, etc. Your task is to get the maximum subsequence below a certain length.
def max_subseq(n, l):
"""
Return the maximum subsequence of length at most l that can be found in the given number n.
For example, for n = 20125 and l = 3, we have that the subsequences are
2
0
1
2
5
20
21
22
25
01
02
05
12
15
25
201
202
205
212
215
225
012
015
025
125
and of these, the maxumum number is 225, so our answer is 225.
>>> max_subseq(20125, 3)
225
>>> max_subseq(20125, 5)
20125
>>> max_subseq(20125, 6) # note that 20125 == 020125
20125
>>> max_subseq(12345, 3)
345
>>> max_subseq(12345, 0) # 0 is of length 0
0
>>> max_subseq(12345, 1)
5
"""
"*** YOUR CODE HERE ***"
There are two key insights for this problem
You need to split into the cases where the ones digit is used and the one where it is not. In the case where it is, we want to reduce l since we used one of the digits, and in the case where it isn't we do not.
In the case where we are using the ones digit, you need to put the digit back onto the end, and the way to attach a digit d to the end of a number n is 10 * n + d.
I could not understand the insights of this problem, mentioned below 2 points:
split into the cases where the ones digit is used and the one where it is not
In the case where we are using the ones digit, you need to put the digit back onto the end
My understanding of this problem:
Solution to this problem looks to generate all subsequences upto l, pseudo code looks like:
digitSequence := strconv.Itoa(n) // "20125"
printSubSequence = func(digitSequence string, currenSubSequenceSize int) { // digitSequence is "20125" and currenSubSequenceSize is say 3
printNthSubSequence(digitSequence, currenSubSequenceSize) + printSubSequence(digitSequence, currenSubSequenceSize-1)
}
where printNthSubSequence prints subsequences for (20125, 3) or (20125, 2) etc...
Finding max_subseq among all these sequences then becomes easy
Can you help me understand the insights given in this problem, with an example(say 20125, 1)? here is the complete question
Something like this? As the instructions suggest, try it with and without the current digit:
function f(s, i, l){
if (i + 1 <= l)
return Number(s.substr(0, l));
if (!l)
return 0;
return Math.max(
// With
Number(s[i]) + 10 * f(s, i - 1, l - 1),
// Without
f(s, i - 1, l)
);
}
var input = [
['20125', 3],
['20125', 5],
['20125', 6],
['12345', 3],
['12345', 0],
['12345', 1]
];
for (let [s, l] of input){
console.log(s + ', l: ' + l);
console.log(f(s, s.length-1, l));
console.log('');
}

algorithmic puzzle for calculating the number of combinations of numbers sum to a fixed result

This is a puzzle i think of since last night. I have come up with a solution but it's not efficient so I want to see if there is better idea.
The puzzle is this:
given positive integers N and T, you will need to have:
for i in [1, T], A[i] from { -1, 0, 1 }, such that SUM(A) == N
additionally, the prefix sum of A shall be [0, N], while when the prefix sum PSUM[A, t] == N, it's necessary to have for i in [t + 1, T], A[i] == 0
here prefix sum PSUM is defined to be: PSUM[A, t] = SUM(A[i] for i in [1, t])
the puzzle asks how many such A's exist given fixed N and T
for example, when N = 2, T = 4, following As work:
1 1 0 0
1 -1 1 1
0 1 1 0
but following don't:
-1 1 1 1 # prefix sum -1
1 1 -1 1 # non-0 following a prefix sum == N
1 1 1 -1 # prefix sum > N
following python code can verify such rule, when given N as expect and an instance of A as seq(some people may feel easier reading code than reading literal description):
def verify(expect, seq):
s = 0
for j, i in enumerate(seq):
s += i
if s < 0:
return False
if s == expect:
break
else:
return s == expect
for k in range(j + 1, len(seq)):
if seq[k] != 0:
return False
return True
I have coded up my solution, but it's too slow. Following is mine:
I decompose the problem into two parts, a part without -1 in it(only {0, 1} and a part with -1.
so if SOLVE(N, T) is the correct answer, I define a function SOLVE'(N, T, B), where a positive B allows me to extend prefix sum to be in the interval of [-B, N] instead of [0, N]
so in fact SOLVE(N, T) == SOLVE'(N, T, 0).
so I soon realized the solution is actually:
have the prefix of A to be some valid {0, 1} combination with positive length l, and with o 1s in it
at position l + 1, I start to add 1 or more -1s and use B to track the number. the maximum will be B + o or depend on the number of slots remaining in A, whichever is less.
recursively call SOLVE'(N, T, B)
in the previous N = 2, T = 4 example, in one of the search case, I will do:
let the prefix of A be [1], then we have A = [1, -, -, -].
start add -1. here i will add only one: A = [1, -1, -, -].
recursive call SOLVE', here i will call SOLVE'(2, 2, 0) to solve the last two spots. here it will return [1, 1] only. then one of the combinations yields [1, -1, 1, 1].
but this algorithm is too slow.
I am wondering how can I optimize it or any different way to look at this problem that can boost the performance up?(I will just need the idea, not impl)
EDIT:
some sample will be:
T N RESOLVE(N, T)
3 2 3
4 2 7
5 2 15
6 2 31
7 2 63
8 2 127
9 2 255
10 2 511
11 2 1023
12 2 2047
13 2 4095
3 3 1
4 3 4
5 3 12
6 3 32
7 3 81
8 3 200
9 3 488
10 3 1184
11 3 2865
12 3 6924
13 3 16724
4 4 1
5 4 5
6 4 18
an exponential time solution will be following in general(in python):
import itertools
choices = [-1, 0, 1]
print len([l for l in itertools.product(*([choices] * t)) if verify(n, l)])
An observation: assuming that n is at least 1, every solution to your stated problem ends in something of the form [1, 0, ..., 0]: i.e., a single 1 followed by zero or more 0s. The portion of the solution prior to that point is a walk that lies entirely in [0, n-1], starts at 0, ends at n-1, and takes fewer than t steps.
Therefore you can reduce your original problem to a slightly simpler one, namely that of determining how many t-step walks there are in [0, n] that start at 0 and end at n (where each step can be 0, +1 or -1, as before).
The following code solves the simpler problem. It uses the lru_cache decorator to cache intermediate results; this is in the standard library in Python 3, or there's a recipe you can download for Python 2.
from functools import lru_cache
#lru_cache()
def walks(k, n, t):
"""
Return the number of length-t walks in [0, n]
that start at 0 and end at k. Each step
in the walk adds -1, 0 or 1 to the current total.
Inputs should satisfy 0 <= k <= n and 0 <= t.
"""
if t == 0:
# If no steps allowed, we can only get to 0,
# and then only in one way.
return k == 0
else:
# Count the walks ending in 0.
total = walks(k, n, t-1)
if 0 < k:
# ... plus the walks ending in 1.
total += walks(k-1, n, t-1)
if k < n:
# ... plus the walks ending in -1.
total += walks(k+1, n, t-1)
return total
Now we can use this function to solve your problem.
def solve(n, t):
"""
Find number of solutions to the original problem.
"""
# All solutions stick at n once they get there.
# Therefore it's enough to find all walks
# that lie in [0, n-1] and take us to n-1 in
# fewer than t steps.
return sum(walks(n-1, n-1, i) for i in range(t))
Result and timings on my machine for solve(10, 100):
In [1]: solve(10, 100)
Out[1]: 250639233987229485923025924628548154758061157
In [2]: %timeit solve(10, 100)
1000 loops, best of 3: 964 µs per loop

Obtaining the nth-smallest number according to a custom ordering?

I'm trying to solve this problem:
http://uva.onlinejudge.org/external/102/10232.html
Basically, a number x is greater than y, if the number of 1's in binary representation of x is greater than y;
If they do have same number of 1's, then we compare in the natural way;
So, now we have a Bitwise sequence composed by integers 1 .. 2147483647. Given the index of some integer, how can we get that integer EFFICIENTLY ?
NOTE: the first several integers in the sequence should be:
0, 1, 2, 4, 8, 16, 32, .. 1073741824, 3, 5, 6, 9, ..
- --------------------------------- --------------
0 one 1's two 1's
NOTES:
Creating a lookup table would work, but it's just too slow, and too much memory!
Distributing all integers into bags with different numbers of 1's is also very slow: how to count in the same bag, do I have to count one by one?
I am NOT a student. I'm a working professional. Solving ACM problems is just my hobby. Using brute-force is usually NOT my taste, if I believe there is a better efficient algorithm to do it.
I think you can break this down into two separate problems:
Finding how many numbers there are with exactly k bits set, and
Finding the nth smallest number with exactly k bits set.
Let's suppose you can solve problems (1) and (2). Then here's a solution to the overall problem, written in Awful Pseudocode:
function nthNumber(n):
let numBitsNeeded = 0;
while true:
let x = number of numbers with exactly numBitsNeeded bits.
if x >= n, break
n -= x
return the nth-smallest value with exactly numBitsNeeded bits
The idea is to figure out how many bits will be in the number n, and from there to determine which number with that many bits you'll need.
Let's attack each problem separately.
Part 1: Counting the number of values with exactly k bits set
Fortunately, this part has a nice closed-form solution. If you have a 32-bit number and want to know how many numbers have exactly k bits set, you can compute the value 32 choose k, since you're selecting which positions in the number will have the bit set. This can be computed as
32! / (k! (32 - k)!)
You can precompute this and put it in a table if you'd like, meaning that you can compute this value in O(1) time.
Part 2: Determining the nth smallest number with exactly k bits set.
Since all these numbers have the same number of bits and they're compared as usual, you can think of this part of the problem as finding the nth lexicographically ordered combination of k bits. One way that you could do this is the following: suppose that you knew how many numbers there were with the highest bit at position k, k + 1, k + 2, k + 3, etc. You could then binary search over those numbers to determine where the highest best of the number would go. Once you've done that, you can then recursively apply the same procedure but with k - 1 bits to recover the remaining bits of the number.
So now we need to figure out how to count the number of ways to choose k bits with the highest 1 bit at some position p. Fortunately, that's not that hard either. If you have k bits and the highest is at position p, then you need to distribute the remaining k - 1 bits in positions less than p. The number of ways to do this is given by (p - 1) choose (k - 1), which is
(p - 1)! / (k - 1)!((p - 1) - (k - 1))!
Combining this with the above logic gives you a way to determine where all the bits of the numbers should go without having to count all the way up.
Hope this helps!
According to #templatetypedef 's core algorithm, I finally got it Accepted:
# Problem Verdict Language Run Time Submission Date
12534054 10232 Bit-wise Sequence Accepted C++ 0.018 2013-10-21 00:39:12
The credit should still go to #templatetypedef, but I'm also posting the main code for others' reference.
The code is actually short, because most is my comments :-)
The main code (I spent a lot of time solving the "Offset 1" issue):
#include <cstdio>
#include <bitset>
#include <algorithm>
#define N 32
using namespace std;
unsigned C[N][N]; // C[k][n] means choose k objects from n objects
unsigned S[N];
void CreateLookupTable()
{
// Create the C(k, n)
for (int n = 0; n < N; ++n)
C[0][n] = 1;
for (int n = 1; n < N; ++n)
{
for (int k = 1; k < n; ++k)
C[k][n] = C[k][n-1] + C[k-1][n-1];
C[n][n] = 1;
}
// Construct an accumulated sequence from C[i, 31], where i is [0..31]
int n = N-1;
S[0] = C[0][n];
for (int i = 1; i < N; ++i)
S[i] = S[i-1]+C[i][n];
}
// n: the position in the current bucket
// b: how many 1's in the target number
// h: the index of the possible highest 1
inline void FillBitset(int n, int b, int h, bitset<N>& bs)
{
if (b == 0)
return;
// Search for which small bucket the number is. Similar as before,
// If b = 4, h_ = 3, there will be C[3][3] numbers
// If b = 4, h_ = 4, there will be C[3][4] numbers
// If b = 4, h_ = 5, there will be C[3][5] numbers
// ...
// If b = 4, h_ = h, there will be C[3][h] numbers
//
// Also, it's very easy to prove that:
// C[3][3] = C[4][4]
// C[3][3] + C[3][4] = C[4][5]
// C[3][3] + C[3][4] + C[3][5] = C[4][6]
// ...
// C[3][3] + C[3][4] + C[3][5] + .. + C[3][h] = C[4][h+1]
//
// Now let's determine which C[b][i], where i is [b..(h+1)]
unsigned* lb = lower_bound(&C[b][b], &C[b][h+1]+1, n);
int c = lb-(&C[b][b])+b;
// Don't forget to decrease c to get the index of the highest bit 1
--c;
// Fill the actual highest bit
bs.set(c);
// When c < b, lb must point to C[b][b] == 1, which means all lower bits
// are also 1's, and we should stop here
if (c < b)
{
for (int i = 0; i < c; ++i)
bs.set(i);
return;
}
// Deduct the number of numbers in the lower buckets, and search for
// (b-1) 1's in a smaller bucket. Apparently, the possible highest 1
// should be at index c-1, since the current bucket's highet 1 is at
// index c
FillBitset(n-C[b][c], b-1, c-1, bs);
}
inline unsigned GetNumber(int n)
{
if (n == 0)
return 0;
// From index to position
++n;
// Get which bucket Number[n] is
unsigned* lb = lower_bound(S, S+N, n);
int b = lb-S; // There are b 1's in the number, and b is always > 0
// because n == 0 is excluded above
// So let's go to the core algorithm: get the position in the bucket
// For each bucet, we need to divide it into small bags/buckets.
//
// If b = 4, there will be [3..30] = 28 bags to represent how many numbers
// whose highest one is at index 3, 4, 5, .. 30
//
// If b = 4, and the highest 1 is at index 3, there will be C[3][3] numbers
// If b = 4, and the highest 1 is at index 4, there will be C[3][4] numbers
// If b = 4, and the highest 1 is at index 5, there will be C[3][5] numbers
// ...
// If b = 4, and the highest 1 is at index 30, there will be C[3][30] numbers
//
// Now our task is to find which small bucket the number is
bitset<N> bs;
FillBitset(n-S[b-1], b, N-2, bs);
return (unsigned)bs.to_ulong();
}
Test Input:
0
1
2
31
32
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
496
497
498
2147483647
1234567890
987654321
6123512
852412
123125
67658153
214155
5623674
Test Output:
0
1
2
1073741824
3
0
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
3
5
1610612736
7
11
2147483647
1195924317
1467508257
147227152
1099186184
135856144
1247429124
57624
100730919

Maximum number of characters using keystrokes A, Ctrl+A, Ctrl+C and Ctrl+V

This is an interview question from google. I am not able to solve it by myself. Can somebody shed some light?
Write a program to print the sequence of keystrokes such that it generates the maximum number of character 'A's. You are allowed to use only 4 keys: A, Ctrl+A, Ctrl+C and Ctrl+V. Only N keystrokes are allowed. All Ctrl+ characters are considered as one keystroke, so Ctrl+A is one keystroke.
For example, the sequence A, Ctrl+A, Ctrl+C, Ctrl+V generates two A's in 4 keystrokes.
Ctrl+A is Select All
Ctrl+C is Copy
Ctrl+V is Paste
I did some mathematics. For any N, using x numbers of A's , one Ctrl+A, one Ctrl+C and y Ctrl+V, we can generate max ((N-1)/2)2 number of A's. For some N > M, it is better to use as many Ctrl+A's, Ctrl+C and Ctrl+V sequences as it doubles the number of A's.
The sequence Ctrl+A, Ctrl+V, Ctrl+C will not overwrite the existing selection. It will append the copied selection to selected one.
There's a dynamic programming solution. We start off knowing 0 keys can make us 0 A's. Then we iterate through for i up to n, doing two things: pressing A once and pressing select all + copy followed by paste j times (actually j-i-1 below; note the trick here: the contents are still in the clipboard, so we can paste it multiple times without copying each time). We only have to consider up to 4 consecutive pastes, since select, copy, paste x 5 is equivalent to select, copy, paste, select, copy, paste and the latter is better since it leaves us with more in the clipboard. Once we've reached n, we have the desired result.
The complexity might appear to be O(N), but since the numbers grow at an exponential rate it is actually O(N2) due to the complexity of multiplying the large numbers. Below is a Python implementation. It takes about 0.5 seconds to calculate for N=50,000.
def max_chars(n):
dp = [0] * (n+1)
for i in xrange(n):
dp[i+1] = max(dp[i+1], dp[i]+1) # press a
for j in xrange(i+3, min(i+7, n+1)):
dp[j] = max(dp[j], dp[i]*(j-i-1)) # press select all, copy, paste x (j-i-1)
return dp[n]
In the code, j represents the total number of keys pressed after our new sequence of keypresses. We already have i keypresses at this stage, and 2 new keypresses go to select-all and copy. Therefore we're hitting paste j-i-2 times. Since pasting adds to the existing sequence of dp[i] A's, we need to add 1 making it j-i-1. This explains the j-i-1 in the 2nd-last line.
Here are some results (n => number of A's):
7 => 9
8 => 12
9 => 16
10 => 20
100 => 1391569403904
1,000 => 3268160001953743683783272702066311903448533894049486008426303248121757146615064636953144900245
174442911064952028008546304
50,000 => a very large number!
I agree with #SB that you should always state your assumptions: Mine is that you don't need to paste twice to double the number of characters. This gets the answer for 7, so unless my solution is wrong the assumption must be right.
In case someone wonders why I'm not checking sequences of the form Ctrl+A, Ctrl+C, A, Ctrl+V: The end result will always be the same as A, Ctrl+A, Ctrl+C, Ctrl+V which I do consider.
By using marcog's solution I found a pattern that starts at n=16. To illustrate this here are the keystrokes for n=24 up to n=29, I replaced ^A with S (select), ^C with C (copy), and ^V with P (paste) for readability:
24: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P
4 * 4 * 4 * 4 * 4 = 1024
25: A,A,A,A,S,C,P,P,P,S,C,P,P,S,C,P,P,S,C,P,P,S,C,P,P
4 * 4 * 3 * 3 * 3 * 3 = 1296
26: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,S,C,P,P,S,C,P,P
4 * 4 * 4 * 3 * 3 * 3 = 1728
27: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,S,C,P,P
4 * 4 * 4 * 4 * 3 * 3 = 2304
28: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P
4 * 4 * 4 * 4 * 4 * 3 = 3072
29: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P
4 * 4 * 4 * 4 * 4 * 4 = 4096
After an initial 4 As, the ideal pattern is to select, copy, paste, paste, paste and repeat. This will multiply the number of As by 4 every 5 keystrokes. If this 5 keystroke pattern cannot consume the remaining keystrokes on its own some number of 4 keystroke patterns (SCPP) consume the final keystrokes, replacing SCPPP (or removing one of the pastes) as necessary. The 4 keystroke patterns multiply the total by 3 every 4 keystrokes.
Using this pattern here is some Python code that gets the same results as marcog's solution, but is O(1) edit: This is actually O(log n) due to exponentiation, thanks to IVlad for pointing that out.
def max_chars(n):
if n <= 15:
return (0, 1, 2, 3, 4, 5, 6, 9, 12, 16, 20, 27, 36, 48, 64, 81)[n]
e3 = (4 - n) % 5
e4 = n // 5 - e3
return 4 * (4 ** e4) * (3 ** e3)
Calculating e3:
There are always between 0 and 4 SCPP patterns at the end of the keystroke list, for n % 5 == 4 there are 4, n % 5 == 1 there are 3, n % 5 == 2 there are 2, n % 5 == 3 there are 1, and n % 5 == 4 there are 0. This can be simplified to (4 - n) % 5.
Calculating e4:
The total number of patterns increases by 1 whenever n % 5 == 0, as it turns out this number increases to exactly n / 5. Using floor division we can get the total number of patterns, the total number for e4 is the total number of patterns minus e3. For those unfamiliar with Python, // is the future-proof notation for floor division.
Here's how I would approach it:
assume CtrlA = select all
assume CtrlC = copy selection
assume CtrlV = paste copied selection
given some text, it takes 4 keystrokes to duplicate it:
CtrlA to select it all
CtrlC to copy it
CtrlV to paste (this will paste over the selection - STATE YOUR ASSUMPTIONS)
CtrlV to paste again which doubles it.
From there, you can consider doing 4 or 5 A's, then looping through the above. Note that doing ctrl + a, c, v, v will grow your text exponentially as you loop through. If remaining strokes < 4, just keep doing a CtrlV
The key to interviews # places like Google is to state your assumptions, and communicate your thinking. they want to know how you solve problems.
It's solveable in O(1): Like with the Fibonacci numbers, there is a formula to calculate the number of printed As (and the sequence of keystrokes):
1) We can simplify the problem description:
Having only [A],[C-a]+[C-c],[C-v] and an empty copy-paste-buffer
equals
having only [C-a]+[C-c],[C-v] and "A" in the copy-paste-buffer.
2) We can describe the sequence of keystrokes as a string of N chars out of {'*','V','v'}, where 'v' means [C-v] and '*' means [C-a] and 'V' means [C-c]. Example: "vvvv*Vvvvv*Vvvv"
The length of that string still equals N.
The product of the lengths of the Vv-words in that string equals the number of produced As.
3) Given a fixed length N for that string and a fixed number K of words, the outcome will be maximal iff all words have nearly equal lengths. Their pair-wise difference is not more than ±1.
Now, what is the optimal number K, if N is given?
4) Suppose, we want to increase the number of words by appending one single word of length L, then we have to reduce L+1 times any previous word by one 'v'. Example: "…*Vvvv*Vvvv*Vvvv*Vvvv" -> "…*Vvv*Vvv*Vvv*Vvv*Vvv"
Now, what is the optimal word length L?
(5*5*5*5*5) < (4*4*4*4*4)*4 , (4*4*4*4) > (3*3*3*3)*3
=> Optimal is L=4.
5) Suppose, we have a sufficient large N to generate a string with many words of length 4, but a few keystrokes are left; how should we use them?
If there are 5 or more left: Append another word with length 4.
If there are 0 left: Done.
If there are 4 left: We could either
a) append one word with length 3: 4*4*4*4*3=768.
b) or increase 4 words to lenght 5: 5*5*5*5=625. => Appending one word is better.
If there are 3 left: We could either
a) or append one word with length 3 by adjusting the previus word from length 4 to 3: 4*4*4*2=128 < 4*4*3*3=144.
b) increase 3 words to lenght 5: 5*5*5=125. => Appending one word is better.
If there are 2 left: We could either
a) or append one word with length 3 by adjusting the previus two words from length 4 to 3: 4*4*1=16 < 3*3*3=27.
b) increase 2 words to lenght 5: 5*5=25. => Appending one word is better.
If there is 1 left: We could either
a) or append one word with length 3 by adjusting the previus three words from length 4 to 3: 4*4*4*0=0 < 3*3*3*3=81.
b) increase one word to lenght 5: 4*4*5=80. => Appending one word is better.
6) Now, what if we don't have a "sufficient large N" to use the rules in 5)? We have to stick with plan b), if possible!
The strings for small N are:
1:"v", 2:"vv", 3:"vvv", 4:"vvvv"
5:"vvvvv" → 5 (plan b)
6:"vvvvvv" → 6 (plan b)
7:"vvv*Vvv" → 9 (plan a)
8:"vvvv*Vvv" → 12 (plan a)
9:"vvvv*Vvvv" → 16
10:"vvvv*Vvvvv" → 20 (plan b)
11:"vvv*Vvv*Vvv" → 29 (plan a)
12:"vvvv*Vvv*Vvv" → 36 (plan a)
13:"vvvv*Vvvv*Vvv" → 48 (plan a)
14:"vvvv*Vvvv*Vvvv" → 64
15:"vvv*Vvv*Vvv*Vvv" → 81 (plan a)
…
7) Now, what is the optimal number K of words in a string of length N?
If N < 7 then K=1 else if 6 < N < 11 then K=2 ; otherwise: K=ceil((N+1)/5)
Written in C/C++/Java: int K = (N<7)?(1) : (N<11)?(2) : ((N+5)/5);
And if N > 10, then the number of words with length 3 will be: K*5-1-N. With this, we can calculate the number of printed As:
If N > 10, the number of As will be: 4^{N+1-4K}·3^{5K-N-1}
Using CtrlA + CtrlC + CtrlV is an advantage only after 4 'A's.
So I would do something like this (in pseudo-BASIC-code, since you haven't specified any proper language):
// We should not use the clipboard for the first four A's:
FOR I IN 1 TO MIN(N, 4)
PRINT 'CLICK A'
NEXT
LET N1 = N - 4
// Generates the maximum number of pastes allowed:
FOR I IN 1 TO (N1 DIV 3) DO
PRINT 'CTRL-A'
PRINT 'CTRL-C'
PRINT 'CTRL-V'
LET N1 = N1 - 3
NEXT
// If we still have same keystrokes left, let's use them with simple CTRL-Vs
FOR I IN N1 TO N
PRINT 'CTRL-V'
NEXT
Edit
Back to using a single CtrlV in the main loop.
Added some comments to explain what I'm trying to do here.
Fixed an issue with the "first four A's" block.
It takes 3 keystrokes to double your number of As. It only makes sense to start doubling when you have 3 or more As already printed. You want your last allowed keystroke to be a CtrlV to make sure you are doubling the biggest number you can, so in order to align it we will fill in any extra keystrokes after the first three As at the beginning with more As.
for (i = 3 + n%3; i>0 && n>0; n--, i--) {
print("a");
}
for (; n>0; n = n-3) {
print("ctrl-a");
print("ctrl-c");
print("ctrl-v");
}
Edit:
This is terrible, I completely got ahead of myself and didn't consider multiple pastes for each copy.
Edit 2:
I believe pasting 3 times is optimal, when you have enough keystrokes to do it. In 5 keystrokes you multiply your number of As by 4. This is better than multiplying by 3 using 4 keystrokes and better than multiplying by 5 using 6 keystrokes. I compared this by giving each method the same number of keystrokes, enough so they each would finish a cycle at the same time (60), letting the 3-multiplier do 15 cycles, the 4-multiplier do 12 cycles, and the 5-multiplier do 10 cycles. 3^15 = 14,348,907, 4^12=16,777,216, and 5^10=9,765,625. If there are only 4 keystrokes left, doing a 3-multiplier is better than pasting 4 more times, essentially making the previous 4 multiplier become an 8-multiplier. If there are only 3 keystrokes left, a 2-multiplier is best.
Assume you have x characters in the clipboard and x characters in the text area; let's call it "state x".
Let's press "Paste" a few times (i denote it by m-1 for convenience), then "Select-all" and "Copy"; after this sequence, we get to "state m*x".
Here, we wasted a total of m+1 keystrokes.
So the asymptotic growth is (at least) something like f^n, where f = m^(1/(m+1)).
I believe it's the maximum possible asymptotic growth, though i cannot prove it (yet).
Trying various values of m shows that the maximum for f is obtained for m=4.
Let's use the following algorithm:
Press A a few times
Press Select-all
Press Copy
Repeat a few times:
Press Paste
Press Paste
Press Paste
Press Select-all
Press Copy
While any keystrokes left:
Press Paste
(not sure it's the optimal one).
The number of times to press A at the beginning is 3: if you press it 4 times, you miss the opportunity to double the number of A's in 3 more keystrokes.
The number of times to press Paste at the end is no more than 5: if you have 6 or more keystrokes left, you can use Paste, Paste, Paste, Select-all, Copy, Paste instead.
So, we get the following algorithm:
If (less than 6 keystrokes - special case)
While (any keystrokes left)
A
Else
First 5 keystrokes: A, A, A, Select-all, Copy
While (more than 5 keystrokes left)
Paste, Paste, Paste, Select-all, Copy
While (any keystrokes left)
Paste
(not sure it's the optimal one). The number of characters after executing this is something like
3 * pow(4, floor((n - 6) / 5)) * (2 + (n - 1) % 5).
Sample values: 1,2,3,4,5,6,9,12,15,18,24,36,48,60,72,96,144,192,240,288,...
What follows uses the OP's second edit that pasting does not replace existing text.
Notice a few things:
^A and ^C can be considered a single action that takes two keystrokes, since it never makes sense to do them individually. In fact, we can replace all instances of ^A^C with ^K^V, where ^K is a one-key "cut" operation (let's abbreviate it X). We shall see that dealing with ^K is much nicer than the two-cost ^A^C.
Let's assume that an 'A' starts in the clipboard. Then ^V (let's abbreviate it Y) is strictly superior to A and we can drop the latter from all consideration. (In the actual problem, if the clipboard starts empty, in what follows we'll just replace Y with A instead of ^V up until the first X.)
Every reasonable keystroke sequence can thus be interpreted as a group of Ys separated by Xs, for example YYYXYXYYXY. Denote by V(s) the number of 'A's produced by the sequence s. Then V(nXm) = V(n)*V(m), because X essentially replaces every Y in m with V(n) 'A's.
The copy-paste problem is thus isomorphic to the following problem: "using m+1 numbers which sum to N-m, maximimze their product." For example, when N=6, the answer is m=1 and the numbers (2,3). 6 = 2*3 = V(YYXYYY) = V(AA^A^C^V^V) (or V(YYYXYY) = V(AAA^A^C^V). )
We can make a few observations:
For a fixed value of m, the numbers to choose are ceil( (N-m)/(m+1) ) and floor( (N-m)/(m+1) ) (in whatever combination makes the sum work out; more specifically you will need (N-m) % (m+1) ceils and the rest floors). This is because, for a < b, (a+1)*(b-1) >= a*b.
Unfortunately I don't see an easy way to find the value of m. If this were my interview I would propose two solutions at this point:
Option 1. Loop over all possible m. An O(n log n) solution.
C++ code:
long long ipow(int a, int b)
{
long long val=1;
long long mul=a;
while(b>0)
{
if(b%2)
val *= mul;
mul *= mul;
b/=2;
}
return val;
}
long long trym(int N, int m)
{
int floor = (N-m)/(m+1);
int ceil = 1+floor;
int numceils = (N-m)%(m+1);
return ipow(floor, m+1-numceils) * ipow(ceil, numceils);
}
long long maxAs(int N)
{
long long maxval=0;
for(int m=0; m<N; m++)
{
maxval = std::max(maxval, trym(N,m));
}
return maxval;
}
Option 2. Allow m to attain non-integer values and find its optimal value by taking the derivative of [(N-m)/(m+1)]^m with respect to m and solving for its root. There is no analytic solution, but the root can be found using e.g. Newton's method. Then use the floor and ceiling of that root for the value of m, and choose whichever is best.
public int dp(int n)
{
int arr[] = new int[n];
for (int i = 0; i < n; i++)
arr[i] = i + 1;
for (int i = 2; i < n - 3; i++)
{
int numchars = arr[i] * 2;
int j = i + 3;
arr[j] = Math.max(arr[j], numchars);
while (j < n - 1)
{
numchars = numchars + arr[i];
arr[++j] = Math.max(arr[j], numchars);
}
}
return arr[n - 1];
}
Here is my approach and solution with code below.
Approach:
There are three distinct operations that can be performed.
Keystroke A - Outputs one character 'A'
Keystroke (Ctrl-A) + (Ctrl-C) - Outputs nothing essentially. These two keystrokes can be combined into one operation because each of these keystrokes individually make no sense. Also, this keystroke sets up the output for the next paste operation.
Keystroke (Ctrl-V) - Output for this keystroke really depends on the previous (second) operation and hence we would need to account for that in our code.
Now given the three distinct operations and their respective outputs, we have to run through all the permutations of these operations.
Assumption:
Now, some version of this problem states that the sequence of keystrokes, Ctrl+A -> Ctrl+C -> Ctrl+V, overwrite the highlighted selection. To factor in this assumption, only one line of code needs to be added to the solution below where the printed variable in case 2 is set to 0
case 2:
//Ctrl-A and then Ctrl-C
if((count+2) < maxKeys)
{
pOutput = printed;
//comment the below statement to NOT factor
//in the assumption described above
printed = 0;
}
For this solution
The code below will print a couple of sequences and the last sequence is the correct answer for any given N. e.g. for N=11 this will be the correct sequence
With the assumption
A, A, A, A, A, C, S, V, V, V, V, :20:
Without the assumption
A, A, A, C, S, V, V, C, S, V, V, :27:
I have decided to retain the assumption for this solution.
Keystroke Legend:
'A' - A
'C' - Ctrl+A
'S' - Ctrl+C
'V' - Ctrl+V
Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void maxAprinted(int count, int maxKeys, int op, int printed, int pOutput, int *maxPrinted, char *seqArray)
{
if(count > maxKeys)
return;
if(count == maxKeys)
{
if((*maxPrinted) < printed)
{
//new sequence found which is an improvement over last sequence
(*maxPrinted) = printed;
printf("\n");
int i;
for(i=0; i<maxKeys; i++)
printf(" %c,",seqArray[i]);
}
return;
}
switch(op)
{
case 1:
//A keystroke
printed++;
seqArray[count] = 'A';
count++;
break;
case 2:
//Ctrl-A and then Ctrl-C
if((count+2) < maxKeys)
{
pOutput = printed;
//comment the below statement to NOT factor
//in the assumption described above
printed = 0;
}
seqArray[count] = 'C';
count++;
seqArray[count] = 'S';
count++;
break;
case 3:
//Ctrl-V
printed = printed + pOutput;
seqArray[count] = 'V';
count++;
break;
}
maxAprinted(count, maxKeys, 1, printed, pOutput, maxPrinted, seqArray);
maxAprinted(count, maxKeys, 2, printed, pOutput, maxPrinted, seqArray);
maxAprinted(count, maxKeys, 3, printed, pOutput, maxPrinted, seqArray);
}
int main()
{
const int keyStrokes = 11;
//this array stores the sequence of keystrokes
char *sequence;
sequence = (char*)malloc(sizeof(char)*(keyStrokes + 1));
//stores the max count for As printed for a sqeuence
//updated in the recursive call.
int printedAs = 0;
maxAprinted(0, keyStrokes, 1, 0, 0, &printedAs, sequence);
printf(" :%d:", printedAs);
return 0;
}
Using the tricks mentioned in answers above, Mathematically, Solution can be explained in one equation as,
4 + 4^[(N-4)/5] + ((N-4)%5)*4^[(N-4)/5].
where [] is greatest integer factor
There is a trade-off between printing m-A's manually, then using Ctrl+A, Ctrl+C, and N-m-2 Ctrl+V. The best solution is in the middle. If max key strokes = 10, the best solution is typing 5 A's or 4 A's.
try using this Look at this http://www.geeksforgeeks.org/how-to-print-maximum-number-of-a-using-given-four-keys/ and maybe optimize a bit looking for the results around the mid point.
Here is my solution with dynamic programming, without a nested loop, and which also prints the actual characters that you'd need to type:
N = 52
count = [0] * N
res = [[]] * N
clipboard = [0] * N
def maybe_update(i, new_count, new_res, new_clipboard):
if new_count > count[i] or (
new_count == count[i] and new_clipboard > clipboard[i]):
count[i] = new_count
res[i] = new_res
clipboard[i] = new_clipboard
for i in range(1, N):
# First option: type 'A'.
# Using list concatenation for 'res' to avoid O(n^2) string concatenation.
maybe_update(i, count[i - 1] + 1, res[i - 1] + ['A'], clipboard[i - 1])
# Second option: type 'CTRL+V'.
maybe_update(i, count[i - 1] + clipboard[i - 1], res[i - 1] + ['v'],
clipboard[i - 1])
# Third option: type 'CTRL+A, CTRL+C, CTRL+V'.
# Assumption: CTRL+V always appends.
if i >= 3:
maybe_update(i, 2 * count[i - 3], res[i - 3] + ['acv'], count[i - 3])
for i in range(N):
print '%2d %7d %6d %-52s' % (i, count[i], clipboard[i], ''.join(res[i]))
This is the output ('a' means 'CTRL+A', etc.)
0 0 0
1 1 0 A
2 2 0 AA
3 3 0 AAA
4 4 0 AAAA
5 5 0 AAAAA
6 6 3 AAAacv
7 9 3 AAAacvv
8 12 3 AAAacvvv
9 15 3 AAAacvvvv
10 18 9 AAAacvvacv
11 27 9 AAAacvvacvv
12 36 9 AAAacvvacvvv
13 45 9 AAAacvvacvvvv
14 54 27 AAAacvvacvvacv
15 81 27 AAAacvvacvvacvv
16 108 27 AAAacvvacvvacvvv
17 135 27 AAAacvvacvvacvvvv
18 162 81 AAAacvvacvvacvvacv
19 243 81 AAAacvvacvvacvvacvv
20 324 81 AAAacvvacvvacvvacvvv
21 405 81 AAAacvvacvvacvvacvvvv
22 486 243 AAAacvvacvvacvvacvvacv
23 729 243 AAAacvvacvvacvvacvvacvv
24 972 243 AAAacvvacvvacvvacvvacvvv
25 1215 243 AAAacvvacvvacvvacvvacvvvv
26 1458 729 AAAacvvacvvacvvacvvacvvacv
27 2187 729 AAAacvvacvvacvvacvvacvvacvv
28 2916 729 AAAacvvacvvacvvacvvacvvacvvv
29 3645 729 AAAacvvacvvacvvacvvacvvacvvvv
30 4374 2187 AAAacvvacvvacvvacvvacvvacvvacv
31 6561 2187 AAAacvvacvvacvvacvvacvvacvvacvv
32 8748 2187 AAAacvvacvvacvvacvvacvvacvvacvvv
33 10935 2187 AAAacvvacvvacvvacvvacvvacvvacvvvv
34 13122 6561 AAAacvvacvvacvvacvvacvvacvvacvvacv
35 19683 6561 AAAacvvacvvacvvacvvacvvacvvacvvacvv
36 26244 6561 AAAacvvacvvacvvacvvacvvacvvacvvacvvv
37 32805 6561 AAAacvvacvvacvvacvvacvvacvvacvvacvvvv
38 39366 19683 AAAacvvacvvacvvacvvacvvacvvacvvacvvacv
39 59049 19683 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvv
40 78732 19683 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvv
41 98415 19683 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvvv
42 118098 59049 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacv
43 177147 59049 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvv
44 236196 59049 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvv
45 295245 59049 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvvv
46 354294 177147 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacv
47 531441 177147 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvv
48 708588 177147 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvv
49 885735 177147 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvvv
50 1062882 531441 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacv
51 1594323 531441 AAAacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvvacvv
If N key Strokes are allowed, then the result is N-3.
A's -> N-3
CTRL+A -> Selecting those N Characters :+1
CTRL+C -> Copying those N Characters :+1
Ctrl+V -> Pasting the N Characters. :+1
i.e., (Since we have selected the whole characters using CTRL+A) Replacing these existing N-3 characters with the copied N-3 Characters(which is overriding the same characters) and the result is N-3.

Resources