Know repetitions in multiset by its sum - algorithm

I'm given the size N of the multiset and its sum S. The elements of the set are supposed to be continuous, for example a multiset K having 6 (N=6) elements {1,1,2,2,2,3}, so S=11 (the multiset always contains first N repeating natural numbers).
How can I know the total changes to make so that there can be no repetitions and the set becomes continuous?
For the above example the multiset K needs 3 changes. Hence, finally the set K will become {1,2,3,4,5,6}.
What I did is, I found out the actual sum (i.e. n*(n+1)/2) and subtracted the given sum. Let it be T.
Then, T=ceil(T/n), then the answer becomes 2*T, it is working for most cases.
But, I guess I'm missing some cases. Does there exists some algorithm to know how many elements to change?
I'm given only the size and sum of the multiset.

As you already noticed, for a given N, the sum should be S' = N * (N-1) / 2. You are given some value S.
Clearly, if S' = S the answer is 0.
If S'- S <= N - 1, then the multiset that requires least changes is
{1, 2, ..., N-1, X}
where X = N - (S' - S), which is in the range [1, N-1]. In other words, X makes up for the difference in sum between the required and the actual multiset. Your answer would be 1.
If the difference is larger than N-1, then also N-1 cannot be in the multiset. If S'- S <= (N - 1) + (N - 2), a multiset that requires least changes is
{1, 2, ..., N-2, 1, X}
where X = N + (N - 1) - (S'- S), which is in the range [1, N-2]. Your answer would be 2.
Generalizing, you would get a table like:
S' - S | answer
-----------------------
[ 0, 0] | 0
[ 1, N-1] | 1
[ N, 2N-3] | 2
[2N-2, 3N-6] | 3
and so on. You could find a formula to get the answer in terms of N and S, but it seems much easier to use a simple loop. I'll leave the implementation to you.

Related

Permutations: Possibility of a permutation to be obtained from a given sequence [duplicate]

I studied math, and I come up with this problem.
There are two permutations A and B and a integer M.
We say A almost equal to B if we can make from A to B doing following operations.
-1 Choose a M-length segment of the permutation A.
-2 Perform a cyclic shift on it towards the right.(So,if sub segment is "1 2 3 4 5"(m=5), then after this operation , it will be "5 1 2 3 4".)
Question : Does A almost equal to B?
I think it is typical , but I couldn't find the answer.
How to solve it?(not brute force!)
number of elements in the permutation<=10^5
example
A "1 2 3"
B "2 3 1"
m=2
answer YES
explain "1 2 3"->"2 1 3"->"2 3 1"
Here's a proof of my conjecture. Let n be the length of the permutations and m be the length of the windows that we are allowed to rotate, where 1 ≤ m ≤ n. Permutations P and Q are almost equal if there exists a sequence of window rotations that transforms P into Q. Almost equality is an equivalence relation. Here's the claimed characterization of the equivalence classes.
(1) m = 1: P almost equals Q if and only if P = Q
(2) m = n: P almost equals Q if and only if they're rotations of each other
(3) 1 < m < n, m odd: P almost equals Q if and only if they have the same parity
(4) 1 < m < n, n even: P almost equals Q
The first two claims are obvious. As for (3), the necessity of the parity condition follows from the fact that rotating a window of odd length is an even permutation.
The meat of the argument here is to find an algorithm for when n = m + 1 ≥ 4, since in general, we can use an algorithm similar to insertion sort to transform P so that all but the last m + 1 elements match Q, and specifically, the case (n, m) = (3, 2) can be solved by inspection. In case m is even, we ensure further that the transformation matches the parity of Q, by rotating the last m elements once if necessary. (For m odd, we just assume equal parities.)
We need a technique for moving fewer than m elements at once. Suppose that the state is as follows.
1, 2, 3, 4, ..., m, m + 1
Rotate the second window m - 1 times (i.e., once in reverse).
1, 3, 4, ..., m, m + 1, 2
Rotate the first window m - 1 times.
3, 4, ..., m, m + 1, 1, 2
Second, twice.
3, 2, 4, ..., m, m + 1, 1
3, 1, 2, 4, ..., m, m + 1
We've succeeded in rotating the first three elements. This suffices in combination with conjugation by rotations to "insertion sort" the first m - 1 elements of Q into place. The other two are in the right order by the parity match.

Dividing N items in p groups

You are given N total number of item, P group in which you have to divide the N items.
Condition is the product of number of item held by each group should be max.
example N=10 and P=3 you can divide the 10 item in {3,4,3} since 3x3x4=36 max possible product.
You will want to form P groups of roughly N / P elements. However, this will not always be possible, as N might not be divisible by P, as is the case for your example.
So form groups of floor(N / P) elements initially. For your example, you'd form:
floor(10 / 3) = 3
=> groups = {3, 3, 3}
Now, take the remainder of the division of N by P:
10 mod 3 = 1
This means you have to distribute 1 more item to your groups (you can have up to P - 1 items left to distribute in general):
for i = 0 up to (N mod P) - 1:
groups[i]++
=> groups = {4, 3, 3} for your example
Which is also a valid solution.
For fun I worked out a proof of the fact that it in an optimal solution either all numbers = N/P or the numbers are some combination of floor(N/P) and ceiling(N/P). The proof is somewhat long, but proving optimality in a discrete context is seldom trivial. I would be interested if anybody can shorten the proof.
Lemma: For P = 2 the optimal way to divide N is into {N/2, N/2} if N is even and {floor(N/2), ceiling(N/2)} if N is odd.
This follows since the constraint that the two numbers sum to N means that the two numbers are of the form x, N-x.
The resulting product is (N-x)x = Nx - x^2. This is a parabola that opens down. Its max is at its vertex at x = N/2. If N is even this max is an integer. If N is odd, then x = N/2 is a fraction, but such parabolas are strictly unimodal, so the closer x gets to N/2 the larger the product. x = floor(N/2) (or ceiling, it doesn't matter by symmetry) is the closest an integer can get to N/2, hence {floor(N/2),ceiling(N/2)} is optimal for integers.
General case: First of all, a global max exists since there are only finitely many integer partitions and a finite list of numbers always has a max. Suppose that {x_1, x_2, ..., x_P} is globally optimal. Claim: given and i,j we have
|x_i - x_ j| <= 1
In other words: any two numbers in an optimal solution differ by at most 1. This follows immediately from the P = 2 lemma (applied to N = x_i + x_ j).
From this claim it follows that there are at most two distinct numbers among the x_i. If there is only 1 number, that number is clearly N/P. If there are two numbers, they are of the form a and a+1. Let k = the number of x_i which equal a+1, hence P-k of the x_i = a. Hence
(P-k)a + k(a+1) = N, where k is an integer with 1 <= k < P
But simple algebra yields that a = (N-k)/P = N/P - k/P.
Hence -- a is an integer < N/P which differs from N/P by less than 1 (k/P < 1)
Thus a = floor(N/P) and a+1 = ceiling(N/P).
QED

Maximum of sums of unsorted array and each of a number of sorted arrays

Given an unsorted array
A = a_1 ... a_n
And a set of sorted Arrays
B_i = b_i_1 ... b_i_n # for i from 1 to $large_number
I would like to find the maximums from the (not yet calculated) sum arrays
C_i = (a_1 + b_i_1) ... (a_n + b_i_n)
for each i.
Is there a trick to do better than just calculating all the C_i and finding their maximums in O($large_number * n)?
Can we do better when we know that the B arrays are just shifts from an endless sequence,
e.g.
S = 0 1 4 9 16 ...
B_i = S[i:i+n]
(The above sequence has the maybe advantageous property that (S_i - S_i-1 > S_i-1 - S_i-2))
There are $large_number * n data in your first problem, so there can't be any such trick.
You can prove this with an adversary argument. Suppose you have an algorithm that solves your problem without looking at all n * $large_number entries of b. I'm going to pick a fixed a, namely (-10, -20, -30, ..., -10n). The first $large_number * n - 1 the algorithm looks at an entry b_(i,j), I'll answer that it's 10j, for a sum of zero. The last time it looks at an entry, I'll answer that it's 10j+1, for a sum of 1.
If $large_number is Omega(n), your second problem requires you to look at n * $large_number entries of S, so it also can't have any such trick.
However, if you specify S, there may be something. And if $large_number <= n/2 (or whatever it is), then, all of the entries of S must be sorted, so you only have to look at the last B.
If we don't know anything I don't it's possible to do better than O($large_number * n)
However - If it's just shifts of an endless sequence we can do it in O($large_number + n):
We calculate B_0 ןמ O($large_number).
Than B_1 = (B_0 - S[0]) + S[n+1]
And in general: B_i = (B_i-1 - S[i-1]) + S[i-1+n].
So we can calculate all the other entries and the max in O(n).
This is for a general sequence - if we have some info about it, it might be possible to do better.
we know that the B arrays are just shifts from an endless sequence,
e.g.
S = 0 1 4 9 16 ...
B_i = S[i:i+n]
You can easily calculate S[i:i+n] as (sum of squares from 1 to i+n) - (sum of squares from 1 to i-1)
See https://math.stackexchange.com/questions/183316/how-to-get-to-the-formula-for-the-sum-of-squares-of-first-n-numbers
With the provided example, S1 = 0, S2 = 1, S3 = 4...
Let f(n) = SUM of Si for i=1 to n = (n-1)(n)(2n-1)/6
B_i = f(i+n) - f(i-1)
You then add SUM(A) to each sum.
Another approach is to calculate the difference between B_i and B_(i-1):
That would be: S[i:i+n] - S[i-1:i+n-1] = S(i+n) - S(i-1)
That way, you can just calculate the difference of the sums of each array with the previous one. In my understanding, since Ci = SUM(Bi)+SUM(A), SUM(A) becomes a constant that is irrelevant in finding the maximum.

Represent a number as sum of primes

I am given a large number n and I need to find whether it can be represented as sum of K prime numbers.
Ex 9 can be represented as sum of 3 prime number as 2+2+5.
I am trying to use variation of subset sum but number is too large to generate all primes number till then.
The problem is from the current HackerRank contest. The restrictions are 1 <= n, K <= 10^12
For K = 1, the answer is obviously "Yes" iif N is prime
For K = 2, according to the Goldbach conjecture, which is verified for N up to around 10^18, the answer is "Yes" iif N is even and N >= 4 or if N - 2 is prime.
The interesting case is K = 3. Obviously if N < 6, the answer is "No" because the smallest number expressible as the sum of three primes is 2 + 2 + 2 = 6.
If N >= 6, then either N - 2 or N - 3 is even and >= 4, so we can apply Goldbach's conjecture again.
So for K = 3, the answer is "Yes" simply iif N >= 6.
Via induction (hint: just use K - 3 times the prime 2), we can show that for K >= 3, the answer is "Yes" iif N >= 2*K, so only the cases K = 1 and K = 2 are non-trivial and require just a simple primality check, e.g. via Miller–Rabin in O(log^4 N).
EDIT: As a bonus, this proof also gives a constructive algorithm to output the partition. We use a number of 2's and maybe one 3 to get to K = 2. The tricky K = 2, N even case is not as hard as it looks: We know from computational verification of the Goldbach conjecture that for N >= 12, there is a Goldbach partition with a prime < 5200 or so. There are less than 700 such primes, so we can check them all in a reasonable amount of time.
The concept you are looking for is called the prime partitions of a number. The formula to compute the number of prime partitions of a number is \kappa(n) = \frac{1}{n}\left(\mathrm{sopf}(n) + \sum_{j=1}^{n-1} \mathrm{sopf}(j) \cdot \kappa(n-j)\right); I gave that in LaTeX notation because I don't know how to do it in html. The sopf(n) function is the sum of the distinct prime factors of n, so sopf(42) = 12, since 42 = 2 * 3 * 7, but sopf(12) = 5, since 12 = 2 * 2 * 3 but each prime factor is counted once.
I discuss this formula at my blog.
Your input are n and K. There are many cases :
K > n : impossible
K = n : the K prime numbers are all 1
K < n : 4 subcases :
a. n and K are odd
b. n is even, K is odd
c. n is odd, K is even
d. n and K are even
Case a: select any prime p < n and p > 2. The problem reduces to the same problem with input n-p and K-1 instead of n and K respectively, and we fall in case b
Case b: The problem reduces to the same problem with input n-2 and K-1 instead of n and K respectively, and we fall in case d
Case c: idem than b, but we fall in case a instead of d
Case d: if n = 2K, then 2, 2, ..., 2 taken K times is your solution (ie your primes are 2, 2, ..., 2). Otherwise n can be written
n = (\sum_{i=1}^{i=K-2} 2 ) + p + q
where we add the prime 2 (K-2) times in the sum. Then the problem reduces to the same problem with input n-2(K-2) instead of n and 2 instead of K. But this is Goldbach. You can solve it in O(n sqrt(n)) like this : take p and q both equal to n/2. Increment p and decrease q by 1 at each step until they are both prime.

Generating a stateless, pseudo-random permutation of integers from 0 to n?

Question spawned from this one. The problem can be formulated as follows:
Given two positive integers n and m, with m <= n, is there a way to find a suite of numbers, which cycles and covers all possible values from 0 to n?
As a basic example, if we take 3 as a number, for whatever number current between 0 and 3, we can compute the next value as:
next = (current+3) % 4
This will cycle. For instance: 1 -> 0 -> 3 -> 2 -> 1 etc. I found this solution by "chance" and it is even general ((i + n) % (n + 1) for any n), I cannot prove it mathematically. And it is a little too obvious.
Are there better ways to generate such a permutation?
I'm not sure what you intend m in the question to refer to, or how you're defining "a suite of numbers"). However, one way of getting a cycle of number is to use a recursion (or iteration) of the form:
next = f(current)
for some function f. For example, linear congruential RNGs use the iteration:
x = ( a · x + c ) mod m where 0 < a, c < m
They don't always produce all values from 0 to m-1, but under certain circumstances they do:
c and m are relatively prime
a - 1 is divisible by every prime factor of m (not including m)
if m is divisible by 4, a - 1 is divisible by 4.
(This is the Hull-Dobell theorem.)
Note that a, c == 1 satisfies the above criteria for any m. Futhermore, if m is prime, any values of a and c satisify the criteria, and if m is a power of 2, then the criteria are satisfied by any a, c such that a == 1 mod 4 and c == 1 mod 2. However, for certain values of m (eg. 6), the only value of a which will work is 1.
This might not qualify as "stateless", but I don't think that there is any strictly stateless solution; for example, you might look for some function f such that:
f(0), f(1),... f(m-1)
is a permutation of
0, 1, ..., m-1
so that you could generate the cycle by calling f(i) for successive values of i. But that's still a state, since you have to remember the last value of i you used,
Incrementing each subsequent number by any number that does not share a common prime divisor with (n-m+1) would cover the sequence (e.g. for the sequence [2-11] (10 numbers) incrementing by 3, 7, or 9 would work but 2, 4, 5, 6, and 8 would not because they share a common divisor (2 and/or 5)
EDIT
I took out the shuffling idea since it seems that you want to increment by the same number each time. If you want a truly "random" sequence that has m at the first element just take m out and place it at the beginning. I'm not sure how that helps you, though.

Resources