Existence of a permutation under constraints (Interview Street - Manipulative Numbers) - algorithm

I am trying to solve this problem: https://www.interviewstreet.com/challenges/dashboard/#problem/4f9a33ec1b8ea
Suppose that A is a list of n numbers (A1, A2, A3, ..., An) and B = (B1, B2, B3, ..., Bn) is a permutation of these numbers. We say B is K-Manipulative if and only if the value
M(B) = min( B1 Xor B2, B2 Xor B3, B3 Xor B4, ..., Bn-1 Xor Bn, Bn Xor B1 ) is not less than 2^K.
You are given n numbers A1 to An. You have to find the biggest K such that there exists a permutation B of these numbers which is K-Manipulative.
Input:
In the first line of the input there is an integer N.
In the second line of input there are N integers A1 to An
N is not more than 100.
Ai is non-negative and will fit in 32-bit integer.
Output:
Print an integer to the output being the answer to the test. If there is no such K print -1 to the output.
Sample Input
3
13 3 10
Sample Output
2
Sample Input
4
1 2 3 4
Sample Output
1
Explanation
First Sample test
Here the list A is {13, 3, 10}. One possible permutation of A is, B = (10, 3, 13).
For B, min( B1 xor B2, B2 xor B3, B3 xor B1 ) = min( 10 xor 3, 3 xor 13, 13 xor 10 ) = min( 9, 14, 7 ) = 7.
So there exists a permutation B of A such that M(B) is not less than 4, i.e. 2^2. However, there does not exist any permutation B of A such that M(B) is not less than 8, i.e. 2^3. So the maximum possible value of K is 2.
==================================================================================
Here are the attempts I have made so far.
Attempt 1: Greedy Algorithm
1. Place the input in an array A[1..n].
2. Compute the value M(A). This gives the location (i, (i + 1) % n) of the minimum XOR value.
3. Check whether swapping A[i] or A[(i + 1) % n] with any other element of the array increases the value of M(A). If such an element exists, make the swap.
4. Repeat steps 2 & 3 until the value M(A) cannot be improved.
This is guaranteed to reach a local maximum, but I am not sure whether it reaches the global maximum.
Attempt 2: Checking for the existence of a permutation given neighbor constraints
1. Given input A[1..n], for i = 1..n and j = (i+1)..n compute x_ij = A[i] XOR A[j].
2. Compute max(x_ij). Note that 2^p <= max(x_ij) < 2^(p+1) for some p.
3. Collect all x_ij such that x_ij >= 2^p. This collection can be treated as a graph G with nodes {1, 2, ..., n}, where nodes i and j have an undirected edge between them if x_ij >= 2^p.
4. Check whether the graph G has a cycle which visits each node exactly once. If such a cycle exists, k = p. Otherwise, let p = p - 1 and go to step 3.
This gives the correct answer, but note that in step 4 we are essentially checking whether the graph has a Hamiltonian cycle, which is an NP-hard problem in general.
Any hints or suggestions?

It is possible to solve this problem without going deep into graph theory.
Key inference
The property suggested by rich is the key to solving this problem:
Following up on my comment: B1 Xor B2 < 2^K if and only if B1 and B2 agree on all but the K low-order bits.
Based on the above property, we only need to find the highest k for which no value of the high-order bits A[i] >> k occurs more than n/2 times.
In other words, among the values A[i] >> k, there exists a k-manipulative permutation with all the cyclic XOR values >= 2^k if and only if each distinct value is repeated at most n/2 times.
Why n/2
If you do have more than n/2 occurrences of one value of the high-order bits, it is NOT possible to obtain a permutation B in which no two of those elements are adjacent: by the pigeonhole principle, at least one pair B[i], B[(i+1) % N] will agree on all the high-order bits, so all the high-order bits of B[i] XOR B[(i+1) % N] become zero, making M(B) < 2^k.
Pseudocode (as runnable Python)

def max_k(a, max_bits=32):
    n = len(a)
    # try the highest k first; the first bit position whose high-order
    # groups have no majority element is the answer
    for bit in range(max_bits - 1, -1, -1):
        count = {}
        for x in a:
            hi = x >> bit                    # high-order bits of A[i]
            count[hi] = count.get(hi, 0) + 1
        # valid iff no group holds more than n/2 of the elements
        if all(2 * c <= n for c in count.values()):
            return bit
    return -1
The time complexity for this solution is O(M * N) where M is the constant factor representing the maximum number of bits used to represent the numbers (32-bits, 64-bits, etc.), and N is the size of the input array A.
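For the two sample tests, max_k as defined in the sketch above returns the expected answers:

max_k([13, 3, 10])   # -> 2
max_k([1, 2, 3, 4])  # -> 1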

Following up on my comment: B1 Xor B2 < 2^K if and only if B1 and B2 agree on all but the K low order bits, so G has the very special structure of being complete multipartite, with partition labels consisting of all but the K low order bits. A complete multipartite graph is Hamiltonian if and only if there is no majority partition. Plug this fact into Attempt 2.

Greedy Approach #priority_queue
First, push every pair's XOR value together with its indices into a max-priority queue: pq.push(tuple(v[i] ^ v[j], i, j)).
Pop the maximum XOR value, and fix its two indices i and j.
Then repeatedly pop the maximum XOR value among the pairs that involve the current i or j, extending the chain.
Perform this operation n times in total, then return the n-th popped XOR value.
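A literal Python sketch of this greedy idea (my own rendering; max() over the remaining candidates stands in for the priority queue, and nothing here is guaranteed to maximize M(B) - it is a heuristic):

def greedy_permutation(v):
    # seed with the max-XOR pair, then repeatedly attach the unused element
    # with the largest XOR against either end of the chain (assumes len(v) >= 2)
    n = len(v)
    _, i, j = max((v[i] ^ v[j], i, j) for i in range(n) for j in range(i + 1, n))
    chain, used = [i, j], {i, j}
    while len(chain) < n:
        _, idx, end = max((v[idx] ^ v[end], idx, end)
                          for idx in range(n) if idx not in used
                          for end in (chain[0], chain[-1]))
        used.add(idx)
        if end == chain[0]:
            chain.insert(0, idx)
        else:
            chain.append(idx)
    perm = [v[t] for t in chain]
    return perm, min(perm[t] ^ perm[(t + 1) % n] for t in range(n))

print(greedy_permutation([13, 3, 10]))  # ([13, 3, 10], 7): M(B) = 7, so K = 2 here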

Related

Find a subset of length K from a set of N numbers (1...N) where k<=N whose XOR value is X

Write an algorithm to find a subset of length K, {A1, A2, ..., AK}, from a set of N numbers (1...N), where K <= N, such that A1 ^ A2 ^ ... ^ AK is X (where a ^ b represents the bitwise XOR of a and b). Print the subset in any order. Print -1 if no subset is possible.
Constraints: 1 <= K <= N <= 10^6
e.g
if N=5, K=4, X=5
Output:
1 2 3 5
if N=5, K=5, X=5
Output: -1
I have tried the following solution:
Find all possible subsets of length K.
Check whether the XOR sum of any subset equals X.
The complexity of the above solution is O(C(N, K)), which is not feasible for large N.
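For reference, the brute force described above, rendered directly in Python (the function name is mine):

from itertools import combinations
from functools import reduce

def find_subset(n, k, x):
    # try every K-element subset of {1..N} until one XORs to X
    for subset in combinations(range(1, n + 1), k):
        if reduce(lambda p, q: p ^ q, subset, 0) == x:
            return subset
    return None

print(find_subset(5, 4, 5))  # (1, 2, 3, 5)
print(find_subset(5, 5, 5))  # None, i.e. print -1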
Is there any linear-time solution to this problem? Please help.

Find matrix with minimum number of nonzero elements to satisfy row and column sums

The problem is to find a matrix that has given row and column sums and has a minimum number of nonzero elements. Given two arrays of positive integers A[1...N] and B[1...M], sum(A)=sum(B). The arrays A and B are row and column sums respectively of an unknown NxM matrix. The elements of the matrix are non-negative integers.
Is this possible in polynomial time?
Equivalent formulation - create a minimum size multi-set C that can be created from A and from B by "breaking up numbers in smaller pieces". The multi-set C is the same as nonzero elements from the matrix. The obvious lower and upper bounds on the size of C are:
max(|A|, |B|) <= |C| <= N+M-1
As you mentioned earlier |C| <= N + M - 1
But say you can split A and B into A1, A2 and B1, B2 such that sum(A1) = sum(B1) and sum(A2) = sum(B2); then the constraint becomes smaller:
|C| <= (|A1| + |B1| -1) + (|A2| + |B2| -1)
<= N + M - 2
So the goal of the problem is to split A and B into the maximum number of components A1, A2, ... Ak and B1, B2, ... Bk such that sum(Ai) = sum(Bi). In which case:
|C| <= N + M -1 -k
I don't think there is a polynomial solution to this. But the following heuristics might work.
Step 1: Sort A and B
Step 2: Find common elements between A and B and move them out into their own components
Step 3: Find sets of two elements in A that sum to an element in B and move them out. Do the same thing for two elements in B that sum to an element in A
Step 4: ....
As you can see, it becomes harder and harder as the size of the components keeps going up (a rough sketch of the first steps follows). But I am not sure if there is a better solution.
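To make steps 1-3 concrete, here is a rough Python sketch (my own illustrative code; the symmetric case of two B-elements summing to an A-element, and all larger components, are left out, which is exactly where it gets hard):

from collections import Counter

def count_easy_components(a, b):
    # greedily extract components; each one found tightens the bound on |C|
    ca, cb = Counter(a), Counter(b)
    k = 0
    # Step 2: elements common to A and B become their own components
    for v in set(ca) & set(cb):
        m = min(ca[v], cb[v])
        k += m
        ca[v] -= m
        cb[v] -= m
    ca += Counter()   # drop zero counts
    cb += Counter()
    # Step 3: two elements of A that sum to one element of B
    for v in list(cb.elements()):
        for x in sorted(ca):
            y = v - x
            if cb[v] > 0 and ca[x] > 0 and ca[y] > (1 if x == y else 0):
                k += 1
                cb[v] -= 1
                ca[x] -= 1
                ca[y] -= 1
                break
    return k

print(count_easy_components([2, 3, 5], [5, 5]))  # 2 components: (5|5) and (2,3|5)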

Counting subarray have sum in range [L, R]

I am solving a competitive programming problem, it was described like this:
Given n < 10^5 integers a1, a2, a3, ..., an and numbers L, R. How many
subarrays are there such that the sum of their elements is in the range [L, R]?
Example:
Input:
n = 4, L = 2, R = 4
1 2 3 4
Output: 4
(the qualifying sums are 4 = 4, 3 = 1 + 2, 3 = 3, and 2 = 2)
One solution I have is brute force, but O(n^2) is too slow. What data structures / algorithms should I use to solve this problem efficiently?
Compute prefix sums (p[0] = 0, p[1] = a1, p[2] = a1 + a2, ..., p[n] = sum of all numbers).
For a fixed prefix sum p[i], you need to find the number of prefix sums p[j] such that j is less than i and p[i] - R <= p[j] <= p[i] - L. This can be done in O(log n) with a treap or another balanced binary search tree.
Pseudo code:
treap.add(0)
sum = 0
ans = 0
for i from 1 to n:
    sum += a[i]
    left, right = treap.split(sum - R)
    middle, right = right.split(sum - L)
    ans += middle.size()
    merge left, middle and right together
    treap.add(sum)
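The same counting in Python, with a sorted list and bisect standing in for the treap (a sketch only: insort is O(n), so this is O(n^2) worst case, whereas a real treap or other balanced BST gives O(n log n)):

import bisect

def count_subarrays(a, lo, hi):
    prefixes = [0]     # prefix sums seen so far, kept sorted
    s = 0
    ans = 0
    for x in a:
        s += x
        # count previous prefixes p with s - hi <= p <= s - lo
        left = bisect.bisect_left(prefixes, s - hi)
        right = bisect.bisect_right(prefixes, s - lo)
        ans += right - left
        bisect.insort(prefixes, s)
    return ans

print(count_subarrays([1, 2, 3, 4], 2, 4))  # 4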
We can do it in linear time if the array contains positive numbers only.
First build an array with the prefix sums from left to right. Then (a sketch in Python follows these steps):
1. Fix three pointers, X, Y and Z, and initialize them with 0.
2. At every step increase X by 1.
3. While the sum of the numbers between Y and X is greater than R, keep increasing Y.
4. While the sum of the numbers between Z and X is greater than or equal to L, keep increasing Z.
5. If valid Y and Z are found, add Z - Y + 1 to the result.
6. If X is less than the length of the array, go to step 2.
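A sketch of this two-pointer idea (my own rendering; it assumes strictly positive elements and 1 <= L <= R, and maintains the window sums so that all left ends between Y and Z give sums in [L, R]):

def count_subarrays_positive(a, lo, hi):
    ans = 0
    y = z = 0
    sum_y = sum_z = 0                # running sums of a[y..x] and a[z..x]
    for x in range(len(a)):
        sum_y += a[x]
        sum_z += a[x]
        while sum_y > hi:            # shrink until sum(a[y..x]) <= R
            sum_y -= a[y]
            y += 1
        while sum_z - a[z] >= lo:    # advance z while sum(a[z+1..x]) >= L
            sum_z -= a[z]
            z += 1
        if sum_z >= lo:              # left ends y..z all give sums in [L, R]
            ans += z - y + 1
    return ans

print(count_subarrays_positive([1, 2, 3, 4], 2, 4))  # 4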

Find the k non-repeating elements in a list with "little" additional space

The original problem statement is this one:
Given an array of 32bit unsigned integers in which every number appears exactly twice except three of them (which appear exactly once), find those three numbers in O(n) time using O(1) extra space. The input array is read-only. What if there are k exceptions instead of 3?
It's easy to solve this in Ο(1) time and Ο(1) space if you accept a very high constant factor because of the input restriction (the array can have at most 2^33 entries):
for i in lst:
    if sum(1 for j in lst if i == j) == 1:
        print i
So, for the sake of this question, let's drop the restriction in bit length and concentrate on the more general problem where the numbers can have up to m bits.
Generalizing an algorithm for k = 2, what I had in mind is the following:
XOR those numbers with a least significant bit of 1 and those with a 0 separately. If for both of the partitions, the resulting value is not zero, we know that we have partitioned the non-repeating numbers into two groups, each of which has at least one member
For each of those groups, try to partition it further by examining the second-least significant bit and so on
There is a special case to be considered, though. If after partitioning a group, the XOR values of one of the groups are both zero, we don't know whether one of the resulting sub-groups is empty or not. In this case my algorithm just leaves this bit out and continues with the next one, which is incorrect, for example it fails for the input [0,1,2,3,4,5,6].
Now the idea I had was to compute not only the XOR of the element, but also the XOR of the values after applying a certain function (I had chosen f(x) = 3x + 1 here). See Evgeny's answer below for a counter-example for this additional check.
Now although the below algorithm is not correct for k >= 7, I still include the implementation here to give you an idea:
def xor(seq):
    return reduce(lambda x, y: x ^ y, seq, 0)

def compute_xors(ary, mask, bits):
    a = xor(i for i in ary if i & mask == bits)
    b = xor(i * 3 + 1 for i in ary if i & mask == bits)
    return a if max(a, b) > 0 else None

def solve(ary, high = 0, mask = 0, bits = 0, old_xor = 0):
    for h in xrange(high, 32):
        hibit = 1 << h
        m = mask | hibit
        # partition the array into two groups
        x = compute_xors(ary, m, bits | hibit)
        y = compute_xors(ary, m, bits)
        if x is None or y is None:
            # at this point, we can't be sure if both groups are non-empty,
            # so we check the next bit
            continue
        mask |= hibit
        # we recurse if we are absolutely sure that we can find at least one
        # new value in both branches. This means that the number of recursions
        # is linear in k, rather than exponential.
        solve(ary, h + 1, mask, bits | hibit, x)
        solve(ary, h + 1, mask, bits, y)
        break
    else:
        # we couldn't find a partitioning bit, so we output (but
        # this might be incorrect, see above!)
        print old_xor

# expects input of the form "10 1 1 2 3 4 2 5 6 7 10"
ary = map(int, raw_input().split())
solve(ary, old_xor=xor(ary))
From my analysis, this code has a worst-case time complexity of O(k * m² * n) where n is the number of input elements (XORing is O(m) and at most k partitioning operations can be successful) and space complexity O(m²) (because m is the maximum recursion depth and the temporary numbers can be of length m).
The question is of course if there is a correct, efficient approach with good asymptotic runtime (let's assume that k << n and m << n here for the sake of completeness), which also needs little additional space (for example, approaches that sort the input will not be accepted, because we'd need at least O(n) additional space for that, as we can't modify the input!).
EDIT: Now that the algorithm above is proven to be incorrect, it would of course be nice to see how it could be made correct, possibly by making it a bit less efficient. Space complexity should be in o(n*m) (that is, sublinear in the total number of input bits). It would be okay to take k as an additional input if that makes the task easier.
I went offline and proved the original algorithm subject to the conjecture that the XOR tricks worked. As it happens, the XOR tricks don't work, but the following argument may still interest some people. (I re-did it in Haskell because I find proofs much easier when I have recursive functions instead of loops and I can use data structures. But for the Pythonistas in the audience I tried to use list comprehensions wherever possible.)
Compilable code at http://pastebin.com/BHCKGVaV.
Beautiful theory slain by an ugly fact
Problem: we're given a sequence of n nonzero 32-bit words in
which every element is either singleton or doubleton:
If a word appears exactly once, it is singleton.
If a word appears exactly twice, it is doubleton.
No word appears three or more times.
The problem is to find the singletons. If there are three
singletons, we should use linear time and constant space. More
generally, if there are k singletons, we should use O(k*n) time
and O(k) space. The algorithm rests on an unproven conjecture
about exclusive or.
We begin with these basics:
module Singleton where
import Data.Bits
import Data.List
import Data.Word
import Test.QuickCheck hiding ((.&.))
Key abstraction: Partial specification of a word
To tackle the problem I'm going to introduce an abstraction: to
describe the least significant w bits of a 32-bit word, I
introduce a Spec:
data Spec = Spec { w :: Int, bits :: Word32 }
            deriving Show

width = w -- width of a Spec
A Spec matches a word if the least significant w bits are equal
to bits. If w is zero, by definition all words match:
matches :: Spec -> Word32 -> Bool
matches spec word = width spec == 0 ||
                    ((word `shiftL` n) `shiftR` n) == bits spec
  where n = 32 - width spec

universalSpec = Spec { w = 0, bits = 0 }
Here are some claims about Specs:
All words match the universalSpec, which has width 0.
If matches spec word and width spec == 32, then word == bits spec.
Key idea: "extend" a partial specification
Here is the key idea of the algorithm: we can extend a Spec by
adding another bit to the specification. Extending a Spec
produces a list of two Specs
extend :: Spec -> [Spec]
extend spec = [ Spec { w = w', bits = bits spec .|. (bit `shiftL` width spec) }
              | bit <- [0, 1] ]
  where w' = width spec + 1
And here's the crucial claim: if spec matches word and if
width spec is less than 32, then exactly one of the two specs
from extend spec matches word. The proof is by case analysis on
the relevant bit of word. This claim is so important that I'm
going to call it Lemma One. Here's a test:
lemmaOne :: Spec -> Word32 -> Property
lemmaOne spec word =
    width spec < 32 && (spec `matches` word) ==>
      isSingletonList [s | s <- extend spec, s `matches` word]

isSingletonList :: [a] -> Bool
isSingletonList [a] = True
isSingletonList _   = False
We're going to define a function which given a Spec and a
sequence of 32-bit words, returns a list of the singleton words
that match the spec. The function will take time proportional to
the length of the input times the size of the answer times 32, and
extra space proportional to the size of the answer times 32. Before
we tackle the main function, we define some constant-space XOR
functions.
XOR ideas that are broken
Function xorWith f ws applies function f to every word in ws
and returns the exclusive or of the result.
xorWith :: (Word32 -> Word32) -> [Word32] -> Word32
xorWith f ws = reduce xor 0 [f w | w <- ws]
  where reduce = foldl'
Thanks to stream fusion (see ICFP 2007), function xorWith takes
constant space.
A list of nonzero words has a singleton if and only if either the
exclusive or is nonzero, or if the exclusive or of 3 * w + 1 is
nonzero. (The "if" direction is trivial. The "only if" direction is
a conjecture that Evgeny Kluev has disproven; for a counterexample,
see array testb below. I can make Evgeny's example work by adding
a third function g, but obviously this situation calls for a
proof, and I don't have one.)
hasSingleton :: [Word32] -> Bool
hasSingleton ws = xorWith id ws /= 0 || xorWith f ws /= 0 || xorWith g ws /= 0
  where f w = 3 * w + 1
        g w = 31 * w + 17
Efficient search for singletons
Our main function returns a list of all the singletons matching a
spec.
singletonsMatching :: Spec -> [Word32] -> [Word32]
singletonsMatching spec words =
    if hasSingleton [w | w <- words, spec `matches` w] then
        if width spec == 32 then
            [bits spec]
        else
            concat [singletonsMatching spec' words | spec' <- extend spec]
    else
        []
We'll prove its correctness by induction on the width of the
spec.
The base case is that spec has width 32. In this case, the
list comprehension will give the list of words that are exactly
equal to bits spec. Function hasSingleton will return True if
and only if this list has exactly one element, which will be true
exactly when bits spec is singleton in words.
Now let's prove that if singletonsMatching is correct for
width m+1, then it is also correct for width m, where m < 32.
(This is the opposite direction as usual for induction, but it
doesn't matter.)
Here is the part that is broken: for narrower widths, hasSingleton may return False even when given an array of singletons. This is tragic.
Calling extend spec on a spec of width m returns two specs
that have width m+1. By hypothesis, singletonsMatching is
correct on these specs. To prove: that the result contains exactly
those singletons that match spec. By Lemma One, any word that
matches spec matches exactly one of the extended specs. By
hypothesis, the recursive calls return exactly the singletons
matching the extended specs. When we combine the results of these
calls with concat, we get exactly the matching singletons, with
no duplicates and no omissions.
Actually solving the problem is anticlimactic: the singletons are
all the singletons that match the empty spec:
singletons :: [Word32] -> [Word32]
singletons words = singletonsMatching universalSpec words
Testing code
testa, testb :: [Word32]
testa = [10, 1, 1, 2, 3, 4, 2, 5, 6, 7, 10]
testb = [ 0x0000
        , 0x0010
        , 0x0100
        , 0x0110
        , 0x1000
        , 0x1010
        , 0x1100
        , 0x1110
        ]
Beyond this point, if you want to follow what's going on, you need
to know QuickCheck.
Here's a random generator for specs:
instance Arbitrary Spec where
    arbitrary = do width <- choose (0, 32)
                   b <- arbitrary
                   return (randomSpec width b)
    shrink spec = [randomSpec w' (bits spec) | w' <- shrink (width spec)] ++
                  [randomSpec (width spec) b | b <- shrink (bits spec)]

randomSpec width bits = Spec { w = width, bits = mask bits }
  where mask b = if width == 32 then b
                 else (b `shiftL` n) `shiftR` n
        n = 32 - width
Using this generator, we can test Lemma One using
quickCheck lemmaOne.
We can test to see that any word claimed to be a singleton is in
fact singleton:
singletonsAreSingleton nzwords =
    not (hasTriple words) ==> all (`isSingleton` words) (singletons words)
  where isSingleton w words = isSingletonList [w' | w' <- words, w' == w]
        words = [w | NonZero w <- nzwords]

hasTriple :: [Word32] -> Bool
hasTriple words = hasTrip (sort words)
hasTrip (w1:w2:w3:ws) = (w1 == w2 && w2 == w3) || hasTrip (w2:w3:ws)
hasTrip _ = False
Here's another property that tests the fast singletons against a
slower algorithm that uses sorting.
singletonsOK :: [NonZero Word32] -> Property
singletonsOK nzwords = not (hasTriple words) ==>
    sort (singletons words) == sort (slowSingletons words)
  where words = [w | NonZero w <- nzwords]
        slowSingletons words = stripDoubletons (sort words)
        stripDoubletons (w1:w2:ws) | w1 == w2  = stripDoubletons ws
                                   | otherwise = w1 : stripDoubletons (w2:ws)
        stripDoubletons as = as
Disproof of algorithm in OP for k >= 7
This algorithm relies on the possibility of recursively splitting a set of k unique values into two groups using the value of a single bit, when at least one of these groups XORs to a nonzero value. For example, the following numbers
01000
00001
10001
may be split into
01000
and
00001
10001
using the value of the least significant bit.
If properly implemented, this works for k <= 6. But this approach fails for k = 8 and k = 7. Let's assume m = 4 and use 8 even numbers from 0 to 14:
0000
0010
0100
0110
1000
1010
1100
1110
Each bit, except the least significant one, is set in exactly 4 of these values. If we try to partition this set, then because of this symmetry we'll always get a subset with 2 or 4 or 0 values set at any bit, and the XOR of each such subset is always 0. The algorithm therefore cannot make any split, so the else part just prints the XOR of all these unique values (a single zero).
The 3x + 1 trick does not help: it only shuffles these 8 values and toggles the least significant bit.
Exactly the same arguments are applicable for k = 7 if we remove the first (all-zero) value from the above subset.
Since any group of unique values may be split into a group of 7 or 8 values and some other group, this algorithm also fails for k > 8.
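This failure is easy to confirm mechanically. A small Python check (my own snippet; it uses the bit-spread form of this set, the testb array defined in the answer above, so that 3x + 1 produces no carries between the bit groups):

from functools import reduce

testb = [0x0000, 0x0010, 0x0100, 0x0110, 0x1000, 0x1010, 0x1100, 0x1110]

def xor_all(xs):
    return reduce(lambda p, q: p ^ q, xs, 0)

for bit in range(32):
    ones  = [x for x in testb if x >> bit & 1]
    zeros = [x for x in testb if not x >> bit & 1]
    # both halves XOR to zero under id and under 3x + 1, so compute_xors()
    # in the OP's code returns None for every bit: no split is ever found,
    # and the algorithm prints a single zero instead of 8 singletons
    for group in (ones, zeros):
        assert xor_all(group) == 0
        assert xor_all(3 * x + 1 for x in group) == 0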
Probabilistic algorithm
Instead of inventing a completely new algorithm, it is possible to modify the algorithm in the OP, making it work for any input values.
Each time the algorithm accesses an element of the input array, it should apply some transformation function to this element: y=transform(x). This transformed value y may be used exactly as x was used in the original algorithm - for partitioning the sets and XORing the values.
Initially transform(x)=x (unmodified original algorithm). If after this step we have less than k results (some of the results are several unique values XORed), we change transform to some hash function and repeat computations. This should be repeated (each time with different hash function) until we get exactly k values.
If these k values are obtained on the first step of the algorithm (without hashing), these values are our result. Otherwise we should scan the array once more, computing hash of each value and reporting those values, that match one of k hashes.
Each subsequent step of computations with different hash function may be performed either on the original set of k values or (better) separately on each of the subsets, found on previous step.
To obtain a different hash function for each step of the algorithm, you can use Universal hashing. One necessary property for the hash function is reversibility: the original value should be (in theory) reconstructible from the hash value. This is needed to avoid hashing several "unique" values to the same hash value. Since any reversible m-bit hash function does not have much chance of solving the "counterexample" problem, hash values should be longer than m bits. One simple example of such a hash function is the concatenation of the original value and some one-way hash function of this value.
If k is not very large, it is not likely that we get a set of data similar to that counterexample. (I have no proof that there are no other "bad" data patterns with a different structure, but let's hope they are also not very probable.) In this case the average time complexity is not much larger than O(k * m² * n).
Other improvements for the original algorithm
While computing the XOR of all the (yet unpartitioned) values it is reasonable to check for a unique zero value in the array. If there is one, just decrement k.
On each recursion step we cannot always know the exact size of each partition. But we know if it is odd or even: each split on a non-zero bit gives odd-sized subset, the other subset's parity is "toggled" parity of the original subset.
On the latest recursion steps, when the only non-split subset is of size 1, we may skip the search for splitting bit and report the result immediately (this is an optimization for very small k).
If we get an odd-sized subset after some split (and if we don't know for sure its size is 1), scan the array and try to find a unique value, equal to XOR of this subset.
There is no need to iterate through every bit to split an even-sized set. Just use any non-zero bit of its XORed values. XORing one of the resulting subsets may produce zero, but this split is still valid because we have an odd number of "ones" for this splitting bit but an even set size. This also means that any split that produces an even-sized subset which is non-zero when XORed is a valid split, even if the remaining subset XORs to zero.
You shouldn't continue splitting bit search on each recursion (like solve(ary, h + 1...). Instead you should restart search from the beginning. It is possible to split the set on bit 31, and have the only splitting possibility for one of the resulting subsets on bit 0.
You shouldn't scan the whole array twice (so second y = compute_xors(ary, m, bits) is not needed). You already have XOR of the whole set and XOR of a subset where the splitting bit is non-zero. Which means you can compute y immediately: y = x ^ old_xor.
Proof of algorithm in OP for k = 3
This is a proof not for the actual program in OP, but for its idea. The actual program currently rejects any split when one of the resulting subsets is zero. See suggested improvements for the cases when we may accept some of such splits. So the following proof may be applied to that program only after if x is None or y is None is changed to some condition that takes into account parity of the subset sizes or after a preprocessing step is added to exclude unique zero element from the array.
We have 3 different numbers. They must differ in at least 2 bit positions (if they differed in only one bit, the third number would have to be equal to one of the others). The loop in the solve function finds the leftmost of these bit positions and partitions these 3 numbers into two subsets (of a single number and of 2 distinct numbers). The 2-number subset has equal bits in this bit position, but the numbers are still different, so there must be one more splitting bit position (obviously, to the right of the first one). The second recursion step easily splits this 2-number subset into two single numbers. The trick with i * 3 + 1 is redundant here: it only doubles the complexity of the algorithm.
Here is an illustration for the first split in a set of 3 numbers:
 2  1
*b**yzvw
*b**xzvw
*a**xzvw
We have a loop that iterates through every bit position and computes the XOR of the whole words, but separately: one XOR value (A) for the words with a true bit in the given position, the other XOR value (B) for those with a false bit.
If A has a zero bit in this position, A contains the XOR of some even-sized subset of the values; if non-zero, of an odd-sized subset. The same is true for B. We are interested only in the even-sized subset.
It may contain either 0 or 2 values.
While there is no difference in the bit values (bits z, v, w), we have A=B=0, which means we cannot split our numbers on these bits.
But we have 3 non-equal numbers, which means at some position (1) we should have different bits (x and y). One of them (x) can be found in two of our numbers (even-sized subset!), other (y) - in one number.
Let's look at the XOR of values in this even-sized subset. From A and B select value (C), containing bit 0 at position 1. But C is just a XOR of two non-equal values.
They are equal at bit position 1, so they must differ in at least one more bit position (position 2, bits a and b). So C != 0 and it corresponds to the even-sized subset.
This split is valid because we can split this even-sized subset further either by very simple algorithm or by next recursion of this algorithm.
If there are no unique zero elements in the array, this proof may be simplified. We always split unique numbers into 2 subsets - one with 2 elements (and it cannot XOR to zero because elements are different), other with one element (non-zero by definition). So the original program with little pre-processing should work properly.
Complexity is O(m² * n). If you apply the improvements I suggested earlier, the expected number of times this algorithm scans the array is m / 3 + 2: the first splitting bit position is expected to be m / 3, a single scan is needed to deal with the 2-element subset, every 1-element subset does not need any array scans, and one more scan is needed initially (outside of the solve method).
Proof of algorithm in OP for k = 4 .. 6
Here we assume that all the suggested improvements to the original algorithm are applied.
k=4 and k=5: Since there is at least one position with different bits, this set of numbers can be split in such a way that one of the subsets has size 1 or 2. If subset's size is 1, it is non-zero (we have no zero unique values). If subset's size is 2, we have XOR of two different numbers, which is non-zero. So in both cases the split is valid.
k=6: If XOR of the whole set is non-zero, we can split this set by any position where this XOR has non-zero bit. Otherwise we have even number of non-zero bit in each position. Since there is at least one position with different bits, this position splits the set into subsets of sizes 2 and 4. Subset of size 2 has always non-zero XOR because it contains 2 different numbers. Again, in both cases we have the valid split.
Deterministic algorithm
The disproof for k >= 7 shows the pattern where the original algorithm does not work: we have a subset of size greater than 2, and at each bit position we have an even number of non-zero bits. But we can always find a pair of positions where the non-zero bits overlap in a single number. In other words, in a subset of size 3 or 4 it is always possible to find a pair of positions with a non-zero XOR of all the subset's bits in those two positions taken together. This suggests using an additional split position: iterate through the bit positions with two separate pointers, and group all numbers in the array into two subsets, where one subset has both bits non-zero in these positions, and the other holds all the remaining numbers. This increases the worst case complexity by m, but allows more values for k. Once it is no longer possible to obtain a subset of size less than 5, add a third "splitting pointer", and so on. Each time k doubles, we may need an additional "splitting pointer", which increases the worst case complexity by m once more.
This might be considered as a sketch of a proof for the following algorithm:
Use original (improved) algorithm to find zero or more unique values and zero or more non-splittable subsets. Stop when there are no more non-splittable subsets.
For any of these non-splittable subsets, try to split it while increasing the number of "splitting pointers". When split is found, continue with step 1.
Worst case complexity is O(k * m² * n * m^max(0, floor(log(floor(k/4))))), which may be approximated by O(k * n * m^log(k)) = O(k * n * k^log(m)).
The expected run time of this algorithm for small k is a little bit worse than for the probabilistic algorithm, but still not much larger than O(k * m² * n).
One probabilistic approach to take would be to use a counting filter.
The algorithm is as follows:
Linearly scan the array and 'update' the counting filter.
Linearly scan the array and create a collection of all elements which aren't certainly of count 2 in the filter; this will contain <= k of the real solutions. (The false positives in this case are unique elements which look like they aren't unique.)
Choose a new basis of hash functions and repeat until we have all k solutions.
This uses 2m bits of space (independent of n). The time complexity is more involved, but knowing that the probability that any given unique element is not found in step 2 is approximately (1 - e^(-kn/m))^k, we will resolve to a solution very quickly; unfortunately, we are not quite linear in n.
I appreciate that this doesn't satisfy your constraints, as it is super-linear in time and is probabilistic, but given that the original conditions may not be satisfiable, this approach may be worth considering.
Here is a proper solution for the case k = 3 that takes only minimal amount of space, and the space requirement is O(1).
Let 'transform' be a function that takes an m-bit unsigned integer x and an index i as arguments, with i between 0 .. m - 1. transform takes the integer x to:
x itself, if the ith bit of x is not set;
x ^ (x <<< 1), where <<< denotes barrel shift (rotation), if it is set.
In the following, use T(x, i) as shorthand for transform(x, i).
I now claim that if a, b, c are three distinct m-bit unsigned integers and a', b', c' are another three distinct m-bit unsigned integers such that a XOR b XOR c == a' XOR b' XOR c', but the sets {a, b, c} and {a', b', c'} are two different sets, then there is an index i such that T(a, i) XOR T(b, i) XOR T(c, i) differs from T(a', i) XOR T(b', i) XOR T(c', i).
To see this, let a' == a XOR a'', b' == b XOR b'' and c' == c XOR c'', i.e. let a'' denote the XOR of a and a' etc. Because a XOR b XOR c equals a' XOR b' XOR c' at every bit, it follows that a'' XOR b'' XOR c'' == 0. This means that at every bit position, either a', b', c' are identical to a, b, c, or exactly two of them have the bit at the chosen position flipped (0->1 or 1->0). Because a', b', c' differ from a, b, c, let P be any bit position where there have been two bit flips. We proceed to show that T(a', P) XOR T(b', P) XOR T(c', P) differs from T(a, P) XOR T(b, P) XOR T(c, P). Assume without loss of generality that a' has bit flip compared to a, b' has bit flip compared to b, and c' has the same bit value as c at this position P.
In addition to the bit position P, there must be another bit position Q where a' and b' differ (otherwise the sets do not consist of three distinct integers, or flipping the bit at position P does not create a new set of integers, a case that does not need to be considered). The XOR of the barrel-rotated version of the bit at position Q creates a parity error at bit position (Q + 1) mod m, which leads to the claim that T(a', P) XOR T(b', P) XOR T(c', P) differs from T(a, P) XOR T(b, P) XOR T(c, P). The actual value of c' does not impact the parity error, obviously.
Hence, the algorithm is to
run through the input array, and calculate (1) the XOR of all elements, and (2) the XOR of T(x, i) for all elements x and i between 0 .. m - 1
search in constant space for three 32-bit integers a, b, c such that a XOR b XOR c and T(a, i) XOR T(b, i) XOR T(c, i) for all valid values of i match those calculated from the array
This works obviously because the duplicate elements get cancelled out of the XOR operations, and for the remaining three elements the reasoning above holds.
I IMPLEMENTED THIS and it works. Here is source code of my test program, which uses 16-bit integers for speed.
#include <iostream>
#include <stdlib.h>
#include <time.h>

using namespace std;

/* CONSTANTS */
#define BITS 16
#define MASK ((1L << (BITS)) - 1)
#define N MASK
#define D 500
#define K 3
#define ARRAY_SIZE (D * 2 + K)

/* INPUT ARRAY */
unsigned int A[ARRAY_SIZE];

/* 'transform' function */
unsigned int bmap(unsigned int x, int idx) {
    if (idx == 0) return x;
    if ((x & (1L << (idx - 1))) != 0)
        x ^= (x << (BITS - 1) | (x >> 1));
    return (x & MASK);
}

/* Number of valid index values to 'transform'. Note that here
   index 0 is used to get plain XOR. */
#define NOPS 17

/* Fill in the array --- for testing. */
void fill() {
    int used[N], i, j;
    unsigned int r;
    for (i = 0; i < N; i++) used[i] = 0;
    for (i = 0; i < D * 2; i += 2) {
        do { r = random() & MASK; } while (used[r]);
        A[i] = A[i + 1] = r;
        used[r] = 1;
    }
    for (j = 0; j < K; j++) {
        do { r = random() & MASK; } while (used[r]);
        A[i++] = r;
        used[r] = 1;
    }
}

/* ACTUAL PROCEDURE */
void solve() {
    int i, j;
    unsigned int acc[NOPS];
    for (j = 0; j < NOPS; j++) { acc[j] = 0; }
    for (i = 0; i < ARRAY_SIZE; i++) {
        for (j = 0; j < NOPS; j++)
            acc[j] ^= bmap(A[i], j);
    }
    /* Search for the three unique integers */
    unsigned int e1, e2, e3;
    for (e1 = 0; e1 < N; e1++) {
        for (e2 = e1 + 1; e2 < N; e2++) {
            e3 = acc[0] ^ e1 ^ e2; // acc[0] is the xor of the 3 elements
            /* Enforce increasing order for speed */
            if (e3 <= e2 || e3 <= e1) continue;
            for (j = 0; j < NOPS; j++) {
                if (acc[j] != (bmap(e1, j) ^ bmap(e2, j) ^ bmap(e3, j)))
                    goto reject;
            }
            cout << "Solved elements: " << e1
                 << ", " << e2 << ", " << e3 << endl;
            exit(0);
        reject:
            continue;
        }
    }
}

int main() {
    srandom(time(NULL));
    fill();
    solve();
}
I presume you know k in advance
I chose Squeak Smalltalk as the implementation language.
inject:into: is reduce and is O(1) in space, O(N) in time
select: is filter, (we don't use it because O(1) space requirement)
collect: is map, (we don't use it because O(1) space requirement)
do: is forall, and is O(1) in space, O(N) in time
a block in square brackets is a closure, or a pure lambda if it doesn't close over any variable and doesn't use return; the symbols prefixed with colons are the parameters.
^ means return
For k=1 the singleton is obtained by reducing the sequence with bit xor
So we define a method xorSum in class Collection (thus self is the sequence)
Collection>>xorSum
    ^self inject: 0 into: [:sum :element | sum bitXor: element]
and a second method
Collection>>find1Singleton
    ^{self xorSum}
We test it with
self assert: {0. 3. 5. 2. 5. 4. 3. 0. 2.} find1Singleton = {4}
The cost is O(N), space O(1)
For k=2, we search two singletons, (s1,s2)
Collection>>find2Singleton
    | sum lowestBit s1 s2 |
    sum := self xorSum.
    "sum is different from 0 and is equal to (s1 bitXor: s2), the xor of the two singletons.
     Split at the lowest set bit of sum, and xor both sequences like you proposed: you get the 2 singletons."
    lowestBit := sum bitAnd: sum negated.
    s1 := s2 := 0.
    self do: [:element |
        (element bitAnd: lowestBit) = 0
            ifTrue: [s1 := s1 bitXor: element]
            ifFalse: [s2 := s2 bitXor: element]].
    ^{s1. s2}
and
self assert: {0. 1. 1. 3. 5. 6. 2. 6. 4. 3. 0. 2.} find2Singleton sorted = {4. 5}
The cost is 2*O(N), space O(1)
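For comparison, the same k = 2 split rendered in Python (my own sketch of the classic technique):

from functools import reduce

def find_two_singletons(a):
    s = reduce(lambda x, y: x ^ y, a)   # s = s1 ^ s2, nonzero
    low = s & -s                        # lowest set bit where s1, s2 differ
    s1 = s2 = 0
    for x in a:
        if x & low:
            s1 ^= x
        else:
            s2 ^= x
    return s1, s2

print(sorted(find_two_singletons([0, 1, 1, 3, 5, 6, 2, 6, 4, 3, 0, 2])))  # [4, 5]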
For k=3,
We define a specific class implementing a slight variation of the xor split; in fact we use a ternary split: the mask can match value1 or value2, and any other value is ignored.
Object subclass: #BinarySplit
    instanceVariableNames: 'sum1 sum2 size1 size2'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'SO'.
with these instance methods:
sum1
    ^sum1

sum2
    ^sum2

size1
    ^size1

size2
    ^size2

split: aSequence withMask: aMask value1: value1 value2: value2
    sum1 := sum2 := size1 := size2 := 0.
    aSequence do: [:element |
        (element bitAnd: aMask) = value1
            ifTrue:
                [sum1 := sum1 bitXor: element.
                size1 := size1 + 1].
        (element bitAnd: aMask) = value2
            ifTrue:
                [sum2 := sum2 bitXor: element.
                size2 := size2 + 1]].

doesSplitInto: s1 and: s2
    ^(sum1 = s1 and: [sum2 = s2])
        or: [sum1 = s2 and: [sum2 = s1]]
And this class side method, a sort of constructor to create an instance
split: aSequence withMask: aMask value1: value1 value2: value2
    ^self new split: aSequence withMask: aMask value1: value1 value2: value2
Then we compute:
Collection>>find3SingletonUpToBit: m
    | sum split split2 mask value1 value2 |
    sum := self xorSum.
But this doesn't give any information on the bit to split... So we try each bit i = 0..m-1.
    0 to: m - 1 do: [:i |
        split := BinarySplit split: self withMask: 1 << i value1: 1 << i value2: 0.
If you obtain (sum1, sum2) = (0, sum), then you unluckily got the 3 singletons in the same bag... So repeat until you get something different.
Otherwise, you'll get a bag with s1 (the one with odd size) and another with s2, s3 (even size), so just apply the algorithm for k=1 (s1 = sum1) and for k=2 with a modified bit pattern.
        (split doesSplitInto: 0 and: sum)
            ifFalse:
                [split size1 odd
                    ifTrue:
                        [mask := (split sum2 bitAnd: split sum2 negated) + (1 << i).
                        value1 := (split sum2 bitAnd: split sum2 negated).
                        value2 := 0.
                        split2 := BinarySplit split: self withMask: mask value1: value1 value2: value2.
                        ^{split sum1. split2 sum1. split2 sum2}]
                    ifFalse:
                        [mask := (split sum1 bitAnd: split sum1 negated) + (1 << i).
                        value1 := (split sum1 bitAnd: split sum1 negated) + (1 << i).
                        value2 := 1 << i.
                        split2 := BinarySplit split: self withMask: mask value1: value1 value2: value2.
                        ^{split sum2. split2 sum1. split2 sum2}]]]
And we test it with
self assert: ({0. 1. 3. 5. 6. 2. 6. 4. 3. 0. 2.} find3SingletonUpToBit: 32) sorted = {1. 4. 5}
The worst-case cost is (M+1)*O(N).
For k=4,
When we split, we can have (0,4) or (1,3) or (2,2) singletons in the two bags.
(2,2) is easy to recognize: both sizes are even, and both xor sums are different from 0; case solved.
(0,4) is easy to recognize: both sizes are even, and at least one sum is zero, so repeat the search with an incremented bit pattern on the bag whose sum != 0.
(1,3) is harder, because both sizes are odd, and we fall back to the case of an unknown number of singletons... Though we can easily recognize the single singleton if an element of the bag is equal to the xor sum, which is impossible with 3 different numbers...
We can generalize for k=5... but above that it will be hard, because we must find a trick for the cases (4,2) and (1,5). Remember our hypothesis: we must know k in advance... We would have to form hypotheses and verify them afterward...
If you have a counterexample, just submit it; I will check it with the above Smalltalk implementation.
EDIT: I committed the code (MIT license) at http://ss3.gemstone.com/ss/SONiklasBContest.html
With space complexity requirements, loosen to O(m * n), this task can be easily solved in O(n) time. Just count the number of instances for each element using a hash table, then filter entries with counter equal to one. Or use any distributive sorting algorithm.
But here is a probabilistic algorithm, having lighter space requirements.
This algorithm uses an additional bitset of size s. For each value in the input array, a hash function is computed. This hash function determines an index in the bitset. The idea is to scan the input array, toggling the corresponding bit in the bitset for each array entry. Duplicate entries toggle the same bit twice. Bits toggled by the unique entries (almost all of them) remain set in the bitset. This is practically the same as a counting Bloom filter, where the only bit used in each counter is the least significant one.
Scanning the array once more, we may extract unique values (excluding some false negatives) as well as some duplicate values (false positives).
The bitset should be sparse enough to give as few false positives as possible, to decrease the number of unneeded duplicate values and therefore to decrease the space complexity. An additional benefit of high sparseness of the bitset is a decreased number of false negatives, which improves the run time a little bit.
To determine the optimal size for the bitset, distribute the available space evenly between the bitset and the temporary array containing both unique values and false positives (assuming k << n): s = n * m * k / s, which gives s = sqrt(n * m * k). And the expected space requirement is O(sqrt(n * m * k)).
Scan input array and toggle bits in the bitset.
Scan input array and filter elements having corresponding nonzero bit in the bitset, write them to temporary array.
Use any simple approach (distribution sort or hash) to exclude duplicates from temporary array.
If size of temporary array plus the number of unique elements known so far is less than k, change hash function, clear the bitset and toggle bits, corresponding to known unique values, continue with step 1.
Expected time complexity is somewhere between O(n * m) and O(n * m * log(n * m * k) / log(n * m / k)).
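A Python sketch of steps 1-4 above (my own illustrative code: the hash family, the fixed bitset size s, and the function name are assumptions, not part of the answer; it also assumes the array really contains exactly k unique values):

import random

def find_uniques(a, k, s=1 << 16):
    known = set()
    while len(known) < k:
        salt = random.getrandbits(32)
        h = lambda x: (x ^ salt) * 2654435761 % s    # assumed simple hash family
        bits = bytearray(s)
        for x in a:                                   # step 1: toggle bits
            if x not in known:
                bits[h(x)] ^= 1
        # step 2: candidates = elements whose bit survived the toggling
        candidates = {x for x in a if x not in known and bits[h(x)]}
        # step 3: exclude duplicates among the candidates via exact counts
        counts = {}
        for x in a:
            if x in candidates:
                counts[x] = counts.get(x, 0) + 1
        known |= {x for x, c in counts.items() if c == 1}
        # step 4: if we still have fewer than k, retry with a new hash
    return known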
Your algorithm is not O(n): there is no guarantee of dividing the numbers into two equal-size groups at each step, and since there is no bound on the size of your numbers (they are not related to n), there is no limit on the number of steps. If your input number sizes are unbounded (independent of n), your algorithm's run time can be ω(n). Assume the numbers below are m bits wide, and only their first n bits can differ:
(suppose m > 2n)
---- n bits --- ---- m-n bits --
111111....11111 00000....00000
111111....11111 00000....00000
111111....11110 00000....00000
111111....11110 00000....00000
....
100000....00000 00000....00000
Your algorithm will run over the first m-n bits, and it will be O(n) at each step, so by this point you have already reached O((m-n)*n), which is bigger than O(n^2).
PS: if you always have 32-bit numbers, your algorithm is O(n), and it is not hard to prove this.
This is just an intuition, but I think the solution is to increase the number of partitions you evaluate until you find one where its xor sum is not zero.
For instance, for every two bits (x, y) in the range [0, m), consider the partitions defined by the value of a & ((1 << x) | (1 << y)). In the 32-bit case, that results in 32*32*4 = 4096 partitions, and it allows correctly solving the case where k = 4.
The interesting thing now would be to find a relation between k and the number of partitions required to solve the problem, that would also allow us to calculate the complexity of the algorithm. Another open question is if there are better partitioning schemas.
Some Perl code to illustrate the idea:
my $m = 10;
my @a = (0, 2, 4, 6, 8, 10, 12, 14, 15, 15, 7, 7, 5, 5);
my %xor;
my %part;
for my $a (@a) {
    for my $i (0..$m-1) {
        my $shift_i = 1 << $i;
        my $bit_i = ($a & $shift_i ? 1 : 0);
        for my $j (0..$m-1) {
            my $shift_j = 1 << $j;
            my $bit_j = ($a & $shift_j ? 1 : 0);
            my $k = "$i:$bit_i,$j:$bit_j";
            $xor{$k} ^= $a;
            push @{$part{$k} //= []}, $a;
        }
    }
}
print "list: @a\n";
for my $k (sort keys %xor) {
    if ($xor{$k}) {
        print "partition with unique elements $k: @{$part{$k}}\n";
    }
    else {
        # print "partition without unique elements detected $k: @{$part{$k}}\n";
    }
}
The solution to the former problem (finding unique uint32 numbers in O(N) with O(1) memory usage) is quite simple, though not particularly fast:
#include <stdio.h>
#include <stdint.h>
typedef uint32_t uint32;

void unique(int n, uint32 *a) {
    uint32 i = 0;
    do {
        int j, count;
        for (count = j = 0; j < n; j++) {
            if (a[j] == i) count++;
        }
        if (count == 1) printf("%u appears only once\n", (unsigned int)i);
    } while (++i);
}
For the case where the number of bits M is not limited, complexity becomes O(N*M*2^M) and memory usage is still O(1).
Update: the complementary solution using a bitmap results in complexity O(N*M) and memory usage O(2^M):
void unique(int n, uint32 *a) {
    unsigned char seen[1 << (32 - 8)];
    unsigned char dup[1 << (32 - 8)];
    int i;
    memset(seen, 0, sizeof(seen));
    memset(dup, 0, sizeof(dup));
    for (i = 0; i < n; i++) {
        if (bitmap_get(seen, a[i])) {
            bitmap_set(dup, a[i], 1);
        }
        else {
            bitmap_set(seen, a[i], 1);
        }
    }
    for (i = 0; i < n; i++) {
        if (bitmap_get(seen, a[i]) && !bitmap_get(dup, a[i])) {
            printf("%u appears only once\n", (unsigned int)a[i]);
            bitmap_set(seen, a[i], 0);
        }
    }
}
Interestingly, both approaches can be combined by dividing the 2^M space into bands. Then you will have to iterate over all the bands, and inside every band find unique values using the bit vector technique.
Two approaches would work.
(1) Create a temporary hash table where keys are the integers and values are the number
of repetitions. Of course, this would use more space than specified.
(2) Sort the array (or a copy) and then count the number of cases where array[n+2] == array[n].
Of course, this would use more time than specified.
I'll be very surprised to see a solution that satisfies the original constraints.

Finding pairs with smallest XOR values from a list

I am working on a problem in which I am expected to take the xor of
all the pairs of integers in an array and then find the K smallest
integers produced by the xor'ing. The size of the array can be N = 100000,
so K can be quite large, but it is limited to 250000.
For example,
if N=5 and K=4,
our array is {1 3 2 4 2}
The numbers resulting from xoring the pairs (1 and 3, 1 and 2, 1 and 4, 1 and 2, 3 and 2, 3 and 4, 3 and 2, etc.):
3 3 2 5 0 1 6 1 6 7
Since K=4, we have to print 4 smallest integers.
so the answer would be 0 1 1 2.
Since the time limit is 2 seconds and very tight, the brute force approach
of xoring all the pairs would time out. My approach was wrong, so I need
help. Maybe we can exploit the limit K <= 250000; I want to know if it is
possible to get the K smallest numbers without xoring all the pairs.
(x ^ y) == (x | y) - (x & y) >= |y - x|
Sorting your numbers in order would be a start, because the difference between the pairs will give you a lower bound for the xor, and therefore a cutoff point for when to stop looking for numbers to xor x with.
There is also a shortcut to looking for pairs of numbers whose xor is less than (say) a power of 2, because you're only interested in x <= y <= x | (2 ^ N - 1). If this doesn't give you enough pairs, increase N and try again.
EDIT: You can of course exclude the pairs of numbers that you already found whose xor is less than the previous power of 2, by using x | (2 ^ (N - 1) - 1) < y <= x | (2 ^ N) - 1.
Example based on (sorted) [1, 2, 2, 3, 4]
Start by looking for pairs of numbers whose xor is less than 1: for each number x, search for subsequent numbers y = x. This gives {2, 2}.
If you need more than one pair, look for pairs of numbers whose xor is less than 2 but not less than 1: for each number x, search for numbers x < y <= x | 1. This gives {2, 3} (twice).
Note that the final xor values aren't quite sorted, but each batch is strictly less than the previous batch.
If you need more than that, look for pairs of numbers whose xor is less than 4 but not less than 2: for each number x, search for numbers x | 1 < y <= x | 3. This gives {1, 2} (twice); {1, 3}.
If you need more than that, look for pairs of numbers whose xor is less than 8 but not less than 4: for each number x, search for numbers x | 3 < y <= x | 7. This gives {1, 4}; {2, 4} (twice); {3, 4}.
Notice that if all the bits to the left of bit n (counting from the right) of numbers x and y are equal, then x xor y <= 2^n - 1.
x = 0000000000100110
y = 0000000000110010
          ^ Everything from here to the left is equal,
so x xor y <= 2^5 - 1 = 31.
This can be exploited by storing every number in a bitwise trie - that is, a trie where every edge is either a 0 or a 1. Then x xor y <= 2^d(x,y) - 1, where d(x,y) is the number of steps we need to move up to find the least common ancestor of x and y.
          root
    (left-most bit)
         0
        /
       0
      /
    ...
      1
     / \
    0   1
   /   /
  0   0
 ... ...
 /   /
0   0
x   y
x and y share an ancestor node that is 5 levels up, so d(x,y) is 5
Once you have the trie, it's easy to find all pairs such that d(x,y) = 1 - just navigate to all nodes 1 level above the leaves, and compare each of that node's children to each other. Those values will give you a max x xor y of 2^1 - 1 = 1.
If you still don't have k values, then move up to all nodes 2 levels above the leaves, and compare each of that node's grandchildren to each other†. Those values will give you a max x xor y of 2^2 - 1 = 3.
† (Actually, you only need to compare each of the leaves in the left-subtree with each of the leaves in the right-subtree, since each of the leaves in a given subtree have already been compared against each other)
Continue this until, after checking all nodes for a given level, you have at least k values of x xor y. Then sort that list of values, and take the k smallest.
When k is small (<< n^2), this algorithm is O(n). For large k, it is O(2^b * n), where b is the number of bits per integer (assuming there are not many duplicates).
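A compact Python sketch of this level-by-level search (my own code: grouping on x >> s plays the role of the trie nodes s levels above the leaves, and only left-subtree values are paired with right-subtree values, per the † note above):

from collections import Counter, defaultdict

def k_smallest_xors(a, k, bits=32):
    res = []
    # level 0: equal numbers pair up with xor == 0
    for v, m in Counter(a).items():
        res.extend([0] * (m * (m - 1) // 2))
    # level s: pairs agreeing above bit s-1 and differing at bit s-1;
    # their xors lie in [2^(s-1), 2^s), so batches arrive in increasing order
    for s in range(1, bits + 1):
        if len(res) >= k:
            break
        nodes = defaultdict(lambda: ([], []))
        for x in a:
            nodes[x >> s][(x >> (s - 1)) & 1].append(x)
        res.extend(sorted(x ^ y for lo, hi in nodes.values()
                                for x in lo for y in hi))
    return res[:k]

print(k_smallest_xors([1, 3, 2, 4, 2], 4))  # [0, 1, 1, 2]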
I would approach this by first sorting the input array of integers. Then, the pairs with the smallest xor values will be next to each other (but not all adjacent pairs will have the smallest xor values). You can start with adjacent pairs, then work outwards, checking pairs (N, N+2), (N, N+3), until you have reached your desired list of K smallest results.
For your sample array {1 3 2 4 2}, the sorted array is {1 2 2 3 4} and the pairwise xor values are:
1 xor 2 = 3
2 xor 2 = 0
2 xor 3 = 1
3 xor 4 = 7
For the next step,
1 xor 2 = 3
2 xor 3 = 1
2 xor 4 = 6
and again,
1 xor 3 = 2
2 xor 4 = 6
finally,
1 xor 4 = 5
This idea isn't complete, but you should be able to use it to help construct a full solution.
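Combining this with the bound (x ^ y) >= |y - x| from the previous answer yields a sound stopping rule for the widening search. A rough Python sketch (my own code; O(n^2) in the worst case):

import heapq

def k_smallest_xors_sorted(a, k):
    b = sorted(a)
    n = len(b)
    vals = []
    for gap in range(1, n):
        if len(vals) >= k:
            # x ^ y >= y - x, and b[i+gap] - b[i] only grows with gap, so once
            # the smallest possible difference beats our current k-th best
            # xor, no wider pair can improve the answer
            kth_best = max(heapq.nsmallest(k, vals))
            if min(b[i + gap] - b[i] for i in range(n - gap)) > kth_best:
                break
        vals.extend(b[i] ^ b[i + gap] for i in range(n - gap))
    return heapq.nsmallest(k, vals)

print(k_smallest_xors_sorted([1, 3, 2, 4, 2], 4))  # [0, 1, 1, 2]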
