Find number occuring even number of times? [duplicate] - algorithm

Given is an array of integers. Each number in the array repeats an ODD number of times, but only 1 number is repeated for an EVEN number of times. Find that number.
I was thinking a hash map, with each element's count. It requires O(n) space. Is there a better way?

Hash-map is fine, but all you need to store is each element's count modulo 2.
All of those will end up being 1 (odd) except for the 0 (even) -count element.
(As Aleks G says you don't need to use arithmetic (count++ %2), only xor (count ^= 0x1); although any compiler will optimize that anyway.)

I don't know what the intended meaning of "repeat" is, but if there is an even number of occurrences of (all-1) numbers, and an odd number of occurances for only one number, then XOR should do the trick.

You don't need to keep the number of times each element is found - just whether it's even or odd number of time - so you should be ok with 1 bit for each element. Start with 0 for each element, then flip the corresponding bit when you encounter the element. Next time you encounter it, flip the bit again. At the end, just check which bit is 1.

If all numbers are repeated even times and one number repeats odd times, if you XOR all of the numbers, the odd count repeated number can be found.
By your current statement I think hashmap is good idea, but I'll think about it to find a better way. (I say this for positive integers.)

Apparently there is a solution in O(n) time and O(1) space, since it was asked at a software engineer company with this constraint explictly. See here : Bing interview question -- it seems to be doable using XOR over numbers in the array. Good luck ! :)

Related

Linear algorithm on binary strings

I'm going through some old midterms to study. (None of the solutions are given)
I've come across this problem which I'm stuck on
Let n = 2ℓ − 1 for some positive integer ℓ. Suppose someone claims to hold an array A[1.. n] of
distinct ℓ-bit strings; thus, exactly one ℓ-bit string does not appear in A. Suppose further that the
only way we can access A is by calling the function FetchBit(i, j), which returns the jth bit of the string A[i] in O(1) time.
Describe an algorithm to find the missing string in A using only O(n) calls to FetchBit.
The only thing I can think of is go through each string, convert it to base 10, sort them all and then see which value is missing. But that's certainly not O(n)
Proof it's not homework... http://web.engr.illinois.edu/~jeffe/teaching/algorithms/hwex/f12/midterm1.pdf
You can do it in 2n operations.
First, look at the first bit of every number. Obviously, you will get 2ℓ-1 zeros and 2ℓ-1-1 ones ore vice versa (because only one number is missing). If there is 2ℓ-1-1 ones then you know that the first bit of the missing number is one, otherwise it is zero.
Now you know the first bit of a missing number. Let's look at all numbers which have the same first bit (there are 2ℓ-1-1 of them) and repeat the same procedure with their second bit. This way you will determine the second bit of the missing number, and so on.
The total number of FetchBit calls will be 2ℓ-1 + 2ℓ-1-1 + ... + 21-1 <= 2ℓ+1 <= 2n+2 = O(n).

Given n-1*n array, find missing number

Here each row contains a bit representation of a number.These numbers come from 1..N Exactly one number is missing.Find the bit representation of the missing number.
The interviewer asked me this question.
I said: "We can find the sum of the given numbers and subtract it from the sum of first n numbers(which we know as (N*(N+1))/2)"
He said that involves changing from base 10 to base 2.
Can you give me a hint on how I can solve it without changing bases?
You can XOR together all numbers from 0..N range, then XOR the numbers from the array. The result will be the missing number.
Explanation: XORing a number with itself always results in zero. The algorithm above XORs each number exactly twice, except for the missing one. The missing number will be XOR-ed with zero exactly once, so the result is going to equal the missing number.
Note: the interviewer is wrong on needing to convert bases in order to do addition: adding binary numbers is easy and fun - in fact, computers do it all the time :-)
You can just XOR these numbers together, and XOR with 1..n. The fact that the numbers are stored in binary is a good hint, BTW.
In fact, any commutative operator with a inverse should work, since if the operator is commutative, the order does not matter, so it can be applied to the numbers you have and 1..n, with the difference being the first one is not operated on the number that is not in the array. Then you can use its inverse to find that number, with the two results you have. SO + and -, * and /, XOR and XOR and any other operators that meets the requirement all should work here.

Distinct digit count

Is is possible to count the distinct digits in a number in constant time O(1)?
Suppose n=1519 output should be 3 as there are 3 distinct digits(1,5,9).
I have done it in O(N) time but anyone knows how to find it in O(1) time?
I assume N is the number of digits of n. If the size of n is unlimited, it can't be done in general in O(1) time.
Consider the number n=11111...111, with 2 trillion digits. If I switch one of the digits from a 1 to a 2, there is no way to discover this without in some way looking at every single digit. Thus processing a number with 2 trillion digits must take (of the order of) 2 trillion operations at least, and in general, a number with N digits must take (of the order of) N operations at least.
However, for almost all numbers, the simple O(N) algorithm finishes very quickly because you can just stop as soon as you get to 10 distinct digits. Almost all numbers of sufficient length will have all 10 digits: e.g. the probability of not terminating with the answer '10' after looking at the first 100 digits is about 0.00027, and after the first 1000 digits it's about 1.7e-45. But unfortunately, there are some oddities which make the worst case O(N).
After seeing that someone really posted a serious answer to this question, I'd rather repeat my own cheat here, which is a special case of the answer described by #SimonNickerson:
O(1) is not possible, unless you are on radix 2, because that way, every number other than 0 has both 1 and 0, and thus my "solution" works not only for integers...
EDIT
How about 2^k - 1? Isn't that all 1s?
Drat! True... I should have known that when something seems so easy, it is flawed somehow... If I got the all 0 case covered, I should have covered the all 1 case too.
Luckily this case can be tested quite quickly (if addition and bitwise AND are considered an O(1) operation): if x is the number to be tested, compute y this way: y=(x+1) AND x. If y=0, then x=2^k - 1. because this is the only case when all the bits needed to be flipped by the addition. Of course, this is quite a bit flawed, as with bit lengths exceeding the bus width, the bitwise operators are not O(1) anymore, but rather O(N).
At the same time, I think it can be brought down to O(logN), by breaking the number into bus width size chunks, and AND-ing together the neighboring ones, repeating until only one is left: if there were no 0s in the number tested, the last one will be full 1s too...
EDIT2: I was wrong... This is still O(N).

Finding a number that repeats even no of times where all the other numbers repeat odd no of times

Given is an array of integers. Each number in the array repeats an ODD number of times, but only 1 number is repeated for an EVEN number of times. Find that number.
I was thinking a hash map, with each element's count. It requires O(n) space. Is there a better way?
Hash-map is fine, but all you need to store is each element's count modulo 2.
All of those will end up being 1 (odd) except for the 0 (even) -count element.
(As Aleks G says you don't need to use arithmetic (count++ %2), only xor (count ^= 0x1); although any compiler will optimize that anyway.)
I don't know what the intended meaning of "repeat" is, but if there is an even number of occurrences of (all-1) numbers, and an odd number of occurances for only one number, then XOR should do the trick.
You don't need to keep the number of times each element is found - just whether it's even or odd number of time - so you should be ok with 1 bit for each element. Start with 0 for each element, then flip the corresponding bit when you encounter the element. Next time you encounter it, flip the bit again. At the end, just check which bit is 1.
If all numbers are repeated even times and one number repeats odd times, if you XOR all of the numbers, the odd count repeated number can be found.
By your current statement I think hashmap is good idea, but I'll think about it to find a better way. (I say this for positive integers.)
Apparently there is a solution in O(n) time and O(1) space, since it was asked at a software engineer company with this constraint explictly. See here : Bing interview question -- it seems to be doable using XOR over numbers in the array. Good luck ! :)

Finding a single number in a list [duplicate]

This question already has answers here:
How to find the only number in an array that doesn't occur twice [duplicate]
(5 answers)
Closed 7 years ago.
What would be the best algorithm for finding a number that occurs only once in a list which has all other numbers occurring exactly twice.
So, in the list of integers (lets take it as an array) each integer repeats exactly twice, except one. To find that one, what is the best algorithm.
The fastest (O(n)) and most memory efficient (O(1)) way is with the XOR operation.
In C:
int arr[] = {3, 2, 5, 2, 1, 5, 3};
int num = 0, i;
for (i=0; i < 7; i++)
num ^= arr[i];
printf("%i\n", num);
This prints "1", which is the only one that occurs once.
This works because the first time you hit a number it marks the num variable with itself, and the second time it unmarks num with itself (more or less). The only one that remains unmarked is your non-duplicate.
By the way, you can expand on this idea to very quickly find two unique numbers among a list of duplicates.
Let's call the unique numbers a and b. First take the XOR of everything, as Kyle suggested. What we get is a^b. We know a^b != 0, since a != b. Choose any 1 bit of a^b, and use that as a mask -- in more detail: choose x as a power of 2 so that x & (a^b) is nonzero.
Now split the list into two sublists -- one sublist contains all numbers y with y&x == 0, and the rest go in the other sublist. By the way we chose x, we know that a and b are in different buckets. We also know that each pair of duplicates is still in the same bucket. So we can now apply ye olde "XOR-em-all" trick to each bucket independently, and discover what a and b are completely.
Bam.
O(N) time, O(N) memory
HT= Hash Table
HT.clear()
go over the list in order
for each item you see
if(HT.Contains(item)) -> HT.Remove(item)
else
ht.add(item)
at the end, the item in the HT is the item you are looking for.
Note (credit #Jared Updike): This system will find all Odd instances of items.
comment: I don't see how can people vote up solutions that give you NLogN performance. in which universe is that "better" ?
I am even more shocked you marked the accepted answer s NLogN solution...
I do agree however that if memory is required to be constant, then NLogN would be (so far) the best solution.
Kyle's solution would obviously not catch situations were the data set does not follow the rules. If all numbers were in pairs the algorithm would give a result of zero, the exact same value as if zero would be the only value with single occurance.
If there were multiple single occurance values or triples, the result would be errouness as well.
Testing the data set might well end up with a more costly algorithm, either in memory or time.
Csmba's solution does show some errouness data (no or more then one single occurence value), but not other (quadrouples). Regarding his solution, depending on the implementation of HT, either memory and/or time is more then O(n).
If we cannot be sure about the correctness of the input set, sorting and counting or using a hashtable counting occurances with the integer itself being the hash key would both be feasible.
I would say that using a sorting algorithm and then going through the sorted list to find the number is a good way to do it.
And now the problem is finding "the best" sorting algorithm. There are a lot of sorting algorithms, each of them with its strong and weak points, so this is quite a complicated question. The Wikipedia entry seems like a nice source of info on that.
Implementation in Ruby:
a = [1,2,3,4,123,1,2,.........]
t = a.length-1
for i in 0..t
s = a.index(a[i])+1
b = a[s..t]
w = b.include?a[i]
if w == false
puts a[i]
end
end
You need to specify what you mean by "best" - to some, speed is all that matters and would qualify an answer as "best" - for others, they might forgive a few hundred milliseconds if the solution was more readable.
"Best" is subjective unless you are more specific.
That said:
Iterate through the numbers, for each number search the list for that number and when you reach the number that returns only a 1 for the number of search results, you are done.
Seems like the best you could do is to iterate through the list, for every item add it to a list of "seen" items or else remove it from the "seen" if it's already there, and at the end your list of "seen" items will include the singular element. This is O(n) in regards to time and n in regards to space (in the worst case, it will be much better if the list is sorted).
The fact that they're integers doesn't really factor in, since there's nothing special you can do with adding them up... is there?
Question
I don't understand why the selected answer is "best" by any standard. O(N*lgN) > O(N), and it changes the list (or else creates a copy of it, which is still more expensive in space and time). Am I missing something?
Depends on how large/small/diverse the numbers are though. A radix sort might be applicable which would reduce the sorting time of the O(N log N) solution by a large degree.
The sorting method and the XOR method have the same time complexity. The XOR method is only O(n) if you assume that bitwise XOR of two strings is a constant time operation. This is equivalent to saying that the size of the integers in the array is bounded by a constant. In that case you can use Radix sort to sort the array in O(n).
If the numbers are not bounded, then bitwise XOR takes time O(k) where k is the length of the bit string, and the XOR method takes O(nk). Now again Radix sort will sort the array in time O(nk).
You could simply put the elements in the set into a hash until you find a collision. In ruby, this is a one-liner.
def find_dupe(array)
h={}
array.detect { |e| h[e]||(h[e]=true; false) }
end
So, find_dupe([1,2,3,4,5,1]) would return 1.
This is actually a common "trick" interview question though. It is normally about a list of consecutive integers with one duplicate. In this case the interviewer is often looking for you to use the Gaussian sum of n-integers trick e.g. n*(n+1)/2 subtracted from the actual sum. The textbook answer is something like this.
def find_dupe_for_consecutive_integers(array)
n=array.size-1 # subtract one from array.size because of the dupe
array.sum - n*(n+1)/2
end

Resources