Find unique number among 3n+1 numbers [duplicate] - algorithm

This question already has answers here:
Finding an element in an array that isn't repeated a multiple of three times?
(4 answers)
Closed 7 years ago.
I was asked this question in an interview.
There are 3n+1 numbers: n of them occur in triplets (exactly three times each), and only one occurs a single time. How do we find the unique number in linear time, i.e. O(n)? The numbers are not sorted.
Note that if there were 2n+1 numbers, n of which occur in pairs, we could just XOR all the numbers to find the unique one. The interviewer told me that it can be done by bit manipulation.

Count the number of times that each bit occurs in the set of 3n+1 numbers.
Reduce each bit count modulo 3.
What is left is the bit pattern of the single number.
Oh, dreamzor (above) has beaten me to it.
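The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `find_unique_by_bit_counts` and the `width` parameter are made up here, and the sketch assumes non-negative integers that fit in `width` bits.

```python
def find_unique_by_bit_counts(nums, width=32):
    """Return the value that occurs once when every other value occurs three times.

    Assumes non-negative integers smaller than 2**width.
    """
    result = 0
    for bit in range(width):
        # Count how many numbers have this bit set.
        ones = sum((x >> bit) & 1 for x in nums)
        # Triplets contribute a multiple of 3, so the remainder is the unique number's bit.
        if ones % 3:
            result |= 1 << bit
    return result


print(find_unique_by_bit_counts([1, 2, 3, 1, 5, 2, 9, 9, 3, 1, 2, 3, 9]))
```

The work is `width` passes over the input, so it stays linear in the number of elements as long as the bit width is treated as a constant.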

You can invent a ternary XOR (call it XOR3) that operates in base 3 instead of base 2: it adds the numbers digit by digit and reduces each ternary digit modulo 3 (just as the usual XOR reduces each binary digit modulo 2).
Then, if you XOR3 all the numbers this way (converting them to base 3 first), you will be left with the unique number (in base 3, so you will need to convert it back).
The complexity is not exactly linear, though, because the conversions to and from base 3 take additional time that is logarithmic in the size of the numbers. However, if the range of the numbers is bounded by a constant, then the conversion time is also constant.
Code in C++ (intentionally verbose):
#include <algorithm>
#include <iostream>
#include <random>
#include <vector>
using namespace std;

// Digits are stored least significant first; assumes num >= 0.
vector<int> to_base3(int num) {
    vector<int> base3;
    for (; num > 0; num /= 3) {
        base3.push_back(num % 3);
    }
    return base3;
}

int from_base3(const vector<int> &base3) {
    long long num = 0;
    long long three = 1; // 64-bit, because 3^20 does not fit in a 32-bit int
    for (size_t i = 0; i < base3.size(); ++i, three *= 3) {
        num += base3[i] * three;
    }
    return static_cast<int>(num);
}

int find_unique(const vector<int> &a) {
    // 20 ternary digits cover every non-negative 32-bit int (3^20 > 2^31)
    vector<int> unique_base3(20, 0);
    for (int num : a) {
        vector<int> num_base3 = to_base3(num);
        for (size_t i = 0; i < num_base3.size(); ++i) {
            unique_base3[i] = (unique_base3[i] + num_base3[i]) % 3;
        }
    }
    int unique_num = from_base3(unique_base3);
    return unique_num;
}

int main() {
    vector<int> rands { 1287318, 172381, 5144, 566546, 7123 };
    vector<int> a;
    for (int r : rands) {
        for (int i = 0; i < 3; ++i) {
            a.push_back(r);
        }
    }
    a.push_back(13371337); // unique number
    // random_shuffle was removed in C++17; shuffle takes an explicit generator
    shuffle(a.begin(), a.end(), mt19937(random_device{}()));
    int unique_num = find_unique(a);
    cout << unique_num << endl;
}

byte [] oneCount = new byte [32];
int [] test = {1,2,3,1,5,2,9,9,3,1,2,3,9};
for (int n: test) {
for (int bit = 0; bit < 32; bit++) {
if (((n >> bit) & 1) == 1) {
oneCount[bit]++;
oneCount[bit] = (byte)(oneCount[bit] % 3);
}
}
}
int result = 0;
int x = 1;
for (int bit = 0; bit < 32; bit++) {
result += oneCount[bit] * x;
x = x << 1;
}
System.out.print(result);
Looks like while I was coding, others gave the main idea

Related

Numbers of specified length that can be made using individual elements from an array

Let's say we have given an array of digits, A, and a positive number, B. The problem is to generate all the possible B-digit numbers combined of A's elements.
For example, if A = [0,1,2,3] and B = 2, then the output must be,
[10,11,12,13,20,21,22,23,30,31,32,33]
Generate all possible 2-digit numbers by combining pairs of array elements in nested for loops: the first element is multiplied by 10 and the second element is added to it.
Check that each generated number is at least 10, so that it is a valid two-digit number.
#include <iostream>
int main() {
    int A[4] = {0, 1, 2, 3};
    const size_t n = sizeof(A) / sizeof(A[0]);
    // Two nested loops, so this handles B = 2 only.
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            int k = A[i] * 10 + A[j]; // use the element A[j], not the index j
            if (k >= 10)              // skip numbers with a leading zero
                std::cout << k << '\n';
        }
    }
}
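The nested loops above are tied to B = 2. As a rough sketch of how the same idea extends to any B, the Python below enumerates every length-B selection of digits with `itertools.product` and skips selections with a leading zero. The function name `b_digit_numbers` is invented for this illustration, and Python is used only for brevity.

```python
from itertools import product


def b_digit_numbers(digits, b):
    """Return all b-digit numbers whose digits are drawn, with repetition, from `digits`."""
    numbers = []
    for combo in product(digits, repeat=b):
        if b > 1 and combo[0] == 0:
            continue  # a leading zero would make the number shorter than b digits
        value = 0
        for d in combo:
            value = value * 10 + d
        numbers.append(value)
    return numbers


print(b_digit_numbers([0, 1, 2, 3], 2))
```

`product` yields the selections in the order of the input digits, so a sorted digit array gives the numbers in increasing order.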

How do I compute the sum of differences in C++ given an array of N integers?

Problem Statement: https://www.codechef.com/ZCOPRAC/problems/ZCO13001
My code fails test cases 4 and 5 with a 2.01 second runtime. I cannot figure out the problem with my code:
#include<iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
int summation(long long int a[], int n, int count)
{
long long int sum=0;
int i;
if(count==n)
return 0;
else
{
for(i=count; i<n; i++)
sum+=a[i]-a[count];
}
return sum+summation(a, n, count+1);
}
int main()
{
int n, i;
long long int sum;
scanf("%d", &n);
long long int a[n];
for(i=0; i<n; i++)
scanf("%lld", &a[i]);
sort(a, a+n);
sum=summation(a, n, 0);
printf("%lld\n", sum);
return 0;
}
Thanks!
First of all, you are on the right track in sorting the numbers, but the complexity of your algorithm is O(n^2). What you want is an O(n) pass over the sorted array.
I'm only going to give you a hint, after that how you use it is up to you.
Let us take the example given on the site you specified itself i.e. 3,10,3,5. You sort these elements to get 3,3,5,10. Now what specifically are the elements of the sum of the differences in this? They are as follows -
3-3
5-3
10-3
5-3
10-3
10-5
Our result is supposed to be (3-3) + (5-3) + ... + (10-5). Let us approach this expression differently.
3 - 3
5 - 3
10 - 3
5 - 3
10 - 3
10 - 5
Totals: 43 - 20
We get this by adding up the elements on the left side and on the right side of the - sign separately; 43 - 20 = 23.
Now take a variable sum = 0.
You need to make the following observations -
As you can see in these individual differences how many times does the first 3 appear on the right side of the - sign ?
It appears 3 times so let us take sum = -3*3 = -9.
Now for the second 3 , it appears 2 times on the right side of the - sign and 1 time on the left side so we get (1-2)*3 = -3. Adding this to sum we get -12.
Similarly for 5 we have 2 times on the left side and 1 time on the right side. We get (2-1)*5 = 5. Adding this to sum we get -12+5 = -7.
Now for 10 we have it 3 times on the left side i.e. 3*10 so sum is = -7+30 = 23 which is your required answer. Now what you need to consider is how many times does a number appear on the left side and the right side of the - sign. This can be done in O(1) time and for n elements it takes O(n) time.
There you have your answer. Let me know if you don't understand any part of this answer.
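For readers who want to check their own attempt against the hint, here is one possible reading of it as a Python sketch. It relies on the observation that, in the sorted array, the element at 0-based index i appears i times on the left of a - sign and n - 1 - i times on the right. The function name `sum_of_differences` is an assumption of this example.

```python
def sum_of_differences(values):
    """Sum of |x - y| over all unordered pairs, via one pass over the sorted values."""
    a = sorted(values)
    n = len(a)
    total = 0
    for i, x in enumerate(a):
        # x is the larger element in i pairs and the smaller element in n - 1 - i pairs.
        total += x * (i - (n - 1 - i))
    return total


print(sum_of_differences([3, 10, 3, 5]))
```

For the sample 3, 10, 3, 5 the per-element contributions are -9, -3, 5 and 30, which matches the running sum worked out above.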
Your code works, but there are two issues.
Using recursion will eventually run out of stack space. I ran your code for n=200000 (upper limit in Code Chef problem) and got stack overflow.
I converted the recursion to an equivalent loop. This hit the second issue - it takes a long time. It is doing 2*10^5 * (2*10^5 - 1) / 2 cycles which is 2*10^10 roughly. Assuming a processor can run 10^9 cycles a second, you're looking at 20 seconds.
To fix the time issue, look for duplicates in team strength value. Instead of adding the same strength value (val) each time it appears in the input, add it once and keep a count of how many times it was found (dup). Then, when calculating the contribution of the pair (i,j), multiply a[i].val - a[j].val by the number of times this combo appeared in raw input, which is the product of the two dup values a[i].dup * a[j].dup.
Here's the revised code, using Strength struct to hold the strength value & the number of times it occurred. I didn't have a handy input file, so used random number generator with range 1,100. By cycling through only the unique strength values, the total number of cycles is greatly reduced.
#include<iostream>
#include<cstdio>
#include<algorithm>
#include<random>
using namespace std;
long long int codechef_sum1(long long int a[], int n, int count)
{
long long int sum = 0;
int i;
if (count == n)
return 0;
else
{
for (i = count; i<n; i++)
sum += a[i] - a[count];
}
return sum + codechef_sum1(a, n, count + 1);
}
long long int codechef_sum2a(long long int a[], int n)
{
long long int sum = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < i; j++)
sum += (a[i] - a[j]);
return sum;
}
struct Strength
{
long long int val;
int dup;
//bool operator()(const Strength& lhs, const Strength & rhs) { return lhs.val < rhs.val; }
bool operator<(const Strength & rhs) const { return this->val < rhs.val; }
};
long long int codechef_sum2b(Strength a[], int n)
{
long long int sum = 0;
for (int i = 0; i < n; i++)
{
for (int j = 0; j < i; j++)
sum += (a[i].val - a[j].val) * ((long long)a[i].dup * a[j].dup);
}
return sum;
}
int codechef_sum_test(int n)
{
std::default_random_engine generator;
std::uniform_int_distribution<int> distr(1, 100);
auto a1 = new long long int[n];
auto a2 = new Strength [n];
int dup = 0, num = 0;
for (int i = 0; i < n; i++)
{
int r = distr(generator);
a1[i] = r;
int dup_index = -1;
for (int ii = 0; ii < num; ii++)
{
if (a2[ii].val == r)
{
dup++;
dup_index = ii;
break;
}
}
if (dup_index == -1)
{
a2[num].val = r;
a2[num].dup = 1;
++num;
}
else
{
++a2[dup_index].dup;
}
}
sort(a1, a1 + n);
sort(a2, a2 + num);
auto sum11 = codechef_sum1(a1, n, 0);
auto sum12 = codechef_sum2a(a1, n);
auto sum2 = codechef_sum2b(a2, num);
printf("sum11=%lld, sum12=%lld\n", sum11, sum12);
printf("sum2=%lld, dup=%d, num=%d\n", sum2, dup, num);
delete[] a1;
delete[] a2;
return 0;
}
int main()
{
codechef_sum_test(50);
}
Here's my solution with a quicker algorithm.
Everything below is spoilers, in case you wanted to solve it yourself.
--
long long int summation_singlepass(long long int a[], int n)
{
long long int grand_total=0;
long long int iteration_sum, prev_iteration_sum=0;
int i;
for (i = 1; i < n; i++) {
iteration_sum = prev_iteration_sum + i * ( a[i] - a[i-1] );
grand_total += iteration_sum;
prev_iteration_sum = iteration_sum;
}
return grand_total;
}
--
To figure out an algorithm, take a few simple but meaningful cases. Then work through them step by step yourself. This usually gives good insights.
For example: 1,3,6,6,8 (after sorting)
Third element in series. Its sum of differences to previous elements is:
(6-1) + (6-3) = 8
Fourth element in series. No change! Sum of differences to previous elements is:
(6-1) + (6-3) + (6-6) = 8
Fifth element in series. Pattern emerges when compared to formula for third and fourth. Sum of differences to previous elements is:
(8-1) + (8-3) + (8-6) + (8-6) = 16
So it's an extra 2 for each prior element in the series. 2 is the difference between our current element (8) and the previous one (6).
To generalize this effect, derive the current iteration sum as the previous iteration sum + (i - 1) * ( a[i] - a[i-1] ), where i is our current (1-based) position in the series.
Note that the formula looks slightly different in code compared to how I wrote it above. This is because in C++ we're working with 0-based indices for arrays, so the multiplier (i - 1) becomes i.
Also, if you want to keep tweaking the solution you posted in the OP, change the return type of summation to long long int so that larger inputs are handled without the running total being truncated.
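As a sanity check on the recurrence, the sketch below restates the single-pass idea in Python and compares it with a brute-force sum over all pairs. Both function names are made up for this example, and the brute-force version is only there for verification on small inputs.

```python
from itertools import combinations


def sum_of_differences_running(values):
    """Single pass over the sorted values, carrying the previous iteration sum forward."""
    a = sorted(values)
    grand_total = 0
    iteration_sum = 0  # sum of differences between a[i] and all earlier elements
    for i in range(1, len(a)):
        # Each of the i earlier elements is now (a[i] - a[i-1]) further away.
        iteration_sum += i * (a[i] - a[i - 1])
        grand_total += iteration_sum
    return grand_total


def brute_force(values):
    """Quadratic reference implementation."""
    return sum(abs(x - y) for x, y in combinations(values, 2))


sample = [1, 3, 6, 6, 8]
print(sum_of_differences_running(sample), brute_force(sample))
```

For the sample series the iteration sums are 2, 8, 8 and 16, the same values derived by hand above.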

Counting tilings of a rectangle

I am trying to solve this problem but I can't find a solution:
A board consisting of squares arranged into N rows and M columns is given. A tiling of this board is a pattern of tiles that covers it. A tiling is interesting if:
only tiles of size 1x1 and/or 2x2 are used;
each tile of size 1x1 covers exactly one whole square;
each tile of size 2x2 covers exactly four whole squares;
each square of the board is covered by exactly one tile.
For example, the following images show a few interesting tilings of a board of size 4 rows and 3 columns:
http://dabi.altervista.org/images/task.img.4x3_tilings_example.gif
Two interesting tilings of a board are different if there exists at least one square on the board that is covered with a tile of size 1x1 in one tiling and with a tile of size 2x2 in the other. For example, all tilings shown in the images above are different.
Write a function
int count_tilings(int N, int M);
that, given two integers N and M, returns the remainder modulo 10,000,007 of the number of different interesting tilings of a board of size N rows and M columns.
Assume that:
N is an integer within the range [1..1,000,000];
M is an integer within the range [1..7].
For example, given N = 4 and M = 3, the function should return 11, because there are 11 different interesting tilings of a board of size 4 rows and 3 columns:
http://dabi.altervista.org/images/task.img.4x3_tilings_all.gif
For (4,3) the result is 11; for (6,5) the result is 1213.
I tried the following but it doesn't work:
static public int count_tilings ( int N,int M ) {
int result=1;
if ((N==1)||(M==1)) return 1;
result=result+(N-1)*(M-1);
int max_tiling= (int) ((int)(Math.ceil(N/2))*(Math.ceil(M/2)));
System.out.println(max_tiling);
for (int i=2; i<=(max_tiling);i++){
if (N>=2*i){
int n=i+(N-i);
int k=i;
//System.out.println("M-1->"+(M-1) +"i->"+i);
System.out.println("(M-1)^i)->"+(Math.pow((M-1),i)));
System.out.println( "n="+n+ " k="+k);
System.out.println(combinations(n, k));
if (N-i*2>0){
result+= Math.pow((M-1),i)*combinations(n, k);
}else{
result+= Math.pow((M-1),i);
}
}
if (M>=2*i){
int n=i+(M-i);
int k=i;
System.out.println("(N-1)^i)->"+(Math.pow((N-1),i)));
System.out.println( "n="+n+ " k="+k);
System.out.println(combinations(n, k));
if (M-i*2>0){
result+= Math.pow((N-1),i)*combinations(n, k);
}else{
result+= Math.pow((N-1),i);
}
}
}
return result;
}
static long combinations(int n, int k) {
/*binomial coefficient*/
long coeff = 1;
for (int i = n - k + 1; i <= n; i++) {
coeff *= i;
}
for (int i = 1; i <= k; i++) {
coeff /= i;
}
return coeff;
}
Since this is homework I won't give a full solution, but I'll give you some hints.
First here's a recursive solution:
class Program
{
// Important note:
// The value of masks given here is hard-coded for m == 5.
// In a complete solution, you need to calculate the masks for the
// actual value of m given. See explanation in answer for more details.
int[] masks = { 0, 3, 6, 12, 15, 24, 27, 30 };
int CountTilings(int n, int m, int s = 0)
{
if (n == 1) { return 1; }
int result = 0;
foreach (int mask in masks)
{
if ((mask & s) == 0)
{
result += CountTilings(n - 1, m, mask);
}
}
return result;
}
public static void Main()
{
Program p = new Program();
int result = p.CountTilings(6, 5);
Console.WriteLine(result);
}
}
See it working online: ideone
Note that I've added an extra parameter s. This stores the contents of the first column. If the first column is empty, s = 0. If the first column contains some filled squares the corresponding bits in s are set. Initially s = 0, but when a 2 x 2 tile is placed, this fills up some squares in the next column, and that will mean that s will be non-zero in the recursive call.
The masks variable is hard-coded but in a complete solution it needs to be calculated based on the actual value of m. The values stored in masks make more sense if you look at their binary representations:
00000
00011
00110
01100
01111
11000
11011
11110
In other words, it's all the ways of setting disjoint pairs of adjacent bits in a binary number with m bits (including setting none). You can write some code to generate all these possibilities. Or, since there are only 7 possible values of m, you could also just hard-code all seven possibilities for masks.
There are, however, two serious problems with the recursive solution:
It will overflow the stack for large values of N.
It requires exponential time to calculate. It is incredibly slow even for small values of N.
Both these problems can be solved by rewriting the algorithm to be iterative. Keep m constant and initialize the result for n = 1 for all possible values of s to be 1. This is because if you only have one column you must use only 1x1 tiles, and there is only one way to do this.
Now you can calculate n = 2 for all possible values of s by using the results from n = 1. This can be repeated until you reach n = N. This algorithm completes in linear time with respect to N, and requires constant space.
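The iterative version described above could look roughly like the following Python sketch. It is one possible reading of the hints, not the answerer's own code: `pair_masks` and the dictionary-based table are choices made for this illustration, n is taken to be the long dimension (up to 1,000,000) and m the short one (up to 7), and the modulus is the 10,000,007 given in the question.

```python
MOD = 10_000_007


def pair_masks(m):
    """All m-bit masks whose set bits form disjoint adjacent pairs (0 included)."""
    masks = []

    def extend(pos, mask):
        if pos >= m - 1:  # no room left for another pair
            masks.append(mask)
            return
        extend(pos + 1, mask)               # leave bit `pos` clear
        extend(pos + 2, mask | (3 << pos))  # set bits pos and pos + 1

    extend(0, 0)
    return sorted(masks)


def count_tilings(n, m):
    """Number of 1x1 / 2x2 tilings of an n-by-m board, modulo MOD."""
    masks = pair_masks(m)
    # For every state s, the masks that do not overlap the already-filled squares.
    compatible = {s: [t for t in masks if t & s == 0] for s in masks}
    ways = {s: 1 for s in masks}  # one column left: only 1x1 tiles fit
    for _ in range(n - 1):
        ways = {s: sum(ways[t] for t in compatible[s]) % MOD for s in masks}
    return ways[0]


print(count_tilings(4, 3), count_tilings(6, 5))
```

Only the table for the previous value of n is kept, so the memory use depends on m alone, while the running time grows linearly with n.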
Here is a recursive solution:
// time used : 27 min
#include <set>
#include <vector>
#include <iostream>
using namespace std;
void placement(int n, set< vector <int> > & p){
for (int i = 0; i < n -1 ; i ++){
for (set<vector<int> > :: iterator j = p.begin(); j != p.end(); j ++){
vector <int> temp = *j;
if (temp[i] == 1 || temp[i+1] == 1) continue;
temp[i] = 1; temp[i+1] = 1;
p.insert(temp);
}
}
}
vector<vector<int> > placement( int n){
if (n > 7) throw "error";
set <vector <int> > p;
vector <int> temp (n,0);
p.insert (temp);
for (int i = 0; i < 3; i ++) placement(n, p);
vector <vector <int> > s;
s.assign (p.begin(), p.end());
return s;
}
bool tryput(vector <vector <int> > &board, int current, vector<int> & comb){
for (int i = 0; i < comb.size(); i ++){
if ((board[current][i] == 1 || board[current+1][i]) && comb[i] == 1) return false;
}
return true;
}
void put(vector <vector <int> > &board, int current, vector<int> & comb){
for (int i = 0; i < comb.size(); i ++){
if (comb[i] == 1){
board[current][i] = 1;
board[current+1][i] = 1;
}
}
return;
}
void undo(vector <vector <int> > &board, int current, vector<int> & comb){
for (int i = 0; i < comb.size(); i ++){
if (comb[i] == 1){
board[current][i] = 0;
board[current+1][i] = 0;
}
}
return;
}
int place (vector <vector <int> > &board, int current, vector < vector <int> > & all_comb){
int m = board.size();
if (current >= m) throw "error";
if (current == m - 1) return 1;
int count = 0;
for (int i = 0; i < all_comb.size(); i ++){
if (tryput(board, current, all_comb[i])){
put(board, current, all_comb[i]);
count += place(board, current+1, all_comb) % 10000007;
undo(board, current, all_comb[i]);
}
}
return count;
}
int place (int m, int n){
if (m == 0) return 0;
if (m == 1) return 1;
vector < vector <int> > all_comb = placement(n);
vector <vector <int> > board(m, vector<int>(n, 0));
return place (board, 0, all_comb);
}
int main(){
cout << place(3, 4) << endl;
return 0;
}
Time complexity: O(n^3 * exp(m)).
To reduce the space usage, try a bit vector.
To reduce the time complexity to O(m*(n^3)), try dynamic programming.
To reduce the time complexity to O(log(m) * n^3), try divide and conquer + dynamic programming.
Good luck.

find number that does not repeat in O(n) time O(1) space

For starters, I did have a look at these questions:
Given an array of integers where some numbers repeat 1 time, some numbers repeat 2 times and only one number repeats 3 times, how do you find the number that repeat 3 times
Algorithm to find two repeated numbers in an array, without sorting
This one is different:
Given an unsorted array of integers with one unique number, where each of the other numbers repeats 3 times,
e.g.:
{4,5,3, 5,3,4, 1, 4,3,5 }
we need to find this unique number in O(n) time and O(1) space.
NOTE: this is not homework, just a nice question I came across.
What about this one:
Idea: do bitwise addition mod 3
#include <stdio.h>
int main() {
int a[] = { 1, 9, 9, 556, 556, 9, 556, 87878, 87878, 87878 };
int n = sizeof(a) / sizeof(int);
int low = 0, up = 0;
for(int i = 0; i < n; i++) {
int x = ~(up & a[i]);
up &= x;
x &= a[i];
up |= (x & low);
low ^= x;
}
printf("single no: %d\n", low);
}
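To make the bit tricks in the loop above easier to follow, here is the same two-bitmask idea restated as a Python sketch with one comment per transition. For every bit position, the pair (up, low) acts as a counter that cycles 0 -> 1 -> 2 -> 0. The function name `single_number` is an assumption of this example.

```python
def single_number(nums):
    """Return the value that appears once when all other values appear three times."""
    low = 0  # bits whose running count is 1 (mod 3)
    up = 0   # bits whose running count is 2 (mod 3)
    for value in nums:
        x = value & ~up  # bits of value whose counter is not currently 2
        up &= ~value     # counter 2 -> 0 wherever value has a 1
        up |= x & low    # counter 1 -> 2
        low ^= x         # counter 0 -> 1, and 1 -> 0 for the bits that just moved to 2
    return low


print(single_number([1, 9, 9, 556, 556, 9, 556, 87878, 87878, 87878]))
```

Only two extra integers are kept regardless of the input length, which is what gives the O(1) space bound.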
This solution works for all inputs, provided the modulus matches the repeat count (see the comment in the code).
The idea is to extract the bits of each integer from the array and add them to the respective slot of a 32-bit
bitmap 'b' (implemented as a 32-element array, one counter per bit of a 32-bit number).
#include <stdio.h>
unsigned int a[7] = {5,5,4,10,4,9,9};
unsigned int b[32] = {0}; //Start with zeros for a 32bit no.
int main1(void) {
    int i, j;
    unsigned int bit, sum = 0;
    for (i = 0; i < 7; i++) {
        for (j = 0; j < 32; j++) { //This loop can be optimized!!!!
            bit = (a[i] >> j) & 1u; //extract bit j
            b[j] += bit; //add to the bitmap array
        }
    }
    for (j = 0; j < 32; j++) {
        b[j] %= 2; //The other numbers in this sample repeat exactly 2 times; use 3 when they repeat 3 times.
        if (b[j] == 1) {
            sum += 1u << j; //sum all the digits left as 1 to get no
        }
    }
    printf("no. is %u\n", sum);
    return 0;
}

Permutation with repetition without allocate memory

I'm looking for an algorithm to generate all permutations with repetition of 4 elements in a list (length 2-1000).
Java implementation
The problem is that the algorithm from the link above allocates too much memory for the calculation. It creates an array whose length is the number of all possible combinations, e.g. 4^1000 for my example, so I get a heap space exception.
Thank you
Generalized algorithm for lazily-evaluated generation of all permutations (with repetition) of length X for a set of choices Y:
for I = 0 to (Y^X - 1):
list_of_digits = calculate the digits of I in base Y
a_set_of_choices = possible_choices[D] for each digit D in list_of_digits
yield a_set_of_choices
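The pseudocode above maps fairly directly onto a Python generator. The sketch below is one way to write it; the function name `permutations_with_repetition` is chosen for this example, and each result is produced on demand from its index, so nothing but the current tuple is held in memory.

```python
def permutations_with_repetition(choices, length):
    """Lazily yield every length-`length` sequence over `choices`, in counting order."""
    base = len(choices)
    for index in range(base ** length):
        # Write `index` in base `base`, padded to `length` digits.
        digits = []
        n = index
        for _ in range(length):
            digits.append(n % base)
            n //= base
        # digits are least significant first; reverse for the usual reading order
        yield tuple(choices[d] for d in reversed(digits))


for p in permutations_with_repetition("GATC", 2):
    print("".join(p))
```

In Python, the standard library's `itertools.product(choices, repeat=length)` produces the same sequences lazily, so the hand-written version is mainly useful for seeing how the index-to-digits step works.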
If there is no length limit for the repetition of your 4 symbols, there is a very simple algorithm that will give you what you want. Just encode your string as a binary number in which each 2-bit pattern encodes one of the four symbols. To get all possible permutations with repetition, you just have to enumerate ("count") all possible numbers. That can take quite long (more than the age of the universe), as 1000 symbols will be 2000 bits long. Is it really what you want to do? The heap overflow may not be the only limit...
Below is a trivial C implementation that enumerates all repetitions of length exactly n without allocating memory (n is limited to 16000 if cell is a 32-bit unsigned, and to 4000 with the unsigned char cell used here). I leave to the reader the exercise of enumerating all repetitions of at most length n.
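#include <stdio.h>
typedef unsigned char cell;
cell a[1000];
int npack = sizeof(cell)*4; /* symbols per cell, 2 bits each */
void decode(cell * a, int nbsym)
{
    int i;
    for (i = 0; i < nbsym; i++){
        printf("%c", "GATC"[(a[i/npack] >> ((i%npack)*2)) & 3]);
    }
    printf("\n");
}
void enumerate(cell * a, int nbsym)
{
    unsigned i, j;
    unsigned ncells = (nbsym + npack - 1) / npack;      /* cells actually used */
    unsigned last_bits = ((nbsym - 1) % npack + 1) * 2; /* bits used in the last cell */
    for (i = 0; i < 1000; i++){
        a[i] = 0;
    }
    for (;;){
        decode(a, nbsym);
        j = 0; /* j was read uninitialized in the original loop condition */
        while (j < ncells && !++a[j]){ /* add 1, propagating the carry */
            j++;
        }
        if (j == ncells){
            break; /* carried out of the last cell */
        }
        if (j == ncells - 1 && last_bits < 8 * sizeof(cell)
            && (a[j] >> last_bits)){
            break; /* counted past the last symbol */
        }
    }
}
int main(void){
    enumerate(a, 5);
    return 0;
}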
You know how to count: add 1 to the ones spot, if you go over 9 jump back to 0 and add 1 to the tens, etc..
So, if you have a list of length N with K items in each spot:
int[] permutations = new int[N];
boolean addOne() { // Returns true when it advances, false _once_ when finished
int i = 0;
permutations[i]++;
while (permutations[i] >= K) {
permutations[i] = 0;
i += 1;
if (i>=N) return false;
permutations[i]++;
}
return true;
}
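The same counting idea can be sketched as a Python generator that keeps a single counter list and advances it in place. The name `odometer` and the parameters `k` (items per spot) and `n` (list length) are assumptions of this illustration; they mirror K and N in the snippet above.

```python
def odometer(k, n):
    """Yield every length-n tuple over range(k), advancing one counter in place."""
    counter = [0] * n
    while True:
        yield tuple(counter)
        i = 0
        while i < n:
            counter[i] += 1
            if counter[i] < k:
                break          # no carry needed, this spot simply advanced
            counter[i] = 0     # roll over and carry into the next spot
            i += 1
        else:
            return             # the carry ran off the end: every tuple was produced


for combo in odometer(2, 3):
    print(combo)
```

As in the decimal analogy, position 0 is the fastest-changing spot, so the tuples come out with the leftmost entry cycling first.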
