Maximum XOR value faster than just using XOR - algorithm

Given a number N and an array of A integers (all numbers less than 2^15; A, the size of the array, is 100000).
Find the maximum XOR value of N and an integer from the array.
Q is the number of queries (50000), and start, stop is the range in the array.
Input:
A Q
a1 a2 a3 ...
N start stop
Output:
Maximum XOR value of N and an integer in the array with the range specified.
Eg: Input
15 2 (2 is no of queries)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
10 6 10 (Query 1)
10 6 10 (Query 2)
Output:
13
13
Code:
for(int i=start-1;i<stop;i++){
int t =no[i]^a;
if(maxxor<t)
maxxor=t;
}
cout << maxxor <<endl;
I need an algorithm 10-100 times faster than this. Sorting is too expensive. I have also tried binary trees and bit manipulation.
How about a 2x-3x improvement? Is that possible by optimization?

It is possible to develop a faster algorithm.
Let's call the bits of N a[0], a[1], ..., a[15], from most to least significant. E.g. if N = 13 = 00000000 00001101 (in binary), then a[0] = a[1] = ... = a[11] = 0, a[12] = 1, a[13] = 1, a[14] = 0, a[15] = 1.
The main idea of the algorithm is the following: if a[0] == 1, then the best possible array element has this bit zeroed. If a[0] == 0, then the best possible element has a one at this position.
So first you check whether some number in the range has the desired bit. If yes, you consider only numbers with this bit. If no, you take its inverse.
Then you process the other bits in the same manner. E.g. if a[0] == 1, a[1] == 0, you first check whether there is a number beginning with zero; if yes, you check whether there is a number beginning with 01. If nothing begins with zero, you check whether there is a number beginning with 11. And so on...
So you need a fast algorithm to answer the following query: is there a number beginning with bits ... in the range start, stop?
One possibility: construct a trie from the binary representation of the numbers. In each node store all positions where this prefix occurs in the array (and sort them). Answering the query is then a simple walk through this trie. To check whether there is a suitable prefix in the start, stop range, do a binary search over the array stored in the node.
This leads to an algorithm with complexity O(lg M * lg A) per query (one binary search over at most A positions at each of the lg M = 15 trie levels), which is much faster than a linear scan.
Here is the code, it hasn't been tested much, may contain bugs:
#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;
class TrieNode {
public:
TrieNode* next[2];
vector<int> positions;
TrieNode() {
next[0] = next[1] = NULL;
}
bool HasNumberInRange(int start, int stop) {
vector<int>::iterator it = lower_bound(
positions.begin(), positions.end(), start);
if (it == positions.end()) return false;
return *it < stop;
}
};
void AddNumberToTrie(int number, int index, TrieNode* base) {
TrieNode* cur = base;
// Go through all binary digits from most significant
for (int i = 14; i >= 0; i--) {
int digit = 0;
if ((number & (1 << i)) != 0) digit = 1;
cur->positions.push_back(index);
if (cur->next[digit] == NULL) {
cur->next[digit] = new TrieNode;
}
cur = cur->next[digit];
}
cur->positions.push_back(index);
}
int FindBestNumber(int a, int start, int stop, TrieNode* base) {
int best_num = 0;
TrieNode* cur = base;
for (int i = 14; i >= 0; i--) {
int digit = 1;
if ((a & (1 << i)) != 0) digit = 0;
if (cur->next[digit] == NULL ||
!cur->next[digit]->HasNumberInRange(start, stop))
digit = 1 - digit;
best_num *= 2;
best_num += digit;
cur = cur->next[digit];
}
return best_num;
}
int main() {
int n; scanf("%d", &n);
int q; scanf("%d", &q);
TrieNode base;
for (int i = 0; i < n; i++) {
int x; scanf("%d", &x);
AddNumberToTrie(x, i, &base);
}
for (int i = 0; i < q; i++) {
int a, start, stop;
// Finds the value at some index i, start <= i < stop, whose XOR with a is as big as possible
// Indexes are 0-based and stop is exclusive
scanf("%d %d %d", &a, &start, &stop);
printf("%d\n", FindBestNumber(a, start, stop, &base)^a);
}
}

Your algorithm runs in linear time (O(stop - start), or O(A) for the full range, A being the array size). If you can't assume that the input array already has a special ordering, you probably won't be able to get a single query any faster.
You can only try to optimize the overhead within the loop, but that surely won't give you a significant increase in speed.
edit:
It seems you have to search the same list multiple times, but with different start and end indexes.
That means that pre-sorting the array is also out of the question, because that would change the order of the elements; start and end would be meaningless.
What you could try to do is avoid processing the same range twice if one query fully contains an already scanned range.
Or maybe try to consider all queries simultaneously while iterating through the array.

If you have multiple queries with the same range, you can build a tree with the numbers in that range like this:
Use a binary tree of depth 15 where the numbers are at the leaves and a number corresponds to the path that leads to it (left is 0 and right is 1).
e.g. for 0 1 4 7:
        /  \
      /     / \
    / \   /     \
   0   1 4       7
Then if your query is N = n_1 n_2 n_3 … n_15, where n_1 is the first bit of N, n_2 the second, and so on:
Go from the root to a leaf and when you have to make a choice if n_i = 0 (where i is the depth of the current node) then go to the right, else go to the left. When you are on the leaf, it is the max leaf.
Original Answer for one query:
Your algorithm is optimal, you need to check all numbers in the array.
There may be a way to have a slightly faster program by using programming tricks, but it has no link with the algorithm.

I just came up with a solution that requires O(A*logA*logM) time and space for preprocessing, and O(logA*logM) time for each query. M is the range of the integers, 2^15 in this problem.
For the
1st..Ath number, (Tree Group 1)
1st..(A/2)th number, (A/2)th..Ath number, (Tree Group 2)
1st..(A/4)th number, (A/4)th..(A/2)th number, (A/2)th..(3A/4)th, (3A/4)th..Ath, (Tree Group 3)
......., (Tree Group 4)
.......,
......., (Tree Group logA)
construct a binary trie of the binary representation of all numbers in the range. There would be about 2A trees. A tree that includes x numbers has at most logM*x nodes, and each number is included in only one tree in each Tree Group, so each group holds at most A*logM nodes and all trees aggregated hold no more than O(A*logA*logM) nodes.
For each query, you can split the range into several ranges (no more than 2logA) that we have already processed into a tree. For each tree, we can find the maximum XOR value in O(logM) time (explained below). That is O(logA*logM) time.
How to find the maximum in a tree? Simply prefer the 1 child if the current digit is 0 in N, otherwise prefer the 0 child. If the preferred child exists, continue to that child, otherwise to the other.
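The tree groups above amount to a segment tree in which every node owns a binary trie of the numbers in its index range. Here is a minimal C++ sketch of that idea, not the answerer's own code. It assumes 0-based half-open ranges [start, stop), non-empty queries, and values below 2^15. RangeXorIndex and its member names are made up for illustration, and the iterative segment-tree layout is just one possible choice.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical helper type: a segment tree whose nodes each own a binary trie
// of the values in their index range.
struct RangeXorIndex {
    static constexpr int kBits = 15;        // assumption: all values are below 2^15
    int n;                                  // number of array elements
    std::vector<std::array<int, 2>> trie;   // shared pool of trie nodes; child 0 means "absent"
    std::vector<int> root;                  // root[v]: trie root owned by segment-tree node v

    explicit RangeXorIndex(const std::vector<int>& values)
        : n(static_cast<int>(values.size())), root(2 * values.size(), 0) {
        trie.push_back(std::array<int, 2>{{0, 0}});   // pool index 0 is the null node
        for (int v = 1; v < 2 * n; ++v) root[v] = newNode();
        // The leaf for position i is node n + i; add the value to every ancestor.
        for (int i = 0; i < n; ++i)
            for (int v = n + i; v >= 1; v /= 2) insert(root[v], values[i]);
    }

    int newNode() {
        trie.push_back(std::array<int, 2>{{0, 0}});
        return static_cast<int>(trie.size()) - 1;
    }

    void insert(int cur, int value) {
        for (int b = kBits - 1; b >= 0; --b) {
            const int bit = (value >> b) & 1;
            if (trie[cur][bit] == 0) {
                const int created = newNode();  // may reallocate, so index again afterwards
                trie[cur][bit] = created;
            }
            cur = trie[cur][bit];
        }
    }

    // Largest x ^ value over one non-empty trie: prefer the child opposite to x's bit.
    int bestInTrie(int cur, int x) const {
        int best = 0;
        for (int b = kBits - 1; b >= 0; --b) {
            int want = ((x >> b) & 1) ^ 1;
            if (trie[cur][want] == 0) want ^= 1;  // preferred child missing: take the other
            else best |= 1 << b;                  // this bit of the XOR becomes 1
            cur = trie[cur][want];
        }
        return best;
    }

    // Maximum of x ^ values[i] for start <= i < stop (0-based, range must be non-empty).
    int query(int x, int start, int stop) const {
        int best = 0;
        for (int l = start + n, r = stop + n; l < r; l /= 2, r /= 2) {
            if (l & 1) best = std::max(best, bestInTrie(root[l++], x));
            if (r & 1) best = std::max(best, bestInTrie(root[--r], x));
        }
        return best;
    }
};
```

For the example in the question (array 1..15, N = 10, 1-based range 6..10), the call would be query(10, 5, 10). The pool holds on the order of A*logA*15 trie nodes, so this trades memory for query speed.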

Yeah, or you could just calculate it directly and not waste time thinking about how to do it better.
int maxXor(int l, int r) {
int highest_xor = 0;
int base = l;
int tbase = l;
int val = 0;
int variance = 0;
do
{
while(tbase + variance <= r)
{
val = base ^ (tbase + variance);
if(val > highest_xor)
{
highest_xor = val;
}
variance += 1;
}
base +=1;
variance = 0;
}while(base <= r);
return highest_xor;
}

Related

SUM exactly using K elements solution

Problem: Given an array of N numbers, find a subset of size M (exactly M elements) whose sum equals SUM.
I am looking for a Dynamic Programming (DP) solution for this problem, basically to understand the matrix-filling approach. I wrote the program below but didn't add memoization, as I am still wondering how to do that.
#include <stdio.h>
#define SIZE(a) sizeof(a)/sizeof(a[0])
int binary[100];
int a[] = {1, 2, 5, 5, 100};
void show(int* p, int size) {
int j;
for (j = 0; j < size; j++)
if (p[j])
printf("%d\n", a[j]);
}
void subset_sum(int target, int i, int sum, int *a, int size, int K) {
if (sum == target && !K) {
show(binary, size);
} else if (sum < target && i < size) {
binary[i] = 1;
subset_sum(target, i + 1, sum + a[i], a, size, K-1);
binary[i] = 0;
subset_sum(target, i + 1, sum, a, size, K);
}
}
int main() {
int target = 10;
int K = 2;
subset_sum(target, 0, 0, a, SIZE(a), K);
}
Does the recurrence below make sense?
Let DP[SUM][j][k] be true if we can sum up to SUM with exactly k elements picked from elements 0 to j.
DP[i][j][k] = DP[i][j-1][k] || DP[i-a[j]][j-1][k-1] { input array a[0....j] }
Base cases are:
DP[0][0][0] = DP[0][j][0] = DP[0][0][k] = 1
DP[i][0][0] = DP[i][j][0] = 0
It means we either consider the current element ( DP[i-a[j]][j-1][k-1] ) or we don't consider it ( DP[i][j-1][k] ). If we consider the current element, k is reduced by 1, which reduces the number of elements that still need to be picked; if the current element is not considered, k is not reduced.
Your solution looks right to me.
Right now, you're basically backtracking over all possibilities and printing each solution. If you only want one solution, you could add a flag that you set when one solution was found and check before continuing with recursive calls.
For memoization, you should first get rid of the binary array, after which you can do something like this:
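// Every entry must be set to -1 (not computed yet) before the first call.
// The +1 leaves room for i == size, sum == target and K == MAX_K.
int memo[NUM_ELEMENTS + 1][MAX_SUM + 1][MAX_K + 1];
bool subset_sum(int target, int i, int sum, int *a, int size, int K) {
    if (sum == target && !K) {
        memo[i][sum][K] = true;
        return memo[i][sum][K];
    } else if (sum < target && i < size && K > 0) {
        // K > 0 keeps the index K - 1 below from going negative
        if (memo[i][sum][K] != -1)
            return memo[i][sum][K];
        memo[i][sum][K] = subset_sum(target, i + 1, sum + a[i], a, size, K - 1) ||
                          subset_sum(target, i + 1, sum, a, size, K);
        return memo[i][sum][K];
    }
    return false;
}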
Then, look at memo[_all indexes_][target][K]. If this is true, there exists at least one solution. You can store additional information to get you that next solution, or you can iterate with an i from found_index - 1 to 0 and check for which i you have memo[i][sum - a[i]][K - 1] == true. Then recurse on that, and so on. This will allow you to reconstruct the solution using just the memo array.
To my understanding, if only the feasibility of the input has to be checked, the problem can be solved with a two-dimensional state space
bool[][] IsFeasible = new bool[n][k]
where IsFeasible[i][j] is true if and only if there is a subset of the elements 1 to i which sums up to exactly j, for every
1 <= i <= n
1 <= j <= k
and for this state space, the recurrence relation
IsFeasible[i][j] = IsFeasible[i-1][j-a[i]] || IsFeasible[i-1][j]
can be used, where the left-hand side of the or-operator || corresponds to selecting the i-th item and the right-hand side corresponds to not selecting the i-th item. The actual choice of items could be obtained by backtracking or auxiliary information saved during evaluation.
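A bottom-up version of these recurrences can be sketched in C++ as follows. This is only an illustration under stated assumptions: hasSubsetWithSum is a made-up name and the values are assumed to be non-negative. The table keeps the element-count dimension from the question's DP[i][j][k], so that "exactly K elements" is enforced, and drops the per-element index by updating in place.

```cpp
#include <vector>

// Hypothetical helper: true if some subset of exactly `count` elements of
// `values` (assumed non-negative) sums to `target`.
bool hasSubsetWithSum(const std::vector<int>& values, int count, int target) {
    // reachable[c][s]: some c of the elements seen so far sum to exactly s.
    std::vector<std::vector<char>> reachable(
        count + 1, std::vector<char>(target + 1, 0));
    reachable[0][0] = 1;  // the empty selection: 0 elements, sum 0
    for (int v : values) {
        // c runs downwards so that row c - 1 still describes the state
        // before v was considered, i.e. v is used at most once.
        for (int c = count; c >= 1; --c)
            for (int s = target; s >= v; --s)
                if (reachable[c - 1][s - v]) reachable[c][s] = 1;
    }
    return reachable[count][target] != 0;
}
```

With the question's data (a = {1, 2, 5, 5, 100}, target 10, K = 2) this reports that a solution exists (5 + 5). It runs in O(N * K * SUM) time and uses O(K * SUM) memory.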

Efficient way to count subsets with given sum

Given N numbers, I need to count the subsets whose sum is S.
Note: the numbers in the array need not be distinct.
My current code is:
int countSubsets(vector<int> numbers,int sum)
{
vector<int> DP(sum+1);
DP[0]=1;
int currentSum=0;
for(int i=0;i<numbers.size();i++)
{
currentSum+=numbers[i];
for (int j=min(sum,currentSum);j>=numbers[i];j--)
DP[j]+=DP[j - numbers[i]];
}
return DP[sum];
}
Is there a more efficient way than this?
Constraints are :
1 ≤ N ≤ 14
1 ≤ S ≤ 100000
1 ≤ A[i] ≤ 10000
Also, there are 100 test cases in a single file, so please help if there exists a better solution than this one.
N is small: 2^14 = 16384 subsets is a really small number (even 2^20 is only about 1 million), so just iterate over all subsets. Below is a pretty fast way to do that (bit hacking). Treat integers as sets (this enumerates the subsets in lexicographical order):
int length = array.Length;
int subsetCount = 0;
for (int i=0; i<(1<<length); ++i)
{
int currentSet = i;
int tempIndex = length-1;
int currentSum = 0;
while (currentSet > 0) // iterate over bits "from the right side"
{
if ((currentSet & 1) == 1) // if current bit is "1"
currentSum += array[tempIndex];
currentSet >>= 1;
tempIndex--;
}
subsetCount += (currentSum == targetSum) ? 1 : 0;
}
You can use the fact that N is small: it is possible to generate all possible subsets of the given array and check if its sum is S for each of them. The time complexity is O(N * 2 ** N) or O(2 ** N)(it depends on the way of the generation). This solution should be fast enough for the given constraints.
Here is a pseudo code of an O(2 ** N) solution:
result = 0
void generate(int curPos, int curSum):
if curPos == N:
if curSum == S:
result++
return
// Do not take the current element.
generate(curPos + 1, curSum)
// Take it.
generate(curPos + 1, curSum + numbers[curPos])
generate(0, 0)
A faster solution based on the meet in the middle technique:
Let's generate all subsets for the first half of the array using the algorithm described above and put their sums into a map (which maps a sum to the number of subsets that have it; it can be either a hash table or just an array because S is relatively small). This step takes O(2 ** (N / 2)) time.
Now let's generate all subsets for the second half and, for each of them, add the number of subsets that sum up to S - currentSum in the first half (using the map constructed in step 1), where currentSum is the sum of all elements in the current subset. Again, we have O(2 ** (N / 2)) subsets and each of them is processed in O(1).
The total time complexity is O(2 ** (N / 2)).
A pseudo code for this solution:
Map<int, int> count = new HashMap<int, int>() // or an array of size S + 1.
result = 0
void generate1(int[] numbers, int pos, int currentSum):
if pos == numbers.length:
count[currentSum]++
return
generate1(numbers, pos + 1, currentSum)
generate1(numbers, pos + 1, currentSum + numbers[pos])
void generate2(int[] numbers, int pos, int currentSum):
if pos == numbers.length:
result += count[S - currentSum]
return
generate2(numbers, pos + 1, currentSum)
generate2(numbers, pos + 1, currentSum + numbers[pos])
generate1(the first half of numbers, 0, 0)
generate2(the second half of numbers, 0, 0)
If N is odd, the middle element can go to either the first half or to the second one. It doesn't matter where it goes as long as it goes to exactly one of them.
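The meet-in-the-middle pseudo code can be turned into C++ roughly as follows. This is a sketch, not a tuned implementation: collectSubsetSums and countSubsetsMeetInMiddle are made-up names, and a hash map is assumed for the counts (an array of size S + 1 would also work under the stated constraints).

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: records the sum of every subset of values[pos, end).
static void collectSubsetSums(const std::vector<int>& values, std::size_t pos,
                              std::size_t end, long long sum,
                              std::vector<long long>& sums) {
    if (pos == end) {
        sums.push_back(sum);
        return;
    }
    collectSubsetSums(values, pos + 1, end, sum, sums);                // skip values[pos]
    collectSubsetSums(values, pos + 1, end, sum + values[pos], sums);  // take values[pos]
}

// Counts the subsets of `values` whose elements add up to `target`.
long long countSubsetsMeetInMiddle(const std::vector<int>& values, long long target) {
    const std::size_t half = values.size() / 2;
    std::vector<long long> left, right;
    collectSubsetSums(values, 0, half, 0, left);
    collectSubsetSums(values, half, values.size(), 0, right);

    // sum -> number of subsets of the first half with that sum
    std::unordered_map<long long, long long> leftCount;
    for (long long s : left) ++leftCount[s];

    long long result = 0;
    for (long long s : right) {
        const auto it = leftCount.find(target - s);
        if (it != leftCount.end()) result += it->second;
    }
    return result;
}
```

Using find instead of operator[] in the second loop avoids inserting empty entries for sums that never occur in the first half.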

Count the subsequences of length 4 divisible by 9

I need to count the subsequences of length 4 of a string of length n which are divisible by 9.
For example, if the input string is 9999
then cnt=1
My approach is similar to brute force and takes O(n^3). Is there any better approach than this?
If you want to check if a number is divisible by 9, You better look here.
I will describe the method in short:
checkDividedByNine(String pNum) :
If pNum.length < 1
return false
If pNum.length == 1
return toInt(pNum) == 9 or toInt(pNum) == 0;
Sum = 0
For c in pNum:
Sum += toInt(c)
return checkDividedByNine(toString(Sum))
So you can reduce the running time to less than O(n^3).
EDIT:
If you need very fast algorithm, you can use pre-processing in order to save for each possible 4-digit number, if it is divisible by 9. (It will cost you 10000 in memory)
EDIT 2:
Better approach: you can use dynamic programming:
For a string S of length N:
D[i,j,k] = the number of subsequences of length j in the string S[i..N] whose value modulo 9 == k,
where 0 <= k <= 8, 1 <= j <= 4, 1 <= i <= N.
D[i,1,k] = simply count the number of elements in S[i..N] that are = k (mod 9).
D[N,j,k] = if j==1 and (S[N] modulo 9) == k, return 1. Otherwise, 0.
D[i,j,k] = D[i+1,j,k] + D[i+1,j-1,(k-S[i]+9) modulo 9] for j >= 2: either S[i] is skipped, or it is taken as the first digit (a number is congruent to its digit sum modulo 9).
And you return D[1,4,0].
You get a table of size N x 9 x 4.
Thus, the overall running time, assuming calculating modulo takes O(1), is O(n).
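The counting DP described above can be sketched in C++ as follows. This is an illustration rather than the answerer's code. countDivisibleBy9 is a made-up name, the input is assumed to contain only the characters '0'..'9', and the table is rolled up so that only the length and remainder dimensions are stored while the string is scanned from left to right. No modulus is applied to the count, so it is assumed to fit in a long long.

```cpp
#include <array>
#include <string>

// Hypothetical helper: counts the subsequences of exactly four digits of
// `digits` whose digit sum (and therefore value) is divisible by 9.
long long countDivisibleBy9(const std::string& digits) {
    // ways[j][k]: subsequences of length j seen so far with digit sum = k (mod 9)
    std::array<std::array<long long, 9>, 5> ways{};
    ways[0][0] = 1;  // the empty subsequence
    for (char ch : digits) {
        const int d = (ch - '0') % 9;
        // j runs downwards so that row j - 1 does not yet include this digit.
        for (int j = 4; j >= 1; --j)
            for (int k = 0; k < 9; ++k)
                ways[j][(k + d) % 9] += ways[j - 1][k];
    }
    return ways[4][0];
}
```

For the string 9999 from the question this returns 1. Each digit costs a constant 4 * 9 updates, so the scan is linear in n.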
Assuming that the subsequence has to consist of consecutive digits, you can scan from left to right, keeping track of the last 4 digits read. That way, you can do a linear scan and just apply divisibility rules.
If the digits are not necessarily consecutive, then you can do some finagling with lookup tables. The idea is that you can create a 3D array named table such that table[i][j][k] is the number of ways to pick i digits up to index j such that their sum leaves a remainder of k when divided by 9. The table itself has size 45n (i goes from 0 to 4, j goes from 0 to n-1, and k goes from 0 to 8).
For the recursion, each table[i][j][k] entry relies on table[i-1][j-1][x] and table[i][j-1][x] for all x from 0 to 8. Since each entry update takes constant time (at least relative to n), that should get you an O(n) runtime.
How about this one:
/*NOTE: The following holds true if the subsequences consist of digits in contiguous locations */
public int countOccurrences (String s) {
int count=0;
int len = s.length();
String subs = null;
int sum;
if (len < 4)
return 0;
else {
for (int i=0 ; i<len-3 ; i++) {
subs = s.substring(i, i+4);
sum = 0;
for (int j=0; j<=3; j++) {
sum += Integer.parseInt(String.valueOf(subs.charAt(j)));
}
if (sum%9 == 0)
count++;
}
return count;
}
}
Here is the complete working code for the above problem based on the above discussed ways using lookup tables
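#include <stdio.h>
#include <string.h>
// Digital root of a value in 0..18 (the sum of its decimal digits)
int fun(int h)
{
return (h/10 + h%10);
}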
int main()
{
int t;
scanf("%d",&t);
int i,T;
for(T=0;T<t;T++)
{
char str[10001];
scanf("%s",str);
int len=strlen(str);
int arr[len][5][10];
memset(arr,0,sizeof(int)*(10*5*len));
int j,k,l;
for(j=0;j<len;j++)
{
int y;
y=(str[j]-48)%10;
arr[j][1][y]++;
}
//printarr(arr,len);
for(i=len-2;i>=0;i--) //represents the starting index of the string
{
int temp[5][10];
//COPYING ARRAY
int a,b,c,d;
for(a=0;a<=4;a++)
for(b=0;b<=9;b++)
temp[a][b]=arr[i][a][b]+arr[i+1][a][b];
for(j=1;j<=4;j++) //represents the length of the string
{
for(k=0;k<=9;k++) //represents the no. of ways to make it
{
if(arr[i+1][j][k]!=0)
{
for(c=1;c<=4;c++)
{
for(d=0;d<=9;d++)
{
if(arr[i][c][d]!=0)
{
int h,r;
r=j+c;
if(r>4)
continue;
h=k+d;
h=fun(h);
if(r<=4)
temp[r][h]=( temp[r][h]+(arr[i][c][d]*arr[i+1][j][k]))%1000000007;
}}}
}
//copy back from temp array
}
}
for(a=0;a<=4;a++)
for(b=0;b<=9;b++)
arr[i][a][b]=temp[a][b];
}
// length 4 with digital root 9, plus the all-zero case (digit sum 0)
printf("%d\n",(arr[0][4][9]+arr[0][4][0])%1000000007);
}
return 0;
}

Permutation with repetition without allocate memory

I'm looking for an algorithm to generate all permutations with repetition of 4 elements in a list (length 2-1000).
Java implementation
The problem is that the algorithm from the link above allocates too much memory for the calculation. It creates an array with the length of all possible combinations, e.g. 4^1000 for my example, so I got a heap space exception.
Thank you
Generalized algorithm for lazily-evaluated generation of all permutations (with repetition) of length X for a set of choices Y:
for I = 0 to (Y^X - 1):
list_of_digits = calculate the digits of I in base Y
a_set_of_choices = possible_choices[D] for each digit D in list_of_digits
yield a_set_of_choices
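The "digits of I in base Y" step can be sketched in C++ as follows. tupleAt is a made-up name, and the sketch assumes that Y^X fits into a 64-bit index (at most 31 positions for 4 symbols). For the lengths in the question an in-place counter is needed instead, because no fixed-width integer can hold the index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the `index`-th tuple (0-based) of length
// `length` over `choices`, reading `index` as a number in base choices.size().
std::vector<char> tupleAt(std::uint64_t index, const std::vector<char>& choices,
                          std::size_t length) {
    std::vector<char> tuple(length);
    const std::uint64_t base = choices.size();
    // Fill from the back: the least significant digit is the last position.
    for (std::size_t pos = length; pos-- > 0;) {
        tuple[pos] = choices[index % base];
        index /= base;
    }
    return tuple;
}
```

Calling tupleAt for index = 0 .. Y^X - 1 yields every tuple exactly once while only one tuple is held in memory at a time.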
If there is no length limit for the repetitions of your 4 symbols, there is a very simple algorithm that will give you what you want. Just encode your string as a binary number where every 2-bit pattern encodes one of the four symbols. To get all possible permutations with repetition you just have to enumerate ("count") all possible numbers. That can take quite long (more than the age of the universe), as 1000 symbols will be 2000 bits long. Is that really what you want to do? The heap overflow may not be the only limit...
Below is a trivial C implementation that enumerates all repetitions of length exactly n without allocating memory (n is limited by the array size: about 4000 symbols with the unsigned char cells used here, 16000 with 32-bit unsigned cells). I leave to the reader the exercise of enumerating all repetitions of at most length n.
#include <stdio.h>
typedef unsigned char cell;
cell a[1000];
int npack = sizeof(cell)*4;
void decode(cell * a, int nbsym)
{
unsigned i;
for (i=0; i < nbsym; i++){
printf("%c", "GATC"[a[i/npack]>>((i%npack)*2)&3]);
}
printf("\n");
}
void enumerate(cell * a, int nbsym)
{
unsigned i, j;
for (i = 0; i < 1000; i++){
a[i] = 0;
}
for (;;){
decode(a, nbsym);
/* add 1 to the packed counter, propagating the carry */
j = 0;
while (!++a[j]){
j++;
}
/* stop once the counter spills into the symbol just past the last one */
if ((a[nbsym / npack] >> ((nbsym % npack) * 2)) & 3){
break;
}
}
}
int main(){
enumerate(a, 5);
}
You know how to count: add 1 to the ones spot, if you go over 9 jump back to 0 and add 1 to the tens, etc..
So, if you have a list of length N with K items in each spot:
int[] permutations = new int[N];
boolean addOne() { // Returns true when it advances, false _once_ when finished
int i = 0;
permutations[i]++;
while (permutations[i] >= K) {
permutations[i] = 0;
i += 1;
if (i>=N) return false;
permutations[i]++;
}
return true;
}

How do I generate integer partitions?

I have a list of numbers like 1,2,3 and I want to find all the combination patterns that sum up to a particular number like 5. For example:
Sum=5
Numbers:1,2,3
Patterns:
1 1 1 1 1
1 1 1 2
1 1 3
1 2 2
2 3
You're allowed to repeat numbers as long as they don't go over your sum. Which way would be best to program this?
This is a slight modification of the change making problem. You should be able to find plenty of papers on this problem, and a dynamic programming solution would take no more than 20 lines of code.
http://en.wikipedia.org/wiki/Change-making_problem
This might also help: Dynamic Programming: Combination Sum Problem
These are called the partitions of a number, and your problem seems to impose the constraint of which numbers you're allowed to use in the partition.
This problem is known as a "doubly restricted integer partition." If the numbers "allowed" to sum to 5 were from a set V, then it is known as "multiply restricted integer partition." There is a paper by Riha and James: "Algorithm 29: Efficient algorithms for doubly and multiply restricted partitions" Computing Vol 16, No 1-2, pp 163-168 (1976). You should read that paper and implement their algorithm. Understanding how to do it will allow you to implement optimizations unique to your specific problem.
I would do it recursively, starting with the highest numbers first. At each level, start with the highest allowed number and go as many levels deep as there are numbers in the pattern. As soon as the cumulative sum exceeds your value, drop down to the next lower number. If it is still too large (or too small), immediately go back one level and decrease THAT number to the next one down, then continue at the deeper level, starting at the top again.
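private static List<List<string>> _results = new List<List<string>>();
public static List<List<string>> Partition(int n, int max, string prefix)
{
if (n == 0)
{
// prefix starts with a comma, so drop the empty first entry
_results.Add(prefix.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries).ToList());
}
for (int i = Math.Min(max, n); i >= 1; i--)
{
Partition(n - i, i, prefix + "," + i);
}
return _results;
}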
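The same kind of recursion, restricted to an allowed set such as 1, 2, 3 from the question, could look like this in C++. It is a sketch under stated assumptions: partitionsFrom and restrictedPartitions are made-up names, and the allowed values are assumed to be positive, distinct, and sorted in ascending order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends to `out` every non-decreasing sequence of
// allowed values, starting from allowed[first], that sums to `remaining`.
static void partitionsFrom(const std::vector<int>& allowed, std::size_t first,
                           int remaining, std::vector<int>& current,
                           std::vector<std::vector<int>>& out) {
    if (remaining == 0) {
        out.push_back(current);
        return;
    }
    for (std::size_t i = first; i < allowed.size() && allowed[i] <= remaining; ++i) {
        current.push_back(allowed[i]);
        // Recurse with i (not i + 1) so the same number may be repeated.
        partitionsFrom(allowed, i, remaining - allowed[i], current, out);
        current.pop_back();
    }
}

// All ways to write `sum` as a sum of values from `allowed`, ignoring order.
std::vector<std::vector<int>> restrictedPartitions(const std::vector<int>& allowed,
                                                   int sum) {
    std::vector<std::vector<int>> out;
    std::vector<int> current;
    partitionsFrom(allowed, 0, sum, current, out);
    return out;
}
```

For allowed = {1, 2, 3} and sum = 5 this produces the five patterns listed in the question. Never going back to a smaller value is what prevents the same pattern from appearing in a different order.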
You can use the following code; it prints all the integer partitions of n:
#include <stdio.h>
#include <stdlib.h>
void print(int n, int * a)
{
int i ;
for (i = 0; i <= n; i++)
{
printf("%d ", a[i]);
}
printf("\n");
}
void integerPartition(int n, int * a, int level)
{
int first;
int i;
if (n < 1)
return ;
a[level] = n;
print(level, a);
first = (level == 0) ? 1 : a[level-1];
for(i = first; i <= n / 2; i++)
{
a[level] = i;
integerPartition(n - i, a, level + 1);
}
}
int main()
{
int n = 10;
int * a = (int * ) malloc(sizeof(int) * n);
integerPartition (n, a, 0);
return(0);
}
