Split a subset with a constraint - algorithm

Today, while practicing some algorithm questions, I found an interesting one.
The question is:
You have to divide the numbers 1 to n (with one missing value x) into two halves such that the sums of the two halves are equal.
Example:
If n = 7 and x = 4
The solution will be {7, 5} and {1, 2, 3, 6}
I can solve it by brute force, but I want an efficient solution.
Can anyone help me out?

If the sum of the elements 1→N without x is odd then there is no solution.
Otherwise you can find your solution in O(N) with balanced selection.
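The parity test above can be sketched in a couple of lines. The function name `remaining_sum_is_even` is an illustrative assumption, not part of the original answer. The test is a necessary condition only: for very small inputs such as n = 3, x = 2 the remaining numbers {1, 3} have an even sum but still cannot be split.

```python
def remaining_sum_is_even(n, x):
    """True when 1 + 2 + ... + n minus x is even (necessary for an equal split)."""
    # n * (n + 1) // 2 is the closed form for 1 + 2 + ... + n.
    return (n * (n + 1) // 2 - x) % 2 == 0


print(remaining_sum_is_even(7, 4))  # the example from the question
```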
4 in a row
First, note that any sequence of four contiguous numbers can be split into two sets with equal sums, since:
[x, x+1, x+2, x+3] → [x+3, x];[x+2, x+1]
Thus selecting them and placing them in sets A B B A balances sets A and B.
4 across
Moreover, when we have two pairs on either side of an omitted value x, a similar property holds:
[x-2, x-1, x+1, x+2] → [x+2, x-2]; [x+1, x-1]
so still A B B A
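As a quick sanity check, the two A B B A identities can be verified numerically. This is only a sketch; the helper names `four_in_a_row` and `four_across` are made up for illustration. As in the text, x is the first number of the run in the first helper and the omitted value in the second.

```python
def four_in_a_row(x):
    """Split [x, x+1, x+2, x+3] following the A B B A pattern."""
    return [x, x + 3], [x + 1, x + 2]


def four_across(x):
    """Split [x-2, x-1, x+1, x+2] (x itself omitted) following A B B A."""
    return [x - 2, x + 2], [x - 1, x + 1]


# Both splits balance for every starting point we try.
for x in range(3, 20):
    a, b = four_in_a_row(x)
    assert sum(a) == sum(b)
    a, b = four_across(x)
    assert sum(a) == sum(b)
```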
At this point we can handle the following cases:
we have a quadruplet: we split it as in "4 in a row"
we have 2 numbers, x and another 2 numbers: we split them as in "4 across"
All right, but we may have 3 numbers, x and another 3 numbers, or some other arrangement. How can we still select in a balanced manner?
+2 Gap
If we look again at the gap across x:
[x-1, x+1]
we notice that if we put the two neighbors in separate sets, the set that receives x+1 ends up 2 ahead, so we must balance a +2 on the set with the bigger sum.
Balancing Tail
We can do this by using the last four numbers of the sequence:
[4 3 2 1] → [4, 2] ; [3, 1] → 6 ; 4
Finally, we have to consider that we might not have all four of them, so let's build the three-number case:
[3 2 1] → [2] ; [3, 1] → 2 ; 4
We can do the very same at the other end of the sequence with an A B A B (or B A B A) pattern, depending on whether our +2 stands on B (or A).
4 across +
Remarkably, 4 across still holds if we jump over an odd number h of values:
[x+3, x+2, x-2, x-3] → [x+3, x-3]; [x+2, x-2]
So, scanning the array, we can build the solution step by step.
An example:
11 10 9 8 7 6 5 4 3 2 1
The sum is even, so x can only be an even number:
x = 10
11 - 9 | 8 7 6 5 | 4 3 2 1 → (+2 gap - on A) (4 in a row) (balancing tail)
A B A B B A B A B A
x = 8
11 10 | 9 - 7 | 6 5 | 4 3 2 1 → (4 across +) (+2 gap - on A) (balancing tail)
a b A B | b a | B A B A
x = 6
11 10 9 8 | 7 - 5 | 4 3 2 1 → (4 in a row) (+2 gap - on A) (balancing tail)
A B B A A B A B B B
x = 4: we have no balancing tail, so we have to balance with the head instead
11 10 9 8 | 7 6 | 5 - 3 | 2 1 → (balancing head) (4 across +) (+2 gap)
A B A B A B | b a | B A
x = 2
11 10 9 8 | 7 6 5 4 | 3 - 1 → (balancing head) (4 in a row) (+2 gap)
A B A B A B B A B A
It is interesting to notice the symmetry of the solutions. Another example.
10 9 8 7 6 5 4 3 2 1
The sum is odd, so x can only be an odd number, and the number of remaining elements is now odd.
x = 9
10 - 8 | 7 6 5 4 | 3 2 1 → (+2 gap - on A) (4 in a row) (balancing tail)
A B A B B A B A B
x = 7
10 9 | 8 - 6 | 5 4 | 3 2 1 → (4 across +) (+2 gap - on A) (balancing tail)
a b | A B | b a B A B
x = 5
10 9 8 7 | 6 - 4 | 3 2 1 → (4 in a row) (+2 gap - on A) (balancing tail)
A B B A A B B A B
x = 3
10 9 8 7 | 6 5 | 4 - 2 | 1 → (balancing head) (4 across + virtual 0) (+2 gap)
A B A B B A | b a | A
x = 1
10 9 8 7 | 6 5 4 3 | 2 → (balancing head) (4 in a row) (+2 gap virtual 0)
A B A B A B B A B
Finally, it is worth noticing that we can swap A and B whenever we have a fully balanced segment (i.e. 4 in a row or 4 across).
Funnily enough, the requirement that sum([1 ... N]) - x be even makes the cases quite redundant, as you will see if you try it yourself.
I am pretty sure this algorithm can be generalized - I'll probably provide a revised version soon.
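To check any of the labelings above by hand, a small helper can add up the numbers assigned to A and to B. This is only a sketch; `label_sums` is a hypothetical name, and it assumes the labels are written as space-separated letters (upper or lower case), with `|` separators ignored.

```python
def label_sums(numbers, labels):
    """Return (sum of A, sum of B) for numbers paired with an 'A B b a ...' label string."""
    tokens = labels.replace("|", " ").split()
    if len(tokens) != len(numbers):
        raise ValueError("need exactly one label per number")
    sums = {"A": 0, "B": 0}
    for value, label in zip(numbers, tokens):
        sums[label.upper()] += value
    return sums["A"], sums["B"]


# First example, x = 10: the numbers 11..1 without 10.
numbers = [v for v in range(11, 0, -1) if v != 10]
print(label_sums(numbers, "A B A B B A B A B A"))
```

Both sums come out as 28 for this labeling, which is half of 66 - 10 = 56.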

This problem can be solved by wrapping the standard subset-sum problem from dynamic programming with preprocessing steps. These steps are of O(1) complexity.
Algorithm (n, x):
    sum = n * (n+1) / 2
    neededSum = sum - x
    If (neededSum % 2 != 0): return 0
    create array [1..n] and remove x from it
    call standard subsetsum(arr, 0, neededSum/2, [])
A working Python implementation of the subset-sum algorithm, which prints all matching subsets, is given below.
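def subsetsum(arr, i, target, ss):
    # All elements considered: report the subset if it hits the target.
    if i >= len(arr):
        if target == 0:
            print(ss)
            return 1
        return 0
    ss1 = ss[:]  # copy so the caller's list is never modified
    # Branch 1: skip arr[i].
    count = subsetsum(arr, i + 1, target, ss1)
    # Branch 2: take arr[i].
    ss1.append(arr[i])
    count += subsetsum(arr, i + 1, target - arr[i], ss1)
    return count


arr = [1, 2, 3, 10, 5, 7]
target = 14
# Prints every subset that sums to target, then the number of such subsets.
print(subsetsum(arr, 0, target, []))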
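For the original question only one split is needed, not every subset. A minimal sketch of the steps above with a table-based subset sum might look like the following. The function name `split_without_x` and the `reach` dictionary are illustrative assumptions, and the running time is roughly O(n * target), not the O(N) of the other answer.

```python
def split_without_x(n, x):
    """Return (half_a, half_b) with equal sums over 1..n without x, or None."""
    values = [v for v in range(1, n + 1) if v != x]
    total = sum(values)
    if total % 2:
        return None  # odd total: no equal split is possible
    target = total // 2
    # reach[s] = the value that was added last to reach sum s (0 marks the empty sum).
    reach = {0: 0}
    for v in values:
        for s in list(reach):  # snapshot, so each value is used at most once
            t = s + v
            if t <= target and t not in reach:
                reach[t] = v
    if target not in reach:
        return None
    # Walk back from target to 0 to recover one subset.
    half_a, s = [], target
    while s:
        half_a.append(reach[s])
        s -= reach[s]
    half_b = [v for v in values if v not in half_a]
    return half_a, half_b


print(split_without_x(7, 4))
```

The split it finds for n = 7, x = 4 may differ from the one in the question, but both halves sum to 12.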
Hope it helps!

Related

What is N in this given scenario

I am trying to implement this code, and this website has kindly provided its algorithm, but I am trying to find out what "N" is. I understand what "I" and "M" are, but not "N". Is "N" the total number of inputs (5 in the example below, because there are 5 letters)?
Algorithm:
Combinations are generated in lexicographical order. The algorithm uses indexes of the elements of the set. Here is how it works on example: Suppose we have a set of 5 elements with indexes 1 2 3 4 5 (starting from 1), and we need to generate all combinations of size m = 3.
First, we initialize the first combination of size m - with indexes in ascending order
1 2 3
Then we check the last element (i = 3). If its value is less than n - m + i, it is incremented by 1.
1 2 4
Again we check the last element, and since it is still less than n - m + i, it is incremented by 1.
1 2 5
Now it has the maximum allowed value: n - m + i = 5 - 3 + 3 = 5, so we move on to the previous element (i = 2).
If its value is less than n - m + i, it is incremented by 1, and all following elements are set to the value of their previous neighbor plus 1
1 (2+1)3 (3+1)4 = 1 3 4
Then we again start from the last element i = 3
1 3 5
Back to i = 2
1 4 5
Now it finally equals n - m + i = 5 - 3 + 2 = 4, so we can move to first element (i = 1)
(1+1)2 (2+1)3 (3+1)4 = 2 3 4
And then,
2 3 5
2 4 5
3 4 5
and it is the last combination since all values are set to the maximum possible value of n - m + i.
Input:
A
B
C
D
E
Output:
A B C
A B D
A B E
A C D
A C E
A D E
B C D
B C E
B D E
C D E
Take a look at the very first paragraph of the link you provided.
It states that
This combinations calculator generates all possible combinations of m elements from the set of n elements.
So yes, n is the number of elements or letters that the algorithm needs to use.
N here is the size of the set from which you generate the combinations. In the given example, "Suppose we have a set of 5 elements with indexes 1 2 3 4 5 (starting from 1)", N is 5.
Combinations are usually symbolized with nCm, or n choose m. So n is the total set size (5 in this example) and m is the number chosen (3).
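To make the roles of n, m and i concrete, the quoted algorithm can be sketched as a small generator. This is an illustration rather than the website's own code; the name `combinations_lex` is made up, and it assumes 0 <= m <= n.

```python
def combinations_lex(n, m):
    """Yield all m-element combinations of the indexes 1..n in lexicographic order."""
    combo = list(range(1, m + 1))  # first combination: 1 2 ... m
    while True:
        yield tuple(combo)
        # Find the rightmost position i (1-based) still below its maximum n - m + i.
        i = m
        while i >= 1 and combo[i - 1] == n - m + i:
            i -= 1
        if i == 0:
            return  # every position is at its maximum: that was the last combination
        combo[i - 1] += 1
        # Reset all following positions to "previous neighbor plus 1".
        for j in range(i, m):
            combo[j] = combo[j - 1] + 1


letters = "ABCDE"  # n = 5 elements, as in the example
for combo in combinations_lex(len(letters), 3):
    print(" ".join(letters[k - 1] for k in combo))
```

With the five letters as input and m = 3, this prints the ten lines listed in the question's output.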

Pandas pivot table Nested Sorting Part 3

Episode 3:
In part 2, we retained the hierarchical nature of the indices while sorting within right-most level. In part 1, we applied a custom sort to the left-most index level while sorting the values within the right-most index.
Now, I'd like to combine both methods.
Given the following data frame and resultant pivot table:
import pandas as pd
df=pd.DataFrame({'A':['a','a','a','a','a','b','b','b','b'],
'B':['x','y','z','x','y','z','x','y','z'],
'C':['a','b','a','b','a','b','a','b','a'],
'D':[7,5,3,4,1,6,5,3,1]})
df
   A  B  C  D
0  a  x  a  7
1  a  y  b  5
2  a  z  a  3
3  a  x  b  4
4  a  y  a  1
5  b  z  b  6
6  b  x  a  5
7  b  y  b  3
8  b  z  a  1
table = pd.pivot_table(df, index=['A', 'B','C'],aggfunc='sum')
table
       D
A B C
a x a  7
    b  4
  y a  1
    b  5
  z a  3
b x a  5
  y b  3
  z a  1
    b  6
I would like to specify a custom order of 'B'.
This seems to work:
df['B'] = df['B'].astype('category')
# set_categories returns a new Series; the inplace argument is gone in recent pandas versions
df['B'] = df['B'].cat.set_categories(['z', 'x', 'y'])
Next, I'd like the pivot table to keep the order for 'B' specified above while sorting the values of 'D' in descending order within each category of 'B'.
Like this:
       D
A B C
  z a  3
  x a  7
a   b  4
  y b  5
    a  1
  z b  6
b   a  1
  x a  5
  y b  3
Thanks in advance!
UPDATE: using pivot_table()
In [79]: df.pivot_table(index=['A','B','C'], aggfunc='sum').reset_index().sort_values(['A','B','D'], ascending=[1,1,0]).set_index(['A','B','C'])
Out[79]:
       D
A B C
a x a  7
    b  4
  y b  5
    a  1
  z a  3
b x a  5
  y b  3
  z b  6
    a  1
Is that what you want?
In [64]: df.sort_values(['A','B','D'], ascending=[1,1,0]).set_index(['A','B','C'])
Out[64]:
       D
A B C
a z a  3
  x a  7
    b  4
  y b  5
    a  1
b z b  6
    a  1
  x a  5
  y b  3

Efficient algorithm to find kth largest numbers from N lists by picking one number each time from N lists

There are N given lists of numbers. Each time, one number is picked from each list and all the picked numbers are sorted. The kth largest of the sorted numbers is added to a set.
Finally the size of the set will be reported.
For Example
3 3
3 2 5 3
3 8 1 6
3 7 4 9
The first integer is the number of lists N, and the second integer is the value of k. The next N lines each hold one list (3 lines in this case), and the first entry of each of those lines is the list size.
List values are list1 -> (2,5,3) , list2 ->(8,1,6), list3 ->(7,4,9)
Any number can be picked from each list. For example, (2,8,7), (2,8,4), (2,8,9), (2,1,7), (2,1,4), (2,1,9), etc. are all valid combinations. The kth largest is selected from each of these combinations.
In this case the following numbers can be the 3rd largest (since k=3):
(4,5,6,7,8,9)
The total count must be reported. So the output is 6
One way:
I am trying to enumerate every combination of the list values, sort each one, and take the kth largest every time. This way the complexity is high. For example, 4 lists of sizes (10, 12, 15, 20) give 10 * 12 * 15 * 20 combinations, so it will not fit in memory.
Is there any other efficient algorithm for this problem?
This is an interesting question; it took a while to figure out.
Make two max-heaps, h1 and h2.
At each step, push the next element of every list into h1 and move one element (the maximum) from h1 to h2. When the size of h2 is >= K,
pop one element (the maximum) from h2 and add it to your set.
Run on your case:
1) h1 = empty h2 = empty set=empty
2) h1 = 2 8 7 h2 = empty set=empty
3) h1 = 2 7 5 1 4 h2 = 8 set=empty
4) h1 = 2 5 1 4 3 6 9 h2 = 8 7 set=empty
5) h1 = 2 5 1 4 3 6 h2 = 8 7 9 set=empty
6) h1 = 2 5 1 4 3 h2 = 8 7 6 set=9
7) h1 = 2 1 4 3 h2 = 5 7 6 set=9 8
8) h1 = 2 1 3 h2 = 5 4 6 set=9 8 7
9) h1 = 2 1 h2 = 5 4 3 set=9 8 7 6
10) h1 = 1 h2 = 2 4 3 set=9 8 7 6 5
11) h1 = empty h2 = 2 1 3 set=9 8 7 6 5 4
h1 = empty , STOP.
Time complexity : O(N log N)
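For checking any faster approach against the example in the question, a brute-force reference is handy. The sketch below is the "one way" the question describes, not the heap method. The name `kth_values` is hypothetical, and it assumes "kth largest" means the kth element after sorting each pick in ascending order, since that is the reading that reproduces the question's expected output. It only suits small inputs, because it visits every combination.

```python
from itertools import product


def kth_values(lists, k):
    """Brute force: collect the k-th element (ascending order) of every possible pick."""
    seen = set()
    for pick in product(*lists):  # one number from each list
        seen.add(sorted(pick)[k - 1])
    return seen


result = kth_values([[2, 5, 3], [8, 1, 6], [7, 4, 9]], 3)
print(sorted(result), len(result))
```

Because `product` yields picks lazily, only the result set is kept in memory, although the running time still grows with the product of the list sizes.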

Arithmetic operation on sequence on integers

I have N integers: 1, 2, 3, ..., N
The task is to use +, -, *, / to make an expression that equals 0.
For example -1*2+3+4-5=0
How can I do it?
Maybe some code in C/C++?
If N % 4 == 0, for every four consecutive integers a, b, c, d, take a - b - c + d
If N % 4 == 1, use 1 * 2 to start, then proceed as before. (i.e., 1*2 - 3 - 4 + 5 + 6 - 7 - 8 + 9 ...)
If N % 4 == 2, start with 1 - 2 + 3 * 4 - 5 - 6, then proceed as in the N % 4 == 0 example.
If N % 4 == 3, start with 1 + 2 - 3, then proceed as in the N%4 == 0 example.
All of these find a way to get zero out of the first few integers, leaving a multiple of four integers to work on, then take advantage of the fact that the pattern a - b - c + d = 0 for any four consecutive integers.
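The four cases can be sketched as a small expression builder. It is written in Python rather than the C/C++ the question mentions, the name `zero_expression` is an illustrative assumption, and it returns None for N = 1 and N = 2, which the cases above do not cover.

```python
def zero_expression(n):
    """Build an expression over 1..n, in order, that evaluates to 0 (None if n < 3)."""
    if n < 3:
        return None  # assumption: the case analysis gives no construction here
    # Prefix that zeroes out the first few integers, and the next unused integer.
    prefix, start = {
        0: ("", 1),
        1: ("1*2-3-4+5", 6),
        2: ("1-2+3*4-5-6", 7),
        3: ("1+2-3", 4),
    }[n % 4]
    expr = prefix
    # Each remaining block of four consecutive integers contributes a - b - c + d = 0.
    for a in range(start, n + 1, 4):
        block = f"{a}-{a + 1}-{a + 2}+{a + 3}"
        expr = f"{expr}+{block}" if expr else block
    return expr


for n in (4, 5, 6, 7, 9):
    e = zero_expression(n)
    print(n, e, "=", eval(e))
```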
This is essentially SAT. Or do you know that the numbers form a sequence (e.g. 2 1 8 is forbidden)? What about negative numbers?
If the sequence is not too large, I would recommend simply brute-forcing it. A greedy solution would be to reduce the problem by finding subsets which can be evaluated to zero.

Bitwise modulus computation

Given two numbers a and b, where b is of the form 2^k and k is unknown, what would be an efficient way of computing a%b using bitwise operators?
a AND (b-1) == a%b (when b is 2^k)
ex. a = 11 (1011b), b = 4 (0100b)
11 / 4 = 2 R3
11 % 4 == 11 AND (4-1)
11 (1011b) AND 3 (0011b) == 3 (0011b)
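The same identity as a minimal Python sketch. The function name `mod_pow2` is made up for illustration, and it assumes a is non-negative and b is a positive power of two.

```python
def mod_pow2(a, b):
    """Return a % b using a bit mask; assumes a >= 0 and b is a power of two."""
    # A power of two has a single set bit, so b & (b - 1) is 0 only for such b.
    if b <= 0 or b & (b - 1):
        raise ValueError("b must be a positive power of two")
    return a & (b - 1)  # b - 1 is a mask of the k low bits


print(mod_pow2(11, 4))  # the example above: 1011b AND 0011b
```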
