Choosing a subset in a uniformly random manner?

Question is:
Write a method to randomly generate a set of m integers from an array of size n. Each
element must have equal probability of being chosen.
Is this answer correct?:
I pick a first integer uniformly at random.
Then pick the next; if it has already been picked, I discard it, otherwise I take it, and continue until I have m integers.

let m be the number of elements to select
for i = 1; i <= m; i++
    pick a random number from 1 to n, call it j
    swap array[j] and array[n] (assuming 1-indexed arrays)
    n--
At the end of the loop, the last m elements of the array are your random subset. This is a variation on the Fisher-Yates shuffle.
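For reference, here is a minimal Python sketch of the partial Fisher-Yates idea above, 0-indexed (the function name `sample_m` is my own):

```python
import random

def sample_m(arr, m):
    """Partial Fisher-Yates: after i iterations, the last i slots of the
    working copy hold a uniform random sample without replacement."""
    a = list(arr)
    n = len(a)
    for i in range(m):
        j = random.randrange(n - i)          # 0-indexed analogue of "pick from 1 to n"
        a[j], a[n - 1 - i] = a[n - 1 - i], a[j]
    return a[n - m:]
```

Each of the m picks is uniform over the remaining elements, so every m-element subset is equally likely.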

There are 2^n subsets. Pick a number between 0 and 2^n - 1 and turn it into binary; the array elements at positions whose bits are set are taken and stored.
e.g. Consider the set 1,2,3,4.
int[] a = new int[]{ 1, 2, 3, 4 };
int n = (1 << a.length) - 1; // 2^4 - 1 = 15
int items = new Random().nextInt(n + 1); // 0..n inclusive, so every subset can occur
// If items is 3 then this is 0011 so we would select 1 and 2
// If items is 5 then this is 0101 so we would select 1 and 3
// And so on
for (int i = 0; i < a.length; ++i) {
    if ((items & (1 << i)) != 0) {
        // The bit is set, grab this item
        System.out.println("Selected " + a[i]);
    }
}
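As a sketch, the same bitmask idea in Python. One caveat worth noting: this draws uniformly over all 2^n subsets, so the subset size varies rather than being a fixed m (the name `random_subset` is illustrative):

```python
import random

def random_subset(a):
    # Each of the 2**len(a) bitmasks (and hence subsets) is equally likely.
    mask = random.getrandbits(len(a))
    return [x for i, x in enumerate(a) if (mask >> i) & 1]
```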

Think of your original range to choose from as a list from 1 to n; when you choose an element (number), remove that element from the list. Choose elements based on list index, rather than the actual number value.
static Random rnd = new Random(); // shared random source, added for completeness

int Choose1(List<int> elts)
{
    var idx = rnd.Next(0, elts.Count);
    var elt = elts[idx];
    elts.RemoveAt(idx);
    return elt;
}

public List<int> Choose(int fromN, int chooseM)
{
    var range = new List<int>();
    for (int i = 1; i <= fromN; i++)
    {
        range.Add(i);
    }
    var choices = new List<int>();
    for (int i = 0; i < chooseM; i++)
    {
        choices.Add(Choose1(range));
    }
    return choices;
}
Using lists won't be efficient for large numbers, but you can use the same approach without actually constructing any lists, using a bit of arithmetic.

If your picks are independent and uniform, the probability of picking a particular sequence of m items in the manner you described is 1/pow(n, m). What you need is for each m-element subset to be chosen with probability 1/C(n, m).


Problem from book Programming Pearls, Find duplicate number from 4,300,000,000 integers in *Linear time*? [duplicate]

Question: The input is in a sequential file. The file contains at most 4 billion integers. Find a missing integer.
Solution as per my understanding:
make two temporary files, one with integers whose leading bit is 0 and the other with leading bit 1
one of the two MUST have fewer than 2B integers (4.3B pigeonholes, at most 4B pigeons)
pick that file and repeat steps 1 & 2 on the 2nd bit, then on the 3rd bit, and so on
what is the end condition of this iteration?
Also, the book mentions the efficiency of the algorithm being O(n)
but,
1st iteration => n probe operations
2nd iteration => n/2 probe operations
.
.
.
n + n/2 + n/4 +... 1 => nlogn??
Am I missing something?
You'll check both files and pick the one with the fewest elements.
You'll repeat the process until you've gone through all 32 bits, and at the end you'll have a file with 0 elements. This is where one of the missing numbers was supposed to be. So, if you've been keeping track of the bits you've filtered on thus far, you'll know what the number is supposed to be.
Note that this is to find a (i.e. 'any') missing number. If given an (unordered) sequential list of 4 billion (not 2^32 (4294967296)) integers with one missing, which you have to find, this won't work, as you can cut off the missing integer right in the beginning.
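As a sketch, the bit-partition idea above in Python, with the two temporary files replaced by in-memory lists (the function name and the small test range in the test are my own):

```python
def find_missing(nums, num_bits=32):
    # nums: distinct integers in [0, 2**num_bits) with at least one value absent.
    candidates = list(nums)
    missing = 0
    for bit in range(num_bits):
        zeros = [x for x in candidates if not (x >> bit) & 1]
        ones = [x for x in candidates if (x >> bit) & 1]
        # A half with fewer values than it has slots must miss a number;
        # taking the smaller half guarantees we keep such a half.
        if len(zeros) <= len(ones):
            candidates = zeros
        else:
            candidates = ones
            missing |= 1 << bit
    # End condition: after all bits have been processed, 'candidates' is empty
    # and the filtered bits spell out a missing number.
    return missing
```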
Also:
n + n/2 + n/4 + ... + 1 <= 2n

Not n log n.
It's a geometric series with a = n, r = 1/2, which can be calculated with the formula:

n (1 - (1/2)^m)
---------------
   1 - (1/2)

Since 0 < (1/2)^m < 1 for any positive number m (since 0 < 1/2 < 1), we can say (1 - r^m) < 1 and thus the maximum is:

  n * 1      n
--------- = --- = 2n
 1 - 1/2    1/2
If there is only 1 missing value, meaning that you have the following criteria:
File contains every number from a lowest value N up to and including a highest value M, except for one of them, and each number present occurs exactly once (thanks #maraca)
File does not have to be sorted
There is only 1 of those values missing (just making sure)
Then the solution is quite simple:
ADD or XOR together all the numbers in the file.
ADD or XOR together all the numbers you're supposed to have.
The missing number is either one minus the other (in case of ADD) or one xor the other.
Here is a LINQPad program you can experiment with:
void Main()
{
    var input = new[] { 1, 2, 3, 4, 5, 6, 8, 9, 10 };
    var lowest = input[0];
    var highest = input[0];
    int xor = 0;
    foreach (var value in input)
    {
        lowest = Math.Min(lowest, value);
        highest = Math.Max(highest, value);
        xor ^= value;
    }
    int requiredXor = 0;
    for (int index = lowest; index <= highest; index++)
        requiredXor ^= index;
    var missing = xor ^ requiredXor;
    missing.Dump();
}
Basically, it will:
XOR all values in the file together (value 1)
Find the lowest and highest numbers at the same time
XOR all values from lowest up to highest (value 2)
XOR the two values (value 1 and value 2) together to find the missing value
This method will not detect if the missing value is the lowest value - 1 or highest value + 1, for instance, if the file is supposed to hold 1..10, but is missing 10 or 1, then the above approach will not find it.
This solution is O(2n) (we loop the numbers twice), which translates to O(n).
Here is a more complete example showing both the ADD and the XOR solution (again in LINQPad):
void Main()
{
    var input = new[] { 1, 2, 3, 4, 5, 6, 8, 9, 10 };
    MissingXOR(input).Dump("xor");
    MissingADD(input).Dump("add");
}

public static int MissingXOR(int[] input)
{
    var lowest = input[0];
    var highest = input[0];
    int xor = 0;
    foreach (var value in input)
    {
        lowest = Math.Min(lowest, value);
        highest = Math.Max(highest, value);
        xor ^= value;
    }
    int requiredXor = 0;
    for (int index = lowest; index <= highest; index++)
        requiredXor ^= index;
    return xor ^ requiredXor;
}

public static int MissingADD(int[] input)
{
    var lowest = input[0];
    var highest = input[0];
    int sum = 0;
    foreach (var value in input)
    {
        lowest = Math.Min(lowest, value);
        highest = Math.Max(highest, value);
        sum += value;
    }
    var sumToHighest = (highest * (highest + 1)) / 2;
    var sumToJustBelowLowest = (lowest * (lowest - 1)) / 2;
    int requiredSum = sumToHighest - sumToJustBelowLowest;
    return requiredSum - sum;
}

Algorithm to find two subsets of an array which are maximum & equal [duplicate]

Given an array, you have to find the max possible two equal sums; you can exclude elements.
E.g. for the array 1, 2, 3, 4, 6
we can have max two equal sums as 6+2 = 4+3+1.
For the array 4, 10, 18, 22
we can get two equal sums as 18+4 = 22.
What would be your approach to solve this problem, apart from brute-forcing all combinations and checking for two possible equal sums?
edit 1: the max number of array elements is N <= 50 and each element satisfies 1 <= K <= 1000
edit 2: Here is my solution https://ideone.com/cAbe4g, it takes too much time where the given time limit is 5 seconds for each case.
edit 3: the total sum of the elements cannot be greater than 1000.
Recommended approach
I suggest solving this using DP where instead of tracking A,B (the size of the two sets), you instead track A+B,A-B (the sum and difference of the two sets).
Then for each element in the array, try adding it to A, or B, or neither.
The advantage of tracking the sum/difference is that you only need to keep track of a single value for each difference, namely the largest value of the sum you have seen for this difference.
For added efficiency, I recommend you iterate through the elements in order from smallest to largest and stop updating the DP once the largest difference seen so far is reached.
You can also only store the absolute value of the difference, and ignore any difference greater than 25000 (as it will be impossible for the difference to return to 0 from this point).
Python example code
from collections import defaultdict

def max_equal_sum(E):
    D = defaultdict(int)       # Map from abs difference to largest sum
    D[0] = 0                   # Start with a sum and difference of 0
    for a in E:                # Iterate over each element in the array
        D2 = D.copy()          # Can keep current sum and diff if element is skipped
        for d, s in D.items():               # d is difference, s is sum
            s2 = s + a                       # s2 is new sum
            for d2 in [d - a, d + a]:        # d2 is new difference
                D2[abs(d2)] = max(D2[abs(d2)], s2)  # Update with largest sum
        D = D2
    return D[0] // 2           # Answer is half the sum of A+B for a difference of 0

print(max_equal_sum([1, 2, 3, 4, 6]))   # Prints 8
print(max_equal_sum([4, 10, 18, 22]))   # Prints 22
The largest set of values with total sum at most 1000 in which no two subsets have equal sums has 9 elements, e.g.:
{1, 2, 4, 8, 16, 32, 64, 128, 256}
With a tenth element there are 2^10 = 1024 subsets but only 1001 possible sums (0 to 1000), so by the pigeonhole principle two subsets must have equal sums.
If you find two subsets with equal sum after you have excluded more than 9 elements, then two equal sums from the excluded elements can be added to form greater equal sums; this means that you should never exclude more than 9 elements.
The sum of the excluded elements is in the range 0 to 1000. Building a sieve to check which values in this range can be formed with the elements in the set will take at most 50 × 1000 steps. (We can store the minimum number of values that add up to each sum instead of a boolean, and use that to include only sums which can be made with 9 or fewer elements.)
If we then look at the sums of excluded numbers from small to large, that means looking at the sums of included numbers from large to small. For each sum of excluded values, we check which (up to 9) elements can form this sum (obviously not a trivial step, but we know the number of elements is between the minimum value as stored in the sieve, and 9), and this gives us the set of excluded and therefore also the set of included numbers. Then we use a similar sieve technique to check whether the included numbers can form the half sum; this will be in the range 1 to 500 and take at most 50 × 500 steps.
In all this, there is of course the odd/even-ness to take into account: if the total sum is odd, a subset with an odd sum has to be excluded; if the total sum is even, only subsets with an even sum have to be excluded.
I haven't really figured out how to generate worst-case input for this, so it's hard to judge the worst-case complexity; but I think the average case should be feasible.
Here are some of the parts in action. First, the sieve to find the sums of the sets of up to 9 values which can be excluded (and have the right odd/even-ness) tested with 20 values with a sum of 999:
function excludedSums(values) {
    var sieve = [0];
    for (var i in values) {
        var temp = [];
        for (var j in sieve) {
            if (sieve[j] == 9) continue;
            var val = values[i] + Number(j);
            if (!sieve[val] || sieve[j] + 1 < sieve[val]) {
                temp[val] = sieve[j] + 1;
            }
        }
        for (var j in temp) {
            sieve[j] = temp[j];
        }
    }
    var odd = values.reduce(function(ac, el) {return ac + el;}, 0) % 2;
    for (var i in sieve) {
        if (Number(i) % 2 != odd) delete sieve[i];
    }
    return sieve;
}

var set = [40,7,112,15,96,25,49,49,31,87,39,8,79,40,73,49,63,55,12,70];
var result = excludedSums(set);
for (var i in result) document.write(i + ", ");
Next, the sets of up to 9 values with a certain sum. In the example above, we see that e.g. one or more sets with the sum 99 can be excluded; let's find out what these sets are:
function excludedSets(values, target) {
    var sieve = [[[]]];
    for (var i in values) {
        var temp = [];
        for (var j in sieve) {
            var val = values[i] + Number(j);
            if (val > target) continue;
            for (var k in sieve[j]) {
                if (sieve[j][k].length < 9) {
                    if (!temp[val]) temp[val] = [];
                    temp[val].push(sieve[j][k].concat([values[i]]));
                }
            }
        }
        for (var j in temp) {
            if (!sieve[j]) sieve[j] = [];
            for (var k in temp[j]) {
                sieve[j].push(temp[j][k].slice());
            }
        }
    }
    return sieve[target];
}

var set = [40,7,112,15,96,25,49,49,31,87,39,8,79,40,73,49,63,55,12,70];
var result = excludedSets(set, 99);
for (var i in result) document.write(result[i] + "<br>");
(You'll see a few duplicates in the output, because e.g. the value 49 appears three times in the set.)
Now let's test whether the set without excluded values can be split in two. We see that the sum 99 can be formed e.g. by values 87 and 12, so we exclude these from the set, and get a set of 18 values with the sum 900. Now we check whether the half sum 450 can be formed by adding values from the set:
function sumPossible(values, target) {
    var sieve = [true];
    for (var i in values) {
        var temp = [];
        for (var j in sieve) {
            var val = values[i] + Number(j);
            if (val < target) temp[val] = true;
            else if (val == target) return true;
        }
        for (var j in temp) sieve[j] = temp[j];
    }
    return false;
}

var set = [40,7,112,15,96,25,49,49,31,39,8,79,40,73,49,63,55,70];
document.write(sumPossible(set, 450));
So 450 is one of the possible half-sums for this set. Obviously it's not the largest one, because we randomly picked the sum 99 to exclude as an example, instead of iterating over all sums from small to large; in fact the first option, excluding value 7, would have led to the maximum half-sum 496.
It should be noted that the larger the set, the more likely it is that the set can be split in half (if it has an even sum, or with the smallest odd value removed if it has an odd sum). A test with millions of sets of random values with an even sum up to 1000 revealed not a single set that couldn't be split in half, for any set size above 28. (It is of course possible to craft such a set, e.g. 49 ones and a single 51.)
For each element in the array, there are three possibilities:
(i) include the element in the 1st set;
(ii) include the element in the 2nd set;
(iii) do not include the element in either set.
Whenever the sums of the first and second sets become equal, update the answer.
public class Main
{
    static int ans = -1;

    public static void find(int[] arr, int sum1, int sum2, int start)
    {
        if (sum1 == sum2)
            ans = Math.max(ans, sum1);
        if (start == arr.length)
            return;
        find(arr, sum1 + arr[start], sum2, start + 1);
        find(arr, sum1, sum2 + arr[start], start + 1);
        find(arr, sum1, sum2, start + 1);
    }

    public static void main(String[] args)
    {
        int[] arr = new int[]{1, 2, 100, 101, 6, 100};
        ans = -1;
        find(arr, 0, 0, 0);
        System.out.println(ans);
    }
}
#include <bits/stdc++.h>
using namespace std;

/*
Brute force recursive solve.
*/
void solve(vector<int> &arr, int &ans, int p1, int p2, int idx, int mx_p){
    // if p1 == p2, we have a potential answer
    if(p1 == p2){
        ans = max(ans, p1);
    }
    // base case: a sum grew too large, or no elements left
    if((p1 > mx_p) || (p2 > mx_p) || (idx >= (int)arr.size())){
        return;
    }
    // leave the current element
    solve(arr, ans, p1, p2, idx+1, mx_p);
    // add the current element to p1
    solve(arr, ans, p1+arr[idx], p2, idx+1, mx_p);
    // add the current element to p2
    solve(arr, ans, p1, p2+arr[idx], idx+1, mx_p);
}

/*
Recursive solve with memoization.
*/
int solve(vector<vector<vector<int>>> &memo, vector<int> &arr,
          int p1, int p2, int idx, int mx_p){
    // base case: a sum grew too large
    if((p1 > mx_p) || (p2 > mx_p)){
        return -1;
    }
    // memo'ed answer
    if(memo[p1][p2][idx] > -1){
        return memo[p1][p2][idx];
    }
    // if p1 == p2, we have a potential answer
    if(p1 == p2){
        memo[p1][p2][idx] = max(memo[p1][p2][idx], p1);
    }
    // no elements left to place (guards the arr[idx] accesses below)
    if(idx >= (int)arr.size()){
        return memo[p1][p2][idx];
    }
    // leave the current element
    memo[p1][p2][idx] = max(memo[p1][p2][idx],
        solve(memo, arr, p1, p2, idx+1, mx_p));
    // add the current element to p1
    memo[p1][p2][idx] = max(memo[p1][p2][idx],
        solve(memo, arr, p1+arr[idx], p2, idx+1, mx_p));
    // add the current element to p2
    memo[p1][p2][idx] = max(memo[p1][p2][idx],
        solve(memo, arr, p1, p2+arr[idx], idx+1, mx_p));
    return memo[p1][p2][idx];
}

int main(){
    vector<int> arr = {1, 2, 3, 4, 7};
    int ans = 0;
    int mx_p = 0;
    for(auto i : arr){
        mx_p += i;
    }
    mx_p /= 2;
    vector<vector<vector<int>>> memo(mx_p+1, vector<vector<int>>(mx_p+1,
        vector<int>(arr.size()+1, -1)));
    ans = solve(memo, arr, 0, 0, 0, mx_p);
    ans = (ans >= 0) ? ans : 0;
    // solve(arr, ans, 0, 0, 0, mx_p);
    cout << ans << endl;
    return 0;
}

How to find minimum positive contiguous sub sequence in O(n) time?

We have an O(n) algorithm (e.g. Kadane's) for finding the maximum sum contiguous subsequence of a given sequence. Can anybody suggest a similar algorithm for finding the minimum positive contiguous subsequence?
For example:
If the given sequence is 1, 2, 3, 4, 5 the answer should be 1.
For [5, -4, 3, 5, 4] the answer is 1, the minimum positive sum of elements, from [5, -4].
There cannot be such an algorithm. The lower bound for this problem is Ω(n log n) in the comparison model. I'll prove it by reducing the element distinctness problem to it (actually to the non-negative variant of it).
Let's suppose we have an O(n) algorithm for this problem (the minimum non-negative subarray).
We want to find out if an array (e.g. A = [1, 2, -3, 4, 2]) has only distinct elements. To solve this problem, we construct an array with the differences between consecutive elements (e.g. A' = [1, -5, 7, -2]) and run the O(n) algorithm on it. The original array has only distinct elements if and only if the minimum non-negative subarray sum of A' is greater than 0.
If we had an O(n) algorithm for your problem, we would have an O(n) algorithm for the element distinctness problem, which is known to be impossible in the comparison model.
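The reduction can be sketched in Python; here a brute-force routine plays the role of the hypothetical fast minimum non-negative subarray solver (all names are illustrative):

```python
def min_nonneg_subarray_sum(arr):
    # Brute-force stand-in for the hypothetical O(n) solver.
    sums = [sum(arr[i:j]) for i in range(len(arr))
            for j in range(i + 1, len(arr) + 1)]
    nonneg = [s for s in sums if s >= 0]
    return min(nonneg) if nonneg else None

def has_duplicates(A):
    # A' = consecutive differences; the sum of A'[i..j-1] equals A[j] - A[i],
    # so some pair of elements of A is equal exactly when the minimum
    # non-negative subarray sum of A' is 0.
    diffs = [A[k + 1] - A[k] for k in range(len(A) - 1)]
    return min_nonneg_subarray_sum(diffs) == 0
```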
We can have an O(n log n) algorithm as follows:
Assume we have an array prefix, where index i stores the sum of array A from 0 to i, so the sum of the sub-array (i, j) is prefix[j] - prefix[i - 1].
Thus, in order to find the minimum positive sub-array ending at index j, we need to find the maximum element prefix[x] which is less than prefix[j], with x < j. We can find that element in O(log n) time with a binary search tree.
Pseudo code:
int[] prefix = new int[A.length];
prefix[0] = A[0];
for(int i = 1; i < A.length; i++)
    prefix[i] = A[i] + prefix[i - 1];
int result = MAX_VALUE;
BinarySearchTree tree;
tree.add(0); // the empty prefix, so sub-arrays starting at index 0 are covered
for(int i = 0; i < A.length; i++){
    if(A[i] > 0)
        result = min(result, A[i]);
    int v = tree.getMaximumElementLessThan(prefix[i]);
    result = min(result, prefix[i] - v);
    tree.add(prefix[i]);
}
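For completeness, here is a runnable Python version of the pseudo code above; for brevity it keeps the prefixes in a sorted list via bisect (insertion into a list is O(n), so this sketch is O(n^2) worst case; a balanced BST would give the stated O(n log n)):

```python
import bisect

def min_positive_subarray_sum(A):
    # Prefix sums: the sum of A[i+1..j] is prefix[j] - prefix[i].
    best = None
    prefixes = [0]        # the empty prefix covers sub-arrays starting at 0
    prefix = 0
    for x in A:
        prefix += x
        # Largest earlier prefix strictly less than the current one.
        pos = bisect.bisect_left(prefixes, prefix)
        if pos > 0:
            cand = prefix - prefixes[pos - 1]
            if best is None or cand < best:
                best = cand
        bisect.insort(prefixes, prefix)
    return best   # None if no subarray has a positive sum
```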
I believe there's an O(n) algorithm, see below.
Note: it has a scale factor that might make it less attractive in practical applications: it depends on the (input) values to be processed, see remarks in the code.
private int GetMinimumPositiveContiguousSubsequence(List<Int32> values)
{
    // Note: this method has no precautions against integer over/underflow, which may occur
    // if large (abs) values are present in the input-list.

    // There must be at least 1 item.
    if (values == null || values.Count == 0)
        throw new ArgumentException("There must be at least one item provided to this method.");

    // 1. Scan once to:
    //    a) Get the minimum positive element;
    //    b) Get the value of the MAX contiguous sequence;
    //    c) Get the value of the MIN contiguous sequence - allowing negative values: the mirror of the MAX contiguous sequence;
    //    d) Pinpoint the (index of the) first negative value.
    int minPositive = 0;
    int maxSequence = 0;
    int currentMaxSequence = 0;
    int minSequence = 0;
    int currentMinSequence = 0;
    int indxFirstNegative = -1;
    for (int k = 0; k < values.Count; k++)
    {
        int value = values[k];
        if (value > 0)
        {
            if (minPositive == 0 || value < minPositive)
                minPositive = value;
        }
        else if (indxFirstNegative == -1 && value < 0)
            indxFirstNegative = k;
        currentMaxSequence += value;
        if (currentMaxSequence <= 0)
            currentMaxSequence = 0;
        else if (currentMaxSequence > maxSequence)
            maxSequence = currentMaxSequence;
        currentMinSequence += value;
        if (currentMinSequence >= 0)
            currentMinSequence = 0;
        else if (currentMinSequence < minSequence)
            minSequence = currentMinSequence;
    }

    // 2. We're done if (a) there are no negatives, or (b) the minPositive (single) value is 1 (or 0...).
    if (minSequence == 0 || minPositive <= 1)
        return minPositive;

    // 3. Real work to do.
    // The strategy is as follows, iterating over the input values:
    // a) Keep track of the cumulative value of ALL items - the sequence that starts with the very first item.
    // b) Register each such cumulative value as "existing" in a bool array 'initialSequence' as we go along.
    //    We know already the max/min contiguous sequence values, so we can properly size that array in advance.
    //    Since negative sequence values occur we'll have an offset to match the index in that bool array
    //    with the corresponding value of the initial sequence.
    // c) For each next input value to process, scan the "initialSequence" bool array to see whether relevant entries are TRUE.
    //    We don't need to go over the complete array, as we're only interested in entries that would produce a subsequence with
    //    a value that is positive and also smaller than best-so-far.
    //    (As we go along, the range to check will normally shrink as we get better and better results.
    //    Also: initially the range is already limited by the single-minimum-positive value that we have found.)
    // Performance-wise this approach (which is O(n)) is suitable IFF the number of input values is large (or at least: not small) relative to
    // the spread between maxSequence and minSequence: the latter two define the size of the array in which we will do (partial) linear traversals.
    // If this condition is not met it may be more efficient to replace the bool array by a (binary) search tree
    // (which will result in O(n log n) performance).
    // Since we know the relevant parameters at this point, we may below have the two strategies both implemented and decide run-time
    // which to choose. The current implementation has only the fixed bool array approach.

    // Initialize a variable to keep track of the best result 'so far'; it will also be the return value.
    int minPositiveSequence = minPositive;

    // The bool array to keep track of which (total) cumulative values (always with the sequence starting at element #0) have occurred so far,
    // and the 'offset' - see remark 3b above.
    int offset = -minSequence;
    bool[] initialSequence = new bool[maxSequence + offset + 1];
    int valueCumulative = 0;
    for (int k = 0; k < indxFirstNegative; k++)
    {
        int value = values[k];
        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;
    }
    for (int k = indxFirstNegative; k < values.Count; k++)
    {
        int value = values[k];
        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;
        // Check whether the difference with any previous "cumulative" may improve the optimum-so-far.
        // The index that, if the entry is TRUE, would yield the best possible result.
        int indexHigh = valueCumulative + offset - 1;
        // The last (lowest) index that, if the entry is TRUE, would still yield an improvement over what we have so far.
        int indexLow = Math.Max(0, valueCumulative + offset - minPositiveSequence + 1);
        for (int indx = indexHigh; indx >= indexLow; indx--)
        {
            if (initialSequence[indx])
            {
                minPositiveSequence = valueCumulative - indx + offset;
                if (minPositiveSequence == 1)
                    return minPositiveSequence;
                break;
            }
        }
    }
    return minPositiveSequence;
}

Finding sum of N largest elements of array of single-digit values [duplicate]

Possible Duplicate:
Retrieving the top 100 numbers from one hundred million of numbers
I have an array which consists of positive numbers between 0 and 9 (digits can repeat). I want to find the sum of the N largest elements.
For example, for array = 5 1 2 4 and N = 2,
ans = 5 + 4 = 9.
Simple approach: sort the array and sum the N largest elements. But I don't want to use it.
The simplest O(n) solution is the following:
Run through array a and increase b[a[i]], where b is a zero-initialized array of 10 integers.
Run through b starting from the end (9th position); if b[i] is lower than N, add b[i] * i to your answer and decrease N by b[i]; otherwise, if b[i] is greater than or equal to N, add N * i to the answer and end the loop.
Edit: code
vector<int> b(10, 0);
for(int i = 0; i < a.size(); ++i) {
    b[a[i]]++;
}
int sum = 0;
for(int i = 9; i >= 0; --i) {
    if(b[i] < n) {
        sum += b[i] * i;
        n -= b[i];
    } else {
        sum += n * i;
        n = 0;
        break;
    }
}
if(n != 0) {
    // not enough elements in the array
}
Insert all elements into a heap, and then delete (and sum) N elements.
Complexity: O(n + N log n), because creating a heap is O(n), each delete is O(log n), and you delete N times. Total: O(n + N log n) [where n is the number of elements in your array].
EDIT: I missed it at first, but all your numbers are digits, so the simplest solution is to use radix sort or bucket sort and then sum the N biggest elements. That solution is O(n).
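As a sketch, the heap approach in Python (the function name is mine):

```python
import heapq

def sum_n_largest(arr, n):
    # Max-heap via negation: heapify is O(len(arr)), each pop is O(log len(arr)).
    heap = [-x for x in arr]
    heapq.heapify(heap)
    return sum(-heapq.heappop(heap) for _ in range(n))
```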
I am a bit slow today, should code faster hehe ;-)
There are multiple answers already but I want to share my pseudo-code with you anyway, hope it helps!
public class LargestSumAlgorithm
{
    private ArrayList arValues;

    public void AddValueToArray(int p_iValue)
    {
        arValues.Add(p_iValue);
    }

    public int ComputeMaxSum(int p_iNumOfElementsToCompute)
    {
        // check if there are n elements in the array
        int iNumOfItemsInArray = arValues.Size;
        int iComputedValue = 0;
        if(iNumOfItemsInArray >= p_iNumOfElementsToCompute)
        {
            // order the ArrayList descending - largest values first
            arValues.Sort(SortingEnum.Descending);
            // iterate over the p_iNumOfElementsToCompute in a zero index based ArrayList
            for(int iPositionInValueArray = 0; iPositionInValueArray < p_iNumOfElementsToCompute; iPositionInValueArray++)
            {
                iComputedValue += arValues[iPositionInValueArray];
            }
        }
        else
        {
            throw new ArgumentOutOfRangeException();
        }
        return iComputedValue;
    }

    public LargestSumAlgorithm()
    {
        arValues = new ArrayList();
    }
}

public class Example
{
    public static void Main()
    {
        LargestSumAlgorithm theAlgorithm = new LargestSumAlgorithm();
        theAlgorithm.AddValueToArray(1);
        theAlgorithm.AddValueToArray(2);
        theAlgorithm.AddValueToArray(3);
        theAlgorithm.AddValueToArray(4);
        theAlgorithm.AddValueToArray(5);
        int iResult = theAlgorithm.ComputeMaxSum(3);
    }
}
If you are using C++, use std::nth_element() to partition the array into two parts, one of them containing the N largest elements (unordered). The selection algorithm runs in O(n) average time.
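A Python sketch of the same partition-based selection, using a quickselect in the spirit of std::nth_element (illustrative code and naming, average O(n)):

```python
import random

def sum_n_largest_select(a, n):
    # Quickselect with Hoare partitioning: arrange a so that a[k:] holds the
    # n largest elements (unordered), then sum that tail.
    a = list(a)
    k = len(a) - n
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if k <= j:
            hi = j
        elif k >= i:
            lo = i
        else:
            break   # j < k < i: position k already holds a pivot-valued element
    return sum(a[k:])
```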
