Given an array of numbers. At each step we can pick a number like N in this array and sum N with another number that exist in this array - algorithm

I'm stuck on this problem.
Given an array of numbers, at each step we can pick a number N in the array and add it to another number that exists in the array (N itself then becomes zero). We continue this process until all numbers in the array are equal to zero. What is the minimum number of steps required? (The sum of the numbers in the array is guaranteed to be zero initially.)
Example: -20,-15,1,3,7,9,15
Step 1: pick -15 and sum with 15 -> -20,0,1,3,7,9,0
Step 2: pick 9 and sum with -20 -> -11,0,1,3,7,0,0
Step 3: pick 7 and sum with -11 -> -4,0,1,3,0,0,0
Step 4: pick 3 and sum with -4 -> -1,0,1,0,0,0,0
Step 5: pick 1 and sum with -1 -> 0,0,0,0,0,0,0
So the answer of this example is 5.
I've tried using a greedy algorithm. It works like this:
At each step we pick the maximum and the minimum number currently in the array and sum these two numbers, until all numbers in the array are equal to zero.
But it doesn't work and gives me the wrong answer. Can anyone help me solve this problem?
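For illustration, the greedy rule described above can be sketched in Python as follows. This is a minimal sketch, not the asker's actual code: the name greedy_steps is hypothetical, and it assumes the numbers sum to zero. On the example array it needs 6 steps, one more than the 5-step sequence shown earlier.

```python
def greedy_steps(numbers):
    """Count the steps used by the max/min greedy rule (assumes the sum is zero)."""
    values = sorted(v for v in numbers if v != 0)
    steps = 0
    while len(values) >= 2:
        smallest = values.pop(0)   # current minimum
        largest = values.pop()     # current maximum
        merged = smallest + largest
        steps += 1
        if merged != 0:
            # The merged value stays in the array; keep the list sorted.
            values.append(merged)
            values.sort()
    return steps


print(greedy_steps([-20, -15, 1, 3, 7, 9, 15]))
```

The greedy rule merges -20 with 15 first, which destroys the exact -15/15 cancellation that the 5-step sequence starts with.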
#include <bits/stdc++.h>
using namespace std;
int a[] = {-20,-15,1,3,7,9,15};
int bruteforce(){
    bool isEqualToZero = 1;
    for (int i=0;i<(sizeof(a)/sizeof(int));i++)
        if (a[i] != 0){
            isEqualToZero = 0;
            break;
        }
    if (isEqualToZero)
        return 0;
    int tmp=0,m=1e9;
    for (int i=0;i<(sizeof(a)/sizeof(int));i++){
        for (int j=i+1;j<(sizeof(a)/sizeof(int));j++){
            if (a[i]*a[j] >= 0) continue;
            tmp = a[j];
            a[i] += a[j];
            a[j] = 0;
            m = min(m,bruteforce());
            a[j] = tmp;
            a[i] -= tmp;
        }
    }
    return m+1;
}
int main()
{
    cout << bruteforce();
}
This is the brute force approach that I've written for this problem. Is there any algorithm to solve this problem faster?

This has an NP-complete feel, but the following code does an A* search through all possible normalized partial sums on the way to a single non-zero term. That solves your problem, and it also means that you don't get into an infinite loop if the sum is not zero.
If greedy works, this will explore the greedy path first, verify that you can't do better, and return fairly quickly. If greedy doesn't work, this may...take a lot longer.
Implementation in Python because that is easy for me. Translation into another language is an exercise for the reader.
import heapq
def find_minimal_steps (numbers):
    # min_steps_remaining(state) is called below; its definition is not shown in this answer.
    normalized = tuple(sorted(numbers))
    seen = set([normalized])
    todo = [(min_steps_remaining(normalized), 0, normalized, None)]
    while todo: # stop when the queue is empty, not at a limit hard-coded for this example
        step_limit, steps_taken, prev, path = heapq.heappop(todo)
        steps_taken = -1 * steps_taken # We store negative for sort order
        if min_steps_remaining(prev) == 0:
            decoded_path = []
            while path is not None:
                decoded_path.append((path[0], path[1]))
                path = path[2]
            return steps_taken, list(reversed(decoded_path))
        prev_numbers = list(prev)
        for i in range(len(prev_numbers)):
            for j in range(len(prev_numbers)):
                if i != j:
                    # Track what they were
                    num_i = prev_numbers[i]
                    num_j = prev_numbers[j]
                    # Sum them
                    prev_numbers[i] += num_j
                    prev_numbers[j] = 0
                    normalized = tuple(sorted(prev_numbers))
                    if (normalized not in seen):
                        seen.add(normalized)
                        heapq.heappush(todo, (
                            min_steps_remaining(normalized) + steps_taken + 1,
                            -steps_taken - 1, # More steps is smaller is looked at first
                            normalized,
                            (num_i, num_j, path)))
                    # set them back.
                    prev_numbers[i] = num_i
                    prev_numbers[j] = num_j
print(find_minimal_steps([-20,-15,1,3,7,9,15]))
For fun I also added a linked list implementation that doesn't just tell you how many minimal steps, but which ones it found. In this case its steps were (-15, 15), (7, 9), (3, 16), (1, 19), (-20, 20) meaning add 15 to -15, 9 to 7, 16 to 3, 19 to 1, and 20 to -20.
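The listing above calls min_steps_remaining without showing it. One plausible definition, offered only as an assumption about what the author intended, is a lower bound based on the count of non-zero entries: a single step can turn at most two non-zero entries into zeros, and the search stops once at most one non-zero entry is left.

```python
def min_steps_remaining(state):
    """Lower bound on the steps needed to leave at most one non-zero entry.

    Assumed heuristic (not taken from the original answer): each step removes
    at most two non-zero entries, so c non-zero entries need at least c // 2
    steps to get down to one or zero.
    """
    nonzero = sum(1 for value in state if value != 0)
    return nonzero // 2


print(min_steps_remaining((-20, -15, 1, 3, 7, 9, 15)))
```

Because this bound never overestimates, it is admissible in the A* sense, and it returns 0 exactly when the state has at most one non-zero entry.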

Related

How to get original array from random shuffle of an array

I was asked in an interview today below question. I gave O(nlgn) solution but I was asked to give O(n) solution. I could not come up with O(n) solution. Can you help?
An input array is given like [1,2,4] then every element of it is doubled and
appended into the array. So the array now looks like [1,2,4,2,4,8]. Now
this array is randomly shuffled. One possible random arrangement is
[4,8,2,1,2,4]. Now we are given this random shuffled array and we want to
get original array [1,2,4] in O(n) time.
The original array can be returned in any order. How can I do it?
Here's an O(N) Java solution that could be improved by first making sure that the array is of the proper form. For example it shouldn't accept [0] as an input:
import java.util.*;
class Solution {
public static int[] findOriginalArray(int[] changed) {
if (changed.length % 2 != 0)
return new int[] {};
// set Map size to optimal value to avoid rehashes
Map<Integer,Integer> count = new HashMap<>(changed.length*100/75);
int[] original = new int[changed.length/2];
int pos = 0;
// count frequency for each number
for (int n : changed) {
count.put(n, count.getOrDefault(n,0)+1);
}
// now decide which go into the answer
for (int n : changed) {
int smallest = n;
for (int m=n; m > 0 && count.getOrDefault(m,0) > 0; m = m/2) {
//System.out.println(m);
smallest = m;
if (m % 2 != 0) break;
}
// trickle up from smallest to largest while count > 0
for (int m=smallest, mm = 2*m; count.getOrDefault(mm,0) > 0; m = mm, mm=2*mm){
int ct = count.getOrDefault(mm,0);
while (count.get(m) > 0 && ct > 0) {
//System.out.println("adding "+m);
original[pos++] = m;
count.put(mm, ct -1);
count.put(m, count.get(m) - 1);
ct = count.getOrDefault(mm,0);
}
}
}
// check for incorrect format
if (count.values().stream().anyMatch(x -> x > 0)) {
return new int[] {};
}
return original;
}
public static void main(String[] args) {
int[] changed = {1,2,4,2,4,8};
System.out.println(Arrays.toString(changed));
System.out.println(Arrays.toString(findOriginalArray(changed)));
}
}
But I've tried to keep it simple.
The output is NOT guaranteed to be sorted. If you want it sorted it's going to cost O(NlogN) inevitably unless you use a Radix sort or something similar (which would make it O(NlogE) where E is the max value of the numbers you're sorting and logE the number of bits needed).
Runtime
This may not look like O(N), but it is: each chain has its lowest number found only ONCE and is then trickled up only ONCE. Said another way, an iteration that processes X elements does O(X) work and leaves N-X elements for the remaining iterations. Therefore, even though there are for loops inside for loops, the total is still O(N).
An example execution can be seen with [64,32,16,8,4,2].
If this were not O(N), then printing each value traversed while finding the smallest would show the values appearing over and over again (for example N*(N+1)/2 times).
But instead you see them only once:
finding smallest 64
finding smallest 32
finding smallest 16
finding smallest 8
finding smallest 4
finding smallest 2
adding 2
adding 8
adding 32
If you're familiar with the Heapify algorithm you'll recognize the approach here.
def findOriginalArray(self, changed: List[int]) -> List[int]:
    size = len(changed)
    ans = []
    left_elements = size//2
    #IF SIZE IS ODD THEN RETURN [] NO SOLN. IS POSSIBLE
    if(size%2 !=0):
        return ans
    #FREQUENCY DICTIONARY given array [0,0,2,1] my map will be: {0:2,2:1,1:1}
    d = {}
    for i in changed:
        if(i in d):
            d[i]+=1
        else:
            d[i] = 1
    # CHECK THE EDGE CASE OF 0
    if(0 in d):
        count = d[0]
        half = count//2
        if((count % 2 != 0) or (half > left_elements)):
            return ans
        left_elements -= half
        ans = [0 for i in range(half)]
    #CHECK REST OF THE CASES : considering the values will be 10^5
    for i in range(1,50001):
        if(i in d):
            if(d[i] > 0):
                count = d[i]
                if(count > left_elements):
                    ans = []
                    break
                left_elements -= d[i]
                for j in range(count):
                    ans.append(i)
                if(2*i in d):
                    if(d[2*i] < count):
                        ans = []
                        break
                    else:
                        d[2*i] -= count
                else:
                    ans = []
                    break
    return ans
I have a simple idea which might not be the best, but I could not think of a case where it would not work. Having the array A with the doubled elements and randomly shuffled, keep a helper map. Process each element of the array and, each time you find a new element, add it to the map with the value 0. When an element is processed, increment map[i] and decrement map[2*i]. Next you iterate over the map and print the elements that have a value greater than zero.
A simple example, say that the vector is:
[1, 2, 3]
And the doubled/shuffled version is:
A = [3, 2, 1, 4, 2, 6]
When processing 3, first add the keys 3 and 6 to the map with value zero. Increment map[3] and decrement map[6]. This way, map[3] = 1 and map[6] = -1. Then for the next element map[2] = 1 and map[4] = -1 and so forth. The final state of the map in this example would be map[1] = 1, map[2] = 1, map[3] = 1, map[4] = -1, map[6] = 0, map[8] = -1, map[12] = -1.
Then you just process the keys of the map and, for each key with a value greater than zero, add it to the output. There are certainly more efficient solutions, but this one is O(n).
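The bookkeeping described above can be sketched in Python as follows. This is a minimal sketch with a hypothetical function name, following the description literally (each key with a positive value is emitted once). It reproduces the [1, 2, 3] example, but it also shows a limitation: when the original array contains both a number and its double, as in the question's own input [4,8,2,1,2,4], the increments and decrements cancel and an element is lost.

```python
from collections import defaultdict


def positive_balance_keys(shuffled):
    """Increment balance[x] and decrement balance[2*x] for each element x,
    then return the keys whose balance is greater than zero."""
    balance = defaultdict(int)
    for value in shuffled:
        balance[value] += 1
        balance[2 * value] -= 1
    return sorted(key for key, net in balance.items() if net > 0)


print(positive_balance_keys([3, 2, 1, 4, 2, 6]))   # the example from the answer
print(positive_balance_keys([4, 8, 2, 1, 2, 4]))   # the question's input: 4 is missing
```

For the second input the balance of 4 ends at zero (it appears twice, and 2 appears twice), so only 1 and 2 are reported.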
In C++, you can try this.
The running time is O(N + KlogK), where N is the length of the input and K is the number of unique elements in the input.
class Solution {
public:
vector<int> findOriginalArray(vector<int>& input) {
if (input.size() % 2) return {};
unordered_map<int, int> m;
for (int n : input) m[n]++;
vector<int> nums;
for (auto [n, cnt] : m) nums.push_back(n);
sort(begin(nums), end(nums));
vector<int> out;
for (int n : nums) {
if (m[2 * n] < m[n]) return {};
for (int i = 0; i < m[n]; ++i, --m[2 * n]) out.push_back(n);
}
return out;
}
};
I'm not clear about the space complexity required in the question, so this is my off-the-top-of-my-head attempt, assuming O(n) time complexity is what is required.
If the length of the input array is not even, then the input is invalid.
Create a map, add the elements of the input array to it.
Divide each element in the input array by 2 and check if that value exists in the map. If it exists, add it to the array (slice) orig.
We may have added duplicate values to this original array, so clean them up.
Here is some sample Go code:
https://go.dev/play/p/w4mm-rloHyi
I am sure this code can be optimized in a lot of ways for space complexity, but it is O(n) in time.

Find max sum of elements in an array ( with twist)

Given an array of positive and negative integers, find the maximum sum such that you are not allowed to skip 2 contiguous elements (i.e. you have to select at least one of them to move forward).
eg :-
10 , 20 , 30, -10 , -50 , 40 , -50, -1, -3
Output : 10+20+30-10+40-1 = 89
This problem can be solved using Dynamic Programming approach.
Let arr be the given array and opt be the array to store the optimal solutions.
opt[i] is the maximum sum that can be obtained starting from element i, inclusive.
opt[i] = arr[i] + (some other elements after i)
Now to solve the problem we iterate the array arr backwards, each time storing the answer opt[i].
Since we cannot skip 2 contiguous elements, either element i+1 or element i+2 has to be included
in opt[i].
So for each i, opt[i] = arr[i] + max(opt[i+1], opt[i+2])
See this code to understand:
int arr[n]; // array of given numbers. array size = n.
input(arr, n); // input the array elements (given numbers)
int opt[n+2]; // optimal solutions.
memset(opt, 0, sizeof(opt)); // Initially set all optimal solutions to 0.
for(int i = n-1; i >= 0; i--) {
    opt[i] = arr[i] + max(opt[i+1], opt[i+2]);
}
int ans = max(opt[0], opt[1]); // final answer.
Observe that opt array has n+2 elements. This is to avoid getting illegal memory access exception (memory out of bounds) when we try to access opt[i+1] and opt[i+2] for the last element (n-1).
See the working implementation of the algorithm given above
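As a runnable illustration of the backward recurrence above, here is a minimal Python sketch. The function name is hypothetical; the logic follows the opt[i] = arr[i] + max(opt[i+1], opt[i+2]) rule with two padding zeros at the end.

```python
def max_sum_no_double_skip(arr):
    """Maximum sum when no two contiguous elements may both be skipped."""
    n = len(arr)
    # opt[i]: best sum starting at element i with element i taken.
    # Two extra zero slots stand in for positions past the end.
    opt = [0] * (n + 2)
    for i in range(n - 1, -1, -1):
        opt[i] = arr[i] + max(opt[i + 1], opt[i + 2])
    # Either the first or the second element must be taken.
    return max(opt[0], opt[1])


print(max_sum_no_double_skip([10, 20, 30, -10, -50, 40, -50, -1, -3]))  # 89
```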
Use a recurrence that accounts for that:
dp[i] = max(dp[i - 1] + a[i],   <- take two consecutive elements
            dp[i - 2] + a[i])   <- skip a[i - 1]
Base cases left as an exercise.
If you see a +ve integer add it to the sum. If you see a negative integer, then inspect the next integer pick which ever is maximum and add it to the sum.
10 , 20 , 30, -10 , -50 , 40 , -50, -1, -3
For this add 10, 20, 30, max(-10, -50), 40 max(-50, -1) and since there is no element next to -3 discard it.
The last element will go to sum if it was +ve.
Answer:
I think this algorithm will help.
1. Create a method which gives output the maximum sum of particular user input array say T[n], where n denotes the total no. of elements.
2. Now this method will keep on adding array elements till they are positive. As we want to maximize the sum and there is no point in dropping positive elements behind.
3. As soon as our method encounters a negative element, it will transfer all consecutive negative elements to another method which create a new array say N[i] such that this array will contain all the consecutive negative elements that we encountered in T[n] and returns N[i]'s max output.
In this way our main method is not affected and its keep on adding positive elements and whenever it encounters negative element, it instead of adding their real values adds the net max output of that consecutive array of negative elements.
for example: T[n] = 29,34,55,-6,-5,-4,6,43,-8,-9,-4,-3,2,78 //here n=14
Main Method Working:
29+34+55+(sends data & gets value from Secondary method of array [-6,-5,-4])+6+43+(sends data & gets value from Secondary method of array [-8,-9,-4,-3])+2+78
Process Terminates with max output.
Secondary Method Working:
{
N[i] = gets array from Main method or itself as and when required.
This is basically a recursive method.
say N[i] has elements like N1, N2, N3, N4, etc.
for i>=3:
Now choice goes like this.
1. If we take N1 then we can recurse the left off array i.e. N[i-1] which has all elements except N1 in same order. Such that the net max output will be
N1+(sends data & gets value from Secondary method of array N[i-1] recursively)
2. If we don't take N1, then we cannot skip N2. So now the algorithm is like the 1st choice but starting with N2. So max output in this case will be
N2+(sends data & gets value from Secondary method of array N[i-2] recursively).
Here N[i-2] is an array containing all N[i] elements except N1 & N2 in same order.
Termination: When we are left with the array of size one ( for N[i-2] ) then we have to choose that particular value as no option.
The recursions will finally yield the max outputs and we have to finally choose the output of that choice which is more.
and redirect the max output to wherever required.
for i=2:
we have to choose the value which is bigger
for i=1:
We can surely skip that value.
So max output in this case will be 0.
}
I think this answer will help to you.
Given array:
Given:-   10  20  30 -10 -50  40 -50  -1  -3
Array1:-  10  30  60  50  10  90  40  89  86
Array2:-  10  20  50  40   0  80  30  79  76
Take the max value of array1[n-1],array1[n],array2[n-1],array2[n] i.e 89(array1[n-1])
Algorithm:-
For array1, assign array1[0]=a[0], array1[1]=a[0]+a[1]; for array2, assign array2[0]=a[0], array2[1]=a[1].
calculate the array1 value from 2 to n is max of sum of array1[i-1]+a[i] or array1[i-2]+a[i].
for loop from 2 to n{
array1[i]=max(array1[i-1]+a[i],array1[i-2]+a[i]);
}
similarly for array2 value from 2 to n is max of sum of array2[i-1]+a[i] or array2[i-2]+a[i].
for loop from 2 to n{
array2[i]=max(array2[i-1]+a[i],array2[i-2]+a[i]);
}
Finally find the max value of array1[n-1],array1[n],array2[n-1],array2[n];
#include <stdio.h>
int max(int a,int b){
    return a>b?a:b;
}
int main(){
    int a[]={10,20,30,-10,-50,40,-50,-1,-3};
    int i,n,max_sum;
    n=sizeof(a)/sizeof(a[0]);
    int array1[n],array2[n];
    array1[0]=a[0];
    array1[1]=a[0]+a[1];
    array2[0]=a[0];
    array2[1]=a[1];
    for(i=2;i<n;i++){
        array1[i]=max(array1[i-1]+a[i],array1[i-2]+a[i]);
        array2[i]=max(array2[i-1]+a[i],array2[i-2]+a[i]);
    }
    --i; // i == n after the loop, so step back to the last index
    max_sum=max(array1[i],array1[i-1]);
    max_sum=max(max_sum,array2[i-1]);
    max_sum=max(max_sum,array2[i]);
    printf("The max_sum is %d",max_sum);
    return 0;
}
Ans: The max_sum is 89
public static void countSum(int[] a) {
int count = 0;
int skip = 0;
int newCount = 0;
if(a.length==1)
{
count = a[0];
}
else
{
for(int i:a)
{
newCount = count + i;
if(newCount>=skip)
{
count = newCount;
skip = newCount;
}
else
{
count = skip;
skip = newCount;
}
}
}
System.out.println(count);
}
}
Let the array be of size N, indexed as 1...N
Let f(n) be the function, that provides the answer for max sum of sub array (1...n), such that no two left over elements are consecutive.
f(n) = max(a[n-1] + f(n-2), a[n] + f(n-1))
In the first option, a[n-1] + f(n-2), we leave out the last element and, due to the condition given in the question, must select the second-to-last element.
In the second option, a[n] + f(n-1), we select the last element of the subarray, so we are free to select or skip the second-to-last element.
Now starting from the base cases:
f(0) = 0 [Subarray (1..0) doesn't exist]
f(1) = (a[1] > 0 ? a[1] : 0); [Subarray (1..1)]
f(2) = max(a[1] + f(0), a[2] + f(1)) [Choosing at least one of them]
Moving forward we can calculate any f(n), where n = 1...N, and store the results to calculate the next ones. The case f(N) gives us the answer.
Time complexity O(n)
Space complexity O(n)
n = arr.length().
Append a 0 at the end of the array to handle boundary case.
ans: int array of size n+1.
ans[i] will store the answer for array a[0...i] which includes a[i] in the answer sum.
Now,
ans[0] = a[0]
ans[1] = max(a[1], a[1] + ans[0])
for i in [2,n]:
    ans[i] = max(ans[i-1] , ans[i-2]) + a[i]
Final answer would be ans[n]
If you want to avoid using Dynamic Programming
To find the maximum sum, first, you've to add all the positive
numbers.
We'll be skipping only negative elements. Since we're not
allowed to skip 2 contiguous elements, we will put all contiguous
negative elements in a temp array, and can figure out the maximum sum
of alternate elements using sum_odd_even function as defined below.
Then we can add the maximum of all such temp arrays to our sum of all
positive numbers. And the final sum will give us the desired output.
Code:
def sum_odd_even(arr):
    sum1 = sum2 = 0
    for i in range(len(arr)):
        if i%2 == 0:
            sum1 += arr[i]
        else:
            sum2 += arr[i]
    return max(sum1,sum2)
input = [10, 20, 30, -10, -50, 40, -50, -1, -3]
result = 0
temp = []
for i in range(len(input)):
    if input[i] > 0:
        result += input[i]
    if input[i] < 0 and i != len(input)-1:
        temp.append(input[i])
    elif input[i] < 0:
        temp.append(input[i])
        result += sum_odd_even(temp)
        temp = []
    else:
        result += sum_odd_even(temp)
        temp = []
print(result)
Simple Solution: Skip with twist :). Just skip the smallest number in i & i+1 if consecutive -ve. Have if conditions to check that till n-2 elements and check for the last element in the end.
int getMaxSum(int[] a) {
int sum = 0;
for (int i = 0; i <= a.length-2; i++) {
if (a[i]>0){
sum +=a[i];
continue;
} else if (a[i+1] > 0){
i++;
continue;
} else {
sum += Math.max(a[i],a[i+1]);
i++;
}
}
if (a[a.length-1] > 0){
sum+=a[a.length-1];
}
return sum;
}
The correct recurrence is as follows:
dp[i] = max(dp[i - 1] + a[i], dp[i - 2] + a[i - 1])
The first case is the one where we pick the i-th element. The second case is the one where we skip the i-th element; in that case, we must pick the (i-1)-th element.
The problem with IVlad's answer is that it always picks the i-th element, which can lead to an incorrect answer.
This question can be solved using include,exclude approach.
For first element, include = arr[0], exclude = 0.
For rest of the elements:
nextInclude = arr[i]+max(include, exclude)
nextExclude = include
include = nextInclude
exclude = nextExclude
Finally, ans = Math.max(include,exclude).
Similar questions can be referred at (Not the same)=> https://www.youtube.com/watch?v=VT4bZV24QNo&t=675s&ab_channel=Pepcoding.
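The include/exclude update above can be sketched in Python with two running values. This is a minimal sketch with a hypothetical function name; the treatment of an empty array (returning 0) is an assumption not covered by the answer.

```python
def max_sum_include_exclude(arr):
    """Maximum sum when no two contiguous elements may both be skipped."""
    if not arr:
        return 0  # assumption: an empty array has sum 0
    include, exclude = arr[0], 0
    for value in arr[1:]:
        # Taking the current element is allowed after either state;
        # skipping it is only allowed if the previous element was taken.
        include, exclude = value + max(include, exclude), include
    return max(include, exclude)


print(max_sum_include_exclude([10, 20, 30, -10, -50, 40, -50, -1, -3]))  # 89
```

Only two variables are kept, so the extra space is O(1).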

How to get to array with the smallest sum

I was given this interview question, and I totally blanked out. How would you guys solve this:
Go from the start of an array to the end in a way that you minimize the sum of elements that you land on.
You can move to the next element, i.e go from index 1 to index 2.
Or you can hop one element over. i.e go from index 1 to index 3.
Assuming that you only move from left to right, and you want to find a way to get from index 0 to index n - 1 of an array of n elements, so that the sum of the path you take is minimum. From index i, you can only move ahead to index i + 1 or index i + 2.
Observe that the minimum path to get from index 0 to index k is the smaller of the minimum path from index 0 to index k - 1 and the minimum path from index 0 to index k - 2, plus the element at index k. There is simply no other path to take.
Therefore, we can have a dynamic programming solution:
DP[0] = A[0]
DP[1] = A[0] + A[1]
DP[k] = min(DP[k - 1], DP[k - 2]) + A[k]
A is the array of elements.
DP array will store the minimum sum to reach element at index i from index 0.
The result will be in DP[n - 1].
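A minimal Python sketch of this dynamic program follows, assuming (as stated above) that the path must start at index 0 and land on index n - 1. The function name is hypothetical, and only the last two DP values are kept instead of the full array.

```python
def min_path_sum(a):
    """Minimum sum of a left-to-right path from index 0 to index n - 1,
    moving one or two positions at a time."""
    if not a:
        raise ValueError("No elements")
    if len(a) == 1:
        return a[0]
    two_back, one_back = a[0], a[0] + a[1]   # DP[0], DP[1]
    for k in range(2, len(a)):
        # DP[k] uses the cheaper of the two possible predecessors.
        two_back, one_back = one_back, min(one_back, two_back) + a[k]
    return one_back


print(min_path_sum([1, 1, 10, 1]))  # 3
```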
Java:
static int getMinSum(int elements[])
{
if (elements == null || elements.length == 0)
{
throw new IllegalArgumentException("No elements");
}
if (elements.length == 1)
{
return elements[0];
}
int minSum[] = new int[elements.length];
minSum[0] = elements[0];
minSum[1] = elements[0] + elements[1];
for (int i = 2; i < elements.length; i++)
{
minSum[i] = Math.min(minSum[i - 1] + elements[i], minSum[i - 2] + elements[i]);
}
return Math.min(minSum[elements.length - 2], minSum[elements.length - 1]);
}
Input:
int elements[] = { 1, -2, 3 };
System.out.println(getMinSum(elements));
Output:
-1
Case description:
We start from the index 0. We must take 1. Now we can go to index 1 or 2. Since -2 is attractive, we choose it. Now we can go to index 2 or hop it. Better hop and our sum is minimal 1 + (-2) = -1.
More examples (pseudocode):
getMinSum({1, 1, 10, 1}) == 3
getMinSum({1, 1, 10, 100, 1000}) == 102
Algorithm:
O(n) complexity. Dynamic programming. We go from left to right filling up minSum array. Invariant: minSum[i] = min(minSum[i - 1] + elements[i] /* move by 1 */ , minSum[i - 2] + elements[i] /* hop */ ).
This seems like the perfect place for a dynamic programming solution.
Keeping track of two values, odd/even.
We will take Even to mean we used the previous value, and Odd to mean we haven't.
int Even = 0; int Odd = 0;
int length = arr.length;
Start at the back. We can either take the number or not. Therefore:
Even = arr[length - 1];
Odd = 0;
And now we move to the next element with two cases. Either we were even, in which case we have the choice to take the element or skip it. Or we were odd and had to take the element.
int current = arr[length - 2];
int previousEven = Even;
Even = Min(Even + current, Odd + current);
Odd = previousEven;
We can make a loop out of this and achieve a O(n) solution!

Largest sum of k elements not larger than m

This problem is from a programing competition, and I can't manage to solve it in acceptable time.
You are given an array a of n integers. Find the largest sum s of exactly k elements (not necessarily continuous) that does not exceed a given integer m (s < m).
Constraints:
0 < k <= n < 100
m < 3000
0 < a[i] < 100
Info: A solution is guaranteed to exist for the given input.
Now, I guess my best bet is a DP approach, but couldn't come up with the correct formula.
I would try two things. They are both based on the following idea:
If we can solve the problem of deciding if there are k elements that sum exactly to p, then we can binary search for the answer in [1, m].
1. Optimized bruteforce
Simply sort your array and cut your search short when the current sum exceeds p. The idea is that you will generally only have to backtrack very little, since the sorted array should help eliminate bad solutions early.
To be honest, I doubt this will be fast enough however.
2. A randomized algorithm
Keep a used array of size k. Randomly assign elements to it. While their sum is not p, randomly replace an element with another and make sure to update their sum in constant time.
Keep doing this a maximum of e times (experiment with its value for best results, the complexity will be O(e log m) in the end, so it can probably go quite high), if you couldn't get to sum p during this time, assume that it is not possible.
Alternatively, forget the binary search. Run the randomized algorithm directly and return the largest valid sum it finds in e runs or until your allocated running time ends.
I am not sure how DP would efficiently keep track of the number of elements used in the sum. I think the randomized algorithm is worth a shot since it is easy to implement.
Both of the accepted methods are inferior. Also, this is not a problem type that can be solved by DP.
The following is the correct method illustrated via example:
imagine a = { 2, 3, 5, 9, 11, 14, 17, 23 } (hence n = 8), k = 3, and S = 30 (the limit called m in the question)
Sort the array a.
Define three pointers into the array, P1, P2, and P3 going from 1 to n. P1 < P2 < P3
Set P3 to a_max (here 23), P1 to 1, and P2 to 2. Calculate the sum s (here 23 + 2 + 3 = 28). If s > S, then decrease P3 by one and try again until you find a solution. If P3 < 3, then there is no solution. Store your first solution as best known solution so far (BKSSF).
Next, increase P2 until s > S. If you find a better solution update BKSSF. Decrease P2 by one.
Next increase P1 until s > S. Update if you find a better solution.
Now go back to P2 and decrease it by one.
Then increase P1 until s > S. etc.
You can see this is a recursive algorithm, in which for every increase or decrease, there is one or more corresponding decreases, increases.
This algorithm will be much, much faster than the attempts above.
for l <= k and r <= s:
    V[l][r] = true iff it is possible to choose exactly l elements that sum up to r.
V[0][0] = true
for i in 1..n:
    V'[][] - initialize with false
    for l in 0..k-1:
        for r in 0..s:
            if V[l][r] and r + a[i] <= s:
                V'[l + 1][r + a[i]] = true
    V |= V'
That gives you all achievable sums in O(k * n * s).
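The table above can be sketched in Python with one set of reachable sums per element count. This is a minimal sketch under stated assumptions: the function name is hypothetical, the bound is taken as strict (s < m, as written in the question), and element counts are updated from high to low in place of the separate V' table so that each element is used at most once.

```python
def largest_k_sum_below(a, k, m):
    """Largest sum of exactly k elements of a that is strictly below m,
    or None if no such sum exists."""
    # reachable[l] holds every sum < m obtainable with exactly l elements.
    reachable = [set() for _ in range(k + 1)]
    reachable[0].add(0)
    for value in a:
        # Descending l: reachable[l] has not yet been extended with this value.
        for l in range(k - 1, -1, -1):
            for r in reachable[l]:
                if r + value < m:
                    reachable[l + 1].add(r + value)
    return max(reachable[k]) if reachable[k] else None


data = [12, 43, 1, 4, 3, 5, 13, 34, 24, 22, 31]
print(largest_k_sum_below(data, 3, 36))  # 35
```

With the constraints from the question (n < 100, k < 100, m < 3000) the table stays small.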
I think Tyler Durden had the right idea. But you don't have to sum -all- the elements, and you can basically do it greedily, so you can cut down the loop a lot. In C++:
#include <iostream>
#include <algorithm>
using namespace std;
#define FI(n) for(int i=0;i<(n);i++)
int m, n, k;
int a[] = { 12, 43, 1, 4, 3, 5, 13, 34, 24, 22, 31 },
e[20];
inline int max(int i) { return n-k+i+1; }
void print(int e[], int ii, int sum)
{ cout << sum << '\t';
FI(ii+1) cout << e[i]<<','; cout<<'\n';
}
bool desc(int a, int b) { return a>b; }
int solve()
{ sort(a, a+n, desc);
cout <<"a="; FI(n) cout << a[i]<<','; cout<<"\nsum\tindexes\n";
int i,sum;
i = e[0] = sum = 0;
print (e,i,a[0]);
while(1)
{ while (e[i]<max(i) && sum+a[e[i]]>=m) e[i]++;
if (e[i]==max(i))
{ if (!i) return -1; // FAIL
cout<<"*"; print (e,i,sum);
sum -= a[e[--i]++];
} else // sum+a[e[i]]<m
{ sum += a[e[i]];
print (e,i,sum);
if (i+1==k) return sum;
e[i+1] = e[i]+1;
i++;
}
}
}
int main()
{ n = sizeof(a)/sizeof(int);
k = 3;
m = 39;
cout << "n,k,m="<<n<<' '<<k<<' '<<m<<'\n';
cout << solve();
}
For m=36 it gives the output
n,k,m=11 3 36
a=43,34,31,24,22,13,12,5,4,3,1,
sum indexes
43 0,
34 1,
*34 1,10,
31 2,
35 2,8,
*35 2,8,11,
34 2,9,
35 2,9,10,
35
For m=37 it gives
n,k,m=11 3 37
a=43,34,31,24,22,13,12,5,4,3,1,
sum indexes
43 0,
34 1,
*34 1,10,
31 2,
36 2,7,
*36 2,7,11,
35 2,8,
36 2,8,10,
36
(One last try: for m=39 it also gives the right answer, 38)
Output: the last number is the sum and the line above it has the indexes. Lines with an asterisk are before backtracking, so the last index of the line is one too high. Runtime should be O(k*n).
Sorry for the hard-to-understand code. I can clean it up and provide explanation upon request but I have another project due at the moment ;).

Algorithm to find two repeated numbers in an array, without sorting

There is an array of size n (numbers are between 0 and n - 3) and only 2 numbers are repeated. Elements are placed randomly in the array.
E.g. in {2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, and repeated numbers are 3 and 5.
What is the best way to find the repeated numbers?
P.S. [You should not use sorting]
There is an O(n) solution if you know the possible domain of the input. For example, if your input array contains numbers between 0 and 100, consider the following code.
bool flags[101]; // one flag per value in 0..100
for(int i = 0; i <= 100; i++)
    flags[i] = false;
for(int i = 0; i < input_size; i++)
    if(flags[input_array[i]])
        return input_array[i];
    else
        flags[input_array[i]] = true;
Of course there is the additional memory but this is the fastest.
OK, seems I just can't give it a rest :)
Simplest solution
int A[N] = {...};
int signed_1(int n) { return n%2<1 ? +n : -n; } // 0,-1,+2,-3,+4,-5,+6,-7,...
int signed_2(int n) { return n%4<2 ? +n : -n; } // 0,+1,-2,-3,+4,+5,-6,-7,...
long S1 = 0; // or int64, or long long, or some user-defined class
long S2 = 0; // so that it has enough bits to contain sum without overflow
for (int i=0; i<N-2; ++i)
{
S1 += signed_1(A[i]) - signed_1(i);
S2 += signed_2(A[i]) - signed_2(i);
}
for (int i=N-2; i<N; ++i)
{
S1 += signed_1(A[i]);
S2 += signed_2(A[i]);
}
S1 = abs(S1);
S2 = abs(S2);
assert(S1 != S2); // this algorithm fails in this case
p = (S1+S2)/2;
q = abs(S1-S2)/2;
One sum (S1 or S2) contains p and q with the same sign, the other sum - with opposite signs, all other members are eliminated.
S1 and S2 must have enough bits to accommodate sums, the algorithm does not stand for overflow because of abs().
if abs(S1)==abs(S2) then the algorithm fails, though this value will still be the difference between p and q (i.e. abs(p - q) == abs(S1)).
Previous solution
I doubt somebody will ever encounter such a problem in the field ;)
and I guess, I know the teacher's expectation:
Lets take array {0,1,2,...,n-2,n-1},
The given one can be produced by replacing last two elements n-2 and n-1 with unknown p and q (less order)
so, the sum of elements will be (n-1)n/2 + p + q - (n-2) - (n-1)
the sum of squares (n-1)n(2n-1)/6 + p^2 + q^2 - (n-2)^2 - (n-1)^2
Simple math remains:
(1) p+q = S1
(2) p^2+q^2 = S2
Surely you won't solve it as math classes teach to solve square equations.
First, calculate everything modulo 2^32, that is, allow for overflow.
Then check pairs {p,q}: {0, S1}, {1, S1-1} ... against expression (2) to find candidates (there might be more than 2 due to modulo and squaring)
And finally check found candidates if they really are present in array twice.
You know that your Array contains every number from 0 to n-3 and the two repeating ones (p & q). For simplicity, lets ignore the 0-case for now.
You can calculate the sum and the product over the array, resulting in:
1 + 2 + ... + n-3 + p + q = p + q + (n-3)(n-2)/2
So if you subtract (n-3)(n-2)/2 from the sum of the whole array, you get
sum(Array) - (n-3)(n-2)/2 = x = p + q
Now do the same for the product:
1 * 2 * ... * (n - 3) * p * q = (n - 3)! * p * q
prod(Array) / (n - 3)! = y = p * q
You now have these terms:
x = p + q
y = p * q
=> p and q are the two roots of t^2 - x*t + y = 0
Solving this quadratic, t = (x ± sqrt(x^2 - 4y)) / 2, gives p and q.
Insert each element into a set/hashtable, first checking if it is already in it.
You might be able to take advantage of the fact that sum(array) = (n-2)*(n-3)/2 + the two repeated numbers.
Edit: As others have noted, you can use this combined with the sum of squares; I was just a little slow in figuring it out.
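The set-based check from the first sentence above can be sketched in Python as follows. This is a minimal sketch with a hypothetical function name; it uses O(n) extra memory and reports the repeated values in the order their second occurrence is seen.

```python
def find_repeated(arr):
    """Return the values that occur more than once, using a set of seen values."""
    seen = set()
    repeated = []
    for value in arr:
        if value in seen:
            repeated.append(value)   # second occurrence found
        else:
            seen.add(value)
    return repeated


print(find_repeated([2, 3, 6, 1, 5, 4, 0, 3, 5]))  # [3, 5]
```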
Check this old but good paper on the topic:
Finding Repeated Elements (PDF)
Some answers to the question: Algorithm to determine if array contains n…n+m? contain as a subproblem solutions which you can adopt for your purpose.
For example, here's a relevant part from my answer:
bool has_duplicates(int* a, int m, int n)
{
/** O(m) in time, O(1) in space (for 'typeof(m) == typeof(*a) == int')
Whether a[] array has duplicates.
precondition: all values are in [n, n+m) range.
feature: It marks visited items using a sign bit.
*/
assert((INT_MIN - (INT_MIN - 1)) == 1); // check n == INT_MIN
for (int *p = a; p != &a[m]; ++p) {
*p -= (n - 1); // [n, n+m) -> [1, m+1)
assert(*p > 0);
}
// determine: are there duplicates
bool has_dups = false;
for (int i = 0; i < m; ++i) {
const int j = abs(a[i]) - 1;
assert(j >= 0);
assert(j < m);
if (a[j] > 0)
a[j] *= -1; // mark
else { // already seen
has_dups = true;
break;
}
}
// restore the array
for (int *p = a; p != &a[m]; ++p) {
if (*p < 0)
*p *= -1; // unmark
// [1, m+1) -> [n, n+m)
*p += (n - 1);
}
return has_dups;
}
The program leaves the array unchanged (the array should be writeable but its values are restored on exit).
It works for array sizes up to INT_MAX (2147483647 where int is 32 bits, which is the case on most platforms, including common 64-bit ones).
suppose array is
a[0], a[1], a[2] ..... a[n-1]
sumA = a[0] + a[1] +....+a[n-1]
sumASquare = a[0]*a[0] + a[1]*a[1] + a[2]*a[2] + .... + a[n-1]*a[n-1]
sumFirstN = (N*(N+1))/2 where N=n-3 so
sumFirstN = (n-3)(n-2)/2
similarly
sumFirstNSquare = N*(N+1)*(2*N+1)/6 = (n-3)(n-2)(2n-5)/6
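Suppose the repeated elements are X and Y
so X + Y = sumA - sumFirstN;
X*X + Y*Y = sumASquare - sumFirstNSquare;
From these, X*Y = ((X + Y)*(X + Y) - (X*X + Y*Y)) / 2, so X and Y are the roots of a quadratic with known coefficients, and solving it gives the values of X and Y.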
Time Complexity = O(n)
space complexity = O(1)
I know the question is very old but I suddenly hit it and I think I have an interesting answer to it.
We know this is a brainteaser and a trivial solution (i.e. HashMap, Sort, etc) no matter how good they are would be boring.
As the numbers are integers, they have a constant bit size (e.g. 32). Let us assume we are working with 4-bit integers right now. We look for A and B, which are the duplicate numbers.
We need 4 buckets, one for each bit. Each bucket holds the sum of the numbers whose corresponding bit is 1. For example, bucket 1 gets 2, 3, 6, 7, ...:
Bucket 0 : Sum ( x where: x & 2 power 0 != 0 )
...
Bucket i : Sum ( x where: x & 2 power i != 0 )
We know what would be the sum of each bucket if there was no duplicate. I consider this as prior knowledge.
Once the above buckets are generated, some of them will have sums larger than expected. Setting the bit of every such bucket constructs the number A OR B (for your information).
We can calculate (A XOR B) as follows:
A XOR B = Array[0] XOR Array[1] XOR ... XOR Array[n-1] XOR 0 XOR 1 XOR ... XOR (n-3)
Now going back to the buckets, we know exactly which buckets have both our numbers and which ones have only one (from the XOR bits).
For a bucket that has only one of the numbers we can extract that number as num = (sum - expected sum of bucket). It is enough to find one of the duplicate numbers this way, so if at least one bit is set in A XOR B, we've got the answer.
But what if A XOR B is zero?
Well, this case is only possible if both duplicate numbers are the same number, in which case that number is the value of A OR B.
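A rough sketch of the bucket idea above, simplified to a single bucket: it picks one bit where A XOR B is set, takes the excess of that bucket as one duplicate, and gets the other from the excess of the total sum. The function name and this single-bucket shortcut are assumptions made for illustration, and the array is assumed to hold 0..n-3 plus the two duplicates.

```python
def two_repeated_by_bit_bucket(arr):
    """Assumes arr (length n) holds 0..n-3 once each plus two extra values."""
    n = len(arr)
    full_range = range(n - 2)                   # the values 0..n-3
    xor_all = 0
    for v in arr:
        xor_all ^= v
    for v in full_range:
        xor_all ^= v                            # now xor_all == A XOR B
    excess = sum(arr) - sum(full_range)         # A + B
    if xor_all == 0:                            # A == B: one value occurs three times
        return excess // 2, excess // 2
    bit = xor_all & -xor_all                    # a bit set in exactly one of A, B
    bucket = sum(v for v in arr if v & bit)
    expected = sum(v for v in full_range if v & bit)
    first = bucket - expected                   # the duplicate that has this bit set
    second = excess - first
    return min(first, second), max(first, second)

print(two_repeated_by_bit_bucket([2, 3, 6, 1, 5, 4, 0, 3, 5]))
```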
Sorting the array would seem to be the best solution. A simple sort would then make the search trivial and would take a whole lot less time/space.
Otherwise, if you know the domain of the numbers, create an array with that many buckets in it and increment each as you go through the array. Something like this:
int count[10] = {0}; /* one zero-initialized counter per possible value */
for (int i = 0; i < arraylen; i++) {
    count[array[i]]++;
}
Then just search the count array for any entries greater than 1. Those are the items with duplicates. This only requires one pass across the original array and one pass across the count array.
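The same counting idea as a small sketch in Python. The `domain_size` parameter (number of distinct possible values, starting at 0) and the function name are assumptions for the example.

```python
def repeated_by_counting(arr, domain_size):
    """Assumes every value in arr lies in range(domain_size)."""
    counts = [0] * domain_size
    for value in arr:                 # one pass over the input
        counts[value] += 1
    # one pass over the counters: anything above 1 is a duplicate
    return [value for value, c in enumerate(counts) if c > 1]

print(repeated_by_counting([2, 3, 6, 1, 5, 4, 0, 3, 5], 7))
```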
Here's an implementation in Python of @eugensk00's answer (one of its revisions) that doesn't use modular arithmetic. It is a single-pass algorithm, O(log(n)) in space. If fixed-width (e.g. 32-bit) integers are used, it requires only two fixed-width numbers (e.g. for 32-bit: one 64-bit number and one 128-bit number). It can handle arbitrarily large integer sequences (it reads one integer at a time, so the whole sequence doesn't need to be in memory).
def two_repeated(iterable):
    s1, s2 = 0, 0
    for i, j in enumerate(iterable):
        s1 += j - i     # number_of_digits(s1) ~ 2 * number_of_digits(i)
        s2 += j*j - i*i # number_of_digits(s2) ~ 4 * number_of_digits(i)
    s1 += (i - 1) + i
    s2 += (i - 1)**2 + i**2
    p = (s1 - int((2*s2 - s1**2)**.5)) // 2
    # `Decimal().sqrt()` could replace `int()**.5` for really large integers
    # or any function to compute integer square root
    return p, s1 - p
Example:
>>> two_repeated([2, 3, 6, 1, 5, 4, 0, 3, 5])
(3, 5)
A more verbose version of the above code follows with explanation:
def two_repeated_seq(arr):
    """Return the only two duplicates from `arr`.
    >>> two_repeated_seq([2, 3, 6, 1, 5, 4, 0, 3, 5])
    (3, 5)
    """
    n = len(arr)
    assert all(0 <= i < n - 2 for i in arr) # all in range [0, n-2)
    assert len(set(arr)) == (n - 2) # number of unique items
    s1 = (n-2) + (n-1)       # s1 and s2 have ~ 2*(k+1) and 4*(k+1) digits
    s2 = (n-2)**2 + (n-1)**2 # where k is a number of digits in `max(arr)`
    for i, j in enumerate(arr):
        s1 += j - i
        s2 += j*j - i*i
    """
    s1 = (n-2) + (n-1) + sum(arr) - sum(range(n))
       = sum(arr) - sum(range(n-2))
       = sum(range(n-2)) + p + q - sum(range(n-2))
       = p + q
    """
    assert s1 == (sum(arr) - sum(range(n-2)))
    """
    s2 = (n-2)**2 + (n-1)**2 + sum(i*i for i in arr) - sum(i*i for i in range(n))
       = sum(i*i for i in arr) - sum(i*i for i in range(n-2))
       = p*p + q*q
    """
    assert s2 == (sum(i*i for i in arr) - sum(i*i for i in range(n-2)))
    """
    s1 = p+q
    -> s1**2 = (p+q)**2
    -> s1**2 = p*p + 2*p*q + q*q
    -> s1**2 - (p*p + q*q) = 2*p*q
    s2 = p*p + q*q
    -> p*q = (s1**2 - s2)/2
    Let C = p*q = (s1**2 - s2)/2 and B = p+q = s1 then from Viete theorem follows
    that p and q are roots of x**2 - B*x + C = 0
    -> p = (B + sqrtD) / 2
    -> q = (B - sqrtD) / 2
    where sqrtD = sqrt(B**2 - 4*C)
    -> p = (s1 + sqrt(2*s2 - s1**2))/2
    """
    sqrtD = (2*s2 - s1**2)**.5
    assert int(sqrtD)**2 == (2*s2 - s1**2) # perfect square
    sqrtD = int(sqrtD)
    assert (s1 - sqrtD) % 2 == 0 # even
    p = (s1 - sqrtD) // 2
    q = s1 - p
    assert q == ((s1 + sqrtD) // 2)
    assert sqrtD == (q - p)
    return p, q
NOTE: calculating integer square root of a number (~ N**4) makes the above algorithm non-linear.
Since a range is specified, you can perform a radix sort, which sorts your array in O(n). Searching for duplicates in the sorted array is then O(n) as well.
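A short sketch of the sort-then-scan idea. Note that it uses Python's built-in `sorted` (a comparison sort, O(n log n)) as a stand-in for radix sort; only the linear scan for equal neighbours is the point here, and the function name is made up for the example.

```python
def repeated_after_sort(arr):
    """Sort, then report every value that equals its right-hand neighbour."""
    ordered = sorted(arr)          # stand-in for a radix sort (assumption)
    return [a for a, b in zip(ordered, ordered[1:]) if a == b]

print(repeated_after_sort([2, 3, 6, 1, 5, 4, 0, 3, 5]))
```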
You can use a simple nested for loop:
int[] numArray = new int[] { 1, 2, 3, 4, 5, 7, 8, 3, 7 };
for (int i = 0; i < numArray.Length; i++)
{
    for (int j = i + 1; j < numArray.Length; j++)
    {
        if (numArray[i] == numArray[j])
        {
            //DO SOMETHING
        }
    }
}
Or you can filter the array and use a recursive function if you want to get the count of occurrences:
int[] array = { 1, 2, 3, 4, 5, 4, 4, 1, 8, 9, 23, 4, 6, 8, 9, 1, 4 };
// Prints how often the first element occurs, then recurses on the
// array with every copy of that element filtered out.
// Requires `using System;` and `using System.Linq;`.
void GetDuplicates(int[] array)
{
    if (array.Length == 0)
        return; // base case: nothing left to count
    int count = 1;
    for (int j = 1; j < array.Length; j++)
    {
        if (array[0] == array[j])
        {
            count += 1;
        }
    }
    Console.WriteLine(" {0} occurred {1} time/s", array[0], count);
    int[] rest = array.Where(n => n != array[0]).ToArray();
    GetDuplicates(rest);
}
GetDuplicates(array);
answer to 18..
You are taking an array of 9 elements whose values start from 0, so the maximum element in your array is 6. Take the sum of the numbers from 0 to 6 and the sum of the array elements, and compute their difference (say d). This is p + q. Now take the XOR of the numbers from 0 to 6 (say x1) and the XOR of the array elements (say x2). x2 is the XOR of all numbers from 0 to 6 except the two repeated ones, since those cancel each other out. Now, for each element a[i] of the array, treat it as p and compute q by subtracting it from d. XOR p and q with x2 and check whether the result equals x1. Doing this for all elements, you will get the elements for which this condition is true, and you are done in O(n). Keep coding!
check this out ...
O(n) time and O(1) space complexity
int xor = 0, x = 0, y = 0;
for (i = 0; i < n; i++)
    xor = xor ^ arr[i];
for (i = 1; i <= n-3; i++)
    xor = xor ^ i;
So in the given example you will get the xor of 3 and 5
xor = xor & -xor; // isolate the lowest set bit
for (i = 0; i < n; i++)
{
    if (arr[i] & xor)
        x = x ^ arr[i];
    else
        y = y ^ arr[i];
}
for (i = 1; i <= n-3; i++)
{
    if (i & xor)
        x = x ^ i;
    else
        y = y ^ i;
}
x and y are your answers
For each number: check if it exists in the rest of the array.
Without sorting you're going to have to keep track of the numbers you've already visited.
In pseudocode this would basically be (done this way so I'm not just giving you the answer):
for each number in the list
if number not already in unique numbers list
add it to the unique numbers list
else
return that number as it is a duplicate
end if
end for each
How about this:
for (i=0; i<n-1; i++) {
for (j=i+1; j<n; j++) {
if (a[i] == a[j]) {
printf("%d appears more than once\n",a[i]);
break;
}
}
}
Sure it's not the fastest, but it's simple and easy to understand, and requires no additional memory. If n is a small number like 9, or 100, then it may well be the "best". (i.e. "Best" could mean different things: fastest to execute, smallest memory footprint, most maintainable, least cost to develop etc.)
In C:
int arr[] = {2, 3, 6, 1, 5, 4, 0, 3, 5};
int num = 0, i;
for (i = 0; i < 9; i++)  /* XOR all nine array elements */
    num = num ^ arr[i];
for (i = 0; i < 7; i++)  /* XOR every value in the range 0..6 */
    num = num ^ i;
Since x^x = 0, every number that occurs an even number of times in total is neutralized: each non-repeated value appears once in the array and once in the range, so it cancels, while each repeated value appears three times and survives. Let's call the repeated numbers a and b. We are left with a^b. We know a^b != 0, since a != b. Choose any 1 bit of a^b and use that as a mask, i.e. choose x as a power of 2 so that x & (a^b) is nonzero.
Now split the list into two sublists -- one sublist contains all numbers y with y&x == 0, and the rest go in the other sublist. By the way we chose x, we know that a and b fall into different buckets. So we can now apply the same method used above to each bucket independently (XORing each bucket together with the range values that fall into it), and discover what a and b are.
I have written a small program which finds the elements that are not repeated. Have a look and let me know your opinion. At the moment I assume an even number of elements, but it can easily be extended for odd numbers as well.
My idea is to first sort the numbers and then apply my algorithm. Quicksort can be used to sort the elements.
Let's take an input array as below:
int arr[] = {1,1,2,10,3,3,4,5,5,6,6};
The numbers 2, 10 and 4 are not repeated. The algorithm needs equal elements to be adjacent, which sorting guarantees; if the input is not arranged that way, sort it first (for example with quicksort).
Let's apply my program to this:
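#include <cstdio>
#include <vector>
using namespace std;
int main()
{
    //int arr[] = {2, 9, 6, 1, 1, 4, 2, 3, 5};
    int arr[] = {1,1,2,10,3,3,4,5,5,6,6};
    const int n = sizeof(arr)/sizeof(arr[0]);
    int i = 0;
    vector<int> vec;
    int var = arr[0];
    for(i = 1 ; i < n; i += 2)
    {
        var = var ^ arr[i];
        if(var != 0 )
        {
            // arr[i-1] has no equal neighbour: put it in the vector
            var = arr[i-1];
            vec.push_back(var);
            i = i-1; // re-align so the next pair starts at arr[i]
        }
        if (i + 1 < n) // avoid reading past the end of the array
            var = arr[i+1];
    }
    for(size_t k = 0 ; k < vec.size() ; k++)
        printf("value not repeated = %d\n",vec[k]);
    return 0;
}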
This gives the output:
value not repeated = 2
value not repeated = 10
value not repeated = 4
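It's simple and very straightforward: just use XOR on neighbouring elements. Note that this only works if the array is sorted first, so that equal values are adjacent.
for (i = 0; i < n - 1; i++) {
    if (!(arr[i] ^ arr[i+1]))
        printf("Found Repeated number %5d", arr[i]);
}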
Here is an algorithm that uses order statistics and runs in O(n).
You can solve this by repeatedly calling SELECT with the median as parameter.
It relies on the fact that, after a call to SELECT, the elements that are less than or equal to the median are moved to the left of the median.
Call SELECT on A with the median as the parameter.
If the median value is floor(n/2), then the repeated values are to the right of the median, so you continue with the right half of the array.
Otherwise a repeated value is to the left of the median, so you continue with the left half of the array.
You continue this way recursively.
For example:
When A={2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, then the median should be the value 4.
After the first call to SELECT
A={3, 2, 0, 1, <3>, 4, 5, 6, 5} The median value is smaller than 4 so we continue with the left half.
A={3, 2, 0, 1, 3}
After the second call to SELECT
A={1, 0, <2>, 3, 3} then the median should be 2 and it is so we continue with the right half.
A={3, 3}, found.
This algorithm runs in O(n+n/2+n/4+...)=O(n).
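The halving argument can be illustrated with a simplified sketch. Instead of SELECT it partitions by the midpoint of the value range and keeps whichever side holds more elements than it has distinct values (pigeonhole), so it is a swapped-in variant rather than the algorithm above. It assumes the array of length n holds 0..n-3 plus two extra values, and the function name is hypothetical. It finds one repeated value.

```python
def one_repeated_by_halving(arr):
    """Assumes arr (length n) holds 0..n-3 once each plus two extra values."""
    lo, hi = 0, len(arr) - 3              # current value range [lo, hi]
    items = list(arr)                     # elements whose value lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        left = [v for v in items if v <= mid]
        if len(left) > mid - lo + 1:      # more elements than values: repeat is here
            items, hi = left, mid
        else:                             # otherwise the right part is overfull
            items, lo = [v for v in items if v > mid], mid + 1
    return lo

print(one_repeated_by_halving([2, 3, 6, 1, 5, 4, 0, 3, 5]))
```

Each round keeps roughly half of the remaining elements, which gives the same n + n/2 + n/4 + ... bound.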
What about using the https://en.wikipedia.org/wiki/HyperLogLog?
Redis does http://redis.io/topics/data-types-intro#hyperloglogs
A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However there is a set of algorithms that trade memory for precision: you end with an estimated measure with a standard error, in the case of the Redis implementation, which is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We'll just call them HLL from now) has seen very few elements.
Well, using a nested for loop, and assuming the question is to find the numbers that occur only twice in an array:
def repeated(ar, n):
    for i in range(n):
        count = 0  # reset for every element so earlier matches don't carry over
        for j in range(i+1, n):
            if ar[i] == ar[j]:
                count += 1
        if count == 1:
            print("repeated:", ar[i])
arr = [2, 3, 6, 1, 5, 4, 0, 3, 5]
n = len(arr)
repeated(arr, n)
Why do maths at all (especially solving quadratic equations)? Those are costly operations. A better way to solve this is to construct a bitmap with one bit per possible value, i.e. n-2 bits for the values 0 to n-3, which takes ((n-2) + 7) / 8 bytes. Allocate this memory with calloc so that every single bit is initialized to 0. Then traverse the list and set the corresponding bit to 1 for each number encountered; if the bit is already set to 1 for that number, then it is a repeated number.
This can be extended to find out whether any number is missing from the array.
This solution is O(n) in time complexity.
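A minimal sketch of the bitmap approach, using a zero-initialized `bytearray` in place of calloc. It assumes the array of length n holds values in 0..n-3; the function name is made up for the example.

```python
def repeated_by_bitmap(arr):
    """Assumes every value in arr lies in 0..len(arr)-3."""
    n = len(arr)
    bits = bytearray((n - 2 + 7) // 8)    # one bit per possible value, all zero
    repeated = []
    for value in arr:
        byte, mask = value >> 3, 1 << (value & 7)
        if bits[byte] & mask:             # bit already set: value seen before
            repeated.append(value)
        else:
            bits[byte] |= mask            # mark the value as seen
    return repeated

print(repeated_by_bitmap([2, 3, 6, 1, 5, 4, 0, 3, 5]))
```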
