Instance of subset sum problem - algorithm

I have a problem which is a pretty clear instance of the subset sum problem:
"given a list of Integers in the range [-65000,65000], the function returns true if any subset of the list summed is equal to zero. False otherwise."
What I wanted to ask is more of an explanation than a solution.
This is an instance-specific solution I came up with before thinking about the complexity of the problem.
Sort the array A[] and, during the sort, add each element to a counter 'extSum' (O(N log N)).
Define two indices low = 0 and high = n-1.
Here is the deciding code:
while(A[low]<0){
sum = extSum;
if(extSum>0){
while(sum - A[high] < sum){
tmp = sum - A[high];
if(tmp==0) return true;
else if(tmp > 0){
sum = tmp;
high--;
}
else{
high--;
}
}
extSum -= A[low];
low++;
high = n - 1;
}
else{
/* Symmetric code: switch low, high and the operation > and < */
}
}
return false;
First of all, is this solution correct? I made some tests, but I am not sure...it looks too easy...
Isn't the time complexity of this code O(n^2)?
I already read the various DP solutions and the thing I would like to understand is, for the specific instance of the problem I am facing, how much better than this naive and intuitive solution they are. I know my approach can be improved a lot but nothing that would make a big difference when it comes to the time complexity....
Thank you for the clarifications
EDIT: One obvious optimization would be that, while sorting, if a 0 is found, the function returns true immediately....but it's only for the specific case in which there are 0s in the array.

Hmm, I think the input {0} will beat your answer.
A[0] < 0 is false, so the while loop is skipped entirely and the function returns false, although the subset {0} sums to zero.
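For comparison with the two-pointer attempt, here is a minimal sketch of the table-style (DP) idea in Python. It is not the asker's method: the function name has_zero_subset is made up for illustration, and the sketch assumes that "subset" means a non-empty subset. It keeps the set of all sums reachable so far; because every value lies in [-65000, 65000], that set can hold at most about 2 * 65000 * n distinct sums, so the work is pseudo-polynomial rather than exponential.

```python
def has_zero_subset(values):
    """Return True if some non-empty subset of values sums to zero."""
    reachable = set()  # sums of all non-empty subsets of the elements seen so far
    for v in values:
        # Either start a new subset with v, or extend every earlier subset by v.
        new_sums = {v} | {s + v for s in reachable}
        if 0 in new_sums:
            return True
        reachable |= new_sums
    return False


print(has_zero_subset([0]))          # the counterexample above
print(has_zero_subset([3, -1, -2]))
print(has_zero_subset([1, 2, 4]))
```

Unlike the greedy scan, this considers every combination of elements implicitly, at the cost of memory proportional to the number of distinct sums.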


Convert to divide and conquer algorithm. Kotlin

Convert the method "Final" to a divide and conquer algorithm.
The task reads as follows: the buyer has n coins in denominations of H1,...,Hn. The seller has m coins in denominations of B1,...,Bm. Can the buyer purchase an item of cost S so that the seller can give exact change (if necessary)?
fun Final(H: ArrayList<Int>, B: ArrayList<Int>, S: Int): Boolean {
var Clon_Price = false;
var Temp: Int;
for (i in H) {
if (i == S)
return true;
}
for (i in H.withIndex()) {
Temp = i.value - S;
for (j in B) {
if (j == Temp)
Clon_Price = true;
}
}
return Clon_Price;
}
fun main(args: Array<String>) {
val H:ArrayList<Int> = ArrayList();
val B:ArrayList<Int> = ArrayList();
println("Enter the number of coins the buyer has:");
var n: Int = readln().toInt();
println("Enter their nominal value:")
while (n > 0){
H.add(readln().toInt());
n--
}
println("Enter the number of coins the seller has:");
var m: Int = readln().toInt();
println("Enter their nominal value:")
while (m > 0){
B.add(readln().toInt());
m--
}
println("Enter the product price:");
val S = readln().toInt();
if(Final(H,B,S)){
println("YES");
}
else{
println("No!");
}
}
Introduction
Since this is an assignment, I will only give you insights to solve this problem and you will need to do the coding yourself.
The algorithm
Receives two ArrayList<Int> and an Int parameter
if the searched (S) element can be found in H, then the result is true
Otherwise it loops H
Computes the difference between the current element and S
Searches for a match in B and if it's found, then true is being returned
If the method has not returned yet, then return false;
Divide et impera (Divide and conquer)
Divide and conquer is the process of breaking down a complicated task into similar, but simpler subtasks, repeating this breaking down until the subtasks become trivial (this was the divide part) and then, using the results of the trivial subtasks we can solve the slightly more complicated subtasks and go upwards in our layers of unsolved complexities until the problem is solved (this is the conquer part).
A very handy data-structure to use is the Stack. You can use the stack of your memory, which are fancy words for recursion, or, you can solve it iteratively, by managing such a stack yourself.
This specific problem
This algorithm does not seem to necessitate divide and conquer, given the fact that you only have two array lists that can be iterated, so, I guess, this is an early assignment.
To make sure this is divide and conquer, you can add two parameters to your method (which are 0 and length - 1 at the start) that reflect the current problem-space. And upon each call, check whether the starting and ending index (the two new parameters) are equal. If they are, you already have a trivial, simplified subtask and you just iterate the second ArrayList.
If they are not equal, then you still need to divide. You can simply
//... Some code here
return Final(H, B, S, start, (start + end) / 2) || Final(H, B, S, (start + end) / 2 + 1, end);
(there you go, I couldn't resist writing code, after all)
for your nontrivial cases. This automatically breaks down the problem into sub-problems.
Self-criticism
The idea above is a simplistic solution for you to get the gist. But, in reality, programmers dislike recursion, as it can lead to trouble. So, once you complete the implementation of the above, you are well-advised to convert your algorithm to make sure it's iterative, which should be fairly easy once you succeeded implementing the recursive version.
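To make the recursion concrete, here is a rough sketch of the splitting idea in Python rather than Kotlin. The name can_buy and the inclusive start/end convention are assumptions for illustration only; the check inside the trivial case mirrors the original method (a buyer coin equals S, or the difference coin - S is a single seller coin).

```python
def can_buy(h, b, s, start, end):
    """Divide-and-conquer check over the buyer coins h[start..end] (inclusive)."""
    if start > end:
        # Empty range: there is no buyer coin to try.
        return False
    if start == end:
        # Trivial subtask: one buyer coin, scan the seller's coins for the change.
        coin = h[start]
        return coin == s or (coin - s) in b
    mid = (start + end) // 2  # midpoint of the current range, not end / 2
    return can_buy(h, b, s, start, mid) or can_buy(h, b, s, mid + 1, end)


print(can_buy([5, 10], [3], 7, 0, 1))
```

The first call passes 0 and len(h) - 1 as the range, matching the "0 and length - 1 at the start" advice above.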

DTW algorithm: simple implementation - Verification

I have tried to make a simple implementation of the DTW algorithm in C, without using any substantial optimization techniques. I am trying to use this implementation for some simple sketch recognition, which is to say finding the k closest neighbors of a given sketch from within a set. I have gotten some results that seem weird to me and I would like to know if this is because of my DTW implementation. I need someone to verify my algorithm.
As I said, I am trying to find the k closest neighbors, so the only 'optimization' I have implemented to make calculations faster is that if the minimum cost of a given line calculated is at any point greater than the maximum distance between the k sketches currently considered as the closest neighbors, I stop calculating and return +inf.
Here is the corresponding algorithm:
(returnValue totalCost) dtw(sketch1, sketch2, curMaxDist){
distMatrix = 'empty matrix of size (sketch1.size) x (sketch2.size)'
totalCostMatrix = 'empty matrix of size (sketch1.size) x (sketch2.size)'
for(i = 0 to sketch1.size - 1){
for(j = 0 to sketch2.size - 1){
distMatrix[i][j] = euclidianDistance(sketch1.point[i], sketch2.point[j])
totalCostMatrix[i][j] = +inf
}
}
//I am forcing the first points of each sketch to correspond to one
// and continue applying the algorithm from the next points.
for(i = 1 to sketch1.size - 1){
curMinDist = +inf
for(j = 1 to sketch2.size - 1){
totalCostMatrix[i][j] = min(totalCostMatrix[i-1][j-1],
totalCostMatrix[i-1][j],
totalCostMatrix[i][j-1]) + distMatrix[i][j]
if(totalCostMatrix[i][j] < curMinDist)
curMinDist = totalCostMatrix[i][j]
}
if(curMinDist > curMaxDist)
return +inf
}
return totalCostMatrix[sketch1.size - 1][sketch2.size - 1]
}
I am sure there is nothing wrong with the implementation as far as the syntax, the C language etc. are concerned, since I have checked that and I always get the expected result. I was just wondering if there is something wrong with the reasoning behind the algorithm. I am asking because it is a really well-known algorithm and a really simple implementation, so maybe it is easy for someone to spot an error there.
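One thing worth checking against the pseudocode: as written, every cell of totalCostMatrix starts at +inf and the main loops begin at index 1, so cell [0][0] is never given a finite value and every later minimum stays +inf. For reference, here is a minimal Python sketch of the usual DTW recurrence with the same early-abandon test. It is a sketch under assumptions, not the asker's C code: points are taken to be coordinate tuples, the first cell is seeded with the distance between the first points, and row 0 and column 0 are filled like any other cell.

```python
import math


def dtw(sketch1, sketch2, cur_max_dist=math.inf):
    """DTW cost between two lists of points, abandoning early when hopeless."""
    n, m = len(sketch1), len(sketch2)
    if n == 0 or m == 0:
        return math.inf
    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        row_min = math.inf
        for j in range(m):
            d = math.dist(sketch1[i], sketch2[j])  # Euclidean distance
            if i == 0 and j == 0:
                best_prev = 0.0  # the first points are forced to correspond
            else:
                best_prev = min(
                    cost[i - 1][j - 1] if i > 0 and j > 0 else math.inf,
                    cost[i - 1][j] if i > 0 else math.inf,
                    cost[i][j - 1] if j > 0 else math.inf,
                )
            cost[i][j] = best_prev + d
            row_min = min(row_min, cost[i][j])
        # Distances are non-negative, so no path through this row can end
        # below row_min; give up once that already exceeds the bound.
        if row_min > cur_max_dist:
            return math.inf
    return cost[n - 1][m - 1]


print(dtw([(0, 0), (0, 0), (1, 0)], [(0, 0), (1, 0)]))
```

The early exit is safe because every warping path must pass through each row and costs never decrease along a path.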

Algorithm to find duplicate in an array

I have an assignment to create an algorithm to find duplicates in an array of number values, but it does not say which kind of numbers, integers or floats. I have written the following pseudocode:
FindingDuplicateAlgorithm(A) // A is the array
mergeSort(A);
for int i <- 0 to i<A.length
if A[i] == A[i+1]
i++
return A[i]
else
i++
Have I created an efficient algorithm?
I think there is a problem in my algorithm: it returns duplicate numbers several times. For example, if the array includes 2 at two indexes, I will have ...2, 2,... in the output. How can I change it to return each duplicate only once?
I think it is a good algorithm for integers, but does it work well for float numbers too?
To handle duplicates, you can do the following:
if A[i] == A[i+1]:
    result.append(A[i]) # collect found duplicates in a list
    while A[i] == A[i+1]: # skip the entire range of duplicates
        i++ # until a new value is found
Do you want to find Duplicates in Java?
You may use a HashSet.
HashSet h = new HashSet();
for(Object a:A){
boolean b = h.add(a);
boolean duplicate = !b;
if(duplicate)
// do something with a;
}
The return-Value of add() is defined as:
true if the set did not already
contain the specified element.
EDIT:
I know HashSet is optimized for insert and contains operations, but I'm not sure if it's fast enough for your needs.
EDIT2:
I've seen you recently added the homework tag. I would not recommend my answer if it's homework, because it may be too "high-level" for an algorithms lesson.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashSet.html#add%28java.lang.Object%29
Your answer seems pretty good. First sorting and then simply checking neighboring values gives you O(n log(n)) complexity, which is quite efficient.
Merge sort is O(n log(n)) while checking neighboring values is simply O(n).
One thing though (as mentioned in one of the comments) you are going to get a stack overflow (lol) with your pseudocode. The inner loop should be (in Java):
for (int i = 0; i < array.length - 1; i++) {
...
}
Then also, if you actually want to display which numbers (and or indexes) are the duplicates, you will need to store them in a separate list.
I'm not sure what language you need to write the algorithm in, but there are some really good C++ solutions in response to my question here. Should be of use to you.
O(n) algorithm: traverse the array and try to insert each element into a hashtable/set with the number as the hash key. If you cannot insert it, then that's a duplicate.
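A minimal Python sketch of that hash-based pass, assuming the values are hashable numbers (the function name find_duplicates is only illustrative). A second set makes each duplicate appear once in the result, no matter how often it repeats.

```python
def find_duplicates(values):
    """Return the set of values that occur more than once (expected O(n) time)."""
    seen = set()
    duplicates = set()
    for v in values:
        if v in seen:
            duplicates.add(v)  # already inserted earlier, so v is a duplicate
        else:
            seen.add(v)
    return duplicates


print(find_duplicates([1, 2, 2, 3, 3, 3]))
```

This works for floats as well as integers as long as exact equality is the intended comparison.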
Your algorithm contains a buffer overrun. i starts with 0, so I assume the indexes into array A are zero-based, i.e. the first element is A[0], the last is A[A.length-1]. Now i counts up to A.length-1, and in the loop body accesses A[i+1], which is out of the array for the last iteration. Or, simply put: If you're comparing each element with the next element, you can only do length-1 comparisons.
If you only want to report duplicates once, I'd use a bool variable firstDuplicate, that's set to false when you find a duplicate and true when the number is different from the next. Then you'd only report the first duplicate by only reporting the duplicate numbers if firstDuplicate is true.
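The two points above (only length-1 comparisons, and a flag so each value is reported once) can be sketched in Python as follows. The names duplicates_sorted and first_duplicate are illustrative, and the built-in sorted stands in for the merge sort of the question.

```python
def duplicates_sorted(values):
    """Sort, then scan neighbors; report each duplicated value exactly once."""
    a = sorted(values)            # O(n log n)
    result = []
    first_duplicate = True        # True while the current run has not been reported
    for i in range(len(a) - 1):   # stop at len-2 so a[i + 1] is always in range
        if a[i] == a[i + 1]:
            if first_duplicate:
                result.append(a[i])
                first_duplicate = False
        else:
            first_duplicate = True  # a new value starts here
    return result


print(duplicates_sorted([3, 2, 2, 2, 1, 3]))
```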
public void printDuplicates(int[] inputArray) {
if (inputArray == null) {
throw new IllegalArgumentException("Input array can not be null");
}
int length = inputArray.length;
if (length == 1) {
System.out.print(inputArray[0] + " ");
return;
}
for (int i = 0; i < length; i++) {
if (inputArray[Math.abs(inputArray[i])] >= 0) {
inputArray[Math.abs(inputArray[i])] = -inputArray[Math.abs(inputArray[i])];
} else {
System.out.print(Math.abs(inputArray[i]) + " ");
}
}
}

Dynamic programming - Coin change decision

I'm reviewing some old notes from my algorithms course and the dynamic programming problems are seeming a bit tricky to me. I have a problem where we have an unlimited supply of coins, with some denominations x1, x2, ... xn and we want to make change for some value X. We are trying to design a dynamic program to decide whether change for X can be made or not (not minimizing the number of coins, or returning which coins, just true or false).
I've done some thinking about this problem, and I can see a recursive method of doing this where it's something like...
MakeChange(X, x[1..n this is the coins])
for (int i = 1; i < n; i++)
{
if ( (X - x[i] ==0) || MakeChange(X - x[i]) )
return true;
}
return false;
Converting this a dynamic program is not coming so easily to me. How might I approach this?
Your code is a good start. The usual way to convert a recursive solution to a dynamic-programming one is to do it "bottom-up" instead of "top-down". That is, if your recursive solution calculates something for a particular X using values for smaller x, then instead calculate the same thing starting at smaller x, and put it in a table.
In your case, change your MakeChange recursive function into a canMakeChange table.
canMakeChange[0] = True
for X = 1 to (your max value):
    canMakeChange[X] = False
    for i=1 to n:
        if X>=x[i] and canMakeChange[X-x[i]]==True:
            canMakeChange[X]=True
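A runnable Python version of the table above might look like this. It is a sketch that assumes all denominations are positive integers; the function name can_make_change is chosen here for illustration.

```python
def can_make_change(target, coins):
    """Bottom-up table: can[a] is True when amount a can be formed from coins."""
    can = [False] * (target + 1)
    can[0] = True  # the empty selection makes change for 0
    for amount in range(1, target + 1):
        # amount is reachable if removing one coin leaves a reachable amount
        can[amount] = any(c <= amount and can[amount - c] for c in coins)
    return can[target]


print(can_make_change(7, [2, 5]))
print(can_make_change(3, [2, 5]))
```

The running time is proportional to target times the number of denominations, since each table entry is computed once.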
My solution below is an exhaustive search that explores all the solutions and caches the best one found so far. If the solution currently being built is already as long as the cached one, that path is aborted. Note: for best performance the denominations should be in decreasing order.
import java.util.ArrayList;
import java.util.List;
public class CoinDenomination {
int denomination[] = new int[]{50,33,21,2,1};
int minCoins=Integer.MAX_VALUE;
String path;
class Node{
public int coinValue;
public int amtRemaining;
public int solutionLength;
public String path="";
public List<Node> next;
public String toString() { return "C: "+coinValue+" A: "+amtRemaining+" S:"+solutionLength;}
}
public List<Node> build(Node node)
{
if(node.amtRemaining==0)
{
if (minCoins>node.solutionLength) {
minCoins=node.solutionLength;
path=node.path;
}
return null;
}
if (node.solutionLength==minCoins) return null;
List<Node> nodes = new ArrayList<Node>();
for(int deno:denomination)
{
if(node.amtRemaining>=deno)
{
Node nextNode = new Node();
nextNode.amtRemaining=node.amtRemaining-deno;
nextNode.coinValue=deno;
nextNode.solutionLength=node.solutionLength+1;
nextNode.path=node.path+"->"+deno;
System.out.println(node);
nextNode.next = build(nextNode);
nodes.add(nextNode);
}
}
return nodes;
}
public void start(int value)
{
Node root = new Node();
root.amtRemaining=value;
root.solutionLength=0;
root.path="start";
root.next=build(root);
System.out.println("Smallest solution of coins count: "+minCoins+" \nCoins: "+path);
}
public static void main(String args[])
{
CoinDenomination coin = new CoinDenomination();
coin.start(35);
}
}
Just add a memoization step to the recursive solution, and the dynamic algorithm falls right out of it. The following example is in Python:
cache = {}
def makeChange(amount, coins):
    if (amount,coins) in cache:
        return cache[amount, coins]
    if amount == 0:
        ret = True
    elif not coins:
        ret = False
    elif amount < 0:
        ret = False
    else:
        ret = makeChange(amount-coins[0], coins) or makeChange(amount, coins[1:])
    cache[amount, coins] = ret
    return ret
Of course, you could use a decorator to auto-memoize, leading to more natural code:
def memoize(f):
    cache = {}
    def ret(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return ret
@memoize
def makeChange(amount, coins):
    # coins must be a tuple so that the arguments are hashable
    if amount == 0:
        return True
    elif not coins:
        return False
    elif amount < 0:
        return False
    return makeChange(amount-coins[0], coins) or makeChange(amount, coins[1:])
Note: even the non-dynamic-programming version you posted had all kinds of edge cases bugs, which is why the makeChange above is slightly longer than yours.
This paper is very relevant: http://ecommons.library.cornell.edu/handle/1813/6219
Basically, as others have said, making optimal change totaling an arbitrary X with arbitrary denomination sets is NP-Hard, meaning dynamic programming won't yield a timely algorithm. This paper proposes a polynomial-time (that is, polynomial in the size of the input, which is an improvement upon previous algorithms) algorithm for determining if the greedy algorithm always produces optimal results for a given set of denominations.
Here is a C# version, just for reference, that finds the minimal number of coins required for a given sum:
(one may refer to my blog at http://codingworkout.blogspot.com/2014/08/coin-change-subset-sum-problem-with.html for more details)
public int DP_CoinChange_GetMinimalDemoninations(int[] coins, int sum)
{
coins.ThrowIfNull("coins");
coins.Throw("coins", c => c.Length == 0 || c.Any(ci => ci <= 0));
sum.Throw("sum", s => s <= 0);
int[][] DP_Cache = new int[coins.Length + 1][];
for (int i = 0; i <= coins.Length; i++)
{
DP_Cache[i] = new int[sum + 1];
}
for(int i = 1;i<=coins.Length;i++)
{
for(int s=0;s<=sum;s++)
{
if (coins[i - 1] == s)
{
//k, we can get to sum using just the current coin
//so, assign to 1, no need to process further
DP_Cache[i][s] = 1;
}
else
{
//initialize the value withouth the current value
int minNoOfCounsWithoutUsingCurrentCoin_I = DP_Cache[i - 1][s];
DP_Cache[i][s] = minNoOfCounsWithoutUsingCurrentCoin_I;
if ((s > coins[i - 1]) //current coin can particiapte
&& (DP_Cache[i][s - coins[i - 1]] != 0))
{
int noOfCoinsUsedIncludingCurrentCoin_I =
DP_Cache[i][s - coins[i - 1]] + 1;
if (minNoOfCounsWithoutUsingCurrentCoin_I == 0)
{
//so far we couldnt identify coins that sums to 's'
DP_Cache[i][s] = noOfCoinsUsedIncludingCurrentCoin_I;
}
else
{
int min = this.Min(noOfCoinsUsedIncludingCurrentCoin_I,
minNoOfCounsWithoutUsingCurrentCoin_I);
DP_Cache[i][s] = min;
}
}
}
}
}
return DP_Cache[coins.Length][sum];
}
In the general case, where coin values can be arbitrary, the problem you are presenting is a variant of the Knapsack Problem, which is known to be NP-complete (Pearson, D. 2004), so no algorithm that runs in time polynomial in the size of the input is known for it.
Take the pathological example of x[2] = 51, x[1] = 50, x[0] = 1, X = 100. Then it is required that the algorithm 'consider' the possibilities of making change with coin x[2], alternatively making change beginning with x[1]. The first-step used with national coinage, otherwise known as the Greedy Algorithm -- to wit, "use the largest coin less than the working total," will not work with pathological coinages. Instead, such algorithms experience a combinatoric explosion that qualifies them into NP-complete.
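The pathological coinage above can be checked with a short Python sketch. Both function names are made up for illustration. The first follows the "largest coin first" rule; the second fills a table of minimum coin counts for every amount up to X, which takes time proportional to X times the number of denominations (polynomial in the value X, not in the number of bits needed to write X).

```python
def greedy_coin_count(target, coins):
    """Largest-coin-first rule; returns None if it gets stuck."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += target // c
        target %= c
    return count if target == 0 else None


def min_coin_count(target, coins):
    """Table of minimum coin counts for every amount from 0 to target."""
    inf = float("inf")
    best = [0] + [inf] * target
    for amount in range(1, target + 1):
        for c in coins:
            if c <= amount and best[amount - c] + 1 < best[amount]:
                best[amount] = best[amount - c] + 1
    return best[target] if best[target] != inf else None


print(greedy_coin_count(100, [51, 50, 1]))  # 51 followed by forty-nine 1s
print(min_coin_count(100, [51, 50, 1]))     # 50 + 50
```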
For certain special coin value arrangements, such as practically all those in actual use, and including the fictitious system X[i+1] == 2 * X[i], there are very fast algorithms, even O(1) in the fictitious case given, to determine the best output. These algorithms exploit properties of the coin values.
I am not aware of a dynamic programming solution: one which takes advantage of optimal sub-solutions as required by the programming motif. In general a problem can only be solved by dynamic programming if it can be decomposed into sub-problems which, when optimally solved, can be re-composed into a solution which is provably optimal. That is, if the programmer cannot mathematically demonstrate ("prove") that re-composing optimal sub-solutions of the problem results in an optimal solution, then dynamic programming cannot be applied.
An example commonly given of dynamic programming is an application to multiplying several matrices. Depending on the size of the matrices, the choice to evaluate A·B·C as either of the two equivalent forms: ((A·B)·C) or (A·(B·C)) leads to the computations of different quantities of multiplications and additions. That is, one method is more optimal (faster) than the other method. Dynamic programming is a motif which tabulates the computational costs of different methods, and performs the actual calculations according to a schedule (or program) computed dynamically at run-time.
A key feature is that computations are performed according to the computed schedule and not by an enumeration of all possible combinations -- whether the enumeration is performed recursively or iteratively. In the example of multiplying matrices, at each step, only the least-cost multiplication is chosen. As a result, the possible costs of intermediate-cost sub-optimal schedules are never calculated. In other words, the schedule is not calculated by searching all possible schedules for the optimal, but rather by incrementally building an optimal schedule from nothing.
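As an illustration of the matrix example, here is a small Python sketch of the standard textbook table for matrix-chain ordering. The name matrix_chain_cost and the dims convention (matrix i has dims[i] rows and dims[i+1] columns) are assumptions made for this sketch. It only computes the cheapest number of scalar multiplications, not the products themselves.

```python
def matrix_chain_cost(dims):
    """Minimum scalar multiplications to multiply a chain of matrices."""
    n = len(dims) - 1  # number of matrices in the chain
    if n <= 0:
        return 0
    # cost[i][j] = cheapest way to multiply matrices i..j (inclusive)
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # sub-chain length
        for i in range(n - length + 1):
            j = i + length - 1
            # try every split point k: (i..k) times (k+1..j)
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]


# A is 10x30, B is 30x5, C is 5x60
print(matrix_chain_cost([10, 30, 5, 60]))
```

For these sizes ((A·B)·C) needs 10·30·5 + 10·5·60 = 4500 multiplications, while (A·(B·C)) needs 30·5·60 + 10·30·60 = 27000, so the table settles on the first order.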
The nomenclature 'dynamic programming' may be compared with 'linear programming' in which 'program' is also used in the sense meaning 'to schedule.'
To learn more about dynamic programming, consult the greatest book on algorithms yet known to me, "Introduction to Algorithms" by Cormen, Leiserson, Rivest, and Stein. "Rivest" is the 'R' of "RSA" and dynamic programming is but one chapter of scores.
If you write it in a recursive way, that is fine; just use a memoized search. You have to store what you have already calculated, so that it is not calculated again.
int memory[X + 1]; //initialize it to be -1, which means hasn't been calculated
MakeChange(X, x[1..n this is the coins]){
if (X < 0) return false; // overshot the amount
if (X == 0) return true; // exact change reached
if (memory[X] != -1) return memory[X]; // already calculated for this amount
for (int i = 1; i <= n; i++)
{
if ( MakeChange(X - x[i]) ){
memory[X] = true;
return true;
}
}
memory[X] = false; // remember failures too
return false;
}

Linear Time Voting Algorithm. I don't get it

As I was reading this (Find the most common entry in an array), the Boyer and Moore's Linear Time Voting Algorithm was suggested.
If you follow the link to the site, there is a step by step explanation of how the algorithm works. For the given sequence, AAACCBBCCCBCC it presents the right solution.
When we move the pointer forward over
an element e:
If the counter is 0, we set the current candidate to e and we set the
counter to 1.
If the counter is not 0, we increment or decrement the counter
according to whether e is the current
candidate.
When we are done, the current
candidate is the majority element, if
there is a majority.
If I use this algorithm on a piece of paper with AAACCBB as input, the suggested candidate would become B, which is obviously wrong.
As I see it, there are two possibilities
The authors have never tried their algorithm on anything else than AAACCBBCCCBCC, are completely incompetent and should be fired on the spot (doubtful).
I am clearly missing something, must get banned from Stackoverflow and never be allowed again to touch anything involving logic.
Note: Here is a C++ implementation of the algorithm from Niek Sanders. I believe he correctly implemented the idea and as such it has the same problem (or doesn't it?).
The algorithm only works when the set has a majority -- more than half of the elements being the same. AAACCBB in your example has no such majority. The most frequent letter occurs 3 times, the string length is 7.
A small but important addition to the other explanations. Moore's Voting algorithm has 2 parts -
The first part of running Moore's Voting algorithm only gives you a candidate for the majority element. Notice the word "candidate" here.
In the second part, we need to iterate over the array once again to determine whether this candidate really is the majority element (i.e. occurs more than size/2 times).
The first iteration finds the candidate and the second iteration checks whether this element occurs a majority of times in the given array.
So the time complexity is: O(n) + O(n) = O(n)
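Both passes can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation from the linked page; the function name majority_element is made up, and the input is assumed to be a sequence that can be iterated twice (a list or a string).

```python
def majority_element(seq):
    """Return the element occurring more than len(seq)/2 times, or None."""
    # Pass 1: voting. This only produces a candidate.
    candidate, count = None, 0
    for e in seq:
        if count == 0:
            candidate, count = e, 1
        elif e == candidate:
            count += 1
        else:
            count -= 1
    # Pass 2: verify that the candidate really is a majority.
    occurrences = sum(1 for e in seq if e == candidate)
    return candidate if occurrences > len(seq) / 2 else None


print(majority_element("AAACCBBCCCBCC"))
print(majority_element("AAACCBB"))
```

For AAACCBB the first pass ends with candidate B, exactly as in the question, and the second pass rejects it because B occurs only 2 times out of 7.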
From the first linked SO question:
with the property that more than half of the entries in the array are equal to N
From the Boyer and Moore page:
which element of a sequence is in the majority, provided there is such an element
Both of these algorithms explicitly assume that one element occurs more than N/2 times. (Note in particular that "majority" is not the same as "most common.")
I wrote a C++ code for this algorithm
char find_more_than_half_shown_number(char* arr, int len){
int i=0;
std::vector<int> vec;
while(i<len){
if(vec.empty()){
vec.push_back(arr[i]);
vec.push_back(1);
}else if(vec[0]==arr[i]){
vec[1]++;
}else if(vec[0]!=arr[i]&&vec[1]!=0){
vec[1]--;
}else{
vec[0]=arr[i];
vec[1]=1; // counter was 0: the new candidate starts with one vote
}
i++;
}
int tmp_count=0;
for(int i=0;i<len;i++){
if(arr[i]==vec[0])
tmp_count++;
}
if(tmp_count>len/2) // strict majority: more than half of the elements
return vec[0];
else
return -1;
}
and the main function is as below:
int main(int argc, const char * argv[])
{
char arr[]={'A','A','A','C','C','B','B','C','C','C','B','C','C'};
int len=sizeof(arr)/sizeof(char);
char rest_num=find_more_than_half_shown_number(arr,len);
std::cout << "rest_num="<<rest_num<<std::endl;
return 0;
}
When the test case is "AAACCBB", the set has no majority. Because no element occurs more than 3 times since the length of "AAACCBB" is 7.
Here's the code for "the Boyer and Moore's Linear Time Voting Algorithm":
int Voting(vector<int> &num) {
int count = 0;
int candidate;
for(int i = 0; i < num.size(); ++i) {
if(count == 0) {
candidate = num[i];
count = 1;
}
else
count += (candidate == num[i]) ? 1 : -1;
}
return candidate;
}
