Fewest number of classes for everyone to attend: polynomial-time solution? - algorithm
A teacher needs to give a mandatory class to every student in a class. The class must happen in a given month, say June, and everyone must attend this class exactly once.
Since students have different availability, not everybody is available every day (the teacher is available every day). The teacher has everyone's availability for June and wishes to schedule as few classes as possible to cover everyone.
What is a good algorithm for this?
The one I can think of is to model this as a minimum set cover problem, where each set represents a particular day, and each node represents a student. A student is in a set if he is available on that day. The goal would be to select minimum number of sets so that every node is covered.
Since minimum set cover has no known polynomial-time exact algorithm (only approximation algorithms), is there a polynomial-time solution to this problem?
If this is a real problem, then because there are at most 31 days in a month, it is feasible to do a brute force enumeration of all possible days to have classes and check for each if all students are covered and which has the smallest number of classes.
This is technically a polynomial solution as it depends linearly on the number of students. (It depends exponentially on the number of days, but this is limited by 31 so can be treated as a large constant.)
If you also want it to be polynomial in the number of days, then this is equivalent to the (unresolved) P=NP question because your analogy with set cover works both ways. In other words, if you could solve this problem in polynomial time, then you could solve any set cover decision problem in polynomial time (with an appropriate choice of students and classes), and set cover is NP-complete, so you could solve any NP complete problem in polynomial time.
I have implemented the enumeration I suggested in my comment to the post by Peter de Rivaz. For speed, I use a bitset representation of the combinations. Each bit represents a day, and a 1-bit means the teacher lectures that day. I use an algorithm called Gosper's hack.
Again, the advantage of this enumeration strategy is that the combinations appear in increasing order of the number of days. It means the enumeration may stop at the first feasible combination because it represents the minimal number of days the teacher must lecture.
For a teaching period of four days, my test program outputs:
All 1-subsets of a 4-set
0001
0010
0100
1000
All 2-subsets of a 4-set
0011
0101
0110
1001
1010
1100
All 3-subsets of a 4-set
0111
1011
1101
1110
All 4-subsets of a 4-set
1111
---
The test program is in the C++ language. The bitset representation is an unsigned integer. It has 32 bits, but the algorithm needs an extra bit. So the maximum number of days is 31, which fits the bill this time. On my computer, the 31-day case takes 16 seconds (with no bitset printing).
#include <iostream>
#include <string>
#include <algorithm>

template<typename T>
T gosper_start(int k) noexcept { // first k-subset
    return (T{1} << k) - T{1};
}

template<typename T>
T gosper_next(T x) noexcept { // next k-subset
    const T s = x & (T{0} - x); // avoids x & -x (because the unary minus may issue a compiler warning)
    const T r = s + x;
    return r | (((x^r) >> 2) / s);
}

template<typename T>
T gosper_stop(int n) noexcept { // the n-set limit
    return T{1} << n;
}

template<typename T>
std::string bitset_to_string(T x, int n) { // string representation of bitset
    std::string s{};
    while (n-- > 0) {
        s += (x & 1) ? "1" : "0";
        x >>= 1;
    }
    std::reverse(s.begin(), s.end());
    return s;
}

void test() {
    using T = unsigned int; // the bit-set type
    const int N = 4; // the n-set is a 4-set
    const T L = gosper_stop<T>(N); // the n-set limit
    for (int k=1; k<=N; ++k) { // all k from 1 to N
        std::cout << "All " << k << "-subsets " << "of a " << N << "-set" << std::endl;
        for (T s = gosper_start<T>(k); s < L; s = gosper_next<T>(s)) { // all k-subsets
            std::cout << bitset_to_string(s,N) << std::endl;
        }
    }
    std::cout << "---" << std::endl;
}

int main() { // entry point so the test program can be built and run as-is
    test();
    return 0;
}
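The test program above only prints the subsets. As a rough sketch of how the same enumeration could be turned into a solver for the scheduling question: the function name `fewest_class_days` and the encoding of each student's availability as a bitmask (bit d set means "free on day d") are my own assumptions, not part of the original program.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original answer).
// availability[i] has bit d set when student i can attend on day d.
// Returns the fewest days needed so that every student is free on at least
// one chosen day, 0 if there are no students, or -1 if some student is
// never available. Requires days <= 62 so the shifts below cannot overflow.
int fewest_class_days(const std::vector<std::uint64_t>& availability, int days) {
    if (availability.empty()) return 0;
    const std::uint64_t limit = std::uint64_t{1} << days;
    for (int k = 1; k <= days; ++k) {
        // Gosper's hack: visit every k-subset of the days in increasing order.
        std::uint64_t s = (std::uint64_t{1} << k) - 1;
        while (s < limit) {
            bool covers_everyone = true;
            for (std::uint64_t a : availability) {
                if ((a & s) == 0) { covers_everyone = false; break; }
            }
            // Subsets are tried by increasing size, so the first hit is minimal.
            if (covers_everyone) return k;
            const std::uint64_t low = s & (~s + 1);  // lowest set bit of s
            const std::uint64_t r = s + low;
            s = r | (((s ^ r) >> 2) / low);          // next k-subset
        }
    }
    return -1;
}
```

The cost per subset is linear in the number of students, which is the "polynomial in students, exponential in days" behaviour described in the first answer.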
Related
How to convert currency amount to change? [duplicate]
Given a dollar amount, convert it into euro coins and bills. You are given the dollar amount as the argument and are told that the dollar to euro rate is 1.30. The euro denominations are 500 bill, 200 bill, 100 bill, 50 bill, 20 bill, 10 bill, 5 bill, 2 bill, 1 bill, 50 cents, 25 cents, 10 cents, 5 cents, 2 cents, 1 cent. Convert that dollar amount into the fewest bills and coins. (Convert a numerical dollar amount (such as $10.00) to an equivalent amount in Euro Bills and Coins.)
Disclaimer: This is a homework problem I've been given.
I've thought about solving it using a while loop which iterates through the denominations and subtracts each one from the value, something like:
    while (amount > 0) {
        if (amount - denomination[index] >= 0) {
            amount -= denomination[index];
        } else {
            index++;
        }
    }
But other sources are telling me that the coin change problem is solved with dynamic programming. I'm very confused.
For this specific set of denominations, the change problem can be solved by the greedy method, as you did. The same is true for sets where each value is twice the previous one, like 1, 2, 4, 8, ..., but the rules are not simple, as @Patrick87 noticed in a comment. Money systems for which greedy always works are called "canonical", but it is not easy to tell whether a given system is canonical.
For arbitrary values the greedy method can fail: [1,5,15,20] gives 20+5+5 for sum=30, while 15+15 is better. That is why, in general, the coin change problem should be solved with dynamic programming.
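To make the [1,5,15,20] counterexample concrete, here is a minimal sketch of both approaches side by side. The function names `greedy_coins` and `dp_coins` are made up for illustration, and both assume that 1 is among the denominations so every amount can be formed.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Greedy: repeatedly take the largest coin that still fits.
// Assumes denoms contains 1, so the remainder always reaches 0.
int greedy_coins(std::vector<int> denoms, int amount) {
    std::sort(denoms.rbegin(), denoms.rend());  // largest first
    int count = 0;
    for (int d : denoms) {
        count += amount / d;
        amount %= d;
    }
    return count;
}

// Dynamic programming: best[a] is the fewest coins that sum to a.
int dp_coins(const std::vector<int>& denoms, int amount) {
    std::vector<int> best(amount + 1, INT_MAX);
    best[0] = 0;
    for (int a = 1; a <= amount; ++a) {
        for (int d : denoms) {
            if (d <= a && best[a - d] != INT_MAX) {
                best[a] = std::min(best[a], best[a - d] + 1);
            }
        }
    }
    return best[amount];  // INT_MAX if the amount cannot be formed
}
```

For the denominations {1, 5, 15, 20} and the amount 30, `greedy_coins` returns 3 (20+5+5) while `dp_coins` returns 2 (15+15).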
This answer is probably not "academic" enough, but using JavaScript you can boil it down to a simple application of Array.reduce() (assuming that the "greedy" approach is applicable, which it will be for the Euro currency system):
    var change=amnt=>(c,d,i)=>{
      var rest=amnt%d;
      if (rest!=amnt) { c[i]=(amnt-rest)/d; amnt=rest; }
      return c;
    };
    var rate=90.78; // Euro cents per USD
    var res=document.querySelector('#result');
    document.querySelector('#USD').onkeyup=ev=>{
      var cents=Math.round(ev.target.value*rate); // amount in Euro cents
      // Euro bills and coins, in cents, largest first
      var denom=[50000,20000,10000,5000,2000,1000,500,200,100,50,20,10,5,2,1];
      var coins=denom.reduce(change(cents),[]);
      res.innerHTML=cents/100+' €<br>'
        +coins.map((n,i)=>n+'x'+(denom[i]>99?denom[i]/100+'€':denom[i]+'ct'))
              .filter(v=>v).join(', ');
    }
    USD <input type="text" value="13" id="USD">
    <div id="result"></div>
Traditionally, coin change problems like the one presented to you are designed to be dynamic programming questions. Here's an example where your approach yields the wrong answer for a similar problem with a simpler premise:
Given an unlimited number of 7$ bills, 5$ bills, 4$ bills and 1$ bills, and a certain item with a price of N$, find the optimal way of purchasing the item so that you use the fewest bills possible.
If I set N=12 in this problem, your algorithm will correctly break down the 12$ into one bill of 7$ and one bill of 5$. If I set N=9, however, your algorithm will break down the 9$ into one bill of 7$ and two bills of 1$, when the optimal solution is one bill of 5$ and one bill of 4$.
So is your solution correct? It turns out that it is. That's simply because your bills are given in a way that makes your greedy solution always work (I tested it up to 100000.00$ just to be sure). I'm sure you can find resources online that explain exactly why your set of bill values works with a greedy algorithm; unfortunately, I can't provide you with a satisfying explanation.
Although you can solve your problem with a greedy algorithm, the dynamic programming (DP) approach will also yield the correct answers, and luckily for you, there are plenty of resources that can teach you about DP, if that's what's confusing you, such as GeeksForGeeks.
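The "I tested it up to some limit" check can be automated. The sketch below is my own illustration (the name `first_greedy_failure` is hypothetical): it fills in the optimal and the greedy coin counts for every amount up to a limit and reports the first amount where they differ. It assumes 1 is one of the denominations.

```cpp
#include <algorithm>
#include <vector>

// Returns the smallest amount in [1, limit] for which the greedy method uses
// more bills than the optimum, or -1 if there is no such amount.
// Assumes denoms contains 1.
int first_greedy_failure(std::vector<int> denoms, int limit) {
    std::sort(denoms.begin(), denoms.end());
    std::vector<int> best(limit + 1, 0);    // optimal counts (dynamic programming)
    std::vector<int> greedy(limit + 1, 0);  // greedy counts
    for (int a = 1; a <= limit; ++a) {
        best[a] = a;      // paying with 1s only is always possible
        int largest = 1;  // largest denomination that fits into a
        for (int d : denoms) {
            if (d > a) break;
            best[a] = std::min(best[a], best[a - d] + 1);
            largest = d;
        }
        // Greedy takes the largest fitting bill, then continues greedily.
        greedy[a] = greedy[a - largest] + 1;
        if (greedy[a] > best[a]) return a;
    }
    return -1;
}
```

For the bills {7, 5, 4, 1} the first failure is at 9, matching the example above. A return value of -1 only says that greedy is optimal up to the chosen limit, not for every amount.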
The problem of determining the optimal representation in a coin system is, in general, weakly NP-hard. The greedy algorithm probably works fine for all contemporary coin systems in the world. Most contemporary coin systems use the so-called binary-decimal pattern 1-2-5. But your example has 25 cents, which requires a closer look.
First, let's show that the 1-2-5 pattern is amenable to the greedy algorithm. Observe that the LCM is 10, which means that we only need to check the numbers in [1..9] (each entry lists the number of 1s, 2s and 5s used):
    1 = 1,0,0    4 = 0,2,0    7 = 0,1,1
    2 = 0,1,0    5 = 0,0,1    8 = 1,1,1
    3 = 1,1,0    6 = 1,0,1    9 = 0,2,1
So, this pattern is greedy. Let's now turn our attention to the first six denominations 50, 25, 10, 5, 2, 1. Here the LCM is 50. I wrote a program to check this:
    #include <iostream>
    #include <array>
    #include <iomanip>
    #include <algorithm>
    #include <numeric>
    #include <iterator>

    bool IsOptimal(const int sum, const int numberOfCoins,
                   std::array<int, 6>::const_iterator begin,
                   std::array<int, 6>::const_iterator end) {
        if (sum < 0 || numberOfCoins == 0)
            return true;
        for (auto it = begin; it < end; ++it) {
            const int nextSum = sum - *it;
            if (nextSum == 0)
                return numberOfCoins == 1;
            if (!IsOptimal(nextSum, numberOfCoins - 1, it, end))
                return false;
        }
        return true;
    }

    int main() {
        const std::array<int, 6> kDenoms = { 1,2,5,10,25,50 };
        for (int i = 1; i < 50; ++i) {
            std::array<int, 6> change = { 0 };
            int sum = 0;
            while (sum != i) {
                auto it = std::upper_bound(kDenoms.cbegin(), kDenoms.cend(), i - sum);
                ++change[--it - kDenoms.cbegin()];
                sum += *it;
            }
            const bool isOptimal = IsOptimal(sum,
                std::accumulate(change.cbegin(), change.cend(), 0),
                kDenoms.cbegin(), kDenoms.cend());
            std::cout << std::setw(2) << i << ": ";
            std::copy(change.cbegin(), change.cend() - 1,
                      std::ostream_iterator<int>(std::cout, ","));
            std::cout << change.back() << " " << std::boolalpha << isOptimal << std::endl;
        }
        return 0;
    }
So, basically, what do we know?
We know that the greedy algorithm gives the optimal solution for all quantities less than 50. Observe that all denominations above 50 are divisible by 50, so they will not interfere with 50, 25, 10, 5, 2, 1. We also showed that the greedy algorithm works for the 1-2-5 pattern, so the whole set of denominations is amenable to the greedy algorithm.
Strange Bank(Atcoder Beginner contest 099)
To make it difficult to withdraw money, a certain bank allows its customers to withdraw only one of the following amounts in one operation:
    1 yen (the currency of Japan)
    6 yen, 6^2(=36) yen, 6^3(=216) yen, ...
    9 yen, 9^2(=81) yen, 9^3(=729) yen, ...
At least how many operations are required to withdraw exactly N yen in total? It is not allowed to re-deposit the money you withdrew.
Constraints: 1≤N≤100000, N is an integer.
Input is given from Standard Input in the following format:
    N
Output: If at least x operations are required to withdraw exactly N yen in total, print x.
Sample Input 1
    127
Sample Output 1
    4
By withdrawing 1 yen, 9 yen, 36(=6^2) yen and 81(=9^2) yen, we can withdraw 127 yen in four operations.
It seemed like a simple greedy problem to me, so that was the approach I used, but I got a different result for one of the samples and figured out that it will not always be greedy.
    #include <iostream>
    #include <queue>
    #include <stack>
    #include <algorithm>
    #include <functional>
    #include <cmath>
    using namespace std;
    int intlog(int base, long int x) {
        return (int)(log(x) / log(base));
    }
    int main() {
        ios_base::sync_with_stdio(false);
        cin.tie(NULL);
        long int n; cin >> n;
        int result = 0;
        while (n > 0) {
            int base_9 = intlog(9, n); int base_6 = intlog(6, n);
            int val;
            val = max(pow(9, base_9), pow(6, base_6));
            //cout<<pow(9,base_9)<<" "<<pow(6,base_6)<<"\n";
            val = max(val, 1);
            if (n <= 14 && n >= 12)
                val = 6;
            n -= val;
            //cout<<n<<"\n";
            result++;
        }
        cout << result;
        return 0;
    }
For n between 12 and 14 we have to pick 6 rather than 9, because it takes fewer steps to reach zero. It got AC for only 18/22 test cases. Please help me understand my mistake.
Greedy will not work here, because choosing the locally best option at every step does not guarantee the best end result (you can see that in your example). Instead, you should explore every possible choice at each step to find the overall optimum. Let's see how to do that.
The maximum input is 10^5, so only the following 12 values can be withdrawn in one operation:
    [1, 6, 9, 36(=6^2), 81(=9^2), 216(=6^3), 729(=9^3), 1296(=6^4), 6561(=9^4), 7776(=6^5), 46656(=6^6), 59049(=9^5)]
because 6^7 and 9^6 are greater than 100000.
So at each step with value n we try each possible element arr[i] (i.e. less than or equal to n) from the above array and then recursively solve the subproblem for n-arr[i] until we reach zero.
    solve(n)
        if n==0 return 0;
        ans = n;
        for(int i=0;i<arr.length;i++)
            if (n>=arr[i]) ans = min(ans, 1+solve(n-arr[i]));
        return ans;
This recursive solution is very time-consuming: each call can branch up to 12 ways, so the running time is exponential in n. We will try to optimize it. If you try some sample cases you will notice that the subproblems overlap, meaning that the same subproblem is solved many times. This is where dynamic programming comes into the picture: store every subproblem's solution so that it can be re-used later. So we can modify our solution as follows:
    solve(n)
        if n==0 return 0;
        if(dp[n] is seen) return dp[n];
        ans = n;
        for(int i=0;i<arr.length;i++)
            if (n>=arr[i]) ans = min(ans, 1+solve(n-arr[i]));
        return dp[n] = ans;
The time complexity of the DP solution is O(n*12).
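For reference, the same recurrence can be written bottom-up in C++, which avoids deep recursion for n close to 100000. This is only a sketch of the idea above; the function name `min_withdrawals` is made up.

```cpp
#include <algorithm>
#include <vector>

// dp[v] = fewest operations needed to withdraw exactly v yen.
int min_withdrawals(int n) {
    // Allowed amounts: 1 and every power of 6 or 9 that does not exceed n.
    std::vector<int> amounts{1};
    for (long long p = 6; p <= n; p *= 6) amounts.push_back(static_cast<int>(p));
    for (long long p = 9; p <= n; p *= 9) amounts.push_back(static_cast<int>(p));

    std::vector<int> dp(n + 1, 0);
    for (int v = 1; v <= n; ++v) {
        dp[v] = v;  // v withdrawals of 1 yen always work
        for (int a : amounts) {
            if (a <= v) dp[v] = std::min(dp[v], dp[v - a] + 1);
        }
    }
    return dp[n];
}
```

For the sample input 127 this returns 4, and for 14 it returns 4 (6+6+1+1), which is the case the greedy code had to special-case.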
Is this a good Primality Checking Solution?
I have written this code to check whether a number is prime (for numbers up to 10^9+7). Is this a good method? What is its time complexity?
I build an unordered_set that stores the prime numbers up to sqrt(n). When checking whether a number is prime, I first check whether it is at most the largest number in the table. If it is, it is looked up in the table, so the complexity should be O(1) in this case. If it is larger, the number is put through a divisibility test against the primes in the set.
    #include<iostream>
    #include<set>
    #include<math.h>
    #include<unordered_set>
    #define sqrt10e9 31623
    using namespace std;

    unordered_set<long long> primeSet = { 2, 3 }; //used for fast lookups

    void genrate_prime_set(long range) //this generates prime numbers up to sqrt(10^9+7)
    {
        bool flag;
        set<long long> tempPrimeSet = { 2, 3 }; //a temporary set is used for generation
        set<long long>::iterator j;
        for (int i = 3; i <= range; i = i + 2)
        {
            //cout << i << " ";
            flag = true;
            for (j = tempPrimeSet.begin(); *j * *j <= i; ++j)
            {
                if (i % (*j) == 0)
                {
                    flag = false;
                    break;
                }
            }
            if (flag)
            {
                primeSet.insert(i);
                tempPrimeSet.insert(i);
            }
        }
    }

    bool is_prime(long long i,unordered_set<long long> primeSet)
    {
        bool flag = true;
        if(i <= sqrt10e9) //if number exist in the lookup table
            return primeSet.count(i);
        //if it doesn't iterate through the table
        for (unordered_set<long long>::iterator j = primeSet.begin(); j != primeSet.end(); ++j)
        {
            if (*j * *j <= i && i % (*j) == 0)
            {
                flag = false;
                break;
            }
        }
        return flag;
    }

    int main()
    {
        //long long testCases, a, b, kiwiCount;
        bool primeFlag = true;
        //unordered_set<int> primeNum;
        genrate_prime_set(sqrt10e9);
        cout << primeSet.size()<<"\n";
        cout << is_prime(9999991,primeSet);
        return 0;
    }
This doesn't strike me as a particularly efficient way to do the job at hand.
Although it probably won't make a big difference in the end, the efficient way to generate all the primes up to some specific limit is clearly to use a sieve--the sieve of Eratosthenes is simple and fast. There are a couple of modifications that can be faster, but for the small size you're dealing with, they're probably not worthwhile.
These normally produce their output in a more effective format than you're currently using as well. In particular, you typically just dedicate one bit to each possible prime (i.e., each odd number) and end up with it zeroed if the number is composite, and one if it's prime (you can, of course, reverse the sense if you prefer).
Since you only need one bit for each odd number from 3 to 31623, this requires only about 16 K bits, or about 2K bytes--a truly minuscule amount of memory by modern standards (especially: little enough to fit in L1 cache quite easily).
Since the bits are stored in order, it's also trivial to compute and test by the factors up to the square root of the number you're testing instead of testing against all the numbers in the table (including those greater than the square root of the number you're testing, which is obviously a waste of time). This also optimizes access to the memory in case some of it's not in the cache (i.e., you can access all the data in order, making life as easy as possible for the hardware prefetcher).
If you wanted to optimize further, I'd consider just using the sieve to find all primes up to 10^9+7, and look up inputs. Whether this is a win will depend (heavily) upon the number of queries you can expect to receive. A quick check shows that a simple implementation of the Sieve of Eratosthenes can find all primes up to 10^9 in about 17 seconds. After that, each query is (of course) essentially instantaneous (i.e., the cost of a single memory read).
This does require around 120 megabytes of memory for the result of the sieve, which would once have been a major consideration, but (except on fairly limited systems) normally wouldn't be any more.
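A minimal sketch of the odd-only sieve plus in-order trial division described above might look like this. The names `odd_sieve` and `is_prime_sieved` are my own, and `std::vector<bool>` is used as a convenient stand-in for a hand-rolled bit array.

```cpp
#include <cstdint>
#include <vector>

// Sieve of Eratosthenes over odd numbers only: entry i describes 2*i + 1.
std::vector<bool> odd_sieve(std::uint32_t limit) {
    std::vector<bool> is_odd_prime((static_cast<std::uint64_t>(limit) + 1) / 2, true);
    if (!is_odd_prime.empty()) is_odd_prime[0] = false;  // 1 is not prime
    for (std::uint64_t p = 3; p * p <= limit; p += 2) {
        if (!is_odd_prime[p / 2]) continue;
        // Cross off odd multiples of p, starting at p*p.
        for (std::uint64_t m = p * p; m <= limit; m += 2 * p) {
            is_odd_prime[m / 2] = false;
        }
    }
    return is_odd_prime;
}

// Primality test using the table. Assumption: the sieve covers at least
// sqrt(n); otherwise a "true" result is not reliable.
bool is_prime_sieved(std::uint64_t n, const std::vector<bool>& sieve) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    if (n / 2 < sieve.size()) return sieve[n / 2];  // direct lookup
    // Trial-divide by the sieved primes in increasing order, stopping at sqrt(n).
    for (std::uint64_t i = 1; i < sieve.size(); ++i) {
        const std::uint64_t p = 2 * i + 1;
        if (p * p > n) return true;
        if (sieve[i] && n % p == 0) return false;
    }
    return true;
}
```

With `odd_sieve(31623)` the table covers the square root of every input up to 10^9+7, so `is_prime_sieved(1000000007, sieve)` can be answered by trial division alone.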
The very short answer: do research on the subject, starting with the term "Miller-Rabin".
The short answer is no:
- Looking for factors of a number is a poor way to check for primality.
- Exhaustively searching through primes is a poor way to look for factors.
- Especially if you search through every prime, rather than just the ones less than or equal to the square root of the number.
- Doing a primality test on each number is a poor way to generate a list of primes.
Also, you should take in primeSet by reference rather than by copy, if it really needs to be a parameter.
Note: testing small primes to see if they divide a number is a useful first step of a primality test, but should generally only be used for the smallest primes before switching to a better method.
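To give an idea of what a Miller-Rabin test looks like for the range in the question, here is a sketch for 32-bit inputs. It relies on the known result that the bases 2, 7 and 61 give a deterministic answer for every n below 4,759,123,141; the function names are made up for this example.

```cpp
#include <cstdint>
#include <initializer_list>

// (base^exp) mod m. Cannot overflow because m < 2^32, so every product
// of two reduced values fits into 64 bits.
static std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    std::uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Deterministic Miller-Rabin for 32-bit n using the bases 2, 7 and 61.
bool is_prime_mr(std::uint32_t n) {
    if (n < 2) return false;
    // Small primes first; this also guarantees no base is a multiple of n.
    for (std::uint32_t p : {2u, 3u, 5u, 7u, 61u}) {
        if (n % p == 0) return n == p;
    }
    // Write n - 1 as d * 2^r with d odd.
    std::uint32_t d = n - 1;
    int r = 0;
    while (d % 2 == 0) { d /= 2; ++r; }
    for (std::uint64_t a : {2ull, 7ull, 61ull}) {
        std::uint64_t x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;  // base a does not witness compositeness
        bool composite = true;
        for (int i = 1; i < r; ++i) {
            x = x * x % n;
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false;
    }
    return true;
}
```

Each call does a handful of modular exponentiations, so the cost grows with the number of bits of n rather than with its square root.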
No, it's not a very good way to determine if a number is prime. Here is pseudocode for a simple primality test that is sufficient for numbers in your range; I'll leave it to you to translate to C++:
    function isPrime(n)
        if n < 2
            return False
        d := 2
        while d * d <= n
            if n % d == 0
                return False
            d := d + 1
        return True
This works by trying every potential divisor up to the square root of the input number n. If no divisor has been found, then the input number cannot be composite, meaning of the form n = p × q, because the smaller of the two divisors p and q is at most the square root of n (and the other is at least the square root of n).
There are better ways to determine primality; for instance, after initially checking if the number is even (and hence prime only if n = 2), it is only necessary to test odd potential divisors, halving the amount of work necessary. If you have a list of primes up to the square root of n, you can use that list as trial divisors and make the process even faster. And there are other techniques for larger n. But that should be enough to get you started. When you are ready for more, come back here and ask more questions.
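One possible C++ translation of the pseudocode, including the "odd divisors only" improvement mentioned above, is sketched below. The name `is_prime_trial` is just for illustration, and the loop condition is written as `d <= n / d` so that it cannot overflow.

```cpp
#include <cstdint>

// Trial division by 2 and then by odd numbers up to sqrt(n).
bool is_prime_trial(std::uint64_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;  // the only even prime is 2
    for (std::uint64_t d = 3; d <= n / d; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}
```

For n around 10^9 the loop runs at most about 16,000 times, which is why this is fast enough for the range in the question.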
I can only suggest a way to use a library function in Java to check the primality of a number. As for the other questions, I do not have any answers.
The java.math.BigInteger.isProbablePrime(int certainty) method returns true if this BigInteger is probably prime, false if it's definitely composite. If certainty is ≤ 0, true is returned. You should try and use it in your code. So try rewriting it in Java.
Parameters
certainty - a measure of the uncertainty that the caller is willing to tolerate: if the call returns true the probability that this BigInteger is prime exceeds (1 - 1/2^certainty). The execution time of this method is proportional to the value of this parameter.
Return Value
This method returns true if this BigInteger is probably prime, false if it's definitely composite.
Example
The following example shows the usage of the math.BigInteger.isProbablePrime() method:
    import java.math.*;

    public class BigIntegerDemo {
        public static void main(String[] args) {
            // create 3 BigInteger objects
            BigInteger bi1, bi2, bi3;
            // create 3 Boolean objects
            Boolean b1, b2, b3;
            // assign values to bi1, bi2
            bi1 = new BigInteger("7");
            bi2 = new BigInteger("9");
            // perform isProbablePrime on bi1, bi2
            b1 = bi1.isProbablePrime(1);
            b2 = bi2.isProbablePrime(1);
            b3 = bi2.isProbablePrime(-1);
            String str1 = bi1+ " is prime with certainity 1 is " +b1;
            String str2 = bi2+ " is prime with certainity 1 is " +b2;
            String str3 = bi2+ " is prime with certainity -1 is " +b3;
            // print b1, b2, b3 values
            System.out.println( str1 );
            System.out.println( str2 );
            System.out.println( str3 );
        }
    }
Output
    7 is prime with certainity 1 is true
    9 is prime with certainity 1 is false
    9 is prime with certainity -1 is true
Population segmentation algorithm
I have a population of 50 ordered integers (1,2,3,..,50) and I am looking for a generic way to slice it "n" ways ("n" is the number of cutoff points, ranging from 1 to 25) that maintains the order of the elements. For example, for n=1 (one cutoff point) there are 49 possible grouping alternatives ([1,2-50], [1-2,3-50], [1-3,4-50],...). For n=2 (two cutoff points), the grouping alternatives are like: [1,2,3-50], [1,2-3,4-50],...
Could you recommend any general-purpose algorithm to complete this task in an efficient way?
Thanks, Chris
Thanks everyone for your feedback. I reviewed all your comments and I am working on a generic solution that will return all combinations (e.g., [1,2,3-50], [1,2-3,4-50],...) for all numbers of cutoff points.
Thanks again, Chris
Let the sequence length be N and the number of slices n. The problem becomes easier when you notice that choosing a slicing into n slices is equivalent to choosing n - 1 of the N - 1 possible split points (there is a split point between every two adjacent numbers in the sequence). Hence there are (N - 1 choose n - 1) such slicings. To generate all slicings (into n slices), you have to generate all (n - 1)-element subsets of the numbers from 1 to N - 1. The exact algorithm for this problem is given here: How to iteratively generate k elements subsets from a set of size n in java?
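As an illustration of the subset generation (in C++ rather than Java), the sketch below lists every choice of cut points in lexicographic order. The function name `all_cut_sets` is hypothetical, and its second parameter is the number of cut points directly, i.e. n - 1 in the notation above.

```cpp
#include <vector>

// Returns every way to choose `cuts` cut points from 1..N-1, in lexicographic
// order. A cut point c means "split between element c and element c+1".
std::vector<std::vector<int>> all_cut_sets(int N, int cuts) {
    std::vector<std::vector<int>> out;
    if (cuts < 0 || cuts > N - 1) return out;  // nothing to choose from
    std::vector<int> c(cuts);
    for (int i = 0; i < cuts; ++i) c[i] = i + 1;  // first combination: 1,2,...,cuts
    while (true) {
        out.push_back(c);
        // Find the rightmost cut point that can still move to the right.
        int i = cuts - 1;
        while (i >= 0 && c[i] == N - cuts + i) --i;
        if (i < 0) break;  // that was the last combination
        ++c[i];
        for (int j = i + 1; j < cuts; ++j) c[j] = c[j - 1] + 1;
    }
    return out;
}
```

For N = 5 and two cut points this yields the 6 sets {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}. For N = 50 the result only fits in memory for small numbers of cut points, so for larger ones it would make more sense to process each combination inside the loop instead of storing it.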
Do you need the cutoffs, or are you just counting them? If you're just going to count them, then it's simple:
    1 cutoff = (n-1) options
    2 cutoffs = (n-1)*(n-2)/2 options
    3 cutoffs = (n-1)*(n-2)*(n-3)/6 options
You can see the pattern here.
If you actually need the cutoffs, then you have to actually do the loops, but since n is so small, Emilio is right, just brute force it.
1 cutoff
    for(i=1;i<n;++i)
        cout << i << "\n";
2 cutoffs
    for(i=1;i<n;++i)
        for(j=i+1;j<n;++j)
            cout << i << " " << j << "\n";
3 cutoffs
    for(i=1;i<n;++i)
        for(j=i+1;j<n;++j)
            for(k=j+1;k<n;++k)
                cout << i << " " << j << " " << k << "\n";
Again, you can see the pattern.
So you want to select 25 split points from 49 choices in all possible ways. There are a lot of well-known algorithms to do that. I want to draw your attention to another side of this problem.
There are 49!/(25!*(49-25)!) = 63 205 303 218 876 different combinations, which is about 6.3*10^13 and more than 2^45. So if you want to store them, the required amount of memory is at least 32T * sizeof(Combination) bytes. I guess that it will pass the 1 PB mark.
Now let's assume that you want to process the generated data on the fly. Let's make the rather optimistic assumption that you can process 1 million combinations per second (here I assume that there is no parallelization). Then this task will take about 6.3*10^7 seconds, which is roughly 17,500 hours or two years.
This problem is more complicated than it seems at first glance. If you want to solve it at home in reasonable time, my suggestion is to change the strategy or wait for the advance of quantum computers.
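The count is easy to double-check with a few lines of code. This is a small sketch with a made-up function name; it uses the multiplicative formula, where every intermediate value is itself a binomial coefficient, so the integer division is exact.

```cpp
#include <cstdint>

// Binomial coefficient C(n, k) in 64-bit arithmetic.
// Safe for the sizes discussed here; very large n could overflow.
std::uint64_t binomial(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;  // use the smaller of k and n-k
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i) {
        // After this step result equals C(n - k + i, i).
        result = result * (n - k + i) / i;
    }
    return result;
}
```

`binomial(49, 25)` gives 63205303218876, and dividing that by 10^6 combinations per second and 86,400 seconds per day gives about 731 days.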
This will generate an array of all the slicings, but I warn you, it'll take tons of memory, due to the large number of results (50 elements with 3 splits gives 49*48*47=110544 entries, because each slicing is produced once for every order in which its cuts can be made). This is the general algorithm I'd use; ranges are stored as half-open index pairs [first, second) into the sequence.
    #include <cstddef>
    #include <utility>
    #include <vector>

    typedef std::pair<int, int> range_t;      // half-open index range [first, second)
    typedef std::vector<range_t> slicing_t;   // one way of cutting the sequence
    typedef std::vector<slicing_t> answer_t;  // all slicings found

    answer_t F(int count, int slices) {
        answer_t prev;    //things to slice more
        answer_t results; //finished slicings
        //initialize results for 0 slices: one range covering everything
        results.push_back(slicing_t(1, range_t(0, count)));
        //while there's still more slicing to do
        while (slices--) {
            //move "results" to the "things to slice" pile
            prev.clear();
            prev.swap(results);
            //for each thing to slice
            for (std::size_t group = 0; group < prev.size(); ++group) {
                //for each range
                for (std::size_t crange = 0; crange < prev[group].size(); ++crange) {
                    const range_t r = prev[group][crange];
                    //for each place in that range
                    for (int newsplit = r.first + 1; newsplit < r.second; ++newsplit) {
                        //copy the "result"
                        slicing_t cur = prev[group];
                        //slice it: replace the range by its left and right parts
                        cur[crange] = range_t(r.first, newsplit);
                        cur.insert(cur.begin() + crange + 1, range_t(newsplit, r.second));
                        //add it to the results
                        results.push_back(cur);
                    }
                }
            }
        }
        return results;
    }
Minimizing time in transit
[Updates at bottom (including solution source code)]
I have a challenging business problem that a computer can help solve.
Along a mountainous region flows a long winding river with strong currents. Along certain parts of the river are plots of environmentally sensitive land suitable for growing a particular type of rare fruit that is in very high demand. Once field laborers harvest the fruit, the clock starts ticking to get the fruit to a processing plant. It's very costly to try and send the fruits upstream or over land or air. By far the most cost effective mechanism to ship them to the plant is downstream in containers powered solely by the river's constant current.
We have the capacity to build 10 processing plants and need to locate these along the river to minimize the total time the fruits spend in transit. The fruits can take however long before reaching the nearest downstream plant but that time directly hurts the price at which they can be sold. Effectively, we want to minimize the sum of the distances to the nearest respective downstream plant. A plant can be located as little as 0 meters downstream from a fruit access point.
The question is: In order to maximize profits, how far up the river should we build the 10 processing plants if we have found 32 fruit growing regions, where the regions' distances upstream from the base of the river are (in meters): 10, 40, 90, 160, 250, 360, 490, ... (n^2)*10 ... 9000, 9610, 10240?
[It is hoped that all work going towards solving this problem and towards creating similar problems and usage scenarios can help raise awareness about and generate popular resistance towards the damaging and stifling nature of software/business method patents (to whatever degree those patents might be believed to be legal within a locality).]
UPDATES
Update1: Forgot to add: I believe this question is a special case of this one.
Update2: One algorithm I wrote gives an answer in a fraction of a second, and I believe is rather good (but it's not yet stable across sample values). I'll give more details later, but the short version is as follows. Place the plants at equal spacings. Cycle over all the inner plants, where at each plant you recalculate its position by testing every location between its two neighbors until the problem is solved within that space (greedy algorithm). So you optimize plant 2 holding 1 and 3 fixed. Then plant 3 holding 2 and 4 fixed... When you reach the end, you cycle back and repeat until you go a full cycle where every processing plant's recalculated position stops varying. Also, at the end of each cycle, you try to move processing plants that are crowded next to each other and are all near each others' fruit dumps into a region that has fruit dumps far away. There are many ways to vary the details and hence the exact answer produced. I have other candidate algorithms, but all have glitches. [I'll post code later.] Just as Mike Dunlavey mentioned below, we likely just want "good enough".
To give an idea of what might be a "good enough" result:
    10010 total length of travel from 32 locations to plants at
    {10,490,1210,1960,2890,4000,5290,6760,8410,9610}
Update3: mhum gave the correct exact solution first but did not (yet) post a program or algorithm, so I wrote one up that yields the same values.
    /************************************************************
    This program can be compiled and run (eg, on Linux):
    $ gcc -std=c99 processing-plants.c -o processing-plants
    $ ./processing-plants
    ************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    //a: Data set of values. Add extra large number at the end
    int a[]={
    10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240,99999
    };
    //numofa: size of data set
    int numofa=sizeof(a)/sizeof(int);
    //a2: will hold (pt to) unique data from a and in sorted order.
    int *a2;
    //max: size of a2
    int max;
    //num_fixed_loc: at 10 gives the solution for 10 plants
    int num_fixed_loc;
    //xx: holds index values of a2 from the lowest error winner of each cycle memoized. accessed via memoized offset value. Winner is based off lowest error sum from left boundary upto right ending boundary.
    //FIX: to be dynamically sized.
    int xx[1000000];
    //xx_last: how much of xx has been used up
    int xx_last=0;
    //SavedBundle: data type to "hold" memoized values needed (total travel distance and plant locations)
    typedef struct _SavedBundle {
        long e;
        int xx_offset;
    } SavedBundle;
    //sb: (pts to) lookup table of all calculated values memoized
    SavedBundle *sb; //holds winning values being memoized

    //Sort in increasing order.
    int sortfunc (const void *a, const void *b) {
        return (*(int *)a - *(int *)b);
    }

    /****************************
    Most interesting code in here
    ****************************/
    long full_memh(int l, int n) {
        long e;
        long e_min=-1;
        int ti=0; //initialized so it is never read as garbage
        if (sb[l*max+n].e) {
            return sb[l*max+n].e; //convenience passing
        }
        for (int i=l+1; i<max-1; i++) {
            e=0;
            //sum first part
            for (int j=l+1; j<i; j++) {
                e+=a2[j]-a2[l];
            }
            //sum second part
            if (n!=1) //general case, recursively
                e+=full_memh(i, n-1);
            else //base case, iteratively
                for (int j=i+1; j<max-1; j++) {
                    e+=a2[j]-a2[i];
                }
            if (e_min==-1) {
                e_min=e;
                ti=i;
            }
            if (e<e_min) {
                e_min=e;
                ti=i;
            }
        }
        sb[l*max+n].e=e_min;
        sb[l*max+n].xx_offset=xx_last;
        xx[xx_last]=ti; //later add a test or a realloc, etc, if approp
        for (int i=0; i<n-1; i++) {
            xx[xx_last+(i+1)]=xx[sb[ti*max+(n-1)].xx_offset+i];
        }
        xx_last+=n;
        return e_min;
    }

    /*************************************************************
    Call to calculate and print results for given number of plants
    *************************************************************/
    int full_memoization(int num_fixed_loc) {
        char *str;
        long errorsum; //for convenience
        //Call recursive workhorse
        errorsum=full_memh(0, num_fixed_loc-2);
        //Now print
        str=(char *) malloc(num_fixed_loc*20+100);
        sprintf (str,"\n%4d %6ld {%d,",num_fixed_loc-1,errorsum,a2[0]); //%ld because errorsum is a long
        for (int i=0; i<num_fixed_loc-2; i++)
            sprintf (str+strlen(str),"%d%c",a2[ xx[ sb[0*max+(num_fixed_loc-2)].xx_offset+i ] ], (i<num_fixed_loc-3)?',':'}');
        printf ("%s",str);
        free(str);
        return 0;
    }

    /**************************************************
    Initialize and call for plant numbers of many sizes
    **************************************************/
    int main (int x, char **y) {
        int t;
        int i2;
        qsort(a,numofa,sizeof(int),sortfunc);
        t=1;
        for (int i=1; i<numofa; i++)
            if (a[i]!=a[i-1])
                t++;
        max=t;
        i2=1;
        a2=(int *)malloc(sizeof(int)*t);
        a2[0]=a[0];
        for (int i=1; i<numofa; i++)
            if (a[i]!=a[i-1]) {
                a2[i2++]=a[i];
            }
        sb = (SavedBundle *)calloc(max*max,sizeof(SavedBundle));
        for (int i=3; i<=max; i++) {
            full_memoization(i);
        }
        free(sb);
        free(a2);
        return 0;
    }
Let me give you a simple example of a Metropolis-Hastings algorithm.
Suppose you have a state vector x and a goodness-of-fit function P(x), which can be any function you care to write.
Suppose you also have a random distribution Q that you can use to modify the vector, such as x' = x + N(0, 1) * sigma, where N is a simple normal distribution about 0 and sigma is a standard deviation of your choosing.

p = P(x);
for (/* a lot of iterations */){
    // add x to a sample array
    // get the next sample
    x' = x + N(0,1) * sigma;
    p' = P(x');
    // if it is better, accept it
    if (p' > p){
        x = x'; p = p';
    }
    // if it is not better
    else {
        // maybe accept it anyway
        if (Uniform(0,1) < (p' / p)){
            x = x'; p = p';
        }
    }
}

Usually it is done with a burn-in time of maybe 1000 cycles, after which you start collecting samples. After another maybe 10,000 cycles, the average of the samples is what you take as an answer.
It requires diagnostics and tuning. Typically the samples are plotted, and what you are looking for is a "fuzzy caterpillar" plot that is stable (doesn't move around much) and has a high acceptance rate (very fuzzy). The main parameter you can play with is sigma. If sigma is too small, the plot will be fuzzy but it will wander around. If it is too large, the plot will not be fuzzy; it will have horizontal segments, because most proposals are rejected and the chain stays put.
Often the starting vector x is chosen at random, and often multiple starting vectors are chosen, to see if they end up in the same place.
It is not necessary to vary all components of the state vector x at the same time. You can cycle through them, varying one at a time, or some such method.
Also, if you don't need the diagnostic plot, it may not be necessary to save the samples; you can just calculate the average and variance on the fly.
In the applications I'm familiar with, P(x) is a measure of probability, and it is typically in log-space, so it can vary from 0 to negative infinity. In that case the "maybe accept" test compares Uniform(0,1) against exp(logp' - logp) instead of p' / p.
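To make the loop above concrete, here is a minimal C sketch of the same accept/reject scheme for a one-dimensional state. Everything named here is an assumption made for illustration: `target_p` is a made-up goodness-of-fit function peaked at 3, `approx_normal` stands in for N(0,1) by summing 12 uniforms (a rough but symmetric approximation that avoids the math library), and `mh_mean` is a hypothetical helper that returns the average of the post-burn-in samples. It works in plain probability space, not log-space.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical goodness-of-fit function: symmetric about 3, always > 0. */
static double target_p(double x)
{
    double d = x - 3.0;
    return 1.0 / (1.0 + d * d * d * d);
}

/* Uniform draw in [0, 1). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Rough stand-in for N(0,1): sum of 12 uniforms minus 6 (mean 0, variance 1).
   Only the symmetry of the proposal matters for the acceptance rule below. */
static double approx_normal(void)
{
    double s = 0.0;
    for (int i = 0; i < 12; i++)
        s += uniform01();
    return s - 6.0;
}

/* Runs burn_in + n_samples steps and returns the mean of the last n_samples
   states. If accept_rate is non-NULL it receives the fraction of accepted
   proposals over the whole run. */
double mh_mean(double x0, double sigma, int burn_in, int n_samples,
               double *accept_rate)
{
    assert(sigma > 0.0 && burn_in >= 0 && n_samples > 0);

    double x = x0;
    double p = target_p(x);
    double sum = 0.0;
    long accepted = 0;

    for (int i = 0; i < burn_in + n_samples; i++) {
        /* propose the next state */
        double x_new = x + approx_normal() * sigma;
        double p_new = target_p(x_new);

        /* accept if better, otherwise accept with probability p_new / p */
        if (p_new > p || uniform01() < p_new / p) {
            x = x_new;
            p = p_new;
            accepted++;
        }

        /* collect the current state once burn-in is over */
        if (i >= burn_in)
            sum += x;
    }

    if (accept_rate)
        *accept_rate = (double)accepted / (burn_in + n_samples);
    return sum / n_samples;
}
```

Calling `mh_mean(0.0, 1.0, 1000, 200000, &rate)` after seeding with `srand` should give an average close to 3 for this made-up target. Re-running with a much smaller or much larger `sigma` and watching `rate` is a cheap substitute for the caterpillar plot.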
Unless I've made an error, here are exact solutions (obtained through a dynamic programming approach):

   N   Dist Sites
   2  60950 {10,4840}
   3  40910 {10,2890,6760}
   4  30270 {10,2250,4840,7840}
   5  23650 {10,1690,3610,5760,8410}
   6  19170 {10,1210,2560,4410,6250,8410}
   7  15840 {10,1000,2250,3610,5290,7290,9000}
   8  13330 {10,810,1960,3240,4410,5760,7290,9000}
   9  11460 {10,810,1690,2890,4000,5290,6760,8410,9610}
  10   9850 {10,640,1440,2250,3240,4410,5760,7290,8410,9610}
  11   8460 {10,640,1440,2250,3240,4410,5290,6250,7290,8410,9610}
  12   7350 {10,490,1210,1960,2890,3610,4410,5290,6250,7290,8410,9610}
  13   6470 {10,490,1000,1690,2250,2890,3610,4410,5290,6250,7290,8410,9610}
  14   5800 {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,10240}
  15   5190 {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,9610,10240}
  16   4610 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,8410,9000,9610,10240}
  17   4060 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,7840,8410,9000,9610,10240}
  18   3550 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,6760,7290,7840,8410,9000,9610,10240}
  19   3080 {10,360,810,1210,1690,2250,2890,3610,4410,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  20   2640 {10,250,640,1000,1440,1960,2560,3240,4000,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  21   2230 {10,250,640,1000,1440,1960,2560,3240,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  22   1860 {10,250,640,1000,1440,1960,2560,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  23   1520 {10,250,490,810,1210,1690,2250,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  24   1210 {10,250,490,810,1210,1690,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  25    940 {10,250,490,810,1210,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  26    710 {10,160,360,640,1000,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  27    500 {10,160,360,640,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  28    330 {10,160,360,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  29    200 {10,160,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  30    100 {10,90,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  31     30 {10,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
  32      0 {10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
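A quick way to sanity-check rows of this table is to recompute Dist for a given set of sites. The sketch below rests on two assumptions that are inferred, not stated by the author: that the distinct data points are 10*k*k for k = 1..32 (the values in the N = 32 row), and that each point is charged its distance to the nearest site at or below it (which is what the `a2[j]-a2[l]` sums in the DP code suggest). The function name `total_dist` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Recompute the "Dist" column for a candidate site list.
   ASSUMPTIONS (inferred from the table, not confirmed by the author):
     - the data points are 10*k*k for k = 1..32;
     - each point costs (point - largest site <= point).
   sites must be sorted ascending and start with 10, the smallest point. */
long total_dist(const int *sites, size_t n_sites)
{
    assert(n_sites > 0 && sites[0] == 10);

    long total = 0;
    size_t s = 0; /* index of the site currently serving the points */

    for (int k = 1; k <= 32; k++) {
        int v = 10 * k * k;
        /* advance to the last site that is not above v */
        while (s + 1 < n_sites && sites[s + 1] <= v)
            s++;
        total += v - sites[s];
    }
    return total;
}
```

Under those assumptions, `total_dist` gives 60950 for {10,4840} and 40910 for {10,2890,6760}, matching the N = 2 and N = 3 rows.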