Strange Bank (AtCoder Beginner Contest 099) - algorithm

To make it difficult to withdraw money, a certain bank allows its customers to withdraw only one of the following amounts in one operation:
1 yen (the currency of Japan)
6 yen, 6^2(=36) yen, 6^3(=216) yen, ...
9 yen, 9^2(=81) yen, 9^3(=729) yen, ...
At least how many operations are required to withdraw exactly N yen in total?
It is not allowed to re-deposit the money you withdrew.
Constraints
1≤N≤100000
N is an integer.
Input is given from Standard Input in the following format:
N
Output
If at least x operations are required to withdraw exactly N yen in total, print x.
Sample Input 1
127
Sample Output 1
4
By withdrawing 1 yen, 9 yen, 36(=6^2) yen and 81(=9^2) yen, we can withdraw 127 yen in four operations.
It seemed like a simple greedy problem to me, so that was the approach I used, but I got a different result for one of the samples and figured out that greedy will not always work.
#include <iostream>
#include <queue>
#include <stack>
#include <algorithm>
#include <functional>
#include <cmath>
using namespace std;

int intlog(int base, long int x) {
    return (int)(log(x) / log(base));
}

int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    long int n;
    cin >> n;
    int result = 0;
    while (n > 0)
    {
        int base_9 = intlog(9, n);
        int base_6 = intlog(6, n);
        int val;
        val = max(pow(9, base_9), pow(6, base_6));
        //cout << pow(9, base_9) << " " << pow(6, base_6) << "\n";
        val = max(val, 1);
        if (n <= 14 && n >= 12)
            val = 6;
        n -= val;
        //cout << n << "\n";
        result++;
    }
    cout << result;
    return 0;
}
For n from 12 to 14 we have to pick 6 rather than 9, because it takes fewer steps to reach zero that way. The code got AC on only 18/22 test cases. Please help me understand my mistake.

Greedy will not work here, since choosing the locally optimal value at every step does not guarantee the best end result (you can see that in your own example). Instead you should explore every possible choice at each step to find the overall optimum.
Now let's see how we can do that. Here the maximum input is 10^5, and we can withdraw any one of only the following 12 values in one operation:
[1, 6, 9, 36(=6^2), 81(=9^2), 216(=6^3), 729(=9^3), 1296(=6^4), 6561(=9^4), 7776(=6^5), 46656(=6^6), 59049(=9^5)]
because 6^7 and 9^6 both exceed 100000.
So at each step with value n we try every possible element arr[i] (i.e., arr[i] <= n) from the above array and recursively solve the subproblem for n - arr[i], until we reach zero.
solve(n)
    if n == 0
        return 0;
    ans = n;
    for (int i = 0; i < arr.length; i++)
        if (n >= arr[i])
            ans = min(ans, 1 + solve(n - arr[i]));
    return ans;
This recursive solution is exponential in time, since it branches up to 12 ways at every level. We will try to optimize it. If you trace some sample cases you will notice that the subproblems overlap, i.e., the same subproblem is solved over and over. Here Dynamic Programming comes into the picture: we can store every subproblem's solution and re-use it later. So we can modify our solution as follows:
solve(n)
    if n == 0
        return 0;
    if (dp[n] is seen)
        return dp[n];
    ans = n;
    for (int i = 0; i < arr.length; i++)
        if (n >= arr[i])
            ans = min(ans, 1 + solve(n - arr[i]));
    return dp[n] = ans;
The time complexity of the DP solution is O(12n).
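For reference, a minimal bottom-up C++ version of this DP might look as follows (a sketch; the names are mine, and the withdrawable amounts are generated rather than hard-coded):

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    int n;
    cin >> n;
    // all withdrawable amounts <= n: 1 and the powers of 6 and 9
    vector<int> amounts;
    for (long long p = 1; p <= n; p *= 6) amounts.push_back((int)p);
    for (long long p = 9; p <= n; p *= 9) amounts.push_back((int)p);
    // dp[v] = minimum number of operations to withdraw exactly v yen
    vector<int> dp(n + 1, 0);
    for (int v = 1; v <= n; ++v) {
        dp[v] = v; // worst case: v withdrawals of 1 yen
        for (int a : amounts)
            if (a <= v)
                dp[v] = min(dp[v], dp[v - a] + 1);
    }
    cout << dp[n] << "\n"; // 4 for the sample input 127
    return 0;
}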

Related

How to find the minimum number of processors to finish n jobs

I originally saw this problem on my university test (unfortunately, I no longer have access to the question and its test cases, so this is described from memory). The problem: you are given n jobs with start and finish times, and you have to find the minimum number of processors needed to finish ALL jobs. A processor can only run disjoint jobs (jobs that don't overlap).
Sample test case:
There are 5 jobs (with start and end time given respectively):
1 11
2 3
3 6
4 6
10 13
The answer is you need a min of 3 processors to complete all the jobs.
processor 1: 1-11
processor 2: 2-3 4-6 10-13
processor 3: 3-6
My idea was to use a set of pairs with finish time as the first element and start time as the second of the pair, so the jobs are sorted by finish time. I would then repeatedly iterate over the set, on each pass removing a maximal set of mutually disjoint jobs (the classic greedy for scheduling the maximum number of jobs on a single processor); the answer would be the number of passes it took until the set was empty.
However, this didn't work on all the test cases. I believe it might be because the algorithm is too slow: inserting and deleting in the set take log n time each (done for n elements), and I also iterate over the set again and again until it is empty.
My question is, is there a better way to do this?
Code:
#include <iostream>
#include <set>
#include <vector>
#include <utility>
#include <algorithm>
using namespace std;

int main()
{
    int t;
    cin >> t;
    while (t--)
    {
        int n;
        cin >> n;
        set<pair<int, int>> a; // {finish, start}, so the set is ordered by finish time
        for (int i = 0; i < n; ++i)
        {
            int s, f;
            cin >> s >> f;
            a.insert({f, s});
        }
        int cnt = 0;
        while (!a.empty())
        {
            // greedily pack as many compatible jobs as possible onto one processor
            pair<int, int> range = *a.begin();
            a.erase(a.begin());
            for (auto it = a.begin(); it != a.end(); )
            {
                if (it->second > range.first) // job starts after the current finish
                {
                    range.first = it->first;  // track the new finish time, not the start
                    it = a.erase(it);         // erase returns the next valid iterator
                }
                else
                    ++it;
            }
            cnt++;
        }
        cout << cnt << endl;
    }
}
The first line of input is the number of test cases.
The next line is the number of jobs, n.
The following n lines each describe a job: the first number is its start time and the second its end time.
It can be done faster, in time linear in n plus the time range, if the constraints on start and finish times are small. We can make a difference array dp in which, for each job, we do dp[job[i].start] += 1 and dp[job[i].end + 1] -= 1. After computing the prefix sums, the maximum value is our answer. This is very similar to the minimum-number-of-lecture-halls problem. For more details, you can refer to https://www.geeksforgeeks.org/minimum-halls-required-for-class-scheduling/.
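A rough sketch of that difference-array idea (MAXT is a bound I made up on the end times, and jobs sharing an endpoint are treated as overlapping, as in the sample):

#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        int n;
        scanf("%d", &n);
        const int MAXT = 1000000;          // assumed upper bound on end times
        vector<int> diff(MAXT + 2, 0);
        for (int i = 0; i < n; ++i) {
            int s, f;
            scanf("%d %d", &s, &f);
            diff[s] += 1;                  // job becomes active at s
            diff[f + 1] -= 1;              // and inactive after f
        }
        int cur = 0, best = 0;
        for (int x = 0; x <= MAXT; ++x) {  // prefix sum = number of jobs active at x
            cur += diff[x];
            best = max(best, cur);
        }
        printf("%d\n", best);              // max overlap = processors needed
    }
    return 0;
}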

Count sphenic odd number in range efficiently

I couldn't find a good solution for this problem: I need to count the odd sphenic numbers in a range efficiently. I have used a sieve, but there should be a better way to do this.
An odd sphenic number is a product of 3 distinct odd primes.
I have tried the following, but it takes too much time.
#include <iostream>
using namespace std;

char t[200000001]; // t[i]: number of distinct odd prime factors found for i
int dp[200000001]; // dp[i]: 1 if i is an odd sphenic number, then prefix-summed

int main() {
    cin.tie(0);
    ios_base::sync_with_stdio(false);
    for (int i = 3; i < 200000001; i += 2)
    {
        if (t[i] == 0) // i is an odd prime
        {
            for (int j = i; j < 200000001; j += i)
                t[j]++;
            // disqualify multiples of i*i: a sphenic number must be squarefree
            for (long long j = (long long)i * i; j < 200000001; j += (long long)i * i)
                t[j] += 4;
        }
        if (t[i] == 3 && i % 2)
            dp[i] = 1;
    }
    for (int i = 100; i < 200000001; i++) // prefix sums (smallest odd sphenic is 3*5*7 = 105)
        dp[i] += dp[i - 1];
    int test = 0;
    cin >> test;
    for (int tt = 1; tt <= test; tt++)
    {
        int a, b;
        cin >> a >> b;
        cout << dp[b] - dp[a - 1] << "\n";
    }
}
thank you
Edit:
I am trying to solve this problem:
http://codeforces.com/group/HtnK54FR7R/contest/219854/problem/D
A prime sieve starting at 3 is a great idea. But the highest possible useful prime is ...? (hint: 200,000,000 / (? * ?))
One possible way might be to enumerate combinations of two primes backwards as the primes are generated, binary-searching the highest index for the second and the lowest possible index for the third, and exiting the generation altogether once 15 * current_prime is greater than r.
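To illustrate the flavor of this (my own sketch, with a small demo limit and names of my choosing): sieve the odd primes up to LIMIT/15 (since the two smaller primes are at least 3 and 5), then for each pair p < q count the valid third primes r with q < r <= LIMIT/(p*q) by binary search. Range queries [a, b] would then be answered as f(b) - f(a-1).

#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    const long long LIMIT = 1000000;           // demo; the real problem uses 2*10^8
    const long long PMAX = LIMIT / 15;         // largest useful third prime
    vector<char> comp(PMAX + 1, 0);
    vector<long long> primes;                  // odd primes only
    for (long long i = 3; i <= PMAX; i += 2) {
        if (!comp[i]) {
            primes.push_back(i);
            for (long long j = i * i; j <= PMAX; j += 2 * i) comp[j] = 1;
        }
    }
    long long count = 0;
    for (size_t i = 0; i < primes.size(); ++i) {
        long long p = primes[i];
        if (p * p * p > LIMIT) break;          // p < q < r forces p^3 < p*q*r
        for (size_t j = i + 1; j < primes.size(); ++j) {
            long long q = primes[j];
            if (p * q * q > LIMIT) break;      // r > q, so p*q*q is a lower bound
            long long rmax = LIMIT / (p * q);  // r can be any prime in (q, rmax]
            count += upper_bound(primes.begin(), primes.end(), rmax)
                   - (primes.begin() + j + 1);
        }
    }
    printf("%lld odd sphenic numbers up to %lld\n", count, LIMIT);
    return 0;
}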

Debugging hackerrank week of code Lazy Sorting

I am doing a question on hackerrank (https://www.hackerrank.com/contests/w21/challenges/lazy-sorting) right now, and I am confused as to why my code doesn't fulfill the requirements. The question asks:
Logan is cleaning his apartment. In particular, he must sort his old favorite sequence, P, of N positive integers in nondecreasing order. He's tired from a long day, so he invented an easy way (in his opinion) to do this job. His algorithm can be described by the following pseudocode:
while isNotSorted(P) do {
    WaitOneMinute();
    RandomShuffle(P)
}
Can you determine the expected number of minutes that Logan will spend waiting for P to be sorted?
Input format:
The first line contains a single integer, N, denoting the size of permutation P. The second line contains N space-separated integers describing the respective elements in the sequence's current order, P_0, P_1, ..., P_{N-1}.
Constraints:
2 <= N <= 18
1 <= P_i <= 100
Output format:
Print the expected number of minutes Logan must wait for P to be sorted, rounded to a scale of exactly 6 decimal places (i.e.,1.234567 format).
Sample input:
2
5 2
Sample output:
2.000000
Explanation
There are two permutations possible after a random shuffle, and each of them has probability 0.5. The probability that the sequence is sorted after the first minute is 0.5. The probability that it is first sorted after the second minute is 0.25, after the third minute 0.125, and so on. The expected number of minutes hence equals:
sum_{i=1}^{infinity} i * 2^(-i) = 2
I wrote my code in c++ as follow:
#include <cmath>
#include <cstdio>
#include <vector>
#include <iostream>
#include <algorithm>
#include <map>
using namespace std;

int main() {
    /* Enter your code here. Read input from STDIN. Print output to STDOUT */
    map<int, int> m; // map storing the number of repetitions of each number
    int N;           // number of elements in list
    // calculate the number of permutations
    cin >> N;
    int total_perm = 1;
    int temp;
    for (int i = 0; i < N; i++) {
        cin >> temp;
        // if temp exists, add one to m[temp], else initialize a new key-value pair
        if (m.find(temp) == m.end()) {
            m[temp] = 1;
        } else {
            m[temp] += 1;
        }
        total_perm *= i + 1;
    }
    // adjust the permutation count to account for repetitions
    for (map<int, int>::iterator iter = m.begin(); iter != m.end(); ++iter) {
        if (iter->second > 1) {
            temp = iter->second;
            while (temp > 1) {
                total_perm = total_perm / temp;
                temp -= 1;
            }
        }
    }
    float recur = 1 / float(total_perm);
    float prev;
    float current = recur;
    float error = 1;
    int count = 1;
    // print the expected number of minutes to 6 decimal places
    if (total_perm == 1) {
        printf("%6f", recur);
    } else {
        while (error > 0.0000001) {
            count += 1;
            prev = current;
            current = prev + float(count) * float(1 - recur) * pow(recur, count - 1);
            error = abs(current - prev);
        }
        printf("%6f", prev);
    }
    return 0;
}
I don't really care about the competition, it's more about learning for me, so I would really appreciate it if someone could point out where I went wrong.
Unfortunately I am not familiar with C++ so I don't know exactly what your code is doing. I did, however, solve this problem. It's pretty cheeky and I think they posed the problem the way they did just to be confusing. So the important piece of knowledge here is that for an event with probability p, the expected number of trials until a success is 1/p. Since each trial here costs us a minute, that means we can find the expected number of trials and add ".000000" to the end.
So how do you do that? Well, each permutation of the numbers is equally likely to occur, which means that if we can find how many distinct permutations there are, we can find p. And then we take 1/p to get E[time]. But notice that each permutation has probability 1/K of occurring, where K is the total number of distinct permutations, so really E[time] = K, the number of permutations. I leave the rest to you.
This is a simple problem; it is essentially bogosort.
How many unique permutations of the given array are possible? In the sample case, there are two permutations possible, so the expected time for any one permutation to occur is 2.000000. Extend this approach to the generic case, taking into account any repeated numbers.
However in the question, the numbers can be repeated. This reduces the number of unique permutations, and thus the answer.
Just find the number of unique permutations of the array, upto 6 decimal places. That is your answer.
Think about what happens if the array is already sorted.
E.g
if test case is
5 5
5 4 3 2 1
then ans would be 120.000000 (5!/1!)
5 5
1 2 3 4 5
then ans would be 0.000000 in your question.
5 5
2 2 2 2 2
then also ans would be 0.000000
5 5
5 1 2 2 3
then ans is 60.000000
In general, if the array is not already sorted, the answer is N!/(P! * Q! * ...), where P, Q, ... are the multiplicities of the repeated values; if it is already sorted, the answer is 0.
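A minimal sketch of that formula (my own code, not the poster's; 64-bit integers suffice because N <= 18, so N! < 2^63):

#include <cstdio>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    int n;
    scanf("%d", &n);
    vector<int> p(n);
    map<int, int> cnt;
    for (int &x : p) { scanf("%d", &x); cnt[x]++; }
    if (is_sorted(p.begin(), p.end())) { printf("0.000000\n"); return 0; }
    unsigned long long num = 1, den = 1;
    for (int i = 2; i <= n; ++i) num *= i;        // N!
    for (auto &kv : cnt)                          // divide by each multiplicity factorial
        for (int i = 2; i <= kv.second; ++i) den *= i;
    printf("%llu.000000\n", num / den);           // the count is an exact integer
    return 0;
}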
Here is another useful link:
https://math.stackexchange.com/questions/1844133/expectation-over-sequencial-random-shuffles

How to solve weighted Activity selection with use of Segment Trees and Binary search?

Given N jobs, where every job is described by the following three values:
1) Start Time
2) Finish Time.
3) Profit or Value Associated.
Find the maximum profit subset of jobs such that no two jobs in the subset overlap.
I know a dynamic programming solution with complexity O(N^2) (similar to LIS: for the i-th interval, check all previous intervals that can precede it and take the one whose DP value is maximum). This solution can be further improved to O(N log N) using sorting and binary search.
But my friend told me that it can even be solved using segment trees and binary search. I have no clue where the segment tree would be used, or how. Can you help?
On request (sorry, not commented):
What I am doing is sorting on the basis of the starting index and storing in dp[i] the maximum value obtainable up to the i-th interval by merging with previous intervals and their maximum obtainable values.
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <utility>
using namespace std;

void solve()
{
    int n, i, j, high;
    scanf("%d", &n);
    pair<pair<int, int>, int> arr[n + 1]; // the first pair is (l, r); the lone int is the cost
    int dp[n + 1];
    memset(dp, 0, sizeof(dp));
    for (i = 0; i < n; i++)
        scanf("%d%d%d", &arr[i].first.first, &arr[i].first.second, &arr[i].second);
    std::sort(arr, arr + n); // by default sorts on the starting index
    for (i = 0; i < n; i++)
    {
        high = arr[i].second;
        for (j = 0; j < i; j++) // checking all previous mergeable intervals; note we use DP[] of the mergeable interval due to optimal substructure
        {
            if (arr[i].first.first >= arr[j].first.second)
                high = std::max(high, dp[j] + arr[i].second);
        }
        dp[i] = high;
    }
    for (i = 0; i < n; i++)
        dp[n - 1] = std::max(dp[n - 1], dp[i]);
    printf("%d\n", dp[n - 1]);
}

int main()
{
    solve();
    return 0;
}
EDIT:
My working code; it finally took me 3 hours to debug! Moreover, this code is slower than the binary search and sorting one due to a larger constant and a bad implementation :P (just for reference)
#include <stdio.h>
#include <algorithm>
#include <vector>
#include <cstring>
#include <iostream>
#include <climits>
#define lc(idx) (2 * idx + 1)
#define rc(idx) (2 * idx + 2)
#define mid(l, r) ((l + r) / 2)
using namespace std;

int Tree[4 * 2 * 10000 - 1];

void update(int L, int R, int qe, int idx, int value)
{
    if (value > Tree[0])
        Tree[0] = value;
    while (L < R)
    {
        if (qe <= mid(L, R))
        {
            idx = lc(idx);
            R = mid(L, R);
        }
        else
        {
            idx = rc(idx);
            L = mid(L, R) + 1;
        }
        if (value > Tree[idx])
            Tree[idx] = value;
    }
    return;
}

int Get(int L, int R, int idx, int q)
{
    if (q < L)
        return 0;
    if (R <= q)
        return Tree[idx];
    return max(Get(L, mid(L, R), lc(idx), q), Get(mid(L, R) + 1, R, rc(idx), q));
}

bool cmp(pair<pair<int, int>, int> A, pair<pair<int, int>, int> B)
{
    return A.first.second < B.first.second;
}

int main()
{
    int N, i;
    scanf("%d", &N);
    pair<pair<int, int>, int> P[N];
    vector<int> V;
    for (i = 0; i < N; i++)
    {
        scanf("%d%d%d", &P[i].first.first, &P[i].first.second, &P[i].second);
        V.push_back(P[i].first.first);
        V.push_back(P[i].first.second);
    }
    sort(V.begin(), V.end());
    for (i = 0; i < N; i++)
    {
        int &l = P[i].first.first, &r = P[i].first.second;
        l = lower_bound(V.begin(), V.end(), l) - V.begin();
        r = lower_bound(V.begin(), V.end(), r) - V.begin();
    }
    sort(P, P + N, cmp);
    int ans = 0;
    memset(Tree, 0, sizeof(Tree));
    for (i = 0; i < N; i++)
    {
        int aux = Get(0, 2 * N - 1, 0, P[i].first.first) + P[i].second;
        if (aux > ans)
            ans = aux;
        update(0, 2 * N - 1, P[i].first.second, 0, ans);
    }
    printf("%d\n", ans);
    return 0;
}
high = arr[i].second;
for (j = 0; j < i; j++) // checking all previous mergeable intervals; note we use DP[] of the mergeable interval due to optimal substructure
{
    if (arr[i].first.first >= arr[j].first.second)
        high = std::max(high, dp[j] + arr[i].second);
}
dp[i] = high;
The inner loop here can be brought down to O(log n) with a segment tree.
First of all, let's rewrite it a bit. The max you are taking is a bit complicated, because it takes the maximum of a sum involving both i and j. But i is constant in this part, so let's take it out.
high = 0;
for (j = 0; j < i; j++) // checking all previous mergeable intervals
{
    if (arr[i].first.first >= arr[j].first.second)
        high = std::max(high, dp[j]);
}
dp[i] = high + arr[i].second;
Great, now we have reduced the problem to determining the maximum in [0, i - 1] out of the values that satisfy your if condition.
If we didn't have the if, it would be a simple application of segment trees.
Now there are two choices.
1. Deal with O(log V) query time and O(V) memory for the segment tree
where V is the maximum value of an interval endpoint.
You can build a segment tree into which you insert interval end points as you move your i (the condition compares the current start point against previous end points). Then you query over the range of values up to the current start point. Something like this, where the segment tree is initialized to -infinity and is of size O(V).
Update(node, index, value):
    if node.associated_interval == [index, index]:
        node.max = value
        return
    if index in node.left.associated_interval:
        Update(node.left, index, value)
    else:
        Update(node.right, index, value)
    node.max = max(node.left.max, node.right.max)

Query(node, left, right):
    if [left, right] does not intersect node.associated_interval:
        return -infinity
    if node.associated_interval included in [left, right]:
        return node.max
    return max(Query(node.left, left, right),
               Query(node.right, left, right))

[...]

high = Query(tree, 0, arr[i].first.first)
dp[i] = high + arr[i].second
Update(tree, arr[i].first.second, dp[i])
2. Reducing to O(log n) query time and O(n) memory for the segment tree
Since the number of intervals might be significantly smaller than the range of their endpoints, it's reasonable to think that we might be able to encode them better somehow, so that the tree size is also O(n). Indeed, we can.
This involves normalizing your intervals in the range [1, 2*n]. Consider the following intervals
8 100
3 50
90 92
Let's plot them on a line. They'd look like this:
3 8 50 90 92 100
Now replace each of them with their index:
1 2 3 4 5 6
3 8 50 90 92 100
And write your new intervals:
2 6
1 3
4 5
Note that they retain the properties of your initial intervals: the same ones overlap, the same ones are included in each other etc.
This can be done with a sort. You can now apply the same segment tree algorithm, except you declare the segment tree for the size 2*n.
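For instance, the normalization step might look like this (a sketch with my own names; it is essentially what the working code above does with lower_bound, here with the endpoint list deduplicated):

#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    int n;
    scanf("%d", &n);
    vector<pair<int, int>> iv(n);
    vector<int> xs;                       // every endpoint, at most 2*n values
    for (auto &p : iv) {
        scanf("%d %d", &p.first, &p.second);
        xs.push_back(p.first);
        xs.push_back(p.second);
    }
    sort(xs.begin(), xs.end());
    xs.erase(unique(xs.begin(), xs.end()), xs.end());
    for (auto &p : iv) {
        // replace each endpoint by its rank; order and overlaps are preserved
        p.first  = lower_bound(xs.begin(), xs.end(), p.first)  - xs.begin();
        p.second = lower_bound(xs.begin(), xs.end(), p.second) - xs.begin();
        printf("%d %d\n", p.first, p.second);
    }
    return 0;
}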

Minimizing time in transit

[Updates at bottom (including solution source code)]
I have a challenging business problem that a computer can help solve.
Along a mountainous region flows a long winding river with strong currents. Along certain parts of the river are plots of environmentally sensitive land suitable for growing a particular type of rare fruit that is in very high demand. Once field laborers harvest the fruit, the clock starts ticking to get the fruit to a processing plant. It's very costly to try and send the fruits upstream or over land or air. By far the most cost effective mechanism to ship them to the plant is downstream in containers powered solely by the river's constant current. We have the capacity to build 10 processing plants and need to locate these along the river to minimize the total time the fruits spend in transit. The fruits can take however long before reaching the nearest downstream plant but that time directly hurts the price at which they can be sold. Effectively, we want to minimize the sum of the distances to the nearest respective downstream plant. A plant can be located as little as 0 meters downstream from a fruit access point.
The question is: In order to maximize profits, how far up the river should we build the 10 processing plants if we have found 32 fruit growing regions, where the regions' distances upstream from the base of the river are (in meters):
10, 40, 90, 160, 250, 360, 490, ... (n^2)*10 ... 9000, 9610, 10240?
[It is hoped that all work going towards solving this problem and towards creating similar problems and usage scenarios can help raise awareness about and generate popular resistance towards the damaging and stifling nature of software/business method patents (to whatever degree those patents might be believed to be legal within a locality).]
UPDATES
Update1: Forgot to add: I believe this question is a special case of this one.
Update2: One algorithm I wrote gives an answer in a fraction of a second, and I believe it is rather good (but it's not yet stable across sample values). I'll give more details later, but in short it is as follows. Place the plants at equal spacings. Cycle over all the inner plants, recalculating each plant's position by testing every location between its two neighbors until the problem is solved within that space (a greedy step): optimize plant 2 holding 1 and 3 fixed, then plant 3 holding 2 and 4 fixed, and so on. When you reach the end, cycle back and repeat until a full cycle passes in which no plant's recalculated position changes. Also, at the end of each cycle, try to move processing plants that are crowded next to each other near the same fruit dumps into a region whose fruit dumps are far away. There are many ways to vary the details, and hence the exact answer produced. I have other candidate algorithms, but all have glitches. [I'll post code later.] Just as Mike Dunlavey mentioned below, we likely just want "good enough".
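Here is a rough sketch of that cyclic pass (my own code, not the promised one; it skips the crowd-relocation step, initializes by site index rather than distance, and restricts plants to the fruit access points, which is safe because moving a plant up to the nearest access point never increases any distance):

#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

const int SITES = 32, PLANTS = 10;
int site[SITES];

// total travel from every site down to its nearest downstream plant;
// plant[] holds PLANTS site indices in increasing order
long long cost(const vector<int> &plant) {
    long long sum = 0;
    for (int i = 0; i < SITES; ++i) {
        long long best = -1;
        for (int p : plant)
            if (site[p] <= site[i])
                best = max(best, (long long)site[p]);
        sum += (best < 0) ? (1LL << 40) : site[i] - best; // huge penalty if unserved
    }
    return sum;
}

int main() {
    for (int i = 0; i < SITES; ++i) site[i] = (i + 1) * (i + 1) * 10;
    vector<int> plant(PLANTS);
    for (int k = 0; k < PLANTS; ++k)              // roughly equal spacing to start
        plant[k] = k * (SITES - 1) / (PLANTS - 1);
    bool changed = true;
    while (changed) {                             // full cycles until nothing moves
        changed = false;
        for (int k = 1; k + 1 < PLANTS; ++k) {    // inner plants only
            int orig = plant[k], best = orig;
            long long bestc = cost(plant);
            for (int s = plant[k - 1] + 1; s < plant[k + 1]; ++s) {
                plant[k] = s;                     // try every slot between neighbors
                long long c = cost(plant);
                if (c < bestc) { bestc = c; best = s; }
            }
            plant[k] = best;
            if (best != orig) changed = true;
        }
    }
    printf("total travel %lld at plants:", cost(plant));
    for (int p : plant) printf(" %d", site[p]);
    printf("\n");
    return 0;
}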
To give an idea of what might be a "good enough" result:
10010 total length of travel from 32 locations to plants at
{10,490,1210,1960,2890,4000,5290,6760,8410,9610}
Update3: mhum gave the correct exact solution first but did not (yet) post a program or algorithm, so I wrote one up that yields the same values.
/************************************************************
 This program can be compiled and run (eg, on Linux):
   $ gcc -std=c99 processing-plants.c -o processing-plants
   $ ./processing-plants
************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//a: Data set of values. Add extra large number at the end
int a[] = {
    10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240,99999
};
//numofa: size of data set
int numofa = sizeof(a) / sizeof(int);
//a2: will hold (pt to) unique data from a and in sorted order.
int *a2;
//max: size of a2
int max;
//num_fixed_loc: at 10 gives the solution for 10 plants
int num_fixed_loc;
//xx: holds index values of a2 from the lowest error winner of each cycle memoized. accessed via memoized offset value. Winner is based off lowest error sum from left boundary upto right ending boundary.
//FIX: to be dynamically sized.
int xx[1000000];
//xx_last: how much of xx has been used up
int xx_last = 0;
//SavedBundle: data type to "hold" memoized values needed (total travel distance and plant locations)
typedef struct _SavedBundle {
    long e;
    int xx_offset;
} SavedBundle;
//sb: (pts to) lookup table of all calculated values memoized
SavedBundle *sb; //holds winning values being memoized

//Sort in increasing order.
int sortfunc(const void *a, const void *b) {
    return (*(int *)a - *(int *)b);
}

/****************************
 Most interesting code in here
****************************/
long full_memh(int l, int n) {
    long e;
    long e_min = -1;
    int ti = -1;
    if (sb[l * max + n].e) {
        return sb[l * max + n].e; //convenience passing
    }
    for (int i = l + 1; i < max - 1; i++) {
        e = 0;
        //sum first part
        for (int j = l + 1; j < i; j++) {
            e += a2[j] - a2[l];
        }
        //sum second part
        if (n != 1) //general case, recursively
            e += full_memh(i, n - 1);
        else //base case, iteratively
            for (int j = i + 1; j < max - 1; j++) {
                e += a2[j] - a2[i];
            }
        if (e_min == -1) {
            e_min = e;
            ti = i;
        }
        if (e < e_min) {
            e_min = e;
            ti = i;
        }
    }
    sb[l * max + n].e = e_min;
    sb[l * max + n].xx_offset = xx_last;
    xx[xx_last] = ti; //later add a test or a realloc, etc, if approp
    for (int i = 0; i < n - 1; i++) {
        xx[xx_last + (i + 1)] = xx[sb[ti * max + (n - 1)].xx_offset + i];
    }
    xx_last += n;
    return e_min;
}

/*************************************************************
 Call to calculate and print results for given number of plants
*************************************************************/
int full_memoization(int num_fixed_loc) {
    char *str;
    long errorsum; //for convenience
    //Call recursive workhorse
    errorsum = full_memh(0, num_fixed_loc - 2);
    //Now print (note %ld: errorsum is a long)
    str = (char *)malloc(num_fixed_loc * 20 + 100);
    sprintf(str, "\n%4d %6ld {%d,", num_fixed_loc - 1, errorsum, a2[0]);
    for (int i = 0; i < num_fixed_loc - 2; i++)
        sprintf(str + strlen(str), "%d%c", a2[xx[sb[0 * max + (num_fixed_loc - 2)].xx_offset + i]], (i < num_fixed_loc - 3) ? ',' : '}');
    printf("%s", str);
    free(str);
    return 0;
}

/**************************************************
 Initialize and call for plant numbers of many sizes
**************************************************/
int main(int argc, char **argv) {
    int t;
    int i2;
    qsort(a, numofa, sizeof(int), sortfunc);
    t = 1;
    for (int i = 1; i < numofa; i++)
        if (a[i] != a[i - 1])
            t++;
    max = t;
    i2 = 1;
    a2 = (int *)malloc(sizeof(int) * t);
    a2[0] = a[0];
    for (int i = 1; i < numofa; i++)
        if (a[i] != a[i - 1]) {
            a2[i2++] = a[i];
        }
    sb = (SavedBundle *)calloc(max * max, sizeof(SavedBundle));
    for (int i = 3; i <= max; i++) {
        full_memoization(i);
    }
    free(sb);
    return 0;
}
Let me give you a simple example of a Metropolis-Hastings algorithm.
Suppose you have a state vector x, and a goodness-of-fit function P(x), which can be any function you care to write.
Suppose you have a random distribution Q that you can use to modify the vector, such as x' = x + N(0, 1) * sigma, where N is a simple normal distribution about 0, and sigma is a standard deviation of your choosing.
p = P(x);
for (/* a lot of iterations */) {
    // add x to a sample array
    // get the next sample
    x' = x + N(0,1) * sigma;
    p' = P(x');
    // if it is better, accept it
    if (p' > p) {
        x = x';
        p = p';
    }
    // if it is not better
    else {
        // maybe accept it anyway
        if (Uniform(0,1) < (p' / p)) {
            x = x';
            p = p';
        }
    }
}
Usually it is done with a burn-in time of maybe 1000 cycles, after which you start collecting samples. After another maybe 10,000 cycles, the average of the samples is what you take as an answer.
It requires diagnostics and tuning. Typically the samples are plotted, and what you are looking for is a "fuzzy caterpillar" plot that is stable (doesn't move around much) and has a high acceptance rate (very fuzzy). The main parameter you can play with is sigma.
If sigma is too small, the plot will be fuzzy but it will wander around.
If it is too large, the plot will not be fuzzy - it will have horizontal segments.
Often the starting vector x is chosen at random, and often multiple starting vectors are chosen, to see if they end up in the same place.
It is not necessary to vary all components of the state vector x at the same time. You can cycle through them, varying one at a time, or some such method.
Also, if you don't need the diagnostic plot, it may not be necessary to save the samples, but just calculate the average and variance on the fly.
In the applications I'm familiar with, P(x) is a measure of probability, and it is typically in log-space, so it can vary from 0 to negative infinity.
Then the "maybe accept" test becomes Uniform(0,1) < exp(logp' - logp).
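As a concrete illustration for the river problem above, a rough Metropolis-style sketch might look like this (entirely my own construction: it treats exp(-cost/T) as the unnormalized P(x), and the temperature T, iteration count, and seed are arbitrary tuning choices, so it only approximates the exact answers below):

#include <cstdio>
#include <cmath>
#include <random>
#include <vector>
#include <algorithm>
using namespace std;

const int SITES = 32, PLANTS = 10;
int site[SITES];

// total distance from each fruit site to its nearest downstream plant
long long cost(const vector<int> &x) {
    long long sum = 0;
    for (int i = 0; i < SITES; ++i) {
        long long best = -1;
        for (int p : x)
            if (site[p] <= site[i])
                best = max(best, (long long)site[p]);
        if (best < 0) return 1LL << 40;   // a site with no downstream plant
        sum += site[i] - best;
    }
    return sum;
}

int main() {
    for (int i = 0; i < SITES; ++i) site[i] = (i + 1) * (i + 1) * 10;
    mt19937 rng(12345);
    uniform_int_distribution<int> pickPlant(0, PLANTS - 1), pickSite(0, SITES - 1);
    uniform_real_distribution<double> unif(0.0, 1.0);
    vector<int> x(PLANTS);
    for (int &v : x) v = pickSite(rng);   // random starting placement
    long long c = cost(x);
    vector<int> bestx = x;
    long long bestc = c;
    const double T = 200.0;               // "temperature"; would need tuning
    for (int it = 0; it < 200000; ++it) {
        vector<int> y = x;
        y[pickPlant(rng)] = pickSite(rng);            // move one plant at random
        long long cy = cost(y);
        // accept if better, or maybe accept anyway with prob exp((c - cy) / T)
        if (cy <= c || unif(rng) < exp((double)(c - cy) / T)) {
            x = y;
            c = cy;
            if (c < bestc) { bestc = c; bestx = x; }
        }
    }
    sort(bestx.begin(), bestx.end());
    printf("best total distance found: %lld at plants:", bestc);
    for (int p : bestx) printf(" %d", site[p]);
    printf("\n");
    return 0;
}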
Unless I've made an error, here are exact solutions (obtained through a dynamic programming approach):
N Dist Sites
2 60950 {10,4840}
3 40910 {10,2890,6760}
4 30270 {10,2250,4840,7840}
5 23650 {10,1690,3610,5760,8410}
6 19170 {10,1210,2560,4410,6250,8410}
7 15840 {10,1000,2250,3610,5290,7290,9000}
8 13330 {10,810,1960,3240,4410,5760,7290,9000}
9 11460 {10,810,1690,2890,4000,5290,6760,8410,9610}
10 9850 {10,640,1440,2250,3240,4410,5760,7290,8410,9610}
11 8460 {10,640,1440,2250,3240,4410,5290,6250,7290,8410,9610}
12 7350 {10,490,1210,1960,2890,3610,4410,5290,6250,7290,8410,9610}
13 6470 {10,490,1000,1690,2250,2890,3610,4410,5290,6250,7290,8410,9610}
14 5800 {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,10240}
15 5190 {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,9610,10240}
16 4610 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,8410,9000,9610,10240}
17 4060 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,7840,8410,9000,9610,10240}
18 3550 {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,6760,7290,7840,8410,9000,9610,10240}
19 3080 {10,360,810,1210,1690,2250,2890,3610,4410,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
20 2640 {10,250,640,1000,1440,1960,2560,3240,4000,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
21 2230 {10,250,640,1000,1440,1960,2560,3240,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
22 1860 {10,250,640,1000,1440,1960,2560,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
23 1520 {10,250,490,810,1210,1690,2250,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
24 1210 {10,250,490,810,1210,1690,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
25 940 {10,250,490,810,1210,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
26 710 {10,160,360,640,1000,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
27 500 {10,160,360,640,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
28 330 {10,160,360,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
29 200 {10,160,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
30 100 {10,90,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
31 30 {10,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
32 0 {10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
