Pixies in the custard swamp puzzle - algorithm

(With thanks to Rich Bradshaw)
I'm looking for optimal strategies for the following puzzle.
As the new fairy king, it is your duty to map the kingdom's custard swamp.
The swamp is covered in an ethereal mist, with islands of custard scattered throughout.
You can send your pixies across the swamp, with instructions to fly low or high at each point.
If a pixie swoops down over a custard, it will be distracted and won't complete its sequence.
Since the mist is so thick, all you know is whether a pixie got to the other side or not.
In coding terms:
bool flutter( bool[size] swoop_map );
This returns whether a pixie exited for a given sequence of swoops.
The simplest way is to pass in sequences with only one swoop. That reveals all custard islands in 'size' tries.
I'd rather have something proportional to the number of custards, but that runs into trouble with sequences like:
C......C (that is, custards at beginning and end)
Links to other forms of this puzzle would be welcome as well.

This makes me think of divide and conquer. Maybe something like this (this is slightly broken pseudocode. It may have fence-post errors and the like):
bool[size] check()
{
    bool[size] retval = ALLFALSE;
    bool[size] flut1 = ALLFALSE;
    bool[size] flut2 = ALLFALSE;
    for (int i = 0; i < size/2; ++i) flut1[i] = TRUE;
    for (int i = size/2; i < size; ++i) flut2[i] = TRUE;
    // flutter returning false means the pixie was distracted, i.e. custard somewhere in that half
    if (!flutter(flut1)) retval[0..size/2] = <recurse into first half>
    if (!flutter(flut2)) retval[size/2..size] = <recurse into second half>
    return retval;
}
In plain English, it calls flutter on each half of the custard map. If the flight over a half returns true (the pixie made it across), that whole half has no custard. Otherwise, the algorithm is applied recursively to each half of that half. I'm not sure if it is possible to do better. However, this algorithm is kind of lame if the swamp is mostly custard.
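For concreteness, here is a minimal runnable C++ sketch of the same idea (a variant that probes the whole current range before splitting; the flutter callback is an assumption standing in for the puzzle's oracle, returning true when the pixie gets across):

#include <functional>
#include <vector>

// Sketch of the divide-and-conquer idea. flutter(swoop) is assumed to return
// true iff the pixie exits, i.e. no custard under any position marked true.
std::vector<bool> check(int size,
                        const std::function<bool(const std::vector<bool>&)>& flutter) {
    std::vector<bool> custard(size, false);
    std::function<void(int, int)> solve = [&](int lo, int hi) {
        if (lo >= hi) return;
        std::vector<bool> swoop(size, false);
        for (int i = lo; i < hi; ++i) swoop[i] = true;
        if (flutter(swoop)) return;        // pixie exited: no custard in [lo, hi)
        if (hi - lo == 1) {                // a single square that must be custard
            custard[lo] = true;
            return;
        }
        int mid = lo + (hi - lo) / 2;      // custard somewhere: split and recurse
        solve(lo, mid);
        solve(mid, hi);
    };
    solve(0, size);
    return custard;
}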
Idea Two:
int itsize = 1;
bool[size] retval = ALLFALSE;
for (int pos = 0; pos < size;)
{
    bool[size] nextval = ALLFALSE;
    for (int pos2 = pos; pos2 < pos + itsize && pos2 < size; ++pos2) nextval[pos2] = true;
    bool exited = flutter(nextval);   // true: no custard anywhere in this window
    if (exited || itsize == 1)
    {
        for (int pos2 = pos; pos2 < pos + itsize && pos2 < size; ++pos2) retval[pos2] = !exited;
        pos += itsize;
    }
    if (exited) itsize *= 2;   // clear window: double the window size
    else itsize = 1;           // custard in the window: fall back to single elements
}
In plain English, it calls flutter on a window of the custard map, starting one element at a time. If a window comes back clear, the next call covers twice as many elements as the previous one; if custard is found in a multi-element window, it falls back to single elements. This is kind of like binary search, except only in one direction, since it does not know how many items it is searching for. I have no idea how efficient this is.

Brian's first divide-and-conquer algorithm is optimal in the following sense: there exists a constant C such that, over all swamps with n squares and at most k custards, no algorithm has a worst case that is more than C times better than Brian's. Brian's algorithm uses O(k log(n/k)) flights, which is within a constant factor of the information-theoretic lower bound of log2(n choose k) >= log2((n/k)^k) = k log2(n/k), i.e. Omega(k log(n/k)). (You need an assumption like k <= n/2 to make the last step rigorous, but at that point we've already reached the maximum of O(n) flights.)
Why does Brian's algorithm use only O(k log(n/k)) flights? At recursion depth i, it makes at most min(2^i, k) flights. The sum for 0 <= i <= log2(k) is O(k). The sum for log2(k) < i <= log2(n) is k (log2(n) - log2(k)) = k (log2(n/k)).

Related

Finding optimal subsequence in a path

A path, for the purposes of this question, is a collection of points with integer coordinates v1, v2, v3 ... vn such that v1 is connected to v2, v2 is connected to v3, and so on. The path is non-cyclic and does not have any branches. (Two points u and v are connected if the absolute difference of either their x or their y coordinate is equal to 1.)
We say there is a possible segment between vi and vj if they satisfy some criterion that is irrelevant to this question.
ci represents the farthest point on the path in the forward direction such that there is a possible segment between vi and ci. (ci lies ahead of vi)
di represents the farthest point on the path in the backward direction such that there is a possible segment between vi and di. (vi lies ahead of di)
Note: If there is a possible segment between u and v then there is a possible segment between any of its sub segments.
The values of ci and di are already calculated for each i.
For each pair vi and vj there is an associated penalty, which has also already been calculated for each i and j.
A sequence in a path is a collection of points of the path u1, u2, u3 ... um (not necessarily connected) such that u1 = v1, um = vn and there is a possible segment between each ui and ui+1.
The number of segments in such a sequence is (m-1).
The problem is to find the optimal sequence: one with the minimum possible number of segments and, among all sequences with that many segments, the minimum sum of penalties between consecutive points.
This problem is solved in a program called potrace, which I am trying to reimplement, but that implementation uses cyclic paths while my paths are non-cyclic.
I also cannot understand how the potrace implementation below works in the first place.
In the implementation below clip0[i] represents ci and clip1[i] represents di.
In the potrace implementation, cyclic means that v1 and vn are also connected in the path.
Source Line 575
Documentation 2.2.4
/* calculate seg0[j] = longest path from 0 with j segments */
i = 0;
for (j=0; i<n; j++) {
seg0[j] = i;
i = clip0[i];
}
seg0[j] = n;
m = j;
/* calculate seg1[j] = longest path to n with m-j segments */
i = n;
for (j=m; j>0; j--) {
seg1[j] = i;
i = clip1[i];
}
seg1[0] = 0;
/* now find the shortest path with m segments, based on penalty3 */
/* note: the outer 2 loops jointly have at most n iterations, thus
the worst-case behavior here is quadratic. In practice, it is
close to linear since the inner loop tends to be short. */
pen[0]=0;
for (j=1; j<=m; j++) {
for (i=seg1[j]; i<=seg0[j]; i++) {
best = -1;
for (k=seg0[j-1]; k>=clip1[i]; k--) {
thispen = penalty3(pp, k, i) + pen[k];
if (best < 0 || thispen < best) {
prev[i] = k;
best = thispen;
}
}
pen[i] = best;
}
}
pp->m = m;
SAFE_CALLOC(pp->po, m, int); // output
/* read off shortest path */
for (i=n, j=m-1; i>0; j--) {
i = prev[i];
pp->po[j] = i;
}
A sample input can be this.
EDIT 1:
So when I implemented the same code for my case, the last loop broke: the index j either became negative (without self-looping) or i = prev[i] kept self-looping.
The penalty values are positive.
EDIT 2:
I coded up a rough version of Dijkstra's algorithm and it seems to be working.
The relevant bit of my code is below.
using Weight = std::pair<int, float>;
std::vector<std::vector<std::pair<int, Weight>>> graph;
graph.resize(n);
/*This takes O(n^2).*/
for (int i = 0; i < n; ++i) {
for (int j = clip1[i]; j <= clip0[i]; ++j) {
float pen = calculatePenalty(index, i, j);
graph[i].emplace_back(j, Weight(1, pen));
graph[j].emplace_back(i, Weight(1, pen));
}
}
std::vector<bool> vis(n, false);
std::vector<Weight> dist(n, {10e5 + 1, 0.0f});
std::vector<int> prev(n, 0);
dist[0] = {0, 0.0f};
std::multiset<std::pair<Weight, int>> set;
set.insert({{0, 0.0f}, 0});
while (!set.empty()) {
auto p = *set.begin();
set.erase(set.begin());
int x = p.second;
Weight w0 = p.first;
if (vis[x]) continue;
vis[x] = true;
for (auto v : graph[x]) {
int e = v.first;
Weight w = v.second;
Weight w_ = {dist[x].first + w.first, dist[x].second + w.second};
if (w_ < dist[e]) {
prev[e] = x;
dist[e] = w_;
set.insert({dist[e], e});
}
}
}
for (int i = n - 1; i > 0;) {
seq.push_back(i);
i = prev[i];
}
seq.push_back(0);
If there are any errors in the above code then please correct it.
I think a number of improvements can be made in the above code.
The initialization of the graph itself has O(n^2) complexity. There should be an alternative way to do this part or the whole part.
It's also not as compact as the potrace counterpart. A more compact implementation with better time complexity seems possible; if someone could provide some pseudocode in that direction, that would be appreciated.
Also, in the potrace implementation it seems that the number of segments is precisely m. But when I compute m in my case and compare it with seq.size() - 1, they are not equal. (It is sometimes greater and sometimes smaller in different cases, but never by a large margin.)
The problem you're describing is the (single-source single-destination) shortest-path problem, where an edge's weight is (1, penalty) (weights are summed elementwise and ordered lexicographically, so minimizing the number of edges takes first priority and minimizing the total penalty second). You can solve this problem in near-linear time with Dijkstra's algorithm if all your penalties are positive (or zero). In that case, you can prove that the shortest path will never repeat any vertices.
potrace's implementation looks roughly like the Bellman-Ford algorithm (in its dynamic-programming interpretation), which is a good approach if you have a mixture of positive and negative penalties (but unnecessarily slow if you have only positive penalties). In that case, the shortest path might repeat vertices, but when that happens, the path will actually repeat some vertices (a negative-weight cycle) infinitely many times, which is probably not what you want.
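For what it's worth, here is a rough C++ sketch of that lexicographic Dijkstra which avoids materializing the adjacency list: forward edges are generated on the fly from clip0 (forward moves only, which the sub-segment property suggests is sufficient). clip0 and the penalty callback are assumptions standing in for your data; the number of edge relaxations is still bounded by the same sum, so this mainly saves the O(n^2) storage rather than time.

#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Weight = std::pair<int, float>;   // (segment count, total penalty), compared lexicographically

// Sketch: shortest 0 -> n-1 sequence, generating edges lazily from clip0.
std::vector<int> shortestSequence(int n, const std::vector<int>& clip0,
                                  const std::function<float(int, int)>& penalty) {
    const Weight INF{1 << 30, 0.0f};
    std::vector<Weight> dist(n, INF);
    std::vector<int> prev(n, -1);
    std::priority_queue<std::pair<Weight, int>,
                        std::vector<std::pair<Weight, int>>,
                        std::greater<>> pq;
    dist[0] = {0, 0.0f};
    pq.push({dist[0], 0});
    while (!pq.empty()) {
        auto [w, u] = pq.top();
        pq.pop();
        if (dist[u] < w) continue;                     // stale queue entry
        for (int v = u + 1; v <= clip0[u] && v < n; ++v) {
            Weight cand{w.first + 1, w.second + penalty(u, v)};
            if (cand < dist[v]) { dist[v] = cand; prev[v] = u; pq.push({cand, v}); }
        }
    }
    std::vector<int> seq;                              // read the path back from n-1
    for (int v = n - 1; v != -1; v = prev[v]) seq.push_back(v);
    std::reverse(seq.begin(), seq.end());
    return seq;
}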

Divide N cakes among M people with minimum waste

So here is the question:
In a party there are n different-flavored cakes of volumes V1, V2, V3 ... Vn. They need to be divided among the K people present at the party such that:
All members of the party get an equal volume of cake (say V, which is the solution we are looking for),
Each member gets cake of a single flavour only (you cannot give parts of different-flavored cakes to one member),
Some volume of cake will be wasted after distribution; we want to minimize the waste, or, equivalently, we are after a maximum-distribution policy.
It is given that if V is an optimal solution, then at least one cake, X, can be divided by V without any volume left over, i.e., Vx mod V == 0.
I am looking for a solution with the best possible time complexity (brute force will do it, but I need a quicker way).
Any suggestion would be appreciated.
Thanks
PS: It is not an assignment, it is an interview question. Here is the pseudocode for brute force:
int return_Max_Volume(List volumeList)
{
    maxVolume = 0;
    minimaxLeft = Integer.MAX_VALUE;
    for (Volume v : volumeList)
        for i = 1 to K people
            targetVolume = v / i;
            numberOfPeopleFed = v1/targetVolume + v2/targetVolume + ... + vn/targetVolume
            if (numberOfPeopleFed >= K)
                remainVolume = (v1 mod targetVolume) + (v2 mod targetVolume) + ... + (vn mod targetVolume)
                if (remainVolume < minimaxLeft)
                    update maxVolume to be targetVolume;
                    update minimaxLeft to be remainVolume
    return maxVolume
}
This is a somewhat classic programming-contest problem.
The answer is simple: do a basic binary search on volume V (the final answer).
(Note the title says M people, yet the problem description says K. I'll be using M)
Given a volume V during the search, you iterate through all of the cakes, calculating how many people each cake can "feed" with single-flavor slices (fed += floor(Vi/V)). If you reach M (or 'K') people "fed" before you're out of cakes, this means you can obviously also feed M people with any volume < V with whole single-flavor slices, by simply consuming the same amount of (smaller) slices from each cake. If you run out of cakes before reaching M slices, it means you cannot feed the people with any volume > V either, as that would consume even more cake than what you've already failed with. This satisfies the condition for a binary search, which will lead you to the highest volume V of single-flavor slices that can be given to M people.
The complexity is O(n * log((sum(Vi)/m)/eps) ). Breakdown: the binary search takes log((sum(Vi)/m)/eps) iterations, considering the upper bound of sum(Vi)/m cake for each person (when all the cakes get consumed perfectly). At each iteration, you have to pass through at most all N cakes. eps is the precision of your search and should be set low enough, no higher than the minimum non-zero difference between the volume of two cakes, divided by M*2, so as to guarantee a correct answer. Usually you can just set it to an absolute precision such as 1e-6 or 1e-9.
To speed things up for the average case, you should sort the cakes in decreasing order, such that when you are trying a large volume, you instantly discard all the trailing cakes with total volume < V (e.g. you have one cake of volume 10^6 followed by a bunch of cakes of volume 1.0. If you're testing a slice volume of 2.0, as soon as you reach the first cake of volume 1.0 you can already return that this run failed to provide M slices)
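For reference, the feasibility test described above (the works(mid) call used in the edit below) might look like this C++ sketch, ignoring how the volumes and the target number of people get passed around:

#include <cmath>
#include <vector>

// Can we cut at least m single-flavour slices of volume V from these cakes?
bool works(const std::vector<double>& volumes, int m, double V) {
    long long fed = 0;
    for (double vi : volumes) {
        fed += static_cast<long long>(std::floor(vi / V));
        if (fed >= m) return true;     // early exit once everyone can be fed
    }
    return false;
}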
Edit:
The search is actually done with floating point numbers, e.g.:
double mid, lo = 0, hi = sum(Vi)/people;
while(hi - lo > eps){
mid = (lo+hi)/2;
if(works(mid)) lo = mid;
else hi = mid;
}
final_V = lo;
By the end, if you really need more precision than your chosen eps, you can simply take an extra O(n) step:
// (this step is exclusively to retrieve an exact answer from the final
// answer above, if a precision of 'eps' is not acceptable)
foreach (cake_volume vi){
int slices = round(vi/final_V);
double difference = abs(vi-(final_V*slices));
if(difference < best){
best = difference;
volume = vi;
denominator = slices;
}
}
// exact answer is volume/denominator
Here's the approach I would consider:
Let's assume that all of our cakes are sorted in the order of non-decreasing size, meaning that Vn is the largest cake and V1 is the smallest cake.
1. Generate the first intermediate solution by dividing only the largest cake between all k people. I.e. V = Vn / k.
2. Immediately discard all cakes that are smaller than V - any intermediate solution that involves these cakes is guaranteed to be worse than our intermediate solution from step 1. Now we are left with cakes Vb, ..., Vn, where b is greater or equal to 1.
3. If all cakes got discarded except the biggest one, then we are done. V is the solution. END.
4. Since we have more than one cake left, let's improve our intermediate solution by redistributing some of the slices to the second biggest cake V(n-1), i.e. find the biggest value of V so that floor(Vn / V) + floor(V(n-1) / V) = k. This can be done by performing a binary search between the current value of V and the upper limit (Vn + V(n-1)) / k, or by something more clever.
5. Again, just like we did on step 2, immediately discard all cakes that are smaller than V - any intermediate solution that involves these cakes is guaranteed to be worse than our intermediate solution from step 4.
6. If all cakes got discarded except the two biggest ones, then we are done. V is the solution. END.
7. Continue to involve the new "big" cakes in right-to-left direction, improve the intermediate solution, and continue to discard "small" cakes in left-to-right direction until all remaining cakes get used up.
P.S. The complexity of step 4 seems to be equivalent to the complexity of the entire problem, meaning that the above can be seen as an optimization approach, but not a real solution. Oh well, for what it is worth... :)
Here's one approach to a more efficient solution. Your brute force solution in essence generates an implicit list of possible volumes, filters them by feasibility, and returns the largest. We can modify it slightly to materialize the list and sort it so that the first feasible solution found is the largest.
First task for you: find a way to produce the sorted list on demand. In other words, we should do O(n + m log n) work to generate the first m items.
Now, let's assume that the volumes appearing in the list are pairwise distinct. (We can remove this assumption later.) There's an interesting fact about how many people are served by the volume at position k. For example, with volumes 11, 13, 17 and 7 people, the list is 17, 13, 11, 17/2, 13/2, 17/3, 11/2, 13/3, 17/4, 11/3, 17/5, 13/4, 17/6, 11/4, 13/5, 17/7, 11/5, 13/6, 13/7, 11/6, 11/7.
Second task for you: simulate the brute force algorithm on this list. Exploit what you notice.
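(One possible shape for the first task, sketched in C++: a max-heap lazily merging the n decreasing sequences Vi/1, Vi/2, ..., so the first m candidates cost about O(n + m log n). The example volumes above are used for illustration.)

#include <cstdio>
#include <queue>
#include <vector>

struct Cand { double volume; int cake; int parts; };
struct ByVolume { bool operator()(const Cand& a, const Cand& b) const { return a.volume < b.volume; } };

int main() {
    std::vector<double> V = {11, 13, 17};   // the example volumes above
    int M = 7;                              // people
    std::vector<Cand> initial;
    for (int i = 0; i < (int)V.size(); ++i) initial.push_back({V[i], i, 1});
    // Heapify is O(n); each subsequent pop/push is O(log n).
    std::priority_queue<Cand, std::vector<Cand>, ByVolume> heap(ByVolume{}, initial);
    while (!heap.empty()) {
        Cand c = heap.top(); heap.pop();    // next candidate volume, in decreasing order
        std::printf("%g = cake %d split %d ways\n", c.volume, c.cake, c.parts);
        if (c.parts < M)
            heap.push({V[c.cake] / (c.parts + 1), c.cake, c.parts + 1});
    }
}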
So here is the algorithm I thought would work:
1. Sort the volumes from largest to smallest.
2. Divide the largest cake among i people, i.e., target = volume[0]/i, for i = 1,2,3,...,k.
3. If the target would lead to a total number of pieces greater than k, decrease i and try again.
4. Find the number i that results in a total number of pieces greater than or equal to k while (i-1) leads to a total number of pieces less than k. Record this volume as baseVolume.
5. For each remaining cake, compute the fraction of its remaining volume divided by the number of people it serves, i.e., division = (V_cake - baseVolume*Math.floor(V_cake/baseVolume)) / Math.floor(V_cake/baseVolume), and take the smallest such value.
6. Add this amount to baseVolume (baseVolume += division) and recalculate the total number of pieces all volumes could provide. If the new volume results in fewer pieces, return the previous value; otherwise, go back to step 5.
Here is the Java code:
public static int getKonLagestCake(Integer[] sortedVolumesList, int k) {
int result = 0;
for (int i = k; i >= 1; i--) {
double volumeDividedByLargestCake = (double) sortedVolumesList[0]
/ i;
int totalNumber = totalNumberofCakeWithGivenVolumn(
sortedVolumesList, volumeDividedByLargestCake);
if (totalNumber < k) {
result = i + 1;
break;
}
}
return result;
}
public static int totalNumberofCakeWithGivenVolumn(
Integer[] sortedVolumnsList, double givenVolumn) {
int totalNumber = 0;
for (int volume : sortedVolumnsList) {
totalNumber += (int) (volume / givenVolumn);
}
return totalNumber;
}
public static double getMaxVolume(int[] volumesList, int k) {
List<Integer> list = new ArrayList<Integer>();
for (int i : volumesList) {
list.add(i);
}
Collections.sort(list, Collections.reverseOrder());
Integer[] sortedVolumesList = new Integer[list.size()];
list.toArray(sortedVolumesList);
int previousValidK = getKonLagestCake(sortedVolumesList, k);
double baseVolume = (double) sortedVolumesList[0] / (double) previousValidK;
int totalNumberofCakeAvailable = totalNumberofCakeWithGivenVolumn(sortedVolumesList, baseVolume);
if (totalNumberofCakeAvailable == k) {
return baseVolume;
} else {
do
{
double minimumAmountAdded = minimumAmountAdded(sortedVolumesList, baseVolume);
if(minimumAmountAdded == 0)
{
return baseVolume;
}else
{
baseVolume += minimumAmountAdded;
int newTotalNumber = totalNumberofCakeWithGivenVolumn(sortedVolumesList, baseVolume);
if(newTotalNumber == k)
{
return baseVolume;
}else if (newTotalNumber < k)
{
return (baseVolume - minimumAmountAdded);
}else
{
continue;
}
}
}while(true);
}
}
public static double minimumAmountAdded(Integer[] sortedVolumesList, double volume)
{
double mimumAdded = Double.MAX_VALUE;
for(Integer i:sortedVolumesList)
{
int assignedPeople = (int)(i/volume);
if (assignedPeople == 0)
{
continue;
}
double leftPiece = (double)i - assignedPeople*volume;
if(leftPiece == 0)
{
continue;
}
double division = leftPiece / (double)assignedPeople;
if (division < mimumAdded)
{
mimumAdded = division;
}
}
if (mimumAdded == Double.MAX_VALUE)
{
return 0;
}else
{
return mimumAdded;
}
}
Any Comments would be appreciated.
Thanks

Divvying people into rooms by last name?

I often teach large introductory programming classes (400 - 600 students) and when exam time comes around, we often have to split the class up into different rooms in order to make sure everyone has a seat for the exam.
To keep things logistically simple, I usually break the class apart by last name. For example, I might send students with last names A - H to one room, last name I - L to a second room, M - S to a third room, and T - Z to a fourth room.
The challenge in doing this is that the rooms often have wildly different capacities and it can be hard to find a way to segment the class in a way that causes everyone to fit. For example, suppose that the distribution of last names is (for simplicity) the following:
Last name starts with A: 25
Last name starts with B: 150
Last name starts with C: 200
Last name starts with D: 50
Suppose that I have rooms with capacities 350, 50, and 50. A greedy algorithm for finding a room assignment might be to sort the rooms into descending order of capacity, then try to fill in the rooms in that order. This, unfortunately, doesn't always work. For example, in this case, the right option is to put last name A in one room of size 50, last names B - C into the room of size 350, and last name D into another room of size 50. The greedy algorithm would put last names A and B into the 350-person room, then fail to find seats for everyone else.
It's easy to solve this problem by just trying all possible permutations of the room orderings and then running the greedy algorithm on each ordering. This will either find an assignment that works or report that none exists. However, I'm wondering if there is a more efficient way to do this, given that the number of rooms might be between 10 and 20 and checking all permutations might not be feasible.
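For concreteness, that permutations-plus-greedy baseline might be sketched like this in C++ (counts[i] is the number of students whose last name starts with the i-th letter; rooms holds the capacities):

#include <algorithm>
#include <vector>

// Greedily fill rooms, in the given order, with the next contiguous block of
// letters that still fits; succeeds if every letter gets seated.
bool tryOrder(const std::vector<int>& counts, const std::vector<int>& rooms,
              std::vector<int>& assignment /* letter -> position in this ordering */) {
    int letter = 0;
    for (int r = 0; r < (int)rooms.size() && letter < (int)counts.size(); ++r) {
        int remaining = rooms[r];
        while (letter < (int)counts.size() && counts[letter] <= remaining) {
            remaining -= counts[letter];
            assignment[letter++] = r;
        }
    }
    return letter == (int)counts.size();
}

// Try the greedy fill for every permutation of the rooms.
bool bruteForce(const std::vector<int>& counts, const std::vector<int>& rooms,
                std::vector<int>& assignment) {
    std::vector<int> order(rooms.size());
    for (int i = 0; i < (int)order.size(); ++i) order[i] = i;   // already sorted for next_permutation
    do {
        std::vector<int> permuted;
        for (int idx : order) permuted.push_back(rooms[idx]);
        assignment.assign(counts.size(), -1);
        if (tryOrder(counts, permuted, assignment)) {
            for (int& r : assignment) r = order[r];   // map back to original room indices
            return true;
        }
    } while (std::next_permutation(order.begin(), order.end()));
    return false;
}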
To summarize, the formal problem statement is the following:
You are given a frequency histogram of the last names of the students in a class, along with a list of rooms and their capacities. Your goal is to divvy up the students by the first letter of their last name so that each room is assigned a contiguous block of letters and does not exceed its capacity.
Is there an efficient algorithm for this, or at least one that is efficient for reasonable room sizes?
EDIT: Many people have asked about the contiguous condition. The rules are
Each room should be assigned at most a block of contiguous letters, and
No letter should be assigned to two or more rooms.
For example, you could not put A - E, H - N, and P - Z into the same room. You could also not put A - C in one room and B - D in another.
Thanks!
It can be solved using a DP over a state space of size [m, 2^n], where m is the number of letters (26 for English) and n is the number of rooms. With m == 26 and n == 20 it will take about 100 MB of space and ~1 sec of time.
Below is a solution I have just implemented in C# (it will compile as C++ or Java too with a few minor changes):
int[] GetAssignments(int[] studentsPerLetter, int[] rooms)
{
int numberOfRooms = rooms.Length;
int numberOfLetters = studentsPerLetter.Length;
int roomSets = 1 << numberOfRooms; // 2 ^ (number of rooms)
int[,] map = new int[numberOfLetters + 1, roomSets];
for (int i = 0; i <= numberOfLetters; i++)
for (int j = 0; j < roomSets; j++)
map[i, j] = -2;
map[0, 0] = -1; // starting condition
for (int i = 0; i < numberOfLetters; i++)
for (int j = 0; j < roomSets; j++)
if (map[i, j] > -2)
{
for (int k = 0; k < numberOfRooms; k++)
if ((j & (1 << k)) == 0)
{
// this room is empty yet.
int roomCapacity = rooms[k];
int t = i;
for (; t < numberOfLetters && roomCapacity >= studentsPerLetter[t]; t++)
roomCapacity -= studentsPerLetter[t];
// marking next state as good, also specifying index of just occupied room
// - it will help to construct solution backwards.
map[t, j | (1 << k)] = k;
}
}
// Constructing solution.
int[] res = new int[numberOfLetters];
int lastIndex = numberOfLetters - 1;
for (int j = 0; j < roomSets; j++)
{
int roomMask = j;
while (map[lastIndex + 1, roomMask] > -1)
{
int lastRoom = map[lastIndex + 1, roomMask];
int roomCapacity = rooms[lastRoom];
for (; lastIndex >= 0 && roomCapacity >= studentsPerLetter[lastIndex]; lastIndex--)
{
res[lastIndex] = lastRoom;
roomCapacity -= studentsPerLetter[lastIndex];
}
roomMask ^= 1 << lastRoom; // Remove last room from set.
j = roomSets; // Over outer loop.
}
}
return lastIndex > -1 ? null : res;
}
Example from OP question:
int[] studentsPerLetter = { 25, 150, 200, 50 };
int[] rooms = { 350, 50, 50 };
int[] ans = GetAssignments(studentsPerLetter, rooms);
Answer will be:
2
0
0
1
which indicates the index of the room for each last-name letter. If no assignment is possible, my solution returns null.
[Edit]
After thousands of auto-generated tests, my friend found a bug in the code that constructs the solution backwards. It does not affect the main algorithm, so fixing this bug is left as an exercise for the reader.
The test case that reveals the bug is students = [13,75,21,49,3,12,27,7] and rooms = [6,82,89,6,56]. My solution returns no answer, but actually there is one. Note that the first part of the solution works properly; only the answer-construction part fails.
This problem is NP-complete, and thus there is no known polynomial-time (i.e. efficient) solution for it (unless P = NP). You can reduce an instance of the knapsack or bin-packing problem to your problem to prove it is NP-complete.
To solve this you can use the 0-1 knapsack problem. Here is how:
First pick the biggest classroom and try to allocate as many groups of students as you can (using 0-1 knapsack), i.e. up to the size of the room. You are guaranteed not to split a group of students, as this is 0-1 knapsack. Once done, take the next biggest classroom and continue.
(You can use any known heuristic to solve the 0-1 knapsack problem.)
Here is the reduction --
You need to reduce a general instance of 0-1 knapsack to a specific instance of your problem.
So let's take a general instance of 0-1 knapsack: a sack whose capacity is W, and x_1, x_2, ... x_n groups with corresponding weights w_1, w_2, ... w_n.
Now the reduction --- this general instance is reduced to your problem as follows:
You have one classroom with seating capacity W. Each x_i (for i in 1..n) is the group of students whose last name begins with the i-th letter, and its size is w_i.
Now you can prove that if the 0-1 knapsack instance has a solution, your problem has a solution, and conversely; likewise, if the knapsack instance has no solution, your problem has none, and vice versa.
Please remember the important point about a reduction: it maps a general instance of a known NP-complete problem to a specific instance of your problem.
Hope this helps :)
Here is an approach that should work reasonably well, given common assumptions about the distribution of last names by initial. Fill the rooms from smallest capacity to largest as compactly as possible within the constraints, with no backtracking.
It seems reasonable (to me at least) for the largest room to be listed last, as being for "everyone else" not already listed.
Is there any reason to make life so complicated? Why can't you assign registration numbers to each student and then use those numbers to allocate them however you want? :) You do not need to write any code, the students are happy, everyone is happy.

Binary search is not efficient with traversal costs. What is?

Binary search let me down when I tried to apply it to the real world. The scenario is as follows.
I need to test the range of a device that communicates over radio. Communication needs to occur quickly, but slow transmission is tolerable, up to a point (say, about 3 minutes). I need to test whether transmissions will be successful every 200 feet until failure, up to 1600 feet. Every 200 feet a test will be run which requires 3 minutes to execute.
I naively assumed that a binary search would be the most efficient method of finding the failure point, but consider a travel speed of 200 ft/min and test time of 3 minutes. If failure to transmit occurs at 500 feet, binary search is not the most efficient means of finding the failure point, as shown below.
Simply walking along and testing every single point would have found the solution sooner, taking only 12 minutes, whereas binary search & testing would take 16 minutes.
My question: How do you calculate the most efficient path to the solution when traveling time matters? What is this called (e.g., binary-travel search, etc.)?
Binary search is indeed predicated on O(1) access times; there's little point binary searching a linked list, for example [but see Note 1], and that's essentially what you're doing, since you seem to be assuming that only discrete intervals are worth testing. If you were seeking a more accurate answer, you would find that the binary search allows an arbitrary precision, at the cost of one additional test per bit of precision.
Let's suppose you don't know even what the maximum value might be. Then you couldn't first test in the middle, since you wouldn't know where the middle was. Instead, you might do an exponential search for a limit (which is kind of a binary search inside out); you start by testing at x, then 2x, then 4x until you reach a point which is greater than the maximum (the signal doesn't reach that far). (x is the smallest answer you find interesting; in other words, if the first test at x shows the signal doesn't reach, you will then stop.) At the end of this phase, you'll be at 2^i * x, for some integer i, and you will know the answer is between 2^(i-1) * x and 2^i * x.
Now you can actually do the binary search, starting by going backwards by 2^(i-2) * x. From there, you might go either forwards or backwards, but you will definitely travel 2^(i-3) * x, and the next iteration you'll travel 2^(i-4) * x, and so on.
So in all, in the first phase (search for a maximum), you walked to 2^i * x, and did i tests. In the second phase, binary refinement, you walk a total of (2^(i-1) - 1) * x and do i-1 tests. You'll end up at some point d which is between 2^(i-1) * x and 2^i * x, so at worst you'll have walked 3 times the final point d (and at best, you'll have walked 3d/2). The number of tests you will have done will be 2*ceil(log2(d/x)) - 1, which is within one test of 2*log2(d/x).
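To make the bookkeeping concrete, here is a small C++ simulation of that two-phase strategy (a sketch; the monotone testAt model and the 500-foot example are taken from the question):

#include <cmath>
#include <cstdio>

// true = signal still reaches this distance (the break is further out)
bool testAt(double pos, double failAt) { return pos < failAt; }

// Exponential search for an upper bound, then binary refinement to precision x.
// Accumulates total feet walked and number of tests performed.
void run(double failAt, double x, double* walked, int* tests) {
    *walked = 0; *tests = 0;
    double pos = 0, probe = x;
    while (testAt(probe, failAt)) {                   // phase 1: double until a test fails
        *walked += probe - pos; pos = probe; ++*tests;
        probe *= 2;
    }
    *walked += probe - pos; pos = probe; ++*tests;    // the failing probe
    double lo = probe / 2, hi = probe;                // first failure lies in (lo, hi]
    while (hi - lo > x) {                             // phase 2: ordinary binary search
        double mid = (lo + hi) / 2;
        *walked += std::fabs(mid - pos); pos = mid; ++*tests;
        if (testAt(mid, failAt)) lo = mid; else hi = mid;
    }
}

int main() {
    double walked; int tests;
    run(500, 200, &walked, &tests);                   // the 500-foot example from the question
    std::printf("walked %.0f ft, %d tests\n", walked, tests);   // 1000 ft, 4 tests
}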
Under what circumstances should you do the binary search algorithm, then? Basically, it depends on the ratio of the travel time to the test time, and the desired precision of the answer. The simple sequential algorithm finds position d after d/x moves of size x and d/x tests; the binary search algorithm above finds position d after travelling at most 3d but doing only around 2 log(d/x) tests. Roughly speaking, if a test costs you more than twice the cost of travelling one precision step x, and the expected distance is sufficiently larger than the precision, you should prefer the binary search.
In your example, you appear to want the result with a precision of 200 feet; the travel time is 1 minute and the test time is 3 minutes, which is more than twice the travel time. So you should prefer the binary search, unless you expect that the answer will be found in a small number of multiples of the precision (as is the case). Note that although the binary algorithm uses four tests and 1000 feet of travel (compared with three tests and 600 feet for the sequential algorithm), improving the precision to 50 feet will only add four more tests and 150 feet of travel to the binary algorithm, while the sequential algorithm will require 20 tests.
Note 1: Actually, it might make sense to binary search a linked list, using precisely the above algorithm, if the cost of the test is high. Assuming the cost of the test is not proportional to the index in the list, the complexity of the search will be O(N) for both a linear search and the binary search, but the binary search will do O(log N) tests and O(N) steps, while the sequential search will do O(N) tests and O(N) steps. For large enough N this doesn't matter, but for real-world sized N it might matter a lot.
In reality, binary search can be applied here, but with several changes. We must compute not the center, but an optimal position to visit:
int length = maxUnchecked - minChecked;
whereToGo = minChecked + (int)(length * factorIncrease) + stepIncrease;
Because we need to find the first position where communication fails, we sometimes must go back; after that it can be optimal to use a different strategy:
int length = maxUnchecked - minChecked;
int whereToGo = 0;
if ( increase )
whereToGo = minChecked + (int)(length * factorIncrease) + stepIncrease;
else
whereToGo = minChecked + (int)(length * factorDecrease) + stepDecrease;
So our task is to figure out optimal values of factorIncrease, factorDecrease, stepIncrease and stepDecrease such that the sum of f(failPos) over all possible failure positions is minimal. How? Full brute force will do if n (total length / 200.0f) is small. Otherwise you can try genetic algorithms or something similarly simple.
Step precision = 1, step range [0, n).
Factor precision eps = 1/(4*n), factor range [0, 1).
Now, some simple C# code to demonstrate this:
class Program
{
static double factorIncrease;
static int stepIncrease;
static double factorDecrease;
static int stepDecrease;
static bool debug = false;
static int f(int lastPosition, int minChecked, int maxUnchecked, int last, int failPos, bool increase = true, int depth = 0)
{
if ( depth == 100 )
throw new Exception();
if ( maxUnchecked - minChecked <= 0 ) {
if ( debug )
Console.WriteLine("left: {0} right: {1}", minChecked, maxUnchecked);
return 0;
}
int length = maxUnchecked - minChecked;
int whereToGo = 0;
if ( increase )
whereToGo = minChecked + (int)(length * factorIncrease) + stepIncrease;
else
whereToGo = minChecked + (int)(length * factorDecrease) + stepDecrease;
if ( whereToGo <= minChecked )
whereToGo = minChecked + 1;
if ( whereToGo >= maxUnchecked )
whereToGo = maxUnchecked;
int cur = Math.Abs(whereToGo - lastPosition) + 3;
if ( debug ) {
Console.WriteLine("left: {2} right: {3} whereToGo:{0} cur: {1}", whereToGo, cur, minChecked, maxUnchecked);
}
if ( failPos == whereToGo || whereToGo == maxUnchecked )
return cur + f(whereToGo, minChecked, whereToGo - 1, last, failPos, true & increase, depth + 1);
else if ( failPos < whereToGo )
return cur + f(whereToGo, minChecked, whereToGo, last, failPos, true & increase, depth + 1);
else
return cur + f(whereToGo, whereToGo, maxUnchecked, last, failPos, false, depth + 1);
}
static void Main(string[] args)
{
int n = 20;
int minSum = int.MaxValue;
var minFactorIncrease = 0.0;
var minStepIncrease = 0;
var minFactorDecrease = 0.0;
var minStepDecrease = 0;
var eps = 1 / (4.00 * (double)n);
for ( factorDecrease = 0.0; factorDecrease < 1; factorDecrease += eps )
for ( stepDecrease = 0; stepDecrease < n; stepDecrease++ )
for ( factorIncrease = 0.0; factorIncrease < 1; factorIncrease += eps )
for ( stepIncrease = 0; stepIncrease < n; stepIncrease++ ) {
int cur = 0;
for ( int i = 0; i < n; i++ ) {
try {
cur += f(0, -1, n - 1, n - 1, i);
}
catch {
Console.WriteLine("fail {0} {1} {2} {3} {4}", factorIncrease, stepIncrease, factorDecrease, stepDecrease, i);
return;
}
}
if ( cur < minSum ) {
minSum = cur;
minFactorIncrease = factorIncrease;
minStepIncrease = stepIncrease;
minFactorDecrease = factorDecrease;
minStepDecrease = stepDecrease;
}
}
Console.WriteLine("best - mathmin={4}, f++:{0} s++:{1} f--:{2} s--:{3}", minFactorIncrease, minStepIncrease, minFactorDecrease, minStepDecrease, minSum);
factorIncrease = minFactorIncrease;
factorDecrease = minFactorDecrease;
stepIncrease = minStepIncrease;
stepDecrease = minStepDecrease;
//debug =true;
for ( int i = 0; i < n; i++ )
Console.WriteLine("{0} {1}", 3 + i * 4, f(0, -1, n - 1, n - 1, i));
debug = true;
Console.WriteLine(f(0, -1, n - 1, n - 1, n - 1));
}
}
So, some values (f++ = factorIncrease, s++ = stepIncrease, f-- = factorDecrease, s-- = stepDecrease):
n = 9: mathmin = 144, f++: 0.1(1) s++: 1 f--: 0.2(2) s--: 1
n = 20: mathmin = 562, f++: 0.1125 s++: 2 f--: 0.25 s--: 1
Depending on what you actually want to optimise, there may be a way to work out an optimum search pattern. I presume you don't want to optimise the worst case time, because the slowest case for many search strategies will be when the break is at the very end, and binary search is actually pretty good here - you walk to the end without changing direction, and you don't make very many stops.
You might consider different binary trees, and perhaps work out the average time taken to work your way down to a leaf. Binary search is one sort of tree, and so is walking along and testing as you go - a very unbalanced tree in which each node has at least one leaf attached to it.
When following along such a tree you always start at one end or another of the line you are walking along, walk some distance before making a measurement, and then, depending on the result and the tree, either stop or repeat the process with a shorter line, where you are at one end or another of it.
This gives you something you can attack using dynamic programming. Suppose you have solved the problem for lengths of up to N segments, so that you know the cost for the optimum solutions of these lengths. Now you can work out the optimum solution for N+1 segments. Consider breaking the N+1 segments into two pieces in the N+1 possible ways. For each such way, work out the cost of moving to its decision point and taking a measurement and then add on the cost of the best possible solutions for the two sections of segments on either side of the decision point, possibly weighted to account for the probability of ending up in those sections. By considering those N+1 possible ways, you can work out the best way of splitting up N+1 segments, and its cost, and continue until you work out a best solution for the number of sections you actually have.
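Here is a small C++ sketch of that dynamic program under some simplifying assumptions (the break is equally likely to be in any of the n segments and always lies in range; 'walk' is the time to cross one segment and 'test' the time per measurement; by left/right symmetry a single table indexed by interval length suffices, whichever end you start from):

#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    const int n = 8;              // e.g. 1600 ft at 200 ft per segment
    const double walk = 1.0;      // minutes to cross one segment
    const double test = 3.0;      // minutes per measurement
    // E[L] = expected time to pin down the break within L candidate segments,
    // starting at one end of the unresolved interval.
    std::vector<double> E(n + 1, 0.0);
    E[1] = 0.0;                   // one candidate left: we're done
    for (int L = 2; L <= n; ++L) {
        double best = 1e18;
        // Walk k segments into the interval, measure, then recurse on whichever
        // side the break turned out to be in (weighted by its probability).
        for (int k = 1; k < L; ++k) {
            double cost = k * walk + test
                        + (double)k / L * E[k]
                        + (double)(L - k) / L * E[L - k];
            best = std::min(best, cost);
        }
        E[L] = best;
    }
    for (int L = 2; L <= n; ++L)
        std::printf("E[%d] = %.2f min\n", L, E[L]);
}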

finding the position of a fraction in farey sequence

For finding the position of a fraction in the Farey sequence, I tried to implement the algorithm given here http://www.math.harvard.edu/~corina/publications/farey.pdf under "initial algorithm", but I can't understand where I'm going wrong; I am not getting the correct answers. Could someone please point out my mistake?
E.g. for order n = 7 and the fractions 1/7 and 1/6 I get the same answer.
Here's what I've tried for a given order n and a fraction a/b:
sum=0;
int A[100000];
A[1]=a;
for(i=2;i<=n;i++)
A[i]=i*a-a;
for(i=2;i<=n;i++)
{
for(j=i+i;j<=n;j+=i)
A[j]-=A[i];
}
for(i=1;i<=n;i++)
sum+=A[i];
ans = sum/b;
Thanks.
Your algorithm doesn't use any particular properties of a and b. In the first part, every relevant entry of the array A is a multiple of a, but the factor is independent of a, b and n. Setting up the array ignoring the factor a, i.e. starting with A[1] = 1, A[i] = i-1 for 2 <= i <= n, after the nested loops, the array contains the totients, i.e. A[i] = phi(i), no matter what a, b, n are. The sum of the totients from 1 to n is the number of elements of the Farey sequence of order n (plus or minus 1, depending on which of 0/1 and 1/1 are included in the definition you use). So your answer is always the approximation (a*number of terms)/b, which is close but not exact.
I've not yet looked at how yours relates to the algorithm in the paper, check back for updates later.
Addendum: I finally had time to look at the paper. Your initialisation is not what they give. In their algorithm, A[q] is initialised to floor(x*q); for a rational x = a/b, the correct initialisation is
for(i = 1; i <= n; ++i){
A[i] = (a*i)/b;
}
In the remainder of your code, only ans = sum/b; has to be changed to ans = sum;.
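Putting it together, a self-contained C++ version of the corrected algorithm might look like this sketch (positions are counted with 0/1 as position 0, and a/b is assumed to be in lowest terms with a < b):

#include <cstdio>
#include <vector>

// Position of a/b in the Farey sequence of order n.
long long fareyRank(long long a, long long b, int n) {
    std::vector<long long> A(n + 1);
    for (int q = 1; q <= n; ++q)
        A[q] = a * q / b;              // floor(a*q/b): counts all p/q <= a/b, reduced or not
    for (int i = 1; i <= n; ++i)       // sieve out the non-reduced fractions
        for (int j = 2 * i; j <= n; j += i)
            A[j] -= A[i];
    long long rank = 0;
    for (int q = 1; q <= n; ++q)
        rank += A[q];                  // A[q] now counts reduced fractions with denominator q
    return rank;
}

int main() {
    // The example from the question: order 7, 1/7 and 1/6 now get distinct positions.
    std::printf("%lld %lld\n", fareyRank(1, 7, 7), fareyRank(1, 6, 7));   // prints: 1 2
}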
A non-algorithmic way of finding the position t of a fraction in the Farey sequence of order n>1 is shown in Remark 7.10(ii)(a) of the paper, under m:=n-1, where mu-bar stands for the number-theoretic Mobius function on positive integers taking values from the set {-1,0,1}.
Here's my Java solution that works. Add head (0/1) and tail (1/1) nodes to a singly linked list.
Then start by passing headNode and tailNode, and setting the required orderLevel.
public void generateSequence(Node leftNode, Node rightNode){
Fraction left = (Fraction) leftNode.getData();
Fraction right= (Fraction) rightNode.getData();
FractionNode midNode = null;
int midNum = left.getNum()+ right.getNum();
int midDenom = left.getDenom()+ right.getDenom();
if((midDenom <=getMaxLevel())){
Fraction middle = new Fraction(midNum,midDenom);
midNode = new FractionNode(middle);
}
if(midNode!= null){
leftNode.setNext(midNode);
midNode.setNext(rightNode);
generateSequence(leftNode, midNode);
count++;
}else if(rightNode.next()!=null){
generateSequence(rightNode, rightNode.next());
}
}

Resources