Algorithmic strategy for selecting the minimum number of baskets

Example:
You have 4 baskets named P,Q,R,S.
You have 4 items in those baskets named A,B,C,D.
The composition of the baskets is as follows:
   A B C D
P  6 4 0 7
Q  6 4 1 1
R  4 6 3 6
S  4 6 2 3
Basket P has 6 A's, 4 B's, no C's and 7 D's.
Suppose you get the following request:
You have to give out 10A, 10B, 3C and 8D.
The minimum number of baskets required to process the request is 2 (P and R).
How can I solve this algorithmically? What algorithm should I use, and what should the strategy be?

Make a directed graph (network) like this:
The source has edges with cost=1 and capacity=bigvalue to the P,Q,R,S nodes.
P has edges with cost=0 and capacities 6,4,7 to A,B,D; same for the other baskets.
A,B,C,D have edges with cost=0 and capacities 10,10,3,8 to the sink.
Now solve the minimum-cost flow problem for a flow of 10+10+3+8.
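A sketch of this construction (assuming networkx is available; the node names and the "big" capacity are illustrative choices, not part of the answer). Baskets that end up with zero flow out of the source can be dropped from the solution:

import networkx as nx

BIG = 10**6  # stands in for "bigvalue"
baskets = {"P": {"A": 6, "B": 4, "C": 0, "D": 7},
           "Q": {"A": 6, "B": 4, "C": 1, "D": 1},
           "R": {"A": 4, "B": 6, "C": 3, "D": 6},
           "S": {"A": 4, "B": 6, "C": 2, "D": 3}}
request = {"A": 10, "B": 10, "C": 3, "D": 8}

G = nx.DiGraph()
for basket, items in baskets.items():
    G.add_edge("source", basket, capacity=BIG, weight=1)
    for item, count in items.items():
        if count > 0:
            G.add_edge(basket, item, capacity=count, weight=0)
for item, need in request.items():
    G.add_edge(item, "sink", capacity=need, weight=0)

flow = nx.max_flow_min_cost(G, "source", "sink")
print({b: flow["source"][b] for b in baskets})  # how much each basket supplies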

There is a classic algorithm about placing queens on a chessboard so that they don't threaten each other. Your problem looks like that one to me. You can create a recursive structure like below:
Find the first rows that meet the requirement for the first column: in your example P and Q (because 6+6 > 10).
So you have handled the first column; then go to the second one and check whether the capacities of baskets P and Q meet its requirement. They don't in your case (because 4+4 < 10).
Here, go back to the first step (call the same recursive function for the first column, advancing the pointer that was pointing at Q before) and find the next rows that meet the requirement: P and R in your example (6+4 = 10). Then do the second step for P and R.
So the idea is: for every column, find the baskets that meet the requirement, then go to the next column. If you can find rows that meet the requirement there, move on to the third. If you cannot find any at the third step, go back to the second step; and if no combination of the rows you chose at the second step meets the requirements, go back to the first and iterate again.
I couldn't give you proper pseudocode, but I think the main idea is clear and not that hard to implement.
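The same search space can also be rendered compactly as brute force over subsets in increasing size (my sketch, equivalent in effect to the recursive column-by-column search described above, just without the early pruning):

from itertools import combinations

baskets = {"P": [6, 4, 0, 7], "Q": [6, 4, 1, 1],
           "R": [4, 6, 3, 6], "S": [4, 6, 2, 3]}
request = [10, 10, 3, 8]

def min_baskets(baskets, request):
    names = list(baskets)
    for size in range(1, len(names) + 1):        # smallest subsets first
        for combo in combinations(names, size):
            totals = [sum(baskets[n][i] for n in combo)
                      for i in range(len(request))]
            if all(t >= r for t, r in zip(totals, request)):
                return combo                     # first hit is minimal
    return None

print(min_baskets(baskets, request))  # ('P', 'R')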

What is the most efficient way to sort a stack using a limited set of instructions?

I know an almost identical question has already been asked here, but I do not find the provided answers very helpful since the goal of the exercise was not clearly stated in the OP.
I have designed a simple algorithm to solve the exercise described below, but I would like help to improve it or to design a more efficient one.
Exercise
Given a stack A filled with n random integers (positive and/or negative) with no duplicates, an empty stack B and the eleven instructions listed below, print to the screen the shortest list made out of those instructions only such that when all the instructions are followed in order, A is sorted (the smallest number must be on top of the stack).
sa : swap a - swap the first 2 elements at the top of stack a.
sb : swap b - swap the first 2 elements at the top of stack b.
ss : sa and sb at the same time.
pa : push a - take the first element at the top of b and put it at the top of a.
pb : push b - take the first element at the top of a and put it at the top of b.
ra : rotate a - shift up all elements of stack a by 1. The first element becomes the last one.
rb : rotate b - shift up all elements of stack b by 1. The first element becomes the last one.
rr : ra and rb at the same time.
rra : reverse rotate a - shift down all elements of stack a by 1. The last element becomes the first one.
rrb : reverse rotate b - shift down all elements of stack b by 1. The last element becomes the first one.
rrr : rra and rrb at the same time.
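For reference, here is a minimal simulator for these instructions (my sketch, not part of the exercise; stacks are modeled as deques with index 0 as the top, and an operation on a too-short stack is assumed to be a no-op):

from collections import deque

def apply_op(op, a, b):
    if op in ("sa", "ss") and len(a) > 1:
        a[0], a[1] = a[1], a[0]
    if op in ("sb", "ss") and len(b) > 1:
        b[0], b[1] = b[1], b[0]
    if op == "pa" and b:
        a.appendleft(b.popleft())
    if op == "pb" and a:
        b.appendleft(a.popleft())
    if op in ("ra", "rr") and a:
        a.rotate(-1)    # first element becomes the last
    if op in ("rb", "rr") and b:
        b.rotate(-1)
    if op in ("rra", "rrr") and a:
        a.rotate(1)     # last element becomes the first
    if op in ("rrb", "rrr") and b:
        b.rotate(1)

a, b = deque([2, 1, 3]), deque()
apply_op("sa", a, b)
print(list(a))  # [1, 2, 3] -- A is sorted, smallest on top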
The goal of the exercise is to find the shortest list of stack instructions such that, when followed, A is sorted. What matters most is the size of the list, not the complexity of the algorithm we use to find such a list.
Algorithm
For now I have implemented this very simple algorithm:
Gather all the numbers in an array and sort it so that the smallest number is at index 0.
Take the first number in the sorted array; we'll call it x. We need to move x to the top of the stack and then push it to B, so:
If x is in second position, swap.
If x is closer to the top of the stack, rotate until x is on top.
If x is closer to the bottom of the stack, reverse rotate until x is on top.
After each operation, check if the stack is sorted.
If it is not, push the first element of the stack onto B, take the next element in the array, and repeat.
When only two elements are left in A, check if they are ordered; if not, swap them.
Push all the elements from B back onto A.
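A self-contained sketch of this baseline (my rendering, with illustrative names; the "check if sorted" early exit is omitted for brevity, and stacks are deques with index 0 as the top):

from collections import deque

def baseline_sort(values):
    a, b, ops = deque(values), deque(), []
    for x in sorted(values)[:-2]:        # all but the two largest
        i = a.index(x)
        if i == 1:                       # second from top: swap
            a[0], a[1] = a[1], a[0]
            ops.append("sa")
        elif i <= len(a) // 2:           # closer to the top: rotate
            for _ in range(i):
                a.rotate(-1)
                ops.append("ra")
        else:                            # closer to the bottom: reverse rotate
            for _ in range(len(a) - i):
                a.rotate(1)
                ops.append("rra")
        b.appendleft(a.popleft())        # push x onto B
        ops.append("pb")
    if len(a) > 1 and a[0] > a[1]:       # order the last two left in A
        a[0], a[1] = a[1], a[0]
        ops.append("sa")
    while b:                             # push everything back onto A
        a.appendleft(b.popleft())
        ops.append("pa")
    return ops

ops = baseline_sort([5, 2, 8, 1, 9, 3, 7, 4, 6, 0])
print(len(ops), "instructions")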
This algorithm works pretty well when n is small but takes way too long when n gets large. On average I get :
30 instructions for n = 10.
2500 instructions for n = 100.
60000 instructions for n = 500.
250000 instructions for n = 10000.
I would like to go below 5000 steps for n = 500 and below 500 steps for n = 100.
This is a variation on https://stackoverflow.com/a/38165541/585411 which you already rejected. But hopefully you'll understand my explanation of how to do a bottom up mergesort better.
A run is a group of numbers in sorted order. At first you have many runs, presumably most are of small length. You're done when you have one run, in stack A.
To start, keep rotating A backwards while the bottom element is <= the top. This will position the start of a run at the top of A.
Next, we need to split the runs evenly between A and B. The way we do it is go through A once, looking for runs. The first run goes at the bottom of A, the second run goes at the bottom of B, and so on. (Placing at the bottom of A just needs ra until the run is done. Placing at the bottom of B means pb then rb.)
Once we've split the runs, we either just placed a run at the bottom of A and A has one more run than B, or we just placed a run at the bottom of B and they have the same number of runs.
Now start merging runs, while you continue switching between A and B. Every time you merge, if you merged to A then A wound up with one more run. If you merged to B you have the same number of runs.
Merging a run to B looks like:
if top of A < top of B:
    pb
    rb
while bottom of B <= top of B:
    if top of A < top of B:
        pb
    rb
while bottom of B <= top of A:
    pb
    rb
Merging a run to A is similar, just reversing the roles of the stacks.
Continue until B is empty. At that point B has 0 runs, while A has one. Which means that A is sorted.
This algorithm will take O(n log(n)) comparisons.
The problem has changed a lot since I first answered, so here are ideas for optimizations.
First, when splitting, we can do better than just dealing runs to A and B. Specifically we can put rising runs at the bottom of A, and push falling runs onto B (which leaves them rising). With an occasional sa to make the runs longer. These operations can be interleaved, so, for instance, we can deal out 5 2 3 1 4 with pb ra ra pb ra and then merge them with pa ra ra ra pa ra thereby sorting it with 11 operations. (This is probably not optimal, but it gives you the idea.) If you're clever about this you can probably start with an average run length in both piles of around 4 (and maybe much better). And during the splitting process you can do a lookahead of several instructions to figure out how to efficiently wind up with longer runs. (If you have 500 elements in runs of 4 that's 125 runs. The merge sort pass now should be able to finish in 7 passes.)
Are we done finding potential optimizations? Of course not.
When we start the merge passes, we now have uneven numbers of runs, and uneven numbers of elements. We are going to merge pairs of runs, place them somewhere, merge pairs again, place them somewhere, etc. After the pass is done, we'd like two things to be true:
The average length of run in both stacks should be about the same (merging runs of similar lengths is more efficient).
We want to have used as few operations as possible. Since merging n into m takes 2n+m operations, it matters where we put the merge.
We can solve for both constraints by using dynamic programming. We do that by constructing a data structure with the following information:
by the number of merged runs created:
    by the number of runs put in `A`:
        by the number of elements put in `A`:
            minimal number of required steps
            last stack merged into
We can then look through the part with the largest number of runs created, and figure out what makes the average run size as close as possible. And then walk back to figure out which sequence of merges got there in the minimum number of steps. And then we can work out what sequence of steps we took, and where we wound up.
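As a sketch only, one possible layout of that table in code (all names and the example numbers are my assumptions, not the author's; `merge cost` stands in for the 2n+m operation count of a concrete merge):

from collections import defaultdict

# dp[runs_created][(runs_in_A, elems_in_A)] = (min_steps, last_stack)
dp = defaultdict(dict)
dp[0][(3, 40)] = (0, None)  # hypothetical start: 3 runs / 40 elements in A

def record(runs_created, runs_in_a, elems_in_a, steps, last_stack):
    # keep only the cheapest known way to reach a given cell
    cell = dp[runs_created].get((runs_in_a, elems_in_a))
    if cell is None or steps < cell[0]:
        dp[runs_created][(runs_in_a, elems_in_a)] = (steps, last_stack)

record(1, 2, 25, 65, "A")  # e.g. a merge into A whose 2n+m cost was 65
print(dp[1])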
When you put all of this together, I'm dubious that you'll be able to sort 500 elements in only 5000 steps. But I'd be surprised if you can't get it below 6000 on average.
And once you have all that, you can start to look for better optimizations still. ("We don't care how much analysis is required to produce the moves" is an invitation to spend unlimited energy optimizing.)
The question needs to be edited. The exercise is called "push swap", a project for students at school 42 (non-accredited school). A second part of the project is called "checker" which verifies the results of "push swap".
Here is a link that describes the push swap challenge/project. Spoiler alert: it also includes the author's approach for 100 and 500 numbers, so you may want to stop reading after the 3 and 5 number examples.
https://medium.com/@jamierobertdawson/push-swap-the-least-amount-of-moves-with-two-stacks-d1e76a71789a
The term stack is incorrectly used to describe the containers a and b (possibly a French to English translation issue), as swap and rotate are not native stack operations. A common implementation uses a circular doubly-linked list.
push swap: input a set of integers to be placed in a, and generate a list of the 11 operations that results in a being sorted. Other variables and arrays can be used to generate the list of operations. Some sites mention no duplicates in the set of integers. If there are duplicates, I would assume a stable sort, where duplicates are to be kept in their original order. Otherwise, if going for an optimal solution, all permutations of duplicates would need to be tested.
checker: verify the results of push swap.
Both programs need to validate input as well as produce results.
One web site lists how push swap is scored.
required: sort 3 numbers with <= 3 operations
required: sort 5 numbers with <= 12 operations
scored: sort 100 numbers with <= 700 operations (max score), 900, 1100, 1300, or 1500 operations (min score)
scored: sort 500 numbers with <= 5500 operations (max score), 7000, 8500, 10000, or 11500 operations (min score)
The following gives an idea of what is allowed in an algorithm to generate a list of operations. The first step is to convert the values in a into ranks (the values are never used again). In the case of duplicates, use the order of the duplicates when converting to ranks (stable sort), so there are no duplicate ranks. The values of the ranks are where the ranks belong in a sorted array:
for (i = 0; i < n; i++)
    sorted[rank[i]] = rank[i];
For example, the values {-2 3 11 9 -5} would be converted to {1 2 4 3 0}: -2 belongs at sorted[1], 3 at sorted[2], ..., -5 at sorted[0]. For a stable sort where duplicates are allowed, the values {7 5 5 1 5} would be converted to {4 1 2 0 3}.
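A sketch of one way to compute such ranks (stable, so duplicates keep their order, since Python's sort is stable): sort the indices by value, then invert that permutation.

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for r, i in enumerate(order):
        rank[i] = r
    return rank

print(ranks([-2, 3, 11, 9, -5]))  # [1, 2, 4, 3, 0]
print(ranks([7, 5, 5, 1, 5]))     # [4, 1, 2, 0, 3]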
If a has 3 ranks, then there are 6 permutations of the ranks, and a maximum of 2 operations are needed to sort a:
{0 1 2} : already sorted
{0 2 1} : sa ra
{1 0 2} : sa
{1 2 0} : rra
{2 0 1} : ra
{2 1 0} : sa rra
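In code, that table is just a small lookup from the top-to-bottom permutation to the operation list (a sketch; `SORT3` is an illustrative name):

SORT3 = {
    (0, 1, 2): [],
    (0, 2, 1): ["sa", "ra"],
    (1, 0, 2): ["sa"],
    (1, 2, 0): ["rra"],
    (2, 0, 1): ["ra"],
    (2, 1, 0): ["sa", "rra"],
}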
For 5 ranks, 2 can be moved to b using 2 operations, and the 3 left in a sorted with a max of 2 operations, leaving at least 8 operations to insert the 2 ranks from b into a, ending up with a sorted a. There are only 20 possible ways to move 2 ranks from b into a, small enough to create a table of 20 optimized sets of operations.
For 100 and 500 numbers, there are various strategies.
Spoiler:
YouTube video that shows 510 operations for n=100 and 3750 operations for n=500.
https://www.youtube.com/watch?v=2aMrmWOgLvU
Description converted to English:
Initial stage:
- Parse parameters.
- Create a stack A which is a circular doubly-linked list (last.next = first; first.prec = last).
- Add a rank component to the struct, an integer from 1 to n. This will be much more practical later.
Phase 1:
- Split the list into 3 (modifiable parameter in the .h).
- Push the 2 smallest thirds into stack B and do a pre-sort; do ra with the others.
- Repeat the operation until there are only 3 numbers left in stack A.
- Sort these 3 numbers with a specific algorithm (2 operations maximum).
Phase 2:
(Only the ra/rra/rb/rrb commands are used. sa and sb are not used in this phase.)
- Sweep B and look for the number that will take the fewest moves to be pushed into A. There are each time 4 ways to bring a number from B to A: ra+rb, ra+rrb, rra+rb, rra+rrb. We look for the minimum among these 4 ways.
- Then perform the operation.
- Repeat the operation until B is empty.
Phase 3: If necessary, rotate stack A to finalize the correct order. The shorter of ra or rra.
The optimization comes from making maximum use of the double rotations rr and rrr.
Explanation:
Replace all values in a by rank.
For n = 100, a 3 way split is done:
ranks 0 to 32 are moved to the bottom of b,
ranks 33 to 65 are moved to the top of b,
leaving ranks 66 to 99 in a.
I'm not sure what is meant by "pre-sort" (top | bottom split in b?).
Ranks 66 to 99 in a are sorted, using b as needed.
Ranks from b are then inserted into a using fewest rotates.
For n = 500, a 7 way split is done:
Ranks 0 to 71 moved to bottom of b, 72 to 142 to top of b, which
will end up in the middle of b after other ranks moved to b.
Ranks 143 to 214 to bottom of b, 215 to 285 to top of b.
Ranks 286 to 357 to bottom of b, 358 to 428 to top of b.
Leaving ranks 429 to 499 in a.
The largest ranks in b are at the outer edges, smallest in the middle,
since the larger ranks are moved into sorted a before smaller ranks.
Ranks in a are sorted, then ranks from b moved into a using fewest rotates.

Algorithm to find largest identical-row square in matrix

I have a matrix of 100x100 size and need to find the largest set of rows and columns that create a square having equal rows. Example:
  A B C D E F         C D E
a 0 1 2 3 4 5       a 2 3 4
b 2 9 7 9 8 2
c 9 0 6 8 9 7  ==>  d 2 3 4
d 8 9 2 3 4 8       e 2 3 4
e 7 2 2 3 4 5
f 0 3 6 8 7 2
Currently I am using this algorithm:
candidates = []  // element type is {rows, cols}
foreach row:
    foreach col:
        candidates.add({rows: [row], cols: [col]})
do:
    retval = candidates.first
    newCandidates = []
    foreach candidate in candidates:
        foreach newRow > candidate.rows.max:
            foreach newCol > candidate.cols.max:
                // compare matrix cells in candidate to newRow and newCol
                if newCandidateHasEqualRows:
                    newCandidates.add({candidate.rows + newRow, candidate.cols + newCol})
    candidates = newCandidates
while candidates.count > 0
return retval
Has anyone else come across a problem similar to this? And is there a better algorithm to solve it?
Here's the NP-hardness reduction I mentioned, from biclique. Given a bipartite graph, make a matrix with a row for each vertex in part A and a column for each vertex in part B. For every edge that is present, put a 0 in the corresponding matrix entry. Put a unique positive integer for each other matrix entry. For all s > 1, there is a Ks,s subgraph if and only if there is a square of size s (which necessarily is all zero).
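Sketched in code (my construction following that description; names are illustrative):

def biclique_to_matrix(part_a, part_b, edges):
    # rows = part A vertices, columns = part B vertices;
    # edges become 0 entries, non-edges get distinct positive integers
    counter = 1
    matrix = []
    for u in part_a:
        row = []
        for v in part_b:
            if (u, v) in edges:
                row.append(0)
            else:
                row.append(counter)
                counter += 1
        matrix.append(row)
    return matrix

print(biclique_to_matrix(["u1", "u2"], ["v1", "v2"],
                         {("u1", "v1"), ("u1", "v2"), ("u2", "v1")}))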
Given a fixed set of rows, the optimal set of columns is easily determined. You could try the a priori algorithm on sets of rows, where a set of rows is considered frequent iff there exist as many columns that, together with the rows, form a valid square.
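For a fixed row set, that column computation is just (a sketch; `optimal_columns` is an illustrative name):

def optimal_columns(matrix, rows):
    # the best columns are exactly those on which all chosen rows agree
    r0 = rows[0]
    return [c for c in range(len(matrix[0]))
            if all(matrix[r][c] == matrix[r0][c] for r in rows)]

M = [[0, 1, 2, 3, 4, 5],
     [2, 9, 7, 9, 8, 2],
     [9, 0, 6, 8, 9, 7],
     [8, 9, 2, 3, 4, 8],
     [7, 2, 2, 3, 4, 5],
     [0, 3, 6, 8, 7, 2]]
print(optimal_columns(M, [0, 3, 4]))  # [2, 3, 4] -> columns C, D, E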
I've implemented a branch and bound solver for this problem in C++ at http://pastebin.com/J1ipWs5b. To my surprise, it actually solves randomly-generated puzzles of size up to 100x100 quite quickly: on one problem with each matrix cell chosen randomly from 0-9, an optimal 4x4 solution is found in about 750ms on my old laptop. As the range of cell entries is reduced down to just 0-1, the solution times get drastically longer -- but still, at 157s (for the one problem I tried, which had an 8x8 optimal solution), this isn't terrible. It seems to be very sensitive to the size of the optimal solution.
At any point in time, we have a partial solution consisting of a set of rows that are definitely included, and a set of rows that are definitely excluded. (The inclusion status of the remaining rows is yet to be determined.) First, we pick a remaining row to "try". We try including the row; then (if necessary; see below) we try excluding it. "Trying" here means recursively solving the corresponding subproblem. We record the set of columns that are identical across all rows that are definitely included in the solution. As rows are added to the partial solution, this set of columns can only shrink.
There are a couple of improvements beyond the standard B&B idea of pruning the search when we determine that we can't develop the current partial solution into a better (i.e. larger) complete solution than some complete solution we have already found:
A dominance rule. If there are any rows that can be added to the current partial solution without shrinking the set of identical columns at all, then we can safely add them immediately, and we never have to consider not adding them. This potentially saves a lot of branching, especially if there are many similar rows in the input.
We can reorder the remaining (not definitely included or definitely excluded) rows arbitrarily. So in particular, we can always pick as the next row to consider the row that would most shrink the set of identical columns: this (perhaps counterintuitive) strategy has the effect of eliminating bad combinations of rows near the top of the search tree, which speeds up the search a lot. It also happens to complement the dominance rule above, because it means that if there are ever two rows X and Y such that X preserves a strict subset of the identical columns that Y preserves, then X will be added to the solution first, which in turn means that whenever X is included, Y will be forced in by the dominance rule and we don't need to consider the possibility of including X but excluding Y.

fully connection algorithm

I have encountered an algorithm question:
Fully Connection
Given n cities which spreads along a line, let Xi be the position of city i and Pi be its population.
Now we begin to lay cables between every two of the cities based on their distance and population. Given two cities i and j, the cost to lay cable between them is |Xi-Xj|*max(Pi,Pj). How much does it cost to lay all the cables?
For example, given:
i Xi Pi
- -- --
1 1 4
2 2 5
3 3 6
Then the total cost can be calculated as:
i j |Xi-Xj| max(Pi, Pj) Segment Cost
- - ------ ----------- ------------
1 2 1 5 5
2 3 1 6 6
1 3 2 6 12
So that the total cost is 5+6+12 = 23.
While this can clearly be done in O(n^2) time, can it be done in asymptotically less time?
I can think of a faster solution. If I am not wrong it is O(n log n). First sort all the cities by Pi; this is O(n log n). Then process the cities in increasing order of Pi, the reason being that you then always know max(Pi, Pj) = Pi. We only add the segments that pair Pi with already-processed cities; pairs involving larger populations are counted when those cities are processed.
The idea I came up with is to use two binary index trees to reduce the complexity of the algorithm. The first index tree counts nodes and can answer queries of the kind: how many nodes are to the right of Xi, in logarithmic time. Let's call this number NR. The second index tree can answer queries of the kind: what is the sum of distances from all the points to the right of a given x. The distances are counted towards a fixed point XR that is guaranteed to be to the right of the rightmost point. Let's call this number SUMD. Then the sum of the distances to all points to the right of our point is NR * dist(Xi, XR) - SUMD, and these points contribute (NR * dist(Xi, XR) - SUMD) * Pi to the result. Do the same for the points to the left and you get the answer. After you process the i-th point, add it to the index trees and go on.
Edit: Here is one article about binary indexed trees: http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees
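Here is a sketch of that approach (my code, not the answerer's; it folds the left and right cases into one pass using two Fenwick trees over compressed coordinates, which is equivalent to measuring distances against a fixed XR):

import bisect

def total_cost(cities):  # cities: list of (x, p) pairs
    xs = sorted(x for x, _ in cities)
    n = len(xs)
    cnt, sx = [0] * (n + 1), [0] * (n + 1)  # Fenwick trees: counts, x-sums

    def update(tree, i, v):
        i += 1
        while i <= n:
            tree[i] += v
            i += i & -i

    def query(tree, i):  # prefix sum over compressed positions 0..i
        i += 1
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    total = 0
    for x, p in sorted(cities, key=lambda c: c[1]):  # increasing population
        i = bisect.bisect_left(xs, x)
        cl, sl = query(cnt, i), query(sx, i)   # processed cities at/left of x
        cr, sr = query(cnt, n - 1) - cl, query(sx, n - 1) - sl  # and right
        total += p * ((cl * x - sl) + (sr - cr * x))  # ties cost 0 either way
        update(cnt, i, 1)
        update(sx, i, x)
    return total

print(total_cost([(1, 4), (2, 5), (3, 6)]))  # 23, matching the example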
This is the direct connections problem from codesprint 2.
They will be posting worked solutions to all problems within a week on their website.
(They have said "Now that the contest is over, we're totally cool with everyone discussing their solutions to the problems.")

Eliminating symmetry from graphs

I have an algorithmic problem in which I have derived a transfer matrix between a lot of states. The next step is to exponentiate it, but it is very large, so I need to do some reductions on it. Specifically it contains a lot of symmetry. Below are some examples on how many nodes can be eliminated by simple observations.
My question is whether there is an algorithm to efficiently eliminate symmetry in digraphs, similarly to the way I've done it manually below.
In all cases the initial vector has the same value for all nodes.
In the first example we see that b, c, d and e all receive values from a and one of each other. Hence they will always contain an identical value, and we can merge them.
In this example we quickly spot that the graph is identical from the point of view of a, b, c and d. Also, for their respective side nodes, it doesn't matter to which inner node each is attached. Hence we can reduce the graph down to only two states.
Update: Some people were, reasonably enough, not quite sure what was meant by "state transfer matrix". The idea here is that you can split a combinatorial problem up into a number of state types for each n in your recurrence. The matrix then tells you how to get from n-1 to n.
Usually you are only interested about the value of one of your states, but you need to calculate the others as well, so you can always get to the next level. In some cases however, multiple states are symmetrical, meaning they will always have the same value. Obviously it's quite a waste to calculate all of these, so we want to reduce the graph until all nodes are "unique".
Below is an example of the transfer matrix for the reduced graph in example 1.
[S_a(n)]   [1 1 1]   [S_a(n-1)]
[S_f(n)] = [1 0 0] * [S_f(n-1)]
[S_B(n)]   [4 0 1]   [S_B(n-1)]
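For instance, the reduced system can be iterated with a matrix power (a sketch assuming numpy; the all-ones start vector reflects "the initial vector has the same value for all nodes"):

import numpy as np

T = np.array([[1, 1, 1],
              [1, 0, 0],
              [4, 0, 1]])
v0 = np.array([1, 1, 1])  # same initial value for all (merged) nodes
n = 10
print(np.linalg.matrix_power(T, n) @ v0)  # [S_a(n), S_f(n), S_B(n)]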
Any suggestions or references to papers are appreciated.
Brendan McKay's nauty ( http://cs.anu.edu.au/~bdm/nauty/) is the best tool I know of for computing automorphisms of graphs. It may be too expensive to compute the whole automorphism group of your graph, but you might be able to reuse some of the algorithms described in McKay's paper "Practical Graph Isomorphism" (linked from the nauty page).
I'll just add an extra answer building on what userOVER9000 suggested, in case anybody else is interested.
The below is an example of using nauty on Example 2, through the dreadnaut tool.
$ ./dreadnaut
Dreadnaut version 2.4 (64 bits).
> n=8 d g -- Starting a new 8-node digraph
0 : 1 3 4; -- Entering edge data
1 : 0 2 5;
2 : 3 1 6;
3 : 0 2 7;
4 : 0;
5 : 1;
6 : 2;
7 : 3;
> cx -- Calling nauty
(1 3)(5 7)
level 2: 6 orbits; 5 fixed; index 2
(0 1)(2 3)(4 5)(6 7)
level 1: 2 orbits; 4 fixed; index 4
2 orbits; grpsize=8; 2 gens; 6 nodes; maxlev=3
tctotal=8; canupdates=1; cpu time = 0.00 seconds
> o -- Output "orbits"
0:3; 4:7;
Notice it suggests joining nodes 0:3 which are a:d in Example 2 and 4:7 which are e:h.
The nauty algorithm is not well documented, but the authors describe it as exponential worst case, n^2 average.
Computing symmetries seems to be a bit of a second order problem. Taking just a,b,c and d in your second graph, the symmetry would have to be expressed
a(b,c,d) = b(a,d,c)
and all its permutations, or some such. Consider a second subgraph a', b', c', d' added to it. Again, we have the symmetries, but parameterised differently.
For computing people (rather than math people), could we express the problem like so?
Each graph node contains a set of letters. At each iteration, all of the letters in each node are copied to its neighbours by the arrows (some arrows take more than one iteration and can be treated as a pipe of anonymous nodes).
We are trying to find efficient ways of determining things such as:
* what letters each set/node contains after N iterations
* for each node, the N after which its set no longer changes
* which sets of nodes wind up containing the same sets of letters (equivalence classes)
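A direct simulation of that formulation (my sketch; the edge list is an assumed reading of example 1, where b, c, d and e receive from a and from one another in a cycle):

def propagate(edges, nodes, iterations):
    sets = {v: {v} for v in nodes}      # every node starts with its own letter
    for _ in range(iterations):
        new = {v: set(s) for v, s in sets.items()}
        for u, v in edges:
            new[v] |= sets[u]           # copy letters along each arrow
        sets = new
    return sets

edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"),
         ("b", "c"), ("c", "d"), ("d", "e"), ("e", "b")]
sets = propagate(edges, "abcde", 4)
groups = {}
for v, s in sets.items():
    groups.setdefault(frozenset(s), set()).add(v)
print(groups.values())  # b, c, d, e end up with identical letter sets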

calendar scheduler algorithm

I'm looking for an algorithm that, given a set of items containing a start time, end time, type, and id, returns the set of all sets of items that fit together (no overlapping times, and all types represented in the set).
S = [("8:00AM", "9:00AM", "Breakfast With Mindy", 234),
("11:40AM", "12:40PM", "Go to Gym", 219),
("12:00PM", "1:00PM", "Lunch With Steve", 079),
("12:40PM", "1:20PM", "Lunch With Steve", 189)]
Algorithm(S) => [[("8:00AM", "9:00AM", "Breakfast With Mindy", 234),
("11:40AM", "12:40PM", "Go to Gym", 219),
("12:40PM", "1:20PM", "Lunch With Steve", 189)]]
Thanks!
This can be solved using graph theory. I would create an array which contains the items sorted by start time, and by end time for equal start times (I added some more items to the example):
no.: id: [ start - end ] type
---------------------------------------------------------
0: 234: [08:00AM - 09:00AM] Breakfast With Mindy
1: 400: [09:00AM - 07:00PM] Check out stackoverflow.com
2: 219: [11:40AM - 12:40PM] Go to Gym
3: 79: [12:00PM - 01:00PM] Lunch With Steve
4: 189: [12:40PM - 01:20PM] Lunch With Steve
5: 270: [01:00PM - 05:00PM] Go to Tennis
6: 300: [06:40PM - 07:20PM] Dinner With Family
7: 250: [07:20PM - 08:00PM] Check out stackoverflow.com
After that I would create a list with, for each item, the array number of the first item that could possibly come next. If there is no next item, -1 is added:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
1 | 7 | 4 | 5 | 6 | 6 | 7 | -1
With that list it is possible to generate a directed acyclic graph. Every vertex gets edges to the vertices starting from its next item, but no edge is made to a vertex that is already reachable through a vertex in between. I'll try to explain with the example. For vertex 0 the next item is 1, so an edge 0 -> 1 is made. The next item of 1 is 7, which means the range of vertices reachable directly from vertex 0 is now 1 to (7-1). Because vertex 2 is in the range 1 to 6, another edge 0 -> 2 is made and the range updates to 1 to (4-1) (because 4 is the next item of 2). Because vertex 3 is in the range 1 to 3, one more edge 0 -> 3 is made. That was the last edge for vertex 0. This has to be continued for all vertices, leading to such a graph:
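A sketch of the next-item list and the pruned edge construction (my code; times from the table are converted to minutes, and names are illustrative):

from bisect import bisect_left

# items are (start, end) pairs, already sorted by start time
items = [(480, 540), (540, 1140), (700, 760), (720, 780),
         (760, 800), (780, 1020), (1120, 1160), (1160, 1200)]
starts = [s for s, _ in items]

nxt = []
for _, end in items:
    j = bisect_left(starts, end)         # first item starting at/after `end`
    nxt.append(j if j < len(items) else -1)
print(nxt)  # [1, 7, 4, 5, 6, 6, 7, -1]

edges = {i: [] for i in range(len(items))}
for i in range(len(items)):
    if nxt[i] == -1:
        continue
    hi = len(items) - 1                  # shrinking upper end of the range
    j = nxt[i]
    while j <= hi:
        edges[i].append(j)
        if nxt[j] != -1:
            hi = min(hi, nxt[j] - 1)     # anything past next(j) goes via j
        j += 1
print(edges[0])  # [1, 2, 3], as in the walkthrough above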
Until now we are in O(n^2). After that, all paths can be found using a depth-first-search-like algorithm, eliminating the duplicated types from each path.
For that example there are 4 solutions, but none of them has all types because it is not possible for the example to do Go to Gym, Lunch With Steve and Go to Tennis.
Also, this search for all paths has a worst case complexity of O(2^n). For example, the following graph has 2^(n/2) possible paths from a start vertex to an end vertex.
There could be some more optimisation, like merging some vertices before searching for all paths. But that is not always possible. In the first example vertices 3 and 4 can't be merged even though they are of the same type. But in the last example vertices 4 and 5 can be merged if they are of the same type; it doesn't matter which activity you choose, both are valid. This can speed up the calculation of all paths dramatically.
Maybe there is also a clever way to consider duplicate types earlier in order to eliminate them, but the worst case is still O(2^n) if you want all possible paths.
EDIT1:
It is possible to determine whether there are sets that contain all types, and to get at least one such solution, in polynomial time. I found an algorithm with a worst case time of O(n^4) and O(n^2) space. I'll take a new example which has a solution with all types, but is more complex.
no.: id: [ start - end ] type
---------------------------------------------------------
0: 234: [08:00AM - 09:00AM] A
1: 400: [10:00AM - 11:00AM] B
2: 219: [10:20AM - 11:20AM] C
3: 79: [10:40AM - 11:40AM] D
4: 189: [11:30AM - 12:30PM] D
5: 270: [12:00PM - 06:00PM] B
6: 300: [02:00PM - 03:00PM] E
7: 250: [02:20PM - 03:20PM] B
8: 325: [02:40PM - 03:40PM] F
9: 150: [03:30PM - 04:30PM] F
10: 175: [05:40PM - 06:40PM] E
11: 275: [07:00PM - 08:00PM] G
1.) Count the different types in the item set. This is possible in O(n log n). It is 7 for this example.
2.) Create an n*n matrix that represents which nodes can reach the current node and which nodes can be reached from it. For example, if position (2,4) is set to 1, there is a path from node 2 to node 4 in the graph, and (4,2) is set to 1 too, because node 4 can be reached from node 2. This is possible in O(n^2). For the example the matrix would look like this:
111111111111
110011111111
101011111111
100101111111
111010111111
111101000001
111110100111
111110010111
111110001011
111110110111
111110111111
111111111111
3.) Now we have in every row which nodes can be reached. We can also mark each node in a row that is not yet marked if it is of the same type as a node that can be reached; we change those matrix positions from 0 to 2. This is possible in O(n^3). In the example there is no way from node 1 to node 3, but node 4 has the same type D as node 3 and there is a path from node 1 to node 4. So we get this matrix:
111111111111
110211111111
121211111111
120121111111
111212111111
111121020001
111112122111
111112212111
111112221211
111112112111
111112111111
111111111111
4.) The nodes whose rows still contain 0's can't be part of the solution, and we can remove them from the graph. If there was at least one node to remove, we start again at step 2.) with the smaller graph. Because we remove at least one node each time, we go back to step 2.) at most n times, but most often this will only happen a few times. If there are no 0's left in the matrix, we can continue with step 5.). This is possible in O(n^2). For the example it is not possible to build a path with node 1 that also contains a node of type C; therefore its row contains a 0 and it is removed, as are node 3 and node 5. In the next loop with the smaller graph, node 6 and node 8 are removed.
5.) Count the different types in the remaining set of items/nodes. If it is smaller than the first count, there is no solution that can represent all types, so we have to find another way to get a good solution. If it is the same as the first count, we now have a smaller graph which still holds all the possible solutions. O(n log n)
6.) To get one solution we pick a start node (it doesn't matter which, because all nodes that are left in the graph are part of a solution). O(1)
7.) We remove every node that can't be reached from the chosen node. O(n)
8.) We create a matrix like in steps 2.) and 3.) for that graph and remove the nodes that cannot reach nodes of every type, like in step 4.). O(n^3)
9.) We choose one of the next nodes after the node we chose before and continue with 7.) until we are at an end node and the graph has only one path left.
That way it is also possible to get all paths, but there can still be exponentially many of them. After all, it should be faster than finding solutions in the original graph.
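As a rough sketch of steps 2.) to 4.) (my interpretation, with illustrative names; the graph is given as adjacency lists, and the symmetric reachability marking follows the matrices above):

def filter_nodes(adj, types):
    n = len(adj)
    while True:
        reach = [[i == j for j in range(n)] for i in range(n)]
        def dfs(root, v):
            for w in adj[v]:
                if not reach[root][w]:
                    reach[root][w] = reach[w][root] = True
                    dfs(root, w)
        for i in range(n):
            dfs(i, i)
        # a node survives if, for every type, it can reach (or be reached
        # by) some node of that type -- i.e. its row has no remaining 0
        ok = [i for i in range(n)
              if all(any(reach[i][j] and types[j] == t for j in range(n))
                     for t in set(types))]
        if len(ok) == n:
            return adj, types
        # rebuild the smaller graph and repeat (step 4's loop)
        index = {v: k for k, v in enumerate(ok)}
        adj = [[index[w] for w in adj[v] if w in index] for v in ok]
        types = [types[v] for v in ok]
        n = len(adj)

adj = [[1], [2], []]
types = ["A", "B", "A"]
print(filter_nodes(adj, types))  # nothing removed in this tiny example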
Hmmm, this reminds me of a task in the university, I'll describe what i can remember
The run-time is O(n*logn) which is pretty good.
This is a greedy approuch..
i will refine your request abit, tell me if i'm wrong..
Algorithem should return the MAX subset of non colliding tasks(in terms of total length? or amount of activities? i guess total length)
I would first order the list by the finishing times(first-minimum finishing time,last-maximum) = O(nlogn)
def find_set(A):
    # A: list of (start_time, finish_time) pairs
    # Greedily keep every task that starts at or after the finish
    # time of the last task we kept (classic activity selection).
    G = []
    f = float("-inf")
    for start, finish in sorted(A, key=lambda t: t[1]):  # earliest finish first
        if start >= f:
            G.append((start, finish))   # add this to the result set
            f = finish
    return G
Run time analysis:
initial ordering: O(n log n)
each iteration O(1), n iterations = O(n)
total: O(n log n) + O(n) ~ O(n log n) (well, given big-O notation's weakness at representing real complexity for small inputs; but as the scale grows, this is a good algorithm)
Enjoy.
Update:
Ok, it seems I misread the post. You can alternatively use dynamic programming to reduce the running time; there is a solution in link text, pages 7-19.
You need to tweak the algorithm a bit: first build the table, then you can get all variations from it fairly easily.
I would use an Interval Tree for this.
After you build the data structure, you can iterate each event and perform an intersection query. If no intersections are found, it is added to your schedule.
Yes, exhaustive search might be an option:
initialise partial schedules with the earliest tasks that overlap (e.g. 9-9.30 and 9.15-9.45)
for each partial schedule generated so far, generate a list of new partial schedules by appending the earliest task that doesn't overlap (generate more than one in case of ties)
recur with the new partial schedules
In your case initialisation would produce only (8-9 breakfast).
After the first iteration: (8-9 brekkie, 11.40-12.40 gym) (no ties)
After the second iteration: (8-9 brekkie, 11.40-12.40 gym, 12.40-1.20 lunch) (no ties again)
This is a tree search, but it's greedy. It leaves out possibilities like skipping the gym and going to an early lunch.
Since you're looking for every possible schedule, I think the best solution you will find will be a simple exhaustive search.
The only thing I can say algorithmically is that your data structure of lists of strings is pretty terrible.
The implementation is hugely language dependent, so I don't even think pseudo-code would make sense, but I'll try to give the steps of the basic algorithm:
Pop off the first n items of the same type and put them in a list.
For each item in that list, add the item to a schedule set.
Pop off the next n items of the same type off the list.
For each item that starts after the first item ends, put it on the list. (If none, fail.)
Continue until done.
The hardest part is deciding exactly how to construct the lists/recursion so it's most elegant.
