Problems that involve time intervals and their overlapping - algorithm

I have recently come across a lot of questions that take time intervals as input. Some of the intervals overlap, and depending on that you have to perform an optimization (maximization or minimization) on the input. I am not able to solve such problems. In fact, I am not even able to start thinking about them.
Here is an example:
Let us say, you are a resource holder. There can be an infinite supply of such a resource.
There are people who want that resource for a particular time interval, e.g. 4 pm to 8 pm.
The intervals can overlap, e.g. 5 pm to 7 pm, 3 pm to 6 pm, etc.
Depending upon these intervals, and their overlapping nature, you have to figure out how many distinct instances of these resources are required.
Example input:
8:00 am to 9:00 am
8:30 am to 9:15 am
9:30 am to 10:40 am
In this case, the first two intervals overlap. So two instances of resources will be required. The third interval is not overlapping, so the person with that interval can reuse the resource returned by any of the earlier ones.
Hence, in this case, minimum resources required are 2.
I don't need a solution. I need some pointers on how to solve it. Are there any algorithms that address such questions? What should I read or study? Are there any data structures that might help?

The number of intervals overlapping any time instant T is the number of interval start times less than T, minus the number of interval end times less than or equal to T.
Many of these problems, like the specific one above, can be solved by putting the start and end times separately into a sorted list or tree so you can figure out stuff about how these counts change over time.
To solve this problem, for example, put the start and end times in a single list:
800S, 900E, 830S, 915E, 930S, 1040E
then sort them:
800S, 830S, 900E, 915E, 930S, 1040E
Then run through the list and count, adding 1 for each start time and subtracting 1 for each end time:
1 2 1 0 1 0
The highest number of overlapping intervals is 2.
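The counting pass above can be sketched in Python as follows. The function name `max_overlap` and the encoding of times as integers such as 830 are assumptions made for illustration. End events are sorted before start events at the same instant, so an interval ending at T frees its resource for one starting at T.

```python
def max_overlap(intervals):
    """Return the largest number of intervals open at the same instant."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # a start opens one interval
        events.append((end, -1))     # an end closes one interval
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so ends are processed before starts.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


intervals = [(800, 900), (830, 915), (930, 1040)]
print(max_overlap(intervals))  # 2
```

Sorting dominates the cost, so the whole pass is O(n log n) for n intervals.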

The data structure you need in order to solve this type of problem is the interval graph. The interval graph has a vertex for every interval and an edge between every pair of vertices whose intervals intersect.
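The following interval graph corresponds to the set of three intervals in your example:
A: 8:00-9:00
B: 8:30-9:15
C: 9:30-10:40
Its only edge is A-B, since those are the only two intervals that intersect; C is an isolated vertex.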
This data structure captures the relevant aspects of most problems involving intervals and thus helps to solve them efficiently. Also, given the set of intervals (represented by a list of 2-tuples), you can construct the interval graph in polynomial time.
Many problems that are NP-hard in general graphs, such as finding the Maximum Weight Independent Set or finding the Optimal Coloring, can be efficiently solved for interval graphs.
To solve the particular problem you've specified, first construct the interval graph G, while storing for each vertex the start time of its corresponding interval. Also initialize a set of resources R={1} that at first contains only a single resource: resource number 1. Consider the vertices v of G in order of increasing start time. Assign to v resource number i, where i is the smallest resource in R not used by the already-assigned neighbors of v. If no such resource exists (because the neighbors of v use all the resources in R), insert a new resource i=max{R}+1 into R and assign it to v. The order matters: with start-time order, the already-assigned neighbors of v all contain v's start time, so they all overlap one another and their number never exceeds the maximum overlap. The optimal number of resources (that is, the solution to your problem) is the size of the set R.
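A minimal sketch of this greedy assignment follows. It does not build the graph explicitly: the intervals that are still open when a new one starts are exactly its already-assigned neighbors, so two heaps are enough. The name `assign_resources`, the heap-based bookkeeping and the integer time encoding are assumptions of the sketch. It processes intervals by start time, the order for which picking the smallest free resource is known to use no more resources than the maximum overlap.

```python
import heapq


def assign_resources(intervals):
    """Give each (start, end) interval the smallest free resource number.

    Returns a list of resource numbers, one per input interval.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    busy = []       # min-heap of (end_time, resource) for resources in use
    free = []       # min-heap of resource numbers that have been returned
    next_new = 1    # number of the next brand-new resource
    assignment = [None] * len(intervals)
    for i in order:
        start, end = intervals[i]
        # Release every resource whose interval has ended by `start`.
        while busy and busy[0][0] <= start:
            _, resource = heapq.heappop(busy)
            heapq.heappush(free, resource)
        if free:
            resource = heapq.heappop(free)   # smallest returned resource
        else:
            resource = next_new              # all resources busy: add one
            next_new += 1
        assignment[i] = resource
        heapq.heappush(busy, (end, resource))
    return assignment


intervals = [(800, 900), (830, 915), (930, 1040)]
assignment = assign_resources(intervals)
print(assignment, max(assignment))  # [1, 2, 1] 2
```

The third interval reuses resource 1, matching the example in the question.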

Related

Activity selection with two resources

Given n activities with start time (Si) and end time (Fi) and 2 resources.
Pick the activities such that maximum number of activities are finished.
My ideas
I tried to solve it with DP but couldn't come up with a formulation, so I am trying a greedy approach.
Approach 1: Fill resource 1 greedily first and then resource 2 greedily (least end time first). But this will not work for this case: T1(1,4) T2(5,10) T3(6,12) T4(11,15)
Approach 2: Select tasks greedily and assign them in round-robin fashion.
This will also not work.
Can anyone please help me in figuring out this?
No need to use DP at all, a Greedy solution suffices, though it is slightly more complicated than the 1-resource problem.
Here, we first sort the intervals by ending time, earliest first. Then, put two "sentinel" intervals in the resources, both with ending time -∞. Then, keep taking the interval x with the lowest x.end, and follow these rules:
1. If x.start is before both of the ending times in our two resources, skip x and don't assign it, since x cannot fit.
2. Otherwise, have x overwrite the resource whose ending time is latest and still not after x.start.
The greedy strategy in rule 2 is the key point here: we want to replace the latest ending used resource, since that maximizes the "space" that we have in the other resource to accommodate some future interval with an early start time, making it strictly more likely that future interval will be able to fit.
Let's look at the example in the question, with intervals (1,4), (5,10), (6,12), and (11,15) already in sorted order. We begin with both resources having (-∞,-∞) as "sentinel" intervals. Now take the first interval (1,4), and see that it fits, so now we have resource 1 having (1,4) and resource 2 having (-∞,-∞). Next, take (5,10), which can fit in both resources, so we choose resource 1, because it ends the latest, and now resource 1 has (5,10). Next, we take (6,12), which only fits in resource 2, so resource 2 has (6,12). Finally, take (11,15), which fits in resource 1.
Hence, we have been able to fit all four intervals using our Greedy strategy.
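The two rules can be sketched as below. The function name `schedule_two_resources`, the 0/1 resource indices and the choice to treat an interval starting exactly at a previous end time as fitting are assumptions of this sketch.

```python
def schedule_two_resources(intervals):
    """Greedy for two resources; returns a list of ((start, end), resource)."""
    last_end = [float("-inf"), float("-inf")]    # sentinel ending times
    chosen = []
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Resources whose current interval ends no later than `start`.
        fits = [r for r in (0, 1) if last_end[r] <= start]
        if not fits:
            continue                             # rule 1: x fits nowhere
        # Rule 2: overwrite the resource with the latest ending time
        # among those that still fit (ties go to resource 0).
        r = max(fits, key=lambda k: last_end[k])
        last_end[r] = end
        chosen.append(((start, end), r))
    return chosen


for interval, resource in schedule_two_resources([(1, 4), (5, 10), (6, 12), (11, 15)]):
    print(interval, "-> resource", resource + 1)
```

On the question's example all four intervals are placed: three on resource 1 and (6,12) on resource 2.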
The activity selection problem can be solved by the Greedy-Iterative-Activity-Selector algorithm.
The basic idea is to always pick, among the remaining activities, the one with the earliest finish time whose start time is greater than or equal to the finish time of the previously selected activity. We can sort the activities by finish time so that the next activity considered is always the one with the minimum finish time.
See more on Wikipedia.
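For reference, a short sketch of that single-resource greedy is given here; the function name `select_activities` is an assumption, and activities are taken to be (start, finish) pairs.

```python
def select_activities(activities):
    """Pick a maximum-size set of non-overlapping (start, finish) activities."""
    selected = []
    last_finish = float("-inf")
    # Consider activities in order of finish time, earliest first.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:        # compatible with the last pick
            selected.append((start, finish))
            last_finish = finish
    return selected


print(select_activities([(1, 4), (5, 10), (6, 12), (11, 15)]))
# [(1, 4), (5, 10), (11, 15)]
```

With a single resource only three of the four activities fit, which is why the two-resource variant needs the extra rule about which resource to overwrite.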

Variation to the Set-Covering Prob (Maybe an Activity Selection Prob)

Every day from 9am to 5pm, I am supposed to have at least one person at the factory supervising the workers and making sure that nothing goes wrong.
There are currently n applicants to the job, and each of them can work from time si to time ci, i = 1, 2, ..., n.
My goal is to minimize the time during which two or more people are keeping watch over the workers at the same time.
(The applicants' available working hours are able to cover the time period from 9am to 5pm.)
I have proved that at most two people are needed for any instant of time to fulfill my needs, but how should I get from here to the final solution?
Finding the time periods where only one person is available for the job and keeping them is my first step, but finding the next step is what troubles me... .
The algorithm must run in polynomial-time.
Any hints (a certain type of data structure, maybe?) or references are welcome. Many thanks.
I think you can do this with dynamic programming by solving the sub-problem:
What is the minimum overlap time given that applicant i is the last worker and we have covered all times from start of day up to ci?
Call this value of the minimum overlap time cost(i).
You can compute the value of cost(i) by considering cases:
If si is equal to the start of day, then cost(i) = 0 (no overlap is required)
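Otherwise, consider all previous applicants j, that is, those who finish no later than i and whose shift reaches the start of i's shift (cj >= si), so that no gap is left. Set cost(i) to the minimum of cost(j) + overlap between i and j. Also set prev(i) to the value of j that attains the minimum.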
Then the answer to your problem is given by the minimum of cost(k) for all values of k where ck is equal to the end of the day. You can work out the correct choice of people by backtracking using the values of prev.
This gives an O(n^2) algorithm.
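A sketch of this dynamic program is given below. The function name `min_overlap_schedule`, the (s, c) tuple encoding and the assumption that every availability window lies within the working day are mine, not the original poster's. The sketch returns the minimum overlap together with the chain of applicant indices recovered from `prev`.

```python
def min_overlap_schedule(applicants, day_start, day_end):
    """Return (minimum overlap time, chosen applicant indices) or None.

    applicants: list of (s_i, c_i) windows, assumed to lie within the day.
    """
    INF = float("inf")
    # Process applicants in order of finishing time c_i.
    order = sorted(range(len(applicants)), key=lambda i: applicants[i][1])
    cost = {i: INF for i in order}
    prev = {i: None for i in order}
    for pos, i in enumerate(order):
        s_i, c_i = applicants[i]
        if s_i <= day_start:
            cost[i] = 0                  # i covers from the start of day alone
            continue
        for j in order[:pos]:            # applicants finishing no later than i
            s_j, c_j = applicants[j]
            if c_j < s_i or cost[j] == INF:
                continue                 # gap between j and i, or j unusable
            overlap = c_j - max(s_i, s_j)
            if cost[j] + overlap < cost[i]:
                cost[i], prev[i] = cost[j] + overlap, j
    finishers = [i for i in order
                 if applicants[i][1] >= day_end and cost[i] < INF]
    if not finishers:
        return None                      # the day cannot be covered
    best = min(finishers, key=lambda i: cost[i])
    chain = []
    k = best
    while k is not None:                 # backtrack through prev
        chain.append(k)
        k = prev[k]
    return cost[best], chain[::-1]


print(min_overlap_schedule([(9, 12), (11, 17), (12, 17), (9, 13)], 9, 17))
# (0, [0, 2])
```

The double loop over applicants gives the O(n^2) running time mentioned above.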

Does greedily removing intervals with most conflicts solve interval scheduling?

We can solve the scheduling problem, in which we must select the largest set of continuous intervals that do not overlap, with a greedy algorithm: we just keep picking the intervals that end the earliest: http://en.wikipedia.org/wiki/Interval_scheduling
Apparently, greedily picking the intervals with fewest conflicts does not work.
I was wondering if putting all the intervals in one big set and then greedily removing the interval with the most number of conflicts left (until the intervals have no conflicts) works. I can envision implementing this greedy algorithm with a priority queue: every time we remove the interval X with greatest conflicts from the priority queue, we update the other intervals that used to conflict with interval X so that the other intervals now are marked as having 1 less conflict.
Does this work? I'm trying to come up with a counterexample to disprove it and can't.
Here is a counterexample.
The idea is to drop a required interval on the very first pick.
The number of conflicts is on the right.
====              2
   ----           3
   ----           3
      ====        4
         ----     3
         ----     3
            ====  2
Obviously, we will want to pick the three bold (====) intervals and drop the four thin (----) intervals.
There is no other way to obtain three non-intersecting intervals.
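The counterexample can be checked mechanically. The sketch below uses one concrete set of coordinates that reproduces the conflict counts 2, 3, 3, 4, 3, 3, 2; the coordinates and function names are assumptions chosen for illustration. It compares "remove the most conflicting interval" with the earliest-end-first greedy.

```python
def conflicts(a, b):
    """True if the open intervals a and b overlap."""
    return a[0] < b[1] and b[0] < a[1]


def remove_most_conflicting(intervals):
    """Repeatedly drop the interval with the most conflicts (first on ties)."""
    remaining = list(intervals)
    while True:
        counts = [sum(conflicts(a, b) for j, b in enumerate(remaining) if j != i)
                  for i, a in enumerate(remaining)]
        if not counts or max(counts) == 0:
            return remaining
        remaining.pop(counts.index(max(counts)))


def earliest_end_first(intervals):
    """Standard greedy: keep picking the compatible interval ending earliest."""
    chosen, last_end = [], float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


# Bold intervals: (0,4), (6,10), (12,16); thin ones: (3,7) twice, (9,13) twice.
example = [(0, 4), (3, 7), (3, 7), (6, 10), (9, 13), (9, 13), (12, 16)]
print(len(remove_most_conflicting(example)))  # 2
print(len(earliest_end_first(example)))       # 3
```

Once the middle interval is removed, the rest falls into two clusters of mutually conflicting intervals, so no tie-breaking rule can end with more than two.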
By the way, you may find the TopCoder tutorial on greedy problems interesting since it starts with a discussion of several approaches on the same problem.

Algorithm design to assign nodes to graphs

I have a graph-theoretic (which is also related to combinatorics) problem that is illustrated below, and wonder what is the best approach to design an algorithm to solve it.
Given 4 different graphs of 6 nodes (by different, I mean different structures, e.g. STAR, LINE, COMPLETE, etc), and 24 unique objects, design an algorithm to assign these objects to these 4 graphs 4 times, so that the number of repeating neighbors on the graphs over the 4 assignments is minimized. For example, if object A and B are neighbors on 1 of the 4 graphs in one assignment, then in the best case, A and B will not be neighbors again in the other 3 assignments.
Obviously, the degree to which such minimization can go is dependent on the specific graph structures given. But I am more interested in a general solution here so that given any 4 graph structures, such minimization is guaranteed as the result of the algorithm.
Any suggestion/idea of solving this problem is welcome, and some pseudo-code may well be sufficient to illustrate the design. Thank you.
Representation:
You have 24 elements, which I will name A to X (the first 24 letters).
Each of these elements will have a place in one of the 4 graphs. I will number the 24 nodes of the 4 graphs from 1 to 24.
I will identify the position of A by a 24-tuple (xa1, xa2, ..., xa24). If I want to assign A to node number 8, for example, I will write (xa1, xa2, ..., xa24) = (0,0,0,0,0,0,0,1,0,...,0), where the 1 is in position 8.
We can say that A = (xa1, ..., xa24).
e1, ..., e24 are the unit vectors (1,0,...,0) to (0,0,...,1).
Note about the operator '.' (dot product):
A.e1 = xa1
...
X.e24 = xx24
With this notation there are some constraints on A, ..., X:
each xij is in {0,1}
and
Sum(xai) = 1 ... Sum(xxi) = 1
Sum(xa1, xb1, ..., xx1) = 1 ... Sum(xa24, xb24, ..., xx24) = 1
since each element can be assigned to only one node and each node holds only one element.
I will define a graph by defining the neighbor relation of each node. Let's say node 8 has neighbors node 7 and node 10.
To check that A and B are neighbors with A on node 8, for example, I need:
A.e8 = 1 and (B.e7 = 1 or B.e10 = 1), so I just need A.e8*(B.e7+B.e10) == 1.
In the function IsNeighborInGraphs(A,B) I test that for every node and get one or zero depending on the neighborhood.
Notations:
4 graphs of 6 nodes; the position of each element is defined by an integer from 1 to 24
(1 to 6 for the first graph, etc.).
e1, ..., e24 are the unit vectors (1,0,0,...,0) to (0,0,...,1).
Let A, B, ..., X be the 24 elements.
A = (0,0,...,1,...,0) = (xa1, xa2, ..., xa24)
B = ...
...
X = (0,0,...,1,...,0)
Graph descriptions:
IsNeighborInGraphs(A,B) = A.e1*B.e2 + ... // if 1 and 2 are neighbors in one graph, for example
State of the system:
L(A) = [B,B,C,E,G,...] // list of neighbors of A (can repeat)
actualise(L(A)):
for Element in [B, ..., X]
    if IsNeighborInGraphs(A, Element)
        L(A).append(Element)
    endIf
endfor
Objective functions:
N(A) = len(L(A)) + Sum(IsNeighborInGraphs(A,i), i in L(A))
...
N(X) = ...
Description of the algorithm:
1. Start with an initial position: A = e1, ..., X = e24.
2. Actualise L(A), L(B), ..., L(X).
3. Solve this with a solver (AMPL, for example, will work, I guess, since it's a nonlinear optimization problem):
Objective function:
min(Sum(N(Z), Z = A to X))
Constraints:
Sum(xai) = 1 ... Sum(xxi) = 1
Sum(xa1, xb1, ..., xx1) = 1 ... Sum(xa24, xb24, ..., xx24) = 1
You get the best solution.
4. Repeat steps 2 and 3 three more times.
If all four graphs are K_6, then the best you can do is choose 4 set partitions of your 24 objects into 4 sets each of cardinality 6 so that the pairwise intersection of any two sets has cardinality at most 2. You can do this by choosing set partitions that are maximally far apart in the Hasse diagram of set partitions with partial order given by refinement. The general case is much harder, but perhaps you can still begin with this crude approximation of a solution and then be clever with which vertex is assigned which object in the four assignments.
Assuming you don't want to enumerate all combinations, compute the sum each time and choose the lowest, you can formulate this as a minimization problem with constraints on your 24 variables that depend on the shape of your path. Depending on your constraints, it can be solved with either a linear programming solver (e.g. an engine based on the simplex algorithm) or a non-linear solver, which is much more expensive in terms of time. You can also use free software like LINGO/LINDO to quickly build a decision theory model and test its correctness (you need some decision theory background, though).
If this has anything to do with the real world, then it's unlikely that you absolutely must have a solution that is the true minimum. Close to the minimum should be good enough, right? If so, you could repeatedly randomly make the 4 assignments and check the results until you either run out of time or have a good-enough solution or appear to have stopped improving your best solution.
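The random-restart idea can be sketched as follows. Everything concrete here is a hypothetical stand-in: the four example structures (star, line, cycle, K_6), the node numbering 0 to 23, the number of tries and the function names. The cost counts every time a pair of objects becomes neighbors again after its first time.

```python
import itertools
import random


def neighbour_pairs(assignment, edges):
    """Pairs of objects on adjacent nodes; assignment[node] = object."""
    return {frozenset((assignment[u], assignment[v])) for u, v in edges}


def repeated_neighbours(rounds, edges):
    """Count how often a pair is adjacent again after its first time."""
    seen, repeats = set(), 0
    for assignment in rounds:
        pairs = neighbour_pairs(assignment, edges)
        repeats += len(pairs & seen)
        seen |= pairs
    return repeats


def random_search(edges, n_nodes=24, n_rounds=4, tries=2000, seed=0):
    """Try random assignments and keep the set of rounds with fewest repeats."""
    rng = random.Random(seed)
    best_cost, best_rounds = None, None
    for _ in range(tries):
        rounds = [rng.sample(range(n_nodes), n_nodes) for _ in range(n_rounds)]
        cost = repeated_neighbours(rounds, edges)
        if best_cost is None or cost < best_cost:
            best_cost, best_rounds = cost, rounds
        if best_cost == 0:
            break                       # cannot do better than no repeats
    return best_cost, best_rounds


# Hypothetical structures on nodes 0..23: a star, a line, a cycle and K_6.
star = [(0, k) for k in range(1, 6)]
line = [(k, k + 1) for k in range(6, 11)]
cycle = [(12 + k, 12 + (k + 1) % 6) for k in range(6)]
complete = list(itertools.combinations(range(18, 24), 2))
edges = star + line + cycle + complete

cost, rounds = random_search(edges)
print("repeated neighbor pairs in best try:", cost)
```

A natural refinement, not shown here, is to replace the independent restarts with local moves such as swapping two objects in one round and keeping the swap only if the cost does not increase.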

Finding the minimal coverage of an interval with subintervals

Suppose I have an interval (a,b), and a number of subintervals {(ai,bi)}i whose union is all of (a,b). Is there an efficient way to choose a minimal-cardinality subset of these subintervals which still covers (a,b)?
A greedy algorithm starting at a or b always gives the optimal solution.
Proof: consider the set Sa of all the subintervals covering a. Clearly, one of them has to belong to the optimal solution. If we replace it with a subinterval (amax,bmax) from Sa whose right endpoint bmax is maximal in Sa (reaches furthest to the right), the remaining uncovered interval (bmax,b) will be a subset of the remaining interval from the optimal solution, so it can be covered with no more subintervals than the analogous uncovered interval from the optimal solution. Therefore, a solution constructed from (amax,bmax) and the optimal solution for the remaining interval (bmax,b) will also be optimal.
So, just start at a and iteratively pick the interval that reaches furthest to the right among those covering the end of the previous interval; repeat until you hit b. I believe that picking the next interval can be done in O(log n) if you store the intervals in an augmented interval tree.
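A minimal sketch of this greedy follows. It replaces the interval tree by a single sort on left endpoints, which is a simplification of mine, and it treats the intervals as closed so that touching endpoints count as covered; the name `min_cover` is also an assumption.

```python
def min_cover(a, b, subintervals):
    """Return a smallest list of subintervals covering [a, b], or None."""
    ordered = sorted(subintervals)       # by left endpoint
    chosen = []
    covered_to = a                       # [a, covered_to] is covered so far
    i = 0
    while covered_to < b:
        best = None
        # Among intervals starting at or before covered_to, take the one
        # reaching furthest right. Intervals scanned in earlier rounds end
        # at or before covered_to, so they never need to be revisited.
        while i < len(ordered) and ordered[i][0] <= covered_to:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= covered_to:
            return None                  # gap: nothing extends the cover
        chosen.append(best)
        covered_to = best[1]
    return chosen


print(min_cover(0, 10, [(0, 3), (2, 6), (0, 5), (5, 10), (4, 8), (7, 10)]))
# [(0, 5), (5, 10)]
```

Each interval is scanned once after sorting, so the whole procedure runs in O(n log n).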
Sounds like dynamic programming.
Here's an illustration of the algorithm (assume intervals are in a list sorted by ending time):
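#include <algorithm>
#include <climits>
#include <vector>

struct Interval { int start, end; };
const int INF = INT_MAX / 2;        // "infinity" that survives the 1 + ... below
std::vector<Interval> intervals;    // sorted by ending time
int a;                              // left end of the interval (a,b) to cover

// Works backwards from the end; call as minCard(intervals.size() - 1, b).
int minCard(int current, int must_end_after)
{
    if (must_end_after <= a)
        return 0;    // everything down to a is covered: no more intervals needed
    if (current < 0 || intervals[current].end < must_end_after)
        return INF;  // doesn't cover (a,b)
    // include the current interval or not?
    return std::min(
        1 + minCard(current - 1, std::min(must_end_after, intervals[current].start)),
        minCard(current - 1, must_end_after));
}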
But it should also involve caching (memoisation).
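One way to add the caching is sketched below in Python with `functools.lru_cache`. The wrapper name `min_card` and the convention of returning infinity when no cover exists are assumptions of the sketch; the recursion itself mirrors the function above, with intervals sorted by ending time.

```python
from functools import lru_cache


def min_card(intervals, a, b):
    """Minimum number of (start, end) intervals needed to cover a..b."""
    ivs = sorted(intervals, key=lambda iv: iv[1])    # sorted by ending time
    INF = float("inf")

    @lru_cache(maxsize=None)
    def go(current, must_end_after):
        if must_end_after <= a:
            return 0                 # covered all the way down to a
        if current < 0 or ivs[current][1] < must_end_after:
            return INF               # nothing left reaches must_end_after
        start = ivs[current][0]
        # Either include the current interval or skip it.
        return min(1 + go(current - 1, min(must_end_after, start)),
                   go(current - 1, must_end_after))

    return go(len(ivs) - 1, b)


print(min_card([(0, 3), (2, 6), (0, 5), (5, 10), (4, 8), (7, 10)], 0, 10))  # 2
```

The second argument only ever takes the value b or one of the interval start points, so there are O(n^2) distinct states to cache.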
There are two cases to consider:
Case 1: There are no overlapping intervals after the finish time of an interval. In this case, pick the next interval with the smallest starting time and the longest finishing time (amin, bmax).
Case 2: There are one or more intervals overlapping with the last interval you're looking at. In this case, the start time doesn't matter because you've already covered that, so optimize for the finishing time (a, bmax).
Case 1 always picks the first interval as the first interval in the optimal set as well (the proof is the same as what @RafalDowgrid provided).
You mean so that the subintervals still overlap in such a way that (a,b) remains completely covered at all points?
Maybe split the subintervals themselves into basic blocks, each associated with the subintervals it came from, so that for each basic block you can list the options that cover it, taking into account the other regions each subinterval also covers. Then you can run a search over the basic blocks and at least be sure no gaps are left.
You would then need to search efficiently, which would be harder.
You could eliminate any collection of intervals that is entirely covered by another, smaller collection and work on the problem after this preprocessing.
Wouldn't the minimal for the whole be minimal for at least one half? I'm not sure.
Found a link to a journal but couldn't read it. :(
This would be a hitting set problem and be NP-hard in general.
Couldn't read this either but looks like opposite kind of problem.
Couldn't read it but another link that mentions splitting intervals up.
Here is an available reference on Randomized Algorithms for Geometric Optimization Problems.
Page 35 of this pdf has a greedy algorithm.
Page 11 of Karp (1972) mentions hitting-set and is cited a lot.
Google result. Researching was fun but I have to go now.

Resources