Algorithmic/programming approach to Gantt chart problem - algorithm

I have 6 processes P1, P2, P3, P4, P5, and P6. I also have their start times and duration given in the problem.
process#  start  duration
1         1      1
2         3      1
3         0      6
4         5      2
5         5      4
6         8      1
Now I have to find the maximum number of completely non-overlapping processes. Two processes are completely non-overlapping if one does not overlap the other at any point in time.
So I made a Gantt chart and it is easy to see that the answer is 4.
P1, P2, P4 and P6 are completely non-overlapping.
Now I have to write a program to compute the same. On a Gantt chart I can easily 'see' the solution.
In the algorithm for my program, I don't know how to minimise the time complexity: currently I'm thinking about taking each process and comparing its start and end times with other processes, but that roughly makes it O(n^2).
If I scale up the processes from 6 to say 1000, O(n^2) will take a huge time.
Is there any standard way of doing such problems - I mean such problems that are easy to visualise - like Gantt charts? Otherwise how do I make this algorithm better, any suggestions?

There are different paths that you could take to find a solution; here are some, in no particular order.
Is there already a solution on the net?
Most likely. The important point is that Gantt charts are essentially intervals.
Could it be a graph problem?
Consider each interval as a node, imagine a start node at 0 (zero), and connect every node to all nodes that start later than its end. Use a Dijkstra- or A*-like search to find a solution.
Could it be a dynamic programming problem?
Are there subproblems? Yes: add or don't add an interval, then repeat.
Do I know a data structure that is used in this kind of problem?
Yes, the augmented interval tree. Can it be used for this problem? Maybe.
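The "add or don't add an interval" idea can be sketched as a small dynamic program. This is only a minimal sketch, not the answerer's code: the function name `max_non_overlapping` is hypothetical, and it assumes that a process ending exactly when another starts does not overlap it. Sorting by end time plus a binary search gives O(n log n) instead of O(n^2).

```python
from bisect import bisect_right

def max_non_overlapping(processes):
    """processes: list of (start, duration) pairs.
    Returns the largest number of pairwise non-overlapping processes."""
    # Work with (start, end) intervals sorted by end time.
    intervals = sorted(((s, s + d) for s, d in processes), key=lambda iv: iv[1])
    ends = [end for _, end in intervals]
    # best[i] = best answer using only the first i intervals (in end-time order).
    best = [0] * (len(intervals) + 1)
    for i, (start, _end) in enumerate(intervals, start=1):
        # How many of the first i-1 intervals end at or before this start?
        compatible = bisect_right(ends, start, 0, i - 1)
        # Either skip interval i, or take it on top of the compatible prefix.
        best[i] = max(best[i - 1], 1 + best[compatible])
    return best[-1]

# The six processes from the question, as (start, duration).
print(max_non_overlapping([(1, 1), (3, 1), (0, 6), (5, 2), (5, 4), (8, 1)]))
```

For the table in the question this prints 4, matching the answer read off the Gantt chart.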

Related

How to find the highest number of changes/permutations inside a group (maybe a graph)

Let's say my company has N workers and M sectors. Each worker is currently assigned to a sector, and each worker is also willing to change to another sector.
For example:
Worker A is in sector 1 but wants to go to sector 2
B is in 2 but wants 3
C is in 3 but wants 2
D is in 1 but wants 3
and so on...
But they all must swap with each other.
A goes to B's position and B goes to A's position
or
A goes to B's position / B goes to C's position / C goes to A's position
I know that not everyone will change sectors, but I'm wondering if there is any specific algorithm that could find which movements yield the maximum number of changes.
I thought about naively swapping pairs of workers, but that may leave some of them out; they could all form a "loop" so that no one is left out (if possible).
I could use Monte Carlo to chain the workers and find the longest chain/loop, but that would be too expensive as N and M grow.
I also thought about finding the longest path in a graph using Dijkstra, but that looks like an NP-hard problem.
Does anyone know an algorithm, or how I could solve this efficiently? Or am I trying to fly too close to the sun here?
This can be solved as a min-cost circulation problem. Construct a flow network where each sector corresponds to a node, and each worker corresponds to an arc. The capacity of each arc is 1, and the cost is −1 (i.e., we should move workers if we can). The conservation of flow constraint ensures that we can decompose the worker movements into a sum of simple cycles.
Klein's cycle canceling algorithm is not the most efficient, but it's very simple. Use (e.g.) Bellman−Ford to find a negative-cost cycle in the network, if one exists. If so, reverse the direction of each arc in the cycle, multiply the cost of each arc in the cycle by −1, and loop back to the beginning.
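A minimal sketch of the cycle-canceling loop described above, assuming sectors are numbered 0..num_sectors-1 and that no worker wants the sector they are already in. The function names `find_negative_cycle` and `max_sector_changes` are hypothetical, chosen only for illustration.

```python
def find_negative_cycle(arcs, n):
    """Bellman-Ford from a virtual source (all distances start at 0).
    arcs: list of [tail, head, cost, worker]. Returns arc indices of a
    negative-cost cycle, or None if there is none."""
    dist = [0] * n
    pred = [None] * n          # index of the arc that last relaxed each node
    last = None
    for _ in range(n):
        last = None
        for idx, (u, v, cost, _worker) in enumerate(arcs):
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = idx
                last = v
        if last is None:       # a full pass without relaxation: no negative cycle
            return None
    node = last
    for _ in range(n):         # step back n times to be sure we are on the cycle
        node = arcs[pred[node]][0]
    cycle, cur = [], node
    while True:
        idx = pred[cur]
        cycle.append(idx)
        cur = arcs[idx][0]
        if cur == node:
            return cycle

def max_sector_changes(workers, num_sectors):
    """workers: list of (current_sector, wanted_sector).
    Returns the indices of the workers who get to move."""
    # One unit-capacity arc per worker, cost -1 for moving that worker.
    arcs = [[cur, want, -1, i] for i, (cur, want) in enumerate(workers)]
    while True:
        cycle = find_negative_cycle(arcs, num_sectors)
        if cycle is None:
            break
        for idx in cycle:      # reverse each arc on the cycle and negate its cost
            arc = arcs[idx]
            arc[0], arc[1], arc[2] = arc[1], arc[0], -arc[2]
    # Arcs left with cost +1 are reversed, i.e. those workers move.
    return sorted(arc[3] for arc in arcs if arc[2] == 1)

# A: 1->2, B: 2->3, C: 3->2, D: 1->3, with sectors renumbered from 0.
print(max_sector_changes([(0, 1), (1, 2), (2, 1), (0, 2)], 3))
```

In the question's example only B and C can move (they swap), because nobody wants to go to sector 1, so A and D have no one to trade with.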
You could use the following observations to generate the most attractive sector changes (measured by how many workers get the change they want). In order of decreasing attractiveness:
1. Identify all circular chains of sector changes. Everybody gets the change they want.
2. Identify all non-circular chains of sector changes. They can be made circular at the expense of one worker not getting what s/he wants.
3. Revisit 1. Combine any two circular chains at the expense of two workers not getting what they want.
Instead of one optimal solution, you get a list of many more or less attractive options. You will have to put some bounds on steps 1 - 3 to keep the options down to a tractable number.

Problems that involve time intervals and their overlapping

I have recently come across a lot of questions that take time intervals as input. Some of the time intervals overlap, and depending on that you have to perform an optimization (a maximization or a minimization) on the input. I am not able to solve such problems. In fact, I am not even able to start thinking about them.
Here is an example:
Let us say, you are a resource holder. There can be an infinite supply of such a resource.
There are people who want that resource for a particular time interval. For ex: 4 pm to 8 pm
There can be an overlapping interval. ex: 5 pm to 7 pm, 3 pm to 6 pm
etc.
Depending upon these intervals, and their overlapping nature, you have to figure out how many distinct instances of these resources are required.
Ex. Input:
8:00 am to 9:00 am
8:30 am to 9:15 am
9:30 am to 10:40 am
In this case, the first two intervals overlap. So two instances of resources will be required. The third interval is not overlapping, so the person with that interval can reuse the resource returned by any of the earlier ones.
Hence, in this case, minimum resources required are 2.
I don't need a solution. I need some pointers on how to solve. Are there any algorithms that address such questions? What should I read/ study. Are there any data structures that might help.
The number of intervals overlapping any time instant T is the number of interval start times less than T, minus the number of interval end times less than or equal to T.
Many of these problems, like the specific one above, can be solved by putting the start and end times separately into a sorted list or tree so you can figure out stuff about how these counts change over time.
To solve this problem, for example, put the start and end times in a single list:
800S, 900E, 830S, 915E, 930S, 1040E
then sort them:
800S, 830S, 900E, 915E, 930S, 1040E
Then run through the list and count, adding 1 for each start time and subtracting 1 for each end time:
1 2 1 0 1 0
The highest number of overlapping intervals is 2.
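The counting pass above can be sketched in a few lines. This is a minimal sketch: the name `min_resources` is hypothetical, times are plain numbers such as 800 or 1040, and it assumes that a resource returned at time t can be reused by an interval starting at the same t (end events sort before start events).

```python
def min_resources(intervals):
    """intervals: list of (start, end) pairs.
    Returns the highest number of intervals overlapping at any time."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # +1 when an interval starts
        events.append((end, -1))     # -1 when an interval ends
    # Tuples sort by time first; at equal times -1 (end) comes before +1 (start).
    events.sort()
    current = peak = 0
    for _time, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

print(min_resources([(800, 900), (830, 915), (930, 1040)]))
```

The sort dominates the cost, so the whole thing runs in O(n log n).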
The data structure you need to use in order to solve this type of problems is The Interval Graph. The Interval Graph has a vertex for every interval and an edge between every pair of vertices corresponding to intervals that intersect.
The interval graph for the three intervals in your example has one vertex per interval and a single edge, between A and B, since they are the only pair that intersects:
A: 8:00-9:00
B: 8:30-9:15
C: 9:30-10:40
This data structure captures the relevant aspects of most problems involving intervals and thus helps to solve them efficiently. Also, given the set of intervals (represented by a list of 2-tuples), you can construct the interval graph in Polynomial time.
Many problems that are NP-hard in general graphs, such as finding the Maximum Weight Independent Set or finding the Optimal Coloring, can be efficiently solved for interval graphs.
To solve the particular problem you've specified, first construct the interval graph G, storing for each vertex the start time of its corresponding interval. Also initialize a set of resources R={1} that at first contains a single resource: resource number 1. Consider each vertex v of G in order of start time. (The order matters: when vertices are taken by start time, the already-processed neighbors of v all contain the start point of v, so they pairwise intersect, and that is what makes the greedy choice optimal. Ordering by finish time can use more resources than necessary.) Assign to v resource number i, where i is the smallest resource in R not used by the neighbors of v. If no such resource exists (because the neighbors of v use all the resources in R), insert a new resource i=max{R}+1 into R and assign it to v. The optimal number of resources (that is, the solution to your problem) is the size of the set R.
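A minimal sketch of this greedy assignment, under two assumptions of mine: intervals are processed in order of start time (the order for which the smallest-free-resource rule is known to be optimal on interval graphs), and neighbors are found by a direct overlap test rather than by building the graph explicitly. The name `assign_resources` is hypothetical.

```python
def assign_resources(intervals):
    """intervals: list of (start, end) pairs.
    Returns a dict mapping interval index -> resource number (1, 2, ...)."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    assigned = {}
    for v in order:
        start, end = intervals[v]
        # Resources held by already-processed intervals that intersect v.
        used = {
            assigned[u]
            for u in assigned
            if intervals[u][0] < end and start < intervals[u][1]
        }
        resource = 1
        while resource in used:      # smallest resource not used by a neighbor
            resource += 1
        assigned[v] = resource
    return assigned

plan = assign_resources([(800, 900), (830, 915), (930, 1040)])
print(plan, "resources needed:", max(plan.values()))
```

The overlap test makes this O(n^2) in the worst case; it is meant to mirror the description, not to be the fastest route to the count.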

Trying to gain intuition for work scheduling greedy algorithm

I have the following scenario: (since I don't know of a way to show LaTeX, here's a screenshot)
I'm having some trouble conceptualizing what's going on here. If I were to program this, I would probably attempt to structure this as some kind of heap where each node represents a worker, from earliest-to-latest, then run Prim's/Kruskal's algorithm on it. I don't know if I'm on the right track with that idea, but I need to flesh out my understanding of this problem so I can do the following:
Describe in detail the greedy choice
Show that if there's an optimal solution for which the greedy choice was not made, then an exchange can be made to conform with the greedy choice
Know how to implement a greedy algorithm solution, and its running time
So where should I be going with this idea?
This problem is very similar in nature to "Roster Scheduling problems." Think of the committee as say a set of 'supervisors' and you want to have a supervisor present, whenever a worker is present. In this case, the supervisor comes from the same set as the workers.
Here are some modeling ideas, and an Integer Programming formulation.
Time Slicing Idea
This sounds like a bad idea initially, but works really well in practice. We are going to create a lot of "time instants" T_i from the start time of the first shift to the end time of the very last shift. It sometimes helps to think of T_1, T_2, T_3, ..., T_N as being time instants (say) five minutes apart. For every T_i at least one worker is working on a shift. Therefore, that time instant has to be covered. (Coverage means there has to be at least one member of the committee also working at time T_i.)
We really need to only worry about 2n Time instants: The start and finish times of each of the n workers.
Coverage Property Requirement
For every time instant Ti, we want a worker from the Committee present.
Let w1, w2...wn be the workers, sorted by their start times s_i. (Worker w1 starts the earliest shift, and worker wn starts the very last shift.)
Introduce a new Indicator variable (boolean):
Y_i = 1 if worker i is part of the committee
Y_i = 0 otherwise.
Visualization
Now think of a 0-1 matrix, where the rows are the SORTED workers, and the columns are the time instants...
Construct a Time-Worker Matrix (0/1)
t1 t2 t3 t4 t5 t6 ... tN
-------------------------------------------
w1 1 1
w2 1 1
w3 1 1 1
w4 1 1 1
...
...
wn 1 1 1 1
-------------------------------------------
Total 2 4 3 ... ... 1 2 4 5
So the problem is to make sure that for each column, at least 1 worker is Selected to be part of the committee. The Total shows the number of candidates for the committee at each Time instant.
An Integer Programming based formulation
Objective: Minimize Sum(Y_i)
Subject to:
Y1 + Y2 >= 1 # coverage for time t1
Y1 + Y2 + Y3 >= 1 # coverage for time t2
...
More generally, the constraints are:
# Set covering constraint for time T_i
Sum of Y_j over all workers j that are working at time T_i >= 1
Y_j binary for all j
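To make the formulation above concrete, here is a minimal sketch that builds one coverage constraint per time instant and then finds a smallest committee by brute force. The brute force is only a stand-in for an IP solver and is usable for tiny inputs only. The names `coverage_constraints` and `smallest_committee`, the integer minute timestamps, the five-minute step, and the half-open shifts [start, end) are all assumptions made for illustration.

```python
from itertools import combinations

def coverage_constraints(shifts, step=5):
    """shifts: list of (start, end) in minutes, treated as [start, end).
    Returns the distinct sets of workers present at the sampled instants."""
    first = min(start for start, _ in shifts)
    last = max(end for _, end in shifts)
    constraints = set()
    for t in range(first, last, step):
        working = frozenset(
            i for i, (start, end) in enumerate(shifts) if start <= t < end
        )
        if working:                      # ignore instants where nobody works
            constraints.add(working)     # identical columns collapse into one
    return constraints

def smallest_committee(shifts, step=5):
    """Tries committees of increasing size until every constraint is met."""
    constraints = coverage_constraints(shifts, step)
    for size in range(len(shifts) + 1):
        for committee in combinations(range(len(shifts)), size):
            chosen = set(committee)
            # Each constraint says: at least one chosen worker is present.
            if all(chosen & present for present in constraints):
                return sorted(chosen)

print(smallest_committee([(0, 100), (0, 50), (50, 100), (90, 150)]))
```

Storing the constraints in a set of frozensets already fuses identical columns of the time-worker matrix, which is one of the cheap reductions that keeps the real integer program small.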
Preprocessing
This Integer program, if attempted without preprocessing can be very difficult, and end up choking the solvers. But in practice there are quite a number of preprocessing ideas that can help immensely.
Make any forced assignments. (If ever there is a time instant with only one worker working, that worker has to be in the committee C.)
Separate into nice subproblems. Look at the time-worker matrix. If there are nice 'rectangles' in it that can be cut out without impacting any other time instant, then that is a wholly separate sub-problem to solve. This makes the solver go much, much faster.
Identical shifts: If lots of workers have the exact same start and end times, then you can simply choose ANY one of them (say, the lexicographically first worker, WLOG) and remove all the other workers from consideration. (This makes a ton of difference in real-life situations.)
Dominating shifts: If one worker starts before and stays later than another worker, the 'dominating' worker can stay, and the 'dominated' worker can be removed from consideration for C.
All the identical rows (and columns) in the time-worker matrix can be fused. You only need to keep one of them. (De-duping)
You could throw this into an IP solver (CPLEX, Excel, lp_solve etc.) and you will get a solution, if the problem size is not an issue.
Hope some of these ideas help.

Algorithm design to assign nodes to graphs

I have a graph-theoretic (which is also related to combinatorics) problem that is illustrated below, and wonder what is the best approach to design an algorithm to solve it.
Given 4 different graphs of 6 nodes (by different, I mean different structures, e.g. STAR, LINE, COMPLETE, etc), and 24 unique objects, design an algorithm to assign these objects to these 4 graphs 4 times, so that the number of repeating neighbors on the graphs over the 4 assignments is minimized. For example, if object A and B are neighbors on 1 of the 4 graphs in one assignment, then in the best case, A and B will not be neighbors again in the other 3 assignments.
Obviously, the degree to which such minimization can go is dependent on the specific graph structures given. But I am more interested in a general solution here so that given any 4 graph structures, such minimization is guaranteed as the result of the algorithm.
Any suggestion/idea of solving this problem is welcome, and some pseudo-code may well be sufficient to illustrate the design. Thank you.
Representation:
You have 24 elements; I will name these elements A to X (the first 24 letters).
Each of these elements will have a place in one of the 4 graphs. I will number the 24 nodes of the 4 graphs from 1 to 24.
I will identify the position of A by a 24-tuple (xa1, xa2, ..., xa24). If I want to assign A to node number 8, for example, I will write (xa1, xa2, ..., xa24) = (0,0,0,0,0,0,0,1,0,0,...,0), where the 1 is in position 8.
We can say that A = (xa1, ..., xa24).
e1, ..., e24 are the unit vectors (1,0,...,0) to (0,0,...,1).
Note about the operator '.' (dot product):
A.e1 = xa1
...
X.e24 = xx24
With this notation there are some constraints on A, ..., X:
each xai, ..., xxi is in {0,1}
and
Sum(xai) = 1 ... Sum(xxi) = 1
Sum(xa1, xb1, ..., xx1) = 1 ... Sum(xa24, xb24, ..., xx24) = 1
since each element can be assigned to only one node, and each node holds only one element.
I will define a graph by defining the neighbor relation of each node. Let's say node 8 has neighbors node 7 and node 10.
To check that A is on node 8 with B as a neighbor, for example, I need A.e8 = 1 and (B.e7 = 1 or B.e10 = 1), so I just need A.e8*(B.e7+B.e10) == 1.
In the function isNeighborInGraphs(A,B) I test this for every node and get one or zero depending on whether A and B are neighbors.
Notations:
4 graphs of 6 nodes, the position of each element is defined by an integer from 1 to 24.
(1 to 6 for first graph, etc...)
e1... e24 are the unit vectors (1,0,0...0) to (0,0...1)
Let A, B ...X be the N elements.
A=(0,0...,1,...,0)=(xa1,xa2...xa24)
B=...
...
X=(0,0...,1,...,0)
Graph descriptions:
isNeighborInGraphs(A,B) = A.e1*B.e2 + ...
// one term per pair of neighboring nodes, for example if nodes 1 and 2 are neighbors in one graph
State of the system:
L(A) = [B,B,C,E,G,...] // list of neighbors of A (can repeat)
actualise(L(A)):
  for Element in [B, ..., X]
    if isNeighborInGraphs(A,Element)
      L(A).append(Element)
    endIf
  endFor
Objective functions
N(A) = len(L(A)) + Sum(isNeighborInGraphs(A,i), i in L(A))
...
N(X)= ...
Description of the algorithm
1. Start with an initial position: A=e1, ..., X=e24.
2. Actualise L(A), L(B), ..., L(X).
3. Solve this (with a solver; AMPL, for example, will work I guess, since it's a nonlinear optimization problem):
Objective function:
min(Sum(N(Z), Z = A to X))
Constraints:
Sum(xai) = 1 ... Sum(xxi) = 1
Sum(xa1, xb1, ..., xx1) = 1 ... Sum(xa24, xb24, ..., xx24) = 1
You get the best solution.
4. Repeat steps 2 and 3 three more times.
If all four graphs are K_6, then the best you can do is choose 4 set partitions of your 24 objects into 4 sets each of cardinality 6 so that the pairwise intersection of any two sets has cardinality at most 2. You can do this by choosing set partitions that are maximally far apart in the Hasse diagram of set partitions with partial order given by refinement. The general case is much harder, but perhaps you can still begin with this crude approximation of a solution and then be clever with which vertex is assigned which object in the four assignments.
Assuming you don't want to cycle through all combinations, calculate the sum every time, and choose the lowest, you can formulate a minimization problem with constraints on your 24 variables that depend on the shape of your path. Depending on your constraints, it can be solved using either a linear programming solver (e.g. a simplex algorithm engine) or a non-linear solver, which is much harder in terms of time. You can also use free software like LINGO/LINDO to rapidly create a decision theory model and test its correctness (you need some notions of decision theory, though).
If this has anything to do with the real world, then it's unlikely that you absolutely must have a solution that is the true minimum. Close to the minimum should be good enough, right? If so, you could repeatedly randomly make the 4 assignments and check the results until you either run out of time or have a good-enough solution or appear to have stopped improving your best solution.
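The random-restart idea in the last answer can be sketched as follows. This is a minimal sketch with hypothetical names (`count_repeats`, `random_search`); it assumes the four graphs are given as one combined edge list over nodes numbered from 0, and that an assignment is a list where position k holds the object placed on node k.

```python
import random

def count_repeats(edges, assignments):
    """Counts how many times a pair of objects is neighbors again
    after its first time, summed over all assignments."""
    seen = {}
    for placement in assignments:
        for u, v in edges:
            pair = frozenset((placement[u], placement[v]))
            seen[pair] = seen.get(pair, 0) + 1
    return sum(count - 1 for count in seen.values())

def random_search(edges, num_objects, rounds=4, tries=200, seed=0):
    """Draws `tries` random sets of `rounds` assignments and keeps the best."""
    rng = random.Random(seed)
    best, best_cost = None, None
    for _ in range(tries):
        candidate = [rng.sample(range(num_objects), num_objects)
                     for _ in range(rounds)]
        cost = count_repeats(edges, candidate)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
            if cost == 0:            # cannot do better than no repeats
                break
    return best, best_cost

# Two tiny LINE graphs on nodes 0-1-2 and 3-4-5, six objects, two rounds.
line_edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
best, cost = random_search(line_edges, 6, rounds=2, tries=50)
print(cost)
```

A natural refinement, still in the spirit of "good enough", is to replace the fresh random draw with a small local change (swap two objects in one assignment) and keep it only when the cost does not get worse.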

plane bombing problems- help

I'm practicing coding problems, and I'm having trouble solving this one. Can you give me some tips on how to solve it, please?
The problem is taken from here:
https://www.ieee.org/documents/IEEEXtreme2008_Competitition_book_2.pdf
Problem 12: Cynical Times.
The problem is something like this (but do refer to above link of the source problem, it has a diagram!):
Your task is to find the sequence of points on the map that the bomber is expected to travel such that it hits all vital links. A link from A to B is vital when its absence isolates completely A from B. In other words, the only way to go from A to B (or vice versa) is via that link.
Due to enemy counter-attack, the plane may have to retreat at any moment, so the plane should follow, at each moment, to the closest vital link possible, even if in the end the total distance grows larger.
Given all coordinates (the initial position of the plane and the nodes in the map) and the range R, you have to determine the sequence of positions in which the plane has to drop bombs.
This sequence should start (takeoff) and finish (landing) at the initial position. Except for the start and finish, all the other positions have to fall exactly in a segment of the map (i.e. it should correspond to a point in a non-hit vital link segment).
The coordinate system used will be UTM (Universal Transverse Mercator) northing and easting, which basically corresponds to a Euclidean perspective of the world (X=Easting; Y=Northing).
Input
Each input file will start with three floating point numbers indicating the X0 and Y0 coordinates of the airport and the range R. The second line contains an integer, N, indicating the number of nodes in the road network graph. Then, the next N (<10000) lines will each contain a pair of floating point numbers indicating the Xi and Yi coordinates (1 < i<=N). Notice that the index i becomes the identifier of each node. Finally, the last block starts with an integer M, indicating the number of links. Then the next M (<10000) lines will each have two integers, Ak and Bk (1 < Ak,Bk <=N; 0 < k < M) that correspond to the identifiers of the points that are linked together.
No two links will ever cross with each other.
Output
The program will print the sequence of coordinates (pairs of floating point numbers with exactly one decimal place), each one at a line, in the order that the plane should visit (starting and ending in the airport).
Sample input 1
102.3 553.9 0.2
14
342.2 832.5
596.2 638.5
479.7 991.3
720.4 874.8
744.3 1284.1
1294.6 924.2
1467.5 659.6
1802.6 659.6
1686.2 860.7
1548.6 1111.2
1834.4 1054.8
564.4 1442.8
850.1 1460.5
1294.6 1485.1
17
1 2
1 3
2 4
3 4
4 5
4 6
6 7
7 8
8 9
8 10
9 10
10 11
6 11
5 12
5 13
12 13
13 14
Sample output 1
102.3 553.9
720.4 874.8
850.1 1460.5
102.3 553.9
Pre-process the input first, so you identify the choke points. Algorithms like Floyd-Warshall would help you.
Model the problem as a heuristic search problem: you can compute an MST that covers all choke points and take the sum of the costs of its edges as the heuristic.
As the commenters said, try to make concrete questions, either here or to the TA supervising your class.
Don't forget to mention where you got these hints.
The problem can be broken down into two parts.
1) Find the vital links.
These are nothing but the Bridges in the graph described. See the wiki page (linked to in the previous sentence), it mentions an algorithm by Tarjan to find the bridges.
2) Once you have the vital links, you need to find the smallest number of points which given the radius of the bomb, will cover the links. For this, for each link, you create a region around it, where dropping the bomb will destroy it. Now you form a graph of these regions (two regions are adjacent if they intersect). You probably need to find a minimum clique partition in this graph.
Haven't thought it through (especially part 2), but hope it helps.
And good luck in the contest!
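For part 1 of the breakdown above, a minimal sketch of bridge finding with the usual discovery-time / low-link DFS (the approach attributed to Tarjan). The name `find_bridges` is hypothetical, nodes are assumed to be numbered 1..n as in the input format, and the DFS is written iteratively because the input allows close to 10000 nodes, which would exceed Python's default recursion limit.

```python
def find_bridges(n, edges):
    """n: number of nodes (1..n). edges: list of (a, b) pairs.
    Returns the vital links as a sorted list of (low_id, high_id) tuples."""
    adj = [[] for _ in range(n + 1)]
    for idx, (a, b) in enumerate(edges):
        adj[a].append((b, idx))
        adj[b].append((a, idx))
    disc = [0] * (n + 1)       # discovery time, 0 means unvisited
    low = [0] * (n + 1)        # lowest discovery time reachable from the subtree
    bridges = []
    timer = 0
    for root in range(1, n + 1):
        if disc[root]:
            continue
        timer += 1
        disc[root] = low[root] = timer
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            node, parent_edge, neighbours = stack[-1]
            descended = False
            for nxt, idx in neighbours:
                if idx == parent_edge:
                    continue                   # do not reuse the tree edge
                if disc[nxt]:
                    low[node] = min(low[node], disc[nxt])   # back edge
                else:
                    timer += 1
                    disc[nxt] = low[nxt] = timer
                    stack.append((nxt, idx, iter(adj[nxt])))
                    descended = True
                    break
            if not descended:
                stack.pop()
                if stack:
                    parent = stack[-1][0]
                    low[parent] = min(low[parent], low[node])
                    if low[node] > disc[parent]:
                        # Nothing below `node` reaches `parent` or above
                        # without this edge, so the edge is a bridge.
                        a, b = edges[parent_edge]
                        bridges.append((min(a, b), max(a, b)))
    return sorted(bridges)

sample_links = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (4, 6), (6, 7), (7, 8),
                (8, 9), (8, 10), (9, 10), (10, 11), (6, 11), (5, 12), (5, 13),
                (12, 13), (13, 14)]
print(find_bridges(14, sample_links))
```

On sample input 1 this reports the links 4-5, 4-6 and 13-14, which agrees with the sample output: the two bomb positions are node 4 (shared by two vital links) and node 13.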
I think Moron is right about the first part, but on the second part...
The problem description does not say anything about the "smallest number of points". It says that the plane flies to the closest vital link.
So, I think part 2 will be much simpler:
1. Find the closest non-hit segment to the current location.
2. Travel to the closest point on the closest segment.
3. Bomb the current location (remove all segments intersecting a circle).
4. Repeat until there are no non-hit vital links left.
This straightforward algorithm has a complexity of O(N*N), but this should be sufficient considering the input constraints.
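The first two steps of the loop above rely on one geometric primitive: the closest point on a segment to the plane's current position. A minimal sketch follows, with hypothetical names (`closest_point_on_segment`, `nearest_vital_link`) and points represented as (x, y) tuples. Removing the segments hit by the bomb (the circle test) is not covered here.

```python
import math

def closest_point_on_segment(p, a, b):
    """Returns the point on segment a-b that is closest to p."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                 # degenerate segment: a and b coincide
        return a
    # Projection parameter of p onto the line, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def nearest_vital_link(position, links):
    """links: list of ((ax, ay), (bx, by)) segments not yet hit.
    Returns (index, target_point, distance) for the closest one, or None."""
    best = None
    for i, (a, b) in enumerate(links):
        target = closest_point_on_segment(position, a, b)
        distance = math.dist(position, target)
        if best is None or distance < best[2]:
            best = (i, target, distance)
    return best

print(nearest_vital_link((0, 0), [((2, -1), (2, 1)), ((0, 3), (1, 3))]))
```

Calling `nearest_vital_link` once per bomb over the remaining links is what gives the O(N*N) bound mentioned above.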

Resources