How do I prove a class-room scheduling issue to be NP complete correctly? - complexity-theory

I am given a problem about scheduling n classes in k rooms at a school. It is a decision problem: we ask whether the n classes can be arranged in the k rooms so that a given time limit t is not exceeded (the total time of the classes under the chosen schedule must not exceed t).
I know that I first have to show that a proposed solution can be verified in polynomial time, but when it comes to reducing a known NP-complete problem to the classroom scheduling problem, I do not know which NP-complete problem to pick.
I was thinking about reducing from the Traveling Salesman Problem, but I am not sure how to express my classroom scheduling problem as a graph. My first attempt is to treat the classes as vertices, the rooms as colours, and the time for a class as a weighted edge between two classes (I am quite unsure about the latter two). I don't know whether this follows a standard pattern for scheduling problems, or whether the Traveling Salesman Problem is even a good NP-complete problem to reduce to the classroom scheduling problem. If it is not, I would like to know examples of more suitable NP-complete problems to reduce from.
Thanks in advance!

You can use map coloring (graph coloring) for it. You just need to define rules for the edges and nodes: nodes are classes, rooms are colors, and you connect classes that can't be held at the same time. This is actually the k-coloring problem, where you need to color a specific graph with k colors while keeping the number of classes per color small. In this special case you just need at most t classes per color. You can achieve this by following the standard coloring rule and switching to a new color as soon as the current one has t classes.
This is still an NP-complete problem. The only exception is when you have 1 or 2 rooms; then it can be solved in polynomial time. With 1 room, you just need to check that n <= t. With 2 rooms, you need to check whether the graph can be colored with 2 colors. You can do this with a DFS (but first check that n <= 2t), coloring odd steps with the first color and even steps with the second. If all nodes can be colored this way, you have a positive answer. When k >= 3, it's NP-complete.
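For the "verify in polynomial time" half of the proof that the question mentions, here is a minimal sketch under the coloring model of this answer (rooms as colors, at most t classes per room). The function name `verify_schedule` and the input encoding are my own assumptions, not something given in the question.

```python
from collections import Counter

def verify_schedule(n, k, t, conflicts, room_of):
    """Check a proposed assignment in O(n + len(conflicts)) time.

    n: number of classes, numbered 0..n-1
    k: number of rooms (colors), numbered 0..k-1
    t: maximum number of classes allowed per room (assumed reading of t)
    conflicts: pairs (a, b) of classes joined by an edge
    room_of: list where room_of[c] is the room proposed for class c
    """
    if len(room_of) != n:
        return False
    # Every class must be placed in one of the k rooms.
    if any(not 0 <= r < k for r in room_of):
        return False
    # No room may hold more than t classes.
    if any(count > t for count in Counter(room_of).values()):
        return False
    # Classes joined by an edge must get different colors (rooms).
    return all(room_of[a] != room_of[b] for a, b in conflicts)
```

Since the check is linear in the size of the input, a proposed assignment serves as a certificate, which is what membership in NP requires.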

Related

Weighted Interval Scheduling with dependent jobs/job with multiple required running time

Interval scheduling algorithms are pretty much based around sorting jobs by end time, but what if scheduling job A means you must also schedule job C?
For instance, say you are trying to schedule radio programs and program A runs Monday 10am-11am and 2pm-3pm, but program B runs Monday 1:30-2:30? You can't run only the 10-11 portion of program A. It's all or nothing. Alternatively, say the program runs Mon, Wed, Fri but at different times each day.
Ideas I've played around with:
A shortest-path algorithm where you simultaneously traverse 7 graphs, one for each day of the week, each graph sorted to connect only programs coming after. If you choose program A on Monday, you choose it on all days, and so on. This doesn't solve the issue if the program needs to run twice in one day.
Generating an n by n matrix for the n programs and checking each program's compatibility with the others, then traversing a graph where each program only connects with non-conflicting programs. I'm a bit stuck on this idea and looking for next steps or new ideas entirely.
I just re-asked this question on CS Stack Exchange since this one is old and the equivalencies in the accepted answer are a bit tenuous. Re-posting my answer here for higher visibility:
This is equivalent to the maximum weighted independent set problem - given a graph with weighted vertices, find a subset of the vertices such that no two vertices in the subset are adjacent in the graph and the sum of their weights is maximized.
The graph the problem is being solved on is a type of interval graph where each vertex represents a group of dependent intervals and its weight is the sum of the intervals' weights. An edge is drawn between two vertices if one or more of their intervals overlap.
I haven't determined if this formulation is tractable or not. While the problem is NP-hard, we know it can be solved efficiently for certain types of graphs, including interval graphs where each vertex represents a single interval. See the Independent Set Wikipedia page for known algorithms.
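To make the formulation concrete, here is a small sketch that builds the conflict relation between programs and finds the best set by exhaustive search. It is exponential in the number of programs, so it is only meant for tiny inputs. The dictionary layout (`slots`, `weight`) and the encoding of times as minutes since Monday midnight are assumptions of mine.

```python
from itertools import combinations

def overlaps(a, b):
    """Half-open intervals (start, end) overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]

def conflict(p, q):
    """Two programs conflict if any of their time slots overlap."""
    return any(overlaps(a, b) for a in p["slots"] for b in q["slots"])

def best_lineup(programs):
    """Brute-force maximum weighted independent set over the programs."""
    names = list(programs)
    best_weight, best_set = 0, ()
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            # Skip subsets containing two programs that clash.
            if any(conflict(programs[x], programs[y])
                   for x, y in combinations(subset, 2)):
                continue
            weight = sum(programs[x]["weight"] for x in subset)
            if weight > best_weight:
                best_weight, best_set = weight, subset
    return best_weight, set(best_set)

# Hypothetical data: A runs 10-11am and 2-3pm, B runs 1:30-2:30pm, C runs 11am-12pm.
programs = {
    "A": {"slots": [(600, 660), (840, 900)], "weight": 3},
    "B": {"slots": [(810, 870)], "weight": 2},
    "C": {"slots": [(660, 720)], "weight": 1},
}
print(best_lineup(programs))
```

Here A and B clash because of A's afternoon slot, so A is either taken with both of its slots or not at all, which is exactly the all-or-nothing constraint from the question.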
My rule of thumb for scheduling is that almost everything is NP-complete except a few special cases. Suppose you could find a schedule that filled up every hour in the day, given possible programs that required an arbitrary number of disconnected time-slots. Then you could solve https://en.wikipedia.org/wiki/Exact_cover - the elements of X are time-slots, and the subsets S are programs. An exact cover corresponds to scheduling programs that fill every time-slot without overlapping each other.
I think this means you are looking for heuristics, such as Late Acceptance Hill-Climbing (http://www.yuribykov.com/LAHC/), limited discrepancy search (http://wiki.cs.pdx.edu/cs543-spring2010/important_algorithms.html), and ordinary hill-climbing from multiple random starts. I suggest that, whatever else you do, you conclude with a hill-climb designed to spot small improvements that people can spot, to make sure your computer doesn't produce a schedule that people can make obvious improvements to.

Maximum product perfect matching in complete bipartite graphs

I am trying to solve this problem: Jobs.
So far I have thought that the problem is the same as the Assignment Problem, with the distributors and districts represented as a bipartite graph and the edges representing the probabilities. But here we would need to maximize the product rather than the sum of the weights of the matched edges.
One idea that came to my mind was to change each edge weight to log(weight). Then the problem essentially changes to finding the maximum sum, which can then be solved using the algorithms for the Assignment Problem. But this poses a problem, since applying log will make the edge weights non-integer, something which I think the Hungarian Algorithm does not work for.
Please suggest some other alternative approach.
In theory, the Hungarian algorithm works fine with real weights.
In practice, since most logarithms cannot be represented exactly as floating-point numbers, rounding could change the optimal solution. There are ways to deal with that even so, but it's unlikely that you'll need them for this programming contest problem.
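As a quick sanity check of the log idea, the sketch below compares the matching that maximizes the product with the one that maximizes the sum of logarithms. It uses brute force over all permutations as a stand-in for the Hungarian algorithm, so it only works for very small matrices. It also assumes every weight is strictly positive (log of 0 is undefined), and the sample matrix is made up.

```python
import math
from itertools import permutations

def best_by_product(p):
    """Assignment (row i -> column perm[i]) with the largest product of weights."""
    n = len(p)
    return max(permutations(range(n)),
               key=lambda perm: math.prod(p[i][perm[i]] for i in range(n)))

def best_by_log_sum(p):
    """Same search, but scoring by the sum of log-weights instead."""
    n = len(p)
    return max(permutations(range(n)),
               key=lambda perm: sum(math.log(p[i][perm[i]]) for i in range(n)))

# Hypothetical success probabilities: rows are distributors, columns are districts.
p = [
    [0.9, 0.5, 0.1],
    [0.4, 0.8, 0.3],
    [0.2, 0.6, 0.7],
]
print(best_by_product(p), best_by_log_sum(p))
```

Because log is strictly increasing, both searches rank the matchings in the same order, apart from floating-point rounding on near-ties.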

Is this MSP an instance of the TSP?

In his book, The Algorithm Design Manual, Steven S. Skiena poses the following problem:
            
Now consider the following scheduling problem. Imagine you are a highly in-demand actor, who has been presented with offers to star in n different movie projects under development. Each offer comes specified with the first and last day of filming. To take the job, you must commit to being available throughout this entire period. Thus you cannot simultaneously accept two jobs whose intervals overlap.
For an artist such as yourself, the criteria for job acceptance is clear: you want to make as much money as possible. Because each of these films pays the same fee per film, this implies you seek the largest possible set of jobs (intervals) such that no two of them conflict with each other.
For example, consider the available projects in Figure 1.5 [above]. We can star in at most four films, namely “Discrete” Mathematics, Programming Challenges, Calculated Bets, and one of either Halting State or Steiner’s Tree.
You (or your agent) must solve the following algorithmic scheduling problem:
Problem: Movie Scheduling Problem
Input: A set I of n intervals on the line.
Output: What is the largest subset of mutually non-overlapping intervals which can be selected from I?
I wonder, is this an instance of the TSP (perhaps a simplified one)?
This problem can be solved by simply choosing the film with the earliest finish date, discarding the films that overlap it, and proceeding from there. Done naively that is an O(n^2) process; sorting the films by finish date first brings it down to O(n log n). Since we've found a polynomial time solution, it's not an instance of TSP, unless: (1) P=NP, and (2) there's an embarrassingly easy proof of (1).
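A minimal sketch of that earliest-finish-first greedy, assuming each film is given as a (first_day, last_day) pair with both days inclusive, so two films that share a day conflict. The function name and the sample data are mine.

```python
def max_non_overlapping(intervals):
    """Return a largest set of mutually non-overlapping intervals.

    intervals: list of (first_day, last_day), inclusive on both ends.
    """
    chosen = []
    last_end = None
    # Consider films in order of finish date; sorting dominates the running time.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Accept the film only if it starts after the last accepted one ends.
        if last_end is None or start > last_end:
            chosen.append((start, end))
            last_end = end
    return chosen

films = [(1, 3), (2, 5), (4, 7), (6, 9), (8, 10)]
print(max_non_overlapping(films))  # [(1, 3), (4, 7), (8, 10)]
```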
Here's how to approach this problem:
Create a graph with the films as vertices and the overlaps as edges. That is, two vertices are connected by an edge iff their schedules overlap. The problem can then be restated like this: "Given a graph G, find the maximum subset of vertices no two of which are connected."
Now one can relate the above problem to a known NP-hard problem. Specifically, create a graph G' with the same vertices and complementary edges. That is, G' has an edge between two vertices iff the original graph G doesn't have it. Now the problem can be restated like this: "Given a graph G', find the maximum clique, where a clique is a subset of vertices all of which are connected to each other."
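Clique is a well-known NP-hard problem. Note, though, that this turns the scheduling problem into Clique, not the other way round, so it only shows that scheduling is no harder than Clique; it does not make scheduling NP-hard. Here G is an interval graph, and on interval graphs a maximum independent set can be found in polynomial time.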

Best subsample in the Maxmin distance sense

I have a set of N points in a D-dimensional metric space. I want to select K of them in such a way that the smallest distance between any two points in the subset is the largest.
For instance, with N=4 and K=3 in 3D Euclidean space, the solution is the face of the tetrahedron having the longest short side.
Is there a classical way to achieve that? Can it be solved exactly in polynomial time?
I have googled as much as I could, but I have not yet figured out what this problem is called.
In my case N=50, K=10 and D=300 typically.
Clarification:
A brute-force approach would be to try every combination of K points among the N and determine the closest pair in every subset. The solution is the subset whose closest pair is the farthest apart.
Done the trivial way, that is an O(K^2) process, to be repeated N! / (K! (N-K)!) times.
Hmm, 10^2 * 50! / (10! 40!) = 1027227817000
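For reference, the brute-force search described above can be sketched as follows, assuming Euclidean distance and points given as coordinate tuples. The name `maxmin_subset` is made up, and K must be at least 2. It is only practical for small N; the last line reproduces the subset count for N=50, K=10.

```python
import math
from itertools import combinations

def maxmin_subset(points, k):
    """Return the k-subset whose closest pair of points is farthest apart."""
    def closest_pair(subset):
        # O(k^2) scan over all pairs in the subset.
        return min(math.dist(p, q) for p, q in combinations(subset, 2))
    # Exhaustive search over all C(N, k) subsets.
    return max(combinations(points, k), key=closest_pair)

points = [(0, 0), (1, 0), (0, 1), (5, 5)]
print(maxmin_subset(points, 3))
print(math.comb(50, 10))  # number of subsets to try for N=50, K=10
```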
I think you might find papers on unit disk graphs informative but discouraging. For instance, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.3113&rep=rep1&type=pdf states that the maximum independent set problem on unit disk graphs is NP-complete, even if the disk representation is known. A unit disk graph is the graph you get by placing points in the plane and forming links between every pair of points at most a unit distance apart.
So I think that if you could solve your problem in polynomial time you could run it on a unit disk graph for different values of K until you find a value at which the smallest distance between two chosen points was just over one, and I think this would be a maximum independent set on the unit disk graph, which would be solving an NP-complete problem in polynomial time.
(Just about to jump on a bicycle so this is a bit rushed, but searching for papers on unit disk graphs might at least turn up some useful search terms)
Here is another attempt to relate the two problems, piece by piece:
For maximum independent set see http://en.wikipedia.org/wiki/Maximum_independent_set#Finding_maximum_independent_sets. A decision problem version of this is "Are there K vertices in this graph such that no two are joined by an edge?" If you can solve this you can certainly find a maximum independent set by finding the largest K by asking this question for different K and then finding the K nodes by asking the question on versions of the graph with one or more nodes deleted.
I state without proof that finding the maximum independent set in a unit disk graph is NP-complete. Another reference for this is http://web.sau.edu/lilliskevinm/wirelessbib/ClarkColbournJohnson.pdf.
A decision version of your problem is "Do there exist K points with distance at least d between any two of them?" (d is a distance threshold here, not the dimension D of your space). Again, you can solve this in polynomial time iff you can solve your original problem in polynomial time: play around until you find the largest d that gives the answer yes, and then delete points and see what happens.
A unit disk graph has an edge exactly when the distance between two points is 1 or less, so an independent set is a set of points whose pairwise distances are all greater than 1. So if you could solve the decision version of your original problem, you could solve the decision version of the unit disk graph problem just by setting d just above 1 and solving your problem.
So I think I have constructed a series of links showing that if you could solve your problem you could solve an NP-complete problem by turning it into your problem, which causes me to think that your problem is hard.

Longest circle in graphs

I want to solve the following problem:
I have a DAG which contains cities and the jobs between them that need to be done. The jobs are for trucks, which can carry a defined load limit. The more the truck is loaded, the better the tour. Some jobs are for loading something and some are for unloading defined things. You can always drive from city a to city b even if there is no job to be done between them.
The last restriction is that I always need to start in city a and return to a, because that is where the trucks are based :)
I first thought of Dijkstra's shortest path algorithm, which I could easily turn into a longest-path calculation. My problem now is that all these algorithms calculate a shortest or longest path from vertex a to vertex b, but I need one from a returning to a, i.e. a cycle.
Does anyone have some pointers for me?
Thanks for your feedback!
Marco
This crazy combination of knapsack and travelling salesman is surely NP-hard.
Virtually everywhere, when you want to load your agent with an optimal job schedule, or build a route through all vertices of a graph, or feel that you need to look for a "longest path"*, you most likely run into an NP-complete or NP-hard problem.
That means that there is no known fast and exact solution to the problem, i.e. one that runs in polynomial time.
So you have to create approximations and implement non-optimal algorithms based on your particular conditions. What time loss is acceptable? Are there additional patterns the trucks can drive? Do you know more about the graph (e.g. is the area divided into distant dense regions)? Answer these questions and you'll find a non-strict heuristic that satisfies your customers.
*Yes, searching for longest paths is not as easy as you might think. The longest path problem is NP-hard on general graphs; it is only easy when the graph is acyclic.
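For contrast, here is a sketch of the easy case: on a DAG, the longest path between two vertices can be found by memoized recursion over successors. The adjacency format and the function name are assumptions of mine. It does not handle the return trip to the home city or the load limit, and on a graph with cycles the recursion would not terminate.

```python
from functools import lru_cache

def longest_path_length(edges, source, target):
    """Longest weighted path from source to target in a DAG.

    edges: dict mapping node -> list of (successor, weight).
    Returns None if target cannot be reached from source.
    """
    @lru_cache(maxsize=None)
    def best_from(node):
        if node == target:
            return 0
        best = None
        for nxt, weight in edges.get(node, []):
            rest = best_from(nxt)
            # Ignore successors from which the target is unreachable.
            if rest is not None and (best is None or weight + rest > best):
                best = weight + rest
        return best

    return best_from(source)

edges = {
    "a": [("b", 2), ("c", 5)],
    "b": [("d", 4)],
    "c": [("d", 3)],
    "d": [],
}
print(longest_path_length(edges, "a", "d"))  # 8, via a -> c -> d
```

Each vertex is solved once, so the running time is linear in the number of vertices and edges; very deep graphs may need an iterative version to avoid Python's recursion limit.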
You're trying to find the smallest possible way to get everything done? This reminds me of a max-flow/min-cut problem. You might be able to approximate the best answer by:
Connect all terminal nodes to a final end node.
Run one of the various maximum flow algorithms to find the max flow between a and end.
Return to city a. Update the graph to reflect what you just did. Repeat until all jobs are done.
The idea is that you get the most bang for every trip. Each trip after the first will be less and less efficient, but that's to be expected.
Note: This only works because you have a DAG. Travelling salesman wouldn't be NP-Complete on a DAG, either, and it will likely be impossible to even hit all nodes on the graph. Re-reading your problem, it seems like you don't have a DAG, since you can return to city a - is that true?
You can adjust the dynamic programming algorithm for the traveling salesman problem to do what you want.
The classic approach optimizes the objective over all cities, but at each step you can also take into consideration the possibility of returning home.
And like Pavel mentioned, this problem is definitely NP-hard. Do you have some upper bounds for the number of cities or the maximum number of objects that can be loaded in a truck?
PS: Do you want the BEST solution (maximum profit, which might not be realistic in terms of processing time), or would you accept some approximation?
Isn't this a Transportation problem?
Depending on the number of trucks and their starting points, you could add fake transportations or add costs in order to satisfy your restrictions.
I'd also ask about truck restrictions:
are they all based in the same city?
do you have a fixed number of them? And what do you gain if you use fewer than you have?
is there a cycle time restriction?

Resources