Problem
We have 'n' interviewers, each with a free-busy schedule. We want to schedule a candidate's interviews with these 'n' interviewers, one by one. The interviews can be in any order.
Approach 1: First come first serve
Start with the first interviewer.
Create an interview slot for them based on their free-busy schedule.
Then take the next interviewer and repeat the previous step.
But with this approach we can miss feasible schedules: an early interviewer may take a slot that a later, less flexible interviewer needed.
Approach 2: Greedy algorithm
Sort the interviewers by the number of free slots in their free-busy schedules.
Create an interview slot for the interviewer with the fewest free slots first.
Repeat the previous step for the next interviewer.
Is there a more optimized or better approach to this problem?
We can re-cast this problem in terms of graph theory:
For each interviewer, generate a node. For each interview slot, generate a node too. Now create edges between interviewer nodes and slot nodes for those combinations where the interviewer is available for that slot. This will give you a bipartite graph.
The goal now is to find a maximum cardinality matching, i.e. a matching of interviewers to slots for as many interviewers as possible. Hence, if a full solution is not possible, it will even give you a partial schedule. Common algorithms for this are the Ford-Fulkerson Algorithm (or Edmonds-Karp Algorithm) and the Hopcroft-Karp Algorithm.
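As a rough illustration of the matching step, here is a minimal augmenting-path matcher in Python (this is Kuhn's algorithm, a simpler relative of Hopcroft-Karp, and the availability data below is made up):

def max_matching(available):
    # available: {interviewer: [slot, ...]} listing the slots each one is free for
    match = {}                                # slot -> interviewer currently assigned
    def try_assign(person, seen):
        for slot in available[person]:
            if slot in seen:
                continue
            seen.add(slot)
            # take a free slot, or evict the current holder if they can move elsewhere
            if slot not in match or try_assign(match[slot], seen):
                match[slot] = person
                return True
        return False
    for person in available:
        try_assign(person, set())
    return {person: slot for slot, person in match.items()}

available = {"A": ["9am", "10am"], "B": ["9am"], "C": ["10am", "11am"]}
print(max_matching(available))   # {'B': '9am', 'A': '10am', 'C': '11am'}

Interviewers missing from the returned dictionary could not be matched, which is exactly the partial schedule mentioned above.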
Related
I am unable to come up with a formal proof of optimality for algorithm A below. I have convinced myself that some optimal schedule O can be executed in increasing order of the events' deadlines, but I don't know how to formally prove that the extract_max operation converges to an optimal solution.
Problem
Given a list of events, each with a deadline date 'd' and a duration of 'l' days, provide an algorithm that selects the maximum number of events. Each event must be scheduled so that it finishes by its deadline 'd', it must run continuously for its 'l'-day duration, and only one event can run at any given time.
**Greedy Algorithm A:**
import heapq

def schedule(events):                    # events: list of (deadline, length)
    events.sort()                        # sort events by their deadline (increasing)
    S, total = [], 0                     # S: max-heap of scheduled lengths (negated)
    for deadline, length in events:
        if total + length <= deadline:   # event fits: incorporate it into S
            heapq.heappush(S, -length)
            total += length
        elif S and -S[0] > length:       # longest event in S is longer than this one:
            total += length + heapq.heappop(S)   # extract_max, then swap in this event
            heapq.heappush(S, -length)
    return S
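A quick check on a made-up input, with events given as (deadline, length) pairs:

events = [(2, 1), (3, 2), (2, 1)]
print(len(schedule(events)))   # 2 -- both 1-day events fit; the 2-day event
                               # cannot, and swapping it in would not help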
We can prove this by contradiction. Assume that the greedy choice were not part of the optimal solution; that is, if we consider the tasks sorted in ascending order of deadline, the optimal solution wouldn't include the one with the earliest deadline.

Now, consider the task with the earliest deadline in any hypothetical optimal solution. Either it overlaps with the greedy choice (in which case we might as well have chosen the greedy choice, since it finishes no later than the earliest task in the optimal solution, and cannot overlap with any earlier tasks in the optimal solution, since it was the earliest task); or else it doesn't overlap, in which case the optimal solution wasn't optimal (since we could also freely have included the greedy choice in it).

In both cases we have a contradiction (in the first, that the greedy choice couldn't have been picked; in the second, that the solution without the greedy choice was optimal), and so the assumption that the optimal solution doesn't contain the greedy choice was wrong; the optimal solution does include the greedy choice.
I don't get it. I really don't. To me, a greedy algorithm only cares about:
Dividing a problem into stages (subproblems).
Maximizing/minimizing, or otherwise optimizing, the output in each stage, irrespective of later stages or anything else.
Even the 0/1 Knapsack Problem is solved using the same theory.
The stages become the various items to consider.
Optimizing the output in each stage becomes picking the item with the most profit first, then the next most profitable item, and so on.
It's the same approach we follow for both knapsack problems. The only difference is:
In fractional knapsack: we maximize profit by picking the item with the highest PROFIT/WEIGHT ratio. Why? Because items can be divided.
In 0/1 knapsack: we maximize profit by simply picking the item with the most profit. Since items cannot be divided, we don't bother computing profit/weight, as it makes no difference.
Both should therefore fall under greedy algorithms.
I'm just not able to understand where the concept of dynamic programming comes in.
Greedy algorithms are not about dividing the problem and solving it in parts; that is in fact DP or backtracking. In a greedy algorithm we choose the best option available at the moment, commit to it, and then solve the rest. The best example of this is Dijkstra's algorithm.
In 0/1 knapsack we don't know in advance which items will give us the maximum total profit, so we have to try all of them. To avoid computing the same subproblems again and again, we store the intermediate results, and that's why it falls under DP.
In fractional knapsack, we take the item with the best profit-to-weight ratio at the moment, pick it, and continue with the remaining capacity.
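A small sketch (with made-up item values) of why the ratio-greedy rule that is optimal for fractional knapsack fails for 0/1 knapsack, while the standard DP does not:

def knapsack_01_dp(items, capacity):
    # items: list of (profit, weight); classic O(n * capacity) table, one row reused
    best = [0] * (capacity + 1)
    for profit, weight in items:
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + profit)
    return best[capacity]

def knapsack_01_greedy(items, capacity):
    # greedy by profit/weight ratio -- the fractional-knapsack rule, applied to 0/1
    total = 0
    for profit, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total += profit
    return total

items = [(60, 10), (100, 20), (120, 30)]    # hypothetical (profit, weight) pairs
print(knapsack_01_greedy(items, 50))        # 160: grabs the two best ratios, then gets stuck
print(knapsack_01_dp(items, 50))            # 220: the optimum is (100, 20) + (120, 30)

Because items cannot be split, the locally best ratio can block a globally better combination; that is the gap the DP closes by trying every item against every remaining capacity.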
I'm trying to use A* to find the optimal path in a Graph.
The context is that a tourist starts at his hotel, visits landmarks, and returns to his hotel at the end of the day. The nodes (landmarks) have two values: importance and time spent. Edges have two values: time spent and cost (currency).
I want to minimize cost, maximize importance, and keep the total time under a certain value. I can find a balance between cost, importance, and time for the path cost accumulated so far, but what about the future cost? I know how to do it with simpler pathfinding, but is there a method I could follow to find the heuristic I need?
You have a multi-dimensional objective (cost and importance), so your problem is ill-posed: you haven't defined how to trade off cost against importance while you are "minimizing" cost and "maximizing" importance. You'll only get a partial ordering on sites to visit, because some pairs of sites may be incomparable: one site may have higher cost and higher importance, while the other has lower cost and lower importance. Look up the multi-objective knapsack problem if you want concrete ways to formalize and solve your problem. And beware -- it can get a bit hairy the deeper you go into the literature.
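As a small illustration of that partial ordering (the site data here is made up), a Pareto filter over (cost, importance) pairs:

def dominates(a, b):
    # a and b are (cost, importance) pairs; a dominates b if it is no worse on
    # both criteria and strictly better on at least one
    return a[0] <= b[0] and a[1] >= b[1] and a != b

sites = {"museum": (10, 8), "park": (2, 3), "tower": (10, 2), "cathedral": (5, 6)}
pareto = [name for name, v in sites.items()
          if not any(dominates(w, v) for w in sites.values())]
print(pareto)   # ['museum', 'park', 'cathedral']: tower is dominated by museum,
                # while the rest are mutually incomparable, so there is no single
                # "best" site until you decide how to weigh cost against importance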
Let's say we have a set of intervals
[s1, e1], [s2, e2], ..., [sn, en]
I would like to find the subset of non-overlapping intervals that has the maximum aggregate time.
Actually, I'm looking for a greedy solution. Does one exist or not?
"Greedy" is not a formal term, but for the purpose of this question, let's define the class of greedy algorithms to be those that impose an a priori total order on intervals (i.e., independent of the input) and repeatedly extend the partial solution by the maximum available interval. Consider the inputs
[0,2],[1,4],[3,5]
[0,2],[1,4]
[1,4],[3,5].
There are three possibilities for the maximum interval among [0,2],[1,4],[3,5]. If [0,2] or [3,5] is maximum, then the greedy algorithm answers incorrectly for the second or third input respectively. If [1,4] is maximum, then the greedy algorithm answers incorrectly for the first input.
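To make that concrete, here is a small sketch (not part of the original argument) that runs such a fixed-order greedy on the three inputs; the priority function stands in for whatever a priori order one might pick:

def greedy_total(intervals, priority):
    # repeatedly extend the solution by the highest-priority non-overlapping interval
    chosen = []
    for s, e in sorted(intervals, key=priority, reverse=True):
        if all(e <= cs or s >= ce for cs, ce in chosen):
            chosen.append((s, e))
    return sum(e - s for s, e in chosen)

inputs = [[(0, 2), (1, 4), (3, 5)], [(0, 2), (1, 4)], [(1, 4), (3, 5)]]
by_length = lambda iv: iv[1] - iv[0]          # an order that ranks [1,4] highest
print([greedy_total(iv, by_length) for iv in inputs])   # [3, 3, 3]

The optimum for the first input is 4 (take [0,2] and [3,5]), so this order fails there; any order that instead ranks [0,2] or [3,5] above [1,4] fails on the second or third input, which is exactly the case analysis above.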
I'm developing a trip planner program. Each city has a property called rateOfInterest. Each road between two cities has a time cost. The problem is: given the start city and a specific amount of time we want to spend, how do we output the most interesting path (i.e. the one maximizing the sum of the visited cities' rateOfInterest)? I'm thinking of using some greedy algorithm, but is there any algorithm that can guarantee an optimal path?
EDIT Just as @robotking said, we allow visiting places multiple times, and only the first visit to a city counts toward interest. We have 50 cities, and each city has approximately 5 adjacent cities. The cost function on each edge is either time or distance. We don't have to visit all cities; given the cost budget, we need to return an optimal partial trip with the highest ROI. I hope this makes the problem clearer!
This sounds very much like a weighted instance of the TSP, meaning that some vertices are more desirable than others...
Now, you could find an optimal path by trying every possible permutation (using backtracking with some pruning to make it faster), depending on the number of cities we are talking about. The TSP is an n! problem, so beyond n > 10 you can forget it...
If your number of cities is not that small, then finding an optimal path won't be doable, so drop that idea... however, there is most likely a heuristic algorithm that approximates a good enough solution.
Steven Skiena recommends "Simulated Annealing" as the heuristic of choice for approximating such hard problems. It is very much like "Hill Climbing", but more flexible and forgiving: while in hill climbing you only ever accept changes that improve your solution, in simulated annealing there are cases where you accept a change even if it makes your solution locally worse, hoping that down the road you get your money back...
Either way, whatever is used to approximate a TSP-like problem is applicable here.
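A bare-bones sketch of that acceptance rule (my own illustration; the neighbour and score functions are placeholders you would supply for the trip-planning problem):

import math, random

def anneal(initial, neighbour, score, temp=1.0, cooling=0.995, steps=10000):
    # neighbour(state) proposes a small change; score(state) is the value to maximize
    current = best = initial
    for _ in range(steps):
        candidate = neighbour(current)
        delta = score(candidate) - score(current)
        # always accept improvements; accept worse moves with a probability that
        # shrinks as the temperature cools -- the "forgiving" part of the method
        if delta >= 0 or random.random() < math.exp(delta / temp):
            current = candidate
            if score(current) > score(best):
                best = current
        temp *= cooling
    return best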
From http://en.wikipedia.org/wiki/Travelling_salesman_problem, note that the decision-problem version is "(where, given a length L, the task is to decide whether any tour is shorter than L)". If somebody gives me a travelling salesman problem to solve, I can set all the cities to have the same rate of interest, and then the decision problem is whether a most interesting path for time L actually visits all the cities and returns.
So if there were an efficient solution to your problem, there would be an efficient solution to the travelling salesman problem, which is unlikely.
If you want to go further than a greedy search, some of the approaches to the travelling salesman problem may be applicable - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.5150 describes "Iterated Local Search", which looks interesting with reference to the TSP.
If you want optimality, use a brute-force exhaustive search in which the leaves are the states where the time runs out. As long as the expected depth of the search tree is less than 10, and the worst case less than 15, you can produce a practical algorithm.
Now if you think about the future and expect your city network to grow, then you cannot ensure optimality. In this case you are dealing with a local search problem.
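A rough sketch of that exhaustive search (my own illustration; it assumes, per the question's edit, that only a city's first visit contributes interest, and all names are hypothetical):

def best_trip(graph, interest, start, budget):
    # graph: {city: {neighbour: travel_time}}; interest: {city: rateOfInterest}
    # Returns the highest total interest collectable within the time budget.
    best = [0]
    def dfs(city, time_left, visited, score):
        best[0] = max(best[0], score)          # every state is a candidate answer
        for nxt, cost in graph[city].items():
            if cost <= time_left:              # recurse while time remains; otherwise this branch is a leaf
                gain = 0 if nxt in visited else interest[nxt]
                dfs(nxt, time_left - cost, visited | {nxt}, score + gain)
    dfs(start, budget, frozenset([start]), 0)
    return best[0]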