Vehicle usage optimization using GTFS - algorithm

I have a GTFS feed defined for my fleet. It describes the routes, trips and timings. Using this GTFS feed, is it possible to optimize the utilization of my fleet's vehicles? Can I schedule the vehicles so that once a vehicle completes a trip, it can be assigned to serve a trip on another route?
I have constraints such as: no vehicle should be running for more than 12 hours, every vehicle must undergo a 2-hour health check, etc.
To me this sounds like a case of the Knapsack problem.
If such a project exists, kindly let me know. Is there an algorithm that can solve this problem?
Thanks,
Yash

You're asking a question that is typically handled by a scheduling system, one which would produce the GTFS files from the get-go. In smaller systems this actually is not difficult to do, but as the number of routes (or "trip patterns") increases, the process gets more complex.
Before you undertake any project like this, I suggest reading over the TCRP manual on scheduling, paying close attention to the terms "cycle time," "headway," and "interlining."
While I'd love to help more, I don't have time right now to get into the specifics. I performed a similar analysis with automatically collected cycle times on a limited set of routes in my master's thesis, starting on page 118.
I hope this helps. If you have any follow-up questions, post a comment and I'll respond when I have time.
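To give a concrete flavour of what such a scheduler does, here is a minimal greedy "trip chaining" sketch in Python. It is only an illustration, not the TCRP method: the column names are the standard GTFS stop_times.txt fields, but everything else (the 5-minute layover, chaining only at the same stop, ignoring deadheading and the 2-hour health check) is a simplifying assumption.

import csv
from collections import defaultdict

MAX_DUTY_SECONDS = 12 * 3600   # the 12-hour rule from the question
MIN_LAYOVER = 5 * 60           # assumed recovery time between trips

def hms_to_seconds(hms):
    # GTFS times may exceed 24:00:00 for after-midnight trips; that is fine here
    h, m, s = map(int, hms.split(":"))
    return h * 3600 + m * 60 + s

def load_trips(stop_times_path):
    # Collapse stop_times.txt into one record per trip
    rows = defaultdict(list)
    with open(stop_times_path, newline="") as f:
        for r in csv.DictReader(f):
            rows[r["trip_id"]].append(r)
    trips = []
    for trip_id, stops in rows.items():
        stops.sort(key=lambda r: int(r["stop_sequence"]))
        trips.append({
            "trip_id": trip_id,
            "start": hms_to_seconds(stops[0]["departure_time"]),
            "end": hms_to_seconds(stops[-1]["arrival_time"]),
            "first_stop": stops[0]["stop_id"],
            "last_stop": stops[-1]["stop_id"],
        })
    return sorted(trips, key=lambda t: t["start"])

def build_blocks(trips):
    # Greedily chain trips onto vehicles ("blocks"), allowing interlining:
    # a vehicle can take any trip that departs from the stop where it is waiting.
    blocks = []
    for trip in trips:
        vehicle = None
        for b in blocks:
            within_duty = trip["end"] - b["first_start"] <= MAX_DUTY_SECONDS
            if (b["last_stop"] == trip["first_stop"]
                    and b["free_at"] + MIN_LAYOVER <= trip["start"]
                    and within_duty):
                vehicle = b
                break
        if vehicle is None:
            vehicle = {"trips": [], "last_stop": None,
                       "free_at": 0, "first_start": trip["start"]}
            blocks.append(vehicle)
        vehicle["trips"].append(trip["trip_id"])
        vehicle["last_stop"] = trip["last_stop"]
        vehicle["free_at"] = trip["end"]
    return blocks

blocks = build_blocks(load_trips("stop_times.txt"))
print(len(blocks), "vehicles needed")

A real scheduler would treat a result like this only as a starting point: it ignores deadheading between terminals and driver rules, and a better assignment can usually be found by swapping trips between blocks.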

Related

Need advice about an algorithm to solve a quite specific Job Shop Scheduling Problem

Job Shop Scheduling Problem (JSSP): I have jobs that consist of tasks and I have machines that can perform these tasks.
I should be able to add new jobs dynamically. E.g. I have a schedule for the first 5 jobs, and when the 6th arrives I need to be able to fit it into the schedule in the best way. It is possible to adjust the existing schedule within the given flexibility constraints.
Jobs have tasks, and each task is the same type of action. Think of painting objects with a paint sprayer. All the machines are the same (paint sprayers), and all the tasks are the same.
Constraint 1. Jobs have a preferred deadline for completion, but the deadline is flexible to some extent.
Edit after tucuxi's answer: a flexible deadline means that the time of completion can be extended by some delta if necessary.
Constraint 2. Between jobs there is a resting phase. Think of the paint drying. The resting phase has a minimal required duration, but it can be longer or shorter if necessary.
Edit after tucuxi's answer: So there is a planned rest time Tp, which is the desired but flexible value that can be increased or decreased if this allows for better scheduling, and there is a minimal rest time Tm. So Tp - Tadjustment >= Tm.
The machine is occupied by the job from the start to the completion.
Here are the parts that make this problem quite distinct from what I have read about.
Jobs arrive in batches of several jobs. For example a batch can contain 10 jobs of type Job_1 and 5 of type Job_2. Different batches can contain different types of jobs. All the jobs from a batch should be finished as close to each other as possible. Not necessarily at the same time, but we need to minimize the delay between the completion of the first and last jobs in the batch.
Constraint 3. Machines are grouped. In each group only M machines can work simultaneously. Think of paint sprayers connected to a common pressurizer with limited capacity.
The goal.
Given this description of the problem, it should be possible to solve the JSSP. It should also be possible to add new jobs to an existing schedule.
Edit after tucuxi's answer: This is not a task that needs to be solved immediately; it is not a time-critical system. But it shouldn't take so long that it irritates the human who feeds new tasks to the algorithm.
Question
Which of the many JSSP algorithms can help me solve this? I can implement an algorithm myself if one exists. The closest I found is this: the Resource Constrained Project Scheduling Problem. But I was not able to work out how to combine it with a JSSP-solving algorithm.
Edit after tucuxi's answer: No, I haven't tried it yet.
Are there any libraries that can be used to solve this problem? Python or C# are the preferred languages, but in the end it doesn't really matter.
I appreciate any help: keyword to search for, link, reference to a book, reference to a library.
Thank you.
I doubt that there is a pre-made algorithm that solves your exact problem.
If I had to solve it, I would first:
compile datasets of inputs that I can feed into candidate solvers.
think of a metric to rank outputs, so that I can compare the candidates to see which is better.
A baseline solver could be a brute-force search: test and rate all possible job schedulings for small sample problems. This is of course infeasible for large inputs, but for small inputs it allows you to compare the outputs of more efficient solvers to a known-best answer.
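As a sketch of that baseline, here is what a brute-force rater could look like for a toy instance of your problem (one spraying task per job, identical machines, a fixed drying rest). The job data, the earliest-deadline-first order on each machine and the lateness metric are all illustrative choices, not a recommendation:

from itertools import product

# Toy instance: each job is a single spraying task (duration) with a deadline.
jobs = [
    {"id": "J1", "duration": 4, "deadline": 10},
    {"id": "J2", "duration": 3, "deadline": 9},
    {"id": "J3", "duration": 5, "deadline": 14},
    {"id": "J4", "duration": 2, "deadline": 6},
]
N_MACHINES = 2
REST = 1  # minimal drying time between jobs on one machine

def score(assignment):
    # Total lateness of an assignment (one machine index per job); lower is better.
    # A full brute force would also enumerate per-machine orderings; here each
    # machine simply runs its jobs earliest-deadline-first.
    lateness = 0
    for m in range(N_MACHINES):
        t = 0
        for job in sorted((j for j, a in zip(jobs, assignment) if a == m),
                          key=lambda j: j["deadline"]):
            t += job["duration"]
            lateness += max(0, t - job["deadline"])
            t += REST
    return lateness

best = min(product(range(N_MACHINES), repeat=len(jobs)), key=score)
print("best assignment:", best, "lateness:", score(best))

The point is only to produce a known-best answer for small inputs that smarter solvers can be compared against.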
Your link is to localsolver.com, which appears to provide a library for specifying problem constraints and then solving them. It is not freely available, requiring a license to use; but it would seem that your problem can be readily modeled in it. Have you tried to do so? They appear to support both C++ and Python. Other free options exist, including optaplanner (2.8k stars on GitHub) or python-constraint (I have not looked into other languages).
Note that a good metric is crucial to choosing a good algorithm: unless you have a clear cost function to minimize, choosing "a good algorithm" is impossible. In your description of the problem, I see several places where the cost is unclear (see the sketch after this list):
job deadlines are flexible
minimal required rest times... which may be shortened
jobs from a batch should be finished as close together as possible
(not from specification): how long can you wait for an optimal vs a less-optimal-but-faster solution?
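Whatever you decide, those ambiguities end up as explicit numbers. As a purely illustrative example (every weight and field name below is made up; choosing them is exactly the judgement call described above), a cost function for a candidate schedule might look like:

# Hypothetical weights: the whole point is that you have to choose them.
W_DEADLINE = 10.0   # cost per time unit a job finishes past its flexible deadline
W_REST = 5.0        # cost per time unit the planned rest Tp is shortened (never below Tm)
W_BATCH = 2.0       # cost per time unit between first and last completion in a batch

def schedule_cost(scheduled_jobs, batches):
    # scheduled_jobs: dicts with 'finish', 'deadline', 'rest_planned',
    # 'rest_actual' and 'batch'; batches: iterable of batch ids.
    cost = 0.0
    for j in scheduled_jobs:
        cost += W_DEADLINE * max(0.0, j["finish"] - j["deadline"])
        cost += W_REST * max(0.0, j["rest_planned"] - j["rest_actual"])
    for b in batches:
        finishes = [j["finish"] for j in scheduled_jobs if j["batch"] == b]
        if finishes:
            cost += W_BATCH * (max(finishes) - min(finishes))
    return cost

With a function like this you can rate the output of any candidate solver, including the brute-force baseline above.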

Is it a good idea to use a global table instead of directed links in NetLogo to improve performance? (Answer is NO)

I have a model which works well for fewer than 100 agents (20 * 20 world size), but one of my model requirements now is to test my model for different groups of agents, and I need to have more than 100 agents (and a 40 * 40 world size). I have tried to optimize each function individually, but I am afraid there is nothing left that I can change without breaking the model requirements.
In the current version I am using links to keep track of agent relationships, and there is no limit on how many links each agent can have, therefore the number of links grows really fast (more than 2000 links). Each link's relationship value needs to be updated after each interaction.
A little more detail on the use of links in my model:
Agents create / update a link's value and frequency with each other if they have a social interaction.
Many agents create or update their link value and frequency with one agent if they observe an unusual social activity of that agent (different groups of these activities have been defined, and different actions will be called based on the type of activity).
Agents observe co-located agents' link values when they are on the same patch, and according to that they might have different kinds of social interaction.
Agents in a certain age range will find a mate based on their link value and other criteria.
And maybe a few more that I can't recall right now, but the link's value is read many times in each tick. The agent lifetime is 4000 and the simulation length is 40000 ticks; for 100 agents it takes 10-15 min to run the simulation, but for 200 agents it takes 10 min to finish only 2000 ticks!
Since I have a big problem testing my model with more agents, I was thinking of eliminating all links and using a global table with each pair of agents, their relationship value and the frequency of their relationship, but since using links is really easy, I think I will have difficulties setting and getting values.
Does anyone have any idea of a better way to do this? Or ways to make NetLogo models scalable?
UPDATE:
Hi, I checked again and again and I am sure my programming style makes my program slow. I have found 2 cases of accidental asks; one of them is really stupid, because I was setting links invisible, which is not even necessary! I could just call hide-link whenever a link is created, without asking the links again. :D
Additionally, I have eliminated the cases where I used ask or with on out-link-neighbors, links and out-links. For example, I have replaced code which was checking for any other agent who has a common relationship with the caller agent. My initial code was really slow, and I have replaced it with the following code, which works much faster:
let Agents_I_Met out-link-neighbors
if any? other agents-here with [any? out-link-neighbors with [member? self Agents_I_Met]]
[
  let Other_Agent one-of other agents-here with [any? out-link-neighbors with [member? self Agents_I_Met]]
  let CommonAgent one-of Agents_I_Met with [member? Other_Agent in-link-neighbors]
  ...
But there are still many cases where I need to ask other agents, so I think it is OK to have agents ask other agents in-radius X!
Finally, my system now works much better, in a more reasonable time, for 400 agents and 15000-20000 links :)
But I am sure there is still room for improvement. Thanks, Seth, for your helpful answer :)
It's not a good idea. Links in NetLogo are implemented efficiently; whatever you substitute for them is very likely to be slower rather than faster.
You seem eager for this to be NetLogo's fault, but it probably isn't; the problem is almost certainly a problem in your own code, a problem that would have arisen in whatever programming language you were working in.
In most NetLogo simulations, the runtime increases linearly with the number of agents, i.e., doubling the number of agents doubles the runtime. From the numbers you give, it sounds like you have written code that takes time proportional to the square of the number of agents. (It might even be exponential in the number of agents, but it's much more common for people to accidentally write code that takes polynomial time.)
There are two ways this might have happened:
The algorithm you are implementing is inherently one that takes polynomial time to compute.
The algorithm you are implementing can be coded such that it executes in linear time, but you accidentally coded it in a way that takes polynomial time.
You need to ask yourself, what is it about my code that makes it not run in linear time?
You need to look at every loop in your code that loops over all of the agents in the model and ask yourself, is there anything in the body of this loop which can't be executed in constant time?
Primitives that loop over all the agents include ask and with. So the most common mistake is to write something like:
ask turtles [
  ... turtles with [ ... ] ...
]
Without knowing anything else about the code, I can look at this and know that the model it occurs in will be a slow model, because the above code requires polynomial time to execute. The with runs once for every possible pair of turtles, so e.g. if there are 100 turtles, the code will take 10,000 steps to execute. If there are twice as many turtles, the code will take 40,000 steps to execute: doubling the number of turtles causes the runtime to quadruple, not just double.
We already saw once, in Use undirected links instead of directed links, that code of this form was the reason your model was slow.
So, you need to find where it is in your model that you have nested loops like this, where the size of both loops is proportional to the total number of agents. (The two loops might not necessarily be ask and with, and they might not even be in the same procedure.)
And when you find a place where you're doing this, you need to figure out whether you're doing a computation that inherently takes this long, or whether you've accidentally coded it in an unnecessarily slow form. If the latter, fix it. If the former, you'll need to rethink your requirements.
And it sounds like it's your requirements that are at fault. You write "In the current version I am using links to keep track of agent relationship and there is no limit on how many links each agent can have". So the number of links grows with the square of the number of turtles, right? That means that you don't even need nested loops in order to have written a program with polynomial runtime; if you ever do ask links [ ... ], even once, you're dead, since the number of links is already proportional to the square of the number of turtles.
As an example of how to fix it, you might consider putting some fixed size on the number of relationships each turtle can have. That would make it possible again for your model to run in linear time.
That doesn't mean you shouldn't also scrutinize your code for accidental problems though; it's possible that you have both problems, essential and accidental slowness. You may have accidentally done ask links [ ... ask turtles ... or ask turtles [ ... ask links ... in which case your runtime will increase as the cube of the number of turtles, or if you've done ask links [ ... ask links ..., as the fourth power.
None of this advice is especially specific to NetLogo. If it's easy to write slow programs in NetLogo, it's only because it's easy to write all kinds of programs in NetLogo — including slow ones. In particular, NetLogo makes it very easy to write loops. A short snippet of code like turtles with [color = red] is a loop, but it's so short and easy to write that it doesn't necessarily look or feel like a loop, so it's easy to miss the performance implications.

Optimal shift scheduling algorithm

I have been trying for some time to solve a scheduling problem for a pool that I used to work at. The problem is as follows...
There are X lifeguards that work at the pool, and each has a specific number of hours they would like to work. We hope to keep each lifeguard's average distance from their desired number of hours as low as possible, and to be as fair as possible to all. Each lifeguard is also a college student, and thus will have a different schedule of availability.
Each week the pool's schedule of events is different than the last, thus a new schedule must be created each week.
Within each day a certain number of lifeguards is required for certain time intervals (e.g. 3 guards from 8am-10am, 4 guards from 10am-3pm, and 2 guards from 3pm-10pm). This is where the hard part comes in. There are no clearly defined shifts (slots) to place each of the lifeguards into (because creating a schedule may not even be possible given the availability of the lifeguards plus the changing weekly pool schedule of events).
Therefore a schedule must be created from a blank slate provided only with...
The Lifeguards and their information (# of desired hours, availability)
The pool's schedule of events, plus number of guards required to be on duty at any moment
The problem can now be clearly defined as "Create a possible schedule that covers the required number of guards at all times each day of the week AND be as fair as possible to all lifeguards in scheduling."
Creating a possible schedule that covers the required number of guards at all times each day of the week is the part of the problem that is a necessity and must be completely solved. The second half, about being as fair as possible to all lifeguards, significantly complicates the problem, leading me to believe I will need an approximation approach, since the number of possible ways to divide up a work day could be enormous; but an unfair schedule may sometimes be necessary because it is the only feasible one.
Edit: One of the most commonly suggested algorithms I find is for the "Hospitals/Residents problem"; however, I don't believe it is applicable since there are no clearly defined slots to place workers in.
One way to solve this is with constraint programming - the Wikipedia article provides links for several constraint programming languages and libraries. Here is a paper describing how to use constraint programming to solve scheduling problems.
Another option is to use a greedy algorithm to find a (possibly invalid) solution, and to then use local search to make the invalid solution valid, or else to improve the sub-optimal greedy solution. For example, start by assigning each lifeguard their preferred hours, which will result in too many guards being scheduled for some slots and will also result in some guards being assigned a ridiculous number of hours; then use local search to un-assign the guards with the most hours from the slots that have too many guards assigned.
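To make the greedy-plus-local-search idea concrete, here is a small sketch. The data model (hourly demand, per-guard availability and desired hours), the greedy rule (give each hour to the guards furthest below their target) and the repair move (shift one hour from an over-assigned guard to an under-assigned one who is also available) are all illustrative choices, not the only ones:

import random

# Hypothetical one-day instance: demand[h] = guards required during hour h,
# avail[g] = hours guard g can work, desired[g] = hours guard g wants.
demand = {8: 3, 9: 3, 10: 4, 11: 4, 12: 4, 13: 4, 14: 4, 15: 2, 16: 2,
          17: 2, 18: 2, 19: 2, 20: 2, 21: 2}
avail = {"ann": set(range(8, 16)), "bob": set(range(10, 22)),
         "cat": set(range(8, 22)), "dan": set(range(12, 22)),
         "eve": set(range(8, 14)), "fay": set(range(14, 22))}
desired = {"ann": 6, "bob": 8, "cat": 10, "dan": 6, "eve": 4, "fay": 6}

def greedy_fill():
    # Greedy: for each hour, pick the available guards furthest below their desired hours.
    hours = {g: set() for g in avail}
    for h, need in demand.items():
        candidates = sorted((g for g in avail if h in avail[g]),
                            key=lambda g: len(hours[g]) - desired[g])
        for g in candidates[:need]:
            hours[g].add(h)
    return hours

def unfairness(hours):
    return max(abs(len(hs) - desired[g]) for g, hs in hours.items())

def local_search(hours, iters=2000):
    # Repair/improve: move one hour from the most over-assigned guard to the most
    # under-assigned guard who is also available; keep the move if fairness improves.
    # Coverage per hour is preserved because one guard swaps in for another.
    best = unfairness(hours)
    for _ in range(iters):
        give = max(hours, key=lambda g: len(hours[g]) - desired[g])
        take = min(hours, key=lambda g: len(hours[g]) - desired[g])
        movable = [h for h in hours[give] if h in avail[take] and h not in hours[take]]
        if not movable:
            continue
        h = random.choice(movable)
        hours[give].discard(h); hours[take].add(h)
        new = unfairness(hours)
        if new <= best:
            best = new
        else:  # undo the move
            hours[take].discard(h); hours[give].add(h)
    return hours

schedule = local_search(greedy_fill())
for g, hs in schedule.items():
    print(g, sorted(hs), "wanted", desired[g])

If the greedy pass cannot cover an hour at all, the local search would also need a move that repairs coverage, not just fairness; this sketch leaves that out.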
You need to turn your fairness criterion into an objective function. Then you can pick from any number of workplace scheduling tools. For instance, you describe wanting to minimize the average difference between desired and assigned hours. However, I'd suggest that you consider minimizing the maximum difference instead. This seems fairer (to me) and it will generally result in a different schedule.
The problem, however, is a bit more complex. For instance, if one guard is always getting shorted while the others all get their desired hours, that's unfair as well. So you might want to introduce variables into your fairness model that represent the cumulative discrepancy for each guard from previous weeks. Also, a one-hour discrepancy for a guard who wants to work four hours a week may be more unfair than for a guard who wants to work twenty. To handle things like that, you might want to weight the discrepancies.
You might have to introduce constraints, such as that no guard is assigned more than a certain number of hours, or that every guard has a certain amount of time between shifts, or that the number of slots assigned to any one guard in a week should not exceed some threshold. Many scheduling tools have capabilities to handle these kinds of constraints, but you have to add them to the model.
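As a sketch of what such a weighted fairness model might look like (the weighting rule and the carry-over term are illustrative assumptions, not a prescription):

def fairness_cost(assigned, desired, carryover, weight=None):
    # Weighted hours discrepancy per guard. 'carryover' is the cumulative shortfall
    # from previous weeks; by default, guards who want fewer hours get a larger
    # weight so a one-hour shortfall hurts them more.
    total = 0.0
    worst = 0.0
    for g in desired:
        w = weight[g] if weight else 1.0 / max(desired[g], 1)
        gap = w * abs(desired[g] - assigned.get(g, 0)) + carryover.get(g, 0.0)
        total += gap
        worst = max(worst, gap)
    # 'worst' is the minimize-the-maximum objective suggested above;
    # 'total' corresponds to the average-style objective from the question.
    return worst, total

Either returned value can then be handed to the constraint or local-search tools above as the quantity to minimize.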

Estimating a project with many unknowns

I'm working on a project with many unknowns like moving the app from one platform to another.
My original estimations are way off and there is no way I can really know for sure when this will end.
How can I deal with the inability to estimate such a project? It's not that I'm adding a button to a screen or designing a web site, or creating an app or even fixing bugs. These are not methods with bugs; these are assumptions made in the overall code which are no longer correct, and they are found step by step, each one analyzed and mitigated with many more unknowns.
I happened to write a master's thesis about software estimation, and here are lessons I've learned:
-1st count, 2nd compute, 3rd judge - this means: first try to identify items in your work which are countable, e.g. files, classes, LOCs, UIs, etc. Then use this data to calculate the effort (in person-days). Use judgement only as a last resort.
-Document your estimation! Show numbers. This minimizes your risk, since you present results not as your opinion but as more or less objective figures. (In general, the more paper the cleaner the backside.)
-Estimation is not a commitment. A commitment is one number; an estimation is always a range - so give your estimation as a range (use the cone of uncertainty to select the range properly: http://www.construx.com/Page.aspx?hid=1648).
-Divide: use a WBS, divide your work into small pieces and estimate them separately. The granularity depends on the total length, but a work package shouldn't be bigger than 10% of the entire effort.
-Estimate effort first, then schedule, then costs.
-Consider estimation as support for planning; re-estimate in each project phase (see the cone of uncertainty).
I would suggest the book http://www.stevemcconnell.com/est.htm, which deals with all these points, in particular how to deal with bosses who try to pull a commitment out of you.
Regards,
Valentin Heinitz
There's no really right answer for coming up with an accurate estimation, because there's no way to know it.
As for estimating the work itself, think about how each step can be divided into separate sub-steps, and break those down even smaller, until you get a fair picture of as much of the work as you can, with chunks small and discrete enough to give sound estimates for. If you can, come up with both an expected time and a worst-case time, to get a range of where you could land.
Another way to approach this is to ignore the old system. It sounds like a headache. Make an estimate of scrapping the old system and implementing a new one from scratch, or integrating a 3rd-party, off-the-shelf solution. If there's a case to be made for this, it is worth at least investigating.
Sounds like a post for postsecret not SO. :)
I would tell him that it will be done when it's done, and if that's not good enough, he can learn to program and help you. Then again, I think you might get fired, but hey, that sounds like it might be better.
Tell him more or less what you told us. The project is too volatile to give an accurate estimate, and the best you can do is give an estimate for a given task. As long as the number of tasks is unknown, so is the estimate. If he is at all worth his salary, he would rather hear this than some made-up number. This is not uncommon when dealing with a large legacy code base.
It's not that I'm adding a button to a screen or designing a web site,
or creating an app or even fixing bugs.
That is a real problem. You cannot estimate what you don't have experience in. The only thing you can do is pad your estimate until you think it is a reasonable amount of time. The more unknowns you think there are, the more you pad. The less you know about it, the more you pad.
I read the book below, and it speaks at length about accuracy vs. precision. Basically you can be accurate but have a very large range. For instance, you can be certain the task will take between 1 day and 1 year to complete. That is not very precise, but it is accurate.
Software Estimation: Demystifying the Black Art
Some tips for estimating

What is better: set up underestimated or overestimated deadlines? [closed]

Suppose you are a project manager. You can estimate the effort in days for a specific task for a specific developer. After performing the estimation you obtain some min and max values.
After this you delegate the task to the developer. In doing so you also set a deadline.
Which estimate is better to use when setting the deadline: min or max?
As I see it, the min estimate can result in stress for the developer, while the max estimate can result in all of the allocated time being used even if the task could be completed faster (the so-called student syndrome).
What are the other pros and cons of the two approaches?
EDIT:
Small clarification: I am talking about setting deadlines for subordinates when delegating a task, NOT about reporting to my boss.
EDIT:
To add one more clarification: I can keep my real estimate in mind, give my boss a slightly larger estimate, and give subordinates a slightly smaller one.
And this question touches on the following: is it a good idea to give a developer an underestimate to make him work harder?
You should use a best guess which is a function of the min and max estimates* - not just their simple average:
best_guess = (min * min_weighting + max * max_weighting) / divisor
* Tom Neyland suggests the divisor should be (min_weighting + max_weighting). Actually I'm not sure whether that is correct, but it's probably more correct than my original divisor of 2.0.
The weighting you give to the min and max values will depend on the complexity of the task, the risks associated with the task, the likelihood of those risks occurring, the skill of the developer, etc., and will vary from organisation to organisation and from project to project. If you keep a record of your previous estimates and the actual time each took, you'll be able to refine these estimates over time.
You should also use these values, plus a confidence value, when talking to senior management and customers. While giving the max and delivering early is not the same as giving the min and delivering late, it still shows that you don't have control over your development.
Giving the confidence value and an idea of the risks will also help manage expectations so if there are problems they're not unexpected.
* These min and max estimates will be obtained by various means - asking the developers, past experience, etc. If polling developers, then the actual min and max values should be treated as outliers and either discarded or modified in some way. What I mean here are the values you get from phrases like "it'll take 2 weeks if all goes well or a month if we hit some snags". So the values you plug into the formula are not the raw numbers.
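A tiny sketch of the formula with Tom Neyland's divisor (the example weights are made-up numbers purely for illustration):

def best_guess(min_est, max_est, min_weight, max_weight):
    # Weighted blend of the (already de-outliered) min and max estimates.
    # Dividing by the sum of the weights keeps the result between min and max.
    return (min_est * min_weight + max_est * max_weight) / (min_weight + max_weight)

# Illustrative only: a risky task for a junior developer leans toward the max.
print(best_guess(10, 25, min_weight=1, max_weight=3))  # -> 21.25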
Use neither min nor max but something in between.
Erring on the side of overestimation is better. It has much nicer cost behavior in the long term.
To overcome the stress caused by underestimation, people may take shortcuts that are not beneficial in the long term. For example, taking on extra technical debt that has to be paid back eventually, and it comes back with interest. The costs grow exponentially.
The extra cost from inefficiency due to student's syndrome behaves linearly.
Estimates and targets are different. You (or your managers and customers) set the targets you need to achieve. Estimates tell you how likely you are to meet those targets. Deadline is one sort of target. The deadline you choose depends on what kind of confidence level (risk of not meeting the deadline) you are willing to accept. P50 (0.5 probability of meeting the deadline) is commonplace. Sometimes you may want to schedule with P80 or some other confidence level. Note that the probability curve is a long-tailed one and the more confidence you want, the longer you will need to allocate time for the project.
Overall, I wouldn't spend too much time tracking individual tasks. With P50 targets, half of them will be late in any case. What matters most is how the aggregate behaves. When composing individual task estimates into an aggregate, neither min nor max is sensible. It's extremely unlikely that all tasks complete in either the minimum time (most likely something like P10 time) or the maximum time (e.g. P90 time): for n P10/P90 tasks the probability is 0.1^n.
PERT has some techniques for coming up with reasonable task duration probability distributions and aggregating them into larger wholes. I won't go into the math here. Here are some pointers for further reading:
Steve McConnell: Software Estimation - Demystifying the Black Art. It's quite readable and pragmatic but at least the 1st edition I have has some quirks in its math and otherwise.
Richard D. Stutzke: Estimating Software-Intensive Systems - Projects, Products and Processes. It's a little more academic and a harder read, but it explains the math better, for example.
Ask for best, likely and worst case scenario estimates instead. Then use the Program Evaluation and Review Technique (PERT). However, you may want to take a look at some PERT critiques first.
For individual tasks or tasks making up the critical path it’s simply not prudent to go for the best case estimates. It’s like saying that the project is absolutely free of any risk and uncertainty. If the actual job turns out to be anything but the best case scenario you’ll end up blowing the schedule. It’s better to end up with some extra time on your hands and fill the time by implementing some nice-to-haves as opposed to having to work nights and weekends.
On the other hand, if managers mostly went for the worst-case estimates (and in the software world these can easily be an order of magnitude greater than the best-case figures), most projects would never make it past the feasibility and planning stage. Not all of the risks are going to materialise.
Going for the best-case estimate won't help fight student syndrome. Include interim milestones and deliverables instead; besides helping combat student syndrome, they're a prerequisite for having trustworthy data on project progress and for uncovering any potential issues early.
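For reference, the classic PERT combination of the three estimates is a weighted mean that assumes a beta-shaped duration distribution; a minimal sketch (the example numbers are arbitrary):

def pert_estimate(best, likely, worst):
    # Classic PERT (beta-distribution) approximation from three-point estimates.
    expected = (best + 4 * likely + worst) / 6.0
    std_dev = (worst - best) / 6.0
    return expected, std_dev

e, sd = pert_estimate(best=5, likely=8, worst=20)
print(f"expected {e:.1f} days, +1 sigma {e + sd:.1f}, +2 sigma {e + 2 * sd:.1f}")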
If the difference between min and max is big, then rather than using some black-magic formula I think the best thing to do would be to go back to the developers and ask them to do a finer breakdown and some prototyping, which will lead to better estimates where the gap between min and max is not that big.
Note on the question: in my opinion, the estimates should be done by the developers/architects, since they have the best technical knowledge to break the work down into tasks and estimate those tasks.
If you are estimating for a specific developer, and you know your estimates are generally accurate for that developer, then the min value is the logical deadline (initially). In the course of the project you will adjust deadlines according to circumstance.
If you have little experience with a specific developer, one of my fondly regarded previous managers would ask the developer himself to do the estimate and set the initial deadline a third of the distance between that developer's min and max, challenging the developer to beat it.
Something which has been missing in many of these answers (perhaps because it's slightly off-topic) is frequent updates. With younger/newer developers this is even more important - read the code they commit, and/or check in daily to ask them for specific, detailed reports.
This also allows you to set tight deadlines for developers without giving them too much stress, because they will know you're around to help adjust deadlines when needed.
Frequent updates give you the most important tool in setting customer/management expectations - early warning of issues which might delay things, and I prefer having that over any formula.
Is the developer going back into a cave to develop this or is there a good chance of changing requirements over the course of the project? I would think most projects will have a good chance that something won't go smoothly and thus it may be better to try to get the prototype up sooner rather than later.
As for the initial question, I think I'd break this out into a few different outcomes and consider each:
Gross underestimation -> This leads to the problem that there is still a lot of work to do and the manager appears unable to make reasonable estimations.
Minor underestimation -> In this case, either there is an extension, scope gets cut or some bugs are in the release, but this is better than the previous case.
Made the deadline, on time and on budget with quality -> While this may seem optimal as everything worked out, I don't think this is the best result possible.
Minor overestimation -> In this case, there is some breathing room that means either things finish early or some extra work is added. A point here is that this may seem to deliver a slightly better result than the previous case like how some companies will try to beat the earnings estimate by a small amount to do better than expected.
Gross overestimation -> I think this would be the worst case outcome though it is similar to the first in terms of someone being way out of their league in being able to provide a reasonable estimation.
That's just my opinion on each and others may have a different take on it than me.
If you're trying to hold developers to their minimum estimate, that's foolish. No one, in any industry, consistently hits their minimum time estimate for getting something done. Eventually, they'll just learn to pad their minimum estimates significantly, and then they'll never hit the old minimums, because all estimates will be above that.
In Agile/Scrum, you don't set firm deadlines, but set "how many hours left on this task". Every day, you update the amount of time left. You do not track hours spent, but do track estimated hours remaining, and you try and stay honest about it.
If you have lazy developers, this is bad, because they can easily game that system. If you have developers that are worth their salt, this is great. They get better at estimation pretty quickly, and you - as a project manager - learn how reliable their estimates are, and you'll have a much better feel for what estimates to pass up the chain based on the individual developer estimates.
Go slightly towards Agile, fire the bad developers as you discover which are which, reward the good developers for actually giving a damn, and have a more productive, happier team while being able to report more accurate expectations to your superiors.
If in doubt, under-promise and over-deliver: you want to be the person who is delivering more than was expected, not less. Based on this, always go with the higher of any estimate.
Slightly more complex:
For a given potential delivery, if you plot the delivery times against the chances of them happening, you're going to get a curve which is a variation of a normal distribution, and you can assume that a developer's minimum estimates are going to be somewhere towards the left of the curve and their maximum towards the right.
The area under the curve to the left of the single number you select as your estimate represents the probability of you successfully delivering on or before that estimate. So if you give a number at the very left hand side your chance of hitting is effectively zero, if you give a number at the very right hand side your chance is effectively 100%.
What is less commonly realised is that if you give the mean value (assuming your min and max averaged out give something approximating the actual mean), you'll only hit that deadline 50% of the time. Effectively, if you use the mean you're going to miss the deadline half the time. I don't know about you, but I don't like being seen as the guy who misses half his deadlines.
So you want a number which is going to give you something you hit, say, 90% of the time. Conveniently, 95% corresponds roughly to the mean plus two standard deviations, but if you can't be arsed to calculate that (and most of us probably don't have the data), my experience says that:
(3 x max + 1 x min) / 4
gives a reasonable result.
Incidentally, what you tell the developer is the deadline is another question entirely. Personally I'd give him somewhere around ((2 x max + 1 x min) / 3) and have the rest as contingency.
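Put as a tiny sketch (the example numbers are arbitrary; the two formulas are just the rules of thumb above):

def deadlines(min_est, max_est):
    # What you quote outward vs. what you give the developer; the gap is contingency.
    quoted = (3 * max_est + min_est) / 4.0      # aim to hit roughly 90% of the time
    developer = (2 * max_est + min_est) / 3.0   # tighter internal target
    return quoted, developer

q, d = deadlines(min_est=6, max_est=14)
print(f"quote {q:.1f} days, tell the developer {d:.1f}, contingency {q - d:.1f}")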
What are you using the estimates for? Specifically, why will the developer feel stressed if you normally underestimate?
If you're trying to schedule how long something is likely to take, you go for an intermediate value. Probably on the long side, since people normally underestimate. In any case, you shouldn't be using these estimates as firm objectives for developers, and so they shouldn't be overly stressful.
If you're using these estimates to set up commitments, you need to err on the side of overestimating. Giving developers insufficient time leads to burnout, unmaintainable buggy code that doesn't do quite what the user wants, and low morale and high turnover. Set the commitments to be reachable, and encourage the developers to finish early.
This depends on the project.
Some projects may require fast development, and there is no alternative if the deadline is already set and there is no good chance of extending development. A typical example: a marketing campaign resulting in a new service. Such a deadline can be enough for normal development, but in some organizations it is so close that developers work under stress and make many errors, which then get fixed during the production stage. That's the kind of project where developers have to work at top effectiveness, and they'd better get a good reward on success.
Some projects are accurately planned, and here you can use all the analytics you have: historical data, developers' time metrics on subtasks, risk calculations, etc.
But in any case, the MAX time shouldn't be used: it is the most inaccurate measure and usually leads to even more time being taken. Here's a simple reason: when a developer just gives away this MAX, he almost doesn't measure at all. He just gives away his intuition, which has very little information behind it at that point. But if he spends at least half an hour, he'll understand the specifics of his task; he may even split it into subtasks and increase his accuracy. So you can give the developer a prompt like "hey guys, just think about how long you would need to deliver stable code here", but send him off to measure it himself. It is good for the job, and it is good for the programmer himself.
The first mistake most estimators make when setting the deadline is assuming that the dev will be working full-time on that task every day, which is a disastrous mistake. This can result in missing the deadline even when you use the overestimate to set it. Being under the hours but past the deadline you told the client is a big problem. People take leave, get sick, have jury duty, have to go to required meetings on some new HR policy, get called over to help on another project when someone is stuck, have to load software on a new computer when their old one breaks, have to research a production problem in code they recently deployed, etc. If you are estimating more than 6 hours a day per person on the project, you are already in trouble on the deadline before the project starts. When I did manpower studies, we used a figure that equated to just slightly more than 6 hours a day of direct work when calculating how many people were needed for any job. And we did a lot of statistical analysis as the basis for that figure.
I think you have to decide which of these to use on a case-by-case basis. We have some projects where we know the max estimate is still probably a little low (usually when someone in management couldn't face the client with the real estimate); we have others where we are doing something new and we know the estimates are more likely to be off; in these kinds of cases go with the max. But for work you've done before that is well-defined, where you know the dev assigned won't be learning new skills, go closer to the min (but never actually use the min; there are always unexpected bumps in the road). Also, the shorter the project, the more likely you will be able to meet the min; it is far easier to get a good estimate for a week-long project than for a year-long one.
More important is changing the estimate and deadline every time the circumstances change. If the client adds work, extend the deadline and the estimate; don't just do the extra work. If your best dev quits and you have to put someone new on the project, extend the deadline, because that person needs time to get up to speed (you may have to eat the hours, though; the client may not agree to pay for that time). Critical to this is telling the client right away. They tend to be better about moving a deadline (although not happy) than about missing one, or making the deadline but having the product not work as they expected. Too many project managers just like to wish a problem away so they won't have to face that conversation with the client. But usually, when they finally do have to tell the client, it is a much worse conversation than the difficult one they tried to avoid.
