Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
What I'm looking for is very simple: I want a tool that computes the calculated shipping date (as opposed to one estimated from confidence intervals), given a list of tasks, each with a total estimate and its current progress. It should not introduce further uncertainty, as I want to handle that externally.
I want it to take workday duration, user-entered holidays, etc. into account.
I know FogBugz's Evidence-Based Scheduling (EBS) does something very close to that, but I would like it without the statistical aspect and the associated confidence intervals. I'm aware it's a drastic simplification and that statistical estimation is the essence of EBS, but I'm not looking for a subjective discussion here. I just want to be able to access this simple information (the supposedly exact shipping date) at any given time during the project without having to calculate it myself.
So I'm looking for one of three things: 1) a way to customize FogBugz (6.0) to show me the information I want alongside the confidence intervals, 2) a way to customize FogBugz to set the uncertainty of estimates to 0, or 3) another (free) tool that does exactly what I want.
EDIT: By "supposedly exact" or "calculated", I don't mean with respect to what is actually going to happen; that would indeed be trying to predict the future. I mean with respect to the information that was input, together with its obvious uncertainty. In that case, I guess estimates for individual tasks should be seen more as spending limits or upper bounds. The information I would like to be able to compute is really very simple: if everything goes exactly as specified, where does it take us? Then, with information about how the estimates were made, such as the ability of each individual developer to make good estimates, I can derive the confidence interval. EBS does this automatically and, undoubtedly, very well, which is why I use it. What I would like to obtain is one more little piece of information, i.e. the same starting point EBS uses, so that I can play with my own assumptions as to how the statistical estimation should be made.
FogBugz will show you the sum of estimates at the bottom of the LIST page, labelled "Total estimated time remaining". This is the raw sum of estimates, without any EBS calculations.
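If all you want is to turn that raw sum into a calendar date, the arithmetic is small enough to script yourself. The following is only a minimal sketch, not anything FogBugz provides: the task list, the 8-hour workday, the start date and the holiday are all made-up assumptions, and it assumes one person works the tasks sequentially on weekdays.

```python
from datetime import date, timedelta

def ship_date(start, remaining_hours, hours_per_day=8.0, holidays=frozenset()):
    """Walk forward from `start`, spending `hours_per_day` on each working day.

    Weekends and any date in `holidays` are skipped. Returns the date on which
    the last remaining hour is worked. `hours_per_day` must be positive.
    """
    if remaining_hours <= 0:
        return start
    day = start
    left = remaining_hours
    while True:
        is_workday = day.weekday() < 5 and day not in holidays
        if is_workday:
            left -= hours_per_day
            if left <= 0:
                return day
        day += timedelta(days=1)

# Hypothetical task list: (total estimate in hours, fraction already done).
tasks = [(16, 0.5), (24, 0.0), (8, 0.75)]
remaining = sum(total * (1 - done) for total, done in tasks)

# Hypothetical start date (a Monday) and one user-entered holiday.
finish = ship_date(date(2024, 3, 4), remaining, holidays={date(2024, 3, 6)})
print(remaining, finish)
```

Because nothing here is statistical, the result is exactly "where the inputs take us", which you can then feed into whatever uncertainty model you prefer.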
You can't predict the future. So any calculated shipping date can only be a guess. That guess depends on the confidence intervals around each individual number that went into it. This is a matter of definition -- even though you may not like it.
You may want to have a "100% confident" date, but such a thing (by definition) cannot exist. You cannot have an uncertainty of zero unless you want a date infinitely far in the future. It's the nature of statistics: the distribution is actually infinite, but data is considerably more likely to cluster around the mean.
The only thing you can do is pick a really big confidence interval (99.7%). You are free to ignore the supporting statistical facts about the confidence interval and pretend it has zero uncertainty. For all practical purposes 0.3% uncertainty is small enough that you're not going to be unhappy with that date.
However, all statistically-based predictions of the future must have uncertainty. It's a law.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
I'm working in a team that has been working consistently and fairly successfully with an agile approach. This has worked well for the initial work on the current project, as we incrementally build the product.
We're now moving into the next phase of this though, and the management are keen for us to set some specific deadlines ourselves, for when we'll be in a position to demo and sell this to real customers, on the order of months.
We have a fairly well organised large backlog for each of the elements of functionality we'd like to include, and a good sense of the prioritisation of these individual bits of functionality.
The naive solution is to get the minimum list of stories that would provide a demo-able product, estimate all of those individually, and add them up and combine with our velocity to get a date, and announce we'll be demoing from then. That leaves no leeway though, and seems likely to result in a mad crunch as we get up to deadline time, which I desperately want to avoid.
As an improvement, I'd like to add in some ratio of more optional stories to act as either contingency or bonus improvements, depending on how we progress, but we don't have any idea what ratio would be sensible, or whether this is the standard approach.
I'm also concerned by having to estimate the whole of our backlog all in one go up-front, as that seems very time consuming, and it seems likely that we'll discover more information in the months before we get to that story, which will affect our estimates.
Are there recommended approaches to dealing with setting deadlines to allow for an agile development process? Most of the information I've seen seems to be around handling the situation once you've got a fixed deadline to hit instead. I'd also be interested in any relevant literature or interesting blog posts that cover this issue.
Regarding literature: the best book I know on software estimation is "Software Estimation: Demystifying the Black Art" by Steve McConnell. It covers your case. It also describes the difference between an estimate and a commitment (a set deadline, in other words) and explains how to derive the second from the first reliably.
"The naive solution is to get the minimum list of stories that would provide a demo-able product, estimate all of those individually, and add them up and combine with our velocity to get a date, and announce we'll be demoing from then. That leaves no leeway though, and seems likely to result in a mad crunch as we get up to deadline time, which I desperately want to avoid."
This is the solution I have used in the past. Your initial estimate is going to be off a bit so add some slack via a couple of additional sprints before setting your release date. If you get behind you can make it up in the slack. If not, your product backlog gives you additional features that you can include in the release if you so choose. This will be dependent on your velocity metric for the team though. Adjust your slack based on how accurate you feel this metric is for the current team. Once you have a target release you can circle back to see if you have any known resource constraints that might affect that release.
The approach you describe is likely to be correct. You may want to estimate for all desirable features, and prioritise UI elements (because investors and customers basically just see the shiny UI), and then your deadline will be that estimated date for completion; then add on some slack in the form of scaling your estimates. Use the ratio between current productivity and your worst period to create a pessimistic estimate. You can use that same ratio to scale shorter estimates (e.g. for your estimate to the minimum feature set).
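As a rough illustration of the scaling idea above, here is a small sketch that turns a story-point total into an optimistic and a pessimistic sprint count. All the numbers (velocities, backlog size) are invented for the example, and rounding up to whole sprints is my own assumption.

```python
import math

def sprints_needed(points, velocity):
    """Whole sprints required to burn down `points` at `velocity` points per sprint."""
    return math.ceil(points / velocity)

# Hypothetical figures for illustration only.
current_velocity = 30    # points per sprint at the moment
worst_velocity = 20      # points per sprint in the team's worst period
must_have_points = 120   # estimate for the minimum demo-able feature set

# Ratio between current productivity and the worst period.
ratio = current_velocity / worst_velocity

optimistic = sprints_needed(must_have_points, current_velocity)
pessimistic = sprints_needed(must_have_points * ratio, current_velocity)
print(optimistic, pessimistic)
```

The gap between the two counts is the slack; the optional stories from the backlog fill it if the team turns out to be on the optimistic track.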
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
How can I distinguish between two different users, such as two neighbours who live at the same address and go to the same office, but who have different driving patterns and different office schedules? I want to find the probability that two people behave more or less identically. Depending on the resolution of the map, I want to work out where they are and how often they are there. Can I turn each driver's pattern into a signature from which their identity can be traced?
I assume, by the way that you asked your question, that you haven't had any plausible ideas yet. So I'll make an answer which is purely based on an idea that you might like to try out.
I initially thought of suggesting something along the line of word-similarity metrics, but because order is not necessarily important here, maybe it's worth trying something simpler to start. In fact, if I ever find myself considering something complex when developing a model, I take a step back and try to simplify. It's quicker to code, and you don't get so attached to something that's a dead end.
So, how about histograms? If you divide up time and space into larger blocks, you can increment a value in the relevant location for each time interval. You get a 2D histogram of a person's location. You can use basic anti-aliasing to make the histograms more representative.
From there, it's down to histogram comparison. You could implement something real basic using only 1D strips. You know, like sum the similarity measure for each of the vertical and horizontal strips. Linear histogram comparison is super-easy, and just a few lines of code in a language like C. Good enough for proof of concept. If it feels you're on the right track, then start looking for more tricky ideas...
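To make the strip idea concrete, here is a minimal proof-of-concept sketch (in Python rather than C, purely for brevity). Everything in it is an assumption on my part: positions are plain (x, y) pairs, the grid cell size is arbitrary, and the similarity measure is ordinary histogram intersection on the row and column sums. Anti-aliasing is left out.

```python
from collections import Counter

def histogram(samples, cell=1.0):
    """Count (x, y) position samples per grid cell of side `cell`."""
    return Counter((int(x // cell), int(y // cell)) for x, y in samples)

def strips(hist):
    """Collapse a 2D histogram into its horizontal and vertical 1D strips."""
    rows, cols = Counter(), Counter()
    for (cx, cy), n in hist.items():
        cols[cx] += n
        rows[cy] += n
    return rows, cols

def intersection(a, b):
    """Histogram intersection of two 1D histograms after normalising; in [0, 1]."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    return sum(min(a[k] / total_a, b[k] / total_b) for k in a.keys() & b.keys())

def similarity(h1, h2):
    """Average the strip similarities: 1.0 means identical strips, 0.0 disjoint."""
    r1, c1 = strips(h1)
    r2, c2 = strips(h2)
    return (intersection(r1, r2) + intersection(c1, c2)) / 2

# Hypothetical position samples, one per time interval.
alice = histogram([(0.5, 0.5), (0.5, 0.5), (3.2, 1.1)])
bob = histogram([(0.5, 0.5), (7.0, 7.0)])
print(similarity(alice, alice), similarity(alice, bob))
```

Note that comparing strips throws away some 2D structure (two different layouts can have the same row and column sums), which is acceptable for a first test but is one reason to move to a proper 2D comparison later.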
The next thing I'd do is further stratify my data, using days of the week and statutory holidays... Maybe even stratify further using seasonal variables. I've found it pretty effective for forecasting electricity load, which is as much about social patterns as it is about weather. The trends become much more distinct when you separate an influencing variable.
So, after stratification you get a stack of 2D 'slices', and your signature becomes a kind of 3D volume. I see nothing wrong with representing the entire planet as a grid. Whether your squares represent 100m or 1km. It's easy to store this sparsely and prune out anything that's outside some number of standard deviations. You might choose only the most major events for the day and end up with a handful of locations.
You can then focus on the comparison metric. Maybe some kind of image-based gradient or cluster analysis. I'm sure there's loads of really great stuff out there. These are just the kinds of starting points I come up with, having done no research.
If you need to add some temporal information to introduce separation between people with very similar lives, you can maybe build some lags into the system... Such as "where they were an hour ago". At that point (or possibly before), you probably want to switch from my over-simplified approach of averaging out a person's daily activities, and instead use something like classification trees. This kind of thing is very easy and rapid to develop with a tool like MATLAB or R.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Background
My team is currently in the "bug fixing and polishing" phase of shipping a major rewrite. We still have a large stack of bugs to fix, scheduled against a couple of milestones. We've been asked to come up with estimates of how much engineering effort is required to fix the bugs for each milestone.
For previous milestones, we've followed the following process:
Assign the bugs to the people that know the most about that area of the code, and will likely be the one to fix the bug.
Have each person go through the bugs that are assigned to them, and estimate how long they think it will take to fix the bugs, at an hour-level granularity. If a bug looks like it will potentially take more than a day or two to fix, they break the bug into likely subtasks, and estimate those.
Total the amount of work assigned to each person for each milestone, and try to balance things out if people have drastically different amounts of work.
Multiply each person's total for each milestone by a "padding factor", to account for overly optimistic estimates (we've been using 1.5).
Take the largest total across the team members for a given release, and make that the time it will take for the team to close the existing bugs.
Estimate the number of bugs we expect to be created during the time it takes us to reach a particular milestone, and estimate how long on average, we think it will take to close each of these bugs. Add this on to the time to close the existing bugs for each release. This is our final number of the amount of work needed, delivered as a date by which we'll definitely ship that milestone.
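For reference, the arithmetic in the last three steps can be sketched as below. The names and hours are invented, and one detail is my own assumption rather than part of the process as described: the time for newly filed bugs is spread evenly across the team before being added on.

```python
def milestone_hours(hours_by_person, padding=1.5,
                    expected_new_bugs=0, avg_hours_per_new_bug=0.0):
    """Working hours until the milestone, following the steps above.

    Assumption (not stated in the process): new-bug work is shared
    evenly by everyone in `hours_by_person`.
    """
    # Pad each person's total to allow for optimistic estimates.
    padded = {person: hours * padding for person, hours in hours_by_person.items()}
    # The busiest person determines when the existing bugs are closed.
    critical_path = max(padded.values())
    # Add the share of newly created bugs that lands on each person.
    new_bug_share = expected_new_bugs * avg_hours_per_new_bug / len(hours_by_person)
    return critical_path + new_bug_share

# Hypothetical per-person totals, in hours, for one milestone.
totals = {"ann": 40, "raj": 52, "lee": 30}
hours = milestone_hours(totals, expected_new_bugs=12, avg_hours_per_new_bug=3.0)
print(hours)
```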
This has been fairly accurate (we've come in pretty much spot on on our previous three milestones), but it's rather time consuming.
Current Problem
We've been asked to give estimates of the engineering time for upcoming milestones, but asked not to use the above process because it's too time consuming. Instead, as the tech lead of the team, I've been asked to provide estimates that are less certain, along with a certainty interval (e.g. 1 month, plus or minus a week).
My primary estimation experience is with some variation of the method I described above (from a background of freelancing for a number of years). I've found that when I "shoot from the hip" on large tasks, I tend to be way off. I suspect it will be even worse when estimating how long it takes to fix bugs in areas of the code I don't know very well.
What tips, tricks or techniques have you found successful for estimating quickly, without breaking things down into fine grained tasks and estimating them?
Things that are not an option:
Not giving an estimate - I've tried this, it didn't fly:)
Picking a number and confidence interval that is ridiculously wide - I've considered this, but I don't think it'll fly either.
Evidence-based scheduling - We're using JIRA, which doesn't have any evidence-based scheduling tools written for it, and we can't migrate to FogBugz currently (BTW, if someone goes and writes an evidence-based scheduling plugin for JIRA, we would gladly pay for it).
The best tip for estimating: round up a heck of a lot.
It sounds like you're already an expert on the topic of estimation, and that you know the limitations of what's possible. It just isn't possible to estimate a task without assessing what needs doing to complete it!
The accuracy of an estimate grows with the amount of time spent assessing the task. The two converge at the point where you have assessed the task so thoroughly that you have effectively solved it; at that moment, you know exactly how long it takes.
Hmm, sorry, this may not be the answer you wanted to hear... it's just my thoughts on it though.
1. Be prepared to create a release at any time
2. Have the stakeholders prioritise the work done
3. Work on the highest priority items
Step 1. means you never miss a deadline.
Step 2. is a way of turning the question back on those who are asking you to estimate without spending time estimating.
Edit...
The above doesn't really answer your question, sorry.
The stake holders will want to prioritize work based on how long and expensive each task will be, and you are likely to be asked which of the highest prioritized changes you expect to be able to complete by the next deadline.
My technique that takes the least time is to use three times my impression of how long I think it would take me to do it.
You're looking for something taking longer than that, but not as long as your previous excellent estimates.
You'll still need to look at each bug, even if only to take a guess at whether it is easy, average, or tricky, or 1, 2, 4, 8, 16 or 32 hours' work.
You could produce some code complexity metrics over your code base (e.g. cyclomatic complexity) and, for each task, take a stab at which two or three portions of the code base will need to change the most. Then estimate on the assumption that the less complex portions of code will be quicker to change than the more complex portions. You could come up with some heuristics based on a few of your previous estimates, to use for each bug fix, giving an estimate of the time and variability required.
How about:
estimate = (bugs / devs) x days (x K)
As simple as this is, it's actually quite accurate, and it can be estimated in one minute.
Its confidence level is lower than that of your detailed method, but I'd recommend you check your data on the last three milestones and compare this quick estimate with your detailed estimate. The ratio between the two gives you a "K" value representing your team's constant.
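Calibrating K from past milestones could look something like the sketch below. I'm reading "days" in the formula as the average number of days one bug takes, which is an assumption, and the history figures are invented purely to show the mechanics.

```python
def quick_estimate(bugs, devs, days_per_bug, k=1.0):
    """estimate = (bugs / devs) x days x K, with `days` read as days per bug."""
    return bugs / devs * days_per_bug * k

# Hypothetical history: (bugs, devs, days per bug, detailed estimate in days).
history = [
    (60, 4, 0.5, 9.0),
    (40, 4, 0.5, 6.0),
    (80, 5, 0.5, 9.6),
]

# K is the average ratio of the detailed estimate to the uncalibrated quick one.
ratios = [detailed / quick_estimate(b, d, dpb) for b, d, dpb, detailed in history]
k = sum(ratios) / len(ratios)

# Apply the calibrated constant to a new milestone.
next_milestone = quick_estimate(50, 5, 0.5, k)
print(k, next_milestone)
```

The spread of the individual ratios is also worth a look: if they vary a lot between milestones, K is not really a constant and the quick estimate deserves a wider interval.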
Be surprised.
Use Planning Poker, see the answers to How to estimate the length of a programming task
In simplest terms:
Your Most Absolutely Liberal Estimation * 3 = Your Estimate
The above may look like a joke, but it's not. I've used it many times. Time estimation on software projects of any kind is a game, just like making a deal with a car dealer. That formula will get you something to provide your management in a pinch and give you some room to play with as well.
However, if you're somehow able to get down to the more granular details (which is really the only way you'll be able to be more accurate), Google on Function Point Analysis, sometimes called "Fast Function Point Analysis" or "Fast Function Point Estimation".
Many folks out there have a myriad of spreadsheets, PDF's and the like that can help you estimate as quickly as possible. Check out the spreadsheets first as they'll have formulas built in for you.
Hope this helps!
You've been asking how to produce an estimate and an uncertainty interval. A better way to think of this is to do a worst-case estimate and a best-case estimate, and combine the two into an estimate range. Estimates for well-understood issues will naturally be narrower than estimates for less-understood issues. For example, an estimate of 1.5-2 days is probably for a well-understood issue, while an estimate of 2-14 days would be typical for an issue that is not understood at all.
Limit the amount of investigation and time spent producing an estimate by allowing for a wider gap between the two estimates. This works because it's relatively easy to imagine realistic best-case and worst-case scenarios. When the uncertainty range is more than you're comfortable dealing with in the schedule, take some time to understand the less-understood issues. It may help to break them up.
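A tiny sketch of how such ranges roll up, using the two example ranges from above (the issue names are made up). Summing the best cases and the worst cases gives the outer bounds for one person working through the list, and the widest range points at the issue most worth investigating first.

```python
def summarise(ranges):
    """Return (best-case total, worst-case total, name of the widest range).

    `ranges` maps an issue name to a (best, worst) pair in days.
    """
    best = sum(low for low, _ in ranges.values())
    worst = sum(high for _, high in ranges.values())
    widest = max(ranges, key=lambda name: ranges[name][1] - ranges[name][0])
    return best, worst, widest

# Hypothetical issue names; the ranges are the ones used as examples above.
issues = {
    "well-understood bug": (1.5, 2.0),
    "poorly understood bug": (2.0, 14.0),
}
print(summarise(issues))
```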
I usually go for half-day granularity rather than hour granularity in my estimates if the work is expected to take more than a week overall.
using System;

public static class Time
{
    /// <summary>
    /// Estimates the hours.
    /// </summary>
    /// <param name="NumberPulledFromAss">The number pulled from ass.</param>
    /// <param name="Priority">The priority.</param>
    /// <param name="Complexity">The complexity.</param>
    /// <returns>
    /// a number in hours to estimate the time to complete a task.
    /// Hey, you will be wrong anyway why waste more time than you need?
    /// </returns>
    public static int EstimateHours(int NumberPulledFromAss, int Priority, int Complexity)
    {
        // Seeding with the guess means the same inputs always give the same answer.
        var rand = new Random(NumberPulledFromAss);
        // Random.Next(min, max) throws when min > max, so keep the upper bound at 2 or more.
        var baseGuess = rand.Next(1, Math.Max(2, NumberPulledFromAss));
        return (baseGuess + (Priority * Complexity)) * 2;
    }
}
Your estimates are as accurate as the time you put into them. This time can be time spent physically breaking down the problem, or time spent drawing upon past experience in areas you're familiar with. If this isn't an option, then try breaking the bugs/polish down into groups.
Trivial fix of a few hours.
Up to one day effort.
Very complex - one week effort.
Once you have these categorised, you can work out a rough guesstimate.
Many hints may be useful in this article on an agile blog: Agile Estimating.
Calculating the variability in your estimate will take longer than calculating your estimate.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Is it better to describe improvements using percentages or just the differences in the numbers? For example if you improved the performance of a critical ETL SQL Query from 4000 msecs to 312 msecs how would you present it as an 'Accomplishment' on a performance review?
In currency. Money is the most effective medium for communicating value, which is what you're trying to use the performance review to demonstrate.
Person hours saved, (very roughly) estimated value of $NEW_THING_THE_COMPANY_CAN_DO_AS_RESULT, future hardware upgrades averted, etc.
You get the nice bonus that you show that you're sensitive to the company's financial position; a geek who can align himself with what the company is really about.
Take potato
Drench Potato in Lighter Fluid
Light potato on fire
Hand potato to boss
Make boss hold it for 4 seconds.
Ask boss how long those 4 seconds felt
Ask boss how much better half a second would have been
Bask in glory
It is always better to measure relative improvement.
So, if you brought it down from 4000ms to 312ms, that is an improvement of 3688ms, which is 92.2% of the original runtime. So, you reduced the runtime by 92.2%. In other words, you brought the runtime down to only 7.8% of what it was originally.
Absolute numbers, on the other hand, usually are not that good since they are not comparable. (If your original runtime was 4,000,000ms then an improvement of 3688ms isn't that great.)
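If you want to double-check the figures before putting them in a review, the arithmetic fits in a few lines. This is just a convenience sketch; the function name and the rounding are my own choices.

```python
def describe_improvement(before_ms, after_ms):
    """Express a runtime change in absolute and relative terms."""
    saved = before_ms - after_ms
    return {
        "saved_ms": saved,                                         # absolute difference
        "reduction_pct": round(100 * saved / before_ms, 1),        # share of runtime removed
        "remaining_pct": round(100 * after_ms / before_ms, 1),     # share of runtime left
        "speedup": round(before_ms / after_ms, 2),                 # "N times faster"
    }

result = describe_improvement(4000, 312)
print(result)
```

The same 3688ms saving applied to a 4,000,000ms job gives a reduction of about 0.1%, which is the comparability point made above.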
See this link for some nice chart suggestions.
Comparison to Requirements
If I have requirements (response time, throughput), I like to color code the absolute numbers like so:
Green: <= 80% of the requirement (response time); >= 120% of the requirement (throughput)
No formatting: Meets the requirement.
Red: Does not meet the requirement.
Comparisons are interesting, but only if we have enough to see trends over time; Is our performance steadily improving or degrading? Ultimately, the business only cares if we're meeting the requirement. It's only when we don't that they ask for comparisons to previous releases.
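The colour-coding rule above could be captured in a report script roughly as follows. The function names and the "none" label for unformatted cells are placeholders of mine, not part of any reporting tool.

```python
def response_time_colour(measured, required):
    """Lower is better: green at <= 80% of the requirement, red if it is exceeded."""
    if measured > required:
        return "red"
    if measured <= 0.8 * required:
        return "green"
    return "none"  # meets the requirement, no formatting

def throughput_colour(measured, required):
    """Higher is better: green at >= 120% of the requirement, red if it is missed."""
    if measured < required:
        return "red"
    if measured >= 1.2 * required:
        return "green"
    return "none"  # meets the requirement, no formatting

# Hypothetical measurements against a 200 ms / 1,000 tps requirement.
print(response_time_colour(150, 200), throughput_colour(1300, 1000))
```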
Comparison of Benchmarks
If I'm comparing benchmarks to some baseline, then I like to use percentages, but only if the benchmark is a statistically significant change from the baseline.
Hardware Sizing
If I'm doing hardware sizing or capacity planning, then I like to express the performance as the absolute number plus the cost per transaction. For example:
System A: 1,000 transactions/second, $0.02/transaction
System B: 1,500 transactions/second, $0.04/transaction
Use whichever appears most impressive given the change. According to one method of calculation, that change sped up the query by 1,300%, which looks more impressive than a 13x improvement, or
============= <-- old query
= <-- new query
Although the graph isn't a bad method.
If you can calculate the improvement in money, then go for that. One piece of software I wrote many years ago saved a few engineers a little bit of time each day. After figuring in the cost of salary, benefits and overhead, it turned into a saving of more than $12k per year for a small company.
-Adam
Rule of thumb: Whichever sounds more impressive.
If you went from 10 tasks done in a period to 12, you could say you improved the performance by 20%.
Saying you did two more tasks doesn't seem that impressive.
In your case, both numbers sound good, but try different representations and see what you get!
Sometimes graphics help a lot if the improvement is spread across a number of factors but the combined figure somehow does not look that cool.
Example: You have 5 params A, B, C, D, E. You could make a bar chart with those 5 params and "before and after" values side by side for each param. That sure will look impressive.
God, I'm starting to sound like my friend from marketing!
runs away screaming
You can make numbers and graphs say anything you want. The important thing is to make them say something meaningful and relevant to the audience you're presenting them to. If it's end users, you can show them the difference in screen refreshes (something they understand). For managers, perhaps show the reduced number of servers they'll need in order to support the application ($ savings). For finance, it's all about the $: how much did it save them? A general rule is that the less technical the group, the more graphical and dramatic you need to be.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Some people have suggested that when doing an estimate one should give a lower and upper bound on the expected time to delivery. The few project tools I have seen seem to demand one fixed date. Are there any tools that support this concept of an estimation range?
Joel touts Evidence-Based Scheduling in their FogBugz 6.0 software.
There's also the classic method of providing a best, worst and expected case estimate for each item and then computing a result
computed_result = (b + 4e + w)/6
You can use that to demonstrate how you derived your estimates.
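Applied to a list of items, the formula above works out as in this short sketch. The item figures are invented, and summing the per-item results into a project total is my own simplification.

```python
def three_point(best, expected, worst):
    """Weighted estimate: (b + 4e + w) / 6."""
    return (best + 4 * expected + worst) / 6

# Hypothetical items: (best, expected, worst) in days.
items = [(2, 4, 12), (1, 2, 3)]

per_item = [three_point(b, e, w) for b, e, w in items]
total = sum(per_item)
print(per_item, total)
```

Note how the first item's long worst case pulls its result above the expected value of 4, while the symmetric second item stays at its expected value.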
HOWEVER, if you provide a range of time, all the client/sponsor/stakeholder is going to see is the lowest value, no matter what you say. So keep the range secret, and advertise the computed result.
I've used Merlin2 which is a project management product for the Mac. When you are starting a new project it asks you the start date and end date - which look fixed, but when you look at the project plan inspector you see that there is actually an "Earliest Date" and "Latest Date" for both the Start and End dates which can be edited. By default it adds the start date into "Earliest Start Date" and the end date to "Latest End Date" - and you can tweak as necessary.
"Some people have suggested that when doing an estimate one should make a lower and upper range on the expected time to delivery."
But what do your project stakeholders want? Will a range help them decide to fund your project?
Ranges don't really mean very much. Further, most people ignore the range and either see the low or the high number. Optimists have "happy-eyes", see the low number, and complain when you don't hit it, even if you're under the high number. Pessimists see the high number, say it's too big, and demand you replan the project to make the number smaller.
How -- precisely -- will a range help you? Who needs the range? What information will the range help them with? What decision do they have to make that requires a range?
I suggest that you plan each piece realistically.
Further, prioritize your project. After prioritizing, you'll see that there's some essential stuff, some important-but-not-essential stuff, and some optional stuff. This is your range. Cost to do the essential stuff is low. Cost to do important-but-not-essential is in the middle. Cost to do the optional stuff is high.
When someone asks you to "replan", you trim optional stuff.
It isn't a simplistic range. It's a realistic view of what you'll get done and what value it has.