Precision and recall in recommendation systems

I have designed a recommendation system and have a question about its evaluation. For top-1 recommendation, precision and recall both increase, but for top-3 recommendation they move in opposite directions. What is the reason for this? Thank you.
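For reference, here is a minimal sketch of how precision@k and recall@k are commonly computed for a single user. The item names and the two-relevant-items setup are assumptions made for illustration, not details from the question. It shows one common reason the two metrics can diverge as k grows: the precision denominator is k, while the recall denominator (the number of relevant items) stays fixed.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one user.

    recommended: ranked list of item ids, best first
    relevant:    set of item ids the user actually liked
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: the user has two relevant items, "a" and "c".
ranked = ["a", "b", "c", "d"]
liked = {"a", "c"}

for k in (1, 3):
    p, r = precision_recall_at_k(ranked, liked, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

With this toy data, going from k=1 to k=3 lowers precision (one of the three slots is a miss) while recall rises (both relevant items are now covered).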


Integer arithmetic errors in modern CPUs [closed]

Do I need to plan for possible miscalculations in modern CPUs, where for example an addition of two integers 1 and 1 results in 3 once?
(How often) Do such errors in the ALU occur?
Is there any built-in protection against this nowadays?
Is there a realistic chance that arithmetic errors like mentioned in the example above are the reason behind most "heisenbugs" out there?
CPU feature sizes have gotten small enough that errors like this in data can happen, but they're (much) more likely to happen on data being stored in memory than for an actual miscalculation to happen.
In some radiation-rich environments (e.g., on satellites) it's fairly common to have (for example) multiple CPUs that "vote" on an outcome, or repeat calculations when/if there's a disagreement. Other than that, about the only time it might be reasonable would be in something that was likely to affect human lives.
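The voting idea can be sketched roughly as follows. This is only an illustration: the unit functions, including the deliberately faulty one, are hypothetical stand-ins for redundant hardware.

```python
from collections import Counter


def vote(results):
    """Return the strict-majority value among redundant results, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def redundant_add(a, b, units):
    """Run the same addition on several units and vote on the outcome."""
    return vote([unit(a, b) for unit in units])


def good_unit(a, b):
    return a + b


def faulty_unit(a, b):
    # Simulates a one-off upset that corrupts the result.
    return a + b + 1


print(redundant_add(1, 1, [good_unit, good_unit, faulty_unit]))  # prints 2
```

When `vote` returns None there is no majority, which is the point at which a real system would repeat the calculation.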
While it's possible that there's a Heisenbug that's really a result of something like a single-bit upset, it's extremely unlikely, at least IMO. I've seen quite a few bugs, some of which were hard to track down -- but when they were, there were really mistakes in the code.
You should never see errors with integer math. Even with floating-point arithmetic it's exceedingly rare, unless someone is using a much older processor, or you're trying to do something with irrational numbers and incredible precision without using a specialized math library.
Are you doing something where you see integer errors? I'd be interested if you were.
Do I need to plan for possible miscalculations in modern CPUs
Yes. You also need to plan for spontaneous formation of black holes which could suddenly absorb all nearby matter, including you.
Do such errors in the ALU occur?
Well. If only engineers would use error-correcting codes, the odds are very, very small. What would have to happen is that a combination of error bits that happened to look valid would have to spontaneously arise in the circuitry. The odds aren't zero, but they're small.
Is there any built-in protection against this nowadays?
If only Error-correcting codes were not totally forgotten. Remember, "Parity is for farmers".
http://en.wikipedia.org/wiki/Error_detection_and_correction
http://en.wikipedia.org/wiki/Dynamic_random_access_memory#Errors_and_error_correction
http://en.wikipedia.org/wiki/SECDED#Hamming_codes_with_additional_parity_.28SECDED.29
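To make the error-correcting-code idea concrete, here is a small sketch of a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. It is a teaching example only: it is not the SECDED variant from the link above (that adds one more parity bit), and real ECC hardware does this in circuitry rather than software.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (parity bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data bits, error position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome = 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single-bit upset at position 5
print(hamming74_decode(word))  # prints ([1, 0, 1, 1], 5)
```

Two simultaneous flips would be miscorrected by this small code, which is the "combination of error bits that happened to look valid" case described above.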
Is there a realistic chance that arithmetic errors like mentioned
Yes. If you define "realistic" as non-zero, but really, really small.
Recent tests give widely varying error rates with over 7 orders of magnitude difference, ranging from 10^−10 to 10^−17 error/bit·h, roughly one bit error, per hour, per gigabyte of memory to one bit error, per century, per gigabyte of memory.
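As a sanity check on those figures, the per-bit rates can be converted to "time between errors per gigabyte" with a few lines of arithmetic. The sketch assumes a gigabyte of 2^30 bytes and a 365.25-day year. Under those assumptions the upper rate does come out at about one error per hour, while the lower rate works out closer to a millennium than a century.

```python
BITS_PER_GB = 8 * 1024**3       # assumption: one gigabyte = 2**30 bytes
HOURS_PER_YEAR = 24 * 365.25    # assumption: average year length


def hours_between_errors(rate_per_bit_hour, bits=BITS_PER_GB):
    """Expected hours between single-bit errors for a memory of the given size."""
    return 1.0 / (rate_per_bit_hour * bits)


high = hours_between_errors(1e-10)  # upper end of the quoted range
low = hours_between_errors(1e-17)   # lower end of the quoted range

print(f"1e-10 error/bit.h: one error every {high:.2f} hours per GB")
print(f"1e-17 error/bit.h: one error every {low / HOURS_PER_YEAR:.0f} years per GB")
```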

How can I avoid people using my code for evil? [closed]

I'm not sure if this is quite the right place, but it seems like a decent place to ask.
My current job involves manual analysis of large data sets (at several levels, each more refined and done by increasingly experienced analysts). About a year ago, I started developing some utilities to track analyst performance by comparing results at earlier levels to final levels. At first, this worked quite well - we used it in-shop as a simple indicator to help focus training efforts and do a better job overall.
Recently though, the results have been taken out of context and used in a way I never intended. It seems management (one person in particular) has started using the results of these tools to directly affect EPRs (enlisted performance reports - it's an Air Force thing, but I assume something similar exists in other areas) and similar paperwork. The problem isn't who is using these results, but how. I've made it clear to everyone that the results are, quite simply, error-prone.
There are numerous unavoidable obstacles to generating this data, which I have worked to minimize with some nifty heuristics and such. Taken in the proper context, they're a useful tool. Out of context however, as they are now being used, they do more harm than good.
The manager(s) in question are taking the results as literal indicators of whether an analyst is performing well or poorly. The results are being averaged and individual scores are being ranked as above (good) or below (bad) average. This is being done with no regard for inherent margins of error and sample bias, with no regard for any sort of proper interpretation. I know of at least one person whose performance rating was marked down for an 'accuracy percentage' less than one percentage point below average (when the typical margin of error from the calculation method alone is around two to three percent).
I'm in the process of writing a formal report on the errors present in the system ("Beginner's Guide to Meaningful Statistical Analysis" included), but all signs point to this having no effect.
Short of deliberately breaking the tools (a route I'd prefer avoiding but am strongly considering under the circumstances), I'm wondering if anyone here has effectively dealt with similar situations before? Any insight into how to approach this would be greatly appreciated.
Update:
Thanks for the responses - plenty of good ideas all around.
If anyone is curious, I'm moving in the direction of 'refine, educate, and take control of interpretation'. I've started rebuilding my tools to try and negate or track error better and automatically generate any numbers and graphs they could want, with included documentation throughout (while hiding away as obscure references the raw data they currently seem so eager to import to the 'magical' excel sheets).
In particular, I'm hopeful that visual representations of error and properly created ranking systems (taking into account error, standard deviations, etc.) will help the situation.
Either modify the output to include error information (so if the error is +/- 5 %, don't output 22%, output 17% - 27%), or educate those whom this is being used against to the error so that they can defend themselves when it is used against them.
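A minimal sketch of that first suggestion, assuming the tool works with percentages and a symmetric error margin (the function name is made up for illustration):

```python
def format_with_error(value_pct, error_pct):
    """Render a percentage as a low-high range instead of a single number."""
    low = max(0.0, value_pct - error_pct)     # clamp to the valid 0-100 range
    high = min(100.0, value_pct + error_pct)
    return f"{low:g}% - {high:g}%"


print(format_with_error(22, 5))  # prints 17% - 27%
```

Reporting only the range makes it harder to rank people on differences that are smaller than the error itself.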
Well, you seem to have run afoul of the Law of Unintended Consequences in the context of human behavior.
Unfortunately, once the cat is out of the bag, it's pretty hard to put back in. You have a few options (which are not mutually exclusive, by the way) to consider, including:
Alter the reports so that their data can no longer be abused in the way you describe.
Work with management to help them understand why their use of your data is improper or misleading.
Work with those whose performance is being measured to pressure management to rethink their policy on the matter.
Work with management/analysts to come up with a viable means to measure performance in a way that is fair to everyone.
Break the report in a manner that makes them unusable for any purposes.
Clearly there is a desire on the part of management to get analytics on performance of analysts. Likely there is a real need for this ... and your reports happened to fill a void in the available information. The best option for everyone would be to find a way to effectively and fairly fill this need. There are many possible ways to achieve this - from dropping dense rankings in favor of performance tiers to using time-over-time variance to refine performance measurements.
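One way the "performance tiers" idea could look in code is sketched below. The analyst names and scores are invented, and the 2.5-point margin is an assumption based on the "two to three percent" figure mentioned in the question.

```python
from statistics import mean


def tier(score, average, margin):
    """Place a score in a coarse tier relative to the group average.

    Differences smaller than the margin of error are treated as noise.
    """
    if score > average + margin:
        return "above"
    if score < average - margin:
        return "below"
    return "within margin"


# Hypothetical accuracy percentages for four analysts.
scores = {"analyst_a": 91.0, "analyst_b": 94.2, "analyst_c": 86.0, "analyst_d": 95.0}
avg = mean(scores.values())
for name, score in scores.items():
    print(name, score, tier(score, avg, margin=2.5))
```

Here an analyst who is about half a point below average lands in "within margin" rather than being marked down.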
Now, it's entirely possible that the existing reports you've provided simply cannot be applied in a fair and accurate manner to address this problem. In which case, you should work with your management team to make sure they understand why this is the case - and either redefine the way performance is measured or take the time to develop an appropriate and fair methodology.
One of the strongest means to convince management that their (ab)use of the data in your report is unwise is to remind them of the concept of perverse incentives. It's entirely possible that over time, analysts will modify their behavior in a way that results in higher rankings in performance reports at the cost of real performance or quality of results that are not otherwise captured or expressed. You seem to have a good understanding of your domain - so I would hope that you could provide specific and dramatic examples of such consequences to help make your case.
All you can do is to try and educate the managers as to why what they're doing is incorrect.
Beyond that, you can't stop idiots from being idiotic, and you'll just go mad trying.
I definitely wouldn't "break" code that people are relying on, even if it's not a specific deliverable. That will only cause them to complain about you, a move which may affect your own EPR :-)
I really think the key here is good communication with your managers.
Besides, I like PatrickV's idea. You could also try some other ways to engineer your tool around the problem so that it'll seem silly/be hard to use it as performance measurement - change the name of the statistics to mean something other than "how good programmer X is", make it hard to get data per-person, show error statistics.
You can also try to display the data in another way (this may actually make your managers think you are trying to help them). Show a graph - a difference of several pixels in position may be harder to spot than a difference in numeric results (my guess: your managers are using Excel and coloring everything below average red). Draw the error margin so it doesn't make sense to obsess over fractions of a percent.
Give the result as a range - a low and a high bound that take your error information into account - so that it is harder to compare individual scores.
Edit: Oh yeah, and read about "social interfaces". You can start with Spolsky's Not Just Usability and Building Communities with Software.
I would echo @paxdiablo's advice, as a first step:
Work on the report on the inherent errors. In fact, make it the introduction to every copy generated.
When you refer to the measurement errors, indicate they are the lower limit of the errors (unless there actually aren't any).
Try to educate the manager(s) in the error of his/her ways.
If possible, discuss the issue with your manager, and perhaps with the offending managers' management. Depending on how familiar you are with them, you should probably limit it to expressing some concerns and giving a heads-up.
Consult your HR department, or whomever is in charge of fairness in the performance reviews.
Good luck.
The problem is that the code is not yours, it belongs to your company. They really can do whatever they want with it.
I hate to say this, but if you have an issue with the ethics of your company you will have to leave that company.
One thing you could do is implement the comparison yourself. If he really wants to check if somebody is performing significantly less than the rest, it should be tested formally as well.
Choosing the right test is a bit tricky without knowing the data and its structure, so I can't really advise you on that one. Just take into account that if you do pairwise comparisons, or compare multiple scores against an average, you run into the multiple-testing problem. A classic way of correcting for it is Bonferroni. If you implement that one, you can be sure that at a certain point no one will stand out any more. The Bonferroni correction is very conservative. Another option is Dunn-Sidak, which is supposed to be less conservative.
The correct implementation would be an ANOVA - if the assumptions are met and the data are suitable, of course - with a post-hoc comparison like Tukey's Honest Significant Difference test. That way at least the uncertainty in the results is taken into account.
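For illustration, the two corrections mentioned above come down to adjusting the per-test significance threshold. In this sketch the p-values are made up, and the Sidak formula assumes the tests are independent.

```python
def bonferroni_alpha(alpha, m):
    """Per-test threshold under Bonferroni: alpha divided by the number of tests."""
    return alpha / m


def sidak_alpha(alpha, m):
    """Per-test threshold under Dunn-Sidak (assumes independent tests)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def significant(p_values, alpha=0.05, method=bonferroni_alpha):
    """Flag which p-values stay significant after the chosen correction."""
    threshold = method(alpha, len(p_values))
    return [p < threshold for p in p_values]


p_values = [0.004, 0.020, 0.030, 0.200]  # hypothetical results of four comparisons
print(bonferroni_alpha(0.05, 4), sidak_alpha(0.05, 4))
print(significant(p_values))
print(significant(p_values, method=sidak_alpha))
```

The Sidak threshold is always slightly larger than the Bonferroni one, which is why it is described as less conservative.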
If you don't have a clue on which test to use, describe your data in detail on stats.stackexchange.com and ask for help on which test to use.
Cheers
I just wanted to elaborate on the Perverse Incentives answer of LBushkin. I can easily see your problem extending to where analysts will avoid difficult topics for fear of reducing their score. Or maybe they will provide the same answer as earlier stages to avoid hurting a friend's score, even if that is not correct. An interesting question is what happens if the later answer is incorrect - you have no truth, just successive analytic opinions - in this case I assume the first answer is marked as "incorrect", right?
Maybe presenting some of these extensions to the manager will help.

What is Best for Defect Rate Tracking? Defects per KLOC?

I'm trying to create some internal metrics to demonstrate (determine?) how well TDD improves defect rates in code.
Is there a better way than defects/KLOC? What about a language's 'functional density'?
Any comments or suggestions would be helpful.
Thanks - Jonathan
You may also consider mapping defect discovery rate and defect resolution rates... how long does it take to find bugs, and once they're found, how long do they take to fix? To my knowledge, TDD is supposed to improve on fix times because it makes defects known earlier... right?
Any measure is an arbitrary comparison of defects to code size; so long as the comparison is similar, it should work. E.g., defects/kloc in C to defects/kloc in C. If you changed languages, it would affect the metric in any case, since the same program in another language might be less defect-prone.
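As a small worked example of the metric itself, with invented numbers for a code base before and after a process change (same language in both cases, as suggested above):

```python
def defects_per_kloc(defects, lines_of_code):
    """Defect density: defects per thousand lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Hypothetical figures for illustration only.
before = defects_per_kloc(45, 30_000)
after = defects_per_kloc(18, 24_000)
print(f"before: {before:.2f} defects/KLOC, after: {after:.2f} defects/KLOC")
```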
Measuring defects isn't an easy thing. One would like to account for the complexity of the code, but that is incredibly messy and unpleasant. When measuring code quality I recommend:
1. Measure the current state (what is your defect rate now?)
2. Make a change (peer reviews, training, code guidelines, etc.)
3. Measure the new defect rate (have things improved?)
4. Go to 2
If you are going to compare coders make sure you compare coders doing similar work in the same language. Don't compare the coder who works in the deep internals of your most complex calculation engine to the coder who writes the code that stores stuff in the database.
I try to make sure that coders know that the process is being measured not the coders. This helps to improve the quality of the metrics.
I suggest using the ratio between two times:
the time spent fixing bugs
the time spent writing other code
This seems valid across languages...
It also works if you only have a rough estimate for some big code base. You can still compare it to the new code you are writing, to impress your management ;-)
I'm skeptical of all LOC-related measurements, not just because of different relative expressiveness of languages, but because individual programmers will vary enough in the expressiveness of their code as to make this metric "fuzzy" at best.
The things I would measure in the interests of project management are:
Number of open defects on the project. There's no single scalar that can tell you where the project is and how close it is to a releasable state, but this is still a handy number to have on hand and watch over time.
Defect detection rate. This is not the rate of introduction of new defects into the system, but it's probably the closest proxy you'll find.
Defect resolution rate. If this is less than the detection rate, you're falling behind - if it's greater, you're getting ahead.
All of these numbers are more useful if you combine them with severity information. A product with 20 minor bugs may well be closer to release than one with 2 crashing bugs. If you're clearing the minor bugs but not the severe ones, you have to get the developers to refocus their attention.
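The relationship between detection rate, resolution rate and the open-defect count can be sketched as a simple running total. The weekly figures below are invented, and severity is left out to keep the example short.

```python
def backlog_trend(found_per_week, fixed_per_week, open_at_start=0):
    """Return the number of open defects at the end of each week."""
    open_defects = open_at_start
    history = []
    for found, fixed in zip(found_per_week, fixed_per_week):
        # The backlog grows when detection outpaces resolution, and shrinks otherwise.
        open_defects = max(0, open_defects + found - fixed)
        history.append(open_defects)
    return history


found = [12, 10, 8, 6]   # hypothetical defects detected per week
fixed = [7, 9, 10, 9]    # hypothetical defects resolved per week
print(backlog_trend(found, fixed, open_at_start=20))  # prints [25, 26, 24, 21]
```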
I would track these numbers per project and per developer. The reason for doing them per project should be clear. The per-developer numbers are certainly not the whole picture of an individual contributor's skill or productivity, but can point you to people who might need training or remediation.
You may also wish to tag all the tickets in your defect tracking system by project module as well (especially for larger projects), so that you can tell when critical modules are in a fragile state.
Why don't you consider defects per use case, or defects per requirement? We have faced practical issues in arriving at the KLOC.

Should a developer aim for readability or performance first? [closed]

Oftentimes a developer will be faced with a choice between two possible ways to solve a problem -- one that is idiomatic and readable, and another that is less intuitive, but may perform better. For example, in C-based languages, there are two ways to multiply a number by 2:
int SimpleMultiplyBy2(int x)
{
    return x * 2;
}
and
int FastMultiplyBy2(int x)
{
    return x << 1;
}
The first version is simpler to pick up for both technical and non-technical readers, but the second one may perform better, since bit shifting is a simpler operation than multiplication. (For now, let's assume that the compiler's optimizer would not detect this and optimize it, though that is also a consideration).
As a developer, which would be better as an initial attempt?
You missed one.
First code for correctness, then for clarity (the two are often connected, of course!). Finally, and only if you have real empirical evidence that you actually need to, you can look at optimizing. Premature optimization really is evil. Optimization almost always costs you time, clarity, maintainability. You'd better be sure you're buying something worthwhile with that.
Note that good algorithms almost always beat localized tuning. There is no reason you can't have code that is correct, clear, and fast. You'll be unreasonably lucky to get there starting off focusing on 'fast' though.
IMO the obvious readable version first, until performance is measured and a faster version is required.
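"Until performance is measured" can be as simple as timing both candidates before deciding. Below is a rough sketch using Python's timeit module. The function names are made up, and timings from an interpreted language say nothing about what a C-family compiler would generate, so treat it purely as an illustration of measuring first.

```python
import timeit


def multiply_by_2(x):
    return x * 2


def shift_by_1(x):
    return x << 1


# Time each candidate on the same input; absolute numbers vary by machine.
for fn in (multiply_by_2, shift_by_1):
    seconds = timeit.timeit(lambda: fn(12345), number=100_000)
    print(f"{fn.__name__}: {seconds:.4f} s for 100,000 calls")
```

If the two timings are indistinguishable, the readable version wins by default.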
Take it from Don Knuth
Premature optimization is the root of all evil (or at least most of it) in programming.
Readability 100%
If your compiler can't do the "x*2" => "x <<1" optimization for you -- get a new compiler!
Also remember that 99.9% of your program's time is spent waiting for user input, waiting for database queries and waiting for network responses. Unless you are doing the multiply 20 bajillion times, it's not going to be noticeable.
Readability for sure. Don't worry about the speed unless someone complains
In your given example, 99.9999% of the compilers out there will generate the same code for both cases. Which illustrates my general rule - write for readability and maintainability first, and optimize only when you need to.
Readability.
Coding for performance has its own set of challenges. Joseph M. Newcomer said it well:
Optimization matters only when it matters. When it matters, it matters a lot, but until you know that it matters, don't waste a lot of time doing it. Even if you know it matters, you need to know where it matters. Without performance data, you won't know what to optimize, and you'll probably optimize the wrong thing. The result will be obscure, hard to write, hard to debug, and hard to maintain code that doesn't solve your problem. Thus it has the dual disadvantage of (a) increasing software development and software maintenance costs, and (b) having no performance effect at all.
I would go for readability first. With the kind of optimized languages and powerful machines we have these days, most of the code we write in a readable way will perform decently.
In some very rare scenarios, where you are pretty sure you are going to have a performance bottleneck (maybe from some past bad experience), and you have managed to find some weird trick that gives you a huge performance advantage, you can go for that. But you should comment that code snippet very well, which will help to make it more readable.
Readability. The time to optimize is when you get to beta testing. Otherwise you never really know what you need to spend the time on.
An often overlooked factor in this debate is the extra time it takes for a programmer to navigate, understand and modify less readable code. Considering a programmer's time goes for a hundred dollars an hour or more, this is a very real cost.
Any performance gain is countered by this direct extra cost in development.
Putting a comment there with an explanation would make it readable and fast.
It really depends on the type of project, and how important performance is. If you're building a 3D game, then there are usually a lot of common optimizations that you'll want to throw in there along the way, and there's no reason not to (just don't get too carried away early). But if you're doing something tricky, comment it so anybody looking at it will know how and why you're being tricky.
The answer depends on the context. In device driver programming or game development for example, the second form is an acceptable idiom. In business applications, not so much.
Your best bet is to look around the code (or in similar successful applications) to check how other developers do it.
If you're worried about readability of your code, don't hesitate to add a comment to remind yourself what and why you're doing this.
Using << would be a micro-optimization.
So Hoare's (not Knuth's) rule:
Premature optimization is the root of all evil.
applies and you should just use the more readable version in the first place.
This rule is IMHO often misused as an excuse to design software that can never scale, or perform well.
Both. Your code should balance both; readability and performance. Because ignoring either one will screw the ROI of the project, which in the end of the day is all that matters to your boss.
Bad readability results in decreased maintainability, which results in more resources spent on maintenance, which results in a lower ROI.
Bad performance results in decreased investment and client base, which results in a lower ROI.
Readability is the FIRST target.
In the 1970's the army tested some of the then "new" techniques of software development (top down design, structured programming, chief programmer teams, to name a few) to determine which of these made a statistically significant difference.
The ONLY technique that made a statistically significant difference in development was...
ADDING BLANK LINES to program code.
The improvement in readability in that pre-structured, pre-object-oriented code was the only technique in these studies that improved productivity.
==============
Optimization should only be addressed when the entire project is unit tested and ready for instrumentation. You never know WHERE you need to optimize the code.
In their landmark books SOFTWARE TOOLS (1976) and SOFTWARE TOOLS IN PASCAL (1981), Kernighan and Plauger showed ways to create structured programs using top-down design. They created text processing programs: editors, search tools, code pre-processors.
When the completed text formatting function was INSTRUMENTED, they discovered that most of the processing time was spent in three routines that performed text input and output. (In the original book, the I/O functions took 89% of the time. In the Pascal book, these functions consumed 55%!)
They were able to optimize these THREE routines and increase performance with reasonable, manageable development time and cost.
The larger the codebase, the more readability is crucial. Trying to understand some tiny function isn't so bad. (Especially since the Method Name in the example gives you a clue.) Not so great for some epic piece of uber code written by the loner genius who just quit coding because he has finally seen the top of his ability's complexity and it's what he just wrote for you and you'll never ever understand it.
As almost everyone said in their answers, I favor readability. 99 out of 100 projects I run have no hard response time requirements, so it's an easy choice.
Before you even start coding you should already know the answer. Some projects have certain performance requirements, like 'need to be able to run task X in Y (milli)seconds'. If that's the case, you have a goal to work towards and you know when you have to optimize or not. (hopefully) this is determined at the requirements stage of your project, not when writing the code.
Good readability and the ability to optimize later on are a result of proper software design. If your software is of sound design, you should be able to isolate parts of your software and rewrite them if needed, without breaking other parts of the system. Besides, most true optimization cases I've encountered (ignoring some real low level tricks, those are incidental) have been in changing from one algorithm to another, or caching data to memory instead of disk/network.
If there is no readability, it will be very hard to get a performance improvement when you really need it.
Performance should only be improved when it is a problem in your program; many other places are more likely to be the bottleneck than this syntax. For instance, you might squeeze a 1 ns improvement out of a << while ignoring 10 minutes of I/O time.
Also, regarding readability, a professional programmer should be able to read/understand computer science terms. For example we can name a method enqueue rather than we have to say putThisJobInWorkQueue.
The bitshift versus the multiplication is a trivial optimization that gains next to nothing. And, as has been pointed out, your compiler should do that for you. Other than that, the gain is negligible anyhow as is the CPU this instruction runs on.
On the other hand, if you need to perform serious computation, you will require the right data structures. But if your problem is complex, finding out about that is part of the solution. As an illustration, consider searching for an ID number in an array of 1000000 unsorted objects. Then reconsider using a binary tree or a hash map.
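The ID lookup illustration can be tried out directly. This sketch simplifies the "objects" to bare integer IDs and uses a dict as the hash map; timings are printed without expected values because they depend on the machine.

```python
import random
import timeit

N = 1_000_000
ids = list(range(N))
random.shuffle(ids)  # 1,000,000 unsorted IDs


def linear_find(values, target):
    """Scan the unsorted list from the front: O(n) per lookup."""
    for position, value in enumerate(values):
        if value == target:
            return position
    return None


# Build the hash map once: ID -> position. Lookups are then O(1) on average.
index = {value: position for position, value in enumerate(ids)}


def indexed_find(lookup, target):
    return lookup.get(target)


target = ids[-1]  # worst case for the linear scan
print("linear :", timeit.timeit(lambda: linear_find(ids, target), number=3))
print("indexed:", timeit.timeit(lambda: indexed_find(index, target), number=3))
```

This is the kind of change (data structure, not instruction choice) where the difference is usually large enough to matter.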
But optimizations like n << C are usually negligible and trivial to change to at any point. Making code readable is not.
It depends on the task to be solved. Usually readability is more important, but there are still some tasks where you should think of performance first. And you can't just spend a day or two on profiling and optimization after everything works perfectly, because optimization itself may require rewriting a significant part of the code from scratch. But that is not common nowadays.
I'd say go for readability.
But in the given example, I think that the second version is already readable enough, since the name of the function states exactly what is going on in the function.
If only we always had functions that told us what they do ...
You should always maximally optimize, performance always counts. The reason we have bloatware today, is that most programmers don't want to do the work of optimization.
Having said that, you can always put comments in where slick coding needs clarification.
There is no point in optimizing if you don't know your bottlenecks. You may have made a function incredibly efficient (usually at the expense of readability to some degree) only to find that portion of code hardly ever runs, or it's spending more time hitting the disk or database than you'll ever save twiddling bits.
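Finding the bottleneck is what a profiler is for. Here is a small sketch using Python's standard cProfile and pstats modules. The two functions are hypothetical stand-ins: one pretends to hit the disk, the other does the bit twiddling.

```python
import cProfile
import io
import pstats
import time


def fake_disk_read():
    time.sleep(0.05)  # hypothetical stand-in for slow I/O
    return list(range(1000))


def twiddle_bits(values):
    return [v << 1 for v in values]


def job():
    return twiddle_bits(fake_disk_read())


profiler = cProfile.Profile()
profiler.enable()
result = job()
profiler.disable()

# Print the five entries with the highest cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
print(buffer.getvalue())
```

In a run like this the report should be dominated by the simulated I/O, not by the shift.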
So you can't micro-optimize until you have something to measure, and then you might as well start off for readability.
However, you should be mindful of both speed and understandability when designing the overall architecture, as both can have a massive impact and be difficult to change (depending on coding style and methodologies).
It is estimated that about 70% of the cost of software is in maintenance. Readability makes a system easier to maintain and therefore brings down cost of the software over its life.
There are cases where performance is more important than readability; that said, they are few and far between.
Before sacrificing readability, think "Am I (or is my company) prepared to deal with the extra cost I am adding to the system by doing this?"
I don't work at google so I'd go for the evil option. (optimization)
In Chapter 6 of Jon Bentley's "Programming Pearls", he describes how one system had a 400 times speed up by optimizing at 6 different design levels. I believe, that by not caring about performance at these 6 design levels, modern implementors can easily achieve 2-3 orders of magnitude of slow down in their programs.
Readability first. But even more than readability is simplicity, especially in terms of data structure.
I'm reminded of a student doing a vision analysis program, who couldn't understand why it was so slow. He merely followed good programming practice - each pixel was an object, and it worked by sending messages to its neighbors...
check this out
Write for readability first, but expect the readers to be programmers. Any programmer worth his or her salt should know the difference between a multiply and a bitshift, or be able to read the ternary operator where it is used appropriately, be able to look up and understand a complex algorithm (you are commenting your code right?), etc.
Early over-optimization is, of course, quite bad at getting you into trouble later on when you need to refactor, but that doesn't really apply to the optimization of individual methods, code blocks, or statements.
How much does an hour of processor time cost?
How much does an hour of programmer time cost?
IMHO the two things have nothing to do with each other. You should first go for code that works, as this is more important than performance or how well it reads. Regarding readability: your code should always be readable in any case.
However I fail to see why code can't be readable and offer good performance at the same time. In your example, the second version is as readable as the first one to me. What is less readable about it? If a programmer doesn't know that shifting left is the same as multiplying by a power of two and shifting right is the same as dividing by a power of two... well, then you have much more basic problems than general readability.
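The equivalence claimed above is easy to check, with one caveat worth knowing. The sketch below verifies it for Python integers, where right shift and floor division agree even for negative numbers. In C-family languages integer division typically truncates toward zero, so shifting a negative number right is not always the same as dividing it.

```python
def shift_matches_arithmetic(x, n):
    """Check that shifting by n equals multiplying / floor-dividing by 2**n."""
    return (x << n) == x * 2**n and (x >> n) == x // 2**n


print(all(shift_matches_arithmetic(x, n)
          for x in range(-1000, 1001)
          for n in range(8)))  # prints True

# Negative case: Python floors in both forms, giving -2 and -2.
# C-family integer division would give -1 for -3 / 2.
print(-3 >> 1, -3 // 2)
```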

Does anyone work with Function Points? [closed]

Some questions about Function Points:
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
2) And is the effort required worth the benefit you get out of it?
3) Which type of Function Points do you use?
4) Do you use any tools for doing this?
Edit: I am interested in hearing from people who use them or have used them. I've read up on estimation practices, including pros/cons of various techniques, but I'm interested in the value in practice.
I was an IFPUG Certified Function Point Specialist from 2002-2005, and I still use them to estimate business applications (web-based and thick-client). My experience is mostly with smaller projects (1000 FP or less).
I settled on Function Points after using Use Case Points and Lines of Code. (I've been actively working with estimation techniques for 10+ years now).
Some questions about Function Points:
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
Hard to answer quickly, as it depends on where you are in the lifecycle (from gleam-in-the-eye to done). You also have to realize that there's more to estimation than precision.
Their greatest strength is that, when coupled with historical data, they hold up well under pressure from decision-makers. By separating the scope of the project from productivity (h/FP), they result in far more constructive conversations. (I first got involved in metrics-based estimation when I, a web programmer, had to convince a personal friend of my company's founder and CEO to go back to his investors and tell them that the date he had been promising was unattainable. We all knew it was, but it was the project history and functional sizing (home-grown use case points at the time) that actually convinced him.)
Their advantage is greatest early in the lifecycle, when you have to assess the feasibility of a project before a team has even been assembled.
Contrary to common belief, it doesn't take that long to come up with a useful count, if you know what you're doing. Just off of the basic information types (logical files) inferred in an initial client meeting, and average productivity of our team, I could come up with a rough count (but no rougher than all the other unknowns at that stage) and a useful estimate in an afternoon.
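To make the "rough count times average productivity" idea concrete, here is a minimal sketch. The numbers, the `count_uncertainty` parameter, and all the names are hypothetical and chosen only for illustration; they are not taken from the answer or from any IFPUG guidance.

```python
from dataclasses import dataclass


@dataclass
class EffortEstimate:
    low_hours: float
    expected_hours: float
    high_hours: float


def estimate_effort(fp_count: float, hours_per_fp: float,
                    count_uncertainty: float = 0.25) -> EffortEstimate:
    """Turn a rough function point count into an effort range.

    fp_count          -- rough size of the project in function points
    hours_per_fp      -- the team's historical productivity (h/FP)
    count_uncertainty -- assumed relative error of the early count (hypothetical)
    """
    expected = fp_count * hours_per_fp
    return EffortEstimate(
        low_hours=expected * (1 - count_uncertainty),
        expected_hours=expected,
        high_hours=expected * (1 + count_uncertainty),
    )


# Hypothetical example: a 200 FP project and a team that averages 8 h/FP.
estimate = estimate_effort(200, 8.0)
print(estimate)
```

Keeping size (FP) and productivity (h/FP) as separate inputs is what lets the conversation with decision-makers focus on one or the other rather than on the final date.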
Combine Function Point Analysis with a Facilitated Requirements Workshop and you have a great project set-up approach.
Once things were getting serious and we had nominated a team, we would then use Planning Poker and some other estimation techniques to come up with an independent number, and compare the two.
2) And is the effort required worth the benefit you get out of it?
Absolutely. I've found preparing a count to be an excellent way to review user-goal-level requirements for consistency and completeness, in addition to all the other benefits. This was even in setting up Agile projects. I often found implied stories the customer had missed.
3) Which type of Function Points do you use?
IFPUG CPM (Counting Practices Manual) 4.2
4) Do you use any tools for doing this?
An Excel spreadsheet template I was given by the person who trained me. You put in the file or transaction attributes, and it does all of the table lookups for you.
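For readers who have not seen such a template, the table lookup it automates can be sketched roughly as follows. This is not the spreadsheet described above. The weights are the commonly published IFPUG unadjusted weights as I recall them, so check them against the Counting Practices Manual before relying on them. The sketch also assumes each component has already been rated low, average, or high; a real template derives that rating from the file or transaction attributes first.

```python
# Unadjusted weights per component type and complexity rating
# (assumed values, to be verified against the IFPUG CPM).
WEIGHTS = {
    "EI":  {"low": 3, "average": 4,  "high": 6},   # external inputs
    "EO":  {"low": 4, "average": 5,  "high": 7},   # external outputs
    "EQ":  {"low": 3, "average": 4,  "high": 6},   # external inquiries
    "ILF": {"low": 7, "average": 10, "high": 15},  # internal logical files
    "EIF": {"low": 5, "average": 7,  "high": 10},  # external interface files
}


def unadjusted_fp(components):
    """Sum the weights of (component_type, complexity) pairs."""
    return sum(WEIGHTS[kind][complexity] for kind, complexity in components)


# Hypothetical mini-application: two files and four transactions.
components = [
    ("ILF", "low"),
    ("ILF", "average"),
    ("EI", "average"),
    ("EI", "average"),
    ("EO", "high"),
    ("EQ", "low"),
]
print(unadjusted_fp(components))
```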
As a concluding note, NO estimate is as precise (or more precisely, accurate) as the bean-counters would like, for reasons that have been well documented in many other places. So you have to run your projects in ways that can accommodate that (three cheers for Agile).
But estimates are still a vital part of decision support in a business environment, and I would never want to be without my function points. I suspect the people who characterize them as "fantasy" have never seen them properly used (and I have seen them overhyped and misused grotesquely, believe me).
Don't get me wrong, FP have an arbitrary feel to them at times. But, to paraphrase Churchill, Function Points are the worst possible early-lifecycle estimation technique known, except for all the others.
Mike Cohn, in his Agile Estimating and Planning, considers FPs to be great but difficult to get right. He (obviously) recommends using story points-based estimation instead. I tend to agree with this, as with each new project I see the benefits of the Agile approach more and more.
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
As far as estimation precision goes, function points are very good. In my experience they are great but expensive in terms of the effort involved if you want to do it properly. Not that many projects can afford an elaboration phase to get the FP-based estimates right.
2) And is the effort required worth the benefit you get out of it?
FPs are great because they are officially recognised by ISO, which gives your estimations a great deal of credibility. If you work on a big project for a big client, it might be useful to invest in official-looking detailed estimations. But if the level of uncertainty is big to start with (integration with other vendors, legacy systems, loose requirements, etc.), you will not get anywhere near precision anyway, so usually you have to just accept this and re-iterate the estimations later. If that is the case, a cheaper way of doing the estimates (user stories and story points) is better.
3) Which type of Function Points do you use?
If I understand this part of your question correctly, we used to do estimations based on Feature Points but gradually moved away from these on almost all projects, except for the ones with a heavy emphasis on internal functionality.
4) Do you use any tools for doing this?
Excel is great with all the formulas you could use. Using Google Spreadsheets instead of Excel helps if you want to do that collaboratively.
There is also a great tool built into Sparx Enterprise Architect which allows you to do estimates based on Use Cases and which could be used for FP estimations as well.
The great hacknot is offline now, but it is in book form. He has an essay on function points: http://www.scribd.com/doc/459372/hacknot-book-a4, concluding they are a fantasy (which I agree with).
Joel on Software has a reasonable-sounding alternative called Evidence-Based Scheduling that at least sounds like it might work...
From what I have studied about Function Points (one of my teachers was highly involved in developing the theory of function points, and he wasn't able to answer all our questions), function points fail in many ways, because the mere fact that something is read or written doesn't mean you can evaluate it correctly. You might have a result of 450 function points, and some of these function points will take 1 hour and some will take 1 week. It's a metric that I will never use again.
No, because any particular requirement can have an arbitrary amount of effort depending on how precise (or imprecise) the author of the requirement is, and on the level of experience of the function point assessor.
No, because administration of imprecise derivations of abstract functionality yields no reliable estimate.
None if I can help it.
Tools? For function points? How about Excel? Or Word? Or Notepad? Or Edlin?
To answer your questions:
Yes, they are more precise than anything else I have encountered (in 20+ years).
Yes, they are well worth the effort. You can estimate size, resources, quality and schedule from just the FP count, which is extremely useful. It takes an average of 1 minute to count an FP manually and an average of 8 hours to fully code an FP (approximately $800 worth). Consider the carpenter's saying, "measure twice, cut once". And now a shameless plug: with https://www.ScopeMaster.com you can measure 1 FP per second, and you don't need to learn how!
I like Cosmic Function Points (because they are versatile) and IFPUG because there is a lot of published data (mostly from Capers Jones).
Having invested considerable time, effort and money in developing a tool that counts FPs automatically from requirements, I shall never have to do it manually again!
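The figures quoted in this answer (about 1 minute to count an FP manually, about 8 hours and roughly $800 to code one) imply a very small counting overhead. The arithmetic can be sketched as below; the function name is made up, and the hourly rate is simply derived from the quoted figures rather than stated in the answer.

```python
def counting_overhead(count_minutes_per_fp: float = 1.0,
                      coding_hours_per_fp: float = 8.0,
                      coding_cost_per_fp: float = 800.0):
    """Return (implied hourly rate, cost to count one FP, counting cost as a
    fraction of coding cost), using the figures quoted in the answer."""
    hourly_rate = coding_cost_per_fp / coding_hours_per_fp
    counting_cost_per_fp = hourly_rate * count_minutes_per_fp / 60
    overhead = counting_cost_per_fp / coding_cost_per_fp
    return hourly_rate, counting_cost_per_fp, overhead


rate, count_cost, overhead = counting_overhead()
print(f"implied rate: ${rate:.0f}/h")
print(f"cost to count one FP: ${count_cost:.2f}")
print(f"counting overhead: {overhead:.2%} of coding cost")
```

On these assumptions, manual counting costs a fraction of a percent of the coding effort. Other answers in this thread argue that real counts take considerably more effort to do properly.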

Resources