How to estimate the contribution of an individual to a software project? [closed] - software-estimation

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I work on a software project and would like to estimate the percentage out of the total contribution that I have put in the development of the software. Is there some tool doing this? Such a tool can be useful for appraisals or negotiations, for example. After all, we work for money (yes, not only money, put the point remains). I think there is enough hand-waving for the most important things.
The estimation is very subjective (at least to me now) but I do not know of any tool that provides even a subjective estimate. I know of Sloccount that spells out the total effort using the lines of code but not on per-developer basis.
My idea of an ideal tool for this purpose would:
measure the complexity of the code (more complex is more effort, but more effort is not necessarily more contribution)
measure the decomposibility/flexibility of the software (more decomposable is better)
how much library code is used -- using library code speeds up the development process, increases the associated risk and requires the developer to know from before or learn about the library.
be intelligent enough to differentiate between "who wrote the code", "who copied the code" and "who indented the code".
It is difficult to differentiate between the complexity in the implementation and the intrinsic complexity of the problem. Perhaps a comparison can be made with an equivalent open source counterpart if there is, or for each submodule separately.
If there is no such tool, is there no merit in having such a tool? Or do you believe in "I do work, I do not measure"? It takes time after all. Perhaps the project manager should do this estimation continuously, say, weekly. Are there any standards? Yes, standardization is difficult because every project has different goals, but perhaps that should mean there should be multiple standards, not no standards at all. This looks similar to the how a company is valued in the market.
Update: after seeing a few initial answers: It does not make sense to imagine a tool that just outputs the percentages. Are there tools that can help humans (particularly managers) in making better decisions? Or what is the sufficient statistic for making better decisions? Are these statistics available?

I really doubt there is any reliable trustworthy way of measuring individual's contribution to the solution. Sometimes rewriting some complicated legacy code that results in less lines of code, less complicated solution (smaller cyclomatic complexity etc.) can be seen as a quite significant contribution, while in other cases deleting valuable code covering edge cases that results in the same statistics (less lines of code, smaller CC etc.) is definitely something bad. It all comes down to people, trust and cooperation, individualism in the team is almost always wrong and I would rather avoid it and especially not use it as a motivation factor.

This is a research topic on its own. There are several tools that have tried to define metrics like code ownership. There are other approaches which tackle other aspect of collaborative development, for instance the trustability we can have in the code.
There has been also several studies that tried to use the information from bug trackers. For instance, to identify the developer that is the more likely to introduce bugs. But it's hard to be objective (A brilliant developer that is assigned the most critical part of the system, will still be more likely to introduce critical bugs).
It's actually hard to monetize the development tasks. What is the cost of a bug? What is the gain of refactoring? That would be however one way to estimate the contribution of a developer.
The last cool tool I saw of this kind was the Game Plugin for Hudson continuous integration system. A score is assigned to each developer according their actions
-10 if they break the build
-1 for breaking a test
+1 for fixing a test
etc.
That's again a way to somehow assess the contribution of the developer.
All in all, I do feel like what you are asking for exist, but is still very immature.

I don't think you can get a tool to evaluate your share of the project. Measuring lines of source is all very well, but what of the quality of that source? You wouldn't want someone taking the credit for 200 lines of source if the job could have been easiy done in 20...
Also, thinking of my employer for a moment, a lot of people contribute to the project in ways other than code. Immediate examples I can think of would be Project Managers and Testers - both of whom are essential, both of whom rightly deserve some credit.
Martin

The only thing that I could imagine would be a voting system. I have absolutely no idea, if that would work in your team or anywhere - but I'm sure, that you will need humans for any realistic estimation of code quality.

In Stroustrup's Book on C++ I've read once "Don't try to solve social problems with technical means".
Thinking progmatically, the attitude and the ability of a programmer could be very quickly estimated by making a code-review together and having a talk on relevant topics.
Thinking as an IT-enthusiast and as a control-freak, this shouldn't be very hard, to implement a teachable machine-learning software, which uses version-cotrol, bug-database, etc and greates real-time performanced data for each contributor. E.g. R, KNIME or WEKA could be used for this.

Related

Why did not CASE tools succeed? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
...or why did they fail?
I am going to build a proof of a concept of something which could be classified as CASE, but I want to avoid some of the mistakes done before.
Thanks!
First, I think diagrams provide real value when they're small and simple. Large, highly detailed diagrams mostly waste a lot of paper, time, hard drive space, etc. A pencil and paper work quite nicely for diagrams that are small enough (and simple enough) to be useful. A software tool only helps when you're producing a diagram that's so large and complex that it's practically guaranteed to be useless.
Second, with most CASE tools, the fastest way to draw a diagram is to start by writing some (possibly simplified, mockup) code, and then "reverse engineer" the diagram from the code. Drawing the diagram directly is often slower than writing the code. To provide much real value, producing the high level diagram has to be quite a bit simpler than writing equivalent code.
When you get down to it, I've rarely seen CASE tools used as an actual "aid" to "software engineering" anyway. In most cases I've seen, the software engineering is done entirely separately, and the CASE tools were used to reverse engineer diagrams from code that was already written. The people producing the diagrams generally found them useless, and included them in reports to higher-level managers for "wow factor". The only "aid" they hoped for from the diagrams was impressing management with the complexity of what they were doing in the hope of increasing funding (some included diagrams of things like portions of the standard library, purely to add to apparent complexity).
As to how the tools failed at the software engineering part, I don't know of a single simple answer -- from what I've seen, I'd say it's more of a "death of a thousand nicks", than any single, glaring problem. If I did have to point to a single large problem, it would be that the ones I've looked at don't really take Patterns into account. Just for example, what I'd like is to work at an even higher level of abstraction, so I can point to some functionality, and play with things like "how would things look if I were to implement the following parts of that functionality as decorator classes?" Yes, I can draw one diagram with them as decorator classes, and one without, but I don't have a really quick, easy way to say "transform this entire hierarchy to move X, Y, and Z into decorator classes."
Contrast a typical CASE tool with a spreadsheet. In a spreadsheet, I can change one cell, and it will automatically recalculate how that affects anything else in the spreadsheet that depends on it. By contrast, CASE tools seem (at least to me) stuck at roughly the level of a grid control, where I can make changes in a cell, but I still have to manually track what other cells depend on that one, and what formulas to use, and calculate and modify all the affected cells by hand. Yes, if I want to print out a sheet of the right values, being able to edit them on the computer so I don't have eraser marks in the cells and such would be an improvement -- but only a small improvement, not the kind that turned personal computers from toys for a few hobbyists into a staple of essentially every business on earth.
If you look at the Wikipedia entry: http://en.wikipedia.org/wiki/Computer-aided_software_engineering then you'll see the "classic" tools from the 1990s. Having worked with many of those tools, I would suggest that the focus on commercialisation fragmented the market. Typically, not only did you pay huge sums for the tools, but then for consulting, training and run-time environments. With so many tools on offer, it was hard to build a competent team specialising on given tool.
Furthermore, it didn't help that the tools were over-sold. Promising managements unrealistic increases in productivity. There isn't any other area of IT where I've seen so much shelf-ware - products used for one project and then abandoned, often along with the project too.
The concepts of CASE live on in Eclipse and many other MDE tools. The problems of steep learning curves and fragmentation have still not been solved. Whilst the cost of the tools has reduced (to free in many cases) the training, consulting and ramp up costs are still there.
Before you expend a lot of effort on your CASE tool, have a look at the fields of MDA, MDE, DSL, even UML. Its worth browsing the OMG web site as well.
At the end of the day you should focus on the what you produce and not the tool. If you are able to automate some tasks then that's good. Building yet another CASE-like tool is a great intellectual exercise, but with minimal chances of commercial success. After all IBM, Oracle and Computer Associates have only had sporadic successes with their tools and they are still vigorously marketing them to enterprise customers.
I worked with Knowldegeware back in the early 90's. My simple answer to the demise of CASE is that as soon as you printed the model it was old. Keeping the model and the code in synch became impossible. The first target platform was MicroFocus COBOL, but then came Client-Server 94-95, followed by the internet 97-98 and nobody really wanted to use CASE with those new platforms.

Becoming the most efficient one-man team [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
Improve this question
Like many here, I am a one-man development team. I'm responsible for everything from gathering project requirements, designing concept-screens, planning and developing databases, and writing all code.
Being a one-man team is nice, but has its negatives. I don't have the ability to quickly consult with other developers, I rarely get a second set of eyes for my code, and I'm sure you guys can come up with many other negatives too.
To make the most of my time, and commit myself most efficiently to my work, what tips or practices could I implement into my day-to-day routine to be the best one-man team possible?
Daily list of what I am going to do.
Remove as many distractions as possible to focus on tasks. Turn off
email, turn off IM, etc... even if
for a set period of time and then
during a break check them.
Take time to learn about other coding techniques, tools and programming wisdom. This I have found to be crucial to my development. It's to easy to just code away and feel productive. What about what could be if you just had some more knowledge / weaponry under your belt to bang out that next widget. I know this one really sounds counter productive but it really isn't. Knowledge/know how is our real currency. The more we know the more we can make a better decision about how something should be done and do it faster.
Take breaks and be aware of your
body. When we are tired we don't
think as well and will make more
mistakes, become frustrated more
easily, etc...
Learn to use the 80 / 20 rule to your
advantage. I don't mean skimp or be
lazy. Often though we will work our
tail off for that 20% when it wasn't
necessary.
Set goals for yourself (daily,
weekly, bi-weekly). Make sure the
goals are also in line with those you
are coding for or you may find you
have wasted some time.
From a technical aspect consider:
Consider Unit testing / TDD. I have found in
my own work that this actually saves
time. It takes a while to get the
hang of but with anything you will
get better.
Care for your code. Refactor it
(especially if you start unit
testing). The better your code is
the easier it is to maintain which
takes less time. The easier it is to
understand the faster you can change
/ implement features.
I'm learning to spend a lot more time planning out my day than I used to. This includes planning out projects, down to writing psuedo-code for the programming I need to do. I find that with all the interruptions in my schedule, it's difficult for me to get started at something. Having everything broken down into small tasks makes it much easier to start after an interruption.
According to operational research, shortest job first is the best scheduler to get most amount of things done.
I write and run integration and system tests, but no unit tests, because I've no need for early (pre-integration) testing: Should one test internal implementation, or only test public behaviour?
A corrolary of Conway's Law is that you need to test the internal software interfaces which separate/integrate developers, whereas a "one man army" don't need to explicitly test his internal interfaces in this way.
A lot of the other tips are good but they equally apply to developers working in a team as well as a lone developer.
I think the hardest thing as a one man team is effective communication with the rest of your company. You will always be a lone programmers voice in any meeting or discussion around how best to build software.
As a result I'd advise trying to improve negotiation skills and focus on improving the way you describe technical concepts in terms a non-programmer can understand. Reading books such as Getting to Yes and How to win friends and influence people are a good way to start.
When there is more than one person agreeing on a viewpoint, the viewpoint automatically gains credibility with those you are trying to convince. In the absence of this possiblity you need to work extra hard at preparing your arguments with well-researched evidence and a balanced view.
I'm in the same situation. There's already a lot of good advice above but one thing I'd add is find when your best coding times are and make sure you're coding during that time. I have a few hours in the morning that I seem to be at my best for coding. I try to keep that time free of all distractions. Plan things like meetings, writing documentation, testing (at least the tedious, repetitive stuff), and all that other stuff for your less productive time. Keep those coding hours when you're 2 to 5 times more productive for coding.
Make sure you refactor early and often. That serves almost like a second set of eyes (for me, at least).
Don't work insane hours (especially tricky if you're working from home). Actually, working less hours often proves more productive as the impending break/end of day pressure increases your efficiency.
You may want to look up Parkinson's Law for work/time management.
I use a text file to collect all the things I do every day. Every time I run into a problem or have a question or find a solution, I add it to my file. It's very low-tech but it provides a wealth of information, like "where am I spending most of my time?" or "how did I fix that problem before?". Also makes it super-quick to give your client a list of hours at the end of your billing cycle.
I also use another text file (per client) that contains all the work items on my plate, arranged in order of priority, and updated frequently. It helps both me and my clients focus on what I should be working on next, so the pump is always primed.
Eventually I'll move away from flat text files to using something like FogBugz, but for now I can't beat the price, or how easy it is to search, or how easy it is to e-mail.

IT evaluating quality of coding - how do we know what's good? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
Coming from an IT background, I've been involved with software projects but I'm not a programmer. One of my biggest challenges is that having a lot of experience in IT, people often turn to me to manage projects that include software development. The projects are usually outsourced and there isnt a budget for a full time architect or PM, which leaves me in a position to evaluate the work being performed.
Where I've managed to get through this in the past, I'm (with good reason) uneasy about accepting these responsibilities.
My question is, from a perspective of being technically experienced but not in programming, how can I evaluate whether coding is written well besides just determining if it works or not? Are there methodologies, tips, tricks of the trade, flags, signs, anything that would say - hey this is junk or hey this is pretty damn good?
Great question. Should get some good responses.
Code cleanliness (indented well, file organization, folder structure)
Well commented (not just inline comments, but variables that say what they are, functions that say what they do, etc.)
Small understandable functions/methods (no crazy 300 line methods that do all sorts of things with nested if logic all over the place)
Follows SOLID principles
Is the amount of unit test code similar in size and quality as the code base of the project
Is the interface code separate from the business logic code which in turn should be separate from the infrastructure access code (email, database, web services, file system, etc.)
What does a performance analysis tool think of the code (NDepend, NDoc, NCover, etc.)
There is a lot more to this...but this gets your started.
Code has 2 primary audiences:
The people who use it
The people who develop it
So you neeed 2 simple tests:
Run the code. Can you get it to do the job it is supposed to do?
Read the code. Can you understand the general intentions of the developer?
If you can answer yes to both of these, it is great code.
When reading the code, don't worry that you are not a programmer. If code is well written / documented, even a non-programmer should be able to see guess much of what it is intended to achieve.
BTW: Great question! I wish more non-programmers cared about code quality.
First, set ground rules (that all programmers sign up to) that say what's 'good' and what isn't. Automate tests for those that you can measure (e.g. functions less than a number of lines, McCabe complexity, idioms that your coders find confusing). Then accept that 'good coding' is something you know when you see rather than something you can actually pin down with a set of rules, and allow people to deviate from the standard provided they get agreement from someone with more experience. Similarly, such standards have to be living documents, adapted in the face of feedback.
Code reviews also work well, since not all such 'good style' rules can be automatically determined. Experienced programmers can say what they don't like about inexperienced programmers' code - and you have to get the original authors to change it so that they learn from their mistakes - and inexperienced programmers can say what they find hard to understand about other people's code - and, by being forced to read other people's code, they'll also learn new tricks. Again, this will give you feedback on your standard.
On some of your specific points, complexity and function size work well, as does code coverage during repeatable (unit) testing, but that last point comes with a caveat: unless you're working on something where high quality standards are a necessity (embedded code, as an example, or safety-critical code) 100% code coverage means you're testing the 10% of code paths that are worthwhile to test and the 90% that almost never get coded wrong in the first place. Worthwhile tests are the ones that find bugs and improve maintainability.
I think it's great you're trying to evaluate something that typically isn't evaluated. There have been some good answers above already. You've already shown yourself to be more mature in dealing with software by accepting that since you don't practice development personally, you can't assume that writing software is easy.
Do you know a developer whose work you trust? Perhaps have that person be a part of the evaluation process.
how can I evaluate whether coding is written well
There are various ways/metrics to define 'well'or 'good', for example:
Delivered on time
Delivered quickly
No bugs after delivery
Easy to install
Well documented
Runs quickly
Uses cheap hardware
Uses cheap software
Didn't cost much to write
Easy to administer
Easy to use
Easy to alter (i.e. add new features)
Easy to port to new hardware
...etc...
Of these, programmers tend to value "easy to alter": because, their job is to alter existing software.
Its a difficult one and could be where your non-functional requirements will help you
specify your performance requirements: transactions per second, response time, expected DB records over time,
require the delivery to include outcome from a performance analysis tool
specify the machine the application will be running on, you should not have to upgrade your hardware to run the app
For eyeballing the code and working out whether or not its well written its tougher, the answers from #Andrew & #Chris cover it pretty much... you want code that looks good, is easy to maintain and is performant.
Summary
Use Joel Test.
Why?
Thanks for tough question. I was about to write a long answer on merits of direct and indirect code evaluation, understanding your organisational context, perspective, figuring out a process and setting a criteria for code to be good enough, and then the difference between the code being perfect and just good enough which still might mean “very impressive”. I was about to refer to Steve McConnell’s Code Complete and even suggest delegating code audit to someone impartial you can trust, who is savvy enough business and programming-wise to get a grasp of the context, perspective, apply the criteria sensibly and report results neatly back to you. I was going to recommend looking at parts of UI that are normally out of end-user reach in the same way as one would be judging quality of cleaning by checking for dirt in hard-to-reach places.
Well, and then it struck me: what is the end goal? In most, but very few edge cowboy-coding scenarios, as a result of the audit you’re likely to discover that the code is better than junk, but certainly not damn good, maybe just slightly below the good enough mark. And then what is next? There are probably going to be a few choices:
Changing the supplier.
Insisting on the code being re-factored.
Leaving things as they are and from that point on demanding better code.
Unfortunately, none of the options is ideal or very good either. Having made an investment changing supplier is costly and quite risky: part of the software conceptual integrity will be lost, your company will have to, albeit indirectly, swallow the inevitable cost of the new supplier taking over the development and going through the learning curve (exactly opposite to that most suppliers are going to tell you to try and get their foot in the door). And there is going to be a big risk of missing the original deadlines.
The option of insisting on code re-factoring isn’t perfect either. There is going to be a question of cost and it’s very likely that for various contractual and historical reasons you won’t find yourself in a good negotiation position. In any case re-writing software is likely to affect deadlines and the organisation what couldn’t do the job right the first time is very unlikely to produce much better code on the second attempt. The latter is pertinent to the third option I would be dubious of any company producing a better code without some, often significant, organisational change. Leaving things as they are not good either: a piece of rotten code unless totally isolated is going to eventually poison the rest of the source.
This brings me to the actual conclusion, or in fact two:
Concentrate on picking the right software company in a first place, since going forward options are going to be somewhat constrained.
Make use of IT and management knowledge to pick a company that is focused on attracting and retaining good developers, that creates a working environment and culture fit for production of good quality code instead of relying on the post factum analysis.
It’s needless to expand on the importance of choosing the right company in the first place as opposed to summative evaluation of delivered project; hopefully the point is already made.
Well, how do we know the software company is right? Here I fully subscribe to the philosophy evangelised by Joel Spolsky: quality of software directly depends on quality of people involved which as it has been indicated by several studies can vary by an order of magnitude. And through the workings of free markets developers end up clustered in companies based on how much a particular company cares about attracting and retaining them.
As a general rule of life, best programmers end up working with the best, good with good, average with average and cowboy coders with other cowboy coders. However, there is a caveat. Most companies would have at least one or two very good developers they care about and try their hardest to retain. These devs are always put on a frontline: to fire fight, to lure a customer, to prove the organisation potential and competence. Working amongst more than average colleagues, overstretched between multiple projects, and being treated as royalty, sadly, these star programmers very often loose touch with the reality and become prima donnas who won’t “dirty” their hands with any actual programming work.
Unfortunately, programming talent doesn’t scale and it’s unlikely that the prima donna is going to work on your project past the initial phase designed to lure and lock in you as a customer. At the end the code is going to be produced by a less talented colleague and as a result you’ll get what you’ll get.
The solution is to look for a company there developer talents are more consistent and everyone is at least good enough to produce the right quality of code. And when it comes to choosing such an organisation that’s where Joel Test comes mighty handy. I believe it’s especially suitable for application by someone who has no programming experience but good understanding of IT and management.
The more points company scores on the Joel Test the more it’s likely to attract and retain good developers and most importantly provide them with the conditions to produce quality code. And since most great devs are actually in love with programming all the need is to be teamed up, given good and supportive work environment, a credible goal (or even better incredible) and they’ll start chucking out high quality code. It’s that simple.
Well, the only thing is that company that scores full twelve points on the Joel’s Test is likely to charge more than a sweatshop that scores a mere 3 or 5 (a self-estimated industry average). However, the benefits of having the synergy of efficient operations and bespoke trouble-free software that leverage strategic organisational goals will undoubtedly produce exceptional return on investment and overcome any hurdle rates by far outweighing any project costs. I mean, at the end of the day the company's work will likely be worth the money, every penny of it.
Also hope that someone will find this longish answer worthwhile.

Software quality metrics [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Improve this question
I was wondering if anyone has experience in metrics used to measure software quality. I know there are code complexity metrics but I'm wondering if there is a specific way to measure how well it actually performs during it's lifetime. I don't mean runtime performance, but rather a measure of the quality. Any suggested tools that would help gather these are welcome too.
Is there measurements to answer these questions:
How easy is it to change/enhance the software, robustness
If it is a common/general enough piece of software, how reusable is it
How many defects were associated with the code
Has this needed to be redesigned/recoded
How long has this code been around
Do developers like how the code is designed and implemented
Seems like most of this would need to be closely tied with a CM and bug reporting tool.
If measuring code quality in the terms you put it would be such a straightforward job and the metrics accurate, there would probably be no need for Project Managers anymore. Even more, the distinction between good and poor managers would be very small. Because it isn't, that just shows that getting an accurate idea about the quality of your software, is no easy job.
Your questions span to multiple areas that are quantified differently or are very subjective to quantification, so you should group these into categories that correspond to common targets. Then you can assign an "importance" factor to each category and derive some metrics from that.
For instance you could use static code analysis tools for measuring the syntactic quality of your code and derive some metrics from that.
You could also derive metrics from bugs/lines of code using a bug tracking tool integrated with a version control system.
For measuring robustness, reuse and efficiency of the coding process you could evaluate the use of design patterns per feature developed (of course where it makes sense). There's no tool that will help you achieve this, but if you monitor your software growing bigger and put numbers on these it can give you a pretty good idea of how you project is evolving and if it's going in the right direction. Introducing code-review procedures could help you keep track of these easier and possibly address them early in the development process. A number to put on these could be the percentage of features implemented using the appropriate design patterns.
While metrics can be quite abstract and subjective, if you dedicate time to it and always try to improve them, it can give you useful information.
A few things to note about metrics in the software process though:
Unless you do them well, metrics could prove to be more harm than good.
Metrics are difficult to do well.
You should be cautious in using metrics to rate individual performance or offering bonus schemes. Once you do this everyone will try to cheat the system and your metrics will prove worthless.
If you are using Ruby, there are some tools to help you out with metrics ranging from LOCs/Method and Methods/Class Saikuros Cyclomatic complexity.
My boss actually held a presentation on software metric we use at a ruby conference last year, these are the slides.
A interesting tool that brings you a lot of metrics at once is metric_fu. It checks alot of interesting aspects of your code. Stuff that is highly similar, changes a lot, has a lot of branches. All signs your codes could be better :)
I imagine there are lot more tools like this for other languages too.
There is a good thread from the old Joel on Software Discussion groups about this.
I know that some SVN stat programs provide an overview over changed lines per submit. If you have a bugtracking system and persons fixing bugs adding features etc are stating their commit number when the bug is fixed you can then calculate how many line were affected by each bug/new feature request. This could give you a measurement of changeability.
The next thing is simply count the number of bugs found and set them in ratio to the number of code lines. There are some values how many bugs a high quality software should have per codeline.
You could do it in some economic way or in programmer's way.
In case of economic way you mesaure costs of improving code, fixing bugs, adding new features and so on. If you choose the second way, you may want to measure how much staff works with your program and how easy it is to, say, find and fix an average bug in human hours. Certainly they are not flawless, because costs depend on the market situation and human hours depend on the actual people and their skills, so it's better to combine both methods.
This way you get some instruments to mesaure quality of your code. Of course you should take into account the size of your project and other factors, but I hope main idea is clear.
A more customer focused metric would be the average time it takes for the software vendor to fix bugs and implement new features.
It is very easy to calculate, based on your bug tracking software's date created, and closed information.
If your average bug fixing/feature implementation time is extremely high, this could also be an indicator for bad software quality.
You may want to check the following page describing various different aspects of software quality including sample plots. Some of the quality characteristics you require to measure can be derived using tool such as Sonar. It is very important to figure out how would you want to model some of the following aspects:
Maintainability: You did mention about how easy is it to change/test the code or reuse the code. These are related with testability and re-usability aspect of maintainability which is considered to be key software quality characteristic. Thus, you could measure maintainability as a function of testability (unit test coverage) and re-usability (cohesiveness index of the code).
Defects: Defects alone may not be a good idea to measure. However, if you can model defect density, it could give you a good picture.

Software projects and development in a research environment [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
What are useful strategies to adopt when you or the project does not have a clear idea of what the final (if any) product is going to be?
Let us take "research" to mean an exploration into an area where many things are not known or implemented and where a formal set of deliverables cannot be specified at the start of the project. This is common in STEM (science (physics, chemistry, biology, materials, etc.), technology engineering, medicine) and many areas of informatics and computer science. Software is created either as an end in itself (e.g. a new algorithm), a means of managing data (often experimental) and simulation (e.g. materials, reactions, etc.). It is usually created by small groups or individuals (I omit large science such as telescopes and hadron colliders where much emphasis is put of software engineering.)
Research software is characterised by (at least):
unknown outcome
unknown timescale
little formal project management
limited budgets (in academia at least)
unpredictability of third-party tools and libraries
changes in the outside world during the project (e.g. new discoveries which can be positive - save effort - or negative - getting scooped
Projects can be anything from days ("see if this is a worthwhile direction to go") to years ("this is my PhD topic") or longer. Frequently the people are not hired as software people but find they need to write code to get the research done or get infected by writing software. There is generally little credit for good software engineering - the "product" is a conference or journal publication.
However some of these projects turn out to be highly valuable - the most obvious area is genomics where in the early days scientists showed that dynamic programming was a revolutionary tool to help thinking about protein and nucleic structure - now this is a multi-billion industry (or more). The same is true for quantum mechanics codes to predict properties of substances.
The downside is that much code gets thrown away and it is difficult to build on. To try to overcome this we have build up libraries which are shared in the group and through the world as Open Source (but here again there is very little credit given). Many researchers reinvent the wheel ("head-down" programming where colleagues are not consulted and "hero" programming where someone tries to do the whole lot themself).
Too much formality at the start of a project often puts people off and innovation is lost (no-one will spend 2 months writing formal specs and unit tests). Too little and bad habits are developed and promulgated. Programming courses help but again it's difficult to get people doing them especially when you rely on their goodwill. Mentoring is extremely valuable but not always successful.
Are there online resources which can help to persuade people into good software habits?
EDIT: I'm grateful for dmckee (below) for pointing out a similar discussion. It's all good stuff and I particularly agree with version control as being one of the most important things that we can offer scientists (we offered this to our colleagues and got very good takeup). I also like the approach of the Software Carpentry course mentioned there.
It's extremely difficult. The environment both you and Stefano Borini describe is very accurate. I think there are three key factors which propagate the situation.
Short-term thinking
Lack of formal training and experience
Continuous turnover of grad students/postdocs to shoulder the brunt of new development
Short-term thinking. There are a few reasons that short-term thinking is the norm, most of them already well explained by Stefano. As well as the awful pressure to publish and the lack of recognition for software creation, I would emphasise the number of short-term contracts. There is simply very little advantage for more junior academics (PhD students and postdocs) to spend any time planning long-term software strategies, since contracts are 2-3 years. In the case of longer-term projects e.g. those based around the simulation code of a permanent member of staff, I have seen some applications of basic software engineering, things like simple version control, standard test cases, etc. However even in these cases, project management is extremely primitive.
Lack of formal training and experience. This is a serious handicap. In astronomy and astrophysics, programming is an essential tool, but understanding of the costs of development, particularly maintenance overheads, is extremely poor. Because scientists are normally smart people, there is a feeling that software engineering practices don't really apply to them, and that they can 'just make it work'. With more experience, most programmers realise that writing code that mostly works isn't the hard part; maintaining and extending it efficiently and safely is. Some scientific code is throwaway, and in these cases the quick and dirty approach is adequate. But all too often, the code will be used and reused for years to come, bringing consequent grief to all involved with it.
Continuous turnover of grad students/postdocs for new development. I think this is the key feature that allows the academic approach to software to continue to survive. If the code is horrendous and takes days to understand and debug, who pays that price? In general, it's not the original author (who has probably moved on). Nor is it the permanent member of staff, who is often only peripherally involved with new development. It is normally the graduate student who is implementing new algorithms, producing novel approaches, trying to extend the code in some way. Sometimes it will be a postdoc, hired specifically to work on adding some feature to an existing code, and contractually obliged to work on this area for some fraction of their time.
This model is hugely inefficient. I know a PhD student in astrophysics who spent over a year trying to implement a relatively basic piece of mathematics, only a few hundred lines of code, in an existing n-body code. Why did it take so long? Because she literally spent weeks trying to understand the existing, horribly written code, and how to add her calculation to it, and months more ineffectively debugging her problems due to the monolithic code structure, coupled with her own lack of experience. Note that there was almost no science involved in this process; just wasting time grappling with code. Who ultimately paid that price? Only her. She was the one who had to put more hours in to try and get enough results to make a PhD. Her supervisor will get another grad student after she's gone - and so the cycle continues.
The point I'm trying to make is that the problem with the software creation process in academia is endemic within the system itself, a function of the resources available and the type of work that is rewarded. The culture is deeply embedded throughout academia. I don't see any easy way of changing that culture through external resources or training. It's the system itself that needs to change, to reward people for writing substantial code, to place increased scrutiny on the correctness of results produced using scientific code, to recognise the importance of training and process in code, and to hold supervisors jointly responsible for wasting the time of the members of their research group.
I'll tell you my experience.
It is undoubt that a lot of software gets created and wasted in the academia. Fact is that it's difficult to adapt research software, purposely created for a specific research objective, to a more general environment. Also, the product of academia are scientific papers, not software. The value of software in academia is zero. The data you produce with that software is evaluated, once you write a paper on it (which takes a lot of editorial time).
In most cases, however, research groups have recognized frequent patterns, which can be polished, tested and archived as internal knowledge. This is what I do with my personal toolkit. I grow it according to my research needs, only with those features that are "cross-project". Developing a personal toolkit is almost a requirement, as your scientific needs are most likely unique for some verse (otherwise you would not be doing research) and you want to have as low amount of external dependencies as possible (since if something evolves and breaks your stuff, you will not have the time to fix it).
Everything else, however, is too specific for a given project to be crystallized. I therefore tend not to encapsulate something that is clearly a one-time solver. I do, however, go back and improve it if, later on, other projects require the same piece of code.
Short project span, and the heat of research (e.g. the publish or perish vision so central today), requires agile, quick languages, and in general, languages that can be grasped quickly. Ph.Ds in genomics and quantum chemistry don't have formal programming background. In some cases, they don't even like it. So the language must be quick, easy, clean, flexible, and easy to understand later on. The latter point is capital, as there's no time to produce documentation, and it's guaranteed that in academia, everyone will leave sooner or later, you burn the group experience to zero every three years or so. Academia is a high risk industry that periodically fires all their hard-formed executors, keeping only some managers. Having a code that is maintainable and can be easily grasped by someone else is therefore capital. Also, never underestimate the power of a google search to solve your problems. With a well deployed language you are more likely to find answers to gotchas and issues you can stumble on.
Management is a problem as well. Waterfall is out of discussion. There is no time for paperwork programming (requirements, specs, design). Spiral is quite ok, but as low paperwork as possible is clearly recommended. Fact is that anything that does not give you an article in academia is wasted time. If you spend one month writing specs, it's a month wasted, and your contract expires in 11 months. Moreover, that fatty document counts zero or close to zero for your career (as many other things: administration and teaching are two examples). Of course, Agile methods are also out of discussion. Most development is made by groups that are far, and in general have a bunch of other things to do as well. Coding concentration comes in brief bursts during "spare time" between articles, and before or after meetings. The bazaar is the most likely, but the bazaar carries a lot of issues as well.
So, to answer your question, the best strategy is "slow accumulation" of known good software, development in small bursts with a quick and agile method and language. Good coding practices need to be taught during lectures, as good laboratory practices are taught during practical courses (eg. never put water in sulphuric acid, always the opposite)
The hardest part is the transition between "this is just for a paper" and "we're really going to use this."
If you know that the code will only be for a paper, fine, take short cuts. Hardcode everything you can. Don't waste time on extensive validation if the programmer is the only one who will ever run the code. Etc. The problem is when someone says "Great! Now let's use this for real" or "Now let's use it for this entirely different scenario than what it was developed and tested for."
A related challenge is having to explain why the software isn't ready for prime time even though it obviously works, i.e. it's prototype quality and not production quality. What do you mean you need to rewrite it?
I would recommend that you/they read "Clean Code"
http://www.amazon.co.uk/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882/ref=sr_1_1?ie=UTF8&s=books&qid=1251633753&sr=8-1
The basic idea of this book is that if you do not keep the code "clean", eventually the mess will stop you from making any progress.
The kind of big science I do (particle physics) has a small number of large, long-running projects (ROOT and Geant4, for instance). These are developed mostly by actual programming professionals. Using processes that would be recognized by anyone else in the industry.
Then, each collaboration has a number of project-wide programs which are developed collaboratively under the direction of a small number of senior programming scientists. These use at least the basic tools (always version control, often some kind of bug tracking or automated builds).
Finally almost every scientist works on their own programs. Use of process on these programs is very spotty, and they often suffer from all the ills that others have discussed (short lifetimes, poor coding skills, no review, lots of serial maintainers, Not Invented Here Syndrome, etc. etc.). The only advantage that is available here compared to small group science, is that they work with the tools I talked about above, so there is something that you can point to and say "That is what you want to achieve.".
Don't really have that much more to add to what has already been said. It's a difficult balance to strike because our priorities are different - academia is all about discovering new things, software engineering is more about getting things done according to specifications.
The most important thing I can think of is to try and extricate yourself from the culture of in-house development that goes on in academia and try to maintain a disciplined approach to development, difficult as that may be in many cases owing to time restraints, lack of experience etc. This control-freakery sucks away at individual responsibility and decision-making and leaves it in the hands of a few who do not necessarily know best
Get a good book on software development, Code Complete already mention is excellent, as well as any respected book on algorithms and data structures. Read up on how you will need to manage your data eg do you need fast lookup / hash-tables / binary trees. Don't reinvent the wheel - use the libraries and things like STL otherwise you are likely to be wasting time. There is a vast amount on the web including this very fine blog.
Many academics, besides sometimes being primadonna-ish and precious about any approach seen as businesslike, tend to be quite vague in their objectives. To put it mildly. For this reason alone it is vital to build up your own software arsenal of helper functions and recipes, eventually, hopefully ending up with a kind of flexible experimental framework that enables you to try out any combination of things without being to restricted to any particular problem area. Strongly resist the temptation to just dive into the problem at hand.

Resources