I'm confronted with a huge piece of "spaghetti code": known lack of documentation, lack of test coverage, high complexity, no design rules being followed, etc. I had the code analysed by a default sonar-scan, and to my surprise maintainability gets a really great score, with a technical debt of 1.1%! Reality shows that almost every change introduces new bugs.
I'm quite perplexed, and wonder if some particularities of the implementation could explain this score... For example, we have quite a lot of interfaces (it feels like 4-5 interfaces per implementation), and the code uses reflection and the service locator pattern.
Are there other indicators I could use that might be more relevant for improving the quality?
The maintainability rating is derived from the ratio of the estimated time to fix all the issues of type Code Smell in your code base to the estimated time to write the code in its current state.
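As a rough illustration of that ratio (my own sketch, not SonarSource's exact implementation; the 30-minutes-per-line development cost and the rating thresholds are configurable defaults):

```c
#include <stdio.h>

int main(void)
{
    /* Made-up figures for illustration only. */
    double remediation_minutes = 33000.0;   /* estimated time to fix all Code Smells    */
    double lines_of_code       = 100000.0;
    double cost_per_line_min   = 30.0;      /* default estimated cost to write one line */

    double development_minutes = lines_of_code * cost_per_line_min;
    double debt_ratio          = remediation_minutes / development_minutes;

    /* With the default thresholds anything up to ~5% still maps to an 'A' rating,
     * which is how a large, hard-to-change code base can show only 1.1% debt.     */
    printf("technical debt ratio: %.1f%%\n", debt_ratio * 100.0);   /* prints 1.1% */
    return 0;
}
```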
You should also look at the Bugs and Vulnerabilities in the code base.
Regarding your specific points (and assuming we're talking about Java):
known lack of documentation - there is a rule in the default profile that looks for Javadocs. You might read its description and parameter values to see what it does and does not find.
lack of test coverage - there is currently a "hole" in this detection; if there is no coverage for a class, then the class is not taken into account when computing lines that could/should be covered, and therefore when calculating coverage percentages. It should be fixed "soon". The first steps will appear on the platform side in 6.2, but will need accompanying changes in the language plugins to take effect.
high complexity - there are rules for this. If they are not finding what you think they should, then take a look at their (adjustable) thresholds (there is a short sketch after this list of what such a rule counts).
lack of design rules - the only rule that might address this (Architectural Constraint) is deprecated, slated for removal, not on by default, and has been dropped from the latest versions of the plugin
use of reflection - there aren't currently rules available to detect this
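To make the complexity point above concrete: these rules essentially count decision points per function or method and only raise an issue once a (configurable) threshold is exceeded. A small sketch, in C for brevity; the counting is the same idea for Java methods:

```c
/* Cyclomatic complexity is roughly "number of decision points + 1".
 * This function contains a for, an if, an && and an else-if: complexity 5.
 * A rule with a threshold of, say, 10 would stay silent on it; real
 * spaghetti methods usually score far higher.                            */
int classify(const int *values, int n)
{
    int score = 0;

    for (int i = 0; i < n; i++) {                 /* +1 */
        if (values[i] > 0 && values[i] < 100) {   /* +1 for if, +1 for && */
            score++;
        } else if (values[i] < 0) {               /* +1 */
            score--;
        }
    }
    return score;                                 /* +1 base complexity    */
}
```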
I've always wondered how a project/team/company selects or qualifies a specific guideline to follow, like MISRA 1998/2004/2012. How should one know and decide which guideline will be sufficient (cost vs time vs quality) for qualifying a project?
(I know this question is a bit blunt, but any answers will be appreciated)
This is often a requirement from the area of application. You'll have safety standards dictating that a safe subset of the programming language is used. This requirement can come from the "SIL" standards such as the generic IEC 61508, automotive ISO 26262, aerospace DO-178 and so on. In such mission critical systems, you may not have any other choice than to use MISRA-C.
But MISRA-C is also becoming an industry standard for all embedded systems, no matter their nature. The reason why is obvious: nobody, no matter the area of application, likes bad, low-quality software full of bugs.
Introducing MISRA-C, together with company coding guidelines, style rules, static analysis, version control... all of it will improve the quality of the end product significantly. And it will force the programmers to educate themselves about C, while becoming more professional overall.
That being said, MISRA-C is not necessarily the most suitable set of rules for every company. It is mostly applicable to embedded systems. For other kinds of applications, something like CERT-C might be more relevant. It is convenient to use one of these well-known standards, because then you can automate tests with static analysers.
The key here is that every semi-professional company that produces software needs some sort of coding rules that focus on banning bad practice. Some companies tend to focus way too much on mostly unimportant style details like where to place {, when they should focus on things that truly improve software quality, like "we should avoid writing function-like macros".
The coding rules should be integrated with development routines. It doesn't make much sense to implement MISRA-C for one single project. There is quite a lot of work involved in getting it up and running.
What is very important is that you have at least one, preferably several C veterans with many years of experience, whom you put in charge of the coding rules. They need to read every dirty detail of the MISRA document and decide which rules make sense for the company to implement. Some of the rules are far from trivial to understand. Not all of them make sense in every situation. If your dev team consists solely of beginners or intermediately experienced programmers and you decide to follow MISRA to the letter, it will end badly. You need at least one veteran programmer with enough knowledge to make the calls about which rules to follow and which to deviate from.
As for which version of MISRA-C to pick, always use the latest one: 2012. It has been cleaned up quite a bit and some weird rules have been removed. It also supports C99, unlike the older ones.
Code shall be written such that it has certain desired properties (quality criteria). Some quality criteria are practically always desired, like correctness, readability, maintainability (even here there are exceptions, for example if you code for the Obfuscated C Contest or the Underhanded C Contest). Other quality criteria are only relevant for certain projects or organizations, for example portability to 16 bit machines.
A well formulated coding rule typically supports one or several quality criteria. It can, however, be in conflict with others. There is a large number of established coding rules that support the typically desired quality criteria, but without significant negative impact on others. Many of these have been identified quite a long time ago (Kernighan and Plauger: Elements of Programming Style, 1974, and, yes, I have a copy of it :-). At times additional good rules are identified.
Except in very rare circumstances (like intentional obfuscation), code should follow these "established good rules", irrespective of whether they are part of MISRA-XXXX or not. And, if a new good rule is found, ensure people start following it. You may even decide to apply this new good rule to already existing code, but that is more of a management decision because of the risk involved.
It simply does not make sense not to follow a good rule just because it is not part of MISRA-XXXX. Similarly, it does not make sense to follow a bad rule just because it is part of MISRA-XXXX. (In MISRA-C-2004 it is said they have removed 16 rules from MISRA-1998 because "they did not make sense" - hopefully some developers have noticed and did not apply them. And, as Lundin mentions, in MISRA-2012 again some "weird rules" have been removed).
However, be aware that, for almost every rule, its thoughtless application can in certain situations even degrade those quality criteria which it normally supports.
In addition to those generally applicable rules, there are rules that are only relevant if there are specific quality criteria for your code. What complicates things is that, in most projects, for different parts of the code the various quality criteria are of different relevance.
For an interrupt service routine, performance is more critical than for most other parts of the software. Therefore, compromises with respect to other quality criteria will be made.
Some software parts are designed to be portable between environments, but often by introducing adapters/wrappers which are then specific for a certain environment. Thus, the adapter itself is not even intended to be portable.
Except for the abovementioned "established good rules", a fixed set of coding rules consequently can not work well for all parts of a typical software. So, for each part of the software first identify which quality criteria are relevant for that part. Then, apply those patterns, principles and rules which really support those specific quality criteria appropriately.
But, what about MISRA? The MISRA rules are intended to support quality criteria like correctness via analysability (e.g. ban of recursion and dynamic memory management), portability (e.g. isolation of assembly code). That is, for software parts where these specific quality criteria are not relevant, the respective MISRA rules do not bring any benefit. Unfortunately, MISRA has no notion of software architecture, where different parts have different quality criteria.
While many rules have improved over the MISRA releases, there are still many rules that are stricter than necessary (e.g. "no /* within // comments and vice versa" - why?) and rules that try to avoid (unlikely) mistakes in ways that are likely to introduce problems rather than solve them (e.g. "no re-use of any name, not even in different local scopes"). Here's my tip for you: If you a) really want your rule-checker to find all bugs and b) don't mind getting an absurd number of warnings with a high false-positive ratio, this one rule/checker does it all for you: "warning: line of code detected: <line of code>".
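To make those two examples concrete, here is roughly the kind of code such rules flag (my own illustration, not wording from the MISRA document):

```c
int process(int input)
{
    // a line comment containing /* would already violate the "no /* within //" rule
    int result = input;

    {
        int result = input * 2;   /* re-uses the name "result" in an inner scope:  */
        (void)result;             /* perfectly legal C (it shadows the outer       */
    }                             /* variable), but banned by the name-reuse rule  */

    return result;
}
```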
Finally, MISRA is developed under the assumption that C (in particular its arithmetic) is too hard to understand. MISRA develops its own arithmetic model on top of C in the belief that developers should instead learn the MISRA arithmetic and can then get away without understanding C arithmetic. Apparently, this has not been successful, because the MISRA-2004 model ("underlying type") was replaced by another one ("essential type") in MISRA-2012. Thus, if your team understands C well and uses a good static analysis tool (or even several) which is capable of identifying the problematic scenarios with C's arithmetic, the MISRA approach is likely not for you.
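For reference, here is a small example of the kind of arithmetic surprise those "underlying/essential type" rules are aimed at (a sketch; the exact rule wording differs between MISRA versions):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t a = 200u;
    uint8_t b = 100u;

    /* a + b is computed in int after integer promotion, so the comparison sees 300,
     * not the wrapped-around 8-bit value 44 that one might naively expect.          */
    if (a + b > 250) {
        puts("promoted to int: the sum is 300");
    }

    uint8_t c = (uint8_t)(a + b);            /* explicit truncation back to 8 bits: 44 */
    printf("c = %u\n", (unsigned)c);

    /* Signed/unsigned comparison: -1 is converted to a huge unsigned value,
     * so this condition is false - another classic target of these rules.   */
    if (-1 < 1u) {
        puts("never printed");
    }
    return 0;
}
```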
TL;DR: Follow standard best practices, apply established good rules. Identify specific quality criteria for the different parts of your architecture. For each part, apply patterns, principles and rules appropriate for that part. Understand rules before applying them. Don't apply stupid rules. Use good static code analysis. Be sure your team understands C.
Just as with your compiler suite or development environment, and/or your static analysis tools, the appropriate version should be decided at the onset of a project, as it may be difficult to change later!
If you are starting a project from scratch, then I would recommend that MISRA C:2012 is adopted - while also adopting the guidance of MISRA Compliance (published April 2016, revised February 2020)
The new Compliance guidance also offers advice on how to treat Adopted Code (ie pre-existing or imported code).
Existing projects that have been developed to earlier versions should continue to use that earlier version... unless there is a good business case for changing. Changing WILL result in some rework!
I was going through this wiki article on SQALE (Software Quality Assessment based on Lifecycle Expectations). The software quality assessment part of it is clear. But I am unable to understand the "based on Lifecycle Expectations" part of the model. Can someone please explain the "Lifecycle Expectations" part in an easy-to-understand way?
Ever since I first encountered SQALE, I've suspected that the creators were working backward from the acronym, and "lifecycle expectations" was the best that they could come up with that wasn't already taken. (There are other similarly acronymed guidelines or methodologies in the software quality space, such as Squale and SQuaRE).
All but the most rudimentary models of software quality incorporate the notion of quality over the full lifecycle of a software product (from the "gleam in someone's eye" phase all the way through eventual decommissioning), not just its state at the time of initial shipping.
Such models also acknowledge that the desirable investment in any given aspect of software quality (e.g.: maintainability) is not equal for all software products. For example, one would tend to make very different decisions in maintainability investments for a boxed-style software product one hopes to sell (and upgrade) for years vs a short-term marketing promo site that will be decommissioned one month after it goes live.
So... All that the "lifecycle expectations" bit probably implies is something along the lines of covering software quality aspects that affect multiple parts of the lifecycle of a product, as well as allowing adjustment/calibration to different expectations for the various quality characteristics.
If you're interested in how the SQALE authors might have meant this (assuming they weren't only trying to shoehorn the name into the acronym), there are a few hints in the SQALE manual that can be downloaded from http://www.sqale.org/download. They seem to feel that they've done something novel and useful in projecting the ISO/IEC 9126 quality characteristics over a product lifecycle in a chronological order, and perhaps this is what the name is meant to reflect.
BTW, if you're interested in software quality and want to understand this sort of quality modeling in more depth, I'd recommend taking a look at SQuaRE (incl. the parts of ISO/IEC 9126 for which there are no SQuaRE replacements yet). It's not necessarily the most exciting reading, but I find that it provides excellent background for evaluating the utility of various quality methodologies and tools.
I'm programming from home and I want to know whether I'm more or less productive programming at 10 AM than I am when I'm programming at 8PM.
What metrics should I use to determine an answer to the question?
Ignoring the debate in the question's comments, a bunch of arbitrary productivity-ish metrics you could measure...
lines of code written
user stories/tasks completed
bugs fixed
tests written
tests passing first time
bugs found
code churn vs new code (i.e. "right first time" vs "rewritten repeatedly")
%age of time in IDE vs debugging
%age of time in IDE vs non-work applications
code quality (using another similarly arbitrary measure like FxCop compliance or cyclomatic complexity)
code performance (against some arbitrary or customer-specified benchmark)
The best metrics tend to be combinations - say, "average of bugs found per line of code written" - rather than a single measure. Still, these are all subjective and inaccurate.
I'd suggest the best thing to do is decide what your goal is when you're programming. Is it to produce high-quality code, or super-performant realtime code, or mission-critical-must-be-bug-free code, or do you just need to ship something that works in the shortest time? Until you've defined "productive", it's hard to suggest what would be a meaningful measurement.
I don't know if there is some established method for measuring productivity in programmers but assuming alertness and focus has a direct impact on productivity, I suppose you could set yourself some kind of mental arithmetic test with randomised questions and answers and take it at regular intervals.
It's a tricky one because you can't measure by lines, or by problems solved (because they vary in scale and difficulty). In fact, this article suggests that when attempting to measure programmer productivity, there is almost no correlation between the time it takes to complete a task and the quality of the finished product.
I'm trying to create some internal metrics to demonstrate (determine?) how well TDD improves defect rates in code.
Is there a better way than defects/KLOC? What about a language's 'functional density'?
Any comments or suggestions would be helpful.
Thanks - Jonathan
You may also consider mapping defect discovery rate and defect resolution rates... how long does it take to find bugs, and once they're found, how long do they take to fix? To my knowledge, TDD is supposed to improve on fix times because it makes defects known earlier... right?
Any measure is an arbitrary comparison of defects to code size; so long as the comparison is similar, it should work. E.g., defects/kloc in C to defects/kloc in C. If you changed languages, it would affect the metric in any case, since the same program in another language might be less defect-prone.
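As a trivial worked example of the kind of like-for-like comparison meant here (the numbers are made up):

```c
#include <stdio.h>

int main(void)
{
    /* Two releases of the same C code base, before and after introducing TDD. */
    double defects_before = 96.0, kloc_before = 64.0;
    double defects_after  = 30.0, kloc_after  = 52.0;

    printf("before TDD: %.2f defects/KLOC\n", defects_before / kloc_before);  /* 1.50 */
    printf("after TDD:  %.2f defects/KLOC\n", defects_after  / kloc_after);   /* 0.58 */
    return 0;
}
```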
Measuring defects isn't an easy thing. One would like to account for the complexity of the code, but that is incredibly messy and unpleasant. When measuring code quality I recommend:
1. Measure the current state (what is your defect rate now?)
2. Make a change (peer reviews, training, code guidelines, etc.)
3. Measure the new defect rate (have things improved?)
4. Go to 2
If you are going to compare coders make sure you compare coders doing similar work in the same language. Don't compare the coder who works in the deep internals of your most complex calculation engine to the coder who writes the code that stores stuff in the database.
I try to make sure that coders know that the process is being measured not the coders. This helps to improve the quality of the metrics.
I suggest using the ratio between these two times:
the time spent fixing bugs
the time spent writing other code
This seems valid across languages...
It also works if you only have a rough estimate for some big code base. You can still compare it to the new code you are writing, to impress your management ;-)
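A minimal sketch of that ratio, assuming the two totals can be pulled out of your time-tracking data (the figures below are invented):

```c
#include <stdio.h>

int main(void)
{
    /* Hypothetical totals for one iteration, e.g. taken from time sheets. */
    double hours_fixing_bugs  = 35.0;
    double hours_writing_code = 105.0;

    /* 0.33 means one hour of bug fixing for every three hours of new code;
     * the trend over time matters more than the absolute value.            */
    printf("fix/write ratio: %.2f\n", hours_fixing_bugs / hours_writing_code);
    return 0;
}
```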
I'm skeptical of all LOC-related measurements, not just because of different relative expressiveness of languages, but because individual programmers will vary enough in the expressiveness of their code as to make this metric "fuzzy" at best.
The things I would measure in the interests of project management are:
Number of open defects on the project. There's no single scalar that can tell you where the project is and how close it is to a releasable state, but this is still a handy number to have on hand and watch over time.
Defect detection rate. This is not the rate of introduction of new defects into the system, but it's probably the closest proxy you'll find.
Defect resolution rate. If this is less than the detection rate, you're falling behind - if it's greater, you're getting ahead.
All of these numbers are more useful if you combine them with severity information. A product with 20 minor bugs may well be closer to release than one with 2 crashing bugs. If you're clearing the minor bugs but not the severe ones, you have to get the developers to refocus their attention.
I would track these numbers per project and per developer. The reason for doing them per project should be clear. The per-developer numbers are certainly not the whole picture of an individual contributor's skill or productivity, but can point you to people who might need training or remediation.
You may also wish to tag all the tickets in your defect tracking system by project module as well (especially for larger projects), so that you can tell when critical modules are in a fragile state.
Why don't you consider defects per use case, or defects per requirement? We have faced practical issues in arriving at the KLOC.
I was wondering if anyone has experience with metrics used to measure software quality. I know there are code complexity metrics but I'm wondering if there is a specific way to measure how well it actually performs during its lifetime. I don't mean runtime performance, but rather a measure of the quality. Any suggested tools that would help gather these are welcome too.
Are there measurements to answer these questions:
How easy is it to change/enhance the software, robustness
If it is a common/general enough piece of software, how reusable is it
How many defects were associated with the code
Has this needed to be redesigned/recoded
How long has this code been around
Do developers like how the code is designed and implemented
Seems like most of this would need to be closely tied with a CM and bug reporting tool.
If measuring code quality in the terms you put it were such a straightforward job and the metrics were accurate, there would probably be no need for project managers anymore. Even more, the distinction between good and poor managers would be very small. Because it isn't, that just shows that getting an accurate idea of the quality of your software is no easy job.
Your questions span multiple areas that are quantified differently or are very subjective to quantification, so you should group them into categories that correspond to common targets. Then you can assign an "importance" factor to each category and derive some metrics from that.
For instance you could use static code analysis tools for measuring the syntactic quality of your code and derive some metrics from that.
You could also derive metrics from bugs/lines of code using a bug tracking tool integrated with a version control system.
For measuring robustness, reuse and efficiency of the coding process you could evaluate the use of design patterns per feature developed (of course where it makes sense). There's no tool that will help you achieve this, but if you monitor your software growing bigger and put numbers on these, it can give you a pretty good idea of how your project is evolving and whether it's going in the right direction. Introducing code-review procedures could help you keep track of these more easily and possibly address them early in the development process. A number to put on these could be the percentage of features implemented using the appropriate design patterns.
While metrics can be quite abstract and subjective, if you dedicate time to it and always try to improve them, it can give you useful information.
A few things to note about metrics in the software process though:
Unless you do them well, metrics could prove to be more harm than good.
Metrics are difficult to do well.
You should be cautious in using metrics to rate individual performance or offering bonus schemes. Once you do this everyone will try to cheat the system and your metrics will prove worthless.
If you are using Ruby, there are some tools to help you out with metrics, ranging from LOCs/Method and Methods/Class to Saikuro's cyclomatic complexity.
My boss actually gave a presentation on the software metrics we use at a Ruby conference last year; these are the slides.
An interesting tool that brings you a lot of metrics at once is metric_fu. It checks a lot of interesting aspects of your code: stuff that is highly similar, changes a lot, or has a lot of branches. All signs your code could be better :)
I imagine there are a lot more tools like this for other languages too.
There is a good thread from the old Joel on Software Discussion groups about this.
I know that some SVN stat programs provide an overview of changed lines per commit. If you have a bug tracking system, and the people fixing bugs or adding features state their commit number when the bug is fixed, you can then calculate how many lines were affected by each bug/new feature request. This could give you a measurement of changeability.
The next thing is to simply count the number of bugs found and set that in ratio to the number of lines of code. There are reference values for how many bugs a high-quality piece of software should have per line of code.
You could do it in an economic way or in a programmer's way.
In the economic way you measure the costs of improving the code, fixing bugs, adding new features and so on. If you choose the second way, you may want to measure how many staff work with your program and how easy it is to, say, find and fix an average bug in person-hours. Certainly neither is flawless, because costs depend on the market situation and person-hours depend on the actual people and their skills, so it's better to combine both methods.
This way you get some instruments to measure the quality of your code. Of course you should take into account the size of your project and other factors, but I hope the main idea is clear.
A more customer focused metric would be the average time it takes for the software vendor to fix bugs and implement new features.
It is very easy to calculate, based on the "date created" and "date closed" information in your bug tracking software.
If your average bug fixing/feature implementation time is extremely high, this could also be an indicator for bad software quality.
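A small sketch of that calculation, assuming the created/closed timestamps have been exported from the bug tracker (the data here is made up):

```c
#include <stdio.h>
#include <time.h>

int main(void)
{
    /* Created/closed timestamps (Unix time) exported from a bug tracker - invented. */
    time_t created[] = { 1700000000, 1700100000, 1700200000 };
    time_t closed[]  = { 1700090000, 1700400000, 1700250000 };
    int n = 3;

    double total_seconds = 0.0;
    for (int i = 0; i < n; i++) {
        total_seconds += difftime(closed[i], created[i]);
    }

    /* Average resolution time in days; track it per release to spot trends. */
    printf("average time to fix: %.1f days\n", total_seconds / n / 86400.0);
    return 0;
}
```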
You may want to check the following page describing various aspects of software quality, including sample plots. Some of the quality characteristics you want to measure can be derived using a tool such as Sonar. It is very important to figure out how you would want to model some of the following aspects:
Maintainability: You did mention how easy it is to change/test the code or to reuse it. These relate to the testability and re-usability aspects of maintainability, which is considered a key software quality characteristic. Thus, you could measure maintainability as a function of testability (unit test coverage) and re-usability (cohesiveness index of the code); a small sketch of such a model follows after this list.
Defects: Defects alone may not be a good idea to measure. However, if you can model defect density, it could give you a good picture.
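For the maintainability point above, one simple (entirely hypothetical) way to model it as a function of testability and re-usability is a weighted score; the weights and inputs below are invented and would need calibrating against your own project:

```c
#include <stdio.h>

/* Hypothetical maintainability index: a weighted mix of unit test coverage
 * and a cohesion score, both normalised to the range 0..1. Weights are arbitrary. */
static double maintainability(double coverage, double cohesion)
{
    const double w_coverage = 0.6;
    const double w_cohesion = 0.4;
    return w_coverage * coverage + w_cohesion * cohesion;
}

int main(void)
{
    /* e.g. 45% line coverage from the test runner, 0.70 cohesion from a static analyser */
    printf("maintainability score: %.2f\n", maintainability(0.45, 0.70));   /* 0.55 */
    return 0;
}
```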