Coverity vs. IAR C-STAT - static-analysis

Does anyone know of a Coverity vs. IAR's C-STAT head-to-head comparison or review?
I'm looking into different tools. IAR has been used by my company in the past. Higher-ups have shown an interest in Coverity. I'm trying to do a comparative analysis between them.

IAR's C-STAT analysis primarily focuses on MISRA and other compliance checkers, though it does have some basic quality checkers. Coverity has a large number of quality and security checkers - its focus is finding real bugs rather than ensuring you adhere to a coding standard (such as MISRA). Coverity also has excellent inter-procedural analysis and extremely low false-positive rates.
While right now these two tools have very little overlap, Coverity has begun adding compliance checkers and will soon offer the same coding standard coverage while also finding real bugs.
I am admittedly biased - I work on the Coverity product. I do honestly believe, however, that Coverity is the best tool of its kind on the market.

Related

SonarQube Category Explanations

Can anybody suggest a one/two line explanation of the "five" SonarQube categories, in such a way that a non-developer can understand what the percentage figure means?
Efficiency
Maintainability
Portability
Reliability
Usability
One word "synonym" for non-developers (not exact synonym though, but enough to give a quick idea):
Efficiency : performance
Maintainability : evolution
Portability : reuse
Reliability : resilience
Usability : design
Most of those metrics are presented in this Wikipedia entry
Efficiency:
Efficiency IT metrics measure the performance of an IT system.
An effective IT metrics program should measure many aspects of performance including throughput, speed, and availability of the system.
Maintainability:
the ease with which a product can be maintained in order to:
correct defects
meet new requirements
make future maintenance easier, or
cope with a changed environment
Portability:
the ability of a codebase to be reused, rather than rewritten, when the software is moved from one environment to another.
Reliability:
The IEEE defines reliability as "The ability of a system or component to perform its required functions under stated conditions for a specified period of time."
Note from this paper:
To most project and software development managers, reliability is equated to correctness, that is, they look to testing and the number of "bugs" found and fixed.
While finding and fixing bugs discovered in testing is necessary to assure reliability, a better way is to develop a robust, high quality product through all of the stages of the software lifecycle.
That is, the reliability of the delivered code is related to the quality of all of the processes and products of software development; the requirements documentation, the code, test plans, and testing.
Usability
studies the elegance and clarity with which the interaction with a computer program or a web site (web usability) is designed.
Usability differs from user satisfaction insofar as the former also embraces usefulness (see Computer user satisfaction).
See for instance usabilitymetrics.com
The percentage figure represents, for each category, the density of rule violations (non-compliance) in the source code.
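To make "density of violations" concrete, here is a minimal sketch of how such a figure could be computed; the categories, violation counts, and line count below are invented for illustration and are not SonarQube's actual rules or weighting.

```python
# Illustrative only: computing a per-category violation density.
# The violation counts and code size below are invented examples,
# not output from SonarQube.

violations = {           # rule violations found, per category
    "Efficiency": 12,
    "Maintainability": 48,
    "Portability": 3,
    "Reliability": 20,
    "Usability": 7,
}
lines_of_code = 25_000   # size of the analyzed code base

for category, count in violations.items():
    density = count / lines_of_code * 1000   # violations per 1,000 lines
    print(f"{category}: {density:.2f} violations per KLOC")
```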

Evaluation of Code Metrics [closed]

There has been a considerable amount of discussion about code metrics (e.g.: What is the fascination with code metrics?). I (as a software developer) am really interested in those metrics because I think they can help one write better code. At least they are helpful when it comes to finding areas of code that need some refactoring.
However, what I would like to know is the following. Are there any evaluations of those source code metrics that prove they really do correlate with the bug rate or the maintainability of a method? For example: do methods with a very high cyclomatic complexity really introduce more bugs than methods with a low complexity? Or do methods with a high difficulty level (Halstead) really take much more effort to maintain than methods with a low one?
Maybe someone knows about some reliable research in this area.
Thanks a lot!
Good question, no straight answer.
There are research papers available that show relations between, for example, cyclomatic complexity and bugs. The problem is that most research papers are not freely available.
I have found the following: http://www.pitt.edu/~ckemerer/CK%20research%20papers/CyclomaticComplexityDensity_GillKemerer91.pdf, though it shows a relation between cyclomatic complexity density and productivity rather than bugs. It has a few references to other papers, however, and it is worth trying to google them. (A rough sketch of how these complexity figures can be computed appears after the list below.)
Here are some:
Object-oriented metrics that predict maintainability
A Quantitative Evaluation of Maintainability Enhancement by Refactoring
Predicting Maintainability with Object-Oriented Metrics - An Empirical Comparison
Investigating the Effect of Coupling Metrics on Fault Proneness in Object-Oriented Systems
The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics
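To make the metrics themselves concrete, here is a rough sketch of how cyclomatic complexity, and the complexity density (complexity divided by size) discussed in the Gill and Kemerer paper above, can be approximated for a Python function with the standard ast module. The set of decision nodes counted here is a deliberate simplification of McCabe's definition, for illustration only.

```python
# Rough approximation of McCabe cyclomatic complexity for a Python
# function: 1 + the number of decision points. Simplified for
# illustration; not a complete implementation of the metric.
import ast
import textwrap

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                  ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(textwrap.dedent(source))
    return 1 + sum(isinstance(node, DECISION_NODES)
                   for node in ast.walk(tree))

example = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    for _ in range(x):
        if x % 2 and x > 10:
            return "large odd"
    return "something else"
"""

cc = cyclomatic_complexity(example)
loc = len([line for line in example.splitlines() if line.strip()])
print("cyclomatic complexity:", cc)
print("complexity density (CC / LOC):", round(cc / loc, 2))
```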
Have a look at this article from Microsoft Research. In general I'm dubious of development wisdom coming out of Microsoft, but they do have the resources to be able to do long-term studies of large products. The referenced article talks about the correlation they've found between various metrics and project defect rate.
Finally I did find some papers about the correlation between software metrics and the error-rate but none of them was really what I was looking for. Most of the papers are outdated (late 80s or early 90s).
I think that it would be quite a good idea to start an analysis of current software. In my opinion it should be possible to investigate some popular open source systems. The source code is available and (what I think is much more important) many projects use issue trackers and some kind of version control system. It would probably be possible to find a strong link between the versioning system's log and the issue tracker. This would open up a very interesting possibility of analyzing the relation between some software metrics and the bug rate.
Maybe there still is a project out there that does exactly what I've described above. Does anybody know about something like that?
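As a starting point for the kind of mining described above, here is a rough sketch that counts bug-fix commits per file from a git log and pairs each count with the file's current size. The "fix" keyword heuristic and the .py filter are assumptions made purely for the illustration.

```python
# Sketch of the idea above: mine a git repository for bug-fix commits
# per file and pair the count with a crude size metric. Assumes it runs
# inside a git checkout and that fixes are identifiable by the word
# "fix" in the commit subject -- both assumptions, for illustration.
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--name-only", "--pretty=format:@@%s"],
    capture_output=True, text=True, check=True,
).stdout

fix_counts = Counter()
in_fix_commit = False
for line in log.splitlines():
    if line.startswith("@@"):
        in_fix_commit = "fix" in line.lower()   # crude bug-fix heuristic
    elif line and in_fix_commit and line.endswith(".py"):
        fix_counts[line] += 1

for path, fixes in fix_counts.most_common(10):
    try:
        loc = sum(1 for _ in open(path, encoding="utf-8", errors="ignore"))
    except FileNotFoundError:   # the file may have been deleted since
        continue
    print(f"{path}: {fixes} bug-fix commits, {loc} lines")
```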
We conducted an empirical study of the bug prediction capabilities of the well-known Chidamber and Kemerer object-oriented metrics. It turned out that these metrics combined can predict bugs with an accuracy above 80% when we applied proper machine learning models. If you are interested, you can read the full study in the following paper:
"Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction. In IEEE Transactions on Software Engineering, Vol. 31, No. 10, October 2005, pages 897-910."
I too was once fascinated with the promises of code metrics for measuring likely quality, and discovering how long it would take to write a particular piece of code given its design complexity. Sadly, the vast majority of claims for metrics were hype and never bore any fruit.
The largest problem is that the outputs we want to know (quality, time, $, etc.) depend on too many factors that cannot all be controlled for. Here is just a partial list:
Tool(s)
Operating system
Type of code (embedded, back-end, GUI, web)
Developer experience level
Developer skill level
Developer background
Management environment
Quality focus
Coding standards
Software processes
Testing environment/practices
Requirements stability
Problem domain (accounting/telecom/military/etc.)
Company size/age
System architecture
Language(s)
See here for a blog that discusses many of these issues, giving sound reasons for why the things we have tried so far have not worked in practice. (Blog is not mine.)
https://shape-of-code.com
This link is good, as it deconstructs one of the most visible metrics, the Maintainability Index, found in Visual Studio:
https://avandeursen.com/2014/08/29/think-twice-before-using-the-maintainability-index/
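For reference, the variant of the index that Visual Studio reports combines Halstead volume, cyclomatic complexity, and lines of code, rescaled to 0-100. A small sketch of that formula is below; the metric values plugged in are illustrative, and the blog post above explains why the result should be treated with caution.

```python
# The Maintainability Index formula as used by Visual Studio, rescaled
# to the 0-100 range. The inputs below are illustrative values only.
import math

def maintainability_index(halstead_volume: float,
                          cyclomatic_complexity: float,
                          lines_of_code: int) -> float:
    raw = (171
           - 5.2 * math.log(halstead_volume)
           - 0.23 * cyclomatic_complexity
           - 16.2 * math.log(lines_of_code))
    return max(0.0, raw * 100 / 171)

print(round(maintainability_index(halstead_volume=1500,
                                  cyclomatic_complexity=12,
                                  lines_of_code=200), 1))
```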
See this paper for a good overview of quite a large number of metrics, showing that they do not correlate well with program understandability (which itself should correlate with maintainability): "Automatically Assessing Code Understandability: How Far Are We?", by Scalabrino et al.

How to measure software development performance? [closed]

I am looking for some ways to measure the performance of a software development team. Is it a good idea to use the build tool? We use Hudson as an automatic build tool. I wonder if I can take the information from Hudson's reports and derive from it the progress of each programmer.
The main problem with performance metrics like this is that humans are VERY good at gaming any system that measures their own performance to maximize that exact metric - usually at the expense of something else that is valuable.
Let's say we do use the Hudson build to gather stats on programmer output. What could you look for, and what would be the unintended side effects of measuring that once programmers are clued onto it?
Lines of code (developers just churn out mountains of boilerplate code, and other needless overengineering, or simply just inline every damn method)
Unit test failures (don't write any unit tests, then they won't fail)
Unit test coverage (write weak tests that exercise the code, but don't really test it properly)
Number of bugs found in their code (don't do any coding, then you won't get bugs)
Number of bugs fixed (choose the easy/trivial bugs to work on)
Actual time to finish a task measured against their own estimate (estimate higher to give more room)
And it goes on.
The point is, no matter what you measure, humans (not just programmers) get very good at optimizing to meet exactly that thing.
So how should you look at the performance of your developers? Well, that's hard. And it involves human managers, who are good at understanding people (and the BS they pull), and can look at each person subjectively in the context of who/where/what they are to figure out if they are doing a good job or not.
What you do once you've figured out who is/isn't performing is a whole different question though.
(I can't take credit for this line of thinking. It's originally from Joel Spolsky. Here and here)
Do NOT measure the performance of each individual programmer simply using the build tool. You can measure the team as a whole, sure, or you can certainly measure the progress of each programmer, but you cannot measure their performance with such a tool. Some modules are more complicated than others, some programmers are tasked with other projects, etc. It's not a recommended way of doing this, and it will encourage programmers to write sloppy code so that it looks like they did the most work.
No.
Metrics like that are doomed to failure. Different people work on different parts of the code, on different classes of problem, and absolute measurements are misleading at best.
The way to measure developer performance is to have excellent managers that do their job well, have good specs that accurately reflect requirements, and track everyone's progress carefully against those specs.
It's hard to do right. A software solution won't work.
I think this needs a very careful approach when deciding how to measure developer performance, as most of the traditional methods, such as lines of code, number of check-ins, number of bugs fixed, etc., have proven to be subjective under today's software engineering concepts. We need to value a team-performance approach rather than measuring individual KPIs in a project. However, working in a commercial development environment, it is important to keep track of and a close eye on the following factors for individual developers:
Code review comments – For each project, we can decide the number of code reviews that need to be conducted in a given period. Based on the code reviews, individuals get remarks about improvements to their coding standards. Recurring issues in the same individual's code reviews need to be brought to attention. You can use automated code review tools or manual code reviews.
Test coverage and completeness of tests – The % coverage needs to be decided upfront, and if a certain developer frequently fails to reach it, it needs to be addressed.
Willingness to sign up for complex tasks and deliver them without much struggle
Achieving what’s defined as “Done” in a user story
Mastery level of each technical area.
With an agile approach in some of the projects, the measurements of the development team and the expected performance are decided per release. At each release planning, different 'contracts' are negotiated with the team members for the expected performance. I find this approach more successful, as there is no reason to adhere to UI-related measurements in a release whose main deliverable is a complex algorithm.
I would NOT recommend using build tool information as a way to measure the performance / progress of software developers. Some of the confounding problems: possibly one task is considerably harder than another; possibly one task is much more involved in "design space" than "implementation space"; possibly (probably) the more efficient solution is the better solution, but that better solution contributes fewer lines of code than a terribly inefficient one which provides many, many more lines of code; etc.
Speaking of KPIs for software developers, www.smartKPIs.com may be a good resource for you. It contains a user-friendly library of well-documented performance measures. At the moment it lists over 3300 KPI examples, grouped into 73 functional areas, as well as 83 industries and sub-categories.
KPI examples for software developers are available on this page: www.smartKPIs.com - application development. They include, but are not limited to:
Defects removal efficiency
Data redundancy
In addition to examples of performance measures, www.smartKPIs.com also contains a catalogue of performance reports that illustrate the use of KPIs in practice.
Examples of such reports for information technology are available on: www.smartKPIs.com - KPIs in practice - information technology
The website is updated daily with new content, so check it from time to time for additional content.
Please note that while examples of performance measures are useful to inform decisions, each performance measure needs to be selected and customized based on the objectives and priorities of each organisation.
You would probably do better measuring how well your team tracks to schedules. If a team member (or the entire team) is consistently late, you will need to work with them to improve performance.
Don't short-cut or look for quick and easy ways to measure the performance/progress of developers. There are many, many factors that affect the output of a developer. I've seen a lot of people try various metrics ...
Lines of code produced - encourages developers to churn out inefficient garbage
Complexity measures - encourages over analysis and refactoring
Number of bugs produced - encourages people to seek out really simple tasks and to hate your testers
... the list goes on.
When reviewing a developer you really need to look at how good their work is and define "good" in the context of what the company needs and what situations/positions the company has put that individual in. Progress should be evaluated with equal consideration and thought.
There are many different ways of doing this. Entire books have been written on the subject. You could use reports from Hudson, but I think that would lead to misinformation and provide crude results. Really you need to have a task-tracking methodology.
Check how many lines of the codes each has written.
Then fire the bottom 70%.. NO 90%!... EVERY DAY!
(for the folks that aren't sure, YES, I am joking. Serious answer here)
We get 360 feedback from everyone on the team. If all your team members think you are crap, then you probably are.
There is a common mistake that many businesses make when setting up their release management tool. The Salesforce release management toolkit is one of the best ones available in the market today, but if you do not follow the vital steps of setting it up, you will definitely have some very bad results. You will get to use it but not to its full capacity. Establishing release management processes in isolation from the business processes is one of the worst mistakes to make. Release management tools go hand in hand with the enterprise strategy, objectives, governance, change management plus some other aspects. The processes of release management need to be formed in such a way that everyone in the business is on the same page.
Goals of release management
The main goal of release management is to have a consistent set of reliable and repeatable processes that are resource independent. This enables the achievement of the most favorable business value while at the same time optimizing the utilization of resources available. Considering that most organizations focus on running short, high-yield business projects, it is essential for optimization of the delivery value chain of the application to make certain that there are no holdups in the delivery of the business value.
Take for instance the force.com migration toolkit, as this tool has proven to be great in governance. A release management tool should allow for optimal visibility and accountability in governance.
Processes and release cycles
The release management processes must be consistent for the whole business. It is necessary to have streamlined and standardized processes across the various tool users. This is because they will be using the same platform and resources that enable efficient completion of their tasks. Having different processes for different divisions of your business can lead to grievous failures in tool management. The different sets of users will need to have visibility into what the others are doing. As aforementioned, visibility is of great importance in any business process.
When it comes to the release cycles, it is also imperative to have one centralized system that will track all the requirements of the different sets of users. It is also necessary to have this system centralized so that software development teams get insight into the features and changes requested by the business. Requests have to become priorities to make sure that the business gets to enjoy maximum benefit. Having a steering team is important because it is involved in the reviewing of business requirements plus also prioritizing the most appropriate changes that the business needs to make.
The changes that should happen to the Salesforce system can be very tricky and therefore having a regular meet up between the business and IT is good. This will help to determine the best changes to make to the system that will benefit the business. By considering the cost and value of implementing a feature, the steering committee has the task of deciding on the most important feature changes to make.
Here is also some good research: http://intersog.com/blog/tech-tips/how-to-manage-millennials-on-software-development-teams
This is an old question, but still: something you can do is borrow velocity from agile software development, where you assign a weight to each task and then calculate how much "weight" you complete in each sprint (or iteration, or whatever development life cycle you use). Of course this goes hand in hand with the fact that, as a commenter mentioned before, you need to actively keep track yourself of whether your developers are working or chatting online.
If you know your developers are working responsibly, then you can rely on that velocity to give you an estimate of how much work the team can do. If at any iteration this number drops considerably, then either it was poorly estimated or the team worked less.
Ultimately, the use of KPIs together with velocity can give you per-developer (or per-team) insights on performance.
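A minimal sketch of what that velocity bookkeeping might look like, with invented point values; the 70%-of-average threshold for flagging a drop is an arbitrary choice for the example.

```python
# Sketch of tracking velocity per sprint: sum the weights (points) of
# the tasks completed in each sprint and flag a noticeable drop.
# All point values below are invented for illustration.
sprints = {
    "Sprint 1": [3, 5, 2, 8],     # weights of tasks finished that sprint
    "Sprint 2": [5, 5, 3, 2, 1],
    "Sprint 3": [2, 3],           # something slowed the team down here
}

velocities = {name: sum(points) for name, points in sprints.items()}
average = sum(velocities.values()) / len(velocities)

for name, velocity in velocities.items():
    flag = "  <-- well below average, investigate" if velocity < 0.7 * average else ""
    print(f"{name}: {velocity} points (team average {average:.1f}){flag}")
```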
Typically, directly using metrics for performance measurement is considered a Bad Idea, and one of the easy ways to run a team into the ground.
Now, you can use metrics like % of projects completed on-time, % of churn as code goes toward completion, etc...it's a wide field.
Here's an example:
60% of mission-critical bugs were written by Joe. That's a simple, straightforward metric. Fire Joe, right?
But wait, there's more!
Joe is the Senior Developer. He's the only guy trusted to write ultra-reliable code, every time. He's written about 80% of the mission-critical software, because he's the best.
Metrics are a bad measurement of developers.
I will share my experience and how I learned a very valuable process for measuring team performance. I must admit I had fallen into tracking KPIs simply because most departments did the same, not really for the insight, until I had the responsibility of evaluating developer performance; after a good deal of reading I arrived at the following solution.
On every project, I engage the team in a discussion of the project requirements and involve them so everyone knows what is to be done. In the same discussion, through collaboration, we break the project into tasks and weight those tasks. Previously we would estimate project completion as 100%, where each task has a percentage contribution. That did work for a while, but it was not the best solution. Now we base each task on a weight, or points to be exact, and use relative measurement to compare tasks and differentiate the weights. For example, suppose there is a requirement to develop a web form to gather user data.
The tasks would break down like this:
1. User Interface - 2 Points
2. Database CRUD - 5 Points
3. Validation - 4 Points
4. Design (css) - 3 Points
With this strategy we can pinpoint a weekly approximation of how much we have completed and what is pending for the task force, and we can also pinpoint who has performed best; a small sketch of that calculation follows.
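The sketch below uses the four tasks from the web-form example above; the task ownership and completion status are invented for the illustration.

```python
# Weekly snapshot sketch: overall completion by points and points
# delivered per developer. Owners and "done" flags are invented.
tasks = {
    "User Interface": {"points": 2, "owner": "dev_a", "done": True},
    "Database CRUD":  {"points": 5, "owner": "dev_b", "done": True},
    "Validation":     {"points": 4, "owner": "dev_a", "done": False},
    "Design (css)":   {"points": 3, "owner": "dev_c", "done": False},
}

total = sum(t["points"] for t in tasks.values())
done = sum(t["points"] for t in tasks.values() if t["done"])
print(f"completed: {done}/{total} points ({done / total:.0%})")

points_per_dev = {}
for t in tasks.values():
    if t["done"]:
        points_per_dev[t["owner"]] = points_per_dev.get(t["owner"], 0) + t["points"]
print("points delivered so far:", points_per_dev)
```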
I must admit that I still face some challenges with this strategy, such as the fact that not every developer is comfortable with every technology. Some are excited to learn technologies simply because they find a high percentage of points falls in that area; others simply do what they can.
Remember, do not track a KPI for its own sake; track it for its insight.

Tracking and predicting quality level

What techniques do people recommend to track the quality level of a new program? Are there ways to take a poorly defined term like "quality level", quantify it, and then make predictions? Currently I use bug rates and S curves, but I am looking for other ways to evaluate, estimate, and predict quality levels.
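To illustrate the S-curve idea from the question, here is one common way to put it into practice: fit a logistic curve to cumulative defects found per test week and read off the asymptote as a rough prediction of the total. The weekly counts below are invented, and a logistic shape is only one of several models used for this.

```python
# Fit a logistic S-curve to cumulative defects found per test week and
# use the asymptote as a rough prediction of total defects.
# The weekly counts below are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

weeks = np.arange(1, 11)
cumulative_defects = np.array([5, 12, 25, 44, 70, 95, 115, 128, 136, 141])

def logistic(t, total, rate, midpoint):
    return total / (1 + np.exp(-rate * (t - midpoint)))

(total, rate, midpoint), _ = curve_fit(
    logistic, weeks, cumulative_defects, p0=[150.0, 1.0, 5.0])

found = cumulative_defects[-1]
print(f"predicted total defects: {total:.0f}")
print(f"found so far: {found} ({found / total:.0%} of the prediction)")
```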
Are you looking for this measure?
More seriously, for me code quality is about maintainability:
how easy it is to fix a bug, to add/remove/modify a feature,
how easy it is to refactor: is a regression test suite available?
how much technical debt?
Remember that you write code once, but you read it several times.
I am not sure what counting your bugs is going to do without something to compare it with. What if the software you made was very hard and had lots of edge cases? You need some kind of comparison...
Even though one of the other answers was clearly a joke, code reviews are also probably a good idea. If you have too many bugs, hire better engineers or have them write less code.
Edit: Added after considering comments...
Every bug is like a unique sucky snowflake (a suckflake?). They have different levels of impact on your customers and developers. I would at least take this into account. Maybe adding severity (a measure of customer push for a fix) and engineering hours spent fixing it might help improve accuracy. My concern, I guess, is that this is still an oversimplification of "quality" when it comes to developing software.
Sadly software quality != product quality. A recently released game called Fallout 3 won tons of awards and made lots of money (at least I assume) but was also a buggy hunk of junk on PC at least.
Just make sure you are tracking and optimizing the correct thing. Tracking # bugs vs time is just tracking # of bugs vs time. Reading any more into it requires some level of assumption however correct or incorrect.
What are your goals? Bugs are but one part of software quality. If you want to continue to support your software, maintainability is key. A lot of the time, bug fixes can make things less maintainable if your coder did things in a rush. This in turn makes future fixes and features harder, and fixes can add new bugs.
This depends a lot on what kind of comparisons you're trying to make. If you're looking at a single project over time and the team doesn't change, then bug rates might be meaningful. However, if you're comparing different projects, with different teams, there's really no way to compare things like bug rates, because you are really comparing the rate of known bugs. One team may be much better at identifying bugs than the other, making their bug rate look higher, but they're really the ones with better software quality.
Do you do unit testing?
Code coverage on unit tests can be a decent measure of quality.
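If you go that route, the number is easy to obtain. A minimal sketch with the coverage.py package is below; the package name "myproject" and the "tests" directory are placeholders for your own layout.

```python
# Turn "code coverage on unit tests" into a single number using the
# coverage.py package (pip install coverage). The source package name
# and test directory are placeholders for illustration.
import unittest
import coverage

cov = coverage.Coverage(source=["myproject"])   # hypothetical package
cov.start()

suite = unittest.defaultTestLoader.discover("tests")   # assumed test dir
unittest.TextTestRunner().run(suite)

cov.stop()
cov.save()
percent = cov.report()   # prints a per-file table, returns the total %
print(f"total statement coverage: {percent:.1f}%")
```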

What is statistical debugging?

What is statistical debugging? I haven't found a clear, concise explanation yet, but the term certainly sounds impressive.
Is it just a research topic, or is it being used somewhere, for actual development? In other words: Will it help me find bugs in my program?
I created statistical debugging, along with various wonderful collaborators across the years. I wish I'd noticed your question months ago! But if you are still curious, perhaps this late answer will be better than nothing.
At a very high level, statistical debugging is the idea of using statistical models of program success/failure to track down bugs. These statistical models expose relationships between specific program behaviors and eventual success or failure of a run. For example, suppose you notice that there's a particular branch in the program that sometimes goes left, sometimes right. And you also notice that runs where the branch goes left are fine, but runs where the branch goes right are 75% more likely to crash. So there's a statistical correlation here which may be worth investigating more closely. Statistical debugging formalizes and automates that process of finding program (mis)behaviors that correlate with failure, thereby guiding developers to the root causes of bugs.
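As a toy illustration of that scoring idea (far simpler than the real CBI models), consider ranking each instrumented predicate by how much more often runs fail when the predicate was observed true, compared with the overall failure rate. The run data below is invented.

```python
# Toy sketch of the core statistical-debugging idea: score each
# instrumented predicate by how much observing it true raises the
# probability of failure. All run data below is invented.
runs = [
    # (predicates observed true during the run, did the run fail?)
    ({"branch_right", "x_gt_0"}, True),
    ({"branch_right"}, True),
    ({"branch_left", "x_gt_0"}, False),
    ({"branch_left"}, False),
    ({"branch_right", "x_gt_0"}, False),
    ({"branch_left", "x_gt_0"}, False),
]

baseline_failure = sum(failed for _, failed in runs) / len(runs)

predicates = set().union(*(observed for observed, _ in runs))
for pred in sorted(predicates):
    outcomes = [failed for observed, failed in runs if pred in observed]
    fail_given_pred = sum(outcomes) / len(outcomes)
    lift = fail_given_pred - baseline_failure   # lift over the baseline
    print(f"{pred}: P(fail | observed) = {fail_given_pred:.2f}, "
          f"lift = {lift:+.2f}")
```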
Getting back to your original question:
Is it just a research topic, or is it being used somewhere, for actual development?
It is mostly a research topic, but it is out there in the "real" world in two ways:
The public deployment of the Cooperative Bug Isolation Project hunts for bugs in various Open Source programs running under Fedora Linux. You can download pre-instrumented packages and every time you use them you're feeding us data to help us find bugs.
Microsoft has released Holmes, an implementation of statistical debugging for .NET. It's nicely integrated into Visual Studio and should be a very easy way for you to use statistical debugging to help find your own bugs in your own code. I've worked closely with Microsoft Research on Holmes, and these are good smart people who know how to put out high-quality tools.
One warning to keep in mind: statistical debugging needs ample raw data to build good statistical models. In CBI's public deployment, that raw data comes from real end users. With Holmes, I think Microsoft assumes that the raw data will come from in-house automated unit tests and manual tests. What won't work is code with no runs at all, or with only failing runs but no successful counterexamples. Statistical debugging works off of the contrast between good and bad runs, so you need to feed it both. If you want bug-hunting tools without runs, then you'll need some sort of static analysis. I do research on that too, but that's not statistical debugging. :-)
I hope this helped and was not too long. I'm happy to answer any follow-up questions. Happy bug-hunting!
that's when you ship software saying "well, it probably works..."
;-)
EDIT: it's a research topic where machine learning and statistical clustering are used to try to find patterns in programs that are good predictors of bugs, to identify where more bugs are likely to hide.
It sounds like statistical sampling. When you buy a product, there's a good chance that not every single product coming off the "assembly line" has been checked for quality.
Statistical sampling calls for checking a certain percentage of products to almost ensure they're all problem-free. It minimizes the effort at the risk of some problems sneaking through and is absolutely necessary where the testing process is a destructive one - if you carry out destructive testing on 100% of your production line, that's not going to leave much for distribution :-)
To be honest, unless you're checking every single execution path and every single possible input value, you're already doing this in your testing. The amount of effort required to test everything for any but the most simplistic systems is not worth it. The extra cost would make your product a non-compete item.
Note that statistical sampling doesn't just involve testing every 100th unit. There are ways to target the sampling to improve the chance of catching problems. For example, if historical data suggests most errors are introduced at a specific phase, target that phase. If one of your developers is more problematic than others, check his stuff more closely.
From what I can see from a cursory glance at some research papers, statistical debugging is just that - targeting areas based on past history of problems.
I know we already do this for our software. Since any bugs that get fixed have to pass unit and system tests that replicate the problem (and our TDD says these tests should be written before trying to fix the bug), those tests are automatically added to the regression test suite so that those areas that cause more problems naturally are tested more often in the future.

Resources