Now that the Cutoff date plugin for Sonar is deprecated (I've tried it, and it doesn't seem to work at all), is there any way to exclude issues based on a set date?
For a large project, it is desirable to really start at fresh at given point.
Maintaining a low alert threshold ( < 100 alerts ) is much more manageable for the developers than cluttering the issues listings with old history/low priority issues. (1500++). Identifying the new and relevant ones are then much harder.
What we want to focus on are new issues.
Example:
when you have changed the quality rules for say 200.000+ lines of code, you really only are interested in what is produced from now on and changes to already existing code that breaks the new rules.
If my feeling is correct, you're not aware of the way to work with differential periods in SonarQube. Because out-of-the-box SonarQube is designed to cover such common use case. See :
http://www.sonarqube.org/differentials-four-ways-to-see-whats-changed/
http://www.sonarqube.org/using-differentials-to-move-the-team-in-the-right-direction/
http://www.sonarqube.org/differentials-but-wait-theres-more/
Related
I want to use SonarQube on my project. The project is quite a big and scanning whole files take much time. Is it possible to scan only changed files in the last commit, and provide report based only on changed lines of code?
I want to check if added or modified lines make the project quality worst and I don't care about old code.
For example, if person A created a file with 9 bugs and then commited changes - the report and quality gate should show 9 bugs. Then person B edited the same file adding few lines containing 2 additional bugs, then commited changes - the report should show the 2 last bugs and quality gate should be executed on the last changes (so should consider the last 2 bugs)
I was able to narrow scan to only changed files in the last commit- but report is generated based on whole files. I had an idea about cutting only changed lines of code, paste them to new file and run sonar scan on the file - but I'm almost sure the SonarQube needs the whole context of file.
Is it possible to somehow achieve my usecase ?
No, it is impossible. I saw a lot of similar questions. These are answers to two of them:
New Code analysis only:
G Ann Campbell:
Analysis will always include all code. Why? Why take the time to
analyze all of it when only a file or two has been changed? Because
any given change can have far-reaching effects. I’ll give you two
examples:
I check in a change that deprecates a much-used method. Suddenly,
issues about the use of deprecated code should be raised all over the
project, but because I only analyzed that one file, no new issues were
raised.
I modify a much-used method to return null in some cases. Suddenly all
the methods that dereference the returned value without first
null-checking it are at risk of NullPointerExceptions. But only the
one file that I changed was analyzed, so none of those “Possible NPE”
issues are raised. Worse, they won’t be raised until after each
individual file happens to be touched.
And that’s why all files are included in each analysis.
I want sonar analysis on newly checkin code:
G Ann Campbell:
First, the SonarQube interface and default Quality Gate are designed to help you focus
on the New Code Period. You can’t keep analysis from picking up those
old issues, but you can decide to only pay attention to issues raised
on newly-changed code. That means you would essentially ignore the
issues on the left side of the project homepage with a white
background and focus instead on the New Code values over the yellow
background on the right. We call this Fixing the Leak, or
alternately Clean as You Code.
Second, if you have a commercial edition, then branch and PR analysis
are available to you. With Short-Lived Branch (SLB) and PR analysis
still covers all files, but all that’s reported in the UI is what’s
changed in the PR / SLB.
Ideally, you’ll combine both of these things to make sure your new
code stays clean.
The position in this matter has not changed over the last years, so don't expect it will be changed.
My question is really simple , but i cant find anywhere this. I have the latest SonarQube version and for every language i got three different quality axis ( maybe based in the ISO 25010 standard), maintainability, security and reliability. But, in some tutorials i saw people with more categories as: performance, portability, usability... how can i get all this kind of analysis because i think that the rules are the same? is it a commercial set of rules? i dont know how to look , anyone have any idea?
What you're seeing in those tutorials is the SQALE model, which was basically dropped by SonarQube 5.6 in favor of the simpler, 3-axis model. In other words, those tutorials are pretty old, and if you really want what they're showing, you'll need to run a pretty old (4.x) version of SonarQube.
Your next question will likely be why the quality model changed in 5.6. The answer to that is that the SQALE model was really intricate and cool.... but on a day-to-day basis way too difficult to use. Which is why the current quality model breaks it down 3 ways:
Reliability / Bugs, Security / Vulnerabilities - things you should look at right away
Maintainability / Code Smells - everything else.
I was thinking that I’d rather only use the Task Work Item and ignore the Bug Work Item. This is my thinking as I set things up for my team. I’m on a quest to see why I shouldn’t do this. From my perspective a Task is either a new item or a bug item. There is no need to use two distinct Work Item Types. To make this happen in TFS I’ll start with the Bug Work Item and create a custom field (“Item Type”) to distinguish the two task types: new/bug. Both new tasks and bugs will share the same fields. Anyone see any major drawbacks to this approach?
The main reason Tasks/Issue/Bugs/etc are different work items are because the individual fields of each work type can be configured differently.
For example, by default, Bugs have a Triage property, Issues have Due date, Tasks have a Discipline. The States of a Bug (Active/Closed/Resolve) are different from an issue (Active/Closed).
By merging them into a single work item type you would loose the ability to configure each one uniquely.
Also, the rules followed when a Bug and Task are closed, for example, are generally different. Segregating them into work items allows a simpler rules set.
Work item type is also a standard column in all queries.
Overall, it depends on how extensively you are using Team Foundation. If your project is small, and the above don't matter, it's not going to hurt. Though I don't see much gain either.
I would suggest keeping Bug and dropping Task if you want to merge them. By default when you check in code and Resolve with a bug, it sets the status to Resolved and assigns it to whoever created it - usually a tester, but in your case possibly a PM. That person can then test to confirm the work is done and close it. You can set up alerts on their work items so they get an email and know that progress has happened. Alternatively if you use Task, when you Resolve at check in it is just closed. No alerts, no further testing. YMMV but on some of our projects we use Bug for things like "user would like to add a new report" and it fits our process well. (For others we keep the distinction for reporting purposes.)
It all boils down to 3 things:
Creation / prioritization
Reporting / Notifications
Completion workflow
Typically creation of a Task involves different fields than a Bug. For a bug you'll want to know things like environment found in, who notified you, severity, priority, etc.
For tasks you usually want to know the requestor, reason behind it, business unit impacted and iteration it is scheduled for. Tasks might be long term goals that result in new or enhanced functionality.
Reporting and Notifications of the two are generally different as well. PM's are going to track tasks to ensure deliverables are met, your tech support area is going to track bugs.
Next, bugs will generally result in hotfixes and service packs. Depending on severity this this might involve a high priority push through QA and release as quickly as possible. Tasks are more laid back and will go through all forms of regression and regular testing with a period of acceptance by the impacted business unit.
Finally, bugs may impact previous versions of your software. Tasks will almost always be for either the version currently under development or the one after that.
In short, they are fundamentally different things. They might share most fields in common, however by combining them you are restricting yourself in both reporting and workflows. Today this might be okay; however within the next month or next year this could seriously restrict you.
Considering that maintainence of work item types is an incredibly easy thing there is almost no benefit to merging them.
As you work in a legacy codebase what will have the greatest impact over time that will improve the quality of the codebase?
Remove unused code
Remove duplicated code
Add unit tests to improve test coverage where coverage is low
Create consistent formatting across files
Update 3rd party software
Reduce warnings generated by static analysis tools (i.e.Findbugs)
The codebase has been written by many developers with varying levels of expertise over many years, with a lot of areas untested and some untestable without spending a significant time on writing tests.
Read Michael Feather's book "Working effectively with Legacy Code"
This is a GREAT book.
If you don't like that answer, then the best advice I can give would be:
First, stop making new legacy code[1]
[1]: Legacy code = code without unit tests and therefore an unknown
Changing legacy code without an automated test suite in place is dangerous and irresponsible. Without good unit test coverage, you can't possibly know what affect those changes will have. Feathers recommends a "stranglehold" approach where you isolate areas of code you need to change, write some basic tests to verify basic assumptions, make small changes backed by unit tests, and work out from there.
NOTE: I'm not saying you need to stop everything and spend weeks writing tests for everything. Quite the contrary, just test around the areas you need to test and work out from there.
Jimmy Bogard and Ray Houston did an interesting screen cast on a subject very similar to this:
http://www.lostechies.com/blogs/jimmy_bogard/archive/2008/05/06/pablotv-eliminating-static-dependencies-screencast.aspx
I work with a legacy 1M LOC application written and modified by about 50 programmers.
* Remove unused code
Almost useless... just ignore it. You wont get a big Return On Investment (ROI) from that one.
* Remove duplicated code
Actually, when I fix something I always search for duplicate. If I found some I put a generic function or comment all code occurrence for duplication (sometime, the effort for putting a generic function doesn't worth it). The main idea, is that I hate doing the same action more than once. Another reason is because there's always someone (could be me) that forget to check for other occurrence...
* Add unit tests to improve test coverage where coverage is low
Automated unit tests is wonderful... but if you have a big backlog, the task itself is hard to promote unless you have stability issue. Go with the part you are working on and hope that in a few year you have decent coverage.
* Create consistent formatting across files
IMO the difference in formatting is part of the legacy. It give you an hint about who or when the code was written. This can gave you some clue about how to behave in that part of the code. Doing the job of reformatting, isn't fun and it doesn't give any value for your customer.
* Update 3rd party software
Do it only if there's new really nice feature's or the version you have is not supported by the new operating system.
* Reduce warnings generated by static analysis tools
It can worth it. Sometime warning can hide a potential bug.
I'd say 'remove duplicated code' pretty much means you have to pull code out and abstract it so it can be used in multiple places - this, in theory, makes bugs easier to fix because you only have to fix one piece of code, as opposed to many pieces of code, to fix a bug in it.
Add unit tests to improve test coverage. Having good test coverage will allow you to refactor and improve functionality without fear.
There is a good book on this written by the author of CPPUnit, Working Effectively with Legacy Code.
Adding tests to legacy code is certianly more challenging than creating them from scratch. The most useful concept I've taken away from the book is the notion of "seams", which Feathers defines as
"a place where you can alter behavior in your program without editing in that place."
Sometimes its worth refactoring to create seams that will make future testing easier (or possible in the first place.) The google testing blog has several interesting posts on the subject, mostly revolving around the process of Dependency Injection.
I can relate to this question as I currently have in my lap one of 'those' old school codebase. Its not really legacy but its certainly not followed the trend of the years.
I'll tell you the things I would love to fix in it as they bug me every day:
Document the input and output variables
Refactor the variable names so they actually mean something other and some hungarian notation prefix followed by an acronym of three letters with some obscure meaning. CammelCase is the way to go.
I'm scared to death of changing any code as it will affect hundreds of clients that use the software and someone WILL notice even the most obscure side effect. Any repeatable regression tests would be a blessing since there are zero now.
The rest is really peanuts. These are the main problems with a legacy codebase, they really eat up tons of time.
I'd say it largely depends on what you want to do with the legacy code...
If it will indefinitely remain in maintenance mode and it's working fine, doing nothing at all is your best bet. "If it ain't broke, don't fix it."
If it's not working fine, removing the unused code and refactoring the duplicate code will make debugging a lot easier. However, I would only make these changes on the erring code.
If you plan on version 2.0, add unit tests and clean up the code you will bring forward
Good documentation. As someone who has to maintain and extend legacy code, that is the number one problem. It's difficult, if not downright dangerous to change code you don't understand. Even if you're lucky enough to be handed documented code, how sure are you that the documentation is right? That it covers all of the implicit knowledge of the original author? That it speaks to all of the "tricks" and edge cases?
Good documentation is what allows those other than the original author to understand, fix, and extend even bad code. I'll take hacked yet well-documented code that I can understand over perfect yet inscrutable code any day of the week.
The single biggest thing that I've done to the legacy code that I have to work with is to build a real API around it. It's a 1970's style COBOL API that I've built a .NET object model around, so that all the unsafe code is in one place, all of the translation between the API's native data types and .NET data types is in one place, the primary methods return and accept DataSets, and so on.
This was immensely difficult to do right, and there are still some defects in it that I know about. It's not terrifically efficient either, with all the marshalling that goes on. But on the other hand, I can build a DataGridView that round-trips data to a 15-year-old application which persists its data in Btrieve (!) in about half an hour, and it works. When customers come to me with projects, my estimates are in days and weeks rather than months and years.
As a parallel to what Josh Segall said, I would say comment the hell out of it. I've worked on several very large legacy systems that got dumped in my lap, and I found the biggest problem was keeping track of what I already learned about a particular section of code. Once I started placing notes as I go, including "To Do" notes, I stopped re-figuring out what I already figured out. Then I could focus on how those code segments flow and interact.
I would say just leave it alone for the most part. If it's not broken then don't fix it. If it is broken then go ahead and fix and improve the portion of the code that is broken and its immediately surrounding code. You can use the pain of the bug or sorely missing feature to justify the effort and expense of improving that part.
I would not recommend any wholesale kind of rewrite, refactor, reformat, or putting in of unit tests that is not guided by actual business or end-user need.
If you do get the opportunity to fix something, then do it right (the chance of doing it right the first time might have already passed, but since you are touching that part again might as well do it right time around) and this includes all the items you mentioned.
So in summary, there's no single or just a few things that you should do. You should do it all but in small portions and in an opportunistic manner.
Late to the party, but the following may be worth doing where a function/method is used or referenced often:
Local variables often tend to be poorly named in legacy code (often owing to their scope expanding when a method is modified, and not being updated to reflect this). Renaming these in line with their actual purpose can help clarify legacy code.
Even just laying out the method slightly differently can work wonders - for instance, putting all the clauses of an if on one line.
There might be stale/confusing code comments there already. Remove them if they're not needed, or amend them if you absolutely have to. (Of course, I'm not advocating removal of useful comments, just those that are a hindrance.)
These might not have the massive headline impact you're looking for, but they are low risk, particularly if the code can't be unit tested.
What are some of the strategies that are used when implementing FxCop / static analysis on existing code bases with existing violations? How can one most effectively reduce the static analysis violations?
Make liberal use of [SuppressMessage] attribute to begin with. At least at the beginning. Once you get the count to 0 via the attribute, you then put in a rule that new checkins may not introduce FxCop violations.
Visual Studio 2008 has a nice code analysis feature that allows you to ensure that code analysis runs on every build and you can treat warnings as errors. That might slow things down a bit so I recommend setting up a continuous integration server (like CruiseControl.NET) and having it run code analysis on every checkin.
Once you get it under control and aren't introducing new violations with every checkin, start to tackle whole classes of FxCop violations at a time with the goal of removing the SuppressMessageAttributes that you used.
The way to keep track of which ones you really want to keep is to always add a Justification value to the ones you really want to suppress.
Rewrite your code in a passing style!
Seriously, an old code base will have hundreds of errors - but that's why we have novice/intern programmers. Correcting FxCop violations is a great way to get an overview of the code base and also learn how to write conforming .NET code.
So just bite the bullet, drink lots of caffeine, and just get through it in a couple days!
NDepend looks like it could do what you're after, but I'm not sure if it can be integrated into a CruiseControl.Net automated build, and fail the build if the code doesn't meet the requirements (which is what I'd like to happen).
Any other ideas?
An alternative to FxCop would be to use the tool NDepend. This tool lets write Code Rules over C# LINQ Queries (what we call CQLinq). Disclaimer: I am one of the developers of the tool
More than 200 code rules are proposed by default. Customizing existing rules or creating your own rules is straightforward thanks to the well-known C# LINQ syntax.
To keep the number of false-positives low, CQLinq offers the unique capabilities to define what is the set JustMyCode through special code queries prefixed with notmycode. More explanations about this feature can be found here. Here are for example two notmycode default queries:
Discard generated and designer Methods from JustMyCode
Discard generated Types from JustMyCode
To keep the number of false-positives low, with CQLinq you can also focus rules result only on code added or code refactored, since a defined baseline in the past. See the following rule, that detect methods too complex added or refactored since the baseline:
warnif count > 0
from m in Methods
where m.CyclomaticComplexity > 20 &&
m.WasAdded() || m.CodeWasChanged()
select new { m, m.CyclomaticComplexity }
Finally, notice that with NDepend code rules can be verified live in Visual Studio and at build process time, in a generated HTML+javascript report.