How does Visual Studio calculate the Maintainability Index? - visual-studio-2010

I have been browsing through the posts and MSDN, but I am not seeing how the Maintainability Index is calculated, only what the good and bad values are.
http://msdn.microsoft.com/en-us/library/bb385914.aspx
Does anyone know? Can we get a reference or explanation as to what they are using?
I am interested in what is defined as easy to maintain code.
More detail on the other metrics would also be welcome, but consider that extra credit; the explanation for the other metrics is much more understandable.

There is an explanation on the blog of their Code Analysis Team.
The actual formula for it is:
Maintainability Index = MAX(0,(171 - 5.2 * ln(Halstead Volume)
- 0.23 * (Cyclomatic Complexity)
- 16.2 * ln(Lines of Code))*100 / 171)
But don't just listen to me, go get the full read there.
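To make the formula concrete, here is a minimal sketch of it in Python. Visual Studio computes the Halstead volume, cyclomatic complexity, and line count itself from the compiled code; the values below are just illustrative inputs.

import math

def maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code):
    """Rescaled 0-100 maintainability index, per the formula above."""
    raw = (171
           - 5.2 * math.log(halstead_volume)
           - 0.23 * cyclomatic_complexity
           - 16.2 * math.log(lines_of_code))
    return max(0.0, raw * 100 / 171)

# A moderately sized method: volume 1500, complexity 12, 120 lines -> about 30.8
print(round(maintainability_index(1500.0, 12, 120), 1))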

The maintainability index dates back to 1992 and was added to Visual Studio in 2007.
Since then, several papers have appeared critically discussing the index, the coefficient, the metrics it is composed of, and the difficulty of using it in practice.
For a summary, you can check out my blog post "Think Twice Before Using the Maintainability Index."

Related

What influences the maintainability result in SonarQube?

I'm confronted with a huge "spaghetti code" codebase with a known lack of documentation, lack of test coverage, high complexity, lack of design rules to follow, etc. I let the code be analysed by a default sonar-scan and, surprisingly to me, maintainability gets a really great score, with a technical debt of 1.1%! Reality shows that almost every change introduces new bugs.
I'm quite perplexed, and wonder if some particularities in the implementation could explain this score... We have, for example, quite a lot of interfaces (it feels like 4-5 interfaces per implementation), and we use reflection and the service locator pattern.
Are there other indicators I could use that would be more relevant for improving the quality?
The maintainability rating is derived from the ratio of the estimated time to fix all the issues of type Code Smell in your code base to the estimated time it would take to write the code in its current state from scratch.
You should also look at the Bugs and Vulnerabilities in the code base.
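To make that ratio concrete, here is a rough sketch of how a technical-debt ratio maps to a letter rating. The thresholds are my assumption of the typical default grid, so check your own quality gate configuration for the real values.

def maintainability_rating(remediation_minutes, development_minutes):
    """Map the technical-debt ratio (remediation cost / development cost) to A-E."""
    ratio = remediation_minutes / development_minutes
    if ratio <= 0.05:
        return "A"
    if ratio <= 0.10:
        return "B"
    if ratio <= 0.20:
        return "C"
    if ratio <= 0.50:
        return "D"
    return "E"

# A 1.1% debt ratio, as in the question, still comes out as an "A"
print(maintainability_rating(1.1, 100.0))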
Regarding your specific points (and assuming we're talking about Java):
known lack of documentation - there is a rule in the default profile that looks for Javadocs. You might read its description and parameter values to see what it does and does not find.
lack of test coverage - there is currently a "hole" in this detection; if there is no coverage for a class, then the class is not taken into account when computing lines that could/should be covered, and therefore when calculating coverage percentages. It should be fixed "soon". The first steps will appear on the platform side in 6.2, but will need accompanying changes in the language plugins to take effect.
high complexity - there are rules for this. If they are not finding what you think they should, then take a look at their (adjustable) thresholds.
lack of design rules - the only rule that might address this (Architectural Constraint) is deprecated, slated for removal, not on by default, and has been dropped from the latest versions of the plugin
use of reflection - there aren't currently rules available to detect this

Choosing a strategy for BI module

The company I work for produces a content management system (CMS) with various add-ons for publishing, e-commerce, online printing, etc. We are now in the process of adding a "reporting module", and I need to investigate which strategy to follow. The "reporting module" is otherwise known as Business Intelligence, or BI.
The module is supposed to be able to track item downloads, executed searches and produce various reports out of it. Actually, it is not that important what kind of data is being churned as in the long term we might want to be able to push whatever we think is needed and get a report out of it.
Roughly speaking, we have two options.
Option 1 is to write a solution based on Apache Solr (specifically, using https://issues.apache.org/jira/browse/SOLR-236). Pros of this approach:
free / open source / good quality
we use Solr/Lucene elsewhere so we know the domain quite well
total flexibility over what is being indexed as we could take incoming data (in XML format), push it through XSLT and feed it to Solr
total flexibility of how to show search results. Similar to step above, we could have custom XSLT search template and show results back in any format we think is necessary
our frontend developers are proficient in XSLT so fitting this mechanism for a different customer should be relatively easy
Solr offers realtime / full text / faceted search, which are absolutely necessary for us (a rough sketch of this follows the cons list below). A quick prototype (based on Solr, 1M records) was able to deliver search results in 55ms. Our estimated maximum is about 1bn rows (this isn't a lot for a typical BI app), and if worse comes to worst we can always look at SolrCloud, etc.
there are companies doing very similar things using Solr (Honeycomb Lexicon, for example)
Cons of this approach:
SOLR-236 might or might not be stable; moreover, it's not yet clear when/if it will be released as part of an official release
there would possibly be some stuff we'd have to write to get some BI-specific features working. This sounds a bit like reinventing the wheel
the biggest problem is that we don't know what we might need in the future (such as integration with some piece of BI software, export to Excel, etc.)
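For context, here is a minimal sketch of what the Solr side of option 1 could look like, using Solr's standard JSON update and select endpoints. The core name, field names, and values are hypothetical, and real code would run the incoming XML through the XSLT step described above first.

import requests

SOLR = "http://localhost:8983/solr/tracking"  # hypothetical core

# Index a couple of tracking events (field names are made up for illustration)
events = [
    {"id": "evt-1", "type": "download", "item": "doc-42", "user": "alice"},
    {"id": "evt-2", "type": "search", "query": "annual report", "user": "bob"},
]
requests.post(SOLR + "/update?commit=true", json=events).raise_for_status()

# A simple faceted "report": number of events per type
response = requests.get(SOLR + "/select", params={
    "q": "*:*", "rows": 0, "wt": "json",
    "facet": "true", "facet.field": "type",
}).json()
print(response["facet_counts"]["facet_fields"]["type"])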
Option 2 is to do an integration with some free or commercial piece of BI software. So far I have looked at Wabit and will have a look at QlikView, possibly others. Pros of this approach:
no need to reinvent the wheel, software is (hopefully) tried and tested
would save us time we could spend solving problems we specialize in
Cons:
as we are a Java shop and our solution is cross-platform, we'd have to eliminate a lot of options which are in the market
I am not sure how flexible BI software can be. It would take time to go through some BI offerings to see if they can do flexible indexing, real time / full text search, fully customizable results, etc.
I was told that open source BI offerings are not mature enough, whereas commercial BIs (SAP, others) cost a fortune; their licenses start from tens of thousands of pounds/dollars. While I am not against a commercial choice per se, it will add to the overall price, which can easily become just too big
not sure how well BI is made to work with schema-less data
I am definitely not the best candidate to find the most appropriate integration option on the market (mainly because of my lack of knowledge in the BI area), but a decision needs to be made fast.
Has anybody been in a similar situation and could advise on which route to take, or even better - advise on possible pros/cons of the option #2? The biggest problem here is that I don't know what I don't know ;)
I have spent some time playing with both QlikView and Wabit, and, have to say, I am quite disappointed.
I had an expectation that the whole BI industry actually has some science under it, but from what I found it is just a buzzword. This MSDN article was actually an eye opener. The whole business of BI consists of taking data from well-normalized schemas (they call it OLTP), putting it into less-normalized schemas (OLAP, snowflake- or star-shaped), and creating indices for every aspect you want (the industry jargon for this is a data cube). The rest is just some scripting to get the pretty graphs.
OK, I know I am oversimplifying things here. I know I might have missed many different aspects (nice reports? export to Excel? predictions?), but from a computer science point of view I simply cannot see anything beyond a database index here.
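To make that oversimplification concrete, here is a toy sketch of the OLAP "cube" idea: pre-aggregating a small fact table along chosen dimensions. All data below is made up for illustration.

from collections import defaultdict

# A tiny "fact table": one row per download event with its dimension values
facts = [
    {"item": "doc-42", "country": "UK", "month": "2012-01", "downloads": 1},
    {"item": "doc-42", "country": "UK", "month": "2012-02", "downloads": 3},
    {"item": "doc-7",  "country": "DE", "month": "2012-01", "downloads": 2},
]

def rollup(rows, dimensions):
    """Aggregate the measure along the given dimensions - one slice of the cube."""
    cube = defaultdict(int)
    for row in rows:
        cube[tuple(row[d] for d in dimensions)] += row["downloads"]
    return dict(cube)

print(rollup(facts, ("item",)))             # downloads per item
print(rollup(facts, ("country", "month")))  # downloads per country and month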
I was told that some BI tools support compression. Lucene supports that too. I was told that some BI tools are capable of keeping the whole index in memory. For that, there is the Lucene cache.
Speaking of the two candidates (Wabit and QlikView): the first is simply immature (I got dozens of exceptions when trying to step outside of what was suggested in their demo), whereas the other only works under Windows (not very nice, but I could live with that), and the integration would likely require me to write some VBScript (yuck!). I had to spend a couple of hours on QlikView forums just to get a simple date range control working, and failed because the Personal Edition I had did not support the downloadable demo projects available on their site. Don't get me wrong, they're both good tools for what they have been built for, but I simply don't see any point in integrating with them as I wouldn't gain much.
To address the (arguable) immaturity of Solr, I will define an abstract API so I can move all the data to a database which supports full-text queries if anything goes wrong. And if worse comes to worst, I can always write stuff on top of Solr/Lucene if I need to.
If you're truly in a scenario where you're not sure what you don't know, I think it's best to explore an open-source tool and evaluate its usefulness before diving into your own implementation. It could very well be that using the open-source solution will help you further crystallise your own understanding and required features.
I have worked previously with an open-source solution called Pentaho. I seriously felt that I understood a whole lot more by learning to use Pentaho's features for my own ends. Of course, as is the case with most open-source solutions, Pentaho seemed a bit intimidating at first, but I managed to get a good grip on it in a month's time. We also worked with the Kettle ETL tool and Mondrian cubes, which I think most serious BI tools these days build on top of.
Earlier, all these components were independent, but of late I believe Pentaho has taken ownership of all these projects.
But once you're confident about what you need and what you don't, I'd suggest building a basic reporting tool of your own on top of a Mondrian implementation. Customising a sophisticated open-source tool can indeed be a big issue. Besides, there are licenses to be wary of. I believe Pentaho is GPL, though you might want to check on that.
First you should make clear what your reports should show. Which reporting features do you need? Which output formats do you want? Do you want to show it in the browser (HTML), as PDF, or with an interactive viewer (Java/Flash)? Where is the data (database, Java, etc.)? Do you need ad-hoc reporting or only some hard-coded reports? These are only some of the questions.
Without answers to these questions it is difficult to give a real recommendation, but my general recommendation would be i-net Clear Reports (used to be called i-net Crystal-Clear). It is a Java tool. It is a commercial tool, but the cost is lower than SAP and co.

Where can I find sample algorithms for analyzing historical stock prices?

Can anyone direct me in the right direction?
Basically, I'm trying to analyze stock prices and see if I can spot any patterns. I'm using PHP and MySQL to do this. Where can I find sample algorithms like the ones used in MetaStock or thinkorswim? I know they are closed source, but are there any tutorials available for beginners?
Thank you,
P.S. I don't even know what to search for on Google :(
A basic, educational algorithm to start with is a dual-crossover moving average. Simply chart fast (say, 5-day) and slow (say, 10-day) moving averages of a stock's closing price, and you have a weak predictor of when to buy long (fast line goes above slow) and sell short (slow line goes above the fast). After getting this working, you could implement exponential smoothing (see previously linked wiki article).
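Here is a minimal sketch of that dual-crossover idea in Python; the closing prices below are made up, and the same logic translates directly to PHP/MySQL.

def sma(prices, window):
    """Simple moving average; None until enough data points exist."""
    return [None if i + 1 < window else sum(prices[i + 1 - window:i + 1]) / window
            for i in range(len(prices))]

closes = [10.0, 10.2, 10.1, 10.4, 10.8, 11.0, 10.9, 11.3,
          11.6, 11.0, 10.4, 10.0, 9.6, 9.2, 8.9, 8.7]
fast, slow = sma(closes, 5), sma(closes, 10)

for day in range(1, len(closes)):
    if None in (fast[day - 1], slow[day - 1]):
        continue  # not enough history yet for both averages
    if fast[day - 1] <= slow[day - 1] and fast[day] > slow[day]:
        print("day", day, "- fast crossed above slow: buy signal")
    elif fast[day - 1] >= slow[day - 1] and fast[day] < slow[day]:
        print("day", day, "- fast crossed below slow: sell signal")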
That would be a decent start. Take a look at other technical analysis techniques, but do keep in mind that this is quite a perilous method of trading.
Update: As for actually implementing this? You're a PHP programmer, so here is a charting library for PHP. This is the one I used a few years ago for this very project, and it worked out swimmingly. Maybe someone else can recommend a better one. If you need a free source of data, take a look at Yahoo! Finance's historical data. They dispense CSV files containing daily opening prices, closing prices, trading volume, etc. of virtually every indexed corporation.
Check out the algorithms at Investopedia; FM Labs also has formulas for a lot of technical analysis indicators.
First you will need a solid math background: statistics in general, correlation analysis, linear algebra... If you really want to push it, check out dimensional transposition. Then you will need a solid basis in data mining. Associations can be useful if you want to link strict numerical data with news headlines and other events.
One thing is for sure: you will most likely not find pre-digested algorithms out there that will make you rich...
I know someone who is trying just that... He is somewhat successful (meaning he is not losing money and is making a bit) and makes his own algorithms... I should mention he has a doctorate in actuarial science.
Here are a few more links... hope they help out a bit
http://mathworld.wolfram.com/ActuarialScience.html
http://www.actuary.com/actuarial-science/
http://www.actuary.ca/
Best of luck to you
Save yourself time and use programs like NinjaTrader and Wealth-Lab. Both of them are great technical analysis platforms and accept C# as a programming language for defining your trading rules. Every possible technical indicator you can imagine is already included and if you need something more advanced you can always write your own indicator. You would also need a lot of data in order for your analysis to be statistically significant. For US stocks and ETFs, visit www.Kibot.com. We have good experience using their data.
Here's a pattern for ya
http://ddshankar.files.wordpress.com/2008/02/image001.jpg
I'd start with a good introduction to time series analysis and go from there. If you're interested in finding patterns, then the interesting term is "1D-Pattern Matching". But for that you need nice features, so google for "Feature extraction in time series". Remember GIGO (garbage in, garbage out), so make sure you have error-free stock price data for a sufficiently long time period before you start.
May I suggest that you do a little reading with respect to the Kalman filter? Wikipedia is a pretty good place to start:
http://en.wikipedia.org/wiki/Kalman_filter/
This should give you a little background on the problem of estimating and predicting the variables of some system (the stock market in this case).
But the stock market is not very well behaved, so you may want to familiarize yourself with non-linear extensions to the KF. Yes, the Wikipedia entry has sections on the extended KF and the unscented KF, but here is an introduction that is just a little more in-depth:
http://cslu.cse.ogi.edu/nsel/ukf/
I suppose if anyone had ever tried this before then it would have been all over the news and very well known. So you may very well be on to something.
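To give a taste of the idea, here is a minimal one-dimensional Kalman filter smoothing a noisy price series. The process and measurement variances are arbitrary illustrative values, not anything tuned for real market data.

def kalman_smooth(observations, process_var=1e-3, measurement_var=0.5):
    """1D Kalman filter: estimate a slowly drifting 'true' price from noisy closes."""
    estimate, error = observations[0], 1.0  # initial state estimate and its variance
    smoothed = [estimate]
    for z in observations[1:]:
        error += process_var                       # predict: uncertainty grows over time
        gain = error / (error + measurement_var)   # how much to trust the new observation
        estimate += gain * (z - estimate)          # update the estimate toward the observation
        error *= (1 - gain)                        # and shrink its variance
        smoothed.append(estimate)
    return smoothed

noisy_closes = [10.0, 10.6, 9.8, 10.3, 10.9, 10.4, 10.7, 11.2, 10.8, 11.1]
print([round(x, 2) for x in kalman_smooth(noisy_closes)])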
Use TradeStation
It is a platform that lets you write software to analyze historical stock data. You can even write programs that would trade the stock, and you can back-test your program on historical data or run it in real time throughout the day.

What is a good maintainability index using Visual Studio 2008 code analysis?

My company recently purchased TFS and I have started looking into the code analysis tools to help drive up code quality, and noticed a good-looking metric, "maintainability index". Is anyone using this metric for code reviews/checkins/etc? If so, what is an acceptable index for developers to work toward?
The maintainability index is not so much a fixed value you look at; it's more an indication that code is hard to understand, test and/or debug. I usually try to keep high-level code (basically anything except the real plumbing code) above 80, where 90+ would be good. For me it adds a competitive element to programming as maintainably as possible.
The code analysis tool really shines in the area of dependencies and the number of branches within a method though. More branches mean harder testing, which makes it more error-prone. Dependencies, same thing.
In other people's code, I use the maintainability index to spot possible bad parts in the code, so I know where to review it. Also, methods/classes with a high number of lines are an indication of poor code to me (unless it can't be avoided; again, the plumbing code).
In the end, I think it mainly depends on how often your code will change. Code that's expected to change a lot has to score higher in maintainability than your typical 'write once' code.

Calculate code metrics [closed]

Are there any tools available that will calculate code metrics (for example number of code lines, cyclomatic complexity, coupling, cohesion) for your project and over time produce a graph showing the trends?
On my latest project I used SourceMonitor. It's a nice free tool for code metrics analysis.
Here is an excerpt from SourceMonitor official site:
Collects metrics in a fast, single pass through source files.
Measures metrics for source code written in C++, C, C#, VB.NET, Java, Delphi, Visual Basic (VB6) or HTML.
Includes method and function level metrics for C++, C, C#, VB.NET, Java, and Delphi.
Saves metrics in checkpoints for comparison during software development projects.
Displays and prints metrics in tables and charts.
Operates within a standard Windows GUI or inside your scripts using XML command files.
Exports metrics to XML or CSV (comma-separated-value) files for further processing with other tools.
For .NET, besides NDepend (which is simply the best tool), I can recommend vil.
The following tools can perform trend analysis:
CAST
Klocwork Insight
Sonar is definitely a tool that you must consider, especially for Java projects. However, it will also handle PHP or C/C++, Flex and Cobol code.
Here is a screenshot that show some metrics on a project:
(Screenshot: http://sonar.codehaus.org/wp-content/uploads/2009/05/squid-metrics.png)
Note that you can try the tool by using their demo site at http://nemo.sonarsource.org
NDepend for .NET
I was also looking for a code metrics tool/plugin for my IDE, but as far as I know there are none (for Eclipse, that is) that also show a graph of the complexity over a specified time period.
However, I did find the Eclipse Metrics plugin; it can handle:
McCabe's Cyclomatic Complexity
Efferent Couplings
Lack of Cohesion in Methods
Lines Of Code in Method
Number Of Fields
Number Of Levels
Number Of Locals In Scope
Number Of Parameters
Number Of Statements
Weighted Methods Per Class
And while using it, I didn't really miss the graphing option you are seeking, either.
I think that, if you don't find any plugins/tools that can handle the graphing over time, you should look at the tool that suits you most and offers you all the information you need; even if the given information is only for the current build of your project.
As a side note, the eclipse metrics plugin allows you to export the data to an external file (link goes to an example), so if you use a source control tool, and you should!, you can always export the data for the specific build and store the file along with the source code, that way you still have a (basic) way to go back in time and check the differences.
Keep in mind: what you measure is what you get. LOC says nothing about productivity or efficiency.
Rate a programmer by lines of code and you will get... lines of code.
The same argument goes for other metrics.
OTOH, http://www.crap4j.org/ is a very conservative and useful metric: it sets complexity in relation to coverage.
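As I recall it, the crap4j score for a method combines cyclomatic complexity and test coverage roughly like this; treat the exact formula as my recollection, not an authoritative spec.

def crap_score(complexity, coverage_pct):
    """CRAP(m) = comp(m)^2 * (1 - coverage)^3 + comp(m), with coverage as a fraction."""
    uncovered = 1.0 - coverage_pct / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity

print(crap_score(10, 0))    # 110.0 - complex and completely untested
print(crap_score(10, 100))  # 10.0  - same complexity, but fully covered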
NDepend; I am using it and it's the best for this purpose.
Check this:
http://www.codeproject.com/KB/dotnet/NDepend.aspx
Concerning the tool NDepend, it comes with 82 different code metrics, from Number of Lines of Code to Method Rank (popularity), Cyclomatic Complexity, Lack of Cohesion of Methods, Percentage Coverage (extracted from NCover or VSTS), Depth of Inheritance...
With its rule system, NDepend can also find issues and estimate technical debt, which is an interesting code metric (the amount of dev effort needed to fix problems vs. the amount of dev time lost per year by leaving problems unfixed).
All these metrics are detailed here.
If you're in the .NET space, Developer Express' CodeRush provides LOC, Cyclomatic Complexity and the (rather excellent, IMHO) Maintenance Complexity analysis of code in real-time.
(Sorry about the Maintenance Complexity link; it's going to Google's cache. The original seems to be offline ATM).
Atlassian FishEye is another excellent tool for the job. It integrates with your source control system (currently supports CVS, SVN and Perforce), and analyzes all your files that way. The analysis is rather basic though, and the product itself is commercial (but very reasonably priced, IMO).
You can also get an add-on for it called Crucible that facilitates peer code reviews.
For Visual Studio .NET (at least C# and VB.NET) I find the free StudioTools to be extremely useful for metrics. It also adds a number of features found in commercial tools such as ReSharper.
Code Analyzer is a simple tool which generates this kind of metrics.
For Python, pylint can provide some code quality metrics.
There's also a code metrics plugin for Reflector, in case you are using .NET.
I would recommend the Code Metrics Viewer Extension for Visual Studio.
It is very easy to analyze the whole solution at once, and also to do a comparison to see if you've made progress ;-)
Read more about the features here.
On the PHP front, I believe phpUnderControl, for example, includes metrics through phpUnit (if I am not mistaken).
Keep in mind that metrics are often flawed. For example, a coder who's working on trivial problems will produce more code and therefore look better on your graphs than a coder who's cracking the complex issues.
If you're after some trend analysis, does it really mean anything to measure beyond SLOC?
Even if you are just doing a grep for trailing semicolons and counting the number of lines returned, what you are after is consistency in the SLOC measurement technique. That way today's measurement can be compared with last month's measurement in a meaningful way.
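As a trivial illustration of a consistent technique, something like the sketch below counts semicolon-terminated lines the same way every run; the directory and file extension are placeholders.

import pathlib

def sloc(root, extension=".cs"):
    """Count lines ending in ';' across a source tree - crude, but repeatable."""
    total = 0
    for path in pathlib.Path(root).rglob("*" + extension):
        with open(path, encoding="utf-8", errors="ignore") as f:
            total += sum(1 for line in f if line.rstrip().endswith(";"))
    return total

print(sloc("./src"))  # track this month over month; the trend matters, not the absolute value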
I can't really see what a trend of McCabe cyclomatic complexity would give. I think CC should be used more as a snapshot of quality to provide feedback to the developers.
Edit: Ooh, just thought of a couple of other measurements that might be useful: comments as a percentage of SLOC, and test coverage. Neither of those is something you want to let slip. Coming back to retrofit either of them is never as good as doing them "in the heat of the moment"!
HTH.
cheers,
Rob
Scitools' Understand does have the capability to generate a lot of code metrics for you. I don't have a lot of experience with the code metrics features, but the static analysis features in general were nice and the price was very reasonable. The support was excellent.
Project Code Meter gives a differential development history report (in Excel format) which shows your coding progress metrics in SLOC, time and productivity percentage (its time estimation is based on cyclomatic complexity and other metrics). Then in Excel you can easily produce the graph you want.
See this article, which describes it step by step:
http://www.projectcodemeter.com/cost_estimation/help/FN_monsizing.htm
For Java you can try our tool, QualityGate that computes more than 60 source code metrics, tracks all changes through time and also provides an overall rating for the maintainability of the source code.
