How to time (profile) Maven goals in a multi-module project

We have a huge project with many submodules. A full build currently takes over 30 minutes.
I wonder how this time is distributed across the different plugins/goals, e.g. tests and static analysis (FindBugs, PMD, Checkstyle, etc.).
Would it be possible to time the build to see where most of the time is spent, in both dimensions: modules and goals?

The maven-buildtime-extension is a Maven extension that records the time taken by each goal:
https://github.com/timgifford/maven-buildtime-extension
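For reference, a build extension like this is usually registered in the top-level POM; a minimal sketch (the coordinates and version shown here are assumptions - check the project page for the current ones):

<build>
  <extensions>
    <extension>
      <!-- groupId/version are assumptions; see the project README for current coordinates -->
      <groupId>co.leantechniques</groupId>
      <artifactId>maven-buildtime-extension</artifactId>
      <version>3.0.1</version>
    </extension>
  </extensions>
</build>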

If you run the build in a CI server like TeamCity or Jenkins (formerly Hudson), it will give you timestamps for every step in the build process and you should be able to use these values to determine which goals/projects are taking the most time.
I don't think there is any way built into Maven to do this. In fact, in the related question artbristol posted, there is a link to a Maven feature request for this functionality. Unfortunately, that issue is unresolved and I don't know if it will ever be added.
The other potential solution is to write your own plugin which would provide this build metadata for you.
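If you go down that route, Maven 3's EventSpy API is one natural hook. Below is a minimal, untested sketch that logs how long each goal takes per module; how you register it varies by Maven version (e.g. -Dmaven.ext.class.path or ${maven.home}/lib/ext):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.maven.eventspy.AbstractEventSpy;
import org.apache.maven.execution.ExecutionEvent;

// Logs how long each plugin goal takes, per module.
public class TimingSpy extends AbstractEventSpy {
    private final Map<String, Long> started = new ConcurrentHashMap<>();

    @Override
    public void onEvent(Object event) {
        if (!(event instanceof ExecutionEvent)) {
            return;
        }
        ExecutionEvent ee = (ExecutionEvent) event;
        if (ee.getType() == ExecutionEvent.Type.MojoStarted) {
            started.put(key(ee), System.nanoTime());
        } else if (ee.getType() == ExecutionEvent.Type.MojoSucceeded
                || ee.getType() == ExecutionEvent.Type.MojoFailed) {
            Long start = started.remove(key(ee));
            if (start != null) {
                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.printf("[timing] %-70s %6d ms%n", key(ee), millis);
            }
        }
    }

    // module :: plugin:goal, e.g. "my-module :: maven-checkstyle-plugin:check"
    private static String key(ExecutionEvent ee) {
        return ee.getProject().getArtifactId() + " :: "
                + ee.getMojoExecution().getPlugin().getArtifactId() + ":"
                + ee.getMojoExecution().getGoal();
    }
}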

I don't think there is a way to determine the timing of particular goals. What you can do is run the particular goals separately to see how long they take. So instead of doing a "mvn install" which runs all of your tests, checkstyle, etc., just do "mvn checkstyle:checkstyle" to see how long that takes for a particular module.
Having everything done every time is nice when it's done by an automated server (Continuum/Jenkins/Hudson), but when you are building locally, sometimes it's better to be able to just compile. One thing you can do is have the static analysis goals run ONLY when you pass in a certain parameter or activate a certain profile (see the sketch below). Another option is to have them run only when maven.test.skip=false.
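For example, a profile-gated Checkstyle run might look like this in the POM (the profile id is arbitrary; activate it with mvn verify -Pstatic-analysis):

<profiles>
  <profile>
    <id>static-analysis</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-checkstyle-plugin</artifactId>
          <executions>
            <execution>
              <!-- run checkstyle:check during verify, but only in this profile -->
              <phase>verify</phase>
              <goals>
                <goal>check</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>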
If you are using a continuous build, try having the static analysis only done every 4 hours, or daily.

Related

Maven and Jenkinsfile - skipping previous phases

I'm exploring Jenkins staging functionality and I want to devise a fast and lean setup.
Basically, Jenkins promotes the use of stages to partition the build process and provide nice visual feedback about the progress of the build.
So the Jenkinsfile goes something like
stage("Build")
bat("mvn compile")
stage("Test")
bat("mvn test")
stage("Deploy")
bat("mvn deploy")
This works well, but feels wrong, because deploy and test will both do activities from previous phases again.
As a result, in this setup I am building three times (although skipping compilation due to no changes) and testing two times (in the test and the deploy runs).
When I google around I can find various switches and one of them works for skipping unit tests, but the compilation and dependency resolution steps happen regardless of what I do.
Do I need to choose between speed and stages in this case or can I have both?
I mean:
stage("Resolve dependencies, build, test and deploy")
bat("mvn deploy")
is by far the fastest approach, but it doesn't produce a nice progress table in Jenkins.
To get incremental builds across Maven phases, the way Gradle does, you can use the takari-lifecycle Maven plugin.
Once the plugin is applied you get all of its benefits. In your example, the Test stage (mvn test) will avoid recompiling because the code was already compiled in the previous stage, and the Deploy stage will avoid compiling both your main and test sources; the tests will still be executed again, though, so I suggest adding -DskipTests there.
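Applying it looks roughly like this in the parent POM (the version is an assumption - check for the latest; note the plugin also brings its own packaging types such as takari-jar):

<plugin>
  <groupId>io.takari.maven.plugins</groupId>
  <artifactId>takari-lifecycle-plugin</artifactId>
  <version>2.0.7</version>
  <extensions>true</extensions>
</plugin>

With that in place, the Deploy stage simply becomes bat("mvn deploy -DskipTests").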

Incremental run of test suite

We have a large project with several thousand tests in the test suite, and a full test-suite run takes a very long time.
I am looking for a tool that I can integrate into the Maven build that will run only those tests that might be affected (based on per-test code coverage), because some of the code they cover has changed.
I googled around and found a few similar things, but no perfect fit:
Ekstazi http://www.ekstazi.org/ looks like exactly that, but it does not work out of the box with TestNG (which the test suite uses), and it is not open source
Infinitest https://infinitest.github.io/ seems to focus mainly on IDE integration - is it possible to run the tests only on demand (just like mvn infinitest)?
PIT http://pitest.org/ is not exactly what I am looking for but it also needs to analyze per-test coverage
It would also be very useful to remember test coverage as of the last git commit and run the tests against the latest code changes.
Further suggestions and comments on those above are welcome.
As far as I can see, Infinitest doesn't provide a corresponding Maven plugin, so this isn't possible with it out of the box. You may consider creating one though, making an invaluable contribution to the world.
It provides a pretty solid API, so writing a plugin shouldn't be a big problem. You may want to take a look at the InfinitestCore interface first. If you're using a CI environment, you can feed Infinitest the file list directly from git diff --name-only HEAD~1, which produces the list of files changed in the latest commit (useful if, for example, you run a build against each commit).
UPD. It seems there's a workaround involving the exec-maven-plugin to explicitly run Infinitest in the Maven build: you can run 'mvn exec:exec' from the command line or from m2eclipse's Maven Build launcher to run Infinitest against your project. I'd advise specifying the explicit build phase it should run in, using the executions element in the POM:
executions: It is important to keep in mind that a plugin may have multiple goals. Each goal may have a separate configuration, possibly even binding a plugin's goal to a different phase altogether. executions configure the execution of a plugin's goals.
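A sketch of that wiring (only the executions/phase binding is the point here; the way Infinitest itself is launched - the classpath property and main class - is purely hypothetical):

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>run-infinitest</id>
      <!-- bind the exec goal to an explicit phase -->
      <phase>test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>java</executable>
        <arguments>
          <argument>-cp</argument>
          <argument>${infinitest.classpath}</argument> <!-- hypothetical property -->
          <argument>org.infinitest.RunnerMain</argument> <!-- hypothetical main class -->
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>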

TeamCity - non-trivial build sequence, please advise

I am tasked with improving quality and implementing TeamCity for continuous integration. My experience with TeamCity is very limited - I mostly use TFS myself and have some experience with CC.NET.
A lot should happen within the build process... actually, the build is already split into three different configurations that run one after another.
My main problem is that in each of those I would actually need to start multiple runners. For example, the first build step shall consist of:
The generation of new AssemblyInfo.cs files for consistent assembly numbering
The actual compilation
A partial unit test run (all tests that run fast and check core functionality)
An FxCop run
A StyleCop run
The current version of TeamCity only allows one runner to be configured per build configuration... which leaves me stuck on a lot of things.
How would you approach this? My current idea is to use the MSBuild runner for everything and basically start my own MSBuild-based script which then does all the things, pretty much the way TFS handles it (and the same way I did things back in the CC.NET days with my own NAnt build script).
A further problem is how to present statistical information, for example from unit tests running in different stages (build configurations). Some tests further down the line take a while to run, and we want those to run in a second or third step (the last one, for example, tests database generation code which, including loading base data, takes about 15+ minutes to run). On the other hand, we would really like the test results to be consolidated somehow.
Anyone any ideas?
Thanks.
TeamCity 6.0 allows multiple build steps for a single build configuration. Isn't that what you're looking for?
You'll need to script this out, at least parts of it. TeamCity provides some nice UI-based config for some of your needs, but not all. Here's my suggestion:
Create an MSBuild script to handle your first two bullet points, AssemblyInfo generation and compilation (a sketch follows below). Configure the MSBuild runner to run your script and to run your tests. Collect your assemblies as artifacts.
Create a second build configuration for FxCop. Trigger it from the first build. Give it an 'artifact dependency' on the first build, which is how it gets hold of your DLLs.
For StyleCop, TeamCity doesn't support it out of the box like it does FxCop. Add it to your MSBuild script manually, and have it produce an HTML report (which TeamCity can then display).
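A skeleton of such a script might look like this (the solution name is a placeholder, and the AssemblyInfo/StyleCop wiring depends on which task libraries you install, e.g. MSBuild Community Tasks):

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build">
  <Target Name="AssemblyInfo">
    <!-- generate AssemblyInfo.cs files here for consistent assembly numbering -->
  </Target>
  <Target Name="Compile" DependsOnTargets="AssemblyInfo">
    <MSBuild Projects="MySolution.sln" Properties="Configuration=Release" />
  </Target>
  <Target Name="StyleCop" DependsOnTargets="Compile">
    <!-- run StyleCop here and emit an HTML report for TeamCity to display -->
  </Target>
  <Target Name="Build" DependsOnTargets="Compile;StyleCop" />
</Project>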
You need to take a look at the Dependencies functionality in TeamCity. This feature allows you to create a sequence of build configurations. In other words, you need to create a build configuration for each step and then link them all together as dependencies.
For consolidating test results, please take a look at Artifact Dependencies. They might help.

TeamCity: Running FxCop After Build

I think I'm missing a valuable piece of understanding with TeamCity 5.0. Why is there a separate build runner for FxCop? I prefer that my build server run everything, at once (compile, run unit tests, FxCop, etc). The problem is, I don't see how to add more than a single Build Runner for a specific project, so it seems I have to add a second project to TeamCity with a dependency on another project that uses the sln2008 build runner, or I could simply go the long route and build everything out in MSBuild. Am I missing something that should be obvious? Is it possible to configure the sln2008 Build Runner to include FxCop code analysis?
I think most users want their builds with tests to be as fast as possible. Other things like coverage, code analysis, and metrics most likely should not be run as often; it is enough to run them once per day, because their value is in the statistics gathered over time.
As for the multiple-build-runners-per-build-configuration feature - it is one of the most voted for in our tracker: http://youtrack.jetbrains.net/issue/TW-3660?query=multiple+build+runners, and it has a very good chance of being implemented in one of the next versions.

What is the best way to setup an integration testing server?

Setting up an integration server, I'm unsure of the best approach regarding the use of multiple tasks to complete the build. Is it best to put everything in one big job, or to create small dependent ones?
You definitely want to break up the tasks. Here is a nice example of CruiseControl.NET configuration that has different targets (tasks) for each step. It also uses a common.build file which can be shared among projects with little customization.
http://code.google.com/p/dot-net-reference-app/source/browse/#svn/trunk
I use TeamCity with an nant build script. TeamCity makes it easy to setup the CI server part, and nant build script makes it easy to do a number of tasks as far as report generation is concerned.
Here is an article I wrote about using CI with CruiseControl.NET; it has a NAnt build script in the comments that can be reused across projects:
Continuous Integration with CruiseControl
The approach I favour is the following setup (assuming you are on a .NET project):
CruiseControl.NET.
NAnt tasks for each individual step. NAntContrib for alternative CC templates.
NUnit to run unit tests.
NCover to perform code coverage.
FxCop for static analysis reports.
Subversion for source control.
CCTray or similar on all dev boxes to get notification of builds and failures etc.
On many projects you find that there are different levels of tests and activities which take place when someone does a checkin. Sometimes these can grow to the point where it is a long time after a checkin before a developer can see whether they have broken the build.
What I do in these cases is create three builds (or maybe two):
A CI build, triggered by checkin, which does a clean SVN get and build and runs lightweight tests. Ideally you can keep this down to minutes or less.
A more comprehensive build, which could run hourly (if there are changes), doing the same as the CI build but running more comprehensive and time-consuming tests.
An overnight build which does everything and also runs code coverage and static analysis of the assemblies and runs any deployment steps to build daily MSI packages etc.
The key thing about any CI system is that it needs to be organic and constantly tweaked. There are some great extensions to CruiseControl.NET which log and chart build timings etc. for the steps, letting you do historical analysis and so continuously tweak the builds to keep them snappy. It's something managers find hard to accept, but a build box will probably keep you busy for a fifth of your working time just to stop it grinding to a halt.
We use buildbot, with the build broken down into discrete steps. There is a balance to be found between breaking build steps down with enough granularity and keeping each one a complete unit.
For example, at my current position we build the sub-pieces for each of our platforms (Mac, Linux, Windows) on their respective platforms. We then have a single step (with a few sub-steps) that compiles them into the final version that will end up in the final distributions.
If something goes wrong in any of those steps it is pretty easy to diagnose.
My advice is to write the steps out on a whiteboard in as vague terms as you can and then base your steps on that. In my case that would be:
Build Plugin Pieces
Compile for Mac
Compile for PC
Compile for Linux
Make final Plugins
Run Plugin tests
Build intermediate IDE (We have to bootstrap building)
Build final IDE
Run IDE tests
I would definitely break down the jobs. Chances are you'll make changes in the builds, and it'll be easier to track down issues if you have smaller tasks instead of searching through one monolithic build.
You should be able to create one big job from the smaller pieces, anyway.
G'day,
As you're talking about integration testing, my big (obvious) tip would be to make the test server's build and configuration as close to the deployment environment as possible.
</thebloodyobvious> (-:
cheers,
Rob
Break your tasks up into discrete goals/operations, then use a higher-level script to tie them all together appropriately.
This makes your build process easier to understand for other people (you're documenting as you go so anyone on your team can pick it up, right?), as well as increasing the potential for re-use. It's likely you won't reuse the high-level scripts (although this could be possible if you have similar projects), but you can definitely reuse (even if it's copy/paste) the discrete operations rather easily.
Consider the example of getting the latest source from your repository. You'll want to group the tasks/operations for retrieving the code with some logging statements and reference the appropriate account information. This is the sort of thing that's very easy to reuse from one project to the next (see the NAnt sketch below).
For my team's environment, we use NAnt since it provides a common scripting environment between dev machines (where we write/debug the scripts) and the CI server (since we just execute the same scripts in a clean environment). We use Jenkins to manage our builds, but at their core each project just calls into the same NAnt scripts, and then we manipulate the results (i.e., archive the build output, flag failing tests, etc.).
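As a NAnt-flavoured sketch of the "get latest source" operation mentioned above (the target name, property names, and paths are all placeholders):

<target name="get-latest" description="Fetch the latest source from the repository">
  <echo message="Updating working copy in ${src.dir}" />
  <!-- account/credential details would come from shared properties rather than being inlined -->
  <exec program="svn">
    <arg value="update" />
    <arg value="${src.dir}" />
  </exec>
</target>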
