CI and automation - Maven

I'm about to start with CI, and I have a fully automated verification system, but from what I've read, the automation run will start after developer code is pushed to the cloud (and that happens many times a day). When I run the whole automation bundle it takes around an hour to finish the tests.
So I'm wondering whether that time is acceptable and, if not, what I can do to decrease it. Is there a particular method or tool that could help? Please advise.
Thanks in advance.

Consider running the automated tests at the end of the day. I had a similar situation and ended up using cron jobs scheduled at midnight, which avoids testing every single build.
If automated testing is needed for each build, try introducing additional nodes (e.g. for Jenkins) so you can run on several machines; I did something similar with Bitrise.
Divide the test cases with some logic, e.g. logins in one run, negative test cases in another, and so on.
Chop the test cases into smaller sections, run only the core tests for each build, and do the complete run at the end of the day.
There are a lot of measures that can ensure faster runs, and not all of them are handled programmatically. But programmatically you can also drastically increase the speed of the tests: parallelisation, concurrent runs, a grid, etc. (see the sketch below).
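For example, if the suite happens to be JUnit 5 based (an assumption, since the question doesn't name the test framework), tags plus parallel execution give you both the "core vs. nightly" split and the concurrency fairly cheaply. A minimal sketch:

```java
// Minimal JUnit 5 sketch (assumption: JUnit 5 is in use; class and tag names are
// illustrative). Parallel execution additionally requires
// junit.jupiter.execution.parallel.enabled=true in junit-platform.properties.
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.parallel.Execution;
import org.junit.jupiter.api.parallel.ExecutionMode;

import static org.junit.jupiter.api.Assertions.assertTrue;

@Tag("core")                         // essential checks, run on every build
@Execution(ExecutionMode.CONCURRENT) // let these test methods run in parallel
class LoginSmokeTest {

    @Test
    void validLoginSucceeds() {
        assertTrue(true); // placeholder for the real assertion
    }
}

@Tag("nightly")                      // heavier negative/edge cases, run by the midnight job
class NegativeLoginTest {

    @Test
    void invalidPasswordIsRejected() {
        assertTrue(true); // placeholder for the real assertion
    }
}
```

The per-commit job would then run only the core group (for instance via Maven Surefire's groups filter), while the midnight cron job runs everything.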
Hope this helps,


What would be the best way to add Performance Testing in Continuous Integration/Delivery Environment [closed]

I'm about to integrate performance testing into CI, but I'm having a hard time deciding which tests to execute and how to run them:
Should I run load testing on individual APIs, or should I run the whole workflow (e.g. Login -> Home Page -> Search)?
How long should I run it?
Should I also add stress, peak and soak testing? I'm thinking those should be run outside of CI.
Any comments would be really appreciated.
The main idea of including tests in the continuous integration process is protecting yourself from regressions, i.e. ensuring that a new feature or a bug fix doesn't cause performance degradation.
So common practice is to have a short-term load test, covering essential features and simulating the anticipated number of users, run periodically (i.e. on each commit or pull request to the master/integration branch).
If you have enough capacity and environment availability it would also make sense to have a soak test in place, but it should occur less frequently (you want to keep build time short enough, don't you?), i.e. overnight or over the weekend. This way you will be able to identify possible memory leaks.
Both approaches assume having some reference metrics, collected by previous ad-hoc runs, which can be used as the pass/fail criteria.
A stress test aims at identifying saturation/breaking points, so normally it is not included in the CI pipeline and is run manually before major releases.
Check out the How to Include Load Testing in your Continuous Integration Environment article for more information.
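To make the short, threshold-gated load test concrete, here is a minimal Java sketch (the URL, user count and latency threshold are assumptions rather than values from the question, and a dedicated tool such as JMeter or Gatling would normally do this job):

```java
// Minimal sketch of a short smoke-style load test that could run on each commit.
// Target URL, user count and latency threshold are assumptions; tune them to the
// reference metrics collected from earlier ad-hoc runs.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class SmokeLoadTest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.test/search?q=ci"))
                .timeout(Duration.ofSeconds(10))
                .build();

        int virtualUsers = 20;    // anticipated concurrent users (assumption)
        long maxAvgMillis = 500;  // pass/fail threshold from reference runs (assumption)

        ExecutorService pool = Executors.newFixedThreadPool(virtualUsers);
        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < virtualUsers; i++) {
            results.add(pool.submit(() -> {
                long start = System.nanoTime();
                HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() != 200) {
                    throw new IllegalStateException("Unexpected status " + response.statusCode());
                }
                return (System.nanoTime() - start) / 1_000_000; // elapsed time in ms
            }));
        }
        long total = 0;
        for (Future<Long> f : results) total += f.get();
        pool.shutdown();

        long avg = total / virtualUsers;
        System.out.println("Average latency: " + avg + " ms");
        if (avg > maxAvgMillis) {
            System.exit(1); // non-zero exit fails the CI step on a performance regression
        }
    }
}
```

Exiting non-zero when the average exceeds the reference threshold is what lets the CI step fail the build on a performance regression.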
It's difficult to say what you should test without knowing what's important to your application. In many applications, testing the performance of login would be less important than testing the performance of search, and therefore a sub-optimal use of time. Getting an accurate performance picture of search against test data can be challenging, but hardly impossible.
If running on CI, one could develop scripts that test performance in a critical or often-changing area. Your CI system could be set up to watch for changes in these areas and then kick off the performance tests if necessary. When done, CI could notify the developer if the performance in an area where he or she made changes doesn't meet a specified threshold.
I'd worry about running a few mission-critical performance tests with a tight feedback loop rather than worrying about running many types of tests. Remember, someone has to maintain these tests and the supporting infrastructure.
If your developers are committing code continuously and you have concerns about large load tests running for every single commit, you can ask them to batch the commits and trigger the performance-test phase of the build at a later time. However, all the interim builds can still go through the full QA cycle if the QA phase is well optimized and can finish within minutes; a 10-15 minute period is a good interval for a build to complete the full automated tests. The pipeline should mark these as interim builds that are not production ready, and should allow them through only after the completion of all the performance tests.
You can also extend this with short load tests that run alongside the CI QA tests, but defer the larger load tests to the end of the day. In summary, the production deployment for a build has to wait until the load and soak tests have completed. That does limit your ability to deploy several times throughout the day without risk; if you would rather accept a certain degree of risk, you can go ahead and deploy minor changes such as property and configuration changes.

Should code coverage be executed EVERY build? [closed]

I'm a huge fan of Brownfield Application Development. A great book no doubt, and I'd recommend it to all devs out there. I'm here because I got to the point in the book about code coverage. At my new shop, we're using Team City for automated builds/continuous integration and it takes about 40 minutes for the build to complete. The Brownfield book talks all about frictionless development and how we want to ease the common burdens that developers have to endure. Here's what I read on page 130:
"Code coverage: Two processes for the price of one?
As you can see from the sample target in listing 5.2, you end up with two output files:
one with the test results and one with the code coverage results. This is because you
actually are executing your tests during this task.
You don’t technically need to execute your tests in a separate task if you’re running
the code coverage task. For this reason, many teams will substitute an automated
code coverage task for their testing task, essentially performing both actions in the
CI process. The CI server will compile the code, test it, and generate code coverage
stats on every check-in.
Although there’s nothing conceptually wrong with this approach, be aware of some
downsides. First, there’s overhead to generating code coverage statistics. When
there are a lot of tests, this overhead could be significant enough to cause friction in
the form of a longer-running automated build script. Remember that the main build
script should run as fast as possible to encourage team members to run it often. If
it takes too long to run, you may find developers looking for workarounds.
For these reasons, we recommend executing the code coverage task separately from
the build script’s default task. It should be run at regular intervals, perhaps as a separate scheduled task in your build file that executes biweekly or even monthly, but we
don’t feel there’s enough benefit to the metric to warrant the extra overhead of having
it execute on every check-in."
This is contrary to the practice at my current shop, where we run NCover on every build. I want to go to my lead and request we not do this, but the best I can do is tell him "this is what the Brownfield book says". I don't think that's good enough. So I'm relying on you guys to fill me in with your personal experiences and advice on this topic. Thanks.
There are always two competing interests in continuous integration / automated build systems:
You want the build to run as quickly as possible
You want the build to run with as much feedback as possible (e.g. the largest number of tests run, the most information available about the build's stability and coverage, etc.)
You will always need to make tradeoffs and find a balance between these competing interests. I usually try to keep my build times under 10 minutes, and will consider build systems broken if it takes more than about 20 minutes to give any sort of meaningful feedback about the build's stability. But this doesn't need to be a complete build that tests every case; there may be additional tests that are run later or in parallel on other machines to further test the system.
If you are seeing build times of 40 minutes, I would recommend you do one of the following as soon as possible:
Distribute the build/testing onto multiple machines, so that tests can be run in parallel and you can get faster feedback
Find things that are taking a lot of time in your build but are not giving a great amount of benefit, and only do those tasks as part of a nightly build
I would 100% recommend the first solution if at all possible. However, sometimes the hardware isn't available right away and we have to make sacrifices.
Code coverage is a relatively stable metric, in that it is relatively rare that your code coverage numbers would get dramatically worse within a single day. So if the code coverage is taking a long time to perform, then it's not really critical that it occurs on every build. But you should still try to get code coverage numbers at least once a night. Nightly builds can be allowed to take a bit longer, since there (presumably) won't be anybody waiting on them, but they still provide regular feedback about your project's status and ensure there aren't lots of unforeseen problems being introduced.
That said, if you are able to get the hardware to do more distributed or parallel building/testing, you should definitely go that route - it will ensure that your developers know as soon as possible if they broke something or introduced a problem in the system. The cost of the hardware will quickly pay itself back in the increased productivity that occurs from the rapid feedback of the build system.
Also, if your build machine is not constantly working (i.e. there is a lot of time when it is idle), then I would recommend setting it up to do the following:
When there is a code change, do a build and test. Leave out some of the longer running tasks, including potentially code coverage.
Once this build/test cycle completes (or in parallel), kick off a longer build that tests things more thoroughly, does code coverage, etc
Both of these builds should give feedback about the health of the system
That way, you get the quick feedback, but also get the more extended tests for every build, so long as the build machine has the capacity for it.
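As a concrete illustration of that quick-build/extended-build split: the shop in the question is on Team City with NCover (.NET), so treat this Java/JUnit 5 sketch purely as an illustration of the idea, with an assumed NIGHTLY environment variable that only the longer secondary build sets.

```java
// Illustrative only: the expensive scenarios (and, in the real setup, coverage collection)
// are gated behind an environment variable that only the extended/secondary build defines.
// The NIGHTLY variable name and class name are assumptions made for this example.
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;

import static org.junit.jupiter.api.Assertions.assertTrue;

class ExtendedSuiteTest {

    @Test
    void quickCheck() {
        // always runs: part of the fast per-check-in build
        assertTrue(true);
    }

    @Test
    @EnabledIfEnvironmentVariable(named = "NIGHTLY", matches = "true")
    void exhaustiveScenario() {
        // only runs in the longer nightly/secondary build
        assertTrue(true);
    }
}
```

The fast per-check-in build stays short, while the extended build (which can also collect coverage) exercises everything.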
I wouldn't make any presumptions about how to fix this - you're putting the cart before the horse a bit here. You have a complaint that the build takes too long, so that's the issue I would ask to resolve, without preconceived notions about how to do it. There are many other potential solutions to this problem (faster machines, different processes, etc.) and you would be wise not to exclude any of them.
Ultimately this is a question of whether your management values the output of the build system enough to justify the time it takes. (And whether any action you might take to remedy the time consumption has acceptable fidelity in output).
This is a per team and per environment decision. You should first determine your threshold for build duration, and then factor out longer running processes into less-frequent occurrences (ideally no fewer than 1 or 2 times a day in CI) once that has been determined.
The objection appears to be that executing all the tests, and collecting code coverage, is expensive, and you don't (well, someone doesn't) want to pay that price for each build.
I cannot imagine why on earth you (or that someone) would not want to always know what the coverage status was.
If the build machine has nothing else to do, then it doesn't matter if it does this too.
If your build machine is too busy doing builds, maybe you've overloaded it by asking it to serve too many masters, or you are doing too many builds (why so many changes? hmm, maybe the tests aren't very good!).
If the problem is that the tests themselves really do take a long time, you can perhaps find a way to optimize the tests. In particular, you shouldn't need to re-run tests for the part of the code that didn't change. Figuring out how to do this (and trusting it) might be a challenge.
Some test coverage tools (such as ours) enable you to track what tests cover which part of the code, and, given a code change, which tests need to be re-run. With some additional scripting, you can simply re-run the tests that are affected first; this enables you to get what amounts to full test results early/fast without running all the tests. Then if there are issues with the build you find out as soon as possible.
[If you are paranoid and don't really trust the incremental testing process, you can run them for the early feedback, and then go on to run all the tests again, giving you full results.]
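To make the "additional scripting" idea a bit more tangible, here is a rough, hypothetical Java sketch (the coverage-map format, class names and file names are all invented for this example; the vendor tools mentioned above work differently in detail):

```java
// Hypothetical sketch: given a map of test name -> source files it covers
// (exported by some coverage tool) and the files changed in a commit,
// pick the tests worth re-running first. All names and data are illustrative.
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class AffectedTestSelector {

    static Set<String> affectedTests(Map<String, Set<String>> coverageByTest,
                                     Set<String> changedFiles) {
        return coverageByTest.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(changedFiles::contains))
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Map<String, Set<String>> coverageByTest = Map.of(
                "LoginTest", Set.of("Login.java", "Session.java"),
                "SearchTest", Set.of("Search.java", "Index.java"),
                "BillingTest", Set.of("Invoice.java"));
        Set<String> changedFiles = Set.of("Search.java");

        // Prints [SearchTest]: only the tests touching changed code run in the fast pass.
        System.out.println(affectedTests(coverageByTest, changedFiles));
    }
}
```

As the answer notes, the full suite can still run afterwards if you don't entirely trust the incremental selection.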

What’s the ROI of Continuous Integration?

Currently, our organization does not practice Continuous Integration.
In order for us to get an CI server up and running, I will need to produce a document demonstrating the return on the investment.
Aside from cost savings by finding and fixing bugs early, I'm curious about other benefits/savings that I could stick into this document.
My #1 reason for liking CI is that it helps prevent developers from checking in broken code, which can sometimes cripple an entire team. Imagine I make a significant check-in involving some db schema changes right before I leave for vacation. Sure, everything works fine on my dev box, but I forget to check in the db schema change script, which may or may not be trivial. Well, now there are complex changes referring to new/changed fields in the database, but nobody who is in the office the next day actually has that new schema, so now the entire team is down while somebody looks into reproducing the work I already did and just forgot to check in.
And yes, I used a particularly nasty example with db changes but it could be anything, really. Perhaps a partial check-in with some emailing code that then causes all of your devs to spam your actual end-users? You name it...
So in my opinion, avoiding a single one of these situations will make the ROI of such an endeavor pay off VERY quickly.
If you're talking to a standard program manager, they may find continuous integration to be a little hard to understand in terms of simple ROI: it's not immediately obvious what physical product that they'll get in exchange for a given dollar investment.
Here's how I've learned to explain it: "Continuous Integration eliminates whole classes of risk from your project."
Risk management is a real problem for program managers that is outside the normal ken of software engineering types who spend more time writing code than worrying about how the dollars get spent. Part of working with these sorts of people effectively is learning to express what we know to be a good thing in terms that they can understand.
Here are some of the risks that I trot out in conversations like these. Note, with sensible program managers, I've already won the argument after the first point:
Integration risk: in a continuous integration-based build system, integration issues like "he forgot to check in a file before he went home for a long weekend" are much less likely to cause an entire development team to lose a whole Friday's worth of work. Savings to the project avoiding one such incident = number of people on the team (minus one due to the villain who forgot to check in) * 8 hours per work day * hourly rate per engineer. Around here, that amounts to thousands of dollars that won't be charged to the project. ROI Win!
Risk of regression: with a unit test / automatic test suite that runs after every build, you reduce the risk that a change to the code breaks something that used to work. This is much more vague and less assured. However, you are at least providing a framework wherein some of the most boring and time-consuming (i.e., expensive) human testing is replaced with automation.
Technology risk: continuous integration also gives you an opportunity to try new technology components. For example, we recently found that Java 1.6 update 18 was crashing in the garbage collection algorithm during a deployment to a remote site. Due to continuous integration, we had high confidence that backing down to update 17 had a high likelihood of working where update 18 did not. This sort of thing is much harder to phrase in terms of a cash value but you can still use the risk argument: certain failure of the project = bad. Graceful downgrade = much better.
CI assists with issue discovery. Measure the amount of time currently that it takes to discover broken builds or major bugs in the code. Multiply that by the cost to the company for each developer using that code during that time frame. Multiply that by the number of times breakages occur during the year.
There's your number.
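For illustration only, with assumed numbers: if a broken build typically goes unnoticed for 2 hours, blocks 5 developers at $75/hour, and happens 40 times a year, that is roughly 2 x 5 x 75 x 40 = $30,000 a year that faster discovery could recover.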
Every successful build is a release candidate - so you can deliver updates and bug fixes much faster.
If part of your build process generates an installer, this allows a fast deployment cycle as well.
From Wikipedia:
when unit tests fail or a bug emerges, developers might revert the codebase back to a bug-free state, without wasting time debugging
developers detect and fix integration problems continuously, avoiding last-minute chaos at release dates (when everyone tries to check in their slightly incompatible versions)
early warning of broken/incompatible code
early warning of conflicting changes
immediate unit testing of all changes
constant availability of a "current" build for testing, demo, or release purposes
immediate feedback to developers on the quality, functionality, or system-wide impact of code they are writing
frequent code check-in pushes developers to create modular, less complex code
metrics generated from automated testing and CI (such as metrics for code coverage, code complexity, and features complete) focus developers on developing functional, quality code, and help develop momentum in a team
well-developed test-suite required for best utility
We use CI (Two builds a day) and it saves us a lot of time keeping working code available for test and release.
From a developer's point of view, CI can be intimidating when the automatic build result, sent by email to the whole world (developers, project managers, etc.), says:
"Error in loading DLL Build of 'XYZ.dll' failed." and you are Mr. XYZ and they know who you are :)!
Here's my example from my own experiences...
Our system has multiple platforms and configurations with over 70 engineers working on the same code base. We suffered from somewhere around 60% build success for the less commonly used configs and 85% for the most commonly used. There was a constant flood of e-mails on a daily basis about compile errors or other failures.
I did some rough calculations and estimated that we lost an average of an hour a day per programmer to bad builds, which totals nearly 10 man-days of work every day. That doesn't factor in the costs incurred in iteration time when programmers refuse to sync to the latest code because they don't know if it's stable; that costs us even more.
After deploying a rack of build servers managed by Team City we now see an average success rate of 98% on all configs, the average compile error stays in the system for minutes not hours and most of our engineers are now comfortable staying at the latest revision of the code.
In general I would say that a conservative estimate on our overall savings was around 6 man months of time over the last three months of the project compared with the three months prior to deploying CI. This argument has helped us secure resources to expand our build servers and focus more engineer time on additional automated testing.
Our biggest gain is from always having a nightly build for QA. Under our old system each product, at least once a week, would find out at 2AM that someone had checked in bad code. This meant there was no nightly build for QA to test with; the remedy was to send release engineering an email, they would diagnose the problem and contact a dev, and sometimes it took as long as noon before QA actually had something to work with. Now, in addition to having a good installer every single night, we actually install it on VMs of all the different supported configurations every night. So when QA comes in, they can start testing within a few minutes. Under the old way, QA came in, grabbed the installer, fired up all the VMs, installed it, then started testing. We now save QA probably 15 minutes per configuration to test, per QA person.
There are free CI servers available, and free build tools like NAnt. You can implement it on your dev box to discover the benefits.
If you're using source control, and a bug-tracking system, I imagine that consistently being the first to report bugs (within minutes after every check-in) will be pretty compelling. Add to that the decrease in your own bug-rate, and you'll probably have a sale.
The ROI is really the ability to provide what the customer wants. This is of course very subjective, but when CI is implemented with the involvement of the end customer, you will see that customers start appreciating what they are getting, and hence you tend to see fewer issues during User Acceptance.
Would it save cost? Maybe not.
Would the project fail during UAT? Definitely not.
Would the project be closed partway through? Quite possibly, when the customers find that it would not deliver the expected result.
Would you get real-time and real data about the project? Yes.
So it helps you fail faster, which helps mitigate risks earlier.

Build Quality

We have 3 branches {Dev,Test,Release} and will have continuous integration set up for each branch. We want to be able to assign build qualities to each branch i.e. Dev - Ready for test...
Has anyone any experience with this that can offer any advice/best practice approach?
We are using TFS 2008 and we are aware that it has Build Qualities built in. What we are looking for is when to apply a quality and what kind of qualities people use.
Thanks
:)
Your goal here is to get the highest quality possible in each branch, balanced against the burden of verifying that level of quality.
Allowing quality to drop in any branch is always harmful. Don't think you can let the Dev branch go to hell and then fix it up before merging. It doesn't work well, for two reasons:
Recovering is harder than you expect. When a branch is badly broken, you don't know how broken it really is, because each issue hides others. It's also hard to make any progress on any issue because you'll run into other problems along the way.
Letting quality drop saves you nothing. People sometimes say "quality, cost, schedule - pick any 2" or something like that. The false assumption here is that you "save" by allowing quality to slip. The problem is that as soon as quality drops, so does "velocity" - the speed at which you get work done. The good news is that keeping quality high doesn't really cost extra, and everyone enjoys working with a high-quality code base.
The compromise you have to make is on how much time you spend verifying quality.
If you do Test Driven Development well, you will end up with a comprehensive set of extremely fast, reliable unit tests. Because of those qualities, you can reasonably require developers to run them before checking in, and run them regularly in each branch, and require that they pass 100% at all times. You can also keep refactoring as you go, which lets you keep velocity high over the life of the project.
Similarly, if you write automated integration / customer tests well, so they run quickly and reliably, then you can require that they be run often, as well, and always pass.
On the other hand, if your automated tests are flaky, if they run slowly, or if you regularly operate with "known failures", then you will have to back off on how often people must run them, and you'll spend a lot of time working through these issues. It sucks. Don't go there.
Worst case, most of your tests are not automated. You can't run them often, because people are really slow at these things. Your non-release branch quality will suffer, as will the merging speed and development velocity.
Assessing the quality of a build in a deterministic and reproducible way is definitely challenging. I suggest the following:
If you are set up to do automated regression testing then all those tests should pass.
Developers should integration test each of their changes using an official Dev build newly installed on an official and clean test rig and give their personal stamp of approval.
When these two items are satisfied for a particular Dev build you can be reasonably certain that promoting this build to Test will not be wasting the time of your QA team.

How to deal with long running Unit Tests?

I've got about 100 unit tests with coverage of about 20%, which I'm trying to increase, and this is a project in development so I keep adding new tests.
Currently, running my tests after every build is not feasible; they take about 2 minutes.
Tests include:
Files read from the test folders (data-driven style to simulate some HTTP stuff)
Actual HTTP requests to a local web server (this is a huge pain to mock, so I won't)
Not all of them are unit tests; there are also quite complicated multithreaded classes which need to be tested, and I do test their overall behaviour. That can be considered functional testing, but it needs to be run every time as well.
Most of the functionality requires reading HTTP, doing TCP, etc. I can't change that, because it's the whole idea of the project; if I changed what these tests exercise, testing would be pointless.
Also, I don't think I have the fastest tools to run unit tests. My current setup uses VS TS with Gallio and NUnit as the framework. I think VS TS + Gallio is a bit slower than others as well.
What would you recommend to fix this problem? I want to run unit tests after every little change, but currently this problem is interrupting my flow.
Further Clarification Edit:
The code is highly coupled! Unfortunately, changing it would be a huge refactoring process, and there is a chicken-and-egg syndrome here: I need unit tests to refactor such a big codebase, but I can't have more unit tests if I don't refactor it :)
Highly coupled code doesn't allow me to split tests into smaller chunks. Also, I don't test private stuff; that's a personal choice, which allows me to develop much faster and still gain a large amount of benefit.
And I can confirm that all unit tests (with proper isolation) are quite fast actually; I don't have a performance problem with them.
These don't sound like unit tests to me, but more like functional tests. That's fine, automating functional testing is good, but it's pretty common for functional tests to be slow. They're testing the whole system (or large pieces of it).
Unit tests tend to be fast because they're testing one thing in isolation from everything else. If you can't test things in isolation from everything else, you should consider that a warning sign that your code is too tightly coupled.
Can you tell which tests you have which are unit tests (testing 1 thing only) vs. functional tests (testing 2 or more things at the same time)? Which ones are fast and which ones are slow?
You could split your tests into two groups, one for short tests and one for long-running tests, and run the long-running tests less frequently while running the short tests after every change. Other than that, mocking the responses from the webserver and other requests your application makes would lead to a shorter test-run.
I would recommend a combined approach to your problem:
Frequently run a subset of the tests that are close to the code you make changes to (for example tests from the same package, module or similar). Less frequently run tests that are farther removed from the code you are currently working on.
Split your suite in at least two: fast running and slow running tests. Run the fast running tests more often.
Consider having some of the less-likely-to-fail tests executed only by an automated continuous integration server.
Learn techniques to improve the performance of your tests. Most importantly, replace access to slow system resources with faster fakes. For example, use in-memory streams instead of files, stub/mock the HTTP access, etc.
Learn how to use low risk dependency breaking techniques, like those listed in the (very highly recommended) book "Working Effectively With Legacy Code". These allow you to effectively make your code more testable without applying high risk refactorings (often by temporarily making the actual design worse, like breaking encapsulation, until you can refactor to a better design with the safety net of tests).
One of the most important things I learned from the book mentioned above: there is no magic, working with legacy code is pain, and always will be pain. All you can do is accept that fact, and do your best to slowly work your way out of the mess.
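To make the "faster fakes" point concrete, here is a generic, hypothetical Java sketch (the interface and class names are invented; the question's actual code isn't shown). Putting the slow resource behind a small interface lets the test substitute an in-memory stub:

```java
// Hypothetical example of the "faster fakes" idea: production code depends on a tiny
// interface instead of concrete HTTP access, so the unit test can supply an in-memory
// stub and never touch the network or disk.
import java.io.IOException;

interface PageFetcher {
    String fetch(String url) throws IOException;   // the real implementation would do the HTTP request
}

class TitleExtractor {
    private final PageFetcher fetcher;

    TitleExtractor(PageFetcher fetcher) {
        this.fetcher = fetcher;
    }

    String titleOf(String url) throws IOException {
        String body = fetcher.fetch(url);
        int start = body.indexOf("<title>");
        int end = body.indexOf("</title>");
        return (start >= 0 && end > start) ? body.substring(start + 7, end) : "";
    }
}

class TitleExtractorTest {
    public static void main(String[] args) throws IOException {
        // In-memory fake: the test runs in microseconds and needs no local web server.
        PageFetcher fake = url -> "<html><title>Hello</title></html>";
        String title = new TitleExtractor(fake).titleOf("http://irrelevant.example");
        if (!"Hello".equals(title)) {
            throw new AssertionError("expected Hello but was " + title);
        }
        System.out.println("ok");
    }
}
```

The same trick applies to file access: pass an InputStream (backed by a ByteArrayInputStream in tests) instead of opening files directly.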
First, those are not unit tests.
There isn't much of a point running functional tests like that after every small change. After a sizable change you will want to run your functional tests.
Second, don't be afraid to mock the HTTP part of the application. If you really want to unit test the application, it's a must. If you're not willing to do that, you are going to waste a lot more time trying to test your actual logic while waiting for HTTP requests to come back and trying to set up the data.
I would keep your integration level tests, but strive to create real unit tests. This will solve your speed problems. Real unit tests do not have DB interaction, or HTTP interaction.
I always use a category for "LongTest". Those tests are executed every night, not during the day. This way you can cut your waiting time by a lot. Try it: categorize your unit tests.
It sounds like you may need to manage expectations amongst the development team as well.
I assume that people are doing several builds per day and are expected to run tests after each build. You might be well served to switch your testing schedule to run a build with tests during lunch and then another overnight.
I agree with Brad that these sound like functional tests. If you can pull the code apart that would be great, but until then I'd switch to less frequent testing.
