Single package in continuous delivery pipeline when building in parallel - continuous-delivery

My company is using Jenkins for continuous integration and I'm trying to move towards CD. I'm using GitHub as the code repository. Right now we merge feature branches into a UAT environment, and when a particular feature has been accepted the feature branch is merged to our production branch.
This is obviously dangerous because two changes could be tested together and deployed separately.
Ideally we would have a package tested and deployed without rebuilding, but I'm having trouble seeing how this is possible. If two people work on two different features and the first is finished, packaged, and goes into testing, is the second then finished and packaged without the first? But then how can I deploy that package without invalidating the testing of the other feature?
I'm not sure on the correct way to integrate features with a single deployable package.
Any help would be greatly appreciated.
Further,
If you look at http://ptgmedia.pearsoncmg.com/images/chap5_9780321601919/elementLinks/fig5_6.jpg
my concern is that check-in 1's package will be deployed once it passes acceptance testing, but what if acceptance testing fails? Check-in 5 contains the same problem as check-in 1, so no deployment to production can be done until check-in 1 is fixed or removed. Removing the change would be annoying, as there could be multiple commits to remove, and a fix plus testing could take a long time.

Continuous Delivery is an extension of Continuous Integration. CI is all about evaluating your changes in the context of everyone else's on a frequent basis (if you commit less than once per day, it can't count as CI).
Branching, of any kind, is all about isolating change and so is fundamentally at odds with CI. Feature branching and CI are opposed.
What most organisations do is merge branches before testing. This compromises the value of the feature branch, but retains the value of CI. If you don't do this then the CI has little real value for the reasons that you describe - you are not evaluating changes in a realistic context.
Sorry but you can't have both, they are opposites!

Regarding the difference in cycle time between hotfixes and less critical changes, have you looked into feature toggles? http://martinfowler.com/bliki/FeatureToggle.html
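To make the toggle idea concrete, here is a minimal sketch (the flag name, the NEW_CHECKOUT_FLOW environment variable, and the checkout functions are all invented for illustration): unfinished work is merged to the mainline but stays dark behind a flag, so a hotfix can ship immediately while the slower feature waits for its toggle to be flipped.

```python
# Minimal feature-toggle sketch. All names here are illustrative, not from any
# particular library: the point is that the unfinished code path is merged
# but never executed until the flag is turned on.
import os

FEATURE_FLAGS = {
    # Flip to "on" (via env var or a config service) once the feature passes acceptance.
    "new_checkout_flow": os.getenv("NEW_CHECKOUT_FLOW", "off") == "on",
}

def is_enabled(flag: str) -> bool:
    """Central lookup so toggles can later be moved to a config service."""
    return FEATURE_FLAGS.get(flag, False)

def legacy_checkout_flow(cart: list[dict]) -> float:
    return sum(item["price"] for item in cart)

def new_checkout_flow(cart: list[dict]) -> float:
    raise NotImplementedError("feature still in progress, kept dark by the toggle")

def checkout(cart: list[dict]) -> float:
    if is_enabled("new_checkout_flow"):
        return new_checkout_flow(cart)   # merged, but dark in production
    return legacy_checkout_flow(cart)    # current production behaviour
```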

If you want to do Continuous Delivery then branching is a no-no. Well, mostly. Releases should be tagged in SCM, the fix applied to the release and merged back into HEAD.
You should also have automated tests to prove the fix actually fixes the problem. This might be hard in some circumstances. In that case the minimum you should do is verify the fix doesn't break existing behaviour (if that's the intention of the fix).
Feature toggles are good, and so is branch by abstraction; however, in practice these are adopted only by the most mature and experienced teams that have adopted CD. I suspect you're not at that point yet, so this will help you get over the bump until you're more comfortable with CD.
If two features are supposed to be deployed at the same time, then I guess you should use the TDD principle of creating a FAILING test first, then implementing code to make it go green. Check that test in, so no build can move forward until you've got it implemented. This will make it absolutely clear that this build isn't destined for production, as the feature isn't complete. It's not a good idea for this test to run in CI, though, but rather in a later phase of testing... provided you have multiple test phases, that is!
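As a rough sketch of that gating idea (the test name, the marker, and the invoice helper are all invented for this example), a deliberately failing test in a later pipeline stage could look like this, keeping the build out of production until the combined feature set is genuinely complete:

```python
# Sketch of a deliberately failing acceptance-stage test that blocks promotion.
# Everything here is illustrative; wire it into whichever later test phase
# your pipeline runs after the fast CI loop.
import pytest

def invoice_supports_split_payments() -> bool:
    # Replace with a real end-to-end check once the combined feature exists.
    return False

@pytest.mark.acceptance  # run in a later pipeline stage, not in CI itself
def test_split_payments_feature_is_complete():
    # Written (and red) before the implementation, TDD-style: the build cannot
    # be promoted to production while this assertion fails.
    assert invoice_supports_split_payments()
```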

Related

Does implementing CI/CD require prerequisite steps?

I'm trying to understand CI/CD strategy.
Many CI/CD articles mention that it's the automation of the build, test, and deploy phases.
I would like to know: does the CI/CD concept have any prerequisite step(s)?
For example, if I make a simple tool that automatically builds and deploys, but the test step is manual - can this be considered CI/CD?
There's a minor point of minutia that should be mentioned first: the "D" in "CI/CD" can either mean "Delivery" or "Deployment". For the sake of this question, we'll accept the two terms as relatively interchangeable -- but be aware that others may apply a more narrow definition, which may be slightly different depending on which "D" you mean, specifically. For additional context, see: Continuous Integration vs. Continuous Delivery vs. Continuous Deployment
For example, if I make a simple tool that automatically builds and deploys, but the test step is manual - can this be considered CI/CD?
Let's break this down. Beforehand, let's establish what can be considered "CI/CD". Easy enough: if your (automated) process is practicing both CI (continuous integration) and CD (continuous deployment), then we can consider the solution as being some form of "CI/CD".
We'll need some definitions for CI and CD (see above link), which may vary by opinion. But if the question is whether this can be considered CI/CD, we can proceed on the lowest common denominator / bare minimum of popular/accepted definitions and apply those definitions liberally as they relate to the principles of CI/CD.
With that context, let's proceed to determine whether the constituent components are present.
Is Continuous Integration being practiced?
Yes. Continuous Integration is being practiced in this scenario. Continuous integration, in its most basic sense, is making sure that your ongoing work is regularly (continually) integrated (tested).
The whole idea is to combat the consequences of integrating (testing) too infrequently. If you do many many changes and never try to build/test the software, any of those changes may have very well broken the build, but you won't know until the point in time where integration (testing) occurs.
You are regularly integrating your changes and making sure the software still builds. This is unequivocally CI in practice.
But there are no automated tests?!
One may make an objection to the effect of "if you're not running what is traditionally thought of as tests (unit|integration|smoke|etc) as part of your automated process, it's not CI" -- this is a demonstrably false statement.
Even though in this case you mention that your "test" steps would be manual, it's still fair to say that simply building your application is sufficient to meet the basic definition of a "test" in the sense of continuous integration. Successfully building (e.g. compiling) your code is, in itself, a test. You are effectively testing "can it build?". If your code change breaks the compile/build process, your CI process will tell you so right after you commit your code -- that's CI in action.
Just like code changes may break a unit test, they can also break the compilation process -- automating your build tests that your changes did not break the build and is, therefore, a kind of continuous integration, without question.
Sure, your product can be broken by your changes even if it compiles successfully. It may even be the case that those software defects would have been caught by sufficient unit testing. But the same could be said of projects with proper unit tests, even projects with "100% code coverage". We certainly don't consider projects with test gaps as not practicing CI. The size of the test gap doesn't make the distinction between CI and non-CI; it's irrelevant to the definition.
Bottom line: building your software exercises (integrates/tests) your code changes, if even only in a minimally significant degree. Doing this on a continuous basis is a form of continuous integration.
Is Continuous Deployment/Delivery being practiced?
Yes. It is plain to see in this scenario that if you are deploying/delivering your software to whatever its 'production environment' is in an automated fashion, then you have the "CD" component of CI/CD, at least to some minimal degree. The fact that your tests may be manual is not consequential.
Similar to the above, reasonable people could disagree on the effectiveness of the implementation depending on the details, but one would not be able to make the case that this practice is non-CD, by definition.
Conclusion: can this practice be considered "CI/CD"?
Yes. Both the CI and CD elements are present, at least to a minimal degree. The practices used probably can't reasonably be called non-CI or non-CD. Therefore, it should be concluded that the described practice can be considered "CI/CD".
I think it goes without saying that the described CI/CD process has gaps and could benefit from improvement, and that with the lack of automated tests and other features it doesn't reap all the benefits a robust CI/CD process could offer. However, this doesn't render the process non-CI/CD by any means. It's certainly CI/CD in practice; whether it's a particularly good or robust CI/CD practice is a matter of opinion.
does the CI/CD concept have any prerequisite step(s)?
No, there are no specific prerequisites (like writing automated software tests, for example) to applying CI/CD concepts. You can apply both CI and CD independently of one another without any prerequisites.
To further illustrate, let's think of an even more minimal project with "CI/CD"...
CD could be as simple as committing to the main branch of a GitHub Pages repository. If that same Pages repo, for example, uses Jekyll, then you have CI too, as GitHub will build your project automatically in addition to deploying it, and will inform you of build errors when they occur.
In this basic example, the only thing needed to implement "CI/CD" was to commit the Jekyll project code to a GitHub Pages repository. No prerequisites.
There are even cases where you can accurately consider a project as having a CI process even though that process doesn't build any software at all! CI could, for example, consist solely of code style checks or other trivial checks, like verifying that files end with a newline. When a project includes only these kinds of checks, we would still call that process "CI", and it wouldn't be an inaccurate description.
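For instance, such a trivial "CI" job could be nothing more than a short script run on every commit; here is a minimal sketch (the choice of files to scan and the exit-code convention are just assumptions):

```python
# check_trailing_newline.py -- a deliberately trivial "CI" check:
# fail the build if any matching file does not end with a newline.
import sys
from pathlib import Path

def missing_final_newline(path: Path) -> bool:
    data = path.read_bytes()
    return len(data) > 0 and not data.endswith(b"\n")

def main() -> int:
    offenders = [p for p in Path(".").rglob("*.py") if missing_final_newline(p)]
    for path in offenders:
        print(f"missing final newline: {path}")
    return 1 if offenders else 0  # a non-zero exit code fails the CI job

if __name__ == "__main__":
    sys.exit(main())
```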

Is there a way to prevent broken or untested code from going live when using CI?

I recently got my team to switch to CI. We're still taking baby steps, but so far we've started using this branching strategy and it's been working great. We don't have any automated tests running yet and do all our testing manually. We are also currently doing weekly releases, so we create a release branch from our Dev branch Monday afternoon, run further tests Tuesday, push it live to our Master branch on Wednesday morning, and monitor throughout the day. The problem is that sometimes our QA team doesn't get to test all our features in time, or a feature fails QA but is already in the Dev branch and could go live before the issues are fixed. Any ways to mitigate this? This is all fairly new to us, so I am open to changing everything if necessary.
Hopefully this diagram is a good explanation of how things work currently:
You're not doing Continuous Integration (CI).
In CI you only have one branch (master/main/etc.) - on which the CI tool operates - and maybe very short-lived child branches that get merged into that branch, ideally in < 1 day. Pretty much synonymous with trunk-based development.
In traditional CI breakages are normal, part of the process - bad changes are identified by the CI tool (one of the reasons you're using it to start with) after they are merged into the branch. Breakages should be fixed ASAP as they block the project's critical path. Unfortunately this requires human intervention, which can be a serious speed bump for larger projects.
There is also gating CI, which performs automated, orchestrated verification of changes prior to merging them into the branch, rejecting those that fail the verifications and automatically merging the successful ones, thus ensuring that the branch maintains a minimum quality level at all times. IMHO this is the only comfy and deterministic way to do integration, especially in larger-scale projects. AFAIK there are only 2 such tools out there: Zuul and ApartCI (which is my baby).
The QA/release stages are just subsequent steps in your CI/CD pipeline which are automatically reached if the previous steps pass; see the picture in this answer. Optionally you can pull release branches from the respective refpoints, which would allow complex multi-release stories, hotfixes, etc. But these branches are never merged back into their parent; the hotfixes are instead double-committed/cherry-picked into the master branch, potentially with additional changes as required by the different branch context.
GitFlow is not CI:
you have multiple branches (silos) which you only merge once in a while.
branch merges can rapidly become a nightmare, especially in larger projects where you can have more than 1 feature branch active simultaneously. Try to add 1, then 2 more parallel feature branches to your picture, as well as the additional upstream branch merges that need to be performed before you can do the merges you already show, and the additional QA executions you may need because of these extra merges - just to get an idea of how badly such a strategy scales.
tests done in these isolated branches are out of context - the branches do not reflect the overall project integration. Even if these tests pass, the master branch may fail them or be plain broken after the merge. See an example of how just 2 simple changes can cause such problems, let alone branches with multiple changes.
As you probably guessed by now, I'm not a big fan of GitFlow, it's awfully similar to the pre-CI integration practices I had to endure a long time ago ;)

Handling multiple branches in continuous integration

I've been dealing with the problem of scaling CI at my company and at the same time trying to figure out which approach to take when it comes to CI and multiple branches. There is a similar question at Stack Overflow, Multiple feature branches and continuous integration. I've started a new one because I'd like to get more of a discussion going and provide some analysis in the question.
So far I've found that there are 2 main approaches that I can take (or maybe some others???).
Option 1: multiple sets of jobs (talking about Jenkins/Hudson here), one set per branch
- Requires writing tooling to manage the extra jobs: creating/modifying/deleting jobs in bulk, plus custom settings for each job per branch (SCM URL, duplicated dependency-management repos).
- Some examples of people tackling this problem with shell tools, Ant scripts and the Jenkins CLI:
  http://jenkins.361315.n4.nabble.com/Multiple-branches-best-practice-td2306578.html
  http://jenkins.361315.n4.nabble.com/Is-it-possible-to-handle-multiple-branches-where-some-jobs-should-run-on-each-one-without-duplicatin-td954729.html
  http://jenkins.361315.n4.nabble.com/Parallel-development-with-branches-td1013013.html
  Configure or Create hudson job automatically
- Will cause more load on your CI cluster.
- The feedback cycle for devs slows down (if the infrastructure cannot handle the new load).
Option 2: multiple sets of jobs for just two branches (dev & stable)
- Manage the two sets manually (if you change the configuration of a job, be sure to change it in the other branch too).
- A PITA, but at least there are only a few jobs to manage.
- Other extra branches won't get a full test suite before they get pushed to dev.
- Unsatisfied devs. Why should a dev care about CI scaling problems? They have a simple request: "When I branch, I would like to test my code." Simple.
So it seems that if I want to provide devs with CI for their own custom branches, I need special tooling for Jenkins (an API or shell scripts or something?) and to handle scaling. Or I can tell them to merge more often to DEV and live without CI on custom branches. Which one would you take, or are there other options?
When you talk about scaling CI you're really talking about scaling the use of your CI server to handle all your feature branches along with your mainline. Initially this looks like a good approach as the developers in a branch get all the advantages of the automated testing that the CI jobs include. However, you run into problems managing the CI server jobs (like you have discovered) and more importantly, you aren't really doing CI. Yes, you are using a CI server, but you aren't continuously integrating the code from all of your developers.
Performing real CI means that all of your developers are committing regularly to the mainline. Easy to say, but the hard part is doing it without breaking your application. I highly recommend you look at Continuous Delivery, especially the Keeping Your Application Releasable section in Chapter 13: Managing Components and Dependencies. The main points are:
Hide new functionality until it's finished (A.K.A Feature Toggles).
Make all changes incrementally as a series of small changes, each of which is releasable.
Use branch by abstraction to make large-scale changes to the codebase.
Use components to decouple parts of your application that change at different rates.
They are pretty self-explanatory except branch by abstraction, which is just a fancy term for the following steps (a small code sketch follows the list):
Create an abstraction over the part of the system that you need to change.
Refactor the rest of the system to use the abstraction layer.
Create a new implementation, which is not part of the production code path until complete.
Update your abstraction layer to delegate to your new implementation.
Remove the old implementation.
Remove the abstraction layer if it is no longer appropriate.
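Here is a tiny sketch of what those steps look like in code (the class and function names are invented for illustration): callers depend only on the abstraction, the new implementation grows on the mainline outside the production code path, and the eventual switch-over is a one-line change rather than a long-lived branch merge.

```python
# Branch by abstraction in miniature: the mainline always builds and ships.
from abc import ABC, abstractmethod

class PriceCalculator(ABC):
    """Step 1: the abstraction over the part of the system being replaced."""
    @abstractmethod
    def total(self, items: list[float]) -> float: ...

class LegacyPriceCalculator(PriceCalculator):
    """Existing behaviour, wrapped so the rest of the system uses the abstraction (step 2)."""
    def total(self, items: list[float]) -> float:
        return sum(items)

class TaxAwarePriceCalculator(PriceCalculator):
    """Step 3: the new implementation, kept off the production path until complete."""
    def __init__(self, tax_rate: float = 0.2):
        self.tax_rate = tax_rate

    def total(self, items: list[float]) -> float:
        return sum(items) * (1 + self.tax_rate)

# Steps 4-6: switching this binding (or driving it from a feature toggle) is the
# "merge"; once the new implementation is proven, the legacy class and, later,
# the abstraction itself can be removed.
calculator: PriceCalculator = LegacyPriceCalculator()

def checkout_total(items: list[float]) -> float:
    return calculator.total(items)
```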
The following paragraph from the Branches, Streams, and Continuous Integration section in Chapter 14: Advanced Version Control summarises the impacts.
The incremental approach certainly requires more discipline and care - and indeed more creativity - than creating a branch and diving gung-ho into re-architecting and developing new functionality. But it significantly reduces the risk of your changes breaking the application, and will save you and your team a great deal of time merging, fixing breakages, and getting your application into a deployable state.
It takes quite a mind shift to give up feature branches and you will always get resistance. In my experience this resistance is based on developers not feeling safe committing code to the mainline, and this is a reasonable concern. This in turn usually stems from a lack of knowledge, confidence or experience with the techniques listed above, and possibly from a lack of confidence in your automated tests. The former can be solved with training and developer support. The latter is a far more difficult problem to deal with; however, branching doesn't provide any real extra safety, it just defers the problem until the developers feel confident enough with their code.
I would set up separate jobs for each branch. I've done this before and it isn't hard to manage and set up if you've set up Hudson/Jenkins correctly. A quick way to create multiple jobs is to copy from an existing job that has similar requirements and modify it as needed. I'm not sure if you want to allow each developer to set up their own jobs for their own branches, but it isn't much work for one person (i.e. a build manager) to manage. Once the custom branches have been merged into stable branches, the corresponding jobs can be removed when they are no longer necessary.
If you're worried about the load on the CI server, you could set up separate instances of the CI or even separate slaves to help balance the load across multiple servers. Make sure that the server you are running Hudson/Jenkins on is adequate. I've used Apache Tomcat and just had to ensure that it had enough memory and processing power to process the build queue.
It's important to be clear on what you want to achieve using CI and then figure out a way to implement it without much manual effort or duplication. There's nothing wrong with using other external tools or scripts that are executed by your CI server that help simplify your overall build management process.
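As one example of such an external script, here is a hedged sketch that clones a template Jenkins job for a new branch through Jenkins' remote API (the server URL, credentials, and job names are placeholders, and your instance may additionally require CSRF crumbs or a different authentication setup):

```python
# Sketch: create a per-branch Jenkins job by copying a template job via the
# remote API's createItem?mode=copy endpoint. URL, credentials and job names
# are placeholders; CSRF ("crumb") handling is omitted and may be required.
import requests

JENKINS_URL = "https://jenkins.example.com"   # placeholder
AUTH = ("build-manager", "api-token")         # placeholder user / API token

def create_branch_job(template_job: str, branch: str) -> str:
    new_job = f"{template_job}-{branch}"
    response = requests.post(
        f"{JENKINS_URL}/createItem",
        params={"name": new_job, "mode": "copy", "from": template_job},
        auth=AUTH,
        timeout=30,
    )
    response.raise_for_status()
    return new_job

if __name__ == "__main__":
    print(create_branch_job("myapp-ci", "feature-login-rework"))
```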
I would choose dev + stable branches. And if you still want custom branches and are afraid of the load, then why not move these custom ones to the cloud and let developers manage them themselves, e.g. http://cloudbees.com/dev.cb
This is the company where Kohsuke is now.
There is Eclipse tooling as well, so if you are on Eclipse, you will have it tightly integrated right into your dev environment.
Actually, what is really problematic is build isolation with feature branches. In our company we have a set of separate Maven projects that are all part of a larger distribution. These projects are maintained by different teams, but for each distribution all projects need to be released. A feature branch may now span more than one project, and that's when build isolation gets painful. There are several solutions we've tried:
create separate snapshot repositories in nexus for each feature branch
share local repositories on dedicated slaves
use the repository-server-plugin with upstream repositories
build all within one job with one private repository
As a matter of fact, the last solution is the most promising. All the other solutions fall short in one way or another. Together with the Job DSL plugin it is easy to set up a new feature branch: simply copy and paste the Groovy script, adapt the branches, and let the seed job create the new jobs. Make sure that the seed job removes unmanaged jobs. Then you can easily scale with feature branches across different Maven projects.
But as Tom said above, it would be nicer to overcome the need for feature branches and teach devs to integrate cleanly, but that is a longer process and the outcome is not clear when there are many legacy system parts you won't touch any more.
my 2 cents

What's the most efficient way for a developer to switch between tasks?

I'm looking for a workflow-type description of the series of steps you perform to switch from one software development task to another. If a step involves a tool, please specify which tool and how it's used. The goal of the workflow is to have the smoothest possible transition from task #1 to task #2 and back to task #1.
Consider this scenario...
You're implementing a new user story and, while you've made progress so far today, it's not quite done and you haven't implemented your tests yet.
Your lead comes to you with a high priority bug that's blocking your test team. You need to stop what you're doing and get the bug fixed. The bug is in a build from three days ago, which is the most recent build the test team has picked up.
You can fix the bug in a new version of the sources, but it has to be a stable version and can't include the incomplete feature you're currently working on.
Alt + Tab is how we do it.
Task switching is a thing of the brain. I don't think there is a tool to do that for you. If there is, I am also interested.
Each person has their own way of preparing: some don't prepare at all and are into the next thing in a snap, some take more time, etc. It depends on the person.
Sure, you can try to create some mental milestones (taking a note, placing a reminder, etc.) to return to when getting back to the task, but this again depends on other factors (how long the task switch was, how quiet the office is, familiarity with the task, moon phases, etc.).
The most efficient way for a developer to switch between tasks is, I think, subjective.
Meanwhile, have you read Human Task Switches Considered Harmful by Joel Spolsky?
I would say the steps you need to take in the scenario you describe are 100% dependent on the development environment and tools you have set up.
Using Perforce for source code version control, we have set up a branching system where the releases are separate from development work and all development branches stem from a single "acceptance" branch. Each branch is used for a single issue, or for a set of very closely related issues. No other issues can be worked on in a branch until the changes have been integrated up to the acceptance branch.
Yes, it does mean we have a lot of branches. Yes, we do a lot of syncing (acceptance down to a work branch) and integrating (work branch up to acceptance). But its worth is incalculable when it comes to easily switching from one task to another, going back to a test build, spotting two issues biting each other, etc.
After development has done its thing (including their own tests), an issue is tested by the QA team - first in isolation, in its own branch. After that it is integrated into the acceptance branch and a regression test is done to find any problems with independent issues biting each other. When all issues for a release have thus been integrated into acceptance, a full regression and new-functionality test is executed by the QA team.
So, the acceptance branch is always the "latest" state of development for the app.
In this set up the scenario you describe would play out as:
Leave my current task as it is, possibly checking in any outstanding changes so as not to lose them if my computer crashes. If that means breaking a daily build of that branch, I wouldn't check in, unless it is easy to fix the compile errors. (Please note that we have many apps in our application suite, and while my changes may compile in the app I am working on, they may still break the compilation of other apps in our suite.) Our rule is: each submit may break functionality, but must not break the build process.
Find an "empty" branch - a branch that is not currently being used for any development work, or, if all branches are taken, create a new one.
Force sync the acceptance branch and the selected work branch so my machine is guaranteed to have the latest state for both branches.
Synchronize (forced if necessary) the latest state of the acceptance branch to the work branch, so the selected work branch is the same as the acceptance branch.
Open up that branch's application suite in the IDE, debug and solve. Submit to the work branch.
Tell QA to have a look at it in the work branch. If they are satisfied with it, integrate the changes up to acceptance so they can continue their test.
Switch the IDE back to work on the application suite in the branch I was working on before.
Rinse and repeat.
Considering your scenario,
you could check out the stable version of sources in another working copy, correct the bug, commit.
When you come back to your incomplete work, do an update and continue to work.
When you're working on something you usually have a few ideas, a few things you're planning to do, and some stuff that is not clear and has to be solved later. It tends to get lost when you switch to another task.
I found it useful to write them down somewhere - take a brain snapshot. Later it's easier to restore it and get back faster to your original task.
I make a note of every file I'm working on inside of a Task/Todo item with a reminder in approx. the amount of time I will be away from it. Then I save and close each of those files to prevent them from distracting me/eliminate the clutter/create room for the new task on my desktop. I have the memory of a flea, so I need all the help I can get.

Are code freezes still relevant when using a continuous integration build setup?

I've used a Continuous Integration server in the past with great success, and hadn't had the need to ever perform a code freeze on the source control system.
However, lately it seems that everywhere I look, most shops are using the concept of code freezes when preparing for a release, or even a new test version of their product. This idea runs even in my current project.
When you check-in early and often, and use unit tests, integration tests, acceptance tests, etc., are code freezes still needed?
Continuous integration is a "build", but it's part of the programming portion of the development cycle, just like the "tests" in TDD are part of the programming portion of the development cycle.
There will still be builds and testing as part of the overall development cycle.
The point of continuous integration and tests is to shorten the feedback loops and give programmers more visibility. Ultimately, this does mean fewer problems in testing and builds, but it doesn't mean you no longer do the original parts of your development cycle - they are just more effective and can be raised to a higher level, since more trivial problems are being caught earlier in the development cycle.
So you will still have to have a code freeze (or at least a branch) in order to ensure the baseline for what you are shipping is as expected. Just because someone can implement something with a high degree of confidence does not mean it goes into your release without passing through the same final cycles, and the code freeze is an important part of that.
With CI, your code freezes can be very short, since your final build, testing and release may be very reliable, and code freeze may not even exist on small projects, since there is no need for a branch - you release and go right back into development on the next set of features very quickly.
I'd also like to add that CI and TDD allow the final build and testing phase to revert back closer to the traditional waterfall (do all dev, do all testing, then release), as opposed to the more continual QA which has been done on projects with weekly or monthly builds. Your testers can use the CI builds to give early feedback, but it's effectively a different sort of feedback than in the final testing, where you are looking for stability and reliability as opposed to functionality (which obviously was missed in the unit "tests" which the developers had built).
Code freezes are important, because continuous integration does not replace runtime regression testing.
Having an application build and pass unit testing is only a small part of the challenge. Ideally, when you freeze code for a release, you are signing off on two things:
This code has been fully regression-tested and is defect-free.
This code is EXACTLY the code that should be in production (for SOX compliance).
If you're using a modern SCM, just fork the code at that point and start work on the next release in a branch, and do a merge when the project is deployed. (Of course, place a label so you can roll back to that point if you need to apply a breaking patch.)
Once code is in "release mode", it should not be touched.
Our typical process:
Development
    ||
    \/
   QAT
    ||
    \/
   UAT => Freeze until deploy date => Deploy => Merge and repeat
     \                                           /
      \------- New Branch for future dev ------/
Of course, we usually have many parallel branches during development, that merge back up into the release stream before UAT.
The code freeze has more to do with QA than it has to do with Dev. The code freeze is the point where QA has said: "Enough. We only have bandwidth to fully test the new features added in so far." That doesn't mean dev doesn't have the bandwidth to add more features, it's just that QA needs time with a quiescent code base to ensure that everything works together.
If you're all in continuous integration mode (QA included) this could be just a freeze of a very short time while QA puts the final seal of approval on the whole package just before it goes out the door.
It all depends on how tightly your QA and regression testing are integrated into the dev cycle.
I'd second the votes already mentioned about SCM branching and allowing dev to continue on a different code branch than what QA is testing. It all goes back to the same thing. QA and regression testing need a quiescent code base for a period of time prior to release.
I think that code freezes are important because each new feature is a potential new source of bugs. Sure, regression tests are great and help address this issue. But code freezes allow the developers to focus on fixing currently outstanding bugs and getting the current feature set into a release-worthy state.
At best, if I really wanted to develop new code during a code freeze, I would fork the frozen tree, do my work there, then after the freeze, merge the forked tree back in.
I'm going to sound like one of the context-driven people but the answer is "it depends".
Code Freeze is a strategy to cope with a problem. If you don't have the problem it is good at addressing, then no, it isn't needed. If you have another technique for addressing the problem, then no, it isn't needed.
Code Freeze is one technique for reducing risk. The advantages it brings are stability and simplicity. The disadvantage it brings is that other development is held up for the duration of the freeze.
Another technique is to use Branching, such as with "Feature Branches". The disadvantage of Branching is the cost of dealing with the branches and of merging changes.
The technique you're describing for reducing risk is Automated Testing to give fast feedback. The trade-off here is increased velocity for some increased risk (you will miss some bugs).
Of these approaches, I'm with you: I prefer Automated Testing. But there are some situations, such as a very high cost of failure, where a Code Freeze does provide a lot of value.
