Continuous integration and large architectural changes. How to handle them?

I was reading this answer while trying to understand how to work with multiple developers on multiple branches of a project. My first reflex was to have Jenkins run a separate build for each branch, but as I understand it, that's a bad way to approach the problem.
Now, I see how having very small features or parts of features which get merged back into the main branch often is the preferred way to go, but I can't quite wrap my head around what happens when a project goes through a very large architectural change.
Say I have a web project written in AngularJS, and the team decides that for the future of the project it needs to be moved over to ReactJS instead. The current project would have a reasonable number of features already implemented and tested. At this point, I can't imagine any smaller increment for the new "feature" of using ReactJS other than having it be on par with the current state of the project, meaning every test currently passing should still pass once it's done. Anything else would mean a regression for the project, and I know of very few clients who would be OK with that. However, that's hardly going to be the case until the switch is almost 100% done, which will not be a small amount of work.
I might not understand the concept perfectly, but I don't see feature toggles working here (especially if the move to ReactJS requires us to modify, say, the Gruntfile, since that will inevitably break a lot of things). Does the team doing the migration simply need to tell the rest of the team not to touch the project until they say it's OK? That seems like a weird solution to me.
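As I understand it, a toggle in this context would be less a boolean in the JavaScript and more a server-side switch over which front-end gets served. Here is a minimal, hedged sketch (all names invented, and it presumes the migration can be cut up page by page, which is exactly what's in doubt here):

```csharp
using System;

// Hypothetical sketch: a server-side toggle deciding which front-end bundle
// to serve, so a ReactJS rewrite could ship dark, one page at a time.
public class FrontendSelector
{
    private readonly Func<string, bool> isPageMigrated; // e.g. backed by config

    public FrontendSelector(Func<string, bool> isPageMigrated)
    {
        this.isPageMigrated = isPageMigrated;
    }

    public string BundleFor(string page) =>
        isPageMigrated(page)
            ? "/dist/react/" + page + ".js"    // pages already rewritten
            : "/dist/angular/" + page + ".js"; // everything else stays on AngularJS
}
```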
So I'll admit, I'm at a loss as to what the proper workflow would be here. Any input would be appreciated as our development process is something I constantly try and make better, even with my limited experience in the field.

Related

Good processes for debugging production environment? Copying data to Dev?

I've been thinking about this a bit. The idea is that something goes wrong in PROD: the data that was captured causes the web app to behave differently than in other environments. Data in the other environments inevitably gets out of sync with PROD (as expected). Then a bug comes along that, for some reason, only happens in PROD, probably because of the differences in data.
I'm wondering what a good practice is to remedy these kinds of problems. More tests, for sure, but beyond that? One could create new data in dev, but the whole point is that some data point, or some combination of actions, causes a data point to be wrong, perhaps when some other data source is used to arrive at the "actual" data point, which differs from the "expected" one. Apologies that this isn't a great description; it tries to be both an example and a definition of a general production bug.
I know this isn't a very precise question. Hopefully, there are references that make good suggestions.
This is a very interesting question. One approach I've used before is to deliberately do my final testing in production (TIP).
Before you skewer my effigy with multiple pointy needles, hear me out for a minute while I talk about continuous deployment :-)
The idea is to deploy a new build into production and then use custom routing to direct traffic between the old and new production builds. In principle this is quite simple: you start by routing all of your current customers to the old build and only your engineering team to the new build. Your customers don't see any change, but your team can start testing the new build, including messy stuff like disaster recovery and stress testing. You will hopefully discover the type of bugs that you talk about in your question.
If there's a problem, you simply roll back the new build. If your tests don't find any problems, you roll out to, say, 5% of your client base, then 10%, 20%, and so on.
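To make the routing step concrete, here is a minimal sketch; all names are invented, and in practice this usually lives in the load balancer or reverse proxy rather than in application code:

```csharp
using System;

public static class CanaryRouter
{
    // Deterministic routing: the same user always lands on the same build,
    // so their experience doesn't flip between requests.
    public static string ChooseBuild(string userId, int newBuildPercent,
                                     bool isEngineeringTeam)
    {
        if (isEngineeringTeam)
            return "new"; // the engineering team always sees the new build

        // Bucket the user into 0..99. In production you'd use a hash that is
        // stable across processes (string.GetHashCode is not guaranteed to be).
        int bucket = Math.Abs(userId.GetHashCode() % 100);
        return bucket < newBuildPercent ? "new" : "old";
    }
}
```

Rolling out is then just raising newBuildPercent from 0 to 5, 10, 20 and so on; rolling back is setting it to zero again.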
Whilst simple in principle, there are some issues that you need to plan for from the very start. The first is data and data schemas, which need to function correctly across both old and new builds. As long as the services used by your web app are designed to handle at least one rollback after a new build is deployed, and your new build understands both the old and new data, then you should be okay.
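A hedged sketch of that data-compatibility idea (all names invented): new fields are optional, the old build can safely ignore them, and the new build falls back when they're absent:

```csharp
// Written by both builds: the old build only knows Name; the new build
// prefers DisplayName but tolerates rows that don't have it yet.
public class CustomerRecord
{
    public string Name { get; set; }        // field both builds understand
    public string DisplayName { get; set; } // new optional field; old build ignores it

    // The new build reads old and new rows alike, so a rollback stays safe.
    public string EffectiveName =>
        !string.IsNullOrEmpty(DisplayName) ? DisplayName : Name;
}
```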
The second issue is API/interface changes. Rather than editing or deleting methods or parameters, you need to create a new API/interface that mostly re-directs to the old API/interface, except for the new/changed code.
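For illustration, that might look like the following sketch (class and method names invented): the new version overrides only what changed, and everything else still redirects to the old, well-tested code path:

```csharp
public class OrderServiceV1
{
    public virtual decimal GetTotal(int orderId)
    {
        // old behaviour, unchanged
        return 0m; // placeholder
    }

    public virtual void Cancel(int orderId)
    {
        // unchanged operation, still served by the old implementation
    }
}

public class OrderServiceV2 : OrderServiceV1
{
    // Only the changed operation is overridden.
    public override decimal GetTotal(int orderId)
    {
        decimal oldTotal = base.GetTotal(orderId);
        // new or changed behaviour is layered on here
        return oldTotal;
    }
}
```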
Other issues include incompatible changes to configuration and settings between builds. These issues aren't fatal, but you do need to do some planning and testing beforehand. And the big reward is that you can safely do final testing of your code in production without affecting your customers.
Some links on testing in production:
There's no place like production
The future of software testing
Production is a mixed blessing
TIP - malpractice?
TIP really happens
Why TIP isn't as stupid as it sounds

How do I use Greenhopper to manage developers across multiple projects?

We are currently using Jira 5.1.6 with GreenHopper 6.0.5. We have a lot of projects, probably about a dozen total, but only a few are actively worked on at a time, with the rest there for occasional bugfixes or other tasks. The 4–5 developers in our company are likely to be working on a couple of projects at once (some working on just one, some doing maintenance on several, and who's working on what varies somewhat with business priorities).
So, GreenHopper seems set up from a very project-centric view. I can set up a Rapid Scrum Board for a project, and make Sprints within it of work to do for that project. This can give the business a good view of work into that project. Potentially, one can also make a Board for all of the projects (since GreenHopper 6 added that), and make a kind of "global sprint" across everything. If we were to have this kind of global sprint, all of the project owners would need to work at once on figuring out what should get done over the next couple weeks, which might be workable, but seems a bit tricky and would require a lot of coordination.
What I think we want is some kind of "resource view": project owners could set up their tasks in their sprints, but there would be some sort of view for each developer showing which task they should be working on next, no matter which project's sprint it's in, plus some way for our manager to allocate our time across the projects. So I might be scheduled to work, for example, 20 hours a week on project A, 10 on project B, and 10 on maintenance of other projects; project owners making sprints could see how much time they had been allocated, and I as a developer would see some kind of unified view of my upcoming tasks, so that I would know what I should be working on next and what's coming soon. I don't know if that description is exactly what we want, but I think we want something along those lines, and it seems like we can't be the only place that wants a resource-based view as well as a project-based view.
The thoughts I've had of how we might approach this from my exploration of GreenHopper so far are:
Create those "global" sprints I mentioned, and work as a department at the beginning of each sprint to try to schedule what we'll all be doing. Projects can get a look at their particular piece of the sprint using a Quick Filter or somesuch, and we just have to deal with coordinating those sprints.
Use the "Parallel Sprints" feature on an all-projects Board, and have each developer create their own sprints of the tasks they have coming up. This helps with getting a resource-based view, but is probably tough for projects to figure out status of things, and definitely feels like squeezing GreenHopper into a space that it really doesn't want to go.
Create a board for each project of the things to be coming up for each project, so each project gets its own Sprints and we get the project-based view of things, and just have each developer track themselves which projects' sprints they should be getting tasks from. Basically, just GreenHopper isn't the tool for a resource-based view, so don't even bother, and trust our developers and our manager to look across all these projects for what tasks to work from rather than trying to do it all in one place.
None of those seem great, though I'm sure we could make do with any of them. But I keep coming back to the feeling that we're not doing anything bizarre or unique to us, and since Jira/GreenHopper is an industry-standard agile tool, we'd have thought it would be easier to use for what we're trying to do. Are we doing something crazy? I think we'd be fine with changing our process to use standard practices if there's a standard way of doing Agile across multiple projects out there. Is there some GreenHopper setting or report somewhere I've missed? Is there some other Jira plugin that we should be using instead of, or in addition to, GreenHopper? Do other teams out there use one of the above approaches and can give advice on whether or not it's a good idea?
Any help would be appreciated. Thank you.
"... seems a bit tricky and would require a lot of coordination." Yup, sounds like project management to me.
I'd create boards for each product that gets released on its own schedule. I'd also create a query to show each user the issues assigned to them sorted by Sprint so they can see what work is on their plate. The issues will be across multiple boards and sprints.
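(For reference, that per-user query can be as simple as a JQL filter along the lines of "assignee = currentUser() AND resolution = Unresolved ORDER BY sprint ASC, priority DESC"; treat the exact field names as an assumption, since the sprint field's behaviour depends on your GreenHopper version.)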
I do wish that GH helped with resource allocation more, including totaling up the time allocated in the filter in the previous paragraph. At the moment I end up exporting the results of the filter to Excel and using that to sum up totals by resource.
I asked this question in perhaps the more appropriate place, on the Atlassian forum:
https://answers.atlassian.com/questions/99020/how-do-i-use-greenhopper-to-manage-developers-across-multiple-projects
And I think the answer there from them was very good. I created a board for each project, limited to its project and used for creating that project's sprints, and the developers use an "All Projects" board to see all of the sprints that they're a part of.
Doesn't handle resource allocation wonderfully, as mdoar states, but it does seem to be using the tool the best way that it can be for this for now.

Tool for Multiple Code Deployments

Sorry if a similar question has been posed before. There are a lot of deployment questions but none seemed to address my problem.
Anyway. I'm working with asp.net, C# and using Visual Studio.
The organization I'm working in is changing rapidly. There are a lot of projects in the pipeline that will require multiple code changes and iterative deployments over the next few months. While working, these changes are always 'at the forefront', so sometimes I have to code certain parts of the same program multiple times.
Since these projects are all staggered, I can't just make one sweeping change all at once; I have to deploy and redeploy the same program multiple times, using only the changes that are required for that deployment.
If this is confusing, here's a simple example:
Application is being used on an Intranet. This application calls our Database, using Driver A.
There are two environments, test and production.
Certain Stored procedures have to be called with parameters that register 'Test' to allow certain other applications to run even with bad data (for testing purposes).
When deploying applications, these stored procedures have to be modified to remove the Test parameters.
We have an operating system upgrade that allows us to move to a much faster Driver B, but it requires code changes to use Driver B.
So that's two wholly different deployments where some code must be changed for Deployment 1 and other code must be changed for Deployment 2.
Currently I'm just using Notepad for an overall change list, regular debugging breakpoints, and a multitude of in-code comments, and then I manually slog through the code to make sure that everything is changed. With hundreds of thousands of lines of code over multiple files, classes, objects, etc., this gets pretty tedious, and there's a good chance of missing something (causing things to break) or pushing the wrong changes (causing things to either break or allow bad data).
Is there a tool that could help in this situation? Preferably one that lets me discern what needs to change for Deployment A and what needs to change for Deployment B. I'm also open to hearing other schools of thought (tips are definitely accepted!).
Sure, I understand your problem.
I would suggest a couple of things.
Installers: Have you considered installers? There are plenty of them, e.g. InstallShield, WiX, and plain MSI.
These installers give you the flexibility to update exactly the files you need to update, i.e. full control.
You'll need to choose the one that suits you best; I have worked with MSI and WiX a lot, so I know they can sort out your problem, but it's your call.
Publish: I haven't played around with this much beyond publishing websites, but I know it can do wonders, so give it a try as well.
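Beyond packaging, here is a hedged sketch of a complementary approach (all class, setting, and procedure names are invented, and this swaps in a configuration-switch technique rather than anything the answer above prescribes): keep both deployments' differences in one codebase behind configuration, so each deployment flips settings rather than hand-edited code:

```csharp
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

public class DataAccess
{
    // Deployment differences live in app.config/web.config, not in hand edits.
    private static bool UseDriverB =>
        ConfigurationManager.AppSettings["UseDriverB"] == "true";
    private static bool IsTestEnvironment =>
        ConfigurationManager.AppSettings["Environment"] == "Test";

    public IDbConnection CreateConnection(string connectionString)
    {
        // Deployment 2 concern: switching drivers is a config flip, not a code edit.
        return UseDriverB
            ? CreateDriverBConnection(connectionString) // hypothetical Driver B path
            : new SqlConnection(connectionString);      // existing Driver A path
    }

    public void RunProcedure(SqlConnection connection)
    {
        using (var command = new SqlCommand("dbo.SomeProcedure", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            // Deployment 1 concern: the 'Test' parameter is added only in the
            // test environment, so nothing has to be stripped out before go-live.
            if (IsTestEnvironment)
                command.Parameters.AddWithValue("@Mode", "Test");

            command.ExecuteNonQuery();
        }
    }

    private IDbConnection CreateDriverBConnection(string cs) =>
        new SqlConnection(cs); // placeholder: substitute the real Driver B type here
}
```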

Please settle a check out and lock vs update and merge version control debate

I've used source control for a few years (if you count the SourceSafe years), but am by no means an expert. We're currently using an older version of Sourcegear Vault. Our team currently uses a check-out-and-lock model. I would rather switch to an update-and-merge model, but I need to convince the other developers.
The reason the developers (not me) set things up as check out and lock was renegade files. Our company works with a consulting firm to do much of our development work. Some years ago, long before my time here, they had the source control set up for update and merge. The consultants went to check in but encountered a merge error, so they chose to work in a disconnected mode for months. When it was finally time to test the project, bugs galore appeared, and it was discovered that the code bases were dramatically different. Weeks of work ended up having to be redone. So they went to check out and lock as the solution.
I don't like check out and lock, because it makes it very difficult for 2 or more people to work in the same project at the same time. Whenever you add a new file of any type or change a file's name, source control checks out the .csproj file. That prevents any other developers from adding/renaming files.
I considered making just the .csproj file mergeable, but the Sourcegear site says this is a bad idea, because .csproj files are auto-generated by the IDE and you cannot guarantee that two different VS-generated files will produce the same code.
My friend (the other developer) tells me that the solution is to immediately check in your project. To me, the problem with this is that I may have a local copy that won't build and it could take time to get a build. It could be hours before I get the build working, which means that during that time, no one else would be able to create and rename files.
I counter that the correct solution is to switch to a mergeable model. My answer to the "renegade files" issue is that it was a matter of poor programmer discipline, and you shouldn't use a more restrictive tool as a fix for poor discipline; instead, you should take action to fix the lack of discipline itself.
So who's right? Is check out and lock a legitimate answer to the renegade-file issue? Or is the .csproj issue far too big a hassle for multiple developers? Or is Sourcegear wrong, and it should be OK to set the .csproj file to update and merge?
The problem with update and merge that you ran into was rooted in a lack of communication between your group and the consulting group about what the problem was, not necessarily in the version control method itself. Ideally, the communication problem would need to be resolved first.
I think your technical analysis of the differences between the two version control methodologies is sound, and I agree that update/merge is better. But I think the real problem is in the communication to the people in your group(s), and how that becomes apparent in the use of version control, and whether the people in the groups are onboard/comfortable with the version control process you've selected. Note that as I say this, my own group at work is struggling through the exact same thing, only with Agile/SCRUM instead of VC. It's painful, it's annoying, it's frustrating, but the trick (I think) is in identifying the root problem and fixing it.
I think the solution here is in making sure that (whatever VC method is chosen) is communicated well to everyone, and that's the complicated part - you have to get not just your team on board with a particular VC technique, but also the consulting team. If someone on the consulting team isn't sure of how to perform a merge operation, well, try to train them. The key is to keep the communication open and clear so that problems can be resolved when they appear.
Use a proper source control system (svn, mercurial, git, ...)
If you are going to do a lot of branching, don't use anything older than svn 1.6. I'm guessing Mercurial/Git would be an even better solution, but I don't have much hands-on experience with those yet.
If people are constantly working on the same parts of the system, reconsider the system design; it indicates that each unit has too much responsibility.
Never, ever accept people working offline for more than a day or so. Exceptions to this rule should be extremely rare.
Talk to each other. Let the other developers know what you are working on.
Personally I would avoid having project files in my repository. But then again, I would never ever lock developers to one tool. Instead I would use a build system that generated project files/makefiles/whatever (CMake is my flavor for doing this).
EDIT: I think locking files is fixing the symptoms, not the disease. You will end up having developers doing nothing if this becomes a habit.
I have worked on successful projects with teams of 40+ developers using the update-and-merge model. The thing that makes this method work is frequent merges: the independent workers are continuously updating (merging down) changes from the repository, and everyone is frequently merging up their changes (as soon as they pass basic tests).
Merging frequently tends to mean that each merge is small, which helps a lot. Testing frequently, both on individual codebases and nightly checkouts from the repository, helps hugely.
We are using Subversion with no check-in/check-out restrictions on any files in a highly parallel environment. I agree that the renegade-files issue is a matter of discipline. Locking instead of merging doesn't solve the underlying problem: what's preventing a developer from copying their own "fixed" copy of code over other people's updates?
Merge is a pita, but that can be minimized by checking in and updating your local copy early and often. I agree with you regarding breaking check-ins; they are to be avoided. Updating your local copy with checked-in changes, on the other hand, forces you to merge your changes in properly, so that when you finally check in, things go smoothly.
With regards to .csproj files: they are just text, and they are indeed mergeable if you spend the time to figure out how the file is structured; there are internal references that need to be maintained.
I don't believe any files that are required to build a project should be excluded from version control. How can you reliably rebuild or trace changes if portions of the project aren't recorded?
I am the development manager of a small company, only 3 programmers.
The projects we work on sometimes take weeks, and we employ the big-bang, shock-and-awe implementation style. This means we have lots of database changes and program changes that have to work perfectly on the night we implement. We check out a program, change it, and set it aside, because implementing it before everything else would make 20 other things blow up. I am for check out and lock. Otherwise, another person might change a few things without realizing that the program has already had massive changes. And merging only helps if you haven't made database changes or changes to other systems not under source control (Microsoft CRM, or basically any packaged software that is extensible through configuration).
IMO, project files such as .csproj should not be part of the versioning system, since they aren't really source.
They also almost certainly are not mergeable.

Using a Continuous Integration Server for Home Development

As a follow-up to one of my previous posts, 'Using Version Control for Home Development', I am now asking for opinions on using a build server for a pet project.
Lately I have been reading about this 'build server' concept, and I have looked at applications such as Maven and CruiseControl.Net.
And thus I ask, how feasible is it to use something like CruiseControl.Net for my home pet projects?
The reason I ask is that I think these build servers are mainly aimed at team projects... but then again, I'm still very new to this automated build process.
Keep in mind that most of the time, these pet projects are only handled by one man, not a team.
So should I look more into this concept for the sake of using it at home, or should I just get some practice with it for experience's sake?
[EDIT]
Although I thank you all for your answers regarding alternatives to CC.Net and such, no one has yet really tackled the question of whether or not it is feasible to implement a build system for home development.
It is completely feasible to implement a build server for your home projects. I've implemented CC.Net for my home projects myself and it is pretty easy to do so, even for the first time. I would say the learning curve (depending on your experience) is less than a day to get your first project up and building, though there is always the longer tail on that curve as you dig into some of the more interesting details.
The question to me is more one of motivation for continuous integration on these projects. If you are using "home project" as synonymous with "throw-away project", there probably isn't much point in going to the trouble of CI, unless you are using it specifically as a CI learning exercise.
However, assuming these are not throw-away projects you are talking about, I've found (in addition to the more obvious benefits of automation) that implementing CI helps reduce the overhead involved in coming back to a project you've walked away from for some period of time. Of course, unit tests are the most valuable asset in this regard, but the combination of unit tests with an automated build/deployment process really allows you to focus on the new and changed requirements when you come back to a project after having set it down for a while.
Additionally, as mghie points out in the comments to this answer, "CI will give even greater benefit for home projects if they build upon each other, so changes in one project could cause the build to break in others."
My advice, just do it once so you have a clearer picture of what is involved and the benefits you might reap and drawbacks you might incur. Then make the decision for yourself as to whether or not it is worth continuing to do. Like I said, the learning curve is reasonably low so the investment you will have to make in just giving it a try shouldn't be the reason not to.
Nutshell: Feasible - Yes, Desirable for home projects - Quite Possibly, Worth further investigation - Definitely, Investment - Relatively low
As an alternative to CC.Net, I recommend you take a look at TeamCity; it's really easy to set up and get running.
Related question:
Best Continuous Integration Setup for a solo developer (.NET)
I installed CC.Net months ago. It took me a whole night to configure it and create the XML configurations, and I have no regrets about it: it integrates smoothly with SVN, NUnit, and NAnt or MSBuild. You should try it, if only to gain experience.
Take a look at Hudson; it's very easy. You just need to deploy it in Tomcat or whatever servlet container you use. Once it's up, every configuration can be done through the browser. Hudson supports Maven, Ant, etc. and all the major SCMs. I have been using Hudson for the past year and haven't faced any trouble.
CC.NET is very feasible; in fact, given the zero cost and the wide range of supported actions, not to mention that you can get the source code and modify it to your needs, I could not imagine anything better. I've read the other complaints about how difficult it is to set up, but to be honest I had my first simple TFS/VS2005 project up within an hour. Just remember, if you run into any issues or snags, CC.NET has a pretty active Google group of users and devs who are willing to help you through any gotchas.
I love CC.NET and I'm a big fan of CI, but I have to ask: with only one developer on the project, what integration scenarios exist? Wouldn't you just build the entire project in Visual Studio, negating the need for CI?
I would agree that CC.NET is a great option for local/home development. I wanted to add that it does not require an SCM tool in order to work: there is a file-system watcher plugin that will just monitor for a file change, so you don't need a check-in for it to execute. Also, you don't have to wait for the CI cycle to complete; it's much like having F6 run every time you save, except the IDE doesn't use all its resources and you can keep coding away. If the build breaks, you can choose to investigate or just ignore it. There is no one way to do CI.
If you do create unit tests, having them constantly execute against your code on every save certainly has some advantages in catching problems early. Using CCTray allows you to see the results without being intruded upon. My 2 cents.
Finally, setting it up the first time can be a little tough, but you can tweak a Visual Studio C# template (or whatever you desire) to automatically configure your CI setup with the least amount of information required from the user.
