Auto VCS tagging on TeamCity build - Limitations?

There have been concerns that automatically tagging builds with build systems (TeamCity/CruiseControl) will create so many tags that Perforce will be bogged down.
The only references I've managed to find have said "unless you have numbers, don't worry about it." I'd rather worry now before polluting a 100+G repository.
Does anyone have systems doing 1000+ builds a month that have seen anything like this?

You could consider using automatic labels, which simply contain a view (the parts of the depot you are identifying) and a revision number (usually a changelist). Automatic labels put very little metadata in the database.
If you use static labels, you should periodically archive old labels to keep the size of the database under control.
You can find more info on these subjects at the Perforce Knowledge Base.
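As a rough illustration of the automatic label idea, here is a minimal Python sketch that builds the text of a label form whose Revision field pins the label to a changelist. The label name, changelist number, depot path, and owner are all made up, and the field layout follows my understanding of the Perforce label form, so check it against "p4 help label" before feeding anything like this to "p4 label -i".

```python
def automatic_label_spec(name, changelist, view_paths,
                         owner="build", description="Created by CI"):
    """Return label form text; the Revision field is what makes it automatic.

    All values passed in are hypothetical examples, not real depot data.
    """
    lines = [
        f"Label:\t{name}",
        f"Owner:\t{owner}",
        "Description:",
        f"\t{description}",
        # An automatic label stores only this revision specifier and the view,
        # not one record per file revision.
        f"Revision:\t@{changelist}",
        "View:",
    ]
    lines.extend(f"\t{path}" for path in view_paths)
    return "\n".join(lines) + "\n"


spec = automatic_label_spec("build-1042", 58213, ["//depot/product/..."])
print(spec)
```

A build step could generate one such form per build, so the label costs a small form rather than a record for every file in the view.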

Related

Fork to create similar but different product

We need to fork our 18 project, 1,000+ file VS2015 C# solution so we can build a similar but different product. Both products will continue to be actively developed.
Changes might be product-specific (of course, otherwise we'd only need one product) or product-agnostic (e.g. payment processing).
I'd like improvements made in either product to be available to, but not automatically part of, the other.
I know this could be achieved by manually selecting and merging changesets into the other product, but is there a better way? For example, is there a way developers can mark their check-ins as applicable to both products? Are there any tools I can use to help with this?
Forking and branching are not the same thing. Branching means making a copy for a new feature or experiment that will later be merged back into master. Forking means making a copy for an independent project.
In TFS, however, branching (from Visual Studio or the command line) is really the only way to implement a fork. For detailed steps, refer to the official MSDN page: Branch folders and files
Note: You can only branch the source code, not a project's other artifacts (work items, queries, reports). Those must be copied into the new project.

Reintegrate a branch back to the trunk when sweeping changes have been made to the tree structure

A brief note before I start: there is a lot of explanation required to "set the stage", and it may seem like this is more of a design question than a question about a programming problem. The question is actually about SVN branching and merging, so please read to the end.
Scenario:
I have a large Visual Studio solution with quite a few projects. I'm using SVN, so of course the trunk has my production line of development. This consists of a core DLL assembly, a "main" UI user client, and a handful of "plugin" assemblies that operate by implementing interfaces on the core assembly in order to provide functionality within the UI, and also by utilizing a set of service methods which provide common functionality to all of the plugins (such as persistence logic operations, storage operations for a centralized file store architecture, etc.)
There are also external utilities that I have built over time which must duplicate a lot of the business logic in the plugins. I won't go into much detail because it will ultimately distract from my main question, but just picture, for example, a scheduled service on a server that handles centralized maintenance operations related to a particular plugin's data.
When I initially built this application, I (stupidly) didn't anticipate the need for centralized service tiers, so I architected the core assembly (for better or worse), as shown above, to be tightly integrated with the presentation layer of the application. In other words, the UI presentation logic needed to integrate the plugins with the user interface and the business logic needed by the plugins to perform common plugin logic operations is all part of the one "core" assembly. Therefore, much of the "shared" logic that exists between the plugins and the centralized services has resulted in duplicated code.
I decided to undertake the major refactoring initiative to pull out the common logic -- that which is not related to the presentation -- into a "shared" assembly. For this, I created a branch off the trunk. I reorganized common code into a "shared" assembly, and I re-pointed everything in the client application (plugins, etc.) and the external service applications to utilize the shared assembly. In many cases, I also had to rename classes in order to fit their more-general purpose going forward. The core assembly remained in place only to broker presentation-layer responsibilities between the plugins and the UI.
Problem:
Now that I have successfully completed the refactoring, I want to reintegrate the branch back into the trunk. Merging is tricky business even in simple cases, but what I'm facing here is a lot of tree conflicts to put it mildly. Also, in addition to residing in an entirely new project, the folder structure in the "shared" project is quite a bit different from what it was in the "core" project. Classes are, in many cases, located in different places due to the new mechanisms for using the shared assembly.
I want to maintain the version history of every class from its old home in the core assembly to its new home in the shared assembly. Furthermore, I want to guarantee that the merge is successful. That seems obvious, but in testing a miniature version of this whole scenario, I was never able to get the conflicts to resolve in such a way where my branch features remained entirely intact. Furthermore, the fact that I have renamed some of the classes, as I stated earlier, to suit their more-general roles, makes it very tricky to maintain the version history.
I will note that I am using AnkhSVN which helps in "normal" cases when you rename files to repair the moves, but it doesn't seem to work in these major tree-conflict cases. Also, I know there is a difference in how merges work between different versions of SVN -- I believe it's pre-SVN 1.5 and post-SVN 1.5. I'm using SVN 1.9.3.
I have been trying to figure this out for a few weeks now. I've been poring over the SVN book, TortoiseSVN resources like this, and anything I could find from Google searches, like this, this, and this, among many, many others. I feel like I'm going crazy, and I think advanced SVN (and Tortoise) is impossible to learn with the traditional teach-yourself, learn-from-the-web-and-books approach. At any rate, I would greatly appreciate any insight that is out there.
What is the proper methodology when you create a feature branch using SVN and plan on making major tree changes and "moves" (i.e. renames) so that you can reintegrate those changes with the trunk without losing anything?
Congratulations on stepping on the most "popular" rake in SVN: "Merge Hell after refactoring"!
There are (at least) two simple rules for your case, born of bitter experience:
1. Never perform refactoring in SVN.
2. If you ignore rule 1: in the name of all that is holy and good in the world, don't touch ANYTHING in trunk while refactoring in the branch.
If you reject these righteous covenants, you still have ways to salvation.
Pure SVN way, long and dirty
Merge each and every subtree that is a source of a tree conflict separately, determining every source and target by hand, like
svn merge NEW_PATH/NEW_NAME old_path/old_name
and finish this bloody work with a full merge.
Tricky Mercurial way (or Git way, but I just hate Git)
Preface: such merges aren't a problem at all for modern DVCSes, and they have "bridges" to SVN repositories. Thus you can delegate the merge to an external VCS of your choice and bring the results back (with some limitations and warnings).
I'm too lazy to cover all DVCSes and will explain only Mercurial (considering that, with an SVN background, it will be the least painful migration).
With HGSubversion, Mercurial can read (pull) from and write (push) to Subversion repositories, but it can't push the results of its own merges to Subversion. Thus it will be a multi-stage operation that substitutes the Subversion WC along the way.
A brief synopsis
1. Install Mercurial (TortoiseHG) and the HGSubversion extension.
2. Clone the whole SVN repository to Mercurial in some temporary location (not the current Subversion WC).
3. Merge the branch into the mainline (SVN's trunk becomes the default branch) and resolve any content conflicts (not tree conflicts).
4. Test (?) the results.
5. Replace the Subversion working copy (the WC of trunk, obviously) entirely with the content of the Mercurial working directory (beware of the .svn and .hg folders respectively).
6. Commit the WC to trunk.
7. For beauty and compliance with all the rules, "cheat" the mergeinfo data of trunk (the revision committed in step 6 must later be known as a merge, although formally that is not true).
HTH
PS: migration to Mercurial with HGVS doesn't seem a totally crazy idea for now

What is the "git stash" equivalent for Serena Dimensions?

I have made some changes. I cannot use those changes now. I need to discard them for now and go back to them later when the star alignment is more favorable (e.g. when our Cobol guy has enough time to get to his half of the work).
Short of using Eclipse → Synchronize with team and manually copy pasting the contents to a scratch directory so I can do the merging later, is there any way to "stash" changes for later?
There is no git stash equivalent in Serena Dimensions. The poor man's way is to store your changes temporarily in a different folder, or in a file with a different name, without adding it to the source-controlled solution, and to switch back and forth as needed.
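The "poor man's way" can be scripted so the copying is at least repeatable. Below is a minimal Python sketch, assuming you already know which files you changed and pass them in as paths relative to the workspace; it does not talk to Dimensions at all, and the function names are made up for illustration.

```python
import shutil
from pathlib import Path


def stash(workspace, changed_files, stash_dir):
    """Copy each changed file into stash_dir, keeping its relative path."""
    workspace, stash_dir = Path(workspace), Path(stash_dir)
    for rel in changed_files:
        target = stash_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(workspace / rel, target)


def unstash(workspace, stash_dir):
    """Copy every stashed file back into the workspace, overwriting it.

    Returns the restored paths (relative, with forward slashes).
    """
    workspace, stash_dir = Path(workspace), Path(stash_dir)
    restored = []
    for src in sorted(stash_dir.rglob("*")):
        if src.is_file():
            rel = src.relative_to(stash_dir)
            destination = workspace / rel
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, destination)
            restored.append(rel.as_posix())
    return restored
```

Keep the stash folder outside the Dimensions-managed workspace. Note that unstash overwrites rather than merges, so if the files changed in the meantime you still have to merge by hand.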
Another alternative is to use streams so that your changes are source controlled without affecting production code; a typical setup has an Integration stream and a Main stream. But it depends on your access level to the Dimensions database you are using and on your project's needs.
A Git repo can be maintained locally to get this and other Git functionality on your own computer (or even for a small team with shared folders or a Git server), since it does not interfere with Dimensions as long as you don't store the Git metadata in the Dimensions-managed code and vice versa. This is not a straightforward solution: it requires that you know how to set up a Git repo and that you take care on your side when delivering to the Dimensions server. But it works and is really helpful if you are familiar with the Git workflow.
Dimensions is not as friendly as Git for this kind of usage, but it is way more robust for larger and more controlled projects.
Git and Dimensions follow different methodologies. After checking out a file, Dimensions only lets you either commit a new version or discard the changes. As indicated above, you can still use streams or individual branches for your development work and merge/deliver the changes at a later point in time, without affecting others' work.

What is a good solution structure to allow easy customisation of a product on a per client basis?

I am looking for some advice on how to allow easy customisation and extension of a core product on a per client basis. I know it is probably too big a question. However we really need to get some ideas as if we get the setup of this wrong it could cause us problems for years. I don't have a lot of experience in customising and extending existing products.
We have a core product that we usually bespoke on a per client basis. We have recently rewritten the product in C# 4 with an MVC3 frontend. We have refactored and now have 3 projects that compose the solution:
Core domain project (namespace projectname.domain.*): domain models (for use by EF), domain service interfaces (repository interfaces), etc.
Domain infrastructure project (namespace projectname.infrastructure.*): implements the domain services, i.e. the EF context, repository implementations, file upload/download interface implementations, etc.
MVC3 project (namespace projectname.web.*): controllers, viewmodels, CSS, content, scripts, etc. It also has IoC (Ninject) handling DI for the project.
This solution works fine as a standalone product. Our problem is extending and customising the product on a per client basis. Our clients usually want the core product version given to them very quickly (usually within a couple of days of signing a contract) with branded CSS and styling. However 70% of the clients then want customisations to change the way it functions. Some customisations are small such as additional properties on domain model, viewmodel and view etc. Others are more significant and require entirely new domain models and controllers etc.
Some customisations appear to be useful to all clients, so periodically we would like to change them from being customisations and add them to the core.
We are presently storing the source code in TFS. To start a project we usually manually copy the source into a new Team Project, change the namespace to reflect the client’s name, start customising the basic parts, and then deploy to Azure. This obviously results in an entirely duplicated code base and I’m sure isn’t the right way to go about it. I think we should probably have something that provides the core features and extends/overrides them where required. However, I am really not sure how to go about this.
So I am looking for any advice on the best project configuration that would allow:
Rapid deployment of the code, so it is easy to start off a new client and allow for branding/minor changes
Prevent the need for copying and pasting of code
Use of as much DI as possible to keep it loosely coupled
Allow for bespoking of the code on a per client basis
The ability to extend the core product in a single place and have all clients gain that functionality if we get the latest version of the core and re-deploy
Any help/advice is greatly appreciated. Happy to add more information that anyone thinks will help.
I may not answer this completely, but here is some advice:
Don't copy your code, ever, whatever the reason.
Don't rename the namespace to identify a given client version. Use branches and continuous integration for that.
Choose a branching model like the following: a root branch called "Main", then one branch from Main per major version of your product, then one branch per client. When you develop something, decide from the start which branch it belongs in (a client-specific feature goes in the client branch; a global feature goes in the version branch, or in a client branch if you want to prototype it first, etc.).
Do your best to rely on Work Items to track the features you develop, so you know which branch each one is implemented in; this eases merging across branches.
Targeting the right branch for your development is the most crucial thing. You don't necessarily have to define hard rules of "what to do on which occasion", but try to be consistent.
I've worked on a big 10-year project with more than 75 versions, and what we usually did was:
Next major version: create a new branch from Main and develop inside it.
Next minor version: develop in the current major branch, and use labels to mark each minor version inside that branch.
Some complex functional features were developed in the branch of the client that asked for them, then reverse-integrated into the version branch once we succeeded in "unbranding" them.
Bug fixes were made in the client branch, then ported to other branches when needed (you have to use Work Items for that or you'll easily get lost).
That's my take; others may have a different point of view. I relied a lot on Work Items for traceability of the code, which helped a lot with delivery and reporting.
EDIT
OK, here are some further thoughts on branches:
In Software Configuration Management (SCM) you have two features to help you with versioning: branches and labels. Neither is better or worse than the other; it depends on what you need:
A label marks a point in time so that you can later go back to that point if needed.
A branch "duplicates" your code so that you can work on two versions at the same time.
So whether to use branches depends only on what you want to be able to do. If you have to work on many different versions (say, one per client) at the same time, there's no way to deal with it other than branches.
To limit the number of branches, you have to decide what will be a new branch and what will be marked by a label: client-specific versions, major versions, minor versions, service packs, etc.
Using branches for client versions looks like a no-brainer.
Using one branch for each major version may be the toughest choice for you to make. If you choose to use only one branch for all major versions, you won't have the flexibility to work on different major versions at the same time, but your number of branches will be the lowest possible.
Finally, Jemery Thompson has a good point when he says that not all of your code should be client-dependent; some libraries (typically the lowest-level ones) shouldn't be customized per client. What we usually do is use a separate branch tree (which is not per client) for framework, cross-cutting, and low-level service libraries, then reference these projects in the per-client version projects.
My advice is to use NuGet for these libraries and create NuGet packages for them, as it's the best way to define versioned dependencies. Defining a NuGet package is really easy, as is setting up a local NuGet server.
I just worried that with 30 or 40 versions (most of which aren't that different) branching was adding complexity.
+1 Great question. It's more of a business decision you'll have to make:
Do I want a neat codebase where maintenance is easy and features and fixes get rolled out quickly to all our customers,
or do I want a plethora of instances of one codebase split up, each with tiny tweaks that are hard to merge into a trunk (EDIT: unless you're an ALM MVP who can "unbrand" things)?
I agree with almost everything @Nockawa mentioned, except that IMHO you shouldn't use branches as a substitute for extending your code architecture.
Definitely use a branch/trunk strategy, but as you mentioned, too many branches make it harder to quickly roll out site-wide features and hinder project-wide continuous integration. If you wish to prevent copy/pasting, limit the number of branches.
In terms of a coding solution here is what I believe you are looking for:
Modules/Plug-ins, Interfaces and DI is right on target!
Deriving custom classes off base ones (extending the DSL per customer, Assembly.Load())
Custom reporting solution (instead of new pages a lot of custom requests could be reports)
Pages with spreadsheets (hehe I know - but funnily enough it works!)
Great examples of the module/plugin point are CMSs such as DotNetNuke or Kentico. Other ideas could be gained by looking at Facebook's add-in architecture, plugins for audio and video editing, 3D modeling apps (like 3ds Max), and games that let you build your own levels.
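To make the module/plug-in idea a bit more concrete, here is a small language-neutral sketch in Python rather than C#. Every name in it (Module, CoreOrders, ClientALoyalty, build_site) is invented for illustration; in the real solution the contract would be a C# interface and the wiring would go through your DI container. The point is only that each client is described by a list of enabled modules instead of a copy of the code.

```python
from abc import ABC, abstractmethod


class Module(ABC):
    """Contract that every optional module implements (hypothetical)."""

    name = ""

    @abstractmethod
    def routes(self):
        """Return the URL routes this module contributes."""


class CoreOrders(Module):
    name = "orders"

    def routes(self):
        return ["/orders"]


class ClientALoyalty(Module):
    # A customisation written for one client, still living in the one codebase.
    name = "loyalty"

    def routes(self):
        return ["/loyalty/points"]


# Registry of everything that can be switched on for a client.
AVAILABLE = {cls.name: cls for cls in (CoreOrders, ClientALoyalty)}


def build_site(enabled):
    """Instantiate only the modules named in a client's configuration."""
    return [AVAILABLE[name]() for name in enabled]


client_a = build_site(["orders", "loyalty"])
client_b = build_site(["orders"])
print([route for module in client_a for route in module.routes()])
```

Promoting a customisation to the core then means adding its name to every client's enabled list, not merging branches.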
The ideal solution would be an admin app in which you can choose your modules (DLLs), tailor the CSS (skin), script the DB, and auto-deploy the solution up to Azure. To achieve this goal plugins would make so much more sense, because the codebase won't be split up. Also, when an enhancement is made to a module, you can roll it out to all your clients.
You could easily do small customisations, such as additional properties on the domain model, viewmodel and view, with user controls, derived classes and function overrides.
Do it really generically. Say a customer says "I want a label that tallies everyone's age in the system": make a function called int SumOfField(string dBFieldName, string whereClause), and then for that customer's site have a label that binds to the function. Then say another customer wants a function to count the number of product purchases by customer; you can re-use it: SumOfField("product.itemCount","CustomerID=1").
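Here is one way the generic SumOfField idea could look, sketched in Python with an in-memory SQLite database instead of C#. This is not the signature given above: passing a raw where-clause string invites SQL injection, so this variant takes the filter column and value separately, binds the value as a parameter, and checks table and column names against a whitelist. The table, columns, and data are made up.

```python
import sqlite3

# Identifiers cannot be bound as SQL parameters, so only whitelisted
# table/column names are ever interpolated into the statement.
ALLOWED = {"product": {"itemCount", "CustomerID"}}


def sum_of_field(conn, table, field, where_column, where_value):
    """Sum one numeric column over the rows where where_column = where_value."""
    columns = ALLOWED.get(table, set())
    if field not in columns or where_column not in columns:
        raise ValueError("table or column not allowed")
    sql = (
        f"SELECT COALESCE(SUM({field}), 0) FROM {table} "
        f"WHERE {where_column} = ?"
    )
    return conn.execute(sql, (where_value,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (CustomerID INTEGER, itemCount INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, 2), (1, 3), (2, 7)])

print(sum_of_field(conn, "product", "itemCount", "CustomerID", 1))
```

Each client's page can then bind a label to a different table/column pair without any new code being written per client.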
More significant changes that require entirely new domain models, controllers, etc. would fit the plug-in architecture. For example, if a customer needs a second address field, you would tweak your current Address user control to be a plug-in that can be added to any page; its settings would tell it which DB table and fields to run its CRUD operations against.
If the functionality is customised per client in 30-40 branches, maintainability will become very hard, as I get the feeling you won't be able to merge them together (easily). If there is a chance this will get really big, you don't want to manage 275 branches. However, if it's so specialised that you have to go down to the user-control level for each client, and users can't design their own pages, then Nockawa's branching strategy for the front-end is perfectly reasonable.

Best Practice for Git Repositories with multiple projects in traditional n-tier design

I'm making the switch from a centralized SCM system to Git. OK, I'll admit which one: it is Visual SourceSafe. So in addition to getting over the learning curve of Git commands and workflow, the biggest issue I'm currently facing is how to migrate our current repository to Git: as a single repository, or as some flavor of multiple repositories.
I've seen this question asked in a variety of ways, but normally just the basic..."I have applications that want to share some lower level libraries" and the canned response is always "use separate repositories" and/or "use Git submodules" without much explanation of when/why this pattern should be used (what does it overcome, what does it eliminate?) From my limited knowledge/reading on Git so far, it appears that submodules may have their own demons to battle, especially for someone new to Git.
However, what I've yet to see someone blatantly ask is, "When you have the traditional n-tier development (UI, Business, Data, and then Shared Tools) where each layer is its own project, should you use one or multiple repositories?" It is not clear to me because almost always, when a new 'feature' is added, code changes ripple through each layer.
To complicate matters with respect to Git, we've duplicated these layers across 'frameworks' to make more manageable projects/components from a developer's perspective. For the purpose of this discussion, let's call this collection of projects/layers 'Tahiti', which represents an entire 'product'.
The final 'layer' in our set up is the addition of client websites/projects which customize/expand upon Tahiti. Representing this in a folder structure might best look like:
/Clients
    /Client1
    /Client2
/UI Layer
    /CoreWebsite (views/models/etc)
    /WebsiteHelper (contains 'web' helpers appropriate for any website)
    /Tahiti.WebsiteHelpers (contains 'web' helpers only appropriate for Tahiti sites)
/BusinessLayer (logic projects for different 'frameworks')
    /Framework1.Business
    /Framework2.Business
/DataLayer
    /Framework1.Data
    /Framework2.Data
/Core (projects that are shared tools useable by any project/layer)
    /SharedLib1
    /SharedLib2
After explaining how we've expanded on the traditional n-tier design with multiple projects, I'm looking for any experience on what decision you've made in a similar situation (even if the simple UI, Business, Data separation was all that you used) and what was easier/harder because of your decision. Am I right in my preliminary reading that submodules can be a bit of a pain? More pain than the benefit is worth?
My gut reaction is to use one repository for Tahiti (all projects excluding the 'client projects'), then one repository for each client. The entire Tahiti source, I'm guessing, is <10k files. Here is my reasoning (and I welcome criticism):
It seems to me, that in Git you want to track history of 'features' vs individual 'projects/files', and even with our project separation, a 'feature' will always span multiple projects.
A 'feature' coded in the core site will almost always, at a minimum, affect the core website and all projects for a 'framework' (i.e. CoreWebsite, Framework1.Business, Framework1.Data)
A feature can easily span multiple frameworks (I'd say 10% of the features we implement would span frameworks - CoreWebsite, Framework1.Business, Framework1.Data, Framework2.Business, Framework2.Data)
In a similar fashion, a feature could require changes to 1 or more SharedLib projects and/or the 'UI website helper' projects.
Changes to client's custom code will almost always only be local to their repository and not require tracking changes to other components to see what the 'entire feature change set' was.
Given that a feature spans projects to see the entire scope, if each project was its own repository, it seems it would be a pain to try to analyze *all* code changes across repositories?
Thanks in advance.
The reason most people advise separate repositories is that they separate out changes and change sets. If someone makes a change to the client projects (which you say doesn't really affect others), there is no reason for everyone else to update the entire code base. They can simply get the changes from the project(s) they care about.
Git Submodules are like Externals in Subversion. You can set up your git repositories so that each one is a separate layer, and then use submodules to include the projects that are needed in the various hierarchies you have.
So if for example:
/Core -- Its own Git repository that contains its base files (as you had outlined)
    /SharedLib1
    /SharedLib2
/UI Layer -- Own Git repository
    /CoreWebsite
    /WebsiteHelper
    /Tahiti.WebsiteHelpers
    /Core -- Git submodule pointing to the /Core repository
        /SharedLib1
        /SharedLib2
This ensures that any updates to the /Core repository are brought into UI Layer repository. It also means that if you have to update your shared libraries you don't have to do it across 5-6 projects.
VS 2022 supports multi-repo.
The easiest way to enable multi-repo support is to use CTRL+Q, type “preview” and open the preview features pane. Scroll to “Enable multi-repo support” and toggle the checkbox. This functionality is still a preview feature, which means we are working hard to add more support in the coming releases. In the meantime, we’re depending on your feedback, the community, to build what you need.
Source: https://devblogs.microsoft.com/visualstudio/multi-repo-support-in-visual-studio/
