Reintegrate a branch back to the trunk when sweeping changes have been made to the tree structure

A brief note before I start: there is a lot of explanation required to "set the stage", and it may seem like this is more of a design question than a question about a programming problem. The question is actually about SVN branching and merging, so please read to the end.
Scenario:
I have a large Visual Studio solution with quite a few projects. I'm using SVN, so of course the trunk has my production line of development. This consists of a core DLL assembly, a "main" UI user client, and a handful of "plugin" assemblies that operate by implementing interfaces on the core assembly in order to provide functionality within the UI, and also by utilizing a set of service methods which provide common functionality to all of the plugins (such as persistence logic operations, storage operations for a centralized file store architecture, etc.)
There are also external utilities that I have built over time which must duplicate a lot of the business logic in the plugins. I won't go into much detail because it will ultimately distract from my main question, but just picture, for example, a scheduled service on a server that handles centralized maintenance operations related to a particular plugin's data.
When I initially built this application, I (stupidly) didn't anticipate the need for centralized service tiers, so I architected the core assembly (for better or worse), as shown above, to be tightly integrated with the presentation layer of the application. In other words, the UI presentation logic needed to integrate the plugins with the user interface and the business logic needed by the plugins to perform common plugin logic operations is all part of the one "core" assembly. Therefore, much of the "shared" logic that exists between the plugins and the centralized services has resulted in duplicated code.
I decided to undertake the major refactoring initiative to pull out the common logic -- that which is not related to the presentation -- into a "shared" assembly. For this, I created a branch off the trunk. I reorganized common code into a "shared" assembly, and I re-pointed everything in the client application (plugins, etc.) and the external service applications to utilize the shared assembly. In many cases, I also had to rename classes in order to fit their more-general purpose going forward. The core assembly remained in place only to broker presentation-layer responsibilities between the plugins and the UI.
Problem:
Now that I have successfully completed the refactoring, I want to reintegrate the branch back into the trunk. Merging is tricky business even in simple cases, but what I'm facing here is a lot of tree conflicts to put it mildly. Also, in addition to residing in an entirely new project, the folder structure in the "shared" project is quite a bit different from what it was in the "core" project. Classes are, in many cases, located in different places due to the new mechanisms for using the shared assembly.
I want to maintain the version history of every class from its old home in the core assembly to its new home in the shared assembly. Furthermore, I want to guarantee that the merge is successful. That seems obvious, but in testing a miniature version of this whole scenario, I was never able to get the conflicts to resolve in such a way where my branch features remained entirely intact. Furthermore, the fact that I have renamed some of the classes, as I stated earlier, to suit their more-general roles, makes it very tricky to maintain the version history.
I will note that I am using AnkhSVN which helps in "normal" cases when you rename files to repair the moves, but it doesn't seem to work in these major tree-conflict cases. Also, I know there is a difference in how merges work between different versions of SVN -- I believe it's pre-SVN 1.5 and post-SVN 1.5. I'm using SVN 1.9.3.
I have been trying to figure this out for a few weeks now. I've been poring over the SVN book, TortoiseSVN resources like this, and anything I could find from Google searches, like this, this, and this -- among many, many, many others. I feel like I'm going crazy, and I think advanced SVN (and Tortoise) is impossible to learn with the traditional teach-yourself, learn-from-the-web-and-books approach. At any rate, I would greatly appreciate any insight that is out there.
What is the proper methodology when you create a feature branch using SVN and plan on making major tree changes and "moves" (i.e. renames) so that you can reintegrate those changes with the trunk without losing anything?

Congratulations on stepping on the most "popular" rake in SVN - "Merge Hell after refactoring"!
There are (at least) two simple rules for your case, produced by bitter experience:
1. Never perform refactoring in SVN
2. If you ignore rule 1: in the name of all that is holy and good in the world, don't touch ANYTHING in trunk during refactoring in the branch
If you reject these righteous covenants, you still have some ways to salvation
Pure SVN-way, long and dirty
Merge each and every subtree that is a source of a tree conflict, determining every source and target by hand, like
svn merge NEW_PATH/NEW_NAME old_path/old_name
and finish this bloody work with a full merge
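To keep the hand-made source/target pairs manageable, they can be written down once and turned into commands. The following is only a sketch: the branch URL, the file paths, and the helper name build_subtree_merges are hypothetical, and it only builds the argument lists (with --dry-run on by default) rather than running anything.

```python
from typing import Dict, List


def build_subtree_merges(
    moves: Dict[str, str], branch_url: str, dry_run: bool = True
) -> List[List[str]]:
    """Build one `svn merge` argument list per moved path.

    `moves` maps the OLD working-copy path in trunk to the NEW path
    inside the branch, mirroring `svn merge NEW_PATH/NEW_NAME old_path/old_name`.
    """
    commands = []
    for old_path, new_path in sorted(moves.items()):
        cmd = ["svn", "merge"]
        if dry_run:
            # Preview the merge without touching the working copy.
            cmd.append("--dry-run")
        cmd += [f"{branch_url.rstrip('/')}/{new_path}", old_path]
        commands.append(cmd)
    return commands


# Hypothetical moves from the "core" project to the "shared" project.
moves = {
    "Core/Services/FileStore.cs": "Shared/Storage/FileStore.cs",
    "Core/Models/Customer.cs": "Shared/Domain/Client.cs",
}
commands = build_subtree_merges(moves, "^/branches/refactor")
for cmd in commands:
    print(" ".join(cmd))
```

Review the dry-run output for each pair before running the real merges, then do the final full merge as described above.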
Tricky Mercurial-way (or Git-way, but I just hate Git)
Preface: such merges aren't a problem at all for modern DVCSes, and they have "bridges" to SVN repos, so you can delegate the job of merging to an external VCS of your choice and return the results back (with some limitations and warnings)
I'm too lazy to speak about all DVCSes and will explain only Mercurial (considering that, with an SVN background, it will be the least painful migration).
With HGSubversion, Mercurial can read (pull) from and write (push) to Subversion repositories, but it can't push the results of its own merges to Subversion. Thus it will be a multi-stage operation, with the Subversion WC being substituted along the way
A brief synopsis
1. Install Mercurial (TortoiseHG) and the HGSubversion extension
2. Clone the whole SVN repository to Mercurial in some temporary location (not the current Subversion WC)
3. Merge the branch to mainline (SVN's trunk becomes the default branch), resolve (possible) content conflicts (not tree conflicts)
4. Test (?) the results
5. Perform a full replacement of the Subversion Working Copy (the WC of trunk, obviously) with the content of the Mercurial Working Directory (beware of the .svn and .hg folders respectively)
6. Commit the WC to trunk
7. For beauty and compliance with all the rules, "cheat" the mergeinfo data of trunk (the revision committed in step 6 must be known later as a mergeset, although formally that is not true)
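The working-copy replacement step is the easiest one to get wrong, so here is a minimal sketch of it. It assumes an SVN 1.7+ working copy (a single .svn folder at the root); the function name replace_working_copy and the directory layout are made up for illustration.

```python
import shutil
from pathlib import Path

# Metadata folders that must never be deleted from or copied into the other side.
VCS_DIRS = {".svn", ".hg"}


def replace_working_copy(hg_dir: Path, svn_wc: Path) -> None:
    """Make svn_wc contain exactly the files of hg_dir, keeping svn_wc/.svn."""
    # 1. Remove everything from the SVN working copy except its metadata.
    for entry in svn_wc.iterdir():
        if entry.name in VCS_DIRS:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # 2. Copy the merged tree from the Mercurial working directory,
    #    skipping Mercurial's own metadata.
    for entry in hg_dir.iterdir():
        if entry.name in VCS_DIRS:
            continue
        target = svn_wc / entry.name
        if entry.is_dir():
            shutil.copytree(
                entry, target, ignore=shutil.ignore_patterns(".svn", ".hg")
            )
        else:
            shutil.copy2(entry, target)
```

Afterwards, svn status will report vanished files as missing (!) and new ones as unversioned (?); they still have to be scheduled with svn delete and svn add before the commit.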
HTH
PS - migrating to Mercurial with HGVS doesn't seem like a totally crazy idea for now

Related

How to implement "Lock & Edit" mechanism for Visual Studio? GitHub, SVN, VSS, TFS?

Here's the requirement:
C# classes need to be shared among a group of 5 developers.
If one developer starts editing a class, it should be automatically locked for others
Others can edit that class, only when the current developer releases the class
I understand that Git is a distributed version control system, whereby complete local repositories are created. Merge functionality has to be used for creating a consolidated file.
I have also tried Svn, but even that uses a Merge tool.
I have a small team, and I don't want to use Merge Tools. Which is the best way to accomplish this?

SVN does support this kind of workflow with its locking feature.
Read the section on locking in the SVN Book v 1.7 - it goes into plenty of detail.
As far as I'm aware, Git does not support a locking workflow.
Apparently Team Foundation Server also supports a locking workflow, but I'm not familiar with it.
I will add that I do not think this is a good way to work unless you absolutely have to (e.g. binaries or hard-to-merge files like model XML). Regular team communication and defensive programming should mean that the vast majority of code merges will be handled automatically by your version control system.
Merging is just a part of collaborative development. Nobody really wants to use merge tools, but IMO having to do an occasional (sometimes messy) merge is a far better prospect than having to wait until someone else is finished with a file before I can make my change - changes which are very likely NOT to conflict with others' changes anyway. Especially in a small team.
You should also not (as mentioned in comments above) need a resource dedicated to Merging. A merge is best done by two people.
The developer with the conflict, and
The developer who committed the last change (that has caused that conflict.)
If these two can't work it out pretty quickly, or you really do need a resource just for merging (which I have seen occur even in smallish teams of around 10 developers), you have problems, such as:
The code is monolithic/highly coupled and needs refactoring
The developers are not committing atomic changes.
Using svn and a complex branching strategy (scary)
Developers are not talking to each other (Just a 10 min standup/day would help)
Good luck!

Apache Subversion 1.8 features major improvements that make merging and solving conflicts easier. New automatic merges are definitely worth testing!
As @mounds already mentioned, you can use a pessimistic-locking workflow with Apache Subversion. See the SVNBook | Lock communication section. In that case, Visual Studio with VisualSVN will prompt you to lock a file before you start modifying it.
Note that such an approach should be used only with files that can't be merged. So, embrace merge!
Users and administrators alike are encouraged to attach the
svn:needs-lock property to any file that cannot be contextually
merged. This is the primary technique for encouraging good locking
habits and preventing wasted effort.
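As a rough illustration of the quoted advice, the sketch below picks out files that cannot be merged and builds the svn propset commands that attach svn:needs-lock to them. The extension list, the sample paths, and the helper name needs_lock_commands are assumptions; only the svn propset svn:needs-lock invocation itself is standard Subversion.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Assumption: which file types count as "cannot be contextually merged".
UNMERGEABLE = {".dll", ".png", ".xlsx"}


def needs_lock_commands(
    paths: Iterable[str], extensions: Iterable[str] = UNMERGEABLE
) -> List[List[str]]:
    """Build `svn propset svn:needs-lock` commands for unmergeable files."""
    wanted = {ext.lower() for ext in extensions}
    return [
        # The property only has to exist; "*" is the customary value.
        ["svn", "propset", "svn:needs-lock", "*", path]
        for path in paths
        if PurePosixPath(path).suffix.lower() in wanted
    ]


# Hypothetical working-copy paths.
commands = needs_lock_commands(["lib/Vendor.dll", "src/Customer.cs", "art/Logo.PNG"])
for cmd in commands:
    print(" ".join(cmd))
```

Source files such as src/Customer.cs are deliberately left alone, so ordinary code keeps the normal merge workflow.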

Best Practice: Removing obsolete artifacts from UCM ClearCase

We have a stream in ClearCase UCM. We create views on this stream and fetch code for build purposes. The total data copied is 10 GB. This is a huge codebase. I decided to investigate what makes it huge.
I found:
1) Multiple versions of Third Party applications are stored in
ClearCase
2) But only the latest Third Party applications are used by our
application
3) Lots of obsolete and redundant code is available
I proposed:
1) Removal of old versions of Third Party applications using rmname
(NOT rmelement) which will ensure the availability of element history
2) Removal of all redundant code
A total of 5 GB of obsolete data has been detected.
My Logic:
I think this is the best way to keep a stream of development clean. That is, the best way to organize a stream of development is to have the best, the cleanest and the leanest source code available.
Also, since all HISTORY will be available always in ClearCase, there is no need to panic about the deletion of elements.
I feel old, redundant and obsolete code and artifacts belong in HISTORY and not in the current stream of development.
Lastly, I feel ClearCase operations like making a baseline etc will take more time if we have bloat in the VOB. Since we do an incremental baseline for nightly builds, I do not think these obsolete items are baselined. But I feel all ClearCase operations are affected by bloat.
Is my LOGIC proper? Is my understanding of UCM ClearCase proper?
*Please let me know the best practice in such cases.*
People at my work place do not want to delete the obsolete files although 5 GB data is obsolete in the current stream.
Any help would be appreciated.

The best practice is actually separate from UCM in this case.
I too started by storing third-party binaries in ClearCase. It didn't scale well, and the VOB started to get bloated and simply too large to be managed (i.e. backed up) easily.
I now prefer storing third-parties in an artifact repository like Nexus, and add a little maven script to my build process in order to download the right binaries at the right versions, as declared in a pom.xml file.
Note that to remove old versions of a binary from a VOB, rmelem or rmver are really not advisable (risk of hyperlink corruption), but I used to do:
cleartool rmver -data aLargeBinary@@/main/.../branch/OldVersion
That would keep the version in ClearCase, but would remove the version content (i.e. the large binary itself): that allowed the VOB to get much smaller.
That being said, I agree with your general policies (especially regarding redundant code)

What is a good solution structure to allow easy customisation of a product on a per client basis?

I am looking for some advice on how to allow easy customisation and extension of a core product on a per client basis. I know it is probably too big a question. However we really need to get some ideas as if we get the setup of this wrong it could cause us problems for years. I don't have a lot of experience in customising and extending existing products.
We have a core product that we usually bespoke on a per client basis. We have recently rewritten the product in C# 4 with an MVC3 frontend. We have refactored and now have 3 projects that compose the solution:
Core domain project (namespace - projectname.domain.*) - consisting of domain models (for use by EF), domain service interfaces etc (repository interfaces)
Domain infrastructure project (namespace - projectname.infrastructure.*) - implements the domain services: EF context, repository implementations, file upload/download interface implementations etc.
MVC3 project (namespace - projectname.web.*) - consists of controllers, viewmodels, CSS, content, scripts etc. It also has IoC (Ninject) handling DI for the project.
This solution works fine as a standalone product. Our problem is extending and customising the product on a per client basis. Our clients usually want the core product version given to them very quickly (usually within a couple of days of signing a contract) with branded CSS and styling. However 70% of the clients then want customisations to change the way it functions. Some customisations are small such as additional properties on domain model, viewmodel and view etc. Others are more significant and require entirely new domain models and controllers etc.
Some customisations appear to be useful to all clients, so periodically we would like to change them from being customisations and add them to the core.
We are presently storing the source code in TFS. To start a project we usually manually copy the source into a new Team Project. We change the namespace to reflect the client's name, start customising the basic parts, and then deploy to Azure. This obviously results in an entirely duplicated code base and I’m sure isn’t the right way to go about it. I think we probably should have something that provides the core features and extends/overrides them where required. However I am really not sure how to go about this.
So I am looking for any advice on the best project configuration that would allow:
Rapid deployment of the code – so easy to start off a new client to allow for branding/minor changes
Prevent the need for copying and pasting of code
Use of as much DI as possible to keep it loosely coupled
Allow for bespoking of the code on a per client basis
The ability to extend the core product in a single place and have all clients gain that functionality if we get the latest version of the core and re-deploy
Any help/advice is greatly appreciated. Happy to add more information that anyone thinks will help.

I may not answer this completely, but here is some advice:
Don't copy your code, ever, whatever the reason is.
Don't rename the namespace to identify a given client version. Use the branches and continuous integration for that.
Choose a branching model like the following: a root branch called "Main", then create one branch from Main per major version of your product, then one branch per client. When you develop something, target from the start in which branch you'll develop depending on what you're doing (a client specific feature will go in the client branch, a global version in the version branch or client branch if you want to prototype it at first, etc.)
Do your best to rely on Work Items to track the features you develop, so you know in which branch each one is implemented; this eases merging across branches.
Targeting the right branch for your dev work is the most crucial thing. You don't necessarily have to define hard rules of "what to do on which occasion", but try to be consistent.
I've worked on a big 10-year project with more than 75 versions, and what we usually did was:
Next major version: create a new branch from Main, dev inside
Next minor version: dev in the current major branch, use Labels to mark each minor version inside your branch.
Some complex functional features were developed in the branch of the client that asked for them, then reverse integrated into the version branch once we succeeded in "unbranding" them.
Bug fixes in the client branch, then ported to other branches when needed (you have to use Work Items for that or you'll easily get lost).
That's my take on it; others may have a different point of view. I relied a lot on Work Items for traceability of the code, which helped a lot for the delivery and reporting of code.
EDIT
Ok, I'll add some thoughts/feedback about branches:
In Software Configuration Management (SCM) you have two features to help you with versioning: branches and labels. Neither one is better or worse than the other; it depends on what you need:
A Label is used to mark a point in time, so that you can later go back to that point if needed.
A Branch is used to "duplicate" your code to be able to work on two versions at the same time.
So using branches only depends on what you want to be able to do. If you have to work on many different versions (say one per client) at the same time, there's no other way to deal with it than using branches.
To limit the number of branches you have to decide what will be a new branch or what will be marked by a label for: Client Specific Versions, Major Version, Minor Version, Service Pack, etc.
Using branches for Client versions looks to be a no brainer.
Using one branch for each Major version may be the toughest choice for you to make. If you choose to use only one branch for all major versions, then you won't have the flexibility to work on different major versions at the same time, but your number of branches will be the lowest possible.
Finally, Jeremy Thompson has a good point when he says that not all your code should be client dependent: there are some libraries (typically the lowest-level ones) that shouldn't be customized per client. What we usually do is use a separate branch tree (which is not per client) for framework, cross-cutting, and low-level service libraries, then reference these projects in the per-client version projects.
My advice is to use NuGet for these libraries and create NuGet packages for them, as it's the best way to define versioned dependencies. Defining a NuGet package is really easy, as is setting up a local NuGet server.

I just worried that with 30 or 40 versions (most of which aren't that different) branching was adding complexity.
+1 Great question, it's more of a business decision you'll have to make:
Do I want a neat code-base where maintenance is easy and features and fixes get rolled out quickly to all our customers
or do I want a plethora of instances of one codebase split up, each with tiny tweaks that are hard (EDIT: unless you're an ALM MVP who can "unbrand" things) to merge into a trunk.
I agree with almost everything @Nockawa mentioned, except that IMHO you shouldn't use branches as a substitute for extending your code architecture.
Definitely use a branch/trunk strategy, but as you mentioned, too many branches make it harder to quickly roll out site-wide features and hinder project-wide continuous integration. If you wish to prevent copy/pasting, limit the number of branches.
In terms of a coding solution here is what I believe you are looking for:
Modules/Plug-ins, Interfaces and DI is right on target!
Deriving custom classes off base ones (extending the DSL per customer, Assembly.Load())
Custom reporting solution (instead of new pages a lot of custom requests could be reports)
Pages with spreadsheets (hehe I know - but funnily enough it works!)
Great examples of the module/plugin point are CMSs such as DotNetNuke or Kentico. Other ideas could be gained by looking at Facebook's add-in architecture, plugins for audio and video editing, 3D modeling apps (like 3DMax) and games that let you build your own levels.
The ideal solution would be an admin app in which you can choose your
modules (DLLs), tailor the CSS (skin), script the DB, and auto-deploy
the solution up to Azure. To achieve this goal, plugins would make so
much more sense, as the codebase won't be split up. Also, when an
enhancement is done to a module, you can roll it out to all your
clients.
You could easily do small customisations such as additional properties on domain model, viewmodel and view etc with user controls, derived classes and function overrides.
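A rough sketch of the derived-class idea, written in Python for brevity rather than in the C#/MVC3 stack from the question. Every name here (Customer, CustomerView, AcmeCustomerView, the "acme" client key) is hypothetical; the point is only that a client-specific subclass overrides one method while the core class stays untouched.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class Customer:
    name: str
    age: int


class CustomerView:
    """Core behaviour shared by every client."""

    def fields(self, customer: Customer) -> Dict[str, object]:
        return {"name": customer.name}


class AcmeCustomerView(CustomerView):
    """Hypothetical per-client customisation: one additional property."""

    def fields(self, customer: Customer) -> Dict[str, object]:
        data = super().fields(customer)
        data["age"] = customer.age
        return data


# Per-client overrides; in a real system a DI container would hold this mapping.
OVERRIDES: Dict[str, Type[CustomerView]] = {"acme": AcmeCustomerView}


def view_for(client: str) -> CustomerView:
    """Return the client's customised view, or the core one by default."""
    return OVERRIDES.get(client, CustomerView)()


print(view_for("acme").fields(Customer("Ann", 30)))
print(view_for("other").fields(Customer("Ann", 30)))
```

A fix made in CustomerView reaches every client on the next deployment, because the overrides only add to the core behaviour instead of copying it.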
Do it really generically. Say a customer says "I want a label that tallies everyone's age in the system": make a function called int SumOfField(string dBFieldName, string whereClause) and then, for that customer's site, have a label that binds to the function. Then say another customer wants a function to count the number of product purchases by customer: you can re-use it: SumOfField("product.itemCount","CustomerID=1").
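A minimal sketch of such a generic function, in Python with an in-memory SQLite database. The table, column names, and data are invented for the example. It also departs from the SumOfField signature above on purpose: instead of a raw whereClause string it takes a column and a value, so the value can be bound as a parameter and the identifiers can be validated.

```python
import re
import sqlite3

# Identifiers cannot be bound as SQL parameters, so they are validated instead.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def sum_of_field(conn, table, field, where_column, where_value):
    """Sum `field` over the rows of `table` where `where_column` equals `where_value`."""
    for name in (table, field, where_column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"SELECT COALESCE(SUM({field}), 0) FROM {table} WHERE {where_column} = ?"
    return conn.execute(sql, (where_value,)).fetchone()[0]


# Hypothetical data: customer 1 bought 2 + 3 items, customer 2 bought 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (CustomerID INTEGER, itemCount INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, 2), (1, 3), (2, 10)])

total = sum_of_field(conn, "product", "itemCount", "CustomerID", 1)
print(total)  # 5
```

The same call serves both customers from the example; only the table, field, and filter change.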
More significant changes that require entirely new domain models and controllers etc would fit the plug-in architecture. An example might be a customer who needs a second address field: you would tweak your current Address user control to be a plug-in for any page, with settings telling it which DB table and fields it should run its CRUD operations against.
If the functionality is customised per client in 30-40 branches,
maintainability will become so hard, as I get the feeling you won't be
able to merge them together (easily). If there is a chance this will
get really big, you don't want to manage 275 branches. However, if it's
so specialised that you have to go down to the User-Control level for
each client and "users can't design their own pages", then having
Nockawa's branching strategy for the front-end is perfectly
reasonable.

Best Practice for Git Repositories with multiple projects in traditional n-tier design

I'm making the switch from a centralized SCM system to Git. OK, I'll admit which one: it is Visual SourceSafe. So in addition to getting over the learning curve of Git commands and workflow, the biggest issue I'm currently facing is how to migrate our current repository over to Git: as a single repository or as some flavor of multiple repositories.
I've seen this question asked in a variety of ways, but normally just the basic..."I have applications that want to share some lower level libraries" and the canned response is always "use separate repositories" and/or "use Git submodules" without much explanation of when/why this pattern should be used (what does it overcome, what does it eliminate?) From my limited knowledge/reading on Git so far, it appears that submodules may have their own demons to battle, especially for someone new to Git.
However, what I've yet to see someone blatantly ask is, "When you have the traditional n-tier development (UI, Business, Data, and then Shared Tools) where each layer is its own project, should you use one or multiple repositories?" It is not clear to me because almost always, when a new 'feature' is added, code changes ripple through each layer.
To complicate matters with respect to Git, we've duplicated these layers across 'frameworks' to make more manageable projects/components from a developer's perspective. For the purpose of this discussion, let's call this collection of projects/layers 'Tahiti', which represents an entire 'product'.
The final 'layer' in our set up is the addition of client websites/projects which customize/expand upon Tahiti. Representing this in a folder structure might best look like:
/Clients
    /Client1
    /Client2
/UI Layer
    /CoreWebsite (views/models/etc)
    /WebsiteHelper (contains 'web' helpers appropriate for any website)
    /Tahiti.WebsiteHelpers (contains 'web' helpers only appropriate for Tahiti sites)
/BusinessLayer (logic projects for different 'frameworks')
    /Framework1.Business
    /Framework2.Business
/DataLayer
    /Framework1.Data
    /Framework2.Data
/Core (projects that are shared tools useable by any project/layer)
    /SharedLib1
    /SharedLib2
After explaining how we've expanded on the traditional n-tier design with multiple projects, I'm looking for any experience on what decision you've made with a similar situation (even the simple UI, Business, Data separation was all that you used) and what was easier/harder because of your decision. Am I right in my preliminary reading on how submodules can be a bit of pain? More pain than is worth the benefit?
My gut reaction is to use one repository for Tahiti (all projects excluding the 'client projects'), then one repository for each client. The entire Tahiti source, I'm guessing, has to be <10k files. Here is my reasoning (and I welcome criticism):
It seems to me, that in Git you want to track history of 'features' vs individual 'projects/files', and even with our project separation, a 'feature' will always span multiple projects.
A 'feature' coded in the core site will almost always affect, at a minimum, the core website and all projects for a 'framework' (i.e. CoreWebsite, Framework1.Business, Framework1.Data)
A feature can easily span multiple frameworks (I'd say 10% of the features we implement would span frameworks - CoreWebsite, Framework1.Business, Framework1.Data, Framework2.Business, Framework2.Data)
In a similar fashion, a feature could require changes to 1 or more SharedLib projects and/or the 'UI website helper' projects.
Changes to client's custom code will almost always only be local to their repository and not require tracking changes to other components to see what the 'entire feature change set' was.
Given that a feature spans projects to see the entire scope, if each project was its own repository, it seems it would be a pain to try to analyze *all* code changes across repositories?
Thanks in advance.

The reason most people advise separate repositories is that it separates out changes and change sets. If someone makes a change to the client projects (which you say doesn't really affect others), there is no reason for someone to update the entire code base. They can simply get the changes from the project(s) they care about.
Git Submodules are like Externals in Subversion. You can set up your git repositories so that each one is a separate layer, and then use submodules to include the projects that are needed in the various hierarchies you have.
So if for example:
/Core -- Its own Git repository that contains its base files (as you had outlined)
    /SharedLib1
    /SharedLib2
/UI Layer -- Own Git repository
    /CoreWebsite
    /WebsiteHelper
    /Tahiti.WebsiteHelpers
    /Core -- Git submodule pointing to the /Core repository
        /SharedLib1
        /SharedLib2
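This lets any updates to the /Core repository be brought into the UI Layer repository by updating the submodule reference (a submodule is pinned to a specific commit, so the update is an explicit step). It also means that if you have to update your shared libraries, you don't have to do it across 5-6 projects.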

VS 2022 supports multi-repo.
The easiest way to enable multi-repo support is to use CTRL+Q, type
“preview” and open the preview features pane. Scroll to “Enable
multi-repo support” and toggle the checkbox. This functionality is
still a preview feature, which means we are working hard to add more
support in the coming releases. In the meantime, we’re depending on
your feedback, the community, to build what you need.
See Screenshot:
https://devblogs.microsoft.com/visualstudio/multi-repo-support-in-visual-studio/

best git and Xcode structure for evolving variants of the same product

There is a similar question at Best practice for managing project variants in Git? but the context is different and I suspect the answer might be too.
I have a Cocoa product "First" managed with Xcode and versioned using git. "First" is still evolving, and is currently at its third version.
Then a customer comes and asks for a variant of First, called Second. The changes from First to Second affect many, but not all, files. The changes affect source code but also resources (graphics elements, nib files, property lists...).
Now the two products are alive and share a number of common files. However, some changes such as bug fixes might apply to both products. Possibly, a new feature might be added to both products.
What would be the best way to manage such a scenario:
with Xcode
with git
I have two ideas, which are mutually exclusive:
Idea 1: git branch "First" into "Second", and apply any applicable change from one project back to the other one. This leads to two totally separate Xcode directories and projects.
Idea 2: Add a target named "Second" to the Xcode project. Now the same Xcode project has two targets and is used to develop and build both products. But this makes it difficult to manage releases for First and Second in git (releases have no reason to be synchronized).
Idea 2 makes the parallel development process very easy. Code is always in sync. Divergences can be handled through compile-time variables and a single source file OR through different source files. It makes version management more obscure though.
Idea 1 is perhaps cleaner, but then, what's the best practice to manage whatever stays common between the two projects? Can you do a "partial merge" between two git branches? On what basis? Or must that be handled manually?
It might be possible to encapsulate and extract some common part into a module or library, but not always. For example I don't think that's possible for the common document icons. Also refactoring "First" so that all common items are extracted away in a build-able manner is a major undertaking that I'd rather do a bit at a time.
I realize there may not be a perfect solution. I am looking for ideas and suggestions.
As a relatively recent git adopter I also realize that this may be an RTFM question. Then simply point me to the FM to R.

My preference is Idea 2. I currently do this as a way of writing a plug-in for the Main App, and then client apps that go on all the nodes of a cluster. The plug-in and clients share 90% of the same code, so this makes it super-easy to maintain and debug the where/how of what's going on.
