We are starting to develop a new application that will comprise something like 30-50 projects, developed by about a dozen developers in C# with MS Visual Studio.
I am working on componentizing the application modules in order to support the architecture and enable parallel work.
We have a disagreement: how many solutions should we have?
Some claim that we should have 1-2 solutions with 15-30 projects each. Others claim that we need a solution per component, which means about 10-12 solutions with about 3-6 projects each.
I would be happy to hear the pros/cons of, and experience with, each direction (or any other direction).
I've worked on products on both extremes: one with ~100 projects in a single solution, and one with >20 solutions, of 4-5 projects each (Test, Business Layer, API, etc).
Each approach has its advantages and disadvantages.
A single solution is very useful when making changes - it's easier to work with dependencies, and it allows refactoring tools to work well. It does, however, result in longer load times and longer build times.
Multiple solutions can help enforce separation of concerns, keep build/load times low, and may be well suited to having multiple teams with narrower focus and well-defined service boundaries. They do, however, have a large drawback when it comes to refactoring, since many references are file references, not project references.
Maybe there's room for a hybrid approach: use smaller solutions for the most part, but create a single solution including all projects for the times when larger-scale changes are required. Of course, you then have to maintain two separate sets of solutions...
Finally, the structure of your projects and dependencies will have some influence on how you organize your solutions.
And keep in mind, in the long run RAM is cheaper than programmer time...
Solutions are really there for dependency management, so you can have a project in more than one solution if more than one thing depends on it. The number of solutions should really depend on your dependency graph.
Edit: This means you shouldn't be sticking projects that are not dependent on each other into the same solution, as that creates the illusion of dependency, which means someone could create a real dependency between two projects that should really be independent.
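To illustrate letting the dependency graph drive the grouping, here is a small sketch (the project names and edges are made-up examples) that finds the independent clusters of a project dependency graph; each cluster is a candidate solution, and projects in different clusters have no reason to share one:

```python
# Group projects into candidate solutions by finding the connected
# components of the (undirected) project dependency graph.
from collections import defaultdict

def candidate_solutions(dependencies):
    """dependencies: dict mapping project -> list of projects it depends on."""
    graph = defaultdict(set)
    for proj, deps in dependencies.items():
        graph[proj]  # ensure projects with no deps still appear
        for dep in deps:
            graph[proj].add(dep)
            graph[dep].add(proj)
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(graph[node] - cluster)
        seen |= cluster
        clusters.append(sorted(cluster))
    return sorted(clusters)

# Hypothetical project names, for illustration only.
deps = {
    "Web.UI": ["Business"],
    "Business": ["Data"],
    "Data": [],
    "ReportTool": ["ReportCore"],
    "ReportCore": [],
}
print(candidate_solutions(deps))  # two independent clusters -> two candidate solutions
```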
I've worked on a solution with close to 200 projects. It's not a big deal if you have enough RAM :).
One important thing to remember is that if projects depend on each other (be it with Dependencies or References), they should probably be in the same solution. Otherwise you get strange behavior when different projects have different dependencies in different solutions.
You want to maintain project references. If you can safely break up your solution with two or more discrete sets of projects that depend on each other, then do it. If you can't, then they all belong together.
We have a solution that has approximately 130 projects. About 3 years ago, when we were using VS.NET 2003, it was a terrible problem. Sometimes the solution and VSS would crash.
But now with VS.NET 2005 it's OK. Only loading takes a long time. Some of my coworkers unload the projects they don't use; that's another option for speeding things up.
Switching the build type to Release is another problem. But we have MSBuild scripts now, and we no longer use the Release build from within VS.NET.
I think you should not exaggerate your number of projects/solutions. Componentize what can and will be reused; otherwise, don't componentize!
It will only make things less transparent and increase build times. Partitioning can also be done within a project using folders or a logical class structure.
When deciding how many projects vs. solutions you need, you should consider a few questions:
the logical layers of your application;
the dependencies between projects;
how the projects are built;
who works with which projects.
Currently we have 1 solution with 70 projects.
For our continuous integration we created 5 MSBuild projects, so CI does not build our development solution.
Previously, we had a separate solution for the presentation layer (web and related projects) in a separate git repository. That solution was used by outsourced and freelance web developers.
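For what it's worth, a CI-only MSBuild script of the kind described above can be quite small. A minimal sketch (the project names and paths are placeholders, not our real ones):

```xml
<!-- Build.CI.proj: builds a chosen subset of projects for CI,
     independently of the development solution.
     Invoke with: msbuild Build.CI.proj /p:Configuration=Release -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
         DefaultTargets="Build">
  <ItemGroup>
    <!-- Placeholder project paths -->
    <ProjectsToBuild Include="src\Core\Core.csproj" />
    <ProjectsToBuild Include="src\Services\Services.csproj" />
    <ProjectsToBuild Include="src\Web\Web.csproj" />
  </ItemGroup>
  <Target Name="Build">
    <!-- Project references inside each project still ensure correct build order -->
    <MSBuild Projects="@(ProjectsToBuild)"
             Properties="Configuration=$(Configuration)"
             BuildInParallel="true" />
  </Target>
</Project>
```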
I am working with a solution that currently has 405 projects. On a really fast machine this is doable, but only with current Visual Studio 2017 or 2012. Other versions crash frequently.
I don't think the actual number of solutions matters. Much more important is that you break the thing up along functional lines. As a silly, contrived example if you have a clutch of libraries that handles interop with foreign web services, that would be a solution; an EXE with the DLLs it needs to work would be another.
The only real problem with so many projects in one solution is that the references and build order start to get confusing.
As a general rule I'd gravitate toward decreasing the number of projects (making each project a little more generic), having devs share source control on those projects, and separating the functionality within those projects by sub-namespaces.
You should have as many as you need. There is no hard limit or best practice. Practice and read about what projects and solutions are and then make the proper engineering decisions about what you need from there.
It has been said in other answers, but it is probably worth saying again.
Project dependencies are good in that they can rebuild dependent projects if the dependency has a change.
If you use an assembly file dependency, there is no way for VS or MSBuild to know that a chain of projects needs to be built. What will happen is that on the first build, only the project that has changed will be rebuilt. If you have pointed the dependency at the build output, then at least on the second build the project that depends on it will build. But then you have an unknown number of builds needed to get to the end of the chain.
With project dependencies it will sort it all out for you.
So the answer is: have as many (or as few) solutions as needed to ensure project dependencies are used.
If your team is broken down into functional areas that have a more formal release mechanism, rather than check-in of source code, then splitting along those lines would be the way to go; otherwise, the dependency map is your friend.
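To make the difference concrete, this is roughly how the two reference styles look inside a .csproj (the project name and paths are illustrative):

```xml
<!-- Project reference: MSBuild knows about the dependency and will
     rebuild the whole chain in the correct order. -->
<ProjectReference Include="..\BusinessLayer\BusinessLayer.csproj">
  <Name>BusinessLayer</Name>
</ProjectReference>

<!-- Assembly (file) reference: MSBuild only sees a DLL on disk and has
     no idea that it may be stale. -->
<Reference Include="BusinessLayer">
  <HintPath>..\BusinessLayer\bin\Release\BusinessLayer.dll</HintPath>
</Reference>
```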
Related
So, I have an interesting situation. I have a group that is interested in using CI to help catch developer errors. Great - we are all for that. The problem I am having wrapping my head around is that they want to build interdependent projects in isolation from one another.
For example, we have a solution for our common/shared libraries that contains multiple projects, some of which depend on others in the solution. I would expect that if someone submits a change to one of the projects in the solution, the CI server would try to build the solution. After all, the solution is aware of dependencies and will build things, optimized, in the correct order.
Instead, they have broken out each project and are attempting to build them independently, without regard for dependencies. This causes other errors, because dependent DLLs might not yet exist(!), and a possible chicken-and-egg problem if related changes were made in two or more projects.
This results in a lot of emails about broken builds (one for each project), even if the last set of changes did not really break anything! When I raised this issue, I was told that it was a known issue and that the CI server would "rebuild things a few times to work out the dependencies!"
This sounds like a bass-ackwards way to do CI builds. But maybe that is because I am older and ignorant of newer trends - so, has anyone heard of a CI setup that builds interdependent projects independently? And if so, for what good reason?
Oh, and they are expecting us to use the build outputs from the CI process, and each built DLL gets a potentially different version number. So we no longer have a single version number for the output of all related DLLs. The best we can ascertain is that something was built on a specific calendar day.
I seem to be unable to convince them that this is A Bad Thing.
So, what am I missing here?
Thanks!
Peace!
I second your opinion.
You risk spending more time dealing with the unnecessary noise than with the actual issues. And repeating portions of the CI verification pipeline only extends the overall CI execution time, which goes against the CI goal of reducing the feedback loop.
If anything, you should try to bring as many dependent projects as possible (ideally all of them) under the same CI umbrella, to maximize the benefit of fast feedback on breakages/regressions in any part of the system as a whole.
Personally I also advocate using a gating CI system (based on pre-commit verifications) to prevent regressions rather than just detecting them and relying on human intervention for repairs.
We are thinking of creating a distinct collection for the code that is shared between our software products. We would then have another collection for the software projects. This would be done for organizational purposes. Would it be possible for a solution to reference C# projects that are in two different collections?
A solution can contain files from anywhere, since they need to be mapped to a local disk for Visual Studio to open them, but when you spread the code over multiple collections, Source Control integration will break. Team Explorer requires all code to come from the same collection, and it favors solutions where all code is from a single team project; in those cases the other features will work best.
Other features in VSTS and TFS will be harder to use when your work is spread across multiple projects and/or collections. Test Plans, Agile tools, team capacity and a few other items are scoped purely to the project level. Working in multiple projects at the same time will make it much harder for a team to use the tools properly. Many TFS consultants implement a "one project to rule them all" strategy for these reasons: putting all code and all work in one team project.
The XAML and 2015 build engines can only access sources from a single Project Collection at a time. So while it's possible to set up a number of local workspaces that each map to different team projects and/or project collections, this is not available when using the standard build servers that ship with TFS. Other build tools, such as TeamCity, can be linked to multiple Source Control providers at the same time.
Now that it's possible to have multiple Git Repositories under the same Team Project, that approach makes even more sense.
Other things that will be harder include branching structures: it will be very hard to manage the different versions of your code when they're spread across collections, since labels and branches can't cross collections and are hard to maintain when working in multiple projects.
A better solution for you would probably be to keep your shared code in a separate team project or project collection, but not to include it directly in your other projects in the form of Visual Studio projects. Instead, have your shared projects stored as NuGet packages, either on a private NuGet feed or in a VSTS account with Package Management enabled.
That way your other projects will build faster (the difference can be significant when set up correctly), will use the binary versions of your shared libraries, and will have less unwieldy solutions.
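As a sketch of that setup (the feed URL and package name below are placeholders): the shared libraries get published to a private feed, registered in NuGet.config,

```xml
<!-- NuGet.config: register the private feed (placeholder URL) -->
<configuration>
  <packageSources>
    <add key="InternalFeed" value="https://example.com/nuget/v3/index.json" />
  </packageSources>
</configuration>
```

and the consuming projects then pull them in by version rather than by project reference:

```xml
<!-- In the consuming .csproj: reference the shared library as a binary package -->
<ItemGroup>
  <PackageReference Include="MyCompany.Shared.Data" Version="1.2.3" />
</ItemGroup>
```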
One disadvantage of putting all code in a single team project or project collection is that it's harder to maintain security. Project Collections form a natural boundary inside TFS, one that's harder to accidentally misconfigure or break out of.
As @jessehouwing said, a solution can contain files from multiple collections. Each collection has a local workspace from which you attach projects/files to your solution.
Where I disagree is on whether this is actually a good idea. There is no problem in having 10+ collections, each for a specific purpose.
For example, one for generic projects that nobody will update unless necessary, and one to store scripts. Basically, think of it as a library bookshelf: all broadly similar stuff should be kept close together.
Another plus is security, since one team can have access to some collections but not others.
As for the time difference, do 1-2 seconds really make a difference to anybody? Add one more agent and you will save time, since building locally makes no difference.
As for labels, a specific-purpose strategy helps to overcome those problems.
I presently work with a large solution containing about 100 projects. At least 10 of the projects are executable applications. Some of the library projects are imported as plugins via MEF and reflection rather than through direct references. If a needed plugin's own dependencies are not copied to the output or plugin directory of the executable project using it, we get reflection errors at runtime.
We've already tried or discussed the following solutions, but none of them seem like a good fit:
"Hard" references: Originally, we had the executable projects reference the other projects they needed, even if those were ultimately going to be imported as optional plugins. This quickly fell out of favor with team members who needed to make builds that excluded certain plugins and liked to unload those projects to begin with. It also made it difficult to use ReSharper or other tools to clean unused references and remove obsolete third-party libraries without accidentally blowing away the "unused" references to the needed plugins' own dependencies.
Post-build copying (with pre-build "pull"): For a brief period of time, a senior team member set all the plugin projects to xcopy their outputs to a known "DependencyInjection" folder as a post-build event. Projects that needed those plugins would have pre-build events, xcopying each desired plugin into their own output directories. While this meant that the plugin projects "rightly" had no knowledge of where they might be used, it caused two major headaches. First, any time one made a change in a plugin project, one had to separately build (in sequence) the plugin project and then the executable project to test it in (to get the files to copy over). Rebuild All would be more convenient but far too slow. Second, the continuous integration build would have had to be reconfigured, since it compiled everything into one directory and only cared whether everything built successfully.
Post-build copying (push): The present solution started with xcopy and now mostly uses robocopy in the post-build events of the plugin projects to copy the needed files directly into the plugin folders of the executable projects that use them. This works fairly well, in that if one makes a change in a plugin, one can go straight to running with the debugger. Also, the CI build doesn't break, and users disabling certain "optional" plugin projects for various builds don't get build errors from missing references. Still, this seems hackish, and it is cumbersome to maintain in all the separate post-build event windows, which are rather small and can't be expanded. When executable projects get moved in a restructuring or renamed, we don't find out about the broken references until the next day, after hearing the results of the overnight automated testing.
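For illustration, a "push"-style post-build event of this kind typically looks something like the following (the paths are placeholders). One robocopy quirk worth noting: it returns exit codes up to 7 for successful copies, which MSBuild would otherwise treat as a failed build event, so the code has to be masked:

```bat
rem Post-build event on a plugin project: push the output into the host's plugin folder.
robocopy "$(TargetDir)." "$(SolutionDir)HostApp\bin\$(ConfigurationName)\Plugins" /XO /NP
rem Robocopy exit codes 0-7 mean success; 8 and above are real errors.
if %errorlevel% leq 7 exit /b 0
```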
"Dummy" projects with references: One idea that was briefly tossed about involved making empty projects for each of the different executable build configurations and going back to the hard-references method on those. Each would use its own references to gather up the plugins and their dependencies. It would also have a reference to the actual executable and copy it over. Then, if one wanted to run a particular executable in a particular configuration, one would run its dummy project. This one seemed particularly bloated and was never attempted.
NuGet: In my limited familiarity with NuGet, this seems like a good fit for using packages, except I wouldn't know how to implement that internal to one solution. We've talked about breaking up the solution, but many members of the team are strongly opposed to that. Is using NuGet with packages coming from within the same solution possible?
What are the best practices for a situation like this? Is there a better way to manage the dependencies of reflected dependencies like this than any of the above, or is a refinement of one of the above the best choice?
OK, so I assume in this answer that each developer needs to constantly have all 100 assemblies (in Debug mode) locally to do their job (develop, compile, smoke test, run automated tests).
You mention that Rebuild All takes a long time. Generally this symptom is caused by too many assemblies plus a build process that hasn't been rationalized. So the first thing to do is to try to merge the 100 assemblies into as few assemblies as possible, and to avoid things like Copy Local = true. The effect will be a much faster (like 10x) Rebuild All process. Keep in mind that assemblies are physical artefacts and are useful only for physical concerns (like plug-ins, loading on demand, test/app separation...). I wrote a white book that details my thoughts on the topic: http://www.ndepend.com/WhiteBooks.aspx
Partitioning code base through .NET assemblies and Visual Studio projects (8 pages)
Common valid and invalid reasons to create an assembly
Increase Visual Studio solution compilation performance (up to x10 faster)
Organize the development environment
One of the pieces of advice in the white book is to avoid referencing projects and to reference assemblies instead. This way it becomes your responsibility to fill in Project > right click > Project Dependencies, which defines Project > right click > Project Build Order. If you decide to keep dealing with 100 assemblies, defining this setting takes some effort, but as a bonus a high-level (executable) project can depend on a library used only via reflection, which will solve your problem.
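For reference, the Project Dependencies dialog stores its result in the .sln file as a ProjectSection; a fragment looks roughly like this (the project name and the second GUID are placeholders, while the first GUID is the standard C# project type):

```
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "HostApp", "HostApp\HostApp.csproj", "{11111111-1111-1111-1111-111111111111}"
	ProjectSection(ProjectDependencies) = postProject
		{22222222-2222-2222-2222-222222222222} = {22222222-2222-2222-2222-222222222222}
	EndProjectSection
EndProject
```

With an entry like this in place, the build order treats the listed project as a prerequisite of HostApp even though HostApp holds no reference to it.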
Did you measure your code base in terms of # of PDB sequence points? I estimate that up to a limit of 200K to 300K, doing a Rebuild All (with the optimizations described in the white book) should take 5 to 10 seconds (on a decent laptop) and remain acceptable. If your code base is very large and goes beyond this limit, you'll need to break my first assumption and find a way for a developer not to need all assemblies to do their job (in which case we can talk about this further).
Disclaimer: This answer references resources from the site of NDepend, a tool that I created and whose development I now manage.
I have been in a situation like yours. We had almost 100 projects, and we too were using MEF and System.AddIn. In the beginning we had a few solutions. I was working on the core solution, which included the core assemblies and their tests. Each plug-in category had a separate solution, which included the contracts, the implementations (some plug-ins had more than one) and tests, plus some test host, as well as the core assemblies. At some later point we added a solution that included all projects, and after trying a few of the approaches you mention, we decided to do the following:
1. Keep the references that are mandatory;
2. All executable projects were set to output to common locations (one for the Debug and one for the Release configuration);
3. All projects that should not be referenced were set to output to these same common locations;
4. All projects that were referenced by others were left unchanged, and each reference was set with Copy Local = true;
5. Tests were left unchanged.
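A sketch of what the common-output-location settings above can look like in a .csproj (the shared folder path is an assumption):

```xml
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
  <OutputPath>..\..\Output\Debug\</OutputPath>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
  <OutputPath>..\..\Output\Release\</OutputPath>
</PropertyGroup>
```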
Although building everything was slow, we didn't have any other problems. Of course, having almost 100 projects is a sign that the design is probably too modular, and, as Patrick advises, we should have tried to compact it.
Anyway, you could try this approach in a couple of hours, and perhaps, instead of setting Copy Local = true, try setting the output folder of all the projects mentioned in 4 to the common locations. We didn't know at the time that this setting would slow down the build process, as Patrick mentions.
PS. We never tried using NuGet because we didn't have enough resources and time to experiment with it. It looked promising though.
We are starting up a new project and I am looking for the "best practices" solution to this same problem. For us, the projects can be divided into two categories: 1) the Platform assemblies, which provide a common set of services across the board, and 2) the Verticals, which perform business-specific functions.
In the past we have used a Visual Studio plug-in with a simple UI that allowed developers to specify a common assemblies path to copy the output assemblies to, and then to reference all assemblies (wherever they reside, even in a different solution) from that common assemblies folder.
I am looking at NuGet, but the sheer amount of work you have to do to create and maintain NuGet packages is punitive.
It's a very common scenario, and I would be really interested to see how others have addressed it.
There has been some discussion of abandoning our CI system (Hudson, FWIW) due to the fact that our projects are somewhat segmented. Without revealing too much, you can think of each project as similar to a web site project: it has dependencies, its own unit tests, etc.
It seems like one of the major benefits of CI is making sure that the components of a project work together, but aside from project inheritance, most of our projects are standalone and fairly well unit tested.
Given what I have explained here (the oddity of our project organization), can anyone explain the benefits of CI for segmented/modular/many projects?
So far as I can tell, this is the only good reason I've found:
“Bugs are also cumulative. The more bugs you have, the harder it is to remove each one. This is partly because you get bug interactions, where failures show as the result of multiple faults - making each fault harder to find. It's also psychological - people have less energy to find and get rid of bugs when there are many of them - a phenomenon that the Pragmatic Programmers call the Broken Windows syndrome.”
From here: http://martinfowler.com/articles/continuousIntegration.html#BenefitsOfContinuousIntegration
I would use Hudson for the following reasons:
Ensuring that your projects build/compile properly.
Building jobs dependent on the build success of other jobs.
Ensuring that your code adheres to agreed-upon coding standards.
Running unit tests.
Notifying development team of any issues found.
If the number of projects steadily increases, you will find the need to be able to manage each one effectively, especially considering the above reasons for doing so.
In your situation, you can benefit from CI in (at least) these two ways:
You can let the CI server run certain larger test suites automatically after each Subversion/... check-in - especially those that test the interaction of different modules, hence the name continuous integration. This takes the maintenance work and waiting time away from the developers when they consider a check-in. Some CI servers (e.g. Hudson) can also be configured to automatically build modules when a module they depend on is built. This way you can let it automatically test whether dependent modules are compatible with the new version of the changed one.
You can let the CI server publish the new artifacts to the repository of a dependency resolver (e.g., Ivy or Maven). This way, the various modules can automatically download the latest (stable) revisions of the modules they depend on. Combine this point with the previous one and imagine the possibilities (!!!).
Scenario:
Currently we have a single solution supporting a single deployable entity, a WinForms/WPF client. This solution has a layered architecture, with projects representing the various layers (Data, Business, UI). We will be introducing other deployable entities, such as a LiteClient, a Server and an RIA.
We are considering a restructure where we will have multiple solutions, one per deployable entity (Client Solution, Server Solution, RIA Solution etc), these solutions will share various projects, e.g the Data layer project. This is essentially the Partitioned Single Solution recommended by Microsoft's P&P group (http://msdn.microsoft.com/en-us/library/Ee817674(pandp.10).aspx)
Question:
Apart from the admin overhead of this approach, are there any serious real-world gotchas waiting for us if we adopt it?
This is a reasonable and normal approach. At the end of the day, a Visual Studio solution is simply a collection of projects with deployment information.
In order to reduce possible dependency issues, make sure you use a build server (TFS or otherwise) that takes care of compiling and deployments.
If you change something within the shared projects, you have to check that it didn't break the dependent projects. If you keep those projects in separate solutions, you have to remember to recompile them each time you modify the shared projects.
That's a drawback that I can see.