How To Handle Shared Library Version Bump In a Multi Repo? - microservices

A little context: today I have a monolith application that I am planning to split into micro-services due to its growth and the need for partial re-deployments.
I'm designing a development process where I have a number of micro-services in a multi-repo environment (all written in Python).
There is one "Foundation" repository that stores ~30 different packages of shared code.
Each micro-service is stored in a different repo.
The plan is to share the code using a package manager (a private PyPI), where each package manages its own semantic version and, on change, is packed and published.
An example of the packages and dependencies:
"Foundation":
Package 1
Package 2 --> 1
Package 3 --> 1,2
Package 4
"Micro-service 1"
Package 5
Package 6 --> 5,1,3
"Micro-service 2"
Package 7 --> 4
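For reference, a minimal sketch of how Package 3 might declare its dependencies in its setup.py under this plan (all names, versions and ranges here are illustrative placeholders, not taken from a real repo):

# Sketch of a setup.py for Package 3; names and version ranges are placeholders.
from setuptools import setup, find_packages

setup(
    name="foundation-package3",
    version="1.4.0",  # bumped whenever Package 3 itself changes
    packages=find_packages(),
    install_requires=[
        # "~=1.2" means ">=1.2, <2.0": any later minor or patch release of
        # Package 1 satisfies this range without editing Package 3.
        "foundation-package1~=1.2",
        "foundation-package2~=2.0",
    ],
)

With compatible-release ranges like these, pip resolves a single version of Package 1 per environment, which also bears on the direct-vs-indirect concern below.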
I'm trying to understand how to handle the flow where Package 1 introduces a new change (say, a feature change that causes a minor version bump). How should I keep up with updating all the dependent libraries, considering:
Package 3 depends on 1 both directly and indirectly; I want to avoid a case where the two paths resolve to different versions, as that would cause a problem in deployment.
Updating all the relevant micro-services; for example, Package 6 depends on 1.
Every dependent package may introduce a different version bump; some of them may only need a patch-level change in response to Package 1's minor change.
It seems that in a fresh development process, where there are a lot of changes, it will be a nightmare to keep updating everything.
I've seen various scripts and tools, like Lerna for npm, that are supposed to help with this, but I still don't understand what the best strategy is for fresh development projects.
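To make the cascade concrete, here is a minimal sketch (plain Python, not an existing tool) of the kind of helper I have in mind: given the dependency graph above, it lists every package that needs a re-release when one of them changes. The hard-coded graph is a stand-in for metadata that would really be parsed from each package's setup.py:

# Minimal sketch (not an existing tool): compute which packages depend,
# directly or transitively, on a changed package. The graph literal below
# is a stand-in for metadata parsed from each package's setup.py.
from collections import defaultdict

DEPENDS_ON = {
    "package1": [],
    "package2": ["package1"],
    "package3": ["package1", "package2"],
    "package4": [],
    "package5": [],
    "package6": ["package5", "package1", "package3"],
    "package7": ["package4"],
}

def dependents_of(changed):
    """Return every package that directly or transitively depends on `changed`."""
    reverse = defaultdict(set)
    for pkg, deps in DEPENDS_ON.items():
        for dep in deps:
            reverse[dep].add(pkg)
    result, stack = set(), [changed]
    while stack:
        for pkg in reverse[stack.pop()]:
            if pkg not in result:
                result.add(pkg)
                stack.append(pkg)
    return result

if __name__ == "__main__":
    print(sorted(dependents_of("package1")))

Running it for package1 prints ['package2', 'package3', 'package6'], i.e. the packages that would need at least a patch-level re-release.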

Related

How to manage stable binaries and avoid risk of CI rebuilds when install packaging?

I am looking for a tool to manage the collection of binary files (input components) that make up a software release. This is a software product and we have released multiple versions each year for the last 20 years. The details and types of files may vary, but this is something many software teams need to manage.
What's a Software Release made of?
A mixture of files go into our software releases, including:
Windows executables/binaries (40 DLLs and 30+ EXE files).
Scripts used by the installer to create a database
API assemblies for various platforms (.NET, ActiveX, and Java)
Documentation files (HTML, PDF, CHM)
Source code for example applications
The full collected files for a single version of the release are about 90MB. Most are built from source code, but some are 3rd party.
Manual Process
Long ago we managed this manually.
1. When starting each new release, the files used to build the last release would be copied to a new folder on a shared drive.
2. The developers would manually add or update files in this folder (hoping nothing was lost or deleted accidentally).
3. The software installer script would be compiled using the files in this folder to produce a SETUP.EXE (output).
4. Iterate steps 2 and 3 during validation & testing until release.
Automatic Process
Some years ago we adopted CI (building our binaries nightly or on-demand).
We resorted to putting 3rd party binaries under version control since they usually don't change as often.
Then we automated the process of collecting & updating files for a release based on the CI build outputs. Finally we were able to automate the construction of our SETUP.EXE.
Remaining Gaps
Great so far, but this leaves us with two problems:
Rebuilding Assemblies: The CI mostly builds projects when something has changed, but when forced it will re-compile a binary that doesn't have any code change. The output is a fresh build of a binary we've previously tested (hint: should we always trust these are equivalent?).
Latest vs Stable: Mostly our CI machine builds the latest versions of each project. In some cases this is ok, but often we want to release an older tested or stable version. To do this we have separate CI projects for the latest and stable builds - this works but is clumsy.
Thanks for your patience if you've got this far :-)
I Still Haven't Found What I'm Looking For
After some time searching for solutions it seems it might be easier to build our own solution, but surely someone else has solved these problems before!?
What we want is a way to store and manage binary files (either outputs from CI, or 3rd party files) such that each is tagged with a version (v1.2.3.4) that allows:
The CI to publish new versions of each binary (but reject rebuilt versions that already exist).
The development team to make a recipe for a software release (kinda like NuGet packages.config) that specifies the components to include:
    package name
    version
    path/destination in the release folder
The automatic package script to use the recipe to collect the required files and compile the install package (e.g. SETUP.EXE). A rough sketch of this recipe idea follows below.
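To show roughly what such a recipe could look like, here is a minimal sketch in Python (purely illustrative: the artifact-store layout, paths and component names are all made up, and no existing tool is implied). A plain list of name/version/destination entries drives a script that copies the matching published files into a release staging folder:

# Illustrative sketch only: a release "recipe" plus a collection script.
# The artifact store layout (<store>/<name>/<version>/...) and all names,
# versions and paths are assumptions for the example.
import shutil
from pathlib import Path

ARTIFACT_STORE = Path(r"\\server\artifact-store")   # assumed layout: <name>/<version>/*
RELEASE_ROOT = Path(r"C:\releases\v1.2.3.4")

RECIPE = [
    {"name": "CoreEngine",   "version": "1.2.3.4", "dest": "bin"},
    {"name": "InstallerSql", "version": "1.2.0.0", "dest": "db/scripts"},
    {"name": "Docs",         "version": "1.2.3.0", "dest": "doc"},
]

def collect(recipe):
    for item in recipe:
        src = ARTIFACT_STORE / item["name"] / item["version"]
        dst = RELEASE_ROOT / item["dest"]
        dst.mkdir(parents=True, exist_ok=True)
        for f in src.iterdir():           # fails loudly if the version was never published
            if f.is_file():
                shutil.copy2(f, dst / f.name)

if __name__ == "__main__":
    collect(RECIPE)
    # A real pipeline would then run the installer compiler against RELEASE_ROOT
    # to produce SETUP.EXE.

A store laid out per name/version like this would also let the CI publish step refuse to overwrite a version folder that already exists, which covers the "reject rebuilt versions" point above.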
I am aware of past debates about storing binaries in a VCS. For now I am looking for a better solution. That approach does not appear ideal for long-term ongoing use (e.g. how to prune old binaries)... amongst other issues.
I have tried some artifact repositories currently available. From my investigation these provide a solution for component/artifact storage and version control. However they do not provide tools for managing a list of components/artifacts to include in a software release.
Does anybody out there know of tools for this?
Have you found a way to get your CI infrastructure to address these remaining issues?
If you're using an artifact repository to solve this problem, how do you manage and automate the process?
This is a very broad topic, but it sounds like you want a release management tool (e.g. BuildMaster, developed by my company Inedo), possibly in conjunction with a package management server like ProGet (which you tagged, and is how I discovered this question).
To address some of the specific questions you have, I'll associate each with a feature that would solve the problem:
A mixture of files go into our software releases, including...
This is handled in BuildMaster with artifacts. This video gives a basic overview of how they are manually added to releases and deployed to a file system: https://inedo.com/support/tutorials/buildmaster/deployments/deploying-a-simple-web-app-to-iis
Of course, once that works to satisfaction, you can automate the import of artifacts from your existing CI tool, create them from a BuildMaster deployment plan itself, pull them from your package server, whatever. Down the line you can also have your CI tool call the BuildMaster release management API to create a release and automatically have it include all the artifacts and components you want (this is what most of our customers do now, i.e. have a build step in TeamCity create a release from a template).
Rebuilding Assemblies ... The output is a fresh build of a binary we've previously tested (hint: should we always trust these are equivalent?)
You can mostly assume they are functionally equivalent, but it's precisely the times when they are not that problems arise. This is especially true with package managers that do not lock dependencies to specific version numbers (i.e. NuGet, npm). You should be releasing exactly the same binary that was tested in previous environments.
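One tooling-agnostic way to enforce that guarantee is to compare checksums instead of trusting a rebuild; a minimal sketch (plain Python, not a feature of any particular product, with placeholder paths):

# Minimal sketch: verify the artifact being promoted is byte-for-byte the one
# that was tested, by comparing SHA-256 digests. Paths are placeholders.
import hashlib
from pathlib import Path

def sha256_of(path):
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def assert_same_binary(tested, candidate):
    if sha256_of(tested) != sha256_of(candidate):
        raise RuntimeError(f"{candidate} is not the binary that was tested")

if __name__ == "__main__":
    assert_same_binary(r"\\qa-share\v1.2\app.exe", r"\\ci-out\latest\app.exe")

If the digests differ, the promotion stops and someone has to decide consciously whether the rebuilt binary should be re-tested.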
[we want] the development team to make a recipe for a software release (kinda like NuGet packages.config) that specifies components to include:
This is handled with releases. A developer can choose its name, dates, etc., and associate it with a pipeline (i.e. a set of testing stages that the artifacts are deployed to), then can "click the deploy button" and have the automation do all the work.
Releases are grouped by "application", similar to a project in TeamCity. As a more advanced use case, you can use deployables. Deployables are essentially individual components of an application you include in a release; in your case the "Documentation" could be a deployable, and maybe contain an artifact of the .pdf and .docx files. Deployables from other applications (maybe a different team is responsible for them, or whatever) can then be referenced and "included" in a release, or you can reference ones from a past release.
Hopefully that provides some overview and fits your needs. Getting into this space is a bit overwhelming because there are so many terms, technologies, and methodologies, but my advice is to start simple and then slowly build upon it, e.g.:
deploy a single, manually uploaded component through BuildMaster to a share drive, then manually deploy it from there
add a deployment plan that imports the component
add a second plan and associate it with the 2nd stage that takes the uploaded artifact and deploys it to the target, bypassing the need for the share drive
add more deployment plans and associate them with pipeline stages and promote through them all to "close out" a release
add an agent and deploy to that instead of the default localhost server
add more components and segregate their deployment with deployables
add event listeners to email team members at points in the process
start adding approvals if you require gated "sign-offs"
and so on.

Package version management in Go 1.5

I'm getting my hands dirty with Go, and while I understand and appreciate the principle of simplicity that Go was built upon, I'd like to grasp the rationale behind forgoing a built-in package versioning method in their dependency-fetching tool go get and the import statement.
If I understand correctly, go get and import fetch the package from HEAD and they are unable to refer to a branch or a tag. While there are tools like gopkg.in that circumvent this limitation, the official toolchain:
Forces developers to create separate repos for major (breaking) versions of their products.
Doesn't allow consumers to downgrade between minor or micro versions in case bugs are found in newer ones.
Truth be told, things are not so easy, because package versioning would require a strategy to deal with conflicting transitive dependencies, e.g. X depends on A and B, each of which depends on a different version of C.
Coming from a Java background, it does appear that this limitation poses some risks and problems, amongst others:
Product/package evolution and breakage of the public APIs of 3rd-party deps is unavoidable; therefore, versioning must be a first-class citizen in the toolchain IMHO.
The Git-repo-per-version policy is highly inefficient:
The overall Git history of the package is lost or scattered across repos (merges between versions, backports, etc.)
Conflicts with transitive dependencies may still occur, and will go undetected because neither the language nor the toolchain imposes any semantics to allow detection in the first place.
Enterprise adoption may be hindered and development teams may shy away from the language, given that:
Always dragging in HEAD means that they can't control or freeze their 3rd party deps, leading to a potentially unpredictable end product.
They may lack the manpower to keep their product constantly updated and tested against upstream's HEAD (not every company in the world is Google :)).
While I do understand that the latter risk can be – and must be – mitigated with Continuous Integration, it does not solve the underlying root of the problem.
What information am I missing? How do you deal with package upstream changes when deploying Go in an enterprise with limited manpower?
It is being addressed by vendoring, which is part of Go 1.5 as an experimental feature; it can be enabled by running the go command with GO15VENDOREXPERIMENT=1 in its environment, and it will be a "full" feature in Go 1.6. Also see Vendor Directories.
The original discussion that led to the Go 1.5 Vendor Experiment can be found here.
The essence of vendoring is that you create a folder named vendor and put in it the exact versions of the packages that your code relies on. Code inside the vendor folder is only importable by code in the directory tree rooted at the parent of vendor, and you can import packages from vendor with an import path as if vendor were the workspace/src folder (that is, with an import path that omits the prefix up to and including the vendor element).
Example:
/home/user/goworkspace/
    src/
        mymath/
            mymath.go
            vendor/
                github.com/somebob/math/
                    math.go
In this example github.com/somebob/math is an external package used by mymath package (from mymath.go). It can be used from mymath.go if it is imported like:
import "github.com/somebob/math"
(And not as import mymath/vendor/github.com/somebob/math which would be bad.)
While Go doesn't come with a standard package manager, there are enough options to make builds reproducible (even in an enterprise with limited manpower).
1. Vendoring, which is described in another answer by @icza. This is almost a complete equivalent of checking in versioned JAR files in Java. It was a very common approach with the Ant build tool before Maven became popular. Actually, vendoring is much better because you cannot lose the source code.
2. A slight variation of the first option: instead of checking in vendored source code, you can populate the vendor folder during the build by checking out predefined versions of the dependencies. There are tools (e.g. glide) that automate this process; a rough sketch of such a build step follows these options.
3. Finally, you can maintain predefined versions of all 3rd-party libraries in an internal repository and add it to the GOPATH. This approach is described in detail in https://blog.gopheracademy.com/advent-2015/go-in-a-monorepo/
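For completeness, a rough sketch of the second option's build step (plain Python rather than glide, with a made-up manifest; the import path and tag are placeholders):

# Rough sketch of a pre-build step that populates vendor/ from pinned versions.
# The manifest format (a dict of import path -> git tag) is an assumption.
import shutil
import subprocess
from pathlib import Path

PINNED_DEPS = {
    "github.com/somebob/math": "v1.3.0",
}

def populate_vendor(project_root):
    vendor = project_root / "vendor"
    for import_path, tag in PINNED_DEPS.items():
        dest = vendor / import_path
        if dest.exists():
            shutil.rmtree(dest)
        dest.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(
            ["git", "clone", "--depth", "1", "--branch", tag,
             "https://" + import_path + ".git", str(dest)],
            check=True,
        )
        shutil.rmtree(dest / ".git")  # vendor only the sources, not the Git history

if __name__ == "__main__":
    populate_vendor(Path("."))

The pinned tags live under version control, so every build vendors exactly the same dependency sources.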
Note that incompatible transitive dependencies are not specific to Go. They exist in Java (and most other languages) too, though Java has a mechanism to partially solve this problem by making programs more complex - class loaders. Note that Go will report all incompatibilities at compile time while in Java some of the incompatibilities are triggered only at run time (because of late linking).
The Java toolchain doesn't have a concept of versions; that is provided by an external tool, Maven. I believe that by the time Go becomes more mature and popular, a similar standard dependency-management tool will emerge.

How to manage multiple versions of binary dependencies in TFS 2012?

I'm managing the release process for a couple of projects that target an external API. A typical scenario is that a single solution targets a particular version, say v1, of a 3rd-party runtime in production and a newer version (v2) in the development phase. I have to maintain the v1 dependencies for production support but also v2 for a DEV branch. Those scenarios may get even more complex depending on the rollout plan.
I tried branching + NuGet, but the problem is that the API I use is huge and it is hard to scope a NuGet package. Putting everything into one package makes no sense for smaller projects; on the other hand, depending on what features we integrate, the combination of DLLs may vary a lot, and they are not nicely separated into closed concerns.
On top of that, we usually have multiple solutions that use those APIs.
I was thinking about building an API version repository in TFS, in some form like:
- myAPI
  |- v1
  |- v2
  |- v3
Is there a way to configure a build process to look inside the server for referenced DLL files depending on the build setup? I can obviously maintain multiple builds in the system, but I don't know how to provide the referenced files' location for each individual build.

Managing multiple versions of internal (private) NuGet packages

Our development team has been fairly small and, until now, all working on a single Visual Studio 2012 solution. We are growing and wanting to create better separation with multiple solutions for different project teams.
However, there are occasions where the code in one solution will want to utilize code from another. We have decided using internal (i.e. private) NuGet packages will be a good way to manage these dependencies.
However, the question has come up of how to deal with multiple versions of the same package that are in different SDLC stages (e.g. Development, QA, Staging, Production, etc.).
Example: If we have these three solutions...
CoreStuff
CoolProject1
CoolProject2
If we're working in CoolProject1 and need to utilize code from CoreStuff, we can add the NuGet package. Presumably this package will be the latest Production (stable) version of CoreStuff.
However, what if a developer working on CoolProject2 is aware of some changes in CoreStuff that are currently in Development and wants to utilize that version?
Not sure if the best approach is to create separate packages for each (seems to require changing your package references back and forth depending on what stage the solution is in) or somehow utilize multiple versions of the same package (not sure if that's easy to manage with NuGet).
Anyone tackle something like this?
The first thing to remember is that NuGet will not automatically update your package references, so if you have already 'linked' your solution to the latest stable package of CoreStuff (say 1.2.2) then there won't be any problems if a newer (unstable) version is provided (assuming that the package you're using doesn't disappear from the package repository). Obviously if you upgrade your package reference then you will get the unstable package.
So the simplest solution is to make sure that you 'link' your project to the stable package by getting it via the NuGet package manager before the other package is released. While the UI only allows you to get the latest version, the Package Manager Console can get any version of a package so you could use that to explicitly provide the version number, e.g.:
Install-Package CoreStuff -Version 1.2.2 -Project CoolProject1
If that is not a solution then there are several other options to tackle this problem:
Give the development version a different semantic version that indicates it is an unstable version, e.g. 1.2.3-alpha. In this case CoolProject1 could pull in package CoreStuff.1.2.2 (which should be the latest stable version in your repository) and CoolProject2 could pull in CoreStuff.1.2.3-alpha (which would be the latest unstable version).
Have multiple repositories, e.g. one for stable (released) packages and one for unstable (development) versions. Then you can select your packages from the desired repositories. If you wanted to you could make it so that only your release process can push packages up to the stable repository and your CI build pushes up to the unstable one (so that you always have the latest packages available)
If the developer of CoolProject2 just wants to develop against the latest version (but will wait to release CoolProject2 until after CoreStuff v.next has been released) then he could potentially create a local package repository (i.e. a directory on his drive) and put the new package of CoreStuff there. That way other developers won't even see the package.
The most important thing will be to make sure that you don't get CoreStuff.1.2.2 and CoreStuff.v-next in the same repository if CoreStuff.v-next simply has a higher version number, because in that case the NuGet UI won't let you pick v1.2.2 (but the Package Manager Console does!).
If you would want to switch from one package type to another you'd have to do a manual update (which you always have to do when changing to the next package version anyway), but that's not a bad thing given that this forces a developer to at least check that the update of the package doesn't break anything.

Why we need a package manager like Nuget?

I know package managers like NuGet help us when we want to use third-party components.
From the NuGet CodePlex page:
NuGet is a free, open source developer focused package management
system for the .NET platform intent on simplifying the process of
incorporating third party libraries into a .NET application during
development.
There are a large number of useful 3rd party open source libraries out
there for the .NET platform, but for those not familiar with the OSS
ecosystem, it can be a pain to pull these libraries into a project.
Let’s take ELMAH as an example. It’s a fine error logging utility
which has no dependencies on other libraries, but is still a challenge
to integrate into a project. These are the steps it takes:
Find ELMAH
Download the correct zip package.
“Unblock” the package.
Verify its hash against the one provided by the hosting environment.
Unzip the package contents into a specific location in the solution.
Add an assembly reference to the assembly.
Update web.config with the correct settings which a developer needs to search for.
And this is for a library that has no dependencies. Imagine doing this
for NHibernate.Linq which has multiple dependencies each needing
similar steps. We can do much better!
NuGet automates all these common and tedious tasks for a package as
well as its dependencies. It removes nearly all of the challenges of
incorporating a third party open source library into a project’s
source tree
These steps are simple tasks that we do when we want to set up a project. Is it only for automating the addition of 3rd-party components and decreasing the chance of errors in configuration files? Or does it have many more responsibilities?
Its value is hidden in plain sight: a package manager such as NuGet helps you deal with software dependencies using automation. Many assume it's only meant for open source or third-party components, but you could equally well use it for your own internal packages.
The great thing about NuGet is (to name a few benefits):
NuGet encourages reuse of components because you implicitly rely on actual "releases" (even if pre-release), instead of branching sources
you can get rid of binaries bloating your VCS repositories (package restore feature)
it forces package creators to think about the way the package will be consumed, and leaves them dealing with the configuration of the component during package installation (who knows better how to configure the package than the package creators?). Think about ELMAH as an example.
automating package creation and publication on a package repository effectively is a form of continuous delivery (for software components). OctopusDeploy even takes it a step further and enables packaging entire Web sites ready for deployment.
NuGet encourages and sometimes enforces you to follow some ALM best practices. E.g. a package has a version, so you have to think about your versioning strategy (e.g. SemVer.org)
NuGet integrates with SymbolSource.org (which also has a Community edition to set up your own): this allows one to easily debug released packages without having to ship this info all the time
having one or more package repositories makes it easy for the organization to maintain a dependency matrix, or even build an inventory of OSS licenses that are in use by several projects
NuGet notifies you about available package updates
Creating packages makes people think about component architecture (all dependencies should be packaged as well)
Dependencies of a package are automatically resolved (so you can't forget any)
NuGet is smart enough to add assembly binding redirects when required
The above list is non-exhaustive, but I hope I covered the key benefits in this answer. I'm sure there are more.
Cheers,
Xavier
A reason to use NuGet is that you don't have to ship all the libraries with your project, reducing the project size. With NuGet Power Tools, by specifying the package versions in the packages.config file, you will be able to download all the required libraries the first time you run the project.
Live example: reduced project size matters when deploying a project. If a solution has 500 MB of code and 200 MB of packages, the extra 200 MB really costs when uploading the project each time. Instead of uploading the actual DLL files, we just need to set their references in the packages.config file.
