How to control pnpm workspace build order - pnpm

I'm working in a large pnpm monorepo (pnpm v7).
The repository contains:
multiple apps
multiple shared dependencies
tooling (shared compilation setup package)
Every package (app or shared library) may reference any shared library (there are no circular references, though).
The compilation setup package is a gulp helper that sets up the compilation tasks for each app and some shared components.
This means that the build order should be:
tooling/compiler
packages/shared1
packages/shared2
apps/app1
apps/app2
Although dependencies are declared properly in every package.json file (either as a devDependency or a dependency), running pnpm -r run build seems to build projects in random order. The result is that the build fails, complaining that some dependent packages are missing.
I thought pnpm was supposed to deal with build order. Is there anything I'm missing?
Should I move to more complex tools like turborepo or rush?

In the end, I concluded that pnpm alone is not able to provide that kind of control.
I solved my issue by adding a new tool on top of my repo.
I first gave turbo a chance, then nxdev. Both are able to detect the dependency graph and build in the correct order.
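For illustration, the ordering that such tools derive from the dependency graph can be sketched as a topological sort. This is only a minimal sketch, not how turbo or nx are implemented; the dependency edges below are assumptions made up to match the build order listed in the question, and in a real repo they would be read from each package.json.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: package -> packages it depends on.
# These edges are assumed for illustration only.
DEPENDENCIES = {
    "tooling/compiler": set(),
    "packages/shared1": {"tooling/compiler"},
    "packages/shared2": {"tooling/compiler", "packages/shared1"},
    "apps/app1": {"tooling/compiler", "packages/shared1", "packages/shared2"},
    "apps/app2": {"tooling/compiler", "packages/shared2"},
}


def build_order(dependencies):
    """Return a list in which every package appears after its dependencies.

    Raises graphlib.CycleError if the graph contains a circular reference.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(DEPENDENCIES)
print(order)
```

The relative order of apps/app1 and apps/app2 is not fixed by the graph, so either may come first; only the "dependencies before dependents" property is guaranteed.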

Related

Go binaries for concourse ci tasks

What are some good patterns for using Go in concourse-ci tasks? For example, should I build files locally with all dependencies and check cross-compiled binaries in to the repo? Or should I build on Concourse prior to running the task?
Examples of what people do here would be great. Public repos of pipelines/tasks even better.
The way I see it there are currently 3 options for handling go builds:
Use vendoring
Explicitly declare the dependencies as concourse resources
Maintain a docker image with the required dependencies already included
All options have pros and cons. The first option is currently my favorite since the responsibility for handling dependencies is up to the project maintainers and there is a very clear way to see which versions/revisions of the dependencies are being used - i.e. just check the vendoring tool config - but it does force you to have all dependency code in the project's repo.
The second option follows the go "philosophy" of always tracking master, but it may lead to slower builds (concourse will need to check every single resource regularly) and may lead to sudden breakage because of changes in dependencies.
The third option allows you to implicitly fix the revision of the dependencies in the docker image; in that respect it's similar to the first. However, it requires maintaining docker images (not necessarily one per project, but possibly more than one, depending on the number of projects that use this option and whether they have conflicting dependency versions).

Build dependencies and local builds with continuous integration

Our company currently uses TFS for source control and build server. Most of our projects are written in C/C++, but we also have some .NET projects and wouldn't want to be limited if we need to use other languages in the future.
We'd like to use Git for our source control and we're trying to understand what would be the best choice for a build server. We have started looking into TeamCity, but there are some issues we're having trouble with which will probably be relevant regardless of our choice of build server:
Build dependencies - We'd like to be able to control the build dependencies for each <project, branch>. For example, have <MyProj, feature_branch> depend on <InfraProj1, feature_branch> and <InfraProj2, master>.
From what we’ve seen, to do that we might need to use Gradle or something similar to build our projects instead of plain MSBuild. Is this correct? Are there simpler ways of achieving this?
Local builds - Obviously we'd like to be able to build projects locally as well. This becomes somewhat of a problem when project dependencies are introduced, as we need a way to reference these resources or copy them locally for the build to succeed. How is this usually solved?
I'd appreciate any input, but a sample setup which covers these issues will also be a great help.
IMHO both issues you mention really fall into the configuration management category and are thus, as you say, unrelated to the build server choice.
A workspace for a project build (doesn't matter if centralized or local) should really contain all necessary resources for the build.
How can you achieve that? Have a project "metadata" git repo with a "content" file containing all your project components and their dependencies (each with its own git/other repo) and their exact versions - effectively tying them together coherently (you may find it useful to store other metadata in this component down the road as well, like component specific SCM info if using a mix of SCMs across the workspace).
A workspace pull wrapper script would first pull this metadata git repo, parse the content file, and then pull all the other project components and their dependencies according to the content file info. Any build in such a workspace would have all the parts it needs.
When the time comes to modify either the code in a project component or the version of one of the dependencies, you'll also need to update this content file in the metadata git repo to reflect the update and commit it - this is how your project makes progress coherently, as a whole.
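A workspace pull wrapper along these lines could be sketched as follows. The content file format (JSON with name, url and ref fields), the example.com URLs and the workspace directory name are all assumptions for illustration; the component names and branches are borrowed from the question. The sketch only builds the git commands rather than running them.

```python
import json

# Hypothetical content file from the metadata repo; format and URLs are assumed.
CONTENT = """
{
  "components": [
    {"name": "MyProj", "url": "https://example.com/git/MyProj.git", "ref": "feature_branch"},
    {"name": "InfraProj1", "url": "https://example.com/git/InfraProj1.git", "ref": "feature_branch"},
    {"name": "InfraProj2", "url": "https://example.com/git/InfraProj2.git", "ref": "master"}
  ]
}
"""


def pull_commands(content_text, workspace="workspace"):
    """Turn the content file into the git commands a pull wrapper would run."""
    commands = []
    for component in json.loads(content_text)["components"]:
        target = f"{workspace}/{component['name']}"
        # Clone the component, then check out the exact ref recorded for it.
        commands.append(["git", "clone", component["url"], target])
        commands.append(["git", "-C", target, "checkout", component["ref"]])
    return commands


for command in pull_commands(CONTENT):
    print(" ".join(command))
```

A real wrapper would pass each command to subprocess.run instead of printing it, and would likely record commit hashes or tags rather than branch names so that the workspace is reproducible.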
Of course, actually managing dependencies is another matter. Tons of opinions out there, some even conflicting.

Project structure. Scientific Python projects

I am looking for a better way to structure my research projects. I have the following setup:
There are projects a,b,c and a library lib. Each project tackles a different research question and the library carries code that is used across projects. Thus all projects depend on lib. Things get more complicated as project c depends on projects a and b as well. When I work on project c, I will also update a,b or lib simultaneously. Each project is in a separate git repository.
So far I have dealt with this situation by including the dependencies above via git submodule, and all the source files are located in the root dir of the project. The advantage is that I keep track of which version of lib my projects depend on. Also, one of my projects could depend on an outdated version of lib. I run everything from the root directory without "installing" any of the packages to site-packages or so. When a path is not set correctly, I override it via sys.path.insert.
However, the following points make me want to change layout:
I keep losing track of which version of lib I am editing.
I want to make use of automated testing tools (tox, Jenkins, etc.), which seem to be much easier to handle with a standard project setup.
sys.path.insert can lead to subtle problems which are hard to debug.
I usually want all my projects to work with the tip of lib anyway.
Therefore I am currently rearranging all projects (especially lib) to be in line with the standard Python directory structure (source stored in a subdirectory, root contains a setup.py file) to be able to work in a virtualenv. Then I can list all my dependencies in requirements.txt. First I install lib in develop mode via pip install -e . Then I run pip freeze > requirements.txt, which then includes a line similar to this:
-e git+<path_to_remote>@<sha>#egg=lib
So again I have generated a dependency to a specific commit (sha) as with git submodule, ensuring that I can checkout an old commit and the project should run. I can now install everything in a virtualenv and got rid of my path problems. Great.
I face some new trouble though. One problem is how to update the sha in requirements.txt. The easiest (but probably not most elegant) solution I see is to write a pre-commit hook that updates the sha before committing. Is there a better way?
And more generally - do you see a better solution given my setup?
As far as I see you have mostly solved your problem and there are only small bits left.
1) Don't use hashes to identify versions of your libraries. Even if you don't publish your libraries to the Cheese Shop, use normal library versioning (semver) and tag your git repositories accordingly. This way you will have human-readable and manageable versions in the git+https://github.com/... URLs of your dependencies.
2) Set up tox so that it lets you test against both the stable version of the dependencies (the one you tagged last) and the master version taken straight from the latest repo revision.
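As a rough sketch of point 1, a small helper can rewrite the revision in an editable requirement line so that it points at a tag instead of a commit hash. The helper name pin_to_tag, the example.com URL and the tag v1.2.0 are hypothetical; the only thing taken as given is the -e git+<url>@<revision>#egg=<name> form of the line.

```python
import re

# Matches an editable git requirement: -e git+<url>@<revision>#egg=<name>
# Group 1 is everything before the revision, group 2 is the #egg=... suffix.
_EDITABLE_GIT = re.compile(r"^(-e git\+[^#]+?)@[^@#]+(#egg=.+)$")


def pin_to_tag(requirement_line, tag):
    """Return the requirement line with its revision replaced by the given tag."""
    match = _EDITABLE_GIT.match(requirement_line.strip())
    if match is None:
        raise ValueError(f"not an editable git requirement: {requirement_line!r}")
    return f"{match.group(1)}@{tag}{match.group(2)}"


line = "-e git+https://example.com/me/lib.git@0123abc#egg=lib"
print(pin_to_tag(line, "v1.2.0"))
# -e git+https://example.com/me/lib.git@v1.2.0#egg=lib
```

The same function could be called from a release script over every line of requirements.txt, which avoids having to update hashes in a pre-commit hook.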

Continuum finding dependencies and building on chain-dependent projects

I am the Configuration manager for an IT firm. Currently we are using anthill build management server for all our build related purposes. We are looking to implement Continuous Integration in our development life cycle.
Currently the build process is done manually. Suppose there are 5 projects A, B, C, D, E, where E is the parent project and the dependency chain goes like this:
A->B->C->D->E
What we do is build A first, update project.xml of B to the latest version of A, build B, and so on until all dependent projects are built and finally the parent project is built.
What I am thinking of is automating the entire process, i.e. automatically finding the dependencies and building them first, then updating the version in the parent projects and building those again to a newer version.
Would Continuum do this for me? If not, is there any other CI tool that does this?
Hudson does this really well. If you're using Maven, it will even figure out the build dependencies for you automatically after the first build; otherwise you can define the build dependencies manually. That is, it lets you configure the system to build project B after a successful build of project A.
I'm not sure if it matters to you, but Hudson is also open source.
If not, is there any other CI tool that does this?
I like TeamCity, which does pretty much everything you'll need. With the latest version (and a plugin from JetBrains), there's even Git support.
On the other hand, any continuous integration system should handle dependencies easily.
We use Zed Builds and Bugs for a setup similar to this. We have a master project that has sub-project dependencies and the build system handles everything in the proper order.
We also have very small, tight builds for the sub-projects so that each of them can be built when the developers commit to source control. The Zed Server is capable of pulling the latest artifacts from these small builds and putting them together into larger builds, but we haven't yet used that feature.
Our check-ins trigger the small CI builds, and then twice per day the entire application is re-built from scratch, following the dependency chain.
I'd agree with OregonGhost, though, any CI system should be able to set up this type of chain.
I don't think you need a CI tool for this. Try to automate this using a buildscript and use Continuum (or any other CI tool) to trigger your preferred buildtool.
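The core of such a build script, deciding what has to be rebuilt and in which order after one project changes, could look roughly like this. It is a sketch under the assumption that the dependency map is already known (here hard-coded from the A->B->C->D->E chain in the question); the function name rebuild_plan is hypothetical, and the actual build and version-bump steps are left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# project -> projects it depends on (the chain from the question: A is built first).
DEPENDS_ON = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"C"}, "E": {"D"}}


def rebuild_plan(changed, depends_on):
    """Return the changed project and everything downstream of it, in build order."""
    affected = {changed}
    grew = True
    while grew:  # keep adding projects that depend on an affected project
        grew = False
        for project, deps in depends_on.items():
            if project not in affected and deps & affected:
                affected.add(project)
                grew = True
    full_order = TopologicalSorter(depends_on).static_order()
    return [project for project in full_order if project in affected]


print(rebuild_plan("C", DEPENDS_ON))  # ['C', 'D', 'E']
```

A CI server would then only need to trigger this script; each entry in the returned list corresponds to one "update the dependency version, then build" step.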

What is the best practice for sharing a Visual Studio Project (assembly) among solutions

Suppose I have a project "MyFramework" that has some code, which is used across quite a few solutions. Each solution has its own source control management (SVN).
MyFramework is an internal product and doesn't have a formal release schedule, and same goes for the solutions.
I'd prefer not having to build and copy the DLLs to all 12 projects, i.e. new developers should be able to just do an svn checkout and get to work.
What is the best way to share MyFramework across all these solutions?
Since you mention SVN, you could use externals to "import" the framework project into the working copy of each solution that uses it. This would lead to a layout like this:
C:\Projects
  MyFramework
    MyFramework.csproj
    <MyFramework files>
  SolutionA
    SolutionA.sln
    ProjectA1
      <ProjectA1 files>
    MyFramework   <-- this is a svn:externals definition to "import" MyFramework
      MyFramework.csproj
      <MyFramework files>
With this solution, you have the source code of MyFramework available in each solution that uses it. The advantage is that you can change the source code of MyFramework from within each of these solutions (without having to switch to a different project).
BUT: at the same time this is also a huge disadvantage, since it makes it very easy to break MyFramework for some solutions when modifying it for another.
For this reason, I have recently dropped that approach and am now treating our framework projects as a completely separate solution/product (with their own release-schedule). All other solutions then include a specific version of the binaries of the framework projects.
This ensures that a change made to the framework libraries does not break any solution that is reusing a library. For each solution, I can now decide when I want to update to a newer version of the framework libraries.
That sounds like a disaster... how do you cope with developers undoing/breaking the work of others...
If I were you, I'd put MyFramework in a completely separate solution. When a developer wants to work on one of the 12 projects, he opens that project's solution in one IDE and opens MyFramework in a separate IDE.
If you strong-name your MyFramework assembly, put it in the GAC, and reference it in your other projects, then "copying DLLs" won't be an issue.
You just build MyFramework (a post-build event can run GacUtil to put it in the assembly cache) and then build your other project.
The "best way" will depend on your environment. I worked in a TFS-based, continuous integration environment, where the nightly build deployed the binaries to a share. All the dependent projects referred to the share. When this got slow, I built some tools to permit developers to have a local copy of the shared binaries, without changing the project files.
Does work in any of the 12 solutions regularly require changes to the "framework" code?
If so your framework is probably new and just being created, so I'd just include the framework project in all of the solutions. After all, if work dictates that you have to change the framework code, it should be easy to do so.
Since changes in the framework made from one solution will affect all the other solutions, breaks will happen, and you will have to deal with them.
Once you rarely have to change the framework as you work in the solutions (this should be your goal) then I'd include a reference to a framework dll instead, and update the dll in each solution only as needed.
svn:externals will take care of this nicely if you follow a few rules.
First, it's safer if you use relative URIs (starting with a ^ character) for svn:externals definitions and put the projects in the same repository if possible. This way the definitions will remain valid even if the subversion server is moved to a new URL.
Second, make sure you follow this hint from the SVN book and use peg revisions in your svn:externals definitions to avoid random breakage and unstable tags:
You should seriously consider using explicit revision numbers in all of your externals definitions. Doing so means that you get to decide when to pull down a different snapshot of external information, and exactly which snapshot to pull. Besides avoiding the surprise of getting changes to third-party repositories that you might not have any control over, using explicit revision numbers also means that as you backdate your working copy to a previous revision, your externals definitions will also revert to the way they looked in that previous revision ...
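To make the two rules concrete, the following sketch formats an externals entry that combines a repository-relative URL (the ^ prefix) with a peg revision. The helper name externals_line, the repository path MyFramework/trunk and the revision number 1234 are made up for illustration.

```python
def externals_line(repo_relative_path, peg_revision, local_dir):
    """Format one svn:externals entry: ^/<path>@<peg revision> <local directory>."""
    return f"^/{repo_relative_path.strip('/')}@{peg_revision} {local_dir}"


# Hypothetical repository layout and revision number.
print(externals_line("MyFramework/trunk", 1234, "MyFramework"))
# ^/MyFramework/trunk@1234 MyFramework
```

The resulting line is what would be stored as the value of the svn:externals property on the solution directory, for example via svn propset or svn propedit.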
I agree with another poster - that sounds like trouble. But if you can't or don't want to do it the "right way", I can think of two other ways to do it. We used something similar to number 1 below (for a native C++ app).
1) A script, batch file, or other process that does a get and a build of the dependency (just once). This is built/executed only if there are no changes in the repo. You will need to know what tag/branch/version to get. You can use a bat file as a prebuild step in your project files.
2) Keep the binaries in the repo (not a good idea). Even in this case the dependent projects have to do a get and have to know which version to get.
Eventually what we tried to do for our project(s) was mimic how we use and refer to 3rd party libraries.
What you can do is create a release package for the dependency that sets up a path env variable to itself. I would allow multiple versions of it to exist on the machine and then the dependent projects link/reference specific versions.
Something like
$(PROJ_A_ROOT) = c:\mystuff\libraryA
$(PROJ_A_VER_X) = %PROJ_A_ROOT%\VER_X
and then reference the version you want in the dependent solutions either by specific name, or using the version env var.
Not pretty, but it works.
A scalable solution is to do svn-external on the solution directory so that your imported projects appear parallel to your other projects. Reasons for this are given below.
Using a separate sub-directory for "imported" projects, e.g. externals, via svn-external seems like a good idea until you have non-trivial dependencies between projects. For example, suppose project A depends on project B, and project B on project C. If you then have a solution S with project A, you'll end up with the following directory structure:
# BAD SOLUTION #
S
+---S.sln
+---A
|   \---A.csproj
\---externals
    +---B             <--- A's dependency
    |   \---B.csproj
    \---externals
        \---C         <--- B's dependency
            \---C.csproj
Using this technique, you may even end up having multiple copies of a single project in your tree. This is clearly not what you want.
Furthermore, if your projects use NuGet dependencies, these normally get loaded into the top-level packages directory. This means that the NuGet references of projects within the externals sub-directory will be broken.
Also, if you use Git in addition to SVN, a recommended way of tracking changes is to have a separate Git repository for each project, and then a separate Git repository for the solution that uses git submodule for the projects within. If a Git submodule is not an immediate sub-directory of the parent module, then the git submodule command will make a clone that is an immediate sub-directory.
Another benefit of having all projects on the same layer is that you can then create a "super-solution", which contains projects from all of your solutions (tracked via Git or svn-external), which in turn allows you to check with a single Solution-rebuild that any change you made to a single project is consistent with all other projects.
# GOOD SOLUTION #
S
+---S.sln
+---A
|   \---A.csproj
+---B             <--- A's dependency
|   \---B.csproj
\---C             <--- B's dependency
    \---C.csproj
