My $GOPATH contains three locations:
/home//Documents/gotree
/home//Documents/perforce/modules/thirdparty/golibs
/home//Documents/perforce/modules/sggolibs/
Here location 1 is for general purposes, and 2 and 3 are for work-related libraries, which are maintained on one Perforce server. These last two libraries are kept in Perforce so that everyone in the company uses these exact versions, not the latest versions from the internet.
In another location there are a couple of Go servers, and all of them use at least one library from $GOPATH locations 2 and 3.
All of those servers were written two or three years ago and do not contain a go.mod file or any other package-management setup.
My question is: how do I upgrade all of these servers to the latest version of Go so that they work with Go modules, and probably with a vendor directory for the third-party libraries?
Apologies if my question is too generic.
Unfortunately, Perforce is not one of the version control systems supported natively in the go command, so you may need to apply a bit of scripting or tooling in order to slot in the libraries from your Perforce repositories.
One option is to set up a module proxy to serve the dependencies from Perforce, and have your developers set the GOPROXY and GONOSUMDB environment variables so that they use that proxy instead of (or in addition to) the defaults (proxy.golang.org,direct).
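For example, each developer could set something like the following (the proxy URL and the module-path pattern are only placeholders for whatever you actually run internally):
# use the internal proxy first, then fall back to the defaults
export GOPROXY=https://goproxy.internal.example.com,https://proxy.golang.org,direct
# skip the public checksum database for your internally hosted module paths
export GONOSUMDB=*.internal.example.com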
Note that Go modules compute and store checksums for dependencies, so if you have modified any third-party dependencies it is important that any modifications be served with unique version strings (or different module paths!) so that they don't collide with upstream versions with different contents. (I seem to recall that the Athens proxy has support for filtering and/or injecting modules, although I'm not very familiar with its capabilities or configuration.)
I'm not aware of any Go module proxy implementations that support Perforce today, but you might double-check https://pkg.go.dev/search?q=%22module+proxy%22 to be sure; at the very least, there are a number of implementations listed there that you could use as a reference. The protocol is intentionally very simple, so implementing it hopefully wouldn't be a huge amount of work.
Another option — probably less work in the short term but more in the long term — is to use replace directives in each module to replace the source code for each Perforce-hosted dependency with the corresponding filesystem path. You could probably write a small script to automate that process; the go mod edit command is intended to support that kind of scripting.
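A sketch of what that automation might boil down to, assuming hypothetical module paths and checkout locations (substitute your real ones):
# point each Perforce-hosted module at its local checkout
go mod edit -replace example.com/thirdparty/libfoo=/path/to/perforce/thirdparty/golibs/libfoo
go mod edit -replace example.com/sggolibs/libbar=/path/to/perforce/sggolibs/libbar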
Replacement modules are required to have go.mod files (to reduce confusion due to typos), so if you opt for this approach you may need to run go mod init in one or more of your Perforce directories to create them.
With either of the above approaches, it is probably simplest to start with as few modules as possible in your first-party repository: ideally just one at the root of your package tree. You can run go mod init there, then set up your replace directives and/or local proxy, then run go mod tidy to fill in the dependency graph.
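Roughly, that first-party sequence could look like this (the module path is a placeholder):
cd path/to/your/repo               # root of the first-party package tree
go mod init example.com/yourrepo   # creates go.mod
# add replace directives and/or proxy settings here, as described above
go mod tidy                        # fills in the dependency graph
go build ./...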
There is this shiny new Yarn feature called Plug'n'Play.
I would like to know what exactly it does.
I know it's creating a .pnp folder and a .pnp.js file, but does it change anything else on the machine, like a config file somewhere?
Thank you.
I designed and implemented PnP, so I can talk hours about it 🙂
tl;dr: We only write the .pnp.js file and the .pnp folder (on top of the regular Yarn cache). We don't store configuration anywhere else.
Without Plug'n'Play
When you run yarn install (even without PnP), a few things happen:
If you use the offline mirror feature, we download the tarballs from the registry and store them within the offline mirror folder
Regardless of whether or not you use the offline mirror, we unpack all the tarballs downloaded and store their files in the Yarn cache
We then figure out which files from the cache should be copied into which location in the node_modules
We apply the computed changes (a bunch of rsync operations, basically)
With Plug'n'Play
With PnP, the workflow becomes like this:
No changes: we download the tarballs from the registry into the offline mirror (if enabled)
No changes: we still unpack them into the Yarn cache
We generate a .pnp.js file¹
And that's it. There is no other generated file than the .pnp.js file (and the cache, but it already was there before).
¹ As you mentioned, we also generate a .pnp folder (.yarn as of Yarn 2) in the project. This folder is meant to contain two types of data:
Unplugged packages are packages that must be local to the project. Typically, those are the packages with postinstall scripts (we cannot store them in the cache, as the generated artifacts might differ from one project to another).
Virtual packages, which are symlinks created for each package in your dependency tree that lists peer dependencies. Without going into the details, they are a necessary part of the design, and are required to make require.resolve work as before. Those files don't exist anymore as of Yarn 2 🎉
How does it work?
The .pnp.js file contains information similar to the following:
webpack#1.0.0 -> /cache/webpack-1.0.0/
-> it depends on lodash#1.0.0
lodash#1.0.0 -> /cache/lodash-1.0.0/
-> no dependencies
With this information, the resolution process can correctly infer that when a file within /cache/webpack-1.0.0 makes a require call to lodash, the required files must be loaded from /cache/lodash-1.0.0. It's a bit more complex in practice (we keep an inverse map for improved performance, we use relative paths to ensure portability, etc.), but the general concept is there.
Bonus round: With Plug'n'Play+Zip loading (Yarn 2)
Bonus: With Yarn 2, we're about to improve this workflow even more. This is what it will look like:
We download the tarballs from the registry and store them in the cache (no more distinction between offline mirror and cache - they are the same)
We generate the same .pnp.js file as before
And that's it! As you can see we don't unpack the packages anymore (instead, we use a Node loader to read them from the package archives at runtime).
Doing this has a very interesting property: if both your cache and .pnp.js files are there, you don't need to run yarn install for your application to work! And to ensure you have those files, you just have to add them to your repository and version them as you would with everything else.²
It's very useful, as you don't need to remember to run yarn install after git rebase, git pull, or git checkout, and your CI systems become faster and more stable as they don't need any special setup - just clone your application and it'll just work.
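In practice that just means committing those files like any other source (the .yarn/cache path assumes the Yarn 2 defaults):
git add .pnp.js .yarn/cache
git commit -m "Check in the Yarn cache for zero-installs"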
² Before someone mentions it - checking in binary files within a repository is perfectly fine. The reason why node_modules was a very bad thing to check in to your repository was the exponential number of text files, which put a huge strain on Git - technically, but also philosophically, as code reviews were made impossible.
In the case I described we don't suffer from the same problem, because the number of files is constrained (exactly one file for each package), and reviewing them is very easy - in fact, it's better in that you can clearly see how many new packages are added to your project by a PR!
It imports only the parts of a package you are going to use, making the bloated node_modules folder much, much leaner.
Think, for example, about having relatively big libraries like lodash or ramda when you use only 4-5 functions from them - how much you could save by pulling in only the minimum that is actually used.
I believe it is not yet 100% stable, but it is still a nice option to keep on your radar :)
Coming from a Java and Grails background (and having written millions of lines of C 30 years ago), I can't see how Go can be usable with a fixed GOPATH on Windows.
Installing Go creates this structure:
c:\users\me\go\src
              \pkg
              \bin
As you will want to have many Go projects, it seems they have to be mixed together in the same src/pkg/bin dirs, polluting each other, e.g.
/src/project1/hello.go
    /project2/hello.go
/pkg/xx
/bin/hello.exe
Which hello.go will hello.exe run?
Unless I am missing something fundamental, this seems crazy - all completely separate projects are expected to share the same package and bin dirs. It means you don't know which versions of which packages and which exe files belong to which project, and there is presumably plenty of scope for conflicts. I would have expected a /src, /pkg and /bin for each separate Go app (like Java or Grails or even C), making GOPATH completely redundant; it could be relative to the current project root (like in Grails).
To make matters worse, for work we have to use a different directory, e.g.
d:\work\project3
\project4
\package5
\go_utility6
\go_utility7
So now we have a total of 6 separate directories where Go programs live. It is not feasible to change the path every time you switch to working on a different project. I guess the other option is to add the 6 paths to the GOPATH. Presumably, all 7 Go projects then write to the same pkg and bin dirs, which is going to be a disaster.
Is there a tenable solution to this situation (on Windows at least)?
If we need to add a path to GOPATH for every project, what should the file structure be under each project?
E.g. under xxx\go_utility6, which is a stand-alone command-line Go app, what should the structure be? Does there need to be a src dir in there somewhere? Does this dir need GOPATH to point to it? Does it need its own pkg, or should it use the c:\users\me\pkg dir?
UPDATE: When I posted this, Go did not have module support, and we built and used a tool called vg. These days the recommended way to go is to use Go modules.
I use vg for that; it takes care of keeping a separate GOPATH per project, and it switches automatically when you cd into a project.
Honestly, your example of "which hello.exe should be run" does not make much sense. Two tools with the same name?
Even if both are, let's say, APIs, your devops team will be happier with more meaningful names.
The bin folder is used for third-party tools you install; you do not have to install your own project binaries there. Unless they are tools themselves, but then the name should be meaningful again.
You can get more information about the project structure here: https://golang.org/doc/code.html
Since Go 1.8 supports a vendor folder below project folders, it is possible to break out of the original structure. (IMHO, vendoring was not maintainable before 1.8; yes, that was crazy.)
You might want to use a tool like direnv, which would support your desire to change GOPATH per project.
https://github.com/direnv/direnv
It also has a built-in function for adding the current path to the GOPATH.
https://github.com/direnv/direnv/blob/master/stdlib.sh#L355:1
Go also supports multiple GOPATH entries and per-project GOPATHs, so direnv should work properly for this.
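A minimal sketch of what this could look like, assuming direnv is installed and hooked into your shell (the file goes in the project root):
# .envrc in the project root
# "layout go" from direnv's stdlib puts this directory on GOPATH and its bin folder on PATH
layout go
After running direnv allow once, the GOPATH switches automatically whenever you cd into that project and reverts when you leave.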
In my company we have one go folder right next to our other projects.
Under go/src are our projects. No problem so far, since vendored dependencies are in the projects' vendor folders and committed.
The best dependency manager for Go that I can recommend so far is:
https://github.com/golang/dep
I hope that input helps.
With Go 1.11, Go modules were introduced. You can use Go modules to have Go projects outside the GOPATH directory.
Here is an example of how to configure a project using Go modules.
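A minimal sketch (the module path and directory are only placeholders):
cd path/to/project                  # any directory, inside or outside GOPATH
go mod init example.com/project     # creates go.mod with the module path
go build ./...                      # resolves dependencies and records them in go.mod/go.sum
go.sum is created alongside go.mod once dependencies are downloaded.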
I am making my own personal package to hold a collection of useful programs and configs. The main idea is to emerge this package and have the system prepared to my preferences. Mostly it works (it simply depends on all my favourite programs), but I have two problems here:
How do I install USE flags, unmasks, and such before the affected programs are installed?
How do I uninstall it? (emerge --unmerge does NOT delete files in /etc, so even after uninstalling the package the USE flags (and others) are still kept. My intent is to REMOVE them, so the next rebuild of world would NOT use them anymore. Yes, it means a lot of programs would lose some functionality, like support for some languages, support for some other programs and so on; that is the desired result.)
My solutions so far are:
1. The package has some files in /etc/portage/package.*
1.1. I emerge that package with --nodeps (so the config files are installed).
1.2. I emerge it again without that flag (so dependencies are installed with the right configuration).
2. I create (and install) a script that parses /var/db/packages for my package's CONTENTS and deletes all the /etc/portage/something files "manually", and I have to run this script before unmerging the package.
Is there a better way to do it?
You're just doing/understanding it wrong! (sorry :)
First of all, instead of a metapackage (an empty ebuild that has only runtime dependencies), there are other ways:
Use sets to describe your preferred packages, and manage your USE flags in the usual way (including per-package USE if needed); see the sketch after this list.
A medium-complexity solution is to write a metapackage ebuild (your current case) -- but you can't mask/unmask USE flags that way anyway…
If you already have your own overlay (obviously), defining your own profile would solve everything! There you can manage everything just the way you want: mask/unmask any USE flags, define what the predefined system set means for you, etc…
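As a rough sketch of the sets idea (the set name and atoms below are just placeholders), with Portage a user-defined set is a plain file under /etc/portage/sets/:
# /etc/portage/sets/my-tools -- one package atom per line
app-editors/vim
dev-vcs/git
www-client/firefox
You can then install the whole set at once with emerge @my-tools.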
Unfortunately, I don't use Gentoo portage (and emerge) and have no idea if it's possible to have multiple additive profiles. I have my own profiles here and it works perfectly with Paludis.
Second, never remove any configuration files (config-protected ones) after uninstall! There are no packages that do that, and there are a bunch of reasons for that… The main one is that the user may have modified them and doesn't want to lose his changes. Moreover, I personally prefer to keep all configs that I've ever touched in a dedicated VCS repo -- it wouldn't be nice if anyone except me removed something…
Imagine a real-life example: a user wants to reinstall some package, and he has a bunch of configuration files that he spent some time carefully editing. The trivial way is to uninstall and then install again -- oops! He has lost his configs!
Moreover, from an ebuild's POV, you have the pkg_prerm and pkg_postrm functions, but both of them are called even at upgrade time (i.e. when an unmerge is followed by an immediate merge phase). You have to be really careful to distinguish those use cases… And what is scarier, once there are any "hardcoded" (and unique) rules in a package, you don't have any influence on them…
So, please, never remove any config-protected files; let the user take care of them (he is the boss, not the package manager)…
Update: If you really want to be able to remove some config-protected files, setting up your own profile looks like an even better idea. You can set CONFIG_PROTECT_MASK to force files and/or directories to be unprotected. That way you don't need to modify any ebuilds and/or write ugly cleanup code.
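A minimal sketch of that (the masked paths are only examples; the variable can go in your profile's make.defaults or in make.conf):
# make.conf (or your profile's make.defaults)
# paths listed here are exempt from CONFIG_PROTECT, so files a package installs
# there are overwritten on upgrade and removed on unmerge
CONFIG_PROTECT_MASK="/etc/portage/package.use /etc/portage/package.unmask"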
I'm just getting into learning Go, and reading through existing code to learn "how others are doing it". In doing so, the use of a go "workspace", especially as it relates to a project's dependencies, seems to be all over the place.
What is (or is there) a common best practice around using a single Go workspace vs. multiple Go workspaces (i.e. definitions of $GOPATH) while working on various Go projects? Should I expect to have a single Go workspace that's sort of like a central repository of code for all my projects, or should I explicitly break it up and set up $GOPATH anew as I go to work on each of these projects (kind of like a Python virtualenv)?
I think it's easier to have one $GOPATH per project; that way you can have different versions of the same package for different projects and update the packages as needed.
With a central repository, it's difficult to update a package as you might break an unrelated project when doing so (if the package update has breaking changes or new bugs).
I used to use multiple GOPATHs -- dozens, in fact. Switching between projects and maintaining the dependencies was a lot harder, because pulling in a useful update in one workspace required that I do it in the others, and sometimes I'd forget, and scratch my head, wondering why that dependency works in one project but not another. Fiasco.
I now have just one GOPATH and I actually put all my dev projects - Go or not - within it. With one central workspace, I can still keep each project in its own git repository (src/<whatever>) and use git branching to manage dependencies when necessary (in practice, very seldom).
My recommendation: use just one workspace, or maybe two (like if you need to keep, for example, work and personal code more separate, though the recommended package path naming convention should do that for you).
If you just set GOPATH to $HOME/go or similar and start working, everything works out of the box and is really easy.
If you make lots of GOPATHs with lots of bin dirs for lots of projects with lots of common dependencies in various states of freshness you are, as should be quite obvious, making things harder on yourself. That's just more work.
If you find that, on occasion, you need to isolate some things, then you can make a separate GOPATH to handle that situation.
But in general, if you find yourself doing more work, it's often because you're choosing to make things harder.
I've got what must be approaching 100 projects I've accumulated in the last four years of go. I almost always work in GOPATH, which is $HOME/go on my computers.
Using one GOPATH across all of your projects is very handy, but I find this to only be the case for my own personal projects.
I use a separate GOPATH for each production system I maintain because I use git submodules in each GOPATH's directory tree in order to freeze dependencies.
So, something like:
~/code/my-project
- src
  - github.com
    + dependency-one
    + dependency-two
  - my-org
    - my-project
      * main.go
      + package-one
      + package-two
- pkg
- bin
By setting GOPATH to ~/code/my-project, the build uses the dependency-one and dependency-two git submodules within that project instead of using global dependencies.
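A sketch of how one of those submodules might be added (the repository URL is a placeholder):
cd ~/code/my-project
# pin the dependency at a specific commit inside this GOPATH
git submodule add https://github.com/someone/dependency-one src/github.com/dependency-one
git commit -m "Freeze dependency-one as a submodule"
Running git submodule update --init later restores the exact pinned revisions.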
Try envirius (a universal virtual environment manager). It allows you to compile any version of Go and create any number of environments based on it. $GOPATH/$GOROOT depend on each particular environment.
Moreover, it allows you to create environments with mixed languages (for example, Python & Go in one environment).
At my company I created Virtualgo to make managing multiple GOPATHs super easy. A couple of advantages over handling it manually are:
Automatic switching to the correct GOPATH when you cd to a project.
It integrates well with vendoring tools
It also sets the new GOBIN in your path, so you can use the executables installed there.
It still has your original GOPATH as a backup. If a package is not found in the project specific workspace it will search the main GOPATH.
One workspace + godep works best for me.
I follow KISS - one GOPATH, with two path entries:
export GOPATH=$HOME/go:$HOME/development/go
That way third-party stuff goes to a central place (go get installs to the first path entry by default), and I can flexibly keep my own projects elsewhere, under the second path entry.
You might want to try the direnv package.
https://direnv.net/
Just use GoSwitch. Saves a heck of a lot of time and sanity.
Add the script to the root of each of your projects and source it.
It will make that project dir your GOPATH and also add/remove that project's bin folder to/from your PATH.
https://github.com/buffonomics/goswitch
Like most people, we use third-party libraries. Many come with source, which we keep in our VCS.
Currently, if these libraries are updated, we need to pull the source manually and rebuild the binaries.
I am trying to find a way to instead reference them from the various solutions that use them, so that they will be automatically pulled from source control when you pull the dependent project, and automatically built if they are out of date. It would also be nice to be able to debug into them with the provided source.
The first problem I am having is that the libraries are not in the same solution root as the dependent projects, e.g.
\Libraries
    \External
        \Lib1
            Lib1.sln
\Products
    \Product1
        Product1.sln
Attempting to add Lib1.csproj to my Product1 solution gives me the warning:
The project that you are attempting to add to source control may cause
other source control users to have difficulty opening this solution or
getting newer versions of it. To avoid this problem, add the project
from a location below the binding root (C:\depot\Products\Products1)
of the other source controlled projects in the solution.
If I ignore this then I can set up build dependencies properly, but it still doesn't allow pulling the entire source tree in one go.
I was wondering how other people have their third-party libraries set up, particularly when there is source code. (We are using Perforce, but I guess the question is relevant for any VCS.)
One way to solve this in Perforce is to put all modules / third-party software that is meant to be reused into a separate location (depot), for example "//shared" or similar.
Products (trees in your SCM / Perforce) can "link" the required modules by mapping them into the workspace. In Perforce you can do that via client views.
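A rough sketch of such a client view (the depot paths and workspace name are placeholders):
# View lines in the workspace spec (p4 client)
View:
    //depot/Products/Product1/...            //my-workspace/Products/Product1/...
    //shared/Libraries/External/Lib1/...     //my-workspace/Libraries/External/Lib1/...
With a mapping like this, a single sync pulls the product and the shared libraries into one workspace tree.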
If you have many people working on many products, you'll need an easy mechanism to set up a personal workspace for a product properly (without requiring the developers to set up their client views manually).
One possibility is a small self-written tool/script that sets up a workspace and prepares the personal client view based on a template located in the product root, which defines which modules from the "//shared" depot need to be mapped to which location in the client workspace.
We have been using this practice for years and it works fine. The danger is that the client views can get very complex.