Relationship between a package statement and the directory of a .go file - go

See this experiment.
~/go/src$ tree -F
.
├── 1-foodir/
│   └── 2-foofile.go
└── demo.go
1 directory, 2 files
~/go/src$ cat demo.go
package main
import (
"fmt"
"1-foodir"
)
func main() {
fmt.Println(foopkg.FooFunc())
}
~/go/src$ cat 1-foodir/2-foofile.go
package foopkg
func FooFunc() string {
return "FooFunc"
}
~/go/src$ GOPATH=~/go go run demo.go
FooFunc
I thought that we always import a package name. But the above example
shows that we actually import a package directory name ("1-foodir")
but while invoking exported names within that package, we use the
package name declared in the Go files (foopkg.FooFunc).
This is confusing for a beginner like me who comes from Java and Python world,
where the directory name itself is the package name used to qualify
modules/classes defined in the package.
Why is there a difference in the way we use import statement and
refer to names defined in the package in Go? Can you explain the rules
behind these things about Go?

If what you said was true, then your function call would actually be 1-foodir.FooFunc() instead of foopkg.FooFunc(). Instead, go sees the package name in 2-foofile.go and imports it as foopkg because in go the name of the package is exactly what comes after the words package at the top of .go files, provided it is a valid identifier.
The only use of the directory is for collecting a set of files that share the same package name. This is reiterated in the spec
A set of files sharing the same PackageName form the implementation of a package. An implementation may require that all source files for a package inhabit the same directory.
In go, it is convention that the directory match the package name but this doesn't have to be the case, and often it is not with 3rd party packages. The stdlib does do a good job of sticking to this convention.
Now where directories do come into play is the import path. You could have 2 packages named 'foo' in your single binary as long as they had different import paths, i.e.
/some/path/1/foo and /some/path/2/foo
And we can get really swanky and alias the imports to whatever we wanted, for example I could do
import (
bar "/some/path/1/foo"
baz "/some/path/2/foo"
)
Again the reason this works is not because the package name has to be unique, but the package import path must be unique.
Another bit of insight to glean from this statement is -- within a directory, you cannot have two package names. The go compiler will throw an error stating it cannot load package and that it found packages foo (foo.go) and bar (bar.go).
See https://golang.org/doc/code.html#PackageNames for more information.

Roughly for the why:
Packages have a "name" which is set by the package clause, the package thepackagename at the start of your source code.
Importing packages happens by pretty much opaque strings: the import path in the import declarations.
The first is the name and the second how to find that name. The first is for the programmer, the second for the compiler / the toolchain.
It is very convenient (for compilers and programmers) to state
Please import the package found in "/some/hierarchical/location"
and then refer to that package by it's simple name like robot in statements like
robot.MoveTo(3,7)
Note that using this package like
/some/hierarchical/location.MoveTo(3.7)
would not be legal code and neither readable nor clear nor convenient.
But to for the compiler / the toolchain it is nice if the import path has structure and allows to express arbitrary package locations, i.e. not only locations in a filesystem, but e.g. inside an archive or on remote machines, or or or.
Also important to note in all this: There is the Go compiler and the go tool. The Go compiler and the go tool are different things and the go tool imposes more restrictions on how you lay out your code, your workspace and your packages than what the Go compiler and the language spec would require. (E.g. the Go compiler allows to compile files from different directories into one package without any problems.)
The go tool mandates that all (there are special cases, I know) your source files of a package reside in one file system directory and common sense mandates that this directory should be "named like the package".

First thing first, the package clause and import path are different things.
package clause declares PackageName:
PackageClause = "package" PackageName .
PackageName = identifier .
The purpose of a package clause is to group files:
A set of files sharing the same PackageName form the implementation of a package.
By convention, the path basename (directory name) of ImportPath (see below) is the same as PackageName. It's recommended for convenient purposes that you don't need to ponder what's the PackageName to be used.
However, they can be different.
The basename only affects the ImportPath, check spec for import declartions:
ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) .
ImportSpec = [ "." | PackageName ] ImportPath .
ImportPath = string_lit .
If the PackageName is omitted, it defaults to the identifier specified in the package clause of the imported package.
For example, if you have a dir foo but you declare package bar in a source file resides in it, when you import <prefix>/foo, you will use bar as a prefix to reference any exported symbols from that package.
darehas' answer raises a good point: you can't declare multiple packages under the same basename. However, according to package clause, you can spread the same package over different basenames:
An implementation may require that all source files for a package inhabit the same directory.

Related

Module XXX found, but does not contain package XXX

Not so familiar with Golang, it's probably a stupid mistake I made... But still, I can't for the life of me figure it out.
So, I got a proto3 file (let's call it file.proto), whose header is as follows:
syntax = "proto3";
package [package_name];
option go_package = "github.com/[user]/[repository]";
And I use protoc:
protoc --go_out=$GOPATH/src --go-grpc_out=$GOPATH/src file.proto
So far so good, I end up with two generated files (file.pb.go and file_grpc.pb.go) inside /go/src/github.com/[user]/[repository]/, and they are defined inside the package [package_name].
Then, the code I'm trying to build has the following import:
import (
"github.com/[user]/[repository]/[package_name]"
)
And I naively thought it would work. However, it produces the following error when running go mod tidy:
go: downloading github.com/[user]/[repository] v0.0.0-20211105185458-d7aab96b7629
go: finding module for package github.com/[user]/[repository]/[package_name]
example/xxx imports
github.com/[user]/[repository]/[package_name]: module github.com/[user]/[repository]#latest found (v0.0.0-20211105185458-d7aab96b7629), but does not contain package github.com/[user]/[repository]/[package_name]
Any idea what I'm doing wrong here? Go version is go1.19 linux/amd64 within Docker (golang:1.19-alpine).
Note: I also tried to only import github.com/[user]/[repository], same issue obviously.
UPDATE:
OK so what I do is that I get the proto file from the git repository that only contains the proto file:
wget https://raw.githubusercontent.com/[user]/[repository]/file.proto
Then I generate go files from that file with protoc:
protoc --go_out=. --go-grpc_out=. file.proto
Right now, in current directory, it looks like:
- directory
| - process.go
| - file.proto
| - github.com
| - [user]
| - [repository]
| - file.pb.go
| - file_grpc.pb.go
In that same directory, I run:
go mod init xxx
go mod tidy
CGO_ENABLED=0 go build process.go
The import directive in process.go is as follows:
import (
"xxx/github.com/[user]/[repository]"
)
Now it looks like it finds it, but still getting a gRPC error, which is weird because nothing changed. I still have to figure out if it comes from the issue above or not. Thanks!
Your question is really a number of questions in one; I'll try to provide some info that will help. The initial issue you had was because
At least one file with the .go extension must be present in a directory for it to be considered a package.
This makes sense because importing github.com/[user]/[repository] would be fairly pointless if that repository does not contain any .go files (i.e. the go compiler could not really do anything with the files).
Your options are:
Copy the output from protoc directly into your project folder and change the package declarations to match your package. If you do this there is no need for any imports.
Copy (or set go_out argument to protoc) the output from protoc into a subfolder of your project. The import path will then be the value of the module declaration in your go.mod plus the path from the folder that the go.mod is in (this is what you have done).
Store the files in a repo (on github or somewhere else). This does not need to be the same repo as your .proto files if you "want it to be agnostic" (note that 2 & 3 can be combined if the generated files will only be used within one code base or the repo is accessible to all users).
Option 1 is simple but its often beneficial to keep the generated code separate (makes it clear what you should not edit and improves editor autocomplete etc).
Option 2 is OK (especially if protoc writes the files directly and you set go_package appropriately). However issues may arise when the generated files will be used in multiple modules (e.g. as part of your customers code) and your repo is private. They will need to change go_package before running protoc (or search/replace the package declarations) and importing other .proto files may not work well.
Option 3 is probably the best approach in most situations because this works with the go tooling. You can create github.com/[user]/goproto (or similar) and put all of your generated code in there. To use this your customers just need to import github.com/[user]/goproto (no need to run protoc etc).
Go Modules/package intro
The go spec does not detail the format of import paths, leaving it up to the implementation:
The interpretation of the ImportPath is implementation-dependent but it is typically a substring of the full file name of the compiled package and may be relative to a repository of installed packages.
As you are using go modules (pretty much the default now) the implementations rules for resolving package paths (synonym of import path) can be summarised as:
Each package within a module is a collection of source files in the same directory that are compiled together. A package path is the module path joined with the subdirectory containing the package (relative to the module root). For example, the module "golang.org/x/net" contains a package in the directory "html". That package’s path is "golang.org/x/net/html".
So if your "module path" (generally the top line in a go.mod) is set to xxx (go mod init xxx) then you would import the package in subfolder github.com/[user]/[repository] with import xxx/github.com/[user]/[repository] (as you have found). If you got rid of the intervening folders and put the files into the [repository] subfolder (directly off your main folder) then it would be import xxx/[repository]
You will note in the examples above that the module names I used are paths to repo (as opposed to the xxx you used in go mod init xxx). This is intentional because it allows the go tooling to find the package when you import it from a different module. For example if you had used go mod init github.com/[user]/[repository] and option go_package = "github.com/[user]/[repository]/myproto";" then the generated files should go into the myproto folder in your project and you import them with import github.com/[user]/[repository]/myproto.
While you do not have to follow this approach I'd highly recommend it (it will save you from a lot of pain!). It can take a while to understand the go way of doing this, but once you do, it works well and makes it very clear where a package is hosted.

What is the purpose of the package declaration?

Every Go file starts with package <something>.
As far as I understand - and this is probably where I am missing some information - there are only two possible values for <something>: The name of the directory it is in*, or main. If it is main, all other files in that directory can only have main, too. If it is something else, the project is inconsistent/violating convention.
Now if it is the name of the directory, it's redundant, because the same information is, well, in the name of the directory.
If it is main, it's kind of useless, because as far as I can see there is no way to tell go build to "please build all main packages".
* Because, in other words, one directory is one package.
The name of the package does not have to coincide with the directory name. It is possible to have package foobar in the directory xyz/go-foobar. In this case, xyz/go-foobar becomes an import path, but the package name that you use to quality the identifiers (functions, types etc.) would be foobar.
Here's an example to make it more concrete: I created a test package http://godoc.org/github.com/dmitris/go-foobar (source in https://github.com/dmitris/go-foobar) - you can see from the documentation page, that the import path is "github.com/dmitris/go-foobar" but the package name is foobar, so you would call the function it provides as foobar.Demo() (not go-foobar.Demo()).
A similar real-life example - the import path for the NSQ Messaging platform is "github.com/nsqio/go-nsq" while the package name is "nsq": http://godoc.org/github.com/nsqio/go-nsq. However, for the sake of user-friendliness and simplicity, the standard and recommended practice is to keep the last portions of the import path and the package name being the same whenever possible.
package main is not useless - it tells the Go compiler to create an executable as opposed to a .a library file (with go install or go get; go build discards the compilation result). The executable is named after the directory name in which the package main file or files are placed. Again a concrete example - I made a test program https://github.com/dmitris/go-foobar-client, you install it with go get github.com/dmitris/go-foobar-client and you should get a go-foobar-client executable placed in your $GOPATH/bin directory. It is from the the directory name where the package main file is placed that the Go compiler takes the name of the executable from. The filename of the .go file that contains the main() function is not important - in the example above, we can rename main.go to client.go or something else, but as long as the enclosing directory is called go-foobar-client, that's how the resulting executable will be named.
For an additional accessible and practically oriented reading about Go packages, I recommend Dave Cheney's article "Five suggestions for setting up a Go project" http://dave.cheney.net/2014/12/01/five-suggestions-for-setting-up-a-go-project.
The missing information you "have" is that the package name does not need to be the same as the directory name.
It is perfectly fine to use a package name other than the folder name. If you do so, you still have to import the package based on the directory structure, but after the import you have to refer to it by the name you used in the package clause.
For example, if you have a folder $GOPATH/src/mypck, and in it you have a file a.go:
package apple
const Pi = 3.14
Using this package:
package main
import (
"mypck"
"fmt"
)
func main() {
fmt.Println(apple.Pi)
}
Just like you are allowed to use relative imports but is not advisable, you may use package names other that their containing folder, but this is not advisable also to avoid further misunderstanding.
Note that the specification doesn't even require all files belonging to the same package to be in the same folder (but it may be an implementation requirement). Spec: Package clause:
A set of files sharing the same PackageName form the implementation of a package. An implementation may require that all source files for a package inhabit the same directory.
What's the use of this?
Simple. A package name is a Go identifier:
identifier = letter { letter | unicode_digit } .
Which allows unicode letters to be used in identifiers, e.g. αβ is a valid identifier in Go. Folder and file names are not handled by Go but by the Operating System, and different file systems have different restrictions. There are actually many file systems which would not allow all valid Go identifiers as folder names, so you would not be able to name your packages what otherwise the language spec would allow.
So in one hand not all valid Go identifiers may be valid folder names. And on the other hand, not all valid folder names are valid Go identifiers, for example go-math is a valid folder name in most (all?) file systems, but it's not a valid Go identifier (as identifiers cannot contain the dash - character).
Having the option to use package names different than their containing folders, you have the option to really name your packages what the language spec allows, regardless of the underlying operating and file system, and put it in a folder named anything that the underlying OS and file system allows - regardless of the package name.

How to make go build work with nested directories

In the process of learning go I was playing around with making my own libraries. Here is what I did: in my $GOPATH/src I have two folders: mylibs and test. The test folder has a file called test.go which contains
package test
import "mylibs/hi/saysHi"
func main() {
saysHi.SayHi()
}
The mylibs folder contains another folder called hi, which has a file called saysHi.go containing:
package saysHi
import "fmt"
func SayHi() {
fmt.Printf("Hi\n")
}
So the directory structure looks like this:
GOPATH/src
test
test.go
mylibs
hi
saysHi.go
The problem is that when I try to compile test it complains saying
cannot find package "mylibs/hi/saysHi" in any of:
[...]
$GOPATH/src/mylibs/hi/saysHi (from $GOPATH)
I have deliberately made the directory structure deeper than necessary. If I make a simpler directory structure where I place saysHi.go in $GOPATH/saysHi/saysHi.go then it works.
But I don't see a reason for why this wouldn't work. Any ideas?
Generally speaking, your directory name should match the package name. So if you define
package saysHi
and want to import it with
import "mylibs/hi/saysHi"
you should place it in a structure like this:
mylibs
hi
saysHi
saysHi.go
The name of the .go file(s) inside the package makes no difference to the import path, so you could call the .go file anything you like.
To explain it a bit further, the import path you use should be the name of the directory containing the package. But, if you define a different package name inside that directory, you should use that name to access the package inside the code. This can be confusing, so it's best to avoid it until you understand where it's best used (hint: package versioning).
It gets confusing, so for example, if you had your package in the path
mylibs
hi
saysHi.go
And inside saysHi.go defined,
package saysHi
Then in test.go you will import it with
import "mylibs/hi"
And use it with
saysHi.SayHi()
Notice how you import it with the final directory being hi, but use it with the name saysHi.
Final note
Just in case you didn't know the following: your test file is called test.go, and that's fine, if it's just as an example, and not an actual test file for saysHi.go. But if it is/were a file containing tests for saysHi.go, then the accepted Go standard is to name the file saysHi_test.go and place it inside the same package alongside saysHi.go.
One more final note
I mentioned how you are allowed to choose a different package name from the directory name. But there is actually a way to write the code so that it's less confusing:
import (
saysHi "mylibs/hi"
)
Would import it from the mylibs/hi directory, and make a note of the fact that it should be used with saysHi, so readers of your code understand that without having to go look at the mylibs/hi code.

Getting access to functions from subdirectories

I'm writing small app following http://golang.org/doc/code.html
My directory tree looks like
-blog
-bin
-pkg
-src
-github.com
-packages_that_i_imported
-myblog
-config
routes.go
server.go
my server.go file contains following code
package main
import "..." //ommited imports
func main(){
r:= mux.InitRoutes() //function from imported package
Register_routes(r) //function from routes.go
}
And my routes.go
package main
func Register_routes(r *Router){
r.addRoute("test", "test", "test)
}
But after I do go run server.go
I'm getting following error
$ go run server.go
# command-line-arguments
./server.go:10: undefined: Register_routes
GOPATH variable points to my /blog folder
What am I missing? Why go doesn't it see files in subdirectories?
P.S. config/routes.go is part of server.go package
P.P.S I have moved routes.go to the same folder as server.go, but the error is still present
In order to use a function defined in another package, first you have to import it:
import "myblog/config"
And after that you have to refer to it by the package name:
config.Register_routes(r)
Also the package name should reflect the folder name in which it is defined. In your routes.go the package should be config. Package main is special, the main package will be compiled into an executable binary (it is the entry point of the program). See Program Execution in the language specification.
From the page you linked: Package names:
Go's convention is that the package name is the last element of the import path: the package imported as "crypto/rot13" should be named rot13.
Executable commands must always use package main.
There is no requirement that package names be unique across all packages linked into a single binary, only that the import paths (their full file names) be unique.
Check out the blog post Package names for a detailed guideline.
Note that different files of the same package have to be put into the same folder. And different files of the same package can use everything from the package without importing it and without using the package name (doesn't matter in which file it is defined). This is also true for unexported identifiers. From another package you can only access exported identifiers (their name must start with a capital letter).
Also the go naming convention is to used mixed caps rather than underscores to write multiword names, see Effective Go / MixedCaps. So the function should be named RegisterRoutes but this is not a requirement.

Golang: "package ast_test" underscore test

Source file from Golang's stdlib
File's base directory: ast
Package specified in the file: ast_test ???
Package specified in all other files inside the same directory: ast
From golang.org:
src contains Go source files organized into packages (one package per directory) ...
By convention, packages are given lower case, single-word names; there should be no need for underscores or mixedCaps
... Another convention is that the package name is the base name of its source directory
How is it possible to have multiple packages (here 2) in one folder?
You find another example in src/pkg/go/ast/commentmap_test.go, with the comment:
// To avoid a cyclic dependency with go/parser, this file is in a separate package.
I suppose it allows for an othogonal command like:
go test
That will test parser features while avoiding for that test to be part of the same parser features (since it has been put in a separate package)
From go command man page:
Test files that declare a package with the suffix "_test" will be compiled as a separate package, and then linked and run with the main test binary.
This thread asked the question:
Now that the go tool requires each directory to be one package and doesn't allow to have files with different package names inside the same folder, how is the package keyword useful? It seems like a unnecessary repetition.
Is it required by the compiler or is there any plan to remove it?
The answers hinted at the fact that you can have more than one package in a folder:
The package declaration declares the name of the package.
The language Go doesn't know what a file or a directory is and the import path itself doesn't effect the actual name of the package that is being imported. So the only way the compiler knows what to call the package is the package declaration.
The language doesn't require separate packages to be in separate directories; it is a requirement of the go tool.
Another hypothetical implementation may not have this requirement.
Even this go tool requirement can be bypassed thanks to the "// +build" build tags.
For example, read misc/cgo/gmp or misc/cgo/stdio (some files include // +build ignore)

Resources