In Go, when will a variable become unreachable? - go

Go 1.7 beta 1 was released this morning; here is the release notes draft for Go 1.7. A new function, KeepAlive, was added to the runtime package. The documentation for runtime.KeepAlive gives an example:
type File struct { d int }
d, err := syscall.Open("/file/path", syscall.O_RDONLY, 0)
// ... do something if err != nil ...
p := &File{d}
runtime.SetFinalizer(p, func(p *File) { syscall.Close(p.d) })
var buf [10]byte
n, err := syscall.Read(p.d, buf[:])
// Ensure p is not finalized until Read returns.
runtime.KeepAlive(p)
// No more uses of p after this point.
The documentation for runtime.SetFinalizer also explains why runtime.KeepAlive is needed:
For example, if p points to a struct that contains a file descriptor
d, and p has a finalizer that closes that file descriptor, and if the
last use of p in a function is a call to syscall.Write(p.d, buf,
size), then p may be unreachable as soon as the program enters
syscall.Write. The finalizer may run at that moment, closing p.d,
causing syscall.Write to fail because it is writing to a closed file
descriptor (or, worse, to an entirely different file descriptor opened
by a different goroutine). To avoid this problem, call
runtime.KeepAlive(p) after the call to syscall.Write.
What confuses me is that the variable p has not left its scope yet, so why would it be unreachable? Does that mean a variable becomes unreachable as soon as there are no further uses of it in the following code, regardless of whether it is still in scope?

A variable becomes unreachable when the runtime detects that the Go code cannot reach a point where that variable is referenced again.
In the example you posted, a syscall.Open() is used to open a file. The returned file descriptor (which is just an int value) is "wrapped" in a struct. Then a finalizer is attached to this struct value that closes the file descriptor. Now when this struct value becomes unreachable, its finalizer may be run at any moment, and the closing / invalidation / re-using of the file descriptor could cause unexpected behavior or errors in the execution of the Read() syscall.
The last use of this struct value p in Go code is when syscall.Read() is invoked (and the file descriptor p.d is passed to it). The implementation of the syscall will use that file descriptor after the initiation of syscall.Read(), it may do so up until syscall.Read() returns. But this use of the file descriptor is "independent" of the Go code.
So the struct value p is not used during the execution of the syscall, and the syscall blocks the Go code until it returns. This means the Go runtime is allowed to mark p as unreachable during the execution of Read() (before Read() returns), or even before its actual execution begins (because p is only used to provide the arguments for the Read() call).
Hence the call to runtime.KeepAlive(): since this call comes after syscall.Read() and references the variable p, the Go runtime is not allowed to mark p as unreachable before Read() returns.
Note that you could use other constructs to "keep p alive", e.g. _ = p or returning it. runtime.KeepAlive() does nothing magical in the background, its implementation is:
func KeepAlive(interface{}) {}
runtime.KeepAlive() does provide a much better alternative because:
It clearly documents we want to keep p alive (to prevent runs of Finalizers).
Using other constructs such as _ = p might get "optimized" out by future compilers, but not runtime.KeepAlive() calls.
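As a rough, portable sketch of the pattern discussed above: the program below simulates the descriptor table with a map instead of making real syscalls. The names table, nextFD, open, read and readKeepingAlive are hypothetical and exist only for this illustration. read receives just the int, never p, so without the KeepAlive call the runtime would be free to finalize p while read is still running.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// table is a hypothetical stand-in for the kernel's descriptor table.
var (
	mu     sync.Mutex
	table  = map[int]bool{}
	nextFD = 3
)

// File wraps a (simulated) file descriptor, as in the question.
type File struct{ d int }

// open registers a fresh descriptor and attaches a finalizer that "closes" it.
func open() *File {
	mu.Lock()
	d := nextFD
	nextFD++
	table[d] = true
	mu.Unlock()

	p := &File{d}
	runtime.SetFinalizer(p, func(p *File) {
		mu.Lock()
		delete(table, p.d)
		mu.Unlock()
	})
	return p
}

// read plays the role of syscall.Read: it receives only the int, never p.
// It forces a GC cycle to mimic a collection happening during the call and
// reports whether the descriptor is still open afterwards.
func read(d int) bool {
	runtime.GC()
	mu.Lock()
	defer mu.Unlock()
	return table[d]
}

// readKeepingAlive reports whether the descriptor survived the read.
func readKeepingAlive() bool {
	p := open()
	ok := read(p.d)
	// p is referenced here, so it stays reachable until read has returned.
	runtime.KeepAlive(p)
	return ok
}

func main() {
	fmt.Println("descriptor still open during read:", readKeepingAlive())
}
```

With the KeepAlive call in place, the finalizer cannot be queued before read returns, so the descriptor is still open when read checks it. Removing the call does not guarantee a failure; it only removes the guarantee.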

Related

Question on the Go memory model, the last example

I have a question on the Go memory model.
In the last example:
type T struct {
msg string
}
var g *T
func setup() {
t := new(T)
t.msg = "hello, world"
g = t
}
func main() {
go setup()
for g == nil {
}
print(g.msg)
}
In my opinion, reads and writes of values that fit in a single machine word are atomic. I ran the test many times, and the message was always observed.
So please tell me why g.msg is not guaranteed to be observed. I would like to know the reason in detail.
Because there are 2 write operations in the launched goroutine:
t := new(T) // One
t.msg = "hello, world" // Two
g = t
It may be that the main goroutine observes the non-nil pointer assignment to g in the last line, but since there is no explicit synchronization between the 2 goroutines, the compiler is allowed to reorder the operations (as long as that does not change the behavior within the launched goroutine), e.g. to the following:
t := new(T) // One
g = t
t.msg = "hello, world" // Two
If the operations were rearranged like this, the behavior of the launched goroutine (setup()) would not change, so a compiler is allowed to do this. And in this case the main goroutine could observe the effect of g = t, but not that of t.msg = "hello, world".
Why would a compiler reorder the operations? E.g. because a different order may result in a more efficient code. E.g. if the pointer assigned to t is already in a register, it can also be assigned to g right away, without having to reload it again if the assignment to g would not be executed right away.
This is mentioned in the Happens Before section:
Within a single goroutine, reads and writes must behave as if they executed in the order specified by the program. That is, compilers and processors may reorder the reads and writes executed within a single goroutine only when the reordering does not change the behavior within that goroutine as defined by the language specification. Because of this reordering, the execution order observed by one goroutine may differ from the order perceived by another. For example, if one goroutine executes a = 1; b = 2;, another might observe the updated value of b before the updated value of a.
If you use proper synchronization, the compiler is forbidden from performing any reordering that would change the behavior observed by other goroutines.
Running your example any number of times and not observing this does not mean anything. It may be the problem will never arise, it may be it will arise on a different architecture, or on a different machine, or when compiled with a different (future) version of Go. Simply do not rely on such behavior that is not guaranteed. Always use proper synchronization, never leave any data races in your app.
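One possible way to add the synchronization is sketched below, using a channel that is closed after both writes. This is only an illustration, not the only option (a sync.WaitGroup or a mutex would also do); the function name message and the channel done are made up for the example.

```go
package main

import "fmt"

type T struct{ msg string }

// message publishes a *T from one goroutine and reads it from another,
// using a channel close as the synchronization point.
func message() string {
	var g *T
	done := make(chan struct{})

	go func() {
		t := new(T)
		t.msg = "hello, world"
		g = t
		close(done) // sequenced after both writes above
	}()

	<-done // this receive completes only after close(done)
	return g.msg
}

func main() {
	fmt.Println(message())
}
```

The memory model states that the closing of a channel is synchronized before a receive that returns because the channel is closed, so both the write to g and the write to t.msg are visible once the receive completes. The busy loop on g == nil is no longer needed.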

Why is the finalizer never called?

var p = &sync.Pool{
New: func() interface{} {
return &serveconn{}
},
}
func newServeConn() *serveconn {
sc := p.Get().(*serveconn)
runtime.SetFinalizer(sc, (*serveconn).finalize)
fmt.Println(sc, "SetFinalizer")
return sc
}
func (sc *serveconn) finalize() {
fmt.Println(sc, "finalize")
*sc = serveconn{}
runtime.SetFinalizer(sc, nil)
p.Put(sc)
}
The above code tries to reuse objects via SetFinalizer, but after debugging I found that the finalizer is never called. Why?
UPDATE
This may be related: https://github.com/golang/go/issues/2368
The finalizer is only called on an object when the GC
marks it as unused and then tries to sweep (free) at the end
of the GC cycle.
As a corollary, if a GC cycle is never performed during the runtime of your program, the finalizers you set may never be called.
Just in case you hold a wrong assumption about Go's GC, it may be worth noting that Go does not employ reference counting on values; instead, it uses a GC which works in parallel with the program, and the sessions during which it works happen periodically and are triggered by certain parameters such as the pressure on the heap produced by allocations.
A couple assorted notes regarding finalizers:
When the program terminates, no GC is forcibly run.
A corollary of this is that a finalizer is not guaranteed
to run at all.
If the GC finds a finalizer on an object about to be freed,
it calls the finalizer but does not free the object.
The object itself will be freed only at the next GC cycle —
wasting the memory.
All in all, you appear to be trying to implement destructors.
Please don't: make your objects implement the sort-of standard method called Close and state in the contract of your type that the programmer is required to call it when they're done with the object.
When a programmer wants to call such a method no matter what, they use defer.
Note that this approach works perfectly for all types in the Go
stdlib which wrap resources provided by the OS—file and socket descriptors. So there is no need to pretend your types are somehow different.
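A minimal sketch of that advice applied to the pooled type from the question might look as follows. The fields of serveconn and the handle function are assumptions made for the illustration; the original post does not show them.

```go
package main

import (
	"fmt"
	"sync"
)

// serveconn stands in for the type from the question; the id field is
// invented for this sketch.
type serveconn struct {
	id int
}

var pool = sync.Pool{
	New: func() interface{} { return &serveconn{} },
}

// newServeConn takes a connection object from the pool and initializes it.
func newServeConn(id int) *serveconn {
	sc := pool.Get().(*serveconn)
	sc.id = id
	return sc
}

// Close resets the object and hands it back to the pool. The caller must
// not use sc after calling Close.
func (sc *serveconn) Close() error {
	*sc = serveconn{}
	pool.Put(sc)
	return nil
}

// handle shows the intended usage: acquire, defer Close, do the work.
func handle(id int) string {
	sc := newServeConn(id)
	defer sc.Close()
	return fmt.Sprintf("served %d", sc.id)
}

func main() {
	fmt.Println(handle(7))
}
```

Here the object returns to the pool at a well-defined point (when handle returns) rather than whenever a GC cycle happens to run.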
Another useful thing to keep in mind is that Go was explicitly engineered to be a no-nonsense, no-frills, no-magic, in-your-face language, and you're just trying to add magic to it.
Please don't; those who like deciphering layers of magic program in different languages.

When can reading rand.Reader result in an error?

Do I understand correctly that crypto/rand.Reader can return Read error only on platforms not listed below, i.e. when it is not actually implemented?
// Reader is a global, shared instance of a cryptographically
// strong pseudo-random generator.
//
// On Linux, Reader uses getrandom(2) if available, /dev/urandom otherwise.
// On OpenBSD, Reader uses getentropy(2).
// On other Unix-like systems, Reader reads from /dev/urandom.
// On Windows systems, Reader uses the CryptGenRandom API.
var Reader io.Reader
TL;DR: crypto/rand's Read() (and Reader.Read()) may fail for a variety of reasons, even on the platforms listed as supported. Do not assume that calls to these functions will always succeed. Always check the error return value.
Do I understand correctly that crypto/rand.Reader can return Read error only on platforms not listed below, i.e. when it is not actually implemented?
No. For example, have a look at the Linux implementation of rand.Reader. If available, this implementation will use the getrandom Linux system call, which may fail with a number of errors (most importantly, EAGAIN):
EAGAIN - The requested entropy was not available, and getrandom() would
have blocked if the GRND_NONBLOCK flag was not set.
The EAGAIN error quite literally tells you to "try again later"; the official meaning according to man 3 errno is "Resource temporarily unavailable". So when receiving an EAGAIN error you could simply keep trying for a certain time.
If getrandom is not available, the crypto/rand package will try to open and read from /dev/urandom (see source code), which might also fail for any number of reasons. These errors are not necessarily temporary (for example, issues with file system permissions); if your application depends on the availability of random data, you should treat such an error like any other kind of non-recoverable error in your application.
For these reasons, you should not assume that rand.Read() will always succeed on Linux/UNIX and always check rand.Read()'s error return value.
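A small sketch of what checking the error could look like in practice is shown below. The helper name randomToken and the hex encoding are choices made for this illustration only.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomToken returns n random bytes, hex-encoded. It propagates any error
// from rand.Read instead of assuming the read always succeeds.
func randomToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	return hex.EncodeToString(b), nil
}

func main() {
	tok, err := randomToken(16)
	if err != nil {
		// Treated as non-recoverable here, as suggested above.
		panic(err)
	}
	fmt.Println(tok)
}
```

Whether to retry on a temporary error or abort is an application-level decision; the sketch simply returns the error to the caller.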
type io.Reader
Reader is the interface that wraps the basic Read method.
Read reads up to len(p) bytes into p. It returns the number of bytes
read (0 <= n <= len(p)) and any error encountered. Even if Read
returns n < len(p), it may use all of p as scratch space during the
call. If some data is available but not len(p) bytes, Read
conventionally returns what is available instead of waiting for more.
When Read encounters an error or end-of-file condition after
successfully reading n > 0 bytes, it returns the number of bytes read.
It may return the (non-nil) error from the same call or return the
error (and n == 0) from a subsequent call. An instance of this general
case is that a Reader returning a non-zero number of bytes at the end
of the input stream may return either err == EOF or err == nil. The
next Read should return 0, EOF.
Callers should always process the n > 0 bytes returned before
considering the error err. Doing so correctly handles I/O errors that
happen after reading some bytes and also both of the allowed EOF
behaviors.
Implementations of Read are discouraged from returning a zero byte
count with a nil error, except when len(p) == 0. Callers should treat
a return of 0 and nil as indicating that nothing happened; in
particular it does not indicate EOF.
Implementations must not retain p.
type Reader interface {
Read(p []byte) (n int, err error)
}
No. io.Readers return errors.

GoLang CGO file handles

I’m working with a native linux C binary which has a fairly expensive initialization call which I would like to perform once at application startup. This call should open a bunch of file handles internally for later use. When I call this expensive initialization C function from Go, it completes successfully and correctly opens the files but those handles are open only for the duration of the call to the C function! This means that when I call successive C functions against the same library from Go, the file handles are no longer open and the calls fail. I have verified this using the lsof command. Interestingly, when the initialization call as well as calls to subsequent behavior are composed into a single C function which is then called from Go, the files are opened and remain open, allowing successful completion of all desired functionality.
Is there some kind of undocumented cgo behavior which is “cleaning up”, shutting down, or even leaking file handles or other stateful resources between multiple invocations of C functions from Go? If so, is this behavior configurable? We don’t have access to the source code for this library.
Also, I've verified that this is not related to thread-local storage. Calling runtime.LockOSThread() has no effect and we've verified that the files are closed after control returns from C back to the calling Go code.
Here’s an example of the kind of Go code I’d like to write:
// Go code:
func main() {
C.Initialize()
C.do_stuff() // internal state is already cleaned up! This call fails as a result. :(
}
Here’s an example of a C function that invokes the initialization and behavior all at once. This “wrapping” function is invoked from Go:
// C code:
void DoEverything(void)
{
Initialize();
do_stuff(); // succeeds because all internal state is intact (not cleaned up).
}
Ok, this is a bit embarrassing, but I figured it out. Right after calling initialize(), I was calling defer close(), but it was actually defer fmt.Println(close()). Because arguments to deferred functions are resolved immediately (not deferred), the close function was being invoked before we could invoke any other behavior. The golang blog clearly explains argument resolution to deferred function calls.
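The pitfall can be reproduced in pure Go. In the sketch below, closeAll and report are hypothetical stand-ins for the close() and fmt.Println calls mentioned above, and events merely records the order in which things run.

```go
package main

import (
	"fmt"
	"strings"
)

// events records the order in which things happen (illustrative only).
var events []string

// closeAll is a hypothetical cleanup function standing in for close().
func closeAll() string {
	events = append(events, "close")
	return "closed"
}

// report stands in for fmt.Println in the original bug.
func report(s string) {
	events = append(events, "report "+s)
}

// buggy defers report, but its argument closeAll() is evaluated immediately,
// so the cleanup runs before the actual work.
func buggy() string {
	events = nil
	defer report(closeAll())
	events = append(events, "work")
	return strings.Join(events, ", ")
}

// fixed wraps the call in a closure, so closeAll runs only when fixed returns.
func fixed() string {
	events = nil
	defer func() { report(closeAll()) }()
	events = append(events, "work")
	return strings.Join(events, ", ")
}

func main() {
	fmt.Println("buggy:", buggy()) // buggy: close, work
	fmt.Println("fixed:", fixed()) // fixed: work
}
```

In buggy, "close" is recorded before "work" because only the outer call is deferred. In fixed, nothing has been closed yet at the point where the return value is computed.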

How to interpret negative line number in stack trace

I made some changes to a fairly large project of mine today, and now I'm getting some odd behavior. Because I'm a knucklehead, I can't go back and figure out what I did.
But the main thrust of my question is how I should understand the negative line number in the stack trace that is printed. The -1218 below is the one that I mean.
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x1 pc=0x80501f2]
goroutine 1 [running]:
server.init() // vv-------------RIGHT HERE
/home/.../debugComponent.go:-1218 +0x282
_/home/.../possessions.init()
/home/.../possessions.go:29 +0x42
_/home/.../pageWrap.init()
/home/.../pageWrap.go:112 +0x47
main.init()
/home/.../main.go:0 +0x3c
goroutine 2 [syscall]:
goroutine 3 [runnable]:
The associated debugComponent.go file is pretty non-essential right now, so I removed it to see what would happen, and the file name just gets replaced with a different one, and a different negative number.
I've had to find plenty of bugs while developing this app, but this one has got me stumped.
If it helps, there's the main.go and then several packages in play. The three files listed above are all different packages, and this seems to be happening during the imports.
I hope you've read this far, because here's the strangest part. If I add this declaration to main.go, the error goes away!
var test = func() int { return 1 }() // Everything is fine now!
Very confusing! It doesn't fix it if I do var test = "foobar". It has to be the invoked func.
Any insight is appreciated, but mostly I'm curious about the -1218 in the trace.
Update
I'm trying to get this down to a small example that reproduces the issue. After working on it I reverted back to my original code, and restarted the machine.
The first time I tried to build and run, two new entries were added to the top of the stack trace. But only the first time.
goroutine 1 [syscall]:
syscall.Syscall()
/usr/local/go/src/pkg/syscall/asm_linux_386.s:14 +0x5
syscall.Mkdir(0x83a2f18, 0x2, 0x2, 0x806255e, 0x83a2f1c, ...)
/usr/local/go/src/pkg/syscall/zerrors_linux_386.go:2225 +0x80
server.init()
So this would be in line with my main question about interpreting stack trace. The -1218 is still there, but now there are these.
The asm_linux_386.s has this at line 14:
MOVL 4(SP), AX // syscall entry
I found the zerrors_linux_386.go too, but there's no line 2225. The file stops long before that line.
It's already reported and accepted as Issue 5243.
Program execution
A package with no imports is initialized by assigning initial values
to all its package-level variables and then calling any package-level
function with the name and signature of
func init()
defined in its source. A package-scope or file-scope identifier with
name init may only be declared to be a function with this signature.
Multiple such functions may be defined, even within a single source
file; they execute in unspecified order.
Within a package, package-level variables are initialized, and
constant values are determined, according to order of reference: if
the initializer of A depends on B, A will be set after B. Dependency
analysis does not depend on the actual values of the items being
initialized, only on their appearance in the source. A depends on B if
the value of A contains a mention of B, contains a value whose
initializer mentions B, or mentions a function that mentions B,
recursively. It is an error if such dependencies form a cycle. If two
items are not interdependent, they will be initialized in the order
they appear in the source, possibly in multiple files, as presented to
the compiler. Since the dependency analysis is done per package, it
can produce unspecified results if A's initializer calls a function
defined in another package that refers to B.
An init function cannot be referred to from anywhere in a program. In
particular, init cannot be called explicitly, nor can a pointer to
init be assigned to a function variable.
If a package has imports, the imported packages are initialized before
initializing the package itself. If multiple packages import a package
P, P will be initialized only once.
The importing of packages, by construction, guarantees that there can
be no cyclic dependencies in initialization.
A complete program is created by linking a single, unimported package
called the main package with all the packages it imports,
transitively. The main package must have package name main and declare
a function main that takes no arguments and returns no value.
func main() { … }
Program execution begins by initializing the main package and then
invoking the function main. When the function main returns, the
program exits. It does not wait for other (non-main) goroutines to
complete.
Package initialization—variable initialization and the invocation of
init functions—happens in a single goroutine, sequentially, one
package at a time. An init function may launch other goroutines, which
can run concurrently with the initialization code. However,
initialization always sequences the init functions: it will not start
the next init until the previous one has returned.
As your program begins execution, it initializes package variables and executes init functions. Adding package variables is going to change the initialization. It looks like the initialization failed in debugComponent.go on something related to server.init(). The negative line number is probably a bug.
Without the source code, it's hard to say more.
