The Go docs say that
Seed, unlike the Rand.Seed method, is safe for concurrent use.
rand.Seed is in the math/rand package, but what is Seed? If Seed is another function, it doesn't appear to be in math/rand, so where does it come from?
update:
I'm exploring the demo program where in main we execute
rand.Seed(time.Now().UnixNano())
go process(...)
go process(...)
where process is defined like this:
func process(...) {
	time.Sleep(time.Duration(rand.Intn(30)) * time.Second)
	...
}
We are using the same seed in two different threads, so is this use of rand.Seed considered thread-unsafe?
There is a rand.Seed() function, and there is a Rand.Seed() method. Your quote originates from the documentation of the rand.Seed() function.
Global functions of the math/rand package operate on a global rand.Rand instance. If you check the source code of rand.Seed():
func Seed(seed int64) { globalRand.Seed(seed) }
The global functions are safe for concurrent use, so all other packages can use them (in a shared manner). The global rand.Rand instance is provided for convenience: you can use it "out of the box" without any preparation (except the need to properly seed it) and without any synchronization.
Instances of rand.Rand are not safe for concurrent use. Each goroutine that needs a rand.Rand for deterministic random sequences should create its own and seed it appropriately. If a rand.Rand is to be shared between multiple goroutines, explicit synchronization is required.
Pros of using the global rand.Rand (via the global functions) are: (1) ease of use (it's implicitly shared with everyone) and (2) no synchronization is needed.
Pros of creating and using a custom rand.Rand instance: (1) it's faster (it's not synchronized implicitly) and (2) you are in control of who has access to it, so you can use it to repeat pseudo-random sequences (you can't do that with the global instance as "anyone" may use it concurrently with you).
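As a minimal sketch of the second option (the function name sequence and the seed 42 are only illustrative), each goroutine can build its own rand.Rand. Because the generator is never shared, no locking is needed, and the same seed reproduces the same sequence:

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// sequence returns n pseudo-random ints in [0, 100) drawn from a private
// generator seeded with seed. The *rand.Rand never leaves this function,
// so it needs no synchronization.
func sequence(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(100)
	}
	return out
}

func main() {
	var wg sync.WaitGroup
	results := make([][]int, 2)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot.
			results[i] = sequence(42, 5)
		}(i)
	}
	wg.Wait()
	fmt.Println(results[0])
	fmt.Println(results[1])
}
```

Both printed slices are identical, since each goroutine drew from its own generator seeded with the same value.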
Edit:
We are using the same seed in two different threads, so is this use of rand.Seed considered thread-unsafe?
You only call rand.Seed once, so it doesn't even matter whether it's thread-safe or not: it is not called concurrently. Only if rand.Seed() were called from multiple goroutines concurrently would it matter whether it's safe for concurrent use. And as stated earlier in my answer: "The global functions are safe for concurrent use..."
What you do call from multiple goroutines concurrently is rand.Intn(), but again, it's safe to do that.
The code below does not report a data race:
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	x := strings.Repeat(" ", 1024)
	go func() {
		for {
			fmt.Fprintf(os.Stdout, x+"aa\n")
		}
	}()
	go func() {
		for {
			fmt.Fprintf(os.Stdout, x+"bb\n")
		}
	}()
	go func() {
		for {
			fmt.Fprintf(os.Stdout, x+"cc\n")
		}
	}()
	go func() {
		for {
			fmt.Fprintf(os.Stdout, x+"dd\n")
		}
	}()
	<-make(chan bool)
}
I tried multiple data lengths, with this variant: https://play.golang.org/p/29Cnwqj5K30
This post says it is not thread-safe.
This mail does not really answer the question, or I did not understand it.
The package documentation of os and fmt doesn't say much about this. I admit I did not dig into the source code of those two packages to find further explanations; they appear too complex to me.
What are the recommendations, and where are they documented?
I'm not sure it would qualify as a definitive answer but I'll try to provide some insight.
The F*-functions of the fmt package merely state that they take a value of a type implementing the io.Writer interface and call Write on it.
The functions themselves are safe for concurrent use, in the sense that it's OK to call any number of fmt.Fwhatever functions concurrently: the package itself is prepared for that.
But when it comes to concurrently writing to the same value of a type implementing io.Writer, the question becomes more complex, because satisfying an interface in Go says nothing about the concurrency properties of the underlying type.
In other words, the real point of where the concurrency may or may not be allowed is deferred to the "writer" which the functions of fmt write to.
(One should also keep in mind that the fmt.*Print* functions are allowed to call Write on their destination any number of times in a row during a single invocation, as opposed to the functions provided by the stock log package.)
So, we basically have two cases:
Custom implementations of io.Writer.
Stock implementations of it, such as *os.File or wrappers around sockets produced by the functions of net package.
The first case is the simple one: whatever the implementor did.
The second case is harder: as I understand it, the Go standard library's stance on this (albeit not clearly stated in the docs) is that the wrappers it provides around "things" provided by the OS, such as file descriptors and sockets, are reasonably "thin", and hence whatever semantics the OS implements for them is passed through by the stdlib code running on a particular system.
For instance, POSIX requires that write(2) calls are atomic with regard to one another when they are operating on regular files or symbolic links. This means, since any call to Write on things wrapping file descriptors or sockets actually results in a single "write" syscall of the target system, you might consult the docs of the target OS and get the idea of what will happen.
Note that POSIX only speaks about filesystem objects. If os.Stdout is opened to a terminal (or a pseudo-terminal), to a pipe, or to anything else that supports the write(2) syscall, the results will depend on what the relevant subsystem and/or driver implements: for instance, data from multiple concurrent calls may be interspersed, or one or both of the calls may simply be failed by the OS (unlikely, but still).
Going back to Go, from what I gather, the following facts hold true about the Go stdlib types which wrap file descriptors and sockets:
They are safe for concurrent use by themselves (I mean, on the Go level).
They "map" Write and Read calls 1-to-1 to the underlying object—that is, a Write call is never split into two or more underlying syscalls, and a Read call never returns data "glued" from the results of multiple underlying syscalls.
(By the way, people occasionally get tripped up by this no-frills behaviour; see this or this for examples.)
So, combining this with the fact that the fmt.*Print* functions are free to call Write any number of times per single call, your examples which use os.Stdout:
Will never result in a data race, unless you've assigned some custom implementation to the variable os.Stdout.
Will write data to the underlying FD intermixed in an unpredictable order, which may depend on many factors including the OS kernel version and settings, the version of Go used to build the program, the hardware and the load on the system.
TL;DR
Multiple concurrent calls to fmt.Fprint* writing to the same "writer" value defer their concurrency to the implementation (type) of the "writer".
It's impossible to have a data race with "file-like" objects provided by the Go stdlib in the setup you have presented in your question.
The real problem is not data races at the level of the Go program but concurrent access to a single resource at the level of the OS. There, we do not (usually) speak about data races, because the commodity OSes Go supports expose things one may "write to" as abstractions, where a real data race would likely indicate a bug in the kernel or in a driver (and Go's race detector could not detect it anyway, as that memory is not owned by the Go runtime powering the process).
Basically, in your case, if you need to be sure the data produced by any particular call to fmt.Fprint* comes out as a single contiguous piece to the actual data receiver provided by the OS, you need to serialize these calls as the fmt package provides no guarantees regarding the number of calls to Write on the supplied "writer" for the functions it exports.
The serialization may be either external (explicit, that is "take a lock, call fmt.Fprint*, release the lock") or internal (by wrapping os.Stdout in a custom type which manages a lock, and using that type instead).
And while we're at it, the log package does just that, and can be used straight away, as the "loggers" it provides, including the default one, allow you to suppress the "log headers" (such as the timestamp and the name of the file).
I am new to Go and I am reading the examples from the book gopl.
Section 9.8.4 of The Go Programming Language explains why goroutines have no notion of identity that is accessible to the programmer:
Goroutines have no notion of identity that is accessible to the programmer. This is by design, since thread-local storage tends to be abused. For example, in a web server implemented in a language with thread-local storage, it’s common for many functions to find information about the HTTP request on whose behalf they are currently working by looking in that storage. However, just as with programs that rely excessively on global variables, this can lead to an unhealthy ‘‘action at a distance’’ in which the behavior of a function is not determined by its arguments alone, but by the identity of the thread in which it runs. Consequently, if the identity of the thread should change—some worker threads are enlisted to help, say—the function misbehaves mysteriously.
and uses the example of a web server to illustrate this point. However, I have difficulty understanding why the so-called "action at a distance" is bad practice and how this leads to
a function is not determined by its arguments alone, but by the identity of the thread in which it runs.
Could anyone explain this (preferably with short code snippets)?
Any help is appreciated!
Let's say we have the following code:
func doubler(num int) int {
	return num + num
}
doubler(5) will return 10. go doubler(5) will also compute 10 (although a go statement discards the function's return value).
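You can do the same with some sort of thread-local storage, if you want (getThreadLocal and setThreadLocal below are hypothetical helpers; Go itself provides no such API):
func doubler() int {
	return getThreadLocal("num") + getThreadLocal("num")
}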
And we could run this with something like:
go func() {
	setThreadLocal("num", 10)
	doubler()
}()
But which is clearer? The variant which explicitly passes the num argument, or the variant which "magically" gets this from sort of thread-local storage?
This is what is meant with "action at a distance". The line setThreadLocal("num", 10) (which is distant) affects how doubler() behaves.
This example is clearly artificial, but the same principle applies to more realistic examples. For example, in some environments it's not uncommon to use thread-local storage for things such as user information, or other "global" variables.
This is why the paragraph you quoted compared it to global variables: thread-local storage is a set of global variables that apply only to the current thread.
When you pass values as arguments, things are a lot more clearly defined. There is no magic (often undocumented) global state that you need to think of when debugging things or writing tests.
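Since Go has no thread-local storage, here is a rough runnable approximation that uses a package-level variable as a stand-in for it. All names (currentUser, greetImplicit, greetExplicit) are made up for the illustration:

```go
package main

import "fmt"

// currentUser stands in for thread-local storage: hidden state that
// greetImplicit reads even though it is not part of its signature.
var currentUser string

// greetImplicit depends on whatever currentUser happens to hold.
func greetImplicit() string {
	return "hello, " + currentUser
}

// greetExplicit depends only on its argument.
func greetExplicit(user string) string {
	return "hello, " + user
}

func main() {
	currentUser = "alice"
	fmt.Println(greetImplicit()) // hello, alice

	// A distant assignment silently changes what the same call returns.
	currentUser = "bob"
	fmt.Println(greetImplicit()) // hello, bob

	// The explicit variant always says what it was asked to say.
	fmt.Println(greetExplicit("alice")) // hello, alice
}
```

The two greetImplicit() calls look identical at the call site yet return different results, which is the "action at a distance" the book warns about. Testing greetExplicit needs no setup at all.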
See my repo
package main

import (
	"fmt"
	"time"

	"github.com/timandy/routine"
)

func main() {
	goid := routine.Goid()
	fmt.Printf("cur goid: %v\n", goid)
	go func() {
		goid := routine.Goid()
		fmt.Printf("sub goid: %v\n", goid)
	}()
	// Wait for the sub-coroutine to finish executing.
	time.Sleep(time.Second)
}
I recommend looking at this post for an example of why someone might want to get information about the thread a function is running in:
stackoverflow - main threads in C#
As pointed out in the question, conditioning the behavior of a function on certain thread requirements (most likely) produces fragile/error-prone code that is difficult to debug.
I guess what your textbook is trying to say is that a function should never rely on running in a specific thread, look up threads, etc., because this can cause unexpected behavior (especially in an API, if it's not obvious to the end user that a function has to run in a specific thread). In Go, anything like that is impossible purely by language design. The behavior of a goroutine never depends on threads or anything similar, simply because goroutines don't have an identity, as you correctly say.
As the title says, I am asking about sync.Map from Go's sync package: can its methods be considered atomic? Mainly the Load, Store, LoadOrStore, and Delete methods.
I also built a simple example on the Go playground. Is it guaranteed that only one goroutine can enter the code on lines 15-17? My tests suggest that it is.
Please help me understand.
The godoc page for the sync package says: "Map is like a Go map[interface{}]interface{} but is safe for concurrent use by multiple goroutines without additional locking or coordination."
This statement guarantees that there's no need for additional mutexes or synchronization across goroutines. I wouldn't call that claim "atomic" (which has a very precise meaning), but it does mean that you don't have to worry about multiple goroutines being able to enter a LoadOrStore block (with the same key) like in your example.
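A small sketch of that guarantee, assuming the playground example looked roughly like this (the winners function and the key "key" are invented here): many goroutines race on LoadOrStore with the same key, and exactly one of them observes loaded == false.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// winners starts n goroutines that all call LoadOrStore with the same key
// and reports how many of them actually stored the value.
func winners(n int) int32 {
	var m sync.Map
	var count int32
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// loaded is false only for the goroutine whose value was stored.
			if _, loaded := m.LoadOrStore("key", id); !loaded {
				atomic.AddInt32(&count, 1)
			}
		}(i)
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println(winners(100)) // 1
}
```

Note that this only covers the single LoadOrStore call. A separate Load followed by a Store is two operations, and another goroutine can run in between them.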
I am programming in Go and using a mutex to lock certain variables so they cannot be overwritten while being read.
This got me thinking: since you can read a variable multiple times, is there ever a scenario where you have to lock a constant?
The rule is simple: if multiple goroutines access a variable concurrently, and at least one of the accesses is a write, then synchronization is required.
If we talk about constants, then there is no variable, and you cannot take the address of a constant (for details, see Find address of constant in go), so it is not possible to modify constant values.
You do not need any synchronization to access constants from multiple goroutines.
If you are talking about constants, there is no need to use synchronization to access them (as @icza suggests).
But if by const you mean a variable whose value is not changed once it has been assigned, then you should be careful: under the Go memory model, the assignment must happen before the reads in the other goroutines, which requires some form of synchronization.
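One possible way to get that happens-before edge for a write-once value is sync.Once. This is my choice for illustration, not something the answers above prescribe, and the names greeting, config and loadConfig are made up:

```go
package main

import (
	"fmt"
	"sync"
)

// greeting is a true constant: goroutines may read it freely.
const greeting = "hello"

var (
	configOnce sync.Once
	config     string // assigned exactly once, then only read
)

// loadConfig initializes config on first use. sync.Once guarantees that
// the assignment happens before any call to loadConfig returns.
func loadConfig() string {
	configOnce.Do(func() {
		config = greeting + ", world"
	})
	return config
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = loadConfig()
		}()
	}
	wg.Wait()
	fmt.Println(loadConfig()) // hello, world
}
```

If the value is assigned before the reading goroutines are started, the go statement itself already provides the required ordering and nothing more is needed.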
I'm writing a CMS in Go and have a session type (user id, page contents to render, etc). Ideally I'd like that type to be a global variable so I don't have to propagate it through all the nested functions. However, having a global variable like that would obviously mean that each new session would overwrite its predecessor, which, needless to say, would be an epic fail.
Some languages do offer a way of having globals within threads that are preserved within that thread (i.e. the value of that global is sandboxed within that thread). While I'm aware that goroutines are not threads, I just wondered if there was a similar method at my disposal or if I'd have to pass a local pointer of my session type down through the various nested routines.
I'm guessing channels wouldn't do this? From what I can gather (and please correct me if I'm wrong here), they're basically just a safe way of sharing global variables?
edit: I'd forgotten about this question! Anyhow, an update for anyone who is curious. This question was written back when I was new to Go and the CMS was basically my first project. I was coming from a C background with familiarity with POSIX threads, but I quickly realised a better approach was to write the code in a more functional design, with session objects passed down as pointers in function parameters. This gave me the context-sensitive local scope I was after while also minimizing the amount of data I was copying about. However, being a 7-year-old project and one that was at the start of my transition to Go, it's fair to say the project could do with a major rewrite anyway, as a lot of mistakes were made. That's a concern for another day though: currently it works and I have enough other projects on the go.
You'll want to use something like a Context:
http://blog.golang.org/context
Basically, the pattern is to create a Context for each unique thing you want to do. (A web request in your case.) Use context.WithValue to embed multiple variables in the context. Then always pass it as the first parameter to other methods that are doing further work in other goroutines.
Getting the variable you need out of the context is a matter of calling the context's Value method (ctx.Value(key)) from within any goroutine. From the above link:
A Context is safe for simultaneous use by multiple goroutines. Code can pass a single Context to any number of goroutines and cancel that Context to signal all of them.
I had an implementation where I was explicitly sending variables as method parameters, and I discovered that embedding these variables using contexts significantly cleaned up my code.
Using a Context also helps because it provides ways to end long-running tasks by using channels, select, and a concept called a "done channel." See this article for a great basic review and implementation:
http://blog.golang.org/pipelines
I'd recommend reading the pipelines article first for a good flavor of how to manage communication among goroutines, then the context article for a better idea of how to level-up and start embedding variables to pass around.
Good luck!
Don't use global variables. Use Go goroutine-local variables.
go-routine Id..
There are already goroutine-local variables: they are called function arguments, function return values, and local variables.
Russ
If you have more than one user, then wouldn't you need that info for each connection? So I would think that you'd have a struct per connected user. It would be idiomatic Go to pass a pointer to that struct when setting up the worker goroutine, or passing the pointer over a channel.
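A rough sketch of that suggestion, with a hypothetical userSession struct and worker function: every worker goroutine receives a pointer to its own session as an ordinary argument.

```go
package main

import "fmt"

// userSession is a hypothetical per-user struct.
type userSession struct {
	ID   int
	Name string
}

// worker serves exactly one session, which it receives as an argument,
// and reports the result on done.
func worker(s *userSession, done chan<- string) {
	done <- fmt.Sprintf("serving user %d (%s)", s.ID, s.Name)
}

func main() {
	sessions := []*userSession{
		{ID: 1, Name: "ann"},
		{ID: 2, Name: "bob"},
	}
	done := make(chan string)
	for _, s := range sessions {
		go worker(s, done)
	}
	// Results may arrive in either order.
	for range sessions {
		fmt.Println(<-done)
	}
}
```

Sending the *userSession over a channel to an already running worker would work the same way; either way the session travels with the call instead of living in a global.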