Atomic swap of a map pointer causes the program to get stuck - go

package main
import (
    "fmt"
    "sync/atomic"
    "unsafe"
)
func main() {
    old := make(map[string]string)
    new := make(map[string]string)
    new["hello"] = "apple"
    fmt.Println("start swap")
    atomic.SwapPointer((*unsafe.Pointer)(unsafe.Pointer(&old)), unsafe.Pointer(&new))
    fmt.Println("end swap")
    // pending here, don't stop
    fmt.Println(old)
    fmt.Println("end print old")
}
I want a lock-free way to replace the old map with the new map, because the old map is read concurrently most of the time.
If I use a read-write lock, there will be a serious performance penalty.
So I chose Go's atomic package to implement this, but the program gets stuck at the line
fmt.Println(old). Could somebody give me some advice?

The function atomic.SwapPointer does not do what you want / need. As the documentation says, this is roughly equivalent to:
old = *addr
*addr = new
return old
(except that the write to *addr and read-back from *addr are done atomically). What you wanted was, I think, the atomic equivalent of:
*old, *new = *new, *old
(and no useful return value). This operation simply does not exist in the sync/atomic package. If it did, you could swap the two internal map pointers, but you'd still be treading in dangerous (might-break-in-future-compilers) waters, as multiple commenters noted.
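To make the "store new, return old" semantics concrete, here is a minimal sketch of atomic.SwapPointer on an ordinary unsafe.Pointer cell. The names slot, replace and current are made up for illustration, and nothing here touches map internals.

```go
package main

import (
    "fmt"
    "sync/atomic"
    "unsafe"
)

// slot is a hypothetical shared pointer cell; it only ever holds a *int.
var slot unsafe.Pointer

// replace stores next in slot and returns whatever slot held before.
// That is all atomic.SwapPointer does: one location changes, the old value comes back.
func replace(next *int) *int {
    prev := atomic.SwapPointer(&slot, unsafe.Pointer(next))
    return (*int)(prev)
}

// current atomically reads the pointer stored in slot.
func current() *int {
    return (*int)(atomic.LoadPointer(&slot))
}

func main() {
    a, b := 1, 2
    replace(&a)         // slot was empty, so this returns nil
    prev := replace(&b) // slot now holds &b; &a is handed back
    fmt.Println(*prev, *current())
}
```

Only slot is modified by the call; the variable that supplied the new pointer is left untouched, which is why a single SwapPointer cannot exchange two variables.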
Consider using sync.Map instead. It provides a somewhat-concurrency-safe map with internal (per entry) locking that is optimized for two use cases, as described in the linked package documentation. If your use case is one of these two, it may provide what you need.
Just for illustration (don't do this! it's silly, you could just write *old, *new = *new, *old in swap)... A non-atomic swap of old and new can be achieved using atomic.SwapPointer:
package main
import (
    "fmt"
    "sync/atomic"
    "unsafe"
)
func read(p unsafe.Pointer) unsafe.Pointer {
    return *(*unsafe.Pointer)(p)
}
func swap(old *map[string]string, new *map[string]string) {
    p := atomic.SwapPointer((*unsafe.Pointer)(unsafe.Pointer(old)), read(unsafe.Pointer(new)))
    _ = atomic.SwapPointer((*unsafe.Pointer)(unsafe.Pointer(new)), p)
}
func main() {
    old := map[string]string{"old": "old"}
    new := map[string]string{"hello": "apple", "new": "new"}
    fmt.Println("before: old =", old, "new =", new)
    // fmt.Println("before: old:", read(unsafe.Pointer(&old)), "new:", read(unsafe.Pointer(&new)))
    swap(&old, &new)
    // fmt.Println("after: old:", read(unsafe.Pointer(&old)), "new:", read(unsafe.Pointer(&new)))
    fmt.Println("after: old =", old, "new =", new)
}
Uncomment the commented-out lines to see more details. Of course, calling two separate atomic.SwapPointer operations is not atomic: there is a moment when both map variables view the new map, until the second swap makes the new variable view the old map. I think the unsafe.Pointer variable p preserves the old map against GC until we get it stored back into new, but I am not at all sure about this (that's one of the darker corners of Go).
Again: don't do this. If measurement suggests it helps, try sync.Map instead.

Since Go 1.17, there is an atomic.Value.Swap method that I believe does what you want.
package main
import (
    "fmt"
    "sync/atomic"
)
func main() {
    var old atomic.Value
    old.Store(make(map[string]string))
    new := make(map[string]string)
    new["hello"] = "apple"
    fmt.Println("start swap")
    // Swap stores new and returns the previously stored map
    new = old.Swap(new).(map[string]string)
    fmt.Println("end swap")
    // Load the map out of the atomic.Value instead of printing the Value itself
    fmt.Println(old.Load())
    fmt.Println("end print old")
    fmt.Println(new)
}
That being said, it's not clear from your use case that you actually need access to the old values; if not, you can just use atomic.Value.Store to update the pointer instead. In fact, there is a very similar example in the documentation.
Still, I agree with the sentiment that you should measure the impact of using sync.RWMutex and sync.Map before reaching for atomic.Value.
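As a rough sketch of the Store-only variant for a read-mostly map: readers Load the current map and never modify it, and a writer builds a complete replacement and publishes it with one Store. The names current, lookup and publish are invented for this example, and it assumes that published maps are never mutated afterwards.

```go
package main

import (
    "fmt"
    "sync/atomic"
)

// current holds the published map. It only ever stores a map[string]string,
// and readers must treat the map they load as read-only.
var current atomic.Value

// lookup reads from whichever map is currently published.
// Before the first publish, Load returns nil and the lookup reports a miss.
func lookup(key string) (string, bool) {
    m, _ := current.Load().(map[string]string)
    v, ok := m[key]
    return v, ok
}

// publish replaces the whole map with a single atomic store.
func publish(m map[string]string) {
    current.Store(m)
}

func main() {
    publish(map[string]string{})
    next := map[string]string{"hello": "apple"}
    publish(next)
    v, ok := lookup("hello")
    fmt.Println(v, ok)
}
```

Whether this actually beats a sync.RWMutex for a given workload is something to measure rather than assume.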

Related

Is this a safe way to use ulid concurrently with other libraries too?

I'm trying to use the ulid package for the first time.
In their README they say:
Please note that rand.Rand from the math package is not safe for concurrent use.
Instantiate one per long living go-routine or use a sync.Pool if you want to avoid the potential contention of a locked rand.Source as its been frequently observed in the package level functions.
Can you help me understand what this means and how to write SAFE code for concurrent use with libraries such as ent or gqlgen?
Example: I'm using the below code in my app to generate new IDs (sometimes even many of them in the same millisecond which is fine for ulid).
import (
    "math/rand"
    "time"
    "github.com/oklog/ulid/v2"
)
var defaultEntropySource *ulid.MonotonicEntropy
func init() {
    defaultEntropySource = ulid.Monotonic(rand.New(rand.NewSource(time.Now().UnixNano())), 0)
}
func NewID() string {
    return ulid.MustNew(ulid.Timestamp(time.Now()), defaultEntropySource).String()
}
Is this a safe way to use the package?
No. That sentence suggests that each rand.Source should be local to a goroutine, whereas your defaultEntropySource is potentially shared between multiple goroutines.
As documented for the New function, you only need to make sure the entropy reader is safe for concurrent use, and Monotonic is not. Here are two ways of implementing the documentation's suggestion:
Create a new rand.Source per call to NewID(); this allocates a new entropy source for each call:
func NewID() string {
    defaultEntropySource := ulid.Monotonic(rand.New(rand.NewSource(time.Now().UnixNano())), 0)
    return ulid.MustNew(ulid.Timestamp(time.Now()), defaultEntropySource).String()
}
Like above, but using sync.Pool to possibly reuse previously allocated rand.Sources:
var entropyPool = sync.Pool{
    New: func() any {
        entropy := ulid.Monotonic(rand.New(rand.NewSource(time.Now().UnixNano())), 0)
        return entropy
    },
}
func NewID() string {
    e := entropyPool.Get().(*ulid.MonotonicEntropy)
    s := ulid.MustNew(ulid.Timestamp(time.Now()), e).String()
    entropyPool.Put(e)
    return s
}
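For context on what "safe for concurrent use" means here, the following standard-library-only sketch shows the other option the README alludes to: one shared *rand.Rand guarded by a mutex. The lockedReader type and the helper functions are hypothetical, and the sketch only covers the locking itself; it does not try to reproduce ulid's monotonic behavior.

```go
package main

import (
    "fmt"
    "io"
    "math/rand"
    "sync"
)

// lockedReader is a hypothetical wrapper that serializes access to a reader
// that is not safe for concurrent use, such as *rand.Rand.
type lockedReader struct {
    mu sync.Mutex
    r  io.Reader
}

func (l *lockedReader) Read(p []byte) (int, error) {
    l.mu.Lock()
    defer l.mu.Unlock()
    return l.r.Read(p)
}

// newSharedEntropy builds one mutex-guarded random source for the whole process.
func newSharedEntropy(seed int64) io.Reader {
    return &lockedReader{r: rand.New(rand.NewSource(seed))}
}

// readConcurrently lets n goroutines each read 10 bytes from the same reader
// and returns the total number of bytes read.
func readConcurrently(r io.Reader, n int) int {
    var wg sync.WaitGroup
    var mu sync.Mutex
    total := 0
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            buf := make([]byte, 10)
            k, _ := io.ReadFull(r, buf)
            mu.Lock()
            total += k
            mu.Unlock()
        }()
    }
    wg.Wait()
    return total
}

func main() {
    fmt.Println(readConcurrently(newSharedEntropy(1), 8))
}
```

The trade-off is the one the README warns about: every caller contends on the same lock, which is why a per-goroutine source or a sync.Pool can be preferable under load.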

Println changes capacity of a slice

Consider the following code
package main
import (
    "fmt"
)
func main() {
    x := []byte("a")
    fmt.Println(x)
    fmt.Println(cap(x) == cap([]byte("a"))) // prints false
    y := []byte("a")
    fmt.Println(cap(y) == cap([]byte("a"))) // prints true
}
https://play.golang.org/p/zv8KQekaxH8
Calling a simple Println with a slice variable changes its capacity. I suspect calling any function with variadic parameters of type ...interface{} produces the same effect. Is there any sane explanation for such behavior?
The explanation, as bradfitz pointed out on GitHub, is that if you don't use make to create a slice, the compiler will use whatever capacity it finds convenient. Creating multiple slices in different Go versions, or even in the same one, can result in slices of different capacities.
In short, if you need a specific capacity, use make([]byte, len, cap). Otherwise you can't rely on a fixed capacity.
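A small sketch of that advice, using a made-up helper named withCap: the capacity is passed to make explicitly, so it no longer depends on how the compiler chooses to allocate the conversion.

```go
package main

import "fmt"

// withCap is a hypothetical helper that copies s into a byte slice whose
// capacity is chosen explicitly. make panics if capacity is less than len(s).
func withCap(s string, capacity int) []byte {
    b := make([]byte, len(s), capacity)
    copy(b, s)
    return b
}

func main() {
    x := withCap("a", 8)
    fmt.Println(x)
    y := withCap("a", 8)
    // Both capacities were requested explicitly, so they always compare equal.
    fmt.Println(len(x), cap(x), cap(x) == cap(y))
}
```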

Using crypto/rand for generating permutations with rand.Perm

Go has two packages for random numbers:
crypto/rand, which provides a way to get random bytes
math/rand, which has a nice algorithm for shuffling ints
I want to use the Perm algorithm from math/rand, but provide it with high-quality random numbers.
Since the two rand packages are part of the same standard library there should be a way to combine them in a way so that crypto/rand provides a good source of random numbers that is used by math/rand.Perm to generate a permutation.
Here (and on the Playground) is the code I wrote to connect these two packages:
package main
import (
    cryptoRand "crypto/rand"
    "encoding/binary"
    "fmt"
    mathRand "math/rand"
)
type cryptoSource struct{}
func (s cryptoSource) Int63() int64 {
    bytes := make([]byte, 8, 8)
    cryptoRand.Read(bytes)
    return int64(binary.BigEndian.Uint64(bytes) >> 1)
}
func (s cryptoSource) Seed(seed int64) {
    panic("seed")
}
func main() {
    rnd := mathRand.New(&cryptoSource{})
    perm := rnd.Perm(52)
    fmt.Println(perm)
}
This code works. Ideally I don't want to define the cryptoSource type myself but just stick together the two rand packages so that they work together. So is there a predefined version of this cryptoSource type somewhere?
That's basically what you need to do. It's not often that you need a cryptographically secure source of randomness for the common usage of math/rand, so there's no adaptor provided. You can make the implementation slightly more efficient by allocating the buffer space directly in the value, rather than allocating a new slice on every call. However in the unlikely event that reading the OS random source fails, this will need to panic to prevent returning invalid results.
type cryptoSource [8]byte
func (s *cryptoSource) Int63() int64 {
    _, err := cryptoRand.Read(s[:])
    if err != nil {
        panic(err)
    }
    return int64(binary.BigEndian.Uint64(s[:]) & (1<<63 - 1))
}
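The snippet above only shows Int63; to be usable with math/rand.New the type also needs a Seed method. The following is one possible complete version. Making Seed a no-op and the securePerm helper are choices made for this sketch, not something the standard library provides.

```go
package main

import (
    cryptoRand "crypto/rand"
    "encoding/binary"
    "fmt"
    mathRand "math/rand"
)

// cryptoSource keeps its 8-byte read buffer inside the value,
// so Int63 does not allocate a new slice on every call.
type cryptoSource [8]byte

func (s *cryptoSource) Int63() int64 {
    if _, err := cryptoRand.Read(s[:]); err != nil {
        panic(err) // no safe value to return if the OS source fails
    }
    // Clear the top bit so the result is a non-negative 63-bit integer.
    return int64(binary.BigEndian.Uint64(s[:]) & (1<<63 - 1))
}

// Seed is a no-op in this sketch: a crypto-backed source cannot be seeded.
func (s *cryptoSource) Seed(int64) {}

// securePerm is a hypothetical helper that returns a permutation of [0, n)
// produced by math/rand's Perm on top of crypto/rand bytes.
func securePerm(n int) []int {
    return mathRand.New(new(cryptoSource)).Perm(n)
}

func main() {
    fmt.Println(securePerm(52))
}
```

A single cryptoSource value is not safe for concurrent use, because every call writes to the same buffer.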

io.MultiWriter vs. golang's pass-by-value

I'd like to create a situation where everything sent to a particular log.Logger is also appended to a particular variable's array of strings.
The variable's type implements the io.Writer interface so it should be easy to add that via io.MultiWriter to log.New(), but I seem to have run into an intractable problem: the io.Writer interface is fixed and it's impossible for the variable to reference itself given golang's pass-by-value.
Maybe it will make more sense with an example:
package main
import "fmt"
import "io"
import "log"
import "os"
import "strings"
var Log *log.Logger
type Job_Result struct {
    Job_ID int64
    // other stuff
    Log_Lines []string
}
// satisfies io.Writer interface
func (jr Job_Result) Write (p []byte) (n int, err error) {
    s := strings.TrimRight(string(p),"\n ")
    jr.Log_Lines= append(jr.Log_Lines,s)
    return len(s), nil
}
func (jr Job_Result) Dump() {
    fmt.Println("\nHere is a dump of the job result log lines:")
    for n, s := range jr.Log_Lines{
        fmt.Printf("\tline %d: %s\n",n,s)
    }
}
func main() {
    // make a Job_Result
    var jr Job_Result
    jr.Job_ID = 123
    jr.Log_Lines = make([]string,0)
    // create an io.MultiWriter that points to both stdout
    // and that Job_Result var
    var writers io.Writer
    writers = io.MultiWriter(os.Stdout,jr)
    Log = log.New(writers,
        "",
        log.Ldate|log.Ltime|log.Lshortfile)
    // send some stuff to the log
    Log.Println("program starting")
    Log.Println("something happened")
    Log.Printf("last thing that happened, should be %drd line\n",3)
    jr.Dump()
}
This is the output, which is not surprising:
2016/07/28 07:20:07 testjob.go:43: program starting
2016/07/28 07:20:07 testjob.go:44: something happened
2016/07/28 07:20:07 testjob.go:45: last thing that happened, should be 3rd line
Here is a dump of the job result log lines:
I understand the problem - Write() is getting a copy of the Job_Result variable, so it's dutifully appending and then the copy vanishes as it's local. I should pass it a pointer to the Job_Result...but I'm not the one calling Write(), it's done by the Logger, and I can't change that.
I thought this was a simple solution to capturing log output into an array (and there is other subscribe/unsubscribe stuff I didn't show), but it all comes down to this problematic io.Write() interface.
Pilot error? Bad design? Something I'm not grokking? Thanks for any advice.
Redefine the Write method so that it has a pointer receiver:
// satisfies io.Writer interface
func (jr *Job_Result) Write(p []byte) (n int, err error) {
    s := strings.TrimRight(string(p), "\n ")
    jr.Log_Lines = append(jr.Log_Lines, s)
    // report all of p as consumed; io.MultiWriter treats n != len(p) as a short write
    return len(p), nil
}
Initialize it as a pointer:
jr := new(Job_Result) // makes a pointer.
The rest stays as is. This way, *Job_Result still implements io.Writer, but doesn't lose state.
The Go tutorial already says that when a method modifies the receiver, you should probably use a pointer receiver, or the changes may be lost. Working with a pointer instead of the actual object has little downside when you want to make sure there is exactly one object. (And yes, technically it isn't an object.)
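Here is a compact, self-contained sketch of the same idea. The lineCollector type and the captureLines helper are invented for the example, and io.Discard stands in for os.Stdout so that only the collector receives visible output.

```go
package main

import (
    "fmt"
    "io"
    "log"
    "strings"
)

// lineCollector is a hypothetical stand-in for Job_Result: it stores every
// write as one trimmed line.
type lineCollector struct {
    lines []string
}

// Write has a pointer receiver, so the append is visible to the caller.
func (c *lineCollector) Write(p []byte) (int, error) {
    c.lines = append(c.lines, strings.TrimRight(string(p), "\n "))
    return len(p), nil // io.Writer must report len(p) on success
}

// captureLines logs each message through a MultiWriter and returns what the
// collector recorded.
func captureLines(msgs ...string) []string {
    c := &lineCollector{}
    logger := log.New(io.MultiWriter(io.Discard, c), "", 0)
    for _, m := range msgs {
        logger.Println(m)
    }
    return c.lines
}

func main() {
    for n, s := range captureLines("program starting", "something happened") {
        fmt.Printf("line %d: %s\n", n, s)
    }
}
```

Because the pointer c is what gets passed to io.MultiWriter, the logger and the caller share one lineCollector instead of working on copies.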

In golang, how do I re-assign an external reference from within a function?

I'm probably not expressing this correctly in the question, but perhaps this code will make it clearer:
package main
import "fmt"
type Blob struct {
    Message string
}
func assign1(bb **Blob) {
    *bb = &Blob{"Internally created by assign1"}
}
func (b *Blob) assign2() {
    *b = Blob{"Internally created by assign2"}
}
func main() {
    x1 := &Blob{"Hello World"}
    assign1(&x1)
    fmt.Printf("x1 = %+v\n", *x1)
    x2 := Blob{"Hello World"}
    x2.assign2()
    fmt.Printf("x2 = %+v\n", x2)
}
Produces, as desired:
x1 = {Message:Internally created by assign1}
x2 = {Message:Internally created by assign2}
I want to pass a reference (pointer to a pointer) into a function and have the function assign a new value to the pointer such that the calling scope will see that new value.
I've figured out the above two ways of doing this, but I'd like to know if they are actually correct or if there is some hidden flaw. Also, are either of them more idiomatic than the other?
Coming from Java, assign2 just seems wrong but I'm sure I've seen something similar in the encoding/json package. What is that actually doing?
Thanks!
James answers the mechanics of assign2. I'll touch a bit on when to use it.
Let's take a simpler example, first.
type Counter uint
func (c *Counter) Increment() {
    *c += 1
}
In the counter example the entire state of the receiver is changing. Similarly for the encoding/json package the entire state of the receiver is changing. That's really the only time I would use that style.
One major advantage of the style: you can define an interface for the change, just like the GobDecoder does.
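As an illustration of that advantage, the sketch below puts the Counter from above behind a made-up interface named Incrementer; bump is likewise a hypothetical helper.

```go
package main

import "fmt"

type Counter uint

// Increment replaces the entire state of the receiver.
func (c *Counter) Increment() {
    *c += 1
}

// Incrementer is a hypothetical interface for "things that can change
// themselves". *Counter satisfies it; plain Counter does not.
type Incrementer interface {
    Increment()
}

// bump works with any Incrementer, not just counters.
func bump(i Incrementer, times int) {
    for n := 0; n < times; n++ {
        i.Increment()
    }
}

func main() {
    var c Counter
    bump(&c, 3)
    fmt.Println(c)
}
```

A plain function such as assign1 cannot be abstracted behind an interface this way, which is the practical difference between the two styles.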
When I first saw the assign2 style it was a little grating. But then I remembered that (c *Counter) Increment gets translated to Increment(c *Counter) in the machine code, and it didn't bother me anymore. I personally prefer the assign1 style. (Though there is no need for the double pointers as originally posted.)
package main
import "fmt"
type Blob struct {
    Message string
}
func assign1(b *Blob) {
    *b = Blob{"Internally created by assign1"}
}
func main() {
    x1 := Blob{"Hello World"}
    assign1(&x1)
    // x1 is a Blob value here, not a pointer, so it is printed directly
    fmt.Printf("x1 = %+v\n", x1)
}
Both forms are valid Go, as you've discovered.
For the assign2 case, the compiler finds that assign2 does not appear in Blob's method set, but it is part of *Blob's method set. It then converts the method call to:
(&x2).assign2()
While it can be confusing if a method then goes and changes x2 like in your example, there are a number of places in the standard library where this pattern is used. Probably the most common one is implementing custom decoding for a type with the encoding/json package: you implement the UnmarshalJSON method with a pointer receiver, and update the value being pointed to.
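A short sketch of that pattern with encoding/json follows. The Level type, its constants and the parseLevel helper are made up for the example; only json.Unmarshal and the UnmarshalJSON method signature come from the standard library.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// Level is a hypothetical type that is written as a string in JSON.
type Level int

const (
    Low Level = iota + 1
    High
)

// UnmarshalJSON has a pointer receiver and overwrites the value it points to,
// just like assign2 does.
func (l *Level) UnmarshalJSON(data []byte) error {
    var s string
    if err := json.Unmarshal(data, &s); err != nil {
        return err
    }
    switch s {
    case "low":
        *l = Low
    case "high":
        *l = High
    default:
        return fmt.Errorf("unknown level %q", s)
    }
    return nil
}

// parseLevel decodes a document of the form {"level": "..."}.
func parseLevel(doc string) (Level, error) {
    var v struct {
        Level Level `json:"level"`
    }
    err := json.Unmarshal([]byte(doc), &v)
    return v.Level, err
}

func main() {
    lv, err := parseLevel(`{"level":"high"}`)
    fmt.Println(lv == High, err)
}
```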

Resources