Time nonce generation in go routines - go

I am calling a REST API that expects a nonce header. The nonce must be a unique timestamp, and every consecutive call must carry a timestamp greater than the previous one. My goal is to launch 10 goroutines and make one call to the web API from each. Since we have no control over the order in which goroutines execute, we might end up making a call with a nonce lower than the previous one. I do not have control over the API implementation.
I have stripped my code down to something very simple that illustrates the problem:
package main
import (
"fmt"
"time"
)
func main() {
count := 10
results := make(chan string, count)
for i := 0; i < 10; i++ {
go someWork(results)
// Enabling the following line would give the
// expected outcome but does look like a hack to me.
// time.Sleep(time.Millisecond)
}
for i := 0; i < count; i++ {
fmt.Println(<-results)
}
}
func someWork(done chan string) {
// prepare http request, do http request, send to done chan the result
done <- time.Now().Format("15:04:05.00000")
}
From the output you can see how we have timestamps which are not chronologically ordered:
13:18:26.98549
13:18:26.98560
13:18:26.98561
13:18:26.98553
13:18:26.98556
13:18:26.98556
13:18:26.98557
13:18:26.98558
13:18:26.98559
13:18:26.98555
What would be the idiomatic way to achieve the expected outcome without adding the sleep line?
Thanks!

As I understand it, you only need to serialize the goroutines up to the point where the request is sent; that is where the timestamp and nonce need to be sequential. Response processing can run in parallel.
You can use a mutex for this, as in the code below:
package main
import (
"fmt"
"sync"
"time"
)
func main() {
count := 10
results := make(chan string, count)
var mutex sync.Mutex
for i := 0; i < count; i++ {
go someWork(&mutex, results)
}
for i := 0; i < count; i++ {
fmt.Println(<-results)
}
}
func someWork(mut *sync.Mutex, done chan string) {
// Lock the mutex. The goroutine that acquires the lock here
// is guaranteed to create its timestamp and perform its
// request before any other goroutine does.
mut.Lock()
// Get the timestamp
myTimeStamp := time.Now().Format("15:04:05.00000")
// prepare http request, do http request
// Unlock the mutex
mut.Unlock()
// Process response
// send to done chan the result
done <- myTimeStamp
}
This can still produce duplicate timestamps; you may need a finer-grained timestamp, but that depends on the use case.
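One way to rule out duplicates while keeping the mutex approach is to remember the last nonce and bump it whenever the clock has not advanced. The sketch below is only an illustration: nonceGate, withNonce and run are hypothetical names, the nonce is assumed to be a nanosecond Unix timestamp (the real API may expect another format), and the callback stands in for the HTTP request, which still has to be sent while the lock is held.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// nonceGate is a hypothetical helper that serializes nonce creation
// and request sending.
type nonceGate struct {
	mu   sync.Mutex
	last int64
}

// withNonce produces a nonce strictly greater than the previous one and
// calls send while still holding the lock, so sends happen in nonce order.
func (g *nonceGate) withNonce(send func(nonce int64)) {
	g.mu.Lock()
	defer g.mu.Unlock()
	now := time.Now().UnixNano()
	if now <= g.last {
		now = g.last + 1 // clock did not advance; force uniqueness
	}
	g.last = now
	send(now)
}

// run launches count goroutines and records the nonces in send order.
func run(count int) []int64 {
	var gate nonceGate
	var wg sync.WaitGroup
	sent := make([]int64, 0, count)
	for i := 0; i < count; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			gate.withNonce(func(nonce int64) {
				// Stand-in for the HTTP request; runs under the lock,
				// so appending to sent is safe here.
				sent = append(sent, nonce)
			})
		}()
	}
	wg.Wait()
	return sent
}

func main() {
	for _, n := range run(10) {
		fmt.Println(n)
	}
}
```

Anything that does not need to be ordered, such as processing the response, can happen after withNonce returns.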

I think you can use a WaitGroup, for example:
package main
import (
"fmt"
"sync"
"time"
)
var wg sync.WaitGroup = sync.WaitGroup{}
var ct int = 0
func hello() {
fmt.Printf("Hello Go %v\n", time.Now().Format("15:04:05.00000"))
// when you are done, call done:
time.Sleep(10 * time.Second)
wg.Done()
}
func main() {
for i := 0; i < 10; i++ {
wg.Add(1)
go hello()
wg.Wait()
}
}

Related

Is it thread safe to concurrently read/access an array in go?

For example, if I have a struct with an array and I want to do something like this:
type Paxos struct {
peers []string
}
for _, peer := range px.peers {
// do stuff
}
My goroutines will never modify the peers array, just read from it. Peers is an array of server addresses, and servers may fail, but that wouldn't affect the peers array (later RPC calls would just fail).
If no writes are involved, concurrent reads are always safe, regardless of the data structure. However, as soon as even a single concurrency-unsafe write to a variable is involved, you need to serialise concurrent access (both writes and reads) to the variable.
Moreover, you can safely write to elements of a slice or an array, provided that no more than one goroutine writes to a given element.
For instance, if you run the following programme with the race detector on, it's likely to report a race condition, because multiple goroutines concurrently modify variable results without precautions:
package main
import (
"fmt"
"sync"
)
func main() {
const n = 8
var results []int
var wg sync.WaitGroup
wg.Add(n)
for i := 0; i < n; i++ {
i := i
go func() {
defer wg.Done()
results = append(results, square(i))
}()
}
wg.Wait()
fmt.Println(results)
}
func square(i int) int {
return i * i
}
However, the following programme contains no such synchronization bug, because each element of the slice is modified by a single goroutine:
package main
import (
"fmt"
"sync"
)
func main() {
const n = 8
results := make([]int, n)
var wg sync.WaitGroup
wg.Add(n)
for i := 0; i < n; i++ {
i := i
go func() {
defer wg.Done()
results[i] = square(i)
}()
}
wg.Wait()
fmt.Println(results)
}
func square(i int) int {
return i * i
}
Yes, reads are thread-safe in Go and virtually all other languages. You're just looking up an address in memory and seeing what is there. If nothing is attempting to modify that memory, then you can have as many concurrent reads as you'd like.
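If the peers list ever did need to change while goroutines read it, one common option is to guard it with a sync.RWMutex. The sketch below is an assumption-laden illustration, not part of the original question: peerList, add and snapshot are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// peerList is a hypothetical wrapper around the peers slice that allows
// many concurrent readers and exclusive writers.
type peerList struct {
	mu    sync.RWMutex
	peers []string
}

// add appends an address under the write lock.
func (p *peerList) add(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.peers = append(p.peers, addr)
}

// snapshot returns a copy taken under the read lock, so callers can
// range over it without holding any lock.
func (p *peerList) snapshot() []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return append([]string(nil), p.peers...)
}

func main() {
	var pl peerList
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			pl.add(fmt.Sprintf("server-%d", i))
			_ = pl.snapshot() // concurrent read alongside the writes
		}(i)
	}
	wg.Wait()
	fmt.Println(len(pl.snapshot())) // prints 4
}
```

For the read-only case described in the question, none of this is needed.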

waitgroup with concurrent limit but test fail

I have used sync.WaitGroup with goroutines before, but now I want to limit goroutine concurrency,
so I wrote my own wait group with a concurrency limit:
package wglimit
import (
"sync"
)
// WaitGroupLimit ...
type WaitGroupLimit struct {
ch chan int
wg *sync.WaitGroup
}
// New ...
func New(size int) *WaitGroupLimit {
if size <= 0 {
size = 1
}
return &WaitGroupLimit{
ch: make(chan int, size), // buffer chan to limit concurrency
wg: &sync.WaitGroup{},
}
}
// Add ...
func (wgl *WaitGroupLimit) Add(delta int) {
for i := 0; i < delta; i++ {
wgl.ch <- 1
wgl.wg.Add(1)
}
}
// Done ...
func (wgl *WaitGroupLimit) Done() {
wgl.wg.Done()
<-wgl.ch
}
// Wait ...
func (wgl *WaitGroupLimit) Wait() {
close(wgl.ch)
wgl.wg.Wait()
}
And then I use it to control the goroutine concurrency, for example:
jobs := []string{"1", "2", "3", "4"} // some jobs
// wg := sync.WaitGroup{} // have no concurrency limit
wg := wglimit.New(2) // limit 2 goroutine
for _, job := range jobs {
wg.Add(1)
go func(job string) {
// job worker
defer wg.Done()
}(job)
}
wg.Wait()
It appears to work when run.
But the test fails:
package wglimit
import (
"runtime"
"testing"
"time"
)
func TestGoLimit(t *testing.T) {
var limit int = 5
wglimit := New(limit)
for i := 0; i < 10000; i++ {
wglimit.Add(1)
go func() {
defer wglimit.Done()
time.Sleep(time.Millisecond)
if runtime.NumGoroutine() > limit+2 {
println(runtime.NumGoroutine()) // prints 9; did the concurrency limit fail?
t.Errorf("FAIL")
}
}()
}
wglimit.Wait()
}
During the test, the number of goroutines is larger than my limit, so it seems the concurrency limit failed.
Is anything wrong with my WaitGroupLimit code, and if so, why?
Anything wrong with my WaitGroupLimit code [...]?
No.
The problem is runtime.NumGoroutine() doesn't do what you seem to think it does. It counts all goroutines, i.e. not only the ones you start but also the goroutines the runtime uses itself, e.g. for concurrent garbage collection. NumGoroutine is thus higher than your limit.
Your code is fine, your test isn't. Do not try to get clever in testing; test what your code really does: it blocks on Add until the limited resource is available. Test that, and not a goroutine count, which is just a (bad) proxy for the desired behaviour in your test.
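A test along those lines could count how many workers are inside the limited section at once, rather than asking the runtime for a goroutine count. The following is a minimal sketch under stated assumptions: it uses its own buffered-channel semaphore instead of the asker's WaitGroupLimit, and maxInFlight is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxInFlight runs jobs tasks with at most limit running at once and
// returns the highest number of tasks observed inside the work section.
func maxInFlight(limit, jobs int) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int64
	for i := 0; i < jobs; i++ {
		sem <- struct{}{} // blocks while limit tasks are running
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // release the slot after the work
			n := atomic.AddInt64(&running, 1)
			// Record a new peak with a compare-and-swap loop.
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			time.Sleep(time.Millisecond) // simulated work
			atomic.AddInt64(&running, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrency:", maxInFlight(5, 200))
}
```

Because the running counter is decremented before the slot is released, the observed peak can never exceed the limit, regardless of what other goroutines exist in the process.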

Unexpected behavior in code using sync/atomic package for synchronization

Below is an example I was working on while learning about goroutines in Go. The code spawns 30 goroutines, each of which accesses a shared variable called ordersProcessed. The example represents a cashier processing orders. Once ordersProcessed reaches 10, we print that the cashier cannot take any more orders.
package main
import (
"fmt"
"sync"
"sync/atomic"
)
func main() {
var (
wg sync.WaitGroup
ordersProcessed int64
)
// This does not work as expected
cashier := func(orderNum int) {
value := atomic.LoadInt64(&ordersProcessed)
fmt.Println("Value is ", value)
if value < 10 {
// Cashier is ready to serve!
fmt.Println("Processing order", orderNum)
atomic.AddInt64(&ordersProcessed, 1)
} else {
// Cashier has reached the max capacity of processing orders.
fmt.Println("I am tired! I want to take rest!", orderNum)
}
wg.Done()
}
for i := 0; i < 30; i++ {
wg.Add(1)
go func(orderNum int) {
// Making an order
cashier(orderNum)
}(i)
}
wg.Wait()
}
I expect to see "Processing order" messages for 10 orders and the "I am tired" message for the rest. However, all 30 orders get processed. I used the sync/atomic package to synchronize access to the ordersProcessed variable, yet its value is read as 0 by every goroutine. If I change the code above to use a mutex, as below, it works as expected:
package main
import (
"fmt"
"sync"
)
func main() {
var (
wg sync.WaitGroup
ordersProcessed int64
mutex sync.Mutex
)
// This works as expected
cashier := func(orderNum int) {
mutex.Lock()
if ordersProcessed < 10 {
// Cashier is ready to serve!
fmt.Println("Processing order", orderNum)
ordersProcessed++
} else {
// Cashier has reached the max capacity of processing orders.
fmt.Println("I am tired! I want to take rest!", orderNum)
}
mutex.Unlock()
wg.Done()
}
for i := 0; i < 30; i++ {
wg.Add(1)
go func(orderNum int) {
// Making an order
cashier(orderNum)
}(i)
}
wg.Wait()
}
Can someone please tell me what is wrong with the way I used the sync/atomic package to synchronize access to the ordersProcessed variable?
You used sync/atomic package, but you did not synchronize the goroutines.
When you start 30 goroutines, each goroutine starts by reading the shared variable and then increments it. If all goroutines read the variable before any of them has incremented it, they will all read 0. The problem is that nothing prevents other goroutines from modifying the variable between one goroutine's load and its add: each operation is atomic on its own, but the check-then-increment sequence is not. After your program runs, the shared variable can hold any value between 10 and 30, depending on how the goroutines interleave.
Your second implementation is correct in that it prevents other goroutines from reading and modifying the shared variable while one of them is working on it.
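If you would rather stay with sync/atomic, the check and the increment have to happen as one atomic step. A compare-and-swap loop is one way to get that; the sketch below is only an illustration, and tryTake and process are hypothetical names that do not appear in the question.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// tryTake atomically increments *counter if it is below limit and
// reports whether the increment happened.
func tryTake(counter *int64, limit int64) bool {
	for {
		cur := atomic.LoadInt64(counter)
		if cur >= limit {
			return false
		}
		// Succeeds only if no other goroutine changed the counter
		// since the load above; otherwise retry with the new value.
		if atomic.CompareAndSwapInt64(counter, cur, cur+1) {
			return true
		}
	}
}

// process submits orders concurrently and returns how many were accepted.
func process(orders int, limit int64) int64 {
	var wg sync.WaitGroup
	var processed int64
	for i := 0; i < orders; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tryTake(&processed, limit)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&processed)
}

func main() {
	fmt.Println(process(30, 10)) // prints 10
}
```

The mutex version is usually easier to read; the compare-and-swap loop mainly pays off when the guarded state is a single word.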

Idiomatic way to make a request-response communication using channels

Maybe I'm just not reading the spec right, or my mindset is still stuck on older synchronization methods, but what is the right way in Go to send one type and receive something else as a response?
One way I had come up with was
package main
import "fmt"
type request struct {
out chan string
argument int
}
var input = make(chan *request)
var cache = map[int]string{}
func processor() {
for {
select {
case in := <- input:
if result, exists := cache[in.argument]; exists {
in.out <- result
continue // reply already sent from the cache; do not compute and send again
}
result := fmt.Sprintf("%d", in.argument)
cache[in.argument] = result
in.out <- result
}
}
}
func main() {
go processor()
responseCh := make(chan string)
input <- &request{
responseCh,
1,
}
result := <- responseCh
fmt.Println(result)
}
That cache is not really necessary for this example but otherwise it would cause a datarace.
Is this what I'm supposed to do?
There are plenty of possibilities; which approach is best depends on your problem. When you receive something from a channel, there is no default way of responding; you need to build the flow yourself (as you did in the example in your question). Sending a response channel with every request gives you great flexibility, since with every request you can choose where to route the response, but quite often it is not necessary.
Here are some other examples:
1. Sending and receiving from the same channel
You can use an unbuffered channel for both sending requests and receiving responses. This nicely illustrates that unbuffered channels are in fact synchronisation points in your program. The limitation, of course, is that the request and the response must have exactly the same type:
package main
import (
"fmt"
)
func pow2() (c chan int) {
c = make(chan int)
go func() {
for x := range c {
c <- x*x
}
}()
return c
}
func main() {
c := pow2()
c <- 2
fmt.Println(<-c) // = 4
c <- 4
fmt.Println(<-c) // = 16
}
2. Sending to one channel, receiving from another
You can separate the input and output channels, and you can use buffered channels if you wish. This works as a request/response scenario and lets you have one goroutine responsible for sending requests, another for processing them, and yet another for receiving responses. Example:
package main
import (
"fmt"
)
func pow2() (in chan int, out chan int) {
in = make(chan int)
out = make(chan int)
go func() {
for x := range in {
out <- x*x
}
}()
return
}
func main() {
in, out := pow2()
go func() {
in <- 2
in <- 4
}()
fmt.Println(<-out) // = 4
fmt.Println(<-out) // = 16
}
3. Sending response channel with every request
This is what you've presented in the question. It gives you the flexibility of specifying the response route. This is useful if you want the response to reach a specific goroutine, for example when you have many clients with tasks to do and you want each response to be received by the client that sent the request.
package main
import (
"fmt"
"sync"
)
type Task struct {
x int
c chan int
}
func pow2(in chan Task) {
for t := range in {
t.c <- t.x*t.x
}
}
func main() {
var wg sync.WaitGroup
in := make(chan Task)
// Two processors
go pow2(in)
go pow2(in)
// Four clients with some tasks
for n := 1; n < 5; n++ {
wg.Add(1)
go func(x int) {
defer wg.Done()
c := make(chan int)
in <- Task{x, c}
fmt.Printf("%d**2 = %d\n", x, <-c)
}(n)
}
wg.Wait()
}
It is worth saying that this scenario doesn't necessarily need to be implemented with a per-task return channel. If the result carries some sort of client context (for example, a client id), a single multiplexer could receive all the responses and then process them according to that context.
Sometimes it doesn't make sense to involve channels to achieve a simple request-response pattern. When designing Go programs, I have caught myself trying to inject too many channels into the system (just because I think they're really great). Good old function calls are sometimes all we need:
package main
import (
"fmt"
)
func pow2(x int) int {
return x*x
}
func main() {
fmt.Println(pow2(2))
fmt.Println(pow2(4))
}
(And this might be a good solution if anyone encounters a problem similar to the one in your example. Echoing the comments you've received under your question: when you have to protect a single structure, like the cache, it might be better to create a struct and expose methods on it that guard concurrent use with a mutex.)
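A struct of that kind might look like the sketch below. It is an assumption about what such a type could be rather than code from the question: the cache type, newCache and get are hypothetical names, and strconv.Itoa stands in for the fmt.Sprintf call in the original processor.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// cache is a hypothetical mutex-protected replacement for the
// processor goroutine and its request channel.
type cache struct {
	mu sync.Mutex
	m  map[int]string
}

func newCache() *cache {
	return &cache{m: make(map[int]string)}
}

// get returns the cached result for arg, computing and storing it first
// if it is missing. The mutex makes it safe to call from many goroutines.
func (c *cache) get(arg int) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.m[arg]; ok {
		return v
	}
	v := strconv.Itoa(arg)
	c.m[arg] = v
	return v
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			fmt.Println(c.get(n % 2))
		}(i)
	}
	wg.Wait()
}
```

Callers get a plain function-call interface, and the locking stays an implementation detail of the type.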

How to wait for all goroutines to finish without using time.Sleep?

This code selects all XML files in the same folder as the invoked executable and asynchronously applies processing to each result in the callback (in the example below, just the name of the file is printed out).
How do I avoid using the sleep method to keep the main method from exiting? I have problems wrapping my head around channels (I assume that's what it takes, to synchronize the results) so any help is appreciated!
package main
import (
"fmt"
"io/ioutil"
"path"
"path/filepath"
"os"
"runtime"
"time"
)
func eachFile(extension string, callback func(file string)) {
exeDir := filepath.Dir(os.Args[0])
files, _ := ioutil.ReadDir(exeDir)
for _, f := range files {
fileName := f.Name()
if extension == path.Ext(fileName) {
go callback(fileName)
}
}
}
func main() {
maxProcs := runtime.NumCPU()
runtime.GOMAXPROCS(maxProcs)
eachFile(".xml", func(fileName string) {
// Custom logic goes in here
fmt.Println(fileName)
})
// This is what I want to get rid of
time.Sleep(100 * time.Millisecond)
}
You can use sync.WaitGroup. Quoting the linked example:
package main
import (
"net/http"
"sync"
)
func main() {
var wg sync.WaitGroup
var urls = []string{
"http://www.golang.org/",
"http://www.google.com/",
"http://www.somestupidname.com/",
}
for _, url := range urls {
// Increment the WaitGroup counter.
wg.Add(1)
// Launch a goroutine to fetch the URL.
go func(url string) {
// Decrement the counter when the goroutine completes.
defer wg.Done()
// Fetch the URL.
http.Get(url)
}(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
}
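Applied to the code in the question, the same idea could look like the sketch below. It is a simplified illustration: forEachMatch is a hypothetical name, and it takes a slice of file names instead of reading the executable's directory so that the example stays self-contained.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// forEachMatch runs callback in its own goroutine for every name with the
// given extension, waits for all of them to finish, and returns how many
// goroutines were started.
func forEachMatch(names []string, extension string, callback func(name string)) int {
	var wg sync.WaitGroup
	started := 0
	for _, name := range names {
		if path.Ext(name) != extension {
			continue
		}
		started++
		wg.Add(1) // must happen before the goroutine is launched
		go func(name string) {
			defer wg.Done()
			callback(name)
		}(name)
	}
	wg.Wait() // replaces the time.Sleep in main
	return started
}

func main() {
	// Stand-in for the directory listing from the question.
	names := []string{"a.xml", "b.txt", "c.xml"}
	n := forEachMatch(names, ".xml", func(name string) {
		fmt.Println(name)
	})
	fmt.Println("processed", n, "files") // prints "processed 2 files"
}
```

Waiting inside the helper keeps main free of any synchronization code; if the caller needs to do other work in the meantime, the WaitGroup can be returned or passed in instead.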
WaitGroups are definitely the canonical way to do this. Just for the sake of completeness, though, here's the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done," and have the main goroutine wait until each spawned routine has reported its completion.
func main() {
c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
for i := 0; i < 100; i++ {
go func() {
doSomething()
c <- struct{}{} // signal that the routine has completed
}()
}
// Since we spawned 100 routines, receive 100 messages.
for i := 0; i < 100; i++ {
<- c
}
}
sync.WaitGroup can help you here.
package main
import (
"fmt"
"sync"
"time"
)
func wait(seconds int, wg * sync.WaitGroup) {
defer wg.Done()
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Println("Slept ", seconds, " seconds ..")
}
func main() {
var wg sync.WaitGroup
for i := 0; i <= 5; i++ {
wg.Add(1)
go wait(i, &wg)
}
wg.Wait()
}
Although sync.WaitGroup (wg) is the canonical way forward, it does require that you make at least some of your wg.Add calls before you call wg.Wait. This may not be feasible for simple things like a web crawler, where you don't know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.
I wrote a solution using channels, avoiding WaitGroup, in my solution to the Tour of Go web crawler exercise. Each time one or more goroutines are started, you send the number to the children channel. Each time a goroutine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.
My only remaining concern is the hard-coded size of the results channel, but that is a (current) Go limitation.
// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.WaitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results). Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
results chan string
children chan int
done chan int
}
// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
// we buffer results to 1000, so we cannot crawl more pages than that.
return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}
// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
rc.children <- children
}
// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
rc.done <- 1
}
// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
fmt.Println("Controller waiting...")
var children, done int
for {
select {
case childrenDelta := <-rc.children:
children += childrenDelta
// fmt.Printf("children found %v total %v\n", childrenDelta, children)
case <-rc.done:
done += 1
// fmt.Println("done found", done)
default:
if done > 0 && children == done {
fmt.Printf("Controller exiting, done = %v, children = %v\n", done, children)
close(rc.results)
return
}
}
}
}
Here is a solution that employs WaitGroup.
First, define 2 utility methods:
package util
import (
"sync"
)
var allNodesWaitGroup sync.WaitGroup
func GoNode(f func()) {
allNodesWaitGroup.Add(1)
go func() {
defer allNodesWaitGroup.Done()
f()
}()
}
func WaitForAllNodes() {
allNodesWaitGroup.Wait()
}
Then, replace the invocation of callback:
go callback(fileName)
With a call to your utility function:
util.GoNode(func() { callback(fileName) })
As a last step, add this line at the end of your main function in place of the sleep. It makes the main goroutine wait for all goroutines to finish before the program exits.
func main() {
// ...
util.WaitForAllNodes()
}
