As far as I know, the net/http package runs handlers on goroutines. Do I need to protect even a map with a sync.Mutex to prevent possible bugs in the nextId function, given that the function could otherwise compute the ID from a stale state of the map?
Here is my example code:
package main
import (
"net/http"
"github.com/gorilla/mux"
"io/ioutil"
"fmt"
)
var testData = map[int]string {
1: "foo",
2: "bar",
}
func main() {
r := mux.NewRouter()
r.HandleFunc("/data", getData).Methods("GET")
r.HandleFunc("/data", addData).Methods("POST")
http.ListenAndServe(":3000", r)
}
func getData(writer http.ResponseWriter, request *http.Request) {
for k, v := range testData {
fmt.Fprintf(writer, "Key: %d\tValue: %v\n", k, v)
}
}
func addData(writer http.ResponseWriter, request *http.Request) {
if data, err := ioutil.ReadAll(request.Body); err == nil {
if len(data) == 0 {
writer.WriteHeader(http.StatusBadRequest)
return
}
id := nextId()
testData[id] = string(data)
url := request.URL.String()
writer.Header().Set("Location", fmt.Sprintf("%s", url))
writer.WriteHeader(http.StatusCreated)
} else {
writer.WriteHeader(http.StatusBadRequest)
}
}
func nextId() int {
id := 1
for k, _ := range testData {
if k >= id {
id = k + 1;
}
}
return id
}
Since the HTTP server of the standard library calls each handler on its own goroutine, you must synchronize access to all variables that are defined outside of the handlers whenever at least one of the accesses is a write. You have to do this whenever you use the standard library's HTTP server. It doesn't matter whether you use the standard library's multiplexer or Gorilla's: the goroutine is launched outside of the multiplexer (before the multiplexer is called).
If you fail to do so (as in your example), a data race occurs, which you can verify by running the program with the -race option:
WARNING: DATA RACE
Write at 0x00c000090c30 by goroutine 21:
runtime.mapassign_fast64()
/usr/local/go/src/runtime/map_fast64.go:92 +0x0
main.addData()
/home/icza/gows/src/play/play.go:47 +0x191
net/http.HandlerFunc.ServeHTTP()
/usr/local/go/src/net/http/server.go:2007 +0x51
github.com/gorilla/mux.(*Router).ServeHTTP()
/home/icza/gows/pkg/mod/github.com/gorilla/mux#v1.7.3/mux.go:212 +0x13e
net/http.serverHandler.ServeHTTP()
/usr/local/go/src/net/http/server.go:2802 +0xce
net/http.(*conn).serve()
/usr/local/go/src/net/http/server.go:1890 +0x837
Previous read at 0x00c000090c30 by goroutine 7:
runtime.mapiternext()
/usr/local/go/src/runtime/map.go:851 +0x0
main.getData()
/home/icza/gows/src/play/play.go:32 +0x194
net/http.HandlerFunc.ServeHTTP()
/usr/local/go/src/net/http/server.go:2007 +0x51
...
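A minimal sketch of one way to guard the map, assuming a hypothetical store type (the names store, newStore, add and snapshot are illustrative and not part of the original code). The ID computation and the insert happen under the same write lock, so two concurrent POST handlers cannot pick the same ID. The handlers would then call add and snapshot instead of touching the map directly.

```go
package main

import (
	"fmt"
	"sync"
)

// store bundles the map from the question with the lock that guards it.
// The type and its methods are hypothetical names used for illustration.
type store struct {
	mu   sync.RWMutex
	data map[int]string
}

func newStore() *store {
	return &store{data: map[int]string{1: "foo", 2: "bar"}}
}

// add computes the next ID and inserts the value under one write lock,
// so the ID cannot be derived from a stale view of the map.
func (s *store) add(value string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	id := 1
	for k := range s.data {
		if k >= id {
			id = k + 1
		}
	}
	s.data[id] = value
	return id
}

// snapshot copies the map under a read lock so callers can iterate
// over the copy without holding the lock.
func (s *store) snapshot() map[int]string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[int]string, len(s.data))
	for k, v := range s.data {
		out[k] = v
	}
	return out
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			s.add(fmt.Sprintf("item-%d", n))
		}(i)
	}
	wg.Wait()
	// 2 initial entries plus 100 additions, none overwritten.
	fmt.Println(len(s.snapshot()))
}
```

Running this with the -race option should report no data race, because every access to the map goes through the lock.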
I have the following code in Go using the semaphore library just as an example:
package main
import (
"fmt"
"context"
"time"
"golang.org/x/sync/semaphore"
)
// This protects the lockedVar variable
var lock *semaphore.Weighted
// Only one go routine should be able to access this at once
var lockedVar string
func acquireLock() {
err := lock.Acquire(context.TODO(), 1)
if err != nil {
panic(err)
}
}
func releaseLock() {
lock.Release(1)
}
func useLockedVar() {
acquireLock()
fmt.Printf("lockedVar used: %s\n", lockedVar)
releaseLock()
}
func causeDeadLock() {
acquireLock()
// calling this from a function that's already
// locked the lockedVar should cause a deadlock.
useLockedVar()
releaseLock()
}
func main() {
lock = semaphore.NewWeighted(1)
lockedVar = "this is the locked var"
// this is only on a separate goroutine so that the standard
// go "deadlock" message doesn't print out.
go causeDeadLock()
// Keep the primary goroutine active.
for true {
time.Sleep(time.Second)
}
}
Is there a way to get the acquireLock() function call to print a message after a timeout indicating that there is a potential deadlock but without unblocking the call? I would want the deadlock to persist, but a log message to be written in the event that a timeout is reached. So a TryAcquire isn't exactly what I want.
An example of what I want, in pseudocode:
afterFiveSeconds := func() {
fmt.Printf("there is a potential deadlock\n")
}
lock.Acquire(context.TODO(), 1, afterFiveSeconds)
The lock.Acquire call in this example would call the afterFiveSeconds callback if the Acquire call blocked for more than 5 seconds, but it would not unblock the caller. It would continue to block.
I think I've found a solution to my problem.
func acquireLock() {
timeoutChan := make(chan bool)
go func() {
select {
case <-time.After(5 * time.Second):
fmt.Printf("potential deadlock while acquiring semaphore\n")
case <-timeoutChan:
break
}
}()
err := lock.Acquire(context.TODO(), 1)
close(timeoutChan)
if err != nil {
panic(err)
}
}
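The same idea can also be sketched with time.AfterFunc, which avoids the extra channel and select. To stay within the standard library, this sketch wraps a sync.Mutex rather than the semaphore from the question; lockWithWarning and demo are hypothetical helper names, and the same pattern should apply around lock.Acquire.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lockWithWarning blocks until mu is acquired. If that takes longer than
// warnAfter, warn runs once on its own goroutine; the caller keeps blocking.
func lockWithWarning(mu *sync.Mutex, warnAfter time.Duration, warn func()) {
	timer := time.AfterFunc(warnAfter, warn)
	defer timer.Stop() // cancels the warning if the lock was acquired in time
	mu.Lock()
}

// demo simulates another holder that keeps the lock for holdMs milliseconds
// and reports whether the warning fired while waiting (warnMs threshold).
func demo(holdMs, warnMs int) bool {
	var mu sync.Mutex
	warned := make(chan struct{}, 1)

	mu.Lock()
	go func() {
		time.Sleep(time.Duration(holdMs) * time.Millisecond)
		mu.Unlock()
	}()

	lockWithWarning(&mu, time.Duration(warnMs)*time.Millisecond, func() {
		warned <- struct{}{}
	})
	mu.Unlock()

	select {
	case <-warned:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println("slow holder warned:", demo(200, 20))
	fmt.Println("fast holder warned:", demo(10, 5000))
}
```

In a real deadlock the call never returns, so the timer is never stopped and the warning is logged exactly once.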
I'm trying to fix a WARNING: DATA RACE report.
Here is the code:
package models
import (
"sync"
"time"
)
type Stats struct {
sync.Mutex
request map[int64]int
}
func (s *Stats) PutRequest() {
s.Lock()
s.request[time.Now().Unix()]++
s.Unlock()
}
func (s *Stats) GetRequests() map[int64]int {
s.Lock()
m := s.request
s.Unlock()
return m
}
var Requests = Stats{
sync.Mutex{},
make(map[int64]int),
}
If I change the Stats field request into an integer, then everything works fine, but not with a map. How do I correctly lock a map in Go?
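Use sync.RWMutex (embedded in Stats in place of sync.Mutex, so that RLock and RUnlock are available) and return a copy of the map instead of the map itself: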
func (s *Stats) PutRequest(ut int64) {
s.Lock()
defer s.Unlock()
s.request[ut]++
}
func (s *Stats) GetRequests() map[int64]int {
s.RLock()
defer s.RUnlock()
m := make(map[int64]int, len(s.request))
for k, v := range s.request {
m[k] = v
}
return m
}
A channel-based alternative may also be of interest here: see "Stateful Goroutines" on Go by Example.
In any case, you need to copy the map before returning it.
GetRequests returns a reference to the map, so if other code calls the function and then reads or writes the returned map without holding the lock, a data race is introduced.
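The stateful-goroutine approach mentioned above could look roughly like the following sketch. A single goroutine owns the map and everyone else talks to it over channels, so no mutex is needed. The statsServer type and its field names are assumptions made for this illustration.

```go
package main

import "fmt"

// statsServer owns the request map inside its loop goroutine.
// No other goroutine ever touches the map directly.
type statsServer struct {
	put chan int64
	get chan chan map[int64]int
}

func newStatsServer() *statsServer {
	s := &statsServer{
		put: make(chan int64),
		get: make(chan chan map[int64]int),
	}
	go s.loop()
	return s
}

// loop serializes all access to the map.
func (s *statsServer) loop() {
	requests := make(map[int64]int)
	for {
		select {
		case ts := <-s.put:
			requests[ts]++
		case reply := <-s.get:
			// Hand out a copy so callers never share the live map.
			snapshot := make(map[int64]int, len(requests))
			for k, v := range requests {
				snapshot[k] = v
			}
			reply <- snapshot
		}
	}
}

func (s *statsServer) PutRequest(ts int64) { s.put <- ts }

func (s *statsServer) GetRequests() map[int64]int {
	reply := make(chan map[int64]int)
	s.get <- reply
	return <-reply
}

func main() {
	s := newStatsServer()
	s.PutRequest(100)
	s.PutRequest(100)
	s.PutRequest(101)
	fmt.Println(s.GetRequests()) // map[100:2 101:1]
}
```

Whether this is preferable to the RWMutex version is a design choice; the mutex version is usually shorter, while the goroutine version keeps all state transitions in one place.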
I'm working on a Go web crawler that should parse the search results of a specific search engine. The main difficulty is parsing concurrently, or rather, handling pagination such as
← Previous 1 2 3 4 5 ... 34 Next →. Everything works fine except the recursive crawling of paginated results. Here is my code:
package main
import (
"bufio"
"errors"
"fmt"
"net"
"strings"
"github.com/antchfx/htmlquery"
"golang.org/x/net/html"
)
type Spider struct {
HandledUrls []string
}
func NewSpider(url string) *Spider {
// ...
}
func requestProvider(request string) string {
// Everything is good here
}
func connectProvider(url string) net.Conn {
// Also works
}
// getContents makes request to search engine and gets response body
func getContents(request string) *html.Node {
// ...
}
// CheckResult controls empty search results
func checkResult(node *html.Node) bool {
// ...
}
func (s *Spider) checkVisited(url string) bool {
// ...
}
// Here is the problem
func (s *Spider) Crawl(url string, channelDone chan bool, channelBody chan *html.Node) {
body := getContents(url)
defer func() {
channelDone <- true
}()
if checkResult(body) == false {
err := errors.New("Nothing found there")
ErrFatal(err)
}
channelBody <- body
s.HandledUrls = append(s.HandledUrls, url)
fmt.Println("Handled ", url)
newUrls := s.getPagination(body)
for _, u := range newUrls {
fmt.Println(u)
}
for i, newurl := range newUrls {
if s.checkVisited(newurl) == false {
fmt.Println(i)
go s.Crawl(newurl, channelDone, channelBody)
}
}
}
func (s *Spider) getPagination(node *html.Node) []string {
// ...
}
func main() {
request := requestProvider(*requestFlag)
channelBody := make(chan *html.Node, 120)
channelDone := make(chan bool)
var parsedHosts []*Host
s := NewSpider(request)
go s.Crawl(request, channelDone, channelBody)
for {
select {
case recievedNode := <-channelBody:
// ...
for _, h := range newHosts {
parsedHosts = append(parsedHosts, h)
fmt.Println("added", h.HostUrl)
}
case <-channelDone:
fmt.Println("Jobs finished")
}
break
}
}
It always returns only the first page, with no pagination. getPagination(...) itself works correctly. Please tell me where my error is.
Hope Google Translate was correct.
The problem is probably that main exits before all goroutines have finished.
First, there is a break after the select statement, and it runs unconditionally after the first time a channel is read. This ensures that main returns right after the first value is sent over channelBody.
Secondly, using channelDone is not the right way here. The most idiomatic approach would be to use a sync.WaitGroup. Before starting each goroutine, call WG.Add(1) and replace the defer with defer WG.Done(). In main, call WG.Wait(). Please be aware that you should pass the WaitGroup by pointer, because copying it breaks it. You can read more here.
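A minimal sketch of the WaitGroup approach, assuming a hypothetical in-memory link graph (pages) in place of the real HTTP requests and HTML parsing. The spider type, markVisited and crawlAll are illustrative names. The visited set is guarded by a mutex because several crawl goroutines update it concurrently, which the original HandledUrls slice does not account for.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// pages is a hypothetical stand-in for the pagination links found on each page.
var pages = map[string][]string{
	"p1": {"p2", "p3"},
	"p2": {"p1", "p3", "p4"},
	"p3": {"p4"},
	"p4": {},
}

type spider struct {
	mu      sync.Mutex
	visited map[string]bool
	wg      sync.WaitGroup
}

// markVisited reports whether url was seen for the first time.
func (s *spider) markVisited(url string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.visited[url] {
		return false
	}
	s.visited[url] = true
	return true
}

// crawl visits one page and starts a goroutine per unseen link.
func (s *spider) crawl(url string) {
	defer s.wg.Done()
	for _, next := range pages[url] {
		if s.markVisited(next) {
			s.wg.Add(1) // Add before starting the goroutine
			go s.crawl(next)
		}
	}
}

// crawlAll blocks until every started crawl goroutine has finished.
func crawlAll(start string) []string {
	s := &spider{visited: map[string]bool{start: true}}
	s.wg.Add(1)
	go s.crawl(start)
	s.wg.Wait()

	urls := make([]string, 0, len(s.visited))
	for u := range s.visited {
		urls = append(urls, u)
	}
	sort.Strings(urls)
	return urls
}

func main() {
	fmt.Println(crawlAll("p1")) // [p1 p2 p3 p4]
}
```

Because the WaitGroup is a field of spider and all methods use a pointer receiver, it is never copied.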
I have the following problem:
I have a function that should allow only one caller to execute it at a time.
If someone tries to call the function while it is already busy, the second caller should immediately return with an error.
I tried the following:
1. Use a mutex
Would be pretty easy. But the problem is, you cannot check if a mutex is locked. You can only block on it. Therefore it does not work
2. Wait on a channel
var canExec = make(chan bool, 1)
func init() {
canExec <- true
}
func onlyOne() error {
select {
case <-canExec:
default:
return errors.New("already busy")
}
defer func() {
fmt.Println("done")
canExec <- true
}()
// do stuff
return nil
}
What I don't like here:
it looks really messy
it is easy to mistakenly block on the channel or mistakenly write to the channel
3. Mixture of mutex and shared state
var open = true
var myMutex sync.Mutex // a value, not a nil pointer: the zero value is ready to use
func canExec() bool {
myMutex.Lock()
defer myMutex.Unlock()
if open {
open = false
return true
}
return false
}
func endExec() {
myMutex.Lock()
defer myMutex.Unlock()
open = true
}
func onlyOne() error {
if !canExec() {
return errors.New("busy")
}
defer endExec()
// do stuff
return nil
}
I don't like this either. Using a shared variable with a mutex is not that nice.
Any other idea?
I'll throw my preference out there - use the atomic package.
var (
locker uint32
errLocked = errors.New("Locked out buddy")
)
func OneAtATime(d time.Duration) error {
if !atomic.CompareAndSwapUint32(&locker, 0, 1) { // <-----------------------------
return errLocked // All logic in these |
} // four lines |
defer atomic.StoreUint32(&locker, 0) // <-----------------------------
// logic here, but we will sleep
time.Sleep(d)
return nil
}
The idea is pretty simple. Set the initial value to 0 (0 value of uint32). The first thing you do in the function is check if the value of locker is currently 0 and if so it changes it to 1. It does all of this atomically. If it fails simply return an error (or however else you like to handle a locked state). If successful, you immediately defer replacing the value (now 1) with 0. You don't have to use defer obviously, but failing to set the value back to 0 before returning would leave you in a state where the function could no longer be run.
After you do those 4 lines of setup, you do whatever you would normally.
https://play.golang.org/p/riryVJM4Qf
You can make things a little nicer if desired by using named values for your states.
const (
stateUnlocked uint32 = iota
stateLocked
)
var (
locker = stateUnlocked
errLocked = errors.New("Locked out buddy")
)
func OneAtATime(d time.Duration) error {
if !atomic.CompareAndSwapUint32(&locker, stateUnlocked, stateLocked) {
return errLocked
}
defer atomic.StoreUint32(&locker, stateUnlocked)
// logic here, but we will sleep
time.Sleep(d)
return nil
}
You can use a semaphore for this (go get golang.org/x/sync/semaphore)
package main
import (
"errors"
"fmt"
"sync"
"time"
"golang.org/x/sync/semaphore"
)
var sem = semaphore.NewWeighted(1)
func main() {
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
wg.Add(1)
go func() {
defer wg.Done()
if err := onlyOne(); err != nil {
fmt.Println(err)
}
}()
time.Sleep(time.Second)
}
wg.Wait()
}
func onlyOne() error {
if !sem.TryAcquire(1) {
return errors.New("busy")
}
defer sem.Release(1)
fmt.Println("working")
time.Sleep(5 * time.Second)
return nil
}
You could use the standard channel approach with a select statement.
var (
ch = make(chan bool)
)
func main() {
i := 0
wg := sync.WaitGroup{}
for i < 100 {
i++
wg.Add(1)
go func() {
defer wg.Done()
err := onlyOne()
if err != nil {
fmt.Println("Error: ", err)
} else {
fmt.Println("Ok")
}
}()
go func() {
ch <- true
}()
}
wg.Wait()
}
func onlyOne() error {
select {
case <-ch:
// do stuff
return nil
default:
return errors.New("Busy")
}
}
Do you want the function to be executed exactly once, or by only one caller at any given time? In the former case, take a look at https://golang.org/pkg/sync/#Once.
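For the exactly-once case, a short sketch of sync.Once might look like this; runMany is a hypothetical helper that only exists to show that concurrent callers trigger the function a single time.

```go
package main

import (
	"fmt"
	"sync"
)

// runMany starts n goroutines that all call once.Do with the same function
// and returns how many times that function actually ran.
func runMany(n int) int {
	var (
		once  sync.Once
		wg    sync.WaitGroup
		calls int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			once.Do(func() { calls++ }) // runs for the first caller only
		}()
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println(runMany(10)) // 1
}
```

Note that sync.Once never resets, so it does not fit the case where the function should become callable again after it finishes.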
If you want a one-at-a-time solution:
package main
import (
"fmt"
"sync"
"time"
)
// OnceAtATime protects function from being executed simultaneously.
// Example:
// func myFunc() { time.Sleep(10*time.Second) }
// func main() {
// once := OnceAtATime{}
// once.Do(myFunc)
// once.Do(myFunc) // not executed
// }
type OnceAtATime struct {
m sync.Mutex
executed bool
}
func (o *OnceAtATime) Do(f func()) {
o.m.Lock()
if o.executed {
o.m.Unlock()
return
}
o.executed = true
o.m.Unlock()
f()
o.m.Lock()
o.executed = false
o.m.Unlock()
}
// Proof of concept
func f(m int, done chan<- struct{}) {
for i := 0; i < 10; i++ {
fmt.Printf("%d: %d\n", m, i)
time.Sleep(250 * time.Millisecond)
}
close(done)
}
func main() {
done := make(chan struct{})
once := OnceAtATime{}
go once.Do(func() { f(1, done) })
go once.Do(func() { f(2, done) })
<-done
done = make(chan struct{})
go once.Do(func() { f(3, done) })
<-done
}
https://play.golang.org/p/nZcEcWAgKp
But the problem is, you cannot check if a mutex is locked. You can only block on it. Therefore it does not work
Since Go 1.18 (released in March 2022), you can test whether a mutex is locked... without blocking on it, using Mutex.TryLock.
See (as mentioned by Go 101) the issue 45435 from Tye McQueen :
sync: add Mutex.TryLock
This is followed by CL 319769, with the caveat:
Use of these functions is almost (but not) always a bad idea.
Very rarely they are necessary, and third-party implementations (using a mutex and an atomic word, say) cannot integrate as well with the race detector as implementations in package sync itself.
The objections (since retracted) were:
Locks are for protecting invariants.
If the lock is held by someone else, there is nothing you can say about the invariant.
TryLock encourages imprecise thinking about locks; it encourages making assumptions about the invariants that may or may not be true.
That ends up being its own source of races.
Thinking more about this, there is one important benefit to building TryLock into Mutex, compared to a wrapper:
failed TryLock calls wouldn't create spurious happens-before edges to confuse the race detector.
And:
A channel-based implementation is possible, but performs poorly in comparison.
There's a reason we have sync.Mutex rather than just using channel for locking.
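Applied to the original question, a minimal sketch with sync.Mutex.TryLock (Go 1.18 or later) could look like this. The work callback parameter and the errBusy variable are assumptions made for the example; the caveats quoted above about TryLock still apply.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	mu      sync.Mutex
	errBusy = errors.New("already busy")
)

// onlyOne runs work unless another call is already in progress,
// in which case it returns errBusy immediately instead of blocking.
func onlyOne(work func()) error {
	if !mu.TryLock() {
		return errBusy
	}
	defer mu.Unlock()
	work()
	return nil
}

func main() {
	err := onlyOne(func() {
		// A second call made while the first is still running is rejected.
		fmt.Println("nested call:", onlyOne(func() {}))
	})
	fmt.Println("outer call:", err)
}
```

Once the outer call has returned, the mutex is unlocked again and the next caller succeeds.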
I came up with the following generic solution for that.
It works for me, or do you see any problem with it?
import (
"sync"
)
const (
ONLYONECALLER_LOCK = "onlyonecaller"
ANOTHER_LOCK = "anotherlock" // must differ from ONLYONECALLER_LOCK, or both names share one lock
)
var locks = map[string]bool{}
var mutex = &sync.Mutex{}
func Lock(lock string) bool {
mutex.Lock()
defer mutex.Unlock()
locked, ok := locks[lock]
if !ok {
locks[lock] = true
return true
}
if locked {
return false
}
locks[lock] = true
return true
}
func IsLocked(lock string) bool {
mutex.Lock()
defer mutex.Unlock()
locked, ok := locks[lock]
if !ok {
return false
}
return locked
}
func Unlock(lock string) {
mutex.Lock()
defer mutex.Unlock()
locked, ok := locks[lock]
if !ok {
return
}
if !locked {
return
}
locks[lock] = false
}
see: https://play.golang.org/p/vUUsHcT3L-
How about this package: https://github.com/viney-shih/go-lock ? It uses channels and a semaphore (golang.org/x/sync/semaphore) to solve your problem.
go-lock implements TryLock, TryLockWithTimeout and TryLockWithContext functions in addition to Lock and Unlock, which gives you flexibility in controlling the resources.
Examples:
package main
import (
"fmt"
"time"
"context"
lock "github.com/viney-shih/go-lock"
)
func main() {
casMut := lock.NewCASMutex()
casMut.Lock()
defer casMut.Unlock()
// TryLock without blocking
fmt.Println("Return", casMut.TryLock()) // Return false
// TryLockWithTimeout without blocking
fmt.Println("Return", casMut.TryLockWithTimeout(50*time.Millisecond)) // Return false
// TryLockWithContext without blocking
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
fmt.Println("Return", casMut.TryLockWithContext(ctx)) // Return false
// Output:
// Return false
// Return false
// Return false
}
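Let's keep it simple:
package main
import (
	"errors"
	"fmt"
	"time"
	"golang.org/x/sync/semaphore"
)
// A weighted semaphore with capacity 1 admits one caller at a time.
var sem = semaphore.NewWeighted(1)
func doSomething() error {
	// TryAcquire never blocks: it fails immediately if the slot is taken.
	if !sem.TryAcquire(1) {
		return errors.New("I'm busy")
	}
	defer sem.Release(1)
	fmt.Println("I'm doing my work right now, then I'll take a nap")
	// time.Sleep takes a Duration; a bare 10 would mean 10 nanoseconds.
	time.Sleep(10 * time.Second)
	return nil
}
func main() {
	// Called synchronously so the program does not exit before the work is done.
	if err := doSomething(); err != nil {
		fmt.Println(err)
	}
}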
I tried to implement a locking version of reading/writing from a map in golang, but it doesn't return the desired result.
package main
import (
"sync"
"fmt"
)
var m = map[int]string{}
var lock = sync.RWMutex{}
func StoreUrl(id int, url string) {
for {
lock.Lock()
defer lock.Unlock()
m[id] = url
}
}
func LoadUrl(id int, ch chan string) {
for {
lock.RLock()
defer lock.RUnlock()
r := m[id]
ch <- r
}
}
func main() {
go StoreUrl(125, "www.google.com")
chb := make(chan string)
go LoadUrl(125, chb);
C := <-chb
fmt.Println("Result:", C)
}
The output is:
Result:
Meaning the value is not returned via the channel, which I don't get. Without the locking/goroutines it seems to work fine. What did I do wrong?
The code can also be found here:
https://play.golang.org/p/-WmRcMty5B
Infinite loops without a sleep or some kind of I/O are always a bad idea.
In your code, if you put a print statement at the start of StoreUrl, you will find that it never gets printed, i.e. the goroutine never ran. The go statement only puts the new goroutine on a run queue of the Go scheduler, and the scheduler has not yet had a chance to run it. How do you give the scheduler that chance? Sleep, do I/O, or read from or write to a channel.
Another problem is that your infinite loop takes the lock and then tries to take it again on the next iteration, which causes a deadlock. A deferred call runs only when the function returns, and this function never returns because of the infinite loop.
Below is modified code that uses sleep to make sure every goroutine gets time to do its job.
package main
import (
"sync"
"fmt"
"time"
)
var m = map[int]string{}
var lock = sync.RWMutex{}
func StoreUrl(id int, url string) {
for {
lock.Lock()
m[id] = url
lock.Unlock()
time.Sleep(time.Millisecond) // a bare 1 would mean one nanosecond
}
}
func LoadUrl(id int, ch chan string) {
for {
lock.RLock()
r := m[id]
lock.RUnlock()
ch <- r
}
}
func main() {
go StoreUrl(125, "www.google.com")
time.Sleep(time.Millisecond) // give StoreUrl time to run; a bare 1 would mean one nanosecond
chb := make(chan string)
go LoadUrl(125, chb);
C := <-chb
fmt.Println("Result:", C)
}
Edit: As @Jaun mentioned in the comments, you can also use runtime.Gosched() instead of sleep.
The use of defer is incorrect: a deferred call executes when the enclosing function returns, not at the end of each iteration of the for statement.
func StoreUrl(id int, url string) {
for {
func() {
lock.Lock()
defer lock.Unlock()
m[id] = url
}()
}
}
or
func StoreUrl(id int, url string) {
for {
lock.Lock()
m[id] = url
lock.Unlock()
}
}
We can't control the order in which goroutines run, so add time.Sleep() to control the order.
Code here:
https://play.golang.org/p/Bu8Lo46SA2
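As an alternative to sleeping, the ordering can be made explicit with a channel. The sketch below is an assumption-laden variant of the code above: storeURL, loadURL and storeThenLoad are renamed, loop-free helpers written for this illustration. The reader waits until the writer signals completion by closing a channel.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	m    = map[int]string{}
	lock sync.RWMutex
)

// storeURL writes one entry; defer is fine here because the function returns.
func storeURL(id int, url string) {
	lock.Lock()
	defer lock.Unlock()
	m[id] = url
}

// loadURL reads one entry under the read lock.
func loadURL(id int) string {
	lock.RLock()
	defer lock.RUnlock()
	return m[id]
}

// storeThenLoad runs the write and the read on separate goroutines and
// uses the stored channel to guarantee the write happens first.
func storeThenLoad(id int, url string) string {
	stored := make(chan struct{})
	go func() {
		storeURL(id, url)
		close(stored) // signal that the write has completed
	}()

	result := make(chan string)
	go func() {
		<-stored // wait for the write instead of sleeping
		result <- loadURL(id)
	}()
	return <-result
}

func main() {
	fmt.Println("Result:", storeThenLoad(125, "www.google.com"))
}
```

With the explicit signal, the result no longer depends on how the scheduler happens to order the goroutines.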