I have a use case where a pool of N integers (0 to N-1) is shared by N workers (N <= 100). Each worker claims an integer from the pool, "works" (in this example, sleeps for a random duration), returns the integer to the pool, and starts over. Each worker can take an arbitrary amount of time to return its key. I've quickly thrown together the following two solutions and would like to know whether one of them is "best" or "safest", and whether I'm missing a better approach. For the moment, these workers never stop unless the application is killed, and the number of workers is fixed for the life of the application.
Single Buffered Channel
type Worker struct {
ID int
KeyIndex int
KeyChan chan int
}
func (w *Worker) GetKey() {
w.KeyIndex = <- w.KeyChan
}
func (w *Worker) ReturnKey() {
w.KeyChan <- w.KeyIndex
}
func (w *Worker) Work() {
for {
w.GetKey()
rand.Seed(time.Now().UnixNano())
n := rand.Intn(10)
time.Sleep(time.Duration(n) * time.Second)
w.ReturnKey()
}
}
func main() {
numWorkers := 5
c := make(chan int, numWorkers)
for i := 0; i < numWorkers; i++ {
c <- i
}
workers := make([]*Worker, numWorkers)
for i := range workers {
workers[i] = &Worker{
ID: i,
KeyChan: c,
}
}
for _, w := range workers {
go w.Work()
}
ch := make(chan byte, 1)
<-ch
}
Broker w/ Array + Mutex
type KeyBrokerMutex struct {
mu sync.Mutex
keys []bool
}
func (kb *KeyBrokerMutex) GetKey() int {
kb.mu.Lock()
defer kb.mu.Unlock()
for i, k := range kb.keys {
if k {
kb.keys[i] = false
return i
}
}
return -1
}
func (kb *KeyBrokerMutex) ReturnKey(index int) {
kb.mu.Lock()
defer kb.mu.Unlock()
kb.keys[index] = true
}
type Worker struct {
ID int
KeyIndex int
KeyBroker *KeyBrokerMutex
}
func (w *Worker) GetKeyBrokerMutex() {
// The struct field is named KeyBroker, not KeyPool.
w.KeyIndex = w.KeyBroker.GetKey()
}
func (w *Worker) ReturnKeyBrokerMutex() {
w.KeyBroker.ReturnKey(w.KeyIndex)
w.KeyIndex = -1
}
func (w *Worker) WorkMutex() {
for {
w.GetKeyBrokerMutex()
rand.Seed(time.Now().UnixNano())
n := rand.Intn(10)
time.Sleep(time.Duration(n) * time.Second)
w.ReturnKeyBrokerMutex()
}
}
func main() {
numWorkers := 5
keyBroker := KeyBrokerMutex{keys: make([]bool, numWorkers)}
for i := range keyBroker.keys {
keyBroker.keys[i] = true
}
workers := make([]*Worker, numWorkers)
for i := range workers {
workers[i] = &Worker{
ID: i,
KeyBroker: &keyBroker,
}
}
for _, w := range workers {
go w.WorkMutex()
}
ch := make(chan byte, 1)
<-ch
}
I also have a broker approach using 2 separate channels for getting and returning keys, however I don't think that offers any benefits over the above solutions.
I like the simplicity of the single channel approach, but is there any downside to having multiple consumers and producers to a single buffered channel?
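On the multiple-producer, multiple-consumer concern: Go channels are safe for concurrent use by any number of goroutines, so sharing one buffered channel this way is fine. The following is only a minimal sketch to check that property, not a replacement for the code above. The names runPool, held and conflicts are made up for this example, and the random sleep is left out so that it finishes quickly.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runPool starts n workers that share n keys through one buffered channel.
// Each worker claims and returns a key `rounds` times. The return value is
// the number of times a worker received a key that was already marked as
// held by another worker, which is expected to be zero.
func runPool(n, rounds int) int64 {
	keys := make(chan int, n)
	for i := 0; i < n; i++ {
		keys <- i
	}

	held := make([]int32, n) // held[k] is 1 while key k is claimed
	var conflicts int64
	var wg sync.WaitGroup

	for w := 0; w < n; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := 0; r < rounds; r++ {
				k := <-keys // claim: blocks until some key is free
				if !atomic.CompareAndSwapInt32(&held[k], 0, 1) {
					atomic.AddInt64(&conflicts, 1)
				}
				// the "work" would happen here
				atomic.StoreInt32(&held[k], 0)
				keys <- k // return: cannot block, capacity equals the number of keys
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&conflicts)
}

func main() {
	fmt.Println("conflicts:", runPool(5, 1000)) // conflicts: 0
}
```

One thing the channel version does not give you is a choice of which key a worker gets: keys come back in whatever order they were returned.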
I'm making a goroutine worker pool that grows and shrinks according to the load,
but never drops below a specific count.
To do this, I need to know how many goroutines of a specific kind are currently running, without using a global variable.
package main
import (
"fmt"
"time"
)
func Adaptive_Worker_Pool(value_input chan int) {
kill_sig := make(chan bool)
make_sig := make(chan bool)
for i := 0; i < 5; i++ {
go Do(kill_sig, value_input)
}
go Make_Routine(make_sig, kill_sig, value_input)
go Judge(kill_sig, make_sig, value_input)
}
func Make_Routine(make_sig chan bool, kill_sig chan bool, value_input chan int) {
for {
<-make_sig
go Do(kill_sig, value_input)
}
}
func Do(kill_sig chan bool, value_input chan int) {
outer:
for {
select {
case value := <-value_input:
fmt.Println(value)
case <-kill_sig:
break outer
}
}
}
func Judge(make_sig chan bool, kill_sig chan bool, value_input chan int) {
for {
time.Sleep(time.Millisecond * 500)
count_value_in_channel := len(value_input)
if count_value_in_channel > 5 {
make_sig <- true
} else {
if { // if Count(Do( )) > 5 { continue }
continue // else {kill_sig <- true}
} else { // like this
kill_sig <- true
}
}
}
}
func main() {
value_input := make(chan int, 10)
Adaptive_Worker_Pool(value_input)
a := 0
for {
value_input <- a
a++
}
}
Values are sent to the value_input channel, and five goroutines that receive and print them are created by default.
However, if more than 5 values are waiting in the value_input channel, another Do( ) goroutine is created.
I want the Judge( ) function to decide whether to add or remove a Do( ) goroutine based on the number of Do( ) goroutines currently running.
How can I do it?
I think you should define a struct like:
type MyChannel struct {
value int
index int
}
and in the for loop:
value_input := make(chan MyChannel, 10)
a := 0
for {
value_input <- MyChannel{
index: a,
value: a,
}
a++
}
Then, wherever you need the check, you can inspect the index field of MyChannel.
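For the original goal of knowing how many Do( ) goroutines are alive without a global variable, one possible approach is to keep the count inside a small struct that owns the pool. This is only a sketch under some assumptions: the names pool, spawn, shrink and count are invented for this example, workers exit only through the kill channel, and shrink is called from a single controlling goroutine (the role Judge plays in the question).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pool keeps the number of live workers in a field instead of a global.
type pool struct {
	live  int64         // number of running workers, accessed atomically
	kill  chan struct{} // unbuffered: one send stops exactly one worker
	input chan int
}

func newPool(input chan int) *pool {
	return &pool{kill: make(chan struct{}), input: input}
}

// spawn starts one worker and counts it before the goroutine runs.
func (p *pool) spawn() {
	atomic.AddInt64(&p.live, 1)
	go func() {
		for {
			select {
			case v := <-p.input:
				_ = v // process the value here
			case <-p.kill:
				return
			}
		}
	}()
}

// count reports the current number of workers.
func (p *pool) count() int64 { return atomic.LoadInt64(&p.live) }

// shrink stops one worker unless that would drop the pool to below min.
// It assumes a single caller, so the check and the decrement do not race.
func (p *pool) shrink(min int64) bool {
	if p.count() <= min {
		return false
	}
	p.kill <- struct{}{} // returns once some worker has taken the signal
	atomic.AddInt64(&p.live, -1)
	return true
}

func main() {
	p := newPool(make(chan int, 10))
	for i := 0; i < 7; i++ {
		p.spawn()
	}
	fmt.Println(p.count()) // 7
	for p.shrink(5) {
	}
	fmt.Println(p.count()) // 5
}
```

With something like this, Judge could call p.spawn() when len(value_input) is high and p.shrink(5) otherwise, and the pool would never fall below five workers.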
I am implementing code that prints the Lamport logical time when messages are sent to the server and broadcast to the nodes. My program ran fine before I added the Lamport logical time code. Now, after printing the closing-server message, the program breaks and reports a deadlock. Can anyone help me spot my mistake?
package main
import (
"fmt"
"math/rand"
"time"
)
const num_nodes int = 3
const num_messages int = 2
// some arbitrary large number
const buffer_channel = 10000
type Server struct {
serverChannel chan Message
nodeArray []Node
timestamp int
}
type Node struct {
nodeId int
nodeChannel chan Message
server Server
closeChannel chan int
readyChannel chan int
timestamp int
}
type Message struct {
senderId int
messageId int
timestamp int
}
func max(x int, y int) int {
if x > y {
return x
}
return y
}
func broadcast(m Message, s Server) {
for _, n := range s.nodeArray {
if n.nodeId != m.senderId {
broadcastMessage := Message{
m.senderId,
m.messageId,
s.timestamp,
}
go s.broadcastMessage(n.nodeChannel, broadcastMessage)
}
}
}
func (s Server) broadcastMessage(nodeChannel chan Message, broadcastMessage Message) {
fmt.Printf("[Server] is sending Message %d.%d to [Node %d]\n", broadcastMessage.senderId, broadcastMessage.messageId, broadcastMessage.senderId)
nodeChannel <- broadcastMessage
}
func (s Server) listen(messagesBufferChannel chan Message) {
numCompletedNodes := 0
for {
nodeMessage := <-s.serverChannel
s.timestamp = max(s.timestamp, nodeMessage.timestamp) + 1
nodeMessage.timestamp = s.timestamp
fmt.Printf("TS: %d -- [Server] has received Message %d.%d from [Node %d]\n", s.timestamp, nodeMessage.senderId, nodeMessage.messageId, nodeMessage.senderId)
messagesBufferChannel <- nodeMessage
s.timestamp += 1
broadcast(nodeMessage, s)
if nodeMessage.messageId == num_messages-1 {
numCompletedNodes += 1
if numCompletedNodes == num_nodes {
fmt.Println("Server finish broadcasting all messages. Stopping Server...")
return
}
}
numMilliSeconds := rand.Intn(1000) + 2000
time.Sleep(time.Duration(numMilliSeconds) * time.Millisecond)
}
}
func (n Node) preSendMessage() {
for i := 1; i <= num_messages; i++ {
numMilliSeconds := rand.Intn(1000) + 2000
time.Sleep(time.Duration(numMilliSeconds) * time.Millisecond)
n.readyChannel <- i
}
}
func (n Node) listenSendMessages(messagesBufferChannel chan Message) {
for {
select {
case receivedMessage := <-n.nodeChannel:
n.timestamp = max(n.timestamp, receivedMessage.timestamp) + 1
receivedMessage.timestamp = n.timestamp
fmt.Printf("TS: %d -- [Node %d] has received Message %d.%d from [Server]\n", n.timestamp, n.nodeId, receivedMessage.senderId, receivedMessage.messageId)
messagesBufferChannel <- receivedMessage
case nodeMessageId := <-n.readyChannel:
n.timestamp += 1
fmt.Printf("TS: %d -- [Node %d] is sending Message %d.%d to [Server]\n", n.timestamp, n.nodeId, n.nodeId, nodeMessageId)
nodeMessage := Message{
n.nodeId,
nodeMessageId,
n.timestamp,
}
n.server.serverChannel <- nodeMessage
case <-n.closeChannel:
fmt.Printf("Stopping [node %d]\n", n.nodeId)
return
default:
}
}
}
func main() {
fmt.Println("Start of Program...")
server := Server{
serverChannel: make(chan Message),
nodeArray: []Node{},
timestamp: 0,
}
for i := 1; i <= num_nodes; i++ {
newNode := Node{
nodeId: i,
nodeChannel: make(chan Message),
server: server,
readyChannel: make(chan int),
closeChannel: make(chan int),
timestamp: 0,
}
server.nodeArray = append(server.nodeArray, newNode)
}
var messagesBufferChannel chan Message = make(chan Message, buffer_channel)
for _, n := range server.nodeArray {
go n.preSendMessage()
go n.listenSendMessages(messagesBufferChannel)
}
server.listen(messagesBufferChannel)
time.Sleep(time.Second)
for _, n := range server.nodeArray {
n.closeChannel <- 1
}
time.Sleep(time.Second)
close(messagesBufferChannel)
}
I have a channel of thousands of IDs that need to be processed in parallel inside goroutines. How could I implement a lock so that goroutines cannot process the same id at the same time, should it be repeated in the channel?
package main
import (
"fmt"
"sync"
"strconv"
"time"
)
var wg sync.WaitGroup
func main() {
var data []string
for d := 0; d < 30; d++ {
data = append(data, "id1")
data = append(data, "id2")
data = append(data, "id3")
}
chanData := createChan(data)
for i := 0; i < 10; i++ {
wg.Add(1)
process(chanData, i)
}
wg.Wait()
}
func createChan(data []string) <-chan string {
var out = make(chan string)
go func() {
for _, val := range data {
out <- val
}
close(out)
}()
return out
}
func process(ids <-chan string, i int) {
go func() {
defer wg.Done()
for id := range ids {
fmt.Println(id + " (goroutine " + strconv.Itoa(i) + ")")
time.Sleep(1 * time.Second)
}
}()
}
--edit:
All values need to be processed, in any order, but "id1", "id2" and "id3" need to block so that none of them is processed by more than one goroutine at the same time.
The simplest solution here is to not send the duplicate values at all, and then no synchronization is required.
func createChan(data []string) <-chan string {
seen := make(map[string]bool)
var out = make(chan string)
go func() {
for _, val := range data {
if seen[val] {
continue
}
seen[val] = true
out <- val
}
close(out)
}()
return out
}
I've found a solution. Someone has written a package (github.com/EagleChen/mapmutex) to do exactly what I needed:
package main
import (
"fmt"
"github.com/EagleChen/mapmutex"
"strconv"
"sync"
"time"
)
var wg sync.WaitGroup
var mutex *mapmutex.Mutex
func main() {
mutex = mapmutex.NewMapMutex()
var data []string
for d := 0; d < 30; d++ {
data = append(data, "id1")
data = append(data, "id2")
data = append(data, "id3")
}
chanData := createChan(data)
for i := 0; i < 10; i++ {
wg.Add(1)
process(chanData, i)
}
wg.Wait()
}
func createChan(data []string) <-chan string {
var out = make(chan string)
go func() {
for _, val := range data {
out <- val
}
close(out)
}()
return out
}
func process(ids <-chan string, i int) {
go func() {
defer wg.Done()
for id := range ids {
if mutex.TryLock(id) {
fmt.Println(id + " (goroutine " + strconv.Itoa(i) + ")")
time.Sleep(1 * time.Second)
mutex.Unlock(id)
}
}
}()
}
Your problem as stated is difficult by definition and my first choice would be to re-architect the application to avoid it, but if that's not an option:
First, I assume that if a given ID is repeated you still want it processed twice, but not in parallel (if that's not the case and the 2nd instance must be ignored, it becomes even more difficult, because you have to remember every ID you have processed forever, so you don't run the task over it twice).
To achieve your goal, you must keep track of every ID that is currently being acted upon in a goroutine. A Go map is your best option here (note that its size will grow up to as many entries as the goroutines you run in parallel!). The map itself must be protected by a lock, as it is modified from multiple goroutines.
Another simplification that I'd make is that it is OK for an ID removed from the channel to be added back to it if it is found to be currently processed by another goroutine. Then, we need a map[string]bool as the tracking device, plus a sync.Mutex to guard it. For simplicity, I assume the map, the mutex and the channel are global variables; that may not be convenient for you, so arrange access to them as you see fit (arguments to the goroutine, a closure, etc.).
import "sync"
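var idmap = make(map[string]bool) // IDs currently being processed; writing to a nil map would panic
var mtx sync.Mutex // guards idmap
var queue chan string // the work queue; assumed to be created with make elsewhere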
func process_one_id(id string) {
busy := false
mtx.Lock()
if idmap[id] {
busy = true
} else {
idmap[id] = true
}
mtx.Unlock()
if busy { // put the ID back on the queue and exit
queue <- id
return
}
// ensure the 'busy' mark is cleared at the end:
defer func() { mtx.Lock(); delete(idmap, id); mtx.Unlock() }()
// do your processing here
// ....
}
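If re-queueing busy IDs is not desirable, a different technique is to keep one mutex per ID and let a goroutine block until the ID is free. The following is a minimal sketch of that alternative, not the approach from the answer above. The names keyLocks, lockFor and processAll are made up for this example, and the overlap counter exists only to show that no ID is handled by two goroutines at once.

```go
package main

import (
	"fmt"
	"sync"
)

// keyLocks hands out one mutex per ID. The map is guarded by its own mutex.
type keyLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyLocks() *keyLocks {
	return &keyLocks{locks: make(map[string]*sync.Mutex)}
}

// lockFor returns the mutex for id, creating it on first use.
func (k *keyLocks) lockFor(id string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	m, ok := k.locks[id]
	if !ok {
		m = &sync.Mutex{}
		k.locks[id] = m
	}
	return m
}

// processAll runs `workers` goroutines over ids. It returns how many items
// were processed and the largest number of goroutines that were ever inside
// the critical section for the same ID at the same time.
func processAll(ids []string, workers int) (processed, maxOverlap int) {
	kl := newKeyLocks()
	ch := make(chan string)
	go func() {
		for _, id := range ids {
			ch <- id
		}
		close(ch)
	}()

	var stats sync.Mutex // guards active, processed and maxOverlap
	active := make(map[string]int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ch {
				l := kl.lockFor(id)
				l.Lock() // blocks while another goroutine handles the same ID

				stats.Lock()
				active[id]++
				if active[id] > maxOverlap {
					maxOverlap = active[id]
				}
				processed++
				stats.Unlock()

				// the real work on id would go here

				stats.Lock()
				active[id]--
				stats.Unlock()

				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return processed, maxOverlap
}

func main() {
	var ids []string
	for d := 0; d < 30; d++ {
		ids = append(ids, "id1", "id2", "id3")
	}
	fmt.Println(processAll(ids, 10)) // 90 1
}
```

The trade-off is that the lock map keeps one entry per distinct ID for the life of the program, and workers waiting on a hot ID sit idle instead of picking up other work.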
Functions WithMutex and WithoutMutex are giving different results.
WithoutMutex implementation is losing values even though I have Waitgroup set up.
What could be wrong?
Do not run on Playground
P.S. I am on Windows 10 and Go 1.8.1
package main
import (
"fmt"
"sync"
)
var p = fmt.Println
type MuType struct {
list []int
*sync.RWMutex
}
var muData *MuType
var data *NonMuType
type NonMuType struct {
list []int
}
func (data *MuType) add(i int, wg *sync.WaitGroup) {
data.Lock()
defer data.Unlock()
data.list = append(data.list, i)
wg.Done()
}
func (data *MuType) read() []int {
data.RLock()
defer data.RUnlock()
return data.list
}
func (nonmu *NonMuType) add(i int, wg *sync.WaitGroup) {
nonmu.list = append(nonmu.list, i)
wg.Done()
}
func (nonmu *NonMuType) read() []int {
return nonmu.list
}
func WithoutMutex() {
nonmu := &NonMuType{}
nonmu.list = make([]int, 0)
var wg = sync.WaitGroup{}
for i := 0; i < 10; i++ {
wg.Add(1)
go nonmu.add(i, &wg)
}
wg.Wait()
data = nonmu
p(data.read())
}
func WithMutex() {
mtx := &sync.RWMutex{}
withMu := &MuType{list: make([]int, 0)}
withMu.RWMutex = mtx
var wg = sync.WaitGroup{}
for i := 0; i < 10; i++ {
wg.Add(1)
go withMu.add(i, &wg)
}
wg.Wait()
muData = withMu
p(muData.read())
}
func stressTestWOMU(max int) {
p("Without Mutex")
for ii := 0; ii < max; ii++ {
WithoutMutex()
}
}
func stressTest(max int) {
p("With Mutex")
for ii := 0; ii < max; ii++ {
WithMutex()
}
}
func main() {
stressTestWOMU(20)
stressTest(20)
}
Slices are not safe for concurrent writes, so I am in no way surprised that WithoutMutex does not appear to be consistent at all, and has dropped items.
The WithMutex version consistently has 10 items, but in jumbled orders. This is also to be expected, since the mutex protects it so that only one can append at a time. There is no guarantee as to which goroutine will run in which order though, so it is a race to see which of the rapidly spawned goroutines will get to append first.
The waitgroup does not do anything to control access or enforce ordering. It merely provides a signal at the end that everything is done.
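To make the point about ordering concrete, here is a small sketch (the function name collect is made up for this example). The mutex guarantees that all appends survive, the WaitGroup only tells us when they are finished, and if a stable order is needed it has to be imposed afterwards, for instance by sorting.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect appends 0..n-1 from n goroutines under a mutex, waits for all of
// them, and then sorts the result so the output no longer depends on scheduling.
func collect(n int) []int {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		list []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			list = append(list, v) // only one goroutine appends at a time
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	sort.Ints(list)
	return list
}

func main() {
	fmt.Println(collect(10)) // [0 1 2 3 4 5 6 7 8 9]
}
```

Running the unsynchronized version with go run -race should also report the data race directly.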
I have a channel which stores received data, and I want to process it when one of the following conditions is met:
1. The channel reaches its capacity.
2. A timer fires, measured from the last time the data was processed.
I saw the post
Golang - How to know a buffered channel is full
Update:
Inspired by that post and OneOfOne's advice, here is what I have on the playground:
package main
import (
"fmt"
"math/rand"
"time"
)
var c chan int
var timer *time.Timer
const (
capacity = 5
timerDration = 3
)
func main() {
c = make(chan int, capacity)
timer = time.NewTimer(time.Second * timerDration)
go checkTimer()
go sendRecords("A")
go sendRecords("B")
go sendRecords("C")
time.Sleep(time.Second * 20)
}
func sendRecords(name string) {
for i := 0; i < 20; i++ {
fmt.Println(name+" sending record....", i)
sendOneRecord(i)
interval := time.Duration(rand.Intn(500))
time.Sleep(time.Millisecond * interval)
}
}
func sendOneRecord(record int) {
select {
case c <- record:
default:
fmt.Println("channel is full !!!")
process()
c <- record
timer.Reset(time.Second * timerDration)
}
}
func checkTimer() {
for {
select {
case <-timer.C:
fmt.Println("3s timer ----------")
process()
timer.Reset(time.Second * timerDration)
}
}
}
func process() {
for i := 0; i < capacity; i++ {
fmt.Println("process......", <-c)
}
}
This seems to work fine, but I have a concern: I want to block other goroutines from writing to the channel while process() is running. Does the code above do that, or should I add a mutex at the beginning of the process method?
Is there a more elegant solution?
As was mentioned by @OneOfOne, select is really the only way to check if a channel is full.
If you are using the channel to effect batch processing, you could always create an unbuffered channel and have a goroutine pull items and append to a slice.
When the slice reaches a specific size, process the items.
Here's an example on play
package main
import (
"fmt"
"sync"
"time"
)
const BATCH_SIZE = 10
func batchProcessor(ch <-chan int) {
batch := make([]int, 0, BATCH_SIZE)
for i := range ch {
batch = append(batch, i)
if len(batch) == BATCH_SIZE {
fmt.Println("Process batch:", batch)
time.Sleep(time.Second)
batch = batch[:0] // trim back to zero size
}
}
fmt.Println("Process last batch:", batch)
}
func main() {
var wg sync.WaitGroup
ch := make(chan int)
wg.Add(1)
go func() {
batchProcessor(ch)
wg.Done()
}()
fmt.Println("Submitting tasks")
for i := 0; i < 55; i++ {
ch <- i
}
close(ch)
wg.Wait()
}
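The example above flushes on size only. The question also asked for a flush when a timer fires, so here is one way the two conditions might be combined in a single select. This is a sketch under assumptions: the names batchBySizeOrTime, batchSizes and flush are invented for this example, and ticker.Reset needs Go 1.15 or later.

```go
package main

import (
	"fmt"
	"time"
)

// batchBySizeOrTime collects values from in and hands them to flush when
// either `size` items have accumulated or maxWait has passed since the last
// size-triggered flush. Remaining items are flushed when in is closed.
func batchBySizeOrTime(in <-chan int, size int, maxWait time.Duration, flush func([]int)) {
	batch := make([]int, 0, size)
	ticker := time.NewTicker(maxWait)
	defer ticker.Stop()

	emit := func() {
		if len(batch) == 0 {
			return
		}
		flush(append([]int(nil), batch...)) // hand over a copy, then reuse the buffer
		batch = batch[:0]
	}

	for {
		select {
		case v, ok := <-in:
			if !ok {
				emit()
				return
			}
			batch = append(batch, v)
			if len(batch) == size {
				emit()
				ticker.Reset(maxWait) // restart the wait after a size-triggered flush
			}
		case <-ticker.C:
			emit()
		}
	}
}

// batchSizes feeds n values through the batcher with a very long timeout and
// returns the size of each flushed batch, so only the size rule applies.
func batchSizes(n, size int) []int {
	in := make(chan int, n)
	for i := 0; i < n; i++ {
		in <- i
	}
	close(in)
	var sizes []int
	batchBySizeOrTime(in, size, time.Hour, func(b []int) {
		sizes = append(sizes, len(b))
	})
	return sizes
}

func main() {
	fmt.Println(batchSizes(12, 5)) // [5 5 2]
}
```

Because a single goroutine owns the batch slice, no mutex is needed, and senders simply block on the channel while a flush is in progress.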
No, select is the only way to do it:
func (t *T) Send(v *Val) {
select {
case t.ch <- v:
default:
// handle v directly
}
}
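As a self-contained illustration of the same select/default pattern, here is a tiny sketch. The helper name trySend is made up for this example; it reports whether the value fit into the channel.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the value was queued.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// The buffer is full (or, for an unbuffered channel, no receiver is ready).
		return false
	}
}

func main() {
	ch := make(chan int, 2)
	fmt.Println(trySend(ch, 1), trySend(ch, 2), trySend(ch, 3)) // true true false
}
```

Note that the answer is only valid at the moment of the call: another goroutine may drain or fill the channel immediately afterwards.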