I have two goroutines:
the first one adds a task to a queue
the second one cleans up tasks from the queue based on their status
Add and cleanup might not be simultaneous.
If the status of a task is success, I want to delete the task from the queue; if not, I will retry until the status is success (with a time limit). If that fails, I will log and delete the task from the queue.
We can't communicate between add and delete, because that is not how the real-world scenario works.
I want something like a watcher which monitors additions to the queue and does the cleanup described above. To increase complexity, Add might keep adding even while cleanup is happening (not shown here). I want to implement it without using external packages.
How can I achieve this?
package main

import "time"

type Task struct {
	name   string
	status string // completed, failed
}

var list []*Task

func main() {
	done := make(chan bool)
	go Add()
	time.Sleep(15 * time.Second)
	go clean(done)
	<-done
}

func Add() {
	t1 := &Task{"test1", "completed"}
	t2 := &Task{"test2", "failed"}
	list = append(list, t1, t2)
}

func clean(done chan bool) {
	for k, v := range list {
		if v.status == "completed" {
			list = RemoveIndex(list, k)
		} else {
			// for now, consider this the retry
			v.status = "completed"
		}
	}
	if len(list) > 0 {
		clean(done)
		return
	}
	done <- true
}

func RemoveIndex(s []*Task, index int) []*Task {
	return append(s[:index], s[index+1:]...)
}
So I found a solution which works for me and am posting it here in case it helps anyone.
In my main I have added a ticker which fires every x seconds to check whether something was added to the queue.
package main

import (
	"os"
	"os/signal"
	"syscall"
	"time"
)

type Task struct {
	name   string
	status string // completed, failed
}

var list []*Task

func main() {
	c := make(chan os.Signal, 2)
	ticker := time.NewTicker(5 * time.Second) // check every x seconds (interval chosen arbitrarily here)

	go Add()
	go func() {
		for range ticker.C {
			Monitor()
		}
	}()

	signal.Notify(c, os.Interrupt, syscall.SIGTERM)
	// waiting for an interrupt here
	<-c
}

func Add() {
	t1 := &Task{"test1", "completed"}
	t2 := &Task{"test2", "failed"}
	list = append(list, t1, t2)
}

func Monitor() {
	if len(list) > 0 {
		cleaner()
	}
}

func cleaner() {
	// do the cleaning here:
	// pop each element from the queue and delete it
}

func RemoveIndex(s []*Task, index int) []*Task {
	return append(s[:index], s[index+1:]...)
}
So now this solution does not need to depend on communication between goroutines.
In a real-world scenario, the program never dies and keeps adding and cleaning based on the use case. You can optimize further by locking and unlocking around additions to and deletions from the queue, as in the sketch below.
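A minimal sketch of that locking, assuming a package-level sync.Mutex guarding list (the drain-everything loop in cleaner is a placeholder, not real retry logic):

var mu sync.Mutex

func Add() {
	t1 := &Task{"test1", "completed"}
	t2 := &Task{"test2", "failed"}
	mu.Lock()
	list = append(list, t1, t2)
	mu.Unlock()
}

func cleaner() {
	mu.Lock()
	defer mu.Unlock()
	// pop each element from the queue and delete it
	for len(list) > 0 {
		list = RemoveIndex(list, 0)
	}
}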
On IoT devices, Go applications are running that can receive commands from the cloud. The commands are pushed onto a queue
var queue chan time.Time
and workers on the IoT device process the queue.
The job of the worker is to send data covering a period of time back to the cloud; the time on the channel is the start time of such a period. The IoT devices are on a mobile network connection, so sometimes data gets lost and never arrives at the cloud. The cloud also is not sure whether the command it sent arrived on the IoT device, and can get impatient and resend the command.
I want to make sure that if the original command is still in the queue, the same command cannot be pushed onto the queue again. Is there a way to do that?
func addToQueue(periodStart time.Time) error {
if alreadyOnQueue(queue, periodStart) {
return errors.New("periodStart was already on the queue, not adding it again")
}
queue <- periodStart
return nil
}
func alreadyOnQueue(queue chan time.Time, t time.Time) bool {
return false // todo
}
I've created a solution that is available on https://github.com/munnik/uniqueue/
package uniqueue
import (
"errors"
"sync"
)
// UQ is a uniqueue queue. It guarantees that a value is only once in the queue. The queue is thread safe.
// The unique constraint can be temporarily disabled to add multiple instances of the same value to the queue.
type UQ[T comparable] struct {
back chan T
queue chan T
front chan T
constraints map[T]*constraint
mu sync.Mutex
AutoRemoveConstraint bool // if true, the constraint will be removed when the value is popped from the queue.
}
type constraint struct {
count uint // number of elements in the queue
disabled bool
}
func NewUQ[T comparable](size uint) *UQ[T] {
u := &UQ[T]{
back: make(chan T),
queue: make(chan T, size),
front: make(chan T),
constraints: map[T]*constraint{},
}
go u.linkChannels()
return u
}
// Back returns the back of the queue; this channel can be used to write values to the queue.
func (u *UQ[T]) Back() chan<- T {
return u.back
}
// Front returns the front of the queue; this channel can be used to read values from the queue.
func (u *UQ[T]) Front() <-chan T {
return u.front
}
// IgnoreConstraintFor ignores the constraint for a value v once; when the value is added to the queue again, the constraint is re-enabled.
func (u *UQ[T]) IgnoreConstraintFor(v T) {
u.mu.Lock()
defer u.mu.Unlock()
if _, ok := u.constraints[v]; !ok {
u.constraints[v] = &constraint{}
}
u.constraints[v].disabled = true
}
// AddConstraint manually adds a constraint to the queue; only use this in special cases when you want to prevent certain values from entering the queue.
func (u *UQ[T]) AddConstraint(v T) error {
u.mu.Lock()
defer u.mu.Unlock()
if _, ok := u.constraints[v]; !ok {
u.constraints[v] = &constraint{
count: 1,
disabled: false,
}
return nil
} else {
if u.constraints[v].disabled {
u.constraints[v].count += 1
u.constraints[v].disabled = false
return nil
}
}
return errors.New("Already existing constraint prevents adding new constraint")
}
// Manually remove a constraint from the queue, this needs to be called when AutoRemoveConstraint is set to false. Useful when you want to remove the constraint only when a worker using the queue is finished processing the value.
func (u *UQ[T]) RemoveConstraint(v T) {
u.mu.Lock()
defer u.mu.Unlock()
if _, ok := u.constraints[v]; ok {
u.constraints[v].count -= 1
if u.constraints[v].count == 0 {
delete(u.constraints, v)
}
}
}
func (u *UQ[T]) linkChannels() {
wg := &sync.WaitGroup{}
wg.Add(2)
go u.shiftToFront(wg)
go u.readFromBack(wg)
wg.Wait()
}
func (u *UQ[T]) shiftToFront(wg *sync.WaitGroup) {
for v := range u.queue {
u.front <- v
if u.AutoRemoveConstraint {
u.RemoveConstraint(v)
}
}
close(u.front)
wg.Done()
}
func (u *UQ[T]) readFromBack(wg *sync.WaitGroup) {
for v := range u.back {
if err := u.AddConstraint(v); err == nil {
u.queue <- v
}
}
close(u.queue)
wg.Done()
}
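A short usage sketch of the type above, in the spirit of the question (the hour truncation, the queue size, and the variable names are illustrative assumptions):

package main

import (
	"fmt"
	"time"

	"github.com/munnik/uniqueue"
)

func main() {
	q := uniqueue.NewUQ[time.Time](10)

	periodStart := time.Now().Truncate(time.Hour)
	q.Back() <- periodStart // enqueued; a constraint for this value is added
	q.Back() <- periodStart // dropped by readFromBack: the constraint already exists

	got := <-q.Front()
	fmt.Println(got)
	q.RemoveConstraint(got) // the worker is done, so the value may now be enqueued again
}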
I've been searching a lot but could not find an answer to my problem yet.
I need to make multiple calls to an external API, with different parameters, concurrently.
Then, for each call, I need to init a struct for each dataset and process the data I receive from the API call. Bear in mind that I read each line of the incoming stream and immediately send it to the channel.
The first problem I encountered was not obvious at the beginning due to the large quantity of data I'm receiving: each goroutine does not receive all the data that goes through the channel (which I learned through research: each value sent on a channel is delivered to only one receiver). So what I need is a way of requeuing/redirecting that data to the correct goroutine.
The function that sends the streamed response from a single dataset.
(I've cut useless parts of code that are out of context)
func (api *API) RequestData(ctx context.Context, c chan DWeatherResponse, dataset string, wg *sync.WaitGroup) error {
for {
line, err := reader.ReadBytes('\n')
s := string(line)
if err != nil {
log.Println("End of %s", dataset)
return err
}
data, err := extractDataFromStreamLine(s, dataset)
if err != nil {
continue
}
c <- *data
}
}
The function that will process the incoming data
func (s *StrikeStruct) Process(ch, requeue chan dweather.DWeatherResponse) {
for {
data, more := <-ch
if !more {
break
}
// data contains {dataset string, value float64, date time.Time}
// The s.Parameter needs to match the dataset
// IMPORTANT PART: checks whether the received data belongs to this struct's dataset.
// If not, I want to send it to another goroutine until it gets to the correct
// one. There will be a max of 4 datasets, but still this might not be the best approach.
if !api.GetDataset(s.Parameter, data.Dataset) {
requeue <- data
continue
}
// Do stuff with the data from this point
}
}
Now on my own API endpoint I have the following:
ch := make(chan dweather.DWeatherResponse, 2)
requeue := make(chan dweather.DWeatherResponse)
final := make(chan strike.StrikePerYearResponse)
var wg sync.WaitGroup
for _, s := range args.Parameters.Strikes {
strike := strike.StrikePerYear{
Parameter: strike.Parameter(s.Dataset),
StrikeValue: s.Value,
}
// I receive and process the data in here
go strike.ProcessStrikePerYear(ch, requeue, final, string(s.Dataset))
}
go func() {
for {
data, _ := <-requeue
ch <- data
}
}()
// Creates a goroutine for each dataset
for _, dataset := range api.Params.Dataset {
wg.Add(1)
go api.RequestData(ctx, ch, dataset, &wg)
}
wg.Wait()
close(ch)
//Once the data is all processed it is all appended
var strikes []strike.StrikePerYearResponse
for range args.Fetch.Datasets {
strikes = append(strikes, <-final)
}
return strikes
The issue with this code is that as soon as I start receiving data from more than one endpoint, the requeue channel blocks and nothing more happens. If I remove the requeue logic, data is lost whenever it does not land on the correct goroutine.
My two questions are:
Why is the requeue blocking if it has a goroutine always ready to receive?
Should I take a different approach on how I'm processing the incoming data?
This is not a good way to solve your problem; you should change your approach. (As to why it blocks: the requeue loop feeds back into the same bounded channel ch that the workers read from, so once ch is full and every worker is itself blocked sending to requeue, the forwarding goroutine can never complete ch <- data and everything stops.) I suggest an implementation like the one below:
import (
"fmt"
"sync"
)
// answer for https://stackoverflow.com/questions/68454226/how-to-handle-multiple-goroutines-that-share-the-same-channel
var (
finalResult = make(chan string)
)
// IData is the interface for the message dispatcher; every data struct must implement its methods
type IData interface {
IsThisForMe() bool
Process(*sync.WaitGroup)
}
// MainData can be your main struct, like StrikePerYear
type MainData struct {
// add any props
Id int
Name string
}
type DataTyp1 struct {
MainData *MainData
}
func (d DataTyp1) IsThisForMe() bool {
// you can check your condition on the incoming data here
if d.MainData.Id == 2 {
return true
}
return false
}
func (d DataTyp1) Process(wg *sync.WaitGroup) {
d.MainData.Name = "processed by DataTyp1"
// send result to final channel, you can change it as you want
finalResult <- d.MainData.Name
wg.Done()
}
type DataTyp2 struct {
MainData *MainData
}
func (d DataTyp2) IsThisForMe() bool {
// you can check your condition on the incoming data here
if d.MainData.Id == 3 {
return true
}
return false
}
func (d DataTyp2) Process(wg *sync.WaitGroup) {
d.MainData.Name = "processed by DataTyp2"
// send result to final channel, you can change it as you want
finalResult <- d.MainData.Name
wg.Done()
}
// dispatcher will run a new goroutine for each request.
// You can implement a worker pool to prevent running too many goroutines.
func dispatcher(incomingData *MainData, wg *sync.WaitGroup) {
	// based on your requirements you can keep or remove this goroutine
go func() {
var p IData
p = DataTyp1{incomingData}
if p.IsThisForMe() {
go p.Process(wg)
return
}
p = DataTyp2{incomingData}
if p.IsThisForMe() {
go p.Process(wg)
return
}
}()
}
func main() {
dummyDataArray := []MainData{
MainData{Id: 2, Name: "this data #2"},
MainData{Id: 3, Name: "this data #3"},
}
wg := sync.WaitGroup{}
for i := range dummyDataArray {
wg.Add(1)
dispatcher(&dummyDataArray[i], &wg)
}
result := make([]string, 0)
done := make(chan struct{})
// data collector
go func() {
loop:
	for {
select {
case <-done:
break loop
case r := <-finalResult:
result = append(result, r)
}
}
}()
wg.Wait()
done <- struct{}{}
for _, s := range result {
fmt.Println(s)
}
}
Note: this is just to open your mind to finding a better solution; for sure this is not production-ready code.
What I eventually want to accomplish is to dynamically scale my workers up OR down, depending on the workload.
The code below successfully parses data when a Task comes through w.Channel.
func (s *Storage) StartWorker(w *app.Worker) {
go func() {
for {
w.Pool <- w.Channel // register current worker to the worker pool
select {
case task := <-w.Channel: // received a work request, do some work
time.Sleep(task.Delay)
fmt.Println(w.WorkerID, "processing task:", task.TaskName)
w.Results <- s.ProcessTask(w, &task)
case <-w.Quit:
fmt.Println("Closing channel for", w.WorkerID)
return
}
}
}()
}
The blocking point here is the line below.
w.Pool <- w.Channel
In that sense, if I try to stop a worker(s) in any part of my program with:
w.Quit <- true
the case <-w.Quit: is blocked and never receives until there's another incoming Task on w.Channel (and I guess the select statement here picks randomly among the ready cases).
So how can I stop a channel (worker) independently?
See the sample code below. It declares a fanout function that is responsible for sizing the workers up and down.
It works by using timeouts to detect that workers have ended or that new ones are required.
There is an inner loop to ensure that each item is processed before moving on, blocking the source when needed.
package main

import (
	"fmt"
	"time"
)

func main() {
	input := make(chan string)
	// feed some demo work in, then close the input (added so the sketch runs)
	go func() {
		for i := 0; i < 100; i++ {
			input <- fmt.Sprint(i)
		}
		close(input)
	}()
	fanout(input)
}

func fanout(input chan string) {
	workers := 0
	distribute := make(chan string)
	workerEnd := make(chan bool)
	for i := range input {
		done := false
		for !done {
			select {
			case distribute <- i:
				done = true
			case <-workerEnd:
				workers--
			default:
				if workers < 10 {
					workers++
					go func() {
						work(distribute)
						workerEnd <- true
					}()
				}
			}
		}
	}
}

func work(input chan string) {
	for {
		select {
		case i := <-input:
			fmt.Println("processing", i)
			<-time.After(time.Millisecond) // simulate the work
		case <-time.After(time.Second):
			return // idle for too long; this worker exits
		}
	}
}
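Note the two timeouts play different roles: the millisecond wait inside work merely simulates processing an item, while the one-second receive timeout is what scales the pool down, since a worker that stays idle for that long returns and reports itself on workerEnd.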
I am trying to implement a priority queue to send json objects through a network socket based on priority. I am using the container/heap package to implement the queue. I came up with something like this:
for {
if pq.Len() > 0 {
item := heap.Pop(&pq).(*Item)
jsonEncoder.Encode(&item)
} else {
time.Sleep(10 * time.Millisecond)
}
}
Are there better ways to wait for a new item than just polling the priority queue?
I'd probably use a queuing goroutine. Starting with the data structures in the PriorityQueue example, I'd build a function like this:
http://play.golang.org/p/hcNFX8ehBW
func queue(in <-chan *Item, out chan<- *Item) {
// Make us a queue!
pq := make(PriorityQueue, 0)
heap.Init(&pq)
var currentItem *Item // Our item "in hand"
var currentIn = in // Current input channel (may be nil sometimes)
var currentOut chan<- *Item // Current output channel (starts nil until we have something)
defer close(out)
for {
select {
// Read from the input
case item, ok := <-currentIn:
if !ok {
// The input has been closed. Don't keep trying to read it
currentIn = nil
// If there's nothing pending to write, we're done
if currentItem == nil {
return
}
continue
}
// Were we holding something to write? Put it back.
if currentItem != nil {
heap.Push(&pq, currentItem)
}
// Put our new thing on the queue
heap.Push(&pq, item)
// Turn on the output queue if it's not turned on
currentOut = out
// Grab our best item. We know there's at least one. We just put it there.
currentItem = heap.Pop(&pq).(*Item)
// Write to the output
case currentOut <- currentItem:
// OK, we wrote. Is there anything else?
if len(pq) > 0 {
// Hold onto it for next time
currentItem = heap.Pop(&pq).(*Item)
} else {
// Oh well, nothing to write. Is the input stream done?
if currentIn == nil {
// Then we're done
return
}
// Otherwise, turn off the output stream for now.
currentItem = nil
currentOut = nil
}
}
}
}
Here's an example of using it:
func main() {
// Some items and their priorities.
items := map[string]int{
"banana": 3, "apple": 2, "pear": 4,
}
in := make(chan *Item, 10) // Big input buffer and unbuffered output should give best sort ordering.
out := make(chan *Item) // But the system will "work" for any particular values
// Start the queuing engine!
go queue(in, out)
// Stick some stuff on in another goroutine
go func() {
i := 0
for value, priority := range items {
in <- &Item{
value: value,
priority: priority,
index: i,
}
i++
}
close(in)
}()
// Read the results
for item := range out {
fmt.Printf("%.2d:%s ", item.priority, item.value)
}
fmt.Println()
}
Note that if you run this example, the order will be a little different every time. That's of course expected. It depends on exactly how fast the input and output channels run.
One way would be to use sync.Cond:
Cond implements a condition variable, a rendezvous point for goroutines waiting for or announcing the occurrence of an event.
An example from the package could be amended as follows (for the consumer):
c.L.Lock()
for pq.Len() == 0 {
c.Wait() // Will wait until signalled by pushing routine
}
item := heap.Pop(&pq).(*Item)
c.L.Unlock()
// Do stuff with the item
And producer could simply do:
c.L.Lock()
heap.Push(&pq, x)
c.L.Unlock()
c.Signal()
(Wrapping these in functions and using defers might be a good idea.)
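For instance, a sketch of such wrappers, assuming the PriorityQueue and Item types from the container/heap package example (the helper names pop and push are mine):

// pop blocks until an item is available, then removes and returns the highest-priority one.
func pop(c *sync.Cond, pq *PriorityQueue) *Item {
	c.L.Lock()
	defer c.L.Unlock()
	for pq.Len() == 0 {
		c.Wait() // releases the lock while waiting, reacquires it before returning
	}
	return heap.Pop(pq).(*Item)
}

// push adds an item and wakes one waiting consumer.
func push(c *sync.Cond, pq *PriorityQueue, item *Item) {
	c.L.Lock()
	defer c.L.Unlock()
	heap.Push(pq, item)
	c.Signal()
}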
Here is an example of thread-safe (naive) heap which pop method waits until item is available:
package main
import (
"fmt"
"sort"
"sync"
"time"
"math/rand"
)
type Heap struct {
b []int
c *sync.Cond
}
func NewHeap() *Heap {
return &Heap{c: sync.NewCond(new(sync.Mutex))}
}
// Pop (waits until anything available)
func (h *Heap) Pop() int {
h.c.L.Lock()
defer h.c.L.Unlock()
for len(h.b) == 0 {
h.c.Wait()
}
// There is definitely something in there
x := h.b[len(h.b)-1]
h.b = h.b[:len(h.b)-1]
return x
}
func (h *Heap) Push(x int) {
defer h.c.Signal() // will wake up a popper
h.c.L.Lock()
defer h.c.L.Unlock()
// Add and sort to maintain priority (not really how the heap works)
h.b = append(h.b, x)
sort.Ints(h.b)
}
func main() {
heap := NewHeap()
go func() {
for range time.Tick(time.Second) {
for n := 0; n < 3; n++ {
x := rand.Intn(100)
fmt.Println("push:", x)
heap.Push(x)
}
}
}()
for {
item := heap.Pop()
fmt.Println("pop: ", item)
}
}
(Note this is not working in playground because of the for range time.Tick loop. Run it locally.)
I'm learning go, and I would like to explore some patterns.
I would like to build a Registry component which maintains a map of some stuff, and I want to provide a serialized access to it:
Currently I ended up with something like this:
type JobRegistry struct {
submission chan JobRegistrySubmitRequest
listing chan JobRegistryListRequest
}
type JobRegistrySubmitRequest struct {
request JobSubmissionRequest
response chan Job
}
type JobRegistryListRequest struct {
response chan []Job
}
func NewJobRegistry() (this *JobRegistry) {
this = &JobRegistry{make(chan JobRegistrySubmitRequest, 10), make(chan JobRegistryListRequest, 10)}
go func() {
jobMap := make(map[string] Job)
for {
select {
case sub := <- this.submission:
job := MakeJob(sub.request) // ....
jobMap[job.Id] = job
sub.response <- job
case list := <- this.listing:
res := make([]Job, 0, 100)
for _, v := range jobMap {
res = append(res, v)
}
list.response <- res
}
/// case somechannel....
}
}()
return
}
Basically, I encapsulate each operation inside a struct, which carries
the parameters and a response channel.
Then I created helper methods for end users:
func (this *JobRegistry) List() ([]Job, error) {
res := make(chan []Job, 1)
req := JobRegistryListRequest{res}
this.listing <- req
return <-res, nil // todo: handle errors like timeouts
}
I decided to use a channel for each type of request in order to be type safe.
The problem I see with this approach are:
A lot of boilerplate code and a lot of places to modify when some parameter or return type changes.
Having to do weird things like creating yet another wrapper struct in order to return errors from within the handler goroutine. (If I understood correctly, there are no tuples and no way to send multiple values on a channel, like multi-valued returns.)
So, I'm wondering whether all this makes sense, or rather just get back to good old locks.
I'm sure that somebody will find some clever way out using channels.
I'm not entirely sure I understand you, but I'll try answering nevertheless.
You want a generic service that executes jobs sent to it. You also might want the jobs to be serializable.
What we need is an interface that would define a generic job.
type Job interface {
Run()
Serialize(io.Writer)
}
func ReadJob(r io.Reader) {...}
type JobManager struct {
jobs map[int] Job
jobs_c chan Job
}
func NewJobManager() *JobManager {
	mgr := &JobManager{make(map[int]Job), make(chan Job, JOB_QUEUE_SIZE)}
	// run the receive loop in the background so the constructor can return
	go func() {
		for {
			j, ok := <-mgr.jobs_c
			if !ok {
				break
			}
			go j.Run()
		}
	}()
	return mgr
}
type IntJob struct{...}
func (job *IntJob) GetOutChan() chan int {...}
func (job *IntJob) Run() {...}
func (job *IntJob) Serialize(o io.Writer) {...}
Much less code, and roughly as useful.
About signaling errors with an auxiliary stream: you can always use a helper function.
type IntChanWithErr struct {
	c    chan int
	errc chan error
}

func (ch *IntChanWithErr) Next() (v int, err error) {
	select {
	case v = <-ch.c: // not handling a closed channel
	case err = <-ch.errc:
	}
	return
}
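For illustration only, a hypothetical caller of that helper (the producer goroutine and the loop bound are assumptions added for this sketch):

func main() {
	ch := &IntChanWithErr{c: make(chan int), errc: make(chan error)}
	go func() {
		ch.c <- 1
		ch.errc <- errors.New("transient failure") // hypothetical error
		ch.c <- 2
	}()
	for i := 0; i < 3; i++ {
		v, err := ch.Next()
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("value:", v)
	}
}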