Parallel processing in golang

Given the following code:
package main
import (
"fmt"
"math/rand"
"time"
)
func main() {
for i := 0; i < 3; i++ {
go f(i)
}
// prevent main from exiting immediately
var input string
fmt.Scanln(&input)
}
func f(n int) {
for i := 0; i < 10; i++ {
dowork(n, i)
amt := time.Duration(rand.Intn(250))
time.Sleep(time.Millisecond * amt)
}
}
func dowork(goroutine, loopindex int) {
// simulate work
time.Sleep(time.Second * time.Duration(5))
fmt.Printf("gr[%d]: i=%d\n", goroutine, loopindex)
}
Can I assume that the 'dowork' function will be executed in parallel?
Is this a correct way of achieving parallelism or is it better to use channels and separate 'dowork' workers for each goroutine?

Regarding GOMAXPROCS, you can find this in Go 1.5's release docs:
By default, Go programs run with GOMAXPROCS set to the number of cores available; in prior releases it defaulted to 1.
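As a small sketch of how the default quoted above can be checked on a given machine: the helper name parallelism is made up for this illustration, while runtime.GOMAXPROCS and runtime.NumCPU are standard library calls. Passing 0 to runtime.GOMAXPROCS only reads the current value.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelism (a hypothetical helper name) reports the current GOMAXPROCS
// setting and the number of logical CPUs the runtime sees. Passing 0 to
// runtime.GOMAXPROCS queries the value without changing it.
func parallelism() (procs, cpus int) {
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	procs, cpus := parallelism()
	fmt.Println("GOMAXPROCS:", procs, "NumCPU:", cpus)
}
```

On Go 1.5 or later the two numbers would normally match unless the GOMAXPROCS environment variable or a call to runtime.GOMAXPROCS overrides the default.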
Regarding preventing the main function from exiting immediately, you could leverage WaitGroup's Wait function.
I wrote this utility function to help parallelize a group of functions:
import "sync"
// Parallelize parallelizes the function calls
func Parallelize(functions ...func()) {
var waitGroup sync.WaitGroup
waitGroup.Add(len(functions))
defer waitGroup.Wait()
for _, function := range functions {
go func(copy func()) {
defer waitGroup.Done()
copy()
}(function)
}
}
So in your case, we could do this
func1 := func() {
f(0)
}
func2 := func() {
f(1)
}
func3 := func() {
f(2)
}
Parallelize(func1, func2, func3)
If you want to use the Parallelize function, you can find it here: https://github.com/shomali11/util

This answer is outdated. Please see this answer instead.
Your code will run concurrently, but not in parallel. You can make it run in parallel by setting GOMAXPROCS.
It's not clear exactly what you're trying to accomplish here, but it looks like a perfectly valid way of achieving concurrency to me.

f() will be executed concurrently, but the dowork() calls within each f() will be executed sequentially. Waiting on stdin is also not the right way to ensure that your goroutines have finished execution. Instead, create a channel on which each f() sends true when it finishes.
At the end of main(), wait for n values of true on that channel, n being the number of f() goroutines that you have started.
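The done-channel approach described above could be sketched roughly as follows. The function name runAll and its work parameter are made up for this illustration; they are not part of the original program.

```go
package main

import "fmt"

// runAll (hypothetical name) starts n goroutines and blocks until each one
// has signalled completion on the done channel. It returns the number of
// completion signals received.
func runAll(n int, work func(id int)) int {
	done := make(chan bool)
	for i := 0; i < n; i++ {
		go func(id int) {
			work(id)
			done <- true // tell the caller this goroutine has finished
		}(i)
	}
	received := 0
	for i := 0; i < n; i++ {
		<-done // wait for exactly one signal per goroutine started
		received++
	}
	return received
}

func main() {
	finished := runAll(3, func(id int) {
		fmt.Println("goroutine", id, "finished its work")
	})
	fmt.Println("goroutines finished:", finished)
}
```

In the question's program, the work function would be the body of f(), and main would no longer need fmt.Scanln to stay alive.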

This helped me when I was starting out.
package main
import "fmt"
func put(number chan<- int, count int) {
i := 0
for ; i <= (5 * count); i++ {
number <- i
}
number <- -1
}
func subs(number chan<- int) {
i := 10
for ; i <= 19; i++ {
number <- i
}
}
func main() {
channel1 := make(chan int)
channel2 := make(chan int)
done := 0
sum := 0
go subs(channel2)
go put(channel1, 1)
go put(channel1, 2)
go put(channel1, 3)
go put(channel1, 4)
go put(channel1, 5)
for done != 5 {
select {
case elem := <-channel1:
if elem < 0 {
done++
} else {
sum += elem
fmt.Println(sum)
}
case sub := <-channel2:
sum -= sub
fmt.Printf("atimta : %d\n", sub)
fmt.Println(sum)
}
}
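// No close() here: only senders should close a channel, and subs may still
// be blocked sending on channel2, which would panic if the channel were closed.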
}
"Conventional cluster-based systems (such as supercomputers) employ parallel execution between processors using MPI. MPI is a communication interface between processes that execute in operating system instances on different processors; it doesn't support other process operations such as scheduling. (At the risk of complicating things further, because MPI processes are executed by operating systems, a single processor can run multiple MPI processes and/or a single MPI process can also execute multiple threads!)"

You can add a loop at the end, to block until the jobs are done:
package main
import "time"
func f(n int, b chan bool) {
println(n)
time.Sleep(time.Second)
b <- true
}
func main() {
b := make(chan bool, 9)
for n := cap(b); n > 0; n-- {
go f(n, b)
}
// receive exactly one value per goroutine started
for n := cap(b); n > 0; n-- {
<-b
}
}

Related

Job queue where workers can add jobs, is there an elegant solution to stop the program when all workers are idle?

I find myself in a situation where I have a queue of jobs where workers can add new jobs when they are done processing one.
For illustration, in the code below, a job consists in counting up to JOB_COUNTING_TO and, randomly, 1/5 of the time a worker will add a new job to the queue.
Because my workers can add jobs to the queue, it is my understanding that I was not able to use a channel as my job queue. This is because sending to the channel is blocking and, even with a buffered channel, this code, due to its recursive nature (jobs adding jobs) could easily reach a situation where all the workers are sending to the channel and no worker is available to receive.
This is why I decided to use a shared queue protected by a mutex.
Now, I would like the program to halt when all the workers are idle. Unfortunately, this cannot be detected just by checking for len(jobQueue) == 0, as the queue could be empty while some worker is still doing its job and may add a new job after that.
The solution I came up with feels a bit clunky: it uses the variables var idleWorkerCount int and var isIdle [NB_WORKERS]bool to keep track of idle workers, and the code stops when idleWorkerCount == NB_WORKERS.
My question is whether there is a concurrency pattern that I could use to make this logic more elegant.
Also, for some reason I don't understand, the technique that I currently use (code below) becomes really inefficient when the number of workers becomes quite big (such as 300000 workers): for the same number of jobs, the code will be > 10x slower for NB_WORKERS = 300000 vs NB_WORKERS = 3000.
Thank you very much in advance!
package main
import (
"math/rand"
"sync"
)
const NB_WORKERS = 3000
const NB_INITIAL_JOBS = 300
const JOB_COUNTING_TO = 10000000
var jobQueue []int
var mu sync.Mutex
var idleWorkerCount int
var isIdle [NB_WORKERS]bool
func doJob(workerId int) {
mu.Lock()
if len(jobQueue) == 0 {
if !isIdle[workerId] {
idleWorkerCount += 1
}
isIdle[workerId] = true
mu.Unlock()
return
}
if isIdle[workerId] {
idleWorkerCount -= 1
}
isIdle[workerId] = false
var job int
job, jobQueue = jobQueue[0], jobQueue[1:]
mu.Unlock()
for i := 0; i < job; i += 1 {
}
if rand.Intn(5) == 0 {
mu.Lock()
jobQueue = append(jobQueue, JOB_COUNTING_TO)
mu.Unlock()
}
}
func main() {
// Filling up the queue with initial jobs
for i := 0; i < NB_INITIAL_JOBS; i += 1 {
jobQueue = append(jobQueue, JOB_COUNTING_TO)
}
var wg sync.WaitGroup
for i := 0; i < NB_WORKERS; i += 1 {
wg.Add(1)
go func(workerId int) {
for idleWorkerCount != NB_WORKERS {
doJob(workerId)
}
wg.Done()
}(i)
}
wg.Wait()
}
Because my workers can add jobs to the queue
A re-entrant channel always deadlocks. This is easy to demonstrate using this code:
package main
import (
"fmt"
)
func main() {
out := make(chan string)
c := make(chan string)
go func() {
for v := range c {
c <- v + " 2"
out <- v
}
}()
go func() {
c <- "hello world!" // pass OK
c <- "hello world!" // no pass, the routine is blocking at pushing to itself
}()
for v := range out {
fmt.Println(v)
}
}
While the program is blocked trying to push at c <- v + " 2", it cannot read at for v := range c {, push at c <- "hello world!", or read at for v := range out {. Thus, it deadlocks.
To get past that situation, you must overflow somewhere: in goroutines, or somewhere else.
package main
import (
"fmt"
"time"
)
func main() {
out := make(chan string)
c := make(chan string)
go func() {
for v := range c {
go func(v string) { // use goroutines as a bank for the required overflow.
<-time.After(time.Second) // simulate slowness.
c <- v + " 2"
}(v) // pass v explicitly so each goroutine gets its own copy
out <- v
}
}()
go func() {
for {
c <- "hello world!"
}
}()
exit := time.After(time.Second * 60)
for v := range out {
fmt.Println(v)
select {
case <-exit:
return
default:
}
}
}
But now you have a new problem.
You created a memory bomb by overflowing into goroutines without limits. Technically, the growth depends on the time needed to finish a job, the memory available, the speed of your CPUs, and the shape of the data (a job might or might not generate a new job). So there is an upper limit, but it is so hard to reason about that in practice this ends up being a bomb.
Consider not overflowing without limits.
If you don't have a specific limit at hand, you can use a semaphore to cap the overflow.
https://play.golang.org/p/5JWPQiqOYKz
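As a rough sketch of what capping the overflow with a semaphore can look like (this is not the code behind the playground link; runLimited and its parameters are names invented here), a buffered channel can act as a counting semaphore:

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited (hypothetical name) runs every task in its own goroutine but
// lets at most limit of them run at once. It returns the highest number of
// simultaneously running tasks it observed.
func runLimited(limit int, tasks []func()) int {
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, task := range tasks {
		sem <- struct{}{} // blocks while limit goroutines already hold a slot
		wg.Add(1)
		go func(task func()) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot (runs before wg.Done)
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()
			task()
			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 20)
	for i := range tasks {
		tasks[i] = func() {}
	}
	fmt.Println("peak concurrency:", runLimited(3, tasks))
}
```

Because the slot is acquired before the goroutine is started, the number of live goroutines stays bounded by limit, at the cost of the spawning loop blocking when the cap is reached.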
My bombs did not explode with a work timeout of 1s and 2s, but they took a large chunk of memory.
In another round with modified code, it exploded.
Of course, because you use if rand.Intn(5) == 0 { in your code, the problem is largely mitigated. Still, when you meet such a pattern, think twice about the code.
Also, for some reason I don't understand the technique that I currently use (code below) becomes really inefficient when the number of Workers becomes quite big (such as 300000 workers): for the same number of jobs, the code will be > 10x slower for NB_WORKERS = 300000 vs NB_WORKERS = 3000.
In the big picture, you have a limited number of CPU cycles. All those allocations and instructions needed to spawn and synchronize goroutines have to be executed too. Concurrency is not free.
Now, I would like the program to halt when all the workers are idle.
I came up with the following, but I find it very difficult to reason about and to convince myself that it won't end up in a send-on-closed-channel panic.
The idea is to use a sync.WaitGroup to count in-flight items and rely on it to properly close the input channel and finish the job.
package main
import (
"log"
"math/rand"
"sync"
"time"
)
func main() {
rand.Seed(time.Now().UnixNano())
var wg sync.WaitGroup
var wgr sync.WaitGroup
out := make(chan string)
c := make(chan string)
go func() {
for v := range c {
if rand.Intn(5) == 0 {
wgr.Add(1)
go func(v string) {
<-time.After(time.Microsecond)
c <- v + " 2"
}(v)
}
wgr.Done()
out <- v
}
close(out)
}()
var sent int
wg.Add(1)
go func() {
for i := 0; i < 300; i++ {
wgr.Add(1)
c <- "hello world!"
sent++
}
wg.Done()
}()
go func() {
wg.Wait()
wgr.Wait()
close(c)
}()
var rcv int
for v := range out {
// fmt.Println(v)
_ = v
rcv++
}
log.Println("sent", sent)
log.Println("rcv", rcv)
}
I ran it with while go run -race .; do :; done and it worked fine for a reasonable number of iterations.
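A different way to express "halt when all the workers are idle" is to keep the mutex-protected queue from the question and add a counter of jobs that are queued or in progress, with a sync.Cond so idle workers sleep instead of spinning. This is only a sketch of that alternative, not the approach above; workQueue, run and countdown are names invented for the illustration, and countdown stands in for the random "add a new job" step so the result is reproducible.

```go
package main

import (
	"fmt"
	"sync"
)

// workQueue is an unbounded, mutex-protected job queue that also tracks
// how many jobs are still queued or being processed.
type workQueue struct {
	mu        sync.Mutex
	cond      *sync.Cond
	jobs      []int
	pending   int // queued + in-progress jobs
	completed int // jobs fully processed
}

func newWorkQueue(initial []int) *workQueue {
	q := &workQueue{jobs: append([]int(nil), initial...), pending: len(initial)}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// push adds a job. It never blocks on a full queue, so workers can call it.
func (q *workQueue) push(job int) {
	q.mu.Lock()
	q.jobs = append(q.jobs, job)
	q.pending++
	q.mu.Unlock()
	q.cond.Signal()
}

// pop sleeps until a job is available. ok is false once nothing is queued
// and nothing is in progress, i.e. all workers would otherwise be idle.
func (q *workQueue) pop() (job int, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.jobs) == 0 && q.pending > 0 {
		q.cond.Wait()
	}
	if len(q.jobs) == 0 {
		return 0, false
	}
	job, q.jobs = q.jobs[0], q.jobs[1:]
	return job, true
}

// finish marks one popped job as done and wakes all sleepers if it was the last.
func (q *workQueue) finish() {
	q.mu.Lock()
	q.pending--
	q.completed++
	last := q.pending == 0
	q.mu.Unlock()
	if last {
		q.cond.Broadcast()
	}
}

// run processes jobs until none remain and returns how many were processed.
// handle returns the follow-up jobs that a job spawns.
func run(workers int, initial []int, handle func(job int) []int) int {
	q := newWorkQueue(initial)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				job, ok := q.pop()
				if !ok {
					return
				}
				for _, next := range handle(job) {
					q.push(next) // enqueue before finish so pending never reaches 0 early
				}
				q.finish()
			}
		}()
	}
	wg.Wait()
	return q.completed
}

// countdown is a hypothetical job: job n spawns job n-1 until it reaches 0.
func countdown(job int) []int {
	if job > 0 {
		return []int{job - 1}
	}
	return nil
}

func main() {
	fmt.Println("jobs processed:", run(4, []int{3, 2}, countdown)) // prints "jobs processed: 7"
}
```

The workers need no per-worker idle flags: a worker exits as soon as pop reports that the queue is empty and no job is in progress.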

In Golang, how to handle many goroutines with channel

I'm thinking of starting 1000 goroutines at the same time using a for loop in Golang.
The problem is: I have to make sure that every goroutine has been executed.
Is it possible to use channels to help me make sure of that?
The structure is kind of like this:
func main {
for i ... {
go ...
ch?
ch?
}
As @Andy mentioned, you can use sync.WaitGroup to achieve this. Below is an example; hopefully the code is self-explanatory.
package main
import (
"fmt"
"sync"
"time"
)
func dosomething(millisecs int64, wg *sync.WaitGroup) {
defer wg.Done()
duration := time.Duration(millisecs) * time.Millisecond
time.Sleep(duration)
fmt.Println("Function in background, duration:", duration)
}
func main() {
arr := []int64{200, 400, 150, 600}
var wg sync.WaitGroup
for _, n := range arr {
wg.Add(1)
go dosomething(n, &wg)
}
wg.Wait()
fmt.Println("Done")
}
To make sure the goroutines are done and collect the results, try this example:
package main
import (
"fmt"
)
const max = 1000
func main() {
for i := 1; i <= max; i++ {
go f(i)
}
sum := 0
for i := 1; i <= max; i++ {
sum += <-ch
}
fmt.Println(sum) // 500500
}
func f(n int) {
// do some job here and return the result:
ch <- n
}
var ch = make(chan int, max)
In order to wait for 1000 goroutines to finish, try this example:
package main
import (
"fmt"
"sync"
)
func main() {
wg := &sync.WaitGroup{}
for i := 0; i < 1000; i++ {
wg.Add(1)
go f(wg, i)
}
wg.Wait()
fmt.Println("Done.")
}
func f(wg *sync.WaitGroup, n int) {
defer wg.Done()
fmt.Print(n, " ")
}
I would suggest that you follow a pattern. Concurrency and channels are good, but if you use them badly, your program might become even slower than expected. A simple way to handle multiple goroutines and channels is the worker pool pattern.
Take a close look at the code below.
// In this example we'll look at how to implement
// a _worker pool_ using goroutines and channels.
package main
import "fmt"
import "time"
// Here's the worker, of which we'll run several
// concurrent instances. These workers will receive
// work on the `jobs` channel and send the corresponding
// results on `results`. We'll sleep a second per job to
// simulate an expensive task.
func worker(id int, jobs <-chan int, results chan<- int) {
for j := range jobs {
fmt.Println("worker", id, "started job", j)
time.Sleep(time.Second)
fmt.Println("worker", id, "finished job", j)
results <- j * 2
}
}
func main() {
// In order to use our pool of workers we need to send
// them work and collect their results. We make 2
// channels for this.
jobs := make(chan int, 100)
results := make(chan int, 100)
// This starts up 3 workers, initially blocked
// because there are no jobs yet.
for w := 1; w <= 3; w++ {
go worker(w, jobs, results)
}
// Here we send 5 `jobs` and then `close` that
// channel to indicate that's all the work we have.
for j := 1; j <= 5; j++ {
jobs <- j
}
close(jobs)
// Finally we collect all the results of the work.
for a := 1; a <= 5; a++ {
<-results
}
}
This simple example is taken from here. Also, the results channel can help you keep track of all the goroutines executing the jobs, including failure notices.
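To illustrate the point about failure notices, here is a minimal sketch in which each result carries an error field. The result type, the runPool helper and the rule that even-numbered jobs fail are all assumptions made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// result (hypothetical type) carries either a value or an error for one job.
type result struct {
	job int
	val int
	err error
}

// worker reports every job on results, whether it succeeded or failed.
func worker(jobs <-chan int, results chan<- result) {
	for j := range jobs {
		if j%2 == 0 { // made-up failure condition, for illustration only
			results <- result{job: j, err: errors.New("even job rejected")}
			continue
		}
		results <- result{job: j, val: j * 2}
	}
}

// runPool sends jobs 1..n to the workers and counts successes and failures.
func runPool(workers, n int) (ok, failed int) {
	jobs := make(chan int, n)
	results := make(chan result, n)
	for w := 0; w < workers; w++ {
		go worker(jobs, results)
	}
	for j := 1; j <= n; j++ {
		jobs <- j
	}
	close(jobs)
	// Exactly one result arrives per job, so counting to n is enough.
	for i := 0; i < n; i++ {
		if r := <-results; r.err != nil {
			failed++
		} else {
			ok++
		}
	}
	return ok, failed
}

func main() {
	ok, failed := runPool(3, 5)
	fmt.Println("ok:", ok, "failed:", failed) // prints "ok: 3 failed: 2"
}
```

Because every job produces exactly one result, the collector knows that all goroutines have handled their jobs once it has received n results.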

Why does this golang program create a memory leak?

I am trying to understand concurrency and goroutines, and had a couple questions about the following experimental code:
Why does it create a memory leak? I thought that a return at the end of the goroutine would allow memory associated with it to get cleaned up.
Why do my loops almost never reach 999? In fact, when I output to a file and study the output, I notice that it rarely prints integers in double digits; the first time it prints "99" is line 2461, and for "999" line 6120. This behavior is unexpected to me, which clearly means I don't really understand what is going on with goroutine scheduling.
Disclaimer:
Be careful running the code below, it can crash your system if you don't stop it after a few seconds!
CODE
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for {
// spawn four worker goroutines
spawnWorkers(4, wg)
// wait for the workers to finish
wg.Wait()
}
}
func spawnWorkers(max int, wg sync.WaitGroup) {
for n := 0; n < max; n++ {
wg.Add(1)
go func() {
defer wg.Done()
f(n)
return
}()
}
}
func f(n int) {
for i := 0; i < 1000; i++ {
fmt.Println(n, ":", i)
}
}
Thanks to Tim Cooper, JimB, and Greg for their helpful comments. The corrected version of the code is posted below for reference.
The two fixes were to pass the WaitGroup by pointer (a copied WaitGroup has its own counter, so wg.Wait() in main never waited for the workers), which fixed the memory leak, and to pass n into the anonymous goroutine as an argument.
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for {
// spawn four worker goroutines
spawnWorkers(4,&wg)
// wait for the workers to finish
wg.Wait()
}
}
func spawnWorkers(max int, wg *sync.WaitGroup) {
for n := 0; n < max; n++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
f(n)
return
}(n)
}
}
func f(n int) {
for i := 0; i < 1000; i++ {
fmt.Println(n, ":", i)
}
}

GoLang - Sequential vs Concurrent

I have two versions of factorial: concurrent and sequential.
Both programs calculate the factorial of 10 1,000,000 times.
Factorial Concurrent Processing
package main
import (
"fmt"
//"math/rand"
"sync"
"time"
//"runtime"
)
func main() {
start := time.Now()
printFact(fact(gen(1000000)))
fmt.Println("Current Time:", time.Now(), "Start Time:", start, "Elapsed Time:", time.Since(start))
panic("Error Stack!")
}
func gen(n int) <-chan int {
c := make(chan int)
go func() {
for i := 0; i < n; i++ {
//c <- rand.Intn(10) + 1
c <- 10
}
close(c)
}()
return c
}
func fact(in <-chan int) <-chan int {
out := make(chan int)
var wg sync.WaitGroup
for n := range in {
wg.Add(1)
go func(n int) {
//temp := 1
//for i := n; i > 0; i-- {
// temp *= i
//}
temp := calcFact(n)
out <- temp
wg.Done()
}(n)
}
go func() {
wg.Wait()
close(out)
}()
return out
}
func printFact(in <-chan int) {
//for n := range in {
// fmt.Println("The random Factorial is:", n)
//}
var i int
for range in {
i ++
}
fmt.Println("Count:" , i)
}
func calcFact(c int) int {
if c == 0 {
return 1
} else {
return calcFact(c-1) * c
}
}
//###End of Factorial Concurrent
Factorial Sequential Processing
package main
import (
"fmt"
//"math/rand"
"time"
//"runtime"
)
func main() {
start := time.Now()
//for _, n := range factorial(gen(10000)...) {
// fmt.Println("The random Factorial is:", n)
//}
var i int
for range factorial(gen(1000000)...) {
i++
}
fmt.Println("Count:" , i)
fmt.Println("Current Time:", time.Now(), "Start Time:", start, "Elapsed Time:", time.Since(start))
}
func gen(n int) []int {
var out []int
for i := 0; i < n; i++ {
//out = append(out, rand.Intn(10)+1)
out = append(out, 10)
}
println(len(out))
return out
}
func factorial(val ...int) []int {
var out []int
for _, n := range val {
fa := calcFact(n)
out = append(out, fa)
}
return out
}
func calcFact(c int) int {
if c == 0 {
return 1
} else {
return calcFact(c-1) * c
}
}
//###End of Factorial sequential processing
My assumption was that concurrent processing would be faster than sequential, but the sequential version executes faster than the concurrent one on my Windows machine.
I am using an 8-core i7 with 32 GB RAM.
I am not sure whether there is something wrong in the programs or in my basic understanding.
P.S. I am new to GoLang.
The concurrent version of your program will always be slow compared to the sequential version. The reason, however, is related to the nature and behavior of the problem you are trying to solve.
Your program is concurrent but it is not parallel. Each calcFact is running in its own goroutine, but there is no division of the amount of work required to be done. Each goroutine must perform the same computation and output the same value.
It is like having a task that requires some text to be copied a hundred times. You have just one CPU (ignore the cores for now).
When you start a sequential process, you point the CPU to the original text once, and ask it to write it down 100 times. The CPU has to manage a single task.
With goroutines, the CPU is told that there are a hundred tasks that must be done concurrently. It just so happens that they are all the same task. But the CPU is not smart enough to know that.
So it does the same thing as above. Even though each task is now 100 times smaller, there is still just one CPU. So the amount of work the CPU has to do is still the same, except with all the added overhead of managing 100 different things at once. Hence, it loses part of its efficiency.
To see an improvement in performance you'll need proper parallelism. A simple example would be to split the factorial input number roughly in the middle and compute 2 smaller factorials. Then combine them together:
// not an ideal solution
package main
import (
"fmt"
"sync"
)
func main() {
ch := make(chan int)
r := 10
result := 1
go fact(r, ch)
for i := range ch {
result *= i
}
fmt.Println(result)
}
func fact(n int, ch chan int) {
p := n/2
q := p + 1
var wg sync.WaitGroup
wg.Add(2)
go func() {
ch <- factPQ(1, p)
wg.Done()
}()
go func() {
ch <- factPQ(q, n)
wg.Done()
}()
go func() {
wg.Wait()
close(ch)
}()
}
func factPQ(p, q int) int {
r := 1
for i := p; i <= q; i++ {
r *= i
}
return r
}
Working code: https://play.golang.org/p/xLHAaoly8H
Now you have two goroutines working towards the same goal and not just repeating the same calculations.
Note about CPU cores:
In your original code, the sequential version's operations are most definitely being distributed amongst various CPU cores by the runtime environment and the OS. So it still has parallelism to a degree; you just don't control it.
The same is happening in the concurrent version, but again, as mentioned above, the overhead of goroutine context switching brings the performance down.
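For the workload in the question (many small, independent computations), one common way to cut the per-goroutine overhead described above is to give each goroutine a contiguous chunk of the inputs rather than a single item. The sketch below assumes that approach; factorialAll and the chunking scheme are illustrative names and choices, not part of either original program.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// calcFact is the recursive factorial from the question.
func calcFact(c int) int {
	if c == 0 {
		return 1
	}
	return calcFact(c-1) * c
}

// factorialAll (hypothetical name) splits inputs into roughly equal chunks,
// one goroutine per chunk, and writes each result to its own slot in out.
func factorialAll(inputs []int, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	out := make([]int, len(inputs))
	chunk := (len(inputs) + workers - 1) / workers // ceiling division
	var wg sync.WaitGroup
	for start := 0; start < len(inputs); start += chunk {
		end := start + chunk
		if end > len(inputs) {
			end = len(inputs)
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				out[i] = calcFact(inputs[i]) // distinct indices, so no locking is needed
			}
		}(start, end)
	}
	wg.Wait()
	return out
}

func main() {
	inputs := make([]int, 1000000)
	for i := range inputs {
		inputs[i] = 10
	}
	results := factorialAll(inputs, runtime.NumCPU())
	fmt.Println("Count:", len(results), "first:", results[0])
}
```

Whether this beats the sequential loop still depends on how expensive each item is; for a computation as cheap as the factorial of 10, the gain may be small.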
abhink has given a good answer. I would also like to draw attention to Amdahl's Law, which should always be borne in mind when trying to use parallel processing to increase the overall speed of computation. That's not to say "don't make things parallel", but rather: be realistic about expectations and understand the parallel architecture fully.
Go allows us to write concurrent programs. This is related to trying to write faster parallel programs, but the two issues are separate. See Rob Pike's Concurrency is not Parallelism for more info.
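To make the reference to Amdahl's Law concrete: if a fraction p of the work can be parallelized across n processors, the theoretical speedup is 1 / ((1 - p) + p/n). The small sketch below just evaluates that formula; the function name amdahlSpeedup and the sample value p = 0.9 are chosen here for illustration.

```go
package main

import "fmt"

// amdahlSpeedup returns the theoretical speedup 1 / ((1-p) + p/n), where p is
// the parallelizable fraction of the work and n is the number of processors.
func amdahlSpeedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	// p = 0.9 is an arbitrary sample value, not a measurement.
	for _, n := range []int{2, 8, 64} {
		fmt.Printf("p=0.90 n=%d speedup=%.2f\n", n, amdahlSpeedup(0.9, n))
	}
}
```

With p = 0.9 the speedup can never exceed 1/(1 - p) = 10, no matter how many cores are added, which is why realistic expectations matter.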

How can we determine when the "last" worker process/thread is finished in Go?

I'll use a hacky inefficient prime number finder to make this question a little more concrete.
Let's say our main function fires off a bunch of "worker" goroutines. They will report their results to a single channel, which prints them. But not every worker will report something, so we can't use a counter to know when the last job is finished. Or is there a way?
For the concrete example, here, main fires off goroutines to check whether the values 2...1000 are prime (yeah I know it is inefficient).
package main
import (
"fmt"
"time"
)
func main() {
c := make(chan int)
go func () {
for {
fmt.Print(" ", <- c)
}
}()
for n := 2; n < 1000; n++ {
go printIfPrime(n, c)
}
time.Sleep(2 * time.Second) // <---- THIS FEELS WRONG
}
func printIfPrime(n int, channel chan int) {
for d := 2; d * d <= n; d++ {
if n % d == 0 {
return
}
}
channel <- n
}
My problem is that I don't know how to reliably stop it at the right time. I tried adding a sleep at the end of main and it works (but it might take too long, and this is no way to write concurrent code!). I would like to know if there was a way to send a stop signal through a channel or something so main can stop at the right time.
The trick here is that I don't know how many worker responses there will be.
Is this impossible or is there a cool trick?
(If there's an answer for this prime example, great. I can probably generalize. Or maybe not. Maybe this is app specific?)
Use a WaitGroup.
The following code uses two WaitGroups. The main function uses wgTest to wait for print_if_prime functions to complete. Once they are done, it closes the channel to break the for loop in the printing goroutine. The main function uses wgPrint to wait for printing to complete.
package main
import (
"fmt"
"sync"
)
func main() {
c := make(chan int)
var wgPrint, wgTest sync.WaitGroup
wgPrint.Add(1)
go func(wg *sync.WaitGroup) {
defer wg.Done()
for n := range c {
fmt.Print(" ", n)
}
}(&wgPrint)
for n := 2; n < 1000; n++ {
wgTest.Add(1)
go print_if_prime(&wgTest, n, c)
}
wgTest.Wait()
close(c)
wgPrint.Wait()
}
func print_if_prime(wg *sync.WaitGroup, n int, channel chan int) {
defer wg.Done()
for d := 2; d*d <= n; d++ {
if n%d == 0 {
return
}
}
channel <- n
}

Resources