Concurrent QuickSort only partially sorting - go

I'm trying to implement QuickSort concurrently. When I run it and inspect the result, most of the array is sorted, but a stretch of elements near the start is not.
Code below:
package main

import (
    "fmt"
    "math/rand"
    //"runtime"
    "sync"
    "time"
)

func main() {
    slice := generateSlice(1000000)
    var wg sync.WaitGroup
    start := time.Now()
    go Quicksort(slice, 0, len(slice)-1, &wg)
    wg.Wait()
    end := time.Since(start)
    fmt.Printf("Sort Time: %v, sorted: %v \n", end, slice)
}

func Quicksort(A []int, p int, r int, wg *sync.WaitGroup) {
    if p < r {
        q := Partition(A, p, r)
        wg.Add(2)
        go Quicksort(A, p, q-1, wg)
        go Quicksort(A, q+1, r, wg)
    }
}

func Partition(A []int, p int, r int) int {
    index := rand.Intn(r-p) + p
    pivot := A[index]
    A[index] = A[r]
    A[r] = pivot
    x := A[r]
    j := p - 1
    i := p
    for i < r {
        if A[i] <= x {
            j++
            tmp := A[j]
            A[j] = A[i]
            A[i] = tmp
        }
        i++
    }
    temp := A[j+1]
    A[j+1] = A[r]
    A[r] = temp
    return j + 1
}

func generateSlice(size int) []int {
    slice := make([]int, size)
    rand.Seed(time.Now().UnixNano())
    for i := 0; i < size; i++ {
        slice[i] = rand.Intn(999) - rand.Intn(999)
    }
    return slice
}
I can't seem to find the issue. Any ideas?

Your implementation has multiple problems. Hymns For Disco has already mentioned a couple of them in the comments. Another change I would suggest is not to share the same WaitGroup across all recursive calls: it becomes very difficult to keep the counter increments and decrements balanced, and you might reach a deadlock. (Note also that in your code nothing ever calls wg.Done(), and main calls wg.Wait() before the counter has been incremented, so wg.Wait() returns immediately and main prints while the sorting goroutines are still running; that is why only part of the array ends up sorted.)
I have made a few changes to your code, and I think it's working fine. Note that the Partition and generateSlice functions remain unchanged.
func main() {
    slice := generateSlice(1000)
    Quicksort(slice, 0, len(slice)-1)
    fmt.Printf("%v\n", slice)
}

func Quicksort(A []int, p int, r int) {
    if p < r {
        var wg sync.WaitGroup
        q := Partition(A, p, r)
        wg.Add(2)
        go func() {
            defer wg.Done()
            Quicksort(A, p, q-1)
        }()
        go func() {
            defer wg.Done()
            Quicksort(A, q+1, r)
        }()
        wg.Wait()
    }
}
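One practical note on this version: it still spawns two goroutines at every level of recursion, which for a large slice means a huge number of goroutines sorting tiny subslices. A common refinement, sketched below under my own names (QuicksortLimited and an arbitrary threshold constant, neither from the original answer), is to recurse sequentially once a subslice is small:
// Sketch: concurrent quicksort with a sequential cutoff. The
// threshold value is an assumption and worth benchmarking.
const threshold = 2048

func QuicksortLimited(A []int, p, r int) {
    if p >= r {
        return
    }
    q := Partition(A, p, r)
    if r-p < threshold {
        // Small subslice: recurse in the current goroutine.
        QuicksortLimited(A, p, q-1)
        QuicksortLimited(A, q+1, r)
        return
    }
    var wg sync.WaitGroup
    wg.Add(2)
    go func() {
        defer wg.Done()
        QuicksortLimited(A, p, q-1)
    }()
    go func() {
        defer wg.Done()
        QuicksortLimited(A, q+1, r)
    }()
    wg.Wait()
}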

Related

Concurrently Add Nodes to Linked List golang

I'm trying to add nodes to a linked list concurrently using channels and goroutines. I seem to be doing something wrong, however. Here's what I've written so far.
Currently, my print function just repeats the 8th node. This approach seems to work on other linked lists, so I don't totally understand the issue. Any help would be great. Here is the code that I wrote:
func makeNodes(ctx context.Context, wg *sync.WaitGroup, ch chan Node) {
    defer wg.Done()
    for i := 0; i < 9; i++ {
        tmp := Node{Data: i, Next: nil}
        ch <- tmp
    }
    <-ctx.Done()
    return
}

type Node struct {
    Data int
    Next *Node
}

type List struct {
    Head   *Node
    Length int
    Last   *Node
}

func (l *List) addToEnd(n *Node) {
    if l.Head == nil {
        l.Head = n
        l.Last = n
        l.Length++
        return
    }
    tmp := l.Last
    tmp.Next = n
    l.Last = n
    l.Length++
}

func (l List) print() {
    tmp := l.Head
    for tmp != nil {
        fmt.Println(tmp)
        tmp = tmp.Next
    }
    fmt.Println("\n")
}

func main() {
    cha := make(chan Node)
    defer close(cha)
    ctx := context.Background()
    ctx, cancel := context.WithCancel(ctx)
    var wg sync.WaitGroup
    wg.Add(1)
    list := List{nil, 0, nil}
    go makeNodes(ctx, &wg, cha)
    go func() {
        for j := range cha {
            list.addToEnd(&j)
        }
    }()
    cancel()
    wg.Wait()
    list.print()
}
This program allocates a single variable (j in the for j := range loop) and repeatedly overwrites it with the values read from the channel.
This results in the same variable (j, at a fixed memory location) being added to the list multiple times.
Consider modifying the channel to be a channel of pointers.
In main()
cha := make(chan *Node)
Then for makeNodes()
Each time a new node is created (via Node{}), a new Node pointer is placed into the channel.
func makeNodes(ctx context.Context, wg *sync.WaitGroup, ch chan *Node) {
    defer wg.Done()
    for i := 0; i < 9; i++ {
        tmp := Node{Data: i, Next: nil}
        ch <- &tmp
    }
    <-ctx.Done()
    return
}
The following will now correctly add each unique Node pointer to the list.
go func() {
    for j := range cha {
        list.addToEnd(j)
    }
}()
Also, you may find that not all nodes make it to the list or are read from the channel. Your method of synchronizing the producer (makeNodes()) and the consumer (the for j := range loop) needs work: main calls cancel() immediately and never waits for the consumer goroutine to finish before printing.
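To illustrate, here is a sketch of one possible fix (my own, not from the original answer): the producer closes the channel when it is done, and main waits for the consumer to finish draining before it prints. The context and the outer WaitGroup then become unnecessary:
func main() {
    cha := make(chan *Node)
    list := List{nil, 0, nil}

    // Producer: sends nine freshly allocated nodes, then closes
    // the channel so the consumer's range loop can terminate.
    go func() {
        for i := 0; i < 9; i++ {
            cha <- &Node{Data: i}
        }
        close(cha)
    }()

    // Consumer: drains the channel, then signals completion.
    done := make(chan struct{})
    go func() {
        for n := range cha {
            list.addToEnd(n)
        }
        close(done)
    }()

    <-done // wait until every node has been appended
    list.print()
}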

Passing a WaitGroup to a function changes behavior, why?

I have 3 merge sort implementations:
MergeSort: simple one without concurrency;
MergeSortSmart: with concurrency limited by a buffered channel acting as a semaphore; if the buffer is full, it falls back to the simple implementation;
MergeSortSmartBug: the same strategy as the previous one, but with a small "refactor": passing the wg pointer to a helper function to reduce code duplication.
The first two work as expected, but the third one returns an empty slice instead of the sorted input. I couldn't understand what happened and haven't found an answer either.
Here is the playground link for the code: https://play.golang.org/p/DU1ypbanpVi
package main

import (
    "fmt"
    "math/rand"
    "runtime"
    "sync"
)

type pass struct{}

var semaphore = make(chan pass, runtime.NumCPU())

func main() {
    rand.Seed(10)
    s := make([]int, 16)
    for i := 0; i < 16; i++ {
        s[i] = int(rand.Int31n(1000))
    }
    fmt.Println(s)
    fmt.Println(MergeSort(s))
    fmt.Println(MergeSortSmart(s))
    fmt.Println(MergeSortSmartBug(s))
}

func merge(l, r []int) []int {
    tmp := make([]int, 0, len(l)+len(r))
    for len(l) > 0 || len(r) > 0 {
        if len(l) == 0 {
            return append(tmp, r...)
        }
        if len(r) == 0 {
            return append(tmp, l...)
        }
        if l[0] <= r[0] {
            tmp = append(tmp, l[0])
            l = l[1:]
        } else {
            tmp = append(tmp, r[0])
            r = r[1:]
        }
    }
    return tmp
}

func MergeSort(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    l := MergeSort(s[:n])
    r := MergeSort(s[n:])
    return merge(l, r)
}

func MergeSortSmart(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    var wg sync.WaitGroup
    wg.Add(2)
    var l, r []int
    select {
    case semaphore <- pass{}:
        go func() {
            l = MergeSortSmart(s[:n])
            <-semaphore
            wg.Done()
        }()
    default:
        l = MergeSort(s[:n])
        wg.Done()
    }
    select {
    case semaphore <- pass{}:
        go func() {
            r = MergeSortSmart(s[n:])
            <-semaphore
            wg.Done()
        }()
    default:
        r = MergeSort(s[n:])
        wg.Done()
    }
    wg.Wait()
    return merge(l, r)
}

func MergeSortSmartBug(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    var wg sync.WaitGroup
    wg.Add(2)
    l := mergeSmart(s[:n], &wg)
    r := mergeSmart(s[n:], &wg)
    wg.Wait()
    return merge(l, r)
}

func mergeSmart(s []int, wg *sync.WaitGroup) []int {
    var tmp []int
    select {
    case semaphore <- pass{}:
        go func() {
            tmp = MergeSortSmartBug(s)
            <-semaphore
            wg.Done()
        }()
    default:
        tmp = MergeSort(s)
        wg.Done()
    }
    return tmp
}
Why does the Bug version return an empty slice? How can I refactor the Smart version without writing two selects one after the other?
Sorry, I couldn't reproduce this behavior in a smaller example.
The problem is not with the WaitGroup itself. It's with your concurrency handling. Your mergeSmart function launches a goroutine and returns the tmp variable without waiting for the goroutine to finish.
You might want to try a pattern more like this:
leftchan := make(chan []int)
rightchan := make(chan []int)
go mergeSmart(s[:n], leftchan)
go mergeSmart(s[n:], rightchan)
l := <-leftchan
r := <-rightchan
Or you can use a single channel if order doesn't matter.
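A sketch of that single-channel variant (assuming the channel-based mergeSmart signature used just above): merging two sorted halves gives the same result regardless of which half arrives first, so both goroutines can report into one channel:
// Sketch: both halves report into the same channel; merge the
// two results in whatever order they arrive.
c := make(chan []int)
go mergeSmart(s[:n], c)
go mergeSmart(s[n:], c)
return merge(<-c, <-c)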
mergeSmart doesn't wait on the wg, so it returns a tmp that hasn't received a value yet. You could probably repair it by passing a reference to the destination slice into the function, instead of returning a slice.
Look at the mergeSmart function. When the select enters the first case, the goroutine is launched and mergeSmart immediately returns tmp (which is still an empty slice). In that case there is no way to get the right value. (See advanced debugging prints here: https://play.golang.org/p/IedaY3muso2)
Maybe pass preallocated slices by reference?
I implemented both suggestions (passing slice by reference and using channels) and the (working!) result is here: https://play.golang.org/p/DcDC_-NjjAH
package main

import (
    "fmt"
    "math/rand"
    "runtime"
    "sync"
)

type pass struct{}

var semaphore = make(chan pass, runtime.NumCPU())

func main() {
    rand.Seed(10)
    s := make([]int, 16)
    for i := 0; i < 16; i++ {
        s[i] = int(rand.Int31n(1000))
    }
    fmt.Println(s)
    fmt.Println(MergeSort(s))
    fmt.Println(MergeSortSmart(s))
    fmt.Println(MergeSortSmartPointer(s))
    fmt.Println(MergeSortSmartChan(s))
}

func merge(l, r []int) []int {
    tmp := make([]int, 0, len(l)+len(r))
    for len(l) > 0 || len(r) > 0 {
        if len(l) == 0 {
            return append(tmp, r...)
        }
        if len(r) == 0 {
            return append(tmp, l...)
        }
        if l[0] <= r[0] {
            tmp = append(tmp, l[0])
            l = l[1:]
        } else {
            tmp = append(tmp, r[0])
            r = r[1:]
        }
    }
    return tmp
}

func MergeSort(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    l := MergeSort(s[:n])
    r := MergeSort(s[n:])
    return merge(l, r)
}

func MergeSortSmart(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    var wg sync.WaitGroup
    wg.Add(2)
    var l, r []int
    select {
    case semaphore <- pass{}:
        go func() {
            l = MergeSortSmart(s[:n])
            <-semaphore
            wg.Done()
        }()
    default:
        l = MergeSort(s[:n])
        wg.Done()
    }
    select {
    case semaphore <- pass{}:
        go func() {
            r = MergeSortSmart(s[n:])
            <-semaphore
            wg.Done()
        }()
    default:
        r = MergeSort(s[n:])
        wg.Done()
    }
    wg.Wait()
    return merge(l, r)
}

func MergeSortSmartPointer(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    var l, r []int
    var wg sync.WaitGroup
    wg.Add(2)
    mergeSmartPointer(&l, s[:n], &wg)
    mergeSmartPointer(&r, s[n:], &wg)
    wg.Wait()
    return merge(l, r)
}

func mergeSmartPointer(tmp *[]int, s []int, wg *sync.WaitGroup) {
    select {
    case semaphore <- pass{}:
        go func() {
            *tmp = MergeSortSmartPointer(s)
            <-semaphore
            wg.Done()
        }()
    default:
        *tmp = MergeSort(s)
        wg.Done()
    }
}

func MergeSortSmartChan(s []int) []int {
    if len(s) <= 1 {
        return s
    }
    n := len(s) / 2
    lchan := make(chan []int)
    rchan := make(chan []int)
    go mergeSmartChan(s[:n], lchan)
    go mergeSmartChan(s[n:], rchan)
    l := <-lchan
    r := <-rchan
    return merge(l, r)
}

func mergeSmartChan(s []int, c chan []int) {
    select {
    case semaphore <- pass{}:
        go func() {
            c <- MergeSortSmartChan(s)
            <-semaphore
        }()
    default:
        c <- MergeSort(s)
    }
}
I understood 100% what I was doing wrong, thanks!
And for future reference, here's the benchmark for sorting a slice of 100,000 elements:
$ go test -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
BenchmarkMergeSort-8                  97  12230309 ns/op
BenchmarkMergeSortSmart-8            181   7209844 ns/op
BenchmarkMergeSortSmartPointer-8     163   7483136 ns/op
BenchmarkMergeSortSmartChan-8        156   8149585 ns/op
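The benchmark functions themselves aren't shown; a minimal harness that could produce numbers of this shape would look something like the sketch below (the input size, seed, and the choice to share one input slice across iterations are my assumptions; the sorts above never mutate their input, since merge always appends into a fresh slice):
package main

import (
    "math/rand"
    "testing"
)

// One shared input slice, built once at package init.
var input = randomSlice(100000)

func randomSlice(n int) []int {
    rand.Seed(10)
    s := make([]int, n)
    for i := range s {
        s[i] = int(rand.Int31n(1000))
    }
    return s
}

func BenchmarkMergeSort(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MergeSort(input)
    }
}

func BenchmarkMergeSortSmart(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MergeSortSmart(input)
    }
}

func BenchmarkMergeSortSmartPointer(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MergeSortSmartPointer(input)
    }
}

func BenchmarkMergeSortSmartChan(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MergeSortSmartChan(input)
    }
}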

How to collect values from a channel into a slice in Go?

Suppose I have a helper function helper(n int) which returns a slice of integers of variable length. I would like to run helper(n) in parallel for various values of n and collect the output in one big slice. My first attempt at this is the following:
package main

import (
    "fmt"

    "golang.org/x/sync/errgroup"
)

func main() {
    out := make([]int, 0)
    ch := make(chan int)
    go func() {
        for i := range ch {
            out = append(out, i)
        }
    }()
    g := new(errgroup.Group)
    for n := 2; n <= 3; n++ {
        n := n
        g.Go(func() error {
            for _, i := range helper(n) {
                ch <- i
            }
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        panic(err)
    }
    close(ch)
    // time.Sleep(time.Second)
    fmt.Println(out) // should have the same elements as [0 1 0 1 2]
}

func helper(n int) []int {
    out := make([]int, 0)
    for i := 0; i < n; i++ {
        out = append(out, i)
    }
    return out
}
However, if I run this example I do not get all 5 expected values; instead I get
[0 1 0 1]
(If I uncomment the time.Sleep I do get all five values, [0 1 2 0 1], but this is not an acceptable solution).
It seems that the problem with this is that out is being updated in a goroutine, but the main function returns before it is done updating.
One thing that would work is using a buffered channel of size 5:
func main() {
    ch := make(chan int, 5)
    g := new(errgroup.Group)
    for n := 2; n <= 3; n++ {
        n := n
        g.Go(func() error {
            for _, i := range helper(n) {
                ch <- i
            }
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        panic(err)
    }
    close(ch)
    out := make([]int, 0)
    for i := range ch {
        out = append(out, i)
    }
    fmt.Println(out) // should have the same elements as [0 1 0 1 2]
}
However, although in this simplified example I know what the size of the output should be, in my actual application this is not known a priori. Essentially what I would like is an 'infinite' buffer such that sending to the channel never blocks, or a more idiomatic way to achieve the same thing; I've read https://blog.golang.org/pipelines but wasn't able to find a close match to my use case. Any ideas?
In this version of the code, main blocks until ch is closed.
ch is always closed at the end of the goroutine responsible for pushing into it. Because the program pushes to ch inside a goroutine, there is no need for a buffered channel.
package main

import (
    "fmt"

    "golang.org/x/sync/errgroup"
)

func main() {
    ch := make(chan int)
    go func() {
        g := new(errgroup.Group)
        for n := 2; n <= 3; n++ {
            n := n
            g.Go(func() error {
                for _, i := range helper(n) {
                    ch <- i
                }
                return nil
            })
        }
        if err := g.Wait(); err != nil {
            panic(err)
        }
        close(ch)
    }()
    out := make([]int, 0)
    for i := range ch {
        out = append(out, i)
    }
    fmt.Println(out) // should have the same elements as [0 1 0 1 2]
}
func helper(n int) []int {
out := make([]int, 0)
for i := 0; i < n; i++ {
out = append(out, i)
}
return out
}
Here is the fixed version of the first code; it is convoluted, but it demonstrates the use of sync.WaitGroup.
package main

import (
    "fmt"
    "sync"

    "golang.org/x/sync/errgroup"
)

func main() {
    out := make([]int, 0)
    ch := make(chan int)
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := range ch {
            out = append(out, i)
        }
    }()
    g := new(errgroup.Group)
    for n := 2; n <= 3; n++ {
        n := n
        g.Go(func() error {
            for _, i := range helper(n) {
                ch <- i
            }
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        panic(err)
    }
    close(ch)
    wg.Wait()
    fmt.Println(out) // should have the same elements as [0 1 0 1 2]
}

func helper(n int) []int {
    out := make([]int, 0)
    for i := 0; i < n; i++ {
        out = append(out, i)
    }
    return out
}

how to use channel inside paralleled for-loop

func parallelSum(c chan int) {
    sum := 0
    for i := 1; i < 100; i++ {
        go func(i int) {
            sum += i
        }(i)
    }
    time.Sleep(1 * time.Second)
    c <- sum
}
I'm trying to learn how to parallelize work for speed, the way OpenMP does. Here is an example of the intended parallel summing loop in Go; this function runs as a goroutine.
Note that the variable sum is not a channel here, so does this mean that access to sum inside the for loop is a blocking operation? Is it efficient enough? Is there a better solution?
I know channels were designed for this. My obviously wrong implementation below compiles, but fails at runtime with 100 errors like the following.
goroutine 4 [chan receive]:
main.parallelSumError(0xc0000180c0)
        /home/tom/src/goland_newproject/main.go:58 +0xb4 // line 58: temp := <-sum
created by main.main
        /home/tom/src/goland_newproject/main.go:128 +0x2ca // line 128: go parallelSumError(pcr), the calling function
So what's the problem here? It seems summing may not be a good example for a parallelized for-loop, but what I really want to know is how to use a channel inside one.
func parallelSum(c chan int) {
    sum := make(chan int)
    for i := 1; i < 100; i++ {
        go func(i int) {
            temp := <-sum // error here, why?
            temp += i
            sum <- temp
        }(i)
    }
    time.Sleep(1 * time.Second)
    temp := <-sum
    c <- temp
}
both with the same main function:
func main() {
    pc := make(chan int)
    go parallelSum(pc)
    result := <-pc
    fmt.Println("parallel result:", result)
}
I don't like the idea of summing numbers through channels; I'd rather use something classical like sync.Mutex or atomic.AddUint64. But at least I made your code work.
A value can't be passed from one channel to another directly (hence the temp variable). I also added a sync.WaitGroup and a few other things.
But I still don't like the overall shape of the code.
package main

import (
    "fmt"
    "sync"
)

func main() {
    pc := make(chan int)
    go parallelSum(pc)
    result := <-pc
    fmt.Println("parallel result:", result)
}

func parallelSum(c chan int) {
    sum := make(chan int)
    wg := sync.WaitGroup{}
    wg.Add(100)
    for i := 1; i <= 100; i++ {
        go func(i int) {
            temp := <-sum
            temp += i
            wg.Done()
            sum <- temp
        }(i)
    }
    sum <- 0
    wg.Wait()
    temp := <-sum
    c <- temp
}
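For comparison, here is a sketch of the "classical" approach mentioned at the top (the function name is mine): each goroutine adds its value to a shared counter with sync/atomic, and a WaitGroup handles completion:
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// parallelSumAtomic sums 1..100 across goroutines using an atomic
// counter instead of passing a running total through a channel.
func parallelSumAtomic() int64 {
    var sum int64
    var wg sync.WaitGroup
    for i := 1; i <= 100; i++ {
        wg.Add(1)
        go func(i int64) {
            defer wg.Done()
            atomic.AddInt64(&sum, i)
        }(int64(i))
    }
    wg.Wait()
    return sum
}

func main() {
    fmt.Println("atomic result:", parallelSumAtomic()) // 5050
}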
When using goroutines (i.e. go foo()), it is preferable to use communication over memory sharing. In this matter, as you mention, channels are the Go way to handle communication.
For your specific application, a parallel sum similar to OpenMP's, it is preferable to detect the number of CPUs and spawn one worker goroutine per CPU:
package main

import (
    "fmt"
    "runtime"
)

func main() {
    numCPU := runtime.NumCPU()
    sumc := make(chan int, numCPU)
    valuec := make(chan int)
    endc := make(chan interface{}, numCPU)

    // generate one worker goroutine per CPU
    for i := 0; i < numCPU; i++ {
        go sumf(sumc, valuec, endc)
    }
    // generate values and pass them through the channel
    for i := 0; i < 100; i++ {
        valuec <- i
    }
    // tell the goroutines to finish up when they are done
    for i := 0; i < numCPU; i++ {
        endc <- nil
    }
    // sum the partial results
    sum := 0
    for i := 0; i < numCPU; i++ {
        procSum := <-sumc
        sum += procSum
    }
    fmt.Println(sum)
}

func sumf(sumc, valuec chan int, endc chan interface{}) {
    sum := 0
    for {
        select {
        case i := <-valuec:
            sum += i
        case <-endc:
            sumc <- sum
            return
        }
    }
}
Hopefully, this helps.

Julia set image rendering ruined by concurrency

I have the following code that I am to change into a concurrent program.
// Stefan Nilsson 2013-02-27
// This program creates pictures of Julia sets (en.wikipedia.org/wiki/Julia_set).
package main

import (
    "image"
    "image/color"
    "image/png"
    "log"
    "os"
    "strconv"
)

type ComplexFunc func(complex128) complex128

var Funcs []ComplexFunc = []ComplexFunc{
    func(z complex128) complex128 { return z*z - 0.61803398875 },
    func(z complex128) complex128 { return z*z + complex(0, 1) },
}

func main() {
    for n, fn := range Funcs {
        err := CreatePng("picture-"+strconv.Itoa(n)+".png", fn, 1024)
        if err != nil {
            log.Fatal(err)
        }
    }
}

// CreatePng creates a PNG picture file with a Julia image of size n x n.
func CreatePng(filename string, f ComplexFunc, n int) (err error) {
    file, err := os.Create(filename)
    if err != nil {
        return
    }
    defer file.Close()
    err = png.Encode(file, Julia(f, n))
    return
}

// Julia returns an image of size n x n of the Julia set for f.
func Julia(f ComplexFunc, n int) image.Image {
    bounds := image.Rect(-n/2, -n/2, n/2, n/2)
    img := image.NewRGBA(bounds)
    s := float64(n / 4)
    for i := bounds.Min.X; i < bounds.Max.X; i++ {
        for j := bounds.Min.Y; j < bounds.Max.Y; j++ {
            n := Iterate(f, complex(float64(i)/s, float64(j)/s), 256)
            r := uint8(0)
            g := uint8(0)
            b := uint8(n % 32 * 8)
            img.Set(i, j, color.RGBA{r, g, b, 255})
        }
    }
    return img
}

// Iterate sets z_0 = z, and repeatedly computes z_n = f(z_{n-1}), n ≥ 1,
// until |z_n| > 2 or n = max and returns this n.
func Iterate(f ComplexFunc, z complex128, max int) (n int) {
    for ; n < max; n++ {
        if real(z)*real(z)+imag(z)*imag(z) > 4 {
            break
        }
        z = f(z)
    }
    return
}
I have decided to try and make the Julia() function concurrent. So I changed it to:
func Julia(f ComplexFunc, n int) image.Image {
    bounds := image.Rect(-n/2, -n/2, n/2, n/2)
    img := image.NewRGBA(bounds)
    s := float64(n / 4)
    for i := bounds.Min.X; i < bounds.Max.X; i++ {
        for j := bounds.Min.Y; j < bounds.Max.Y; j++ {
            go func() {
                n := Iterate(f, complex(float64(i)/s, float64(j)/s), 256)
                r := uint8(0)
                g := uint8(0)
                b := uint8(n % 32 * 8)
                img.Set(i, j, color.RGBA{r, g, b, 255})
            }()
        }
    }
    return img
}
This change causes the images to look very different. The patterns are essentially the same, but there are a lot of white pixels that were not there before.
What is happening here?
There are 2 problems:
You don't actually wait for your goroutines to finish.
You don't pass i and j to the goroutine, so they will almost always be the last i and j.
Your function should look something like:
func Julia(f ComplexFunc, n int) image.Image {
var wg sync.WaitGroup
bounds := image.Rect(-n/2, -n/2, n/2, n/2)
img := image.NewRGBA(bounds)
s := float64(n / 4)
for i := bounds.Min.X; i < bounds.Max.X; i++ {
for j := bounds.Min.Y; j < bounds.Max.Y; j++ {
wg.Add(1)
go func(i, j int) {
n := Iterate(f, complex(float64(i)/s, float64(j)/s), 256)
r := uint8(0)
g := uint8(0)
b := uint8(n % 32 * 8)
img.Set(i, j, color.RGBA{r, g, b, 255})
wg.Done()
}(i, j)
}
}
wg.Wait()
return img
}
A bonus tip: when diving into concurrency, it's usually a good idea to try your code with the race detector (go run -race / go test -race).
You might have to use a mutex around img.Set, but I'm not very sure and I can't test at the moment.
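If the race detector does complain about img.Set, a minimal sketch of the mutex variant (the mutex is my addition, not part of the original answer) changes only the inner goroutine:
var mu sync.Mutex // guards img.Set across the pixel goroutines

go func(i, j int) {
    defer wg.Done()
    n := Iterate(f, complex(float64(i)/s, float64(j)/s), 256)
    b := uint8(n % 32 * 8)
    mu.Lock()
    img.Set(i, j, color.RGBA{0, 0, b, 255})
    mu.Unlock()
}(i, j)
In practice, spawning one goroutine per row instead of per pixel also reduces both the locking pressure and the scheduling overhead considerably.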
