How to iterate int range concurrently - go

For purely educational purposes I created a base58 package. It will encode/decode a uint64 using the bitcoin base58 symbol chart, for example:
b58 := Encode(100) // return 2j
num := Decode("2j") // return 100
While creating the first tests I came up with this:
func TestEncode(t *testing.T) {
	var i uint64
	for i = 0; i <= (1<<64 - 1); i++ {
		b58 := Encode(i)
		num := Decode(b58)
		if num != i {
			t.Fatalf("Expecting %d for %s", i, b58)
		}
	}
}
This "naive" implementation tries to convert the whole uint64 range (0 to 18,446,744,073,709,551,615) to base58 and back to uint64, but it takes far too long.
To better understand how Go handles concurrency, I would like to know how to use channels or goroutines to iterate across the full uint64 range as efficiently as possible.
Could the data be processed in chunks and in parallel? If so, how?
As mentioned in the answer by @Adrien, one way is to use t.Parallel(), but that applies only when testing the package. In any case, after implementing it I found it noticeably slower: it runs in parallel, but there is no speed gain.
I understand that covering the full uint64 range may take years, but what I want to find out is how a channel or goroutine could help speed up the process (testing with a small range such as 1<<16), probably by using something like this, just as an example.
The question is not about how to test the package; it is about what algorithm or technique could be used to iterate faster by using concurrency.

This functionality is built into the Go testing package, in the form of T.Parallel:
func TestEncode(t *testing.T) {
	var i uint64
	for i = 0; i <= (1<<64 - 1); i++ {
		t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
			j := i       // Copy to local var - important
			t.Parallel() // Mark test as parallelizable
			b58 := Encode(j)
			num := Decode(b58)
			if num != j {
				t.Fatalf("Expecting %d for %s", j, b58)
			}
		})
	}
}

I came up with this solution:
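package main

import (
	"fmt"
	"time"
	// plus the base58 package from the question; its import path is not shown
)

func encode(i uint64) {
	x := base58.Encode(i)
	fmt.Printf("%d = %s\n", i, x)
	time.Sleep(time.Second) // the sleep mentioned below; the duration is a guess
}

func main() {
	concurrency := 4
	sem := make(chan struct{}, concurrency)
	for i, val := uint64(0), uint64(1<<16); i <= val; i++ {
		sem <- struct{}{} // acquire a slot; blocks while 4 workers are busy
		go func(i uint64) {
			defer func() { <-sem }() // release the slot
			encode(i)
		}(i)
	}
	// wait for the remaining workers by filling every slot
	for i := 0; i < cap(sem); i++ {
		sem <- struct{}{}
	}
}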
Basically, this starts 4 workers that call the encode function. To make the behavior easier to observe, a sleep is added so that the data is printed in chunks of 4.
Also, these answers helped me to better understand concurrency:
If there is a better way please let me know.
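One alternative to starting a goroutine per value is to split the range into contiguous chunks and give each worker one chunk to loop over sequentially. The following is only a sketch under assumptions: the base58 package's import path is not shown here, so roundTrip uses strconv with base 36 as a stand-in for Decode(Encode(i)), and the names roundTrip and checkRange are made up for illustration. It also assumes workers is greater than zero.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"sync"
)

// roundTrip is a stand-in for Decode(Encode(i)). It uses base 36 from
// strconv because the base58 package's import path is not shown.
func roundTrip(i uint64) uint64 {
	n, _ := strconv.ParseUint(strconv.FormatUint(i, 36), 36, 64)
	return n
}

// checkRange verifies roundTrip for every value in [0, limit) using up to
// `workers` goroutines, each owning one contiguous chunk. It returns the
// number of mismatches found.
func checkRange(limit uint64, workers int) uint64 {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		fails uint64
	)
	chunk := (limit + uint64(workers) - 1) / uint64(workers) // ceiling division
	for w := 0; w < workers; w++ {
		start := uint64(w) * chunk
		end := start + chunk
		if end > limit {
			end = limit
		}
		if start >= end {
			continue // nothing left for this worker
		}
		wg.Add(1)
		go func(start, end uint64) {
			defer wg.Done()
			var local uint64 // count locally, lock only once at the end
			for i := start; i < end; i++ {
				if roundTrip(i) != i {
					local++
				}
			}
			mu.Lock()
			fails += local
			mu.Unlock()
		}(start, end)
	}
	wg.Wait()
	return fails
}

func main() {
	fmt.Println("mismatches:", checkRange(1<<16, runtime.NumCPU()))
}
```

Because each worker touches shared state only once, the synchronization cost stays small compared with sending every value through a channel.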


how to use channel inside paralleled for-loop

func parallelSum(c chan int) {
	sum := 0
	for i := 1; i < 100; i++ {
		go func(i int) {
			sum += i
		}(i)
	}
	c <- sum
}
I'm trying to learn how to use parallelism to speed things up, as with OpenMP. The function above is an example of the intended parallel summing loop in Go; it runs as a goroutine.
Note that the variable sum is not a channel here, so does this mean that access to sum inside the for loop is a blocking operation? Is it efficient enough? Is there a better solution?
I know channels were designed for this. My obviously wrong implementation below compiles, but produces 100 runtime errors like the following.
goroutine 4 [chan receive]:
/home/tom/src/goland_newproject/main.go:58 +0xb4 //line 58 : temp := <-sum
created by main.main
/home/tom/src/goland_newproject/main.go:128 +0x2ca //line 128: go parallelSumError(pcr), the calling function
So what's the problem here? It seems summing is not a good example for a parallel for loop, but what I actually want to know is how to use a channel inside a parallel for loop.
func parallelSum(c chan int) {
	sum := make(chan int)
	for i := 1; i < 100; i++ {
		go func(i int) {
			temp := <-sum // error here, why?
			temp += i
			sum <- temp
		}(i)
	}
	temp := <-sum
	c <- temp
}
Both use the same main function:
func main() {
	pc := make(chan int)
	go parallelSum(pc)
	result := <-pc
	fmt.Println("parallel result:", result)
}
I don't like the idea of summing numbers through channels. I'd rather use something classical like sync.Mutex or atomic.AddUint64. But at least I got your code working.
Your version deadlocks because every goroutine, and parallelSum itself, starts by receiving from sum while nothing has been sent to it yet. I seed the channel with an initial 0, read each value into a temp variable before sending it on, and use a sync.WaitGroup to wait for the goroutines.
But I still don't like the idea of the code.
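package main

import (
	"fmt"
	"sync"
)

func main() {
	pc := make(chan int)
	go parallelSum(pc)
	result := <-pc
	fmt.Println("parallel result:", result)
}

func parallelSum(c chan int) {
	sum := make(chan int)
	wg := sync.WaitGroup{}
	wg.Add(100)
	for i := 1; i <= 100; i++ {
		go func(i int) {
			temp := <-sum // take the running total
			temp += i
			wg.Done()   // this goroutine has taken its turn
			sum <- temp // hand the total to the next receiver
		}(i)
	}
	sum <- 0  // seed the running total
	wg.Wait() // every goroutine has added its value
	temp := <-sum
	c <- temp
}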
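As a rough sketch of the sync/atomic alternative mentioned above (the function name atomicSum is made up for illustration), each goroutine adds its value to a shared counter with atomic.AddUint64, and a sync.WaitGroup waits for all of them:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// atomicSum adds 1..n using one goroutine per value and an atomic counter.
func atomicSum(n int) uint64 {
	var sum uint64
	var wg sync.WaitGroup
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			atomic.AddUint64(&sum, uint64(i)) // no channel or mutex needed
		}(i)
	}
	wg.Wait()
	return atomic.LoadUint64(&sum)
}

func main() {
	fmt.Println("parallel result:", atomicSum(100)) // 1 + 2 + ... + 100 = 5050
}
```

One goroutine per addition is still far more overhead than the addition itself, so this illustrates the synchronization technique, not a way to make summing faster.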
When using goroutines (i.e. go foo()), it is preferable to share data by communicating rather than to communicate by sharing memory. As you mention, channels are the Go way to handle that communication.
For your specific application, a parallel sum similar to OpenMP, it would be preferable to detect the number of CPUs and start one goroutine per CPU:
package main

import (
	"fmt"
	"runtime"
)

func main() {
	numCPU := runtime.NumCPU()
	sumc := make(chan int, numCPU)
	valuec := make(chan int)
	endc := make(chan interface{}, numCPU)

	// start one goroutine per CPU
	for i := 0; i < numCPU; i++ {
		go sumf(sumc, valuec, endc)
	}

	// generate values and pass them through the channel
	for i := 0; i < 100; i++ {
		valuec <- i
	}

	// tell the goroutines to finish
	for i := 0; i < numCPU; i++ {
		endc <- nil
	}

	// sum the partial results
	sum := 0
	for i := 0; i < numCPU; i++ {
		procSum := <-sumc
		sum += procSum
	}
	fmt.Println(sum)
}

func sumf(sumc, valuec chan int, endc chan interface{}) {
	sum := 0
	for {
		select {
		case i := <-valuec:
			sum += i
		case <-endc:
			sumc <- sum
			return
		}
	}
}
Hopefully, this helps.

Saving results from a parallelized goroutine

I am trying to parallelize an operation in Go and save the results in a way that lets me iterate over them and sum them up afterwards.
I have managed to set up the parameters so that no deadlock occurs, and I have confirmed that the operations work and the results are saved correctly within the function. But when I iterate over the slice of my struct and try to sum up the results of the operation, they all remain 0. I have tried passing by reference, with pointers, and with channels (which causes deadlock).
I have only found this example for help: But this seems outdated now, as Vector has been deprecated? I also have not found any references to the way this function (in the example) was constructed (with the func (u Vector) before the name). I tried replacing this with a slice but got compile-time errors.
Any help would be much appreciated. Here are the key parts of my code:
type Job struct {
	a      int
	b      int
	result *big.Int
}

func choose(jobs []Job, c chan int) {
	temp := new(big.Int)
	for _, job := range jobs {
		job.result = // perform operation on job.a and job.b
	}
	c <- 1
}

func main() {
	num := 100 // can be very large (why we need big.Int)
	n := num
	k := 0
	const numCPU = 6 // runtime.NumCPU
	count := new(big.Int)

	// create a 2d slice of jobs, one for each core
	jobs := make([][]Job, numCPU)

	for float64(k) <= math.Ceil(float64(num/2)) {
		// add one job to each core, alternating so that
		// job set is similar in difficulty
		for i := 0; i < numCPU; i++ {
			if !(float64(k) <= math.Ceil(float64(num/2))) {
				break
			}
			jobs[i] = append(jobs[i], Job{n, k, new(big.Int)})
			n -= 1
			k += 1
		}
	}

	c := make(chan int, numCPU)
	for i := 0; i < numCPU; i++ {
		go choose(jobs[i], c)
	}

	// drain the channel
	for i := 0; i < numCPU; i++ {
		<-c
	}
	// computations are done

	for i := range jobs {
		for _, job := range jobs[i] {
			count.Add(count, job.result)
		}
	}
}
Here is the code running on the go playground
As long as the []Job slice is only modified by one goroutine at a time, there's no reason you can't modify the job in place.
for i, job := range jobs {
	jobs[i].result = temp.Binomial(int64(job.a), int64(job.b))
}
You should also use a WaitGroup, rather than rely on counting tokens in a channel yourself.
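A minimal sketch of both suggestions together, under assumptions: the job set is simplified to C(n, k) for k = 0..n instead of the question's alternating scheme, sumBinomials is a hypothetical helper, and each result gets its own big.Int so the results do not share one value.

```go
package main

import (
	"fmt"
	"math/big"
	"sync"
)

type Job struct {
	a, b   int
	result *big.Int
}

// choose fills in jobs[i].result in place. Indexing the slice matters:
// the range loop variable would only be a copy of each element.
func choose(jobs []Job, wg *sync.WaitGroup) {
	defer wg.Done()
	for i := range jobs {
		jobs[i].result = new(big.Int).Binomial(int64(jobs[i].a), int64(jobs[i].b))
	}
}

// sumBinomials spreads the jobs C(n, k), k = 0..n, over `workers`
// goroutines and adds up the results once all of them are done.
func sumBinomials(n, workers int) *big.Int {
	jobs := make([][]Job, workers)
	for k := 0; k <= n; k++ {
		jobs[k%workers] = append(jobs[k%workers], Job{a: n, b: k})
	}

	var wg sync.WaitGroup
	for i := range jobs {
		wg.Add(1)
		go choose(jobs[i], &wg) // each goroutine owns exactly one slice
	}
	wg.Wait()

	total := new(big.Int)
	for i := range jobs {
		for _, job := range jobs[i] {
			total.Add(total, job.result)
		}
	}
	return total
}

func main() {
	fmt.Println(sumBinomials(10, 4)) // the C(10, k) sum to 2^10 = 1024
}
```

The WaitGroup replaces the hand-counted tokens on the channel: wg.Wait returns only after every choose call has finished writing its results.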

GoLang - Sequential vs Concurrent

I have two versions of a factorial program: concurrent and sequential.
Both programs calculate the factorial of 10, repeated 1,000,000 times.
Factorial Concurrent Processing
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	start := time.Now()
	printFact(fact(gen(1000000)))
	fmt.Println("Current Time:", time.Now(), "Start Time:", start, "Elapsed Time:", time.Since(start))
	panic("Error Stack!")
}

func gen(n int) <-chan int {
	c := make(chan int)
	go func() {
		for i := 0; i < n; i++ {
			//c <- rand.Intn(10) + 1
			c <- 10
		}
		close(c)
	}()
	return c
}

func fact(in <-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for n := range in {
		wg.Add(1)
		go func(n int) {
			//temp := 1
			//for i := n; i > 0; i-- {
			//	temp *= i
			//}
			temp := calcFact(n)
			out <- temp
			wg.Done()
		}(n)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func printFact(in <-chan int) {
	//for n := range in {
	//	fmt.Println("The random Factorial is:", n)
	//}
	var i int
	for range in {
		i++
	}
	fmt.Println("Count:", i)
}

func calcFact(c int) int {
	if c == 0 {
		return 1
	} else {
		return calcFact(c-1) * c
	}
}
//###End of Factorial Concurrent
Factorial Sequential Processing
package main

import (
	"fmt"
	"time"
)

func main() {
	start := time.Now()
	//for _, n := range factorial(gen(10000)...) {
	//	fmt.Println("The random Factorial is:", n)
	//}
	var i int
	for range factorial(gen(1000000)...) {
		i++
	}
	fmt.Println("Count:", i)
	fmt.Println("Current Time:", time.Now(), "Start Time:", start, "Elapsed Time:", time.Since(start))
}

func gen(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		//out = append(out, rand.Intn(10)+1)
		out = append(out, 10)
	}
	return out
}

func factorial(val ...int) []int {
	var out []int
	for _, n := range val {
		fa := calcFact(n)
		out = append(out, fa)
	}
	return out
}

func calcFact(c int) int {
	if c == 0 {
		return 1
	} else {
		return calcFact(c-1) * c
	}
}
//###End of Factorial sequential processing
My assumption was that concurrent processing would be faster than sequential, but the sequential version executes faster than the concurrent one on my Windows machine.
I am using an 8-core i7 with 32 GB RAM.
I am not sure whether there is something wrong in the programs or in my basic understanding.
P.S. I am new to Go.
The concurrent version of your program will always be slower than the sequential version. The reason, however, is related to the nature and behavior of the problem you are trying to solve.
Your program is concurrent but it is not parallel. Each calcFact runs in its own goroutine, but there is no division of the amount of work required to be done. Each goroutine must perform the same computation and output the same value.
It is like having a task that requires some text to be copied a hundred times. You have just one CPU (ignore the cores for now).
When you start a sequential process, you point the CPU to the original text once and ask it to write it down 100 times. The CPU has to manage a single task.
With goroutines, the CPU is told that there are a hundred tasks that must be done concurrently. It just so happens that they are all the same task. But the CPU is not smart enough to know that.
So it does the same thing as above. Even though each task is now 100 times smaller, there is still just one CPU. So the amount of work the CPU has to do is still the same, except with all the added overhead of managing 100 different things at once. Hence, it loses part of its efficiency.
To see an improvement in performance you'll need proper parallelism. A simple example would be to split the factorial input number roughly in the middle and compute 2 smaller factorials. Then combine them together:
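// not an ideal solution
package main

import (
	"fmt"
	"sync"
)

func main() {
	ch := make(chan int)
	r := 10
	result := 1
	go fact(r, ch)
	for i := range ch {
		result *= i
	}
	fmt.Println(result)
}

func fact(n int, ch chan int) {
	p := n / 2
	q := p + 1
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		ch <- factPQ(1, p) // lower half: 1 * 2 * ... * p
		wg.Done()
	}()
	go func() {
		ch <- factPQ(q, n) // upper half: q * ... * n
		wg.Done()
	}()
	go func() {
		wg.Wait() // close only after both halves were sent
		close(ch)
	}()
}

func factPQ(p, q int) int {
	r := 1
	for i := p; i <= q; i++ {
		r *= i
	}
	return r
}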
Working code:
Now you have two goroutines working towards the same goal and not just repeating the same calculations.
Note about CPU cores:
In your original code, the sequential version's work may still be moved between CPU cores by the runtime environment and the OS, but you don't control that.
The concurrent version can be spread across cores as well but, as mentioned above, the overhead of scheduling and switching between so many goroutines brings the performance down.
abhink has given a good answer. I would also like to draw attention to Amdahl's Law, which should always be borne in mind when trying to use parallel processing to increase the overall speed of computation. That's not to say "don't make things parallel", but rather: be realistic about expectations and understand the parallel architecture fully.
Go allows us to write concurrent programs. This is related to trying to write faster parallel programs, but the two issues are separate. See Rob Pike's Concurrency is not Parallelism for more info.
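To make Amdahl's Law concrete: if a fraction p of a program's running time can be parallelized across n workers, the best possible speedup is 1 / ((1 - p) + p/n). A small sketch follows; amdahlSpeedup is an illustrative name, not a library function, and the value p = 0.90 is just an example.

```go
package main

import "fmt"

// amdahlSpeedup returns the theoretical speedup 1 / ((1-p) + p/n) for a
// program whose fraction p (between 0 and 1) can be spread over n workers.
func amdahlSpeedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	for _, n := range []int{2, 8, 64} {
		fmt.Printf("p=0.90 n=%-2d speedup=%.2f\n", n, amdahlSpeedup(0.90, n))
	}
}
```

With p = 0.90 the speedup can never exceed 10, however many workers are added, because the serial 10% always remains. The formula also ignores scheduling and communication overhead, which is exactly what hurts the goroutine-per-value version above.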

Parallel processing in golang

Given the following code:
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	for i := 0; i < 3; i++ {
		go f(i)
	}

	// prevent main from exiting immediately
	var input string
	fmt.Scanln(&input)
}

func f(n int) {
	for i := 0; i < 10; i++ {
		dowork(n, i)
		amt := time.Duration(rand.Intn(250))
		time.Sleep(time.Millisecond * amt)
	}
}

func dowork(goroutine, loopindex int) {
	// simulate work
	time.Sleep(time.Second * time.Duration(5))
	fmt.Printf("gr[%d]: i=%d\n", goroutine, loopindex)
}
Can I assume that the 'dowork' function will be executed in parallel?
Is this a correct way of achieving parallelism or is it better to use channels and separate 'dowork' workers for each goroutine?
Regarding GOMAXPROCS, you can find this in Go 1.5's release docs:
By default, Go programs run with GOMAXPROCS set to the number of cores available; in prior releases it defaulted to 1.
Regarding preventing the main function from exiting immediately, you could leverage WaitGroup's Wait function.
I wrote this utility function to help parallelize a group of functions:
import "sync"

// Parallelize parallelizes the function calls
func Parallelize(functions ...func()) {
	var waitGroup sync.WaitGroup
	waitGroup.Add(len(functions))

	defer waitGroup.Wait()

	for _, function := range functions {
		go func(copy func()) {
			defer waitGroup.Done()
			copy()
		}(function)
	}
}
So in your case, we could do this:
func1 := func() {
	f(0)
}
func2 := func() {
	f(1)
}
func3 := func() {
	f(2)
}
Parallelize(func1, func2, func3)
If you wanted to use the Parallelize function, you can find it here
This answer is outdated. Please see this answer instead.
Your code will run concurrently, but not in parallel. You can make it run in parallel by setting GOMAXPROCS.
It's not clear exactly what you're trying to accomplish here, but it looks like a perfectly valid way of achieving concurrency to me.
f() will be executed concurrently but many dowork() will be executed sequentially within each f(). Waiting on stdin is also not the right way to ensure that your routines finished execution. You must spin up a channel that each f() pushes a true on when the f() finishes.
At the end of the main() you must wait for n number of true's on the channel. n being the number of f() that you have spun up.
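The done-channel approach described above can be sketched as follows. This is an illustration only: f is a shortened stand-in for the question's worker, and runAll is a made-up helper name.

```go
package main

import "fmt"

// f stands in for the question's worker. It pushes true on done when
// it has finished all of its iterations.
func f(n int, done chan<- bool) {
	for i := 0; i < 3; i++ {
		fmt.Printf("gr[%d]: i=%d\n", n, i)
	}
	done <- true
}

// runAll starts n workers and blocks until each one has signalled
// completion. It returns how many completion signals it received.
func runAll(n int) int {
	done := make(chan bool)
	for i := 0; i < n; i++ {
		go f(i, done)
	}
	finished := 0
	for finished < n {
		<-done // one receive per worker
		finished++
	}
	return finished
}

func main() {
	fmt.Println("finished:", runAll(3))
}
```

Counting receives works because the number of workers is known up front; when it is not, a sync.WaitGroup is usually the simpler tool.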
This helped me when I was starting out.
package main

import "fmt"

func put(number chan<- int, count int) {
	i := 0
	for ; i <= (5 * count); i++ {
		number <- i
	}
	number <- -1 // sentinel: this producer is done
}

func subs(number chan<- int) {
	i := 10
	for ; i <= 19; i++ {
		number <- i
	}
}

func main() {
	channel1 := make(chan int)
	channel2 := make(chan int)
	done := 0
	sum := 0

	go subs(channel2)
	go put(channel1, 1)
	go put(channel1, 2)
	go put(channel1, 3)
	go put(channel1, 4)
	go put(channel1, 5)

	// loop until all five put goroutines have sent their sentinel
	for done != 5 {
		select {
		case elem := <-channel1:
			if elem < 0 {
				done++
			} else {
				sum += elem
			}
		case sub := <-channel2:
			sum -= sub
			fmt.Printf("atimta : %d\n", sub)
		}
	}
}
"Conventional cluster-based systems (such as supercomputers) employ parallel execution between processors using MPI. MPI is a communication interface between processes that execute in operating system instances on different processors; it doesn't support other process operations such as scheduling. (At the risk of complicating things further, because MPI processes are executed by operating systems, a single processor can run multiple MPI processes and/or a single MPI process can also execute multiple threads!)"
You can add a loop at the end, to block until the jobs are done:
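package main

import "time"

func f(n int, b chan bool) {
	time.Sleep(time.Second) // stand-in for real work; the original body is not shown
	b <- true
}

func main() {
	b := make(chan bool, 9)
	for n := cap(b); n > 0; n-- {
		go f(n, b)
	}
	// block until every goroutine has reported in; checking len(b) == 0
	// instead could exit before slower goroutines have sent anything
	for n := cap(b); n > 0; n-- {
		<-b
	}
}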

How to break out of select gracefully in golang

I have a program in Go that computes SHA1 hashes and prints the ones that start with two zero bytes. I want to use goroutines and channels. My problem is that I don't know how to gracefully exit the select statement if I don't know how many results it will produce.
Many tutorials know the count in advance and exit when a counter hits it. Others suggest using WaitGroups, but I don't want to do that: I want to print each result in the main goroutine as soon as it appears in the channel. Some suggest closing the channel when the goroutines are finished, but I want to close it after the asynchronous for loop finishes, and I don't know how.
Please help me to achieve my requirements:
package main

import (
	"crypto/sha1"
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

type Hash struct {
	message string
	hash    [sha1.Size]byte
}

var counter int = 0
var max int = 100000
var channel = make(chan Hash)
var source = rand.NewSource(time.Now().UnixNano())
var generator = rand.New(source)

func main() {
	nCPU := runtime.NumCPU()
	fmt.Println("Number of CPUs: ", nCPU)
	start := time.Now()

	for i := 0; i < max; i++ {
		go func(j int) {
			count(j)
		}(i)
	}
	// close channel here? I can't because asynchronous producers work now

	for {
		select {
		// how to stop receiving if there are no producers left?
		case hash := <-channel:
			fmt.Printf("Hash is %v\n ", hash)
		}
	}
	fmt.Printf("Count of %v sha1 took %v\n", max, time.Since(start))
}

func count(i int) {
	random := fmt.Sprintf("This is a test %v", generator.Int())
	hash := sha1.Sum([]byte(random))

	if hash[0] == 0 && hash[1] == 0 {
		channel <- Hash{random, hash}
	}
}
Firstly: if you don't know when your computation ends, how could you even model it? Make sure you know exactly when and under what circumstances your program terminates. Once you have done that, you know how to write it in code.
You're basically dealing with a producer-consumer problem, a standard case. I would model it this way (on play):
func producer(max int, out chan<- Hash, wg *sync.WaitGroup) {
	defer wg.Done()

	for i := 0; i < max; i++ {
		random := fmt.Sprintf("This is a test %v", rand.Int())
		hash := sha1.Sum([]byte(random))

		if hash[0] == 0 && hash[1] == 0 {
			out <- Hash{random, hash}
		}
	}

	close(out)
}
Obviously you're brute-forcing hashes, so the end is reached when the loop is finished.
We can close the channel here and signal the other goroutines that there is nothing more to listen for.
func consumer(max int, in <-chan Hash, wg *sync.WaitGroup) {
	defer wg.Done()

	for {
		hash, ok := <-in
		if !ok {
			break
		}
		fmt.Printf("Hash is %v\n ", hash)
	}
}
The consumer takes all the incoming messages from the in channel and checks if it was closed (ok).
If it is closed, we're done. Otherwise print the received hashes.
To start this all up we can write:
wg := &sync.WaitGroup{}
c := make(chan Hash)

wg.Add(1)
go producer(max, c, wg)

wg.Add(1)
go consumer(max, c, wg)

wg.Wait()
The WaitGroup's purpose is to wait until the spawned goroutines have finished, which each one signals by calling wg.Done.
Also note that the Rand you're using is not safe for concurrent access. Use the one initialized globally in math/rand instead.
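As a hedged illustration of that point: the package-level functions in math/rand (such as rand.Int) are documented as safe for concurrent use, whereas a *rand.Rand created with rand.New is not. The helper name randomMessages below is made up for this sketch.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// randomMessages builds n messages concurrently. It calls the
// package-level rand.Int, which may be used from several goroutines at
// once, instead of sharing a *rand.Rand from rand.New.
func randomMessages(n int) []string {
	msgs := make([]string, n)
	var wg sync.WaitGroup
	for i := range msgs {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// each goroutine writes only its own slice element
			msgs[i] = fmt.Sprintf("This is a test %v", rand.Int())
		}(i)
	}
	wg.Wait()
	return msgs
}

func main() {
	for _, m := range randomMessages(3) {
		fmt.Println(m)
	}
}
```

If a private generator is needed, for example for reproducible seeding, giving each goroutine its own rand.New instance avoids the sharing problem as well.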
The structure of your program should probably be re-examined.
Here is a working example of what I presume you are looking for.
It can be run on the Go playground
package main

import (
	"crypto/sha1"
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

type Hash struct {
	message string
	hash    [sha1.Size]byte
}

const Max int = 100000

func main() {
	nCPU := runtime.NumCPU()
	fmt.Println("Number of CPUs: ", nCPU)

	hashes := Generate()
	start := time.Now()

	for hash := range hashes {
		fmt.Printf("Hash is %v\n ", hash)
	}

	fmt.Printf("Count of %v sha1 took %v\n", Max, time.Since(start))
}

func Generate() <-chan Hash {
	c := make(chan Hash, 1)

	go func() {
		defer close(c) // ends the range loop in main

		source := rand.NewSource(time.Now().UnixNano())
		generator := rand.New(source)

		for i := 0; i < Max; i++ {
			random := fmt.Sprintf("This is a test %v", generator.Int())
			hash := sha1.Sum([]byte(random))

			if hash[0] == 0 && hash[1] == 0 {
				c <- Hash{random, hash}
			}
		}
	}()

	return c
}
Edit: This does not fire up a separate goroutine for each hash computation, but to be honest, I fail to see the value in doing so. The scheduling of all those goroutines will likely cost you far more than running the code in a single goroutine.
If need be, you can split the work into chunks across N goroutines, but a 1:1 mapping is not the way to go with this.
