What am I missing on concurrency? - go

I have a very simple script that makes a get request and then does some thing with the response. I have 2 version one using a go routine and one without I bencharmaked both and there was no difference in speed. Here is a dumb down version of what I'm doing:
Regular Version:
func main() {
url := "http://finance.yahoo.com/q?s=aapl"
for i := 0; i < 250; i++ {
resp, err := http.Get(url)
if err != nil {
fmt.Println(err)
}
fmt.Println(resp.Status)
}
}
Go Routine:
func main() {
url := "http://finance.yahoo.com/q?s=aapl"
for i := 0; i < 250; i++ {
wg.Add(1)
go run(url, &wg)
wg.Wait()
}
}
func run(url string, wg *sync.WaitGroup) {
defer wg.Done()
resp, err := http.Get(url)
if err != nil {
fmt.Println(err)
}
fmt.Println(resp.Status)
}
In most cases when I used a go routine the program took longer to execute. What concept am I missing to understand using concurrency efficiently?

The main problem with your example is that you're calling wg.Wait() within the for loop. This causes execution to block until you the deferred wg.Done() call inside of run. As a result, the execution isn't concurrent, it happens in a goroutine but you block after starting goroutine i and before starting i+1. If you place that statement after the loop instead like below then your code won't block until after the loop (all goroutines have been started, some may have already completed).
func main() {
url := "http://finance.yahoo.com/q?s=aapl"
for i := 0; i < 250; i++ {
wg.Add(1)
go run(url, &wg)
// wg.Wait() don't wait here cause it serializes execution
}
wg.Wait() // wait here, now that all goroutines have been started
}

Related

Deadlock on All GoRoutines When Using Channels and WaitGroup

I am new to Go and am currently attempting to run a function that creates a file and returns it's filename and have this run concurrently.
I've decided to try and accomplish this with goroutines and a WaitGroup. When I use this approach, I end up with a list size that is a couple hundred files less than the input size. E.g. for 5,000 files I get around 4,700~ files created.
I believe this is due to some race conditions:
wg := sync.WaitGroup{}
filenames := make([]string, 0)
for i := 0; i < totalFiles; i++ {
wg.Add(1)
go func() {
defer wg.Done()
filenames = append(filenames, createFile())
}()
}
wg.Wait()
return filenames, nil
Don't communicate by sharing memory; share memory by communicating.
I tried using channels to "share memory by communicating". Whenever I do this, there appears to be a deadlock that I can't seem to wrap my head around why. Would anyone be able to point me in the right direction for using channels and waitgroups together properly in order to save all of the created files to a shared data structure?
This is the code that produces the deadlock for me (fatal error: all goroutines are asleep - deadlock!):
wg := sync.WaitGroup{}
filenames := make([]string, 0)
ch := make(chan string)
for i := 0; i < totalFiles; i++ {
wg.Add(1)
go func() {
defer wg.Done()
ch <- createFile()
}()
}
wg.Wait()
for i := range ch {
filenames = append(filenames, i)
}
return filenames, nil
Thanks!
The first one has a race. You have to protect access to filenames:
mu:=sync.Mutex{}
for i := 0; i < totalFiles; i++ {
wg.Add(1)
go func() {
defer wg.Done()
mu.Lock()
defer mu.Unlock()
filenames = append(filenames, createFile())
}()
}
For the second case, you are waiting for the goroutines to finish, but goroutines can only finish once you read from the channel, so deadlock. You can fix it by reading from the channel in a separate goroutine.
go func() {
for i := range ch {
filenames = append(filenames, i)
}
}()
wg.Wait()
close(ch) // Required, so the goroutine can terminate
return filenames, nil
There is a lock-free version, if the number of files is fixed:
filenames := make([]string, totalFiles)
for i := 0; i < totalFiles; i++ {
wg.Add(1)
go func(index int) {
defer wg.Done()
filenames[index]=createFile()
}(i)
}
wg.Wait()

Why does the goroutine only run once as part of a waitgroup

func check(name string) string {
resp, err := http.Get(endpoint + name)
if err != nil {
panic(err)
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
panic(err)
}
return string(body)
}
func worker(name string, wg *sync.WaitGroup, names chan string) {
defer wg.Done()
var a = check(name)
names <- a
}
func main() {
names := make(chan string)
var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
wg.Add(1)
go worker("www"+strconv.Itoa(i), &wg, names)
}
fmt.Println(<-names)
}
The expected result would be 5 results but, only one executes and the process ends.
Is there something I am missing? New to go.
The endpoint is a generic API that returns json
You started 5 goroutines, but read input for one only. Also, you are not waiting for your goroutines to end.
// If there are only 5 goroutines unconditionally, you don't need the wg
for i := 1; i <= 5; i++ {
go worker("www"+strconv.Itoa(i), names)
}
for i:=1;i<=5;i++ {
fmt.Println(<-names)
}
If, however, you don't know how many goroutines you're waiting, then the waitgroup is necessary.
for i := 1; i <= 5; i++ {
wg.Add(1)
go worker("www"+strconv.Itoa(i), &wg, names)
}
// Read from the channel until it is closed
done:=make(chan struct{})
go func() {
for x:=range names {
fmt.Println(x)
}
// Signal that println is completed
close(done)
}()
// Wait for goroutines to end
wg.Wait()
// Close the channel to terminate the reader goroutine
close(names)
// Wait until println completes
<-done
You are launching 5 goroutines, but are reading from the names channel only once.
fmt.Println(<-names)
As soon as that first channel read is done, main() exits.
That means everything stops before having the time to be executed.
To know more about channels, see "Concurrency made easy" from Dave Cheney:
If you have to wait for the result of an operation, it’s easier to do it yourself.
Release locks and semaphores in the reverse order you acquired them.
Channels aren’t resources like files or sockets, you don’t need to close them to free them.
Acquire semaphores when you’re ready to use them.
Avoid mixing anonymous functions and goroutines
Before you start a goroutine, always know when, and how, it will stop

How do I handle panic in goroutines?

I am quite new to golang.So, please spare me the sword ( if possible ).
I was trying to get data from the web by studying the tutorial here
Now, the tutorial goes all well, but I wanted to check for edge cases and error-handling ( just to be thorough with my new learning of the language, don't want to be the one with half-baked knowledge ).
Here's my go-playground code.
Before asking I looked at a lot of references like :
Go blog defer,panic and recover
handling panics in goroutines
how-should-i-write-goroutine
And a few more, however I couldn't figure it out much.
Here's the code in case you don't want to go to the playground ( for reasons yet unknown to man ) :
// MakeRequest : Makes requests concurrently
func MakeRequest(url string, ch chan<- string, wg *sync.WaitGroup) {
start := time.Now()
resp, err := http.Get(url)
defer func() {
resp.Body.Close()
wg.Done()
if r := recover(); r != nil {
fmt.Println("Recovered in f", r)
}
}()
if err != nil {
fmt.Println(err)
panic(err)
}
secs := time.Since(start).Seconds()
body, _ := ioutil.ReadAll(resp.Body)
ch <- fmt.Sprintf("%.2f elapsed with response length: %d %s", secs, len(body), url)
}
func main() {
var wg sync.WaitGroup
output := []string{
"https://www.facebook.com",
"",
}
start := time.Now()
ch := make(chan string)
for _, url := range output {
wg.Add(1)
go MakeRequest(url, ch, &wg)
}
for range output {
fmt.Println(<-ch)
}
fmt.Printf("%.2fs elapsed\n", time.Since(start).Seconds())
}
Update
I changed the code to ( let's say ) handle the error in goroutine like this ( go-playground here ):
func MakeRequest(url string, ch chan<- string, wg *sync.WaitGroup) {
start := time.Now()
resp, err := http.Get(url)
if err == nil {
secs := time.Since(start).Seconds()
body, _ := ioutil.ReadAll(resp.Body)
ch <- fmt.Sprintf("%.2f elapsed with response length: %d %s", secs, len(body), url)
// fmt.Println(err)
// panic(err)
}
defer wg.Done()
}
Update 2 :
After an answer I changed the code to this and it successfully removes the chan deadlock, however now I need to handle this in main :
func MakeRequest(url string, ch chan<- string, wg *sync.WaitGroup) {
defer wg.Done()
start := time.Now()
resp, err := http.Get(url)
if err == nil {
secs := time.Since(start).Seconds()
body, _ := ioutil.ReadAll(resp.Body)
ch <- fmt.Sprintf("%.2f elapsed with response length: %d %s", secs, len(body), url)
// fmt.Println(err)
// panic(err)
}
// defer resp.Body.Close()
ch <- fmt.Sprintf("")
}
Isn't there a more elegant way to handle this ?
But now I get locked up in deadlock.
Thanks and regards.
Temporarya
( a golang noobie )
You are using recover correctly. You have two problems:
You are using panic incorrectly. You should only panic when there was a programming error. Avoid using panics unless you believe taking down the program is a reasonable response to what happened. In this case, I would just return the error, not panic.
You are panicing during your panic. What is happening is that you are first panicing at panic(err). Then in your defer function, you are panicing at resp.Body.Close(). When http.Get returns an error, it returns a nil response. That means that resp.Body.Close() is acting on a nil value.
The idiomatic way to handle this would be something like the following:
func MakeRequest(url string, ch chan<- string, wg *sync.WaitGroup) {
defer wg.Done()
start := time.Now()
resp, err := http.Get(url)
if err != nil {
//handle error without panicing
}
// there was no error, so resp.Body is guaranteed to exist.
defer resp.Body.Close()
...
Response to update: Ifhttp.Get() returns an error, you never send on the channel. At some point all goroutines except the main goroutine stop running and the main goroutine is waiting on <-ch. Since that channel receive will never complete and there is nothing else for the Go runtime to schedule, it panics (unrecoverably).
Response to comment: To ensure the channel doesn't hang, you need some sort of coordination to know when messages will stop coming. How this is implemented depends on your real program, and an example cannot necessarily extrapolate to reality. For this example, I would simply close the channel when the WaitGroup is done.
Playground
func main() {
var wg sync.WaitGroup
output := []string{
"https://www.facebook.com",
"",
}
start := time.Now()
ch := make(chan string)
for _, url := range output {
wg.Add(1)
go MakeRequest(url, ch, &wg)
}
go func() {
wg.Wait()
close(ch)
}()
for val := range ch {
fmt.Println(val)
}
fmt.Printf("%.2fs elapsed\n", time.Since(start).Seconds())
}

Go routines not executing

The following is the code that is giving me problem. What i want to achieve is to create those many tables in parallel. After all the tables are created I want to exit the functions.
func someFunction(){
....
gos := 5
proc := make(chan bool, gos)
allDone := make(chan bool)
for i:=0; i<gos; i++ {
go func() {
for j:=i; j<len(tables); j+=gos {
r, err := db.Exec(tables[j])
fmt.Println(r)
if err != nil {
methods.CheckErr(err, err.Error())
}
}
proc <- true
}()
}
go func() {
for i:=0; i<gos; i++{
<-proc
}
allDone <- true
}()
for {
select {
case <-allDone:
return
}
}
}
I'm creating two channels 1 to keep track of number of tables created (proc) and other (allDone) to see if all are done.
When i run this code then the go routine to create table starts execution but before it completes someFunction gets terminated.
However there is no problem if run the code sequentially
What is the mistake in my design pattern and also how do i correct it.
The usual pattern for what you're trying to achieve uses WaitGroup.
I think the problem you're facing is that i is captured by each goroutine and it keeps getting incremented by the outer loop. Your inner loop starts at i and since the outer loop has continued, each goroutine starts at 5.
Try passing the iterator as parameter to the goroutine so that you get a new copy each time.
func someFunction(){
....
gos := 5
var wg sync.WaitGroup
wg.Add(gos)
for i:=0; i< gos; i++ {
go func(n int) {
defer wg.Done()
for j:=n; j<len(tables); j+=gos {
r, err := db.Exec(tables[j])
fmt.Println(r)
if err != nil {
methods.CheckErr(err, err.Error())
}
}
}(i)
}
wg.Wait();
}
I'm not sure what you're trying to achieve here, each goroutine does db.Exec on all the tables above the one it started with so the first one treats all the tables, the second one treats all but the first one and so on. Is this what you intended?

How does golang defer work in cycles? [duplicate]

This question already has answers here:
Proper way to release resources with defer in a loop?
(3 answers)
Closed 11 months ago.
I'm trying to handle reconnections to MongoDB. To do this I try to perform every operation three times (in case it fails with io.EOF)
type MongoDB struct {
session *mgo.Session
DB *mgo.Database
}
func (d MongoDB) performWithReconnect(collection string,
operation func(*mgo.Collection) error) error {
var err error
for i := 0; i < 3; i++ {
session := d.session.Copy()
defer session.Close()
err = operation(session.DB(Config.MongoDb).C(collection))
if err == io.EOF{
continue
}
if err == nil{
return err
}
}
return err
}
So the question is about defer. Will it close all sessions as I suppose or it is going to behave some other way?
If you know some good practices to handle this different way I will be happy to read them.
Consider the following program
package main
import (
"fmt"
)
func print(s string, i int) {
fmt.Println(s, i)
}
func main() {
for i := 0; i < 3; i++ {
defer print("loop", i)
}
fmt.Println("after loop 1")
for i := 0; i < 3; i++ {
func(i int) {
defer print("func", i)
}(i)
}
fmt.Println("after loop 2")
}
It will print
after loop 1
func 0
func 1
func 2
after loop 2
loop 2
loop 1
loop 0
The deferred function calls will be put on stack and then executed in a reverse order at the end of surrounding function. In your case it will be quite bad as you will have connections waiting to be closed.
I recommend wrapping the contents of loop into an inline function. It will call deferred function just as you want it.
From A Tour of Go:
A defer statement defers the execution of a function until the surrounding function returns.
So in your code, you're creating three (identical) defer functions, which will all run when the function exits.
If you need a defer to run inside of a loop, you have to put it inside of a function. This can be done in an anonymous function thusly:
for i := 0; i < 3; i++ {
err := func() error {
session := d.session.Copy()
defer session.Close()
return operation(session.DB(Config.MongoDb).C(collection))
}()
if err == io.EOF {
continue
}
if err != nil {
return err
}
}

Resources