Using Channels in Go to receive Responses and Write to SQL Concurrently - go

I am working with Go to implement a pipeline of JSON data from an external API, process the message and then send to a SQL database.
I am trying to concurrently run API requests, then after I return a response, I'd like to send it to be inserted into the DB via another goroutine via load().
In the code below, sometimes I see the log.Printf() output from the load() func and other times I don't, which suggests that I'm closing a channel too early or not setting up the communication properly.
The pattern I am attempting is something like this:
package main
import (
"encoding/json"
"io/ioutil"
"log"
"net/http"
"time"
)
type Request struct {
url string
}
type Response struct {
status int
args Args `json:"args"`
headers Headers `json:"headers"`
origin string `json:"origin"`
url string `json:"url"`
}
type Args struct {
}
type Headers struct {
accept string `json:"Accept"`
}
func main() {
start := time.Now()
numRequests := 5
responses := make(chan Response, 5)
defer close(responses)
for i := 0; i < numRequests; i++ {
req := Request{url: "https://httpbin.org/get"}
go func(req *Request) {
resp, err := extract(req)
if err != nil {
log.Fatal("Error extracting data from API")
return
}
// Send response to channel
responses <- resp
}(&req)
// Perform go routine to load data
go load(responses)
}
log.Println("Execution time: ", time.Since(start))
}
func extract(req *Request) (r Response, err error) {
var resp Response
request, err := http.NewRequest("GET", req.url, nil)
if err != nil {
return resp, err
}
request.Header = http.Header{
"accept": {"application/json"},
}
response, err := http.DefaultClient.Do(request)
if err != nil {
return resp, err
}
// Defer Close only after the error check: response is nil when err != nil.
defer response.Body.Close()
// Read response data
body, err := ioutil.ReadAll(response.Body)
if err != nil {
log.Fatal("Error")
return resp, err
}
json.Unmarshal(body, &resp)
resp.status = response.StatusCode
return resp, nil
}
type Record struct {
origin string
url string
}
func load(ch chan Response) {
// Read response from channel
resp := <-ch
// Process the response data
records := process(resp)
log.Printf("%+v\n", records)
// Load data to db stuff here
}
func process(resp Response) (record Record) {
// Process the response struct as needed to get a record of data to insert to DB
return record
}

Nothing in the program waits for the goroutines to finish: main() returns as soon as the loop ends, so the process sometimes exits before load() has had a chance to run.
To prevent that, use a sync.WaitGroup:
wg := sync.WaitGroup{}
for i := 0; i < numRequests; i++ {
...
wg.Add(1)
go func() {
defer wg.Done()
load(responses)
}()
}
wg.Wait()
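A fuller sketch of the same idea, under some assumptions: fetch is a hypothetical stand-in for extract() that does no network I/O, and the Response type is simplified. It shows one way to combine a WaitGroup with closing the channel, so that a single loader goroutine can range over the responses and stop cleanly once every producer is done.

```go
package main

import (
	"fmt"
	"sync"
)

// Response is a simplified stand-in for the API response in the question.
type Response struct {
	ID     int
	Origin string
}

// fetch is a hypothetical placeholder for extract(); it does no network I/O.
func fetch(id int) Response {
	return Response{ID: id, Origin: fmt.Sprintf("origin-%d", id)}
}

// runPipeline starts n producer goroutines and one loader goroutine, and
// returns the origins the loader saw once all work is finished.
func runPipeline(n int) []string {
	responses := make(chan Response, n)

	var producers sync.WaitGroup
	for i := 0; i < n; i++ {
		producers.Add(1)
		go func(id int) {
			defer producers.Done()
			responses <- fetch(id)
		}(i)
	}

	// Close the channel only after every producer has sent its value.
	go func() {
		producers.Wait()
		close(responses)
	}()

	var loaded []string
	done := make(chan struct{})
	go func() {
		defer close(done)
		// The range loop ends when the channel is closed and drained.
		for resp := range responses {
			loaded = append(loaded, resp.Origin) // the DB insert would go here
		}
	}()

	<-done // wait for the loader before returning
	return loaded
}

func main() {
	fmt.Println(len(runPipeline(5)), "responses loaded")
}
```

Closing the channel from a dedicated goroutine, rather than with defer in main(), means the loader never sees the channel closed while producers are still sending.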


How to prioritize goroutines

I want to call two endpoints (A and B) at the same time. If both return a 200 response, I need to use the response from A; otherwise I use the response from B.
If B returns first, I still need to wait for A. In other words, I must use A's response whenever A returns 200.
Can you guys help me with the pattern?
Thank you
Wait for a result from A. If the result is not good, then wait for a result from B. Use a buffered channel for the B result so that the sender does not block when A is good.
In the following snippets, fnA() and fnB() are functions that issue requests to the endpoints, consume the response and clean up. I assume that the result is a []byte, but it could be the result of decoding JSON or something else. Here's an example for fnA:
func fnA() ([]byte, error) {
r, err := http.Get("http://example.com/a")
if err != nil {
return nil, err
}
defer r.Body.Close() // <-- Important: close the response body!
if r.StatusCode != 200 {
return nil, errors.New("bad response")
}
return ioutil.ReadAll(r.Body)
}
Define a type to hold the result and error.
type response struct {
result []byte
err error
}
With those preliminaries done, here's how to prioritize A over B.
a := make(chan response)
go func() {
result, err := fnA()
a <- response{result, err}
}()
b := make(chan response, 1) // Size > 0 is important!
go func() {
result, err := fnB()
b <- response{result, err}
}()
resp := <-a
if resp.err != nil {
resp = <-b
if resp.err != nil {
// handle error. A and B both failed.
}
}
result := resp.result
If the application does not execute code concurrently with A and B, then there's no need to use a goroutine for A:
b := make(chan response, 1) // Size > 0 is important!
go func() {
result, err := fnB()
b <- response{result, err}
}()
result, err := fnA()
if err != nil {
resp := <-b
if resp.err != nil {
// handle error. A and B both failed.
}
result = resp.result
}
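The pattern above can be wrapped in a small helper. This is only a sketch: preferA, slowA, fastB and failA are hypothetical names, and the three stub functions simulate endpoints instead of making HTTP calls.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type result struct {
	data []byte
	err  error
}

// preferA runs both calls concurrently and returns A's result whenever A
// succeeds; B's result is used only when A fails.
func preferA(fnA, fnB func() ([]byte, error)) ([]byte, error) {
	b := make(chan result, 1) // buffered so B's goroutine never blocks
	go func() {
		data, err := fnB()
		b <- result{data, err}
	}()

	data, errA := fnA()
	if errA == nil {
		return data, nil
	}
	r := <-b
	if r.err != nil {
		return nil, fmt.Errorf("A failed: %v; B failed: %v", errA, r.err)
	}
	return r.data, nil
}

// Simulated endpoints (assumptions for the demo, not real requests).
func slowA() ([]byte, error) {
	time.Sleep(50 * time.Millisecond)
	return []byte("from A"), nil
}

func fastB() ([]byte, error) { return []byte("from B"), nil }

func failA() ([]byte, error) { return nil, errors.New("A is down") }

func main() {
	out, _ := preferA(slowA, fastB)
	fmt.Println(string(out)) // from A, even though B finished first

	out, _ = preferA(failA, fastB)
	fmt.Println(string(out)) // from B, because A failed
}
```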
I suggest something like the following. It is a bulkier solution, but it lets you query more than two endpoints if you need to.
func endpointPriorityTest() {
const (
sourceA = "a"
sourceB = "b"
sourceC = "c"
)
type endpointResponse struct {
source string
response *http.Response
error
}
endpointsMap := map[string]string{
sourceA: "https://jsonplaceholder.typicode.com/posts/1",
sourceB: "https://jsonplaceholder.typicode.com/posts/10",
sourceC: "https://jsonplaceholder.typicode.com/posts/100",
}
// Buffered so that goroutines still running after we stop reading can send and exit
epResponseChan := make(chan *endpointResponse, len(endpointsMap))
for source, endpointURL := range endpointsMap {
source := source
endpointURL := endpointURL
go func(respChan chan<- *endpointResponse) {
// You can add a delay so that the response from A takes longer than from B
// and look to the result map
// if source == sourceA {
// time.Sleep(time.Second)
// }
resp, err := http.Get(endpointURL)
respChan <- &endpointResponse{
source: source,
response: resp,
error: err,
}
}(epResponseChan)
}
respCache := make(map[string]*http.Response)
// Reading endpointURL responses from chan
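// Receive exactly one result per endpoint; ranging over a channel that is
// never closed would block forever if the request to A fails
for i := 0; i < len(endpointsMap); i++ {
epResp := <-epResponseChan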
// Skips failed requests
if epResp.error != nil {
continue
}
// Save successful response to cache map
respCache[epResp.source] = epResp.response
// Stop reading from the channel once we have a response from source A
if epResp.source == sourceA {
break
}
}
fmt.Println("result map: ", respCache)
// Now we can use data from cache map
// resp, ok :=respCache[sourceA]
// if ok{
// ...
// }
}
@Zombo's answer has the correct logic flow. Piggybacking off it, I would suggest one addition: leveraging the context package.
Basically, any potentially blocking task should take a context.Context so that the call chain can clean up more efficiently in the event of early cancellation.
In your case, context.Context can also be leveraged to abort the B call early if the A call succeeds:
func failoverResult(ctx context.Context) *http.Response {
// wrap the (parent) context
ctx, cancel := context.WithCancel(ctx)
// if we return early i.e. if `fnA()` completes first
// this will "cancel" `fnB()`'s request.
defer cancel()
b := make(chan *http.Response, 1)
go func() {
b <- fnB(ctx)
}()
resp := fnA(ctx)
// fnA returns nil when the request itself failed, so check for that first
if resp == nil || resp.StatusCode != 200 {
resp = <-b
}
return resp
}
fnA (and fnB) would look something like this:
func fnA(ctx context.Context) (resp *http.Response) {
req, _ := http.NewRequestWithContext(ctx, "GET", aUrl, nil)
resp, _ = http.DefaultClient.Do(req) // TODO: check errors
return
}
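To see the cancellation take effect without a network, here is a minimal sketch. slowCall is a hypothetical stand-in for a request built with http.NewRequestWithContext, and failover returns B's error purely so the demo can show how B ended; a real caller would not normally wait for B after A succeeds.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type outcome struct {
	value string
	err   error
}

// slowCall simulates a request: it finishes after d unless ctx is
// cancelled first, in which case it returns ctx.Err().
func slowCall(ctx context.Context, d time.Duration, value string) (string, error) {
	select {
	case <-time.After(d):
		return value, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// failover prefers A. The second return value is the error B ended with
// and is returned for demonstration only.
func failover(parent context.Context) (string, error) {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	b := make(chan outcome, 1)
	go func() {
		v, err := slowCall(ctx, time.Second, "B")
		b <- outcome{v, err}
	}()

	a, err := slowCall(ctx, 10*time.Millisecond, "A")
	if err != nil {
		r := <-b // A failed: fall back to B
		return r.value, r.err
	}
	cancel() // A succeeded: abort B right away
	r := <-b // wait only to observe how B ended
	return a, r.err
}

// runDemo returns the winner and the text of B's error.
func runDemo() (string, string) {
	winner, bErr := failover(context.Background())
	if bErr == nil {
		return winner, ""
	}
	return winner, bErr.Error()
}

func main() {
	winner, bErr := runDemo()
	fmt.Println(winner, "/", bErr) // A / context canceled
}
```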
In Go, channels are normally used for communication between goroutines.
You can orchestrate your scenario with the following sample code.
Basically, you pass a channel into callB, which will hold its response. You don't need to run callA in a goroutine, because you always need the result from that endpoint/service.
package main
import (
"fmt"
"time"
)
func main() {
resB := make(chan int)
go callB(resB)
res := callA()
if res == 200 {
fmt.Print("No Need for B")
} else {
res = <-resB
fmt.Printf("Response from B : %d", res)
}
}
func callA() int {
time.Sleep(1000)
return 200
}
func callB(res chan int) {
time.Sleep(500)
res <- 200
}
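Update: as suggested in a comment, the code above leaks the callB goroutine when A succeeds, because nothing ever receives from the unbuffered channel. Giving the channel a buffer of 1 (and using real durations in time.Sleep) fixes that: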
package main
import (
"fmt"
"time"
)
func main() {
resB := make(chan int, 1)
go callB(resB)
res := callA()
if res == 200 {
fmt.Print("No Need for B")
} else {
res = <-resB
fmt.Printf("Response from B : %d", res)
}
}
func callA() int {
time.Sleep(1000 * time.Millisecond)
return 200
}
func callB(res chan int) {
time.Sleep(500 * time.Millisecond)
res <- 200
}

variable is empty but later has a value

I'm trying to develop a Terraform provider, but I have a problem with the body of the first request. Here is the code:
type Body struct {
id string
}
func resourceServerCreate(d *schema.ResourceData, m interface{}) error {
key := d.Get("key").(string)
token := d.Get("token").(string)
workspace_name := d.Get("workspace_name").(string)
board_name := d.Get("board_name").(string)
resp, err := http.Post("https://api.trello.com/1/organizations?key="+key+"&token="+token+"&displayName="+workspace_name,"application/json",nil)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
// Read the body.
body := new(Body)
json.NewDecoder(resp.Body).Decode(body)
log.Println("[DEBUG] the log works "+body.id)
d.Set("board_id",body.id)
resp1, err1 := http.Post("https://api.trello.com/1/boards?key="+key+"&token="+token+"&idOrganization="+body.id+"&=&name="+board_name,"application/json",nil)
if err1 != nil {
log.Fatalln(resp1)
}
defer resp1.Body.Close()
d.SetId(board_name)
return resourceServerRead(d, m)
}
In the log, body.id is empty, but the second call has it and works fine. How is that possible?
Go doesn't force you to check error responses, therefore it's easy to make silly mistakes. Had you checked the return value from Decode(), you would have immediately discovered a problem.
err := json.NewDecoder(resp.Body).Decode(body)
if err != nil {
log.Fatal("Decode error: ", err)
}
Decode error: json: Unmarshal(non-pointer main.Body)
That is the error Decode() reports when it is handed a non-pointer value, so the most immediate fix is to use & to pass a pointer to Decode():
json.NewDecoder(resp.Body).Decode(&body)
Also of note, some programming editors will highlight this mistake for you.
Here's a working demonstration. It includes a corrected Body structure with an exported field, as described in "json.Marshal(struct) returns “{}”", because encoding/json ignores unexported fields such as id:
package main
import (
"bytes"
"encoding/json"
"fmt"
"log"
"net/http"
"time"
)
type JSON = map[string]interface{}
type JSONArray = []interface{}
func ErrFatal(err error, msg string) {
if err != nil {
log.Fatal(msg+": ", err)
}
}
func handleTestRequest(w http.ResponseWriter, req *http.Request) {
w.Write(([]byte)("{\"id\":\"yourid\"}"))
}
func launchTestServer() {
http.HandleFunc("/", handleTestRequest)
go http.ListenAndServe(":8080", nil)
time.Sleep(1 * time.Second) // allow server to get started
}
// Medium: "Don’t use Go’s default HTTP client (in production)"
var restClient = &http.Client{
Timeout: time.Second * 10,
}
func DoREST(method, url string, headers, payload JSON) *http.Response {
requestPayload, err := json.Marshal(payload)
ErrFatal(err, "json.Marshal(payload")
request, err := http.NewRequest(method, url, bytes.NewBuffer(requestPayload))
ErrFatal(err, "NewRequest "+method+" "+url)
for k, v := range headers {
request.Header.Add(k, v.(string))
}
response, err := restClient.Do(request)
ErrFatal(err, "DoRest client.Do")
return response
}
type Body struct {
Id string `json:"id"`
}
func clientDemo() {
response := DoREST("POST", "http://localhost:8080", JSON{}, JSON{})
defer response.Body.Close()
var body Body
err := json.NewDecoder(response.Body).Decode(&body)
ErrFatal(err, "Decode")
fmt.Printf("Body: %#v\n", body)
}
func main() {
launchTestServer()
for i := 0; i < 5; i++ {
clientDemo()
}
}
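The effect of the field name can be checked in isolation. The following sketch decodes the same payload into two hypothetical types, unexportedBody and exportedBody; only the second one has a field that encoding/json is able to set, and no error is reported in either case.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unexportedBody mirrors the struct from the question.
type unexportedBody struct {
	id string // lower-case: invisible to encoding/json
}

// exportedBody mirrors the corrected struct.
type exportedBody struct {
	ID string `json:"id"`
}

// decodeBoth decodes payload into both types and returns the two ids.
func decodeBoth(payload string) (string, string, error) {
	var u unexportedBody
	if err := json.Unmarshal([]byte(payload), &u); err != nil {
		return "", "", err
	}
	var e exportedBody
	if err := json.Unmarshal([]byte(payload), &e); err != nil {
		return "", "", err
	}
	return u.id, e.ID, nil
}

func main() {
	u, e, err := decodeBoth(`{"id":"abc123"}`)
	fmt.Printf("unexported=%q exported=%q err=%v\n", u, e, err)
	// unexported="" exported="abc123" err=<nil>
}
```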

How can you longpoll multiple urls in Go?

Here's what I have thus far:
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"log"
"net/http"
)
func subscribe(urls Urls) []byte {
req, err := http.NewRequest("GET", urls.Url, nil)
if err != nil {
log.Fatal(err)
}
req.Header.Set("authentication", "Bearer " + urls.Token)
http_client := &http.Client{}
res, err := http_client.Do(req)
if err != nil {
log.Fatal(err)
}
defer res.Body.Close()
resourceResp, err := ioutil.ReadAll(res.Body)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(resourceResp))
var data map[string]interface{}
error := json.Unmarshal([]byte(resourceResp), &data)
if error != nil {
log.Fatal(error)
}
return subscribe(urls)
}
type Urls struct {
Url string
Token string
}
func main() {
var urls [2]Urls
urls[0] = Urls{
Url: "https://example.com/users/8",
Token: "abcdefg",
}
urls[1] = Urls{
Url: "https://example.com/users/9",
Token: "hijklmnop",
}
subscribe(urls[0])
subscribe(urls[1])
}
The end goal is to "subscribe" to the multiple urls and pull any updated data (eventually adding it to a queue, but one step at a time). After that, reestablish the connection. Right now, only the first subscribe gets run. Thanks!
I think what you're asking is for the subscribe functions to be run in parallel. One way is to wrap them in goroutines and wait for all the goroutines to finish:
func main() {
...
...
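var wg sync.WaitGroup
wg.Add(len(urls))
for _, url := range urls {
// Pass url as an argument so each goroutine gets its own copy
// (before Go 1.22 the loop variable was shared across iterations).
go func(url Urls) {
defer wg.Done()
subscribe(url)
}(url)
}
wg.Wait()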
}
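As a rough sketch of how the pieces could fit together, the version below polls in a loop instead of calling subscribe recursively (Go does not eliminate tail calls, so endless recursion keeps growing the stack) and funnels updates from all sources into one channel. pollFunc, fakePoll and the max cap are assumptions made so the demo terminates; a real long-poller would perform the HTTP request inside poll and loop until told to stop.

```go
package main

import (
	"fmt"
	"sync"
)

// pollFunc is a hypothetical stand-in for one long-poll HTTP request.
type pollFunc func() (string, error)

// subscribe calls poll in a loop and forwards each update to out.
// It stops after max polls or on the first error.
func subscribe(name string, poll pollFunc, max int, out chan<- string) {
	for i := 0; i < max; i++ {
		update, err := poll()
		if err != nil {
			return
		}
		out <- name + ": " + update
	}
}

// subscribeAll runs one subscribe loop per source concurrently and
// collects every update into a slice.
func subscribeAll(sources map[string]pollFunc, max int) []string {
	out := make(chan string)
	var wg sync.WaitGroup
	for name, poll := range sources {
		wg.Add(1)
		go func(name string, poll pollFunc) {
			defer wg.Done()
			subscribe(name, poll, max, out)
		}(name, poll)
	}
	// Close out once every subscriber has returned.
	go func() {
		wg.Wait()
		close(out)
	}()

	var updates []string
	for u := range out {
		updates = append(updates, u) // this is where a queue could be fed
	}
	return updates
}

// fakePoll simulates an endpoint that always has an update.
func fakePoll() (string, error) { return "changed", nil }

func main() {
	sources := map[string]pollFunc{"users/8": fakePoll, "users/9": fakePoll}
	updates := subscribeAll(sources, 3)
	fmt.Println(len(updates), "updates received")
}
```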

Go Routine: Shared Global variable in web server

I have a Go web server listening on a port and handling POST requests. Each request internally calls several different URLs using goroutines, collects the responses and then proceeds.
I have divided the whole flow into different functions. Here is a draft of the code:
package main
import (
"bytes"
"fmt"
"github.com/gorilla/mux"
"log"
"net/http"
"time"
)
var status_codes string
func main() {
router := mux.NewRouter().StrictSlash(true)
/*router := NewRouter()*/
router.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
_, _ = fmt.Fprintf(w, "Hello!!!")
})
router.HandleFunc("/{name}", func(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
prepare(w, r, vars["name"])
}).Methods("POST")
log.Fatal(http.ListenAndServe(fmt.Sprintf(":%d", 8080), router))
}
func prepare(w http.ResponseWriter, r *http.Request, name string) {
//initializing for the current request, need to maintain this variable for each request coming
status_codes = ""
//other part of the code and call to goroutine
var urls []string
//lets say all the url loaded, call the go routine func and wait for channel to respond and then proceed with the response of all url
results := callUrls(urls)
process(w, results)
}
type Response struct {
status int
url string
body string
}
func callUrls(urls []string) []*Response {
ch := make(chan *Response, len(urls))
for _, url := range urls {
go func(url string) {
//http post on url,
//base on status code of url call, add to status code
//some thing like
req, err := http.NewRequest("POST", url, bytes.NewBuffer(somePostData))
req.Header.Set("Content-Type", "application/json")
req.Close = true
client := &http.Client{
Timeout: time.Duration(time.Duration(100) * time.Second),
}
response, err := client.Do(req)
if err != nil {
status_codes += "200,"
//do other thing with the response received
} else {
status_codes += "500,"
}
// return to channel accordingly
ch <- &Response{200, "url", "response body"}
}(url)
}
var results []*Response
for {
select {
case r := <-ch:
results = append(results, r)
if len(results) == len(urls) {
//Done
close(ch)
return results
}
}
}
}
func process(w http.ResponseWriter, results []*Response){
//read those status code received from all urls call for the given request
fmt.Println("status", status_codes)
//Now the above line keep getting status code from other request as well
//for eg. if I have called 5 urls then it should have
//200,500,204,404,200,
//but instead it is
//200,500,204,404,200,204,404,200,204,404,200, and some more keep growing with time
}
The above code does the following:
The variable is declared globally and initialized in the prepare function.
Values are appended to it in the goroutines started by the callUrls function.
The variable is read in the process function.
Should I pass the globally declared variables to each function call to make them local, so that they are no longer shared? (I would hate to do this.)
Or is there another approach that achieves the same thing without adding more arguments to the functions being called?
I will have a few other string and int values as well that are used across the program and in the goroutine function.
What is the correct way to make them thread safe, so that each request arriving on the port sees only its own 5 codes even when requests are handled simultaneously?
Don't use global variables, be explicit instead and use function arguments. Moreover, you have a race condition on status_codes because it is accessed by multiple goroutines without any mutex lock.
Take a look at my fix below.
func prepare(w http.ResponseWriter, r *http.Request, name string) {
var urls []string
//status_codes is populated by callUrls(), so let it return the slice with values
results, status_codes := callUrls(urls)
//process() needs status_codes in order to work, so pass the variable explicitly
process(w, results, status_codes)
}
type Response struct {
status int
url string
body string
}
func callUrls(urls []string) ([]*Response, []string) {
ch := make(chan *Response, len(urls))
//In order to avoid race condition, let's use a channel
statusChan := make(chan string, len(urls))
for _, url := range urls {
go func(url string) {
//http post on url,
//base on status code of url call, add to status code
//some thing like
req, err := http.NewRequest("POST", url, bytes.NewBuffer(somePostData))
if err != nil {
//always send one value on each channel so the receiver below never blocks
statusChan <- "500"
ch <- &Response{500, url, err.Error()}
return
}
req.Header.Set("Content-Type", "application/json")
req.Close = true
client := &http.Client{
Timeout: 100 * time.Second,
}
response, err := client.Do(req)
if err != nil {
statusChan <- "500"
} else {
defer response.Body.Close()
statusChan <- "200"
//do other thing with the response received
}
// return to channel accordingly
ch <- &Response{200, "url", "response body"}
}(url)
}
var results []*Response
var status_codes []string
//every goroutine sends exactly one value on each channel,
//so receive len(urls) values from both and then return
for range urls {
status_codes = append(status_codes, <-statusChan)
results = append(results, <-ch)
}
return results, status_codes
}
func process(w http.ResponseWriter, results []*Response, status_codes []string) {
fmt.Println("status", status_codes)
}
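If a mutex is preferred over a second channel, a per-request collector is another option. The sketch below is illustrative only: statusLog and callAll are hypothetical names, and the goroutines record fixed strings instead of making HTTP calls. The key point is that a fresh statusLog is created for each incoming request, so nothing is shared between requests.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// statusLog collects status codes for ONE request; create a new value
// per request instead of using a package-level variable.
type statusLog struct {
	mu    sync.Mutex
	codes []string
}

// add records a code; it is safe to call from many goroutines.
func (s *statusLog) add(code string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.codes = append(s.codes, code)
}

// String returns the codes sorted and comma-separated. Sorting is only
// there to make the output deterministic.
func (s *statusLog) String() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	sorted := append([]string(nil), s.codes...)
	sort.Strings(sorted)
	return strings.Join(sorted, ",")
}

// callAll is a stand-in for callUrls: each goroutine records one fake
// status code in the collector that belongs to this call.
func callAll(codes []string) *statusLog {
	sl := &statusLog{}
	var wg sync.WaitGroup
	for _, c := range codes {
		wg.Add(1)
		go func(c string) {
			defer wg.Done()
			sl.add(c)
		}(c)
	}
	wg.Wait()
	return sl
}

func main() {
	fmt.Println(callAll([]string{"200", "500", "204"}).String()) // 200,204,500
}
```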

Too Many open files/ No such host error while running a go program which makes concurrent requests

I have a Go program that is supposed to call an API with different payloads. The web application is a Dropwizard application running on localhost, and the Go program is below:
package main
import (
"bufio"
"encoding/json"
"log"
"net"
"net/http"
"os"
"strings"
"time"
)
type Data struct {
PersonnelId string `json:"personnel_id"`
DepartmentId string `json:"department_id"`
}
type PersonnelEvent struct {
EventType string `json:"event_type"`
Data `json:"data"`
}
const (
maxIdleConnections = 20
maxIdleConnectionsPerHost = 20
timeout = time.Duration(5 * time.Second)
)
var transport = http.Transport{
Dial: dialTimeout,
MaxIdleConns: maxIdleConnections,
MaxIdleConnsPerHost: 20,
}
var client = &http.Client{
Transport: &transport,
}
func dialTimeout(network, addr string) (net.Conn, error) {
return net.DialTimeout(network, addr, timeout)
}
func makeRequest(payload string) {
req, _ := http.NewRequest("POST", "http://localhost:9350/v1/provider-location-personnel/index", strings.NewReader(payload))
req.Header.Set("X-App-Token", "TESTTOKEN1")
req.Header.Set("Content-Type", "application/json")
resp, err := client.Do(req)
if err != nil {
log.Println("Api invocation returned an error ", err)
} else {
defer resp.Body.Close()
log.Println(resp.Body)
}
}
func indexPersonnels(personnelPayloads []PersonnelEvent) {
for _, personnelEvent := range personnelPayloads {
payload, err := json.Marshal(personnelEvent)
if err != nil {
log.Println("Error while marshalling payload ", err)
}
log.Println(string(payload))
// go makeRequest(string(payload))
}
}
func main() {
ch := make(chan PersonnelEvent)
for i := 0; i < 20; i++ {
go func() {
for personnelEvent := range ch {
payload, err := json.Marshal(personnelEvent)
if err != nil {
log.Println("Error while marshalling payload", err)
}
go makeRequest(string(payload))
//log.Println("Payload ", string(payload))
}
}()
}
file, err := os.Open("/Users/tmp/Desktop/personnels.txt")
defer file.Close()
if err != nil {
log.Fatalf("Error opening personnel id file %v", err)
}
scanner := bufio.NewScanner(file)
for scanner.Scan() {
go func() {
ch <- PersonnelEvent{EventType: "provider_location_department_personnel_linked", Data: Data{DepartmentId: "2a8d9687-aea8-4a2c-bc08-c64d7716d973", PersonnelId: scanner.Text()}}
}()
}
}
It reads some IDs from a file, creates a payload from each one and sends a POST request to the web server. But when I run the program it fails with "too many open files" / "no such host" errors. I suspect the program is too concurrent. How can I make it run gracefully?
Inside the 20 goroutines started in main(), "go makeRequest(...)" creates yet another goroutine for each event. You don't need to start an extra goroutine there.
Besides, I don't think you need to start a goroutine in your scan loop either. A buffered channel is enough, because the bottleneck should be the HTTP requests.
You can use a buffered channel, a.k.a. a counting semaphore, to limit the parallelism.
// The capacity of the buffered channel is 10,
// which means you can have 10 goroutines to
// run the makeRequest function in parallel.
var tokens = make(chan struct{}, 10)
func makeRequest(payload string) {
tokens <- struct{}{} // acquire the token or block here
defer func() { <-tokens }() // release the token to awake another goroutine
// other code...
}
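A self-contained sketch of the counting-semaphore idea follows. runBounded is a hypothetical helper, and the time.Sleep stands in for the HTTP request. Unlike the snippet above, it acquires the token before starting each goroutine, which also caps the number of goroutines alive at once; the atomic counters exist only to measure the peak concurrency for the demo.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runBounded runs jobs tasks with at most limit of them in flight at any
// time and returns the highest concurrency it observed.
func runBounded(jobs, limit int) int32 {
	tokens := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int32

	for i := 0; i < jobs; i++ {
		wg.Add(1)
		tokens <- struct{}{} // acquire a token, or block until one is free
		go func() {
			defer wg.Done()
			defer func() { <-tokens }() // release the token

			n := atomic.AddInt32(&running, 1)
			// Record a new peak if this is the highest count seen so far.
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond) // stand-in for the HTTP request
			atomic.AddInt32(&running, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runBounded(50, 10)
	fmt.Println("peak concurrency within limit:", peak <= 10)
}
```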
