I have a service that uploads files to AWS S3. I tried uploading both with and without goroutines. Without goroutines, the handler waits until the upload finishes before responding; with a goroutine, the upload runs in the background and the response gets back to the client faster.
But what if the upload fails when I use a goroutine, and the file never makes it to AWS S3? How should I handle that?
Here is my upload function:
func uploadToS3(s *session.Session, size int64, name string, buffer []byte) (string, error) {
	tempFileName := "pictures/" + bson.NewObjectId().Hex() + "-" + filepath.Base(name)
	_, err := s3.New(s).PutObject(&s3.PutObjectInput{
		Bucket:               aws.String("myBucketNameHere"),
		Key:                  aws.String(tempFileName),
		ACL:                  aws.String("public-read"),
		Body:                 bytes.NewReader(buffer),
		ContentLength:        aws.Int64(size),
		ContentType:          aws.String(http.DetectContentType(buffer)),
		ContentDisposition:   aws.String("attachment"),
		ServerSideEncryption: aws.String("AES256"),
		StorageClass:         aws.String("INTELLIGENT_TIERING"),
	})
	if err != nil {
		return "", err
	}
	return tempFileName, nil
}
func UploadFile(db *gorm.DB) func(c *gin.Context) {
	return func(c *gin.Context) {
		file, err := c.FormFile("file")
		f, err := file.Open()
		if err != nil {
			fmt.Println(err)
		}
		defer f.Close()
		buffer := make([]byte, file.Size)
		_, _ = f.Read(buffer)
		s, err := session.NewSession(&aws.Config{
			Region: aws.String("location here"),
			Credentials: credentials.NewStaticCredentials(
				"id",
				"key",
				"",
			),
		})
		if err != nil {
			fmt.Println(err)
		}
		go uploadToS3(s, file.Size, file.Filename, buffer)
		c.JSON(200, fmt.Sprintf("Image uploaded successfully"))
	}
}
I was also wondering: what if there are a lot of upload requests, say 10,000+ within 5-10 minutes? Could some files fail to upload because there are too many requests?
The problem is that when you use a goroutine, you immediately return a success message to your client. If that's really what you want, it means your goroutine needs to be able to recover if the upload to S3 fails (so you don't lose the image). So either you take care of that yourself, or you inform your client asynchronously that the upload failed, so the client can retry.
This question is too broad for a single answer. There are, broadly speaking, three possible approaches:
1. Wait for your goroutines to complete to handle any errors.
2. Ensure your goroutines can handle (or possibly ignore) any errors they encounter, such that returning an error never matters.
3. Have your goroutines log any errors, for handling later, possibly by a human, or possibly by some cleanup/retry function.
Which approach is best depends on the situation.
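For the first approach, a minimal sketch inside the handler could look like this (it reuses the uploadToS3 signature from the question and golang.org/x/sync/errgroup, which is an assumption, not something the question already imports):

	// Run the upload concurrently, but wait for it and surface any error
	// to the client before responding.
	var g errgroup.Group
	g.Go(func() error {
		_, err := uploadToS3(s, file.Size, file.Filename, buffer)
		return err
	})
	// ...other request handling work can happen here while the upload runs...
	if err := g.Wait(); err != nil {
		c.JSON(500, "upload failed")
		return
	}
	c.JSON(200, "Image uploaded successfully")

This keeps the upload concurrent with any other work in the handler, but still reports a failure to the client instead of a blind success.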
For any asynchronous task - such as uploading a file in a background goroutine - you can write the uploading function so that it returns a chan error to the caller. The caller can then react to the upload's eventual error (or nil for no error) at a later time by reading from that chan error.
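For example, a thin wrapper around the question's uploadToS3 can hand back such a channel (the uploadToS3Async name is just illustrative):

func uploadToS3Async(s *session.Session, size int64, name string, buffer []byte) <-chan error {
	errc := make(chan error, 1) // buffered, so the goroutine never blocks on send
	go func() {
		_, err := uploadToS3(s, size, name, buffer)
		errc <- err
	}()
	return errc
}

The caller starts the upload, does its other work, and reads from the returned channel whenever it wants the result.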
However, if you are accepting upload requests, I'd suggest instead creating a worker upload goroutine that accepts file uploads via a channel. An output "error" channel can track success/failure, and if need be, a failed upload can be written back onto the original upload channel queue (with a retry tally and a retry max, so a problematic payload does not loop forever).
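A rough sketch of that worker pattern (the uploadJob type, queue size, retry limit and worker count are all assumptions, not from the question):

type uploadJob struct {
	name    string
	buffer  []byte
	retries int
}

const maxRetries = 3

var jobs = make(chan uploadJob, 1000) // bounded queue gives back-pressure under load

// startUploadWorkers launches n workers that drain the job queue.
func startUploadWorkers(s *session.Session, n int) {
	for i := 0; i < n; i++ {
		go func() {
			for job := range jobs {
				_, err := uploadToS3(s, int64(len(job.buffer)), job.name, job.buffer)
				if err == nil {
					continue
				}
				if job.retries < maxRetries {
					job.retries++
					jobs <- job // re-queue with a retry tally (may block if the queue is full)
				} else {
					log.Printf("upload of %s failed permanently: %v", job.name, err)
				}
			}
		}()
	}
}

The gin handler would then just send an uploadJob onto jobs instead of spawning one goroutine per request, which also bounds how many uploads are in flight when you get 10,000+ requests within a few minutes.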
I'm integrating the Binance API into an existing system, and while most parts are straightforward, the data-streaming API hits my limited understanding of goroutines. I don't believe there is anything special in the Golang SDK for Binance. Essentially I only need two functions: one that starts the data stream and processes events with the event handler given as a parameter, and a second one that ends the data stream without actually shutting down the client, as that would close all other connections. On a previous project there were two message types for this, but the Binance SDK uses an implementation that returns two Go channels: one for errors and another one, I guess from the name, for stopping the data stream.
The code I wrote for starting the data stream looks like this:
func startDataStream(symbol, interval string, wsKlineHandler futures.WsKlineHandler, errHandler futures.ErrHandler) (err error) {
	doneC, stopC, err := futures.WsKlineServe(symbol, interval, wsKlineHandler, errHandler)
	if err != nil {
		fmt.Println(err)
		return err
	}
	_, _ = doneC, stopC // the returned channels are not used yet; that is the question below
	return nil
}
This works as expected and streams data. A simple test verifies it:
func runWSDataTest() {
	symbol := "BTCUSDT"
	interval := "15m"
	errHandler := func(err error) { fmt.Println(err) }
	wsKlineHandler := func(event *futures.WsKlineEvent) { fmt.Println(event) }
	_ = startDataStream(symbol, interval, wsKlineHandler, errHandler)
}
The thing that is not so clear to me, mainly due to my incomplete understanding, is how to stop the stream. I think the returned stopC channel can be used to issue an end signal, similar to, say, a SIGTERM at the system level, and then the stream should end.
Say I have a stopDataStream function that takes a symbol as an argument:
func stopDataStream(symbol string) {
}
Let's suppose I start 5 data streams for five symbols and now I want to stop just one of the streams. That begs the question of:
How do I track all those stopC channels?
Can I use a collection keyed with the symbol, pull the stopC channel, and then just issue a signal to end just that data stream?
How do I actually write into the stopC channel from the stop function?
Again, I don't think this is particularly hard, it's just that I could not figure it out from the docs yet, so any help would be appreciated.
Thank you
(Answer originally written by #Marvin.Hansen)
It turned out that just saving & closing the channel solved it all. I was really surprised by how easy this is, but here is the code of the updated functions:
func startDataStream(symbol, interval string, wsKlineHandler futures.WsKlineHandler, errHandler futures.ErrHandler) (err error) {
	_, stopC, err := futures.WsKlineServe(symbol, interval, wsKlineHandler, errHandler)
	if err != nil {
		fmt.Println(err)
		return err
	}
	// just save the stop channel
	chanMap[symbol] = stopC
	return nil
}
And then the stop function becomes embarrassingly trivial:
func stopDataStream(symbol string) {
	stopC := chanMap[symbol] // load the stop channel for the symbol
	close(stopC)             // just close it
}
Finally, testing it all out:
var (
	chanMap map[string]chan struct{}
)

func runWSDataTest() {
	chanMap = make(map[string]chan struct{})
	symbol := "BTCUSDT"
	interval := "15m"
	errHandler := func(err error) { fmt.Println(err) }
	wsKlineHandler := getKLineHandler()
	println("Start stream")
	_ = startDataStream(symbol, interval, wsKlineHandler, errHandler)
	time.Sleep(3 * time.Second)
	println("Stop stream")
	stopDataStream(symbol)
	time.Sleep(1 * time.Second)
}
This is it.
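One caveat worth adding: plain Go maps are not safe for concurrent use, so if streams may be started and stopped from different goroutines, guarding chanMap with a mutex (a sketch, not part of the original answer) avoids a data race:

var (
	chanMu  sync.Mutex
	chanMap = make(map[string]chan struct{})
)

func saveStopChannel(symbol string, stopC chan struct{}) {
	chanMu.Lock()
	defer chanMu.Unlock()
	chanMap[symbol] = stopC
}

func stopDataStream(symbol string) {
	chanMu.Lock()
	defer chanMu.Unlock()
	if stopC, ok := chanMap[symbol]; ok {
		close(stopC)
		delete(chanMap, symbol) // drop the entry so a second stop doesn't close the channel twice
	}
}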
I'm building a simple application that lets users upload big files using simple-uploader, since this plugin sends the files in chunks instead of one big file. The problem is that when I save the file, only the first chunk gets saved. Is there a way in Go to wait for all the chunks to arrive at the server and then save the file afterwards?
Here's a snippet of the code I'm doing:
dFile, err := c.FormFile("file")
if err != nil {
	return SendError(c, err)
}
filename := dFile.Filename
f, err := dFile.Open()
if err != nil {
	return SendError(c, err)
}
defer f.Close()

// save file in s3
duration := sss.UploadFile(f, "temp/"+filename)

// ... send response
By the way, for this project I'm using the Fiber framework.
While working on this I came across tus-js-client, which does the same thing as simple-uploader, and a Go implementation called tusd, which reassembles the chunks on the server so you don't have to worry about it anymore.
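For reference, a minimal tusd server sketch, roughly following the example in the tusd README (the import paths and Config fields may have changed since, so treat this as an approximation and check the current tusd docs):

package main

import (
	"net/http"

	"github.com/tus/tusd/pkg/filestore"
	tusd "github.com/tus/tusd/pkg/handler"
)

func main() {
	// Store completed (reassembled) uploads on local disk.
	store := filestore.FileStore{Path: "./uploads"}
	composer := tusd.NewStoreComposer()
	store.UseIn(composer)

	h, err := tusd.NewHandler(tusd.Config{
		BasePath:      "/files/",
		StoreComposer: composer,
	})
	if err != nil {
		panic(err)
	}
	http.Handle("/files/", http.StripPrefix("/files/", h))
	http.ListenAndServe(":8080", nil)
}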
Here's a discussion where I posted my solution: https://stackoverflow.com/a/65785097/549529.
I have a proof-of-concept HTTP server using Echo that takes a POST request with a JSON body. I am trying to stream the request body into multiple POST requests using pipes and a MultiWriter, but it is not working correctly.
In the example below I can see the data is sent to the two POST endpoints, and I can see a log from those requests, but I never get a response back; it seems the code hangs waiting for the http.Post(...) calls to complete.
If I call these two endpoints directly they work fine and give a valid JSON response, so I believe the problem is in this piece of code, which is my handler for the route.
func ImportAggregate(c echo.Context) error {
	oneR, oneW := io.Pipe()
	twoR, twoW := io.Pipe()
	done := make(chan bool, 2)
	go func() {
		fmt.Println("Product Starting")
		response, err := http.Post("http://localhost:1323/products/import", "application/json", oneR)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(response.Body)
		}
		done <- true
	}()
	go func() {
		fmt.Println("Import Starting")
		response, err := http.Post("http://localhost:1323/discounts/import", "application/json", twoR)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(response.Body)
		}
		done <- true
	}()
	mw := io.MultiWriter(oneW, twoW)
	io.Copy(mw, c.Request().Body)
	<-done
	<-done
	return c.String(200, "Imported")
}
The output in console is:
Product Starting
Import Starting
The issue in the OP's code is that the http.Post calls never detect the EOF of the provided io.Reader.
That happens because the write half of each pipe is never closed, so the read half never emits the regular EOF error.
As a note on the OP's comment that closing the read half of the pipe would generate irregular errors: one has to understand that reading from a closed pipe is not correct behavior.
So in this situation, care should be taken to close the write half right after the content has been copied.
The resulting source code should be changed to:
func ImportAggregate(c echo.Context) error {
	oneR, oneW := io.Pipe()
	twoR, twoW := io.Pipe()
	done := make(chan bool, 2)
	go func() {
		fmt.Println("Product Starting")
		response, err := http.Post("http://localhost:1323/products/import", "application/json", oneR)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(response.Body)
		}
		done <- true
	}()
	go func() {
		fmt.Println("Import Starting")
		response, err := http.Post("http://localhost:1323/discounts/import", "application/json", twoR)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(response.Body)
		}
		done <- true
	}()
	mw := io.MultiWriter(oneW, twoW)
	io.Copy(mw, c.Request().Body)
	oneW.Close()
	twoW.Close()
	<-done
	<-done
	return c.String(200, "Imported")
}
Side notes beyond the OP's question:
- an error check should be implemented around the io.Copy in order to detect a transmission error.
- it is not necessary to close the read half of the pipe; http.Post will do it after it receives the EOF signal.
- the goroutines responsible for consuming the pipes must be declared and started before the input request is copied. Pipes are synchronous, so the io.Copy would block until the other end of each pipe is being consumed.
- the done channel does not need to be buffered (with length 2); an unbuffered channel works as well, since the handler reads from it twice.
- a way to forward an error from the outgoing requests to the outgoing response is to use a channel of type chan error, loop over it two times, and keep the first error encountered (see the sketch below).
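A sketch of that last point applied to the handler above, replacing the done chan bool with a chan error (the 500 status for failures is just one possible choice):

	errc := make(chan error, 2)
	post := func(url string, body io.Reader) {
		resp, err := http.Post(url, "application/json", body)
		if err == nil {
			resp.Body.Close()
		}
		errc <- err
	}
	go post("http://localhost:1323/products/import", oneR)
	go post("http://localhost:1323/discounts/import", twoR)

	mw := io.MultiWriter(oneW, twoW)
	if _, err := io.Copy(mw, c.Request().Body); err != nil {
		oneW.CloseWithError(err) // propagate the copy error to both outgoing requests
		twoW.CloseWithError(err)
	} else {
		oneW.Close()
		twoW.Close()
	}

	var firstErr error
	for i := 0; i < 2; i++ {
		if err := <-errc; err != nil && firstErr == nil {
			firstErr = err
		}
	}
	if firstErr != nil {
		return c.String(500, firstErr.Error())
	}
	return c.String(200, "Imported")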
I want to write requests to a single file from an AJAX script. The problem arises when there are many of them per second and writing to the file takes longer than the gap between requests, or when two requests arrive at the same time.
How could I solve this?
I've come up with using a mutex, like:
var mu sync.Mutex

func writeToFile() {
	mu.Lock()
	defer mu.Unlock()
	// write to file
}
But this makes the whole thing synchronous, and I don't really know what happens when two requests arrive at the same time. And it still does not lock the file itself.
What's the proper way to do this?
You only need to make writing to the file "sequential", meaning you shouldn't allow 2 concurrent goroutines to write to the file. Yes, if you use locking in the writeToFile() function, serving your AJAX requests may become (partially) sequential too.
What I suggest is to open the file once, when your application starts, and designate a single goroutine which will be responsible for writing to the file; no other goroutines should do it directly.
Use a buffered channel to send the data that should be written to the file. This makes serving AJAX requests non-blocking, and the file will still not be written to concurrently / in parallel.
Note that this way AJAX requests won't even have to wait for the data to actually be written to the file (faster response time). This may or may not be a problem: if the write later fails, your AJAX response might already be committed => no chance to signal failure to the client.
Here is an example of how to do it:
var (
	f      *os.File
	datach = make(chan []byte, 100) // buffered channel
)

func init() {
	// Open the file for appending (create it if it doesn't exist)
	var err error
	f, err = os.OpenFile("data.txt", os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0666)
	if err != nil {
		panic(err)
	}
	// Start the goroutine which writes to the file
	go writeToFile()
}

func writeToFile() {
	// Loop through any data that needs to be written:
	for data := range datach {
		if _, err := f.Write(data); err != nil {
			// handle error!
		}
	}
	// We get here if datach is closed: shutdown
	f.Close()
}

func ajaxHandler(w http.ResponseWriter, r *http.Request) {
	// Assemble the data that needs to be written (appended) to the file
	data := []byte{1, 2, 3}
	// And send it:
	datach <- data
}
To exit gracefully from the app, you should close the datach channel: when it's closed, the loop in writeToFile() will terminate, and the file will be closed (flushing any cached data and releasing OS resources).
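A minimal shutdown sketch (the done channel and the shutdown function are additions to the example above, not part of it):

var done = make(chan struct{})

func writeToFile() {
	for data := range datach {
		if _, err := f.Write(data); err != nil {
			log.Println("write failed:", err)
		}
	}
	f.Close()   // datach was closed: flush and release the file
	close(done) // signal that everything has been written
}

func shutdown() {
	close(datach) // stop accepting new data
	<-done        // wait until the writer goroutine has drained the channel and closed the file
}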
If you want to write text to the file, you may declare the data channel like this:
var datach = make(chan string, 100) // Buffered channel
And you may use File.WriteString() to write it to the file:
if _, err := f.WriteString(data); err != nil {
	// handle error!
}
The problem is this: there is a web server. I figured it would be beneficial to use goroutines for page loading, so I went ahead and called the loadPage function as a goroutine. However, when doing this, the server simply stops working, without errors. It prints a blank, white page. The problem has to be in the function itself; something there is conflicting with the goroutine somehow.
These are the relevant functions:
func loadPage(w http.ResponseWriter, path string) {
	s := GetFileContent(path)
	w.Header().Add("Content-Type", getHeader(path))
	w.Header().Add("Content-Length", GetContentLength(path))
	fmt.Fprint(w, s)
}
func GetFileContent(path string) string {
	cont, err := ioutil.ReadFile(path)
	e(err)
	aob := len(cont)
	s := string(cont[:aob])
	return s
}
func getHeader(path string) string {
	images := []string{".jpg", ".jpeg", ".gif", ".png"}
	readable := []string{".htm", ".html", ".php", ".asp", ".js", ".css"}
	if ArrayContainsSuffix(images, path) {
		return "image/jpeg"
	}
	if ArrayContainsSuffix(readable, path) {
		return "text/html"
	}
	return "file/downloadable"
}
func ArrayContainsSuffix(arr []string, c string) bool {
	length := len(arr)
	for i := 0; i < length; i++ {
		s := arr[i]
		if strings.HasSuffix(c, s) {
			return true
		}
	}
	return false
}
The reason this happens is that your HandlerFunc, which calls loadPage, is invoked synchronously with the request. When you call loadPage in a goroutine, the handler actually returns immediately, causing the response to be sent immediately. That's why you get a blank page.
You can see this in server.go (line 1096):
serverHandler{c.server}.ServeHTTP(w, w.req)
if c.hijacked() {
	return
}
w.finishRequest()
The ServeHTTP function calls your handler, and as soon as it returns it calls "finishRequest". So your Handler function must block as long as it wants to fulfill the request.
Using a goroutine will actually not make your page load any faster. Synchronizing a single goroutine with a channel, as Philip suggests, will also not help you in this case, as that would be the same as not having the goroutine at all.
The root of your problem is actually ioutil.ReadFile, which buffers the entire file into memory before sending it.
If you want to stream the file you need to use os.Open. You can use io.Copy to stream the contents of the file to the browser, which will use chunked encoding.
That would look something like this:
f, err := os.Open(path)
if err != nil {
http.Error(w, "Not Found", http.StatusNotFound)
return
}
n, err := io.Copy(w, f)
if n == 0 && err != nil {
http.Error(w, "Error", http.StatusInternalServerError)
return
}
If for some reason you need to do work in multiple goroutines, take a look at sync.WaitGroup. Channels can also work.
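For completeness, a minimal sync.WaitGroup sketch (the loop body is just placeholder work):

	var wg sync.WaitGroup
	for _, p := range []string{"a.html", "b.html"} {
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			_ = GetFileContent(path) // placeholder; collecting real results needs a channel or a mutex-guarded slice
		}(p)
	}
	wg.Wait() // the handler must not return before this point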
If you are just trying to serve a file, there are other options that are optimized for this, such as http.FileServer or http.ServeFile.
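For example, the whole handler can usually be reduced to http.ServeFile (note it needs the *http.Request, which the question's loadPage does not currently take):

func loadPage(w http.ResponseWriter, r *http.Request, path string) {
	// ServeFile sets Content-Type and Content-Length, handles Range requests,
	// and streams the file without buffering it all in memory.
	http.ServeFile(w, r, path)
}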
In the typical web framework implementations in Go, the route handlers are invoked as goroutines, i.e. at some point the web framework will say go loadPage(...).
So if you call a goroutine from inside loadPage, you have two levels of goroutines.
The Go scheduler is really lazy and will not execute the second level if it's not forced to. So you need to enforce it through synchronization events, e.g. by using channels or the sync package. Example:
func loadPage(w http.ResponseWriter, path string) {
	s := make(chan string)
	go GetFileContent(path, s)
	fmt.Fprint(w, <-s)
}
The Go documentation says this:
"If the effects of a goroutine must be observed by another goroutine, use a synchronization mechanism such as a lock or channel communication to establish a relative ordering."
Why is this actually a smart thing to do? In larger projects you may deal with a large number of goroutines that need to be coordinated efficiently somehow. So why call a goroutine if its output is used nowhere? A fun fact: I/O operations like fmt.Printf trigger synchronization events too.