How to handle chunked file upload - go

I'm creating a simple application that lets users upload big files using simple-uploader, since this plugin sends the files in chunks instead of one big file. The problem is that when I save the file, only the first chunk gets saved. Is there a way in Go to wait for all the chunks to arrive on the server and then save the assembled file?
Here's a snippet of the code I'm doing:
dFile, err := c.FormFile("file")
if err != nil {
return SendError(c, err)
}
filename := dFile.Filename
f, err := dFile.Open()
if err != nil {
return SendError(c, err)
}
defer f.Close()
// save file in s3
duration := sss.UploadFile(f, "temp/"+filename")
... send response
By the way, for this project I'm using the Fiber framework.

While working on this I came across tus-js-client, which does the same thing as simple-uploader, and a Go server implementation called tusd, which reassembles the chunks for you so you don't have to worry about it anymore.
Here's a discussion where I posted my solution: https://stackoverflow.com/a/65785097/549529.
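If you do want to handle the reassembly yourself instead of switching to tusd, one option is to append each chunk to a temporary file and only push it to S3 once the last chunk has arrived. Here is a rough sketch for Fiber; the chunkNumber/totalChunks form fields are an assumption about what simple-uploader sends (check its configuration), it assumes chunks arrive in order, and SendError / sss.UploadFile are the helpers from the snippet above:
func handleChunk(c *fiber.Ctx) error {
    // These field names are an assumption; check what your uploader actually sends.
    chunkNumber, _ := strconv.Atoi(c.FormValue("chunkNumber"))
    totalChunks, _ := strconv.Atoi(c.FormValue("totalChunks"))

    dFile, err := c.FormFile("file")
    if err != nil {
        return SendError(c, err)
    }
    src, err := dFile.Open()
    if err != nil {
        return SendError(c, err)
    }
    defer src.Close()

    // Append this chunk to a temp file named after the upload.
    tmpPath := filepath.Join(os.TempDir(), dFile.Filename+".partial")
    dst, err := os.OpenFile(tmpPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
    if err != nil {
        return SendError(c, err)
    }
    if _, err := io.Copy(dst, src); err != nil {
        dst.Close()
        return SendError(c, err)
    }
    dst.Close()

    // Only push to S3 once the final chunk has been written.
    if chunkNumber == totalChunks {
        assembled, err := os.Open(tmpPath)
        if err != nil {
            return SendError(c, err)
        }
        defer assembled.Close()
        defer os.Remove(tmpPath)
        sss.UploadFile(assembled, "temp/"+dFile.Filename)
    }
    return c.SendStatus(fiber.StatusOK)
}
If the uploader sends chunks concurrently rather than sequentially, you would instead write each chunk to its own file and concatenate them once all of them have arrived.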

Related

Trying to detect if multiple files are present in multipart/form-data request, and rejecting multiple attachments

I am building an API using Go that needs to store a file that is sent in a multipart form request. It also needs to return an error if more than one file is attached, and the files do not have key values attached. I'm running into an issue where the Part for the Multipart Reader changes upon iteration. So I can either successfully upload the first file but not return the error, or it returns an error when needed, but when a valid request comes in - it iterates past it and uploads nothing.
I have written a couple for loops trying this, and some without.
i := 0
var data io.Reader
for part, err := reader.NextPart(); err != io.EOF; part, err = reader.NextPart() {
    i++
    data = part
}
if i > 1 {
    return nil, errors.New("too many files")
}
req := storeRequest{
    Data:     data,
    FileName: r.URL.Path,
}
return req, nil
Any suggestions on how I could handle this? Thanks in advance.
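One thing to keep in mind is that each Part is only readable until the next call to NextPart, so whatever you plan to store has to be buffered before you advance the reader. A rough sketch of that idea, reusing storeRequest and the field names from above (buffering the whole part into memory is my simplification and may not suit very large files):
var (
    buf   bytes.Buffer
    count int
)
for {
    part, err := reader.NextPart()
    if err == io.EOF {
        break
    }
    if err != nil {
        return nil, err
    }
    count++
    if count > 1 {
        return nil, errors.New("too many files")
    }
    // Copy the part now; it becomes unreadable once NextPart is called again.
    if _, err := io.Copy(&buf, part); err != nil {
        return nil, err
    }
}
if count == 0 {
    return nil, errors.New("no file attached")
}
req := storeRequest{
    Data:     &buf,
    FileName: r.URL.Path,
}
return req, nil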

When I io.Copy a file, it doesn't block and trying to use it might fail until it finishes

I have this code (more or less):
resp, err := http.Get(url)
if err != nil {
    // handle error
}
if resp.StatusCode != http.StatusOK {
    // handle error
}
out, err := os.Create(filepath)
if err != nil {
    return err
}
// Write the body to file
_, err = io.Copy(out, resp.Body)
resp.Body.Close()
out.Close()
My issue is that if I immediately try to do something (e.g. take the hash of this file), then I see that it is still copying for a while.
At first I was deferring the out.Close(), and I thought that I needed to call out.Close() after the io.Copy, which would block until it's done. Or so I thought.
This didn't work and I still have the same issue.
How do I block or wait for the io.Copy operation to finish?
Thanks!
Likely you are hitting some disk buffer/cache, where your OS or disk device keeps some data in memory before actually persisting the write to the disk.
Calling
out.Sync()
forces an fsync syscall, which instructs the OS to flush its buffers and write the data to disk. I suggest calling out.Sync() after the io.Copy call returns.
Related docs you may find interesting:
https://pkg.go.dev/os#File.Sync
https://man7.org/linux/man-pages/man2/fdatasync.2.html
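Putting that together, the download snippet with a Sync before the file is used elsewhere might look like this (same url and filepath variables as above, error handling condensed):
resp, err := http.Get(url)
if err != nil {
    return err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
    return fmt.Errorf("unexpected status: %s", resp.Status)
}

out, err := os.Create(filepath)
if err != nil {
    return err
}
defer out.Close()

if _, err := io.Copy(out, resp.Body); err != nil {
    return err
}
// Force the OS to flush its write cache to disk before the file is read back.
if err := out.Sync(); err != nil {
    return err
}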

Editing zip file in memory and returning it via http response results in a corrupt file

Hey guys, I am new to Go (exactly 23 hours and 10 minutes new), so obviously I'm having issues with some stuff. I have a zip file that is in memory, and I would like to take that file, make a copy of it, add some files to the copy, and return the file via HTTP. It works, but when I open the file it seems to be corrupted.
outFile, err := os.OpenFile("./template.zip", os.O_RDWR, 0666)
if err != nil {
    log.Fatalf("Failed to open zip for writing: %s", err)
}
defer outFile.Close()
zipw := zip.NewWriter(outFile)
fmt.Println(reflect.TypeOf(zipw))
for _, appCode := range appPageCodeText {
    f, err := zipw.Create(appCode.Name + ".jsx")
    if err != nil {
        log.Fatal(err)
    }
    _, err = f.Write([]byte(appCode.Content)) //casting it to byte array and writing to file
}
// Clean up
err = zipw.Close()
if err != nil {
    log.Fatal(err)
}
defer outFile.Close()
//Get the Content-Type of the file
//Create a buffer to store the header of the file in
FileHeader := make([]byte, 512)
//Copy the headers into the FileHeader buffer
outFile.Read(FileHeader)
//Get content type of file
fmt.Println(reflect.TypeOf(outFile))
//Get the file size
FileStat, _ := outFile.Stat()                      //Get info from file
FileSize := strconv.FormatInt(FileStat.Size(), 10) //Get file size as a string
buffer := make([]byte, FileStat.Size())
outFile.Read(buffer)
//Send the headers
w.Header().Set("Content-Disposition", "attachment; filename="+"template.zip")
w.Header().Set("Content-Type", "application/zip")
w.Header().Set("Content-Length", FileSize)
outFile.Seek(0, 0)
// io.Copy(w, buffer) //'Copy' the file to the client
w.Write(buffer)
(The primary problem): you Read the first 512 bytes of outFile into FileHeader, which means that they're not read into buffer, which means the first 512 bytes of the file aren't sent to the client. You do a Seek, but too late for it to be useful — the contents of buffer are already set at that point. You need to move the Seek earlier, or write both buffers, or just remove the unnecessary FileHeader read.
Your comment claims that you do so to get the content-type of the file, but FileHeader is actually never used. And why would it be? You know what the type of the file is, you just wrote it. So the separate read of the first 512 bytes is unneeded.
Actually, it's all unneeded — Instead of making a file on disk, using a zip.Writer to write to the file, re-opening the file from disk, reading it into a byte array, and then writing that byte array to the HTTP client, you could simply either have the zip.Writer write directly to the HTTP client (if you don't care about setting Content-Length), or have it write to a bytes.Buffer and then copy that buffer out to the HTTP client (if an accurate Content-Length is important to you).
The first version looks like:
w.Header().Set("Content-Disposition", "attachment; filename=template.zip")
w.Header().Set("Content-Type", "application/zip")
zipw := zip.NewWriter(w)
// Your for loop to add items to the zip goes here.
//
zipw.Close() // plus error handling
And the second version looks like:
buffer := &bytes.Buffer{}
zipw := zip.NewWriter(buffer)
// Your for loop to add items to the zip goes here.
//
zipw.Close() // plus error handling
w.Header().Set("Content-Disposition", "attachment; filename=template.zip")
w.Header().Set("Content-Type", "application/zip")
w.Header().Set("Content-Length", strconv.FormatInt(buffer.Length(), 10))
io.Copy(w, buffer) // plus error handling
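For completeness, here is roughly what the second version looks like assembled into a full handler, with the loop from the question dropped in (appPageCodeText and its fields are taken from the question's snippet):
func downloadTemplate(w http.ResponseWriter, r *http.Request) {
    buffer := &bytes.Buffer{}
    zipw := zip.NewWriter(buffer)

    // Add the generated .jsx files to the archive, as in the question.
    for _, appCode := range appPageCodeText {
        f, err := zipw.Create(appCode.Name + ".jsx")
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        if _, err := f.Write([]byte(appCode.Content)); err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
    }
    // Close flushes the central directory; the zip is incomplete without it.
    if err := zipw.Close(); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Disposition", "attachment; filename=template.zip")
    w.Header().Set("Content-Type", "application/zip")
    w.Header().Set("Content-Length", strconv.Itoa(buffer.Len()))
    io.Copy(w, buffer)
}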

How to handle errors in a goroutine

I have a service that uploads files to AWS S3. I tried uploading the file both with and without goroutines. Without a goroutine, the handler waits until the upload finishes before responding; with a goroutine, the upload runs in the background and the response gets back to the client faster.
But what if the upload fails when I use a goroutine and the file never makes it to AWS S3? How should I handle that?
Here is my function to upload a file:
func uploadToS3(s *session.Session, size int64, name string, buffer []byte) (string, error) {
    tempFileName := "pictures/" + bson.NewObjectId().Hex() + "-" + filepath.Base(name)
    _, err := s3.New(s).PutObject(&s3.PutObjectInput{
        Bucket:               aws.String("myBucketNameHere"),
        Key:                  aws.String(tempFileName),
        ACL:                  aws.String("public-read"),
        Body:                 bytes.NewReader(buffer),
        ContentLength:        aws.Int64(int64(size)),
        ContentType:          aws.String(http.DetectContentType(buffer)),
        ContentDisposition:   aws.String("attachment"),
        ServerSideEncryption: aws.String("AES256"),
        StorageClass:         aws.String("INTELLIGENT_TIERING"),
    })
    if err != nil {
        return "", err
    }
    return tempFileName, err
}

func UploadFile(db *gorm.DB) func(c *gin.Context) {
    return func(c *gin.Context) {
        file, err := c.FormFile("file")
        f, err := file.Open()
        if err != nil {
            fmt.Println(err)
        }
        defer f.Close()
        buffer := make([]byte, file.Size)
        _, _ = f.Read(buffer)
        s, err := session.NewSession(&aws.Config{
            Region: aws.String("location here"),
            Credentials: credentials.NewStaticCredentials(
                "id",
                "key",
                "",
            ),
        })
        if err != nil {
            fmt.Println(err)
        }
        go uploadToS3(s, file.Size, file.Filename, buffer)
        c.JSON(200, fmt.Sprintf("Image uploaded successfully"))
    }
}
I was also wondering: what if there are many upload requests, say 10,000+ per 5-10 minutes? Could some files fail to upload because of too many requests?
The problem is that when using a goroutine, you immediately return a success message to your client. If that's really what you want, it means your goroutine needs to be able to recover if the upload to S3 fails (so you don't lose the image). So either you take care of that, or you asynchronously inform your client that the upload failed, so the client can retry.
This question is too broad for a single answer. There are, broadly speaking, three possible approaches:
Wait for your goroutines to complete to handle any errors.
Ensure your goroutines can handle (or possibly ignore) any errors they encounter, such that returning an error never matters.
Have your goroutines log any errors, for handling later, possibly by a human, or possibly by some cleanup/retry function.
Which approach is best depends on the situation.
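To illustrate the first option, golang.org/x/sync/errgroup gives you the wait-and-collect-errors behaviour in a few lines. Here uploadToS3 and s are from the question, and uploads is a hypothetical slice of pending files:
g := new(errgroup.Group)
for _, u := range uploads { // uploads is a hypothetical slice of pending files
    u := u // capture the loop variable (needed before Go 1.22)
    g.Go(func() error {
        _, err := uploadToS3(s, u.Size, u.Name, u.Buffer)
        return err
    })
}
// Wait blocks until every goroutine has returned and yields the first non-nil error.
if err := g.Wait(); err != nil {
    log.Printf("at least one upload failed: %v", err)
}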
For any asynchronous task - such as uploading a file in a background goroutine - one can write the uploading function so that it returns a chan error to the caller. The caller can then react to the upload's eventual error (or nil for no error) at a later time by reading from the chan error.
However, if you are accepting upload requests, I'd suggest instead creating a worker upload goroutine that accepts file uploads via a channel. An output "error" channel can track success/failure. And if need be, a failed upload could be written back onto the original upload channel (including a retry tally and a retry max, so a problematic payload does not loop forever).
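A rough sketch of that worker pattern (the uploadJob type, channel setup, and retry limit are assumptions; uploadToS3 is the function from the question):
type uploadJob struct {
    Name    string
    Size    int64
    Buffer  []byte
    Retries int
}

const maxRetries = 3

// uploadWorker consumes jobs from the channel and retries failed uploads a
// bounded number of times. The handler sends to the channel instead of
// calling `go uploadToS3(...)` directly, e.g. jobs <- uploadJob{Name: file.Filename, ...}
func uploadWorker(s *session.Session, jobs chan uploadJob, failures chan<- error) {
    for job := range jobs {
        _, err := uploadToS3(s, job.Size, job.Name, job.Buffer)
        if err == nil {
            continue
        }
        job.Retries++
        if job.Retries <= maxRetries {
            select {
            case jobs <- job: // requeue; non-blocking so the worker can't deadlock on its own queue
                continue
            default:
            }
        }
        failures <- fmt.Errorf("upload of %s failed after %d attempts: %w", job.Name, job.Retries, err)
    }
}
Funneling every upload through one (or a small pool of) worker(s) also bounds how many uploads run at once, which addresses the concern about traffic spikes of 10,000+ requests.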

Encoding a file to send to Google AutoML

I am writing a golang script to send an image to the prediction engine of Google AutoML API.
It accepts most files using the code below, but for certain .jpg or .jpeg files it returns error 500 saying the file is invalid. Mostly it works, but I can't figure out the exceptions. They are perfectly valid JPEGs.
I am encoding the payload using EncodeToString.
Among other things, I have tried decoding it, saving it to a PNG, nothing seems to work. It doesn't like some images.
I wonder if I have an error in my method? Any help would be really appreciated. Thanks
PS: the file saves to the filesystem and uploads to S3 just fine. It's just the encoding to a string when it goes to Google that fails.
imgFile, err := os.Open(filename)
if err != nil {
    fmt.Println(err)
}
img, fname, err := image.Decode(imgFile)
if err != nil {
    fmt.Println(fname)
}
buf := new(bytes.Buffer)
err = jpeg.Encode(buf, img, nil)
// Encode as base64.
imgBase64Str := base64.StdEncoding.EncodeToString(buf.Bytes())
defer imgFile.Close()
payload := fmt.Sprintf(`{"payload": {"image": {"imageBytes": "%v"},}}`, imgBase64Str)
// send as a byte
pay := bytes.NewBuffer([]byte(payload))
req, err := http.NewRequest(http.MethodPost, URL.String(), pay)
I believe I fixed it.
I looked in the Google docs again, and for Speech-to-Text (which is a different API) it says to encode with base64 -w 0.
So, looking in the Go docs, it seems RawStdEncoding is the right one to use to replicate this behaviour, not StdEncoding.
No image failures yet. Hope this helps someone else one day.
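In the snippet from the question, that means swapping the encoder on the line that builds imgBase64Str; a minimal sketch of just that change (RawStdEncoding is the unpadded variant of standard base64):
// Before: padded standard base64.
// imgBase64Str := base64.StdEncoding.EncodeToString(buf.Bytes())

// After: the unpadded variant suggested above.
imgBase64Str := base64.RawStdEncoding.EncodeToString(buf.Bytes())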
