How to handle large file uploads with low memory footprint? - go

I'm writing an application in Go where a user can upload a file that is eventually uploaded to an Amazon S3 bucket. I've written an endpoint through which the user can upload a file as multipart form data. Once the file is uploaded, I upload it to the S3 bucket.
func UploadRoutes(route *gin.Engine) {
    route.POST("/upload", uploadHandler)
}

func uploadHandler(context *gin.Context) {
    fileHeader, err := context.FormFile("file")
    // check err
    file, err := fileHeader.Open()
    // check err
    // uploads to S3 bucket
    err = utils.Upload(file, fileHeader.Filename)
}
But I am not sure where the uploaded file data is stored between the two uploads. It seems the file would be in memory after the upload from the user is completed and before it is uploaded to the S3 bucket (Reference: https://pkg.go.dev/mime/multipart#File).
If that's the case, large file uploads would consume too much server memory. As a workaround, I can think of writing the file to disk and then initiating a multipart upload to the S3 bucket. Are there better alternatives?
Here is a related question, but I think I'm already using multipart upload as suggested in the answer: AWS S3 uploading/downloading huge files with low memory footprint

You can use s3manager's Upload: open the multipart file header and pass the resulting reader directly to S3, instead of storing the file on the server before uploading.
func uploadHandler(context *gin.Context) {
    fileHeader, err := context.FormFile("file")
    // check err
    file, err := fileHeader.Open()
    // check err
    config := &aws.Config{
        Region: aws.String("us-west-1"), // your region here
    }
    awsSession := session.Must(session.NewSession(config))
    uploader := s3manager.NewUploader(awsSession)
    result, err := uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucketname),
        Key:    aws.String(filePath),
        Body:   file, // pass the opened multipart file (an io.Reader) here
    })
    // check err; result.Location holds the URL of the uploaded object
}
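Note that context.FormFile asks net/http to parse the whole form first (buffering up to 32 MB in memory and spilling the rest to temp files), so the body is still materialized server-side before the S3 upload starts. Below is a minimal sketch of a fully streaming variant that reads the multipart body part by part with the standard library's MultipartReader; it assumes the same session/uploader setup as above, an "io" import, and the same "// check err" error handling style:

func streamingUploadHandler(context *gin.Context) {
    // Read the multipart body directly instead of letting net/http
    // buffer it via ParseMultipartForm.
    reader, err := context.Request.MultipartReader()
    // check err
    for {
        part, err := reader.NextPart()
        if err == io.EOF {
            break
        }
        // check err
        if part.FormName() != "file" {
            continue
        }
        // part is an io.Reader; s3manager consumes it in chunks and
        // performs a multipart upload to S3 without a full local copy.
        _, err = uploader.Upload(&s3manager.UploadInput{
            Bucket: aws.String(bucketname),
            Key:    aws.String(part.FileName()),
            Body:   part,
        })
        // check err
    }
}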

Related

Google Translation API WithAPIKey option error

I am building a multilingual SaaS website builder in Golang which runs per client. Each client can have their own website and can translate it into the desired language.
Since the feature is per client, I collect an API key from each client, which I use to translate their site content.
Here is the code:
V2
package main

import (
    "context"
    "fmt"

    "cloud.google.com/go/translate"
    "golang.org/x/text/language"
    "google.golang.org/api/option"
)

func main() {
    translationStrings := []string{"hello"}
    ctx := context.Background()
    opts := option.WithAPIKey(APIKEY)
    c, err := translate.NewClient(ctx, opts)
    if err != nil {
        fmt.Println(err)
    }
    defer c.Close()
    resp, err := c.Translate(ctx, translationStrings, language.French,
        &translate.Options{
            Source: language.English,
            Format: translate.Text,
        })
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println(resp)
}
V3
translate "cloud.google.com/go/translate/apiv3"
translatepb "google.golang.org/genproto/googleapis/cloud/translate/v3"
c, err := translate.NewTranslationClient(ctx, opts)
if err != nil {
fmt.Println(err)
}
defer c.Close()
req := &translatepb.TranslateTextRequest{
Contents: translationStrings,
TargetLanguageCode: "sr-Latn",
}
resp, err := c.TranslateText(ctx, req)
The code with V2 works well, but the same code with V3 does not. It gives this error:
API keys are not supported for gRPC APIs. Remove the WithAPIKey option from your client-creating call.
As stated in the error, it is asking me to remove the WithAPIKey option. But if I remove it, how will I use the API key for each client?
I have chosen to work with the V3 APIs because the site will be translated as a whole, so it will be a large request. I have read in the docs that the V3 API can work in batches.
So my questions are:
How can I use a per-client API key structure with the V3 API?
Is it okay to go with the V2 API for the purpose stated above?
Cloud Translation API v3 does not currently support API keys. It is recommended that you create a service account for Cloud Translation API v3 requests. For information on creating a service account, see Creating and managing service accounts. Your service account must be added to one of the IAM roles added for Cloud Translation API v3.
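A minimal sketch of that recommended approach, assuming each client supplies a service account key file (the file path here is hypothetical) instead of an API key; note that v3 also requires a Parent resource on the request:

package main

import (
    "context"
    "fmt"

    translate "cloud.google.com/go/translate/apiv3"
    "google.golang.org/api/option"
    translatepb "google.golang.org/genproto/googleapis/cloud/translate/v3"
)

func main() {
    ctx := context.Background()
    // Hypothetical per-client key file collected at onboarding.
    c, err := translate.NewTranslationClient(ctx,
        option.WithCredentialsFile("client-service-account.json"))
    if err != nil {
        fmt.Println(err)
        return
    }
    defer c.Close()
    req := &translatepb.TranslateTextRequest{
        // v3 requires a parent resource; use the client's project ID.
        Parent:             "projects/your-project-id/locations/global",
        Contents:           []string{"hello"},
        TargetLanguageCode: "fr",
    }
    resp, err := c.TranslateText(ctx, req)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(resp.GetTranslations())
}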

Creating V4 Signed URLs in CloudRun

I'd like to create signed URLs to Google Cloud Storage resources from an app deployed on Cloud Run.
I set up Cloud Run with a custom service account with the GCS role, following this guide.
My intent was to use V4 signing to create signed URLs from Cloud Run. There is a guide for this use case in which a service_account.json file is used to generate the JWT config. This works for me on localhost when I download the file from Google's IAM. I'd like to avoid committing this file to the repository and instead use the one that I provided in the Cloud Run UI.
I was hoping that Cloud Run injects this service account file into the app container and makes it accessible via the GOOGLE_APPLICATION_CREDENTIALS variable, but that's not the case.
Do you have a recommendation on how to do this? Thank you.
As you say, the Golang Storage Client Libraries require a service account JSON file to sign URLs.
There is currently a feature request open on GitHub for this, but you should be able to work around it with this sample that I found here:
import (
    "context"
    "fmt"
    "time"

    credentials "cloud.google.com/go/iam/credentials/apiv1"
    "cloud.google.com/go/storage"
    credentialspb "google.golang.org/genproto/googleapis/iam/credentials/v1"
)

const (
    bucketName     = "bucket-name"
    objectName     = "object"
    serviceAccount = "[PROJECTNUMBER]-compute@developer.gserviceaccount.com"
)

func main() {
    ctx := context.Background()
    c, err := credentials.NewIamCredentialsClient(ctx)
    if err != nil {
        panic(err)
    }
    opts := &storage.SignedURLOptions{
        Method:         "GET",
        GoogleAccessID: serviceAccount,
        SignBytes: func(b []byte) ([]byte, error) {
            req := &credentialspb.SignBlobRequest{
                Payload: b,
                // The IAM Credentials API expects the full resource name,
                // not just the bare service account email.
                Name: "projects/-/serviceAccounts/" + serviceAccount,
            }
            resp, err := c.SignBlob(ctx, req)
            if err != nil {
                panic(err)
            }
            return resp.SignedBlob, err
        },
        Expires: time.Now().Add(15 * time.Minute),
    }
    u, err := storage.SignedURL(bucketName, objectName, opts)
    if err != nil {
        panic(err)
    }
    fmt.Printf("\"%v\"", u)
}
Cloud Run (and other compute platforms) does not inject a service account key file. Instead, it makes access tokens available via the instance metadata service. You can then exchange this access token for a JWT.
However, oftentimes Google's client libraries and gcloud work out of the box on GCP's compute platforms without explicitly needing to authenticate. So if you follow the instructions on the page you linked (gcloud or code samples), it should work out of the box.
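For the sample above, this also means the hard-coded service account email can be discovered at runtime instead. A minimal sketch, assuming the cloud.google.com/go/compute/metadata package:

import "cloud.google.com/go/compute/metadata"

// On Cloud Run the email of the service account attached to the service
// can be read from the instance metadata server.
saEmail, err := metadata.Email("default")
if err != nil {
    // Not running on GCP; fall back to a key file for local development.
}
// Use saEmail as the GoogleAccessID and in the SignBlobRequest name above.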

How do I make a Go fasthttp server trigger the handler faster on a file upload?

I use fasthttp for a file server project. The file server has an upload function. For uploading files, I pass a key as a URL query parameter to validate the permission to upload.
The main():
// start http server
log.Printf("Starting HTTP server on %q", listento)
go func() {
    h := &fasthttp.Server{
        Handler:            requestHandler,
        MaxRequestBodySize: 2 * 1024 * 1024 * 1024,
    }
    if err := h.ListenAndServe(listento); err != nil {
        log.Panicf("error in ListenAndServe: %s", err)
    }
}()
The requestHandler function:
func requestHandler(ctx *fasthttp.RequestCtx) {
    switch string(ctx.Path()) {
    case "/uploadx":
        log.Println("Upload: [" + ctx.RemoteIP().String() + "] [" + string(ctx.Path()) + "]")
    }
}
I upload a big file, and unfortunately the requestHandler only gets triggered once the file upload has completed.
But it should be triggered at the start of the upload process, because I want to prevent someone from uploading a 500 MB file without a permission check first.
Is there any way to make the requestHandler trigger earlier, at the start of the upload process?
The server itself receives the first part of the HTTP request, so the big question is: why does fasthttp trigger requestHandler so late?
I have now tried net/http:
mux.HandleFunc("/upload", uploadFile)

func uploadFile(w http.ResponseWriter, r *http.Request) {
    fmt.Println("File Upload Endpoint Hit")
    fmt.Println(r)
}
With net/http, I receive "File Upload Endpoint Hit" right at the start of the file upload, exactly as required, but I would really prefer to use fasthttp.
Am I doing something wrong?
Thanks
For big file uploads, browsers send an Expect: 100-continue header to ask the server whether it's OK for them to continue with the upload. You can use https://godoc.org/github.com/valyala/fasthttp#Server.ContinueHandler to check permissions and either allow or reject the upload.
fasthttp always reads the full request before calling a handler. This allows for a more performant API with fewer allocations.
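A minimal sketch of wiring this up, with a hypothetical hasUploadPermission helper that inspects the key passed in the URL query (the ContinueHandler receives only the request header, before any body bytes arrive):

h := &fasthttp.Server{
    Handler:            requestHandler,
    MaxRequestBodySize: 2 * 1024 * 1024 * 1024,
    // Called when a client sends "Expect: 100-continue", before it
    // transmits the body. Returning false rejects the request, so the
    // large body is never received.
    ContinueHandler: func(header *fasthttp.RequestHeader) bool {
        return hasUploadPermission(header)
    },
}

// hasUploadPermission is a hypothetical helper; here it only checks that a
// "key=" parameter is present in the request URI ("bytes" import assumed).
// A real implementation would validate the key's value.
func hasUploadPermission(header *fasthttp.RequestHeader) bool {
    return bytes.Contains(header.RequestURI(), []byte("key="))
}

Note that this only helps with clients that actually send Expect: 100-continue (browsers do for large uploads); a client that skips the header still delivers the full body before the handler runs.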

Golang HTTP uploading file to S3 using tusd only uploading metadata

I am using the tusd library to upload a file directly to S3 in Go. It seems to be functioning; however, tusd uploads two files: a .info metadata file and a .bin file with the actual content. For some reason my code is only uploading the .info file.
The documentation is quite tricky to navigate, so perhaps I have missed a setting somewhere.
Code as gist to show both the server and the client code.
There are multiple issues here.
Your tus library import paths are wrong; they should be:
"github.com/tus/tusd/pkg/handler"
"github.com/tus/tusd/pkg/s3store"
You don't use the S3 store properly; you set up a configuration that stores files directly on your server:
fStore := filestore.FileStore{
    Path: "./uploads",
}
Instead, it should be something like this:
// S3 access configuration
s3Config := &aws.Config{
    Region:           aws.String(os.Getenv("AWS_REGION")),
    Credentials:      credentials.NewStaticCredentials(os.Getenv("AWS_ACCESS_KEY_ID"), os.Getenv("AWS_SECRET_ACCESS_KEY"), ""),
    DisableSSL:       aws.Bool(true),
    S3ForcePathStyle: aws.Bool(true),
}

// Setting up the S3 storage
s3Store := s3store.New(os.Getenv("AWS_BUCKET_NAME"), s3.New(session.Must(session.NewSession()), s3Config))

// Creates a new and empty store composer
composer := handler.NewStoreComposer()

// UseIn sets this store as the core data store in the passed composer and adds all possible extensions to it.
s3Store.UseIn(composer)

// Setting up the handler
handler, err := handler.NewHandler(handler.Config{
    BasePath:      "/files/",
    StoreComposer: composer,
})
if err != nil {
    panic(fmt.Errorf("unable to create handler: %s", err))
}

// Listen and serve
http.Handle("/files/", http.StripPrefix("/files/", handler))
err = http.ListenAndServe(":8080", nil)
if err != nil {
    panic(fmt.Errorf("unable to listen: %s", err))
}
It is possible that your client isn't working properly either (I didn't test it).
I would recommend using https://github.com/eventials/go-tus instead of trying to implement the protocol yourself.
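A minimal client sketch with go-tus, following its README; the server URL matches the tusd handler configured above, and the file name is a placeholder:

package main

import (
    "os"

    "github.com/eventials/go-tus"
)

func main() {
    f, err := os.Open("my-file.bin")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Point the client at the tusd endpoint configured above.
    client, err := tus.NewClient("http://localhost:8080/files/", nil)
    if err != nil {
        panic(err)
    }
    // CreateUpload registers the upload with the server; Upload then
    // sends the file in resumable chunks.
    upload, err := tus.NewUploadFromFile(f)
    if err != nil {
        panic(err)
    }
    uploader, err := client.CreateUpload(upload)
    if err != nil {
        panic(err)
    }
    if err := uploader.Upload(); err != nil {
        panic(err)
    }
}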

How to get my image (base64) in Google-Cloud-Storage with go script

I have been looking for an example GAE script in Go to get the image that I obtained from the PageSpeed Insights screenshot and saved as a json_decode object to Google Cloud Storage (GCS) using Kohana/Cache.
The reason for this method is simply that I found this Kohana model to be the most convenient way of writing files to GCS, although I am also looking at other ways, like this one, to write files to GCS using Blobstore and serve them from there, since the Go Files API has been deprecated as documented here.
Here is the form of the stored object containing the screenshot image data (base64), saved as public in the default application bucket under the object name images/thumb/mythumb.jpg:
stdClass Object
(
    [screenshot] => stdClass Object
        (
            [data] => _9j_4AAQSkZJRgABAQAAAQABAAD_...= // base64 data
            [height] => 240
            [mime_type] => image/jpeg
            [width] => 320
        )
    [otherdata] => Array
        (
            [..] => ..
            [..] => ..
        )
)
I want to serve this image, which is set as public, through my customized URL below, processed by the Go module; I also need it to expire after a certain time, because I update the image content itself regularly:
http://myappId.appspot.com/image/thumb/mythumb.jpg
I have set dispatch.yaml to send all image requests to my Go module as below:
- url: "*/images/*"
  module: go
and set the handler in go.yaml to process the image requests as below:
handlers:
- url: /images/thumb/.*
  script: _go_app
- url: /images
  static_dir: images
With these directives, all /images/ requests (other than /images/thumb/ requests) serve images from the static directory, while /images/thumb/mythumb.jpg goes to the module application.
What's left is what code I have to use (see ????) in my application file named thumb.go below:
package thumb

import (
    // what to import
    ????
    ????
)

const (
    googleAccessID            = "<serviceAccountEmail>@developer.gserviceaccount.com"
    serviceAccountPEMFilename = "YOUR_SERVICE_ACCOUNT_KEY.pem"
    bucket                    = "myappId.appspot.com"
)

var (
    expiration = time.Now().Add(time.Second * 60) // expire in 60 seconds
)

func init() {
    http.HandleFunc("/images/thumb/", handleThumb)
}

func handleThumb(w http.ResponseWriter, r *http.Request) {
    ctx := cloud.NewContext(appengine.AppID(c), hc)
    ???? // what code to get the string of 'mythumb.jpg' from the url
    ???? // what code to get the image stored data from GCS
    ???? // what code to encode the base64 data
    w.Header().Set("Content-Type", "image/jpeg;")
    fmt.Fprintf(w, "%v", mythumb.jpg)
}
I have taken code from many examples like this, this, or this but could not get one to work so far. I have also tried a sample from this, which is almost close to my case, but had no luck either.
Generally, the problem was mainly not knowing the correct code to put on the lines I marked with ????, as well as the relevant libraries or paths to import. I have also checked the GCS permissions in case something was missing, as described here and here.
I shall thank you very much for your help and advice.
From what I've read in your description, it seems that the only relevant parts are the ???? lines in the actual Go code. Let me know if that's not the case.
First ????: "what code to get the string of 'mythumb.jpg' from url"?
From reading the code, you're looking to extract mythumb.jpg from a URL like http://localhost/images/thumb/mythumb.jpg. A working example is available in the Writing Web Applications tutorial:
package main

import (
    "fmt"
    "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8080", nil)
}
Such that
http://localhost:8080/monkeys
Prints
Hi there, I love monkeys!
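Applied to the handler in the question, a minimal sketch using strings.TrimPrefix with the /images/thumb/ prefix from the yaml above:

// import "strings"
func handleThumb(w http.ResponseWriter, r *http.Request) {
    // "/images/thumb/mythumb.jpg" -> "mythumb.jpg"
    objectName := strings.TrimPrefix(r.URL.Path, "/images/thumb/")
    _ = objectName // used in the next step to fetch the object from GCS
}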
Second ????: "what code to get the image stored data from GCS"?
The API method you're probably looking to use is storage.objects.get.
You did link to one of the JSON API Go Examples for Google Cloud Storage, which is a good general reference, but is not related to the problem you're trying to solve. That particular example is put together for Client-side applications (hence the redirectURL = "urn:ietf:wg:oauth:2.0:oob" line). Additionally, this sample uses deprecated/out-of-date oauth2 and storage packages.
One of the cleanest (and non-deprecated) ways to do this for an application which wants to access its own buckets on behalf of itself would be to use the golang/oauth2 and Google APIs Client Library for Go packages.
An example of how to authenticate with JSON Web Token auth with the golang/oauth2 package is available in the repo:
func ExampleJWTConfig() {
    conf := &jwt.Config{
        Email: "xxx@developer.com",
        // The contents of your RSA private key or your PEM file
        // that contains a private key.
        // If you have a p12 file instead, you
        // can use `openssl` to export the private key into a pem file.
        //
        //    $ openssl pkcs12 -in key.p12 -out key.pem -nodes
        //
        // It only supports PEM containers with no passphrase.
        PrivateKey: []byte("-----BEGIN RSA PRIVATE KEY-----..."),
        Subject:    "user@example.com",
        TokenURL:   "https://provider.com/o/oauth2/token",
    }
    // Initiate an http.Client; the following GET request will be
    // authorized and authenticated on behalf of user@example.com.
    client := conf.Client(oauth2.NoContext)
    client.Get("...")
}
Next, instead of using the oauth2 client directly, use that client with the Google APIs Client Library for Go mentioned earlier:
service, err := storage.New(client)
if err != nil {
    fatalf(service, "Failed to create service %v", err)
}
Notice the similarity to the out-of-date JSON API Go Examples?
In your handler, you'll want to go out and get the related object using func ObjectsService.Get. Assuming that you know the name of the object and bucket, that is.
Straight from the previous example, you can use code similar to what's below to retrieve the download link:
if res, err := service.Objects.Get(bucketName, objectName).Do(); err == nil {
    fmt.Printf("The media download link for %v/%v is %v.\n\n", bucketName, res.Name, res.MediaLink)
} else {
    fatalf(service, "Failed to get %s/%s: %s.", bucketName, objectName, err)
}
Then, fetch the file, or do whatever you want with it. Full example:
package main

import (
    "fmt"
    "log"

    "golang.org/x/oauth2"
    "golang.org/x/oauth2/jwt"
    "google.golang.org/api/storage/v1"
)

const (
    bucketName = "YOUR_BUCKET_NAME"
    objectName = "mythumb.jpg"
)

func main() {
    conf := &jwt.Config{
        Email:      "xxx@developer.com",
        PrivateKey: []byte("-----BEGIN RSA PRIVATE KEY-----..."),
        Subject:    "user@example.com",
        TokenURL:   "https://provider.com/o/oauth2/token",
    }
    client := conf.Client(oauth2.NoContext)
    service, err := storage.New(client)
    if err != nil {
        log.Fatalf("Failed to create service %v", err)
    }
    if res, err := service.Objects.Get(bucketName, objectName).Do(); err == nil {
        fmt.Printf("The media download link for %v/%v is %v.\n\n", bucketName, res.Name, res.MediaLink)
    } else {
        log.Fatalf("Failed to get %s/%s: %s.", bucketName, objectName, err)
    }
    // Go fetch the file, etc.
}
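A minimal sketch of that last step, reusing the authenticated client from the example to download the object bytes via its media link ("io/ioutil" import assumed):

res, err := service.Objects.Get(bucketName, objectName).Do()
if err != nil {
    log.Fatalf("Failed to get %s/%s: %s.", bucketName, objectName, err)
}
// The oauth2-backed client authenticates the media download as well.
resp, err := client.Get(res.MediaLink)
if err != nil {
    log.Fatalf("Failed to download %s: %v", res.MediaLink, err)
}
defer resp.Body.Close()
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
    log.Fatalf("Failed to read body: %v", err)
}
log.Printf("Downloaded %d bytes", len(data))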
Third ????: "what code to encode the base64 data"?
Pretty simple with the encoding/base64 package. So simple that they've included an example:
package main

import (
    "encoding/base64"
    "fmt"
)

func main() {
    data := []byte("any + old & data")
    str := base64.StdEncoding.EncodeToString(data)
    fmt.Println(str)
}
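For the screenshot in the question, decoding is the step that's actually needed, and its data field contains '-' and '_' characters, i.e. the URL-safe base64 alphabet, so base64.URLEncoding would be the matching decoder. A minimal sketch under that assumption:

package main

import (
    "encoding/base64"
    "fmt"
)

func main() {
    // PageSpeed screenshot data uses the URL-safe base64 alphabet
    // ('-' and '_' instead of '+' and '/'), so URLEncoding matches it.
    img, err := base64.URLEncoding.DecodeString("aGVsbG8gd29ybGQ=")
    if err != nil {
        fmt.Println(err)
        return
    }
    // img holds the raw bytes, ready to serve with Content-Type: image/jpeg.
    fmt.Println(len(img))
}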
Hope that helps.
