I've been trying to use multipart.Part to help read very large file uploads (>20GB) from HTTP - so I've written the below code which seems to work nicely:
func ReceiveMultipartRoute(w http.ResponseWriter, r *http.Request) {
mediatype, p, err := mime.ParseMediaType(r.Header.Get("Content-Type"))
if err != nil {
//...
}
if mediatype != "multipart/form-data" {
//...
}
boundary := p["boundary"]
reader := multipart.NewReader(r.Body, boundary)
buffer := make([]byte, 8192)
for {
part, err := reader.NextPart()
if err != nil {
// ...
}
f, err := os.CreateTemp("", part.FileName())
if err != nil {
// ...
}
for {
numBytesRead, err := part.Read(buffer)
// People say not to read if there's an err, but then I miss the last chunk?
f.Write(buffer[:numBytesRead])
if err != nil {
if err == io.EOF {
break
} else {
// error, abort ...
return
}
}
}
}
}
However, in the innermost for loop, I found that I have to use the result of part.Read before checking for EOF, because I miss the last chunk if I check first and break. Yet many articles and posts check for errors/EOF first and break without using the last read. Am I using multipart.Part.Read() incorrectly, and is my approach safe?
You are using multipart.Part correctly.
multipart.Part is a particular implementation of io.Reader. Accordingly, you should be guided by the conventions and follow the recommendations for io.Reader. Quote from the documentation:
Callers should always process the n > 0 bytes returned before considering the error err. Doing so correctly handles I/O errors that happen after reading some bytes and also both of the allowed EOF behaviors.
Also note that your example copies data from an io.Reader to an *os.File. *os.File implements the io.ReaderFrom interface, so you can use the File.ReadFrom() method to copy the data.
_, err := file.ReadFrom(part)
// non io.EOF
if err != nil {
return fmt.Errorf("copy data: %w", err)
}
If you need to use a buffer, you can use the io.CopyBuffer() function. Note that you need to hide the io.ReaderFrom implementation, otherwise the buffer will not be used to perform the copy.
_, err := io.CopyBuffer(writeFunc(file.Write), part, buffer)
// non io.EOF
if err != nil {
return fmt.Errorf("copy data: %w", err)
}
type writeFunc func([]byte) (int, error)
func (write writeFunc) Write(data []byte) (int, error) {
return write(data)
}
When I debug in the following code, sometimes it can read data from the body correctly but with EOF error.
func (r *trailerReader) Read(b []byte) (int, error) {
n, err := r.resp.Body.Read(b)
if err != nil {
if e := r.resp.Trailer.Get("X-Stream-Error"); e != "" {
err = errors.New(e)
}
}
return n, err
}
I called this method in my code:
// FilesRead read a file in a given MFS
func (s *Shell) FilesRead(ctx context.Context, path string, options ...FilesOpt) (io.ReadCloser, error) {
rb := s.Request("files/read", path)
for _, opt := range options {
if err := opt(rb); err != nil {
return nil, err
}
}
resp, err := rb.Send(ctx)
if err != nil {
return nil, err
}
if resp.Error != nil {
return nil, resp.Error
}
return resp.Output, nil
}
Any thoughts?
As Stebalien said in this GitHub issue, this is the expected behavior of a Go Reader.
Refer to the third paragraph of the io.Reader documentation:
When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. An instance of this general case is that a Reader returning a non-zero number of bytes at the end of the input stream may return either err == EOF or err == nil. The next Read should return 0, EOF.
The file that is written to disk is empty, but the reader is not.
I do not understand where the issue is.
I tried to play with a Buffer and then String() method and I can confirm that the content is fine, but using the Read() method of this library is not working.
The library I use is github.com/jlaffaye/ftp
// pullFileByFTP
func pullFileByFTP(fileID, server string, port int64, username, password, path, file string) error {
// Connect to the server
client, err := ftp.Dial(fmt.Sprintf("%s:%d", server, port))
if err != nil {
return err
}
// Log in the server
err = client.Login(username, password)
if err != nil {
return err
}
// Retrieve the file
reader, err := client.Retr(fmt.Sprintf("%s%s", path, file))
if err != nil {
return err
}
// Read the file
var srcFile []byte
_, err = reader.Read(srcFile)
if err != nil {
return err
}
// Create the destination file
dstFile, err := os.Create(fmt.Sprintf("%s/%s", shared.TmpDir, fileID))
if err != nil {
return fmt.Errorf("Error while creating the destination file : %s", err)
}
defer dstFile.Close()
// Copy the file
dstFile.Write(srcFile)
return nil
}
You are using Read and Write wrong:
var srcFile []byte
_, err = reader.Read(srcFile)
Read puts the read bytes into its argument. Since srcFile is a nil slice, this instructs the reader to read zero bytes. Use ioutil.ReadAll (io.ReadAll since Go 1.16) to read all bytes.
Next up is your use of Write. Write(b) can write fewer than len(b) bytes, in which case it returns a non-nil error. You must check the return values instead of discarding them.
However, in your case you just want to connect an io.Reader (*Response implements io.Reader) and io.Writer (*os.File). That's what io.Copy is for:
reader, err := client.Retr(path + file)
// handle err
defer reader.Close()
dstFile, err := ioutil.TempFile("", fileID)
// handle err
_, err = io.Copy(dstFile, reader) // plain =, err is already declared
// handle err
err = dstFile.Close()
I'm trying to process a multipart file upload in small chunks to avoid storing the entire file in memory. The following function seems to solve this, however when passing a []byte as the destination for the part.Read() method, it reads the part in chunks of 4096 bytes instead of in chunks of the destination size (len([]byte)).
When opening a local file and Read()'ing it into a []byte of the same size, it uses the entire space available as expected. Thus I think it's something specific to part.Read(). However, I'm unable to find anything about a default or max size for that method.
For reference, the function is as follows:
func ReceiveFile(w http.ResponseWriter, r *http.Request) {
reader, err := r.MultipartReader()
if err != nil {
panic(err)
}
if reader == nil {
panic("Wrong media type")
}
buf := make([]byte, 16384)
fmt.Println(len(buf))
for {
part, err := reader.NextPart()
if err == io.EOF {
break
}
if err != nil {
panic(err)
}
var n int
for {
n, err = part.Read(buf)
if err == io.EOF {
break
}
if err != nil {
panic(err)
}
fmt.Printf("Read %d bytes into buf\n", n)
fmt.Println(len(buf))
}
n, err = part.Read(buf)
fmt.Printf("Finally read %d bytes into buf\n", n)
fmt.Println(len(buf))
}
}
The part reader does not attempt to fill the caller's buffer, and the io.Reader contract allows that: a Read may return fewer than len(p) bytes.
The best way to handle this depends on the requirements of the application.
If you want to slurp the part into memory, then use ioutil.ReadAll:
for {
part, err := reader.NextPart()
if err == io.EOF {
break
}
if err != nil {
// handle error
}
p, err := ioutil.ReadAll(part)
if err != nil {
// handle error
}
// p is []byte with the contents of the part
}
If you want to copy the part to the io.Writer w, then use io.Copy:
for {
part, err := reader.NextPart()
if err == io.EOF {
break
}
if err != nil {
// handle error
}
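w := openWriter() // hypothetical helper: open the destination writer here
_, err = io.Copy(w, part) // plain =, err is already declared in this scope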
if err != nil {
// handle error
}
}
If you want to process fixed size chunks, then use io.ReadFull:
buf := make([]byte, chunkSize)
for {
part, err := reader.NextPart()
if err == io.EOF {
break
}
if err != nil {
// handle error
}
_, err = io.ReadFull(part, buf) // plain =, err is already declared in this scope
if err != nil {
// handle error
// Note that ReadFull returns an error if it cannot fill buf
}
// process the next chunk in buf
}
If the application data is structured in some other way than fixed-size chunks, then bufio.Scanner might be of help.
Instead of changing the chunk size, why not use io.ReadFull?
https://golang.org/pkg/io/#ReadFull
It handles the entire logic, and if it can't fill the buffer it just returns an error.
Suppose we have a function that returns some value and an error. What's the preferred way of handling the error and value declarations?
func example_a(data interface{}) (interface{}, error) {
var err error
var bytes []byte
if bytes, err = json.Marshal(data); err != nil {
return nil, err
}
// ...
return use(bytes), nil
}
func example_b(data interface{}) (interface{}, error) {
if bytes, err := json.Marshal(data); err != nil {
return nil, err
} else {
// ...
return use(bytes), nil
}
}
func example_c(data interface{}) (result interface{}, err error) {
var bytes []byte
if bytes, err = json.Marshal(data); err != nil {
return
}
// ...
return use(bytes), nil
}
func example_d(data interface{}) (interface{}, error) {
bytes, err := json.Marshal(data)
if err != nil {
return nil, err
}
// ...
return use(bytes), nil
}
func example_dream(data interface{}) (interface{}, error) {
if bytes, err ≡ json.Marshal(data); err != nil {
return nil, err
}
// ...
return use(bytes), nil
}
Example A is clear, but it adds 2 extra lines. Moreover, I find that it's unclear why in this particular case we should use var, and at the same time := is not always appropriate. Then you want to reuse the err declaration somewhere down the line, and I'm not a big fan of splitting declaration and assignment.
Example B is using the if-declare-test language feature, which I surmise is encouraged, but at the same time you are forced to nest function continuation violating the happy-path principle, which too is encouraged.
Example C uses the named parameter return feature, which is something between A and B. Biggest problem here, is that if your code base is using styles B and C, then it's easy to mistake := and =, which can cause all kinds of issues.
Example D (added from suggestions) has for me the same kind of usage problem as C, because inevitably I run into the following:
func example_d(a, b interface{}) (interface{}, error) {
bytes, err := json.Marshal(a)
if err != nil {
return nil, err
}
bytes, err := json.Marshal(b) //Compilation ERROR
if err != nil {
return nil, err
}
// ...
return use(bytes), nil
}
So depending on previous declarations I have to modify my code to either use := or =, which makes it harder to see and refactor.
Example Dream is what I kind of intuitively would have expected from Go: no nesting, and quick exit without too much verbosity and variable reuse. Obviously it doesn't compile.
Usually use() is inlined and repeats the pattern several times, compounding the nesting or split declaration issue.
So what's the most idiomatic way of handling such multiple returns and declarations? Is there a pattern I'm missing?
If you look at lots of Go code you will find the following to be the usual case:
func example(data interface{}) (interface{}, error) {
bytes, err := json.Marshal(data)
if err != nil {
return nil, err
}
// ...
return use(bytes), nil
}
The declare-and-test if construct is nice in its place, but it is not generally appropriate here.
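The usual style also covers repeated calls: := is legal as long as at least one name on its left side is new, so err can be reused while each result gets its own name. A minimal sketch, where marshalPair is a hypothetical function written only to show this:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalPair is a hypothetical helper showing that := may reuse err
// as long as at least one name on the left is new.
func marshalPair(a, b interface{}) (string, error) {
	first, err := json.Marshal(a)
	if err != nil {
		return "", fmt.Errorf("marshal a: %w", err)
	}
	second, err := json.Marshal(b) // second is new, err is reused
	if err != nil {
		return "", fmt.Errorf("marshal b: %w", err)
	}
	return string(first) + string(second), nil
}

func main() {
	out, err := marshalPair(map[string]int{"x": 1}, []string{"y"})
	fmt.Println(out, err)
}
```

The compilation error in the question only appears when every name on the left has already been declared in the same scope; giving the second result a distinct name avoids switching between := and =.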
I'm currently developing a download server in Go. I need to limit the download speed of users to 100KB/s.
This was my code:
func serveFile(w http.ResponseWriter, r *http.Request) {
fileID := r.URL.Query().Get("fileID")
if len(fileID) != 0 {
w.Header().Set("Content-Disposition", "attachment; filename=filename.txt")
w.Header().Set("Content-Type", r.Header.Get("Content-Type"))
w.Header().Set("Content-Length", r.Header.Get("Content-Length"))
file, err := os.Open(fmt.Sprintf("../../bin/files/test.txt"))
defer file.Close()
if err != nil {
http.NotFound(w, r)
return
}
io.Copy(w, file)
} else {
io.WriteString(w, "Invalid request.")
}
}
Then I found a package on github and my code became the following:
func serveFile(w http.ResponseWriter, r *http.Request) {
fileID := r.URL.Query().Get("fileID")
if len(fileID) != 0 {
w.Header().Set("Content-Disposition", "attachment; filename=Wiki.png")
w.Header().Set("Content-Type", r.Header.Get("Content-Type"))
w.Header().Set("Content-Length", r.Header.Get("Content-Length"))
file, err := os.Open(fmt.Sprintf("../../bin/files/test.txt"))
defer file.Close()
if err != nil {
http.NotFound(w, r)
return
}
bucket := ratelimit.NewBucketWithRate(100*1024, 100*1024)
reader := bufio.NewReader(file)
io.Copy(w, ratelimit.Reader(reader, bucket))
} else {
io.WriteString(w, "Invalid request.")
}
}
But I'm getting this error:
Corrupted Content Error
The page you are trying to view cannot be shown because an error in
the data transmission was detected.
Here's my code on the Go playground: http://play.golang.org/p/ulgXQl4eQO
Rather than mucking around with getting the correct content type and length headers yourself, it'd probably be much better to use http.ServeContent, which will do that for you (as well as support "If-Modified-Since", range requests, etc.). If you can supply an "ETag" header, it can also handle "If-Range" and "If-None-Match" requests.
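What http.ServeContent fills in can be checked in memory with net/http/httptest. This is a sketch under assumptions: serveText and fetch are hypothetical names, and a strings.Reader replaces the file on disk.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// serveText is a hypothetical handler: ServeContent derives
// Content-Length and range handling from the io.ReadSeeker.
func serveText(w http.ResponseWriter, r *http.Request) {
	content := strings.NewReader("0123456789")
	// A zero modification time tells ServeContent to skip Last-Modified.
	http.ServeContent(w, r, "digits.txt", time.Time{}, content)
}

// fetch is a hypothetical helper that runs the handler in memory,
// optionally with a Range header.
func fetch(rangeHeader string) *httptest.ResponseRecorder {
	req := httptest.NewRequest(http.MethodGet, "/digits.txt", nil)
	if rangeHeader != "" {
		req.Header.Set("Range", rangeHeader)
	}
	rec := httptest.NewRecorder()
	serveText(rec, req)
	return rec
}

func main() {
	full := fetch("")
	fmt.Println(full.Code, full.Header().Get("Content-Length"))

	part := fetch("bytes=2-5")
	fmt.Println(part.Code, part.Header().Get("Content-Range"), part.Body.String())
}
```

The handler sets no headers itself, so the length and the partial-content response come entirely from ServeContent.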
As mentioned previously, it's often preferable to limit on the write side but it's awkward to wrap an http.ResponseWriter since various http functions also check for optional interfaces such as http.Flusher and http.Hijacker. It's much easier to wrap the io.ReadSeeker that ServeContent needs.
For example, something like this perhaps:
func pathFromID(fileID string) string {
// replace with whatever logic you need
return "../../bin/files/test.txt"
}
// or more verbosely you could call this a "limitedReadSeeker"
type lrs struct {
io.ReadSeeker
// This reader must not buffer but just do something simple
// while passing through Read calls to the ReadSeeker
r io.Reader
}
func (r lrs) Read(p []byte) (int, error) {
return r.r.Read(p)
}
func newLRS(r io.ReadSeeker, bucket *ratelimit.Bucket) io.ReadSeeker {
// Here we know/expect that a ratelimit.Reader does nothing
// to the Read calls other than add delays so it won't break
// any io.Seeker calls.
return lrs{r, ratelimit.Reader(r, bucket)}
}
func serveFile(w http.ResponseWriter, req *http.Request) {
fileID := req.URL.Query().Get("fileID")
if len(fileID) == 0 {
http.Error(w, "invalid request", http.StatusBadRequest)
return
}
path := pathFromID(fileID)
file, err := os.Open(path)
if err != nil {
http.NotFound(w, req)
return
}
defer file.Close()
fi, err := file.Stat()
if err != nil {
http.Error(w, "blah", 500) // XXX fixme
return
}
const (
rate = 100 << 10
capacity = 100 << 10
)
// Normally we'd prefer to limit the writer but it's awkward to wrap
// an http.ResponseWriter since it may optionally also implement
// http.Flusher, or http.Hijacker.
bucket := ratelimit.NewBucketWithRate(rate, capacity)
lr := newLRS(file, bucket)
http.ServeContent(w, req, path, fi.ModTime(), lr)
}
I'm not seeing the error, but I did notice some issues with the code. For this:
w.Header().Set("Content-Type", r.Header.Get("Content-Type"))
You should use the mime package's:
func TypeByExtension(ext string) string
to determine the content type. (If you end up with the empty string, default to application/octet-stream.)
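A minimal sketch of that lookup with the fallback, assuming the content type is chosen from the file name; contentTypeFor is a hypothetical helper:

```go
package main

import (
	"fmt"
	"mime"
	"path/filepath"
)

// contentTypeFor is a hypothetical helper: it maps a file name to a
// Content-Type and falls back to application/octet-stream.
func contentTypeFor(name string) string {
	if ct := mime.TypeByExtension(filepath.Ext(name)); ct != "" {
		return ct
	}
	return "application/octet-stream"
}

func main() {
	fmt.Println(contentTypeFor("Wiki.png"))
	fmt.Println(contentTypeFor("data.unknownext"))
}
```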
For:
w.Header().Set("Content-Length", r.Header.Get("Content-Length"))
You need to get the content length from the file itself. By using the request content length, for a GET this basically ends up as a no-op, but for a POST you're sending back the wrong length, which might explain the error you're seeing. After you open the file, do this:
fi, err := file.Stat()
if err != nil {
http.Error(w, err.Error(), 500)
return
}
w.Header().Set("Content-Length", fmt.Sprint(fi.Size()))
One final thing, when you open the file, if there's an error, you don't need to close the file handle. Do it like this instead:
file, err := os.Open(...)
if err != nil {
http.NotFound(w, r)
return
}
defer file.Close()