`io.Copy` file size different from original - go

I am dealing with multipart/form-data file upload and my backend uses Go io.Copy to copy the form data to local file.
func SaveFileHandler() error {
...
file := form.File["file_uploaded"] // file uploaded in form
src, _ := file.Open()
// here the original file size is 35540353 in my case,
// which is a video/mp4 file
fmt.Println(file.Size)
// create a local file with same filename
dst, _ := os.Create(file.Filename)
// save it
_, err = io.Copy(dst, src)
// err is nil
fmt.Println(err)
stat, _ := dst.Stat()
// then the local file size differs from the original updated one. Why?
// local file size becomes 35537281 (original one is 35540353)
fmt.Println(stat.Size())
// meanwhile I can't open the local video/mp4 file,
// which seems to be broken due to losing data from `io.Copy`
...
How can it be? Is there any max buffer size for io.Copy? Or does file mime type matter in this case?
I tried with png and txt file and both worked as expected.
The Go version is go1.12.6 linux/amd64

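There isn't much information in your question, but from what you've described, my guess is that the data hasn't been completely flushed to the file by the time you call dst.Stat(). You could close the file first to make sure everything is written, and then stat it by path, because calling Stat on an already-closed *os.File returns an error:
func SaveFileHandler() error {
...
// create a local file with same filename
dst, _ := os.Create(file.Filename)
// save it
_, err = io.Copy(dst, src)
// err is nil
fmt.Println(err)
// Close the file so all data is written out
if err := dst.Close(); err != nil {
return err
}
// dst is closed now, so stat the path rather than the handle
stat, _ := os.Stat(file.Filename)
...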

Related

Getting `panic: os: invalid use of WriteAt on file opened with O_APPEND`

I am a newbie to Go. I'm writing my first program, which has to download a bunch of CSVs from AWS. I don't understand why it gives me the error below in O_APPEND mode. If I remove os.O_APPEND, I only get the data of the last file, which is not the objective.
The objective is to download all CSV files into one file locally. I'd like to understand what I'm doing incorrectly.
package main
import (
"fmt"
"os"
"path/filepath"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/credentials"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3"
"github.com/aws/aws-sdk-go/service/s3/s3manager"
)
const (
AccessKeyId = "xxxxxxxxx"
SecretAccessKey = "xxxxxxxxxxxxxxxxxxxx"
Region = "eu-central-1"
Bucket = "dexter-reports"
bucketKey = "Jenkins/pluginVersions/"
)
func main() {
// Load the Shared AWS Configuration
os.Setenv("AWS_ACCESS_KEY_ID", AccessKeyId)
os.Setenv("AWS_SECRET_ACCESS_KEY", SecretAccessKey)
filename := "JenkinsPluginDetais.txt"
cred := credentials.NewStaticCredentials(AccessKeyId, SecretAccessKey, "")
config := aws.Config{Credentials: cred, Region: aws.String(Region), Endpoint: aws.String("s3.amazonaws.com")}
file, err := os.OpenFile(filename, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0666)
if err != nil {
panic(err)
}
defer file.Close()
sess, err := session.NewSession(&config)
if err != nil {
fmt.Println(err)
}
//list Buckets
ObjectList := listBucketObjects(sess)
//loop over the object list. First initialize the s3 downloader via s3manager
downloader := s3manager.NewDownloader(sess)
for _, item := range ObjectList.Contents {
csvFile := filepath.Base(*item.Key)
if csvFile != "pluginVersions" {
downloadBucketObjects(downloader, file, csvFile)
}
}
}
func listBucketObjects(sess *session.Session) *s3.ListObjectsV2Output {
//create a new s3 client
svc := s3.New(sess)
resp, err := svc.ListObjectsV2(&s3.ListObjectsV2Input{
Bucket: aws.String(Bucket),
Prefix: aws.String(bucketKey),
})
if err != nil {
panic(err)
}
return resp
}
func downloadBucketObjects(downloader *s3manager.Downloader, file *os.File, keyobj string) {
fileToDownload := bucketKey + keyobj
numBytes, err := downloader.Download(file,
&s3.GetObjectInput{
Bucket: aws.String(Bucket),
Key: aws.String(fileToDownload),
})
if err != nil {
panic(err)
}
fmt.Println("Downloaded", file.Name(), numBytes, "bytes")
}
First, I don't see why you need the os.O_APPEND flag in the first place. As far as I understand, you can omit os.O_APPEND.
Now, on to the actual reason why this happens:
Doc for O_APPEND (Ref: https://man7.org/linux/man-pages/man2/open.2.html):
O_APPEND
The file is opened in append mode. Before each write(2),
the file offset is positioned at the end of the file, as
if with lseek(2). The modification of the file offset and
the write operation are performed as a single atomic step.
So on every call to write, the file offset is first positioned at the end of the file.
But the Download method of s3manager.Downloader presumably uses the WriteAt method.
Doc for WriteAt:
$ go doc os WriteAt
package os // import "os"
func (f *File) WriteAt(b []byte, off int64) (n int, err error)
WriteAt writes len(b) bytes to the File starting at byte offset off. It
returns the number of bytes written and an error, if any. WriteAt returns a
non-nil error when n != len(b).
If file was opened with the O_APPEND flag, WriteAt returns an error.
Notice the last line: if the file was opened with the O_APPEND flag, WriteAt returns an error. That makes sense, because WriteAt's second argument is an explicit offset, and combining it with O_APPEND's always-write-at-the-end behaviour could produce unexpected results, so it errors out instead.
Consider the signature of the Download method on s3manager.Downloader:
func (d Downloader) Download(w io.WriterAt, input *s3.GetObjectInput, options ...func(*Downloader)) (n int64, err error)
The first argument is an io.WriterAt; this interface is:
type WriterAt interface {
WriteAt(p []byte, off int64) (n int, err error)
}
This means that the Download function is going to call the WriteAt method on the File you pass it. As per the documentation for File.WriteAt:
If file was opened with the O_APPEND flag, WriteAt returns an error.
So this explains why you are getting the error but raises the question "why is Download using WriteAt and not accepting an io.Writer (and calling Write)?"; the answer can be found in the documentation:
The w io.WriterAt can be satisfied by an os.File to do multipart concurrent downloads, or in memory []byte wrapper using aws.WriteAtBuffer
So, to increase performance, Downloader might make multiple simultaneous requests for parts of the file and then write these out as they are received (meaning it may not write the data in order). This also explains why calling the function multiple times with the same File results in overwritten data: when Downloader retrieves each chunk of the file, it writes it out at the appropriate position in the output file, which overwrites any data already there.
The above quote from the documentation also points to a possible solution; use an aws.WriteAtBuffer and, once the download is finished, write the data to your file (which could then be opened with O_APPEND) - something like this:
buf := aws.NewWriteAtBuffer([]byte{})
numBytes, err := downloader.Download(buf,
&s3.GetObjectInput{
Bucket: aws.String(Bucket),
Key: aws.String(fileToDownload),
})
if err != nil {
panic(err)
}
_, err = file.Write(buf.Bytes())
if err != nil {
panic(err)
}
An alternative would be to download into a temporary file and then append that to your output file (you may need to do this if the files are large).

Referencing a file several levels up and down in a hierarchical structure in go

I'm trying to use os.Open(fileDir) to read a file, then upload that file to an s3 bucket. Here's what I have so far.
func addFileToS3(s *session.Session, fileDir string) error {
file, err := os.Open(fileDir)
if err != nil {
return err
}
defer file.Close()
// Get file size and read the file content into a buffer
fileInfo, _ := file.Stat()
var size int64 = fileInfo.Size()
buffer := make([]byte, size)
file.Read(buffer)
// code to upload to s3
return nil
}
My directory structure is like
|--project-root
|--test
|--functional
|--assets
|-- good
|--fileINeed
But my code is running inside
|--project-root
|--services
|--service
|--test
|--myGoCode
How do I pass in the correct fileDir? I need a solution that works both locally and when the code gets deployed. I looked at the path/filepath package, but I wasn't sure whether to get the absolute path first and then go down the hierarchy, or do something else.
You can add the following small function to get the expected file path.
var (
_, file, _, _ = runtime.Caller(0)
baseDir = filepath.Dir(file)
projectDir = filepath.Join(baseDir, "../../../")
)
func getFileINeedDirectory() string {
fileINeedDir := filepath.Join(projectDir, "test/functional/assets/good/fileINeed")
return fileINeedDir // project-dir/test/functional/assets/good/fileINeed
}

Editing zip file in memory and returning it via http response results in a corrupt file

Hey guys, I'm new to Go (exactly 23 hours and 10 minutes new), so obviously I'm having issues with some things. I have a zip file that is in memory; I would like to make a copy of it, add some files to the copy, and return the file via HTTP. It works, but when I open the file it seems to be corrupted.
outFile, err := os.OpenFile("./template.zip", os.O_RDWR, 0666)
if err != nil {
log.Fatalf("Failed to open zip for writing: %s", err)
}
defer outFile.Close()
zipw := zip.NewWriter(outFile)
fmt.Println(reflect.TypeOf(zipw))
for _, appCode := range appPageCodeText {
f, err := zipw.Create(appCode.Name + ".jsx")
if err != nil {
log.Fatal(err)
}
_, err = f.Write([]byte(appCode.Content)) //casting it to byte array and writing to file
}
// Clean up
err = zipw.Close()
if err != nil {
log.Fatal(err)
}
defer outFile.Close()
//Get the Content-Type of the file
//Create a buffer to store the header of the file in
FileHeader := make([]byte, 512)
//Copy the headers into the FileHeader buffer
outFile.Read(FileHeader)
//Get content type of file
fmt.Println(reflect.TypeOf(outFile))
//Get the file size
FileStat, _ := outFile.Stat() //Get info from file
FileSize := strconv.FormatInt(FileStat.Size(), 10) //Get file size as a string
buffer := make([]byte, FileStat.Size())
outFile.Read(buffer)
//Send the headers
w.Header().Set("Content-Disposition", "attachment; filename="+"template.zip")
w.Header().Set("Content-Type", "application/zip")
w.Header().Set("Content-Length", FileSize)
outFile.Seek(0, 0)
// io.Copy(w, buffer) //'Copy' the file to the client
w.Write(buffer)
The primary problem: you Read the first 512 bytes of outFile into FileHeader, which means that they're not read into buffer, which means the first 512 bytes of the file aren't sent to the client. You do a Seek, but too late for it to be useful, because the contents of buffer are already set at that point. You need to move the Seek earlier, or write both buffers, or just remove the unnecessary FileHeader read.
Your comment claims that you do so to get the content type of the file, but FileHeader is actually never used. And why would it be? You know what the type of the file is; you just wrote it. So the separate read of the first 512 bytes is unneeded.
Actually, it's all unneeded. Instead of making a file on disk, using a zip.Writer to write to the file, re-opening the file from disk, reading it into a byte array, and then writing that byte array to the HTTP client, you could simply either have the zip.Writer write directly to the HTTP client (if you don't care about setting Content-Length), or have it write to a bytes.Buffer and then copy that buffer out to the HTTP client (if an accurate Content-Length is important to you).
The first version looks like:
w.Header().Set("Content-Disposition", "attachment; filename=template.zip")
w.Header().Set("Content-Type", "application/zip")
zipw := zip.NewWriter(w)
// Your for loop to add items to the zip goes here.
//
zipw.Close() // plus error handling
And the second version looks like:
buffer := &bytes.Buffer{}
zipw := zip.NewWriter(buffer)
// Your for loop to add items to the zip goes here.
//
zipw.Close() // plus error handling
w.Header().Set("Content-Disposition", "attachment; filename=template.zip")
w.Header().Set("Content-Type", "application/zip")
w.Header().Set("Content-Length", strconv.Itoa(buffer.Len()))
io.Copy(w, buffer) // plus error handling

How to append files (io.Reader)?

func SimpleUploader(r *http.Request, w http.ResponseWriter) {
// temp folder path
chunkDirPath := "./creatives/.uploads/" + userUUID
// create folder
err = os.MkdirAll(chunkDirPath, 02750)
// Get file handle from multipart request
var file io.Reader
mr, err := r.MultipartReader()
var fileName string
// Read multipart body until the "file" part
for {
part, err := mr.NextPart()
if err == io.EOF {
break
}
if part.FormName() == "file" {
file = part
fileName = part.FileName()
fmt.Println(fileName)
break
}
}
// Create files
tempFile := chunkDirPath + "/" + fileName
dst, err := os.Create(tempFile)
defer dst.Close()
buf := make([]byte, 1024*1024)
file.Read(buf)
// write/save buffer to disk
ioutil.WriteFile(tempFile, buf, os.ModeAppend)
if http.DetectContentType(buf) != "video/mp4" {
response, _ := json.Marshal(&Response{"File upload cancelled"})
settings.WriteResponse(w, http.StatusInternalServerError, response)
return
}
// joinedFile := io.MultiReader(bytes.NewReader(buf), file)
_, err = io.Copy(dst, file)
if err != nil {
settings.LogError(err, methodName, "Error copying file")
}
response, _ := json.Marshal(&Response{"File uploaded successfully"})
settings.WriteResponse(w, http.StatusInternalServerError, response)
}
I am uploading a video file.
Before uploading the entire file I want to do some checks, so I save the first 1 MB to a file:
buf := make([]byte, 1024*1024)
file.Read(buf)
// write/save buffer to disk
ioutil.WriteFile(tempFile, buf, os.ModeAppend)
Then, if the checks pass, I want to upload the rest of the file. dst is the same file used to save the first 1 MB, so basically I am trying to append to the file:
_, err = io.Copy(dst, file)
The uploaded file size is correct, but the file is corrupted (the video can't be played).
What else have I tried? Joining both readers and saving to a new file. But with this approach the file size increases by 1 MB and the file is still corrupted.
joinedFile := io.MultiReader(bytes.NewReader(buf), file)
_, err = io.Copy(newDst, joinedFile)
Kindly help.
You've effectively opened the file twice, once with os.Create and once with ioutil.WriteFile.
The issue is that os.Create's return value (dst) behaves like a pointer to the beginning of that file, and WriteFile doesn't move where dst points.
So you do the WriteFile, and then io.Copy writes on top of the first set of bytes that WriteFile wrote.
Try doing WriteFile first (it creates the file if needed; note that its third argument is a permission mode such as 0644, not an open flag like os.ModeAppend), and then open that same file with os.OpenFile (instead of os.Create) using the append flag, so the remaining bytes go at the end.
Also, it's extremely risky to let a client give you the filename, since it could be ../../.bashrc (for example), in which case you'd overwrite your shell init with whatever the user decided to upload.
It would be much safer to compute a filename yourself, and if you need to remember the user's selected filename, store that in your database or even in a metadata.json type file that you load later.

How add a file to an existing zip file using Golang

We can create a new zip file and add files to it using Go.
But how do we add a new file to an existing zip file using Go?
If we can use the Create function, how do we get the zip.Writer reference?
I'm a bit confused.
After more analysis, I found that it is not possible to add files to an existing zip file.
But I was able to add files to a tar file by following the hack given in this URL.
you can:
copy old zip items into a new zip file;
add new files into the new zip file;
zipReader, err := zip.OpenReader(zipPath)
targetFile, err := os.Create(targetFilePath)
targetZipWriter := zip.NewWriter(targetFile)
for _, zipItem := range zipReader.File {
zipItemReader, err := zipItem.Open()
header, err := zip.FileInfoHeader(zipItem.FileInfo())
header.Name = zipItem.Name
targetItem, err := targetZipWriter.CreateHeader(header)
_, err = io.Copy(targetItem, zipItemReader)
}
addNewFiles(targetZipWriter) // IMPLEMENT YOUR LOGIC
Although I have not attempted this yet with a zip file that already exists and then writing to it, I believe you should be able to add files to it.
This is code I have written to create a conglomerate zip file containing multiple files in order to expedite uploading the data to another location. I hope it helps!
type fileData struct {
Filename string
Body []byte
}
func main() {
outputFilename := "path/to/file.zip"
// whatever you want as filenames and bodies
fileDatas := createFileDatas()
// create zip file
conglomerateZip, err := os.Create(outputFilename)
if err != nil {
log.Fatal(err) // main has no return value, so report the error and exit
}
defer conglomerateZip.Close()
zipWriter := zip.NewWriter(conglomerateZip)
defer zipWriter.Close()
// populate zip file with multiple files
err = populateZipfile(zipWriter, fileDatas)
if err != nil {
log.Fatal(err)
}
}
func populateZipfile(w *zip.Writer, fileDatas []*fileData) error {
for _, fd := range fileDatas {
f, err := w.Create(fd.Filename)
if err != nil {
return err
}
_, err = f.Write([]byte(fd.Body))
if err != nil {
return err
}
err = w.Flush()
if err != nil {
return err
}
}
return nil
}
This is a bit old and already has an answer, but if performance isn't a key concern for you (making the zip file isn't on a hot path for example) you can do this with the archive/zip library by creating a new writer and copying the existing files into it then adding your new content. Something like this:
zw := // new zip writer from buffer or temp file
newFileName := // file name to add
reader, _ := zip.NewReader(bytes.NewReader(existingFile), int64(len(existingFile)))
for _, file := range reader.File {
if file.Name == newFileName {
continue // don't copy the old file over to avoid duplicates
}
fw, _ := zw.Create(file.Name)
fr, _ := file.Open()
io.Copy(fw, fr)
fr.Close()
}
Then you would return the new writer and append files as needed. If you aren't sure which files might overlap you can turn that if check into a function with a list of file names you will eventually add. You can also use this logic to remove a file from an existing archive.
Now in 2021, there is still no support for appending files to an existing archive.
But at least it is now possible to add already-compressed files, i.e. we no longer have to decompress and re-compress files when copying them from the old archive to the new one.
(NOTE: this only applies to Go 1.17+)
So, based on the examples by @wongoo and @Michael, here is how I would now implement appending files with minimal performance overhead (you'll want to add error handling though):
zr, err := zip.OpenReader(zipPath)
defer zr.Close()
zwf, err := os.Create(targetFilePath)
defer zwf.Close()
zw := zip.NewWriter(zwf)
defer zw.Close() // runs before zwf.Close() and writes the central directory
for _, zipItem := range zr.File {
if isOneOfNamesWeWillAdd(zipItem.Name) {
continue // avoid duplicate files!
}
zipItemReader, err := zipItem.OpenRaw()
header := zipItem.FileHeader // clone header data
targetItem, err := zw.CreateRaw(&header) // use cloned data
_, err = io.Copy(targetItem, zipItemReader)
}
addNewFiles(zw) // IMPLEMENT YOUR LOGIC

Resources