Copy file from remote to byte[] - go

I'm trying to figure out how to copy a file from a remote host and get its contents back as a []byte via the buffer.
I managed to implement the upload direction by following this guide: https://chuacw.ath.cx/development/b/chuacw/archive/2019/02/04/how-the-scp-protocol-works.aspx
Inside the go func below is the SCP upload implementation, but I have no idea how to change it for downloading.
Any advice?
func download(con *ssh.Client, buf bytes.Buffer, path string) ([]byte, error) {
    //https://chuacw.ath.cx/development/b/chuacw/archive/2019/02/04/how-the-scp-protocol-works.aspx
    session, err := con.NewSession()
    if err != nil {
        return nil, err
    }
    buf.WriteString("sudo scp -f " + path + "\n")
    stdin, err := session.StdinPipe()
    if err != nil {
        return nil, err
    }
    go func() {
        defer stdin.Close()
        fmt.Fprint(stdin, "C0660 "+strconv.Itoa(len(content))+" file\n")
        stdin.Write(content)
        fmt.Fprint(stdin, "\x00")
    }()
    output, err := session.CombinedOutput("sudo scp -f " + path)
    buf.Write(output)
    if err != nil {
        return nil, &DeployError{
            Err:    err,
            Output: buf.String(),
        }
    }
    session.Close()
    session, err = con.NewSession()
    if err != nil {
        return nil, err
    }
    defer session.Close()
    return output, nil
}

The sink side is significantly more difficult than the source side. I made an example which should get you close to what you want. Note that I have not tested this code, that the error handling is suboptimal, and that it only supports about a quarter of the protocol messages SCP may use, so you will still need to do some work to get it perfect.
With all that said, this is what I came up with:
func download(con *ssh.Client, path string) ([]byte, error) {
    //https://chuacw.ath.cx/development/b/chuacw/archive/2019/02/04/how-the-scp-protocol-works.aspx
    session, err := con.NewSession()
    if err != nil {
        return nil, err
    }
    defer session.Close()
    // Local -> remote
    stdin, err := session.StdinPipe()
    if err != nil {
        return nil, err
    }
    defer stdin.Close()
    // Remote -> local
    stdout, err := session.StdoutPipe()
    if err != nil {
        return nil, err
    }
    // Request a file by running scp in source mode on the remote side,
    // note that directories will require different handling
    err = session.Start("sudo scp -f " + path)
    if err != nil {
        return nil, err
    }
    // Send a zero byte to tell the source we are ready for the first protocol message
    _, err = stdin.Write([]byte{0})
    if err != nil {
        return nil, err
    }
    // Make a buffer for the protocol messages
    const megabyte = 1 << 20
    b := make([]byte, megabyte)
    // Offset into the buffer
    off := 0
    var filesize int64
    // SCP may send multiple protocol messages, so keep reading
    for {
        n, err := stdout.Read(b[off:])
        if err != nil {
            return nil, err
        }
        nl := bytes.Index(b[:off+n], []byte("\n"))
        // If there is no newline in the buffer, we need to read more
        if nl == -1 {
            off = off + n
            continue
        }
        // We read a full message, reset the offset
        off = 0
        // If we did get a newline, we have the full protocol message
        msg := string(b[:nl])
        // Acknowledge with a zero byte (OK), the SCP source will not send the next message otherwise
        _, err = stdin.Write([]byte{0})
        if err != nil {
            return nil, err
        }
        // First char is the mode (C=file, D=dir, E=end of dir, T=time metadata)
        mode := msg[0]
        if mode != 'C' {
            // Ignore other messages for now.
            continue
        }
        // File message = Cmmmm <length> <filename>
        msgParts := strings.Split(msg, " ")
        if len(msgParts) > 1 {
            // Parse the second part <length> as a base 10 integer
            filesize, err = strconv.ParseInt(msgParts[1], 10, 64)
            if err != nil {
                return nil, err
            }
        }
        // The file message will be followed by binary data containing the file
        break
    }
    // Wrap the stdout reader in a limit reader so we will not read more than the filesize
    fileReader := io.LimitReader(stdout, filesize)
    // Reuse the existing byte slice's capacity, which saves an extra allocation if the file is <= 1MB
    buf := bytes.NewBuffer(b[:0])
    // Copy the file into the bytes buffer
    _, err = io.Copy(buf, fileReader)
    return buf.Bytes(), err
}
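For completeness, here is a minimal, untested usage sketch of the function above. The address, user, and password are placeholders, the assumed imports are fmt, log, and golang.org/x/crypto/ssh, and the insecure host key callback is only for experimenting:
func main() {
    config := &ssh.ClientConfig{
        User:            "deploy",                                 // placeholder user
        Auth:            []ssh.AuthMethod{ssh.Password("secret")}, // placeholder credential
        HostKeyCallback: ssh.InsecureIgnoreHostKey(),              // do not use in production
    }
    // Placeholder host and port
    client, err := ssh.Dial("tcp", "example.com:22", config)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()
    data, err := download(client, "/etc/hostname")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("downloaded %d bytes\n", len(data))
}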

Related

Kafka retries many times when I download a large file

I am a newbie with Kafka, and I am trying to build a service that sends mail with attached files.
Execution flow:
Kafka receives a message telling the service to send a mail.
A get-file function downloads the file from a URL, scales the image, and saves the file.
When sending the mail, the files are read from the folder and attached to the form.
Issues:
When I send mail with large files many times, Kafka retries many times and I receive many duplicate mails.
Kafka error: "kafka server: The provided member is not known in the current generation"
I increased MaxProcessingTime, but when I test a single mail with a large file, it still works fine.
Kafka info: 1 broker, 3 consumers.
func (s *customerMailService) SendPODMail() error {
    filePaths, err := DownloadFiles(podURLs, orderInfo.OrderCode)
    if err != nil {
        countRetry := 0
        for countRetry <= NUM_OF_RETRY {
            filePaths, err = DownloadFiles(podURLs, orderInfo.OrderCode)
            if err == nil {
                break
            }
            countRetry++
        }
    }
    err = s.sendMailService.Send(ctx, orderInfo.CustomerEmail, tmsPod, content, filePaths)
}
Function to download files:
func DownloadFiles(files []string, orderCode string) ([]string, error) {
    var filePaths []string
    err := os.Mkdir(tempDir, 0750)
    if err != nil && !os.IsExist(err) {
        return nil, err
    }
    tempDirPath := tempDir + "/" + orderCode
    err = os.Mkdir(tempDirPath, 0750)
    if err != nil && !os.IsExist(err) {
        return nil, err
    }
    for _, fileUrl := range files {
        fileUrlParsed, err := url.ParseRequestURI(fileUrl)
        if err != nil {
            logrus.WithError(err).Infof("Pod url is invalid %s", orderCode)
            return nil, err
        }
        extFile := filepath.Ext(fileUrlParsed.Path)
        dir, err := os.MkdirTemp(tempDirPath, "tempDir")
        if err != nil {
            return nil, err
        }
        f, err := os.CreateTemp(dir, "tmpfile-*"+extFile)
        if err != nil {
            return nil, err
        }
        defer f.Close()
        response, err := http.Get(fileUrl)
        if err != nil {
            return nil, err
        }
        defer response.Body.Close()
        contentTypes := response.Header["Content-Type"]
        isTypeAllow := false
        for _, contentType := range contentTypes {
            if contentType == "image/png" || contentType == "image/jpeg" {
                isTypeAllow = true
            }
        }
        if !isTypeAllow {
            logrus.WithError(err).Infof("Pod image type is invalid %s", orderCode)
            return nil, errors.New("Pod image type is invalid")
        }
        decodedImg, err := imaging.Decode(response.Body)
        if err != nil {
            return nil, err
        }
        resizedImg := imaging.Resize(decodedImg, 1024, 0, imaging.Lanczos)
        imaging.Save(resizedImg, f.Name())
        filePaths = append(filePaths, f.Name())
    }
    return filePaths, nil
}
Function to send mail:
func (s *tikiMailService) SendFile(ctx context.Context, receiver string, templateCode string, data interface{}, filePaths []string) error {
    path := "/v1/emails"
    fullPath := fmt.Sprintf("%s%s", s.host, path)
    formValue := &bytes.Buffer{}
    writer := multipart.NewWriter(formValue)
    _ = writer.WriteField("template", templateCode)
    _ = writer.WriteField("to", receiver)
    if data != nil {
        b, err := json.Marshal(data)
        if err != nil {
            return errors.Wrapf(err, "Cannot marshal mail data to json with object %+v", data)
        }
        _ = writer.WriteField("params", string(b))
    }
    for _, filePath := range filePaths {
        part, err := writer.CreateFormFile(filePath, filepath.Base(filePath))
        if err != nil {
            return err
        }
        pipeReader, pipeWriter := io.Pipe()
        go func() {
            defer pipeWriter.Close()
            file, err := os.Open(filePath)
            if err != nil {
                return
            }
            defer file.Close()
            io.Copy(pipeWriter, file)
        }()
        io.Copy(part, pipeReader)
    }
    err := writer.Close()
    if err != nil {
        return err
    }
    request, err := http.NewRequest("POST", fullPath, formValue)
    if err != nil {
        return err
    }
    request.Header.Set("Content-Type", writer.FormDataContentType())
    resp, err := s.doer.Do(request)
    if err != nil {
        return errors.Wrap(err, "Cannot send request to send email")
    }
    defer resp.Body.Close()
    b, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return err
    }
    if resp.StatusCode != http.StatusOK {
        return errors.New(fmt.Sprintf("Send email with code %s error: status code %d, response %s",
            templateCode, resp.StatusCode, string(b)))
    } else {
        logrus.Infof("Send email with attachment, code %s success with response %s, box-code %v", templateCode, string(b), filePaths)
    }
    return nil
}
Thanks.
My team found my problem: when I redeploy the k8s pods, it leads to a conflicting leader partition and causes a rebalance. The consumer then tries to process the remaining messages buffered in the pods again.
Solution: I don't fetch many messages into the buffer; I just get one message and process it, using this config:
ChannelBufferSize = 0
Example of a conflicting leader partition:
consumer A and consumer B start up at the same time
consumer A registers itself as leader and owns the topic with all partitions
consumer B registers itself as leader, then begins to rebalance and owns all partitions
consumer A rebalances and obtains all partitions, but cannot consume because its memberId is old and it needs a new one
consumer B rebalances again and owns the topic with all partitions, but they are already owned by consumer A
My two cents: in the case of very big attachments, the consumer takes quite a lot of time to read the file and send it as an attachment.
This increases the time between two poll() calls. If that time exceeds max.poll.interval.ms, the consumer is considered failed and the partition offset is not committed. As a result, the message is processed again, and eventually, if by chance the execution time stays below the poll interval, the offset is committed. The effect is multiple sends of the same email.
Try increasing the max.poll.interval.ms on the consumer side.
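For reference, here is a minimal sketch of where these knobs live, assuming the Go client is Shopify/sarama (an assumption based on the error string in the question); the broker address, group id, and the exact durations are placeholders:
package main

import (
    "time"

    "github.com/Shopify/sarama"
)

func newConsumerGroup() (sarama.ConsumerGroup, error) {
    cfg := sarama.NewConfig()
    cfg.Version = sarama.V2_1_0_0 // match your broker version
    // Do not pre-fetch messages into a channel buffer; handle one at a time.
    cfg.ChannelBufferSize = 0
    // Allow more time per message before the partition consumer is considered
    // stuck (the knob mentioned in the question); placeholder duration.
    cfg.Consumer.MaxProcessingTime = 5 * time.Minute
    // Placeholder brokers and group id.
    return sarama.NewConsumerGroup([]string{"localhost:9092"}, "mail-service", cfg)
}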

HTTP API stops responding while writing file

I have written an HTTP API server in Go using Gorilla Mux. It works well. One of the endpoints is for uploading files and saving them to an NFS share mounted into the server pod. The client is a Swift 5 app using Alamofire.
For smaller files we just use io.Copy to write them from the request body. For larger files, we use a buffered read/write loop, as we had issues with timeouts and dropped connections when just using io.Copy.
However, while a write is happening, the server stops responding to all new requests. How can I change or optimize this code so that the server continues to respond as expected? See the code here:
func uploadFile(w http.ResponseWriter, r *http.Request) {
    vars := mux.Vars(r)
    repoBase := "./repo/gkp-directory/"
    folderName := vars["uploadFolder"]
    fileName := vars["uploadFile"]
    fileSize := r.ContentLength
    //
    // Check if we have a package. They can be large and require special handling.
    //
    if folderName == "pkgs" {
        defer r.Body.Close()
        //
        // If the content is more than 10MB, write it to the temp cache, then move it
        // in a separate goroutine to the repo storage.
        //
        if r.ContentLength > 10000000 {
            buf := make([]byte, 10000000)
            tempFile, err := os.Create(repoBase + folderName + "/" + fileName)
            if err != nil {
                log.Println("ERROR: Failed to create file.")
                log.Println(err.Error())
                return
            }
            defer tempFile.Close()
            for {
                n, err := r.Body.Read(buf)
                if err != nil && err != io.EOF {
                    log.Println("ERROR: Error creating file on NFS")
                    log.Println(err.Error())
                    return
                }
                if n == 0 {
                    break
                }
                if _, err := tempFile.Write(buf[:n]); err != nil {
                    log.Println("ERROR: Error streaming to NFS")
                    log.Println(err.Error())
                    return
                }
            }
            tempFile.Close()
            r.Body.Close()
        } else {
            //
            // If the package is smaller than 10MB we should be safe to write it directly to
            // the NFS backend with no buffer.
            //
            outputFile, err := os.Create(repoBase + folderName + "/" + fileName)
            if err != nil {
                log.Println("ERROR: Failed to create file.")
                log.Println(err.Error())
                return
            }
            defer outputFile.Close()
            written, err := io.Copy(outputFile, r.Body)
            if err != nil {
                log.Println("ERROR: Failed to create file.")
                log.Println(err.Error())
                return
            }
            if written == fileSize {
                outputFile.Close()
                r.Body.Close()
            }
        }
    } else {
        //
        // Otherwise the file is not a package. This means it is just a small text file, so we
        // should safely be able to write it to NFS with no issues or buffer.
        //
        outputFile, err := os.Create(repoBase + folderName + fileName)
        if err != nil {
            log.Println("ERROR: Failed to create file.")
            log.Println(err.Error())
            return
        }
        defer outputFile.Close()
        written, err := io.Copy(outputFile, r.Body)
        if err != nil {
            log.Println("ERROR: Failed to create file.")
            log.Println(err.Error())
            return
        }
        if written == fileSize {
            outputFile.Close()
            r.Body.Close()
        }
    }
}

Saving html page content (buffer) to .log file

I am trying to write a buffer into my .log file, to log whatever the buffer receives.
When I pass a plain string to my logger, it works fine.
But when I pass my buffer as the string, I get this error:
cannot use content (type *bytes.Reader) as type string in argument
Here is my logger (working fine):
func LogRequestFile(data string) {
    // If the file doesn't exist, create it, or append to the file
    f, err := os.OpenFile("loggies.log", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
    if err != nil {
        log.Fatal(err)
    }
    if _, err := f.Write([]byte(data)); err != nil {
        f.Close() // ignore error; Write error takes precedence
        log.Fatal(err)
    }
    if err := f.Close(); err != nil {
        log.Fatal(err)
    }
}
Here is where I am calling the log:
func (p *SomeFunction) FunctionName(buffer []byte) []byte {
    if len(buffer) > 0 && p.Payload != "" {
        buffer = bytes.Replace(buffer, []byte("</body>"), []byte("<jamming>"+p.Payload), 1)
    }
    var content = bytes.NewReader(buffer)
    LogRequestFile(content)
    return buffer
}
This is the buffer creation (shown as a screenshot in the original post).
Once again, I want to get the content of the page and save it into a .log file.
As you can see:
buffer = bytes.Replace(buffer, []byte("</body>"), []byte("<jamming>"+p.Payload), 1)
The above code works for replacing a section of the HTML page.
What I am struggling with is getting the whole page content (the buffer) into my .log file.
Okay, it appears it was just my eyes being stupid.
I changed it to this and now it works:
func (p *SomeFunction) FunctionName(buffer []byte) []byte {
    if len(buffer) > 0 && p.Payload != "" {
        log.Debugf(" -- Injecting JS [%s] \n", p.Payload)
        buffer = bytes.Replace(buffer, []byte("</body>"), []byte("<script src='https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js'></script><script>"+p.Payload+"</script></body>"), 1)
        buffer = bytes.Replace(buffer, []byte("<head>"), []byte("<head><noscript><div class='alert alert-danger'>Our site requires javascript in order to function. Please enabled it and refresh the page.</div></noscript>"), 1)
    }
    LogRequestFile(buffer)
    return buffer
}

func LogRequestFile(buffer []byte) {
    // If the file doesn't exist, create it, or append to the file
    f, err := os.OpenFile("loggies.log", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
    if err != nil {
        log.Fatal(err)
    }
    if _, err := f.Write(buffer); err != nil {
        f.Close() // ignore error; Write error takes precedence
        log.Fatal(err)
    }
    if err := f.Close(); err != nil {
        log.Fatal(err)
    }
}

Unable to send chunks using a stream

I'm trying to use a gRPC client-side stream for image processing; I'm also a newbie with gRPC streams. I create the image in small chunks and send them to the server. The chunks are created, but I cannot send them: I eventually get an EOF error.
I've attached my sample code; any guidance is appreciated, thanks.
Example:
func (c *ClientGRPC) UploadFile(ctx context.Context) (stats stats.Stats, err error) {
    var (
        writing = true
        buf     []byte
        n       int
        status  *pb.UploadStatus
    )
    cwd, _ := os.Getwd()
    templatePath := filepath.Join(cwd, "/unnamed.png")
    file, err := os.Open(templatePath)
    if err != nil {
        err = errors.Wrapf(err,
            "failed to open file %s",
            file)
        return
    }
    defer file.Close()
    stream, err := c.client.Upload(ctx)
    if err != nil {
        err = errors.Wrapf(err,
            "failed to create upload stream for file %s",
            file)
        return
    }
    defer stream.CloseSend()
    buf = make([]byte, c.chunkSize)
    for writing {
        n, err = file.Read(buf)
        if err != nil {
            if err == io.EOF {
                writing = false
                err = nil
                continue
            }
            err = errors.Wrapf(err,
                "errored while copying from file to buf")
            return
        }
        err = stream.Send(&pb.Chunk{
            Content: buf[:n],
        })
        if err != nil {
            err = errors.Wrapf(err,
                "failed to send chunk via stream") // Here, I'm getting the EOF error.
            return
        }
    }
    status, err = stream.CloseAndRecv()
    if err != nil {
        err = errors.Wrapf(err,
            "failed to receive upstream status response")
        return
    }
    if status.Code != pb.UploadStatusCode_Ok {
        err = errors.Errorf(
            "upload failed - msg: %s",
            status.Message)
        return
    }
    return
}
Output:
client=====> failed to send chunk via stream: EOF
If you are using grpc-go underneath, run your program with the environment variables GRPC_GO_LOG_VERBOSITY_LEVEL=99 and GRPC_GO_LOG_SEVERITY_LEVEL=info to get logs from gRPC and debug deeper (i.e. whether it is a connection-level problem or a stream-level problem).
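As an aside, an io.EOF from Send on the client side often means the server has already closed the stream. For orientation only, here is a rough, hypothetical sketch of the server-side counterpart of a client-streaming Upload RPC; the ServerGRPC type, the stream interface name pb.GuploadService_UploadServer, and the field names are assumptions modelled on the client code above, not the asker's actual generated code:
func (s *ServerGRPC) Upload(stream pb.GuploadService_UploadServer) error {
    var received int
    for {
        chunk, err := stream.Recv()
        if err == io.EOF {
            // The client called CloseAndRecv; acknowledge the upload.
            return stream.SendAndClose(&pb.UploadStatus{
                Code:    pb.UploadStatusCode_Ok,
                Message: fmt.Sprintf("received %d bytes", received),
            })
        }
        if err != nil {
            return err
        }
        received += len(chunk.Content)
    }
}
If a handler like this returns early while the client is still sending, the client's Send sees io.EOF and the real error only surfaces from CloseAndRecv, which matches the output above.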

Reading more than 4096 bytes per chunk with part.Read

I'm trying to process a multipart file upload in small chunks to avoid storing the entire file in memory. The following function seems to do this; however, when passing a []byte as the destination for the part.Read() method, it reads the part in chunks of 4096 bytes instead of chunks of the destination size (len([]byte)).
When opening a local file and Read()'ing it into a []byte of the same size, it uses the entire space available, as expected. Thus I think it's something specific to part.Read(); however, I'm unable to find anything about a default or maximum chunk size for that method.
For reference, the function is as follows:
func ReceiveFile(w http.ResponseWriter, r *http.Request) {
    reader, err := r.MultipartReader()
    if err != nil {
        panic(err)
    }
    if reader == nil {
        panic("Wrong media type")
    }
    buf := make([]byte, 16384)
    fmt.Println(len(buf))
    for {
        part, err := reader.NextPart()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        var n int
        for {
            n, err = part.Read(buf)
            if err == io.EOF {
                break
            }
            if err != nil {
                panic(err)
            }
            fmt.Printf("Read %d bytes into buf\n", n)
            fmt.Println(len(buf))
        }
        n, err = part.Read(buf)
        fmt.Printf("Finally read %d bytes into buf\n", n)
        fmt.Println(len(buf))
    }
}
The part reader does not attempt to fill the caller's buffer as allowed by the io.Reader contract.
The best way to handle this depends on the requirements of the application.
If you want to slurp the part into memory, then use ioutil.ReadAll:
for {
    part, err := reader.NextPart()
    if err == io.EOF {
        break
    }
    if err != nil {
        // handle error
    }
    p, err := ioutil.ReadAll(part)
    if err != nil {
        // handle error
    }
    // p is []byte with the contents of the part
}
If you want to copy the part to the io.Writer w, then use io.Copy:
for {
    part, err := reader.NextPart()
    if err == io.EOF {
        break
    }
    if err != nil {
        // handle error
    }
    w := // open a writer
    _, err = io.Copy(w, part)
    if err != nil {
        // handle error
    }
}
If you want to process fixed size chunks, then use io.ReadFull:
buf := make([]byte, chunkSize)
for {
    part, err := reader.NextPart()
    if err == io.EOF {
        break
    }
    if err != nil {
        // handle error
    }
    _, err = io.ReadFull(part, buf)
    if err != nil {
        // handle error
        // Note that ReadFull returns an error if it cannot fill buf
    }
    // process the next chunk in buf
}
If the application data is structured in some other way than fixed-size chunks, then bufio.Scanner might be of help.
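For example, a minimal sketch, assuming each part carries newline-delimited records (the record handling is a placeholder):
scanner := bufio.NewScanner(part)
for scanner.Scan() {
    record := scanner.Bytes()
    // process one record at a time
    _ = record
}
if err := scanner.Err(); err != nil {
    // handle error
}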
Instead of changing the chunk size, why not use io.ReadFull?
https://golang.org/pkg/io/#ReadFull
It can manage the entire logic, and if it can't read, it will just return an error.
