Extract from tar file in Go - go

This code tries to tar some texts into a tar file and untar it.
The code for tar works but seems like I am doing something wrong
because untar the same file does not work.
When I untar the file that I manually tar.gz with OS GUI, it works but
not in this code.
http://play.golang.org/p/diTOojUuBX
func main() {
mpath := "a.tar.gz"
// defer os.Remove(mpath)
f, err := overwrite(mpath)
defer f.Close()
if err != nil {
panic(err)
}
gw := gzip.NewWriter(f)
defer gw.Close()
if err != nil {
panic(err)
}
tw := tar.NewWriter(gw)
for _, file := range files {
hdr := &tar.Header{
Name: file.Name,
Mode: 0600,
Size: int64(len(file.Body)),
}
if err := tw.WriteHeader(hdr); err != nil {
panic(err)
}
if _, err := tw.Write([]byte(file.Body)); err != nil {
panic(err)
}
}
// Make sure to check the error on Close.
if err := tw.Close(); err != nil {
panic(err)
}
fr, err := read(mpath)
defer fr.Close()
if err != nil {
panic(err)
}
gr, err := gzip.NewReader(fr)
defer gr.Close()
if err != nil {
panic(err)
}
tr := tar.NewReader(gr)
for {
hdr, err := tr.Next()
if err == io.EOF {
// end of tar archive
break
}
if err != nil {
panic(err)
}
path := hdr.Name
switch hdr.Typeflag {
case tar.TypeDir:
if err := os.MkdirAll(path, os.FileMode(hdr.Mode)); err != nil {
panic(err)
}
case tar.TypeReg:
ow, err := overwrite(path)
defer ow.Close()
if err != nil {
panic(err)
}
if _, err := io.Copy(ow, tr); err != nil {
panic(err)
}
default:
fmt.Printf("Can't: %c, %s\n", hdr.Typeflag, path)
}
}
}

It looks to me like there are two problems.
You are using defer to close your tar writer and gzip writer, however defers only execute when the current scope ends. Since you are running this all in one function your file handles will still be open when you attempt to read to untar and that may cause issues (perhaps file has not flushed fully for example).
You are not setting the Typeflag in the header when you create the tarball. While GNU tar may deal with this, assuming Typeflag '0', Go may not. According to the docs http://www.gnu.org/software/tar/manual/html_node/Standard.html the Typeflag for a regular file is byte '0'. This may mean you need to set the Typeflag on a per resource in your code (directories, files, links, etc).
I rewrote you code like the following, and it is working for me now. (note: storing everything as REGTYPE)
http://play.golang.org/p/3B7F_axr-i
EDIT
Ah-ha, I know the problem with #2 now. The Go library for tar uses FileInfoHeader in order to determine parts of the header, like Typeflag. Since your files are not really files on the system it cannot fill out the appropriate Typeflag.
GNU tar obviously knows how to deal with this, or perhaps tries its best to figure it out and is successful in this case.

Related

How to zip symlinks in Golang using "archive/zip" package?

I need to zip/unzip folder that contains symlinks in a way that the structure will be saved and symlinks will be written as symlinks.
Is there a way doing this using Golang package "archive/zip"? or any other alternative way?
I tried to use this code, but 'io.Copy()' copies the target file content and we "lose" the symlink.
archive, err := os.Create("archive.zip")
if err != nil {
panic(err)
}
defer archive.Close()
zipWriter := zip.NewWriter(archive)
localPath := "../testdata/sym"
file, err := os.Open(localPath)
defer file.Close()
if err != nil {
panic(err)
}
w1 , err:= zipWriter.Create("w1")
if _, err = io.Copy(w1, file); err !=nil{
panic(err)
}
zipWriter.Close()
I used this PR : https://github.com/mholt/archiver/pull/92
and wrote symlink's target to writer.
I tested it on Linux and Windows.
archive, err := os.Create("archive.zip")
if err != nil {
panic(err)
}
defer archive.Close()
zipWriter := zip.NewWriter(archive)
defer zipWriter.Close()
localPath := "../testdata/sym"
symlinksTarget = "../a/b.in"
file, err := os.Open(localPath)
defer file.Close()
if err != nil {
panic(err)
}
info, err := os.Lstat(file.Name())
if err != nil {
panic(err)
}
header, err := zip.FileInfoHeader(info)
if err != nil {
panic(err)
}
header.Method = zip.Deflate
writer, err := zipWriter.CreateHeader(header)
if err != nil {
panic(err)
}
// Write symlink's target to writer - file's body for symlinks is the symlink target.
_, err = writer.Write([]byte(filepath.ToSlash(symlinksTarget)))
if err != nil {
panic(err)
}

Editing a zip file in memory

I am trying to edit a zip file in memory in Go and return the zipped file through a HTTP response
The goal is to add a few files to a path in the zip file example
I add a log.txt file in my path/to/file route in the zipped folder
All this should be done without saving the file or editing the original file.
I have implemented a simple version of real-time stream compression, which can correctly compress a single file. If you want it to run efficiently, you need a lot of optimization.
This is only for reference. If you need more information, you should set more useful HTTP header information before compression so that the client can correctly process the response data.
package main
import (
"archive/zip"
"io"
"net/http"
"os"
"github.com/gin-gonic/gin"
)
func main() {
engine := gin.Default()
engine.GET("/log.zip", func(c *gin.Context) {
f, err := os.Open("./log.txt")
if err != nil {
c.String(http.StatusInternalServerError, err.Error())
return
}
defer f.Close()
info, err := f.Stat()
if err != nil {
c.String(http.StatusInternalServerError, err.Error())
return
}
z := zip.NewWriter(c.Writer)
head, err := zip.FileInfoHeader(info)
if err != nil {
c.String(http.StatusInternalServerError, err.Error())
return
}
defer z.Close()
w, err := z.CreateHeader(head)
if err != nil {
c.String(http.StatusInternalServerError, err.Error())
return
}
_, err = io.Copy(w, f)
if err != nil {
c.String(http.StatusInternalServerError, err.Error())
return
}
})
engine.Run("127.0.0.1:8080")
}
So after hours of tireless work i figured out my approach was bad or maybe not possible with the level of my knowledge so here is a not so optimal solution but it works and fill ur file is not large it should be okay for you.
So you have a file template.zip and u want to add extra files, my initial approach was to copy the whole file into memory and edit it from their but i was having complications.
My next approach was to recreate the file in memory, file by file and to do that i need to know every file in the directory i used the code below to get all my files into a list
root := "template"
err = filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if info.IsDir() {
return nil
}append(files,path)}
now i have all my files and i can create a buffer to hold all this files
buf := new(bytes.Buffer)
// Create a new zip archive.
zipWriter := zip.NewWriter(buf)
now with the zip archive i can write all my old files to it while at the same time copying the contents
for _, file := range files {
zipFile, err := zipWriter.Create(file)
if err != nil {
fmt.Println(err)
}
content, err := ioutil.ReadFile(file)
if err != nil {
log.Fatal(err)
}
// Convert []byte to string and print to screen
// text := string(content)
_, err = zipFile.Write(content)
if err != nil {
fmt.Println(err)
}
}
At this point, we have our file in buf.bytes()
The remaining cold adds the new files and sends the response back to the client
for _, appCode := range appPageCodeText {
f, err := zipWriter.Create(filepath.fileextension)
if err != nil {
log.Fatal(err)
}
_, err = f.Write([]byte(appCode.Content))
}
err = zipWriter.Close()
if err != nil {
fmt.Println(err)
}
w.Header().Set("Content-Disposition", "attachment; filename="+"template.zip")
w.Header().Set("Content-Type", "application/zip")
w.Write(buf.Bytes()) //'Copy' the file to the client

Copy executable golang file to another folder

I wonder if possible to copy running .exe file to another folder. I am trying to do this using usual copy approach in Go like that.
func copy(src, dst string) error {
in, err := os.Open(src)
if err != nil {
return err
}
defer in.Close()
out, err := os.Create(dst)
if err != nil {
return err
}
defer out.Close()
_, err = io.Copy(out, in)
if err != nil {
return err
}
return out.Close()
}
...
copyErr := copy(os.Args[0], "D:"+"\\"+"whs.exe")
if copyErr != nil {
log.Panicf("copy -> %v", copyErr)
}
The file copied with the same size but I can't open it correctly. I have only a fast cmd flash. After several milliseconds, cmd is closing and I can't see even any errors.
I was trying to write errors to log file but it's empty.
f, err := os.OpenFile("debug.log", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0777)
if err != nil {
log.Panicf("setLogOutput -> %v", err)
}
defer f.Close()
log.SetOutput(f)
If I open not copied .exe file everything works correctly.
I've reduced my program to only one main method. The result was the same.
func main() {
log.Println("Starting...")
copyErr := copy(os.Args[0], "F:"+"\\"+"whs.exe")
if copyErr != nil {
log.Panicf("copy -> %v", copyErr)
}
os.Stdin.Read([]byte{0})
}
I have found an error.
The process cannot access the file because it is being used by another process.
I was trying to copy the .exe file to its own path.
func copy(src, dst string) error {
if _, err := os.Stat(dst); os.IsNotExist(err) {
in, err := os.Open(src)
if err != nil {
return err
}
defer in.Close()
out, err := os.Create(dst)
if err != nil {
return err
}
defer out.Close()
_, err = io.Copy(out, in)
if err != nil {
return err
}
}
return nil
}

Getting `write too long` error when trying to create tar.gz file from file and directories

So I'm trying to crate a tar.gz file file from multiple directories and files. Something with the same usage as:
tar -cvzf sometarfile.tar.gz somedir/ someotherdir/ somefile.json somefile.xml
Assuming the directories have other directories inside of them.
I have this as an input:
paths := []string{
"somedir/",
"someotherdir/",
"somefile.json",
"somefile.xml",
}
and using these:
func TarFilesDirs(paths []string, tarFilePath string ) error {
// set up the output file
file, err := os.Create(tarFilePath)
if err != nil {
return err
}
defer file.Close()
// set up the gzip writer
gz := gzip.NewWriter(file)
defer gz.Close()
tw := tar.NewWriter(gz)
defer tw.Close()
// add each file/dir as needed into the current tar archive
for _,i := range paths {
if err := tarit(i, tw); err != nil {
return err
}
}
return nil
}
func tarit(source string, tw *tar.Writer) error {
info, err := os.Stat(source)
if err != nil {
return nil
}
var baseDir string
if info.IsDir() {
baseDir = filepath.Base(source)
}
return filepath.Walk(source,
func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
header, err := tar.FileInfoHeader(info, info.Name())
if err != nil {
return err
}
if baseDir != "" {
header.Name = filepath.Join(baseDir, strings.TrimPrefix(path, source))
}
if err := tw.WriteHeader(header); err != nil {
return err
}
if info.IsDir() {
return nil
}
file, err := os.Open(path)
if err != nil {
return err
}
defer file.Close()
_, err = io.Copy(tw, file)
if err != nil {
log.Println("failing here")
return err
}
return err
})
}
Problem: if a directory is large I'm getting:
archive/tar: write too long
error, when I remove it everything works.
Ran out of ideas and wasted many hours on this trying to find a solution...
Any ideas?
Thanks
I was having a similar issue until I looked more closely at the tar.FileInfoHeader doc:
FileInfoHeader creates a partially-populated Header from fi. If fi describes a symlink, FileInfoHeader records link as the link target. If fi describes a directory, a slash is appended to the name. Because os.FileInfo's Name method returns only the base name of the file it describes, it may be necessary to modify the Name field of the returned header to provide the full path name of the file.
Essentially, FileInfoHeader isn't guaranteed to fill out all the header fields before you write it with WriteHeader, and if you look at the implementation the Size field is only set on regular files. Your code snippet seems to only handle directories, this means if you come across any other non regular file, you write the header with a size of zero then attempt to copy a potentially non-zero sized special file on disk into the tar. Go returns ErrWriteTooLong to stop you from creating a broken tar.
I came up with this and haven't had the issue since.
if err := filepath.Walk(directory, func(path string, info os.FileInfo, err error) error {
if err != nil {
return check(err)
}
var link string
if info.Mode()&os.ModeSymlink == os.ModeSymlink {
if link, err = os.Readlink(path); err != nil {
return check(err)
}
}
header, err := tar.FileInfoHeader(info, link)
if err != nil {
return check(err)
}
header.Name = filepath.Join(baseDir, strings.TrimPrefix(path, directory))
if err = tw.WriteHeader(header); err != nil {
return check(err)
}
if !info.Mode().IsRegular() { //nothing more to do for non-regular
return nil
}
fh, err := os.Open(path)
if err != nil {
return check(err)
}
defer fh.Close()
if _, err = io.CopyBuffer(tw, fh, buf); err != nil {
return check(err)
}
return nil
})
Since you are seeing this issue with a large directory only, I think the following fix might not help, but this will address the issue of creating a tar from files that might be continuously growing.
In my case the issue was that when we were creating the tar header, the header.Size (inside tar.FileInfoHeader) was getting setting to the file size (info.Size()) at that instant of time.
When we later in the code, try to open the concerned file (os.Open) and copy its contents (io.Copy) we risk copying more data than what we earlier set the tar header size to because the file could have grown in the meantime.
This piece of code will ensure we ONLY copy that much data as we set the tar header size to:
_, err = io.**CopyN**(tw, file, info.Size())
if err != nil {
log.Println("failing here")
return err
}
Write writes to the current entry in the tar archive. Write returns the error ErrWriteTooLong if more than hdr.Size bytes are written after WriteHeader.
There is a Size option you can add to the Header. Haven't tried it but maybe that helps...
See also https://golang.org/pkg/archive/tar/

golang scp file using crypto/ssh

I'm trying to download a remote file over ssh
The following approach works fine on shell
ssh hostname "tar cz /opt/local/folder" > folder.tar.gz
However the same approach on golang giving some difference in output artifact size. For example the same folders with pure shell produce artifact gz file 179B and same with go script 178B.
I assume that something has been missed from io.Reader or session got closed earlier. Kindly ask you guys to help.
Here is the example of my script:
func executeCmd(cmd, hostname string, config *ssh.ClientConfig, path string) error {
conn, _ := ssh.Dial("tcp", hostname+":22", config)
session, err := conn.NewSession()
if err != nil {
panic("Failed to create session: " + err.Error())
}
r, _ := session.StdoutPipe()
scanner := bufio.NewScanner(r)
go func() {
defer session.Close()
name := fmt.Sprintf("%s/backup_folder_%v.tar.gz", path, time.Now().Unix())
file, err := os.OpenFile(name, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
panic(err)
}
defer file.Close()
for scanner.Scan() {
fmt.Println(scanner.Bytes())
if err := scanner.Err(); err != nil {
fmt.Println(err)
}
if _, err = file.Write(scanner.Bytes()); err != nil {
log.Fatal(err)
}
}
}()
if err := session.Run(cmd); err != nil {
fmt.Println(err.Error())
panic("Failed to run: " + err.Error())
}
return nil
}
Thanks!
bufio.Scanner is for newline delimited text. According to the documentation, the scanner will remove the newline characters, stripping any 10s out of your binary file.
You don't need a goroutine to do the copy, because you can use session.Start to start the process asynchronously.
You probably don't need to use bufio either. You should be using io.Copy to copy the file, which has an internal buffer already on top of any buffering already done in the ssh client itself. If an additional buffer is needed for performance, wrap the session output in a bufio.Reader
Finally, you return an error value, so use it rather than panic'ing on regular error conditions.
conn, err := ssh.Dial("tcp", hostname+":22", config)
if err != nil {
return err
}
session, err := conn.NewSession()
if err != nil {
return err
}
defer session.Close()
r, err := session.StdoutPipe()
if err != nil {
return err
}
name := fmt.Sprintf("%s/backup_folder_%v.tar.gz", path, time.Now().Unix())
file, err := os.OpenFile(name, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return err
}
defer file.Close()
if err := session.Start(cmd); err != nil {
return err
}
n, err := io.Copy(file, r)
if err != nil {
return err
}
if err := session.Wait(); err != nil {
return err
}
return nil
You can try doing something like this:
r, _ := session.StdoutPipe()
reader := bufio.NewReader(r)
go func() {
defer session.Close()
// open file etc
// 10 is the number of bytes you'd like to copy in one write operation
p := make([]byte, 10)
for {
n, err := reader.Read(p)
if err == io.EOF {
break
}
if err != nil {
log.Fatal("err", err)
}
if _, err = file.Write(p[:n]); err != nil {
log.Fatal(err)
}
}
}()
Make sure your goroutines are synchronized properly so output is completeky written to the file.

Resources