I have a tar file that contains multiple tar files. I'm currently extracting these inner tars recursively with the tar Reader, iterating over the entries manually. This is heavy and slow, especially for large tar files that contain thousands of files and directories.
I couldn't find a package that does this recursive extraction quickly. I also tried running the command tar -xf file.tar --same-owner for the inner tars, but ran into a permissions issue (which happens only on Mac).
My question is:
Is there a way to parallelize the manual extraction so that the inner tars are extracted in parallel?
I have a method for the extraction task which I'm trying to make parallel:
var wg sync.WaitGroup
for {
	header, err := tarBallReader.Next()
	if err != nil {
		break
	}
	go extractFileAsync(parentFolder, header, tarBallReader, depth, &wg)
}
After adding the goroutines, the files get corrupted and the process gets stuck in an endless loop.
Example of the main tar content:
Alternatively, you can simply run docker save <image>:<tag> -o image.tar and inspect the content of the tar.

Your code probably hangs on wg.Wait() because the number of calls to wg.Done() during execution does not equal len(tarFiles).
This should work:
var wg sync.WaitGroup
// wg.Add(len(tarFiles))
for {
	header, err := tarBallReader.Next()
	if err != nil {
		break
	}
	// register each goroutine right before starting it
	wg.Add(1)
	go extractFileAsync(parentFolder, header, tarBallReader, depth, &wg)
}
wg.Wait()

func extractFileAsync(...) {
	defer wg.Done()
	// some code
}
UPD: corrected a possible race condition. Thanks #craigb
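One more thing that may explain the corrupted files: a tar.Reader is a single sequential stream, so a goroutine that still reads from tarBallReader while the loop calls Next() again will see another entry's data. A minimal sketch of one way around this is below. It copies each inner .tar entry into its own buffer before starting the goroutine, and for brevity it only counts the entries of each inner archive instead of writing files. The names buildTar, countEntries and countInnerEntries are made up for this illustration, and buffering in memory is an assumption that only suits inner tars that fit in RAM (a temporary file would be the alternative).

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// buildTar packs name -> content pairs into an in-memory tar archive (demo helper).
func buildTar(files map[string][]byte) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, data := range files {
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(data))}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
		if _, err := tw.Write(data); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// countEntries reads one tar stream to the end and returns its entry count.
func countEntries(r io.Reader) (int, error) {
	tr := tar.NewReader(r)
	n := 0
	for {
		_, err := tr.Next()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

// countInnerEntries walks the outer archive sequentially, copies every inner
// .tar entry into its own buffer, and processes those buffers in parallel.
func countInnerEntries(outer []byte) (map[string]int, error) {
	tr := tar.NewReader(bytes.NewReader(outer))
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex // guards counts and firstErr
		firstErr error
	)
	counts := make(map[string]int)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			wg.Wait()
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg || !strings.HasSuffix(hdr.Name, ".tar") {
			continue
		}
		// Read the whole entry here, on the loop's goroutine, before calling Next again.
		data, err := io.ReadAll(tr)
		if err != nil {
			wg.Wait()
			return nil, err
		}
		wg.Add(1)
		go func(name string, data []byte) {
			defer wg.Done()
			n, err := countEntries(bytes.NewReader(data))
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			counts[name] = n
		}(hdr.Name, data)
	}
	wg.Wait()
	return counts, firstErr
}

func main() {
	inner := buildTar(map[string][]byte{"a.txt": []byte("a"), "b.txt": []byte("b")})
	outer := buildTar(map[string][]byte{"layer.tar": inner, "manifest.json": []byte("{}")})
	counts, err := countInnerEntries(outer)
	fmt.Println(counts, err)
}
```

The goroutines never touch the outer reader; only the loop does. In a real extractor the body of the goroutine would write files to disk instead of counting.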
Here is my solution to a similar problem (simplified):
package main

import (
	"archive/tar"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

type Semaphore struct {
	Wg sync.WaitGroup
	Ch chan int
}

// Limit on the number of simultaneously running goroutines.
// Depends on the number of processor cores, storage performance, amount of RAM, etc.
const grMax = 10

const tarFileName = "docker_image.tar"
const dstDir = "output/docker"

func extractTar(tarFileName string, dstDir string) error {
	f, err := os.Open(tarFileName)
	if err != nil {
		return err
	}
	defer f.Close()

	sem := Semaphore{}
	sem.Ch = make(chan int, grMax)

	if err := Untar(dstDir, f, &sem, true); err != nil {
		return err
	}

	fmt.Println("extractTar: wait for complete")
	// wait until every goroutine started by Untar has finished
	sem.Wg.Wait()
	return nil
}

func Untar(dst string, r io.Reader, sem *Semaphore, godeep bool) error {
	tr := tar.NewReader(r)

	for {
		header, err := tr.Next()
		switch {
		case err == io.EOF:
			return nil
		case err != nil:
			return err
		}

		// the target location where the dir/file should be created
		target := filepath.Join(dst, header.Name)

		switch header.Typeflag {
		// if it's a dir and it doesn't exist, create it
		case tar.TypeDir:
			if _, err := os.Stat(target); err != nil {
				if err := os.MkdirAll(target, 0755); err != nil {
					return err
				}
			}

		// if it's a file, create it
		case tar.TypeReg:
			if err := saveFile(tr, target, os.FileMode(header.Mode)); err != nil {
				return err
			}
			ext := filepath.Ext(target)

			// if it's a tar file and we are on the top level, extract it
			if ext == ".tar" && godeep {
				// A buffered channel is used to limit the number of simultaneously running goroutines
				sem.Ch <- 1
				// the file is unpacked to a directory with the file name (without extension)
				newDir := filepath.Join(dst, strings.TrimSuffix(header.Name, ".tar"))
				if err := os.Mkdir(newDir, 0755); err != nil {
					return err
				}
				// register the goroutine before it starts; it is matched by the deferred Done below
				sem.Wg.Add(1)
				go func(target string, newDir string, sem *Semaphore) {
					fmt.Println("start goroutine, chan length:", len(sem.Ch))
					fmt.Println("START:", target)
					defer sem.Wg.Done()
					defer func() { <-sem.Ch }()
					// the internal tar file opens
					ft, err := os.Open(target)
					if err != nil {
						fmt.Println(err)
						return
					}
					defer ft.Close()
					// the godeep parameter is false here to avoid unpacking archives inside the current archive.
					if err := Untar(newDir, ft, sem, false); err != nil {
						fmt.Println(err)
						return
					}
					fmt.Println("DONE:", target)
				}(target, newDir, sem)
			}
		}
	}
}

func saveFile(r io.Reader, target string, mode os.FileMode) error {
	f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, mode)
	if err != nil {
		return err
	}
	defer f.Close()

	if _, err := io.Copy(f, r); err != nil {
		return err
	}
	return nil
}

func main() {
	err := extractTar(tarFileName, dstDir)
	if err != nil {
		fmt.Println(err)
	}
}


Golang: Facing error while creating .tar.gz file having large name

I am trying to create a .tar.gz file from a folder that contains multiple files and folders. The .tar.gz file gets created, but when I extract it, the files are not extracted properly. I think this is mostly because of long names, or paths exceeding some number of characters, because the same thing works when the filename/path is short. I referred to this and tried adding the code below, but it did not help.
header.Uid = 0
header.Gid = 0
I am using the simple code below to create the .tar.gz. The approach is: I create a temp folder, do some processing on the files, and then create the .tar.gz file from that temp path. That is why the code below uses a pre-defined temp folder path.
package main

import (
	"archive/tar"
	"compress/gzip"
	"fmt"
	"io"
	"log"
	"os"
	fp "path/filepath"
)

func main() {
	// Create output file
	out, err := os.Create("output.tar.gz")
	if err != nil {
		log.Fatalln("Error writing archive:", err)
	}
	defer out.Close()

	// Create the archive and write the output to the "out" Writer
	tmpDir := "C:/Users/USERNAME~1/AppData/Local/Temp/temp-241232063"
	err = createArchive1(tmpDir, out)
	if err != nil {
		log.Fatalln("Error creating archive:", err)
	}
	fmt.Println("Archive created successfully")
}

func createArchive1(path string, targetFile *os.File) error {
	gw := gzip.NewWriter(targetFile)
	defer gw.Close()
	tw := tar.NewWriter(gw)
	defer tw.Close()

	// walk through every file in the folder
	err := fp.Walk(path, func(filePath string, info os.FileInfo, err error) error {
		// ensure the src actually exists before trying to tar it
		if _, err := os.Stat(filePath); err != nil {
			return err
		}
		if err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		file, err := os.Open(filePath)
		if err != nil {
			return err
		}
		defer file.Close()

		// generate tar header (check the error before touching the header)
		header, err := tar.FileInfoHeader(info, info.Name())
		if err != nil {
			return err
		}
		header.Uid = 0
		header.Gid = 0
		header.Name = filePath //strings.TrimPrefix(filePath, fmt.Sprintf("%s/", fp.Dir(path))) //info.Name()

		// write header
		if err := tw.WriteHeader(header); err != nil {
			return err
		}
		if _, err := io.Copy(tw, file); err != nil {
			return err
		}
		return nil
	})
	return err
}
Please let me know what I am doing wrong.
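One thing that stands out in the code above is that header.Name is set to the full filePath, which on this machine is an absolute Windows path with a drive letter. Whether that is the actual cause of the extraction problem is only a guess, but archive entries are normally stored with relative, slash-separated names. A minimal sketch of that variant follows; archiveName and createArchive are names made up for this illustration, and the demo in main archives a throwaway temp directory into io.Discard.

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// archiveName converts an on-disk path into the name stored in the archive:
// relative to root and always slash-separated.
func archiveName(root, path string) (string, error) {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

// createArchive writes every regular file under root into a gzip-compressed
// tar stream on w, using names relative to root.
func createArchive(root string, w io.Writer) (err error) {
	gw := gzip.NewWriter(w)
	tw := tar.NewWriter(gw)
	// Close the tar writer first, then gzip, and report the first close error.
	defer func() {
		if cerr := tw.Close(); err == nil {
			err = cerr
		}
		if cerr := gw.Close(); err == nil {
			err = cerr
		}
	}()

	return filepath.Walk(root, func(path string, info os.FileInfo, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !info.Mode().IsRegular() {
			return nil
		}
		header, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		if header.Name, err = archiveName(root, path); err != nil {
			return err
		}
		if err := tw.WriteHeader(header); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
}

func main() {
	dir, err := os.MkdirTemp("", "archive-demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "hello.txt"), []byte("hi"), 0o644); err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := createArchive(dir, io.Discard); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("archive written")
}
```

With names like sub/file.txt instead of C:/Users/.../sub/file.txt, the entries are also much shorter, which may matter if the extracting tool has trouble with long names.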

The process cannot access the file because it is being used by another process in Golang

The process cannot access the file ... because it is being used by another process
I can't remove the zip file with this code.
Is it possible to extract and delete the file in the same program?
package main

import (
	"archive/zip"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	url := ""
	out, _ := os.Create("E:\\experi\\")
	defer out.Close()
	resp, _ := http.Get(url)
	defer resp.Body.Close()
	_, _ = io.Copy(out, resp.Body)
	files, err := Unzip("E:\\experi\\", "E:\\experi\\1234567890")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Unzipped the following files:\n" + strings.Join(files, "\n"))
	removeFile()
}

func Unzip(src string, destination string) ([]string, error) {
	var filenames []string
	r, err := zip.OpenReader(src)
	if err != nil {
		return filenames, err
	}
	defer r.Close()
	for _, f := range r.File {
		fpath := filepath.Join(destination, f.Name)
		// reject entries that would be written outside the destination
		if !strings.HasPrefix(fpath, filepath.Clean(destination)+string(os.PathSeparator)) {
			return filenames, fmt.Errorf("%s is an illegal filepath", fpath)
		}
		filenames = append(filenames, fpath)
		if f.FileInfo().IsDir() {
			os.MkdirAll(fpath, os.ModePerm)
			continue
		}
		if err = os.MkdirAll(filepath.Dir(fpath), os.ModePerm); err != nil {
			return filenames, err
		}
		outFile, err := os.OpenFile(fpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
		if err != nil {
			return filenames, err
		}
		rc, err := f.Open()
		if err != nil {
			outFile.Close()
			return filenames, err
		}
		_, err = io.Copy(outFile, rc)
		// close without defer so the files are released before the next iteration
		outFile.Close()
		rc.Close()
		if err != nil {
			return filenames, err
		}
	}
	return filenames, nil
}

func removeFile() {
	err := os.Remove("E:\\experi\\")
	if err != nil {
		log.Fatal(err)
	}
}
Output:
2020/10/28 13:09:04 remove E:\experi\ The process cannot access the file because it is being used by another process.
Process finished with exit code 1
Is there any other way to do the same thing?
Did I go wrong anywhere?
Help would be much appreciated. Thanks in advance. :)
out, _ := os.Create("E:\\experi\\") creates or truncates the file and returns you a *File (so the file is open).
defer out.Close() closes the file "the moment the surrounding function returns" (spec).
So at the time you call Unzip, you still have the file open. To fix this, call out.Close() before the call to Unzip (and please don't assume that calls complete without error).
If you close the file with defer, it is only closed after the last line of the function has run. You must close the file explicitly before removing it.
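As a rough sketch of the order of operations described above (write, close explicitly, process, then remove), using a temporary file so it runs anywhere. The helper names saveTo and saveProcessRemove are invented for this example, and reading the file back stands in for the Unzip call.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// saveTo writes src to path and closes the file before returning,
// so the caller is free to reopen or delete it afterwards.
func saveTo(path string, src io.Reader) error {
	out, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, src); err != nil {
		out.Close()
		return err
	}
	// Close explicitly (no defer) and report its error.
	return out.Close()
}

// saveProcessRemove saves data to a temporary file, reads it back
// (a stand-in for Unzip) and then removes the file.
func saveProcessRemove(data []byte) error {
	dir, err := os.MkdirTemp("", "close-before-remove")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "archive.zip")
	if err := saveTo(path, bytes.NewReader(data)); err != nil {
		return err
	}
	got, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, data) {
		return errors.New("content mismatch")
	}
	// No handle is open any more, so the remove is not blocked by our own process.
	if err := os.Remove(path); err != nil {
		return err
	}
	if _, err := os.Stat(path); !errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("file still exists: %v", err)
	}
	return nil
}

func main() {
	if err := saveProcessRemove([]byte("example payload")); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("saved, processed and removed")
}
```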

How to repeatedly shut down and re-establish a goroutine?

Hi everyone, I am new to Go. I want to read the data from a log file generated by my application, but I ran into a problem because of the log rotation mechanism. For instance, my target log file is chats.log; on rotation it is renamed to chats.log.2018xxx and a new chats.log is created, at which point my goroutine that reads the log file stops working.
So I need to detect the change, shut down the previous goroutine, and then start a new one.
I looked for modules that could help me, and I found fsnotify:
// wg is taken by pointer: a WaitGroup must not be copied after first use.
func ExampleNewWatcher(fn string, createnoti chan string, wg *sync.WaitGroup) {
	defer wg.Done()
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer watcher.Close()
	done := make(chan bool)
	go func() {
		for {
			select {
			case event := <-watcher.Events:
				if event.Op == fsnotify.Create && event.Name == fn {
					createnoti <- "has been created"
				}
			case err := <-watcher.Errors:
				log.Println("error:", err)
			}
		}
	}()
	err = watcher.Add("./")
	if err != nil {
		log.Fatal(err)
	}
	<-done
}
I use fsnotify to detect the change, make sure the event belongs to my log file, and then send a message to a channel.
This is my worker goroutine:
// wg is taken by pointer: a WaitGroup must not be copied after first use.
func tailer(fn string, isfollow bool, outchan chan string, done <-chan interface{}, wg *sync.WaitGroup) error {
	defer wg.Done()
	_, err := os.Stat(fn)
	if err != nil {
		return err
	}
	t, err := tail.TailFile(fn, tail.Config{Follow: isfollow})
	if err != nil {
		return err
	}
	defer t.Stop()
	for line := range t.Lines {
		select {
		case outchan <- line.Text:
		case <-done:
			return nil
		}
	}
	return nil
}
I am using the tail module to read the log file, and I added a done channel to shut down the loop (I don't know whether I did that the right way).
I send every log line to a channel so that it can be consumed.
So here is the question: how should I put it all together?
PS: I could use a tool such as apache-flume for this, but all of those tools bring extra dependencies.
Thanks a lot!
Here is a complete example that reloads and rereads the file as it changes or gets deleted and recreated:
package main

import (
	"io/ioutil"
	"log"

	"github.com/fsnotify/fsnotify"
)

const filename = "myfile.txt"

func ReadFile(filename string) string {
	data, err := ioutil.ReadFile(filename)
	if err != nil {
		log.Println(err)
	}
	return string(data)
}

func main() {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer watcher.Close()
	err = watcher.Add("./")
	if err != nil {
		log.Fatal(err)
	}
	for {
		select {
		case event := <-watcher.Events:
			if event.Op == fsnotify.Create && event.Name == filename {
				// the file was (re)created: read it again
				log.Println(ReadFile(filename))
			}
		case err := <-watcher.Errors:
			log.Println("error:", err)
		}
	}
}
Note this doesn't require goroutines, channels or a WaitGroup. Better to keep things simple and reserve those for when they're actually needed.
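If the reader does have to run in its own goroutine, as in the question, the stop-and-restart cycle could be sketched roughly like this. It uses only the standard library: a channel named rotated stands in for the fsnotify Create events, and tailUntilCancelled stands in for the tailer. Both names, and supervise itself, are made up for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// tailUntilCancelled is a placeholder for the real tailer: it simply
// blocks until its context is cancelled.
func tailUntilCancelled(ctx context.Context, generation int) {
	<-ctx.Done()
}

// supervise keeps exactly one worker running. Every value received on
// rotated stops the current worker, waits for it to exit, and starts a
// fresh one. When rotated is closed, the last worker is stopped and the
// number of workers that were started is returned.
func supervise(rotated <-chan struct{}, run func(ctx context.Context, generation int)) int {
	generation := 0
	for {
		ctx, cancel := context.WithCancel(context.Background())
		done := make(chan struct{})
		go func(gen int) {
			defer close(done)
			run(ctx, gen)
		}(generation)
		generation++

		_, ok := <-rotated // block until the log is rotated or the channel is closed
		cancel()           // ask the current worker to stop
		<-done             // and wait until it has really exited
		if !ok {
			return generation
		}
	}
}

func main() {
	rotated := make(chan struct{})
	go func() {
		rotated <- struct{}{} // first rotation
		rotated <- struct{}{} // second rotation
		close(rotated)        // shut everything down
	}()
	started := supervise(rotated, tailUntilCancelled)
	fmt.Println("workers started:", started)
}
```

Waiting on done before starting the next worker guarantees that two readers never run against the file at the same time.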

Bug using Go io.Pipe to tar files

I have been testing code that uses io.Pipe to tar and gzip files into a tarball, which I then extract with the tar utility. The following code runs without errors, but the extraction keeps reporting:
tar: Truncated input file (needed 1050624 bytes, only 0 available)
tar: Error exit delayed from previous errors.
This issue is really driving me crazy. It has been two weeks, and I really need help debugging it.
Development environment: go version go1.9 darwin/amd64
package main

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"log"
	"os"
	"path/filepath"
	"testing"
)

func testTarGzipPipe2(t *testing.T) {
	src := "/path/to/file/folder"
	pr, pw := io.Pipe()
	gzipWriter := gzip.NewWriter(pw)
	defer gzipWriter.Close()
	tarWriter := tar.NewWriter(gzipWriter)
	defer tarWriter.Close()
	status := make(chan bool)
	go func() {
		defer pr.Close()
		// tar to local disk
		tarFile, err := os.OpenFile("/path/to/tar/ball/test.tar.gz", os.O_RDWR|os.O_CREATE, 0755)
		if err != nil {
			log.Fatal(err)
		}
		defer tarFile.Close()
		if _, err := io.Copy(tarFile, pr); err != nil {
			log.Fatal(err)
		}
		status <- true
	}()
	err := filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		header, err := tar.FileInfoHeader(info, info.Name())
		if err != nil {
			return err
		}
		// header.Name = strings.TrimPrefix(strings.Replace(path, src, "", -1), string(filepath.Separator))
		if err := tarWriter.WriteHeader(header); err != nil {
			return err
		}
		if info.Mode().IsDir() {
			return nil
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		if _, err := io.Copy(tarWriter, f); err != nil {
			return err
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	// the pipe is closed here, before the deferred Close calls above have run
	pw.Close()
	<-status
}
You are closing the pipe before the deferred Close calls on the gzipWriter and tarWriter. There's no error, because you're not checking the error on either of those close calls. You need to close the tarWriter, then the gzipWriter, then the PipeWriter, in that order.
However, there's no reason for the pipe at all in this code, and you can remove the goroutine and the associated coordination altogether if you write directly to the file.
tarFile, err := os.OpenFile("/tmp/test.tar.gz", os.O_RDWR|os.O_CREATE, 0644)
if err != nil {
	log.Fatal(err)
}
defer tarFile.Close()
gzipWriter := gzip.NewWriter(tarFile)
defer gzipWriter.Close()
tarWriter := tar.NewWriter(gzipWriter)
defer tarWriter.Close()
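For cases where the pipe really is needed (for example, when the consumer is an upload rather than a local file), the close order described above can be sketched as follows. This is an illustration only: tarGzViaPipe and firstEntry are made-up names, and a single in-memory file replaces the directory walk.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// tarGzViaPipe streams a one-file tar.gz archive through an io.Pipe and
// returns the compressed bytes collected on the reading side.
func tarGzViaPipe(name string, data []byte) ([]byte, error) {
	pr, pw := io.Pipe()

	// Producer: writes the archive into the pipe.
	go func() {
		gw := gzip.NewWriter(pw)
		tw := tar.NewWriter(gw)
		err := tw.WriteHeader(&tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(data))})
		if err == nil {
			_, err = tw.Write(data)
		}
		// Close order matters: tar writer, then gzip writer, then the pipe.
		if cerr := tw.Close(); err == nil {
			err = cerr
		}
		if cerr := gw.Close(); err == nil {
			err = cerr
		}
		// With a nil error this behaves like Close: the reader sees io.EOF.
		pw.CloseWithError(err)
	}()

	// Consumer: reads until the producer closes the pipe.
	return io.ReadAll(pr)
}

// firstEntry decompresses archive and returns the name and content of its first entry.
func firstEntry(archive []byte) (string, []byte, error) {
	gr, err := gzip.NewReader(bytes.NewReader(archive))
	if err != nil {
		return "", nil, err
	}
	defer gr.Close()
	tr := tar.NewReader(gr)
	hdr, err := tr.Next()
	if err != nil {
		return "", nil, err
	}
	content, err := io.ReadAll(tr)
	return hdr.Name, content, err
}

func main() {
	archive, err := tarGzViaPipe("hello.txt", []byte("hello pipe"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	name, content, err := firstEntry(archive)
	fmt.Println(name, string(content), err)
}
```

Closing the tar and gzip writers before the pipe lets them flush their trailers into the stream; closing the pipe first is what leaves the archive truncated.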

Go file downloader

I have the following code, which is supposed to download a file by splitting it into multiple parts. Right now it only works on images; when I try downloading other files, such as tar files, the output is an invalid file.
Edit: switched from Write to (*os.File).WriteAt and removed the os.O_APPEND file mode.
package main

import (
	"errors"
	"flag"
	"fmt"
	"io/ioutil"
	"net/http"
	"os"
	"strconv"
)

var file_url string
var workers int
var filename string

func init() {
	flag.StringVar(&file_url, "url", "", "URL of the file to download")
	flag.StringVar(&filename, "filename", "", "Name of downloaded file")
	flag.IntVar(&workers, "workers", 2, "Number of download workers")
	flag.Parse()
}

func get_headers(url string) (map[string]string, error) {
	headers := make(map[string]string)
	resp, err := http.Head(url)
	if err != nil {
		return headers, err
	}
	if resp.StatusCode != 200 {
		return headers, errors.New(resp.Status)
	}
	for key, val := range resp.Header {
		headers[key] = val[0]
	}
	return headers, err
}

func download_chunk(url string, out string, start int, stop int) {
	client := new(http.Client)
	req, _ := http.NewRequest("GET", url, nil)
	req.Header.Add("Range", fmt.Sprintf("bytes=%d-%d", start, stop))
	resp, _ := client.Do(req)
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		fmt.Println(err)
		return
	}
	file, err := os.OpenFile(out, os.O_WRONLY, 0600)
	if err != nil {
		if file, err = os.Create(out); err != nil {
			fmt.Println(err)
			return
		}
	}
	defer file.Close()
	if _, err := file.WriteAt(body, int64(start)); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(fmt.Sprintf("Range %d-%d: %d", start, stop, resp.ContentLength))
}

func main() {
	headers, err := get_headers(file_url)
	if err != nil {
		fmt.Println(err)
	} else {
		length, _ := strconv.Atoi(headers["Content-Length"])
		bytes_chunk := length / workers
		fmt.Println("file length: ", length)
		for i := 0; i < workers; i++ {
			start := i * bytes_chunk
			stop := start + (bytes_chunk - 1)
			go download_chunk(file_url, filename, start, stop)
		}
		// block until the user presses Enter so the goroutines can finish
		var input string
		fmt.Scanln(&input)
	}
}
Basically, it reads the length of the file and divides it by the number of workers; each worker then downloads its part using HTTP's Range header and, after downloading, seeks to the position in the file where that chunk is to be written.
If you really ignore as many errors as seen above, then your code cannot be expected to work reliably for any file type.
However, I think I can see one problem in your code. Mixing O_APPEND and seek is probably a mistake (Seek should be ignored with this mode). I suggest using (*os.File).WriteAt instead.
IIRC, O_APPEND forces any write to happen at the [current] end of file. However, your download_chunk function instances for file parts can be executing in unpredictable order, thus "reordering" the file parts. The result is then a corrupted file.
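To illustrate the WriteAt suggestion, here is a small sketch that leaves out the HTTP part. It splits a byte slice into inclusive ranges (the last range absorbs the remainder of the integer division, which the loop in the question would drop), lets one goroutine per range write its chunk at its own offset through a single shared file, and waits with a sync.WaitGroup instead of fmt.Scanln. The names byteRange, chunkRanges and assemble are made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"sync"
)

// byteRange is an inclusive range, matching the HTTP Range header format.
type byteRange struct{ Start, Stop int64 }

// chunkRanges splits length bytes into at most workers consecutive ranges.
// The last range absorbs the remainder of the integer division.
func chunkRanges(length int64, workers int) []byteRange {
	if length <= 0 || workers <= 0 {
		return nil
	}
	if int64(workers) > length {
		workers = int(length)
	}
	size := length / int64(workers)
	ranges := make([]byteRange, workers)
	for i := range ranges {
		start := int64(i) * size
		stop := start + size - 1
		if i == workers-1 {
			stop = length - 1
		}
		ranges[i] = byteRange{start, stop}
	}
	return ranges
}

// assemble writes src into a temporary file chunk by chunk, one goroutine
// per range, each at its own offset, and returns the file's final content.
func assemble(src []byte, workers int) ([]byte, error) {
	f, err := os.CreateTemp("", "chunks-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	ranges := chunkRanges(int64(len(src)), workers)
	errs := make([]error, len(ranges)) // one slot per goroutine, no sharing
	var wg sync.WaitGroup
	for i, r := range ranges {
		wg.Add(1)
		go func(i int, r byteRange) {
			defer wg.Done()
			// The order in which goroutines run does not matter:
			// each chunk lands at its own offset.
			_, errs[i] = f.WriteAt(src[r.Start:r.Stop+1], r.Start)
		}(i, r)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return os.ReadFile(f.Name())
}

func main() {
	src := []byte("0123456789abcdefghij")
	got, err := assemble(src, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chunkRanges(int64(len(src)), 3), bytes.Equal(got, src))
}
```

Opening the file once and sharing the handle also avoids the situation where several goroutines race to create the same output file.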
1. The execution order of the goroutines is not guaranteed.
For example, the output may look like this:
file length:20902
Range 10451-20901:10451
Range 0-10450:10451
so the chunks can't simply be appended.
2. Writes of the chunk data must be protected by a sync.Mutex.
(My English is poor, please forgive me.)
