Related
I have a 7z archive of a number of .txt files. I am trying to list all the files in the archive and upload them to an s3 bucket. But I'm having trouble with extracting .7z archives on Go. To do this, I found a package github.com/gen2brain/go-unarr (imported as extractor) and this is what I have so far
content, err := ioutil.ReadFile("sample_archive.7z")
if err != nil {
fmt.Printf("err: %+v", err)
}
a, err := extractor.NewArchiveFromMemory(content)
if err != nil {
fmt.Printf("err: %+v", err)
}
lst, _ := a.List()
fmt.Printf("lst: %+v", last)
This prints a list of all the files in the archive. But this has two issues.
It reads files from local using ioutil and the input of NewArchiveFromMemory must be of type []byte. But I can't read from local and will have to use a file from memory of type os.file. So I will either have to find a different method or convert the os.file to []byte. There's another method NewArchiveFromReader(r io.Reader). But this is returning an error saying Bad File Descriptor.
file, err := os.OpenFile(
path,
os.O_WRONLY|os.O_TRUNC|os.O_CREATE,
0666,
)
a, err := extractor.NewArchiveFromReader(file)
if err != nil {
fmt.Printf("ERROR: %+v", err)
}
lst, _ := a.List()
fmt.Printf("files: %+v\n", lst)
I am able to get the list of the files in the archive. And using Extract(destinaltion_path string), I can also extract it to a local directory. But I want the extracted files also in os.file format ( ie. a list of os.file since there will be multiple files ).
How can I change my current code to achieve both the above targets? Is there any other library to do this?
os.File implements the io.Reader interface (because it has a Read([]byte) (int, error) method defined), so you can use NewArchiveFromReader(file) without any conversions needed. You can read up on Go interfaces for more background on why that works.
If you're okay with extracting to a local directory, you can do that and then read the files back in (warning, may contain typos):
func extractAndOpenAll(*extractor.Archive) ([]*os.File, error) {
err := a.Extract("/tmp/path") // consider using ioutil.TempDir()
if err != nil {
return nil, err
}
filestats, err := ioutil.ReadDir("/tmp/path")
if err != nil {
return nil, err
}
# warning: all these file handles must be closed by the caller,
# which is why even the error case here returns the list of files.
# if you forget, your process might leak file handles.
files := make([]*os.File, 0)
for _, fs := range(filestats) {
file, err := os.Open(fs.Name())
if err != nil {
return files, err
}
files = append(files, file)
}
return files, nil
}
It is possible to use the archived files without writing back to disk (https://github.com/gen2brain/go-unarr#read-all-entries-from-archive), but whether or not you should do that instead depends on what your next step is.
I have downloaded over 500 Gb of data to a single directory off of AWS.
Whenever I try to access that directory, the command line hangs and doesn't show me anything.
I'm trying to run some code that will interact with the files by printing out the path of each file but the command line hangs and then exits the program.
The program definitely starts execution because "Printing file path's" gets displayed to the console.
func main() {
fmt.Println("Printing file path's")
err := filepath.Walk(source,
func(fpath string, info os.FileInfo, err error) {
if !info.IsDir() && file path.Ext(fpath)==".txt" {
fmt.Println(fpath)
}
}
}
}
How should I handle the situation of being able to view all the files in the command line and why is this program not working?
UPDATE:
By using
files, err := dir.Readdir(10)
if err == io.EOF {
break
}
I was able to snap up the first 10 folders/files in the directory.
Using a loop I could keep doing this until I hit the end of the directory.
This doesn't rely on ordering the files/folders as does the walk function and so its more efficient.
The possible performance issue with filepath.Walk is clearly documented:
The files are walked in lexical order, which makes the output deterministic but means that for very large directories Walk can be inefficient.
Use os.File.Readdir to iterate files in filesystem order:
Readdir reads the contents of the directory associated with file and returns a slice of up to n FileInfo values, as would be returned by Lstat, in directory order. Subsequent calls on the same file will yield further FileInfos.
package main
import (
"fmt"
"io"
"log"
"os"
"time"
)
func main() {
dir, err := os.Open("/tmp")
if err != nil {
log.Fatal(err)
}
for {
files, err := dir.Readdir(10)
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
for _, fi := range files {
classifier := ""
if fi.IsDir() {
classifier = "/"
}
fmt.Printf("%v %12d %s%s\n",
fi.ModTime().UTC().Truncate(time.Second),
fi.Size(),
fi.Name(), classifier,
)
}
}
}
I've been trying to figure out how to simply list the files and folders in a single directory in Go.
I've found filepath.Walk, but it goes into sub-directories automatically, which I don't want. All of my other searches haven't turned anything better up.
I'm sure that this functionality exists, but it's been really hard to find. Let me know if anyone knows where I should look. Thanks.
You can try using the ReadDir function in the os package. Per the docs:
ReadDir reads the named directory, returning all its directory entries sorted by filename.
The resulting slice contains os.DirEntry types, which provide the methods listed here. Here is a basic example that lists the name of everything in the current directory (folders are included but not specially marked - you can check if an item is a folder by using the IsDir() method):
package main
import (
"fmt"
"os"
"log"
)
func main() {
entries, err := os.ReadDir("./")
if err != nil {
log.Fatal(err)
}
for _, e := range entries {
fmt.Println(e.Name())
}
}
We can get a list of files inside a folder on the file system using various golang standard library functions.
filepath.Walk
ioutil.ReadDir
os.File.Readdir
package main
import (
"fmt"
"io/ioutil"
"log"
"os"
"path/filepath"
)
func main() {
var (
root string
files []string
err error
)
root := "/home/manigandan/golang/samples"
// filepath.Walk
files, err = FilePathWalkDir(root)
if err != nil {
panic(err)
}
// ioutil.ReadDir
files, err = IOReadDir(root)
if err != nil {
panic(err)
}
//os.File.Readdir
files, err = OSReadDir(root)
if err != nil {
panic(err)
}
for _, file := range files {
fmt.Println(file)
}
}
Using filepath.Walk
The path/filepath package provides a handy way to scan all the files
in a directory, it will automatically scan each sub-directories in the
directory.
func FilePathWalkDir(root string) ([]string, error) {
var files []string
err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if !info.IsDir() {
files = append(files, path)
}
return nil
})
return files, err
}
Using ioutil.ReadDir
ioutil.ReadDir reads the directory named by dirname and returns a
list of directory entries sorted by filename.
func IOReadDir(root string) ([]string, error) {
var files []string
fileInfo, err := ioutil.ReadDir(root)
if err != nil {
return files, err
}
for _, file := range fileInfo {
files = append(files, file.Name())
}
return files, nil
}
Using os.File.Readdir
Readdir reads the contents of the directory associated with file and
returns a slice of up to n FileInfo values, as would be returned by
Lstat, in directory order. Subsequent calls on the same file will
yield further FileInfos.
func OSReadDir(root string) ([]string, error) {
var files []string
f, err := os.Open(root)
if err != nil {
return files, err
}
fileInfo, err := f.Readdir(-1)
f.Close()
if err != nil {
return files, err
}
for _, file := range fileInfo {
files = append(files, file.Name())
}
return files, nil
}
Benchmark results.
Get more details on this Blog Post
Even simpler, use path/filepath:
package main
import (
"fmt"
"log"
"path/filepath"
)
func main() {
files, err := filepath.Glob("*")
if err != nil {
log.Fatal(err)
}
fmt.Println(files) // contains a list of all files in the current directory
}
Starting with Go 1.16, you can use the os.ReadDir function.
func ReadDir(name string) ([]DirEntry, error)
It reads a given directory and returns a DirEntry slice that contains the directory entries sorted by filename.
It's an optimistic function, so that, when an error occurs while reading the directory entries, it tries to return you a slice with the filenames up to the point before the error.
package main
import (
"fmt"
"log"
"os"
)
func main() {
files, err := os.ReadDir(".")
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Println(file.Name())
}
}
Of interest: Go 1.17 (Q3 2021) includes fs.FileInfoToDirEntry():
func FileInfoToDirEntry(info FileInfo) DirEntry
FileInfoToDirEntry returns a DirEntry that returns information from info.
If info is nil, FileInfoToDirEntry returns nil.
Background
Go 1.16 (Q1 2021) will propose, with CL 243908 and CL 243914 , the ReadDir function, based on the FS interface:
// An FS provides access to a hierarchical file system.
//
// The FS interface is the minimum implementation required of the file system.
// A file system may implement additional interfaces,
// such as fsutil.ReadFileFS, to provide additional or optimized functionality.
// See io/fsutil for details.
type FS interface {
// Open opens the named file.
//
// When Open returns an error, it should be of type *PathError
// with the Op field set to "open", the Path field set to name,
// and the Err field describing the problem.
//
// Open should reject attempts to open names that do not satisfy
// ValidPath(name), returning a *PathError with Err set to
// ErrInvalid or ErrNotExist.
Open(name string) (File, error)
}
That allows for "os: add ReadDir method for lightweight directory reading":
See commit a4ede9f:
// ReadDir reads the contents of the directory associated with the file f
// and returns a slice of DirEntry values in directory order.
// Subsequent calls on the same file will yield later DirEntry records in the directory.
//
// If n > 0, ReadDir returns at most n DirEntry records.
// In this case, if ReadDir returns an empty slice, it will return an error explaining why.
// At the end of a directory, the error is io.EOF.
//
// If n <= 0, ReadDir returns all the DirEntry records remaining in the directory.
// When it succeeds, it returns a nil error (not io.EOF).
func (f *File) ReadDir(n int) ([]DirEntry, error)
// A DirEntry is an entry read from a directory (using the ReadDir method).
type DirEntry interface {
// Name returns the name of the file (or subdirectory) described by the entry.
// This name is only the final element of the path, not the entire path.
// For example, Name would return "hello.go" not "/home/gopher/hello.go".
Name() string
// IsDir reports whether the entry describes a subdirectory.
IsDir() bool
// Type returns the type bits for the entry.
// The type bits are a subset of the usual FileMode bits, those returned by the FileMode.Type method.
Type() os.FileMode
// Info returns the FileInfo for the file or subdirectory described by the entry.
// The returned FileInfo may be from the time of the original directory read
// or from the time of the call to Info. If the file has been removed or renamed
// since the directory read, Info may return an error satisfying errors.Is(err, ErrNotExist).
// If the entry denotes a symbolic link, Info reports the information about the link itself,
// not the link's target.
Info() (FileInfo, error)
}
src/os/os_test.go#testReadDir() illustrates its usage:
file, err := Open(dir)
if err != nil {
t.Fatalf("open %q failed: %v", dir, err)
}
defer file.Close()
s, err2 := file.ReadDir(-1)
if err2 != nil {
t.Fatalf("ReadDir %q failed: %v", dir, err2)
}
Ben Hoyt points out in the comments to Go 1.16 os.ReadDir:
os.ReadDir(path string) ([]os.DirEntry, error), which you'll be able to call directly without the Open dance.
So you can probably shorten this to just os.ReadDir, as that's the concrete function most people will call.
See commit 3d913a9 (Dec. 2020):
os: add ReadFile, WriteFile, CreateTemp (was TempFile), MkdirTemp (was TempDir) from io/ioutil
io/ioutil was a poorly defined collection of helpers.
Proposal #40025 moved out the generic I/O helpers to io.
This CL for proposal #42026 moves the OS-specific helpers to os,
making the entire io/ioutil package deprecated.
os.ReadDir returns []DirEntry, in contrast to ioutil.ReadDir's []FileInfo.
(Providing a helper that returns []DirEntry is one of the primary motivations for this change.)
ioutil.ReadDir is a good find, but if you click and look at the source you see that it calls the method Readdir of os.File. If you are okay with the directory order and don't need the list sorted, then this Readdir method is all you need.
From your description, what you probably want is os.Readdirnames.
func (f *File) Readdirnames(n int) (names []string, err error)
Readdirnames reads the contents of the directory associated with file and returns a slice of up to n names of files in the directory, in directory order. Subsequent calls on the same file will yield further names.
...
If n <= 0, Readdirnames returns all the names from the directory in a single slice.
Snippet:
file, err := os.Open(path)
if err != nil {
return err
}
defer file.Close()
names, err := file.Readdirnames(0)
if err != nil {
return err
}
fmt.Println(names)
Credit to SquattingSlavInTracksuit's comment; I'd have suggested promoting their comment to an answer if I could.
A complete example of printing all the files in a directory recursively using Readdirnames
package main
import (
"fmt"
"os"
)
func main() {
path := "/path/to/your/directory"
err := readDir(path)
if err != nil {
panic(err)
}
}
func readDir(path string) error {
file, err := os.Open(path)
if err != nil {
return err
}
defer file.Close()
names, _ := file.Readdirnames(0)
for _, name := range names {
filePath := fmt.Sprintf("%v/%v", path, name)
file, err := os.Open(filePath)
if err != nil {
return err
}
defer file.Close()
fileInfo, err := file.Stat()
if err != nil {
return err
}
fmt.Println(filePath)
if fileInfo.IsDir() {
readDir(filePath)
}
}
return nil
}
How to check in Go if folder is empty? I can check like:
files, err := ioutil.ReadDir(*folderName)
if err != nil {
return nil, err
}
// here check len of files
But it kinda looks to me that there should be more a elegant solution.
Whether a directory is empty or not is not stored in the file-system level as properties like its name, creation time or its size (in case of files).
That being said you can't just obtain this information from an os.FileInfo. The easiest way is to query the children (content) of the directory.
ioutil.ReadDir() is quite a bad choice as that first reads all the contents of the specified directory and then sorts them by name, and then returns the slice. The fastest way is as Dave C mentioned: query the children of the directory using File.Readdir() or (preferably) File.Readdirnames() .
Both File.Readdir() and File.Readdirnames() take a parameter which is used to limit the number of returned values. It is enough to query only 1 child. As Readdirnames() returns only names, it is faster because no further calls are required to obtain (and construct) FileInfo structs.
Note that if the directory is empty, io.EOF is returned as an error (and not an empty or nil slice) so we don't even need the returned names slice.
The final code could look like this:
func IsEmpty(name string) (bool, error) {
f, err := os.Open(name)
if err != nil {
return false, err
}
defer f.Close()
_, err = f.Readdirnames(1) // Or f.Readdir(1)
if err == io.EOF {
return true, nil
}
return false, err // Either not empty or error, suits both cases
}
I've been trying to figure out how to simply list the files and folders in a single directory in Go.
I've found filepath.Walk, but it goes into sub-directories automatically, which I don't want. All of my other searches haven't turned anything better up.
I'm sure that this functionality exists, but it's been really hard to find. Let me know if anyone knows where I should look. Thanks.
You can try using the ReadDir function in the os package. Per the docs:
ReadDir reads the named directory, returning all its directory entries sorted by filename.
The resulting slice contains os.DirEntry types, which provide the methods listed here. Here is a basic example that lists the name of everything in the current directory (folders are included but not specially marked - you can check if an item is a folder by using the IsDir() method):
package main
import (
"fmt"
"os"
"log"
)
func main() {
entries, err := os.ReadDir("./")
if err != nil {
log.Fatal(err)
}
for _, e := range entries {
fmt.Println(e.Name())
}
}
We can get a list of files inside a folder on the file system using various golang standard library functions.
filepath.Walk
ioutil.ReadDir
os.File.Readdir
package main
import (
"fmt"
"io/ioutil"
"log"
"os"
"path/filepath"
)
func main() {
var (
root string
files []string
err error
)
root := "/home/manigandan/golang/samples"
// filepath.Walk
files, err = FilePathWalkDir(root)
if err != nil {
panic(err)
}
// ioutil.ReadDir
files, err = IOReadDir(root)
if err != nil {
panic(err)
}
//os.File.Readdir
files, err = OSReadDir(root)
if err != nil {
panic(err)
}
for _, file := range files {
fmt.Println(file)
}
}
Using filepath.Walk
The path/filepath package provides a handy way to scan all the files
in a directory, it will automatically scan each sub-directories in the
directory.
func FilePathWalkDir(root string) ([]string, error) {
var files []string
err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if !info.IsDir() {
files = append(files, path)
}
return nil
})
return files, err
}
Using ioutil.ReadDir
ioutil.ReadDir reads the directory named by dirname and returns a
list of directory entries sorted by filename.
func IOReadDir(root string) ([]string, error) {
var files []string
fileInfo, err := ioutil.ReadDir(root)
if err != nil {
return files, err
}
for _, file := range fileInfo {
files = append(files, file.Name())
}
return files, nil
}
Using os.File.Readdir
Readdir reads the contents of the directory associated with file and
returns a slice of up to n FileInfo values, as would be returned by
Lstat, in directory order. Subsequent calls on the same file will
yield further FileInfos.
func OSReadDir(root string) ([]string, error) {
var files []string
f, err := os.Open(root)
if err != nil {
return files, err
}
fileInfo, err := f.Readdir(-1)
f.Close()
if err != nil {
return files, err
}
for _, file := range fileInfo {
files = append(files, file.Name())
}
return files, nil
}
Benchmark results.
Get more details on this Blog Post
Even simpler, use path/filepath:
package main
import (
"fmt"
"log"
"path/filepath"
)
func main() {
files, err := filepath.Glob("*")
if err != nil {
log.Fatal(err)
}
fmt.Println(files) // contains a list of all files in the current directory
}
Starting with Go 1.16, you can use the os.ReadDir function.
func ReadDir(name string) ([]DirEntry, error)
It reads a given directory and returns a DirEntry slice that contains the directory entries sorted by filename.
It's an optimistic function, so that, when an error occurs while reading the directory entries, it tries to return you a slice with the filenames up to the point before the error.
package main
import (
"fmt"
"log"
"os"
)
func main() {
files, err := os.ReadDir(".")
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Println(file.Name())
}
}
Of interest: Go 1.17 (Q3 2021) includes fs.FileInfoToDirEntry():
func FileInfoToDirEntry(info FileInfo) DirEntry
FileInfoToDirEntry returns a DirEntry that returns information from info.
If info is nil, FileInfoToDirEntry returns nil.
Background
Go 1.16 (Q1 2021) will propose, with CL 243908 and CL 243914 , the ReadDir function, based on the FS interface:
// An FS provides access to a hierarchical file system.
//
// The FS interface is the minimum implementation required of the file system.
// A file system may implement additional interfaces,
// such as fsutil.ReadFileFS, to provide additional or optimized functionality.
// See io/fsutil for details.
type FS interface {
// Open opens the named file.
//
// When Open returns an error, it should be of type *PathError
// with the Op field set to "open", the Path field set to name,
// and the Err field describing the problem.
//
// Open should reject attempts to open names that do not satisfy
// ValidPath(name), returning a *PathError with Err set to
// ErrInvalid or ErrNotExist.
Open(name string) (File, error)
}
That allows for "os: add ReadDir method for lightweight directory reading":
See commit a4ede9f:
// ReadDir reads the contents of the directory associated with the file f
// and returns a slice of DirEntry values in directory order.
// Subsequent calls on the same file will yield later DirEntry records in the directory.
//
// If n > 0, ReadDir returns at most n DirEntry records.
// In this case, if ReadDir returns an empty slice, it will return an error explaining why.
// At the end of a directory, the error is io.EOF.
//
// If n <= 0, ReadDir returns all the DirEntry records remaining in the directory.
// When it succeeds, it returns a nil error (not io.EOF).
func (f *File) ReadDir(n int) ([]DirEntry, error)
// A DirEntry is an entry read from a directory (using the ReadDir method).
type DirEntry interface {
// Name returns the name of the file (or subdirectory) described by the entry.
// This name is only the final element of the path, not the entire path.
// For example, Name would return "hello.go" not "/home/gopher/hello.go".
Name() string
// IsDir reports whether the entry describes a subdirectory.
IsDir() bool
// Type returns the type bits for the entry.
// The type bits are a subset of the usual FileMode bits, those returned by the FileMode.Type method.
Type() os.FileMode
// Info returns the FileInfo for the file or subdirectory described by the entry.
// The returned FileInfo may be from the time of the original directory read
// or from the time of the call to Info. If the file has been removed or renamed
// since the directory read, Info may return an error satisfying errors.Is(err, ErrNotExist).
// If the entry denotes a symbolic link, Info reports the information about the link itself,
// not the link's target.
Info() (FileInfo, error)
}
src/os/os_test.go#testReadDir() illustrates its usage:
file, err := Open(dir)
if err != nil {
t.Fatalf("open %q failed: %v", dir, err)
}
defer file.Close()
s, err2 := file.ReadDir(-1)
if err2 != nil {
t.Fatalf("ReadDir %q failed: %v", dir, err2)
}
Ben Hoyt points out in the comments to Go 1.16 os.ReadDir:
os.ReadDir(path string) ([]os.DirEntry, error), which you'll be able to call directly without the Open dance.
So you can probably shorten this to just os.ReadDir, as that's the concrete function most people will call.
See commit 3d913a9 (Dec. 2020):
os: add ReadFile, WriteFile, CreateTemp (was TempFile), MkdirTemp (was TempDir) from io/ioutil
io/ioutil was a poorly defined collection of helpers.
Proposal #40025 moved out the generic I/O helpers to io.
This CL for proposal #42026 moves the OS-specific helpers to os,
making the entire io/ioutil package deprecated.
os.ReadDir returns []DirEntry, in contrast to ioutil.ReadDir's []FileInfo.
(Providing a helper that returns []DirEntry is one of the primary motivations for this change.)
ioutil.ReadDir is a good find, but if you click and look at the source you see that it calls the method Readdir of os.File. If you are okay with the directory order and don't need the list sorted, then this Readdir method is all you need.
From your description, what you probably want is os.Readdirnames.
func (f *File) Readdirnames(n int) (names []string, err error)
Readdirnames reads the contents of the directory associated with file and returns a slice of up to n names of files in the directory, in directory order. Subsequent calls on the same file will yield further names.
...
If n <= 0, Readdirnames returns all the names from the directory in a single slice.
Snippet:
file, err := os.Open(path)
if err != nil {
return err
}
defer file.Close()
names, err := file.Readdirnames(0)
if err != nil {
return err
}
fmt.Println(names)
Credit to SquattingSlavInTracksuit's comment; I'd have suggested promoting their comment to an answer if I could.
A complete example of printing all the files in a directory recursively using Readdirnames
package main
import (
"fmt"
"os"
)
func main() {
path := "/path/to/your/directory"
err := readDir(path)
if err != nil {
panic(err)
}
}
func readDir(path string) error {
file, err := os.Open(path)
if err != nil {
return err
}
defer file.Close()
names, _ := file.Readdirnames(0)
for _, name := range names {
filePath := fmt.Sprintf("%v/%v", path, name)
file, err := os.Open(filePath)
if err != nil {
return err
}
defer file.Close()
fileInfo, err := file.Stat()
if err != nil {
return err
}
fmt.Println(filePath)
if fileInfo.IsDir() {
readDir(filePath)
}
}
return nil
}