I'm trying to loop through files in a directory and compare their ModTime against a certain date in order to delete older files.
I'm using ioutil.ReadDir() to get the files but I'm stuck with how to retrieve the ModTime of each file.
Thanks
The return from ioutil.ReadDir is ([]os.FileInfo, error). You would simply iterate the []os.FileInfo slice and inspect the ModTime() of each. ModTime() returns a time.Time so you can compare in any way you see fit.
package main
import (
"fmt"
"io/ioutil"
"log"
"time"
)
var cutoff = 1 * time.Hour
func main() {
fileInfo, err := ioutil.ReadDir("/tmp")
if err != nil {
log.Fatal(err.Error())
}
now := time.Now()
for _, info := range fileInfo {
if diff := now.Sub(info.ModTime()); diff > cutoff {
fmt.Printf("Deleting %s which is %s old\n", info.Name(), diff)
// This only prints; to actually delete, call os.Remove on the file's full path here.
}
}
}
I have a folder which contains multiple types of files (with no subfolders in this simple case). Let's assume it contains 20000 .raw files and 20000 .jpg files. I need to move the .raw files into a raw folder and the .jpg files into a jpg folder. So I tried to use Go to solve it:
package main
import (
"flag"
"fmt"
"io/fs"
"io/ioutil"
"os"
"runtime"
"strings"
"sync"
"time"
)
func CreateFolder(basePath string, folderName string) {
os.Mkdir(basePath+"/"+folderName, 0755)
}
func MoveFile(file string, path string, folder string) {
err := os.Rename(path+"/"+file, path+"/"+folder+"/"+file)
if err != nil {
panic(err)
}
}
func getInfo(a fs.FileInfo, c chan string) {
if a.IsDir() || strings.HasPrefix(a.Name(), ".") {
return
} else {
c <- a.Name()
}
}
func dealInfo(path string, typeDict *sync.Map, c chan string) {
for name := range c {
sp := strings.Split(name, ".")
suffix := sp[len(sp)-1]
if _, ok := typeDict.Load(suffix); ok {
MoveFile(name, path, suffix)
} else {
CreateFolder(path, suffix)
MoveFile(name, path, suffix)
typeDict.Store(suffix, 1)
}
}
}
func main() {
runtime.GOMAXPROCS(8)
var (
filepath = flag.String("p", "", "default self folder")
)
flag.Parse()
fmt.Println(*filepath)
fmt.Println("==========")
if *filepath == "" {
fmt.Println("No valid folder path")
return
} else {
fileinfos, err := ioutil.ReadDir(*filepath)
stime := time.Now()
if err != nil {
panic(err)
}
var typeDict sync.Map
ch := make(chan string, 20)
for _, fs := range fileinfos {
go getInfo(fs, ch)
go dealInfo(*filepath, &typeDict, ch)
}
fmt.Println(time.Since(stime))
}
}
But it returns an error: runtime: failed to create new OS thread. I guess this is due to too many goroutines being created by the script? But I have no idea why this could happen, because I thought ch := make(chan string, 20) would limit the number of goroutines.
I also tried to use wg *sync.WaitGroup, like:
getInfo(...) // use this func to put all files info into a channel
wg.Add(20)
for i:=0; i<20; i++ {
go dealInfo(..., &wg) // this new dealInfo contains wg.Done()
}
wg.Wait()
But this causes a deadlock error.
May I know the best way to move files in parallel, please? Your help is really appreciated!
This may work, but how well a move performs depends on the operating system and the filesystem.
Doing it in parallel may not be optimal over NFS, for instance, so you should measure it.
The strategy I would try in this situation is to list the files and send their names over a channel to a fixed set of goroutines that perform the move (rename).
The number of goroutines (workers) can be a command-line parameter.
Is there an easy way to check the size of a Golang project? It's not an executable, it's a package that I'm importing in my own project.
You can see how big the library binaries are by looking in the $GOPATH/pkg directory (if $GOPATH is not exported go defaults to $HOME/go).
So to check the size of some of the gorilla http pkgs. Install them first:
$ go get -u github.com/gorilla/mux
$ go get -u github.com/gorilla/securecookie
$ go get -u github.com/gorilla/sessions
The KB binary sizes on my 64-bit MacOS (darwin_amd64):
$ cd $GOPATH/pkg/darwin_amd64/github.com/gorilla/
$ du -k *
284 mux.a
128 securecookie.a
128 sessions.a
EDIT:
Library (package) size is one thing, but how much space that takes up in your executable after the link stage can vary wildly. This is because packages have their own dependencies and with that comes extra baggage, but that baggage may be shared by other packages you import.
An example demonstrates this best:
empty.go:
package main
func main() {}
http.go:
package main
import "net/http"
var _ = http.Serve
func main() {}
mux.go:
package main
import "github.com/gorilla/mux"
var _ = mux.NewRouter
func main() {}
All 3 programs are functionally identical - executing zero user code - but their dependencies differ. The resulting binary sizes in KB:
$ du -k *
1028 empty
5812 http
5832 mux
What does this tell us? The core go pkg net/http adds significant size to our executable. The mux pkg is not large by itself, but it has an import dependency on net/http pkg - hence the significant file size for it too. Yet the delta between mux and http is only 20KB, whereas the listed file size of the mux.a library is 284KB. So we can't simply add the library pkg sizes to determine their true footprint.
Conclusion:
The Go linker will strip out a lot of baggage from individual libraries during the build process, but in order to get a true sense of how much extra weight importing certain packages adds, one has to look at all of the pkg's sub-dependencies as well.
Here is another solution that makes use of https://pkg.go.dev/golang.org/x/tools/go/packages
I took the example provided by the package authors and adapted it slightly:
package main
import (
"flag"
"fmt"
"log"
"os"
"sort"
"golang.org/x/tools/go/packages"
)
func main() {
flag.Parse()
// Many tools pass their command-line arguments (after any flags)
// uninterpreted to packages.Load so that it can interpret them
// according to the conventions of the underlying build system.
cfg := &packages.Config{Mode: packages.NeedFiles |
packages.NeedSyntax |
packages.NeedImports |
packages.NeedDeps, // without NeedDeps, imported packages carry only their ID and no GoFiles
}
pkgs, err := packages.Load(cfg, flag.Args()...)
if err != nil {
fmt.Fprintf(os.Stderr, "load: %v\n", err)
os.Exit(1)
}
if packages.PrintErrors(pkgs) > 0 {
os.Exit(1)
}
// Sum the sizes of the source files
// of each package listed on the command line.
var size int64
for _, pkg := range pkgs {
for _, file := range pkg.GoFiles {
s, err := os.Stat(file)
if err != nil {
log.Println(err)
continue
}
size += s.Size()
}
}
fmt.Printf("size of %v is %v b\n", pkgs[0].ID, size)
size = 0
for _, pkg := range allPkgs(pkgs) {
for _, file := range pkg.GoFiles {
s, err := os.Stat(file)
if err != nil {
log.Println(err)
continue
}
size += s.Size()
}
}
fmt.Printf("size of %v and deps is %v b\n", pkgs[0].ID, size)
}
func allPkgs(lpkgs []*packages.Package) []*packages.Package {
var all []*packages.Package // postorder
seen := make(map[*packages.Package]bool)
var visit func(*packages.Package)
visit = func(lpkg *packages.Package) {
if !seen[lpkg] {
seen[lpkg] = true
// visit imports
var importPaths []string
for path := range lpkg.Imports {
importPaths = append(importPaths, path)
}
sort.Strings(importPaths) // for determinism
for _, path := range importPaths {
visit(lpkg.Imports[path])
}
all = append(all, lpkg)
}
}
for _, lpkg := range lpkgs {
visit(lpkg)
}
return all
}
You can download all the imported modules with go mod vendor, then count the lines of all the .go files that aren't test files:
package main
import (
"bytes"
"fmt"
"io/fs"
"os"
"os/exec"
"path/filepath"
"strings"
)
func count(mod string) int {
imp := fmt.Sprintf("package main\nimport _ %q", mod)
os.WriteFile("size.go", []byte(imp), os.ModePerm)
exec.Command("go", "mod", "init", "size").Run()
exec.Command("go", "mod", "vendor").Run()
var count int
filepath.WalkDir("vendor", func(s string, d fs.DirEntry, err error) error {
if strings.HasSuffix(s, ".go") && !strings.HasSuffix(s, "_test.go") {
data, err := os.ReadFile(s)
if err != nil {
return err
}
count += bytes.Count(data, []byte{'\n'})
}
return nil
})
return count
}
func main() {
println(count("github.com/klauspost/compress/zstd"))
}
I've been using C# my entire life and am now trying out Go. How do I find the lower (earlier) of two time structs?
import (
t "time"
"fmt"
)
func findTime() {
timeA, err := t.Parse("01022006", "08112016")
timeB, err := t.Parse("01022006", "08152016")
Math.Min(timeA.Ticks, timeB.Ticks) // This is C# code but I'm looking for something similar in GO
}
You can use the Time.Before method to test if a time is before another:
timeA, err := time.Parse("01022006", "08112016")
timeB, err := time.Parse("01022006", "08152016")
var min time.Time
if timeA.Before(timeB) {
min = timeA
} else {
min = timeB
}
How might I get the count of items returned by io/ioutil.ReadDir()?
I have this code, which works, but I have to think isn't the RightWay(tm) in Go.
package main
import "io/ioutil"
import "fmt"
func main() {
files,_ := ioutil.ReadDir("/Users/dgolliher/Dropbox/INBOX")
var count int
for _, f := range files {
fmt.Println(f.Name())
count++
}
fmt.Println(count)
}
Lines 8-12 seem like way too much to go through to just count the results of ReadDir, but I can't find the correct syntax to get the count without iterating over the range. Help?
Found the answer in http://blog.golang.org/go-slices-usage-and-internals
package main
import "io/ioutil"
import "fmt"
func main() {
files,_ := ioutil.ReadDir("/Users/dgolliher/Dropbox/INBOX")
fmt.Println(len(files))
}
ReadDir returns a list of directory entries sorted by filename, so it is not just files. Here is a little function for those wanting to get a count of files only (and not dirs):
func fileCount(path string) (int, error){
i := 0
files, err := ioutil.ReadDir(path)
if err != nil {
return 0, err
}
for _, file := range files {
if !file.IsDir() {
i++
}
}
return i, nil
}
Starting with Go 1.16 (Feb 2021), a better option is os.ReadDir:
package main
import "os"
func main() {
d, e := os.ReadDir(".")
if e != nil {
panic(e)
}
println(len(d))
}
os.ReadDir returns fs.DirEntry instead of fs.FileInfo, which means that
Size and ModTime methods are omitted, making the process more efficient if
you just need an entry count.
https://golang.org/pkg/os#ReadDir
If you want to count all entries (not recursive), you can use len(files). If you need just the files, without folders and hidden files, loop over them and increment a counter. And please don't ignore errors.
By looking at the code of ioutil.ReadDir
func ReadDir(dirname string) ([]fs.FileInfo, error) {
f, err := os.Open(dirname)
if err != nil {
return nil, err
}
list, err := f.Readdir(-1)
f.Close()
if err != nil {
return nil, err
}
sort.Slice(list, func(i, j int) bool { return list[i].Name() < list[j].Name() })
return list, nil
}
you would realize that it calls os.File.Readdir() then sorts the files.
In case of counting it, you don't need to sort, so you are better off calling os.File.Readdir() directly.
You can simply copy and paste this function then remove the sort.
But I did find out that f.Readdirnames(-1) is much faster than f.Readdir(-1).
Running time is almost half for /usr/bin/ with 2808 items (16ms vs 35ms).
So to summarize it in an example:
package main
import (
"fmt"
"os"
)
func main() {
f, err := os.Open(os.Args[1])
if err != nil {
panic(err)
}
list, err := f.Readdirnames(-1)
f.Close()
if err != nil {
panic(err)
}
fmt.Println(len(list))
}
Does anyone know how to check for a file access date and time? The function returns the modified date and time and I need something that compares the accessed date time to the current date and time.
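In pre-Go 1 releases, you could use os.Stat to get a FileInfo struct which also contained the last access time (as well as the last modified and the last status change time). Note that this API no longer exists: since Go 1, os.FileInfo is an interface with no access-time field, and the access time has to be read through FileInfo.Sys(), as in the platform-specific answers further down.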
info, err := os.Stat("example.txt")
if err != nil {
// TODO: handle errors (e.g. file not found)
}
// info.Atime_ns now contains the last access time
// (in nanoseconds since the unix epoch)
After that, you can use time.Nanoseconds (also a pre-Go 1 function; in current Go the equivalent is time.Now().UnixNano()) to get the current time, also in nanoseconds since the unix epoch, January 1, 1970 00:00:00 UTC. To get the duration in nanoseconds, just subtract those two values:
duration := time.Nanoseconds() - info.Atime_ns
By type-asserting the value returned by os.FileInfo's Sys() method to *syscall.Stat_t (this works on Linux):
package main
import ( "fmt"; "log"; "os"; "syscall"; "time" )
func main() {
for _, arg := range os.Args[1:] {
fileinfo, err := os.Stat(arg)
if err != nil {
log.Fatal(err)
}
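// Atim is the field name on Linux; on macOS it is Atimespec.
atime := fileinfo.Sys().(*syscall.Stat_t).Atim
// Sec and Nsec are int32 on some 32-bit platforms, hence the conversions.
fmt.Println(time.Unix(int64(atime.Sec), int64(atime.Nsec)))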
}
}
Alternatively, if the modification time is what you need, after the Stat you can simply call
fileinfo.ModTime()
You can also call Format() on the result, should you need it, e.g. for a web server;
see https://gist.github.com/alexisrobert/982674
On Windows, type-assert info.Sys() to *syscall.Win32FileAttributeData:
info, _ := os.Stat("test.txt")
fileTime := info.Sys().(*syscall.Win32FileAttributeData).LastAccessTime
aTime := time.Unix(0, fileTime.Nanoseconds())
Example
package main
import (
"fmt"
"log"
"os"
"syscall"
"time"
)
func main() {
info, err := os.Stat("./test.txt")
if err != nil {
log.Fatal(err) // info would be nil here, so don't continue
}
fileTime := info.Sys().(*syscall.Win32FileAttributeData).LastAccessTime
// _ = info.Sys().(*syscall.Win32FileAttributeData).CreationTime
// _ = info.Sys().(*syscall.Win32FileAttributeData).LastWriteTime
fileAccessTime := time.Unix(0, fileTime.Nanoseconds())
// Compare
// t2, _ := time.Parse("2006/01/02 15:04:05 -07:00:00", "2023/02/08 13:18:00 +08:00:00")
now := time.Now()
log.Println(fileAccessTime)
log.Println(now.Add(-20 * time.Minute))
if fileAccessTime.After(now.Add(-20 * time.Minute)) {
fmt.Println("You accessed this file within the last 20 minutes.")
}
}
Linux
see this answer