Size of directory from os.Stat/Lstat - go

Let's say I do an os.Stat() on a directory:
package main

import "os"

func main() {
	fi, _ := os.Stat("/tmp")
	println(fi.Size())
}
// 548
// Program exited.
https://play.golang.org/p/NIzGMHRYfi
What exactly is the [FileInfo].Size() value meant to represent? It's not the file size, so I'm guessing something like number of files? Inodes? I couldn't find a clear answer anywhere, so maybe someone can enlighten me?

The FileInfo documentation mentions:
// length in bytes for regular files; system-dependent for others
So it really depends on the execution environment.
See for instance "Where does ext4 store directory sizes?"
In that example, a directory stat size returns 4096 bytes.
That's the actual size of the directory itself, not what it contains.
The stat command provides no facility for querying anything other than the size of the filesystem object itself (a directory or a file); it has no way to return multiple sizes, so it cannot give you the aggregate size of a directory's contents.
Also, "directory size including contents" becomes less well-defined once files are hard-linked into more than one place.
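If what you actually want is the total size of everything under the directory, you have to walk it and sum the sizes of the regular files yourself. A minimal sketch using filepath.Walk (the /tmp root here is just an example, and hard-linked files may be counted more than once):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	var total int64
	err := filepath.Walk("/tmp", func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err // propagate errors such as permission problems
		}
		if info.Mode().IsRegular() {
			total += info.Size() // only count regular files, not directory entries
		}
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Println("total bytes:", total)
}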

Related

Go Lang ioutil.writeFile function making directories and files read only

I have created a Go program that walks over all the files under a root path and replaces a specific source string with a target string, then writes the updated content back into the existing file. For that I have used the filepath.Walk function to traverse the files and ioutil.ReadFile and ioutil.WriteFile to read from and write to each file.
The problem is that ioutil.WriteFile is marking the root directory, subdirectories, and files as read-only. I searched online for a solution but could not find anything appropriate.
Below is the relevant code snippet.
if strings.Contains(data, sourceString) {
	data = strings.ReplaceAll(data, sourceString, targetStringArray[index])
	ioutil.WriteFile(path, []byte(data), os.FileMode(0655))
}
File mode 0655 translates to the permissions rw- r-x r-x. That combination is odd, because there is no "x" bit for the owner but there is for the group and others. What you probably want is 0755 (rwx r-x r-x) for executable files or 0644 (rw- r-- r--) for non-executable files, which are the most common choices.
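For completeness, a minimal sketch of the whole walk with 0644 and error checking; the root path, sourceString, and targetString values here are just placeholders (the question uses a targetStringArray, which I have simplified to a single string):

package main

import (
	"io/ioutil"
	"log"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// Placeholder values; in the original program these come from elsewhere.
	root, sourceString, targetString := "./project", "foo", "bar"

	err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err // skip directories; only regular files get rewritten
		}
		content, err := ioutil.ReadFile(path)
		if err != nil {
			return err
		}
		data := string(content)
		if strings.Contains(data, sourceString) {
			data = strings.ReplaceAll(data, sourceString, targetString)
			// 0644: owner read/write, group and others read-only, no execute bits.
			if err := ioutil.WriteFile(path, []byte(data), 0644); err != nil {
				return err
			}
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}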

What's the difference between `os.O_APPEND` and `os.ModeAppend`?

We can specify both flag and perm in os.OpenFile.
They have very similar-looking options, os.O_APPEND and os.ModeAppend. What's the difference between them?
f, _ := os.OpenFile("access.log", os.O_APPEND|os.O_CREATE, os.ModeAppend|0644)
The flag argument specifies the flags used in the system call that opens the file, while perm sets the file mode on the file. The file mode includes the permissions and the type of file, e.g. symlink, directory, etc.
os.O_APPEND tells the underlying OS that every write call on that file handle should append to the file, so you don't need to set the offset yourself to write at the correct position.
os.ModeAppend sets the file mode to append-only. This means the file can only be modified by appending to it, not by rewriting its contents. The specifics depend on the OS and file system you are using. I believe Plan 9 implements it by ignoring the offset on any write call and always appending, while on Linux it means the file can only be opened for writing in append mode. I think that on most Linux systems you need to be root to set the append-only mode on a file.
In 99.99% of cases you just want to use perm to set the rwx file permissions. In your case, if you want to open a file and append to it, you should use:
// os.O_WRONLY tells the computer you are only going to write to the file, not read from it
// os.O_CREATE tells the computer to create the file if it doesn't exist
// os.O_APPEND tells the computer to append to the end of the file instead of overwriting or truncating it
f, err := os.OpenFile("access.log", os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0644)
You might have ignored the returned error from os.OpenFile only to keep the example short, but you should get used to always checking for errors. You have no idea how many users run into trouble when starting with Go because they ignore errors. Sometimes it is something simple and easy to fix, like a typo, but if you ignore the error you never learn what the issue was.
You can read more about the append file mode here.
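Putting the error check and a deferred Close together, here is a minimal sketch of opening the log in append mode and writing one line (the log line itself is just a placeholder):

package main

import (
	"log"
	"os"
)

func main() {
	// O_WRONLY|O_CREATE|O_APPEND: write-only, create if missing, always append.
	f, err := os.OpenFile("access.log", os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := f.WriteString("GET /index.html 200\n"); err != nil {
		log.Fatal(err)
	}
}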

Why is running opendir, readdir, stat so slow compared to the Windows dir command?

I have a Perl script that is using opendir to read the contents of a directory:
opendir ( DIR, $path ) or next;
while (my $file = readdir DIR) {
Then I'm doing:
-s $file to get the size of each file
(stat($file))[9] to get the modified time of each file
I'm running this from a Windows machine and accessing a Samba share on Ubuntu 14.04.
This is all working fine, but the process seems to run very slowly compared to a dir listing of the same folder.
Does anyone know why using opendir takes so much longer than a dir listing and if there's any way I can change my script to speed it up?
According to perlport:
On Win32 stat() needs to open the file to determine the link count and update attributes that may have been changed through hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by not performing this operation.
Since the files you're accessing are on a Samba share, opening them is probably fairly time consuming. Also, -s makes a stat system call behind the scenes, so calling -s followed by stat is wasteful.
The following should be faster:
local ${^WIN32_SLOPPY_STAT} = 1;
opendir my $dh, $path or die "Failed to opendir '$path': $!";
while (my $file = readdir $dh) {
    my ($size, $mtime) = (stat $file)[7, 9];
    say join "\t", $file, $size, $mtime;
}
dir will be much faster, as it is compiled code that I suspect is heavily optimized, so it can retrieve and format the information quickly.
In your script it seems you are making several calls which have to be interpreted, one for the time and another for the size. Even if the lower-level calls in Perl are compiled code, getting the information probably has to go through several layers. You could reduce the number of calls, as mob suggested, by saving the values returned by stat and accessing the parts you need. For example:
my @items = stat($file);
my $size = $items[7];
my $modified = $items[9];
which would save one of the calls and possibly speed up the script.
If you want all of the files, you might consider making a system call to run a dir command and redirect the output to a file, after which you can parse that file to get the time and size information. This may be a bit faster depending on the number of files. (/4 gives a 4-digit year, /t:w shows the time the file was last written/modified, and /-c removes the commas from the size.)
system("dir /4 /t:w /-c $path > tempList.txt");
Then open and parse the redirected file for the information you desire.
open my $in, '<', "tempList.txt" or die "Unable to open file tempList.txt: $!";
my @lines = <$in>;
close($in);
chomp(@lines);
foreach ( @lines )
{
    next if ( ! m/^\d{4}\/\d{2}\/\d{2}\s+/ );  # Not a line with a file
    my @parts = split /\s+/;
    # Get the parts you need (time and size), where you may have to do some
    # other work to get them into the desired format
    # .....
}
It may be possible to extend the regex to capture the items you need while testing whether the line should be processed at all. That might save some time and effort as well.

Bash/shell/OS interpretation of . and .. — can I define ...?

How do . and .., as paths (vs. ranges, e.g., {1..10}, which I'm not concerned with), really work? I know what they do, and use them all the time, but don't fully grasp how/where they're interpreted. Does the shell handle them? The interpreting process? The OS?
The reason why I'm asking is that I'd like to be able to use ... to refer to ../.., .... to refer to ../../.., etc. (up to some small finite number; I don't need bash to process an arbitrarily large number of dots). I.e., if my current directory is /tmp/let/me/out, and I call cd ..., my resulting current directory should be /tmp/let. I don't particularly care if ... etc. show up in ls -a output like . and .. do, but I would like to be able to call cat /tmp/let/me/out/..../phew.txt to print the contents of /tmp/phew.txt.
Pointers to relevant documentation appreciated as well as direct answers. This kind of syntax question is very hard to Google.
I'm using bash 4.3.42, by the way, with the autocd and globstar shell options.
. and .. are genuine directory names. They are not "shortcuts", aliases, or anything fake.
They happen to point to the same inode as the other name you use. A file or directory can have several names pointing to the same inode; these are usually known as hard links, to distinguish them from symbolic (or soft) links.
If you are on Linux or OS X you can use stat to look at most of the inode metadata - it is what ls looks at. You will see there is an inode number. If you stat . and stat current-directory-name you will see that number is the same.
The one thing that is not held in the inode is the filename - that is held in the directory.
So . and .. reside in the directory on the file system, they are not a figment of the shell's imagination. So, for example, I can use . and .. quite happily from C.
I doubt you can change them - personally I have never tried and I never will. You would have to change what these filenames link to by editing the directory. If you managed it you would probably do irreparable damage to your file system.
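You can verify the "same inode" claim from a program as well as from the shell. Here is a minimal Go sketch (Unix-only, since it relies on syscall.Stat_t; the /tmp path is just an example) that prints the inode number of "." and of the directory's full name:

package main

import (
	"fmt"
	"os"
	"syscall"
)

// inode returns the inode number of a path (Unix-specific).
func inode(path string) uint64 {
	fi, err := os.Stat(path)
	if err != nil {
		panic(err)
	}
	return fi.Sys().(*syscall.Stat_t).Ino
}

func main() {
	if err := os.Chdir("/tmp"); err != nil {
		panic(err)
	}
	// Both names resolve to the same directory, hence the same inode number.
	fmt.Println(inode("."), inode("/tmp"))
}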
I write this to clarify what has already been written before.
In many file systems a DIRECTORY is a file; a special type of file that the file system identifies as being distinctly a directory.
A directory file contains a list of names that map to files on the disk.
A file, including a directory, does not have an intrinsic name associated with it (not true of all file systems). The name of a file exists only in a directory.
The same file can have an entry in multiple directories (hard link). The same file can then have multiple names and multiple paths.
In such file systems every directory always contains ENTRIES for the NAMES "." and ".."; these entries are maintained by the file system itself.
The name "." links to its own directory.
The name ".." links to the parent directory EXCEPT for the top level directory where it links to itself (. and .. thus link to the same directory file).
So when you use "." and ".." as in /dir1/dir2/../dir3/./dir4/whatever,
"." and ".." are processed in the exact same way as "dir1" and "dir2".
This translation is done by the file system; not the shell.
cd ...
Does not work because there is no entry for "..." (at least not normally).
You can create a directory called "..." if you want.
You can actually achieve something like this, though this is an ugly hack:
You can run a command before every command entered in bash, and another after it: for the former you trap the DEBUG pseudo-signal, and for the latter you set PROMPT_COMMAND.
trap 'ln -s ../.. ... &>/dev/null | true' DEBUG
PROMPT_COMMAND='rm ...'
With this, it seems like there's an additional entry in the current directory:
pwd
# /tmp/crazy-stuff
ls -a
# . .. ... foo
ls -a .../tmp/crazy-stuff
# . .. ... foo
Though this only works in the current directory, because the symbolic link is deleted after each command invocation. Thus ls foo/bar/... won't work this way.
Another ugly hack would be to "override" mkdir such that it populates every new directory with these symbolic links.
See also the comments on the second answer here, particularly Eliah's: https://askubuntu.com/questions/327126/what-is-a-dot-only-named-folder
Much in the same way that when you cd into some subdirectory you're actually following a pointer to that directory, .. is a pointer added by the OS that points to the parent directory, and I'd imagine . works the same way.

Make binary believe it is in another directory

On a Linux system I have a binary (bin.exe) which needs to read an input file (input.cfg), where the names of other data files (data.txt) are specified. Usually the binary, the input file, and the data files were all in the same directory. Now, for organizational reasons, I need the binary to be in $SOMEPATH/bin and the input and data files in $SOMEPATH/input.
I do not know how to do this. If I try
$SOMEPATH/bin/bin.exe $SOMEPATH/input/input.cfg
I get
error, "data.txt" not found
One solution would be to include the absolute or relative path of "data.txt" in input.cfg, but the binary does not accept this.
I thought about somehow fooling the binary so that it thinks it is running in $SOMEPATH/input, so that I can just do
$SOMEPATH/bin/bin.exe input.cfg
and it works, but I do not know whether this is possible. Any hints?
(cd $SOMEPATH/input && $SOMEPATH/bin/bin.exe input.cfg)
This is assuming that the program is relying on the current working directory to find the files. If the program is trying hard to find them in the same location as the executable, by consulting /proc/<pid>/exe for example, then you may be out of luck.
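If you later wrap this in a small launcher rather than a shell one-liner, the same idea (set the working directory before starting the binary) can be expressed in Go through exec.Cmd's Dir field. A minimal sketch, assuming the paths from the question and a SOMEPATH environment variable:

package main

import (
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

func main() {
	somepath := os.Getenv("SOMEPATH")

	cmd := exec.Command(filepath.Join(somepath, "bin", "bin.exe"), "input.cfg")
	cmd.Dir = filepath.Join(somepath, "input") // run with the input directory as the working directory
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}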

Resources