What is the mysterious relationship between `fchown()` and `flock()`?

What is the mysterious relationship between `fchown()` and `flock()`? - macos

Reading the man pages for fchown, I find this statement:
The fchown() system call is particularly useful when used in conjunction with the file locking primitives (see flock(2)).
Here's the mystery: the man page for flock(2) makes no mention of fchown or even how ownership affects it in general.
So can anyone explain what happens when fchown and flock are used together and why it's so "useful?"
I'm developing for macOS (Darwin), but I find the same statement (and lack of an explanation) in Linux, BSD, POSIX, and virtually every other *NIX man page I've searched.
Backstory ('cause every great villain has a backstory):
I have a set-UID helper process that gets executed as root, but spends much of its time running as user. While it's running as user, the files it creates belong to user. Good so far.
However, occasionally it needs to create files while running as root. When this happens, the files belong to root and I want them to belong to user. So my plan was to create+open the file, then call fchown() to change the ownership back to user.
But a few of these files are shared and I use flock() to block concurrent access to the file and now I'm wondering what will happen to my flocks.

Related

Circular Shebang #! (too many levels of symbolic links)

For context: I was reading this example of trickery one can do with a shebang. It uses
#!/bin/rm
which ends up deleting the file you executed (very funny; you can extend this to create self-deleting messages). This shows that the program (rm) is invoked with the filename as an argument and therefore has access to the whole file including the shebang that invoked it.
Another trickery I've come up with is to invoke yourself as an interpreter in shebang to create an infinite loop. For example if /usr/bin/loop starts with
#!/usr/bin/loop
t should invoke itself with itself forever. Obviously at some point an error will occur and in my particular case I get:
bash: /usr/bin/loop: /usr/bin/loop: bad interpreter: Too many levels of symbolic links
It looks like it got two levels deep. Can somebody explain to me why this particular error occurs? Or maybe share some other error messages for different shells.
In particular I would like to understand why there are symbolic links involved and whether this is an implementation detail of bash or not.

why this particular error occurs?
Because when kernel tries to run the executable, tries to run /usr/bin/loop, then tries to run /usr/bin/loop, then tries to run /usr/bin/loop, etc. and finally fails with ELOOP error. It checks that.
https://elixir.bootlin.com/linux/latest/source/fs/exec.c#L1767
hare some other error messages for different shells.
While it could be, this particular errno message comes from glibc strerror.
why there are symbolic links involved
Because ELOOP is typically returned when doing input/output operations, the message mentions them. Symbolic links are not involved.
this is an implementation detail of bash or not.
Not.

Ruby file handle management (too many open files)

I am performing very rapid file access in ruby (2.0.0 p39474), and keep getting the exception Too many open files
Having looked at this thread, here, and various other sources, I'm well aware of the OS limits (set to 1024 on my system).
The part of my code that performs this file access is mutexed, and takes the form:
File.open( filename, 'w'){|f| Marshal.dump(value, f) }
where filename is subject to rapid change, depending on the thread calling the section. It's my understanding that this form relinquishes its file handle after the block.
I can verify the number of File objects that are open using ObjectSpace.each_object(File). This reports that there are up to 100 resident in memory, but only one is ever open, as expected.
Further, the exception itself is thrown at a time when there are only 10-40 File objects reported by ObjectSpace. Further, manually garbage collecting fails to improve any of these counts, as does slowing down my script by inserting sleep calls.
My question is, therefore:
Am I fundamentally misunderstanding the nature of the OS limit---does it cover the whole lifetime of a process?
If so, how do web servers avoid crashing out after accessing over ulimit -n files?
Is ruby retaining its file handles outside of its object system, or is the kernel simply very slow at counting 'concurrent' access?
Edit 20130417:
strace indicates that ruby doesn't write all of its data to the file, returning and releasing the mutex before doing so. As such, the file handles stack up until the OS limit.
In an attempt to fix this, I have used syswrite/sysread, synchronous mode, and called flush before close. None of these methods worked.
My question is thus revised to:
Why is ruby failing to close its file handles, and how can I force it to do so?

Use dtrace or strace or whatever equivalent is on your system, and find out exactly what files are being opened.
Note that these could be sockets.
I agree that the code you have pasted does not seem to be capable of causing this problem, at least, not without a rather strange concurrency bug as well.

OSX: How to force multiple-file-copy operation to plough through errors

This is so wrong.
I want to perform a large copy operation; moving 250 GB from my laptop hard drive to an external drive.
OSX lion claims this will take about five hours.
After a couple of hours of chugging, it reports that one particular file could not be copied (for whatever reason; I cannot remember and I don't have the patience to repeat the experiment at the moment).
And on that note it bails.
I am frankly left aghast.
That this problem persists in this day and age is to me scarcely believable. I remember hitting up against the same scenario 20 years back with Windows 3.1.
How hard would it be for the folks at Apple (or Microsoft for that matter) to implement file copying in such a way that it skips over failures, writing a list of failed operations on-the-fly to stderr? And how much more useful would that implementation be? (both these questions are rhetorical by the way; simply an expression of my utter bewilderment; please don't answer them unless by means of comments or supplements to an answer to the actual question, which follows:).
More to the point (and this is my actual question), how can I implement this myself in OS X?
PS I'm open to all solutions here: programmatic / scripting / third-party software

I hear and understand your rant, but this is bordering on being a SuperUser-type question and not a programming question (saved only by the fact you said you would like to implement this yourself).
From the description, it sounds like the Finder bailed when it couldn't copy one particular file (my guess is that it was looking for admin and/or root permission for some priviledged folder).
For massive copies like this, you can use the Terminal command line:
e.g.
cp
or
sudo cp
with options like "-R" (which continues copying even if errors are detected -- unless you're using "legacy" mode) or "-n" (don't copy if the file already exists at the destination). You can see all the possible options by typing in "man cp" at the Terminal command line.
If you really wanted to do this programatically, there are options in NSWorkspace (the performFileoperation:source:destination:files:tag: method (documentation linked for you, look at the NSWorkspaceCopyOperation constant). You can also do more low level stuff via "NSFileManager" and it's copyItemAtPath:toPath:error: method, but that's really getting to brute-force approaches there.

Move or copy and truncate a file that is in use

I want to be able to (programmatically) move (or copy and truncate) a file that is constantly in use and being written to. This would cause the file being written to would never be too big.
Is this possible? Either Windows or Linux is fine.
To be specific what I'm trying to do is log video with FFMPEG and create hour long videos.

It is possible in both Windows and Linux, but it would take cooperation between the applications involved. If the application that is writing the new data to the file is not aware of what the other application is doing, it probably would not work (well ... there is some possibility ... back to that in a moment).
In general, to get this to work, you would have to open the file shared. For example, if using the Windows API CreateFile, both applications would likely need to specify FILE_SHARE_READ and FILE_SHARE_WRITE. This would allow both (multiple) applications to read and write the file "concurrently".
Beyond sharing the file, though, it would also be necessary to coordinate the operations between the applications. You would need to use some kind of locking mechanism (either by locking some part of the file or some shared mutex/semaphore). Note that if you use file locking, you could lock some known offset in the file to act as a "semaphore" (it can even be a byte value beyond the physical end of the file). If one application were appending to the file at the same exact time that the other application were truncating it, then it would lead to unpredictable results.
Back to the comment about both applications needing to be aware of each other ... It is possible that if both applications opened the file exclusively and kept retrying the operations until they succeeded, then perform the operation, then close the file, it would essentially allow them to work without "knowledge" of each other. However, that would probably not work very well and not be very efficient.
Having said all that, you might want to consider alternatives for efficiency reasons. For example, if it were possible to have the writing application write to new files periodically, it might be more efficient than having to "move" the data constantly out of one file to another. Also, if you needed to maintain some portion of the file (e.g., move out the first 100 MB to another file and then move the second 100 MB to the beginning) that could be a fairly expensive operation as well.

logrotate would be a good option is linux, comes stock on just about any distro. I'm sure there's a similar windows service out there somewhere

command line wisdom for 2 panel file manager user

Want to upgrade my file management productivity by replacing 2 panel file manager with command line (bash or cygwin). Can commandline give same speed? Please advise a guru way of how to do e.g. copy of some file in directory A to the directory B. Is it heavy use of pushd/popd? Or creation of links to most often used directories? What are the best practices and a day-to-day routine to manage files of a command line master?

Can commandline give same speed?
My experience is that commandline copying is significantly faster (especially in the Windows environment). Of course the basic laws of physics still apply, a file that is 1000 times bigger than a file that copies in 1 second will still take 1000 seconds to copy.
..(howto) copy of some file in directory A to the directory B.
Because I often have 5-10 projects that use similar directory structures, I set up variables for each subdir using a naming convention :
project=NewMatch
NM_scripts=${project}/scripts
NM_data=${project}/data
NM_logs=${project}/logs
NM_cfg=${project}/cfg
proj2=AlternateMatch
altM_scripts=${proj2}/scripts
altM_data=${proj2}/data
altM_logs=${proj2}/logs
altM_cfg=${proj2}/cfg
You can make this sort of thing as spartan or baroque as needed to match your theory of living/programming.
Then you can easily copy the cfg from 1 project to another
cp -p $NM_cfg/*.cfg ${altM_cfg}
Is it heavy use of pushd/popd?
Some people seem to really like that. You can try it and see what you thing.
Or creation of links to most often used directories?
Links to dirs are, in my experience used more for software development where a source code is expecting a certain set of dir names, and your installation has different names. Then making links to supply the dir paths expected is helpful. For production data, is just one more thing that can get messed up, or blow up. That's not always true, maybe you'll have a really good reason to have links, but I wouldn't start out that way, just because it is possible to do.
What are the best practices and a day-to-day routine to manage files of a command line master?
( Per above, use standardized directory structure for all projects.
Have scripts save any small files to a directory your dept keeps in the /tmp dir, .
i.e /tmp/MyDeptsTmpFile (named to fit your local conventions) )
It depends. If you're talking about data and logfiles, dated fileNames can save you a lot of time. I recommend dateFmts like YYYYMMDD(_HHMMSS) if you need the extra resolution.
Dated logfiles are very handy, when a current process seems like it is taking a long time, you can look at the log file from a week ago and quantify exactly how long this process took, a week, month, 6 months (up to how much space you can afford). LogFiles should also capture all STDERR messages, so you never have to re-run a bombed program just to see what the error message was.
This is Linux/Unix you're using, right? Read the man page for the cp cmd installed on your machine. I recommend using an alias like alias CP='/bin/cp -pi' so you always copy a file with the same permissions and with the original files' time stamp. Then it is easy to use /bin/ls -ltr to see a sorted list of files with the most recent files showing up at the bottom of the list. (No need to scroll back to the top, when you sort by time,reverse). Also the '-i' option will warn you that you are going to overwrite a file, and this has saved me more than a couple of times.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, and/or give it a + (or -) as a useful answer.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio