I was wondering if calling Write() on an os.File is thread safe. I'm having a hard time finding any mention of thread safety in the docs.
The convention (at least for the standard library) is the following: No function/method is safe for concurrent use unless explicitly stated (or obvious from the context).
It is not safe to write concurrently to an os.File via Write() without external synchronization.
After browsing the source code a little bit I found the following method which is eventually called by file.Write(). Since there are race condition checks in place, I'm assuming that the call is in fact not thread-safe within Go (Source).
However, it seemed unlikely that those system calls wouldn't be thread-safe on an OS level. After some browsing I came upon this interesting answer that fueled my suspicions even more. For windows the source indicates a call to WriteFile which also appears to be thread safe.
Related
I know you can define functions called init in any package, and these function will be executed before main. I use this to open my log file and my DB connection.
Is there a way to define code that will be executed when the program ends, either because it reaches the end of the main function or because it was interrupted ? The only way I can think of is by manually calling a deffered terminate function on each package used by main, but that's quite verbose and error prone.
The C atexit functionality was considered by the Go developers and the idea of adopting it was rejected.
From one of the related thread at golang-nuts:
Russ Cox:
Atexit may make sense in single-threaded, short-lived
programs, but I am skeptical that it has a place in a
long-running multi-threaded server.
I've seen many C++ programs that hang on exit because
they're running global destructors that don't really need to
run, and those destructors are cleaning up and freeing
memory that would be reclaimed by the operating system
anyway, if only the program could get to the exit system call.
Compared to all that pain, needing to call Flush when you're
one with a buffer seems entirely reasonable and is
necessary anyway for correct execution of long-running
programs.
Even ignoring that problem, atexit introduces even more
threads of control, and you have to answer questions like
do all the other goroutines stop before the atexit handlers
run? If not, how do they avoid interfering? If so, what if
one holds a lock that the handler needs? And on and on.
I'm not at all inclined to add Atexit.
Ian Lance Taylor:
The only fully reliable mechanism is a wrapper program that invokes the
real program and does the cleanup when the real program completes. That
is true in any language, not just Go.
In my somewhat unformed opinion, os.AtExit is not a great idea. It is
an unstructured facility that causes stuff to happen at program exit
time in an unpredictable order. It leads to weird scenarios like
programs that take a long time just to exit, an operation that should be
very fast. It also leads to weird functions like the C function _exit,
which more or less means exit-but-don't-run-atexit-functions.
That said, I think a special exit function corresponding to the init
function is an interesting idea. It would have the structure that
os.AtExit lacks (namely, exit functions are run in reverse order of when
init functions are run).
But exit functions won't help you if your program gets killed by the
kernel, or crashes because you call some C code that gets a segmentation
violation.
In general, I agree with jnml's answer. Should you however still want to do it, you could use defer in the main() function, like this: http://play.golang.org/p/aUdFXHtFOM.
I've been using Windows CRITICAL_SECTION since the 1990s and I've been aware of the TryEnterCriticalSection function since it first appeared. I understand that it's supposed to help me avoid a context switch and all that.
But it just occurred to me that I have never used it. Not once.
Nor have I ever felt I needed to use it. In fact, I can't think of a situation in which I would.
Generally when I need to get an exclusive lock on something, I need that lock and I need it now. I can't put it off until later. I certainly can't just say, "oh well, I won't update that data after all". So I need EnterCriticalSection, not TryEnterCriticalSection
So what exactly is the use case for TryEnterCriticalSection?
I've Googled this, of course. I've found plenty of quick descriptions on how to use it but almost no real-world examples of why. I did find this example from Intel that, frankly doesn't help much:
CRITICAL_SECTION cs;
void threadfoo()
{
while(TryEnterCriticalSection(&cs) == FALSE)
{
// some useful work
}
// Critical Section of Code
LeaveCriticalSection (&cs);
}
// other work
}
What exactly is a scenario in which I can do "some useful work" while I'm waiting for my lock? I'd love to avoid thread-contention but in my code, by the time I need the critical section, I've already been forced to do all that "useful work" in order to get the values that I'm updating in shared data (for which I need the critical section in the first place).
Does anyone have a real-world example?
As an example you might have multiple threads that each produce a high volume of messages (events of some sort) that all need to go on a shared queue.
Since there's going to be frequent contention on the lock on the shared queue, each thread can have a local queue and then, whenever the TryEnterCriticalSection call succeeds for the current thread, it copies everything it has in its local queue to the shared one and releases the CS again.
In C++11 therestd::lock which employs deadlock-avoidance algorithm.
In C++17 this has been elaborated to std::scoped_lock class.
This algorithm tries to lock on mutexes in one order, and then in another, until succeeds. It takes try_lock to implement this approach.
Having try_lock method in C++ is called Lockable named requirement, whereas mutexes with only lock and unlock are BasicLockable.
So if you build C++ mutex on top of CTRITICAL_SECTION, and you want to implement Lockable, or you'll want to implement lock avoidance directly on CRITICAL_SECTION, you'll need TryEnterCriticalSection
Additionally you can implement timed mutex on TryEnterCriticalSection. You can do few iterations of TryEnterCriticalSection, then call Sleep with increasing delay time, until TryEnterCriticalSection succeeds or deadline has expired. It is not a very good idea though. Really timed mutexes based on user-space WIndows synchronization objects are implemented on SleepConditionVariableSRW, SleepConditionVariableCS or WaitOnAddress.
Because windows CS are recursive TryEnterCriticalSection allows a thread to check whether it already owns a CS without risk of stalling.
Another case would be if you have a thread that occasionally needs to perform some locked work but usually does something else, you could use TryEnterCriticalSection and only perform the locked work if you actually got the lock.
There doesn't appear to be synchronization between establishing/removing callbacks (e.g. kauth_unlisten_scope) and the callbacks themselves (in the xnu codebase, yes, I know, it's dated). This puts the burden of tracking/draining callbacks and synchronizing with calls on the extension itself. But this is problematic as well in that there is a window in noting that a thread has exited the callback AND actually returning out of the extension code.
Is there any pattern that gives a correct avoidance of this race? Or, is there any documentation from Apple that indicates they've synchronized this correctly?
As far as I'm aware there's no 100% reliable way of preventing kauth callback deregistration race conditions; the API is just badly designed. Apple themselves implement/recommend a simple atomic counter based mechanism, which you can see in the Kauth-O-Rama example. Search for gActivationCount in the KauthORama.c source file. There's still a small chance that a thread is running code before the increment or after the decrement in the callback, but I have never seen a crash caused by this.
I know there are no destructors in Go since technically there are no classes. As such, I use initClass to perform the same functions as a constructor. However, is there any way to create something to mimic a destructor in the event of a termination, for the use of, say, closing files? Right now I just call defer deinitClass, but this is rather hackish and I think a poor design. What would be the proper way?
In the Go ecosystem, there exists a ubiquitous idiom for dealing with objects which wrap precious (and/or external) resources: a special method designated for freeing that resource, called explicitly — typically via the defer mechanism.
This special method is typically named Close(), and the user of the object has to call it explicitly when they're done with the resource the object represents. The io standard package does even have a special interface, io.Closer, declaring that single method. Objects implementing I/O on various resources such as TCP sockets, UDP endpoints and files all satisfy io.Closer, and are expected to be explicitly Closed after use.
Calling such a cleanup method is typically done via the defer mechanism which guarantees the method will run no matter if some code which executes after resource acquisition will panic() or not.
You might also notice that not having implicit "destructors" quite balances not having implicit "constructors" in Go. This actually has nothing to do with not having "classes" in Go: the language designers just avoid magic as much as practically possible.
Note that Go's approach to this problem might appear to be somewhat low-tech but in fact it's the only workable solution for the runtime featuring garbage-collection. In a language with objects but without GC, say C++, destructing an object is a well-defined operation because an object is destroyed either when it goes out of scope or when delete is called on its memory block. In a runtime with GC, the object will be destroyed at some mostly indeterminate point in the future by the GC scan, and may not be destroyed at all. So if the object wraps some precious resource, that resource might get reclaimed way past the moment in time the last live reference to the enclosing object was lost, and it might even not get reclaimed at all—as has been well explained by #twotwotwo in their respective answer.
Another interesting aspect to consider is that the Go's GC is fully concurrent (with the regular program execution). This means a GC thread which is about to collect a dead object might (and usually will) be not the thread(s) which executed that object's code when it was alive. In turn, this means that if the Go types could have destructors then the programmer would need to make sure whatever code the destructor executes is properly synchronized with the rest of the program—if the object's state affects some data structures external to it. This actually might force the programmer to add such synchronization even if the object does not need it for its normal operation (and most objects fall into such category). And think about what happens of those exernal data strucrures happened to be destroyed before the object's destructor was called (the GC collects dead objects in a non-deterministic way). In other words, it's much easier to control — and to reason about — object destruction when it is explicitly coded into the program's flow: both for specifying when the object has to be destroyed, and for guaranteeing proper ordering of its destruction with regard to destroying of the data structures external to it.
If you're familiar with .NET, it deals with resource cleanup in a way which resembles that of Go quite closely: your objects which wrap some precious resource have to implement the IDisposable interface, and a method, Dispose(), exported by that interface, must be called explicitly when you're done with such an object. C# provides some syntactic sugar for this use case via the using statement which makes the compiler arrange for calling Dispose() on the object when it goes out of the scope declared by the said statement. In Go, you'll typically defer calls to cleanup methods.
One more note of caution. Go wants you to treat errors very seriously (unlike most mainstream programming language with their "just throw an exception and don't give a fsck about what happens due to it elsewhere and what state the program will be in" attitude) and so you might consider checking error returns of at least some calls to cleanup methods.
A good example is instances of the os.File type representing files on a filesystem. The fun stuff is that calling Close() on an open file might fail due to legitimate reasons, and if you were writing to that file this might indicate that not all the data you wrote to that file had actually landed in it on the file system. For an explanation, please read the "Notes" section in the close(2) manual.
In other words, just doing something like
fd, err := os.Open("foo.txt")
defer fd.Close()
is okay for read-only files in the 99.9% of cases, but for files opening for writing, you might want to implement more involved error checking and some strategy for dealing with them (mere reporting, wait-then-retry, ask-then-maybe-retry or whatever).
runtime.SetFinalizer(ptr, finalizerFunc) sets a finalizer--not a destructor but another mechanism to maybe eventually free up resources. Read the documentation there for details, including downsides. They might not run until long after the object is actually unreachable, and they might not run at all if the program exits first. They also postpone freeing memory for another GC cycle.
If you're acquiring some limited resource that doesn't already have a finalizer, and the program would eventually be unable to continue if it kept leaking, you should consider setting a finalizer. It can mitigate leaks. Unreachable files and network connections are already cleaned up by finalizers in the stdlib, so it's only other sorts of resources where custom ones can be useful. The most obvious class is system resources you acquire through syscall or cgo, but I can imagine others.
Finalizers can help get a resource freed eventually even if the code using it omits a Close() or similar cleanup, but they're too unpredictable to be the main way to free resources. They don't run until GC does. Because the program could exit before next GC, you can't rely on them for things that must be done, like flushing buffered output to the filesystem. If GC does happen, it might not happen soon enough: if a finalizer is responsible for closing network connections, maybe a remote host hits its limit on open connections to you before GC, or your process hits its file-descriptor limit, or you run out of ephemeral ports, or something else. So it's much better to defer and do cleanup right when it's necessary than to use a finalizer and hope it's done soon enough.
You don't see many SetFinalizer calls in everyday Go programming, partly because the most important ones are in the standard library and mostly because of their limited range of applicability in general.
In short, finalizers can help by freeing forgotten resources in long-running programs, but because not much about their behavior is guaranteed, they aren't fit to be your main resource-management mechanism.
There are Finalizers in Go. I wrote a little blog post about it. They are even used for closing files in the standard library as you can see here.
However, I think using defer is more preferable because it's more readable and less magical.
I was pondering language features and I was wondering if the following feature had been implemented in any languages.
A way of declaring that an object may only be accessed within a Mutex. SO for example in java you would only be able to access an object if it was in a synchrnoised block and in C# a Lock.
A compiler error would ensue if the object was used outside of a Mutex block.
Any thoughts?
UPDATE
I think some people have misunderstood the question, I'm not asking if you can lock objects, I'm asking if there is a mechanism to state at declaration of an object that it may only be accessed from within a lock/synchronised statement.
There are two ways to do that.
Your program either refuses to run a method unless the protecting mutex is locked by the calling thread (that's a runtime check); or it refuses to compile (that's a compile time check).
First way is what C# lock does.
Second method requires a compiler able to evaluate every execution path possible. It's hardly feasible.
In Java you can add the synchronized keyword to a method, but that is only syntactic sugar to wrapping the entire method body in a synchronized(this)-block (for non-static methods).
So for Java there is no language construct that enforces that behavior. You can try to .wait() on this with a zero timeout to ensure that the calling code has acquired the monitor, but that's just checking after-the-fact
In Objective-C, you can use the #property and #synthesize directives to let the compiler generate the code for accessors. By default they are protected by mutex.
Demanding locks on everything as you describe would create the potential for deadlocks, as one might be forced to take a lock sooner than one would otherwise.
That said, there are approaches similar to what you describe - Software Transactional Memory, in particular, avoids the deadlock issue by allowing rollbacks and retries.