OpenMP critical section vs locks - openmp

What is the difference between OpenMP locks and critical section? Are they just alternatives for each other?
For example, if I am writing to the same file from multiple threads, should I use locking or just a critical section around the write to the file?

Critical sections will most commonly use a lock internally, as the libgomp and libiomp implementations do.
If the optional (name) is omitted, it locks an unnamed global mutex.
The OpenMP specification guarantees the following behaviour:
> The critical construct restricts execution of the associated structured block to a single thread at a time.
A critical section therefore serves the same purpose as acquiring a lock. The difference is that the low-level details are handled for you.
I would advise you to use critical whenever possible because it is simpler. If you have separate blocks that need to be critical but don't interfere with each other, give them different names. Use explicit locking only if you need behaviour that the annotations cannot accommodate.
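As a minimal sketch of the advice above, the following uses two named critical sections so that file writes and counter updates do not block each other. The function name write_records, the section names file_io and counter, and the string stream standing in for the shared file are all hypothetical. It assumes a compiler with OpenMP enabled (e.g. -fopenmp with GCC); without it the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: n loop iterations each append one line to a shared
// stream (a stand-in for the shared file) and bump a shared counter.
std::pair<std::string, int> write_records(int n) {
    assert(n >= 0);
    std::ostringstream out;  // shared by all threads
    int written = 0;         // shared by all threads

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Formatting happens outside any critical section, so it can overlap.
        const std::string line = "record " + std::to_string(i) + "\n";

        // Only one thread at a time may append to the stream.
        #pragma omp critical(file_io)
        {
            out << line;
        }

        // Different name, different lock: does not contend with file_io.
        #pragma omp critical(counter)
        {
            ++written;
        }
    }
    return {out.str(), written};
}
```

Explicit locks (omp_init_lock, omp_set_lock, omp_unset_lock and omp_destroy_lock from omp.h) become worth the extra code when the set of locks is only known at run time, for example one lock per output file.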

Related

Can I use boost named_semaphore in place of ACE_SEMAPHORE as I am trying to move from ACE to boost libraries?

I am moving my code from ACE library support to boost library support. I need to replace ACE_Semaphore. It seems C++11 doesn't provide a semaphore. I have seen named_semaphore in boost. Another choice I saw was to go for the POCO semaphore, for which I would have to include the POCO libraries. Kindly give me suggestions as to which is the best way to move forward.
Edit: This is for intra process thread synchronization.
The short answer is: yes.
If for intra-process synchronization, you can simply emulate one with a mutex+condition variable:
C++0x has no semaphores? How to synchronize threads?
Note though, usually a mutex + condition variable will do, as the concrete condition doesn't usually take the form of a counter. (E.g. Synchronizing three threads with Condition Variable)
For interprocess synchronization you use the named semaphore. An example: How to limit the number of running instances in C++. Note that there are implementation differences¹.
¹ e.g. named_semaphore in boost allocates its own shared resource, while in ACE it's assumed the user allocates it from existing shared space. In boost, you obviously also can, as long as your platform supports native synchronization primitives in shared memory.
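For the intra-process case, the mutex + condition variable emulation mentioned above might look roughly like the sketch below. The class name CountingSemaphore and its method names are hypothetical; they are not part of Boost, ACE, or the C++11 standard library, and the sketch makes no attempt to mirror the full ACE_Semaphore interface (timeouts, for instance, are left out).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore built from C++11 primitives only.
class CountingSemaphore {
public:
    explicit CountingSemaphore(unsigned initial = 0) : count_(initial) {}

    // Release one permit and wake one waiting thread, if any.
    void post() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Block until a permit is available, then take it.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form re-checks the count after spurious wakeups.
        cv_.wait(lock, [this] { return count_ > 0; });
        assert(count_ > 0);  // holds here because the lock is re-acquired
        --count_;
    }

    // Take a permit only if one is available right now; never blocks.
    bool try_wait() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    unsigned count_;
};
```

A worker thread would call wait() before using a limited resource and post() when done. As the answer notes, if the real condition is not a counter, waiting on that condition directly with the same mutex and condition variable is usually the simpler design.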

Is there a robust implementation of condition_variable and mutex that can be stored in shared memory on Windows?

As described in this question, use of boost's interprocess_mutex and interprocess condition variable may result in a deadlock if the process holding the mutex crashes.
This is because boost's mutex is not a kernel object and therefore is not automatically released when the process holding it exits.
Is there a way in boost to use interprocess condition variables with the mutex returned by a call to CreateMutex?
Just use CreateSemaphore() directly to implement condvars across multiple processes. You don't need to use Boost condvars. Windows provides a very rich set of well defined, fairly correct, named machine-wide synchronisation objects. Use those instead.

Does Go support volatile / non-volatile variables?

I'm new to the language so bear with me.
I am curious how Go handles data storage available to threads, in the sense that non-local variables can also be non-volatile, like in Java for instance.
Go has the concept of a channel, which, by its nature (inter-thread communication), means it bypasses the processor cache and reads/writes to the heap directly.
Also, have not found any reference to volatile in the go lang documentation.
TL;DR: Go does not have a keyword to make a variable safe for multiple goroutines to read and write. Use the sync/atomic package for that. Or better yet: "Do not communicate by sharing memory; instead, share memory by communicating."
Two answers for the two meanings of volatile
.NET/Java concurrency
Some excerpts from the Go Memory Model.
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
One of the examples from the Incorrect Synchronization section shows busy waiting on a value.
Worse, there is no guarantee that the write to done will ever be
observed by main, since there are no synchronization events between
the two threads. The loop in main is not guaranteed to finish.
Indeed, this code (play.golang.org/p/K8ndH7DUzq) never exits.
C/C++ non-standard memory
Go's memory model does not provide a way to address non-standard memory. If you have raw access to a device's I/O bus you'll need to use assembly or C to safely write values to the memory locations. I have only ever needed to do this in a device driver which generally precludes use of Go.
The simple answer is that volatile is not supported by the current Go specification, period.
If you do have one of the use cases where volatile is necessary, such as low-level atomic memory access that is unsupported by existing packages in the standard library, or unbuffered access to hardware mapped memory, you'll need to link in a C or assembly file.
Note that if you do use C or assembly as understood by the GC compiler suite, you don't even need cgo for that, since the [568]c C/asm compilers are also able to handle it.
You can find examples of that in Go's source code. For example:
http://golang.org/src/pkg/runtime/sema.goc
http://golang.org/src/pkg/runtime/atomic_arm.c
Grep for many other instances.
For how memory access in Go does work, check out The Go Memory Model.
No, Go does not support the volatile or register keywords.
See this post for more information.
This is also noted in the Go for C++ Programmers guide.
The Go Memory Model documentation explains why the concept of 'volatile' has no application in Go.
Loosely: among other things, goroutines are free to keep goroutine-local changes cached in registers, so those changes are not observable by other goroutines. To "flush" those changes to memory, a synchronization must be performed, either by using locks or by communicating (channel send or receive).

Correct lock to use in linux character driver

I am writing a simple character device driver. (kernel 2.6.26)
Multiple concurrent reader & writers are expected.
I am not sure what type of lock is best used to synchronize a short access to internal structures.
Any advice will be most appreciated
Compare with http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c214.html . An old document from before when mutexes existed, but given mutexes are a sleeping lock, they count towards user context.
spinlock — spinlock_bh — mutex — semaphore
If your data structures are only ever accessed by functions whose execution is triggered by userspace, all lock primitives are available to you. It depends on gut feeling of how short a "short access" is.
And then there is RCU as a fifth way of doing things, though it is somewhat not a locking primitive in its own right. (It is used together with one of the lock primitives.)
Start with a mutex. Once you've got it working you can think about reworking the locking.

Executable sections marked as "execute" AND "read"?

I've noticed (on Win32 at least) that in executables, code sections (.text) have the "read" access bit set, as well as the "execute" access bit. Are there any bona fide reasons for code to be reading itself instead of executing itself? I thought this was what other sections were for (such as .rdata).
(Specifically, I'm talking about IMAGE_SCN_MEM_READ.)
Sections marked IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ are mapped into memory as PAGE_EXECUTE_READ, which is equivalent to PAGE_EXECUTE_WRITECOPY. This is needed to enable copy-on-write access. Copy-on-write means that any attempt to modify the page results in a new, process-private copy of the page being created.
There are a few different reasons for needing write-copy:
Code that needs to be relocated by the loader must have this set so that the loader can do the fix-ups. This is very common.
Sections that have code and data in single section would need this as well, to enable modifying process globals. Code & data in a single section can save space, and possibly improve locality by having code and the globals the code uses being on the same page.
Code that attempts to modify itself. I believe this is fairly rare.
Compile-time constants, particularly long long or double values, are often loaded from the code segment with a mov register, address instruction.
The one example I can think of for a reason to read code is to allow for self modifying code. Code must necessarily be able to read itself in order to be self modifying.
Also consider the opposite side. What advantage is gained from disallowing code from reading itself? I struggled for a bit on this one but I can see no advantage gained from doing so.
