C++11 equivalent of java.util.concurrent.ConcurrentHashMap

I find myself constantly writing mutex code to synchronize read/write access to a std::unordered_map and other containers so that I can use them the way I use the java.util.concurrent containers. I was about to start writing a wrapper to encapsulate the mutex, but I would rather use a well-tested library so I don't stuff up the threading.
Is there such a library?

Intel produced a library called Threading Building Blocks which has two such things: concurrent_hash_map and concurrent_unordered_map. They have slightly different characteristics, but one or the other will probably suit your needs.

Folly has an AtomicHashMap implementation. The major limitation is that you can only use int32 or int64 keys. Check the documentation (especially the Limitations section).
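For comparison, the hand-rolled wrapper the question mentions might look roughly like the sketch below. It is only an illustration using the C++11 standard library: the names LockedMap and fill_from_threads are made up here, and a single mutex around the whole map is far coarser than what TBB or Java's ConcurrentHashMap do internally.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical wrapper: every operation takes one lock on the whole map.
template <typename K, typename V>
class LockedMap {
public:
    // Insert or overwrite the value stored under key.
    void put(const K& key, V value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = std::move(value);
    }

    // Copy the value into out and return true, or return false if absent.
    bool get(const K& key, V& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    std::unordered_map<K, V> map_;
};

// Demo helper (hypothetical): each thread inserts its own disjoint key range.
std::size_t fill_from_threads(LockedMap<int, int>& map, int threads, int per_thread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&map, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                int key = t * per_thread + i;
                map.put(key, key * 2);
            }
        });
    }
    for (auto& th : pool) th.join();
    assert(map.size() == static_cast<std::size_t>(threads) * per_thread);
    return map.size();
}
```

Returning copies from get avoids handing out references that another thread could invalidate, at the cost of a copy per lookup.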

Related

Can I use boost named_semaphore in place of ACE_SEMAPHORE as I am trying to move from ACE to boost libraries?

I am moving my code from ACE library support to Boost library support. I need to replace ACE_Semaphore. It seems C++11 doesn't provide semaphores. I have seen named_semaphore in Boost. Another choice I saw was the POCO semaphore, for which I would have to include the POCO libraries. Kindly give me suggestions as to which is the best way to move forward.
Edit: This is for intra process thread synchronization.
The short answer is: yes.
For intra-process synchronization, you can simply emulate one with a mutex + condition variable (see "C++0x has no semaphores? How to synchronize threads?").
Note, though, that a plain mutex + condition variable will usually do, because the concrete condition doesn't usually take the form of a counter (e.g. "Synchronizing three threads with Condition Variable").
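A minimal sketch of that emulation, assuming C++11 and only the standard library, could look like this. CountingSemaphore and peak_holders are names invented for the illustration; they do not come from ACE or Boost.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counting semaphore built from a mutex and a condition variable.
class CountingSemaphore {
public:
    explicit CountingSemaphore(unsigned initial) : count_(initial) {}

    // Return one permit and wake a single waiter, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Block until a permit is available, then take it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Take a permit only if one is available right now.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    unsigned count_;
};

// Demo helper (hypothetical): returns the highest number of workers that
// held a permit at the same time.
int peak_holders(unsigned permits, int workers) {
    CountingSemaphore sem(permits);
    std::mutex stats_mutex;
    int current = 0;
    int peak = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&] {
            sem.acquire();
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                ++current;
                assert(current <= static_cast<int>(permits));
                if (current > peak) peak = current;
            }
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                --current;
            }
            sem.release();
        });
    }
    for (auto& t : pool) t.join();
    return peak;
}
```

The predicate form of wait handles spurious wakeups, which is why acquire does not need an explicit loop.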
For interprocess synchronization, use the named semaphore. An example: "How to limit the number of running instances in C++". Note that there are implementation differences¹.
¹ E.g. named_semaphore in Boost allocates its own shared resource, while ACE assumes the user allocates it from existing shared space. In Boost you can obviously do that too, as long as your platform supports native synchronization primitives in shared memory.

Does Go support volatile / non-volatile variables?

I'm new to the language so bear with me.
I am curious how Go handles data storage available to threads, in the sense that non-local variables can also be non-volatile, as in Java for instance.
Go has the concept of a channel which, by its nature (inter-thread communication), means it bypasses the processor cache and reads/writes the heap directly.
Also, I have not found any reference to volatile in the Go documentation.
TL;DR: Go does not have a keyword to make a variable safe for multiple goroutines to read and write. Use the sync/atomic package for that. Or, better yet: do not communicate by sharing memory; instead, share memory by communicating.
Two answers for the two meanings of volatile
.NET/Java concurrency
Some excerpts from the Go Memory Model.
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
One of the examples in the Incorrect Synchronization section shows busy-waiting on a value.
Worse, there is no guarantee that the write to done will ever be
observed by main, since there are no synchronization events between
the two threads. The loop in main is not guaranteed to finish.
Indeed, this code (play.golang.org/p/K8ndH7DUzq) never exits.
C/C++ non-standard memory
Go's memory model does not provide a way to address non-standard memory. If you have raw access to a device's I/O bus you'll need to use assembly or C to safely write values to the memory locations. I have only ever needed to do this in a device driver which generally precludes use of Go.
The simple answer is that volatile is not supported by the current Go specification, period.
If you do have one of the use cases where volatile is necessary, such as low-level atomic memory access that is unsupported by existing packages in the standard library, or unbuffered access to hardware mapped memory, you'll need to link in a C or assembly file.
Note that if you do use C or assembly as understood by the GC compiler suite, you don't even need cgo for that, since the [568]c C/asm compilers are also able to handle it.
You can find examples of that in Go's source code. For example:
http://golang.org/src/pkg/runtime/sema.goc
http://golang.org/src/pkg/runtime/atomic_arm.c
Grep for many other instances.
For how memory access in Go does work, check out The Go Memory Model.
No, Go does not support the volatile or register keywords.
See this post for more information.
This is also noted in the Go for C++ Programmers guide.
The Go Memory Model documentation explains why the concept of 'volatile' has no application in Go.
Loosely: among other things, goroutines are free to keep goroutine-local changes cached in registers, so those changes are not observable by other goroutines. To "flush" those changes to memory, a synchronization must be performed, either by using locks or by communicating (channel send or receive).

Writing a Ruby extension in Go (golang)

Are there some tutorials or practical lessons on how to write an extension for Ruby in Go?
Go 1.5 added support for building shared libraries that are callable from C (and thus from Ruby via FFI). This makes the process easier than in pre-1.5 releases (when it was necessary to write the C glue layer), and the Go runtime is now usable, making this actually useful in real life (goroutines and memory allocations were not possible before, as they require the Go runtime, which was not useable if Go was not the main entry point).
goFuncs.go:
package main

import "C"

//export GoAdd
func GoAdd(a, b C.int) C.int {
    return a + b
}

func main() {} // Required but ignored
Note that the //export GoAdd comment is required for each exported function; the symbol after export is how the function will be exported.
goFromRuby.rb:
require 'ffi'

module GoFuncs
  extend FFI::Library
  ffi_lib './goFuncs.so'
  attach_function :GoAdd, [:int, :int], :int
end

puts GoFuncs.GoAdd(41, 1)
The library is built with:
go build -buildmode=c-shared -o goFuncs.so goFuncs.go
Running the Ruby script produces:
42
Normally I'd try to give you a straight answer but the comments so far show there might not be one. So, hopefully this answer with a generic solution and some other possibilities will be acceptable.
One generic solution: compile high level language program into library callable from C. Wrap that for Ruby. One has to be extremely careful about integration at this point. This trick was a nice kludge to integrate many languages in the past, usually for legacy reasons. Thing is, I'm not a Go developer and I don't know that you can compile Go into something callable from C. Moving on.
Create two standalone programs: Ruby and Go program. In the programs, use a very efficient way of passing data back and forth. The extension will simply establish a connection to the Go program, send the data, wait for the result, and pass the result back into Ruby. The communication channel might be OS IPC, sockets, etc. Whatever each supports. The data format can be extremely simple if there's no security issues and you're using predefined message formats. That further boosts speed. Some of my older programs used XDR for binary format. These days, people seem to use things like JSON, Protocol Buffers and ZeroMQ style wire protocols.
Variation of second suggestion: use ZeroMQ! Or something similar. ZeroMQ is fast, robust and has bindings for both languages. It manages the whole above paragraph for you. Drawbacks are that it's less flexible wrt performance tuning and has extra stuff you don't need.
The tricky part of using two processes and passing data between them is a speed penalty. The overhead might not justify leaving Ruby. However, Go has great native performance and concurrency features that might justify coding part of an application in it versus a scripting language like Ruby. (Probably one of your justifications for your question.) So, try each of these strategies. If you get a working program that's also faster, use it. Otherwise, stick with Ruby.
Maybe a less appealing option: use something other than Go that has similar advantages, can be called from C, and can be integrated. Although it's not very popular, Ada is a possibility. It has long been strong in native code, (restricted) concurrency, reliability, low-level support, cross-language development and IDE support (GNAT). Also, Julia is a new language for high-performance technical and parallel programming that can be compiled into a library callable from C. It has a JIT too. Maybe changing the problem statement from Ruby+Go to Ruby+(more suitable language) will solve the problem?
As of Go 1.5, there's a new build mode that tells the Go compiler to output a shared library and a C header file:
-buildmode c-shared
(This is explained in more detail in this helpful tutorial: http://blog.ralch.com/tutorial/golang-sharing-libraries/)
With the new build mode, you no longer have to write a C glue layer yourself (as previously suggested in earlier responses). Once you have the shared-library and the header file, you can proceed to use FFI to call the Go-created shared library (example here: https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/)

Reentrancy in Boost

When working with multithreading, I need to make sure that the boost classes I use are reentrant (even when each thread uses its own object instance).
I'm having a hard time finding a statement about reentrancy in the documentation of Boost's classes. Am I missing something here? Are all the classes of Boost reentrant unless explicitly stated otherwise in the documentation? Or is Boost's documentation on reentrancy a gray area?
For example, I couldn't find anywhere in the documentation a statement on the reentrancy of the boost::numeric::ublas::matrix class. So can I assume it's reentrant or not?
Thanks!
Ofer
Most of Boost is similar to most of the STL and the C++ standard library in that:
Creating two instances of a type in two threads at the same time is OK.
Using two instances of a type in two threads at the same time is OK.
Using a single object in two threads at the same time is often not OK.
But doing read-only operations on one object in two threads is often OK.
There is usually no particular effort taken to "enhance" thread safety, except where there is a particular need to do so, like shared_ptr, Asio, Signals2 (but not Signals), and so on. Parts of Boost that look like value types (such as your matrix example) probably do not have any special thread safety support at all, leaving it up to the application.
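The rules above can be illustrated with a small sketch. It uses std::vector as a stand-in for a value type such as the matrix class, since the answer says most of Boost follows the same conventions as the standard library; the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

// Separate instances in separate threads: no locking required.
double sum_private_instances() {
    double a = 0.0;
    double b = 0.0;
    auto work = [](double& out) {
        std::vector<double> m(100, 1.0);  // this thread's own object
        out = std::accumulate(m.begin(), m.end(), 0.0);
    };
    std::thread t1(work, std::ref(a));
    std::thread t2(work, std::ref(b));
    t1.join();
    t2.join();
    return a + b;
}

// One object modified from two threads: the application must add the lock.
std::size_t append_to_shared(int per_thread) {
    std::vector<int> shared;
    std::mutex guard;
    auto work = [&shared, &guard, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> lock(guard);
            shared.push_back(i);
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    assert(shared.size() == static_cast<std::size_t>(2 * per_thread));
    return shared.size();
}
```

Without the lock_guard in the second function, the two push_back calls would race on the same object, which is the "often not OK" case in the list.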

C Runtime objects, dll boundaries

What is the best way to design a C API for dlls which deals with the problem of passing "objects" which are C runtime dependent (FILE*, pointer returned by malloc, etc...). For example, if two dlls are linked with a different version of the runtime, my understanding is that you cannot pass a FILE* from one dll to the other safely.
Is the only solution to use windows-dependent API (which are guaranteed to work across dlls) ? The C API already exists and is mature, but was designed from a unix POV, mostly (and still has to work on unix, of course).
You asked for a C, not a C++ solution.
The usual method(s) for doing this kind of thing in C are:
Design the module's API to simply not require CRT objects. Get stuff passed across in raw C types, i.e. get the consumer to load the file and simply pass you the pointer. Or get the consumer to pass a fully qualified file name, which is opened, read, and closed internally.
An approach used by other C modules (the MS Cabinet SDK and parts of the OpenSSL library come to mind, iirc) is to have the consuming application pass pointers to functions into the initialization function. So any API you pass a FILE* to would, at some point during initialization, have taken a pointer to a struct with function pointers matching the signatures of fread, fopen etc. When dealing with the external FILE*s, the DLL always uses the passed-in functions rather than the CRT functions.
With some simple tricks like this you can make your C DLL's interface entirely independent of the host's CRT, and in fact not require the host to be written in C or C++ at all.
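As a rough sketch of the function-pointer approach, assuming nothing about the actual Cabinet or OpenSSL interfaces: the HostFileOps struct, lib_init, and lib_write_greeting below are hypothetical names. The library side never calls fread or fwrite itself; it only goes through the table the host supplied, and treats the stream as an opaque pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

extern "C" {

// Hypothetical callback table that the host passes in at initialization.
struct HostFileOps {
    std::size_t (*write)(const void* buf, std::size_t size, std::size_t count, void* stream);
    std::size_t (*read)(void* buf, std::size_t size, std::size_t count, void* stream);
};

// ---- Library (DLL) side: no direct CRT file calls ----
static HostFileOps g_ops = {nullptr, nullptr};

void lib_init(const HostFileOps* ops) { g_ops = *ops; }

std::size_t lib_write_greeting(void* stream) {
    static const char msg[] = "hello";
    assert(g_ops.write != nullptr);  // lib_init must have been called first
    return g_ops.write(msg, 1, sizeof msg - 1, stream);
}

// ---- Host side: adapters that use the host's own CRT ----
static std::size_t host_write(const void* buf, std::size_t size, std::size_t count, void* stream) {
    return std::fwrite(buf, size, count, static_cast<std::FILE*>(stream));
}

static std::size_t host_read(void* buf, std::size_t size, std::size_t count, void* stream) {
    return std::fread(buf, size, count, static_cast<std::FILE*>(stream));
}

}  // extern "C"

// Demo (hypothetical): the host opens the file, the library writes through
// the callbacks, and the host reads the bytes back.
bool round_trip() {
    HostFileOps ops = {host_write, host_read};
    lib_init(&ops);
    std::FILE* f = std::tmpfile();
    if (!f) return false;
    std::size_t written = lib_write_greeting(f);
    std::rewind(f);
    char buf[8] = {0};
    std::size_t got = host_read(buf, 1, sizeof buf - 1, f);
    std::fclose(f);
    return written == 5 && got == 5 && std::strcmp(buf, "hello") == 0;
}
```

Because the FILE* is created, used, and closed only by code linked against the host's runtime, it never matters which CRT the library itself was built with.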
Neither existing answer is correct: Given the following on Windows: you have two DLLs, each is statically linked with two different versions of the C/C++ standard libraries.
In this case, you should not pass pointers to structures created by the C/C++ standard library in one DLL to the other. The reason is that these structures may be different between the two C/C++ standard library implementations.
The other thing you should not do is free, in one DLL, a pointer that was allocated by new or malloc in the other. The heap manager may be implemented differently as well.
Note, you can use the pointers between the DLLs - they just point to memory. It is the free that is the issue.
Now, you may find that this works, but if it does, then you are just lucky. This is likely to cause you problems in the future.
One potential solution to your problem is dynamically linking to the CRT. For example, you could dynamically link to MSVCRT.DLL. That way your DLLs will always use the same CRT.
Note, I suggest that it is not a best practice to pass CRT data structures between DLLs. You might want to see if you can factor things better.
Note, I am not a Linux/Unix expert - but you will have the same issues on those OSes as well.
The problem with the different runtimes isn't solvable directly, because on a Windows system the FILE* struct belongs to one runtime.
But if you write a small wrapper interface, you're done, and it does not really hurt.
class IFile {
public:
    virtual size_t fwrite(...) = 0;  // argument list elided in the original
    virtual size_t fread(...) = 0;   // argument list elided in the original
    virtual void destroy() = 0;      // "delete" is a reserved word, so the release method needs another name
};
extern "C" IFile* __stdcall IFileFactory(const char* filename, const char* mode);
This is safe to pass across DLL boundaries everywhere.
P.S.: Be careful if you start throwing exceptions across DLL boundaries. This will work quite well on Windows if you fulfill some design criteria, but will fail on some other OSes.
If the C API exists and is mature, bypassing the CRT internally by using pure Win32 API stuff gets you half the way. The other half is making sure the DLL's user uses the corresponding Win32 API functions. This will make your API less portable, in both use and documentation. Also, even if you go this way with memory allocation, where both the CRT functions and the Win32 ones deal with void*, you're still in trouble with the file stuff - Win32 API uses handles, and knows nothing about the FILE structure.
I'm not quite sure what are the limitations of the FILE*, but I assume the problem is the same as with CRT allocations across modules. MSVCRT uses Win32 internally to handle the file operations, and the underlying file handle can be used from every module within the same process. What might not work is closing a file that was opened by another module, which involves freeing the FILE structure on a possibly different CRT.
What I would do, if changing the API is still an option, is export cleanup functions for any possible "object" created within the DLL. These cleanup functions will handle the disposal of the given object in the way that corresponds to the way it was created within that DLL. This will also make the DLL absolutely portable in terms of usage. The only worry you'll have then is making sure the DLL's user does indeed use your cleanup functions rather than the regular CRT ones. This can be done using several tricks, which deserve another question...
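A minimal sketch of such a create/cleanup pair, with a hypothetical opaque Buffer type and made-up function names, might look like this. The point is only that allocation and deallocation both happen inside the same module, so they always use the same CRT heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

extern "C" {

// Public header part: callers only ever see an opaque handle.
struct Buffer;
Buffer* buffer_create(const char* text);
const char* buffer_text(const Buffer* b);
void buffer_destroy(Buffer* b);

}  // extern "C"

// Private to the DLL: the real layout is never exposed to callers.
struct Buffer {
    char* data;
};

extern "C" Buffer* buffer_create(const char* text) {
    Buffer* b = static_cast<Buffer*>(std::malloc(sizeof(Buffer)));
    if (!b) return nullptr;
    std::size_t n = std::strlen(text) + 1;
    b->data = static_cast<char*>(std::malloc(n));
    if (!b->data) {
        std::free(b);
        return nullptr;
    }
    std::memcpy(b->data, text, n);
    return b;
}

extern "C" const char* buffer_text(const Buffer* b) {
    assert(b != nullptr);
    return b->data;
}

// Cleanup runs in the DLL, so free() matches the malloc() used above.
extern "C" void buffer_destroy(Buffer* b) {
    if (!b) return;
    std::free(b->data);
    std::free(b);
}
```

A caller would pair every buffer_create with a buffer_destroy and never pass the handle to its own free.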
