I am working with Monte Carlo simulations to find the decimal places of PI. So far so good but OpenMP came in and I realize that ran2, arguably the best RGN, is not threadsafe! The implementation is here. Since I have not worked with OpenMP and neither a lot on multi-threading I am stuck at making this thread safe using OpenMP.
So far what I know is that a function is already thread-safe if it doesn't modify non-local memory and it doesn't call any function that does. In this case, there are 3 variables which are static and thus will be modified if gets used by different threads.
One possible solution is to call this in a thread safe way by enclosing the calling of ran2 in a critical section but that makes no sense as I get no speedup.
Can somebody give me pointers on how to proceed with this or if somebody has any reference that will be great too!
What is generally done to render a procedure thread safe is to associate previously static data to a thread local data. Look for instance at the man page of rand_r() that is a thread safe version of rand().
So in your version of L'Ecuyer:
define a struct (say state) that holds the static data
redefine procedure ran2() to have an additional parameter that is a pointer to this struct state and modify the code accordingly. Let ran2_r() be the new name.
define in every thread a local struct state to hold the state
probably state needs to be seeded. You can use get_thread_num() to provide a per thread seed to initialize properly the state when entering the thread.
Now you just need to call your new ran2_r() with a pointer to this state. It will be modified by the procedure, but modifications will be stored in the thread local state var.
I got a piece of code which uses a std::set to keep a bunch of pointer.
I used this to be sure of that each pointer will present only once in my container.
Then, I heard about std::unique_ptr that ensure the pointer will exists only once in my entire code, and that's exactly what I need.
So my question is quite simple, should I change my container type to std::vector ? Or It won't changes anything leaving a std::set ?
I think the job your set is doing is probably different to unique_ptr.
Your set is likely recording some events, and ensuring that only 1 event is recorded for each object that triggers, using these terms very loosely.
An example might be tracing through a mesh and recording all the nodes that are passed through.
The objects themselves already exist and are owned elsewhere.
The purpose of unique_ptr is to ensure that there is only one owner for a dynamically allocated object, and ensure automatic destruction of the object. Your objects already have owners, they don't need new ones!
I have a question about the marker trait Sync after reading Extensible Concurrency with the Sync and Send Traits.
Java's "synchronize" means blocking, so I was very confused about how a Rust struct with Sync implemented whose method is executed on multiple threads would be effective.
I searched but found no meaningful answer. I'm thinking about it this way: every thread will get the struct's reference synchronously (blocking), but call the method in parallel, is that true?
Java: Accesses to this object from multiple threads become a synchronized sequence of actions when going through this codepath.
Rust: It is safe to access this type synchronously through a reference from multiple threads.
(The two points above are not canonical definitions, they are just demonstrations how similar words can be used in sentences to obtain different meanings)
synchronized is implemented as a mutual exclusion lock at runtime. Sync is a compile time promise about runtime properties of a specific type that allows other types depend on those properties through trait bounds. A Mutex just happens to be one way one can provide Sync behavior. Immutable types usually provide this behavior too without any runtime cost.
Generally you shouldn't rely on words having exactly the same meaning in different contexts. Java IO stream != java collection stream != RxJava reactive stream ~= tokio Stream. C volatile != java volatile. etc. etc.
Ultimately the prose matters a lot more than the keyword which are just shorthands.
V8 requires a HandleScope to be declared in order to clean up any Local handles that were created within scope. I understand that HandleScope will dereference these handles for garbage collection, but I'm interested in why each Local class doesn't do the dereferencing themselves like most internal ref_ptr type helpers.
My thought is that HandleScope can do it more efficiently by dumping a large number of handles all at once rather than one by one as they would in a ref_ptr type scoped class.
Here is how I understand the documentation and the handles-inl.h source code. I, too, might be completely wrong since I'm not a V8 developer and documentation is scarce.
The garbage collector will, at times, move stuff from one memory location to another and, during one such sweep, also check which objects are still reachable and which are not. In contrast to reference-counting types like std::shared_ptr, this is able to detect and collect cyclic data structures. For all of this to work, V8 has to have a good idea about what objects are reachable.
On the other hand, objects are created and deleted quite a lot during the internals of some computation. You don't want too much overhead for each such operation. The way to achieve this is by creating a stack of handles. Each object listed in that stack is available from some handle in some C++ computation. In addition to this, there are persistent handles, which presumably take more work to set up and which can survive beyond C++ computations.
Having a stack of references requires that you use this in a stack-like way. There is no “invalid” mark in that stack. All the objects from bottom to top of the stack are valid object references. The way to ensure this is the LocalScope. It keeps things hierarchical. With reference counted pointers you can do something like this:
shared_ptr<Object>* f() {
shared_ptr<Object> a(new Object(1));
shared_ptr<Object>* b = new shared_ptr<Object>(new Object(2));
return b;
}
void g() {
shared_ptr<Object> c = *f();
}
Here the object 1 is created first, then the object 2 is created, then the function returns and object 1 is destroyed, then object 2 is destroyed. The key point here is that there is a point in time when object 1 is invalid but object 2 is still valid. That's what LocalScope aims to avoid.
Some other GC implementations examine the C stack and look for pointers they find there. This has a good chance of false positives, since stuff which is in fact data could be misinterpreted as a pointer. For reachability this might seem rather harmless, but when rewriting pointers since you're moving objects, this can be fatal. It has a number of other drawbacks, and relies a lot on how the low level implementation of the language actually works. V8 avoids that by keeping the handle stack separate from the function call stack, while at the same time ensuring that they are sufficiently aligned to guarantee the mentioned hierarchy requirements.
To offer yet another comparison: an object references by just one shared_ptr becomes collectible (and actually will be collected) once its C++ block scope ends. An object referenced by a v8::Handle will become collectible when leaving the nearest enclosing scope which did contain a HandleScope object. So programmers have more control over the granularity of stack operations. In a tight loop where performance is important, it might be useful to maintain just a single HandleScope for the whole computation, so that you won't have to access the handle stack data structure so often. On the other hand, doing so will keep all the objects around for the whole duration of the computation, which would be very bad indeed if this were a loop iterating over many values, since all of them would be kept around till the end. But the programmer has full control, and can arrange things in the most appropriate way.
Personally, I'd make sure to construct a HandleScope
At the beginning of every function which might be called from outside your code. This ensures that your code will clean up after itself.
In the body of every loop which might see more than three or so iterations, so that you only keep variables from the current iteration.
Around every block of code which is followed by some callback invocation, since this ensures that your stuff can get cleaned if the callback requires more memory.
Whenever I feel that something might produce considerable amounts of intermediate data which should get cleaned (or at least become collectible) as soon as possible.
In general I'd not create a HandleScope for every internal function if I can be sure that every other function calling this will already have set up a HandleScope. But that's probably a matter of taste.
Disclaimer: This may not be an official answer, more of a conjuncture on my part; but the v8 documentation is hardly
useful on this topic. So I may be proven wrong.
From my understanding, in developing various v8 based backed application. Its a means of handling the difference between the C++ and javaScript environment.
Imagine the following sequence, which a self dereferencing pointer can break the system.
JavaScript calls up a C++ wrapped v8 function : lets say helloWorld()
C++ function creates a v8::handle of value "hello world =x"
C++ returns the value to the v8 virtual machine
C++ function does its usual cleaning up of resources, including dereferencing of handles
Another C++ function / process, overwrites the freed memory space
V8 reads the handle : and the data is no longer the same "hell!#(#..."
And that's just the surface of the complicated inconsistency between the two; Hence to tackle the various issues of connecting the JavaScript VM (Virtual Machine) to the C++ interfacing code, i believe the development team, decided to simplify the issue via the following...
All variable handles, are to be stored in "buckets" aka HandleScopes, to be built / compiled / run / destroyed by their
respective C++ code, when needed.
Additionally all function handles, are to only refer to C++ static functions (i know this is irritating), which ensures the "existence"
of the function call regardless of constructors / destructor.
Think of it from a development point of view, in which it marks a very strong distinction between the JavaScript VM development team, and the C++ integration team (Chrome dev team?). Allowing both sides to work without interfering one another.
Lastly it could also be the sake of simplicity, to emulate multiple VM : as v8 was originally meant for google chrome. Hence a simple HandleScope creation and destruction whenever we open / close a tab, makes for much easier GC managment, especially in cases where you have many VM running (each tab in chrome).
Why do we need boost::thread_specific_ptr, or in other words what can we not easily do without it?
I can see why pthread provides pthread_getspecific() etc. These functions are useful for cleaning up after dead threads, and handy to call from C-style functions (the obvious alternative being to pass a pointer everywhere that points to some memory allocated before the thread was created).
In contrast, the constructor of boost:thread takes a callable class by value, and everything non-static in that class becomes thread local once it is copied. I cannot see why I would want to use boost::thread_specific_ptr in preference to a class member any more than I would want to use a global variable in OOP code.
Do I horribly misunderstand anything? A very brief example would help, please. Many thanks.
thread_specific_ptr simply provides portable thread local data access. You don't have to be managing your threads with Boost.Thread to get value from this. The canonical example is the one cited in the Boost docs for this class:
One example is the C errno variable,
used for storing the error code
related to functions from the Standard
C library. It is common practice (and
required by POSIX) for compilers that
support multi-threaded applications to
provide a separate instance of errno
for each thread, in order to avoid
different threads competing to read or
update the value.