Should my Akka actors' properties be marked @volatile?

This question looks similar to Should my Scala actors' properties be marked @volatile?, but I'm not sure the answer is the same.
For example, if the fork-join dispatcher is configured and the actor's state isn't marked @volatile, is it guaranteed that the state of the actor will be propagated through the cache hierarchy from one core (or processor) to another when fork/join worker threads run on different cores (or processors)?
P.S. Is it right that after JSR-133, a single write to and read from any volatile variable is enough to make all preceding non-volatile writes of one thread visible to another thread running on another core (or processor)? If so, that could be the answer, because scanning the work queue does some reads and writes from/to volatile variables of the FJ task.

No, you shouldn't put volatile on your actor fields. Why not? Because Akka already guarantees it: the "actor subsequent processing" rule says that processing of one message happens before processing of the next message by the same actor. So if an actor makes changes to its internal state while processing a message, it will see that state while processing another message moments later, without any volatile. It is important to realize that with the actor model you don't get any guarantee that the same thread will be executing the same actor for different messages, which is exactly why Akka provides this rule.
It's all here: http://doc.akka.io/docs/akka/2.0/general/jmm.html
Regarding your PS, you need to read/write the same volatile field to get the happens-before guarantee. Read up on "volatile piggybacking".
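A minimal sketch of what "volatile piggybacking" means (plain Java, not Akka code; the class and field names are made up for illustration): a plain write becomes visible to another thread because it is ordered before a volatile write that the other thread subsequently reads.

```java
// Sketch: "piggybacking" a plain write on a volatile write.
// The plain field `data` is published safely because the write to the
// volatile `ready` happens-before a read that observes ready == true.
public class Piggyback {
    static int data;                 // plain, non-volatile field
    static volatile boolean ready;   // the volatile we piggyback on

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { }       // spin until the volatile read sees true
            // happens-before: data = 42 is guaranteed to be visible here
            System.out.println(data);
        });
        reader.start();
        data = 42;                   // plain write...
        ready = true;                // ...made visible by this volatile write
        reader.join();
    }
}
```

Note that the guarantee only holds because the reader reads the *same* volatile field the writer wrote; two unrelated volatile fields give no such ordering.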

Related

Ehcache and CacheWriter (write-behind) relation

Suppose we have a Cache configured with a write-behind CacheWriter. Let's assume we put some object in the cache and later on the object is removed because of an eviction policy.
What is guaranteed regarding writing? More precisely, is the write() event guaranteed to happen for that object, even though it was removed before it "had a chance" to be written?
Thanks!
No, write() is not guaranteed to happen. In a write-behind setup, all writes are stored in a queue while some background threads read from that queue to update the underlying SoR (System of Record, i.e. your database). That queue can be read or modified by other threads concurrently reading or modifying the same cache.
For instance, if a put() happens on a certain key, write() enqueues the command. If a remove() happens on that same key before one of the background threads has had a chance to consume the write command, the write command can be removed from the queue (note the 'can' here). There are other similar optimizations that can take place ('can' again); these can change, and new ones can be added in any minor version, as this is all considered an implementation detail, as long as the data served by Ehcache follows its general visibility guarantees.
This means write-behind, and more generally all CacheWriters, must not be used for any form of accounting, if that's the use case you had in mind.
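The coalescing optimization described above can be modeled with a toy queue (this is not Ehcache's implementation, just an illustration of the behavior): a pending write command for a key is dropped if the key is removed before a background writer drains the queue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a write-behind queue: remove() can cancel a pending write
// command for the same key before the background writer ever sees it.
public class WriteBehindQueue {
    private final Map<String, String> pending = new LinkedHashMap<>();

    public synchronized void put(String key, String value) {
        pending.put(key, value);            // enqueue a write command
    }

    public synchronized void remove(String key) {
        pending.remove(key);                // coalesce: cancel the pending write
    }

    public synchronized Map<String, String> drain() {
        Map<String, String> batch = new LinkedHashMap<>(pending);
        pending.clear();                    // what the background writer flushes
        return batch;
    }

    public static void main(String[] args) {
        WriteBehindQueue q = new WriteBehindQueue();
        q.put("k1", "v1");
        q.put("k2", "v2");
        q.remove("k1");                     // k1 removed before the writer ran
        System.out.println(q.drain().keySet()); // write() never fires for k1
    }
}
```

This is why counting write() invocations cannot be used for accounting: the write for k1 simply never happens.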

Ehcache 3: Do keys that are being written get locked?

Suppose we have a cache with a CacheLoaderWriter, so we are registered to the events: write and writeAll.
What is the status of these keys at that time?
i.e. If another thread tries to cache.get(keyThatBeingWritten), will it be blocked until the write()/writeAll() operations exit?
Since writeAll() logically functions like a succession of write() calls, it is entirely possible for one thread to observe some already-written data while another thread is still busy executing writeAll().
Regarding write(), it will block concurrent reader and writer threads working on the same key if needed for as long as needed to fulfill the Ehcache visibility guarantees.
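The per-key blocking behavior can be illustrated with ConcurrentHashMap.compute, which holds a per-key lock for the duration of the computation (again an analogy, not Ehcache internals): a concurrent operation on the same key waits until the slow "write" finishes and then observes the fully written value.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Illustration of per-key blocking: compute() on a key blocks other
// compute() calls on the SAME key, while other keys remain unaffected.
public class PerKeyBlocking {
    public static void main(String[] args) throws Exception {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("k", "old");
        CountDownLatch writing = new CountDownLatch(1);

        Thread writer = new Thread(() -> cache.compute("k", (key, v) -> {
            writing.countDown();         // signal: per-key lock is now held
            sleep(200);                  // simulate a slow write-through
            return "new";
        }));
        writer.start();
        writing.await();                 // wait until the writer holds the lock

        // This read-via-compute on the same key blocks until the writer
        // finishes, so it observes the fully written value, never a torn one.
        String seen = cache.compute("k", (key, v) -> v);
        System.out.println(seen);
        writer.join();
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { throw new RuntimeException(e); }
    }
}
```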

What does JMS Session single-threadedness mean?

What is the exact nature of the thread-unsafety of a JMS Session and its associated constructs (Message, Consumer, Producer, etc)? Is it just that access to them must be serialized, or is it that access is restricted to the creating thread only?
Or is it a hybrid case where creation can be distinguished from use, i.e. one thread can create them only and then another thread can be the only one to use them? This last possibility would seem to contradict the statement in this answer which says "In fact you must not use it from two different threads at different times either!"
But consider the "Server Side" example code from the ActiveMQ documentation.
The Server class has data members named session (of type Session) and replyProducer (of type MessageProducer) which are
created in one thread: whichever one invokes the Server() constructor and thereby invokes the setupMessageQueueConsumer() method with the actual creation calls; and
used in another thread: whichever one invokes the onMessage() asynchronous callback.
(In fact, the session member is used in both threads too: in one to create the replyProducer member, and in the other to create a message.)
Is this official example code working by accident or by design? Is it really possible to create such objects in one thread and then arrange for another thread to use them?
(Note: in other messaging infrastructures, such as Solace, it's possible to specify the thread on which callbacks occur, which could be exploited to get around this "thread affinity of objects" restriction, but no such API call is defined in JMS, as far as I know.)
The JMS specification says a Session object should not be used across threads, except for calling the Session.close() method. Technically speaking, if access to the Session object or its children (producer, consumer, etc.) is serialized, then the Session and its child objects can be accessed across threads. Having said that, since JMS is an API specification, its implementation differs from vendor to vendor. Some vendors might strictly enforce thread affinity while some may not. So it's always better to stick to the JMS specification and write code accordingly.
The official answer appears to be a footnote to section 4.4. "Session" on p.60 in the JMS 1.1 specification.
There are no restrictions on the number of threads that can use a Session object or those it creates. The restriction is that the resources of a Session should not be used concurrently by multiple threads. It is up to the user to insure that this concurrency restriction is met. The simplest way to do this is to use one thread. In the case of asynchronous delivery, use one thread for setup in stopped mode and then start asynchronous delivery. In more complex cases the user must provide explicit synchronization.
Whether a particular implementation abides by this is another matter, of course. In the case of the ActiveMQ example, the code is conforming because all inbound message handling is through a single asynchronous callback.
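The "setup in stopped mode, then start delivery" pattern from the spec footnote works because of ordinary Java Memory Model rules, not anything JMS-specific. A plain-Java sketch (no broker, and StringBuilder stands in for a non-thread-safe Session-like object): everything the setup thread does before Thread.start() happens-before everything the delivery thread does, so no further locking is needed as long as the two threads never touch the objects concurrently.

```java
// Sketch of the spec's pattern: create the session objects in one thread
// while delivery is stopped, then hand them to the delivery thread.
// Thread.start() is the safe-publication point.
public class SetupThenDeliver {
    static StringBuilder session;       // stand-in for a non-thread-safe Session

    public static void main(String[] args) throws InterruptedException {
        session = new StringBuilder("configured"); // setup, delivery stopped

        Thread delivery = new Thread(() ->          // analogous to connection.start()
            System.out.println(session.append(" -> delivered")));
        delivery.start();                           // safe publication point
        delivery.join();
    }
}
```

This is why the ActiveMQ example is safe: creation finishes before delivery starts, and afterwards only the single callback thread uses the objects.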

boost.asio - do i need to use locks if sharing database type object between different async handlers?

I'm making a little server for a project. I have a log handler class which contains a log implemented as a map and some methods that act on it (add entry, flush to disk, commit, etc.).
This object is instantiated in the Server class, and I'm passing its address to each session so every session can add entries to it.
The sessions are async, and the log writes will happen in the async_read callback. I'm wondering if this will be an issue and if I need to use locks.
The map format is map<transactionId, map<sequenceNum, pair<head, body>>>; each session will access a different transactionId, so there should be no clashes as far as I can figure. Also, hypothetically, if they were all writing to the same place in memory, something large enough that the operation would not be atomic, would I need locks? As far as I understand, each async method dispatches a thread to handle the operation, which would make me assume yes. At the same time, I've read that one of the great uses of async functions is that synchronization primitives are not needed. So I'm a bit confused.
This is my first time using Asio or any type of asynchronous functions altogether, and I'm not a very experienced coder. I hope the question makes sense! The code seems to run fine so far, but I'm curious whether it's correct.
Thank you!
Asynchronous handlers will only be invoked in application threads processing the io_service event loop via run(), run_one(), poll(), or poll_one(). The documentation states:
Asynchronous completion handlers will only be called from threads that are currently calling io_service::run().
Hence, for a non-thread safe shared resource:
If the application code only has one thread, then there is neither concurrency nor race conditions. Thus, no additional form of synchronization is required. Boost.Asio refers to this as an implicit strand.
If the application code has multiple threads processing the event-loop and the shared resource is only accessed within handlers, then synchronization needs to occur, as multiple threads may attempt to concurrently access the shared resource. To resolve this, one can either:
Protect the calls to the shared resource via a synchronization primitive, such as a mutex. This question covers using mutexes within handlers.
Use the same strand to wrap() the ReadHandlers. A strand will prevent concurrent invocation of handlers dispatched through it. For more details on the usage of strands, particularly for composed operations, such as async_read(), consider reading this answer.
Rather than posting the entire ReadHandler into the strand, one could limit interacting with the shared resource to a specific set of functions, and these functions are posted as CompletionHandlers to the same strand. This subtle difference between this and the previous solution is the granularity of synchronization.
If the application code has multiple threads and the shared resource is accessed from threads processing the event loop and from threads not processing the event loop, then synchronization primitives, such as a mutex, needs to be used.
Also, even if a shared resource is small enough that writes and reads are always atomic, one should prefer explicit and proper synchronization. For example, although the write and read may each be atomic, without proper memory fencing to guarantee memory visibility, a thread may not observe a change in memory even though the actual memory has changed. Boost.Asio will perform the proper memory barriers to guarantee visibility. For more details on Boost.Asio and memory barriers, consider reading this answer.
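What a strand buys you can be sketched in Java (Boost.Asio itself is C++; this is only an analogy): handlers may be submitted from many threads, but a strand guarantees they never run concurrently, so a non-thread-safe structure touched only inside handlers needs no mutex. A single-threaded executor gives the same serialization guarantee.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Strand analogy: four threads post handlers concurrently, but the handlers
// themselves run one at a time, so the plain HashMap needs no locking.
public class StrandAnalogy {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService strand = Executors.newSingleThreadExecutor();
        Map<String, Integer> sharedLog = new HashMap<>(); // not thread safe

        List<Thread> posters = new ArrayList<>();
        for (int t = 0; t < 4; t++) {            // four "I/O threads" posting work
            Thread p = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    strand.execute(() -> sharedLog.merge("entries", 1, Integer::sum));
                }
            });
            posters.add(p);
            p.start();
        }
        for (Thread p : posters) p.join();       // all handlers have been posted
        strand.shutdown();
        strand.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(sharedLog.get("entries")); // all 4000 updates, no mutex
    }
}
```

Without the serializing executor (i.e. without a strand), the same code racing on a plain HashMap would lose updates or corrupt the map.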

Thread safety for DirectShow filters that deliver output samples from a worker thread

I'm working on a DirectShow filter which takes input samples and turns them into modified output samples but where there isn't a one-to-one correspondence between input and output samples so CTransformFilter doesn't seem to be the way to go.
The best way of writing this appears to be writing a filter using CBaseFilter, CBaseInputPin and CBaseOutputPin where samples are received on an input pin and processed by a worker thread which creates and delivers new samples from the output pin. The worker thread copies the input sample data before starting work so that my filter doesn't have to maintain a reference to the input samples outside the input CBaseInputPin::Receive call.
What's the best practice for maintaining thread safety and avoiding deadlocks in this case? Should the input and output pin share the same streaming lock or should they have a streaming lock each for their streaming operations? Do buffer allocation, sample delivery and other output pin operations need to hold the streaming lock(s) and/or the filter lock? Any sample code that does something similar? Any other gotchas to watch out for in this situation?
The DirectShow base classes contain scary comments for CBaseOutputPin::Deliver and CBaseOutputPin::GetDeliveryBuffer which I don't fully understand (pasted below).
/* Deliver a filled-in sample to the connected input pin. NOTE the object must
have locked itself before calling us otherwise we may get halfway through
executing this method only to find the filter graph has got in and
disconnected us from the input pin. If the filter has no worker threads
then the lock is best applied on Receive(), otherwise it should be done
when the worker thread is ready to deliver. There is a wee snag to worker
threads that this shows up. The worker thread must lock the object when
it is ready to deliver a sample, but it may have to wait until a state
change has completed, but that may never complete because the state change
is waiting for the worker thread to complete. The way to handle this is for
the state change code to grab the critical section, then set an abort event
for the worker thread, then release the critical section and wait for the
worker thread to see the event we set and then signal that it has finished
(with another event). At which point the state change code can complete */
You have a few samples in the Windows SDK in \Samples\multimedia\directshow\filters, and earlier versions of the SDK had even more. These are perhaps the best sample code you can check for locking practices.
The filter and pins normally use shared critical sections to ensure thread safety. For instance, CTransformFilter::m_csFilter protects the state data of the filter, and not only the filter but the pins, too, use this section. An additional section is used to serialize streaming requests (pushing samples through, sending EOS notifications).
Your filter might reuse the state critical section, or you can alternatively use an additional synchronization object (a critical section, reader-writer lock, or mutex) in order to avoid deadlocks with the critical section possibly being locked by the base classes.
The regular suggestions apply: to avoid deadlocks, make sure your lock ordering is consistent. If section A can be locked on a thread that already holds section B, then other threads must only lock B when they do not already hold A; that way, no deadlock is possible.
So typically, you have two scenarios which most use cases fall into:
you are reusing state critical section of the filter
you are using a separate critical section which protects your private data, and you don't keep that section locked while calling base class methods or methods of other objects such as peer filters
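The lock-ordering rule above can be sketched in Java (the DirectShow code itself is C++; filterLock and dataLock here are hypothetical stand-ins for the filter's state critical section and a private data section): every thread that needs both locks acquires them in the same order, so the cycle "holds A, waits for B" / "holds B, waits for A" can never form.

```java
// Consistent lock ordering: both code paths take filterLock ("A") before
// dataLock ("B"), so no deadlock is possible no matter how threads interleave.
public class LockOrdering {
    static final Object filterLock = new Object();   // "A": filter state section
    static final Object dataLock   = new Object();   // "B": private data section
    static int state, privateData;

    static void stateChange() {           // e.g. graph thread stopping the filter
        synchronized (filterLock) {       // A first...
            synchronized (dataLock) {     // ...then B
                state++; privateData++;
            }
        }
    }

    static void deliverSample() {         // e.g. worker thread delivering a sample
        synchronized (filterLock) {       // same order: A first, then B
            synchronized (dataLock) {
                privateData++;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> { for (int i = 0; i < 10000; i++) stateChange(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 10000; i++) deliverSample(); });
        a.start(); b.start();
        a.join(); b.join();               // completes: the ordering rules out deadlock
        System.out.println(state + " " + privateData);
    }
}
```

If deliverSample() instead took dataLock before filterLock, the two threads could each grab one lock and wait forever for the other, which is exactly the worker-thread snag the base-class comment describes.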
