Can someone explain to me what Threadsafe is? [duplicate]

Recently I tried to access a TextBox from a thread (other than the UI thread) and an exception was thrown. It said something about the "code not being thread safe", so I ended up writing a delegate (a sample from MSDN helped) and calling it instead.
But even so, I didn't quite understand why all the extra code was necessary.
Update:
Will I run into any serious problems if I set
Controls.CheckForIllegalCrossThread..blah = true

Eric Lippert has a nice blog post entitled What is this thing you call "thread safe"? about the definition of thread safety as found on Wikipedia.
Three important points extracted from the links:
“A piece of code is thread-safe if it functions correctly during
simultaneous execution by multiple threads.”
“In particular, it must satisfy the need for multiple threads to
access the same shared data, …”
“…and the need for a shared piece of data to be accessed by only one
thread at any given time.”
Definitely worth a read!

In the simplest of terms, thread safe means that it is safe to be accessed from multiple threads. When you are using multiple threads in a program, and they are each attempting to access a common data structure or location in memory, several bad things can happen. So, you add some extra code to prevent those bad things. For example, if two people were writing the same document at the same time, the second person to save will overwrite the work of the first person. To make it thread safe, then, you have to force person 2 to wait for person 1 to complete their task before allowing person 2 to edit the document.
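To make the analogy concrete, here is a minimal Java sketch (the Document class and its member names are made up for illustration) in which a lock forces the second writer to wait:
class Document {
    private final Object lock = new Object(); // one lock guards the contents
    private String contents = "";

    // Only one thread at a time can hold the lock, so a second writer
    // blocks here until the first save completes; nothing is overwritten mid-write.
    void save(String newContents) {
        synchronized (lock) {
            contents = newContents;
        }
    }
}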

Wikipedia has an article on Thread Safety.
This definitions page (you have to skip an ad - sorry) defines it thus:
In computer programming, thread-safe describes a program portion or routine that can be called from multiple programming threads without unwanted interaction between the threads.
A thread is an execution path of a program. A single-threaded program will only have one thread, and so this problem doesn't arise. Virtually all GUI programs have multiple execution paths, and hence threads - there are at least two: one for processing the display of the GUI and handling user input, and at least one other for actually performing the operations of the program.
This is done so that the UI is still responsive while the program is working, by offloading any long-running process to non-UI threads. These threads may be created once and exist for the lifetime of the program, or just get created when needed and destroyed when they've finished.
As these threads will often need to perform common actions - disk i/o, outputting results to the screen etc. - these parts of the code will need to be written in such a way that they can handle being called from multiple threads, often at the same time. This will involve things like:
Working on copies of data
Adding locks around the critical code
Opening files in the appropriate mode - so if reading, don't open the file for write as well.
Coping with not having access to resources because they're locked by other threads/processes.

Simply, thread-safe means that a method or class instance can be used by multiple threads at the same time without any problems occurring.
Consider the following method:
private int myInt = 0;

public int AddOne()
{
    int tmp = myInt;
    tmp = tmp + 1;
    myInt = tmp;
    return tmp;
}
Now thread A and thread B both would like to execute AddOne(), but A starts first and reads the value of myInt (0) into tmp. Now, for some reason, the scheduler decides to halt thread A and defer execution to thread B. Thread B now also reads the value of myInt (still 0) into its own variable tmp. Thread B finishes the entire method, so in the end myInt = 1, and 1 is returned. Now it's thread A's turn again. Thread A continues. It adds 1 to tmp (tmp was 0 for thread A), and then saves this value in myInt. myInt is again 1.
So in this case the method AddOne() was called twice, but because the method was not implemented in a thread-safe way the value of myInt is not 2, as expected, but 1, because the second thread read the variable myInt before the first thread finished updating it.
Creating thread-safe methods is very hard in non-trivial cases. And there are quite a few techniques. In Java you can mark a method as synchronized; this means that only one thread can execute that method at a given time. The other threads wait in line. This makes the method thread-safe, but if there is a lot of work to be done in the method, then this wastes a lot of time. Another technique is to mark only a small part of a method as synchronized by creating a lock or semaphore and locking this small part (usually called the critical section). There are even some methods that are implemented as lock-free yet thread-safe, which means that they are built in such a way that multiple threads can race through them at the same time without ever causing problems; this can be the case when a method only executes one atomic call. Atomic calls are calls that can't be interrupted and can only be done by one thread at a time.
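As a minimal Java sketch of both techniques applied to the AddOne example above (the class and field names here are just illustrative):
import java.util.concurrent.atomic.AtomicInteger;

class Counter {
    private int myInt = 0;
    private final AtomicInteger myAtomicInt = new AtomicInteger(0);

    // Synchronized fix: only one thread may execute this method at a time,
    // so the read-increment-write sequence can no longer interleave.
    public synchronized int addOne() {
        myInt = myInt + 1;
        return myInt;
    }

    // Lock-free fix: incrementAndGet() is a single atomic operation.
    public int addOneAtomic() {
        return myAtomicInt.incrementAndGet();
    }
}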

A real-world example for the layman:
Suppose you have a bank account with internet and mobile banking, and your account has only $10.
You transfer the balance to another account using mobile banking, and in the meantime you pay for online shopping using the same bank account.
If this bank account is not thread safe, then the bank allows you to perform the two transactions at the same time, spending the same $10 twice, and then the bank will go bankrupt.
Thread safe means that an object's state stays consistent even if multiple threads try to access the object simultaneously.

You can get more explanation from the book "Java Concurrency in Practice":
A class is thread‐safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.

A module is thread-safe if it guarantees that it can maintain its invariants in the face of multi-threaded and concurrent use.
Here, a module can be a data structure, class, object, method/procedure or function. Basically, a scoped piece of code and related data.
The guarantee can potentially be limited to certain environments such as a specific CPU architecture, but must hold for those environments. If there is no explicit delimitation of environments, then it is usually taken to imply that the guarantee holds for all environments in which the code can be compiled and executed.
Thread-unsafe modules may function correctly under multi-threaded and concurrent use, but this is often down to luck and coincidence rather than careful design. Even if a module does not break for you in one environment, it may break when moved to another.
Multi-threading bugs are often hard to debug. Some of them only happen occasionally, while others manifest aggressively - this, too, can be environment specific. They can manifest as subtly wrong results or deadlocks. They can mess up data structures in unpredictable ways, and cause other seemingly impossible bugs to appear in other remote parts of the code. It can be very application specific, so it is hard to give a general description.

Thread safety: A thread-safe program protects its data from memory consistency errors. In a highly multi-threaded program, a thread-safe program does not cause any side effects from multiple read/write operations by multiple threads on the same objects. Different threads can share and modify object data without consistency errors.
You can achieve thread safety by using the high-level concurrency API. This documentation page provides good programming constructs to achieve thread safety.
Lock Objects support locking idioms that simplify many concurrent applications.
Executors define a high-level API for launching and managing threads. Executor implementations provided by java.util.concurrent provide thread pool management suitable for large-scale applications.
Concurrent Collections make it easier to manage large collections of data, and can greatly reduce the need for synchronization.
Atomic Variables have features that minimize synchronization and help avoid memory consistency errors.
ThreadLocalRandom (in JDK 7) provides efficient generation of pseudorandom numbers from multiple threads.
Refer to java.util.concurrent and java.util.concurrent.atomic packages too for other programming constructs.
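As a minimal sketch of two of those constructs (the class and method names below are made up), atomic variables and concurrent collections can remove the need for explicit synchronization in simple cases:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class HitCounter {
    // Atomic variable: incrementAndGet() is one indivisible operation.
    private final AtomicLong totalHits = new AtomicLong();

    // Concurrent collection: safe for simultaneous readers and writers.
    private final Map<String, AtomicLong> hitsPerPage = new ConcurrentHashMap<>();

    void recordHit(String page) {
        totalHits.incrementAndGet();
        // computeIfAbsent is atomic per key, so no external lock is needed here.
        hitsPerPage.computeIfAbsent(page, p -> new AtomicLong()).incrementAndGet();
    }
}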

Producing thread-safe code is all about managing access to shared mutable state. When mutable state is published or shared between threads, access to it needs to be synchronized to avoid bugs like race conditions and memory consistency errors.
I recently wrote a blog about thread safety. You can read it for more information.

You are clearly working in a WinForms environment. WinForms controls exhibit thread affinity, which means that the thread on which they are created is the only thread that can be used to access and update them. That is why you will find examples on MSDN and elsewhere demonstrating how to marshal the call back onto the main thread.
Normal WinForms practice is to have a single thread that is dedicated to all your UI work.

I find the concept of reentrancy (http://en.wikipedia.org/wiki/Reentrancy_%28computing%29) to capture what I usually think of as unsafe threading: a method that has, and relies on, a side effect such as a global variable.
For example, I have seen code that formatted floating point numbers to strings; if two of these are run in different threads, the global value of decimalSeparator can be permanently changed to '.'
//built-in global set to a locale-specific value (here a comma)
decimalSeparator = ','

function FormatDot(value : real):
    //save the current decimal character
    temp = decimalSeparator
    //set the global value to '.'
    decimalSeparator = '.'
    //format() uses decimalSeparator behind the scenes
    result = format(value)
    //put the original value back
    decimalSeparator = temp
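A thread-safe rewrite avoids the shared global entirely. Here is a minimal Java sketch (the class and method names are hypothetical) that gives each call its own formatter configured with '.' instead of mutating global state:
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;

class DotFormatter {
    // No shared mutable state: each call builds its own formatter,
    // so two threads can format concurrently without interfering.
    static String formatDot(double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols();
        symbols.setDecimalSeparator('.');
        return new DecimalFormat("0.0#####", symbols).format(value);
    }
}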

To understand thread safety, read the sections below:
4.3.1. Example: Vehicle Tracker Using Delegation
As a more substantial example of delegation, let's construct a version of the vehicle tracker that delegates to a thread-safe class. We store the locations in a Map, so we start with a thread-safe Map implementation, ConcurrentHashMap. We also store the location using an immutable Point class instead of MutablePoint, shown in Listing 4.6.
Listing 4.6. Immutable Point class used by DelegatingVehicleTracker.
class Point {
    public final int x, y;

    public Point() {
        this.x = 0;
        this.y = 0;
    }

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
Point is thread-safe because it is immutable. Immutable values can be freely shared and published, so we no longer need to copy the locations when returning them.
DelegatingVehicleTracker in Listing 4.7 does not use any explicit synchronization; all access to state is managed by ConcurrentHashMap, and all the keys and values of the Map are immutable.
Listing 4.7. Delegating Thread Safety to a ConcurrentHashMap.
public class DelegatingVehicleTracker {
    private final ConcurrentMap<String, Point> locations;
    private final Map<String, Point> unmodifiableMap;

    public DelegatingVehicleTracker(Map<String, Point> points) {
        this.locations = new ConcurrentHashMap<String, Point>(points);
        this.unmodifiableMap = Collections.unmodifiableMap(locations);
    }

    public Map<String, Point> getLocations() {
        return this.unmodifiableMap; // User cannot update Point(x, y) as Point is immutable
    }

    public Point getLocation(String id) {
        return locations.get(id);
    }

    public void setLocation(String id, int x, int y) {
        if (locations.replace(id, new Point(x, y)) == null) {
            throw new IllegalArgumentException("invalid vehicle name: " + id);
        }
    }
}
If we had used the original MutablePoint class instead of Point, we would be breaking encapsulation by letting getLocations publish a reference to mutable state that is not thread-safe. Notice that we've changed the behavior of the vehicle tracker class slightly; while the monitor version returned a snapshot of the locations, the delegating version returns an unmodifiable but “live” view of the vehicle locations. This means that if thread A calls getLocations and thread B later modifies the location of some of the points, those changes are reflected in the Map returned to thread A.
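A small usage sketch makes the live-view behaviour visible (this assumes the two classes above; the vehicle name "cab1" is made up):
import java.util.HashMap;
import java.util.Map;

class LiveViewDemo {
    public static void main(String[] args) {
        Map<String, Point> initial = new HashMap<String, Point>();
        initial.put("cab1", new Point(1, 1));
        DelegatingVehicleTracker tracker = new DelegatingVehicleTracker(initial);

        Map<String, Point> view = tracker.getLocations(); // "thread A" takes the view
        tracker.setLocation("cab1", 5, 7);                // "thread B" moves the vehicle

        System.out.println(view.get("cab1").x);           // prints 5: the later change is visible
    }
}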
4.3.2. Independent State Variables
We can also delegate thread safety to more than one underlying state variable as long as those underlying state variables are independent, meaning that the composite class does not impose any invariants involving the multiple state variables.
VisualComponent in Listing 4.9 is a graphical component that allows clients to register listeners for mouse and keystroke events. It maintains a list of registered listeners of each type, so that when an event occurs the appropriate listeners can be invoked. But there is no relationship between the set of mouse listeners and key listeners; the two are independent, and therefore VisualComponent can delegate its thread safety obligations to two underlying thread-safe lists.
Listing 4.9. Delegating Thread Safety to Multiple Underlying State Variables.
public class VisualComponent {
    private final List<KeyListener> keyListeners =
            new CopyOnWriteArrayList<KeyListener>();
    private final List<MouseListener> mouseListeners =
            new CopyOnWriteArrayList<MouseListener>();

    public void addKeyListener(KeyListener listener) {
        keyListeners.add(listener);
    }

    public void addMouseListener(MouseListener listener) {
        mouseListeners.add(listener);
    }

    public void removeKeyListener(KeyListener listener) {
        keyListeners.remove(listener);
    }

    public void removeMouseListener(MouseListener listener) {
        mouseListeners.remove(listener);
    }
}
VisualComponent uses a CopyOnWriteArrayList to store each listener list; this is a thread-safe List implementation particularly suited for managing listener lists (see Section 5.2.3). Each List is thread-safe, and because there are no constraints coupling the state of one to the state of the other, VisualComponent can delegate its thread safety responsibilities to the underlying mouseListeners and keyListeners objects.
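The snapshot semantics are easy to demonstrate. In the sketch below (using plain Runnable in place of real listeners), mutating the list during iteration is harmless because the iterator keeps reading the old snapshot:
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotDemo {
    public static void main(String[] args) {
        List<Runnable> listeners = new CopyOnWriteArrayList<Runnable>();
        listeners.add(() -> System.out.println("listener 1"));

        for (Runnable listener : listeners) {
            // On an ordinary ArrayList this add would throw
            // ConcurrentModificationException; here the iterator still
            // sees only the snapshot taken when the loop started.
            listeners.add(() -> System.out.println("listener 2"));
            listener.run(); // prints "listener 1" only
        }
    }
}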
4.3.3. When Delegation Fails
Most composite classes are not as simple as VisualComponent: they have invariants that relate their component state variables. NumberRange in Listing 4.10 uses two AtomicIntegers to manage its state, but imposes an additional constraint—that the first number be less than or equal to the second.
Listing 4.10. Number Range Class that does Not Sufficiently Protect Its Invariants. Don't do this.
public class NumberRange {
    // INVARIANT: lower <= upper
    private final AtomicInteger lower = new AtomicInteger(0);
    private final AtomicInteger upper = new AtomicInteger(0);

    public void setLower(int i) {
        // Warning - unsafe check-then-act
        if (i > upper.get()) {
            throw new IllegalArgumentException(
                    "Can't set lower to " + i + " > upper");
        }
        lower.set(i);
    }

    public void setUpper(int i) {
        // Warning - unsafe check-then-act
        if (i < lower.get()) {
            throw new IllegalArgumentException(
                    "Can't set upper to " + i + " < lower");
        }
        upper.set(i);
    }

    public boolean isInRange(int i) {
        return (i >= lower.get() && i <= upper.get());
    }
}
NumberRange is not thread-safe; it does not preserve the invariant that constrains lower and upper. The setLower and setUpper methods attempt to respect this invariant, but do so poorly. Both setLower and setUpper are check-then-act sequences, but they do not use sufficient locking to make them atomic. If the number range holds (0, 10), and one thread calls setLower(5) while another thread calls setUpper(4), with some unlucky timing both will pass the checks in the setters and both modifications will be applied. The result is that the range now holds (5, 4)—an invalid state. So while the underlying AtomicIntegers are thread-safe, the composite class is not. Because the underlying state variables lower and upper are not independent, NumberRange cannot simply delegate thread safety to its thread-safe state variables.
NumberRange could be made thread-safe by using locking to maintain its invariants, such as guarding lower and upper with a common lock. It must also avoid publishing lower and upper to prevent clients from subverting its invariants.
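A minimal sketch of that fix (one possible implementation, not necessarily the book's exact listing) guards both fields with the object's intrinsic lock, making each check-then-act atomic:
public class SafeNumberRange {
    // INVARIANT: lower <= upper, guarded by "this"
    private int lower = 0;
    private int upper = 0;

    public synchronized void setLower(int i) {
        // The check and the write now happen atomically under one lock.
        if (i > upper) {
            throw new IllegalArgumentException("Can't set lower to " + i + " > upper");
        }
        lower = i;
    }

    public synchronized void setUpper(int i) {
        if (i < lower) {
            throw new IllegalArgumentException("Can't set upper to " + i + " < lower");
        }
        upper = i;
    }

    public synchronized boolean isInRange(int i) {
        return i >= lower && i <= upper;
    }
}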
If a class has compound actions, as NumberRange does, delegation alone is again not a suitable approach for thread safety. In these cases, the class must provide its own locking to ensure that compound actions are atomic, unless the entire compound action can also be delegated to the underlying state variables.
If a class is composed of multiple independent thread-safe state variables and has no operations that have any invalid state transitions, then it can delegate thread safety to the underlying state variables.

Related

Understand the usage of strand without locking

Reference:
websocket_client_async_ssl.cpp
strands
Question 1> Here is my understanding:
Given a few async operations bound with the same strand, the strand
will guarantee that all associated async operations will be executed
as a strictly sequential invocation.
Does this mean that all the above async operations will be executed by the same thread?
Or does it just say that at any time, only one async operation will be executed by any available thread?
Question 2> The boost::asio::make_strand function creates a strand object for an executor or execution context.
session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(net::make_strand(ioc), ctx)
Here, resolver_ and ws_ each have their own strand, but I have trouble understanding how each strand applies to which async operations.
For example, in the following async calls and handlers, which functions (i.e. async initiation or handler) are bound to the same strand and will not run simultaneously?
run
=> resolver_.async_resolve --> session::on_resolve
=> beast::get_lowest_layer(ws_).async_connect --> session::on_connect
=> ws_.next_layer().async_handshake --> session::on_ssl_handshake
=> ws_.async_handshake --> session::on_handshake
(async operation --> handler)
Question 3> How can we retrieve the strand from executor?
Is there any difference between these two?
get_associated_executor
get_executor
io_context::get_executor: Obtains the executor associated with the
io_context.
get_associated_executor: Helper function to obtain an object's
associated executor.
Question 4> Is it correct that I use the following method to bind the deadline_timer to the io_context to prevent a race condition?
All other parts of the code are the same as in the example websocket_client_async_ssl.cpp.
session(net::io_context& ioc, ssl::context& ctx)
    : resolver_(net::make_strand(ioc))
    , ws_(net::make_strand(ioc), ctx)
    , d_timer_(ws_.get_executor())
{ }

void on_heartbeat_write(beast::error_code ec, std::size_t bytes_transferred)
{
    d_timer_.expires_from_now(boost::posix_time::seconds(5));
    d_timer_.async_wait(beast::bind_front_handler(&session::on_heartbeat, shared_from_this()));
}

void on_heartbeat(const boost::system::error_code& ec)
{
    ws_.async_write(net::buffer(text_ping_), beast::bind_front_handler(&session::on_heartbeat_write, shared_from_this()));
}

void on_handshake(beast::error_code ec)
{
    d_timer_.expires_from_now(boost::posix_time::seconds(5));
    d_timer_.async_wait(beast::bind_front_handler(&session::on_heartbeat, shared_from_this()));
    ws_.async_write(net::buffer(text_), beast::bind_front_handler(&session::on_write, shared_from_this()));
}
Note:
I used d_timer_(ws_.get_executor()) to initialize the deadline_timer, hoping it would make sure they don't write or read the websocket at the same time.
Is this the right way to do it?
Question 1
Does this mean that all above async operations will be executed by a same thread? Or it just says that at any time, only one async operation will be executed by any available thread?
The latter.
Question 2
Here, resolver_ and ws_ each have their own strand,
Let me interject that I think that's unnecessarily confusing in the example. They could (should, conceptually) have used the same strand, but I guess they didn't want to go through the trouble of storing a strand. I'd probably have written:
explicit session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(resolver_.get_executor(), ctx) {}
The initiation functions are called where you decide. The completion handlers are dispatch-ed on the executor that belongs to the IO object that you call the operation on, unless the completion handler is bound to a different executor (e.g. using bind_executor, see get_associated_executor). In the vast majority of circumstances in modern Asio, you will not bind handlers, instead "binding IO objects" to the proper executors. This means less typing, and makes it much harder to forget.
So in effect, all the async-initiations in the chain except for the one in run() are all on a strand, because the IO objects are tied to strand executors.
You have to keep in mind to dispatch on a strand when some outside user calls into your classes (e.g. often to stop). It is a good idea therefore to develop a convention. I'd personally make all the "unsafe" methods and members private:, so I will often have pairs like:
public:
    void stop() {
        dispatch(strand_, [self = shared_from_this()] { self->do_stop(); });
    }

private:
    void do_stop() {
        beast::get_lowest_layer(ws_).cancel();
    }
Side Note:
In this particular example, there is only one (main) thread running/polling the io service, so the whole point is moot. But as I explained recently (Does multithreaded http processing with boost asio require strands?), the examples are there to show some common patterns that allow one to do "real life" work as well
Bonus: Handler Tracking
Let's use BOOST_ASIO_ENABLE_HANDLER_TRACKING to get some insight.¹ Running a sample session shows something like
If you squint a little, you can see that all the strand executors are the same:
0*1|resolver#0x559785a03b68.async_resolve
1*2|strand_executor#0x559785a02c50.execute
2*3|socket#0x559785a05770.async_connect
3*4|strand_executor#0x559785a02c50.execute
4*5|socket#0x559785a05770.async_send
5*6|strand_executor#0x559785a02c50.execute
6*7|socket#0x559785a05770.async_receive
7*8|strand_executor#0x559785a02c50.execute
8*9|socket#0x559785a05770.async_send
9*10|strand_executor#0x559785a02c50.execute
10*11|socket#0x559785a05770.async_receive
11*12|strand_executor#0x559785a02c50.execute
12*13|deadline_timer#0x559785a05958.async_wait
12*14|socket#0x559785a05770.async_send
14*15|strand_executor#0x559785a02c50.execute
15*16|socket#0x559785a05770.async_receive
16*17|strand_executor#0x559785a02c50.execute
17*18|socket#0x559785a05770.async_send
13*19|strand_executor#0x559785a02c50.execute
18*20|strand_executor#0x559785a02c50.execute
20*21|socket#0x559785a05770.async_receive
21*22|strand_executor#0x559785a02c50.execute
22*23|deadline_timer#0x559785a05958.async_wait
22*24|socket#0x559785a05770.async_send
24*25|strand_executor#0x559785a02c50.execute
25*26|socket#0x559785a05770.async_receive
26*27|strand_executor#0x559785a02c50.execute
23*28|strand_executor#0x559785a02c50.execute
Question 3
How can we retrieve the strand from executor?
You don't[*]. However make_strand(s) returns an equivalent strand if s is already a strand.
[*] By default, Asio's IO objects use the type-erased executor (asio::executor or asio::any_io_executor depending on version). So technically you could ask it about its target_type() and, after comparing the type id to some expected types use something like target<net::strand<net::io_context::executor_type>>() to access the original, but there's really no use. You don't want to be inspecting the implementation details. Just honour the handlers (by dispatching them on their associated executors like Asio does).
Is there any difference between these two? get_associated_executor get_executor
get_executor gets an owned executor from an IO object. It is a member function.
asio::get_associated_executor gets associated executors from handler objects. You will observe that get_associated_executor(ws_) doesn't compile (although some IO objects may satisfy the criteria to allow it to work).
Question 4
Is it correct that I use the following method to bind deadline_timer to io_context
You will notice that you did the same as I already mentioned above to tie the timer IO object to the same strand executor. So, kudos.
to prevent race condition?
You don't prevent race conditions here. You prevent data races. That is because in on_heartbeat you access the ws_ object which is an instance of a class that is NOT threadsafe. In effect, you're sharing access to non-threadsafe resources, and you need to serialize access, hence you want to be on the strand that all other accesses are also on.
Note: [...] and hoped that it will make sure they don't write or read the websocket at the same time. Is this the right way to do it?
Yes this is a good start, but it is not enough.
Firstly, you can write or read at the same time, as long as
write operations don't overlap
read operations don't overlap
accesses to the IO object are safely serialized.
In particular, your on_heartbeat might be safely serialized so you don't have a data race on calling the async_write initiation function. However, you need more checks to know whether a write operation is already (still) in progress. One way to achieve that is to have a queue with outgoing messages. If you have strict heartbeat requirements and high load, you might need a priority-queue here.
¹ I simplified the example by replacing the stream type with the Asio native ssl::stream<tcp::socket>. This means we don't get all the internal timers that deal with tcp_stream expirations. See https://pastebin.ubuntu.com/p/sPRYh6Xbwz/

How can I write a safe GUI actor for ScalaFX?

Basically I want an actor to change a ScalaFX GUI safely.
I've read many posts describing this, but they were sometimes contradictory and some years old, so some of them might be outdated.
I have working example code, and I basically want to know if this kind of programming is thread-safe.
The other question is whether I can configure sbt or the compiler or something in a way that all threads (from the GUI, the actors and the futures) are started by the same dispatcher.
I've found some example code, "scalafx-akka-demo", on GitHub, which is 4 years old. What I did in the following example is basically the same, just a little simplified to keep things easy.
Then there is the scalatrix example of approximately the same age. This example really worries me.
In there is a self-written dispatcher from Viktor Klang from 2012, and I have no idea how to make this work or if I really need it. The question is: is this dispatcher only an optimisation, or do I have to use something like it to be thread-safe?
But even if I don't absolutely need the dispatcher like in scalatrix, it is not optimal to have one dispatcher for the actor threads and one for the ScalaFX event threads. (And maybe one for the futures' threads as well?)
In my actual project, I have some measurement values coming from a device over TCP/IP, going to a TCP/IP actor and then displayed in a ScalaFX GUI. But this is much too long.
So here is my example code:
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scalafx.Includes._
import scalafx.application.{JFXApp, Platform}
import scalafx.application.JFXApp.PrimaryStage
import scalafx.event.ActionEvent
import scalafx.scene.Scene
import scalafx.scene.control.Button
import scalafx.stage.WindowEvent
import scala.concurrent.ExecutionContext.Implicits.global

object Main extends JFXApp {

  case object Count
  case object StopCounter
  case object CounterReset

  val aktorSystem: ActorSystem = ActorSystem("My-Aktor-system") // Create actor context
  val guiActor: ActorRef = aktorSystem.actorOf(Props(new GUIActor), "guiActor") // Create GUI actor

  val button: Button = new Button(text = "0") {
    onAction = (_: ActionEvent) => guiActor ! Count
  }

  val someComputation = Future {
    Thread.sleep(10000)
    println("Doing counter reset")
    guiActor ! CounterReset
    Platform.runLater(button.text = "0")
  }

  class GUIActor extends Actor {
    def receive: Receive = counter(1)

    def counter(n: Int): Receive = {
      case Count =>
        Platform.runLater(button.text = n.toString)
        println("The count is: " + n)
        context.become(counter(n + 1))
      case CounterReset => context.become(counter(1))
      case StopCounter  => context.system.terminate()
    }
  }

  stage = new PrimaryStage {
    scene = new Scene {
      root = button
    }
    onCloseRequest = (_: WindowEvent) => {
      guiActor ! StopCounter
      Await.ready(aktorSystem.whenTerminated, 5.seconds)
      Platform.exit()
    }
  }
}
So this code brings up a button, and every time it is clicked the number on the button increases. After some time, the number on the button is reset once.
In this example code I tried to get the ScalaFX GUI, the actor and the Future to influence each other. So the button click sends a message to the actor, and then the actor changes the GUI - which is what I am testing here.
The Future also sends to the actor and changes the GUI.
So far, this example works and I haven't found anything wrong with it.
But unfortunately, when it comes to thread safety this doesn't mean much.
My concrete questions are:
Is the method used to change the GUI in the example code thread-safe?
Is there maybe a better way to do it?
Can the different threads be started from the same dispatcher?
(if yes, then how?)
To address your questions:
1) Is the method used to change the GUI in the example code thread-safe?
Yes.
JavaFX, which ScalaFX sits upon, implements thread safety by insisting that all GUI interactions take place upon the JavaFX Application Thread (JAT), which is created during JavaFX initialization (ScalaFX takes care of this for you). Any code running on a different thread that interacts with JavaFX/ScalaFX will result in an error.
You are ensuring that your GUI code executes on the JAT by passing interacting code via the Platform.runLater method, which evaluates its arguments on the JAT. Because arguments are passed by name, they are not evaluated on the calling thread.
So, as far as JavaFX is concerned, your code is thread safe.
However, potential issues can still arise if the code you pass to Platform.runLater contains any references to mutable state maintained on other threads.
You have two calls to Platform.runLater. In the first of these (button.text = "0"), the only mutable state (button.text) belongs to JavaFX, which will be examined and modified on the JAT, so you're good.
In the second call (button.text = n.toString), you're passing the same JavaFX mutable state (button.text). But you're also passing a reference to n, which belongs to the GUIActor thread. However, this value is immutable, and so there are no threading issues from looking at its value. (The count is maintained by the Akka GUIActor class's context, and the only interactions that change the count come through Akka's message handling mechanism, which is guaranteed to be thread safe.)
That said, there is one potential issue here: the Future both resets the count (which will occur on the GUIActor thread) as well as setting the text to "0" (which will occur on the JAT). Consequently, it's possible that these two actions will occur in an unexpected order, such as button's text being changed to "0" before the count is actually reset. If this occurs simultaneously with the user clicking the button, you'll get a race condition and it's conceivable that the displayed value may end up not matching the current count.
2) Is there maybe a better way to do it?
There's always a better way! ;-)
To be honest, given this small example, there's not a lot of further improvement to be made.
I would try to keep all of the interaction with the GUI inside either GUIActor, or the Main object to simplify the threading and synchronization issues.
For example, going back to the last point in the previous answer, rather than have the Future update button.text, it would be better if that was done as part of the CounterReset message handler in GUIActor, which then guarantees that the counter and button appearance are synchronized correctly (or, at least, that they're always updated in the same order), with the displayed value guaranteed to match the count.
If your GUIActor class is handling a lot of interaction with the GUI, then you could have it execute all of its code on the JAT (I think this was the purpose of Viktor Klang's example), which would simplify a lot of its code. For example, you would not have to call Platform.runLater to interact with the GUI. The downside is that you then cannot perform processing in parallel with the GUI, which might slow down its performance and responsiveness as a result.
3) Can the different threads be started from the same dispatcher? (if yes, then how?)
You can specify custom execution contexts for both futures and Akka actors to get better control of their threads and dispatching. However, given Donald Knuth's observation that "premature optimization is the root of all evil", there's no evidence that this would provide you with any benefits whatsoever, and your code would become significantly more complicated as a result.
As far as I'm aware, you can't change the execution context for JavaFX/ScalaFX, since JAT creation must be finely controlled in order to guarantee thread safety. But I could be wrong.
In any case, the overhead of having different dispatchers is not going to be high. One of the reasons for using futures and actors is that they will take care of these issues for you by default. Unless you have a good reason to do otherwise, I would use the defaults.

Closing over java.util.concurrent.ConcurrentHashMap inside a Future of Actor's receive method?

I've an actor where I want to store my mutable state inside a map.
Clients can send Get(key:String) and Put(key:String,value:String) messages to this actor.
I'm considering the following options.
Don't use futures inside the actor's receive method. This may have a negative impact on both latency and throughput in case I have a large number of gets/puts, because all operations will be performed in order.
Use java.util.concurrent.ConcurrentHashMap and then invoke the gets and puts inside a Future.
Given that java.util.concurrent.ConcurrentHashMap is thread-safe and provides a finer level of granularity, I was wondering if it is still a problem to close over the concurrentHashMap inside a Future created for each put and get.
I'm aware of the fact that it's generally a really bad idea to close over mutable state inside a Future inside an actor, but I'm still interested to know whether in this particular case it is correct or not.
In general, java.util.concurrent.ConcurrentHashMap is made for concurrent use. As long as you don't try to transport the closure to another machine, and you think through the implications of it being used concurrently (e.g. if you read a value, use a function to modify it, and then put it back, do you want to use the replace(key, oldValue, newValue) method to make sure it hasn't changed while you were doing the processing?), it should be fine in Futures.
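To illustrate that replace idiom, here is a minimal Java sketch (the key and the transformation are made up) of a read-modify-write retry loop on a ConcurrentHashMap:
import java.util.concurrent.ConcurrentHashMap;

class ReplaceLoopDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("key", "a");

        // Retry until no other thread changed the value between our read and our write.
        String oldValue;
        String newValue;
        do {
            oldValue = map.get("key");
            newValue = oldValue + "-processed"; // some derived update
        } while (!map.replace("key", oldValue, newValue));

        System.out.println(map.get("key")); // prints "a-processed"
    }
}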
Maybe a little late, but still: in the book Reactive Web Applications, the author indicates an indirection for this specific problem, using pipeTo as below.
def receive = {
  case ComputeReach(tweetId) =>
    fetchRetweets(tweetId, sender()) pipeTo self
  case fetchedRetweets: FetchedRetweets =>
    followerCountsByRetweet += fetchedRetweets -> List.empty
    fetchedRetweets.retweets.foreach { rt =>
      userFollowersCounter ! FetchFollowerCount(
        fetchedRetweets.tweetId, rt.user
      )
    }
  ...
}
where followerCountsByRetweet is mutable state of the actor. The result of fetchRetweets(), which is a Future, is piped to the same actor as a FetchedRetweets message, which then acts on the message to modify the state of the actor. This mitigates any concurrent operation on the state.

Can multiple threads synchronize a load with a single store using memory_order_acquire using std::atomic

I was wondering: if you have two threads doing a load using memory_order_acquire and one thread doing a store using memory_order_release, will the store only be synched with one of the two threads doing the load? Meaning, is it only able to synch up with one of the threads for the store/load, or can you have multiple threads doing a synchronized load with a single thread doing a store? After researching, it seems to be a one-to-one relationship, but read-modify-write operations seem to chain; see below.
It seems to work fine if the threads are doing a read-modify-write operation like fetch_sub: then these will have a chained release, even though there is no synchronize-with relationship between reader1_thread and reader2_thread.
#include <atomic>

std::atomic<int> s;

void loader_thread()
{
    s.store(1, std::memory_order_release);
}

// seems to chain properly if these were fetch_sub instead of load;
// wondering if plain loads will be synchronized as well, meaning both threads
// will be synched up with the store in the loader thread, or just the first one
void reader1_thread()
{
    while (!s.load(std::memory_order_acquire));
}

void reader2_thread()
{
    while (!s.load(std::memory_order_acquire));
}
The standard says that "an atomic store-release synchronizes with a load-acquire that takes its value from the store" (C++11 §1.10 [intro.multithread] p8). It notably does not say that there can only be one such synchronizing load; so indeed any atomic load-acquire that takes its value from an atomic store-release synchronizes with that store.

What is the cost of creating actors in Akka?

Consider a scenario in which I am implementing a system that processes incoming tasks using Akka. I have a primary actor that receives tasks and dispatches them to some worker actors that process the tasks.
My first instinct is to implement this by having the dispatcher create an actor for each incoming task. After the worker actor processes the task, it is stopped.
This seems to be the cleanest solution for me, since it adheres to the principle of "one task, one actor". The other solution would be to reuse actors - but this involves the extra complexity of cleanup and some pool management.
I know that actors in Akka are cheap. But I am wondering if there is an inherent cost associated with the repeated creation and deletion of actors. Is there any hidden cost associated with the data structures Akka uses for the bookkeeping of actors?
The load should be of the order of tens or hundreds of tasks per second - think of it as a production webserver that creates one actor per request.
Of course, the right answer lies in the profiling and fine tuning of the system based on the type of the incoming load.
But I wondered if anyone could tell me something from their own experience ?
LATER EDIT:
I should give more details about the task at hand:
Only N active tasks can run at any point. As @drexin pointed out, this would be easily solvable using routers. However, the execution of tasks isn't a simple run-and-be-done type of thing.
Tasks may require information from other actors or services and thus may have to wait and go to sleep. By doing so they release an execution slot. The slot can be taken by another waiting actor which now has the opportunity to run. You could make an analogy with the way processes are scheduled on one CPU.
Each worker actor needs to keep some state regarding the execution of the task.
Note: I appreciate alternative solutions to my problem, and I will certainly take them into consideration. However, I would also like an answer to the main question regarding the intensive creation and deletion of actors in Akka.
You should not create an actor for every request, you should rather use a router to dispatch the messages to a dynamic amount of actors. That's what routers are for. Read this part of the docs for more information: http://doc.akka.io/docs/akka/2.0.4/scala/routing.html
edit:
Creating top-level actors (system.actorOf) is expensive, because every top-level actor will initialize an error kernel as well, and those are expensive. Creating child actors (inside an actor: context.actorOf) is way cheaper.
But still I suggest you rethink this, because depending on the frequency of the creation and deletion of actors you will also put additional pressure on the GC.
edit2:
And most importantly, actors are not threads! So even if you create 1M actors, they will only run on as many threads as the pool has. So, depending on the throughput setting in the config, every actor will process n messages before the thread gets released back to the pool.
Note that blocking a thread (includes sleeping) will NOT return it to the pool!
An actor which will receive one message right after its creation and die right after sending the result can be replaced by a future. Futures are more lightweight than actors.
You can use pipeTo to receive the future result when it's done. For instance, in your actor launching the computations:
def receive = {
  case t: Task   => future { executeTask(t) }.pipeTo(self)
  case r: Result => processTheResult(r)
}
where executeTask is your function taking a Task to return a Result.
However, I would reuse actors from a pool through a router, as explained in @drexin's answer.
I've tested with 10000 remote actors created from some main context by a root actor, the same scheme as in the prod module where a single actor was created. MBP 2.5GHz x2:
in main: main ? root // main asks root to create an actor
in main: actorOf(child) // create a child
in root: watch(child) // watch lifecycle messages
in root: root ? child // wait for response (connection check)
in child: child ! root // response (connection ok)
in root: root ! main // notify created
Code:
def start(userName: String) = {
  logger.error("HELLOOOOOOOO ")
  val n: Int = 10000
  var t0, t1: Long = 0
  t0 = System.nanoTime
  for (i <- 0 to n) {
    val msg = StartClient(userName + i)
    Await.result(rootActor ? msg, timeout.duration).asInstanceOf[ClientStarted] match {
      case succ @ ClientStarted(userName) =>
        // logger.info("[C][SUCC] Client started: " + succ)
      case _ =>
        logger.error("Terminated on waiting for response from " + i + "-th actor")
        throw new RuntimeException("[C][FAIL] Could not start client: " + msg)
    }
  }
  t1 = System.nanoTime
  logger.error("Starting of a single actor of " + n + ": " + ((t1 - t0) / 1000000.0 / n.toDouble) + " ms")
}
The result:
Starting of a single actor of 10000: 0.3642917 ms
There was a message stating that "Slf4jEventHandler started" between "HELOOOOOOOO" and "Starting of a single", so the experiment seems even more realistic (?)
The dispatcher was the default (a PinnedDispatcher, starting a new thread each and every time), and it seemed like all that stuff performs about the same as Thread.start() has since Java 1 - 500K-1M cycles or so ^)
That's why I changed all the code inside the loop to a new java.lang.Thread().start()
The result:
Starting of a single actor of 10000: 0.1355219 ms
Actors make great finite state machines, so let that help drive your design here. If your request-handling state is greatly simplified by having one actor per request, then do that. As a rule of thumb, I find that actors are particularly good at managing more than two states.
Alternatively, a single request-handling actor that references request state from within a collection that it maintains as part of its own state is a common approach. Note that this can also be achieved with an Akka reactive stream and the use of the scan stage.
