How to put actor to sleep? - actor

I have one actor which is executing a forever loop that is waiting for the availability of data to operate on.
The doc says the Actor runs on a very lightweight thread, so I'm not sure whether I can use the Thread.sleep() method on that actor. My objective is to not have that actor consume too much processing power.
So can I use the Thread.sleep() method inside the actor?

Don't sleep() inside Actors! That would cause the Thread to be blocked, causing exactly what you're trying to avoid - using up resources.
Instead, if you just handle the message and "do nothing", the Actor will not use up any scheduling resources and will be just another plain object on the heap (occupying a bit of memory but nothing else).

I just schedule a "WakeUp" message to be sent at a future time. Akka delivers that message at the scheduled time, so the actor can handle it and continue processing. This avoids using sleep.
// schedule to wake up
getContext().getSystem().scheduler().scheduleOnce(
FiniteDuration.create(sleepTime.toMillis(), TimeUnit.MILLISECONDS),
new Runnable() {
@Override
public void run() {
getContext().getSelf().tell(new WakeUpMessage());
}
},
getContext().getSystem().executionContext());
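To illustrate why a scheduled wake-up is cheaper than sleeping, here is a minimal sketch that uses only the JDK, not Akka. It assumes a single-threaded ScheduledExecutorService is an acceptable stand-in for an actor's mailbox; the class name WakeUpSketch and the message names are hypothetical. The point it shows is that the "actor" thread stays free to handle other messages while the wake-up is pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WakeUpSketch {
    // Runs the demo and returns the order in which the "actor" handled its messages.
    public static List<String> run() throws InterruptedException {
        // One thread plays the role of the mailbox: messages are handled one at a time.
        ScheduledExecutorService mailbox = Executors.newSingleThreadScheduledExecutor();
        List<String> handled = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(2);
        try {
            // Instead of sleeping, ask for a wake-up message 200 ms from now.
            mailbox.schedule(() -> {
                handled.add("WakeUp");
                done.countDown();
            }, 200, TimeUnit.MILLISECONDS);
            // The mailbox thread is not blocked meanwhile, so this message is handled first.
            mailbox.execute(() -> {
                handled.add("Data");
                done.countDown();
            });
            // Wait (with a safety timeout) until both messages have been handled.
            done.await(5, TimeUnit.SECONDS);
            return new ArrayList<>(handled);
        } finally {
            mailbox.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Had the first task called Thread.sleep(200) instead, the "Data" message would have waited behind it for the whole delay.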

Understand the usage of strand without locking

Reference:
websocket_client_async_ssl.cpp
strands
Question 1> Here is my understanding:
Given a few async operations bound with the same strand, the strand
will guarantee that all associated async operations will be executed
as a strictly sequential invocation.
Does this mean that all the above async operations will be executed by the same thread?
Or does it just say that at any time, only one async operation will be executed by any available thread?
Question 2> The boost::asio::make_strand function creates a strand object for an executor or execution context.
session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(net::make_strand(ioc), ctx)
Here, resolver_ and ws_ each have their own strand, but I have trouble understanding which async operations each strand applies to.
For example, in the following async operations and handlers, which functions (i.e. async operation or handler) are bound to the same strand and will not run simultaneously?
run
=>resolver_.async_resolve -->session::on_resolve
=>beast::get_lowest_layer(ws_).async_connect -->session::on_connect
=>ws_.next_layer().async_handshake --> session::on_ssl_handshake
=>ws_.async_handshake --> session::on_handshake
async ================================= handler
Question 3> How can we retrieve the strand from executor?
Is there any difference between these two?
get_associated_executor
get_executor
io_context::get_executor: Obtains the executor associated with the
io_context.
get_associated_executor: Helper function to obtain an object's
associated executor.
Question 4> Is it correct that I use the following method to bind deadline_timer to io_context to prevent race condition?
All other parts of the code are the same as in the websocket_client_async_ssl.cpp example.
session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(net::make_strand(ioc), ctx),
d_timer_(ws_.get_executor())
{ }
void on_heartbeat_write( beast::error_code ec, std::size_t bytes_transferred)
{
d_timer_.expires_from_now(boost::posix_time::seconds(5));
d_timer_.async_wait(beast::bind_front_handler( &session::on_heartbeat, shared_from_this()));
}
void on_heartbeat(const boost::system::error_code& ec)
{
ws_.async_write( net::buffer(text_ping_), beast::bind_front_handler( &session::on_heartbeat_write, shared_from_this()));
}
void on_handshake(beast::error_code ec)
{
d_timer_.expires_from_now(boost::posix_time::seconds(5));
d_timer_.async_wait(beast::bind_front_handler( &session::on_heartbeat, shared_from_this()));
ws_.async_write(net::buffer(text_), beast::bind_front_handler(&session::on_write, shared_from_this()));
}
Note:
I used d_timer_(ws_.get_executor()) to init deadline_timer and hoped that it will make sure they don't write or read the websocket at the same time.
Is this the right way to do it?
Question 1
Does this mean that all the above async operations will be executed by the same thread? Or does it just say that at any time, only one async operation will be executed by any available thread?
The latter.
Question 2
Here, resolver_ and ws_ each have their own strand,
Let me interject that I think that's unnecessarily confusing in the example. They could (should, conceptually) have used the same strand, but I guess they didn't want to go through the trouble of storing a strand. I'd probably have written:
explicit session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(resolver_.get_executor(), ctx) {}
The initiation functions are called where you decide. The completion handlers are dispatch-ed on the executor that belongs to the IO object that you call the operation on, unless the completion handler is bound to a different executor (e.g. using bind_executor, see get_associated_executor). In the vast majority of cases in modern Asio, you will not bind handlers; instead you "bind IO objects" to the proper executors. This means less typing and makes it much harder to forget.
So in effect, all the async-initiations in the chain except for the one in run() are all on a strand, because the IO objects are tied to strand executors.
You have to keep in mind to dispatch on a strand when some outside user calls into your classes (e.g. often to stop). It is a good idea therefore to develop a convention. I'd personally make all the "unsafe" methods and members private:, so I will often have pairs like:
public:
void stop() {
dispatch(strand_, [self=shared_from_this()] { self->do_stop(); });
}
private:
void do_stop() {
beast::get_lowest_layer(ws_).cancel();
}
Side Note:
In this particular example, there is only one (main) thread running/polling the io service, so the whole point is moot. But as I explained recently (Does mulithreaded http processing with boost asio require strands?), the examples are here to show some common patterns that allow one to do "real life" work as well
Bonus: Handler Tracking
Let's use BOOST_ASIO_ENABLE_HANDLER_TRACKING to get some insight.¹ Running a sample session shows something like
If you squint a little, you can see that all the strand executors are the same:
0*1|resolver#0x559785a03b68.async_resolve
1*2|strand_executor#0x559785a02c50.execute
2*3|socket#0x559785a05770.async_connect
3*4|strand_executor#0x559785a02c50.execute
4*5|socket#0x559785a05770.async_send
5*6|strand_executor#0x559785a02c50.execute
6*7|socket#0x559785a05770.async_receive
7*8|strand_executor#0x559785a02c50.execute
8*9|socket#0x559785a05770.async_send
9*10|strand_executor#0x559785a02c50.execute
10*11|socket#0x559785a05770.async_receive
11*12|strand_executor#0x559785a02c50.execute
12*13|deadline_timer#0x559785a05958.async_wait
12*14|socket#0x559785a05770.async_send
14*15|strand_executor#0x559785a02c50.execute
15*16|socket#0x559785a05770.async_receive
16*17|strand_executor#0x559785a02c50.execute
17*18|socket#0x559785a05770.async_send
13*19|strand_executor#0x559785a02c50.execute
18*20|strand_executor#0x559785a02c50.execute
20*21|socket#0x559785a05770.async_receive
21*22|strand_executor#0x559785a02c50.execute
22*23|deadline_timer#0x559785a05958.async_wait
22*24|socket#0x559785a05770.async_send
24*25|strand_executor#0x559785a02c50.execute
25*26|socket#0x559785a05770.async_receive
26*27|strand_executor#0x559785a02c50.execute
23*28|strand_executor#0x559785a02c50.execute
Question 3
How can we retrieve the strand from executor?
You don't[*]. However make_strand(s) returns an equivalent strand if s is already a strand.
[*] By default, Asio's IO objects use the type-erased executor (asio::executor or asio::any_io_executor depending on version). So technically you could ask it about its target_type() and, after comparing the type id to some expected types use something like target<net::strand<net::io_context::executor_type>>() to access the original, but there's really no use. You don't want to be inspecting the implementation details. Just honour the handlers (by dispatching them on their associated executors like Asio does).
Is there any difference between these two? get_associated_executor get_executor
get_executor gets an owned executor from an IO object. It is a member function.
asio::get_associated_executor gets associated executors from handler objects. You will observe that get_associated_executor(ws_) doesn't compile (although some IO objects may satisfy the criteria to allow it to work).
Question 4
Is it correct that I use the following method to bind deadline_timer to io_context
You will notice that you did the same as I already mentioned above to tie the timer IO object to the same strand executor. So, kudos.
to prevent race condition?
You don't prevent race conditions here. You prevent data races. That is because in on_heartbeat you access the ws_ object which is an instance of a class that is NOT threadsafe. In effect, you're sharing access to non-threadsafe resources, and you need to serialize access, hence you want to be on the strand that all other accesses are also on.
Note: [...] and hoped that it will make sure they don't write or read the websocket at the same time. Is this the right way to do it?
Yes this is a good start, but it is not enough.
Firstly, you can write or read at the same time, as long as
write operations don't overlap
read operations don't overlap
accesses to the IO object are safely serialized.
In particular, your on_heartbeat might be safely serialized so you don't have a data race on calling the async_write initiation function. However, you need more checks to know whether a write operation is already (still) in progress. One way to achieve that is to have a queue with outgoing messages. If you have strict heartbeat requirements and high load, you might need a priority-queue here.
¹ I simplified the example by replacing the stream type with the Asio native ssl::stream<tcp::socket>. This means we don't get all the internal timers that deal with tcp_stream expirations. See https://pastebin.ubuntu.com/p/sPRYh6Xbwz/

How can I have multiple contexts handle events in Apama

I am trying to define a monitor in which I receive events and then handle them on multiple contexts (roughly equating to threads, if I understand correctly). I know I can write
spawn myAction() to myNewContext;
and this will run that action in the new context.
However I want to have an action which will respond to an event when it comes into my monitor:
on all trigger() as t {
doMyThing();
}
on all otherTrigger() as ot {
doMyOtherThing();
}
Can I define my on all in a way that uses a specific context? Something like
on all trigger() as t in myContext {
doMyThing()
}
on all otherTrigger() as t in myOtherContext {
doMyOtherThing()
}
If not what is the best way to define this in Apama EPL? Also could I have multiple contexts handling the same events when they arrive, round robin style?
Apama events from external receivers (i.e. the outside world) are delivered only to public contexts, including the 'main' context. So depending on your architecture, you can either spawn your action to a public context:
// set the receivesInput parameter to true to make this context public
spawn myAction() to context("myContext", true);
...
action myAction() {
on all trigger() as t {
doMyThing();
}
}
or spawn your action to a private context and set up an event forwarder in a public context, usually the main context (which will always exist):
spawn myAction() to context("myNewContext");
on all trigger() as t {
send t to "myChannel"; // forward all trigger events to the "myChannel" channel
}
...
action myAction() {
monitor.subscribe("myChannel"); // receive all events delivered to the "myChannel" channel
on all trigger() as t {
doMyThing();
}
}
Spawning to a private context and leveraging the channels system is generally the better design, as it only sends events to contexts that care about them.
To extend a bit on Madden's answer (I don't have enough rep to comment yet), the private context and forwarders is also the only way to achieve true round-robin: otherwise all contexts will receive all events. The easiest approach is to use a partitioning strategy (e.g. IDs ending in 0 go to context-0, or you have one context per machine you're monitoring, etc.), because then each concern is tracked in the same context and you don't have to share state.
Also could I have multiple contexts handling the same events when they arrive, round robin style?
This isn't entirely clear to me. What benefit are you aiming for here? If you're looking to reduce latency by having the "next available" context pick up the event, this probably isn't the right way to achieve it: deciding which context processes the event means you'd need inter-context communication and coordination, which will increase latency. If you want multiple contexts to process the same events (e.g. one context runs your temperature spike rule, and another runs your long-term temperature average rule, but both take temperature readings as inputs), then that's a good approach, but it's not what I'd have called round-robin.

Guaranteed way to cancel a hanging Task?

I often have to execute code on a separate thread that is long running, blocking, unstable and/or has the potential to hang forever. Since the introduction of the TPL, the internet is full of examples that nicely cancel a task with the cancellation token, but I never found an example that kills a task that hangs. Code that hangs forever is likely to be expected as soon as you communicate with hardware or call some third party code. A task that hangs cannot check the cancellation token and is doomed to stay alive forever. In critical applications I equip those tasks with alive signals that are sent at regular time intervals. As soon as a hanging task is detected, it is killed and a new instance is started.
The code below shows an example task that calls a long running placeholder method SomeThirdPartyLongOperation() which has the potential to hang forever. StopTask() first checks if the task is still running and tries to cancel it with the cancellation token. If that doesn't work, the task hangs and the underlying thread is interrupted/aborted old school style.
private Task _task;
private Thread _thread;
private CancellationTokenSource _cancellationTokenSource;
public void StartTask()
{
_cancellationTokenSource = new CancellationTokenSource();
_task = Task.Factory.StartNew(() => DoWork(_cancellationTokenSource.Token), _cancellationTokenSource.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
}
public void StopTask()
{
if (_task.Status == TaskStatus.RanToCompletion)
return;
_cancellationTokenSource.Cancel();
try
{
_task.Wait(2000); // Wait for task to end and prevent hanging by timeout.
}
catch (AggregateException aggEx)
{
List<Exception> exceptions = aggEx.InnerExceptions.Where(e => !(e is TaskCanceledException)).ToList(); // Ignore TaskCanceledException
foreach (Exception ex in exceptions)
{
// Process exception thrown by task
}
}
if (!_task.IsCompleted) // Task hangs and didn't respond to cancellation token => old school thread abort
{
_thread.Interrupt();
if (!_thread.Join(2000))
{
_thread.Abort();
}
}
_cancellationTokenSource.Dispose();
if (_task.IsCompleted)
{
_task.Dispose();
}
}
private void DoWork(CancellationToken cancellationToken)
{
if (string.IsNullOrEmpty(Thread.CurrentThread.Name)) // Set thread name for debugging
Thread.CurrentThread.Name = "DemoThread";
_thread = Thread.CurrentThread; // Save for interrupting/aborting if thread hangs
for (int i = 0; i < 10; i++)
{
cancellationToken.ThrowIfCancellationRequested();
SomeThirdPartyLongOperation(i);
}
}
Although I’ve been using this construct for some years now, I want to know if there are some potential mistakes in it. I’ve never seen an example of a task that saves the underlying thread or gives it a name to simplify debugging, so I’m a bit unsure if this is the right way to go. Comment on any detail is welcome!
Code that hangs forever is likely to be expected as soon as you communicate with hardware or call some third party code.
Communication: absolutely not. There's always a way to timeout with communication APIs, so even with misbehaving hardware, there's no need to force-kill an I/O operation.
Third-party code: only if you're paranoid (or have high demands such as 24x7 automation).
Here's the bottom line:
There's no way to force-kill a task.
You can force-kill a thread, but this can easily cause serious problems with application state, the possibility of introducing deadlocks in other parts of the code, and resource leaks.
You can force-kill an appdomain, which solves a large portion of app state / deadlock issues with killing threads. However, it doesn't solve them all, and there's still the problem of resource leaks.
You can force-kill a process. This is the only truly clean and reliable solution.
So, if you choose to trust the third-party code, I recommend that you just call it like any other API. If you require 100% reliability regardless of third-party libraries, you'll need to wrap the third-party dll into a separate process and use cross-process communication to call it.
Your current code force-kills a thread pool thread, which is certainly not recommended; those threads belong to the thread pool, not to you, and this is still true even if you specify LongRunning. If you go the kill-thread route (which is not fully reliable), then I recommend using an explicit thread.
The question is: why is this task hanging at all? I think there's no universal solution to this problem, but you should focus on making the task always responsive rather than on forcibly interrupting it.
In this code, it looks like you're looking for a simple thread rather than a task. You shouldn't link tasks to threads: it's very likely that the task will switch to another thread after some async operations, and you will end up killing an innocent thread that is no longer connected to your task. If you really need to kill the whole thread, then make a dedicated one just for this job.
You also shouldn't name or otherwise modify any thread that belongs to the tasks' default pool. Consider this code:
static void Main(string[] args)
{
Task.Run(sth);
Console.Read();
}
static async Task sth()
{
Thread.CurrentThread.Name = "My name";
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
Console.WriteLine(Thread.CurrentThread.Name ?? "No name");
}
the output is:
3
4
No name

Concurrent Collection, reporting custom progress data to UI when parallel tasking

I have a concurrent collection that contains 100K items. The processing of each item in the collection can take as little as 100ms or as long as 10 seconds. I want to speed things up by parallelizing the processing, and have 100 minions doing the work simultaneously. I also have to report some specific data to the UI as this processing occurs, not simply a percentage complete.
I want the parallelized sub-tasks to nibble away at the concurrent collection like a school of minnows attacking a piece of bread tossed into a pond. How do I expose the concurrent collection to the parallelized tasks? Can I have a normal loop and simply launch an async task inside the loop and pass it an IProgress? Do I even need the concurrent collection for this?
It has been recommended to me that I use Parallel.ForEach but I don't see how each sub-process established by the degrees of parallelism could report a custom object back to the UI with each item it processes, not only after it has finished processing its share of the 100K items.
The framework already provides the IProgress<T> interface for this purpose, and an implementation in Progress<T>. To report progress, call IProgress<T>.Report with a progress value. The value T can be any type, not just a number.
Each IProgress<T> implementation can work in its own way. Progress<T> raises an event and calls a callback you pass to it when you create it.
Additionally, Progress<T>.Report executes asynchronously. Under the covers, it uses SynchronizationContext.Post to execute its callback and all event handlers on the thread that created the Progress<T> instance.
Assuming you create a progress value class like this:
class ProgressValue
{
public long Step{get;set;}
public string Message {get;set;}
}
You could write something like this:
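IProgress<ProgressValue> myProgress=new Progress<ProgressValue>(p=>
{
// Runs on the thread that created the Progress instance (typically the UI thread)
myProgressBar.Value=(int)p.Step;
});
IList<int> myVeryLargeList=...;
// The third lambda parameter is the zero-based index of the current item (a long)
Parallel.ForEach(myVeryLargeList,(item,state,step)=>
{
//Do some heavy work
myProgress.Report(new ProgressValue
{
Step=step,
Message=String.Format("Processed step {0}",step)
});
});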
EDIT
Oops! Progress<T> implements IProgress<T> explicitly. You have to cast it to IProgress<T>, as @Tim noticed.
Fixed the code to explicitly declare myProgress as an IProgress<ProgressValue>.

Difference of JMS/MQ and a synchronized method

I understood that JMS is used to process synch messages, so what is the difference between using JMS and just writing something like this?
public synchronized void doSomething(Message message) {
//do something sync
}
Thank you.
I am actually not sure what you mean by "synch messages". The key concept behind JMS is asynchronous messaging: a sender/publisher simply calls send(Message) and does not need to wait for the receiver/consumer to finish processing the message. A synchronized method, by contrast, only serializes callers within a single JVM, and each caller still blocks until the method returns.
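As a rough illustration of that decoupling, here is a minimal sketch that uses an in-memory BlockingQueue as a stand-in for a JMS destination. It is not the JMS API: the class AsyncSendSketch and its send/receive methods are hypothetical names chosen to mirror the idea, and a real broker adds persistence, acknowledgements and cross-process delivery on top of this.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class AsyncSendSketch {
    // Stand-in for a message destination: holds messages until a consumer takes them.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Sender side: enqueue the message and return without waiting for it to be processed.
    public void send(String message) {
        queue.add(message);
    }

    // Consumer side: wait up to the timeout for the next message; null means none arrived.
    public String receive(long timeoutMillis) throws InterruptedException {
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncSendSketch destination = new AsyncSendSketch();
        // The sender finishes both sends before any consumer exists.
        destination.send("order-1");
        destination.send("order-2");
        // A consumer picks the messages up later, in the order they were sent.
        System.out.println(destination.receive(100));
        System.out.println(destination.receive(100));
    }
}
```

With a synchronized doSomething(message), the caller would instead run the processing itself and could not continue until it had finished.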
