Coroutine: difference between join() and cancelAndJoin() - kotlin-coroutines

I am following the reference document on coroutines (https://kotlinlang.org/docs/reference/coroutines/cancellation-and-timeouts.html) and, like a good boy, testing the examples in Android Studio at the same time.
But I am very confused by the cancelAndJoin() method.
If I replace cancelAndJoin with join in the example code, there is no difference in the logs.
Here is the code:
import kotlinx.coroutines.*

fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (i < 5) { // computation loop, just wastes CPU
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")
}
In both cases (with join or cancelAndJoin) the logs are:
job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm sleeping 3 ...
job: I'm sleeping 4 ...
main: Now I can quit.
Can anybody explain the difference between the two methods?
It is not clear to me: cancel is "stopping the job", join is "waiting for completion", but the two together? We "stop" and "wait"?
Thanks in advance :)

It is not clear to me: cancel is "stopping the job", join is "waiting for completion", but the two together? We "stop" and "wait"?
Just because we call job.cancel() doesn't mean the job is cancelled immediately.
When we call job.cancel(), the job enters a transient Cancelling state. It has not yet reached the final Cancelled state, and only once it reaches the Cancelled state can it be considered completely cancelled.
Observe how the state transition occurs: a Job goes New → Active → Completing → Completed when it succeeds, and Active/Completing → Cancelling → Cancelled when it is cancelled.
When we call job.cancel(), the cancellation procedure (the transition from Active/Completing to Cancelling) starts for the job. But we still need to wait, because the job is truly cancelled only when it reaches the Cancelled state. Hence we write job.cancelAndJoin(): cancel() starts the cancellation procedure for the job, and join() waits until the job has reached the Cancelled state.
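To make the two-step nature concrete, here is a minimal sketch (not from the question; it assumes kotlinx.coroutines on the classpath and uses delay(), which checks for cancellation at each suspension point):
import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch {
        repeat(10) { i ->
            println("job: working $i ...")
            delay(200L) // suspension point: cancellation is noticed here
        }
    }
    delay(500L)
    job.cancel() // the job moves to the Cancelling state; it may not be finished yet
    job.join()   // suspend until the job reaches its final state
    // job.cancelAndJoin() would do the same two steps in one call
    println("main: job.isCancelled = ${job.isCancelled}")
}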
Make sure to check out this excellent article

OK, I've just realized that my question was stupid!
cancelAndJoin() is equivalent to calling cancel() and then join().
The code demonstrates that the job cannot be cancelled if we don't check whether it is still active, and to do so we must use isActive.
So like this:
import kotlinx.coroutines.*

fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (isActive) { // cancellable computation loop
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")
}

Related

Unwanted callback from `SFSpeechRecognizer` `recognitionTask` at ~30s mark

If I inspect the timestamps of the callbacks from the SFSpeechRecognizer recognitionTask (on macOS):
recognitionTask = speechRecognizer.recognitionTask( with: recognitionRequest )
{ result, error in
    // if more than two seconds elapsed since the last update, we send a notification
    NSLog( "speechRecognizer.recognitionTask callback" )
    :
... I observe:
:
2019-11-08 14:51:00.35 ... speechRecognizer.recognitionTask callback
2019-11-08 14:51:00.45 ... speechRecognizer.recognitionTask callback
2019-11-08 14:51:32.31 ... speechRecognizer.recognitionTask callback
i.e. There is an additional unwanted callback approximately 30 seconds after my last utterance.
result is nil for this last callback.
The fact that it's close to 30 seconds suggests to me that it represents a maximum timeout.
I'm not expecting a timeout, because I have manually shut down my session (at around the 5 s mark, by clicking a button):
@objc
func stopRecording()
{
    print( "stopRecording()" )
    // Instructs the task to stop accepting new audio (e.g. stop recording) but complete processing on audio already buffered.
    // This has no effect on URL-based recognition requests, which effectively buffer the entire file immediately.
    recognitionTask?.finish()
    // Indicate that the audio source is finished and no more audio will be appended
    recognitionRequest?.endAudio()
    //self.recognitionRequest = nil
    audioEngine.stop()
    audioEngine.inputNode.removeTap( onBus: 0 )
    //recognitionTask?.cancel()
    //self.recognitionTask = nil
    self.timer?.invalidate()
    print( "stopRecording() DONE" )
}
There's a lot of commented-out code because it seems to me there is some process I'm failing to shut down, but I can't figure out which.
Complete code is here.
Can anyone see what's going wrong?
NOTE: in this solution, there is still one unwanted callback immediately after the destruction sequence is initiated. Hence I'm not going to accept the answer yet. Maybe there is a more elegant solution to this problem, or someone may shed light on the observed behaviour.
The following code does the job:
is_listening = true
recognitionTask = speechRecognizer.recognitionTask( with: recognitionRequest )
{ result, error in
    if !self.is_listening {
        NSLog( "IGNORED: speechRecognizer.recognitionTask callback" )
        return
    }
    // if more than two seconds elapsed since the last update, we send a notification
    self.timer?.invalidate()
    NSLog( "speechRecognizer.recognitionTask callback" )
    self.timer = Timer.scheduledTimer(withTimeInterval: 2.0, repeats: false) { _ in
        if !self.is_listening {
            print( "IGNORED: Inactivity (timer) Callback" )
            return
        }
        print( "Inactivity (timer) Callback" )
        self.timer?.invalidate()
        NotificationCenter.default.post( name: Dictation.notification_paused, object: nil )
        //self.stopRecording()
    }
    :
}
@objc
func stopRecording()
{
    print( "stopRecording()" )
    is_listening = false
    audioEngine.stop()
    audioEngine.inputNode.removeTap(onBus: 0)
    audioEngine.inputNode.reset()
    recognitionRequest?.endAudio()
    recognitionRequest = nil
    timer?.invalidate()
    timer = nil
    recognitionTask?.cancel()
    recognitionTask = nil
}
The exact sequence is taken from https://github.com/2bbb/ofxSpeechRecognizer/blob/master/src/ofxSpeechRecognizer.mm -- I'm not sure to what extent it is order-sensitive.
For starters, that destruction sequence seems to eliminate the 30 s (timeout?) callback.
However, the closures still get hit after stopRecording().
To deal with this I've created an is_listening flag.
I have a hunch Apple should maybe have internalised this logic within the framework, but meh, it works; I am a happy bunny.

dispatch_get_main_queue() doesn't run new async jobs smoothly

let dispatchGroup = dispatch_group_create()
let now = DISPATCH_TIME_NOW
for i in 0..<1000 {
    dispatch_group_enter(dispatchGroup)
    // Do some async tasks
    let delay = dispatch_time(now, Int64(Double(i) * 0.1 * Double(NSEC_PER_SEC)))
    dispatch_after(delay, dispatch_get_main_queue(), {
        print(i)
        dispatch_group_leave(dispatchGroup)
    })
}
The print statement prints the first 15-20 numbers smoothly; however, as i grows larger, the output becomes sluggish. I had more complicated logic inside dispatch_after and noticed the processing was very sluggish, which is why I wrote this test.
Is there a buffer size or another property I can configure? It seems dispatch_get_main_queue() doesn't work well with a larger number of async tasks.
Thanks in advance!
The problem isn't dispatch_get_main_queue(). (You'll notice the same behavior if you use a different queue.) The problem rests with dispatch_after().
When you use dispatch_after, it creates a dispatch timer with a leeway of 10% of the start/when interval. See the Apple libdispatch source on GitHub. The net effect is that when these timers (start ± 10% leeway) overlap, the system may start coalescing them. When they're coalesced, they appear to fire in a "clumped" manner: a bunch of them fire immediately, one right after another, and then there is a little delay before it gets to the next bunch.
There are a couple of solutions, both of which entail retiring the series of dispatch_after calls:
You can build timers manually, forcing DispatchSource.TimerFlag.strict to disable coalescing:
let group = DispatchGroup()
let queue = DispatchQueue.main
let start = CACurrentMediaTime()
os_log("start")
for i in 0 ..< 1000 {
    group.enter()
    let timer = DispatchSource.makeTimerSource(flags: .strict, queue: queue) // use `.strict` to avoid coalescing
    timer.setEventHandler {
        timer.cancel() // reference timer so it has strong reference until the handler is called
        os_log("%d", i)
        group.leave()
    }
    timer.schedule(deadline: .now() + Double(i) * 0.1)
    timer.resume()
}
group.notify(queue: .main) {
    let elapsed = CACurrentMediaTime() - start
    os_log("all done %.1f", elapsed)
}
Personally, I dislike that reference to timer inside the closure, but you need to keep some strong reference to it until the timer fires, and GCD timers release the block (avoiding a strong reference cycle) when the timer is cancelled or finishes. This is an inelegant solution, IMHO.
It is more efficient to schedule a single repeating timer that fires at a regular interval:
var timer: DispatchSourceTimer? // note: this is a property, to make sure we keep a strong reference

func startTimer() {
    let queue = DispatchQueue.main
    let start = CACurrentMediaTime()
    var counter = 0
    // Do some async tasks
    timer = DispatchSource.makeTimerSource(flags: .strict, queue: queue)
    timer!.setEventHandler { [weak self] in
        guard counter < 1000 else {
            self?.timer?.cancel()
            self?.timer = nil
            let elapsed = CACurrentMediaTime() - start
            os_log("all done %.1f", elapsed)
            return
        }
        os_log("%d", counter)
        counter += 1
    }
    timer!.schedule(deadline: .now(), repeating: 0.05)
    timer!.resume()
}
This not only solves the coalescing problem, it is also more efficient.
For a Swift 2.3 rendition, see the previous version of this answer.

implementing a scheduler class in Windows

I want to implement a scheduler class which any object can use to schedule timeouts and cancel them if necessary. When a timeout expires, this information will be sent to the timeout setter/owner asynchronously.
So, for this purpose, I have two fundamental classes, WindowsTimeout and WindowsScheduler.
class WindowsTimeout
{
    bool mCancelled;
    int mTimerID;             // Windows handle to identify the actual timer set.
    ITimeoutReceiver* mSetter;

    int cancel()
    {
        mCancelled = true;
        if ( timeKillEvent(mTimerID) == SUCCESS) // Line under question # 1
        {
            delete this;      // Timeout instance is self-destroyed.
            return 0;         // ok. OS Timer resource given back.
        }
        return 1;             // fail. OS Timer resource not given back.
    }

    WindowsTimeout(ITimeoutReceiver* setter, int timerID)
    {
        mSetter = setter;
        mTimerID = timerID;
    }
};
class WindowsScheduler
{
    static void CALLBACK timerFunction(UINT uID, UINT uMsg, DWORD dwUser, DWORD dw1, DWORD dw2)
    {
        WindowsTimeout* timeout = (WindowsTimeout*) uMsg;
        if (timeout->mCancelled)
            delete timeout;
        else
            timeout->mSetter->GEN(evTimeout(timeout));
    }

    WindowsTimeout* schedule(ITimeoutReceiver* setter, TimeUnit t)
    {
        int timerID = timeSetEvent(...);
        if (timerID == SUCCESS)
        {
            return WindowsTimeout(setter, timerID);
        }
        return 0;
    }
};
My questions are:
Q.1. When a WindowsScheduler::timerFunction() call is made, in which context is it performed? It is simply a callback function, and I think it is performed in the OS context, right? If so, does this call pre-empt any other tasks already running? I mean, do callbacks have a higher priority than any user task?
Q.2. When a timeout setter wants to cancel its timeout, it calls WindowsTimeout::cancel().
However, there is always a possibility that the static timerFunction is called back by the OS, pre-empting the cancel operation, for example just after the mCancelled = true statement. In such a case, the timeout instance will be deleted by the callback function.
When the pre-empted cancel() function resumes after the callback completes, it will try to access an attribute of the deleted instance (mTimerID), as you can see on the line marked "Line under question # 1" in the code.
How can I avoid such a case?
Please note that this question is an improved version of a previous one of my own:
Windows multimedia timer with callback argument
Q1 - I believe it gets called within a thread allocated by the timer API. I'm not sure, but I wouldn't be surprised if the thread runs at a very high priority. (In Windows, that doesn't necessarily mean it will completely pre-empt other threads; it just means it will get more cycles than other threads.)
Q2 - I started to sketch out a solution for this, but then realized it was a bit harder than I thought. Personally, I would maintain a hash table that maps timer IDs to your WindowsTimeout instances. The hash table could be a simple std::map instance guarded by a critical section. When the timer callback occurs, it enters the critical section, tries to obtain the WindowsTimeout instance pointer, flags the instance as having been executed, exits the critical section, and only then actually executes the callback. If the hash table doesn't contain the WindowsTimeout instance, it means the caller has already removed it. Be very careful here.
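Leaving the Win32 details aside, the shape of that idea (a table of pending timeouts, where the callback and cancel() race to remove the entry and only the winner acts) can be sketched roughly as follows. Kotlin and a ScheduledExecutorService are used here purely for illustration, and all names are hypothetical:
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: a ScheduledExecutorService stands in for timeSetEvent/timeKillEvent.
object Scheduler {
    private val executor = Executors.newSingleThreadScheduledExecutor()
    private val nextId = AtomicInteger()
    private val pending = ConcurrentHashMap<Int, () -> Unit>() // timer ID -> timeout handler

    // Register the handler *before* arming the timer, so the callback can never
    // fire for an ID that is not yet in the table.
    fun schedule(delayMs: Long, onTimeout: () -> Unit): Int {
        val id = nextId.incrementAndGet()
        pending[id] = onTimeout
        executor.schedule(Runnable {
            // remove() succeeds for exactly one of {callback, cancel}; only the
            // winner touches the handler, so there is no use-after-delete race.
            pending.remove(id)?.invoke()
        }, delayMs, TimeUnit.MILLISECONDS)
        return id
    }

    // Returns true if the timeout was cancelled before it fired.
    fun cancel(id: Int): Boolean = pending.remove(id) != null
}
Registering the entry before arming the timer also sidesteps the race discussed below (the callback firing before the bookkeeping object exists).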
One subtle bug in your own code above:
WindowsTimeout* schedule(ITimeoutReceiver* setter, TimeUnit t)
{
    int timerID = timeSetEvent(...);
    if (timerID == SUCCESS)
    {
        return WindowsTimeout(setter, timerID);
    }
    return 0;
}
In your schedule method, it's entirely possible that the callback scheduled by timeSetEvent will fire BEFORE you can create the instance of WindowsTimeout.

Producer-consumer with semaphores

Producer-consumer problem taken from Wikipedia:
semaphore mutex = 1
semaphore fillCount = 0
semaphore emptyCount = BUFFER_SIZE

procedure producer() {
    while (true) {
        item = produceItem()
        down(emptyCount)
        down(mutex)
        putItemIntoBuffer(item)
        up(mutex)
        up(fillCount)
    }
    up(fillCount) // the consumer may not finish before the producer.
}

procedure consumer() {
    while (true) {
        down(fillCount)
        down(mutex)
        item = removeItemFromBuffer()
        up(mutex)
        up(emptyCount)
        consumeItem(item)
    }
}
My question: why does the producer have the line up(fillCount) // the consumer may not finish before the producer after the while loop? When will the program ever get there, and why is it needed?
I think the code doesn't make sense this way. The loop never ends, so the line in question can never be reached.
The code didn't originally contain that line; it was added by an anonymous editor in March 2009. I have now removed that line.
In general, code on Wikipedia is often edited by many people over a long period of time, so it's quite easy to introduce bugs into it.
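For completeness, the (corrected) pseudocode maps directly onto counting semaphores. Below is a minimal runnable sketch; Kotlin and java.util.concurrent.Semaphore are used only for illustration, and the buffer size and item values are arbitrary:
import java.util.ArrayDeque
import java.util.concurrent.Semaphore
import kotlin.concurrent.thread

const val BUFFER_SIZE = 10

val mutex = Semaphore(1)                 // guards the buffer
val fillCount = Semaphore(0)             // items available to consume
val emptyCount = Semaphore(BUFFER_SIZE)  // free slots in the buffer
val buffer = ArrayDeque<Int>()

fun main() {
    thread { // producer
        var item = 0
        while (true) {
            item++                           // produceItem()
            emptyCount.acquire()             // down(emptyCount)
            mutex.acquire()                  // down(mutex)
            buffer.addLast(item)             // putItemIntoBuffer(item)
            mutex.release()                  // up(mutex)
            fillCount.release()              // up(fillCount)
        }
    }
    thread { // consumer
        while (true) {
            fillCount.acquire()              // down(fillCount)
            mutex.acquire()                  // down(mutex)
            val item = buffer.removeFirst()  // removeItemFromBuffer()
            mutex.release()                  // up(mutex)
            emptyCount.release()             // up(emptyCount)
            println("consumed $item")        // consumeItem(item)
        }
    }
}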

Scala stateful actor, recursive calling faster than using vars?

Sample code below. I'm a little curious why MyActor is faster than MyActor2. MyActor recursively calls process/react and keeps state in the function parameters whereas MyActor2 keeps state in vars. MyActor even has the extra overhead of tupling the state but still runs faster. I'm wondering if there is a good explanation for this or if maybe I'm doing something "wrong".
I realize the performance difference is not significant but the fact that it is there and consistent makes me curious what's going on here.
Ignoring the first two runs as warmup, I get:
MyActor:
559
511
544
529
vs.
MyActor2:
647
613
654
610
import scala.actors._

object Const {
  val NUM = 100000
  val NM1 = NUM - 1
}

trait Send[MessageType] {
  def send(msg: MessageType)
}

// Test 1 using recursive calls to maintain state
abstract class StatefulTypedActor[MessageType, StateType](val initialState: StateType) extends Actor with Send[MessageType] {
  def process(state: StateType, message: MessageType): StateType

  def act = proc(initialState)

  def send(message: MessageType) = {
    this ! message
  }

  private def proc(state: StateType) {
    react {
      case msg: MessageType => proc(process(state, msg))
    }
  }
}

object MyActor extends StatefulTypedActor[Int, (Int, Long)]((0, 0)) {
  override def process(state: (Int, Long), input: Int) = input match {
    case 0 =>
      (1, System.currentTimeMillis())
    case input: Int =>
      state match {
        case (Const.NM1, start) =>
          println((System.currentTimeMillis() - start))
          (Const.NUM, start)
        case (s, start) =>
          (s + 1, start)
      }
  }
}

// Test 2 using vars to maintain state
object MyActor2 extends Actor with Send[Int] {
  private var state = 0
  private var strt = 0: Long

  def send(message: Int) = {
    this ! message
  }

  def act =
    loop {
      react {
        case 0 =>
          state = 1
          strt = System.currentTimeMillis()
        case input: Int =>
          state match {
            case Const.NM1 =>
              println((System.currentTimeMillis() - strt))
              state += 1
            case s =>
              state += 1
          }
      }
    }
}

// main: Run testing
object TestActors {
  def main(args: Array[String]): Unit = {
    val a = MyActor
    // val a = MyActor2
    a.start()
    testIt(a)
  }

  def testIt(a: Send[Int]) {
    for (_ <- 0 to 5) {
      for (i <- 0 to Const.NUM) {
        a send i
      }
    }
  }
}
EDIT: Based on Vasil's response, I removed the loop and tried it again, and the var-based MyActor2 leapfrogged ahead and is now around 10% or so faster. So... the lesson is: if you are confident that you won't end up with a stack-overflowing backlog of messages, and you want to squeeze out every little bit of performance, don't use loop; just call the act() method recursively.
Change for MyActor2:
override def act() =
  react {
    case 0 =>
      state = 1
      strt = System.currentTimeMillis()
      act()
    case input: Int =>
      state match {
        case Const.NM1 =>
          println((System.currentTimeMillis() - strt))
          state += 1
        case s =>
          state += 1
      }
      act()
  }
Such results are caused by the specifics of your benchmark (a lot of small messages that fill the actor's mailbox quicker than it can handle them).
Generally, the workflow of react is as follows:
The actor scans the mailbox;
If it finds a message, it schedules it for execution;
When the scheduling completes, or when there are no messages in the mailbox, the actor suspends (Actor.suspendException is thrown);
In the first case, when the handler finishes processing the message, execution proceeds straight to the react method, and, as long as there are lots of messages in the mailbox, the actor immediately schedules the next message for execution and only then suspends.
In the second case, loop schedules the execution of react in order to prevent a stack overflow (which might be your case with Actor #1, because tail recursion in process is not optimized), and thus execution doesn't proceed to react immediately, as in the first case. That's where the millis are lost.
UPDATE (taken from here):
Using loop instead of recursive react effectively doubles the number of tasks that the thread pool has to execute in order to accomplish the same amount of work, which in turn makes it so any overhead in the scheduler is far more pronounced when using loop.
Just a wild stab in the dark: it might be due to the exception thrown by react in order to exit the loop. Exception creation is quite heavy. I don't know how often it does that, but it should be possible to check with a catch and a counter.
The overhead in your test depends heavily on the number of threads that are present (try using only one thread with scala -Dactors.corePoolSize=1!). I'm finding it difficult to figure out exactly where the difference arises; the only real difference is that in one case you use loop and in the other you do not. loop does do a fair bit of work, since it repeatedly creates function objects using "andThen" rather than iterating. I'm not sure whether this is enough to explain the difference, especially in light of the heavy usage by scala.actors.Scheduler$.impl and ExceptionBlob.
