dispatch_get_main_queue() doesn't run new async jobs smoothly - Swift 2

let dispatchGroup = dispatch_group_create()
let now = DISPATCH_TIME_NOW

for i in 0..<1000 {
    dispatch_group_enter(dispatchGroup)

    // Do some async tasks
    let delay = dispatch_time(now, Int64(Double(i) * 0.1 * Double(NSEC_PER_SEC)))
    dispatch_after(delay, dispatch_get_main_queue(), {
        print(i)
        dispatch_group_leave(dispatchGroup)
    })
}
The print statement prints the first 15-20 numbers smoothly; however, as i grows larger, the output becomes sluggish. I originally had more complicated logic inside dispatch_after and noticed the processing was very sluggish, which is why I wrote this test.
Is there a buffer size or other property I can configure? It seems dispatch_get_main_queue() doesn't cope well with a larger number of async tasks.
Thanks in advance!

The problem isn't dispatch_get_main_queue(). (You'll notice the same behavior if you use a different queue.) The problem lies in dispatch_after().
When you use dispatch_after, it creates a dispatch timer with a leeway of 10% of the start/when delay. See Apple's libdispatch source on GitHub. The net effect is that when these timers (start ± 10% leeway) overlap, the system may start coalescing them. When they're coalesced, they appear to fire in a "clumped" manner: a bunch of them fire immediately, one right after another, and then there is a little delay before it gets to the next bunch. For example, the timer for i == 500 is due roughly 50 s out with about 5 s of leeway, a window that overlaps with dozens of its neighbors.
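(For readers on Swift 3 or later: the question's loop translates roughly to the sketch below. DispatchQueue.main.asyncAfter is the modern spelling of dispatch_after and, as far as I know, applies the same default leeway, so it shows the same clumping.)
let group = DispatchGroup()

for i in 0 ..< 1000 {
    group.enter()
    DispatchQueue.main.asyncAfter(deadline: .now() + Double(i) * 0.1) {   // same default leeway as dispatch_after
        print(i)
        group.leave()
    }
}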
There are a couple of solutions, both of which entail retiring the series of dispatch_after calls:
You can build timers manually, using DispatchSource.TimerFlags.strict to disable coalescing:
let group = DispatchGroup()
let queue = DispatchQueue.main
let start = CACurrentMediaTime()

os_log("start")

for i in 0 ..< 1000 {
    group.enter()
    let timer = DispatchSource.makeTimerSource(flags: .strict, queue: queue) // use `.strict` to avoid coalescing
    timer.setEventHandler {
        timer.cancel() // reference timer so it has a strong reference until the handler is called
        os_log("%d", i)
        group.leave()
    }
    timer.schedule(deadline: .now() + Double(i) * 0.1)
    timer.resume()
}

group.notify(queue: .main) {
    let elapsed = CACurrentMediaTime() - start
    os_log("all done %.1f", elapsed)
}
Personally, I dislike that reference to timer inside the closure, but you need to keep some strong reference to it until the timer fires, and GCD timers release the block (avoiding a strong reference cycle) when the timer is canceled or finishes. It is an inelegant solution, IMHO.
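If you would rather not capture the timer inside its own handler, one alternative, just a sketch reusing the group and os_log setup from above, is to keep the timers alive in a collection instead:
var timers = [DispatchSourceTimer]()   // strong references so the timers survive until they fire

for i in 0 ..< 1000 {
    group.enter()
    let timer = DispatchSource.makeTimerSource(flags: .strict, queue: .main)
    timer.setEventHandler {
        os_log("%d", i)
        group.leave()
    }
    timer.schedule(deadline: .now() + Double(i) * 0.1)
    timer.resume()
    timers.append(timer)               // in a real app you'd also remove timers once they've fired
}
That just trades the in-handler reference for a container you have to clean up yourself, so it isn't much prettier.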
It is more efficient to just schedule a single repeating timer that fires every 0.1 seconds:
var timer: DispatchSourceTimer? // note: this is a property, to make sure we keep a strong reference

func startTimer() {
    let queue = DispatchQueue.main
    let start = CACurrentMediaTime()
    var counter = 0

    // Do some async tasks
    timer = DispatchSource.makeTimerSource(flags: .strict, queue: queue)
    timer!.setEventHandler { [weak self] in
        guard counter < 1000 else {
            self?.timer?.cancel()
            self?.timer = nil
            let elapsed = CACurrentMediaTime() - start
            os_log("all done %.1f", elapsed)
            return
        }
        os_log("%d", counter)
        counter += 1
    }
    timer!.schedule(deadline: .now(), repeating: 0.1) // fire every 0.1 seconds, matching the spacing above
    timer!.resume()
}
This not only solves the coalescing problem, it is also more efficient.
For a Swift 2.3 rendition, see the previous version of this answer.

Related

Memory leak with Rayon and Indicatif in Rust

So, I'm trying to do an exhaustive search of a hash. The hash itself is not important here. As I want to use all the processing power of my CPU, I'm using Rayon to get a thread pool and lots of tasks. The search algorithm is the following:
let (tx, rx) = mpsc::channel();
let original_hash = String::from(original_hash);

rayon::spawn(move || {
    let mut i = 0;
    let mut iter = SenhaIterator::from(initial_pwd);
    while i < max_iteracoes {
        let pwd = iter.next().unwrap();
        let clone_tx = tx.clone();
        rayon::spawn(move || {
            let hash = calcula_hash(&pwd);
            clone_tx.send((pwd, hash)).unwrap();
        });
        i += 1;
    }
});

let mut last_pwd = None;
let bar = ProgressBar::new(max_iteracoes as u64);

while let Ok((pwd, hash)) = rx.recv() {
    last_pwd = Some(pwd);
    if hash == original_hash {
        bar.finish();
        return last_pwd.map_or(ResultadoSenha::SenhaNaoEncontrada(None), |s| {
            ResultadoSenha::SenhaEncontrada(s)
        });
    }
    bar.inc(1);
}

bar.finish();
ResultadoSenha::SenhaNaoEncontrada(last_pwd)
Just a high-level explanation: as the tasks complete their work, they send a pair of (password, hash) to the main thread, which compares the hash with the original hash (the one I'm trying to find a password for). If they match, great: I return to main with an enum value that indicates success, along with the password that produces the original hash. After all iterations end, I return to main with an enum value that indicates no match was found, but with the last password tried, so I can retry from that point in some future run.
I'm trying to use Indicatif to show a progress bar, so I can get a glimpse of the progress.
But my problem is that the program's memory usage grows without any clear reason why. If I run it with, let's say, 1 billion iterations, it slowly accumulates memory until it fills all available system memory.
But when I comment out the line bar.inc(1);, the program behaves as expected, with normal memory usage.
I've created a test program with Rayon and Indicatif, but without the hash calculation, and it works correctly with no memory misbehavior.
That makes me think I'm doing something wrong with memory management in my code, but I can't see anything obvious.
I found a solution, but I'm still not sure why it solves the original problem.
What I did was move the progress code into the first spawn closure; note where the ProgressBar is created and where bar.inc(1) is called below:
let (tx, rx) = mpsc::channel();
let original_hash = String::from(original_hash);

rayon::spawn(move || {
    let bar = ProgressBar::new(max_iteracoes as u64);
    let mut i = 0;
    let mut iter = SenhaIterator::from(initial_pwd);
    while i < max_iteracoes {
        let pwd = iter.next().unwrap();
        let clone_tx = tx.clone();
        rayon::spawn(move || {
            let hash = calcula_hash(&pwd);
            clone_tx.send((pwd, hash)).unwrap();
        });
        i += 1;
        bar.inc(1);
    }
    bar.finish();
});

let mut latest_pwd = None;

while let Ok((pwd, hash)) = rx.recv() {
    latest_pwd = Some(pwd);
    if hash == original_hash {
        return latest_pwd.map_or(PasswordOutcome::PasswordNotFound(None), |s| {
            PasswordOutcome::PasswordFound(s)
        });
    }
}

PasswordOutcome::PasswordNotFound(latest_pwd)
That first spawn closure has the role of fetching the next password to try and passing it to a worker task, which calculates the corresponding hash and sends the pair (password, hash) to the main thread. The main thread waits for the pairs on the rx channel and compares each hash with the expected one.
What I still don't understand is why tracking the progress on the outer thread leaks memory. I couldn't identify what is actually leaking, but it works now and I'm happy with the result.

Coroutine: difference between join() and cancelAndJoin()

I'm following the reference documentation on coroutines (https://kotlinlang.org/docs/reference/coroutines/cancellation-and-timeouts.html) and, like a good boy, testing the examples in Android Studio at the same time.
But I am very confused by the cancelAndJoin() method.
If I replace "cancelAndJoin" with "join" in the example code, there is no difference in the logs.
Here is the code:
fun main() = runBlocking {
    val startTime = System.currentTimeMillis()
    val job = launch(Dispatchers.Default) {
        var nextPrintTime = startTime
        var i = 0
        while (i < 5) { // computation loop, just wastes CPU
            // print a message twice a second
            if (System.currentTimeMillis() >= nextPrintTime) {
                println("job: I'm sleeping ${i++} ...")
                nextPrintTime += 500L
            }
        }
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin() // cancels the job and waits for its completion
    println("main: Now I can quit.")
}
In both cases (with join or cancelAndJoin) the logs are:
job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm sleeping 3 ...
job: I'm sleeping 4 ...
main: Now I can quit.
Can anybody explain the difference between the two methods?
This is not clear to me, because cancel is "stopping the job" and join is "waiting for completion", but the two together??? We "stop" and "wait"???
Thanks in advance :)
This is not clear because cancel is "stopping the job", join is "waiting for completion", but the two together??? We "stop" and "wait"?
Just because we wrote job.cancel() doesn't mean that the job cancels immediately.
When we call job.cancel(), the job enters a transient cancelling state. It has yet to reach the final cancelled state, and only after it enters the cancelled state can it be considered completely cancelled.
Observe how the state transition occurs (see the diagram in the linked article):
When we call job.cancel(), the cancellation procedure (the transition into the cancelling state) starts for the job. But we still need to wait, since the job is truly cancelled only once it reaches the cancelled state. Hence we write job.cancelAndJoin(), where cancel() starts the cancellation procedure for the job and join() waits until the job has reached the cancelled state.
Make sure to check out this excellent article
Note: The image is shamelessly copied from the linked article itself
OK, I've just realized that my question was stupid!
cancelAndJoin() is equivalent to calling cancel() and then join().
The code demonstrates that the job cannot be cancelled if we don't check whether it is still active, and to do that we must use isActive.
So like this:
val startTime = System.currentTimeMillis()
val job = launch(Dispatchers.Default) {
    var nextPrintTime = startTime
    var i = 0
    while (isActive) { // cancellable computation loop
        // print a message twice a second
        if (System.currentTimeMillis() >= nextPrintTime) {
            println("job: I'm sleeping ${i++} ...")
            nextPrintTime += 500L
        }
    }
}
delay(1300L) // delay a bit
println("main: I'm tired of waiting!")
job.cancelAndJoin() // cancels the job and waits for its completion
println("main: Now I can quit.")

How to identify a memory leak that slowly eats up memory

I wrote a small app which checks some folders every few seconds. I ran this app for approximately one week. Then I saw that it had occupied 32 GB of my physical 8 GB of RAM, so the system forced me to stop it.
So it seems that the app very slowly eats up memory. I tried Instruments. With the Activity Monitor instrument, the only slowly growing process is "DTServiceHub". Is this something I have to watch?
For debugging I write some information to standard output with print. Is this information dropped because the app is a Cocoa app, or is it stored somewhere until the app terminates? In that case I would have to remove all these print statements.
Some code for looping:
func startUpdating() {
    running = true
    timeInterval = 5
    run.isEnabled = false
    setTimer()
}

func setTimer() {
    timer = Timer(timeInterval: timeInterval, target: self, selector: #selector(ViewController.update), userInfo: nil, repeats: false)
    RunLoop.current.add(timer, forMode: RunLoopMode.commonModes)
}

func update() {
    writeLOG("update -Start-", level: 0x2)
    ...
    setTimer()
}
There are several problems with your code. You should not create a Timer and add it to a run loop every time you set it, because the run loop will retain that Timer and never release it. It is even worse when you do that on every update. Just create the timer once. If you don't need it anymore, you have to invalidate it to remove it from the run loop and allow its release. If you need to adjust it, just set the next fire date.
Here is an example that was tested in a playground:
import Cocoa

class MyTimer: NSObject
{
    var fireTime = 10.0
    var timer: Timer!

    override init()
    {
        super.init()
        self.timer = Timer.scheduledTimer(timeInterval: fireTime, target: self, selector: #selector(update), userInfo: nil, repeats: true)
    }

    deinit
    {
        // invalidating removes the timer from the run loop and enables its release
        timer.invalidate()
    }

    @objc func update()
    {
        print("update")
        let calendar = Calendar.current
        let date = Date()
        if let nextFireDate = calendar.date(byAdding: .second, value: Int(fireTime), to: date)
        {
            print(date)
            timer.fireDate = nextFireDate
        }
    }
}

let timer = MyTimer()
CFRunLoopRun()
The problem is that your app loops forever and thus never drains the autorelease pool. You should never write an app like that. If you need to check something periodically, use an NSTimer. If you need to hear about changes in a folder, use NSWorkspace or kqueue or similar. In other words, use callbacks. Never just loop.
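To make the "use callbacks" suggestion concrete, here is a hedged sketch of watching a folder with a GCD file-system dispatch source instead of polling in a loop; the path is a placeholder and error handling is omitted:
import Foundation

let folderURL = URL(fileURLWithPath: "/path/to/watched/folder")   // hypothetical path
let fd = open(folderURL.path, O_EVTONLY)                          // open the folder for events only

let source = DispatchSource.makeFileSystemObjectSource(fileDescriptor: fd,
                                                       eventMask: .write,
                                                       queue: .main)
source.setEventHandler {
    print("folder contents changed")   // do the folder check here instead of polling
}
source.setCancelHandler {
    close(fd)                          // release the descriptor once the source is cancelled
}
source.resume()
With this approach the run loop stays idle between events, so the autorelease pool drains normally.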

Android Java development: time of acceleration to 100 km/h

How can I calculate the time to accelerate to 100 km/h?
Well, I registered a location listener; when !location.hasSpeed() is true I store the location's time in a variable. When the speed reaches the given speed, in this case 100 km/h (27.77 m/s), I subtract the stored time from the location's time and divide the result by 1000.
Here is the "pseudo code":
@Override
public void onLocationChanged(Location currentLoc) {
    // when stopped it is reset; when we start moving it is reset again
    if (isAccelerationLoggingStarted) {
        if (currentLoc.hasSpeed() && currentLoc.getSpeed() > 0.0) {
            // dismiss the time between the reset and starting to move
            startAccelerationToTime = (double) currentLoc.getTime();
        }
    }
    if (!currentLoc.hasSpeed()) {
        isAccelerationLoggingStarted = true;
        startAccelerationToTime = (double) currentLoc.getTime();
        acceleration100 = 0.0;
    }
    if (isAccelerationLoggingStarted) {
        if (currentLoc.getSpeed() >= 27.77) {
            acceleration100 = (currentLoc.getTime() - startAccelerationToTime) / 1000;
            isAccelerationLoggingStarted = false;
        }
    }
}
The main problem I see here is that whenever the device is moving, startAccelerationToTime is reset. (The first if only checks whether there's movement; it doesn't check whether there's already a start time recorded.)
I don't see where isAccelerationLoggingStarted is needed at all -- the checks, and the variables themselves, can be cleaned up a bit to make it clear what the next step should be.
Your pseudocode probably ought to look something like:
if speed is 0
    clear start time
else if no start time yet
    start time = current time
    clear acceleration time
else if no acceleration time yet, and speed >= 100 km/h
    acceleration time = current time - start time
In Java, that'd look like...
long startTime = 0;
double accelerationTime = 0.0;

@Override
public void onLocationChanged(Location currentLoc) {
    // when stopped (or so slow we might as well be), reset start time
    if (!currentLoc.hasSpeed() || currentLoc.getSpeed() < 0.005) {
        startTime = 0;
    }
    // We're moving, but is there a start time yet?
    // if not, set it and clear the acceleration time
    else if (startTime == 0) {
        startTime = currentLoc.getTime();
        accelerationTime = 0.0;
    }
    // There's a start time, but are we going over 100 km/h?
    // if so, and we don't have an acceleration time yet, set it
    else if (accelerationTime == 0.0 && currentLoc.getSpeed() >= 27.77) {
        accelerationTime = (double)(currentLoc.getTime() - startTime) / 1000.0;
    }
}
Now, I'm not sure exactly how location listeners work, or how often they notify you when you're moving, so this may only partly work. In particular, onLocationChanged might not get called when you're not moving; you may need to request an update (perhaps via a "reset" button or something) or set certain parameters in order to trigger the logic that runs when speed == 0.

Scala stateful actor, recursive calling faster than using vars?

Sample code below. I'm a little curious why MyActor is faster than MyActor2. MyActor recursively calls process/react and keeps state in the function parameters, whereas MyActor2 keeps state in vars. MyActor even has the extra overhead of tupling the state, but it still runs faster. I'm wondering if there is a good explanation for this, or if maybe I'm doing something "wrong".
I realize the performance difference is not significant, but the fact that it is there and consistent makes me curious what's going on here.
Ignoring the first two runs as warmup, I get:
MyActor:
559
511
544
529
vs.
MyActor2:
647
613
654
610
import scala.actors._

object Const {
  val NUM = 100000
  val NM1 = NUM - 1
}

trait Send[MessageType] {
  def send(msg: MessageType)
}

// Test 1 using recursive calls to maintain state
abstract class StatefulTypedActor[MessageType, StateType](val initialState: StateType) extends Actor with Send[MessageType] {
  def process(state: StateType, message: MessageType): StateType

  def act = proc(initialState)

  def send(message: MessageType) = {
    this ! message
  }

  private def proc(state: StateType) {
    react {
      case msg: MessageType => proc(process(state, msg))
    }
  }
}

object MyActor extends StatefulTypedActor[Int, (Int, Long)]((0, 0)) {
  override def process(state: (Int, Long), input: Int) = input match {
    case 0 =>
      (1, System.currentTimeMillis())
    case input: Int =>
      state match {
        case (Const.NM1, start) =>
          println((System.currentTimeMillis() - start))
          (Const.NUM, start)
        case (s, start) =>
          (s + 1, start)
      }
  }
}

// Test 2 using vars to maintain state
object MyActor2 extends Actor with Send[Int] {
  private var state = 0
  private var strt = 0: Long

  def send(message: Int) = {
    this ! message
  }

  def act =
    loop {
      react {
        case 0 =>
          state = 1
          strt = System.currentTimeMillis()
        case input: Int =>
          state match {
            case Const.NM1 =>
              println((System.currentTimeMillis() - strt))
              state += 1
            case s =>
              state += 1
          }
      }
    }
}

// main: Run testing
object TestActors {
  def main(args: Array[String]): Unit = {
    val a = MyActor
    // val a = MyActor2
    a.start()
    testIt(a)
  }

  def testIt(a: Send[Int]) {
    for (_ <- 0 to 5) {
      for (i <- 0 to Const.NUM) {
        a send i
      }
    }
  }
}
EDIT: Based on Vasil's response, I removed the loop and tried it again. The var-based MyActor2 then leapfrogged ahead and now runs around 10% or so faster. So... the lesson is: if you are confident that you won't end up with a stack-overflowing backlog of messages, and you want to squeeze out every last bit of performance... don't use loop; just call the act() method recursively.
Change for MyActor2:
override def act() =
  react {
    case 0 =>
      state = 1
      strt = System.currentTimeMillis()
      act()
    case input: Int =>
      state match {
        case Const.NM1 =>
          println((System.currentTimeMillis() - strt))
          state += 1
        case s =>
          state += 1
      }
      act()
  }
Such results are caused by the specifics of your benchmark (a lot of small messages that fill the actor's mailbox faster than it can handle them).
Generally, the workflow of react is the following:
The actor scans the mailbox;
If it finds a message, it schedules the execution;
When the scheduling completes, or when there are no messages in the mailbox, the actor suspends (Actor.suspendException is thrown);
In the first case, when the handler finishes processing the message, execution proceeds straight to the react method, and as long as there are lots of messages in the mailbox, the actor immediately schedules the next message for execution and only after that suspends.
In the second case, loop schedules the execution of react in order to prevent a stack overflow (which might be your case with Actor #1, because tail recursion in process is not optimized), so execution doesn't proceed to react immediately, as it does in the first case. That's where the millis are lost.
UPDATE (taken from here):
Using loop instead of recursive react effectively doubles the number of tasks that the thread pool has to execute in order to accomplish the same amount of work, which in turn makes it so any overhead in the scheduler is far more pronounced when using loop.
Just a wild stab in the dark: it might be due to the exception thrown by react in order to escape the loop. Exception creation is quite heavy. However, I don't know how often it does that, but it should be possible to check with a catch and a counter.
The overhead in your test depends heavily on the number of threads that are present (try using only one thread with scala -Dactors.corePoolSize=1!). I'm finding it difficult to figure out exactly where the difference arises; the only real difference is that in one case you use loop and in the other you do not. loop does do a fair bit of work, since it repeatedly creates function objects using "andThen" rather than iterating. I'm not sure whether this is enough to explain the difference, especially in light of the heavy usage by scala.actors.Scheduler$.impl and ExceptionBlob.
