Kotlin: coroutineScope is slower than GlobalScope - performance

I'm learning coroutines, and I encounter the following surprising (for me) behavior. I want to have a parallel map. I consider 4 solutions:
Just map, no parallelism
pmap from here.
Modification of item 2: I removed coroutineScope and use GlobalScope.
Java's parallelStream.
The code:
import kotlinx.coroutines.*
import kotlin.streams.toList
import kotlin.system.measureNanoTime
inline fun printTime(msg: String, f: () -> Unit) =
println("${msg.padEnd(15)} time: ${measureNanoTime(f) / 1e9}")
suspend fun <T, U> List<T>.pmap(f: (T) -> U) = coroutineScope {
map { async { f(it) } }.map { it.await() }
}
suspend fun <T, U> List<T>.pmapGlob(f: (T) -> U) =
map { GlobalScope.async { f(it) } }.map { it.await() }
fun eval(i: Int) = (0 .. i).sumBy { it * it }
fun main() = runBlocking {
val list = (0..200).map { it * it * it }
printTime("No parallelism") { println(list.map(::eval).sum()) }
printTime("CoroutineScope") { println(list.pmap(::eval).sum()) }
printTime("GlobalScope") { println(list.pmapGlob(::eval).sum()) }
printTime("ParallelStream") { println(list.parallelStream().map(::eval).toList().sum()) }
}
Output (without sums):
No parallelism time: 0.85726849
CoroutineScope time: 0.827426385
GlobalScope time: 0.145788785
ParallelStream time: 0.161423263
As you can see, with coroutineScope there is almost no gain, while with GlobalScope it works as fast as parallelStream. What is the reason? Can I have a solution which has all advantages of coroutineScope with the same speed gain?

Scopes are only indirectly involved in the differences you observed.
GlobalScope is a singleton that defines its own dispatcher, which is Dispatchers.Default. It is backed by a thread pool.
coroutineScope does not define its own dispatcher so you inherit it from the caller, in this case the one created by runBlocking. It uses the single thread it is called on.
If you replace coroutineScope with withContext(Dispatchers.Default), you'll get the same timings. This is in fact how you should write this (instead of GlobalScope) in order to get sane behavior in the face of possible failures of some of the concurrent tasks.

Related

Kotlin coroutines within a Spring REST endpoint

Given a REST endpoint and two asynchronous coroutines each returning an integer, I want this endpoint to return their sum. The two functions (funA and funB) should run in parallel, in such a way that the whole computation should take ~3secs.
I'm using SpringBoot 2.6.3 and Kotlin Coroutines 1.6.0. Here is my attempt:
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RestController
#RestController
class Controllers {
#GetMapping(
value = ["/summing"]
)
fun summing(): String {
val finalResult = runBlocking {
val result = async { sum() }
println("Your result: ${result.await()}")
return#runBlocking result
}
println("Final Result: $finalResult")
return "$finalResult"
}
suspend fun funA(): Int {
delay(3000)
return 10
}
suspend fun funB(): Int {
delay(2000)
return 90
}
fun sum() = runBlocking {
val resultSum = async { funA().await() + funB().await() }
return#runBlocking resultSum
}
}
The problem is that this code does not compile as await() is not recognised as a valid method. If I remove await() the two functions are executed in series (total time ~5 secs) and instead of the expected result (100), I get:
Your result: DeferredCoroutine{Completed}#1c1fe804
Final Result: DeferredCoroutine{Completed}#48a622de
and so the endpoint returns "DeferredCoroutine{Completed}#48a622de".
I want the endpoint to return "100" instead and within ~3secs. How can I achieve this?
You really messed this up ;-) There are several problems with your code:
Use runBlocking() only to use another runBlocking() inside it.
Use async() and immediately call await() on it (in summing()) - it does nothing.
funA().await() and funB().await() don't make any sense really. These functions return integers, you can't await() on already acquired integers.
Generally, using much more code than needed.
The solution is pretty simple: use runBlocking() once to jump into coroutine world and then use async() to start both functions concurrently to each other:
runBlocking {
val a = async { funA() }
val b = async { funB() }
a.await() + b.await()
}
Or alternatively:
runBlocking {
listOf(
async { funA() },
async { funB() },
).awaitAll().sum()
}
Or (a little shorter, but I consider this less readable):
runBlocking {
val a = async { funA() }
funB() + a.await()
}
Also,runBlocking() is not ideal. I believe Spring has support for coroutines, so it would be better to make summing() function suspend and use coroutineScope() instead of runBlocking() - this way the code won't block any threads.

How to continue a suspend function in a dynamic proxy in the same coroutine?

I want to continue a suspend function in a dynamic proxy in the same coroutine.
Please have a look at the following code:
interface Adder {
suspend fun add(a: Int, b: Int): Int
}
val IH = InvocationHandler { _, method, args ->
val continuation = args.last() as Continuation<*>
val realArgs = args.take(args.size - 1)
println("${method.name}$realArgs")
GlobalScope.launch {
delay(5_000)
#Suppress("UNCHECKED_CAST") (continuation as Continuation<Int>).resume(3)
}
COROUTINE_SUSPENDED
}
fun main() {
val adder = Proxy.newProxyInstance(
Adder::class.java.classLoader, arrayOf(Adder::class.java), IH
) as Adder
runBlocking {
println(adder.add(1, 2))
}
}
It works fine. It runs the delay function in a new coroutine.
However, that's not what I want.
I want to run the InvocationHandler in the same coroutine as the one that was started with runBlocking.
Something like:
val IH = InvocationHandler { _, _, _ ->
delay(5_000)
3
}
This obviously won't compile because delay is a suspend function that must be run in a coroutine.
So the question is: How could I write the InvocationHandler for my intended behavior?
Any help would be very much appreciated.
I'd like to use this code in my RPC framework.
My real code would replace the delay call with non-blocking Ktor socket calls for serializing the data over the wire.
You can find the code example at: https://raw.githubusercontent.com/softappeal/yass/master/kotlin/yass/test/ch/softappeal/yass/remote/SuspendProxy.kt
I've found a solution for my problem:
package ch.softappeal.yass
import kotlinx.coroutines.*
import java.lang.reflect.*
import kotlin.coroutines.*
import kotlin.test.*
typealias SuspendInvoker = suspend (method: Method, arguments: List<Any?>) -> Any?
private interface SuspendFunction {
suspend fun invoke(): Any?
}
private val SuspendRemover = SuspendFunction::class.java.methods[0]
#Suppress("UNCHECKED_CAST")
fun <C : Any> proxy(contract: Class<C>, invoker: SuspendInvoker): C =
Proxy.newProxyInstance(contract.classLoader, arrayOf(contract)) { _, method, arguments ->
val continuation = arguments.last() as Continuation<*>
val argumentsWithoutContinuation = arguments.take(arguments.size - 1)
SuspendRemover.invoke(object : SuspendFunction {
override suspend fun invoke() = invoker(method, argumentsWithoutContinuation)
}, continuation)
} as C
interface Adder {
suspend fun add(a: Int, b: Int): Int
}
class SuspendProxyTest {
#Test
fun test() {
val adder = proxy(Adder::class.java) { method, arguments ->
println("${method.name}$arguments")
delay(100)
3
}
runBlocking { assertEquals(3, adder.add(1, 2)) }
}
}
Any comments?
Is this a good/problematic solution?
Could/should the "removing of suspend functionality" be added to the kotlin.coroutines library?
use runBlocking inside InvocationHandler:
val IH = InvocationHandler { _, _, _ ->
runBlocking{
delay(5_000)// now you can use suspend functions here
}
3
}

How to build a monitor with infinite loop?

I am building a monitor in Kotlin to schedule certain operations, what I want is a program that inserts or updates some database entries for a given time intervall. What I got so far is a program that runs for a given time span, but I have an infinite loop in my porgram that takes up to 30% of processor power when it is not time for an update. So my question is how to build a monitor without an infinite loop?
this my code so far:
while(!operations.done && appConfigurations.run_with_monitor) {
if (DataSourceMonitor.isReadyForUpdate(lastMonitorModel)) {
operations.update()
}
}
operations is an entire sequence of different actions to execute. Each operation implementing the IScheduler interface.
interface IScheduler {
var done: Boolean
fun update()
fun reset()
}
Example of implementation:
class Repeat(private val task: IScheduler) : IScheduler {
override var done = false
override fun update() {
if (this.task.done) {
this.reset()
}
this.task.update()
//logger.info { "Update repeat, done is always $done" }
}
override fun reset() {
this.task.reset()
this.done = false
}
}
class Sequence(private val task1: IScheduler, private val task2: IScheduler): IScheduler {
override var done = false
var current = task1
var next = task2
override fun update() {
if (!this.done) {
this.current.update()
if (this.current.done) {
this.current = this.next
}
if (this.next.done) {
this.done = true
}
}
}
class Print(private val msg: String): IScheduler {
override var done = false
override fun update() {
println(this.msg)
this.done = true
}
override fun reset() {
this.done = false
}
}
The value of operations can be as follows:
val operations = Repeat(Sequence(Print("First action"), Print("Another action")))
**So right now my monitor is working and completely functional, but how can I improve the performance of the infinite loop? **
Hope anyone has some ideas about this.
If your DataSourceMonitor has no way to block until isReadyForUpdate is going to return true, then the usual approach is to add a delay. eg:
while(!operations.done && appConfigurations.run_with_monitor) {
if (DataSourceMonitor.isReadyForUpdate(lastMonitorModel)) {
operations.update()
} else {
Thread.sleep(POLL_DELAY);
}
}
If it's always ready for update there won't be any delay, but if it ever isn't ready for update then it'll sleep. You'll need to tune the POLL_DELAY. Bigger values mean less CPU usage, but greater latency in detecting new events to process. Smaller values produce less latency, but use more CPU.
If you really want to get fancy you can have the poll delay start small and then increase up to some maximum, dropping back down once events are found. This is probably overkill, but look up "adaptive polling" if you're interested.
I have refactored my code and I can accomplish the same result with less code, by removing the IScheduler interface by the abstract class TimerTask. The job can be done with these lines of code:
val scheduler = Sequence(
Print("Executed task 1"),
Sequence(Print("Executed task 2"),
Sequence(Print("Executed task 3"), Print("Finished Scheduler")))
)
Timer().schedule(scheduler, DELAY, PERIOD)
All the interface implementations are changed to TimerTask implementations:
class Print(private val msg: String): TimerTask() {
override fun run() {
println(msg)
}
}
class Sequence(private val task1: Runnable, private val task2: Runnable): TimerTask() {
override fun run() {
task1.run()
task2.run()
}
}

Dont understand how to make flux subscription working in kotlin

I'm new to reactive programming. I expect to see
test provider started
Beat 1000
Beat 2000
in logs but there is only test provider started and no Beat or on complete messages. Looks like I miss something
#Service
class ProviderService {
#PostConstruct
fun start(){
val hb: Flux<HeartBeat> = Flux.interval(Duration.ofSeconds(1)).map { HeartBeat(it) }
val provider = Provider("test", hb)
}
}
////////////////////////
open class Provider(name: String, heartBests: Flux<HeartBeat>) {
companion object {
val log = LoggerFactory.getLogger(Provider::class.java)!!
}
init {
log.info("$name provider started")
heartBests.doOnComplete { log.info("on complete") }
heartBests.doOnEach { onBeat(it.get().number) }
}
fun onBeat(n: Number){
log.info("Beat $n")
}
}
/////
class HeartBeat(val number: Number)
three pretty common mistakes here:
operators like doOnEach return a new Flux instance with the added behavior, so you need to (re)assign to a variable or use a fluent style
nothing happens until you subscribe() (or a variant of it. blockXXX do also subscribe under the hood for instance...)
such a pipeline is fully asynchronous, and runs on a separate Thread due to the time dimension of the source, interval. As a result, control would immediately return in init even if you had subscribed, potentially causing the main thread and then the app to exit.
In your code lambda from 'doOnComplete' has been never called, because you created infinite stream. Method 'doOnEach' as 'map' is intermediate operations (like map in streams), its doesn't make a call.
And you have another mistake, reactive suggests "fluent pattern".
Try this simple example:
import reactor.core.publisher.Flux
import java.time.Duration
fun main(args: Array<String>) {
val flux = Flux.interval(Duration.ofSeconds(1)).map { HeartBeat(it) }
println("start")
flux.take(3)
.doOnEach { println("on each $it") }
.map { println("before map");HeartBeat(it.value * 2) }
.doOnNext { println("on next $it") }
.doOnComplete { println("on complete") }
.subscribe { println("subscribe $it") }
Thread.sleep(5000)
}
data class HeartBeat(val value: Long)

Parallel operations on Kotlin collections?

In Scala, one can easily do a parallel map, forEach, etc, with:
collection.par.map(..)
Is there an equivalent in Kotlin?
The Kotlin standard library has no support for parallel operations. However, since Kotlin uses the standard Java collection classes, you can use the Java 8 stream API to perform parallel operations on Kotlin collections as well.
e.g.
myCollection.parallelStream()
.map { ... }
.filter { ... }
As of Kotlin 1.1, parallel operations can also be expressed quite elegantly in terms of coroutines. Here is a custom pmap helper function for lists:
fun <A, B>List<A>.pmap(f: suspend (A) -> B): List<B> = runBlocking {
map { async(Dispatchers.Default) { f(it) } }.map { it.await() }
}
You can use this extension method:
suspend fun <A, B> Iterable<A>.pmap(f: suspend (A) -> B): List<B> = coroutineScope {
map { async { f(it) } }.awaitAll()
}
See Parallel Map in Kotlin for more info
There is no official support in Kotlin's stdlib yet, but you could define an extension function to mimic par.map:
fun <T, R> Iterable<T>.pmap(
numThreads: Int = Runtime.getRuntime().availableProcessors() - 2,
exec: ExecutorService = Executors.newFixedThreadPool(numThreads),
transform: (T) -> R): List<R> {
// default size is just an inlined version of kotlin.collections.collectionSizeOrDefault
val defaultSize = if (this is Collection<*>) this.size else 10
val destination = Collections.synchronizedList(ArrayList<R>(defaultSize))
for (item in this) {
exec.submit { destination.add(transform(item)) }
}
exec.shutdown()
exec.awaitTermination(1, TimeUnit.DAYS)
return ArrayList<R>(destination)
}
(github source)
Here's a simple usage example
val result = listOf("foo", "bar").pmap { it+"!" }.filter { it.contains("bar") }
If needed it allows to tweak threading by providing the number of threads or even a specific java.util.concurrent.Executor. E.g.
listOf("foo", "bar").pmap(4, transform = { it + "!" })
Please note, that this approach just allows to parallelize the map operation and does not affect any downstream bits. E.g. the filter in the first example would run single-threaded. However, in many cases just the data transformation (ie. map) requires parallelization. Furthermore, it would be straightforward to extend the approach from above to other elements of Kotlin collection API.
From 1.2 version, kotlin added a stream feature which is compliant with JRE8
So, iterating over a list asynchronously could be done like bellow:
fun main(args: Array<String>) {
val c = listOf("toto", "tata", "tutu")
c.parallelStream().forEach { println(it) }
}
Kotlin wants to be idiomatic but not too much synthetic to be hard to understand at a first glance.
Parallel computation trough Coroutines is no exception. They want it to be easy but not implicit with some pre-built method, allowing to branch the computation when needed.
In your case:
collection.map {
async{ produceWith(it) }
}
.forEach {
consume(it.await())
}
Notice that to call async and await you need to be inside a so called Context, you cannot make suspending calls or launching a coroutine from a non-coroutine context. To enter one you can either:
runBlocking { /* your code here */ }: it will suspend the current thread until the lambda returns.
GlobalScope.launch { }: it will execute the lambda in parallel; if your main finishes executing while your coroutines have not bad things will happen, in that case better use runBlocking.
Hope it may helps :)
At the present moment no. The official Kotlin comparison to Scala mentions:
Things that may be added to Kotlin later:
Parallel collections
This solution assumes that your project is using coroutines:
implementation( "org.jetbrains.kotlinx:kotlinx-coroutines-core:1.3.2")
The functions called parallelTransform don't retain the order of elements and return a Flow<R>, while the function parallelMap retains the order and returns a List<R>.
Create a threadpool for multiple invocations:
val numberOfCores = Runtime.getRuntime().availableProcessors()
val executorDispatcher: ExecutorCoroutineDispatcher =
Executors.newFixedThreadPool(numberOfCores ).asCoroutineDispatcher()
use that dispatcher (and call close() when it's no longer needed):
inline fun <T, R> Iterable<T>.parallelTransform(
dispatcher: ExecutorDispatcher,
crossinline transform: (T) -> R
): Flow<R> = channelFlow {
val items: Iterable<T> = this#parallelTransform
val channelFlowScope: ProducerScope<R> = this#channelFlow
launch(dispatcher) {
items.forEach {item ->
launch {
channelFlowScope.send(transform(item))
}
}
}
}
If threadpool reuse is of no concern (threadpools aren't cheap), you can use this version:
inline fun <T, R> Iterable<T>.parallelTransform(
numberOfThreads: Int,
crossinline transform: (T) -> R
): Flow<R> = channelFlow {
val items: Iterable<T> = this#parallelTransform
val channelFlowScope: ProducerScope<R> = this#channelFlow
Executors.newFixedThreadPool(numberOfThreads).asCoroutineDispatcher().use { dispatcher ->
launch( dispatcher ) {
items.forEach { item ->
launch {
channelFlowScope.send(transform(item))
}
}
}
}
}
if you need a version that retains the order of elements:
inline fun <T, R> Iterable<T>.parallelMap(
dispatcher: ExecutorDispatcher,
crossinline transform: (T) -> R
): List<R> = runBlocking {
val items: Iterable<T> = this#parallelMap
val result = ConcurrentSkipListMap<Int, R>()
launch(dispatcher) {
items.withIndex().forEach {(index, item) ->
launch {
result[index] = transform(item)
}
}
}
// ConcurrentSkipListMap is a SortedMap
// so the values will be in the right order
result.values.toList()
}
I found this:
implementation 'com.github.cvb941:kotlin-parallel-operations:1.3'
details:
https://github.com/cvb941/kotlin-parallel-operations
I've come up with a couple of extension functions:
The suspend extension function on Iterable<T> type, which does a parallel processing of items and returns some result of processing each item. By default it uses Dispatchers.IO dispatcher to offload blocking tasks to a shared pool of threads. Must be called from a coroutine (including a coroutine with Dispatchers.Main dispatcher) or another suspend function.
suspend fun <T, R> Iterable<T>.processInParallel(
dispatcher: CoroutineDispatcher = Dispatchers.IO,
processBlock: suspend (v: T) -> R,
): List<R> = coroutineScope { // or supervisorScope
map {
async(dispatcher) { processBlock(it) }
}.awaitAll()
}
Example of calling from a coroutine:
val collection = listOf("A", "B", "C", "D", "E")
someCoroutineScope.launch {
val results = collection.processInParallel {
process(it)
}
// use processing results
}
where someCoroutineScope is an instance of CoroutineScope.
Launch and forget extension function on CoroutineScope, which doesn't return any result. It also uses Dispatchers.IO dispatcher by default. Can be called using CoroutineScope or from another coroutine.
fun <T> CoroutineScope.processInParallelAndForget(
iterable: Iterable<T>,
dispatcher: CoroutineDispatcher = Dispatchers.IO,
processBlock: suspend (v: T) -> Unit
) = iterable.forEach {
launch(dispatcher) { processBlock(it) }
}
Example of calling:
someoroutineScope.processInParallelAndForget(collection) {
process(it)
}
// OR from another coroutine:
someCoroutineScope.launch {
processInParallelAndForget(collection) {
process(it)
}
}
2a. Launch and forget extension function on Iterable<T>. It's almost the same as previous, but the extension type is different. CoroutineScope must be passed as argument to the function.
fun <T> Iterable<T>.processInParallelAndForget(
scope: CoroutineScope,
dispatcher: CoroutineDispatcher = Dispatchers.IO,
processBlock: suspend (v: T) -> Unit
) = forEach {
scope.launch(dispatcher) { processBlock(it) }
}
Calling:
collection.processInParallelAndForget(someCoroutineScope) {
process(it)
}
// OR from another coroutine:
someScope.launch {
collection.processInParallelAndForget(this) {
process(it)
}
}
You can mimic the Scala API by using extension properties and inline classes. Using the coroutine solution from #Sharon answer, you can write it like this
val <A> Iterable<A>.par get() = ParallelizedIterable(this)
#JvmInline
value class ParallelizedIterable<A>(val iter: Iterable<A>) {
suspend fun <B> map(f: suspend (A) -> B): List<B> = coroutineScope {
iter.map { async { f(it) } }.awaitAll()
}
}
with this, now your code can change from
anIterable.map { it.value }
to
anIterable.par.map { it.value }
also you can change the entry point as you like other than using extension properties, e.g.
fun <A> Iterable<A>.parallel() = ParallelizedIterable(this)
anIterable.parallel().map { it.value }
You can also use another parallel solution and implement the rest of iterable methods inside ParallelizedIterable while still having the same method names for the operations
The drawback is that this implementation can only parallelize one operation after it, to make it so that it parallelize every subsequent operation, you may need to modify ParallelizedIterable further so it return its own type instead of returning back to List<A>

Resources