I would like to start a process that stays active while my Ktor server is up.
suspend fun doWork() {
    while (true) {
        delay(2000L)
        println("I did work")
    }
}
I have read that best practice is to avoid runBlocking and GlobalScope, so I think I should create a custom scope for the coroutine to run in. Since the job makes some DB calls, I built the scope on Dispatchers.IO:
val scope = CoroutineScope(Job() + Dispatchers.IO)
I then call it from my main function:
fun main() {
    val runningJob = scope.launch {
        doWork()
    }
    embeddedServer(Netty, port = 8080, host = "127.0.0.1") {
        routing {
            get("/") {
                call.respondHtml(HttpStatusCode.OK, HTML::index)
            }
            static("/static") {
                resources()
            }
        }
    }.start(wait = true)
}
Is this the proper way to do this? Most coroutine examples leave out how to launch coroutines from non-suspend functions.
Thank you in advance :)
I have a custom scope that uses a single thread as its dispatcher:
private val jsDispatcher = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
private val jsScope = CoroutineScope(jsDispatcher + SupervisorJob() + CoroutineName("JS-Thread"))
Let's assume I have a code block that uses the above scope to launch a new coroutine and calls multiple suspend methods:
jsScope.launch {
    sampleMethod()
    sampleMethod2()
    sampleMethod3()
}
I need to validate and throw an exception if one of the above sample methods is not running on the above JS thread
private suspend fun sampleMethod() = coroutineScope {
    // Implement me
    validateThread()
}
How can this be enforced?
You can check the current thread name in your method:
private suspend fun sampleMethod() = coroutineScope {
    assert(Thread.currentThread().name == "js-thread") // Doesn't work!
}
However, newSingleThreadExecutor uses DefaultThreadFactory, which produces thread names like pool-N-thread-M; those cannot really be validated because you don't know N or M. I see two solutions here:
Take advantage of the fact that you have a single thread and change its name as soon as you create the executor:
runBlocking {
    jsScope.launch {
        Thread.currentThread().name = "js-thread"
    }.join() // the launched job is not a child of runBlocking, so wait for it explicitly
}
Pass a custom thread factory: Executors.newSingleThreadExecutor(MyThreadFactory("js-thread"))
private class MyThreadFactory(private val name: String) : ThreadFactory {
    private val group: ThreadGroup
    private val threadNumber = AtomicInteger(1)

    init {
        val s = System.getSecurityManager()
        group = if (s != null) {
            s.threadGroup
        } else {
            Thread.currentThread().threadGroup
        }
    }

    override fun newThread(r: Runnable): Thread {
        // name threads "$name-1", "$name-2", ... instead of "pool-N-thread-M"
        val t = Thread(group, r, "$name-${threadNumber.getAndIncrement()}", 0)
        if (t.isDaemon) {
            t.isDaemon = false
        }
        if (t.priority != Thread.NORM_PRIORITY) {
            t.priority = Thread.NORM_PRIORITY
        }
        return t
    }
}
The code was adapted from DefaultThreadFactory. Guava and Apache Commons also provide utility methods to do the same. This approach has the advantage that it works for any thread pool, not just single-threaded ones.
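For example, with Guava's ThreadFactoryBuilder the factory boilerplate collapses to a one-liner (a sketch, assuming Guava is on the classpath; the %d placeholder numbers the threads):
val factory = ThreadFactoryBuilder().setNameFormat("js-thread-%d").build()
val jsDispatcher = Executors.newSingleThreadExecutor(factory).asCoroutineDispatcher()
// the single thread is named js-thread-0, so a check like
// Thread.currentThread().name.startsWith("js-thread") becomes possible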
After some research, I took a look at the withContext() implementation and the answer to my question was right there.
Taken from the withContext() implementation, this is how to check whether the current coroutine context is on the same dispatcher as another context/scope:
if (newContext[ContinuationInterceptor] === oldContext[ContinuationInterceptor]) {
    // same dispatcher
}
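Applied to the question above, validateThread can compare the interceptor (i.e. the dispatcher) of the calling coroutine with the expected one. A minimal sketch, assuming jsDispatcher is the same dispatcher jsScope was built with (coroutineContext here is kotlin.coroutines.coroutineContext, available inside any suspend function):
private suspend fun validateThread() {
    // the ContinuationInterceptor element of a coroutine context is its dispatcher
    if (coroutineContext[ContinuationInterceptor] !== jsDispatcher) {
        throw IllegalStateException("Not running on the JS thread")
    }
}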
I'm learning coroutines, and I encountered the following surprising (for me) behavior. I want to have a parallel map. I considered 4 solutions:
1. Just map, no parallelism.
2. pmap from here.
3. A modification of item 2: I removed coroutineScope and used GlobalScope.
4. Java's parallelStream.
The code:
import kotlinx.coroutines.*
import kotlin.streams.toList
import kotlin.system.measureNanoTime

inline fun printTime(msg: String, f: () -> Unit) =
    println("${msg.padEnd(15)} time: ${measureNanoTime(f) / 1e9}")

suspend fun <T, U> List<T>.pmap(f: (T) -> U) = coroutineScope {
    map { async { f(it) } }.map { it.await() }
}

suspend fun <T, U> List<T>.pmapGlob(f: (T) -> U) =
    map { GlobalScope.async { f(it) } }.map { it.await() }

fun eval(i: Int) = (0..i).sumBy { it * it }

fun main() = runBlocking {
    val list = (0..200).map { it * it * it }
    printTime("No parallelism") { println(list.map(::eval).sum()) }
    printTime("CoroutineScope") { println(list.pmap(::eval).sum()) }
    printTime("GlobalScope") { println(list.pmapGlob(::eval).sum()) }
    printTime("ParallelStream") { println(list.parallelStream().map(::eval).toList().sum()) }
}
Output (without sums):
No parallelism time: 0.85726849
CoroutineScope time: 0.827426385
GlobalScope time: 0.145788785
ParallelStream time: 0.161423263
As you can see, with coroutineScope there is almost no gain, while with GlobalScope it works as fast as parallelStream. What is the reason? Can I have a solution that has all the advantages of coroutineScope with the same speed gain?
Scopes are only indirectly involved in the differences you observed.
GlobalScope is a singleton that defines its own dispatcher, Dispatchers.Default, which is backed by a thread pool.
coroutineScope does not define its own dispatcher, so you inherit it from the caller; here that is the dispatcher created by runBlocking, which uses the single thread it is called on.
If you replace coroutineScope with withContext(Dispatchers.Default), you'll get the same timings. This is in fact how you should write this (instead of GlobalScope) in order to get sane behavior in the face of possible failures of some of the concurrent tasks.
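For example, a structured version of pmap with the same speed gain might look like this (a sketch of that replacement):
suspend fun <T, U> List<T>.pmap(f: (T) -> U): List<U> = withContext(Dispatchers.Default) {
    // the async blocks inherit Dispatchers.Default from withContext
    map { async { f(it) } }.map { it.await() }
}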
I'm new to reactive programming. I expect to see
test provider started
Beat 1000
Beat 2000
in the logs, but there is only test provider started and no Beat or on complete messages. It looks like I'm missing something.
@Service
class ProviderService {
    @PostConstruct
    fun start() {
        val hb: Flux<HeartBeat> = Flux.interval(Duration.ofSeconds(1)).map { HeartBeat(it) }
        val provider = Provider("test", hb)
    }
}
////////////////////////
open class Provider(name: String, heartBeats: Flux<HeartBeat>) {
    companion object {
        val log = LoggerFactory.getLogger(Provider::class.java)!!
    }

    init {
        log.info("$name provider started")
        heartBeats.doOnComplete { log.info("on complete") }
        heartBeats.doOnEach { onBeat(it.get().number) }
    }

    fun onBeat(n: Number) {
        log.info("Beat $n")
    }
}
/////
class HeartBeat(val number: Number)
Three pretty common mistakes here:
1. Operators like doOnEach return a new Flux instance with the added behavior, so you need to (re)assign the result to a variable or use a fluent style.
2. Nothing happens until you subscribe() (or a variant of it; the blockXXX methods also subscribe under the hood, for instance).
3. Such a pipeline is fully asynchronous and runs on a separate thread, due to the time dimension of the source, interval. As a result, control would return immediately in init even if you had subscribed, potentially causing the main thread, and then the app, to exit.
A corrected init block is sketched after this list.
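Putting the three points together, the init block could look roughly like this (a sketch; keeping the Disposable returned by subscribe would additionally allow cancelling the subscription later):
init {
    log.info("$name provider started")
    heartBeats
        .doOnComplete { log.info("on complete") } // still never fires: the interval stream is infinite
        .subscribe { onBeat(it.number) }          // subscribing is what actually starts the flow
}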
In your code, the lambda passed to doOnComplete is never called, because you created an infinite stream. doOnEach, like map, is an intermediate operation (like map on Java streams); it doesn't trigger anything by itself.
And you have another mistake: reactive code favors a "fluent pattern".
Try this simple example:
import reactor.core.publisher.Flux
import java.time.Duration

fun main(args: Array<String>) {
    val flux = Flux.interval(Duration.ofSeconds(1)).map { HeartBeat(it) }
    println("start")
    flux.take(3)
        .doOnEach { println("on each $it") }
        .map { println("before map"); HeartBeat(it.value * 2) }
        .doOnNext { println("on next $it") }
        .doOnComplete { println("on complete") }
        .subscribe { println("subscribe $it") }
    Thread.sleep(5000)
}
data class HeartBeat(val value: Long)
In Scala, one can easily do a parallel map, forEach, etc. with:
collection.par.map(..)
Is there an equivalent in Kotlin?
The Kotlin standard library has no support for parallel operations. However, since Kotlin uses the standard Java collection classes, you can use the Java 8 stream API to perform parallel operations on Kotlin collections as well.
e.g.
myCollection.parallelStream()
    .map { ... }
    .filter { ... }
As of Kotlin 1.1, parallel operations can also be expressed quite elegantly in terms of coroutines. Here is a custom pmap helper function for lists:
fun <A, B> List<A>.pmap(f: suspend (A) -> B): List<B> = runBlocking {
    map { async(Dispatchers.Default) { f(it) } }.map { it.await() }
}
You can use this extension method:
suspend fun <A, B> Iterable<A>.pmap(f: suspend (A) -> B): List<B> = coroutineScope {
    map { async { f(it) } }.awaitAll()
}
See Parallel Map in Kotlin for more info
There is no official support in Kotlin's stdlib yet, but you could define an extension function to mimic par.map:
fun <T, R> Iterable<T>.pmap(
    numThreads: Int = Runtime.getRuntime().availableProcessors() - 2,
    exec: ExecutorService = Executors.newFixedThreadPool(numThreads),
    transform: (T) -> R
): List<R> {
    // default size is just an inlined version of kotlin.collections.collectionSizeOrDefault
    val defaultSize = if (this is Collection<*>) this.size else 10
    val destination = Collections.synchronizedList(ArrayList<R>(defaultSize))
    for (item in this) {
        exec.submit { destination.add(transform(item)) }
    }
    exec.shutdown()
    exec.awaitTermination(1, TimeUnit.DAYS)
    return ArrayList<R>(destination)
}
(github source)
Here's a simple usage example
val result = listOf("foo", "bar").pmap { it+"!" }.filter { it.contains("bar") }
If needed, it allows tweaking the threading by providing the number of threads or even a specific java.util.concurrent.ExecutorService. E.g.
listOf("foo", "bar").pmap(4, transform = { it + "!" })
Please note that this approach just allows parallelizing the map operation and does not affect any downstream bits. E.g. the filter in the first example would run single-threaded. However, in many cases just the data transformation (i.e. map) requires parallelization. Furthermore, it would be straightforward to extend the approach from above to other elements of the Kotlin collection API, as sketched below.
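For instance, a parallel forEach in the same style might look like this (a sketch; pforEach is a name introduced here, not part of the linked source):
fun <T> Iterable<T>.pforEach(
    numThreads: Int = Runtime.getRuntime().availableProcessors() - 2,
    exec: ExecutorService = Executors.newFixedThreadPool(numThreads),
    action: (T) -> Unit
) {
    for (item in this) {
        exec.submit { action(item) }
    }
    exec.shutdown()
    exec.awaitTermination(1, TimeUnit.DAYS)
}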
From version 1.2, Kotlin added a streams feature which is compliant with JRE 8.
So, iterating over a list in parallel can be done as below:
fun main(args: Array<String>) {
    val c = listOf("toto", "tata", "tutu")
    c.parallelStream().forEach { println(it) }
}
Kotlin wants to be idiomatic but not so synthetic that it is hard to understand at first glance.
Parallel computation through coroutines is no exception: they want it to be easy but not implicit behind some pre-built method, allowing you to branch the computation when needed.
In your case:
collection.map {
    async { produceWith(it) }
}.forEach {
    consume(it.await())
}
Notice that to call async and await you need to be inside a so-called context: you cannot make suspending calls or launch a coroutine from a non-coroutine context. To enter one you can use either of the following (see the runnable sketch after this list):
- runBlocking { /* your code here */ }: it will block the current thread until the lambda returns.
- GlobalScope.launch { }: it will execute the lambda in parallel; if your main finishes executing while your coroutines have not, bad things will happen, so in that case better use runBlocking.
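Put together, a runnable version of the snippet above might look like this (a sketch; produceWith and consume are placeholder implementations):
import kotlinx.coroutines.*

suspend fun produceWith(i: Int): Int { delay(100); return i * i } // placeholder work
fun consume(v: Int) = println(v)                                  // placeholder consumer

fun main() = runBlocking {
    listOf(1, 2, 3).map {
        async { produceWith(it) }  // all producers start concurrently
    }.forEach {
        consume(it.await())        // results are consumed in order
    }
}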
Hope it helps :)
At the present moment no. The official Kotlin comparison to Scala mentions:
Things that may be added to Kotlin later:
Parallel collections
This solution assumes that your project is using coroutines:
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.3.2")
The functions called parallelTransform don't retain the order of elements and return a Flow<R>, while the function parallelMap retains the order and returns a List<R>.
Create a threadpool for multiple invocations:
val numberOfCores = Runtime.getRuntime().availableProcessors()
val executorDispatcher: ExecutorCoroutineDispatcher =
    Executors.newFixedThreadPool(numberOfCores).asCoroutineDispatcher()
use that dispatcher (and call close() when it's no longer needed):
inline fun <T, R> Iterable<T>.parallelTransform(
    dispatcher: ExecutorCoroutineDispatcher,
    crossinline transform: (T) -> R
): Flow<R> = channelFlow {
    val items: Iterable<T> = this@parallelTransform
    val channelFlowScope: ProducerScope<R> = this@channelFlow
    launch(dispatcher) {
        items.forEach { item ->
            launch {
                channelFlowScope.send(transform(item))
            }
        }
    }
}
If thread-pool reuse is of no concern (thread pools aren't cheap), you can use this version, which creates and closes a dispatcher per call:
inline fun <T, R> Iterable<T>.parallelTransform(
    numberOfThreads: Int,
    crossinline transform: (T) -> R
): Flow<R> = channelFlow {
    val items: Iterable<T> = this@parallelTransform
    val channelFlowScope: ProducerScope<R> = this@channelFlow
    Executors.newFixedThreadPool(numberOfThreads).asCoroutineDispatcher().use { dispatcher ->
        launch(dispatcher) {
            items.forEach { item ->
                launch {
                    channelFlowScope.send(transform(item))
                }
            }
        }.join() // keep the dispatcher open until all transforms have been sent
    }
}
If you need a version that retains the order of elements:
inline fun <T, R> Iterable<T>.parallelMap(
    dispatcher: ExecutorCoroutineDispatcher,
    crossinline transform: (T) -> R
): List<R> = runBlocking {
    val items: Iterable<T> = this@parallelMap
    val result = ConcurrentSkipListMap<Int, R>()
    launch(dispatcher) {
        items.withIndex().forEach { (index, item) ->
            launch {
                result[index] = transform(item)
            }
        }
    }.join() // wait for all transforms before collecting the results
    // ConcurrentSkipListMap is a SortedMap,
    // so the values will be in the right order
    result.values.toList()
}
I found this:
implementation 'com.github.cvb941:kotlin-parallel-operations:1.3'
details:
https://github.com/cvb941/kotlin-parallel-operations
I've come up with a couple of extension functions:
1. A suspend extension function on the Iterable<T> type, which processes the items in parallel and returns the results of processing each item. By default it uses the Dispatchers.IO dispatcher to offload blocking tasks to a shared pool of threads. It must be called from a coroutine (including a coroutine with the Dispatchers.Main dispatcher) or from another suspend function.
suspend fun <T, R> Iterable<T>.processInParallel(
    dispatcher: CoroutineDispatcher = Dispatchers.IO,
    processBlock: suspend (v: T) -> R,
): List<R> = coroutineScope { // or supervisorScope
    map {
        async(dispatcher) { processBlock(it) }
    }.awaitAll()
}
Example of calling from a coroutine:
val collection = listOf("A", "B", "C", "D", "E")
someCoroutineScope.launch {
    val results = collection.processInParallel {
        process(it)
    }
    // use the processing results
}
where someCoroutineScope is an instance of CoroutineScope.
2. A launch-and-forget extension function on CoroutineScope, which doesn't return any result. It also uses the Dispatchers.IO dispatcher by default. It can be called with a CoroutineScope instance or from another coroutine.
fun <T> CoroutineScope.processInParallelAndForget(
    iterable: Iterable<T>,
    dispatcher: CoroutineDispatcher = Dispatchers.IO,
    processBlock: suspend (v: T) -> Unit
) = iterable.forEach {
    launch(dispatcher) { processBlock(it) }
}
Example of calling:
someCoroutineScope.processInParallelAndForget(collection) {
    process(it)
}
// OR from another coroutine:
someCoroutineScope.launch {
    processInParallelAndForget(collection) {
        process(it)
    }
}
2a. A launch-and-forget extension function on Iterable<T>. It's almost the same as the previous one, but the extension type is different: a CoroutineScope must be passed as an argument to the function.
fun <T> Iterable<T>.processInParallelAndForget(
    scope: CoroutineScope,
    dispatcher: CoroutineDispatcher = Dispatchers.IO,
    processBlock: suspend (v: T) -> Unit
) = forEach {
    scope.launch(dispatcher) { processBlock(it) }
}
Calling:
collection.processInParallelAndForget(someCoroutineScope) {
    process(it)
}
// OR from another coroutine:
someScope.launch {
    collection.processInParallelAndForget(this) {
        process(it)
    }
}
You can mimic the Scala API by using extension properties and inline classes. Using the coroutine solution from @Sharon's answer, you can write it like this:
val <A> Iterable<A>.par get() = ParallelizedIterable(this)

@JvmInline
value class ParallelizedIterable<A>(val iter: Iterable<A>) {
    suspend fun <B> map(f: suspend (A) -> B): List<B> = coroutineScope {
        iter.map { async { f(it) } }.awaitAll()
    }
}
With this, your code can change from
anIterable.map { it.value }
to
anIterable.par.map { it.value }
You can also change the entry point as you like, instead of using an extension property, e.g.
fun <A> Iterable<A>.parallel() = ParallelizedIterable(this)
anIterable.parallel().map { it.value }
You can also use another parallel solution and implement the rest of the iterable methods inside ParallelizedIterable, while still keeping the same method names for the operations.
The drawback is that this implementation can only parallelize the one operation immediately after it. To make it parallelize every subsequent operation, you may need to modify ParallelizedIterable further so it returns its own type instead of returning back to List<A>, as sketched below.
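A minimal sketch of that direction, where map re-wraps its result so the next operation is parallelized too (each stage still awaits all of its items before the next stage starts):
@JvmInline
value class ParallelizedIterable<A>(val iter: Iterable<A>) {
    // fan out, await all results, and re-wrap so the next map is parallel as well
    suspend fun <B> map(f: suspend (A) -> B): ParallelizedIterable<B> = coroutineScope {
        ParallelizedIterable(iter.map { async { f(it) } }.awaitAll())
    }

    // terminal operation back to a plain List
    fun toList(): List<A> = iter.toList()
}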