New to async and trying to understand when it makes sense to use it.
We are going to have lots of methods in Web API 2 calling legacy web services.
We have lots of low-level DLLs (Company.Dal.dll, Company.Biz.dll, etc.) with methods that are not async.
Question
Does async really have to go all the way?
Is there any benefit in having a high-level DLL (all methods async) call low-level DLLs (DAL, Biz, etc. legacy code) where none of the methods are async?
Is there any benefit in having just the high-level component be async and the rest synchronous?
Many thanks for any clarification.
Any good tutorials explaining this concept?
Using async only makes sense if you actually await something. If you don't, the async method will actually be completely synchronous (and you get a warning from the compiler about it).
In this case, async doesn't have any advantages, only disadvantages: the code is more complex and less efficient.
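For example, a minimal sketch (not from the question's code) of the two cases: the first method triggers compiler warning CS1998 and runs completely synchronously, while the second actually releases the calling thread while the awaited I/O is in flight:

using System.Net.Http;
using System.Threading.Tasks;

class Example
{
    // No await: the compiler warns (CS1998) and the method runs completely synchronously.
    public async Task<int> NoAwaitAsync()
    {
        return 42;
    }

    // Awaits real I/O: the calling thread is free while the request is in flight.
    public async Task<string> WithAwaitAsync(HttpClient client)
    {
        return await client.GetStringAsync("https://example.com");
    }
}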
A thread can only do one thing at a time. If procedures keep your thread busy, there is no sense in making them async.
However, if there are periods where the thread in your procedure has to wait for something else to finish, it could do something useful in the meantime. In those circumstances async-await becomes useful.
Eric Lippert once explained async-await with a restaurant metaphor (search that page for async-await). If a cook has to wait until the bread is toasted, he could do something else in the meantime, like cooking an egg, and get back to the toaster when that "something else" is finished, or when it in turn has to wait for something, like the egg being cooked.
In software, the places where your thread will typically do nothing but wait for something to finish are reading/writing to disk, sending or receiving data over the network, etc. Those are typically the actions for which you'll find async as well as non-async versions of the procedure. See for instance classes like Stream, TextReader, WebClient, etc.
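For instance, the same file read exists in a blocking and a non-blocking form; here is a minimal sketch using StreamReader (the file path is just a placeholder):

using System.IO;
using System.Threading.Tasks;

static class FileExample
{
    // Synchronous read: the calling thread is blocked while the disk works.
    public static string ReadSync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }

    // Asynchronous read: the calling thread is released until the I/O completes.
    public static async Task<string> ReadAllAsync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return await reader.ReadToEndAsync();
        }
    }
}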
If your thread has to do a lot of calculations, making the function async is not useful, because there is no moment where your thread is merely waiting, so it has no spare time to do other things.
However, if your thread could do something useful while the calculations are done, consider letting another thread do those calculations while your thread is doing the other useful stuff:
private async Task<int> MyLengthyProcedure(...)
{
    Task<int> calculationTask = Task.Run(() => DoCalculations(...));

    // While a thread pool thread is doing those calculations,
    // your thread can do other useful things:
    int i = DoOtherCalculations();

    // Or, if there are calculations that could be performed
    // by separate threads simultaneously, start a second task:
    Task<int> otherCalculationTask = Task.Run(() => DoEvenMoreCalculations(...));

    // Now two threads are doing calculations. This thread is still free
    // to do other things.

    // After a while you need the results of both calculations:
    await Task.WhenAll(calculationTask, otherCalculationTask);

    // The int return values of DoCalculations and DoEvenMoreCalculations are in
    // the Result property of the respective task objects:
    int j = calculationTask.Result;
    int k = otherCalculationTask.Result;
    return i + j + k;
}
I've been attempting to do a bit of performance review on an app I have; it's a back-end Kotlin app that just pulls in some data, does a bit of data transformation and dumps it out, nothing too fancy. One thing that caught my eye was the final bit of execution, where we dump our final data onto a queue. At first I noticed that when we start up the app, the final network call takes a very long time, sometimes over a second.
Normally we run this network call in a coroutine to stop that last call blocking everything, but when I started trying to time the coroutine and the network call separately I got some odd results: from what I can see, the coroutine can take forever to launch/complete compared to the network call. It's entirely possible I'm not recording things correctly, but this is the general timing approach I have:
val coroutineTime = Instant.now().toEpochMilli()
GlobalScope.launch {
    executionTime = measureTimeMillis { /* do message sending */ }
    totalTime = Instant.now().toEpochMilli() - coroutineTime
    // log out executionTime and totalTime
}
Now, here's what I'll see, something like:
- totalTime = ~800ms
- executionTime = ~150ms
These aren't one-offs either. I have multiple of these processes going on at once (up to 10 threads, I think) and the first total times always take significantly longer than the actual executionTime/network call. Eventually, after a few dozen messages, the overhead calms down and the two times become equivalent at about 15 ms, but having nearly 700 ms of overhead on coroutine start-up seems insane to me.
Is this normal/expected behavior? I've tested this in a separate app and see similar but less extreme results, where the first coroutine takes about 70 ms to boot up. I'm struggling to find any other examples of this type of discussion outside of Kotlin being used in Android development.
As a first note, it's almost never a good idea to use GlobalScope unless you really know what you're doing. This is why it was marked as a delicate API. You should instead use a scope that is appropriately closed (following the lifecycle of whatever component launches this work).
Now, AFAIK, GlobalScope runs on the default dispatcher, so maybe this is due to a cold start of that default thread pool. Later, it could also be a problem to use this dispatcher for network calls, depending on the number of concurrent coroutines you have. It would be more appropriate to use Dispatchers.IO instead for IO-bound work (or a custom thread pool).
It still doesn't explain the cold start, but I would first change that before investigating.
This is expected behavior if you use coroutines inappropriately ;-)
My guess is that your message sending is a blocking operation. By default, GlobalScope.launch() dispatches coroutines with Dispatchers.Default, which is designed for CPU-intensive operations; it has a limited number of threads and you should never block when using it. If you do, you may run out of threads, and coroutines will have to wait until some blocking operation finishes.
If you need to run blocking or IO code, you should use Dispatchers.IO instead:
GlobalScope.launch(Dispatchers.IO) {
    // blocking or IO work here
}
I was facing a similar issue. I have a function that loads some data from shared prefs, does some calculations on the data (all of this on Dispatchers.Default), and returns the result on Dispatchers.Main. I measured how long it took the coroutine to actually start executing the block inside the Dispatchers.Main launch { } after the calculations are done (time from tag2 to tag3 below), and got about 950 ms (!!). Here is the function:
fun someName() {
    CoroutineScope(Dispatchers.Default).launch {
        val time = System.currentTimeMillis()
        // load data and calculations
        Log.d("tag2", "load and calculations took " + (System.currentTimeMillis() - time))
        CoroutineScope(Dispatchers.Main.immediate).launch {
            Log.d("tag3", "reached main thread code " + (System.currentTimeMillis() - time))
            // do something
            Log.d("tag4", "do something took " + (System.currentTimeMillis() - time))
        }
    }
}
But then I realized this happens during app launch, when the main thread is busy creating all the UI, so even with .immediate it takes time until the main thread gets around to executing the dispatched code... Then I tried running this function after the app had already started and was idle, and found that tag2 to tag3 takes about 1 ms (!!) (with .immediate). So it looks like when you dispatch something to a coroutine and the target thread isn't busy, it will start immediately.
I've got the following functions for making server calls:
suspend fun <T : BaseResponse> processPost(post: Post): T? {
    val gson = Gson()
    val data = gson.toJson(post.reqData)
    val res = sendPost(data, post.script)
    Log.d("server", "res:" + res.first)
    // process response here
    return null
}

private fun sendPost(data: String, url: String): Pair<String, Int> {
    // send data to server
}
In some cases processPost may enter an infinite loop (for instance, while waiting for an access token refresh). Of course this code should never be run on the main thread. But when I mark this function as suspend, the IDE highlights the modifier as redundant. It's not a big deal, but I'm curious: how then can I keep the function from executing on the main thread?
It seems that you have quite some learning on coroutines to do. It’s impossible to cover all you need to know in one single answer. That’s what tutorials are for. Anyway I will try to answer just the points you asked. It may not make sense before you learn the concepts, I’m sorry if my answer does not help.
Just like many other things, coroutines are not magic. If you don’t understand what something does, you cannot hope it has the properties you want. It may sound harsh but I want to stress that such mentality is a major cause of bugs.
Making a function suspending allows you to call other suspending functions in the function body. It does not make blocking calls non-blocking, nor does it automatically jump threads for you.
You can use withContext to have the execution jump to another thread.
suspend fun xyz() = withContext(Dispatchers.IO) {
...
}
When you call xyz on the main thread, it hands the task to the IO dispatcher. The main thread is not blocked and can then handle other stuff in the app.
EDIT regarding the comment.
Sorry for being so patronizing and making a wrong guess about your misconception.
If you just want the compiler/the IDE to shut up about the warning, you can simply add @Suppress("RedundantSuspendModifier") to the function. But you shouldn't, because the compiler knows better than you, at least for now.
The great thing about coroutines is that you can write in direct style without blocking the main thread.
launch(Dispatchers.Main) {
val result = makeAnHttpCall() // this can take a long time
messWithUi(result) // changes to the UI have to be made on the main thread
}
I hope it is obvious by now that the suspend modifier is not going to stop the main thread from calling the function.
@Suppress("RedundantSuspendModifier")
suspend fun someHeavyComputation(): Result {
return ...
}
launch(Dispatchers.Main) {
val result = someHeavyComputation() // this will run in the main thread
messWithUi(result)
}
Now if you want the computation not to be done in the main thread:
suspend fun someHeavyComputation() = withContext(Dispatchers.Default) {
... // this will be in a thread pool
}
Further reading: Blocking threads, suspending coroutines.
Basically I want an actor to safely change a ScalaFX GUI.
I've read many posts describing this, but they were sometimes contradictory and some years old, so some of them might be outdated.
I have some working example code and I basically want to know whether this kind of programming is thread-safe.
The other question is whether I can configure sbt or the compiler or something in a way that all threads (from the GUI, the actors and the futures) are started by the same dispatcher.
I've found some example code "scalafx-akka-demo" on GitHub, which is 4 years old. What I did in the following example is basically the same, just a little simplified to keep things easy.
Then there is the scalatrix example of approximately the same age. This example really worries me.
In there is a self-written dispatcher by Viktor Klang from 2012, and I have no idea how to make this work or whether I really need it. The question is: is this dispatcher only an optimisation, or do I have to use something like it to be thread-safe?
But even if I don't absolutely need a dispatcher like the one in scalatrix, it is not optimal to have one dispatcher for the actor threads and another for the ScalaFX event threads. (And maybe one for the Future threads as well?)
In my actual project, I have measurement values coming from a device over TCP/IP, going to a TCP/IP actor, and they are to be displayed in a ScalaFX GUI. But that is much too long to post here.
So here is my example code:
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scalafx.Includes._
import scalafx.application.{JFXApp, Platform}
import scalafx.application.JFXApp.PrimaryStage
import scalafx.event.ActionEvent
import scalafx.scene.Scene
import scalafx.scene.control.Button
import scalafx.stage.WindowEvent
import scala.concurrent.ExecutionContext.Implicits.global

object Main extends JFXApp {

  case object Count
  case object StopCounter
  case object CounterReset

  val aktorSystem: ActorSystem = ActorSystem("My-Aktor-system") // Create the actor system
  val guiActor: ActorRef = aktorSystem.actorOf(Props(new GUIActor), "guiActor") // Create GUI actor

  val button: Button = new Button(text = "0") {
    onAction = (_: ActionEvent) => guiActor ! Count
  }

  val someComputation = Future {
    Thread.sleep(10000)
    println("Doing counter reset")
    guiActor ! CounterReset
    Platform.runLater(button.text = "0")
  }

  class GUIActor extends Actor {
    def receive: Receive = counter(1)

    def counter(n: Int): Receive = {
      case Count =>
        Platform.runLater(button.text = n.toString)
        println("The count is: " + n)
        context.become(counter(n + 1))
      case CounterReset => context.become(counter(1))
      case StopCounter  => context.system.terminate()
    }
  }

  stage = new PrimaryStage {
    scene = new Scene {
      root = button
    }
    onCloseRequest = (_: WindowEvent) => {
      guiActor ! StopCounter
      Await.ready(aktorSystem.whenTerminated, 5.seconds)
      Platform.exit()
    }
  }
}
So this code brings up a button, and every time it is clicked the number of the button increases. After some time the number on the button is reset once.
In this example code I tried to have the ScalaFX GUI, the actor and the Future influence each other. So the button click sends a message to the actor, and then the actor changes the GUI - which is what I am testing here.
The Future also sends a message to the actor and changes the GUI.
So far, this example works and I haven't found anything wrong with it.
But unfortunately, when it comes to thread safety, that doesn't mean much.
My concrete questions are:
Is the method used to change the GUI in the example code thread-safe?
Is there maybe a better way to do it?
Can the different threads be started from the same dispatcher?
(if yes, then how?)
To address your questions:
1) Is the method used to change the GUI in the example code thread-safe?
Yes.
JavaFX, which ScalaFX sits upon, implements thread safety by insisting that all GUI interactions take place upon the JavaFX Application Thread (JAT), which is created during JavaFX initialization (ScalaFX takes care of this for you). Any code running on a different thread that interacts with JavaFX/ScalaFX will result in an error.
You are ensuring that your GUI code executes on the JAT by passing interacting code via the Platform.runLater method, which evaluates its arguments on the JAT. Because arguments are passed by name, they are not evaluated on the calling thread.
So, as far as JavaFX is concerned, your code is thread safe.
However, potential issues can still arise if the code you pass to Platform.runLater contains any references to mutable state maintained on other threads.
You have two calls to Platform.runLater. In the first of these (button.text = "0"), the only mutable state (button.text) belongs to JavaFX, which will be examined and modified on the JAT, so you're good.
In the second call (button.text = n.toString), you're passing the same JavaFX mutable state (button.text). But you're also passing a reference to n, which belongs to the GUIActor thread. However, this value is immutable, and so there are no threading issues from looking at its value. (The count is maintained by the Akka GUIActor class's context, and the only interactions that change the count come through Akka's message handling mechanism, which is guaranteed to be thread safe.)
That said, there is one potential issue here: the Future both resets the count (which will occur on the GUIActor thread) and sets the text to "0" (which will occur on the JAT). Consequently, it's possible that these two actions will occur in an unexpected order, such as the button's text being changed to "0" before the count is actually reset. If this happens just as the user clicks the button, you'll get a race condition, and it's conceivable that the displayed value may end up not matching the current count.
2) Is there maybe a better way to do it?
There's always a better way! ;-)
To be honest, given this small example, there's not a lot of further improvement to be made.
I would try to keep all of the interaction with the GUI inside either GUIActor, or the Main object to simplify the threading and synchronization issues.
For example, going back to the last point in the previous answer, rather than have the Future update button.text, it would be better if that was done as part of the CounterReset message handler in GUIActor, which then guarantees that the counter and button appearance are synchronized correctly (or, at least, that they're always updated in the same order), with the displayed value guaranteed to match the count.
If your GUIActor class is handling a lot of interaction with the GUI, then you could have it execute all of its code on the JAT (I think this was the purpose of Viktor Klang's example), which would simplify a lot of its code. For example, you would not have to call Platform.runLater to interact with the GUI. The downside is that you then cannot perform processing in parallel with the GUI, which might slow down its performance and responsiveness as a result.
3) Can the different threads be started from the same dispatcher? (if yes, then how?)
You can specify custom execution contexts for both futures and Akka actors to get better control of their threads and dispatching. However, given Donald Knuth's observation that "premature optimization is the root of all evil", there's no evidence that this would provide you with any benefits whatsoever, and your code would become significantly more complicated as a result.
As far as I'm aware, you can't change the execution context for JavaFX/ScalaFX, since JAT creation must be finely controlled in order to guarantee thread safety. But I could be wrong.
In any case, the overhead of having different dispatchers is not going to be high. One of the reasons for using futures and actors is that they will take care of these issues for you by default. Unless you have a good reason to do otherwise, I would use the defaults.
In my application I have the following:
db2.CreateTable<CategoryGroup>();
db2.CreateTable<Category>();
db2.CreateTable<CategoryGroupSource>();
db2.CreateTable<CategorySource>();
db2.CreateTable<Phrase>();
db2.CreateTable<PhraseSource>();
db2.CreateTable<Score>();
db2.CreateTable<Setting>();
From what I understand there is an Async way to do this also:
database.CreateTableAsync<TodoItem>().Wait();
Can someone explain whether there is any advantage in using the async way, and do people normally always use async?
Also, are there likely to be benefits if I use this type of async query:
public Task<TodoItem> GetItemAsync(int id)
{
return database.Table<TodoItem>().Where(i => i.ID == id).FirstOrDefaultAsync();
}
When you call these methods on the main (UI) thread, everything in the UI stops for as long as it takes each method to execute. If db2.CreateTable<CategoryGroup>() doesn't take much time doing its thing, it shouldn't be a problem.
Doing a lot of time consuming actions straight after each other might affect your UI and make it freeze.
Calling the *Async variant of the method moves the work to a background thread, via the task API. Calling Wait() on that task, though, makes the current thread (in this case the UI thread) wait for the task to finish, and you're stuck with the same problem.
You should always await tasks: await database.CreateTableAsync<TodoItem>(). This lets the work execute in the background without making the current thread wait for it to finish. The next line in your code won't be executed until the task has finished, though, which makes the *Async variant look like it behaves just like the regular version when you read the code.
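A minimal sketch of the difference, reusing the question's database object and assuming the caller can itself be async (e.g. an async lifecycle method or event handler):

public async Task InitializeDatabaseAsync()
{
    // Blocks the calling thread until the table is created -- the same freeze as the synchronous call:
    // database.CreateTableAsync<TodoItem>().Wait();

    // Frees the calling thread while the work runs; execution resumes here when it finishes:
    await database.CreateTableAsync<TodoItem>();
}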
Personally, I'd probably move all the methods into a task and just await that. That way you're not returning to the UI thread between each task to execute the next one:
await Task.Run(() =>
{
    db2.CreateTable<CategoryGroup>();
    db2.CreateTable<Category>();
    db2.CreateTable<CategoryGroupSource>();
    db2.CreateTable<CategorySource>();
    db2.CreateTable<Phrase>();
    db2.CreateTable<PhraseSource>();
    db2.CreateTable<Score>();
    db2.CreateTable<Setting>();
});
In this case you're making the database do all its work on a background thread (and not freezing the UI while it's doing it). It then returns to the UI thread, so you can update the UI.
We recently developed a site based on SOA but this site ended up having terrible load and performance issues when it went under load. I posted a question related this issue here:
ASP.NET website becomes unresponsive under load
The site is made of an API (WEB API) site which is hosted on a 4-node cluster and a web site which is hosted on another 4-node cluster and makes calls to the API. Both are developed using ASP.NET MVC 5 and all actions/methods are based on async-await method.
After running the site under some monitoring tools such as NewRelic, investigating several dump files and profiling the worker process, it turned out that under a very light load (e.g. 16 concurrent users) we ended up having around 900 threads which utilized 100% of CPU and filled up the IIS thread queue!
Even though we managed to deploy the site to the production environment by introducing heaps of caching and performance amendments, many developers in our team believe that we have to remove all async methods and convert both the API and the web site to normal Web API and action methods which simply return an ActionResult.
I personally am not happy with that approach, because my gut feeling is that we have not used the async methods properly; otherwise it would mean that Microsoft has introduced a feature that is basically destructive and unusable!
Do you know of any reference that clears up where and how async methods should/can be used, and how we should use them to avoid such dramas? E.g. based on what I read on MSDN, I believe the API layer should be async but the web site could be a normal non-async ASP.NET MVC site.
Update:
Here is the async method that makes all the communications with the API.
public static async Task<T> GetApiResponse<T>(object parameters, string action, CancellationToken ctk)
{
    using (var httpClient = new HttpClient())
    {
        httpClient.BaseAddress = new Uri(BaseApiAddress);
        var formatter = new JsonMediaTypeFormatter();
        return
            await
                httpClient.PostAsJsonAsync(action, parameters, ctk)
                    .ContinueWith(x => x.Result.Content.ReadAsAsync<T>(new[] { formatter }).Result, ctk);
    }
}
Is there anything silly about this method? Note that when we converted all methods to non-async methods we got heaps better performance.
Here is a sample usage (I've cut the other bits of the code, which were related to validation, logging, etc. This code is the body of an MVC action method).
In our service wrapper:
public async static Task<IList<DownloadType>> GetSupportedContentTypes()
{
    string userAgent = Request.UserAgent;
    var parameters = new { Util.AppKey, Util.StoreId, QueryParameters = new { UserAgent = userAgent } };
    var taskResponse = await Util.GetApiResponse<ApiResponse<SearchResponse<ProductItem>>>(
        parameters,
        "api/Content/ContentTypeSummary",
        default(CancellationToken));
    return taskResponse.Data.Groups.Select(x => x.DownloadType()).ToList();
}
And in the Action:
public async Task<ActionResult> DownloadTypes()
{
IList<DownloadType> supportedTypes = await ContentService.GetSupportedContentTypes();
Is there anything silly about this method? Note that when we converted all methods to non-async methods we got heaps better performance.
I can see at least two things going wrong here:
public static async Task<T> GetApiResponse<T>(object parameters, string action, CancellationToken ctk)
{
    using (var httpClient = new HttpClient())
    {
        httpClient.BaseAddress = new Uri(BaseApiAddress);
        var formatter = new JsonMediaTypeFormatter();
        return
            await
                httpClient.PostAsJsonAsync(action, parameters, ctk)
                    .ContinueWith(x => x.Result.Content
                        .ReadAsAsync<T>(new[] { formatter }).Result, ctk);
    }
}
Firstly, the lambda you're passing to ContinueWith is blocking:
x => x.Result.Content.ReadAsAsync<T>(new[] { formatter }).Result
This is equivalent to:
x => {
    var task = x.Result.Content.ReadAsAsync<T>(new[] { formatter });
    task.Wait();
    return task.Result;
};
Thus, you're blocking the pool thread on which the lambda happens to be executed. This effectively kills the advantage of the naturally asynchronous ReadAsAsync API and reduces the scalability of your web app. Watch out for other places like this in your code.
Secondly, an ASP.NET request is handled by a server thread with a special synchronization context installed on it, AspNetSynchronizationContext. When you use await for continuations, the continuation callback is posted to the same synchronization context; the compiler-generated code takes care of this. OTOH, when you use ContinueWith, this doesn't happen automatically.
Thus, you need to explicitly provide the correct task scheduler, remove the blocking .Result (this will return a task) and Unwrap the nested task:
return
    await
        httpClient.PostAsJsonAsync(action, parameters, ctk).ContinueWith(
            x => x.Result.Content.ReadAsAsync<T>(new[] { formatter }),
            ctk,
            TaskContinuationOptions.None,
            TaskScheduler.FromCurrentSynchronizationContext()).Unwrap();
That said, you really don't need such added complexity of ContinueWith here:
var x = await httpClient.PostAsJsonAsync(action, parameters, ctk);
return await x.Content.ReadAsAsync<T>(new[] { formatter });
The following article by Stephen Toub is highly relevant:
"Async Performance: Understanding the Costs of Async and Await".
If I have to call an async method in a sync context, where using await
is not possible, what is the best way of doing it?
You should almost never need to mix await and ContinueWith; just stick with await. Basically, if you use async, it's got to be async "all the way".
For the server-side ASP.NET MVC / Web API execution environment, it simply means the controller method should be async and return a Task or Task<>, check this. ASP.NET keeps track of pending tasks for a given HTTP request; the request does not complete until all of those tasks have completed.
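For example, a minimal Web API 2 controller action in that style might look like the sketch below (GetSupportedContentTypesAsync is an assumed name standing in for your async service call, not something from the question's code):

using System.Threading.Tasks;
using System.Web.Http;

public class DownloadTypesController : ApiController
{
    // Async "all the way": the action returns a Task and awaits the service call,
    // so the request thread is released while the downstream call is pending.
    [HttpGet]
    public async Task<IHttpActionResult> Get()
    {
        var types = await ContentService.GetSupportedContentTypesAsync();
        return Ok(types);
    }
}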
If you really need to call an async method from a synchronous method in ASP.NET, you can use AsyncManager like this to register a pending task. For classic ASP.NET, you can use PageAsyncTask.
In the worst case, you'd call task.Wait() and block, because otherwise your task might continue outside the boundaries of that particular HTTP request.
For client-side UI apps, some different scenarios are possible for calling an async method from a synchronous method. For example, you can use ContinueWith(action, TaskScheduler.FromCurrentSynchronizationContext()) and fire a completion event from the action (like this).
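As a rough illustration for a client UI app (a WinForms-style sketch; DownloadReportAsync and resultLabel are placeholder names, not from the question's code):

private void RefreshButton_Click(object sender, EventArgs e)
{
    DownloadReportAsync().ContinueWith(t =>
    {
        // This continuation is scheduled on the captured UI synchronization context,
        // so it is safe to touch controls here.
        resultLabel.Text = t.IsFaulted ? "Failed" : t.Result;
    },
    TaskScheduler.FromCurrentSynchronizationContext());
}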
async and await should not create a large number of threads, particularly not with just 16 users. In fact, it should help you make better use of threads. The purpose of async and await in MVC is to actually give up the thread pool thread when it's busy processing IO bound tasks. This suggests to me that you are doing something silly somewhere, such as spawning threads and then waiting indefinitely.
Still, 900 threads is not really a lot, and if they're using 100% CPU, then they're not waiting... they're chewing on something. It's this something that you should be looking into. You said you have used tools like NewRelic; well, what did they point to as the source of this CPU usage? Which methods?
If I were you, I would first prove that merely using async and await is not the cause of your problems. Simply create a simple site that mimics the behavior and then run the same tests on it.
Second, take a copy of your app, and start stripping stuff out and then running tests against it. See if you can track down where the problem is exactly.
There is a lot of stuff to discuss.
First of all, async/await helps most naturally when your application has almost no business logic. The point of async/await is to avoid having many threads sleeping while they wait for something, mostly IO, e.g. database queries (and fetching). If your application runs heavy business logic that uses the CPU at 100%, async/await does not help you.
The problem with 900 threads is that they are inefficient if they run concurrently. The point is that it's better to have about as many "business" threads as your server has cores/processors; the reason is thread context switching, lock contention and so on. There are a lot of systems, like the LMAX Disruptor pattern or Redis, which process data on one thread (or one thread per core). It's just better, as you do not have to handle locking.
How do you reach the described approach? Look at the Disruptor: queue incoming requests and process them one by one instead of in parallel.
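As a very rough sketch of that idea (not the author's code; the names are illustrative), a single consumer draining a queue keeps the business logic on one thread:

using System.Collections.Concurrent;
using System.Threading.Tasks;

class RequestProcessor
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();

    public RequestProcessor()
    {
        // A single long-running consumer drains the queue, so the business logic
        // effectively runs on one thread and there is no lock contention.
        Task.Factory.StartNew(() =>
        {
            foreach (var request in _queue.GetConsumingEnumerable())
            {
                Handle(request);
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Producers (e.g. request handlers) just enqueue and return.
    public void Enqueue(string request) => _queue.Add(request);

    private void Handle(string request)
    {
        // CPU-bound business logic goes here.
    }
}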
The opposite case, when there is almost no business logic and many threads just wait for IO, is the right place to put async/await to work.
Here is roughly how it works: there is a thread (usually only one) which reads bytes from the network. Once a request arrives, this thread reads the data. There is also a limited thread pool of workers which process requests. The point of async is that once a processing thread is waiting for something, mostly IO or the database, the thread is returned to the pool and can be used for another request. Once the IO response is ready, some thread from the pool is used to finish the processing. This is how you can use a few threads to serve thousands of requests per second.
I would suggest you draw a picture of how your site works, what each thread does, and how concurrently things run. Note that you need to decide whether throughput or latency is more important to you.