Tasks vs. TPL Dataflow vs. Async/Await, which to use when?

I have read through quite a number of technical documents, either by members of the Microsoft team or by other authors, detailing the functionality of the new TPL Dataflow library, the async/await concurrency framework, and the TPL. However, I have not really come across anything that clearly delineates which to use when. I am aware that each has its own place and applicability, but specifically I wonder about the following situation:
I have a data flow model that runs completely in-process. At the top sits a data generation component (A) which generates data and passes it on, either via dataflow block linkages or by raising events, to a processing component (B). Some parts within (B) have to run synchronously, while (A) massively benefits from parallelism, as most of its processes are I/O or CPU bound (reading binary data from disk, then deserializing and sorting it). In the end the processing component (B) passes transformed results on to (C) for further usage.
I wonder specifically when to use tasks, async/await, and TPL Dataflow blocks with regard to the following:
Kicking off the data generation component (A). I clearly do not want to block the GUI/dashboard, so this process would have to run on a different thread/task.
How to call methods within (A), (B), and (C) that are not directly involved in the data generation and processing, but perform configuration work that may take several hundred milliseconds or even seconds to return. My hunch is that this is where async/await shines?
What I struggle with most is how to best design the message passing from one component to the next. TPL Dataflow looks very interesting, but it is sometimes too slow for my purposes (see the note on performance at the end). If I do not use TPL Dataflow, how do I achieve responsiveness and concurrency via in-process, inter-task data passing? For example, if I raise an event within a task, the subscribed event handler runs in the same task instead of being passed to another task, correct? In summary, how can component (A) go about its business after passing data on to component (B), while component (B) retrieves the data and focuses on processing it? Which concurrency model is best used here?
I implemented data flow blocks here, but is that truly the best approach?
I guess the points above boil down to my struggle with how to design and implement API-style components using standard practice. Should methods be designed async, data inputs as dataflow blocks, and data outputs as either dataflow blocks or events? What is the best approach in general? I am asking because most of the components mentioned above are supposed to work independently, so they can essentially be swapped out or altered internally without having to rewrite the accessors and outputs.
Note on performance: I mentioned that TPL Dataflow blocks are sometimes slow. I deal with a high-throughput, low-latency type of application targeting disk I/O limits, and thus TPL Dataflow blocks often performed much more slowly than, for example, a synchronous processing unit. The issue is that I do not know how to embed the process in its own task or concurrency model to achieve something similar to what TPL Dataflow blocks already take care of, but without the overhead that comes with them.

It sounds like you have a "push" system. Plain async code only handles "pull" scenarios.
Your choice is between TPL Dataflow and Rx. I think TPL Dataflow is easier to learn, but since you've already tried it and it won't work for your situation, I would try Rx.
Rx comes at the problem from a very different perspective: it is centered around "streams of events" rather than TPL Dataflow's "mesh of actors". Recent versions of Rx are very async-friendly, so you can use async delegates at several points in your Rx pipeline.
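For example, here is a minimal sketch of an async delegate inside an Rx pipeline (the source, DeserializeAsync, and the consumer are hypothetical stand-ins for components (A) and (B); assumes the System.Reactive package):

    using System;
    using System.Reactive.Linq;
    using System.Threading.Tasks;

    class RxAsyncSketch
    {
        // Hypothetical async processing step, e.g. deserializing a binary chunk.
        static Task<string> DeserializeAsync(byte[] chunk) =>
            Task.FromResult(BitConverter.ToString(chunk));

        static void Main()
        {
            // Hypothetical push-based source standing in for component (A).
            IObservable<byte[]> source = Observable.Range(1, 5)
                .Select(i => new byte[] { (byte)i });

            using var subscription = source
                .SelectMany(chunk => DeserializeAsync(chunk))    // async delegate in the pipeline
                .Subscribe(record => Console.WriteLine(record)); // component (B) consumes here
        }
    }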
Regarding your API design, both TPL Dataflow and Rx provide interfaces you should implement: IReceivableSourceBlock<T>/ITargetBlock<T> for TPL Dataflow, and IObservable<T>/IObserver<T> for Rx. You can just wire up the implementations to the endpoints of your internal mesh (TPL Dataflow) or query (Rx). That way, your components are just a "block" or "observable/observer/subject" that can be composed into other "meshes" or "queries".
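As a concrete illustration, here is a minimal sketch (the block delegates and types are hypothetical) of hiding a two-block internal mesh behind a single composable block using DataflowBlock.Encapsulate:

    using System;
    using System.Threading.Tasks.Dataflow;

    static class ProcessingComponent
    {
        // Expose an internal two-block mesh as one block: the head acts as
        // the target endpoint, the tail as the source endpoint.
        public static IPropagatorBlock<byte[], string> Create()
        {
            // Hypothetical stages: parallel deserialization, then serial processing.
            var deserialize = new TransformBlock<byte[], string>(
                bytes => BitConverter.ToString(bytes),
                new ExecutionDataflowBlockOptions
                {
                    MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
                });

            var process = new TransformBlock<string, string>(
                s => s.ToLowerInvariant()); // degree of parallelism defaults to 1 (serial)

            deserialize.LinkTo(process,
                new DataflowLinkOptions { PropagateCompletion = true });

            return DataflowBlock.Encapsulate(deserialize, process);
        }
    }

A caller can then link (A)'s output into this block and link the block onward into (C) without knowing anything about its internals.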
Finally, for your async construction, you just need to use an asynchronous factory pattern. Your implementation can call Task.Run to do the configuration work on a thread pool thread.
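A minimal sketch of such a factory (the class and its configuration step are hypothetical):

    using System.Threading.Tasks;

    public class DataGenerator
    {
        private DataGenerator() { } // creation goes through the factory

        public static async Task<DataGenerator> CreateAsync()
        {
            var generator = new DataGenerator();
            // Run the slow configuration on a thread pool thread so the
            // awaiting caller (e.g. the GUI thread) stays responsive.
            await Task.Run(() => generator.LoadConfiguration());
            return generator;
        }

        private void LoadConfiguration()
        {
            // hypothetical setup work taking hundreds of milliseconds
        }
    }

The GUI then simply writes var generator = await DataGenerator.CreateAsync(); and never blocks.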

Just wanted to leave this here in case it helps someone get a feel for when to use Dataflow, because I was surprised at the TPL Dataflow performance. I had the following scenario:
Iterate through all the C# code files in a project (around 3,500 files)
Read all of each file's lines (an I/O operation)
Iterate through all the file lines and search them for certain strings
Return the files, and the lines within them, that contain the searched string
I thought this was a really good fit for TPL Dataflow, but when I simply started a new Task for each file I needed to open, and did all the logic in that task, the code was faster.
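A minimal sketch of that Task-per-file approach (the directory and search string are illustrative; assumes a .NET version that has File.ReadAllLinesAsync):

    using System;
    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;

    class FileSearch
    {
        static async Task Main()
        {
            var projectDir = @"C:\src\MyProject"; // illustrative path
            var searched = "TODO";                // illustrative search string

            // One Task per file: read its lines (I/O), then scan them.
            var results = await Task.WhenAll(
                Directory.EnumerateFiles(projectDir, "*.cs", SearchOption.AllDirectories)
                    .Select(async path =>
                    {
                        var lines = await File.ReadAllLinesAsync(path);
                        var hits = lines.Where(l => l.Contains(searched)).ToArray();
                        return (path, hits);
                    }));

            foreach (var (path, hits) in results.Where(r => r.hits.Length > 0))
                Console.WriteLine($"{path}: {hits.Length} matching line(s)");
        }
    }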
From this, my conclusion was to go with an async/await/Task implementation by default, at least for such simple tasks, and that TPL Dataflow was made for more complex situations, especially when you need batching and other more push-based scenarios, and when synchronization is more of an issue.
Edit: I have since done some more research on this and created a demo project, and the results are quite interesting: as the operations become more numerous and more complex, TPL Dataflow becomes more efficient.
Here is the link to the repo.

Related

How to apply machine learning for streaming data in Apache NIFI

I have a processor that generates time series data in JSON format. Based on the received data I need to make a forecast using machine learning algorithms in Python, then write the new forecast values to another flow file.
The problem is: when you run such a Python script, it must perform a number of heavy preprocessing operations: queries to a database, creating a complex data structure, initializing forecasting models, etc.
If you use ExecuteStreamCommand, then the script will be run anew for every flow file. Is this true?
Can I make a Python script in NiFi that starts once and receives flow files many times, storing the history of previously received data? Or do I need to make an HTTP service that will receive data from NiFi?
You have a few options:
Build a custom processor. This is my suggested approach. The code would need to be in Java (or Groovy, which provides a more Python-like experience) but would not have Python dependencies, etc. However, I have seen examples of this approach for ML model application (see Tim Spann's examples) and this is generally very effective. The initialization and individual flowfile trigger logic is cleanly separated, and performance is good.
Use InvokeScriptedProcessor. This will allow you to write the code in Python and separate the initialization (pre-processing, DB connections, etc.; onScheduled in NiFi processor parlance) from the execution phase (onTrigger). Some examples exist, but I have not personally pursued this with Python specifically. You can use Python dependencies but not "native modules" (i.e. compiled C code), as the execution engine is still Jython.
Use ExecuteStreamCommand. Not strongly recommended. As you mention, every invocation would require the preprocessing steps to occur, unless you designed your external application in such a way that it ran a long-lived "server" component and each ESC command sent data to it and returned an individual response. I don't know what your existing Python application looks like, but this would likely involve complicated changes. Tim has another example using CDSW to host and deploy the model and NiFi to send it data via HTTP to evaluate.
Make a custom processor that can do that. Java is more appropriate. I believe you can do pretty much everything with Java; you just need to find the libraries. There might be some issues with initialization and preprocessing, but those can all be handled in NiFi's init function, which allows you to preserve the state of certain components.
In my use case, I had to build a custom processor that could take in images and count the number of people in each image. For that, I had to load a deep learning model once in the init method; afterwards, the onTrigger method could reuse a reference to that model every time it processed an image.

What's the best practice for NSPersistentContainer newBackgroundContext?

I'm familiarizing myself with NSPersistentContainer. I wonder if it's better to spawn an instance of the private context with newBackgroundContext every time I need to insert/fetch some entities in the background, or to create one private context, keep it, and use it for all background tasks throughout the lifetime of the app.
The documentation also offers the convenience method performBackgroundTask. I'm just trying to figure out the best practice here.
I generally recommend one of two approaches. (There are other setups that work, but these are two that I have used, tested, and would recommend.)
The Simple Way
You read from the viewContext and you write to the viewContext, and you only use the main thread. This is the simplest approach and avoids a lot of the multithreading issues that are common with Core Data. The problem is that the disk access happens on the main thread, and if you are doing a lot of it, it can slow down your app.
This approach is suitable for small, lightweight applications. Any app that has fewer than a few thousand total entities and no bulk changes at once would be a good candidate. A simple to-do list would be a good example.
The Complex Way
The complex way is to only read from the viewContext on the main thread and do all your writing using performBackgroundTask inside a serial queue. Every block inside performBackgroundTask refetches any managed objects that it needs (using objectIDs), and all managed objects that it creates are discarded at the end of the block. Each performBackgroundTask block is transactional, and saveContext is called at the end of the block. A fuller description can be found here: NSPersistentContainer concurrency for saving to core data
This is a robust and functional core-data setup that can manage data at any reasonable scale.
The problem is that you must always make sure that the managed objects are from the context you expect and are accessed on the correct thread. You also need a serial queue to make sure you don't get write conflicts. And you often need to use an NSFetchedResultsController to make sure entities are not deleted while you are holding pointers to them.

parallel programming for robot control

I need to write a program which does two tasks at the same time, for better efficiency and responsiveness. The first task is, for example, to get vision data from a camera and process it.
The second task is to receive the processed data from the first task and do something else with it (a robot control strategy). However, while the robot control task is being performed, the camera data reception should still be working.
Is there a solution for this type of programming in C++/C#? I'm learning TBB; is it the right choice? However, I'm reading about things like "loop parallelization"; am I going in the right direction?
This relates to a very common style in control programming, where the computer is used as a central unit connecting to electronic devices (sensors) and actuators, and all these devices are serviced concurrently.
No; your example of loop parallelization is using parallel programming to speed up the calculation of a result for one set of data.
What you need is multitasking. You didn't mention any target architecture. Assuming this will be an embedded system, like a microprocessor, you have several options. There are embedded micro-OSes like VxWorks and µC/OS that allow you to do just what you are asking. These allow you to set up multiple "tasks" that run virtually concurrently. Of course, true concurrency is impossible with one CPU, but the scheduler in these OSes is designed to be very deterministic, for quasi-real-time systems like the one you describe.
Sounds good to me! TBB is OK, and C# has useful thread pool and related classes. Just one thing, if you haven't done anything like this before: it's all about the data, not the code. If you design the data flow correctly, the code will write itself (well, OK, not really :)).
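Since C# was mentioned: here is a minimal sketch of the two-task design using a bounded BlockingCollection (the Frame type and the loop bounds are hypothetical stand-ins for real camera and robot APIs):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    record Frame(int Id); // hypothetical camera frame

    class RobotPipeline
    {
        static void Main()
        {
            // A bounded buffer decouples the camera task from the control task.
            using var frames = new BlockingCollection<Frame>(boundedCapacity: 8);

            var camera = Task.Run(() =>
            {
                for (var i = 0; i < 100; i++)  // stand-in for a capture loop
                    frames.Add(new Frame(i));  // blocks only when the buffer is full
                frames.CompleteAdding();       // signal end of stream
            });

            var control = Task.Run(() =>
            {
                // Runs concurrently with the camera task: new frames keep
                // arriving while earlier frames are being processed.
                foreach (var frame in frames.GetConsumingEnumerable())
                    Console.WriteLine($"control step using frame {frame.Id}");
            });

            Task.WaitAll(camera, control);
        }
    }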

Will there be IQueryable-like additions to IObservable? (.NET Rx)

The new IObservable/IObserver frameworks in the System.Reactive library coming in .NET 4.0 are very exciting (see this and this link).
It may be too early to speculate, but will there also be a (for lack of a better term) IQueryable-like framework built for these new interfaces as well?
One particular use case would be to assist in pre-processing events at the source, rather than in the chain of the receiving calls. For example, if you have a very 'chatty' event interface, using Where(...) before Subscribe(...) still receives all events through the pipeline, and the client does the filtering.
What I am wondering is if there will be something akin to IQueryableObservable, whereby these LINQ methods will be 'compiled' into some 'smart' Subscribe implementation in a source. I can imagine certain network server architectures that could use such a framework. Or how about an add-on to SQL Server (or any RDBMS for that matter) that would allow .NET code to receive new data notifications (triggers in code) and would need those notifications filtered server-side.
Well, you got it in the latest release of Rx, in the form of an interface called IQbservable (pronounced as IQueryableObservable). Stay tuned for a Channel 9 video on the subject, coming up early next week.
To situate this feature a bit, one should realize there are conceptually three orthogonal axes to the Rx/Ix puzzle:
What data model you're targeting. Here we find pull-based versus push-based models. Their relationship is based on duality. Transformations exist between those worlds (e.g. ToEnumerable); a short sketch of these conversions follows after this list.
Where you execute operations that drive your queries (sensu lato). Certain operators need concurrency. This is where scheduling and the IScheduler interface come in. Operators exist to hop between concurrency domains (e.g. ObserveOn).
How a query expression needs to execute. Either verbatim (IL) or translatable (expression trees). Their relationship is based on homoiconicity. Conversions exist between both representations (e.g. AsQueryable).
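Here is that sketch, covering the first two axes (assumes the System.Reactive package; the values and the scheduler choice are illustrative):

    using System;
    using System.Reactive.Concurrency;
    using System.Reactive.Linq;

    class AxesSketch
    {
        static void Main()
        {
            // Pull -> push: duality transformation from IEnumerable to IObservable.
            IObservable<int> pushed = new[] { 1, 2, 3 }.ToObservable();

            // Hop between concurrency domains via a scheduler.
            IObservable<int> hopped = pushed.ObserveOn(ThreadPoolScheduler.Instance);

            // Push -> pull: drain the observable back into an enumerable.
            foreach (var x in hopped.ToEnumerable())
                Console.WriteLine(x);
        }
    }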
All that the IQbservable interface (which is the dual to IQueryable and the expression tree representation of an IObservable query) enables is the last point. Sometimes people confuse the act of query translation (the "how" to run) with remoting aspects (the "where" to run). While typically you do translate queries into some target language (such as WQL, PowerShell, DSQLs for cloud notification services, etc.) and remote them into some target system, both concerns can be decoupled. For example, you could use the expression tree representation to do local query optimization.
With regards to possible security concerns, this is no different from the IQueryable capabilities. Typically one will only remote the expression language and not any "truly side-effecting" operators (whatever that means for languages other than fundamentalist functional ones). In particular, the Subscribe and Run operations stay local and take you out of the queryable monad (therefore triggering translation, just as GetEnumerator does in the world of IQueryable). How you'd remote the act of subscribing is something I'll leave to the imagination of the reader.
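To make the translation point concrete, here is a minimal local sketch (assumes the System.Reactive provider support; the default provider simply executes the expression locally):

    using System;
    using System.Reactive.Linq;

    class QbservableSketch
    {
        static void Main()
        {
            // AsQbservable switches to the expression-tree representation.
            var query = Observable.Range(0, 10).AsQbservable()
                .Where(x => x % 2 == 0); // captured as an Expression, not yet run

            Console.WriteLine(query.Expression); // the tree a provider could translate or optimize

            // Subscribe leaves the queryable monad and triggers execution,
            // just as GetEnumerator does in the world of IQueryable.
            using var subscription = query.Subscribe(Console.WriteLine);
        }
    }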
Start playing with the latest bits today and let us know what you think. Also stay tuned for the upcoming Channel 9 video on this new feature, including a discussion of some of its design philosophy.
While this sounds like an interesting possibility, I would have several reservations about implementing this.
1) Just as you can't serialize non-trivial lambda expressions used by IQueryable, serializing these for Rx would be similarly difficult. You would likely want to be able to serialize multi-line and statement lambdas as part of this framework. To do that, you would likely need to implement something like Erik Meijer's other pet projects - Dryad and Volta.
2) Even if you could serialize these lambda expressions, I would be concerned about the possibility of running arbitrary code on the server sent from the client. This could easily pose a security concern far greater than cross-site scripting. I doubt that the potential benefit of allowing the client to send expressions to the server to execute outweighs the security vulnerability implications.
8 (now 10) years into the future: I stumbled over Qactive (formerly Rxx), an Rx.NET-based queryable reactive TCP server provider.
It is the answer to the "question in question"
Server

    Observable
        .Interval(TimeSpan.FromSeconds(1))
        .ServeQbservableTcp(new IPEndPoint(IPAddress.Loopback, 3205));

Client

    var datasourceAddress = new IPEndPoint(IPAddress.Loopback, 3205);
    var datasource = new TcpQbservableClient<long>(datasourceAddress);

    (
        from value in datasource.Query()
        // The code below is actually executed on the server
        where value <= 5 || value >= 8
        select value
    )
    .Subscribe(Console.WriteLine);
What's mind-blowing about this is that clients can say what data they want and how frequently they want it, and the server can still limit and control when, how frequently, and how much data it returns.
For more info on this, see https://github.com/RxDave/Qactive
Another blog sample: https://sachabarbs.wordpress.com/2016/12/23/rx-over-the-wire/
One problem I would love to see solved with the Reactive Framework, if it's possible, is enabling emission of and subscription to change notifications for cached data from web services and other pull-only services.
It appears, based on a new Channel 9 interview, that there will be LINQ support for IObserver/IObservable in the BCL of .NET 4.
However, it will essentially be LINQ-to-Objects-style queries, so at this stage it doesn't look like a 'smart subscribe', as you put it. That's as far as the basic implementations go in .NET 4 (from my understanding of the above interview).
Having said that, the Reactive Framework (Rx) may have more detailed implementations of IObserver/IObservable, or you may be able to write your own, passing in Expression<Func...> for the Subscribe parameters and then using the expression tree of the Func to subscribe in a smarter way that suits the event channel you are subscribing to.

How can I implement a blocking process in a single slot without freezing the GUI?

Let's say I have an event and the corresponding function is called. This function interacts with the outside world, so it can sometimes have long delays. If the function waits or hangs, my UI will freeze, and this is not desirable. On the other hand, having to break my function up into many parts and re-emit signals is tedious, and can fragment the code a lot, which makes it hard to debug, less readable, and slows down development. Is there a special feature in event-driven programming that would enable me to write the process as one function call and still let the main thread do its job while the function is waiting? For example, the compiler could recognize a keyword, then implement a return and re-emit signals connected to new slots automatically. Why do I think this would be a great idea ;) I'm working with Qt.
Your two options are threading, or breaking your function up somehow.
With threading, it sounds like your ideal solution would be QtConcurrent. If all of your processing is already in one function, and the function is pretty self-contained (doesn't reference member variables of the class), this would be easy to do. If not, things might get a little more complicated.
For breaking your function up, you can either do it as you suggested, splitting it into different functions with the different parts called one after another, or you can do it in a more figurative way, by scattering calls inside your function that allow other processing to happen. I believe calling processEvents() would do what you want, but I haven't come across its use in a long time. Of course, you can run into other problems with that unless you understand that it might cause other parts of your class to run once more (in response to other events), so you have to treat it almost as multi-threaded, protecting variables that are in an indeterminate state while you are computing.
"Is there a special feature in event driven programming which would enable me to just write the process in one function call and be able to let the mainThread do its job when its waiting?"
That would be a non-blocking process.
But your original query was, "How can I implement a blocking process in a single slot without freezing the GUI?"
Perhaps what you're looking for is a way to stop other processing when some (any) process decides it's time to block? There are typically ways to do this, yes, by calling a method on one of the parent objects, which, of course, will depend on the specific objects you are using (e.g. a frame).
Look at the parent objects and see what methods they have that you'd like to use. You may need to override one of them to get exactly the results you desire.
If you want to handle a GUI event by beginning a long-running task, and don't want the GUI to wait for the task to finish, you need to do it concurrently, by creating either a thread or a new process to perform the task.
You may be able to avoid creating a thread or process if the task is I/O-bound and occasional callbacks to handle I/O would suffice. I'm not familiar with Qt's main loop, but I know that GTK's supports adding event sources that can integrate into a select() or poll()-style loop, running handlers after either a timeout or when a file descriptor becomes ready. If that's the sort of task you have, you could make your event handler add such an event source to the application's main loop.
