Do you need to wait for completion on a TPL Dataflow DataflowBlock.NullTarget<T>

Questions like this one:
TPL Dataflow, how to forward items to only one specific target block among many linked target blocks?
propose using DataflowBlock.NullTarget<T> to discard items from a pipeline, e.g.
forwarder.LinkTo(DataflowBlock.NullTarget<SomeType>());
However, if you use NullTarget like this, how do you wait for Completion? Would it not be better to create a discard block:
ITargetBlock<SomeType> discard = DataflowBlock.NullTarget<SomeType>();
forwarder.LinkTo(discard);
and wait for completion on this? i.e.
discard.Completion.Wait()
Or do you not need to wait for completion of a NullTarget block at all, i.e. is it simply fire-and-forget?

This isn't documented, but based on my tests the Completion task of a NullTarget never completes, even after you Complete() or Fault() the block.
This means you can't wait on the completion of NullTarget blocks: the wait would never end.
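Here is a minimal sketch that demonstrates the behavior (the int payload and the one-second timeout are arbitrary illustrative choices, not part of the original question):

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // System.Threading.Tasks.Dataflow NuGet package

class Program
{
    static async Task Main()
    {
        ITargetBlock<int> discard = DataflowBlock.NullTarget<int>();
        discard.Complete(); // has no observable effect on Completion

        // Race the Completion task against a one-second timeout.
        var winner = await Task.WhenAny(discard.Completion, Task.Delay(1000));
        Console.WriteLine(winner == discard.Completion
            ? "NullTarget completed"
            : "NullTarget.Completion still pending after Complete()");
    }
}

If the observation above holds, the timeout always wins, which is consistent with NullTarget being a pure fire-and-forget sink: there is nothing to wait on, so don't tie your pipeline's completion to it.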

Related

Get status of asynchronous (InvocationType=Event) AWS lambda execution

I am creating an AWS step function where one of the steps, let's call it step X, starts a variable number of lambdas. Since these lambda functions are long (they take between 1 and 10 minutes each to complete), I don't want to wait for them in step X; I would be spending money just on waiting. I therefore start them with InvocationType=Event so that they all run asynchronously and in parallel.
Once step X is done starting all these lambdas, I want my step function to wait for all of them to complete. So, a little like what is described here, I would create some kind of while loop in my step function that waits until all my asynchronous invocations have completed.
So the problem is: is it possible to query for the status of an AWS lambda that was started with InvocationType=Event?
If it is not possible, I would need my lambdas to persist their status somewhere so that I can poll it. I would like to avoid this strategy since it does not cover problems that occur outside of my lambda (e.g. out-of-memory errors, throttling exceptions, etc.).
An asynchronously invoked Lambda is a "fire and forget" use case. There's no straightforward way to get its result. I'm afraid you'll have to write your own job synchronization logic.
Instead of polling (which, again, is expensive), you can provide a callback for the lambda to post back to asynchronously. Once you get a positive result for every lambda, continue the process.
Since this question was initially posted, AWS has added support for dynamic parallelism in workflows. Manually starting lambda functions and polling for their completion from within a step function is therefore now an anti-pattern.
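For reference, a minimal sketch of that in Amazon States Language: a Map state fans out one Lambda invocation per input item and waits for all of them to finish (the state names, the ItemsPath, and the function name are hypothetical):

{
  "StartAt": "FanOut",
  "States": {
    "FanOut": {
      "Type": "Map",
      "ItemsPath": "$.items",
      "Iterator": {
        "StartAt": "RunLambda",
        "States": {
          "RunLambda": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": { "FunctionName": "my-long-running-function", "Payload.$": "$" },
            "End": true
          }
        }
      },
      "End": true
    }
  }
}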

Labview events - do a taks in parallel to a running loop

I'm trying to do something very simple:
The OK button sums a+b and shows the result in c
The Loop switch button controls an infinite loop
Option 1 - Loop outside event
Option 2 - Loop inside event
I just want to be able to keep the loop running and the OK button working at the same time. How can I achieve that simple task in the LabVIEW "way of life"?
Results:
Op 1 - Outside event: the loop runs once after an OK click; if the loop is running, OK works only the first time
Op 2 - Inside event: the OK button does not work
You can't. You'll need two separate while loops: one with the count functionality, but don't use the 'loop' variable as the stop condition; make the loop variable control the count condition.
In the other while loop you'll have your event code.
The only thing you'll have to worry about is stopping the first while loop from the event code.
Here is how you can do it with a Master/Slave configuration. All the user events are handled in the master; the counting is handled in the slave. The loop can be restarted, and the stop works for both loops.
To stop the code you use a different event; in the case where the loop conditional is false, you don't do anything in the slave loop. Not shown here, but the loop conditional also has its own event structure to reset the counter if needed.
This master/slave structure is extendable to as many loops as you want.
I see two options:
Similar to option 2, but do the "loop math" inside the "Timeout" case rather than the "Loop Value Changed" case. Then you don't need the while loop; use a case structure (loop = true) instead.
Use two while loops. Inside each of them put an event structure: one to handle the "C=A+B" event and one for the "Loop Value Changed" event.
I think the design pattern you are looking for is the Producer/Consumer pattern. This allows you to run parallel loops and, if need be, share data between them.
A quick Google search for that term combined with LabVIEW will give you plenty of examples.

Correct use-case for Concurrency::task_group

The documentation says: "task_group class represents a collection of parallel work which can be waited on or canceled."
1) Do I take this to mean that the tasks need to be logically related (but broken down) and that you will ideally need to wait on them elsewhere to collate the results?
IOW, is it possible to use task_group to just schedule asynchronous tasks that basically have no relation to each other (as an analogy: sort of like dumping each iteration of some processing activity into a queue and picking it up for execution by another thread)? Each of them just executes and dies away, and as a result I wouldn't even have to wait on or cancel them.
(I do understand that the task_group dtor will throw an exception if I don't cancel or wait on incomplete tasks. Let's forget that for the moment and focus only on whether I am using it for the right purpose.)
This page has an explanation of task groups - not bad.
In a nutshell,
use task groups (the concurrency::task_group class or the concurrency::parallel_invoke algorithm) when you want to decompose parallel work into smaller pieces and then wait for those smaller pieces to complete.
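A minimal sketch of that intended usage, assuming MSVC's Parallel Patterns Library (the lambdas are placeholders for the decomposed work):

#include <ppl.h>
#include <iostream>

int main()
{
    concurrency::task_group tg;

    // Decompose one logical piece of work into parallel subtasks.
    tg.run([] { std::cout << "subtask 1\n"; });
    tg.run([] { std::cout << "subtask 2\n"; });

    // Collate by waiting on the group as a whole; without this the
    // destructor would complain about unfinished tasks.
    tg.wait();
}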

timing consecutive events in cuda

If you have multiple consecutive CUDA events (in a single stream) that you'd like to time (e.g. cudaMemcpy followed by a kernel launch followed by another cudaMemcpy), is it safe/proper/accurate to synchronize only on the last event? For example:
cudaEvent_t event1_start, event1_stop, event2_start, event2_stop;
cudaEventCreate(&event1_start);
cudaEventCreate(&event1_stop);
cudaEventCreate(&event2_start);
cudaEventCreate(&event2_stop);
float time1, time2;

cudaEventRecord(event1_start);
// do something
cudaEventRecord(event1_stop);
cudaEventRecord(event2_start);
// do something else
cudaEventRecord(event2_stop);
cudaEventSynchronize(event2_stop);
cudaEventElapsedTime(&time1, event1_start, event1_stop);
cudaEventElapsedTime(&time2, event2_start, event2_stop);
My understanding is that these events and the actual CUDA calls get placed into a FIFO queue, so the CPU only needs to wait until the last event is recorded before it reads the timings for all of them. Is this correct?
Thanks!
If they are all executed in the same stream (or the default stream), they will execute sequentially, so I'd say yes: if you synchronize on only the last one, the others should already be finished. I can't guarantee it, because I never tested it. I suggest you test it with a simple case where you synchronize both stop events versus only the last one, and then compare the times.
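A sketch of that test, reusing the events and timing variables from the question; if the FIFO reasoning holds, the extra synchronization should leave the measured times unchanged:

// Variant with an explicit synchronize per stop event.
cudaEventSynchronize(event1_stop); // redundant if the FIFO assumption holds
cudaEventSynchronize(event2_stop);
cudaEventElapsedTime(&time1, event1_start, event1_stop);
cudaEventElapsedTime(&time2, event2_start, event2_stop);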

Clarification on Threads and Run Loops In Cocoa

I'm trying to learn about threading and I'm thoroughly confused. I'm sure all the answers are there in the Apple docs, but I found them really hard to break down and digest. Maybe somebody could clear a thing or two up for me.
1) performSelectorOnMainThread
Does the above simply register an event in the main run loop, or is it somehow a new thread even though the method says "mainThread"? If the purpose of threads is to relieve processing on the main thread, how does this help?
2) RunLoops
Is it true that if I want to create a completely separate thread I use
"detachNewThreadSelector"? Does calling start on this initiate a default run loop for the thread that has been created? If so, where do run loops come into it?
3) And finally, I've seen examples using NSOperationQueue. Is it true to say that if you use performSelectorOnMainThread the threads are in a queue anyway, so NSOperation is not needed?
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
Run Loops
You can think of a run loop as an event-processing for-loop associated with a thread. One is provided by the system for every thread, but it's only run automatically for the main thread.
Note that running a run loop and executing a thread are two distinct concepts. You can execute a thread without running a run loop, when you're just performing long calculations and don't have to respond to various events.
If you want to respond to various events from a secondary thread, you retrieve the run loop associated to the thread by
[NSRunLoop currentRunLoop]
and run it. The events a run loop can handle are called input sources, and you can add input sources to a run loop.
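A minimal sketch of what that looks like on the secondary thread (assuming the input sources have already been attached):

// On the secondary thread, after attaching input sources:
NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
[runLoop run]; // processes events until no input sources remain attached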
PerformSelector
performSelectorOnMainThread: adds the target and selector to a special input source called the performSelector input source. The main thread's run loop dequeues from it and handles the method calls one by one, as part of its event-processing loop.
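For example, a background thread might hand UI work back to the main thread like this (updateLabel: and newText are hypothetical names):

// updateLabel: will execute on the main thread.
[self performSelectorOnMainThread:@selector(updateLabel:)
                       withObject:newText
                    waitUntilDone:NO];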
NSOperation/NSOperationQueue
I think of NSOperation as a way to explicitly declare the various tasks inside an app that take some time but can run mostly independently. It's easier than detaching a new thread yourself and maintaining everything by hand. An NSOperationQueue automatically maintains a set of background threads, which it reuses to run NSOperations in parallel.
So yes, if you just need to queue up operations in the main thread, you can do away with NSOperationQueue and just use performSelectorOnMainThread:, but that's not the main point of NSOperation.
GCD
GCD is a new infrastructure introduced in Snow Leopard. NSOperationQueue is now implemented on top of it.
It works at the level of functions / blocks. Feeding blocks to dispatch_async is extremely handy, but for a larger chunk of operations I prefer to use NSOperation, especially when that chunk is used from various places in an app.
Summary
You need to read the official Apple documentation! There are many informative blog posts on this point, too.
1) performSelectorOnMainThread
Does the above simply register an event in the main run loop …
You're asking about implementation details. Don't worry about how it works.
What it does is perform that selector on the main thread.
… or is it somehow a new thread even though the method says "mainThread"?
No.
If the purpose of threads is to relieve processing on the main thread, how does this help?
It helps you when you need to do something on the main thread. A common example is updating your UI, which you should always do on the main thread.
There are other methods for doing things on new secondary threads, although NSOperationQueue and GCD are generally easier ways to do it.
2) RunLoops
Is it true that if I want to create a completely separate thread I use "detachNewThreadSelector"?
That has nothing to do with run loops.
Yes, that is one way to start a new thread.
Does calling start on this initiate a default run loop for the thread that has been created?
No.
I don't know what you're “calling start on” here, anyway. detachNewThreadSelector: doesn't return anything, and it starts the thread immediately. I think you mixed this up with NSOperations (which you also don't start yourself—that's the queue's job).
If so where do run loops come into it?
Run loops just exist, one per thread. On the implementation side, they're probably lazily created upon demand.
3) And finally, I've seen examples using NSOperationQueue. Is it true to say that if you use performSelectorOnMainThread the threads are in a queue anyway, so NSOperation is not needed?
These two things are unrelated.
performSelectorOnMainThread: does exactly that: Performs the selector on the main thread.
NSOperations run on secondary threads, one per operation.
An operation queue determines the order in which the operations (and their threads) are started.
Threads themselves are not queued (except maybe by the scheduler, but that's part of the kernel, not your application). The operations are queued, and they are started in that order. Once started, their threads run in parallel.
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
GCD is more or less the same set of concepts as operation queues. You won't understand one as long as you don't understand the other.
So what are all these things good for?
Run loops
Within a thread, a way to schedule things to happen. Some may be scheduled at a specific date (timers), others simply “whenever you get around to it” (sources). Most of these are zero-cost when idle, only consuming any CPU time when the thing happens (timer fires or source is signaled), which makes run loops a very efficient way to have several things going on at once without any threads.
You generally don't handle a run loop yourself when you create a scheduled timer; the timer adds itself to the run loop for you.
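For example, scheduling a repeating timer looks like this (the one-second interval and the tick: selector are hypothetical); "scheduled" here means the timer attaches itself to the current run loop:

[NSTimer scheduledTimerWithTimeInterval:1.0
                                 target:self
                               selector:@selector(tick:)
                               userInfo:nil
                                repeats:YES];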
Threads
Threads enable multiple things to happen at the exact same time on different processors. Thing 1 can happen on thread A (on processor 1) while thing 2 happens on thread B (on processor 0).
This can be a problem. Multithreaded programming is a dance, and when two threads try to step in the same place, pain ensues. This is called contention, and most discussion of threaded programming is on the topic of how to avoid it.
NSOperationQueue and GCD
You have a thing you need done. That's an operation. You can't have it done on the main thread, or you'd simply send a message like normal; you need to run it in the background, on a secondary thread.
To achieve this, express it as either an NSOperation object (you create a subclass of NSOperation and instantiate it) or a block (or both), then add it to either an NSOperationQueue (NSOperations, including NSBlockOperation) or a dispatch queue (bare block).
GCD can be used to make things happen on the main thread, as well; you can create serial queues and add blocks to them. A serial queue, as its name suggests, will run exactly one block at a time, rather than running a bunch of them in parallel.
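A minimal sketch of both uses (the queue label is arbitrary):

// A serial queue: blocks run one at a time, in FIFO order.
dispatch_queue_t queue = dispatch_queue_create("com.example.serial", DISPATCH_QUEUE_SERIAL);
dispatch_async(queue, ^{
    // This block runs alone...
});
dispatch_async(queue, ^{
    // ...and this one only after the first finishes.
});

// GCD can also target the main thread, like performSelectorOnMainThread:.
dispatch_async(dispatch_get_main_queue(), ^{
    // UI work goes here.
});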
So what should I do?
I would not recommend creating threads directly. Use NSOperationQueue or GCD instead; they force you into better thinking habits that will reduce the risk of your threaded code inducing headaches.
For things that run periodically, not fitting into the “thing I need done” model of NSOperations and GCD blocks, consider just using the run loop on the main thread. Chances are, you don't need to put it on a thread after all. A rendering loop in a 3D game, for example, can be a simple timer.
