While reading an interesting article on Exception in Coroutines. I came across section that includes the following image:
I am trying to understand how and where did the parent Job come from?
Although they explain it the article as:
a new coroutine always gets assigned a new Job()
But looking at the code, I see a top level scope with a Job, when calling the "parent" launch we override it with a SupervisorJob and then two child launch which i guess inherit the parent job or create their own.
So, my question is where did the parent job came from (the one wrapping the 2 child jobs)?
What you pass to launch becomes the parent of the job that launch internally creates. That job is always of the same type, which is StandaloneCoroutine. You cannot control the Job instance corresponding to the coroutine, you can only decide on its parent.
In your diagram, the Job() inside the scope is completely ignored. The top-level launch creates its own StandaloneCoroutine and assigns the SupervisorJob you passed in as its parent.
In the particular situation represented in your diagram, SupervisorJob has no effect as it doesn't prevent any coroutine failure from spreading. The top-level launch will get cancelled if any of the two inner launch coroutines fail. The only difference will be that the SupervisorJob will remain active, but it will have no children at that point.
Related
let me discribe shortly what I want and what I - maybe - know.
I want spring-batch to run a async job; in future more jobs.
The job gets two parameters: an external id and a year.
The job should be able to be restarted after completion because the user wants to run a job with the same parameters again and again.
Only one job should be executed with the same parameters at the same time.
From outside (web interface) it should be possible to query if a job is running by job name and parameters.
The querier could be different from the job starter so an instance or execution id is not present.
I know that a job instance is the representation of the job(name) and the parameters and - like you commented - I cannot rerun a job with the same parameters if the instance/execution is marked completed - except I use a incrementer.
But this changes the parameters by adding a run.id. Now a job is restartable but I and sping-batch itself are not able to identify a running job instance (by name and original parameters) anymore because every job run results in a new instance.
And the question "why would one would restart a successfully completed job instance?" is easy to answer: The user outside don't know about job/instance/execution. The user will start some data processing for a year again and again. And it's my task to make it possible :).
So it would be nice if spring-batch can let the user know "the job with your original parameters is still running".
Question:
What would be a good solution for my needs?
I didn't tried something but thought about it. Maybe I can write an own JobDao for my query? But this will not solve the run-instance-at-same-time problem. Or I can customize the JdbcJobInstanceDao or SimpleJobRepository? Maybe I must add a own job_key which contains only the original parameters?
To correctly understand the answer I am going to give to your question, it is important to know the difference and understand the relation between a job, a job instance and a job execution in Spring Batch. The The Domain Language of Batch section of the reference documentation explains that in details with examples.
The job should be able to be restarted after completion.
This is not possible by design, or more precisely, a job instance cannot be restarted after completion by design (Think of it like "why would one would restart a successfully completed job instance?").
From outside (web interface) it should be possible to query if an instance is running by job name and parameters. There querier could be different from the job starter so an instance or execution id is not present.
The JobExplorer is the API you are looking for. You can ask for job instances and job executions as needed.
Question: What would be a good solution for my needs?
In your case, you receive an external ID and a year as a job execution request. Those two parameters can be used as identifying parameters to define job instances. With this in place, if a job instance is failed, you can restart it by using the same parameters.
I see no need for an incrementer in your case. The incrementer is useful for jobs for which the instances can be defined as a "sequence" that can be "incremented". I see no need to create a custom DAO or JobRepository neither, you should be able to implement your requirement with the built-in components by correctly defining what a job instance is.
For my use-case I have to check if a execution for a job/parameters-combination is running. The parameters here are without run.id of an incrementor. This check must be done before a job run and by explicit rest call. Normally spring-batch checks for running executions but because of the used incrementor every job instance is unique and it will never find any.
So I created a bean with a check method and made use of jobExplorer.findRunningJobExecutions(jobName);. The result can then compared with the used paramters by iterating over JobExecution.getJobParameters().getParameters().
The bean can be used in the rest-method and in an own implemention of JobLauncher.run().
Another solution would be to store the increment separately for a job/parameters-combination. But I don't want to do this not least because I think a framework like spring-batch should do this for me or supports me by reusing/restarting a completed job instance.
I want to perform some load and save operations on another thread (in SDL). To be able to do this I thought of creating a thread and detaching it (letting it end on its own) everytime I call a function that needs to run separately.
But I don't think this is the correct behaviour (or is it?).
Is there any better solution, like creating and using only one thread? And if there is, how can I call my function(s) from it?
Use std::async. On most implementations it uses efficient solutions like reusing threads from threadpool.
The Life span of a thread is dependent on the main thread(or parent thread), without a join all children threads would be terminated when the main thread(or parent thread) exit.A thread is tied to the process. You might want to looking into forking a process instead, this would persist even if the parent process exit, but would be could be come a zombie process, with no way of terminating it within the program.
Is there a way or API available through which I can remove the already assigned process from a job object and re-assign it to another job object?
After a process is associated with a job, the association cannot be broken.
See here for more details: Job Objects Creation
When I took a look at the reference of 'Launching-Jobs' in gnu.org, I didn't get this part.
The shell should also call setpgid to put each of its child processes into the new process group. This is because there is a potential timing problem: each child process must be put in the process group before it begins executing a new program, and the shell depends on having all the child processes in the group before it continues executing. If both the child processes and the shell call setpgid, this ensures that the right things happen no matter which process gets to it first.
There is two method on the link page, launch_job () and launch_process ().
They both call the setpgid in order to prevent the timing problem.
But I didn't get why is there such a problem.
I guess new program means result of execvp (p->argv[0], p->argv); in launch_process(). And before run execvp, setpgid (pid, pgid); is always executed, without same function on launch_job ().
So again, why is there such a problem? (why we have to call setpgid (); on launch_job () either?)
The problem is that the shell wants the process to be in the right process group. If the shell doesn't call setpgid() on its child process, there is a window of time during which the child process is not part of the process group, while the shell execution continues. (By calling setpgid() the shell can guarantee that the child process is part of the process group after that call).
There is another problem, which is that the child process may execute the new program (via exec) before its process group id has been properly set (i.e. before the parent calls setpgid()). That is why the child process should also call setpgid() (before calling exec()).
The description is admittedly pretty bad. There isn't just one problem being solved here; it's really two separate problems. One - the parent (i.e. the shell) wants to have the child process in the right process group. Two - the new program should begin execution only once its process has already been put into the right process group.
I'm trying to learn about threading and I'm thoroughly confused. I'm sure all the answers are there in the apple docs but I just found it really hard to breakdown and digest. Maybe somebody could clear a thing or 2 up for me.
1)performSelectorOnMainThread
Does the above simply register an event in the main run loop or is it somehow a new thread even though the method says "mainThread"? If the purpose of threads is to relieve processing on the main thread how does this help?
2) RunLoops
Is it true that if I want to create a completely seperate thread I use
"detachNewThreadSelector"? Does calling start on this initiate a default run loop for the thread that has been created? If so where do run loops come into it?
3) And Finally , I've seen examples using NSOperationQueue. Is it true to say that If you use performSelectorOnMainThread the threads are in a queue anyway so NSOperation is not needed?
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
Run Loops
You can think of a Run Loop to be an event processing for-loop associated to a thread. This is provided by the system for every thread, but it's only run automatically for the main thread.
Note that running run loops and executing a thread are two distinct concepts. You can execute a thread without running a run loop, when you're just performing long calculations and you don't have to respond to various events.
If you want to respond to various events from a secondary thread, you retrieve the run loop associated to the thread by
[NSRunLoop currentRunLoop]
and run it. The events run loops can handle is called input sources. You can add input sources to a run-loop.
PerformSelector
performSelectorOnMainThread: adds the target and the selector to a special input source called performSelector input source. The run loop of the main thread dequeues that input source and handles the method call one by one, as part of its event processing loop.
NSOperation/NSOperationQueue
I think of NSOperation as a way to explicitly declare various tasks inside an app which takes some time but can be run mostly independently. It's easier to use than to detach the new thread yourself and maintain various things yourself, too. The main NSOperationQueue automatically maintains a set of background threads which it reuses, and run NSOperations in parallel.
So yes, if you just need to queue up operations in the main thread, you can do away with NSOperationQueue and just use performSelectorOnMainThread:, but that's not the main point of NSOperation.
GCD
GCD is a new infrastructure introduced in Snow Leopard. NSOperationQueue is now implemented on top of it.
It works at the level of functions / blocks. Feeding blocks to dispatch_async is extremely handy, but for a larger chunk of operations I prefer to use NSOperation, especially when that chunk is used from various places in an app.
Summary
You need to read Official Apple Doc! There are many informative blog posts on this point, too.
1)performSelectorOnMainThread
Does the above simply register an event in the main run loop …
You're asking about implementation details. Don't worry about how it works.
What it does is perform that selector on the main thread.
… or is it somehow a new thread even though the method says "mainThread"?
No.
If the purpose of threads is to relieve processing on the main thread how does this help?
It helps you when you need to do something on the main thread. A common example is updating your UI, which you should always do on the main thread.
There are other methods for doing things on new secondary threads, although NSOperationQueue and GCD are generally easier ways to do it.
2) RunLoops
Is it true that if I want to create a completely seperate thread I use "detachNewThreadSelector"?
That has nothing to do with run loops.
Yes, that is one way to start a new thread.
Does calling start on this initiate a default run loop for the thread that has been created?
No.
I don't know what you're “calling start on” here, anyway. detachNewThreadSelector: doesn't return anything, and it starts the thread immediately. I think you mixed this up with NSOperations (which you also don't start yourself—that's the queue's job).
If so where do run loops come into it?
Run loops just exist, one per thread. On the implementation side, they're probably lazily created upon demand.
3) And Finally , I've seen examples using NSOperationQueue. Is it true to say that If you use performSelectorOnMainThread the threads are in a queue anyway so NSOperation is not needed?
These two things are unrelated.
performSelectorOnMainThread: does exactly that: Performs the selector on the main thread.
NSOperations run on secondary threads, one per operation.
An operation queue determines the order in which the operations (and their threads) are started.
Threads themselves are not queued (except maybe by the scheduler, but that's part of the kernel, not your application). The operations are queued, and they are started in that order. Once started, their threads run in parallel.
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
GCD is more or less the same set of concepts as operation queues. You won't understand one as long as you don't understand the other.
So what are all these things good for?
Run loops
Within a thread, a way to schedule things to happen. Some may be scheduled at a specific date (timers), others simply “whenever you get around to it” (sources). Most of these are zero-cost when idle, only consuming any CPU time when the thing happens (timer fires or source is signaled), which makes run loops a very efficient way to have several things going on at once without any threads.
You generally don't handle a run loop yourself when you create a scheduled timer; the timer adds itself to the run loop for you.
Threads
Threads enable multiple things to happen at the exact same time on different processors. Thing 1 can happen on thread A (on processor 1) while thing 2 happens on thread B (on processor 0).
This can be a problem. Multithreaded programming is a dance, and when two threads try to step in the same place, pain ensues. This is called contention, and most discussion of threaded programming is on the topic of how to avoid it.
NSOperationQueue and GCD
You have a thing you need done. That's an operation. You can't have it done on the main thread, or you'd simply send a message like normal; you need to run it in the background, on a secondary thread.
To achieve this, express it as either an NSOperation object (you create a subclass of NSOperation and instantiate it) or a block (or both), then add it to either an NSOperationQueue (NSOperations, including NSBlockOperation) or a dispatch queue (bare block).
GCD can be used to make things happen on the main thread, as well; you can create serial queues and add blocks to them. A serial queue, as its name suggests, will run exactly one block at a time, rather than running a bunch of them in parallel.
So what should I do?
I would not recommend creating threads directly. Use NSOperationQueue or GCD instead; they force you into better thinking habits that will reduce the risk of your threaded code inducing headaches.
For things that run periodically, not fitting into the “thing I need done” model of NSOperations and GCD blocks, consider just using the run loop on the main thread. Chances are, you don't need to put it on a thread after all. A rendering loop in a 3D game, for example, can be a simple timer.