What exactly is happenning when Windows scheduling a thread. What computation resources are involved in process of scheduling itself.
More specific - how many CPU cycles can take rescheduling a runnable thread which just finished its timeslice/quantum for another timeslice/quantum (because there are no other threads for example).
Might have changed since Win2000 but otherwise there's a free sample chapter from Inside Windows 2000 available on the MS Press site which might be helpful. Chapter 6: Processes, Threads, and Jobs
Ok. here is quote from Russinovich latest book: "At each of these junctions [e.g. end of slice], Windows must determine which thread should run next. When
Windows selects a new thread to run, it performs a context switch to it. A context switch is the
procedure of saving the volatile machine state associated with a running thread, loading another
thread’s volatile state, and starting the new thread’s execution."
if anyone knows better it seems to me that there is context switch in the end of timeslice even if there are no other thread.... at least I couldnt find evidence on the contrary...
Related
I'm reading Windows Internals (7th Edition), and they write about processes in Chapter 1:
Processes
[...] a Windows process comprises the following:
[...]
At least one thread of execution Although an "empty" process is possible, it is (mostly) not useful.
What does "mostly" mean in this context? What could a process with no threads do, and how would that be useful?
EDIT: Also, in a 2015 talk, Mark Russinovich says that a process has "at least one thread" (19:12). Was that a generalization?
Disclaimer: I work for Microsoft.
I think the answer has come out in the comments. There seem to be at least two scenarios where a threadless process would be useful.
Scenario 1: capturing process snapshots
This is probably the most straightforward one. As RbMm commented, PssCaptureSnapshot can be called with the PSS_CAPTURE_VA_CLONE option to create a threadless (or "empty") process (using ZwCreateProcessEx, presumably to duplicate the target process's memory in kernel mode).
The primary use here would be for debugging, if a developer wanted to inspect a process's memory at a certain point in time.
Notably, Eryk Sun points out that an empty process is not necessary for inspecting handles (even though an empty process holds both its own memory space and handles), since there is already a way to inspect a process's handles without creating a new process or duplicating memory.
Scenario 2: forking processes with specific inherited handles---safely
Raymond Chen explains another use for a threadless process: creating new "real" processes with inherited handles safely.
When a thread wants to create a new process (CreateProcess), there are several ways for it to pass handles to the new process:
Make a handle inheritable and CreateProcess with bInheritHandles = true.
Make a handle inheritable, add it to a PROC_THREAD_ATTRIBUTE_LIST, and pass that list to the CreateProcess call.
However, they offer conflicting guarantees that can cause problems when callers want to create two threads with different handles concurrently. As Raymond puts it in Why do people take a lock around CreateProcess calls?:
In order for a handle to be inherited, you not only have to put it in the PROC_THREAD_ATTRIBUTE_LIST, but you also must make the handle inheritable. This means that if another thread is not on board with the PROC_THREAD_ATTRIBUTE_LIST trick and does a straight CreateProcess with bInheritHandles = true, it will inadvertently inherit your handles.
You can use a threadless process to mitigate this. In general:
Create a threadless process.
DuplicateHandle all of the handles you want to capture into this new threadless process.
CreateProcess your new, real forked process, using the PROC_THREAD_ATTRIBUTE_LIST, but set the nominal parent process of this process to be the threadless process (with PROC_THREAD_ATTRIBUTE_PARENT_PROCESS).
You can now CreateProcess concurrently without worrying about other callers, and you can now close the duplicate handles and the empty process.
I was reading a MSDN doc about driver synchronization and I come across a statement that goes like this
a driver can wait if
• The driver is executing in a nonarbitrary thread context. That is,
you can identify the thread that will enter a wait state. In practice,
the only driver routines that execute in a nonarbitrary thread context
are the DriverEntry, AddDevice, Reinitialize, and Unload routines of
any driver, plus the dispatch routines of highest-level drivers. All
these routines are called directly by the system
now my question is that why dispatch routines are considered in arbitrary thread context ?? Since read, write and other routines will be invoked when a request will be raised from user space, so we can know that which thread did that in system space ?? My be I am completely messed up or it could be a silly question but still help me coz i am a newbie in windwos.
ok i found the answer in a document :) and here is what it states ..
Although the highest-level drivers receive I/O requests in the context
of the requesting thread, they often forward those requests to their
lower level drivers on different threads. Consequently, you can make
no assumptions about the contents of the user-mode address space at
the time such routines are called
Could you point me in a direction for discovering how threads are being alternated in the Linux kernel?
Although I do not possess in depth knowledge about the kernel, but AFAIK to the kernel threads (& processes) appear as tasks. The switching between tasks is known as context switch. Context switch is triggered by scheduler through schedule call which is present in kernel/sched.c ( http://lxr.linux.no/linux+v3.0.4/kernel/sched.c#L4247 ). In schedule function context_switch is called which switches memory map & register values for the new thread. I would suggest looking at schedule function.
P.S.: You can use http://lxr.linux.no for browsing kernel code online.
Hope this helps!
I have written a compiler and interpreter for a scripting language. The interpreter is a DLL ('The Engine') which runs in a single thread and can load many 100s or 1000s of compiled byte-code applications and excecute them as a set of internal processes. There is a main loop that excecutes a few instructions from each of the loaded app processes before moving one to the next process.
The byte code instruction in the compiled apps can either be a low level instructions (pop, push, add, sub etc) or a call to an external function library (which is where most of the work is done). These external libararies can call back to the engine to put the internal processes into a sleep state waiting for a particular event upon which the external function (probably after receiving an event) will wake up the internal process again. If all internal processes are in a sleep state (which the are most of the time) then I can put the Engine to sleep as well thus handing off the CPU to other threads.
However there is nothing to prevent someone writing a script which just does a tight loop like this:
while(1)
x=1;
endwhile
Which means my main loop will never enter a sleep state and so the CPU goes up to 100% and locks up the system. I want my engine to run as fast as possibly, whilst still handling windows events so that other applications are still responsive when a tight loop similar to the above is encountered.
So my first question is how to add code to my main loop to ensure windows events are handled without slowing down the main engine which should run at the fastest speed possible..
Also it would be nice to be able to set the maximum CPU usage my engine can use and throttle down the CPU usage by calling the occasional Sleep(1)..
So my second question is how can I throttle down then CPU usage to the required level?
The engine is written in Borland C++ and makes calls to the win32 API.
Thanks in advance
1. Running a message loop at the same time as running your script
I want my engine to run as fast as
possibly, whilst still handling
windows events so that other
applications are still responsive when
a tight loop similar to the above is
encountered.
The best way to continue running a message loop while performing another operation is to move that other operation to another thread. In other words, move your script interpreter to a second thread and communicate with it from your main UI thread, which runs the message loop.
When you say Borland C++, I assume you're using C++ Builder? In this situation, the main thread is the only one that interacts with the UI, and its message loop is run via Application->Run. If you're periodically calling Application->ProcessMessages in your library callbacks, that's reentrant and can cause problems. Don't do it.
One comment to your question suggested moving each script instance to a separate thread. This would be ideal. However, beware of issues with the DLLs the scripts call if they keep state - DLLs are loaded per-process, not per-thread, so if they keep state you may encounter threading issues. For the moment purely to address your current question, I'd suggest moving all your script execution to a single other thread.
You can communicate between threads many ways, such as by posting messages between them using PostMessage or PostThreadMessage. Since you're using Borland C++, you should have access to the VCL. It has a good thread wrapper class called TThread. Derive from this and put your script loop in Execute. You can use Synchronize (blocks waiting) or Queue (doesn't block; method may be run at any time, when the target thread processes its message loop) to run methods in the context of another thread.
As a side note:
so that other
applications are still responsive when
a tight loop similar to the above is
encountered.
This is odd. In a modern, preemptively multitasked version of Windows other applications should still be responsive even when your program is very busy. Are you doing anything odd with your thread priorities, or are you using a lot of memory so that other applications are paged out?
2. Handling an infinite loop in a script
You write:
there is nothing to prevent someone
writing a script which just does a
tight loop like this:
while(1) x=1; endwhile
Which means my main loop will never
enter a sleep state and so the CPU
goes up to 100% and locks up the
system.
but phrase how to handle this as:
Also it would be nice to be able to
set the maximum CPU usage my engine
can use and throttle down the CPU
usage by calling the occasional
Sleep(1)..
So my second question is how can I
throttle down then CPU usage to the
required level?
I think you're taking the wrong approach. An infinite loop like while(1) x=1; endwhile is a bug in the script, but it should not take down your host application. Just throttling the CPU won't make your application able to handle the situation. (And using lots of CPU isn't necessarily a problem: if it the work is available for the CPU to run, do it! There's nothing holy about using only a bit of your computer's CPU. It's there to use after all.) What (I think) you really want is to be able to continue to have your application able to respond when running this script (solved by a second thread) and then:
Detect when a script is 'not responding', or not calling into your callbacks
Be able to take action, such as asking the user if they want to terminate the script
An example of another program that does this is Firefox. If you go to a page with a misbehaving script, eventually you'll get a dialog asking if you want to stop the script running.
Without knowing more about how your script is actually interpreted or run, I can't give a detailed answer to these two. But I can suggest an approach, which is:
Your interpreter probably runs a loop, getting the next instruction and executing it. Your interactivity is currently provided by a callback running from one of those instructions being executed. I'd suggest making use of that by having your callback simply log the time it was last called. Then in your processing thread, every instruction (or every ten or a hundred) check the current time against the last callback time. If a long time has passed, say fifteen or thirty seconds, it may be an indication that the script is stuck. Notify the main thread but keep processing.
For "time", something like GetTickCount is probably sufficient.
Next step: Your main UI thread can react to this by asking the user what to do. If they want to terminate the script, communicate with the script thread to set a flag. In your script processing loop, again every instruction (or hundred) check for this flag, and if it's set, stop.
When you move to having one thread per script interpreter, you TThread's Terminated flag for this. Idiomatically for something that runs infinitely in a thread, you run in a while (!Terminated && [any other conditions]) loop in your Execute function.
To actually answer your question about using less CPU, the best approach is probably to change your thread's priority using SetThreadPriority to a lower priority, such as THREAD_PRIORITY_BELOW_NORMAL. It will still run if nothing else needs to run. This will affect your script's performance. Another approach is to use Sleep as you suggest, but this really is artificial. Perhaps SwitchToThread is slightly better - it yields to another thread the OS chooses. Personally, I think the CPU is there to use, and if you solve the problem of an interactive UI and handling out-of-control scripts then there should be no problem with using all CPU if your script needs it. If you're using "too much" CPU, perhaps the interpreter itself could be optimised. You'll need to run a profiler and find out where the CPU time is being spent.
Although a badly designed script might put you in a do-nothing loop, don't worry about it. Windows is designed to handle this kind of thing, and won't let your program take more than its fair share of the CPU. If it does manage to get 100%, it's only because nothing else wants to run.
Application has an auxiliary thread. This thread is not meant to run all the time, but main process can call it very often.
So, my question is, what is more optimal in terms of CPU performance: suspend thread when it is not being used or keep it alive and use WaitForSingleObject function in order to wait for a signal from main process?
In terms of CPU resources used, both solutions are the same - the thread which is suspended and thread which is waiting in WaitForSingleObject for an object which is not signalled both get no CPU cycles at all.
That said, WaitForSingleObject is almost always a prefered solution because the code using it will be much more "natural" - easier to read, and easier to make right. Suspending/Resuming threads can be dangerous, because you need to take a lot of care to make sure you know you are suspending a thread in a state where suspending it will do no harm (imagine suspending a thread which is currently holding a mutex).
I would assume Andrei doesn't use Delphi to write .NET and therefore Suspend doesn't translate to System.Threading.Thread.Suspend but to SuspendThread Win32 API.
I would strongly suggest against it. If you suspend the thread in an arbitrary moment, then you don't know what's going to happen (for example, you may suspend the thread in such a state the some shared resource is blocked). If you however already know that the thread is in suspendable state, then simply use WaitForSingleObject (or any other WaitFor call) - it will be equally effective as suspending the thread, i.e. thread will use zero CPU time until it is awaken.
What do you mean by "suspend"? WaitForSingleObject will suspend the thread, i.e., it will not consume any CPU, until the signal arrives.
If it's a worker thread that has units of work given to it externally, it should definitely be using signalling objects as that will ensure it doesn't use CPU needlessly.
If it has any of its own work to do as well, that's another matter. I wouldn't suspend the thread from another thread (what happens if there's two threads delivering work to it?) - my basic rule is that threads should control their own lifetime with suggestions from other threads. This localizes all control in the thread itself.
See the excellent tutorial on multi-threading in Delphi :
Multi Threading Tutorial
Another option would be the TMonitor introduced in Delphi 2009, which has functions like Wait, Pulse and PulseAll to keep threads inactive when there is nothing to do for them, and notify them as soon as they should continue with their work. It is loosely modeled after the object locks in Java. Like there, Delphi object now have a "lock" field which can be used for thread synchrinozation.
A blog which gives an example for a threaded queue class can be found at http://delphihaven.wordpress.com/2011/05/04/using-tmonitor-1/
Unfortunately there was a bug in the TMonitor implementation, which seems to be fixed in XE2