Why sleep(1) instead of sleep(0)? - Windows

I understand that, in an endless loop or elsewhere, you can call sleep(0) to let the OS perform a context switch and execute another thread (if there is one and it is ready to run).
Now, I saw a bunch of code where people use sleep(1) instead of sleep(0).
Is this optimal?
Where can I find documentation about it?

If you're implementing something like 'check for the existence of a file, repeat until it exists, then continue', it's better to do a sleep(some_small_positive_number), so you don't use up 100% of CPU time.
Polling loops like this are almost always a sign of improper planning when used in a program, but are used often in command line scripts.
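For illustration, a minimal sketch of that pattern in C++ on Windows, assuming a hypothetical flag-file path passed by the caller; the small positive Sleep() is what keeps the loop from monopolizing a core:

#include <windows.h>

// Poll for a (hypothetical) flag file, but sleep between checks so the
// loop does not burn 100% of a CPU core while it waits.
bool WaitForFile(const wchar_t* path) {
    while (GetFileAttributesW(path) == INVALID_FILE_ATTRIBUTES) {
        Sleep(50);   // small positive sleep: give up the rest of the timeslice and more
    }
    return true;
}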

99.9% of the time, such short loops are a symptom of poor design, inadequate understanding of inter-thread comms or just laziness 'cos polling seems easier.
Most while(true) loops in multithreaded code need no Sleep() calls at all because they block on some other call: I/O or inter-thread synchronization objects.
In those cases where a loop does not block on anything, you still need no sleep() calls if the work being done is making real forward progress. Putting in a sleep() call just slows down real work. If the work has an undesirable impact on the system as a whole, lower the priority of the work threads instead of shoving in sleep() calls.
The evil is looping purely for the purpose of polling flags. This is done so often that sleep() itself is often regarded as intrinsically evil. It is not - it's the misuse of it that should stop.
There is not much, on a modern OS, that requires polling. File systems, for example, give notifications upon file creation, eliminating the need to continually check and removing the latency and CPU waste of sleep() loops.
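As a sketch of that notification-based alternative (the directory to watch is an assumption of the example), a Win32 caller can block on a change notification instead of polling in a sleep() loop:

#include <windows.h>

// Block until something changes in the watched directory; the thread uses
// no CPU while waiting, and the kernel wakes it when the change occurs.
void WaitForDirectoryChange(const wchar_t* directory) {
    HANDLE change = FindFirstChangeNotificationW(
        directory, FALSE /* do not watch subtrees */,
        FILE_NOTIFY_CHANGE_FILE_NAME);          // fires on file create/rename/delete
    if (change == INVALID_HANDLE_VALUE) return;

    WaitForSingleObject(change, INFINITE);      // zero CPU until notified

    FindCloseChangeNotification(change);
}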

Related

Is there ever a situation where an infinite loop may be desired?

Infinite loops are taught as evil. Is there ever a good use?
When coding them by accident, the CPU peaks and I imagine memory does too, especially if assigning variables inside the loop.
If there is a good use, how are those issues prevented?
Basically every operating system or server spins in an infinite loop.
To avoid memory issues, you normally wouldn't allocate memory inside the loop unless it can be freed again inside the same loop. For example, you would allocate memory for a request and delete it once the request has been served.
To avoid CPU peaks, you wait for interrupts (in the case of an OS) or, once per iteration, call a blocking function like poll() that waits for a new event.
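A rough sketch of that idea in C++ on a POSIX system, using stdin as a stand-in for a server socket: the loop is endless, but it sleeps inside poll() until there is something to do:

#include <poll.h>
#include <unistd.h>

int main() {
    struct pollfd fds[1];
    fds[0].fd = 0;            // stdin, standing in for a listening socket
    fds[0].events = POLLIN;

    for (;;) {                               // the "infinite loop"
        int ready = poll(fds, 1, -1);        // -1: block until an event arrives, no spinning
        if (ready > 0 && (fds[0].revents & POLLIN)) {
            char buf[256];
            ssize_t n = read(0, buf, sizeof buf);
            if (n <= 0) break;               // EOF: leave the loop
            // handle the "request" here; any per-request memory would be
            // freed before the next iteration, as suggested above
        }
    }
    return 0;
}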
First of all, the word "infinite" in this phrase should be taken a bit more loosely. I am presuming you are talking about a while (true) loop with a break instruction, which will eventually end, as opposed to a loop which will run until the end of time and all humanity.
In the former sense, yes, there are use cases where it's appropriate:
Games use infinite game loops.
Embedded programs use infinite main loops.
Windows applications use infinite message loops (a minimal sketch of one follows this list).
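Something like this standard Win32 message pump: formally an infinite loop, but it blocks inside GetMessage until the OS has something for the window and exits when WM_QUIT arrives:

#include <windows.h>

// The classic Win32 message loop: blocks in GetMessage (no CPU used while
// idle) and ends when the thread receives WM_QUIT.
int RunMessageLoop() {
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0) {   // 0 means WM_QUIT, -1 means error
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    return (int)msg.wParam;
}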
One example where they might be used inappropriately is when they are used to create time delays by spinning the CPU, which is what novice programmers tend to do to avoid dealing with timer interrupts (or timer events, or other non-procedural constructs). However, when spinning the CPU is done to acquire a shared resource, then the "infinite loop" is also a perfectly valid implementation choice. Even the .NET CLR Monitor, for example, tries spinning for several hundred cycles before issuing a true wait on a kernel event handle and creating a more expensive thread switch.
In addition to programs that run on event loops (like the system processes that @Christoph mentions), some languages have a concept known as a generator, which allows and even encourages you to write an infinite loop. The trick is that the object only runs for a finite time when it "yields" (returns) some expression. After that its state is "frozen" until it is needed again. For example, in Python you can have an object that alternates between LEFT and RIGHT:
def side():
    while True:
        yield "LEFT"
        yield "RIGHT"

a = side()
print a.next()
print a.next()
print a.next()
This prints LEFT, RIGHT, LEFT. The side function looks like an infinite loop because of the statement while True:, but it only ever runs for a finite amount of time per call.
All the applications on your handset run in infinite event loops.

How can I tell Windows XP/7 not to switch threads during a certain segment of my code?

I want to prevent a thread switch by Windows XP/7 in a time-critical part of my code that runs in a background thread. I'm pretty sure I can't create a situation where I can guarantee that won't happen, because of higher-priority interrupts from system drivers, etc. However, I'd like to decrease the probability of a thread switch during that part of my code to the minimum that I can. Are there any create-thread flags or Windows API calls that can assist me? General technique tips are appreciated too. If there is a way to get this done without having to raise the thread's priority to real-time-critical, that would be great, since I worry about creating system performance issues for the user if I do that.
UPDATE: I am adding this update after seeing the first responses to my original post. The concrete application that motivated the question has to do with real-time audio streaming. I want to eliminate every bit of delay I can. I found after coding up my original design that a thread switch can cause a delay of 70 ms or more at times. Since my app sits between two sockets, acting as a middleman for delivering audio, the instant I receive an audio buffer I want to immediately turn around and push it out to the destination socket. My original design used two cooperating threads and a semaphore, since there was one thread managing the source socket and another thread for the destination socket. This architecture evolved from the fact that the two devices behind the sockets are disparate entities.
I realized that if I combined the two sockets onto the same thread, I could write a code block that reacted immediately to the socket-data-received message and turned it around to the destination socket in one shot. Now if I can do my best to avoid an intervening thread switch, that would be the optimal coding architecture for minimizing delay. To repeat, I know I can't guarantee this, but I am looking for tips/suggestions on how to write a block of code that minimizes, as best I can, the chance of an intervening thread switch.
Note, I am aware that O/S code behind the sockets introduces (potential) delays of its own.
AFAIK there are no such flags in CreateThread or the like (this also doesn't make much sense, IMHO). You may suspend other threads in your process during critical situations (by enumerating them and using SuspendThread), and theoretically you may enumerate and suspend threads in other processes as well.
OTOH, suspending threads is generally not a good idea; eventually you may call some third-party code that implicitly waits for something that should be accomplished in another thread, which you have suspended.
IMHO you should use what's suggested for this case: playing with thread/process priorities (you may also consider SetThreadPriorityBoost). The OS also tends to raise the priority of threads that don't use the CPU aggressively. That is, threads that work often but for short durations (before calling one of the waiting functions that suspend them until some condition) are considered to behave "nicely", and they get prioritized.
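As a sketch of that priority-based approach (no guarantee against switches, just a way to tilt the scheduler in your favour; the function name is illustrative), the audio thread might do something like this on startup:

#include <windows.h>

// Raise the current thread's priority, but stop short of real-time to avoid
// starving the rest of the system, and keep the dynamic boost the scheduler
// gives to threads that block often and run only briefly.
void PrepareAudioThread() {
    HANDLE self = GetCurrentThread();
    SetThreadPriority(self, THREAD_PRIORITY_HIGHEST);
    SetThreadPriorityBoost(self, FALSE);   // FALSE = leave priority boosting enabled
}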

How does cooperative multitasking work?

I read this Wikipedia text slice:
Because a cooperatively multitasked system relies on each process regularly giving up time to other processes on the system, one poorly designed program can consume all of the CPU time for itself or cause the whole system to hang.
Out of curiosity, how does one give up that time? Is this some sort of OS call? Let's think about non-preemptive cases like fibers or evented IO that do cooperative multitasking. How do they give up that time?
Take this NodeJS example:
var fs = require('fs');
fs.readFile('/path/to/file', function(err, data) {});
It is obvious to me that the process does nothing while it's waiting for the data, but how does V8 in this case give up time for other processes?
Let's assume Linux/Windows as our OS.
Edit: I found out how Google is doing this with their V8.
On Windows they basically sleep zero time:
void Thread::YieldCPU() {
  Sleep(0);
}
And on Linux they make an OS call:
void Thread::YieldCPU() {
  sched_yield();
}
from sched.h.
Yes, every program participates in the scheduling decisions of the OS, so you have to call a particular syscall that tells the kernel to take back over. Often this was called yield(). If you imagine how difficult it is to guarantee that a particular line of code is called at regular, short intervals, or even at all, you get an idea of why cooperative multitasking is a suboptimal solution.
In your example, it is the JavaScript engine itself that is interrupted by the OS scheduler, if it's a preemptive OS. If it's a cooperative one, then no: the engine gets no work done, and neither does any other process. As a result, such systems are usually not suitable for real-time (or even serious) workloads.
An example of such an OS is NetWare. In that system, it was necessary to call a specific function (I think it was called ThreadSwitch, or maybe ThreadSwitchWithDelay). And it was always a guess as to how often it was needed. In every single CPU-intensive loop in the product it was necessary to call one of those functions periodically.
But in that system, other calls would also allow other threads to run. In particular (and germane to the question), I/O calls gave the OS the opportunity to run other threads. Basically any system call that gave control to the OS was sufficient to allow other threads to run (mutex/semaphore calls being important ones).
As a general rule, co-operative multitasking involves functions signalling that they are now waiting; rather than going into spin loops (where they keep running while waiting), they suspend themselves.
In this case, the processing behind the ReadFile will handle the waiting for data and the relevant signalling that the caller is suspendable. Within your own code, whatever it is written in, you should suspend processing if you are waiting for a long-running operation, not spin. In many cases, though, the suspension is handled automatically, because it is built into the blocking calls. The danger is that if you deliberately force long-term spins, you will hang the system.
The alternative (from that wiki) is pre-emptive multitasking, where the process is forced out of action after a certain time, irrespective of what it is doing. This means that whatever you do, it cannot run forever, because the system will force it out. However, it can be less efficient, as the break points are not defined.

Should I use multiple threads in this situation? [Ruby]

I'm opening multiple files and processing them, one line at a time. The files contain tokens separating the data, such that sometimes the processing of one file may have to wait for others to catch up to that same token.
I was doing this initially with only one thread and an array indicating with true/false if the file should be read in the current iteration or if it should wait for some of the others to catch up.
Would using threads make this simpler? More efficient? Does Ruby have a mechanism for this?
Firstly, threads never make anything simpler. Threading is only applicable for helping to speed up applications. Threading introduces a host of new complications; it may seem handy to be able to describe multiple threads of execution, but it always makes life harder.
Secondly, premature optimization is the root of all evil. Do not attempt to speed up the file processing unless you know that it is a bottleneck. Do the simplest thing that could possibly work (but no simpler).
Thirdly, threading might help if the processing of the files were independent, so that a thread could process a file without worrying about what the other threads are doing. It sounds like this is not true in your case. Since the different threads would have to communicate with each other, you are unlikely to see a speed benefit from applying threads.
Fourthly, I don't know Ruby and therefore can't comment on what mechanisms it has.
I'm not sure that using threads in Ruby is beneficial. I recently wrote and tested an application that was supposed to do parallel computations, but I didn't get what I expected, even on a quad-core processor: it performed the computations sequentially, one thread after another. Read this article; it has a discussion of thread scheduling, and it may turn out that things haven't changed, at least for the original Ruby implementation.

Is it better to poll or wait?

I have seen a question on why "polling is bad". In terms of minimizing the amount of processor time used by one thread, would it be better to do a spin wait (i.e. poll for a required change in a while loop) or wait on a kernel object (e.g. a kernel event object in windows)?
For context, assume that the code would be required to run on any type of processor, single core, hyperthreaded, multicore, etc. Also assume that a thread that would poll or wait can't continue until the polling result is satisfactory if it polled instead of waiting. Finally, the time between when a thread starts waiting (or polling) and when the condition is satisfied can potentially vary from a very short time to a long time.
Since the OS is likely to more efficiently "poll" in the case of "waiting", I don't want to see the "waiting just means someone else does the polling" argument; that's old news, and it is not necessarily 100% accurate.
Provided the OS has reasonable implementations of these type of concurrency primitives, it's definitely better to wait on a kernel object.
Among other reasons, this lets the OS know not to schedule the thread in question for additional timeslices until the object being waited-for is in the appropriate state. Otherwise, you have a thread which is constantly getting rescheduled, context-switched-to, and then running for a time.
You specifically asked about minimizing the processor time for a thread: in this example the thread blocking on a kernel object would use ZERO time; the polling thread would use all sorts of time.
Furthermore, the "someone else is polling" argument needn't be true. When a kernel object enters the appropriate state, the kernel can look to see at that instant which threads are waiting for that object...and then schedule one or more of them for execution. There's no need for the kernel (or anybody else) to poll anything in this case.
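For a concrete picture, a small sketch in Win32 (names and timings are illustrative only): the consumer blocks on an event and consumes zero CPU until the producer signals it, at which point the kernel schedules it.

#include <windows.h>
#include <stdio.h>

HANDLE g_dataReady;   // auto-reset event shared by producer and consumer

// Consumer: while blocked in WaitForSingleObject it receives no timeslices at all.
DWORD WINAPI Consumer(LPVOID) {
    WaitForSingleObject(g_dataReady, INFINITE);
    printf("data is ready, consuming it\n");
    return 0;
}

int main() {
    g_dataReady = CreateEventW(NULL, FALSE, FALSE, NULL);   // auto-reset, initially unsignalled
    HANDLE thread = CreateThread(NULL, 0, Consumer, NULL, 0, NULL);

    Sleep(1000);              // pretend to spend a second producing data
    SetEvent(g_dataReady);    // kernel picks the waiting thread and makes it runnable

    WaitForSingleObject(thread, INFINITE);
    CloseHandle(thread);
    CloseHandle(g_dataReady);
    return 0;
}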
Waiting is the "nicer" way to behave. When you are waiting on a kernel object, your thread won't be granted any CPU time, since the scheduler knows there is no work ready. Your thread is only given CPU time when its wait condition is satisfied, which means you won't be hogging CPU resources needlessly.
I think a point that hasn't been raised yet is that if your OS has a lot of work to do, blocking yields your thread to another process. If all processes use the blocking primitives where they should (such as kernel waits, file/network I/O, etc.), you're giving the kernel more information to choose which threads should run. As such, it will do more work in the same amount of time. If your application could be doing something useful while waiting for that file to open or that packet to arrive, then yielding will even help your own app.
Waiting does involve more resources and means an additional context switch. Indeed, some synchronization primitives like CLR monitors and Win32 critical sections use a two-phase locking protocol: some spin waiting is done before actually doing a true wait.
I imagine doing the two-phase thing well would be very difficult, and would involve lots of testing and research. So, unless you have the time and resources, stick to the Windows primitives...they already did the research for you.
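A rough sketch of that two-phase idea in portable C++ (this is not the actual CLR or Win32 implementation, just the general shape): spin briefly in user mode, then fall back to a real blocking wait so a long delay does not burn CPU.

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

std::atomic<bool> ready{false};
std::mutex m;
std::condition_variable cv;

void WaitTwoPhase() {
    // Phase 1: a short, bounded spin, hoping the flag flips almost immediately.
    for (int i = 0; i < 4000; ++i) {
        if (ready.load(std::memory_order_acquire)) return;
        std::this_thread::yield();
    }
    // Phase 2: the expensive but CPU-free path - block until notified.
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return ready.load(std::memory_order_acquire); });
}

void Signal() {
    {
        std::lock_guard<std::mutex> lock(m);
        ready.store(true, std::memory_order_release);
    }
    cv.notify_all();
}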
There are only a few places, usually within low-level OS code (interrupt handlers/device drivers), where spin-waiting makes sense or is required. General-purpose applications are always better off waiting on synchronization primitives like mutexes/condition variables/semaphores.
I agree with Darksquid: if your OS has decent concurrency primitives, then you shouldn't need to poll. Polling usually comes into its own on real-time systems or restricted hardware that doesn't have an OS; then you need to poll, because you might not have the option to wait(), but also because it gives you fine-grained control over exactly how long you want to wait in a particular state, as opposed to being at the mercy of the scheduler.
Waiting (blocking) is almost always the best choice ("best" in the sense of making efficient use of processing resources and minimizing the impact to other code running on the same system). The main exceptions are:
When the expected polling duration is small (similar in magnitude to the cost of the blocking syscall).
Mostly in embedded systems, when the CPU is dedicated to performing a specific task and there is no benefit to having the CPU idle (e.g. some software routers built in the late '90s used this approach.)
Polling is generally not used within OS kernels to implement blocking system calls - instead, events (interrupts, timers, actions on mutexes) result in a blocked process or thread being made runnable.
There are four basic approaches one might take:
Use some OS waiting primitive to wait until the event occurs
Use some OS timer primitive to check at some defined rate whether the event has occurred yet
Repeatedly check whether the event has occurred, but use an OS primitive to yield a time slice for an arbitrary and unknown duration any time it hasn't.
Repeatedly check whether the event has occurred, without yielding the CPU if it hasn't.
When #1 is practical, it is often the best approach unless delaying one's response to the event might be beneficial. For example, if one is expecting to receive a large amount of serial port data over the course of several seconds, and if processing data 100ms after it is sent will be just as good as processing it instantly, periodic polling using one of the latter two approaches might be better than setting up a "data received" event.
Approach #3 is rather crude, but may in many cases be a good one. It will often waste more CPU time and resources than would approach #1, but it will in many cases be simpler to implement and the resource waste will in many cases be small enough not to matter.
Approach #2 is often more complicated than #3, but has the advantage of being able to handle many resources with a single timer and no dedicated thread.
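For instance, a sketch of approach #2 on Windows, assuming a hypothetical flag file stands in for the awaited event: a waitable timer wakes the thread at a defined 100 ms rate to check, with no spinning in between.

#include <windows.h>
#include <stdio.h>

// The "event" here is the appearance of a (hypothetical) flag file.
static bool EventHasOccurred() {
    return GetFileAttributesW(L"C:\\temp\\ready.flag") != INVALID_FILE_ATTRIBUTES;
}

int main() {
    HANDLE timer = CreateWaitableTimerW(NULL, FALSE, NULL);   // auto-reset timer
    if (!timer) return 1;

    LARGE_INTEGER due;
    due.QuadPart = -1000000LL;               // first fire in 100 ms (100-ns units, negative = relative)
    SetWaitableTimer(timer, &due, 100,       // then every 100 ms
                     NULL, NULL, FALSE);

    while (!EventHasOccurred()) {
        WaitForSingleObject(timer, INFINITE);   // sleep until the next tick
    }

    CancelWaitableTimer(timer);
    CloseHandle(timer);
    printf("event observed\n");
    return 0;
}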
Approach #4 is sometimes necessary in embedded systems, but is generally very bad unless one is directly polling hardware and the CPU won't have anything useful to do until the event in question occurs. In many circumstances, it won't be possible for the condition being waited upon to occur until the thread waiting for it yields the CPU. Yielding the CPU as in approach #3 will in fact allow the waiting thread to see the event sooner than hogging it would.

Resources