Handling user interface in a multi-threaded application (or being forced to have a UI-only main thread) - windows

In my application, I have a 'logging' window, which shows all the logging, warnings, errors of the application.
Last year my application was still single-threaded so this worked [quite] good.
Now I am introducing multithreading. I quickly noticed that it's not a good idea to update the logging window from different threads. Reading some articles on keeping the UI in the main thread, I made a communication buffer, in which the other threads are adding their logging messages, and from which the main thread takes the messages and shows them in the logging window (this is done in the message loop).
Now, in a part of my application, the memory usage increases dramatically, because the separate threads are generating lots of logging messages, and the main thread cannot empty the communication buffer quickly enough. After the while the memory decreases again (if the other threads have finished their work and the main thread gradually empties the communication buffer).
I solved this problem by having a maximum size on the communication buffer, but then I run into a problem in the following situation:
the main thread has to perform a complex action
the main thread takes some parts of the action and let's separate threads execute this
while the seperate threads are executing their logic, the main thread processes the results from the other threads and continues with its work if the other threads are finished
Problem is that in this situation, if the other threads perform logging, there is no UI-message loop, and so the communication buffer is filled, but not emptied.
I see two solutions in solving this problem:
require the main thread to do regular polling of the communication buffer
only performing user interface logic in the main thread (no other logic)
I think the second solution seems the best, but this may not that easy to introduce in a big application (in my case it performs mathematical simulations).
Are there any other solutions or tips?
Or is one of the two proposed the best, easiest, most-pragmatic solution?
Thanks,
Patrick

Let's make some order first.
you may not hold UI processing for any time U would have noticed, or he will be frustrated
you may still perform long operations in the UI thread. this is done by means of PeakMessage loop. If you design one or more proper peakmessage loops, you do not need multithreading, unless for performance optimization.
you may consider MsgWaitForSingleObject() loop instead of GetMessage if you want to communicate with threads efficiently (always better than polling)
Therefore, if you do not redesign your message loop
There's no way you can perform syncronous requests from other threads
You may design a separate thread for the logging
All non-UI logic will have to be elsewhere.
About the memory problem:
it is a bad design to have one thread able to allocate all memory if another thread is stuck. Such dependency is a clear recipe for a disaster.
If the buffer is limited, you need to decide what happens when it's overrun. You have two options - suspend the thread or discard the message.
UI code:
It is possible to design logger code that would display messages with incredible speed. Such designs are complicated, rely on sophisticated caching and arranging data for fast access, viewport management and rendering only the part that corresponds to actual pixels that the user is looking at.
For most applications it is just a gimmick, because users do not read very fast. Most of the time it is better to design a different approach to showing logs, perhaps a stateful UI to let user choose what is interesting to him at the moment. Spy++ for example, some sysinternals tools like regmon, filemon are incredibly fast in showing their own logs. You can have a look at their source code.

Related

How to set up a thread pool for an IOCP server with connections that may perform blocking operations

I'm currently working on a server application that's written in the proactor style, using select() + a dynamically sized thread pool (there's a simple mechanism based on keeping track of idle worker threads).
I need to modify it to use IOCP instead of select() on windows, and I'm wondering what the best way to utilize threads is.
For background information, the server has stateful, long-lived connections, and any request may require significant processing, and block. In fact, most requests call into customer-written code, which may block at will.
I've read that the OS can tell when an IOCP thread blocks, and unblock another one, but it doesn't look like there's any support for creating additional threads under heavy load, or if many of the threads are blocked.
I've read one site which suggested that you have a small, fixed-size thread pool which uses IOCP to deal with I/O only, which sends requests which can block to another, dynamically-sized thread pool. This seems non-optimal due to the additional thread synchronization required (although you can use IOCP as well for the tasks for the second thread pool), and the larger number of threads needed (extra context switching).
Is that the best way?
It sounds like what you've read is one of my articles on IOCP (most probably this one). That's likely a bit out of date now as the whole problem that it sort to avoid (that of I/O being cancelled if the thread that issued it exits before the I/O completes) is no longer a problem with any of Microsoft's currently supported OS's (it's only an issue on XP and before).
You're correct in noticing that my design from 2000/2002 was sub optimal from a context switching point of view; but it worked pretty well at the time, given the constraints of the underlying API.
On a modern OS there's no real advantage in having separate thread pools for I/O and blocking work. A more modern solution would probably involve dynamically expanding and reducing the number of I/O threads servicing the IOCP as required.
You'd need to track the number of IOCP threads that are active (i.e. not waiting on GetQueuedCompletionStatus() ) and spawn more when there are "too few". Likewise just as a thread is about to go back and wait on GQCS you could check to see if you have "too many" and if so, let it die instead.
I should probably update those articles.

How can I tell Windows XP/7 not to switch threads during a certain segment of my code?

I want to prevent a thread switch by Windows XP/7 in a time critical part of my code that runs in a background thread. I'm pretty sure I can't create a situation where I can guarantee that won't happen, because of higher priority interrupts from system drivers, etc. However, I'd like to decrease the probability of a thread switch during that part of my code to the minimum that I can. Are there any create-thread flags or Window API calls that can assist me? General technique tips are appreciated too. If there is a way to get this done without having to raise the threads priority to real-time-critical that would be great, since I worry about creating system performance issues for the user if I do that.
UPDATE: I am adding this update after seeing the first responses to my original post. The concrete application that motivated the question has to do with real-time audio streaming. I want to eliminate every bit of delay I can. I found after coding up my original design that a thread switch can cause a 70ms or more delay at times. Since my app is between two sockets acting as a middleman for delivering audio, the instant I receive an audio buffer I want to immediately turn around and push it out the the destination socket. My original design used two cooperating threads and a semaphore since the there was one thread managing the source socket, and another thread for the destination socket. This architecture evolved from the fact the two devices behind the sockets are disparate entities.
I realized that if I combined the two sockets onto the same thread I could write a code block that reacted immediately to the socket-data-received message and turned it around to the destination socket in one shot. Now if I can do my best to avoid an intervening thread switch, that would be the optimal coding architecture for minimizing delay. To repeat, I know I can't guarantee this situation, but I am looking for tips/suggestions on how to write a block of code that does this and minimizes as best as I can the chance of an intervening thread switch.
Note, I am aware that O/S code behind the sockets introduces (potential) delays of its own.
AFAIK there are no such flags in CreateThread or etc (This also doesn't make sense IMHO). You may snooze other threads in your process from execution during in critical situations (by enumerating them and using SuspendThread), as well as you theoretically may enumerate & suspend threads in other processes.
OTOH snoozing threads is generally not a good idea, eventually you may call some 3rd-party code that would implicitly wait for something that should be accomplished in another threads, which you suspended.
IMHO - you should use what's suggested for the case - playing with thread/process priorities (also you may consider SetThreadPriorityBoost). Also the OS tends to raise the priority to threads that usually don't use CPU aggressively. That is, threads that work often but for short durations (before calling one of the waiting functions that suspend them until some condition) are considered to behave "nicely", and they get prioritized.

How to force workflow runtime to use more CPU power?

Hello
I've quite unordinary problem because I think that in my case workflow runtime doesn't use enough CPU power. Scenario is as follow:
I send a lot of messages to queues. I use EnqueueItem method from WorkflowRuntime class.
I create new instance of workflow with CreateWorkflow method of WorkflowRuntime class.
I wait until new workflow will be moved to the first state. Under normal conditions it takes dozens of second (the workflow is complicated). When at the same time messages are being sent to queues (as described in the point 1) it takes 1 minute or more.
I observe low CPU (8 cores) utilization, no more than 15%. I can add that I have separate process that is responsible for workflow logic and I communicate with it with WCF.
You've got logging, which you think is not a problem, but you don't know. There are many database operations. Those need to block for I/O. Having more cores will only help if different threads can run unimpeded.
I hate to sound like a stuck record, always trotting out the same answer, but you are guessing at what the problem is, and you're asking other people to guess too. People are very willing to guess, but guesses don't work. You need to find out what's happening.
To find out what's happening, the method I use is, get it running under a debugger. (Simplify the problem by going down to one core.) Then pause the whole thing, look at each active thread, and find out what it's waiting for. If it's waiting for some CPU-bound function to complete for some reason, fine - make a note of it. If it's waiting for some logging to complete, make a note. If it's waiting for a DB query to complete, note it. If it's waiting at a mutex for some other thread, note it.
Do this for each thread, and do it several times. Then, you can really say you know what it's doing. When you know what it's waiting for and why, you'll have a pretty good idea how to improve it. That's a variation on this technique.
What are you doing in the work item?
If you have any sort of cross thread synchronisation (Critical sections etc) then this could cause you to spend time stalling the threads waiting for resources to become free.
For example, If you are doing any sort of file access then you are going to spend considerable time blocked waiting for the loads to complete and this will leave your threads idle a lot of the time. You could throw more threads at the problem but then you'd end up generating more disk requests and the resource contention would become even more of a problem.
Thats a couple of potential ideas but I'd really need to know what you are doing before I can be more useful ...
Edit: in answer to your comments...
1) OK
2) You'd perform terribly with 2000 threads working flat out due to switching overhead. In fact running 20-25 threads on an 8 core machine may be a bad plan too because if you get them running at high speed then they will spend time stealing each other's runtime and regular context switches (software thread switches) are very expensive. They may not be as expensive as the waits your code is suffering.
3) Logging? Do you just submit them to an asynchronous queue that spits them out to disk when it has the opportunity or are they sychronous file writes? If they are aysnchronous can you guarantee that there isn't a maximum number of request that can be queued before you DO have to wait? And if you have to wait how many threads end up iin contention for the space that just opened up? There are a lot of ifs there alone.
4) Database operation even on the best database are likely to block if 2 threads make similar calls into the database simultaneously. A good database is designed to limit this but its quite likely that, at least some, clashing will happen.
Suffice to say you will want to get a good thread profiler to see where time is REALLY being lost. Failing that you will just have to live with the performance or attack the problem in a different way ...
WF3 performance is a little on the slow side. If you are using .NET 4 you will get a better performance moving to WF4. Mind you is means a rewrite as WF4 is a completely different product.
As to WF3. There is white paper here that should give you plenty of information to improve things from the standard settings. Look for things like increasing the number of threads used by the DefaultWorkflowSchedulerService or switching to the ManualWorkflowSchedulerService and disabling performance counters which are enabled by default.

Is it better to poll or wait?

I have seen a question on why "polling is bad". In terms of minimizing the amount of processor time used by one thread, would it be better to do a spin wait (i.e. poll for a required change in a while loop) or wait on a kernel object (e.g. a kernel event object in windows)?
For context, assume that the code would be required to run on any type of processor, single core, hyperthreaded, multicore, etc. Also assume that a thread that would poll or wait can't continue until the polling result is satisfactory if it polled instead of waiting. Finally, the time between when a thread starts waiting (or polling) and when the condition is satisfied can potentially vary from a very short time to a long time.
Since the OS is likely to more efficiently "poll" in the case of "waiting", I don't want to see the "waiting just means someone else does the polling" argument, that's old news, and is not necessarily 100% accurate.
Provided the OS has reasonable implementations of these type of concurrency primitives, it's definitely better to wait on a kernel object.
Among other reasons, this lets the OS know not to schedule the thread in question for additional timeslices until the object being waited-for is in the appropriate state. Otherwise, you have a thread which is constantly getting rescheduled, context-switched-to, and then running for a time.
You specifically asked about minimizing the processor time for a thread: in this example the thread blocking on a kernel object would use ZERO time; the polling thread would use all sorts of time.
Furthermore, the "someone else is polling" argument needn't be true. When a kernel object enters the appropriate state, the kernel can look to see at that instant which threads are waiting for that object...and then schedule one or more of them for execution. There's no need for the kernel (or anybody else) to poll anything in this case.
Waiting is the "nicer" way to behave. When you are waiting on a kernel object your thread won't be granted any CPU time as it is known by the scheduler that there is no work ready. Your thread is only going to be given CPU time when it's wait condition is satisfied. Which means you won't be hogging CPU resources needlessly.
I think a point that hasn't been raised yet is that if your OS has a lot of work to do, blocking yeilds your thread to another process. If all processes use the blocking primitives where they should (such as kernel waits, file/network IO etc.) you're giving the kernel more information to choose which threads should run. As such, it will do more work in the same amount of time. If your application could be doing something useful while waiting for that file to open or the packet to arrive then yeilding will even help you're own app.
Waiting does involve more resources and means an additional context switch. Indeed, some synchronization primitives like CLR Monitors and Win32 critical sections use a two-phase locking protocol - some spin waiting is done fore actually doing a true wait.
I imagine doing the two-phase thing would be very difficult, and would involve lots of testing and research. So, unless you have the time and resources, stick to the windows primitives...they already did the research for you.
There are only few places, usually within the OS low-level things (interrupt handlers/device drivers) where spin-waiting makes sense/is required. General purpose applications are always better off waiting on some synchronization primitives like mutexes/conditional variables/semaphores.
I agree with Darksquid, if your OS has decent concurrency primitives then you shouldn't need to poll. polling usually comes into it's own on realtime systems or restricted hardware that doesn't have an OS, then you need to poll, because you might not have the option to wait(), but also because it gives you finegrain control over exactly how long you want to wait in a particular state, as opposed to being at the mercy of the scheduler.
Waiting (blocking) is almost always the best choice ("best" in the sense of making efficient use of processing resources and minimizing the impact to other code running on the same system). The main exceptions are:
When the expected polling duration is small (similar in magnitude to the cost of the blocking syscall).
Mostly in embedded systems, when the CPU is dedicated to performing a specific task and there is no benefit to having the CPU idle (e.g. some software routers built in the late '90s used this approach.)
Polling is generally not used within OS kernels to implement blocking system calls - instead, events (interrupts, timers, actions on mutexes) result in a blocked process or thread being made runnable.
There are four basic approaches one might take:
Use some OS waiting primitive to wait until the event occurs
Use some OS timer primitive to check at some defined rate whether the event has occurred yet
Repeatedly check whether the event has occurred, but use an OS primitive to yield a time slice for an arbitrary and unknown duration any time it hasn't.
Repeatedly check whether the event has occurred, without yielding the CPU if it hasn't.
When #1 is practical, it is often the best approach unless delaying one's response to the event might be beneficial. For example, if one is expecting to receive a large amount of serial port data over the course of several seconds, and if processing data 100ms after it is sent will be just as good as processing it instantly, periodic polling using one of the latter two approaches might be better than setting up a "data received" event.
Approach #3 is rather crude, but may in many cases be a good one. It will often waste more CPU time and resources than would approach #1, but it will in many cases be simpler to implement and the resource waste will in many cases be small enough not to matter.
Approach #2 is often more complicated than #3, but has the advantage of being able to handle many resources with a single timer and no dedicated thread.
Approach #4 is sometimes necessary in embedded systems, but is generally very bad unless one is directly polling hardware and the won't have anything useful to do until the event in question occurs. In many circumstances, it won't be possible for the condition being waited upon to occur until the thread waiting for it yields the CPU. Yielding the CPU as in approach #3 will in fact allow the waiting thread to see the event sooner than would hogging it.

Best practice regarding number of threads in GUI applications

In the past I've worked with a number of programmers who have worked exclusively writing GUI applications.
And I've been given the impression that they have almost universally minimised the use of multiple threads in their applications. In some cases they seem to have gone to extreme lengths to ensure that they use a single thread.
Is this common? Is this the generally accepted philosophy for gui application design?
And if so, why?
[edit]
There are a number of answers saying that thread usage should be minimised to reduce complexity. Reducing complexity in general is a good thing.
But if you look at any number of applications where response to external events is of paramount importance (eg. web servers, any number of embedded applications) there seems to be a world of difference in the attitude toward thread usage.
Generally speaking, GUI frameworks aren't thread safe. For things like Swing(Java's GUI API), only one thread can be updating the UI (or bad things can happen). Only one thread handles dispatching events. If you have multiple threads updating the screen, you can get some ugly flicker and incorrect drawing.
That doesn't mean the application needs to be single threaded, however. There are certainly circumstances when you don't want this to be the case. If you click on a button that calculates pi to 1000 digits, you don't want the UI to be locked up and the button to be depressed for the next couple of days. This is when things like SwingWorker come in handy. It has two parts a doInBackground() which runs in a seperate thread and a done() that gets called by the thread that handles updating the UI sometime after the doInBackground thread has finished. This allows events to be handled quickly, or events that would take a long time to process in the background, while still having the single thread updating the screen.
I think in terms of windows you are limited to all GUI operations happening on a single thread - because of the way the windows message pump works, to increase responsivness most apps add at least one additional worker thread for longer running tasks that would otherwise block and make the ui unresponsive.
Threading is fundamentally hard and so thinking in terms or more than a couple threads can often result in a lot of debugging effort - there is a quote that escapes me right now that goes something like - "if you think you understand threading then you really dont"
I've seen the same thing. Ideally you should perform any operation that is going to take longer then a few hundred ms in a background thread. Anything sorter than 100ms and a human probably wont notice the difference.
A lot of GUI programmers I've worked with in the past are scared of threads because they are "hard". In some GUI frameworks such as the Delphi VCL there are warnings about using the VCL from multiple threads, and this tends to scare some people (others take it as a challenge ;) )
One interesting example of multi-threaded GUI coding is the BeOS API. Every window in an application gets its own thread. From my experience this made BeOS apps feel more responsive, but it did make programming things a little more tricky. Fortunately since BeOS was designed to be multi-threaded by default there was a lot of stuff in the API to make things easier than on some other OSs I've used.
Most GUI frameworks are not thread safe, meaning that all controls have to me accessed from the same thread that created them. Still, it's a good practice to create worker threads to have responsive applications, but you need to be careful to delegate GUI updates to the GUI thread.
Yes.
GUI applications should minimize the the number of threads that they use for the following reasons:
Thread programming is very hard and complicated
In general, GUI applications do at most 2 things at once : a) Respond to User Input, and b) Perform a background task (such as load in data) in response to a user action or an anticipated user action
In general therefore, the added complexity of using multiple threads is not justified by the needs of the application.
There are of course exceptions to the rule.
GUIs generally don't use a whole lot of threads, but they often do throw off another thread for interacting with certain sub-systems especially if those systems take awhile or are very shared resources.
For example, if you're going to print, you'll often want to throw off another thread to interact with the printer pool as it may be very busy for awhile and there's no reason not to keep working.
Another example would be database loads where you're interacting with SQL server or something like that and because of the latency involved you may want to create another thread so your main UI processing thread can continue to respond to commands.
The more threads you have in an application, (generally) the more complex the solution is. By attempting to minimise the number of threads being utilised within a GUI, there are less potential areas for problems.
The other issue is the biggest problem in GUI design: the human. Humans are notorious in their inability to want to do multiple things at the same time. Users have a habit of clicking multiple butons/controls in quick sucession in order to attempt to get something done quicker. Computers cannot generally keep up with this (this is only componded by the GUIs apparent ability to keep up by using multiple threads), so to minimise this effect GUIs will respond to input on a first come first serve basis on a single thread. By doing this, the GUI is forced to wait until system resorces are free untill it can move on. Therefore elimating all the nasty deadlock situations that can arise. Obviously if the program logic and the GUI are on different threads, then this goes out the window.
From a personal preference, I prefer to keep things simple on one thread but not to the detriment of the responsivness of the GUI. If a task is taking too long, then Ill use a different thread, otherwise Ill stick to just one.
As the prior comments said, GUI Frameworks (at least on Windows) are single threaded, thus the single thread. Another recommendation (that is difficult to code in practice) is to limit the number of the threads to the number of available cores on the machine. Your CPU can only do one operation at a time with one core. If there are two threads, a context switch has to happen at some point. If you've got too many threads, the computer can sometimes spend more time swapping between threads than letting threads work.
As Moore's Law changes with more cores, this will change and hopefully programming frameworks will evolve to help us use threads more effectively, depending on the number of cores available to the program, such as the TPL.
Generally all the windowing messages from the window manager / OS will go to a single queue so its natural to have all UI elements in a single thread. Some frameworks, such as .Net, actually throw exceptions if you attempt to directly access UI elements from a thread other than the thread that created it.

Resources