I'm opening multiple files and processing them, one line at a time. The files contain tokens separating the data, such that sometimes the processing of one file may have to wait for others to catch up to that same token.
I was doing this initially with only one thread and an array indicating with true/false if the file should be read in the current iteration or if it should wait for some of the others to catch up.
Would using threads make this simpler? More efficient? Does Ruby have a mechanism for this?

Firstly, threads never make anything simpler. Threading is only useful for speeding up applications. It introduces a host of new complications: it may seem handy to be able to describe multiple threads of execution, but it always makes life harder.
Secondly, premature optimization is the root of all evil. Do not attempt to speed up the file processing unless you know that it is a bottleneck. Do the simplest thing that could possibly work (but no simpler).
Thirdly, threading might help if the files could be read independently, so that each thread can process a file without worrying about what the other threads are doing. It sounds like this is not true in your case. Since the different threads would have to communicate with each other, you are unlikely to see a speed benefit from applying threads.
Fourthly, I don't know Ruby and therefore can't comment on what mechanisms it has.
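For what it's worth, the single-threaded approach from the question can stay fairly compact. The sketch below is in Python rather than Ruby, and it assumes a hypothetical token line `---` that closes each section; the names `sections` and `process_in_lockstep` are made up for illustration. Each file is wrapped in a generator that yields one section at a time, so no file can run ahead of the others past a token.

```python
import io
from itertools import zip_longest

TOKEN = "---"  # assumed separator line; the real token is not given in the question

def sections(lines, token=TOKEN):
    """Yield lists of lines, one list per token-delimited section."""
    current = []
    for line in lines:
        line = line.rstrip("\n")
        if line == token:
            yield current
            current = []
        else:
            current.append(line)
    if current:  # trailing section without a closing token
        yield current

def process_in_lockstep(files):
    """Advance every file by exactly one section per round."""
    rounds = []
    for group in zip_longest(*(sections(f) for f in files), fillvalue=[]):
        # All files are now at the same token; handle their sections together.
        rounds.append([len(part) for part in group])
    return rounds

a = io.StringIO("a1\na2\n---\na3\n")
b = io.StringIO("b1\n---\nb2\nb3\n---\n")
print(process_in_lockstep([a, b]))  # prints [[2, 1], [1, 2]]
```

The per-round work here only counts lines; the real processing would go where the comment is. No true/false bookkeeping array is needed because the generators hold each file's position.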

I'm not sure that using threads in Ruby is beneficial. I recently wrote and tested an application that was supposed to do parallel computations, but I didn't get what I expected: even on a quad-core processor it performed the computations sequentially, one thread after another. Read this article; it discusses thread scheduling, and it may turn out that things haven't changed, at least for the original Ruby implementation.


Would threading be beneficial for this situation?

I have a CSV file with over 1 million rows. I also have a database that contains such data in a formatted way.
I want to check and verify the data in the CSV file and the data in the database.
Would it be beneficial (that is, would it reduce the run time) to thread the reading of the CSV file and use a connection pool to the database?
How well does Ruby handle threading?
I am using MongoDB, also.
It's hard to say without knowing more about what you want the app to feel like when someone initiates this comparison. So, to answer, here is some general advice that should apply fairly well regardless of the problem you might want to thread.
Threading does NOT make something computationally less costly
Threading doesn't make things less costly in terms of computation time. It just lets two things happen in parallel. So, beware that you're not falling into the common misconception that, "Threading makes my app faster because the user doesn't wait for things." - this isn't true, and threading actually adds quite a bit of complexity.
So, if you kick off this DB vs. CSV comparison task, threading isn't going to make that comparison take any less time. What it might do is allow you to tell the user, "Ok, I'm going to check that for you," right away, while doing the comparison in a separate thread of execution. You still have to figure out how to get back to the user when the comparison is done.
Think about WHY you want to thread, rather than simply approaching it as whether threading is a good solution for long tasks
Like I said above, threading doesn't make things faster. At best, it uses computing resources in a way that is either more efficient, or gives a better user experience, or both.
If the user of the app (maybe it's just you) doesn't mind waiting for the comparison to run, then don't add threading because you're just going to add complexity and it won't be any faster. If this comparison takes a long time and you'd rather "do it in the background" then threading might be an answer for you. Just be aware that if you do this you're then adding another concern, which is, how do you update the user when the background job is done?
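As a rough illustration of the "do it in the background, then get back to the user" idea, here is a minimal Python sketch. The `compare` function and the `notify_user` callback are hypothetical stand-ins, not part of any real app; the point is only that the caller gets control back immediately and the result arrives later through a callback.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def compare(csv_rows, db_rows):
    """Stand-in for the real CSV-vs-database check: rows missing from the database."""
    return sorted(set(csv_rows) - set(db_rows))

done = threading.Event()
report = {}

def notify_user(future):
    # Runs once the background job finishes; a real app would update the UI
    # or send a message here instead of filling a dict.
    report["missing"] = future.result()
    done.set()

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(compare, ["a", "b", "c"], ["a", "c"])
future.add_done_callback(notify_user)
print("Ok, I'm going to check that for you")  # shown to the user right away

done.wait(timeout=5)  # only so this demo can print the result before exiting
executor.shutdown()
print(report["missing"])  # prints ['b']
```

Note that the comparison itself is no faster; the thread only changes when the user gets control back.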
Threading involves extra overhead and app complexity, which you will then have to manage within your app - tread lightly
There are other concerns as well, such as: how do I schedule that worker thread to make sure it doesn't hog the computing resources? Is setting thread priorities an option in my environment, and if so, how will adjusting them affect the use of computing resources?
Threading and the extra overhead involved will almost definitely make your comparison take LONGER (in terms of absolute time it takes to do the comparison). The real advantage is if you don't care about completion time (the time between when the comparison starts and when it is done) but instead the responsiveness of the app to the user, and/or the total throughput that can be achieved (e.g. the number of simultaneous comparisons you can be running, and as a result the total number of comparisons you can complete within a given time span).
Threading doesn't guarantee that your available CPU cores are used efficiently
See Green Threads vs. native threads - some languages (depending on their threading implementation) can schedule threads across CPUs.
Threading doesn't necessarily mean your threads wind up getting run in multiple physical CPU cores - in fact in many cases they definitely won't. If all your app's threads run on the same physical core, then they aren't truly running in parallel - they are just splitting CPU time in a way that may make them look like they are running in parallel.
For these reasons, depending on the structure of your app, it's often less complicated to send background tasks to a separate worker process (process, not thread), which can easily be scheduled onto available CPU cores at the OS level. Separate processes (as opposed to separate threads) also remove a lot of the scheduling concerns within your app, because you essentially offload the decision about how to schedule things onto the OS itself.
This last point is pretty important. OS schedulers are extremely likely to be smarter and more efficiently designed than whatever algorithm you might come up with in your app.

How can I tell Windows XP/7 not to switch threads during a certain segment of my code?

I want to prevent a thread switch by Windows XP/7 in a time critical part of my code that runs in a background thread. I'm pretty sure I can't create a situation where I can guarantee that won't happen, because of higher priority interrupts from system drivers, etc. However, I'd like to decrease the probability of a thread switch during that part of my code to the minimum that I can. Are there any create-thread flags or Windows API calls that can assist me? General technique tips are appreciated too. If there is a way to get this done without having to raise the thread's priority to real-time-critical that would be great, since I worry about creating system performance issues for the user if I do that.
UPDATE: I am adding this update after seeing the first responses to my original post. The concrete application that motivated the question has to do with real-time audio streaming. I want to eliminate every bit of delay I can. I found after coding up my original design that a thread switch can cause a 70ms or more delay at times. Since my app sits between two sockets, acting as a middleman for delivering audio, the instant I receive an audio buffer I want to immediately turn around and push it out to the destination socket. My original design used two cooperating threads and a semaphore, since there was one thread managing the source socket and another thread for the destination socket. This architecture evolved from the fact that the two devices behind the sockets are disparate entities.
I realized that if I combined the two sockets onto the same thread I could write a code block that reacted immediately to the socket-data-received message and turned it around to the destination socket in one shot. Now if I can do my best to avoid an intervening thread switch, that would be the optimal coding architecture for minimizing delay. To repeat, I know I can't guarantee this situation, but I am looking for tips/suggestions on how to write a block of code that does this and minimizes as best as I can the chance of an intervening thread switch.
Note, I am aware that O/S code behind the sockets introduces (potential) delays of its own.
AFAIK there are no such flags in CreateThread or similar calls (IMHO this also wouldn't make sense). You may suspend the other threads in your process during critical sections (by enumerating them and calling SuspendThread), and you could theoretically enumerate and suspend threads in other processes as well.
OTOH suspending threads is generally not a good idea: sooner or later you may call some third-party code that implicitly waits for something to be accomplished in another thread, one which you have suspended.
IMHO you should use what's suggested for this case: adjusting thread/process priorities (you may also consider SetThreadPriorityBoost). Also, the OS tends to raise the priority of threads that don't use the CPU aggressively. That is, threads that work often but for short durations (before calling one of the waiting functions that suspend them until some condition is met) are considered to behave "nicely", and they get prioritized.

How to force workflow runtime to use more CPU power?

I have a rather unusual problem: I think that in my case the workflow runtime doesn't use enough CPU power. The scenario is as follows:
1) I send a lot of messages to queues. I use the EnqueueItem method from the WorkflowRuntime class.
2) I create a new instance of a workflow with the CreateWorkflow method of the WorkflowRuntime class.
3) I wait until the new workflow moves to its first state. Under normal conditions this takes dozens of seconds (the workflow is complicated). When messages are being sent to queues at the same time (as described in point 1), it takes 1 minute or more.
I observe low CPU utilization (8 cores), no more than 15%. I should add that I have a separate process that is responsible for the workflow logic, and I communicate with it via WCF.
You've got logging, which you think is not a problem, but you don't know. There are many database operations. Those need to block for I/O. Having more cores will only help if different threads can run unimpeded.
I hate to sound like a stuck record, always trotting out the same answer, but you are guessing at what the problem is, and you're asking other people to guess too. People are very willing to guess, but guesses don't work. You need to find out what's happening.
To find out what's happening, the method I use is, get it running under a debugger. (Simplify the problem by going down to one core.) Then pause the whole thing, look at each active thread, and find out what it's waiting for. If it's waiting for some CPU-bound function to complete for some reason, fine - make a note of it. If it's waiting for some logging to complete, make a note. If it's waiting for a DB query to complete, note it. If it's waiting at a mutex for some other thread, note it.
Do this for each thread, and do it several times. Then, you can really say you know what it's doing. When you know what it's waiting for and why, you'll have a pretty good idea how to improve it. That's a variation on this technique.
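The answer describes doing this by hand in a debugger. As a loose illustration of the same idea, the Python sketch below takes a snapshot of where every live thread currently is, from inside the process. This is only an analogy (the question is about .NET), and `snapshot` and `wait_on_lock` are hypothetical names; `sys._current_frames()` is a CPython-specific helper.

```python
import sys
import threading
import time
import traceback

def snapshot():
    """Return {thread name: innermost stack entries} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    frames = sys._current_frames()  # CPython-specific: thread id -> current frame
    return {
        names.get(ident, str(ident)): traceback.format_stack(frame)[-2:]
        for ident, frame in frames.items()
    }

def wait_on_lock(lock):
    with lock:  # blocks here while the main thread holds the lock
        pass

lock = threading.Lock()
lock.acquire()
worker = threading.Thread(target=wait_on_lock, args=(lock,), name="worker")
worker.start()
time.sleep(0.1)  # give the worker time to reach the lock

for name, stack in snapshot().items():
    print(name)
    print("".join(stack))  # the worker's last entry points into wait_on_lock

lock.release()
worker.join()
```

Taking several such snapshots and noting what each thread is stuck on gives the same kind of tally the answer recommends.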
What are you doing in the work item?
If you have any sort of cross thread synchronisation (Critical sections etc) then this could cause you to spend time stalling the threads waiting for resources to become free.
For example, If you are doing any sort of file access then you are going to spend considerable time blocked waiting for the loads to complete and this will leave your threads idle a lot of the time. You could throw more threads at the problem but then you'd end up generating more disk requests and the resource contention would become even more of a problem.
Those are a couple of potential ideas, but I'd really need to know what you are doing before I can be more useful ...
Edit: in answer to your comments...
1) OK
2) You'd perform terribly with 2000 threads working flat out, due to switching overhead. In fact, running 20-25 threads on an 8-core machine may be a bad plan too, because if you get them running at high speed they will spend time stealing each other's runtime, and regular context switches (software thread switches) are very expensive. They may not be as expensive as the waits your code is suffering, though.
3) Logging? Do you just submit the entries to an asynchronous queue that spits them out to disk when it has the opportunity, or are they synchronous file writes? If they are asynchronous, can you guarantee that there isn't a maximum number of requests that can be queued before you DO have to wait? And if you have to wait, how many threads end up in contention for the space that just opened up? There are a lot of ifs there alone.
4) Database operations, even on the best database, are likely to block if two threads make similar calls into the database simultaneously. A good database is designed to limit this, but it's quite likely that at least some clashing will happen.
Suffice to say you will want to get a good thread profiler to see where time is REALLY being lost. Failing that you will just have to live with the performance or attack the problem in a different way ...
WF3 performance is a little on the slow side. If you are using .NET 4 you will get better performance by moving to WF4. Mind you, that means a rewrite, as WF4 is a completely different product.
As to WF3: there is a white paper here that should give you plenty of information on improving things from the standard settings. Look for things like increasing the number of threads used by the DefaultWorkflowSchedulerService, or switching to the ManualWorkflowSchedulerService, and disabling performance counters, which are enabled by default.

What to avoid for performance reasons in multithreaded code?

I'm currently reviewing/refactoring a multithreaded application which is supposed to be multithreaded in order to be able to use all the available cores and theoretically deliver a better / superior performance (superior is the commercial term for better :P)
What are the things I should be aware of when programming multithreaded applications?
I mean things that will greatly impact performance, maybe even to the point where you don't gain anything with multithreading at all but lose a lot by design complexity. What are the big red flags for multithreading applications?
Should I start questioning the locks and looking to a lock-free strategy or are there other points more important that should light a warning light?
Edit: The kind of answers I'd like are similar to the answer by Janusz, I want red warnings to look up in code, I know the application doesn't perform as well as it should, I need to know where to start looking, what should worry me and where should I put my efforts. I know it's kind of a general question but I can't post the entire program and if I could choose one section of code then I wouldn't be needing to ask in the first place.
I'm using Delphi 7, although the application will be ported to / remade in .NET (C#) next year, so I'd rather hear comments that are applicable as general practice or, if they must be specific, that apply to either one of those languages.
One thing to definitely avoid is lots of write access to the same cache lines from threads.
For example: If you use a counter variable to count the number of items processed by all threads, this will really hurt performance because the CPU cache lines have to synchronize whenever the other CPU writes to the variable.
One thing that decreases performance is having two threads that both do a lot of hard drive access. The hard drive would jump from providing data for one thread to the other, and both threads would be waiting for the disk all the time.
Something to keep in mind when locking: lock for as short a time as possible. For example, instead of this:
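lock (syncObject) {  // syncObject: hypothetical shared lock object
    bool value = askSomeSharedResourceForSomeValue();
    if (value) DoSomethingIfTrue(); else DoSomethingIfFalse();
}
Do this (if possible):
bool value = false;
lock (syncObject) { value = askSomeSharedResourceForSomeValue(); }
if (value) DoSomethingIfTrue(); else DoSomethingIfFalse();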
Of course, this example only works if DoSomethingIfTrue() and DoSomethingIfFalse() don't require synchronization, but it illustrates this point: locking for as short a time as possible, while maybe not always improving your performance, will improve the safety of your code in that it reduces surface area for synchronization problems.
And in certain cases, it will improve performance. Staying locked for long lengths of time means that other threads waiting for access to some resource are going to be waiting longer.
Having more threads than there are cores typically means that the program is not performing optimally.
So a program which spawns loads of threads is usually not designed in the best fashion. A good example of this practice is the classic socket example where every incoming connection gets its own thread to handle the connection. It is a very non-scalable way to do things. The more threads there are, the more time the OS has to spend context switching between them.
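One common alternative to a thread per connection is a fixed-size pool. The Python sketch below is only an illustration: `handle` is a hypothetical stand-in for real per-connection work, and sizing the pool to the core count is an assumption (I/O-heavy servers often tune it differently).

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle(connection_id):
    """Stand-in for real per-connection work."""
    return f"handled {connection_id}"

# Pool size is an assumption; measure before settling on a number.
POOL_SIZE = os.cpu_count() or 4

with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # 1000 "connections" share POOL_SIZE threads instead of spawning 1000.
    results = list(pool.map(handle, range(1000)))

print(len(results), results[0])  # prints: 1000 handled 0
```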
You should first be familiar with Amdahl's law.
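For reference, Amdahl's law bounds the speedup S on n cores when a fraction p of the work can be parallelized and the remaining 1 - p stays serial:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

As a worked example with assumed numbers: if 90% of the work parallelizes (p = 0.9), then 8 cores give S = 1 / (0.1 + 0.1125), roughly 4.7 times faster, and no number of cores can do better than 10 times.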
If you are using Java, I recommend the book Java Concurrency in Practice; however, most of its help is specific to the Java language (Java 5 or later).
In general, reducing the amount of shared memory increases the amount of parallelism possible, and for performance that should be a major consideration.
Threading with GUIs is another thing to be aware of, but it looks like it is not relevant for this particular problem.
What kills performance is when two or more threads share the same resources. This could be an object that both use, or a file that both use, a network both use or a processor that both use. You cannot avoid these dependencies on shared resources but if possible, try to avoid sharing resources.
Run-time profilers may not work well with a multi-threaded application. Still, anything that makes a single-threaded application slow will also make a multi-threaded application slow. It may be an idea to run your application as a single-threaded application, and use a profiler, to find out where its performance hotspots (bottlenecks) are.
When it's running as a multi-threaded application, you can use the system's performance-monitoring tool to see whether locks are a problem. Assuming that your threads block on locks instead of busy-waiting, having 100% CPU for several threads is a sign that locking isn't a problem. Conversely, something that looks like 50% total CPU utilization on a dual-processor machine is a sign that only one thread is running, and so maybe your locking is a problem that's preventing more than one thread from running concurrently (when counting the number of CPUs in your machine, beware of multi-core and hyperthreading).
Locks aren't only in your code but also in the APIs you use: e.g. the heap manager (whenever you allocate and delete memory), maybe in your logger implementation, maybe in some of the O/S APIs, etc.
Should I start questioning the locks and looking to a lock-free strategy
I always question the locks, but have never used a lock-free strategy; instead my ambition is to use locks where necessary, so that it's always threadsafe but will never deadlock, and to ensure that locks are acquired for a tiny amount of time (e.g. for no more than the amount of time it takes to push or pop a pointer on a thread-safe queue), so that the maximum amount of time that a thread may be blocked is insignificant compared to the time it spends doing useful work.
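A minimal Python sketch of that style, for illustration only: the only shared state is a pair of thread-safe queues, so a lock is held just for the instant it takes to push or pop an item, and the actual work runs without any lock. The squaring step is a made-up placeholder for real work.

```python
import queue
import threading

work = queue.Queue()     # thread-safe; its internal lock is held only per put/get
results = queue.Queue()
STOP = object()          # sentinel telling a worker to exit

def worker():
    while True:
        item = work.get()        # brief lock to pop one item
        if item is STOP:
            break
        value = item * item      # the work itself runs without holding any lock
        results.put(value)       # brief lock to push the result

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(10):
    work.put(n)
for _ in threads:
    work.put(STOP)               # one sentinel per worker
for t in threads:
    t.join()

squares = sorted(results.get() for _ in range(10))
print(squares)  # prints [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```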
You don't mention the language you're using, so I'll make a general statement on locking. Locking is fairly expensive, especially the naive locking that is native to many languages. In many cases you are reading a shared variable (as opposed to writing). Reading is threadsafe as long as it is not taking place simultaneously with a write. However, you still have to lock it down. The most naive form of this locking is to treat the read and the write as the same type of operation, restricting access to the shared variable from other reads as well as writes. A reader/writer lock can dramatically improve performance: one writer, any number of readers. On an app I've worked on, I saw a 35% performance improvement when switching to this construct. If you are working in .NET, the correct lock is ReaderWriterLockSlim.
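To make the "one writer, many readers" idea concrete, here is a minimal Python sketch built on `threading.Condition`. The class `ReadWriteLock` is a hypothetical illustration, not a standard library type and not how ReaderWriterLockSlim is implemented; in particular, this simple version favours readers and can starve a writer under a constant stream of reads.

```python
import threading

class ReadWriteLock:
    """Minimal reader/writer lock: many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:          # readers only wait for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()   # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()         # writers need exclusive access
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()

rw = ReadWriteLock()
rw.acquire_read()
rw.acquire_read()       # a second reader gets in without blocking
print(rw._readers)      # prints 2
rw.release_read()
rw.release_read()
```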
I recommend looking into running multiple processes rather than multiple threads within the same process, if it is a server application.
The benefit of dividing the work between several processes on one machine is that it is easy to increase the number of servers when more performance is needed than a single server can deliver.
You also reduce the risks involved with complex multithreaded applications where deadlocks, bottlenecks etc reduce the total performance.
There are commercial frameworks that simplify server software development when it comes to load balancing and distributed queue processing, but developing your own load sharing infrastructure is not that complicated compared with what you will encounter in general in a multi-threaded application.
I'm using Delphi 7
You might be using COM objects, then, explicitly or implicitly; if you are, COM objects have their own complications and restrictions on threading: Processes, Threads, and Apartments.
You should first get a tool to monitor threads that is specific to your language, framework and IDE. Your own logger might do fine too (Resume Time, Sleep Time + Duration). From there you can check for badly performing threads that don't execute much or are waiting too long for something to happen; you might want to make the event they are waiting for occur as early as possible.
As you want to use both cores you should check the usage of the cores with a tool that can graph the processor usage on both cores for your application only, or just make sure your computer is as idle as possible.
Besides that, you should profile your application just to make sure that the things performed within the threads are efficient, but watch out for premature optimization. There is no sense in optimizing your multiprocessing if the threads themselves are performing badly.
Looking for a lock-free strategy can help a lot, but it is not always possible to get your application to perform in a lock-free way.
Threads don't always equal performance.
Things are a lot better in certain operating systems as opposed to others, but if you can have something sleep or relinquish its time until it's signaled...or not start a new process for virtually everything, you're saving yourself from bogging the application down in context switching.

Best practice regarding number of threads in GUI applications

In the past I've worked with a number of programmers who have worked exclusively writing GUI applications.
And I've been given the impression that they have almost universally minimised the use of multiple threads in their applications. In some cases they seem to have gone to extreme lengths to ensure that they use a single thread.
Is this common? Is this the generally accepted philosophy for GUI application design?
And if so, why?
There are a number of answers saying that thread usage should be minimised to reduce complexity. Reducing complexity in general is a good thing.
But if you look at any number of applications where response to external events is of paramount importance (eg. web servers, any number of embedded applications) there seems to be a world of difference in the attitude toward thread usage.
Generally speaking, GUI frameworks aren't thread safe. For things like Swing (Java's GUI API), only one thread can be updating the UI (or bad things can happen). Only one thread handles dispatching events. If you have multiple threads updating the screen, you can get some ugly flicker and incorrect drawing.
That doesn't mean the application needs to be single threaded, however. There are certainly circumstances when you don't want this to be the case. If you click on a button that calculates pi to 1000 digits, you don't want the UI to be locked up and the button to be depressed for the next couple of days. This is when things like SwingWorker come in handy. It has two parts: a doInBackground(), which runs in a separate thread, and a done(), which is called by the thread that handles updating the UI some time after doInBackground() has finished. This allows events to be handled quickly, or long-running events to be processed in the background, while still having a single thread updating the screen.
I think that on Windows you are limited to all GUI operations happening on a single thread, because of the way the Windows message pump works. To increase responsiveness, most apps add at least one additional worker thread for longer running tasks that would otherwise block and make the UI unresponsive.
Threading is fundamentally hard, and so thinking in terms of more than a couple of threads can often result in a lot of debugging effort. There is a quote that escapes me right now that goes something like: "if you think you understand threading then you really don't".
I've seen the same thing. Ideally you should perform any operation that is going to take longer than a few hundred ms in a background thread. Anything shorter than 100ms and a human probably won't notice the difference.
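The same two-part shape can be sketched outside Swing. The Python example below is only an analogy, with no real GUI toolkit involved: the main thread plays the role of the UI thread, a plain list stands in for a widget, and the function names merely mirror SwingWorker's `doInBackground()` and `done()`. The worker never touches the "widget" itself; it hands a callback back through a queue.

```python
import queue
import threading

ui_events = queue.Queue()   # only the "UI thread" takes callbacks from this queue
log = []                    # stand-in for a widget that only the UI thread may touch

def do_in_background():
    """Slow work, run off the UI thread (stand-in for computing digits of pi)."""
    return sum(range(1_000_000))

def done(result):
    """Runs on the UI thread, so it may safely update the 'widget'."""
    log.append(f"result: {result}")

def background_job():
    result = do_in_background()
    ui_events.put(lambda: done(result))   # hand the update back to the UI thread

threading.Thread(target=background_job).start()

# Simplified event loop: the main thread plays the role of the UI thread.
while not log:
    try:
        callback = ui_events.get(timeout=0.05)
    except queue.Empty:
        continue            # a real loop would process other UI events here
    callback()

print(log)  # prints ['result: 499999500000']
```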
A lot of GUI programmers I've worked with in the past are scared of threads because they are "hard". In some GUI frameworks such as the Delphi VCL there are warnings about using the VCL from multiple threads, and this tends to scare some people (others take it as a challenge ;) )
One interesting example of multi-threaded GUI coding is the BeOS API. Every window in an application gets its own thread. From my experience this made BeOS apps feel more responsive, but it did make programming things a little more tricky. Fortunately since BeOS was designed to be multi-threaded by default there was a lot of stuff in the API to make things easier than on some other OSs I've used.
Most GUI frameworks are not thread safe, meaning that all controls have to be accessed from the same thread that created them. Still, it's good practice to create worker threads to keep applications responsive, but you need to be careful to delegate GUI updates to the GUI thread.
GUI applications should minimize the number of threads that they use for the following reasons:
Thread programming is very hard and complicated
In general, GUI applications do at most 2 things at once : a) Respond to User Input, and b) Perform a background task (such as load in data) in response to a user action or an anticipated user action
In general therefore, the added complexity of using multiple threads is not justified by the needs of the application.
There are of course exceptions to the rule.
GUIs generally don't use a whole lot of threads, but they often do spin off another thread for interacting with certain sub-systems, especially if those systems take a while or are heavily shared resources.
For example, if you're going to print, you'll often want to spin off another thread to interact with the printer pool, as it may be very busy for a while and there's no reason not to keep working.
Another example would be database loads where you're interacting with SQL server or something like that and because of the latency involved you may want to create another thread so your main UI processing thread can continue to respond to commands.
The more threads you have in an application, (generally) the more complex the solution is. By attempting to minimise the number of threads being utilised within a GUI, there are fewer potential areas for problems.
The other issue is the biggest problem in GUI design: the human. Humans are notorious for wanting to do multiple things at the same time. Users have a habit of clicking multiple buttons/controls in quick succession in an attempt to get something done quicker. Computers generally cannot keep up with this (which is only compounded by the GUI's apparent ability to keep up when it uses multiple threads), so to minimise this effect GUIs respond to input on a first-come, first-served basis on a single thread. By doing this, the GUI is forced to wait until system resources are free before it can move on, thereby eliminating all the nasty deadlock situations that can arise. Obviously, if the program logic and the GUI are on different threads, then this goes out the window.
As a personal preference, I like to keep things simple on one thread, but not to the detriment of the responsiveness of the GUI. If a task is taking too long, then I'll use a different thread; otherwise I'll stick to just one.
As the prior comments said, GUI Frameworks (at least on Windows) are single threaded, thus the single thread. Another recommendation (that is difficult to code in practice) is to limit the number of the threads to the number of available cores on the machine. Your CPU can only do one operation at a time with one core. If there are two threads, a context switch has to happen at some point. If you've got too many threads, the computer can sometimes spend more time swapping between threads than letting threads work.
As Moore's Law changes with more cores, this will change and hopefully programming frameworks will evolve to help us use threads more effectively, depending on the number of cores available to the program, such as the TPL.
Generally all the windowing messages from the window manager / OS will go to a single queue, so it's natural to have all UI elements in a single thread. Some frameworks, such as .NET, actually throw exceptions if you attempt to directly access UI elements from a thread other than the one that created them.
