Determine thread creation time in WinDbg with a user-mode dump

How can I determine when a thread was created, using WinDbg with a user-mode dump?

The !runaway command allows you to display the elapsed time since a thread was created. From the documentation:
!runaway [Flags]
Parameters
Flags
Specifies the kind of information to be displayed. Flags can be any combination of the following bits. The default value is 0x1.
Bit 0 (0x1) Causes the debugger to show the amount of user time consumed by each thread.
Bit 1 (0x2) Causes the debugger to show the amount of kernel time consumed by each thread.
Bit 2 (0x4) Causes the debugger to show the amount of time that has elapsed since each thread was created.
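Because the elapsed time counts back from the moment the dump was taken, the creation time follows by subtraction. A minimal sketch, assuming you already know the dump's timestamp (for example as reported by the .time command) and have read a thread's elapsed time from !runaway 4. Both values below are hypothetical, not taken from a real dump.

```python
from datetime import datetime, timedelta


def thread_creation_time(dump_time: datetime, elapsed: timedelta) -> datetime:
    """Return the thread's creation time: dump timestamp minus elapsed time."""
    return dump_time - elapsed


# Hypothetical inputs: the time the dump was taken and one thread's elapsed time.
dump_time = datetime(2024, 1, 15, 12, 0, 0)
elapsed = timedelta(minutes=43, seconds=19, milliseconds=749)

created = thread_creation_time(dump_time, elapsed)
print(created.isoformat())
```

The result is only as accurate as the dump timestamp, and it is expressed in whatever time zone that timestamp uses.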

Related

Pausing a kthread in Linux

The following is written at https://blog.packagecloud.io/eng/2017/02/06/monitoring-tuning-linux-networking-stack-sending-data/#queuing-disciplines:
As you’ll see from the previous post, the NET_TX_SOFTIRQ softirq has
the function net_tx_action registered to it. This means that there is
a kernel thread executing net_tx_action. That thread is occasionally
paused and raise_softirq_irqoff resumes it. Let’s take a look at what
net_tx_action does so we can understand how the kernel processes
transmit requests.
It says that the kthread is occasionally paused. When is a kthread paused, and why?
How does the kthread know that there is work to execute? Does it poll a queue?
I think what's said there about pausing a thread is more of a figure of speech. In this case the kthread is not actually paused; the thread works just fine.
The body of work related to softirq is in __do_softirq() function.
There's a number of softirq types, and each softirq type is represented by a bit in a bitmask. Whenever there's work for a specific type of softirq, the corresponding bit is raised in the bitmask. __do_softirq() processes this bitmask bit by bit starting with the least significant bit, and does the work for each softirq type that has the bit set. Thus softirq types are processed in the order of priority, with the bit 0 representing the highest priority. In fact, if you look at the code you'll see that the bitmask is saved (copied) and then cleared before the processing starts, and it's the copy that is processed.
The bit for NET_TX_SOFTIRQ is raised each time a new skb is submitted to the kernel networking stack to send data out. That causes __do_softirq() to call net_tx_action() for outgoing data. If there's no data to send out, then the bit is not raised. Essentially, that's what causes the kernel softirq thread to "pause", which is just a layman's way of saying that there's no work for it, so net_tx_action() is not called. As soon as there's more data, the bit is raised again as data is submitted to the kernel networking stack. __do_softirq() sees that and calls net_tx_action() again.
There's a softirq thread on each CPU. A thread is run when there's at least one pending softirq type. The threads are defined in the softirq_threads structure and started in the spawn_ksoftirqd() function.
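The bitmask handling described above can be modelled in a few lines. This is a toy sketch in Python, not kernel code: the class and method names are made up for illustration, and only the softirq numbering (HI = 0, TIMER = 1, NET_TX = 2, NET_RX = 3) follows the kernel's enum.

```python
# Softirq numbers in the same order as the kernel's enum (lower = higher priority).
HI_SOFTIRQ, TIMER_SOFTIRQ, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ = range(4)


class SoftirqModel:
    """Toy model of the pending bitmask; names here are illustrative only."""

    def __init__(self):
        self.pending = 0      # one bit per softirq type
        self.handlers = {}    # softirq number -> callable

    def register(self, nr, handler):
        self.handlers[nr] = handler

    def raise_softirq(self, nr):
        self.pending |= 1 << nr

    def do_softirq(self):
        # Save a copy of the mask and clear the original before processing,
        # so softirqs raised while handlers run are kept for the next pass.
        snapshot, self.pending = self.pending, 0
        nr = 0
        while snapshot:
            if snapshot & 1:          # least significant bit first
                self.handlers[nr]()
            snapshot >>= 1
            nr += 1


log = []
model = SoftirqModel()
model.register(NET_TX_SOFTIRQ, lambda: log.append("net_tx_action"))
model.register(NET_RX_SOFTIRQ, lambda: log.append("net_rx_action"))

model.raise_softirq(NET_RX_SOFTIRQ)   # raised first, but lower priority
model.raise_softirq(NET_TX_SOFTIRQ)
model.do_softirq()                    # runs the TX handler, then the RX handler
model.do_softirq()                    # nothing pending: no handler is called
print(log)
```

The second do_softirq() call is the "paused" case: no bit is set, so net_tx_action is simply not invoked.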

When does Windows release threads?

Our product consumes a lot of Windows resources, such as socket handles, memory, threads and so on. Usually there are 700-900 active threads, but in some cases the product can rapidly create new threads, do some work on them, and close them.
I came across a crash memory dump of our product. With the ~* WinDbg command I can see 817 active threads, but when I run the !handle command it prints this summary:
Type Count
None 15
Event 2603
Section 13
File 705
Directory 4
Mutant 32
WindowStation 2
Semaphore 789
Key 208
Process 1
Thread 5766
Desktop 1
IoCompletion 308
Timer 276
KeyedEvent 1
TpWorkerFactory 48
So, actually the process holds 5766 threads. My question is: when does Windows actually free handles for a process? Could there be some kind of delay, or caching? Can someone explain this behavior?
I don't think that we have handle leaks, but we do have weird behavior in a legacy part of the system that rapidly creates and closes threads for small tasks. I would also like to point out that we are unlikely to run more than 1000 threads simultaneously; I am pretty sure about this.
Thanks.
When you say "So, actually the process holds 5766 threads", what you really mean is that the process holds 5766 thread handles.
Even though a thread may no longer be running, whether that is the result of a call to ExitThread()/TerminateThread() or returning from the ThreadProc, any handles to that thread will remain valid. This makes it possible to do things like call GetExitCodeThread() on the handle of a thread that has finished its work.
Unfortunately, that means that you have to remember to call CloseHandle() instead of just letting it leak. The MSDN example on Creating Threads covers this to some extent.
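The gap between the thread count and the handle count can be shown with a toy model. This is a Python sketch of the bookkeeping only: ToyProcess and its methods are hypothetical stand-ins for CreateThread(), a thread finishing, and CloseHandle(), and the numbers are arbitrary.

```python
class ToyProcess:
    """Toy model: a thread handle stays open until it is explicitly closed."""

    def __init__(self):
        self.running_threads = set()
        self.open_thread_handles = {}   # handle -> thread id
        self._next_id = 1

    def create_thread(self):
        tid = self._next_id
        self._next_id += 1
        self.running_threads.add(tid)
        handle = f"handle-{tid}"
        self.open_thread_handles[handle] = tid
        return handle

    def thread_exits(self, handle):
        # The thread stops running, but the handle remains valid.
        self.running_threads.discard(self.open_thread_handles[handle])

    def close_handle(self, handle):
        del self.open_thread_handles[handle]


proc = ToyProcess()
handles = [proc.create_thread() for _ in range(10)]
for h in handles:
    proc.thread_exits(h)        # all ten threads finish their work
for h in handles[:3]:
    proc.close_handle(h)        # only three handles are ever closed

print(len(proc.running_threads), len(proc.open_thread_handles))
```

No threads are running at the end, yet seven handles remain open, which is the same pattern as 817 threads against 5766 thread handles.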
Another thing I will note is that somewhere not too far above 1000 running threads, you are likely to exhaust the virtual address space available to a 32-bit process, since each thread by default reserves 1 MB of address space for its stack.

Who is running the kernel if the CPU is running processes?

Suppose that in a two-process environment, one process is scheduled for execution by the kernel, and it requests some data that is not available in RAM. The CPU signals to the kernel that something is not available, and the process is suspended. The kernel then loads the second process for execution on the CPU, starts looking for the data in secondary storage (say, virtual memory), fetches it, places it in main memory by swapping out data that is currently inactive, and puts the first process back in the ready queue for execution.
We know that everything in a computer system is carried out by the CPU. If the CPU is busy continuously executing process code, then what executes the kernel code that performs the kernel's tasks?
Please let me know if I have explained the scenario clearly.
At any point in time, the CPU(s) will be doing one of the following:
Running a process in user mode.
Running on behalf of a process in kernel mode to execute a privileged instruction or access hardware (for example, when a read / write system call is issued).
Running in response to a hardware interrupt, i.e. running in interrupt context (not associated with any particular process), and yes, in kernel mode.
Running kernel threads to serve deferred work (tasklets / softirqs).
Running the CPU idle thread if there is nothing else to execute.
If you are asking about scheduling in particular, then:
Suppose a process is running and issues a read call to retrieve data from the hard disk. The read system call switches the CPU from user mode to kernel mode. The kernel, running on behalf of the process, prepares the hard disk read operation and then calls the schedule() function, which removes the process from the CPU.
Suppose a hardware interrupt arrives. The currently running process is suspended, and the interrupt service handler for that interrupt begins to execute in kernel mode (obviously).
Basically, kernel runs in between user processes !!
Clear now ?
Shash
The kernel runs either as a result of a hardware interrupt, or as a result of being invoked by a process to do something. In both cases the code which was executing at that moment stops running until the kernel finishes its job.
It is similar to a function call: when function A calls function B, function A has to wait until function B is done doing what it does, and returns control to function A. You do not need multiple CPUs, or any kind of magic to accomplish this.
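The same idea can be sketched as a single loop standing in for one CPU. This is a toy Python model, not how a real kernel is written: each generator plays a user process, each yield plays a trap into the kernel (a system call or an interrupt), and the loop body plays the kernel code that runs in between. All names are illustrative.

```python
from collections import deque


def process(name, steps):
    """A 'user process': does one step of work, then traps into the kernel."""
    for i in range(steps):
        yield f"{name} step {i}"


def run_single_cpu(processes):
    """One loop = one CPU. It runs either user code or kernel code, never both."""
    trace = []
    ready = deque(processes)
    while ready:
        current = ready.popleft()        # kernel: pick the next ready process
        try:
            event = next(current)        # CPU executes user code until a trap
        except StopIteration:
            trace.append("kernel: process exited")
            continue
        trace.append(f"user: {event}")
        trace.append("kernel: handle trap, schedule")
        ready.append(current)            # kernel: back on the ready queue
    return trace


trace = run_single_cpu([process("A", 2), process("B", 1)])
for entry in trace:
    print(entry)
```

In the trace, every stretch of user code is followed by kernel code on the same CPU, so no second processor is needed to run the kernel.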
The CPU is not continuously executing process code. The CPU is interrupted to perform various operations. Interrupts can occur for various reasons: a resource becomes available, a previous action completes, or simply a timer goes off.
I recommend this series of videos for more in-depth information: http://academicearth.org/courses/operating-systems-and-system-programming

Why is a thread's status running but it doesn't use any CPU?

Today I found a very strange problem.
I was running Red Hat Enterprise Linux 6 on an Intel E31275 CPU (4 cores, 8 threads). I found that one kernel thread (I'll call it my_thread) didn't work correctly.
With "ps" command, I found the status of my_thread was always running:
ps ax
5545 ? R 3:14 [my_thread]
15774 ttyS0 Ss 0:00 -bash
...
But its running time was always 3:14. Since it was running, why didn't the total time increase?
From the proc file /proc/5545/sched, I found that all the statistics for this thread, including the wakeup count (se.nr_wakeups), always stayed the same, too.
From /proc/5545/stack, I found this thread called this function and never returned:
interruptible_sleep_on_timeout(&q, 3*HZ);
In theory this function would return every 3 seconds if no other threads woke up the thread. Each time after the function returned, se.nr_wakeups in /proc/5545/sched would be increased by 1. But this never happened after I found the thread had some problems.
Does anyone have any ideas? Is it possible that interruptible_sleep_on_timeout() never returns?
Update:
I find the problem won't occur if I set CPU affinity for this thread. If I pin it to a dedicated core, then everything is OK. Are there any problems with SMP scheduling?
Update again:
After I disabled hyper-threading in the BIOS, I have not seen the problem again so far.
First off, R indicates the thread is not in running state but runnable. That is, it does not mean it runs, it means it is in a state the scheduler is allowed to pick it for running. There is a big difference between the two.
In a similar sense, interruptible_sleep_on_timeout(&q, 3*HZ); will not run the thread after 3 seconds (3*HZ jiffies), but rather make it available for running after that timeout - and indeed you see it in "ps" as available for running, so possibly the timeout has indeed occurred.
Since you did not say anything about the kernel thread in question I don't even know if it is in your own code or standard kernel code so I cannot really answer in detail.
One possible reason for the situation you described is that some other thread (user or kernel) has a higher priority than your thread, so the scheduler never picks yours for running. If so, it is probably not a thread running at real-time priority (SCHED_FIFO or SCHED_RR).
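One way to tell "runnable" from "actually running" is to sample the thread's accumulated CPU time twice and see whether it moves. A minimal sketch, assuming the field layout of /proc/<pid>/stat documented in proc(5) (state is field 3, utime is field 14, stime is field 15, both in clock ticks). The sample line is hypothetical and the helper names are made up for illustration.

```python
def parse_stat(line):
    """Return (state, cpu_ticks) from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' first.
    """
    rest = line[line.rindex(")") + 1:].split()
    state = rest[0]                 # field 3
    utime = int(rest[11])           # field 14, in clock ticks
    stime = int(rest[12])           # field 15, in clock ticks
    return state, utime + stime


def is_consuming_cpu(before, after):
    """True if the CPU time grew between two samples of the same thread."""
    return parse_stat(after)[1] > parse_stat(before)[1]


# Hypothetical samples taken a few seconds apart: state R, CPU time unchanged.
sample_1 = "5545 (my_thread) R 2 0 0 0 -1 2129984 0 0 0 0 0 19400 0 0 20 0 1 0"
sample_2 = "5545 (my_thread) R 2 0 0 0 -1 2129984 0 0 0 0 0 19400 0 0 20 0 1 0"

state, ticks = parse_stat(sample_2)
print(state, ticks, is_consuming_cpu(sample_1, sample_2))
```

A thread that stays in state R while its tick count does not change is waiting on the run queue rather than executing.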

Determine thread wait time in WinDbg with user-mode dump

Is there any way in WinDbg to determine since what date/time a Windows thread has been blocked by functions like WaitForSingleObject or WaitForMultipleObjects?
I know how to do this in kernel debugging (using !thread), but I have no idea how to do this in user-mode debugging.
In WinDbg, you can use !runaway to get thread timings:
!runaway
!runaway 1 (user time)
!runaway 2 (kernel time)
!runaway 4 (elapsed time)
(You'll find these documented as 0, 1 and 2 some places, but in my experience those don't work. Perhaps it depends on the WinDbg version or something...)
You can compute the time spent suspended by subtracting a thread's user and kernel time from its elapsed time, but unfortunately I don't know of any way (short of writing a WinDbg plugin) to get WinDbg to do that for you.
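The subtraction can be done outside the debugger on copied output. A rough sketch, assuming the text of !runaway 7 has three sections titled "User Mode Time", "Kernel Mode Time" and "Elapsed Time", with rows of the form "index:id  [N days] h:mm:ss.fff". The exact layout varies between debugger versions, so the sample text and the parsing pattern are assumptions to adjust, not a documented format.

```python
import re
from datetime import timedelta

# Hypothetical output in the assumed layout; not copied from a real session.
SAMPLE = """\
 User Mode Time
  Thread       Time
   0:55c       0 days 0:00:01.500
   1:1a4       0 days 0:00:00.000
 Kernel Mode Time
  Thread       Time
   0:55c       0 days 0:00:00.500
   1:1a4       0 days 0:00:00.250
 Elapsed Time
  Thread       Time
   0:55c       0 days 0:43:19.000
   1:1a4       0 days 0:00:03.000
"""

SECTIONS = ("User Mode Time", "Kernel Mode Time", "Elapsed Time")
ROW = re.compile(
    r"^\s*\d+:([0-9a-fA-F]+)\s+(?:(\d+) days )?(\d+):(\d+):(\d+(?:\.\d+)?)\s*$"
)


def parse_runaway(text):
    """Return {section title: {thread id: timedelta}} for the assumed layout."""
    result, current = {}, None
    for line in text.splitlines():
        if line.strip() in SECTIONS:
            current = result.setdefault(line.strip(), {})
            continue
        match = ROW.match(line)
        if match and current is not None:
            tid, days, hours, minutes, seconds = match.groups()
            current[tid] = timedelta(days=int(days or 0), hours=int(hours),
                                     minutes=int(minutes), seconds=float(seconds))
    return result


def non_cpu_time(times):
    """Elapsed time minus user and kernel time, per thread id."""
    user, kernel, elapsed = (times[name] for name in SECTIONS)
    zero = timedelta()
    return {tid: total - user.get(tid, zero) - kernel.get(tid, zero)
            for tid, total in elapsed.items()}


waiting = non_cpu_time(parse_runaway(SAMPLE))
for tid, value in waiting.items():
    print(tid, value)
```

The result is the total time a thread was not executing since its creation. It does not say when the current wait began, so it gives an upper bound on the current wait rather than its start time.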
If you're not set on WinDbg, you can use Process Explorer to get the same information. When you right-click a process and select the threads tab in the properties dialog, you get a list of all the threads in the process. Selecting a particular thread will show the same timing information, among other things.
