Windbg break-in takes very long time - windows

I want to capture a stacktrace of an application which sometimes stops responding for few minutes.
When the application stops responding, the windows desktop also stops responding to mouse clicks, although some other already running applications are working fine at that time (for example windbg works fine, ProcessExplorer refreshes its screen, but does not respond to mouse events).
While the application is non responsive, it is actually taking about 80% of one CPU core. That is why I would like to get a stacktrace.
The misbehaving application usually takes about 2-3 minutes to do its strange job or if Ctrl+Esc is pressed it becomes responsive immediately (and the start menu opens of course...)
I have WinDbg attached to the misbehaving application and when I issue the Break command, the break-in does not happen until the application starts to respond again.
From what I understand the break-in actually creates a remote thread which pretty soon calls DbgBreakPoint.
What could be preventing debugger's thread from executing?
EDIT:
First of all thanks for your help!
I was also thinking that this might be caused by a bad device driver or something that installs a system wide hook somewhere.
I was thinking to enable kernel debugging and get a stack trace from the kernel for the offending thread or enable manual bluescreen trigger to produce a dump and look at that afterwards.
Process Explorer and Process Monitor does not reveal anything interesting. They also become unusable when the bug is triggered (updating their windows but not responsive to mouse or keyboard).
EDIT2:
Background info:
App uses QT, OpenGL and also DirectSound and runs on Windows 7 SP1 x64
I am currently suspecting something with the graphics part.
The strange thing is that if a system-wide lock is taken (like GDI Lock), this would prevent drawing of other Windows, but that does not happen. WinDbg on same machine works fine. ProcessExplorer updates but does not receive mouse clicks, Desktop updates but no mouse clicks.
I currently have a kernel debugger attached...
EDIT3
ETW was most useful for debugging. It turns out that Qt's main event processing loop goes crazy. PeekMessage and MsgWaitForMultipleObjectsEx (with 0 timeout) gets called in a tight loop. That is where the high CPU usage comes from.
It looks like the App is generating/getting loads of messages at that time. But it is not easy to see what the messages are (or I don't know how to access function parameters in ETW). Using a debugger also does not help much but, with a breakpoint in the QT's event loop leads me to believe that WM_TIMER messages are the culprit.

Given that the desktop also misbehaves during this time, it sounds like your app isn't necessarily misbehaving but merely aggravating a bug elsewhere (e.g., in a device driver or some crummy anti-malware code that has injected itself into other processes). Stack traces from your app may or may not be very revealing.
If the problem is easily reproducible, I'd set a breakpoint somewhere in the "middle" of the app and see if the problem happens before or after that. Then move the breakpoint until you find the last instruction your app executes before things go bonkers. Figuring out what your app does that triggers this behavior may give a clue as what's going on.
Another option is to try using some system-wide debugging tools. First, I'd peak in the Event Viewer to see if there are suspicious error or warning events posting in proximity to the moment the machine goes haywire. Then I'd try a tool like Sysinternal's Process Monitor or Process Explorer to get a better view of what's happening. You might also try ETW to capture a system-wide trace of what's happening on the system that you can study after the fact. (ETW can be hard to use, so check out Bruce Dawson's UIforETW.)

Use ETW to find the cause. Install the Windows Performance Toolkit (part of the Win10 v1511 SDK: https://go.microsoft.com/fwlink/p/?LinkID=698771 which is the last version that works in Win7), run WPRUI.exe, select CPU Usage and click on Start.
After you captured the hang, click on Save. Wait until WPRUI is finished, open the ETL in WPA, setup and load debug symbols in WPA.
Drag & Drop the CPU Usage (Precise) graph to analyse pane and look for WAIT (µs) max for your process to see that long hang and expand the stack to see where it happens.

Related

directx 11 c++ application in vs 2012 -- full screen crashes after running for a while

Win7 Directx 11 with VS 2012 -- When I let my app run in full screen on my development computer, eventually, the program exits full screen back to windowed, all by itself, and pops up a dialog telling me windows resources are running low. The dialog tells me something about turning off interactive themes or something to that end. When I run the program in release mode on one of our client machines, the app runs fine, does everything it should, but after a while, instead of popping up the dialog about windows resources, I get an exception window with exception 0x40000015 as the error. This only happens if I am in full screen, windowed, it never crashes. Event Viewer shows nothing at all about the crash. Any thoughts?
Thanks in advance. Basically what my app is, is a wrapper graphics library. When it says I am using all the resources, my resource monitor shows that I never went above 20% of memory and the cpu never went above 14%. The 0x40000015 error number is rather general and doesn't exactly point me in any real direction.
This is not related to trying to exit an app in full screen, as that I have the code required to prevent the exception related to that problem.
R
It turned out to be a component issue on the target platform. The ReportLiveDeviceObjects info was helpful and will continue to help down the line.

explorer thread using unusual amount of processor power, detect root/source of "issue"?

I was profiling an application I am developing, and I saw unusual explorer.exe processor usage, almost 15%!
This is very high for explorer, I launched process explorer and then I saw this:
I click "kill" and explorer works fine..
this happens every time explorer restarts / startsup.. How can I detect which application or what is causing that thread to launch and how do I prevent it from launching? My explorer works fine without it.
I suspect this is a virus but none of my AV software detects it..
Can anyone help me out?
You have a 3rd party components which creates dynamically code (green) and this code creates snapshots of programs (red) with this function: kernel32.dll!CreateToolhelp32Snapshot
Use AutoRuns and ShellExView to disable all 3rd party tools until you find the code which causes your CPU usage.

Strange behavior in user idle detection in Adobe Air on Mac

I've got an Adobe AIR application written in pure AS3 that has some functionality that happens when the user is idle and then returns to the normal state when the user returns. I'm detecting this activity with the following code:
NativeApplication.nativeApplication.idleThreshold = 180;
NativeApplication.nativeApplication.addEventListener(Event.USER_IDLE, onUserIdle);
NativeApplication.nativeApplication.addEventListener(Event.USER_PRESENT, onUserPresent);
The onUserIdle method is called after 3 minutes as it should be, but then the onUserPresent event is fired almost immediately afterwards. I'm talking milliseconds later. This happens without any user input whatsoever. The bizarre thing is that this does not occur on Windows - only on OSX. And it happens on all flavors of OSX going back to 10.6.3.
Adobe's documentation is incredibly vague on how those events are determined, so I'm not sure if there is something I can do at the system level to fix the problem. Does anyone have any experience with this issue, and if not, any other suggestions on how I can detect user idleness even when the app does not have focus?
Just to preempt the suggestion, I cannot use mouse/keyboard listeners to simulate the same behavior because they do not work if the application loses focus, whereas the NativeApplication events still fire. I've also used NativeProcess to get the output of ioreg to get the hardware idle time as reported by the system, but it does not appear to be affected by the mouse.
I really appreciate any assistance.
Edit: I just discovered that this does not occur when the application is run in an Administrator account on OSX. It only happens in a User account, which only serves to confuse me more.
I figured out what the issue is. When the USER_IDLE event is fired in a user account, we did several things - one of which was forcibly kill the Dock in order to make sure it was hidden from the screen. For whatever reason, this resets the internally available idleThreshold count. This was not only happening in AIR - it was also happening when monitored through Terminal and it appears there is absolutely nothing that can be done to stop it. The solution was to stop killing the dock. Everything magically worked after that.

disable the options of a Windows 8 not-responding app

On Microsoft website I saw this:
What does it mean when a program is not responding?
If a program is not responding, it means the program is interacting more slowly than usual with Windows, typically because a problem has occurred in the program. If the problem is temporary, and if you choose to wait, some programs will start responding again. Depending on the options available, you can also choose to close or restart the program.
On Windows 8, how do you DISABLE "the options" so that users of my application do not see the options to restart my app?
I'd suggest changing your application's design so that this situation doesn't arise. "Not responding" afaik means that the thread with the main message pump is not fetching messages. This is not a good thing to allow to happen.
It's better to do things that will take a long time in a separate thread, and keep responding to the user in the meantime (even if this just means asking them to wait and keeping a dialog responding to mouse events etc).

How to find out more about Application Hang event?

If a VB6 app is causing an Application Hang event to appear in the Event Viewer, how can I find out more about why the application is hanging?
Does an Application Hang event mean that the app has frozen and crashed, or just that it temporarily hangs?
All I get in the event log for this event is:
Hanging application [MyAppName].exe, version [MyAppVersionNo], hang module hungapp, version 0.0.0.0, hang address 0x00000000.
That is not enough and I want to be able to find out more about why it is hanging. What code changes or other steps need to be taken to cause the app to provide more details in the event log?
I recommend using the Windows Performance Toolkit. The best version to use is in the Windows Assessment & Deployment Kit, http://www.microsoft.com/download/en/details.aspx?id=28997
Once it's installed, what you do is start up Windows Performance Recorder (WPR) and click the Start button to begin recording. Next, reproduce the problem with your app. Then go back to WPR and press the Save button. Next, load up Windows Performance Analyzer and open that *.ETL file that was generated. Then you want to go to System Activity section in the Graph Explorer, expand it, and find the UI Delays graph (or it might be the first graph parked on System Activity). Double click on it to get the detailed version in an Analysis tab.
Once you find the UI delay you're interested in, you can add another graph such as CPU Usage (Sampled) from the Processing node in Graph Explorer. When the two graphs are in the same Analysis tab, their scrolling and selection will be synchronized. So you can click on the UI delay event and it will also highlight the corresponding range in CPU Usage.
The Application Hang event means that Windows has decided that the application is unresponsive. Since the event is generated by the operating system and not the application, your options for getting additional information in the event are extremely limited.
This is what seems to be available on an Application Hang event:
Message: Hanging application %1, version %2, hang module %3, version %4, hang address 0x%5.
From:
http://www.microsoft.com/technet/support/ee/transform.aspx?ProdName=Windows+Operating+System&ProdVer=5.2&EvtID=1002&EvtSrc=Application+Hang&LCID=1033
If you believe the cause of the event is something that your application does (as opposed to something in the environment where the application is running), then instead of trying to pass info to the hang event, you should raise the level of log info to debug mode and look in your application's log file to see what it is doing just prior to becoming unresponsive.
If you lack logging information, or a logging framework in your application, then that is where you should be focusing your efforts. The upside is that you will benefit from better logging in the future as well. Use a logging framework however, so you can disable debug level logging in a production environment, when everything is running smoothly.
I would approach this by reviewing the code in the module that Windows has determined has hung, the name of which was written to the event log. Attempting to get more detail in the hang event will not be possible because when Windows has determined the app is unresponsive, it is too late.
Into the module that is hanging I would add multiple calls to DoEvents as well as logging status messages directly into the EventLog. Adding a logging framework at this point would introduce complexity and involve either a database or file access in which to store the logs.
Windows thinks the app has hung because it has stopped responding to messages. Unfortunately, implementing a second thread in your VB6 app is not trivial, unlike .NET. Never-the-less, adding another thread would keep the app responsive, but then you would likely still be left with answering the question, "why is the code taking so long to execute?"
Getting information from windows event perspective will not help. Try to have tracing in your application which helps you to get the exact cause.

Resources