I have a WinForms application that uses XNA to animate 3D models in a control. The app has been working fine for months, but recently I've started to experience periodic pauses in the animation. Setting out to investigate what is going on, I have established these facts:
1. It happens on my machine only; other machines work fine.
2. Removing everything from my render loop does not improve the problem.
3. In 2 I didn't actually remove everything: I limited my loop to setting the viewport on my GraphicsDevice and then calling GraphicsDevice.Present.
Trying to dig further I fired up PIX to capture some statistics. Screenshots of two PIX runs can be viewed here (Run6) and here (Run14). Run6 is using my original render loop and Run14 is using the bare-bones Present loop.
PIX tells me that the GPU is periodically doing something, and I assume this is causing the pauses. What could be the cause of this? Or how do I go about finding out what the GPU is actually doing?
Update: since I usually trust my code to be perfect (who's laughing?), I started a new XNA project from scratch to see if it exhibits the same behavior. So, starting a new XNA 3.1 Windows Game project and running PIX, I get this timeline. The same periodic pauses. So the problem must be lower in the stack, in XNA or Direct3D.
So PIX shows that the GPU is working on something. I can see the list of DX calls made within each frame, and the timing calculations show that the pause occurs during (or after) the IDirect3DDevice9::Present call.
Update 2: I had previously installed and uninstalled XNA 4.0 CTP on the problematic machine. I cannot be certain that this is related but I thought that perhaps a reinstall of the XNA Game Studio 3.1 bits could make a difference. Turns out it did.
The underlying question remains the same (and the bounty is still up): what could affect XNA 3.1 (or DirectX) to make it behave like this and is there any logging/tracing power tool for the DirectX and/or GPU level out there that could shed some light on what is going on?
Note: I'm using XNA 3.1 on a Windows 7 x64 dual-core machine with 8GB RAM.
Note2: also posted this question on the XNA Creators forums here.
You could try Xperf to see if you can find something that matches your periodic problem. Do not run your application, but keep open the programs that would normally run alongside it. You could also try again with the application running, but that could give a cluttered view.
Start the tracing; do this in an elevated prompt.
xperf -on BASE+LATENCY -stackWalk Profile
Wait for a fair amount of time to be sure that the problem is traced.
Stop the tracing and open the resulting trace like this.
xperf -d trace.etl
xperfview trace.etl
Analyze by looking at the graphs and consulting the summary tables for specific intervals, and see if you can find something that is related to the problem. The highest chance of finding it would be in the DPC and Interrupts sections, but it might as well be something odd in the CPU or I/O sections. Good luck!
There is also more information on Xperf and how to obtain it; hopefully this delivers results.
If not, you can alternatively try GPUView, which has been used for improvements in DWM. It is included alongside Xperf in the Windows Performance Toolkit, so you can easily try both! Using the log.cmd script that ships with GPUView, start verbose logging, reproduce the problem, then stop logging (which merges the traces into merged.etl) and open the result:

log v
... wait for a fair amount of time to be sure that the problem is traced ...
log
gpuview merged.etl
In case gpuview runs out of memory, you can try adding "/limit 3" or removing the v.
Read the documentation of the tools if you are stuck somewhere.
Hmm ... this seems to be occurring on the GPU; however, it sounds like a CPU-side garbage collection issue. Can you run the CLR Profiler and see if there are any spikes in GC activity that you can correlate with the slowdowns?
I agree that it sounds unlikely since you can clearly see it in PIX, but it might offer a clue as to the cause.
If it's only happening on your own machine, then could it be drivers? Forgive me for being skeptical, but it's a 64 bit machine after all :D
This looks like either a vsync issue or a GPU in its last throes. Since going back to a different version fixed it, and the "bottleneck" is in IDirect3DDevice9::Present, let's go with the former option.
I'm not familiar with XNA so I don't know how much of the workings of D3D are exposed, but do you know what your PresentationParameters are set to?
Specifically, try setting the swap effect to Discard.
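For reference, in raw Direct3D 9 terms (which XNA 3.1 sits on top of; XNA's PresentationParameters expose equivalent settings), that suggestion looks roughly like the sketch below. The other field values are illustrative, not a known-good configuration:

#include <d3d9.h>

// Minimal sketch: reset the device with the Discard swap effect
void ResetWithDiscard(IDirect3DDevice9* device)
{
    D3DPRESENT_PARAMETERS pp = {};
    pp.Windowed = TRUE;                    // the app renders into a WinForms control
    pp.SwapEffect = D3DSWAPEFFECT_DISCARD; // the setting suggested above
    pp.BackBufferFormat = D3DFMT_UNKNOWN;  // take the current desktop format in windowed mode
    pp.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE; // also rules vsync in or out as the culprit
    device->Reset(&pp);
}

Trying D3DPRESENT_INTERVAL_IMMEDIATE (or its XNA equivalent, PresentInterval.Immediate) is a quick way to test the vsync theory.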
Foreword
Since this appears to be a bug in the d3d9 emulation on the Windows side, this would probably be best addressed to Microsoft. If you know where I could get into contact with the DirectX team, please tell me.
For the time being, I'm assuming that the only real chance we have is working around the bug.
What
We're investigating an unresponsiveness found in the game Test Drive Unlimited 2.
It happens only when opening the map, and only on an "RTX"-class card (I think the most precise classification we got is GDDR6, because AMD cards also seem affected).
After long debugging, we found out that it's not a simple fault of the game; instead, Present returns D3DERR_DEVICELOST even with the game in windowed mode.
When the game is in fullscreen mode, it properly does the required roundtrip over TestCooperativeLevel and Reset, but after the next frame is rendered, Present has lost the device again, causing the hang.
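For clarity, this is the roundtrip I mean; a minimal sketch of the standard lost-device recovery (the two resource helpers and presentParams are placeholders, not the game's actual code):

#include <d3d9.h>

void ReleaseDefaultPoolResources();   // placeholder: free all D3DPOOL_DEFAULT resources
void RecreateDefaultPoolResources();  // placeholder: recreate them after Reset

void PresentWithRecovery(IDirect3DDevice9* device, D3DPRESENT_PARAMETERS& presentParams)
{
    if (device->Present(NULL, NULL, NULL, NULL) == D3DERR_DEVICELOST) {
        HRESULT hr = device->TestCooperativeLevel();
        if (hr == D3DERR_DEVICENOTRESET) {
            ReleaseDefaultPoolResources();
            if (SUCCEEDED(device->Reset(&presentParams))) // presentParams: the original creation parameters
                RecreateDefaultPoolResources();
        }
        // while hr is D3DERR_DEVICELOST the device cannot be reset yet; try again next frame
    }
}

In fullscreen mode the game gets through this once, but the very next Present reports the device as lost again.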
Now I'm looking for pointers on how to solve this issue. While it's probably some internal state corruption of some sort, it's definitely triggered by an API call that is only made when rendering the map in that game.
We will try to dig into d3d9.dll, but my suspicion is that the error code just comes from some kernel/driver call, where our knowledge and tooling end.
Ideally I'd like to fuzzy-find the draw call by hooking everything and omitting random API calls, but I guess it's just not that easy and would cause a lot more errors under most conditions.
Also note that an apitrace capture we did showed D3D_OK for every single call, including EndScene, up until Present, so it's not as simple as checking the return codes.
Trying to use DirectX 9 in debug/diagnostic mode is also apparently not really possible on Windows 10 anymore, even when installing the SDK from June 2010.
Thanks in advance for any ideas, and maybe addresses to direct this problem to.
I am experiencing the exact same issue as this fellow and can't seem to find any real solution. The only workaround I've found so far is not maximizing VS: keeping VS in windowed mode fixes the problem completely, but as soon as it returns to the maximized state, UI elements no longer redraw unless the stale area is brought under new active focus. This makes it impossible to develop, and I'm forced to return to windowed mode. I've tried disabling hardware acceleration with no luck. Any ideas?
There are a handful of feedback reports at connect.microsoft.com that describe a similar problem, but none that ever evolved into a solution; most typically, the programmer who filed the report stopped responding when asked to provide additional diagnostics.
Clearly this is an environmental problem; something is wrong with your machine. VS2012 uses WPF for rendering, which puts non-trivial demands on the machine. You need a solid video driver and a functional DirectX stack to make it work without problems. This tends to be an issue in general; video drivers are by far the most problematic kind of driver. Problems are roughly correlated with the age of the machine and of the operating system.
So a good starting point is to look for a video driver update first; visit the web site of the company that made your adapter to find the latest. Reinstall DirectX on the machine next. If you were considering a clean update of your machine before, like updating the Windows version, now would be a good time to start with a clean slate.
I'm developing an application in LabVIEW on Windows. Starting a week ago, one test machine (a Toughbook, no less) was freezing up completely once every couple of days: no mouse cursor, taskbar clock frozen. So yesterday it was retired. But just now, I've seen the same thing on another machine, also a laptop.
This is a pretty uncommon failure mode for PCs. I don't know much about Windows, but I'd expect it to indicate that the software stopped running so completely and suddenly that the kernel was unable to panic.
Is this an accurate assessment? Where do I begin to debug this problem? What controls the cursor in the Windows architecture: is it all kernel mode, or is there a window server that might be getting choked by something? Would an unstable third-party hardware driver cause this, rather than a blue screen?
EDIT: I should add that the freezes don't necessarily happen while the code is running.
I'd certainly consider hardware and/or drivers as a possibility - perhaps you could say what hardware is involved?
You could test this by adding a 'debug mode' for each piece of hardware your LabVIEW code talks to, where you would use e.g. a case structure to skip the actual I/O calls and return dummy data to the rest of the application (see the sketch below). Make sure it's a similar amount of data to what the real device returns. You'll find this much easier if you've modularised your code into subVIs with clearly defined functions! If disabling I/O calls to a particular bit of hardware stops the freezes, that would suggest the problem lies with that hardware or its driver.
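LabVIEW is graphical, so there's no direct textual equivalent of a case structure, but the pattern is language-agnostic; here is a minimal C++ sketch of the idea (all names are illustrative):

#include <vector>

std::vector<double> ReadSensorFromHardware(); // the real driver call, defined elsewhere

// 'debug mode' stub: skip the real I/O and return dummy data
std::vector<double> ReadSensor(bool debugMode)
{
    if (debugMode)
        return std::vector<double>(1024, 0.0); // roughly the same amount of data as the real device
    return ReadSensorFromHardware();
}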
Hard to say what the problem is. Based on the symptoms, I would check for a possible memory leak (see if your LabVIEW app's memory usage grows over time using the Windows Task Manager).
We had an arcade/redemption game running on Win98, but the hardware that can run it has finally gone obsolete. The game used a number of scaling effects, some through the 3D path, and played some tricks moving things in and out of video memory. If I were to undertake porting it to run on Windows 7, how much trouble would it likely be? Would it mostly be recompilation, or have the APIs undergone such transformation that I might as well rewrite the device interfaces?
Don't think of it as porting to Win7. Just port to DX9 and let DX do the work with the Win7 parts. In fact, you could probably just leave it as is and it should run, but you mention you do crazy things with video memory that I assume have nothing to do with DX (i.e. either through GDI or some other hack?). Anyway, the DX7, 8 and 9 APIs all have quite drastic differences, but the nice thing is they're all backwards compatible. If the code you have is pure DX7, try compiling against the latest SDK and see if it works on Win7 straight off.
It's been a while since I've written any DirectX 7 code (or DirectX code at all), but if I recall there were some significant API changes even between 7 and 8, let alone 9 or 10, that would make such a change a bit more difficult. Specifically, I think the major change was that the system was refactored after 7 to merge DirectDraw into Direct3D, so that the two were no longer completely separate as they had been in 7. I haven't looked at it since then, but I suspect that, given the number of new coding methods and the like, the API has changed quite a bit, so it is probably going to be a bit of a project to make these changes rather than mostly recompilation like you might have hoped.
You EVER moved things in and out of video memory? shudder
Still ... it's quicker to do that now than when DX7 was around. What exactly were you doing? From your description it's impossible to say how easy it would be. A DX7 app should still run on Windows 7; I can't think of what odd features you may have used that would cause it to break.
Also, converting an application from DX7 to DX9 is not actually all that hard (converting to DX10+ would be a nightmare). They are still relatively similar; the main thing that has changed since those days is the shortening of names like D3DTRANSFORMSTAGESTATE_* or D3DRENDERSTATE_* to D3DTSS_* or D3DRS_*.
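As a quick illustration of that renaming (device7 and device9 stand for existing DX7/DX9 device pointers; the call itself is unchanged):

// DX7: the long enum names
device7->SetRenderState(D3DRENDERSTATE_ZENABLE, TRUE);

// DX9: same call, shortened enum name
device9->SetRenderState(D3DRS_ZENABLE, TRUE);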
Edit: The biggest change I can think of since DX7 is that graphics card manufacturers have dropped support for palettised textures, which "could" break some old apps on modern machines. That really is a very simple fix, though ...
Edit2: Decompressing things from disk into a texture can be a bit of a pain. Your main issue is that you suffer a performance hit when you create the texture. However, if you have a load of textures already created and open, then you can load into the relevant texture as and when you please; you only suffer a lock/unlock hit (see the sketch below). That can be mitigated by loading a resource a few frames in advance. If you do this, though, it will no doubt require multi-threading and calling D3D from multiple threads; if you do that, set the multithreaded flag on the device.
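A minimal sketch of that lock/unlock path, assuming the texture was created up front with D3DUSAGE_DYNAMIC in D3DPOOL_DEFAULT (pixels, rowBytes and height describe the decompressed source image and are illustrative):

#include <d3d9.h>
#include <string.h>

// Update a pre-created texture instead of creating a new one each time
void UploadToExistingTexture(IDirect3DTexture9* texture,
                             const BYTE* pixels, UINT rowBytes, UINT height)
{
    D3DLOCKED_RECT lr;
    if (SUCCEEDED(texture->LockRect(0, &lr, NULL, D3DLOCK_DISCARD))) {
        for (UINT y = 0; y < height; ++y) // copy row by row, respecting the driver's pitch
            memcpy((BYTE*)lr.pBits + y * lr.Pitch, pixels + y * rowBytes, rowBytes);
        texture->UnlockRect(0);
    }
}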
I have a crash dump of an application that is supposedly leaking GDI objects. The app is running on XP, and I have no problems loading the dump into WinDbg to look at it. Previously we have used the Gdikdx.dll extension to look at GDI information, but this extension is not supported on XP or Vista.
Does anyone have any pointers for finding GDI object usage in WinDbg?
Alternatively, I do have access to the failing program (and its stress testing suite), so I can reproduce on a running system if you know of any 'live' debugging tools for XP and Vista (or Windows 2000, though this is not our target).
I've spent the last week working on a GDI leak finder tool. We also perform regular stress testing, and the app never lasted longer than a day's worth without stopping due to USER/GDI object handle overconsumption.
My attempts have been pretty successful as far as I can tell. Of course, I spent some time beforehand looking for an alternative and quicker solution. It is worth mentioning that I had some previous semi-lucky experience with the GDILeaks tool from the MSDN article mentioned above; not to mention that I had to solve a few problems before putting it to work, and this time it just didn't give me what I wanted, or how I wanted it. The downside of their approach is the heavyweight debugger interface (it slows down the researched target by orders of magnitude, which I found unacceptable). Another downside is that it did not work all the time; on some runs I simply could not get it to report/compute anything! Its complexity (judging by the amount of code) was another scare-away factor. I'm not a big fan of GUIs, as it is my belief that I'm more productive with no windows at all ;o). I also found it hard to make it find and use my symbols.
One more tool I used before setting out to write my own was leakbrowser.
Anyways, I finally settled on an iterative approach to achieve the following goals:
minor performance penalties
implementation simplicity
non-invasiveness (used for multiple products)
relying as much as possible on what's already available
I used Detours (non-commercial use) for the core functionality (it is an injectable DLL), put JavaScript to use for automatic code generation (a 15K script to generate 100K of source code; no way I'd code this manually, and no C preprocessor involved!), plus a WinDbg extension for data analysis and snapshot/diff support. A minimal sketch of the hooking part is below.
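To give an idea of that hooking part, here is a minimal sketch (not my actual tool; the names are illustrative, and a real leak finder would record a stack trace per allocation and hook far more of the GDI surface than one function pair):

#include <windows.h>
#include <detours.h>

static HPEN (WINAPI *TrueCreatePen)(int, int, COLORREF) = CreatePen;
static BOOL (WINAPI *TrueDeleteObject)(HGDIOBJ) = DeleteObject;
static LONG g_gdiCount = 0; // net outstanding GDI allocations seen through these hooks

HPEN WINAPI HookedCreatePen(int style, int width, COLORREF color)
{
    InterlockedIncrement(&g_gdiCount); // a real tool would also record a stack trace here
    return TrueCreatePen(style, width, color);
}

BOOL WINAPI HookedDeleteObject(HGDIOBJ h)
{
    InterlockedDecrement(&g_gdiCount); // naive: a real tool would match the handle to its allocation record
    return TrueDeleteObject(h);
}

BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID)
{
    if (reason == DLL_PROCESS_ATTACH) {
        DetourTransactionBegin();
        DetourUpdateThread(GetCurrentThread());
        DetourAttach(&(PVOID&)TrueCreatePen, HookedCreatePen);
        DetourAttach(&(PVOID&)TrueDeleteObject, HookedDeleteObject);
        DetourTransactionCommit();
    }
    return TRUE;
}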
To make a long story short: after I was finished, it was a matter of a few hours to collect information during another stress test, and another hour to analyze the data and fix the leaks.
I'll be more than happy to share my findings.
P.S. I did spend some time trying to improve on the previous work. My intention was to minimize false positives (I've seen just about too many of those while developing), so the tool also checks for allocation/release consistency and avoids taking into account allocations that are never leaked.
Edit: Find the tool here
There was an MSDN Magazine article from several years ago that talked about GDI leaks. It points to several different places with good information.
In WinDbg, you may also try the !poolused command for some information.
Finding resource leaks from a crash dump (post-mortem) can be difficult; if it was always the same place, using the same variable, that leaks the memory, and you're lucky, you could see the last place that it was leaked, etc. It would probably be much easier with a live program running under the debugger.
You can also try using Microsoft Detours, but the license doesn't always work out. It's also a bit more invasive and advanced.
I have created a WinDbg script for that. Look at the answer to
Command to get GDI handle count from a crash dump
To track the allocation stack, you could set a ba (break on access) breakpoint past the last allocated GDICell object, to break just at the point when another GDI allocation happens (see the sketch below). That could be a bit complex because the address changes, but it could be enough to find pretty much any leak.
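A rough sketch of that in WinDbg, using a hypothetical address (you would compute it from the GDI shared handle table); a four-byte write breakpoint that dumps the stack and continues:

ba w4 0x03c10500 "k; g"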