Catch application error on restart or shutdown? - debugging

Occasionally when I restart Win10 x64 I get an application error from my system tray utility in the from of:
AppName in Title bar
The instruction at 0xXXXXXXXXXXXXXX referenced memory at 0xXXXXXXXXXX.
The memory could not be read.
over the top of the This app is preventing you from restarting.
But there are no .dmp files created. I have Windows set to create the dump files and in the normal course of things it does, just not this particular one on a restart/shutdown.
I also can't find anything in the Event Viewer to at least get the base address so I can do some basic math to find the offset to the instruction.
Any tips on how I can get a .dmp created for when this happens or some type of information to help diagnose the issue on reboots?
There are occasions other apps (not mine) do the same thing on a restart so not sure if this is normal? Doesn't see like it should be.

Related

Diagnosing Win32 RegisterClass leak

We are trying to troubleshoot a nasty problem on a production server where the server will start misbehaving after running for awhile.
Diagnostics have led us to believe there may be a bug in a DLL that is used by one of the processes running on this server that is resulting in a global atom leak. The assumed vector is a process that is calling RegisterClass without a corresponding UnregisterClass (and the class name is using a random number as part of the name, so it's a different class name each time the process starts).
This article provided some information: https://blogs.msdn.microsoft.com/ntdebugging/2012/01/31/identifying-global-atom-table-leaks/
But we are reluctant to attempt kernel mode debugging on a production server, so we have tried installing windbg and using the !gatom command to list atoms for a given session.
I use windbg to attach to a process in one of the sessions (these processes are running as Windows Services if that matters), then invoke the !gatom command. The returned atom list doesn't have any window classes in it.
Then I read this: https://blogs.msdn.microsoft.com/oldnewthing/20150429-00/?p=44984
and it sounds like there is a separate atom table for windows classes. And no way to query it. I was hoping that we'd be able to actually see how many windows class atoms have been registered, and see if that list gets bigger over time, indicating a leak.
The documentation on !gatom is sparse, and I'm hoping I can get some expert confirmation or recommendations on how to proceed.
Does anyone have any ideas on how we can get at the list of registered Windows classes on a production server?
More detail about what happens when the server starts to misbehave:
We run many instances (>50) of the same application as separately registered services running from isolated executables and DLLs - so each of those 50 instances has their own private executables and DLLs.
During their normal run, the processes unload and reload a DLL (about every hour). There is a windows class used that's part of a "session handle" used by the DLL (the session handle is part of the registered windows class name), and that session handle is unique each time the DLL is loaded. So every hour, there is an additional Window class registration, made by a DLL (our service stays running).
After some period of time, the system will get into a state where further attempts to load the DLL in question fail. This may happen for one of the services, then gradually over time, other services will start to have the same problem.
When this happens, restarting the service does not fix the problem. The only way that we've found to get things running properly again is to reboot the server.
We are monitoring memory commit load, and we are well within the virtual memory of the server. We are even within the physical memory size.
I just did a code review the vendor of the DLL, and it looks like they are not actually calling RegisterClass from the DLL itself (they only make one RegisterClass call from the DLL, and it's a static string - not a different class name for each session). The DLL launches an EXE, and that EXE is the one that registers the session specific class name. Their EXE does call UnregisterClass (and even if it didn't, the EXE is terminated when we unload their DLL, so it seems that this may not be what is going on).
I am now out of bullets on this one. The behavior seems like some sort of resource leak or pool exhaustion. The next time this happens, I will try connecting to the failing process with windbg and see what the application atom pool looks like - but I'm not hopeful that is going to shed any light.
Update: The excellent AtomTableMonitor tool has narrowed the problem to rogue RegisterWindowMessage. I'm going to ask a more specific question focused on this exact issue: Diagnosing RegisterWindowsMessage leak
You may try using this standalone global atom monitor
The application appears to have capabilities to monitor atoms in services
that run in a different session
btw if you have narrowed it to RegisterWindowMessage
then spy++ can log the Registered messages system wide along with thread and process
spy++ (i am using it from vs2015 community)
ctrl+m select all windows in system
in the messages tab clear all and select registered
and start logging
you can also save the log (it is plain text in-spite of strange extension )
powershell -c "gc spy++.sxl -Tail 3"
<000152> 001F01A4 P message:0xC1B2 [Registered:"nsAppShell:EventID"] wParam:00000000 lParam:06EDFCE0 time:4:2
7:49.584 point:(408, 221)
<000153> 001F01A4 P message:0xC1B2 [Registered:"nsAppShell:EventID"] wParam:00000000 lParam:06EDFCE0 time:4:2
7:49.600 point:(408, 221)
<000154> 001F01A4 P message:0xC1B2 [Registered:"nsAppShell:EventID"] wParam:00000000 lParam:06EDFCE0 time:4:2
7:49.600 point:(408, 221)

foxpro program freezes while it is running

I am using foxpro apps under Windows 7. During the compilation one of my program it suddenly became freezing until I move the mouse or press any key. And this happens all the time while I am working with prog.
This happens when I move only the data to a mapped directory on the host. If my application, foxpro and the data are in the same directory on the virtual machine there are no problem with it.
This happens when my data is not on the virtual machine.
Can it be a caching issue?
Change the registry:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\WOW]
Save export as backup. Then change value "DefaultSeparateVDM" to "yes"
If you have 64bit then you need to create a file for 16bit applications that use the internal start command in separate memory space as follows:
Start /separate command
Also have a look at this article. http://www.reddit.com/r/Database/comments/2kz0x5/dbf_file_getting_corrupt/
There kind of similar problem, who knows, maybe it would be helpful for you
I have run into similar in the past, and it doesn't have to do with VFP application and data residing in the same folder. What has happened to me is the debugger. You mention "...while I am working with the prog." tells me you are in the VFP development mode, and not run-time from the app itself. If you have issues where your debugger has breakpoints, or other flags that have become corrupted somehow, I have done
CLEAR DEBUGGER
But that is something going back several years and MIGHT be what you are encountering.

Troubleshooting VB6 App Crash after XP to Win7 Upgrade

I have a VB6 application that I provide support for. This application works on both Windows XP and Windows 7. Some users were migrated from Windows XP to Windows 7 using the User State Migration tool. These users now receive a generic "Application has crashed" Windows error message when they open certain screens (forms) in the application. My assumption is that there is a missing dll/ocx reference, but I'm having trouble tracking it down.
I've tried many/varied troubleshooting techniques:
Full uninstall and reinstall of my application
Manually re-registering all dll's and ocx's that I know are used
Running Process Monitor on a broken computer and a working computer to compare what dll's and ocx's are accessed. The answer might be here but even after filtering out most of the background noise the amount of data is overwhelming. At a minimum I reviewed all of the calls right before it crashes and all of the calls that were not successful. All of the non-successful calls match between working and non-working.
Installed the Windows Debugger Tools and captured a crash dump. Analyzed the crash dump with DebugDiag. DebugDiag says the exception is in msvbvm60.dll. I tried building a PDB file for my exe and loading it in DebugDiag to get more detail about where the exception is occuring but DebugDiag doesn't want to accept the PDB (might be doing something wrong here, but it just seems to ignore it. This same PDB file works fine when I do remote debugging, however.)
I recompiled my VB6 program without any optimizations in PCode. I've read online that sometimes building in PCode, while bad for performance, will tell you the real exception.
Used the above created PDB file to remote debug the VB6 application. The debugger says that the application crashes after the new window has been created, on a line that sets MousePointer = vbHourGlass... To me it seems unlikely that this is the real cause of the error. There are at least 20 other locations in the program where this same line is called and all work fine.
(Forgot about this one)
Used Dependency Walker and profiled the application on both a working and non-working computer. All errors found by dependency walker were the same between the two computers. There were no additional dependencies found on a working computer, and all missing dependencies on the non-working computer were also missing on the working one.
None of these actions changed my error message or showed me what the error is (unless it really is the mouse cursor issue)... There are no entries in the Windows Event Log related to the app crash.
The non-working and working computers all have the same base Windows 7 image, the only difference is whatever is being changed by USMT, which further convinces me that this is some kind of quirky configuration change or a missing dll/ocx or perhaps an unregistered dll/ocx.
Any ideas or thoughts on how I can track down the root cause of the issue would be greatly appreciated.
Update 1 - Response to questions
#MarkHall I have tried running it as admin, though not with UAC off. The application runs fine on a Windows 7 box as a non-admin with full UAC. Windows XP was 32-bit, Windows 7 is 64-bit, but again it works just fine on a like for like box where the user was not migrated from Windows XP.
#Beaner It's possible that it stores settings somewhere that have been corrupted, but the remote debugging leads me to think that it's more likely something else since it seems to die on a step related to the UI, which then makes me think it's probably a missing dll/ocx reference.
#Bob77 The application is installed into Program Files (x86). While many of the libraries do reside in the same folder, they are all registered.
Peter, often I've noticed that the debugger will indicate a line of code that is actually incorrect, depending on WHERE in the actual assembly language the fault occurs. You should look REAL close around your statement that sets the cursor to vbHourGlass. Your exception is PROBABLY happening BEFORE that line of code, but that line is what the debugger thinks is the actual faulted line of code.
Since you said it happens when a window OPENS, I'd look real close at any ocx's you may have referenced on the form, but perhaps NOT actually being used, or called. You might have one there that you don't intend to be there, that could be causing security issues, or something on Win7? Edit the .frm file by hand if you have to, and look at all the GUIDs the form references.
It is possible that one machine is using PER-USER registration, and the other is using PER-SYSTEM registration?? I don't know...
I would take a much closer look at the form that you are trying to open, and be VERY cautious of everything you are doing in the form load events, and so on. This sounds like it could be something as stupid as Windows Aero being enabled on one system, and not another, or some other sort of "Theme" setting that is throwing the VB Form Rendering routine into a hissyfit... Perhaps even something as stupid as a transparent color index in the icon you selected for that from?
If you are still developing this app, (or at least maintaining it), create an entirely NEW form, and re-create all the controls, etc, on the form (resist the temptation to copy/paste them from the old one...), and then see if THAT does the trick. Then, copy all the event code to the new form one event at a time, with at LEAST enough event code to make the form function, even if it's just a "dead form", that loads no data, or whatever the form is supposed to do. Check and debug after each change, and you WILL find it eventually. Of course, make sure you isolate one of the defunct systems to have a platform that you can duplicate the issue on, or then it's just guessing. I find that using something like Acronis w/ Universal Restore is a great option to then take the image file into a good HV, like VirtualBox, and then restore that image as a VM, so you can debug without interfering with your actual users. This sounds like a lot of work, but then again, so is re-writing an application that already exists, right? :)
Failing THAT... /* and */ are your friends!! (Well, we're dealing with VB, so ' would be your best friend! heh... But I'd start commenting out all the code on the form until that sucker opens. Then once it opens, start putting one line back at a time, and re-running it... That's called "VooDoo Debugging", but sometimes, you gotta do what you gotta do...
THANKS A LOT PETER! :) Now you got ME so involved in this, I feel like I'M the one debugging this sucker! Like if it was MY code I was trying to fix! :)
Let me know if any of this helps... I am actually quite interested in what you discover.

how to get memory dump after blue screen

I'm getting a lovely BSOD on bootup (STOP: 0x0000007E) from a driver I'm writing, and would like to load up the memory dump for analysis. However, it's not getting dumped anywhere. Everything is setup correctly in the Startup and Recovery settings, but I get no dump file, and nothing in the event log stating a dump has taken place. It looks like a dump is not even occurring...
I know the exact line of code causing it (a call to IoAttachDevice()), but am not sure why, and would like to view the DbgPrint output to see where exactly it's failing. Could Windows possibly be crashing before the dumping functionality is set up? If so, how do I get access to the state of the machine when the failure occurs?
UPDATE: Other possibly useful information: I'm running Windows XP through VirtualBox on a Linux host.
I don't know why you're not getting a dump file, but if you have ready access to the machine, attach a kernel debugger to it an repro the error - you'll be left with the machine sitting in the debugger, ready to go (you can have the debugger produce the dumpfile for you if you want to debug offline as well).
Right-click on "my computer" select "Advanced", under "startup and recovery" click "settings". select "kernel memory dump" or "complete memory dump".
What's the start setting of your driver? If it starts too early in the boot order, the filesystem might not be remounted read-write yet, and therefore there's no place for a dump to go.
Drivers under development shouldn't generally be set to auto-start until you've gotten the driver stable when loaded later. Of course you eventually need to set it to auto-start so you can verify it works correctly, but that comes later.

What is the "Cannot set allocations" error, who emits it and what can I do about it?

We've been plagued for several years by occasional reports from customers about a non-descript error message "Cannot set allocations" that appears on startup of our app. We have never been able to reproduce the problem in our own test environments so far. I have now run out of ideas for attempting to track this down. Here's a collection of observations that have accumulated over time:
Error message text reads "Cannot set allocations" (note absence of punctuation).
The window title simply reads "Error" (or the localized equivalent).
The "Cannot set allocations" text is always in English regardless of OS locale.
I have so far not been able to locate the DLL or EXE containing the message text.
Google is chock full of reports of this error for a variety of different products - but no solutions.
The only unifying aspect between the affected products I could make out so far was that they all appear to come in the form of DLLs that load into third-party processes (such as addins for Visual Studio or Windows Explorer shell extensions).
Our app is actually a shareware COM-addin for MS Outlook, written in Delphi (i.e. native code - no .NET).
The prime suspect in our case is the third-party licensing wrapper that we're using which decrypts and uncompresses our DLL into memory on the fly. Obviously I couldn't simply give an unprotected version of our app to the affected customers to verify this suspicion. Maybe the other vendors that this has been reported against are using similar products.
Debug versions of the protection wrapper supplied to us by the licensing vendor yielded no results: The log files looked exactly the same as from sessions where the error did not occur. Apparently the "inner" DLL gets decrypted and uncompressed all right but for some reason still fails to get loaded by the host process.
By creating an unprotected "loader" DLL we have been able to pinpoint the occurrence of the error somewhere behind the LoadLibrary call that is supposed to load our DLL into memory.
Extensive logging and global exception hooks in our own code (both the unprotected loader and the protected "core"-DLL) yielded no results at all. The error is obviously raised somewhere else.
The problem described in this earlier question of mine was very probably prompted by the same issue. This was before we created the unprotected loader stub.
The error only occurs at about 1-2% of our customers - whereas typically all installations at any affected customer's site are affected the same.
Sometimes the error goes away after we release a new version but often it will come back again after a couple of weeks or months.
Once the error has started to occur on a machine it does so consistently.
The error never occurs while connected to the affected machine via remote access (e.g. VNC, RDP, TeamViewer, etc.) and none of the affected customers are within travel distance from us so all we have to go by is log files and "eye-witness reports".
One customer reported that the error message dialog apparently was non-modal, i.e. he was able to simply move the dialog box to the side and continue working with the application (minus the functionality that our DLL would have provided). Not sure whether this is universally true in all other occurrences, too.
In some cases customers have been able to permanently rid themselves of the error by disabling or uninstalling other addins from other vendors that were sharing the host application with our own product.
The error has so far been observed on Windows XP, Vista and 7.
During the last few weeks we had a surge of reports from Outlook 2003 / Windows 7 users. Could the situation have been made worse by a recent Windows/Office-update?
Does anyone have any experience with this error at all?
Or any more ideas for investigating this?
I have only recently had this happen, which prompted my search and I ended up here. I can tell you that with me for sure it is in windows 7 home premium BUT ONLY WITH IE9 (which I hate by the way) it reduces the user back to the dummy stage and assumes that we have to be repeated flagged about everything.It will keep asking you if you want to disable add ons to speed up load times but usually on things that aren't really the things slowing the browser down in the first place,it is there is too much garbage loading in the first place.But back to the "Cannot set allocations", I for one have never expirianced it in any other browser which is not to say it doesn't happen with them.
This is going to be pure guess-work, but it sounds like maybe your third-party licensing software is trying to load your DLL at a particular location in memory, which - on these failing systems - happens to already be occupied by something else, perhaps a global hook DLL.
If you have a customer who is willing to work with you a bit, it might shed some light on the situation to get a crash dump (e.g., with ADPlus or maybe simpler with Sysinternals' ProcDump) when the error message is showing. That would show what modules are loaded and possibly the callstack (if it is from a message box at the time of the error as opposed to one that is catching an exception after the problem).
I also have experienced the "Cannot set allocations" issue. Royal pain. I had disabled Java, since I did not seem to need it, I used add/remove programs to remove Java from my system. Then I stated to get those errors. I have reinstalled, but disabled Java in IE explorer. Now I do not get the error anymore. Not a programmer, don't know why this happened. Maybe a clue for someone.
Win 7 - 64bit OS IE Explorer 10. Hope this helps someone figure this out. John
I've seen this happen. In my case a global hook dll created a large memory file mapping, perhaps to the memory the licensing dll was counting on.
I see "Cannot set Allocations" when I open google chrome only. Also after that, chrome closes with a msg saying "Whoa chrome has crashed..."
Still no solution :(
Also not a programmer, but it always happens when I open Chrome. It opens second window with error message 'cannot set allocation'. I usually close it and go on my way. if i don't it usually causes a crash. doesnt happen on any other browsers.

Resources