SDL memory leaks and Visual Leak Detector

SDL memory leaks and Visual Leak Detector - visual-studio

Alright, so I think my program might have a memory leak. It's an SDL application, and it seems to have grown too large for me to manually pinpoint the leak. I searched around for a windows equivalent of Valgrind (I'm running Windows 7 x64 and using Visual Studio 2010), and eventually came across Visual Leak Detector. Unfortunately, it doesn't seem to want to generate ay output.
I set up another project, an empty console application, and set up VLD the same way as in my SDL app. Upon running the program, VLD worked perfectly and caught every memory leak that I threw at it. But in the SDL app, it just outputs "Visual Leak Detector Version 2.2 installed." at the beginning of the debug session and nothing else, even when I intentionally created a memory leak right in the main function.
The closest I can tell, it might have to do with SDL screwing with the program entry point. But that's just a guess. Is there any way to get VLD to work with SDL?

You could try to deleaker. It is a powerful tool for debugging memory leaks.

I had a similar problem using SDL library as well. In my case though, I was trying to use the default Memory Leak detection of Visual Studio 2010 because I didn't wanted to use a third party library/application.
Fixing the issue
If after all the required includes, define and function call you still don't see any memory leaks printed out, it might be that your Runtime Library is not set properly.
Double check if you have the debug version of the Runtime Library instead of the non-debug one (/MT and /MD).
Multi-threaded Debug (/MTd)
Multi-threaded Debug DLL (/MDd)
The compiler defines _DEBUG when you specify the /MTd or /Mdd option. These options specify debug versions of the C run-time library. See _DEBUG reference MSDN
Thus, the _DEBUG symbol must be defined in order to enable CRT code.
[...] When _DEBUG is not defined, calls to _CrtSetDbgFlag are removed during preprocessing [...]. See MSDN reference
So building a debug build is not enough to ensure _DEBUG will be defined.
This is something that you usually don't change in a normal project, but following a tutorial for SDL could lead you were I was.
Hopefully, it is going to help someone else, or even you.
More Details below
I was following the MSDN page to enable Memory leak detection out of the box with VS 2010.
After declaring those
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
I enabled them into my code and I inserted a deliberate memory leak
int main( int argc, char* args[] )
{
_CrtSetDbgFlag( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );
_CrtSetReportMode( _CRT_ERROR, _CRTDBG_MODE_DEBUG );
int *pArray = (int*)malloc(sizeof(int) * 24); // Memory not freed
return 0;
}
Nothing was printed out.
So, I looked at the assembly and it was definitively not generating the CRT code at all as you can see:
int main( int argc, char* args[] )
{
012932F0 push ebp
012932F1 mov ebp,esp
012932F3 sub esp,0CCh
012932F9 push ebx
012932FA push esi
012932FB push edi
012932FC lea edi,[ebp-0CCh]
01293302 mov ecx,33h
01293307 mov eax,0CCCCCCCCh
0129330C rep stos dword ptr es:[edi]
_CrtSetDbgFlag( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );
_CrtSetReportMode( _CRT_ERROR, _CRTDBG_MODE_DEBUG ); // Nothing in both case!
int *pArray = (int*)malloc(sizeof(int) * 24);
0129330E mov esi,esp
01293310 push 60h
01293312 call dword ptr [__imp__malloc (129E4CCh)]
01293318 add esp,4
0129331B cmp esi,esp
0129331D call #ILT+580(__RTC_CheckEsp) (1291249h)
01293322 mov dword ptr [pArray],eax
Then, I realized that the _DEBUG symbol was probably not getting defined.

Related

How does Windows 10 task manager detect a virtual machine?

The Windows 10 task manager (taskmgr.exe) knows if it is running on a physical or virtual machine.
If you look in the Performance tab you'll notice that the number of processors label either reads Logical processors: or Virtual processors:.
In addition, if running inside a virtual machine, there is also the label Virtual machine: Yes.
See the following two screen shots:
My question is if there is a documented API call taskmgr is using to make this kind of detection?
I had a very short look at the disassembly and it seems that the detection code is somehow related to GetLogicalProcessorInformationEx and/or IsProcessorFeaturePresent and/or NtQuerySystemInformation.
However, I don't see how (at least not without spending some more hours of analyzing the assembly code).
And: This question is IMO not related to other existing questions like How can I detect if my program is running inside a virtual machine? since I did not see any code trying to compare smbios table strings or cpu vendor strings with existing known strings typical for hypervisors ("qemu", "virtualbox", "vmware"). I'm not ruling out that a lower level API implementation does that but I don't see this kind of code in taskmgr.exe.
Update: I can also rule out that taskmgr.exe is using the CPUID instruction (with EAX=1 and checking the hypervisor bit 31 in ECX) to detect a matrix.
Update: A closer look at the disassembly showed that there is indeed a check for bit 31, just not done that obviously.
I'll answer this question myself below.

I've analyzed the x64 taskmgr.exe from Windows 10 1803 (OS Build 17134.165) by tracing back the writes to the memory location that is consulted at the point where the Virtual machine: Yes label is set.
Responsible for that variable's value is the return code of the function WdcMemoryMonitor::CheckVirtualStatus
Here is the disassembly of the first use of the cpuid instruction in this function:
lea eax, [rdi+1] // results in eax set to 1
cpuid
mov dword ptr [rbp+var_2C], ebx // save CPUID feature bits for later use
test ecx, ecx
jns short loc_7FF61E3892DA // negative value check equals check for bit 31
...
return 1
loc_7FF61E3892DA:
// different feature detection code if hypervisor bit is not set
So taskmgr is not using any hardware strings, mac addresses or some other sophisticated technologies but simply checks if the hypervisor bit (CPUID leaf 0x01 ECX bit 31)) is set.
The result is bogus of course since e.g. adding -hypervisor to qemu's cpu parameter disables the hypervisor cpuid flag which results in task manager not showing Virtual machine: yes anymore.
And finally here is some example code (tested on Windows and Linux) that perfectly mimics Windows task manager's test:
#include <stdio.h>
#ifdef _WIN32
#include <intrin.h>
#else
#include <cpuid.h>
#endif
int isHypervisor(void)
{
#ifdef _WIN32
int cpuinfo[4];
__cpuid(cpuinfo, 1);
if (cpuinfo[2] >> 31 & 1)
return 1;
#else
unsigned int eax, ebx, ecx, edx;
__get_cpuid (1, &eax, &ebx, &ecx, &edx);
if (ecx >> 31 & 1)
return 1;
#endif
return 0;
}
int main(int argc, char **argv)
{
if (isHypervisor())
printf("Virtual machine: yes\n");
else
printf("Virtual machine: no\n"); /* actually "maybe */
return 0;
}

How to find code for crash

I have some 64-bit code that runs in release mode on a server. There's no Visual studio on the server, only on my dev-machine. The program has been written by many authors now (me latest), and some code in it I'm still not familiar with, and its quite big.
The program crashes now and then with a nullpointer. The instruction at 0xwhatever (latest 0x40066c19) referenced memory at 0x00000000 - click on OK to terminate the program. I have all the source and PDB files for the EXE, but when i run it and attach the process, the memory 0x40066c19 is completely out of range. There is only ?? in that area. How do you use the info about "the instruction at ..." ?
The disassembly window displays something like (example) - but as you see there are simply too far from 00000001403CB888 to 0x40066c19
if (LastKickIdle > GetTickCount())
00000001403CB882 call qword ptr [__imp_GetTickCount (0140688310h)]
00000001403CB888 cmp dword ptr [LastKickIdle (0140888DF8h)],eax
00000001403CB88E ja CMainDlg::OnKickIdle+281h (01403CBAB1h)
return 1;
LastKickIdle = GetTickCount() + 500;
00000001403CB894 mov qword ptr [__formal],rbx
00000001403CB89C call qword ptr [__imp_GetTickCount (0140688310h)]
00000001403CB8A2 add eax,1F4h
00000001403CB8A7 mov dword ptr [LastKickIdle (0140888DF8h)],eax

I run into similar situations at work. I created a log class that takes a string arg and writes it to a file with some other useful info. I make entries at the beginning of methods or at places that I think something may be or may later be problematic. With this, I have at least been able to narrow down my search for problems.
Hope this helps.

How to debug high half x86-64 kernel in QtCreator 4.3.0

I am having problems when trying to debug high half x86-64 kernel in Qt Creator 4.3 (based on Qt 5.8). The thing is that any pointer pointing to an address which has bit 63 set (like 0xFFFF800000000000) does not appear on the variable list and can't be used in Expression Evaluator. With previous versions of Qt Creator (based on Qt 5.7 and 5.4) debugging was possible (with some issues, but possible).
I am pretty sure that this is a bug/regression in Qt Creator debug scripts but I don't think I am going to register on their website just to report a bug. I am rather looking for some kind of workaround.
I am using it to debug custom kernel in QEMU under Ubuntu but the problem affects regular Qt based applications as well. Debugging with GDB (version 7.11.1) itself works fine, but it's a pain.
Example:
char *ptr = nullptr; // ptr is visible
ptr = (char *)0xFFFF800000000000; // ptr becomes invisible
ptr = (char *)0xABCD1234; // ptr becomes visible again
ptr = (char *)0x7FFFFFFFFFFFFFFF; // even here ptr is visible
ptr = (char *)0x8000000000000000; // but now it is invisible again

How does gcc know the register size to use in inline assembly?

I have the inline assembly code:
#define read_msr(index, buf) asm volatile ("rdmsr" : "=d"(buf[1]), "=a"(buf[0]) : "c"(index))
The code using this macro:
u32 buf[2];
read_msr(0x173, buf);
I found the disassembly is (using gnu toolchain):
mov eax,0x173
mov ecx,eax
rdmsr
mov DWORD PTR [rbp-0xc],edx
mov DWORD PTR [rbp-0x10],eax
The question is that 0x173 is less than 0xffff, why gcc does not use mov cx, 0x173? Will the gcc analysis the following instruction rdmsr? Will the gcc always know the correct register size?

It depends on the size of the value or variable passed.
If you pass a "short int" it will set "cx" and read the data from "ax" and "dx" (if buf is a short int, too).
For char it would access "cl" and so on.
So "c" refers to the "ecx" register, but this is accessed with "ecx", "cx", or "cl" depending on the size of the access, which I think makes sense.
To test you can try passing (unsigned short)0x173, it should change the code.
There is no analysis of the inline assembly (in fact it is after text substitution direclty copied to the output assembly, including syntax errors). Also there is no default register size, depending on whether you have a 32 or 64 bit target. This would be way to limiting.

I think the answer is because the current default data size is 32-bit. In 64-bit long mode, the default data size is also 32-bit, unless you use "rex.w" prefix.

Intel specifies the RDMSR instruction as using (all of) ECX to determine the model specific register. That being the case, and apparently as specified by your macro, GCC has every reason to load your constant into the full ECX.
So the question about why it doesn't load CX seems completely inappropriate. It looks like GCC is generating the right code.
(You didn't ask why it stages the load of ECX inefficiently by using EAX; I don't know the answer to that).

inline assembly error: can't find a register in class 'GENERAL_REGS' while reloading 'asm'

I have an inline AT&T style assembly block, which works with XMM registers and there are no problems in Release configuration of my XCode project, however I've stumbled upon this strange error (which is supposedly a GCC bug) in Debug configuration... Can I fix it somehow? There is nothing special in assembly code, but I am using a lot of memory constraints (12 constraints), can this cause this problem?

Not a complete answer, sorry, but the comments section is too short for this ...
Can you post a sample asm("..." :::) line that demonstrates the problem ?
The use of XMM registers is not the issue, the error message indicates that GCC wanted to create code like, say:
movdqa (%rax),%xmm0
i.e. memory loads/stores through pointers held in general registers, and you specified more memory locations than available general-purpose regs (it's probably 12 in debug mode because because RBP, RSP are used for frame/stackpointer and likely RBX for the global offset table and RAX reserved for returns) without realizing register re-use potential.
You might be able to eek things out by doing something like:
void *all_mem_args_tbl[16] = { memarg1, memarg2, ... };
void *trashme;
asm ("movq (%0), %1\n\t"
"movdqa (%1), %xmm0\n\t"
"movq 8(%0), %1\n\t"
"movdqa (%1), %xmm1\n\t"
...
: "r"all_mem_args_tbl : "r"(trashme) : ...);
i.e. put all the mem locations into a table that you pass as operand, and then manage the actual general-purpose register use on your own. It might be two pointer accesses through the indirection table, but whether that makes a difference is hard to say without knowing your complete assembler code piece.

The Debug configuration uses -O0 by default. Since this flag disables optimisations, the compiler is probably not being able to allocate registers given the constraints specified by your inline assembly code, resulting in register starvation.
One solution is to specify a different optimisation level, e.g. -Os, which is the one used by default in the Release configuration.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio