Visual Studio 2013 C++ compiler preventing me from accessing ebp - visual-studio

I'm screwing around with the Visual Studio 2013 C++ compiler (what I'm doing is not really important or interesting at all) and I'm running across some very odd behavior. Right now I have the code:
void fun(void)
{
int *ebp = (int *)(&ebp + 1);
}
Which should give me a pointer to ebp using Visual Studio's stack semantics. NOTE: I have set Basic Runtime Checks to "Default" and I have disabled security checks. When I look at the memory when debugging in Visual Studio I see:
0x009BFBFC 41 03 81 51 fe ff ff ff 44 fc 9b 00
Where 0x009BFBFC is the address of ebp. Note that there is a random -2 in the location immediately preceding ebp (0xfffffffe). Also note that the saved ebp is right after this -2 (0x009bfc44). "Okay" I say, "I'll just add 2 instead!" I now have this code:
void fun(void)
{
int *ebp = (int *)(&ebp + 2);
}
And when I run it and look at the memory, this time I see:
0x0032FCD8 fe ff ff ff 1c fd 32 00 5e 3c 39 00
Again, 0x0032FCD8 is the address of ebp. What madness is this! The random extra space is now gone giving me a pointer to the return address instead!
Is this deliberate? I can't see any reason why the Visual Studio compiler would intentionally prevent me from accessing the base pointer from code, but then again I can't see why the compiler would behave so oddly when I change a two to a one. For those curious, I looked at the disassembly and the first example does allocate 4 more bytes than the second for no apparent reason (it's not used anywhere). If anyone has any insight, that would be awesome; this is kind of irritating that it would do this.

Related

What is the real address of `%fs:0xfffffffffffffff8`?

I want to trace the goid of go programs using ebpf.
After reading some posts and blogs, I know that %fs:0xfffffffffffffff8 points to the g struct of go, and that the mov %fs:0xfffffffffffffff8,%rcx instruction always appears at the start of a go function.
Taking main.main as an example:
func main() {
  458330: 64 48 8b 0c 25 f8 ff    mov %fs:0xfffffffffffffff8,%rcx
  458337: ff ff
  458339: 48 3b 61 10             cmp 0x10(%rcx),%rsp
  45833d: 76 1a                   jbe 458359 <main.main+0x29>
  45833f: 48 83 ec 08             sub $0x8,%rsp
  458343: 48 89 2c 24             mov %rbp,(%rsp)
  458347: 48 8d 2c 24             lea (%rsp),%rbp
    myFunc()
  45834b: e8 10 00 00 00          callq 458360 <main.myFunc>
}
I also know the goid information is stored in the g struct of go. The value of fs register can be obtained via the ctx argument of ebpf function.
But I don't know what the real address of %fs:0xfffffffffffffff8 is, because I am new to assembly language. Could anyone give me some hints?
If the value of fs register were 0x88, what is the value of %fs:0xfffffffffffffff8?
That's a negative number, so it's one qword before the FS base. You need the FS base address, which is not the selector value in the FS segment register that you could see with a debugger.
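As a rough illustration of the arithmetic described above, here is a minimal Python sketch. It assumes you have already obtained the FS base somehow; the base value used here is made up purely for illustration, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value

def effective_address(fs_base, displacement):
    """Linear address = FS base + sign-extended displacement, modulo 2**64."""
    return (fs_base + to_signed64(displacement)) & MASK64

disp = 0xfffffffffffffff8
print(to_signed64(disp))  # -8, i.e. one qword before the FS base

# Hypothetical FS base, NOT the selector value held in the FS register.
hypothetical_fs_base = 0x7f1234567000
print(hex(effective_address(hypothetical_fs_base, disp)))  # 0x7f1234566ff8
```

The point of the sketch is only that the displacement is -8 relative to the base; the selector value (such as 0x88) does not enter the calculation at all.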
Your process probably made a system call to ask the OS to set it, or possibly used the wrfsbase instruction at some point on systems that support it.
Note that at least outside of Go, Linux typically uses FS for thread-local storage.
(I'm not sure what the standard way to actually find the FS base is; it's obviously OS dependent to do that in user-space where rdmsr isn't available; FS and GS base are exposed as MSRs, so OSes use that instead of actually modifying a GDT or LDT entry. rdfsbase needs to be enabled by the kernel setting a bit in CR4 on CPUs that support the FSGSBASE ISA extension, so you can't count on that working.)
@MargaretBloom suggests that user-space could trigger an invalid page fault; most OSes report the faulting virtual address back to user-space. In Linux for example, SIGSEGV has the address. (Or SIGBUS if it was non-canonical, IIRC. i.e. not in the low or high 47 bits of virtual address space, but in the "hole" where the address isn't the sign-extension of the low 48.)
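To make the "canonical address" condition concrete, here is a small Python sketch. It assumes 48-bit virtual addresses (4-level paging); the function name and the va_bits parameter are illustrative, not part of any OS API.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of addr are all zeros or all ones,
    i.e. addr is the sign-extension of its low va_bits bits."""
    top = (addr & ((1 << 64) - 1)) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

for a in (0x00007fffffffffff, 0x0000800000000000,
          0xffff7fffffffffff, 0xffff800000000000):
    print(hex(a), is_canonical(a))
```

The first and last addresses sit at the edges of the low and high halves and are canonical; the middle two fall in the hole.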
So you'd want to install signal handlers for those signals and try a load from an offset that (with 0 base) would be in the middle of kernel space, or something like that. If for some reason that doesn't fault, increment the virtual address by 1TiB or something in a loop. Normally no MMIO is mapped into user-space's virtual address space so there are no side effects for merely reading.

Why do these `const int main=0xc3` (or other number) programs return 252 on OS X?

I heard about the "shortest C program that results in an illegal instruction": const main=6; for x86-64 over on codegolf.SE and it got me curious what would happen if I put different numbers there.
Now I guess this has to do with what is or isn't a valid x86-64 instruction (durr) but specifically I'd like to know what the different results mean.
const main=0 through 2 give bus error.
const main=3 gives a segfault.
6 and 7 give illegal instruction.
I get various bus errors and segfaults and illegal instructions up until
const main=194 which didn't give me an interrupt at all (at least not that got through to my python script that was generating these little programs).
There are a few other numbers that also do not lead to exceptions/interrupts and thus to Unix signals. I checked the return code of a couple and the return code was 252. I don't know why or what that means or how it got there.
204 got me a "trace trap". This is 0xcc which I know is the int3 interrupt - that's fun! (241/0xf1 also gets me this)
Anyway, it keeps going and it's obviously mostly bus errors and segfaults and a few illegal instructions here and there and the occasional... does whatever it does and then returns with 252...
I googled around some opcodes but I don't really know what I am doing or where to look to be honest. I haven't even looked at all my outputs yet just been scrolling through. I understand that a segfault is invalid access to valid memory and a bus error is access to invalid memory and I plan to look at the patterns of the numbers and work out where these are happening and why. But the 252 thing has me a bit stumped.
#!/usr/bin/env python3
import os
import subprocess
import time
import signal
os.mkdir("testc")
try:
    os.chdir("testc")
except:
    print("Could not change directory, exiting.")
for i in range(0, 65536):
    filename = "test" + str(i) + ".c"
    f = open(filename, "w")
    f.write("const main=" + str(i) + ";")
    f.close()
    outname = "test" + str(i)
    subprocess.Popen(["gcc", filename, "-o", outname], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    time.sleep(1)
    err = subprocess.Popen("./" + outname, shell=True)
    result = None
    while result is None:
        result = err.poll()
    r = result
    if result == -11:
        r = "segfault"
    if result == -10:
        r = "bus error"
    if result == -4:
        r = "illegal instruction"
    if result == -5:
        r = "trap"
    print("const main=" + str(hex(i)) + " : " + r)
This produces a C program in testc/test20.c like
const int main=20;
Then compiles it with gcc and runs it. (And sleeps for 1 second before trying the next number.)
There were no expectations. I just wanted to see what happened.
int main = 194 is c2 00 00 00, which decodes as ret 0
Whatever called main must have left 252 in the low byte of RAX. (The calling convention says that RAX is the return-value register, but it's not an arg-passing register so on function entry it holds whatever tmp garbage your caller was using it for.)
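If you want to see which bytes a given value of main puts in memory, a short Python sketch like the following can help. It assumes a 4-byte little-endian int, as on x86-64; the helper name main_bytes is made up for this illustration.

```python
import struct

def main_bytes(value):
    """Bytes stored for `const int main = value;` on a little-endian
    machine with a 4-byte int."""
    return struct.pack("<i", value)

for value in (3, 6, 194, 195, 204):
    raw = main_bytes(value)
    print(value, " ".join(f"{b:02x}" for b in raw))
```

For 194 this prints c2 00 00 00, matching the ret 0 decoding described above. Feeding the bytes to a disassembler is still the reliable way to see what they decode to.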
See the bottom of the answer for a theory on why you get SIGBUS for 2 but SIGSEGV for 3: I think RAX is a valid pointer on entry to main (by chance of what the dynamic linker had there), 03 00 add eax, [rax] destroys it but 02 00 add al, [rax] doesn't, and then execution either faults on the 00 00 add [rax], al from the next 2 bytes of main, or runs the 00 00 instruction and then falls off the end of a page.
Update from @MichaelPetch: RAX is pointing to main (in the read-only TEXT segment), and stores to read-only pages also SIGBUS. So 00 00 add [rax], al will SIGBUS for that reason if RAX is still pointing there.
(Beware that this answer has some wrong guesses and wasn't fully rewritten every time I got new info from @SWilliams or @MichaelPetch. The bullet points about what kinds of #PF cause which signal are up to date, and I've tried to at least add a correction after things that weren't quite accurate. I think there's some value to the wrong theories, as an illustration of other kinds of things that might have happened, so I'm leaving it all in here.)
Your Python program fails on my Linux machine once it gets to c2 00 00 00 ret imm16, the first one that returns successfully. (On Linux, the .rodata section ends up after .text in the TEXT segment, so there's nothing for main to fall into.)
...
const main=0xc0 : segfault
const main=0xc1 : segfault
Traceback (most recent call last):
File "./opcode-test.py", line 34, in <module>
print("const main=" + str(hex(i)) + " : " + r)
TypeError: must be str, not int
Doesn't python have an equivalent of strsignal(3) to map signals to standard text strings like "Illegal instruction"? (Like strerror but for signal codes instead of errno values?)
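For what it's worth, Python's standard signal module does seem to cover this: signal.Signals maps numbers to names, and signal.strsignal (added in Python 3.8) returns the strsignal(3)-style text. A minimal sketch of how the script's reporting could be done that way; the helper name describe_exit is made up for this example:

```python
import signal

# signal.strsignal only exists on Python 3.8+; fall back to "no text".
_strsignal = getattr(signal, "strsignal", lambda signum: None)

def describe_exit(returncode):
    """Turn a subprocess return code into a printable string.
    Negative values mean the child was killed by signal -returncode."""
    if returncode >= 0:
        return f"exited with status {returncode}"
    signum = -returncode
    try:
        name = signal.Signals(signum).name
        text = _strsignal(signum)
    except ValueError:
        return f"killed by unknown signal {signum}"
    return f"{name} ({text})" if text else name

print(describe_exit(252))
print(describe_exit(-int(signal.SIGSEGV)))
print(describe_exit(-int(signal.SIGILL)))
```

Because the result is always a str, it can be concatenated in the final print without the TypeError above, and the signal numbers no longer need to be hard-coded per OS.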
Most x86 instructions are multiple bytes long. x86 is little-endian, so you're mostly looking at
?? 00 00 00 90 90 90 ... or for larger integers ?? ?? 00 00 90 90 90 90 ..., assuming your linker fills bytes between functions with 0x90 nop like GNU ld on Linux does.
These byte sequences might decode to one or more valid instructions before you hit the NOPs and fall through to whatever CRT function the linker puts after main. If you get there without faulting, and without offsetting the stack pointer, you've entered the function with a valid return address on the stack (main's caller, another CRT function) exactly like if main tail-called it.
Presumably that function returns 252 (or some wider value whose low byte is 252). Returning from main leads to clean process exit, making an exit system call with main's return value.
This fall-through tailcall is like if main ended with return next_function(argc, argv);.
Correction (without rewriting the whole answer, sorry)
Since main=194 is the first one that worked, I think you're not actually getting fall-through, probably only C2 ret imm16 and C3 ret are leading to a clean exit. And for c2, it has to be followed by 2 00 bytes, or else it'll break the stack for main's caller.
Or those instructions with a prefix that doesn't do anything, or a harmless one-byte instruction. e.g. 90 nop / c3 ret or 90 nop / c2 00 00 ret 0. Or 91 xchg eax, ecx, etc. could actually give you a different return value, swapping EAX with another register. (x86 dedicates opcodes 90 .. 97 to xchg-with-EAX, because on original 8086 AX was more "special", without instructions like movsx to sign-extend into other registers. And without 2 operand imul.)
Other harmless one-byte instructions include 99 cdq and 98 cwde, but not push or pop (because changing RSP would make it not point at the return address). Some one-byte flag set/clear instructions are f9 stc, fd std, but not fb sti (that's privileged, unlike the carry flag and direction flag).
Harmless prefixes are 0x40..4f REX prefixes, 0xf2 / 0xf3 REP, and 0x66 and 0x67 operand-size and address-size. Any segment-override prefixes might also be harmless.
I just tested main=0xc366 and main=0xc367 and yes they both exit cleanly. GDB decodes 66 c3 as retw (operand-size prefix) and 67 c3 as addr32 ret (address size prefix), but both still pop a 64-bit return address, and don't truncate the stack pointer either. (I took out the -no-pie I'd been using, so RIP was outside the low 32 bits along with RSP).
Note that 00 is the opcode for add [r/m8], r8, so 00 00 decodes as add [rax], al.
To get past those 00 bytes and get to the "nop sled" the linker inserts as padding, you need the opcode (and modrm byte if the opcode uses one) to encode the start of a longer instruction, like 0xb8 mov eax, imm32 which is 5 bytes long, and consumes the next 4 bytes after the 0xb8. In fact there are short-form mov-immediate encodings for every register, so 0xb8 + 0..7 will all get you past the gap. Except for mov esp, imm32, which will lead to a crash once you get to the next function because it stepped on the stack pointer.
One of the early ones is 05, the short-form (no modrm) opcode for add eax, imm32. Most original-8086 ALU instructions have a special AX,imm16 / EAX,imm32 short form, instead of the op r/m32, imm32 or imm8 form that uses a ModRM byte to encode the destination operand. (And the bits of the /r field in ModRM as extra opcode bits.)
See Tips for golfing in x86/x64 machine code for more about AL / EAX / RAX short form encodings, and one byte instructions.
For manually decoding x86 machine code, see Intel's manuals, especially the vol.2 manual which details the instruction encoding formats, and has an opcode table at the end. (See links in the x86 tag wiki). For just an opcode map, see http://ref.x86asm.net/coder64.html.
Use a disassembler or debugger to see what's in your executables
But really, use a disassembler like objdump -drwC -Mintel. Or llvm-objdump. Find main in the output, and look at what you get. (Or use GDB, because labels in the middle of an instruction throw off the disassembler.)
Use objdump -rwC -Mintel -D -j .rodata -j .text testc/test194 to get output like this, disassembling the .text and .rodata sections as code:
testc/test194: file format elf64-x86-64
Disassembly of section .text:
0000000000400540 <__libc_csu_init>:
400540: 41 57 push r15
400542: 49 89 d7 mov r15,rdx
...
4005a4: c3 ret
4005a5: 90 nop
4005a6: 66 2e 0f 1f 84 00 00 00 00 00 nop WORD PTR cs:[rax+rax*1+0x0]
00000000004005b0 <__libc_csu_fini>:
4005b0: c3 ret
Disassembly of section .rodata:
00000000004005c0 <_IO_stdin_used>: ;;;; This is actually data!
4005c0: 01 00 add DWORD PTR [rax],eax
4005c2: 02 00 add al,BYTE PTR [rax]
00000000004005c4 <main>:
4005c4: c2 00 00 ret 0x0
... ; objdump elided the last 0, not me. It literally put ...
(I modified your python script to add the -no-pie gcc option, which is why my disassembly has absolute addresses, instead of just small addresses relative to the start of the file = 0. I wondered if that might put main somewhere it could fall through, but it didn't.)
Notice there's only a small gap between .text and .rodata. They're part of the same ELF segment (in the ELF program headers that the OS's program loader looks at), so they're part of the same mapping, no unmapped pages between them. If we're lucky, the intervening bytes are even filled with 0x90 nop instead of 00. Actually, something filled the gap between __libc_csu_init and __libc_csu_fini with long NOPs. Maybe that was from the assembler if they were in the same source file.
main is of course in .rodata because you declared it in C as a read-only global (static storage), like const int main = 6;. If you used const int main __attribute__((section(".text"))) = 123, you could get main in the normal .text section. On my system, it ends up right before __libc_csu_init.
But labels interrupt disassembly; the disassembler thinks it must have been wrong and restarts decoding from the label. So in GDB on testc/test5 (with set disassembly-flavor intel and layout reg, then using the start command to stop at the start of main), I'll get
|0x40053c <main> add eax,0x41000000 │
│0x400541 <__libc_csu_init+1> push rdi │
│0x400542 <__libc_csu_init+2> mov r15,rdx
But from objdump -drwC -Mintel (disassembling only the .text section is the default for -d, and I used the GNU C attribute to put main there so my program could work the way yours does), I get:
000000000040053c <main>:
40053c: 05 00 00 00 ....
0000000000400540 <__libc_csu_init>:
400540: 41 57 push r15
400542: 49 89 d7 mov r15,rdx
Notice that the .... on the same line as the 05 00 00 00 indicates that decoding didn't get to the end of an instruction.
And since main isn't aligned by 16 here, it's right up against the start of __libc_csu_init. So the add eax, imm32 consumes the REX.B prefix (41) from push r15, making it decode as push rdi if reached by falling through from main instead of by a call to the __libc_csu_init label.
The above output was from Linux. Your OS X system would be different
OS X puts most of the CRT startup code in libc, not statically linked into the executable with main.
Or maybe there isn't anything for your main to fall through into
If there was, main=5 would have worked, but you say the first non-crashing result was with main=194, which is an actual ret.
If nothing before c3 ret or c2 00 00 ret 0 returned, then probably there's nothing to fall into after main, or the gap isn't padded with repeated 90 nop to form a "nop sled" that will execute ok if decoding starts anywhere in the middle of it. (e.g. after an earlier instruction consumes the trailing 0 bytes at the end of the dword int main, and some of the padding bytes.)
I understand that a segfault is invalid access to valid memory and a bus error is access to invalid memory
No, that simplified description is backwards. Usually you get a segfault for trying to access an unmapped page, on all Unixes. But you get a bus error for some kinds of invalid access (even on valid addresses).
Solaris on SPARC gives you a bus error for misaligned word loads/stores to valid memory.
On x86-64 Linux, you only get SIGBUS for really weird stuff. See Debugging SIGBUS on x86 Linux. Non-canonical stack pointer leading to a #SS exception, reading past the end of a mmaped file that was truncated. Also if you enable x86 alignment checking (AC flag), but nobody does that because library funcs like memcpy use unaligned loads/stores, and compiler code-gen assumes that unaligned integer loads/stores are safe.
IDK what hardware exceptions *BSD maps to SIGBUS, but I'd assume that regular out-of-bounds access, like NULL-pointer dereference, would SIGSEGV. That's pretty standard.
@MichaelPetch says in comments that on OS X
#PF (page fault hardware exception) from code-fetch causes the kernel to deliver SIGBUS
#PF from a data load/store to an unmapped page results in SIGSEGV.
#PF from a store to a read-only page results in SIGBUS. (And this is what's happening after 02 00 add al, [rax], in the 00 00 add [rax], al that forms the last 2 bytes of main. The rest of this answer doesn't take this into account.)
(Of course this is after checking if the page-fault was due to a difference between the hardware page table and the logical process memory map, e.g. from lazy mapping, copy-on-write, or pages paged out to disk.)
So if your int main is landing at the very end of a mapped page, with the next page unmapped, 05 add eax,imm32 would read one extra byte past the end of the dword holding int main (.long 5 in GAS syntax asm). That code-fetch would go into the next page and SIGBUS. (Your last comment indicates it does SIGBUS.)
A theory for what's going on with the first few values:
You report:
a bus error for main = 02 00 add al, [rax] / 00 00 add [rax], al
but a segfault for main = 03 00 add eax, [rax] / 00 00 add [rax], al.
We know the low byte of RAX is 252, so if RAX holds a valid pointer value, it's 4-byte aligned. So if loading a byte from [rax] works, so does loading a dword.
So probably the memory-source add is succeeding, and modifying AL, the low byte of RAX (byte operand size), probably still leaving RAX a valid pointer. Then if the rest of the page containing main is filled with 00 00 add [rax], al instructions (or just the one inside main itself), those will succeed (without further modifying RAX) until execution falls off into an unmapped page, as long as RAX is still a valid pointer after running whatever main decoded to.
Actually, the memory-destination add itself faults and raises SIGBUS.
03 00 add eax, [rax] writes EAX, and thus truncates RAX to 32-bit. (writing a 32-bit register implicitly zero-extends into the full 64-bit register, unlike writing low 8 or 16 partial registers.) This definitely gives you an invalid pointer, because OS X maps static code/data outside the low 32 bits of virtual address space.
So the following 00 00 add [rax], al will definitely fault from trying to write an out-of-bounds address, causing a #PF that raises SIGSEGV.
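The difference between writing AL and writing EAX can be modelled in a few lines of Python. This is only a sketch of the register-merging rules; the starting RAX value and the loaded operands are hypothetical numbers chosen so that the low byte is 252.

```python
def add_al(rax, byte_operand):
    """Model `add al, [mem]`: only bits 7..0 of RAX change."""
    al = ((rax & 0xff) + byte_operand) & 0xff
    return (rax & ~0xff) | al

def add_eax(rax, dword_operand):
    """Model `add eax, [mem]`: the 32-bit result is zero-extended,
    so bits 63..32 of RAX are cleared."""
    return ((rax & 0xffffffff) + dword_operand) & 0xffffffff

rax = 0x00007fff5fbff8fc          # hypothetical pointer, low byte 0xfc = 252
print(hex(add_al(rax, 0x02)))     # 0x7fff5fbff8fe: upper bits preserved
print(hex(add_eax(rax, 0x03)))    # 0x5fbff8ff: upper 32 bits gone
```

With the upper half cleared, the value no longer points anywhere near the original mapping, which is consistent with the following store faulting on an unmapped address.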
There's probably just the one 00 00 from the last two bytes of main before the end of a page. Otherwise 05 add eax, imm32 would segfault from truncating RAX and then running 00 00 add [rax], al. For it to SIGBUS, it must code-fetch into an unmapped page without decoding any memory-access instructions after that.
There are certainly other plausible explanations for what you're seeing, but I think this explains all your observations so far; without more data we can't disprove it. Obviously the easiest thing would be to fire up GDB or whatever other debugger and just start / si and watch what happens.

win32 singleton with std containers CRT false memory leak? [duplicate]

It seems whenever there are static objects, _CrtDumpMemoryLeaks returns a false positive claiming it is leaking memory. I know this is because they do not get destroyed until after the main() (or WinMain) function. But is there any way of avoiding this? I use VS2008.
I found that if you tell it to check memory automatically after the program terminates, it allows all the static objects to be accounted for. I was using log4cxx and boost which do a lot of allocations in static blocks, this fixed my "false positives"...
Add the following line, instead of invoking _CrtDumpMemoryLeaks, somewhere in the beginning of main():
_CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );
For more details on usage and macros, refer to MSDN article:
http://msdn.microsoft.com/en-us/library/5at7yxcs(v=vs.71).aspx
Not a direct solution, but in general I've found it worthwhile to move as much allocation as possible out of static initialization time. It generally leads to headaches (initialization order, de-initialization order etc).
If that proves too difficult you can call _CrtMemCheckpoint (http://msdn.microsoft.com/en-us/library/h3z85t43%28VS.80%29.aspx) at the start of main(), and _CrtMemDumpAllObjectsSince at the end.
1) You said:
It seems whenever there are static objects, _CrtDumpMemoryLeaks returns a false positive claiming it is leaking memory.
I don't think this is correct. EDIT: Static objects are not created on heap. END EDIT: _CrtDumpMemoryLeaks only covers crt heap memory. Therefore these objects are not supposed to return false positives.
However, it is another thing if static variables are objects which themselves hold some heap memory (if for example they dynamically create member objects with operator new()).
2) Consider using _CRTDBG_LEAK_CHECK_DF in order to activate memory leak check at the end of program execution (this is described here: http://msdn.microsoft.com/en-us/library/d41t22sb(VS.80).aspx). I suppose then memory leak check is done even after termination of static variables.
Old question, but I have an answer. I am able to split the report into false positives and real memory leaks. In my main function, I initialize the memory debugging and generate a real memory leak at the very beginning of my application (never delete pcDynamicHeapStart):
int main()
{
_CrtSetDbgFlag( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );
char* pcDynamicHeapStart = new char[ 17u ];
strcpy_s( pcDynamicHeapStart, 17u, "DynamicHeapStart" );
...
After my application is finished, the report contains
Detected memory leaks!
Dumping objects ->
{15554} normal block at 0x00000000009CB7C0, 80 bytes long.
Data: < > DD DD DD DD DD DD DD DD DD DD DD DD DD DD DD DD
{14006} normal block at 0x00000000009CB360, 17 bytes long.
Data: <DynamicHeapStart> 44 79 6E 61 6D 69 63 48 65 61 70 53 74 61 72 74
{13998} normal block at 0x00000000009BF4B0, 32 bytes long.
Data: < ^ > E0 5E 9B 00 00 00 00 00 F0 7F 9C 00 00 00 00 00
{13997} normal block at 0x00000000009CA4B0, 8 bytes long.
Data: < > 14 00 00 00 00 00 00 00
{13982} normal block at 0x00000000009CB7C0, 16 bytes long.
Data: < # > D0 DD D6 40 01 00 00 00 90 08 9C 00 00 00 00 00
...
Object dump complete.
Now look at line "Data: <DynamicHeapStart> 44 79 6E 61 6D 69 63 48 65 61 70 53 74 61 72 74".
All reported leaks below it are false positives, all above it are real leaks.
False positives don't mean there is no leak (it could be a static linked library which allocates heap at startup and never frees it), but you cannot eliminate the leak and that's no problem at all.
Since I invented this approach, I never had leaking applications any more.
I provide this here and hope this helps other developers to get stable applications.
Can you take a snapshot of the currently allocated objects every time you want a list? If so, you could remove the initially allocated objects from the list when you are looking for leaks that occur in operation. In the past, I have used this to find incremental leaks.
Another solution might be to sort the leaks and only consider duplicates for the same line of code. This should rule out static variable leaks.
Jacob
Ach. If you are sure that _CrtDumpMemoryLeaks() is lying, then you are probably correct. Most alleged memory leaks that I see are down to incorrect calls to _CrtDumpMemoryLeaks(). I agree entirely with the following: _CrtDumpMemoryLeaks() dumps all open handles. But your program probably already has open handles, so be sure to call _CrtDumpMemoryLeaks() only when all handles have been released. See http://www.scottleckie.com/2010/08/_crtdumpmemoryleaks-and-related-fun/ for more info.
I can recommend Visual Leak Detector (it's free) rather than using the stuff built into VS. My problem was using _CrtDumpMemoryLeaks with an open source library that created 990 lines of output, all false positives so far as I can tell, as well as some things coming from boost. VLD ignored these and correctly reported some leaks I added for testing, including in a native DLL called from C#.

How do you view segment-offset memory addresses in the Visual Studio debugger?

I'm debugging some code from the disassembly (no source code is available), and there a number of instructions accessing data via the ds segment register, e.g. something like this:
66 3B 05 8A B1 43 00 cmp ax,word ptr ds:[43B18Ah]
How do you get the Visual Studio debugger to tell you the offset of the ds segment register so that I can inspect the memory this is referring to? The Watch window does not seem to accept expressions like ds:[0x43B18A] or variants; it will tell me that ds is 0, but that doesn't tell me what segment 0's offset is.
Is there some special syntax for this, or is this something that VS just can't do? Would I have better luck with another debugger, such as WinDbg or ntsd?
This is a quirk of the disassembler built into Visual Studio. The ds: prefix is superfluous, since DS is the default segment register for this kind of operand. Just ignore it: on Windows the DS, CS and ES registers hold protected-mode selectors that all map the same flat address space, which is also the address space used by the Memory window. Just omit the ds: prefix.

x86 assembler - illegal opcode 0xff /7 under Windows

I'm currently developing an x86 disassembler, and I started disassembling a win32 PE file. Most of the disassembled code looks good, however there are some occurrences of the illegal 0xff /7 opcode (/7 means reg=111, 0xff is the opcode group inc/dec/call/callf/jmp/jmpf/push/illegal with operand r/m 16/32). My first guess was that /7 is the pop instruction, but that is encoded with 0x8f /0. I've checked this against the official Intel Architecture Software Developer’s Manual Volume 2: Instruction Set Reference, so I'm not just being misled.
Example disassembly: (S0000O0040683a is a label being jumped to by another instruction)
S0000O0040683a: inc edi ; 0000:0040683a ff c7
test dword ptr [eax+0xff],edi ; 0000:0040683c 85 78 ff
0xff/7 edi ; 0000:0040683f ff ff
BTW: gdb disassembles this equally (except the bug 0xff not yielding -1 in my disassembly):
(gdb) disassemble 0x0040683a 0x00406840
Dump of assembler code from 0x40683a to 0x406840:
0x0040683a: inc %edi
0x0040683c: test %edi,0xffffffff(%eax)
0x0040683f: (bad)
End of assembler dump.
So the question is: Is there any default handler in the illegal opcode exception handler of Windows which implements any functionality for this illegal opcode, and if yes: What happens there?
Regards, Bodo
After many many additional hours getting my disassembler to produce its output in the exact same syntax as gdb does, I could diff the two versions. This revealed a rather awkward bug in my disassembler: I forgot to take into account that the 0x0f 0x8x jump instructions have a TWO-byte opcode (plus the rel16/32 operand). So each 0x0f 0x8x jump target was off by one, leading to code which is not reachable in reality. After fixing this bug, no 0xff/7 opcodes are disassembled any longer.
Thanks go to everyone who answered my question (and commented on those answers as well) and thus at least tried to help me.
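To illustrate how an off-by-one like this comes about, here is a minimal Python sketch of the target computation for a near conditional jump (0x0f 0x8x with a rel32 operand). The addresses are hypothetical and the function names are made up for this example; this is not the code from the disassembler in question.

```python
def signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xffffffff
    return value - (1 << 32) if value & 0x80000000 else value

def jcc_near_target(insn_addr, rel32, opcode_len=2):
    """Target = address of the NEXT instruction + signed rel32.
    The next instruction starts after the opcode bytes and the 4-byte rel32."""
    insn_len = opcode_len + 4
    return (insn_addr + insn_len + signed32(rel32)) & 0xffffffff

addr = 0x401000  # hypothetical address of the 0x0f 0x8x instruction
print(hex(jcc_near_target(addr, 0x10)))                # 0x401016 (correct)
print(hex(jcc_near_target(addr, 0x10, opcode_len=1)))  # 0x401015 (off by one)
```

Counting only one opcode byte makes every computed target land one byte too early, so decoding starts in the middle of an instruction and produces bogus opcodes.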
Visual Studio disassembles this to the following:
00417000 FF C7 inc edi
00417002 85 78 FF test dword ptr [eax-1],edi
00417005 ?? db ffh
00417006 FF 00 inc dword ptr [eax]
Obviously, a general protection fault happens at 00417002 because eax does not point to anything meaningful, but even if I nop it out (90 90 90) it throws an illegal opcode exception at 00417005 (it does not get handled by the kernel). I'm pretty sure that this is some sort of data and not executable code.
To answer your question, Windows will close the application with the exception code 0xC000001D STATUS_ILLEGAL_INSTRUCTION. The dialog will match the dialog used for any other application crashes, whether it offers a debugger or to send an error report.
Regarding the provided code, it would appear to have either been assembled incorrectly (encoding a greater than 8-bit displacement) or is actually data (as suggested by others already).
It looks like 0xFFFFFFFF has been inserted instead of 0xFF for the test instruction, probably in error?
85 = test r/m32, and 78 is the byte for parameters [eax+disp8], edi, with the disp8 to follow which should just be 0xFF (-1) but as a 32-bit signed integer this is 0xFFFFFFFF.
So I am assuming that you have 85 78 FF FF FF FF where it should be 85 B8 FF FF FF FF for a 32-bit displacement or 85 78 FF for the 8-bit displacement? If this is the case the next byte in the code should be 0xFF...
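The ModRM bytes mentioned above can be split into their fields with a few shifts and masks. The following Python sketch does that for the bytes from this thread; the helper names are made up for illustration.

```python
def decode_modrm(byte):
    """Split a ModRM byte into (mod, reg, rm):
    mod = bits 7..6, reg = bits 5..3, rm = bits 2..0."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def sign_extend_disp8(byte):
    """An 8-bit displacement is sign-extended before it is added."""
    return byte - 0x100 if byte & 0x80 else byte

for b in (0xC7, 0x78, 0xB8, 0xFF):
    mod, reg, rm = decode_modrm(b)
    print(f"{b:#04x}: mod={mod:02b} reg={reg:03b} rm={rm:03b}")

print(sign_extend_disp8(0xFF))  # -1
```

0x78 comes out as mod=01 (disp8 follows), reg=111 (edi), rm=000 ([eax]); 0xB8 has mod=10 (disp32 follows); and 0xFF has mod=11 with reg=111, which is the /7 slot of the 0xff opcode group that has no instruction assigned.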
Of course, as suggested already, this could just be data, and don't forget that data can be stored in PE files and there is no strong guarantee of any particular structure. You can actually insert code or user defined data into some of the MZ or PE header fields if you are aggressively optimising to decrease the .exe size.
EDIT: as per the comments below I'd also recommend using an executable where you already know exactly what the expected disassembled code should be.
