Why does gcc emit 0x0(%r13) in one instruction but (%r13) in another? - gcc

I am debugging a piece of code which has the following instruction.
mov %esi,0x0(%r13)
Then at another place, I see an instruction like this:
mov %esi,(%r13)
I thought the former moves the contents of the esi register to the address given by the contents of r13 + 0x0. By that logic, the latter should have the same effect.
Is there any difference between these instructions?
Why does gcc write the same thing differently?
EDIT: The disassembly has been generated using objdump -S.

Related

Position of GCC stack canaries

Unless I am misunderstanding something, it seems the position of the canary value can be before or after ebp, therefore in the second case the attacker can overwrite the frame pointer without touching the canary.
For example in this snippet, the canary is located at a lower address (ebp-0xc) than ebp therefore protecting it (an attacker must overwrite the canary to overwrite ebp):
0x080484e0 <+52>: mov eax,DWORD PTR [ebp-0xc]
0x080484e3 <+55>: xor eax,DWORD PTR gs:0x14
0x080484ea <+62>: je 0x80484f1 <func+69>
0x080484ec <+64>: call 0x8048360 <__stack_chk_fail@plt>
However looking at other code the canary is after rbp+8:
How should I interpret this? Does this depend on GCC version or something else?
The canary is always below the frame pointer, with every version of gcc I've tried. You can see that confirmed in the gdb disassembly immediately below the IDA disassembly in the blog post you linked, which has mov rax, QWORD PTR [rbp-0x8].
I think this is just an artifact of IDA's disassembler. Instead of displaying the numerical offset for rbp-relative addresses, it assigns a name to each stack slot, and displays the name instead; basically assuming that every rbp-relative access is to a local variable or argument. And it looks like it always displays that name with a + regardless of whether the offset is positive or negative. Note that buf and fd also get a + sign even though they are local variables which are clearly below the frame pointer.
In this example, it has named the canary var_8 as if it were a local variable. So I suppose to translate this properly, you have to think of var_8 as having the value -8.
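To make that translation concrete, here is a small C sketch of how a named stack slot turns back into a signed offset. The constant var_8 mirrors the name from the disassembly; the helper slot_address and any sample rbp value are made up for illustration and are not part of IDA.

```c
#include <stdint.h>

/* IDA-style slot name from the answer, treated as the signed offset it stands for. */
enum { var_8 = -8 };

/* Hypothetical helper: the address an operand such as [rbp+var_8] refers to. */
static uint64_t slot_address(uint64_t rbp, int64_t slot_offset)
{
    /* Unsigned addition wraps modulo 2^64, so adding -8 subtracts 8. */
    return rbp + (uint64_t)slot_offset;
}
```

With rbp = 0x7ffc0000f000, slot_address(rbp, var_8) yields 0x7ffc0000eff8, which is below the frame pointer even though the operand is printed with a + sign.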

How can I print numbers in my assembly program

I have a problem with my assembly program. My assembly compiler is NASM. The source and the outputs are in this picture:
The problem is that I can't print numbers from calculations with the extern C function printf(). How can I do it?
The output should be "Ergebnis: 8" but it isn't correct.
The NASM documentation points out that NASM requires square brackets for memory references. When you write a label name without brackets, NASM gives you its memory address (or offset, as it is sometimes called). So mov eax, val_1 means that the eax register gets val_1's offset. When you then add eax, val_2, the offset of val_2 is added to the offset of val_1, and you get the result you see.
Write instead:
mov eax, [val_1]
add eax, [val_2]
And you should get 8 in eax.
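For readers coming from C, the difference is the same as between a variable's address and its value. A rough sketch, assuming val_1 and val_2 hold 3 and 5 (the real values are only visible in the picture, so these are placeholders chosen to add up to 8); the function names are invented for the illustration:

```c
#include <stdint.h>

static int32_t val_1 = 3;   /* assumed value, not taken from the question */
static int32_t val_2 = 5;   /* assumed value, not taken from the question */

/* Like: mov eax, val_1 / add eax, val_2  -> adds the two addresses. */
static uintptr_t sum_of_addresses(void)
{
    return (uintptr_t)&val_1 + (uintptr_t)&val_2;
}

/* Like: mov eax, [val_1] / add eax, [val_2]  -> adds the stored values. */
static int32_t sum_of_values(void)
{
    return val_1 + val_2;
}
```

Only sum_of_values() gives 8; sum_of_addresses() gives some large number that depends on where the linker placed the variables.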
P.S. It seems that you have just switched to NASM from MASM or TASM.
There are a lot of guides for switchers like you. See for example nice tutorials here and here.

Visual Studio only breaks on second line of assembly?

The short description:
Setting a breakpoint on the first line of my .CODE segment in an assembly program will not halt execution of the program.
The question:
What about Visual Studio's debugger would allow it to fail to create a breakpoint at the first line of a program written in assembly? Is this some oddity of the debugger, a case of breaking on a multi-byte instruction, or am I just doing something silly?
The details:
I have the following assembly program compiling and running in Visual Studio:
; Tell MASM to use the Intel 80386 instruction set.
.386
; Flat memory model, and Win 32 calling convention
.MODEL FLAT, STDCALL
; Treat labels as case-sensitive (required for windows.inc)
OPTION CaseMap:None
include windows.inc
include masm32.inc
include user32.inc
include kernel32.inc
include macros.asm
includelib masm32.lib
includelib user32.lib
includelib kernel32.lib
.DATA
BadText db "Error...", 0
GoodText db "Excellent!", 0
.CODE
main PROC
;int 3 ; <-- If uncommented, this will not break.
mov ecx, 6 ; <-- Breakpoint here will not hit.
xor eax, eax ; <-- Breakpoint here will.
_label: add eax, ecx
dec ecx
jnz _label
cmp eax, 21
jz _good
_bad: invoke StdOut, addr BadText
jmp _quit
_good: invoke StdOut, addr GoodText
_quit: invoke ExitProcess, 0
main ENDP
END main
If I try to set a breakpoint on the first line of the main function, mov ecx, 6, it is ignored, and the program executes without stopping. A breakpoint is only hit if I set it on the line after that, xor eax, eax, or on any subsequent line.
I have even tried inserting a software breakpoint, int 3, as the first line of the function, and it is also ignored.
The first thing I notice that is odd: viewing the disassembly after hitting one of my breakpoints gives me the following:
01370FFF add byte ptr [ecx+6],bh
--- [Path]\main.asm
xor eax, eax
00841005 xor eax,eax --- <-- Breakpoint is hit here
_label: add eax, ecx
00841007 add eax,ecx
dec ecx
00841009 dec ecx
jnz _label
0084100A jne _label (841007h)
cmp eax, 21
0084100C cmp eax,15h
What's interesting here is that the xor is, in Visual Studio's eyes, the first operation in my program. Absent is the line mov ecx, 6. Directly above where it thinks my source begins is the line that actually sets ecx to 6. So the actual start of my program has been mangled according to the disassembly.
If I make the first line of my program int 3, the line that appears above where my code is in the disassembly is:
00F80FFF add ah,cl
As suggested in one of the answers, I turned off ASLR, and it looks like the disassembly is a little more stable:
.CODE
main PROC
;mov ecx, 6
xor eax, eax
00401000 xor eax,eax --- <-- Breakpoint is present here, but not hit.
_label: add eax, ecx
00401002 add eax,ecx --- <-- Breakpoint here is hit.
dec ecx
00401004 dec ecx
The complete program is visible in the disassembly, but the problem still persists. Despite my program starting at an expected address, and the first breakpoint being shown in the disassembly, it is still skipped. Placing an int 3 as the first line still results in the following line:
00400FFF add ah,cl
and does not stop execution, and re-mangles the view of my program in the disassembly again. The next line of my program is then at location 00401001, which I suppose makes sense because int 3 is a one-byte instruction, but why would it have disappeared in the disassembly?
Even starting the program using the 'Step Into (F11)' command does not allow me to break on the first line. In fact, with no breakpoint, starting the program with F11 does not halt execution at all.
I'm not really sure what else I can try to solve the problem, beyond what I have detailed here. This is stretching beyond my current understanding of assembly and debuggers.
01370FFF add byte ptr [ecx+6],bh
At least I can explain away one mystery. Note the address, 0x1370fff. The CODE segment never starts at an address like that; segments begin at an address that's a multiple of 0x1000, which makes the last 3 hex digits of the start address always 0. The debugger got confuzzled and started disassembling the code at the wrong address, off by one. The actual start address is 0x1371000. The disassembly starts off poorly because there's a 0 at 0x1370fff. That's the opcode of a multi-byte ADD instruction, so the bytes of your real first instruction get swallowed as its operands. It displays garbage until it catches up with the real machine code instructions by accident.
You need to help it along and give it a command to start disassembling at the proper address. In VS that's the Address box, type "0x1371000".
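The garbage line can be reproduced by hand. The sketch below decodes just enough of the ADD r/m8, r8 form (opcode 00 followed by a ModRM byte) to show what happens when decoding starts on the zero byte in front of mov ecx, 6 (encoded as B9 06 00 00 00). The function name and the tiny subset it handles (32-bit mode, no SIB byte, mod = 10 or 11 only) are assumptions made for this illustration; it is not a general disassembler.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const REG8[8]  = {"al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"};
static const char *const REG32[8] = {"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};

/* Decode "00 /r" (ADD r/m8, r8). Returns bytes consumed, or 0 if unsupported. */
static size_t decode_add_rm8_r8(const uint8_t *code, size_t len,
                                char *out, size_t outsz)
{
    if (len < 2 || code[0] != 0x00)
        return 0;

    uint8_t  modrm = code[1];
    unsigned mod = modrm >> 6;          /* addressing mode */
    unsigned reg = (modrm >> 3) & 7;    /* source register */
    unsigned rm  = modrm & 7;           /* destination register or base */

    if (mod == 3) {                     /* register destination */
        snprintf(out, outsz, "add %s,%s", REG8[rm], REG8[reg]);
        return 2;
    }
    if (mod == 2 && rm != 4 && len >= 6) {   /* [reg32 + disp32], no SIB */
        uint32_t disp = (uint32_t)code[2] | (uint32_t)code[3] << 8 |
                        (uint32_t)code[4] << 16 | (uint32_t)code[5] << 24;
        /* Displacement printed in decimal for simplicity. */
        snprintf(out, outsz, "add byte ptr [%s+%u],%s",
                 REG32[rm], (unsigned)disp, REG8[reg]);
        return 6;
    }
    return 0;
}
```

Feeding it the bytes 00 B9 06 00 00 00 produces add byte ptr [ecx+6],bh and consumes six bytes, so the next instruction decoded is the one at offset 5 of the real code, which is where the xor eax, eax shows up in the listing. Feeding it 00 CC (a zero byte followed by int 3) produces add ah,cl.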
Another notable quirk is the strange value of the start address. A 32-bit executable is normally loaded at 0x400000. You have a feature called ASLR turned on, Address Space Layout Randomization. It is a security feature that loads programs at an unpredictable address to make exploits harder. Nice feature, but it doesn't exactly help when debugging programs. It isn't clear how you built this code, but you need the /DYNAMICBASE:NO linker option to turn it off.
Another important quirk of debuggers you need to keep in mind here is the way they set breakpoints. They do so by patching the code, replacing the start byte of an instruction with an int 3 instruction. When the breakpoint hits, it quickly replaces the byte with the original machine code instruction byte. So you never see this. This goes wrong if you pick the wrong address to set the breakpoint, like in the middle of a multi-byte instruction. It now no longer breaks the code, the altered byte messes up the original instruction. You can easily fall into this trap when you started with a bad disassembly.
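The patching mechanism can be sketched in a few lines of C that operate on a plain byte buffer instead of a live process. The struct and function names are invented for this illustration; a real debugger does the same byte swap through the operating system's debugging interfaces, which are not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC   /* one-byte opcode of int 3 */

typedef struct {
    size_t  offset;   /* where the patch was placed */
    uint8_t saved;    /* original byte that int 3 replaced */
} breakpoint;

/* Overwrite the byte at offset with int 3 and remember what was there. */
static breakpoint bp_set(uint8_t *code, size_t offset)
{
    breakpoint bp = { offset, code[offset] };
    code[offset] = INT3;
    return bp;
}

/* Put the original byte back, as the debugger does when the breakpoint hits. */
static void bp_clear(uint8_t *code, breakpoint bp)
{
    code[bp.offset] = bp.saved;
}
```

For the bytes B9 06 00 00 00 33 C0, bp_set(code, 0) replaces the B9 opcode and the CPU traps. bp_set(code, 1) instead lands inside the immediate of the mov, turning it into mov ecx, 0CCh: no trap, and the program now runs with a different constant.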
Well, do this the Right Way. Start debugging with the debugger's STEP command instead.
I have discovered what the root of the problem is, but I haven't a clue why it is so.
After creating another MASM project, I noticed that the new one would break on the first line of the program, and the disassembly did not appear to be mangled or altered. So, I compared its properties to my original project (for the Debug configuration). The only difference I found was that my original project had Incremental Linking disabled. Specifically, it added /INCREMENTAL:NO to the linker command line.
Removing this option from the command line (thereby enabling Incremental Linking) resulted in the program behaving as expected during debugging; my code shown in the disassembly window remained unaltered, I could hit a breakpoint on the first line of the main procedure, and an int 3 instruction would also execute properly as the first line.
If you press F11 (Step Into) instead of Start Debugging, the debugger will stop on the first line.
It is possible there is some messed up breakpoint setting. Delete any *.suo files in your project directory to reset all breakpoints.
Note that your project will have hidden startup code and other extras in it if it has a main function. To set a breakpoint at the real entry point use: Debug + New Breakpoint + Break at Function -> wWinMainCRTStartup for a windows program or mainCRTStartup or wmainCRTStartup for a console program.

Why does GCC add assembly commands to my inline assembly?

I'm using Apple's llvm-gcc to compile some code with inline assembly. I wrote what I want it to do, but it adds extraneous commands that keep writing variables to memory. Why is it doing this and how can I stop it?
Example:
__asm__{
mov r11, [rax]
and r11, 0xff
cmp r11, '\0'
}
becomes (in the "assembly" assistant view):
mov 0(%rax), %r11 // correct
movq %r11, -104(%rbp) // no, GCC, obviously wrong
and $255, %r11
movq %r11, -104(%rbp)
cmp $0, %r11
Cheers.
You need to use GCC's extended asm syntax to tell it which registers you're using as input and output and which registers get clobbered. If you don't do that, it has no idea what you're doing, and the assembly it generates can easily interfere with your code.
By informing it about what your code is doing, it changes how it does register allocation and optimization and avoids breaking your code.
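As a rough sketch of what that looks like, here is the byte test from the question rewritten with extended asm. It uses AT&T syntax, the default for GCC extended asm, and loads a single byte instead of masking a 64-bit load. The function name, the operand names and the portable fallback branch are assumptions added for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: true if the byte p points to is '\0'. */
static bool first_byte_is_nul(const char *p)
{
#if defined(__GNUC__) && defined(__x86_64__)
    unsigned int  tmp;       /* scratch register, picked by the compiler */
    unsigned char is_zero;

    __asm__ ("movzbl (%[ptr]), %[tmp]\n\t"   /* load one byte, zero-extended */
             "testl  %[tmp], %[tmp]\n\t"     /* ZF = 1 if the byte is 0 */
             "sete   %[out]"                 /* copy ZF into the result */
             : [tmp] "=&r" (tmp), [out] "=q" (is_zero)   /* outputs */
             : [ptr] "r" (p)                             /* inputs */
             : "cc", "memory");                          /* clobbers */
    return is_zero != 0;
#else
    return *p == '\0';       /* plain C fallback for other targets */
#endif
}
```

Because the scratch register is declared as an output operand rather than hard-coded as r11, the compiler knows which register holds what and has no reason to insert its own spills in the middle of the sequence.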
It's because gcc tries to optimize your code. You can prevent optimizations by adding -O0 to the command line.
Try adding volatile after __asm__ if you don't want that. Those additional instructions are probably part of the previous/next C statements. Without volatile, the compiler is allowed to do this (as the whole routine, not just your code, probably executes faster this way).

How to debug an assembled program?

I have a program written in assembly that crashes with a segmentation fault. (The code is irrelevant, but is here.)
My question is how to debug an assembly language program with GDB?
When I try running it in GDB and perform a backtrace, I get no meaningful information. (Just hex offsets.)
How can I debug the program?
(I'm using NASM on Ubuntu, by the way if that somehow helps.)
I would just load it directly into gdb and step through it instruction by instruction, monitoring all registers and memory contents as you go.
I'm sure I'm not telling you anything you don't know there, but the program seems simple enough to warrant this sort of approach. I would leave fancy debugging tricks like backtraces (and even breakpoints) for more complex code.
As to the specific problem (code paraphrased below):
extern printf
SECTION .data
format: db "%d",0
SECTION .bss
v_0: resb 4
SECTION .text
global main
main:
push 5
pop eax
mov [v_0], eax
mov eax, v_0
push eax
call printf
You appear to be storing 5 in v_0 (the push/pop pair just loads it into eax) and then pushing only the address of v_0 before the call. I'm pretty certain you're going to need to push the address of the format string at some point if you want to call printf. It's not going to take too kindly to being given a rogue format string.
It's likely that your:
mov eax, v_0
should be:
mov eax, format
and I'm assuming that there's more code after that call to printf that you just left off as unimportant (otherwise you'll be going off to never-never land when it returns).
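For comparison, this is the C call the assembly appears to be aiming for, assuming the goal is to print the value stored in v_0. The names format and v_0 come from the paraphrased code; the wrapper function print_v_0 is made up for the example.

```c
#include <stdio.h>

static const char format[] = "%d";
static int v_0;

/* Hypothetical wrapper around the intended call. */
static int print_v_0(void)
{
    v_0 = 5;
    /* With the 32-bit cdecl convention, arguments are pushed right to left:
       first the value of v_0, then the address of format. The caller
       removes both from the stack after printf returns. */
    return printf(format, v_0);
}
```

So printf expects two things on the stack here, with the format string's address on top: pushing only one of them leaves it reading whatever happens to be there.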
You should still be able to assemble with Stabs markers when linking code (with gcc).
I recommend using YASM and assembling with -dstabs options:
$ yasm -felf64 -mamd64 -dstabs file.asm
This is how I assemble my assembly programs.
NASM and YASM code is interchangeable for the most part (YASM has some extensions that aren't available in NASM, but all NASM code assembles fine with YASM).
I use gcc to link my assembled object files together or while compiling with C or C++ code. When using gcc, I use -gstabs+ to compile it with debug markers.
