Why does basic assembly code fail to build? - visual-studio

I have a big problem. I just started with assembly, and I think I have at least understood the basics of MOV and the system calls, but I cannot understand why my code will not build and run. It is just the basic 'Hello World' program.
My code looks like this:
global_start
_start:
mov eax,4
mov ebx,1
mov ecx,msg
mov edx,len
int 0x80
mov eax,1
int 0x80
segment .data
msg db 'Ide Gas na max', 0xa
len equ $ - msg
I tried setting up several different environments: the MASM32 SDK, Visual Studio, and Visual Studio Code with the MASM/TASM extension, which opens DOSBox. Sadly, it crashes instantly, and the debug option gives an error for every line. (I did learn that TASM is more for 16-bit applications, so I changed to MASM only in the preferences.)
main.ASM(1): error A2008: ression : segment
main.ASM(2): error A2008: ression : global_start
main.ASM(4): error A2034: values for structure
main.ASM(16): error A2088: ring
I did not include the full error output, because only those 4 variations repeat for each line. I can post it if you want to see it, but I do not want to make this too long. Then I thought maybe I had just used the wrong set of instructions, so I found random hello world programs online and copy-pasted them, and absolutely none worked; there was always an error 🤔 Then I changed the IDE to the MASM32 editor, which only ever gives "Assembly Error", while Visual Studio gives a different message every time, which I no longer have because I deleted it. And yes, I did set up MASM for the VS project as well, following several tutorials and a book I have.
So please, can someone explain what to do or what to try? I am clueless. In the code I also tried Section instead of Segment, or just changing the order or the dots; still nothing.

Related

Strange memory content display in Visual Studio debug mode

I am writing a multi-threaded C program. I tried to modify the first few instructions of a function's body to redirect execution somewhere else.
But I noticed that when debugging in Visual Studio 2015, some memory locations seem to be unchangeable, as displayed in the Memory window.
For example:
In the picture below, the function ApSignalMceToOs() begins at 0x7FFBBEE51360. I have unprotected the memory range 0x7FFBBEE51360 to 0x7FFBBEE5136E so that I can modify it.
Lines 305 to 312 modify the address range 0x7FFBBEE51360 ~ 0x7FFBBEE5136E.
Everything is fine until 0x7FFBBEE51369. At line 311, the value of (uint32_t)(((uintptr_t)dst) >> 32) is 0x00007ffb.
After line 311 is executed, I expected the memory range 0x7FFBBEE51369 ~ 0x7FFBBEE5136C to be filled with fb 7f 00 00. But as shown below, Visual Studio says it is 48 7f 00 00, where 48 is the old value.
Then I went to check the disassembly of the function ApSignalMceToOs(). Not surprisingly, the instruction at 00007FFBBF171365 is mov dword ptr [rsp+4], 7F48h, where the immediate should be 7FFB, as shown in the red box below.
So up to this point, Visual Studio 2015 is telling me that my modification has failed.
But as the yellow arrow in the picture above shows, after mov dword ptr [rsp+4], 7F48h is executed, I checked the content of the stack area. Surprisingly, it is indeed 7f fb that got moved onto the stack (shown in the green box in the picture above).
And after the ret instruction is executed, the RIP register does change to 00007FFBBEEAD940, which is no surprise. See below:
And in another function, the same location is being read, as shown below:
The code[len] or byte ptr [rax] is the memory location holding 48 or fb. But it reads 0xcc, which is neither 0x48 nor 0xfb.
Visual Studio's disassembly is decoded from the memory content, so the memory content, or how VS2015 reads and refreshes it, is the key point.
Based on the above observations, I came to 2 conclusions about VS 2015 debug mode:
Some memory content is not correctly shown (or refreshed in the GUI).
Some memory read operations don't work correctly.
But the program runs smoothly when not debugging.
Does anyone know why this is happening?
ADD 1 - 5:08 PM 10/14/2019
Thanks to @MichaelBurr. I guess I can explain it now.
The root cause is that I added a breakpoint at 0x00007FFB...369 at the disassembly level, not the C source level.
When I did this, the VS debugger did write a 0xCC instruction at the location 0x00007FFB...369. But it seems Visual Studio 2015 goes to great lengths to hide this fact. Below is the memory content with the breakpoint set at 0x00007FFB...369; we can see that 0x00007FFB...369 still shows the old value 0x48.
But after I manually copied the memory from 0x00007FFB...360 to 0x00007FFB...36e to somewhere else, the 0xCC instruction at offset 0x9 was unveiled. See below:
When I modified the content at 0x00007FFB...369, Visual Studio seemed to be alerted, and it just restored the content to the old preserved value, i.e. 0x48, not my newly written one.
But I think this restoration doesn't make sense. Restoring the preserved byte shouldn't be triggered at this moment at all. A more reasonable action would be to move the breakpoint slightly and insert the 0xCC instruction at a new location, because the newly modified code may change the instruction boundaries. This way, the debugging experience for self-modifying code would be best preserved. But it would require Visual Studio to disassemble the new code nearby, and the new instruction bytes could be invalid if the programmer made a mistake.
I think you are essentially fighting with the debugger's breakpoint/single step handling. Breakpoints are often implemented with the int 3 instruction which has the encoding 0xCC. When the debugger sets the 0xCC for the breakpoint it has to save the original value, then replace it when the debugger has stopped program execution.
In a normal situation (code that isn't self-modified) this makes things appear as you expect when examining the code memory region. However if your program modifies the memory that is being managed by the debugger you can get confusing results since the debugger will restore the value it had saved when it set the breakpoint (overwriting your modification).
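The save-and-restore behaviour described above can be illustrated with a toy model. This is only a sketch in Python, not how Visual Studio is actually implemented: the ToyDebugger class, its method names, and the 15-byte buffer are all made up for illustration. It shows why the Memory window keeps displaying 0x48, why a raw read by the program sees 0xCC, and why the program's own write ends up overwritten.

```python
INT3 = 0xCC  # one-byte encoding of the x86 `int 3` breakpoint instruction


class ToyDebugger:
    """Hypothetical model of a debugger that patches code with 0xCC."""

    def __init__(self, memory):
        self.memory = memory      # the debuggee's real bytes (a bytearray)
        self.saved = {}           # breakpoint address -> original byte

    def set_breakpoint(self, addr):
        # Remember the original byte, then overwrite it with int 3.
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = INT3

    def memory_window(self, start, length):
        # What the debugger displays: saved bytes are substituted for 0xCC.
        view = bytearray(self.memory[start:start + length])
        for addr, original in self.saved.items():
            if start <= addr < start + length:
                view[addr - start] = original
        return bytes(view)

    def remove_breakpoint(self, addr):
        # Put back the byte that was saved when the breakpoint was set.
        self.memory[addr] = self.saved.pop(addr)


code = bytearray([0x90] * 15)   # stand-in for the 15 patched code bytes
code[9] = 0x48                  # the old value at offset 0x9
dbg = ToyDebugger(code)
dbg.set_breakpoint(9)

raw_before = code[9]                        # what the program itself reads
shown_before = dbg.memory_window(0, 15)[9]  # what the Memory window shows

code[9] = 0xFB                              # the program's own modification
shown_after = dbg.memory_window(0, 15)[9]   # display still uses the saved byte

dbg.remove_breakpoint(9)                    # debugger restores its saved byte
final = code[9]
print(hex(raw_before), hex(shown_before), hex(shown_after), hex(final))
```

In this model the program's 0xFB is lost as soon as the debugger writes back the byte it saved, which matches the confusing results described in the answer.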

How to find code for crash

I have some 64-bit code that runs in release mode on a server. There's no Visual Studio on the server, only on my dev machine. The program has been written by many authors (me being the latest), some of its code I'm still not familiar with, and it's quite big.
The program crashes now and then with a null pointer: "The instruction at 0xwhatever (latest 0x40066c19) referenced memory at 0x00000000 - click on OK to terminate the program." I have all the source and PDB files for the EXE, but when I run it and attach to the process, the address 0x40066c19 is completely out of range. There is only ?? in that area. How do you use the info about "the instruction at ..."?
The disassembly window displays something like the following (example), but as you can see, addresses like 00000001403CB888 are simply too far from 0x40066c19:
if (LastKickIdle > GetTickCount())
00000001403CB882 call qword ptr [__imp_GetTickCount (0140688310h)]
00000001403CB888 cmp dword ptr [LastKickIdle (0140888DF8h)],eax
00000001403CB88E ja CMainDlg::OnKickIdle+281h (01403CBAB1h)
return 1;
LastKickIdle = GetTickCount() + 500;
00000001403CB894 mov qword ptr [__formal],rbx
00000001403CB89C call qword ptr [__imp_GetTickCount (0140688310h)]
00000001403CB8A2 add eax,1F4h
00000001403CB8A7 mov dword ptr [LastKickIdle (0140888DF8h)],eax
I run into similar situations at work. I created a log class that takes a string arg and writes it to a file with some other useful info. I make entries at the beginning of methods or at places that I think something may be or may later be problematic. With this, I have at least been able to narrow down my search for problems.
Hope this helps.
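As a rough illustration of the logging approach described above, here is a minimal sketch in Python. The TraceLog name, the line format, and the choice of "other useful info" (timestamp, process ID, thread ID) are assumptions, not the answerer's actual class. The same idea carries over to a C++ class in the real program.

```python
import os
import tempfile
import threading
import time


class TraceLog:
    """Hypothetical append-only trace log for narrowing down a crash site."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()   # keep lines from different threads intact

    def write(self, message):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        line = f"{stamp} pid={os.getpid()} tid={threading.get_ident()} {message}\n"
        with self._lock, open(self.path, "a", encoding="utf-8") as f:
            f.write(line)
            f.flush()
            os.fsync(f.fileno())        # push the entry to disk before a possible crash
        return line


# Usage: one entry at the start of each suspicious method.
log_path = os.path.join(tempfile.mkdtemp(), "trace.log")
log = TraceLog(log_path)
written = log.write("entering CMainDlg::OnKickIdle")
```

The last entry in the file before a crash then points at the method that was running, which narrows the search even without a debugger on the server.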

Compiler generated unexpected `IN AL, DX` (opcode `EC`) while setting up call stack

I was looking at some compiler output, and when a function is called it usually starts setting up the call stack like so:
PUSH EBP
MOV EBP, ESP
PUSH EDI
PUSH ESI
PUSH EBX
So we save the base pointer of the calling routine on the stack, move our own base pointer up, and then store the contents of a few registers on the stack. These are then restored to their original values at the end of the routine, like so:
LEA ESP, [EBP-0Ch]
POP EBX
POP ESI
POP EDI
POP EBP
RET
So far, so good. However, I noticed that in one routine the code that sets up the call stack looks a little different. In fact, it looks like this:
IN AL, DX
PUSH EDI
PUSH ESI
PUSH EBX
This is quite confusing for a number of reasons. For one thing, the end-of-method code is identical to that quoted above for the other method, and in particular seems to expect a saved copy of EBP to be available on the stack.
For another, if I understand correctly, the instruction IN AL, DX reads into the AL register, which is the low byte of the EAX register, and as it so happens the very next instruction here is
XOR EAX, EAX
as the program wants to zero a few things it allocated on the stack.
Question: I'm wondering exactly what's going on here that I don't understand. The machine code being translated as IN AL, DX is the single byte EC, whereas the pair of instructions
PUSH EBP
MOV EBP, ESP
would correspond to the three bytes 55 8B EC. Is the disassembler misreading this somehow? Or is something relying on a side effect I don't understand?
If anyone's curious, this machine code was generated by the CLR's JIT compiler, and I'm viewing it with the Visual Studio debugger. Here's a minimal reproduction in C#:
class C {
string s = "";
public void f(string s) {
this.s = s;
}
}
However, note that this seems to be non-deterministic; sometimes I seem to get the IN AL, DX version, while other times there's a PUSH EBP followed by a MOV EBP, ESP.
EDIT: I'm starting to strongly suspect a disassembler bug -- I just got another situation where it shows IN AL, DX (opcode EC) and the two preceding bytes in memory are 55 8B. So perhaps the disassembler is simply confused about the entry point of the method. (Though I'd still like some insight as to why that's happening!)
Sounds like you are using VS2015. Your conclusion is correct, its debugging engine has a lot of bugs. Yes, wrong address. Not the only problem, it does not restore breakpoints properly and you are apt to see the INT3 instruction still in the code. And it can't correctly refresh the disassembly when the jitter has re-generated the code and replaced stub calls. You can't trust anything you see.
I recommend you use Tools > Options > Debugging > General and tick the "Use Managed Compatibility Mode" checkbox. That forces the debugger to use an older debugging engine, VS2010 vintage. It is much more stable.
You'll lose some features with this engine, like return value inspection and 64-bit Edit+Continue. Won't be missed when you do this kind of debugging. You will however see fake code addresses, as was always common before, so all CALL addresses are wrong and you can't easily identify calls into the CLR. Flipping the engine back-and-forth is a workaround of sorts, but of course a big annoyance.
This has not been worked on either, I saw no improvements in the Updates. But they no doubt had a big bug list to work through, VS2015 shipped before it was done. Hopefully VS2017 is better, we'll find out soon.
As Hans answered, it's a bug in Visual Studio.
To confirm the same, I disassembled a binary using IDA 6.5 and Visual Studio 2019. Here is the screenshot:
Visual Studio 2019 missed 2 bytes (0x55 0x8B) while considering the start of main.
Note: 'Use managed compatibility mode' mentioned by Hans didn't fix the issue in VS2019.
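The "wrong entry point" explanation can be reproduced with a toy decoder. The sketch below is in Python and is not a real disassembler: the table covers only the six byte patterns in this prologue, and the disassemble helper is made up for illustration. It decodes the same bytes once from the true start and once from two bytes later.

```python
# Opcode table for just the bytes in this prologue (not a real disassembler).
OPCODES = {
    b"\x55": "PUSH EBP",
    b"\x8b\xec": "MOV EBP, ESP",
    b"\xec": "IN AL, DX",
    b"\x57": "PUSH EDI",
    b"\x56": "PUSH ESI",
    b"\x53": "PUSH EBX",
}


def disassemble(code, start):
    """Decode `code` from offset `start` using the tiny table above."""
    out = []
    pos = start
    while pos < len(code):
        for length in (2, 1):           # try the two-byte form first
            chunk = bytes(code[pos:pos + length])
            if len(chunk) == length and chunk in OPCODES:
                out.append(OPCODES[chunk])
                pos += length
                break
        else:
            raise ValueError(f"unknown byte {code[pos]:#04x} at offset {pos}")
    return out


prologue = bytes.fromhex("558bec575653")
correct = disassemble(prologue, 0)      # decoding from the true entry point
shifted = disassemble(prologue, 2)      # entry point assumed two bytes too late
print(correct)
print(shifted)
```

Starting two bytes late leaves the lone EC byte to be decoded as IN AL, DX, followed by the same three PUSH instructions, which is the listing shown in the question.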

Visual Studio only breaks on second line of assembly?

The short description:
Setting a breakpoint on the first line of my .CODE segment in an assembly program will not halt execution of the program.
The question:
What about Visual Studio's debugger would allow it to fail to create a breakpoint at the first line of a program written in assembly? Is this some oddity of the debugger, a case of breaking on a multi-byte instruction, or am I just doing something silly?
The details:
I have the following assembly program compiling and running in Visual Studio:
; Tell MASM to use the Intel 80386 instruction set.
.386
; Flat memory model, and Win 32 calling convention
.MODEL FLAT, STDCALL
; Treat labels as case-sensitive (required for windows.inc)
OPTION CaseMap:None
include windows.inc
include masm32.inc
include user32.inc
include kernel32.inc
include macros.asm
includelib masm32.lib
includelib user32.lib
includelib kernel32.lib
.DATA
BadText db "Error...", 0
GoodText db "Excellent!", 0
.CODE
main PROC
;int 3 ; <-- If uncommented, this will not break.
mov ecx, 6 ; <-- Breakpoint here will not hit.
xor eax, eax ; <-- Breakpoint here will.
_label: add eax, ecx
dec ecx
jnz _label
cmp eax, 21
jz _good
_bad: invoke StdOut, addr BadText
jmp _quit
_good: invoke StdOut, addr GoodText
_quit: invoke ExitProcess, 0
main ENDP
END main
If I try to set a breakpoint on the first line of the main function, mov ecx, 6, it is ignored, and the program executes without stopping. A breakpoint is only hit if I set it on the line after that, xor eax, eax, or on any subsequent line.
I have even tried inserting a software breakpoint, int 3, as the first line of the function, and it is also ignored.
The first thing I notice that is odd: viewing the disassembly after hitting one of my breakpoints gives me the following:
01370FFF add byte ptr [ecx+6],bh
--- [Path]\main.asm
xor eax, eax
00841005 xor eax,eax --- <-- Breakpoint is hit here
_label: add eax, ecx
00841007 add eax,ecx
dec ecx
00841009 dec ecx
jnz _label
0084100A jne _label (841007h)
cmp eax, 21
0084100C cmp eax,15h
What's interesting here is that the xor is, in Visual Studio's eyes, the first operation in my program. The line mov ecx, 6 is absent. Directly above where it thinks my source begins is the line that actually sets ecx to 6. So the actual start of my program has been mangled according to the disassembly.
If I make the first line of my program int 3, the line that appears above where my code is in the disassembly is:
00F80FFF add ah,cl
As suggested in one of the answers, I turned off ASLR, and it looks like the disassembly is a little more stable:
.CODE
main PROC
;mov ecx, 6
xor eax, eax
00401000 xor eax,eax --- <-- Breakpoint is present here, but not hit.
_label: add eax, ecx
00401002 add eax,ecx --- <-- Breakpoint here is hit.
dec ecx
00401004 dec ecx
The complete program is visible in the disassembly, but the problem still persists. Despite my program starting at an expected address, and the first breakpoint being shown in the disassembly, it is still skipped. Placing an int 3 as the first line still results in the following line:
00400FFF add ah,cl
and does not stop execution, and re-mangles the view of my program in the disassembly again. The next line of my program is then at location 00401001, which I suppose makes sense because int 3 is a one-byte instruction, but why would it have disappeared in the disassembly?
Even starting the program using the 'Step Into (F11)' command does not allow me to break on the first line. In fact, with no breakpoint, starting the program with F11 does not halt execution at all.
I'm not really sure what else I can try to solve the problem, beyond what I have detailed here. This is stretching beyond my current understanding of assembly and debuggers.
01370FFF add byte ptr [ecx+6],bh
At least I can explain away one mystery. Note the address, 0x1370fff. The CODE segment never starts at an address like that, segments begin at an address that's a multiple of 0x1000. Which makes the last 3 hex digits of the start address always 0. The debugger got confuzzled and started disassembling the code at the wrong address, off by one. The actual start address is 0x1371000. The disassembly starts off poorly because there's a 0 at 0x1370fff. That's a multi-byte ADD instruction. So it displays garbage for a while until it catches up with real machine code instructions by accident.
You need to help it along and give it a command to start disassembling at the proper address. In VS that's the Address box, type "0x1371000".
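The off-by-one start described above can be checked by hand. The following Python sketch is an illustration only: decode_one handles just the handful of encodings that appear in this question, the byte 00 before the code is taken from the answer's statement that there is a 0 at 0x1370fff, and the 33 C0 encoding of xor eax, eax is an assumption consistent with the two-byte instruction in the listing.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
PAGE = 0x1000


def page_align_up(addr):
    """Round an address up to the next multiple of 0x1000."""
    return (addr + PAGE - 1) & ~(PAGE - 1)


def decode_one(code, pos):
    """Decode one instruction; return (text, length). Handles only a few forms."""
    op = code[pos]
    if op == 0x00:                       # ADD r/m8, r8
        modrm = code[pos + 1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        if mod == 3:                     # register destination
            return f"add {REG8[rm]},{REG8[reg]}", 2
        if mod == 2 and rm != 4:         # [reg32 + disp32], no SIB byte
            disp = int.from_bytes(code[pos + 2:pos + 6], "little")
            return f"add byte ptr [{REG32[rm]}+{disp}],{REG8[reg]}", 6
    elif 0xB8 <= op <= 0xBF:             # MOV r32, imm32
        imm = int.from_bytes(code[pos + 1:pos + 5], "little")
        return f"mov {REG32[op - 0xB8]},{imm}", 5
    elif op == 0x33 and code[pos + 1] == 0xC0:
        return "xor eax,eax", 2
    elif op == 0xCC:
        return "int 3", 1
    raise ValueError(f"unsupported byte {op:#04x} at offset {pos}")


# One zero byte before the section, then `mov ecx, 6` and `xor eax, eax`.
image = bytes.fromhex("00" "b906000000" "33c0")

wrong_first = decode_one(image, 0)       # start one byte too early (…FFF)
wrong_next = decode_one(image, wrong_first[1])
right_first = decode_one(image, 1)       # start at the real section start (…000)
int3_case = decode_one(bytes.fromhex("00cc"), 0)
print(hex(page_align_up(0x1370FFF)), wrong_first, wrong_next, right_first, int3_case)
```

Decoding from the byte before the section swallows the whole mov ecx, 6 into a six-byte add, so the first instruction the debugger lines up with is xor eax, eax. With int 3 as the first instruction, the same mistake yields add ah,cl, as seen in the question.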
Another notable quirk is the strange value of the start address. A 32-bit EXE is normally loaded at its default base address, 0x400000. You have a feature called ASLR turned on, Address Space Layout Randomization. It is a security feature that loads programs at an unpredictable address to make exploits harder. Nice feature but it doesn't exactly help debugging programs. It isn't clear how you built this code but you need the /DYNAMICBASE:NO linker option to turn it off.
Another important quirk of debuggers you need to keep in mind here is the way they set breakpoints. They do so by patching the code, replacing the start byte of an instruction with an int 3 instruction. When the breakpoint hits, it quickly replaces the byte with the original machine code instruction byte. So you never see this. This goes wrong if you pick the wrong address to set the breakpoint, like in the middle of a multi-byte instruction. It now no longer breaks the code, the altered byte messes up the original instruction. You can easily fall into this trap when you started with a bad disassembly.
Well, do this the Right Way. Start debugging with the debugger's STEP command instead.
I have discovered what the root of the problem is, but I haven't a clue why it is so.
After creating another MASM project, I noticed that the new one would break on the first line of the program, and the disassembly did not appear to be mangled or altered. So, I compared its properties to my original project (for the Debug configuration). The only difference I found was that my original project had Incremental Linking disabled. Specifically, it added /INCREMENTAL:NO to the linker command line.
Removing this option from the command line (thereby enabling Incremental Linking) resulted in the program behaving as expected during debugging; my code shown in the disassembly window remained unaltered, I could hit a breakpoint on the first line of the main procedure, and an int 3 instruction would also execute properly as the first line.
If you press F11 (Step Into) instead of Start Debugging, the debugger will stop on the first line.
It is possible there is some messed up breakpoint setting. Delete any *.suo files in your project directory to reset all breakpoints.
Note that your project will contain hidden startup code if it has a main function. To set a breakpoint at the real entry point use: Debug + New Breakpoint + Break at Function -> wWinMainCRTStartup for a Windows program, or mainCRTStartup or wmainCRTStartup for a console program.

How to debug an assembled program?

I have a program written in assembly that crashes with a segmentation fault. (The code is irrelevant, but is here.)
My question is how to debug an assembly language program with GDB?
When I try running it in GDB and perform a backtrace, I get no meaningful information. (Just hex offsets.)
How can I debug the program?
(I'm using NASM on Ubuntu, by the way if that somehow helps.)
I would just load it directly into gdb and step through it instruction by instruction, monitoring all registers and memory contents as you go.
I'm sure I'm not telling you anything you don't know there, but the program seems simple enough to warrant this sort of approach. I would leave fancy debugging tricks like backtraces (and even breakpoints) for more complex code.
As to the specific problem (code paraphrased below):
extern printf
SECTION .data
format: db "%d",0
SECTION .bss
v_0: resb 4
SECTION .text
global main
main:
push 5
pop eax
mov [v_0], eax
mov eax, v_0
push eax
call printf
You appear to be just pushing 5 on to the stack followed by the address of that 5 in memory (v_0). I'm pretty certain you're going to need to push the address of the format string at some point if you want to call printf. It's not going to take too kindly to being given a rogue format string.
It's likely that your:
mov eax, v_0
should be:
mov eax, format
and I'm assuming that there's more code after that call to printf that you just left off as unimportant (otherwise you'll be going off to never-never land when it returns).
You should still be able to assemble with Stabs markers when linking code (with gcc).
I recommend using YASM and assembling with the -dstabs option:
$ yasm -felf64 -mamd64 -dstabs file.asm
This is how I assemble my assembly programs.
NASM and YASM code is interchangeable for the most part (YASM has some extensions that aren't available in NASM, but any NASM code assembles fine with YASM).
I use gcc to link my assembled object files together, or when compiling them with C or C++ code. When using gcc, I use -gstabs+ to compile with debug markers.
