So I have a simple C program that loops through the args passed to main then returns:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;
    for (i = 0; i < argc; ++i) {
        fprintf(stdout, "%s\n", argv[i]);
    }
    return 0;
}
I wanted to see how gcc wrote out the assembly in NASM format. Looking over the output in the .asm file, I noticed that the syntax was TASM. Below are the makefile and the output from gcc. Am I doing something wrong, or does gcc simply not output true NASM syntax?
all: main

main: main.o
	ld -o main main.o

main.o : main.c
	gcc -S -masm=intel -o main.asm main.c
	nasm -f elf -g -F stabs main.asm -l main.lst
AND
.file "main.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "%s\n"
.text
.globl main
.type main, @function
main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 32
mov DWORD PTR [esp+28], 0
jmp .L2
.L3:
mov eax, DWORD PTR [esp+28]
sal eax, 2
add eax, DWORD PTR [ebp+12]
mov ecx, DWORD PTR [eax]
mov edx, OFFSET FLAT:.LC0
mov eax, DWORD PTR stdout
mov DWORD PTR [esp+8], ecx
mov DWORD PTR [esp+4], edx
mov DWORD PTR [esp], eax
call fprintf
add DWORD PTR [esp+28], 1
.L2:
mov eax, DWORD PTR [esp+28]
cmp eax, DWORD PTR [ebp+8]
jl .L3
mov eax, 0
leave
ret
.size main, .-main
.ident "GCC: (GNU) 4.5.1 20100924 (Red Hat 4.5.1-4)"
.section .note.GNU-stack,"",@progbits
The errors on the command line are:
[mehoggan@fedora sandbox-print_args]$ make
gcc -S -masm=intel -o main.asm main.c
nasm -f elf -g -F stabs main.asm -l main.lst
main.asm:1: error: attempt to define a local label before any non-local labels
main.asm:1: error: parser: instruction expected
main.asm:2: error: attempt to define a local label before any non-local labels
main.asm:2: error: parser: instruction expected
main.asm:3: error: attempt to define a local label before any non-local labels
main.asm:3: error: parser: instruction expected
main.asm:4: error: attempt to define a local label before any non-local labels
main.asm:5: error: attempt to define a local label before any non-local labels
main.asm:5: error: parser: instruction expected
main.asm:6: error: attempt to define a local label before any non-local labels
main.asm:7: error: attempt to define a local label before any non-local labels
main.asm:7: error: parser: instruction expected
main.asm:8: error: attempt to define a local label before any non-local labels
main.asm:8: error: parser: instruction expected
main.asm:14: error: comma, colon or end of line expected
main.asm:17: error: comma, colon or end of line expected
main.asm:19: error: comma, colon or end of line expected
main.asm:20: error: comma, colon or end of line expected
main.asm:21: error: comma, colon or end of line expected
main.asm:22: error: comma, colon or end of line expected
main.asm:23: error: comma, colon or end of line expected
main.asm:24: error: comma, colon or end of line expected
main.asm:25: error: comma, colon or end of line expected
main.asm:27: error: comma, colon or end of line expected
main.asm:29: error: comma, colon or end of line expected
main.asm:30: error: comma, colon or end of line expected
main.asm:35: error: parser: instruction expected
main.asm:36: error: parser: instruction expected
main.asm:37: error: parser: instruction expected
make: *** [main.o] Error 1
What led me to believe that this is TASM syntax was the information posted at this link:
http://rs1.szif.hu/~tomcat/win32/intro.txt
TASM coders usually have lexical difficulties with NASM because it
lacks the "ptr" keyword used extensively in TASM.
TASM uses this:
mov al, byte ptr [ds:si] or mov ax, word ptr [ds:si] or mov eax, dword ptr [ds:si]
For NASM This simply translates into:
mov al, byte [ds:si] or mov ax, word [ds:si] or mov eax, dword [ds:si]
NASM allows these size keywords in many places, and thus gives you a
lot of control over the generated opcodes in a uniform way, for
example these are all valid:
push dword 123
jmp [ds: word 1234] ; these both specify the size of the offset
jmp [ds: dword 1234] ; for tricky code when interfacing 32bit and
; 16bit segments
it can get pretty hairy, but the important thing to remember is you
can have all the control you need, when you want it.
Intel syntax means Intel syntax, not NASM syntax. The MASM and TASM syntaxes are based on Intel syntax; NASM syntax takes inspiration from Intel syntax, but it is different.
What gcc outputs is actually gas syntax that uses Intel syntax for the individual instructions. Assembler directives, labels and so on still use gas-specific syntax.
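To make the distinction concrete, here is a minimal C sketch that flags which lines of the gcc output above rely on gas-specific constructs NASM will not accept as they stand. The enum and the function name classify_line are hypothetical, invented for this illustration, and the checks cover only the three patterns visible in the listing (directives, the PTR keyword, OFFSET FLAT:). It does not convert anything.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical categories, invented for this sketch. */
enum gas_feature {
    GAS_NONE = 0,     /* nothing gas-specific spotted */
    GAS_DIRECTIVE,    /* e.g. ".intel_syntax noprefix", ".section .rodata" */
    GAS_PTR_KEYWORD,  /* e.g. "DWORD PTR [esp+28]" */
    GAS_OFFSET_FLAT   /* e.g. "OFFSET FLAT:.LC0" */
};

/* Reports the first gas-specific construct found in one line of output. */
enum gas_feature classify_line(const char *line)
{
    size_t len;

    assert(line != NULL);

    /* Skip leading indentation. */
    while (*line == ' ' || *line == '\t')
        ++line;

    /* Ignore trailing whitespace and newlines. */
    len = strlen(line);
    while (len > 0 && strchr(" \t\r\n", line[len - 1]) != NULL)
        --len;

    /* A leading dot that is not a label such as ".L2:" marks a directive. */
    if (len > 0 && line[0] == '.' && line[len - 1] != ':')
        return GAS_DIRECTIVE;
    if (strstr(line, "OFFSET FLAT:") != NULL)
        return GAS_OFFSET_FLAT;
    if (strstr(line, " PTR ") != NULL)
        return GAS_PTR_KEYWORD;
    return GAS_NONE;
}
```

Running every line of main.asm through such a check shows that most of the file needs attention, which matches the wall of NASM errors. Simply deleting the flagged keywords is not enough: for instance, a gas operand like DWORD PTR stdout is a memory access, which NASM only expresses with square brackets.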
Related
I have been trying to use stat in NASM to get file sizes. However, st_size returns 0. Can anyone explain why this happens?
Here is my code:
global _main
extern _printf
section .bss
stat resb 144
section .text
filename:
db "test.asm", 0 ; The name of this NASM file
format:
db "%lld", 10, 0
_main:
mov rax, 0x20000bc ; system call for stat
mov rdi, filename
mov rsi, stat
syscall ; returns 0
push rax
mov rdi, format
mov rsi, stat
mov rsi, [rsi + 96] ; the offset of st_size in __DARWIN_STRUCT_STAT64 as defined in <sys/stat.h> is 96
call _printf
pop rax
ret
This is not a duplicate of Get file size with stat syscall
You're using the wrong syscall. That's the one for backward compatibility with the 32-bit-sized structure. Of course, that means that the st_size field is not at the offset your code is expecting.
The stat() function's symbol name is not _stat, by default, since 10.6. Rather, it's _stat$INODE64. If you look at the assembly for that in /usr/lib/system/libsystem_kernel.dylib, you'll find that it uses the syscall value 0x2000152.
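A hedged way to check the numbers before hardcoding them in assembly is to ask the C compiler on the same machine. The sketch below assumes a POSIX system with <sys/stat.h>; the function names file_size and print_stat_layout are made up for this example. It goes through the C library's stat(), so it reports the layout of whichever struct stat the library links against by default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns the size of `path` in bytes, or -1 if stat() fails. */
long long file_size(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Prints the buffer size and the st_size offset for this platform's struct stat. */
void print_stat_layout(void)
{
    printf("sizeof(struct stat) = %zu\n", sizeof(struct stat));
    printf("offsetof(st_size)   = %zu\n", offsetof(struct stat, st_size));
}
```

If the values printed by print_stat_layout() agree with the 144 and 96 used in the assembly, the layout assumption is fine and the syscall number is the remaining suspect, as described above.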
I'm trying to create a DLL using VS 2017.
The DLL will have one procedure: symbol_count.
It asks the user to enter a string and then the symbol whose occurrences should be counted.
.def file
LIBRARY name
EXPORTS
symbol_count
Code:
.586
.model flat, stdcall
option casemap: none
include C:\masm32\include\windows.inc
include C:\masm32\include\user32.inc
include C:\masm32\include\msvcrt.inc
includelib C:\masm32\lib\msvcrt.lib
includelib C:\masm32\lib\user32.lib
.data
msg_string db 'Enter string: ', 0
msg_symbol db 'Enter symbol: ', 0
result db 'Count = %d', 0
str_modifier db '%s', 0
sym_modifier db '%c', 0
.data
string db ?
symbol db ?
DllEntry PROC hInstDLL:DWORD, reason:DWORD, reserved:DWORD
mov eax, 1
ret
DllEntry ENDP
symbol_count PROC
invoke crt_printf, OFFSET msg_string
invoke crt_scanf, OFFSET str_modifier, OFFSET string
invoke crt_printf, OFFSET msg_symbol
invoke crt_scanf, OFFSET sym_modifier, OFFSET symbol
xor esi, esi
xor ecx, ecx
mov ebx, OFFSET string
mov ecx, eax
mov al, symbol
loop1: <------------------------------------------ A2108
cmp byte ptr [ebx + ecx], 0
je endloop <------------------------------ A2107
cmp al, byte ptr [ebx + ecx]
jne next <-------------------------------- A2107
inc esi
next: <------------------------------------------- A2108
inc ecx
jmp loop1 <------------------------------- A2107
endloop: <---------------------------------------- A2108
invoke crt_printf, OFFSET result, esi
ret
symbol_count ENDP
End DllEntry
Here is the list of error messages the assembler gives me
(in the code, I marked the places where it complains):
A2108 use of register assumed to ERROR
A2108 use of register assumed to ERROR
A2108 use of register assumed to ERROR
A2107 cannot have implicit far jump or call to near label
A2107 cannot have implicit far jump or call to near label
A2107 cannot have implicit far jump or call to near label
procedure argument or local not referenced : hInstDLL } all this points
procedure argument or local not referenced : reason } to DllEntry ENDP
procedure argument or local not referenced : reserved }
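For reference, the counting that the loop in symbol_count is meant to perform can be sketched in C as follows. This is only an illustration of the intended logic, not a fix for the assembler errors; the name count_symbol is hypothetical. Note that the sketch starts the index at 0, whereas the posted code copies eax (the return value of crt_scanf) into ecx just before the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many times `symbol` occurs in the NUL-terminated `text`. */
size_t count_symbol(const char *text, char symbol)
{
    size_t count = 0;  /* plays the role of esi */
    size_t i;          /* plays the role of ecx */

    assert(text != NULL);
    for (i = 0; text[i] != '\0'; ++i) {
        if (text[i] == symbol)
            ++count;
    }
    return count;
}
```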
"You put your code into the .data section which may or may not cause some of the errors. The last 3 should just be warnings as you don't use the arguments." – @Jester
I am trying to print a character received as a parameter by a function.
My function is declared as follows:
STD_OUTPUT_HANDLE equ -11
NULL equ 0
global _print
extern _ExitProcess@4, _GetStdHandle@4, _WriteConsoleA@20
section .data
msg db 'a', 13, 10, 0
msg.len equ $ - msg
section .bss
dummy resd 1
section .text
_print:
;Prologue
push ebp
mov ebp, esp
mov edx, [ebp + 4]
push STD_OUTPUT_HANDLE
call _GetStdHandle@4
push NULL
push dummy
push 1
push edx
push eax
call _WriteConsoleA@20
;Epilogue
mov esp, ebp
pop ebp
push NULL
call _ExitProcess@4
And my calling function is declared as follows:
global _main
extern _print
section .data
msg db 'c', 13, 10, 0
msg.len equ $ - msg
section .text
_main:
;Prologue
push ebp
mov ebp, esp
push msg
call _print
;Epilogue
mov esp, ebp
pop ebp
I am expecting 'c' as output in my console, but this is not working. The method printing the message works when I pass the variable msg (the one defined as 'a'), to WriteConsole. So I am guessing the problem is in passing the parameter or reading it from the stack.
I am using NASM to assemble and gcc to link, with Intel syntax, on a 32-bit Windows platform.
I use these commands to assemble and link:
nasm -fwin32 tiny.asm
nasm -fwin32 tiny_print_char.asm
gcc tiny_print_char.obj tiny.obj -m32 --enable-stdcall-fixup -nostdlib c:\windows\system32\kernel32.dll -lkernel32
Can anyone help me?
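One thing worth checking in _print is which stack slot [ebp + 4] actually refers to. With the usual 32-bit prologue (push ebp, then mov ebp, esp), [ebp] holds the saved ebp, [ebp + 4] holds the return address pushed by call, and [ebp + 8] holds the first argument. The C sketch below only models that layout with an array. The struct and function names are hypothetical and the addresses are made-up values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit stack: 4-byte slots, growing toward lower indices. */
enum { STACK_SLOTS = 16 };

struct stack_model {
    uint32_t slot[STACK_SLOTS];
    int esp;  /* index of the top-of-stack slot */
    int ebp;  /* index captured by "mov ebp, esp" */
};

static void push(struct stack_model *s, uint32_t value)
{
    assert(s->esp > 0);
    s->slot[--s->esp] = value;
}

/* Reads the dword at [ebp + byte_offset]. */
uint32_t read_ebp_relative(const struct stack_model *s, int byte_offset)
{
    int index = s->ebp + byte_offset / 4;

    assert(byte_offset % 4 == 0 && index >= 0 && index < STACK_SLOTS);
    return s->slot[index];
}

/* Replays "push msg / call _print / push ebp / mov ebp, esp". */
struct stack_model enter_print(uint32_t msg_address,
                               uint32_t return_address,
                               uint32_t caller_ebp)
{
    struct stack_model s = { {0}, STACK_SLOTS, 0 };

    push(&s, msg_address);     /* caller: push msg */
    push(&s, return_address);  /* call pushes the return address */
    push(&s, caller_ebp);      /* callee: push ebp */
    s.ebp = s.esp;             /* callee: mov ebp, esp */
    return s;
}
```

In this model, read_ebp_relative(&s, 4) yields the return address and read_ebp_relative(&s, 8) yields the address of msg, which suggests that _print should load its parameter with mov edx, [ebp + 8].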
I have the following code:
%include "io.inc"
section .data
msg db 'Hello World...$'
section .text
global CMAIN
CMAIN:
;write your code here
mov ah,09
mov dx,OFFSET msg
int 21h
xor eax, eax
xor dx,dx
ret
and it produces the following error:
[19:28:32] Warning! Errors have occurred in the build:
C:/Users/user/AppData/Local/Temp/SASM/program.asm:12: error: comma, colon, decorator or end of line expected after operand
gcc.exe: error: C:/Users/user/AppData/Local/Temp/SASM/program.o: No such file or directory
What is the problem? I'm using the SASM IDE.
This is TASM/MASM syntax:
mov dx,OFFSET msg
When using NASM you'd simply write:
mov dx,msg
I have some code generated by gcc with the options -march=native -mtune=native -mfpmath=sse -O3 -ffast-math -masm=intel -S -fverbose-asm, on Core i7 930. Here's an excerpt of the code:
mov esi, DWORD PTR [ebp-52] # batmp.271, %sfp
mov eax, DWORD PTR [ebp-28] #, %sfp
add esi, edi # batmp.271,
add eax, edi #,
mov ecx, DWORD PTR [ebp-108] #, %sfp
...
cmp DWORD PTR [ebp-100], eax # %sfp, D.48541
What are batmp.XXX, %sfp and D.XXXXX here? What do these abbreviations stand for, and what do the terms mean?
It would be easier to tell if you provided the C source for reference.
Apparently batmp is "base address temporary" used for array accesses. %sfp is used as base address for registers spilled to the stack. Unfortunately the compiler doesn't tell us what it spilled even if it is a named local variable, according to my tests. D.x is just a general notation meaning "declaration with uid x". If it doesn't have a name then it's probably a compiler generated helper variable.