Assembly code of malloc

Assembly code of malloc - debugging

I want to view the assembly code of malloc(), calloc() and free() but when I print the assembly code on radare2 it gives me the following code:
push rbp
mov rbp, rsp
sub rsp, 0x10
mov eax, 0xc8
mov edi, eax
call sym.imp.malloc
xor ecx, ecx
mov qword [local_8h], rax
mov eax, ecx
add rsp, 0x10
pop rbp
ret
How can I see sym.imp.malloc function code? Is there any way to see the code or any website to see the assembly?

Since libc is an open-source library, it is freely available and you can simply read the source code.
The source-code of malloc is available on many places online (example), and you can view the source of different versions of libc under malloc/malloc.c here.
The symbol sym.imp.malloc is how radare flags the address of malloc in the PLT (Procedure Linkage Table) and not the function itself.
Reading the Assembly of the function can be done in several ways:
Open your local libc library with radare2, seek to malloc, analyze the function and then print its disassmbly:
$ r2 /usr/lib/libc.so.6
[0x00020630]> s sym.malloc
[0x0007c620]> af
[0x0007c620]> pdf
If you want to see malloc when linked to another binary you need to open the binary in debug mode, then step to main to make it load the library, then search for the address of malloc, seek to it, analyze the function and print the disassembly:
$ r2 -d /bin/ls
Process with PID 20540 started...
= attach 20540 20540
bin.baddr 0x00400000
Using 0x400000
Assuming filepath /bin/ls
asm.bits 64
[0x7fa764841d80]> dcu main
Continue until 0x004028b0 using 1 bpsize
hit breakpoint at: 4028b0
[0x004028b0]> dmi libc malloc~name=malloc$
vaddr=0x7fa764315620 paddr=0x0007c620 ord=4162 fwd=NONE sz=388 bind=LOCAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=5225 fwd=NONE sz=388 bind=LOCAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=5750 fwd=NONE sz=388 bind=GLOBAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=7013 fwd=NONE sz=388 bind=GLOBAL type=FUNC name=malloc
[0x004028b0]> s 0x7fa764315620
[0x7fa764315620]> af
[0x7fa764315620]> pdf

Related

Interruption service in assembler (int 21h) and it's behavior (w/OllyDbg) [duplicate]

I wanted to write something basic in assembly under Windows. I'm using NASM, but I can't get anything working.
How do I write and compile a hello world program without the help of C functions on Windows?

This example shows how to go directly to the Windows API and not link in the C Standard Library.
global _main
extern _GetStdHandle#4
extern _WriteFile#20
extern _ExitProcess#4
section .text
_main:
; DWORD bytes;
mov ebp, esp
sub esp, 4
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle#4
mov ebx, eax
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push (message_end - message)
push message
push ebx
call _WriteFile#20
; ExitProcess(0)
push 0
call _ExitProcess#4
; never here
hlt
message:
db 'Hello, World', 10
message_end:
To compile, you'll need NASM and LINK.EXE (from Visual studio Standard Edition)
nasm -fwin32 hello.asm
link /subsystem:console /nodefaultlib /entry:main hello.obj

NASM examples.
Calling libc stdio printf, implementing int main(){ return printf(message); }
; ----------------------------------------------------------------------------
; helloworld.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits. It needs to be linked with a C library.
; ----------------------------------------------------------------------------
global _main
extern _printf
section .text
_main:
push message
call _printf
add esp, 4
ret
message:
db 'Hello, World', 10, 0
Then run
nasm -fwin32 helloworld.asm
gcc helloworld.obj
a
There's also The Clueless Newbies Guide to Hello World in Nasm without the use of a C library. Then the code would look like this.
16-bit code with MS-DOS system calls: works in DOS emulators or in 32-bit Windows with NTVDM support. Can't be run "directly" (transparently) under any 64-bit Windows, because an x86-64 kernel can't use vm86 mode.
org 100h
mov dx,msg
mov ah,9
int 21h
mov ah,4Ch
int 21h
msg db 'Hello, World!',0Dh,0Ah,'$'
Build this into a .com executable so it will be loaded at cs:100h with all segment registers equal to each other (tiny memory model).
Good luck.

These are Win32 and Win64 examples using Windows API calls. They are for MASM rather than NASM, but have a look at them. You can find more details in this article.
This uses MessageBox instead of printing to stdout.
Win32 MASM
;---ASM Hello World Win32 MessageBox
.386
.model flat, stdcall
include kernel32.inc
includelib kernel32.lib
include user32.inc
includelib user32.lib
.data
title db 'Win32', 0
msg db 'Hello World', 0
.code
Main:
push 0 ; uType = MB_OK
push offset title ; LPCSTR lpCaption
push offset msg ; LPCSTR lpText
push 0 ; hWnd = HWND_DESKTOP
call MessageBoxA
push eax ; uExitCode = MessageBox(...)
call ExitProcess
End Main
Win64 MASM
;---ASM Hello World Win64 MessageBox
extrn MessageBoxA: PROC
extrn ExitProcess: PROC
.data
title db 'Win64', 0
msg db 'Hello World!', 0
.code
main proc
sub rsp, 28h
mov rcx, 0 ; hWnd = HWND_DESKTOP
lea rdx, msg ; LPCSTR lpText
lea r8, title ; LPCSTR lpCaption
mov r9d, 0 ; uType = MB_OK
call MessageBoxA
add rsp, 28h
mov ecx, eax ; uExitCode = MessageBox(...)
call ExitProcess
main endp
End
To assemble and link these using MASM, use this for 32-bit executable:
ml.exe [filename] /link /subsystem:windows
/defaultlib:kernel32.lib /defaultlib:user32.lib /entry:Main
or this for 64-bit executable:
ml64.exe [filename] /link /subsystem:windows
/defaultlib:kernel32.lib /defaultlib:user32.lib /entry:main
Why does x64 Windows need to reserve 28h bytes of stack space before a call? That's 32 bytes (0x20) of shadow space aka home space, as required by the calling convention. And another 8 bytes to re-align the stack by 16, because the calling convention requires RSP be 16-byte aligned before a call. (Our main's caller (in the CRT startup code) did that. The 8-byte return address means that RSP is 8 bytes away from a 16-byte boundary on entry to a function.)
Shadow space can be used by a function to dump its register args next to where any stack args (if any) would be. A system call requires 30h (48 bytes) to also reserve space for r10 and r11 in addition to the previously mentioned 4 registers. But DLL calls are just function calls, even if they're wrappers around syscall instructions.
Fun fact: non-Windows, i.e. the x86-64 System V calling convention (e.g. on Linux) doesn't use shadow space at all, and uses up to 6 integer/pointer register args, and up to 8 FP args in XMM registers.
Using MASM's invoke directive (which knows the calling convention), you can use one ifdef to make a version of this which can be built as 32-bit or 64-bit.
ifdef rax
extrn MessageBoxA: PROC
extrn ExitProcess: PROC
else
.386
.model flat, stdcall
include kernel32.inc
includelib kernel32.lib
include user32.inc
includelib user32.lib
endif
.data
caption db 'WinAPI', 0
text db 'Hello World', 0
.code
main proc
invoke MessageBoxA, 0, offset text, offset caption, 0
invoke ExitProcess, eax
main endp
end
The macro variant is the same for both, but you won't learn assembly this way. You'll learn C-style asm instead. invoke is for stdcall or fastcall while cinvoke is for cdecl or variable argument fastcall. The assembler knows which to use.
You can disassemble the output to see how invoke expanded.

To get an .exe with NASM as the assembler and Visual Studio's linker this code works fine:
default rel ; Use RIP-relative addressing like [rel msg] by default
global WinMain
extern ExitProcess ; external functions in system libraries
extern MessageBoxA
section .data
title: db 'Win64', 0
msg: db 'Hello world!', 0
section .text
WinMain:
sub rsp, 28h ; reserve shadow space and make RSP%16 == 0
mov rcx, 0 ; hWnd = HWND_DESKTOP
lea rdx,[msg] ; LPCSTR lpText
lea r8,[title] ; LPCSTR lpCaption
mov r9d, 0 ; uType = MB_OK
call MessageBoxA
mov ecx,eax ; exit status = return value of MessageBoxA
call ExitProcess
add rsp, 28h ; if you were going to ret, restore RSP
hlt ; privileged instruction that crashes if ever reached.
If this code is saved as test64.asm, then to assemble:
nasm -f win64 test64.asm
Produces test64.obj
Then to link from command prompt:
path_to_link\link.exe test64.obj /subsystem:windows /entry:WinMain /libpath:path_to_libs /nodefaultlib kernel32.lib user32.lib /largeaddressaware:no
where path_to_link could be C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin or wherever is your link.exe program in your machine,
path_to_libs could be C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x64 or wherever are your libraries (in this case both kernel32.lib and user32.lib are on the same place, otherwise use one option for each path you need) and the /largeaddressaware:no option is necessary to avoid linker's complain about addresses to long (for user32.lib in this case).
Also, as it is done here, if Visual's linker is invoked from command prompt, it is necessary to setup the environment previously (run once vcvarsall.bat and/or see MS C++ 2010 and mspdb100.dll).
(Using default rel makes the lea instructions work from anywhere, including outside the low 2GiB of virtual address space. But the call MessageBoxA is still a direct call rel32 that can only reach instructions +-2GiB away from itself.)

Flat Assembler does not need an extra linker. This makes assembler programming quite easy. It is also available for Linux.
This is hello.asm from the Fasm examples:
include 'win32ax.inc'
.code
start:
invoke MessageBox,HWND_DESKTOP,"Hi! I'm the example program!",invoke GetCommandLine,MB_OK
invoke ExitProcess,0
.end start
Fasm creates an executable:
>fasm hello.asm
flat assembler version 1.70.03 (1048575 kilobytes memory)
4 passes, 1536 bytes.
And this is the program in IDA:
You can see the three calls: GetCommandLine, MessageBox and ExitProcess.

If you want to use NASM and Visual Studio's linker (link.exe) with anderstornvig's Hello World example you will have to manually link with the C Runtime Libary that contains the printf() function.
nasm -fwin32 helloworld.asm
link.exe helloworld.obj libcmt.lib
Hope this helps someone.

Unless you call some function this is not at all trivial. (And, seriously, there's no real difference in complexity between calling printf and calling a win32 api function.)
Even DOS int 21h is really just a function call, even if its a different API.
If you want to do it without help you need to talk to your video hardware directly, likely writing bitmaps of the letters of "Hello world" into a framebuffer. Even then the video card is doing the work of translating those memory values into DisplayPort/HDMI/DVI/VGA signals.
Note that, really, none of this stuff all the way down to the hardware is any more interesting in ASM than in C. A "hello world" program boils down to a function call. One nice thing about ASM is that you can use any ABI you want fairly easily; you just need to know what that ABI is.

The best examples are those with fasm, because fasm doesn't use a linker, which hides the complexity of windows programming by another opaque layer of complexity.
If you're content with a program that writes into a gui window, then there is an example for that in fasm's example directory.
If you want a console program, that allows redirection of standard in and standard out that is also possible.
There is a (helas highly non-trivial) example program available that doesn't use a gui, and works strictly with the console, that is fasm itself. This can be thinned out to the essentials. (I've written a forth compiler which is another non-gui example, but it is also non-trivial).
Such a program has the following command to generate a proper header for 32-bit executable, normally done by a linker.
FORMAT PE CONSOLE
A section called '.idata' contains a table that helps windows during startup to couple names of functions to the runtimes addresses. It also contains a reference to KERNEL.DLL which is the Windows Operating System.
section '.idata' import data readable writeable
dd 0,0,0,rva kernel_name,rva kernel_table
dd 0,0,0,0,0
kernel_table:
_ExitProcess#4 DD rva _ExitProcess
CreateFile DD rva _CreateFileA
...
...
_GetStdHandle#4 DD rva _GetStdHandle
DD 0
The table format is imposed by windows and contains names that are looked up in system files, when the program is started. FASM hides some of the
complexity behind the rva keyword. So _ExitProcess#4 is a fasm label and _exitProcess is a string that is looked up by Windows.
Your program is in section '.text'. If you declare that section readable writeable and executable, it is the only section you need to add.
section '.text' code executable readable writable
You can call all the facilities you declared in the .idata section. For a console program you need _GetStdHandle to find he filedescriptors for standard in and standardout (using symbolic names like STD_INPUT_HANDLE which fasm finds in the include file win32a.inc).
Once you have the file descriptors you can do WriteFile and ReadFile.
All functions are described in the kernel32 documentation. You are probably aware of that or you wouldn't try assembler programming.
In summary: There is a table with asci names that couple to the windows OS.
During startup this is transformed into a table of callable addresses, which you use in your program.

For ARM Windows:
AREA data, DATA
Text DCB "Hello world(text)", 0x0
Caption DCB "Hello world(caption)", 0x0
EXPORT WinMainCRTStartup
IMPORT __imp_MessageBoxA
IMPORT __imp_ExitProcess
AREA text, CODE
WinMainCRTStartup PROC
movs r3,#0
ldr r2,Caption_ptr
ldr r1,Text_ptr
movs r0,#0
ldr r4,MessageBoxA_ptr # nearby, reachable with PC-relative
ldr r4,[r4]
blx r4
movs r0,#0
ldr r4,ExitProcess_ptr
ldr r4,[r4]
blx r4
MessageBoxA_ptr DCD __imp_MessageBoxA # literal pool (constants near code)
ExitProcess_ptr DCD __imp_ExitProcess
Text_ptr DCD Text
Caption_ptr DCD Caption
ENDP
END

Why "mov rcx, rax" is required when calling printf in x64 assembler?

I am trying to learn x64 assembler. I wrote "hello world" and tried to call printf using the following code:
EXTERN printf: PROC
PUBLIC hello_world_asm
.data
hello_msg db "Hello world", 0
.code
hello_world_asm PROC
push rbp ; save frame pointer
mov rbp, rsp ; fix stack pointer
sub rsp, 8 * (4 + 2) ; shadow space (32bytes)
lea rax, offset hello_msg
mov rcx, rax ; <---- QUESTION ABOUT THIS LINE
call printf
; epilog. restore stack pointer
mov rsp, rbp
pop rbp
ret
hello_world_asm ENDP
END
At the beginning I called printf without "mov rcx, rax", which ended up with access violation. Getting all frustrated I just wrote in C++ a call to printf and looked in the disassembler. There I saw the line "mov rcx, rax" which fixed everything, but WHY do I need to move RAX to RCX ??? Clearly I am missing something fundamental.
Thanks for your help!
p.s. a reference to good x64 assembler tutorial is more than welcome :-) couldn't find one.

It isn't required, this code just wastes an instruction by doing an lea into RAX and then copying to RCX, when it could do
lea rcx, hello_msg
call printf ; printf(rcx, rdx, r8, r9, stack...)
printf on 64-bit Windows ignores RAX as an input; RAX is the return-value register in the Windows x64 calling convention (and can also be clobbered by void functions). The first 4 args go in RCX, RDX, R8, and R9 (if they're integer/pointer like here).
Also note that FP args in xmm0..3 have to be mirrored to the corresponding integer register for variadic functions like printf (MS's docs), but for integer args it's not required to movq xmm0, rcx.
In the x86-64 System V calling convention, variadic functions want al = the number of FP args passed in registers. (So you'd xor eax,eax to zero it). But the x64 Windows convention doesn't need that; it's optimized to make variadic functions easy to implement (instead of for higher performance / more register args for normal functions).
A few 32-bit calling conventions pass an arg in EAX, for example Irvine32, or gcc -m32 -mregparm=1. But no standard x86-64 calling conventions do. You can do whatever you like with private asm functions you write, but you have to follow the standard calling conventions when calling library functions.
Also note that lea rax, offset hello_msg was weird; LEA uses memory-operand syntax and machine encoding (and gives you the address instead of the data). offset hello_msg is an immediate, not a memory operand. But MASM accepts it as a memory operand anyway in this context.
You could use mov ecx, offset hello_msg in position-dependent code, otherwise you want a RIP-relative LEA. I'm not sure of the MASM syntax for that.

The Windows 64-bit (x64/AMD64) calling convention passes the first four integer arguments in RCX, RDX, R8 and R9.
The return value is stored in RAX and it is volatile so a C/C++ compiler is allowed to use it as generic storage in a function.

Incremental linking causes unexpected disassembly for MASM program

A while ago I posted this question regarding strange behavior I was experiencing in trying to step through a MASM program.
Essentially, given the following code:
; Tell MASM to use the Intel 80386 instruction set.
.386
; Flat memory model, and Win 32 calling convention
.MODEL FLAT, STDCALL
; Treat labels as case-sensitive (required for windows.inc)
OPTION CaseMap:None
include windows.inc
include masm32.inc
include user32.inc
include kernel32.inc
include macros.asm
includelib masm32.lib
includelib user32.lib
includelib kernel32.lib
.DATA
BadText db "Error...", 0
GoodText db "Excellent!", 0
.CODE
main PROC
int 3
mov eax, 6
xor eax, eax
_label: add eax, ecx
dec ecx
jnz _label
cmp eax, 21
jz _good
_bad: invoke StdOut, addr BadText
jmp _quit
_good: invoke StdOut, addr GoodText
_quit: invoke ExitProcess, 0
main ENDP
END main
I could not get the int 3 instruction to trigger. It was clear why it didn't, examining the disassembly:
00400FFD add byte ptr [eax],al
00400FFF add ah,cl
--- [User path]\main.asm
mov eax, 6
00401001 mov eax,6
xor eax, eax
00401006 xor eax,eax
_label: add eax, ecx
The int 3 instruction had been replaced with add al,cl, but I had no idea why. I managed to track the problem to whether or not Incremental Linking was enabled. The above disassembly was generated with Incremental Linking disabled (/INCREMENTAL:NO option on the command line). Re-enabling it would result in something like the following:
.CODE
main PROC
int 3
00401010 int 3
mov eax, 6
00401011 mov eax,6
xor eax, eax
00401016 xor eax,eax
I should note that the interleaving lines are references back to the original code (I guess a feature of Visual Studio's disassembly window). With Incremental Linking enabled, the disassembly corresponds exactly to what I had written in the program, which is how I expected it to behave all along.
So, why would disabling Incremental Linking cause the disassembly of my program to be altered? What could be happening behind the scenes that would actually alter how the program executes?

The "add" instruction is a two byte instruction, the second of which is the 1-byte opcode of your int3. The first byte of the two byte add instruction is probably some garbage just before the entrypoint. The address of the add instruction is probably 1 byte before where the int3 instruction would be.
I quickly assembled and then disassembled those two instructions with GNU as en objdump, and the result is:
8: 00 cc add %cl,%ah
a: cc int3
Here you can clearly see that the the add instruction contains the second byte 0xcc, while int3 is 0xcc
IOW make sure that you start disassembling on the entry point to avoid this problem.

x64 nasm: pushing memory addresses onto the stack & call function

I'm pretty new to x64-assembly on the Mac, so I'm getting confused porting some 32-bit code in 64-bit.
The program should simply print out a message via the printf function from the C standart library.
I've started with this code:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
push msg
call _printf
mov rsp, rbp
pop rbp
ret
Compiling it with nasm this way:
$ nasm -f macho64 main.s
Returned following error:
main.s:12: error: Mach-O 64-bit format does not support 32-bit absolute addresses
I've tried to fix that problem byte changing the code to this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
mov rax, msg ; shouldn't rax now contain the address of msg?
push rax ; push the address
call _printf
mov rsp, rbp
pop rbp
ret
It compiled fine with the nasm command above but now there is a warning while compiling the object file with gcc to actual program:
$ gcc main.o
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not
allowed in code signed PIE, but used in _main from main.o. To fix this warning,
don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
Since it's a warning not an error I've executed the a.out file:
$ ./a.out
Segmentation fault: 11
Hope anyone knows what I'm doing wrong.

The 64-bit OS X ABI complies at large to the System V ABI - AMD64 Architecture Processor Supplement. Its code model is very similar to the Small position independent code model (PIC) with the differences explained here. In that code model all local and small data is accessed directly using RIP-relative addressing. As noted in the comments by Z boson, the image base for 64-bit Mach-O executables is beyond the first 4 GiB of the virtual address space, therefore push msg is not only an invalid way to put the address of msg on the stack, but it is also an impossible one since PUSH does not support 64-bit immediate values. The code should rather look similar to:
; this is what you *would* do for later args on the stack
lea rax, [rel msg] ; RIP-relative addressing
push rax
But in that particular case one needs not push the value on the stack at all. The 64-bit calling convention mandates that the fist 6 integer/pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9, exactly in that order. The first 8 floating-point or vector arguments go into XMM0, XMM1, ..., XMM7. Only after all the available registers are used or there are arguments that cannot fit in any of those registers (e.g. a 80-bit long double value) the stack is used. 64-bit immediate pushes are performed using MOV (the QWORD variant) and not PUSH. Simple return values are passed back in the RAX register. The caller must also provide stack space for the callee to save some of the registers.
printf is a special function because it takes variable number of arguments. When calling such functions AL (the low byte of RAX) should be set to the number of floating-point arguments, passed in the vector registers. Also note that RIP-relative addressing is preferred for data that lies within 2 GiB of the code.
Here is how gcc translates printf("This is a test\n"); into assembly on OS X:
xorl %eax, %eax # (1)
leaq L_.str(%rip), %rdi # (2)
callq _printf # (3)
L_.str:
.asciz "This is a test\n"
(this is AT&T style assembly, source is left, destination is right, register names are prefixed with %, data width is encoded as a suffix to the instruction name)
At (1) zero is put into AL (by zeroing the whole RAX which avoids partial-register delays) since no floating-point arguments are being passed. At (2) the address of the string is loaded in RDI. Note how the value is actually an offset from the current value of RIP. Since the assembler doesn't know what this value would be, it puts a relocation request in the object file. The linker then sees the relocation and puts the correct value at link time.
I am not a NASM guru, but I think the following code should do it:
default rel ; make [rel msg] the default for [msg]
section .data
msg: db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp ; re-aligns the stack by 16 before call
mov rbp, rsp
xor eax, eax ; al = 0 FP args in XMM regs
lea rdi, [rel msg]
call _printf
mov rsp, rbp
pop rbp
ret

No answer yet has explained why NASM reports
Mach-O 64-bit format does not support 32-bit absolute addresses
The reason NASM won't do this is explained in Agner Fog's Optimizing Assembly manual in section 3.3 Addressing modes under the subsection titled 32-bit absolute addressing in 64 bit mode he writes
32-bit absolute addresses cannot be used in Mac OS X, where addresses are above 2^32 by
default.
This is not a problem on Linux or Windows. In fact I already showed this works at static-linkage-with-glibc-without-calling-main. That hello world code uses 32-bit absolute addressing with elf64 and runs fine.
#HristoIliev suggested using rip relative addressing but did not explain that 32-bit absolute addressing in Linux would work as well. In fact if you change lea rdi, [rel msg] to lea rdi, [msg] it assembles and runs fine with nasm -efl64 but fails with nasm -macho64
Like this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
xor al, al
lea rdi, [msg]
call _printf
mov rsp, rbp
pop rbp
ret
You can check that this is an absolute 32-bit address and not rip relative with objdump. However, it's important to point out that the preferred method is still rip relative addressing. Agner in the same manual writes:
There is absolutely no reason to use absolute addresses for simple memory operands. Rip-
relative addresses make instructions shorter, they eliminate the need for relocation at load
time, and they are safe to use in all systems.
So when would use use 32-bit absolute addresses in 64-bit mode? Static arrays is a good candidate. See the following subsection Addressing static arrays in 64 bit mode. The simple case would be e.g:
mov eax, [A+rcx*4]
where A is the absolute 32-bit address of the static array. This works fine with Linux but once again you can't do this with Mac OS X because the image base is larger than 2^32 by default. To to this on Mac OS X see example 3.11c and 3.11d in Agner's manual. In example 3.11c you could do
mov eax, [(imagerel A) + rbx + rcx*4]
Where you use the extern reference from Mach O __mh_execute_header to get the image base. In example 3.11c you use rip relative addressing and load the address like this
lea rbx, [rel A]; rel tells nasm to do [rip + A]
mov eax, [rbx + 4*rcx] ; A[i]

According to the documentation for the x86 64bit instruction set http://download.intel.com/products/processor/manual/325383.pdf
PUSH only accepts 8, 16 and 32bit immediate values (64bit registers and register addressed memory blocks are allowed though).
PUSH msg
Where msg is a 64bit immediate address will not compile as you found out.
What calling convention is _printf defined as in your 64bit library?
Is it expecting the parameter on the stack or using a fast-call convention where the parameters on in registers? Because x86-64 makes more general purpose registers available the fast-call convention is used more often.

Segmentation fault in assembly program

I am trying to spawn a shell using the following code:
Section .Text
global _start
_start:
jmp short TrickCall
_ReturnHere:
pop esi
xor eax,eax
mov byte [esi+7],al
lea ebx,[esi]
mov long [esi+8],ebx
mov long [esi+12],eax
mov byte al,0x0b
mov ebx,esi
lea ecx,[esi+8]
lea edx,[esi+12]
int 0x80
TrickCall:
call _ReturnHere
db "/bin/shJAAAANNNN"
I am using gcc version 4.4.3 as my compiler. When I run it using gdb it gives the following output:
(gdb) run
Starting program: /root/spawn_shell
Program received signal SIGSEGV, Segmentation fault.
0x08048059 in _ReturnHere ()
It cannot access the memory address of _ReturnHere. Any way to get around this?

Your problem is DEP, when you pop the return address off the stack and try to write to it, its not marked as writable, only readable & executable. You either need to disable DEP (bad, its meant to protect against exploits that do something like this) or put the text just after call _ReturnHere into a RW(X) memory.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio