I believe GCC is producing incorrect machine code - gcc

I am trying to compile this dead simple program:
int print(int x, int y)
{
return x * y;
}
int main()
{
return print(8, 7);
}
with this command: gcc -c -nostdinc -m32 -masm=intel main.c -O0
The file produced (main.o) has the following object dump:
$ objdump -d main.o
main.o: file format elf32-i386
Disassembly of section .text:
00000000 <print>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: e8 fc ff ff ff call 4 <print+0x4>
8: 05 01 00 00 00 add $0x1,%eax
d: 8b 45 08 mov 0x8(%ebp),%eax
10: 0f af 45 0c imul 0xc(%ebp),%eax
14: 5d pop %ebp
15: c3 ret
00000016 <main>:
16: 55 push %ebp
17: 89 e5 mov %esp,%ebp
19: e8 fc ff ff ff call 1a <main+0x4>
1e: 05 01 00 00 00 add $0x1,%eax
23: 6a 07 push $0x7
25: 6a 08 push $0x8
27: e8 fc ff ff ff call 28 <main+0x12>
2c: 83 c4 08 add $0x8,%esp
2f: c9 leave
30: c3 ret
Disassembly of section .text.__x86.get_pc_thunk.ax:
00000000 <__x86.get_pc_thunk.ax>:
0: 8b 04 24 mov (%esp),%eax
3: c3 ret
If I understand correctly, this line
27: e8 fc ff ff ff call 28 <main+0x12>
represents the call to print. However, the offset given is -4, which results in jumping to address 28, and there isn't even an instruction at that address. The code does run, but I have the feeling this machine code isn't quite right. (Also, why is there a call instruction in the print function, if print doesn't call anything?)

You compiled with -c, so the output is an object file, not a linked executable. It still contains placeholders for symbols that the linker will resolve and patch. Run objdump with the -r flag added to see the symbol name for each relocation. Before linking, the operand of the call holds only the relocation addend (here fc ff ff ff, i.e. -4), so objdump computes a meaningless target from it, which is the address it shows.

Related

Why does gcc generate strange code without the -fno-pie flag?

I am trying to compile a dummy function in gcc with and without the -fno-pie flag.
void dummy_test_entrypoint() { }
When I compile without the flag:
gcc -m32 -ffreestanding -c test.c -o test.o
I get the following disassembled code.
00000000 <dummy_test_entrypoint>:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: e8 fc ff ff ff call 4 <dummy_test_entrypoint+0x4>
8: 05 01 00 00 00 add eax,0x1
d: 90 nop
e: 5d pop ebp
f: c3 ret
When I compile with the flag:
00000000 <dummy_test_entrypoint>:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: 90 nop
4: 5d pop ebp
5: c3 ret
My question: what is this?
3: e8 fc ff ff ff call 4 <dummy_test_entrypoint+0x4>
8: 05 01 00 00 00 add eax,0x1
You disassembled the object file without the --reloc flag, so the output is misleading. With the --reloc flag, you'll see this:
3: e8 fc ff ff ff call 4 <dummy_test_entrypoint+0x4>
4: R_386_PC32 __x86.get_pc_thunk.ax
8: 05 01 00 00 00 add $0x1,%eax
9: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
And the subroutine looks like this:
00000000 <__x86.get_pc_thunk.ax>:
0: 8b 04 24 mov (%esp),%eax
3: c3 ret
This construct loads the GOT pointer into %eax, in case the function needs to reference global data. The function does not contain such a reference, but because you compiled the code without optimization, GCC did not remove the dead code.

Why does GCC insert a callq at the beginning of a function? [duplicate]

I know that when I run objdump -dr on my file, a call shows up in the machine code as e8 00 00 00 00 because it has not yet been linked. But I need to find out what the 00 00 00 00 will turn into after the linker has done its job. I know it should calculate the offset, but I'm a little confused about that.
As an example with the code below, after the linker part is done, how should the e8 00 00 00 00 be? And how do I get to that answer?
I'm testing out with this sample code: (I'm trying to call moo)
Disassembly of section .text:
0000000000000000 <foo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d fc mov %edi,-0x4(%rbp)
7: 8b 45 fc mov -0x4(%rbp),%eax
a: 83 e8 0a sub $0xa,%eax
d: 5d pop %rbp
e: c3 retq
000000000000000f <moo>:
f: 55 push %rbp
10: 48 89 e5 mov %rsp,%rbp
13: 89 7d fc mov %edi,-0x4(%rbp)
16: b8 01 00 00 00 mov $0x1,%eax
1b: 5d pop %rbp
1c: c3 retq
000000000000001d <main>:
1d: 55 push %rbp
1e: 48 89 e5 mov %rsp,%rbp
21: 48 83 ec 10 sub $0x10,%rsp
25: c7 45 fc 8e 0c 00 00 movl $0xc8e,-0x4(%rbp)
2c: 8b 45 fc mov -0x4(%rbp),%eax
2f: 89 c7 mov %eax,%edi
31: e8 00 00 00 00 callq 36 <main+0x19>
32: R_X86_64_PC32 moo-0x4
36: 89 45 fc mov %eax,-0x4(%rbp)
39: b8 00 00 00 00 mov $0x0,%eax
3e: c9 leaveq
3f: c3 retq
With objdump -r, relocations are printed alongside the -d disassembly:
31: e8 00 00 00 00 callq 36 <main+0x19>
32: R_X86_64_PC32 moo-0x4
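The linker fills in each relocation with the correct value once the final addresses are known. For a call between functions in the same executable, that normally happens at static link time (ld); the dynamic loader (ld-linux.so.2 and its 64-bit counterpart) handles whatever relocations remain and, on modern systems, may load even the executable itself at a random address.
Check with gdb by setting a breakpoint at main and starting the program (all of this happens before main starts):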
gdb ./program
(gdb) start
(gdb) disassemble main
If you want to compile the code without relocations, show source code and compilation options.
Object files and executable files on several architectures that I know of do not necessarily fix jump destinations at link time.
This is a feature which provides flexibility.
Jump target addresses do not have to be fixed until just before the instruction executes. They do not need to be fixed up at link time—nor even at program start time!
Most systems (Windows, Linux, Unix, VAX/VMS) tag such locations in the object code as an address which needs adjustment. There is additional information about what the target address is, what type of reference it is (such as absolute or relative; 16-bit, 24-bit, 32-bit, 64-bit, etc.).
The zero value there is not necessarily a placeholder, but the base value upon which to evaluate the result. For example, if the instruction were—for whatever reason—call 5+external_address, then there might be 5 (e8 05 00 00 00) in the object code.
If you want to see what the address is at execution time, run the program under a debugger, place a breakpoint at that instruction and then view the instruction just before it executes.
A common security feature known as ASLR (address space layout randomization) intentionally loads program sections at unpredictable addresses to thwart malicious code that alters programs or data. Programs operating in this environment may not have some target addresses assigned until after the program has run for a bit.
(Of related interest, VAX/VMS in particular has a complex fixup mode in which an equation describes the operations needed to compute a value. Operations include addition, subtraction, multiplication, division, shifting, rotating, and probably others. I never saw it actually used, but it was interesting to contemplate how one might apply the capability.)
But you clearly know how to do all of this. You know how to disassemble before linking; just disassemble after linking as well to see how the linker modifies those instructions.
asm(".globl _start; _start: nop\n");
unsigned int foo ( unsigned int x )
{
return(x+5);
}
unsigned int moo ( unsigned int x )
{
return(foo(x)+3);
}
int main ( void )
{
return(moo(3)+2);
}
Disassembly of the object file, before linking:
0000000000000000 <_start>:
0: 90 nop
0000000000000001 <foo>:
1: 55 push %rbp
2: 48 89 e5 mov %rsp,%rbp
5: 89 7d fc mov %edi,-0x4(%rbp)
8: 8b 45 fc mov -0x4(%rbp),%eax
b: 83 c0 05 add $0x5,%eax
e: 5d pop %rbp
f: c3 retq
0000000000000010 <moo>:
10: 55 push %rbp
11: 48 89 e5 mov %rsp,%rbp
14: 48 83 ec 08 sub $0x8,%rsp
18: 89 7d fc mov %edi,-0x4(%rbp)
1b: 8b 45 fc mov -0x4(%rbp),%eax
1e: 89 c7 mov %eax,%edi
20: e8 00 00 00 00 callq 25 <moo+0x15>
25: 83 c0 03 add $0x3,%eax
28: c9 leaveq
29: c3 retq
000000000000002a <main>:
2a: 55 push %rbp
2b: 48 89 e5 mov %rsp,%rbp
2e: bf 03 00 00 00 mov $0x3,%edi
33: e8 00 00 00 00 callq 38 <main+0xe>
38: 83 c0 02 add $0x2,%eax
3b: 5d pop %rbp
3c: c3 retq
Disassembly of the executable, after linking:
0000000000001000 <_start>:
1000: 90 nop
0000000000001001 <foo>:
1001: 55 push %rbp
1002: 48 89 e5 mov %rsp,%rbp
1005: 89 7d fc mov %edi,-0x4(%rbp)
1008: 8b 45 fc mov -0x4(%rbp),%eax
100b: 83 c0 05 add $0x5,%eax
100e: 5d pop %rbp
100f: c3 retq
0000000000001010 <moo>:
1010: 55 push %rbp
1011: 48 89 e5 mov %rsp,%rbp
1014: 48 83 ec 08 sub $0x8,%rsp
1018: 89 7d fc mov %edi,-0x4(%rbp)
101b: 8b 45 fc mov -0x4(%rbp),%eax
101e: 89 c7 mov %eax,%edi
1020: e8 dc ff ff ff callq 1001 <foo>
1025: 83 c0 03 add $0x3,%eax
1028: c9 leaveq
1029: c3 retq
000000000000102a <main>:
102a: 55 push %rbp
102b: 48 89 e5 mov %rsp,%rbp
102e: bf 03 00 00 00 mov $0x3,%edi
1033: e8 d8 ff ff ff callq 1010 <moo>
1038: 83 c0 02 add $0x2,%eax
103b: 5d pop %rbp
103c: c3 retq
For example, the call to moo in main, before and after linking:
33: e8 00 00 00 00 callq 38 <main+0xe>
1033: e8 d8 ff ff ff callq 1010 <moo>

time attack on bash program with usleep() inside

I have a little hackme where I have to get the password by brute force. The program calls usleep() when the password has the right length, and the delay changes when a letter is correct.
It would not be a problem, but the sleep time is about one minute and this is quite a long time.
Is there a way to make the usleep timer faster?
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs)
Method 1
You can override library functions with a LD_PRELOAD directive.
There's a good tutorial here and here to get you started with this.
Suppose you have the following program code, which is then compiled to a binary elf file.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* for usleep() */
int main(int argc, char* argv[]) {
printf("Entry point. We'll now wait 10 seconds.\n");
system("date +\"%H:%M:%S\""); //Output time
usleep(10*1000*1000);
printf("Woke up again.\n");
system("date +\"%H:%M:%S\""); //Output time
return 0;
}
Running it normally would give you
root@kali:~/so# gcc -o prog prog.c
root@kali:~/so# ./prog
Entry point. We'll now wait 10 seconds.
20:31:10
Woke up again.
20:31:20
Now write your own version of usleep().
#include <unistd.h>
#include <stdio.h>
int usleep(useconds_t usec){
printf("Nope, you're not sleeping today :)\n");
return 0;
}
Compile it as a shared library.
root@kali:~/so# gcc -Wall -fPIC -shared -o usleep_override.so usleep_override.c
Now preload that library function before executing the original program.
root@kali:~/so# LD_PRELOAD=./usleep_override.so ./prog
Entry point. We'll now wait 10 seconds.
20:35:28
Nope, you're not sleeping today :)
Woke up again.
20:35:28
As you can see from the date output, it executed the hooked function instead of the original and then immediately returned.
Method 2
Modify the binary. In particular, modify the instructions so that the usleep() function is not executed.
When we dump the instructions of the main() function of prog with objdump, we get:
root@kali:~/so# objdump -d -Mintel prog | grep -A20 "<main>"
0000000000400596 <main>:
400596: 55 push rbp
400597: 48 89 e5 mov rbp,rsp
40059a: 48 83 ec 10 sub rsp,0x10
40059e: 89 7d fc mov DWORD PTR [rbp-0x4],edi
4005a1: 48 89 75 f0 mov QWORD PTR [rbp-0x10],rsi
4005a5: bf 68 06 40 00 mov edi,0x400668
4005aa: e8 a1 fe ff ff call 400450 <puts@plt>
4005af: bf 90 06 40 00 mov edi,0x400690
4005b4: e8 a7 fe ff ff call 400460 <system@plt>
4005b9: bf 80 96 98 00 mov edi,0x989680
4005be: e8 cd fe ff ff call 400490 <usleep@plt>
4005c3: bf a2 06 40 00 mov edi,0x4006a2
4005c8: e8 83 fe ff ff call 400450 <puts@plt>
4005cd: bf 90 06 40 00 mov edi,0x400690
4005d2: e8 89 fe ff ff call 400460 <system@plt>
4005d7: b8 00 00 00 00 mov eax,0x0
4005dc: c9 leave
4005dd: c3 ret
4005de: 66 90 xchg ax,ax
We can see the offending lines that are responsible for the usleep(10*1000*1000) call:
4005b9: bf 80 96 98 00 mov edi,0x989680
4005be: e8 cd fe ff ff call 400490 <usleep@plt>
Since 0x989680 equals 10000000 in decimal, we can deduce that this is the argument to usleep(). So we can modify the binary: search for the byte sequence bf 80 96 98 00 e8 cd fe ff ff and overwrite each of its ten bytes with 0x90, the NOP instruction, which does nothing.
When we now dump the instructions:
root@kali:~/so# objdump -d -Mintel prog_cracked | grep -A28 "<main>"
0000000000400596 <main>:
400596: 55 push rbp
400597: 48 89 e5 mov rbp,rsp
40059a: 48 83 ec 10 sub rsp,0x10
40059e: 89 7d fc mov DWORD PTR [rbp-0x4],edi
4005a1: 48 89 75 f0 mov QWORD PTR [rbp-0x10],rsi
4005a5: bf 68 06 40 00 mov edi,0x400668
4005aa: e8 a1 fe ff ff call 400450 <puts@plt>
4005af: bf 90 06 40 00 mov edi,0x400690
4005b4: e8 a7 fe ff ff call 400460 <system@plt>
4005b9: 90 nop
4005ba: 90 nop
4005bb: 90 nop
4005bc: 90 nop
4005bd: 90 nop
4005be: 90 nop
4005bf: 90 nop
4005c0: 90 nop
4005c1: 90 nop
4005c2: 90 nop
4005c3: bf a2 06 40 00 mov edi,0x4006a2
4005c8: e8 83 fe ff ff call 400450 <puts@plt>
4005cd: bf 90 06 40 00 mov edi,0x400690
4005d2: e8 89 fe ff ff call 400460 <system@plt>
4005d7: b8 00 00 00 00 mov eax,0x0
4005dc: c9 leave
4005dd: c3 ret
4005de: 66 90 xchg ax,ax
Nice, the call is gone. Run and we get:
root@kali:~/so# chmod +x prog_cracked
root@kali:~/so# ./prog_cracked
Entry point. We'll now wait 10 seconds.
21:11:18
Woke up again.
21:11:18
And thus, the program is "cracked" again.

Where do the static functions go in ELF binary

I made a simple linux kernel module which has a static function. When I use objdump or nm on the .ko file, I cannot see the entry for my static function. Where did it go?
Thanks.
Edit: Adding code
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
int p;
int q;
module_param(p, int, S_IRUGO);
module_param(q, int, S_IRUGO);
static int add(int x, int y)
{
return x + y;
}
static int __init hello_init(void)
{
int res;
res = add(p, q);
return res;
}
static void __exit hello_cleanup(void)
{
}
module_init(hello_init);
module_exit(hello_cleanup);
MODULE_VERSION("dev");
MODULE_LICENSE("Proprietary");
objdump output with non-static function:
Disassembly of section .text:
0000000000000000 <add>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: e8 00 00 00 00 callq 9 <add+0x9>
9: c9 leaveq
a: 8d 04 3e lea (%rsi,%rdi,1),%eax
d: c3 retq
e: 90 nop
f: 90 nop
Disassembly of section .init.text:
0000000000000000 <init_module>:
0: 55 push %rbp
1: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 7 <init_module+0x7>
7: 03 05 00 00 00 00 add 0x0(%rip),%eax # d <init_module+0xd>
d: 48 89 e5 mov %rsp,%rbp
10: c9 leaveq
11: c3 retq
Disassembly of section .exit.text:
0000000000000000 <cleanup_module>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: e8 00 00 00 00 callq 9 <cleanup_module+0x9>
9: c9 leaveq
a: c3 retq
objdump output with static function:
Disassembly of section .init.text:
0000000000000000 <init_module>:
0: 55 push %rbp
1: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 7 <init_module+0x7>
7: 03 05 00 00 00 00 add 0x0(%rip),%eax # d <init_module+0xd>
d: 48 89 e5 mov %rsp,%rbp
10: c9 leaveq
11: c3 retq
Disassembly of section .exit.text:
0000000000000000 <cleanup_module>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: e8 00 00 00 00 callq 9 <cleanup_module+0x9>
9: c9 leaveq
a: c3 retq

Is there a way to output the assembly of a single function in isolation?

I am learning how a C file is compiled to machine code. I know I can generate assembly from gcc with the -S flag, however it also produces a lot of code to do with main() and printf() that I am not interested in at the moment.
Is there a way to get gcc or clang to "compile" a function in isolation and output the assembly?
I.e. get the assembly for the following C in isolation:
int add( int a, int b ) {
return a + b;
}
There are two ways to do this for a specific object file:
The -ffunction-sections option to gcc instructs it to create a separate ELF section for each function in the sourcefile being compiled.
The symbol table contains section name, start address and size of a given function; that can be fed into objdump via the --start-address/--stop-address arguments.
The first example:
$ readelf -S t.o | grep ' .text.'
[ 1] .text PROGBITS 0000000000000000 00000040
[ 4] .text.foo PROGBITS 0000000000000000 00000040
[ 6] .text.bar PROGBITS 0000000000000000 00000060
[ 9] .text.foo2 PROGBITS 0000000000000000 000000c0
[11] .text.munch PROGBITS 0000000000000000 00000110
[14] .text.startup.mai PROGBITS 0000000000000000 00000180
This has been compiled with -ffunction-sections and there are four functions, foo(), bar(), foo2() and munch() in my object file. I can disassemble them separately like so:
$ objdump -w -d --section=.text.foo t.o
t.o: file format elf64-x86-64
Disassembly of section .text.foo:
0000000000000000 <foo>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 8b 3d 00 00 00 00 mov 0(%rip),%edi # a <foo+0xa>
a: 31 f6 xor %esi,%esi
c: 31 c0 xor %eax,%eax
e: e8 00 00 00 00 callq 13 <foo+0x13>
13: 85 c0 test %eax,%eax
15: 75 01 jne 18 <foo+0x18>
17: 90 nop
18: 48 83 c4 08 add $0x8,%rsp
1c: c3 retq
The other option can be used like this (nm dumps symbol table entries):
$ nm -f sysv t.o | grep bar
bar |0000000000000020| T | FUNC|0000000000000026| |.text
$ objdump -w -d --start-address=0x20 --stop-address=0x46 t.o --section=.text
t.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000020 <bar>:
20: 48 83 ec 08 sub $0x8,%rsp
24: 8b 3d 00 00 00 00 mov 0(%rip),%edi # 2a <bar+0xa>
2a: 31 f6 xor %esi,%esi
2c: 31 c0 xor %eax,%eax
2e: e8 00 00 00 00 callq 33 <bar+0x13>
33: 85 c0 test %eax,%eax
35: 75 01 jne 38 <bar+0x18>
37: 90 nop
38: bf 3f 00 00 00 mov $0x3f,%edi
3d: 48 83 c4 08 add $0x8,%rsp
41: e9 00 00 00 00 jmpq 46 <bar+0x26>
In this case, the -ffunction-sections option hasn't been used, hence the start offset of the function isn't zero and it's not in its separate section (but in .text).
Beware though when disassembling object files ...
This isn't exactly what you want, because, for object files, the call targets (as well as addresses of global variables) aren't resolved - you can't see here that foo calls printf, because the resolution of that on binary level happens only at link time. The assembly source would have the call printf in there though. The information that this callq is actually to printf is in the object file, but separate from the code (it's in the so-called relocation section that lists locations in the object file to be 'patched' by the linker); the disassembler can't resolve this.
The best way to go would be to copy your function into a separate temp.c file and compile it with the -S flag, like this: gcc -S temp.c -o temp.s
It should produce tighter assembly output with no other distractions (except for the header and footer).

Resources