I have a question about symbol tables. Now according to my textbook the rules of a symbol table are as follows:
Find the .ORIG statement,
which tells us the address of the first instruction and initialize location counter (LC), which keeps track of the
current instruction.
For each non-empty line in the program:
a) If line contains a label, add label and LC to symbol table.
b) Increment LC.
– NOTE: If statement is .BLKW or .STRINGZ,
increment LC by the number of words allocated.
Stop when .END statement is reached.
I'm having a problem with the second rule. Specifically, the statement about .BLKW or .STRINGZ (increment LC by the number of words allocated). I could not find any example in the textbook that could help me understand this, but I did find some code online.
In my example code, what are the addresses of the statements that contain .BLKW and .STRINGZ?
01 .ORIG x 3000
02
03 INIT
04 LEA R0, START.STR
05 JSR PRINST.STR
06 LD R0, TEN
07 LEA R1, DATA.B
08
09 STORE_LOOP
10 STR RO, R1, %0
11 ADD R1, R1, %1
12 ADD R0, R0, %-1
13 BRp ST.LOOP
14
15 LD R0, TEN
16 ADD R1, R1, %-1
17 AND R2, R2, %0
18
19 ADD_LOOP
20 LDR R3, R1, %0
21 ADD R2, R3, R2
22 ADD R1, R1, %-1
23 ADD R0, R0, %-1
24 BRp ADD_LOOP
25
26 STORE_SUM
27 ST R2, RESULT
28 TRAP %25
29
30 PRINT_STR
31 ST R7, SAVE.R7
32 PUTS
33 LD R7, SAVE.R7
34 RET
35
36 TEN .FILL #10
37 SAVE.R7 .BLKW #1
38 DATA.B .BLKW #10
39 START_STR .STRINGZ "Starting..."
40 RESULT .FILL #0
41 END
Code annotated with comments showing the address of each one.
01 .ORIG x 3000
02
03 INIT ; x3000
04 LEA R0, START.STR ; x3000
05 JSR PRINST.STR ; x3001
06 LD R0, TEN ; x3002
07 LEA R1, DATA.B ; x3003
08
09 STORE_LOOP ; x3004
10 STR RO, R1, %0 ; x3004
11 ADD R1, R1, %1 ; x3005
12 ADD R0, R0, %-1 ; x3006
13 BRp ST.LOOP ; x3007
14
15 LD R0, TEN ; x3008
16 ADD R1, R1, %-1 ; x3009
17 AND R2, R2, %0 ; x300A
18
19 ADD_LOOP ; x300B
20 LDR R3, R1, %0 ; x300B
21 ADD R2, R3, R2 ; x300C
22 ADD R1, R1, %-1 ; x300D
23 ADD R0, R0, %-1 ; x300E
24 BRp ADD_LOOP ; x300F
25
26 STORE_SUM ; x3010
27 ST R2, RESULT ; x3010
28 TRAP %25 ; x3011
29
30 PRINT_STR ; x3012
31 ST R7, SAVE.R7 ; x3012
32 PUTS ; x3013
33 LD R7, SAVE.R7 ; x3014
34 RET ; x3015
35
36 TEN .FILL #10 ; x3016
37 SAVE.R7 .BLKW #1 ; x3017
38 DATA.B .BLKW #10 ; x3018
39 START_STR .STRINGZ "Starting..."; x3022
40 RESULT .FILL #0 ; x302E
41 END
Things to note.
Whenever you encounter a BLKW you add however many addresses it is reserving.
Whenever you encounter a .stringz you add however many characters that are in the string PLUS 1 for the NUL terminator character.
Also (for at least the tools I work with) some assemblers / simulators will generate a symbol table file when assembling, and some simulators will visibly show what labels are at what addresses.
Related
I would like to see the first few instructions that my machine executes at startup. On x64 the reset vector is at physical address FFFFFFF0. I have enabled local kernel debugging on my Windows 10, restarted the PC and started WinDbg as Administrator. When doing a kernel debug (File -> Kernel Debug...), I am not sure what to type at the lkd> prompt to unassemble the code. I can do "!db FFFFFFF0" which displays some bytes:
#fffffff0 90 90 e9 83 e8 00 00 00-fc 00 00 00 00 00 e1 ff ................
#100000000 bb 00 fc 6a 00 00 e1 a9-00 00 bb 00 fc 6a 00 00 ...j.........j..
#100000010 01 aa 00 00 bb 00 fc 6a-00 00 21 aa 00 00 bb 00 .......j..!.....
#100000020 fc 6a 00 00 41 aa 00 00-bb 00 fc 6a 00 00 61 aa .j..A......j..a.
#100000030 00 00 bb 00 fc 6a 00 00-81 aa 00 00 bb 00 fc 6a .....j.........j
#100000040 00 00 a1 aa 00 00 bb 00-fc 6a 00 00 c1 aa 00 00 .........j......
#100000050 bb 00 fc 6a 00 00 e1 aa-00 00 bb 00 fc 6a 00 00 ...j.........j..
#100000060 01 ab 00 00 bb 00 fc 6a-00 00 21 ab 00 00 bb 00 .......j..!.....
then I tried "!u FFFFFFF0" which returns:
Op:
Dest:
Dest: 0
Src:
Srct: 0
it is just two nops and a jump if you mean you need to disassemble it as 16 bit
use ur
i patched the first 16 bytes from your query and disassemble it as 16 bit for demo below
0:000> db . l10
772805a6 90 90 e9 83 e8 00 00 00-fc 00 00 00 00 00 e1 ff ................
0:000> ur . l3
ntdll!LdrpDoDebuggerBreak+0x2c:
772805a6 90 nop
772805a7 90 nop
772805a8 e983e8 jmp EE2E
0:000>
Local Kernel Debugging is not Live it is Dead Debugging it operates on a snap shot
livekd operates on a state as the system was when dumped
!u is undocumented iirc and doesn't disassemble it provides a verbose details of a single instruction
0:000> u .+1 l1
ntdll!LdrpDoDebuggerBreak+0x2d:
772805a7 8975fc mov dword ptr [ebp-4],esi
0:000> !u 772805a7
Op: mov
Dest: esi
Dest: fffffffe
Src: dword ptr [ebp-4]
Srct: 0
0:000>
if you are looking for disassembling something like bios code use up
unassemble physical
kd> up cs:7c00 l1
0008:00007c00 eb52 jmp 00007c54
kd> up cs:7c54 l20
0008:00007c54 fa cli
0008:00007c55 33c0 xor eax,eax
0008:00007c57 8ed0 mov ss,ax
0008:00007c59 bc007cfb68 mov esp,68FB7C00h
0008:00007c5e c0071f rol byte ptr [edi],1Fh
0008:00007c61 1e push ds
0008:00007c62 686600cb88 push 88CB0066h
0008:00007c67 16 push ss
0008:00007c68 0e push cs
0008:00007c69 006681 add byte ptr [esi-7Fh],ah
0008:00007c6c 3e0300 add eax,dword ptr ds:[eax]
0008:00007c6f 4e dec esi
0008:00007c70 54 push esp
0008:00007c71 46 inc esi
0008:00007c72 53 push ebx
0008:00007c73 7515 jne 00007c8a
0008:00007c75 b441 mov ah,41h
0008:00007c77 bbaa55cd13 mov ebx,13CD55AAh
0008:00007c7c 720c jb 00007c8a
0008:00007c7e 81fb55aa7506 cmp ebx,675AA55h
0008:00007c84 f7c101007503 test ecx,3750001h
0008:00007c8a e9dd001e83 jmp 831e7d6c
i dont think debug.com is shipped in windows 10 x64 if you can get yourhands on win732bit etc you can use debug to disassemble the address
:>debug
-u f000:fff0 l1
F000:FFF0 EA5BE000F0 JMP F000:E05B
-u f000:e05b l1
F000:E05B EA3D3A00F0 JMP F000:3A3D
-u f000:3a3d l1
F000:3A3D FA CLI
-u f000:3a3d
F000:3A3D FA CLI
F000:3A3E B800F0 MOV AX,F000
F000:3A41 8ED0 MOV SS,AX
F000:3A43 BC493A MOV SP,3A49
F000:3A46 E93A8E JMP C883
F000:3A49 4B DEC BX
F000:3A4A 3ABB1DF1 CMP BH,[BP+DI+F11D]
F000:3A4E 2E CS:
F000:3A4F F747020800 TEST WORD PTR [BX+02],0008
F000:3A54 740E JZ 3A64
F000:3A56 32C0 XOR AL,AL
F000:3A58 BC5E3A MOV SP,3A5E
F000:3A5B E9E50B JMP 4643
-
I don't think you can step or disassemble the Physical Address f000:ffff in windbg.
I don't think it is mapped at all
I saw your post !db showing output so I wasn't quiet sure if it is available in x64.
afaik these codes are executed in Real Mode and you cant access them in protected mode with a software debugger like windbg
anyway back to the point
if you want to step through reset vector to MBR use a hardware emulator like bochs.
install bochs (2.6.11 x64 latest at the time of this edit) with the dlx demo
if you installed bochs in windows 10 you may need to give permissions to the bochs folder
(right click bochs folder->properties->security->edit full or write as the case may be)
once permission is granted you can double click the run.bat and you should land on LILo boot prompt
if you are here then close the bochs
open an elevated cmd prompt
and execute ../bochsdbg -f bochsrc
it will stop on Reset Vector
C:\Program Files\Bochs-2.6.11\dlxlinux>..\bochsdbg.exe -f ./bochsrc.bxrc
========================================================================
Bochs x86 Emulator 2.6.11
Built from SVN snapshot on January 5, 2020
Timestamp: Sun Jan 5 08:36:00 CET 2020
========================================================================
00000000000i[ ] reading configuration from ./bochsrc.bxrc
00000000000i[ ] installing win32 module as the Bochs GUI
00000000000i[ ] using log file bochsout.txt
Next at t=0
(0) [0x0000fffffff0] f000:fff0 (unk. ctxt): jmpf 0xf000:e05b ; ea5be000f0
set a breakpoint at 0x7c00 using lb 0x7c00 (MBR START)
use s or n to step through
if you single step using s you may need to press s or ENTER 42.5 million times to reach MBR
here is a step through from reset vector to MBR using n or next
<bochs:5> blist
Num Type Disp Enb Address
1 lbreakpoint keep y 0x0000000000007c00
<bochs:6> u
00000000fffffff0: ( ): jmpf 0xf000:e05b ; ea5be000f0
<bochs:7> n
Next at t=1
(0) [0x0000000fe05b] f000:e05b (unk. ctxt): xor ax, ax ; 31c0
<bochs:8>
Next at t=2
(0) [0x0000000fe05d] f000:e05d (unk. ctxt): out 0x0d, al ; e60d
<bochs:9>
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx cut off
Next at t=32
(0) [0x0000000fe0c4] f000:e0c4 (unk. ctxt): rep stosw word ptr es:[di], ax ; f3ab
<bochs:39>
Next at t=160
(0) [0x0000000fe0c6] f000:e0c6 (unk. ctxt): call .+13763 (0x000f168c) ; e8c335
<bochs:40>
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx cut off
Next at t=5078
(0) [0x0000000fe0cf] f000:e0cf (unk. ctxt): mov word ptr ds:0x0413, ax ; a31304
<bochs:43>
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx cut off
Next at t=330285
(0) [0x0000000fe1f5] f000:e1f5 (unk. ctxt): call .-18344 (0x000f9a50) ; e858b8
<bochs:146>
Next at t=1403926
(0) [0x0000000fe1f8] f000:e1f8 (unk. ctxt): mov cx, 0xc000 ; b900c0
<bochs:147>
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx cut off
Next at t=1877324
(0) [0x0000000fe21e] f000:e21e (unk. ctxt): call .+14565 (0x000f1b06) ; e8e538
<bochs:161>
Next at t=5333736
(0) [0x0000000fe221] f000:e221 (unk. ctxt): call .+21186 (0x000f34e6) ; e8c252
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx cut off
(0) [0x0000000fe230] f000:e230 (unk. ctxt): call .+12532 (0x000f1327) ; e8f430
<bochs:167>
Next at t=42311959
(0) [0x0000000fe233] f000:e233 (unk. ctxt): sti ; fb
<bochs:168>
Next at t=42311960
(0) [0x0000000fe234] f000:e234 (unk. ctxt): int 0x19 ; cd19
<bochs:169>
(0) Breakpoint 1, 0x0000000000007c00 in ?? () <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Next at t=42409977 <<<<< 42.5 million instructions until MBR is reached
(0) [0x000000007c00] 0000:7c00 (unk. ctxt): cli ; fa
<bochs:170> q
(0).[42409977] [0x000000007c00] 0000:7c00 (unk. ctxt): cli ; fa
Bochs is exiting. Press ENTER when you're ready to close this window.
This question already has an answer here:
Assembly do we need the endings? [duplicate]
(1 answer)
Closed 1 year ago.
Recently, I read some books about computer science. I wrote some C code, and disassembled them, using gcc and objdump.
The following C code:
#include <stdio.h>
#include <stdbool.h>
int dojob()
{
static short num[ ][4] = { {2, 9, -1, 5}, {3, 8, 2, -6}};
static short *pn[ ] = {num[0], num[1]};
static short s[2] = {0, 0};
int i, j;
for (i=0; i<2; i++) {
for (j=0; j<4; j++){
s[i] += *pn[i]++;
}
printf ("sum of line %d: %d\n", i+1, s[i]);
}
return 0;
}
int main ( )
{
dojob();
}
got the following assembly code (AT&T syntex; only assembly of function dojob and some data is list):
00401350 <_dojob>:
401350: 55 push %ebp
401351: 89 e5 mov %esp,%ebp
401353: 83 ec 28 sub $0x28,%esp
401356: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp)
40135d: eb 75 jmp 4013d4 <_dojob+0x84>
40135f: c7 45 f0 00 00 00 00 movl $0x0,-0x10(%ebp)
401366: eb 3c jmp 4013a4 <_dojob+0x54>
401368: 8b 45 f4 mov -0xc(%ebp),%eax
40136b: 8b 04 85 00 20 40 00 mov 0x402000(,%eax,4),%eax
401372: 8d 48 02 lea 0x2(%eax),%ecx
401375: 8b 55 f4 mov -0xc(%ebp),%edx
401378: 89 0c 95 00 20 40 00 mov %ecx,0x402000(,%edx,4)
40137f: 0f b7 10 movzwl (%eax),%edx
401382: 8b 45 f4 mov -0xc(%ebp),%eax
401385: 0f b7 84 00 08 50 40 movzwl 0x405008(%eax,%eax,1),%eax
40138c: 00
40138d: 89 c1 mov %eax,%ecx
40138f: 89 d0 mov %edx,%eax
401391: 01 c8 add %ecx,%eax
401393: 89 c2 mov %eax,%edx
401395: 8b 45 f4 mov -0xc(%ebp),%eax
401398: 66 89 94 00 08 50 40 mov %dx,0x405008(%eax,%eax,1)
40139f: 00
4013a0: 83 45 f0 01 addl $0x1,-0x10(%ebp)
4013a4: 83 7d f0 03 cmpl $0x3,-0x10(%ebp)
4013a8: 7e be jle 401368 <_dojob+0x18>
4013aa: 8b 45 f4 mov -0xc(%ebp),%eax
4013ad: 0f b7 84 00 08 50 40 movzwl 0x405008(%eax,%eax,1),%eax
4013b4: 00
4013b5: 98 cwtl
4013b6: 8b 55 f4 mov -0xc(%ebp),%edx
4013b9: 83 c2 01 add $0x1,%edx
4013bc: 89 44 24 08 mov %eax,0x8(%esp)
4013c0: 89 54 24 04 mov %edx,0x4(%esp)
4013c4: c7 04 24 24 30 40 00 movl $0x403024,(%esp)
4013cb: e8 50 08 00 00 call 401c20 <_printf>
4013d0: 83 45 f4 01 addl $0x1,-0xc(%ebp)
4013d4: 83 7d f4 01 cmpl $0x1,-0xc(%ebp)
4013d8: 7e 85 jle 40135f <_dojob+0xf>
4013da: b8 00 00 00 00 mov $0x0,%eax
4013df: c9 leave
4013e0: c3 ret
Disassembly of section .data:
00402000 <__data_start__>:
402000: 08 20 or %ah,(%eax)
402002: 40 inc %eax
402003: 00 10 add %dl,(%eax)
402005: 20 40 00 and %al,0x0(%eax)
Disassembly of section .bss:
...
00405008 <_s.1927>:
405008: 00 00 add %al,(%eax)
...
I have two questions:
I don't understand the difference between mov and movl instruction? Why the compiler generate mov for some code, and movl for others?
I completely understand the meaning of the C code, but not the assembly that the compiler generated. Who can make some comments for it for me to understand? I will thank a lot.
The MOVL instruction was generated because you put two int (i and j variables), MOVL will perform a MOV of 32 bits, and integer' size is 32 bits.
a non exhaustive list of all MOV* exist (like MOVD for doubleword or MOVQ for quadword) to allow to optimize your code and use the better expression to gain most time as possible.
PS: may be the -M intel objdump's argument can help you to have a better comprehension of the disassembly, a lot of man on the Intel syntax can may be find easily.
I know when using objdump -dr in my file call shows up in machine code as e8 00 00 00 00 because it has not yet been linked. But I need to find out what the 00 00 00 00 will turn into after the linker has done it's job. I know it should calculate the offset, but I'm a little confused about that.
As an example with the code below, after the linker part is done, how should the e8 00 00 00 00 be? And how do I get to that answer?
I'm testing out with this sample code: (I'm trying to call moo)
Disassembly of section .text:
0000000000000000 <foo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d fc mov %edi,-0x4(%rbp)
7: 8b 45 fc mov -0x4(%rbp),%eax
a: 83 e8 0a sub $0xa,%eax
d: 5d pop %rbp
e: c3 retq
000000000000000f <moo>:
f: 55 push %rbp
10: 48 89 e5 mov %rsp,%rbp
13: 89 7d fc mov %edi,-0x4(%rbp)
16: b8 01 00 00 00 mov $0x1,%eax
1b: 5d pop %rbp
1c: c3 retq
000000000000001d <main>:
1d: 55 push %rbp
1e: 48 89 e5 mov %rsp,%rbp
21: 48 83 ec 10 sub $0x10,%rsp
25: c7 45 fc 8e 0c 00 00 movl $0xc8e,-0x4(%rbp)
2c: 8b 45 fc mov -0x4(%rbp),%eax
2f: 89 c7 mov %eax,%edi
31: e8 00 00 00 00 callq 36 <main+0x19>
32: R_X86_64_PC32 moo-0x4
36: 89 45 fc mov %eax,-0x4(%rbp)
39: b8 00 00 00 00 mov $0x0,%eax
3e: c9 leaveq
3f: c3 retq
With objdump -r you have Relocations printed with your disassembly -d:
31: e8 00 00 00 00 callq 36 <main+0x19>
32: R_X86_64_PC32 moo-0x4
ld-linux.so.2 loader will relocate objects (in modern world it will relocate even executable to random address) and fill the relocations with correct address.
Check with gdb by adding breakpoint at main and starting program (linker works before main function is started):
gdb ./program
(gdb) start
(gdb) disassemble main
If you want to compile the code without relocations, show source code and compilation options.
Object files and executable files on several architectures that I know of do not necessarily fix jump destinations at link time.
This is a feature which provides flexibility.
Jump target addresses do not have to be fixed until just before the instruction executes. They do not need to be fixed up at link time—nor even at program start time!
Most systems (Windows, Linux, Unix, VAX/VMS) tag such locations in the object code as an address which needs adjustment. There is additional information about what the target address is, what type of reference it is (such as absolute or relative; 16-bit, 24-bit, 32-bit, 64-bit, etc.).
The zero value there is not necessarily a placeholder, but the base value upon which to evaluate the result. For example, if the instruction were—for whatever reason—call 5+external_address, then there might be 5 (e8 05 00 00 00) in the object code.
If you want to see what the address is at execution time, run the program under a debugger, place a breakpoint at that instruction and then view the instruction just before it executes.
A common anti-virus, security-enhancing feature known as ASLR (address space layout randomization) intentionally loads programs sections at inconsistent addresses to thwart malicious code which alters programs or data. Programs operating in this environment may not have some target addresses assigned until after the program runs a bit.
(Of related interest, VAX/VMS in particular has a complex fixup mode in which an equation describes the operations needed to compute a value. Operations include addition, subtraction, multiplication, division, shifting, rotating, and probably others. I never saw it actually used, but it was interesting to contemplate how one might apply the capability.)
but you clearly know how to do all of this. you know how to disassemble before linking just disassemble after to see how the linker modifies those instructions.
asm(".globl _start; _start: nop\n");
unsigned int foo ( unsigned int x )
{
return(x+5);
}
unsigned int moo ( unsigned int x )
{
return(foo(x)+3);
}
int main ( void )
{
return(moo(3)+2);
}
0000000000000000 <_start>:
0: 90 nop
0000000000000001 <foo>:
1: 55 push %rbp
2: 48 89 e5 mov %rsp,%rbp
5: 89 7d fc mov %edi,-0x4(%rbp)
8: 8b 45 fc mov -0x4(%rbp),%eax
b: 83 c0 05 add $0x5,%eax
e: 5d pop %rbp
f: c3 retq
0000000000000010 <moo>:
10: 55 push %rbp
11: 48 89 e5 mov %rsp,%rbp
14: 48 83 ec 08 sub $0x8,%rsp
18: 89 7d fc mov %edi,-0x4(%rbp)
1b: 8b 45 fc mov -0x4(%rbp),%eax
1e: 89 c7 mov %eax,%edi
20: e8 00 00 00 00 callq 25 <moo+0x15>
25: 83 c0 03 add $0x3,%eax
28: c9 leaveq
29: c3 retq
000000000000002a <main>:
2a: 55 push %rbp
2b: 48 89 e5 mov %rsp,%rbp
2e: bf 03 00 00 00 mov $0x3,%edi
33: e8 00 00 00 00 callq 38 <main+0xe>
38: 83 c0 02 add $0x2,%eax
3b: 5d pop %rbp
3c: c3 retq
0000000000001000 <_start>:
1000: 90 nop
0000000000001001 <foo>:
1001: 55 push %rbp
1002: 48 89 e5 mov %rsp,%rbp
1005: 89 7d fc mov %edi,-0x4(%rbp)
1008: 8b 45 fc mov -0x4(%rbp),%eax
100b: 83 c0 05 add $0x5,%eax
100e: 5d pop %rbp
100f: c3 retq
0000000000001010 <moo>:
1010: 55 push %rbp
1011: 48 89 e5 mov %rsp,%rbp
1014: 48 83 ec 08 sub $0x8,%rsp
1018: 89 7d fc mov %edi,-0x4(%rbp)
101b: 8b 45 fc mov -0x4(%rbp),%eax
101e: 89 c7 mov %eax,%edi
1020: e8 dc ff ff ff callq 1001 <foo>
1025: 83 c0 03 add $0x3,%eax
1028: c9 leaveq
1029: c3 retq
000000000000102a <main>:
102a: 55 push %rbp
102b: 48 89 e5 mov %rsp,%rbp
102e: bf 03 00 00 00 mov $0x3,%edi
1033: e8 d8 ff ff ff callq 1010 <moo>
1038: 83 c0 02 add $0x2,%eax
103b: 5d pop %rbp
103c: c3 retq
for example
20: e8 00 00 00 00 callq 25 <moo+0x15>
1033: e8 d8 ff ff ff callq 1010 <moo>
Coming from a Windows environment, when I do kernel debugging or even in user mode for that matter, I can see the disassembled code in a way that is quite detailed, for example:
80526db2 6824020000 push 224h
80526db7 6808a14d80 push offset nt!ObWatchHandles+0x8dc (804da108)
80526dbc e81f030100 call nt!_SEH_prolog (805370e0)
80526dc1 a140a05480 mov eax,dword ptr [nt!__security_cookie (8054a040)]
The first number is the address quite obviously but the second represent the opcode bytes and that is lacking on GDB or at least, I don't know how to get a similar result.
I usually will do something like this:
(gdb): display /i $pc
But all I get is something like this:
x/i $pc 0x21c4c: pop %eax
I can see what the code bytes are which is sometimes a bit of an issue for me. Is there something I can do with display that could help?
Edit: GDB in question is 6.3.50 on Mac OS X 10.8.3.
I think disassemble /r should give you what you are looking for:
(gdb) help disass
Disassemble a specified section of memory.
Default is the function surrounding the pc of the selected frame.
With a /m modifier, source lines are included (if available).
With a /r modifier, raw instructions in hex are included.
With a single argument, the function surrounding that address is dumped.
Two arguments (separated by a comma) are taken as a range of memory to dump,
in the form of "start,end", or "start,+length".
(gdb) disass /r main
Dump of assembler code for function main:
0x004004f8 <+0>: 55 push %ebp
0x004004f9 <+1>: 48 dec %eax
0x004004fa <+2>: 89 e5 mov %esp,%ebp
0x004004fc <+4>: 48 dec %eax
0x004004fd <+5>: 83 ec 10 sub $0x10,%esp
0x00400500 <+8>: 89 7d fc mov %edi,-0x4(%ebp)
0x00400503 <+11>: 48 dec %eax
0x00400504 <+12>: 89 75 f0 mov %esi,-0x10(%ebp)
0x00400507 <+15>: bf 0c 06 40 00 mov $0x40060c,%edi
0x0040050c <+20>: b8 00 00 00 00 mov $0x0,%eax
0x00400511 <+25>: e8 0a ff ff ff call 0x400420
0x00400516 <+30>: bf 00 00 00 00 mov $0x0,%edi
0x0040051b <+35>: e8 10 ff ff ff call 0x400430
End of assembler dump.
(gdb)
GDB disassemble command documentation
If you use lldb, you can use the -b option to disassemble to get the same effect:
(lldb) disassemble -b -p
Sketch`main + 46 at SKTMain.m:17:
-> 0x10001aa0e: 48 89 c7 movq %rax, %rdi
0x10001aa11: b0 00 movb $0, %al
0x10001aa13: e8 f2 48 00 00 callq 0x10001f30a ; symbol stub for: NSLog
0x10001aa18: 48 8d 35 99 fa 00 00 leaq 64153(%rip), %rsi ; #Sketch`.str3
How can I get arm-none-eabi-objcopy to copy/translate my .axf file into a .bin suitable for flashing to the device with lm4tools?
I have a ~20KB .axf file compiled and linked with arm-none-eabi-*. For anyone interested, this is with the Stellaris Launchpad.
I'm trying to modify the example provided with the stellaris-launchpad-template-gcc project to compile with C++ code. I've managed to get it to produce a .axf file in elf32-littlearm format (according to the .lst file), however when I try to do
arm-none-eabi-objcopy -v -O binary main.axf foo.bin
to translate it into a programmable file, the .bin file output has a size of 0 bytes. This question describes a similar problem. I'm confident the .axf file is complete, so why isn't my .bin file being populated with anything? If I remove -O binary, the file is copied fine. The directory is writeable and all permissions are fine. I've tried searching both on SO and the rest of the internet with various terms, all of which turned up nothing of use. It seems nobody's had this problem before.
Running arm-none-eabi-objdump -D main.axf yields a dump of the file in what appears to be assembler:
main.axf: file format elf32-littlearm
Disassembly of section .debug_info:
00000000 <_end_text>:
0: 00000082 andeq r0, r0, r2, lsl #1
4: 00000002 andeq r0, r0, r2
8: 01040000 mrseq r0, (UNDEF: 4)
c: 00000057 andeq r0, r0, r7, asr r0
10: 00000c04 andeq r0, r0, r4, lsl #24
14: 00002300 andeq r2, r0, r0, lsl #6
...
28: 08010200 stmdaeq r1, {r9}
2c: 00000015 andeq r0, r0, r5, lsl r0
30: 00410103 subeq r0, r1, r3, lsl #2
34: 32010000 andcc r0, r1, #0
38: 00000048 andeq r0, r0, r8, asr #32
...
48: 69050404 stmdbvs r5, {r2, sl}
4c: 0500746e streq r7, [r0, #-1134] ; 0x46e
50: 00000001 andeq r0, r0, r1
54: 46430100 strbmi r0, [r3], -r0, lsl #2
...
Is this correct, or am I not even compiling properly, seeing as this appears to be some kind of assembler?
Some info about the .axf file:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x0
Start of program headers: 52 (bytes into file)
Start of section headers: 20372 (bytes into file)
Flags: 0x5000000, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 14
Section header string table index: 11
After running arm-none-eabi-readelf -lS main.axf, no program headers are returned:
There are 14 section headers, starting at offset 0x4f94:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .debug_info PROGBITS 00000000 000034 0019a6 00 0 0 1
[ 2] .debug_abbrev PROGBITS 00000000 0019da 0005ad 00 0 0 1
[ 3] .debug_loc PROGBITS 00000000 001f87 001796 00 0 0 1
[ 4] .debug_aranges PROGBITS 00000000 003720 000088 00 0 0 8
[ 5] .debug_ranges PROGBITS 00000000 0037a8 000130 00 0 0 1
[ 6] .debug_line PROGBITS 00000000 0038d8 00077d 00 0 0 1
[ 7] .debug_str PROGBITS 00000000 004055 000aa8 01 MS 0 0 1
[ 8] .comment PROGBITS 00000000 004afd 00003a 01 MS 0 0 1
[ 9] .ARM.attributes ARM_ATTRIBUTES 00000000 004b37 000039 00 0 0 1
[10] .debug_frame PROGBITS 00000000 004b70 000388 00 0 0 4
[11] .shstrtab STRTAB 00000000 004ef8 00009a 00 0 0 1
[12] .symtab SYMTAB 00000000 0051c4 000200 10 13 18 4
[13] .strtab STRTAB 00000000 0053c4 0000e0 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no program headers in this file.