ELF: How to print the r_info field? - gcc

I want to list the name of all relocation entries which have the type R_X86_64_JUMP_SLOT. Here's what I have so far:
char *str = (char *) (mapped_file + dynstr->sh_offset);
Elf64_Rela *reloc_entry = (Elf64_Rela *) (mapped_file + rela_plt->sh_offset);
for (i = 0; i < rela_plt->sh_size / sizeof(Elf64_Rela); ++i)
{
    if (ELF64_R_TYPE(reloc_entry[i].r_info) == R_X86_64_JUMP_SLOT)
        printf("name: %s\n", str + ELF64_R_SYM(reloc_entry[i].r_info));
}
The ELF64_R_TYPE macro is working fine but I'm having problems retrieving the relocation name with ELF64_R_SYM.
readelf shows the relocations correctly:
Relocation section '.rela.plt' at offset 0x588 contains 3 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000004018 000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000004020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000004028 000600000007 R_X86_64_JUMP_SLO 0000000000000000 malloc@GLIBC_2.2.5 + 0
For some reason my code doesn't print the symbol names; instead it prints:
name: ibc.so.6
name: bc.so.6
name: so.6
What am I doing wrong?

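r_info contains the symbol index, which is an array index into the symbol table (the section named by the relocation section's sh_link, here .dynsym), not an offset into the string table. The array elements are of type Elf32_Sym or Elf64_Sym. The string table offset of the symbol name is in the st_name member. Your code adds the symbol indices 2, 3 and 6 directly to the start of .dynstr, which is why it prints tails of the string "libc.so.6" that happens to sit near the start of that table. This additional indirection is necessary because in general, binding a symbol requires more data than just its name.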
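The lookup chain described above (r_info, then the symbol table entry, then st_name, then the string table) can be sketched in Python with the struct module. This is only an illustration under stated assumptions: the function name jump_slot_names is made up, the three arguments are assumed to be the raw bytes of .rela.plt, .dynsym and .dynstr from a little-endian ELF64 file, and the sample data at the bottom is synthetic rather than taken from the binary in the question.

```python
import struct

R_X86_64_JUMP_SLOT = 7
RELA = struct.Struct("<QQq")     # Elf64_Rela: r_offset, r_info, r_addend
SYM = struct.Struct("<IBBHQQ")   # Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size


def jump_slot_names(rela_plt, dynsym, dynstr):
    """Return the symbol names of all R_X86_64_JUMP_SLOT relocations."""
    names = []
    for r_offset, r_info, r_addend in RELA.iter_unpack(rela_plt):
        if (r_info & 0xFFFFFFFF) != R_X86_64_JUMP_SLOT:   # ELF64_R_TYPE
            continue
        sym_index = r_info >> 32                           # ELF64_R_SYM
        # The index selects an Elf64_Sym; its first field is st_name.
        st_name = SYM.unpack_from(dynsym, sym_index * SYM.size)[0]
        # st_name is the offset of a NUL-terminated name in .dynstr.
        end = dynstr.index(b"\0", st_name)
        names.append(dynstr[st_name:end].decode("ascii"))
    return names


# Synthetic sections (hypothetical data, for illustration only).
dynstr = b"\0libc.so.6\0puts\0printf\0"
dynsym = (SYM.pack(0, 0, 0, 0, 0, 0)
          + SYM.pack(dynstr.index(b"puts"), 0x12, 0, 0, 0, 0)
          + SYM.pack(dynstr.index(b"printf"), 0x12, 0, 0, 0, 0))
rela_plt = (RELA.pack(0x4018, (1 << 32) | R_X86_64_JUMP_SLOT, 0)
            + RELA.pack(0x4020, (2 << 32) | R_X86_64_JUMP_SLOT, 0))
print(jump_slot_names(rela_plt, dynsym, dynstr))
```

In the C code from the question, the equivalent change would be to index an Elf64_Sym array with ELF64_R_SYM(...) and add that entry's st_name to str.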

Related

Where Debug info stored in Framework?

As the title says, I'm confused about where the debug info is stored in a .framework file.
I googled for several days, and this is what I found:
A framework binary compiled in debug mode includes a debug info segment that indicates the symbol locations. A release build moves it to a dSYM file.
But what confuses me is that I build a framework with ninja, and it doesn't generate a dSYM file. Meanwhile, I can't find the symbol locations with the dwarfdump command or the MachOView app. The strings command, however, does return some relative file paths, like ../../flutter/fml/memory/task_runner_checker.cc.
Here dwarfdump prints:
Flutter.framework/Flutter: file format Mach-O arm64
.debug_info contents:
Here are my questions:
When I triggered breakpoint at somewhere in the framework, the source code still shows. Why?
When I run lldb command, it shows:
(lldb) image lookup -a $pc --verbose
Address: Flutter[0x0000000001964f18] (Flutter.__TEXT.__text + 26604184)
Summary: Flutter`dart::BootstrapNatives::DN_LoadLibraryFromTypedData(dart::Thread*, dart::Zone*, dart::NativeArguments*) + 44 [inlined] dart::NativeArguments::NativeArgAt(int) const at object.cc:537
Flutter`dart::BootstrapNatives::DN_LoadLibraryFromTypedData(dart::Thread*, dart::Zone*, dart::NativeArguments*) + 44 [inlined] dart::DN_HelperLoadLibraryFromTypedData(dart::Isolate*, dart::Thread*, dart::Zone*, dart::NativeArguments*) at object.cc:534
Flutter`dart::BootstrapNatives::DN_LoadLibraryFromTypedData(dart::Thread*, dart::Zone*, dart::NativeArguments*) + 44 at object.cc:534
Module: file = "/Users/xx/Library/Developer/Xcode/DerivedData/XXX-ddigzjlnuypwnydlawevfrkmdsov/Build/Products/Debug-iphoneos/XXX.app/Frameworks/Flutter.framework/Flutter", arch = "arm64"
CompileUnit: id = {0x00000000}, file = "/Users/xx/Documents/workspace/aion/flutter_engine/src/third_party/dart/runtime/lib/object.cc", language = "c++14"
Function: id = {0x7d40006244d}, name = "dart::BootstrapNatives::DN_LoadLibraryFromTypedData(dart::Thread*, dart::Zone*, dart::NativeArguments*)", mangled = "_ZN4dart16BootstrapNatives27DN_LoadLibraryFromTypedDataEPNS_6ThreadEPNS_4ZoneEPNS_15NativeArgumentsE", range = [0x0000000117310eec-0x0000000117311490)
FuncType: id = {0x7d40006244d}, byte-size = 0, decl = bootstrap_natives.h:507, compiler_type = "class dart::ObjectPtr (class dart::Thread *, class dart::Zone *, class dart::NativeArguments *)"
Blocks: id = {0x7d40006244d}, range = [0x117310eec-0x117311490)
id = {0x7d40006249b}, ranges = [0x117310f18-0x1173113e0)[0x117311404-0x117311490), name = "DN_HelperLoadLibraryFromTypedData", decl = object.cc:534, mangled = _ZN4dartL33DN_HelperLoadLibraryFromTypedDataEPNS_7IsolateEPNS_6ThreadEPNS_4ZoneEPNS_15NativeArgumentsE, demangled = dart::DN_HelperLoadLibraryFromTypedData(dart::Isolate*, dart::Thread*, dart::Zone*, dart::NativeArguments*)
id = {0x7d40006253a}, range = [0x117310f18-0x117310f24), name = "NativeArgAt", decl = native_arguments.h:129, mangled = _ZNK4dart15NativeArguments11NativeArgAtEi, demangled = dart::NativeArguments::NativeArgAt(int) const
LineEntry: [0x0000000117310f18-0x0000000117310f24): /Users/xx/Documents/workspace/xxx/flutter_engine/src/third_party/dart/runtime/vm/native_arguments.h:132:14
Symbol: id = {0x0013fd0f}, range = [0x0000000117310eec-0x0000000117311490), name="dart::BootstrapNatives::DN_LoadLibraryFromTypedData(dart::Thread*, dart::Zone*, dart::NativeArguments*)", mangled="_ZN4dart16BootstrapNatives27DN_LoadLibraryFromTypedDataEPNS_6ThreadEPNS_4ZoneEPNS_15NativeArgumentsE"
Variable: id = {0x7d400062553}, name = "this", type = "const dart::NativeArguments *", location = DW_OP_reg20 W20, decl =
Variable: id = {0x7d40006255c}, name = "index", type = "int", location = <decoding error> 00 00 00, decl = native_arguments.h:129
Variable: id = {0x7d4000624a8}, name = "isolate", type = "dart::Isolate *", location = , decl = object.cc:534
Variable: id = {0x7d4000624ad}, name = "thread", type = "dart::Thread *", location = , decl = object.cc:534
Variable: id = {0x7d4000624b2}, name = "zone", type = "dart::Zone *", location = DW_OP_reg19 W19, decl = object.cc:534
Variable: id = {0x7d4000624bb}, name = "arguments", type = "dart::NativeArguments *", location = DW_OP_reg20 W20, decl = object.cc:534
Variable: id = {0x7d4000624c4}, name = "program", type = "unique_ptr<dart::xx_kernel::Program, std::__1::default_delete<dart::xx_kernel::Program> >", location = DW_OP_breg31 WSP+64, decl = object.cc:556
Variable: id = {0x7d400062468}, name = "thread", type = "dart::Thread *", location = DW_OP_reg24 W24, decl = object.cc:534
Variable: id = {0x7d400062479}, name = "zone", type = "dart::Zone *", location = DW_OP_reg19 W19, decl = object.cc:534
Variable: id = {0x7d40006248a}, name = "arguments", type = "dart::NativeArguments *", location = DW_OP_reg20 W20, decl = object.cc:534
So, where does lldb get the CompileUnit and LineEntry output from?
Debug information on Darwin systems exists in one of two places: In the .o files, and later after dsymutil is run to create a .dSYM, it exists in the .dSYM bundle, all collected together, relocated to the actual binary's addresses.
This was a build-link-debug performance enhancement. Linking all of the debug information -- updating all the symbol addresses, copying it all around -- is very slow, so leaving the debug information in the .o files for this common iterative development cycle, and having the debugger locate the .o files and update the addresses of the functions internally, allows for rapid development.
Leaving all of the debug information in the .o files requires that they all be present, of course! And at the same file paths. So it is not good when you need to move a binary between computers, or save it for later debugging. For these cases, you link the debug information with dsymutil and you get a .dSYM bundle.

Disable GOT in GCC

The Global Offset Table (GOT) is used for relocating ELF symbols (as implemented by GCC). It lets processes share the same binary without any process-specific linking, which reduces the number of copies of the same binary image in memory.
My question is: is there any way to disable R_386_GOT32 and R_386_GOTOFF relocation entries in a relocatable ELF image? In other words, can I force GCC to use R_386_PC32 or R_386_32 relocations instead of GOT-type relocations?
If not, could you explain the way of implementing GOT? I am writing a dynamic linking and loading library for ELF.
Edit:
Reference Links
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-74186.html
http://man7.org/linux/man-pages/man8/ld.so.8.html
http://wiki.osdev.org/ELF
Finally I cracked it!
No, it is not possible to restrict GCC to non-GOT relocation types.
Now, how do you resolve GOT-type relocations?
The GOT is a fixed 128 KB memory chunk (it works on the copy-on-write principle), allocated by the dynamic linker, that contains entries for relocation.
The dynamic linker allocates the GOT only if one of the GOT relocation types listed below exists in the ELF binary.
R_386_GOTOFF (== 0x9)
This relocation type computes the difference between a symbol's value and the address of the global offset table. It also instructs the link-editor to create the global offset table.
R_386_GOTPC (== 0xA)
This relocation type resembles R_386_PC32, except it uses the address of the global offset table in its calculation.
How to implement them?
Note: The following code snippet belongs to the Atom OS source code, which is protected by a closed-source license. But I (the Atom developer) hereby declare this code snippet free to use :)
uint GOT = Heap.kmalloc(1024 * 128); // 128 KB
...
private static void Relocate(Elf_Header* aHeader, Elf_Shdr* aShdr, uint GOT)
{
    uint BaseAddress = (uint)aHeader;
    Elf32_Rel* Reloc = (Elf32_Rel*)aShdr->sh_addr;
    Elf_Shdr* TargetSection = (Elf_Shdr*)(BaseAddress + aHeader->e_shoff) + aShdr->sh_info;
    uint RelocCount = aShdr->sh_size / aShdr->sh_entsize;
    uint SymIdx, SymVal, RelocType;
    for (int i = 0; i < RelocCount; i++, Reloc++)
    {
        SymVal = 0;
        SymIdx = (Reloc->r_info >> 8);
        RelocType = Reloc->r_info & 0xFF;
        if (SymIdx != SHN_UNDEF)
        {
            if (RelocType == R_386_GOTPC)
                SymVal = GOT;
            else
                SymVal = GetSymValue(aHeader, TargetSection->sh_link, SymIdx);
        }
        uint* add_ref = (uint*)(TargetSection->sh_addr + Reloc->r_offset);
        switch(RelocType)
        {
            case R_386_32:
                *add_ref = SymVal + *add_ref; // S + A
                break;
            case R_386_GOTOFF:
                *add_ref = SymVal + *add_ref - GOT; // S + A - GOT
                break;
            case R_386_PLT32: // L + A - P
            case R_386_PC32: // S + A - P
            case R_386_GOTPC: // GOT + A - P
                *add_ref = SymVal + *add_ref - (uint)add_ref;
                break;
            default:
                throw new Exception("[ELF]: Unsupported Relocation type");
        }
    }
}
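The relocation arithmetic in the snippet above can be checked in isolation. The following Python sketch is not part of Atom OS; the function names split_r_info and relocate are made up for illustration. S is the symbol value, A the addend already stored at the patched location, P the address being patched, and GOT the base address of the table. Like the snippet, it treats R_386_PLT32 as a direct reference to the symbol (L taken to be S), which is an assumption rather than a general rule.

```python
# i386 relocation type numbers from the System V ABI.
R_386_32, R_386_PC32, R_386_PLT32, R_386_GOTOFF, R_386_GOTPC = 1, 2, 4, 9, 10


def split_r_info(r_info):
    """Split an Elf32 r_info into (symbol index, relocation type)."""
    return r_info >> 8, r_info & 0xFF


def relocate(reloc_type, S, A, P, GOT):
    """Return the 32-bit value to store at the relocated location."""
    if reloc_type == R_386_32:
        value = S + A                      # S + A
    elif reloc_type in (R_386_PC32, R_386_PLT32):
        value = S + A - P                  # S + A - P (PLT32: L assumed equal to S)
    elif reloc_type == R_386_GOTOFF:
        value = S + A - GOT                # S + A - GOT
    elif reloc_type == R_386_GOTPC:
        value = GOT + A - P                # GOT + A - P
    else:
        raise ValueError("unsupported relocation type %d" % reloc_type)
    return value & 0xFFFFFFFF              # wrap like 32-bit unsigned arithmetic


print(split_r_info(0x0209))
print(hex(relocate(R_386_GOTOFF, S=0x8000, A=4, P=0, GOT=0x7000)))
```

Masking with 0xFFFFFFFF mirrors the wraparound that the uint arithmetic in the original code gets for free.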
gcc -fno-plt -fno-pic will limit relocation types to R_386_PC32 and R_386_32 (or at least it worked in my case). Accepted answer is misleading in claiming it's not possible.
You can try the gcc options -fPIE or -fpie, which may disable the GOT.

Declare an lldb summary-string for a sized string type

I would like to have a formatter for the built-in string type of the Nim language, but somehow I fail at providing it. Nim compiles to C, and the C representation of the string type is shown here:
#if defined(__GNUC__) || defined(__clang__) || defined(_MSC_VER)
# define SEQ_DECL_SIZE /* empty is correct! */
#else
# define SEQ_DECL_SIZE 1000000
#endif
typedef char NIM_CHAR;
typedef long long int NI64;
typedef NI64 NI;
struct TGenericSeq {NI len; NI reserved; };
struct NimStringDesc {TGenericSeq Sup; NIM_CHAR data[SEQ_DECL_SIZE]; };
and here is the output of what I have tried in the lldb session:
(lldb) frame variable *longstring
(NimStringDesc) *longstring = {
Sup = (len = 9, reserved = 15)
data = {}
}
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) type summary add --summary-string "${&var[0]%s}" "NIM_CHAR []"
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) type summary add --summary-string "${var%s}" "NIM_CHAR *"
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) frame variable &longstring->data[0]
(NIM_CHAR *) &[0] = 0x00007ffff7f3a060 "9 - 3 - 2"
(lldb) frame variable *longstring
(lldb) type summary add --summary-string "${var.data%s}" "NimStringDesc"
(lldb) frame variable *longstring
(NimStringDesc) *longstring = NIM_CHAR [] @ 0x7ffff7f3a060
(lldb) type summary add --summary-string "${&var.data[0]%s}" "NimStringDesc"
(lldb) frame variable *longstring
(NimStringDesc) *longstring = {
Sup = (len = 9, reserved = 15)
data = {}
}
(lldb)
I simply can't get the output to be just data interpreted as a '\0'-terminated C string.
The summary string syntax you've tried is (by design) not as syntax rich as C.
And since you're using a zero-sized array, I don't think we have any magic provision to treat that as a pointer-to string. You might want to file a bug about it, but in this case, it's arguable whether it would help you. Since your string is length-encoded it doesn't really need to be zero-terminated, and that is the only hint LLDB would understand out of the box to know when to stop reading out of a pointer-to characters.
In your case, you're going to have to resort to Python formatters.
The things you need are:
the memory location of the string buffer
the length of the string buffer
a process to read memory out of
This is a very small Python snippet that does it - I'll give you enhancement suggestions as well, but let's start with the basics:
def NimStringSummary(valobj,stuff):
    l = valobj.GetChildMemberWithName('Sup').GetChildMemberWithName('len').GetValueAsUnsigned(0)
    s = valobj.GetChildMemberWithName('data').AddressOf()
    return '"%s"'% valobj.process.ReadMemory(s.GetValueAsUnsigned(0),l,lldb.SBError())
As you can see, it first reads the value of the length field, then takes the address of the data buffer, then uses the process that the value comes from to read the string content, and finally returns it in quotes.
Now, this is a proof of concept. If you used it in production, you'd quickly run into a few issues:
What if your string buffer hasn't been initialized yet, and it says the size of the buffer is 20 gigabytes? You're going to have to limit the size of the data you're willing to read. For string-like types it has builtin knowledge of (char*, std::string, Swift.String, ...) LLDB prints out the truncated buffer followed by ..., e.g.
(const char*) buffer = "myBufferIsVeryLong"...
What if the pointer to the data is invalid? You should check that s.GetValueAsUnsigned(0) isn't actually zero - if it is you might want to print an error message like "null buffer".
Also, here I just passed an SBError that I then ignore - it would be better to pass one and then check it
All in all, you'd end up with something like:
import lldb
import os

def NimStringSummary(valobj, stuff):
    # The string is length-prefixed: the length lives in Sup.len.
    l = valobj.GetChildMemberWithName('Sup').GetChildMemberWithName('len').GetValueAsUnsigned(0)
    if l == 0: return '""'
    if l > 1024: l = 1024  # cap the read in case len is garbage
    s = valobj.GetChildMemberWithName('data').AddressOf()
    addr = s.GetValueAsUnsigned(0)
    if addr == 0: return '<null buffer>'
    err = lldb.SBError()
    buf = valobj.process.ReadMemory(addr, l, err)
    if err.Fail(): return '<error: %s>' % str(err)
    # ReadMemory returns bytes under Python 3, so decode before formatting.
    return '"%s"' % buf.decode('utf-8', 'replace')

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand("type summary add NimStringDesc -F %s.NimStringSummary" % os.path.splitext(os.path.basename(__file__))[0])
The one extra trick is the __lldb_init_module function - this function is automatically called by LLDB whenever you 'command script import' a python file. This will allow you to add the 'command script import ' to your ~/.lldbinit file and automatically get the formatter to be picked up in all debug sessions
Hope this helps!

Can a breakpoint display the contents of "const unsigned char* variable"?

I'm on the trail of why the contents of a TXT record in a Bonjour service discovery is sometimes being incompletely interpreted, and I've reached a point where it would be really useful to have a breakpoint print out the contents of an unsigned char in a callback (I've tried NSLog, but using NSLog in a threaded callback can get really tricky).
The callback function is defined this way:
static void resolveCallback(DNSServiceRef sdRef, DNSServiceFlags flags, uint32_t interfaceIndex, DNSServiceErrorType errorCode,
const char* fullname, const char* hosttarget, uint16_t port, uint16_t txtLen,
const unsigned char* txtRecord, void* context) {
So I'm interested in the txtRecord
Right now my breakpoint is using:
memory read --size 4 --format x --count 4 `txtRecord`
But that's only because that was an example on the lldb.llvm.org example page ;-) It's certainly showing data that I expect to be there, partially.
Do I have to apply informed knowledge of the length or can the breakpoint be coded such that it uses the length that is present? I'm thinking that instead of "hard coding" the two 4s in the example there ought to be a way to wrap in other read instructions inside back ticks like I did with the variable name.
Looking at http://lldb.llvm.org/varFormats.html I thought I'd try a format of C instead of x but that prints out series of dots which must mean I picked a wrong format or something.
I just tried
memory read `txtRecord`
and that's almost exactly what I wanted to see as it gives:
0x1c5dd884: 10 65 6e 30 3d 31 39 32 2e 31 36 38 2e 31 2e 33 .en0=192.168.1.3
0x1c5dd894: 36 0a 70 6f 72 74 3d 35 30 32 37 38 00 00 00 00 6.port=50278....
This looks really close:
memory read `txtRecord` --format C
giving:
0x1d0c6974: .en0=192.168.1.36.port=50278....
If that's the best I can get, I guess I can deal with the length bytes in front of each of the two strings in that txtRecord.
I'm asking this question because I'd like to display the actual and correct values... the bug is that sometimes the IP address comes back wrong, losing the frontmost 1, other times the port comes back "short" (in network byte order) with non-numeric characters at the end, like "502¿" instead of "50278" (in this example run).
My initial response to this question, while informative, was not complete. I originally thought the problem being reported was just about printing a c-string array of type unsigned char * where the default formatters (char *) weren't being used. That answer comes first. Then comes the answer about how to print this (somewhat unique) array of pascal strings data that the program is actually dealing with.
First answer: lldb knows how to handle the char * well; it's the unsigned char * bit that is making it behave a little worse than usual. e.g. if txtRecord were a const char *,
(lldb) p txtRecord
(const char *) $0 = 0x0000000100000f51 ".en0=192.168.1.36.port=50278"
You can copy the type summary lldb has built in for char * for unsigned char *. type summary list lists all of the built in type summaries; copying lldb-179.5's summaries for char *:
(lldb) type summary add -p -C false -s ${var%s} 'unsigned char *'
(lldb) type summary add -p -C false -s ${var%s} 'const unsigned char *'
(lldb) fr va txtRecord
(const unsigned char *) txtRecord = 0x0000000100000f51 ".en0=192.168.1.36.port=50278"
(lldb) p txtRecord
(const unsigned char *) $2 = 0x0000000100000f51 ".en0=192.168.1.36.port=50278"
(lldb)
Of course you can put these in your ~/.lldbinit file and they'll be picked up by Xcode et al from now on.
Second answer: To print the array of pascal strings that this is actually using, you'll need to create a python function. It will take two arguments, the size of the pascal string buffer (txtLen) and the address of the start of the buffer (txtRecord). Create a python file like pstrarray.py (I like to put these in a directory I made, ~/lldb) and load it into your lldb via the ~/.lldbinit file so you have the command available:
command script import ~/lldb/pstrarray.py
The python script is a little long; I'm sure someone more familiar with python could express this more concisely. There's also a bunch of error handling which adds bulk. But the main idea is to take two parameters: the size of the buffer and the pointer to the buffer. The user will express these with variable names like pstrarray txtLen txtRecord, in which case you could look up the variables in the current frame, but they might also want to use an actual expression like pstrarray sizeof(str) str. So we need to pass these parameters through the expression evaluation engine to get them down to an integer size and a pointer address. Then we read the memory out of the process and print the strings.
import lldb
import shlex
import optparse

def pstrarray(debugger, command, result, dict):
    command_args = shlex.split(command)
    parser = create_pstrarray_options()
    try:
        (options, args) = parser.parse_args(command_args)
    except:
        # optparse exits via SystemExit on bad options; a bare except catches that too.
        return
    if debugger and debugger.GetSelectedTarget() and debugger.GetSelectedTarget().GetProcess():
        process = debugger.GetSelectedTarget().GetProcess()
        if len(args) < 2:
            print("Usage: pstrarray size-of-buffer pointer-to-array-of-pascal-strings")
            return
        if process.GetSelectedThread() and process.GetSelectedThread().GetSelectedFrame():
            frame = process.GetSelectedThread().GetSelectedFrame()
            size_of_buffer_sbval = frame.EvaluateExpression (args[0])
            if not size_of_buffer_sbval.IsValid() or size_of_buffer_sbval.GetValueAsUnsigned (lldb.LLDB_INVALID_ADDRESS) == lldb.LLDB_INVALID_ADDRESS:
                print('Could not evaluate "%s" down to an integral value' % args[0])
                return
            size_of_buffer = size_of_buffer_sbval.GetValueAsUnsigned ()
            address_of_buffer_sbval = frame.EvaluateExpression (args[1])
            if not address_of_buffer_sbval.IsValid():
                print('could not evaluate "%s" down to a pointer value' % args[1])
                return
            address_of_buffer = address_of_buffer_sbval.GetValueAsUnsigned (lldb.LLDB_INVALID_ADDRESS)
            # If the expression eval didn't give us an integer value, try it again with an & prepended.
            if address_of_buffer == lldb.LLDB_INVALID_ADDRESS:
                address_of_buffer_sbval = frame.EvaluateExpression ('&%s' % args[1])
                if address_of_buffer_sbval.IsValid():
                    address_of_buffer = address_of_buffer_sbval.GetValueAsUnsigned (lldb.LLDB_INVALID_ADDRESS)
            if address_of_buffer == lldb.LLDB_INVALID_ADDRESS:
                print('could not evaluate "%s" down to a pointer value' % args[1])
                return
            err = lldb.SBError()
            pascal_string_buffer = process.ReadMemory (address_of_buffer, size_of_buffer, err)
            if (err.Fail()):
                print('Failed to read memory at address 0x%x' % address_of_buffer)
                return
            # bytearray gives integer elements under both Python 2 and Python 3.
            pascal_string_array = bytearray(pascal_string_buffer)
            index = 0
            while index < size_of_buffer:
                length = pascal_string_array[index]
                print(pascal_string_array[index+1:index+1+length].decode('ascii', 'replace'))
                index = index + length + 1

def create_pstrarray_options():
    usage = "usage: %prog"
    description='''print a buffer which has an array of pascal strings in it'''
    parser = optparse.OptionParser(description=description, prog='pstrarray',usage=usage)
    return parser

def __lldb_init_module (debugger, dict):
    parser = create_pstrarray_options()
    pstrarray.__doc__ = parser.format_help()
    debugger.HandleCommand('command script add -f %s.pstrarray pstrarray' % __name__)
and an example program to run this on:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
int main ()
{
    unsigned char str[] = {16,'e','n','0','=','1','9','2','.','1','6','8','.','1','.','3','6',
                           10,'p','o','r','t','=','5','1','6','8','7'};
    uint8_t *p = str;
    while (p < str + sizeof (str))
    {
        int len = *p++;
        char buf[len + 1];
        strlcpy (buf, (char*) p, len + 1);
        puts (buf);
        p += len;
    }
    puts ("done"); // break here
}
and in use:
(lldb) br s -p break
Breakpoint 1: where = a.out`main + 231 at a.c:17, address = 0x0000000100000ed7
(lldb) r
Process 74549 launched: '/private/tmp/a.out' (x86_64)
en0=192.168.1.36
port=51687
Process 74549 stopped
* thread #1: tid = 0x1c03, 0x0000000100000ed7 a.out`main + 231 at a.c:17, stop reason = breakpoint 1.1
#0: 0x0000000100000ed7 a.out`main + 231 at a.c:17
14 puts (buf);
15 p += len;
16 }
-> 17 puts ("done"); // break here
18 }
(lldb) pstrarray sizeof(str) str
en0=192.168.1.36
port=51687
(lldb)
While it's cool that it's possible to do this in lldb, it's not as smooth as we'd like to see. If the size of the buffer and the address of the buffer were contained in a single object, struct PStringArray {uint16_t size; uint8_t *addr;}, that would work much better. You could define a type summary formatter for all variables of type struct PStringArray and no special commands would be required. You'd still need to write a python function, but it could get all the information it needed out of the object directly so it would disappear into the lldb type format system. You could just write (lldb) p strs and the custom formatter function would be called on strs to print all the strings in there.
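The decoding loop at the heart of the script can also be exercised outside lldb, which may help when checking whether a captured TXT record is itself malformed. The sketch below is a standalone illustration, not part of the answer's script: the function name parse_txt_record is hypothetical, and the truncation check is an addition that the lldb command does not perform.

```python
def parse_txt_record(buf):
    """Split a buffer of length-prefixed (Pascal-style) strings into a list."""
    data = bytearray(buf)          # integer elements on Python 2 and 3
    strings = []
    index = 0
    while index < len(data):
        length = data[index]       # one length byte precedes each string
        chunk = data[index + 1:index + 1 + length]
        if len(chunk) < length:
            # The length byte promises more data than the buffer holds.
            raise ValueError("truncated entry at offset %d" % index)
        strings.append(chunk.decode("ascii", "replace"))
        index += 1 + length
    return strings


# Same bytes as the hex dump in the question, minus the trailing padding.
record = bytearray([16]) + b"en0=192.168.1.36" + bytearray([10]) + b"port=50278"
print(parse_txt_record(record))
```

Passing only txtLen bytes matters here: the zero bytes after the record in the memory dump would otherwise be read as a series of empty strings.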

View array in LLDB: equivalent of GDB's '@' operator in Xcode 4.1

I would like to view an array of elements pointed to by a pointer. In GDB this can be done by treating the pointed memory as an artificial array of a given length using the operator '@' as
*pointer @ length
where length is the number of elements I want to view.
The above syntax does not work in LLDB supplied with Xcode 4.1.
Is there any way to accomplish the above in LLDB?
There are two ways to do this in lldb.
Most commonly, you use the parray lldb command which takes a COUNT and an EXPRESSION; EXPRESSION is evaluated and should result in a pointer to memory. lldb will then print COUNT items of that type at that address. e.g.
parray 10 ptr
where ptr is of type int *.
Alternatively, it can be done by casting the pointer to a pointer-to-array.
For example, if you have a int* ptr, and you want to view it as an array of ten integers, you can do
p *(int(*)[10])ptr
Because it relies only on standard C features, this method works without any plugins or special settings. It likewise works with other debuggers like GDB or CDB, even though they also have specialized syntaxes for printing arrays.
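The same reinterpretation, a pointer to one element viewed as a pointer to an array of N elements, can be tried outside a debugger with Python's ctypes. This is only an analogy to the C cast above, and the helper name view_as_array is made up for the example.

```python
import ctypes


def view_as_array(ptr, count):
    """Reinterpret a ctypes pointer as a pointer to `count` elements and dereference it."""
    array_type = ptr._type_ * count                  # e.g. c_int * 10, like int[10]
    return ctypes.cast(ptr, ctypes.POINTER(array_type)).contents


values = (ctypes.c_int * 10)(*range(10))             # backing storage
ptr = ctypes.cast(values, ctypes.POINTER(ctypes.c_int))  # plain int* to the first element
print(list(view_as_array(ptr, 10)))
```

As in the debugger, nothing checks that the memory really holds that many elements, so the count has to come from knowledge of the program.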
Starting with the lldb in Xcode 8.0, there is a new built-in parray command. So you can say:
(lldb) parray <COUNT> <EXPRESSION>
to print the memory pointed to by the result of the EXPRESSION as an array of COUNT elements of the type pointed to by the expression.
If the count is stored in a variable available in the current frame, then remember you can do:
(lldb) parray `count_variable` pointer_to_malloced_array
That's a general lldb feature, any command-line argument in lldb surrounded in backticks gets evaluated as an expression that returns an integer, and then the integer gets substituted for the argument before command execution.
The only way I found was via a Python scripting module:
""" File: parray.py """
import lldb
import shlex

def parray(debugger, command, result, dict):
    args = shlex.split(command)
    va = lldb.frame.FindVariable(args[0])
    for i in range(0, int(args[1])):
        print(va.GetChildAtIndex(i, 0, 1))
Define a command "parray" in lldb:
(lldb) command script import /path/to/parray.py
(lldb) command script add --function parray.parray parray
Now you can use "parray variable length":
(lldb) parray a 5
(double) *a = 0
(double) [1] = 0
(double) [2] = 1.14468
(double) [3] = 2.28936
(double) [4] = 3.43404
With Xcode 4.5.1 (which may or may not help you now), you can do this in the lldb console:
(lldb) type summary add -s "${var[0-63]}" "float *"
(lldb) frame variable pointer
(float *) pointer = 0x000000010ba92950 [0.0,1.0,2.0,3.0, ... ,63.0]
This example assumes that 'pointer' is an array of 64 floats: float pointer[64];
It doesn't seem to be supported yet.
You could use the memory read function (memory read / x), like
(lldb) memory read -ff -c10 `test`
to print a float ten times from that pointer. This should be the same functionality as gdb's @.
Starting from Martin R's answer, I improved it as follows:
If the pointer is not a simple variable, e.g.:
struct {
int* at;
size_t size;
} a;
Then "parray a.at 5" fails.
I fixed this by replacing "FindVariable" with "GetValueForVariablePath".
Now what if the elements in your array are aggregates, e.g.:
struct {
struct { float x; float y; }* at;
size_t size;
} a;
Then "parray a.at 5" prints: a.at->x, a.at->y, a.at[2], a.at[3], a.at[4] because GetChildAtIndex() returns members of aggregates.
I fixed this by resolving "a.at" + "[" + str(i) + "]" inside the loop instead of resolving "a.at" and then retrieving its children.
Added an optional "first" argument (Usage: parray [FIRST] COUNT), which is useful when you have a huge number of elements.
Made it do the "command script add -f parray.parray parray" at init
Here is my modified version:
import lldb
import shlex

def parray(debugger, command, result, dict):
    args = shlex.split(command)
    if len(args) == 2:
        count = int(args[1])
        indices = range(count)
    elif len(args) == 3:
        first = int(args[1])
        count = int(args[2])
        indices = range(first, first + count)
    else:
        print('Usage: parray ARRAY [FIRST] COUNT')
        return
    for i in indices:
        print(lldb.frame.GetValueForVariablePath(args[0] + "[" + str(i) + "]"))

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand('command script add -f parray.parray parray')
I tried to add a comment but that wasn't great for posting a full answer so I made my own answer. This solves the problem with getting "No Value". You need to get the current frame as I believe lldb.frame is set at module import time so it doesn't have the current frame when you stop at a breakpoint if you load the module from .lldbinit. The other version would work if you import or reloaded the script when you stopped at the breakpoint. The version below should always work.
import lldb
import shlex

#lldb.command('parray', 'command script add -f parray.parray parray')
def parray(debugger, command, result, dict):
    # Look up the frame at call time instead of relying on lldb.frame.
    target = debugger.GetSelectedTarget()
    process = target.GetProcess()
    thread = process.GetSelectedThread()
    frame = thread.GetSelectedFrame()
    args = shlex.split(command)
    if len(args) == 2:
        count = int(args[1])
        indices = range(count)
    elif len(args) == 3:
        first = int(args[1])
        count = int(args[2])
        indices = range(first, first + count)
    else:
        print('Usage: parray ARRAY [FIRST] COUNT')
        return
    for i in indices:
        print(frame.GetValueForVariablePath(args[0] + "[" + str(i) + "]"))
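The argument handling in these scripts can be tested without a debugger by separating it from the lldb calls. The sketch below uses a hypothetical helper, element_paths, that turns a command string of the form ARRAY [FIRST] COUNT into the list of variable paths that would be passed to GetValueForVariablePath; it is an illustration of the indexing logic, not a replacement for the command.

```python
import shlex


def element_paths(command):
    """Expand 'ARRAY [FIRST] COUNT' into variable paths such as 'a.at[2]'."""
    args = shlex.split(command)
    if len(args) == 2:
        first, count = 0, int(args[1])
    elif len(args) == 3:
        first, count = int(args[1]), int(args[2])
    else:
        raise ValueError("Usage: parray ARRAY [FIRST] COUNT")
    # Build one path per element so aggregates are addressed by index,
    # rather than enumerating the children of the pointer itself.
    return ["%s[%d]" % (args[0], i) for i in range(first, first + count)]


print(element_paths("a.at 2 3"))
```

Each returned path addresses one element by index, which is what avoids the GetChildAtIndex problem with aggregates described above.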
To inspect variables you can use the frame variable command (fr v is the shortest unique prefix) which has a -Z flag which does exactly what you want:
(lldb) fr v buffer -Z5
(int64_t *) buffer = 0x000000010950c000 {
(int64_t) [0] = 0
(int64_t) [1] = 0
(int64_t) [2] = 0
(int64_t) [3] = 0
(int64_t) [4] = 0
}
Unfortunately, the expression command does not support that flag.
Well at that point, you may as well write your own custom C function and invoke it with:
call (int)myprint(args)
