Print addresses of all local variables in C - debugging

I want to print the addresses of all the local and global variables which are being used in a function, at different points of execution of a program and store them in a file.
I am trying to use gdb for this same.
The "info local" command prints the values of all local variables. I need something to print the addresses in a similar way. Is there any built in command for it?
Edit 1
I am working on a gcc plugin which generates a points-to graph at compile time.
I want to verify if the graph generated is correct, i.e. if the pointers do actually point to the variables, which the plugin tells they should be pointing to.
We want to validate this points-to information on large programs with over thousands of lines of code. We will be validating this information using a program and not manually. There are several local and global variables in each function, therefore adding printf statements after every line of code is not possible.

There is no built-in command to do this. There is an open feature request in gdb bugzilla to have a way to show the meaning of all the known slots in the current stack frame, but nobody has ever implemented this.
This can be done with a bit of gdb scripting. The simplest way is to use Python to iterate over the Blocks of the selected Frame. Then in each such Block, you can iterate over all the variables, and invoke info addr on the variable.
Note that printing the address with print &var will not always work. A variable does not always have an address -- but, if the variable exists, it will have a location, which is what info addr will show.
One simple way these ideas can differ is if the compiler decides to put the variable into a register. There are more complicated cases as well, though, for example the compiler can put the variable into different spots at different points in the function; or can split a local struct into its constituent parts and move them around.
By default info addr tries to print something vaguely human-readable. You can also ask it to just dump the DWARF location expressions if you need that level of detail.

programmatically ( in C/C++ ) you use the & operator to get the address of a variable (assuming it's not a pointer):
int a; //variable declaration
print("%d", a); //print the value of the variable (as an integer)
print("0x%x", &a); //print the address of the variable (as hex)
The same goes for (gdb), just use &
plus the question has already been answered here (and not only)

Related

Declaring a global variable in Linux Kernel and using it through out the device power on

I have multiple boards which were having different screen sizes(4",7",12" etc...). I need this info at multiple places in the Linux Kernel during boot time. We can know the device screen size by reading values on two input lines. I don't want to read every time these lines whenever I want to know the device screen size. I want to read these lines once at the beginning(maybe in the board file) and store it in a global variable. Check this global variable value where ever I need the screen size. Can anyone suggest where I can declare this global variable or any better way for this?
So you've given some thought to your problem and you think a "global" variable in the Linux kernel is the fastest/best solution. Where a "global" variable is a variable that one module defines and initializes and other modules can also read and modify it. You can do that by exporting the symbol with the EXPORT_SYMBOL macro.
As an example, look at this file on lines 54 and 55 the symbol drm_debug is declared as an unsigned int and exported for other driver modules to use. The other modules need to be informed about that symbol's existance before they can use it. In this case, that happens in this header file on line 781. So when another file or module wants to use that symbol, that source file includes the header, and then it just uses the variable as if it were declared in that file. If you don't want to create a header file for your "personal" global variables, you can just add that "extern" as a 1-liner to your global scope declarations for that source code file, and it will have the same effect.
In addition to exporting variables, you can also export functions in the same fashion. If you grep for "EXPORT_SYMBOL" in your kernel most of the hits are going to be functions. When you pass around a variable there is the chance that one module might change it at an inopportune moment when another module might be reading it causing undesired outcomes. I like to think of the function as a "read only" version of a variable, so only one module can change it, but everyone else can see it. Having one module being "responsible" for reading and interpreting the GPIO and then sharing that information with other modules to me seems to fit better with the exported function than with the exported variable.
Lastly, whenever you start sharing symbols in the kernel, you should give thought to the name of the global variable you choose. For example, don't take something that's already taken or likely to be taken.

Ui path studio arguments? defined as variables or hard coded values

I was doing practice questions for the rpa asociate exam and came across a question I was unsure on how to answer.
Suppose you have some arguments from an invoked workflow. Some are out arguments, some are in and some are in/out. The question asked which types of arguments supported being mapped to variables and which supported being mapped to hard coded values.
I didn't find much on the documentation expect a generic paragraph on arguments as a whole that said that you could map them to both variables and hard coded values regardless of direction.
TL;DR - If you hardcode a value to an Out or In/Out, the Out value produced cannot map to the Out argument and will not be used further down the line.
While it is probably allowed by UiPath Studio to add hardcoded values to Out or In/Out arguments, it would effectively break your automation by the nature of what an Out argument actually does.
When thinking through it, if you were to add a hardcoded value as the result of an Out or In/Out argument, all work done while the workflow is invoked would not be passed back to the invoking workflow when it is finished, since the out value produced is not able to replace a hardcoded value. An In argument, on the other hand, would work just as well whether the value is a variable or hard-coded, since that value is no longer needed as soon as the workflow that is being invoked starts (however, in most cases, it is probably still best practice to use a variable even for an In).

GDB using variable name to access local variable name

gdb provides a command "print localx" which prints the value stored in the localx variable. So, it basically must be using the symbol table to find the mapping (localx -> addressx on stack). I am unable to understand how this mapping can be created.
What I tried
I studied the intermediate temporary files of gcc using -save-temps option, and observed that a local variable local1 was mapped to a symbol name "LASF8". However, the objdump utility tool did not show this symbol name.
Context :
I am working on a project which requires building a pin-tool to print the accesses of local variables. Given a function, I would like to say that this address corresponds to this variable name. This requires reading the symbol table to correspond an address to a symbol table entry. GDB does the exact reverse mapping. Hence, I would like to understand the same.
The symbol table is contained in the debugging information. This debugging information is emitted by gcc -g. gdb reads the debugging information to get symbolic information, among other things.
Typically the debugging information is in DWARF format. See http://www.dwarfstd.org/ for the specification.
You can also see DWARF more directly using readelf. For example readelf -wi will show the main (".debug_info") debugging information for an ELF file.
Note that doing the mapping in reverse -- that is, assigning a name to every stack slot -- is not entirely easy. First, not every stack slot will have a name. This is because the compiler may spill temporaries to the stack. Second, many locals will have DWARF location expressions to represent their location. This means you'll need to write an expression evaluator (not hard but also not trivial); you could conceivably (unlikely in practice but possible in theory) run into expressions which cannot be evaluated without a real stack frame; and finally the names will therefore generally only be valid at a given PC.
I believe there's a feature request in gdb bugzilla to add this feature to gdb.

finding variable and function memory addresses in shell

In shell programming, is there any way to find the variable and function addresses?
Something like in c or c++ when we output the &x, we can get x memory address.
It depends on what you want. Most shells only support symbolic (or "soft") references, variables that contain the names of other variables. Korn shell 93 has name references, created using typeset -n, but the address is not exposed. Take a look at that in man ksh, it might help.
But even if you could get the memory address of variables, what would you do with it? Shell variables are not structured in a primitive manner as in C, so pointer arithemtic would be no use. You could not iterate through an array or character string just by incrementing an address. The best you could do is investigate the memory of that particular version of that particular shell.
Since the shells are mostly written in C, you could always compile the shell source with debug (-g) and connect using a debugger. No idea what that would buy you though, so my question has to be: why?
Edit; from a comment I see you want the stack size - I'm pretty sure that shell variables will be allocated either on the heap or in the environment block.

How can I force the order of functions in a binary with the gcc toolchain?

I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing i found was -fno-toplevel-reorder which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function) or even forcing functions having a particular order (and if you could enforce the order that would still not mean that the addresses stay the same when the source is changed!).
The biggest problem that I see is that even if it may be possible to fix a function to some address, it will be sheer impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might be almost easiest to provide trampolines at the addresses that the other program expects, and having the real functions (whereever they may be) pointed to by these. That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its proper section using __attribute__ ((section ("some name"))). Unluckily, .text always appears as the first section, so if anything in .text changes so the size is bumped over the 512 byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading wrong), this means you should be able to pack all the functions that you want located in the front into one section .text$A, and everything else into .text$B, and it should do just that.
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes from the order the functions are in the file and the order the files are on the command line when you link.
Embed something in the code that your external code can find, a const structure with some ascii code and the address to functions perhaps, then no matter where the compiler puts the functions you can find them.
that or use the normal .dll or .so mechanisms, and not have to mess with it.
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.

Resources