What is the default Calling Convention in TurboPascal? Variables should put in which registers or put into the stack LTR or RTL?
If it's the Turbo Pascal I knew long ago, it's the Pascal convention: parameters pushed on the stack left to right and the called function cleans up.
Related
How does make-array work in SBCL? Are there some equivalents of new and delete operators in C++, or is it something else, perhaps assembler level?
I peeked into the source, but didn't understand anything.
When using SBCL compiled from source and an environment like Emacs/Slime, it is possible to navigate the code quite easily using M-. (meta-point). Basically, the make-array symbol is bound to multiple things: deftransform definitions, and a defun. The deftransform are used mostly for optimization, so better just follow the function, first.
The make-array function delegates to an internal make-array% one, which is quite complex: it checks the parameters, and dispatches to different specialized implementation of arrays, based on those parameters: a bit-vector is implemented differently than a string, for example.
If you follow the case for simple-array, you find a function which calls allocate-vector-with-widetag, which in turn calls allocate-vector.
Now, allocate-vector is bound to several objects, multiple defoptimizers forms, a function and a define-vop form.
The function is only:
(defun allocate-vector (type length words)
(allocate-vector type length words))
Even if it looks like a recursive call, it isn't.
The define-vop form is a way to define how to compile a call to allocate-vector. In the function, and anywhere where there is a call to allocate-vector, the compiler knows how to write the assembly that implements the built-in operation. But the function itself is defined so that there is an entry point with the same name, and a function object that wraps over that code.
define-vop relies on a Domain Specific Language in SBCL that abstracts over assembly. If you follow the definition, you can find different vops (virtual operations) for allocate-vector, like allocate-vector-on-heap and allocate-vector-on-stack.
Allocation on heap translates into a call to calc-size-in-bytes, a call to allocation and put-header, which most likely allocates memory and tag it (I followed the definition to src/compiler/x86-64/alloc.lisp).
How memory is allocated (and garbage collected) is another problem.
allocation emits assembly code using %alloc-tramp, which in turns executes the following:
(invoke-asm-routine 'call (if to-r11 'alloc-tramp-r11 'alloc-tramp) node)
There are apparently assembly routines called alloc-tramp-r11 and alloc-tramp, which are predefined assembly instructions. A comment says:
;;; Most allocation is done by inline code with sometimes help
;;; from the C alloc() function by way of the alloc-tramp
;;; assembly routine.
There is a base of C code for the runtime, see for example /src/runtime/alloc.c.
The -tramp suffix stands for trampoline.
Have also a look at src/runtime/x86-assem.S.
I have been doing some x86 programming in Windows with NASM and I have run into some confusion. I am confused as to why I must do this:
extern _ExitProcess#4
Specifically I am confused about the '_' and the '#4'. I know that the '#4' is the size of the stack but why is it needed? When I looked in the kernel32.dll with a hex editor I only saw 'ExitProcess' not '_ExitProcess#4'.
I am also confused as to why C Functions do not need the underscore and the stack size such as this:
extern printf
Why don't C Functions need decorations?
My third question is "Is this the way I should be using these functions?" Right now I am linking with the actual dll files themselves.
I know that the '#4' is the size of the stack but why is it needed?
To enable the linker to report a fatal error if your compiler assumed the wrong calling convention for the function (this can happen if you forget to include header files in C and ignore all the compiler warnings or if a declaration doesn't exactly match the function in the shared library).
Why don't C Functions need decorations?
Functions that use the cdecl calling convention are decorated with a single leading (so it would actually be _printf).
The reason why no parameter size is encoded into the decorated name is that the caller is responsible for both setting up and tearing down the stack, so an argument count mismatch will not be fatal for the stack setup (though the calling function might still crash if it isn't given the right arguments, of course). It might even be possible that the argument count is variable, like in the case of printf.
When I looked in the kernel32.dll with a hex editor I only saw ExitProcess not _ExitProcess#4.
The mangled names are usually mapped to the actual exported names of the DLL using definition files (*.def), which then get compiled to *.lib import library files that can be used in your linker invocation. An example of such a definition file for kernel32.dll is this one. The following line defines the mapping for ExitProcess:
_ExitProcess#4 = ExitProcess
Is this the way I should be using these functions?
I don't know NASM very well, but the code I've seen so far usually specifies the decorated name, like in your example.
You can find more information on this excellent page about Win32 calling conventions.
This is my attempt to start a collection of GCC special features which usually do not encounter. this comes after #jlebedev in the another question mentioned "Effective C++" option for g++,
-Weffc++
This option warns about C++ code which breaks some of the programming guidelines given in the books "Effective C++" and "More Effective C++" by Scott Meyers. For example, a warning will be given if a class which uses dynamically allocated memory does not define a copy constructor and an assignment operator. Note that the standard library header files do not follow these guidelines, so you may wish to use this option as an occasional test for possible problems in your own code rather than compiling with it all the time.
What other cool features are there?
From time to time I go through the current GCC/G++ command line parameter documentation and update my compiler script to be even more paranoid about any kind of coding error. Here it is if you are interested.
Unfortunately I didn't document them so I forgot most, but -pedantic, -Wall, -Wextra, -Weffc++, -Wshadow, -Wnon-virtual-dtor, -Wold-style-cast, -Woverloaded-virtual, and a few others are always useful, warning me of potentially dangerous situations. I like this aspect of customizability, it forces me to write clean, correct code. It served me well.
However they are not without headaches, especially -Weffc++. Just a few examples:
It requires me to provide a custom copy constructor and assignment operator if there are pointer members in my class, which are useless since I use garbage collection. So I need to declare empty private versions of them.
My NonInstantiable class (which prevents instantiation of any subclass) had to implement a dummy private friend class so G++ didn't whine about "only private constructors and no friends"
My Final<T> class (which prevents subclassing of T if T derived from it virtually) had to wrap T in a private wrapper class to declare it as friend, since the standard flat out forbids befriending a template parameter.
G++ recognizes functions that never return a return value, and throw an exception instead, and whines about them not being declared with the noreturn attribute. Hiding behind always true instructions didn't work, G++ was too clever and recognized them. Took me a while to come up with declaring a variable volatile and comparing it against its value to be able to throw that exception unmolested.
Floating point comparison warnings. Oh god. I have to work around them by writing x <= y and x >= y instead of x == y where it is acceptable.
Shadowing virtuals. Okay, this is clearly useful to prevent stupid shadowing/overloading problems in subclasses but still annoying.
No previous declaration for functions. Kinda lost its importance as soon as I started copypasting the function declaration right above it.
It might sound a bit masochist, but as a whole, these are very cool features that increased my understanding of C++ and general programming.
What other cool features G++ has? Well, it's free, open, it's one of the most widely used and modern compilers, consistently outperforms its competitors, can eat almost anything people throw at it, available on virtually every platform, customizable to hell, continuously improved, has a wide community - what's not to like?
A function that returns a value (for example an int) will return a random value if a code path is followed that ends the function without a 'return value' statement. Not paying attention to this can result in exceptions and out of range memory writes or reads.
For example if a function is used to obtain the index into an array, and the faulty code path is used (the one that doesn't end with a return 'value' statement) then a random value will be returned which might be too big as an index into the array, resulting in all sorts of headaches as you wrongly mess up the stack or heap.
Maybe asking the question betrays my lack of knowledge about the process, but then again, there's no better reason to ask!
Tracking these down can be frustrating because stack traces can help me know where to start looking but not which object was null.
What is going on under the hood here? Is it because the variable names aren't bundled in the executable?
.NET code built with full optimizations and no debug info: your local variable names are gone, some local variables may have been eliminated entirely.
.NET code built with full optimizations + PDB (or full debug): most local variable names preserved, some local variables may have been eliminated
No optimizations + no debug info: local variable names are gone.
And then we have to consider that whatever you're dealing with may not be in a local variable at all - it might have been the result of a previous function call, on which you're chaining a new function call.
Basically you answered your own question. When you're code is compiled it's transformed in intermediate language (IL). IL does not have variable names the way your code does, arguments to a method being called are pushed on to a stack before the method is called and the currents methods arguments and local variables are referred to by there position. I believe this is because this structure aids the JIT compiler generate code.
The pdb symbols file stores a mapping between the IL generated and your code. It is used to tell you which line in your code each method call in the call stack refers to. Possibly the information stored here isn't detailed enough to say which variable is null, or possibly it was just considered too expensive in terms when of perf to be able to do this. In any case, if you have allowed the compiler to optimize the IL generated there may no longer be a one to one mapping between the variables in the IL and the variables in your code.
Hope that helps,
Rob
There is no "object identifier". There's no way that .NET could say "the object with identifier xxxx is null".
You'll learn how to not make these mistakes, don't worry. Just break down your expressions into smaller pieces, and you'll find which objects you forgot to initialize. You'll learn to iniitialize them in that scenario, and after a while, that case won't happen again.
GCC documentation states in 6.30 Declaring Attributes of Functions:
naked
Use this attribute on the ARM, AVR, IP2K, RX and SPU ports to indicate that the specified function does not need prologue/epilogue sequences generated by the compiler. It is up to the programmer to provide these sequences. The only statements that can be safely included in naked functions are asm statements that do not have operands. All other statements, including declarations of local variables, if statements, and so forth, should be avoided. Naked functions should be used to implement the body of an assembly function, while allowing the compiler to construct the requisite function declaration for the assembler.
Can I safely call functions using C syntax from naked functions, or only by using asm?
You can safely call functions from a naked function, provided that the called functions have a full prologue and epilogue.
Note that it is a bit of a nonsense to assert that you can 'safely' use assembly language in a naked function. You are entirely responsible for anything you do using assembly language, as you are for any calls you make to 'safe' functions.
To ensure that your generic called function is not static or inlined, it should be in a seperate compilation unit.
"naked" functions do not include any prologue or epilogue -- they are naked. In particular, they do not include operations on the stack for local variables, to save or restore registers, or to return to a calling function.
That does not mean that no stack exists -- the stack is initialized in the program initialisation, not in any function initialization. Since a stack exists, called function prologues and epilogues work correctly. A function call can safely push it's return address, any registers used, and space for any local variables. On return (using the return address), the registers are restored and the stack space is released.
Static or inlined-functions may not have a full prologue and epilogue. They can and may depend on the calling function to manage the stack and to restore corrupted registers.
This leads to the next point: you need the prologue and epilogue only to encapsulate the operations of the called function. If the called function is also safe (no explicit or implicit local variables, no changes to status registers), it can be safely static and/or inlined. As with asm, it would be your responsibility to make sure this is true.
If the only thing you do in the naked function is call another function you can just use a single JMP machine code instruction.
The function you jump to will have a valid prologue and it should return directly to the caller of the naked function since JMP doesn't push a return address on the stack.
The only statements that can be safely included in naked functions are asm statements that do not have operands. All other statements, including declarations of local variables, if statements, and so forth, should be avoided.
Based on the description you already gave, I would assume that even function calls are not suitable for the "naked" keyword.