I've heard the Google talk (http://www.youtube.com/watch?v=_gZK0tW8EhQ) by Ron Garret and read the paper (http://www.flownet.com/gat/jpl-lisp.html), but I'm not understanding how it worked to "correct" supposedly running code with a REPL. Was the DS-1's Lisp code running in some sort of virtual machine? Or was it "live" in the REPL's actual world? Or was the Lisp code an executable that got replaced? What exactly happened/happens when you dynamically change running Lisp code through a REPL?
Whereas most programs are built and distributed as an executable that contains only the necessary components to run the program, Lisp can be distributed as an image that contains not just the components for the specific program, but also much or all of the Lisp runtime and development environment.
The REPL is the quintessential mechanism for providing interactive access to a running Lisp environment. The two key components of the REPL, Read and Eval, expose much of the Lisp runtime system. For example, many Lisp systems today implement Eval by compiling the form returned by the Reader to machine code and then executing the result, in contrast to interpreting the form. Some systems, especially in the past, contained both an interpreter that starts executing quickly and is suitable for interactive use, and a compiler that produces better code. But modern systems are fast enough that the compile phase isn't noticeable, and they simply forgo the interpreter.
Of course, you can do very similar things today. A simple example is running SSH to your Linux box that's hosting PHP. Your PHP server is up and running and live, serving pages and requests. But you log in through SSH, go over and fix a PHP file, and as soon as you save that file, all of your users see the new result in real time: the system updates itself on the fly.
The fact that PHP is running on a Linux runtime vs Lisp running on a Lisp runtime is a detail. The effect is the same. The fact that PHP isn't compiled is also a detail. For example, you can do the same thing on a Java server: modify a JSP, save it, and the JSP is converted into a Servlet as Java source code, compiled on the fly by the Java runtime, then loaded into the executing container, replacing the old code.
Lisp's capability to do this is very nice, and it was very interesting back in the day. Today it's less so, as there are different systems providing similar capabilities.
Addenda:
No, Lisp is not a virtual machine; there's no need for it to be that complicated.
The key to the concept is dynamic dispatch. With dynamic dispatch there is some lookup involved before a function is invoked.
In a static language like C, locations of things are pretty much set in stone once the linker and loader have finished processing the executable in preparation to start executing.
So, in C if you have something simple like:
int add(int i) {
return i + 1;
}
void main() {
add(1);
}
After compiling, linking, and loading the program, the address of the add function is set in stone, and thus anything referring to that function knows exactly where to find it.
So, in assembly language: (note this is a contrived assembly language)
add: pop r1 ; pop R1 from the stack, loading the i parameter
add r1, 1; Add 1 to the parameter.
push r1 ; push result of function call
rts ; return from subroutine
main: push 1 ; Push parameter to function
call add ; call function
pop r1 ; gather (and ignore) the result
So, you can see here that add is fixed in place.
In something like Lisp, functions are referred to indirectly:
int add(int i) {
return i + 1;
}
int (*add_ptr)(int) = &add;
void main() {
(*add_ptr)(1);
}
In assembly you get:
add: pop r1 ; pop R1 from the stack, loading the i parameter
add r1, 1; Add 1 to the parameter.
push r1 ; push result of function call
rts ; return from subroutine
add_ptr: dw add ; put the address of the add routine in add_ptr
main: push 1 ; Push parameter to function
mov r1, add_ptr ; Put the contents of add_ptr into R1
call (r1) ; call function indirectly through R1
pop r1 ; gather (and ignore) the result
Now you can see here that rather than calling add directly, it is called indirectly through add_ptr. A Lisp runtime has the capability of compiling new code, and when that happens, add_ptr is overwritten to point to the newly compiled code. You can see how the code in main never has to change; it will call whatever function add_ptr is pointing to.
Since almost all of the functions in Lisp are referenced indirectly through their symbols, a lot can change "behind the back" of a running system, and the system will continue to run.
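As a very rough sketch of that idea in C (the names add_v1, add_v2 and add_ptr are made up for illustration), here is what "redefining" a function behind the back of its callers looks like; the call site never changes, only the slot it dispatches through:

#include <stdio.h>

/* Original definition. */
static int add_v1(int i) { return i + 1; }

/* A "redefinition" compiled later. */
static int add_v2(int i) { return i + 2; }

/* The indirection slot, the analogue of a Lisp symbol's function cell. */
static int (*add_ptr)(int) = add_v1;

int main(void) {
    printf("%d\n", add_ptr(1));  /* prints 2, old definition */

    /* "Recompile" add: just repoint the slot; callers never change. */
    add_ptr = add_v2;

    printf("%d\n", add_ptr(1));  /* prints 3, new definition, same call site */
    return 0;
}

A Lisp REPL does essentially this when you redefine a function: it compiles the new definition and updates the symbol's function cell, while every existing caller keeps dispatching through the symbol.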
When a function is recompiled, the old function code (assuming no other references) becomes eligible for garbage collection and will typically, eventually, go away.
You can also see that when the system is garbage collected, any code that is moved (such as the code for the add function) can be relocated by the runtime and its new location updated in add_ptr, so the system keeps operating even after code has been relocated by the garbage collector.
So, the key to it all, is to have your functions invoked through some lookup mechanism. Having this gives you a lot of flexibility.
Note that you can also do this in a running C system. For example, you can put code in a dynamic library, load the library, execute the code, and if you want, you can build a new dynamic library, close the old one, open the new one, and invoke the new code, all in a "running" system. The dynamic library interface provides the lookup mechanism that isolates the code.
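A minimal sketch of that C approach, assuming a POSIX system and a hypothetical library libwork.so exporting a function do_work (link the main program with -ldl on glibc):

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load the current version of the code. */
    void *lib = dlopen("./libwork.so", RTLD_NOW);
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    int (*do_work)(int) = (int (*)(int))dlsym(lib, "do_work");
    printf("%d\n", do_work(41));

    /* Later: a new libwork.so has been built. Drop the old code and
       pick up the new implementation without restarting the process. */
    dlclose(lib);
    lib = dlopen("./libwork.so", RTLD_NOW);
    do_work = (int (*)(int))dlsym(lib, "do_work");
    printf("%d\n", do_work(41));

    dlclose(lib);
    return 0;
}

Rebuilding libwork.so between the two dlopen calls gives the same "fix the running system" effect, at library granularity rather than per-function.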
Today, just for testing purposes, I came up with the following idea: create and compile a naive program in Code::Blocks, using the Release target to remove unnecessary debugging code, with a main function containing only three nop instructions so I could more quickly find the entry point of the main function.
CodeBlocks sample naive program:
Using the IDA disassembler, I saw something strange: additional machine-code calls are actually added to the main function (implicitly), a call to a system function that resides in kernel32.dll and is used for OS thread handling.
IDA program view:
In the machine code, just for testing purposes, the three "nop" opcodes (90) were replaced by "and esp, 0FFFFFFF0h" and the program was re-patched, which is why the "no operation" opcodes are not visible in the view.
Observed behaviour:
It is logical that a new thread is created for each process that is opened; as we can see in Task Manager, a process runs in its own thread, and that is the reason the compiler adds this code (the implicit default thread).
My questions:
How does the compiler know where to "inject" this call automatically?
Why is this call not made earlier, in the upper function (sub_401B8C) that routes to the main function's entry point?
To quote the gcc manual:
If no init section is available, when GCC compiles any function called main (or more accurately, any function designated as a program entry point by the language front end calling expand_main_function), it inserts a procedure call to __main as the first executable code after the function prologue. The __main function is defined in libgcc2.c and runs the global constructors.
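As a small illustration (a sketch, not specific to the poster's toolchain): on targets where GCC uses the __main mechanism described above, a "global constructor" such as the one below is what that implicitly inserted call ends up running before the body of main executes.

#include <stdio.h>

/* Registered as a "global constructor"; on the targets described in the
   manual excerpt above, it is run via the implicit call to __main that
   GCC inserts right after main's prologue. */
__attribute__((constructor))
static void before_main(void) {
    puts("runs before main's body");
}

int main(void) {
    puts("main body");
    return 0;
}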
I've read this tutorial: http://www.bravegnu.org/gnu-eprog/c-startup.html
I could follow the guide and run the code, but I have questions.
1) Why do we need both a load address and a run-time address? As I understand it, it is because we put .data in flash too; so why don't we run the app there, instead of needing start-up code to copy it into RAM?
2) Why do we need a linker script and start-up code here? Can I not just build the C source as below and run it with QEMU?
arm-none-eabi-gcc -nostdlib -o sum_array.elf sum_array.c
Many thanks
Your first question was answered in the guide.
When you load a program on an operating system, your .data section (basically the non-zero initialized globals) is loaded from the "binary" into the right offset in memory for you, so that when your program starts, the memory locations that represent your variables already have those values.
unsigned int x=5;
unsigned int y;
As a C programmer you write the above code and you expect x to be 5 when you first start using it, yes? Well, if you are booting from flash, bare metal, you don't have an operating system to copy that value into RAM for you; somebody has to do it. Further, all of the .data stuff has to be in flash: that number 5 has to be somewhere in flash so that it can be copied to RAM. So you need a flash address for it and a RAM address for it. Two addresses for the same thing.
And that begins to answer your second question. For every line of C code you write, you assume things; for example, that any function can call any other function. You would like to be able to call functions, yes? And you would like to be able to have local variables, and you would like the variable x above to be 5, and you might assume that y will be zero (although, thankfully, compilers are starting to warn about that). The startup code, at a minimum for generic C, sets up the stack pointer, which allows you to call other functions, have local variables, and have functions more than one or two lines of code long; it zeros the .bss so that the y variable above is zero; and it copies the value 5 over to RAM so that x is ready to go when your entry-point C function is run.
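A minimal sketch of what that startup code can look like in C, assuming the linker script defines symbols like __data_load__, __data_start__, __data_end__, __bss_start__ and __bss_end__ (real startup code also has to set the stack pointer first, typically in assembly or via the vector table on a Cortex-M):

#include <stdint.h>

/* Provided by the linker script (names here are illustrative). */
extern uint32_t __data_load__;   /* where .data's initial values sit in flash */
extern uint32_t __data_start__;  /* where .data must live in RAM */
extern uint32_t __data_end__;
extern uint32_t __bss_start__;
extern uint32_t __bss_end__;

extern int main(void);

void reset_handler(void) {
    uint32_t *src = &__data_load__;
    uint32_t *dst = &__data_start__;

    /* Copy initial values (e.g. the 5 for x) from flash to RAM. */
    while (dst < &__data_end__)
        *dst++ = *src++;

    /* Zero .bss so uninitialized globals (like y) start at 0. */
    for (dst = &__bss_start__; dst < &__bss_end__; )
        *dst++ = 0;

    main();
    for (;;) ;   /* nothing to return to on bare metal */
}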
If you don't have an operating system then you have to have code to do this, and yes, there are many, many sandboxes and toolchains that are set up for various platforms and already have the startup code and linker script, so that you can just
gcc -o myprog.elf myprog.c
Now that doesn't mean you can make system calls without a...system...printf, fopen, etc. But if you download one of these toolchains, it does mean that you don't actually have to write the linker script nor the bootstrap.
But it is still valuable information. Note that the startup code and linker script are required for operating-system-based programs too; it is just that native compilers for your operating system assume you are going to mostly write programs for that operating system, and as a result they provide a linker script and startup code in that toolchain.
1) The .data section contains variables. Variables are, well, variable -- they change at run time. The variables need to be in RAM so that they can be easily changed at run time. Flash, unlike RAM, is not easily changed at run time. The flash contains the initial values of the variables in the .data section. The startup code copies the .data section from flash to RAM to initialize the run-time variables in RAM.
2) Linker-script: The object code created by your compiler has not been located into the microcontroller's memory map. This is the job of the linker and that is why you need a linker script. The linker script is input to the linker and provides some instructions on the location and extent of the system's memory.
Startup code: Your C program that begins at main does not run in a vacuum but makes some assumptions about the environment. For example, it assumes that the initialized variables are already initialized before main executes. The startup code is necessary to put in place all the things that are assumed to be in place when main executes (i.e., the "run-time environment"). The stack pointer is another example of something that gets initialized in the startup code, before main executes. And if you are using C++ then the constructors of static objects are called from the startup code, before main executes.
1) Why do we need both load-address and run-time address.
While it is in most cases possible to run code from memory-mapped ROM, code will often execute faster from RAM. In some cases there may also be much more RAM than ROM, and the application code may be stored compressed in ROM, so the executable code is not simply copied from ROM but also decompressed, allowing a much larger application than the available ROM.
In situations where the code is stored on non-memory mapped mass-storage media such as NAND flash, it cannot be executed directly in any case and must be loaded into RAM by some sort of bootloader.
2) Why we need linker script and start-up code here. Can I not just build C source as below and run it with qemu?
The linker script defines the memory layout of your target and application. Since this tutorial is for bare-metal programming, there is no OS to handle that for you. Similarly, the start-up code is required to at least set an initial stack pointer, initialise static data, and jump to main. On an embedded system it is also necessary to initialise various hardware such as the PLL, memory controllers, etc.
What is a context register and how does it change the way Go code is compiled?
Context: I've seen uses of stub functions in some parts of GOROOT, i.e. reflect, but am not quite sure how these work.
The expression "context register" first appeared in commit b1b67a3 (Feb. 2013, Go 1.1rc2) for implementing step 3 of the Go 1.1 Function Calls:
Change the reflect.MakeFunc implementation to avoid run-time code generation as well
It was picked up in commit 4a000b9 in Feb. 2014, Go 1.3beta1, for assembly and system calls for Native Client x86-64, where sigreturn is managed as:
NaCl has abdicated its traditional operating system responsibility and declined to implement 'sigreturn'.
Instead the only way to return to the execution of our program is to restore the registers ourselves.
Unfortunately, that is impossible to do with strict fidelity, because there is no way to do the final update of PC that ends the sequence without either
(1) jumping to a register, in which case the register ends holding the PC value instead of its intended value, or
(2) storing the PC on the stack and using RET, which imposes the requirement that SP is valid and that it is okay to smash the word below it.
The second would normally be the lesser of the two evils, except that on NaCl, the linker must rewrite RET into "POP reg; AND $~31, reg; JMP reg", so either way we are going to lose a register as a result of the incoming signal.
Similarly, there is no way to restore EFLAGS; the usual way is to use POPFL, but NaCl rejects that instruction.
We could inspect the bits and execute a sequence of instructions designed to recreate those flag settings, but that's a lot of work.
Thankfully, Go's signal handlers never try to return directly to the executing code, so all the registers and EFLAGS are dead and can be smashed.
The only registers that matter are the ones that are setting up for the simulated call that the signal handler has created.
Today those registers are just PC and SP, but in case additional registers are relevant in the future (for example DX is the Go func context register) we restore as many registers as possible.
Much more recently (Q4 2016), for Go 1.8, we have commit d5bd797 and commit bf9c71c, for eliminating stack rescanning:
morestack writes the context pointer to gobuf.ctxt, but since morestack is written in assembly (and has to be very careful with state), it does not invoke the requisite write barrier for this write. Instead, we patch this up later, in newstack, where we invoke an explicit write barrier for ctxt.
This already requires some subtle reasoning, and it's going to get a lot hairier with the hybrid barrier.
Fix this by simplifying the whole mechanism.
Instead of writing gobuf.ctxt in morestack, just pass the value of the context register to newstack and let it write it to gobuf.ctxt. This is a normal Go pointer write, so it gets the normal Go write barrier. No subtle reasoning required.
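As a rough analogy in C (this is not Go's actual ABI, and the names are invented): the context register is simply an agreed-upon place where the caller of a func value leaves a pointer to the closure being invoked, so the called code can reach its captured variables.

#include <stdio.h>

/* A "func value": a code pointer plus captured state. */
struct closure {
    int (*code)(struct closure *ctx, int arg);
    int captured;                    /* the closed-over variable */
};

static int add_captured(struct closure *ctx, int arg) {
    return arg + ctx->captured;      /* reached through the context */
}

/* The caller passes the closure itself as the "context"; in Go's
   implementation that pointer travels in a dedicated register
   (e.g. DX on amd64 at the time of the quoted commits). */
static int call(struct closure *fn, int arg) {
    return fn->code(fn, arg);
}

int main(void) {
    struct closure add5 = { add_captured, 5 };
    printf("%d\n", call(&add5, 2));  /* prints 7 */
    return 0;
}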
Where can I find the source code of some of the system calls? For example, I am looking for the implementation of fstat as described here.
A system call is mostly implemented inside the Linux kernel, with a small piece of glue code in the C standard library. But see also vdso(7).
From the user-land point of view, a system call (they are listed in syscalls(2)) is a single machine instruction (often SYSENTER) with some calling conventions (e.g. defining which machine register holds the syscall number, such as __NR_stat from /usr/include/asm/unistd_64.h, and which other registers contain the arguments to the system call).
Use strace(1) to understand which system calls are done by a given program or process.
The C standard library has a tiny wrapper function (which invokes the kernel, following the ABI, and deals with error reporting & errno).
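For example, on x86-64 Linux (a sketch, error handling omitted), the fstat wrapper from the C library and a raw invocation through the generic syscall(2) wrapper end up at the same kernel entry point:

#define _GNU_SOURCE
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    struct stat st1, st2;

    /* Through the C library's thin wrapper... */
    fstat(0, &st1);

    /* ...and via the generic syscall(2) wrapper, passing the syscall
       number explicitly (SYS_fstat from <sys/syscall.h>). */
    syscall(SYS_fstat, 0, &st2);

    printf("%llu %llu\n",
           (unsigned long long)st1.st_ino,
           (unsigned long long)st2.st_ino);
    return 0;
}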
For stat(2), the C wrapping function is e.g. in stat/stat.c for musl-libc.
Inside the kernel code, most of the work happens in fs/stat.c (e.g. after line 207).
See also this & that answers
I'm developing a launcher for a game.
Want to intercept game's call for a function that prints text.
I don't know whether the code that contains this function is dynamically or statically linked, so I don't even know the function name.
I did intercept some Windows API calls of this game through Microsoft Detours, Ninject and some others.
But this one is not in the import table either.
What should I do to catch this function call? What profiler should be used? IDA? How could this be done?
EDIT:
Finally found the function address. Thanks, Skino!
Tried to hook it with Detours and injected a DLL. The injected DllMain:
typedef int (WINAPI *PrintTextType)(char *, int, float , int);
static PrintTextType PrintText_Origin = NULL;
int WINAPI PrintText_Hooked(char * a, int b, float c, int d)
{
return PrintText_Origin(a, b, c , d);
}
HMODULE game_dll_base;
/* game_dll_base initialization goes here */
BOOL APIENTRY DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
if(fdwReason==DLL_PROCESS_ATTACH)
{
DisableThreadLibraryCalls(hinstDLL);
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
PrintText_Origin = (PrintTextType)((DWORD)game_dll_base + 0x6049B0);
DetourAttach((PVOID *)&PrintText_Origin , PrintText_Hooked);
DetourTransactionCommit();
}
return TRUE;
}
It hooks as expected. Parameter a has the text that should be displayed. But when calling the original function with return PrintText_Origin(a, b, c, d); the application crashes (http://i46.tinypic.com/ohabm.png, http://i46.tinypic.com/dfeh4.png)
Original function disassembly:
http://pastebin.com/1Ydg7NED
After Detours:
http://pastebin.com/eM3L8EJh
EDIT2:
After Detours:
http://pastebin.com/GuJXtyad
PrintText_Hooked disassembly: http://pastebin.com/FPRMK5qt (w3_loader.dll is the injected DLL)
I'm bad at ASM; please tell me what could be wrong.
Want to intercept game's call for a function that prints text.
You can use a debugger for the investigative phase. Either IDA, or even Visual Studio (in combination with e.g. HxD), should do. It should be relatively easy to identify the function using the steps below:
Identify a particular fragment of text whose printing you want to trace (e.g. Hello World!)
Break the game execution at any point before the game normally prints the fragment you identified above
Search for that fragment of text† (look for either Unicode or ANSI) in the game's memory. IDA will allow you to do that IIRC, as will the free HxD (Extras > Open RAM...)
Once the address of the fragment has been identified, set a break-on-access/read data breakpoint so the debugger will give you control the moment the game attempts to read said fragment (while or immediately prior to displaying it)
Resume execution, wait for the data breakpoint to trigger
Inspect the stack trace and look for a suitable candidate for hooking
Step through from the moment the fragment is read from memory until it is printed if you want to explore additional potential hook points
†provided text is not kept compressed (or, for whatever reason, encrypted) until the very last moment
Once you are done with the investigative phase and you have identified where you'd like to inject your hook, you have two options when writing your launcher:
If, based on the above exercise, you were able to identify an export/import after all, then use any API hooking techniques
EDIT Use Microsoft Detours, making sure that you first correctly identify the calling convention (cdecl, fastcall, stdcall) of the function you are trying to detour, and use that calling convention for both the prototype of the original as well as for the implementation of the dummy. See examples, and the sketch after this list.
If not, you will have to
use the Debugging API to programmatically load the game
compute the hook address based on your investigative phase (either as a hard-coded offset from the module base, or by looking for the instruction bytes around the hook site‡)
set a breakpoint
resume the process
wait for the breakpoint to trigger, do whatever you have to do
resume execution, wait for the next trigger, etc.; again, all done programmatically by your launcher via the Debugging API.
‡to be able to continue to work with eventual patch releases of the game
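For instance, here is a sketch only (the real calling convention of the game's PrintText must be read off the disassembly): if the target function turns out to be __cdecl rather than stdcall, both the typedef and the replacement have to say so, otherwise the stack is corrupted on return, which is consistent with the crash described in the question.

#include <windows.h>

/* Hypothetical corrected declarations, assuming the game's PrintText
   is __cdecl; verify the actual convention in the disassembly. */
typedef int (__cdecl *PrintTextType)(char *, int, float, int);
static PrintTextType PrintText_Origin = NULL;

int __cdecl PrintText_Hooked(char *a, int b, float c, int d)
{
    /* ...inspect or alter the text in a here if desired... */
    return PrintText_Origin(a, b, c, d);
}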
At this stage it sounds like you don't have a notion of what library function you're trying to hook, and you've stated it's not (obviously, at least) an imported external function in the import table. That probably means the function responsible for generating the text is located inside the .text of the application you are disassembling directly, or is loaded dynamically; text generation (especially in a game) is likely a part of the application itself.
In my experience, the simplest way to find code that is difficult to trace like this is to stop the application during, or shortly before/after, the text is displayed and use IDA's fabulous call-graph functionality to establish what is responsible for writing it out (use watches and breakpoints liberally!).
Look carefully at calls to CreateRemoteThread or any other commonly used dynamic loading mechanism if you have reason to believe this functionality might be provided by an exported function that isn't showing up in the import table.
I strongly advise against it, but for the sake of completeness: you could also hook NtSetInformationThread in the system service dispatch table. There's a good dump of the table for different Windows versions here. If you want to get the index in the table yourself, you can just disassemble the NtSetInformationThread export from ntdll.dll.