How are globals initialized before the entry point? - windows

I'm trying to figure out how Windows maps the memory of a PE file into the address space, and I've run into something that confuses me.
Let's say we have something like this:
HMODULE some_module = GetModuleHandleA(NULL);
int main() { // Or DllMain doesn't matter
// some operations using some_module or whatever
return 0;
}
The initialization of some_module is performed before the entry point is called. I'm trying to implement this by looking into the PE file (I found the initialization functions), but the only thing I can see is that those initialization functions appear as RUNTIME_FUNCTION entries, nothing else. How can I pick those initialization functions out from all the runtime functions and call them manually? Is there any documentation about this? I also tried a function called RtlAddFunctionTable, but I don't think it's meant for that. What kind of operations can be performed to implement this? Thanks.

The problem is solved; it turned out to be about something different. But I did some research and found that those entries (runtime functions, including the static initializers) are called from the entry point itself. The functions are specified as a memory range and called by a function named "ucrtbase!initterm" (or "ucrtbase!_initterm"). In some PE files that initterm function is compiled into the binary as its own function instead of being imported from ucrtbase. Finally, the functions are called in the order in which they're located in memory (lower address -> higher address).
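To make the description above concrete, here is a minimal sketch of what an initterm-style routine does, assuming the "memory range" is an array of function pointers that is walked from the lowest address to the highest and that empty (null) slots are skipped. The names InitFn, run_initializers, init_a, init_b and demo_run are made up for illustration; this is not the CRT's actual source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical name: the type of one entry in the initializer table.
using InitFn = void (*)();

// Hypothetical stand-in for an initterm-style walker: call every
// non-null entry in [first, last), lowest address first.
void run_initializers(InitFn* first, InitFn* last) {
    for (InitFn* it = first; it < last; ++it) {
        if (*it != nullptr) {
            (*it)();
        }
    }
}

// Two dummy "static initializers" that record the order they ran in.
static std::vector<int> g_order;
static void init_a() { g_order.push_back(1); }
static void init_b() { g_order.push_back(2); }

// Builds a small table (with one empty slot) and walks it.
std::vector<int> demo_run() {
    g_order.clear();
    InitFn table[] = { init_a, nullptr, init_b };
    run_initializers(table, table + 3);
    return g_order;
}
```

Calling demo_run() runs init_a and then init_b, in table order. In a real image the table would be located inside the mapped PE file rather than built by hand.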

Related

Handling package and function names dynamically

I'm on my first Golang project, which consists of a small router for an MVC structure. Basically, what I hope it does is take the request URL, separate it into chunks, and forward the execution flow to the package and function specified in those chunks, providing some kind of fallback when there is no match for these values in the application.
An alternative would be to map all packages and functions into variables, then look for a match in the contents of those variables, but this is not a dynamic solution.
The alternative I have used in other languages (PHP and JS) is to reference those names syntactically, so that the language uses the value of the variable instead of treating it as a literal. Something like: {packageName}.{functionName}() . The problem is that I still haven't found any syntactical way to do this in Golang.
Any suggestions?
func ParseUrl(request *http.Request) {
	// Given the URL http://www.example.com/controller/method?key=2&anotherKey=3
	var requestedFullURI = request.URL.RequestURI() // returns "/controller/method?key=2&anotherKey=3"
	controlFlowString, _ := url.Parse(requestedFullURI)
	substrings := strings.Split(controlFlowString.Path, "/") // returns ["", "controller", "method"]
	if len(substrings) > 1 && len(substrings[1]) > 0 {
		// Here we'll check if substrings[1] matches an existing package (controller) name
		// and provide some error return in case it does not
		if len(substrings) > 2 && len(substrings[2]) > 0 {
			// check if substrings[2] matches an existing function name (method) inside
			// the requested package and run it, passing on the control flow
		} else {
			// there's no requested method, we'll just run some fallback
		}
	} else {
		err := errors.New("You have not determined a valid controller.")
		fmt.Println(err)
	}
}
You can still solve this in a half-dynamic manner. Define your handlers as methods of an empty struct and register just that struct. This will greatly reduce the number of registrations you have to do, and your code will be more explicit and readable. For example:
handler.register(MyStruct{}) // the implementation is for another question
The line above is all that's needed to make all methods of MyStruct accessible by name. With some effort and the help of the reflect package you can support routing like MyStruct/SomeMethod. You can even define a struct with fields which serve as branches, so even MyStruct/NestedStruct/SomeMethod is possible.
Don't do this, please
Your idea may sound like a good one, but believe me, it's not. It's a lot better to use a framework like go-chi, which is more flexible and readable, than to do some reflect madness that no one will understand. Not to mention that traversing type trees in Go was never a fast task. Your routes should not be defined by the names of structures in your backend. When you commit to this, you will end up with strangely named routes that use PascalCase instead of something-like-this.
What you're describing is very typical of PHP and JavaScript, and completely inappropriate to Go. PHP and JavaScript are dynamic, interpreted languages. Go is a static, compiled language. Rather than trying to apply idioms which do not fit, I'd recommend looking for ways to achieve the same goals using implementations more suitable to the language at hand.
In this case, I think the closest you get to what you're describing while still maintaining reasonable code would be to use a handler registry as you described, but register to it automatically in package init() functions. Each init function will be called once, at startup, giving the package an opportunity to initialize variables and register things like handlers and drivers. When you see things like database driver packages that need to be imported even though they're not referenced, init functions are why: importing the package gives it the chance to register the driver. The expvar package even does this to register an HTTP handler.
You can do the same thing with your handlers, giving each package an init function that registers the handler(s) for that package along with their routes. While this isn't "dynamic", being dynamic has zero value here - the code can't change after it's compiled, which means that all you get from being dynamic is slower execution. If the "dynamic" routes change, you'd have to recompile and restart anyway.

D / DLang : Inhibiting code generation of module-private inlined functions

I have a D module which I hope contains public and private parts. I have tried using the keywords private and static before function definitions. I have a function that I wish to make externally-callable / public and ideally I would like it to be inlined at the call-site. This function calls other module-internal functions that are intended to be private, i.e. not externally callable. Calls to these are successfully inlined within the module and a lot of the cruft is disposed of by CTFE plus known-constant propagation. However the GDC compiler also generates copies of these internal routines, even though they have been inlined where needed and they are not supposed to be externally callable. I'm compiling with -O3 -frelease. What should I be doing - should I expect this even if I use static and/or private?
I have also taken a brief look at this thread concerning GCC hoping for insight.
As I mentioned earlier, I've tried both using private and static on these internal functions, but I can't seem to suppress the code generation. I could understand this if a debugger needed to have copies of these routines to set breakpoints in. I need to stress that this could perhaps be sorted out somehow at link-time, for all I know. I haven't tried linking the program, I'm just looking at the generated code in the Matt Godbolt D Compiler Explorer using GDC. Everything can be made into templates with a zero-length list of template parameters (e.g. auto my_fn()( in arg_t x ) ), tried that, it doesn't help but does no harm.
A couple of other things to try: I could try and make a static class with private parts, as a way of implementing a package, Ada-style. (Needs to be single-instance strictly.) I've never done any C++, only massive amounts of asm and C professionally. So that would be a learning curve.
The only other thing I can think of is to use nested function definitions, Pascal/Ada-style, move the internal routines to be inside the body of their callers. But that has a whole lot of disadvantages.
Rough example
module junk;
auto my_public_fn() { return my_private_fn(); }
private
static // 'static' and/or 'private', tried both
auto my_private_fn() { xxx ; return whatever; }
I just had a short discussion with Iain about this and implementing this is not as simple as it seems.
First of all static has many meanings in D, but the C meaning of translation unit local function is not one of them ;-)
So marking these functions as private seems intuitive. After all, if you can't access a function from outside of the translation unit and you never leak an address to the function why not remove it? It could be either completely unused or inlined into all callers in this case.
Now here's the catch: We can't know for sure if a function is unused:
private void fooPrivate() {}
/*template*/ void fooPublic()()
{
fooPrivate();
}
When compiling the file GDC knows nothing about the fooPublic template (as templates can only be fully analyzed when instantiated), so fooPrivate appears to be unused. When later using fooPublic in a different file GDC will rely on fooPrivate being already emitted in the original source - after all it's not a template so it's not being emitted into the new module.
There might be workarounds but this whole problem seems nontrivial. We could also introduce a custom gcc.attribute attribute for this. It would cause the same problems with templates, but as it's a specific annotation for one usecase (unlike private) we could rely on the user to do the right thing.

C++/CLI Resource Management Confusion

I am extremely confused about resource management in C++/CLI. I thought I had a handle (no pun intended) on it, but I stumbled across the auto_gcroot<T> class while looking through header files, which led to a Google search, then the better part of a day reading documentation, and now confusion. So I figured I'd turn to the community.
My questions concern the difference between auto_handle/stack semantics, and auto_gcroot/gcroot.
auto_handle: My understanding is that this will clean up a managed object created in a managed function. My confusion is that isn't the garbage collector supposed to do that for us? Wasn't that the whole point of managed code? To be more specific:
//Everything that follows is managed code
void WillThisLeak(void)
{
String ^str = gcnew String ^();
//Did I just leak memory? Or will GC clean this up? what if an exception is thrown?
}
void NotGoingToLeak(void)
{
String ^str = gcnew String^();
delete str;
//Guaranteed not to leak, but is this necessary?
}
void AlsoNotGoingToLeak(void)
{
auto_handle<String ^> str = gcnew String^();
//Also Guaranteed not to leak, but is this necessary?
}
void DidntEvenKnowICouldDoThisUntilToday(void)
{
String str();
//Also Guaranteed not to leak, but is this necessary?
}
Now this would make sense to me if it was a replacement for the C# using keyword, and it was only recommended for use with resource-intensive types like Bitmap, but this isn't mentioned anywhere in the docs, so I'm afraid I've been leaking memory this whole time.
auto_gcroot
Can I pass it as an argument to a native function? What will happen on copy?
void function(void)
{
auto_gcroot<Bitmap ^> bmp = //load bitmap from somewhere
manipulateBmp(bmp);
pictureBox.Image = bmp; //Is my Bitmap now disposed of by auto_gcroot?
}
#pragma unmanaged
void manipulateBmp(auto_gcroot<Bitmap ^> bmp)
{
//Do stuff to bmp
//destructor for bmp is now called right? does this call dispose?
}
Would this have worked if I'd used a gcroot instead?
Furthermore, what is the advantage to having auto_handle and auto_gcroot? It seems like they do similar things.
I must be misunderstanding something for this to make so little sense, so a good explanation would be great. Also any guidance regarding the proper use of these types, places where I can go to learn this stuff, and any more good practices/places I can find them would be greatly appreciated.
thanks a lot,
Max
Remember that calling delete on a managed object is akin to calling Dispose in C#. So you are right that auto_handle lets you do what you would do with the using statement in C#. It ensures that delete gets called at the end of the scope. So, no, you're not leaking managed memory if you don't use auto_handle (the garbage collector takes care of that); you are just failing to call Dispose. There is no need to use auto_handle if the types you're dealing with do not implement IDisposable.
gcroot is used when you want to hold on to a managed type inside a native class. You can't just declare a managed type directly in a native type using the hat ^ symbol. You must use a gcroot. This is a "garbage collected root". So, while the gcroot (a native object) lives, the garbage collector cannot collect this object. When the gcroot is destroyed, it lets go of the reference, and the garbage collector is free to collect the object (assuming it has no other references). You wouldn't normally declare a free-standing gcroot in a method like you've done above--just use the hat ^ syntax whenever you can.
So when would you use auto_gcroot? It would be used when you need to hold on to a managed type inside a native class AND that managed type happens to implement IDisposable. On destruction of the auto_gcroot, it will do 2 things: call delete on the managed type (think of this as a Dispose call--no memory is freed) and free the reference (so the type can be garbage collected).
Hope it helps!
Some references:
http://msdn.microsoft.com/en-us/library/aa730837(v=vs.80).aspx
http://msdn.microsoft.com/en-us/library/481fa11f(v=vs.80).aspx
http://www.codeproject.com/Articles/14520/C-CLI-Library-classes-for-interop-scenarios

State of object after std::move construction

Is it legal/proper c++0x to leave an object moved for the purpose of move-construction in a state that can only be destroyed? For instance:
class move_constructible {...};
int main()
{
move_constructible x;
move_constructible y(std::move(x));
// From now on, x can only be destroyed. Any other method will result
// in a fatal error.
}
For the record, I'm trying to wrap in a C++ class a C struct with a pointer member which is always supposed to point to some allocated memory area. The whole C library API relies on this assumption. But this requirement prevents writing a truly cheap move constructor, since in order for x to remain a valid object after the move it would need its own allocated memory area. I've written the destructor in such a way that it first checks for a NULL pointer before calling the corresponding cleanup function from the C API, so that at least the struct can be safely destroyed after the move.
Yes, the language allows this. In fact it was one of the purposes of move semantics. It is however your responsibility to ensure that no other methods get called and/or provide proper diagnostics. Note, usually you can also use at least the assignment operator to "revive" your variable, such as in the classical example of swapping two values.
See also this question
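A minimal sketch of the wrapper described in the question, assuming a made-up C API: c_thing, c_thing_init and c_thing_destroy are hypothetical stand-ins for the real library, and Wrapper and valid() are illustrative names. The move constructor steals the pointer and leaves the source NULL, the destructor checks for NULL before cleaning up, and move assignment is what "revives" a moved-from object.

```cpp
#include <cassert>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for the C library's struct and API.
struct c_thing { char* buf; };
inline void c_thing_init(c_thing* t) { t->buf = static_cast<char*>(std::malloc(16)); }
inline void c_thing_destroy(c_thing* t) { std::free(t->buf); t->buf = nullptr; }

class Wrapper {
public:
    Wrapper() { c_thing_init(&raw_); }

    // Cheap move: take over the pointer, leave the source with NULL.
    Wrapper(Wrapper&& other) noexcept : raw_(other.raw_) { other.raw_.buf = nullptr; }

    // Move assignment releases whatever we hold, then takes over the source.
    Wrapper& operator=(Wrapper&& other) noexcept {
        if (this != &other) {
            release();
            raw_ = other.raw_;
            other.raw_.buf = nullptr;
        }
        return *this;
    }

    Wrapper(const Wrapper&) = delete;
    Wrapper& operator=(const Wrapper&) = delete;

    ~Wrapper() { release(); }

    // False for a moved-from object; other methods could check this
    // to provide diagnostics instead of crashing.
    bool valid() const { return raw_.buf != nullptr; }

private:
    // Only call into the C API when there is something to clean up.
    void release() { if (raw_.buf != nullptr) c_thing_destroy(&raw_); }

    c_thing raw_{};
};
```

After `Wrapper y(std::move(x));`, x.valid() is false and x may only be destroyed or assigned to; `x = Wrapper();` makes it usable again.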

How can I stop execution in the Visual Studio Debugger when a private member variable changes value?

Let's say my class has a private integer variable called count.
I've already hit a breakpoint in my code. Now before I press continue, I want to make it so the debugger will stop anytime count gets a new value assigned to it.
Besides promoting count to a property and setting a breakpoint on the set accessor of that property, is there any other way to do this?
What you're looking for is not possible in managed code. In C++ this is known as a data breakpoint. It allows you to break whenever a block of memory is altered by the running program. But this is only available in pure native C++ code.
A short version of why this is not implemented is that it's much harder in managed code. Native code is nice and predictable. You create memory and it doesn't move around unless you create a new object (or explicitly copy memory).
Managed code is much more complex because it's a garbage collected language. The CLR commonly moves objects around in memory. Therefore simply watching a bit of memory is not good enough. It requires GC interaction.
This is just one of the issues with implementing managed break points.
I assume you're trying to do this because you want to see where the change in value came from. You already stated the way I've always done it: create a property, and break on the set accessor (except that you must then always use that set accessor for this to work).
Basically, I'd say that since a private field is only storage you can't break on it because the private field isn't a breakable instruction.
The only way I can think of to do this is to right-click on the variable and select "Find all references". Once it finds all the references, you can create a new breakpoint at each point in the code where the variable is assigned a value. This would probably work pretty well, unless you were passing the variable by reference to another function and changing the value in there. In that case, you'd need some way of watching a specific point in memory to see when it changed. I'm not sure if such a tool exists in VS.
As ChrisW commented, you can set a 'Data Breakpoint', but only for native (non-managed) code. The garbage collector moves allocated memory blocks around when it runs. Thus, data breakpoints are not possible for managed code.
Otherwise, no. You must encapsulate access to the item for which you want to 'break on modify'. Since it's a private member already, I suggest following Kibbee's suggestion of setting breakpoints wherever it's used.
Besides promoting count to a property and setting a breakpoint on the set accessor of that property, is there any other way to do this?
Make it a property of a different class, create an instance of the class, and set a breakpoint on the property.
Instead of ...
void test()
{
    int i = 3;
    ...etc...
    i = 4;
}
... have ...
class Int
{
    int m;
    internal Int(int i) { m = i; }
    internal int val { set { m = value; } get { return m; } }
}
void test()
{
    Int i = new Int(3);
    ...etc...
    i.val = 4;
}
The thing is that, using C#, the actual memory location of everything is being moved continually: and therefore the debugger can't easily use the CPU's 'break on memory access' debugging register, and it's easier for the debugger to, instead, implement a code-location breakpoint.

Resources