What distinguishes two function calls in a row? - Windows

Suppose you look at the stack and registers of a process which has the following code...
...
void Test()
{
    for (int i = 0; i < 10; i++)
    {
        OneRunDontKnow();
    }
}
...
You look at the stack twice, exactly while the process is executing the loop, and both times OneRunDontKnow is at the top of the stack.
Can you somehow tell whether OneRunDontKnow was popped off the stack and then pushed on again, or whether it was never popped off at all?
EDIT: OneRunDontKnow can have any signature (it can also take parameters or return a value).

Probably the best way is to look at the assembled code. OneRunDontKnow() takes no parameters, so the only things on the stack will be the return address (the saved instruction pointer) and the usual stack-frame bookkeeping, but no parameters. Find the place in the disassembly where OneRunDontKnow() is called, and look at what PUSH and JMP/CALL instructions appear inside the loop body (the code around the LOOP, LOOPE, etc. instructions).
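If you can modify the code rather than only observe the process from a debugger, another option is to instrument the callee so two snapshots can be compared; this is a hypothetical sketch, not part of the original answer, and call_count is a made-up name:
#include <stdio.h>

static int call_count = 0; /* incremented once per invocation */

void OneRunDontKnow(void)
{
    call_count++;
    /* ... actual work ... */
}

void Test(void)
{
    for (int i = 0; i < 10; i++)
    {
        OneRunDontKnow();
    }
    printf("OneRunDontKnow was entered %d times\n", call_count);
}
Reading call_count at each snapshot tells you whether the function returned and was entered again in between, which a stack snapshot alone cannot.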

XTestFakeKeyEvent calls get swallowed

I'm trying to spoof keystrokes; to be a bit more precise: I'm replaying a number of keystrokes which should all get sent at a certain time - sometimes several at the same time (or at least as close together as reasonably possible).
Implementing this using XTestFakeKeyEvent, I've come across a problem. While what I've written so far mostly works as it is intended and sends the events at the correct time, sometimes a number of them will fail. XTestFakeKeyEvent never returns zero (which would indicate failure), but these events never seem to reach the application I'm trying to send them to. I suspect that this might be due to the frequency of calls being too high (sometimes 100+/second) as it looks like it's more prone to fail when there's a large number of keystrokes/second.
A little program to illustrate what I'm doing, incomplete and without error checks for the sake of conciseness:
// #includes ...
struct action {
    int time; // Time when this should be executed.
    int down; // Keydown or keyup?
    int code; // The VK to simulate the event for.
};

Display *display;
int nactions;           // actions array length.
struct action *actions; // Array of actions we'll want to "execute".

int main(void)
{
    display = XOpenDisplay(NULL);
    nactions = get_actions(&actions);

    int cur_time;
    int cur_i = 0;
    struct action *cur_action;

    // While there are still actions to execute.
    while (cur_i < nactions) {
        cur_time = get_time();
        // For each action that is (over)due.
        while (cur_i < nactions &&
               (cur_action = actions + cur_i)->time <= cur_time) {
            cur_i++;
            XTestFakeKeyEvent(display, cur_action->code,
                              cur_action->down, CurrentTime);
            XFlush(display);
        }
        // Sleep for 1ms.
        nanosleep((struct timespec[]){{0, 1000000L}}, NULL);
    }
}
I realize that the code above is very specific to my case, but I suspect that this is a broader problem - which is also why I'm asking this here.
Is there a limit to how often you can/should flush XEvents? Could the application I'm sending this to be the issue, maybe failing to read them quickly enough?
It's been a little while but after some tinkering, it turned out that my delay between key down and key up was simply too low. After setting it to 15ms the application registered the actions as keystrokes properly and (still) with very high accuracy.
I feel a little silly in retrospect, but I do feel like this might be something others could stumble over as well.
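For reference, a minimal sketch of that kind of fix (illustrative only; send_keystroke and the exact 15 ms gap are assumptions based on the description above, not the poster's actual code):
#include <X11/Xlib.h>
#include <X11/extensions/XTest.h>
#include <time.h>

/* Send one keystroke with an explicit pause between key down and key up,
   so the target application has time to register the press. */
static void send_keystroke(Display *display, unsigned int keycode)
{
    XTestFakeKeyEvent(display, keycode, True, CurrentTime);
    XFlush(display);

    /* ~15 ms gap between down and up. */
    nanosleep(&(struct timespec){0, 15 * 1000000L}, NULL);

    XTestFakeKeyEvent(display, keycode, False, CurrentTime);
    XFlush(display);
}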

Understanding automatic inlining: when can the compiler inline methods involving private variables & abstract methods?

Using C#, but I presume this question is relevant for other (mostly C-related) languages as well. Consider this...
private float radius = 0.0f; // Set somewhere else
public float GetDiameter() {
    return radius * 2.0f;
}
Will the compiler inline this if it is called in other classes? I would think the answer is of course, but here is my confusion: radius is private. So from a manual-programming perspective it would be impossible for us to inline this method ourselves.
So what does the compiler do? I presume it can inline it anyhow, since, if I remember correctly, 'private', 'public', etc. modifiers only affect human-written code, and the assembly code can access any part of its own program if it wants?
Okay, but what about abstraction? Consider this...
public abstract class Animal {
    abstract public bool CanFly();
}

public class Hawk : Animal {
    ...
    override public bool CanFly() {
        if (age < 1.0f) return false; // Baby hawks can't fly yet
        return true;
    }
}

public class Dog : Animal {
    ...
    override public bool CanFly() {
        return false;
    }
}
In a non-animal class:
...
Animal a = GetNextAnimal();
if (a.CanFly()) {
...
Can this be inlined? I am almost certain no, because the compiler doesn't know what kind of animal is being used. But what if instead I did...
...
Animal a = new Hawk();
if (a.CanFly()) {
...
Does that make a difference? If not, surely this one can be?:
...
Hawk a = new Hawk();
if (a.CanFly()) {
...
Does anything change if, instead of a bool method above, I were to do:
float animalAge = a.GetAge();
In general, can too many abstract getters and setters cause a performance hit? If it gets to the point where it is significant, what would be the best solution?
There is in general no simple way to predict up front whether or not a method will get inlined. You have to actually write a program and look at the machine code that is produced for it. This is pretty easy to do in a C program, you can ask the compiler to produce an assembly code listing (like /FA for MSVC, -S for GCC).
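For example, in C a quick check might look like this (an illustrative sketch, not part of the original answer; get_diameter, the radius value, and the file name check_inline.c are made up): compile with gcc -O2 -S check_inline.c and inspect check_inline.s to see whether a call to get_diameter is still emitted or whether the multiply was folded into main.
#include <stdio.h>

static float radius = 1.5f;

static float get_diameter(void)
{
    return radius * 2.0f;
}

int main(void)
{
    /* printf keeps the result live so the call cannot be optimized away. */
    printf("%f\n", get_diameter());
    return 0;
}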
More convoluted in .NET due to the jitter just-in-time compiling the code. Technically the source code of the optimizer is available from the CoreCLR project but it is very hard to figure out what it does, lots of pretty impregnable C++ code. You have to take advantage of the "visual" in Visual Studio and use the debugger.
That requires a bit of preparation to be sure you get the actual optimized code; the debugger normally disables the optimizer to make debugging easy. Switch to the Release configuration and use Tools > Options > Debugging > General > untick the "Suppress JIT optimization" checkbox. If you want optimal floating point code then you always, always want 64-bit code, so use Project > Properties > Build tab and untick "Prefer 32-bit".
And write a little test program to exercise the method. That can be tricky, you might easily end up with no code at all. In this case it is easy, Console.WriteLine() is a good way to force this method to be used, it cannot be optimized away. So:
class Program {
    static void Main(string[] args) {
        var obj = new Example();
        Console.WriteLine(obj.GetDiameter());
    }
}

class Example {
    private float radius = 0.0f;
    public float GetDiameter() {
        return radius * 2.0f;
    }
}
Set a breakpoint on Main() and press F5. Then use Debug > Windows > Disassembly to look at the machine code. On my machine with a Haswell core (supports AVX) I get:
00007FFEB9D50480 sub rsp,28h ; setup stack frame
00007FFEB9D50484 mov rcx,7FFEB9C45A78h ; rcx = typeof(Example)
00007FFEB9D5048E call 00007FFF19362530 ; rax = new Example()
00007FFEB9D50493 vmovss xmm0,dword ptr [rax+8] ; xmm0 = Example.field
00007FFEB9D50499 vmulss xmm0,xmm0,dword ptr [7FFEB9D504B0h] ; xmm0 *= 2.0
00007FFEB9D504A2 call 00007FFF01647BB0 ; Console.WriteLine()
00007FFEB9D504A7 nop ; alignment
00007FFEB9D504A8 add rsp,28h ; tear down stack frame
00007FFEB9D504AC ret
I annotated the code to help make sense of it; it can be cryptic if you have never looked at it before. But no doubt you can tell that the method got inlined: there is no CALL instruction for it, it got inlined down to two instructions (VMOVSS and VMULSS).
As you expected. Accessibility plays no role whatsoever in inlining decisions, it is a simple code hoisting trick that does not change the logical operation of the program. It matters to the C# compiler first, next to the verifier built into the jitter, but then disappears as a concern to the code generator and optimizer.
Just do the exact same thing for the abstract class. You'll see that the method does not get inlined, an indirect CALL instruction is required. Even if the method is completely empty. Some language compilers can turn virtual method calls into non-virtual calls when they know the type of the object but the C# compiler is not one of them. The jitter optimizer doesn't either. EDIT: recent work was done on devirtualizing calls.
There are other reasons why a method won't be inlined, a moving target so difficult to document. But roughly, methods with too much MSIL, try/catch/throw, loops, CAS demands, some degenerate struct cases, MarshalByRefObject base won't be inlined. Always look at actual machine code to be sure.
The [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute can force the optimizer to reconsider the MSIL size limit. MethodImplOptions.NoInlining is helpful to disable inlining, the kind of thing you might want to do to get a better exception stack trace, or to keep the jitter away from an assembly that might not be deployed.
More about the optimizations performed by the jitter optimizer in this post.

blocks and the stack

According to bbum:
2) Blocks are created on the stack. Careful.
Consider:
typedef int(^Blocky)(void);
Blocky b[3];
for (int i=0; i<3; i++)
    b[i] = ^{ return i;};
for (int i=0; i<3; i++)
    printf("b %d\n", b[i]());
You might reasonably expect the above to output:
0
1
2
But, instead, you get:
2
2
2
Since the block is allocated on the stack, the code is nonsense. It
only outputs what it does because the Block created within the lexical
scope of the for() loop’s body hasn’t happened to have been reused for
something else by the compiler.
I don't understand that explanation. If the blocks are created on the stack, then after the for loop completes wouldn't the stack look something like this:
stack:
---------
^{ return i;} #3rd block
^{ return i;} #2nd block
^{ return i;} #1st block
But bbum seems to be saying that when each loop of the for loop completes, the block is popped off the stack; then after the last pop, the 3rd block just happens to be sitting there in unclaimed memory. Then somehow when you call the blocks the pointers all refer to the 3rd block??
You are completely misunderstanding what "on the stack" means.
There is no such thing as a "stack of variables". The "stack" refers to the "call stack", i.e. the stack of call frames. Each call frame stores the current state of the local variables of that function call. All the code in your example is inside a single function, hence there is only one call frame that is relevant here. The "stack" of call frames is not relevant.
The mentioning of "stack" means only that the block is allocated inside the call frame, like local variables. "On the stack" means it has lifetime akin to local variables, i.e. with "automatic storage duration", and its lifetime is scoped to the scope in which it was declared.
This means that the block is not valid after the end of the iteration of the for-loop in which it was created. And the pointer you have to the block now points to an invalid thing, and it is undefined behavior to dereference the pointer. Since the block's lifetime is over and the space it was using is unused, the compiler is free to use that place in the call frame for something else later.
You are lucky that the compiler decided to place a later block in the same place, so that when you try to access the location as a block, it produces a meaningful result. But this is really just undefined behavior. The compiler could, if it wanted, place an integer in part of that space and another variable in another part, and maybe a block in another part of that space, so that when you try to access that location as a block, it will do all sorts of bad things and maybe crash.
The lifetime of the block is exactly analogous to a local variable declared in that same scope. You can see the same result in a simpler example that uses a local variable that reproduces what's going on:
int *b[3];
for (int i=0; i<3; i++) {
    int j = i;
    b[i] = &j;
}
for (int i=0; i<3; i++)
    printf("b %d\n", *b[i]);
prints (probably):
b 2
b 2
b 2
Here, as in the case with the block, you are also storing a pointer to something that is scoped inside the iteration of the loop, and using it after the loop. And again, just because you're lucky, the space for that variable happens to be allocated to the same variable from a later iteration of the loop, so it seems to give a meaningful result, even though it's just undefined behavior.
Now, if you're using ARC, you likely do not see what your quoted text describes, because ARC requires that when something is stored in a variable of block-pointer type (and b[i] has block-pointer type), a copy is made instead of a retain, and the copy is what gets stored. When a stack block is copied, it is moved to the heap (i.e. it is dynamically allocated, has dynamic lifetime, and is memory-managed like other objects), and the copy returns a pointer to the heap block. That pointer you can safely use after the scope ends.
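Without ARC, you can get the same safe behaviour by copying each block to the heap yourself. Here is a minimal sketch in plain C using Block_copy and Block_release from <Block.h> (illustrative only; compile with clang -fblocks and, off Apple platforms, link the blocks runtime):
#include <stdio.h>
#include <Block.h>

typedef int (^Blocky)(void);

int main(void)
{
    Blocky b[3];
    for (int i = 0; i < 3; i++)
        b[i] = Block_copy(^{ return i; }); /* heap copy, outlives the iteration */

    for (int i = 0; i < 3; i++)
        printf("b %d\n", b[i]()); /* prints 0, 1, 2 */

    for (int i = 0; i < 3; i++)
        Block_release(b[i]);
    return 0;
}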
Yeah, that does make sense, but you really have to think about it. When b[0] is given its value, the "^{ return 0;}" is never used again. b[0] is just the address of it. The compiler kept overwriting those temp functions on the stack as it went along, so the "2" is just the last function written in that space. If you print those 3 addresses as they are created, I bet they are all the same.
On the other hand, if you unroll your assignment loop and add other references to "^{ return 0;}", like assigning it to a c[0], you'll likely see b[0] != b[1] != b[2]:
b[0] = ^{ return 0;};
b[1] = ^{ return 1;};
b[2] = ^{ return 2;};
c[0] = ^{ return 0;};
c[1] = ^{ return 1;};
c[2] = ^{ return 2;};
Optimization settings could affect the outcome.
By the way, I don't think bbum is saying the pop happens after the for loop completion -- it's happening after each iteration hits that closing brace (end of scope).
Mike Ash provides the answer:
Block objects [which are allocated on the stack] are only valid through the lifetime of their
enclosing scope
In bbum's example, the scope of the block is the for-loop's enclosing braces (which bbum omitted):
for (int i=0; i<3; i++) { // <------
    b[i] = ^{ return i;};
} // <-----
So, each time through the loop, the newly created block is pushed onto the stack; then, when each iteration ends, the block is popped off the stack.
If you print those 3 addresses as they are created, I bet they are all
the same.
Yes, I think that's the way that it must have worked in the past. However, now it appears that a loop does not cause the block to be popped off the stack. Now, it must be the method's braces that determine the block's enclosing scope. Edit: Nope. I constructed an experiment, and I still get different addresses for each block:
AppDelegate.h:
typedef int(^Blocky)(void); // ******TYPEDEF HERE********

@interface AppDelegate : NSObject <NSApplicationDelegate>
@end

AppDelegate.m:
#import "AppDelegate.h"

@implementation AppDelegate

- (Blocky)blockTest:(int)i {
    // If the block is allocated on the stack, it should be popped off the stack at the end of this method.
    Blocky myBlock = ^{ return i; };
    NSLog(@"%p", myBlock);
    return myBlock;
}

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification {
    // Insert code here to initialize your application
    Blocky b[3];
    for (int i = 0; i < 3; ++i) {
        b[i] = [self blockTest:i];
    }
    for (int j = 0; j < 3; ++j) {
        NSLog(@"%d", b[j]());
    }
}

@end
--output:--
0x608000051820
0x608000051850
0x6080000517c0
0
1
2
That looks to me like blocks are allocated on the heap.
Okay, my results above are due to ARC. If I turn off ARC, then I get different results:
0x7fff5fbfe658
0x7fff5fbfe658
0x7fff5fbfe658
2
1606411952
1606411952
That looks like stack allocation. Each pointer points to the same area of memory because after a block is popped off the stack, that area of memory is reused for the next block.
Then it looks like when the first block was called it just happened to get the correct result, but by the time the 2nd block was called, the system had overwritten the reclaimed memory resulting in a junk value? I'm still not clear on how calling a non-existent block results in a value??

GTK+ - How to listen to an event from within a method?

I'm writing an application that runs an algorithm, but allows you to 'step through' the algorithm by pressing a button - displaying what's happening at each step.
How do I listen for events while within a method?
E.g., look at the code I've got.
static int proceed;

void button1Event(GtkWidget *widget)
{
    proceed = 0;
    int i = 0;
    for (i = 0; i < 15; i++) // this is our example 'algorithm'
    {
        while (proceed == 0) continue;
        printf("the number is %d\n", i);
        proceed = 0;
    }
}

void button2Event(GtkWidget *widget)
{
    proceed = 1;
}
This doesn't work because GTK+ needs button1Event to return before it can handle the click on button2 (or any other events).
I'm thinking of something like this in that while loop:
while (proceed == 0)
{
    listen_for_button_click();
}
What method is that?
The "real" answer here (the one any experienced GTK+ programmer will give you) isn't one you will like perhaps: don't do this, your code is structured the wrong way.
The options include:
recommended: restructure the app to be event-driven instead; probably you need to keep track of your state (either a state machine or just a boolean flag) and ignore whichever button is not currently applicable.
you can run a recursive main loop, as in the other answer with gtk_main_iteration(); however this is quite dangerous because any UI event can happen in that loop, such as windows closing or other totally unrelated stuff. Not workable in most real apps of any size.
move the blocking logic to another thread and communicate via a GAsyncQueue or something along those lines (caution, this is hard-ish to get right and likely to be overkill).
I think you are going wrong here:
while (proceed == 0)
{
    listen_for_button_click();
}
You don't want while loops like this; you just want the GTK+ main loop doing your blocking. When you get the button click, in its callback, run whatever code would have come after this while loop.
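As a rough illustration of that restructuring (a sketch, not the original poster's program; names like on_step_clicked and step_i are made up, and the widget setup is the minimal GTK+ 2/3 style), the button callback advances the algorithm one step each time it fires:
#include <gtk/gtk.h>

static int step_i = 0;

/* Each click performs one step of the 'algorithm'. */
static void on_step_clicked(GtkWidget *widget, gpointer data)
{
    if (step_i < 15)
    {
        g_print("the number is %d\n", step_i);
        step_i++;
    }
}

int main(int argc, char *argv[])
{
    gtk_init(&argc, &argv);

    GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    GtkWidget *button = gtk_button_new_with_label("Step");
    gtk_container_add(GTK_CONTAINER(window), button);

    g_signal_connect(button, "clicked", G_CALLBACK(on_step_clicked), NULL);
    g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);

    gtk_widget_show_all(window);
    gtk_main();
    return 0;
}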
You could check for pending events and handle them in the while loop inside the clicked callback. Something along these lines:
void button1Event(GtkWidget *widget)
{
    proceed = 0;
    int i = 0;
    for (i = 0; i < 15; i++) // this is our example 'algorithm'
    {
        while (proceed == 0)
        {
            /* Check for all pending events */
            while (gtk_events_pending())
            {
                gtk_main_iteration(); /* Handle the events */
            }
        }
        printf("the number is %d\n", i);
        proceed = 0;
    }
}
This way, when the click on the second button is added to the event queue, the check will see the pending events and handle them before proceeding. Your change to the global value is then picked up, and stepping should be possible.
Hope this helps!
If you want to do it like this, the only way that comes to my mind is to create a separate thread for your algorithm and use some synchronization methods to notify that thread from within button click handlers.
GTK+ (GLib, to be more specific) has its own API for threads and synchronization. As far as I know, condition variables are a standard way to implement wait-notify logic.
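A rough sketch of that thread-based approach using GLib's GMutex and GCond (illustrative only; algorithm_thread and on_step_clicked are made-up names, the worker thread would be started with g_thread_new, and any UI updates from it should be pushed back to the main loop with something like g_idle_add):
#include <gtk/gtk.h>

static GMutex mutex;
static GCond cond;
static gboolean proceed = FALSE;

/* Worker thread (started with g_thread_new): runs the 'algorithm',
   waiting on the condition variable before each step. */
static gpointer algorithm_thread(gpointer data)
{
    for (int i = 0; i < 15; i++)
    {
        g_mutex_lock(&mutex);
        while (!proceed)
            g_cond_wait(&cond, &mutex);
        proceed = FALSE;
        g_mutex_unlock(&mutex);

        g_print("the number is %d\n", i);
    }
    return NULL;
}

/* "Step" button callback, runs on the GTK+ main thread. */
static void on_step_clicked(GtkWidget *widget, gpointer data)
{
    g_mutex_lock(&mutex);
    proceed = TRUE;
    g_cond_signal(&cond);
    g_mutex_unlock(&mutex);
}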

Drawing the call stack in a recursive method

I want to draw the call stack for any recursive method, so I've created a schema like this,
recursiveMethod(){
    //Break recursion condition
    if(){
        //Add value here to the return values' list - no drawing
        return
    }
    else{
        //Draw stack with the value which will be pushed to the stack here
        variable <- recursiveMethod()
        //Clear the drawing which represents the popped value from the stack here
        return variable
    }
}
Applying the schema will look something like this,
Notes:
This schema can draw recursive methods with n recursive calls by making each recursive call in a separate return statement.
returnValues is a list which saves all the return values, just for viewing purposes.
"Draw stack" simply means drawing a cell (rectangle) plus drawing the pushed string.
What do you think of this? any suggestions are extremely welcomed.
I'm not sure if I understand your question correctly, but I will take a stab at it; let me know if this is incorrect.
What I gather is that you want some way to keep track of your stack within a recursive function. One way you can do this is to have a Stack data structure and a function that draws it. How you wish to draw it is up to you; for now, maybe just draw the stack as something like [---], with the '-' count being the recursive depth.
Here is an approximate C++ like example:
So we have:
Stack recursiveFunctionTrackingStack; //Stack of something, maybe just '-'
void DrawStack(const Stack& aStack);
and another type something like:
struct StackUpdater
{
    StackUpdater() { recursiveFunctionTrackingStack.push('-'); }
    StackUpdater(const string& somevalue)
    {
        recursiveFunctionTrackingStack.push(somevalue);
    }
    ~StackUpdater() { recursiveFunctionTrackingStack.pop(); }
};
So 'StackUpdater' pushes something onto the Stack data structure when an object of it is created, and pops it off when the object is destroyed.
Now within the recursive function we can do (using your code snippet):
recursiveMethod(){
    if(){ return }
    else{
        {
            StackUpdater su(pushedInValue); //Value pushed
            variable <- recursiveMethod();
            DrawStack(recursiveFunctionTrackingStack);
        } //Value popped on destruct.
        DrawStack(recursiveFunctionTrackingStack);
        return variable
    }
}
Maybe what you want is something along those lines. If not, then please clarify your question.
Hope this helps anyway.
