I'm creating a game engine using wxWidgets and OpenGL. I'm trying to set up a timer so the game can be updated regularly. I don't want to use wxTimer, because it's probably not accurate enough for what I need. I'm using a while (true) and a wxStopWatch:
while (true) {
    stopWatch.Start();
    <handle events> // I need a function for this
    game->OnUpdate();
    game->Refresh();
    if (stopWatch.Time() < 1000 / 60)
        wxMilliSleep(1000 / 60 - stopWatch.Time());
}
What I need is a function that will handle all the wxWidgets events, because right now my app just freezes.
Instead of using a while (true) loop, I'm using EVT_IDLE, and it works perfectly.
UPDATE: It doesn't. It's slightly jerky on Windows, and when tested on a Mac, it was extremely jerky. Apparently EVT_IDLE doesn't get called consistently on Windows, and even less on a Mac.
UPDATE2: It actually mostly does. It's fine on a Mac; I misunderstood my Mac tester's reply.
"ave you requested idle events to be generated at the maximum rate? You have to call RequestMore() on the event, if you don't you will get the next idle event only after some other event has been processed. Note that constant idle processing will cause 100% CPU load on one core."
This works. I have the following code in a graphical window:
BEGIN_EVENT_TABLE(MyCanvas, wxScrolledWindow)
EVT_PAINT (MyCanvas::OnPaint)
EVT_IDLE(MyCanvas::OnIdle)
EVT_MOTION (MyCanvas::OnMouseMove)
END_EVENT_TABLE()
The canvas needs to be updated when my_canvas->Refresh(bClearBackground) is called and not otherwise. To do this I needed to make a modification, as the program was eating up half of the CPU time (or 100% of one CPU on a dual core).
void MyCanvas::OnIdle(wxIdleEvent &event)
{
    wxPaintEvent unused;
    OnPaint(unused);
    event.RequestMore(false);
}
Setting the parameter of RequestMore() to false makes the app ask for more idle events only when it's needed, i.e. only when Refresh() has been called.
Have you requested idle events to be generated at the maximum rate? You have to call RequestMore() on the event, if you don't you will get the next idle event only after some other event has been processed. Note that constant idle processing will cause 100% CPU load on one core.
Even if you request more idle events you can't be sure how long it will take for the next one to arrive. Therefore to get smooth animation you will need to calculate the elapsed time since the last event, and update the display accordingly.
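To make that concrete, here is a minimal sketch of an elapsed-time-based idle handler. The GameCanvas, m_stopWatch and game names, and OnUpdate taking a time delta, are assumptions for illustration, not code from the original post:

// Sketch: drive updates from EVT_IDLE and scale them by the measured elapsed time.
void GameCanvas::OnIdle(wxIdleEvent& event)
{
    long elapsedMs = m_stopWatch.Time();   // milliseconds since the last idle update
    m_stopWatch.Start();                   // restart the stopwatch for the next interval

    game->OnUpdate(elapsedMs / 1000.0);    // assumed: OnUpdate accepts a delta in seconds
    Refresh(false);                        // schedule a repaint without clearing the background

    event.RequestMore();                   // keep idle events coming (costs CPU, as noted above)
}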
Related
I know that glutMainLoop() is used to call display over and over again, maintaining a constant frame rate. At the same time, I can also have glutTimerFunc(), which calls glutPostRedisplay() at the end, so it can maintain a different framerate.
When they are working together, what really happens? Does the timer function add to the framerate of the main loop and make it faster? Or does it change the default refresh rate of the main loop? How do they work in conjunction?
I know that glutMainLoop() is used to call display over and over again, maintaining a constant frame rate.
Nope! That's not what glutMainLoop does. The purpose of glutMainLoop is to pull operating system events, check whether timers have elapsed, see if windows have to be redrawn, and then call the respective callback functions registered by the user. This happens in a loop, and usually this loop is started from the main entry point of the program, hence the name "main loop".
When they are working together, what really happens? Does the timer function add to the framerate of the main loop and make it faster? Or does it change the default refresh rate of the main loop? How do they work in conjunction?
As already mentioned, dispatching timers is part of the responsibility of glutMainLoop, so you can't have GLUT timers without it. More importantly, if no events happened, no redisplay was posted, and no idle function is registered, glutMainLoop will "block" the program until something interesting happens (i.e. no CPU cycles are being consumed).
Essentially it goes like
void glutMainLoop(void)
{
    for(;;) {
        /* ... */
        foreach(t in timers) {
            if( t.elapsed() ) {
                t.callback(…);
                continue;
            }
        }
        /* ... */
        if( display.posted ) {
            display.callback();
            display.posted = false;
            continue;
        }
        idle.callback();
    }
}
At the same time, I can also have glutTimerFunc(), which calls glutPostRedisplay() at the end, so it can maintain a different framerate.
The timers provided by GLUT make no guarantees about their precision and jitter. Hence they're not particularly well suited for framerate limiting.
Normally the framerate is limited by v-sync (or it should be), but blocking on v-sync means you can't use that time to do something useful, because the process is blocked. A better approach is to register an idle function, in which you poll a high-resolution timer (on POSIX-compliant systems clock_gettime(CLOCK_MONOTONIC, …), on Windows QueryPerformanceCounter) and perform a glutPostRedisplay once one display refresh interval, minus the time required for rendering the frame, has elapsed.
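A rough sketch of that idle-function approach on a POSIX system, assuming a 60 Hz refresh interval and ignoring the render-time subtraction just described (the names and constants are illustrative):

#include <GL/glut.h>
#include <time.h>

static const double kRefreshInterval = 1.0 / 60.0;   // assumed display refresh interval
static double g_lastPost = 0.0;

static double nowSeconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);              // high-resolution monotonic clock
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

static void idle(void)
{
    double now = nowSeconds();
    if (now - g_lastPost >= kRefreshInterval) {       // one refresh interval has passed
        g_lastPost = now;
        glutPostRedisplay();                          // let glutMainLoop call the display callback
    }
}

/* registered in main() with: glutIdleFunc(idle); */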
Of course it's hard to predict exactly how long rendering is going to take, so the usual approach is to collect a sliding-window average and deviation and adjust with that. Also, you want to align that timer with v-sync.
This is of course a solved problem (at least in electrical engineering) which can be addressed by a Phase Locked Loop. Essentially you have a "phase comparator" (i.e. something that compares if your timer runs slower or faster than something you want synchronize to), a "charge pump" (a variable you add to or subtract from the delta from the phase comparator), a "loop filter" (sliding window average) and an "oscillator" (a timer) controlled by the loop filtered value in the charge pump.
So you poll the status of the v-sync (not possible with GLUT functions, and not even possible with core OpenGL or even some of the swap control extensions – you'll have to use OS-specific functions for that) and compare whether your timers lag behind or run fast compared to it. You add that delta to the "charge pump", filter it, and feed the result back into the timer. The nice thing about this approach is that it will automatically adjust to and filter the time spent rendering frames as well.
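A very rough sketch of that control loop in code. The v-sync timestamp has to come from an OS-specific query and is simply passed in as a parameter here, and the gain and filter constants are arbitrary tuning values, not anything prescribed:

// PLL-style steering of a frame timer towards v-sync (illustrative only).
static double g_chargePump  = 0.0;         // accumulated phase error ("charge pump")
static double g_filtered    = 0.0;         // loop-filtered value (exponential average here)
static double g_timerPeriod = 1.0 / 60.0;  // the "oscillator" period being steered
static const double kGain   = 0.001;       // small feedback gain (tuning value)

void adjustTimer(double vsyncTime, double ourTickTime)
{
    double phaseError = vsyncTime - ourTickTime;          // phase comparator
    g_chargePump += phaseError;                           // charge pump
    g_filtered = 0.9 * g_filtered + 0.1 * g_chargePump;   // loop filter
    g_timerPeriod = 1.0 / 60.0 + kGain * g_filtered;      // controlled "oscillator":
                                                          // schedule the next frame with this period
}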
From the glutMainLoop doc pages:
glutMainLoop enters the GLUT event processing loop. This routine should be called at most once in a GLUT program. Once called, this routine will never return. It will call as necessary any callbacks that have been registered. (emphasis mine)
That means that the idea of glutMainLoop is just processing events, calling anything that is installed. Indeed, I do not believe that it keeps calling display over and over, but only when there is an event that requests its redisplay.
This is where glutTimerFunc() comes into play. It registers a timer callback to be called by glutMainLoop when the timer event is triggered. Note that this is one of several other possible event callbacks that can be registered. That explains why the doc uses the expression at least.
(...) glutTimerFunc registers the timer callback func to be triggered in at least msecs milliseconds. (...)
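For example, a self-rearming GLUT timer that posts a redisplay at roughly 60 Hz could look like this. This is only a sketch, not code from the question, and the 16 ms interval is approximate given the "at least" semantics quoted above:

#include <GL/glut.h>

static void onTimer(int value)
{
    glutPostRedisplay();                  // ask glutMainLoop to invoke the display callback
    glutTimerFunc(16, onTimer, value);    // re-register: GLUT timers fire only once
}

/* in main(), after creating the window:
       glutTimerFunc(16, onTimer, 0);
       glutMainLoop();                    */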
I'm planning on making a clock. An actual clock, not something for Windows. However, I would like to be able to write most of the code now. I'll be using a PIC16F628A to drive the clock, and it has a timer I can access (actually, it has 3, in addition to the clock it has built in). Windows, however, does not appear to have this function. Which makes making a clock a bit hard, since I need to know how long it's been so I can update the current time. So I need to know how I can get a pulse (1Hz, 1KHz, doesn't really matter as long as I know how fast it is) in Windows.
There are many timer objects available in Windows. Probably the easiest to use for your purposes would be the Multimedia Timer, but that's been deprecated. It would still work, but Microsoft recommends using one of the new timer types.
I'd recommend using a threadpool timer if you know your application will be running under Windows Vista, Server 2008, or later. If you have to support Windows XP, use a Timer Queue timer.
There's a lot to those APIs, but general use is pretty simple. I showed how to use them (in C#) in my article Using the Windows Timer Queue API. The code is mostly API calls, so I figure you won't have trouble understanding and converting it.
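If you end up doing it from C or C++ instead, the Timer Queue variant boils down to a couple of calls. A minimal sketch, where the 1-second period and the printf body are just placeholders for your own clock update:

#include <windows.h>
#include <stdio.h>

/* Runs on a thread-pool thread every time the timer fires. */
static VOID CALLBACK TickCallback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
{
    printf("tick\n");   /* update the clock here */
}

int main(void)
{
    HANDLE hTimer = NULL;
    /* Fire immediately, then every 1000 ms (1 Hz). */
    if (!CreateTimerQueueTimer(&hTimer, NULL, TickCallback, NULL, 0, 1000, WT_EXECUTEDEFAULT))
        return 1;

    Sleep(10000);                                                /* let it tick for a while */
    DeleteTimerQueueTimer(NULL, hTimer, INVALID_HANDLE_VALUE);   /* wait for pending callbacks */
    return 0;
}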
The LARGE_INTEGER is just an 8-byte block of memory that's split into a high part and a low part. In assembly, you can define it as:
MyLargeInt equ $
MyLargeIntLow dd 0
MyLargeIntHigh dd 0
If you're looking to learn ASM, just do a Google search for [x86 assembly language tutorial]. That'll get you a whole lot of good information.
You could use a waitable timer object. Since Windows is not a real-time OS, you'll need to make sure you set the period long enough that you won't miss pulses. A tenth of a second should be safe most of the time.
Additional:
The const LARGE_INTEGER you need to pass to SetWaitableTimer is easy to define in NASM; it's just an eight-byte constant. Note that a relative due time is given in 100-nanosecond units and must be negative:
period: dq -1000000 ; relative due time of 100ms (ten pulses a second)
Pass the address of period as the second argument to SetWaitableTimer, and the repeat period in milliseconds (100) as the third argument.
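For reference, the same waitable-timer approach from C/C++, with the due time as a negative LARGE_INTEGER in 100-nanosecond units and the period in milliseconds. A sketch with error handling trimmed; the loop count is arbitrary:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE hTimer = CreateWaitableTimer(NULL, FALSE, NULL);   /* auto-reset timer */
    if (!hTimer) return 1;

    LARGE_INTEGER dueTime;
    dueTime.QuadPart = -1000000LL;   /* first fire after 100 ms (relative, 100-ns units) */

    /* Period of 100 ms -> roughly ten pulses per second. */
    if (!SetWaitableTimer(hTimer, &dueTime, 100, NULL, NULL, FALSE))
        return 1;

    for (int i = 0; i < 20; ++i) {
        WaitForSingleObject(hTimer, INFINITE);   /* blocks until the next pulse */
        printf("pulse\n");
    }

    CloseHandle(hTimer);
    return 0;
}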
I'm having some difficulty with handling streaming sources in OpenAL on Mac OS X (using the system framework). I'm still not sure what triggers it, but sometimes, after stopping a streaming source and playing it again, queueing a buffer increases the AL_BUFFERS_PROCESSED value. I use a while loop like the following to process the source's buffers:
alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);
while (processed--)
{
    ALuint buffer;
    // Get a free buffer.
    alSourceUnqueueBuffers(source, 1, &buffer);
    streamAtomic(buffer, decoder); // streamAtomic decodes compressed audio data and calls alBufferData.
    alSourceQueueBuffers(source, 1, &buffer);
}
The full source code to the Source class can be found here.
Normally this update loop works fine, but whenever this bug gets triggered, calling alSourceQueueBuffers seemingly increases AL_BUFFERS_PROCESSED, meaning that every update cycle, this loop takes longer and longer, until it reaches the total number of buffers queued, period (32, in this case), where it stays until pausing or stopping the source, at which point AL_BUFFERS_PROCESSED resets - and promptly begins increasing again. I checked, and the count does decrease by 1 after calling alSourceUnqueueBuffers. It's only after I call alSourceQueueBuffers that the count increases again.
I've been poring over my code, the OpenAL spec, Stack Overflow, the OpenAL mailing list, and Google, and I can't find any documentation of this occurring, nor any indication as to whether I'm doing something wrong or if it's a bug in the OpenAL implementation. For what it's worth, this bug does not occur, using the exact same code, under OpenAL Soft on Windows and Linux. I couldn't get OpenAL Soft working properly on my Mac to test, though.
Any ideas?
Let's say I have a contrived program:
#include <Windows.h>

void useless_function()
{
    Sleep(5000);
}

void useful_function()
{
    // ... do some work
    useless_function();
    // ... do some more work
}

int main()
{
    useful_function();
    return 0;
}
Objective: I want the profiler to tell me useful_function() is needlessly calling useless_function(), which waits for no obvious reason. Under XPerf, this doesn't show up in any of the graphs I have because the call to WaitForMultipleObjects() seems to be accounted to Idle.exe instead of my own program.
And here's the xperf command line that I currently run:
xperf -on Latency -stackwalk Profile
Any ideas?
(This is not restricted to wait functions. The above might have been solved by placing breakpoints at NtWaitForMultipleObjects. Ideally there would be a way to see the stack samples that take up a lot of wall-clock time as opposed to only CPU time.)
I think what you are looking for is the Wait Analysis with Ready Thread functionality in Xperf. It captures every context switch and gives you the call stack of the thread once it wakes up from a sleep (or an otherwise blocked operation). In your case, you would see the stack just after the call to Sleep(5000) as well as the time spent sleeping.
The functionality is a bit obscure to use. But it is fortunately well described here:
Use Xperf's Wait Analysis for Application-Performance Troubleshooting
Wait Analysis is the way to do this. You should:
Record the CSWITCH provider, in order to get all context switches
Record call stacks on context switches by adding +CSWITCH to your -stackwalk argument
Probably record call stacks on the ready thread to get more information on who readied you (i.e. who released the mutex, CS, or semaphore, and where) by adding +READYTHREAD to your -stackwalk, as in the command below
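With the command line from the question, that means something like the following (the Latency kernel group should already enable the CSWITCH provider, so only the stack-walk flags change):
xperf -on Latency -stackwalk Profile+CSwitch+ReadyThread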
Then you use CPU Usage (Precise) in WPA (or xperfview, but that's ancient) to look at the context switches and find where your TimeSinceLast is high on a thread that shouldn't be going idle. You'll typically want the columns in CPU Usage (Precise) in this sort of order:
NewProcess (your process being switched in)
NewThreadId
NewThreadStack
ReadyingProcess (who made your thread ready to run)
ReadyingThreadId (optional)
ReadyThreadStack (optional, requires +ReadyThread on -stackwalk)
Orange bar
Count
TimeSinceLast (us) - sort by this column, usually
Whatever other columns you want
For details see these particular articles from my blog:
- https://randomascii.wordpress.com/2014/08/19/etw-training-videos-available-now/
- https://randomascii.wordpress.com/2012/06/19/wpaxperf-trace-analysis-reimagined/
This "profiler" will tell you - just randomly pause it a few times and look at the stack. If do some work takes 5 seconds, and do some more work takes 5 seconds, then 33% of the time the stack will look like this
main: calling useful_function
useful_function: calling useless_function
useless_function: calling Sleep
So roughly 33% of your stack samples will show exactly that. Any line of code that's costing some fraction of wall-clock time will appear on roughly that fraction of samples.
On the rest of the samples you will see it doing the other things.
There are automated profilers that do the same thing in a prettier way, such as Zoom and LTProf, although they don't actually show you the samples.
I looked at the xperf doc, trying to figure out if you could get stack samples on wall-clock time and get percentages at line-level resolution. It seems you have to be on Windows 7 or Vista. They only bother with functions, not lines, which matters if you have realistically big functions. I couldn't figure out how to get access to the individual samples, which I think is important for seeing why the program is spending its time.
I want to use extra CPU cycles to do some of my own processing, and I was wondering if someone could point me in the right direction as to how to get started on this?
I would suggest writing a program that runs continuously (make sure it blocks occasionally), and then simply setting it to a low priority. The OS Scheduler (Windows/*nix) should handle the rest automatically.
You can use extra CPU cycles by writing a program that runs in the background.
You can check the CPU usage to find out when the computer is idle (but it's not necessarily a good idea), or you can listen for mouse/keyboard activity.
To check CPU usage in C#, use the following code:
float cpuUsage; //Between 0 and 100
using (var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total")) {
    cpu.NextValue(); // First call gives wrong values
    cpuUsage = cpu.NextValue();
}
To check for keyboard or mouse activity, you'll need to use a keyboard / mouse hook; see here for instructions.
Write an application. Set its thread priorities to "background". Job done ;)
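On Windows that is only a couple of calls; a minimal sketch (the Sleep interval and the empty work loop are placeholders):

#include <windows.h>

int main(void)
{
    // Run at the lowest scheduling priority so foreground work always wins.
    SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS);
    // Optionally also lower I/O and memory priority for this thread (Vista and later).
    SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_BEGIN);

    for (;;) {
        // ... do your own processing here ...
        Sleep(10);   // block occasionally so the machine never feels starved
    }
}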