I have some kind of "main loop" using glut. I'd like to be able to measure how much time it takes to render a frame. The time used to render a frame might be used for other calculations. The use of the function time isn't adequate.
(time (procedure))
I found out that there is a function called current-time. I had to import some package to get it.
(define ct (current-time))
Which define ct as a time object. Unfortunately, I couldn't find any arithmetic packages for dates in scheme. I saw that in Racket there is something called current-inexact-milliseconds which is exactly what I'm looking for because it has nanoseconds.
Using the time object, there is a way to convert it to nanoseconds using
(time->nanoseconds ct)
This lets me do something like this
(let ((newTime (current-time)))
(block)
(print (- (time->nanoseconds newTime) (time->nanoseconds oldTime)))
(set! oldTime newTime))
Seems good enough for me except that for some reasons it was printing things like this
0
10000
0
0
10000
0
10000
I'm rendering things using opengl and I find it hard to believe that some rendering loop are taking 0 nanoseconds. And that each loop is quite stable enough to always take the same amount of nanoseconds.
After all, your results are not so surprising because we have to consider the limited timer resolution for each system. In fact, there are some limits that depend in general by the processor and by the OS processes. These are not able to count in an accurate manner than we expect, despite a quartz oscillator can reach and exceed a period of a nanosecond. You are also limited by the accuracy and resolution of the functions you used. I had a look at the documentation of Chicken scheme but there is nothing similar to (current-inexact-milliseconds) → real? of Racket.
After digging around, I came with the solution that I should write it in C and bind it to scheme using bindings.
(require-extension bind)
(bind-rename "getTime" "current-microseconds")
(bind* #<<EOF
uint64_t getTime();
#ifndef CHICKEN
#include <sys/time.h>
uint64_t getTime() {
struct timeval tim;
gettimeofday(&tim, NULL);
return 1000000 * tim.tv_sec + tim.tv_usec;
}
#endif
EOF
)
Unfortunately this solution isn't the best because it will be chicken-scheme only. It could be implemented as a library but a library to wrap only one function that doesn't exists on any other scheme doesn't make sense.
Since nanoseconds doens't actually make much sense after all, I got the microseconds instead.
Watch the trick here, define the function to get wrapped above and prevent the include to get parsed by bind. When the file will get loaded in Gcc, it will build with the include and function definition.
CHICKEN has current-milliseconds: http://api.call-cc.org/doc/library/current-milliseconds
Related
I need to perform simple arithmetic on struct tm from time.h. I need to add or subtract seconds or minutes, and be able to normalize the structure. Normally, I'd use mktime(3) which performs this normalization as a side effect:
struct tm t = {.tm_hour=0, .tm_min=59, .tm_sec=40};
t.tm_sec += 30;
mktime(&t);
// t.tm_hour is now 1
// t.tm_min is now 0
// t.tm_sec is now 10
I'm doing this on an STM32 with 32 kB of flash, and binary gets very big. mktime(3) and the other stuff it pulls in take up 16 kB of flash--half the available space.
Is there a function in newlib that is specifically responsible for struct tm normalization? I realize that linking to a private function like that would make the code less portable.
There is a validate_structure() function in newlib/libc/time/mktime.c which does a part of the job, normalizes month, day-of-month, hour, min, sec, but leaves day-of-week and day-of-year alone.
It's declared static, so you can't simply call it, but you can copy the function from the sources. (There might be licensing issues though). Or you can just reimplement it, it's quite straightforward.
The tm_wday and tm_yday is calculated later in mktime(), so you'd need the whole mess including the timezone stuff in order to have these two normalized.
The bulk of that 16kB code is related to a call to siscanf(), a variant of sscanf() without floating point support, which is (I believe) used to parse timezone and DST information in environment variables.
You can cut lots of unnecessary code by using --specs=nano.specs when linking, which would switch to simplified printf/scanf code, saving about 10kB of code in your case.
I'm looking for input as to why this breaks. See the addendum for contextual information, but I don't really think it is relevant.
I have an std::vector<uint16_t> depth_buffer that is initialized to have 640*480 elements. This means that the total space it takes up is 640*480*sizeof(uint16_t) = 614400.
The code that breaks:
void Kinect360::DepthCallback(void* _depth, uint32_t timestamp) {
lock_guard<mutex> depth_data_lock(depth_mutex);
uint16_t* depth = static_cast<uint16_t*>(_depth);
std::copy(depth, depth + depthBufferSize(), depth_buffer.begin());/// the error
new_depth_frame = true;
}
where depthBufferSize() will return 614400 (I've verified this multiple times).
My understanding of std::copy(first, amount, out) is that first specifies the memory address to start copying from, amount is how far in bytes to copy until, and out is the memory address to start copying to.
Of course, it can be done manually with something like
#pragma unroll
for(auto i = 0; i < 640*480; ++i) depth_buffer[i] = depth[i];
instead of the call to std::copy, but I'm really confused as to why std::copy fails here. Any thoughts???
Addendum: the context is that I am writing a derived class that inherits from FreenectDevice to work with a Kinect 360. Officially the error is a Bus Error, but I'm almost certain this is because libfreenect interprets an error in the DepthCallback as a Bus Error. Stepping through with lldb, it's a standard runtime_error being thrown from std::copy. If I manually enter depth + 614400 it will crash, though if I have depth + (640*480) it will chug along. At this stage I am not doing something meaningful with the depth data (rendering the raw depth appropriately with OpenGL is a separate issue xD), so it is hard to tell if everything got copied, or just a portion. That said, I'm almost positive it doesn't grab it all.
Contrasted with the corresponding VideoCallback and the call inside of copy(video, video + videoBufferSize(), video_buffer.begin()), I don't see why the above would crash. If my understanding of std::copy were wrong, this should crash too since videoBufferSize() is going to return 640*480*3*sizeof(uint8_t) = 640*480*3 = 921600. The *3 is from the fact that we have 3 uint8_t's per pixel, RGB (no A). The VideoCallback works swimmingly, as verified with OpenGL (and the fact that it's essentially identical to the samples provided with libfreenect...). FYI none of the samples I have found actually work with the raw depth data directly, all of them colorize the depth and use an std::vector<uint8_t> with RGB channels, which does not suit my needs for this project.
I'm happy to just ignore it and move on in some senses because I can get it to work, but I'm really quite perplexed as to why this breaks. Thanks for any thoughts!
The way std::copy works is that you provide start and end points of your input sequence and the location to begin copying to. The end point that you're providing is off the end of your sequence, because your depthBufferSize function is giving an offset in bytes, rather than the number of elements in your sequence.
If you remove the multiply by sizeof(uint16_t), it will work. At that point, you might also consider calling std::copy_n instead, which takes the number of elements to copy.
Edit: I just realised that I didn't answer the question directly.
Based on my understanding of std::copy, it shouldn't be throwing exceptions with the input you're giving it. The only thing in that code that could throw a runtime_error is the locking of the mutex.
Considering you have undefined behaviour as a result of running off of the end of your buffer, I'm tempted to say that has something to do with it.
I need a way to find out the amount of time taken by a function, and a section of my code inside a function, to execute
Does Visual Studio provide any mechanism for doing this, or is it possible to do so from the program using MFC functions? I am new to MFC so I am not sure how this can be done. I thought this should be a pretty straight forward operation but I cannot find any examples on how this may be done either
A quick way, but quite imprecise, is by using GetTickCount():
DWORD time1 = GetTickCount();
// Code to profile
DWORD time2 = GetTickCount();
DWORD timeElapsed = time2-time1;
The problem is GetTickCount() uses the system timer, which has a typical resolution of 10 - 15 ms. So it is only useful with long computations.
It can't tell the difference between a function that takes 2 ms to run and one that takes 9 ms. But if you are in the seconds range, it may well be enough.
If you need more resolution, you can use the performance counter, as RedEye explains.
Or you can try a profiler (maybe this is what you were looking for?). See this question.
There may be better ways, but I do it like this :
// At the start of the function
LARGE_INTEGER lStart;
QueryPerformanceCounter(&lStart);
LARGE_INTEGER lFreq;
QueryPerformanceFrequency(&lFreq);
// At the en of the function
LARGE_INTEGER lEnd;
QueryPerformanceCounter(&lEnd);
TRACE("FunctionName t = %dms\n", (1000*(lEnd.LowPart - lStart.LowPart))/lFreq.LowPart);
I use this method quite a lot for optimising graphics code, finding time taken for screen updates etc. There are other methods of doing the same or similar, but this is quick and simple.
I have a Fortran 90 program calling a multi threaded routine. I would like to time this program from the calling routine. If I use cpu_time(), I end up getting the cpu_time for all the threads (8 in my case) added together and not the actual time it takes for the program to run. The etime() routine seems to do the same. Any idea on how I can time this program (without using a stopwatch)?
Try omp_get_wtime(); see http://gcc.gnu.org/onlinedocs/libgomp/omp_005fget_005fwtime.html for the signature.
If this is a one-off thing, then I agree with larsmans, that using gprof or some other profiling is probably the way to go; but I also agree that it is very handy to have coarser timers in your code for timing different phases of the computation. The best timing information you have is the stuff you actually use, and it's hard to beat stuff that's output every single tiem you run your code.
Jeremia Wilcock pointing out omp_get_wtime() is very useful; it's standards compliant so should work on any OpenMP compiler - but it only has second resolution, which may or may not be enough, depending on what you're doing. Edited; the above was completely wrong.
Fortran90 defines system_clock() which can also be used on any standards-compliant compiler; the standard doesn't specify a time resolution, but gfortran it seems to be milliseconds and ifort seems to be microseconds. I usually use it in something like this:
subroutine tick(t)
integer, intent(OUT) :: t
call system_clock(t)
end subroutine tick
! returns time in seconds from now to time described by t
real function tock(t)
integer, intent(in) :: t
integer :: now, clock_rate
call system_clock(now,clock_rate)
tock = real(now - t)/real(clock_rate)
end function tock
And using them:
call tick(calc)
! do big calculation
calctime = tock(calc)
print *,'Timing summary'
print *,'Calc: ', calctime
After noticing some timing descrepencies with events in my code, I boiled the problem all the way down to my Windows Message Loop.
Basically, unless I'm doing something strange, I'm experiencing this behaviour:-
MSG message;
while (PeekMessage(&message, _applicationWindow.Handle, 0, 0, PM_REMOVE))
{
int timestamp = timeGetTime();
bool strange = message.time > timestamp; //strange == true!!!
TranslateMessage(&message);
DispatchMessage(&message);
}
The only rational conclusion I can draw is that MSG::time uses a different timing mechanism then timeGetTime() and therefore is free to produce differing results.
Is this the case or am i missing something fundemental?
Could this be a signed unsigned issue? You are comparing a signed int (timestamp) to an unsigned DWORD (msg.time).
Also, the clock wraps every 40ish days - when that happens strange could well be true.
As an aside, if you don't have a great reason to use timeGetTime, you can use GetTickCount here - it saves you bringing in winmm.
The code below shows how you should go about using times - you should never compare the times directly, because clock wrapping messes that up. Instead you should always subtract the start time from the current time and look at the interval.
// This is roughly equivalent code, however strange should never be true
// in this code
DWORD timestamp = GetTickCount();
bool strange = (timestamp - msg.time < 0);
I don't think it's advisable to expect or rely on any particular relationship between the absolute values of timestamps returned from different sources. For one thing, the multimedia timer may have a different resolution from the system timer. For another, the multimedia timer runs in a separate thread, so you may encounter synchronisation issues. (I don't know if each CPU maintains its own independent tick count.) Furthermore, if you are running any sort of time synchronisation service, it may be making its own adjustments to your local clock and affecting the timestamps you are seeing.
Are you by any chance running an AMD dual core? There is an issue where since each core has a separate timer and can run at different speeds, the timers can diverge from each other. This can manifest itself in negative ping times, for example.
I had similar issues when measuring timeouts in different threads using GetTickCount().
Install this driver (IIRC) to resolve the issue.
MSG.time is based on GetTickCount(), and timeGetTime() uses the multimedia timer, which is completely independent of GetTickCount(). I would not be surprised to see that one timer has 'ticked' before the other.