How to handle the often-generated messages? - algorithm

I have the following system:
producer ---> proxy ---> consumer
<good> 10 msg/s
A producer that generates messages - 10 msg/second. A proxy that returns bool value to a consumer depends on how many messages producer generated, if it generate more than 10, proxy returns false.
This is timestamp:
0.1
0.2
0.25
0.26
0.27
0.28
0.29
0.30
0.31
0.32
0.33
0.34
2.0
2.1
2.2
Messages that comes at 0.1, 0.2, 0.25 ... 0.32 they are good and we return true, 0.32 is bad because it's 11th message in 1 second, so we return false.
Here is a protoype:
bool shouldProxyAcceptMessage(float timestamp)
How can I implement this algorithm?

Okay, I adapted the solution based on the OP's comments. The algorithm uses two global variables, nrMessages and startTime, and it initializes startTime to the current time whenever nrMessage==0 (i.e. when it receives the first message); it returns true for the first 10 timestamps that arrived within 1 second. Once it receives a timestamp older than 1 second, it resets nrMessages=0 (so startTime will reinitialize when the next timestamp arrives).
An implicit assumption we're
long nrMessages = 0; // global variable
long startTime = 0;
boolean shouldProxyAcceptMessage(long timestamp) {
if (nrMessages++ == 0) {
startTime = System.currentTimeMillis()();
}
timeDiff = timestamp-startTime;
if (timeDiff > 1000) {
nrMessages = 0;
return false;
} else {
return nrMessages <= 10;
}
}

Related

Problems getting the current time in microseconds with a STM32 device

I am using a stm32f103c8 and I need a function that will return the correct time in microseconds when called from within an interrupt handler. I found the following bit of code online which proports to do that:
uint32_t microsISR()
{
uint32_t ret;
uint32_t st = SysTick->VAL;
uint32_t pending = SCB->ICSR & SCB_ICSR_PENDSTSET_Msk;
uint32_t ms = UptimeMillis;
if (pending == 0)
ms++;
return ms * 1000 - st / ((SysTick->LOAD + 1) / 1000);
}
My understanding of how this works is uses the system clock counter which repeatedly counts down from 8000 (LOAD+1) and when it reaches zero, an interrupt is generated which increments the variable UptimeMills. This gives the time in milliseconds. To get microseconds we get the current value of the system clock counter and divide it by 8000/1000 to give the offset in microseconds. Since the counter is counting down we subtract it from the current time in milliseconds * 1000. (Actually to be correct I believe one should have be added to the # milliseconds in this calculation).
This is all fine and good unless, when this function is called (in an interrupt handler), the system clock counter has already wrapped but the system clock interrupt has not yet been called, then UptimeMillis count will be off by one. This is the purpose of the following lines:
if (pending == 0)
ms++;
Looking at this does not make sense, however. It is incrementing the # ms if there is NO pending interrupt. Indeed if I use this code, I get a large number of glitches in the returned time at the points at which the counter rolls over. So I changed the lines to:
if (pending != 0)
ms++;
This produced much better results but I still get the occasional glitch (about 1 in every 2000 interrupts) which always occurs at a time when the counter is rolling over.
During the interrupt, I log the current value of milliseconds, microseconds and counter value. I find there are two situations where I get an error:
Milli Micros DT Counter Pending
1 1661 1660550 826 3602 0
2 1662 1661374 824 5010 0
3 1663 1662196 822 6436 0
4 1663 1662022 -174 7826 0
5 1664 1663847 1825 1228 0
6 1665 1664674 827 2614 0
7 1666 1665501 827 3993 0
The interrupts are comming in at a regular rate of about 820us. In this case what seems to be happening between interrupt 3 and 4 is that the counter has wrapped but the pending flag is NOT set. So I need to be adding 1000 to the value and since I fail to do so I get a negative elapsed time.
The second situation is as follows:
Milli Micros DT Counter Pending
1 1814 1813535 818 3721 0
2 1815 1814357 822 5151 0
3 1816 1815181 824 6554 0
4 1817 1817000 1819 2 1
5 1817 1816817 -183 1466 0
6 1818 1817637 820 2906 0
This is a very similar situation except in this case the counter has NOT yet wrapped and yet I am already getting the pending interrupt flag which causes me to erronously add 1000.
Clearly there is some kind of race condition between the two competing interrupts. I have tried setting the clock interrupt priority both above and below that of the external interrupt but the problem persists.
Does anyone have any suggestions how to deal with this problem or a suggestion for a different approach to get the time is microseconds within an interrupt handler.
Read UptimeMillis before and after SysTick->VAL to ensure a rollover has not occurred.
uint32_t microsISR()
{
uint32_t ms = UptimeMillis;
uint32_t st = SysTick->VAL;
// Did UptimeMillis rollover while reading SysTick->VAL?
if (ms != UptimeMillis)
{
// Rollover occurred so read both again.
// Must read both because we don't know whether the
// rollover occurred before or after reading SysTick->VAL.
// No need to check for another rollover because there is
// no chance of another rollover occurring so quickly.
ms = UptimeMillis;
st = SysTick->VAL;
}
return ms * 1000 - st / ((SysTick->LOAD + 1) / 1000);
}
Or here is the same idea in a do-while loop.
uint32_t microsISR()
{
uint32_t ms;
uint32_t st;
// Read UptimeMillis and SysTick->VAL until
// UptimeMillis doesn't rollover.
do
{
ms = UptimeMillis;
st = SysTick->VAL;
} while (ms != UptimeMillis);
return ms * 1000 - st / ((SysTick->LOAD + 1) / 1000);
}

Exponent calculation speed

I am currently testing Julia (I've worked with Matlab)
In matlab the calculation speed of N^3 is slower than NxNxN. This doesn't happen with N^2 and NxN. They use a different algorithm to calculate higher-order exponents because they prefer accuracy rather than speed.
I think Julia do the same thing.
I wanted to ask if there is a way to force Julia to calculate the exponent of N using multiplication instead of the default algorithm, at least for cube exponents.
Some time ago a I did a few test on matlab of this. I made a translation of that code to julia.
Links to code:
http://pastebin.com/bbeukhTc
(I cant upload all the links here :( )
Results of the scripts on Matlab 2014:
Exponente1
Elapsed time is 68.293793 seconds. (17.7x times of the smallest)
Exponente2
Elapsed time is 24.236218 seconds. (6.3x times of the smallests)
Exponente3
Elapsed time is 3.853348 seconds.
Results of the scripts on Julia 0.46:
Exponente1
18.423204 seconds (8.22 k allocations: 372.563 KB) (51.6x times of the smallest)
Exponente2
13.746904 seconds (9.02 k allocations: 407.332 KB) (38.5 times of the smallest)
Exponente3
0.356875 seconds (10.01 k allocations: 450.441 KB)
In my tests julia is faster than Matlab, but i am using a relative old version. I cant test other versions.
Checking Julia's source code:
julia/base/math.jl:
^(x::Float64, y::Integer) =
box(Float64, powi_llvm(unbox(Float64,x), unbox(Int32,Int32(y))))
^(x::Float32, y::Integer) =
box(Float32, powi_llvm(unbox(Float32,x), unbox(Int32,Int32(y))))
julia/base/fastmath.jl:
pow_fast{T<:FloatTypes}(x::T, y::Integer) = pow_fast(x, Int32(y))
pow_fast{T<:FloatTypes}(x::T, y::Int32) =
box(T, Base.powi_llvm(unbox(T,x), unbox(Int32,y)))
We can see that Julia uses powi_llvm
Checking llvm's source code:
define double #powi(double %F, i32 %power) {
; CHECK: powi:
; CHECK: bl __powidf2
%result = call double #llvm.powi.f64(double %F, i32 %power)
ret double %result
}
Now, the __powidf2 is the interesting function here:
COMPILER_RT_ABI double
__powidf2(double a, si_int b)
{
const int recip = b < 0;
double r = 1;
while (1)
{
if (b & 1)
r *= a;
b /= 2;
if (b == 0)
break;
a *= a;
}
return recip ? 1/r : r;
}
Example 1: given a = 2; b = 7:
- r = 1
- iteration 1: r = 1 * 2 = 2; b = (int)(7/2) = 3; a = 2 * 2 = 4
- iteration 2: r = 2 * 4 = 8; b = (int)(3/2) = 1; a = 4 * 4 = 16
- iteration 3: r = 8 * 16 = 128;
Example 2: given a = 2; b = 8:
- r = 1
- iteration 1: r = 1; b = (int)(8/2) = 4; a = 2 * 2 = 4
- iteration 2: r = 1; b = (int)(4/2) = 2; a = 4 * 4 = 16
- iteration 3: r = 1; b = (int)(2/2) = 1; a = 16 * 16 = 256
- iteration 4: r = 1 * 256 = 256; b = (int)(1/2) = 0;
Integer power is always implemented as a sequence multiplications. That's why N^3 is slower than N^2.
jl_powi_llvm (called in fastmath.jl. "jl_" is concatenated by macro expansion), on the other hand, casts the exponent to floating-point and calls pow(). C source code:
JL_DLLEXPORT jl_value_t *jl_powi_llvm(jl_value_t *a, jl_value_t *b)
{
jl_value_t *ty = jl_typeof(a);
if (!jl_is_bitstype(ty))
jl_error("powi_llvm: a is not a bitstype");
if (!jl_is_bitstype(jl_typeof(b)) || jl_datatype_size(jl_typeof(b)) != 4)
jl_error("powi_llvm: b is not a 32-bit bitstype");
jl_value_t *newv = newstruct((jl_datatype_t*)ty);
void *pa = jl_data_ptr(a), *pr = jl_data_ptr(newv);
int sz = jl_datatype_size(ty);
switch (sz) {
/* choose the right size c-type operation */
case 4:
*(float*)pr = powf(*(float*)pa, (float)jl_unbox_int32(b));
break;
case 8:
*(double*)pr = pow(*(double*)pa, (double)jl_unbox_int32(b));
break;
default:
jl_error("powi_llvm: runtime floating point intrinsics are not implemented for bit sizes other than 32 and 64");
}
return newv;
}
Lior's answer is excellent. Here is a solution to the problem you posed: Yes, there is a way to force usage of multiplication, at cost of accuracy. It's the #fastmath macro:
julia> #benchmark 1.1 ^ 3
BenchmarkTools.Trial:
samples: 10000
evals/sample: 999
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 16.00 bytes
allocs estimate: 1
minimum time: 13.00 ns (0.00% GC)
median time: 14.00 ns (0.00% GC)
mean time: 15.74 ns (6.14% GC)
maximum time: 1.85 μs (98.16% GC)
julia> #benchmark #fastmath 1.1 ^ 3
BenchmarkTools.Trial:
samples: 10000
evals/sample: 1000
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 0.00 bytes
allocs estimate: 0
minimum time: 2.00 ns (0.00% GC)
median time: 3.00 ns (0.00% GC)
mean time: 2.59 ns (0.00% GC)
maximum time: 20.00 ns (0.00% GC)
Note that with #fastmath, performance is much better.

What does "nil#" mean in profiler results?

Here is very simplified part of code:
srand 0
WIDTH, HEIGHT = 10, 10
array = Array.new(HEIGHT){ [0]*WIDTH }
require "profile"
10000.times do
y, x = rand(HEIGHT), rand(WIDTH)
g = array[y][x] + [-1,+1].sample
array[y][x] = g unless [[y-1,x],[y+1,x],[y,x-1],[y,x+1]].any?{ |y, x|
y>=0 && y<HEIGHT && x>=0 && x<WIDTH && 1 < (array[y][x] - g).abs
}
end
We see (ruby 2.0.0p451 (2014-02-24) [i386-mingw32]):
% cumulative self self total
time seconds seconds calls ms/call ms/call name
39.33 0.86 0.86 47471 0.02 0.05 nil#
If we totally remove the unless thing:
53.48 0.48 0.48 10000 0.05 0.09 nil#
Looks like we get this nil# operation:
on every comparision or boolean operation?
on every Array object creation or block envocation?
Would be nice to get a deep answer.

How to best create a Timer Function in R

I'm trying to create a function in R that returns for the first x seconds after the function call 1, the next x seconds 0, the next x seconds 1 again,...the whole procedure should stop after another time interval or after n iterations. I want to do this with one function call.
I read about the package tcltk which apparently entails some possibilities to create such "timer" functions, however I did not find enough explanations to sort out my issue.
Could you advise me where to find a good manual that explains tcl in context with R? Do you have other ideas how to create such a function in an efficient way?
Thanks a lot for your help.
If I understood you correctly, you are trying to create a function that will return 1 whenever it is called in the first x secs, then return 0 whenever it's called in the next x secs, then return 1 over the next x secs, etc. And after a certain total time, it should be "done", maybe return -1?
You could do this using the following function that will "create" a function with any desired interval:
flipper <- function(interval=10, total = 60) {
t0 <- Sys.time()
function() {
seconds <- round(as.double( difftime(Sys.time(), t0, u = 'secs')))
if(seconds > total)
return(-1) else
return(trunc( 1 + (seconds / interval ) ) %% 2)
}
}
You can use this to create a function that alternates between 0 and 1 every 10 secs during the first 60 seconds, and returns -1 after 60 seconds:
> flp <- flipper(10,60)
Now calling flp() will have the behavior you are looking for, i.e. when you call flp() during the next 60 secs, it will alternate between 1 and 0 every 10 secs, then after 60 secs it will just return -1.
Sys.sleep from base could not be a solution?
E.g.: stop every 10th iteration in a loop for 10 seconds:
for (i in 1:100) {
# do something
if ((i %% 10) == 0) {
Sys.sleep(10)
}
}
May I also suggest:
Timer 1
library(data.table)
begin.time <- proc.time()
timetaken(begin.time)
Timer 2
library(matlab)
tic(gcFirst=FALSE)
toc(echo=TRUE)
Both are excellent choices for timers
The tcltk command after will call a function after a time delay. Getting it to repeatedly call can be done along the following lines (made more complicated by this desire to have different intervals between events).
afterID <- ""
someFlag <- TRUE
MS <- 5000 # milliseconds
repeatCall <- function(ms=MS, f) {
afterID <<- tcl("after", ms, function() {
if(someFlag) {
f()
afterID <<- repeatCall(MS - ms, f)
} else {
tcl("after", "cancel", afterID)
}
})
}
repeatCall(MS, function() {
print("Running. Set someFlag <- FALSE to stop.")
})
timefun <- function(interval = 10, output = c(0, 1)) {
start <- Sys.time()
# do some stuff
for (i in 1:99999) paste(i, i^2)
# how many intervals did it take
elapsed <- as.numeric(round(Sys.time() - start)) / interval
output[elapsed %% length(output) + 1]
}
I'm not clear on "the whole procedure should stop after another time interval or after n iterations", do you want to check elapsed time periodically during the function execution and stop() if it's above a certain value?

Calculating frames per second in a game

What's a good algorithm for calculating frames per second in a game? I want to show it as a number in the corner of the screen. If I just look at how long it took to render the last frame the number changes too fast.
Bonus points if your answer updates each frame and doesn't converge differently when the frame rate is increasing vs decreasing.
You need a smoothed average, the easiest way is to take the current answer (the time to draw the last frame) and combine it with the previous answer.
// eg.
float smoothing = 0.9; // larger=more smoothing
measurement = (measurement * smoothing) + (current * (1.0-smoothing))
By adjusting the 0.9 / 0.1 ratio you can change the 'time constant' - that is how quickly the number responds to changes. A larger fraction in favour of the old answer gives a slower smoother change, a large fraction in favour of the new answer gives a quicker changing value. Obviously the two factors must add to one!
This is what I have used in many games.
#define MAXSAMPLES 100
int tickindex=0;
int ticksum=0;
int ticklist[MAXSAMPLES];
/* need to zero out the ticklist array before starting */
/* average will ramp up until the buffer is full */
/* returns average ticks per frame over the MAXSAMPLES last frames */
double CalcAverageTick(int newtick)
{
ticksum-=ticklist[tickindex]; /* subtract value falling off */
ticksum+=newtick; /* add new value */
ticklist[tickindex]=newtick; /* save new value so it can be subtracted later */
if(++tickindex==MAXSAMPLES) /* inc buffer index */
tickindex=0;
/* return average */
return((double)ticksum/MAXSAMPLES);
}
Well, certainly
frames / sec = 1 / (sec / frame)
But, as you point out, there's a lot of variation in the time it takes to render a single frame, and from a UI perspective updating the fps value at the frame rate is not usable at all (unless the number is very stable).
What you want is probably a moving average or some sort of binning / resetting counter.
For example, you could maintain a queue data structure which held the rendering times for each of the last 30, 60, 100, or what-have-you frames (you could even design it so the limit was adjustable at run-time). To determine a decent fps approximation you can determine the average fps from all the rendering times in the queue:
fps = # of rendering times in queue / total rendering time
When you finish rendering a new frame you enqueue a new rendering time and dequeue an old rendering time. Alternately, you could dequeue only when the total of the rendering times exceeded some preset value (e.g. 1 sec). You can maintain the "last fps value" and a last updated timestamp so you can trigger when to update the fps figure, if you so desire. Though with a moving average if you have consistent formatting, printing the "instantaneous average" fps on each frame would probably be ok.
Another method would be to have a resetting counter. Maintain a precise (millisecond) timestamp, a frame counter, and an fps value. When you finish rendering a frame, increment the counter. When the counter hits a pre-set limit (e.g. 100 frames) or when the time since the timestamp has passed some pre-set value (e.g. 1 sec), calculate the fps:
fps = # frames / (current time - start time)
Then reset the counter to 0 and set the timestamp to the current time.
Increment a counter every time you render a screen and clear that counter for some time interval over which you want to measure the frame-rate.
Ie. Every 3 seconds, get counter/3 and then clear the counter.
There are at least two ways to do it:
The first is the one others have mentioned here before me.
I think it's the simplest and preferred way. You just to keep track of
cn: counter of how many frames you've rendered
time_start: the time since you've started counting
time_now: the current time
Calculating the fps in this case is as simple as evaluating this formula:
FPS = cn / (time_now - time_start).
Then there is the uber cool way you might like to use some day:
Let's say you have 'i' frames to consider. I'll use this notation: f[0], f[1],..., f[i-1] to describe how long it took to render frame 0, frame 1, ..., frame (i-1) respectively.
Example where i = 3
|f[0] |f[1] |f[2] |
+----------+-------------+-------+------> time
Then, mathematical definition of fps after i frames would be
(1) fps[i] = i / (f[0] + ... + f[i-1])
And the same formula but only considering i-1 frames.
(2) fps[i-1] = (i-1) / (f[0] + ... + f[i-2])
Now the trick here is to modify the right side of formula (1) in such a way that it will contain the right side of formula (2) and substitute it for it's left side.
Like so (you should see it more clearly if you write it on a paper):
fps[i] = i / (f[0] + ... + f[i-1])
= i / ((f[0] + ... + f[i-2]) + f[i-1])
= (i/(i-1)) / ((f[0] + ... + f[i-2])/(i-1) + f[i-1]/(i-1))
= (i/(i-1)) / (1/fps[i-1] + f[i-1]/(i-1))
= ...
= (i*fps[i-1]) / (f[i-1] * fps[i-1] + i - 1)
So according to this formula (my math deriving skill are a bit rusty though), to calculate the new fps you need to know the fps from the previous frame, the duration it took to render the last frame and the number of frames you've rendered.
This might be overkill for most people, that's why I hadn't posted it when I implemented it. But it's very robust and flexible.
It stores a Queue with the last frame times, so it can accurately calculate an average FPS value much better than just taking the last frame into consideration.
It also allows you to ignore one frame, if you are doing something that you know is going to artificially screw up that frame's time.
It also allows you to change the number of frames to store in the Queue as it runs, so you can test it out on the fly what is the best value for you.
// Number of past frames to use for FPS smooth calculation - because
// Unity's smoothedDeltaTime, well - it kinda sucks
private int frameTimesSize = 60;
// A Queue is the perfect data structure for the smoothed FPS task;
// new values in, old values out
private Queue<float> frameTimes;
// Not really needed, but used for faster updating then processing
// the entire queue every frame
private float __frameTimesSum = 0;
// Flag to ignore the next frame when performing a heavy one-time operation
// (like changing resolution)
private bool _fpsIgnoreNextFrame = false;
//=============================================================================
// Call this after doing a heavy operation that will screw up with FPS calculation
void FPSIgnoreNextFrame() {
this._fpsIgnoreNextFrame = true;
}
//=============================================================================
// Smoothed FPS counter updating
void Update()
{
if (this._fpsIgnoreNextFrame) {
this._fpsIgnoreNextFrame = false;
return;
}
// While looping here allows the frameTimesSize member to be changed dinamically
while (this.frameTimes.Count >= this.frameTimesSize) {
this.__frameTimesSum -= this.frameTimes.Dequeue();
}
while (this.frameTimes.Count < this.frameTimesSize) {
this.__frameTimesSum += Time.deltaTime;
this.frameTimes.Enqueue(Time.deltaTime);
}
}
//=============================================================================
// Public function to get smoothed FPS values
public int GetSmoothedFPS() {
return (int)(this.frameTimesSize / this.__frameTimesSum * Time.timeScale);
}
Good answers here. Just how you implement it is dependent on what you need it for. I prefer the running average one myself "time = time * 0.9 + last_frame * 0.1" by the guy above.
however I personally like to weight my average more heavily towards newer data because in a game it is SPIKES that are the hardest to squash and thus of most interest to me. So I would use something more like a .7 \ .3 split will make a spike show up much faster (though it's effect will drop off-screen faster as well.. see below)
If your focus is on RENDERING time, then the .9.1 split works pretty nicely b/c it tend to be more smooth. THough for gameplay/AI/physics spikes are much more of a concern as THAT will usually what makes your game look choppy (which is often worse than a low frame rate assuming we're not dipping below 20 fps)
So, what I would do is also add something like this:
#define ONE_OVER_FPS (1.0f/60.0f)
static float g_SpikeGuardBreakpoint = 3.0f * ONE_OVER_FPS;
if(time > g_SpikeGuardBreakpoint)
DoInternalBreakpoint()
(fill in 3.0f with whatever magnitude you find to be an unacceptable spike)
This will let you find and thus solve FPS issues the end of the frame they happen.
A much better system than using a large array of old framerates is to just do something like this:
new_fps = old_fps * 0.99 + new_fps * 0.01
This method uses far less memory, requires far less code, and places more importance upon recent framerates than old framerates while still smoothing the effects of sudden framerate changes.
You could keep a counter, increment it after each frame is rendered, then reset the counter when you are on a new second (storing the previous value as the last second's # of frames rendered)
JavaScript:
// Set the end and start times
var start = (new Date).getTime(), end, FPS;
/* ...
* the loop/block your want to watch
* ...
*/
end = (new Date).getTime();
// since the times are by millisecond, use 1000 (1000ms = 1s)
// then multiply the result by (MaxFPS / 1000)
// FPS = (1000 - (end - start)) * (MaxFPS / 1000)
FPS = Math.round((1000 - (end - start)) * (60 / 1000));
Here's a complete example, using Python (but easily adapted to any language). It uses the smoothing equation in Martin's answer, so almost no memory overhead, and I chose values that worked for me (feel free to play around with the constants to adapt to your use case).
import time
SMOOTHING_FACTOR = 0.99
MAX_FPS = 10000
avg_fps = -1
last_tick = time.time()
while True:
# <Do your rendering work here...>
current_tick = time.time()
# Ensure we don't get crazy large frame rates, by capping to MAX_FPS
current_fps = 1.0 / max(current_tick - last_tick, 1.0/MAX_FPS)
last_tick = current_tick
if avg_fps < 0:
avg_fps = current_fps
else:
avg_fps = (avg_fps * SMOOTHING_FACTOR) + (current_fps * (1-SMOOTHING_FACTOR))
print(avg_fps)
Set counter to zero. Each time you draw a frame increment the counter. After each second print the counter. lather, rinse, repeat. If yo want extra credit, keep a running counter and divide by the total number of seconds for a running average.
In (c++ like) pseudocode these two are what I used in industrial image processing applications that had to process images from a set of externally triggered camera's. Variations in "frame rate" had a different source (slower or faster production on the belt) but the problem is the same. (I assume that you have a simple timer.peek() call that gives you something like the nr of msec (nsec?) since application start or the last call)
Solution 1: fast but not updated every frame
do while (1)
{
ProcessImage(frame)
if (frame.framenumber%poll_interval==0)
{
new_time=timer.peek()
framerate=poll_interval/(new_time - last_time)
last_time=new_time
}
}
Solution 2: updated every frame, requires more memory and CPU
do while (1)
{
ProcessImage(frame)
new_time=timer.peek()
delta=new_time - last_time
last_time = new_time
total_time += delta
delta_history.push(delta)
framerate= delta_history.length() / total_time
while (delta_history.length() > avg_interval)
{
oldest_delta = delta_history.pop()
total_time -= oldest_delta
}
}
qx.Class.define('FpsCounter', {
extend: qx.core.Object
,properties: {
}
,events: {
}
,construct: function(){
this.base(arguments);
this.restart();
}
,statics: {
}
,members: {
restart: function(){
this.__frames = [];
}
,addFrame: function(){
this.__frames.push(new Date());
}
,getFps: function(averageFrames){
debugger;
if(!averageFrames){
averageFrames = 2;
}
var time = 0;
var l = this.__frames.length;
var i = averageFrames;
while(i > 0){
if(l - i - 1 >= 0){
time += this.__frames[l - i] - this.__frames[l - i - 1];
}
i--;
}
var fps = averageFrames / time * 1000;
return fps;
}
}
});
How i do it!
boolean run = false;
int ticks = 0;
long tickstart;
int fps;
public void loop()
{
if(this.ticks==0)
{
this.tickstart = System.currentTimeMillis();
}
this.ticks++;
this.fps = (int)this.ticks / (System.currentTimeMillis()-this.tickstart);
}
In words, a tick clock tracks ticks. If it is the first time, it takes the current time and puts it in 'tickstart'. After the first tick, it makes the variable 'fps' equal how many ticks of the tick clock divided by the time minus the time of the first tick.
Fps is an integer, hence "(int)".
Here's how I do it (in Java):
private static long ONE_SECOND = 1000000L * 1000L; //1 second is 1000ms which is 1000000ns
LinkedList<Long> frames = new LinkedList<>(); //List of frames within 1 second
public int calcFPS(){
long time = System.nanoTime(); //Current time in nano seconds
frames.add(time); //Add this frame to the list
while(true){
long f = frames.getFirst(); //Look at the first element in frames
if(time - f > ONE_SECOND){ //If it was more than 1 second ago
frames.remove(); //Remove it from the list of frames
} else break;
/*If it was within 1 second we know that all other frames in the list
* are also within 1 second
*/
}
return frames.size(); //Return the size of the list
}
In Typescript, I use this algorithm to calculate framerate and frametime averages:
let getTime = () => {
return new Date().getTime();
}
let frames: any[] = [];
let previousTime = getTime();
let framerate:number = 0;
let frametime:number = 0;
let updateStats = (samples:number=60) => {
samples = Math.max(samples, 1) >> 0;
if (frames.length === samples) {
let currentTime: number = getTime() - previousTime;
frametime = currentTime / samples;
framerate = 1000 * samples / currentTime;
previousTime = getTime();
frames = [];
}
frames.push(1);
}
usage:
statsUpdate();
// Print
stats.innerHTML = Math.round(framerate) + ' FPS ' + frametime.toFixed(2) + ' ms';
Tip: If samples is 1, the result is real-time framerate and frametime.
This is based on KPexEA's answer and gives the Simple Moving Average. Tidied and converted to TypeScript for easy copy and paste:
Variable declaration:
fpsObject = {
maxSamples: 100,
tickIndex: 0,
tickSum: 0,
tickList: []
}
Function:
calculateFps(currentFps: number): number {
this.fpsObject.tickSum -= this.fpsObject.tickList[this.fpsObject.tickIndex] || 0
this.fpsObject.tickSum += currentFps
this.fpsObject.tickList[this.fpsObject.tickIndex] = currentFps
if (++this.fpsObject.tickIndex === this.fpsObject.maxSamples) this.fpsObject.tickIndex = 0
const smoothedFps = this.fpsObject.tickSum / this.fpsObject.maxSamples
return Math.floor(smoothedFps)
}
Usage (may vary in your app):
this.fps = this.calculateFps(this.ticker.FPS)
I adapted #KPexEA's answer to Go, moved the globals into struct fields, allowed the number of samples to be configurable, and used time.Duration instead of plain integers and floats.
type FrameTimeTracker struct {
samples []time.Duration
sum time.Duration
index int
}
func NewFrameTimeTracker(n int) *FrameTimeTracker {
return &FrameTimeTracker{
samples: make([]time.Duration, n),
}
}
func (t *FrameTimeTracker) AddFrameTime(frameTime time.Duration) (average time.Duration) {
// algorithm adapted from https://stackoverflow.com/a/87732/814422
t.sum -= t.samples[t.index]
t.sum += frameTime
t.samples[t.index] = frameTime
t.index++
if t.index == len(t.samples) {
t.index = 0
}
return t.sum / time.Duration(len(t.samples))
}
The use of time.Duration, which has nanosecond precision, eliminates the need for floating-point arithmetic to compute the average frame time, but comes at the expense of needing twice as much memory for the same number of samples.
You'd use it like this:
// track the last 60 frame times
frameTimeTracker := NewFrameTimeTracker(60)
// main game loop
for frame := 0;; frame++ {
// ...
if frame > 0 {
// prevFrameTime is the duration of the last frame
avgFrameTime := frameTimeTracker.AddFrameTime(prevFrameTime)
fps := 1.0 / avgFrameTime.Seconds()
}
// ...
}
Since the context of this question is game programming, I'll add some more notes about performance and optimization. The above approach is idiomatic Go but always involves two heap allocations: one for the struct itself and one for the array backing the slice of samples. If used as indicated above, these are long-lived allocations so they won't really tax the garbage collector. Profile before optimizing, as always.
However, if performance is a major concern, some changes can be made to eliminate the allocations and indirections:
Change samples from a slice of []time.Duration to an array of [N]time.Duration where N is fixed at compile time. This removes the flexibility of changing the number of samples at runtime, but in most cases that flexibility is unnecessary.
Then, eliminate the NewFrameTimeTracker constructor function entirely and use a var frameTimeTracker FrameTimeTracker declaration (at the package level or local to main) instead. Unlike C, Go will pre-zero all relevant memory.
Unfortunately, most of the answers here don't provide either accurate enough or sufficiently "slow responsive" FPS measurements. Here's how I do it in Rust using a measurement queue:
use std::collections::VecDeque;
use std::time::{Duration, Instant};
pub struct FpsCounter {
sample_period: Duration,
max_samples: usize,
creation_time: Instant,
frame_count: usize,
measurements: VecDeque<FrameCountMeasurement>,
}
#[derive(Copy, Clone)]
struct FrameCountMeasurement {
time: Instant,
frame_count: usize,
}
impl FpsCounter {
pub fn new(sample_period: Duration, samples: usize) -> Self {
assert!(samples > 1);
Self {
sample_period,
max_samples: samples,
creation_time: Instant::now(),
frame_count: 0,
measurements: VecDeque::new(),
}
}
pub fn fps(&self) -> f32 {
match (self.measurements.front(), self.measurements.back()) {
(Some(start), Some(end)) => {
let period = (end.time - start.time).as_secs_f32();
if period > 0.0 {
(end.frame_count - start.frame_count) as f32 / period
} else {
0.0
}
}
_ => 0.0,
}
}
pub fn update(&mut self) {
self.frame_count += 1;
let current_measurement = self.measure();
let last_measurement = self
.measurements
.back()
.copied()
.unwrap_or(FrameCountMeasurement {
time: self.creation_time,
frame_count: 0,
});
if (current_measurement.time - last_measurement.time) >= self.sample_period {
self.measurements.push_back(current_measurement);
while self.measurements.len() > self.max_samples {
self.measurements.pop_front();
}
}
}
fn measure(&self) -> FrameCountMeasurement {
FrameCountMeasurement {
time: Instant::now(),
frame_count: self.frame_count,
}
}
}
How to use:
Create the counter:
let mut fps_counter = FpsCounter::new(Duration::from_millis(100), 5);
Call fps_counter.update() on every frame drawn.
Call fps_counter.fps() whenever you like to display current FPS.
Now, the key is in parameters to FpsCounter::new() method: sample_period is how responsive fps() is to changes in framerate, and samples controls how quickly fps() ramps up or down to the actual framerate. So if you choose 10 ms and 100 samples, fps() would react almost instantly to any change in framerate - basically, FPS value on the screen would jitter like crazy, but since it's 100 samples, it would take 1 second to match the actual framerate.
So my choice of 100 ms and 5 samples means that displayed FPS counter doesn't make your eyes bleed by changing crazy fast, and it would match your actual framerate half a second after it changes, which is sensible enough for a game.
Since sample_period * samples is averaging time span, you don't want it to be too short if you want a reasonably accurate FPS counter.
store a start time and increment your framecounter once per loop? every few seconds you could just print framecount/(Now - starttime) and then reinitialize them.
edit: oops. double-ninja'ed

Resources