Appropriate use of FPS count code in Unity and Games in general? - performance

So I've been working on a project in Unity, and after writing this very simple code, I started wondering about how computers with faster and slower frame rates would generate slightly different speeds.
The code:
for (int i = 0; i < distance; i++) transform.Translate(speed * Time.deltaTime, 0, 0);
Mostly, I was wondering whether the object would not move as far on slower computers, and whether, if I took out the deltaTime multiplication, the object would appear to move more slowly on a slower computer than on one with a higher FPS count.
If so, how would I solve this problem, if it's a problem at all?

The whole point of multiplying by Time.deltaTime in the equation is to remove this concern. Remember that Time.deltaTime is not the FPS your machine can produce; it is the time elapsed since the previous frame, and multiplying by it lets you make animations, movement, and in general anything time-dependent behave independently of the frame rate.

If you use the Update() function (of a MonoBehaviour object), then you shouldn't need to worry about the object not traveling as far.
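A minimal sketch of the idea, assuming a plain MonoBehaviour (the MoveRight name and speed field are illustrative, not from the original project):

using UnityEngine;

public class MoveRight : MonoBehaviour
{
    public float speed = 5f;   // units per second, not units per frame

    void Update()
    {
        // Time.deltaTime is the time in seconds since the last frame, so
        // speed * Time.deltaTime covers the same distance per second whether
        // the machine renders 30 or 300 frames in that second.
        transform.Translate(speed * Time.deltaTime, 0f, 0f);
    }
}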

Related

Pharo: How to increase MouseMoveEvent Frequency?

In the Pharo book there is an example for a Paint Canvas.
The problem is that the frequency at which mouse move events are passed to the handler is rather low, so you cannot draw continuous paths if you move the mouse too quickly.
Is there some way to increase the update frequency for a morph? In Squeak, there is a SketchMorphEditor which does not have that problem, but I have not figured out why yet.
I am using Pharo 5.0.
As far as I know there is no way to increase the sampling rate. Even if it could be done, it would be a very bad idea for several reasons.
First, linear interpolation already yields fairly good results (and can be improved with techniques like anti-aliasing, if necessary).
Second, we cannot rely on the sampling rate to be the same on every machine and to have consistent results. And third, since I plan to use a gesture recognizer, algorithms like the $1 Recognizer do not rely on sampling rates and work surprisingly well.
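To make the interpolation point concrete, here is a rough sketch of filling the gap between two consecutive mouse samples (shown in C# rather than Smalltalk purely to keep the examples in one language; Point2 and the drawDot callback are placeholders, not part of Pharo or the Paint Canvas example):

using System;

struct Point2
{
    public float X, Y;
    public Point2(float x, float y) { X = x; Y = y; }
}

static class StrokeInterpolation
{
    // drawDot stands in for whatever actually plots one brush dab on the canvas.
    public static void DrawSegment(Point2 from, Point2 to, Action<float, float> drawDot)
    {
        float dx = to.X - from.X;
        float dy = to.Y - from.Y;
        int steps = Math.Max(1, (int)Math.Max(Math.Abs(dx), Math.Abs(dy)));

        for (int i = 0; i <= steps; i++)
        {
            float t = (float)i / steps;
            drawDot(from.X + dx * t, from.Y + dy * t);
        }
    }
}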

PID Controller that works on percentages instead of absolute units?

I am working on a PID Control software simulator for teaching PID Control concepts interactively.
I am working on an example for a velocity controller. I have the example working, but I really want the controller's output to my process to be a percentage, rather than what I currently get, which is a fixed value that the process should increase by.
Right now I have to interpolate the output increase using the maximum acceleration for a sample step and then scale the output to a percentage. The problem is that the rate of acceleration is non-linear, depending on the speed and the current gearing of the drive train.
This works, but it isn't very flexible or adaptable. For instance, it makes everything accelerate at maximum until it gets near the setpoint velocity, and then it either overshoots and oscillates for a few periods or takes an equally long time to cover that last little bit without overshooting.
Sometimes you will want this maximum acceleration behavior, sometimes you will want to manage the battery/fuel source and accelerate at maximum efficiency; sometimes you want a bit of both.
Scaling the output like I am doing now is brute force and not very subtle. I would rather inject an output modifier into the calculation of the output by dynamically tuning the P, I, and D gains, but I am not sure which ones to focus on and in what order.
When I tune them manually, one at a time, I can get really good results, but when I try to tune them automatically everything goes crazy.
I have spent the last week reading about control theory and auto-tuning, and the math notation just gets too cryptic for me. I can follow the math if I can find some implementation in code, regardless of language.
I have tried applying Ziegler-Nichols (Z-N) heuristics, but I still get wild swings, and it is really hard to compensate for overshoot. It is hard to tolerate much overshoot when you can only decelerate at a fraction of the rate at which you can accelerate; imagine a system with no active braking that relies only on passive drag to slow down.
What is a good approach to injecting dynamic gain tuning for velocity control?
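For reference, a minimal sketch of the kind of controller being described, with the output clamped to a percentage of maximum drive and simple anti-windup; the class and gain names are illustrative and this is not the simulator's actual code:

using System;

class PercentPid
{
    public double Kp, Ki, Kd;          // gains, tuned elsewhere
    double integral, previousError;

    // Returns a throttle command in the range -100..+100 (% of maximum drive).
    public double Step(double setpoint, double measured, double dt)
    {
        double error = setpoint - measured;
        integral += error * dt;
        double derivative = (error - previousError) / dt;
        previousError = error;

        double output = Kp * error + Ki * integral + Kd * derivative;

        // Clamp to a percentage, and back the integral off while saturated
        // so it does not wind up and cause large overshoot later.
        double clamped = Math.Max(-100.0, Math.Min(100.0, output));
        if (clamped != output)
            integral -= error * dt;
        return clamped;
    }
}

Dynamic gain scheduling could then be layered on top, for example by recomputing Kp/Ki/Kd from the current speed and gear before each call to Step.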

Performance problem with rendering 1000 cubes in XNA 4.0 [duplicate]

I'm aware that the following is a vague question, but I'm hitting performance problems that I did not anticipate in XNA.
I have a low poly model (It has 18 faces and 14 vertices) that I'm trying to draw to the screen a (high!) number of times. I get over 60 FPS (on a decent machine) until I draw this model 5000+ times. Am I asking too much here? I'd very much like to double or triple that number (10-15k) at least.
My code for actually drawing the models is given below. I have tried to eliminate as much computation from the draw cycle as possible; is there more I can squeeze from it, or are there better alternatives altogether?
Note: tile.Offset is computed once during initialisation, not every cycle.
foreach (var tile in Tiles)
{
    var myModel = tile.Model;
    Matrix[] transforms = new Matrix[myModel.Bones.Count];
    myModel.CopyAbsoluteBoneTransformsTo(transforms);
    foreach (ModelMesh mesh in myModel.Meshes)
    {
        foreach (BasicEffect effect in mesh.Effects)
        {
            // effect.EnableDefaultLighting();
            effect.World = transforms[mesh.ParentBone.Index]
                           * Matrix.CreateTranslation(tile.Offset);
            effect.View = CameraManager.ViewMatrix;
            effect.Projection = CameraManager.ProjectionMatrix;
        }
        mesh.Draw();
    }
}
You're quite clearly hitting the batch limit. See this presentation and this answer and this answer for details. Put simply: there is a limit to how many draw calls you can submit to the GPU each second.
The batch limit is a CPU-based limit, so you'll probably see that your CPU gets pegged once you get to your 5000+ models. Worse still, when your game is doing other calculations, it will reduce the CPU time available to submit those batches.
(And it's important to note that, conversely, you are almost certainly not hitting GPU limits. No need to worry about mesh complexity yet.)
There are a number of ways to reduce your batch count. Frustum culling is one. Probably the best one to pursue in your case is geometry instancing, which lets you draw multiple models in a single batch. Here is an XNA sample that does this.
Better still, if it's static geometry, can you simply bake it all into one or a few big meshes?
As with any performance problem, there are limits within which a particular approach works. You need to measure and see where the problems are. The best option is to use a profiler, but even basic measurements, like looking at CPU load, may show what bottlenecks you have.
As a first investigation step, I'd recommend removing all computations (like the matrix multiplications) and seeing whether you get improvements; if you do, it means the CPU is still doing more work than the GPU.
Make sure you are not taking measurements on a debug build; it can make the application significantly slower if it is CPU-bound.
Side note: the GPU works best when you send it large operations relatively infrequently. Your code does more or less the opposite: it sends a huge number of very small drawing requests. You should be able to batch your primitives and get better performance. There are samples showing how to render large numbers of simple objects (including some in the DirectX SDK); searching for "gpu rendering crowds" can give you a starting point.
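As a concrete example of pulling work out of the draw loop, here is a sketch that computes each mesh's world matrix once at initialisation instead of every frame (the CachedWorld field is an assumed addition to the tile class; this trims CPU work per draw call but does not reduce the batch count the way instancing or baking would):

// During initialisation: compute each mesh's world matrix once per tile.
foreach (var tile in Tiles)
{
    var model = tile.Model;
    var transforms = new Matrix[model.Bones.Count];
    model.CopyAbsoluteBoneTransformsTo(transforms);

    tile.CachedWorld = new Matrix[model.Meshes.Count];
    for (int i = 0; i < model.Meshes.Count; i++)
        tile.CachedWorld[i] = transforms[model.Meshes[i].ParentBone.Index]
                              * Matrix.CreateTranslation(tile.Offset);
}

// In Draw(): only assign the cached matrices; no per-frame allocation or multiplies.
foreach (var tile in Tiles)
{
    for (int i = 0; i < tile.Model.Meshes.Count; i++)
    {
        ModelMesh mesh = tile.Model.Meshes[i];
        foreach (BasicEffect effect in mesh.Effects)
        {
            effect.World = tile.CachedWorld[i];
            effect.View = CameraManager.ViewMatrix;
            effect.Projection = CameraManager.ProjectionMatrix;
        }
        mesh.Draw();
    }
}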

How to get volume from mic input on WP7 [duplicate]

Given two byte arrays of data captured from a microphone, how can I determine which one has more spikes in noise? I would assume there is an algorithm I can apply to the data, but I have no idea where to start.
Getting down to it, I need to be able to determine when a baby is crying vs ambient noise in the room.
If it helps, I am using the Microsoft.Xna.Framework.Audio.Microphone class to capture the sound.
You can convert each sample (normalised to the range -1.0 to 1.0) into a decibel rating by applying the formula
dB = 20 * log10(|sample value|)
To be honest, so long as you don't mind the occasional false positive, and your microphone is set up OK, you should have no problem telling the difference between a baby crying and ambient background noise, without going through the hassle of doing an FFT.
I'd recommend having a look at the source code for a noise gate, which does pretty much what you are after, with configurable attack times and thresholds.
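A minimal sketch of that per-sample conversion, assuming the samples are already normalised to the -1.0..1.0 range:

using System;

static class Decibels
{
    // 20 * log10(|sample|): 0 dB at full scale, negative infinity for silence.
    public static double FromSample(double sample)
    {
        double magnitude = Math.Abs(sample);
        return magnitude > 0.0 ? 20.0 * Math.Log10(magnitude) : double.NegativeInfinity;
    }
}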
First use a Fast Fourier Transform to transform the signal into the frequency domain.
Then check if the signal in the typical "cry-frequencies" is significantly higher than the other amplitudes.
The preprocessor of the speex codec supports noise vs signal detection, but I don't know if you can get it to work with XNA.
Or, if you really want some kind of loudness measure, calculate the sum of squares of the amplitudes of the frequencies you're interested in (for example 50-20000 Hz), and if the average of that over the last 30 seconds is significantly higher than the average over the last 10 minutes, or exceeds a certain absolute threshold, sound the alarm.
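As a rough sketch of the time-domain version of that idea, assuming the 16-bit PCM byte buffers the XNA Microphone class delivers (the thresholds are placeholders to be tuned against real recordings):

using System;

static class CryDetector
{
    // Mean-square energy of one 16-bit PCM buffer, normalised to 0..1.
    public static double BufferEnergy(byte[] buffer)
    {
        int samples = buffer.Length / 2;
        if (samples == 0) return 0.0;

        double sum = 0.0;
        for (int i = 0; i < samples; i++)
        {
            double s = BitConverter.ToInt16(buffer, i * 2) / 32768.0;
            sum += s * s;
        }
        return sum / samples;
    }

    // Compare a short-term average of BufferEnergy against a long-term (ambient) average.
    public static bool LooksLikeCrying(double shortTermAverage, double longTermAverage)
    {
        const double ratioThreshold = 4.0;       // placeholder
        const double absoluteThreshold = 0.01;   // placeholder
        return shortTermAverage > absoluteThreshold
            && shortTermAverage > ratioThreshold * longTermAverage;
    }
}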
Louder at what point? The signal's average amplitude will tell you which one is louder on average, but that is kind of a dumb, brute force way to go about it. It may work for you in practice though.
Getting down to it, I need to be able to determine when a baby is crying vs ambient noise in the room.
Ok, so, I'm just throwing out ideas here; I am by no means an expert on audio processing.
If you know your input, i.e., a baby crying (relatively loud with a high pitch) versus ambient noise (relatively quiet), you should be able to analyze the signal in terms of pitch (frequency) and amplitude (loudness). Of course, if someone drops some pots and pans onto the kitchen floor during the recording, that will be tough to discern.
As a first pass I would simply traverse the signal, maintaining a standard deviation of pitch and amplitude throughout, and then set a flag when those deviations jump beyond some threshold that you will have to define. When they come back down you may be able to safely assume that you captured the baby's cry.
Again, just throwing you an idea here. You will have to see how it works in practice with actual data.
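One way to sketch that running-deviation idea is Welford's online algorithm, with a flag when a new value strays too far from the running mean (the three-standard-deviation threshold is illustrative, not something tested against real recordings):

using System;

class SpikeDetector
{
    long count;
    double mean, m2;   // running mean and sum of squared deviations (Welford)

    // Returns true when the new value deviates sharply from the statistics seen so far.
    public bool Update(double value, double thresholdInStdDevs = 3.0)
    {
        bool spike = false;
        if (count > 1)
        {
            double stdDev = Math.Sqrt(m2 / (count - 1));
            spike = Math.Abs(value - mean) > thresholdInStdDevs * stdDev;
        }

        // Welford's update of the running mean and variance.
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return spike;
    }
}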
I agree with @Ed Swangren; it will take a lot of playing with samples of data from a lot of sources. To me, it sounds like the trick will be to limit or, hopefully, eliminate false positives. My experience with babies is that they are much louder when crying than the environment, so keep track of the average measurements (freq/amp/??) of the normal environment and then classify how well any change matches the characteristics of a crying baby. Those characteristics vary from kid to kid, so you'll probably want a system that 'learns'. Best of luck.
update: you might find this library useful http://naudio.codeplex.com/

Efficiency/speed for trigonometric functions

In a game I'm making, I've got two points, pt1 and pt2, and I want to work out the angle between them. I've already worked out the distance, in an earlier calculation. The obvious way would be to arctan the horizontal distance over the vertical distance (tan(theta) = opp/adj).
I'm wondering though, as I've already calculated the distance, would it be quicker to use arcsine/arccosine with the distance and dx or dy?
Also, might I be better off pre-calculating in a table?
I suspect there's a risk of premature optimization here. Also, be careful about your geometry. Your opposite/adjacent approach is a property of right angle triangles, is that what you actually have?
I'm assuming your points are planar, so for the general case they implicitly represent two vectors from the origin (call these v1 and v2), and your angle is
theta = arccos(dot(v1, v2) / (|v1| |v2|)), where |.| is the vector length.
Making this faster (assuming the need) will depend on a lot of things. Do you know the vector lengths, or do you have to compute them? How fast can you do a dot product on your architecture? How fast is acos? At some point, tricks like table lookup (probably interpolated) might help, but that will cost you accuracy.
It's all trade-offs though, there really isn't a general answer to your question.
[edit: added commentary]
I'd like to re-emphasize that playing "x is fastest" is often a bit of a mug's game with modern CPUs and compilers anyway. You won't know until you measure it and grovel over the generated code. When you hit the point that you really care about this level of detail for a (hopefully small) piece of code, you can find out in detail what your system is doing, but it's painstaking. Maybe a table is good. But maybe you've got fast vector computations and a small cache. It all amounts to "it depends". Sorry 'bout that. On the other hand, if you haven't reached the point that you really care so much about this bit of code, you probably shouldn't be thinking about it at this level at all. Make it right. Make it clean (which means abstraction as well as code). Then worry about the overhead.
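For reference, a minimal sketch of the two formulations discussed in this thread; note that they answer slightly different questions (the direction from one point to the other versus the angle between the two position vectors), and the names are illustrative:

using System;

static class Angles
{
    // Direction from pt1 to pt2, in radians, over the full -pi..pi range.
    public static double ViaAtan2(double x1, double y1, double x2, double y2)
    {
        return Math.Atan2(y2 - y1, x2 - x1);
    }

    // Angle between the position vectors v1 = pt1 and v2 = pt2 (0..pi),
    // using theta = arccos(dot(v1, v2) / (|v1| |v2|)).
    public static double ViaAcos(double x1, double y1, double x2, double y2)
    {
        double dot = x1 * x2 + y1 * y2;
        double lengths = Math.Sqrt(x1 * x1 + y1 * y1) * Math.Sqrt(x2 * x2 + y2 * y2);
        return Math.Acos(dot / lengths);
    }
}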
Aside from all of the wise comments regarding premature optimization, let's just assume this is the hotspot and do a frigg'n benchmark:
Times are in nanoseconds, scaled to normalize 'acos' between the systems.
'acos' simply assumes unit radius i.e. acos(adj), whereas 'acos+div' means acos(adj/hyp).
System 1 is a 2.4GHz i5 running Mac OS X 10.6.4 (gcc 4.2.1)
System 2 is a 2.83GHz Core2 Quad running Red Hat 7 Linux 2.6.28 (gcc 4.1.2)
System 3 is a 1.66GHz Atom N280 running Ubuntu 10.04 2.6.32 (gcc 4.4.3)
System 4 is a 2.40GHz Pentium 4 running Ubuntu 10.04 2.6.32 (gcc 4.4.3)
Summary: Relative performance is all over the map. Sometimes atan2 is faster, sometimes it's slower. Very strangely, on some systems doing acos with a division is faster than doing it without. Test on your own system :-/
If you're going to be doing this many times, pre-calculate in a table. Performance will be much better this way.
Tons of good answers here.
By the way, if you use Math.atan2, you get a full 2π of angles out of it.
I would just do it, then run it flat out. If you don't like the speed, and if samples show that you're actually in that code most of the time and not someplace else, try replacing it with a table lookup. If you don't need precision closer than 1 degree, you could use a pretty small table and interpolation.
Also, you may want to memoize the function. Why recompute something you already did recently?
Added: If you use a table, it only has to cover angles from 0-45 degrees (and it can be hard-coded). You can get everything else by symmetry.
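A sketch of the kind of table described above: it stores atan for ratios in [0, 1] (i.e. angles from 0 to 45 degrees), interpolates between entries, and recovers every other octant by symmetry. The table size is illustrative, and accuracy and speed have not been measured here:

using System;

static class AtanTable
{
    const int Size = 256;   // table resolution; raise for more accuracy
    static readonly double[] Table = BuildTable();

    static double[] BuildTable()
    {
        // atan(t) for t in [0, 1] covers angles from 0 to 45 degrees.
        var table = new double[Size + 1];
        for (int i = 0; i <= Size; i++)
            table[i] = Math.Atan((double)i / Size);
        return table;
    }

    // Table lookup with linear interpolation, for ratios in [0, 1].
    static double LookupAtan(double t)
    {
        double scaled = t * Size;
        int index = Math.Min((int)scaled, Size - 1);
        double frac = scaled - index;
        return Table[index] + frac * (Table[index + 1] - Table[index]);
    }

    // Approximate Math.Atan2(dy, dx) using only the 0..45 degree table plus symmetry.
    public static double Atan2(double dy, double dx)
    {
        if (dx == 0.0 && dy == 0.0) return 0.0;

        double ax = Math.Abs(dx), ay = Math.Abs(dy);
        double angle = ay <= ax
            ? LookupAtan(ay / ax)                    // first octant: direct lookup
            : Math.PI / 2 - LookupAtan(ax / ay);     // reflect across 45 degrees

        if (dx < 0) angle = Math.PI - angle;         // left half-plane by symmetry
        if (dy < 0) angle = -angle;                  // lower half-plane by symmetry
        return angle;
    }
}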
From a pure speed standpoint, a precalculated table and a closest-match lookup would be best. It involves some overhead, of course, depending on how fine-grained you need the angle to be, but it's more than worth it if you're doing this calculation a lot (or in a tight loop), as those are going to be expensive calculations.
Get it right first!
Then profile and optimize. A table lookup is a good candidate for sure, but be sure to have your calculation right before doing anything fancy.
If you're interested in big-O notation, all the methods you might use are O(1).
If you're interested in what works fastest, test it. Write a wrapper function, one that calls your preferred method but can be easily changed, and test with that. Make sure that your application spends a noticeable amount of time doing this, so you aren't wasting your own time. Try whatever ways occur to you. Ideally, run it on more than one different CPU.
I've become very leery of predicting what will take more or less time on modern processors. Lookup tables used to be the answer if you needed speed, but you don't know a priori the effects on caching or how long it's going to take to normalize and look up versus how long it's going to take to do a trig function on a particular CPU.
Given that this is for a game, you probably care about speed. A lookup table is definitely the fastest, but you trade accuracy for speed with this method. So how accurate must you be to meet requirements? Only you can answer that. Before you trade accuracy, determine first whether you actually have a speed problem. All of the trigonometric functions are calculated using numerical methods (research numerical analysis to learn more). Some trig functions have more expensive methods than others because they rely on series that converge more slowly, and, who knows, your computer may have different implementations for these functions than another computer. At any rate, you can find out for yourself how expensive these functions are by writing some small programs that loop through as many iterations as you desire, with increments of your choosing, all the while timing the outcomes. Then you can pick the fastest method.
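A bare-bones version of that kind of timing loop is sketched below; real measurements need more iterations, warm-up, and care that the compiler does not optimise the work away, and the results will differ per machine:

using System;
using System.Diagnostics;

static class TrigTiming
{
    static void Main()
    {
        const int iterations = 10000000;
        double sink = 0.0;   // accumulate results so the calls cannot be optimised away

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sink += Math.Atan2(i % 100, 100.0);
        sw.Stop();
        Console.WriteLine("atan2: " + sw.ElapsedMilliseconds + " ms (checksum " + sink + ")");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sink += Math.Acos((i % 100) / 100.0);
        sw.Stop();
        Console.WriteLine("acos:  " + sw.ElapsedMilliseconds + " ms (checksum " + sink + ")");
    }
}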
While others are very right to mention that you are almost certainly falling into the pit of premature optimization, when they say that trigonometric functions are O(1) they're not telling the whole story.
Most trigonometric function implementations are actually O(N) in the magnitude of their argument. This is because the trig functions are most efficiently calculated on a small interval like [0, 2π) (or, for the best implementations, even smaller parts of this interval, but that one suffices to explain things). So the algorithm looks something like this, in pseudo-Python:
import math

TwoPi = 2 * math.pi

def Cosine_0to2Pi(x):
    # A series approximation of some kind, or CORDIC, or perhaps a table.
    # This function requires 0 <= x < TwoPi.
    ...

def MyCosine(x):
    if x < 0:
        x = -x              # cosine is an even function
    while x >= TwoPi:       # argument reduction: O(N) in the size of x
        x -= TwoPi
    return Cosine_0to2Pi(x)
Even microcoded CPU instructions like the x87's FSINCOS end up doing something like this internally. So trig functions, because they are periodic, usually take O(N) time to do the argument reduction. There are two caveats, however:
If you have to calculate a ton of values off the principal domain of the trig functions, your math is probably not very well thought out.
Big-O notation hides a constant factor. Argument reduction has a very small constant factor, because it's simple to do. Thus the O(1) part is going to dominate the O(N) part for just about every input.
