How to simulate JME3 bullet physics as fast as possible - performance

I am working on a car simulation using Bullet physics and I want to be able to speed up the simulation, ideally running the physics as fast as possible.
I tried calling pSpace.update(1/60, 1) (which directly calls DynamicsWorld.stepSimulation), then listening for physicsTick and calling it again (so nothing waits on anything). Unfortunately, it looks like the thread does not wait for Bullet to finish its work, and objects then fall through the surface (once I got past the StackOverflowError).
I would probably need some mechanism that Bullet calls when the computation is done, so that I can call update again.
Or does Bullet have its own clock that cannot be sped up, and I am completely wrong? My understanding is that each step is a single computation of forces over a given time interval.
I know that JME3 can speed up Bullet by calling stepSimulation(speed * tpf, 4), but that speeds the simulation up by at most 4x, since it takes at most 4 steps in a row. Is this the way?
Thank you very much for any hints.
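To make the "single computation over a given time" picture concrete, here is a minimal Python sketch of a fixed-step integrator driven from a plain loop. It is not JME3 or Bullet code: step, run_fast, and the falling-body model are hypothetical stand-ins, and the sketch assumes only that each step call is synchronous (it returns when the step is done).

```python
# Toy fixed-step simulation; NOT the JME3/Bullet API (all names are hypothetical).
GRAVITY = -9.81  # m/s^2

def step(state, dt):
    """Advance (height, velocity) by one fixed step using semi-implicit Euler."""
    height, velocity = state
    velocity += GRAVITY * dt
    height += velocity * dt
    return height, velocity

def run_fast(state, dt, n_steps):
    """Step in a plain loop: no recursion from a tick callback, no sleeping."""
    for _ in range(n_steps):
        state = step(state, dt)
    return state

# 600 steps of 1/60 s cover 10 simulated seconds, however fast the loop runs.
final_height, final_velocity = run_fast((1000.0, 0.0), 1.0 / 60.0, 600)
```

In this toy model the simulated time depends only on the number of steps and the step size, and the loop form avoids the unbounded recursion that re-triggering a step from inside its own callback can cause.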


Modelica events and hybrid modelling

I would like to understand the general idea behind hybrid modelling (in particular state events) from a numerical point of view (although I am not a mathematician :)). Given the following Modelica model:
model BouncingBall
constant Real g=9.81;
Real h(start=1);
Real v(start=0);
equation
der(h) = v;
der(v) = -g;
when h < 0 then
reinit(v, -pre(v));
end when;
end BouncingBall;
I understand the concept of when and reinit.
The equations in the when statement are only active when the condition becomes true, right?
Let's assume that the ball would hit the floor at exactly 2 sec. Since I am using a multi-step solver, does that mean that the solver "goes beyond 2 seconds" and only then recognizes that h<0 (let's assume at simulation time = 2.5 sec, h = -0.7)? What does "The time for the event is searched using a crossing function" mean? Is there a simple explanation (example)?
Is the solver now going back? Taking a smaller step size?
What does the pre() operation mean in that context?
noEvent(): "Expressions are taken literally instead of generating crossing functions. Since there is no crossing function, there is no requirement that the expression can be evaluated beyond the event limit": What does that mean? Given the same example with the bouncing ball: the solver detects at time 2.5 that h = -0.7. What's the difference between with and without noEvent()?
Yes, the body of when is only executed at events.
Simple view: The solver takes steps, and then uses a continuous extension to generate a (smooth) interpolation formula for the previous step. That interpolation formula can be used to generate a plot, and also for finding the first point where h has crossed zero (likely 2.000000001). An event iteration is then done at that interpolated point - and afterwards the solver is restarted.
I wouldn't say that the solver goes back. It takes a partial step and then continues forward. Some solvers need to reduce the step-size a lot after the event - others don't.
pre(x) is set to the value of x before the event.
noEvent(h<0) basically means: evaluate the expression as written, without all the bells and whistles of crossing functions. You cannot use when noEvent(h<0) then.
There are many additional points:
If you are familiar with Sturm-sequences or control theory you might realize that it is not necessary to interpolate a formula to determine if it crossed zero or not in an interval (and some tools use that). The fact that the function is not necessarily smooth makes it a bit more complicated, and also means that derivative-tests cannot be used.
How much the solver is reset depends on the kind of solver. One-step solvers (Runge-Kutta) can be restarted directly as if virtually nothing happened, whereas multi-step solvers (BDF/Adams - such as dassl/lsodar/cvode) need to start with lower order and smaller step-size.
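The step, locate, reinitialize, restart sequence described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular Modelica tool works: free_fall, locate_event, and simulate are hypothetical names, the exact free-fall formula stands in for the solver's interpolation of the last step, and bisection stands in for whatever root finder a real tool uses on the crossing function.

```python
# Toy illustration of state-event handling; not how any particular Modelica tool works.
G = 9.81  # gravitational acceleration, m/s^2

def free_fall(h, v, dt):
    """Exact free-fall update over dt (stands in for the solver's continuous extension)."""
    return h + v * dt - 0.5 * G * dt * dt, v - G * dt

def locate_event(h, v, dt, tol=1e-12):
    """Bisect on the crossing function h(t) to find when it first reaches zero in [0, dt]."""
    lo, hi = 0.0, dt
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if free_fall(h, v, mid)[0] > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

def simulate(h, v, t_end, dt=0.1):
    """Return the list of event times found while integrating up to t_end."""
    t, events = 0.0, []
    while t < t_end - 1e-12:
        h_new, v_new = free_fall(h, v, dt)
        if h_new < 0.0:                       # crossing function changed sign in this step
            tau = locate_event(h, v, dt)      # partial step up to the event
            _, pre_v = free_fall(h, v, tau)   # pre(v): velocity just before the event
            h, v = 0.0, -pre_v                # reinit(v, -pre(v)), then restart from the event
            t += tau
            events.append(t)
        else:
            h, v, t = h_new, v_new, t + dt
    return events

events = simulate(h=1.0, v=0.0, t_end=1.0)
```

With a step of 0.1 the sign change is first seen in the step that ends at t = 0.5, and the bisection then places the event near t = 0.4515 instead of accepting the overshoot. A literal, noEvent-style evaluation would see the condition only at the step boundary, where h is already negative.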

Commence key press collection contingent on sound duration

I am implementing an experiment in PsychoPy: a same-different discrimination task comparing two sounds of variable duration (sound_1, sound_2), played in succession with an interval of 0.5 s in between. I have managed to start sound_1 at 0.0 and sound_2 0.5 s after the end of sound_1 using "$sound_1.getDuration() + 0.5". However, I want to collect a key press response with the RT measured from the end of sound_2 onward. I tried the start time "$sound_1.getDuration() + 0.5 + sound_2.getDuration()", but the key press is already functional during the presentation of sound_2, and the RTs appear too long compared with the RTs usually observed for this kind of task. Does anyone know how to obtain the accurate onset for measuring RTs here?
Btw, my question is similar to, but not fully answered by, the following thread:
variable stimuli duration but two kinds of fixed ISI in PsychoPy

GPGPU computation with MATLAB does not scale properly

I've been experimenting with the GPU support of Matlab (v2014a). The notebook I'm using to test my code has an NVIDIA 840M built in.
Since I'm new to GPU computing with Matlab, I started out with a few simple examples and observed strange scaling behavior. Instead of increasing the size of my problem, I simply put a for loop around my computation. I expected the computation time to scale with the number of iterations, since the problem size itself does not increase. This was true for smaller numbers of iterations; however, at a certain point the time does not scale as expected, and instead I observe a huge increase in computation time. Afterwards, the problem continues to scale as expected.
The code example started from a random walk simulation, but I tried to produce an example that is simple and still shows the problem.
Here's what my code does. I initialize two matrices as sin(alpha) and cos(alpha). Then I loop over the number of iterations from 2**1 to 2**15. Inside the loop I repeat the computation sin(alpha)^2 and cos(alpha)^2 and add them up (this was just to check the result). I perform this calculation as often as the number of iterations suggests.
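function gpu_scale
close all
NP = 10000;
NT = 1000;
PMAX = 15; % iteration counts run from 2^1 to 2^15
ALPHA = rand(NP,NT,'single')*2*pi;
SINALPHA = sin(ALPHA);
COSALPHA = cos(ALPHA);
GSINALPHA = gpuArray(SINALPHA); % move array to gpu
GCOSALPHA = gpuArray(COSALPHA); % move array to gpu
for P = 1:PMAX
for i=1:2^P
GX = GSINALPHA.^2 + GCOSALPHA.^2; % should equal 1 everywhere
end
end
end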
The following plot, shows the computation time in a log-log plot for the case that I always double the number of iterations. The jump occurs when doubling from 1024 to 2048 iterations.
The initial bump for two iterations might be due to initialization and is not really relevant anyhow.
I see no reason for the jump between 2**10 and 2**11 computations, since the computation time should only depend on the number of iterations.
My question: Can somebody explain this behavior to me? What is happening on the software/hardware side, that explains this jump?
Thanks in advance!
EDIT: As suggested by Divakar, I changed the way I time my code. I wasn't sure I was using gputimeit correctly. However, MathWorks suggests another possible way, namely
gd = gpuDevice();
tic;
% the computation
wait(gd);
Time = toc;
Measured this way, the time is significantly longer; however, I no longer observe the jump from the previous plot. I added the CPU performance for comparison and kept both timings for the GPU (wait / no wait), which can be seen in the following plot.
It seems that the observed jump "corrects" the timing toward the case where I used wait. If I understand the problem correctly, the good performance in the no-wait case is due to the fact that we do not wait for the GPU to finish completely. However, I still don't see an explanation for the jump.
Any ideas?

How to benchmark Matlab processes?

While searching for a way to avoid using a loop in my Matlab code, I found the following comments under a question on SE:
The statement "for loops are slow in Matlab" is no longer generally true since Matlab...euhm, R2008a?
Have you tried to benchmark a for loop vs what you already have? sometimes it is faster than vectorized code...
So I would like to ask: is there a commonly used way to test the speed of a process in Matlab? Can the user see somewhere how much time a process takes, or is the only way to stretch the processes to several minutes in order to compare their times with each other?
The best tool for testing the performance of MATLAB code is Steve Eddins' timeit function, available from the MATLAB Central File Exchange.
It handles many subtle issues related to benchmarking MATLAB code for you, such as:
ensuring that JIT compilation is used by wrapping the benchmarked code in a function
warming up the code
running the code several times and averaging
Update: As of release R2013b, timeit is part of core MATLAB.
Update: As of release R2016a, MATLAB also includes a performance testing framework that handles the above issues for you in a similar way to timeit.
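The warm-up and repeat steps listed above are not specific to MATLAB. As a rough illustration, here is a minimal Python sketch of the same idea; benchmark, loop_sum, and builtin_sum are hypothetical names, this is not how timeit is implemented, and the sketch summarizes with the median where the list above says averaging.

```python
import statistics
import time

def benchmark(func, repeats=5, warmup=1):
    """Return the median wall-clock time of func() over several runs.

    Hypothetical helper: warm-up runs are discarded, then `repeats` timed runs
    are summarized with the median, which is less sensitive to outliers than the mean.
    """
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def loop_sum(n=100_000):
    """Explicit loop version of the computation being compared."""
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n=100_000):
    """Built-in (vectorized-style) version of the same computation."""
    return sum(range(n))

loop_time = benchmark(loop_sum)
builtin_time = benchmark(builtin_sum)
```

Comparing loop_time with builtin_time on your own machine is the kind of loop-versus-vectorized check the quoted comments suggest; the numbers depend on the machine and the interpreter version.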
You can use the profiler to assess how much time your functions, and the blocks of code within them, are taking.
>> profile on; % Starts the profiler
>> myfunctiontorun( ); % This can be a function, script or block of code
>> profile viewer; % Opens the viewer showing you how much time everything took
profile viewer also stops the profiler; the recorded data is cleared the next time you call profile on.
Bear in mind, profile does tend to slow execution a bit, but I believe it does so in a uniform way across everything.
Obviously if your function is very quick, you might find you don't get reliable results so if you can run it many times or extend the computation that would improve matters.
If it's really simple stuff you're testing, you can also just time it using tic and toc:
>> tic; % Start the timer
>> myfunctionname( );
>> toc; % End the timer and display elapsed time
Also if you want multiple timers, you can assign them to variables:
>> mytimer = tic;
>> myfunctionname( );
>> toc(mytimer);
Finally, if you want to store the elapsed time instead of display it:
>> myresult = toc;
I think that I am right to state that many of us time Matlab by wrapping the block of code we're interested in between tic and toc. Furthermore, we take care to ensure that the total time is of the order of 10s of seconds (rather than 1s of seconds or 100s of seconds) and repeat it 3 - 5 times and take some measure of central tendency (such as the mean) and draw our conclusions from that.
If the piece of code takes less than, say 10s, then repeat it as many times as necessary to bring it into the range, being careful to avoid any impact of one iteration on the next. And if the code naturally takes 100s of seconds or longer, either spend longer on the testing or try it with artificially small input data to run more quickly.
In my experience it's not necessary to run programs for minutes to get data on average run time with acceptably low variance. If I run a program 5 times and one (or two) of the results is wildly different from the mean I'll re-run it.
Of course, if the code has any features which make its run time non-deterministic then it's a different matter.

How to use TDD correctly to implement a numerical method?

I am trying to use Test Driven Development to implement my signal processing library. But I have a little doubt: Assume I am trying to implement a sine method (I'm not):
Write the test (pseudo-code)
assertEqual(0, sine(0))
Write the first implementation
function sine(radians)
return 0
Second test
assertEqual(1, sine(pi/2))
At this point, should I:
implement smart code that will work for pi/2 and other values, or
implement the dumbest code that will work only for 0 and pi/2?
If I choose the second option, when can I jump to the first option? I will have to do it eventually...
At this point, should I:
implement real code that will work outside the two simple tests?
implement more of the dumbest code that will work only for the two simple tests?
Neither. I'm not sure where you got the "write just one test at a time" approach from, but it sure is a slow way to go.
The point is to write clear tests and use that clear testing to design your program.
So, write enough tests to actually validate a sine function. Two tests are clearly inadequate.
In the case of a continuous function, you have to provide a table of known good values eventually. Why wait?
However, testing continuous functions has some problems. You can't follow a dumb TDD procedure.
You can't test all floating-point values between 0 and 2*pi. You can't test a few random values.
In the case of continuous functions, a "strict, unthinking TDD" doesn't work. The issue here is that you know your sine function implementation will be based on a bunch of symmetries. You have to test based on those symmetry rules you're using. Bugs hide in cracks and corners. Edge cases and corner cases are part of the implementation and if you unthinkingly follow TDD you can't test that.
However, for continuous functions, you must test the edge and corner cases of the implementation.
This doesn't mean TDD is broken or inadequate. It says that slavish devotion to "test first" can't work without some thinking about what your real goal is.
In the strict, baby-step kind of TDD, you might implement the dumb method to get back to green, and then refactor the duplication inherent in the dumb code (testing for the input value is a kind of duplication between the test and the code) by producing a real algorithm. The hard part about getting a feel for TDD with such an algorithm is that your acceptance tests are really sitting right next to you (the table S. Lott suggests), so you kind of keep an eye on them the whole time. In more typical TDD, the unit is divorced enough from the whole that the acceptance tests can't just be plugged in right there, so you don't start thinking about testing for all scenarios, because all scenarios are not obvious.
Typically, you might have a real algorithm after one or two cases. The important thing about TDD is that it is driving design, not the algorithm. Once you have enough cases to satisfy the design needs, the value in TDD drops significantly. Then the tests turn more into covering corner cases to ensure your algorithm is correct in all the aspects you can think of. So, if you are confident in how to build the algorithm, go for it. The kinds of baby steps you are talking about are only appropriate when you are uncertain. By taking such baby steps you start to build out the boundaries of what your code has to cover, even though your implementation isn't actually real yet. But as I said, that is more for when you are uncertain about how to build the algorithm.
Write tests that verify Identities.
For the sin(x) example, think about double-angle formula and half-angle formula.
Open a signal-processing textbook. Find the relevant chapters and implement every single one of those theorems/corollaries as test code applicable to your function. For most signal-processing functions there are identities that must be upheld between the inputs and the outputs. Write tests that verify those identities, regardless of what those inputs might be.
Then think about the inputs.
Divide the implementation process into separate stages. Each stage should have a Goal. The tests for each stage would be to verify that Goal. (Note 1)
The goal of the first stage is to be "roughly correct". For the sin(x) example, this would be like a naive implementation using binary search and some mathematical identities.
The goal of the second stage is to be "accurate enough". You will try different ways of computing the same function and see which one gets better result.
The goal of the third stage is to be "efficient".
(Note 1) Make it work, make it correct, make it fast, make it cheap. - attributed to Alan Kay
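As an illustration of identity-based tests, here is a minimal Python sketch. The names sine, cosine, and check_identities are hypothetical; sine simply delegates to math.sin as a stand-in for the implementation under test, and the tolerance and the input grid are arbitrary choices.

```python
import math

def sine(x):
    """Stand-in for the function under test; here it simply delegates to math.sin."""
    return math.sin(x)

def cosine(x):
    """Stand-in companion function, delegating to math.cos."""
    return math.cos(x)

def check_identities(xs, tol=1e-12):
    """Return the inputs for which any of the checked identities fails."""
    failures = []
    for x in xs:
        ok = (
            math.isclose(sine(x) ** 2 + cosine(x) ** 2, 1.0, abs_tol=tol)        # Pythagorean
            and math.isclose(sine(2 * x), 2 * sine(x) * cosine(x), abs_tol=tol)  # double angle
            and math.isclose(sine(-x), -sine(x), abs_tol=tol)                    # odd symmetry
            and math.isclose(sine(x + 2 * math.pi), sine(x), abs_tol=tol)        # periodicity
        )
        if not ok:
            failures.append(x)
    return failures

inputs = [k * 0.1 for k in range(-100, 101)]
failures = check_identities(inputs)
```

Because the checks hold for any input, the same function can be run over a coarse grid in the "roughly correct" stage and over a denser grid with a tighter tolerance in the "accurate enough" stage.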
I believe the point at which you jump to the first option is when you see there are too many "ifs" in your code "just to pass the tests". That wouldn't be the case yet with just the two test values.
You'll feel the code is beginning to smell, and will be willing to refactor it asap. I'm not sure if that's what pure TDD says, but IMHO you do it in the refactor phase (test fail, test pass, refactor cycle). I mean, unless your failing tests ask for a different implementation.
Note that (in NUnit) you can also do
Assert.That(2.1 + 1.2, Is.EqualTo(3.3).Within(0.0005));
when you're dealing with floating-point equality.
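The same tolerance-based comparison can be sketched in Python with math.isclose from the standard library. The values 0.1, 0.2, and 0.3 are chosen here only because their exact comparison is a well-known failure; the tolerance is the one from the NUnit line above.

```python
import math

total = 0.1 + 0.2

# Exact comparison of binary floating-point values: False for this pair.
exactly_equal = (total == 0.3)

# Comparison within an absolute tolerance: True.
within_tolerance = math.isclose(total, 0.3, abs_tol=0.0005)
```

An absolute tolerance suits values of known magnitude; math.isclose also accepts rel_tol for relative comparisons.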
One piece of advice I remember reading was to try to refactor out the magic numbers from your implementations.
You should code up all your unit tests in one hit (in my opinion). While the idea of only creating tests specifically covering what has to be tested is correct, your particular specification calls for a functioning sine() function, not a sine() function that works only for the two test inputs.
Find a source you trust enough (a mathematician friend, tables at the back of a math book or another program that already has the sine function implemented).
I opted for bash/bc because I'm too lazy to type it all in by hand :-). If it were a sine() function, I'd just run the following program and paste its output into the test code. I'd also put a copy of this script in there as a comment so I can re-use it if something changes (such as the desired resolution, 20 degrees in this case, or the value of PI you want to use).
d=0
while [[ ${d} -le 400 ]] ; do
r=$(echo "3.141592653589 * ${d} / 180" | bc -l)
s=$(echo "s(${r})" | bc -l)
echo "assertNear(${s},sine(${r})); // ${d} deg."
d=$(expr ${d} + 20)
done
This outputs:
assertNear(0,sine(0)); // 0 deg.
assertNear(.34202014332558591077,sine(.34906585039877777777)); // 20 deg.
assertNear(.64278760968640429167,sine(.69813170079755555555)); // 40 deg.
assertNear(.86602540378430644035,sine(1.04719755119633333333)); // 60 deg.
assertNear(.98480775301214683962,sine(1.39626340159511111111)); // 80 deg.
assertNear(.98480775301228458404,sine(1.74532925199388888888)); // 100 deg.
assertNear(.86602540378470305958,sine(2.09439510239266666666)); // 120 deg.
assertNear(.64278760968701194759,sine(2.44346095279144444444)); // 140 deg.
assertNear(.34202014332633131111,sine(2.79252680319022222222)); // 160 deg.
assertNear(.00000000000079323846,sine(3.14159265358900000000)); // 180 deg.
assertNear(-.34202014332484051044,sine(3.49065850398777777777)); // 200 deg.
assertNear(-.64278760968579663575,sine(3.83972435438655555555)); // 220 deg.
assertNear(-.86602540378390982112,sine(4.18879020478533333333)); // 240 deg.
assertNear(-.98480775301200909521,sine(4.53785605518411111111)); // 260 deg.
assertNear(-.98480775301242232845,sine(4.88692190558288888888)); // 280 deg.
assertNear(-.86602540378509967881,sine(5.23598775598166666666)); // 300 deg.
assertNear(-.64278760968761960351,sine(5.58505360638044444444)); // 320 deg.
assertNear(-.34202014332707671144,sine(5.93411945677922222222)); // 340 deg.
assertNear(-.00000000000158647692,sine(6.28318530717800000000)); // 360 deg.
assertNear(.34202014332409511011,sine(6.63225115757677777777)); // 380 deg.
assertNear(.64278760968518897983,sine(6.98131700797555555555)); // 400 deg.
Obviously you will need to map this answer to what your real function is meant to do. My point is that the tests should fully validate the behavior of the code in this iteration. If this iteration was to produce a sine() function that only works for the two test inputs, then that's fine. But that would be a serious waste of an iteration in my opinion.
It may be that your function is so complex that it must be done over several iterations. Then your approach two is correct and the tests should be updated in the next iteration where you add the extra functionality. Otherwise, find a way to add all the tests for this iteration quickly, then you won't have to worry about switching between real code and test code frequently.
Strictly following TDD, you can first implement the dumbest code that will work. In order to jump to the first option (to implement the real code), add more tests:
assertEqual(tan(x), sin(x)/cos(x))
If you implement more than what is absolutely required by your tests, then your tests will not completely cover your implementation. For example, if you implemented the whole sin() function with just the two tests above, you could accidentally "break" it by returning a triangle function (that almost looks like a sine function) and your tests would not be able to detect the error.
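The triangle-function risk is easy to demonstrate. In the Python sketch below, triangle is a hypothetical impostor implementation, and 0 and pi/2 are used as the two spot-check inputs (an assumption of this sketch).

```python
import math

def triangle(x):
    """Piecewise-linear wave with period 2*pi that matches sine at multiples of pi/2."""
    phase = (x / (2 * math.pi)) % 1.0      # position within one period, in [0, 1)
    if phase < 0.25:
        return 4.0 * phase
    if phase < 0.75:
        return 2.0 - 4.0 * phase
    return 4.0 * phase - 4.0

# Two spot checks (0 and pi/2 are assumed here as test inputs): both pass.
passes_spot_checks = (
    math.isclose(triangle(0.0), 0.0, abs_tol=1e-12)
    and math.isclose(triangle(math.pi / 2), 1.0, abs_tol=1e-12)
)

# A third point exposes the impostor: sin(pi/6) is 0.5, the triangle gives 1/3.
error_at_30_degrees = abs(triangle(math.pi / 6) - math.sin(math.pi / 6))
```

Any test at an input between the spot-check points, or any identity check, would catch the difference, which is why the extra tests matter before the real implementation is written.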
The other thing you will have to worry about for numeric functions is the notion of "equality" and having to deal with the inherent loss of precision in floating point calculations. That's what I thought your question was going to be about after reading just the title. :)
I don't know what language you are using, but when I am dealing with a numeric method, I typically write a simple test like yours first to make sure the outline is correct, and then I feed more values to cover cases where I suspect things might go wrong. In .NET, NUnit 2.5 has a nice feature for this, called [TestCase], where you can feed multiple input values to the same test like this:
[TestCase(1, 2, Result = 3)]
public int CheckAddition(int a, int b) {
return a + b;
}
Short answer.
Write one test at a time.
Once it fails, get back to green first. If that means doing the simplest thing that can work, do it. (Option 2)
Once you're in the green, you can look at the code and choose to clean up (Option 1). Or you can say that the code still doesn't smell that much and write subsequent tests that put the spotlight on the smells.
Another question you seem to have, is how many tests should you write. You need to test till fear (the function may not work) turns into boredom. So once you've tested for all the interesting input-output combinations, you're done.
