AI: selecting immediate acceleration/rotation to get to a final point - algorithm

I'm working on a game where on each update of the game loop, the AI is run. During this update, I have the chance to turn the AI-controlled entity and/or make it accelerate in the direction it is facing. I want it to reach a final location (within reasonable range) and, at that location, have a specific velocity and direction (again, it doesn't need to be exact). That is, given the current:
P0(x, y) = Current position vector
V0(x, y) = Current velocity vector (units/second)
θ0 = Current direction (radians)
τmax = Max turn speed (radians/second)
αmax = Max acceleration (units/second^2)
|V|max = Absolute max speed (units/second)
Pf(x, y) = Target position vector
Vf(x, y) = Target velocity vector (units/second)
θf = Target rotation (radians)
Select an immediate:
τ = A turn speed within [-τmax, τmax]
α = An acceleration scalar within [0, αmax] (it must accelerate in the direction it's currently facing)
Such that these are minimized:
t = Total time to move to the destination
|Pt-Pf| = Distance from target position at end
|Vt-Vf| = Deviation from target velocity at end
|θt-θf| = Deviation from target rotation at end (wrapped to (-π,π))
The parameters can be re-computed during each iteration of the game loop. A picture says 1000 words so for example given the current state as the blue dude, reach approximately the state of the red dude within as short a time as possible (arrows are velocity):
Pic http://public.blu.livefilestore.com/y1p6zWlGWeATDQCM80G6gaDaX43BUik0DbFukbwE9I4rMk8axYpKwVS5-43rbwG9aZQmttJXd68NDAtYpYL6ugQXg/words.gif
Assuming a constant α and τ for Δt (Δt → 0 for an ideal solution) and splitting position/velocity into components, this gives (I think, my math is probably off):
Equations http://public.blu.livefilestore.com/y1p6zWlGWeATDTF9DZsTdHiio4dAKGrvSzg904W9cOeaeLpAE3MJzGZFokcZ-ZY21d0RGQ7VTxHIS88uC8-iDAV7g/equations.gif
Vx = V0x + α·cos(θ0)·Δt
Vy = V0y + α·sin(θ0)·Δt
Px = P0x + V0x·Δt + ½·α·cos(θ0)·Δt²
Py = P0y + V0y·Δt + ½·α·sin(θ0)·Δt²
θ = θ0 + τ·Δt
So, how do I select an immediate α and τ (remember these will be recomputed every iteration of the game loop, usually > 100 fps)? The simplest, most naive way I can think of is (a rough sketch follows the list below):
Select a Δt equal to the average of the last few Δts between updates of the game loop (i.e. very small)
Compute the above 5 equations one step ahead for all combinations of (α, τ) = {0, αmax} × {-τmax, 0, τmax} (only 6 combinations and 5 equations each, so it shouldn't take too long, and since they are run so often, the rather restrictive ranges will be amortized in the end)
Assign weights to position, velocity and rotation. Perhaps these weights could be dynamic (i.e. the further the entity is from the target position, the more heavily position is weighted).
Greedily choose the combination that minimizes the weighted error for the state Δt from now.
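As an illustration only (not code from the original post), a minimal Python sketch of that greedy step might look like the following; the weight values and variable names are assumptions:

import math

def greedy_step(p, v, theta, pf, vf, theta_f, alpha_max, tau_max, dt,
                w_pos=1.0, w_vel=0.1, w_rot=0.1):
    """Pick the (alpha, tau) pair, out of the 6 candidate combinations,
    that minimizes a weighted distance to the target state after dt."""
    best, best_cost = None, float("inf")
    for alpha in (0.0, alpha_max):
        for tau in (-tau_max, 0.0, tau_max):
            # Predict one small step ahead, treating alpha and tau as constant over dt
            vx = v[0] + alpha * math.cos(theta) * dt
            vy = v[1] + alpha * math.sin(theta) * dt
            px = p[0] + v[0] * dt + 0.5 * alpha * math.cos(theta) * dt ** 2
            py = p[1] + v[1] * dt + 0.5 * alpha * math.sin(theta) * dt ** 2
            th = theta + tau * dt
            dth = (th - theta_f + math.pi) % (2 * math.pi) - math.pi  # wrap to (-pi, pi)
            cost = (w_pos * math.hypot(px - pf[0], py - pf[1])
                    + w_vel * math.hypot(vx - vf[0], vy - vf[1])
                    + w_rot * abs(dth))
            if cost < best_cost:
                best, best_cost = (alpha, tau), cost
    return best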
It's potentially fast & simple; however, there are a few glaring problems with this:
Arbitrary selection of weights
It's a greedy algorithm that (by its very nature) can't backtrack
It doesn't really take into account the problem space
If it frequently changes acceleration or turns, the animation could look "jerky".
Note that while the algorithm can (and probably should) save state between iterations, Pf, Vf and θf can change every iteration (e.g. if the entity is trying to follow/position itself near another entity), so the algorithm needs to be able to adapt to changing conditions.
Any ideas? Is there a simple solution for this I'm missing?
Thanks,
Robert

Sounds like you want a PD controller. Basically, draw a line from the current position to the target and take that line's direction in radians as your target heading. The current error in radians is current heading - line heading; call it Eh (heading error). Then the turn rate each step is Kp·Eh + Kd·d(Eh)/dt. Do this every step with a new line.
That's for heading.
Acceleration is "accelerate until I've reached max speed or I won't be able to stop in time". You threw up a bunch of integrals, so I'm sure you'll be fine with that calculation.
In case you're wondering: yes, I've solved this problem before, and a PD controller works. Don't bother with PID; you don't need it in this case. Prototype in MATLAB. There is one thing I've left out: you need a trigger, like "I'm getting really close now", that tells you to start turning in to the target. I just read your clarification about only accelerating in the direction you're heading. That changes things a bit, but not too much: it means you need to approach the target "from behind", i.e. the line's target point will have to be behind the real target; when you get near that behind point, follow a new line that guides you into the real target. You'll also want to follow the lines, rather than just pick a heading and try to stick with it. So don't update the line each frame; instead, make the error equal to the SIGNED DISTANCE FROM THE CURRENT TARGET LINE. The PD gives you a turn rate, acceleration is trivial, so you're set. You'll need to tweak Kp and Kd by hand, which is why I said MATLAB first (Octave is good too).
good luck, hope this points you in the right direction ;)
pun intended.
EDIT: I just read that... lots of stuff, written real quick. This is a line-following solution to your problem; just google "line following" to accompany this answer if you want to take it as a basis for solving the problem.
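As a rough, language-agnostic illustration of the answer above (not the original poster's code), here is a minimal Python sketch of a PD controller on heading error; the gains kp and kd are placeholder values to be tuned by hand, as suggested:

import math

def pd_turn_rate(heading, target_heading, prev_error, dt,
                 kp=2.0, kd=0.5, tau_max=3.0):
    """PD controller on heading error: turn_rate = Kp*Eh + Kd*d(Eh)/dt."""
    error = (target_heading - heading + math.pi) % (2 * math.pi) - math.pi  # wrap to (-pi, pi)
    d_error = (error - prev_error) / dt
    turn = kp * error + kd * d_error
    # Clamp to the entity's maximum turn speed; return the error for the next call
    return max(-tau_max, min(tau_max, turn)), error

Call this once per frame with the direction of the current target line as target_heading, feeding the returned error back in as prev_error on the next call.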

I would like to suggest that you consider http://en.wikipedia.org/wiki/Bang%E2%80%93bang_control (bang-bang control) as well as a PID or PD. The things you are trying to minimise don't seem to produce any penalty for pushing the accelerator down as far as it will go, until it comes time to push the brake down as far as it will go, except for your point about how jerky this will look. At the very least, this provides some justification for your initial guess.
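To make the bang-bang idea concrete, here is a rough Python sketch of the "full throttle until you can no longer slow down in time" rule, reduced to one dimension; since the question restricts α to [0, αmax], "braking" here is just coasting (α = 0). The names and simplifications are illustrative assumptions:

def bang_bang_accel(dist_to_target, speed, target_speed, alpha_max):
    """Full acceleration while there is still room to shed the excess speed,
    otherwise coast (alpha = 0, the closest thing to braking allowed here)."""
    # Distance needed to go from `speed` down to `target_speed` at deceleration alpha_max
    braking_dist = max(0.0, speed ** 2 - target_speed ** 2) / (2.0 * alpha_max)
    return alpha_max if dist_to_target > braking_dist else 0.0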

Related

How to improve the performance of an alpha-beta minimax method in Ruby?

I’ve been a programming student for about six months. I’d been wanting to write a chess game for a long time, and finally made it. I’m very happy with the result; however, there’s a point I don’t know how to address. The AI is based on an alpha-beta pruning #minimax method that chooses the best move on the basis of the best possible outcome for the current player, with a default depth of 3, which means the computer can think 3 turns ahead. The computer does choose the correct move, but the method in its current implementation is very slow.
#provisional makes and 'unmakes' a possible move, and returns the value returned from the code block. The evaluation function #evaluate is very simple; it's just a sum of the material value of the pieces and their location values according to how they are placed on the board.
I’d really appreciate some light here, as I don’t know how to get a faster version of this method.
Thank you so much for your time.
This is the method:
def minimax(move, depth, alpha, beta, maximizing_player)
  return board.evaluate if depth.zero?

  board.provisional(move, color) do
    if maximizing_player
      # Evaluate the opponent's (black's) replies and keep the minimum
      best_minimizing_evaluation = Float::INFINITY
      board.generate_moves(:black).each do |possible_move|
        evaluation = minimax(possible_move, depth - 1, alpha, beta, false)
        best_minimizing_evaluation = [best_minimizing_evaluation, evaluation].min
        beta = [beta, evaluation].min
        break if beta <= alpha # alpha-beta cutoff
      end
      best_minimizing_evaluation
    else
      # Evaluate white's replies and keep the maximum
      best_maximizing_evaluation = -Float::INFINITY
      board.generate_moves(:white).each do |possible_move|
        evaluation = minimax(possible_move, depth - 1, alpha, beta, true)
        best_maximizing_evaluation = [best_maximizing_evaluation, evaluation].max
        alpha = [alpha, evaluation].max
        break if beta <= alpha # alpha-beta cutoff
      end
      best_maximizing_evaluation
    end
  end
end
With an initial depth value of 3, it takes between 15 and 50 seconds for the method to resolve and return the chosen move; this is a lot, and makes the game barely enjoyable. Changing the depth value to 2, the times are more reasonable, being about a third of the previous times, but I’d really like to keep the depth at 3. With a depth of 1, of course, it takes less than a second.
I realize that some improvements can be made; however, I don’t know how to make them:
Will a negamax version of this method significantly improve its performance?
I’m aware of tail recursion optimization; however, is it possible in this case? I don’t know whether it applies to a tree-generating method like this.
I’ve been told that pre-sorting the moves before they are evaluated can improve performance in alpha-beta minimax, but how can I sort the moves? I can’t sort them by the immediate outcome, because, for instance, sometimes it’s worth sacrificing a piece to win a better position. I could sort them by the best possible outcome in 2 turns... but then I'd be computing twice: once for sorting the moves, and once for the actual move evaluation. (A rough sketch of a cheap ordering heuristic follows after this list.)
Is implementing a transposition table worth it? I mean, there are tons of potential positions, and they can easily make a file really big. For example, in a day, the program generated a 100 MB text file, and I didn't notice a huge performance improvement, as the computer doesn't always make the same moves, and neither do I. The different positions in a chess game are innumerable, unlike in a game like Tic Tac Toe.
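On the move-ordering question above: one cheap heuristic that needs no extra look-ahead is to try captures of the most valuable pieces first, so that good moves tighten alpha/beta early and more of the tree gets pruned. A rough, language-agnostic Python sketch (the move.captured_piece attribute and the piece values are illustrative assumptions, not taken from the posted code):

PIECE_VALUE = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def order_moves(moves):
    """Search captures of valuable pieces before quiet moves."""
    def score(move):
        victim = getattr(move, "captured_piece", None)  # assumed attribute
        return PIECE_VALUE.get(victim, 0)
    return sorted(moves, key=score, reverse=True)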
Thank you so much.

How to set a convergence tolerance for a specific variable using Dymola?

So, I have a model of a tube with pressure loss, where the unknown is the mass flow rate. Normally, and in most models of this problem, the conservation equations are used to calculate the mass flow rate, but such models have lots of convergence issues (because of the blocked flow at the end of the tube, which results in an infinite pressure derivative at the end). See the figure below for a representation of the problem on the left and, on the right, a graph showing the infinite pressure derivative.
Because of that, I'm using a model which is more robust, though it outputs not the mass flow rate but the tube length, which is known. Therefore an iterative loop is needed to determine the mass flow rate. So I coded a function length that, given the tube geometry, mass flow rate and boundary conditions, outputs the calculated tube length, and wrote the equations like so:
parameter Real L;
Real m_flow;
...
equation
  L = length(geometry, boundary, m_flow);
It simulates fine, but it takes ages... And it shouldn't, because the mass flow rate is rather insensitive to the tube length, e.g. if L = 3 I could say that m_flow has converged if the output of length is within L ± 0.1. On the other hand, the default convergence tolerance of DASSL in Dymola is 0.0001, which is fine for all the other variables, but a major setback for my model here...
That being said, I'd like to know if there's a (hacky) way of setting a specific tolerance for L (from annotations or something). I was unable to find any solution online or in Dymola's user manual... So far I've managed a workaround by making a second function which uses a Newton-Raphson method to determine the mass flow rate, something like:
function massflowrate
  input geometry, boundary, m_flow_start, tolerance;
  output m_flow;
protected
  Real error, L, dL, dLdm_flow, Delta_m_flow;
algorithm
  error := geometry.L;   // anything larger than tolerance, just to enter the loop
  m_flow := m_flow_start;
  while error > tolerance loop
    L := length(geometry, boundary, m_flow);
    error := abs(geometry.L - L);
    // finite-difference estimate of dL/dm_flow for the Newton step
    dL := length(geometry, boundary, m_flow*1.001);
    dLdm_flow := (dL - L)/(0.001*m_flow);
    Delta_m_flow := (geometry.L - L)/dLdm_flow;
    m_flow := m_flow + Delta_m_flow;
  end while;
end massflowrate;
And then I use it in the equations section:
parameter Real L;
Real m_flow;
...
equation
  m_flow = massflowrate(geometry, boundary, delay(m_flow,10), tolerance);
Nevertheless, this solution is not without its problems: the real equations are very non-linear and, depending on the boundary conditions, the solver gets stuck in a never-ending loop... =/
PS: I'm sorry for the long post and the lack of an MWE; the real equations are very long and full of thermodynamics, which I believe would not be of any help. Be that as it may, if necessary, I'm able to provide the real model.
Is the length function smooth? To me, it being non-smooth seems like a likely cause of the problems, and the suggestions by @Phil might also be good ideas.
However, it should also be possible to do what you want as follows:
Real m_flow(nominal=1e9);
Explanation: The equations are normally solved to a certain tolerance in unknowns - in this case m_flow.
The tolerance for each variable is a relative/absolute tolerance taking into account the nominal value, and Dymola does not allow you to set different tolerances for different variables.
Thus the simple way to compute m_flow less accurately is by setting a high nominal value for it, since the error tolerance will be tol*(abs(m_flow)+abs(nominal(m_flow))) or something like that.
The downside is that it may be too inaccurate, e.g. causing additional events, or that the error is so random that the solver is still slowed down.

Modelica events and hybrid modelling

I would like to understand the general idea behind hybrid modelling (in particular state events) from a numerical point of view (although I am not a mathematician :)). Given the following Modelica model:
model BouncingBall
  constant Real g = 9.81;
  Real h(start=1);
  Real v(start=0);
equation
  der(h) = v;
  der(v) = -g;
algorithm
  when h < 0 then
    reinit(v, -pre(v));
  end when;
end BouncingBall;
I understand the concept of when and reinit.
The equations in the when statement are only active when the condition becomes true, right?
Let's assume that the ball would hit the floor at exactly 2 sec. Since I am using a multi-step solver, does that mean that the solver "goes beyond 2 seconds" and recognizes that h < 0 (let's assume at simulation time = 2.5 sec, h = -0.7)? What does "the time for the event is searched using a crossing function" mean? Is there a simple explanation (example)?
Is the solver now going back? Taking a smaller step-size?
What does the pre() operation mean in that context?
noEvent(): "Expressions are taken literally instead of generating crossing functions. Since there is no crossing function, there is no requirement tat the expression can be evaluated beyond the event limit": What does that mean? Given the same example with the bouncing ball: The solver detects at time 2.5 that h = 0.7. Whats the difference between with and without noEvent()?
Yes, the body of when is only executed at events.
Simple view: The solver takes steps, and then uses a continuous extension to generate a (smooth) interpolation formula for the previous step. That interpolation formula can be used to generate a plot, and also for finding the first point where h has crossed zero (likely 2.000000001). An event iteration is then done at that interpolated point - and afterwards the solver is restarted.
I wouldn't say that the solver goes back. It takes a partial step and then continues forward. Some solvers need to reduce the step-size a lot after the event - others don't.
pre(x) is set to the value of x before the event.
noEvent(h<0) basically means: evaluate the expression as written, without all the bells and whistles of crossing functions. You cannot use "when noEvent(h<0) then".
There are many additional points:
If you are familiar with Sturm sequences or control theory, you might realize that it is not necessary to interpolate a formula to determine whether it crossed zero in an interval (and some tools use that). The fact that the function is not necessarily smooth makes it a bit more complicated, and also means that derivative tests cannot be used.
How much the solver is reset depends on the kind of solver. One-step solvers (Runge-Kutta) can be restarted directly as if virtually nothing happened, whereas multi-step solvers (BDF/Adams - such as dassl/lsodar/cvode) need to start with lower order and smaller step-size.
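As a rough illustration of the crossing-function idea described above (not how any particular tool implements it), here is a Python sketch that bisects the solver's dense-output interpolant over the last accepted step to locate the event time:

def locate_event(h_interp, t_prev, t_next, tol=1e-9):
    """Find where h_interp crosses zero inside [t_prev, t_next],
    assuming h_interp(t_prev) > 0 and h_interp(t_next) < 0."""
    lo, hi = t_prev, t_next
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h_interp(mid) > 0:
            lo = mid  # crossing is after mid
        else:
            hi = mid  # crossing is at or before mid
    return hi

# Example: ball dropped from h = 1, so h(t) = 1 - 0.5*9.81*t**2
t_event = locate_event(lambda t: 1 - 0.5 * 9.81 * t ** 2, 0.4, 0.5)  # ~0.4515 s

The event iteration (the reinit of v) is then done at that interpolated point, and the solver continues from there.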

Trouble implementing Perceptron in Scala

I'm taking the CalTech online course Learning From Data, and I'm stumped with creating a Perceptron in Scala. I chose Scala because I'm learning it and wanted to challenge myself. I understand the theory, and I also understand others' solutions in Python and Ruby. But I can't figure out why my own Scala code doesn't work.
For a background in the Perceptron code: Learning_algorithm
I'm running Scala 2.11 on OSX 10.10.
Per the algorithm, I start off with weights (0.0, 0.0, 0.0), where weight[2] is a learned bias component. I've already generated a test set in the space [-1, 1] × [-1, 1] on the X-Y plane. I do this by (a) picking two random points and drawing a line through them, then (b) generating some other random points and calculating which side of the line each falls on. As far as I can tell by plotting it in Python, this generates linearly separable data.
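For illustration, a minimal Python sketch of that data-generation step (names are assumptions, not the poster's Scala code):

import random

def make_linearly_separable(n=100):
    """Label n random points in [-1, 1]^2 by which side of a random line they lie on."""
    (x1, y1), (x2, y2) = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2)]
    def side(x, y):
        # Sign of the cross product of (p2 - p1) and (point - p1)
        return 1 if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) > 0 else -1
    points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    labels = [side(x, y) for x, y in points]
    return points, labels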
My next step is to take my initialized weights and check them against every point to find misclassified points, i.e. points that don't produce the right +1 or -1 result. Here is the code that simply takes the dot product of the weights and the vector x and returns its sign:
def h(weight:List[Double], p:Point ): Double = if ( (weight(0)*p.x + weight(1)*p.y + weight(2)) > 0) 1 else -1
These are the initial weights, so every point is misclassified. I then update the weights, like so:
def newH(weight: List[Double], p: Point, y: Double): List[Double] = {
  val newWt = scala.collection.mutable.ArrayBuffer[Double](0.0, 0.0, 0.0)
  newWt(0) = weight(0) + p.x * y
  newWt(1) = weight(1) + p.y * y
  newWt(2) = weight(2) + 1 * y
  newWt.toList
}
Then I identify miss-classified points again by checking the test set against the value output by h() above, and continue iterating.
This follows the algorithm (or is supposed to, at least) that Prof Yaser shows here: Library
The problem is that the algorithm never converges. My weights -- the third component of which is the bias -- keep getting more negative or more positive. My weight vector after every adjustment resembles this:
Weights: List(16.43341624736786, 11627.122008800507, -34130.0)
Weights: List(15.533397436141968, 11626.464265227318, -34131.0)
Weights: List(14.726969361305237, 11626.837346673012, -34132.0)
Weights: List(14.224745154380798, 11627.646470665932, -34133.0)
Weights: List(14.075232982635498, 11628.026384592056, -34134.0)
I'm a Scala newbie so my code is probably atrocious. But am I missing something in Scala, e.g. reassignment, that could be causing my weight to be messed up? Or have I completely misunderstood how the Perceptron even operates? Is my weight update just wrong?
Thanks for any help you can give me on this!
Thanks Till. I've discovered the two problems with my code and I'll share them, but to address your point: someone else asked about this on the class's forum, and it looks like what the Wiki formula does is simply change the learning rate. Alpha can be picked arbitrarily, and y - h(weight, p) gives you updates like
-1 - 1 = -2
in the case that y = -1 and h() = 1, or
1 - (-1) = 2
in the case that y = 1 and h() = -1.
My/the class formula takes 1*p.x instead of alpha*2, which seems to be just a matter of different learning rates. Hope that makes sense.
My two problems were as follows:
The y value passed into the recalculation formula newH needs to be the target value of y, that is, the "correct y" that was stored while generating the test points. I was passing in the y produced by h(), which is the guessed-at function. This makes sense, obviously, since we are looking to correct the weights using the target y, not the incorrect y. (A small sketch of the corrected loop follows after this list.)
I was comparing the target y to h() = y in Scala, but was comparing against an element obtained from a map through .get(). My Scala map looks like Map[Point, Double], where the Double value is the y value generated during test-set creation. But .get() gives you Option[Double], not a Double value at all. This is explained in "Scala Map#get and the return of Some()" and makes a lot of sense now. I did map.get(<some Point>).get for now, since I was focusing on debugging rather than code perfection, and then I was able to accurately compare two Double values.
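For reference, a minimal, language-agnostic Python sketch of the corrected loop, updating with the stored target label rather than the prediction (the data layout is an assumption):

def predict(w, x, y_coord):
    return 1 if w[0] * x + w[1] * y_coord + w[2] > 0 else -1

def train_perceptron(points, labels, max_iters=10000):
    """points: list of (x, y) pairs; labels: list of +1/-1 target values."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_iters):
        wrong = [(p, t) for p, t in zip(points, labels)
                 if predict(w, p[0], p[1]) != t]
        if not wrong:
            return w  # converged: every point classified correctly
        (x, y_coord), target = wrong[0]
        # Update using the *target* label, not the (wrong) prediction
        w = [w[0] + target * x, w[1] + target * y_coord, w[2] + target]
    return w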

What is a "good" R value when comparing 2 signals using cross correlation?

I apologize for being a bit verbose in advance: if you want to skip all the background mumbo jumbo you can see my question down below.
This is pretty much a follow up to a question I previously posted on how to compare two 1D (time dependent) signals. One of the answers I got was to use the cross-correlation function (xcorr in MATLAB), which I did.
Background information
Perhaps a little background information will be useful: I'm trying to implement an Independent Component Analysis algorithm. One of my informal tests is to (1) create a test case by (a) generating 2 random vectors (1x1000), (b) combining the vectors into a 2x1000 matrix (called "S"), and (c) multiplying this by a 2x2 mixing matrix (called "A"), to give me a new matrix (let's call it "T").
In summary: T = A * S
(2) I then run the ICA algorithm to generate the inverse of the mixing matrix (called "W"), (3) multiply "T" by "W" to (hopefully) give me a reconstruction of the original signal matrix (called "X")
In summary: X = W * T
(4) I now want to compare "S" and "X". Although "S" and "X" are 2x1000, I simply compare S(1,:) to X(1,:) and S(2,:) to X(2,:), each of which is 1x1000, making them 1D signals. (I have another step which makes sure that these vectors are the proper vectors to compare to each other, and I also normalize the signals.)
So my current quandary is how to 'grade' how close S(1,:) matches to X(1,:), and likewise with S(2,:) to X(2,:).
So far I have used something like: r1 = max(abs(xcorr(S(1,:), X(1,:))))
My question
Assuming that using the cross correlation function is a valid way to go about comparing the similarity of two signals, what would be considered a good R value to grade the similarity of the signals? Wikipedia states that this is a very subjective area, and so I defer to the better judgment of those who might have experience in this field.
As you might realize, I'm not coming from a EE/DSP/statistical background at all (I'm a medical student) so I'm going through a sort of "baptism through fire" right now, and I appreciate all the help I can get. Thanks!
(edit: as far as directly answering your question about R values, see below)
One way to approach this would be to use cross-correlation. Bear in mind that you have to normalize amplitudes and correct for delays: if you have signal S1, and signal S2 is identical in shape, but half the amplitude and delayed by 3 samples, they're still perfectly correlated.
For example:
>> t = 0:0.001:1;
>> y = @(t) sin(10*t).*exp(-10*t).*(t > 0);
>> S1 = y(t);
>> S2 = 0.4*y(t-0.1);
>> plot(t,S1,t,S2);
These should have a perfect correlation coefficient. A way to compute this is to use maximum cross-correlation:
>> f = @(S1,S2) max(xcorr(S1,S2));
f =
@(S1,S2) max(xcorr(S1,S2))
>> disp(f(S1,S1)); disp(f(S2,S2)); disp(f(S1,S2));
12.5000
2.0000
5.0000
The maximum value of xcorr() takes care of the time-delay between signals. As far as correcting for amplitude goes, you can normalize the signals so that their self-cross-correlation is 1.0, or you can fold that equivalent step into the following:
ρ² = f(S1,S2)² / (f(S1,S1)*f(S2,S2))
In this case ρ² = 5 * 5 / (12.5 * 2) = 1.0
You can solve for ρ itself, i.e. ρ = f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2)); just bear in mind that both 1.0 and -1.0 are perfectly correlated (-1.0 has the opposite sign).
Try it on your signals!
With respect to what threshold to use for acceptance/rejection, that really depends on what kind of signals you have. 0.9 and above is fairly good, but can be misleading. I would consider looking at the residual signal you get after you subtract out the correlated version. You could do this by looking at the time index of the maximum value of xcorr():
>> t = 0:0.001:1;
>> y = @(a,t) sin(a*t).*exp(-a*t).*(t > 0);
>> S1=y(10,t);
>> S2=0.4*y(9,t-0.1);
>> f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2))
ans =
0.9959
This looks pretty darn good for a correlation. But let's try fitting S2 with a scaled/shifted multiple of S1:
>> [A,i]=max(xcorr(S1,S2)); tshift = i-length(S1);
>> S2fit = zeros(size(S2)); S2fit(1-tshift:end) = A/f(S1,S1)*S1(1:end+tshift);
>> plot(t,[S2; S2fit]); % fit S2 using S1 as a basis
>> plot(t,[S2-S2fit]); % residual
Residual has some energy in it; to get a feel for how much, you can use this:
>> S2res=S2-S2fit;
>> dot(S2res,S2res)/dot(S2,S2)
ans =
0.0081
>> sqrt(dot(S2res,S2res)/dot(S2,S2))
ans =
0.0900
This says that the residual has about 0.81% of the energy (9% of the root-mean-square amplitude) of the original signal S2. (the dot product of a 1D signal with itself will always be equal to the maximum value of cross-correlation of that signal with itself.)
I don't think there's a silver bullet for answering how similar two signals are with each other, but hopefully I've given you some ideas that might be applicable to your circumstances.
A good starting point is to get a sense of what a perfect match will look like by calculating the auto-correlations for each signal (i.e. do the "cross-correlation" of each signal with itself).
THIS IS A COMPLETE GUESS - but I'm guessing max(abs(xcorr(S(1,:),X(1,:)))) > 0.8 implies success. Just out of curiosity, what kind of values do you get for max(abs(xcorr(S(1,:),X(2,:))))?
Another approach to validate your algorithm might be to compare A and W. If W is calculated correctly, it should be A^-1, so can you calculate a measure like |A*W - I|? Maybe you have to normalize by the trace of A*W.
Getting back to your original question, I come from a DSP background, so I get to deal with fairly noise-free signals. I understand that's not a luxury you get in biology :) so my 0.8 guess might be very optimistic. Perhaps looking at some literature in your field, even if they aren't using cross-correlation exactly, might be useful.
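As a tiny Python/NumPy sketch of that check (with a made-up 2x2 mixing matrix A and, for illustration, a perfect unmixing matrix W):

import numpy as np

A = np.array([[1.0, 0.5],
              [0.3, 2.0]])   # example mixing matrix
W = np.linalg.inv(A)          # a perfect unmixing matrix, for illustration

mismatch = np.linalg.norm(A @ W - np.eye(2))  # 0.0 for a perfect inverse; larger is worse
normalized = mismatch / np.trace(A @ W)       # optional normalization by the trace, as suggested
print(mismatch, normalized)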
Usually in such cases people talk about "false acceptance rate" and "false rejection rate".
The first one describes how many times the algorithm says "similar" for non-similar signals; the second one is the opposite.
Selecting a threshold thus becomes a trade-off between these criteria. To make FAR = 0, the threshold should be 1; to make FRR = 0, the threshold should be -1.
So probably, you will need to decide which trade-off between FAR and FRR is acceptable in your situation and this will give the right value for threshold.
Mathematically this can be expressed in different ways. Just a couple of examples (a rough sketch of option 2 follows below):
1. fix one of the rates at an acceptable value and minimize the other one
2. minimize max(FRR, FAR)
3. minimize a·FRR + b·FAR
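As a rough Python sketch of option 2 above (minimize max(FRR, FAR)); the score lists and candidate thresholds are made-up illustrations:

def pick_threshold(scores_similar, scores_dissimilar, candidates):
    """Choose the threshold minimizing max(FRR, FAR).
    scores_similar: correlation scores for pairs known to match.
    scores_dissimilar: scores for pairs known not to match."""
    best_t, best_worst = None, float("inf")
    for t in candidates:
        frr = sum(s < t for s in scores_similar) / len(scores_similar)         # matches rejected
        far = sum(s >= t for s in scores_dissimilar) / len(scores_dissimilar)  # non-matches accepted
        if max(frr, far) < best_worst:
            best_t, best_worst = t, max(frr, far)
    return best_t

threshold = pick_threshold([0.95, 0.90, 0.97], [0.30, 0.50, 0.85],
                           candidates=[x / 100 for x in range(101)])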
Since they should be equal, the correlation coefficient should be high, between .99 and 1. I would take the max and abs functions out of your calculation, too.
EDIT:
I spoke too soon. I confused cross-correlation with correlation coefficient, which is completely different. My answer might not be worth much.
I would agree that the result would be subjective. Something that would involve the sum of the squares of the differences, element by element, would have some value. Two identical arrays would give a value of 0 in that form. You would have to decide what value then becomes "bad". Make up 2 different vectors that "aren't too bad" and find their cross-correlation coefficient to be used as a guide.
(parenthetically: if you were doing a correlation coefficient where 1 or -1 would be great and 0 would be awful, I've been told by bio-statisticians that a real-life value of 0.7 is extremely good. I understand that this is not exactly what you are doing but the comment on correlation coefficient came up earlier.)
