About cv_wgt_start for CVs with quite different delays - gekko

We are trying to control the TCLab with an artificially large delay, following the suggestions in the linked post. The CV_WGT_START option worked very well (see the results plot); I set CV_WGT_START to double the delay time from the FOPDT fit.
But my question is: since CV_WGT_START is a global option, how can we deal with two CVs with quite different delays?

One option for giving each CV a different delay is to define a custom objective with a parameter that acts as a step function for when the penalty starts. In this example there are 10 time steps in the control / prediction horizon; the penalty (weight 5) starts at step 7 for T1 and at step 3 for T2.
from gekko import GEKKO
import numpy as np

m = GEKKO()
m.time = np.linspace(0, 10, 11)   # 11 points = 10 time steps (spacing is application-specific)

# T1 and T2 are the CV temperatures defined elsewhere in the model
p1 = np.zeros(11)
p1[6:] = 5                        # penalty on T1 starts at step 7
p1 = m.Param(p1)
p2 = np.zeros(11)
p2[2:] = 5                        # penalty on T2 starts at step 3
p2 = m.Param(p2)
m.Minimize(p1*(T1-55)**2)
m.Minimize(p2*(T2-35)**2)
Another option is to use m.options.CV_TYPE=3, a derivative form of the reference trajectory that is especially suited to systems with time delay.


Gekko PRED_HOR and CTRL_HOR vs m.time

I'm trying to implement an online MPC controller and I'm a bit confused about what exactly m.time does.
With m.options.IMODE = 6 #MPC and m.options.REQCTRLMODE=3, I try to define the prediction and control horizons:
m.options.CTRL_HOR=10
m.options.CTRL_TIME=0.05
m.options.PRED_HOR=10
m.options.PRED_TIME=0.05
If I understand it right, CTRL_HOR and PRED_HOR set how many future time steps we calculate, and PRED_TIME and CTRL_TIME define how long one time step is.
The problem is that the controller throws an error if I don't define m.time. What exactly does m.time do, and why isn't it enough to set the control and prediction horizons with their respective time steps?
Gekko uses m.time by default instead of CTRL_HOR and PRED_HOR. You can define an equivalent control / prediction horizon in Gekko with:
import numpy as np
from gekko import GEKKO
m = GEKKO()
m.time = np.linspace(0,0.5,11)   # 11 points = 10 intervals of CTRL_TIME=0.05
The CTRL_HOR and PRED_HOR properties are optionally used when CSV_READ=0. However, Gekko uses the CSV file to insert information about default values for parameters and variables so I don't recommend that you turn it off. Using m.time is also more flexible because you can have a non-uniform control / prediction horizon such as:
m.time = [0,0.05,0.1,0.2,0.5,1.0]
This gives fine resolution at the beginning and larger steps later to determine the steady-state move plan. Here is a practical TCLab MPC application with real-time data.
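For context, here is a minimal MPC sketch that uses m.time as the horizon (the first-order model, gain, time constant, and setpoint are illustrative assumptions, not from the question):

import numpy as np
from gekko import GEKKO

m = GEKKO(remote=False)
m.time = np.linspace(0, 0.5, 11)   # control / prediction horizon: 10 steps of 0.05

u = m.MV(value=0, lb=0, ub=100)    # manipulated variable
u.STATUS = 1                       # let the optimizer move u
y = m.CV(value=0)                  # controlled variable
y.STATUS = 1                       # include y in the objective
y.SP = 10                          # setpoint

K, tau = 2.0, 0.1                  # assumed gain and time constant
m.Equation(tau * y.dt() == -y + K * u)

m.options.IMODE = 6                # MPC mode
m.options.CV_TYPE = 2              # squared-error objective so SP is used directly
m.solve(disp=False)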

In Tensorflow, what is the difference between Session.partial_run and Session.run?

I always thought that Session.run required all placeholders in the graph to be fed, while Session.partial_run only the ones specified through Session.partial_run_setup, but looking further that is not the case.
So how exactly do the two methods differentiate? What are the advantages/disadvantages of using one over the other?
With tf.Session.run, you usually give some inputs and expected outputs, and TensorFlow runs the operations in the graph to compute and return those outputs. If you later want to get some other output, even for the same input, you have to run all the necessary operations in the graph again, even if some intermediate results are the same as in the previous call. For example, consider something like this:
import tensorflow as tf

input_ = tf.placeholder(tf.float32)
result1 = some_expensive_operation(input_)
result2 = another_expensive_operation(result1)

with tf.Session() as sess:
    x = ...
    sess.run(result1, feed_dict={input_: x})
    sess.run(result2, feed_dict={input_: x})
Computing result2 requires running both some_expensive_operation and another_expensive_operation, but most of that computation repeats what was already done to calculate result1. tf.Session.partial_run allows you to evaluate part of a graph, leave that evaluation "on hold", and complete it later. For example:
import tensorflow as tf

input_ = tf.placeholder(tf.float32)
result1 = some_expensive_operation(input_)
result2 = another_expensive_operation(result1)

with tf.Session() as sess:
    x = ...
    h = sess.partial_run_setup([result1, result2], [input_])
    sess.partial_run(h, result1, feed_dict={input_: x})
    sess.partial_run(h, result2)
Unlike before, here the operations from some_expensive_operation will only be run once in total, because the computation of result2 is just a continuation of the computation of result1.
This can be useful in several contexts, for example if you want to split the computational cost of a run into several steps, but also if you need to do some mid-evaluation checks out of TensorFlow, such as computing an input to the second half of the graph that depends on an output of the first half, or deciding whether or not to complete an evaluation depending on an intermediate result (these may also be implemented within TensorFlow, but there may be cases where you do not want that).
Note too that it is not only a matter of avoiding repeated computation. Many operations have state that changes on each evaluation, so the result of two separate evaluations and one evaluation divided into two partial ones may actually be different. This is the case with random operations, where you get a new value per run, and with other stateful objects like iterators. Variables are also obviously stateful, so operations that change variables (like tf.assign or optimizers) will not produce the same results when they are run once as when they are run twice.
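That difference is visible even without partial_run, by comparing two separate Session.run calls with a single call that fetches both tensors (a small sketch; within one run a stateful op such as tf.random_uniform is evaluated only once):

import tensorflow as tf

r = tf.random_uniform([])      # stateful: produces a new value on each evaluation
doubled = 2.0 * r

with tf.Session() as sess:
    a = sess.run(r)
    b = sess.run(doubled)           # r is sampled again here, so in general b != 2 * a
    c, d = sess.run([r, doubled])   # one run: r is sampled once and shared, so d == 2 * c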
In any case, note that, as of v1.12.0, partial_run is still an experimental feature and is subject to change.

Speeding up evaluation of many scipy splines over the same set of knots

I have a few quick questions regarding speeding up spline evaluation in scipy (version 0.12.0), and I apologize in advance for my novice understanding of splines. I am trying to create an object for scipy.integrate.odeint integration of a chemical kinetics problem, using spline lookups for the reaction rates (1e2-1e3 functions of the ODE system variables) and generated C code for all of the algebra in the ODE system of equations. Compared to a previous implementation that was purely in Python, evaluating the C code is so much faster that the spline interpolations are now the bottleneck in the ODE function.
In trying to remove that bottleneck, I have reformed all of the reaction rates into splines that exist on the same knot values with the same order while having different smoothing coefficients. (In reality I will have multiple sets of functions, where each function set was fit on the same knots, has the same argument variable, and is evaluated at the same derivative level, but for simplicity I will assume one function set for this question.)
In principle this is just a collection of curves on the same x-values and could be treated with interp1d (equivalently rewrapping splmake and spleval from scipy.interpolate) or a list of splev calls on tck data from splrep.
In [1]: %paste
import numpy
import scipy
from scipy.interpolate import *
#Length of Data
num_pts = 3000
#Number of functions
num_func = 100
#Shared x list by all functions
x = numpy.linspace(0.0,100.0,num_pts)
#Separate y(x) list for each function
ylist = numpy.zeros((num_pts,num_func))
for ind in range(0,num_func):
    #Dummy test for different data
    ylist[:,ind] = (x**ind + x - 3.0)
testval = 55.0
print 'Method 1'
fs1 = [scipy.interpolate.splrep(x,ylist[:,ind],k=3) for ind in range(0,num_func)]
out1 = [scipy.interpolate.splev(testval,fs1[ind]) for ind in range(0,num_func)]
%timeit [scipy.interpolate.splev(testval,fs1[ind]) for ind in range(0,num_func)]
print 'Method 2 '
fs2 = scipy.interpolate.splmake(x,ylist,order=3)
out2 = scipy.interpolate.spleval(fs2,testval)
%timeit scipy.interpolate.spleval(fs2,testval)
## -- End pasted text --
Method 1
1000 loops, best of 3: 1.51 ms per loop
Method 2
1000 loops, best of 3: 1.32 ms per loop
As far as I understand spline evaluation, once the tck arrays have been created (either with splrep or splmake), the evaluation functions (splev and spleval) perform two operations when given some new value xnew:
1) Determine the relevant indices of the knots and smoothing coefficients
2) Evaluate the polynomial expression with those coefficients at xnew
Questions
1) Since all of the splines (in a function set) are created on the same knot values, is it possible to avoid step 1 (finding the relevant indices) once it has been performed for the first function of a function set? From looking at the Fortran FITPACK files (directly from DIERCKX; I could not find the .c files used by scipy on my machine), I do not think this is supported, but I would love to be shown wrong.
2) The compilation of the system C code, as well as the creation of all of the spline tck arrays, is a preprocessing step as far as I am concerned; if I am worried about the speed of evaluating these lists of many functions, should I be looking at a compiled variant, since my tck lists will be unchanging?
3) One of my function sets will likely have an x-array of geometrically spaced values as opposed to linearly spaced; will this drastically reduce the evaluation time of the splines?
Thank you in advance for your time and answers.
Cheers,
Guy
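One way to exploit the shared knots is to evaluate every function of a set in a single call. A minimal sketch, assuming a newer scipy than 0.12 that provides make_interp_spline, which accepts a 2-D y array and builds one B-spline object whose coefficient array holds all of the functions, so the knot-interval search happens once per evaluation point rather than once per function:

import numpy as np
from scipy.interpolate import make_interp_spline

num_pts, num_func = 3000, 100
x = np.linspace(0.0, 100.0, num_pts)
ylist = np.column_stack([x**ind + x - 3.0 for ind in range(num_func)])

# One cubic B-spline interpolant over the shared knots; the trailing axis of
# the coefficient array indexes the functions, so a single call returns all
# num_func values at once.
spl = make_interp_spline(x, ylist, k=3, axis=0)
out = spl(55.0)            # shape (num_func,)

Whether this beats a fully compiled lookup depends on the problem, but it at least removes the per-function Python-call overhead of looping over splev.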

Time delay estimation between two audio signals

I have two audio recordings of the same signal made by two different microphones (for example, in WAV format), but one of them is recorded with a delay of, say, several seconds.
It's easy to identify such a delay visually when viewing the signals in some kind of waveform viewer - i.e. just spotting the first visible peak in each signal and checking that they have the same shape:
[waveform screenshot of the two signals showing the delay t; source: greycat.ru]
But how do I do it programmatically - find out what this delay (t) is? The two digitized signals are slightly different (because the microphones differ, were at different positions, have different ADC setups, etc.).
I've dug around a bit and found out that this problem is usually called "time-delay estimation" and that there are myriad approaches to it - for example, one of them.
But are there any simple and ready-made solutions, such as command-line utility, library or straight-forward algorithm available?
Conclusion: I found no simple ready-made implementation, so I made a simple command-line utility myself - available at https://bitbucket.org/GreyCat/calc-sound-delay (GPLv3-licensed). It implements the very simple search-for-maximum algorithm described on Wikipedia.
The technique you're looking for is called cross-correlation. It's a very simple, if somewhat compute-intensive, technique that can be used to solve various problems, including measuring the time difference (aka lag) between two similar signals (the signals do not need to be identical).
If you have a reasonable idea of your lag value (or at least the range of lag values that are expected) then you can reduce the total amount of computation considerably. Ditto if you can put a definite limit on how much accuracy you need.
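A minimal sketch with scipy (the file names and the mono assumption are illustrative; the lag convention follows scipy.signal.correlate with mode='full'):

import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

# "a.wav" and "b.wav" are placeholder names for the two recordings (assumed mono)
rate_a, a = wavfile.read("a.wav")
rate_b, b = wavfile.read("b.wav")
assert rate_a == rate_b

a = a.astype(float)
b = b.astype(float)

# Full cross-correlation; the index of the maximum gives the lag in samples
corr = correlate(a, b, mode="full")
lag_samples = np.argmax(corr) - (len(b) - 1)

# Positive lag: the common signal appears lag_samples later in a than in b
print(lag_samples / rate_a, "seconds")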
Having had the same problem, and having had no success finding a tool to sync the start of video/audio recordings automatically, I decided to make syncstart (on GitHub).
It is a command-line tool. The basic code behind it is this:
import numpy as np
from scipy import fft
from scipy.io import wavfile

# in1, in2 are the paths to the two WAV files
r1, s1 = wavfile.read(in1)
r2, s2 = wavfile.read(in2)
assert r1 == r2, "syncstart normalizes using ffmpeg"
fs = r1

ls1 = len(s1)
ls2 = len(s2)
# Zero-pad both signals to the next power of two beyond their combined length
padsize = ls1 + ls2 + 1
padsize = 2**(int(np.log(padsize)/np.log(2)) + 1)
s1pad = np.zeros(padsize)
s1pad[:ls1] = s1
s2pad = np.zeros(padsize)
s2pad[:ls2] = s2

# Circular cross-correlation via FFT; the peak position gives the lag,
# with peaks past the midpoint wrapping around to negative lags
corr = fft.ifft(fft.fft(s1pad)*np.conj(fft.fft(s2pad)))
ca = np.absolute(corr)
xmax = np.argmax(ca)

# file with the extra leading audio and the amount to trim, in seconds
if xmax > padsize // 2:
    file, offset = in2, (padsize - xmax) / fs
else:
    file, offset = in1, xmax / fs
A very straightforward thing to do is just to check whether the peaks exceed some threshold; the time between the high peak on line A and the high peak on line B is probably your delay. Try tinkering a bit with the thresholds, and if the graphs are usually as clear as the picture you posted, you should be fine.
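A sketch of that threshold idea (the file names and the threshold value are made up; assumes mono WAV files):

import numpy as np
from scipy.io import wavfile

def first_loud_sample(path, threshold):
    rate, sig = wavfile.read(path)                           # assumes a mono file
    idx = np.argmax(np.abs(sig.astype(float)) > threshold)   # first sample above threshold (0 if none)
    return idx / rate

# The delay is the difference between when each recording first gets loud
delay = first_loud_sample("b.wav", 2000.0) - first_loud_sample("a.wav", 2000.0)
print(delay, "seconds")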

Best way to calculate the result of a formula?

I currently have an application which can contain hundreds of user-defined formulae. Currently, I use reverse Polish notation to perform the calculations (pushing values and variables onto a stack, then popping them off and evaluating). What would be the best way to start parallelizing this process? Should I be looking at a functional language?
The calculations are performed on arrays of numbers, so, for example, a simple A+B could actually mean hundreds of additions. I'm currently using Delphi, but this is not a requirement going forward; I'll use the tool most suited to the job. Formulae may also depend on each other, so we may have one formula C=A+B and a second one D=C+A, for example.
Let's assume your formulae (equations) are not cyclic, as otherwise you cannot "just" evaluate them. If you have vectorized equations like A = B + C where A, B and C are arrays, let's conceptually split them into equations on the components, so that if the array size is 5, this equation is split into
a1 = b1 + c1
a2 = b2 + c2
...
a5 = b5 + c5
Now assuming this, you have a large set of equations on simple quantities (whether integer, rational or something else).
If you have two equations E and F, let's say that F depends_on E if the right-hand side of F mentions the left-hand side of E, for example
E: a = b + c
F: q = 2*a + y
Now, to get toward how to calculate this, you could always use randomized iteration (this is just an intermediate step in the explanation), following this algorithm:
1 while (there is at least one equation which has not been computed yet)
2     select one such pending equation E so that:
3         for every equation D such that E depends_on D:
4             D has already been computed
5     calculate the left-hand side of E
This process terminates with the correct answer regardless of how you make your selections on line 2. Now the cool thing is that it also parallelizes easily: you can run it in an arbitrary number of threads! What you need is a concurrency-safe queue which holds the equations whose prerequisites (the equations they depend on) have been computed but which have not been computed themselves yet. Every thread pops one equation from this queue at a time (thread-safely), calculates the answer, checks whether any new equations now have all their prerequisites computed, and adds those equations (thread-safely) to the work queue. Done.
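A minimal Python sketch of that work-queue scheme (the formulas, input values, and thread count are illustrative placeholders):

import queue
import threading

# Hypothetical dependency-annotated formulae: name -> (function over known values, dependencies)
formulas = {
    "C": (lambda v: v["A"] + v["B"], ("A", "B")),
    "D": (lambda v: v["C"] + v["A"], ("C", "A")),
}
values = {"A": 1.0, "B": 2.0}          # known input quantities
lock = threading.Lock()

# Count unsatisfied prerequisites per formula and index who depends on what
pending = {n: sum(d not in values for d in deps) for n, (_, deps) in formulas.items()}
dependents = {}
for n, (_, deps) in formulas.items():
    for d in deps:
        dependents.setdefault(d, []).append(n)

ready = queue.Queue()
for n, count in pending.items():
    if count == 0:
        ready.put(n)

def worker():
    while True:
        try:
            name = ready.get_nowait()    # a thread exits when it finds no ready work;
        except queue.Empty:              # the thread that releases new work picks it up itself
            return
        func, _ = formulas[name]
        result = func(values)
        with lock:
            values[name] = result
            for other in dependents.get(name, []):
                pending[other] -= 1
                if pending[other] == 0:  # all prerequisites computed: schedule it
                    ready.put(other)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(values)    # {'A': 1.0, 'B': 2.0, 'C': 3.0, 'D': 4.0}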
Without knowing more, I would suggest taking a SIMD-style approach if possible. That is, create threads that each compute all formulas for a single data set. Trying to divide the computation of individual formulas to parallelise them wouldn't yield much speed improvement: the logic required to split the computations into discrete units suitable for threading would be hard to write and harder to get right, and the overhead would cancel out any speed gains. It would also suffer quickly from diminishing returns.
Now, if you've got a set of formulas that is applied to many sets of data, the parallelisation becomes easier and scales better. Each thread does all the computations for one set of data. Create one thread per CPU core and set each thread's affinity to its core. Each thread instantiates one instance of the formula-evaluation code. Create a supervisor which loads a single data set and passes it to an idle thread; if no threads are idle, it waits for the first thread to finish processing its data. When all data sets are processed and all threads have finished, exit. With this method there's no advantage to having more threads than CPU cores, as thread switching is slow and has a negative effect on overall speed.
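A sketch of that per-data-set scheme (using a process pool rather than threads, since that is how you get one truly parallel worker per core in this Python sketch; the formulas and data are placeholders):

from concurrent.futures import ProcessPoolExecutor
import os

def evaluate_formulas(data_set):
    # Placeholder for the RPN / formula evaluation over one data set
    a, b = data_set["A"], data_set["B"]
    c = [x + y for x, y in zip(a, b)]          # C = A + B
    d = [x + y for x, y in zip(c, a)]          # D = C + A
    return {"C": c, "D": d}

if __name__ == "__main__":
    data_sets = [{"A": [1.0, 2.0], "B": [3.0, 4.0]} for _ in range(100)]
    # One worker per core; each worker handles whole data sets independently
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        results = list(pool.map(evaluate_formulas, data_sets))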
If you've only got one data set, then it is not a trivial task. It would require parsing the evaluation tree for branches without dependencies on other branches, farming those branches out to separate threads running on each core, and waiting for the results. You then get problems synchronizing the data and ensuring data coherency.
