I use OMNeT++-4.6, sumo-0.22.0 and Veins-4a2.
I am interested to calculate the speed of the vehicle when a message is received. I used getSpeed() function to do it. But the problem is that when I calculated manually the speed basing on the time and the distance (using the formula s = d / t), the value is different.
For example, at t= 55.104470531278 s and the distance d= 29.0477 m, the speed obtained by calling the function getSpeed() is s= 3.34862 m/s = 10.8 km/h.
On the other hand the one calculated manually is s= 0.52713 m/s = 1.9 km/h.
I need help to understand why the value obtained by using getSpeed() is different please.
getSpeed() returns the current speed of the vehicle (to be precise the one in the last simulation step which is by default 1s) while your calculation gives the average speed over the last ~55s (assuming your simulation started at time 0).
Related
We have run an Interrupted Time Series - Poisson model on some count data.
The p-value at the point of interevntion (level change) is <0.05. We are however reproting the level change (i.e. difference between modeled data and counterfactual at time t + 8).
How would I go about deriving a separate p-value for this specific time point? And would it be different to the original level change.
Using R:
Model coded as below:
fit1a <- glm(`Subject Total` ~ Quarter + int2 + time_since_intervention2 , df, family = "poisson")
By "arbitrary" I mean that I don't have a signal sampled on a grid that is amenable to taking an FFT. I just have points (e.g. in time) where events happened, and I'd like an estimate of the rate, for example:
p = [0, 1.1, 1.9, 3, 3.9, 6.1 ...]
...could be hits from a process with a nominal periodicity (repetition interval) of 1.0, but with noise and some missed detections.
Are there well known methods for processing such data?
A least square algorithm may do the trick, if correctly initialized. A clustering method can be applied to this end.
As an FFT is performed, the signal is depicted as a sum of sine waves. The amplitude of the frequencies may be depicted as resulting from a least square fit on the signal. Hence, if the signal is unevenly sampled, resolving the same least square problem may make sense if the Fourier transform is to be estimated. If applied to a evenly sampled signal, it boils down to the same result.
As your signal is descrete, you may want to fit it as a sum of Dirac combs. It seems more sound to minimize the sum of squared distance to the nearest Dirac of the Dirac comb. This is a non-linear optimization problem where Dirac combs are described by their period and offset. This non-linear least-square problem can be solved by mean of the Levenberg-Marquardt algorithm. Here is an python example making use of the scipy.optimize.leastsq() function. Moreover, the error on the estimated period and offset can be estimated as depicted in How to compute standard deviation errors with scipy.optimize.least_squares . It is also documented in the documentation of curve_fit() and Getting standard errors on fitted parameters using the optimize.leastsq method in python
Nevertheless, half the period, or the thrid of the period, ..., would also fit, and multiples of the period are local minima that are to be avoided by a refining the initialization of the Levenberg-Marquardt algorithm. To this end, the differences between times of events can be clustered, the cluster featuring the smallest value being that of the expected period. As proposed in Clustering values by their proximity in python (machine learning?) , the clustering function sklearn.cluster.MeanShift() is applied.
Notice that the procedure can be extended to multidimentionnal data to look for periodic patterns or mixed periodic patterns featuring different fundamental periods.
import numpy as np
from scipy.optimize import least_squares
from scipy.optimize import leastsq
from sklearn.cluster import MeanShift, estimate_bandwidth
ticks=[0,1.1,1.9,3,3.9,6.1]
import scipy
print scipy.__version__
def crudeEstimate():
# loooking for the period by looking at differences between values :
diffs=np.zeros(((len(ticks))*(len(ticks)-1))/2)
k=0
for i in range(len(ticks)):
for j in range(i):
diffs[k]=ticks[i]-ticks[j]
k=k+1
#see https://stackoverflow.com/questions/18364026/clustering-values-by-their-proximity-in-python-machine-learning
X = np.array(zip(diffs,np.zeros(len(diffs))), dtype=np.float)
bandwidth = estimate_bandwidth(X, quantile=1.0/len(ticks))
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit(X)
labels = ms.labels_
cluster_centers = ms.cluster_centers_
print cluster_centers
labels_unique = np.unique(labels)
n_clusters_ = len(labels_unique)
for k in range(n_clusters_):
my_members = labels == k
print "cluster {0}: {1}".format(k, X[my_members, 0])
estimated_period=np.min(cluster_centers[:,0])
return estimated_period
def disttoDiracComb(x):
residual=np.zeros((len(ticks)))
for i in range(len(ticks)):
mindist=np.inf
for j in range(len(x)/2):
offset=x[2*j+1]
period=x[2*j]
#print period, offset
index=np.floor((ticks[i]-offset)/period)
#print 'index', index
currdist=ticks[i]-(index*period+offset)
if currdist>0.5*period:
currdist=period-currdist
index=index+1
#print 'event at ',ticks[i], 'not far from index ',index, '(', currdist, ')'
#currdist=currdist*currdist
#print currdist
if currdist<mindist:
mindist=currdist
residual[i]=mindist
#residual=residual-period*period
#print x, residual
return residual
estimated_period=crudeEstimate()
print 'crude estimate by clustering :',estimated_period
xp=np.array([estimated_period,0.0])
#res_1 = least_squares(disttoDiracComb, xp,method='lm',xtol=1e-15,verbose=1)
p,pcov,infodict,mesg,ier=leastsq(disttoDiracComb, x0=xp,ftol=1e-18, full_output=True)
#print ' p is ',p, 'covariance is ', pcov
# see https://stackoverflow.com/questions/14581358/getting-standard-errors-on-fitted-parameters-using-the-optimize-leastsq-method-i
s_sq = (disttoDiracComb(p)**2).sum()/(len(ticks)-len(p))
pcov=pcov *s_sq
perr = np.sqrt(np.diag(pcov))
#print 'estimated standard deviation on parameter :' , perr
print 'estimated period is ', p[0],' +/- ', 1.96*perr[0]
print 'estimated offset is ', p[1],' +/- ', 1.96*perr[1]
Applied to your sample, it prints :
crude estimate by clustering : 0.975
estimated period is 1.0042857141346768 +/- 0.04035792507868619
estimated offset is -0.011428571139828817 +/- 0.13385206912205957
It sounds like you need to decide what exactly you want to determine. If you want to know the average interval in a set of timestamps, then that's easy (just take the mean or median).
If you expect that the interval could be changing, then you need to have some idea about how fast it is changing. Then you can find a windowed moving average. You need to have an idea of how fast it is changing so that you can select your window size appropriately - a larger window will give you a smoother result, but a smaller window will be more responsive to a faster-changing rate.
If you have no idea whether the data is following any sort of pattern, then you are probably in the territory of data exploration. In that case, I would start by plotting the intervals, to see if a pattern appears to the eye. This might also benefit from applying a moving average if the data is quite noisy.
Essentially, whether or not there is something in the data and what it means is up to you and your knowledge of the domain. That is, in any set of timestamps there will be an average (and you can also easily calculate the variance to give an indication of variability in the data), but it is up to you whether that average carries any meaning.
My application requires to sample a sensor and send the samples out in real time, without saving the samples.
I am required to run a low pass filter on the samples, but here is the problem cause if I want to do it on 20 samples for example then I will not be able to send out the samples each after I got it as required, and when I send them immediately each after I got it I use an ongoing average on 10 samples but that wasn't enough.
what low pass filter algorithm is the best for this application?
what is the accepted way to do this ?
Try the Exponential, or First-order filter:
Y[n] = dY[n-1] + (1-d)X[n]
Y[n] is the current output
Y[n-1] is the previous output
X[n] is the current input
d is the damping factor
The damping factor is a number between 0 and 1. If d = 0, the output is just equal to the input, and no filtering (smoothing) takes place. The closer d is to 1, the greater the amount of filtering (smoothing).
Here is an example: The input signal is a Sine function with amplitude +- 100, added to that is random noise of +-10. To filter out the noise we run the data through the filter equation first with d = 0.5 and then d = 0.8
More details with code and second and third order filters available here >> ST Community- answer
I am using Newton Raphson +successive Substitute algorithm to perform flash calculation(chemical process simulation).
The algorithm can converge well when the input is in low precision like 0.1, but when the number precision is increased to 0.11111 or 0.99999. The algorithm will not converge.
When I am using the quasi newton method with BFGS update, the same problem occurs again. How can we decrease the sensitivity of the code to the numerical precision?
Here is a simple example using matlab to solve the Rachford-Rice equation. When the comp_overall=[0.9,1-0.9], it converges well. However, when the number precision increase to like[0.99999,1-0.99999]. It will not converge.
K=[0.053154011443159 34.234731216532658],
comp_overall= [0.99999 1- 0.99999], phi=0.5; %initial values
epsilon = 1.0;
iter1 = 1;
while (epsilon >=1.e-05)
rc=0.0;
drc=0.0;
for i=1:2
% Rachford-Rice Equation
rc = comp_overall(i)*(K(i)-1.0)/(1.0+phi*(K(i)-1.0))+rc;
% Derivative
drc = comp_overall(i)*(K(i)-1.0)^2/(1.0+phiK(i)-1.0))^2+drc;
end
% Newton-Raphson
phi1 = phi +0.01 (rc / drc);
epsilon = abs( (phi1-phi)/phi );
% Convergence
phi = phi1;
iter1=iter1+1;
end
The Newton–Raphson method relies on the function being differentiable between any two consecutive approximations. Depending on the choice of the initial value, this may not be the case for z₁ = 0.99999. Let's look at the graph of the Rachford-Rice function:
The root of this function is φ₀ ≈ –0.0300781429 and the nearest point of discontinuity is –1/(K₂-1) ≈ –0.0300890052. They are close enough for the Newton–Raphson method to overshoot, to jump over that discontinuity.
For example:
φ₁ = –0.025
f(φ₁) ≈ -0.9229770571
f'(φ₁) ≈ 1.2416569960
φ₂ = φ₁ + 0.01 * f(φ₁) / f'(φ₁) ≈ -0.0324334302
φ₂ lies to the left of the discontinuity, so the following steps will be away from, not towards the root.
φ₃ = -0.0358986759 < φ₂
What can be done about it:
When the algorithm fails to converge, repeat it with smaller steps. For example, start with the coefficient 0.01 (as it is now) and decrease it 10 times after every failure.
Detect overshoots. On each iteration check if there is a discontinuity point (–1/(Kᵢ-1)) between the current approximation and the previous one. When it happens, discard the current approximation, decrease the coefficient and continue.
Limit the scope of the search. Are solutions outside of [0, 1] physically meaningful? If not, you can stop once the approximated value falls out of that range.
Use different method. The function is monotonic on any interval between two consecutive discontinuity points, so you can perform binary search on each such interval. It will be both faster and more robust than the Newton–Raphson method.
For a current project, I have to discretize quasi-continuous values into bins defined by some pre-defined binning resolution. For this purpose, I have written a function, which I expected to be highly efficient as it is able to both process scalar inputs as well as vector inputs using bsxfun. However, after some profiling, I found out that almost all processing time of my much larger project is produced in this function, and within the function, it's mainly the bsxfun part that takes time, with the min-query following on second place. Long story short, I am looking for advice on how to solve this task MUCH faster in MATLAB. Side note: I am usually passing vectors with some 50k elements.
Here's the code:
function sampleNo = value2sample(value,bins)
%Make sure both vectors have orientations fitting bsxfun
value = value(:);
bins = bins(:)';
%Recover bin resolution (avoids passing another parameter)
delta = median(diff(bins));
%Calculate distance matrix between all combinations
dist = abs(bsxfun(#minus,value,bins));
%What we really want to know is the minimum distance per row
[minval,ind] = min(dist,[],2);
%Make sure we don't accidentally further process NaNs as 1st bin
ind(isnan(minval))=NaN;
sampleNo = ind;
sampleNo(minval>delta) = NaN;
end
The reason that your function is slow is because you are computing the distance between every element of values and bins and storing them all in an array - if there are N values and M bins then you will require NM elements to store all the distances, and this is probably a really big number (e.g. if each input has 50,000 elements then you need 2.5 billion elements in the output array).
Moreover, since your bins are sorted (you didn't state this, but it looks like you are assuming it in your code) you do not need to compute the distance from every value to every bin. You can be much smarter,
function ind = value2sample(value, bins)
% Find median bin distance
delta = median(diff(bins));
% Bucket into 'nearest' bin by using midpoints
bins = bins(:);
mids = [-Inf; 0.5 * (bins(1:end-1) + bins(2:end))];
[~, ind] = histc(value, mids);
% Ensure that NaN values and points that aren't near any bin are returned as NaN
ind(isnan(value)) = NaN;
ind(abs(value - bins(ind)) > delta) = NaN;
end
In my tests, with values = randn(10000, 1) and bins = -50:50 it takes around 4.5 milliseconds to run the original function, and 485 microseconds to run the code above, so you are getting around a 10x speedup (and the speedup will be even greater as you increase the size of the inputs).
Thanks to #Chris Taylor, I was able to solve the problem very efficiently. The code now runs almost 400 times faster than before. The only changes I had to make from his version are reflected in the code below. Main issue was to replace histc (whose use is not encouraged anymore) by discretize.
function ind = value2sample(value, bins)
% Make sure the vectors are standing
value = value(:);
bins = bins(:);
% Bucket into 'nearest' bin by using midpoints
mids = [eps; 0.5 * (bins(1:end-1) + bins(2:end))];
ind = discretize(value, mids);
The only thing is, that in this implementation your bins must be non-negative. Other than that, this code does exactly what I want, including the fact that ind has the same size as value and contains NaNs whenever a value is NaN or out of the range of bins.