Creating sliding windows of NaN-padded elements from a 1D NumPy array - performance

I have a time series x[0], x[1], ..., x[n-1], stored as a one-dimensional NumPy array. I would like to convert it to the following matrix:
NaN,  ... , NaN,  x[0]
NaN,  ... , x[0], x[1]
.
.
NaN,  x[0], ... , x[n-3], x[n-2]
x[0], x[1], ... , x[n-2], x[n-1]
I would like to use this matrix to speed up time-series calculations. Is there a function in NumPy or SciPy to do this? (I don't want to use a Python for loop to do it.)

One approach with np.lib.stride_tricks.as_strided -
import numpy as np

def nanpad_sliding2D(a):
    L = a.size
    a_ext = np.concatenate((np.full(a.size - 1, np.nan), a))
    n = a_ext.strides[0]
    strided = np.lib.stride_tricks.as_strided
    return strided(a_ext, shape=(L, L), strides=(n, n))
Sample run -
In [41]: a
Out[41]: array([48, 82, 96, 34, 93, 25, 51, 26])
In [42]: nanpad_sliding2D(a)
Out[42]:
array([[ nan,  nan,  nan,  nan,  nan,  nan,  nan,  48.],
       [ nan,  nan,  nan,  nan,  nan,  nan,  48.,  82.],
       [ nan,  nan,  nan,  nan,  nan,  48.,  82.,  96.],
       [ nan,  nan,  nan,  nan,  48.,  82.,  96.,  34.],
       [ nan,  nan,  nan,  48.,  82.,  96.,  34.,  93.],
       [ nan,  nan,  48.,  82.,  96.,  34.,  93.,  25.],
       [ nan,  48.,  82.,  96.,  34.,  93.,  25.,  51.],
       [ 48.,  82.,  96.,  34.,  93.,  25.,  51.,  26.]])
Memory efficiency with strides
As mentioned in the comments by @Eric, this strides-based approach is memory efficient, as the output is simply a view into the NaN-padded 1D version. Let's test this out -
In [158]: a # Sample 1D input
Out[158]: array([37, 95, 87, 10, 35])
In [159]: L = a.size # Run the posted approach
...: a_ext = np.concatenate(( np.full(a.size-1,np.nan) ,a))
...: n = a_ext.strides[0]
...: strided = np.lib.stride_tricks.as_strided
...: out = strided(a_ext, shape=(L,L), strides=(n,n))
...:
In [160]: np.may_share_memory(a_ext,out)  # output might be a view into the extended version
Out[160]: True
Let's confirm that the output is indeed a view by assigning values into a_ext and then checking out.
Initial values of a_ext and out :
In [161]: a_ext
Out[161]: array([ nan, nan, nan, nan, 37., 95., 87., 10., 35.])
In [162]: out
Out[162]:
array([[ nan,  nan,  nan,  nan,  37.],
       [ nan,  nan,  nan,  37.,  95.],
       [ nan,  nan,  37.,  95.,  87.],
       [ nan,  37.,  95.,  87.,  10.],
       [ 37.,  95.,  87.,  10.,  35.]])
Modify a_ext :
In [163]: a_ext[:] = 100
See the new out :
In [164]: out
Out[164]:
array([[ 100.,  100.,  100.,  100.,  100.],
       [ 100.,  100.,  100.,  100.,  100.],
       [ 100.,  100.,  100.,  100.,  100.],
       [ 100.,  100.,  100.,  100.,  100.],
       [ 100.,  100.,  100.,  100.,  100.]])
Confirms that it's a view.
Finally, let's test out the memory requirements :
In [131]: a_ext.nbytes
Out[131]: 72
In [132]: out.nbytes
Out[132]: 200
So, even though out.nbytes reports 200 bytes (nbytes is computed from the shape and itemsize and ignores the overlapping strides), the output actually occupies just the 72 bytes of the extended array, since it is only a view into it.
One more approach with SciPy's toeplitz -
from scipy.linalg import toeplitz
out = toeplitz(a, np.full(a.size, np.nan))[:, ::-1]
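Note that toeplitz builds a full copy rather than a view. On NumPy 1.20 or newer (an assumption about your environment), np.lib.stride_tricks.sliding_window_view gives the same NaN-padded view without computing strides by hand; a minimal sketch, using a hypothetical nanpad_sliding2D_v2 name -
import numpy as np

def nanpad_sliding2D_v2(a):
    # Same extended array as before: n-1 leading NaNs followed by the data.
    a_ext = np.concatenate((np.full(a.size - 1, np.nan), a))
    # Row i is the length-n window ending at a[i]; this is a view into a_ext.
    return np.lib.stride_tricks.sliding_window_view(a_ext, a.size)
The returned view is read-only by default, which also guards against accidentally writing through the overlapping rows.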

Related

Creating 1D vectors over 3D tensors in pytorch

I have the following tensor with dimensions (2, 3, 2, 2) where the dimensions represent (batch_size, channels, height, width):
tensor([[[[ 1.,  2.],
          [ 3.,  4.]],
         [[ 5.,  6.],
          [ 7.,  8.]],
         [[ 9., 10.],
          [11., 12.]]],
        [[[13., 14.],
          [15., 16.]],
         [[17., 18.],
          [19., 20.]],
         [[21., 22.],
          [23., 24.]]]])
I would like to convert this into the following tensor with dimensions (8, 3):
tensor([[ 1,  5,  9],
        [ 2,  6, 10],
        [ 3,  7, 11],
        [ 4,  8, 12],
        [13, 17, 21],
        [14, 18, 22],
        [15, 19, 23],
        [16, 20, 24]])
Essentially I would like to create a 1D vector over the elements of the matrices. I have tried many operations such as flatten and reshape, but I cannot figure out how to achieve this reshaping.
You can do it this way:
import torch
x = torch.Tensor(
    [
        [
            [[1, 2], [3, 4]],
            [[5, 6], [7, 8]],
            [[9, 10], [11, 12]]],
        [
            [[13, 14], [15, 16]],
            [[17, 18], [19, 20]],
            [[21, 22], [23, 24]]]
    ]
)
result = x.swapaxes(0, 1).reshape(3, -1).T
print(result)
# > tensor([[ 1.,  5.,  9.],
# >         [ 2.,  6., 10.],
# >         [ 3.,  7., 11.],
# >         [ 4.,  8., 12.],
# >         [13., 17., 21.],
# >         [14., 18., 22.],
# >         [15., 19., 23.],
# >         [16., 20., 24.]])
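To see why this works, here is a sketch of the intermediate shapes (the names y and z are just for illustration):
y = x.swapaxes(0, 1)   # (3, 2, 2, 2): channel axis moved to the front
z = y.reshape(3, -1)   # (3, 8): each row is one channel, flattened over batch, height, width
result = z.T           # (8, 3): one row per (batch, height, width) position, one column per channel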
You could achieve this with an axes permutation followed by a flattening of the resulting tensor:
swap axis=1 (of size 3) with the last one, axis=-1, using torch.transpose (torch.swapaxes is an alias for it),
flatten everything but the last axis, i.e. from axis=0 to axis=-2, using torch.flatten.
This looks like:
>>> x.transpose(1, -1).flatten(0, -2)
tensor([[ 1.,  5.,  9.],
        [ 3.,  7., 11.],
        [ 2.,  6., 10.],
        [ 4.,  8., 12.],
        [13., 17., 21.],
        [15., 19., 23.],
        [14., 18., 22.],
        [16., 20., 24.]])
Note that the rows come out in a different order than the requested output, because swapping the channel axis with the width axis (rather than moving it past the height axis) changes the traversal order; the swapaxes(0, 1).reshape(3, -1).T approach above preserves the requested ordering.
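If the exact row order from the question matters, a full permutation that moves the channel axis last (instead of swapping it with the width axis) reproduces it; a minimal sketch, assuming the same x as above:
result = x.permute(0, 2, 3, 1).reshape(-1, 3)   # (batch, height, width, channels) -> (8, 3)
# rows come out as [1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12], [13, 17, 21], ...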

How do I screen out NaNs when comparing floats in Ruby?

I'm using Rails 5 with Ruby 2.4. How do I determine if a variable is a number? I thought is_a?(Numeric) was the way to go, but apparently it isn't working for me. I have this code
puts "guesses arr: #{guesses_arr}"
guesses_arr.map!{|str| str.is_a?(Numeric) && str == guesses_arr.max ? str : 0}
which dies with
guesses arr: [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN]
ArgumentError: comparison of Float with NaN failed
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `max'
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `block (2 levels) in guess_headers_from_data_cols'
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `map!'
I'm unable to reproduce this in a console because I can't seem to produce an array with all those "NaN" values. The point is, how do I make sure those don't get compared in my loop?
You can produce an array of NaN values in the console like so:
[ 0.0 / 0.0 ] * 3
# => [NaN, NaN, NaN]
It's not entirely clear what your code is doing, but it might be worth noting that it's the max which is throwing the exception, not the equality test. Incidentally, if you are trying to compare each array value against the maximum value, you might want to compute the max outside of the map, to avoid evaluating it multiple times.
Building on top of the other answer here on rejecting NaNs, you can put it all together to get something like:
# set up dummy data - 2 values and 3 NaNs
guesses_arr = [ 0.0 / 0.0 ] * 3 + [ 2.0, 3.0 ]
puts "guesses arr: #{guesses_arr}"
# => [NaN, NaN, NaN, 2.0, 3.0]
# remove NaNs in place
guesses_arr.reject!(&:nan?)
maximum = guesses_arr.max
guesses_arr.map!{|str| str.is_a?(Numeric) && str == maximum ? str : 0}
# => [ 0, 3.0 ]
Edit: if you have integers mixed up in the array, try converting them into floats first:
# set up dummy data - including floats and integers
guesses_arr = [ 0.0 / 0.0 ] * 3 + [ 2.0, 3.0, 0 ]
guesses_arr.map!(&:to_f).reject!(&:nan?)
# => [ 2.0, 3.0, 0.0 ]
You can get NaN via the Float::NAN constant:
[7] pry(main)> Float::NAN
=> NaN
[8] pry(main)> Float::NAN.nan?
=> true
If you just want to remove NaN:
guesses_arr.reject! { |x| x.nan? }
For example:
[4] pry(main)> [1.0, Float::NAN].reject(&:nan?)
=> [1.0]
There are a couple of things that we can do. To answer the specific question,
arr = [1.0, Float::NAN, "1.0", "1"]
arr.map(&:to_f).reject(&:nan?)
would be a concise way of removing NaNs.
With a guess that you are trying to get a max value from inconsistent data, we first need to clean it up with a method like the following, which converts a string or an integer to a float, or otherwise returns nil.
def to_number(maybe_a_number)
  Float(maybe_a_number)
rescue ArgumentError
  nil
end
Assuming we have the above method:
guesses_arr.map { |g| to_number(g) }.compact.reject(&:nan?).max # or .min
That being said, the reason it's been hard to debug is that the .map! causes guesses_arr to change as you iterate over it. (Generally, it's worth being careful with the standard library's bang methods, since errors caused by in-place mutation can be hard to debug.)

Not able to remove NaN from ruby array

I have an array of values
=> [0.0, 4.76, 0.0, Infinity, NaN, 2.63, 0.74, 10.0, NaN, NaN, NaN, NaN, 0.0, NaN, NaN, NaN, NaN, NaN, Infinity, 5.26, NaN, 0.0, NaN, 3.45, 2.5, NaN, 10.0, 0.0, NaN, 2.94, NaN, NaN, 0.0, 2.04, 0.0, 11.11, NaN, NaN, 1.23, NaN, NaN, 11.11, NaN, NaN, NaN, 0.0, 9.68, NaN, NaN, 10.0, 5.0, 3.7, 10.0, Infinity, 0.0, 0.0, 1.41, NaN, 3.45, NaN]
When I run this script to remove NaN's it removes some but not all NaN's.
def remove_from_array(numArray)
  numArray.inject(0) do |i|
    if numArray[i].nan?
      numArray.delete_at(i)
    end
    i += 1
  end
  numArray
end
What am I missing?
If you delete an item and move to the next index, you’re moving two items ahead, because the item at the current index no longer exists.
Luckily, there’s a better way, using Array#reject!:
numArray.reject!(&:nan?)

Matrix derivative doesn't get evaluated

I'm trying to evaluate the partial derivative of the most general 3D rotation matrix, like this:
import sympy
from sympy import cos, sin, diff

phi, psi, theta = sympy.symbols("phi, psi, theta")
RMatrixPhi = sympy.Matrix([[cos(phi), sin(phi), 0],
                           [-sin(phi), cos(phi), 0],
                           [0, 0, 1]])
RMatrixPsi = sympy.Matrix([[cos(psi), 0, sin(psi)],
                           [0, 1, 0],
                           [-sin(psi), 0, cos(psi)]])
RMatrixTheta = sympy.Matrix([[1, 0, 0],
                             [0, cos(theta), sin(theta)],
                             [0, -sin(theta), cos(theta)]])
RMatrix = RMatrixPhi * RMatrixPsi * RMatrixTheta
D = diff(RMatrix, phi)
However, D is then a sympy.Derivative object, and I cannot get it evaluated; it's just printed out as Derivative(Matrix(...)).
The only way I could get it working is by writing
sympy.Matrix([sympy.diff(r, phi) for r in RMatrix]).reshape(3,3)
but that looks ugly. What's the right way to compute such derivatives?
The Matrix class has a method called diff which, according to the documentation ...
Docstring:
Calculate the derivative of each element in the matrix.
So use
RMatrix.diff(phi)
to perform element-wise derivation.
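For instance, a minimal sketch with just the phi rotation (the same call applies to the full RMatrix):
import sympy
from sympy import cos, sin

phi = sympy.symbols("phi")
RMatrixPhi = sympy.Matrix([[cos(phi), sin(phi), 0],
                           [-sin(phi), cos(phi), 0],
                           [0, 0, 1]])
D = RMatrixPhi.diff(phi)   # element-wise derivative, returned as a plain Matrix
# e.g. D[0, 0] == -sin(phi) and D[0, 1] == cos(phi)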

Rounding errors in Ruby matrix implementation

I'm doing a bit o' matrix algebra in Ruby. When testing the results, I'm seeing what I can only assume is a rounding error.
All I'm doing is multiplying 3 matrices, but the values are fairly small:
c_xy:
[0.9702957262759965, 0.012661213742314235, -0.24159035004964077]
[0, 0.9986295347545738, 0.05233595624294383]
[0.24192189559966773, -0.050781354673095955, 0.9689659697053497]
i2k = Matrix[[8.1144E-06, 0.0, 0.0],
[0.0, 8.1144E-06, 0.0],
[0.0, 0.0, 8.1144E-06]]
c_yx:
[0.9702957262759965, 0, 0.24192189559966773]
[0.012661213742314235, 0.9986295347545738, -0.050781354673095955]
[-0.24159035004964077, 0.05233595624294383, 0.9689659697053497]
What I'm trying to do is c_xy * i2k * c_yx. Here's what I expect (this was done in Excel):
8.1144E-06    0             2.11758E-22
0             8.1144E-06    0
2.11758E-22   -5.29396E-23  8.1144E-06
And what I get:
[8.1144e-06, 1.3234889800848443e-23, 6.352747104407253e-22]
[0.0, 8.114399999999998e-06, -5.293955920339377e-23]
[2.117582368135751e-22, 0.0, 8.1144e-06]
As you can see, the first column matches, as does the diagonal. But then (in r,c indexing) (0,1) is wrong (though close to 0), (0,2) is very wrong, and (1,2) and (2,1) seem to be transposed. I thought it had something to do with the 8.1144e-6 value, and tried wrapping it in a BigDecimal to no avail.
Any ideas on places I can look? I'm using the standard Ruby Matrix library.
Edit: here's the code.
require 'matrix'

phi1 = 0.24434609527920614
phi2 = 0.05235987755982988
i2k = Matrix[[8.1144E-06, 0.0, 0.0],
             [0.0, 8.1144E-06, 0.0],
             [0.0, 0.0, 8.1144E-06]]
c_x = Matrix[[1, 0, 0],
             [0, Math.cos(phi2), Math.sin(phi2)],
             [0, -Math.sin(phi2), Math.cos(phi2)]]
c_y = Matrix[[Math.cos(phi1), 0, -Math.sin(phi1)],
             [0, 1, 0],
             [Math.sin(phi1), 0, Math.cos(phi1)]]
c_xy = c_y * c_x
c_yx = c_xy.transpose
c_xy * i2k * c_yx
i2k is equal to the identity matrix times 8.1144E-06. This simplifies the answer to:
c_xy * i2k * c_yx = 8.1144E-06 * c_xy * c_yx
However, c_yx = c_xy.transpose, and c_xy is a rotation matrix; the transpose of a rotation matrix is its inverse. So c_xy * c_yx is the identity matrix, and thus the exact answer is 8.1144E-06 times the identity matrix.
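In symbols, writing s = 8.1144E-06 and C for c_xy, the whole product collapses to a scaled identity:
C * (s*I) * C^T = s * (C * C^T) = s * I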
Here is one way to calculate c_xy * c_yx without using the matrix algebra a priori:
require 'matrix'
require 'pp'

phi1 = 14 * Math::PI/180
phi2 = 3 * Math::PI/180
c_x = Matrix[
  [1, 0, 0],
  [0, Math.cos(phi2), Math.sin(phi2)],
  [0, -Math.sin(phi2), Math.cos(phi2)]]
c_y = Matrix[
  [Math.cos(phi1), 0, -Math.sin(phi1)],
  [0, 1, 0],
  [Math.sin(phi1), 0, Math.cos(phi1)]]
c_xy = c_y * c_x
c_yx = c_xy.transpose
product = c_xy * c_yx
pp *product

clone = *product
puts "\nApplying EPSILON:"
product.each_with_index do |e, i, j|
  clone[i][j] = 0 if e.abs <= Float::EPSILON
end
pp clone
Output:
[1.0, 0.0, 2.7755575615628914e-17]
[0.0, 0.9999999999999999, -6.938893903907228e-18]
[2.7755575615628914e-17, -6.938893903907228e-18, 0.9999999999999999]
Applying EPSILON:
[1.0, 0, 0]
[0, 0.9999999999999999, 0]
[0, 0, 0.9999999999999999]
which one can then surmise is the identity matrix. This uses Float::EPSILON, which is about 2.220446049250313e-16, to set values whose absolute value is no larger than that to 0. These kinds of approximations are inevitable in floating-point calculations; their appropriateness must be evaluated on a case-by-case basis.
An alternative is to do symbolic computation where possible rather than numeric.
Floating point numbers have a limited precision:
puts Float::DIG # => 15
That's the number of significant decimal digits a Float can reliably carry on my, and probably your, system. The off-diagonal values around 1E-22 are roughly 16 orders of magnitude below the 8.1144E-06 entries, which is exactly the scale of rounding residue for 64-bit floats, so they should be read as zero. You could try BigDecimal for arbitrarily high precision.
