How do I screen out NaNs when comparing floats in Ruby? - ruby

I'm using Rails 5 with Ruby 2.4. How do I determine if a variable is a number? I thought is_a?(Numeric) was the way to go, but apparently that doesn't work in my case. I have this code
puts "guesses arr: #{guesses_arr}"
guesses_arr.map!{|str| str.is_a?(Numeric) && str == guesses_arr.max ? str : 0}
which dies with
guesses arr: [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN]
ArgumentError: comparison of Float with NaN failed
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `max'
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `block (2 levels) in guess_headers_from_data_cols'
from /Users/davea/Documents/workspace/myproject/app/services/text_table_to_race_time_converter_service.rb:2121:in `map!'
I'm unable to reproduce this in a console because I can't seem to produce an array with all those "NaN" values. The point is, how do I make sure those don't get compared in my loop?

You can produce an array of NaN values in the console like so:
[ 0.0 / 0.0 ] * 3
# => [NaN, NaN, NaN]
It's not entirely clear what your code is doing, but it is worth noting that it's the max call that throws the exception, not the equality test. Incidentally, if you are trying to compare each array value against the maximum value, you might want to compute the max outside of the map, to avoid evaluating it on every iteration.
Building on the other answer here about rejecting NaNs, you can put it all together to get something like:
# set up dummy data - 2 values and 3 NaNs
guesses_arr = [ 0.0 / 0.0 ] * 3 + [ 2.0, 3.0 ]
puts "guesses arr: #{guesses_arr}"
# => [NaN, NaN, NaN, 2.0, 3.0]
# remove NaNs in place
guesses_arr.reject!(&:nan?)
maximum = guesses_arr.max
guesses_arr.map!{|str| str.is_a?(Numeric) && str == maximum ? str : 0}
# => [ 0, 3.0 ]
Edit: if you have integers mixed up in the array, try converting them into floats first:
# set up dummy data - including floats and integers
guesses_arr = [ 0.0 / 0.0 ] * 3 + [ 2.0, 3.0, 0 ]
guesses_arr.map!(&:to_f).reject!(&:nan?)
# => [ 2.0, 3.0, 0.0 ]
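One caveat when chaining on reject!: it returns nil when nothing was removed, so the return value of the chained expression is not reliable on arrays that contain no NaN. A non-mutating variant avoids this. The following is only a sketch; the method name keep_only_max is hypothetical and not part of the original code.

```ruby
# Hypothetical helper: returns a new array in which every element other than
# the maximum is replaced by 0. Non-numeric values and NaNs are dropped first.
def keep_only_max(values)
  cleaned = values
            .select { |v| v.is_a?(Numeric) } # ignore nil, strings, etc.
            .map(&:to_f)                     # integers become floats
            .reject(&:nan?)                  # NaN cannot be compared
  maximum = cleaned.max                      # computed once, outside the block
  cleaned.map { |v| v == maximum ? v : 0 }
end

guesses_arr = [0.0 / 0.0] * 3 + [2.0, 3.0, 0]
result = keep_only_max(guesses_arr)
# result is [0, 3.0, 0]; guesses_arr itself is unchanged

# reject! returns nil when nothing was removed:
no_nans = [1.0, 2.0]
bang_result = no_nans.reject!(&:nan?)
# bang_result is nil, although no_nans is still [1.0, 2.0]
```

Because the input is never modified, the array that was printed for debugging is still the array being processed afterwards.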

You can get NaN by calling Float::NAN:
[7] pry(main)> Float::NAN
=> NaN
[8] pry(main)> Float::NAN.nan?
=> true
If you just want to remove NaN:
guesses_arr.reject! { |x| x.nan? }
For example:
[4] pry(main)> [1.0, Float::NAN].reject(&:nan?)
=> [1.0]
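The reason a dedicated nan? check is needed is that NaN never compares equal to anything, including itself, so an equality test cannot find it. The sketch below illustrates this and adds a guard for arrays that may also hold values that do not respond to nan?. The helper name nan_value? is hypothetical.

```ruby
# NaN never compares equal, not even to itself.
nan = Float::NAN
equal_to_itself = (nan == nan)
# equal_to_itself is false, while nan.nan? is true

# Hypothetical helper: true only for a Float NaN. Safe to call on values
# such as integers, strings or nil.
def nan_value?(value)
  value.is_a?(Float) && value.nan?
end

mixed = [1, 2.5, Float::NAN, nil, "3"]
without_nan = mixed.reject { |v| nan_value?(v) }
# without_nan is [1, 2.5, nil, "3"]
```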

There are a couple of things that we can do. To answer the specific question:
arr = [1.0, Float::NAN, "1.0", "1"]
arr.map(&:to_f).reject(&:nan?)
would be a concise example of removing NaN.
My guess is that you are trying to get a max value from inconsistent data. If so, first we need to clean it up with a method like the following, which converts a string or an integer to a float, or otherwise returns nil.
def to_number(maybe_a_number)
  Float(maybe_a_number)
rescue ArgumentError, TypeError
  # Float() raises ArgumentError for strings like "abc" and TypeError for nil
  nil
end
Assuming we have the above method:
guesses_arr.map { |x| to_number(x) }.compact.reject(&:nan?).max # or .min
That being said, the reason this has been hard to debug is that .map! changes guesses_arr while you iterate over it, and the block calls guesses_arr.max on the partly modified array. Generally we avoid the standard library's bang methods, because errors caused by mutating objects in place can be hard to debug.
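Putting the cleanup and the max together without mutating the input could look like the sketch below. The name max_guess and the sample data are assumptions made for illustration; to_number is repeated so the example runs on its own.

```ruby
# Converts strings and integers to Float; returns nil for anything Float() rejects.
def to_number(maybe_a_number)
  Float(maybe_a_number)
rescue ArgumentError, TypeError
  nil
end

# Hypothetical helper: largest usable number in the input, or nil if there is
# none. The input array is left untouched.
def max_guess(values)
  values
    .map { |v| to_number(v) }
    .compact          # drop values that could not be converted
    .reject(&:nan?)   # drop NaN, which max cannot compare
    .max
end

raw = [1.0, Float::NAN, "2.5", "abc", nil, 2]
best = max_guess(raw)
# best is 2.5, and raw still has all six original elements
```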

Related

Creating sliding windows of NaN padded elements off 1D NumPy array

I have a time series x[0], x[1], ... x[n-1], stored as a 1 dimensional numpy array. I would like to convert it to the following matrix:
NaN, ... , NaN , x[0]
NaN, ... , x[0], x[1]
.
.
NaN, x[0], ... , x[n-3],x[n-2]
x[0], x[1], ... , x[n-2],x[n-1]
I would like to use this matrix to speedup time-series calculations. Is there a function in numpy or scipy to do this? (I don't want to use for loop in python to do it)
One approach with np.lib.stride_tricks.as_strided -
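import numpy as np

def nanpad_sliding2D(a):
    L = a.size
    # prepend L-1 NaNs so that every row has a full set of columns
    a_ext = np.concatenate((np.full(a.size - 1, np.nan), a))
    n = a_ext.strides[0]
    strided = np.lib.stride_tricks.as_strided
    # each row starts one element later in a_ext, hence strides of (n, n)
    return strided(a_ext, shape=(L, L), strides=(n, n))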
Sample run -
In [41]: a
Out[41]: array([48, 82, 96, 34, 93, 25, 51, 26])
In [42]: nanpad_sliding2D(a)
Out[42]:
array([[ nan, nan, nan, nan, nan, nan, nan, 48.],
[ nan, nan, nan, nan, nan, nan, 48., 82.],
[ nan, nan, nan, nan, nan, 48., 82., 96.],
[ nan, nan, nan, nan, 48., 82., 96., 34.],
[ nan, nan, nan, 48., 82., 96., 34., 93.],
[ nan, nan, 48., 82., 96., 34., 93., 25.],
[ nan, 48., 82., 96., 34., 93., 25., 51.],
[ 48., 82., 96., 34., 93., 25., 51., 26.]])
Memory efficiency with strides
As mentioned in the comments by @Eric, this strides-based approach is memory efficient, as the output is simply a view into the NaN-padded 1D version. Let's test this out -
In [158]: a # Sample 1D input
Out[158]: array([37, 95, 87, 10, 35])
In [159]: L = a.size # Run the posted approach
...: a_ext = np.concatenate(( np.full(a.size-1,np.nan) ,a))
...: n = a_ext.strides[0]
...: strided = np.lib.stride_tricks.as_strided
...: out = strided(a_ext, shape=(L,L), strides=(n,n))
...:
In [160]: np.may_share_memory(a_ext,out) # O/p might be a view into extended version
Out[160]: True
Let's confirm that the output is indeed a view by assigning values into a_ext and then checking out.
Initial values of a_ext and out :
In [161]: a_ext
Out[161]: array([ nan, nan, nan, nan, 37., 95., 87., 10., 35.])
In [162]: out
Out[162]:
array([[ nan, nan, nan, nan, 37.],
[ nan, nan, nan, 37., 95.],
[ nan, nan, 37., 95., 87.],
[ nan, 37., 95., 87., 10.],
[ 37., 95., 87., 10., 35.]])
Modify a_ext :
In [163]: a_ext[:] = 100
See the new out :
In [164]: out
Out[164]:
array([[ 100., 100., 100., 100., 100.],
[ 100., 100., 100., 100., 100.],
[ 100., 100., 100., 100., 100.],
[ 100., 100., 100., 100., 100.],
[ 100., 100., 100., 100., 100.]])
Confirms that it's a view.
Finally, let's test out the memory requirements :
In [131]: a_ext.nbytes
Out[131]: 72
In [132]: out.nbytes
Out[132]: 200
So the output, even though it reports 200 bytes, actually occupies only the 72 bytes of the extended array, because it is a view into that array.
One more approach with Scipy's toeplitz -
from scipy.linalg import toeplitz
out = toeplitz(a, np.full(a.size,np.nan) )[:,::-1]

Not able to remove NaN from ruby array

I have an array of values
=> [0.0, 4.76, 0.0, Infinity, NaN, 2.63, 0.74, 10.0, NaN, NaN, NaN, NaN, 0.0, NaN, NaN, NaN, NaN, NaN, Infinity, 5.26, NaN, 0.0, NaN, 3.45, 2.5, NaN, 10.0, 0.0, NaN, 2.94, NaN, NaN, 0.0, 2.04, 0.0, 11.11, NaN, NaN, 1.23, NaN, NaN, 11.11, NaN, NaN, NaN, 0.0, 9.68, NaN, NaN, 10.0, 5.0, 3.7, 10.0, Infinity, 0.0, 0.0, 1.41, NaN, 3.45, NaN]
When I run this script to remove NaN's it removes some but not all NaN's.
def remove_from_array(numArray)
  numArray.inject(0) do |i|
    if numArray[i].nan?
      numArray.delete_at(i)
    end
    i += 1
  end
  numArray
end
What am I missing?
If you delete an item and move to the next index, you’re moving two items ahead, because the item at the current index no longer exists.
Luckily, there’s a better way, using Array#reject!:
numArray.reject! &:nan?
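To see the skipping effect in isolation, here is a small sketch with two adjacent NaNs, using a plain index loop rather than the original inject-based code. The variable names and sample data are made up for illustration.

```ruby
values = [1.0, Float::NAN, Float::NAN, 2.0]

# Buggy pattern: always advancing the index skips the element that slides
# into the slot just vacated by delete_at.
buggy = values.dup
i = 0
while i < buggy.length
  buggy.delete_at(i) if buggy[i].nan?
  i += 1
end
# buggy still contains one NaN: [1.0, NaN, 2.0]

# Corrected manual loop: only advance when nothing was deleted.
fixed = values.dup
i = 0
while i < fixed.length
  if fixed[i].nan?
    fixed.delete_at(i)
  else
    i += 1
  end
end
# fixed is [1.0, 2.0], the same result as values.reject(&:nan?)
```

The corrected loop is shown only to explain the bug; reject or reject! remains the simpler choice.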

Rounding errors in Ruby matrix implementation

I'm doing a bit 'o matrix algebra in ruby. When testing the results, I'm seeing what I can only assume is a rounding error.
All I'm doing is multiplying 3 matrices, but the values are fairly small:
c_xy:
[0.9702957262759965, 0.012661213742314235, -0.24159035004964077]
[0, 0.9986295347545738, 0.05233595624294383]
[0.24192189559966773, -0.050781354673095955, 0.9689659697053497]
i2k = Matrix[[8.1144E-06, 0.0, 0.0],
[0.0, 8.1144E-06, 0.0],
[0.0, 0.0, 8.1144E-06]]
c_yx:
[0.9702957262759965, 0, 0.24192189559966773]
[0.012661213742314235, 0.9986295347545738, -0.050781354673095955]
[-0.24159035004964077, 0.05233595624294383, 0.9689659697053497]
What I'm trying to do is c_xy * i2k * c_yx. Here's what I expect (this was done in Excel):
8.1144E-06 0 2.11758E-22
0 8.1144E-06 0
2.11758E-22 -5.29396E-23 8.1144E-06
And what I get:
[8.1144e-06, 1.3234889800848443e-23, 6.352747104407253e-22]
[0.0, 8.114399999999998e-06, -5.293955920339377e-23]
[2.117582368135751e-22, 0.0, 8.1144e-06]
As you can see, the first column matches, as does the diagonal. But then (in r,c indexing) (0,1) is wrong (though approaching 0), (0,2) is very wrong, and (1,2) and (2,1) seem to be transposed. I thought it had something to do with the 8.1144e-6 value, and tried wrapping it in a BigDecimal, to no avail.
Any ideas on places I can look? I'm using the standard Ruby Matrix library
edit
here's the code.
phi1 = 0.24434609527920614
phi2 = 0.05235987755982988
i2k = Matrix[[8.1144E-06, 0.0, 0.0],
[0.0, 8.1144E-06, 0.0],
[0.0, 0.0, 8.1144E-06]]
c_x = Matrix[[1, 0, 0],
[0, Math.cos(phi2), Math.sin(phi2)],
[0, -Math.sin(phi2), Math.cos(phi2)]]
c_y = Matrix[[Math.cos(phi1), 0, -Math.sin(phi1)],
[0, 1, 0],
[Math.sin(phi1), 0, Math.cos(phi1)]]
c_xy = c_y * c_x
c_yx = c_xy.transpose
c_xy * i2k * c_yx
i2k is equal to the identity matrix times 8.1144E-06. This simplifies the answer to:
c_xy * i2k * c_yx = 8.1144E-06 * c_xy * c_yx
Moreover, c_yx = c_xy.transpose and c_xy is a rotation matrix (a product of two rotation matrices). The transpose of any rotation matrix is its inverse, so c_xy * c_yx is the identity matrix, and thus the exact answer is 8.1144E-06 times the identity matrix.
Here is one way to calculate c_xy * c_yx numerically, without relying on that algebraic argument:
require 'matrix'
require 'pp'
phi1 = 14 * Math::PI/180
phi2 = 3 * Math::PI/180
c_x = Matrix[
[1,0,0],
[0, Math.cos(phi2), Math.sin(phi2) ],
[0, -Math.sin(phi2), Math.cos(phi2) ] ]
c_y = Matrix[
[Math.cos(phi1), 0, -Math.sin(phi1) ],
[0,1,0],
[Math.sin(phi1), 0, Math.cos(phi1) ] ]
c_xy = c_y * c_x
c_yx = c_xy.transpose
product = c_xy * c_yx
pp *product
clone = *product
puts "\nApplying EPSILON:"
product.each_with_index do |e,i,j|
  clone[i][j] = 0 if e.abs <= Float::EPSILON
end
pp clone
Output:
[1.0, 0.0, 2.7755575615628914e-17]
[0.0, 0.9999999999999999, -6.938893903907228e-18]
[2.7755575615628914e-17, -6.938893903907228e-18, 0.9999999999999999]
Applying EPSILON:
[1.0, 0, 0]
[0, 0.9999999999999999, 0]
[0, 0, 0.9999999999999999]
which one can then surmise should be the identity matrix. This uses Float::EPSILON which is about 2.220446049250313e-16 in order to set values that have an absolute value no more than this equal to 0. These kinds of approximations are inevitable in floating point calculations; one must evaluate the appropriateness of these approximations on a case-by-case basis.
An alternative is to do symbolic computation where possible rather than numeric.
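Floating point numbers have limited precision:
puts Float::DIG # => 15
That's the number of significant decimal digits a Float can reliably hold on my system, and probably on yours. It limits relative precision, not magnitude: a Float can store values far smaller than 1E-15, but a result computed from operands of magnitude around 1 is only accurate to roughly 15 or 16 digits, so a term like 2.7E-17 next to 1.0 is rounding noise. You could try BigDecimal for arbitrarily large precision.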
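In tests, a common way to live with this noise is to compare matrices within a tolerance instead of exactly. The sketch below assumes a hypothetical helper approx_equal? and an arbitrarily chosen default tolerance of 1e-12; an appropriate tolerance depends on the magnitude of the values involved.

```ruby
require 'matrix'

# Hypothetical helper: true when two matrices have the same shape and every
# pair of corresponding elements differs by at most `tolerance`.
def approx_equal?(a, b, tolerance = 1e-12)
  return false unless a.row_count == b.row_count && a.column_count == b.column_count

  (a - b).all? { |e| e.abs <= tolerance }
end

phi1 = 14 * Math::PI / 180
phi2 = 3 * Math::PI / 180
c_x = Matrix[[1, 0, 0],
             [0, Math.cos(phi2), Math.sin(phi2)],
             [0, -Math.sin(phi2), Math.cos(phi2)]]
c_y = Matrix[[Math.cos(phi1), 0, -Math.sin(phi1)],
             [0, 1, 0],
             [Math.sin(phi1), 0, Math.cos(phi1)]]
c_xy = c_y * c_x

# A rotation matrix times its transpose should be the identity, up to rounding.
product = c_xy * c_xy.transpose
close_to_identity = approx_equal?(product, Matrix.identity(3))
# close_to_identity is true even though some off-diagonal entries are tiny non-zero floats
```

For the original problem the values are scaled by 8.1144E-06, so the tolerance would need to be scaled down accordingly.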

How do I convert a 2d array into a 1d array of strings?

I have a 2d array of the following form:
[ [(, (, (, (, (,],
[1.0,1.0,1.0,1.0,1.0,],
[+, +, +, +, +,],
[2.0,2.0,2.0,2.0,2.0,],
[), ), ), ), )]
]
i.e. a mixture of numbers and mathematical characters. What I want to do is convert this into a series of strings in a 1D array like:
[
(1+2),
(1+2),
(1+2),
(1+2),
(1+2)
]
Is there a way to do this with join?
xs.transpose.map(&:join)
#=> ["(1.1+2.1)", "(1.2+2.2)", "(1.3+2.3)", "(1.4+2.4)", "(1.5+2.5)"]
(side note: if you want to know why we can't write Array.zip(*xs) look at this issue. Facets guys did implement it.)
You could use transpose, and then join the elements of the transposed array (I have modified the numbers to clearly show what is happening):
# edited to add a transpose with map, thank you @Mladen
# edited to get rid of <<'s, thank you @tokland
# assumption:
a = [ ['(', '(', '(', '(', '('],
[1.1, 1.2, 1.3, 1.4, 1.5],
['+', '+', '+', '+', '+'],
[2.1, 2.2, 2.3, 2.4, 2.5],
[')', ')', ')', ')', ')'] ]
# transpose:
b = []
a.transpose.each { |c| b.push c.join }
# or, transpose with map:
b = a.transpose.map { |c| c.join }
# test:
puts b.inspect
Which has as an output:
["(1.1+2.1)", "(1.2+2.2)", "(1.3+2.3)", "(1.4+2.4)", "(1.5+2.5)"]
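For comparison, the same result can be obtained with the instance method Array#zip, by zipping the first row with the remaining rows. This is only a sketch on shortened sample data; the variable names are made up.

```ruby
xs = [['(', '(', '('],
      [1.1, 1.2, 1.3],
      ['+', '+', '+'],
      [2.1, 2.2, 2.3],
      [')', ')', ')']]

# zip the first row with all remaining rows, then join each resulting column
first, *rest = xs
via_zip = first.zip(*rest).map(&:join)
# via_zip is ["(1.1+2.1)", "(1.2+2.2)", "(1.3+2.3)"]

via_transpose = xs.transpose.map(&:join)
# both approaches give the same array for rectangular input
```

The two differ on ragged input: transpose raises an IndexError when rows have different lengths, whereas zip pads shorter rows with nil.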

How to sort an array of floats in Ruby?

Just wondering how to sort an array of floats in Ruby, since "sort" and "sort!" only work for integer arrays.
Arrays of floats can certainly be sorted:
>> [6.2, 5.8, 1.1, 4.9, 13.4].sort
=> [1.1, 4.9, 5.8, 6.2, 13.4]
Maybe you have a nil in your array, which can't be sorted with anything.
You can sort a float array without any problem like :
irb(main):005:0> b = [2.0, 3.0, 1.0, 4.0]
=> [2.0, 3.0, 1.0, 4.0]
irb(main):006:0> b.sort
=> [1.0, 2.0, 3.0, 4.0]
Perhaps you have something like this in your array and haven't noticed:
[1.0, 3.0, 0.0/0.0, ...]
The 0.0/0.0 gives you a NaN, which cannot be compared with any Float (the integer expression 0/0 raises ZeroDivisionError instead). If the culprit is a nil rather than a NaN, you can remove it with compact:
[2.3,nil,1].compact.sort
# => [1, 2.3]
That, or perhaps the same error with 1.0/0, which yields Infinity (but this error is detected by Ruby).
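A defensive sort that handles both nil and NaN could be sketched as follows. The helper name sort_floats and the sample data are hypothetical.

```ruby
# Hypothetical helper: returns a sorted copy with nil and NaN removed.
def sort_floats(values)
  values
    .compact                                  # remove nil
    .reject { |v| v.is_a?(Float) && v.nan? }  # remove NaN
    .sort
end

data = [2.3, nil, 1, 0.0 / 0.0, 1.0 / 0, -4.5]
sorted = sort_floats(data)
# sorted is [-4.5, 1, 2.3, Infinity]
```

Infinity is kept because, unlike NaN, it compares normally with other numbers and simply sorts last.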
