Unary minus and floating point number in OCaml - syntax

I wanted to have a vector of complex numbers in my program, so I wrote this:
[|pt 0. 0.; pt -4. 1.; pt -7. -2.; pt 4. 5.; pt 1. 1.|]
Here pt is a function of type float -> float -> Complex.t. But ocaml refused to compile this saying:
Characters 12-14:
[|pt 0. 0.; pt -4. 1.; pt -7. -2.; pt 4. 5.; pt 1. 1.|];;
^^
Error: This expression has type float -> float -> Complex.t
but an expression was expected of type int
What I wanted to do here is (obviously) include the complex number whose real part is -4 and whose imaginary part is 1. But ocaml treated what I intended to be an unary minus as a function of type int -> int ->int.
What should I write to do what I wanted to?

Here's how I think it goes (after reading docs and experimenting). There really are four completely different operators:
- Integer subtraction int -> int -> int
-. Floating subtraction float -> float -> float
~- Integer unary negation int -> int
~-. Floating unary negation float -> float
If everybody used these operators, things would be clear, but unfortunately it's also a pretty clumsy notation. In my experience the ~- and ~-. operators are rarely used. The OCaml grammar is specified to let you use the subtraction operators as unary negation operators, as in many other languages. If you do that, you often have to use extra parentheses. If you're willing to use the specific unary operators, you don't need the parentheses.
I.e., you can write (as in pad's edited answer):
[|pt 0. 0.; pt ~-.4. 1.; pt ~-.7. ~-.2.; pt 4. 5.; pt 1. 1.|]
Or you can write:
[|pt 0. 0.; pt (-.4.) 1.; pt (-.7.) (-.2.); pt 4. 5.; pt 1. 1.|]
There's also one extra confusing factor, which is that the OCaml lexer is specified to let you use the integer subtraction operator for unary negation when you use it with a floating constant. Again, this makes the notation more like other languages. Since it's fundamentally a binary operator, you need the parentheses here too.
This means you can write:
[|pt 0. 0.; pt (-4.) 1.; pt (-7.) (-2.); pt 4. 5.; pt 1. 1.|]
This notation only works for negative floating constants. The other two notations work for any expression you might want to negate.
# (-) ;;
- : int -> int -> int = <fun>
# (-.) ;;
- : float -> float -> float = <fun>
# (~-) ;;
- : int -> int = <fun>
# (~-.) ;;
- : float -> float = <fun>
# let f x = x +. 2.0;;
val f : float -> float = <fun>
# f ~-.5.;;
- : float = -3.
# f -.5.;;
Characters 0-1:
f -.5.;;
^
Error: This expression has type float -> float
but an expression was expected of type float
# f (-.5.);;
- : float = -3.
# f -5.;;
^
Error: This expression has type float -> float
but an expression was expected of type int
# f (-5.);;
- : float = -3.

Related

Implementing a FIR filter using Vectors

I have implemented a FIR filter in Haskell. I don't know that much about FIR filters and my code is heavily based on an existing C# implementation. Therefore, I have a feeling that my implementation is has too much of a C# style and is not really Haskell-like. I would like to know if there is a more idiomatic Haskell way of implementing my code. Ideally, I'm lucky for some combination of higher-order functions (map, filter, fold, etc.) that implement the algorithm.
My Haskell code looks like this:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
where
help i = if i >= (U.length b - 1) then loop i (U.length b - 1) else 0
loop yi bi = if bi < 0 then 0 else b !! bi * x !! (yi-bi) + loop yi (bi-1)
vec !! i = unsafeIndex vec i -- Shorthand for unsafeIndex
This code is based on the following C# code:
public float[] RunFilter(double[] x)
{
int M = coeff.Length;
int n = x.Length;
//y[n]=b0x[n]+b1x[n-1]+....bmx[n-M]
var y = new float[n];
for (int yi = 0; yi < n; yi++)
{
double t = 0.0f;
for (int bi = M - 1; bi >= 0; bi--)
{
if (yi - bi < 0) continue;
t += coeff[bi] * x[yi - bi];
}
y[yi] = (float) t;
}
return y;
}
As you can see, it's almost a straight copy. How can I turn my implementation into a more Haskell-like one? Do you have any ideas? The only thing I could come up with was using Vector.generate.
I know that the DSP library has an implementation available. But it uses lists and is way too slow for my use case. This Vector implementation is a lot faster than the one in DSP.
I've also tried implementing the algorithm using Repa. It is faster than the Vector implementation. Here is the result:
applyFIR :: V.Vector Float -> Array U DIM1 Float -> Array D DIM1 Float
applyFIR b x = R.traverse x id (\_ (Z :. i) -> if i >= len then loop i (len - 1) else 0)
where
len = V.length b
loop :: Int -> Int -> Float
loop yi bi = if bi < 0 then 0 else (V.unsafeIndex b bi) * x !! (Z :. (yi-bi)) + loop yi (bi-1)
arr !! i = unsafeIndex arr i
First of all, I don't think that your initial vector code is a faithful translation - that is, I think it disagrees with the C# code. For example, suppose that both "x" and "b" ("b" is coeff in C#) have length 3, and have all values of 1.0. Then for y[0] the C# code would produce x[0] * coeff[0], or 1.0. (it would hit continue for all other values of bi)
With your Haskell code, however, help 0 produces 0. Your Repa version seems to suffer from the same problem.
So let's start with a more faithful translation:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
where
help i = loop i (min i $ U.length b - 1)
loop yi bi = if bi < 0 then 0 else b !! bi * x !! (yi-bi) + loop yi (bi-1)
vec !! i = unsafeIndex vec i -- Shorthand for unsafeIndex
Now, you're basically doing a calculation like this for computing, say, y[3]:
... b[3] | b[2] | b[1] | b[0]
x[0] | x[1] | x[2] | x[3] | x[4] | x[5] | ....
multiply
b[3]*x[0]|b[2]*x[1] |b[1]*x[2] |b[0]*x[3]
sum
y[3] = b[3]*x[0] + b[2]*x[1] + b[1]*x[2] + b[0]*x[3]
So one way to think of what you're doing is "take the b vector, reverse it, and to compute spot i of the result, line b[0] up with x[i], multiply all the corresponding x and b entries, and compute the sum".
So let's do that:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
where
revB = U.reverse b
bLen = U.length b
help i = let sliceLen = min (i+1) bLen
bSlice = U.slice (bLen - sliceLen) sliceLen revB
xSlice = U.slice (i + 1 - sliceLen) sliceLen x
in U.sum $ U.zipWith (*) bSlice xSlice

Implementing Phase Unwrapping Algorithm with Haskell Repa Array

I'm trying to implement a Phase Unwrapping Algorithm for Three Phase Structured Light Scanning in Haskell using a Repa Array. I want to implement a flood fill based unwrapping algorithm recursing outward from the point (width / 2, height / 2). Unfortunately using that method of recursion I'm getting an out of memory exception. I'm new to Haskell and the Repa library so I was wondering whether it looks like I'm doing anything glaringly wrong. Any help with this would be greatly appreciated!
Update (#leventov):
I am now considering implementing the following path following algorithm using mutable arrays in Yarr. (Publication: K. Chen, J. Xi, Y. Yu & J. F. Chicharo, "Fast quality-guided flood-fill phase unwrapping algorithm for threedimensional fringe pattern profilometry," in Optical Metrology and Inspection for Industrial Applications,
2010, pp. 1-9.)
{-# OPTIONS_GHC -Odph -rtsopts -fno-liberate-case -fllvm -optlo-O3 -XTypeOperators -XNoMonomorphismRestriction #-}
module Scanner where
import Data.Word
import Data.Fixed
import Data.Array.Repa.Eval
import qualified Data.Array.Repa as R
import qualified Data.Array.Repa.Repr.Unboxed as U
import qualified Data.Array.Repa.Repr.ForeignPtr as P
import Codec.BMP
import Data.Array.Repa.IO.BMP
import Control.Monad.Identity (runIdentity)
import System.Environment( getArgs )
type ImRead = Either Error Image
type Avg = P.Array R.U R.DIM2 (ImageT, ImageT, ImageT)
type ImageT = (Word8, Word8, Word8)
type PhaseT = (Float, Float, Float)
type WrapT = (Float, Int)
type Image = P.Array R.U R.DIM2 (Word8, Word8, Word8)
type Phase = P.Array R.U R.DIM2 (Float, Float, Float)
type Wrap = P.Array R.U R.DIM2 (Float, Int)
type UWrapT = (Float, Int, [(Int, Int)], String)
type DepthT = (Float, Int, String)
{-# INLINE noise #-}
{-# INLINE zskew #-}
{-# INLINE zscale #-}
{-# INLINE compute #-}
{-# INLINE main #-}
{-# INLINE doMain #-}
{-# INLINE zipImg #-}
{-# INLINE mapWrap #-}
{-# INLINE avgPhase #-}
{-# INLINE doAvg #-}
{-# INLINE doWrap #-}
{-# INLINE doPhase #-}
{-# INLINE isPhase #-}
{-# INLINE diffPhase #-}
{-# INLINE shape #-}
{-# INLINE countM #-}
{-# INLINE inArr #-}
{-# INLINE idx #-}
{-# INLINE getElem #-}
{-# INLINE start #-}
{-# INLINE unwrap #-}
{-# INLINE doUnwrap #-}
{-# INLINE doDepth #-}
{-# INLINE write #-}
noise :: Float
noise = 0.1
zskew :: Float
zskew = 24
zscale :: Float
zscale = 130
compute :: (R.Shape sh, U.Unbox e) => P.Array R.D sh e -> P.Array R.U sh e
compute a = runIdentity (R.computeP a)
main :: IO ()
main = do
commandArguments <- getArgs
case commandArguments of
(file1 : file2 : file3 : _ ) -> do
image1 <- readImageFromBMP file1
image2 <- readImageFromBMP file2
image3 <- readImageFromBMP file3
doMain image1 image2 image3
_ -> putStrLn "Not enough arguments"
doMain :: ImRead -> ImRead -> ImRead -> IO()
doMain (Right i1) (Right i2) (Right i3) = write
where
write = writeFile "out.txt" str
(p, m, d, str) = start $ mapWrap i1 i2 i3
doMain _ _ _ = putStrLn "Error loading image"
zipImg :: Image -> Image -> Image -> Avg
zipImg i1 i2 i3 = U.zip3 i1 i2 i3
mapWrap :: Image -> Image -> Image -> Wrap
mapWrap i1 i2 i3 = compute $ R.map wrap avg
where
wrap = (doWrap . avgPhase)
avg = zipImg i1 i2 i3
avgPhase :: (ImageT, ImageT, ImageT) -> PhaseT
avgPhase (i1, i2, i3) = (doAvg i1, doAvg i2, doAvg i3)
doAvg :: ImageT -> Float
doAvg (r, g, b) = (r1 + g1 + b1) / d1
where
r1 = fromIntegral r
g1 = fromIntegral g
b1 = fromIntegral b
d1 = fromIntegral 765
doWrap :: PhaseT -> WrapT
doWrap (p1, p2, p3) = (wrap, mask)
where
wrap = isPhase $ doPhase (p1, p2, p3)
mask = isNoise $ diffPhase [p1, p2, p3]
doPhase :: PhaseT -> (Float, Float)
doPhase (p1, p2, p3) = (x1, x2)
where
x1 = sqrt 3 * (p1 - p3)
x2 = 2 * p2 - p1 - p3
isPhase :: (Float, Float) -> Float
isPhase (x1, x2) = atan2 x1 x2 / (2 * pi)
diffPhase :: [Float] -> Float
diffPhase phases = maximum phases - minimum phases
isNoise :: Float -> Int
isNoise phase = fromEnum $ phase <= noise
shape :: Wrap -> [Int]
shape wrap = R.listOfShape $ R.extent wrap
countM :: Wrap -> (Float, Int)
countM wrap = R.foldAllS count (0,0) wrap
where count = (\(x, y) (i, j) -> (x, y))
start :: Wrap -> UWrapT
start wrap = unwrap wrap (x, y) (ph, m, [], "")
where
[x0, y0] = shape wrap
x = quot x0 2
y = quot y0 2
(ph, m) = getElem wrap (x0, y0)
inArr :: Wrap -> (Int, Int) -> Bool
inArr wrap (x,y) = x >= 0 && y >= 0 && x < x0 && y < y0
where
[x0, y0] = shape wrap
idx :: (Int, Int) -> (R.Z R.:. Int R.:. Int)
idx (x, y) = (R.Z R.:. x R.:. y)
getElem :: Wrap -> (Int, Int) -> WrapT
getElem wrap (x, y) = wrap R.! idx (x, y)
unwrap :: Wrap -> (Int, Int) -> UWrapT -> UWrapT
unwrap wrap (x, y) (ph, m, done, str) =
if
not $ inArr wrap (x, y) ||
(x, y) `elem` done ||
toEnum m::Bool
then
(ph, m, done, str)
else
up
where
unwrap' = doUnwrap wrap (x, y) (ph, m, done, str)
right = unwrap wrap (x+1, y) unwrap'
left = unwrap wrap (x-1, y) right
down = unwrap wrap (x, y+1) left
up = unwrap wrap (x, y-1) down
doUnwrap :: Wrap -> (Int, Int) -> UWrapT -> UWrapT
doUnwrap wrap (x, y) (ph, m, done, str) = unwrapped
where
unwrapped = (nph, m, (x, y):done, out)
(phase, mask) = getElem wrap (x, y)
rph = fromIntegral $ round ph
off = phase - (ph - rph)
nph = ph + (mod' (off + 0.5) 1) - 0.5
out = doDepth wrap (x, y) (nph, m, str)
doDepth :: Wrap -> (Int, Int) -> DepthT -> String
doDepth wrap (x, y) (ph, m, str) = write (x, ys, d, str)
where
[x0, y0] = shape wrap
ys = y0 - y
ydiff = fromIntegral (y - (quot y0 2))
plane = 0.5 - ydiff / zskew
d = (ph - plane) * zscale
write :: (Int, Int, Float, String) -> String
write (x, y, depth, str) = str ++ vertex
where
vertex = xstr ++ ystr ++ zstr
xstr = show x ++ " "
ystr = show y ++ " "
zstr = show depth ++ "\n"
Sorry for wasting some your time by my first misleading advice.
You should use another 2-dimensional array of pixel states (already visited or not) instead of
(x, y) `elem` done
because the latter takes linear time.
Examples of solving almost the same task: for repa and vector, and for yarr.
Perhaps, you have out of memory exception because of building a string by appending to the end (in write function) - the worst solution, linear time and memory consumption. You would better aggregate results using cons (:) and write it to the output file at the end, in reverse order. Even better - write results to another unboxed Vector of (Int, Int, Float) elements (allocate vector of width*height size - as upper bound of possible size).

performance of static member constraint functions

I'm trying to learn static member constraints in F#. From reading Tomas Petricek's blog post, I understand that writing an inline function that "uses only operations that are themselves written using static member constraints" will make my function work correctly for all numeric types that satisfy those constraints. This question indicates that inline works somewhat similarly to c++ templates, so I wasn't expecting any performance difference between these two functions:
let MultiplyTyped (A : double[,]) (B : double[,]) =
let rA, cA = (Array2D.length1 A) - 1, (Array2D.length2 A) - 1
let cB = (Array2D.length2 B) - 1
let C = Array2D.zeroCreate<double> (Array2D.length1 A) (Array2D.length2 B)
for i = 0 to rA do
for k = 0 to cA do
for j = 0 to cB do
C.[i,j] <- C.[i,j] + A.[i,k] * B.[k,j]
C
let inline MultiplyGeneric (A : 'T[,]) (B : 'T[,]) =
let rA, cA = Array2D.length1 A - 1, Array2D.length2 A - 1
let cB = Array2D.length2 B - 1
let C = Array2D.zeroCreate<'T> (Array2D.length1 A) (Array2D.length2 B)
for i = 0 to rA do
for k = 0 to cA do
for j = 0 to cB do
C.[i,j] <- C.[i,j] + A.[i,k] * B.[k,j]
C
Nevertheless, to multiply two 1024 x 1024 matrixes, MultiplyTyped completes in an average of 2550 ms on my machine, whereas MultiplyGeneric takes about 5150 ms. I originally thought that zeroCreate was at fault in the generic version, but changing that line to the one below didn't make a difference.
let C = Array2D.init<'T> (Array2D.length1 A) (Array2D.length2 B) (fun i j -> LanguagePrimitives.GenericZero)
Is there something I'm missing here to make MultiplyGeneric perform the same as MultiplyTyped? Or is this expected?
edit: I should mention that this is VS2010, F# 2.0, Win7 64bit, release build. Platform target is x64 (to test larger matrices) - this makes a difference: x86 produces similar results for the two functions.
Bonus question: the type inferred for MultiplyGeneric is the following:
val inline MultiplyGeneric :
^T [,] -> ^T [,] -> ^T [,]
when ( ^T or ^a) : (static member ( + ) : ^T * ^a -> ^T) and
^T : (static member ( * ) : ^T * ^T -> ^a)
Where does the ^a type come from?
edit 2: here's my testing code:
let r = new System.Random()
let A = Array2D.init 1024 1024 (fun i j -> r.NextDouble())
let B = Array2D.init 1024 1024 (fun i j -> r.NextDouble())
let test f =
let sw = System.Diagnostics.Stopwatch.StartNew()
f() |> ignore
sw.Stop()
printfn "%A" sw.ElapsedMilliseconds
for i = 1 to 5 do
test (fun () -> MultiplyTyped A B)
for i = 1 to 5 do
test (fun () -> MultiplyGeneric A B)
Good question. I'll answer the easy part first: the ^a is just part of the natural generalization process. Imagine you had a type like this:
type T = | T with
static member (+)(T, i:int) = T
static member (*)(T, T) = 0
Then you can still use your MultiplyGeneric function with arrays of this type: multiplying elements of A and B will give you ints, but that's okay because you can still add them to elements of C and get back values of type T to store back into C.
As to your performance question, I'm afraid I don't have a great explanation. Your basic understanding is right - using MultiplyGeneric with double[,] arguments should be equivalent to using MultiplyTyped. If you use ildasm to look at the IL the compiler generates for the following F# code:
let arr = Array2D.zeroCreate 1024 1024
let f1 = MultiplyTyped arr
let f2 = MultiplyGeneric arr
let timer = System.Diagnostics.Stopwatch()
timer.Start()
f1 arr |> ignore
printfn "%A" timer.Elapsed
timer.Restart()
f2 arr |> ignore
printfn "%A" timer.Elapsed
then you can see that the compiler really does generate identical code for each of them, putting the inlined code for MultipyGeneric into an internal static function. The only difference that I see in the generated code is in the names of locals, and when running from the command line I get roughly equal elapsed times. However, running from FSI I see a difference similar to what you've reported.
It's not clear to me why this would be. As I see it there are two possibilities:
FSI's code generation may be doing something slightly different than the static compiler
The CLR's JIT compiler may be treat code generated at runtime slightly differently from compiled code. For instance, as I mentioned my code above using MultiplyGeneric actually results in an internal method that contains the inlined body. Perhaps the CLR's JIT handles the difference between public and internal methods differently when they are generated at runtime than when they are in statically compiled code.
I'd like to see your benchmarks. I don't get the same results (VS 2012 F# 3.0 Win 7 64-bit).
let m = Array2D.init 1024 1024 (fun i j -> float i * float j)
let test f =
let sw = System.Diagnostics.Stopwatch.StartNew()
f() |> ignore
sw.Stop()
printfn "%A" sw.Elapsed
test (fun () -> MultiplyTyped m m)
> 00:00:09.6013188
test (fun () -> MultiplyGeneric m m)
> 00:00:09.1686885
Decompiling with Reflector, the functions look identical.
Regarding your last question, the least restrictive constraint is inferred. In this line
C.[i,j] <- C.[i,j] + A.[i,k] * B.[k,j]
because the result type of A.[i,k] * B.[k,j] is unspecified, and is passed immediately to (+), an extra type could be involved. If you want to tighten the constraint you can replace that line with
let temp : 'T = A.[i,k] * B.[k,j]
C.[i,j] <- C.[i,j] + temp
That will change the signature to
val inline MultiplyGeneric :
A: ^T [,] -> B: ^T [,] -> ^T [,]
when ^T : (static member ( * ) : ^T * ^T -> ^T) and
^T : (static member ( + ) : ^T * ^T -> ^T)
EDIT
Using your test, here's the output:
//MultiplyTyped
00:00:09.9904615
00:00:09.5489653
00:00:10.0562346
00:00:09.7023183
00:00:09.5123992
//MultiplyGeneric
00:00:09.1320273
00:00:08.8195283
00:00:08.8523408
00:00:09.2496603
00:00:09.2950196
Here's the same test on ideone (with a few minor changes to stay within the time limit: 512x512 matrix and one test iteration). It runs F# 2.0 and produced similar results.

Approximating inverse trigonometric functions

I have to implement asin, acos and atan in environment where I have only following math tools:
sine
cosine
elementary fixed point arithmetic (floating point numbers are not available)
I also already have reasonably good square root function.
Can I use those to implement reasonably efficient inverse trigonometric functions?
I don't need too big precision (the floating point numbers have very limited precision anyways), basic approximation will do.
I'm already half decided to go with table lookup, but I would like to know if there is some neater option (that doesn't need several hundred lines of code just to implement basic math).
EDIT:
To clear things up: I need to run the function hundreds of times per frame at 35 frames per second.
In a fixed-point environment (S15.16) I successfully used the CORDIC algorithm (see Wikipedia for a general description) to compute atan2(y,x), then derived asin() and acos() from that using well-known functional identities that involve the square root:
asin(x) = atan2 (x, sqrt ((1.0 + x) * (1.0 - x)))
acos(x) = atan2 (sqrt ((1.0 + x) * (1.0 - x)), x)
It turns out that finding a useful description of the CORDIC iteration for atan2() on the double is harder than I thought. The following website appears to contain a sufficiently detailed description, and also discusses two alternative approaches, polynomial approximation and lookup tables:
http://ch.mathworks.com/examples/matlab-fixed-point-designer/615-calculate-fixed-point-arctangent
Do you need a large precision for arcsin(x) function? If no you may calculate arcsin in N nodes, and keep values in memory. I suggest using line aproximation. if x = A*x_(N) + (1-A)*x_(N+1) then x = A*arcsin(x_(N)) + (1-A)*arcsin(x_(N+1)) where arcsin(x_(N)) is known.
you might want to use approximation: use an infinite series until the solution is close enough for you.
for example:
arcsin(z) = Sigma((2n!)/((2^2n)*(n!)^2)*((z^(2n+1))/(2n+1))) where n in [0,infinity)
http://en.wikipedia.org/wiki/Inverse_trigonometric_functions#Expression_as_definite_integrals
You could do that integration numerically with your square root function, approximating with an infinite series:
Submitting here my answer from this other similar question.
nVidia has some great resources I've used for my own uses, few examples: acos asin atan2 etc etc...
These algorithms produce precise enough results. Here's a straight up Python example with their code copy pasted in:
import math
def nVidia_acos(x):
negate = float(x<0)
x=abs(x)
ret = -0.0187293
ret = ret * x
ret = ret + 0.0742610
ret = ret * x
ret = ret - 0.2121144
ret = ret * x
ret = ret + 1.5707288
ret = ret * math.sqrt(1.0-x)
ret = ret - 2 * negate * ret
return negate * 3.14159265358979 + ret
And here are the results for comparison:
nVidia_acos(0.5) result: 1.0471513828611643
math.acos(0.5) result: 1.0471975511965976
That's pretty close! Multiply by 57.29577951 to get results in degrees, which is also from their "degrees" formula.
It should be easy to addapt the following code to fixed point. It employs a rational approximation to calculate the arctangent normalized to the [0 1) interval (you can multiply it by Pi/2 to get the real arctangent). Then, you can use well known identities to get the arcsin/arccos from the arctangent.
normalized_atan(x) ~ (b x + x^2) / (1 + 2 b x + x^2)
where b = 0.596227
The maximum error is 0.1620º
#include <stdint.h>
#include <math.h>
// Approximates atan(x) normalized to the [-1,1] range
// with a maximum error of 0.1620 degrees.
float norm_atan( float x )
{
static const uint32_t sign_mask = 0x80000000;
static const float b = 0.596227f;
// Extract the sign bit
uint32_t ux_s = sign_mask & (uint32_t &)x;
// Calculate the arctangent in the first quadrant
float bx_a = ::fabs( b * x );
float num = bx_a + x * x;
float atan_1q = num / ( 1.f + bx_a + num );
// Restore the sign bit
uint32_t atan_2q = ux_s | (uint32_t &)atan_1q;
return (float &)atan_2q;
}
// Approximates atan2(y, x) normalized to the [0,4) range
// with a maximum error of 0.1620 degrees
float norm_atan2( float y, float x )
{
static const uint32_t sign_mask = 0x80000000;
static const float b = 0.596227f;
// Extract the sign bits
uint32_t ux_s = sign_mask & (uint32_t &)x;
uint32_t uy_s = sign_mask & (uint32_t &)y;
// Determine the quadrant offset
float q = (float)( ( ~ux_s & uy_s ) >> 29 | ux_s >> 30 );
// Calculate the arctangent in the first quadrant
float bxy_a = ::fabs( b * x * y );
float num = bxy_a + y * y;
float atan_1q = num / ( x * x + bxy_a + num );
// Translate it to the proper quadrant
uint32_t uatan_2q = (ux_s ^ uy_s) | (uint32_t &)atan_1q;
return q + (float &)uatan_2q;
}
In case you need more precision, there is a 3rd order rational function:
normalized_atan(x) ~ ( c x + x^2 + x^3) / ( 1 + (c + 1) x + (c + 1) x^2 + x^3)
where c = (1 + sqrt(17)) / 8
which has a maximum approximation error of 0.00811º
Maybe some kind of intelligent brute force like newton rapson.
So for solving asin() you go with steepest descent on sin()
Use a polynomial approximation. Least-squares fit is easiest (Microsoft Excel has it) and Chebyshev approximation is more accurate.
This question has been covered before: How do Trigonometric functions work?
Only continous functions are approximable by polynomials. And arcsin(x) is discontinous in point x=1.same arccos(x).But a range reduction to interval 1,sqrt(1/2) in that case avoid this situation. We have arcsin(x)=pi/2- arccos(x),arccos(x)=pi/2-arcsin(x).you can use matlab for minimax approximation.Aproximate only in range [0,sqrt(1/2)](if angle for that arcsin is request is bigger that sqrt(1/2) find cos(x).arctangent function only for x<1.arctan(x)=pi/2-arctan(1/x).

Small difference in types

I have three functions that ought to be equal:
let add1 x = x + 1
let add2 = (+) 1
let add3 = (fun x -> x + 1)
Why do the types of these methods differ?
add1 and add3 are int -> int, but add2 is (int -> int).
They all work as expected, I am just curious as to why FSI presents them differently?
This is typically an unimportant distinction, but if you're really curious, see the Arity Conformance for Values section of the F# spec.
My quick summary would be that (int -> int) is a superset of int -> int. Since add1 and add3 are syntactic functions, they are inferred to have the more specific type int -> int, while add2 is a function value and is therefore inferred to have the type (int -> int) (and cannot be treated as an int -> int).

Resources