floating point operations

floating point operations - algorithm

I have read that some machines can't represent certain floating point numbers exactly, for example 1.1.
Take this code:
float x=0.1;
do{
    x+=0.1;
    printf("%f\n",x);
} while(x!=1.1);
This code never finishes. How can I make it finish? Maybe convert it to double, or something else?

For numerical problems, it is common to specify an epsilon of accuracy:
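#include <math.h>    /* fabsf */
#include <stdbool.h> /* bool */

/* True when x and y differ by at most e. */
bool within_epsilon(float x, float y, float e) {
    /* fabsf, not abs: in C, abs() takes an int and would truncate the difference */
    return fabsf(x - y) <= e;
}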
The epsilon you choose determines your accuracy, and the smallest epsilon that makes sense depends on your floating point implementation: see machine epsilon.
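As a rough illustration of the epsilon approach, here is a minimal Python sketch of the loop from the question. Python floats are doubles, so it also suggests that switching to double does not remove the need for a tolerance. The names `within_epsilon` and `count_steps`, the tolerance `1e-6`, and the `max_steps` guard are assumptions made for this example.

```python
def within_epsilon(x: float, y: float, e: float) -> bool:
    """Return True when x and y differ by at most e."""
    return abs(x - y) <= e


def count_steps(target: float = 1.1, step: float = 0.1,
                e: float = 1e-6, max_steps: int = 1000) -> int:
    """Start at `step`, keep adding `step`, stop once within e of target."""
    x = step
    steps = 0
    while steps < max_steps:  # guard so a bad tolerance cannot loop forever
        x += step
        steps += 1
        if within_epsilon(x, target, e):
            break
    return steps
```

With the defaults, `count_steps()` stops after ten additions, which is where the original loop was expected to stop.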

For example, compare within an acceptable margin, i.e.
while (fabs(x-1.1)>0.001); /* fabs from <math.h>; abs() works on ints in C */
Doubles will have the same issue, just with more precision. Some languages also offer you rational types, where you can specify a number as the fraction 1/10, or fixed point data types.
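To illustrate the rational-type option, Python's standard `fractions.Fraction` stores 1/10 exactly, so an exact equality test does terminate. This is only a sketch; the function name `steps_to_reach` and the overshoot check are made up for the example.

```python
from fractions import Fraction


def steps_to_reach(target: Fraction, step: Fraction) -> int:
    """Count additions of `step` (starting from `step`) needed to hit target."""
    x = step
    steps = 0
    while x != target:  # exact comparison is safe: Fractions do not round
        x += step
        steps += 1
        if x > target:
            # Assumption for this sketch: target is a multiple of step.
            raise ValueError("target is not reachable with this step")
    return steps
```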

In this case, checking "<" will do the trick:
float x=0.1;
do{
    x+=0.1;
    printf("%f\n",x);
} while(x<1.05);
In general, you should test against an "epsilon". Look here for further information.

Work in fixed point, for that kind of task.
The decimal type for example might help. It's not the solution for all problems though.

If you want code that behaves exactly as you describe, use a type like decimal (where available), which is a base 10 floating point implementation rather than base 2.
Further reading: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems and http://en.wikipedia.org/wiki/Decimal_floating_point
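For a feel of what a base 10 type buys you, here is a small Python sketch using the standard `decimal.Decimal`. The function name `decimal_steps` is an assumption for the example, and the values are built from strings on purpose, since `Decimal(0.1)` would inherit the binary rounding error of the float literal.

```python
from decimal import Decimal


def decimal_steps() -> list:
    """Repeat the question's loop with base 10 arithmetic."""
    x = Decimal("0.1")
    seen = []
    while x != Decimal("1.1"):  # exact test works: 0.1 is exact in base 10
        x += Decimal("0.1")
        seen.append(x)
    return seen
```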

Related

Binary search / bisection for floating point numbers

It is easy to find an integer with binary search even if it can be arbitrarily large: first guess the order of magnitude, then keep dividing the interval.
This answer describes how to find an arbitrary rational number.
Having set the scene, my question is similar: how can we guess an IEEE 754 floating point number? Assume it is not NaN, but everything else is fair game. For each guess, your program will be told whether the number in question is higher, equal or lower. Minimize the number of guesses required in the worst case.
(This is not a homework assignment. Though, I might make it one, if this turns out to have an interesting answer that's not just "beat the floating point numerical difficulties to death with lots and lots of special case handling.")
Edit: if I were better at searching I could have found the answer---but that only works if you already know that reinterpretation as int works (with certain caveats). So leaving this up. Thanks to Harold for a great answer!

IEEE-754 64-bit floating point numbers are really 64-bit representations. Furthermore, with the exception of NaN values, there is no difference between floating point comparison and integer comparison of positive values. (That is, two bit patterns with the sign bit unset will produce the same comparison result regardless of whether you compare them as int64_t or double, unless one of the bit patterns is a floating point NaN.)
That means you can find a number in 64 guesses by guessing one bit at a time, even if the number is ±∞. Start by comparing the number with 0; if the target is "less", then produce the guesses in the same way as below, but negate them before guessing. (Since IEEE-754 floats are sign/magnitude, you can negate the number by setting the sign bit to 1. Or you could do the positive bit-pattern reinterpretation and then floating point negate the result.)
After that, guess one bit at a time, starting with the highest-order value bit. Set that bit to 1 if the number is greater than or equal to the guess; set that bit to 0 if the number is less; and continue with the next bit until there aren't any more. To construct the guess, reinterpret the bit pattern as a double.
There are two caveats:
You cannot distinguish between ±0 with comparison tests. That means that if your opponent wants you to distinguish between them, they will have to supply you with a way to ask about equality with −0, and you'll have to use that mechanism after you've apparently established that the number is 0 (which will happen on the 64th guess). This would add one guess, for a total of 65.
If you are assured that the target is not a NaN, then there is no other problem. If it might be a NaN, you need to be careful how you compare: things will work out fine if you always ask "is X less than this guess?", because a NaN comparison will always return false. That means that after 11 successive "no" answers (not counting the one to establish the sign), you will find yourself guessing ∞, with the assumption that if the number is not less than ∞, it must be equal. However, in this case alone you need to explicitly test for equality as well, because that will also be false if the target is a NaN. This doesn't add an additional guess to the count, because it will always happen long before 64 guesses have been used up.
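The bit-at-a-time search described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the answerer's code: the oracle `compare` (returning 1, 0 or -1 for higher, equal, lower), the helper names, and the early exit on equality are all choices made for the example, and the target is assumed not to be NaN.

```python
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def find_double(compare):
    """Find a hidden non-NaN double; returns (value, number_of_guesses).

    compare(guess) is a hypothetical oracle: 1 if the target is higher
    than the guess, -1 if it is lower, 0 if they are equal.
    """
    guesses = 1
    sign = compare(0.0)
    if sign == 0:
        return 0.0, guesses  # comparisons cannot tell -0.0 from +0.0
    negative = sign < 0

    bits = 0  # magnitude bit pattern built so far (sign bit excluded)
    for bit in range(62, -1, -1):
        candidate = bits | (1 << bit)
        guess = bits_to_double(candidate)
        if negative:
            guess = -guess
        guesses += 1
        answer = compare(guess)
        if answer == 0:
            return guess, guesses
        # Keep the bit when the target's magnitude exceeds the guess's.
        if (answer > 0) != negative:
            bits = candidate

    # Not expected for a non-NaN target: a nonzero double always
    # matches one of the guesses above.
    value = bits_to_double(bits)
    return (-value if negative else value), guesses


def make_oracle(target: float):
    """Build a three-way comparison oracle for a known target (for testing)."""
    return lambda guess: (target > guess) - (target < guess)
```

For example, `find_double(make_oracle(1.1))` recovers 1.1 in at most 64 guesses: one for the sign and up to 63 for the remaining bits. Because an infinite target is matched as soon as the exponent bits are all set, the sketch never ends up guessing a NaN bit pattern.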

The same bisection approach can be applied to a floating point number directly. Worst-case run time is O(log n).
using System;

public class GuessComparer
{
    private float random;

    public GuessComparer() // generate a random finite float and keep it private
    {
        Random rnd = new Random();
        var buffer = new byte[4];
        do
        {
            rnd.NextBytes(buffer);
            random = BitConverter.ToSingle(buffer, 0);
        } while (float.IsNaN(random) || float.IsInfinity(random)); // the bisection only terminates for finite targets
    }

    public int CheckGuess(float guess) // positive if the number is higher than the guess, negative if lower, 0 if the same
    {
        return random.CompareTo(guess);
    }
}
public class FloatFinder
{
    public static int Find(GuessComparer checker)
    {
        float guess = 0;
        int result = checker.CheckGuess(guess);
        int guesscount = 1;
        var high = float.MaxValue;
        var low = float.MinValue;
        while (result != 0)
        {
            if (result > 0) //random is higher than guess
                low = guess;
            else// random is lower than guess
                high = guess;
            guess = (high + low) / 2;
            guesscount++;
            result = checker.CheckGuess(guess);
        }
        Console.WriteLine("Found answer in {0}", guesscount);
        return guesscount;
    }

    public static void Find()
    {
        var checker = new GuessComparer();
        int guesses = Find(checker);
    }
}

Sampling from all possible floats in D

In the D programming language, the standard random (std.random) module provides a simple mechanism for generating a random number in some specified range.
auto a = uniform(0, 1024, gen);
What is the best way in D to sample from all possible floating point values?
For clarification, sampling from all possible 32-bit integers can be done as follows:
auto l = uniform!int(); // randomly selected int from all possible integers

Depends on the kind of distribution you want.
A uniform distribution over all possible values could be done by generating a random ulong and then casting the bits into floating point. For T being float or double:
union both { ulong input; T output; }
both val;
val.input = uniform!"[]"(ulong.min, ulong.max);
return val.output;
Since roughly half of the positive floating point numbers are between 0 and 1, this method will often give you numbers near zero. It will also give you infinity and NaN values.
Aside: This code should be fine in D, but type punning through a union is undefined behavior in C++ (C99 and later permit it). Use memcpy there.
If you prefer a uniform distribution over all possible numbers in floating point (equal probability for 0..1 and 1..2 etc), you need something like the normal uniform!double, which unfortunately does not work very well for large numbers. It also will not generate infinity or NaN. You could generate double numbers and convert them to float, but I have no answer for generating random large double numbers.
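For comparison, the bit-casting idea (uniform over representable values, not over the number line) might look like this in Python for 32-bit floats. It is a sketch with made-up helper names (`float32_from_bits`, `random_float32`), using `struct` for the reinterpretation; the constant at the end just checks the "roughly half are between 0 and 1" remark by counting bit patterns.

```python
import math
import random
import struct


def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def random_float32(rng: random.Random) -> float:
    """Pick one of the 2**32 bit patterns uniformly; may be inf or NaN."""
    return float32_from_bits(rng.getrandbits(32))


# Share of non-negative finite patterns whose value is below 1.0:
# patterns 0x00000000..0x3F7FFFFF out of 0x00000000..0x7F7FFFFF.
FRACTION_BELOW_ONE = 0x3F800000 / 0x7F800000
```

Callers that only want finite numbers could redraw whenever `math.isnan` or `math.isinf` is true for the result.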

How to tell if float has non zero decimals?

Is there a way to determine if a float has non zero decimal values? I'd like to avoid a string conversion then splitting on any decimal. But not sure if there is some other way.

You can't.
Floating point variables (both Float and Double) store values with limited precision. Very rarely will a number be stored as exactly .000....
See Is floating point math broken?
The work around:
First, determine an epsilon value you deem to be "as far from .000... as a number can be for me to still consider it 'whole'". This number will depend on your problem domain. Suppose values within .0001 are acceptably "whole".
Secondly, determine the closest whole number by rounding the value.
Finally, subtract the original value from its rounded counterpart, and check if the absolute difference is less than the epsilon value.
import Foundation
extension Double {
    private static var epsilon = 0.0001
    var isWhole: Bool { return abs(self - round(self)) < Double.epsilon }
}
let input = 1.0
print(input.isWhole)
This is very similar to the recommended technique for comparing equality of two Float/Double values.
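The three steps of the workaround translate to other languages almost unchanged. A minimal Python sketch, where the function name `is_whole` and the default tolerance are assumptions for the example, and the input is assumed to be finite:

```python
def is_whole(value: float, epsilon: float = 0.0001) -> bool:
    """True when value lies within epsilon of the nearest whole number.

    Assumes value is finite; round() raises on infinity and NaN.
    """
    nearest = round(value)                  # step 2: closest whole number
    return abs(value - nearest) < epsilon   # step 3: compare with epsilon
```

So `is_whole(1.0)` and `is_whole(0.1 + 0.2 + 0.7)` are both true, while `is_whole(1.01)` is false with the default tolerance.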

In Swift 3.0
Since Swift 3.0 is fast approaching, I'll include an answer for it, even if the question specifically covers Swift 2.
In Swift 3.0 the Enhanced Floating Point Protocols have been implemented, making it easier to work with floating point arithmetic. We can, e.g., use the isEqual method, which implements the IEEE 754 equality predicate, for comparing two floating point numbers:
import Foundation // for access to round function
extension Double {
    var isAsGoodAsIntegerValuedAsItGets: Bool {
        return isEqual(to: round(self))
    }
}
var input = 1.01
print(input.isAsGoodAsIntegerValuedAsItGets) // false
input = 1
print(input.isAsGoodAsIntegerValuedAsItGets) // true
/* increase to least representable value that compares
greater than current `self` */
input = input.nextUp
print(input.isAsGoodAsIntegerValuedAsItGets) // false
/* decrease to the greatest representable value that
compares less than current `self` */
input = input.nextDown
print(input.isAsGoodAsIntegerValuedAsItGets) // true

strange behaviour when comparing floating points in rspec

the 3rd of the following tests fails:
specify { (0.6*2).should eql(1.2) }
specify { (0.3*3).should eql(0.3*3) }
specify { (0.3*3).should eql(0.9) } # this one fails
Why is that? Is this a floating point issue or a ruby or rspec issue?

As of rspec-2.1
specify { (0.6*2).should be_within(0.01).of(1.2) }
Before that:
specify { (0.6*2).should be_close(1.2, 0.01) }

Don't compare floating point numbers for equality
The problem is that neither 0.3 nor 0.9 has an exact representation [1] in the floating point format, and so when multiplying 0.3 * 3 you get a number that is very, very close to 0.9, and which will round to 0.9 for printing, but it isn't 0.9.
And your 0.9 constant is also not precisely 0.9, and the two numbers are very slightly different.
Using exact equality comparisons for floating point numbers is usually a mistake in any language.
[1] All integers up to about 2^52 have exact FP representations, but the fractions are composed of a sum of 1/2^n terms. Most decimal string fractions repeat in base 2.
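The same effect is easy to reproduce outside Ruby, since it comes from the IEEE 754 double format and not from rspec. A short Python sketch (variable names are arbitrary; `Fraction(x)` is used only because it exposes the exact value a double holds):

```python
import math
from fractions import Fraction

product = 0.3 * 3

# Exact equality fails, just like the third spec.
exactly_equal = product == 0.9

# A tolerance-based check, analogous to be_within(0.01).of(0.9).
close_enough = math.isclose(product, 0.9, abs_tol=0.01)

# Exact distance between the two doubles, as a rational number.
gap = abs(Fraction(product) - Fraction(0.9))
```

Here `gap` is tiny but nonzero, which is why the exact comparison fails while the tolerance-based one passes.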

Function derivatives

I have some function,
int somefunction(int x /* parameters here, let's say int x */) {
    return x*x + 2*x + 3; /* or something else, it does not matter */
}
How do I find the derivative of this function? If I have
int f(int x) {
return sin(x);
}
after derivative it must return cos(x).

You can approximate the derivative by looking at the gradient over a small interval, e.g.
const double DELTA = 0.0001;
double dfbydx(double x) {
    return (f(x + DELTA) - f(x)) / DELTA;
}
Depending on where you're evaluating the function, you might get better results from (f(x+DELTA) - f(x-DELTA)) / (2*DELTA) instead.
(I assume 'int' in your question was a typo. If they really are using integers you might have problems with precision this way.)
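To compare the two difference formulas from this answer, here is a small Python sketch checked against the known derivative of sin. The function names `forward_difference` and `central_difference` are chosen for the example, and the step size just mirrors DELTA above.

```python
import math

DELTA = 1e-4


def forward_difference(f, x: float, h: float = DELTA) -> float:
    """One-sided estimate: (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x: float, h: float = DELTA) -> float:
    """Symmetric estimate: (f(x + h) - f(x - h)) / (2 * h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Absolute errors at x = 1.0, where the true derivative of sin is cos(1.0).
forward_error = abs(forward_difference(math.sin, 1.0) - math.cos(1.0))
central_error = abs(central_difference(math.sin, 1.0) - math.cos(1.0))
```

At this point and step size the central estimate is several orders of magnitude closer to cos(1.0) than the forward one, which is the "better results" the answer alludes to.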

You can compute the numerical integral of almost any function using one of many numerical techniques, such as those for numerical ordinary differential equations.
Look at: Another question
But you can get the integration result as a function definition (a symbolic expression) with a system such as Maple, Mathematica, Sage, or SymPy.
