I have a double value (9999999999999.99) and I'm trying to store it in an Elasticsearch document (as type double; I tried scaled_float as well). But in the Elasticsearch document the value is shown as 9.99999999999999E12.
Could someone please help me understand and resolve this?
It seems that you don't actually have a problem, let me explain:
9999999999999.99 is a normal representation of a floating point value (float, double etc.)
9.99999999999999E12 is another representation of the same value. E meaning Exponent.
The exponent means multiplying by 10 to the power of the exponent value, in this case 12, so 10**12 == 1000000000000.
9.99999999999999 * 1000000000000 == 9999999999999.99
Further explanation:
If you move the decimal point twelve digits to the right, you'll get the same number 9999999999999.99, and that's what the exponent is used for.
Also the exponent can be negative, in which case you'd move the decimal point to the left.
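A quick check in irb, for instance, confirms that the two spellings denote the same value (the variable names are just for illustration):
a = 9999999999999.99
b = 9.99999999999999e12   # "E12" means "times 10**12"
a == b                    # => true: both literals denote the same Float
puts a                    # => 9999999999999.99 (Ruby happens to print this one without an exponent)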
This is a question for school, reinventing the wheel as usual.
I'm allowed to use basic arithmetic +, -, *, / and comparison, but I'm obviously not allowed to use cast.
The method has to be efficient, so my idea was to keep multiplying a variable by 2 until it exceeds the value, then do a dichotomic search between the powers of 2 that bracket the number to extract its integer part.
However, in the next section, I'm not allowed to use these basic arithmetic and comparison between integer and float, only between 2 integers, or 2 floats.
I can't find any solution to this...
You can follow your idea of multiplication by two to surpass the value, then a dichotomic search (aka binary search) to get the desired integer. However, since you are not allowed to compare a float with an integer, start with two values: the float 1.0 and the integer 1. Do all your multiplications and comparisons with the float value, and at each step, whatever you do to the float value you also do to the integer value. So at any point your float value and your integer value are equal, and you are using the float value for all comparisons with your given value.
So if your given value is 3.1416, you start with your initial guess values of 1.0 and 1. 1.0 is less than 3.1416, so you double both guesses and get 2.0 and 2. The float 2.0 is still less than 3.1416 so you double both guesses again and get 4.0 and 4. Your float guess 4.0 is finally too high, so you use binary search and try 3.0 and 3. The float guess is low. However, your integer guess 3 is just one away from your previous integer guess of 4, so you are done. The final integer result is thus 3.
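Here is a small Ruby sketch of that idea (illustrative names, assumes a non-negative input); every comparison is float-with-float or int-with-int:
def integer_part(x)
  hi_f, hi_i = 1.0, 1
  while hi_f <= x            # float vs float
    hi_f *= 2.0
    hi_i *= 2                # mirror every operation on the integer twin
  end
  lo_f, lo_i = 0.0, 0
  while hi_i - lo_i > 1      # int vs int
    mid_i = (lo_i + hi_i) / 2
    mid_f = (lo_f + hi_f) / 2.0
    if mid_f <= x            # float vs float
      lo_f, lo_i = mid_f, mid_i
    else
      hi_f, hi_i = mid_f, mid_i
    end
  end
  lo_i
end

integer_part(3.1416)   # => 3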
My understanding is that floats are internally represented as a binary expansion, and that this introduces errors. If that is the case, then why are float literals displayed as given? Suppose that 0.1 is represented internally as 0.0999999999999999 according to its binary expansion (a fake example, just to show the point; this particular value is probably not correct). Then, on inspection or in the return value of irb, why is it possible to print the given literal 0.1 and not 0.0999999999999999? Isn't the original literal form gone once it is interpreted and expanded into binary?
In other words, the mapping from float literal to internal binary representation is many-to-one. Different float literals that are close enough are mapped to the same internal binary representation. Why then is it possible to reconstruct the original literal from the internal representation (modulo differences of form, such as between 1.10 and 1.1, or 1.23e2 and 123.0, as in Mark Dickinson's comment)?
The decimal-to-floating-point conversion applied to floating-point literals such as “0.1” rounds to the nearest floating-point value (within 0.5 ULP) on most platforms. (Ruby calls a function from the platform for this, and the only fallback Ruby's source code contains for portability is awful, but let us assume conversion to nearest.) As a consequence, if you print, to any number of decimals between 1 and 15, the closest decimal representation of the double that corresponds to the literal 0.1, the result is 0.10…0 (and the trailing zeroes can be omitted, of course); and if you print the shortest decimal representation that converts back to the double nearest 0.1, this gives “0.1”, of course.
Programming languages usually use one of the above two approaches (a fixed number of significant digits, or the shortest decimal representation that converts back to the original floating-point number) when converting a floating-point number to a decimal representation. Ruby uses the latter.
This article introduced the “floating-point to shortest decimal representation that converts back to the same floating-point number” style of floating-point-to-decimal conversion.
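For a concrete illustration in irb (the exact digits shown are what a typical IEEE 754 binary64 system produces):
puts 0.1              # => 0.1  (shortest decimal string that converts back to the same double)
puts "%.20f" % 0.1    # => 0.10000000000000000555  (the stored double, shown to 20 places)
Float("0.1") == 0.1   # => true (the printed form round-trips to the same double)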
Ruby (like several other languages, including Java and Python) uses a "shortest-string" representation when converting binary floating-point numbers to decimal for display purposes: given a binary floating-point number, it will compute the shortest decimal string that rounds back to that binary floating-point number under the usual round-to-nearest decimal-to-binary method.
Now suppose that we start out with a reasonably short (in terms of number of significant digits) decimal literal, 123.456 for example, and convert that to the nearest binary float, z. Clearly 123.456 is one decimal string that rounds to z, so it's a candidate for the representation of z, and it should be at least plausible that the "shortest-string" algorithm will spit that back at us. But we'd like more than just plausibility here: to be sure that 123.456 is exactly what we're going to get back, all we need to know is that there aren't any other, shorter, candidates.
And that's true, essentially because if we restrict to short-ish decimal values (to be made more precise below), the spacing between successive such values is larger than the spacing between successive floats. More precisely, we can make a statement like the following:
Any decimal literal x with 15 or fewer significant digits and absolute value between 10^-307 and 10^308 will be recovered by the "shortest-string" algorithm.
Here by "recovered", I mean that the output string will have the same decimal digits, and the same value as the original literal when thought of as a decimal number; it's still possible that the form of the literal may have changed, e.g., from 1.230 to 1.23, or from 0.000345 to 3.45e-4. I'm also assuming IEEE 754 binary64 format with the usual round-ties-to-even rounding mode.
Now let's give a sketch of a proof. Without loss of generality, assume x is positive. Let z be the binary floating-point value nearest x. We have to show that there's no other, shorter, string y that also rounds to z under round-to-nearest. But if y is a shorter string than x, it's also representable in 15 significant digits or fewer, so it differs from x by at least one 'unit in the last place'. To formalize that, find integers e and f such that 2^(e-1) <= x < 2^e and 10^(f-1) < x <= 10^f. Then the difference |x-y| is at least 10^(f-15). However, if y is too far away from x, it can't possibly round to z: since the binary64 format has a precision of 53 bits (away from the underflow and overflow ranges, at least) the interval of numbers that round to z has width at most 2^(e-53)[1]. We need to show that the width of this interval is smaller than |x-y|; that is, that 2^(e-53) < 10^(f-15).
But this follows from our choices: 2^(e-53) <= 2^-52 * x by our choice of e, and since 2^-52 < 10^-15 we get 2^(e-53) < 10^-15 * x. Finally, 10^-15 * x <= 10^(f-15) by our choice of f, so 2^(e-53) < 10^(f-15), as required.
It's not hard to find examples showing that 15 is best possible here. For example, the literal 8.123451234512346 has 16 significant digits, and converts to the floating-point value 0x1.03f35000dc341p+3, or 4573096494089025/562949953421312. When rendered back as a string using the shortest string algorithm, we get 8.123451234512347.
[1] Not quite true: there's an annoying corner case when z is an exact power of two, in which case the width of the interval is 1.5 * 2^(e-53). The statement remains true in that case, though; I'll leave the details as an exercise.
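Translating the 16-digit example above into irb (assuming IEEE 754 binary64 doubles, as in the rest of this answer):
x = 8.123451234512346   # 16 significant digits: not recovered
puts x                  # => 8.123451234512347 (the shortest string for the nearest double)
y = 8.12345123451234    # 15 significant digits: recovered exactly
puts y                  # => 8.12345123451234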
What is the recommended way to convert a floating point type to an integer type, truncating everything after the decimal point? CLng rounds, apparently, and the documentation for the = operator doesn't mention the subject.
Use Fix or Int, depending on the treatment you wish for negative numbers.
Microsoft article Q196652 discusses rounding in incredible detail. Here is an excerpt:
The VB Fix() function is an example of truncation. For example, Fix(3.5) is 3, and Fix(-3.5) is -3.
The Int() function rounds down to the highest integer less than the value. Both Int() and Fix() act the same way with positive numbers - truncating - but give different results for negative numbers: Int(-3.5) gives -4.
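For readers more at home in Ruby than VB, the analogous calls (illustrative only; the thread itself is about VB) behave the same way:
(3.5).truncate    # => 3    like Fix(3.5): drop the fractional part
(-3.5).truncate   # => -3   like Fix(-3.5)
(-3.5).floor      # => -4   like Int(-3.5): round toward negative infinity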
Full disclosure: I referred to this nice answer by elo80ka
See this:
Undocumented behavior of the CInt() function
The CInt() function rounds to the nearest integer value. In other words, CInt(2.4) returns 2, and CInt(2.6) returns 3.
This function exhibits an under-documented behavior when the fractional part is equal to 0.5. In this case, this function rounds down if the integer portion of the argument is even, but it rounds up if the integer portion is an odd number. For example, CInt(2.5) returns 2, but CInt(3.5) returns 4.
This behavior shouldn't be considered a bug, because it helps avoid introducing errors when doing statistical calculations.
UPDATE: Matthew Wills let us know that this behavior is indeed documented in VB6's help file: "When the fractional part is exactly 0.5, CInt and CLng always round it to the nearest even number. For example, 0.5 rounds to 0, and 1.5 rounds to 2. CInt and CLng differ from the Fix and Int functions, which truncate, rather than round, the fractional part of a number. Also, Fix and Int always return a value of the same type as is passed in."
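The round-half-to-even rule ("banker's rounding") that CInt/CLng use can be reproduced in Ruby 2.4+ for comparison:
2.5.round(half: :even)   # => 2   ties go to the even neighbour
3.5.round(half: :even)   # => 4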
For positive numbers you would use
truncated = Int(value)
If used on negative numbers it would go down, i.e. -7.2 would become -8.
I am wondering if this is true: When I take the square root of a squared integer, like in
f = Math.sqrt(123*123)
I will get a floating point number very close to 123. Due to floating point representation precision, this could be something like 122.99999999999999999999 or 123.000000000000000000001.
Since floor(122.999999999999999999) is 122, I should get 122 instead of 123. So I expect that floor(sqrt(i*i)) == i-1 in about 50% of the cases. Strangely, for all the numbers I have tested, floor(sqrt(i*i)) == i. Here is a small Ruby script to test the first 100 million numbers:
100_000_000.times do |i|
  puts i if Math.sqrt(i*i).floor != i
end
The above script never prints anything. Why is that so?
UPDATE: Thanks for the quick reply, this seems to be the solution: According to wikipedia
Any integer with absolute value less than or equal to 2^24 can be exactly represented in the single precision format, and any integer with absolute value less than or equal to 2^53 can be exactly represented in the double precision format.
Math.sqrt(i*i) starts to behave as I expected starting from i = 9007199254740993, which is 2^53 + 1.
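A quick irb check around that boundary (assuming IEEE 754 doubles) shows the same thing:
i = 2**53
Math.sqrt(i*i).floor == i   # => true   (2**53 and its square are exactly representable)
i = 2**53 + 1
Math.sqrt(i*i).floor == i   # => false  (i itself is no longer representable as a double)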
Here's the essence of your confusion:
Due to floating point representation precision, this could be something like 122.99999999999999999999 or 123.000000000000000000001.
This is false. It will always be exactly 123 on an IEEE-754 compliant system, which is almost all systems in these modern times. Floating-point arithmetic does not have "random error" or "noise". It has precise, deterministic rounding, and many simple computations (like this one) do not incur any rounding at all.
123 is exactly representable in floating-point, and so is 123*123 (so are all modest-sized integers). So no rounding error occurs when you convert 123*123 to a floating-point type. The result is exactly 15129.
Square root is a correctly rounded operation, per the IEEE-754 standard. This means that if there is an exact answer, the square root function is required to produce it. Since you are taking the square root of exactly 15129, which is exactly 123, that's exactly the result you get from the square root function. No rounding or approximation occurs.
Now, for how large of an integer will this be true?
Double precision can exactly represent all integers up to 2^53. So as long as i*i is less than 2^53, no rounding will occur in your computation, and the result will be exact for that reason. This means that for all i up to 94906265 (the largest integer whose square stays below 2^53), we know the computation will be exact.
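In Ruby terms, a small check of that boundary (assuming binary64 doubles):
i = 94906265
i*i < 2**53            # => true   (so i*i, and hence sqrt(i*i), is exact)
(i+1)*(i+1) < 2**53    # => false  (the next square no longer fits in 53 bits)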
But you tried i larger than that! What's happening?
For the largest i that you tried, i*i is just barely larger than 2^53 (1.1102... * 2^53, actually). Because conversions from integer to double (or multiplication in double) are also correctly rounded operations, i*i will be the representable value closest to the exact square of i. In this case, since i*i is 54 bits wide, the rounding will happen in the very lowest bit. Thus we know that:
i*i as a double = the exact value of i*i + rounding
where rounding is either -1, 0, or 1. If rounding is zero, then the square is exact, so the square root is exact, so we already know you get the right answer. Let's ignore that case.
So now we're looking at the square root of i*i +/- 1. Using a Taylor series expansion, the infinitely precise (unrounded) value of this square root is:
i * (1 +/- 1/(2i^2) + O(1/i^4))
Now this is a bit fiddly to see if you haven't done any floating point error analysis before, but if you use the fact that i^2 > 2^53, you can see that the:
1/(2i^2) + O(1/i^4)
term is smaller than 2^-54, which means that the exact (unrounded) square root lies within half a ULP of i. Since square root is correctly rounded, the rounded result of the sqrt function is exactly i.
It turns out that (with a similar analysis), for any exactly representable floating point number x, sqrt(x*x) is exactly x (assuming that the intermediate computation of x*x doesn't over- or underflow), so the only way you can encounter rounding for this type of computation is in the representation of x itself, which is why you see it starting at 2^53 + 1 (the smallest unrepresentable integer).
For "small" integers, there is usually an exact floating-point representation.
It's not too hard to find cases where this breaks down as you'd expect:
Math.sqrt(94949493295293425**2).floor
# => 94949493295293424
Math.sqrt(94949493295293426**2).floor
# => 94949493295293424
Math.sqrt(94949493295293427**2).floor
# => 94949493295293424
Ruby's Float is a double-precision floating point number, which means that it can accurately represent numbers with (as a rule of thumb) about 16 significant decimal digits. For regular single-precision floating point numbers it's about 7 significant digits.
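Ruby exposes these limits directly; on an IEEE 754 binary64 platform you'll see:
Float::DIG        # => 15   decimal digits guaranteed to survive a round trip
Float::MANT_DIG   # => 53   bits of significand precision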
You can find more information here:
What Every Computer Scientist Should Know About Floating-Point Arithmetic:
http://docs.sun.com/source/819-3693/ncg_goldberg.html
Regarding Reporting Services 2005:
I want to sum a measure field. If I sum 0.234 + 0.441 and format the sum to 2 digits after the point, it gives 0.7. Because I format the field itself the same way, the report shows 0.2 + 0.4 = 0.7.
It seems I have to do the sum by adding the rounded field values each time.
The only way I found to round a number to a few digits after the point is Format/FormatNumber, and Reporting Services doesn't accept Format(Sum(Format(number))). Is there another function?
The Math.Round function should do what you need: give it the number to round and the number of decimal places you want (2 here, I think), and it returns a Double (or Decimal etc., depending on what was passed in).
Then sum the rounded values.
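To illustrate the idea outside of SSRS, here is a small Ruby sketch (using one decimal place so the numbers match the 0.2 + 0.4 display in the question):
values = [0.234, 0.441]
rounded = values.map { |x| x.round(1) }   # => [0.2, 0.4]  what the report displays
rounded.sum.round(1)                      # => 0.6   summing the rounded values stays consistent
values.sum.round(1)                       # => 0.7   rounding only the final sum does not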