Does GMP do lazy canonicalization with mpq_t? - performance

With GMP rationals, do I have to keep track of my calls to canonicalize() myself (which can be costly performance-wise)? Does GMP know that the rational was not changed since the last call to canonicalize(), and will it just return if I attempt canonicalization again?
I cannot find an answer in the documentation, and maybe someone already looked into the source for this.

It is unlikely to just return, as mpq_t does not contain any information about whether the fraction is already in canonical form. At least the GMP documentation does not mention any, as of §16.2 Rational Internals:
mpq_t variables represent rationals using an mpz_t numerator and
denominator (see Integer Internals).
In practice, it will likely call mpz_gcd() (or an equivalent) every time, to check whether the numerator and denominator are coprime.
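Since GMP keeps no such flag, one workable approach is to do that bookkeeping yourself. Below is a minimal C++ sketch; the LazyRational wrapper and its member names are my own invention, not part of the GMP API, and it only calls mpq_canonicalize() when the value may have changed:
#include <gmp.h>
#include <iostream>

// Minimal sketch: keep a "canonical" flag on our side, since GMP stores none.
// The LazyRational wrapper and its member names are hypothetical, not GMP API.
struct LazyRational {
    mpq_t q;
    bool canonical = false;

    LazyRational(long num, unsigned long den) {
        mpq_init(q);
        mpq_set_si(q, num, den);   // may leave q in non-canonical form
    }
    ~LazyRational() { mpq_clear(q); }

    void ensure_canonical() {
        if (!canonical) {          // skip the gcd work if nothing changed
            mpq_canonicalize(q);
            canonical = true;
        }
    }
};

int main() {
    LazyRational r(6, 4);          // stored as 6/4 until canonicalized
    r.ensure_canonical();          // reduces to 3/2, exactly once
    r.ensure_canonical();          // no-op: our flag says it is already canonical
    std::cout << mpq_get_str(nullptr, 10, r.q) << '\n';   // prints 3/2
}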

Related

Multiply two numbers whose range is 10^18

There is a variable first_variable which is always taken modulo some number mod_value.
At every step first_variable is multiplied by some number second_variable.
The range of all three variables is from 1 to 10^18.
For that I built the formula
first_variable = ((first_variable%mod_value)*(second_variable%mod_value))%mod_value
but this gives a wrong answer,
for example when first_variable and second_variable are (10^18)-1 and mod_value = 10^18.
Please suggest a method so that first_variable will always give the right answer.
It seems you are using a runtime where arithmetic is implemented with 64-bit integers. You can check this with multipliers like 2^32: if their product comes out as 0, the guess is correct. In that case you should switch to an arbitrary-precision arithmetic implementation, or at least one that is much wider than the current one. Python and Erlang, for example, support arbitrary-precision integers out of the box.
I've seen in the comments that you use C++. If so, look at the GMP library and its analogues. Or, if 128 bits is enough, modern GCC supports it through the __int128 extension.
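If 128 bits is enough, here is a minimal C++ sketch of the modular multiplication using GCC/Clang's unsigned __int128 extension (the mulmod function name is just for illustration):
#include <cstdint>
#include <iostream>

// Multiply a*b mod m without overflowing, by widening the product to 128 bits.
// Requires GCC or Clang; unsigned __int128 is a compiler extension, not standard C++.
uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m) {
    return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) % m);
}

int main() {
    uint64_t first  = 999999999999999999ULL;   // 10^18 - 1
    uint64_t second = 999999999999999999ULL;   // 10^18 - 1
    uint64_t mod    = 1000000000000000000ULL;  // 10^18
    std::cout << mulmod(first, second, mod) << '\n';  // prints 1, with no overflow
}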
This is basically an overflow, so you should either use a smaller mod_value (up to about 10^9) or limit the range of first_variable and second_variable.
Your intermediate product can be around 10^36, roughly 2^120, which cannot fit in any primitive data type in languages like Java or C++. Use BigInteger in Java, a big-integer library such as GMP in C++, or Python's built-in arbitrary-precision int to get over it.

JDBC / Oracle Double value insertion fails [duplicate]

double r = 11.631;
double theta = 21.4;
In the debugger, these are shown as 11.631000000000000 and 21.399999618530273.
How can I avoid this?
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
I liked Joel's explanation, which deals with a similar binary floating point precision issue in Excel 2007:
See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact.
So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong.
If you have a value like:
double theta = 21.4;
And you want to do:
if (theta == 21.4)
{
}
You have to be a bit clever: you need to check whether the value of theta is really close to 21.4, not whether it is exactly that value.
if (fabs(theta - 21.4) <= 1e-6)
{
}
This is partly platform-specific - and we don't know what platform you're using.
It's also partly a case of knowing what you actually want to see. The debugger is showing you - to some extent, anyway - the precise value stored in your variable. In my article on binary floating point numbers in .NET, there's a C# class which lets you see the absolutely exact number stored in a double. The online version isn't working at the moment - I'll try to put one up on another site.
Given that the debugger sees the "actual" value, it's got to make a judgement call about what to display - it could show you the value rounded to a few decimal places, or a more precise value. Some debuggers do a better job than others at reading developers' minds, but it's a fundamental problem with binary floating point numbers.
Use the fixed-point decimal type if you want stability at the limits of precision. There are overheads, and you must explicitly cast if you wish to convert to floating point. If you do convert to floating point you will reintroduce the instabilities that seem to bother you.
Alternately you can get over it and learn to work with the limited precision of floating point arithmetic. For example you can use rounding to make values converge, or you can use epsilon comparisons to describe a tolerance. "Epsilon" is a constant you set up that defines a tolerance. For example, you may choose to regard two values as being equal if they are within 0.0001 of each other.
It occurs to me that you could use operator overloading to make epsilon comparisons transparent. That would be very cool.
For mantissa-exponent representations EPSILON must be computed to remain within the representable precision. For a number N, Epsilon = N / 10E+14
System.Double.Epsilon is the smallest representable positive value for the Double type. It is too small for our purpose. Read Microsoft's advice on equality testing
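To make the scaled-epsilon idea above concrete, here is a small C++ sketch; the 1e-9 tolerance factor and the nearly_equal name are just illustrative choices, not a library API:
#include <cmath>
#include <cstdio>

// Compare two doubles with a tolerance that scales with their magnitude,
// instead of a fixed absolute epsilon. 1e-9 is only an example value; pick
// a factor that matches the precision your application actually needs.
bool nearly_equal(double a, double b, double rel_tol = 1e-9) {
    double scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}

int main() {
    std::printf("%d\n", nearly_equal(21.4, 21.4));        // 1: identical values
    std::printf("%d\n", nearly_equal(0.1 + 0.2, 0.3));    // 1: rounding error absorbed
    std::printf("%d\n", nearly_equal(21.4, 21.5));        // 0: genuinely different
}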
I've come across this before (on my blog) - I think the surprise tends to be that the 'irrational' numbers are different.
By 'irrational' here I'm just referring to the fact that they can't be accurately represented in this format. Real irrational numbers (like π - pi) can't be accurately represented at all.
Most people are familiar with 1/3 not working in decimal: 0.3333333333333...
The odd thing is that 1.1 doesn't work in floats. People expect decimal values to work in floating point numbers because of how they think of them:
1.1 is 11 x 10^-1
When actually they're in base-2
1.1 is 154811237190861 x 2^-47
You can't avoid it, you just have to get used to the fact that some floats are 'irrational', in the same way that 1/3 is.
One way you can avoid this is to use a library that uses an alternative method of representing decimal numbers, such as BCD
If you are using Java and you need accuracy, use the BigDecimal class for floating point calculations. It is slower but safer.
Seems to me that 21.399999618530273 is the single precision (float) representation of 21.4. Looks like the debugger is casting down from double to float somewhere.
You can't avoid this, as you're using floating point numbers with a fixed number of bytes. There's simply no isomorphism possible between the real numbers and such a limited notation.
But most of the time you can simply ignore it. 21.4 == 21.4 would still be true, because it is the same number with the same error. But 21.4f == 21.4 may not be true, because the errors for float and double are different.
If you need fixed precision, perhaps you should try fixed point numbers, or even integers. For example, I often use int(1000*x) for passing values to a debug pager.
Dangers of computer arithmetic
If it bothers you, you can customize the way some values are displayed during debug. Use it with care :-)
Enhancing Debugging with the Debugger Display Attributes
Refer to General Decimal Arithmetic
Also take note when comparing floats, see this answer for more information.
According to the javadoc
"If at least one of the operands to a numerical operator is of type double, then the
operation is carried out using 64-bit floating-point arithmetic, and the result of the
numerical operator is a value of type double. If the other operand is not a double, it is
first widened (§5.1.5) to type double by numeric promotion (§5.6)."
Here is the Source

__builtin_expect from GCC with probability

__builtin_expect from GCC can be used by the programmer to indicate which branches are expected to be taken very often and which are rare. But __builtin_expect only has "true" and "false" (0% or 100% probability).
For some big projects it is very hard to get profile feedback (-fprofile-arcs), and sometimes the programmer does know what probability a branch has at some point in the program.
Is it possible to give the compiler a hint that a branch has a probability >0% and <100%?
From here:
long __builtin_expect_with_probability (long exp, long c, double probability)
The function has the same semantics as __builtin_expect, but the caller provides the expected probability that exp == c. The last argument, probability, is a floating-point value in the inclusive range 0.0 to 1.0. The probability argument must be a constant floating-point expression.
As Jesin pointed out in the comments, Clang 11 has it too.
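A small usage sketch, assuming GCC 9+ or Clang 11+; the 0.9 probability and the process function are just illustrative:
#include <cstdio>

// Hint that the positive branch is taken roughly 90% of the time.
// __builtin_expect_with_probability requires GCC 9+ or Clang 11+.
void process(int x) {
    if (__builtin_expect_with_probability(x > 0, 1, 0.9)) {
        std::puts("common case");   // laid out as the likely path
    } else {
        std::puts("rare case");
    }
}

int main() {
    process(5);
    process(-1);
}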
True and false really mean that "the first variant is more likely" and "the second variant is more likely". There's no practical need for any values other than these. The compiler won't be able to use that information.
Non-determinism is not a desirable trait for compiler output, let alone language features. There is no real benefit to only partial optimization preferring one branch, and no compiler I'm aware of can do this.

gcc implementation of rand()

I've tried for hours to find the implementation of the rand() function used by gcc...
It would be much appreciated if someone could point me to the file containing its implementation, or to a website with the implementation.
By the way, which directory (I'm using Ubuntu if that matters) contains the c standard library implementations for the gcc compiler?
rand consists of a call to a function __random, which mostly just calls another function called __random_r in random_r.c.
Note that the function names above are hyperlinks to the glibc source repository, at version 2.28.
The glibc random library supports two kinds of generator: a simple linear congruential one, and a more sophisticated linear feedback shift register one. It is possible to construct instances of either, but the default global generator, used when you call rand, uses the linear feedback shift register generator (see the definition of unsafe_state.rand_type).
You will find the C library implementation used by GCC in the GNU glibc project.
You can download its sources and you should find the rand() implementation there. Sources with function definitions are usually not installed on a Linux distribution; only the header files, which I guess you already know, are usually stored in the /usr/include directory.
If you are familiar with GIT source code management, you can do:
$ git clone git://sourceware.org/git/glibc.git
to get the glibc source code.
The files are available via FTP. I found that there is more to the rand() used in stdlib, which comes from glibc. In the 2.32 version (glibc-2.32.tar.gz) obtained from here, the stdlib folder contains a random.c file whose comment explains that a trinomial-based linear feedback shift register is used by default, falling back to a simple linear congruential algorithm for small state sizes. The folder also has rand.c and rand_r.c, which show you more of the source code. stdlib.h in the same folder shows the values used for macros like RAND_MAX.
/* An improved random number generation package. In addition to the
standard rand()/srand() like interface, this package also has a
special state info interface. The initstate() routine is called
with a seed, an array of bytes, and a count of how many bytes are
being passed in; this array is then initialized to contain
information for random number generation with that much state
information. Good sizes for the amount of state information are
32, 64, 128, and 256 bytes. The state can be switched by calling
the setstate() function with the same array as was initialized with
initstate(). By default, the package runs with 128 bytes of state
information and generates far better random numbers than a linear
congruential generator. If the amount of state information is less
than 32 bytes, a simple linear congruential R.N.G. is used.
Internally, the state information is treated as an array of longs;
the zeroth element of the array is the type of R.N.G. being used
(small integer); the remainder of the array is the state
information for the R.N.G. Thus, 32 bytes of state information
will give 7 longs worth of state information, which will allow a
degree seven polynomial. (Note: The zeroth word of state
information also has some other information stored in it; see setstate
for details). The random number generation technique is a linear
feedback shift register approach, employing trinomials (since there
are fewer terms to sum up that way). In this approach, the least
significant bit of all the numbers in the state table will act as a
linear feedback shift register, and will have period 2^deg - 1
(where deg is the degree of the polynomial being used, assuming
that the polynomial is irreducible and primitive). The higher order
bits will have longer periods, since their values are also
influenced by pseudo-random carries out of the lower bits. The
total period of the generator is approximately deg*(2**deg - 1); thus
doubling the amount of state information has a vast influence on the
period of the generator. Note: The deg*(2**deg - 1) is an
approximation only good for large deg, when the period of the shift
register is the dominant factor. With deg equal to seven, the
period is actually much longer than the 7*(2**7 - 1) predicted by
this formula. */
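To see the behaviour that comment describes from your own code, you can pick the state size yourself through the POSIX initstate()/random() interface that glibc provides. A sketch, assuming glibc: 8 bytes of state selects the linear congruential fallback, 128 bytes selects the default shift-register generator.
#include <cstdlib>   // random(), srandom(), initstate() -- POSIX, provided by glibc
#include <cstdio>

int main() {
    // 8 bytes of state: glibc falls back to the simple linear congruential generator.
    static char lcg_state[8];
    initstate(12345u, lcg_state, sizeof lcg_state);
    std::printf("LCG:  %ld\n", random());

    // 128 bytes of state (the default amount): trinomial linear feedback shift register.
    static char lfsr_state[128];
    initstate(12345u, lfsr_state, sizeof lfsr_state);
    std::printf("LFSR: %ld\n", random());
}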

Should I use NSDecimalNumber to deal with money?

As I started coding my first app I used NSNumber for money values without thinking twice. Then I thought that maybe C types were enough to deal with my values. Yet, I was advised in the iPhone SDK forum to use NSDecimalNumber, because of its excellent rounding capabilities.
Not being a mathematician by temperament, I thought that the mantissa/exponent paradigm might be overkill; still, googlin' around, I realised that most discussions about money/currency in Cocoa referred to NSDecimalNumber.
Notice that the app I am working on is going to be internationalised, so the option of counting the amount in cents is not really viable, for the monetary structure depends greatly on the locale used.
I am 90% sure that I need to go with NSDecimalNumber, but since I found no unambiguous answer on the web (something like: "if you deal with money, use NSDecimalNumber!") I thought I'd ask here. Maybe the answer is obvious to most, but I want to be sure before starting a massive re-factoring of my app.
Convince me :)
Marcus Zarra has a pretty clear stance on this: "If you are dealing with currency at all, then you should be using NSDecimalNumber." His article inspired me to look into NSDecimalNumber, and I've been very impressed with it. IEEE floating point errors when dealing with base-10 math have been irritating me for a while (1 * (0.5 - 0.4 - 0.1) = -0.00000000000000002776) and NSDecimalNumber does away with them.
NSDecimalNumber doesn't just add another few digits of binary floating point precision, it actually does base-10 math. This gets rid of the errors like the one shown in the example above.
Now, I'm writing a symbolic math application, so my desire for 30+ decimal digit precision and no weird floating point errors might be an exception, but I think it's worth looking at. The operations are a little more awkward than simple var = 1 + 2 style math, but they're still manageable. If you're worried about allocating all sorts of instances during your math operations, NSDecimal is the C struct equivalent of NSDecimalNumber and there are C functions for doing the exact same math operations with it. In my experience, these are plenty fast for all but the most demanding applications (3,344,593 additions/s, 254,017 divisions/s on a MacBook Air, 281,555 additions/s, 12,027 divisions/s on an iPhone).
As an added bonus, NSDecimalNumber's descriptionWithLocale: method provides a string with a localized version of the number, including the correct decimal separator. The same goes in reverse for its initWithString:locale: method.
Yes. You have to use NSDecimalNumber, not double or float, when you deal with currency on iOS.
Why is that?
Because we don't want to get things like $9.9999999998 instead of $10.
How does that happen?
Floats and doubles are approximations. They always come with a rounding error, and the format computers use to store decimal numbers is what causes this rounding error.
If you need more details read
http://floating-point-gui.de/
According to the Apple docs,
NSDecimalNumber, an immutable subclass of NSNumber, provides an object-oriented wrapper for doing base-10 arithmetic. An instance can represent any number that can be expressed as mantissa x 10^exponent, where mantissa is a decimal integer up to 38 digits long and exponent is an integer from –128 through 127.
So NSDecimalNumber is recommended for dealing with currency.
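To make the mantissa x 10^exponent idea concrete, here is a tiny C++ sketch of base-10 arithmetic in that style. It only illustrates the concept; it is not how NSDecimalNumber is implemented, and the Decimal struct is purely hypothetical:
#include <cstdint>
#include <cstdio>

// Toy decimal value: mantissa * 10^exponent. Storing the scale in base 10 is
// why amounts like 0.1 are represented exactly, unlike with binary doubles.
struct Decimal {
    int64_t mantissa;
    int     exponent;   // power of ten
};

// Add two decimals by rescaling to the smaller exponent first (no rounding involved).
Decimal add(Decimal a, Decimal b) {
    while (a.exponent > b.exponent) { a.mantissa *= 10; --a.exponent; }
    while (b.exponent > a.exponent) { b.mantissa *= 10; --b.exponent; }
    return { a.mantissa + b.mantissa, a.exponent };
}

int main() {
    Decimal ten_cents  = { 1, -1 };   // 0.10
    Decimal five_cents = { 5, -2 };   // 0.05
    Decimal sum = add(ten_cents, five_cents);
    std::printf("%lld x 10^%d (= 0.15 exactly)\n",
                static_cast<long long>(sum.mantissa), sum.exponent);
}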
(Adapted from my comment on the other answer.)
Yes, you should. An integral number of pennies works only as long as you don't need to represent, say, half a cent. If that happens, you could change it to count half-cents, but what if you then need to represent a quarter-cent, or an eighth of a cent?
The only proper solution is NSDecimalNumber (or something like it), which puts off the problem to 10^-128¢ (i.e., 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001¢).
(Another way would be arbitrary-precision arithmetic, but that requires a separate library, such as the GNU MP Bignum library. GMP is under the LGPL. I've never used that library and don't know exactly how it works, so I couldn't say how well it would work for you.)
[Edit: Apparently, at least one person—Brad Larson—thinks I'm talking about binary floating-point somewhere in this answer. I'm not.]
I've found it convenient to use an integer to represent the number of cents and then divide by 100 for presentation. Avoids the whole issue.
A better question is, when should you not use NSDecimalNumber to deal with money. The short answer to that question is, when you can't tolerate the performance overhead of NSDecimalNumber and you don't care about small rounding errors because you're never dealing with more than a few digits of precision. The even shorter answer is, you should always use NSDecimalNumber when dealing with money.
Visa, MasterCard and others use integer values when passing amounts. It's up to the sender and receiver to parse amounts correctly according to the currency exponent (divide or multiply by 10^num, where num is the exponent of the currency). Note that different currencies have different exponents. Usually it's 2 (hence we divide and multiply by 100), but some currencies have exponent = 0 (VND, etc.) or = 3.
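As a small illustration of that minor-units scheme, here is a C++ sketch that formats an amount using a currency exponent with integer arithmetic only; the print_amount helper and the sample currencies are just examples:
#include <cstdint>
#include <cinttypes>
#include <cstdio>

// Format an amount given in minor units using the currency's exponent,
// with pure integer arithmetic so no floating-point rounding is involved.
// Negative amounts would need extra handling; omitted to keep the sketch short.
void print_amount(int64_t minor, int exponent, const char* code) {
    int64_t scale = 1;
    for (int i = 0; i < exponent; ++i) scale *= 10;
    if (exponent == 0) {
        std::printf("%" PRId64 " %s\n", minor, code);
    } else {
        std::printf("%" PRId64 ".%0*" PRId64 " %s\n",
                    minor / scale, exponent, minor % scale, code);
    }
}

int main() {
    print_amount(1999, 2, "USD");    // 19.99 (exponent 2, the common case)
    print_amount(50000, 0, "VND");   // 50000 (exponent 0)
    print_amount(1500, 3, "BHD");    // 1.500 (exponent 3)
}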
