Why is there no name for the precision of the negative class in the assessment of a binary classifier? - metrics

In the assessment of a binary classifier, we know that specificity is the recall of the negative class.
So there are clear names for both flavors of recall.
Then, looking at precision, I can't help but wonder why there isn't an analogous version for the negative class, à la
tn / (tn + fn)
It seems useful to ask, "of all the samples predicted as negative, how many are really negative?" so you'd think that metric would have its own name. Is there something obvious I'm missing?

Actually there's a name for it: negative predictive value.
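To make the symmetry with the question's tn / (tn + fn) concrete, here is a minimal sketch. The function names and the example counts are made up for illustration; they don't come from any particular library.

```python
def negative_predictive_value(tn: int, fn: int) -> float:
    """Of all samples predicted negative, the fraction that really are negative."""
    predicted_negative = tn + fn
    if predicted_negative == 0:
        raise ValueError("no negative predictions, NPV is undefined")
    return tn / predicted_negative


def precision(tp: int, fp: int) -> float:
    """Positive predictive value: the same idea for the positive class."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return tp / predicted_positive


# Hypothetical confusion-matrix counts, for illustration only.
npv = negative_predictive_value(tn=90, fn=10)
ppv = precision(tp=40, fp=10)
print(npv, ppv)
```

Precision is sometimes called positive predictive value for exactly this reason, so the two names form a pair in the same way recall and specificity do.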

Related

Calculated fields- Formula using number e

Not sure if possible, but here's a working version of a form I want to reproduce in Redcap- http://extubation.net/
Provided equation that is calculated-
Gestational age, oxygen, respiratory score, day of life, pH, and weight at extubation are the input fields.
Is this possible to reproduce within a Redcap form?
Yes, it should be straightforward. I just plugged some random numbers into a calculation I defined based on your image and this popped out:
I don't know these metrics and the numbers I put in might be way off, so this might be off by some orders of magnitude, but the important points are:
In REDCap calculations, base and exponents are both wrapped in parentheses, so (3)^(2) yields 9, while 3^2 is not valid syntax.
To get Euler's number, you can either hardcode it to some degree of precision yourself in the calculation (2.718281828459045)^(-47.3379 + 0.40635*[gage]..., or you can use the JavaScript Math library and use (Math.E)^(-47.3379 + 0.40635*[gage].... If you do the latter, you must have your admin add the calculation for you, as JavaScript in calculations is reserved for administrators. It might only give you so much precision anyway, so you might as well hard code it.
My calculation was:
(Math.E) ^ (-47.3379 + 0.40635*[gage] - 0.0701*[o2] - 0.1273*[rss] + 0.04202*[edol] + 0.52154*10*[bgph] + 0.0016*[ext_weight_g])
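If you want to sanity-check the REDCap result outside REDCap, the same expression can be evaluated in a few lines of Python. This is only a sketch: the coefficients and field names are copied from the calculation above, while the input values are placeholders I made up and are not clinically meaningful.

```python
import math


def exponent_term(gage, o2, rss, edol, bgph, ext_weight_g):
    """The exponent from the REDCap calculation above, term by term."""
    return (-47.3379
            + 0.40635 * gage
            - 0.0701 * o2
            - 0.1273 * rss
            + 0.04202 * edol
            + 0.52154 * 10 * bgph
            + 0.0016 * ext_weight_g)


# Placeholder inputs, for illustration only.
fields = dict(gage=27, o2=30, rss=2, edol=5, bgph=7.3, ext_weight_g=900)
x = exponent_term(**fields)

hard_coded = 2.718281828459045 ** x   # e typed in by hand, as in option one
library = math.exp(x)                 # e taken from the math library
print(x, hard_coded, library)
```

For inputs of this size the hand-typed constant and the library value agree to many digits, which supports the point that hard-coding e is good enough here.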

error bound in function approximation algorithm

Suppose we have the set of floating point numbers with an "m"-bit mantissa and "e" bits for the exponent. Suppose moreover that we want to approximate a function "f".
From the theory we know that usually a "range reduced function" is used and then from such function we derive the global function value.
For example let x = (sx,ex,mx) (sign exp and mantissa) then...
log2(x) = ex + log2(1.mx) so basically the range reduced function is "log2(1.mx)".
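As a concrete illustration of this decomposition, here is a minimal sketch for ordinary Python doubles. It uses math.frexp to split x and lets math.log2 stand in for whatever approximation of log2(1.mx) is actually implemented; the function names are made up for the example.

```python
import math


def split_float(x: float):
    """Return (ex, one_point_mx) with x == one_point_mx * 2**ex and 1 <= one_point_mx < 2."""
    m, e = math.frexp(x)      # frexp gives x == m * 2**e with 0.5 <= m < 1
    return e - 1, m * 2.0     # rescale so the mantissa has the form 1.mx


def log2_via_range_reduction(x: float) -> float:
    if x <= 0.0:
        raise ValueError("log2 is only defined for positive x")
    ex, one_point_mx = split_float(x)
    # math.log2 is a stand-in for the range reduced approximation.
    return ex + math.log2(one_point_mx)


for x in (0.1, 1.0, 3.5, 1e10):
    print(x, log2_via_range_reduction(x), math.log2(x))
```

Note that the final addition ex + log2(1.mx) is itself rounded, so the error of the global result is not just the error of the reduced function.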
I have so far implemented reciprocal, square root, log2 and exp2, and recently I've started to work on the trigonometric functions. But I was wondering: given a global error bound (ulp error especially), is it possible to derive an error bound for the range reduced function? Is there some study of this kind of problem? Taking log2(x) as an example, I would like to be able to say...
"ok i want log2(x) with k ulp error, to achieve this given our floating point system we need to approximate log2(1.mx) with p ulp error"
Remember that, as I said, we know we are working with floating point numbers, but the format is generic, so it could be the classic F32, but also for example e=10, m=8 and so on.
I can't find any reference that shows this kind of study. The references I have (e.g. Muller's book) don't treat the topic in this way, so I was looking for a paper or similar. Do you know any reference?
I'm also trying to derive such bound by myself but it is not easy...
There is a description of current practice, along with a proposed improvement and an error analysis, at https://hal.inria.fr/ensl-00086904/document. The description of current practice appears consistent with the overview at https://docs.oracle.com/cd/E37069_01/html/E39019/z4000ac119729.html, which is consistent with my memory of the most talked about problem being the mod pi range reduction of trigonometric functions.
I think IEEE floating point was a big step forwards just because it standardized things at a time when there were a variety of computer architectures, so lowering the risks of porting code between them, but the accuracy requirements implied by this may have been overkill: for many problems the constraint on the accuracy of the output is the accuracy of the input data, not the accuracy of the calculation of intermediate values.

constrained regression with many variables

I have around 200 dummies, and wish to run a constrained OLS regression where I impose that the sum of all coefficients on the dummies is equal to 1.
One option is to type:
constraint define 1 dummy_1+dummy_2 +...+dummy_200=1
cnsreg y x_1 x_2 dummy_1-dummy_200, c(1)
...but typing the constraint out would obviously be very painful.
Is there a way to quickly define such a large constraint? The matrix form would be very quick and straightforward, but after much reading online and in the Stata documentation, it is not clear to me how to specify constraints in matrix form, or whether that is even possible.
There are at least two sides to this, how to do it and whether it will work in any statistical sense.
How to do it seems easier than you fear as the difficult bit is just inserting "+" signs between the variable names, and that's string manipulation. Something like
unab myvars : dummy_*
local myvars : subinstr local myvars " " "+", all
mac li
constraint 1 `myvars' = 1
should get you started. The macro list is so you can see what you did, especially if it is not what you want.
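If it helps to see the string manipulation outside Stata, this is the same idea as a short Python sketch: build the list of variable names, join them with "+", and append the right-hand side. The variable names follow the question; nothing here calls Stata.

```python
# Names dummy_1 ... dummy_200, as in the question.
dummy_names = [f"dummy_{i}" for i in range(1, 201)]

# Equivalent of replacing the spaces between names with "+".
constraint = "+".join(dummy_names) + " = 1"

print(constraint[:40] + " ... " + constraint[-25:])
```

The resulting text is what would follow "constraint 1" in the Stata code above.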
Whether it will work for you statistically is outside the scope of this forum, but if that's the only constraint note that it's consistent with all kinds of negative and positive coefficients. Perhaps there are special features of your problem that make it a natural constraint, but my intuition is that such a model will be hard to estimate.
I would take a completely different approach. Such constraints typically occur when trying out a different coding scheme for a set of indicator variables. If that is the case then I would use Stata's factor variables, combined with margins with the contrast operators.

JDBC / Oracle Double value insertion fails [duplicate]

double r = 11.631;
double theta = 21.4;
In the debugger, these are shown as 11.631000000000000 and 21.399999618530273.
How can I avoid this?
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
I liked Joel's explanation, which deals with a similar binary floating point precision issue in Excel 2007:
See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact.
So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong.
If you have a value like:
double theta = 21.4;
And you want to do:
if (theta == 21.4)
{
}
You have to be a bit clever: instead of testing for exact equality, check whether the value of theta is really close to 21.4.
if (fabs(theta - 21.4) <= 1e-6)
{
}
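The same tolerance idea, as a small Python sketch that can be run directly. The helper name nearly_equal and the default tolerance are arbitrary choices for the example; math.isclose is the standard-library version with a relative tolerance.

```python
import math


def nearly_equal(a: float, b: float, eps: float = 1e-6) -> bool:
    """Absolute-tolerance comparison, mirroring fabs(a - b) <= eps."""
    return abs(a - b) <= eps


total = 0.1 + 0.2

print(total == 0.3)               # exact comparison fails for this sum
print(nearly_equal(total, 0.3))   # tolerance comparison succeeds
print(math.isclose(total, 0.3))   # relative tolerance from the standard library
```

An absolute tolerance like 1e-6 only makes sense when you know the rough magnitude of the values; for values of widely varying size a relative tolerance is usually the safer choice.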
This is partly platform-specific - and we don't know what platform you're using.
It's also partly a case of knowing what you actually want to see. The debugger is showing you - to some extent, anyway - the precise value stored in your variable. In my article on binary floating point numbers in .NET, there's a C# class which lets you see the absolutely exact number stored in a double. The online version isn't working at the moment - I'll try to put one up on another site.
Given that the debugger sees the "actual" value, it's got to make a judgement call about what to display - it could show you the value rounded to a few decimal places, or a more precise value. Some debuggers do a better job than others at reading developers' minds, but it's a fundamental problem with binary floating point numbers.
Use the fixed-point decimal type if you want stability at the limits of precision. There are overheads, and you must explicitly cast if you wish to convert to floating point. If you do convert to floating point you will reintroduce the instabilities that seem to bother you.
Alternately you can get over it and learn to work with the limited precision of floating point arithmetic. For example you can use rounding to make values converge, or you can use epsilon comparisons to describe a tolerance. "Epsilon" is a constant you set up that defines a tolerance. For example, you may choose to regard two values as being equal if they are within 0.0001 of each other.
It occurs to me that you could use operator overloading to make epsilon comparisons transparent. That would be very cool.
For mantissa-exponent representations EPSILON must be computed to remain within the representable precision. For a number N, Epsilon = N / 10E+14
System.Double.Epsilon is the smallest representable positive value for the Double type. It is too small for our purpose. Read Microsoft's advice on equality testing
I've come across this before (on my blog) - I think the surprise tends to be that the 'irrational' numbers are different.
By 'irrational' here I'm just referring to the fact that they can't be accurately represented in this format. Real irrational numbers (like π - pi) can't be accurately represented at all.
Most people are familiar with 1/3 not working in decimal: 0.3333333333333...
The odd thing is that 1.1 doesn't work in floats. People expect decimal values to work in floating point numbers because of how they think of them:
1.1 is 11 x 10^-1
When actually they're in base-2
1.1 is 154811237190861 x 2^-47
You can't avoid it, you just have to get used to the fact that some floats are 'irrational', in the same way that 1/3 is.
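To see what is actually stored for 1.1, here is a short Python sketch using only the standard library. Python floats are IEEE doubles, so the exact integers differ from the illustrative figure above, but the point is the same: the stored value is an integer divided by a power of two, and it is not exactly 11/10.

```python
from decimal import Decimal
from fractions import Fraction

stored = Fraction(1.1)       # exact rational value of the double nearest to 1.1
exact_digits = Decimal(1.1)  # the same value written out in decimal

print(stored.numerator, "/", stored.denominator)
print(exact_digits)
print(stored == Fraction(11, 10))
```

The denominator is always a power of two, which is why 11/10 cannot be hit exactly.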
One way you can avoid this is to use a library that uses an alternative method of representing decimal numbers, such as BCD
If you are using Java and you need accuracy, use the BigDecimal class for floating point calculations. It is slower but safer.
Seems to me that 21.399999618530273 is the single precision (float) representation of 21.4. Looks like the debugger is casting down from double to float somewhere.
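That guess is easy to check by forcing 21.4 through a 32-bit float and back. A minimal Python sketch, using struct to do the narrowing (the variable names are just for the example):

```python
import struct

theta = 21.4

# Pack as a 4-byte IEEE single, then unpack again into a Python double.
as_float32 = struct.unpack("<f", struct.pack("<f", theta))[0]

print(f"{theta:.15f}")
print(f"{as_float32:.15f}")
```

The second line should match the value the debugger showed, while the first shows that the double itself is much closer to 21.4.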
You can't avoid this as you're using floating point numbers with a fixed number of bytes. There's simply no isomorphism possible between the real numbers and their limited notation.
But most of the time you can simply ignore it. 21.4==21.4 would still be true because it is still the same numbers with the same error. But 21.4f==21.4 may not be true because the error for float and double are different.
If you need fixed precision, perhaps you should try fixed point numbers. Or even integers. I for example often use int(1000*x) for passing to debug pager.
Dangers of computer arithmetic
If it bothers you, you can customize the way some values are displayed during debug. Use it with care :-)
Enhancing Debugging with the Debugger Display Attributes
Refer to General Decimal Arithmetic
Also take note when comparing floats, see this answer for more information.
According to the Java Language Specification
"If at least one of the operands to a numerical operator is of type double, then the
operation is carried out using 64-bit floating-point arithmetic, and the result of the
numerical operator is a value of type double. If the other operand is not a double, it is
first widened (§5.1.5) to type double by numeric promotion (§5.6)."
Here is the Source

Should I use NSDecimalNumber to deal with money?

As I started coding my first app I used NSNumber for money values without thinking twice. Then I thought that maybe C types were enough to deal with my values. Yet, I was advised in the iPhone SDK forum to use NSDecimalNumber, because of its excellent rounding capabilities.
Not being a mathematician by temperament, I thought that the mantissa/exponent paradigm might be overkill; still, googlin' around, I realised that most discussions of money/currency in Cocoa referred to NSDecimalNumber.
Notice that the app I am working on is going to be internationalised, so the option of counting the amount in cents is not really viable, for the monetary structure depends greatly on the locale used.
I am 90% sure that I need to go with NSDecimalNumber, but since I found no unambiguous answer on the web (something like: "if you deal with money, use NSDecimalNumber!") I thought I'd ask here. Maybe the answer is obvious to most, but I want to be sure before starting a massive re-factoring of my app.
Convince me :)
Marcus Zarra has a pretty clear stance on this: "If you are dealing with currency at all, then you should be using NSDecimalNumber." His article inspired me to look into NSDecimalNumber, and I've been very impressed with it. IEEE floating point errors when dealing with base-10 math have been irritating me for a while (1 * (0.5 - 0.4 - 0.1) = -0.00000000000000002776) and NSDecimalNumber does away with them.
NSDecimalNumber doesn't just add another few digits of binary floating point precision, it actually does base-10 math. This gets rid of the errors like the one shown in the example above.
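The difference is easy to reproduce outside Cocoa. The sketch below uses Python's decimal module as a stand-in for base-10 arithmetic; it is not NSDecimalNumber, just the same idea applied to the example above.

```python
from decimal import Decimal

# Binary floating point: the three literals are not exactly representable.
binary_result = 1 * (0.5 - 0.4 - 0.1)

# Base-10 arithmetic: the same literals are represented exactly.
decimal_result = 1 * (Decimal("0.5") - Decimal("0.4") - Decimal("0.1"))

print(f"{binary_result:.20f}")
print(decimal_result)
```

Note that the decimal values are built from strings; constructing them from floats would carry the binary error along.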
Now, I'm writing a symbolic math application, so my desire for 30+ decimal digit precision and no weird floating point errors might be an exception, but I think it's worth looking at. The operations are a little more awkward than simple var = 1 + 2 style math, but they're still manageable. If you're worried about allocating all sorts of instances during your math operations, NSDecimal is the C struct equivalent of NSDecimalNumber and there are C functions for doing the exact same math operations with it. In my experience, these are plenty fast for all but the most demanding applications (3,344,593 additions/s, 254,017 divisions/s on a MacBook Air, 281,555 additions/s, 12,027 divisions/s on an iPhone).
As an added bonus, NSDecimalNumber's descriptionWithLocale: method provides a string with a localized version of the number, including the correct decimal separator. The same goes in reverse for its initWithString:locale: method.
Yes. You have to use NSDecimalNumber and not double or float when you deal with currency on iOS.
Why is that??
Because we don't want to get things like $9.9999999998 instead of $10
How that happens??
Floats and doubles are approximations. They always come with a rounding error. The format computers use to store decimals causes this rounding error.
If you need more details read
http://floating-point-gui.de/
According to apple docs,
NSDecimalNumber, an immutable subclass of NSNumber, provides an object-oriented wrapper for doing base-10 arithmetic. An instance can represent any number that can be expressed as mantissa x 10^exponent where mantissa is a decimal integer up to 38 digits long, and exponent is an integer from –128 through 127.
So NSDecimalNumber is recommended for dealing with currency.
(Adapted from my comment on the other answer.)
Yes, you should. An integral number of pennies works only as long as you don't need to represent, say, half a cent. If that happens, you could change it to count half-cents, but what if you then need to represent a quarter-cent, or an eighth of a cent?
The only proper solution is NSDecimalNumber (or something like it), which puts off the problem to 10^-128¢ (i.e.,
0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000001¢).
(Another way would be arbitrary-precision arithmetic, but that requires a separate library, such as the GNU MP Bignum library. GMP is under the LGPL. I've never used that library and don't know exactly how it works, so I couldn't say how well it would work for you.)
[Edit: Apparently, at least one person—Brad Larson—thinks I'm talking about binary floating-point somewhere in this answer. I'm not.]
I've found it convenient to use an integer to represent the number of cents and then divide by 100 for presentation. Avoids the whole issue.
A better question is, when should you not use NSDecimalNumber to deal with money. The short answer to that question is, when you can't tolerate the performance overhead of NSDecimalNumber and you don't care about small rounding errors because you're never dealing with more than a few digits of precision. The even shorter answer is, you should always use NSDecimalNumber when dealing with money.
Visa, MasterCard and others use integer values when passing amounts. It's up to the sender and receiver to parse amounts correctly according to the currency exponent (divide or multiply by 10^num, where num is the exponent of the currency). Note that different currencies have different exponents. Usually it's 2 (hence we divide and multiply by 100), but some currencies have exponent 0 (VND, etc.) or 3.
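A minimal sketch of that integer-plus-exponent scheme, in Python for brevity. The exponent table is a tiny illustrative subset (KWD is my own example of an exponent-3 currency) and the function names are hypothetical; a real app would take exponents from ISO 4217 data or the platform's locale APIs.

```python
from decimal import Decimal

# Illustrative subset of currency exponents (number of minor-unit digits).
CURRENCY_EXPONENT = {"USD": 2, "VND": 0, "KWD": 3}


def to_minor_units(amount: Decimal, currency: str) -> int:
    """Convert a decimal amount to an integer count of minor units."""
    scaled = amount.scaleb(CURRENCY_EXPONENT[currency])  # multiply by 10**exponent
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more precision than {currency} allows")
    return int(scaled)


def from_minor_units(units: int, currency: str) -> Decimal:
    """Convert an integer count of minor units back to a decimal amount."""
    return Decimal(units).scaleb(-CURRENCY_EXPONENT[currency])  # divide by 10**exponent


print(to_minor_units(Decimal("12.34"), "USD"))
print(from_minor_units(1234, "KWD"))
print(to_minor_units(Decimal("5000"), "VND"))
```

Keeping the exponent next to the currency code, rather than assuming 100, is what lets the integer approach survive internationalisation.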
