Should I test if equal to 1 or not equal to 0? - performance

I was coding here the other day, writing a couple of if statements with integers that are always either 0 or 1 (practically acting as bools). I asked myself:
When testing for positive result, which is better; testing for int == 1 or int != 0?
For example, given an int n, if I want to test if it's true, should I use n == 1 or n != 0?
Is there any difference at all in regards to speed, processing power, etc?
Please ignore the fact that the int may be more/less than 1/0, it is irrelevant and does not occur.

The human brain processes statements without negations more easily, which makes "int == 1" the better way.

It really depends. If you're using a language that supports booleans, you should use the boolean, not an integer, ie:
if (value == false)
or
if (value == true)
That being said, with real boolean types, it's perfectly valid (and typically nicer) to just write:
if (!value)
or
if (value)
There is really very little reason in most modern languages to ever use an integer for a boolean operation.
That being said, if you're using a language which does not support booleans directly, the best option here really depends on how you're defining true and false. Often, false is 0, and true is anything other than 0. In that situation, use if (i == 0) for the false check and if (i != 0) for the true check.
If you're guaranteed that 0 and 1 are the only two values, I'd probably use if (i == 1) since a negation is more complex, and more likely to lead to maintenance bugs.

If you're working with values that can only be 1 or 0, then I suggest you use boolean values to begin with and then just do if (bool) or if (!bool).

In languages where ints that are not 0 represent the boolean value 'true', and 0 'false', like C, I tend to use if (int != 0) because it carries the same meaning as if (int), whereas int == 1 reads more as the integer value being equal to 1 than as the boolean true. It may be just me though. In languages that support the boolean type, always use it rather than ints.

A daft question really. If you're testing for 1, test for 1; if you're testing for zero, test for zero.
The addition of an else statement can make the choice seem arbitrary. I'd choose whichever makes the most sense or has more contextual significance: the default or 'natural' behaviour suggested by expected frequency of occurrence, for example.
This choice between int == 0 and int != 1 may very well boil down to subjective evaluations which probably aren't worth worrying about.

Two points:
1) As noted above, being more explicit is a win. If you add something to an empty list you not only want its size to be not zero, but you also want it to be explicitly 1.
2) You may want to do
(1 == int)
That way if you forget an = you'll end up with a compile error rather than a debugging session.

To be honest if the value of int is just 1 or 0 you could even say:
if (int)
and that would be the same as saying
if (int != 0)
but you probably would want to use
if (int == 1)
because not zero would potentially let the answer be something other than 1 even though you said not to worry about it.
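A small sketch of why the two tests only agree under the 0-or-1 guarantee. The helper names are made up for illustration, and C++ stands in here for whatever C-like language is in use:

```cpp
// Hypothetical helpers, not taken from any of the posts.

// C-style truthiness: anything other than 0 counts as true.
constexpr bool is_true_nonzero(int n) { return n != 0; }

// Strict test: only the value 1 counts as true.
constexpr bool is_true_one(int n) { return n == 1; }
```

For 0 and 1 both helpers give the same answer. For a stray value such as 2, is_true_nonzero says true while is_true_one says false, which is the difference described above.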

If only two values are possible, then I would use the first:
if(int == 1)
because it is more explicit. If there were no constraint on the values, I would think otherwise.

IF INT IS 1
NEXT SENTENCE
ELSE MOVE "INT IS NOT ONE" TO MESSAGE.

As others have said, using == is frequently easier to read than using !=.
That said, most processors have a specific compare-to-zero operation. It depends on the specific compiler, processor, et cetera, but there may be an almost immeasurably small speed benefit to using != 0 over == 1 as a result.
Most languages will let you use if (int) and if (!int), though, which is both more readable and gets you that minuscule speed bonus.

I'm paranoid. If a value is either 0 or 1 then it might be 2. Maybe not today, maybe not tomorrow, but some maintenance programmer is going to do something weird in a subclass. Sometimes I make mistakes myself [shh, don't tell my employer]. So, make the code tell me that the value is either 0 or 1, otherwise it cries to mummy.
if (i == 0) {
... 0 stuff ...
} else if (i == 1) {
... 1 stuff ...
} else {
throw new Error();
}
(You might prefer switch - I find its syntax in curly-brace languages too heavy.)
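For comparison, here is roughly what the switch variant could look like, sketched in C++. The function name and the returned strings are placeholders standing in for the "0 stuff" and "1 stuff" branches:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical function; mirrors the if/else chain with a switch.
std::string describe_flag(int i) {
    switch (i) {
        case 0:
            return "0 stuff";
        case 1:
            return "1 stuff";
        default:
            // Anything outside {0, 1} is treated as a programming error.
            throw std::invalid_argument("expected 0 or 1, got " + std::to_string(i));
    }
}
```

The default branch plays the role of the final else: a value that should never occur fails loudly instead of silently taking one of the two paths.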

When using integers as booleans, I prefer to interpret them as follows: false = 0, true = non-zero.
I would write the condition statements as int == 0 and int != 0.

I would say it depends on the semantics. If your condition means
while ( ! abort ) negation is ok.
if ( quit ) break; would also be ok.

if( is_numeric( $int ) ) { its a number }
elseif( !$int ) { $int is not set or false }
else { its set but its not a number }
end of discussion :P

I agree with what most people have said in this post. It's much more efficient to use boolean values if you have one of two distinct possibilities. It also makes the code a lot easier to read and interpret.
if(bool) { ... }

I come from the C world. At first I didn't understand much about Objective-C. After a while, I came to prefer something like:
if (int == YES)
or
if (int == NO)
in c, i.e.:
if (int == true)
if (int == false)
these days, I use varchar instead of integer as table keys too, e.g.
name marital_status
------ --------------
john single
joe married
is a lot better than:
name marital_status
------ --------------
john S
joe M
or
name marital_status
------ --------------
john 1
joe 2

(Assuming your ints can only be 1 or 0) The two statements are logically equivalent. I'd recommend using the == syntax though because I think it's clearer to most people when you don't introduce unnecessary negations.

How to write a hash function that makes such an expression true?

pseudocode:
// deprecated x!=y && hash(x) == hash(y) // how to make this true?
x!=y && hash(x) == hash(y) && (z!=x && z!=y) && (hash(x) != hash(z) && hash(y) != hash(z)) // how to make this true?
x and y can be any readable value.
Whatever the language, the pseudocode is just there to help you understand what I mean.
I just wonder how to implement such a hash function.
PS: I am hopeless at math. I cannot imagine whether there is an algorithm that can do this.
UPDATE 1:
The pseudocode had a bug, so I updated the code (actually it still has a bug, never mind, I will explain).
My original requirement is a hash function that returns the same value for different parameters, where the parameter values follow some rule. That is, only parameter values in the same category get the same hash code; others do not.
e.g.
The following expressions should clearly hold (you can treat '0' as a placeholder):
hash("1.1") == hash("1.0") == hash("0.1")
hash("2.2") == hash("2.0") == hash("0.2")
and
hash("2.2") != hash("2.1") != hash("1.2")
I think the question can be described like this:
There are two or more different values that share the same implied attribute.
Only these values have that attribute.
The attribute can be obtained in some way (maybe through a function), which hash() calls internally.
If you hash() any one of the values, you can retrieve the attribute and then get the unique hash code.
It looks like a hash collision, but we know exactly which values collide. It also looks like a many-to-one model.
How should the collision rules be designed? The values could be any characters or numbers. And how should the design be implemented?
PPS: This question is full of bugs, and maybe the updated parts cannot explain the problem either. Or maybe this is a false proposition. I wanted to abstract my issue into a general model, but it made my mind overflow. If necessary I will post the actual issue that I am facing.
Any constant hash trivially satisfies your condition:
hash(v) = 42
A less constant answer than yuri kilocheck's would be to use the mod operator:
hash(v) = v % 10;
Then you'll have:
hash(1) = 1
hash(2) = 2
hash(3) = 3
...
hash(11) = 1
hash(12) = 2
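Going by the update in the question, one way to get "same category, same hash" is to first map each value to a canonical category key and then hash that key with an ordinary hash function. The sketch below is only one possible reading of the '0' placeholder rule: it assumes values of the form "a.b" and that a part equal to "0" takes the value of the other part. Both function names are made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical rule: split "a.b" at the dot; a part that is exactly "0" is a
// placeholder and is replaced by the other part. "1.0" and "0.1" both become "1.1".
std::string category_of(const std::string& value) {
    const auto dot = value.find('.');
    if (dot == std::string::npos) {
        return value;  // no dot: the value is its own category
    }
    std::string left = value.substr(0, dot);
    std::string right = value.substr(dot + 1);
    if (left == "0") {
        left = right;
    } else if (right == "0") {
        right = left;
    }
    return left + "." + right;
}

// Values in the same category always hash equal, because only the category key is hashed.
std::size_t category_hash(const std::string& value) {
    return std::hash<std::string>{}(category_of(value));
}
```

Equal categories are guaranteed to produce equal hashes. Different categories will normally produce different hashes, but as with any hash function an accidental collision cannot be ruled out, so an exact comparison should compare category_of results.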

Early return statements and cyclomatic complexity

I prefer this writing style with early returns:
public static Type classify(int a, int b, int c) {
if (!isTriangle(a, b, c)) {
return Type.INVALID;
}
if (a == b && b == c) {
return Type.EQUILATERAL;
}
if (b == c || a == b || c == a) {
return Type.ISOSCELES;
}
return Type.SCALENE;
}
Unfortunately, every return statement increases the cyclomatic complexity metric calculated by Sonar. Consider this alternative:
public static Type classify(int a, int b, int c) {
final Type result;
if (!isTriangle(a, b, c)) {
result = Type.INVALID;
} else if (a == b && b == c) {
result = Type.EQUILATERAL;
} else if (b == c || a == b || c == a) {
result = Type.ISOSCELES;
} else {
result = Type.SCALENE;
}
return result;
}
The cyclomatic complexity of this latter approach reported by Sonar is lower than the first, by 3. I have been told that this might be the result of a wrong implementation of the CC metrics. Or is Sonar correct, and this is really better? These related questions seem to disagree with that:
https://softwareengineering.stackexchange.com/questions/118703/where-did-the-notion-of-one-return-only-come-from
https://softwareengineering.stackexchange.com/questions/18454/should-i-return-from-a-function-early-or-use-an-if-statement
If I add support for a few more triangle types, the return statements will add up to make a significant difference in the metric and cause a Sonar violation. I don't want to stick a // NOSONAR on the method, as that might mask other problems by new features/bugs added to the method in the future. So I use the second version, even though I don't really like it. Is there a better way to handle the situation?
Your question relates to https://jira.codehaus.org/browse/SONAR-4857. For the time being, all SonarQube analysers mix cyclomatic complexity and essential complexity. From a theoretical point of view, a return statement should not increment the CC, and this change is going to happen in the SQ ecosystem.
Not really an answer, but way too long for a comment.
This SONAR rule seems to be thoroughly broken. You could rewrite
b == c || a == b || c == a
as
b == c | a == b | c == a
and gain two points in this strange game (and maybe even some speed as branching is expensive; but this is at the discretion of the JIT compiler, anyway).
The old rule claims that cyclomatic complexity is related to the number of tests. The new one doesn't, and that's a good thing, as the number of meaningful tests for both of your snippets is obviously exactly the same.
Is there a better way to handle the situation?
Actually, I do have an answer: For each early return use | instead of || once. :D
Now seriously: There is a bug requesting annotations that allow disabling a single rule, which is marked as fixed. I didn't look any further.
Since the question is also about early return statements as a coding style, it would be helpful to consider the effect of size on the return style. If the method or function is small, less than say 30 lines, early returns are no problem, because anyone reading the code can see the whole method at a glance including all of the returns. In larger methods or functions, an early return can be a trap unintentionally set for the reader. If the early return occurs above the code the reader is looking at, and the reader doesn't know the return is above or forgets that it is above, the reader will misunderstand the code. Production code can be too big to fit on one screen.
So whoever is managing a code base for complexity should allow for method size in cases where the complexity appears to be a problem. If the code takes more than one screen, a more pedantic return style may be justified. If the method or function is small, don't worry about it.
(I use Sonar and have experienced this same issue.)

Coding styles in conditional expression of some programming languages

I'm a bit confused about the difference between the conditional expressions below:
if( 1 == a) {
//something
}
and
if( a == 1 ) {
//something
}
I saw the above one in some scripts I have downloaded and I wonder what's the difference between them.
The former has been coined a Yoda Condition.
Using if(constant == variable) instead of if(variable == constant), like if(1 == a). Because it's like saying "if blue is the sky" or "if tall is the man".
The constant == variable syntax is often used to avoid mistyping == as =. It is, of course, also often used without understanding, even when you have constant == function_call_returning_nothing_modifiable.
Other than that there's no difference, unless you have some weird operator override.
Many programming languages allow assignments like a = 1 to be used as expressions, making the following code syntactically valid (given that integers can be used in conditionals, such as in C or many scripting languages):
if (a = 1) {
// something
}
This is rarely desired, and can lead to unexpected behavior. If 1 == a is used, then this mistake cannot occur because 1 = a is not valid.
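To make the failure mode concrete, here is a small C++ sketch. Both function names are invented for the example, and the doubled parentheses in the buggy version are there only so that compilers do not warn about the assignment:

```cpp
// Hypothetical functions, for illustration only.

// Intended to test a == 1, but one '=' went missing. The condition assigns 1
// to a and then tests the result, so it is always true.
bool buggy_is_one(int a) {
    if ((a = 1)) {
        return true;
    }
    return false;
}

// Yoda form. Dropping one '=' here would give (1 = a), which does not compile,
// because a literal cannot be assigned to.
bool yoda_is_one(int a) {
    return 1 == a;
}
```

buggy_is_one returns true for every input, including 0, while yoda_is_one behaves like the ordinary a == 1 comparison.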
Well, I am not sure about the trick. Generally, we could say the equal sign is commutative. So, a = b implies b = a. However, when you have == or === this doesn't work in certain cases, for example when on the right side you have a range: 5 === (1..10) vs. (1..10) === 5.

When are numbers NOT Magic?

I have a function like this:
float_as_thousands_str_with_precision(value, precision)
If I use it like this:
float_as_thousands_str_with_precision(volts, 1)
float_as_thousands_str_with_precision(amps, 2)
float_as_thousands_str_with_precision(watts, 2)
Are those 1/2s magic numbers?
Yes, they are magic numbers. It's obvious that the numbers 1 and 2 specify precision in the code sample but not why. Why do you need amps and watts to be more precise than volts at that point?
Also, avoiding magic numbers allows you to centralize code changes rather than having to scour the code for the literal number 2 when your precision needs to change.
I would propose something like:
HIGH_PRECISION = 3;
MED_PRECISION = 2;
LOW_PRECISION = 1;
And your client code would look like:
float_as_thousands_str_with_precision(volts, LOW_PRECISION )
float_as_thousands_str_with_precision(amps, MED_PRECISION )
float_as_thousands_str_with_precision(watts, MED_PRECISION )
Then, if in the future you do something like this:
HIGH_PRECISION = 6;
MED_PRECISION = 4;
LOW_PRECISION = 2;
All you do is change the constants...
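A compilable C++ sketch of the same idea. The original float_as_thousands_str_with_precision is not shown anywhere in the question, so format_fixed below is a hypothetical stand-in that only does fixed-point formatting:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Named precision levels, defined in one place.
constexpr int LOW_PRECISION = 1;
constexpr int MED_PRECISION = 2;
constexpr int HIGH_PRECISION = 3;

// Hypothetical stand-in for the formatting function from the question:
// renders value with the given number of digits after the decimal point.
std::string format_fixed(double value, int precision) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(precision) << value;
    return out.str();
}
```

A call such as format_fixed(volts, LOW_PRECISION) keeps the literal out of the call site, so changing the precision later means touching only the constant.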
But to try and answer the question in the OP title:
IMO the only numbers that can truly be used and not be considered "magic" are -1, 0 and 1 when used in iteration, testing lengths and sizes and many mathematical operations. Some examples where using constants would actually obfuscate code:
for (int i=0; i<someCollection.Length; i++) {...}
if (someCollection.Length == 0) {...}
if (someCollection.Length < 1) {...}
int MyRidiculousSignReversalFunction(int i) {return i * -1;}
Those are all pretty obvious examples. E.g. start at the first element and increment by one, testing to see whether a collection is empty, and sign reversal... ridiculous but works as an example. Now replace all of the -1, 0 and 1 values with 2:
for (int i=2; i<50; i+=2) {...}
if (someCollection.Length == 2) {...}
if (someCollection.Length < 2) {...}
int MyRidiculousDoublingFunction(int i) {return i * 2;}
Now you have to start asking yourself: Why am I starting iteration on the 3rd element and checking every other? And what's so special about the number 50? What's so special about a collection with two elements? The doubler example actually makes sense here, but you can see that the non -1, 0, 1 values of 2 and 50 immediately become magic because there's obviously something special in what they're doing and we have no idea why.
No, they aren't.
A magic number in that context would be a number that has an unexplained meaning. In your case, it specifies the precision, which is clearly visible.
A magic number would be something like:
int calculateFoo(int input)
{
return 0x3557 * input;
}
You should be aware that the phrase "magic number" has multiple meanings. In this case, it specifies a number in source code, that is unexplainable by the surroundings. There are other cases where the phrase is used, for example in a file header, identifying it as a file of a certain type.
A literal numeral IS NOT a magic number when:
it is used one time, in one place, with very clear purpose based on its context
it is used with such common frequency and within such a limited context as to be widely accepted as not magic (e.g. the +1 or -1 in loops that people so frequently accept as being not magic).
some people accept the +1 of a zero offset as not magic. I do not. When I see variable + 1 I still want to know why, and ZERO_OFFSET cannot be mistaken.
As for the example scenario of:
float_as_thousands_str_with_precision(volts, 1)
And the proposed
float_as_thousands_str_with_precision(volts, HIGH_PRECISION)
The 1 is magic if that function for volts with 1 is going to be used repeatedly for the same purpose. Then sure, it's "magic", not because the meaning is unclear, but because you simply have multiple occurrences.
Paul's answer focused on the "unexplained meaning" part thinking HIGH_PRECISION = 3 explained the purpose. IMO, HIGH_PRECISION offers no more explanation or value than something like PRECISION_THREE or THREE or 3. Of course 3 is higher than 1, but it still doesn't explain WHY higher precision was needed, or why there's a difference in precision. The numerals offer every bit as much intent and clarity as the proposed labels.
Why is there a need for varying precision in the first place? As an engineering guy, I can assume there are three possible reasons: (a) a true engineering justification that the measurement itself is only valid to X precision, so the display should reflect that, or (b) there's only enough display space for X precision, or (c) the viewer won't care about anything higher than X precision even if it's available.
Those are complex reasons that are difficult to capture in a constant label, and are probably better served by a comment (to explain why something is being done).
IF the use of those functions were in one place, and one place only, I would not consider the numerals magic. The intent is clear.
For reference:
A literal numeral IS magic when
"Unique values with unexplained meaning or multiple occurrences which could (preferably) be replaced with named constants." http://en.wikipedia.org/wiki/Magic_number_%28programming%29 (3rd bullet)

How to create conditional breakpoint with std::string

Suppose I have this function:
std::string Func1(std::string myString)
{
//do some string processing
std::string newString = Func2(myString);
return newString;
}
How do I set a conditional break when newString has a specific value? (without changing the source)
Setting the condition newString == "my value" didn't work. The breakpoints were disabled with the error "overloaded operator not found".
There is a much easier way in Visual Studio 2010/2012.
To accomplish what you are looking for in ANSI use this:
strcmp(newString._Bx._Ptr,"my value")==0
And in unicode (if newString were unicode) use this:
wcscmp(newString._Bx._Ptr, L"my value")==0
There are more things you can do than just a compare, you can read more about it here:
http://blogs.msdn.com/b/habibh/archive/2009/07/07/new-visual-studio-debugger-2010-feature-for-c-c-developers-using-string-functions-in-conditional-breakpoints.aspx
In VS2017, I was able to set the condition as:
strcmp(&newString[0], "my value") == 0
Some searching has failed to turn up any way to do this. Suggested alternatives are to put the test in your code and add a standard breakpoint:
if (myStr == "xyz")
{
// Set breakpoint here
}
Or to build up your test from individual character comparisons. Even looking at individual characters in the string is a bit dicey; in Visual Studio 2005 I had to dig down into the member variables like
myStr._Bx._Buf[0] == 'x' && myStr._Bx._Buf[1] == 'y' && myStr._Bx._Buf[2] == 'z'
Neither of these approaches is very satisfactory. We should have better access to a ubiquitous feature of the Standard Library.
In VS2017 you can do
strcmp(newString._Mypair._Myval2._Bx._Buf,"myvalue")==0
While I've had to work around this using something similar to Brad's answer (plus using DebugBreak() to break right from the code), sometimes editing/recompiling/re-running a bit of code is either too time consuming or just plain impossible.
Luckily, it's apparently possible to spelunk into the actual members of the std::string class. One way is mentioned here -- and though he calls out VS2010 specifically, you can still access individual chars manually in earlier versions. So if you're using 2010, you can just use the nice strcmp() functions and the like (more info), but if you're like me and still have 2008 or earlier, you can come up with a raggedy, terrible, but functional alternative by setting a breakpoint conditional something like:
strVar._Bx._Ptr[0] == 'a' && strVar._Bx._Ptr[1] == 'b' &&
strVar._Bx._Ptr[2] == 'c'
to break if the first three characters in strVar are "abc". You can keep going with additional chars, of course. Ugly.. but it's saved me a little time just now.
VS2012:
I just used the condition below because newString._Bx._Ptr ( as in OBWANDO's answer ) referenced illegal memory
strcmp( newString._Bx._Buf, "my value")==0
and it worked...
@OBWANDO (almost) has the solution, but as multiple comments rightly point out, the actual buffer depends on the string size; I see 16 to be the threshold. Prepending a size check to the strcmp on the appropriate buffer works.
newString._Mysize < 16 && strcmp(newString._Bx._Buf, "test value") == 0
or
newString._Mysize >= 16 && strcmp(newString._Bx._Ptr, "ultra super long test value") == 0
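The threshold exists because of the small string optimization: short strings live in a buffer inside the std::string object itself (the _Buf member here), and longer ones in a heap allocation (_Ptr). The sketch below is one way to observe that from ordinary code. It relies on implementation behaviour that the C++ standard does not guarantee, the exact cut-off varies between standard libraries, and the function name is made up:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: reports whether the character data of s is stored
// inside the std::string object itself (small string optimization) rather
// than in a separate heap allocation.
bool uses_inline_buffer(const std::string& s) {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= object_begin && data < object_end;
}
```

On implementations with the optimization, a short value such as "short" reports an inline buffer, while a 100-character string does not. That is why a breakpoint condition has to pick between _Buf and _Ptr.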
Tried to use strcmp in gdb8.1 under ubuntu18.04, but it doesn't work:
(ins)(gdb) p strcmp("a", "b")
$20 = (int (*)(const char *, const char *)) 0x7ffff5179d60 <__strcmp_ssse3>
According to this answer, strcmp is a special IFUNC, so one can set up the condition like this:
condition 1 __strcmp_ssse3(camera->_name.c_str(), "ping")==0
It's pretty ugly, and I don't want to do it a second time.
This answer gives a much better solution, which uses std::string::compare:
condition 1 camera->_name.compare("ping") == 0
In VS2015 you can do
newString[0]=='x' && newString[1]=='y' && newString[2]=='z'
Comparing string works better than comparing characters
strcmp(name._Mypair._Myval2._Bx._Buf, "foo")==0
This works, but is very inconvenient to use and error prone.
name._Mypair._Myval2._Bx._Buf[0] == 'f' &&
name._Mypair._Myval2._Bx._Buf[1] == 'o' &&
name._Mypair._Myval2._Bx._Buf[2] == 'o'
You could convert it into a c string using c_str() like so:
$_streq(myStr.c_str(), "foo")
To set a conditional breakpoint in std::string you need to set it on real internal members of std::string. What you see on watch window is simplified.
You can display real structure of a variable in the watch window by using ,! suffix. In your example:
newString,!
For MSVC 2015 – 2019 you can use:
For strings that were never longer than 15 characters:
(newString._Mypair._Myval2._Myres < 16) ?
strcmp(newString._Mypair._Myval2._Bx._Buf, "short") == 0 :
false
For (even historically) longer strings:
(newString._Mypair._Myval2._Myres < 16) ? false :
strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_str_value_longer_than_16_chars") == 0
Beware:
The variable name is written twice in each condition!
You need the whole expression on a single line. Use the copy-paste versions below.
The universal condition needs the test value twice and the variable name three times:
(newString._Mypair._Myval2._Myres < 16) ?
strcmp(newString._Mypair._Myval2._Bx._Buf, "My_test_string") == 0 :
strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_string") == 0
Notes: use wcscmp instead of strcmp if you are working with std::wstring.
Find more info on small string optimization in C++ https://vorbrodt.blog/2019/03/30/sso-of-stdstring/ includes sample code to find size of string's internal buffer.
All std::string and std::wstring single-line versions for your copy-paste convenience:
(newString._Mypair._Myval2._Myres < 16) ? strcmp(newString._Mypair._Myval2._Bx._Buf, "short") == 0 : false
(newString._Mypair._Myval2._Myres < 16) ? false : strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_str_value_longer_than_16_chars") == 0
(newString._Mypair._Myval2._Myres < 16) ? strcmp(newString._Mypair._Myval2._Bx._Buf, "My_test_string") == 0 : strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_string") == 0
(newString._Mypair._Myval2._Myres < 16) ? wcscmp(newString._Mypair._Myval2._Bx._Buf, L"short") == 0 : false
(newString._Mypair._Myval2._Myres < 16) ? false : wcscmp(newString._Mypair._Myval2._Bx._Ptr, L"My_test_str_value_longer_than_16_chars") == 0
(newString._Mypair._Myval2._Myres < 16) ? wcscmp(newString._Mypair._Myval2._Bx._Buf, L"My_test_string") == 0 : wcscmp(newString._Mypair._Myval2._Bx._Ptr, L"My_test_string") == 0
All of the above copy/paste samples were tested with MSVC version 16.9.10 in a program for Windows 10.
