Something like this (yes, this doesn't deal with some edge cases - that's not the point):
int CountDigits(int num) {
    int count = 1;
    while (num >= 10) {
        count++;
        num /= 10;
    }
    return count;
}
What's your opinion about this? That is, using function arguments as local variables.
Both are placed on the stack, and pretty much identical performance wise, I'm wondering about the best-practices aspects of this.
I feel like an idiot when I add an additional and quite redundant line to that function consisting of int numCopy = num, however it does bug me.
What do you think? Should this be avoided?
As a general rule, I wouldn't use a function parameter as a local processing variable, i.e. I treat function parameters as read-only.
In my mind, intuitively understandable code is paramount for maintainability, and modifying a function parameter to use as a local processing variable tends to run counter to that goal. I have come to expect that a parameter will have the same value in the middle and bottom of a method as it does at the top. Plus, an aptly-named local processing variable may improve understandability.
Still, as @Stewart says, this rule is more or less important depending on the length and complexity of the function. For short simple functions like the one you show, simply using the parameter itself may be easier to understand than introducing a new local variable (very subjective).
Nevertheless, if I were to write something as simple as countDigits(), I'd tend to use a remainingBalance local processing variable in lieu of modifying the num parameter as part of local processing - just seems clearer to me.
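A minimal sketch of that variant, for comparison with the version in the question. The local name remaining and the helper check_count_digits are my own placeholders, not anything from the question, and like the original it ignores negative input:

```c
#include <assert.h>

/* Counts decimal digits of a non-negative int; the parameter is never modified. */
int CountDigits(int num) {
    int remaining = num;   /* working copy (hypothetical name); num keeps its original value */
    int count = 1;
    while (remaining >= 10) {
        count++;
        remaining /= 10;
    }
    return count;
}

/* Small usage check. */
void check_count_digits(void) {
    assert(CountDigits(0) == 1);
    assert(CountDigits(9) == 1);
    assert(CountDigits(10) == 2);
    assert(CountDigits(12345) == 5);
}
```

The behavior is the same as the original; the only difference is that num still holds the caller's value at the bottom of the function.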
Sometimes, I will modify a local parameter at the beginning of a method to normalize the parameter:
void saveName(String name) {
    name = (name != null ? name.trim() : "");
    ...
}
I rationalize that this is okay because:
a. it is easy to see at the top of the method,
b. the parameter maintains its original conceptual intent, and
c. the parameter is stable for the rest of the method
Then again, half the time, I'm just as apt to use a local variable anyway, just to get a couple of extra finals in there (okay, that's a bad reason, but I like final):
void saveName(final String name) {
    final String normalizedName = (name != null ? name.trim() : "");
    ...
}
If, 99% of the time, the code leaves function parameters unmodified (i.e. mutating parameters are unintuitive or unexpected for this code base), then, during that other 1% of the time, dropping a quick comment about a mutating parameter at the top of a long/complex function could be a big boon to understandability:
int CountDigits(int num) {
    // num is consumed
    int count = 1;
    while (num >= 10) {
        count++;
        num /= 10;
    }
    return count;
}
P.S. :-)
parameters vs arguments
http://en.wikipedia.org/wiki/Parameter_(computer_science)#Parameters_and_arguments
These two terms are sometimes loosely used interchangeably; in particular, "argument" is sometimes used in place of "parameter". Nevertheless, there is a difference. Properly, parameters appear in procedure definitions; arguments appear in procedure calls.
So,
int foo(int bar)
bar is a parameter.
int x = 5
int y = foo(x)
The value of x is the argument for the bar parameter.
It always feels a little funny to me when I do this, but that's not really a good reason to avoid it.
One reason you might potentially want to avoid it is for debugging purposes. Being able to tell the difference between "scratchpad" variables and the input to the function can be very useful when you're halfway through debugging.
I can't say it's something that comes up very often in my experience - and often you can find that it's worth introducing another variable just for the sake of having a different name, but if the code which is otherwise cleanest ends up changing the value of the variable, then so be it.
One situation where this can come up and be entirely reasonable is where you've got some value meaning "use the default" (typically a null reference in a language like Java or C#). In that case I think it's entirely reasonable to modify the value of the parameter to the "real" default value. This is particularly useful in C# 4 where you can have optional parameters, but the default value has to be a constant:
For example:
public static void WriteText(string file, string text, Encoding encoding = null)
{
    // Null means "use the default" which we would document to be UTF-8
    encoding = encoding ?? Encoding.UTF8;
    // Rest of code here
}
About C and C++:
My opinion is that using the parameter as a local variable of the function is fine because it is a local variable already. Why then not use it as such?
I feel silly too when copying the parameter into a new local variable just to have a modifiable variable to work with.
But I think this is pretty much a personal opinion. Do it as you like. If you feel silly copying the parameter just because of this, it indicates your personality doesn't like it and then you shouldn't do it.
If I don't need a copy of the original value, I don't declare a new variable.
IMO I don't think mutating the parameter values is a bad practice in general,
it depends on how you're going to use it in your code.
My team coding standard recommends against this because it can get out of hand. To my mind for a function like the one you show, it doesn't hurt because everyone can see what is going on. The problem is that with time functions get longer, and they get bug fixes in them. As soon as a function is more than one screen full of code, this starts to get confusing which is why our coding standard bans it.
The compiler ought to be able to get rid of the redundant variable quite easily, so it has no efficiency impact. It is probably just between you and your code reviewer whether this is OK or not.
I would generally not change the parameter value within the function. If at some point later in the function you need to refer to the original value, you still have it. In your simple case, there is no problem, but if you add more code later, you may refer to 'num' without realizing it has been changed.
The code needs to be as self sufficient as possible. What I mean by that is you now have a dependency on what is being passed in as part of your algorithm. If another member of your team decides to change this to a pass by reference then you might have big problems.
The best practice is definitely to copy the inbound parameters if you expect them to be immutable.
I typically don't modify function parameters, unless they're pointers, in which case I might alter the value that's pointed to.
I think the best practice here varies by language. For example, in Perl you can localize any variable or even part of a variable to a local scope, so that changing it in that scope will not have any effect outside of it:
sub my_function
{
    my ($arg1, $arg2) = @_; # get the local variables off the stack
    local $arg1; # changing $arg1 here will not be visible outside this scope
    $arg1++;
    local $arg2->{key1}; # only the key1 portion of the hashref referenced by $arg2 is localized
    $arg2->{key1}->{key2} = 'foo'; # this change is not visible outside the function
}
Occasionally I have been bitten by forgetting to localize a data structure that was passed by reference to a function, that I changed inside the function. Conversely, I have also returned a data structure as a function result that was shared among multiple systems and the caller then proceeded to change the data by mistake, affecting these other systems in a difficult-to-trace problem usually called action at a distance. The best thing to do here would be to make a clone of the data before returning it*, or make it read-only**.
* In Perl, see the function dclone() in the built-in Storable module.
** In Perl, see lock_hash() or lock_hashref() in the built-in Hash::Util module.
This is general programming, but if it makes a difference, I'm using objective-c. Suppose there's a method that returns a value, and also performs some actions, but you don't care about the value it returns, only the stuff that it does. Would you just call the method as if it was void? Or place the result in a variable and then delete it or forget about it? State your opinion, what you would do if you had this situation.
A common example of this is printf, which returns an int... but you rarely see this:
int val = printf("Hello World");
Yeah just call the method as if it was void. You probably do it all the time without noticing it. The assignment operator '=' actually returns a value, but it's very rarely used.
It depends on the environment (the language, the tools, the coding standard, ...).
For example in C, it is perfectly possible to call a function without using its value. With some functions like printf, which returns an int, it is done all the time.
Sometimes not using a value will cause a warning, which is undesirable. Assigning the value to a variable and then not using it will just cause another warning about an unused variable. For this case the solution is to cast the result to void by prefixing the call with (void), e.g.
(void) my_function_returning_a_value_i_want_to_ignore().
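A small sketch of that idiom in C. The function bump and the helper check_discarded_result are made up for illustration; any function that both does something and returns a value would do:

```c
#include <assert.h>

/* Hypothetical function: increments a counter and also returns the new value. */
int bump(int *counter) {
    *counter += 1;
    return *counter;
}

/* Calls bump twice: once using the result, once explicitly discarding it. */
void check_discarded_result(void) {
    int hits = 0;
    int latest = bump(&hits);   /* return value used */
    (void) bump(&hits);         /* return value deliberately discarded; the side effect still happens */
    assert(latest == 1);
    assert(hits == 2);
}
```

The cast changes nothing at run time. It only tells the compiler, and the next reader, that ignoring the result was intentional.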
There are two separate issues here, actually:
Should you care about returned value?
Should you assign it to a variable you're not going to use?
The answer to #2 is a resounding "NO" - unless, of course, you're working with a language where that would be illegal (early Turbo Pascal comes to mind). There's absolutely no point in defining a variable only to throw it away.
The first part is not so easy. Generally, there is a reason a value is returned: for idempotent functions the result is the function's sole purpose; for non-idempotent ones it usually represents some sort of return code signifying whether the operation completed normally. There are exceptions, of course, such as method chaining.
If this is common in .Net (for example), there's probably an issue with the code breaking CQS.
When I call a function that returns a value that I ignore, it's usually because I'm doing it in a test to verify behavior. Here's an example in C#:
[Fact]
public void StatService_should_call_StatValueRepository_for_GetPercentageValues()
{
    var statValueRepository = new Mock<IStatValueRepository>();
    new StatService(null, statValueRepository.Object).GetValuesOf<PercentageStatValue>();
    statValueRepository.Verify(x => x.GetStatValues());
}
I don't really care about the return type, I just want to verify that a method was called on a fake object.
In C it is very common, but there are places where it is ok to do so and other places where it really isn't. Later versions of GCC have a function attribute so that you can get a warning when a function is used without checking the return value:
The warn_unused_result attribute causes a warning to be emitted if a caller of the function with this attribute does not use its return value. This is useful for functions where not checking the result is either a security problem or always a bug, such as realloc.
int fn () __attribute__ ((warn_unused_result));
int foo ()
{
  if (fn () < 0) return -1;
  fn ();
  return 0;
}
results in warning on line 5.
Last time I used this there was no way of turning off the generated warning, which causes problems when you're compiling 3rd-party code you don't want to modify. Also, there is of course no way to check if the user actually does something sensible with the returned value.
When have you run into syntax that might be dated, never used, or just plain obfuscated, that you couldn't understand for the life of you?
For example, I never knew that comma is an actual operator in C. So when I saw the code
if(Foo(), Bar())
I just about blew a gasket trying to figure out what was going on there.
I'm curious what little never-dusted corners might exist in other languages.
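For anyone else puzzled by the if(Foo(), Bar()) line: the comma operator evaluates its left operand, throws the result away, then evaluates the right operand, whose value becomes the value of the whole expression. A rough sketch, with stand-in bodies for Foo and Bar that I made up (the call counter and the check function are also just for illustration):

```c
#include <assert.h>

static int calls = 0;   /* counts how many of the two functions ran */

/* Hypothetical stand-ins for Foo() and Bar() from the question. */
int Foo(void) { calls++; return 0; }   /* result is discarded by the comma operator */
int Bar(void) { calls++; return 1; }   /* result becomes the value of the condition */

/* Returns 1 if the if-branch was taken, 0 otherwise. */
int comma_demo(void) {
    if (Foo(), Bar()) {
        return 1;
    }
    return 0;
}

void check_comma_operator(void) {
    calls = 0;
    assert(comma_demo() == 1);   /* Bar() returned 1, so the branch ran even though Foo() returned 0 */
    assert(calls == 2);          /* both functions were called */
}
```

So the condition depends only on Bar(); Foo() is called purely for its side effects.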
C++'s syntax for a default constructor on a local variable. At first I wrote the following.
Student student(); // error
Student student("foo"); // compiles
This led me to about an hour of reading through a cryptic C++ error message. Eventually a non-C++ newbie dropped by, laughed and pointed out my mistake.
Student student;
This is always jarring:
std::vector <std::vector <int> >
                              ^
                              mandatory space.
When using the System.DirectoryServices name space to bind to an ADAM (Active Directory Application Mode; now called AD LDS, I think), I lost an entire day trying to debug this simple code:
DirectoryEntry rootDSE = new DirectoryEntry(
    "ldap://192.168.10.78:50000/RootDSE",
    login,
    password,
    AuthenticationTypes.None);
When I ran the code, I kept getting a COMException with error 0x80005000, which helpfully mapped to "Unknown error."
I could use the login and password and bind to the port via ADSI Edit. But this simple line of code didn't work. Bizarre firewall permission? Something screwed in configuration? Some COM object not registered correctly? Why on earth wasn't it working?
The answer? It's LDAP://, not ldap://.
And this is why we drink.
C++
class Foo
{
    // Lots of stuff here.
} bar;
The declaration of bar is VERY difficult to see. More commonly found in C, but especially annoying in C++.
Perl's syntax caused me a bad day a while ago:
%table = {
    foo => 1,
    bar => 2
};
Without proper warnings (which are unavailable on the platform I was using), this creates a one-element hash with a key as the given hash reference and value undef. Note the subtle use of {}, which creates a new hash reference, and not (), which is an array used to populate the %table hash.
I was shocked Python's quasi-ternary operator wasn't a syntax error the first time I saw it:
X if Y else Z
This is stupid and common, but this syntax:
if ( x = y ) {
    // do something
}
Has caught me about three times in the past year in a couple of different languages. I really like the R language's convention of using <- for assignment, like this:
x <- y
If the x = y syntax were made to mean x == y, and x <- y to mean assignment, my brain would make a smoother transition to and from math and programming.
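To spell out why if ( x = y ) compiles at all in C: an assignment is an expression whose value is the value just stored, so the condition ends up testing y after overwriting x. A minimal sketch (the function names are made up for illustration):

```c
#include <assert.h>

/* Returns 1 if the branch runs. The condition stores y into *x, then tests the stored value. */
int assign_in_condition(int *x, int y) {
    if ((*x = y)) {   /* extra parentheses are a common way to mark the assignment as intentional */
        return 1;
    }
    return 0;
}

void check_assign_in_condition(void) {
    int x = 7;
    assert(assign_in_condition(&x, 0) == 0 && x == 0);   /* x was overwritten; branch skipped */
    assert(assign_in_condition(&x, 5) == 1 && x == 5);   /* any non-zero y takes the branch */
}
```

Note that the original value of x never takes part in the test, which is exactly what makes the typo so easy to miss.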
C/C++'s bit-field syntax. The worst part about this is trying to google for it simply based on the syntax.
struct C {
    unsigned int v1 : 12;
    unsigned int v2 : 1;
};
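For reference, the number after the colon is the width of the member in bits, so in the struct above v1 can hold 0 to 4095 and v2 can hold 0 or 1. A small sketch of what that means in practice (store_v1 and check_bit_fields are names I invented for the example):

```c
#include <assert.h>

/* Same layout as the struct above: v1 is 12 bits wide, v2 is 1 bit wide. */
struct C {
    unsigned int v1 : 12;
    unsigned int v2 : 1;
};

/* Stores a value in the 12-bit member and returns what was actually kept. */
unsigned int store_v1(unsigned int value) {
    struct C c = {0, 0};
    c.v1 = value;   /* an unsigned bit-field keeps the value modulo 2^12 */
    return c.v1;
}

void check_bit_fields(void) {
    assert(store_v1(4095u) == 4095u);   /* largest value that fits in 12 bits */
    assert(store_v1(4096u) == 0u);      /* one past the maximum wraps around to 0 */
}
```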
C#'s ?? operator threw me for a loop the first time I saw it. Essentially it will return the LHS if it's non-null and the RHS if the LHS is null.
object bar = null;
object foo = bar ?? new Student(); // gets new Student()
Powershell's function calling semantics
function foo() {
    param ($count, $name);
    ...
}
foo (5, "name")
For the non powershellers out there. This will work but not how you expect it to. It actually creates an array and passes it as the first argument. The second argument has no explicit value. The correct version is
foo 5 "name"
The first time I saw a function pointer in C++ I was confused. Worse, because the syntax has no key words, it was really hard to look up. What exactly does one type into a search engine for this?
int (*Foo)(float, char, char);
I ended up having to ask the local C++ guru what it was.
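For anyone hitting the same wall: that line declares a variable named Foo that points to a function taking (float, char, char) and returning int. A short sketch of how such a pointer is assigned and called; sum_codes is a made-up function that merely has a matching signature:

```c
#include <assert.h>

/* Hypothetical function whose signature matches the pointer type. */
int sum_codes(float scale, char a, char b) {
    return (int)(scale * (float)(a + b));
}

/* Foo is a variable: a pointer to a function taking (float, char, char) and returning int. */
int (*Foo)(float, char, char) = sum_codes;

void check_function_pointer(void) {
    assert(Foo(2.0f, 1, 2) == 6);      /* call through the pointer */
    assert((*Foo)(1.0f, 3, 4) == 7);   /* explicit dereference is equivalent */
}
```

The parentheses around *Foo are what matter: without them, int *Foo(float, char, char) would declare a function returning int*.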
VB's (yeah yeah, I have to use it) "And" keyword - as in:
If Object IsNot Nothing And Object.Property Then
See that Object.Property reference, after I've made sure the object isn't NULL? Well, VB's "And" keyword * does * not * block * further * evaluation and so the code will fail.
VB does have, however, another keyword - AndAlso:
If Object IsNot Nothing AndAlso Object.Property Then
That will work as you'd expect and not explode when run.
I was once very confused by some C++ code that declared a reference to a local variable, but never used it. Something like
MyLock &foo;
(Cut me some slack on the syntax, I haven't done C++ in nearly 8 years)
Taking that seemingly unused variable out made the program start dying in obscure ways seemingly unrelated to this "unused" variable. So I did some digging, and found out that the default ctor for that class grabbed a thread lock, and the dtor released it. This variable was guarding the code against simultaneous updates without seemingly doing anything.
Javascript: This syntax ...
for(i in someArray)
... is for looping through arrays, or so I thought. Everything worked fine until another team member dropped in MooTools, and then all my loops were broken because the for(i in ...) syntax also goes over extra methods that have been added to the array object.
Had to translate some scientific code from old FORTRAN to C. A few things that ruined my day(s):
Punch-card indentation. The first 6 characters of every line were reserved for control characters, goto labels, comments, etc:
^^^^^^[code starts here]
c [commented line]
Goto-style numbering for loops (coupled with 6 space indentation):
      do 20, i=0,10
      do 10, j=0,10
      do_stuff(i,j)
   10 continue
   20 continue
Now imagine there are multiple nested loops (i.e., do 20 to do 30) which have no differentiating indentation to know what context you are in. Oh, and the terminating statements are hundreds of lines away.
Format statement, again using goto labels. The code wrote to files (helpfully referred to by numbers 1,2,etc). To write the values of a,b,c to file we had:
write (1,51) a,b,c
So this writes a,b,c to file 1 using a format statement at the line marked with label 51:
51 format (f10.3,f10.3,f10.3)
These format lines were hundreds of lines away from where they were called. This was complicated by the author's decision to print newlines using:
write (1,51) [nothing here]
I am reliably informed by a lecturer in the group that I got off easy.
C's comma operator doesn't seem very obscure to me: I see it all the time, and if I hadn't, I could just look up "comma" in the index of K&R.
Now, trigraphs are another matter...
void main() { printf("wat??!\n"); } // doesn't print "wat??!"
Wikipedia has some great examples, from the genuinely confusing:
// Will the next line be executed????????????????/
a++;
to the bizarrely valid:
/??/
* A comment *??/
/
And don't even get me started on digraphs. I would be surprised if there's somebody here who can fully explain C's digraphs from memory. Quick, what digraphs does C have, and how do they differ from trigraphs in parsing?
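To take a stab at that last question: C has six digraphs, <: :> <% %> %: and %:%:, standing for [ ] { } # and ##. The parsing difference is that trigraphs (where a compiler still honors them) are replaced before tokenization, so they change even inside string literals, while digraphs are just alternative spellings of tokens and are left alone inside strings. A small sketch; the function names are mine:

```c
#include <assert.h>
#include <string.h>

/* Sums an array written entirely with digraphs:
   <: and :> stand for [ and ], <% and %> stand for { and }. */
int digraph_sum(void)
<%
    int values<:3:> = <% 1, 2, 3 %>;
    return values<:0:> + values<:1:> + values<:2:>;
%>

/* Inside a string literal a digraph is not a token, so it stays as the two characters typed. */
size_t digraph_in_string(void) {
    return strlen("<:");
}

void check_digraphs(void) {
    assert(digraph_sum() == 6);
    assert(digraph_in_string() == 2);
}
```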
Syntax like this in C++ with /clr enabled. Trying to create a Managed Dictionary object in C++.
gcroot<Dictionary<System::String^, MyObj^>^> m_myObjs;
An oldie:
In PL/1 there are no reserved words, so you can define variables, methods, etc. with the same name as the language keywords.
This can be a valid line of code:
IF ELSE THEN IF ELSE THEN
(Where ELSE is a boolean, and IF and THEN are functions, obviously.)
Iif(condition, expression, expression) is a function call, not an operator.
Both sides of the conditional are ALWAYS evaluated.
It always ruins my day if I have to read/write some kind of Polish notation as used in a lot of HP calculators...
PHP's ternary operator associates left to right. This caused me much anguish one day when I was learning PHP. For the previous 10 years I had been programming in C/C++ in which the ternary operator associates right to left.
I am still a little curious as to why the designers of PHP chose to do that when, in many other respects, the syntax of PHP matches that of C/C++ fairly closely.
EDIT: nowadays I only work with PHP under duress.
Not really obscure, but whenever I code too much in one language, and go back to another, I start messing up the syntax of the latter. I always chuckle at myself when I realize that "#if" in C is not a comment (but rather something far more deadly), and that lines in Python do not need to end in a semicolon.
While performing maintenance on a bit of C++ code I once spotted that someone had done something like this:
for (i=0; i<10; i++)
{
    MyNumber += 1;
}
Yes, they had a loop to add 1 to a number 10 times.
Why did it ruin my day? The perpetrator had long since left, and I was having to bug fix their module. I thought that if they were doing something like this, goodness knows what else I was going to encounter!
AT&T assembler syntax >:(
This counter-intuitive, obscure syntax has ruined many of my days, for example, the simple Intel syntax assembly instruction:
mov dword es:[ebp-5], 1 /* Cool, put the value 1 into the
* location of ebp minus five.
* this is so obvious and readable, and hard to mistake
* for anything else */
translates into this in AT&T syntax
movl $1, %es:-5(%ebp) /* huh? what's "l"? 4 bytes? 8 bytes? arch specific??
* wait, why are we moving 1 into -5 times ebp?
* or is this moving -5 * ebp into memory at address 0x01?
* oh wait, YES, I magically know that this is
* really setting 4 bytes at ebp-5 to 1! */
More...
mov dword [foo + eax*4], 123 /* Intel */
mov $123, foo(, %eax, 4) /* AT&T, looks like a function call...
* there's no way in hell I'd know what this does
* without reading a full manual on this syntax */
And one of my favorites.
It's as if they took the opcode encoding scheme and tried to incorporate it into the programming syntax (read: scale/index/base), but also tried to add a layer of abstraction on the data types, and merge that abstraction into the opcode names to cause even more confusion. I don't see how anyone can program seriously with this.
In a scripting language (Concordance Programming Language) for stand alone database software (Concordance) used for litigation document review, arrays were 0 indexed while (some) string functions were 1 indexed. I haven't touched it since.
This. I had my run-in with it more than once.
GNU extensions are often fun:
my_label:
unsigned char *ptr = (unsigned char *)&&my_label;
*ptr = 5; // Will it segfault? Finding out is half the fun...
The syntax for member pointers also causes me grief, more because I don't use it often enough than because there's anything really tricky about it:
template<typename T, int T::* P>
void function(T& t)
{
    t.*P = 5;
}
But, really, who needs to discuss the obscure syntax in C++? With operator overloading, you can invent your own!