Related
so i just came across some code that reads like so:
checkCalculationPeriodFrequency("7D", "7D", SHOULD_MATCH);
and
checkCalculationPeriodFrequency("7D", "8D", SHOULD_NOT_MATCH);
Let's not worry about what the code does for now (or indeed, ever), but instead, let's worry about that last parameter - the SHOULD_MATCH and SHOULD_NOT_MATCH
Its something i've thought of before but thought might be "bad" to do (inasmuch as "bad" holds any real meaning in a postmodernist world).
above, those values are declared (as you might have assumed):
private boolean SHOULD_MATCH = true;
private boolean SHOULD_NOT_MATCH = false;
I can't recall reading about "naming" the boolean parameter passed to a method call to ease readability, but it certainly makes sense (for readability, but then, it also hides what the value is, if only a teeny bit). Is this a style thing that others have found is instagram or like, soooo facebook?
Naming the argument would help with readability, especially when the alternative is usually something like
checkCalculationFrequency("7D",
"8D",
true /* should match */);
which is ugly. Having context-specific constants could be a solution to this.
I would actually go a step further and redefine the function prototype to accept an enum instead:
enum MatchType {
ShouldMatch,
ShouldNotMatch
};
void checkCalculationFrequency(string a, string b, MatchType match);
I would prefer this over a boolean, because it gives you flexibility to extend the function to accept other MatchTypes later.
I suggest you not to do this way.
First, for each object, the two members SHOULD_MATCH and SHOULD_NOT_MATCH are regenerated. And that's not good because it's not a behavior of the object. So it you want to use is, at least describe it as STATIC FINAL.
Second, I prefer to use an enum instead, because you can control completely the value of the param, i.e. when you use it, you must use either SHOULD_MATCH or SHOULD_NOT_MATCH, not just true or false. And this increase the readability too.
Regards.
It is indeed for readability. The idea is that the reader of the function call might not know immediately what the value true mean in the function call, but SHOULD_MATCH conveys the meaning immediately (and if you need to look up the actual value, you can do so with not much effort).
This becomes even more understandable if you have more than one boolean parameters in the function call: which true means what?
The next step in this logic is to create named object values (e.g. via enum) for the parameter values: you cannot pass on the wrong value to the function (e.g. in the example of three boolean parameters, nothing stops me from passing in SHOULD_MATCH for all of them, even though it does not make sense semantically for that function).
It's definitely more than a style thing.
We have a similar system that takes takes input from a switch in the form of boolean values, 1 or 0, which is pretty much the same as true or false.
In this system we declare our variables OPEN = true and CLOSED = false* and pass them into functions which perform different actions depending on the state of the switch. Now if someone happens to hook up the switch differently it may be that we now get the value 0 when it is OPEN and 1 when it is CLOSED.
By having named boolean variables we can easily adapt the system without having to change the logic throughout. The code becomes self documenting because developers can clearer see what action is meant to be taken in which case without worrying what value comes.
Of course the true purpose of the boolean value should be well documented else where and it is in our system....honest....
*(maybe we use OPEN, !OPEN I forget)
Before jumping into python, I had started with some Objective-C / Cocoa books. As I recall, most functions required keyword arguments to be explicitly stated. Until recently I forgot all about this, and just used positional arguments in Python. But lately, I've ran into a few bugs which resulted from improper positions - sneaky little things they were.
Got me thinking - generally speaking, unless there is a circumstance that specifically requires non-keyword arguments - is there any good reason NOT to use keyword arguments? Is it considered bad style to always use them, even for simple functions?
I feel like as most of my 50-line programs have been scaling to 500 or more lines regularly, if I just get accustomed to always using keyword arguments, the code will be more easily readable and maintainable as it grows. Any reason this might not be so?
UPDATE:
The general impression I am getting is that its a style preference, with many good arguments that they should generally not be used for very simple arguments, but are otherwise consistent with good style. Before accepting I just want to clarify though - is there any specific non-style problems that arise from this method - for instance, significant performance hits?
There isn't any reason not to use keyword arguments apart from the clarity and readability of the code. The choice of whether to use keywords should be based on whether the keyword adds additional useful information when reading the code or not.
I follow the following general rule:
If it is hard to infer the function (name) of the argument from the function name – pass it by keyword (e.g. I wouldn't want to have text.splitlines(True) in my code).
If it is hard to infer the order of the arguments, for example if you have too many arguments, or when you have independent optional arguments – pass it by keyword (e.g. funkyplot(x, y, None, None, None, None, None, None, 'red') doesn't look particularly nice).
Never pass the first few arguments by keyword if the purpose of the argument is obvious. You see, sin(2*pi) is better than sin(value=2*pi), the same is true for plot(x, y, z).
In most cases, stable mandatory arguments would be positional, and optional arguments would be keyword.
There's also a possible difference in performance, because in every implementation the keyword arguments would be slightly slower, but considering this would be generally a premature optimisation and the results from it wouldn't be significant, I don't think it's crucial for the decision.
UPDATE: Non-stylistical concerns
Keyword arguments can do everything that positional arguments can, and if you're defining a new API there are no technical disadvantages apart from possible performance issues. However, you might have little issues if you're combining your code with existing elements.
Consider the following:
If you make your function take keyword arguments, that becomes part of your interface.
You can't replace your function with another that has a similar signature but a different keyword for the same argument.
You might want to use a decorator or another utility on your function that assumes that your function takes a positional argument. Unbound methods are an example of such utility because they always pass the first argument as positional after reading it as positional, so cls.method(self=cls_instance) doesn't work even if there is an argument self in the definition.
None of these would be a real issue if you design your API well and document the use of keyword arguments, especially if you're not designing something that should be interchangeable with something that already exists.
If your consideration is to improve readability of function calls, why not simply declare functions as normal, e.g.
def test(x, y):
print "x:", x
print "y:", y
And simply call functions by declaring the names explicitly, like so:
test(y=4, x=1)
Which obviously gives you the output:
x: 1
y: 4
or this exercise would be pointless.
This avoids having arguments be optional and needing default values (unless you want them to be, in which case just go ahead with the keyword arguments! :) and gives you all the versatility and improved readability of named arguments that are not limited by order.
Well, there are a few reasons why I would not do that.
If all your arguments are keyword arguments, it increases noise in the code and it might remove clarity about which arguments are required and which ones are optionnal.
Also, if I have to use your code, I might want to kill you !! (Just kidding), but having to type the name of all the parameters everytime... not so fun.
Just to offer a different argument, I think there are some cases in which named parameters might improve readability. For example, imagine a function that creates a user in your system:
create_user("George", "Martin", "g.m#example.com", "payments#example.com", "1", "Radius Circle")
From that definition, it is not at all clear what these values might mean, even though they are all required, however with named parameters it is always obvious:
create_user(
first_name="George",
last_name="Martin",
contact_email="g.m#example.com",
billing_email="payments#example.com",
street_number="1",
street_name="Radius Circle")
I remember reading a very good explanation of "options" in UNIX programs: "Options are meant to be optional, a program should be able to run without any options at all".
The same principle could be applied to keyword arguments in Python.
These kind of arguments should allow a user to "customize" the function call, but a function should be able to be called without any implicit keyword-value argument pairs at all.
Sometimes, things should be simple because they are simple.
If you always enforce you to use keyword arguments on every function call, soon your code will be unreadable.
When Python's built-in compile() and __import__() functions gain keyword argument support, the same argument was made in favor of clarity. There appears to be no significant performance hit, if any.
Now, if you make your functions only accept keyword arguments (as opposed to passing the positional parameters using keywords when calling them, which is allowed), then yes, it'd be annoying.
I don't see the purpose of using keyword arguments when the meaning of the arguments is obvious
Keyword args are good when you have long parameter lists with no well defined order (that you can't easily come up with a clear scheme to remember); however there are many situations where using them is overkill or makes the program less clear.
First, sometimes is much easier to remember the order of keywords than the names of keyword arguments, and specifying the names of arguments could make it less clear. Take randint from scipy.random with the following docstring:
randint(low, high=None, size=None)
Return random integers x such that low <= x < high.
If high is None, then 0 <= x < low.
When wanting to generate a random int from [0,10) its clearer to write randint(10) than randint(low=10) in my view. If you need to generate an array with 100 numbers in [0,10) you can probably remember the argument order and write randint(0, 10, 100). However, you may not remember the variable names (e.g., is the first parameter low, lower, start, min, minimum) and once you have to look up the parameter names, you might as well not use them (as you just looked up the proper order).
Also consider variadic functions (ones with variable number of parameters that are anonymous themselves). E.g., you may want to write something like:
def square_sum(*params):
sq_sum = 0
for p in params:
sq_sum += p*p
return sq_sum
that can be applied a bunch of bare parameters (square_sum(1,2,3,4,5) # gives 55 ). Sure you could have written the function to take an named keyword iterable def square_sum(params): and called it like square_sum([1,2,3,4,5]) but that may be less intuitive, especially when there's no potential confusion about the argument name or its contents.
A mistake I often do is that I forget that positional arguments have to be specified before any keyword arguments, when calling a function. If testing is a function, then:
testing(arg = 20, 56)
gives a SyntaxError message; something like:
SyntaxError: non-keyword arg after keyword arg
It is easy to fix of course, it's just annoying. So in the case of few - lines programs as the ones you mention, I would probably just go with positional arguments after giving nice, descriptive names to the parameters of the function. I don't know if what I mention is that big of a problem though.
One downside I could see is that you'd have to think of a sensible default value for everything, and in many cases there might not be any sensible default value (including None). Then you would feel obliged to write a whole lot of error handling code for the cases where a kwarg that logically should be a positional arg was left unspecified.
Imagine writing stuff like this every time..
def logarithm(x=None):
if x is None:
raise TypeError("You can't do log(None), sorry!")
When debugging a function I usually use
library(debug)
mtrace(FunctionName)
FunctionName(...)
And that works quite well for me.
However, sometimes I am trying to debug a complex function that I don't know. In which case, I can find that inside that function there is another function that I would like to "go into" ("debug") - so to better understand how the entire process works.
So one way of doing it would be to do:
library(debug)
mtrace(FunctionName)
FunctionName(...)
# when finding a function I want to debug inside the function, run again:
mtrace(FunctionName.SubFunction)
The question is - is there a better/smarter way to do interactive debugging (as I have described) that I might be missing?
p.s: I am aware that there where various questions asked on the subject on SO (see here). Yet I wasn't able to come across a similar question/solution to what I asked here.
Not entirely sure about the use case, but when you encounter a problem, you can call the function traceback(). That will show the path of your function call through the stack until it hit its problem. You could, if you were inclined to work your way down from the top, call debug on each of the functions given in the list before making your function call. Then you would be walking through the entire process from the beginning.
Here's an example of how you could do this in a more systematic way, by creating a function to step through it:
walk.through <- function() {
tb <- unlist(.Traceback)
if(is.null(tb)) stop("no traceback to use for debugging")
assign("debug.fun.list", matrix(unlist(strsplit(tb, "\\(")), nrow=2)[1,], envir=.GlobalEnv)
lapply(debug.fun.list, function(x) debug(get(x)))
print(paste("Now debugging functions:", paste(debug.fun.list, collapse=",")))
}
unwalk.through <- function() {
lapply(debug.fun.list, function(x) undebug(get(as.character(x))))
print(paste("Now undebugging functions:", paste(debug.fun.list, collapse=",")))
rm(list="debug.fun.list", envir=.GlobalEnv)
}
Here's a dummy example of using it:
foo <- function(x) { print(1); bar(2) }
bar <- function(x) { x + a.variable.which.does.not.exist }
foo(2)
# now step through the functions
walk.through()
foo(2)
# undebug those functions again...
unwalk.through()
foo(2)
IMO, that doesn't seem like the most sensible thing to do. It makes more sense to simply go into the function where the problem occurs (i.e. at the lowest level) and work your way backwards.
I've already outlined the logic behind this basic routine in "favorite debugging trick".
I like options(error=recover) as detailed previously on SO. Things then stop at the point of error and one can inspect.
(I'm the author of the 'debug' package where 'mtrace' lives)
If the definition of 'SubFunction' lives outside 'MyFunction', then you can just mtrace 'SubFunction' and don't need to mtrace 'MyFunction'. And functions run faster if they're not 'mtrace'd, so it's good to mtrace only as little as you need to. (But you probably know those things already!)
If 'MyFunction' is only defined inside 'SubFunction', one trick that might help is to use a conditional breakpoint in 'MyFunction'. You'll need to 'mtrace( MyFunction)', then run it, and when the debugging window appears, find out what line 'MyFunction' is defined in. Say it's line 17. Then the following should work:
D(n)> bp( 1, F) # don't bother showing the window for MyFunction again
D(n)> bp( 18, { mtrace( SubFunction); FALSE})
D(n)> go()
It should be clear what this does (or it will be if you try it).
The only downsides are: the need to do it again whenever you change the code of 'MyFunction', and; the slowing-down that might occur through 'MyFunction' itself being mtraced.
You could also experiment with adding a 'debug.sub' argument to 'MyFunction', that defaults to FALSE. In the code of 'MyFunction', then add this line immediately after the definition of 'SubFunction':
if( debug.sub) mtrace( SubFunction)
That avoids any need to mtrace 'MyFunction' itself, but does require you to be able to change its code.
This is a troublesome violation of type safety in my project, so I'm looking for a way to disable it. It seems that if a function takes an AnyRef (or a java.lang.Object), you can call the function with any combination of parameters, and Scala will coalesce the parameters into a Tuple object and invoke the function.
In my case the function isn't expecting a Tuple, and fails at runtime. I would expect this situation to be caught at compile time.
object WhyTuple {
def main(args: Array[String]): Unit = {
fooIt("foo", "bar")
}
def fooIt(o: AnyRef) {
println(o.toString)
}
}
Output:
(foo,bar)
No implicits or Predef at play here at all -- just good old fashioned compiler magic. You can find it in the type checker. I can't locate it in the spec right now.
If you're motivated enough, you could add a -X option to the compiler prevent this.
Alternatively, you could avoid writing arity-1 methods that accept a supertype of TupleN.
What about something like this:
object Qx2 {
#deprecated def callingWithATupleProducesAWarning(a: Product) = 2
def callingWithATupleProducesAWarning(a: Any) = 3
}
Tuples have the Product trait, so any call to callingWithATupleProducesAWarning that passes a tuple will produce a deprecation warning.
Edit: According to people better informed than me, the following answer is actually wrong: see this answer. Thanks Aaron Novstrup for pointing this out.
This is actually a quirk of the parser, not of the type system or the compiler. Scala allows zero- or one-arg functions to be invoked without parentheses, but not functions with more than one argument. So as Fred Haslam says, what you've written isn't an invocation with two arguments, it's an invocation with one tuple-valued argument. However, if the method did take two arguments, the invocation would be a two-arg invocation. It seems like the meaning of the code affects how it parses (which is a bit suckful).
As for what you can actually do about this, that's tricky. If the method really did require two arguments, this problem would go away (i.e. if someone then mistakenly tried to call it with one argument or with three, they'd get a compile error as you expect). Don't suppose there's some extra parameter you've been putting off adding to that method? :)
The compile is capable of interpreting methods without round brackets. So it takes the round brackets in the fooIt to mean Tuple. Your call is the same as:
fooIt( ("foo","bar") )
That being said, you can cause the method to exclude the call, and retrieve the value if you use some wrapper like Some(AnyRef) or Tuple1(AnyRef).
I think the definition of (x, y) in Predef is responsible. The "-Yno-predefs" compiler flag might be of some use, assuming you're willing to do the work of manually importing any implicits you otherwise need. By that I mean that you'll have to add import scala.Predef._ all over the place.
Could you also add a two-param override, which would prevent the compiler applying the syntactic sugar? By making the types taking suitably obscure you're unlikely to get false positives. E.g:
object WhyTuple {
...
class DummyType
def fooIt(a: DummyType, b: DummyType) {
throw new UnsupportedOperationException("Dummy function - should not be called")
}
}
Something like this (yes, this doesn't deal with some edge cases - that's not the point):
int CountDigits(int num) {
int count = 1;
while (num >= 10) {
count++;
num /= 10;
}
return count;
}
What's your opinion about this? That is, using function arguments as local variables.
Both are placed on the stack, and pretty much identical performance wise, I'm wondering about the best-practices aspects of this.
I feel like an idiot when I add an additional and quite redundant line to that function consisting of int numCopy = num, however it does bug me.
What do you think? Should this be avoided?
As a general rule, I wouldn't use a function parameter as a local processing variable, i.e. I treat function parameters as read-only.
In my mind, intuitively understandabie code is paramount for maintainability, and modifying a function parameter to use as a local processing variable tends to run counter to that goal. I have come to expect that a parameter will have the same value in the middle and bottom of a method as it does at the top. Plus, an aptly-named local processing variable may improve understandability.
Still, as #Stewart says, this rule is more or less important depending on the length and complexity of the function. For short simple functions like the one you show, simply using the parameter itself may be easier to understand than introducing a new local variable (very subjective).
Nevertheless, if I were to write something as simple as countDigits(), I'd tend to use a remainingBalance local processing variable in lieu of modifying the num parameter as part of local processing - just seems clearer to me.
Sometimes, I will modify a local parameter at the beginning of a method to normalize the parameter:
void saveName(String name) {
name = (name != null ? name.trim() : "");
...
}
I rationalize that this is okay because:
a. it is easy to see at the top of the method,
b. the parameter maintains its the original conceptual intent, and
c. the parameter is stable for the rest of the method
Then again, half the time, I'm just as apt to use a local variable anyway, just to get a couple of extra finals in there (okay, that's a bad reason, but I like final):
void saveName(final String name) {
final String normalizedName = (name != null ? name.trim() : "");
...
}
If, 99% of the time, the code leaves function parameters unmodified (i.e. mutating parameters are unintuitive or unexpected for this code base) , then, during that other 1% of the time, dropping a quick comment about a mutating parameter at the top of a long/complex function could be a big boon to understandability:
int CountDigits(int num) {
// num is consumed
int count = 1;
while (num >= 10) {
count++;
num /= 10;
}
return count;
}
P.S. :-)
parameters vs arguments
http://en.wikipedia.org/wiki/Parameter_(computer_science)#Parameters_and_arguments
These two terms are sometimes loosely used interchangeably; in particular, "argument" is sometimes used in place of "parameter". Nevertheless, there is a difference. Properly, parameters appear in procedure definitions; arguments appear in procedure calls.
So,
int foo(int bar)
bar is a parameter.
int x = 5
int y = foo(x)
The value of x is the argument for the bar parameter.
It always feels a little funny to me when I do this, but that's not really a good reason to avoid it.
One reason you might potentially want to avoid it is for debugging purposes. Being able to tell the difference between "scratchpad" variables and the input to the function can be very useful when you're halfway through debugging.
I can't say it's something that comes up very often in my experience - and often you can find that it's worth introducing another variable just for the sake of having a different name, but if the code which is otherwise cleanest ends up changing the value of the variable, then so be it.
One situation where this can come up and be entirely reasonable is where you've got some value meaning "use the default" (typically a null reference in a language like Java or C#). In that case I think it's entirely reasonable to modify the value of the parameter to the "real" default value. This is particularly useful in C# 4 where you can have optional parameters, but the default value has to be a constant:
For example:
public static void WriteText(string file, string text, Encoding encoding = null)
{
// Null means "use the default" which we would document to be UTF-8
encoding = encoding ?? Encoding.UTF8;
// Rest of code here
}
About C and C++:
My opinion is that using the parameter as a local variable of the function is fine because it is a local variable already. Why then not use it as such?
I feel silly too when copying the parameter into a new local variable just to have a modifiable variable to work with.
But I think this is pretty much a personal opinion. Do it as you like. If you feel sill copying the parameter just because of this, it indicates your personality doesn't like it and then you shouldn't do it.
If I don't need a copy of the original value, I don't declare a new variable.
IMO I don't think mutating the parameter values is a bad practice in general,
it depends on how you're going to use it in your code.
My team coding standard recommends against this because it can get out of hand. To my mind for a function like the one you show, it doesn't hurt because everyone can see what is going on. The problem is that with time functions get longer, and they get bug fixes in them. As soon as a function is more than one screen full of code, this starts to get confusing which is why our coding standard bans it.
The compiler ought to be able to get rid of the redundant variable quite easily, so it has no efficiency impact. It is probably just between you and your code reviewer whether this is OK or not.
I would generally not change the parameter value within the function. If at some point later in the function you need to refer to the original value, you still have it. in your simple case, there is no problem, but if you add more code later, you may refer to 'num' without realizing it has been changed.
The code needs to be as self sufficient as possible. What I mean by that is you now have a dependency on what is being passed in as part of your algorithm. If another member of your team decides to change this to a pass by reference then you might have big problems.
The best practice is definitely to copy the inbound parameters if you expect them to be immutable.
I typically don't modify function parameters, unless they're pointers, in which case I might alter the value that's pointed to.
I think the best-practices of this varies by language. For example, in Perl you can localize any variable or even part of a variable to a local scope, so that changing it in that scope will not have any affect outside of it:
sub my_function
{
my ($arg1, $arg2) = #_; # get the local variables off the stack
local $arg1; # changing $arg1 here will not be visible outside this scope
$arg1++;
local $arg2->{key1}; # only the key1 portion of the hashref referenced by $arg2 is localized
$arg2->{key1}->{key2} = 'foo'; # this change is not visible outside the function
}
Occasionally I have been bitten by forgetting to localize a data structure that was passed by reference to a function, that I changed inside the function. Conversely, I have also returned a data structure as a function result that was shared among multiple systems and the caller then proceeded to change the data by mistake, affecting these other systems in a difficult-to-trace problem usually called action at a distance. The best thing to do here would be to make a clone of the data before returning it*, or make it read-only**.
* In Perl, see the function dclone() in the built-in Storable module.
** In Perl, see lock_hash() or lock_hash_ref() in the built-in Hash::Util module).