Marking-up argument names in comments of functions - coding-style

One of the most common dilemmas I have when commenting code is how to mark-up argument names. I'll explain what I mean:
def foo(vector, width, n=0):
    """ Transmogrify vector to fit into width. No more than n
        elements will be transmogrified at a time
    """
Now, my problem with this is that the argument names vector, width and n are not distinguished in that comment in any way, and can be confused for simple text. Some other options:
Transmogrify 'vector' to fit into
'width'. No more than 'n'
Or maybe:
Transmogrify -vector- to fit into
-width-. No more than -n-
Or even:
Transmogrify :vector: to fit into
:width:. No more than :n:
You get the point. Some tools like Doxygen impose this, but what if I don't use a tool? Is this language dependent?
What do you prefer to use?

I personally prefer single quotes--your first example. It seems closest to how certain titles / named entities can be referenced in English text when neither underlining nor italics are available.

I agree with Reuben: The first example is the most readable.
Of course that depends on your personal reading habits: if you are used to reading comments in the style of your third example, you may find that style the most readable.
But the first style is closest to the way we read and write text in day-to-day life (newspapers, books). Therefore it is the one that will be easiest to read for someone who has no prior experience reading your comments.

I kind of use neither, and simply put the names of the variables in the text. Or I write the whole text in such a way that it explains what the function does without mentioning the parameters at all. That works when the meaning of the parameters becomes clear by itself once you understand what the function does.

My favourite option is to write:
def foo(vector, width, n=0):
    """ Transmogrify 'vector' to fit into 'width'. No more than 'n'
        elements will be transmogrified at a time
        @param vector: list of something
        @param width: int
        @keyword n: int (default 0)
    """
Epydoc recognizes @param (see the Epydoc manual), and you can use a regular expression to find and print the parameters of your function. Hopefully Eclipse will start showing parameter descriptions for Python functions in quick assist some day, and I'm pretty sure it would follow the pattern
@ <keyword> <paramName> <colon>
Anyway, when that day comes it will be easy to replace @param with @anythingElse.
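As a rough sketch of the regular-expression idea, assuming the fields are written in Epydoc's `@param name: description` form, something like the following could pull the parameter descriptions out of a docstring. The helper name `describe_params` and the pattern are illustrative only, not part of Epydoc:

```python
import re


def foo(vector, width, n=0):
    """Transmogrify 'vector' to fit into 'width'. No more than 'n'
    elements will be transmogrified at a time.

    @param vector: list of something
    @param width: int
    @keyword n: int (default 0)
    """


# Matches one "@<tag> <name>: <description>" field per line.
FIELD_RE = re.compile(r"^\s*@(param|keyword)\s+(\w+):\s*(.*)$", re.MULTILINE)


def describe_params(func):
    """Return (tag, name, description) tuples found in func's docstring."""
    return FIELD_RE.findall(func.__doc__ or "")


for tag, name, description in describe_params(foo):
    print(f"{name} ({tag}): {description}")
```

Swapping `@param` for another marker would then only mean changing the pattern in one place.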

Related

Refactoring Business Rule, Function Naming, Width, Height, Position X & Y

I am refactoring some business rule functions to provide a more generic version of the function.
The functions I am refactoring are:
DetermineWindowWidth
DetermineWindowHeight
DetermineWindowPositionX
DetermineWindowPositionY
All of them do string parsing, as it is a string parsing business rules engine.
My question is what would be a good name for the newly refactored function?
Obviously I want to shy away from a function name like:
DetermineWindowWidthHeightPositionXPositionY
I mean that would work, but it seems unnecessarily long when it could be something like:
DetermineWindowMoniker or something to that effect.
Function objective: Parse an input string like 1280x1024 or 200,100 and return either the first or second number. The use case is for data-driving test automation of a web browser window, but this should be irrelevant to the answer.
Question objective: I have the code to do this, so my question is not about code, but just the function name. Any ideas?
There are too few details; you should have specified at least the parameters and return values of the functions.
Have I understood correctly that you use strings of the format NxN for sizes and N,N for positions?
And that this generic function will have to parse both (and nothing else), and will return either the first or second part depending on a parameter of the function?
And that you'll then keep the various DetermineWindow* functions but make them all call this generic function?
If so:
Without knowing what parameters the generic function has it's even harder to help, but it's most likely impossible to give it a simple name.
Not all batches of code can be described by a simple name.
You'll most likely need to use a different construction if you want to have clear names. Here's an idea, in pseudo code:
ParseSize(string, outWidth, outHeight) {
    ParsePair(string, "x", outWidth, outHeight)
}
ParsePosition(string, outX, outY) {
    ParsePair(string, ",", outX, outY)
}
ParsePair(string, separator, outFirstItem, outSecondItem) {
    ...
}
And the various DetermineWindow would call ParseSize or ParsePosition.
You could also use just ParsePair directly, but I think it's cleaner to have the two other functions in the middle.
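For illustration, here is one way the layering above might look in Python. The snake_case names and the choice to return a tuple instead of using out-parameters are assumptions made for this sketch:

```python
def parse_pair(text, separator):
    """Split 'text' on 'separator' and return both parts as integers."""
    first, second = text.split(separator)
    return int(first), int(second)


def parse_size(text):
    """Parse a size such as '1280x1024' into (width, height)."""
    return parse_pair(text, "x")


def parse_position(text):
    """Parse a position such as '200,100' into (x, y)."""
    return parse_pair(text, ",")


def determine_window_width(text):
    """One of the original business-rule functions, now a thin wrapper."""
    width, _height = parse_size(text)
    return width


print(parse_size("1280x1024"))
print(parse_position("200,100"))
print(determine_window_width("1280x1024"))
```

Each name stays short because each function does one thing; only the generic helper needs to know about separators.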
Objects
Note that you'd probably get cleaner code by using objects rather than strings (a Size and a Position one, and probably a Pair one too).
The ParsePair code (adapted appropriately) would be included in a constructor or factory method that gives you a Pair out of a string.
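A minimal sketch of that object-based variant, again in Python and with hypothetical names (`Pair`, `from_string`), could be:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """Two integers parsed from a string such as '1280x1024' or '200,100'."""

    first: int
    second: int

    @classmethod
    def from_string(cls, text, separator):
        # Factory method: the parsing lives next to the data it produces.
        first, second = text.split(separator)
        return cls(int(first), int(second))


size = Pair.from_string("1280x1024", "x")
position = Pair.from_string("200,100", ",")
print(size.first, size.second, position.first, position.second)
```

Dedicated Size and Position classes could wrap or subclass this so that callers read `size.width` rather than `size.first`.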
---
Of course you can give other names to the various functions, objects and parameters, here I used the first that came to my mind.
It seems this question-answer provides a good starting point to answer this question:
Appropriate name for container of position, size, angle
A search on www.thesaurus.com for "Property" gives some interesting possible answers that provide enough meaningful context to the usage:
Aspect
Character
Characteristic
Trait
Virtue
Property
Quality
Attribute
Differentia
Frame
Constituent
I think ConstituentProperty is probably the most apt.

Good style for splitting lengthy expressions over lines

If the following is not the best style, what is for the equivalent expression?
if (some_really_long_expression__________ && \
some_other_really_long_expression)
The line continuation feels ugly. But I'm having a hard time finding a better alternative.
The parser doesn't need the backslashes in cases where the continuation is unambiguous. For example, using Ruby 2.0:
if true &&
   true &&
   true
  puts true
end
#=> true
The following are some more-or-less random thoughts about the question of line length from someone who just plays with Ruby. Nor have I had any training as a software engineer, so consider yourself forewarned.
I find the problem of long lines is often more the number of characters than the number of operations. The former can be reduced by (drum-roll) shortening variable names and method names. The question, of course, is whether the application of a verbosity filter (aka babbling, prattling or jabbering filter) will make the code harder to comprehend. How often have you seen something fairly close to the following (without \)?
total_cuteness_rating = cats_dogs_and_pigs.map {|animal| \
cuteness_calculation(animal)}.reduce {|cuteness_accumulator, \
cuteness_per_animal| cuteness_accumulator + cuteness_per_animal}
Compare that with:
tot_cuteness = pets.map {|a| cuteness(a)}.reduce(&:+)
Firstly, I see no benefit of long names for local variables within a block (and rarely for local variables in a method). Here, isn't it perfectly obvious what a refers to in the calculation of tot_cuteness? How good a memory do you need to remember what a is when it is confined to a single line of code?
Secondly, whenever possible use the short form for enumerables followed by a block (e.g., reduce(&:+)). This allows us to comprehend what's going on in microseconds, here as soon as our eyes latch onto the +. The same goes for to_i, to_s or to_f. True, reduce {|tot, e| tot + e} isn't much longer, but we're forcing the reader's brain to decode two variables as well as the operator, when + is really all it needs.
Another way to shorten lines is to avoid long chains of operations. That comes at a cost, however. As far as I'm concerned, the longer the chain, the better. It reduces the need for temporary variables, reduces the number of lines of code and--possibly of greatest importance--allows us to read across a line, as most humans are accustomed, rather than down the page. The above line of code reads, "To calculate total cuteness, calculate each pet's cuteness rating, then sum those ratings". How could it be more clear?
When chains are particularly long, they can be written over multiple lines without using the line-continuation character \:
array.each {|e| blah, blah, ..., blah
.map {|a| blah, blah, ..., blah
.reduce {|i| blah, blah, ..., blah }
}
}
That's no less clear than separate statements. I think this is frequently done in Rails.
What about the use of abbreviations? Which of the following names is most clear?
number_of_dogs
number_dogs
nbr_dogs
n_dogs
I would argue the first three are equally clear, and the last no less clear if the writer consistently prefixes variable names with n_ when that means "number of". Same for tot_, and so on. Enough.
One approach is to encapsulate those expressions inside meaningful methods. And you might be able to break it into multiple methods that you can later reuse.
Other than that, it is hard to suggest anything with the little information you gave. You might be able to get rid of the if statement using command objects or something like that, but I can't tell whether it makes sense for your code because you didn't show it.
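The first suggestion above, wrapping each long expression in a meaningfully named method, might look like this when sketched in Python. The `user` dictionary and both predicate names are made up for the example:

```python
def is_active(user):
    """Stands in for 'some_really_long_expression'."""
    return user["enabled"] and not user["suspended"]


def has_quota_left(user):
    """Stands in for 'some_other_really_long_expression'."""
    return user["used_bytes"] < user["quota_bytes"]


user = {"enabled": True, "suspended": False, "used_bytes": 10, "quota_bytes": 100}

# The condition now fits on one line and reads like a sentence.
if is_active(user) and has_quota_left(user):
    print("upload allowed")
```

The predicates can also be reused and tested on their own.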
Ismael's answer works really well in Ruby (and possibly in other languages too) for 2 reasons:
Ruby has very low overhead for creating methods due to the lack of type definitions
It allows you to decouple such logic for reuse or future adaptability and testing
Another option I'll toss out is to create logic equations and store the results in variables, e.g.
# these are short logic equations testing x, but you can apply the same idea to longer expressions
number_gt_5 = x > 5
number_lt_20 = x < 20
number_eq_11 = x == 11
if (number_gt_5 && number_lt_20 && !number_eq_11)
  # do some stuff
end

Turn string into number in Racket

I used read to get a line from a file. The documentation said read returns any, so is it turning the line to a string? I have problems turning the string "1" to the number 1, or "500.8232" into 500.8232. I am also wondering if Racket can directly read numbers in from a file.
Check out their documentation search, it's complete and accurate. Conversion functions usually have the form of foo->bar (which you can assume takes a foo and returns a bar constructed from it).
You sound like you're looking for a function that takes a string and returns a number, and as it happens, string->number does exist, and does pretty much exactly what you're looking for.
Looks like this was answered in another question:
Convert String to Code in Scheme
NB: that converts any s-expression, not just integers. If you want just integers, try:
string->number
Which is mentioned in
Scheme language: merge two numbers
HTH

Any reason NOT to always use keyword arguments?

Before jumping into Python, I had started with some Objective-C / Cocoa books. As I recall, most functions required keyword arguments to be explicitly stated. Until recently I forgot all about this, and just used positional arguments in Python. But lately, I've run into a few bugs which resulted from improper positions - sneaky little things they were.
Got me thinking - generally speaking, unless there is a circumstance that specifically requires non-keyword arguments - is there any good reason NOT to use keyword arguments? Is it considered bad style to always use them, even for simple functions?
I feel like as most of my 50-line programs have been scaling to 500 or more lines regularly, if I just get accustomed to always using keyword arguments, the code will be more easily readable and maintainable as it grows. Any reason this might not be so?
UPDATE:
The general impression I am getting is that it's a style preference, with many good arguments that they should generally not be used for very simple arguments, but are otherwise consistent with good style. Before accepting I just want to clarify though - are there any specific non-style problems that arise from this method - for instance, significant performance hits?
There isn't any reason not to use keyword arguments apart from the clarity and readability of the code. The choice of whether to use keywords should be based on whether the keyword adds additional useful information when reading the code or not.
I follow the following general rule:
If it is hard to infer the function (name) of the argument from the function name – pass it by keyword (e.g. I wouldn't want to have text.splitlines(True) in my code).
If it is hard to infer the order of the arguments, for example if you have too many arguments, or when you have independent optional arguments – pass it by keyword (e.g. funkyplot(x, y, None, None, None, None, None, None, 'red') doesn't look particularly nice).
Never pass the first few arguments by keyword if the purpose of the argument is obvious. You see, sin(2*pi) is better than sin(value=2*pi), the same is true for plot(x, y, z).
In most cases, stable mandatory arguments would be positional, and optional arguments would be keyword.
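The rules above can be illustrated with two standard-library calls. This is only a small sketch; the sample text is invented:

```python
import math

text = "first line\nsecond line\n"

# A bare True says nothing about what it controls...
lines_positional = text.splitlines(True)
# ...while the keyword form documents itself at the call site.
lines_keyword = text.splitlines(keepends=True)

# The purpose of the single argument is obvious, so it stays positional.
value = math.sin(2 * math.pi)

print(lines_keyword)
print(value)
```

Both `splitlines` calls return the same list; only the readability of the call differs.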
There's also a possible difference in performance, because keyword arguments tend to be slightly slower to pass, but optimising for this would generally be premature and the gain wouldn't be significant, so I don't think it's crucial for the decision.
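If you want to check the performance point on your own interpreter, a quick and unscientific measurement with `timeit` could look like the sketch below. The function `add` is a throwaway example, and the numbers will vary between machines and Python versions:

```python
import timeit


def add(a, b):
    return a + b


# Time the same call made positionally and by keyword.
positional = timeit.timeit("add(1, 2)", globals={"add": add}, number=100_000)
keyword = timeit.timeit("add(a=1, b=2)", globals={"add": add}, number=100_000)

print(f"positional: {positional:.4f}s")
print(f"keyword:    {keyword:.4f}s")
```

Whatever difference shows up is spread over 100,000 calls, which is why it rarely matters in practice.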
UPDATE: Non-stylistic concerns
Keyword arguments can do everything that positional arguments can, and if you're defining a new API there are no technical disadvantages apart from possible performance issues. However, you might run into a few small issues if you're combining your code with existing elements.
Consider the following:
If you make your function take keyword arguments, that becomes part of your interface.
You can't replace your function with another that has a similar signature but a different keyword for the same argument.
You might want to use a decorator or another utility on your function that assumes that your function takes a positional argument. Unbound methods in Python 2 are an example of such a utility because they always pass the first argument as positional after reading it as positional, so cls.method(self=cls_instance) doesn't work even if there is an argument self in the definition.
None of these would be a real issue if you design your API well and document the use of keyword arguments, especially if you're not designing something that should be interchangeable with something that already exists.
If your consideration is to improve readability of function calls, why not simply declare functions as normal, e.g.
def test(x, y):
    print("x:", x)
    print("y:", y)
And simply call functions by declaring the names explicitly, like so:
test(y=4, x=1)
Which obviously gives you the output:
x: 1
y: 4
or this exercise would be pointless.
This avoids having arguments be optional and needing default values (unless you want them to be, in which case just go ahead with the keyword arguments! :) and gives you all the versatility and improved readability of named arguments that are not limited by order.
Well, there are a few reasons why I would not do that.
If all your arguments are keyword arguments, it increases noise in the code and it might remove clarity about which arguments are required and which ones are optional.
Also, if I have to use your code, I might want to kill you!! (Just kidding), but having to type the name of all the parameters every time... not so fun.
Just to offer a different argument, I think there are some cases in which named parameters might improve readability. For example, imagine a function that creates a user in your system:
create_user("George", "Martin", "g.m@example.com", "payments@example.com", "1", "Radius Circle")
From that call, it is not at all clear what these values mean, even though they are all required; with named parameters, however, it is always obvious:
create_user(
    first_name="George",
    last_name="Martin",
    contact_email="g.m@example.com",
    billing_email="payments@example.com",
    street_number="1",
    street_name="Radius Circle")
I remember reading a very good explanation of "options" in UNIX programs: "Options are meant to be optional, a program should be able to run without any options at all".
The same principle could be applied to keyword arguments in Python.
These kinds of arguments should allow a user to "customize" the function call, but a function should be able to be called without any implicit keyword-value argument pairs at all.
Sometimes, things should be simple because they are simple.
If you always force yourself to use keyword arguments on every function call, your code will soon be unreadable.
When Python's built-in compile() and __import__() functions gained keyword argument support, the same argument was made in favor of clarity. There appears to be no significant performance hit, if any.
Now, if you make your functions only accept keyword arguments (as opposed to passing the positional parameters using keywords when calling them, which is allowed), then yes, it'd be annoying.
I don't see the purpose of using keyword arguments when the meaning of the arguments is obvious
Keyword args are good when you have long parameter lists with no well defined order (that you can't easily come up with a clear scheme to remember); however there are many situations where using them is overkill or makes the program less clear.
First, sometimes it is much easier to remember the order of the arguments than their names, and specifying the names could make the call less clear. Take randint from scipy.random with the following docstring:
randint(low, high=None, size=None)
Return random integers x such that low <= x < high.
If high is None, then 0 <= x < low.
When wanting to generate a random int from [0,10) it's clearer to write randint(10) than randint(low=10) in my view. If you need to generate an array with 100 numbers in [0,10) you can probably remember the argument order and write randint(0, 10, 100). However, you may not remember the variable names (e.g., is the first parameter low, lower, start, min, minimum) and once you have to look up the parameter names, you might as well not use them (as you just looked up the proper order).
Also consider variadic functions (ones with variable number of parameters that are anonymous themselves). E.g., you may want to write something like:
def square_sum(*params):
    sq_sum = 0
    for p in params:
        sq_sum += p*p
    return sq_sum
that can be applied to a bunch of bare parameters (square_sum(1,2,3,4,5) # gives 55 ). Sure, you could have written the function to take a named iterable, def square_sum(params):, and called it like square_sum([1,2,3,4,5]), but that may be less intuitive, especially when there's no potential confusion about the argument name or its contents.
A mistake I often make is forgetting that positional arguments have to be specified before any keyword arguments when calling a function. If testing is a function, then:
testing(arg = 20, 56)
gives a SyntaxError message; something like:
SyntaxError: non-keyword arg after keyword arg
It is easy to fix of course, it's just annoying. So for programs of only a few lines, like the ones you mention, I would probably just go with positional arguments after giving nice, descriptive names to the parameters of the function. I don't know if what I mention is that big of a problem though.
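To see the ordering rule without crashing a script, the offending call can be handed to the built-in `compile()` as a string. This is only a sketch; the exact wording of the error message depends on the Python version, so it is printed rather than assumed:

```python
bad_call = "testing(arg=20, 56)"   # positional argument after a keyword argument
good_call = "testing(56, arg=20)"  # positional arguments first

try:
    compile(bad_call, "<example>", "eval")
except SyntaxError as exc:
    error_message = exc.msg
else:
    error_message = None

# The corrected call compiles without complaint; 'testing' is never actually called.
compiled_ok = compile(good_call, "<example>", "eval") is not None

print(error_message)
print(compiled_ok)
```

Because the error is raised at compile time, it shows up as soon as the file is loaded rather than when the line is executed.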
One downside I could see is that you'd have to think of a sensible default value for everything, and in many cases there might not be any sensible default value (including None). Then you would feel obliged to write a whole lot of error handling code for the cases where a kwarg that logically should be a positional arg was left unspecified.
Imagine writing stuff like this every time..
def logarithm(x=None):
    if x is None:
        raise TypeError("You can't do log(None), sorry!")
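It may be worth noting that passing an argument by keyword at the call site does not require the parameter to have a default. In the sketch below (the `base` parameter is an addition made for illustration), `x` has no default, so Python itself raises a TypeError when it is missing, and the bare `*` makes `base` keyword-only:

```python
import math


def logarithm(x, *, base=math.e):
    """'x' is required; 'base' is optional and can only be passed by keyword."""
    return math.log(x, base)


result = logarithm(x=100, base=10)

try:
    logarithm()  # 'x' was left out
except TypeError as exc:
    missing_argument_error = str(exc)
else:
    missing_argument_error = None

print(result)
print(missing_argument_error)
```

No hand-written None check is needed for the required argument.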

What are the pros and cons of putting as much logic as possible into a minimal (one-liner) piece of code?

Is it cool?
IMO one-liners reduce readability and make debugging/understanding more difficult.
Maximize understandability of the code.
Sometimes that means putting (simple, easily understood) expressions on one line in order to get more code in a given amount of screen real-estate (i.e. the source code editor).
Other times that means taking small steps to make it obvious what the code means.
One-liners should be a side-effect, not a goal (nor something to be avoided).
If there is a simple way of expressing something in a single line of code, that's great. If it's just a case of stuffing in lots of expressions into a single line, that's not so good.
To explain what I mean - LINQ allows you to express quite complicated transformations in relative simplicity. That's great - but I wouldn't try to fit a huge LINQ expression onto a single line. For instance:
var query = from person in employees
            where person.Salary > 10000m
            orderby person.Name
            select new { person.Name, person.Department };
is more readable than:
var query = from person in employees where person.Salary > 10000m orderby person.Name select new { person.Name, person.Department };
It's also more readable than doing all the filtering, ordering and projection manually. It's a nice sweet-spot.
Trying to be "clever" is rarely a good idea - but if you can express something simply and concisely, that's good.
One-liners, when used properly, transmit your intent clearly and make the structure of your code easier to grasp.
A python example is list comprehensions:
new_lst = [i for i in lst if some_condition]
instead of:
new_lst = []
for i in lst:
    if some_condition:
        new_lst.append(i)
This is a commonly used idiom that makes your code much more readable and compact. So, the best of both worlds can be achieved in certain cases.
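A concrete, runnable version of the same comparison, with `i > 0` standing in for `some_condition` and invented sample data:

```python
lst = [3, -1, 4, -1, 5, -9]

# One-liner: filter with a list comprehension.
new_lst = [i for i in lst if i > 0]

# Equivalent explicit loop.
loop_lst = []
for i in lst:
    if i > 0:
        loop_lst.append(i)

print(new_lst)
print(new_lst == loop_lst)
```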
This is by definition subjective, and due to the vagueness of the question, you'll likely get answers all over the map. Are you referring to a single physical line or logical line? EG, are you talking about:
int x = BigHonkinClassName.GetInstance().MyObjectProperty.PropertyX.IntValue.This.That.TheOther;
or
int x = BigHonkinClassName.GetInstance().
MyObjectProperty.PropertyX.IntValue.
This.That.TheOther;
One-liners, to me, are a matter of "what feels right." In the case above, I'd probably break that into both physical and logic lines, getting the instance of BigHonkinClassName, then pulling the full path to .TheOther. But that's just me. Other people will disagree. (And there's room for that. Like I said, subjective.)
Regarding readability, bear in mind that, for many languages, even "one-liners" can be broken out into multiple lines. If you have a long set of conditions for the conditional ternary operator (? :), for example, it might behoove you to break it into multiple physical lines for readability:
int x = (/* some long condition */) ?
/* some long method/property name returning an int */ :
/* some long method/property name returning an int */ ;
At the end of the day, the answer is always: "It depends." Some frameworks (such as many DAL generators, EG SubSonic) almost require obscenely long one-liners to get any real work done. Other times, breaking that into multiple lines is quite preferable.
Given concrete examples, the community can provide better, more practical advice.
In general, I definitely don't think you should ever "squeeze" a bunch of code onto a single physical line. That doesn't just hurt legibility, it smacks of someone who has outright disdain for the maintenance programmer. As I used to teach my students: always code for the maintenance programmer, because it will often be you.
:)
Oneliners can be useful in some situations
int value = bool ? 1 : 0;
But for the most part they make the code harder to follow. I think you should only put things on one line when it is easy to follow, the intent is clear, and it won't affect debugging.
One-liners should be treated on a case-by-case basis. Sometimes it can really hurt readability and a more verbose (read: easy-to-follow) version should be used.
There are times, however when a one-liner seems more natural. Take the following:
int Total = (Something ? 1 : 2)
+ (SomethingElse ? (AnotherThing ? x : y) : z);
Or the equivalent (slightly less readable?):
int Total = Something ? 1 : 2;
Total += SomethingElse ? (AnotherThing ? x : y) : z;
IMHO, I would prefer either of the above to the following:
int Total;
if (Something)
    Total = 1;
else
    Total = 2;
if (SomethingElse)
    if (AnotherThing)
        Total += x;
    else
        Total += y;
else
    Total += z;
With the nested if-statements, I have a harder time figuring out the final result without tracing through it. The one-liner feels more like the math formula it was intended to be, and consequently easier to follow.
As far as the cool factor, there is a certain feeling of accomplishment / show-off factor in "Look Ma, I wrote a whole program in one line!". But I wouldn't use it in any context other than playing around; I certainly wouldn't want to have to go back and debug it!
Ultimately, with real (production) projects, whatever makes it easiest to understand is best. Because there will come a time that you or someone else will be looking at the code again. What they say is true: time is precious.
That's true in most cases, but in some cases where one-liners are common idioms, then it's acceptable. ? : might be an example. Closure might be another one.
No, it is annoying.
One liners can be more readable and they can be less readable. You'll have to judge from case to case.
And, of course, on the prompt one-liners rule.
VASTLY more important is developing and sticking to a consistent style.
You'll find bugs MUCH faster, be better able to share code with others, and even code faster if you merely develop and stick to a pattern.
One aspect of this is to make a decision on one-liners. Here's one example from my shop (I run a small coding department) - how we handle IFs:
Ifs shall never be all on one line if they overflow the visible line length, including any indentation.
Thou shalt never have else clauses on the same line as the if even if it comports with the line-length rule.
Develop your own style and STICK WITH IT (or, refactor all code in the same project if you change style).
The main drawback of "one liners" in my opinion is that it makes it hard to break on the code and debug. For example, pretend you have the following code:
a().b().c(d() + e())
If this isn't working, it's hard to inspect the intermediate values. However, it's trivial to break with gdb (or whatever other tool you may be using) in the following, and check each individual variable and see precisely what is failing:
A = a();
B = A.b();
D = d();
E = e(); // here I can query A, B, D and E
B.c(D + E);
One rule of thumb: use a one-liner if you can express its concept in plain language in a very short sentence, such as "If it's true, set it to this, otherwise set it to that."
For a code construct where the ultimate objective of the entire structure is to decide what value to assign to a single variable, it is almost always clearer, with appropriate formatting, to put multiple conditionals into a single statement. With multiple nested if/else blocks, the overall objective, to set the variable...
" variableName = "
must be repeated in every nested clause, and the eye must read all of them to see this. With a single statement it is much clearer, and with the appropriate formatting the complexity is more easily managed as well:
decimal cost =
    usePriority? PriorityRate * weight:
    useAirFreight? AirRate * weight:
    crossMultRegions? MultRegionRate:
    SingleRegionRate;
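The same layout carries over to Python's conditional expressions, which chain from left to right in the same way. The rate values below are placeholders invented for this sketch:

```python
# Placeholder rates, made up for the example.
PRIORITY_RATE = 4.0
AIR_RATE = 2.5
MULT_REGION_RATE = 30.0
SINGLE_REGION_RATE = 12.0


def shipping_cost(weight, use_priority, use_air_freight, cross_mult_regions):
    """Pick the cost with one formatted expression; the first true condition wins."""
    return (
        PRIORITY_RATE * weight if use_priority else
        AIR_RATE * weight if use_air_freight else
        MULT_REGION_RATE if cross_mult_regions else
        SINGLE_REGION_RATE
    )


print(shipping_cost(2, True, False, False))
print(shipping_cost(2, False, False, False))
```

The assignment target appears once, and each line pairs a condition with its result.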
The pro is an easily understood one-liner that works.
The con is the concatenation of obfuscated gibberish on one line.
Generally, I'd call it a bad idea (although I do it myself on occasion) -- it strikes me as something that's done more to impress on how clever someone is than it is to make good code. "Clever tricks" of that sort are generally very bad.
That said, I personally aim to have one "idea" per line of code; if this burst of logic is easily encapsulated in a single thought, then go ahead. If you have to stop and puzzle it out a bit, best to break it up.
