Using the name "function" for a variable in Python code - coding-style

Is using the word function for the name of an argument considered bad style in Python code?
def conjunction_junction(function):
pass # do something and call function in here
This pattern occurs all the time, especially in decorators. You see func, fn and f used all of the time but I prefer to avoid abbreviations when possible. Is the fact that it's the name of a type enough to warrant abbreviating it?
>> type(conjunction_junction).__name__
'function'

It's not a reserved keyword, so I don't see why not.
From the Style Guide
If a function argument's name clashes with a reserved keyword, it is
generally better to append a single trailing underscore rather than
use an abbreviation or spelling corruption. Thus class_ is better than
clss. (Perhaps better is to avoid such clashes by using a synonym.)

Using function is perfectly fine.
There is nothing in the style guide about it specifically. The reason that the use of type names such as str and list is highly discouraged is because they have functionality within the language. Overwriting them would obscure the functionality of the code. function on the other hand, does nothing.
I suspect func, fn, and f are used because they are all shorter than typing function ;)

Related

Is there a way to use a dynamic function name in Elixir from string interpolation like in Ruby?

I want to be able to construct a function call from a string in elixir. Is this possible? The equivalent ruby method call would be:
"uppercase".send("u#{:pcase}")
Although the answer by #fhdhsni is perfectly correct, I’d add some nitpicking clarification.
The exact equivalent of Kernel#send from ruby in elixir is impossible, because Kernel#send allows to call private methods on the receiver. In elixir, private functions do not ever exist in the compiled code.
If you meant Kernel#public_send, it might be achieved with Kernel.apply/3, as mentioned by #fhdhsni. The only correction is since the atom table is not garbage collected, and one surely wants to call an indeed existing function, it should be done with String.to_existing_atom/1.
apply(
String,
String.to_existing_atom("u#{:pcase}"),
["uppercase"]
)
Also, one might use macros during the compilation stage to generate respective clauses when the list of functions to call is predictable (when it’s not, the code already smells.)
defmodule Helper do
Enum.each(~w|upcase|a, fn fname ->
def unquote(fname)(param),
do: String.unquote(fname)(param)
# or
# defdelegate unquote(fname)(param), to: String
end)
end
Helper.upcase("uppercase")
#⇒ "UPPERCASE"
In Elixir module and function names are atoms. You can use apply to call them dynamically.
apply(String, String.to_atom("u#{:pcase}"), ["uppercase"]) # "UPPERCASE"
Depending on your use case it might not be a good idea to create atoms dynamically (since the atom table is not garbage collected).

Perl, Alias sub to variable

I'm currently doing micro-optimization of a perl program and like to optimize some getters.
I have a package with this getter-structure:
package Test;
our $ABC;
sub GetABC { $ABC } # exported sub...
Calling GetABC() creates a lot of sub-related overhead. Accessing the variable directly via $Test::ABC is insanely faster.
Is there a way to alias the getter to the variable to gain the same performanceboost as if I would call the variable directly? Inlining hint with "()" doesn seem to work...
There is no way to turn a variable into an accessor sub, or to replace a sub with a variable access. You will have to live with the overhead.
Non-solutions:
Using a () prototype does not turn calls into your sub to constant accesses because that prototype merely makes a sub potentially eligible for inlining. Since the body of the sub is not itself constant, this sub cannot be a constant.
The overhead is per-call as perl has to do considerable bookkeeping for each call. Therefore, rewriting that accessor in XS won't help much.
Creating a constant won't help because the constant will be a copy, not an alias of your variable.
But looking at the constant.pm source code seems to open up an interesting solution. Note that this a hack, and may not work in all versions of Perl: When we assign a scalar ref to a symbol table entry directly where that entry does not yet contain a typeglob, then an inlineable sub springs into place:
package Foo;
use strict;
use warnings;
use feature 'say';
my $x = 7;
BEGIN { $Foo::{GetX} = \$x } # don't try this at home
say GetX; #=> 7
$x = 3;
say GetX; #=> 3
This currently works on most of my installed perl versions (5.14, 5.22, 5.24, 5.26). However, my 5.22-multi and 5.26-multi die with “Modification of a read-only value attempted”. This is not a problem for the constant module since it makes the reference target readonly first and (more importantly) never modifies that variable.
So not only doesn't this work reliably, this will also completely mess up constant folding.
If the function call overhead is indeed unbearable (e.g. takes a double-digit percentage of your processing time), then doing the inlining yourself in the source code is going to be your best bet. Even if you have a lot of call locations, you can probably create a simple script that fixes the easy cases for you: select all files that import your module and only have a single package declaration. Within such files, replace calls to GetABC (with or without parens) to fully qualified variable accesses. Hopefully that token is not mentioned within any strings. Afterwards, you can manually inspect the few remaining occurrences of these calls.

Why are 'new' and 'make' not reserved keywords?

With syntax highlighting enabled, it's distracting while reading code like answer to this question with new used as a variable name.
I'm trying to think of a reason why only a subset of keywords would be reserved and can't come up with a good one.
Edit: Alternate title for this question:
Why are Go's predeclared identifiers not reserved ?
That's because new and make aren't really keywords, but built-in functions.
If you examine the full list of the reserved keywords, you won't see len or cap either...
In short: because using predeclared identifiers in your own declarations don't make your code ambiguous.
new and make are builtin functions, amongst many others. See the full list in the doc of the builtin package.
Keywords are a carefully chosen small set of reserved words. You cannot use keywords as identifiers because that could make interpreting your source ambiguous.
Builtin functions are special functions: you can use them without any imports, and they may have varying parameter and return list. You are allowed to declare functions or variables with the names of the builtin functions because your code will still remain unambiguous: your identifier in scope will "win".
If you would use a keyword as an identifier, you couldn't reach your entity because an attempt to refer to it by its name would always mean the keyword and not your identifier.
See this dirty example of using the predeclared identifiers of the builtin function make and the builtin type int (not recommended for production code) (try it on the Go Playground):
make := func() string {
return "hijacked"
}
int := make() // Completely OK, variable 'int' will be a string
fmt.Println(int) // Prints "hijacked"
If you could use keywords as identifiers, even interpreting declarations would cause headache to the compiler: what would the following mean?
func() - is it calling your function named func or is it a function type with no params and no return types?
import() - is it a grouped import declaration (and importing nothing) or calling our function named import?
...

Pythonesque blocks and postfix expressions

In JavaScript,
f = function(x) {
return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
return x + 1
(5)
but this time we can't solve it the same way because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking maybe each level of the recursive descent parser should have a parameter along the lines of 'we have already eaten a block in this statement so don't do postfix'.
Are there any existing languages that encounter this problem, and how do they solve it if so?
Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda(x): x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax though. It is completely incompatible with how Python handles indentation (whitespace in general, actually) inside expressions - it doesn't, and that's the complete opposite of what you want. You should read the numerous python-ideas thread about multi-line lambdas. It's somewhere between very hard to impossible.
If you want arbitrarily complex compound statements inside lambdas you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and will certainly result in a language many Python programmers will consider worse in several regards: Harder to understand, more complex to implement, permits some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned by another answer. Coffeescript also uses indentation without end-of-block tokens. It parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solution is of little (if any) use for Python.
In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lamda function.
The details are a bit messier than this; look at the Haskell 2010 report.

Any reason NOT to always use keyword arguments?

Before jumping into python, I had started with some Objective-C / Cocoa books. As I recall, most functions required keyword arguments to be explicitly stated. Until recently I forgot all about this, and just used positional arguments in Python. But lately, I've ran into a few bugs which resulted from improper positions - sneaky little things they were.
Got me thinking - generally speaking, unless there is a circumstance that specifically requires non-keyword arguments - is there any good reason NOT to use keyword arguments? Is it considered bad style to always use them, even for simple functions?
I feel like as most of my 50-line programs have been scaling to 500 or more lines regularly, if I just get accustomed to always using keyword arguments, the code will be more easily readable and maintainable as it grows. Any reason this might not be so?
UPDATE:
The general impression I am getting is that its a style preference, with many good arguments that they should generally not be used for very simple arguments, but are otherwise consistent with good style. Before accepting I just want to clarify though - is there any specific non-style problems that arise from this method - for instance, significant performance hits?
There isn't any reason not to use keyword arguments apart from the clarity and readability of the code. The choice of whether to use keywords should be based on whether the keyword adds additional useful information when reading the code or not.
I follow the following general rule:
If it is hard to infer the function (name) of the argument from the function name – pass it by keyword (e.g. I wouldn't want to have text.splitlines(True) in my code).
If it is hard to infer the order of the arguments, for example if you have too many arguments, or when you have independent optional arguments – pass it by keyword (e.g. funkyplot(x, y, None, None, None, None, None, None, 'red') doesn't look particularly nice).
Never pass the first few arguments by keyword if the purpose of the argument is obvious. You see, sin(2*pi) is better than sin(value=2*pi), the same is true for plot(x, y, z).
In most cases, stable mandatory arguments would be positional, and optional arguments would be keyword.
There's also a possible difference in performance, because in every implementation the keyword arguments would be slightly slower, but considering this would be generally a premature optimisation and the results from it wouldn't be significant, I don't think it's crucial for the decision.
UPDATE: Non-stylistical concerns
Keyword arguments can do everything that positional arguments can, and if you're defining a new API there are no technical disadvantages apart from possible performance issues. However, you might have little issues if you're combining your code with existing elements.
Consider the following:
If you make your function take keyword arguments, that becomes part of your interface.
You can't replace your function with another that has a similar signature but a different keyword for the same argument.
You might want to use a decorator or another utility on your function that assumes that your function takes a positional argument. Unbound methods are an example of such utility because they always pass the first argument as positional after reading it as positional, so cls.method(self=cls_instance) doesn't work even if there is an argument self in the definition.
None of these would be a real issue if you design your API well and document the use of keyword arguments, especially if you're not designing something that should be interchangeable with something that already exists.
If your consideration is to improve readability of function calls, why not simply declare functions as normal, e.g.
def test(x, y):
print "x:", x
print "y:", y
And simply call functions by declaring the names explicitly, like so:
test(y=4, x=1)
Which obviously gives you the output:
x: 1
y: 4
or this exercise would be pointless.
This avoids having arguments be optional and needing default values (unless you want them to be, in which case just go ahead with the keyword arguments! :) and gives you all the versatility and improved readability of named arguments that are not limited by order.
Well, there are a few reasons why I would not do that.
If all your arguments are keyword arguments, it increases noise in the code and it might remove clarity about which arguments are required and which ones are optionnal.
Also, if I have to use your code, I might want to kill you !! (Just kidding), but having to type the name of all the parameters everytime... not so fun.
Just to offer a different argument, I think there are some cases in which named parameters might improve readability. For example, imagine a function that creates a user in your system:
create_user("George", "Martin", "g.m#example.com", "payments#example.com", "1", "Radius Circle")
From that definition, it is not at all clear what these values might mean, even though they are all required, however with named parameters it is always obvious:
create_user(
first_name="George",
last_name="Martin",
contact_email="g.m#example.com",
billing_email="payments#example.com",
street_number="1",
street_name="Radius Circle")
I remember reading a very good explanation of "options" in UNIX programs: "Options are meant to be optional, a program should be able to run without any options at all".
The same principle could be applied to keyword arguments in Python.
These kind of arguments should allow a user to "customize" the function call, but a function should be able to be called without any implicit keyword-value argument pairs at all.
Sometimes, things should be simple because they are simple.
If you always enforce you to use keyword arguments on every function call, soon your code will be unreadable.
When Python's built-in compile() and __import__() functions gain keyword argument support, the same argument was made in favor of clarity. There appears to be no significant performance hit, if any.
Now, if you make your functions only accept keyword arguments (as opposed to passing the positional parameters using keywords when calling them, which is allowed), then yes, it'd be annoying.
I don't see the purpose of using keyword arguments when the meaning of the arguments is obvious
Keyword args are good when you have long parameter lists with no well defined order (that you can't easily come up with a clear scheme to remember); however there are many situations where using them is overkill or makes the program less clear.
First, sometimes is much easier to remember the order of keywords than the names of keyword arguments, and specifying the names of arguments could make it less clear. Take randint from scipy.random with the following docstring:
randint(low, high=None, size=None)
Return random integers x such that low <= x < high.
If high is None, then 0 <= x < low.
When wanting to generate a random int from [0,10) its clearer to write randint(10) than randint(low=10) in my view. If you need to generate an array with 100 numbers in [0,10) you can probably remember the argument order and write randint(0, 10, 100). However, you may not remember the variable names (e.g., is the first parameter low, lower, start, min, minimum) and once you have to look up the parameter names, you might as well not use them (as you just looked up the proper order).
Also consider variadic functions (ones with variable number of parameters that are anonymous themselves). E.g., you may want to write something like:
def square_sum(*params):
sq_sum = 0
for p in params:
sq_sum += p*p
return sq_sum
that can be applied a bunch of bare parameters (square_sum(1,2,3,4,5) # gives 55 ). Sure you could have written the function to take an named keyword iterable def square_sum(params): and called it like square_sum([1,2,3,4,5]) but that may be less intuitive, especially when there's no potential confusion about the argument name or its contents.
A mistake I often do is that I forget that positional arguments have to be specified before any keyword arguments, when calling a function. If testing is a function, then:
testing(arg = 20, 56)
gives a SyntaxError message; something like:
SyntaxError: non-keyword arg after keyword arg
It is easy to fix of course, it's just annoying. So in the case of few - lines programs as the ones you mention, I would probably just go with positional arguments after giving nice, descriptive names to the parameters of the function. I don't know if what I mention is that big of a problem though.
One downside I could see is that you'd have to think of a sensible default value for everything, and in many cases there might not be any sensible default value (including None). Then you would feel obliged to write a whole lot of error handling code for the cases where a kwarg that logically should be a positional arg was left unspecified.
Imagine writing stuff like this every time..
def logarithm(x=None):
if x is None:
raise TypeError("You can't do log(None), sorry!")

Resources