In Lisp/Racket/Scheme how is it possible to have an argument named `list`?

Isn't `list` a keyword that creates a new list in Lisp? And yet it is possible to have an argument called `list`. I thought keywords in most programming languages, such as Java or C++, cannot be used as argument names; is there a special reason that in Lisp they can?

The name list isn't a reserved keyword, it's an ordinary function. Reusing the name for another purpose can be confusing for the reader but doesn't present any problems for the language itself; it's the same as having two variables called x in different parts of the program.
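For instance, in Racket (a minimal sketch; the name sum-of is made up for illustration), a parameter called list simply shadows the built-in function inside the body and nowhere else:

#lang racket

;; `list` is an ordinary parameter name; inside sum-of it shadows
;; the built-in `list` function, outside it does not.
(define (sum-of list)
  (apply + list))

(sum-of (list 1 2 3))  ; => 6; here `list` is still the usual function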

Mainstream Lisp descendants and derivatives like Common Lisp and Scheme do not incorporate the concept of reserved keywords. It is alien to the way Lisp works.
When Lisp read syntax is scanned, identifier tokens which appear in it are converted into corresponding symbol objects. These tokens are all in the same lexical category: symbol.
When Lisp read syntax is scanned and turned into an object, such as a nested list representing program code, this is done without regard for the semantics (what the symbols mean).
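You can see this in Racket, for example (a minimal sketch): read produces plain symbols even for names that are "special" elsewhere, because at this stage they are nothing but symbols:

#lang racket

;; read converts identifier tokens to symbols with no regard
;; for what they mean:
(with-input-from-string "(lambda lambda lambda)" read)
;; => '(lambda lambda lambda), a list of three ordinary symbols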
This is different from the parsing of languages (such as some of those in the broad Fortran/Algol family) which have reserved keywords.
Roughly speaking, reserved keywords are tokens which look like symbols but are actually just punctuation. Lisp has punctuation also, like parentheses, sharpsign prefixes, various quotes and such.
These punctuation words have a fixed role in the phrase structure grammar, and the phrase structure grammar must be processed before the semantics of the program can be considered.
So for instance, the reserved BEGIN and END keywords in Pascal are essentially nothing more than verbose parentheses. The '(' and ')' tokens are similarly reserved in Lisp-like languages. Trying to use BEGIN as the name of a function or variable in Pascal is similar to trying to use ( as the name of a function or variable in Lisp.
Some languages have keywords which determine phrase structure, yet allow identifiers which look exactly like reserved keywords to be used anyway. For instance, PL/I was famous for this:
IF IF=THEN THEN THEN=ELSE; ELSE ELSE=IF
Lisp dialects may assign special semantic treatment to certain symbols or certain categories of symbols. This is a sort of reservation, but not exactly the same as reserved keywords, because it is at the semantic level.
For instance, in Common Lisp, the symbols nil and t (more specifically the nil and t in the common-lisp package, common-lisp:nil and common-lisp:t) may not be used as function or variable names. When either one appears as an expression, it evaluates to itself: the value of t is t and that of nil is nil. Moreover, nil is also the Boolean false value and the empty list. So, effectively, these symbols are reserved in some regards.
Common Lisp also has a keyword package. All symbols in that package evaluate to themselves and may not be used as variables. They may be used as function names, and for any other purpose.

You say Lisp, but the answer changes depending on which Lisp you're talking about.
In Common Lisp, you can use list as a variable because Common Lisp is a Lisp-2, meaning that each symbol has a separate slot for a function binding and a variable binding. Common Lisp sets the function binding for the symbol list in the CL package, but doesn't set the variable binding. You can't change the function binding because Common Lisp doesn't allow you to redefine bindings for symbols that are set in the CL package (you can, of course, use whatever symbols you like in your own packages), but since the variable binding is free you're allowed to use it.
Scheme is a Lisp-1, which means that it has only one binding per symbol. There's no separation of function bindings and variable bindings (hence why you use define in Scheme, but defun and defvar in CL). The reason you can use list as a variable is that Scheme doesn't prevent you from rebinding its built-in symbols. It's just generally a bad idea, since after redefining list you can no longer call the list function.
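A quick Racket sketch of that hazard (the value 42 is arbitrary, just for illustration):

#lang racket

;; In a Lisp-1 there is one binding per name, so a top-level
;; definition of `list` replaces the built-in for this module:
(define list 42)
;; (list 1 2 3)  ; would now fail: 42 is not a procedure
(+ list 1)       ; => 43; `list` is now just a variable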
Emacs Lisp is a Lisp-2 but doesn't prevent you from rebinding symbols, which means you can do things like (defun + (a b) (- a b)) and totally screw up your editing session. So... don't do that, unless you really know what you're doing.
Clojure is a Lisp-1. I don't have a working Clojure install at the moment so I can't comment on what it lets you do. I would suspect it's more strict than Scheme.

Related

Why do Julia programmers need to prefix macros with the at-sign?

Whenever I see a Julia macro in use like @assert or @time I'm always wondering about the need to distinguish a macro syntactically with the @ prefix. What should I be thinking of when using @ for a macro? For me it adds noise and distraction to an otherwise very nice language (syntactically speaking).
I mean, for me '@' has a meaning of reference, i.e. a location like a domain or address. In the location sense @ does not have a meaning for macros other than that it is a different compilation step.
The @ should be seen as a warning sign which indicates that the normal rules of the language might not apply. E.g., a function call
f(x)
will never modify the value of the variable x in the calling context, but a macro invocation
@mymacro x
(or @mymacro f(x) for that matter) very well might.
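Julia's macros are expression-based like Lisp's, so the same contrast can be sketched in Racket, the language of the earlier questions (a minimal sketch; the names reset-fn and reset! are made up):

#lang racket

;; A function receives only the *value* of x; set! here mutates
;; the local parameter, never the caller's variable.
(define (reset-fn v) (set! v 0))

;; A macro receives the *expression* x itself, so it can rebind it.
(define-syntax-rule (reset! place) (set! place 0))

(define x 5)
(reset-fn x)  ; x is unchanged
(reset! x)    ; x is now 0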
Another reason is that macros in Julia are not based on textual substitution as in C, but substitution in the abstract syntax tree (which is much more powerful and avoids the unexpected consequences that textual substitution macros are notorious for).
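For example (a Racket sketch of the pitfall that AST substitution avoids; in C, #define SQUARE(x) x * x turns SQUARE(1 + 2) into 1 + 2 * 1 + 2, which is 5):

#lang racket

;; The argument is substituted as one node of the syntax tree,
;; not as raw text, so precedence surprises cannot happen.
(define-syntax-rule (square e) (* e e))

(square (+ 1 2))  ; => 9, never 5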
Macros have special syntax in Julia, and since they are expanded after parse time, the parser also needs an unambiguous way to recognise them (without knowing which macros have been defined in the current scope).
ASCII characters are a precious resource in the design of most programming languages, Julia very much included. I would guess that the choice of @ mostly comes down to the fact that it was not needed for something more important, and that it stands out pretty well.
Symbols always need to be interpreted within the context in which they are used. Having multiple meanings for symbols across contexts is not new and will probably never go away. For example, no one should expect #include in a C program to go viral on Twitter.
Julia's documentation entry "Hold up: why macros?" explains pretty well some of the things you might keep in mind while writing and/or using macros.
Here are a few snippets:
Macros are necessary because they execute when code is parsed, therefore, macros allow the programmer to generate and include fragments of customized code before the full program is run.
...
It is important to emphasize that macros receive their arguments as expressions, literals, or symbols.
So, if a macro is called with an expression, it gets the whole expression, not just the result.
...
In place of the written syntax, the macro call is expanded at parse time to its returned result.
It actually fits quite nicely with the semantics of the @ symbol on its own.
If we look up the Wikipedia entry for 'At symbol' we find that it is often used as a replacement for the preposition 'at' (yes it even reads 'at'). And the preposition 'at' is used to express a spatial or temporal relation.
Because of that we can use the @-symbol as an abbreviation for the preposition at to refer to a spatial relation, i.e. a location like @tony's bar, @france, etc., to some memory location @0x50FA2C (e.g. for pointers/addresses), or to the receiver of a message (@user0851, which Twitter and other forums use), but also for a temporal relation, i.e. @05:00 am, @midnight, @compile_time or @parse_time.
And macros are processed at parse time (here you have it), which is totally distinct from the other code that is evaluated at run time (yes, there are many different phases in between, but that's not the point here).
So, in addition, we use @ to explicitly direct the programmer's attention to the fact that the following code fragment is processed at parse time, as opposed to run time.
For me this explanation fits nicely in the language.
thanks@all ;)

What is the convention for considering a function with side-effects in Racket/Scheme?

Obviously, it is a convention in Racket/Scheme to append an exclamation point to function names that perform mutation. For example, in Racket: set!, set-box!, vector-set!, etc. Certain functions have side-effects, like print, but since those side-effects are "harmless," I understand why they don't usually come with exclamation marks attached.
However, this convention is arbitrarily violated. For example, async-channel-get and async-channel-put clearly perform mutation, but they don't have the "mutation marker" appended to their names. This can be somewhat justified by pointing out that these are channels, clearly mutation-based, so the "!" would be superfluous.
This is not a valid justification for everything, though. Racket's WebSockets library provides ws-send! and ws-close! functions, both with the obvious markers, but ws-recv does not! Is this just an isolated violation of the convention, or is there some rule?
I ask this mostly to be sure of how I should name functions in my own code. When should I use the exclamation mark, when should I not? I recognize that it's just a convention, not a rule, and it will likely be somewhat inconsistent, but I still would like to know what the best practices are.
I don't think #!racket has its own naming convention, but on this R5RS page you can read:
The names of procedures and syntactic forms that cause side effects end with an exclamation point (!). These include set! and vector-set!. Procedures that perform input or output technically cause side effects, but their names are exceptions to this rule.
In the Scheme wiki's variable naming conventions it says procedure! is for "significant side effects". IMO that means the side effect is the hero of the procedure, as in set-car! and set!, while read returns a value, which is perhaps the main feature of read.
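So for your own code, a reasonable rule of thumb (a sketch; the counter names are made up): use the ! when the side effect is the point of the procedure, and omit it for procedures that merely read mutable state and return a value (compare read and ws-recv):

#lang racket

;; Mutation is the whole point here, so the name gets a !
(define (counter-reset! b) (set-box! b 0))

;; A read that returns a value goes unmarked, even though
;; it touches mutable state.
(define (counter-peek b) (unbox b))

(define c (box 41))
(counter-reset! c)
(counter-peek c)  ; => 0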

Using strings instead of symbols: good or evil?

Often enough, I find myself dealing with lists of function options (or more general replacement lists) of the form {foo->value,...}. This leads to bugs when foo already has a value in $Context. One obvious way to prevent this is to use a string "foo" instead of the symbol: {"foo"->value,...}. This works, but seems to draw the ire of some seasoned LISPers I know, who chastise me for conflating symbols and strings and tell me to use built-in quoting constructs.
While it is certainly possible to write code that avoids collisions without using strings, it often seems more trouble than it is worth. On the other hand, I haven't seen too many examples of {"string"->value} type replacement rules. So the question to you is -- is this an acceptable usage pattern?.. Are there cases where it is particularly appropriate?.. Where should it be avoided?..
In my opinion (disclaimer: it is only my opinion), it is best to avoid using strings as option names, at least for the "main" options of your function. Strings, OTOH, are totally fine as settings (the r.h.s. of options). This is not to say that you cannot use strings at all; as you noted, they work. Perhaps they are more appropriate for sub-options, and they are used in this way by many system functions (usually "superfunctions" like NDSolve, which may have sub-options within options).
The main problem I see with strings is that they reduce the introspection capabilities, both for the system and for the user. In other words, it is harder to discover an option with a string name than one with a symbol name: for the latter, I can just inspect the names of the symbols in a package, and symbolic option names also have usage messages. You may also want to automate some things, such as writing a utility that finds all option names in a package; that is easier to do when option names are symbols, since they all belong to the same context. It is also easy to discover automatically, with a utility function, that some options lack usage messages.
Finally, you may have better protection against accidental collisions of similar option names. It may be that many option sequences are passed to your function, and occasionally they may contain options with the same short name. If the option names are symbols, their full symbol names will still differ; you then get both a shadowing warning and protection, since only the correct (full) option name will be used. For strings you get no warning, and you may end up using an incorrect option setting if a duplicate string option name with a wrong setting (intended for a different function, say) happens to come first in the list. This scenario is more likely in larger projects, and bugs like this are probably very hard to catch (this is a guess; I have never been in such a situation).
As for possible collisions: if you follow some naming convention, such as option names always starting with a capital letter, put most of your code in packages, and do not start your variable or function names (for functions in the interactive session) with a capital letter, then you will greatly reduce the chance of such collisions. Additionally, you should Protect option names when you define them, or at the end of the package. Then collisions will be detected as cases of shadowing. Avoiding shadowing, OTOH, is a general necessity, so in this respect the case of options is no more special than that of function names etc.

What are some example use cases for symbol literals in Scala?

The use of symbol literals is not immediately clear from what I've read up on Scala. Would anyone care to share some real world uses?
Is there a particular Java idiom being covered by symbol literals? What languages have similar constructs? I'm coming from a Python background and not sure there's anything analogous in that language.
What would motivate me to use 'HelloWorld vs "HelloWorld"?
Thanks
In Java terms, symbols are interned strings. This means, for example, that reference equality comparison (eq in Scala and == in Java) gives the same result as normal equality comparison (== in Scala and equals in Java): 'abcd eq 'abcd will return true, while "abcd" eq "abcd" might not, depending on the JVM's whims (well, it should for literals, but not for strings created dynamically in general).
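The same distinction can be sketched in Scheme/Racket terms, where symbols are likewise interned and freshly built strings are not (eq? is the pointer comparison, equal? the content comparison):

#lang racket

;; Symbols are interned: the same name always yields the same object.
(eq? (string->symbol "abcd") (string->symbol "abcd"))  ; => #t

;; Fresh strings with equal contents are distinct objects...
(eq? (string #\a #\b) (string #\a #\b))     ; => #f
;; ...so content equality needs equal? (Java's equals, Scala's ==).
(equal? (string #\a #\b) (string #\a #\b))  ; => #t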
Other languages which use symbols are Lisp (which uses 'abcd like Scala), Ruby (:abcd), Erlang and Prolog (abcd; they are called atoms instead of symbols).
I would use a symbol when I don't care about the structure of a string and use it purely as a name for something. For example, if I have a database table representing CDs, which includes a column named "price", I don't care that the second character in "price" is "r", or about concatenating column names; so a database library in Scala could reasonably use symbols for table and column names.
If you have plain strings representing, say, method names in code, that perhaps get passed around, you're not quite conveying things appropriately. This is sort of the Data/Code boundary issue; it's not always easy to draw the line, but if we were to say that in that example those method names are more code than they are data, then we want something to clearly identify that.
A Symbol Literal comes into play where it clearly differentiates just any old string data with a construct being used in the code. It's just really there where you want to indicate, this isn't just some string data, but in fact in some way part of the code. The idea being things like your IDE would highlight it differently, and given the tooling, you could refactor on those, rather than doing text search/replace.
This link discusses it fairly well.
Note: Symbols will be deprecated and then removed in Scala 3 (dotty).
Reference: http://dotty.epfl.ch/docs/reference/dropped-features/symlits.html
Because of this, I personally recommend not using Symbols anymore (at least in new scala code). As the dotty documentation states:
Symbol literals are no longer supported
it is recommended to use a plain string literal [...] instead
Python maintains an internal global table of "interned strings" with the names of all variables, functions, modules, etc. With this table, the interpreter can make faster searches and optimizations. You can force this process with the intern function (sys.intern in Python 3).
Also, Java and Scala automatically use "interned strings" for faster searches. With Scala, you can use the intern method to force the interning of a string, but this process doesn't work with all strings. Symbols benefit from being guaranteed to be interned, so a single reference-equality check suffices to establish equality or inequality.

Is a symbol table in Ruby any different from a symbol table in other languages

The wikipedia entry on Symbol tables is a good reference:
http://en.wikipedia.org/wiki/Symbol_table
But as I try to understand symbols in Ruby and how they are represented in the array of symbols (returned by the Symbol.all_symbols method), I'm wondering whether Ruby's approach to the symbol table has any important differences from other languages.
Ruby doesn't really have a "symbol table" in that sense. It has bindings, and symbols (what lispers call atoms) but it isn't really doing it the way that article describes.
So in answer to your question: it isn't so much that ruby has the same thing done differently, but rather that it does two different things (:xxx notation --> unique ids and bindings in scopes) and uses similar / overlapping terminology for them.
To clarify:
The article you link to gives the conventional definition of a symbol table, to wit
where each identifier in a program's source code is associated with information relating to its declaration or appearance in the source, such as its type, scope level and sometimes its location
But this isn't what ruby's symbol table does. It just provides a globally unique identity for a certain class of objects which can be written as :something in the source code, including things like :+ and :"Hi bob!" which aren't identifiers. Also, merely using an identifier will not create a corresponding symbol. And finally, none of the information listed in the passage above is stored in ruby's list of symbols.
It's a coincidence of naming, and reading that article will not help you understand ruby's symbols.
The biggest difference is that (like Lisp) Ruby actually has a syntax for symbols, and it's easy to add/remove things at runtime yourself. If you say :balloon (or "balloon".intern) it will intern that for you. Even though you're referring to it by name in your source, internally it's just a pointer in the symbol table. If you compare symbols, it's just a pointer-compare, not a string-compare.
Languages like C don't really have a way to say simply "create a new symbol for me" at runtime. You can do it implicitly at compile-time by defining a function, but that's really its only use. Since C has no syntax for symbols, if you want to be able to say Balloon in your program but be able to compare it with a single machine instruction, you use enums (or #defines).
In Ruby, it takes only one character to make a symbol, so you can use it for all kinds of things (like hash keys).
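A Racket sketch of the same hash-key idiom (the CD fields are invented for illustration):

#lang racket

;; Symbols as lightweight keys: lookup compares interned pointers,
;; not character data.
(define cd (hash 'title "Kind of Blue" 'price 9.99))
(hash-ref cd 'price)  ; => 9.99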
Symbols in Ruby are used where other languages tend to use enums, defines, constants and the like. They're also often used for associative keys. Their use has little to do with a symbol table as discussed in that article, except that they obviously exist in one.

Resources