Understanding the syntax of LISP

Although LISP has some of the most simple syntax I've seen, I am still confused about the fundamentals. I've done research, and I've come to the conclusion that there are two datatypes: "atoms" and lists. However, I have also come across the term "S-expression", which seems to describe both atoms and lists. So, what exactly is an S-expression? Is it a datatype? In addition, I am not sure how to distinguish function calls from data lists in LISP. For example, (1 2 3) is a list, while (f 2 3) could be some function. But how am I supposed to know whether f is a function name or some datatype? Since lists and functions use the same syntax, I am not sure how to differentiate between the two. Finally, most importantly, I need a mental model for thinking about how LISP works. For example, what are the fundamental datatypes? What are the built-in procedures used to do things with the fundamental datatypes? How can we see data and procedures as distinct? For example, in Java, instance variables at the top of classes are used to represent data, while methods are the procedures that manipulate the data. What does this look like in LISP?
(I'm new, so I'm not sure if this question is too broad or not)

I second the recommendations for LispBook and Practical Common Lisp. Both great books. Once you have understood the basics, I really, really recommend Paul Graham's "On Lisp".
To guide you toward answers on your specific questions:
Data types: Lisp has a rich set of data types (numeric types such as integer, rational, float - character and string - arrays and hash-tables - plenty more), but in my opinion, assuming you are already comfortable with integer and string, you should start by reading up on symbol and cons.
symbol: Realise that a symbol is both an identifier and a value. Understand that each symbol can "point to" one data value and, at the same time, "point to" one function (and has three other properties that you don't need to worry about yet)
cons: Discover that this magical thing called a 'cons cell' is just a struct with two pointers. One called "the Address part of the Register" and the other called "the Decrement part of the register". Don't worry about what those names mean (they're no longer relevant), just know that the car function returns the Contents of the Address part of the Register, and cdr returns the Decrement part. Now forget all of that and just remember that modern Lispers think of the cons cell as a struct with two pointers, one called the car and the other called the cdr.
Lists: "There is no spoon." There is no list either. Realise that what we think of as a list is nothing more than a set of cons cells, with the car of each cons pointing to one member of the list, and the cdr of each cons pointing to the next cons (except for the last cons, whose cdr is "the null pointer" nil). This is vital to understand list manipulation, nested lists, tree structures, etc.
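To make the "conses all the way down" idea concrete, here is a minimal sketch in Python rather than Lisp. The names Cons, NIL, lisp_list and to_python are made up for illustration and are not part of any Lisp; the point is only that a "list" is a chain of two-pointer cells ending in nil.

```python
from typing import Any, NamedTuple, Optional


class Cons(NamedTuple):
    """A cons cell: a struct with two pointers, car and cdr."""
    car: Any
    cdr: Any


NIL = None  # stand-in for Lisp's nil, the end-of-list marker


def lisp_list(*items: Any) -> Optional[Cons]:
    """Build a chain of cons cells, roughly what (list 1 2 3) does."""
    result = NIL
    for item in reversed(items):
        result = Cons(item, result)
    return result


def to_python(cell: Optional[Cons]) -> list:
    """Follow the cdr pointers and collect the car of each cell."""
    out = []
    while cell is not NIL:
        out.append(cell.car)
        cell = cell.cdr
    return out


numbers = lisp_list(1, 2, 3)
# The same structure built by hand: (cons 1 (cons 2 (cons 3 nil)))
by_hand = Cons(1, Cons(2, Cons(3, NIL)))
```

Here numbers.car plays the role of (car numbers) and numbers.cdr is the rest of the chain, so a nested list is just a cell whose car points at another chain.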
Atoms: For now, think of an atom as a number, a character, a string or a symbol. That's enough to get you started; you can dig deeper into the details later.
S-expressions: An s-expr is defined as "either an atom, or a cons cell pointing to two s-expressions". Don't worry about what is or isn't an s-expr for now, but check out Wikipedia for a brief intro.
Lists vs function calls: Man, oh, man! I struggled with this when I got started! But it's actually quite easy with a bit of practice: "Five times two" is just an English phrase, right? But if I asked you to "evaluate" it, you would probably say "ten". "1 + 2" is a mathematical expression, but a mathematician would evaluate it to 3. "5" is just a number, but if you type it into a calculator and press "=", the calculator will evaluate it to the answer 5. In Lisp (+ 1 2) is just a list. It is a list containing the symbol +, the number 1 and the number 2. BUT - if you ask Lisp to evaluate that list, it will call the function +, passing it 1 and 2 as parameters. A large part of getting comfortable with Lisp is learning when and where s-expressions are evaluated, and where they aren't. For now - anything you type into the REPL will be evaluated, parameters to function calls will be evaluated, parameters to macro calls will probably not be evaluated, you can use quote to prevent evaluation. This will become easier with practice.
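The "it's just a list until you evaluate it" point can be sketched with a toy evaluator. This is Python, not Lisp, and it is an assumption-laden simplification: Symbol, FUNCTIONS and evaluate are hypothetical names, only + and * exist, and variables and macros are left out entirely.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """A toy symbol: just a name."""
    name: str


# A toy global table: symbol name -> function (illustrative only).
FUNCTIONS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a toy s-expression made of numbers, strings, Symbols and lists."""
    if isinstance(expr, (int, float, str)):
        return expr  # self-evaluating atoms
    if isinstance(expr, list):
        head, *args = expr
        if head == Symbol("quote"):
            return args[0]  # quote: hand the data back unevaluated
        function = FUNCTIONS[head.name]
        # Arguments to a function call are evaluated before the call.
        return function(*[evaluate(arg) for arg in args])
    raise TypeError(f"cannot evaluate {expr!r}")


form = [Symbol("+"), 1, [Symbol("*"), 2, 3]]  # the list (+ 1 (* 2 3))
quoted = [Symbol("quote"), form]              # '(+ 1 (* 2 3))
```

The same list is data when it sits in form, a function call when handed to evaluate(form), and data again when wrapped in quote.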
Essential functions: Although I said the "list" doesn't really exist ("It's conses all the way down, Mr. Fry"), learn the list-based stuff first: car, cdr, cons, list, append, reverse, and the applicative stuff: mapcar, mapcan, apply. These are a great way to start thinking in both lists and functional programming.
Classes, member data and methods: I recommend leaving object orientation for later. First learn the basics of Lisp. Fight the urge to worry about data encapsulation, access control and polymorphism until you have become friends with the language itself. When you are ready, read "On Lisp"; Chapter 25 will walk you through adding object orientation to Lisp yourself, showing how Lisp really is the programmable programming language. The book will then introduce you to CLOS, the standard object-orientation system built into Common Lisp. Get to know CLOS, but certainly browse around for other OO libraries. Lisp is the only language I know where you can actually choose how you want the language to implement OO.
I'm going to stop here. Get comfortable with the first 8 concepts above and a strong mental model will have sorted itself out.

Everything that is self evaluating is data. eg 2, "hello".
Everything that is quoted is data. eg '(f 3 4), 'test, or the longhand version (quote test)
Everything else that is fed to the REPL needs to be code. eg (f 3 4) is code. It's an s-expression, indistinguishable from the quoted data above, but it isn't quoted so it has to be code.
There are some special forms like if, let, lambda, defun, ... and you just need to learn how they work. How do you know that if isn't a method in a C dialect like Java or C#? You just have to know them by heart.
You also need to know some essential functions. Usually they are introduced in each and every tutorial together with the special forms. I recommend you read Land of Lisp and Practical Common Lisp for Common Lisp, and Realm of Racket and How to Design Programs for Racket. For pure Scheme I recommend the wizard book (SICP). Don't do them all at the same time. Learning the other ones is easy once you have learned one well enough.
Now, learning a Lisp is more difficult than learning another C dialect. It's because it is a totally different language with different ways to do stuff. Eg. you won't find for or while loops and variables don't have types but the objects they refer to have. You need to learn how to program almost as if you didn't know a C dialect. (Actually, knowing a C dialect might make the learning harder)
Good luck!

An "S-expression" is 'an atom, or a list of S-expressions' (there are some more complications here, I'll skip those for now, basically boiling down to something called "read macros", where a difference between the textual representation and the internal representation can be done, to simplify human writing).
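As a rough illustration of "an atom, or a list of S-expressions", here is a tiny reader sketched in Python. It is a simplification under stated assumptions: tokenize, read and parse are hypothetical names, symbols are represented as plain Python strings, only integers are recognised as numbers, and read macros, strings and error handling for unbalanced parentheses are all ignored.

```python
def tokenize(text):
    """Split source text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read(tokens):
    """Consume tokens and return one s-expression (an atom or a nested list)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))  # a list is a sequence of s-expressions
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return int(token)  # numeric atom
    except ValueError:
        return token  # anything else is treated as a symbol


def parse(text):
    return read(tokenize(text))
```

With this, parse("(1 2 3)") and parse("(f 1 2 3)") both come back as ordinary nested lists; nothing in the reader decides whether f is a function. That decision only happens later, if and when the list is evaluated.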
(1 2 3) is a list. But, (f 1 2 3) is also a list. However, in some (most?) circumstances, it can also be a function call (rather than a function).
I think you mostly have the syntax down, the rest is semantics (in a very technical sense). At this point, I would point you at some good reading material; there are plenty of good books around (I started with an earlier edition of Winston-Horn's "Lisp").

Related

Is it standard for Scheme interpreters to not allow if-statements without alternative clauses?

I'm new to Scheme and I have a background coding mainly in C++/Java, and a bit of Python. I'm doing exercises in Structure and Interpretation of Computer Programs, and I've come across this problem:
The book details the structure of the "if" special form as (if (conditional) (consequent clause) (alternative clause)). However, nothing indicates that an alternative clause MUST be included.
In fact, in the exercise I'm currently stuck on (exercise 1.22, for anyone interested), they provide some code that we are supposed to use in creating a procedure that tests for prime numbers within a given range and gives the amount of time taken to find them.
(define (start-prime-test n start-time)
  (if (is-prime n)
      (report-prime (- (runtime) start-time))))
This did not work, so I modified it slightly:
(define (start-prime-test n start-time)
  (if (is-prime n)
      (report-prime (- (runtime) start-time))
      (display ""))) ; prints nothing
The first version results in "SchemeError: too few operands..." I modified it to have an alternative clause that essentially does nothing, and I'm no longer getting the error when testing the procedure.
I just want to know whether requiring an alternative clause is standard for most Scheme interpreters, or if it's unique to the one I'm using. I AM currently using two different interpreters, because the first one I used does not include the built-in procedures detailed in the book, so I have noticed there are some major differences in Scheme interpreters. But that's as far as I know, and it's been very hard finding useful information through googling.
Any help would be greatly appreciated; I don't like including "do-nothing" procedures.
In most Scheme interpreters, if allows expressions without the "else" part, and that's what the standard says, as pointed out by @codybartfast in his answer.
I'm only aware of Racket enforcing the rule that if must always have both the consequent and alternative parts, and it's for a very good reason: it'll help you catch the kind of mistakes that happen when you forget to write the "else" part.
Although it's valid to have if expressions without the alternative part, that only happens when we are writing procedural code (like displaying a result in your example), and that's the kind of programming style we want to avoid when using Scheme (as we favor functional programming).
Having said that, if you're absolutely certain that you want to write procedural code, then you should use when, which doesn't have an else part and unlike if, it can have several expressions inside because it has an implicit begin. This will work:
(define (start-prime-test n start-time)
  (when (is-prime n)
    (report-prime (- (runtime) start-time))))
I understand R5RS/R6RS are effectively the language standards. They say that the alternate clause is optional:
(if <test> <consequent> <alternate>)    syntax
(if <test> <consequent>)    syntax
Syntax: <Test>, <consequent>, and <alternate> must be expressions.
Semantics: An if expression is evaluated as follows: first, <test> is
evaluated. If it yields a true value (see section 5.7), then <consequent> is
evaluated and its values are returned. Otherwise <alternate> is evaluated
and its values are returned. If <test> yields #f and no <alternate> is
specified, then the result of the expression is unspecified.
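The quoted semantics can be modelled in a few lines. This is a sketch in Python, not how any Scheme implements it: scheme_if and UNSPECIFIED are made-up names, and the three parts are passed as zero-argument functions so that only the chosen branch actually runs.

```python
UNSPECIFIED = object()  # stand-in for Scheme's "unspecified" result


def scheme_if(test, consequent, alternate=None):
    """Model (if <test> <consequent> [<alternate>]) with thunks."""
    # In Scheme only #f counts as false; 0 and "" are true values.
    if test() is not False:
        return consequent()
    if alternate is None:
        return UNSPECIFIED  # one-armed if whose test yielded #f
    return alternate()


one_armed = scheme_if(lambda: False, lambda: "prime")
two_armed = scheme_if(lambda: False, lambda: "prime", lambda: "composite")
```

An implementation that rejects the one-armed form is in effect making alternate a required argument, which is the behaviour described for Racket's default language.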
But in 2009 (quite recently for Scheme), the language steering committee said:
Alas: Scheme has the unhappy distinction of being the world's most unportable programming language. It is almost misleading to call Scheme a "programming language;" it would be more accurate to characterise Scheme as a family of dialects, all loosely related by the common features of lexical scope, dynamic typing, list structure, higher-order functions, proper tail-recursion, garbage collection, macros, and (some form of) s-expression based lexical syntax.
So although a formal standard may exist, there seems to be little expectation that any given implementation will adhere to that standard. E.g. by default Racket requires the alternate clause. (Although Racket can also support an R6RS-compliant dialect.)
Personally I use Racket with the SICP language pack to be consistent with the book.
The one-armed if has indeed been standard in Scheme for a while.
Unfortunately the one-armed if allows one to make a mistake - forgetting the second arm is very easy to do. Some implementations added when and unless to reduce the number of mistakes.
See also: Why is one-armed "if" missing from Racket?

Is the scope of primitive functions in The Little Schemer incorrect?

Consider the following s-expression:
((lambda (car) (car (quote (a b c)))) cdr)
In most scheme implementations I've tried, this evaluates to (b c) because cdr is passed to the lambda, which names it car, taking precedence over the primitive implementation of car.
The Little Schemer provides an implementation of scheme written in scheme in chapter 10. That implementation returns a for the above expression, which seems incorrect to me.
It's clear why that implementation behaves that way: the names of primitive functions are treated as *const rather than *identifier here. A *const which is not a number or boolean is rendered as a primitive and this is eventually hardwired to the actual primitives.
I believe that the correct implementation would be to have no special detection of primitive names, but rather to create an initial table in the value function that contains an entry mapping the primitive names to the actual primitive implementations.
My question is: is this a bug in The Little Schemer's implementation of scheme? Is this behaviour well specified in scheme, or was it maybe not well specified in 1974 when the book was written?
Is it a bug?
Whether it is a bug depends on whether the interpreter is supposed to follow Scheme's scoping rules. The year you mention, 1974, is the year before the first Scheme report was published. Many of the ideas were probably being written about at the time, but what was to become Scheme was then a collection of small interpreters, probably shared between research students, with various subtle differences.
The Little Schemer was originally called The Little LISPer and targeted a Lisp that later became Common Lisp. When it was rewritten for Scheme, the authors tried to keep most of the earlier code, so it is likely that the interpreter differs from Scheme in some of its features.
What the RNRS Scheme reports say
If the interpreter is to conform to the scheme reports it must allow for bindings to shadow the top level bindings. Here is a quote from the R5RS report about Variables, syntactic keywords, and regions
Every mention of an identifier refers to the binding of the identifier
that established the innermost of the regions containing the use. If
there is no binding of the identifier whose region contains the use,
then the use refers to the binding for the variable in the top level
environment
For later reports like the same section in R6RS top level is replaced with "definition or import at the top of the enclosing library or top-level program".
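The approach suggested in the question (no special detection of primitive names, just an initial table of bindings) can be sketched as an environment chain. This is Python, not the book's code, and everything here is hypothetical: Env, TOP_LEVEL and evaluate are illustrative names, symbols are plain strings, and quote and lambda are matched by keyword purely for brevity.

```python
class Env:
    """A frame of bindings plus a link to the enclosing region."""

    def __init__(self, bindings, parent=None):
        self.bindings = bindings
        self.parent = parent

    def lookup(self, name):
        env = self
        while env is not None:  # innermost binding wins
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)


# Primitives are ordinary top-level bindings, not hardwired constants.
TOP_LEVEL = Env({"car": lambda lst: lst[0], "cdr": lambda lst: lst[1:]})


def evaluate(expr, env):
    if isinstance(expr, str):
        return env.lookup(expr)  # every identifier goes through the chain
    head = expr[0]
    if head == "quote":
        return expr[1]
    if head == "lambda":
        _, params, body = expr
        return lambda *args: evaluate(body, Env(dict(zip(params, args)), env))
    function = evaluate(head, env)
    return function(*[evaluate(arg, env) for arg in expr[1:]])


# ((lambda (car) (car (quote (a b c)))) cdr)
program = [["lambda", ["car"], ["car", ["quote", ["a", "b", "c"]]]], "cdr"]
result = evaluate(program, TOP_LEVEL)
```

Because the parameter named car shadows the top-level binding, result is the list (b c) rather than a, which is the behaviour the R5RS passage describes.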

How do Racket Scheme's "design by contract" features differ from Eiffel's?

I know that both Eiffel (the progenitor) and Racket implement "Design by Contract" features. Sadly, I am not sure how one differs from the other. Eiffel's DBC is reliant on the OOP paradigm and inheritance, but how would Racket, a very different language, account for such a disparity?
Racket's main claim to contract fame is the notion of blame, and dealing with higher-order functions is a big part of that for everyday Racket programming, definitely.
You might also want to check out the first two sections of this paper:
http://www.ccs.neu.edu/scheme/pubs/oopsla01-ff.pdf
First of all, your best source of information at this point is the Racket Guide, which is intended as an introductory text rather than a reference manual. Specifically, there is an extensive chapter about contracts that would help. EDIT: Also see the paper that Robby pointed at, he's the main Racket contract guy.
As for your question -- I don't know much about the Eiffel contract system, but I think that it precedes Racket's system. However (and this is again an "IIRC") I think that Racket's contract system was the first one that introduced higher order contracts. Specifically, when you deal with higher order functions assigning proper blame gets a little more complicated -- for example, if you take a foo function that has a contract of X? -> Y? and you send it a value that doesn't match X? then the client code that sent this value to foo is blamed. But if your function is (X? -> Y?) -> Z? and the X? predicate is not satisfied, then the blame goes to foo itself, not to the client (and if Y? is not satisfied then the blame is still with the client).
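The blame rule described above can be sketched in a few lines. To be clear, this is not Racket's contract library; it is a toy model in Python for one-argument functions, and flat, arrow, ContractViolation and blamed_party are all made-up names. The key trick is that blame labels swap each time a value crosses into an argument position.

```python
class ContractViolation(Exception):
    def __init__(self, blamed, message):
        super().__init__(f"{blamed} broke the contract: {message}")
        self.blamed = blamed


def flat(predicate):
    """First-order contract: blame `positive` if the value fails the check."""
    def attach(value, positive, negative):
        if not predicate(value):
            raise ContractViolation(positive, f"{value!r} rejected")
        return value
    return attach


def arrow(domain, codomain):
    """Function contract domain -> codomain for one-argument functions."""
    def attach(function, positive, negative):
        def wrapped(argument):
            # Arguments come from the other party, so the labels are swapped.
            checked = domain(argument, negative, positive)
            return codomain(function(checked), positive, negative)
        return wrapped
    return attach


def blamed_party(thunk):
    """Run thunk and report who was blamed, or None if nothing failed."""
    try:
        thunk()
    except ContractViolation as violation:
        return violation.blamed
    return None


def is_int(value):
    return isinstance(value, int)


def is_str(value):
    return isinstance(value, str)


def describe(n):  # contract: int -> str
    return f"number {n}"


def apply_badly(callback):  # contract: (int -> str) -> int
    return len(callback("not an int"))  # breaks the callback's domain


def apply_well(callback):  # contract: (int -> str) -> int
    return len(callback(5))


first_order = arrow(flat(is_int), flat(is_str))(describe, "describe", "client")
higher_order = arrow(arrow(flat(is_int), flat(is_str)), flat(is_int))
bad = higher_order(apply_badly, "apply_badly", "client")
good = higher_order(apply_well, "apply_well", "client")
```

Passing a string to first_order blames the client. Calling bad(str) blames apply_badly, because it fed the callback a non-integer. Calling good(lambda n: n) blames the client, because the callback it supplied returned something that is not a string.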
I think you're asking, how could a contract system work without OOP and inheritance? As a user of Racket who is unfamiliar with Eiffel, I'm wondering why a contract system would have anything to do with OOP and inheritance. :)
On a practical level I think of Racket contracts as a way to get some of the benefits of static type declarations, while keeping the flexibility of dynamically typed languages. Plus contracts go beyond just types, and can fill the role of asserts.
For instance I can say a function requires one argument that is an exact integer ... but also say that it should be an exact positive integer, or a union of certain specific values, or in fact any arbitrarily complicated test of the passed value. In this way, contracts in Racket combine what you might do with both (a) type declarations and (b) assertions in say C/C++.
One gotcha with contracts in Racket is that they can be slow. One way to deal with this is to use them at first while developing, then remove them selectively, especially from "inner-loop" types of functions. Another approach I've tried is to turn them on/off wholesale: make a pair of modules like contracts-on.rkt and contracts-off.rkt, where the latter provides some do-nothing macros. Have your modules require a contracts.rkt, which provides all from either of the -on or -off files. This is like compiling in DEBUG vs RELEASE mode.
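The on/off idea is not specific to Racket. As a loose analogue in Python (an illustration only, not the Racket modules described above), a checking decorator can hand back the original function untouched when checks are disabled. The requires decorator, the CONTRACTS environment variable and the halve function are all hypothetical.

```python
import functools
import os

# Hypothetical switch: set CONTRACTS=off in the environment to disable checks.
CONTRACTS_ENABLED = os.environ.get("CONTRACTS", "on") != "off"


def requires(predicate, enabled=None):
    """Check arguments with `predicate` unless contracts are switched off."""
    active = CONTRACTS_ENABLED if enabled is None else enabled

    def decorate(function):
        if not active:
            return function  # "release mode": no wrapper, no overhead

        @functools.wraps(function)
        def checked(*args):
            if not predicate(*args):
                raise ValueError(f"{function.__name__}: bad arguments {args!r}")
            return function(*args)

        return checked

    return decorate


def halve(n):
    return n / 2


halve_checked = requires(lambda n: n > 0, enabled=True)(halve)
halve_unchecked = requires(lambda n: n > 0, enabled=False)(halve)
```

With checks off, halve_unchecked is literally the same object as halve, so inner loops pay nothing for the contract.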
If you're coming from Eiffel maybe my C/C++ slant on Racket contracts won't be helpful, but I wanted to share it anyway.

Are there good alternative Scheme syntaxes?

I imagine Scheme (and perhaps Lisp) could be made more `user friendly' by using a different syntax. For example, instead of nested S-expressions with ugly parentheses, one could devise some kind of syntax closer to some of the more widely used languages (e.g. Java-like without needing to define classes).
It's not necessarily a bad thing if it's more verbose. For example, the syntax may require line separators and commas in the places where many people will expect them, and expect explicit return statements. Also, it doesn't seem that difficult to allow some operators to be used infix style (just obey the generally accepted operator preference rules).
And if it doesn't make things too messy, the syntax could even be backwards-compatible, so that in any place where an expression is expected, a normal S-expression between parentheses can be used.
What are your opinions and ideas about this? And does anything like this exist? (I expect it does, but "Scheme" is a worthless google term, I can't find anything!)
Originally, Lisp was planned to use a syntax called M-Expressions, with S-Expressions being only a transitional solution for easier compiler building. When M-Expressions were ready to be introduced, the programmers who had already taken on Lisp just stayed with what they had become accustomed to, and M-Expressions never caught on.
There is an infix notation in Guile, but it's rarely used. A good Lisp programmer doesn't even see the parens anymore, and prefix notation does have its merits...
I think "sweet expressions" might be one of the more thoughtful approaches to getting rid of the parentheses in Lisp. It apparently even supports macros.
http://www.dwheeler.com/readable/sweet-expressions.html
However, I think most people eventually get over the parentheses or use another language.
Take a look at "sweet-expressions", which provides a set of additional abbreviations for traditional s-expressions. They add syntactically-relevant indentation, a way to do infix, and traditional function calls like f(x). Unlike nearly all past efforts to make Lisps readable, sweet-expressions are backwards-compatible (you can freely mix well-formatted s-expressions and sweet-expressions), generic, and homoiconic.
Sweet-expressions were developed on http://readable.sourceforge.net and there is a sample implementation.
For Scheme there is a SRFI for sweet-expressions: http://srfi.schemers.org/srfi-110/
Try SRFI 49 for size. :-P
(Seriously, though, as Rafe commented, "I don't think anybody wants this".)
Some people consider Python to be a kind of Scheme with infix notation for operators, algebraic notation for functions and which uses a more "java-like" syntax for representing the language. I don't agree with that assessment, but I can see where the idea comes from.
The big problem with changing the notation for Scheme is that macros become very hard to write (to see how hard, take a look at the Nimrod language or Boo). Instead of working directly with the code as lists, you have to parse the input language first. This usually involves constructing an AST (abstract syntax tree) for the language from the input. When working directly with Scheme, this is unnecessary.
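To see why "code as lists" makes macros cheap, here is a sketch of a macro expander written as a plain list transformation. It is Python standing in for Scheme, with the code represented as nested Python lists and symbols as strings; expand_when is a hypothetical helper, not part of any Scheme.

```python
def expand_when(form):
    """Rewrite (when test body...) into (if test (begin body...))."""
    _, test, *body = form
    return ["if", test, ["begin", *body]]


# (when (is-prime n) (report-prime n) (newline))
source = ["when", ["is-prime", "n"], ["report-prime", "n"], ["newline"]]
expanded = expand_when(source)
```

No parser and no AST library are involved: the input already is the tree. With an infix or brace-based surface syntax, the same rewrite would first need a parse step to get to this shape.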
However, you might check out the SIX expression syntax in Gambit Scheme. There's a nice set of slides here which contains a discussion of this:
http://www.iro.umontreal.ca/~gambit/Gambit-inside-out.pdf
But don't tell anyone about it! (The inside joke is that someone suggests writing a Lisp without parentheses and with infix notation about once a day, and someone announces an implementation about once a month.)
There are some languages that do exactly that. For instance: Dylan.

Pseudocode interpreter?

Like lots of you guys on SO, I often write in several languages. And when it comes to planning stuff, (or even answering some SO questions), I actually think and write in some unspecified hybrid language. Although I used to be taught to do this using flow diagrams or UML-like diagrams, in retrospect, I find "my" pseudocode language has components of C, Python, Java, bash, Matlab, perl, Basic. I seem to unconsciously select the idiom best suited to expressing the concept/algorithm.
Common idioms might include Java-like braces for scope, pythonic list comprehensions or indentation, C++like inheritance, C#-style lambdas, matlab-like slices and matrix operations.
I noticed that it's actually quite easy for people to recognise exactly what I'm trying to do, and quite easy for people to intelligently translate into other languages. Of course, that step involves considering the corner cases, and the moments where each language behaves idiosyncratically.
But in reality, most of these languages share a subset of keywords and library functions which generally behave identically - maths functions, type names, while/for/if etc. Clearly I'd have to exclude many 'odd' languages like lisp, APL derivatives, but...
So my questions are,
Does code already exist that recognises the programming language of a text file? (Surely this must be a less complicated task than eclipse's syntax trees or than google translate's language guessing feature, right?) In fact, does the SO syntax highlighter do anything like this?
Is it theoretically possible to create a single interpreter or compiler that recognises what language idiom you're using at any moment and (maybe "intelligently") executes or translates to a runnable form. And flags the corner cases where my syntax is ambiguous with regards to behaviour. Immediate difficulties I see include: knowing when to switch between indentation-dependent and brace-dependent modes, recognising funny operators (like *pointer vs *kwargs) and knowing when to use list vs array-like representations.
Is there any language or interpreter in existence, that can manage this kind of flexible interpreting?
Have I missed an obvious obstacle to this being possible?
edit
Thanks all for your answers and ideas. I am planning to write a constraint-based heuristic translator that could, potentially, "solve" code for the intended meaning and translate into real python code. It will notice keywords from many common languages, and will use syntactic clues to disambiguate the human's intentions - like spacing, brackets, optional helper words like let or then, context of how variables are previously used etc, plus knowledge of common conventions (like capital names, i for iteration, and some simplistic limited understanding of naming of variables/methods e.g containing the word get, asynchronous, count, last, previous, my etc). In real pseudocode, variable naming is as informative as the operations themselves!
Using these clues it will create assumptions as to the implementation of each operation (like 0/1 based indexing, when should exceptions be caught or ignored, what variables ought to be const/global/local, where to start and end execution, and what bits should be in separate threads, notice when numerical units match / need converting). Each assumption will have a given certainty - and the program will list the assumptions on each statement, as it coaxes what you write into something executable!
For each assumption, you can 'clarify' your code if you don't like the initial interpretation. The libraries issue is very interesting. My translator, like some IDE's, will read all definitions available from all modules, use some statistics about which classes/methods are used most frequently and in what contexts, and just guess! (adding a note to the program to say why it guessed as such...) I guess it should attempt to execute everything, and warn you about what it doesn't like. It should allow anything, but let you know what the several alternative interpretations are, if you're being ambiguous.
It will certainly be some time before it can manage such unusual examples as @Albin Sunnanbo's ImportantCustomer example. But I'll let you know how I get on!
I think that is quite useless for everything but toy examples and strict mathematical algorithms. For everything else the language is not just the language. There are lots of standard libraries and whole environments around the languages. I think I write almost as many lines of library calls as I write "actual code".
In C# you have .NET Framework, in C++ you have STL, in Java you have some Java libraries, etc.
The difference between those libraries are too big to be just syntactic nuances.
<subjective>
There have been attempts at unifying language constructs of different languages into a "unified syntax". That was called a 4GL language and never really took off.
</subjective>
As a side note, I have seen a code example about a page long that was valid as C#, Java and JavaScript code. That can serve as an example of where it is impossible to determine the actual language used.
Edit:
Besides, the whole purpose of pseudocode is that it does not need to compile in any way. The reason you write pseudocode is to create a "sketch", however sloppy you like.
foreach c in ImportantCustomers{== OrderValue >=$1M}
SendMailInviteToSpecialEvent(c)
Now tell me what language it is and write an interpreter for that.
1. To detect what programming language is used: Detecting programming language from a snippet
2. I think it should be possible. The approach in 1. could be leveraged to do this, I think. I would try to do it iteratively: detect the syntax used in the first line/clause of code, "compile" it to intermediate form based on that detection, along with any important syntax (e.g. begin/end wrappers). Then the next line/clause etc. Basically write a parser that attempts to recognize each "chunk". Ambiguity could be flagged by the same algorithm.
3. I doubt that this has been done ... seems like the cognitive load of learning to write e.g. python-compatible pseudocode would be much easier than trying to debug the cases where your interpreter fails.
4. a. I think the biggest problem is that most pseudocode is invalid in any language. For example, I might completely skip object initialization in a block of pseudocode because for a human reader it is almost always straightforward to infer. But for your case it might be completely invalid in the language syntax of choice, and it might be impossible to automatically determine e.g. the class of the object (it might not even exist). Etc.
   b. I think the best you can hope for is an interpreter that "works" (subject to 4a) for your pseudocode only, no-one else's.
Note that I don't think that 4a, 4b are necessarily obstacles to it being possible. I just think it won't be useful for any practical purpose.
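The line-by-line detection idea could start out as something like the following. This is a deliberately crude sketch: the HINTS table, its regular expressions and the guess_language function are invented for illustration, cover only three families of syntax, and would need far more work to be useful.

```python
import re

# Hypothetical per-language hints; each matching pattern scores one point.
HINTS = {
    "python": [r"^\s*def \w+\(.*\):", r"^\s*elif\b", r":\s*$", r"\bNone\b"],
    "c_like": [r";\s*$", r"[{}]", r"\b(int|void|char)\s+\w+", r"//"],
    "lisp": [r"^\s*\(", r"\)\)+\s*$", r"\((define|defun|lambda)\b"],
}


def guess_language(line):
    """Return the best-scoring language for one line, or None if unsure."""
    scores = {
        language: sum(bool(re.search(pattern, line)) for pattern in patterns)
        for language, patterns in HINTS.items()
    }
    best = max(scores.values())
    winners = [language for language, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None  # ambiguous or no evidence: flag it instead of guessing
    return winners[0]
```

Even this toy shows the problem: guess_language("def median(seq):") says python, but guess_language("return seq_sorted[len(seq) // 2]") says c_like, because // looks like a comment marker. Returning None for ties is the "flag the ambiguity" part.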
Recognizing what language a program is in is really not that big a deal. Recognizing the language of a snippet is more difficult, and recognizing snippets that aren't clearly delimited (what do you do if four lines are Python and the next one is C or Java?) is going to be really difficult.
Assuming you got the lines assigned to the right language, doing any sort of compilation would require specialized compilers for all languages that would cooperate. This is a tremendous job in itself.
Moreover, when you write pseudo-code you aren't worrying about the syntax. (If you are, you're doing it wrong.) You'll wind up with code that simply can't be compiled because it's incomplete or even contradictory.
And, assuming you overcame all these obstacles, how certain would you be that the pseudo-code was being interpreted the way you were thinking?
What you would have would be a new computer language, that you would have to write correct programs in. It would be a sprawling and ambiguous language, very difficult to work with properly. It would require great care in its use. It would be almost exactly what you don't want in pseudo-code. The value of pseudo-code is that you can quickly sketch out your algorithms, without worrying about the details. That would be completely lost.
If you want an easy-to-write language, learn one. Python is a good choice. Use pseudo-code for sketching out how processing is supposed to occur, not as a compilable language.
An interesting approach would be a "type-as-you-go" pseudocode interpreter. That is, you would set the language to be used up front, and then it would attempt to convert the pseudo code to real code, in real time, as you typed. An interactive facility could be used to clarify ambiguous stuff and allow corrections. Part of the mechanism could be a library of code which the converter tried to match. Over time, it could learn and adapt its translation based on the habits of a particular user.
People who program all the time will probably prefer to just use the language in most cases. However, I could see the above being a great boon to learners, "non-programmer programmers" such as scientists, and for use in brainstorming sessions with programmers of various languages and skill levels.
-Neil
Programs interpreting human input need to be given the option of saying "I don't know." The language PL/I is a famous example of a system designed to find a reasonable interpretation of anything resembling a computer program that could cause havoc when it guessed wrong: see http://horningtales.blogspot.com/2006/10/my-first-pli-program.html
Note that in the later language C++, when it resolves possible ambiguities it limits the scope of the type coercions it tries, and that it will flag an error if there is not a unique best interpretation.
I have a feeling that the answer to 2. is NO. All I need to prove it false is a code snippet that can be interpreted in more than one way by a competent programmer.
Does code already exist that
recognises the programming language
of a text file?
Yes, the Unix file command.
(Surely this must be a less
complicated task than eclipse's syntax
trees or than google translate's
language guessing feature, right?) In
fact, does the SO syntax highlighter
do anything like this?
As far as I can tell, SO has a one-size-fits-all syntax highlighter that tries to combine the keywords and comment syntax of every major language. Sometimes it gets it wrong:
def median(seq):
    """Returns the median of a list."""
    seq_sorted = sorted(seq)
    if len(seq) & 1:
        # For an odd-length list, return the middle item
        return seq_sorted[len(seq) // 2]
    else:
        # For an even-length list, return the mean of the 2 middle items
        return (seq_sorted[len(seq) // 2 - 1] + seq_sorted[len(seq) // 2]) / 2
Note that SO's highlighter assumes that // starts a C++-style comment, but in Python it's the integer division operator.
This is going to be a major problem if you try to combine multiple languages into one. What do you do if the same token has different meanings in different languages? Similar situations are:
Is ^ exponentiation like in BASIC, or bitwise XOR like in C?
Is || logical OR like in C, or string concatenation like in SQL?
What is 1 + "2"? Is the number converted to a string (giving "12"), or is the string converted to a number (giving 3)?
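For reference, here is how one particular language, Python, answers those questions. The variable names are just for illustration; the point is that each answer is a language-specific decision a hybrid interpreter would have to guess at.

```python
xor_result = 2 ^ 3      # ^ is bitwise XOR in Python, as in C
power_result = 2 ** 3   # exponentiation is spelled ** instead
floor_result = 7 // 2   # // is floor division, not a comment

try:
    1 + "2"
    mixed = "no error"
except TypeError:
    mixed = "TypeError"  # Python refuses to pick either conversion
```

So 2 ^ 3 gives 1 here where a BASIC programmer would expect 8, and 1 + "2" is rejected outright rather than becoming "12" or 3.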
Is there any language or interpreter
in existence, that can manage this
kind of flexible interpreting?
On another forum, I heard a story of a compiler (IIRC, for FORTRAN) that would compile any program regardless of syntax errors. If you had the line
= Y + Z
The compiler would recognize that a variable was missing and automatically convert the statement to X = Y + Z, regardless of whether you had an X in your program or not.
This programmer had a convention of starting comment blocks with a line of hyphens, like this:
C ----------------------------------------
But one day, they forgot the leading C, and the compiler choked trying to add dozens of variables between what it thought were subtraction operators.
"Flexible parsing" is not always a good thing.
To create a "pseudocode interpreter," it might be necessary to design a programming language that allows user-defined extensions to its syntax. There already are several programming languages with this feature, such as Coq, Seed7, Agda, and Lever. A particularly interesting example is the Inform programming language, since its syntax is essentially "structured English."
The Coq programming language allows "syntax extensions", so the language can be extended to parse new operators:
Notation "A /\ B" := (and A B).
Similarly, the Seed7 programming language can be extended to parse "pseudocode" using "structured syntax definitions." The while loop in Seed7 is defined in this way:
syntax expr: .while.().do.().end.while is -> 25;
Alternatively, it might be possible to "train" a statistical machine translation system to translate pseudocode into a real programming language, though this would require a large corpus of parallel texts.

Resources