Preventing evaluation of Mathematica expressions

In a recent SO question, three different answers were supplied, each using a different method of preventing the evaluation of the Equal[] expression. They were:
Defer[]
Unevaluated[]
HoldForm[]
Sometimes I still have trouble choosing between these options (and judging by the answers to the aforementioned question, the choice isn't always clear to other people either). Can someone write a clear exposition on the use of these three methods?
There are three other wrappers
Hold[],
HoldPattern[],
HoldComplete[],
and the various Attributes for functions
HoldAll, HoldFirst, HoldRest, and the numeric versions NHold*, which can also be discussed if you wish!
Edit
I just noticed that this is basically a repeat of the old question (which I had already upvoted, just forgotten...). The accepted answer linked to this talk at the 1999 Mathematica Developer Conference, which doesn't discuss Defer since it is "New in 6". Defer is more closely linked to the frontend than the other evaluation control mechanisms. It is used to create an unevaluated output that will be evaluated if supplied in an Input expression. To quote the Documentation Center:
Defer[expr] returns an object which remains unchanged until it is explicitly supplied as Mathematica input, and evaluated using Shift+Enter, Evaluate in Place, etc.

Not touching Defer, since I have not worked much with it and feel that in any given case its behavior can be reproduced by the other wrappers mentioned, and discussing Hold instead of HoldForm (the difference is really only in the way they are printed): here is the link to a MathGroup post where I gave a rather extensive explanation of the differences between Hold and Unevaluated, including differences in usage and in the evaluation process (my second and third posts in particular).
To make a long story short: Hold is used to keep an expression unevaluated across several evaluations (for an indefinite time, until we need it). It is a visible wrapper, in the sense that, say, Depth[Hold[{1,2,3}]] is not the same as Depth[{1,2,3}] (this is of course a consequence of evaluation), and it is generally nothing special: just a wrapper with the HoldAll attribute like any other, except that it is the "official" holding wrapper and is integrated much better with the rest of the system, since many system functions use or expect it.
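A minimal illustration of that visibility:
In[1]:= {Depth[{1, 2, 3}], Depth[Hold[{1, 2, 3}]]}
Out[1]= {2, 3}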
OTOH, Unevaluated[expr] is used to temporarily, just once, make up for a missing Hold* attribute on the function enclosing the expression expr. While it produces the behavior that the enclosing function would show if it did have a Hold* attribute, Unevaluated belongs to the argument, and it works only once, for a single evaluation, since it gets stripped in the process. Also, because it gets stripped, it is often invisible to the surrounding wrappers, unlike Hold. Finally, it is one of the very few "magic symbols", along with Sequence and Evaluate; these are deeply wired into the system and cannot easily be replicated or blocked, unlike Hold. In that sense, Unevaluated is more fundamental.
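A minimal sketch of that single-use behavior: Length has no Hold* attribute, so normally its argument evaluates before Length sees it, and Unevaluated makes up for the missing attribute exactly once:
In[1]:= {Length[1 + 2 + 3], Length[Unevaluated[1 + 2 + 3]]}
Out[1]= {0, 3}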
HoldComplete is used when one wants to prevent certain stages of the evaluation process which Hold does not prevent. This includes the splicing of sequences, for example:
In[25]:= {Hold[Sequence[1, 2]], HoldComplete[Sequence[1, 2]]}
Out[25]= {Hold[1, 2], HoldComplete[Sequence[1, 2]]}
the search for UpValues, for example:
In[26]:=
ClearAll[f];
f /: Hold[f[x_]] := f[x];
f[x_] := x^2;
In[29]:= {Hold[f[5]], HoldComplete[f[5]]}
Out[29]= {25, HoldComplete[f[5]]}
and immunity to Evaluate:
In[33]:=
ClearAll[f];
f[x_] := x^2;
In[35]:= {Hold[Evaluate[f[5]]], HoldComplete[Evaluate[f[5]]]}
Out[35]= {Hold[25], HoldComplete[Evaluate[f[5]]]}
In other words, it is used when you want to prevent any evaluation whatsoever of the expression inside. Like Hold, HoldComplete is nothing special in the sense that it is just the "official" wrapper with the HoldAllComplete attribute, and you can make your own that would behave similarly.
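To illustrate that last point, here is a minimal sketch with a made-up wrapper myHold, which blocks Evaluate and Sequence splicing just like HoldComplete does:
In[36]:= SetAttributes[myHold, HoldAllComplete];
In[37]:= myHold[Evaluate[1 + 1], Sequence[2, 3]]
Out[37]= myHold[Evaluate[1 + 1], Sequence[2, 3]]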
Finally, HoldPattern is a normal (usual) head with the HoldAll attribute for the purposes of evaluation, but its magic shows in pattern-matching: it is invisible to the pattern-matcher, and it is a very important ingredient of the language since it allows the pattern-matcher to be consistent with the evaluation process. Whenever there is a danger that the pattern in some rule may evaluate, HoldPattern can be used to ensure that this won't happen, while the pattern remains the same for the pattern-matcher. One thing I'd stress here is that this is its only purpose. People often also use it as an escape mechanism for the pattern-matcher, where Verbatim must be used instead. This works, but is conceptually wrong.
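A minimal sketch of that danger: without HoldPattern, the left-hand side of the rule below evaluates to x_^2 before any matching happens, so the rule no longer matches f[a]:
In[38]:= ClearAll[f, g]; f[x_] := x^2;
In[39]:= {Hold[f[a]] /. f[x_] :> g[x], Hold[f[a]] /. HoldPattern[f[x_]] :> g[x]}
Out[39]= {Hold[f[a]], Hold[g[a]]}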
One very good account of the evaluation process and all these things is David Wagner's book Power Programming with Mathematica: The Kernel, which was written in 1996 for version 3, but most if not all of the discussion there remains valid today. Alas, it is out of print, but you might have some luck on Amazon (as I had a few years ago).

Leonid Shifrin's answer is quite nice, but I wanted to touch on Defer, which is really useful for only one thing. In some instances, it's nice to be able to directly construct expressions that won't be evaluated, but that a user will be able to easily edit; the basic example of this kind of behavior is button palettes that you can use to insert expressions or expression templates into input cells, which the user can then edit as needed. This isn't the only way to do this, and for some more sophisticated applications you'll need to get into the hairy world of MakeBoxes, but for the basics Defer will serve nicely.
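A minimal sketch of that use case (the Integrate template is just my own example): clicking the button writes an editable, unevaluated template into the current input notebook:
Button["Insert template",
 NotebookWrite[InputNotebook[], ToBoxes[Defer[Integrate[f[x], x]]]]]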

Related

What is the purpose of the dummy argument that's passed to BASIC's POS function?

Not long ago I read that the Commodore 64's BASIC interpreter contains a POS function which returns the current horizontal position of the cursor. Since then, I've noticed this idiosyncrasy in some other BASIC dialects, including Microsoft QBASIC, and even Roku's BrightScript, which is much more recent.
What I'm wondering is, why is this a thing? If the value of the argument isn't used, why even require it? My guess is that maybe really early on BASIC didn't support functions without arguments, and it's stuck around for whatever reason, probably compatibility. But that wouldn't explain why it's still a required argument.
Worth mentioning is that QBASIC also includes CSRLIN, which returns the vertical position of the cursor, but it doesn't require/accept any arguments. This supports my idea that it came from "ancient times": POS would have been a well-defined operation on the earliest terminals (teletypes), but CSRLIN wouldn't have made sense until later hardware.
I seem to recall (very vaguely) that the lookup table in which pos was placed was one where all functions had an argument (like sin or fre). To that end, pos used common code to ensure it had an argument, even though it was ignored.
The BASIC interpreter in the C64, being based on the (rather limited) 6502 CPU, had to use all sorts of wondrous tricks to provide all its functionality.
Now keep in mind that this required reaching down into my gray matter through 30-odd years of detritus. I suspect you'll get a more accurate(a) answer over at the retro-computing sister site.
(a) And probably more complete, to an almost painful degree :-)

Why is the introduction of the Y-combinator in λ-calculus necessary?

I am reading a book on λ-calculus, "Functional Programming Through Lambda Calculus" (Greg Michaelson). In the book the author introduces a shorthand notation for defining functions. For example
def identity = λx.x
and goes on to say that we should insist that when using such shorthand, "all defined names should be replaced by their definitions before the expression is evaluated".
Later on, when introducing recursion he uses as an example a definition of the addition function such as:
def add x y = if iszero y then x else add (succ x) (pred y)
and goes on to say that, had we not had the restriction mentioned above, we would be able to evaluate this function by slowly expanding it. However, since we have the restriction of replacing all defined names before the evaluation of the expression, we cannot do that: we would go on indefinitely replacing add, hence the need to think about recursion in a more detailed way.
My question is thus the following: what are the theoretical or practical reasons for placing this restriction upon ourselves (of having to replace all defined names before the evaluation of the function)? Are there any?
I was trying to show how to build a rich language from a very simple one, by adding successive layers of syntax, where each layer could be translated into the previous layer. So it's important to distinguish translation, which must terminate, from evaluation which needn't. I think it's really interesting that recursion can be translated into non-recursion. I'm sorry if my explanation isn't helpful.
The reason is that we want to stay within the rules of the lambda calculus. Allowing names for terms to mean anything other than immediate substitution would mean adding a recursive let expression to the language, which would mean we would need a truly more expressive system (no longer the lambda calculus).
You can think of the names as no more than syntactic sugar for the original lambda term. The Y-combinator is exactly the way to introduce recursion into a system that does not have it built in.
If the book you are currently reading confuses you, you might want search for some additional resources on the internet explaining the Y-combinator.
I will try to post my own answer, the way I understand it.
For the untyped lambda calculus there is no practical reason why we need the Y combinator. By practical I mean that if someone wants to build an expression evaluator, it is possible to do it without needing the combinator, just by slowly expanding the definition.
For theoretical reasons, though, we need to make sure that when we define a function, this definition has some meaning and is not defined in terms of itself. E.g. there is not much meaning in the following definition:
def something = something
For this reason, we need to see if it is possible to rewrite the definition in a way that is not self-referential, i.e. so that it is possible to define something without referring to itself. It turns out that in the untyped lambda calculus we can always do that through the Y combinator.
Using the Y combinator we can always construct the solution to the equation x = f(x) = f(f(x)) = ... = f(f(...f(x)...)) = ... for any f, i.e. we can always rewrite a self-referential definition into a definition that does not include itself.
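For concreteness, here is the standard Y combinator in the book's shorthand notation, together with a non-recursive helper version of add (the name add1 is my own):
def Y = λf.(λx.f (x x)) (λx.f (x x))
def add1 = λf.λx.λy.if iszero y then x else f (succ x) (pred y)
def add = Y add1
Since Y g reduces to g (Y g), each use of add unfolds the definition exactly one level, so replacing the names terminates even though the evaluation itself may recurse.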

Attributed variables: library interfaces / implementations / portability

When I was skimming some Prolog related questions recently, I stumbled upon this answer by @mat to the question How to represent directed cyclic graph in Prolog with direct access to neighbour verticies.
So far, my personal experience with attributed variables in Prolog has been very limited. But the use-case given by @mat sparked my interest. So I tried using it to answer another question, ordering lists with constraint logic programming.
First, the good news: My first use of attributed variables worked out like I wanted it to.
Then, the not so good news: when I had posted my answer, I realized there were several APIs and implementations for attributed variables in Prolog.
I feel I'm in over my head here... In particular, I want to know the following:
Which APIs are in widespread use? Up to now, I have found two: SICStus and SWI.
Which features do the different attributed variable implementations offer? The same ones? Or does one subsume the other?
Are there differences in semantics?
What about the actual implementation? Are some more efficient than others?
Can using attributed variables be (or is it already) a portability issue?
Lots of question marks here... Please share your experience/stance!
Thank you in advance!
Edit 2015-04-22
Here's a code snippet of the answer mentioned above:
init_att_var(X, Z) :-
    put_attr(Z, value, X).

get_att_value(Var, Value) :-
    get_attr(Var, value, Value).
So far I "only" use put_attr/3 and get_attr/3, but, according to the SICStus Prolog documentation on attributed variables, SICStus offers put_atts/2 and get_atts/2 instead.
So even this very shallow use-case requires some emulation layer (one way or the other).
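For instance, a minimal shim on the SICStus side might look like this (a sketch, untested; the attribute name value is taken from the snippet above):

:- use_module(library(atts)).
:- attribute value/1.

put_attr(Var, value, X) :- put_atts(Var, value(X)).
get_attr(Var, value, X) :- get_atts(Var, value(X)).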
I would like to focus on one important general point I noticed when working with different interfaces for attributed variables: when designing an interface for attributed variables, an implementor should also keep the following in mind:
Is it possible to take attributes into account when reasoning about simultaneous unifications, as in [X,Y] = [0,1]?
This is possible, for example, in SICStus Prolog, because such bindings are undone before verify_attributes/3 is called. In the interface provided by hProlog (attr_unify_hook/2, called after the unification and with all bindings already in place), it is hard to take the (previous) attributes of Y into account when reasoning about the unification of X in attr_unify_hook/2, because Y is no longer a variable at that point! This may be sufficient for solvers that can make decisions based on ground values alone, but it is a serious limitation for solvers that need additional data, typically stored in attributes, to decide whether a unification should succeed; that data is then no longer easily available. One obvious example: Boolean unification with decision diagrams.
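For concreteness, here is a minimal sketch of the hook-based interface as it looks in SWI-Prolog (module and predicate names are mine): a "domain" attribute that restricts a variable to a list of integers. Note that by the time attr_unify_hook/2 runs, the binding is already in place:

:- module(domain, [set_domain/2]).

set_domain(X, Dom) :-
    put_attr(X, domain, Dom).

% Called after the attributed variable was bound; Value is already in place.
attr_unify_hook(Dom, Value) :-
    (   integer(Value)
    ->  memberchk(Value, Dom)
    ;   true  % unified with another variable: accept in this sketch
    ).

With this, ?- set_domain(X, [1,2,3]), X = 2. succeeds while X = 5 fails, but the hook never gets to see X as a variable.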
As of 2016, the verify-attributes branch of SWI-Prolog also supports verify_attributes/3, thanks to great implementation work by Douglas Miles. The branch is ready for testing and intended to be merged into master as soon as it works correctly and efficiently. For compatibility with hProlog, the branch also supports attr_unify_hook/2: It does so by rewriting such definitions to the more general verify_attributes/3 at compilation time.
Performance-wise, it is clear that there may be a downside to verify_attributes/3, because making several variables ground at the same time may let you see sooner (in attr_unify_hook/2) that a unification cannot succeed. However, I will gladly and at any time exchange this typically negligible advantage for the improved reliability, ease of use, and increased functionality that the more general interface gives you, and which is in any case already the standard behaviour in SICStus Prolog, which, on top of its generality, is also one of the faster Prolog systems around.
SICStus Prolog also features an important predicate called project_attributes/2: It is used by the toplevel to project constraints to query variables. SWI-Prolog also supports this in recent versions.
There is also one huge advantage of the SWI interface: The residual goals that attribute_goals//1 and hence copy_term/3 give you are always a list. This helps users to avoid defaultyness in their code, and encourages a more declarative interface, because a list of pure constraint goals cannot contain control structures.
Interestingly, neither interface lets you interpret unifications other than syntactically. Personally, I think there are cases where you may want to interpret unifications differently than syntactically, however, there may also be good arguments against that.
The other interface predicates for attributed variables are mostly easily interchangeable via simple wrapper predicates for the different systems.
Jekejeke Minlog has state-less or thin attribute variables. Well, not exactly: an attribute variable can have zero, one or many hooks, which are allowed to be closures and hence can carry a little state. But typically an implementation manages the state elsewhere. For this purpose Jekejeke Minlog allows creating reference types from variables, so that they can be used as indexes into tables. The full potential is unleashed if this is combined with trailing and/or forward chaining. As an example we have implemented CLP(FD). There is also a little solver tutorial.
The primitive ingredients in our case are:
1) State-less Attribute Variables
2) Trailing and Variable Keys
3) Continuation Queue
The attribute variable hooks might have binding effects, up to extending the continuation queue, but are only executed once. Goals from the continuation queue can be non-deterministic.
There are some additional layers before realizing applications; these are mostly aggregations of the primitives to make changes temporarily.
The main applications so far are open source here and here:
a) Finite Domain Constraint Solver
b) Herbrand Constraints
c) Goal Suspension
Bye
An additional perspective on attributed variable libraries is how many attributes can be defined per module. In the case of SWI-Prolog/YAP, and citing the SWI documentation:

Each attribute is associated to a module, and the hook (attr_unify_hook/2) is executed in this module.
This is a severe limitation for implementers of libraries such as CLP(FD), as it forces using additional modules for the sole purpose of having multiple attributes, instead of being able to define as many attributes as required in the module implementing the library. This limitation doesn't exist in the SICStus Prolog interface, which provides a directive attribute/1 that allows the declaration of an arbitrary number of attributes per module.
You can find one of the oldest and most elaborate implementations of attributed variables in ECLiPSe, where it forms part of the wider infrastructure for implementing constraint solvers.
The main characteristics of this design are:
attributes must be declared, and in return the compiler supports efficient access
a syntax for attributed variables, so that they can be read and written
a more complete set of handlers for attribute operations, so that attributes are not only taken into account for unification, but also for other generic operations such as term copying and subsumption tests
a clear separation between the concepts of variable attribute and suspended goals
used in over a dozen of ECLiPSe's libraries
This paper (section 4) and the ECLiPSe documentation have more details.

When would you swap two numbers without using a third variable?

I have read several sources that discuss how to swap two numbers without using a third variable. These are a few of the most relevant:
How do you swap two integer variables without using any if conditions, casting, or additional variables?
Potential Problem in "Swapping values of two variables without using a third variable"
Swap two integers without using a third variable
http://www.geeksforgeeks.org/swap-two-numbers-without-using-temporary-variable/
I understand why it doesn't make sense to use the described methods in most cases: the code becomes cluttered and difficult to read, and will usually execute more slowly than a solution utilizing a third "temp" variable. However, none of the questions I have found discuss any benefits of the two-variable methods in practice. Do they have any redeeming qualities or benefits (historical or contemporary), or are they only useful as obscure programming trivia?
At this point it's just a neat trick. Speed-wise, if it matters, your compiler will recognize a normal swap and optimize it appropriately (but there is no guarantee that it will recognize weird XORing and optimize that appropriately).
Another strike against XOR is that if one variable aliases the other, XORing them will zero both out. Since you'll have to check for and handle this condition, you'll have extra code involved, probably by using the third-variable method.
You could also try adding and subtracting values… except that you’d have to check for and handle overflow, which would involve more code (probably the third variable method). Multiplication and division have the same flaw, but more importantly, there’s the exquisite delight of representing fractions in binary (so this wouldn’t work in the first place).
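For concreteness, here is a minimal C sketch of the XOR trick, including the aliasing pitfall mentioned above (the function name is mine):

#include <stdio.h>

/* Swap without a temporary: correct for two distinct objects,
   but zeroes the value when both pointers alias the same object. */
static void xor_swap(unsigned *a, unsigned *b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

int main(void)
{
    unsigned x = 3, y = 5;
    xor_swap(&x, &y);
    printf("%u %u\n", x, y);  /* prints: 5 3 */
    xor_swap(&x, &x);         /* aliasing: x ^ x == 0 */
    printf("%u\n", x);        /* prints: 0 */
    return 0;
}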
Edit: D’oh, sorry for the thread necromancy… got so caught up in following links that I forgot to check the dates.

Pseudocode interpreter?

Like lots of you guys on SO, I often write in several languages. And when it comes to planning stuff (or even answering some SO questions), I actually think and write in some unspecified hybrid language. Although I was taught to do this using flow diagrams or UML-like diagrams, in retrospect I find "my" pseudocode language has components of C, Python, Java, bash, Matlab, Perl, and Basic. I seem to unconsciously select the idiom best suited to expressing the concept/algorithm.
Common idioms might include Java-like braces for scope, Pythonic list comprehensions or indentation, C++-like inheritance, C#-style lambdas, and Matlab-like slices and matrix operations.
I noticed that it's actually quite easy for people to recognise exactly what I'm trying to do, and quite easy for people to intelligently translate it into other languages. Of course, that step involves considering the corner cases and the moments where each language behaves idiosyncratically.
But in reality, most of these languages share a subset of keywords and library functions which generally behave identically: maths functions, type names, while/for/if, etc. Clearly I'd have to exclude many 'odd' languages like Lisp and APL derivatives, but...
So my questions are,
Does code already exist that recognises the programming language of a text file? (Surely this must be a less complicated task than eclipse's syntax trees or than google translate's language guessing feature, right?) In fact, does the SO syntax highlighter do anything like this?
Is it theoretically possible to create a single interpreter or compiler that recognises what language idiom you're using at any moment and (maybe "intelligently") executes or translates to a runnable form. And flags the corner cases where my syntax is ambiguous with regards to behaviour. Immediate difficulties I see include: knowing when to switch between indentation-dependent and brace-dependent modes, recognising funny operators (like *pointer vs *kwargs) and knowing when to use list vs array-like representations.
Is there any language or interpreter in existence, that can manage this kind of flexible interpreting?
Have I missed an obvious obstacle to this being possible?
edit
Thanks all for your answers and ideas. I am planning to write a constraint-based heuristic translator that could, potentially, "solve" code for the intended meaning and translate it into real Python code. It will notice keywords from many common languages, and will use syntactic clues to disambiguate the human's intentions, like spacing, brackets, optional helper words like let or then, the context of how variables were previously used, etc., plus knowledge of common conventions (like capital names, i for iteration, and some simplistic, limited understanding of the naming of variables/methods, e.g. containing the word get, asynchronous, count, last, previous, my, etc.). In real pseudocode, variable naming is as informative as the operations themselves!
Using these clues it will create assumptions as to the implementation of each operation (like 0/1 based indexing, when should exceptions be caught or ignored, what variables ought to be const/global/local, where to start and end execution, and what bits should be in separate threads, notice when numerical units match / need converting). Each assumption will have a given certainty - and the program will list the assumptions on each statement, as it coaxes what you write into something executable!
For each assumption, you can 'clarify' your code if you don't like the initial interpretation. The libraries issue is very interesting. My translator, like some IDE's, will read all definitions available from all modules, use some statistics about which classes/methods are used most frequently and in what contexts, and just guess! (adding a note to the program to say why it guessed as such...) I guess it should attempt to execute everything, and warn you about what it doesn't like. It should allow anything, but let you know what the several alternative interpretations are, if you're being ambiguous.
It will certainly be some time before it can manage such unusual examples as @Albin Sunnanbo's ImportantCustomer example. But I'll let you know how I get on!
I think that this is quite useless for everything but toy examples and strict mathematical algorithms. For everything else, the language is not just the language. There are lots of standard libraries and whole environments around the languages. I think I write almost as many lines of library calls as I write "actual code".
In C# you have .NET Framework, in C++ you have STL, in Java you have some Java libraries, etc.
The difference between those libraries are too big to be just syntactic nuances.
<subjective>
There have been attempts at unifying language constructs of different languages into a "unified syntax". That was called 4GL, and it never really took off.
</subjective>
As a side note, I have seen a code example about a page long that was valid C#, Java, and JavaScript code. That can serve as an example of where it is impossible to determine the actual language used.
Edit:
Besides, the whole purpose of pseudocode is that it does not need to compile in any way. The reason you write pseudocode is to create a "sketch", however sloppy you like.
foreach c in ImportantCustomers{== OrderValue >=$1M}
SendMailInviteToSpecialEvent(c)
Now tell me what language it is and write an interpreter for that.
To detect what programming language is used: Detecting programming language from a snippet
I think it should be possible. The approach in 1. could be leveraged to do this, I think. I would try to do it iteratively: detect the syntax used in the first line/clause of code, "compile" it to an intermediate form based on that detection, along with any important syntax (e.g. begin/end wrappers). Then the next line/clause, etc. Basically, write a parser that attempts to recognize each "chunk". Ambiguity could be flagged by the same algorithm; a toy sketch follows.
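Here is a toy Python sketch of that per-chunk detection (the hint lists are made up and far from complete):

# Guess a chunk's language by scoring language-specific tokens.
HINTS = {
    "python": ["def ", "elif ", "import ", ":", "None"],
    "c":      ["#include", ";", "->", "printf", "/*"],
    "java":   ["public ", "new ", "System.out", ";"],
}

def guess_language(chunk):
    scores = {lang: sum(chunk.count(h) for h in hints)
              for lang, hints in HINTS.items()}
    best = max(scores, key=scores.get)
    # Flag ambiguity when several languages score equally well.
    tied = [lang for lang in scores if scores[lang] == scores[best]]
    return best if len(tied) == 1 else "ambiguous: " + "/".join(sorted(tied))

print(guess_language("def median(seq): return sorted(seq)[len(seq) // 2]"))  # python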
I doubt that this has been done... it seems like the cognitive load of learning to write, e.g., python-compatible pseudocode would be much lower than that of trying to debug the cases where your interpreter fails.
a. I think the biggest problem is that most pseudocode is invalid in any language. For example, I might completely skip object initialization in a block of pseudocode because for a human reader it is almost always straightforward to infer. But for your case it might be completely invalid in the language syntax of choice, and it might be impossible to automatically determine e.g. the class of the object (it might not even exist). Etc.
b. I think the best you can hope for is an interpreter that "works" (subject to 4a) for your pseudocode only, no-one else's.
Note that I don't think that 4a,4b are necessarily obstacles to it being possible. I just think it won't be useful for any practical purpose.
Recognizing what language a program is in is really not that big a deal. Recognizing the language of a snippet is more difficult, and recognizing snippets that aren't clearly delimited (what do you do if four lines are Python and the next one is C or Java?) is going to be really difficult.
Assuming you got the lines assigned to the right language, doing any sort of compilation would require specialized compilers for all languages that would cooperate. This is a tremendous job in itself.
Moreover, when you write pseudo-code you aren't worrying about the syntax. (If you are, you're doing it wrong.) You'll wind up with code that simply can't be compiled because it's incomplete or even contradictory.
And, assuming you overcame all these obstacles, how certain would you be that the pseudo-code was being interpreted the way you were thinking?
What you would have would be a new computer language, that you would have to write correct programs in. It would be a sprawling and ambiguous language, very difficult to work with properly. It would require great care in its use. It would be almost exactly what you don't want in pseudo-code. The value of pseudo-code is that you can quickly sketch out your algorithms, without worrying about the details. That would be completely lost.
If you want an easy-to-write language, learn one. Python is a good choice. Use pseudo-code for sketching out how processing is supposed to occur, not as a compilable language.
An interesting approach would be a "type-as-you-go" pseudocode interpreter. That is, you would set the language to be used up front, and then it would attempt to convert the pseudo code to real code, in real time, as you typed. An interactive facility could be used to clarify ambiguous stuff and allow corrections. Part of the mechanism could be a library of code which the converter tried to match. Over time, it could learn and adapt its translation based on the habits of a particular user.
People who program all the time will probably prefer to just use the language in most cases. However, I could see the above being a great boon to learners, "non-programmer programmers" such as scientists, and for use in brainstorming sessions with programmers of various languages and skill levels.
-Neil
Programs interpreting human input need to be given the option of saying "I don't know." The language PL/I is a famous example of a system designed to find a reasonable interpretation of anything resembling a computer program that could cause havoc when it guessed wrong: see http://horningtales.blogspot.com/2006/10/my-first-pli-program.html
Note that in the later language C++, when it resolves possible ambiguities it limits the scope of the type coercions it tries, and that it will flag an error if there is not a unique best interpretation.
I have a feeling that the answer to 2. is NO. All I need to prove it false is a code snippet that can be interpreted in more than one way by a competent programmer.
Does code already exist that recognises the programming language of a text file?

Yes, the Unix file command.
(Surely this must be a less complicated task than eclipse's syntax trees or than google translate's language guessing feature, right?) In fact, does the SO syntax highlighter do anything like this?
As far as I can tell, SO has a one-size-fits-all syntax highlighter that tries to combine the keywords and comment syntax of every major language. Sometimes it gets it wrong:
def median(seq):
    """Returns the median of a list."""
    seq_sorted = sorted(seq)
    if len(seq) & 1:
        # For an odd-length list, return the middle item
        return seq_sorted[len(seq) // 2]
    else:
        # For an even-length list, return the mean of the 2 middle items
        return (seq_sorted[len(seq) // 2 - 1] + seq_sorted[len(seq) // 2]) / 2
Note that SO's highlighter assumes that // starts a C++-style comment, but in Python it's the integer division operator.
This is going to be a major problem if you try to combine multiple languages into one. What do you do if the same token has different meanings in different languages? Similar situations are:
Is ^ exponentiation like in BASIC, or bitwise XOR like in C?
Is || logical OR like in C, or string concatenation like in SQL?
What is 1 + "2"? Is the number converted to a string (giving "12"), or is the string converted to a number (giving 3)?
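Even among mainstream languages the answers differ; Python, for instance, answers the third question by refusing outright:

>>> 1 + "2"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'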
Is there any language or interpreter in existence, that can manage this kind of flexible interpreting?
On another forum, I heard a story of a compiler (IIRC, for FORTRAN) that would compile any program regardless of syntax errors. If you had the line
= Y + Z
The compiler would recognize that a variable was missing and automatically convert the statement to X = Y + Z, regardless of whether you had an X in your program or not.
This programmer had a convention of starting comment blocks with a line of hyphens, like this:
C ----------------------------------------
But one day, they forgot the leading C, and the compiler choked trying to add dozens of variables between what it thought were subtraction operators.
"Flexible parsing" is not always a good thing.
To create a "pseudocode interpreter," it might be necessary to design a programming language that allows user-defined extensions to its syntax. There already are several programming languages with this feature, such as Coq, Seed7, Agda, and Lever. A particularly interesting example is the Inform programming language, since its syntax is essentially "structured English."
The Coq programming language allows "syntax extensions", so the language can be extended to parse new operators:
Notation "A /\ B" := (and A B).
Similarly, the Seed7 programming language can be extended to parse "pseudocode" using "structured syntax definitions." The while loop in Seed7 is defined in this way:
syntax expr: .while.().do.().end.while is -> 25;
Alternatively, it might be possible to "train" a statistical machine translation system to translate pseudocode into a real programming language, though this would require a large corpus of parallel texts.
