What characteristic of the language grammar defines that it will be interpreted or compiled? - formal-languages

Is there any element or characteristic of a language's grammar that determines whether it will be interpreted or compiled? Or does that depend only on how the compiler or interpreter that processes the language is implemented?

You tagged this question formal-languages which should give you a hint: a formal language is a mathematical abstraction whereas interpreters and compilers are depressingly concrete.
So it is more or less like asking the relationship between the Peano axioms and currency. Nothing in the formal model of a number tells you whether it makes cents, so to speak.
On a practical level, if a language has anything like an eval primitive, you can be reasonably assured that it includes an interpreter, but of course the interpreter might consist of compiling the code to be evaluated and then running the resulting code. Unlike formal mathematical models, the real world is full of leaky abstractions.
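To make the eval point concrete, here is a minimal sketch in Python (chosen only for illustration; the variable names are made up). Python's built-in eval accepts a code object produced by the built-in compile, so "interpreting" a string here really is a compile step followed by a run step:

```python
# Step 1: compile source text into a code object (CPython bytecode).
source = "6 * 7"
code_object = compile(source, "<example>", "eval")

# Step 2: run the compiled object. eval() accepts either a string or a
# code object; given a string it performs the compile step itself first.
result = eval(code_object)

print(type(code_object).__name__, result)  # prints: code 42
```

Nothing in the grammar of the expression decided any of this; it is purely how this implementation chooses to work.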

Related

Initialization versus assignment frequency

In many languages, including the C family, initialization and assignment are both spelled =. (In languages that don't formally distinguish between them, I'm considering the first assignment to a variable to be its initialization.)
Which is more common in typical mainstream languages? Eyeballing some chunks of open source code doesn't give a clear answer, and it's tricky trying to distinguish them with a regular expression. (I'd be happy with statistics from particular code bases, e.g. 'in the Linux kernel there are X occurrences of the assignment operator, of which Y are actually initialization'.)
(The reason I ask is I'm designing a language that needs to distinguish between them, and I'm wondering whether to do as the ML family does and give initialization the shorter spelling, or vice versa.)
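A syntax tree makes the counting easier than a regular expression does. As a rough sketch only, here is how it could be done for Python source with the standard ast module. The function name count_init_vs_assign is made up, and the big simplifying assumption is that scopes are ignored: the first store to a name anywhere in the file counts as its initialization.

```python
import ast


def count_init_vs_assign(source: str) -> tuple[int, int]:
    """Return (initializations, reassignments) for a piece of Python source.

    Assumption: scopes are ignored, so the first store to a name anywhere
    in the source counts as its initialization.
    """
    # Collect every name that is being written to (plain, augmented and
    # loop-variable assignments all use the Store context).
    stores = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    ]
    # ast.walk is breadth-first, so sort by position to get source order.
    stores.sort(key=lambda node: (node.lineno, node.col_offset))

    seen = set()
    inits = reassigns = 0
    for node in stores:
        if node.id in seen:
            reassigns += 1
        else:
            seen.add(node.id)
            inits += 1
    return inits, reassigns


sample = """
x = 0
y = 1
x = x + y
for i in range(3):
    y += i
"""
print(count_init_vs_assign(sample))  # prints: (3, 2)
```

For C-family code bases the same idea would need a real parser for that language; the sketch only shows the shape of the measurement.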

Is there any algorithm that needs a functional language exclusively to be implemented?

I'm a C# developer and I don't know much about functional languages.
My question: is there any algorithm that can only be implemented in a functional language?
Regards.
As long as a language is Turing complete, any algorithm can be implemented in it (by definition of "algorithm"). But as others have said, functional languages can do certain things more elegantly. (Just take a look at Haskell. What a lovely language.) I'd also argue that there is a class of problems that OOP languages do better. (In my opinion, GUIs, although some may disagree.)
No, however a functional language may lead to a more elegant implementation for an algorithm that can exploit the features of such a language. For example, one that requires large recursive depth.
As I understand it, such an algorithm would have to be translated into a set of machine commands executed on some microprocessor (whether you use a compiled or an interpreted language). And none of the current processors are 'functional'.
In fact, this leads to an even broader assertion: any 'functional algorithm' can be implemented in C or assembler :)
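As a small illustration of that last claim (a sketch with made-up function names, not a proof), here is the same computation written recursively in a functional style and then with an explicit stack and a loop, which is roughly what a compiler targeting a conventional processor ends up producing anyway:

```python
def depth_recursive(tree):
    """Nesting depth of a list of lists, written recursively."""
    if not isinstance(tree, list):
        return 0
    return 1 + max((depth_recursive(child) for child in tree), default=0)


def depth_iterative(tree):
    """Same result using an explicit stack and a loop instead of recursion."""
    best = 0
    stack = [(tree, 0)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, list):
            best = max(best, depth + 1)
            stack.extend((child, depth + 1) for child in node)
    return best


nested = [1, [2, [3]], 4]
print(depth_recursive(nested), depth_iterative(nested))  # prints: 3 3
```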

Pseudocode interpreter?

Like lots of you guys on SO, I often write in several languages. And when it comes to planning stuff, (or even answering some SO questions), I actually think and write in some unspecified hybrid language. Although I used to be taught to do this using flow diagrams or UML-like diagrams, in retrospect, I find "my" pseudocode language has components of C, Python, Java, bash, Matlab, perl, Basic. I seem to unconsciously select the idiom best suited to expressing the concept/algorithm.
Common idioms might include Java-like braces for scope, pythonic list comprehensions or indentation, C++-like inheritance, C#-style lambdas, matlab-like slices and matrix operations.
I noticed that it's actually quite easy for people to recognise exactly what I'm trying to do, and quite easy for people to intelligently translate into other languages. Of course, that step involves considering the corner cases, and the moments where each language behaves idiosyncratically.
But in reality, most of these languages share a subset of keywords and library functions which generally behave identically - maths functions, type names, while/for/if etc. Clearly I'd have to exclude many 'odd' languages like lisp, APL derivatives, but...
So my questions are,
1. Does code already exist that recognises the programming language of a text file? (Surely this must be a less complicated task than eclipse's syntax trees or than google translate's language guessing feature, right?) In fact, does the SO syntax highlighter do anything like this?
2. Is it theoretically possible to create a single interpreter or compiler that recognises what language idiom you're using at any moment and (maybe "intelligently") executes or translates to a runnable form. And flags the corner cases where my syntax is ambiguous with regards to behaviour. Immediate difficulties I see include: knowing when to switch between indentation-dependent and brace-dependent modes, recognising funny operators (like *pointer vs *kwargs) and knowing when to use list vs array-like representations.
3. Is there any language or interpreter in existence, that can manage this kind of flexible interpreting?
4. Have I missed an obvious obstacle to this being possible?
edit
Thanks all for your answers and ideas. I am planning to write a constraint-based heuristic translator that could, potentially, "solve" code for the intended meaning and translate into real python code. It will notice keywords from many common languages, and will use syntactic clues to disambiguate the human's intentions - like spacing, brackets, optional helper words like let or then, context of how variables are previously used etc, plus knowledge of common conventions (like capital names, i for iteration, and some simplistic limited understanding of naming of variables/methods e.g containing the word get, asynchronous, count, last, previous, my etc). In real pseudocode, variable naming is as informative as the operations themselves!
Using these clues it will create assumptions as to the implementation of each operation (like 0/1 based indexing, when should exceptions be caught or ignored, what variables ought to be const/global/local, where to start and end execution, and what bits should be in separate threads, notice when numerical units match / need converting). Each assumption will have a given certainty - and the program will list the assumptions on each statement, as it coaxes what you write into something executable!
For each assumption, you can 'clarify' your code if you don't like the initial interpretation. The libraries issue is very interesting. My translator, like some IDE's, will read all definitions available from all modules, use some statistics about which classes/methods are used most frequently and in what contexts, and just guess! (adding a note to the program to say why it guessed as such...) I guess it should attempt to execute everything, and warn you about what it doesn't like. It should allow anything, but let you know what the several alternative interpretations are, if you're being ambiguous.
It will certainly be some time before it can manage such unusual examples as @Albin Sunnanbo's ImportantCustomer example. But I'll let you know how I get on!
I think that is quite useless for everything but toy examples and strict mathematical algorithms. For everything else the language is not just the language. There are lots of standard libraries and whole environments around the languages. I think I write almost as many lines of library calls as I write "actual code".
In C# you have .NET Framework, in C++ you have STL, in Java you have some Java libraries, etc.
The difference between those libraries are too big to be just syntactic nuances.
<subjective>
There have been attempts at unifying the constructs of different languages into a "unified syntax". Those were called 4GLs (fourth-generation languages) and never really took off.
</subjective>
As a side note, I have seen a code example about a page long that was valid as C#, Java and JavaScript code. That can serve as an example of where it is impossible to determine the actual language used.
Edit:
Besides, the whole purpose of pseudocode is that it does not need to compile in any way. The reason you write pseudocode is to create a "sketch", however sloppy you like.
foreach c in ImportantCustomers{== OrderValue >=$1M}
SendMailInviteToSpecialEvent(c)
Now tell me what language it is and write an interpreter for that.
1. To detect what programming language is used: Detecting programming language from a snippet
2. I think it should be possible. The approach in 1. could be leveraged to do this, I think. I would try to do it iteratively: detect the syntax used in the first line/clause of code, "compile" it to intermediate form based on that detection, along with any important syntax (e.g. begin/end wrappers). Then the next line/clause etc. Basically write a parser that attempts to recognize each "chunk". Ambiguity could be flagged by the same algorithm.
3. I doubt that this has been done ... seems like the cognitive load of learning to write e.g. python-compatible pseudocode would be much easier than trying to debug the cases where your interpreter fails.
4. a. I think the biggest problem is that most pseudocode is invalid in any language. For example, I might completely skip object initialization in a block of pseudocode because for a human reader it is almost always straightforward to infer. But for your case it might be completely invalid in the language syntax of choice, and it might be impossible to automatically determine e.g. the class of the object (it might not even exist). Etc.
   b. I think the best you can hope for is an interpreter that "works" (subject to 4a) for your pseudocode only, no-one else's.
Note that I don't think that 4a,4b are necessarily obstacles to it being possible. I just think it won't be useful for any practical purpose.
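For what the language-detection step might look like, here is a deliberately naive sketch. The cue table CUES, the function guess_language and the margin parameter are all invented for illustration; real detectors use far larger feature sets. The one design choice worth noting is that it returns None rather than guessing when the evidence is thin or tied.

```python
import re

# Hypothetical, hand-picked cues per language (regexes applied line-wise).
CUES = {
    "python": [r"^\s*def \w+\(.*\):", r"^\s*elif\b", r"\bself\b", r"^\s*import \w+"],
    "c": [r"#include\s*<", r"\bprintf\s*\(", r"\bint main\s*\(", r";\s*$"],
    "java": [r"\bpublic (static )?\w+", r"System\.out\.println", r"\bnew \w+\(", r";\s*$"],
}


def guess_language(snippet, margin=2):
    """Return the best-scoring language, or None if the result is ambiguous."""
    scores = {
        lang: sum(len(re.findall(p, snippet, flags=re.MULTILINE)) for p in patterns)
        for lang, patterns in CUES.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, runner_up_score) = ranked[0], ranked[1]
    # Refuse to answer when nothing matched or the lead is too small.
    if best_score == 0 or best_score - runner_up_score < margin:
        return None
    return best


python_snippet = "import math\n\ndef area(self, r):\n    return math.pi * r ** 2\n"
c_snippet = '#include <stdio.h>\nint main(void) {\n    printf("hi");\n    return 0;\n}\n'

print(guess_language(python_snippet))  # prints: python
print(guess_language(c_snippet))       # prints: c
print(guess_language("x = y + z;"))    # prints: None
```

The last call shows the problem in miniature: a single statement like that is valid in several languages, so the only honest answer is "don't know".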
Recognizing what language a program is in is really not that big a deal. Recognizing the language of a snippet is more difficult, and recognizing snippets that aren't clearly delimited (what do you do if four lines are Python and the next one is C or Java?) is going to be really difficult.
Assuming you got the lines assigned to the right language, doing any sort of compilation would require specialized compilers for all languages that would cooperate. This is a tremendous job in itself.
Moreover, when you write pseudo-code you aren't worrying about the syntax. (If you are, you're doing it wrong.) You'll wind up with code that simply can't be compiled because it's incomplete or even contradictory.
And, assuming you overcame all these obstacles, how certain would you be that the pseudo-code was being interpreted the way you were thinking?
What you would have would be a new computer language, that you would have to write correct programs in. It would be a sprawling and ambiguous language, very difficult to work with properly. It would require great care in its use. It would be almost exactly what you don't want in pseudo-code. The value of pseudo-code is that you can quickly sketch out your algorithms, without worrying about the details. That would be completely lost.
If you want an easy-to-write language, learn one. Python is a good choice. Use pseudo-code for sketching out how processing is supposed to occur, not as a compilable language.
An interesting approach would be a "type-as-you-go" pseudocode interpreter. That is, you would set the language to be used up front, and then it would attempt to convert the pseudo code to real code, in real time, as you typed. An interactive facility could be used to clarify ambiguous stuff and allow corrections. Part of the mechanism could be a library of code which the converter tried to match. Over time, it could learn and adapt its translation based on the habits of a particular user.
People who program all the time will probably prefer to just use the language in most cases. However, I could see the above being a great boon to learners, "non-programmer programmers" such as scientists, and for use in brainstorming sessions with programmers of various languages and skill levels.
-Neil
Programs interpreting human input need to be given the option of saying "I don't know." The language PL/I is a famous example of a system designed to find a reasonable interpretation of anything resembling a computer program that could cause havoc when it guessed wrong: see http://horningtales.blogspot.com/2006/10/my-first-pli-program.html
Note that in the later language C++, when it resolves possible ambiguities it limits the scope of the type coercions it tries, and that it will flag an error if there is not a unique best interpretation.
I have a feeling that the answer to 2. is NO. All I need to prove it false is a code snippet that can be interpreted in more than one way by a competent programmer.
Does code already exist that recognises the programming language of a text file?
Yes, the Unix file command.
(Surely this must be a less complicated task than eclipse's syntax trees or than google translate's language guessing feature, right?) In fact, does the SO syntax highlighter do anything like this?
As far as I can tell, SO has a one-size-fits-all syntax highlighter that tries to combine the keywords and comment syntax of every major language. Sometimes it gets it wrong:
def median(seq):
    """Returns the median of a list."""
    seq_sorted = sorted(seq)
    if len(seq) & 1:
        # For an odd-length list, return the middle item
        return seq_sorted[len(seq) // 2]
    else:
        # For an even-length list, return the mean of the 2 middle items
        return (seq_sorted[len(seq) // 2 - 1] + seq_sorted[len(seq) // 2]) / 2
Note that SO's highlighter assumes that // starts a C++-style comment, but in Python it's the integer division operator.
This is going to be a major problem if you try to combine multiple languages into one. What do you do if the same token has different meanings in different languages? Similar situations are:
Is ^ exponentiation like in BASIC, or bitwise XOR like in C?
Is || logical OR like in C, or string concatenation like in SQL?
What is 1 + "2"? Is the number converted to a string (giving "12"), or is the string converted to a number (giving 3)?
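To show how one language settles these questions, here is how Python happens to answer them (a quick sketch; the variable names are only for the demo). A mixed-idiom interpreter would have to pick one answer per token, or ask.

```python
# '^' is bitwise XOR in Python; exponentiation is spelled '**'.
xor_result = 2 ^ 3      # 0b10 XOR 0b11 == 0b01
power_result = 2 ** 3

# '//' is floor division, not the start of a comment.
floor_result = 7 // 2

# 1 + "2" is neither "12" nor 3: Python refuses to guess.
try:
    mixed_result = 1 + "2"
except TypeError:
    mixed_result = "TypeError"

print(xor_result, power_result, floor_result, mixed_result)
# prints: 1 8 3 TypeError
```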
Is there any language or interpreter in existence, that can manage this kind of flexible interpreting?
On another forum, I heard a story of a compiler (IIRC, for FORTRAN) that would compile any program regardless of syntax errors. If you had the line
= Y + Z
The compiler would recognize that a variable was missing and automatically convert the statement to X = Y + Z, regardless of whether you had an X in your program or not.
This programmer had a convention of starting comment blocks with a line of hyphens, like this:
C ----------------------------------------
But one day, they forgot the leading C, and the compiler choked trying to add dozens of variables between what it thought were subtraction operators.
"Flexible parsing" is not always a good thing.
To create a "pseudocode interpreter," it might be necessary to design a programming language that allows user-defined extensions to its syntax. There already are several programming languages with this feature, such as Coq, Seed7, Agda, and Lever. A particularly interesting example is the Inform programming language, since its syntax is essentially "structured English."
The Coq programming language allows "syntax extensions", so the language can be extended to parse new operators:
Notation "A /\ B" := (and A B).
Similarly, the Seed7 programming language can be extended to parse "pseudocode" using "structured syntax definitions." The while loop in Seed7 is defined in this way:
syntax expr: .while.().do.().end.while is -> 25;
Alternatively, it might be possible to "train" a statistical machine translation system to translate pseudocode into a real programming language, though this would require a large corpus of parallel texts.

A question about logic and the Curry-Howard correspondence

Could you please explain me what is the basic connection between the fundamentals of logical programming and the phenomenon of syntactic similarity between type systems and conventional logic?
The Curry-Howard correspondence is not about logic programming, but functional programming. The fundamental mechanic of Prolog is justified in proof theory by John Robinson's resolution technique, which shows how it is possible to check whether logical formulae expressed as Horn clauses are satisfiable, that is, whether you can find terms to substitute for their logic variables that make them true.
Thus logic programming is about specifying programs as logical formulae, and the calculation of the program is some form of proof inference, in Prolog resolution, as I have said. By contrast the Curry-Howard correspondence shows how proofs in a special formulation of logic, called natural deduction, correspond to programs in the lambda calculus, with the type of the program corresponding to the formula that the proof proves; computation in the lambda calculus corresponds to an important phenomenon in proof theory called normalisation, which transforms proofs into new, more direct proofs. So logic programming and functional programming correspond to different levels in these logics: logic programs match formulae of a logic, whilst functional programs match proofs of formulae.
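A few tiny instances of the correspondence may help. The sketch below uses Lean 4 syntax purely as an illustration (the choice of Lean is mine, not something the correspondence depends on): each formula is written as a type, and its proof is an ordinary lambda term of that type.

```lean
-- Propositions as types, proofs as programs.

-- A proof of A → B → A is the constant function.
example (A B : Prop) : A → B → A :=
  fun a _ => a

-- Modus ponens is just function application.
example (A B : Prop) (f : A → B) (a : A) : B :=
  f a

-- Conjunction behaves like a pair type, so swapping the
-- components proves that A ∧ B implies B ∧ A.
example (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩
```

In each case the type checker accepting the term is the same act as checking the proof.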
There's another difference: the logics used are generally different. Logic programming generally uses simpler logics — as I said, Prolog is founded on Horn clauses, which are a highly restricted class of formulae where implications may not be nested, and there are no disjunctions, although Prolog recovers the full strength of classical logic using the cut rule. By contrast, functional programming languages such as Haskell make heavy use of programs whose types have nested implications, and are decorated by all kinds of forms of polymorphism. They are also based on intuitionistic logic, a class of logics that forbids use of the principle of the excluded middle, which Robinson's computational mechanism is based on.
Some other points:
It is possible to base logic programming on more sophisticated logics than Horn clauses; for example, Lambda-prolog is based on intuitionistic logic, with a different computation mechanism than resolution.
Dale Miller has called the proof-theoretic paradigm behind logic programming the proof search as programming metaphor, to contrast with the proofs as programs metaphor that is another term used for the Curry-Howard correspondence.
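To give a feel for "proof search as programming", here is a toy goal-directed prover over Horn clauses. It is a heavily simplified sketch: it is propositional only, so there are no logic variables and no unification, which real Prolog resolution relies on. The rule set and the names RULES and prove are invented for the example.

```python
# Each rule is (head, [body atoms]); a fact is a rule with an empty body.
RULES = [
    ("mortal", ["human"]),   # mortal :- human.
    ("human", ["greek"]),    # human :- greek.
    ("greek", []),           # greek.
]


def prove(goal, rules, in_progress=frozenset()):
    """Backward-chain from the goal: find a rule whose head matches,
    then try to prove every atom in its body."""
    if goal in in_progress:
        return False  # already trying to prove this goal: avoid looping
    for head, body in rules:
        if head == goal and all(
            prove(atom, rules, in_progress | {goal}) for atom in body
        ):
            return True
    return False


print(prove("mortal", RULES))    # prints: True
print(prove("immortal", RULES))  # prints: False
```

Here the "program" is the set of formulae, and running it means searching for a proof of the query, which is the opposite arrangement from the proofs-as-programs reading.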
Logic programming is fundamentally about goal directed searching for proofs. The structural relationship between typed languages and logic generally involves functional languages, although sometimes imperative and other languages - but not logic programming languages directly. This relationship relates proofs to programs.
So, logic programming proof search can be used to find proofs that are then interpreted as functional programs. This seems to be the most direct relationship between the two (as you asked for).
Building whole programs this way isn't practical, but it can be useful for filling in tedious details in programs, and there's some important examples of this in practice. A basic example of this is structural subtyping - which corresponds to filling in a few proof steps via a simple entailment proof. A much more sophisticated example is the type class system of Haskell, which involves a particular kind of goal directed search - in the extreme this involves a Turing-complete form of logic programming at compile time.

Is Ruby a functional language?

Wikipedia says Ruby is a functional language, but I'm not convinced. Why or why not?
Whether a language is or is not a functional language is unimportant. Functional Programming is a thesis, best explained by Philip Wadler (The Essence of Functional Programming) and John Hughes (Why Functional Programming Matters).
A meaningful question is, 'How amenable is Ruby to achieving the thesis of functional programming?' The answer is 'very poorly'.
I gave a talk on this just recently. Here are the slides.
Ruby does support higher-order functions (see Array#map, inject, & select), but it is still an imperative, Object-Oriented language.
One of the key characteristics of a functional language is that it avoids mutable state. Functional languages do not have the concept of a variable as you would have in Ruby, C, Java, or any other imperative language.
Another key characteristic of a functional language is that it focuses on defining a program in terms of "what", rather than "how". When programming in an OO language, we write classes & methods to hide the implementation (the "how") from the "what" (the class/method name), but in the end these methods are still written using a sequence of statements. In a functional language, you do not specify a sequence of execution, even at the lowest level.
I most definitely think you can use functional style in Ruby.
One of the most critical aspects to be able to program in a functional style is if the language supports higher order functions... which Ruby does.
That said, it's easy to program in Ruby in a non-functional style as well. Another key aspect of functional style is to not have state, and have real mathematical functions that always return the same value for a given set of inputs. This can be done in Ruby, but it is not enforced in the language like something more strictly functional like Haskell.
So, yeah, it supports functional style, but it also will let you program in a non-functional style as well.
I submit that supporting, or having the ability to program in a language in a functional style does not a functional language make.
I can even write Java code in a functional style if I want to hurt my colleagues, and myself a few weeks on.
Having a functional language is not only about what you can do, such as higher-order functions, first-class functions and currying. It is also about what you cannot do, like side-effects in pure functions.
This is important because it is a big part of the reason why functional programs are, or functional code in general is, easier to reason about. And when code is easier to reason about, bugs become shallower and float to the conceptual surface where they can be fixed, which in turn gives less buggy code.
Ruby is object-oriented at its core, so even though it has reasonably good support for a functional style, it is not itself a functional language.
That's my non-scientific opinion anyway.
Edit:
In retrospect and with consideration for the fine comments I have received to this answer thus far, I think the object-oriented versus functional comparison is one of apples and oranges.
The real differentiator is that of being imperative in execution, or not. Functional languages have the expression as their primary linguistic construct and the order of execution is often undefined or defined as being lazy. Strict execution is possible but only used when needed. In an imperative language, strict execution is the default and while lazy execution is possible, it is often kludgy to do and can have unpredictable results in many edge cases.
Now, that's my non-scientific opinion.
Ruby will have to meet the following requirements in order to be "TRULY" functional.
Immutable values: once a “variable” is set, it cannot be changed. In Ruby, this means you effectively have to treat variables like constants. This is not fully supported in the language; you will have to freeze each variable manually.
No side-effects: when passed a given value, a function must always return the same result. This goes hand in hand with having immutable values; a function can never take a value and change it, as this would be causing a side-effect that is tangential to returning a result.
Higher-order functions: these are functions that allow functions as arguments, or use functions as the return value. This is, arguably, one of the most critical features of any functional language.
Currying: enabled by higher-order functions, currying is transforming a function that takes multiple arguments into a function that takes one argument. This goes hand in hand with partial function application, which is transforming a multi-argument function into a function that takes fewer arguments than it did originally.
Recursion: looping by calling a function from within itself. When you don’t have access to mutable data, recursion is used to build up and chain data construction. This is because looping is not a functional concept, as it requires variables to be passed around to store the state of the loop at a given time.
Lazy-evaluation, or delayed-evaluation: delaying processing of values until the moment when it is actually needed. If, as an example, you have some code that generated a list of Fibonacci numbers with lazy-evaluation enabled, this would not actually be processed and calculated until one of the values in the result was required by another function, such as puts.
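Several of the items above are easier to see in code. The sketch below uses Python rather than Ruby, only to have something short and runnable; the helper names (add, curry2, fib) are made up. Like Ruby, Python does not enforce any of these properties, it merely permits the style.

```python
from functools import partial, reduce
from itertools import islice


def add(x, y):
    """A pure function: same inputs, same output, no side effects."""
    return x + y


# Partial application: fix one argument of a two-argument function.
add_five = partial(add, 5)


def curry2(f):
    """Currying: turn f(x, y) into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)


def fib():
    """Lazy evaluation: values are produced only when a consumer asks.
    (The generator mutates local state internally; callers never see it.)"""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Higher-order functions over an immutable tuple instead of an explicit loop.
squares = tuple(map(lambda n: n * n, (1, 2, 3)))
total = reduce(add, squares, 0)

# Only the first ten Fibonacci numbers are ever computed.
first_ten = list(islice(fib(), 10))

print(add_five(10), curry2(add)(2)(3), squares, total)
# prints: 15 5 (1, 4, 9) 14
print(first_ten)
# prints: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```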
Proposal (Just a thought)
It would be great to have some kind of mode directive to declare that a file follows the functional paradigm, for example
mode 'functional'
Ruby is a multi-paradigm language that supports a functional style of programming.
Ruby is an object-oriented language, that can support other paradigms (functional, imperative, etc). However, since everything in Ruby is an object, it's primarily an OO language.
example:
"hello".reverse() = "olleh", every string is a string object instance and so on and so forth.
Read up here or here
It depends on your definition of a “functional language”. Personally, I think the term is itself quite problematic when used as an absolute. There are more aspects to being a “functional language” than mere language features and most depend on where you're looking from. For instance, the culture surrounding the language is quite important in this regard. Does it encourage a functional style? What about the available libraries? Do they encourage you to use them in a functional way?
Most people would call Scheme a functional language, for example. But what about Common Lisp? Apart from the multiple-/single-namespace issue and guaranteed tail-call elimination (which some CL implementations support as well, depending on the compiler settings), there isn't much that makes Scheme as a language more suited to functional programming than Common Lisp, and still, most Lispers wouldn't call CL a functional language. Why? Because the culture surrounding it heavily depends on CL's imperative features (like the LOOP macro, for example, which most Schemers would probably frown upon).
On the other hand, a C programmer may well consider CL a functional language. Most code written in any Lisp dialect is certainly much more functional in style than your usual block of C code, after all. Likewise, Scheme is very much an imperative language as compared to Haskell. Therefore, I don't think there can ever be a definite yes/no answer. Whether to call a language functional or not heavily depends on your viewpoint.
Ruby isn't really much of a multi-paradigm language either, I think. Multi-paradigm tends to be used by people wanting to label their favorite language as something which is useful in many different areas.
I'd describe Ruby as an object-oriented scripting language. Yes, functions are first-class objects (sort of), but that doesn't really make it a functional language. IMO, I might add.
Recursion is common in functional programming. Almost any language does support recursion, but recursive algorithms are often inefficient if there is no tail call optimization (TCO).
Functional programming languages are capable of optimizing tail recursion and can execute such code in constant space. Some Ruby implementations do optimize tail recursion, others don't, but in general Ruby implementations are not required to do TCO. See Does Ruby perform Tail Call Optimization?
So, if you write some Ruby in a functional style and rely on the TCO of some particular implementation, your code may be very inefficient in another Ruby interpreter. I think this is why Ruby is not a functional language (neither is Python).
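The same effect is easy to reproduce in CPython, which does not perform TCO either. In this sketch (function names invented for the demo, and assuming the interpreter's default recursion limit), the tail-recursive version is fine for small inputs but runs out of stack for large ones, while the equivalent loop runs in constant stack space:

```python
def sum_to_recursive(n, acc=0):
    """Tail-recursive sum 1..n. The recursive call is in tail position,
    but CPython still pushes a new stack frame for every call."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n):
    """The same computation as a loop: what TCO would effectively produce."""
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc


def overflows(n):
    """Report whether the recursive version exhausts the stack for this n."""
    try:
        sum_to_recursive(n)
    except RecursionError:
        return True
    return False


print(sum_to_recursive(100))   # small input: works
print(sum_to_loop(100_000))    # large input: the loop is fine
print(overflows(100_000))      # large input: the recursion is not
```

Code that is correct on an implementation with TCO can therefore simply fail on one without it, which is the portability problem described above.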
Strictly speaking, it doesn't make sense to describe a language as "functional"; most languages are capable of functional programming. Even C++ is.
Functional style is more or less a subset of imperative language features, supported with syntactic sugar and some compiler optimizations like immutability and tail-recursion flattening.
The latter arguably is a minor implementation-specific technicality and has nothing to do with the actual language. The x64 C# 4.0 compiler does tail-recursion optimization, whereas the x86 one doesn't for whatever stupid reason.
Syntactic sugar can usually be worked around to some extent or another, especially if the language has a programmable precompiler (i.e. C's #define).
It might be slightly more meaningful to ask, "does language __ support imperative programming?", and the answer, for instance with Lisp, is "no".
Please have a look at the beginning of the book "A-Great-Ruby-eBook". It discusses the very topic you are asking about. You can do different types of programming in Ruby. If you want to program functionally, you can. If you want to program imperatively, you can. How functional Ruby ultimately is comes down to a question of definition. Please see the reply by the user camflan.

Resources