I just noticed that on the Wikipedia page for Ruby, the language is described as an interpreted language.
I suspect there's something missing in my background. I have always known the difference between an interpreted language, which doesn't need a compiler, and a compiled language, which requires compilation before programs can be executed. But what characterizes a scripting language?
Is Ruby definable as a scripting language?
Thank you, and forgive the gap in my knowledge.
Things aren't just black and white. At the very least, they're also big and small, loud and quiet, blue and orange, grey and gray, long and short, right and wrong, etc.
Interpreted/compiled is just one way to categorize languages, and it's completely independent from (among countless other things) whether you call the same language a "scripting language" or not. To top it off, it's also a broken classification:
Interpreted/compiled depends on the language implementation, not on the language (this is not just theory, there are indeed quite a few languages for which both interpreters and compilers exist)
There are language implementations (lots of them, including most Ruby implementations) that are compilers, but "only" compile to bytecode and interpret that bytecode.
There are also implementations that switch between interpreting and compiling to native code (JIT compilers).
You see, reality is a complex beast ;) Ruby is, as mentioned above, frequently compiled. The output of that compilation is then interpreted, at least in some cases - there are also implementations that JIT-compile (Rubinius, and IIRC JRuby compiles to Java bytecode after a while). The reference implementation has been a compiler for a long time, and IIRC still is. So is Ruby interpreted or compiled? Neither term is meaningful unless you define it ;)
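You can actually watch this compile-then-interpret pipeline from inside the language; a small sketch using the RubyVM::InstructionSequence API (this assumes you're running on CRuby/MRI 1.9 or later, since other implementations don't expose this class):

```ruby
# MRI compiles Ruby source to YARV bytecode before interpreting it.
# RubyVM::InstructionSequence exposes that compiler directly.
iseq = RubyVM::InstructionSequence.compile("1 + 2")

puts iseq.disasm    # human-readable listing of the compiled bytecode
result = iseq.eval  # the VM then interprets that bytecode
puts result         # prints 3
```

The disassembly shows instructions like `putobject` and `opt_plus` -- evidence that "interpreted" MRI is, internally, a compiler feeding a bytecode interpreter.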
But back to the question: "Scripting language" isn't a property of the language either, it depends on how the language is used - namely, whether the language is used for scripting tasks. If you're looking for a definition, the Wikipedia page on "Scripting language" may help (just don't let them confuse you with the notes on implementation details such as that scripts are usually interpreted). There are indeed a few programs that use Ruby for scripting tasks, and there are doubtless numerous free-standing Ruby programs that would likely qualify as scripts (web scraping, system administration, etc).
So yes, I guess one can call Ruby a scripting language. Of course that doesn't mean a Ruby on Rails web app is just a script.
Yes.
Detailed response:
A scripting language is typically used to control applications that are often not written in this language. For example, shell scripts etc. can call arbitrary console applications.
Ruby is a general purpose dynamic language that is frequently used for scripting.
You can make arbitrary system calls using backtick notation like below.
`<system command>`
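A concrete sketch of that backtick form (this assumes a POSIX shell where echo is available):

```ruby
# Backticks run the command in a subshell and return its standard
# output as a String; $? then holds the exit status of that command.
output = `echo hello`
puts output          # prints "hello"
puts $?.success?     # prints "true" if the command exited with status 0
```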
There are also many excellent Ruby gems such as Watir and RAutomation for automating web and native GUIs.
For a definition of "scripting language", see here.
The term 'scripting language' is very broad, and it can include both interpreted and compiled languages. Ruby in particular might be compiled or interpreted depending on which implementation you're using - for instance, JRuby compiles to bytecode, whereas CRuby (Ruby's reference implementation) is interpreted.
I'm curious: is "indentation as a code block indicator" the intellectual property of the Python programming language, such that other languages like C# and Java are not able to copy it? Or is it technically impossible for such compiled languages to benefit from this idea?
There is nothing stopping other languages from using indentation to denote blocks, and some other languages do; for example, indentation matters in F♯. Older languages presumably don't do this because it's simpler to parse a block if the syntax has dedicated "begin block" and "end block" tokens (and older parsers are more likely to have been written by hand and had to run on more limited hardware), and other modern languages may choose not to use indentation like this for a variety of reasons.
Note that this is a choice: even if it seems to you that using indentation to denote blocks is simply a superior way to design a language's syntax, you should be aware that many programmers will disagree with you. If a language doesn't use indentation to denote blocks, it's generally not because the language's syntax couldn't have been that way, but rather because the language designers chose not to make it that way.
This has nothing to do with whether the language is compiled or interpreted, since parsing occurs before either compilation or interpretation; actually, it is not really correct to say that a language is compiled or interpreted, it is more correct to say that an implementation of a language is a compiler or an interpreter, and some languages have both kinds of implementation.
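Ruby itself illustrates the "dedicated tokens" approach: blocks are delimited by explicit keywords like def/end and if/else/end, so indentation is purely cosmetic to the parser. A deliberately mis-indented sketch:

```ruby
# Indentation here is wrong on purpose; Ruby's parser only cares
# about the explicit def/end and if/else/end tokens.
def sign(x)
if x < 0
"negative"
    else
"non-negative"
end
end

puts sign(-3)   # prints "negative"
puts sign(2)    # prints "non-negative"
```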
I'm a secondary school student currently taking the OCR Computer Science GCSE (J276). I taught myself to program and was recently surprised by the context of a question in one of OCR's specimen papers (this one) as it goes against my knowledge of programming.
In question 5b, a question goes on to ask for a description of the differences between compilers and interpreters:
Harry can use either a compiler or an interpreter to translate the code [that he has created].
This confused me, as it seemed to suggest that the code written could be either interpreted or compiled in order to run. That would be odd, as it was my understanding that languages fit into one of two boxes: interpreted (Python, JavaScript) or compiled (C++, Java), rather than both.
Is it true that a single programming language can be either compiled or interpreted based on the programmer's desire, or is this another case of OCR simplifying the course to make it easier to understand?
C is a language that is usually compiled, but interpreted implementations exist.
According to delnan in this answer,
First off, interpreted/compiled is not a property of the language but a property of the implementation. For most languages, most if not all implementations fall in one category, so one might save a few words saying the language is interpreted/compiled too, but it's still an important distinction, both because it aids understanding and because there are quite a few languages with usable implementations of both kinds (mostly in the realm of functional languages, see Haskell and ML). In addition, there are C interpreters and projects that attempt to compile a subset of Python to C or C++ code (and subsequently to machine code).
In reality, it looks like the designers of your course said something that was true in theory, but in practice tends to be more restricted. This is found all over programming and, in fact, the world in general. Could you write a JavaScript compiler for Commodore 64? Sure, the C64 implements a full, general purpose computer system, and JavaScript is Turing Complete. Just because something is possible doesn't mean that a lot of people actually do it, though, or that it is easy.
I do all kinds of scripting with Ruby:
rails (symfony)
ruby (php, bash)
rb-appscript (applescript)
Is it possible to replace low level languages with Ruby too?
I'd write in Ruby, and it would convert it to Java, C++ or C.
People say that when it comes to performance-critical tasks in Ruby, you can extend it with C. But the word "extend" means that you write C files that you then call from your Ruby code. I wonder: could I instead write Ruby and convert it to C source code, which would then be compiled to machine code? That way I could "extend" it with C, but written in Ruby.
That is what this post is about. Write everything in Ruby but get the performance of C (or Java).
The second advantage is that you don't have to learn other languages.
Just like HipHop for PHP.
Are there implementations for this?
Such a compiler would be an enormous piece of work. Even if it works, it still has to
include the ruby runtime
include the standard library (which wasn't built for performance but for usability)
allow for metaprogramming
do dynamic dispatch
etc.
All of these inflict tremendous runtime penalties, because a C compiler can neither understand nor optimize such abstractions. Ruby and other dynamic languages are not only slower because they are interpreted (or compiled to bytecode which is then interpreted), but also because they are dynamic.
Example
In C++, a method call can be inlined in most cases, because the compiler knows the exact type of this. If a subtype is passed, the method still can't change unless it is virtual, in which case a still very efficient lookup table is used.
In Ruby, classes and methods can change in any way at any time, thus a (relatively expensive) lookup is required every time.
Languages like Ruby, Python or Perl have many features that simply are expensive, and most if not all relevant programs rely heavily on these features (of course, they are extremely useful!), so they cannot be removed or inlined.
Simply put: Dynamic languages are very hard to optimize, simply doing what an interpreter would do and compiling that to machine code doesn't cut it. It's possible to get incredible speed out of dynamic languages, as V8 proves, but you have to throw huge piles of money and offices full of clever programmers at it.
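A tiny sketch of why Ruby call sites resist static optimization: any class can be reopened and its methods replaced while the program runs, so a compiler can never treat a call as fixed the way a C++ compiler can:

```ruby
class Greeter
  def hello
    "hi"
  end
end

g = Greeter.new
puts g.hello        # prints "hi"

# Reopen the class at runtime and replace the method; the same call
# on the same object now resolves to different code, which is why
# every call needs a lookup instead of a compiled-in jump.
class Greeter
  def hello
    "bonjour"
  end
end

puts g.hello        # prints "bonjour"
```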
There is https://github.com/seattlerb/ruby_to_c, a Ruby-to-C compiler. It only accepts a subset of Ruby, though; I believe the main missing part is the metaprogramming features.
In a recent interview (November 16th, 2012) Yukihiro "Matz" Matsumoto (creator of Ruby) talked about compiling Ruby to C
(...) In University of Tokyo a research student is working on an academic research project that compiles Ruby code to C code before compiling the binary code. The process involves techniques such as type inference, and in optimal scenarios the speed could reach up to 90% of typical hand-written C code. So far there is only a paper published, no open source code yet, but I’m hoping next year everything will be revealed... (from interview)
Just one student is not much, but it might be an interesting project. Probably a long way to go to full support of Ruby.
"Low level" is highly subjective. Many people draw the line differently, so for the sake of this argument, I'm just going to assume you mean compiling Ruby down to an intermediate form which can then be turned into machine code for your particular platform. I.e., compiling ruby to C or LLVM IR, or something of that nature.
The short answer is yes this is possible.
The longer answer goes something like this:
Several languages (Objective-C most notably) exist as a thin layer over other languages. ObjC syntax is really just a loose wrapper around the objc_*() libobjc runtime calls, for all practical purposes.
Knowing this, then what does the compiler do? Well, it basically works as any C compiler would, but also takes the objc-specific stuff, and generates the appropriate C function calls to interact with the objc runtime.
A ruby compiler could be implemented in similar terms.
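To sketch that idea in Ruby terms (runtime_dispatch is an invented name, purely for illustration): just as an Objective-C compiler rewrites message sends into objc_*() runtime calls, a hypothetical Ruby compiler could rewrite every method call into an explicit call into the runtime's lookup machinery:

```ruby
# Hypothetical lowering: `receiver.name(args)` becomes an explicit
# runtime call that looks the method up at dispatch time, much like
# objc_msgSend() does for Objective-C message sends.
def runtime_dispatch(receiver, name, *args)
  receiver.public_send(name, *args)
end

puts runtime_dispatch("hello", :upcase)          # prints "HELLO"
puts runtime_dispatch([3, 1, 2], :sort).inspect  # prints "[1, 2, 3]"
```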
It should also be noted however, that just by converting one language to a lower level form does not mean that language is instantly going to perform better, though it does not mean it will perform worse either. You really have to ask yourself why you're wanting to do it, and if that is a good reason.
There is also JRuby, if you still consider Java a low level language. Actually, the language itself has little to do here: it is possible to compile to JVM bytecode, which is independent of the language.
Performance doesn't come solely from "low level" compiled languages. Cross-compiling your Ruby program to convoluted, automatically generated C code isn't going to help either. This will likely just confuse things, include long compile times, etc. And there are much better ways.
But you first say "low level languages" and then mention Java. Java is not a low-level language; it's just one step below Ruby on the high-level/low-level scale. But if you look at how Java works - the JVM, bytecode and just-in-time compilation - you can see how high-level languages can be fast(er). Ruby is currently doing something similar: MRI 1.8 was an interpreted implementation and had some performance problems, while 1.9 is much faster because it uses a bytecode interpreter. I'm not sure if it'll ever happen on MRI, but Ruby is just one step away from JIT there.
I'm not sure about the technologies behind JRuby and IronRuby, but they may already be doing this. However, both have their own advantages and disadvantages. I tend to stick with MRI; it's fast enough and it works just fine.
It is probably feasible to design a compiler that converts Ruby source code to C++. Ruby programs can be compiled to Python using the unholy compiler, so they could be compiled from Python to C++ using the Nuitka compiler.
The unholy compiler was developed more than a decade ago, but it might still work with current versions of Python.
Ruby2Cextension is another compiler that translates a subset of Ruby to C++, though it hasn't been updated since 2008.
In the chosen answer for this question about Blue Ruby, Chuck says:
All of the current Ruby implementations are compiled to bytecode. Contrary to SAP's claims, as of Ruby 1.9, MRI itself includes a bytecode compiler, though the ability to save the compiled bytecode to disk disappeared somewhere in the process of merging the YARV virtual machine. JRuby is compiled into Java .class files. I don't have a lot of details on MagLev, but it seems safe to say it will take that road as well.
I'm confused about this compilation/interpretation issue with respect to Ruby.
I learned that Ruby is an interpreted language and that's why when I save changes to my Ruby files I don't need to re-build the project.
But if all of the Ruby implementations now are compiled, is it still fair to say that Ruby is an interpreted language? Or am I misunderstanding something?
Nearly every language is "compiled" nowadays, if you count bytecode as being compiled. Even Emacs Lisp is compiled. Ruby was a special case because until recently, it wasn't compiled into bytecode.
I think you're right to question the utility of characterizing languages as "compiled" vs. "interpreted." One useful distinction, though, is whether the language creates machine code (e.g. x86 assembler) directly from user code. C, C++, many Lisps, and Java with JIT enabled do, but Ruby, Python, and Perl do not.
People who don't know better will call any language that has a separate manual compilation step "compiled" and ones that don't "interpreted."
Yes, Ruby's still an interpreted language, or more precisely, Matz's Ruby Interpreter (MRI), which is what people usually talk about when they talk about Ruby, is still an interpreter. The compilation step is simply there to reduce the code to something that's faster to execute than interpreting and reinterpreting the same code time after time.
A subtle question indeed...
It used to be that "interpreted" languages were parsed and transformed into an intermediate form which was faster to execute, but the "machine" executing them was a pretty language-specific program. "Compiled" languages were translated instead into the machine code instructions supported by the computer on which they ran. An early distinction was very basic--static vs. dynamic scope. In a statically scoped language, a variable reference could pretty much be resolved to a memory address in a few machine instructions--you knew exactly where in the calling frame the variable lived. In dynamically scoped languages you had to search (up an A-list or up a calling frame) for the reference. With the advent of object-oriented programming, the non-immediate nature of a reference expanded to many more concepts--classes (types), methods (functions), even syntactical interpretation (embedded DSLs like regex).
The distinction, in fact, going back to maybe the late 70's was not so much between compiled and interpreted languages, but whether they were run in a compiled or interpreted environment.
For example, Pascal (the first high-level language I studied) ran at UC Berkeley first on Bill Joy's pxp interpreter, and later on the compiler he wrote pcc. Same language, available in both compiled and interpreted environments.
Some languages are more dynamic than others, the meaning of something--a type, a method, a variable--is dependent on the run-time environment. This means that compiled or not there is substantial run-time mechanism associated with executing a program. Forth, Smalltalk, NeWs, Lisp, all were examples of this. Initially, these languages required so much mechanism to execute (versus a C or a Fortran) that they were a natural for interpretation.
Even before Java, there were attempts to speed up execution of complex, dynamic languages with tricks, techniques which became threaded compilation, just-in-time compilation, and so on.
I think it was Java, though, which was the first wide-spread language that really muddied the compiler/interpreter gap, ironically not so that it would run faster (though, that too) but so that it would run everywhere. By defining their own machine language and "machine" the java bytecode and VM, Java attempted to become a language compiled into something close to any basic machine, but not actually any real machine.
Modern languages marry all these innovations. Some have the dynamic, open-ended, you-don't-know-what-you-get-until-runtime nature of traditional "interpreted" languages (Ruby, Lisp, Smalltalk, Python, Perl(!)); some try to have the rigor of specification allowing deep type-based static error detection of traditional compiled languages (Java, Scala). All compile to machine-independent representations (like JVM bytecode) to get write-once-run-anywhere semantics.
So, compiled vs. interpreted? Best of both, I'd say. All the code's around in source (with documentation), change anything and the effect is immediate, simple operations run almost as fast as the hardware can do them, complex ones are supported and fast enough, hardware and memory models are consistent across platforms.
The bigger polemic in languages today is probably whether they are statically or dynamically typed, which is to say not how fast will they run, but will the errors be found by the compiler beforehand (at the cost of the programmer having to specify pretty complex typing information) or will the errors come up in testing and production.
You can run Ruby programs interactively using irb, the Interactive Ruby Shell. While it may generate intermediate bytecode, it's certainly not a "compiler" in the traditional sense.
A compiled language is generally compiled into machine code, as opposed to just byte code. Some byte code generators can actually further compile the byte code into machine code though.
Byte code itself is just an intermediate step between the literal code written by the user and the virtual machine, it still needs to be interpreted by the virtual machine though (as it's done with Java in a JVM and PHP with an opcode cache).
This is possibly a little off topic but...
IronRuby is a .NET-based implementation of Ruby, and is therefore usually compiled to bytecode and then JIT-compiled to machine language at runtime (i.e. not interpreted). Also (at least with other .NET languages, so I assume with Ruby too) ngen can be used to generate a compiled native binary ahead of time, so that's effectively a machine-code-compiled version of Ruby code.
According to what I heard at RubyConf 2011 in Shanghai, Matz is developing mruby (short for Matz's Ruby), targeting embedded devices. Matz said that mruby will provide the ability to compile Ruby code into machine code, to boost speed and decrease the usage of the (limited) resources on embedded devices. So there are various kinds of Ruby implementations, and definitely not all of them are just interpreted at runtime.
If there are any language designers out there (or people simply in the know), I'm curious about the methodology behind creating standard libraries for interpreted languages. Specifically, what seems to be the best approach? Defining standard functions/methods in the interpreted language, or performing the processing of those calls in the compiled language in which the interpreter is written?
What got me to thinking about this was the SO question about a stripslashes()-like function in Python. My first thought was "why not define your own and just call it when you need it", but it raised the question: is it preferable, for such a function, to let the interpreted language handle that overhead, or would it be better to write an extension and leverage the compiled language behind the interpreter?
The line between "interpreted" and "compiled" languages is really fuzzy these days. For example, the first thing Python does when it sees source code is compile it into a bytecode representation, essentially the same as what Java does when compiling class files. This is what *.pyc files contain. Then, the python runtime executes the bytecode without referring to the original source. Traditionally, a purely interpreted language would refer to the source code continuously when executing the program.
When building a language, it is a good approach to build a solid foundation on which you can implement the higher level functions. If you've got a solid, fast string handling system, then the language designer can (and should) implement something like stripslashes() outside the base runtime. This is done for at least a few reasons:
The language designer can show that the language is flexible enough to handle that kind of task.
The language designer actually writes real code in the language, which has tests and therefore shows that the foundation is solid.
Other people can more easily read, borrow, and even change the higher level function without having to be able to build or even understand the language core.
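For instance, the stripslashes()-style helper mentioned above is exactly the kind of function that can live in the high-level language itself; a minimal sketch in Ruby (the function name is borrowed from PHP, it's not a standard library API):

```ruby
# Remove backslash escapes: "\x" becomes "x" for any character x,
# implemented entirely on top of the runtime's string/regex core.
def stripslashes(str)
  str.gsub(/\\(.)/, '\1')
end

puts stripslashes("It\\'s escaped")   # prints "It's escaped"
puts stripslashes("no escapes")       # prints "no escapes"
```

All the heavy lifting (the regex engine, String#gsub) still happens in the compiled core; only the thin policy layer is written in the language itself.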
Just because a language like Python compiles to bytecode and executes that doesn't mean it is slow. There's no reason why somebody couldn't write a Just-In-Time (JIT) compiler for Python, along the lines of what Java and .NET already do, to further increase the performance. In fact, IronPython compiles Python directly to .NET bytecode, which is then run using the .NET system including the JIT.
To answer your question directly, the only time a language designer would implement a function in the language behind the runtime (eg. C in the case of Python) would be to maximise the performance of that function. This is why modules such as the regular expression parser are written in C rather than native Python. On the other hand, a module like getopt.py is implemented in pure Python because it can all be done there and there's no benefit to using the corresponding C library.
There's also an increasing trend of reimplementing languages that are traditionally considered "interpreted" onto a platform like the JVM or CLR -- and then allowing easy access to "native" code for interoperability. So from Jython and JRuby, you can easily access Java code, and from IronPython and IronRuby, you can easily access .NET code.
In cases like these, the ability to "leverage the compiled language behind the interpreter" could be described as the primary motivator for the new implementation.
See the 'Papers' section at www.lua.org.
Especially The Implementation of Lua 5.0
Lua defines all standard functions in the underlying (ANSI C) code. I believe this is mostly for performance reasons. Recently, though, the 'string.*' functions got an alternative implementation in pure Lua, which may prove vital for subprojects where Lua runs on top of a .NET or Java runtime (where C code cannot be used).
As long as you are using a portable API for the compiled code base like the ANSI C standard library or STL in C++, then taking advantage of those functions would keep you from reinventing the wheel and likely provide a smaller, faster interpreter. Lua takes this approach and it is definitely small and fast as compared to many others.