I have a question to which I can't find an accurate answer online.
Using swipl-ld,
one can combine Prolog and C code together, eventually generating one single
executable binary.
But there is one thing I am confused with...
In the generated binary, does the Prolog interpreter (virtual machine or similar) still exist?
If so, then the original Prolog code is probably stored as a string in the .rodata section of the ELF binary; but after searching inside this section, I didn't find the code. Perhaps the original code has been transformed into bytecode, and that's why I can't find it at all.
If not, then how can Prolog code be translated directly into semantically equivalent assembly code by SWI-Prolog? I have read some materials about the implementation of GNU Prolog, which is based on the WAM virtual machine; however, I haven't found any materials about the implementation of SWI-Prolog.
Could anyone give me some help?
The compiled binary contains neither your original source code nor the whole Prolog interpreter. However, it does contain your program in the form of bytecode compiled by the qsave_program/2 predicate. This bytecode is executed by a Prolog emulator, which is a subset of the Prolog interpreter used during a normal interactive dialog, and which is also included in the compiled binary.
All relevant information can be found in the Generating Runtime Applications section of the SWI-Prolog documentation.
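As an analogy (using CPython, not SWI-Prolog itself), the same pattern appears in any bytecode-based system: the source is compiled to bytecode for a virtual machine, and the compiled form does not contain the original source text, which is exactly why searching the binary for it turns up nothing. A minimal Python sketch:

```python
# Compile a tiny source snippet to bytecode, as any bytecode-based VM does.
source = "result = 6 * 7"
code = compile(source, "<demo>", "exec")

# The bytecode is a compact binary encoding of operations, not text:
print(code.co_code)

# The original source text is not stored inside the bytecode itself,
# which is why searching a compiled artifact for it finds nothing.
print(source.encode() in code.co_code)  # False

# Yet the bytecode is fully executable by the VM:
namespace = {}
exec(code, namespace)
print(namespace["result"])  # 42
```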
Related
I'm looking into how the V8 compiler works. I read an article which states that source code is tokenized and parsed, an AST is constructed, and then bytecode is generated (https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775)
Is this bytecode an intermediate representation?
Short answer: No. Usually people use the terms "bytecode" and "intermediate representation" to mean two different things.
Long answer: It depends a bit on your definition (but for most definitions, "no" is still the right answer).
"Bytecode" in virtual machines like V8 refers to a representation that is used as input for an interpreter. The article you linked to gives a good description.
"Intermediate representation" or IR usually refers to data that a compiler uses internally, as an intermediate step (hence the name) between its input (usually the AST = abstract syntax tree, i.e. parsed version of the source text) and its output (usually machine code or byte code, but it could be anything, as in a source-to-source compiler).
So in a traditional setup, you have:
source --(parser)--> AST --(compiler front-end)--> IR --(compiler back-end)--> machine code
where the IR is usually modified several times as the compiler performs various optimizations on it, before finally generating machine code from it. There can also be several different IRs; for example V8's earlier optimizing compiler ("Crankshaft") had two: high-level IR "Hydrogen" and low-level IR "Lithium", whereas V8's current optimizing compiler ("Turbofan") even has three: "JavaScript-level nodes", "Simplified nodes", and "Machine-level nodes".
Now if you wanted to draw the boxes in your whiteboard diagram of the system a little differently, then instead of having a "parser" and a "compiler" you could treat everything between source and machine code as one big "compiler" (which as a first step parses the source). In that case, the AST would be a form of intermediate representation. But, as stated above, usually when people use the term IR they mean "compiler IR", not the AST.
In a virtual machine like V8, the overall execution pipeline is more complicated than described above. It starts with:
source --(parser)--> AST --(bytecode generator)--> bytecode
This bytecode is primarily used as input for V8's interpreter.
As an optimization, when V8 decides to run a function through the optimizing compiler, it does not start with the source code and a parser again, but instead the optimizing compiler uses the bytecode as its input. In diagram form:
bytecode --(interpreter)--> program execution
bytecode --(compiler front-end)--> IR --(compiler back-end)--> machine code --(CPU)--> program execution
Now here's the part where your perspective comes in: since the bytecode in V8 is not only used as input for the interpreter, but also as input for the optimizing compiler and in that sense as a step on the way from source text to machine code, if you wanted to call it a special form of intermediate representation, you wouldn't technically be wrong. It would be an unusual definition of the term though. When a compiler theory textbook talks about "intermediate representation", it does not mean "bytecode".
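To make the source → AST → bytecode pipeline concrete with a different (non-V8) system, Python's standard library exposes each stage directly: ast.parse plays the parser, compile plays the bytecode generator, and the interpreter executes the result. A sketch:

```python
import ast
import dis

source = "def double(x):\n    return x + x\n"

# Step 1: the parser turns source text into an AST.
tree = ast.parse(source)
print(type(tree.body[0]).__name__)  # the first node is a function definition

# Step 2: the bytecode generator turns the AST into bytecode.
code = compile(tree, "<demo>", "exec")

# Step 3: the interpreter executes the bytecode.
namespace = {}
exec(code, namespace)
print(namespace["double"](21))  # 42

# The bytecode of the compiled function can be inspected:
dis.dis(namespace["double"])
```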
I know about assembly language and machine code.
A computer can usually execute programs written in its native machine language. Each instruction in this language is simple enough to be executed using a relatively small number of
electronic circuits. For simplicity, we will call this language L0. Programmers would have a difficult time writing programs in L0 because it is enormously detailed and consists purely of numbers. If a new language, L1, could be constructed that was easier to use, programs could be written in L1.
But I just want to know: is there a single example of what machine code is?
I mean, is there anything that I can write, just save, and run (without compiling it with any compiler)?
Assembly instructions have a one-to-one relationship with the underlying machine instructions. This means that you can essentially convert assembly instructions into machine instructions with a look-up table.
look here: x86 instructions
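To illustrate the look-up-table idea, here is a minimal sketch in Python that assembles two real x86-64 instructions by hand. The encodings (B8 = mov eax, imm32; C3 = ret; 90 = nop) come from the x86 reference linked above; the helper names are made up for this example:

```python
import struct

# A tiny "assembler" built as a look-up table: each mnemonic maps
# directly to its x86-64 machine-code encoding.
ENCODINGS = {
    "nop": b"\x90",  # no operation
    "ret": b"\xc3",  # return from procedure
}

def mov_eax(imm32):
    # B8 is "mov eax, imm32"; the operand follows as 4 little-endian bytes.
    return b"\xb8" + struct.pack("<I", imm32)

# Assemble: mov eax, 42 ; ret  -- the raw bytes of a function returning 42.
machine_code = mov_eax(42) + ENCODINGS["ret"]
print(machine_code.hex())  # b82a000000c3
```

Those six bytes are literally what the CPU executes; an assembler's job is essentially this table lookup plus operand encoding.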
I am looking at the Java-written Prolog system, Prova.
https://prova.ws/
But it is not clear how it is implemented: as a Prolog compiler or as a Prolog interpreter? I read the manual, but did not find an answer.
There are some rumors that Prova is based on Mandarax. The newest
version seems to be heading in the same direction as SWI-Prolog 7,
i.e. it supports dicts and a dot notation. See also here:
http://prova.ws/confluence/display/REWRITEDEV/Prova+maps+for+defining+slotted+terms
The original Mandarax seems to have been an interpreter, and
in the user manual of Prova we find one sentence that self-declares
it as a Prolog interpreter, but no hint of compilation.
But there seems to be a newer version of Mandarax (1.1.0) which was
some kind of compiler, but maybe Prova had already branched off
before the compiler arrived, and it's still an interpreter.
So although it self-declares as a Prolog interpreter, it is most
likely not an ISO Prolog system, since, for example, op/3 is missing.
I guess it uses a tokenizer with some hard-wired operators and a
parser with some hard-wired operator expressions. (*)
It might nevertheless offer some goodies, but judging from the
documentation and binary size, there might not be many. This
is possibly compensated for by the ability to directly embed Java
calls via the dot notation:
http://prova.ws/confluence/display/REWRITEDEV/Calling+Java+from+Prova+rulebases
Bye
(*)
The Prova syntax even goes so far as to require the end-user
to write fail() instead of fail. A syntax variant that is also
found in the new SWI-Prolog 7, although there it does not have the
same drastic effect of no longer allowing the end-user to use
atoms as goals.
Is it possible to distinguish between a Prolog Interpreter and Prolog Compiler from its usage or intermediary files generated?
Wikipedia has a good compilation of Prolog implementations
http://en.wikipedia.org/wiki/Comparison_of_Prolog_implementations
This is a question about the notation used in the table.
Does the column "Compiled Code" mean that the corresponding Prolog system is implemented as a Prolog compiler?
(I am not sure if Stack Overflow is a good place to ask about this. If not, please let me know and I will remove this thread.)
"Compiled Code" in this table means that any given Prolog program is itself compiled by the respective Prolog system, and the compiled form is executed.
Most of these systems compile Prolog programs to abstract machine code before executing it. Examples of abstract machines for Prolog (like the JVM for Java) are the WAM, ZIP, TOAM etc.
Some of these systems even compile Prolog code to native machine code, for example via JIT compilation, just like Java systems can compile Java code to native machine code.
In practice, you usually do not create intermediary files when working with Prolog: You run the Prolog system, load your source file, and the system compiles the file on the fly and in memory to abstract machine code, without creating an intermediary file. You usually can create such files manually if you need them, but you typically do not.
Thus, the creation of intermediary files is not a criterion that lets you distinguish a compiler from an interpreter.
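As an illustration from another bytecode-compiled language (Python, not Prolog), compilation normally happens on the fly and in memory, and a compiled file appears only when you explicitly ask for one:

```python
import os
import py_compile
import tempfile

# Normal use: compile and run entirely in memory -- no file is produced.
code = compile("answer = 40 + 2", "<memory>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["answer"])  # 42

# Only when explicitly requested is a compiled file written to disk,
# much like manually creating a saved state in a Prolog system.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "m.py")
    with open(src, "w") as f:
        f.write("answer = 40 + 2\n")
    pyc = py_compile.compile(src, cfile=os.path.join(d, "m.pyc"))
    print(os.path.exists(pyc))  # True
```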
Wikipedia says:
A debug symbol is information that expresses which programming-language constructs generated a specific piece of machine code in a given executable module.
Are there any examples of what kinds of programming-language constructs are used for this purpose?
What is the meaning of "constructs" in this context? Functions?
The programming-language constructs referred to are things like if statements, while loops, assignment statements, and so on.
Debug symbols are usually files that map addresses of executable chunks of machine code to the original source code file and line number they represent. This is what allows you to do things like put a breakpoint on an if statement and have the machine stop when execution reaches that particular bit of code.
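As a small illustration of the same idea (a Python analogy, not ELF/DWARF debug symbols), Python code objects carry a line table mapping bytecode offsets back to source lines, which dis.findlinestarts exposes:

```python
import dis

source = (
    "x = 1\n"
    "if x:\n"
    "    y = 2\n"
)
code = compile(source, "demo.py", "exec")

# The code object's line table maps bytecode offsets back to source
# lines -- the same principle as debug symbols mapping machine-code
# addresses to file/line pairs, so a debugger knows where to stop.
for offset, line in dis.findlinestarts(code):
    print(offset, line)
```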