What does a just-in-time (JIT) compiler do? - compilation

What does a JIT compiler specifically do as opposed to a non-JIT compiler? Can someone give a succinct and easy to understand description?

A JIT compiler runs after the program has started and compiles the code (usually bytecode or some kind of VM instructions) on the fly (or just-in-time, as it's called) into a form that's usually faster, typically the host CPU's native instruction set. A JIT has access to dynamic runtime information whereas a standard compiler doesn't and can make better optimizations like inlining functions that are used frequently.
This is in contrast to a traditional compiler that compiles all the code to machine language before the program is first run.
To paraphrase, conventional compilers build the whole program as an EXE file BEFORE the first time you run it. For newer style programs, an assembly is generated with pseudocode (p-code). Only AFTER you execute the program on the OS (e.g., by double-clicking on its icon) will the (JIT) compiler kick in and generate machine code (m-code) that the Intel-based processor or whatever will understand.

In the beginning, a compiler was responsible for turning a high-level language (defined as higher level than assembler) into object code (machine instructions), which would then be linked (by a linker) into an executable.
At one point in the evolution of languages, compilers would compile a high-level language into pseudo-code, which would then be interpreted (by an interpreter) to run your program. This eliminated the object code and executables, and allowed these languages to be portable to multiple operating systems and hardware platforms. Pascal (which compiled to P-Code) was one of the first; Java and C# are more recent examples. Eventually the term P-Code was replaced with bytecode, since most of the pseudo-operations are a byte long.
A Just-In-Time (JIT) compiler is a feature of the run-time interpreter, that instead of interpreting bytecode every time a method is invoked, will compile the bytecode into the machine code instructions of the running machine, and then invoke this object code instead. Ideally the efficiency of running object code will overcome the inefficiency of recompiling the program every time it runs.

JIT-Just in time
the word itself says when it's needed (on demand)
Typical scenario:
The source code is completely converted into machine code
JIT scenario:
The source code will be converted into assembly language like structure [for ex IL (intermediate language) for C#, ByteCode for java].
The intermediate code is converted into machine language only when the application needs that is required codes are only converted to machine code.
JIT vs Non-JIT comparison:
In JIT not all the code is converted into machine code first a part
of the code that is necessary will be converted into machine code
then if a method or functionality called is not in machine then that
will be turned into machine code... it reduces burden on the CPU.
As the machine code will be generated on run time....the JIT
compiler will produce machine code that is optimised for running
machine's CPU architecture.
JIT Examples:
In Java JIT is in JVM (Java Virtual Machine)
In C# it is in CLR (Common Language Runtime)
In Android it is in DVM (Dalvik Virtual Machine), or ART (Android RunTime) in newer versions.

As other have mentioned
JIT stands for Just-in-Time which means that code gets compiled when it is needed, not before runtime.
Just to add a point to above discussion JVM maintains a count as of how many time a function is executed. If this count exceeds a predefined limit JIT compiles the code into machine language which can directly be executed by the processor (unlike the normal case in which javac compile the code into bytecode and then java - the interpreter interprets this bytecode line by line converts it into machine code and executes).
Also next time this function is calculated same compiled code is executed again unlike normal interpretation in which the code is interpreted again line by line. This makes execution faster.

JIT compiler only compiles the byte-code to equivalent native code at first execution. Upon every successive execution, the JVM merely uses the already compiled native code to optimize performance.
Without JIT compiler, the JVM interpreter translates the byte-code line-by-line to make it appear as if a native application is being executed.
Source

JIT stands for Just-in-Time which means that code gets compiled when it is needed, not before runtime.
This is beneficial because the compiler can generate code that is optimised for your particular machine. A static compiler, like your average C compiler, will compile all of the code on to executable code on the developer's machine. Hence the compiler will perform optimisations based on some assumptions. It can compile more slowly and do more optimisations because it is not slowing execution of the program for the user.

After the byte code (which is architecture neutral) has been generated by the Java compiler, the execution will be handled by the JVM (in Java). The byte code will be loaded in to JVM by the loader and then each byte instruction is interpreted.
When we need to call a method multiple times, we need to interpret the same code many times and this may take more time than is needed. So we have the JIT (just-in-time) compilers. When the byte has been is loaded in to JVM (its run time), the whole code will be compiled rather than interpreted, thus saving time.
JIT compilers works only during run time, so we do not have any binary output.

A just in time compiler (JIT) is a piece of software which takes receives an non executable input and returns the appropriate machine code to be executed. For example:
Intermediate representation JIT Native machine code for the current CPU architecture
Java bytecode ---> machine code
Javascript (run with V8) ---> machine code
The consequence of this is that for a certain CPU architecture the appropriate JIT compiler must be installed.
Difference compiler, interpreter, and JIT
Although there can be exceptions in general when we want to transform source code into machine code we can use:
Compiler: Takes source code and returns a executable
Interpreter: Executes the program instruction by instruction. It takes an executable segment of the source code and turns that segment into machine instructions. This process is repeated until all source code is transformed into machine instructions and executed.
JIT: Many different implementations of a JIT are possible, however a JIT is usually a combination of a compiler and an interpreter. The JIT first turn intermediary data (e.g. Java bytecode) which it receives into machine language via interpretation. A JIT can often measures when a certain part of the code is executed often and the will compile this part for faster execution.

Just In Time Compiler (JIT) :
It compiles the java bytecodes into machine instructions of that specific CPU.
For example, if we have a loop statement in our java code :
while(i<10){
// ...
a=a+i;
// ...
}
The above loop code runs for 10 times if the value of i is 0.
It is not necessary to compile the bytecode for 10 times again and again as the same instruction is going to execute for 10 times. In that case, it is necessary to compile that code only once and the value can be changed for the required number of times. So, Just In Time (JIT) Compiler keeps track of such statements and methods (as said above before) and compiles such pieces of byte code into machine code for better performance.
Another similar example , is that a search for a pattern using "Regular Expression" in a list of strings/sentences.
JIT Compiler doesn't compile all the code to machine code. It compiles code that have a similar pattern at run time.
See this Oracle documentation on Understand JIT to read more.

You have code that is compliled into some IL (intermediate language). When you run your program, the computer doesn't understand this code. It only understands native code. So the JIT compiler compiles your IL into native code on the fly. It does this at the method level.

I know this is an old thread, but runtime optimization is another important part of JIT compilation that doesn't seemed to be discussed here. Basically, the JIT compiler can monitor the program as it runs to determine ways to improve execution. Then, it can make those changes on the fly - during runtime. Google JIT optimization (javaworld has a pretty good article about it.)

just-in-time (JIT) compilation, (also dynamic translation or run-time compilation), is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution.
IT compilation is a combination of the two traditional approaches to translation to machine code – ahead-of-time compilation (AOT), and interpretation – and combines some advantages and drawbacks of both. JIT compilation combines the speed of compiled code with the flexibility of interpretation.
Let's consider JIT used in JVM,
For example, the HotSpot JVM JIT compilers generate dynamic optimizations. In other words, they make optimization decisions while the Java application is running and generate high-performing native machine instructions targeted for the underlying system architecture.
When a method is chosen for compilation, the JVM feeds its bytecode to the Just-In-Time compiler (JIT). The JIT needs to understand the semantics and syntax of the bytecode before it can compile the method correctly. To help the JIT compiler analyze the method, its bytecode are first reformulated in an internal representation called trace trees, which resembles machine code more closely than bytecode. Analysis and optimizations are then performed on the trees of the method. At the end, the trees are translated into native code.
A trace tree is a data structure that is used in the runtime compilation of programming code. Trace trees are used in a type of 'just in time compiler' that traces code executing during hotspots and compiles it. Refer this.
Refer :
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
https://en.wikipedia.org/wiki/Just-in-time_compilation

A non-JIT compiler takes source code and transforms it into machine specific byte code at compile time. A JIT compiler takes machine agnostic byte code that was generated at compile time and transforms it into machine specific byte code at run time. The JIT compiler that Java uses is what allows a single binary to run on a multitude of platforms without modification.

Jit stands for just in time compiler
jit is a program that turns java byte code into instruction that can be sent directly to the processor.
Using the java just in time compiler (really a second compiler) at the particular system platform complies the bytecode into particular system code,once the code has been re-compiled by the jit complier ,it will usually run more quickly in the computer.
The just-in-time compiler comes with the virtual machine and is used optionally. It compiles the bytecode into platform-specific executable code that is immediately executed.

20% of the byte code is used 80% of the time. The JIT compiler gets these stats and optimizes this 20% of the byte code to run faster by adding inline methods, removal of unused locks etc and also creating the bytecode specific to that machine. I am quoting from this article, I found it was handy. http://java.dzone.com/articles/just-time-compiler-jit-hotspot

Just In Time compiler also known as JIT compiler is used for
performance improvement in Java. It is enabled by default. It is
compilation done at execution time rather earlier.
Java has popularized the use of JIT compiler by including it in
JVM.

JIT refers to execution engine in few of JVM implementations, one that is faster but requires more memory,is a just-in-time compiler. In this scheme, the bytecodes of a method are compiled to native machine code the first time the method is invoked. The native machine code for the method is then cached, so it can be re-used the next time that same method is invoked.

JVM actually performs compilation steps during runtime for performance reasons. This means that Java doesn't have a clean compile-execution separation. It first does a so called static compilation from Java source code to bytecode. Then this bytecode is passed to the JVM for execution. But executing bytecode is slow so the JVM measures how often the bytecode is run and when it detects a "hotspot" of code that's run very frequently it performs dynamic compilation from bytecode to machinecode of the "hotspot" code (hotspot profiler). So effectively today Java programs are run by machinecode execution.

Related

How can a program on my CPU run the same way on another CPU?

Let's say that I would code a program with Windows API and then compile it. The code is compiled to machine code for the CPU to execute. Now, my question is: If I share the executable file for someone else with another instruction set in their CPU. How can their CPU run the code the same way and not give errors or run a different code?
someone else with another instruction set in their CPU
...
How can their CPU run the code the same way
The code won't run. The CPU's, simply put, speak another language.
You have two options
recompile your code for the target CPU (assuming you can use the same source language and no platform specific API, so you're left with C/C++ with stdlib)
Write a script / bytecode and use a runtime available for both platforms to interpret the script (or bytecode)
That's why there are Runtime installations such as JVM (for Java) and scripts (Python, Scala, Lua, JavaScript, etc) where the code is in a form of a script or as platform independent code.
And now - next step. If you're using Windows API, well - as the name suggests - it's API (services) provided by the Windows system. So even using the same CPU without the Windows system (e.g. on a Linux system), the application won't run. (ok, there is often a way how to expose Windows API on Linux, but it can be tricky sometimes).
Conclusion: Binaries are not portable between instruction sets, if you're using any high level API (Win32, ...), you're pretty much hooked to the operating system too
When high-level languages are compiled into executable, often they are compiled to intermediate code. This is a representation of the source code compiled closer to assembly language, however it is not specific to any CPU instruction set. It is up to the machine running the executable to interpret this intermediate code and run it in the CPU's native instruction set.

What makes DosBox on web.archive.org so slow despite JIT?

I wanted to see if I could play Exile 2 from web.archive.org and I found that I need to install it first, which takes ages. Given that I'm basically emulating x86 machine on an x86 computer, DosBox supports dynarec (dynamic recompilation) and contemporary browsers support JITing the JavaScript code (and Emscripten generates asm.js, which should be rather easy to JIT), what makes it all so slow? In other words, what could be the bottleneck?
Dosbox is compiled using Emterpreter, which makes it slower than the pure asmjs version:
The Emterpreter is an option that compiles asm.js output from Emscripten into a binary bytecode. It also generates an interpreter ("Emscripten interpreter", hence Emterpreter) capable of executing that bytecode. This lets you compile your project, or parts of your project, into bytecode that will be interpreted, as opposed to asm.js that will be executed directly by the JavaScript engine.
The second reason is, that the Dynamic recompilation in the emscripten port of dosbox is not available yet. It would be a lot of work, to make it possible to create asmjs code on the fly.

How compilation and linking at runtime is happening?

In a tutorial I've encountered a new concept (for me), that I never thought is possible. Actually, I thought that compilation is an entirely pre-run-time process. This is the phrase from tutorial: "Compile time occurs before link time (when the output of one or more compiled files are joined together) and runtime (when a program is executed). In some programming languages it may be necessary for some compilation and linking to occur at runtime".
My questions are:
Is pre-run-time compilation and linking processes absolutely different from run-time compilation and linking? If yes, please explain the main differences.
How are code sections that need to be compiled(linked) during run-time marked and where is that information kept? (This may be different from language to language, if possible, please give a specific example).
Thank you very much for your time!
Runtime compilation
The best (most well known) example I'm personally aware of is the just in time compilation used by Java. As you might know Java code is being compiled into bytecode which can be interpreted by the Java Virtual Machine. It's therefore different from let's say C++ which is first fully (preprocessed) compiled (and linked) into an executable which can be ran directly by the OS without any virtual machine.
The Java bytecode is instead interpreted by the VM, which maps them to processor specific instructions. That being said the JVM does JIT, which takes that bytecode and compiles it (during runtime) into machine code. Here we arrive at your second question. Even in Java it can depend on which JVM you are using but basically there are pieces of code called hotspots, the pieces of code that are run frequently and which might be compiled so the application's performance improves. This is done during runtime because the normal compiler does not have (or well might not have) all the necessary data to make a proper judgement which pieces of code are in fact ran frequently. Therefore JIT requires some kind of runtime statistics gathering, which is done parallel to the program execution and is done by the JVM. What kind of statistics are gathered, what can be optimised (compiled in runtime) etc. depends on the implementation (you obviously cannot do everything a normal compiler would do due to memory and time constraints - guess this partly answers the first question? you don't compile everything and usually only a limited set of optimisations are supported in runtime compilation). You can try looking for such info but from my experience usually it's very badly documented and hard to find (at least when it comes to official sources, not presentations/blogs etc.)
Runtime linking
Linker is a different pair of shoes. We cannot use the Java example anymore since it doesn't really have a linker like C or C++ (instead it has a classloader which takes care of loading files and putting it all together).
Usually linking is performed by a linker after the compilation step (static linking), this has pros (no dependencies) and cons (higher memory imprint as we cannot use a shared library, when the library number changes you need to recompile your sources).
Runtime linking (dynamic/late linking) is actually performed by the OS and it's the OS linker's job to first load shared libraries and then attach them to a running process. Furthermore there are also different types of dynamic linking: explicit and implicit. This has the benefit of not having to recompile the source when the version number changes since it's dynamic and library sharing but also drawbacks, what if you have different programs that use the same library but require different versions (look for DLL hell). So yes those two concepts are also quite different.
Again how it's all done, how it's decided what and how should be linked, is OS specific, for instance Microsoft has the dynamic-link library concept.

How Does An Operating System Introduces The C Language To Write The Kernel

Does this introduction occurs at the NTLDR stage because it must be introduce, I mean isn't the Kernel written in C? I thought a computers only "known-before" programming language was Assembly Language that is hard coded at the Microcode of the Processor?
The first operating systems were all written in assembly. The C Language was created because its first use case was the creation of UNIX. A C compiler was written to handle this code and produce the assembly that the system understands (compiler was written in assembly of course). The effect snowballs from there. We now have a more powerful system to write code so we can of course write better compilers and better software with a more high level approach and let the compiler do the work for us.
As far as Windows is concerned it was a rewrite of an operating system called QDOS which was written in C.
Sidenote: Operating systems still require assembly code to function as there are many hardware independent pieces of information required (for example CR2 read on page fault on x86). Bootloaders and BIOS (older ones) are written in assembly because they are very specific to the hardware and are required to setup things such as interrupts and the stack pointer.
C is a compiled language, as opposed to an interpreted language. C programs as well as the C runtime library are compiled into machine code, so they don't need any kind of runtime environment such as an interpreter or virtual machine to be loaded in order to execute.
The entry point of a compiled program (including a kernel) will call into its runtime library and perform any initialization required before executing the program, but this is all machine code.

Why is bytecode JIT compiled at execution time and not at installation time?

Compiling a program to bytecode instead of native code enables a certain level of portability, so long a fitting Virtual Machine exists.
But I'm kinda wondering, why delay the compilation? Why not simply compile the byte code when installing an application?
And if that is done, why not adopt it to languages that directly compile to native code? Compile them to an intermediate format, distribute a "JIT" compiler with the installer and compile it on the target machine.
The only thing I can think of is runtime optimization. That's about the only major thing that can't be done at installation time. Thoughts?
Often it is precompiled. Consider, for example, precompiling .NET code with NGEN.
One reason for not precompiling everything would be extensibility. Consider those languages which allow use of reflection to load additional code at runtime.
Some JIT Compilers (Java HotSpot, for example) use type feedback based inlining. They track which types are actually used in the program, and inline function calls based on the assumption that what they saw earlier is what they will see later. In order for this to work, they need to run the program through a number of iterations of its "hot loop" in order to know what types are used.
This optimization is totally unavailable at install time.
The bytecode has been compiled just as well as the C++ code has been compiled.
Also the JIT compiler, i.e. .NET and the Java runtimes are massive and dynamic; And you can't foresee in a program which parts the apps use so you need the entire runtime.
Also one has to realize that a language targeted to a virtual machine has very different design goals than a language targeted to bare metal.
Take C++ vs. Java.
C++ wouldn't work on a VM, In particular a lot of the C++ language design is geared towards RAII.
Java wouldn't work on bare metal for so many reasons. primitive types for one.
EDIT: As delnan points out correctly; JIT and similar technologies, though hugely benificial to bytecode performance, would likely not be available at install time. Also compiling for a VM is very different from compiling to native code.

Resources