What makes DosBox on web.archive.org so slow despite JIT? - performance

I wanted to see if I could play Exile 2 from web.archive.org, and I found that I need to install it first, which takes ages. Given that I'm basically emulating an x86 machine on an x86 computer, that DosBox supports dynarec (dynamic recompilation), and that contemporary browsers JIT-compile JavaScript (and Emscripten generates asm.js, which should be rather easy to JIT), what makes it all so slow? In other words, what could be the bottleneck?

DosBox is compiled using the Emterpreter, which makes it slower than a pure asm.js build:
The Emterpreter is an option that compiles asm.js output from Emscripten into a binary bytecode. It also generates an interpreter ("Emscripten interpreter", hence Emterpreter) capable of executing that bytecode. This lets you compile your project, or parts of your project, into bytecode that will be interpreted, as opposed to asm.js that will be executed directly by the JavaScript engine.
The second reason is that dynamic recompilation is not available in the Emscripten port of DosBox yet. It would be a lot of work to make it possible to generate asm.js code on the fly.

Related

Why does making/building C apps take a very long time?

Why does making a C/C++ app take so long compared to apps in other languages (Java, for example)?
I am trying to build Ubuntu Unity, and it takes about 4 minutes on my local machine.
I think generating the object files is the step that takes the most time.
Any advice?
If you want to speed up code generation you can use ccache. Also take a look at your GCC version, as older versions are known to lag behind; Clang also outperforms them considerably.
I'm not touching compilation speed because this is a HUGE topic, starting with the fact that C/C++ are fully compiled languages, while in Java you never compile to machine code; you just generate bytecode and leave everything else to the VM.

Why is bytecode JIT compiled at execution time and not at installation time?

Compiling a program to bytecode instead of native code enables a certain level of portability, as long as a fitting virtual machine exists.
But I'm kinda wondering, why delay the compilation? Why not simply compile the byte code when installing an application?
And if that is done, why not adopt it for languages that compile directly to native code? Compile them to an intermediate format, distribute a "JIT" compiler with the installer, and compile it on the target machine.
The only thing I can think of is runtime optimization. That's about the only major thing that can't be done at installation time. Thoughts?
Often it is precompiled. Consider, for example, precompiling .NET code with NGEN.
One reason for not precompiling everything would be extensibility. Consider those languages which allow use of reflection to load additional code at runtime.
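As a minimal Java sketch of that extensibility problem (the PluginLoader class and the plugin name are hypothetical, made up for illustration): the class to load is only known at run time, so no install-time compiler could have compiled it together with the rest of the program.

public class PluginLoader {
    public static void main(String[] args) throws Exception {
        // e.g. "com.example.FancyPlugin" (hypothetical), read from a config file or the command line
        String pluginClassName = args[0];

        // The class is located and loaded only now, at run time.
        Class<?> pluginClass = Class.forName(pluginClassName);
        Runnable plugin = (Runnable) pluginClass.getDeclaredConstructor().newInstance();
        plugin.run(); // code that was never visible at installation time
    }
}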
Some JIT Compilers (Java HotSpot, for example) use type feedback based inlining. They track which types are actually used in the program, and inline function calls based on the assumption that what they saw earlier is what they will see later. In order for this to work, they need to run the program through a number of iterations of its "hot loop" in order to know what types are used.
This optimization is totally unavailable at install time.
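As a rough Java illustration of the kind of call site this targets (Shape, Circle and the loop are invented for this example; whether HotSpot really inlines here depends on its heuristics), consider a virtual call that only ever sees one receiver type at run time:

interface Shape {
    double area();
}

class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class TypeFeedbackDemo {
    static double totalArea(Shape s) {
        // Virtual call: statically it could dispatch to any Shape, but the
        // runtime type profile only ever records Circle here, so the JIT can
        // inline Circle.area() and guard it with a cheap type check
        // (deoptimizing if a different type ever shows up).
        return s.area();
    }

    public static void main(String[] args) {
        double total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            total += totalArea(new Circle(i % 10)); // only Circle is ever observed
        }
        System.out.println(total);
    }
}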
The bytecode has been compiled just as well as the C++ code has been compiled.
Also, the JIT runtimes (i.e. .NET and Java) are massive and dynamic, and you can't foresee in a program which parts the app will use, so you need the entire runtime.
Also one has to realize that a language targeted to a virtual machine has very different design goals than a language targeted to bare metal.
Take C++ vs. Java.
C++ wouldn't work on a VM; in particular, a lot of the C++ language design is geared towards RAII.
Java wouldn't work on bare metal for many reasons: primitive types, for one.
EDIT: As delnan correctly points out, JIT and similar technologies, though hugely beneficial to bytecode performance, would likely not be available at install time. Also, compiling for a VM is very different from compiling to native code.

Portable scripting language for a multi-server admin?

Please Note: Portable as in portableapps.com, not in the traditional sense of a language that can be used on multiple architectures or operating systems. Whoever coined this usage of the word portable should be whacked. :)
I'm a DBA and sysadmin, mostly for Windows machines running SQL Server. I'm looking for a programming/scripting language for Windows that doesn't require Admin access or an installer, needing no install process other than expanding it into a folder. My intent is to have a language for automation around which I can standardize.
Up to this point, I've been using a combination of batch files and Unix shell, using sh.exe from UnxUtils but it's far from a perfect solution.
I've evaluated a handful of options; all of them have at least one serious shortcoming or another. I have a strong preference for something open source or dual licensed, but I'm more interested in finding the right tool than anything else. I'm not interested in anything that relies on Cygwin or Java, but at this point I'd be fine with something that needs .NET.
Requirements:
Manageable footprint (1-100 files, under 30 MB installed)
Run on Windows XP and Server (2003+)
No installer (exe, msi)
No reliance on a JVM or Cygwin install
Works with external pipes, processes, and files
Support for MS SQL Server or ODBC connections
Bonus Points:
Open Source
FFI for calling functions in native DLLs
GUI support (native or gtk, wx, fltk, etc)
Linux, AIX, and/or OS X support
Dynamic, object oriented and/or functional, interpreted or bytecode compiled; interactive development
Able to package or compile scripts into executables
So far I've tried:
Ruby: 148 MB on disk, 23000 files
Portable Python: 54 MB on disk, 2800 files
Strawberry Perl: 123 MB on disk, 3600 files
REBOL: Great, except closed source and no MSSQL or ODBC in free version
Squeak Smalltalk: Great, except poor support for scripting
I urge you to try Lua. Regarding your requirements:
Tiny footprint (56 source files, under 150K compiled)
Runs everywhere (uses only ANSI C)
No installer needed; you compile from source (there's also a "batteries included" package that I haven't explored)
Doesn't need JVM and works with any ANSI C compiler, so you can compile with Visual Studio, not Cygwin
Works with external processes and files but only to the extent supported by ANSI C. If POSIX popen is provided then that is supported also.
And your bonus points:
Open source (MIT license)
FFI to C is brilliantly conceived and executed—not quite as simple as Tcl but loads more powerful. Much better integration with C than Python or Ruby.
GUI support is mixed, but there are good bindings for wxWidgets. Qt support was there at one time but I don't know if it has been maintained.
Linux is supported
Language/compiler features:
Dynamic
Functional
Prototype-based objects and inheritance through metamethods (you'll want to see examples in the book below)
Fastest bytecode compiler in the West
Interactive read-eval-print loop; load new code dynamically
Able to package scripts into executables; either use Luiz de Figueiredo's srlua, or I can send you a 120-line Lua script that converts Lua source to a .c file that you link in with your app and the interpreter to make an executable.
Additional bonus points:
Very crisp, clean, well-designed language.
Small enough to master in its entirety and to be productive within a day.
Superb book Programming in Lua (check out the previous edition free online)
There are a couple of options for Python that might fit your bill:
The first is IronPython, which can be run without an installer and will play nicely with .NET APIs. This gives you access to anything with a .NET API or a COM typelib that you could build a PIA for. I've used it as a scripting mechanism for precisely this reason - it could be dropped into a directory within the system and did not need to be explicitly installed. You will have to have an appropriate .NET runtime installed, but .NET 2.0 is installed with SQL Server 2005. SQL Server can be accessed through ADO.NET, and building GUIs with WinForms is fairly straightforward.
The second is Portable Python, which is designed to be run off a USB key. Although I see you've already tried it, you might elaborate on what the shortcomings were. If something isn't available in the basic install, you could always look into building a custom version with it included. TkInter (at least) is bundled. You can also use Py2EXE to generate standalone Python applications with all the superfluous junk stripped out. This will give you about 10 files or so (depending on the number of DLLs) that can be run from a single directory, possibly on a USB key.
Running local Python installs on Unix-oid OSes is pretty straightforward, so that's pretty much a no-brainer. Also, Python comes with most Linux distros and is available as 'contributed software' from most if not all traditional Unix vendors. IIRC it's also bundled with MacOS.
Tclkit is a single-file, self-contained Tcl/Tk system. The mac version I have is about 3.8 megs. You can get a version for just about any modern OS. I carry around a thumb drive that has mac, windows and linux binaries so I can run my scripts on any platform. No install is required, just copy one file wherever you want.
The only thing it's missing from your original spec is MS SQL Server / ODBC support out of the box. I know people use tcl for that but I think you'll have to add an extra library or something. See the Tcl'ers wiki entry on MS SQL Server for more information.
For tcl, apart from Tclkit, freewrap is another small portable, self-contained interpreter for tcl.
Just renaming the freewrap executable to something else converts it into a stand-alone interpreter. Renaming it back to freewrap converts it back into a script wrapper.
Also, freewrapped apps contain a tcl interpreter. In dire emergencies you can try opening the app as a zip file and edit/replace the tcl code contained within (just remember to make a copy first). This has saved me several times when I'm at a client site without development tools but need to troubleshoot something. I just make a copy of one of my deployed apps and presto - instant development environment!
Looking at Wikipedia's exhaustive list of portable software, there's the Tiny C Compiler (TCC), again on Wikipedia here, and its own homepage here.
To summarize by quoting from wikipedia's list of features:
Small - can compile and execute C code everywhere, for example on rescue disks (about 100KB for x86 TCC executable, including C preprocessor, C compiler, assembler and linker).
Fast - tcc generates optimized x86 code. No byte code overhead. It compiles, assembles and links about 9 times faster than GCC.
Any C dynamic library can be used directly. TCC is heading towards full ISO C99 compliance. TCC can of course compile itself.
Includes an optional memory and bound checker. Bound checked code can be mixed freely with standard code.
Compile and execute C source directly. No linking or assembly necessary. Full C preprocessor and GNU-like assembler included.
C script is supported: just add '#!/usr/local/bin/tcc -run' at the first line of your C source, and execute it directly from the command line.
With libtcc, you can use TCC as a backend for dynamic code generation.
Few dependencies. It includes its own hand-written lexer, and it is implemented using a recursive descent parser. Thus, building TCC requires few other libraries.
Its LGPL license permits anyone to use, modify, and/or redistribute the software, and it can be used to develop either open source or proprietary software.
Hope this helps and is of use,
Best regards,
Tom.
Every somewhat modern Windows version comes pre-installed with both VBScript and JScript. They don't meet all your requirements (compiling to an executable comes to mind), but they certainly have an unbeatable advantage with the installation size: it's hard to beat 0.
In addition to the Lua suggestion, there is also Idle. It is basically a superset of Lua 5.1, with both the language (and libraries) and the implementation based on Lua. It was originally created to be a more complete scripting solution for Windows: because Lua is primarily intended for embedding, it has a rather small standard library, and it is usually expected that the embedding application provides a rich library to Lua.
This makes sense for an embedded language because, after all, there isn't much common functionality between, say, Adobe Lightroom, Nginx and World of Warcraft, so there simply is nothing you can put in a standard library. But for a more general-purpose OS scripting language, one would want a slightly larger library. Thus, Idle bundles in its standard library a couple of libraries that are third-party (and sometimes hard to get to work on Windows) in Lua.
Some of the things that the Idle standard library adds over Lua are tight Win32 integration, SQLite3 support, networking support, a PEG parser generator and archive support.
Also, Idle has support for embedding Perl and C code into your Idle programs.

Is it possible to compile Ruby to byte code as with Python?

In Python, if I want to give out an application without sources, I can compile it into .pyc bytecode. Is there a way to do something like that in Ruby?
I wrote a much more detailed answer to this question in the question "Can Ruby, PHP, or Perl create a pre-compiled file for the code like Python?"
The answer is: it depends. The Ruby Language has no provisions for compiling to bytecode and/or running bytecode. It also has no specification of a bytecode format. The reason for this is simple: it would be much too restricting for language implementors if they were forced to use a specific bytecode format, or even bytecodes at all. For example, XRuby and JRuby compile to JVM bytecode, Ruby.NET and IronRuby compile to CIL bytecode, Cardinal compiles to PAST, SmallRuby compiles to Smalltalk/X bytecode, MagLev compiles to GemStone/S bytecode. For all of these implementations it would be plain stupid to use any other bytecode format than the one they currently use, since their whole point is interoperating with other language implementations that use the same bytecode format.
Similar for MacRuby: it compiles to native code, not bytecode. Again, using bytecode would be stupid, since one of the goals is to run Ruby on the iPhone, which pretty much requires native code.
And of course there is MRI, which is a pure AST-walking script interpreter and thus doesn't have a bytecode format.
That being said, there are some Ruby Implementations which allow compiling to and loading from bytecode. Rubinius allows that, for example. (Indeed, it has to have that functionality since its Ruby compiler is written in Ruby, and thus the compiler must be compiled to Rubinius bytecode first, in order to solve the Catch-22.)
YARV also can save and load bytecode, although the loading functionality is currently disabled until a bytecode verifier is implemented that prevents users from loading manipulated bytecode that could crash or otherwise subvert the interpreter.
But, of course, both of these have their own bytecode formats and don't understand each other's (nor tinyrb's or RubyGoLightly's or ...) Also, neither of those formats is understood by a JVM or a CLR and vice versa.
However, the whole point is irrelevant because, as Mark points out, you can always reverse engineer the byte code anyway, especially in cases like CPython, PyPy, Rubinius, YARV, tinyrb, RubyGoLightly, where the bytecode format was specifically designed to be very close to the source language.
In general it is simply impossible to protect code that way. The reason is simple: you want the machine to be able to execute the code. (Otherwise what's the point in writing it in the first place?) However, in order to execute the code, the machine must understand the code. Since machines are much dumber than humans, it follows that any code that can be understood by a machine can just as well be understood by a human, no matter whether that code happens to be in source form, bytecode, assembly, native code or a deck of punch cards.
There is only one workable technical solution: if you control the entire execution pipeline, i.e. build your own CPU, your own computer, your own operating system, your own compiler, your own interpreter, and so forth and use strong cryptography to protect all of those, then and only then might you be able to protect your code. However, as e.g. Microsoft found out the hard way with the XBox 360, even doing all of that and hiring some of the smartest cryptographers and mathematicians on the planet, doesn't guarantee success.
The only real solution is not a technical but a social one: as soon as you have written your code, it is automatically fully protected by copyright law, without you having to do one single thing. That's it. Your code is protected.
The short answer is "YES",
check rubini.us
It will solve your problem.
Here is how to compile ruby code:
http://rubini.us/2011/03/17/running-ruby-with-no-ruby/
Although Ruby 1.9's YARV VM is a byte-code compiler, I don't believe it can dump the byte-code to disk. You might want to look at the alternative compiler, Rubinius; I believe it has this ability. You should note, though, that byte-code .pyc files (and I imagine the Ruby equivalent) can be pretty easily "decompiled".
Not with the MRI interpreter, no.
Some newer VMs are being worked on where this is on the table, but these aren't widely used (or even ready to be used) at this point.
If you use JRuby, you can compile your Ruby code into Java .class files (including your Rails stuff) and execute them with (Open)JDK out of the box!
You can even compile your complete project into a .war file and deploy it on Apache Tomcat or JBoss with a tool called "warbler":
https://rubygems.org/gems/warbler/
Depends on your ruby.
JRuby - https://github.com/jruby/jruby/wiki/JRubyCompiler
MRuby - http://mruby.org/docs/articles/executing-ruby-code-with-mruby.html
MRI (C)Ruby - https://devtechnica.com/ruby-language/compile-ruby-code-to-binary-and-execute-it

What does a just-in-time (JIT) compiler do?

What does a JIT compiler specifically do as opposed to a non-JIT compiler? Can someone give a succinct and easy to understand description?
A JIT compiler runs after the program has started and compiles the code (usually bytecode or some kind of VM instructions) on the fly (or just-in-time, as it's called) into a form that's usually faster, typically the host CPU's native instruction set. A JIT has access to dynamic runtime information that a standard compiler doesn't, so it can make better optimizations, like inlining functions that are used frequently.
This is in contrast to a traditional compiler that compiles all the code to machine language before the program is first run.
To paraphrase, conventional compilers build the whole program as an EXE file BEFORE the first time you run it. For newer style programs, an assembly is generated with pseudocode (p-code). Only AFTER you execute the program on the OS (e.g., by double-clicking on its icon) will the (JIT) compiler kick in and generate machine code (m-code) that the Intel-based processor or whatever will understand.
In the beginning, a compiler was responsible for turning a high-level language (defined as higher level than assembler) into object code (machine instructions), which would then be linked (by a linker) into an executable.
At one point in the evolution of languages, compilers would compile a high-level language into pseudo-code, which would then be interpreted (by an interpreter) to run your program. This eliminated the object code and executables, and allowed these languages to be portable to multiple operating systems and hardware platforms. Pascal (which compiled to P-Code) was one of the first; Java and C# are more recent examples. Eventually the term P-Code was replaced with bytecode, since most of the pseudo-operations are a byte long.
A Just-In-Time (JIT) compiler is a feature of the run-time interpreter that, instead of interpreting bytecode every time a method is invoked, compiles the bytecode into the machine code instructions of the running machine and then invokes this object code instead. Ideally the efficiency of running object code will overcome the inefficiency of recompiling the program every time it runs.
JIT - Just In Time
The name itself says it: the code is compiled when it's needed (on demand).
Typical scenario:
The source code is completely converted into machine code.
JIT scenario:
The source code is converted into an assembly-language-like intermediate structure [for example, IL (Intermediate Language) for C#, bytecode for Java].
The intermediate code is converted into machine language only when the application needs it; that is, only the required code is converted to machine code.
JIT vs Non-JIT comparison:
In JIT, not all of the code is converted into machine code at first; only the part of the code that is necessary is converted. Then, if a called method or piece of functionality is not yet in machine code, it too is converted. This reduces the burden on the CPU.
Since the machine code is generated at run time, the JIT compiler produces machine code that is optimised for the running machine's CPU architecture.
JIT Examples:
In Java JIT is in JVM (Java Virtual Machine)
In C# it is in CLR (Common Language Runtime)
In Android it is in DVM (Dalvik Virtual Machine), or ART (Android RunTime) in newer versions.
As others have mentioned,
JIT stands for Just-in-Time which means that code gets compiled when it is needed, not before runtime.
Just to add a point to the above discussion: the JVM maintains a count of how many times a function is executed. If this count exceeds a predefined limit, the JIT compiles the code into machine language, which can be executed directly by the processor (unlike the normal case, in which javac compiles the code into bytecode and then java, the interpreter, interprets this bytecode line by line, converts it into machine code and executes it).
Also, the next time this function is called, the same compiled code is executed again, unlike normal interpretation, in which the code is interpreted line by line all over again. This makes execution faster.
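A quick way to watch this counting in action (a sketch, assuming a HotSpot JVM; -XX:+PrintCompilation and -XX:CompileThreshold are HotSpot-specific flags, and the exact output and thresholds vary between JVM versions) is to run a small hot method and look for the compilation messages:

// Run with:  java -XX:+PrintCompilation HotMethodDemo
// After enough invocations, a line mentioning HotMethodDemo::square should
// appear in the output, showing that the interpreter handed the method over
// to the JIT compiler.
public class HotMethodDemo {
    static long square(long x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 5_000_000; i++) {
            sum += square(i); // hot call: the invocation counter keeps climbing
        }
        System.out.println(sum);
    }
}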
The JIT compiler compiles the byte-code into equivalent native code only on first execution. Upon every successive execution, the JVM simply uses the already compiled native code to optimize performance.
Without a JIT compiler, the JVM interpreter translates the byte-code line-by-line to make it appear as if a native application is being executed.
Source
JIT stands for Just-in-Time which means that code gets compiled when it is needed, not before runtime.
This is beneficial because the compiler can generate code that is optimised for your particular machine. A static compiler, like your average C compiler, will compile all of the code into executable code on the developer's machine. Hence the compiler performs optimisations based on some assumptions. It can compile more slowly and do more optimisations because it is not slowing down execution of the program for the user.
After the byte code (which is architecture neutral) has been generated by the Java compiler, execution is handled by the JVM (in Java). The byte code is loaded into the JVM by the class loader, and then each byte instruction is interpreted.
When we need to call a method multiple times, we need to interpret the same code many times, and this may take more time than is needed. So we have JIT (just-in-time) compilers. When the bytecode has been loaded into the JVM (at run time), the frequently used code is compiled rather than interpreted, thus saving time.
JIT compilers work only during run time, so we do not have any binary output on disk.
A just-in-time compiler (JIT) is a piece of software which receives non-executable input and returns the appropriate machine code to be executed. For example:
Intermediate representation --(JIT)--> native machine code for the current CPU architecture
Java bytecode --(JIT)--> machine code
JavaScript (run with V8) --(JIT)--> machine code
The consequence of this is that for a certain CPU architecture the appropriate JIT compiler must be installed.
Difference between a compiler, an interpreter, and a JIT
Although there can be exceptions, in general, when we want to transform source code into machine code, we can use:
Compiler: Takes source code and returns an executable.
Interpreter: Executes the program instruction by instruction. It takes an executable segment of the source code and turns that segment into machine instructions. This process is repeated until all source code is transformed into machine instructions and executed.
JIT: Many different implementations of a JIT are possible; however, a JIT is usually a combination of a compiler and an interpreter. The JIT first executes the intermediate data it receives (e.g. Java bytecode) by interpreting it. A JIT often measures how frequently a certain part of the code is executed and will then compile that part for faster execution.
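As a toy sketch of that combination (this is not how a real JIT is implemented: the "compilation" is only simulated and the threshold is invented), a mixed-mode engine boils down to: interpret a method, count its invocations, and switch to a cached compiled version once it is hot.

import java.util.HashMap;
import java.util.Map;

// Toy sketch of the interpret-count-compile idea. A real JIT would emit
// native machine code for the hot method; here the "compiled" version is
// just cached and reused, which is enough to show the control flow.
public class MixedModeSketch {
    private static final int HOT_THRESHOLD = 3; // made-up threshold
    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Map<String, Runnable> compiledMethods = new HashMap<>();

    void invoke(String name, Runnable interpretedBody) {
        Runnable compiled = compiledMethods.get(name);
        if (compiled != null) {
            compiled.run();                    // fast path: already "compiled"
            return;
        }
        interpretedBody.run();                 // slow path: interpret
        int count = invocationCounts.merge(name, 1, Integer::sum);
        if (count >= HOT_THRESHOLD) {          // hot enough: "compile" and cache
            System.out.println("JIT-compiling " + name);
            compiledMethods.put(name, interpretedBody);
        }
    }

    public static void main(String[] args) {
        MixedModeSketch engine = new MixedModeSketch();
        for (int i = 0; i < 5; i++) {
            engine.invoke("greet", () -> System.out.println("hello"));
        }
    }
}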
Just In Time Compiler (JIT):
It compiles the Java bytecode into machine instructions for that specific CPU.
For example, if we have a loop statement in our Java code:
int i = 0;
int a = 0;
while (i < 10) {
    // ...
    a = a + i;
    // ...
    i++;
}
The body of the above loop runs 10 times when the value of i starts at 0.
It is not necessary to translate the bytecode 10 times over, since the same instructions execute all 10 times. It is enough to compile that code once and re-run it with the changing values as many times as required. So the Just In Time (JIT) compiler keeps track of such statements and methods (as said above) and compiles such pieces of bytecode into machine code for better performance.
Another similar example is a search for a pattern using a regular expression in a list of strings/sentences.
The JIT compiler doesn't compile all the code to machine code. It compiles code that has a similar, recurring pattern at run time.
See this Oracle documentation on Understand JIT to read more.
You have code that is compiled into some IL (intermediate language). When you run your program, the computer doesn't understand this code. It only understands native code. So the JIT compiler compiles your IL into native code on the fly. It does this at the method level.
I know this is an old thread, but runtime optimization is another important part of JIT compilation that doesn't seem to have been discussed here. Basically, the JIT compiler can monitor the program as it runs to determine ways to improve execution. Then, it can make those changes on the fly - during runtime. Google JIT optimization (JavaWorld has a pretty good article about it).
just-in-time (JIT) compilation, (also dynamic translation or run-time compilation), is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution.
JIT compilation is a combination of the two traditional approaches to translation to machine code – ahead-of-time compilation (AOT) and interpretation – and combines some advantages and drawbacks of both. JIT compilation combines the speed of compiled code with the flexibility of interpretation.
Let's consider the JIT used in the JVM:
For example, the HotSpot JVM JIT compilers generate dynamic optimizations. In other words, they make optimization decisions while the Java application is running and generate high-performing native machine instructions targeted for the underlying system architecture.
When a method is chosen for compilation, the JVM feeds its bytecode to the Just-In-Time compiler (JIT). The JIT needs to understand the semantics and syntax of the bytecode before it can compile the method correctly. To help the JIT compiler analyze the method, its bytecode is first reformulated into an internal representation called trace trees, which resembles machine code more closely than bytecode. Analysis and optimizations are then performed on the trees of the method. At the end, the trees are translated into native code.
A trace tree is a data structure that is used in the runtime compilation of programming code. Trace trees are used in a type of 'just-in-time compiler' that traces code executing during hotspots and compiles it. Refer to this.
Refer :
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
https://en.wikipedia.org/wiki/Just-in-time_compilation
A non-JIT compiler takes source code and transforms it into machine-specific code at compile time. A JIT compiler takes machine-agnostic bytecode that was generated at compile time and transforms it into machine-specific code at run time. The JIT compiler that Java uses is what allows a single binary to run on a multitude of platforms without modification.
JIT stands for just-in-time compiler.
The JIT is a program that turns Java bytecode into instructions that can be sent directly to the processor.
Using the Java just-in-time compiler (really a second compiler) on the particular system platform compiles the bytecode into that particular system's code; once the code has been re-compiled by the JIT compiler, it will usually run more quickly on the computer.
The just-in-time compiler comes with the virtual machine and is used optionally. It compiles the bytecode into platform-specific executable code that is immediately executed.
Roughly 20% of the bytecode is used 80% of the time. The JIT compiler gets these stats and optimizes this 20% of the bytecode to run faster by inlining methods, removing unused locks, etc., and also by generating code specific to that machine. I am quoting from this article; I found it handy. http://java.dzone.com/articles/just-time-compiler-jit-hotspot
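The snippet below sketches (with made-up classes) the kind of code shapes those optimizations target; whether a particular JVM actually inlines the accessor or elides the StringBuffer's internal locks depends on its inlining and escape-analysis heuristics.

public class HotPath {
    private int value;

    // A tiny accessor called from a hot loop: a classic inlining candidate,
    // which removes the call overhead entirely.
    int getValue() { return value; }

    // The StringBuffer never escapes this method, so an escape-analysis-aware
    // JIT may elide its internal synchronization (lock removal).
    static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a).append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        HotPath h = new HotPath();
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            h.value = i;
            sum += h.getValue(); // hot call site
        }
        System.out.println(sum + " " + join("foo", "bar"));
    }
}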
The Just In Time compiler, also known as the JIT compiler, is used for performance improvement in Java. It is enabled by default. It is compilation done at execution time rather than earlier.
Java has popularized the use of the JIT compiler by including it in the JVM.
JIT refers to the execution engine in a few JVM implementations; one that is faster but requires more memory is a just-in-time compiler. In this scheme, the bytecodes of a method are compiled to native machine code the first time the method is invoked. The native machine code for the method is then cached, so it can be re-used the next time that same method is invoked.
The JVM actually performs compilation steps during runtime for performance reasons. This means that Java doesn't have a clean compile-execute separation. It first does a so-called static compilation from Java source code to bytecode. Then this bytecode is passed to the JVM for execution. But executing bytecode is slow, so the JVM measures how often the bytecode is run, and when it detects a "hotspot" of code that's run very frequently, it performs dynamic compilation of that "hotspot" code from bytecode to machine code (hotspot profiling). So effectively, today, Java programs are run by machine-code execution.
