Bytecode Profiling Tools for understanding JVM languages - performance

I'm experimenting (with some friends) with JVM languages, such as Clojure and Scala. We recently found a functional solution to an algorithm that performed 30 times faster in Scala than in Java. With these functional languages, has anyone used a bytecode profiling tool to see what these functions become in JVM bytecode? What is the best tool to use for this purpose?
For that matter, as I'm just starting to look at bytecode instrumentation and profiling products, which tool is the best to use? I see recommendations on Stack Overflow, but I'm not sure whether they are suited to seeing what machine-level implementation differences exist between two pieces of code, or whether they are purely for code coverage, which is not my interest.

From the tool's perspective, JVM languages don't pose any different problem for profiling than plain Java bytecode. What you need is an understanding of how your fancy functional constructs work under the hood, so that you can properly read the results of your profiler.
YourKit or VisualVM will then be your friends again.
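To get a feel for what a functional construct becomes in bytecode, one option is to write the same computation twice and disassemble the result. The following is only a minimal sketch in Java: the class `BytecodeDemo` and its method names are hypothetical, chosen for illustration, and it does not reproduce the algorithm from the question.

```java
import java.util.stream.IntStream;

// Hypothetical demo class: the same computation written two ways,
// so the generated bytecode can be compared side by side.
final class BytecodeDemo {

    // Imperative version: compiles to a plain loop with local variables.
    static long sumOfSquaresLoop(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Functional version: the lambda body is compiled into a separate
    // method and linked at the call site via an invokedynamic instruction.
    static long sumOfSquaresStream(int n) {
        return IntStream.rangeClosed(1, n)
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresLoop(10));   // prints 385
        System.out.println(sumOfSquaresStream(10)); // prints 385
    }
}
```

After compiling with `javac BytecodeDemo.java`, running `javap -c -p BytecodeDemo` prints the disassembled bytecode of both methods, including private members. The same `javap` invocation works on class files produced by the Scala or Clojure compilers, since they are ordinary JVM class files.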

Related

Is it possible to use TurboFan as the backend for your programming language?

Can V8's code-generating backend be used in a third-party programming language, similar to the way LLVM is used? Is it "general enough" for that, and can you even separate the backend from V8?
I found this, but it does not help to answer my question:
https://github.com/v8/v8/wiki/TurboFan
V8 developer here. No, V8's compiler is not designed to be used as a stand-alone compiler. It is closely intertwined with the rest of the V8 runtime system, and very much tailored towards JavaScript.
Of course, many of the concepts in TurboFan are applicable to other compilers/languages too. If you have a couple of person-years of engineering time available, you could totally extend (or fork and adapt) it to support one or more other languages. But that would be a lot of work.

OpenCL programming in Charm++

Is it possible to run OpenCL through Charm++, while retaining the same fault tolerance and load balancing capabilities as for CPU or CUDA?
I did not explicitly see anything mentioned in the tutorials or the book.
Background: I'm one of the core developers of Charm++.
It's not clear whether you mean compiling OpenCL code to a Charm++-based parallel program, or calling kernels written in OpenCL from Charm++ code. Regardless, there is nothing explicitly implemented to support either of those cases at present.
Compiling OpenCL to Charm++ would be a large project. I don't know of anyone proposing to do such a thing, but it's not fundamentally implausible.
The research group behind Charm++, the Parallel Programming Laboratory, has looked at the possibility of implementing OpenCL support to match our offload support for CUDA-based accelerators. This would not be particularly hard. However, at present, the grant-funded projects that support our work have not asked for it. We would welcome contributions of code to do this. There's also the possibility that commercial development may lead to this getting implemented.

How do these two platforms compare in performance, namely staff-wsf and Wt?

How do these two C++ web service frameworks compare in performance, namely staff-wsf and Wt (Witty)?
I have not done any benchmarking, but you can get an idea from the following:
Although implemented in C++, Wt’s main focus or novelty is not its
performance, but its focus on developing maintainable applications and
its extensive library of built-in widgets. But because it is popular
and widely used in embedded systems, you will find that performance
and foot-print has been optimized too, by virtue of a no-nonsense API,
thoughtful architecture, and C++ …
(as given in the webtoolkit tutorial)

What are some compiled programming languages that compile fast?

I think I finally know what I want in a compiled programming language: a fast compiler. I get the feeling that this is a really superficial thing to care about, but some time after switching from Java to Scala I realized that being able to make a small change in code and immediately run the program is actually quite important to me. Besides Java and Go, I don't know of any languages that really value compile speed.
Delphi/Object Pascal. Make a change, press F9 and it runs - you don't even notice the compile time. A full rebuild of a fairly substantial project that we run takes on the order of 10-20 seconds, even on a fairly wimpy machine.
There's an open source variant available at www.freepascal.org. I've not messed with it but it reportedly is just as fast - it's the design of the Pascal language that allows this.
Java isn't fast at compiling. The feature you are looking for is probably hot replacement/redeployment while coding. Eclipse recompiles just the files you changed.
You could try some interpreted languages. They usually don't require compiling at all.
I wouldn't choose a language based on compilation speed...
Java is not the fastest compiler out there.
Pascal (and its close relatives) is designed to be fast - it can be compiled in a single pass. Objective Caml is known for its compilation speed (and there is a REPL too).
On the other hand, what you really need is a REPL, not fast recompilation and re-linking of everything. So you may want to try a language which supports incremental compilation. Clojure fits well (and it is built on top of the same JVM you're used to). Common Lisp is another option.
I'd like to add that there are official compilers for languages and unofficial ones made by different people. Obviously, because of this, performance varies from compiler to compiler.
If you were to talk just about the official compiler, I'd say it's probably Fortran. It's very old but it's still used in most science and engineering projects because it is one of the fastest languages. C and C++ are probably tied for second because they're also used in science and engineering.

Recommendations for Open Source Parallel programming IDE

What are the best IDEs / IDE plugins / tools, etc. for programming with CUDA / MPI etc.?
I've been working in these frameworks for a short while but feel like the IDE could be doing more heavy lifting in terms of scaling and job processing interactions.
(I usually use Eclipse or NetBeans, usually in C/C++ with occasional Java. It's a vague question, but I can't think of any more specific way to put it.)
This is not really an answer, but I feel so confined by the comment box ...
I do a fair amount of MPI programming, OpenMP too, but not CUDA and GPU stuff. I write mainly Fortran, some C++. I'm still using Emacs as my editor, and for the other things that Emacs does well. I use a separate parallel debugger (DDT, I've used TotalView in the past, more a question of which one is on the machine than which one I prefer) and a performance profiling tool called OPT (like DDT produced by Allinea Software).
I have looked, though not for a year or so, for plug-ins for NetBeans and Eclipse (former preferred, latter too Java-centric and too heavy these days) for parallel programming. What's out there is better for C++ than for Fortran. But I haven't yet come across any plug-in which has really made it far enough out of the research lab to be useful enough to make me change from the old ways.
I'll be as interested as you to see what other SOers recommend though right now it doesn't look very promising.

Resources