How do you measure the performance metrics of a program? - performance

I am investigating two different programming languages to compare how they perform relative to each other. I understand the basic differences between things like interpreted languages and static languages, but I want to generate numbers by which I can express those differences. Does anyone know which metrics I should be looking at specifically? And do I simply add extra code to my programs to report information like runtime, or are there tools out there that can do a better job?
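For the "add extra code" route, here is a minimal sketch in Python of collecting three common metrics: wall-clock time, CPU time, and peak memory. The `measure` helper and the `sum_of_squares` workload are made up for illustration; the same idea carries over to whatever languages are being compared.

```python
import time
import tracemalloc

def measure(func, *args, repeats=5):
    """Time func(*args) several times, then trace peak memory in a separate run."""
    wall_times, cpu_times = [], []
    for _ in range(repeats):
        wall_start = time.perf_counter()   # wall-clock time
        cpu_start = time.process_time()    # CPU time used by this process
        func(*args)
        cpu_times.append(time.process_time() - cpu_start)
        wall_times.append(time.perf_counter() - wall_start)

    # Memory is traced separately because tracing slows the code down.
    tracemalloc.start()
    result = func(*args)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "result": result,
        "best_wall_s": min(wall_times),
        "best_cpu_s": min(cpu_times),
        "peak_bytes": peak_bytes,
    }

def sum_of_squares(n):  # hypothetical workload, stands in for the real program
    return sum(i * i for i in range(n))

report = measure(sum_of_squares, 100_000)
print(report)
```

Taking the best of several runs reduces noise from other processes. External tools such as the Unix `time` command or a profiler report the same kind of numbers without touching the source.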

Related

Are programming languages converted into machine code by compilers?

If so, why do programs written in different languages have different execution speeds?
Simple answer: they don't produce the same machine code. They might produce different machine code which still produces the same side effects (same end result), but via different machine instructions.
Imagine you have two interpreters (let's say male and female just to distinguish them) to translate what you say into some other language. Both of them may translate what you say properly into the desired language, but they won't necessarily be equally efficient. One of them might feel the need to explain more of what you meant, one might be very terse and translate what you say in a very short and sweet way.
Performance doesn't just vary between languages. It also varies between compilers for the same programming language.
For example, with C, code compiled by Tiny-C can run roughly 2 to 3 times slower than the same code compiled by GCC.
And it's because even within the same programming language (C), GCC and Tiny-C don't produce identical machine instructions. In the case of Tiny-C, it was optimized to compile quickly, not to produce code that runs as quickly. For example, it doesn't make the best use of the fastest form of memory available to the machine (registers) and spills more data into the stack (which uses anything from L1 to DRAM depending on the access patterns). Because it doesn't bother to get so fancy with register allocation, Tiny-C can compile code quite quickly, but the resulting code isn't as efficient.
If you want a more in-depth answer, then you should study compiler design starting with the Dragon Book.
Though programs written in different languages are converted into machine code at the end of the day, different languages have different implementations for saying the same thing.
You can take an analogy from human languages: the English statement "I am coming home." translates to Chinese as 我要回家了。 The Chinese version is more concise, though that is not always the case; the same concept applies to programming languages.
So in the case of programming languages, a machine-code computation X may come out of programming language A as 2X-X and out of programming language B as X/2 + X/2. Executing X, 2X-X, or X/2 + X/2 gives the same result, though their performance won't be the same (this is a hypothetical example, but I hope it makes sense).
Basically, a program written in different programming languages to produce the same output is not guaranteed to compile to the same machine code, only to machine code that gives the same output; that is where the difference comes from.
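The 2X-X versus X/2 + X/2 idea can be made concrete with a small sketch. It uses CPython bytecode as a stand-in for machine code (an assumption made purely for illustration; real compilers emit native instructions), and the function names are hypothetical.

```python
import dis

def variant_a(x):
    return 2 * x - x          # "language A" spelling of X

def variant_b(x):
    return x / 2 + x / 2      # "language B" spelling of X

def instruction_list(func):
    """Return the (opcode name, argument) pairs CPython generated for func."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]

# Same result for the same input...
print(variant_a(8.0) == variant_b(8.0))
# ...but reached through different instruction sequences.
print(instruction_list(variant_a) != instruction_list(variant_b))

dis.dis(variant_a)
dis.dis(variant_b)
```

The two `dis.dis` listings show that the second variant executes more instructions to arrive at the same value, which is the kind of difference that shows up as a speed gap.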
But this will give you thorough info
Because 1) the compilers are written by different people so the machine code they generate is not the same, and 2) they make use of preexisting run-time libraries of routines to do math, input-output, memory management, and more, and those libraries are also not the same, for the same reason.
Some compilers do not generate machine code, because then the resulting code would not be portable to different machines, so instead they generate code for a fictitious general computer.
Then on any particular machine that code is either interpreted directly by an interpreter program, or it is translated into that machine's code, or a combination of these (look up just-in-time (JIT) compilers).
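As a rough sketch of "code for a fictitious general computer", here is a toy stack machine with three made-up opcodes (`PUSH`, `ADD`, `MUL`) and a tiny interpreter for it. Real virtual machines are far richer; this only shows the shape of the idea.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a simple stack machine."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# Portable "compiled" form of the expression (2 + 3) * 4.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(PROGRAM))  # prints 20
```

The same `PROGRAM` runs anywhere the interpreter runs, at the cost of the dispatch loop executing on every instruction. A JIT compiler removes that cost by translating hot sequences into the host machine's own code.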

Is there any scripting language that's fast, easy to embed, and well-suited for high-level game-programming?

First off, I'm aware that there are many questions related to this, but none of them seemed to help my specific situation. In particular, lua and python don't fit my needs as well as I could hope. It may be that no language with my requirements exists, but before coming to that conclusion it'd be nice to hear a few more opinions. :)
As you may have guessed, I need such a language for a game engine I'm trying to create. The purpose of this game engine is to provide a user with the basic tools for building a game, while still giving her the freedom of creating many different types of games.
For this reason, the scripting language should be able to handle game concepts intuitively. Among other things, it should be easy to define a variety of types, sub-type them with slightly different properties, query and modify objects dynamically, and so on.
Furthermore, it should be possible for the game developer to handle every situation they come across in the scripting language. While basic components like the renderer and networking would be implemented in C++, game-specific mechanisms such as rotating a few hundred objects around a planet will be handled in the scripting language. This means that the scripting language has to be insanely fast; 1/10 of C speed is probably the minimum.
Then there's the problem of debugging. When an error occurs, information about the function it occurred in, the stack trace, and the variable states should be accessible.
Last but not least, this is a project done by a single person. Even if I wanted to, I simply don't have the resources to spend weeks on just the glue code. Integrating the language with my project shouldn't be much harder than integrating lua.
Examining the two suggested languages, lua and python: lua is fast (luajit) and easy to integrate, but its standard debugging facilities seem to be lacking. What's even worse, lua by default has no type system at all. Of course you can implement that on your own, but the syntax will always be weird and unintuitive.
Python, on the other hand, is very comfortable to use and has a basic class system. However, it's not that easy to integrate, its paradigm doesn't really involve type-checking, and it's definitely not fast enough for more complex games. I'd again like to point out that everything would be done in python. I'm well aware that python would likely be fast enough for 90% of the code.
There's also Scala, which I haven't seen suggested so far. Scala seems to actually fulfill most of the requirements, but embedding the Java VM with C doesn't seem very easy, and it generally seems like Java expects you to build your application around Java rather than the other way around. I'm also not sure if Scala's functional paradigm would be good for intuitive game development.
EDIT: Please note that this question isn't about finding a solution at any cost. If there isn't any language better than lua, I will simply compromise and use that (I actually already have the thing linked into my program). I just want to make sure I'm not missing something that'd be more suitable before doing so, seeing as lua is far from the perfect solution for me.
You might consider mono. I only know of one success story for this approach, but it is a big one: C++ engine with mono scripting is the approach taken in Unity.
Try the Ring programming language
http://ring-lang.net
It's a general-purpose, multi-paradigm scripting language that can be embedded in C/C++ projects, extended using C/C++ code, and/or used as a standalone language. The supported programming paradigms are Imperative, Procedural, Object-Oriented, Functional, Meta programming, Declarative programming using nested structures, and Natural programming.
The language is simple, tries to be natural, encourages organization, and comes with a transparent implementation. It has a compact syntax and a group of features that enable the programmer to create natural interfaces and declarative domain-specific languages in a fraction of the time. It is very small and fast, and comes with a smart garbage collector that puts memory under the programmer's control. It supports many programming paradigms and comes with useful and practical libraries. The language is designed for productivity and for developing high-quality solutions that can scale.
The compiler and the virtual machine together are 15,000 lines of C code.
Embedding Ring Interpreter in C/C++ Programs
https://en.wikibooks.org/wiki/Ring/Lessons/Embedding_Ring_Interpreter_in_C/C%2B%2B_Programs
For embeddability, you might look into Tcl, or if you're into Scheme, check out SIOD or Guile. I would suggest Lua or Python in general, of course, but your question precludes them.
Since no one seems to know a combination better than lua/luajit, I think I will leave it at that. Thanks for everyone's input on this. I personally find lua to be very lacking as a high-level language for game programming, but it's probably the best choice out there. So to whoever finds this question and has the same requirements (fast, easy to use, easy to embed): you'll either have to use lua/luajit or make your own. :)

Assembly Analysis Tools

Does anyone have any suggestions for assembly file analysis tools? I'm attempting to analyze ARM/Thumb-2 ASM files generated by LLVM (or alternatively GCC) when passed the -S option. I'm particularly interested in instruction statistics at the basic block level, e.g. memory operation counts, etc. I may wind up rolling my own tool in Python, but was curious to see if there were any existing tools before I started.
Update: I've done a little searching, and found a good resource for disassembly tools / hex editors / etc here, but unfortunately it is mainly focused on x86 assembly, and also doesn't include any actual assembly file analyzers.
What you need is a tool for which you can define an assembly language syntax, and then build custom analyzers. Your analyzers might be simple ("how much space does an instruction take?") or complex ("How many cycles will this instruction take to execute?" [which depends on the preceding sequence of instructions and possibly a sophisticated model of the processor you care about]).
One designed specifically to do that is the New Jersey Machine-Code Toolkit. It is really designed to build code generators and debuggers. I suspect it would be good at "instruction byte count". It isn't clear it is good at more sophisticated analyses. And I believe it insists you follow its syntax style, rather than yours.
One not designed specifically to do that, but good at parsing and analyzing languages in general, is our DMS Software Reengineering Toolkit.
DMS can be given a grammar description for virtually any context-free language (that covers most assembly language syntax) and can then parse a specific instance of that grammar (assembly code) into ASTs for further processing. We've done this with several assembly languages, including the IBM 370, Motorola's 8-bit CPU line, and a rather peculiar DSP, without trouble.
You can specify an attribute grammar (computation over an AST) to DMS easily. These are a great way to encode analyses that need just local information, such as "How big is this instruction?". For more complex analyses, you'll need a processor model that is driven from a series of instructions; passing such a machine model the ASTs for individual instructions would be an easy way to apply a machine model to compute more complex things such as "How long does this instruction take?".
Other analyses, such as control flow and data flow, are provided in generic form by DMS. You can use an attribute evaluator to collect local facts ("control-next for this instruction is...", "data from this instruction flows to...") and feed them to the flow analyzers to compute global flow facts ("if I execute this instruction, what other instructions might be executed downstream?").
You do have to configure DMS for your particular (assembly) language. It is designed to be configured for tasks like these.
Yes, you can likely code all this in Python; after all, it's Turing-complete. But likely not nearly as easily.
An additional benefit: DMS is willing to apply transformations to your code, based on your analyses. So you could implement your optimizer with it, too. After all, you need to connect the analysis indicating the optimization is safe to the actual optimization steps.
I have written many disassemblers, including ARM and Thumb. Not production quality, but for the purposes of learning the assembler. For both ARM and Thumb, the ARM ARM (ARM Architecture Reference Manual) has a nice chart from which you can easily count up data operations from load/store, etc. Maybe an hour's worth of work, maybe two. At least up front, you would end up with data values being counted though.
The other poster may be right, as with the chart I am talking about it should be very simple to write a program to examine the ASCII looking for ldr, str, add, etc. No need to parse everything if you are interested in memory operations counts, etc. Of course the downside is that you are likely not going to be able to examine loops. One function may have a load and store, another may have a load and store but have it wrapped by a loop, causing many more memory operations once executed.
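A minimal sketch of that text-scanning approach in Python, under several simplifying assumptions that are mine: GNU assembler syntax with `@` comments, a label or the instruction after a branch starts a new basic block, calls (`bl`/`blx`) do not end a block, and loads/stores are recognized by mnemonic prefix. The function name and sample listing are hypothetical.

```python
import re
from collections import Counter

LOAD_PREFIXES = ("ldr", "ldm", "pop", "vldr", "vldm", "vpop")
STORE_PREFIXES = ("str", "stm", "push", "vstr", "vstm", "vpush")
CONDITIONS = "eq|ne|cs|hs|cc|lo|mi|pl|vs|vc|hi|ls|ge|lt|gt|le|al"
# b/bx with optional condition and width suffix, plus cbz/cbnz.
BRANCH = re.compile(rf"^(?:(?:bx|b)(?:{CONDITIONS})?(?:\.[nw])?|cbn?z)$")

def memory_ops_per_block(asm_text):
    """Return [(block label, counts)] with static instruction/load/store counts."""
    blocks = [("<entry>", Counter())]
    after_branch = False
    for raw_line in asm_text.splitlines():
        line = raw_line.split("@", 1)[0].strip()       # drop comments
        tokens = line.split(None, 1)
        if tokens and tokens[0].endswith(":"):         # a label opens a block
            blocks.append((tokens[0][:-1], Counter()))
            after_branch = False
            tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []
        if not tokens or tokens[0].startswith("."):    # blank line or directive
            continue
        mnemonic = tokens[0].lower()
        if after_branch:                               # unlabeled fall-through block
            blocks.append(("<fallthrough>", Counter()))
        counts = blocks[-1][1]
        counts["instructions"] += 1
        if mnemonic.startswith(LOAD_PREFIXES):
            counts["loads"] += 1
        elif mnemonic.startswith(STORE_PREFIXES):
            counts["stores"] += 1
        after_branch = bool(BRANCH.match(mnemonic))
    return [(label, dict(counts)) for label, counts in blocks if counts]

SAMPLE = """
    .text
sum:
    push {r4, lr}
    mov r2, #0
.Lloop:
    ldr r3, [r0], #4   @ load next element
    add r2, r2, r3
    subs r1, r1, #1
    bne .Lloop
    str r2, [r4]
    pop {r4, pc}
"""

for label, counts in memory_ops_per_block(SAMPLE):
    print(label, counts)
```

These are static counts: the single `ldr` in `.Lloop` is counted once no matter how many times the loop runs, and a multi-register `push` or `pop` counts as one operation. That is exactly the limitation described above.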
Not knowing what you really are interested in, my guess is you might want to simulate the code and count these sorts of things. I wrote a Thumb simulator (thumbulator) that attempts to do just that (and I have used it to compare llvm execution vs gcc execution when it comes to number of instructions executed, fetches, memory operations, etc.). The problem may be that it is Thumb only, no ARM and no Thumb2. Thumb2 could be added more easily than ARM. There exists an armulator from ARM, which is in the gdb sources among other places. I can't remember now if it executes Thumb2. My understanding is that when ARM was using it, it would accurately tell you these sorts of statistics.
You can plug your statistics into the LLVM code generator; it's quite flexible and it is already collecting some stats, which could be used as an example.

Is there a relation between static code analysis and application performance

My Question:
Performance tests are generally done after an application is integrated with various modules and ready for deployment.
Is there any way to identify performance bottlenecks during the development phase? Does code analysis give any hints about performance?
It all depends on rules that you run during code analysis but I don't think that you can prevent performance bottlenecks just by CA.
From my experience, performance problems are usually quite complicated, and to find the real problems you have to run performance tests.
No, except in very minor cases (e.g., for Java, use StringBuilder in a loop rather than string appends).
The reason is that you won't know how a particular piece of code will affect the application as a whole until you're running the whole application with a relevant dataset.
For example: changing bubblesort to quicksort wouldn't significantly affect your application if you're consistently sorting lists of a half-dozen elements. Or if you're running the sort once, in the middle of the night, and it doesn't delay other processing.
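To put rough numbers on the sorting example, here is a small sketch that times a bubble sort and a simple quicksort on a six-element list. Both implementations and the sample list are illustrative only, and the timings depend on the machine.

```python
import timeit

def bubble_sort(items):
    """O(n^2) sort: repeatedly swap adjacent out-of-order elements."""
    items = list(items)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

def quicksort(items):
    """O(n log n) on average: partition around the first element."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

SMALL = [5, 2, 9, 1, 7, 3]  # "a half-dozen elements"

for sort in (bubble_sort, quicksort):
    seconds = timeit.timeit(lambda: sort(SMALL), number=10_000)
    print(f"{sort.__name__}: {seconds:.4f} s for 10,000 sorts of 6 items")
```

Whichever one comes out ahead at this size, the per-sort cost is microseconds, so swapping algorithms only matters once the lists grow or the sort sits on a hot path.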
If we are talking .NET, then yes and no... FxCop (or built-in code analysis) has a number of rules in it that deal with performance concerns. However, this list is fairly short and limited in nature.
Having said that, there is no reason that FxCop could not be extended with a lot more rules (heuristic or otherwise) that catch potential problem areas and flag them. It's simply a fact that nobody (that I know of) has put significant work into this (yet).
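To give a feel for what such a heuristic rule looks like, here is a sketch using Python's `ast` module as a stand-in. It is not FxCop's rule API; the class name and the rule itself are hypothetical. It flags every `+=` inside a loop, which catches repeated string appends but also produces false positives on plain counters.

```python
import ast

class AddAssignInLoop(ast.NodeVisitor):
    """Hypothetical rule: flag `+=` inside a loop as a possible quadratic append."""

    def __init__(self):
        self.loop_depth = 0
        self.flagged_lines = []

    def _visit_loop(self, node):
        self.loop_depth += 1
        self.generic_visit(node)
        self.loop_depth -= 1

    # Treat both loop kinds the same way.
    visit_For = visit_While = _visit_loop

    def visit_AugAssign(self, node):
        if self.loop_depth > 0 and isinstance(node.op, ast.Add):
            self.flagged_lines.append(node.lineno)
        self.generic_visit(node)

SOURCE = '''
out = ""
for word in words:
    out += word
total = 0
total += 1
'''

rule = AddAssignInLoop()
rule.visit(ast.parse(SOURCE))
print(rule.flagged_lines)  # prints [4]
```

The rule can only say "this pattern is often slow". Whether line 4 actually matters still depends on how many words there are at run time, which is the point made in the other answers.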
Generally, no, although from experience I can look at a system I've never seen before and recognize some design approaches that are prone to performance problems:
How big is it, in terms of lines of code, or number of classes? This correlates strongly with performance problems caused by over-design.
How many layers of abstraction are there? Each layer is a chance to spend more cycles than necessary, and this effect compounds, especially if each operation is perceived as being "pretty efficient".
Are there separate data structures that need to be kept in agreement? If so, how is this done? If there is an attempt, through notifications, to keep the data structures tightly in sync, that is a red flag.
Of the categories of input information to the system, does some of it change at low frequency? If so, chances are it should be "compiled" rather than "interpreted". This can be a huge win both in performance and ease of development.
A common motif is this: Programmer A creates functions that wrap complex operations, like DB access to collect a good chunk of information. Programmer A considers this very useful to other programmers, and expects these functions to be used with a certain respect, not casually. Programmer B appreciates these powerful functions and uses them a lot because they get so much done with only a single line of code. (Programmers B and A can be the same person.) You can see how this causes performance problems, especially if distributed over multiple layers.
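The Programmer A / Programmer B motif can be sketched as follows. The data, the function names, and the call counter are all made up; `load_all_orders` stands in for an expensive database round trip.

```python
CALLS = {"count": 0}
ORDERS = {"alice": [30, 12], "bob": [7], "carol": [5, 5, 5]}  # hypothetical data

def load_all_orders():
    """Programmer A's helper: pretend this is a costly database fetch."""
    CALLS["count"] += 1
    return dict(ORDERS)

def total_for(customer):
    """Programmer B's one-liner: convenient, but hides a full fetch per call."""
    return sum(load_all_orders()[customer])

def totals_naive(customers):
    return {name: total_for(name) for name in customers}

def totals_batched(customers):
    orders = load_all_orders()  # fetch once, reuse for every customer
    return {name: sum(orders[name]) for name in customers}

CALLS["count"] = 0
naive = totals_naive(ORDERS)
naive_calls = CALLS["count"]

CALLS["count"] = 0
batched = totals_batched(ORDERS)
batched_calls = CALLS["count"]

print(naive == batched, naive_calls, batched_calls)  # prints True 3 1
```

Both versions return the same totals, but the naive one performs one fetch per customer. Stack a few layers of such helpers on top of each other and the fetch count multiplies.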
Those are the first things that come to mind.

Is it possible to perform arbitrary data analysis in Erlang?

I want to answer questions about data in Erlang: count things, correlate messages, provide arbitrary statistics. I had thought about resorting to Hadoop for this but is it possible to build a solution in raw Erlang to do rather arbitrary data analysis not necessarily via map/reduce but somehow? I have seen some hints of people doing this but no explicit blog posts or examples of this being done. I know that Powerset's natural language capabilities are written in Erlang. I also know about CouchDB but was looking for some other solutions.
Yes.
For general-purpose computation and statistics, Erlang works just fine. It isn't optimized heavily for such work, so it will have trouble keeping up with similar numeric code in, say, MATLAB, Fortran, or any of the major C packages for this work, but for most uses it will do just fine. And of course if your code parallelizes neatly and you have multiple CPUs available, Erlang will catch up more easily.
(You also mentioned the map/reduce pattern; it is relatively trivial given the Erlang/OTP runtime and libraries.)
I and my colleagues have written plenty of "raw" Erlang to do counting, statistics, and so on. We have found it to be more than sufficient for most tasks.

Resources