I'm wondering what consequences, if any, there are to having unused functions in code.
If you hunt down and remove all unused functions and variables, would there be any perceivable improvement in performance?
Or is it just good practice to remove unused functions and variables?
Unused functions can't harm performance, but they make the job harder for the people maintaining the code. Modern IDEs keep track of unused functions/methods and variables. If that is not the case for the technology you are talking about, maintainers will have to deal with the unused code, thinking it's necessary.
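To illustrate the kind of check an IDE or linter does, here is a minimal Python sketch. It assumes a single self-contained module and only compares defined function names against referenced names; the helper name find_unused_functions and the SAMPLE text are made up for this example, and real tools also handle imports, exports and dynamic lookups, which this does not.

```python
import ast

def find_unused_functions(source):
    """Return names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            referenced.add(node.id)       # plain references such as used()
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)     # attribute references such as obj.used()
    return defined - referenced

# Hypothetical module text, used only for the demonstration.
SAMPLE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

if __name__ == "__main__":
    print(sorted(find_unused_functions(SAMPLE)))
```

A name-based check like this can give false results (for example, a function that is only reached through getattr), so treat its output as a list of candidates to review, not as proof.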
Depending on your compiler/linker, it may have no cost at all (and even be removed automatically), or carry a small penalty because the code is bigger and causes more cache misses. But I'd expect it to be a very minor difference.
Edit: Removal cannot be done automatically when there is a chance that other code will call it, e.g. library code or another binary that can later be reused. It is also language dependent: if you write JavaScript, everything will get loaded and probably parsed, so the penalty will be much bigger than in compiled languages.
There is a security issue: if an attacker can control execution of your application (buffer overflow, cross-site scripting, etc.), code fragments in memory will make it easier for them to achieve something significant (especially true if the code fragments access privileged resources such as registry keys and files).
In most languages unused functions will not have any measurable performance impact on execution. Unused functions will affect the code/binary size. In JavaScript this affects the download time and some parsing time.
Unused variables might affect performance a little bit, since they take up some memory. But the overhead of an unused variable here and there is probably not measurable either.
The big benefit of removing unused code is better control during development. If you make a change, you don't need to go through lots of unused code to check whether it might be affected.
OCaml and Haskell warn you of unused functions/variables on the assumption that if you defined them it must have been for a reason, and not using them may indicate a typo somewhere else in the code (e.g. calling a similarly named function instead). For the benefit of this additional help, I try to avoid or at least comment out things that I don't use.
A good compiler will simply optimize away unused code, so there is no penalty at runtime.
Just a good practice.
Nearly every compiler/linker will skip unused code when compiling with optimizations turned on.
Unused functions will increase the compile time, but the final binary (or library) will not grow, because all unused symbols should be stripped.
As already mentioned, there is no run-time penalty.
Related
Would this:
module variables
! 100 variables declared here
end
Be significantly more computationally expensive when used everywhere than if there are 5 modules with 20 variables each and only some are called in different places?
In other words, does the use statement iterate through all the contents or does it simply give access in a more efficient way regardless of what the specific contents are?
Thanks!
Fortran programs are (nearly always) compiled, so declaring variables should have no overhead at runtime, except that they take up some memory. While this generally has no significant impact on the performance of programs, there are a few (very) rare pathological cases. For example, declared variables can affect the alignment in memory of other variables, and alignment can slightly affect the speed of loading variables from memory. How the variables are organized into modules does not introduce any additional overhead (as long as optimizations are enabled), since Fortran programs are compiled to monolithic binaries. I think it is a good idea not to worry about this unless you see regressions. Focus on readability and maintainability first.
I am implementing the DES encryption algorithm in C++, and I benchmark it on a very large (1.1 MB) plain-text document.
I have now got encryption down to about 1.1 sec, and I need to squeeze more performance out of it.
I was thinking of obfuscation, will that help in optimizing my code?
I think optimizing your code is the best way to optimize it:
Fix redundant code
Rethink the logic
Remove unused or trivial variables
Store commonly used values in variables to reduce redundant computation
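As a rough illustration of the last point (storing commonly used values), here is a minimal sketch. It is written in Python for brevity rather than in the asker's C++, and the parity computation is just a stand-in for any per-byte work; the names PARITY_TABLE, parity_recomputed and parity_with_table are made up for this example. The same idea applies to any value that is recomputed identically many times.

```python
# Precomputed once: parity (0 or 1) of every possible byte value.
PARITY_TABLE = [bin(value).count("1") & 1 for value in range(256)]

def parity_recomputed(data):
    """Parity of all bits in `data`, recounting the bits of every byte."""
    return sum(bin(byte).count("1") for byte in data) & 1

def parity_with_table(data):
    """Same result, but each byte costs a single table lookup."""
    table = PARITY_TABLE  # local alias avoids repeated global lookups
    return sum(table[byte] for byte in data) & 1

if __name__ == "__main__":
    sample = bytes(range(256)) * 4
    print(parity_recomputed(sample) == parity_with_table(sample))
```

Whether the table version is actually faster depends on the language, compiler and data, so measure before and after.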
Obfuscation makes code harder to read by:
Replacing variable names with underscores or single letters (compilers don't use variable names)
Removing whitespace to create a neutron star of unreadable text (compilers do this internally)
Removing comments (compilers don't read comments)
Sometimes adding useless code to further hinder readability (making your program run slower)
Well, you did not write what kind of obfuscation you have in mind (on a source code level?), but generally: no, it won't. In a language like JavaScript (or very old interpreted BASIC dialects), obfuscation and optimization sometimes go hand in hand (shortening variable names, deleting unnecessary whitespace/indentation, etc.), but not in a compiled language like C++.
Of course, sometimes some kind of misguided optimization will lead to obfuscated code, but that is a different thing.
C++ compilers nowadays are really REALLY smart. Major optimizations come at a macroscopic level. Even Blender's example, removing unused variables, is not needed, since the optimizer will remove them anyway.
Obfuscation doesn't make your code smarter, it doesn't change algorithms, it doesn't introduce dynamic programming, or anything of the sort.
I don't see why you would want that though. With compiled languages, you don't have to ship the source code, you can, if needed, ship headers and libraries, but those don't give away implementation details.
In a recent conversation with a fellow programmer, I asserted that "if you're writing the same code more than once, it's probably a good idea to refactor that functionality such that it can be called once from each of those places."
My fellow programmer buddy instead insisted that the performance impact of making these function calls was not acceptable.
Now, I'm not looking for validation of who was right. I'm simply curious to know if there are situations or patterns where I should consider the performance impact of a function call before refactoring.
"My fellow programmer buddy instead insisted that the performance impact of making these function calls was not acceptable."
...to which the proper answer is "Prove it."
The old saw about premature optimization applies here. Anyone who isn't familiar with it needs to be educated before they do any more harm.
IMHO, if you don't have the attitude that you'd rather spend a couple hours writing a routine that can be used for both than 10 seconds cutting and pasting code, you don't deserve to call yourself a coder.
Don't even consider the effect of calling overhead if the code isn't in a loop that's being called millions of times, in an area where the user is likely to notice the difference. Once you've met those conditions, go ahead and profile to see if your worries are justified.
Modern compilers of languages such as Java will inline certain function calls anyway. My opinion is that the design is far more important than the few instructions spent on a function call. The only situation I can think of would be writing some really fine-tuned code in assembler.
You need to ask yourself several questions:
Cost of time spent on optimizing code vs cost of throwing more hardware at it.
How does this impact maintainability?
How does going in either direction impact your deadline?
Does this really beg optimization when many modern compilers will do it for you anyway? Do not try to outsmart the compiler.
And of course, which will help you sleep better at night? :)
My bet is that there was a time in which the performance cost of a call to an external method or function WAS something to be concerned with, in the same way that the lengths of variable names and such all needed to be evaluated with respect to performance implications.
With the monumental increases in processor speed and memory resources in the last two decades, I propose that these concerns are no longer as pertinent as they once were.
We have been able to use long variable names without concern for some time, and the cost of a call to external code is probably negligible in most cases.
There might be exceptions. If you place a function call within a large loop, you may see some impact, depending upon the number of iterations.
I propose that in most cases you will find that refactoring code into discrete function calls will have a negligible impact. There might be occasions in which there IS an impact. However, proper TESTING of a refactoring will reveal this. In that minority of cases, your friend might be correct. For most of the rest of the time, I propose that your friend is clinging a little too closely to practices which pre-date most modern processors and storage media.
You care about function call overhead the same time you care about any other overhead: when your performance profiling tool indicates that it's a problem.
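A minimal sketch of "measure first", assuming Python and its standard timeit module; the functions add, sum_with_calls and sum_inlined are made up for this example. Python is interpreted, so the call overhead it shows is much larger than what an optimizing C++ or Java compiler would leave after inlining. The point is only the method: time both versions before deciding.

```python
import timeit

def add(a, b):
    return a + b

def sum_with_calls(n):
    """Refactored version: the addition lives in a helper function."""
    total = 0
    for i in range(n):
        total = add(total, i)
    return total

def sum_inlined(n):
    """Hand-inlined version: the same work with no helper call."""
    total = 0
    for i in range(n):
        total = total + i
    return total

if __name__ == "__main__":
    n = 100_000
    assert sum_with_calls(n) == sum_inlined(n)
    for fn in (sum_with_calls, sum_inlined):
        seconds = timeit.timeit(lambda: fn(n), number=20)
        print(f"{fn.__name__}: {seconds:.3f} s for 20 runs")
```

If the difference does not show up in a profile of the real program, the refactored version wins on maintainability.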
for the c/c++ family:
the 'cost' of the call is not important. if it needs to be fast, you just have to make sure the compiler is able to inline it. that means that:
the body must be visible to the compiler
the body is indeed small enough to be considered an inline candidate.
the method does not require dynamic dispatch
there are a few ways to break this default ability. for example:
huge instruction count already in the callsite. even with early inlining, the compiler may pop a trivial function out of line (even though it could generate more instructions/slower execution). early inlining is the compiler's ability to inline a function early on, when it sees the call costs more than the inline.
recursion
the inline keyword is more or less useless in this era, regarding its original intent. however, many compilers offer a means to restore the meaning, with a compiler specific directive. using this directive (correctly) helps considerably. learning how to use it correctly takes time. if in doubt, omit the directive and leave it up to the compiler.
assuming you are using a modern compiler, there is no excuse to avoid the function, unless you're also willing to go down to assembly for this particular program.
as it stands, and if performance is crucial, you really have two choices:
1) learn to write well organized programs for speed. downside: longer compile times
2) maintain a poorly written program
i prefer 1. any day.
(yes, i have spent a lot of time writing performance critical programs)
From what I have read, the Java compiler (usually) seems to compile Java to not very (if at all?) optimised Java bytecode, leaving it to the JIT to optimise. Is this true? And if it is, has there been any exploration (possibly in alternative implementations) of getting the compiler to optimise the code so the JIT has less work to do (is this possible)?
Also, many people seem to dislike native code generation (sometimes referred to as ahead-of-time compilation) for Java (and many other high-level, memory-managed languages), for reasons such as loss of portability, but also partly because (at least for languages that have a just-in-time compiler) the thinking goes that ahead-of-time compilation to machine code will miss optimisations that a JIT compiler can make, and may therefore be slower in the long run.
This leads me to wonder whether anyone has ever tried to implement http://en.wikipedia.org/wiki/Profile-guided_optimization (compiling to a binary + some extras then running the program and analysing the runtime information of the test run to generate a hopefully more optimised binary for real world usage) for java/(other memory managed languages) and how this would compare to jit code? Anyone have a clue?
Personally, I think the big difference is not between JIT compiling and AOT compiling, but between class-compilation and whole-program optimization.
When you run javac, it only looks at a single .java file, compiling it into a single .class file. All the interface implementations and virtual methods and overrides are checked for validity but left unresolved (because it's impossible to know the true method invocation targets without analyzing the whole program).
The JVM uses "runtime loading and linking" to assemble all of your classes into a coherent program (and any class in your program can invoke specialized behavior to change the default loading/linking behavior).
But then, at runtime, the JVM can remove the vast majority of virtual methods. It can inline all of your getters and setters, turning them into raw fields. And when those raw fields are inlined, it can perform constant-propagation to further optimize the code. (At runtime, there's no such thing as a private field.) And if there's only one thread running, the JVM can eliminate all synchronization primitives.
To make a long story short, there are a lot of optimizations that aren't possible without analyzing the whole program, and the best time for doing whole program analysis is at runtime.
Profile-guided optimization has some caveats, one of them mentioned even in the Wiki article you linked. Its results are valid
for the given samples, representing how your code is actually used by the user or other code.
for the given platform (CPU, memory + other hardware, OS, whatever).
From the performance point of view there are quite big differences even among platforms that are usually considered (more or less) the same (e.g. compare a single core, old Athlon with 512M with a 6 core Intel with 8G, running on Linux, but with very different kernel versions).
for the given JVM and its config.
If any of these change, then your profiling results (and the optimizations based on them) are not necessarily valid any more. Most likely some of the optimizations will still have a beneficial effect, but some of them may turn out suboptimal (or even degrade performance).
As was mentioned, JIT JVMs do something very similar to profiling, but they do it on the fly. It's also called 'hotspot', because it constantly monitors the executed code, looks for hot spots that are executed frequently, and tries to optimize only those parts. At this point it can exploit more knowledge about the code (its context, how it is used by other classes, etc.), so, as mentioned by you and the other answers, it can do better optimizations than a static compiler. It continues monitoring, and if it's needed it will do another round of optimization later, this time trying even harder (looking for more, and more expensive, optimizations).
Working on the real life data (usage statistics + platform + config) it can avoid the caveats mentioned before.
The price of it is some additional time it needs to spend on "profiling" + JIT-ing. Most of the time it's spent quite well.
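To make the "count invocations, then optimize the hot code" idea concrete, here is a toy model in Python. It is not how HotSpot is implemented: the class TieredFunction, its threshold, and the two sum-of-squares functions are all assumptions made up for this sketch, and the "optimized" version is supplied by hand instead of being generated by a compiler.

```python
class TieredFunction:
    """Toy model: run a baseline version until it has been called
    `threshold` times, then switch to an optimized version."""

    def __init__(self, baseline, optimized, threshold):
        self.baseline = baseline
        self.optimized = optimized
        self.threshold = threshold
        self.calls = 0          # the "profiling" data: an invocation counter
        self.promoted = False

    def __call__(self, *args):
        if self.promoted:
            return self.optimized(*args)
        self.calls += 1
        if self.calls >= self.threshold:
            self.promoted = True  # a real JIT would compile the hot code here
        return self.baseline(*args)

def sum_squares_loop(n):
    """Baseline: straightforward loop over 0..n-1."""
    return sum(i * i for i in range(n))

def sum_squares_formula(n):
    """'Optimized': closed form for the same sum."""
    return (n - 1) * n * (2 * n - 1) // 6

if __name__ == "__main__":
    sum_squares = TieredFunction(sum_squares_loop, sum_squares_formula, threshold=3)
    for _ in range(5):
        print(sum_squares(10), sum_squares.promoted)
```

The counter is the "additional time spent on profiling" mentioned above: every call before promotion pays for the bookkeeping, and the payoff only comes if the function really is hot.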
I guess a profile-guided optimizer could still compete with it (or even beat it), but only in some special cases, if you can avoid the caveats:
you are quite sure that your samples represent the real life scenario well and they won't change too much during execution.
you know your target platform quite precisely and can do the profiling on it.
and of course you know/control the JVM and its config.
It will happen rarely and I guess in general JIT will give you better results, but I have no evidence for it.
Another possibility for getting value from profile-guided optimization is if you target a JVM that can't do JIT optimization (I think most small devices have such a JVM).
BTW one disadvantage mentioned in other answers would be quite easy to avoid: if static/profile guided optimization is slow (which is probably the case) then do it only for releases (or RCs going to testers) or during nightly builds (where time does not matter so much).
I think the much bigger problem would be to have good sample test cases. Creating and maintaining them is usually not easy and takes a lot of time. Especially if you want to be able to execute them automatically, which would be quite essential in this case.
The official Java Hot Spot compiler does "adaptive optimisation" at runtime, which is essentially the same as the profile-guided optimisation you mentioned. This has been a feature of at least this particular Java implementation for a long time.
The trade-off to performing more static analysis or optimisation passes up-front at compile time is essentially the (ever-diminishing) returns you get from this extra effort against the time it takes for the compiler to run. A compiler like MLton (for Standard ML) is a whole-program optimising compiler with a lot of static checks. It produces very good code, but becomes very, very slow on medium-to-large programs, even on a fast system.
So the Java approach seems to be to use JIT and adaptive optimisation as much as possible, with the initial compilation pass just producing an acceptable valid binary. The absolute opposite end is to use an approach like that of something like MLKit, which does a lot of static inference of regions and memory behaviour.
I was thinking more about the programming language I am designing, and I was wondering: what are ways I could minimize its compile time?
Your main problem today is I/O. Your CPU is many times faster than main memory and memory is about 1000 times faster than accessing the hard disk.
So unless you do extensive optimizations to the source code, the CPU will spend most of the time waiting for data to be read or written.
Try these rules:
Design your compiler to work in several, independent steps. The goal is to be able to run each step in a different thread so you can utilize multi-core CPUs. It will also help to parallelize the whole compile process (i.e. compile more than one file at the same time)
It will also allow you to load many source files in advance and preprocess them so the actual compile step can work faster.
Try to allow files to be compiled independently. For example, create a "missing symbol pool" for the project. Missing symbols should not cause compile failures as such: when a file references a symbol that has not been defined yet, add it to the pool, and when you later find its definition somewhere, remove it from the pool. When all files have been compiled, check that the pool is empty.
Create a cache with important information. For example: File X uses symbols from file Y. This way, you can skip compiling file Z (which doesn't reference anything in Y) when Y changes. If you want to go one step further, put all symbols which are defined anywhere in a pool. If a file changes in such a way that symbols are added/removed, you will know immediately which files are affected (without even opening them).
Compile in the background. Start a compiler process which checks the project directory for changes and compiles them as soon as the user saves the file. This way, you will only have to compile a few files each time instead of everything. In the long run, you will compile much more, but for the user, turnaround times will be much shorter (= the time the user has to wait until she can run the compiled result after a change).
Use a "Just in time" compiler (i.e. compile a file when it is used, for example in an import statement). Projects are then distributed in source form and compiled when run for the first time. Python does this. To make this perform, you can precompile the library during the installation of your compiler.
Don't use header files. Keep all information in a single place and generate header files from the source if you have to. Maybe keep the header files just in memory and never save them to disk.
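The dependency cache from the rules above could be sketched like this in Python. The function files_to_recompile, the `uses` mapping and the file names are made up for this example. The sketch also makes one assumption the rule does not state: it conservatively recompiles dependents of dependents too, whereas a finer-grained cache could stop as soon as a file's exported symbols turn out to be unchanged.

```python
def files_to_recompile(changed, uses):
    """`uses` maps each file to the set of files whose symbols it references.
    Returns the changed file plus every file that depends on it,
    directly or indirectly. Files not in the result can be skipped."""
    affected = {changed}
    worklist = [changed]
    while worklist:
        current = worklist.pop()
        for name, dependencies in uses.items():
            if current in dependencies and name not in affected:
                affected.add(name)
                worklist.append(name)
    return affected

if __name__ == "__main__":
    # Hypothetical project: x uses y, w uses x, z uses nothing.
    uses = {
        "x.src": {"y.src"},
        "y.src": set(),
        "z.src": set(),
        "w.src": {"x.src"},
    }
    print(sorted(files_to_recompile("y.src", uses)))
```

In a real compiler the `uses` mapping would be written out by each compile step and loaded from the cache on the next run, so deciding what to skip needs no source file to be opened.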
what are ways i could minimize its compile time?
No compilation (interpreted language)
Delayed (just in time) compilation
Incremental compilation
Precompiled header files
I've implemented a compiler myself, and ended up having to look at this once people started batch feeding it hundreds of source files. I was quite surprised by what I found out.
It turns out that the most important thing you can optimize is not your grammar. It's not your lexical analyzer or your parser either. Instead, the most important thing in terms of speed is the code that reads in your source files from disk. I/O's to disk are slow. Really slow. You can pretty much measure your compiler's speed by the number of disk I/Os it performs.
So it turns out that the absolute best thing you can do to speed up a compiler is to read the entire file into memory in one big I/O, do all your lexing, parsing, etc. from RAM, and then write out the result to disk in one big I/O.
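A minimal sketch of that structure in Python: request the whole source file in one read, lex it from the in-memory string, and write the result in one write. The token pattern and the names read_source, tokenize and compile_file are made up for this example, and the "compiler" just writes one token per line.

```python
import os
import re
import tempfile

# Toy token pattern: integers, identifiers, or any other non-space character.
TOKEN_RE = re.compile(r"\d+|[A-Za-z_]\w*|\S")

def read_source(path):
    """Ask for the whole file in one read call instead of line by line."""
    with open(path, "rb") as handle:
        return handle.read().decode("utf-8")

def tokenize(text):
    """Lex entirely from the in-memory string; whitespace is skipped."""
    return TOKEN_RE.findall(text)

def compile_file(source_path, output_path):
    """Toy 'compiler': one read, all work in RAM, one write."""
    tokens = tokenize(read_source(source_path))
    output = "\n".join(tokens) + "\n"
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write(output)
    return tokens

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "demo.src")
        out = os.path.join(workdir, "demo.out")
        with open(src, "w", encoding="utf-8") as handle:
            handle.write("x = 42 + y1\n")
        print(compile_file(src, out))
```

Building the whole output string before writing is what keeps the disk out of the inner loop; the lexer never touches a file handle.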
I talked with one of the head guys maintaining Gnat (GCC's Ada compiler) about this, and he told me that he actually used to put everything he could onto RAM disks so that even his file I/O was really just RAM reads and writes.
In most languages (pretty well everything other than C++), compiling individual compilation units is quite fast.
Binding/linking is often what's slow - the linker has to reference the whole program rather than just a single unit.
C++ suffers as - unless you use the pImpl idiom - it requires the implementation details of every object and all inline functions to compile client code.
Java (source to bytecode) suffers because the grammar doesn't differentiate objects and classes - you have to load the Foo class to see if Foo.Bar.Baz is the Baz field of object referenced by the Bar static field of the Foo class, or a static field of the Foo.Bar class. You can make the change in the source of the Foo class between the two, and not change the source of the client code, but still have to recompile the client code, as the bytecode differentiates between the two forms even though the syntax doesn't. AFAIK Python bytecode doesn't differentiate between the two - modules are true members of their parents.
C++ and C suffer if you include more headers than are required, as the preprocessor has to process each header many times, and the compiler compile them. Minimizing header size and complexity helps, suggesting better modularity would improve compilation time. It's not always possible to cache header compilation, as what definitions are present when the header is preprocessed can alter its semantics, and even syntax.
C suffers if you use the preprocessor a lot, but the actual compilation is fast; much of C code uses typedef struct _X* X_ptr to hide implementation better than C++ does - a C header can easily consist of typedefs and function declarations, giving better encapsulation.
So I'd suggest making your language hide implementation details from client code, and if you are an OO language with both instance members and namespaces, make the syntax for accessing the two unambiguous. Allow true modules, so client code only has to be aware of the interface rather than implementation details. Don't allow preprocessor macros or other variation mechanism to alter the semantics of referenced modules.
Here are some performance tricks that we've learned by measuring compilation speed and what affects it:
Write a two-pass compiler: characters to IR, IR to code. (It's easier to write a three-pass compiler that goes characters -> AST -> IR -> code, but it's not as fast.)
As a corollary, don't have an optimizer; it's hard to write a fast optimizer.
Consider generating bytecode instead of native machine code. The virtual machine for Lua is a good model.
Try a linear-scan register allocator or the simple register allocator that Fraser and Hanson used in lcc.
In a simple compiler, lexical analysis is often the greatest performance bottleneck. If you are writing C or C++ code, use re2c. If you're using another language (which you will find much more pleasant), read the paper about re2c and apply the lessons learned.
Generate code using maximal munch, or possibly iburg.
Surprisingly, the GNU assembler is a bottleneck in many compilers. If you can generate binary directly, do so. Or check out the New Jersey Machine-Code Toolkit.
As noted above, design your language to avoid anything like #include. Either use no interface files or precompile your interface files. This tactic dramatically reduces the burden on the lexer, which as I said is often the biggest bottleneck.
Here's a shot..
Use incremental compilation if your toolchain supports it.
(make, visual studio, etc).
For example, in GCC/make, if you have many files to compile, but only make changes in one file, then only that one file is compiled.
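The rule make applies can be sketched in a few lines of Python: rebuild a target only if it is missing or older than its source. This is a simplified assumption of the timestamp check (one source per target, no header dependencies), and the names needs_rebuild and stale_sources are made up for this example.

```python
import os
import tempfile

def needs_rebuild(source, target):
    """True if `target` does not exist or is older than `source`."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)

def stale_sources(pairs):
    """Given (source, target) pairs, return the sources that must be recompiled."""
    return [source for source, target in pairs if needs_rebuild(source, target)]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "a.src")
        target = os.path.join(workdir, "a.obj")
        open(source, "w").close()
        # The target has not been built yet, so the source is reported as stale.
        print(stale_sources([(source, target)]))
```

A language designer can make this check much more precise than timestamps, for example by comparing the set of exported symbols instead of the modification time.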
Eiffel had an idea of different states of frozen, and recompiling didn't necessarily mean that the whole class was recompiled.
How much can you break up the compilable modules, and how much do you care to keep track of them?
Make the grammar simple and unambiguous, and therefore quick and easy to parse.
Place strong restrictions on file inclusion.
Allow compilation without full information whenever possible (eg. predeclaration in C and C++).
One-pass compilation, if possible.
One thing surprisingly missing in the answers so far: make sure you're using a context-free grammar, etc. Have a good hard look at languages designed by Wirth such as Pascal & Modula-2. You don't have to reimplement Pascal, but the grammar design is custom made for fast compiling. Then see if you can find any old articles about the tricks Anders pulled implementing Turbo Pascal. Hint: table driven.
it depends on what language/platform you're programming for. for .NET development, minimise the number of projects that you have in your solution.
In the old days you could get dramatic speedups by setting up a RAM drive and compiling there. Don't know if this still holds true, though.
In C++ you could use distributed compilation with tools like Incredibuild
A simple one: make sure the compiler can natively take advantage of multi-core CPUs.
Make sure that everything can be compiled the first time you try to compile it. E.g. ban forward references.
Use a context free grammar so that you can find the correct parse tree without a symbol table.
Make sure that the semantics can be deduced from the syntax so you can construct the correct AST directly rather than by mucking with a parse tree and symbol table.
How serious a compiler is this?
Unless the syntax is pretty convoluted, the parser should be able to run no more than 10-100 times slower than just indexing through the input file characters.
Similarly, code generation should be limited by output formatting.
You shouldn't be hitting any performance issues unless you're doing a big, serious compiler, capable of handling mega-line apps with lots of header files.
Then you need to worry about precompiled headers, optimization passes, and linking.
I haven't seen much work done for minimizing the compile time. But some ideas do come to mind:
Keep the grammar simple. Convoluted grammar will increase your compile time.
Try making use of parallelism, either using multicore GPU or CPU.
Benchmark a modern compiler and see what the bottlenecks are and what you can do in your compiler/language to avoid them.
Unless you are writing a highly specialized language, compile time is not really an issue.
Make a build system that doesn't suck!
There's a huge amount of programs out there with maybe 3 source files that take under a second to compile, but before you get that far you'd have to sit through an automake script that takes about 2 minutes checking things like the size of an int. And if you go to compile something else a minute later, it makes you sit through almost exactly the same set of tests.
So unless your compiler is doing awful things to the user like changing the size of its ints or changing basic function implementations between runs, just dump that info out to a file and let them get it in a second instead of 2 minutes.
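"Dump that info out to a file" could look like the following Python sketch. The cache file format (JSON), the function names probe_platform and load_config, and the two probed values are assumptions made for this example; the probes stand in for the slow configure-style checks.

```python
import json
import os
import struct
import tempfile

def probe_platform():
    """Stand-in for slow configure-style checks (size of an int, and so on)."""
    return {
        "sizeof_int": struct.calcsize("i"),
        "sizeof_pointer": struct.calcsize("P"),
    }

def load_config(cache_path, probe=probe_platform):
    """Return cached probe results if present; otherwise probe once and save."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as handle:
            return json.load(handle)
    config = probe()
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle)
    return config

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        cache = os.path.join(workdir, "config.cache.json")
        print(load_config(cache))  # first call runs the probes
        print(load_config(cache))  # second call only reads the cache file
```

The cache has to be invalidated when the compiler or platform actually changes, for example by storing the compiler version in the file and re-probing when it differs.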