C++ optimization - compilation

Can we see the optimized code as C++ (not assembly)?

No.
'Optimization' is something a compiler does to the intermediate representation and assembly, not to the source code.

Optimised code rarely translates back cleanly to the language it came from with the optimisations intact, since most optimisations happen on the intermediate representation or at the assembly level. Otherwise, the 'optimisations' are entirely artificial at the C++ (or other language) level, for example the difference between implementing an if statement with one instruction rather than another.
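To make the point concrete, here is a hand-written sketch of what "optimized code in C++" could look like for one simple case. An optimizer may replace a summation loop with a closed-form expression, but it does that on its internal representation and never hands you rewritten source. The second function below is my own approximation of such a result, not compiler output, and both function names are made up for this illustration.

```cpp
#include <cstdint>

// What the programmer writes: an O(n) loop summing 0, 1, ..., n-1.
std::uint64_t sum_below_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Hand-written approximation of what an optimizer may reduce the loop to:
// the closed form n * (n - 1) / 2. This is an illustration only; a real
// compiler performs the rewrite on its IR, not on the C++ source.
std::uint64_t sum_below_closed_form(std::uint64_t n) {
    if (n == 0) {
        return 0;
    }
    // Divide the even factor first so the result matches the loop even
    // when the arithmetic wraps around modulo 2^64.
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}
```

To see what your compiler really did with the loop, you still have to look at the generated assembly (or the IR, if your compiler can dump it).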

Related

Is there any performance difference between macros and functions in Rust?

In Rust, macros are executed at compile time. They generally expand
into new pieces of code that the compiler will then need to further
process.
But is there any performance difference between normal functions and macros, either before or after they are compiled?

I assume you're talking about runtime performance. In terms of compile time, macros are usually slower, as they are compiled for each invocation.
Macros are like #[inline(always)] functions. This can be good or bad for performance, depending on many factors such as the number of calls to the code, code size, and instruction cache pressure. Always benchmark before making a decision.
If you can use a function, prefer that. It can always be marked #[inline(always)] if that proves good for performance, and it gives you more familiar syntax and faster compile times.

Do compilers take the "status quo" when optimizations produced worse results?

To my knowledge, using optimizations carries a risk of the "maybe it will be worse" case (i.e. performance is degraded, or code size grows, or both). Are compilers able to detect such cases and return to the "status quo" (i.e. fall back to the original non-optimized code) when optimizations produce worse results? Can someone give, if possible, particular examples of what compilers (for example gcc, Clang (LLVM), etc.) do in this case?

In JIT compilers there is a thing called deoptimization. Normally the compiler optimizes heavily based on some assumptions, but during execution some of those assumptions may fail. For example, the compiler may assume the input of a function is always an integer and produce highly efficient code for integer manipulation; if the input is suddenly an array or a string, which does happen in dynamic languages, the code has to revert to a non-optimized version. See V8's TurboFan speculative optimizer for an example.
For non-JIT compilers there is no way to deoptimize at runtime, but the compiler may create multiple execution paths. Your question is not fully logical, though: how would the compiler know that it created suboptimal code? It can only use the same algorithm it used to do the optimization in the first place. That's probably why you are being downvoted.

Algorithms that can only be written in assembly

Any algorithm you can implement in a HLL you can implement in assembly. On the other hand, there are many algorithms you can implement in assembly which you cannot implement in a HLL. - Randall Hyde
I found this statement in the foreword to a book on assembly. The book is here: https://courses.engr.illinois.edu/ece390/books/artofasm/fwd/fwd.html#109
Does anyone know an example of this type of algorithm?

It's plain wrong.
You can implement any algorithm (in the CS sense of the word) in any Turing-complete programming language.
On the other hand, if he had said something like "Some algorithms can be implemented very efficiently, and with ease, in assembly, much more so than is possible in most high-level programming languages", then his statement would have made sense...
Interesting text though....

There is a sense in which it is trivially false: in the worst case, you could write an emulator in the HLL and then run the algorithm in there. But that's cheating a bit because now the HLL does not directly implement the algorithm.
A concrete example of what many HLL's can't do (or maybe they can in practice, but it is not guaranteed that they can) is directly implementing an XOR linked list. In many languages you just cannot XOR pointers, and/or it wouldn't make sense even if you could (consider garbage collection). Of course you can refer to every node by an integer ID and XOR those, but that's a workaround, not a direct implementation.
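C++ is one of the languages where this works in practice. Here is a minimal sketch of an XOR linked list, assuming the implementation provides std::uintptr_t and that a pointer converted to an integer and back yields the same pointer (true on mainstream platforms, but exactly the "not guaranteed everywhere" caveat above). The Node layout and helper names are my own.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Node {
    int value;
    std::uintptr_t link;  // address of previous node XOR address of next node
};

// Convert a node pointer (possibly null) to its integer address.
std::uintptr_t addr(const Node* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Link the nodes of a vector into an XOR list, in vector order.
// The vector must not be resized afterwards, or the addresses go stale.
void link_nodes(std::vector<Node>& nodes) {
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const Node* prev = (i > 0) ? &nodes[i - 1] : nullptr;
        const Node* next = (i + 1 < nodes.size()) ? &nodes[i + 1] : nullptr;
        nodes[i].link = addr(prev) ^ addr(next);
    }
}

// Walk the list from either end; the direction depends only on the start.
std::vector<int> traverse(Node* start) {
    std::vector<int> out;
    Node* prev = nullptr;
    Node* cur = start;
    while (cur != nullptr) {
        out.push_back(cur->value);
        // XOR-ing out the node we came from leaves the node we go to.
        Node* next = reinterpret_cast<Node*>(cur->link ^ addr(prev));
        prev = cur;
        cur = next;
    }
    return out;
}
```

Starting traverse at the first node walks forward, and starting at the last node walks backward, with only one link field per node.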
HLL's often have trouble implementing unstructured control flow, though many (particularly older) languages offer a goto. That means you may have to jump through hoops to implement a state machine (using a switch in a loop or whatever), instead of letting the state be implied by the program counter.
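For reference, the "switch in a loop" workaround looks roughly like this in C++. It is a small sketch of a two-state word counter; the state lives in an explicit variable instead of being implied by where execution currently is. The enum and function names are invented for the example, and std::string_view assumes C++17.

```cpp
#include <cstddef>
#include <string_view>

enum class State { Between, InWord };

// Count whitespace-separated words with an explicit state variable.
std::size_t count_words(std::string_view text) {
    State state = State::Between;
    std::size_t words = 0;
    for (char c : text) {
        const bool space = (c == ' ' || c == '\t' || c == '\n');
        switch (state) {
        case State::Between:
            if (!space) {          // first character of a new word
                ++words;
                state = State::InWord;
            }
            break;
        case State::InWord:
            if (space) {           // the word just ended
                state = State::Between;
            }
            break;
        }
    }
    return words;
}
```

In assembly each state could simply be a label, with jumps between labels instead of assignments to state.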
There are also many algorithms and data structures that rely on operations that don't exist in typical HLL's, for example popcnt or lzcnt, which can again be emulated, but then so can everything.
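As a sketch of what such emulation looks like, here are portable C++ versions of a population count and a leading-zero count for 32-bit values. The function names are mine. Note that newer C++ has closed this particular gap: C++20 added std::popcount and std::countl_zero in the <bit> header, which compilers can typically map to the hardware instructions where available.

```cpp
#include <cstdint>

// Emulated popcnt: number of set bits. Each iteration clears the lowest
// set bit, so the loop runs once per set bit.
int popcount_emulated(std::uint32_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

// Emulated lzcnt: number of zero bits above the highest set bit.
// Returns 32 for an input of zero.
int lzcnt_emulated(std::uint32_t x) {
    if (x == 0) {
        return 32;
    }
    int zeros = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++zeros;
    }
    return zeros;
}
```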

In case you have strict limitations in terms of memory and/or execution time, you might be forced to use assembly language.
High level languages typically require a run-time library which might be too big to fit into your program memory.
Think of a time-critical driver routine. An interrupt service routine for example. If there are only a few nanoseconds available for the routine, assembly language might be the only viable option.

How about this? You need to write some assembly code in order to access system registers and tables. But once the setup is done, no CPU instructions are executed (everything's done by the complex CPU exception handling mechanisms) and yet the thing is Turing-complete and can "run" programs.

What levels should static analyzers analyze?

I've noticed that some static analyzers operate on source code, while others operate on bytecode (e.g., FindBugs). I'm sure there are even some that work on object code.
My question is a simple one, what are the advantages and disadvantages of writing different kinds of static analyzers for different levels of analysis?
Under "static analyzers" I'm including linters, bug finders, and even full-blown verifiers.
And by levels of analysis I would include source code, high-level IRs, low-level IRs, bytecode, object code, and compiler plugins that have access to all phases.

These different facets can influence the level at which an analyzer may decide to work:
1. Designing a static analyzer is a lot of work. It would be a shame not to share this work across several languages compiled to the same bytecode, especially when the bytecode retains most of the structure of the source program: Java (FindBugs), .NET (various tools related to Code Contracts). In some cases, the common target language was made up for the purpose of analysis even though the compilation scheme wasn't following this path.
2. Related to 1, you may hope that your static analyzer will be a little less costly to write if it works on a normalized version of the program with a minimum number of constructs. When authoring static analyzers, having to write the treatment for repeat until when you have already written while do is a bother. You may structure your analyzer so that several functions are shared for these two cases, but the care-free way to handle this is to translate one to the other, or to translate the source to an intermediate language that only has one of them.
3. On the other hand, as already pointed out in Flash Sheridan's answer, source code contains the most information. For instance, in languages with fuzzy semantics, bugs at the source level may be removed by compilation. C and C++ have numerous "undefined behaviors" where the compiler is allowed to do anything, including generating a program that works accidentally. Fine, you might think, if the bug is not in the executable it's not a problematic bug. But if you ever recompile the program for another architecture or with the next version of the compiler, the bug may appear again. This is one reason for not doing the analysis after any phase that might potentially remove bugs.
4. Some properties can only be checked with reasonable precision on compiled code. That includes absence of compiler-introduced bugs, as pointed out again by Flash Sheridan, but also worst-case execution time. Similarly, many languages do not let you know what floating-point code does precisely unless you look at the assembly generated by the compiler (this is because existing hardware does not make it convenient for them to guarantee more). The choice is then to write an imprecise source-level analyzer that takes into account all possibilities, or to analyze precisely one particular compilation of a floating-point program, as long as it is understood that it is that precise assembly code that will be executed.
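The undefined-behavior point in the list above can be illustrated with a classic C++ case. The first function relies on signed overflow, which is undefined, so a compiler is allowed to assume x + 1 > x always holds and reduce the whole function to "return true". Whether a given compiler does so depends on version and flags, which is exactly why the bug can come and go between builds. Both function names are made up for this sketch.

```cpp
#include <limits>

// Buggy overflow check: for x == INT_MAX the expression x + 1 overflows,
// which is undefined behavior. An optimizer may assume that never happens
// and fold the comparison to true, so an analyzer looking only at the
// compiled code might see nothing suspicious.
bool can_increment_buggy(int x) {
    return x + 1 > x;
}

// Well-defined version: compare against the limit without overflowing.
bool can_increment(int x) {
    return x < std::numeric_limits<int>::max();
}
```

A source-level analyzer can flag the first function regardless of what any particular compiler happens to generate for it.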

Source code analysis is the most generally useful, of course; sometimes heuristics even need to analyze comments or formatting. But you’re right that even object code analysis can be necessary, e.g., to detect bugs introduced by GCC misfeatures. Thomas Reps, head of GrammaTech and a Wisconsin professor, gave a good talk on this at Stanford a couple of years ago: http://pages.cs.wisc.edu/~reps/#TOPLAS-WYSINWYX.

Assembly language and compiled languages

How is assembly faster than compiled languages if both are translated to machine code?
I'm talking about truly compiled languages which are translated to machine code. Not C# or Java which are compiled to an intermediate language first and then compiled to native code by a software interpreter, etc.
On Wikipedia, I found something which I'm not sure if it's in any way related to this. Is it because that translation from a higher level language generates extra machine code? Or is my understanding wrong?
A utility program called an assembler is used to translate assembly language statements into the target computer's machine code. The assembler performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. This is in contrast with high-level languages, in which a single statement generally results in many machine instructions.

Well, it relates a bit to your question, indeed. The point is that compilers produce inefficient machine code at times for various reasons, such as not being able to completely analyze your code, inserting automatic range checks, automatic checks for objects being null, etc.
On the other hand, if you write assembler code by hand and know what you're doing, then you can probably write some things much more efficiently than the compiler does, although the compiler's behavior may be tweaked and you can usually tell it not to do range checking, for example.
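In C++ the range-checking trade-off is visible right in the source, since the standard library offers both a checked and an unchecked element access. A small sketch (the function names are mine):

```cpp
#include <cstddef>
#include <vector>

// Checked access: at() throws std::out_of_range on a bad index, so every
// access carries a bounds test unless the optimizer can prove it redundant.
long sum_checked(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);
    }
    return total;
}

// Unchecked access: operator[] performs no bounds test in a normal build;
// an out-of-range index is undefined behavior instead of an exception.
long sum_unchecked(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}
```

In a loop this simple the compiler may well remove the redundant check on its own, so measure before assuming the unchecked version is faster.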
Most people, however, will not write better assembler code than a compiler, simply because compilers are written by people who know a good deal of really weird but really cool optimizations. Also things like loop unrolling are usually a pain to write yourself and make the resulting code faster in many cases.
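To show why unrolling by hand is a pain, here is a sketch of a sum loop unrolled four times in C++. The unroll factor of 4 and the function name are arbitrary choices for the illustration; a compiler picks the factor (and whether to unroll at all) based on the target.

```cpp
#include <cstddef>

// Sum n ints with the loop body unrolled four times. Four independent
// accumulators let the additions overlap instead of forming one long
// dependency chain.
long sum_unrolled(const int* data, std::size_t n) {
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    // Remainder loop for the last n % 4 elements; easy to get wrong by hand.
    for (; i < n; ++i) {
        s0 += data[i];
    }
    return s0 + s1 + s2 + s3;
}
```

The plain three-line loop is easier to read, and an optimizing compiler will often produce something like the above from it anyway.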
While it's generally true that everything that a computer executes is machine code, the code that runs differs greatly depending on how many abstraction levels you put between the machine and the programmer. For Assembler that's one level, for Java there are a few more ...
Also many people mistakenly believe that certain optimizations at a higher abstraction layer pay off at a lower one. This is not necessarily the case and the compiler may just have trouble understanding what you are trying to do and fail to properly optimize it.

Assembly may sometimes be faster than a compiled language if an assembly programmer writes better assembly than that generated by the compiler.
A compiled language is often faster than assembly because programmers who write compilers usually know the CPU architecture better than programmers who are utilizing assembly in a one-off, limited-case, situation.

An assembly expert may be able to write assembly code that is more effective (fewer instructions, more efficient instructions, SIMD, ...) than what a compiler generates automatically.
However, most of the time, you're better off trusting the optimizer of your compiler.
Learn what your compiler does. Then let the compiler do it.

My standard answer when questions about assembly vs. high-level come up is to take a look at Michael Abrash's Graphics Programming Black Book.
The first couple of chapters give a good idea of what you can optimise effectively using assembly, and what you can't.
You can download it from GameDev - Jeff's links seem to be broken now unfortunately.

All good answers. My only additional point is that programmers tend to write a certain number of lines of code per day, regardless of language. Since the advantage of a high-level language is that it lets you get more done with less code, it takes incredible programmer discipline to actually write less code.
This is especially an issue for performance, because performance matters almost nowhere except in a tiny part of the code. It only matters in your hotspots: code that (1) you write, (2) consumes a significant fraction of execution time, and (3) does so without calling functions.

First of all, compilers generate very good (fast) assembly code.
It's true that compilers can add extra code, since high-level languages have mechanisms like virtual methods and exceptions in C++ that require the compiler to produce more code. There are cases where raw assembly could speed up the code, but that's rare nowadays.

First, assembler should be used only in small pieces of code that eat most of the CPU time in a program (some kind of calculation, for example), i.e. in the bottleneck of the algorithm.
Secondly, whether the assembler implementation of the bottleneck code is faster depends on the ASM experience of whoever writes it. If experience is low, it will be slower, and it will contain a lot of bugs. If experience is high enough, ASM can give a significant gain.

How is assembly faster than compiled languages if both are translated to machine code?
The implicit assumption is hand-written assembly code. Of course, most compilers (e.g. GCC for C, C++, Fortran, Go, D etc...) are generating some assembler code; for example you might compile your foo.cc C++ source code with g++ -fverbose-asm -Wall -S -O2 -march=native foo.cc and look into the generated foo.s assembler code.
However, efficient assembler code is so difficult to write that, today, compilers can optimize better than humans do. See this.
So practically speaking, it is not worth coding in assembler (also, take into account that development efforts cost very often much more than the hardware running the compiled code). Even when performance matters a lot and is worth spending a lot of money, it is better to hand-code only very few routines in assembler, or even to embed some assembler code in some of your C routines.
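For illustration, embedding assembler in a C or C++ routine looks roughly like this with GCC's extended asm syntax (also accepted by Clang). This sketch assumes an x86 target and the default AT&T operand order, and it falls back to plain C++ elsewhere; the function name is made up, and a single add is of course not something worth hand-coding.

```cpp
#include <cstdint>

// Adds two 32-bit unsigned values, using one inline x86 instruction when
// built with GCC or Clang for x86, and ordinary C++ otherwise.
std::uint32_t add_u32(std::uint32_t a, std::uint32_t b) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // AT&T syntax: "addl src, dst" computes dst += src.
    // "+r"(a): a is read and written, kept in a register.
    // "r"(b):  b is a read-only register input.
    // "cc":    the instruction modifies the condition flags.
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;  // portable fallback
#endif
}
```

The rest of the routine stays in the high-level language, and the compiler handles register allocation around the asm statement based on the declared constraints.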
Look into the CppCon 2017 talk: Matt Godbolt “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”
