Using the llvm::DominatorTree class, you can find out whether one instruction dominates another. The necessary functions are documented here:
http://llvm.org/doxygen/classllvm_1_1DominatorTree.html
DominatorTree DT(Func);
...
DT.dominates(I1, I2);
However, the same functions are not available for the llvm::PostDominatorTree struct. In fact, the latter's doxygen page is almost empty:
http://llvm.org/doxygen/structllvm_1_1PostDominatorTree.html
Is there a way to check postdominance as easily as dominance in LLVM?
Most of PostDominatorTree's methods are inherited from DominatorTreeBase, including dominates. So this works the same as with llvm::DominatorTree.
You can find the doxygen documentation under "Public Member Functions inherited from llvm::DominatorTreeBase< NodeT, IsPostDom >".
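For illustration, a minimal sketch of using the inherited dominates on basic blocks (assuming a reasonably recent LLVM; F, BB1 and BB2 are placeholders for a function and two of its blocks you already hold):

#include "llvm/Analysis/PostDominators.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Sketch: does BB1 post-dominate BB2 in F?
bool postDominates(Function &F, BasicBlock *BB1, BasicBlock *BB2) {
  PostDominatorTree PDT;
  PDT.recalculate(F); // build the post-dominator tree for F

  // dominates() is inherited from DominatorTreeBase; on a post-dominator
  // tree it answers post-dominance queries.
  return PDT.dominates(BB1, BB2);
}

Inside a pass you would normally request the analysis (PostDominatorTreeWrapperPass in the legacy pass manager, PostDominatorTreeAnalysis in the new one) instead of recalculating it by hand.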
I have a function that I would like to provide an assembly implementation for
on amd64 architecture. For the sake of discussion let's just suppose it's an
Add function, but it's actually more complicated than this. I have the
assembly version working but my question concerns getting the godoc to display
correctly. I have a feeling this is currently impossible, but I wanted to seek
advice.
Some more details:
The assembly implementation of this function contains only a few
instructions. In particular, the mere cost of calling the function is a
significant part of the entire cost.
It makes use of special instructions (BMI2) and therefore can only be used
following a CPUID capability check.
The implementation is structured like this gist. At a high level:
In the generic (non-amd64) case the function is defined by delegating to
addGeneric.
In the amd64 case the function is actually a variable, initially set to
addGeneric but replaced by addAsm in the init function if a cpuid
check passes.
This approach works. However the godoc output is crappy because in the
amd64 case the function is actually a variable. Note godoc appears to be
picking up the same build tags as the machine it's running on. I'm not sure
what godoc.org would do.
Alternatives considered:
The Add function delegates to addImpl. Then we pull some similar trick
to replace addImpl in the amd64 case. The problem with this is (in my
experiments) Go doesn't seem to be able to inline the call, and the assembly
is now wrapped in two function calls. Since the assembly is so small already
this has a noticeable impact on performance.
In the amd64 case we define a plain function Add that has the useAsm
check inside it, and calls either addGeneric or addAsm depending on the
result. This would have an even worse impact on performance.
So I guess the questions are:
Is there a better way to structure the code to achieve the performance I
want, and have it appear properly in documentation?
If there is no alternative, is there some other way to "trick" godoc?
See math.Sqrt for an example of how to do this.
Write a stub function with the documentation.
Write a generic implementation as an unexported function.
For each architecture, write a function in assembler that jumps to the unexported generic implementation or implements the function directly.
To handle the cpuid check, set a package variable in init() and conditionally jump based on that variable in the assembly implementation.
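Putting those steps together, a minimal sketch of the layout (Add, addGeneric and useAsm are the names from the question; the assembly body is only illustrative, not the real BMI2 code, and golang.org/x/sys/cpu is just one way to do the capability check):

// add.go
package add

import "golang.org/x/sys/cpu"

// Add returns the sum of x and y.
// This documented stub has no Go body; the implementation lives in
// add_amd64.s (other architectures get their own .s stub that jumps
// straight to addGeneric, in the style of math.Sqrt).
func Add(x, y uint64) uint64

// addGeneric is the portable implementation the assembly can fall back to.
func addGeneric(x, y uint64) uint64 { return x + y }

// useAsm records the CPUID capability check done at startup.
var useAsm bool

func init() { useAsm = cpu.X86.HasBMI2 }

// add_amd64.s
#include "textflag.h"

// func Add(x, y uint64) uint64
TEXT ·Add(SB), NOSPLIT, $0-24
	CMPB ·useAsm(SB), $1
	JNE  generic
	// fast path: the real BMI2 sequence would go here
	MOVQ x+0(FP), AX
	ADDQ y+8(FP), AX
	MOVQ AX, ret+16(FP)
	RET
generic:
	JMP ·addGeneric(SB)

Because the exported Add is a real (documented) function with no Go body, godoc shows its signature and comment normally, and the useAsm dispatch stays inside the assembly instead of costing an extra Go-level call.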
In Julia, I most often see code written like fun(n::T) where T<:Integer, when the function works for all subtypes of Integer. But sometimes, I also see fun(n::Integer), which some guides claim is equivalent to the above, whereas others say it's less efficient because Julia doesn't specialize on the specific subtype unless the subtype T is explicitly referred to.
The latter form is obviously more convenient, and I'd like to be able to use that if possible, but are the two forms equivalent? If not, what are the practical differences between them?
Yes, Bogumił Kamiński is correct in his comment: f(n::T) where T<:Integer and f(n::Integer) will behave exactly the same, with the exception that the former method will have the name T already defined in its body. Of course, in the latter case you can just explicitly assign T = typeof(n) and it'll be computed at compile time.
There are a few other cases where using a TypeVar like this is crucially important, though, and it's probably worth calling them out:
f(::Array{T}) where T<:Integer is indeed very different from f(::Array{Integer}). This is the common parametric invariance gotcha (docs and another SO question about it).
f(::Type) will generate just one specialization for all DataTypes. Because types are so important to Julia, the Type type itself is special and allows parameterization like Type{Integer} to allow you to specify just the Integer type. You can use f(::Type{T}) where T<:Integer to require Julia to specialize on the exact type of Type it gets as an argument, allowing Integer or any subtypes thereof.
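A small sketch of both cases (the functions g and h are hypothetical, just to make the dispatch visible):

g(::Array{Integer}) = "matches only Vector{Integer} itself"
g(::Array{T}) where {T<:Integer} = "matches arrays of any concrete Integer subtype"

g([1, 2, 3])          # second method: Vector{Int} is not a subtype of Array{Integer}
g(Integer[1, 2, 3])   # first method: the element type is exactly Integer

h(::Type{T}) where {T<:Integer} = T   # specializes on the exact type object passed
h(Int)     # returns Int
h(UInt8)   # returns UInt8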
Both definitions are equivalent. Normally you will use the fun(n::Integer) form and apply fun(n::T) where T<:Integer only if you need to use the specific type T directly in your code. For example, consider the following definitions from Base (all following definitions are also from Base) where it has a natural use:
zero(::Type{T}) where {T<:Number} = convert(T,0)
or
(+)(x::T, y::T) where {T<:BitInteger} = add_int(x, y)
And even if you need type information, in many cases it is enough to use the typeof function. Again, an example definition is:
oftype(x, y) = convert(typeof(x), y)
Even if you are using a parametric type you can often avoid using a where clause (which is a bit verbose), as in:
median(r::AbstractRange{<:Real}) = mean(r)
because you do not care about the actual value of the parameter in the body of the function.
Now - if you are a Julia user like me - the question is how to convince yourself that this works as expected. There are the following methods:
you can check that one definition overwrites the other in the methods table (i.e. after evaluating both definitions only one method is present for this function);
you can check the code generated for both definitions using @code_typed, @code_warntype, @code_llvm or @code_native etc. and find out that it is the same;
finally, you can benchmark the code for performance using BenchmarkTools.
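For example, the checks could look like this (fun is a hypothetical function; @code_llvm comes from InteractiveUtils, which the REPL loads by default, and @btime from the BenchmarkTools package):

fun(n::Integer) = n + 1
fun(n::T) where {T<:Integer} = n + 1   # same signature: this overwrites the method above

length(methods(fun))    # 1 - only a single method remains

using InteractiveUtils
@code_llvm fun(3)       # identical generated code for either definition

using BenchmarkTools
@btime fun(3)           # and identical timings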
A nice plot explaining what Julia does with your code is here: http://slides.com/valentinchuravy/julia-parallelism#/1/1 (I also recommend the whole presentation to any Julia user - it is excellent). You can see on it that, after lowering the AST, Julia applies a type inference step to specialize the function call before the LLVM codegen step.
You can hint the Julia compiler to avoid specialization. This is done using the @nospecialize macro on Julia 0.7 (it is only a hint though).
If there is a function from an external package that is not used at all in my project, will the compiler remove the function from the generated machine code?
This question could be asked about any language's compiler in general, but I think the behaviour may vary from language to language, so I am interested in knowing what the Go compilers do.
I would appreciate any help on understanding this.
The language spec does not mention this anywhere, and from a correctness point of view this is irrelevant.
But know that the current version does remove certain constructs that the compiler can prove are not used and will not change the runtime behaviour of the app.
Quoting from The Go Blog: Smaller Go 1.7 binaries:
The second change is method pruning. Until 1.6, all methods on all used types were kept, even if some of the methods were never called. This is because they might be called through an interface, or called dynamically using the reflect package. Now the compiler discards any unexported methods that do not match an interface. Similarly the linker can discard other exported methods, those that are only accessible through reflection, if the corresponding reflection features are not used anywhere in the program. That change shrinks binaries by 5–20%.
Methods are a "harder" case than functions because methods can be listed and called with reflection (unlike functions), but the Go tools do what they can even to remove unused methods too.
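As a quick way to convince yourself (a hypothetical sketch; the same check works for symbols coming from an imported package): build a binary and look for the symbol with go tool nm.

package main

import "fmt"

// used is reachable from main, so it stays in the binary.
func used() string { return "hello" }

// unused is never referenced; the linker's dead-code elimination is free to
// drop it, and `go tool nm ./binary | grep unused` typically finds no symbol.
func unused() string { return fmt.Sprintf("%d", 42) }

func main() {
	fmt.Println(used())
}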
You can see examples and proof of removed / unlinked code in this answer:
How to remove unused code at compile time?
Also see other relevant questions:
Splitting client/server code
Call all functions with special prefix or suffix in Golang
In general one can implement typical type_traits using template techniques.
However, I cannot imagine how std::is_standard_layout could be implemented in these terms: http://en.cppreference.com/w/cpp/types/is_standard_layout
When I checked the gcc standard library, I found that it is implemented in terms of __is_standard_layout(T) which I could not find defined anywhere else. Is this a compiler magic function?
Would it be possible to implement std::is_standard_layout explicitly?
For example one of the conditions is that it inherits from a single class.
That seems to be impossible to determine at compile time.
No, std::is_standard_layout is not something you can implement without compiler intrinsics. As you've correctly pointed out, it needs more information than the C++ type system can express.
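For illustration, this is roughly how a library trait ends up looking on compilers that provide the built-in (GCC, Clang and MSVC all do); __is_standard_layout is evaluated by the compiler itself, not defined in any header:

#include <type_traits>

// A re-implementation in terms of the compiler intrinsic (sketch only).
template <typename T>
struct my_is_standard_layout
    : std::integral_constant<bool, __is_standard_layout(T)> {};

struct Plain   { int a; int b; };       // standard-layout
struct Virtual { virtual void f(); };   // not standard-layout

static_assert(my_is_standard_layout<Plain>::value, "Plain is standard-layout");
static_assert(!my_is_standard_layout<Virtual>::value, "Virtual is not");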
Background information (you do not need to repeat these steps to answer the question, this just gives some background):
I am trying to compile a rather large set of generated modules. These files are the output of a prototype Modelica to OCaml compiler and reflect the Modelica class structure of the Modelica Standard Library.
The main feature is the use of polymorphic, open recursion: Every method takes a this argument which contains the final superclass hierarchy. So for instance the model:
model A
  type T = Real
  type S = T
end A;
is translated into
let m_A = object
  method m_T this = m_Modelica_Real
  method m_S this = this#m_T this
end
and has to be closed before usage:
let _ = m_A#m_T m_A
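(For reference, a self-contained toy version of the pattern, with m_Modelica_Real stubbed out as a float so it can be pasted into the toplevel on its own:)

(* Stub for the Real type used by the generated code. *)
let m_Modelica_Real = 0.0

(* Open-recursive object: every method takes the final hierarchy as `this`. *)
let m_A = object
  method m_T _this = m_Modelica_Real
  method m_S this = this#m_T this
end

(* Closing the recursion: only here is the recursive object type resolved. *)
let _ = m_A#m_T m_A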
This seems to postpone a lot of typechecking until the superclass hierarchy is actually fixed, which in turn makes it impossible to compile the final linkage module (try ocamlbuild Linkage.cmo after editing the comments in the corresponding file to see what I mean).
Unfortunately, since the code base is rather large and uses a lot of objects, the type structure might not be the root cause after all; it might just as well be some optimization or a flaw in the code generation (although I strongly suspect the typechecker). So my question is: is there any way to profile the OCaml compiler in a way that signals when a certain phase (typechecking, intermediate code generation, optimization) is over and how long it took? Any further insights into my particular use case are also welcome.
As of right now, there isn't.
You can do it yourself though: the compiler sources are open, and you can get them and modify them to fit your needs.
Depending on whether you use ocamlc or ocamlopt, you'll need to modify either driver/compile.ml or driver/optcompile.ml to add timers to the compilation process.
Fortunately, this has already been done for you here. Just compile with the option -dtimings or the environment variable OCAMLPARAM=timings=1,_.
Even more easily, you can download the opam Flambda switch:
opam switch install 4.03.0+pr132
ocamlopt -dtimings myfile.ml
Note: Flambda itself changes the compilation time (mostly what happens after typing) and its integration into the OCaml compiler is not confirmed yet.
The OCaml compiler is an ordinary OCaml program in that regard. I would use a poor man's profiler for a quick inspection, using e.g. the pmp script.