static analysis of linux kernel on source code or LLVM IR? [closed] - linux-kernel

In https://www.usenix.org/system/files/sec21-tan.pdf the authors do static analysis on the LLVM IR of the Linux kernel (a pass for call graph construction, a pass for data flow analysis and alias analysis, and so on), and in some other papers I also see static analysis done on LLVM IR rather than on the source code. My question is: why do they do their static analysis on LLVM IR? Why don't they analyze the source code of the Linux kernel instead? (For example, they could construct the call graph by analyzing the source code, but they construct it by analyzing the LLVM IR.)

Analyzing the LLVM IR simplifies analysis of the semantics of the program, while analyzing the source code is needed to see what the program does in terms of the programming language. What I mean is that the C expression *x is definitely "performing an indirection", but it may or may not load from or store to memory; for instance, the larger expression &*x does not, even though it contains *x. This sort of thing doesn't happen with LLVM IR: every memory access is either a load or a store instruction, or it happens inside a function reached through a call instruction. However, if x is NULL then *x is still undefined behaviour even when the larger expression is &*x, and you won't be able to see that bug by looking only at the LLVM IR.
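To make that concrete, here is a small C sketch of the point; the IR described in the comments is what clang typically produces for these functions once trivial optimizations have run, and the exact IR varies with compiler version and flags.

/* Both functions contain the expression *x, but only the first one
   actually reads memory through x; &*x collapses back to x. */

int deref(int *x) {
    return *x;      /* in IR: a load instruction through %x          */
}

int *same(int *x) {
    return &*x;     /* in IR: no load through %x, it just returns %x */
}

An analysis working on the IR only has to look for load and store instructions; an analysis working on the source has to reason about which occurrences of * actually touch memory.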
LLVM also has a bunch of analyses built in; for instance, LLVM already has the ability to build a call graph. Sometimes the call graph isn't immediately obvious from the source code and you need to run some optimizations to see what the callee is (or to remove dead code, eliminating function calls with it), and LLVM performs those optimizations well too.
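As a rough sketch of what "LLVM already has the ability to build a call graph" means in practice, a small standalone tool (not the one from the paper; the file name and structure here are made up) can load a bitcode module and walk LLVM's built-in CallGraph analysis. Indirect calls whose callee cannot be resolved end up on the synthetic external node, which is exactly what motivates the extra passes in papers like the one you cite.

// callgraph_dump.cpp - sketch of using LLVM's CallGraph on a .bc/.ll file.
// Build against the LLVM libraries (e.g. flags from llvm-config).
#include "llvm/Analysis/CallGraph.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include <memory>

int main(int argc, char **argv) {
    llvm::LLVMContext Ctx;
    llvm::SMDiagnostic Err;
    std::unique_ptr<llvm::Module> M = llvm::parseIRFile(argv[1], Err, Ctx);
    if (!M) { Err.print(argv[0], llvm::errs()); return 1; }

    llvm::CallGraph CG(*M);                       // direct-call graph of the module
    for (const auto &Entry : CG) {
        const llvm::Function *Caller = Entry.first;
        if (!Caller) continue;                    // skip the synthetic external node
        llvm::outs() << Caller->getName() << " calls:\n";
        for (const auto &Edge : *Entry.second)
            if (llvm::Function *Callee = Edge.second->getFunction())
                llvm::outs() << "    " << Callee->getName() << "\n";
    }
    return 0;
}

For the kernel, the interesting part is precisely the calls this simple walk cannot resolve (function pointers stored in structs), which is why such papers add their own indirect-call analysis on top of the IR.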


powerpc-elf abi instead of elfv2 on 64 bit powerpc systems, is it possible? [closed]

I have been trying to cross compile GCC for the 64-bit PowerPC architecture. However, the GCC configuration lacks a "powerpc64-elf" target. It has "powerpc64-linux" and powerpc-rtems (which can produce 32/64-bit code).
Digging further, I read the following document (which describes the ABI used by Linux for the powerpc64 arch):
https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html
The specification introduces an additional segment called the TOC. Also, this specification uses the ELFv2 format to address these changes.
My first (and maybe unrelated) question is:
is using the TOC for access to global variables beneficial? Instead of using a single load instruction, we have to go through the TOC and then use a load instruction.
Does something prevent us from using a single load on PowerPC systems?
At this point, I am fairly uncertain about the advantages (or even the necessity) of ELFv2 over plain ELF.
My actual question is: when compiling GCC, if I were to just change the ABI to the default powerpc ABI,
would the resulting compiler still produce valid 64-bit code?
I am guessing that even if this works, I may not be able to utilize some components of the hardware?
Thanks in advance.
Edit:
Clarification:
I forgot to mention that I am not planning to run the programs compiled with the modified GCC on existing Linux targets. Rather, I want to simplify the ABI for an OS architecture support package (possibly using the same architecture support for both platforms).
In this case, I want to run 64-bit code (without the 4 GB memory limit) using the ELFv1 ABI on the powerpc64 architecture with powerpc-linux (rather than powerpc64-linux).
TOC/GOT overheads:
Following Bill's answer about GOT and TOC overheads, I compared the dumps of simple programs compiled with powerpc32 and powerpc64 compilers. As he described, the GOT also uses an extra level of indirection. The TOC seems to introduce two additional instructions (a load immediate followed by an add immediate, which are trivial).
Edit 2: In the end, I opted to use the standard ABI. The compiler and the OS need a handshake at this point.
But I did create a custom configuration of GCC by observing other OSes (like Linux, RTEMS) and following this tutorial: Structure of a gcc backend.
Short answer: No, you can't use powerpc-linux as the target for 64-bit PowerPC code.
You need to cross compile for the target of the Linux distribution where your code is intended to run. For most modern distributions, the code will run in little-endian mode, so you need to target powerpc64le-linux. Some older distributions run in big-endian mode, with target powerpc64-linux. Generally speaking, powerpc64-linux uses the ELF v1 ABI, and powerpc64le-linux uses the ELF v2 ABI. Both use a TOC pointer.
PowerPC is somewhat unusual in its use of a compiler-managed table of contents (TOC) together with a linker-managed global offset table (GOT), as opposed to the single GOT that many targets use. But in practice the overhead is not that different, thanks to a variety of compiler and linker optimizations. In both schemes, an extra level of indirection is necessary when accessing global shared variables, because their addresses are not known until run time and are filled in by the dynamic linker (except when using pure static linking, which is uncommon today).
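To make the indirection concrete, here is roughly what access to an extern global looks like on powerpc64le-linux with GCC at -O2 under the default medium code model. The assembly in the comment is illustrative only; the exact sequence depends on GCC version, code model, and whether the variable is known to be local to the module.

/* toc_example.c - the address of 'counter' is not known until run time,
   so it is fetched from a TOC slot via the TOC pointer in r2. */
extern long counter;

long read_counter(void) {
    return counter;
}

/* Typical generated code, roughly:
 *
 *   read_counter:
 *       addis 9,2,.LC0@toc@ha    # r2 holds the TOC pointer
 *       ld    9,.LC0@toc@l(9)    # load the address of 'counter' from the TOC
 *       ld    3,0(9)             # load the value of 'counter' itself
 *       blr
 *
 * A target with only a GOT does essentially the same thing: one load to get
 * the address from the GOT, then one load to get the value.
 */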
In short, don't worry about the TOC, and set up your cross-compile for the environment in which your code is expected to run.

Segmentation fault with automatic arrays [duplicate]

I have some Fortran code that calls RESHAPE to reorder a matrix such that the dimension that I am about to loop over becomes the first, fastest-varying dimension (column-major order in Fortran).
This has nothing to do with C/Fortran interoperability.
Now the matrix is rather large and when I call the RESHAPE function I get a seg fault which I am very confident is a stack overflow. I know this because I can compile my code in ifort with -heap-arrays and the problem disappears.
I do not want to modify the stack-size. This code needs to be portable for any computer without the user having to concern himself with stack-size.
Is there some way I can get this call to the RESHAPE function to use the heap rather than the stack for its internal memory use?
Worst case I will have to 'roll my own' RESHAPE function for this instance but I wish there was a better way.
The Fortran standard does not speak about the stack and the heap at all; that is an implementation detail. In which part of memory something is placed, and whether there are any limits, is implementation defined.
Therefore it is impossible to control the stack or heap behaviour from the Fortran code itself. The compiler must be instructed by other means if you want to specify this, and compiler options are used for that. Intel Fortran uses the stack by default and has the -heap-arrays n option (n is the size limit in kB); gfortran is slightly different and has the opposite -fstack-arrays option (included in -Ofast, but it can be disabled).
This is valid for all kinds of temporaries and automatic arrays.

Writing in pure 1's and 0's [closed]

I have been wondering about writing code in pure binary. I realize that this is completely impractical, as well as extremely hard. I have coded a bit in a few languages, but for some reason the thought of being able to write pure binary seems intriguing. Can you help explain the basics of this? I have researched, but all the answers are "this is not practical", even though that is obvious and the point is to see the limits of programmers and binary. If there were a link to an IDE or tutorial to teach some of this, it would be awesome. My point is more to understand how it works, not to make big programs, but it would be nice to be able to create some extremely basic ones. I think it would be nice to make a very high performance program.
It is possible, but not particularly easy.
To do it, you normally start by writing the code in assembly language, typically on paper. You then use an encoding table to assemble that data by hand. Finally, you use a debugger or "monitor" program to enter the binary into RAM. Usually you want to save it to a file before running it (to avoid re-entering it if, for example, it crashes).
There is one major caveat though: it's really only even close to practical on systems that support some sort of executable file format that has little or no overhead. For one example, under MS-DOS you typically did this with a .com file, which is pretty much just raw binary. What you put into the file gets loaded at an offset of 0x100, and execution starts at the beginning.
When you get to something like Windows with PE format executables or Linux with ELF format, it's a whole different story. I'd say about the only halfway reasonable way to do the job under these would be to write a shell program using some existing linker to produce the executable file, and have it allocate some memory, load code from your file into it, and then execute a jump to your code to start execution. Trying to encode a PE or ELF header by hand would be pretty dire, to put it mildly.
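Here is a minimal sketch (Linux, x86-64; the file name and details are made up) of that loader-shim idea: a small ordinary program built with the normal toolchain that copies your hand-assembled bytes into executable memory and jumps to them. It assumes the bytes implement a function taking no arguments and returning a long in the usual calling convention.

// runblob.cpp - load raw machine code from a file and execute it.
#include <cstdio>
#include <sys/mman.h>

int main(int argc, char **argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s code.bin\n", argv[0]); return 1; }

    std::FILE *f = std::fopen(argv[1], "rb");
    if (!f) { std::perror("fopen"); return 1; }
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);

    // Ask the OS for memory we are allowed to execute.
    // (Hardened systems may forbid writable+executable mappings; then write
    // first and flip to executable with mprotect afterwards.)
    void *mem = mmap(nullptr, (size_t)size, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { std::perror("mmap"); return 1; }

    if (std::fread(mem, 1, (size_t)size, f) != (size_t)size) { std::perror("fread"); return 1; }
    std::fclose(f);

    // Jump into the hand-written binary and print whatever it returns.
    long result = reinterpret_cast<long (*)()>(mem)();
    std::printf("code returned %ld\n", result);
    return 0;
}

For example, feeding it the six bytes b8 2a 00 00 00 c3 (mov $0x2a,%eax; ret) would make it print 42.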
Here is some binary/machine code and the corresponding assembly code:
Machine code    Assembly code
03 45 84        add    0xffffff84(%ebp),%eax
83 c0 30        add    $0x30,%eax
f7 e2           mul    %edx
Notice there is a 1-to-1 correspondence between each assembly instruction and the corresponding machine code. (The instruction lengths differ because this is x86 code.)
Because binary code and pure assembly code map to one another, there is no difference in speed between the two.
In practice, the only people who program in hex are those that don't have access to an assembler or people who like to solve problems the (really) hard way.

Prolog - high-level purpose of WAM [closed]

I am trying to understand the purpose of the WAM at a conceptual, high level, but all the sources I have consulted so far assume that I know more than I currently do at this juncture, and they approach the issue from the bottom (the details). They start by throwing trees at me, whereas right now I am concerned with seeing the whole forest.
The answers to the following questions would help me in this endeavor:
Pick any group of accomplished, professional Prolog implementers - the SICStus people, the YAP people, the ECLiPSe people - whoever. Now, give them the goal of implementing a professional, performant, WAM-based Prolog on an existing virtual machine - say the Erlang VM or the Java VM. To eliminate answers such as "it depends on what your other goals are," let's say that any other goals they have besides the one I just gave are the ones they had when they developed their previous implementations.
Would they (typically) implement a virtual machine (the WAM) inside of a VM (Erlang/JVM), meaning would you have a virtual machine running on top of, or being simulated by, another virtual machine?
If the answer to 1 is no, does that mean that they would try to somehow map the WAM and its associated instructions and execution straight onto the underlying Erlang/Java VM, in order to make the WAM 'disappear' so to speak and only have one VM running (Erlang/JVM)? If so, does this imply that any WAM heaps, stacks, memory operations, register allocations, instructions, etc. would actually be Erlang/Java ones (with some tweaking or massaging)?
If the answer to 1 is yes, does that mean that any WAM heaps, stacks, memory ops, etc. would simply be normal arrays or linked lists in whatever language (Erlang or Java, or even Clojure running on the JVM for that matter) the developers were using?
What I'm trying to get at is this. Is the WAM merely some abstraction or tool to help the programmer organize code, understand what is going on, map Prolog to the underlying machine, perhaps provide portability, etc. or is it seen as an (almost) necessary, or at least quite useful "end within itself" in order to implement a Prolog?
Thanks.
I'm excited to see what those more knowledgeable than I are able to say in response to this interesting question, but in the unlikely event that I actually know more than you do, let me outline my understanding. We'll both benefit when the real experts show up and correct me and/or supply truer answers.
The WAM gives you a procedural description of a way of implementing Prolog. Prolog as specified does not say how exactly it must be implemented, it just talks about what behavior should be seen. So WAM is an implementation approach. I don't think any of the popular systems follow it purely, they each have their own version of it. It's more like an architectural pattern and algorithm sketch than a specification like the Java virtual machine. If it were firmer, the book Warren's Abstract Machine: A Tutorial Reconstruction probably wouldn't need to exist. My (extremely sparse) understanding is that the principal trick is the employment of two stacks: one being the conventional call/return stack of every programming language since Algol, and the other being a special "trail" used for choice points and backtracking. (edit: #false has now arrived and stated that WAM registers are the principal trick, which I have never heard of before, demonstrating my ignorance.) In any case, to implement Prolog you need a correct way of handling the search. Before WAM, people mostly used ad-hoc methods. I wouldn't be surprised to learn that there are newer and/or more sophisticated tricks, but it's a sound architecture that is widely used and understood.
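For what it's worth, here is a toy sketch (in C++, and emphatically not the WAM itself) of the trail idea: at a choice point you remember how long the trail is, you record every variable you bind afterwards, and on backtracking you pop the trail back to the mark, unbinding as you go.

// trail_sketch.cpp - illustrative only; real Prolog engines do much more.
#include <cstddef>
#include <vector>

struct Var { bool bound = false; int value = 0; };   // a toy logic variable

struct Trail {
    std::vector<Var*> entries;                       // variables bound so far

    std::size_t mark() const { return entries.size(); }   // taken at a choice point
    void record(Var *v) { entries.push_back(v); }

    void undo_to(std::size_t m) {                    // backtracking
        while (entries.size() > m) {
            entries.back()->bound = false;           // unbind in reverse order
            entries.pop_back();
        }
    }
};

int main() {
    Var x, y;
    Trail trail;

    std::size_t cp = trail.mark();                   // enter a clause that may fail

    x.bound = true; x.value = 1; trail.record(&x);   // bindings made while trying it
    y.bound = true; y.value = 2; trail.record(&y);

    trail.undo_to(cp);                               // the clause failed: backtrack
    // x and y are unbound again, ready for the next clause to be tried.
    return 0;
}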
So the answer to your three-part question is, I think, both. There will be a VM within the VM. The VM within the VM will, of course, be implemented in the appropriate language and will therefore use that language's primitives for handling the invisible parts of the VM (the stack and the trail). Clojure might provide insight into the ways a language can share things with its own implementation language. You would be free to intermix as desired.
The answer to your final question, what you're trying to get at, is that the WAM is merely an abstraction for the purposes you describe and not an end to itself. There is not, for instance, such a thing as "portable WAM bytecode" the way compiled Java becomes portable JVM bytecode which might justify it absent the other benefits. If you have a novel way of implementing Prolog, by all means try it and forget all about WAM.

Fast linear system solver for D? [closed]

Where can I get a fast linear system solver written in D? It should be able to take a square matrix A and a vector b and solve the equation Ax = b for x and, ideally, also perform explicit inversion of A. I have one I wrote myself, but it's pretty slow, probably because it's completely cache-naive. However, for my use case, I need something with the following absolute, non-negotiable requirements, i.e. if it doesn't meet these, then I don't care how good it otherwise is:
Must be licensed public domain, Boost license, or some similar permissive license. Ideally it should not require attribution in binaries (i.e. not BSD), though this point is somewhat negotiable.
Must be written in pure D or easily translatable to pure D. Inscrutable Fortran code (i.e. LAPACK) is not a good answer no matter how fast it is.
Must be optimized for large (i.e. n > 1000) systems. I don't want something that's designed for game programmers to solve 4x4 matrices really, really fast.
Must not be inextricably linked to a huge library of stuff I don't need.
Edit: The reason for these seemingly insane requirements is that I need this code for a permissively licensed open source library that I don't want to have any third-party dependencies.
If you don't like Fortran code, one reasonably fast C++ dense matrix library with modest multi-core support, well-written code and a good user interface is Eigen. It should be straightforward to translate its code to D (or to take some algorithms from it).
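For a sense of what the Eigen route looks like before any translation to D, solving a dense system there is only a few lines (a sketch; PartialPivLU is Eigen's standard dense LU solver, with FullPivLU or HouseholderQR as more robust alternatives):

// eigen_solve.cpp - solve Ax = b for a dense 1000x1000 system with Eigen.
#include <Eigen/Dense>
#include <iostream>

int main() {
    const int n = 1000;
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(n, n);
    Eigen::VectorXd b = Eigen::VectorXd::Random(n);

    Eigen::VectorXd x = A.partialPivLu().solve(b);   // LU with partial pivoting

    std::cout << "residual: " << (A * x - b).norm() << "\n";
    return 0;
}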
And now my "think about your requirements": there is a reason why "everyone" (Mathematica, Matlab, Maple, SciPy, GSL, R, ...) uses ATLAS / LAPACK, UMFPACK, PARDISO, CHOLMOD etc. It is hard work to write fast, multi-threaded, memory-efficient, portable and numerically stable matrix solvers (trust me, I have tried). A lot of this hard work has gone into ATLAS and the rest.
So my approach would be to write bindings for the relevant library depending on your matrix type, and link from D against the C interfaces. Maybe the bindings in multiarray are enough (I haven't tried). Otherwise, I'd suggest looking at another C++ library, namely uBlas and the respective bindings for ideas.
