Does cppcheck analyze across multiple files? - static-analysis

Is cppcheck able to keep track of malloc/dealloc or data flow across multiple files? Or does it only analyse single files separately?

There is limited whole program analysis in cppcheck. A summary is created for each function, and after all files have been analyzed those summaries are combined and checked for dangerous constructs. Each check needs its own separate logic for whole program analysis.
There is no summary-based whole program analysis for memory leaks.
There is summary-based whole program analysis for:
array index out of bounds
one definition rule violations
null pointer dereference
unused functions
uninitialized variables

Related

How does a bytecode interpreter know what line a runtime error occurred on?

As of now, I am working on a language that compiles to bytecode, which is then run by a VM. My question is, when a runtime error occurs, how does the VM know what line of the source code caused the error, as all whitespace is removed during the compilation process? One thing I can think of is to store a separate array of integers, parallel to the bytecode, holding the line number of each instruction, but that sounds extremely memory-inefficient, especially when there are a lot of instructions.
Some forms of bytecode contain information about line numbers, method names, etc. which are included to provide better debugging information. In the JVM, for example, method bytecode contains a table that maps ranges of bytecode addresses to source line numbers. That’s a more efficient way of storing it than tagging each bytecode operation with a line number, since there are typically multiple operations per line. It does use extra space, though I wouldn’t classify it as extremely inefficient.
Absent this info, there really isn’t a way for the interpreter to report anything about the original program, since as you’ve noted all that information is otherwise discarded.
This is similar to how compiled executables handle debug info. With debug symbols included, the program has tables mapping code addresses to function names and line numbers. With symbols stripped out, you just have raw instructions and data and there’s no way to reference the original code.
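The range-based table described above can be sketched as follows. This is only an illustration of the idea, not the JVM's actual format: the class name LineTable and its methods are made up for the example. One entry is stored per run of instructions that share a source line, and lookup is a binary search over the run start offsets.

```python
from bisect import bisect_right


class LineTable:
    """Maps bytecode offsets to source lines, one entry per run of same-line code."""

    def __init__(self):
        self.starts = []  # bytecode offset where each run begins (ascending)
        self.lines = []   # source line number for each run

    def add(self, offset, line):
        # Record an entry only when the line changes, so consecutive
        # instructions compiled from the same line share a single entry.
        if not self.lines or self.lines[-1] != line:
            self.starts.append(offset)
            self.lines.append(line)

    def line_for(self, offset):
        # Find the last run that starts at or before the given offset.
        i = bisect_right(self.starts, offset) - 1
        if i < 0:
            raise ValueError("offset precedes the first instruction")
        return self.lines[i]


table = LineTable()
# (offset, source line) pairs as a hypothetical compiler might emit them
for offset, line in [(0, 1), (2, 1), (5, 1), (6, 2), (9, 2), (10, 4)]:
    table.add(offset, line)

print(len(table.starts))  # 3 entries cover 6 instructions
print(table.line_for(7))  # 2
```

When a runtime error occurs, the VM would pass the offset of the failing instruction to line_for. The table grows with the number of source lines rather than with the number of instructions.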

Eclipse CLP: maximum number of constraints/variables

In the Eclipse CLP, how many constraints or variables can I define?
I am currently remodeling my scheduling problem - I need to replace a single alldifferent constraint with many atmost constraints. But since I introduced this change, my ecl script is not working. By "not working" I mean that ECLiPSe CLP (eclipse.exe or the TkEclipse GUI) just shuts down, without any error message, comment, or goodbye. Just nothing.
If I try to comment-out some constraints, the script at least gets compiled.
Has anyone already run into this issue?
There is no specific limit on the number of variables or constraints.
But you were working with large, generated source files where clauses had thousands of subgoals. Because ECLiPSe uses a recursive descent parser, such files can cause an OS stack overflow, in particular on Windows. You could either increase the Windows stack limit, or you could break your generated code into smaller clauses, and call these in conjunction.
Generally, however, generating textual source code isn't such a great idea: it must be created, written, read, parsed, compiled, and is then executed just once. Consider instead generating a pure data file that contains only things like arrays/lists of numbers, but no variables. You can then have a generic ECLiPSe program that reads these data and uses them to create variables and constraints, usually in several loops.
For a very simple example, compare https://eclipseclp.org/examples/transport1.pl.txt (where all the data is explicit in the flat model) with
https://eclipseclp.org/examples/transport_arr.pl.txt where the model is generic and all data comes from the data/3 fact at the end (this would correspond to the generated data file).

Is it possible to generate an unwind table for an object file?

The background is that we have a prebuilt object file without an unwind table, and the gcc unwinder has problems backtracing through it. Is it possible to generate an unwind table without source code, considering that the unwind table is based on static stack information, which is also available without source code?
In general, it is not possible to generate proper unwind tables from machine code in an object file. For a start, some constructs are quite difficult to represent accurately in unwinding information. Retpolines are an example.
The larger practical problem is that DWARF unwinding information is structured per function. A bare object file (without debugging information and only a minimal symbol table) does not capture function boundary information. Without that, it is impossible to say if a location in the file is the target of a function call and the start of a function. Similarly, a call to a noreturn function may be the last instruction in a function, even though it is not followed by a return instruction. It may be possible to use relocation data. There are several tools out there which attempt to infer function boundaries; every disassembler does it to some extent.
Your best bet is to locate the functions which fail unwinding and figure out why, and then compensate for that, either using custom-written unwind data or a GDB plugin. As Alexey Frunze said, a full conversion will be rather tedious.

Run custom preprocessing step for source files in Xcode build

As part of my build process (in Visual Studio), I use a custom MSBuild task that takes all the input source files, copies them to a secondary location, runs a preprocessing tool over them, and then gives the copies back to MSBuild to continue the build process with. I'm now working on a project for iOS, and I need to be able to do the same thing. It's been a very long time since I've worked with Xcode, so I'm pretty rusty on how I can set up the build process to work the same way I just described.
Specifically, here's the preprocessing that I'm doing:
As with many games and engines, there are often a lot of named resources, events, script symbols, object states, etc. that need to be human readable when represented in source code but for which having to do a full string comparison at runtime would be much too costly. Instead of using a full string, I use a 32-bit StringId integer type to represent these values. My preprocessing tool runs through the source code and replaces all instances of a macro in the form SID('some-named-identifier') with the 32-bit hash of the string inside that macro. During development, programmers and designers can use arbitrary strings as identifiers for whatever they need to be used for. At runtime, comparisons between StringIds are simple integer comparisons and since they are the hashed versions of the actual strings, there are no strings stored in the compiled binary that could be extracted.
Additionally, when preprocessing the SID macros, I populate a MySQL database with the strings and their hashed values. This lets me do a reverse lookup at runtime in order to print the human-readable strings while debugging. It's a great system, and I'd love to get it working in Xcode as well!
Thanks in advance!
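For reference, the SID replacement step described above can be sketched as a small standalone script. The hash function is not named in the question, so 32-bit FNV-1a is used here purely as a stand-in, and an in-memory dict stands in for the MySQL table. The names fnv1a_32 and preprocess are hypothetical. Such a script could be invoked from, for example, a Run Script build phase, though wiring it into the build is outside this sketch.

```python
import re

# Matches SID('some-named-identifier'), the single-quoted form from the question.
SID_PATTERN = re.compile(r"SID\('([^']*)'\)")


def fnv1a_32(text):
    """32-bit FNV-1a hash (an assumed stand-in for the real hash function)."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h


def preprocess(source, table):
    """Replace each SID('...') with its hash and record hash -> string in table."""
    def replace(match):
        name = match.group(1)
        value = fnv1a_32(name)
        # Fail loudly if two different strings hash to the same value.
        if table.setdefault(value, name) != name:
            raise ValueError(f"hash collision: {name!r} vs {table[value]!r}")
        return f"0x{value:08X}u"
    return SID_PATTERN.sub(replace, source)


table = {}  # stands in for the reverse-lookup database
out = preprocess("if (state == SID('door-open')) { fire(SID('door-open')); }", table)
print(out)
print(table)
```

The table doubles as a collision check at build time, since two distinct identifiers with the same hash would otherwise compare equal at runtime.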

Accurately assessing VB6 limitations

As antiquated and painful as it is - I work at a company that continues to actively use VB6 for a large project. In fact, 18 months ago we came up against the 32k identifier limit.
Not willing to give up on the large code base and rewrite everything in .NET we broke our application into a main executable and several supporting DLL files. This week we ran into the 32k limit again.
The problem we have is that no tool we can find will tell us how many unique identifiers our source is using. We have no accurate way to gauge how our efforts are reducing the number of identifiers or how close we are to the limit before we reach it.
Does anyone know of a tool that will scan the source for a project and return some accurate metrics and statistics?
OK. The Project Metrics Viewer which is part of the Project Analyzer tool from Aivosto will do exactly what you want. I've included a screenshot and also the link to the metrics list which includes numbers of variables etc.
Metrics List
(source: aivosto.com)
The company I work for also has a large VB6 project that encountered the identifier limit. I developed a way to accurately count the number of identifiers remaining, and this has been incorporated into our build process for this project.
After trying several tools without success, I finally realized that the VB6 IDE itself knows exactly how many identifiers it has remaining. In fact, the VB6 IDE throws an "out of memory" error when you add one variable past its limit.
Taking advantage of this fact, I wrote a VB6 Add-In project that first compiles the currently loaded project in the IDE, then adds uniquely named variables to the project until it throws an error. When an error is raised, it records the number of identifiers added before the error as the number of identifiers remaining.
This number is stored in a file in a location known to our automated build process, which then reads this number and reports it to the development team. When it gets below a value we feel comfortable with, we schedule some refactoring time and move more code out of this project into DLL projects. We have been using this in production for several years now, and it has proven to be a reliable process.
To directly answer the question, using an Add-In is the only way I know to accurately measure the number of remaining identifiers. While I cannot share the Add-In code our project is using, I can say there is not much code involved, and it did not take long to develop.
Microsoft has a decent guide for how to create an Add-In, which can get you started:
https://support.microsoft.com/en-us/kb/189468
Here are some important details specific to counting identifiers:
The VB6 IDE will not consistently throw an error when out of identifiers until the currently loaded project has been compiled. Our Add-In programmatically does this before adding identifiers to guarantee an accurate count. If the project cannot be compiled, then an accurate count cannot be obtained.
There are 32,500 identifiers available to a new, empty VB6 project.
Only unique identifier names count. Two local variables with the same name in two different routines only count as one identifier.
CodeSmart by AxTools is very good.
(source: axtools.com)
Cheat - create an unused class with #### unique variables in it. Use Excel or something to generate the alphabetical unique variable names. Remove the class from the project when you hit the limit, or comment out blocks of 100 unique variables.
I'd rather lean on the compiler (which defines how many variables are too many) than on some 3rd party tool anyway.
(oh crud, sorry to necro - didn't notice the dates)
You could get this from a tool that extracted identifiers from VB6 code. Then all you'd have to do is sort the list, eliminate duplicates, and measure the list size. We have a source code search engine that breaks up source code into language tokens ("lexes"), with some of those tokens being exactly those identifiers. That would contain exactly the data you want.
But maybe there's another way to solve your problem: find out which variable names occur rarely and replace them with names from a standard set (e.g., "temp"). So what you really want is a count of the occurrences of each variable name, so you can sort for "small numbers of references". The same lexer data can provide this information.
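A rough sketch of that counting idea, assuming a very simplified view of VB6 syntax: the keyword list is deliberately partial, Rem comments and line continuations are ignored, and every non-keyword name is counted. Whether that matches exactly what the VB6 compiler counts toward its limit is not verified here. The function name count_identifiers is made up for the example.

```python
import re
from collections import Counter

# Partial keyword list, for illustration only; a real tool needs the full set.
KEYWORDS = {"dim", "as", "sub", "end", "function", "if", "then", "else",
            "for", "next", "to", "integer", "long", "string", "private",
            "public", "const", "set", "call", "exit"}

IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
STRING = re.compile(r'"(?:[^"]|"")*"')  # VB string literal, "" is an escaped quote


def count_identifiers(source):
    """Return a Counter of case-folded identifier names found in the source."""
    counts = Counter()
    for line in source.splitlines():
        line = STRING.sub('""', line)  # blank out string literal contents first
        line = line.split("'", 1)[0]   # then drop a trailing ' comment
        for name in IDENT.findall(line):
            key = name.lower()         # VB6 names are case-insensitive
            if key not in KEYWORDS:
                counts[key] += 1
    return counts


sample = '''
Private Sub Save()
    Dim total As Long, tmpA As Long
    total = total + tmpA ' running total
    MsgBox "total's value"
End Sub
'''

counts = count_identifiers(sample)
print(len(counts))  # number of unique names: 4
print(sorted(name for name, n in counts.items() if n == 1))  # rarely used names
```

The length of the counter gives the unique-name total, and sorting by count surfaces the low-occurrence names that would be candidates for renaming.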
Then all you need is a tool to rename low-occurrence identifiers to something from the standard set. We offer obfuscators that replace one name by another that could probably do this.
[Oct 2014 update]. Just had a long conversation with somebody with this problem. It turns out there's a pretty conceptual answer on which to base a tool, and that is called register coloring, which allocates a fixed number of registers to an arbitrary number of operands. This works by computing an "interference graph" over operands; two operands that don't "interfere" can be assigned the same register. One could use that to allocate the 2^16 available variable names to an arbitrary number of identifiers, if the interference graph isn't too bad. My guess is that it is not. YMMV, and somebody still has to build such a tool, likely needing a VB6 parser and machinery to compute such a graph. [Check out my bio].
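To make the coloring idea concrete, here is a minimal greedy graph-coloring sketch. It assumes the interference graph has already been computed, which is the hard part and still needs a VB6 parser. The graph below is hand-written and hypothetical, as are the function name assign_shared_names and the generated names v0, v1. Greedy coloring is a simple heuristic, not an optimal allocator.

```python
def assign_shared_names(interference):
    """Greedy coloring: identifiers that interfere never share a name slot."""
    assignment = {}
    # Handle the most-constrained identifiers first (a common heuristic);
    # ties are broken by name so the result is deterministic.
    order = sorted(interference, key=lambda n: (-len(interference[n]), n))
    for ident in order:
        taken = {assignment[n] for n in interference[ident] if n in assignment}
        slot = 0
        while slot in taken:  # pick the lowest slot no neighbour is using
            slot += 1
        assignment[ident] = slot
    return assignment


# Hypothetical, symmetric interference graph: an edge means the two
# identifiers are in use at the same time and must keep distinct names.
graph = {
    "rowCount": {"rowIndex"},
    "rowIndex": {"rowCount"},
    "fileName": {"errText"},
    "errText": {"fileName"},
}

slots = assign_shared_names(graph)
renamed = {ident: f"v{slot}" for ident, slot in slots.items()}
print(renamed)
print(len(set(slots.values())))  # 2 names cover 4 identifiers
```

In this toy graph four identifiers collapse onto two names. How much a real code base would shrink depends entirely on how dense its interference graph turns out to be.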
It seems that Compuware's DevPartner had that kind of code analysis. I don't know if the current version still supports Visual Basic 6.0. (But at least there's a 14-day trial available)
