I am looking for a way to find out which lines are executed most often in my code. I am mainly using Xcode and Instruments.
Using Instruments, I already know how to get the heaviest stack traces using the Time Profiler, which samples my program.
I would like to study the code's performance under a different angle. I am not interested in finding the heaviest functions with regards to execution time, I want to find the heaviest functions with regards to number of invocations.
The closest I could get so far is by using llvm's code coverage tools through Xcode. After activating code coverage, I can see execution counts on the various lines of code of my program inside Xcode. Unfortunately, it is a large project and going through every file one by one isn't a good way to work. That's why I'm looking for an efficient way to enumerate the top results.
I have tried using llvm-cov on the command line, but I couldn't find any way to achieve my goal. The --line-coverage-gt parameter doesn't seem to achieve what it is supposed to do. Any value greater than 0 filters out the lines of code that weren't executed at all, but any number greater than 99 filters out everything. Some of the lines are executed tens of millions of times, but I can't get rid of the ones that got executed thousands of times.
Answers to this question can leverage Xcode, Instruments and llvm-cov without any issues. I'm also interested in other alternatives even though they might be harder to apply on my codebase.
Thank you!
Related
I'm working on a large project in C++ using Visual Studio, but it very regularly either produces a duff build (the executable it generates doesn't match the code, resulting in random crashes or the inability to set breakpoints) or refuses to give any debug info for many of the types. For example, a vector of very simple structs stored by value will be displayed as "size: attempt to divide by zero". You can't drill down into the entries of the vector to see the values, and you get a similar thing for lists only you see a bunch of question marks instead of the divide by zero error.
This doesn't just affect standard library containers, but they are some of the worst culprits because they so often behave in this way. Doing a complete rebuild of the code will maybe rectify the problem 10% of the time, but it's completely unpredictable. I have found that writing shorter C++ files (I literally mean the file size, nothing to do with the objects themselves) can sometimes help, but I suspect that's just down to luck. It really doesn't make much sense that it could be relevant, anyway.
I work as part of a team on the same project, and only two of us seem to run into these kind of gnarly problems on a daily basis.
If anyone has any suggestions as to how I might be able to get the VS debugger to behave, I would be incredibly grateful.
Please forgive me if the terms are inaccurate or this is the wrong forum.
Essentially I write sketches in Processing and I am struggling to find out why my code runs slow.
Sometimes a sketch runs fast and I have no idea why other than there are less lines of code. Sometimes a different sketch runs slow and I have no idea why.
I am curious if there is a way within the Processing IDE, or maybe a general tool, to determine or analyze which lines of code are causing the sketch to run slow?
Such as a way to isolate "Oh, these lines are causing it to run the slowest. Looks like it's a section of this loop. Maybe I should concentrate on improving this function rather than searching for a needle in a haystack."
Similar to how when my computer is running slow I can use task manager to take a look at which programs are running slow and then adjust. I don't just guess. Or develop an unfounded penchant for quitting one program over another.
I of course could upload my sketch but this is a example independent problem I am trying to get better at solving. I would like to find a way to analyze all sketches. Or even code in different languages, etc.
If there is no general tool for analyzing a Processing sketch how do people go about this? Surely there must be a better method than trial and error, brute force, intuition, or otherwise. Of course those methods could yield better running code but there must be some formal analysis.
Even if you didn't have anything specific to share for Processing any suggestions on search terms, subjects, or topics would be appreciated as I have no idea how to proceed other than the brute force/trial and error method.
Thank you,
Without an example code snippet I can only provide a couple of general approaches.
The simplest thing you could do is use time sections of code and add print statements (poor man's profiler). The idea is to take a time snapshot before running a function/block of code you suspect is slow, take another one after then print the difference. The chunks of code with the largest time difference is what you want to focus on. Here's a minimal example, assuming doSomethingInteresting() is a function that you suspect is slow
// get the current millis since the sketch started
int before = millis();
// call the function here
doSomethingInteresting();
// get the millis again, after the function and calculate the time difference
int diff = millis() - before;
println("executed in ~" + diff + "ms");
If you use Processing's default Java mode you can use VisualVM to sample your PApplet. In your case you will want to sample CPU usage and the nice thing about VisualVM is that you will the results sorted by the slowest function calls (the first you should improve) and you can drill down to see what is your code versus what is part of the runtime or other native parts of code.
I have been doing sport programming for a while and still improving day by day. But one thing I have always wondered is that it would be really nice if I could automate the test-case generation process and cross-validation of my program. Definitely it would be a brute force approach as some test cases would be algorithm specific.
Doing a google search gives me a nice link on Quora : How do programming contest problem setters make test cases ? and the popular testlib used by problem setters.
But isn't this a chicken-egg problem?
Assume I generated 1 million input test cases, but what would I check them against? How will I generate the outputs? Because I am still in the process of validating the program... If my script generates the correct outputs as well, then whats the point of writing the program in the first place. I can submit the script itself. Also, its not possible to write 1 million outputs for generated test cases manually. Can anyone please clarify this confusion.
I hope i have clarified the problem correctly.
It's common to generate the answer by a slow but obviously correct solution (like an exhaustive search). It can't be used as a main solution as it's too slow for large test cases, but you can check the output of your fast (but possibly incorrect) program using it.
Well the thing is it is not as broad as you think it to be. Test generation in competitive programming is guided by the algorithm of the problem and it's correctness proof.
So when you are thinking that there are million of test cases if you analyze the different situations the program can be then you will likely to get all the test cases. Maybe in certain algorithm you are some times processing the even index elements or the odd index elements of an array. Now what you will do? Divide it in 2 cases even or odd. Consider the smallest case for even ones . Same for odd ones. This way you are basically visiting all the control flow path of the program.
In competitive programming as we first determine the algorithm then we decide on a proper input sizes and then all this test cases and validations, it is often easy to think the corner points. Test case for 1000000 elements or when input is 0 or 1 ...test cases like this.
Another is most of the time we write a brute force solution much more slower than the original one. Now what we do? we just generate random medium size test cases and then run it again the slow program and we can check with our checker solution etc.
correctness is guided by some mathematical proof also.(Heuristics, Induction, Box principle, Number Theory etc ) That way we are sure about the correctness of the solution.
I faced the same issue earlier this year and saw some of my colleagues also figuring out a way to deal with this. Because there are sometimes when I just couldn’t think any more test cases and that’s when I decided to make a test case generator tool of my own, It’s free and open-source so anyone can use it.
You can easily generate a lot of test cases using this tool and validate the result using the output given from the correct but slower approach(in terms of time complexity and space complexity). You can either run them parallelly and check for outputs or write a simple script to compare the output of both programs (the slower but correct one and the better but unsure one) to validate.
I believe good coders won’t be needing it anyway, but for the middle level (div2, div3) coders and newbies, it can prove to be a lot helpful.
You can access it from GitHub : Test Case generator.
Both python source code and .exe files are present there with instructions,
If you want to make some changes of your own you can work with the python file.
If you just directly want it to generate some test cases you should prefer the .exe file(inside the zip).
It’ll help a lot if you’re beginning your Competitive programming journey.
Also, any suggestions or improvements are always welcome. Additionally you can also contribute to this project by adding some feature that you think the project is lacking or by adding some new test case formats by yourselves or making by request for the same.
I'm working on a board game algorithm where a large tree is traversed using recursion, however, it's not behaving as expected. How do I handle this and what are you experiences with these situations?
To make things worse, it's using alpha-beta pruning which means entire parts of the tree are never visited, as well that it simply stops recursion when certain conditions are met. I can't change the search-depth to a lower number either, because while it's deterministic, the outcome does vary by how deep is searched and it may behave as expected at a lower search-depth (and it does).
Now, I'm not gonna ask you "where is the problem in my code?" but I am looking for general tips, tools, visualizations, anything to debug code like this. Personally, I'm developing in C#, but any and all tools are welcome. Although I think that this may be most applicable to imperative languages.
Logging. Log in your code extensively. In my experience, logging is THE solution for these types of problems. when it's hard to figure out what your code is doing, logging it extensively is a very good solution, as it lets you output from within your code what the internal state is; it's really not a perfect solution, but as far as I've seen, it works better than using any other method.
One thing I have done in the past is to format your logs to reflect the recursion depth. So you may do a new indention for every recurse, or another of some other delimiter. Then make a debug dll that logs everything you need to know about a each iteration. Between the two, you should be able to read the execution path and hopefully tell whats wrong.
I would normally unit-test such algorithms with one or more predefined datasets that have well-defined outcomes. I would typically make several such tests in increasing order of complexity.
If you insist on debugging, it is sometimes useful to doctor the code with statements that check for a given value, so you can attach a breakpoint at that time and place in the code:
if ( depth = X && item.id = 32) {
// Breakpoint here
}
Maybe you could convert the recursion into an iteration with an explicit stack for the parameters. Testing is easier in this way because you can directly log values, access the stack and don't have to pass data/variables in each self-evaluation or prevent them from falling out of scope.
I once had a similar problem when I was developing an AI algorithm to play a Tetris game. After trying many things a loosing a LOT of hours in reading my own logs and debugging and stepping in and out of functions what worked out for me was to code a fast visualizer and test my code with FIXED input.
So, if time is not a problem and you really want to understand what is going on, get a fixed board state and SEE what your program is doing with the data using a mix of debug logs/output and some sort of your own tools that shows information on each step.
Once you find a board state that gives you this problem, try to pin-point the function(s) where it starts and then you will be in a position to fix it.
I know what a pain this can be. At my job, we are currently working with a 3rd party application that basically behaves as a black box, so we have to devise some interesting debugging techniques to help us work around issues.
When I was taking a compiler theory course in college, we used a software library to visualize our trees; this might help you as well, as it could help you see what the tree looks like. In fact, you could build yourself a WinForms/WPF application to dump the contents of your tree into a TreeView control--it's messy, but it'll get the job done.
You might want to consider some kind of debug output, too. I know you mentioned that your tree is large, but perhaps debug statements or breaks at key point during execution that you're having trouble visualizing would lend you a hand.
Bear in mind, too, that intelligent debugging using Visual Studio can work wonders. It's tough to see how state is changing across multiple breaks, but Visual Studio 2010 should actually help with this.
Unfortunately, it's not particularly easy to help you debug without further information. Have you identified the first depth at which it starts to break? Does it continue to break with higher search depths? You might want to evaluate your working cases and try to determine how it's different.
Since you say that the traversal is not working as expected, I assume you have some idea of where things may go wrong. Then inspect the code to verify that you have not overlooked something basic.
After that I suggest you set up some simple unit tests. If they pass, then keep adding tests until they fail. If they fail, then reduce the tests until they either pass or are as simple as they can be. That should help you pinpoint the problems.
If you want to debug as well, I suggest you employ conditional breakpoints. Visual Studio lets you modify breakpoints, so you can set conditions on when the breakpoint should be triggered. That can reduce the number of iterations you need to look at.
I would start by instrumenting the function(s). At each recursive call log the data structures and any other info that will be useful in helping you identify the problem.
Print out the dump along with the source code then get away from the computer and have a nice paper-based debugging session over a cup of coffee.
Start from the base case where you've mentioned if else statements and then try to channelize your thinking by writing it down on pen and paper + printing the values on console when the first few instances of recursive functions are generated with values.
The motto is to find the correct trend between the values you print and match them with those values you wrote on paper in the initial few steps of your recursive algorithm.
What's your standard way of debugging a problem? This might seem like a pretty broad question with some of you replying 'It depends on the problem' but I think a lot of us debug by instinct and haven't actually tried wording our process. That's why we say 'it depends'.
I was sort of forced to word my process recently because a few developers and I were working an the same problem and we were debugging it in totally different ways. I wanted them to understand what I was trying to do and vice versa.
After some reflection I realized that my way of debugging is actually quite monotonous. I'll first try to be able to reliably replicate the problem (especially on my local machine). Then through a series of elimination (and this is where I think it's problem dependent) try to identify the problem.
The other guys were trying to do it in a totally different way.
So, just wondering what has been working for you guys out there? And what would you say your process is for debugging if you had to formalize it in words?
BTW, we still haven't found out our problem =)
My approach varies based on my familiarity with the system at hand. Typically I do something like:
Replicate the failure, if at all possible.
Examine the fail state to determine the immediate cause of the failure.
If I'm familiar with the system, I may have a good guess about to root cause. If not, I start to mechanically trace the data back through the software while challenging basic assumptions made by the software.
If the problem seems to have a consistent trigger, I may manually walk forward through the code with a debugger while challenging implicit assumptions that the code makes.
Tracing the root cause is, of course, where things can get hairy. This is where having a dump (or better, a live, broken process) can be truly invaluable.
I think that the key point in my debugging process is challenging pre-conceptions and assumptions. The number of times I've found a bug in that component that I or a colleague would swear is working fine is massive.
I've been told by my more intuitive friends and colleagues that I'm quite pedantic when they watch me debug or ask me to help them figure something out. :)
Consider getting hold of the book "Debugging" by David J Agans. The subtitle is "The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems". His list of debugging rules — available in a poster form at the web site (and there's a link for the book, too) is:
Understand the system
Make it fail
Quit thinking and look
Divide and conquer
Change one thing at a time
Keep an audit trail
Check the plug
Get a fresh view
If you didn't fix it, it ain't fixed
The last point is particularly relevant in the software industry.
I picked those on the web or some book which I can't recall (it may have been CodingHorror ...)
Debugging 101:
Reproduce
Progressively Narrow Scope
Avoid Debuggers
Change Only One Thing At a Time
Psychological Methods:
Rubber-duck debugging
Don't Speculate
Don't be too Quick to Blame the Tools
Understand Both Problem and Solution
Take a Break
Consider Multiple Causes
Bug Prevention Methods:
Monitor Your Own Fault Injection Habits
Introduce Debugging Aids Early
Loose Coupling and Information Hiding
Write a Regression Test to Prevent Re occurrence
Technical Methods:
Inert Trace Statements
Consult the Log Files of Third Party Products
Search the web for the Stack Trace
Introduce Design By Contract
Wipe the Slate Clean
Intermittent Bugs
Explot Localility
Introduce Dummy Implementations and Subclasses
Recompile / Relink
Probe Boundary Conditions and Special Cases
Check Version Dependencies (third party)
Check Code that Has Changed Recently
Don't Trust the Error Message
Graphics Bugs
When I'm up against a bug that I can't get seem to figure out, I like to make a model of the problem. Make a copy of the section of problem code, and start removing features from it, one at a time. Run a unit test against the code after every removal. Through this process your will either remove the feature with the bug (and hence, locate the bug), or you will have isolated the bug down to a core piece of code that contains the essence of the problem. And once you figure out the essence of the problem, its a lot easier to fix.
I normally start off by forming an hypothesis based on the information I have at hand. Once this is done, I work to prove it to be correct. If it proves to be wrong, I start off with a different hypothesis.
Most of the Multithreaded synchronization issues get solved very easily with this approach.
Also you need to have a good understanding of the debugger you are using and its features. I work on Windows applications and have found windbg to be extremely helpful in finding bugs.
Reducing the bug to its simplest form often leads to greater understanding of the issue as well adding the benefit of being able to involve others if necessary.
Setting up a quick reproduction scenario to allow for efficient use of your time to test any hypothosis you chose.
Creating tools to dump the environment quickly for comparisons.
Creating and reproducing the bug with logging turned onto the maximum level.
Examining the system logs for anything alarming.
Looking at file dates and timestamps to get a feeling if the problem could be a recent introduction.
Looking through the source repository for recent activity in the relevant modules.
Apply deductive reasoning and apply the Ockham's Razor principles.
Be willing to step back and take a break from the problem.
I'm also a big fan of using process of elimination. Ruling out variables tremendously simplifies the debugging task. It's often the very first thing that should to be done.
Another really effective technique is to roll back to your last working version if possible and try again. This can be extremely powerful because it gives you solid footing to proceed more carefully. A variation on this is to get the code to a point where it is working, with less functionality, than not working with more functionality.
Of course, it's very important to not just try things. This increases your despair because it never works. I'd rather make 50 runs to gather information about the bug rather take a wild swing and hope it works.
I find the best time to "debug" is while you're writing the code. In other words, be defensive. Check return values, liberally use assert, use some kind of reliable logging mechanism and log everything.
To more directly answer the question, the most efficient way for me to debug problems is to read code. Having a log helps you find the relevant code to read quickly. No logging? Spend the time putting it in. It may not seem like you're finding the bug, and you may not be. The logging might help you find another bug though, and eventually once you've gone through enough code, you'll find it....faster than setting up debuggers and trying to reproduce the problem, single stepping, etc.
While debugging I try to think of what the possible problems could be. I've come up with a fairly arbitrary classification system, but it works for me: all bugs fall into one of four categories. Keep in mind here that I'm talking about runtime problems, not compiler or linker errors. The four categories are:
dynamic memory allocation
stack overflow
uninitialized variable
logic bug
These categories have been most useful to me with C and C++, but I expect they apply pretty well elsewhere. The logic bug category is a big one (e.g. putting a < b when the correct thing was a <= b), and can include things like failing to synchronize access among threads.
Knowing what I'm looking for (one of these four things) helps a lot in finding it. Finding bugs always seems to be much harder than fixing them.
The actual mechanics for debugging are most often:
do I have an automated test that demonstrates the problem?
if not, add a test that fails
change the code so the test passes
make sure all the other tests still pass
check in the change
No automated testing in your environment? No time like the present to set it up. Too hard to organize things so you can test individual pieces of your program? Take the time to make it so. May make it take "too long" to fix this particular bug, but the sooner you start, the faster everything else'll go. Again, you might not fix the particular bug you're looking for but I bet you find and fix others along the way.
My method of debugging is different, probably because I am still beginner.
When I encounter logical bug I seem to end up adding more variables to see which values go where and then I go and debug line by line in the piece of code that causing a problem.
Replicating the problem and generating a repeatable test data set is definitely the first and most important step to debugging.
If I can identify a repeatable bug, I'll typically try and isolate the components involved until I locate the problem. Frequently I'll spend a little time ruling out cases so I can state definitively: The problem is not in component X (or process Y, etc.).
First I try to replicate the error, without being able to replicate the error it is basically impossible in a non-trivial program to guess the problem.
Then if possible, break out the code in a separate standalone project. There are several reasons for this: If the original project is big it quite difficult to debug second it eliminates or highlights any assumptions about the code.
I normally always have another copy of VS open which I use for the debugging parts in mini projects and to test routines which I later add to the main project.
Once having reproduced the error in the separate module the battle is almost won.
Sometimes it is not easy to break out a piece of code so in those cases I use different methods depending on how complex the issue is. In most cases assumptions about data seem to come and bite me so I try to add lots of asserts in the code in order make sure my assumptions are correct. I also disabling code by using #ifdef until the error disappears. Eliminating dependencies to other modules etc... sort of slowly circling in the bug like a vulture ..
I think I don't have really a conscious way of doing it, it varies quite a lot but the general principle is to eliminate the noise around the issue until it is quite obvious what it is. Hope I didn't sound too confusing :)