My program contains a child of a widget class, and the paint() function is redefined for the child.
The program is consuming a lot of CPU cycles even when idle. A printf() inside my paint() function shows that paint() is called only when I expect it to be called.
What else can I try to locate the source of the consumption?
Add
Let me step back to something truly elementary. In XCode 3 there used to be a build setting to choose between a "Debug" and a "Release" build, but I no longer see such a setting in XCode 4. How does one generate a debug build? Perhaps the answer to my original question would be as simple as pressing "pause" (another button that disappeared) while the program is in the idle loop. (The loop itself, I should add, belongs to the toolkit, not my code.)
Assuming this is MacOS Xcode development, you can use the profiler that comes with Xcode.
If not, use whatever profiler is available.
If no profiler is available, start slowly stripping out functionality out of your application. Or perhaps not slowly, but do a binary search (i.e. strip out half of functionality). Whatever is easier.
Depending on your application doing the 3rd things (i.e. ripping things out instead of using profiler) might actually be the fastest route to victory, but it's worth to spend some time and learn to use profiler.
Related
I have built the DX11VideoRenderer sample (a replacement for EVR that uses DirectX11 instead of EVR's DirectX9), and it's working. Problem is, it's not working very well. It's using twice the CPU time that the EVR does for the same videos (more on this in the next question).
Since I've got the source, I decided to profile it to see what's going on. (Among other things) this led me to:
HRESULT DX11VideoRenderer::CPresenter::CheckDeviceState(BOOL* pbDeviceChanged)
I'm not much of a DirectX expert (actually, I'm not one at all), but it seems likely that window handles can invalidate as monitors get unplugged, windows get FullScreened, closed, etc so a function like this makes perfect sense to me.
However.
When I look at the code for CheckDeviceState, the first thing it does is call SetVideoMonitor, which seems odd.
SetVideoMonitor looks like the routine you call when you first initialize the presenter (or change the target window), not something you'd call repeatedly to "Check" the device state.
Indeed, SetVideoMonitor calls TerminateDisplaySystem, followed by InitializeDisplaySystem. I could see doing this once at startup, but those functions are being called once per frame. That can't be right.
I can comment out the call to SetVideoMonitor in CheckDeviceState (or actually all of CheckDeviceState), and the code continues to function correctly (it's predictably a bit faster). But then I'm not checking the device state anymore.
Trying to figure out the proper way to check for state changes in DX11 brought me here which talks about just checking the return codes for IDXGISwapChain::Present and ResizeBuffers. Is that how this should be done? Because that makes it seem like this whole routine is some leftover from DX9 (where it still would have been poorly implemented).
What's the correct way to check the device state in DX11? Is this even a thing anymore?
In Turbo Pascal 7 for DOS you can use the Crt unit to define a window. If you define a second window on top of the first one, like a popup, I don’t see a way to get rid of the second one except for redrawing the first one on top again.
Is there a window close technique I’m overlooking?
I’m considering keeping an array of screens in memory to make it work, but the TP IDE does popups like I want to do, so maybe it’s easy and I’m just looking in the wrong place?
I don't think there's a window-closing technique you're missing, if you mean one provided by the CRT unit.
The library Borland used for the TP7 IDE was called TurboVision (see https://en.wikipedia.org/wiki/Turbo_Vision) and it was eventually released to the public domain, but well before that, a number of 3rd-party screen handling/windowing libraries had become available and these were much more powerful than what could be achieved with the CRT unit. Probably the best known was Turbopower Software's Object Professional (aka OPro).
Afaik, these libraries (and, fairly obviously TurboVision) were all based on an in-memory representation of a framed window which could be rapidly copied in and out of the PC's video memory and, as in Windows with a capital W, they were treated as a stack with a z-order. So the process or closing/erasing the top level window was one of getting the window(s) that it had been covering to re-draw itself/themselves. Otoh, CRT had basically evolved from v. primitive origins similar to, if not based on, the old DEC VT100 display protocol and wasn't really up to the job of supporting independent, stackable window objects.
Although you may still be able to track down the PD release of TurboVision, it never really caught on as a library for developers. In an ideal world, a better place to start would be with OPro. It was apparently on SoureForge for a while, but seems to have been taken down sometime since about 2007, and these days even if you could get hold of a copy, there is a bit of a question mark over licensing. However ...
There was also a very popular freeware library available for TP by the name of the "Technojock's toolkit" and which had a large functionality overlap (including screen handling) with OPro and it is still available on github - see https://github.com/lallousx86/TurboPascal/tree/master/TotLib/TOTSRC11. Unlike OPro, I never used TechnoJocks myself, but devotees swore by it. Take a look.
This question is not about finding out who retained a particular object but rather looking at a section of code that appears from the profiler to have excessive retain/release calls and figuring out which objects are responsible.
I have a Swift application that after initial porting was spending 90% of its time in retain/release code. After a great deal of restructuring to avoid referencing objects I have gotten that down to about 25% - but this remaining bit is very hard to attribute. I can see that a given chunk of it is coming from a given section of code using the profiler, but sometimes I cannot see anything in that code that should (to my understanding) be causing a retain/release. I have spent time viewing the assembly code in both Instruments (with the side-by-side view when it's working) and also the output of otool -tvV and sometimes the proximity of the retain/release calls to a recognizable section give me a hint as to what is going on. I have even inserted dummy method calls at places just to give me a better handle on where I am in the code and turned off optimization to limit code reordering, etc. But in many cases it seems like I would have to trace the code back to follow branches and figure out what is on the stack in order to understand the calls and I am not familiar enough with x86 to know know if that is practical. (I will add a couple of screenshots of the assembly view in Instruments and some otool output for reference below).
My question is - what else can I be doing to debug / examine / attribute these seemingly excessive retain/release calls to particular code? Is there something else I can do in Instruments to count these calls? I have played around with the allocation view and turned on the reference counting option but it didn't seem to give me any new information (I'm not actually sure what it did). Alternately, if I just try harder to interpret the assembly should I be able to figure out what objects are being retained by it? Are there any other tools or tricks I should know on that front?
EDIT: Rob's info below about single stepping into the assembly was what I was looking for. I also found it useful to set a symbolic breakpoint in XCode on the lib retain/release calls and log the item on the stack (using Rob's suggested "p (id)$rdi") to the console in order to get a rough count of how many calls are being made rather than inspect each one.
You should definitely focus on the assembly output. There are two views I find most useful: the Instruments view, and the Assembly assistant editor. The problem is that Swift doesn't support the Assembly assistant editor currently (I typically do this kind of thing in ObjC), so we come around to your complaint.
It looks like you're already working with the debug assembly view, which gives somewhat decent symbols and is useful because you can step through the code and hopefully see how it maps to the assembly. I also find Hopper useful, because it can give more symbols. Once you have enough "unique-ish" function calls in an area, you can usually start narrowing down how the assembly maps back to the source.
The other tool I use is to step into the retain bridge and see what object is being passed. To do this, instruction-step (^F7) into the call to swift_bridgeObjectRetain. At that point, you can call:
p (id)$rdi
And it should print out at least some type information about the what's being passed ($rdi is correct on x86_64 which is what you seem to be working with). I don't always have great luck extracting more information. It depends on exactly is in there. For example, sometimes it's a ContiguousArrayStorage<Swift.CVarArgType>, and I happen to have learned that usually means it's an NSArray. I'm sure better experts in LLDB could dig deeper, but this usually gets me at least in the right ballpark.
(BTW, I don't know why I can't call p (id)$rdi before jumping inside bridgeObjectRetain, but it gives strange type errors for me. I have to go into the function call.)
Wish I had more. The Swift tool chain just hasn't caught up to where the ObjC tool chain is for tracing this kind of stuff IMO.
Does anyone know of a Debugger or Programming Language that allows you to set a break point, and then modify the code and then execute the newly modified code.
This is even more useful if the Debugger also had the ability for reverse debugging. So you could step though the buggy code, stack backwards, fix the code, and then step though it again to see if you fixed the bug. Now that's sexy, is anyone doing this?
I believe the Hot Code Replace in eclipse is what you meant in the problem:
The idea is that you can start a debugging session on a given runtime
workbench and change a Java file in your development workbench, and
the debugger will replace the code in the receiving VM while it is
running. No restart is required, hence the reference to "hot".
But there are limitations:
HCR only works when the class signature does not change; you cannot
remove or add fields to existing classes, for instance. However, HCR
can be used to change the body of a method.
The totalview debugger provides the concept of Evaluation Point which allows user to "fix his code on the fly" or to "patch it" or to examine what if scenario without having to recompile.
Basically, user plants an Evaluation Point at some line and writes a piece of C/C++ or Fortran code he wants to execute instead. Could be a simple printf, goto, a set of if-then-else tests, some for loops etc... This is really powerful and time-sparing.
As for reverse-debugging, it's a highly desirable feature, but I'm not sure it already exists.
http://msdn.microsoft.com/en-us/library/bcew296c%28v=vs.80%29.aspx
The link is for VS 2005 but applies to 2008 and 2010 as well.
Edit, 2015: Read chapters 1 and 2 of my MSc thesis, Combining reverse debugging and live programming towards visual thinking in computer programming, it answers the question in detail.
The Python debugger, Pdb, allows you to run arbitrary code while paused (like at a breakpoint). For example, let's say you are debugging and have paused at the following line in your program, where the variable hasn't been declared in the program itself :
print (x)
so that moving forward (i.e., running that line) would result in :
NameError: name 'x' is not defined
You can define that variable in the debugger, and have the program continue executing with it :
(Pdb) 'x' in locals()
False
(Pdb) x = 1
(Pdb) 'x' in locals()
True
If you meant that the change should not be provided at the debugger console, but that you want to change the original code in some editor, then have the debugger automatically update the state of the live program in some way, so that the executing program reflects that change, that is called "live programming". (Not to be confused with "live coding" which is live performance of coding -- see TOPLAP -- though there is some confusion.) There has been an interest in research into live programming (and live coding) in the last 2 or 3 years. It is a very difficult problem to solve, and there are many different approaches. You can watch Bret Victor's talk, Inventing on Principle, for some examples of that. Note that those are prototypes only, to illustrate the idea. Hot-swapping of code so that the tree is drawn differently in the next loop of some draw() function, or so that the game character responds differently next time, (or so that the music or visuals are changed during a live coding session), is not that difficult, some languages and systems cater for that explicitly. However, the state of the program is not necessarily then a true reflection of the code (as also in the Pdb example above) -- if e.g. the game character could access an area based on some ability like jumping, and the code is then swapped out, he might never be able to access that area in the game any longer should the game be played from the start. To solve change propagation for general programming is difficult -- you can see that his search example re-runs the code from the start each time a change is made.
True reverse execution is also a tricky problem. There are a number of commercial projects, but almost all of them only record trace data to browse it afterwards, called omniscient debugging (but they are often called reverse-, back-in-time, bidirectional- or time-travel-debuggers, also a lot of confusion). In terms of free and open-source projects, the GNU debugger, gdb, has two modes, one is process record and replay which also only records the program for browsing it afterwards, the other is true reverse debugging which allows you to reverse in a live program. It is extremely slow, as it undoes single machine instruction at a time. The extended python debugger prototype, epdb, also allows for true reversing in a live program, and is much faster as it uses a snapshot/checkpoint and replay mechanism. Here is the thesis and here is the program and the code.
I've been testing out the performance and memory profiler AQTime to see if it's worthwhile spending those big $$$ for it for my Delphi application.
What amazes me is how it can give you source line level performance tracing (which includes the number of times each line was executed and the amount of time that line took) without modifying the application's source code and without adding an inordinate amount of time to the debug run.
The way that they do this so efficiently makes me think there might be some techniques/technologies used here that I don't know about that would be useful to know about.
Do you know what kind of methods they use to capture the execution line-by-line without code changes?
Are there other profiling tools that also do non-invasive line-by-line checking and if so, do they use the same techniques?
I've made an open source profiler for Delphi which does the same:
http://code.google.com/p/asmprofiler/
It's not perfect, but it's free :-). Is also uses the Detour technique.
It stores every call (you must manual set which functions you want to profile),
so it can make an exact call history tree, including a time chart (!).
This is just speculation, but perhaps AQtime is based on a technology that is similar to Microsoft Detours?
Detours is a library for instrumenting
arbitrary Win32 functions on x86, x64,
and IA64 machines. Detours intercepts
Win32 functions by re-writing the
in-memory code for target functions.
I don't know about Delphi in particular, but a C application debugger can do line-by-line profiling relatively easily - it can load the code and associate every code path with a block of code. Then it can break on all the conditional jump instructions and just watch and see what code path is taken. Debuggers like gdb can operate relatively efficiently because they work through the kernel and don't modify the code, they just get informed when each line is executed. If something causes the block to be exited early (longjmp), the debugger can hook that and figure out how far it got into the blocks when it happened and increment only those lines.
Of course, it would still be tough to code, but when I say easily I mean that you could do it without wasting time breaking on each and every instruction to update a counter.
The long-since-defunct TurboPower also had a great profiling/analysis tool for Delphi called Sleuth QA Suite. I found it a lot simpler than AQTime, but also far easier to get meaningful result. Might be worth trying to track down - eBay, maybe?