A stopwatch that ticks at a specific rate? - time

I wrote a simple stopwatch class that begins by setting its _start data member to System.getCurrentTicks(); calling stopwatch.elapsed() simply returns System.getCurrentTicks() - _start.
However, I want to be able to speed up my stopwatch or slow it down, with something like stopwatch.setSpeed(2.0f); //ticks 2x faster, but I can't figure out what the code should look like. It might involve manipulating _start, but I'm not sure how. I searched for this on the web and found nothing. Any ideas?
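One common way to get this behaviour, sketched here in Python rather than the asker's language, is to bank the scaled time whenever the speed changes and then restart the current segment. The class name, the injectable clock parameter, and the _accumulated and _speed members are all assumptions made for illustration; only _start, elapsed() and the idea of setSpeed come from the question.

```python
import time


class VariableSpeedStopwatch:
    """Stopwatch whose reported time can run faster or slower than real time."""

    def __init__(self, clock=time.monotonic):
        # `clock` is a hypothetical stand-in for System.getCurrentTicks().
        self._clock = clock
        self._speed = 1.0
        self._accumulated = 0.0      # scaled time banked before the last speed change
        self._start = self._clock()  # real time at construction or last speed change

    def elapsed(self):
        # Banked time plus the current segment, scaled by the current speed.
        return self._accumulated + (self._clock() - self._start) * self._speed

    def set_speed(self, speed):
        # Bank what has elapsed at the old speed, then begin a new segment.
        now = self._clock()
        self._accumulated += (now - self._start) * self._speed
        self._start = now
        self._speed = speed


# Usage with a fake clock so the numbers are deterministic.
ticks = [0.0]
sw = VariableSpeedStopwatch(clock=lambda: ticks[0])
ticks[0] = 10.0          # 10 real ticks at speed 1.0
sw.set_speed(2.0)
ticks[0] = 15.0          # 5 more real ticks at speed 2.0
print(sw.elapsed())      # 20.0
```

Banking the time at each speed change keeps elapsed() continuous: changing the speed never makes the reading jump, it only changes how fast it grows from that point on. Adjusting _start alone could not do that, because a single offset cannot record that earlier segments ran at different rates.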

Related

What's the correct way to CheckDeviceState in DirectX11?

I have built the DX11VideoRenderer sample (a replacement for EVR that uses DirectX11 instead of EVR's DirectX9), and it's working. Problem is, it's not working very well. It's using twice the CPU time that the EVR does for the same videos (more on this in the next question).
Since I've got the source, I decided to profile it to see what's going on. (Among other things) this led me to:
HRESULT DX11VideoRenderer::CPresenter::CheckDeviceState(BOOL* pbDeviceChanged)
I'm not much of a DirectX expert (actually, I'm not one at all), but it seems likely that window handles can be invalidated as monitors get unplugged, windows go full screen, get closed, etc., so a function like this makes perfect sense to me.
However.
When I look at the code for CheckDeviceState, the first thing it does is call SetVideoMonitor, which seems odd.
SetVideoMonitor looks like the routine you call when you first initialize the presenter (or change the target window), not something you'd call repeatedly to "Check" the device state.
Indeed, SetVideoMonitor calls TerminateDisplaySystem, followed by InitializeDisplaySystem. I could see doing this once at startup, but those functions are being called once per frame. That can't be right.
I can comment out the call to SetVideoMonitor in CheckDeviceState (or actually all of CheckDeviceState), and the code continues to function correctly (it's predictably a bit faster). But then I'm not checking the device state anymore.
Trying to figure out the proper way to check for state changes in DX11 brought me here, which talks about just checking the return codes of IDXGISwapChain::Present and ResizeBuffers. Is that how this should be done? Because that makes it seem like this whole routine is a leftover from DX9 (where it still would have been poorly implemented).
What's the correct way to check the device state in DX11? Is this even a thing anymore?

Performance implications of function calls in PSM1 Modules

I have a function that does a find/replace on text files, and it has worked well for some time. Until I needed to process a 12 million line file.
My initial code used Get-Content and Write-Content, and with the massive file it was going to take hours to process, not to mention the memory implications of loading 12 million lines into RAM.
So, I wrote a little test script to compare that approach with StreamReader/StreamWriter. Streaming looked like it was going to be a massive performance improvement, dropping processing to 30 seconds. I then added a .Replace() on each line, and total processing time only went up to maybe a minute. All good. So then I went to implement it in my real code, and performance tanked again. That code is a PS1 that loads a number of PSM1 files. The function that does the find/replace is in one of those PSM1 files, and that code calls functions in another PSM1. The test script was everything in a single small PS1.
Given that my test script didn't use a function call at all, I tested that first, so there is a function in the PS1 that gets called 12 million times from the loop in the same PS1. No real performance impact.
So, my thought then was that calling a function in one PSM1 that then calls a function in another PSM1 (12 million times) might be the issue. So I made a dummy function (which just returns the passed string, as if no replacement was needed) in the same PSM1 as the loop. And that is orders of magnitude slower.
I have not tested this with everything in the PS1, mostly because these functions are needed in three different scripts with very different argument requirements, so implementing it with Modules really made a lot of sense logistically, and changing that would be a massive undertaking.
That said, is there a known performance hit when calling a function that lives in a Module? I was under the impression that once the Modules are loaded, it's basically the same as if it was all in a single PS1, but perhaps not? FWIW, I am not using NameSpaces. All of my functions just have a prefix on the noun side of the function name to avoid conflicts.
I also can't really post minimally functional code very easily since that's in a single file that doesn't exhibit the behavior. If there is no obvious answer to someone I guess my next step is to implement the test script with some modules, but that's not really apples to apples either, since my real modules are rather large.
To add a little context: When the function (in a PSM1) does not call a function and simply sets $writeLine = $originalLine total time is 15 seconds.
When doing an actual find and replace inline (no call to a function) like this $writeLine = $originalLine.Replace($replace, $with) total processing time is 16 seconds.
When calling a function in the same PSM1 that just returns the original string total time is 17 minutes.
But again, when it's all in a PS1 file with no modules, calling a function has minimal impact. So it certainly seems like calling a function in a PSM1, even from a function in that same PSM1, has a massive performance overhead.
And more context:
I moved the replace function in the test script into a Module. No appreciable change. So I moved the main code, including the loop, into a function in that module, and called it from the main script. Again, no real change. Both took around 15 seconds.
So, it's not something innate in Modules. That raises the question: what could I be doing in my other modules that would trigger this behavior? These modules are 3000-10,000 lines of code, so there is a lot going on. Hopefully someone has some insight as to best practices with modules to mitigate this. And hopefully it's not "Don't use big modules". ;)
Final update:
It seems it IS a function of how big the module is. I deleted all the other functions in the Module that contains the loop, and performance is fine, 17 seconds. So, basically even as of PS5.0, the implementation of modules is pretty useless for anything large. Rather disconcerting. I wonder if the same would be true if all the functions were in a single file, and PowerShell performance with large files with lots of functions is just bad? Anyone have any experience down this road?

How can I determine what objects ARC is retaining using Instruments or viewing assembly?

This question is not about finding out who retained a particular object but rather looking at a section of code that appears from the profiler to have excessive retain/release calls and figuring out which objects are responsible.
I have a Swift application that after initial porting was spending 90% of its time in retain/release code. After a great deal of restructuring to avoid referencing objects I have gotten that down to about 25%, but this remaining bit is very hard to attribute. I can see that a given chunk of it is coming from a given section of code using the profiler, but sometimes I cannot see anything in that code that should (to my understanding) be causing a retain/release. I have spent time viewing the assembly code in both Instruments (with the side-by-side view when it's working) and the output of otool -tvV, and sometimes the proximity of the retain/release calls to a recognizable section gives me a hint as to what is going on. I have even inserted dummy method calls in places just to give me a better handle on where I am in the code, and turned off optimization to limit code reordering, etc. But in many cases it seems like I would have to trace the code back to follow branches and figure out what is on the stack in order to understand the calls, and I am not familiar enough with x86 to know if that is practical. (I will add a couple of screenshots of the assembly view in Instruments and some otool output for reference below).
My question is - what else can I be doing to debug / examine / attribute these seemingly excessive retain/release calls to particular code? Is there something else I can do in Instruments to count these calls? I have played around with the allocation view and turned on the reference counting option but it didn't seem to give me any new information (I'm not actually sure what it did). Alternately, if I just try harder to interpret the assembly should I be able to figure out what objects are being retained by it? Are there any other tools or tricks I should know on that front?
EDIT: Rob's info below about single stepping into the assembly was what I was looking for. I also found it useful to set a symbolic breakpoint in Xcode on the lib retain/release calls and log the item on the stack (using Rob's suggested "p (id)$rdi") to the console in order to get a rough count of how many calls are being made rather than inspect each one.
You should definitely focus on the assembly output. There are two views I find most useful: the Instruments view, and the Assembly assistant editor. The problem is that Swift doesn't support the Assembly assistant editor currently (I typically do this kind of thing in ObjC), so we come around to your complaint.
It looks like you're already working with the debug assembly view, which gives somewhat decent symbols and is useful because you can step through the code and hopefully see how it maps to the assembly. I also find Hopper useful, because it can give more symbols. Once you have enough "unique-ish" function calls in an area, you can usually start narrowing down how the assembly maps back to the source.
The other tool I use is to step into the retain bridge and see what object is being passed. To do this, instruction-step (^F7) into the call to swift_bridgeObjectRetain. At that point, you can call:
p (id)$rdi
And it should print out at least some type information about what's being passed ($rdi is correct on x86_64, which is what you seem to be working with). I don't always have great luck extracting more information. It depends on exactly what is in there. For example, sometimes it's a ContiguousArrayStorage<Swift.CVarArgType>, and I happen to have learned that usually means it's an NSArray. I'm sure better experts in LLDB could dig deeper, but this usually gets me at least in the right ballpark.
(BTW, I don't know why I can't call p (id)$rdi before jumping inside bridgeObjectRetain, but it gives strange type errors for me. I have to go into the function call.)
Wish I had more. The Swift tool chain just hasn't caught up to where the ObjC tool chain is for tracing this kind of stuff IMO.

OpenMP4 - Why does calling a function multiple times make it faster

So I'm currently learning OpenMP 4.
I noticed that when I call a function a second time, it gets significantly faster.
The omp block is inside this function.
In my example the first call takes 5 seconds and the second only 0.3 seconds.
I am using the Intel compiler (icc) with an Intel Xeon Phi (60 cores, 240 threads).
Could someone please explain why this is happening?
I think that is due to the OpenMP threading initialization: creating the thread team, initializing each thread, etc. This is in fact an expensive procedure.
I don't know what your code looks like, but this can also be the effect of caching. The first time, the MIC will start caching the needed data. The second time, the data will already be cached and ready. But I can't confirm this until I see the code.
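To check whether the first call is paying a one-time setup cost, it helps to time each call separately and compare the first against the rest. The sketch below is not OpenMP; it is a language-neutral illustration in Python, and LazyWorker with its sleep is a hypothetical stand-in that only simulates a one-time initialization such as creating a thread team.

```python
import time


def time_calls(fn, repeats=3, clock=time.perf_counter):
    """Call fn `repeats` times and return the duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = clock()
        fn()
        durations.append(clock() - start)
    return durations


class LazyWorker:
    """Hypothetical stand-in for a function with a one-time setup cost."""

    def __init__(self, setup_cost=0.05):
        self._setup_cost = setup_cost
        self.initialized = False

    def __call__(self):
        if not self.initialized:
            # Simulates expensive first-use initialization (assumed, not measured).
            time.sleep(self._setup_cost)
            self.initialized = True
        # The actual (cheap) work would go here.


worker = LazyWorker()
first, *rest = time_calls(worker, repeats=3)
print(f"first call: {first:.3f}s, slowest later call: {max(rest):.6f}s")
```

If the later calls are consistently fast, a common practice is to run one untimed warm-up call before benchmarking so the measurement reflects the steady state.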

Performance tuning VBA code in large procedure

I've been asked to tune the performance of a specific function which loads every time a worksheet is opened (so it's important that it doesn't make things slow). One of the things that seems to make this function slow is that it does a long call to the database (which is remote), but there are a bunch of other possibilities too. So far, I've been stepping through the code, and when something seems to take a long time making a note of it as a candidate for tuning.
I'd like a more objective way to tell which calls are slowing me down. Searching for timing and VBA yields a lot of results which basically amount to "Write a counter, and start and stop it either side of the critical section" (often with the macro explicitly called). I was wondering whether there was a way to (in the debugger) do something like "Step to next line, and tell me the time elapsed".
If not, can someone suggest a reasonable macro that I could use in the Immediate window to get what I'm after? Specifically, I would like to be able to time an arbitrary line of code within a larger procedure (rather than a whole procedure at once, which is what I found through Google).
A keyword for your further search would be a "Profiler" for VBA. I've heard of VB Watch and VBA Code Profiler System (VBACP), as well as Stephen Bull's PerfMon, but apart from the latter they're mostly not free.
So much for the official part of my answer. Here are some extra, possibly useless, suggestions:
Identifying slow code by "human measurement" (run a line and say: "Woah, that takes forever") in the debugger is certainly helpful, and you can then start looking into why those lines are slow. Your remote database call may take quite long if it has to transmit a lot of data, in which case it may be a good idea to timestamp the data on both ends and ask the DB whether the data has been modified before you grab it.
Writing the data into the sheet may be slow depending on the way you write it - which can sometimes be improved by writing arrays to a range instead of some form of iteration.
And I probably don't need to tell you about ScreenUpdating and EnableEvents and so on?

Resources