Understanding Xcode build time summary

I am trying to generate the Xcode build summary for my project so that I can optimize the bottlenecks, as per the attached screenshot.
The total build time shown at the bottom is 135.3 seconds, while the first item, CompileC, takes 449.356 seconds. I know Xcode does some parallelization while building the project, but I am not sure how it calculates this summary time. Can anyone explain this?

I know this is old, but I was looking into this, and I came across this comment by Rick Ballard, an Apple Xcode build system engineer.
Yes – many commands, especially compilation, are able to run in parallel with each other, so multicore machines will indeed finish the build much faster than the time it took to run each of the commands.
In other words, the quoted numbers are core-seconds, not real time, except for the last one. So if you have six cores, your CompileC task might only take 449/6 = 75 seconds. You've got, maybe, 660 core-seconds, so you'd get about 110 clock-seconds, which looks about right versus 135 total time.
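To make the arithmetic concrete, here is a minimal C# sketch of the idea; the per-command breakdown and the six-core count are assumptions for illustration, not values from the actual build log:

using System;
using System.Linq;

class BuildTimeEstimate
{
    static void Main()
    {
        // Per-command durations as reported in the build log (core-seconds).
        // Only the CompileC figure and the 135.3 s total come from the question;
        // the remaining values are made up to reach roughly 660 core-seconds.
        double[] taskCoreSeconds = { 449.356, 120.0, 60.0, 30.0 };
        int cores = 6; // assumed core count

        double totalCoreSeconds = taskCoreSeconds.Sum();
        // With perfect parallelism, wall-clock time would be total core-seconds / cores.
        // The real total (135.3 s) is higher because parallelism is never perfect.
        Console.WriteLine($"Ideal wall-clock estimate: {totalCoreSeconds / cores:F1} s");
    }
}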

Related

Interpreting Xcode Time profiler

I profiled my app for about 38 seconds and selected the 12 seconds that have UI problems. It looks to me like the profiler is telling me that, out of the 12 seconds I have selected, more than 3 seconds are spent removing notification observers. Is this the correct way to interpret these results?
It's telling you it's spending 3 seconds out of 12 doing _CFXNotificationRemoveObservers.
Is that useful?
I would think you'd want to know why it's doing that, and whatever else it is doing as well.
It's giving you a very incomplete picture.
If you simply paused it at random a few times during those 12 seconds, you would be using the random-pausing technique.
It tells you not only what the program is doing at the time you stopped it, but you can see why by reading the stack.
If it is spending any of that time doing I/O or blocking system calls, you will see that too.
Is this the correct way to interpret these results?
Yes, that's correct -- according to your image, your app is spending a lot of time removing observers.

Give all possible resources to a program

I created a program in C# to work with 2.5 million records in Oracle Express (local instance), parse/split those records and create an additional 5 million records.
I added some code to print times on the screen and it seems fairly fast: it does all the processing for 1K records every 9 seconds, which means it will take more than 6 hours to finish.
Now, with Task Manager I can see the program is using 6% of CPU (max) and around 50 MB of memory. I understand the OS and Oracle itself need resources to operate, but... is there a way to tell this little program "hey, it's ok, go ahead and use at least 50% of the CPU; there are 4 GB of RAM, so knock yourself out"?
Note: One of the reasons I'm using a local instance with Oracle Express is to reduce the network bottleneck. Also I might not run this process quite often but I was intrigued to see if this was at all possible.
Please forgive my noobness,
Thanks!
The operating system will give your program all the resources it needs. The reason your process is not consuming all the CPU is probably that it's waiting on the I/O subsystem more than the processor.
If you want to see whether you can consume more CPU cycles, try writing a program that runs a tight infinite loop as fast as possible and you will see the difference in CPU usage.
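For illustration, a minimal C# sketch of such a loop (one busy thread per logical core; this is a throwaway experiment to compare against in Task Manager, not production code):

using System;
using System.Threading.Tasks;

class BusyLoop
{
    static void Main()
    {
        // Spin one busy task per logical core; CPU usage should approach 100%,
        // unlike an I/O-bound program that spends most of its time waiting.
        Parallel.For(0, Environment.ProcessorCount, _ =>
        {
            while (true) { }
        });
    }
}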
A number of thoughts, not really answers I guess, but:
You could raise the priority of the application's thread; however, it's possible the code may be less efficient than you think, so:
Have you run a profiler on it?
If it's currently a single-threaded app, you could look at processing the records in batches and running those batches in parallel (see the sketch after this list).
Without knowing much detail about how the records are split, is it possible to hand more of that off to Oracle to do? Then it would matter less whether the database is local or across the network.
If your app is drawing or updating a screen or UI, that will almost certainly slow the work down. An example: I ran an app which sorted about 10k emails (around 250k lines) into a database; when I added an item to a listbox for each line, the run time went from short to ridiculous (I got bored and killed it). So, again, offloading the work to a thread and doing as few UI updates as possible can help.
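As a rough sketch of the batching idea mentioned above (in C#; the record type, LoadRecords, ProcessBatch, and the batch size are hypothetical placeholders, not the asker's actual code):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class BatchProcessor
{
    // Hypothetical record type standing in for the asker's Oracle rows.
    record SourceRecord(string Raw);

    static void Main()
    {
        List<SourceRecord> records = LoadRecords(); // e.g. read the 2.5M rows once
        const int batchSize = 1000;

        // Group the rows into batches of 1K.
        var batches = records
            .Select((row, index) => (row, index))
            .GroupBy(x => x.index / batchSize, x => x.row);

        // Parse/split each batch on a separate core; keep the database writes
        // batched as well so the work isn't dominated by round trips.
        Parallel.ForEach(batches, batch => ProcessBatch(batch.ToList()));
    }

    static List<SourceRecord> LoadRecords() => new();      // placeholder
    static void ProcessBatch(List<SourceRecord> batch) { } // placeholder: parse, split, insert
}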

Performance issue between builds

I've been developing a small indie game in my spare time and have run across an inexplicable issue. Some builds of the game will randomly run several hundred frames per second slower than other builds. For example, when rendering some text and no 3D scene, I can achieve 1800 FPS on my own hardware. Add one 3D sphere (10k verts, pixel shaded): 1700 FPS. Add two more spheres: 800 FPS. Remove all the spheres: 1100 FPS, even though the code now renders the same scene that previously ran at 1800 FPS, i.e. just the FPS counter. I've tried rebuilding and cleaning the project and restarting the compiler. This is in Release mode and I turned on all the optimizations I could find. Any suggestions as to the cause?
I ran a quick profile, and Visual Studio seems to think that over 90% of my time was spent in D3D9_43.dll, suggesting that it's not a bug in my app, which doesn't explain why it manifests in only some builds.
I rebooted my machine and it's back up to 1800FPS. I think it's a bug in the DirectX SDK tools (amongst many others). Going to delete this question.
I don't know if MSVC does this, but GCC does:
When GCC cannot determine the most likely branch it throws the dice.
If MSVC does that, it may be that in each build an important branch point is being predicted one way or the other and it makes a difference.
You can fix that by doing a PGO build: profile-guided optimization. That profiles the code at run time and uses the results to predict the branches correctly; at least, the predictions will be correct if your test run is a representative sample.
That said, the results are not usually so dramatic. If you had more objects in the scene and more code involved, the differences would even out.
Another possibility: CPU speed scaling.
If your program spends most of its time executing on the GPU, CPU usage may not rise high enough to push the CPU into full speed.
Try setting Windows power management to full speed instead of balanced and see if it changes anything.

How to quantify your "slow" development machine?

(Please provide the question this one duplicates; I'm disappointed I couldn't find it.)
My development machine is "slow". I wait on it "a lot".
I've been asked by decision makers who want to help to fairly and accurately measure that time. How do you quantify the amount of time you spend waiting on the computer (during compiles, waiting for apps to open every day, etc.)?
Is there software which effectively reports on this sort of thing? Is there an OS metric (I/O something something, pagefile swapping frequency, etc, etc) that captures and communicates this particularly well? Some sort of benchmark you'd recommend me testing against?
EDIT: I'm writing C# (mostly ASP.NET).
Here's one metric that may impress some higher-ups: measure the average time it takes to build your application, and how many times you do that per day. For instance, we ended up with ~100 builds a day @ 60 secs each.
Now, measure the average build time on a presumably faster machine (say 30 secs per build).
At this point you can see how much time it would save you to have the 'faster' machine. Per-developer, per-day. Multiply by the number of developers, and the days in a month and you can see how this stacks against adding another developer to the team.
Yes, I know, there are other considerations when adding more people to a team, but this will give you a rough comparison that higher-ups can relate to. For instance: if we all had faster machines, the time saved on builds would be comparable to adding one extra developer.
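For example, with the illustrative numbers above and an assumed ten-developer team: 30 seconds saved per build × 100 builds a day ≈ 50 minutes a day per developer; × 10 developers × 20 working days ≈ 167 hours a month, which is roughly one developer's time.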
On the other side, you should provide good estimates of the cost of upgrading everyone's machine.
Now, if you can, you should run this type of comparison against multiple 'faster' machines to determine their relative performance and perhaps identify which bottlenecks you are facing (RAM vs. CPU vs. I/O?).
Finally, my personal take: while this process and the ensuing discussion with the stakeholders take place (and it may take a while), you could get everyone bigger or more monitors. That's a relatively cheap upgrade (of course, not that cheap if you go for 52" LCD monitors, right?) and more monitor estate does improve productivity (protip: it also improves employee morale, which, in turn, improves productivity).
HTH
Close Firefox to free some memory. Add RAM. Helped me a lot.
Depends on your work environment. E.g. in Visual Studio (C++, 2005) you can do timed builds, such that the IDE prints the elapsed time after the regular build output.
Quantification is difficult when you don't have anything to measure/compare against. If your dev-box takes 12 minutes to compile a project of 100,000 lines of code without any other dev-box to measure against you have no idea if this is good or bad. Maybe 12 minutes for 100,000 lines is actually good?
Measuring it won't help you and it certainly won't help your decision makers. Consider: "Yes boss, it takes an average of twelve minutes to compile our project." The boss says, "Ok, is that normal?" You have no idea.
Computer hardware is cheap. Look at the dev-box and consider asking the decision makers to throw some cash at it to improve its performance. If you compile on average 5 times a day and it takes an average of 12 minutes a compile, that's a lost hour every single day - adding up to 5 lost hours a week. Well worth the cost of some RAM or a CPU upgrade.
To me, a slow machine does not kill productivity as much as unexpected slowdowns do. If the machine takes 12 minutes to compile the whole solution every time you press F5, the solution has a problem, not the machine. Besides that, I don't have a problem with 12 minutes; I can get up and take a break. It's actually good to take a break when you know, and have control over, how long the break will be.
What I find to be the biggest productivity killer is corporate software that starts a virus scan (or installs updates) whenever it wants; having to sit there and wait is a pain.

How do I get repeatable CPU-bound benchmark runtimes on Windows?

We sometimes have to run CPU-bound tests where we want to measure runtime. The tests last on the order of a minute. The problem is that the runtime varies quite a lot from run to run (+/- 5%). We suspect the variation is caused by activity from other applications/services on the system, e.g.:
Applications doing housekeeping in their idle time (e.g. Visual Studio updating IntelliSense)
Filesystem indexers
etc..
What tips are there to make our benchmark timings more stable?
Currently we minimize all other applications, run the tests at "Above Normal" priority, and don't touch the machine while the test runs.
The usual approach is to perform lots of repetitions and then discard outliers. So, if a distraction such as the disk indexer only crops up once every hour or so, and you do 5-minute runs repeated for 24 hours, you'll have plenty of results where nothing got in the way. It is a good idea to plot the probability density function to make sure you understand what is going on. Also, if you are not interested in startup effects, such as getting everything into the processor caches, make sure the experiment runs long enough to make them insignificant.
First of all, if it's just about benchmarking the application itself, you should measure CPU time, not wall-clock time. That is then (almost) free from the influence of whatever the other processes or the system are doing. Secondly, as Dickon Reed pointed out, more repetitions increase confidence.
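A minimal C# sketch of that distinction (the workload here is a hypothetical stand-in for the real test):

using System;
using System.Diagnostics;

class CpuTimeVsWallClock
{
    static void Main()
    {
        var process = Process.GetCurrentProcess();
        TimeSpan cpuBefore = process.TotalProcessorTime;
        var wallClock = Stopwatch.StartNew();

        RunBenchmark(); // placeholder for the CPU-bound test

        wallClock.Stop();
        process.Refresh(); // re-read the process counters
        TimeSpan cpuUsed = process.TotalProcessorTime - cpuBefore;

        // CPU time is (almost) unaffected by other processes; wall-clock time is not.
        Console.WriteLine($"CPU time:        {cpuUsed.TotalSeconds:F3} s");
        Console.WriteLine($"Wall-clock time: {wallClock.Elapsed.TotalSeconds:F3} s");
    }

    static void RunBenchmark()
    {
        // Hypothetical CPU-bound workload.
        double x = 0;
        for (int i = 1; i < 50_000_000; i++) x += Math.Sqrt(i);
    }
}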
A quote from the VC++ team blog on how they do performance tests:
To reduce noise on the benchmarking machines, we take several steps:
Stop as many services and processes as possible.
Disable the network driver: this will turn off the interrupts from the NIC caused by broadcast packets.
Set the test’s processor affinity to run on one processor/core only.
Set the run to high priority, which will decrease the number of context switches.
Run the test for several iterations.
I do the following:
Call the method x times and measure the time
Do this n times and calculate the mean and standard deviation of those measurements
Try to get x to the point where each measurement takes more than 1 second. This will reduce the noise a bit.
The mean will tell you the average performance of your test and the standard deviation the stability of your test/measurements.
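A minimal C# sketch of that measurement loop (the method under test and the values of x and n are placeholders):

using System;
using System.Diagnostics;
using System.Linq;

class BenchmarkRunner
{
    const int x = 1_000_000; // inner repetitions, chosen so one sample takes > 1 second
    const int n = 10;        // number of samples

    static void Main()
    {
        double[] samples = new double[n];
        for (int i = 0; i < n; i++)
        {
            var sw = Stopwatch.StartNew();
            for (int j = 0; j < x; j++)
                MethodUnderTest(); // placeholder for the code being benchmarked
            sw.Stop();
            samples[i] = sw.Elapsed.TotalSeconds;
        }

        double mean = samples.Average();
        double stdDev = Math.Sqrt(samples.Sum(s => (s - mean) * (s - mean)) / (n - 1));
        Console.WriteLine($"mean = {mean:F3} s, stddev = {stdDev:F3} s");
    }

    static void MethodUnderTest()
    {
        // Hypothetical workload.
        _ = Math.Sqrt(12345.678);
    }
}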
I also set my application to a very high priority, and when I test a single-threaded algorithm I pin it to one CPU core to make sure there is no scheduling overhead.
This code demonstrates how to do this in .NET:
using System;
using System.Diagnostics;
using System.Threading;

// Raise thread and process priority to reduce context switches.
Thread.CurrentThread.Priority = ThreadPriority.Highest;
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.RealTime;
// Pin the process to the last core so the scheduler doesn't move it around.
if (Environment.ProcessorCount > 1)
{
    Process.GetCurrentProcess().ProcessorAffinity =
        new IntPtr(1 << (Environment.ProcessorCount - 1));
}
