Handling very large SFTP uploads (Cocoa)

I'm working on a small free Cocoa app that involves some SFTP functionality, specifically uploads. The app is nearing completion; however, I've run into a pretty bad issue with uploading folders that contain a lot of files.
I'm using ConnectionKit to handle the uploads:
CKTransferRecord *record;
record = [connection recursivelyUpload:@"/Users/me/large-folder"
                                    to:@"/remote/directory"];
This works fine for most files and folders. In this case, though, @"/Users/me/large-folder" has over 300 files in it. Calling this method spins my CPU up to 100% for about 30 seconds and my application is unresponsive (Mac spinning ball). After the 30 seconds the upload is queued and works fine, but this is hardly ideal. Obviously whatever is enumerating these files is locking up my app until it's done.
Not really sure what to do about this. I'm open to just about any solution - even using a different framework, although I've done my research and ConnectionKit seems to be the best of what's out there.
Any ideas?

Use Shark. Start sampling, start the upload, and as soon as the hang ends, stop sampling.
If the output confirms that the problem is in ConnectionKit, you have two choices:
Switch to something else.
Contribute a patch that makes it not hang.
The beauty of open-source is that #2 is possible. It's what I recommend. Then, not only will you have a fast ConnectionKit, but once the maintainers accept your patch, everyone else who uses CK can have one too.
And if Shark reveals that the problem is not in ConnectionKit (rule #2 of profiling: you will be surprised), then you have Shark's guidance on how to fix your app.

Since the problem is almost certainly in the enumeration, you'll likely need to move that work into an asynchronous operation. Most likely ConnectionKit is using NSFileManager's -enumeratorAtPath: for this. If that's the main problem, then the best solution is probably to move the enumeration onto its own thread. Given the very long time involved, I suspect it's actually reading the files during enumeration; the fix for that is to read the files lazily, right before uploading each one.
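As an illustration, here is a minimal sketch of pushing the slow call onto a background queue with GCD so the main thread stays responsive. It assumes ConnectionKit tolerates being driven from a secondary thread (verify that against the CKConnection documentation), and -transferDidQueue: is a hypothetical callback on your own controller:

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // The expensive directory enumeration now happens off the main thread.
    // Assumption: ConnectionKit must be safe to call from a background queue.
    CKTransferRecord *record = [connection recursivelyUpload:@"/Users/me/large-folder"
                                                          to:@"/remote/directory"];
    dispatch_async(dispatch_get_main_queue(), ^{
        // Hop back to the main thread before touching any UI.
        [self transferDidQueue:record];  // hypothetical completion method
    });
});

On pre-10.6 systems the same pattern works with -performSelectorInBackground:withObject: and -performSelectorOnMainThread:withObject:waitUntilDone: instead of GCD.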
Peter is correct that Shark is helpful, but after being a long-time Shark fan, I've found that Instruments tends to give more usable answers more quickly. With Instruments you can more easily add disk I/O and memory allocation tracks alongside the CPU sampler.
If you're blocking just one core at 100%, I recommend setting Active Thread to "Main Thread" and setting Sample Perspective to "All Sample Counts." If you're blocking all your cores at 100%, I recommend setting Active Thread to "All Threads" and Sample Perspective to "Running Sample Times."

Related

Best practices for Document Based Core Data application that takes a long time to open

I have an OS X Cocoa application (10.8-10.9) that opens very large NSDocument (Core Data backed) files of about 2 GB or more. The initial load of the document takes about 20-40 seconds, but after that it's pretty snappy. The 20-40 seconds doesn't exactly make for a good UI experience, so I'd like to fix that.
I would like to either A) make the document opening faster, or B) show a "loading" screen.
I'm wondering what (if anything) folks have done to A) make opening a Core Data document faster (even if it's doing stuff in the background) or B) display a splash screen / progress bar during the opening process.
Regarding B) (this isn't really a two-part question; I just want to demonstrate that I have done research): showing a splash screen by adding methods to the NSDocument subclass works if it's done in the windowControllerWillLoadNib: and windowControllerDidLoadNib: methods, but only after the first document has been opened (I'm sure there is a workaround). Either way, there is no "progress" I can see that I could hook into anyway.
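For reference, a minimal sketch of that splash-panel idea in an NSDocument subclass (_loadingPanel is a hypothetical ivar, and the panel is only genuinely useful if the slow loading is moved off the main thread, as the answers below suggest):

- (void)windowControllerWillLoadNib:(NSWindowController *)windowController
{
    // Show a simple "Loading…" panel before the document window exists.
    _loadingPanel = [[NSPanel alloc] initWithContentRect:NSMakeRect(0, 0, 300, 80)
                                               styleMask:NSTitledWindowMask
                                                 backing:NSBackingStoreBuffered
                                                   defer:NO];
    [_loadingPanel setTitle:@"Loading…"];
    [_loadingPanel center];
    [_loadingPanel makeKeyAndOrderFront:nil];
}

- (void)windowControllerDidLoadNib:(NSWindowController *)windowController
{
    [super windowControllerDidLoadNib:windowController];
    // The nib (and the document behind it) has loaded; dismiss the panel.
    [_loadingPanel orderOut:nil];
    _loadingPanel = nil;
}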
What does -getLociWithChromsomes: do? What happens when you drill into that method in Instruments to see what is taking all the time?
Same question with -GetAllLoci: and -getMaxLocusSnps?
From the small amount of data I can see, it appears that you are doing a ton of fetching on launch. That is a bad design no matter what platform you are on. You should avoid doing a lot of fetching during the launch of the document and delay it until after the initial document/application launch has completed.
You could use a multi-threaded design here and it probably would help but it is really masking the problem.
The core of the issue appears to be that you are trying to do too much on launch. I am not certain whether the entire delay is in Core Data or in what you are doing with the data once it is retrieved, as I do not have code-level access through the trace you provided. It would be interesting to see screenshots of those methods in Instruments with the time percentages highlighted.
Update
Again, you are loading too much during launch and blocking the UI. You have three answers all saying the same basic thing. This is not a Core Data problem. This is a performance problem because you are loading too much on launch or you are doing too heavy of a calculation on launch.
In addition to the specific hints Duncan gave you, it's always useful to check out the latest (and free) WWDC videos on ADC to get an idea of the patterns OS X and Cocoa provide for boosting an app's performance. Some starting points:
WWDC '12
- Asynchronous Design Patterns with Blocks, GCD, and XPC
WWDC '13
- Designing Code for Performance
Try opening the store on a background thread and, once it's open, activating the UI; that seems to work OK.
I create a background thread to call [psc addPersistentStoreWithType:configuration:URL:options:error:], and once that's complete I hand control back to the main thread, where the managedObjectContext gets created and the UI is enabled.
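A minimal sketch of that approach using GCD (storeURL and -setUpContextAndEnableUI: are hypothetical; substitute your document's store URL and your own main-thread completion method):

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSError *error = nil;
    // Adding the persistent store is the slow part (it also covers any model
    // migration), so it happens off the main thread.
    [psc addPersistentStoreWithType:NSSQLiteStoreType
                      configuration:nil
                                URL:storeURL        // hypothetical: your document's store URL
                            options:nil
                              error:&error];
    dispatch_async(dispatch_get_main_queue(), ^{
        // Back on the main thread: create the managed object context and
        // re-enable the UI (hypothetical helper method).
        [self setUpContextAndEnableUI:error];
    });
});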
You also face this issue if Core Data has to do a big upgrade to a new model version. Some of my files take around a minute, and if they are in iCloud it can be even longer. If the upgrade happens on an iOS device it can take a few minutes, so this seems to be important for iOS apps as well.
BTW I have posted sample apps here
http://ossh.com.au/design-and-technology/software-development/sample-library-style-ios-core-data-app-with-icloud-integration/

NSURLDownload downloadDidFinish: called too early

One of the main components of the OS X application I work on is a downloading system, which is built around NSURLDownload.
The vast majority of our users (as well as our internal dev team) don't have any issues with downloading files. However, we seem to have a small percentage of random users (1-2%) worldwide who do. For those users, NSURLDownload's downloadDidFinish: delegate method is being called before the download is complete. For example, if the user tries to download a 5 GB file, downloadDidFinish: might get called after just 3.5 GB of data have been transferred, resulting in an incomplete file. Obviously this is causing a lot of frustration for users.
We've spent a good deal of time on this problem but haven't made much progress. We've seen at least one case where OS X (erroneously?) calls downloadDidFinish: too early: when the Mac's volume runs low on free space. This struck me as odd; why wouldn't the download:didFailWithError: delegate method be called instead? But the vast majority of users with incomplete downloads have plenty of disk space, so we don't think low disk space is their problem.
So I guess my question is: Do you guys know of any reason (other than low disk space) why NSURLDownload downloadDidFinish: might be called before the download is complete? Thanks for any advice you can give us.
Anoop's suggestion above looks to be correct. We went back to the drawing board and simulated server-side disconnects better than we had been. We found that by doing so, our client software would receive NSURLDownload's downloadDidFinish: notification.
Personally I think it's strange that OS X would send our NSURLDownload objects a "finish" (rather than a "fail") notification in this situation. But I guess that's the way Apple wants it to work.
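Whatever the root cause, a defensive check (not from the original answer, just a hedged sketch) is to track how many bytes actually arrived via the standard NSURLDownload delegate callbacks and compare that against the server-reported length before treating the download as complete. _expectedLength and _receivedLength are hypothetical ivars on the delegate:

- (void)download:(NSURLDownload *)download didReceiveResponse:(NSURLResponse *)response
{
    _expectedLength = [response expectedContentLength];  // may be NSURLResponseUnknownLength
    _receivedLength = 0;
}

- (void)download:(NSURLDownload *)download didReceiveDataOfLength:(NSUInteger)length
{
    _receivedLength += length;
}

- (void)downloadDidFinish:(NSURLDownload *)download
{
    if (_expectedLength != NSURLResponseUnknownLength && _receivedLength < _expectedLength) {
        // Treat this as a truncated transfer: retry or resume instead of
        // trusting the "finished" callback.
        return;
    }
    // ...normal completion handling...
}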

What could cause the application as well as the system to slow down?

I am debugging an application which slows down the system very badly. The application loads a large amount of data (some 1000 files, each about half a MB) from the local hard disk. The files are loaded as memory-mapped files and are mapped only when needed. This means that at any given point in time the virtual memory usage does not exceed 300 MB.
I also checked the handle count using handle.exe from Sysinternals and found that there are at most some 8000-odd handles open. When the data is unloaded it drops to around 400. There are no handle leaks after each load/unload operation.
After 2-3 load/unload cycles, during one load, the system becomes very slow. I checked the virtual memory usage of the application as well as the handle count at this point, and both were well within limits (VM about 460 MB with not much fragmentation; handle count 3200).
I want to understand how an application could make the whole system so slow to respond. What other tools can I use to debug this scenario?
Let me be more specific: when I say "system", I mean that all of Windows slows down. Task Manager itself takes 2 minutes to come up, and most often a hard reboot is required.
The fact that the whole system slows down is very annoying: it means you cannot attach a profiler easily, and it would even be difficult to stop the profiling session in order to view the results (since you said it requires a hard reboot).
The tool best suited for the job in this situation is ETW (Event Tracing for Windows). These tools are great and will give you the exact answer you are looking for.
Check them out here
http://msdn.microsoft.com/en-us/library/cc305210.aspx
and
http://msdn.microsoft.com/en-us/library/cc305221.aspx
and
http://msdn.microsoft.com/en-us/performance/default.aspx
Hope this works.
Thanks
Tools you can use at this point:
Perfmon
Event Viewer
In my experience, when something prevents even Task Manager from popping up, the cause is usually hardware: the system event log in Event Viewer is sometimes full of warnings or errors that some hardware device is timing out.
If Event Viewer doesn't indicate that any kind of loggable hardware error is causing the slowdown, then try Perfmon: add counters for system objects to track file reads, exceptions, context switches, etc. per second and see if there's something obvious there.
Frankly, the sort of behavior you describe is meant to be impossible, by design, for user-mode code to cause. Windows NT goes to a lot of effort to insulate applications from each other and to prevent a rogue application from making the system unusable. So my suspicion is that some kind of hardware fault is to blame. Is there any chance you can simply run the same test on a different PC?
If you don't have profilers, you may have to do the same work by hand...
Have you tried commenting out all read/write operations, just to check whether the slow down disappears ?
"Divide and conquer" strategies will help you find where the problem lies.
If you run it under an IDE, run it until it gets really slow, then hit the "pause" button. You will catch it in the act of doing whatever takes so much time.
You can use tools like IBM Rational Quantify or Intel VTune to detect performance issues.
[EDIT]
As Benoît did, one good approach is to measure how long each task takes, to identify which one is eating CPU.
But remember: since you are working with many files, it is likely page/cache misses that are causing memory-to-disk swapping.
When Task Manager is taking 2 minutes to come up, are you seeing a lot of disk activity, or is it CPU-bound?
I would try Process Explorer from Sysinternals. When your system is in the slowed-down state and you try running, say, Notepad, pay attention to the page fault deltas.
Windows is very greedy about caching file data. I would try removing file I/O as someone suggested, and also making sure you close the file mapping as soon as you are done with a file.
I/O is probably causing your slowdown, especially if your files are on the same disk as the OS. Another way to test that would be to move your files to another disk and see if that alleviates the problem.

How to isolate causes of system hang on Unix/OSX

I am on OSX, and my system is becoming unresponsive for a few seconds roughly every 10 minutes. (It gives me the spinning beach ball of death). I was wondering if there was any way I could isolate the problem (I have plenty of RAM, and there are no pageouts/thrashing). Any Unix/OSX tools that could help me monitor and isolate the cause of this behaviour?
Activity Monitor (Cmd+Space, then type "Activity Monitor") should give you an intuitive overview of what's happening on your system. If, as you say, there are no processes clogging the CPU, do take a look at the disk I/O activity. Perhaps your disk is going south.
I have been having problems continually over the years with system hangs. It seems that generally they are a result of filesystem errors, however Apple does not do enough to take care of this issue. System reliability should be a 100% focus and I am certainly sick of these issues. I have started to move a lot of files and all backups over to a ZFS volume on a FreeBSD server and this is helping a bit as it has started to both ease my mind and allow me to recover more quickly from issues. Additionally I've placed my system volume on a large SSD (240GB as I have a lot of support files and am trying to keep things from being too divided up with symlinks) and my Users folders on another drive. This too has helped add to reliability.
Having said this, you should explore spindump and stackshot to see if you can catch frozen processes before the system freezes up entirely. It is very likely that you have an app or two attempting to access bad blocks and hanging the system, or a process blocking all others with a system call that's halting I/O.
Apple has used stackshot a few times with me over the last couple years to hunt some nasty buggers down and the following link can shed some light on how to perhaps better hunt this goblin down: http://www.stormacq.com/?p=346
Also try top -l2 -S > top_output.txt and examine the results for hangs / zombie processes.
The deeper you go into this, the more you may find it useful to subscribe to the kernel developer list (darwin-kernel@lists.apple.com), as there are some very, very sharp cookies on there who can shed light on some of the most obscure issues and help you understand precisely what panics are saying.
Additionally, you may want to uninstall any VMs you have installed. There is a particular developer who, I've heard from very reliable sources, has very faulty hypervisor issues and it would be wise to look into that if you have any installed. It may be time to clean up your kexts altogether as well.
But, all in all, we really quite desperately need a better filesystem and proactive mechanisms therein to watch for bad blocks. I praised the day and shouted for joy when I thought we were getting ZFS officially. I am doubtful Lion is that much better on the HFS+ front, sadly, and I am certainly considering ZFS for my Users volume and other storage on the workstation due to its ability to scrub for bad blocks and eliminate issues like these.
They are the bane of our existence on Apple hardware, and having worked in this field for 20 years with thousands of clients, I consider unhandled hard drive failure inexcusable at this point. Even if the actual manufacturers can't and won't fix it, the onus falls upon OS developers to handle exceptions better and to guard against such failures in order to hold off silent data loss and nightmares such as these.
I'd run a mixture of 'top' as well as tail -f /var/log/messages (or wherever your main log file is).
Chances are right before/after the hang, some error message will come out. From there you can start to weed out your problems.
Activity Monitor is the GUI version of top, and with Leopard you can use the 'Sample Process' function to watch what your culprit tasks are spending most of their time doing. Also in Utilities you'll find Console aka tail -f /var/log/messages.
As a first line of attack, I'd suggest keeping top running in a Terminal window where you can see it, and watching for runaway jobs there.
If the other answers aren't getting you anywhere, I'd run watch uptime and keep notes on the times and uptimes when it locks up. Locking up about every 10 minutes is very different from locking up exactly every 10 minutes; the latter suggests looking in crontab -l for jobs starting with */10.
Use Apple's Instruments. Honestly, it's helped immensely in finding hangs like these.
Periodic unresponsiveness is often a sign that swapping is happening. Do you have sufficient memory in your system? Examine the disk I/O to see if there are peaks.
EDIT:
I have seen similar behaviour on my Mac lately, which was caused by the filesystem being broken, so OS X kept trying to access non-existent blocks on the disk; even trying to repair it with Disk Utility ended with it telling me to reformat and reinstall. Doing that and restoring from Time Machine helped!
If you do this, then double-check that journaling is enabled on the HFS+ volume on the hard disk. This helps quite a bit in avoiding it happening again.

Comparing cold-start to warm start

Our application takes significantly more time to launch after a reboot (cold start) than if it was already opened once (warm start).
Most (if not all) of the difference seems to come from loading DLLs; when the DLLs are already in cached memory pages, they load much faster. We tried using ClearMem to simulate rebooting (since it's much less time-consuming than actually rebooting) and got mixed results: on some machines it seemed to simulate a reboot very consistently, and on some it did not.
To sum up my questions are:
Have you experienced differences in launch time between cold and warm starts?
How have you dealt with such differences?
Do you know of a way to dependably simulate a reboot?
Edit:
Clarifications for comments:
The application is mostly native C++ with some .NET (the first .NET assembly that's loaded pays for the CLR).
We're looking to improve load time; obviously we did our share of profiling and improved the hotspots in our code.
Something I forgot to mention was that we got some improvement by re-basing all our binaries so the loader doesn't have to do it at load time.
As for simulating reboots, have you considered running your app from a virtual PC? Using virtualization you can conveniently replicate a set of conditions over and over again.
I would also consider some type of profiling app to spot the bit of code causing the time lag, and then making the judgement call about how much of that code is really necessary, or if it could be achieved in a different way.
It would be hard to truly simulate a reboot in software. When you reboot, all devices in your machine get their reset bit asserted, which should cause all memory system-wide to be lost.
In a modern machine you've got memory and caches everywhere: there's the VM subsystem which is storing pages of memory for the program, then you've got the OS caching the contents of files in memory, then you've got the on-disk buffer of sectors on the hard drive itself. You can probably get the OS caches to reset, but the on-disk buffer on the drive? I don't know of a way.
How did you profile your code? Not all profiling methods are equal and some find hotspots better than others. Are you loading lots of files? If so, disk fragmentation and seek time might come into play.
Maybe even sticking basic timing information into the code, writing out to a log file and examining the files on cold/warm start will help identify where the app is spending time.
Without more information, I would lean towards filesystem/disk cache as the likely difference between the two environments. If that's the case, then you either need to spend less time loading files upfront, or find faster ways to load files.
Example: if you are loading lots of binary data files, speed up loading by combining them into a single file, then slurp the whole file into memory in one read and parse its contents. Fewer disk seeks and less time spent reading off disk. Again, maybe that doesn't apply.
I don't know offhand of any tools to clear the disk/filesystem cache, but you could write a quick application to read a bunch of unrelated files off of disk to cause the filesystem/disk cache to be loaded with different info.
@Morten Christiansen said:
One way to make apps start cold-start faster (sort of) is used by e.g. Adobe reader, by loading some of the files on startup, thereby hiding the cold start from the users. This is only usable if the program is not supposed to start up immediately.
That makes the customer pay for initializing our app at every boot even when it isn't used. I really don't like that option (neither does Raymond).
One successful way to speed up application startup is to switch DLLs to delay-load. This is a low-cost change (some fiddling with project settings) but can make startup significantly faster. Afterwards, run depends.exe in profiling mode to figure out which DLLs load during startup anyway, and revert the delay-load on those. Remember that you can also delay-load most Windows DLLs you need.
A very effective technique for improving application cold launch time is optimizing function link ordering.
The Visual Studio linker lets you pass in a file listing all the functions in the module being linked (or just some of them; it doesn't have to be all of them), and the linker will place those functions next to each other in memory.
When your application is starting up, there are typically calls to init functions throughout your application. Many of these calls will be to a page that isn't in memory yet, resulting in a page fault and a disk seek. That's where slow startup comes from.
Optimizing your application so all these functions are together can be a big win.
Check out Profile Guided Optimization in Visual Studio 2005 or later. One of the things that PGO does for you is function link ordering.
It's a bit difficult to work into a build process, because with PGO you need to link, run your application, and then re-link with the output from the profile run. This means your build process needs a runtime environment and has to deal with cleaning up after bad builds and all that, but the payoff is typically a 10%+ faster cold launch with no code changes.
There's some more info on PGO here:
http://msdn.microsoft.com/en-us/library/e7k32f4k.aspx
As an alternative to a function order list, just group the code that will be called at startup into the same sections:
#pragma code_seg(".startUp")   // functions defined below go into the .startUp section
//...
#pragma code_seg               // revert to the default code section
#pragma data_seg(".startUp")   // likewise for global/static data definitions
//...
#pragma data_seg               // revert to the default data section
It should be easy to maintain as your code changes, but it has the same benefit as the function order list.
I am not sure whether the function order list can specify global variables as well, but using #pragma data_seg simply works.
One way to make apps start cold-start faster (sort of) is used by e.g. Adobe reader, by loading some of the files on startup, thereby hiding the cold start from the users. This is only usable if the program is not supposed to start up immediately.
Another note: .NET 3.5 SP1 supposedly has much-improved cold-start speed, though by how much, I cannot say.
It could be the NICs (LAN cards), and that your app depends on certain other services that require the network to come up. Profiling your application alone may not tell you this, so you should also examine your application's dependencies.
If your application is not very complicated, you can just copy all the executables to another directory; the effect should be similar to a reboot. (Cut and paste does not seem to work: Windows is smart enough to know that files moved to another folder are still cached in memory.)
