Use Crashlytics on a macOS daemon

I have a macOS application that is integrated with Crashlytics. If I run it as an agent, everything seems to work fine. But when I run it as a daemon, crashes and errors don't show up on the web panel.
I'm thinking the problem might be that crashlytics uses a framework that is not daemon-safe.
Apple documentation regarding the subject says:
If your daemon uses frameworks that aren't daemon-safe, you can run
into a variety of problems.
Is this really the issue? Is there a workaround so I can get it to work?

Former maintainer of the Crashlytics SDK on Apple platforms here. I haven't been with the organization for a while, though, so my information could be out of date, and you should definitely reach out to them for assistance. Still, I'll give this a shot.
A number of others have asked for this kind of functionality in the past, and from what I know, some have successfully integrated Crashlytics into non-UI processes. There are some things to watch out for, though. I'm also aware of the daemon-safe framework issue, and that could be a problem, although I'm unsure of how it might manifest itself.
When you say agent vs daemon, are you talking about per-user vs system-wide launchd jobs, or something else? One thing I can be fairly certain of: Crashlytics does not support multiple processes with the same bundle id running simultaneously. If there can be multiple copies of your process running at the same time, you cannot make this work. Even if it seems to work sometimes, it will be unreliable at best, and could lead to serious issues at worst (potentially crashes).
One thing that is absolutely essential for correct operation is a main runloop. Crashlytics will definitely not work correctly without one.
Crashlytics also requires an Info.plist. It is actually possible to add one to a standalone binary, but this often trips people up. I'm guessing you figured this one out.
On macOS, Crashlytics integrates a bit with AppKit, to improve exception reporting. If I recall right, it's possible to just skip this integration completely, as outlined in the docs.
Another thing that Crashlytics relies on is a standard user file system home directory. There must be a ~/Library directory present with the standard internal structure. This one might be problematic for launchd daemons, since they run as root.
Keeping those things in mind, I'm pretty sure it's possible to make this work. There could be some things I'm not remembering, as it's been a while. However, one thing I definitely do know is this is a bit of a gray area. It works, but wasn't an explicit design goal. It might now be unsupported. You should definitely check in with them about this before shipping something.

Related

More accurate identification of running applications on Mac OS

By using runningApplications of NSWorkspace, it is possible to get a list of running apps on Mac OS as NSRunningApplication objects, and from this get additional information, like which application is in the foreground.
It is possible to identify a running application by its name (localizedName), but I'm sure that can be spoofed by rogue applications. Other things like bundleIdentifier seem better, but I believe that too could be spoofed.
I would imagine that pretty much all of this metadata could be spoofed for applications outside the public App Store, but for any apps obtained from the App Store, things like bundleIdentifier should be a safe way to identify an app, right?
If we include arbitrary apps that someone downloads from the Internet, is there any better way to identify an app, so as to filter out rogue apps? I realize there may be no solution without drawbacks, but I'm looking for a best-effort attempt.
As you mention, all of these things can be pretty easily spoofed. Having written a product that does exactly what you're describing professionally, the solution is relatively straightforward: fingerprint every version of every popular app into a massive database, and then fingerprint each app you discover on the machine and look them up in your database. When you discover an app you've never seen before, flag it for adding to your database.
Maintaining that database is a very large and ongoing endeavor, and that's where most of the value of the product is. The agent code is not that complicated; the up-to-date database is what customers pay for. It's a pretty hard space to get into.
You're correct that you can verify signatures to make sure that things downloaded from MAS or part of the OS are what they claim to be. This will get you started, but isn't nearly enough; there's just so much that doesn't come from MAS.
The other headache is that you can see what "apps" are currently running via NSWorkspace, but what that means is pretty messy. A lot of things you don't think of as "apps" show up in runningApplications, like MobileDeviceUpdater and nbagent, while things like mysqld don't. Fingerprinting from runningApplications can miss things that aren't in that list, and malicious apps could lie about their bundle path to make themselves look legitimate. You can use tools like lsof to see what files a process really has open, but it gets more and more complicated.
Best of luck; it's a deep rabbit hole with dozens of corner cases, and very little documentation.

Alternatives to ShellAPI to get file list and icons

I need to build a file/folder tree with associated file icons and special locations like network computers.
Currently I'm using Shell API to achieve it: SHGetFileInfo, IShellFolder.EnumObjects and other functions.
It works fine most of the time, but occasionally, on customers' machines, it causes various errors like random access violations deep in system libraries. Analyzing bug reports, some of those seem to be the result of 3rd-party shell extensions that are loaded into my app's address space when the Shell API is used.
I'm thinking of somehow avoiding the Shell API and doing the job another way. What are other good approaches to building a folder tree?
If the problem really is due to faulty shell extensions, then the only sensible approach, in my view, is to remove those shell extensions. Trying to work with the shell while avoiding the shell API won't lead anywhere useful. In fact, I think the likely outcome is that your alternative code will be less functional. All for the sake of one user who won't fix their broken machine: that's a terrible trade-off.
If explorer is also crashing then that is a clear indication that the problem is indeed due to shell extensions.
Having said all of that, your post makes me suspect that you have had bug reports from multiple clients. That makes your diagnosis much less plausible. The Shell API is a complex beast, and it is very plausible that your code is defective in some way. I suspect you may be guilty of diagnosis by wishful thinking: it's very easy, when facing a fault that is hard to reproduce and diagnose, to believe that your code is not to blame. If multiple clients are reporting problems, then my bet is that the defect can be found in your code.

Any scenarios where Meteor.js autopublish would actually be useful?

Lately I've been getting more and more interested in Meteor.js, and at the moment I'm developing a new web project with it. What I can't get out of my mind is the autopublish feature of Meteor. At the time of writing, my MongoDB has a total of 32453 records, so, as you can probably guess, I had to turn off autopublish and publish/subscribe manually.
I've read a number of guides now, and it seems to be completely common practice to turn off autopublish as soon as your application is created. This makes me question: does the feature have any practical use in a real-world scenario? I can see it being useful for the shock-and-awe effect of the examples, but aside from that it seems more or less pointless. I might be missing something very obvious, though.
Autopublish is designed to be turned off before production. It's simply a feature to speed up development in the early stages, and that's all. From the Meteor Docs:
By default, a new Meteor app includes the autopublish and insecure packages, which together mimic the effect of each client having full read/write access to the server's database. These are useful prototyping tools, but typically not appropriate for production applications. When you're ready, just remove the packages.
You are not missing anything. It was added to make the examples work and to get users up and running quickly when working on new projects. I can't think of a compelling reason for a production app to have autopublish on.
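For reference, explicit publish/subscribe is what replaces autopublish once it's removed. A sketch follows; the tiny `Meteor` stand-in exists only so this file runs outside a real Meteor app, and in real Meteor code the publish function would return `Records.find(...)` rather than a plain object:

```javascript
// Stand-in for the Meteor global, used only when running outside Meteor.
const Meteor = globalThis.Meteor || {
  _pubs: {},
  publish(name, fn) { this._pubs[name] = fn; },
  subscribe(name, ...args) { return this._pubs[name](...args); },
  isServer: true,
};

if (Meteor.isServer) {
  // Publish only the subset each client needs instead of the whole collection.
  Meteor.publish('recentRecords', function (limit) {
    // In real Meteor: return Records.find({}, { sort: { createdAt: -1 }, limit });
    return { name: 'recentRecords', limit };
  });
}

// Client side: subscribe to just that subset.
const sub = Meteor.subscribe('recentRecords', 50);
```

The point of the pattern is exactly what the question suspects: with tens of thousands of records, each client should only receive the slice it asked for.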

Node.js Modules Designed for Unix?

I'm probably missing something, because I don't hear anyone else mentioning this. When I look at the process and file system modules, I see a lot of Unix-isms that are unlikely to work on Windows. How is this going to work for Windows users? Windows users who have never used Unix may not even realize which ones are Unix-isms that are never going to work for them. I suppose this is really just a documentation issue; it would be nice to filter the documentation by Unix or Windows. process.getuid() would be one example; chmod would be another. Even SIGUSR1/SIGUSR2 are there. (Vague memories of servers mysteriously shutting down.) I do have Unix experience from way back, but many will not. I avoided Rails because it was slow on Windows, but I hear that Node.js is smoking fast on Windows, so I'm hopeful!
Since version v0.6 (except the latest v0.6.9, which has a big fs bug), Node.js has been a first-class citizen on Windows. Because of this, I expect most 3rd-party modules will eventually support Windows. A lot of the major ones do now, most notably Express.js.
This will get better, too. Some third-party modules currently require the extra step of compiling native C++ addons (npm/node-waf automates this), but future versions of Node.js will be ditching this in favor of binary compatibility. The developers behind Node.js are trying very hard to provide a first-class, pleasant development experience across all platforms.
Admittedly, the Node.js experience will be better on OS X or Linux, but it's going to only get better on Windows. I highly encourage you to check it out if you haven't already.
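In the meantime, the Unix-isms the question names can be feature-detected at runtime, so the same module loads on both platforms. A minimal sketch; the function names are illustrative:

```javascript
function currentUid() {
  // process.getuid is undefined on Windows; fall back to a sentinel there.
  return typeof process.getuid === 'function' ? process.getuid() : -1;
}

function installReloadSignal(handler) {
  // SIGUSR2 is a Unix-ism; skip the wiring on Windows and report that.
  if (process.platform !== 'win32') {
    process.on('SIGUSR2', handler);
    return true;
  }
  return false;
}

console.log('uid:', currentUid());
```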

Different methodologies for solving bugs that only occur in production

As someone relatively new to the whole support and bug-fixing environment, and a young programmer, I had never come across a bug that only occurs in the WebSphere environment but not in the localhost test environment, until today. When I first got this bug report, I was confused as to why I couldn't reproduce it in the localhost test environment. I decided to try the WebSphere test environment to see what would happen, and I successfully reproduced the bug. The problem is I can't make changes and build to the WebSphere test environment; I can only make changes to my local environment. Given this handicap, what methodologies exist for resolving these kinds of bugs? Or are there even any methodologies at all? Any advice or help on how to approach issues like this?
Campaign for full access to a test environment. Being able to tweak things, redeploy and retry makes a huge difference. It's entirely reasonable to explain how not having access severely restricts your ability to do your job.
Make sure you've got sufficient logging, and make it configurable. Make sure you keep the logs for long enough to track down a problem reported by a customer even if it happened a few days ago.
When you finally diagnose a problem and why it only happens in a particular environment, document it and try to persuade your local system to behave the same way - that should make it easier to diagnose another symptom of the same problem next time.
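The configurable logging suggested above can be sketched in a few lines: verbosity is chosen per environment with an environment variable, so production can be turned up without a rebuild or redeploy. Level names and the `LOG_LEVEL` variable are illustrative choices:

```javascript
const LEVELS = { error: 0, warn: 1, info: 2, debug: 3 };
const threshold = LEVELS[process.env.LOG_LEVEL] ?? LEVELS.info;

function log(level, ...args) {
  if (LEVELS[level] > threshold) return false; // suppressed at this level
  console.log(new Date().toISOString(), level.toUpperCase(), ...args);
  return true;
}

log('info', 'service started');          // visible at the default level
log('debug', 'cache stats', { n: 128 }); // only when LOG_LEVEL=debug
```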
In short, the methodology is to isolate and understand the differences between environments and which one or ones might be causing the issue.
Check your local build. Make sure it is the same version (code and database) as Test and Prod. If it is, what are the environment differences that could affect the issue you are seeing? (Multi-core, load balancing, operating system version, 3rd-party library versions.) Don't run locally in the debugger; make sure you're running a release build (if that's what is on Test and Prod), and maybe even do a local deployment rather than building from source.
Check to see if it is particular data that might be causing the problem. If you can, pull a copy of the database back from Test onto Local and see if that enables you to repro the problem.
Check with other developers. See if they can repro the issue in their environment. Check with your QA guys and get their thoughts on what might be causing such an issue (oftentimes they will have seen "similar" issues and might give you a clue).
At that point, if nothing helps, I generally go into a deep state of zen to try and understand what I am missing. But, there must be a difference, you just have to find it.
The scientific method always applies: check your assumptions first. If the systems are different, the problem might reside in some sort of implicit default being different, or in a different implementation of some function.
In all debugging processes, localization is the key. You've got to isolate the area of the problem first. If your OS, patches, libraries, and the main software itself are all identical, then it's probably the system settings (limits for sockets, file descriptors, etc.). If you know you have enough inodes, space, and memory left, then it's not a resource issue. If the computer is barely responsive to your interactive prodding, your load is too high or you have some runaway processes. Remember what every process needs to run, and make sure each one has what it needs.
It can also be that the code just can't deal with the load of the production system. Locking mechanisms are a very popular cause of problems in production vs dev/test systems, simply because in testing you can't generate enough of the load that you get for free in production.
Logging can be easily overlooked, but I also like to add a lot of debug values into the code to make debugging easier. I can't even count how many times some environment variable, path, or broken symlink has ruined my day, only for me to realize that it would have been a trivial fix if I had looked at the values of variables while running, not just at the static code.
If all else fails, ltrace and strace are the best ways to really look at what's going on under the hood. They're not easy to read, but once you get used to how to spot and interpret the return values of syscalls, you gain a very powerful debugging tool.