Cocoa Distributed Objects, GC client, non-GC server

I have a setup with two Cocoa processes communicating via Distributed Objects (DO). The client uses garbage collection; the server does not.
It seems that the client hangs on to the distant objects beyond my direct references to them. Even after I no longer hold references to the objects, they hang around, owned by NSDistantObjectTableEntry, and obviously they never get deallocated on the server.
Only when the client quits does it let go of all the distant objects. Breaking the connection manually would probably also work, but I don't want to do that while the client is running.
Is there a way to tell a GC'd DO client to let go of the distant objects that aren't referenced locally anymore?

There may be a retain cycle that spans the client and server, i.e. a client object retains a proxy of a server object, which in turn retains a proxy of the client's object.
That's a very simple example of a retain cycle; when more than two objects are involved, it gets more complicated to diagnose.
See The Subtle Dangers Of Distributed Objects for examples of other DO-related gotchas.
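A minimal Swift sketch of the two-object cycle described above, with invented class names; a weak back-reference is one common way to break such a cycle:

import Foundation

final class ClientObserver: NSObject {
    var serverProxy: NSObject?    // strong reference to the server object's proxy
}

final class ServerModel: NSObject {
    // If this back-reference to the client's proxy is strong, the cycle closes
    // across the connection and neither side can ever be released. Making it
    // weak, or clearing it when the client unregisters, breaks the cycle.
    weak var clientProxy: NSObject?
}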

Related

Is there a way to update a cached in-memory value on all running instances of a serverless function? (AWS, Google, Azure, or OpenWhisk)

Suppose I am running a serverless function with a global state variable that is cached in memory. Assuming that the value is cached on multiple running instances, how would an update to the global state be broadcast to every serverless instance?
Is this possible in any of the serverless frameworks?
It depends on the serverless framework you're using, which makes it hard to give a useful answer on Stack Overflow. You'll have to research each of them. And you'll have to review this over time because their underlying implementations can change.
In general, you will be able to achieve your goal as long as you can open up a bidirectional connection from each function instance so that your system outside the function instances can send them updates when it needs to. This is because you can't just send a request and have it reach every backing instance. The serverless frameworks are specifically designed to not work that way. They load balance your requests to the various backing instances. And it's not guaranteed to be round robin, so there's no way for you to be confident you're sending enough duplicate requests for each of the backing instances to have been hit at least once.
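A rough Swift sketch of that push model, with PubSubClient and its subscribe method invented as stand-ins for whatever messaging service you would actually use:

// Each function instance opens its own outbound subscription at startup, so
// updates are pushed to every instance instead of being load-balanced to one.
protocol PubSubClient {
    func subscribe(channel: String, onMessage: @escaping (String) -> Void) throws
}

var cachedGlobalState = "initial"   // this instance's in-memory copy

func startListening(with client: PubSubClient) throws {
    try client.subscribe(channel: "state-updates") { newValue in
        cachedGlobalState = newValue   // refresh the local cache on each push
    }
}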
However, there is something else built into most serverless frameworks that may stop you, even if you can open up long-lived connections from each of them that allow them to be reliably messaged at least once each. To help keep resources available for functions that need them, inactive functions are often "paused" in some way. Again, each framework will have its own way of doing this.
For example, OpenWhisk has a configurable "grace period" during which it keeps CPU allocated to a container for only a short time after the container's last request. OpenWhisk calls this pausing and unpausing containers. When a container is paused, no CPU is allocated to it, so background processing (for example, if it's Node.js and you've put something onto the event loop with setInterval) will not run, and messages sent to it over a connection it opened will not be responded to.
This will prevent your updates from reliably going out unless you have constant activity that keeps every OpenWhisk container not only warm but unpaused. And it goes against the interests of the folks maintaining the OpenWhisk cluster you're deploying to: they will want to pause your container as soon as they can so that its CPU can be allocated to containers that aren't yet paused. They will try to tune their cluster so that containers remain unpaused for as short a duration as possible after a request/event is handled. So this will be hard for you to control unless you're working with an OpenWhisk deployment you control, in which case you just need to tune it according to your needs.
Network restrictions that interfere with your ability to open these connections may also prevent you from using this architecture.
You should take these factors into consideration if you plan to use a serverless framework and consider changing your architecture if you require global state that would be mutated this way in your system.
Specifically, you should consider switching to a stateless design where the caching happens not in each function instance but in a shared service designed for fast caching, like Redis or Memcached. Each function can then check that shared caching service for the data before retrieving it from its source. Many cloud providers that offer serverless compute also offer managed databases like these, so you can often deploy it all to the same place.
Alternatively, if not to a stateless design, you could switch to a pull model for caching instead of a push model. Instead of having updates pushed out to each function instance to refresh its cached data, each function pulls fresh data from the source when it detects that the data in its memory has expired, as sketched below.
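A minimal Swift sketch of that pull model, assuming a hypothetical SharedCache client whose get/set/TTL names are invented stand-ins for Redis or Memcached:

protocol SharedCache {
    func get(_ key: String) throws -> String?
    func set(_ key: String, value: String, expiresIn seconds: Int) throws
}

// Cache-aside with expiry: check the shared cache first, fall back to the
// source of truth, and refresh the shared entry with a TTL so stale data
// expires on its own instead of needing a push to every instance.
func value(for key: String,
           cache: SharedCache,
           loadFromSource: () throws -> String) throws -> String {
    if let cached = try cache.get(key) {
        return cached                                   // still fresh
    }
    let fresh = try loadFromSource()                    // expired or missing
    try cache.set(key, value: fresh, expiresIn: 60)     // invented 60s TTL
    return fresh
}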

Which NSXPCConnection related objects do I have to retain myself?

I cannot find any hint in the docs regarding object lifecycle management.
In the XPC service, do I have to keep a strong reference to the NSXPCListener, or does the resume call take care of this effectively?
I'm using Swift and a connection creation object to get most of the stuff out of the main.swift file:
// main.swift
import Foundation

if let dependencies = Dependencies().setUp() {
    // Actually run the service code (and never return)
    NSRunLoop.currentRunLoop().run()
}
I have a hunch that the dependencies object (which creates the NSXPCListener during set-up) should keep a strong reference to the listener object. But the resume method is said to work the same way it does for operation queues.
Conversely, does the client need to keep the NSXPCConnection around?
In the XPC service, upon incoming connection, does setting exportedObject retain that object for the duration of the connection, or do I have to keep a strong ref to it myself?
Consequently: When multiple connections come in, should I maintain a list of exportedObjects?
In either the client or the service, should I obtain a remoteObjectProxy once and keep it around, or should I obtain a proxy object anew for every call?
My particular XPC service is a launchd process that runs all the time, not a one-off helper, and the client app itself might run in the background for a few hours, too. I worry whether it's safe to keep a proxy object to the background service around for potentially long-running communication.
If background services crash, launchd is said to restart them. Now, if my service were a "launch on demand" service instead, would message calls to proxy objects trigger a relaunch if necessary, would merely obtaining a proxy object do so, or would only reconnecting achieve that?
Thanks for helping me sort this out!
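For reference, a minimal service-side sketch of the usual NSXPC pattern, with ServiceProtocol and ExportedObject as invented placeholders. The comments reflect commonly reported behavior (the listener holds its delegate weakly; a connection retains its exportedObject), so treat them as assumptions to verify, not documentation:

import Foundation

@objc protocol ServiceProtocol {
    func doWork(reply: @escaping (String) -> Void)
}

final class ExportedObject: NSObject, ServiceProtocol {
    func doWork(reply: @escaping (String) -> Void) { reply("done") }
}

final class ServiceDelegate: NSObject, NSXPCListenerDelegate {
    func listener(_ listener: NSXPCListener,
                  shouldAcceptNewConnection newConnection: NSXPCConnection) -> Bool {
        newConnection.exportedInterface = NSXPCInterface(with: ServiceProtocol.self)
        // Assumption: the connection keeps its exportedObject alive for the
        // connection's lifetime, so no separate list of exported objects is kept.
        newConnection.exportedObject = ExportedObject()
        newConnection.resume()
        return true
    }
}

// Assumption: the listener's delegate property is weak, so both objects are
// stored in long-lived constants (your Dependencies object could own them).
let delegate = ServiceDelegate()
let listener = NSXPCListener.service()
listener.delegate = delegate
listener.resume()   // for the service() listener, this call does not return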

Best way to initialize initial connection with a server for REST calls?

I've been building some apps that connect to a SQL backend. I use ajax calls to hit WebMethods, a WebAPI, etc.
I notice that the very first call to the SQL backend retrieves the data fairly slowly. I can only assume that this is because it must negotiate credentials before retrieving the data. It probably caches this somewhere, and thus any calls made afterwards come back very fast.
I'm wondering if there's an ideal or optimal way to initialize this connection.
My thought was to make a simple GET call right when the page loads (grabbing something very small, like a single entry). I probably wouldn't be using the returned data in any useful way, other than to ensure that any calls afterwards come back faster.
Is this an okay way to approach fixing the initial delay? I'd love to hear how others handle this.
Cheers!
There are a number of reasons why your first call could be slower than subsequent ones:
Depending on your server platform, code may be compiled when first executed
You may not have an active DB connection in your connection pool
The database may not have cached indices or data on the first call
Some VM platforms may take time to allocate sufficient resources to your server if it has been idle for a while.
One way I deal with those types of issues on the server side is to add startup code to my web service that fetches data likely to be used by many callers when the service first initializes (e.g. lookup tables, user credential tables, etc).
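A sketch of that idea in Swift, with the Database protocol and the table names invented for illustration:

protocol Database {
    func query(_ sql: String) throws -> [[String: Any]]
}

// Run once when the service starts, before accepting user traffic: touching
// commonly used tables opens a pooled connection and warms the database's
// caches so the first real request doesn't pay those costs.
func warmUp(db: Database) {
    _ = try? db.query("SELECT * FROM lookup_values")
    _ = try? db.query("SELECT id FROM user_credentials LIMIT 1")
}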
If you only control the client, consider that what you may really want is server health monitoring (I use the open-source monitoring platform Zabbix; there are also many commercial web-based monitoring solutions). Exercising the server outside of end-user code is probably better than making an extra GET call from a page that an end user has loaded.

How does session replication across containers work?

I would be interested in some timing details. For example, I place a container in the session that can hold various data, and I change the contents of that container frequently. How can I ensure that the container's session value gets replicated across nodes on every change?
You don't need to make sure; that's the application server's job.
The J2EE specification doesn't deal with session-information synchronization amongst distributed components.
Theoretically, all you have to do is write thread-safe code. In your example, simply make sure that access to the container is synchronized. If your application server is bug-free, then you can safely assume that the session information is properly replicated across all nodes in a seamless manner; if your application server has bugs around session synchronization... well... then nothing is really safe anymore, now is it.
Application servers use different strategies to synchronize session information between nodes. Session content can be considered dirty, and in need of synchronization, when:
data is put into the session
data is read from the session
Reads fall into two categories:
reading a structured object
reading a scalar or immutable object
So if session data is modified indirectly, by mutating a structured object, simply re-reading that object from the session can ensure that its content gets replicated.
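A related defensive pattern, not from the answer above but commonly recommended, is to put the object back into the session after mutating it, so the container reliably marks the attribute dirty. A Swift sketch with an invented Session type (a real API would be e.g. Java's HttpSession):

protocol Session {
    func get(_ key: String) -> Any?
    func put(_ key: String, _ value: Any)   // marks the attribute as dirty
}

final class Cart { var items: [String] = [] }

func addItem(_ item: String, to session: Session) {
    let cart = (session.get("cart") as? Cart) ?? Cart()
    cart.items.append(item)      // indirect mutation: replication may miss it
    session.put("cart", cart)    // explicit put forces the dirty flag
}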

detect client process termination from EXE COM Server

I'm writing an EXE COM server that exposes a class that locks a system resource.
In normal execution, the client releases the resource and the COM executable shuts down a couple of seconds later.
In abnormal execution, the client app crashes, leaving the COM server with an instance that has a positive reference count. The COM executable then runs for ~12 minutes until termination, which means the system resource stays locked for that whole time.
Is there a way to detect client termination instantaneously, as with socket IPC or a driver protocol? If not, it would seem that COM is inferior to other IPC mechanisms.
I had the same question a couple of years ago. I found the answer here: How To Turn Off the COM Garbage Collection Mechanism. In short: no, there is no way to detect client termination instantaneously. Excerpts:
When a COM client terminates normally, it releases all references to its server object. When a client terminates abnormally however, there might be outstanding references to the server object. Without a garbage collection mechanism, the server code has no way of knowing when to reclaim the resources allocated for the COM object, which can then cause a resource leak. To address this problem, COM implements an automatic garbage collection mechanism in which the COM resolver process (RPCSS) on the client machine pings the server machine on behalf of the client process.
Alternatives to using COM's GC protocol (for example, using periodic application-level "pings", method calls that inform the object that clients are still alive, or using an underlying transport mechanism such as TCP keepalives) are demonstrably much less efficient. Therefore, DCOM's default GC mechanism should be used for any objects that must be shut down when their clients disappear or otherwise misbehave if those objects would effectively become memory leaks on the server.
The resolver on the server machine keeps track of the pings for each server object. The ping period is 2 minutes and, currently, it is non-configurable. When the resolver on the server machine detects that an object has not been pinged for 6 minutes, it assumes that all clients of the object have terminated or otherwise are no longer using the object. The resolver will then release all external references to the object. It does this by simply having the object's stub manager (the COM runtime code that delivers calls to each object) call ::Release() on the object's IUnknown interface. At this point, the object's reference count will be zero so far as the COM runtime is concerned. (There may still be references held by local (same-apartment) clients, so the object's internal reference count may not necessarily go to zero at this point.) The object may then shut itself down.
NOTE: Garbage collection applies to all servers regardless of whether their clients are local or remote, or a combination of local and remote. The underlying pinging mechanism is different in the local case as no network packets are generated, but for all practical purposes, the behavior is the same.
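For contrast, the "application-level pings" alternative the excerpt mentions (and argues against on efficiency grounds) looks roughly like this generic Swift sketch; the names and the 10-second timeout are invented:

import Foundation

final class ResourceHolder {
    private var lastPing = Date()
    private let timeout: TimeInterval = 10   // seconds of silence before reaping

    func ping() { lastPing = Date() }        // the client calls this on a timer

    func reapIfClientGone() {                // the server polls this periodically
        if Date().timeIntervalSince(lastPing) > timeout {
            releaseSystemResource()          // client presumed dead: unlock now
        }
    }

    private func releaseSystemResource() {
        // Release the locked system resource here.
    }
}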
