What's the best Erlang approach to identifying a process's identity from its process id? - debugging

When I'm debugging, I'm usually looking at about 5000 processes, each of which could be one of about 100 gen_servers, fsms, etc. If I want to know WHAT an Erlang process is, I can do:
process_info(pid(0,1,0), initial_call).
And get a result like:
{initial_call,{proc_lib,init_p,5}}
...which is all but useless.
More recently, I hit upon the idea (brace yourselves) of registering each process with a name that told me WHO that process represented. For example, player_1150 is the player process that represents player 1150. Yes, I end up making a couple million atoms over the course of a week-long run. (And I would love to hear comments on the drawbacks of boosting the limit to 10,000,000 atoms when my system runs with about 8GB of real memory unused, if there are any.) Doing this meant that I could, at the console of a live system, query all processes for the length of their message queues, find the top offenders, then check to see if those processes were registered and print out the atoms they were registered with.
I've hit a snag with this: I'm moving processes from one node to another. Now a player process can have 3 different names: player_1158, player_1158_deprecating, player_1158_replacement. And I have to make absolutely sure I register and unregister these names with precision timing, to make sure that a process is always named, that the appropriate names always exist, AND that I don't try to register a name that some dying process already holds. There is some slop room, since this is only used for console debugging of a live system. Nonetheless, the moment I started feeling like this mechanism was affecting how I develop the system (the one that moves processes around), I felt it was time to do something else.
There are two ideas on the table for me right now. The first is an ets table that associates process ids with their descriptions:
ets:insert(process_descriptions, {self(), {player, 1158}}).
I don't really like that one because I have to manually keep the table clean. When a player exits (or crashes), someone is responsible for making sure that its entry is removed from the ets table.
The second alternative was to use the process dictionary, storing similar information. When my exploration of a live system leads me to wonder who a process is, I can just look at its process dictionary using process_info.
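For illustration, that could be as little as this (the process_description key is an arbitrary choice):

%% At spawn time, the process stamps itself:
put(process_description, {player, 1158}),

%% Later, at the console, for a suspect Pid:
process_info(Pid, dictionary).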
I realize that neither of these solutions is functionally clean, but given that the system itself is never, EVER the consumer of these data, I'm not too worried about it. I need certain debugging tools to work quickly and easily, so the behavior described is not open for debate. Are there any convincing arguments to go one way or the other (other than the academic "don't use the process dictionary, it's evil" canned garbage)? I'd be happy to hear other suggestions and their justifications.

You should try out gproc; it's a very convenient application for keeping process metadata.
A process can be registered under several names, and you can associate arbitrary properties with a process (where the key and value can be any Erlang term). gproc also monitors registered processes and unregisters them automatically if they crash.
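A minimal sketch of the calls involved (the {player, 1158} name and the kind property are illustrative, not from the question):

%% In the process being registered, typically at init:
gproc:reg({n, l, {player, 1158}}),      % unique local name
gproc:reg({p, l, kind}, player),        % arbitrary property

%% Later, at the console:
Pid = gproc:where({n, l, {player, 1158}}),
Kind = gproc:get_value({p, l, kind}, Pid).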

If you're debugging gen_servers and gen_fsms while they're still running, I would implement the handle_info functions for these behaviors. When you send each process a {get_info, ReplyPid} tuple, the process in question can send back a term describing its own state, what it is, etc. That way you don't have to keep track of this information outside of the process itself.
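For instance, a hedged sketch of such a clause in a gen_server (the state record and the reply shape are made up for illustration):

handle_info({get_info, ReplyPid}, State) ->
    ReplyPid ! {info, self(), {player, State#state.player_id}},
    {noreply, State};

This would be one clause among the others in your handle_info/2.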
Isac mentions there is already a built-in way to do this.

Related

How to identify a process in Windows? Kernel and User mode

In Windows, what is the formal way of identifying a process uniquely? I am not talking about the PID, which is allocated dynamically, but a unique ID or a name which is permanent for that process. I know that every program/process has a security descriptor, but it seems to hold SIDs for the logged-in user and group (not the process). We cannot use the path and name of the executable from which the process starts, as that can change.
My aim is to identify a process in the kernel mode and allow it to perform certain operation. What is the easiest and best way of doing this?
Your question is too vague to answer properly. For example, how could the path possibly change (without poking around in kernel memory) after creation of a process? And yes, I am aware that one could hook into the memory-mapping process during process creation to replace the image originally destined to be loaded with another. The point is that a process is merely one instance of running a given executable, and it's not clear what exact tampering attempts you want to counter here.
But from kernel mode you do have the ability to simply use the pointer to the EPROCESS structure. No need to use the PID, although that will be unique while the process is still alive.
So assuming your process uses an IRP to communicate with the driver (whether via WriteFile, ReadFile, DeviceIoControl or something more exotic) in order to register itself, you can use IoGetCurrentProcess to get the PEPROCESS value, which will be unique to the process.
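A minimal sketch of what that might look like in a WDM dispatch routine (how you store the pointer is up to your driver; the bookkeeping is left as a comment):

#include <ntddk.h>

NTSTATUS MyDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    UNREFERENCED_PARAMETER(DeviceObject);

    /* Runs in the context of the requesting thread, so this returns
       the PEPROCESS of the process issuing the IRP. */
    PEPROCESS caller = IoGetCurrentProcess();

    /* Take a reference so the pointer stays valid (and cannot be
       recycled) while it sits in the driver's registration table. */
    ObReferenceObject(caller);

    /* ... stash 'caller' in the driver's registration table ... */

    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return STATUS_SUCCESS;
}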
While the structure itself is not officially documented, hints can be gleaned from the "Windows Internals" book (in its various incarnations), the dt (Display Type) command in WinDbg (and friends) as well as from third-party resources on the internet (e.g. here, specific to Vista).
The process objects are kept in several linked lists. So if you know the (officially undocumented!!!) layout for a particular OS version, you may traverse the lists to get from one to the next process object (i.e. EPROCESS structure).
Cautionary notes
Make sure to take a reference on the process object, using the respective object manager routines. Otherwise you cannot be certain it's safe either to reach into these structures (which is unsafe anyway, since you cannot rely on their layout across OS versions) or to pass the pointer to functions that expect a PEPROCESS.
As a side-note: Harry Johnston is of course right to assert that a privileged user can insert arbitrary (well almost arbitrary) code into the TCB in order to thwart your protective measures. In the end it is going to be an arms race.
Also keep in mind that, just like PIDs, the value of a PEPROCESS may theoretically be recycled. But in both cases you can counter this by invalidating whatever internal state you keep in your driver that allows the process to do its magic, whenever the process goes down. Using something like PsSetCreateProcessNotifyRoutine would seem to be a good method here. To translate the process id the callback hands you into a PEPROCESS value, use PsLookupProcessByProcessId.
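A hedged sketch of that cleanup path (error handling trimmed):

VOID MyProcessNotify(HANDLE ParentId, HANDLE ProcessId, BOOLEAN Create)
{
    UNREFERENCED_PARAMETER(ParentId);

    if (!Create) {   /* the process is going away */
        PEPROCESS process;
        if (NT_SUCCESS(PsLookupProcessByProcessId(ProcessId, &process))) {
            /* ... drop this process from the registration table and
               release the reference taken when it registered ... */
            ObDereferenceObject(process);  /* reference from the lookup */
        }
    }
}

/* In DriverEntry: PsSetCreateProcessNotifyRoutine(MyProcessNotify, FALSE); */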
An alternative way of countering recycling of the PID/PEPROCESS is to keep a reference to the process object, thus keeping it in a kind of undead state (similar to not closing a handle in user mode) even though its main thread may have finished.

Is it possible to associate data with a running process?

As the title says, I want to associate a random bit of data (a ULONG) with a running process on the local machine. I want that data persisted with the process it's associated with, not the process that's reading and writing the data. Is this possible in Win32?
Yes, but it can be tricky. You can't access an arbitrary memory address of another process, and you can't count on shared memory because you want to do this with an arbitrary process.
The tricky way
What you can do is to create a window (with a special and known name) inside the process you want to decorate. See the end of the post for an alternative solution without windows.
First of all you have to get a handle to the process with OpenProcess.
Allocate memory with VirtualAllocEx in the other process to hold a short routine that will create a (hidden) window with the special known name.
Copy that routine from your own code with WriteProcessMemory.
Execute it with CreateRemoteThread.
Now you need a way to identify and read back this memory from a process other than the one that created it. For this you can simply find the window with the known name, and you have your holder for a small chunk of data (a sketch follows).
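A hedged sketch of that allocate/copy/execute skeleton (error paths trimmed; the payload that registers the named window is elided, and the copied routine would have to be position-independent and stay resident, so it is never freed here):

#include <windows.h>

BOOL InjectPayload(DWORD pid, const BYTE *code, SIZE_T codeSize)
{
    /* 1. Get a handle to the target process. */
    HANDLE process = OpenProcess(
        PROCESS_CREATE_THREAD | PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
        FALSE, pid);
    if (!process) return FALSE;

    /* 2. Allocate executable memory in the target. */
    LPVOID remote = VirtualAllocEx(process, NULL, codeSize,
                                   MEM_COMMIT | MEM_RESERVE,
                                   PAGE_EXECUTE_READWRITE);
    if (!remote) { CloseHandle(process); return FALSE; }

    /* 3. Copy the (position-independent) routine over. */
    if (!WriteProcessMemory(process, remote, code, codeSize, NULL)) {
        CloseHandle(process);
        return FALSE;
    }

    /* 4. Run it on a new thread inside the target; the routine is
       expected to create the hidden, specially named window. */
    HANDLE thread = CreateRemoteThread(process, NULL, 0,
                        (LPTHREAD_START_ROUTINE)remote, NULL, 0, NULL);
    if (!thread) { CloseHandle(process); return FALSE; }

    CloseHandle(thread);
    CloseHandle(process);
    return TRUE;
}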
Please note that this technique may be used to inject code into another process, so some antivirus products may warn about it.
Final notes
If Address Space Layout Randomization is disabled, you may not need to inject code into the process's memory at all: you can call CreateRemoteThread with the address of a Windows API function that has a compatible signature (for example LoadLibrary). You can't do this with native applications (ones not linked to kernel32.dll).
You can't inject into system processes unless your process holds debug privileges (enabled with AdjustTokenPrivileges).
As an alternative to the fake window, you may create a suspended thread with a local variable, a TLS slot, or a stack entry used as the data chunk. To find this thread you have to give it a name using, for example, this (but it's seldom applicable).
The naive way
A poor man's solution (but probably much easier to implement and somewhat more robust) is to use an NTFS alternate data stream (ADS) to hide a small data file for each process you want to monitor (the ADS is associated with the process's image file, so it's not applicable to services and rundll-hosted processes unless you make it much more complicated).
Iterate over all processes and for each one create an ADS whose known name includes the process ID.
Inside it, store the system startup time along with whatever data you need.
To read the information back:
Iterate over all processes and check for that ADS; read it and compare the system startup time (if they mismatch, you've found a widowed ADS and it should be deleted).
You have to take care of these widowed streams, so you may need to check for them periodically. You can avoid this by storing ALL these small chunks of data in one well-known location; your "reader" can then check them all each time, deleting files no longer associated with a running process.
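A hedged sketch of the writing side (stream name and record layout are illustrative; an ADS is opened like an ordinary file using the "file:stream" syntax):

#include <windows.h>
#include <stdio.h>

BOOL WriteProcessTag(const char *imagePath, DWORD pid,
                     ULONGLONG bootTime, ULONG tag)
{
    char streamPath[MAX_PATH + 64];
    /* e.g. "C:\\app\\game.exe:proctag_1234" */
    sprintf_s(streamPath, sizeof(streamPath), "%s:proctag_%lu",
              imagePath, (unsigned long)pid);

    HANDLE h = CreateFileA(streamPath, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return FALSE;

    /* Store the boot time first, so a stale (widowed) stream left over
       from an earlier boot can be recognized and deleted on a later pass. */
    DWORD written;
    BOOL ok = WriteFile(h, &bootTime, sizeof(bootTime), &written, NULL)
           && WriteFile(h, &tag, sizeof(tag), &written, NULL);
    CloseHandle(h);
    return ok;
}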

Asynchronous interapplication communication on Mac OS X

On Mac OS X, I have a process which produces JSON objects, and another, intermittent process which should consume them. The producer and consumer processes are independent of each other. Objects will be produced no more often than every 5 seconds and will typically be several hundred bytes, but may sometimes range into the megabytes. The objects should be communicated first-in-first-out. The consumer may or may not be running when the producer is producing, and may or may not read objects immediately.
My boneheaded solution is
Create a directory.
Producer writes each JSON object to a text file, names it with a serial number.
When Consumer launches, it reads and then deletes files in serial-number order, and while it is running, uses FSEvents to watch this directory for new files arriving.
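For reference, the FSEvents half of that could be roughly this sketch (the spool path is illustrative):

#include <CoreServices/CoreServices.h>

static void OnChange(ConstFSEventStreamRef stream, void *info,
                     size_t numEvents, void *paths,
                     const FSEventStreamEventFlags flags[],
                     const FSEventStreamEventId ids[])
{
    /* Re-scan the spool directory and consume files in
       serial-number order. */
}

int main(void)
{
    CFStringRef dir = CFSTR("/tmp/json-spool");
    CFArrayRef dirs = CFArrayCreate(NULL, (const void **)&dir, 1,
                                    &kCFTypeArrayCallBacks);

    FSEventStreamRef stream = FSEventStreamCreate(
        NULL, OnChange, NULL, dirs, kFSEventStreamEventIdSinceNow,
        1.0 /* latency in seconds */, kFSEventStreamCreateFlagNone);

    FSEventStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(),
                                     kCFRunLoopDefaultMode);
    FSEventStreamStart(stream);
    CFRunLoopRun();   /* never returns in this sketch */
    return 0;
}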
Is there any easier or better way to do this?
The modern way to do this, as of Lion, is to use XPC. Unfortunately, there's no good documentation of it; there's a broad overview in the Daemons and Services guide and a primitive HeaderDoc-generated reference, but the best way to get introduced to it is to watch the session about it from last year's WWDC sessions.
With XPC, you won't have to worry about keeping serial numbers serial, having to contend for a spinning disk, or whether there's enough disk space. Indeed, you don't even have to generate and parse JSON data at all, since XPC's communication mechanism is built around JSON-esque/plist-esque container and value objects.
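A hedged sketch of the producer's side in C (this assumes the consumer is registered with launchd as an XPC service; the service name and dictionary keys are illustrative):

#include <xpc/xpc.h>

int main(void)
{
    xpc_connection_t conn = xpc_connection_create_mach_service(
        "com.example.json.consumer", NULL, 0);

    xpc_connection_set_event_handler(conn, ^(xpc_object_t event) {
        /* Handle XPC_ERROR_CONNECTION_INTERRUPTED and friends here. */
    });
    xpc_connection_resume(conn);

    /* Build a plist-like dictionary instead of serializing JSON. */
    xpc_object_t msg = xpc_dictionary_create(NULL, NULL, 0);
    xpc_dictionary_set_int64(msg, "serial", 1);
    xpc_dictionary_set_string(msg, "payload", "{\"hello\":\"world\"}");

    /* Blocks until the consumer acknowledges the message. */
    xpc_object_t reply = xpc_connection_send_message_with_reply_sync(conn, msg);

    xpc_release(reply);
    xpc_release(msg);
    xpc_release(conn);
    return 0;
}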
Assuming you want the consumer to see the old files, this is the way it's been done since the beginning of time - loathsome though it may be.
There are lots of higher-tech things that look cleaner, but honestly they just tend to add complexity and/or deployment infrastructure that adds hassle. What you suggest works, it works well, and it's easy to write and maintain. You might need some kind of sentinel files to track what you are doing for crash recovery, but that's probably about it.
Hell, most people would just poll with a sleep 5. At least you are all up in the fsevents.
Now if it was acceptable to lose the events generated when the listener wasn't around, and perf was paramount, it could get more interesting. :)

Best form of IPC for a decentralized roguelike?

I've got a project to create a roguelike that in some way abstracts the UI from the engine and the engine from map creation, line-of-sight, etc. To narrow the focus, I first want to just get the UI (player's client) and engine working.
My current idea is to make the client basically a program that decides what one character (player or monster) will do for its turn and waits until it can move again. So each monster has a client, and so does the player. The player's client prints the map, waits for input, sends it to the engine, and tells the player what happened. The monster's client does the same, except without printing the map and using AI instead of keyboard input.
Before I go any further: if this seems a somehow obfuscated way of doing things, my goal is to learn, not to write a roguelike. It's the journey, not the destination.
And so I need to choose what form of IPC fits this model best.
My first attempt used pipes because they're simplest, and I wrote a UI for the player and a program to pipe in instructions such as where to put the map and player. While this works, it only allows one client, communicating through stdin and stdout.
I've thought about making the engine a daemon that watches a spool directory where clients, when started, create unique-per-client temp files to give instructions to the engine and receive feedback.
Lastly, I've done a little introductory programming with sockets. They seem like they might be the way to go, and would allow the game to perhaps someday be run over a network. I'd like to use a simpler solution if possible, since I'm unfamiliar with them, which makes them more error-prone.
I'm always open to suggestions.
I've been playing around with using these combinations for a similar problem (multiple clients talking via a single daemon on the local box, with much of the intelligence shoved off into the clients).
mmap for sharing large data blobs, with Unix domain sockets, message queues, or named pipes for notification
same, but using individual files per blob instead of munging them all together in an mmap
same, but without the files or mmap (in other words, more like conventional messaging)
In general I like the idea of breaking things up into separate executables this way -- it certainly makes testing easier, for instance. I think the choice of method comes down to usage patterns -- how large are messages, how persistent does the data in them need to be, can you afford the cost of multiple trips through the network stack for a socket-based message, that sort of thing. The fact that you're sticking to Linux makes things easy in terms of what's available -- you don't need to worry about portability of message queues, for instance.
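If the socket route wins, the daemon's notification endpoint could be as small as this hedged sketch (the socket path and message format are illustrative):

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    /* Datagram Unix domain socket: one recv per client notification. */
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_un addr = {0};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/roguelike.sock", sizeof(addr.sun_path) - 1);

    unlink(addr.sun_path);   /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    char buf[256];
    ssize_t n = recv(fd, buf, sizeof(buf) - 1, 0);  /* one notification */
    if (n >= 0) {
        buf[n] = '\0';
        printf("client says: %s\n", buf);
    }
    close(fd);
    return 0;
}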
This one's also applicable: https://stackoverflow.com/a/1428542/1264797

What causes CreateDirectory to return ERROR_ACCESS_DENIED?

In another question, we established that yes, CreateDirectory occasionally fails with the undocumented GetLastError value of ERROR_ACCESS_DENIED, and that the right way to handle the situation is probably to try again a few times. It's easy to implement such an algorithm, but it's not so easy to test it when you don't know how to reproduce it.
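For concreteness, the retry loop in question might look something like this sketch (the backoff numbers are arbitrary):

#include <windows.h>

BOOL CreateDirectoryWithRetry(LPCWSTR path, int maxAttempts)
{
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (CreateDirectoryW(path, NULL)) return TRUE;
        DWORD err = GetLastError();
        if (err == ERROR_ALREADY_EXISTS) return TRUE;
        if (err != ERROR_ACCESS_DENIED) return FALSE;  /* a real failure */
        Sleep(10 << attempt);  /* back off; observed failures last ~10 ms */
    }
    return FALSE;
}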
I don't need theories for why this happens. It might be a bug in Windows, yeah. It might also be by design. Ultimately, at this point, it doesn't matter, because Microsoft shipped the behavior, and I must cope.
I also don't need an explanation of multi-tasking operating system theory and how Windows implements it in general. I write system software for a living. I understand little else.
What I need now is a reliable way to reproduce the failure so I can write a test case for the code which copes. Here's what I've tried so far:
I wrote test program P1, which slowly and repeatedly enumerates the contents of the would-be parent. As well, I wrote test program P2 which does nothing but repeatedly attempt to delete and create a directory in the would-be parent. I figured keeping an enumeration open a long time might make the problem more likely. Running P2 by itself produces an occasional period of failures (on the order of every several minutes for approximately 10 milliseconds). Running P1 and P2 at the same time does not seem to make the failures any more frequent or long.
I ran two instances of P2 at the same time, and that does not seem to make the failures any more frequent or long.
I modified P2 so that it can create files in addition to directories, and running that at the same time as P1 does not seem to make the failures any more frequent or long.
I ran P1 and multiple instances of P2 with different parameters all at the same time, and that does not seem to make the failures any more frequent or long.
I wrote test program P3 which moves items into and out of the would-be parent and ran P3 at the same time as P2, and that does not seem to make the failures any more frequent or long.
Any other ideas?
Let me start by double-checking that I understand the question. If you run something like the below snippet, you expect it to fail eventually, right?
while (true)
{
    System.IO.Directory.CreateDirectory( ".\\FooDir" );
    System.IO.Directory.Delete( ".\\FooDir" );
}
If your application is the only thing running on the system that has a handle open to that file, then this feels like a bug. So knowing the OS version would help.
On the other hand, if there is something else in the system that is keeping the handle open for just a little while, then whether this is a bug or not becomes a little more fuzzy. The number of things that try to blindly grok files and directories might surprise you. A naive indexer, for example, might be walking into that directory, enumerating it, looking for files to index and so on -- and if you collide with him, blammo. A similarly naive anti-virus filter, or some other file system filter, might be poking it as well (in this case, it still feels like a bug).
There are little things we've done in the OS to try and give services like these ways to get out of your way. Does it repro if you turn the indexer off, if you turn off any anti-virus, any anti-malware? We can go from there, and hopefully we will find that newer bits have it fixed already (that statement had a lot of assumptions in it, I know).
One other relatively interesting piece of trivia is that ERROR_ACCESS_DENIED is a Win32 error that is mapped from more than one underlying status in the system (see this article for example). So if we can dig a little deeper, we may be able to find out what the file system is trying to tell the app (if it's more than access denied).
We might end up getting into a conversation about whether you can, in the wild, assume that your app is the only thing poking at your files and directories. You can probably guess where that one will go.
I would hazard a guess that your enumeration/deletion/creation is causing some synchronization problems with the handles. If CreateDirectory is anything like CreateFile (and I would assume the logic behind them is shared), then you would see behaviour similar to CreateFile's:
If you call CreateFile on a file that is pending deletion as a result of a previous call to DeleteFile, the function fails. The operating system delays file deletion until all handles to the file are closed. GetLastError returns ERROR_ACCESS_DENIED.
