I am writing a program to sit in the background on OS X 10.6, listen to keystrokes and record them, grouping them by window title. (No, I am not writing malicious software. I do not need this program to be sneaky in any way; I just want a safety net for when I have typed a huge email and then accidentally refresh the page (APPLE-R) instead of opening a new tab (APPLE-T).) I have already found Apple's EventMonitorTest example for the keystroke-capturing code; now I just need to find the title of the "key window".
Does anyone know where I can find examples for this kind of functionality? Thank you!
A couple of possibilities:
You could use the Accessibility API (though of course keep in mind that 64-bit Carbon does not support this)
You could use the CGWindow functions introduced in Leopard
I suspect the first option will be easier to do this with, since the CGWindow API is somewhat low-level and treats all windows (application windows, menu bars, dock icons, etc.) more or less equally.
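To illustrate the point about the CGWindow route treating every window alike, here is a minimal sketch of the filtering it would need, combined with the grouping the question asks for. It works on plain dictionaries whose key names mirror the CGWindow dictionary keys (kCGWindowLayer, kCGWindowOwnerName, kCGWindowName); on a real system the list would have to come from CGWindowListCopyWindowInfo(), which this sketch does not call. The sample data and the KeystrokeLog helper are hypothetical, and treating layer 0 as "ordinary application window" is an assumption.

```python
from collections import defaultdict

# Key names mirror the CGWindow dictionary keys; the sample data is made up.
LAYER, OWNER, TITLE = "kCGWindowLayer", "kCGWindowOwnerName", "kCGWindowName"


def frontmost_window_title(window_infos):
    """Return (owner, title) of the first layer-0 window, or None.

    Assumes window_infos is ordered front to back, as an on-screen-only
    window list is documented to be.
    """
    for info in window_infos:
        if info.get(LAYER, 0) != 0:
            continue  # skip menu bar, Dock, and other non-application layers
        return info.get(OWNER, ""), info.get(TITLE, "")
    return None


class KeystrokeLog:
    """Hypothetical helper: groups recorded keystrokes by window."""

    def __init__(self):
        self._by_window = defaultdict(list)

    def record(self, window, characters):
        self._by_window[window].append(characters)

    def text_for(self, window):
        return "".join(self._by_window.get(window, []))


sample_windows = [
    {LAYER: 25, OWNER: "SystemUIServer", TITLE: ""},
    {LAYER: 0, OWNER: "Safari", TITLE: "Compose Mail"},
    {LAYER: 0, OWNER: "Terminal", TITLE: "bash"},
]

log = KeystrokeLog()
current = frontmost_window_title(sample_windows)
log.record(current, "Dear ")
log.record(current, "Bob,")
print(current, log.text_for(current))
```

Keying the log on the (owner, title) pair rather than the title alone keeps two applications with identically titled windows apart.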
Is there any way to replace a special character with a keyboard shortcut live?
For instance: Writing $ would actually press ctrl+n or arrow key left
Any help is much appreciated!
This is primarily speculation with a little experience and research mixed in.
This sort of thing is easy enough if you are checking in an application that currently has focus, but creating a universal keypress hook? Not so much.
I built a C#/C++ program in grad school that intercepted keystrokes intended for another application, but I was only able to do it by waiting for the desired application window to open, auto-opening my own pop-up window to receive the input, and then passing keystrokes back to the original window.
I'm not saying it can't be done, period, but my background knowledge (though slightly dated) and a little cursory research aren't turning up anything in the basic scripting world that would satisfy what you appear to be after.
The only way I know how to do it (which is likely wrong) would be to have hooks in every open application and, when a textbox in an application gains focus, give focus to your own text-receiving app. That app would analyze the keypresses and then pass the desired text/keypresses on to the original app/textbox. This would require prior knowledge of the "windows" (i.e. all objects) in all possible apps on the machine you're working on, so you would know when a textbox received focus.
If I recall correctly, it might be possible to tell when keys are being pressed (if you have hooks in all apps) and redirect from there, but even then you might lose the first keystroke.
Again, this is primarily speculative.
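Leaving the interception problem aside, the "analyze the keypresses, then pass them on" step described above might look roughly like this. It is only a sketch of the translation logic: the REMAP table and the event tuples are hypothetical, and nothing here hooks into or sends keys to a real application.

```python
# Hypothetical mapping from a typed character to the shortcut that replaces it.
REMAP = {
    "$": ("ctrl", "n"),
    "~": ("left",),
}


def translate(typed):
    """Turn typed characters into events to forward to the original app.

    Each event is ("text", char) for a pass-through character or
    ("shortcut", keys) for a remapped one.
    """
    events = []
    for char in typed:
        if char in REMAP:
            events.append(("shortcut", REMAP[char]))
        else:
            events.append(("text", char))
    return events


for event in translate("a$b"):
    print(event)
```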
I use Mac OS X 10.10 and I would like to write a program that continuously watches for a particular window by checking the titles of all open windows. When that window appears, the program should look for a button with a specific label and, once it has found it, send it a "pressed" message.
I would be able to do this under Windows, but I am not so familiar with the Mac.
I have found a question related to mine (How do I get a list of the window titles on the Mac OSX?), but I think the most difficult part is finding the button and sending it a "pressed message".
Thank you in advance!
What you are looking for is the Accessibility APIs. These are mostly Core Foundation-style C APIs, typically prefixed with AX.
You might also want to consider additional identifiers beyond window title as window titles are not necessarily unique.
Using the AX APIs is not easy and is extremely verbose. You can use them to explore the UI, find elements and interact with them, but you might have more limited success observing user interaction. That might require a more fragile combination with event monitoring, using an NSEvent global monitor (addGlobalMonitorForEventsMatchingMask:handler:) or a CGEventTap, depending on the UI widgets involved.
Also note that using the AX APIs to control anything outside your own app is not compatible with the App Sandbox.
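The "explore the UI, find a button, press it" part can be sketched as a tree walk. The example below uses nested dictionaries as a stand-in for the real AXUIElement tree; the attribute and action names (AXRole, AXTitle, AXChildren, AXButton, AXPress) are the accessibility ones, but in real code each lookup would go through AXUIElementCopyAttributeValue() and the press through AXUIElementPerformAction(). The sample window and the press() recorder are hypothetical.

```python
def find_button(element, title):
    """Depth-first search for an AXButton with the given AXTitle."""
    if element.get("AXRole") == "AXButton" and element.get("AXTitle") == title:
        return element
    for child in element.get("AXChildren", []):
        found = find_button(child, title)
        if found is not None:
            return found
    return None


def press(element, performed):
    """Stand-in for performing the AXPress action; records it instead."""
    performed.append((element["AXTitle"], "AXPress"))


# Made-up element tree shaped like a small dialog window.
window = {
    "AXRole": "AXWindow",
    "AXTitle": "Save changes?",
    "AXChildren": [
        {"AXRole": "AXStaticText", "AXValue": "Save before closing?"},
        {"AXRole": "AXGroup", "AXChildren": [
            {"AXRole": "AXButton", "AXTitle": "Cancel"},
            {"AXRole": "AXButton", "AXTitle": "Save"},
        ]},
    ],
}

performed = []
button = find_button(window, "Save")
if button is not None:
    press(button, performed)
print(performed)
```

Because buttons are often nested inside groups, the search recurses through AXChildren rather than looking only at the window's direct children.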
I want to get the window handle of some controls to do some stuff with it (requiring a handle). The controls are in a different application.
Strangely enough, I found out that many controls don't have a window handle, like the buttons in the toolbar (?) in Windows Explorer. Just try to get a handle to the Folder/Search/(etc.) buttons. It just gives me 0.
So, first question: how come some controls have no window handle? Aren't all controls windows at heart? (I am just talking about standard controls, as I would expect them in Windows Explorer, nothing custom-drawn on a pane or the like.)
Which brings me to my second question: how to work with them (like using EnableWindow) if you cannot get their handle?
Many thanks for any inputs!
EDIT (ADDITIONAL INFORMATION):
Windows Explorer is just an example. I run into the problem frequently, including in a different application (the proprietary one I am really interested in). The controls are "physical" in the sense that I can get an AutomationElement for them, but they have no window handle. I am also trying to send a message (SendMessage) to get the button state, to find out whether the button is pushed or not. It is a standard button that seems to expose that state only through that message, at least as far as I have seen. The pushed state can also last a lot longer on that button than you would expect from a standard button; the Windows Explorer buttons behave similarly, acting like button-style checkboxes even though they are push buttons. SendMessage requires a window handle.
Does a ToolBar in some way change the behaviour of its child elements, for example by taking away their window handles (and identifying them by parent handle and control ID instead)? If so, how do I use functions that require a window handle on those controls?
If they don't have a handle, they're not real controls, they're just drawn to look like controls.
But of course, the toolbar buttons in Windows Explorer are not windows themselves; they're part of a toolbar, and it is the toolbar that has the window handle. Use the toolbar manipulation functions to interact with them, not EnableWindow.
Or, better yet, use the documented APIs for things like search. Reverse-engineering Windows Explorer has never ended well for anyone, least of all the poor Windows Shell team, saddled with years of backwards-compatibility hacks for certain developers who thought that APIs are for everyone else. Whatever you do manage to get to work is very likely to break on the next version of Windows.
The controls you are talking about use the ToolbarWindow32 class. If you want to interact with them, you'll need to use the toolbar control APIs/messages. For example, to enable or disable a button you'd use TB_ENABLEBUTTON.
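As a rough sketch of what that means in practice: the message goes to the toolbar's window handle, and the button is identified by its command identifier in wParam. The constants below follow commctrl.h (WM_USER is 0x0400, TB_ENABLEBUTTON is WM_USER + 1, TB_ISBUTTONPRESSED is WM_USER + 11). The helper names are hypothetical, the toolbar handle and command ID would have to be discovered separately, and send_to_toolbar() is an untested assumption about how one might call SendMessageW through ctypes on Windows.

```python
import sys

WM_USER = 0x0400
TB_ENABLEBUTTON = WM_USER + 1      # wParam: command ID, lParam: MAKELPARAM(enable, 0)
TB_ISBUTTONPRESSED = WM_USER + 11  # wParam: command ID, lParam: 0


def make_lparam(low, high):
    """Equivalent of the MAKELPARAM macro: pack two 16-bit words."""
    return (low & 0xFFFF) | ((high & 0xFFFF) << 16)


def enable_button_message(command_id, enable):
    """Build (msg, wParam, lParam) for enabling or disabling one button."""
    return (TB_ENABLEBUTTON, command_id, make_lparam(1 if enable else 0, 0))


def is_pressed_message(command_id):
    """Build (msg, wParam, lParam) for querying the pressed state."""
    return (TB_ISBUTTONPRESSED, command_id, 0)


def send_to_toolbar(toolbar_hwnd, message):
    """Send a prepared message to the toolbar window (Windows only)."""
    if sys.platform != "win32":
        raise OSError("SendMessageW is only available on Windows")
    import ctypes
    from ctypes import wintypes
    send = ctypes.windll.user32.SendMessageW
    send.argtypes = [wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM]
    send.restype = ctypes.c_ssize_t
    return send(toolbar_hwnd, *message)


# 42 is a made-up command ID, used only to show the message layout.
print(enable_button_message(42, False))
```

Neither message passes a pointer, so no memory has to be shared with the target process in order to send them.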
An application can implement its controls itself using GDI, OpenGL or DirectX. Try Window Detective on Mozilla Firefox and you will see that there is only one window; the controls in its dialog boxes are not windows known to Windows.
I'm facing a problem for an application I'm writing (http://code.google.com/p/blazingstars/issues/detail?id=25), where my program is a menulet (menu bar) application that uses the Accessibility API to interact with and control another program. I do the usual things like registering for the API notifications and getting the window list through API calls, etc., but I realized a while ago that if my program is started in a second Space (virtual desktop) after the program I'm interacting with is started in the first, my program will crash and burn because it can't access any information about its target. (Is there a way around that problem I'm missing?)
A simple solution would be to pop up a dialog asking the user to restart the program in the correct Space, but for the life of me I can't figure out how to tell which Space my target is in, either through NSWorkspace or the Accessibility API, so that I can compare it to the Space that I'm in. Any ideas?
Note that setting the collection behaviour to NSWindowCollectionBehaviorCanJoinAllSpaces isn't going to do me any good because I have to do a bunch of work upon launch, so I have to be in the same space as my target right from the start.
I think you can do this with the APIs in CGWindow.h.
Specifically, see CGWindowListCopyWindowInfo() and kCGWindowWorkspace.
I've used these APIs to do all sorts of things, like getting window contents, window frames, etc.
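A minimal sketch of the comparison this would enable, assuming the window dictionaries have already been fetched: it collects the kCGWindowWorkspace values for the windows owned by each process and checks whether the two sets overlap. The key names mirror the CGWindow dictionary keys, the PIDs and workspace numbers are made up, and it is an assumption that the list was requested for all windows rather than only on-screen ones and that both processes own at least one window.

```python
PID, WORKSPACE = "kCGWindowOwnerPID", "kCGWindowWorkspace"


def workspaces_for_pid(window_infos, pid):
    """Collect the workspace numbers of every window owned by pid."""
    return {
        info[WORKSPACE]
        for info in window_infos
        if info.get(PID) == pid and WORKSPACE in info
    }


def shares_space(window_infos, my_pid, target_pid):
    """True if the two processes have a window in a common workspace."""
    mine = workspaces_for_pid(window_infos, my_pid)
    theirs = workspaces_for_pid(window_infos, target_pid)
    return bool(mine & theirs)


# Made-up data: process 100 is the target, process 200 is the menulet.
sample = [
    {PID: 100, WORKSPACE: 1},
    {PID: 100, WORKSPACE: 1},
    {PID: 200, WORKSPACE: 2},
]

print(workspaces_for_pid(sample, 100), shares_space(sample, 200, 100))
```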
If that doesn't work then you might want to try this private API:
extern CGSError CGSGetWindowWorkspace(const CGSConnectionID cid,
                                      CGSWindowID wid,
                                      CGSWorkspaceID *workspace);
The trick would be getting the connection ID of the target process.
You should probably redesign your app so that it delays its initialization until the app you want to control is in the current space.
There is no easy way to do this under Leopard because there are no official "space change" notifications, but the blog post and comments on this page may help.
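Without a notification to wait for, one fallback (my substitution, not something the linked post is known to recommend) is to poll: check periodically whether the target is reachable and only then run the expensive initialization. A minimal sketch, in which the condition callback is a hypothetical stand-in for "the target app is in the current space":

```python
import time


def wait_until(condition, interval=1.0, timeout=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll condition() until it is true; return False on timeout."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Hypothetical stand-in: pretends the target shows up on the third check.
checks = iter([False, False, True])
ready = wait_until(lambda: next(checks), interval=0.01, timeout=1.0)
print("initialize now" if ready else "gave up waiting")
```

The timeout keeps the program from waiting forever if the target never appears, at which point asking the user to intervene is still an option.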
Using the Apple OS X Cocoa framework, how can I post a sheet (slide-down modal dialog) on the window of another process?
Edit: Clarified a bit:
My application is a Finder extension to do Subversion version control (http://scplugin.tigris.org/). Part of my application is a plug-in (a Contextual Menu Item for Finder); the bulk of my application, however, is in a separate daemon process. For several reasons, we've chosen to put virtually all the code into the daemon; the plug-in only defines the menu itself and sends Apple Events over to the daemon.
Sometimes, the daemon needs to prompt the user for further information. It can toss a window on-screen for this, but that's disruptive (randomly positioned), and it seems to me the workflow here is legitimately modal, for example: "select a file, pick 'commit' from the menu, provide commit comments, do the operation."
Interprocess cooperation (such as passing a reference of some kind) is acceptable: both processes are mine, but I want to avoid binding the sheet's code into the primary process.
Really, it sounds like you're trying to have your inter-process communication happen at the view level, which isn't really how Cocoa generally works. Things will be much easier if you separate your layers a bit more than that.
Why don't you want to put the sheet code into the other process? It's view code, and view code is inherently process-specific. The right thing to do here is probably to add somewhat generic modal-sheet support to your plugin code, and an IPC call that your daemon can make to summon that code. Trying to ship view objects over to the remote process is going to be nightmarish if you can make it work at all.
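To make the "generic modal-sheet support plus an IPC call" idea concrete, here is a minimal sketch of what such a request and reply could carry. Everything in it is an assumption for illustration: the JSON encoding, the field names and the fake_sheet() stand-in are hypothetical, and the real project would presumably carry the same information in the Apple Events it already uses.

```python
import json


def make_prompt_request(request_id, title, message, fields):
    """Daemon side: encode a request asking the plug-in to show a sheet."""
    return json.dumps({
        "id": request_id,
        "kind": "prompt",
        "title": title,
        "message": message,
        "fields": fields,
    })


def handle_request(raw, show_sheet):
    """Plug-in side: decode the request, show the sheet, encode the reply."""
    request = json.loads(raw)
    if request.get("kind") != "prompt":
        return json.dumps({"id": request.get("id"), "error": "unsupported"})
    answers = show_sheet(request["title"], request["message"], request["fields"])
    return json.dumps({"id": request["id"], "answers": answers})


def fake_sheet(title, message, fields):
    # Stand-in for the real view code, which would run a sheet on the window.
    return {name: "Fix typo in README" for name in fields}


raw = make_prompt_request(1, "Commit", "Enter a commit comment", ["comment"])
reply = json.loads(handle_request(raw, fake_sheet))
print(reply)
```

The daemon only describes what it needs to ask; the plug-in owns all of the view code, which is the separation of layers argued for above.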
You're fighting the frameworks with this approach.
You can't add a sheet to a window in another process, because you have, at best, very restricted access to the windows in the other process.
Please don't do this. Make the interaction nonmodal if at all possible. Especially in something like a commit, it's much nicer to be able to browse around your files while you're writing commit comments.
OS X does have window groups, but I don't think they can (easily) span applications.
Another thing to consider is that in OS X it's possible to have many Finder windows open on the same folder (unlike in OS 9). Even if you did have sufficient privileges/APIs to add a sheet to a Finder window, it's not like the modality of that window would prevent the user from being able to continue working with the files.
(My personal opinion as a long-time Mac user is that this kind of interaction would drive me right up the wall.)