If one is attempting to build a desktop program with a semi-complex GUI, especially one in which users can open multiple instances of identical GUI components such as having a "project" GUI and permitting users to open multiple projects concurrently within the main window, is it good practice to push the event listeners further up the widget hierarchy and use the event detail to determine upon which widget the event took place, as opposed to placing event listeners on each individual widget?
For example, in doing something similar in a web browser, there were no event listeners on any individual project GUI elements. The listeners were on the parent container that held the multiple instances of each project GUI. A project had multiple tabs within its GUI, but only one tab was visible within a project at a time and only one project was visible at any one time; so, it was fairly easy to use classes on the HTML elements and then the e.matches() method on the event.target to act upon the currently visible tab within the currently visible project in a manner that was independent of which project it was that was visible. Without any real performance testing, it was my unqualified impression as an amateur that having as few event listeners as possible was more efficient and I got most of that by reading information that wasn't very exact.
I read recently in John Ousterhout's book that Tk applications can have hundreds of event handlers and wondered whether or not attempting to limit the number of them as described above really makes any difference in Tcl/Tk.
My purpose in asking this question is solely to understand events better in order to start off the coding of my Tcl/Tk program correctly and not have to re-code a bunch of poorly structured event listeners. I'm not attempting to dispute anything written in the mentioned book and don't know enough to do so if I wanted to.
Thank you for any guidance you may be able to provide.
Having hundreds of event handlers is usually just a mark that there's a lot of different events possibly getting sent around. Since you usually (but not always) try to specialize the binding to be as specific as possible, the actual event handler is usually really small, but might call a procedure to do the work. That tends to work out well in practice. Myself, my rule of thumb is that if it is not a simple call then I'll put in a helper procedure; it's easier to debug them that way. (The main exception to my rule is if I want to generate a break.)
There are four levels you can usually bind on (plus more widget-specific ones for canvas and text):
The individual widget. This is the one that you'll use most.
The widget class. This is mostly used by Tk; you'll usually not want to change it because it may alter the behaviour of code that you just use. (For example, don't alter the behaviour of buttons!)
The toplevel containing the widget. This is ideal for hotkeys. (Be very careful though; some bindings at this level can be trouble. <Destroy> is the one that usually bites.) Toplevel widgets themselves don't have this, because of rule 1.
all, which does what it says, and which you almost never need.
You can define others with bindtags… but it's usually not a great plan as it is a lot of work.
The other thing to bear in mind is that Tk supports virtual events, <<FooBarHappened>>. They have all sorts of uses, but the main one in a complex application (that you should take note of) is for defining higher-level events that are triggered by a sequence of low-level events occasionally, and yet which other widgets than the originator may wish to take note of.
Related
I want to do some tests on our tcl tk application regarding the user interaction. As the application has parts similar to a CAD for which every mouse movement is relevant, I would like to do something like record all events of some user interactions. My goal would be to playback these events laterwards and on every program change to discover potential changes. Or even better to assure the GUI behaves always the same and produces always the same data.
I know, that I can generate some enter motion and button events, but this would not be the same like the thousands of events generated by a real user interaction. But it is very important for me to have exactly these thousands of events.
Is there any possibility to achieve this?
It's relatively easy to record events of particular types with bind — you'll find that <ButtonPress>, <ButtonRelease>, <Enter>, <Leave>, <FocusIn>, <FocusOut>, <KeyPress> and <KeyRelease> cover pretty much everything that you are interested in — and then play them back with event generate. (You need to record quite a bit of information about each event in order to regenerate it correctly, but the underlying model is that of X events with similar names.) Assuming you're not wanting to support inter-application cut-and-paste or drag-and-drop for the purposes of recording; those complicate things a lot. You'll likely have a lot of events; recording to an SQLite database might make a lot of sense.
However, you should think carefully about which parts of the application you want to record. Does it matter if the order of two buttons in the outer shell of the application outside the CAD-like area get swapped in order? For most users, provided you're clear about what the buttons do (through clear labels and icons) it isn't very important, but for replaying recorded events it can matter hugely. Instead, for the parts of the application that are simple buttons and edit fields, I'd not record the details of them but would instead just record when the buttons are clicked and the changes to the text content of entries and so on. In effect, it's capturing higher-level events, and that's much easier to replay correctly. It's only when the user is in that main CAD area that you need the full detail.
Also, beware of changes to font sizes and screen sizes/scaling. They can change how things are laid out and may happen because of system-level alterations outside the scope of your application.
We started out the way you describe: record all those thousands of motion events, etc. Including exact timings which are extremely important for a GUI application as well.
It quickly became appearent that those recordings became too hard to maintain. They are also overly brittle in light of UI changes. Another problem where the hardcoded time values. A switch to a more powerful machine (or a cpu under load) would break the execution.
The two biggest improvements we introduced
Event compression: recognize the high-level action the user wanted to perform (like selecting a menu item). The recorded activateItem command would then perform the necessary work (event emulation) on replay.
Synchronization functions: instead of relying on a particular timing commands like waitForObject wait for an object to come into existance and become ready for interaction.
It took several years for this to work fluently, however. Including a central Object Map repository, property and screenshot verifications, high-level test descriptions in BDD and others. Feel free to take a look a the Squish for Tk product that came out of this work.
I'm working on a third-party program that aggregates data from a bunch of different, existing Windows programs. Each program has a mechanism for exporting the data via the GUI. The most brain-dead approach would have me generate extracts by using AutoIt or some other GUI manipulation program to generate the extractions via the GUI. The problem with this is that people might be interacting with the computer when, suddenly, some automated program takes over. That's no good. What I really want to do is somehow have a program run once a day and silently (i.e. without popping up any GUIs) export the data from each program.
My research is telling me that I need to hook each application (assume these applications are always running) and inject a custom DLL to trigger each export. Am I remotely close to being on the right track? I'm a fairly experienced software dev, but I don't know a whole lot about reverse engineering or hooking. Any advice or direction would be greatly appreciated.
Edit: I'm trying to manage the availability of a certain type of professional. Their schedules are stored in proprietary systems. With their permission, I want to install an app on their system that extracts their schedule from whichever system they are using and uploads the information to a central server so that I can present that information to potential clients.
I am aware of four ways of extracting the information you want, both with their advantages and disadvantages. Before you do anything, you need to be aware that any solution you create is not guaranteed and in fact very unlikely to continue working should the target application ever update. The reason is that in each case, you are relying on an implementation detail instead of a pre-defined interface through which to export your data.
Hooking the GUI
The first way is to hook the GUI as you have suggested. What you are doing in this case is simply reading off from what an actual user would see. This is in general easier, since you are hooking the WinAPI which is clearly defined. One danger is that what the program displays is inconsistent or incomplete in comparison to the internal data it is supposed to be representing.
Typically, there are two common ways to perform WinAPI hooking:
DLL Injection. You create a DLL which you load into the other program's virtual address space. This means that you have read/write access (writable access can be gained with VirtualProtect) to the target's entire memory. From here you can trampoline the functions which are called to set UI information. For example, to check if a window has changed its text, you might trampoline the SetWindowText function. Note every control has different interfaces used to set what they are displaying. In this case, you are hooking the functions called by the code to set the display.
SetWindowsHookEx. Under the covers, this works similarly to DLL injection and in this case is really just another method for you to extend/subvert the control flow of messages received by controls. What you want to do in this case is hook the window procedures of each child control. For example, when an item is added to a ComboBox, it would receive a CB_ADDSTRING message. In this case, you are hooking the messages that are received when the display changes.
One caveat with this approach is that it will only work if the target is using or extending WinAPI controls.
Reading from the GUI
Instead of hooking the GUI, you can alternatively use WinAPI to read directly from the target windows. However, in some cases this may not be allowed. There is not much to do in this case but to try and see if it works. This may in fact be the easiest approach. Typically, you will send messages such as WM_GETTEXT to query the target window for what it is currently displaying. To do this, you will need to obtain the exact window hierarchy containing the control you are interested in. For example, say you want to read an edit control, you will need to see what parent window/s are above it in the window hierarchy in order to obtain its window handle.
Reading from memory (Advanced)
This approach is by far the most complicated but if you are able to fully reverse engineer the target program, it is the most likely to get you consistent data. This approach works by you reading the memory from the target process. This technique is very commonly used in game hacking to add 'functionality' and to observe the internal state of the game.
Consider that as well as storing information in the GUI, programs often hold their own internal model of all the data. This is especially true when the controls used are virtual and simply query subsets of the data to be displayed. This is an example of a situation where the first two approaches would not be of much use. This data is often held in some sort of abstract data type such as a list or perhaps even an array. The trick is to find this list in memory and read the values off directly. This can be done externally with ReadProcessMemory or internally through DLL injection again. The difficulty lies mainly in two prerequisites:
Firstly, you must be able to reliably locate these data structures. The problem with this is that code is not guaranteed to be in the same place, especially with features such as ASLR. Colloquially, this is sometimes referred to as code-shifting. ASLR can be defeated by using the offset from a module base and dynamically getting the module base address with functions such as GetModuleHandle. As well as ASLR, a reason that this occurs is due to dynamic memory allocation (e.g. through malloc). In such cases, you will need to find a heap address storing the pointer (which would for example be the return of malloc), dereference that and find your list. That pointer would be prone to ASLR and instead of a pointer, it might be a double-pointer, triple-pointer, etc.
The second problem you face is that it would be rare for each list item to be a primitive type. For example, instead of a list of character arrays (strings), it is likely that you will be faced with a list of objects. You would need to further reverse engineer each object type and understand internal layouts (at least be able to determine offsets of primitive values you are interested in in terms of its offset from the object base). More advanced methods revolve around actually reverse engineering the vtable of objects and calling their 'API'.
You might notice that I am not able to give information here which is specific. The reason is that by its nature, using this method requires an intimate understanding of the target's internals and as such, the specifics are defined only by how the target has been programmed. Unless you have knowledge and experience of reverse engineering, it is unlikely you would want to go down this route.
Hooking the target's internal API (Advanced)
As with the above solution, instead of digging for data structures, you dig for the internal API. I briefly covered this with when discussing vtables earlier. Instead of doing this, you would be attempting to find internal APIs that are called when the GUI is modified. Typically, when a view/UI is modified, instead of directly calling the WinAPI to update it, a program will have its own wrapper function which it calls which in turn calls the WinAPI. You simply need to find this function and hook it. Again this is possible, but requires reverse engineering skills. You may find that you discover functions which you want to call yourself. In this case, as well as being able to locate the location of the function, you have to reverse engineer the parameters it takes, its calling convention and you will need to ensure calling the function has no side effects.
I would consider this approach to be advanced. It can certainly be done and is another common technique used in game hacking to observe internal states and to manipulate a target's behaviour, but is difficult!
The first two methods are well suited for reading data from WinAPI programs and are by far easier. The two latter methods allow greater flexibility. With enough work, you are able to read anything and everything encapsulated by the target but requires a lot of skill.
Another point of concern which may or may not relate to your case is how easy it will be to update your solution to work should the target every be updated. With the first two methods, it is more likely no changes or small changes have to be made. With the second two methods, even a small change in source code can cause a relocation of the offsets you are relying upon. One method of dealing with this is to use byte signatures to dynamically generate the offsets. I wrote another answer some time ago which addresses how this is done.
What I have written is only a brief summary of the various techniques that can be used for what you want to achieve. I may have missed approaches, but these are the most common ones I know of and have experience with. Since these are large topics in themselves, I would advise you ask a new question if you want to obtain more detail about any particular one. Note that in all of the approaches I have discussed, none of them suffer from any interaction which is visible to the outside world so you would have no problem with anything popping up. It would be, as you describe, 'silent'.
This is relevant information about detouring/trampolining which I have lifted from a previous answer I wrote:
If you are looking for ways that programs detour execution of other
processes, it is usually through one of two means:
Dynamic (Runtime) Detouring - This is the more common method and is what is used by libraries such as Microsoft Detours. Here is a
relevant paper where the first few bytes of a function are overwritten
to unconditionally branch to the instrumentation.
(Static) Binary Rewriting - This is a much less common method for rootkits, but is used by research projects. It allows detouring to be
performed by statically analysing and overwriting a binary. An old
(not publicly available) package for Windows that performs this is
Etch. This paper gives a high-level view of how it works
conceptually.
Although Detours demonstrates one method of dynamic detouring, there
are countless methods used in the industry, especially in the reverse
engineering and hacking arenas. These include the IAT and breakpoint
methods I mentioned above. To 'point you in the right direction' for
these, you should look at 'research' performed in the fields of
research projects and reverse engineering.
I am currently refactoring code that coordinates multiple hardware components for data acquisition, and feeling a bit like I'm recreating the wheel. In particular, an MVC-like pattern seems to be emerging. Except, this has nothing to do with a GUI and I'm worried that I'm forcing this particular pattern where another might be more appropriate. Here's my scenario:
Individual hardware "component" classes obey interface contracts for each hardware type. Previously, component instances were orchestrated by a single monolithic InstrumentController class, which relied heavily on configuration + branching logic for executing a specific acquisition sequence. After an iteration, I have a separate controller for each component, with these controllers all managed by a small InstrumentControllerBase (or its derivatives). The composite system will receive "input" either programmatically or via inter-hardware component triggering - in either case these interactions are routed to, and handled by, the appropriate controller.
So, I have something that feels MVC-esque, but I don't know if that's because I'm forcing the point. With little direct MVC experience in application development, it's hard to know if I'm just trying to make my scenario fit MVC, where another pattern might be a good alternative or complimentary. My problem is, search results and wiki documentation of these family of patterns seems to immediately drop me into GUI-specific discussions.
I understand "M means Model data and the V means View" - but what do you call the superset pattern? Component-Commander-Controller?
Whence can I exhume examples exemplary?
IMO a "view" is not necessarily a GUI component. The pattern is easiest to demonstrate with GUIs but that does not limit its usability to GUIs. If it works for you, don't worry about the name :-) And of course, feel free to tailor it according to your needs.
Update: Of more generic kins of MVC, the only example which surfaced in my mind (after a day's background processing) is PAC.
My question has two parts:
Does anyone have any tips or references to some documentation on the web about how to write GUI code that is easy to read, write, and maintain?
Example.
I find that the more extensive my GUI forms become, I end up with a long list of fairly short event handler methods. If I try to add any private helper methods, they just get lost in the shuffle, and I constantly have to scroll around the page to follow a single line of thought.
How can I easily manage settings across the application?
Example.
If the user selects a new item in a drop-down list, I might need to enable some components on the GUI, update an app config file, and store the new value in a local variable for later. I usually opt to not create event handlers for all the settings (see above), and end up with methods like "LoadGUISettings" and "SaveGUISettings", but then I end up calling these methods all over my code, and it runs through a lot of code just to update very few, if any, actual changes.
Thanks!
Some guidelines for the first question, from an OO perspective:
Break up large classes into smaller ones. Does that panel have a bunch of fairly modular subpanels? Make a smaller class for each subpanel, then have another, higher-level class put them all together.
Reduce duplication. Do you have two trees that share functionality? Make a superclass! Are all of your event handlers doing something similar? Create a method that they all call!
Second question. I see two ways of doing this:
Listeners. If many components should respond to a change that occured in one component, have that component fire an event.
Global variables. If many components are reading and writing the same data, make it global (however you do that in your chosen language). For extra usefulness, combine the two approaches and let components listen for changes in the global data object.
If you're using WPF, you might want to read the Composite Application Guideance for WPF.
It discusses many of these topics (as well as many others). The main goal of that guideance is to make large scale applications in a flexible, maintainable manner.
You should definitely look at Jeremy Miller's guide to rich client design. It is incomplete, but I believe he is writing a book on the subject.
Another blog you should check out is Rich Newman's. He is writing about Composite Application Block which is a MS best practice guide on how to structure rich clients.
You can also read this book which is only a very light read but gives you some good ideas.
I'm not sure if this has been asked or not yet, but how much logic should you put in your UI classes?
When I started programming I used to put all my code behind events on the form which as everyone would know makes it an absolute pain in the butt to test and maintain. Overtime I have come to release how bad this practice is and have started breaking everything into classes.
Sometimes when refactoring I still have that feeling of "where should I put this stuff", but because most of the time the code I'm working on is in the UI layer, has no unit tests and will break in unimaginable places, I usually end up leaving it in the UI layer.
Are there any good rules about how much logic you put in your UI classes? What patterns should I be looking for so that I don't do this kind of thing in the future?
Just logic dealing with the UI.
Sometimes people try to put even that into the Business layer. For example, one might have in their BL:
if (totalAmount < 0)
color = "RED";
else
color = "BLACK";
And in the UI display totalAmount using color -- which is completely wrong. It should be:
if (totalAmount < 0)
isNegative = true;
else
isNegative = false;
And it should be completely up to the UI layer how totalAmount should be displayed when isNegative is true.
As little as possible...
The UI should only have logic related to presentation. My personal preference now is to have the UI/View
just raise events (with supporting data) to a PresenterClass stating that something has happened. Let the Presenter respond to the event.
have methods to render/display data to be presented
a minimal amount of client side validations to help the user get it right the first time... (preferably done in a declarative manner) screening off invalid inputs before it even reaches the presenter e.g. ensure that the text field value is within a-b range by setting the min and max properties.
http://martinfowler.com/eaaDev/uiArchs.html describes the evolution of UI design. An excerpt
When people talk about self-testing
code user-interfaces quickly raise
their head as a problem. Many people
find that testing GUIs to be somewhere
between tough and impossible. This is
largely because UIs are tightly
coupled into the overall UI
environment and difficult to tease
apart and test in pieces.
But there are occasions where this is
impossible, you miss important
interactions, there are threading
issues, and the tests are too slow to
run.
As a result there's been a steady
movement to design UIs in such a way
that minimizes the behavior in objects
that are awkward to test. Michael
Feathers crisply summed up this
approach in The Humble Dialog Box.
Gerard Meszaros generalized this
notion to idea of a Humble Object -
any object that is difficult to test
should have minimal behavior. That way
if we are unable to include it in our
test suites we minimize the chances of
an undetected failure.
The pattern you are looking for may be Model-view-controller, which basically separates the DB(model) from the GUI(view) and the logic(controller). Here's Jeff Atwood's take on this. I believe one should not be fanatical about any framework, language or pattern - While heavy numerical calculations probably should not sit in the GUI, it is fine to do some basic input validation and output formatting there.
I suggest UI shouldn't include any sort of business logic. Not even the validations. They all should be at business logic level. In this way you make your BLL independent of UI. You can easily convert you windows app to web app or web services and vice versa. You may use object frameworks like Csla to achieve this.
Input validations attached to control.
Like emails,age,date validators with text boxes
James is correct. As a rule of thumb, your business logic should not make any assumption regarding presentation.
What if you plan on displaying your results on various media? One of them could be a black and white printer. "RED" would not cut it.
When I create a model or even a controller, I try to convince myself that the user interface will be a bubble bath. Believe me, that dramatically reduces the amount of HTML in my code ;)
Always put the minimum amount of logic possible in whatever layer you are working.
By that I mean, if you are adding code to the UI layer, add the least amount of logic necessary for that layer to perform it's UI (only) operations.
Not only does doing that result in a good separation of layers...it also saves you from code bloat.
I have already written a 'compatible' answer to this question here. The rule is (according to me) that there should not be any logic in the UI except the UI logic and calls for standard procedures that will manage generic/specific cases.
In our situation, we came to a point where form's code is automatically generated out of the list of controls available on a form. Depending on the kind of control (bound text, bound boolean, bound number, bound combobox, unbound label, ...), we automatically generate a set of event procedures (such as beforeUpdate and afterUpdate for text controls, onClick for labels, etc) that launch generic code located out of the form.
This code can then either do generic things (test if the field value can be updated in the beforeUpdate event, order the recordset ascending/descending in the onClick event, etc) or specific treatments based on the form's and/or the control's name (making for example some work in a afterUpdate event, such as calculating the value of a totalAmount control out of the unitPrice and quantity values).
Our system is now fully automated, and form's production relies on two tables: Tbl_Form for a list of forms available in the app, and Tbl_Control for a list of controls available in our forms
Following the referenced answer and other posts in SO, some users have asked me to develop on my ideas. As the subject is quite complex, I finally decided to open a blog to talk about this UI logic. I have already started talking about UI interface, but it might take a few days (.. weeks!) until I can specifically reach the subject you're interested in.