Could you explain me why WinAPI needs InvalidateRgn with its handle to the region to be added to the update region (hRgn) if we have only RECT during BeginPaint while processing WM_PAINT?
Thanks in advance!
Win32 API is about 30 years old; there's lots of code in there for backwards compatibility. There's a perfectly sane InvalidateRect.
Having said that, calling InvalidateRgn with bErase=TRUE will erase a non-rectangular area.
Requiring the specific (complex) update region is an extremely rare use case. The system is optimized for the most common use case, where applications invalidate and track dirty areas of a window using rectangles. That's what you get when calling BeginPaint.
If you are in the rare situation where you need the update region, you can call GetUpdateRgn instead. Since BeginPaint validates the update region, you would have to call GetUpdateRegion before that.
Why does Windows not just go ahead and invent a BeginPaintEx API that returns the update region? Because adding an API is unbelievably expensive, and needs to be well justified. Adding a function that doesn't add any value (as in this case) is hard to justify.
Related
Ok, it may be a bit difficult to explain:
Suppose someone creates a Windows application (using C# or any other language) that uses the GetDesktopWindow() function on the user32.dll to capture a Screenshot and then sends this image to any online service.
Since it's custom made application, no anti-virus software will be able to determine that it's a virus because it's still an unknown application for it. Also, there are legitimate uses for such API, so it's not necessarily a virus, it can be a harmless window capture tool or some kind of espionage tool.
What I want to know is: Is there any way to see what a specific EXE file does regarding the Windows functions? Can I know if "myapp.exe" uses GetDesktopWindow() of user32.dll?
This is only one example. There are plenty other Windows endpoints that I would like to know when they're used by any application.
Is there a way to do that?
It depends to what lengths you want to go doing that. It's essentially a game of cat and mouse - bad actors will attempt to find new ways to circumvent your detection by jumping through some obscure hoops, you will add more sophisticated detection methods for those tricks, they will think of new tricks, and so on.
Also, it depends on whether you want to statically or dynamically determine that, and whether you actually want to know if GetDesktopWindow is called or if "the program gets a handle to the desktop window" (which can be achieved in other ways as well).
Here is a non-exhaustive list of ideas:
You could statically determine whether the function is imported by looking at the import directory. Research the PE file structure to find out more. This article may help.
This method of detection can be easily circumvented by dynamically importing the function using LoadLibrary and GetProcAddress.
You could scan the file for the string GetDesktopWindow to detect possible usage for dynamic import.
This method of detection can be easily circumvented by packing, encrypting or otherwise obfuscating the name of the dynamically imported function.
You could dynamically observe whether the GetDesktopWindow function gets called by registering an AppInit_DLL or a global hook which is injected into every new process and hook the GetDesktopWindow function from inside the process by overwriting its first bytes with a jump to your own code, notifying your detection component somehow, executing the original bytes and jumping back. (Microsoft Detours can help there.)
This method of detection can be circumvented if the target notices the hook and removes it before calling, since its in its own process space. (You could also do some tricks with acting like a debugger and setting a hardware breakpoint on the first instruction of GetDesktopWindow, but yet again there would be ways to detect or circumvent that since the target could also modify the debug registers.)
You could build a driver that does this from kernel-mode instead, but now we are getting really deep.
Note that until now we focused on the actual GetDesktopWindow function from user32.dll. But what if the target will just use a different way to achieve its goal of getting a desktop window handle?
The desktop window handle for the current thread is stored in the TIB (thread information block) which is accessible via fs:[18] from user mode. You can see this in the GetDesktopWindow source code of ReactOS which is pretty accurate compared to Microsoft's actual implementation (which you can verify by looking at it in a debugger). The target could therefore just access the TIB and extract this value, without even calling GetDesktopWindow at all.
The target could just take a known top-level window such as the shell's hidden compatibility window which you'll get via GetShellWindow() or - to avoid detection of GetShellWindow too - for example FindWindow(NULL, "Program Manager") (or even a newly created window!) and call GetAncestor(hWnd, GA_PARENT) on it to get the desktop window handle.
I'm sure, with some creativity, your adversaries will come up with more clever ideas than these.
Also, if we take this one step further and take a look at the ultimate goal of taking a screenshot, there as well exist other ways to achieve that. First example coming to mind: They could use keybd_event to emulate pressing the PrnSc key and then read the screenshot out of the clipboard data.
So it's all a matter of how far you want to take this.
By the way, you may find the drltrace project interesting - it is a library call tracer.
A simple search for DoEvents brings up lots of results that lead, basically, to:
DoEvents is evil. Don't use it. Use threading instead.
The reasons generally cited are:
Re-entrancy issues
Poor performance
Usability issues (e.g. drag/drop over a disabled window)
But some notable Win32 functions such as TrackPopupMenu and DoDragDrop perform their own message processing to keep the UI responsive, just like DoEvents does.
And yet, none of these seem to come across these issues (performance, re-entrancy, etc.).
How do they do it? How do they avoid the problems cited with DoEvents? (Or do they?)
DoEvents() is dangerous. But I bet you do lots of dangerous things every day. Just yesterday I set off a few explosive devices (future readers: note the original post date relative to a certain American holiday). With care, we can sometimes account for the dangers. Of course, that means knowing and understanding what the dangers are:
Re-entry issues. There are actually two dangers here:
Part of the problem here has to do with the call stack. If you call .DoEvents() in a loop that itself handles messages that use DoEvents(), and so on, you're getting a pretty deep call stack. It's easy to over-use DoEvents() and accidentally fill up your call stack, resulting in a StackOverflow exception. If you're only using .DoEvents() in one or two places, you're probably okay. If it's the first tool you reach for whenever you have a long-running process, you can easily find yourself in trouble here. Even one use in the wrong place can make it possible for a user to force a stackoverflow exception (sometimes just by holding down the enter key), and that can be a security issue.
It is sometimes possible to find your same method on the call stack twice. If you didn't build the method with this in mind (hint: you probably didn't) then bad things can happen. If everything passed in to the method is a value type, and there is no dependance on things outside of the method, you might be fine. But otherwise, you need to think carefully about what happens if your entire method were to run again before control is returned to you at the point where .DoEvents() is called. What parameters or resources outside of your method might be modified that you did not expect? Does your method change any objects, where both instances on the stack might be acting on the same object?
Performance Issues. DoEvents() can give the illusion of multi-threading, but it's not real mutlithreading. This has at least three real dangers:
When you call DoEvents(), you are giving control on your existing thread back to the message pump. The message pump might in turn give control to something else, and that something else might take a while. The result is that your original operation could take much longer to finish than if it were in a thread by itself that never yields control, definitely longer than it needs.
Duplication of work. Since it's possible to find yourself running the same method twice, and we already know this method is expensive/long-running (or you wouldn't need DoEvents() in the first place), even if you accounted for all the external dependencies mentioned above so there are no adverse side effects, you may still end up duplicating a lot of work.
The other issue is the extreme version of the first: a potential to deadlock. If something else in your program depends on your process finishing, and will block until it does, and that thing is called by the message pump from DoEvents(), your app will get stuck and become unresponsive. This may sound far-fetched, but in practice it's surprisingly easy to do accidentally, and the crashes are very hard to find and debug later. This is at the root of some of the hung app situations you may have experienced on your own computer.
Usability Issues. These are side-effects that result from not properly accounting for the other dangers. There's nothing new here, as long as you looked in other places appropriately.
If you can be sure you accounted for all these things, then go ahead. But really, if DoEvents() is the first place you look to solve UI responsiveness/updating issues, you're probably not accounting for all of those issues correctly. If it's not the first place you look, there are enough other options that I would question how you made it to considering DoEvents() at all. Today, DoEvents() exists mainly for compatibility with older code that came into being before other credible options where available, and as a crutch for newer programmers who haven't yet gained enough experience for exposure to the other options.
The reality is that most of the time, at least in the .Net world, a BackgroundWorker component is nearly as easy, at least once you've done it once or twice, and it will do the job in a safe way. More recently, the async/await pattern or the use of a Task can be much more effective and safe, without needing to delve into full-blown multi-threaded code on your own.
Back in 16-bit Windows days, when every task shared a single thread, the only way to keep a program responsive within a tight loop was DoEvents. It is this non-modal usage that is discouraged in favor of threads. Here's a typical example:
' Process image
For y = 1 To height
For x = 1 to width
ProcessPixel x, y
End For
DoEvents ' <-- DON'T DO THIS -- just put the whole loop in another thread
End For
For modal things (like tracking a popup), it is likely to still be OK.
I may be wrong, but it seems to me that DoDragDrop and TrackPopupMenu are rather special cases, in that they take over the UI, so don't have the reentrancy problem (which I think is the main reason people describe DoEvents as "Evil").
Personally I don't think it's helpful to dismiss a feature as "Evil" - rather explain the pitfalls so that people can decide for themselves. In the case of DoEvents there are rare cases where it's still reasonable to use it, for example while a modal progress dialog is displayed, where the user can't interact with the rest of the UI so there is no re-entrancy issue.
Of course, if by "Evil" you mean "something you shouldn't use without fully understanding the pitfalls", then I agree that DoEvents is evil.
I'm working on a third-party program that aggregates data from a bunch of different, existing Windows programs. Each program has a mechanism for exporting the data via the GUI. The most brain-dead approach would have me generate extracts by using AutoIt or some other GUI manipulation program to generate the extractions via the GUI. The problem with this is that people might be interacting with the computer when, suddenly, some automated program takes over. That's no good. What I really want to do is somehow have a program run once a day and silently (i.e. without popping up any GUIs) export the data from each program.
My research is telling me that I need to hook each application (assume these applications are always running) and inject a custom DLL to trigger each export. Am I remotely close to being on the right track? I'm a fairly experienced software dev, but I don't know a whole lot about reverse engineering or hooking. Any advice or direction would be greatly appreciated.
Edit: I'm trying to manage the availability of a certain type of professional. Their schedules are stored in proprietary systems. With their permission, I want to install an app on their system that extracts their schedule from whichever system they are using and uploads the information to a central server so that I can present that information to potential clients.
I am aware of four ways of extracting the information you want, both with their advantages and disadvantages. Before you do anything, you need to be aware that any solution you create is not guaranteed and in fact very unlikely to continue working should the target application ever update. The reason is that in each case, you are relying on an implementation detail instead of a pre-defined interface through which to export your data.
Hooking the GUI
The first way is to hook the GUI as you have suggested. What you are doing in this case is simply reading off from what an actual user would see. This is in general easier, since you are hooking the WinAPI which is clearly defined. One danger is that what the program displays is inconsistent or incomplete in comparison to the internal data it is supposed to be representing.
Typically, there are two common ways to perform WinAPI hooking:
DLL Injection. You create a DLL which you load into the other program's virtual address space. This means that you have read/write access (writable access can be gained with VirtualProtect) to the target's entire memory. From here you can trampoline the functions which are called to set UI information. For example, to check if a window has changed its text, you might trampoline the SetWindowText function. Note every control has different interfaces used to set what they are displaying. In this case, you are hooking the functions called by the code to set the display.
SetWindowsHookEx. Under the covers, this works similarly to DLL injection and in this case is really just another method for you to extend/subvert the control flow of messages received by controls. What you want to do in this case is hook the window procedures of each child control. For example, when an item is added to a ComboBox, it would receive a CB_ADDSTRING message. In this case, you are hooking the messages that are received when the display changes.
One caveat with this approach is that it will only work if the target is using or extending WinAPI controls.
Reading from the GUI
Instead of hooking the GUI, you can alternatively use WinAPI to read directly from the target windows. However, in some cases this may not be allowed. There is not much to do in this case but to try and see if it works. This may in fact be the easiest approach. Typically, you will send messages such as WM_GETTEXT to query the target window for what it is currently displaying. To do this, you will need to obtain the exact window hierarchy containing the control you are interested in. For example, say you want to read an edit control, you will need to see what parent window/s are above it in the window hierarchy in order to obtain its window handle.
Reading from memory (Advanced)
This approach is by far the most complicated but if you are able to fully reverse engineer the target program, it is the most likely to get you consistent data. This approach works by you reading the memory from the target process. This technique is very commonly used in game hacking to add 'functionality' and to observe the internal state of the game.
Consider that as well as storing information in the GUI, programs often hold their own internal model of all the data. This is especially true when the controls used are virtual and simply query subsets of the data to be displayed. This is an example of a situation where the first two approaches would not be of much use. This data is often held in some sort of abstract data type such as a list or perhaps even an array. The trick is to find this list in memory and read the values off directly. This can be done externally with ReadProcessMemory or internally through DLL injection again. The difficulty lies mainly in two prerequisites:
Firstly, you must be able to reliably locate these data structures. The problem with this is that code is not guaranteed to be in the same place, especially with features such as ASLR. Colloquially, this is sometimes referred to as code-shifting. ASLR can be defeated by using the offset from a module base and dynamically getting the module base address with functions such as GetModuleHandle. As well as ASLR, a reason that this occurs is due to dynamic memory allocation (e.g. through malloc). In such cases, you will need to find a heap address storing the pointer (which would for example be the return of malloc), dereference that and find your list. That pointer would be prone to ASLR and instead of a pointer, it might be a double-pointer, triple-pointer, etc.
The second problem you face is that it would be rare for each list item to be a primitive type. For example, instead of a list of character arrays (strings), it is likely that you will be faced with a list of objects. You would need to further reverse engineer each object type and understand internal layouts (at least be able to determine offsets of primitive values you are interested in in terms of its offset from the object base). More advanced methods revolve around actually reverse engineering the vtable of objects and calling their 'API'.
You might notice that I am not able to give information here which is specific. The reason is that by its nature, using this method requires an intimate understanding of the target's internals and as such, the specifics are defined only by how the target has been programmed. Unless you have knowledge and experience of reverse engineering, it is unlikely you would want to go down this route.
Hooking the target's internal API (Advanced)
As with the above solution, instead of digging for data structures, you dig for the internal API. I briefly covered this with when discussing vtables earlier. Instead of doing this, you would be attempting to find internal APIs that are called when the GUI is modified. Typically, when a view/UI is modified, instead of directly calling the WinAPI to update it, a program will have its own wrapper function which it calls which in turn calls the WinAPI. You simply need to find this function and hook it. Again this is possible, but requires reverse engineering skills. You may find that you discover functions which you want to call yourself. In this case, as well as being able to locate the location of the function, you have to reverse engineer the parameters it takes, its calling convention and you will need to ensure calling the function has no side effects.
I would consider this approach to be advanced. It can certainly be done and is another common technique used in game hacking to observe internal states and to manipulate a target's behaviour, but is difficult!
The first two methods are well suited for reading data from WinAPI programs and are by far easier. The two latter methods allow greater flexibility. With enough work, you are able to read anything and everything encapsulated by the target but requires a lot of skill.
Another point of concern which may or may not relate to your case is how easy it will be to update your solution to work should the target every be updated. With the first two methods, it is more likely no changes or small changes have to be made. With the second two methods, even a small change in source code can cause a relocation of the offsets you are relying upon. One method of dealing with this is to use byte signatures to dynamically generate the offsets. I wrote another answer some time ago which addresses how this is done.
What I have written is only a brief summary of the various techniques that can be used for what you want to achieve. I may have missed approaches, but these are the most common ones I know of and have experience with. Since these are large topics in themselves, I would advise you ask a new question if you want to obtain more detail about any particular one. Note that in all of the approaches I have discussed, none of them suffer from any interaction which is visible to the outside world so you would have no problem with anything popping up. It would be, as you describe, 'silent'.
This is relevant information about detouring/trampolining which I have lifted from a previous answer I wrote:
If you are looking for ways that programs detour execution of other
processes, it is usually through one of two means:
Dynamic (Runtime) Detouring - This is the more common method and is what is used by libraries such as Microsoft Detours. Here is a
relevant paper where the first few bytes of a function are overwritten
to unconditionally branch to the instrumentation.
(Static) Binary Rewriting - This is a much less common method for rootkits, but is used by research projects. It allows detouring to be
performed by statically analysing and overwriting a binary. An old
(not publicly available) package for Windows that performs this is
Etch. This paper gives a high-level view of how it works
conceptually.
Although Detours demonstrates one method of dynamic detouring, there
are countless methods used in the industry, especially in the reverse
engineering and hacking arenas. These include the IAT and breakpoint
methods I mentioned above. To 'point you in the right direction' for
these, you should look at 'research' performed in the fields of
research projects and reverse engineering.
Does anyone know a good document/article about GDI resource handling?
I need to share some resources like icons and bitmaps among classes that can have different lifetime, and I want to understand how I should approach this problem.
For Mutexes and other kernel objects there is a DuplicateHandle function, but GDI is confusing me a little. Also, the way CBitmap returns HBITMAP through const operator HBITMAP, and that like, is a little bit scary.
I would like to avoid creating local bitmaps on every redraw, so some resource caching would be good, but also, I am not sure I can start creating and loading C##### resources while the main message pump hasn't started running.
Seems that I'm using wrong keywords, as I can't find any good, but manageably short documentation.
There is no such documentation, it is all pretty straightforward. It is completely up to you to decide when to call DeleteObject(). And to decide how to balance resource usage of your program against dynamically creating and destroying the object when needed. Only largish bitmaps are really worthy to keep around. Pens and brushes are very cheap, you create and destroy them on-the-fly. Fonts are a corner case, often cached simply for the live of the program since you need so few of them.
There are plenty of ways to manage caching, a shared_ptr<> in C++ provides the standard reference counting pattern for example. But it is very typical to just keep the reference as a member of your window wrapper class. It isn't very common that the same bitmap would be used in multiple windows. Ymmv.
Creating GDI objects doesn't require a message loop.
There's a registry key where I can check (& set) the currently set GDI object quota for processes. However, if a user changes that registry key, the value remains the old value until a reboot occurs. In my program, I need to know if there's a way to determine, programatically, how many more GDI objects I can create. Is there an API for getting GDI information for the current process? What about at the system level?
Always hard to prove the definite absence of an API, but this one is a 95% no-go. Lots of system settings are configured through the registry without an API to tweak it afterward.
Raymond Chen's typical response to questions like these is "if you want to know then you are doing something wrong". It applies here, the default quota of 10,000 handles is enormous.
If you want to find the current quota that matters to you, create GDI objects until that fails. Record that number. Then, destroy all of them.
If you feel like doing this on a regular basis to get an accurate number, you can do so. It's probably going to be fairly expensive though.
Since Hans mentioned Raymond already, we should play his "Imagine if this were true" game. If this API - GetGDIObjectLimit or whatever - existed, what would it return? If the object count limit is 10000, then you expect it to return that right? So what happens when the system is low on memory? The API tells you a value which has no actual meaning. If you're getting close to 10000 GDI objects, you are doing something wrong and you should concentrate on fixing that.