Enumerating devices and display modes for OpenGL rendering - windows

I'm currently writing an OpenGL renderer and am part-way through writing some classes for enumerating display adaptors, devices and modes for use in drop-down lists.
I'm using EnumDisplayDevices to get the adaptors and then EnumDisplaySettings for each device, giving me bpp, width, height and refresh rate. However I'm not sure how to find out which modes are available full-screen (there doesn't appear to be a flag for it in the DEVMODE structure). Can I assume all modes listed can in-principle be instantiated full-screen?
As a follow up question, is this approach to device enumeration generally the best way of getting the required information on Windows?

OpenGL has not this distinction between windowed and fullscreen mode. If you want an OpenGL program to be fullscreen you just set the window to be toplevel, borderless, without decoration, stay on top and maximum size.

The above is actually a dumb question. By definition windowed mode must be the current display settings. All other modes must be available full-screen (provided the OS supports them, i.e. 640x480 not advisable in Vista/7).

Hmmph, not correct at all, and with an attitude too. You have variety of functions that can be used.
SetPixelFormat, ChoosePixelFormat, ChangeDisplaySettings.
The PixelFormat functions will let you enumerator available modes. ChangeDisplaySettings with allow you to set whatever screen mode (including bit depth) your app wants. Look them up in MSDN.

Related

How OBS record hidden window?

I'm making a program to record a window that is obscured by another window via python and WIN32API library.
Through many searches, I succeeded in capturing the hidden window through hwnd and BitBlt, but the execution time of my code is not stable.
I tried to provide the recording function by selecting 30~60 fps, but the time required to capture the hidden window and write() it to the video object of cv2 is irregular, so I can't make a 60fps video.
So I thought of OBS and Discord. In the case of OBS, it is possible to enforce stable recording for obscured windows. For Discord, there is a feature that allows you to select a specific window and share it with multiple people in real time (this can also be done for hidden windows).
I'd like to know how these programs provide stable video for occluded windows. I'm a student, and I'm not elite. I am asking this question because it is difficult to analyze the vast Github source code of OBS. Can someone give me an explanation of how the above program captures the screen?
Last time I checked, OBS was doing it with low-level hacks instead of APIs.
Specifically, they have wrote a DLL which they inject into the target application using CreateRemoteThread WinAPI. Then, they patch application’s code to intercept calls to IDXGISwapChain.Present method. Once a call is intercepted, the injected code has access to D3D frame buffer texture. It can copy that texture into another texture on GPU, and then do something with the copy. One possibility is DXGI surface sharing to pass the copy from the target application into the capturing process. The APIs for that don’t require both sides of the sharing to be in the same process, textures can be shared across processes just fine.
Unfortunately for you, their approach is borderline impossible to re-implement in higher-level languages like Python. Such things are only doable in C++ or similar low-level languages, and relatively hard to implement and debug.
#dy.kim, don't be afraid of large codebases. window-capture.c and the OBS GUI fairly obviously list "bitblt" and "Windows Graphics Capture" as the two methods it uses to capture windows, with the preference going to WGC if neither is specified.

AccessibleObjectFromPoint and per-monitor DPI

I'm using accessibility with the AccessibleObjectFromPoint function, and I'd like it to work correctly on a per-monitor DPI environment. Unfortunately, I can't get it to work. I tried many things, and the situation for now is:
My app is marked as per-monitor-DPI-aware in the manifest. (True/PM)
I use GetCursorPos and then AccessibleObjectFromPoint.
How can the problem be reproduced:
Have two monitors, one with 100% DPI, the other with 125%.
Run Chrome on the 125% monitor.
Use AccessibleObjectFromPoint on one of the tab names, it won't work.
It works with some apps (DPI-aware, it seems, like explorer), but doesn't work with others. I tried several relevant functions, such as GetPhysicalCursorPos and PhysicalToLogicalPointForPerMonitorDPI, but nothing works.
It's worth noting that Microsoft's inspect.exe works as expected.
I’ve been struggling with this exact same problem for several weeks and can now tell you my findings. Unfortunately I can’t give you more than a hint of code, because the project I am working on, is proprietary.
The issue started at Windows 8.1. The problem did not exist on Windows 7 or Vista, because AccessibleObjectFromPoint always used raw physical coordinates, as documented here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd317984(v=vs.85).aspx .
“Microsoft Active Accessibility does not use logical coordinates. The following methods and functions either return physical coordinates or take them as parameters.” This has not been true since Windows 8.1.
AccessibleObjectFromPoint now uses a flawed calculation that cannot always find the correct window for reasons similar to my question here: High DPI scaling, mouse hooks and WindowFromPoint .
My findings lead me to one conclusion: The API is broken. This does not mean it is not possible though.
Possible solutions I have partially tested that seem to work follow.
Prerequisites are that you
1/. Make your process per monitor DPI aware, NOT USING THE MANIFEST (more on that later).
2/. Determine the hWnd of the window you want to query (WindowFromPoint() variants)
3/. Determine the monitor DPI of the queried hWnd
4/. Determine the DPI of your process
5/. Determine the DPI of the queried hWnd
6/. Determine the monitor origin and offset for the queried hWnd (MonitorFromWindow() and GetMonitorInfo() )
Next, depends on your platform
Windows 10.0.14393+
Write a function that finds the IAccessible (AccessibleObjectFromWindow() ) from the top level window, and then recursively call IAccessible::accHitTest until you reach the bottom-most IAccessible and perhaps ChildID data. Return that as if you would call AccessibleObjectFromPoint.
To call it successfully, you will need to scale the (x,y) co-ordinates into the scale system of the queried hWnd, using the DPIs and co-ordinates fetched in the list above. Watch out for systems where monitors are not the same size or if monitors are partially offset, or above and below.
And now for the important part for 10.0.14393 – Set your thread to the same DPI_AWARENESS_CONTEXT of the hWnd you are querying. Now call your new function. Now revert your thread to monitor DPI aware, and voila, it works, even if the window is not maximised. This is why you must not use the manifest.
If you are on Windows 8.1 to 10.0.10586 you have a tougher task.
Instead of calling accHitTest, as above, you have to recursively call AccessibleChildren and iterate the call IAccessible::accLocation to determine if your test point is within each child. This is tricky and starts to get really messy when you get to e.g. combo boxes in products like Office, which is only system DPI aware.
That’s all I can give you for now.
To do it successfully on multi-platform (mine has to work from Vista to Windows-Current) the only really safe bet is to write a wrapper DLL in C++ that can determine at runtime which OS it is on and change code path accordingly. The reason you want to do it in C++ is to avoid passing IAccessible objects across the .Net/unmanaged marshalling boundary. You can call IUnknown::Release on objects you don’t need to return n the unmanaged side. You can do it all in .Net, but it will be slow.
P.S. also watch out for Chrome returning infinite trees where parents are children of their parents, some snity checks are required. Also, Chrome does not return accRole correctly, and will give you HTML tags instead of VT_I4.
Good luck
A fairly workable solution is as follows, in your IAccessible recursive function:
Use getwindowrect to capture the physical right on main window
Use accChild.accLocation in loop to capture left and Width on each Object
Add this simple test
If l > rct2r.Right And l > arrIACC.x2 Then
arrIACC.x2 = l + w
End If
if dpi = 100 then no Object is furter out than physical right
if dpi > 100 then closebutton is...x pix offset
Use the difference to rescale all values you are in use of Width
arrIACC.w1 = CInt(((-rct2r.Left + arrIACC.w1) / arrIACC.x2) * rct2r.Right)
This solution is from an Excel plugin I have developed, I was testing the Width of the quick access toolbar qat and my result was +- 5 pixels regardless of any DPI.

On-screen display driver

I need write linux kernel module that will display message box over all other windows on the screen. And I need drawing image in the kernel, access to this picture from user-space application is not required. I don't understand how to do this. What framework should I use - framebuffers or v4l? I suppose direct programming of the display controller is not a good idea, because there is other driver in the kernel that already do this. So, the questions are: how to interact between in-kernel drivers, and how to specify that my picture should be on top?
I would be grateful for any help.
What you want can not be done in this way, since the kernel does not handle GUIs and it especially does not handle window systems. It provides access to video output devices in one form or another, but all the actual drawing and compositing of a screen is done in user space.
Now, a kernel module would have the power to just overwrite the frame buffer, but, as you noticed, there are multiple interfaces for different purposes. Additionally, using 3D rendering even for 2D desktops is quite common. Hijacking a 3D command stream for your purposes would be disproportionately difficult.
Even if you managed all that, there is no guarantee that the user space window system wouldn't just overwrite your message box immediately. Maybe even before it reaches the display.
So no, it can't be done in any practical way directly from the kernel. Your best alternative would be a user space daemon displaying the message on your kernel code's behalf through the standard channels like any other GUI program.

What does this bit of MSDN documentation mean?

The first parameter to the EnumFontFamiliesEx function, according to the MSDN documentation, is described as:
hdc [in]
A handle to the device context from which to enumerate the fonts.
What exactly does it mean?
What does device context mean?
Why should a device context be related to fonts?
Question (3) is a legitimately difficult thing to find an explanation for, but the reason is simple enough:
Some devices provide their own font support. For example, a PostScript printer will allow you to use PostScript fonts. But those same fonts won't be usable when rendering on-screen, or to another printer without PostScript support. Another example would be that a plotter (which is a motorized pen) requires vector fonts with a fixed stroke thickness, so raster fonts can't be used with such a device.
If you're interested in device-specific font support, you'll want to know about the GetDeviceCaps function.
The windows API uses the concept of handles extensively. A handle is an integer value that you can use as a token to access an API resource. You can think of it as a kind of "this" pointer, although it is definitely not a pointer.
A device context is an object within the windows API that represents a something that you can draw on or display graphics on. It might be a printer, a bitmap, or a screen, or some other context in which creating graphics makes sense. In Windows, fonts must be selected into device contexts before they can be used. In order to find out what fonts are currently available in any given device context, you can enumerate them. That's where EnumFontFamiliesEx comes in.
Microsoft has other articles on device context,
https://learn.microsoft.com/en-us/windows/win32/gdi/about-device-contexts
An application must inform GDI to load a particular device driver and,
once the driver is loaded, to prepare the device for drawing
operations (such as selecting a line color and width, a brush pattern
and color, a font typeface, a clipping region, and so on). These tasks
are accomplished by creating and maintaining a device context (DC). A
DC is a structure that defines a set of graphic objects and their
associated attributes, and the graphic modes that affect output. The
graphic objects include a pen for line drawing, a brush for painting
and filling, a bitmap for copying or scrolling parts of the screen, a
palette for defining the set of available colors, a region for
clipping and other operations, and a path for painting and drawing
operations. Unlike most of the structures, an application never has
direct access to the DC; instead, it operates on the structure
indirectly by calling various functions.
Obviously font is a kind of drawing.

How can I capture the screen with Haskell on Mac OS X?

How can I capture the screen with Haskell on Mac OS X?
I've read Screen capture in Haskell?. But I'm working on a Mac Mini. So, the Windows solution is not applicable and the GTK solution does not work because it only captures a black screen. GTK in Macs only captures black screens.
How can I capture the screen with … and OpenGL?
Only with some luck. OpenGL is primarily a drawing API and the contents of the main framebuffer are undefined unless it's drawn to by OpenGL functions themself. That OpenGL could be abused was due to the way graphics system did manage their on-screen windows' framebuffers: After a window without predefined background color/brush was created, its initial framebuffer content was simply everything that was on the screen right before the window's creation. If a OpenGL context is created on top of this, the framebuffer could be read out using glReadPixels, that way creating a screenshot.
Today window compositing has become the norm which makes abusing OpenGL for taking screenshots almost impossible. With compositing each window has its own off-screen framebuffer and the screen's contents are composited only at the end. If you used that method outlined above, which relies on uninitialized memory containing the desired content, on a compositing window system, the results will vary wildly, between solid clear color, over wildly distorted junk fragments, to data noise.
Since taking a screenshot reliably must take into account a lot of idiosyncrasy of the system this is to happen on, it's virtually impossible to write a truly portable screenshot program.
And OpenGL is definitely the wrong tool for it, no matter that people (including myself) were able to abuse it for such in the past.
I programmed this C code to capture the screen of Macs and to show it in an OpenGL window through the function glDrawPixels:
opengl-capture.c
http://pastebin.com/pMH2rDNH
Coding the FFI for Haskell is quite trivial. I'll do it soon.
This might be useful to find the solution in C:
NeHe Productions - Using gluUnProject
http://nehe.gamedev.net/article/using_gluunproject/16013/
Apple Mailing Lists - Re: Screen snapshot example code posted
http://lists.apple.com/archives/cocoa-dev/2005/Aug/msg00901.html
Compiling OpenGL programs on Windows, Linux and OS X
http://goanna.cs.rmit.edu.au/~gl/teaching/Interactive3D/2012/compiling.html
Grab Mac OS Screen using GL_RGB format

Resources