How to get text from the screen - windows

Is there a Windows API call that would let one obtain text from the screen, not by taking a snapshot and running OCR on it, but directly through an API?
The idea is to get the text under the mouse pointer, at the spot the user points to and clicks on.
This is how tools like Babylon (http://www.babylon.com) and 1-Click Answers (http://www.answers.com/main/download_answers_win.jsp) and many others work.
Can someone point me in the right direction to get this functionality?

There is no direct way to obtain text. An application can render text in a zillion different ways (the Windows API being one of them), and once it's rendered, it's just a bunch of pixels.
One method you could try, however, is to find the window directly under the mouse and try to get the text from it. This works fine on most standard Windows controls (labels, textboxes, etc.). It won't work on Internet browsers, though.
I think the best you can do is to make your application support as many different (common) controls as possible in the manner described above.

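You can get the text of a window with the GetWindowText API. (For a control that belongs to another process, send the WM_GETTEXT message instead, because GetWindowText only returns window captions across process boundaries.) The mouse position can be found with the GetCursorPos API.
In Delphi you could use this function (kudos to Peter Below):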
Function ChildWindowUnderCursor: HWND;
Var
  hw, lasthw: HWND;
  pt, clientpt: TPoint;
Begin
  Result := 0;
  GetCursorPos( pt );
  // find top-level window under cursor
  hw := WindowFromPoint( pt );
  If hw = 0 Then Exit;
  // look for child windows in the window recursively
  // until we find no new windows
  Repeat
    lasthw := hw;
    clientpt := pt;
    Windows.ScreenToClient( lasthw, clientpt );
    // Use ChildWindowFromPoint if app needs to run on NT 3.51!
    hw := ChildWindowFromPointEx( lasthw, clientpt, CWP_SKIPINVISIBLE );
  Until hw = lasthw;
  Result := hw;
End;
Regards,
Lieven
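For readers who are not using Delphi, the same idea can be sketched in Python with ctypes. This is only an illustration under stated assumptions, not the code from the answer above: the function name text_under_cursor is made up for this example, and it sends WM_GETTEXT rather than calling GetWindowText so that it also has a chance of reading controls owned by other processes. Outside Windows it simply returns None.

```python
import ctypes
import sys

WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E


def text_under_cursor():
    """Return the text of the window under the mouse, or None if unavailable.

    Hypothetical helper for illustration; it only sees text that the target
    window exposes through WM_GETTEXT (standard controls, window captions).
    """
    if sys.platform != "win32":
        return None  # user32.dll exists only on Windows
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.GetCursorPos.argtypes = [ctypes.POINTER(wintypes.POINT)]
    user32.GetCursorPos.restype = wintypes.BOOL
    user32.WindowFromPoint.argtypes = [wintypes.POINT]
    user32.WindowFromPoint.restype = wintypes.HWND
    user32.SendMessageW.argtypes = [wintypes.HWND, wintypes.UINT,
                                    wintypes.WPARAM, wintypes.LPARAM]
    user32.SendMessageW.restype = ctypes.c_ssize_t  # LRESULT

    # Screen coordinates of the mouse pointer.
    pt = wintypes.POINT()
    if not user32.GetCursorPos(ctypes.byref(pt)):
        return None

    # Window (possibly a child control) that contains that point.
    hwnd = user32.WindowFromPoint(pt)
    if not hwnd:
        return None

    # Ask the window for its text length, then for the text itself.
    length = max(user32.SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0), 0)
    buf = ctypes.create_unicode_buffer(length + 1)
    user32.SendMessageW(hwnd, WM_GETTEXT, length + 1, ctypes.addressof(buf))
    return buf.value


if __name__ == "__main__":
    print(repr(text_under_cursor()))
```

Note that SendMessage blocks until the target window answers, so a real tool would probably prefer SendMessageTimeout to avoid hanging on an unresponsive application.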

Windows has APIs for accessibility tools like screen-readers for the blind. (Newer versions are also used for other purposes, like UI automation and testing.) It works with many applications, even most browsers which render their own content without using the standard Windows controls. It won't work with all applications, but it can be used to figure out the text under the mouse in most cases.
The current API is called the Windows Automation API. Describing how to do this in general is beyond the scope of a Stack Overflow answer, so I've simply provided a link to the documentation.
The older API that was widely available when this question was first posted is called the Microsoft Active Accessibility API. As with the modern API, the subject is too broad to cover in detail here.
Note that the documentation for both APIs is written for developers building accessibility tools (like screen readers) as well as for developers writing apps that want to be compatible with those accessibility tools.
The basic idea is that an accessibility tool gets COM interfaces provided by the target application's window(s), and it can use those interfaces to figure out the controls and their text and how they're related both logically and spatially. Applications that are composed of standard Windows controls are mostly automatically supported. Applications with custom UI implementations have to do work to provide these interfaces. Fortunately, the important ones, like the mainstream browsers, have done the work to support these interfaces.

I think it's called the clipboard. I am going to bet these programs inject click, double-click and keyboard events and then copy items there for inspection. Alternatively, they are getting jiggy with the Windows text controls and grabbing content that way. I suspect that, due to security issues, these tools also have problems running in Vista.

Related

StartScreenCaptureByWindowId() not excluding overlapping windows for certain programs (Agora Unity)

I am trying to set up individual window sharing for a project in Unity for Windows. The way I'm currently going about this is by using EnumWindows(), IsWindowVisible(), and GetWindowText() to create a dictionary of window titles and handles, then calling StartScreenCaptureByWindowId() to share the selected window.
This works relatively well for most processes; the window of the process, and only the window of the process, is streamed. However, for certain programs (like Google Chrome, Discord, and Windows Photos) the captured area is set correctly, but overlapping windows are not culled out.
Does anyone know what could be causing this problem? Is there something wrong with the way I'm grabbing the handles for these windows? Or is there something about starting a screen capture that I am missing?
You certainly did the correct things. However, you also hit a limitation of the Windows part of the SDK. To understand this better, note that the programs in question are UWP applications. They have different ways to share the visible pixels. Previous versions of the Agora SDK could not even show the window. Starting from 3.0.1, the SDK uses a rectangle-cutting method to get the window display, which would explain why windows overlapping that rectangle show up in the captured area. You may further read the online documentation about that API here.
There isn't much Agora can do for the near term. So you will just need to deal with the user experience (e.g. by warning them) or look at solutions like using Web SDK instead.
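The enumeration step described in the question (EnumWindows, IsWindowVisible and GetWindowText feeding a dictionary of titles and handles) can be sketched roughly as follows. This is a Python/ctypes illustration rather than the asker's Unity code; the function name visible_window_titles is hypothetical, and the Agora call itself is not shown. Outside Windows it returns an empty dictionary.

```python
import ctypes
import sys


def visible_window_titles():
    """Map window title -> HWND for visible, titled top-level windows.

    Hypothetical helper for illustration. Windows that share a title
    overwrite each other, so a real tool would key on the handle instead.
    """
    if sys.platform != "win32":
        return {}  # the Win32 calls below exist only on Windows
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    enum_proc_type = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND,
                                        wintypes.LPARAM)
    user32.EnumWindows.argtypes = [enum_proc_type, wintypes.LPARAM]
    user32.EnumWindows.restype = wintypes.BOOL
    user32.IsWindowVisible.argtypes = [wintypes.HWND]
    user32.IsWindowVisible.restype = wintypes.BOOL
    user32.GetWindowTextLengthW.argtypes = [wintypes.HWND]
    user32.GetWindowTextLengthW.restype = ctypes.c_int
    user32.GetWindowTextW.argtypes = [wintypes.HWND, wintypes.LPWSTR,
                                      ctypes.c_int]
    user32.GetWindowTextW.restype = ctypes.c_int

    windows = {}

    def on_window(hwnd, _lparam):
        # Called once per top-level window; skip hidden or untitled ones.
        if user32.IsWindowVisible(hwnd):
            length = user32.GetWindowTextLengthW(hwnd)
            if length > 0:
                buf = ctypes.create_unicode_buffer(length + 1)
                user32.GetWindowTextW(hwnd, buf, length + 1)
                windows[buf.value] = hwnd
        return True  # returning True keeps the enumeration going

    user32.EnumWindows(enum_proc_type(on_window), 0)
    return windows


if __name__ == "__main__":
    for title, hwnd in visible_window_titles().items():
        print(hwnd, title)
```

Handles gathered this way are ordinary top-level HWNDs, which is consistent with the answer above: the overlap problem comes from how the capture is done, not from how the handles are collected.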

How do I duplicate iTunes-style windows on Windows?

Can anyone provide some insight on how to "duplicate" an iTunes style window in Windows? Specifically I am looking for the following features:
1) rounded window
2) top and bottom toolbars
3) rounded text fields
I'm currently attempting a bit of cross-platform development with Real Studio and while I've discovered the mechanism by which to perform the rounded windows in OS X (declare method call to HIWindowSetContentBorderThickness or SetContentBorderThickness), I cannot find in the MSDN how to do similar things in Windows. Obviously Apple accomplished it in actually writing iTunes for Windows. Perhaps they wrote custom controls from the ground up.
SIDENOTE: I found this article from a few years back that briefly discusses it (http://discuss.joelonsoftware.com/default.asp?joel.3.454369.12), but this is pretty much all I could find.
Even if I can't duplicate it exactly, some direction on which Windows libraries might contain the functionality I need to do it "manually" would be nice. Any further assistance would be greatly appreciated.
There's no API for doing Apple-style rounded corners, but there are lower-level APIs for creating windows (both frame windows and controls) of any shape you want.
I don't use RealStudio, but I believe it allows you to access both .NET and native Win32 APIs, so:
If you're using .NET Windows.Forms, read Shaped Windows Forms and Controls in Visual Studio .NET. It's written for VB7, but should be easy to translate to your favorite language.
If you're using the raw Win32 API, there are at least two ways to do this. The simplest, but most limited, is to call the SetWindowRgn API, which sets the shape of your window to anything you can create as an HRGN. But that probably won't cut it for you. You don't want jagged edges; you want smooth curves, with alpha-blended borders, and maybe shadows. (At least that's what Apple does.) The Layered Windows API is the way to do this. It allows arbitrary shapes (even changing on the fly, if you use UpdateLayeredWindow—although you don't need that feature to emulate iTunes), alpha transparency, and complicated hit testing. Since the original article is very out of date, and doesn't cover all of the functionality, also see Layered Windows for the current documentation, which has links to the references.
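As a rough illustration of the simpler SetWindowRgn route, the sketch below clips an existing window to a rounded rectangle using CreateRoundRectRgn. It is written in Python with ctypes purely for brevity; the function name round_window_corners and its parameters are assumptions made for this example, and the caller is assumed to already have a valid window handle and the window's size in pixels. As noted above, a region gives hard (jagged) edges, so this does not reproduce the smooth alpha-blended look of a layered window.

```python
import ctypes
import sys


def round_window_corners(hwnd, width, height, radius):
    """Clip a window to a rounded rectangle; return True on success.

    Hypothetical helper for illustration: hwnd is an existing window
    handle, width/height are the window size in pixels.
    """
    if sys.platform != "win32":
        return False  # gdi32.dll and user32.dll exist only on Windows
    from ctypes import wintypes

    gdi32 = ctypes.WinDLL("gdi32", use_last_error=True)
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    gdi32.CreateRoundRectRgn.argtypes = [ctypes.c_int] * 6
    gdi32.CreateRoundRectRgn.restype = wintypes.HANDLE  # HRGN
    gdi32.DeleteObject.argtypes = [wintypes.HANDLE]
    gdi32.DeleteObject.restype = wintypes.BOOL
    user32.SetWindowRgn.argtypes = [wintypes.HWND, wintypes.HANDLE,
                                    wintypes.BOOL]
    user32.SetWindowRgn.restype = ctypes.c_int

    # Coordinates are relative to the window's top-left corner. The last
    # two arguments are the width and height of the corner ellipse.
    region = gdi32.CreateRoundRectRgn(0, 0, width + 1, height + 1,
                                      2 * radius, 2 * radius)
    if not region:
        return False
    if not user32.SetWindowRgn(hwnd, region, True):
        gdi32.DeleteObject(region)  # call failed, so we still own the region
        return False
    return True  # on success the system owns the region; do not delete it
```

The region has to be rebuilt whenever the window is resized, since it is expressed in fixed pixel coordinates.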
There is a third-party control that does what you want. It works on Mac and Windows.
http://www.madebyfiga.com/fgsourcelist/
It works well.
sb

Evernote and Producteev GUI Toolkit

I was looking at Evernote and Producteev for Windows and noticed they used similar GUI toolkits. What toolkit(s) are they using? Here are some links: Producteev (check the Windows screenshots) and Evernote.
I am the programmer for the Windows desktop application for Producteev. We didn't use any toolkits for the GUI. All of the UI code was written from scratch based on designs from our in-house artist. We drew a little bit of influence from Evernote (particularly in the toolbar area) to give a familiar look to the application, so that's why there's some similarities.
The application itself is written in C#, and I use GDI+ to make all the drawing calls. There are about a dozen custom-written controls, including buttons (some of which bring down menus), glowing text boxes, and list boxes (for tasks). There's also another collection of them that replaces all of the default Windows controls in order to force anti-aliased text rasterizing.

What sort of GUI controls are used in Windows Resource Monitor?

I am new to GUI programming in Windows.
The Windows Resource Monitor (perfmon.exe /res) has four bars (CPU/Disk/Network/Memory) that have gradient backgrounds, as well as charts on the right for displaying recent CPU/Disk/Network/Memory usage.
I am wondering what kind of controls were used in this application. Are they readily available in C++ or in C#?
They are custom controls that are not available for external use, sorry.
You can use the Spy++ window finder tool (Spy++ is included with DevStudio) to find the window class names (and window boundaries).
http://msdn.microsoft.com/en-us/library/aa266028(v=vs.60).aspx
It shows that the overall window is a DirectUIHWND. The graphs are windows, but the bars labelled CPU/Disk/Network, etc. are not windows at all; they appear to be drawn directly in the Resource Monitor's client area.
The implementation is not public for these controls, but I'm pretty sure they are incorporated using Windowless Controls.
Those bars remind me of Outlook bars. One old implementation is described on Code Project, and that one also has no windows of its own. Everything is painted inside.
Edit: That Code Project article was a C# port. For the C++ original, go to Code Guru.

Multiple mouse cursors on Windows 7

We are using CPNMouse for an application running on Windows XP. One mouse device is detached from the normal event queue, so we can get its position and events and draw the cursor ourselves.
Unfortunately, CPNMouse does not work on Windows Vista/7 (see here). Is there any library/SDK that provides the same capabilities on Windows 7?
Just to clarify - we want the "normal" cursor to be present and to draw another cursor, that should be mapped to a different mouse device.
Update:
CPNMouse is no longer supported, and its previously "open source" code has been withdrawn from SourceForge. Only this legacy documentation page exists on the official CPN Tools site.
Looks like only commercial products are available...
MiniFrame SoftXpand
MultiMouse
If anyone has a copy of the original source from SourceForge, please post a link here... Assuming the original license was a standard SourceForge (open to share/expand) variant, it should be okay for someone to fork it to a new project for continued development (of a free tool).
Take a look at the MultiPoint SDK, which allows for up to 25 cursors on the same display and also supports Windows 7.
You should use the DSF (Device Simulation Framework) from the Windows DDK to create an emulated mouse device; then any program can accept input from that specific emulated mouse device and draw the pointer itself. So any program you create will have multiple mouse devices in it. Sadly, it is not possible to have multiple pointers in Windows 7 itself, since the OS specifically doesn't support it. So you would have to draw the pointer onto the screen yourself.
I'm afraid the point is that you need to draw a pointer yourself for the mouse.
You want the "Raw Input" API which comes with the Windows Platform SDK: MSDN: About Raw Input
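As a small taste of what Raw Input exposes, the sketch below lists the mouse devices Windows currently knows about by calling GetRawInputDeviceList. It is a Python/ctypes illustration under stated assumptions: the names list_raw_mice and RAW_INPUT_ERROR are made up for this example, and it only enumerates devices. Receiving per-device movement would additionally require RegisterRawInputDevices and handling WM_INPUT in a window procedure, which is not shown here. Outside Windows it returns an empty list.

```python
import ctypes
import sys

RIM_TYPEMOUSE = 0             # dwType value for mouse devices
RAW_INPUT_ERROR = 0xFFFFFFFF  # (UINT)-1; hypothetical name for the error value


def list_raw_mice():
    """Return the raw-input device handles of all attached mice.

    Hypothetical helper for illustration; the returned handles match the
    hDevice values that WM_INPUT messages report for each physical device.
    """
    if sys.platform != "win32":
        return []  # user32.dll exists only on Windows
    from ctypes import wintypes

    class RAWINPUTDEVICELIST(ctypes.Structure):
        _fields_ = [("hDevice", wintypes.HANDLE),
                    ("dwType", wintypes.DWORD)]

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    get_list = user32.GetRawInputDeviceList
    get_list.argtypes = [ctypes.POINTER(RAWINPUTDEVICELIST),
                         ctypes.POINTER(wintypes.UINT), wintypes.UINT]
    get_list.restype = wintypes.UINT

    entry_size = ctypes.sizeof(RAWINPUTDEVICELIST)
    count = wintypes.UINT(0)

    # First call with a NULL buffer just reports how many devices exist.
    if get_list(None, ctypes.byref(count), entry_size) == RAW_INPUT_ERROR:
        return []

    # Second call fills the array and returns the number of entries written.
    devices = (RAWINPUTDEVICELIST * count.value)()
    written = get_list(devices, ctypes.byref(count), entry_size)
    if written == RAW_INPUT_ERROR:
        return []

    return [d.hDevice for d in devices[:written]
            if d.dwType == RIM_TYPEMOUSE]


if __name__ == "__main__":
    print(len(list_raw_mice()), "mouse device(s) found")
```

Because each WM_INPUT message carries the handle of the device that produced it, an application can keep one position per handle and, as the other answers point out, draw the extra pointers itself.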
