How to alter colors in mac/cocoa text views

Is there a way to alter the standard background and text colors in Cocoa text views throughout Mac OS? Since the same text system is used by TextEdit, Mail and other programs, I imagine there is an underlying preference that could be altered. Any ideas how to access it?
Just to be clear, I don't mean simply altering the document colors. This shouldn't involve setting a background color or text color as document formatting. I'm just wondering how I could make all of these text views use different defaults when displaying documents (e.g. white on blue, rather than black on white).
Thanks!

There is no supported way to do this but you might achieve it with a custom Application Enhancer module. Note, however, this is a very controversial subject as poorly-written APE modules (or bugs in Application Enhancer itself) can cause a whole lot of pain in very unusual places (and blame is often placed on the wrong developer as a result).

Related

Extract text from rectangle on Windows screen without using OCR

Given a rectangle that represents an area on a Windows screen that contains text, what is the best way to extract the text?
I know that it is possible using OCR, but even after significant preprocessing, the quality is really poor.
Getting the window text using the Win32 API does not always work either.
Assuming that the text was rendered using a font, is it possible to get it from there?
Any directions would be extremely helpful. Thanks!
Given a rectangle that represents an area on a Windows screen, the best way to extract text is indeed OCR. Use a better OCR library like this one from Microsoft.
The reason getting the window text using the Win32 API does not work well is that there may be multiple windows in that rectangle. You will have to find out which windows the rectangle contains and send a message to each one to get its text. It is not impossible, but it is difficult, and even if you manage it you will run into issues with text alignment, etc. OCR is your best option.
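To illustrate just the geometric part of that step (deciding which windows overlap the target rectangle), here is a minimal Python sketch. The `Rect` and `Window` records and the sample data are hypothetical stand-ins for whatever a real window enumeration would return; actually enumerating windows and fetching their text is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive edge
    bottom: int  # exclusive edge

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles overlap only if they overlap on both axes.
        return (self.left < other.right and other.left < self.right
                and self.top < other.bottom and other.top < self.bottom)


@dataclass(frozen=True)
class Window:
    handle: int  # hypothetical identifier for a window
    rect: Rect   # screen rectangle of that window


def windows_in_rect(windows, target):
    """Return the windows whose rectangle overlaps the target rectangle."""
    return [w for w in windows if w.rect.intersects(target)]
```

Each window returned would then have to be queried for its text separately, which is where the alignment problems mentioned above come in.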
It does seem possible without using OCR, as NirSoft SysExporter can do this:
https://www.nirsoft.net/utils/sysexp.html
This may be suitable for programmatic use as it can be run from a command line:
Starting from version 1.70, you can export the content of Windows
control from command-line, without displaying any user interface.
You may not be able to target it at a specific rectangle on the screen, but maybe the same result could be achieved by first scraping everything followed by some post-processing.
Further basic info:
SysExporter utility allows you to grab the data stored in standard
list-views, tree-views, list boxes, combo boxes, text-boxes, and
WebBrowser/HTML controls from almost any application running on your
system, and export it to text, HTML or XML file.
...
Known Limitations
SysExporter can export data from most combo boxes, list boxes,
tree-view, and list-view controls, but not from all of them. There are
some applications that use these controls to display data, but the
data itself is not actually stored in the control, but in another
location in the computer's memory. In such cases, SysExporter won't be
able to export the data.
Personally I've used it to grab text from what look like label controls.

How can I get the original font name of some text using PDFKit?

I wrote a script which parses information from PDF files and outputs it to HTML. It's written in Python, using pdfminer.
On some text segments, the font style can have semantic significance. For instance: bold, italic and color should trigger different behavior. Pdfminer provides scripts with the font name, but not the color, and it has a number of other issues; so I'm working on a Swift version of that program, using Apple's PDFKit, to extract the same features.
I now find that I have the opposite problem. While PDFKit makes it easy to retrieve color, retrieving the original font name seems to be non-obvious. PDFSelection objects have an attributedString property, but for fonts that are not installed on my computer, the NSFont object is Helvetica. Of course, the fonts in question are fairly expensive, and acquiring a copy just for this purpose would be poor form.
Short of dropping to CGPDFContentStream (which is way too big of a hammer for what I want to get), is there a way of getting the original font name? I know in advance what the fonts are going to be, can I use that to my advantage?
PDFKit seems to use the standard font lookup system and then falls back on some default, so this can be resolved by spoofing the font to ensure that PDFKit doesn't need to fall back. Inspecting the document, I was able to identify that it uses the following fonts (referenced with their PostScript name):
"NeoSansIntel"
"NeoSansIntelMedium"
"NeoSansIntel,Italic"
I used a free font creation utility to create dummy fonts with these PostScript names, and I added them to my app bundle. I then used CTFontManagerRegisterFontsForURLs to load these fonts (in the .process scope), and now PDFKit uses these fonts for attributed strings that need them.
Of course, the fonts are bogus and this is useless for rendering. However, it works perfectly for the purpose of identifying text that uses these fonts.
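Once the original PostScript names come through, mapping them to behavior is a simple lookup. The following Python sketch is only an illustration of that last step: the `FONT_STYLES` table and the `to_html` helper are hypothetical, and treating the Medium face as "bold" is an assumption about this particular document, not something PDFKit reports.

```python
import html

# Hypothetical mapping from the PostScript names found in the document
# to the styles they should trigger in the HTML output.
FONT_STYLES = {
    "NeoSansIntel": {"bold": False, "italic": False},
    # Assumption: the Medium weight plays the role of bold in this document.
    "NeoSansIntelMedium": {"bold": True, "italic": False},
    "NeoSansIntel,Italic": {"bold": False, "italic": True},
}


def to_html(text, postscript_name):
    """Wrap a text run in <b>/<i> tags according to its font name."""
    style = FONT_STYLES.get(postscript_name)
    if style is None:
        # An unknown name most likely means a fallback font was substituted.
        raise ValueError(f"unexpected font: {postscript_name!r}")
    out = html.escape(text)
    if style["italic"]:
        out = f"<i>{out}</i>"
    if style["bold"]:
        out = f"<b>{out}</b>"
    return out
```

Raising on an unknown name makes it obvious when a font was missed and the fallback (Helvetica, in the question) slipped through.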

Display of Asian characters (with Unicode): Difference in character spacing when presented in a RichEdit control compared with using ExtTextOut

This picture illustrates my predicament:
All of the characters appear to be the same size, but the space between them is different when presented in a RichEdit control compared with when I use ExtTextOut.
I would like to present the characters the same as in the RichEdit control (ideally), in order to preserve wrap positions.
Can anyone tell me:
a) Which is the more correct representation?
b) Why the RichEdit control displays the text with no gaps between the Asian Characters?
c) Is there any way to make ExtTextOut reproduce the behaviour of the RichEdit control when drawing these characters?
d) Would this be any different if I was working on an Asian version of Windows?
Perhaps I'm being optimistic, but if anyone has any hints to offer, I'd be very interested to hear.
In case it helps:
Here's my text:
快的棕色狐狸跳在懶惰狗1 2 3 4 5 6 7 8 9 0
Apologies to Asian readers: this is merely for testing our Unicode implementation, and I don't even know what language the characters are taken from, let alone whether they mean anything.
In order to view the effect by pasting these characters into a RichEdit control (eg. Wordpad), you may find you have to swipe them and set the font to 'Arial'.
The rich text that I obtain is:
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Arial;}}{\colortbl ;\red0\green0\blue0;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\fs22\u24555?\u30340?\u26837?\u33394?\u29392?\u29432?\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3 4 5 6 7 8 9 0\par\pard\'a3 $$ \'80\'80\cf1\lang2057\fs16\par}
It doesn't appear to contain a value for character 'pitch' which was my first thought.
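One way to confirm that the RTF carries only the code points (and no spacing information) is to decode its `\uN?` escapes back into characters. This is a minimal Python sketch, assuming the single-character `?` fallback that `\uc1` declares in the RTF above; the function name is made up for illustration, and it is not a general RTF parser.

```python
import re

# Matches \uN? where N is a (possibly negative) decimal number and "?" is
# the one-character fallback. Control words such as \uc1 are not matched
# because a digit or minus sign must follow "\u" directly.
_UNICODE_ESCAPE = re.compile(r"\\u(-?\d+)\?")


def decode_rtf_unicode(rtf):
    """Return the characters encoded as \\uN? escapes in an RTF string."""
    chars = []
    for match in _UNICODE_ESCAPE.finditer(rtf):
        code = int(match.group(1))
        if code < 0:
            # RTF stores the value as a signed 16-bit number.
            code += 65536
        chars.append(chr(code))
    return "".join(chars)


sample = (r"\uc1\pard\u24555?\u30340?\u26837?\u33394?\u29392?\u29432?"
          r"\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3")
decoded = decode_rtf_unicode(sample)  # the eleven CJK characters from the test text
```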
I don't know the answer, but there are several things to suspect:
There are several versions of the rich edit control. Perhaps you're using an older one that doesn't have all the latest typographic improvements.
There are many styles and flags that affect the behavior of a rich edit control, so you might want to explore which ones are set and what they do. For example, look at EM_GETEDITSTYLE.
Many Asian fonts come in two versions on Windows. One is optimized for horizontal layout, and the other for vertical layout. The latter usually has the same name, but with @ prepended to it. Perhaps you are using the wrong one in the rich edit control.
UPDATE: By messing around with Wordpad, I was able to reproduce the problem with the crowded text in the rich edit control.
Open a new document in Wordpad on Windows 7. Note that the selected font is Calibri.
Paste the sample text into the document.
Text appears correct, but Wordpad changed the font to SimSun.
Select the text and change the font back to Calibri or Arial.
The text will now be overcrowded, very similar to your example. Thus it appears the fundamental problem is with font linking and fallback. ExtTextOut is probably selecting an appropriate font for the script automatically. Your challenge is to figure out how to identify the right font for the script and set that font in the rich edit control.
This will only help with part of your problem, but there is a way to draw text to a DC that will look exactly the same as it does with RichEdit: what's called the windowless RichEdit control. It is not exactly easy to use: I wrote a CodeProject article on it a few years back. I used this to solve the problem of a scrollable display of blocks of text, each one of which can be edited by clicking on it: the normal drawing is done with the windowless RichEdit, and the editing by showing a "real" RichEdit control on top of it.
That would at least get you the text looking the same in both cases, though unfortunately both cases would show too little character spacing.
One further thought: if you could rely on Microsoft Office being installed, you could also try the later versions of RichEdit that come with Office. There's more about these on Murray Sargent's blog, as well as some interesting articles on font binding that might also help.
ExtTextOut allows you to specify the logical spacing between characters. It has the parameter lpDx, which is a const pointer to an array of values that indicate the distance between origins of adjacent character cells. The Microsoft API documentation notes that if you don't set it, then it uses its own default spacing. I would have to say that's why ExtTextOut is working fine.
In particular, when you construct an EMR_EXTTEXTOUTW record in an EMF, it populates an EMR_TEXT structure with this DX array. Judging by one of your comments, that allowed the RichEdit control to insert the EMF with the information contained in the record; if you didn't set a font binding, the RTF record does some matching to work out what font to use.
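To make the role of the lpDx array concrete: each entry is the distance from one character cell's origin to the next, so the cell origins are a running sum. The Python sketch below shows only that arithmetic; the function name and the widths are made up for illustration and nothing here calls the Windows API.

```python
from itertools import accumulate


def cell_origins(start_x, dx):
    """Return the x origin of each character cell.

    dx[i] is the distance from the origin of cell i to the origin of
    cell i + 1, so the last entry does not affect any origin in the run.
    """
    if not dx:
        return []
    return [start_x] + [start_x + total for total in accumulate(dx[:-1])]


# Hypothetical advances: two wide (CJK-style) cells followed by a narrow one.
origins = cell_origins(10, [12, 12, 6])  # [10, 22, 34]
```

Supplying larger values in the array is how a caller would open up the gaps between characters; leaving it unset hands the decision to the default spacing.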
In terms of the RichEdit control, the following article might be useful:
Use Font Binding in a Rich Edit Control
After character sets are assigned, Rich Edit scans the text around the
insertion point forward and backward to find the nearest fonts that
have been used for the character sets. If no font is found for a
character set, Rich Edit uses the font chosen by the client for that
character set. If the client hasn't specified a font for the character
set, Rich Edit uses the default font for that character set. If the
client wants some other font, the client can always change it, but
this approach will work most of the time. The current default font
choices are based on the following table. Note that the default fonts
are set per-process, and there are separate lists for UI usage and for
non-UI usage.
If you haven't set the character set, then it further explains that it falls back to ANSI_CHARSET. However, it's most definitely a lot more complicated than that, as that blog article by Murray Sargent (a programmer at Microsoft) shows.

Get the word under the mouse cursor in Windows

Greetings everyone,
A friend and I are discussing the possibility of a new project: A translation program that will pop up a translation whenever you hover over any word in any control, even static, non-editable ones. I know there are many browser plugins to do this sort of thing on webpages; we're thinking about how we would do it system-wide (on Windows).
Of course, the key difficulty is figuring out the word the user is hovering over. I'm aware of MSAA and Automation, but as far as I can tell, those things only allow you to get the entire contents of a control, not the specific word the mouse is over.
I stumbled upon this (proprietary) application that does pretty much exactly what we want to do: http://www.gettranslateit.com/
Somehow they are able to get the exact word the user is hovering over in almost any application (It seems to have trouble in a few apps, notably Windows Explorer). It even grabs text out of obviously custom-drawn controls, somehow. At first I thought it must be using OCR. But even when I shrink the font so far down that the text becomes a completely unreadable blob, it can still recognize words perfectly. (And yet, it doesn't recognize anything if I change the font to Wingdings. But maybe that's by design?)
Any ideas as to how it's achieving this seemingly impossible task?
EDIT: It doesn't work with Wingdings, but it does work with some other nonsense fonts, so I've confirmed it can't be OCR.
You could capture the GDI calls that output text to the display, and then figure out which word's bounding box the cursor falls in.
Well, for GDI controls you can get the position and size of the control, and you can usually get the font info. For example, with static text controls you'd use WM_GETFONT. Then once you have that you can get the position of the mouse relative to the position of the control and use one of the font functions, perhaps something like GetTextExtentPoint32 to figure out what is under the cursor. I'm pretty sure the answer lies in that direction...
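The hit-testing half of that idea can be sketched independently of the Windows API. In this minimal Python illustration, `char_widths` is a hypothetical list of per-character advances (the kind of numbers a text-extent function would be used to obtain), `x` is the mouse position relative to the start of the text, and only a single line of text is handled.

```python
def word_at(text, char_widths, x):
    """Return the word under horizontal offset x, or None if there is none."""
    if len(text) != len(char_widths):
        raise ValueError("need exactly one width per character")
    if x < 0:
        return None

    # Walk the character cells until we find the one containing x.
    pos = 0
    index = None
    for i, width in enumerate(char_widths):
        if pos <= x < pos + width:
            index = i
            break
        pos += width

    if index is None or text[index].isspace():
        return None  # past the end of the text, or between words

    # Expand left and right to the surrounding whitespace.
    start = index
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    end = index + 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[start:end]
```

For example, with every character 8 units wide, `word_at("hello world", [8] * 11, 50)` lands in the seventh cell and returns `"world"`.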
You can run dumpbin /imports on the other application and see what APIs they are calling.

How to set g:text style to bold font in a Windows Gadget?

I'm developing a Vista/Win7 Desktop Gadget that uses a translucent g:background area with g:text on top. I'm adding the text via addTextObject, and this all works as expected.
However, I can't figure out how to set that text to bold style. There doesn't seem to be a way to do this directly via the exposed properties that I can see, and I can't use regular text + CSS in this case due to the fact this text is placed onto a g:background object.
I have also tried specifying a bold font directly, such as Arial Bold (doesn't work) instead of Arial (works).
So how can this be done?
Edit: I have tried setting font-weight:bold for both the body and the g:background object that parents my text; no luck.
See Flip Calendar, by Jonathan Abbott. His code is usually well commented so maybe you can get some ideas from that.
EDIT
The source of my information was from the early days of Vista Beta 2 where that was the official word from MS. I also found the following response to a thread on the MSDN forums regarding the Flip Calendar gadget itself:
http://social.msdn.microsoft.com/Forums/en-US/sidebargadfetdevelopment/thread/841e9d5e-32e9-453f-bd0e-dc5a4e607c33/
The gadget has options for setting bold font on the day of the month (a g:text object) but on closer inspection it doesn't work. Sorry about that. The MS guys have been known to be wrong as well on one or more occasions. I can honestly say that I don't use the g:text object.
This means your only option (short of the ActiveX route) is VML text, which provides a lot of flexibility on layout. However, you will have to place it on a fully opaque area of the gadget, which is probably why you wanted to use addTextObject in the first place. Gary Beene's site really helped me out when I was getting started, but it doesn't go into any detail on the v:textbox and v:textpath elements, though the MSDN documentation goes into enough detail on these.
If you need to place the text on a non-fully opaque area of the gadget, then you could still go the VML route and place an image behind the text that acts as a shadow, starting out fully opaque and fading to fully transparent. This is how Microsoft does text in window title bars with Aero enabled.
Alternatively, you could create an ActiveXObject that draws the text you need in the font you want and saves the image to a temporary file in the gadget folder. Then you set that to the src of an addImageObject. I've done something similar in a gadget and it's fast enough not to be noticeable. You can also set min/max dimensions so shrinking/stretching to fit becomes a breeze.
