MFC: multiline text with more formatting? - winapi

I have a dialog-based app that initially has a static text control at the top showing the program name. I want horizontal and vertical centering with multiple lines and different font sizes, so that it can show more information nicely (figure below). This is proving hard with a static text control. For example, the Center Image style (SS_CENTERIMAGE) doesn't work with \r\n. Is there a better control or approach?

Related

Extract text from rectangle on Windows screen without using OCR

Given a rectangle that represents an area on a Windows screen that contains text, what is the best way to extract the text?
I know that it is possible using OCR, but even after significant preprocessing, the quality is really poor.
Getting the window text using the Win32 API does not always work either.
Assuming that the text was rendered using a font, is it possible to get it from there?
Any directions would be extremely helpful. Thanks!
Given a rectangle that represents an area on a Windows screen, the best way to extract text is indeed OCR. Use a better OCR library like this one from Microsoft.
The reason getting the window text using the Win32 API does not work well is that there may be multiple windows in that rectangle. You will have to find out which windows the rectangle contains and send a message to each one to get its text. It is not impossible, but it is difficult, and even if you manage it, you will run into issues with text alignment, etc. OCR is your best option.
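As a rough illustration of the "find out which windows the rectangle contains" step, here is a minimal Python sketch of the overlap test only. It assumes the (handle, rectangle) pairs were already collected elsewhere, for example with EnumChildWindows and GetWindowRect, and that the text would then be requested per window with WM_GETTEXT. The Rect type, the function names and the sample handles are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Screen rectangle in pixels; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share at least one pixel."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def windows_in_rect(target: Rect, windows: Iterable[Tuple[int, Rect]]) -> List[int]:
    """Return the handles whose rectangle overlaps the target area."""
    return [handle for handle, rect in windows if intersects(target, rect)]


# Hypothetical handles and rectangles standing in for enumerated windows.
sample_windows = [
    (101, Rect(0, 0, 200, 50)),
    (102, Rect(150, 40, 400, 300)),
    (103, Rect(500, 500, 600, 600)),
]
overlapping = windows_in_rect(Rect(100, 20, 300, 100), sample_windows)
```

Each handle in the result would still need its own text request, and the pieces would have to be stitched together by position, which is where the alignment problems mentioned above come in.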
It does seem possible without using OCR, as NirSoft SysExporter can do this:
https://www.nirsoft.net/utils/sysexp.html
This may be suitable for programmatic use as it can be run from a command line:
Starting from version 1.70, you can export the content of Windows
control from command-line, without displaying any user interface.
You may not be able to target it at a specific rectangle on the screen, but maybe the same result could be achieved by first scraping everything followed by some post-processing.
Further basic info:
SysExporter utility allows you to grab the data stored in standard
list-views, tree-views, list boxes, combo boxes, text-boxes, and
WebBrowser/HTML controls from almost any application running on your
system, and export it to text, HTML or XML file.
...
Known Limitations
SysExporter can export data from most combo boxes, list boxes,
tree-view, and list-view controls, but not from all of them. There are
some applications that use these controls to display data, but the
data itself is not actually stored in the control, but in another
location in the computer's memory. In such cases, SysExporter won't be
able to export the data.
Personally I've used it to grab text from what look like label controls.

Add text layer to PDF of scanned handwritten notes in OSX

While in class I like to take handwritten notes. Afterwards I scan them and then type them up (it helps me remember them and also makes them easily searchable). The main issue I have is that I use A LOT of drawings and complex math. Converting the math formulas into LaTeX (or Word) is very time consuming, and the drawings require that I keep both the PDF and the text document. What I would like to do is take the basic text that I have typed myself (no OCR) and add a text layer to the PDFs. That way the PDFs will be searchable and I can save a lot of time by not converting the math or drawings.
I've looked into Preview, PDFpenPro, Acrobat, and a couple of Linux programs, but so far I haven't really found anything that will do this.
Any idea of how I could do this or a program to use?
I also scan my notes. Sometimes I go back and add some text to them using this technique:
Open up the scanned PDF in Preview, then click on the "Edit" button in the top right corner, then the "Text tools" button on the left side (it's a little box with Aa in it). From there you can drag open a text box and type into it.
Now the secret trick is that if you save it here as it is and try to open it on your iPad using PDFExpert or some other program, then the text might not be there. So here's how to get past that slight hiccup: after you've annotated your notes how you want, instead of just saving it as a PDF, use the Print option: File->Print or Command+P. Now click the PDF button on the left to "Save it as a pdf". Now that it's printed you can open it and search it in any program that reads PDFs. Attached is an example.
One other thing, it seems like maybe you want to write over your existing handwritten text with typed text? I'm not sure if this is the best way. But if that's what I was trying to do I would:
1. Scan my notes
2. Read through them, typing them up as you said
3. Open the scanned notes in Photoshop or some other program
4. Draw a giant White Fill White Stroke rectangle over the handwritten text
5. Save it as a pdf
6. Do the technique above and copy and paste the typed text from step 2.
I hope this helps. And I wish you luck, I'm still working out the kinks myself for scanned notes but the possibilities have me pretty excited!
EDIT: I just checked out PDFpenPro, which I highly recommend because you don't have to go through that printing trick, you can just save the pdf document after annotating and other programs will recognize the annotations.

Changing font of text error

I have a textbox control inside a software app which has some text in it. That software uses a custom font which doesn't exist anywhere else and is specific to this program. I don't have its source or access to its creators. Now I want to copy that text into Notepad or MS Word, but when I do, the text is no longer readable unless I change the font of the word processor to the font that the software is using (the font that the text is written with). I want the text to be readable anywhere and not to depend on a specific font. Is that possible?
I'm a C# programmer. Here is an example of unreadable text:
ý¶† ±øõœ ­ý¶† –ý¾‡¨ ÿ†°†¬ ­ñð‡ì úÞ±¶ Ä쇤 ½±”
à¥ì ±øõœ þ·ñœ­Œ Ý稆­Œ ô±º±” (.ì)
[þü‡íý‘†õø]
ý¶†
[þ¶­ñùì ïõéÎ]
±øõœ ­ý¶† ‡º±”
[þíýº]
ý¶†
[úð‡ýì‡Î —‡¤çȾ†] ÿ¬.¹†.ë† °­©ì ÿû¬‡ì ²† þÎõð.ÿ¬.¹†.ë†"
The interesting thing is that it shows up like this in almost all fonts except the one that the text was originally written with. By the way, the text is in Arabic, and all of the fonts that I tested the text with support Arabic characters.
Now if I type some text that consists of English and Arabic in that font and then change the font of Notepad to some other font, it looks OK and works normally! So the problem only appears when the text is pasted into the word processor.
EDIT: I think I found the problem! The custom font is a raster font (bitmap font) with a .fon extension, and in the following thread someone wanted to convert the bitmap font to TTF because he was having a problem printing documents. I want to copy and paste, so maybe I have to convert the font?
The discussion:
how to convert a bitmap font .fon into a truetype font ttf
Any kind of help is really appreciated.
Thank you.
If I had seen this question on superuser.com my answer would have been:
You can change the font of text from font A to Arial.
For example in Microsoft Word
1. Open the Replace dialog box (Edit >> Replace or Ctrl + H)
2. Make sure no text is specified in the Find what or Replace with boxes
3. Click in the Find what box, then click Format (If you don’t see the Format button, click More to expand the search options)
4. Select Font from the pop up list
5. In the Find Font dialog box, select the text formatting options you would like to replace
6. Click OK
7. Click in the Replace with box
8. Click Format
9. Select Font from the pop up list
10. In the Replace Font dialog box, select the new text formatting options you would like to apply
11. Click OK
12. Click Replace all
13. Click OK
14. Click Close
(from http://wordprocessing.about.com/cs/quicktips/qt/fontreplace.htm)
As an aside: If the document uses styles, it is actually much easier to change the font. For this reason I try to always use styles and never directly apply fonts to text.
If you are not referring to Word documents, please amend your question to say exactly what software was used to create the text - or exactly what file-format the text is stored in.
Since you asked on stackoverflow.com I slowly deduced you may be writing a program in some unspecified programming language. I suggest you edit your question and specify what programming language you are using and give some example code to illustrate the problem.
For example, in Java you might do something like
JLabel label = new JLabel("hello world");
label.setFont(new Font("Arial", Font.PLAIN, 12));
It sounds very much as though the author of the original program has invented their own character encoding and provided a font to go with it. Maybe the development tools were restricted to ANSI text and the developers came up with this extreme solution.
Test out the hypothesis by writing some English text in the custom font and see if Arabic characters appear.
If this is so then you will have to work out what the encoding is and translate the strings character by character.
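A minimal sketch of that character-by-character translation, in Python, assuming the custom encoding turns out to be a simple one-to-one substitution. The three table entries below are made-up placeholders, not the real mapping; the actual pairs would have to be worked out by typing each key in the custom font and noting which Arabic letter appears.

```python
# Hypothetical mapping: character as stored by the application -> real Unicode letter.
# These entries are placeholders for illustration only.
CUSTOM_TO_UNICODE = {
    "a": "\u0627",  # ARABIC LETTER ALEF
    "b": "\u0628",  # ARABIC LETTER BEH
    "t": "\u062a",  # ARABIC LETTER TEH
}

# str.maketrans builds a lookup table keyed by code point for str.translate.
_TABLE = str.maketrans(CUSTOM_TO_UNICODE)


def decode_custom(text: str) -> str:
    """Replace every mapped character; unmapped characters pass through unchanged."""
    return text.translate(_TABLE)


decoded = decode_custom("ab t")
```

Characters missing from the table are left untouched, which makes it easy to spot the ones that still need an entry.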

Display of Asian characters (with Unicode): Difference in character spacing when presented in a RichEdit control compared with using ExtTextOut

This picture illustrates my predicament:
All of the characters appear to be the same size, but the space between them is different when presented in a RichEdit control compared with when I use ExtTextOut.
I would like to present the characters the same as in the RichEdit control (ideally), in order to preserve wrap positions.
Can anyone tell me:
a) Which is the more correct representation?
b) Why the RichEdit control displays the text with no gaps between the Asian Characters?
c) Is there any way to make ExtTextOut reproduce the behaviour of the RichEdit control when drawing these characters?
d) Would this be any different if I was working on an Asian version of Windows?
Perhaps I'm being optimistic, but if anyone has any hints to offer, I'd be very interested to hear.
In case it helps:
Here's my text:
快的棕色狐狸跳在懶惰狗1 2 3 4 5 6 7 8 9 0
Apologies to Asian readers: this is merely for testing our Unicode implementation, and I don't even know what language the characters are taken from, let alone whether they mean anything.
In order to view the effect by pasting these characters into a RichEdit control (e.g. Wordpad), you may find you have to swipe them and set the font to 'Arial'.
The rich text that I obtain is:
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Arial;}}{\colortbl ;\red0\green0\blue0;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\fs22\u24555?\u30340?\u26837?\u33394?\u29392?\u29432?\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3 4 5 6 7 8 9 0\par\pard\'a3 $$ \'80\'80\cf1\lang2057\fs16\par}
It doesn't appear to contain a value for character 'pitch' which was my first thought.
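For reference, the \uN? escapes in that RTF are just decimal UTF-16 code units, so they can be decoded to check that the control stored the same characters as the test string. This is a small Python sketch, not a general RTF parser: the function name is hypothetical, and it assumes each escape is followed by a single "?" fallback character (as \uc1 declares) and that no characters outside the Basic Multilingual Plane are involved.

```python
import re

# Matches \u followed by a (possibly negative) decimal number and an optional "?" fallback.
# Control words such as \uc1 do not match because a digit must follow the "u".
_U_ESCAPE = re.compile(r"\\u(-?\d+)\??")


def rtf_unicode_chars(rtf: str) -> str:
    """Collect the characters encoded as \\uN escapes in an RTF fragment."""
    chars = []
    for match in _U_ESCAPE.finditer(rtf):
        code = int(match.group(1))
        if code < 0:
            # RTF writes values above 32767 as negative signed 16-bit numbers.
            code += 65536
        chars.append(chr(code))
    return "".join(chars)


fragment = (r"\viewkind4\uc1\pard\fs22\u24555?\u30340?\u26837?\u33394?\u29392?"
            r"\u29432?\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3")
recovered = rtf_unicode_chars(fragment)
```

Run on the fragment above, this yields the eleven ideographs from the test string, so the characters themselves are intact and only the font and spacing differ.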
I don't know the answer, but there are several things to suspect:
There are several versions of the rich edit control. Perhaps you're using an older one that doesn't have all the latest typographic improvements.
There are many styles and flags that affect the behavior of a rich edit control, so you might want to explore which ones are set and what they do. For example, look at EM_GETEDITSTYLE.
Many Asian fonts come in two versions on Windows. One is optimized for horizontal layout, and the other for vertical layout. The latter usually has the same name, but has @ prepended to it. Perhaps you are using the wrong one in the rich edit control.
UPDATE: By messing around with Wordpad, I was able to reproduce the problem with the crowded text in the rich edit control.
Open a new document in Wordpad on Windows 7. Note that the selected font is Calibri.
Paste the sample text into the document.
Text appears correct, but Wordpad changed the font to SimSun.
Select the text and change the font back to Calibri or Arial.
The text will now be overcrowded, very similar to your example. Thus it appears the fundamental problem is with font linking and fallback. ExtTextOut is probably selecting an appropriate font for the script automatically. Your challenge is to figure out how to identify the right font for the script and set that font in the rich edit control.
This will only help with part of your problem, but there is a way to draw text to a DC that will look exactly the same as it does with RichEdit: what's called the windowless RichEdit control. It's not exactly easy to use: I wrote a CodeProject article on it a few years back. I used this to solve the problem of a scrollable display of blocks of text, each one of which can be edited by clicking on it: the normal drawing is done with the windowless RichEdit, and the editing by showing a "real" RichEdit control on top of it.
That would at least get you the text looking the same in both cases, though unfortunately both cases would show too little character spacing.
One further thought: if you could rely on Microsoft Office being installed, you could also try later versions of RichEdit that come with Office. There's more about these on Murray Sargent's blog, as well as some interesting articles on font binding that might also help.
ExtTextOut allows you to specify the logical spacing between characters. It has the parameter lpDx, which is a const pointer to an array of values that indicate the distance between origins of adjacent character cells. The Microsoft API documentation notes that if you don't set it, then it sets its own default spacing. I would have to say that's why ExtTextOut is working fine.
In particular, when you construct an EMR_EXTTEXTOUTW record in EMF, it populates an EMR_TEXT structure with this DX array. Looking at one of your comments, this allowed the RichEdit to insert the EMF with the information contained in the record, whereas if you didn't set a font binding then the RTF record does some matching to work out what font to use.
In terms of the RichEdit control, the following article might be useful:
Use Font Binding in a Rich Edit Control
After character sets are assigned, Rich Edit scans the text around the
insertion point forward and backward to find the nearest fonts that
have been used for the character sets. If no font is found for a
character set, Rich Edit uses the font chosen by the client for that
character set. If the client hasn't specified a font for the character
set, Rich Edit uses the default font for that character set. If the
client wants some other font, the client can always change it, but
this approach will work most of the time. The current default font
choices are based on the following table. Note that the default fonts
are set per-process, and there are separate lists for UI usage and for
non-UI usage.
If you haven't set the character set, then it further explains that it falls back to ANSI_CHARSET. However, it's most definitely a lot more complicated than that, as that blog article by Murray Sargent (a programmer at Microsoft) shows.

Why alt attribute shows for a split second in Firefox?

I'm working with the course management system Moodle, and in the admin area the folder tree (which uses folder icons) displays for about a second the alt attribute given (in this case "Open Folder"), then it hides and shows the image when the image is ready.
The system is kind of slow, so I assume Firefox thinks at first that the images don't exist.
This is a problem because during that split second the layout stretches to fit the wider words, making it look unprofessional in my opinion.
Is there a way I can hide this text without having to remove the alt attributes (which would be labor intensive)? Maybe using jQuery or CSS.
displays for about a second the alt attribute given (In this case "Open Folder") then it hides and shows the image when the image is ready.
Yes, that's what alt text is for: it provides a textual alternative for when the image isn't available — whether that's because there's an error, or images are turned off in the browser settings, or, in this case, the file just hasn't arrived yet.
Is alt text really what you want? Unless the image in question actually contains the words “Open Folder”, the above is inappropriate alt text. If we're talking about one of those little plus/minus icons that opens a tree, a better alt text would be ‘+’. “Open folder”, as a description of what the image does (as opposed to what it contains), would be better applied to the ‘title’ attribute used for tooltips.
Note that if you're using Quirks Mode and the image has a fixed size specified, Firefox will use a ‘broken image’ icon with the alt text overlaid and cropped inside, instead of the plain alt text on its own. This is to match IE's old behaviour. But you don't really want to use Quirks Mode, and in the common case where the fixed size is small, the cropping makes the alt text unreadable and useless.
This is a problem because during that split second the layout stretches to fit the wider words making it look unprofessional in my opinion.
I'd recommend: getting over it. That's how the web rolls; any page can move about a bit as it renders progressively. For images you should only ever see it happen once, then the image will be cached and will appear straight away. If it doesn't, there's something wrong with the caching setup.
Depending on what kind of layout you are talking about, you can perhaps fix that to not respond to the changing image size, too. For example, if using a table, setting “table-layout: fixed” on the table and “width: (some number of)px” on the top row's image cell will make it stick to that width even if the text inside is smaller. This may cause the alt text to run over into the next cell, though, mind.
If the images are part of the layout, I'd recommend moving them to CSS. You should also optimize your images wherever possible whether they are CSS or otherwise. You could also move your JavaScript files to the bottom of the page where possible as they block parallel downloads. In general, applying a lot of the techniques here would probably help.
If the images have to be a certain width, give them an explicit width.

Resources