Is there some alternative to GDI when one wants to write a nice, fast text editor with the Win32 API? I want something that also works with older Windows versions, for example XP. I have heard that GDI is slow; maybe there is something better suited than GDI for writing a text editor? Does anybody know what the various nice text editors out there use for this purpose?
GDI is not too fast, but for an editor it should probably be sufficient. It also depends on the intelligence of the paint algorithm: when the text is being edited, for example, you should only re-render the affected line(s). Even when inserting new lines, you may just scroll most of the lines below with ScrollWindow() or ScrollWindowEx().
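A minimal sketch of that idea, assuming hwnd, lineIndex, and lineHeight come from the editor's own state (the helper name is made up for illustration):

```cpp
#include <windows.h>

// Hypothetical helper, called after a new line of text has been inserted at
// 'lineIndex': shift everything below down by one line height instead of
// repainting the whole client area. Only the uncovered strip (exactly where
// the new line goes) is invalidated and repainted in WM_PAINT.
void ScrollAfterLineInsert(HWND hwnd, int lineIndex, int lineHeight)
{
    RECT below;
    GetClientRect(hwnd, &below);
    below.top = lineIndex * lineHeight;   // from the insertion point down

    ScrollWindowEx(hwnd,
                   0, lineHeight,         // dx = 0, dy = one line downward
                   &below, &below,        // scroll and clip to this region
                   NULL, NULL,
                   SW_INVALIDATE);        // mark the exposed strip dirty
}
```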
As an alternative you may look at Uniscribe (USP10.DLL), although I am not sure whether it relies on GDI or not. It is more or less a replacement for TextOut() and similar GDI functions that properly supports different scripting systems, including aspects like right-to-left reading and mixtures of left-to-right and right-to-left (e.g. Arabic with embedded European personal names, etc.).
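For reference, a hedged sketch of Uniscribe's high-level ScriptString API standing in for TextOut(); error handling is abbreviated and the parameters follow the usp10.h declarations:

```cpp
#include <windows.h>
#include <usp10.h>   // link with usp10.lib

// Draw a complex-script string with Uniscribe instead of TextOut().
// 'hdc', 'x', 'y', 'text', and 'len' are assumed to come from the caller.
void DrawWithUniscribe(HDC hdc, int x, int y, const WCHAR* text, int len)
{
    SCRIPT_STRING_ANALYSIS ssa;
    HRESULT hr = ScriptStringAnalyse(
        hdc, text, len,
        len * 3 / 2 + 16,           // recommended glyph buffer size
        -1,                         // -1 means the string is Unicode
        SSA_GLYPHS | SSA_FALLBACK,  // shape now, use font fallback if needed
        0, NULL, NULL, NULL, NULL, NULL,
        &ssa);
    if (SUCCEEDED(hr)) {
        ScriptStringOut(ssa, x, y, 0, NULL, 0, 0, FALSE);
        ScriptStringFree(&ssa);
    }
}
```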
Then there is also DirectWrite, which is supposed to be used together with Direct2D. That should be faster, as Direct2D offloads a lot of work to the graphics card, while GDI mainly eats CPU and system memory. Note, however, that these APIs are only available since Windows 7.
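A minimal sketch of that DirectWrite/Direct2D path, assuming a render target and brush created elsewhere; in real code you would create the factory and text format once, not per call (error handling omitted):

```cpp
#include <d2d1.h>
#include <dwrite.h>   // link with d2d1.lib and dwrite.lib

// Draw 'text' with DirectWrite through a Direct2D render target.
// 'target' and 'brush' are assumed to exist already (e.g. an
// ID2D1HwndRenderTarget and an ID2D1SolidColorBrush).
void DrawWithDirectWrite(ID2D1RenderTarget* target, ID2D1Brush* brush,
                         const WCHAR* text, UINT32 len)
{
    IDWriteFactory* factory = nullptr;
    DWriteCreateFactory(DWRITE_FACTORY_TYPE_SHARED, __uuidof(IDWriteFactory),
                        reinterpret_cast<IUnknown**>(&factory));

    IDWriteTextFormat* format = nullptr;
    factory->CreateTextFormat(L"Consolas", nullptr,
                              DWRITE_FONT_WEIGHT_NORMAL,
                              DWRITE_FONT_STYLE_NORMAL,
                              DWRITE_FONT_STRETCH_NORMAL,
                              14.0f, L"en-us", &format);

    D2D1_RECT_F layout = { 0.0f, 0.0f, 800.0f, 600.0f };  // placeholder rect
    target->BeginDraw();
    target->DrawText(text, len, format, layout, brush);
    target->EndDraw();

    format->Release();
    factory->Release();
}
```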
I'm trying to come up with a platform-independent way to render Unicode text to some platform-specific surface, but assuming that most platforms support something at least somewhat similar, maybe we can talk in terms of the Win32 API. I'm most interested in rendering LARGE buffers and supporting rich text, so while I definitely don't want to ever look inside a Unicode buffer, I'd like to be told what to draw and be hinted where to draw it, so that if the buffer is modified, I can properly ask for updates on partial regions of the buffer.
So, the actual questions. GetTextExtentExPointW clearly allows me to get the widths of each character, but how do I get the width of a non-breaking extent of text? If there is some long word, it should probably be put on a new line rather than split. How can I tell where to break the text? Will I need to actually look inside the Unicode buffer? That seems very dangerous. Also, how do I figure out how far apart each baseline should be while rendering?
Finally, this is already looking like it's going to be extremely complicated. Are there alternative strategies for doing what I'm trying to do? I really would not like to re-render a HUGE buffer each time I change a tiny chunk of it. I want something between looking at individual glyphs and just being given a box to spit text into.
Finally, I'm not interested in using anything like Knuth's line-breaking algorithm, and no hyphenation. Ideally I'd like to render justified text, but that's on me as long as the word positioning is given. A ragged right-hand margin is fine by me.
What you're trying to do is called shaping in Unicode jargon. Don't bother writing your own shaping engine; it's a full-time job that requires continuous updates to take into account changes in the Unicode and OpenType standards. If you want to spend any time on the rest of your app, you'll need to delegate shaping to a third-party engine (harfbuzz-ng, Uniscribe, ICU, etc.).
As others wrote:
– Unicode font rendering is hellishly complex, much more than you expect
– the Win32 API is not cross-platform at all
The three common strategies for rendering Unicode text are:
1. write one backend per system (plugging into the system's native text stack), or
2. select one set of cross-platform libs (for example FriBidi + harfbuzz-ng + FreeType + fontconfig, or a framework like Qt) and recompile them for each target system, or
3. take compliance shortcuts
The only strategy I strongly advise against is the last one. You cannot control unicode.org normalization (e.g. adding capital ß for German), you do not understand worldwide script usage (both African languages and Vietnamese are Latin variants, but they exercise unexpected Unicode properties), and you will underestimate font creators' ingenuity (oh, Indic users requested this OpenType property, but it turns out to be really handy for this English use case…).
The first two strategies have their own drawbacks. It's simpler to maintain a single text backend, but deploying a complete text stack on a foreign system is far from hassle-free. Almost every project that tries cross-platform has to get rid of MSVC first, since it targets Windows and its language dialect won't work on other platforms, and cross-platform libs will typically only compile easily with gcc or llvm.
I think harfbuzz-ng has reached parity with Uniscribe even on Windows, so that's the lib I'd target if I wanted cross-platform today (Chrome, Firefox and LibreOffice use it on at least some platforms). However, LibreOffice at least uses the multi-backend strategy; I have no idea whether that reflects the current library state or some past historical assessment. There are not many cross-platform apps with heavy text usage to look at, and most of them carry the burden of legacy choices.
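To give a feel for what delegating shaping looks like, here is a minimal, hedged harfbuzz-ng sketch; the font path and sample string are placeholders, error handling is omitted, and exact setup details may vary by HarfBuzz version:

```cpp
#include <hb.h>
#include <cstdio>

int main()
{
    // Load a font face (placeholder path) and make a shaping font from it.
    hb_blob_t* blob = hb_blob_create_from_file("DejaVuSans.ttf");
    hb_face_t* face = hb_face_create(blob, 0);
    hb_font_t* font = hb_font_create(face);
    hb_font_set_scale(font, 1000, 1000);      // advances in 1/1000 em

    // Fill a buffer with UTF-8 text and let HarfBuzz guess its properties.
    hb_buffer_t* buf = hb_buffer_create();
    hb_buffer_add_utf8(buf, "مرحبا world", -1, 0, -1);
    hb_buffer_guess_segment_properties(buf);  // direction, script, language

    hb_shape(font, buf, nullptr, 0);          // the actual shaping step

    // The output is glyph indices plus positions, ready for a rasterizer.
    unsigned int n = 0;
    hb_glyph_info_t*     info = hb_buffer_get_glyph_infos(buf, &n);
    hb_glyph_position_t* pos  = hb_buffer_get_glyph_positions(buf, &n);
    for (unsigned int i = 0; i < n; ++i)
        std::printf("glyph %u  x_advance %d\n",
                    info[i].codepoint, pos[i].x_advance);

    hb_buffer_destroy(buf);
    hb_font_destroy(font);
    hb_face_destroy(face);
    hb_blob_destroy(blob);
}
```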
Unicode rendering is surprisingly complicated. Line breaking is just the beginning; there are many other subtleties that I don't think you've appreciated (vertical text, right-to-left text, glyph combining, and many more). Microsoft has several teams dedicated to doing nothing but implementing text rendering, for example.
Sounds like you're interested in DirectWrite. Note that this is NOT platform independent, obviously. It's also possible to do a less accurate job if you don't care about being language independent; many of the more unusual features only occur in rarer languages. (Chinese being a notable exception.)
If you want something perfectly cross-platform, there will be problems. If you draw one sentence with GDI, one with GDI+, one with Direct2D, one on Linux and one on Mac, all with the same font size on the same buffer, you'll see differences: some round positions to integers, for example, while others use floats.
There is not one problem, but at least two. Drawing text and computing text positions (line breaks, etc.) are very different tasks; some libraries do both, some do only the computing or only the rendering part. A very simplified explanation: drawing just renders one single character at the position you ask for, with transformations such as zoom and rotation, and with anti-aliasing. Computing does everything else: choosing each character's position in a word, line breaks in sentences, paragraphs, etc.
If you want to be platform-independent, you could use FreeType to read font files and get all the information on each character. That library gets exactly the same result on each platform, and predictability in fonts is good. The main problem with fonts is the amount of bad, missing, or even wrong information in character descriptions. Nobody does text perfectly, because it's very hard (tip of the hat to the Word and Acrobat teams and everyone who deals directly with fonts).
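For illustration, a minimal FreeType sketch along those lines; the font path is a placeholder and error handling is abbreviated:

```cpp
#include <ft2build.h>
#include FT_FREETYPE_H
#include <cstdio>

int main()
{
    FT_Library library;
    FT_Face face;
    if (FT_Init_FreeType(&library) ||
        FT_New_Face(library, "DejaVuSans.ttf", 0, &face))  // placeholder path
        return 1;

    FT_Set_Pixel_Sizes(face, 0, 16);          // 16px nominal size

    // Load and rasterize one glyph, then read back its metrics. These
    // numbers are identical on every platform, which is the point here.
    if (FT_Load_Char(face, 'A', FT_LOAD_RENDER) == 0) {
        FT_GlyphSlot g = face->glyph;
        std::printf("advance %ldpx, bitmap %ux%u, bearing (%d,%d)\n",
                    g->advance.x >> 6,        // advance is 26.6 fixed point
                    g->bitmap.width, g->bitmap.rows,
                    g->bitmap_left, g->bitmap_top);
    }

    FT_Done_Face(face);
    FT_Done_FreeType(library);
    return 0;
}
```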
If your font computation is good, there is still a lot of work to do everything you see in a good word processor (space between characters, spaces between words, line breaks, alignment, rotation, weight, anti-aliasing...), and only then can you do the rendering, which should be the easier part. With the same computation you can drive GDI, Direct2D, PDF or printing rendering paths.
At this point, everyone knows that there's a limit to the number of ShellIconOverlayIdentifiers (from MSDN):
The number of different icon overlay handlers that the system can support is limited by the amount of space available for icon overlays in the system image list. There are currently fifteen slots allotted for icon overlays, some of which are reserved by the system. For this reason, icon overlay handlers should be implemented only if there are no satisfactory alternatives
I can understand the 15-overlay limit in Windows 95. But in an environment where there are gigs of RAM, numerous cores, and GPUs, is there some technical reason for such a low number in a modern operating system?
And why isn't this value configurable?
Before giving the 'performance' answer, consider:
Windows allows for configuration such that you can kill performance... why pick on this issue specifically?
Unless someone here happens to work on the Windows Shell team, I doubt that you're going to get an answer that really addresses the technical limitations and how they affect the design choice. But I'll try...
My guess is that there isn't any technical limitation, or at least there isn't one now. The real reason is presumably that no one has ever taken the time to sit down and update the code, the design, and the spec to lift this limitation. Features aren't implemented by default, and just because the computing environment has changed in the last few years doesn't mean that someone sat down and rewrote Windows to take full advantage of all those changes.
You should also consider that this is more than likely a conscious design choice, rather than an imposed limitation. Raymond Chen (who actually does work on the shell team) published a blog entry responding to the uproar about Windows 7 removing the "sharing hand" overlay. He makes a compelling argument that the icon overlay is really not a desirable way of showing information (above and beyond the fact that the system is limited to 15) [emphasis added]:
Generally speaking, overlays are not a good way of presenting information because there can be only one overlay per icon, and there is a limit of 15 overlays per ImageList. If there are two or more overlays which apply to an item, then one will win and the others will lose, at which point the value of the overlay as a way of determining what properties apply to an item diminishes since the only way to be sure that a property is missing is when you see no overlay at all. (If you see some other overlay, you can't tell whether it's because your property is missing or because that other overlay is showing instead of yours.)
It seems reasonable to me that the extra clutter added to the shell is simply not worth it in the majority of real-world cases. The Windows Shell team obviously reached the same conclusion and cut the "sharing hand" overlay. Raymond's direct explanation:
Given the changes in how people use computers, sharing information is becoming more and more of the default state. When you set up a HomeGroup, pretty much everything is going to be shared. To remove the visual clutter, the information was moved to the Details pane.
And, I know you specifically asked not to mention performance, but Windows really does try to keep you from shooting yourself in the foot. Users demand responsiveness in the shell, and overlay icons can interfere with this. As further evidence that they are not the priority, another blog post by the same Raymond Chen chastises:
Another example of applications having a selfish view of performance came from a company developing an icon overlay handler. The shell treats overlay computation as a low-priority item, since it is more important to get icons on the screen so the user can start doing whatever it is they wanted to be doing. The decorations can come later. This company wanted to know if there was a way they could improve their performance and get their overlay onto the screen even before the icon shows up, demonstrating a phenomenally selfish interpretation of "performance".
Excellent response on the practical issues by Cody. As to why 15 and not some other number, the limit is baked into the ImageList control itself.
This is all very well and good, as explained by Cody Gray, but frankly it is pretty unimaginative, and the behind-the-scenes reporting sounds a bit frustrated.
In 2015, with Windows 10, surely there can and needs to be a better capability. I have noted about thirty overlays present on my system and had to prioritize the ones I most wanted to see, which is not something most people should have to worry about at all. I also see aggressive vendors like Box competing to prioritize themselves, and that will never lead anywhere good.
Here's a possibility: what if icons with multiple overlays had a generic overlay indicator, a small rectangular matrix of multiple colors like the Google Chrome apps button? A singly overlaid icon would just show its one overlay out of the long list.
Then, when the mouse pointer reaches the icon, a small flyout window collects all the icon variations to view (at small icon size or a little larger), and each overlaid icon in turn announces by tooltip what it is when you mouse over it.
Now you can have all the icon overlays you need: for state in various clouds, for repository indications as with the Tortoise tools, and so forth.
I quote an extract of the definitive answer here, from Raymond Chen's 2019 post Why is there a limit of 15 shell icon overlays?
The value 15 came from the corresponding limit for image lists. The ImageList_SetOverlayImage function supports up to 15 image list overlays per image list. (Hey, it used to be worse. The limit used to be only 3!)
Okay, but why only 15? Why not more?
The overlay image is one of the pieces of information used when drawing an image from an image list. The options are encoded in the fStyle parameter, and when the bits were divided up for various purposes, four bits were available to be used to specify the overlay image. (You get 15 overlay images instead of 16 because you lose one of the values in order to specify “no overlay.”)
Okay, but the values in the fStyle parameter use only the bottom 16 bits. What about the upper 16 bits? There’s plenty of room there.
The 16-bit limit was carried over from the 16-bit version of the common controls (which still needed to be supported in Windows 95). Of course, nowadays, nobody cares about the 16-bit version of the common controls, so why not start using the upper bits?
There’s an unsatisfying explanation: The code internally that manages the fStyle still uses a WORD in some places, so all the code that manages the fStyle would have to be revised. This occurs in multiple modules across Windows, so a synchronized change would have to be made across multiple components. This is a breaking change at the binary level because the interfaces are no longer compatible. Breaking changes are procedurally difficult to coordinate: The affected code may not be visible to the shell team because they are sitting in a far-away leaf branch that has not yet RI’d to the trunk. It might be that expanding fStyle from a WORD to a DWORD has far-reaching consequences for some component.
Like I said, this is unsatisfying. Basically it boils down to “It would be a lot of work and we are lazy.”
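To make the bit arithmetic concrete, here is a small sketch using the encoding from commctrl.h: the overlay index lives in four bits of fStyle (ILD_OVERLAYMASK is 0x00000F00, INDEXTOOVERLAYMASK(i) shifts the index into bits 8-11), and index 0 means "no overlay", which is exactly why 15 is the ceiling:

```cpp
#include <windows.h>
#include <commctrl.h>   // link with comctl32.lib

// Draw image 'image' from the image list with overlay 'overlayIndex' on top.
// Valid overlay indices are 1..15; zero is reserved for "no overlay".
void DrawWithOverlay(HIMAGELIST himl, int image, HDC hdc, int x, int y,
                     int overlayIndex)
{
    ImageList_Draw(himl, image, hdc, x, y,
                   ILD_TRANSPARENT | INDEXTOOVERLAYMASK(overlayIndex));
}
```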
I am working on a Cocoa-based text editor. Should I base it on NSTextView, or is there a more efficient option? Keep in mind that I plan to support tabs, so there can be many editors open at the same time.
I am working on a Cocoa-based text editor. Should I base it on NSTextView
Yes.
or is there a more efficient option?
No, assuming “efficiency” includes your own time and effort weighed against the feature set you want to support—Cocoa's text system does a lot for you, which you'd be throwing away if you rolled your own.
Some examples:
Undo support
Advanced editing (emacs keys)
Support for input managers/input methods
Support for all of Unicode
Mouse selection
Keyboard selection
Multiple selection
Fonts
Colors
Images
Sounds
Find
Find and Replace
Spelling-checking
Grammar-checking
Text replacement
Accessibility
If you roll your own, you get to spend months reinventing and debugging some if not most if not all of those wheels. I call that inefficient.
The text system you already have, meanwhile, is fast nearly all of the time. You need huge texts with long lines (or maybe lots of embedded images/sounds) to bog it down.
Keep in mind that I plan to support tabs so there can be many editors open at the same time.
Unless the user is going to be typing into all of them at once, I don't see how that will cause a performance problem. 0% CPU × N or N-1 views = 0% CPU.
The one place where you might have a problem is memory usage, if the documents are both many and large. They'd have to be both in the extreme, as even a modest Mac nowadays has 1 GiB of RAM, and text doesn't weigh much.
If that's the case, then you could keep only the N most recently used unmodified texts in memory, and otherwise remember only the arrays of selection ranges. But 99% of the time, swapping texts in and out will be far more expensive than just leaving them all in memory.
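For what it's worth, a language-neutral sketch of that "keep only the N most recently used texts" idea (shown in C++ since it is not Cocoa-specific; all names are made up for illustration):

```cpp
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy LRU cache for unmodified document texts. Purely illustrative;
// none of these names come from Cocoa.
class TextCache {
public:
    explicit TextCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the text for 'path', reloading it if it was evicted.
    const std::string& get(const std::string& path) {
        auto it = index_.find(path);
        if (it != index_.end()) {
            // Already cached: move to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
        } else {
            lru_.emplace_front(path, loadFromDisk(path));
            index_[path] = lru_.begin();
            if (lru_.size() > capacity_) {
                // Evict the least recently used text.
                index_.erase(lru_.back().first);
                lru_.pop_back();
            }
        }
        return lru_.front().second;
    }

private:
    static std::string loadFromDisk(const std::string& path) {
        return "(contents of " + path + ")";  // stand-in for real file I/O
    }

    std::size_t capacity_;
    std::list<std::pair<std::string, std::string>> lru_;  // front = newest
    std::unordered_map<std::string,
        std::list<std::pair<std::string, std::string>>::iterator> index_;
};
```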
NSTextView is probably the simplest way to go if you want to get a ton of nice features for free. It can't do everything, but it's an awesome start.
For my job I've been using a Java version of ARToolkit (NyARToolkit). So far it has proven good enough for our needs, but my boss is starting to want the framework ported to other platforms such as the web (Flash, etc.) and mobile. While I suppose I could use other ports, I'm increasingly annoyed at not knowing how the kit works, and beyond that, by some of its limitations. Later I'll also need to extend the kit's abilities to add things like interaction (virtual buttons on cards, etc.), which as far as I've seen aren't supported in NyARToolkit.
So basically, I need to replace ARToolkit with a custom marker detector (and, in the case of NyARToolkit, try to get rid of JMF and use a better solution via JNI). However, I don't know how these detectors work. I know about 3D graphics and have built a nice framework around it, but I need to know how to build the underlying tech :-).
Does anyone know any sources on how to implement a marker-based augmented-reality application from scratch? When searching on Google I only find "applications" of AR, not the underlying algorithms :-/.
'From scratch' is a relative term. Truly doing it from scratch, without using any pre-existing vision code, would be very painful and you wouldn't do a better job of it than the entire computer vision community.
However, if you want to do AR with existing vision code, this is more reasonable. The essential sub-tasks are:
Find the markers in your image or video.
Make sure they are the ones you want.
Figure out how they are oriented relative to the camera.
The first task is keypoint localization. Techniques for this include SIFT keypoint detection, the Harris corner detector, and others. Some of these have open-source implementations; I think OpenCV has the Harris corner detector in the function GoodFeaturesToTrack.
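For a concrete starting point, a hedged sketch of that first task using OpenCV's modern C++ API (cv::goodFeaturesToTrack in the Harris configuration; the image path is a placeholder):

```cpp
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <cstdio>
#include <vector>

int main()
{
    // Corner detection works on a single-channel image.
    cv::Mat gray = cv::imread("marker.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    std::vector<cv::Point2f> corners;
    cv::goodFeaturesToTrack(gray, corners,
                            100,            // max corners to return
                            0.01,           // quality relative to best corner
                            10.0,           // min distance between corners, px
                            cv::noArray(),  // no mask
                            3,              // block size
                            true);          // use the Harris detector

    std::printf("found %zu corners\n", corners.size());
    return 0;
}
```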
The second task is making region descriptors. Techniques for this include SIFT descriptors, HOG descriptors, and many many others. There should be an open-source implementation of one of these somewhere.
The third task is also done by keypoint localizers. Ideally you want an affine transformation, since this will tell you how the marker is sitting in 3-space. The Harris affine detector should work for this. For more details go here: http://en.wikipedia.org/wiki/Harris_affine_region_detector
All you Stackoverflowers,
I was wondering why GUI code is responsible for sucking away many, many CPU cycles. In principle, the graphical rendering is far less complex than Doom (although most corporate GUIs will introduce lots of window dressing). The event-handling layer also seems to be a heavy cost; however, it seems that a well-written implementation should switch between contexts efficiently on modern processors with lots of memory/cache.
If anybody has run a profiler on their big GUI application, or a common API itself, I'm interested in where the bottlenecks lie.
Possible explanations (that I imagine) may be:
High levels of abstraction between hardware and application interface
Lots of levels of indirection to the correct code to execute
Low priority (compared to other processes)
Misbehaving applications flooding API with calls
Excessive object orientation?
Outright poor design choices in the API (not just issues, but design philosophy)
Some GUI frameworks are much better than others, so I'd like to hear varied perspectives. For example, the Unix/X11 system is much different from Windows and even from WinForms.
Edit: Now a community wiki - go for it. I have one more thing to add: I'm an algorithms guy in school and would be interested to know whether there are inefficient algorithms in GUI code and what they are. Then again, it's probably just the implementation overhead.
I've no idea generally, but I'd like to add another item to your list - font rendering and calculations. Finding vector glyphs in a font and converting them to bitmap representations with anti-aliasing is no small task. And often it needs to be done twice - first to calculate the width/height of the text for positioning, and then actually drawing the text at the right coordinates.
Also, most drawing code today relies on clipping mechanisms to update just a part of the GUI. So, if just one part needs to be redrawn, the code actually redraws the whole window behind the scenes, and then takes just the needed part to actually update.
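On Win32 specifically, the invalid region is handed to the application, so a paint handler can cooperate with the clipping instead of redrawing everything. A minimal sketch, where PaintRegion is a hypothetical application function:

```cpp
#include <windows.h>

// Hypothetical: draws only the given rectangle of the client area.
void PaintRegion(HDC hdc, const RECT* area);

// WM_PAINT handler: ps.rcPaint is the dirty rectangle the system computed.
// Drawing outside it is clipped anyway, so skipping that work saves CPU.
LRESULT OnPaint(HWND hwnd)
{
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);
    PaintRegion(hdc, &ps.rcPaint);   // repaint only what was invalidated
    EndPaint(hwnd, &ps);
    return 0;
}
```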
Added:
In the comments I found this:
I'm also very interested in this. It can't be that the gui is rendered using only the cpu because if you don't have proper drivers for your gfx-card, desktop graphics render incredibly slow. If you have gfx-drivers however desktop-gfx go kinda fast but never as fast as a directx/opengl app.
Here's the deal as I understand it: every graphics card out there today supports a generic interface for drawing. I'm not sure if it's called "VESA", "SVGA", or if those are just old names from the past. Anyway, this interface involves doing everything through interrupts; for every pixel there is an interrupt call, or something like that. The proper VGA driver, however, is able to take advantage of DMA and other enhancements that make the whole process WAY less CPU-intensive.
Added 2: Ah, and for OpenGL/DirectX: that's another feature of today's graphics cards. They are optimized for 3D operations in exclusive mode, which is where the speed comes from. The normal GUI just utilizes basic 2D drawing procedures, so it has to send the contents of the whole screen every time it wants an update. 3D applications, however, send a bunch of textures and triangle definitions to the VRAM (video RAM) and then just reuse them for drawing. They just say something like "take triangle set #38 with texture set #25 and draw them". All these things are cached in the VRAM, so this is again way faster.
I'm not sure, but I would suspect that the modern 3D-accelerated GUIs (Vista Aero, Compiz on Linux, etc.) also take advantage of this. They could send common bitmaps to the video card up front and then just reuse them directly from the VRAM. Any application-drawn surfaces, however, would still need to be sent directly every time an update occurs.
Added 3: More ideas. :) The modern GUIs for Windows, Linux, etc. are widget-oriented (that's control-oriented for Windows speakers). The problem with this is that each widget has its own drawing code and associated drawing surface (more or less). When the window needs to be redrawn, it calls the drawing code for all its child widgets, which in turn call the drawing code for their child widgets, and so on. Every widget redraws its whole surface, even though some of it is obscured by other widgets. With the above-mentioned clipping techniques some of this drawn information is immediately discarded to reduce flickering and other artifacts. But it's still lots of manual drawing code that includes bitmap blitting, stretching, skewing, drawing lines, text, flood-filling, etc., and all of this gets translated into a series of putpixel calls that get filtered through clipping filters/masks and other stuff. Ah, yes, and alpha blending has also become popular today for nice effects, which means even more work. So... yes, you could say this is because of lots of abstraction and indirection. But... could you really do it any better? I don't think so. Only 3D techniques might help, because they take advantage of the GPU for alpha calculations and clipping.
Let's begin by saying that writing libraries is much harder than writing stand-alone code. The requirement that your abstraction be reusable in as many contexts as possible, including contexts which you haven't thought of yet, makes the task challenging even for experienced programmers.
Amongst libraries, writing a GUI toolkit library is a famously difficult problem. This is because the programs which use GUI libraries range over a very wide variety of domains with very different needs. Mr Why and Martin DeMollo discussed the requirements placed on GUI libraries a little while ago.
Writing GUI widgets themselves is difficult because computer users are very sensitive to minute details of the behavior of the interface. Non-native widgets never feel right, do they? In order to get non-native widgets right -- in order to get any widget right, in fact -- you need to spend an inordinate amount of time tweaking the details of the behavior.
So, GUIs are slow because of the inefficiencies introduced by the abstraction mechanisms used to create highly reusable components, added to the shortness of time available to optimize the code once so much time has been spent just getting the behavior right.
Uhm, that's quite a lot.
The simplest but probably most obvious answer is that the programmers behind these GUI apps are really bad programmers. You can go a long way in writing code which does the most bizarre things and it will be faster, but few people seem to care how to do this, or they deem it an expensive, non-profitable waste of effort.
To set things straight: off-loading computations to the GPU won't necessarily fix any problems. The GPU is just like the CPU except that it's less general-purpose and more of a data-parallel processor. It can do graphics computations exceptionally well. Whatever graphics API/OS and driver combination you have doesn't really matter that much... well, OK, with Vista as an example, they changed the desktop composition engine. The new engine is far better at compositing only what has changed, and since the number one bottleneck for GUI apps is redrawing, that is a neat optimization strategy: virtualize your computational needs and update only the smallest change every time.
Win32 sends WM_PAINT messages to windows when they need to be redrawn; this can be a result of windows occluding each other. However, it's up to the window itself to figure out what has actually changed. More often than not nothing changed, or the change was trivial enough that it could simply have been performed on top of whatever topmost surface you had.
This kind of graphics handling doesn't necessarily exist today. I would say that people have refrained from writing really efficient, virtualizing rendering solutions because the benefit/cost ratio is rather bad.
Something Windows Presentation Foundation (WPF) does, which I think is far superior to most other GUI APIs, is that it splits layout updates and rendering updates into two separate passes. And while WPF is managed code, the rendering engine is not. What happens is that the managed WPF rendering engine builds a command queue (this is what DirectX and OpenGL do), which is then handed off to the native rendering engine. What's a bit more elegant here is that WPF will then try to retain any computation that didn't change the visual state: a trick, if you will, where you avoid costly rendering calls for things that don't have to be rendered (virtualizing).
In contrast to WM_PAINT, which tells a Win32 window to repaint itself, a WPF app would check which parts of the window require repainting and repaint only the smallest change.
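As a toy, language-agnostic illustration of that retained "command queue plus dirty flag" idea (this is just the shape of the technique, not WPF's actual engine):

```cpp
#include <functional>
#include <vector>

// Toy retained-mode display list: record drawing commands once, replay them
// cheaply every frame, and re-record only when the visual state changes.
class DisplayList {
public:
    void markDirty() { dirty_ = true; }

    // 'build' re-records the command list; it runs only when needed.
    void render(const std::function<void(std::vector<std::function<void()>>&)>& build) {
        if (dirty_) {
            commands_.clear();
            build(commands_);   // expensive: layout plus recording
            dirty_ = false;
        }
        for (const auto& cmd : commands_)
            cmd();              // cheap: replay the retained commands
    }

private:
    std::vector<std::function<void()>> commands_;
    bool dirty_ = true;
};
```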
Now, WPF is not supreme; it's a solid effort from Microsoft, but it's not the holy grail yet... the code which runs the pipeline could still be improved, and the memory footprint of any managed app is still more than I would want. But I hope this is the kind of answer you are looking for.
WPF is able to do some things asynchronously rather decently, which is a huge deal if you want to make a really responsive, low-latency/low-CPU UI. Asynchronous operations are more than just off-loading work onto a different thread.
To summarize: a slow and expensive GUI means too much repainting, and the kind of repainting that is very expensive, i.e. of the entire surface area.
It does to some degree depend on the language. You might have noticed that Java and REALbasic applications are a fair bit slower than their C-based (C++, C#, Objective-C) counterparts.
However, GUI applications are much more complex than command-line apps. A Terminal window needs only to draw a simple window that doesn't support buttons.
There are also multiple loops for extra inputs and features.
I think that you can find some interesting thoughts on this topic in "Window System Design: If I had it to do over again in 2002" by James Gosling (the Java guy, also known for his work on pre-X11 windowing systems). Available online here [PDF].
The article focuses on the positive side (how to make it fast), not on the negative side (what's making it slow), but it is still a good read on the topic.