What do CS_BYTEALIGNCLIENT and CS_BYTEALIGNWINDOW mean? - winapi

I have trouble understanding these two class styles. The docs say that they align the window on a byte boundary, but I don't understand what that means.
I have tried using them and, yes, the position of the window upon creation is different, but what they do and why I would use them is still unclear to me.

What do they do and why would I use them?
With modern display technology and GPUs, they (probably) do very little in terms of performance.
In older times, though, a (potentially slow) CPU would need to write blocks of RAM directly to display memory. Where a display and/or bitmap had a "colour depth" of less than one byte per pixel – monochrome (1 bit-per-pixel) or low colour (say, 4 bpp) – a window or its client area could be positioned so that its rows did not start on an actual byte boundary. Block-copy operations (like BitBlt) then became very slow, because the first few pixels of each row had to be set by manipulating individual bits of display memory using some of the bits from the first bytes of the source (RAM), and that extra work was repeated for every row.
Forcing the display (be it the client area or the entire window) to have its x-origin (those flags/styles only affect the x-position) aligned to a true byte boundary allows much faster copying, because there is then a direct correspondence between bytes in the source (RAM) and bytes in the target (display memory): each row can be block-copied (with something akin to memcpy) without any manipulation of individual bits from different bytes.
As a vague analogy, consider the difference (in speed and simplicity) between: (a) copying one array of n bytes to another of the same size; and (b) replacing each byte in the second array with the combination of the lower 4 bits of one source element with the higher 4 bits of the following source element.
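
To make the difference concrete, here is a minimal sketch of copying one row of a 1-bpp bitmap; MSB-first pixel packing is assumed and the function name and parameters are illustrative only:

#include <stdint.h>
#include <string.h>

/* Copy one row of a 1-bpp bitmap to a destination whose x-origin may or
   may not fall on a byte boundary (dst_bit_offset = destination x mod 8). */
void copy_row_1bpp(uint8_t *dst, const uint8_t *src,
                   int dst_bit_offset, int n_bytes)
{
    if (dst_bit_offset == 0) {
        /* Byte-aligned destination: one straight block copy per row. */
        memcpy(dst, src, (size_t)n_bytes);
    } else {
        /* Unaligned destination: every destination byte is stitched together
           from bits of two neighbouring source bytes, and the row spills
           into n_bytes + 1 destination bytes. */
        uint8_t keep = (uint8_t)(0xFF >> dst_bit_offset); /* low bits of each dst byte */
        for (int i = 0; i < n_bytes; i++) {
            uint8_t hi = (uint8_t)(src[i] >> dst_bit_offset);
            uint8_t lo = (uint8_t)(src[i] << (8 - dst_bit_offset));
            dst[i]     = (uint8_t)((dst[i]     & (uint8_t)~keep) | hi);
            dst[i + 1] = (uint8_t)((dst[i + 1] & keep)           | lo);
        }
    }
}

The aligned branch is the fast path the style buys you; the unaligned branch is the per-byte shifting and masking described above.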

From Why did Windows 95 keep window coordinates at multiples of 8? by Raymond Chen:
The screen itself is a giant bitmap, and this means that copying data to the screen goes much faster if the x-coordinate of the destination resides on a full byte boundary. And the most common x-coordinate is the left edge of a window’s contents (known as its client area).
Applications could request that Windows position their windows so that their client area began at these advantageous coordinates by setting the CS_BYTEALIGNCLIENT style in their window class. And pretty much all applications did this because of the performance benefit it produced.
So what happened after Windows 95 that made this optimization go away?
Oh, the optimization is still there. You can still set the CS_BYTEALIGNCLIENT style today, and the system will honor it.
The thing that changed wasn’t Windows. The thing that changed was your video card.
In the Windows 95 era, predominant graphics cards were the VGA (Video Graphics Array) and EGA (Enhanced Graphics Adapter). Older graphics cards were also supported, such as the CGA (Color Graphics Adapter) and the monochrome HGC (Hercules Graphics Card).
All of these graphics cards had something in common: They used a pixel format where multiple pixels were represented within a single byte,¹ and therefore provided an environment where byte alignment causes certain x-coordinates to become ineligible positions.
Once you upgraded your graphics card and set the color resolution to “256 colors” or higher, every pixel occupies at least a full byte,² so the requirement that the x-coordinate be byte-aligned is vacuously satisfied. Every coordinate is eligible.
Nowadays, all graphics cards use 32-bit color formats, and the requirement that the coordinate be aligned to a byte offset is satisfied by all x-coordinates.³ The multiples of 8 are no longer special.

Related

Fast subrects from layered image

I have a 2D raster upon which are layered from 1 to, say, 20 other 2D rasters (with arbitrary sizes and offsets). I'm searching for a fast way to access a sub-rectangle view (also with arbitrary size and offset). The view should return all the layered pixels for each X and Y coordinate.
I guess this is kind of how, say, GIMP or other 2D paint apps draw layers upon each other, except that I want all the pixels on top of each other, not just the projection where the top pixel hides the ones below it.
I have met this problem before and I still face it now. I have already spent a lot of time searching the internet and this site for similar issues, but can't find any. I will describe two possible solutions, neither of which I'm satisfied with:
Have what is basically a 3D array of pre-allocated size. This is easy to manage, but the wasted storage and memory overhead are really big. For a 4K raster with, say, 16 slots of 4 bytes each, that is something like 1 GiB of memory, and in my application most of that space will be wasted, not used.
My earlier solution: have two 2D arrays, one with indices, the other with the actual values. Each "pixel" of the first one says in which range of pixels in the second array you can find the actual pixels contributed by all layers. This is well compressed in size, but every request bounces between two memory regions, and it is a bit of a hassle to set up, not to mention to update (a nice-to-have feature, but not mandatory).
So... any know-how on this kind of problem? Thank you in advance!
Forgot to add that I'm targeting a self-sufficient, preferably single-threaded, CPU-only solution. The layers will most likely be greyscale with alpha (that is, certain pixel data may not exist). The lookup operation is the priority; updates like adding/removing a layer can be slower.
Added by Mark (see comment):
In that image, if the top-left corner of the red rectangle is taken, a lookup should report red, green, blue and black. If the bottom-right corner is taken, it should report red and black only.
I would store the offsets and sizes in a data structure separate from the pixel data. That way you do not jump around in memory while you calculate the relative coordinates for each layer (or decide whether you can ignore some layers entirely).
If you want to access single pixels or small areas rather than iterate over big areas, a quad-tree might be a good way to store your data, since it keeps memory accesses local when you touch pixels or areas that are near each other (in the x or y direction).
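
As a minimal sketch of that idea (names and the pixel type are illustrative only, not a fixed design): keep a small per-layer record with offset and size, test it first, and only then touch the layer's pixel data.

#include <stdint.h>
#include <stddef.h>

/* Per-layer metadata kept separately from the pixel data. */
typedef struct {
    int x, y;               /* offset of the layer within the base raster */
    int w, h;               /* layer size */
    const uint16_t *pixels; /* e.g. greyscale + alpha, row-major */
} layer_t;

/* Collect every layer's pixel at raster coordinate (px, py).
   Returns the number of values written to 'out' (at most max_out). */
size_t lookup(const layer_t *layers, size_t n_layers,
              int px, int py, uint16_t *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < n_layers && n < max_out; i++) {
        const layer_t *L = &layers[i];
        int lx = px - L->x, ly = py - L->y;
        if (lx < 0 || ly < 0 || lx >= L->w || ly >= L->h)
            continue;       /* this layer does not cover (px, py) */
        out[n++] = L->pixels[(size_t)ly * (size_t)L->w + (size_t)lx];
    }
    return n;
}

For a sub-rectangle view you would run the bounds test once per layer against the whole rectangle, then iterate only the layers that intersect it.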

How to draw pixel on screen without BIOS?

I am writing an OS and want to have a GUI. I can't find a good tutorial for drawing pixels on the screen.
I'd like to have some assembly + C example which I can build and run in an emulator like Bochs or v86.
The basic idea is:
1) The bootloader uses firmware (VBE on BIOS; GOP or UGA on UEFI) to set a graphics mode that is supported by the monitor, video card and OS. While doing this it gets the relevant information about the frame buffer from the firmware (physical address of the frame buffer, horizontal and vertical resolution, pixel format, bytes between horizontal lines) and passes it to the OS, so that the OS can use this information during "early initialisation" (before a native video driver is started), and can continue using it (as a kind of "limp mode") if there is no suitable native video driver.
2) The OS uses the information to figure out how to write to the frame buffer. This may be a calculation like physical_address = base_address + y * bytes_between_lines + x * bytes_per_pixel (where bytes_per_pixel is determined from the pixel format).
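As a minimal sketch of step 2 (the structure and field names are illustrative, not a real interface):

#include <stdint.h>

/* Frame buffer information handed over by the boot loader. */
struct framebuffer_info {
    uint8_t  *base;            /* mapping of the frame buffer's physical address */
    uint32_t  pitch;           /* bytes between horizontal lines */
    uint32_t  bytes_per_pixel; /* derived from the pixel format */
};

static void put_pixel(const struct framebuffer_info *fb,
                      uint32_t x, uint32_t y, uint32_t color)
{
    uint8_t *p = fb->base + y * fb->pitch + x * fb->bytes_per_pixel;

    /* Assumes a 32-bpp pixel format; other formats need a conversion step. */
    *(volatile uint32_t *)p = color;
}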
Notes for "early initialisation":
for performance reasons, it's better to draw everything in a buffer in RAM and then copy ("blit") the data from the buffer in RAM to the frame buffer.
for performance reasons, the code to copy ("blit") the data from the buffer in RAM to the frame buffer can/should use some tricks to avoid copying data that didn't change since last time
to support many different pixel formats, it's possible to use a "standard" pixel format for the buffer in RAM (e.g. maybe "8-bit red, 8-bit green, 8-bit blue, 8-bit padding") and convert that to whichever pixel format the video card happens to want (e.g. maybe "5-bit blue, 6-bit green, 5-bit red, no padding") while copying data from the buffer in RAM to the frame buffer. This allows you to have a single version of all the functions to draw things (characters, lines, rectangles, icons, ...) instead of having multiple different versions of many different functions (one for each possible pixel format).
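A minimal sketch of that conversion step, assuming an 8:8:8:8 RAM buffer (red in bits 16-23) and a 16-bpp 5:6:5 frame buffer; both layouts are just examples:

#include <stdint.h>
#include <stddef.h>

/* Convert one 32-bit buffer pixel to the 16-bpp format the card wants. */
static inline uint16_t xrgb8888_to_565(uint32_t px)
{
    uint16_t r = (px >> 16) & 0xFF;
    uint16_t g = (px >> 8)  & 0xFF;
    uint16_t b =  px        & 0xFF;
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Copy the RAM buffer to the frame buffer, converting as we go. */
void blit_convert(volatile uint16_t *fb, size_t fb_pitch_pixels,
                  const uint32_t *ram, size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            fb[y * fb_pitch_pixels + x] = xrgb8888_to_565(ram[y * width + x]);
}

With this in place, all drawing functions only ever touch the 32-bit RAM buffer.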
Notes for "middle initialisation":
eventually the OS will try to find and start suitable device drivers for all the different devices. This includes trying to find a suitable driver for video card/s (e.g. that supports things like vertical sync, GPU, GPGPU, etc).
you will need to design a video driver interface that native video drivers can use that (ideally) supports modern features (e.g. full 3D graphics and shaders maybe).
when there is no native video driver, the OS can/should start a "generic frame buffer" driver that implements the same video driver interface (that was designed to support hardware acceleration) that does everything in software without the benefit of hardware acceleration.
when video driver/s are started, the OS needs to have some kind of "hand off" where ownership of the frame buffer is passed from the earlier boot code to the video driver. After this "hand off" the earlier boot code (which was designed to draw things directly to the frame buffer) should not touch the frame buffer and should ask the video driver to do the "convert pixel data and copy to frame buffer" work.
Notes for "after initialisation":
For a traditional "2D GUI"; typically you have one buffer (or "canvas" or "texture" or whatever) for the background/desktop, plus more buffers/canvases for each window or dialog box, and possibly more buffers/canvases for smaller things (e.g. mouse pointer, drop down menus, "widgets", etc); such that applications can modify their buffer/canvas (but are prevented from directly or indirectly accessing any other buffer/canvas for security reasons). Then the GUI tells the video driver where each of these buffers/canvases should be drawn; and the video driver (using hardware acceleration if its a native video driver) combines these pieces together ("composes") to get pixel data for the whole frame, then does the pixel format conversion (using GPU hopefully) to get raw pixel data to display/to send to the monitor. This means various actions (moving windows around the screen, "alt tabbing" between windows, moving the mouse around, etc) become extremely fast when there's a native video driver because the CPU is doing nothing and the video card itself is doing all the work.
ideally there would be a way (e.g. OpenGL) for the application to ask the video driver to draw stuff in the application's buffer/canvas; such that more work can be done by the video card (and not done by the CPU). This is especially important for 3D games, but there's no reason why normal 2D applications can't benefit from using the same approach for 2D graphics.
Note that most beginners do everything wrong (don't have a well designed native video driver interface) and therefore will never have any native video drivers because all their software can't use a native video driver anyway. These people will probably try to convince you that it's not worth the hassle (because in their experience native video drivers won't ever exist). The reality is that most native video drivers are extremely hard to write, but some of them (for virtual machines) aren't hard to write; and your goal should be to allow other people write drivers eventually (by designing suitable interfaces and providing adequate documentation) rather than writing all the drivers yourself.
The top answer did a very good job of explaining. You did ask for some example code, so here's a code snippet from my GitHub; a detailed explanation follows.
1. bios_setup:
2. mov ah, 00h ; function 00h: set video mode
3. mov al, 13h ; mode 13h: 320x200, 256 colors
4. int 10h ; call the BIOS
5. mov ah, 0Ch ; function 0Ch: write graphics pixel
6. mov bh, 0 ; display page 0
7. mov al, 0 ; color 0 (black)
8. mov cx, 0 ; x = 0
9. mov dx, 0 ; y = 0
10. int 10h ; BIOS interrupt
Line 2 is where the fun begins. Firstly, we move the value 0 into the ah register. At line 3, we move 13 hex into al - now we're ready for our BIOS call.
Line 4 calls the BIOS with interrupt vector 10 hex. The BIOS now checks ah and al.
AH:
- tells BIOS to set the video mode
AL:
- tells BIOS which video mode to set (13 hex is 320x200 with 256 colors).
Now that we called the interrupt on line 4, we're ready to move new values into some registers.
At line 5, we put 0C hex into the ah register.
This tells BIOS that we want to write a graphics pixel.
At line 6, we throw 0 into the bh register, which tells BIOS which display page to write the pixel to; page 0 is the one being displayed.
And next all we have to do is set our color. So let's start at 0, which is black.
That's all nice, but where do we want to actually draw this black pixel to?
That's where lines 8-9 come in, where registers cx and dx store the x,y coordinates of the pixel to draw, respectively.
Once they are set, we call the BIOS with interrupt 10 hex. And the pixel is drawn.
After reading Brendan's elaborate and informative answer, this code will make much more sense. Certain values must be in certain registers before calling the BIOS simply because those are the registers the corresponding interrupt routine will check. Everything else is pretty straightforward. If you want another color, simply change the value in al. You want to blit your pixel somewhere else? Mess around with the x and y values in cx and dx. Again, this isn't very efficient for graphics-intensive programs, as it is pretty slow. For educational purposes, however, it beats writing your own graphics driver ;) You can still get some efficiency by drawing everything in a buffer in RAM before blitting to the screen, as Brendan said, but I'd much rather keep it simple in my example.
Check out the full - free - example on my GitHub. I've also included a README and a Makefile, but they are Linux exclusive. If you're running on Windows, some googling will yield any information necessary to assemble the OS into a bootable floppy image, and just about any virtual machine host will do. Also, feel free to ask me about anything that's unclear. Cheers!
PS: I did not write a tool, simply a small program in NASM that is meant to be assembled to a floppy image and run as a kernel (in a VM if you will).

Find updated rectangles in image

I need to find which rectangular regions were updated between two images. E.g., I have these images:
first http://storage.thelogin.ru/stackoverflow/find-updated-rectangles-in-image/1.png second http://storage.thelogin.ru/stackoverflow/find-updated-rectangles-in-image/2.png
ImageMagick's compare tells me these pixels were updated:
compare http://storage.thelogin.ru/stackoverflow/find-updated-rectangles-in-image/3.png
So I need to repaint these regions (I have outlined the first of them):
compare http://storage.thelogin.ru/stackoverflow/find-updated-rectangles-in-image/4.png
Repainting is done over a slow connection (57600 baud), so the number one priority is data size (one byte for the magic word, one byte for the checksum, six bytes for the region coordinates, two bytes for each pixel). Which algorithm can I use to find these regions? I think something like this is used in VNC and similar software.
As far as actually finding the regions which have changed, as ImageMagick has done for you, you can compute a pixel by pixel difference (e.g. XOR). Regions with a difference of 0 have not changed.
It is not clear from your question whether the painting itself is slow or just the transmission of the repainting data. It is also not clear what kind of encoding/decoding can be done on the other end of the transmission. Do you have to send your data as you specified or can you encode it in another way if you wish?
Your data packets have an 8-byte overhead per rectangle ("one byte for magic word, one byte for checksum, six bytes for region coordinates, two bytes for each pixel"). I take it from the two bytes per pixel that the color depth is 16-bit? So, due to the overhead, some of the smallest rectangles you outlined actually cost you more than combining them with other rectangles and resending some data for non-updated regions.
The actual problem of finding rectangles where each has an overhead is analogous to the "Strawberry Fields" hiring problem put forth by ITA Software. The original link is dead, but here is someone's solution with problem description.
At 57600 baud you can send at most 7200 bytes per second (less once start and stop bits are counted), which would be 3600 pixels at two bytes per pixel. As a square, that is a measly 60x60. You've certainly outlined more than that in your example, and this does not count the overhead.
The refresh rate of the monitor on the receiving end also needs to be considered. If the monitor is refreshing 60 times per second and you are only able to send one 60x60 square per second, how will this look?
Things to consider:
reduce color depth
run length encode pixel differences per scan line (see the sketch after this list)
attempt more ambitious compression per region, but watch the overhead
send non-graphic data and let the receiver compute the graphics (e.g. in this example, send the text that has changed, the updated time, etc. and let the receiver draw the progress bar, etc.)
abandon this insanity
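A minimal sketch of the per-scan-line run-length idea from the list above, assuming 16-bit pixels as in the question; the run header format (a skip/copy byte plus a length byte) is made up purely for illustration:

#include <stdint.h>
#include <stddef.h>

/* Encode one scan line as runs of "unchanged, skip N pixels" and
   "changed, copy N raw pixels". Returns the number of bytes produced. */
size_t encode_line(const uint16_t *prev, const uint16_t *cur, size_t width,
                   uint8_t *out)
{
    size_t n = 0, x = 0;
    while (x < width) {
        size_t run = 0;
        if (prev[x] == cur[x]) {                    /* skip run: nothing to send */
            while (x + run < width && run < 255 && prev[x + run] == cur[x + run])
                run++;
            out[n++] = 0x00;                        /* 0x00 = skip */
            out[n++] = (uint8_t)run;
        } else {                                    /* copy run: send raw pixels */
            while (x + run < width && run < 255 && prev[x + run] != cur[x + run])
                run++;
            out[n++] = 0x01;                        /* 0x01 = copy */
            out[n++] = (uint8_t)run;
            for (size_t i = 0; i < run; i++) {      /* little-endian 16-bit pixels */
                out[n++] = (uint8_t)(cur[x + i] & 0xFF);
                out[n++] = (uint8_t)(cur[x + i] >> 8);
            }
        }
        x += run;
    }
    return n;
}

Lines with no changes collapse to two bytes, so this pairs well with only sending the lines (or rectangles) that the difference pass flagged.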

Expanding 8-bit color to 24-bit color

Setup
I have a couple hundred Sparkfun LED pixels (similar to https://www.sparkfun.com/products/11020) connected to an Arduino Uno and want to control the pixels from a PC using the built-in Serial-over-USB connection of the Arduino.
The pixels are individually addressable, each has 24 bits for the color (RGB). Since I want to be able to change the color of each pixel very quickly, the transmission of the data from the PC to the Arduino has to be very efficient (the further transmission of data from the Arduino to the pixels is very fast already).
Problem
I've tried simply sending the desired RGB values directly as-is to the Arduino, but this leads to a visible delay when, for example, I want to turn on all LEDs at the same time. My straightforward idea to minimize the amount of data is to reduce the available colors from 24-bit to 8-bit, which is more than enough for my application.
If I do this, I have to expand the 8-bit values from the PC to 24-bit values on the Arduino to set the actual color on the pixels. The obvious solution here would be a palette that holds all available 8-bit values and the corresponding 24-bit colors. I would like to have a solution without a palette though, mostly for memory space reasons.
Question
What is an efficient way to expand an 8-bit color to a 24-bit one, preferably one that preserves the color information accurately? Are there standard algorithms for this task?
Possible solution
I was considering a format with 2 bits each for R and B and 3 bits for G. These values would be packed into a single byte that would be transmitted to the Arduino and then be unpacked using bit-shifting and interpolated using the map() function (http://arduino.cc/en/Reference/Map).
Any thoughts on that solution? What would be a better way to do this?
R2B2G3 would give you very few colors (there's actually one more bit left). I don't know if it would be enough for your application. You can use a dithering technique to make 8-bit images look a little better.
Alternatively, if you have any preferred set of colors, you can store known palette on your device and never send it over the wire. You can also store multiple palettes for different situations and specify which one to use with small integer index.
On top of that it's possible to implement some simple compression algorithm like RLE or LZW and decompress after receiving.
And there are some very fast compression libraries with small footprint you can use: Snappy, miniLZO.
Regarding your question “What would be a better way to do this?”, one of the first things to do (if not yet done) is increase the serial data rate. An Arduino Forum suggests using 115200 bps as a standard rate, and trying 230400 bps. At those rates you would need to write the receiving software so it quickly transfers data from the relatively small receive buffer into a larger buffer, instead of trying to work on the data out of the small receive buffer.
A second possibility is to put activation times into your data packets. Suppose F1, F2, F3... are a series of frames you will display on the LED array. Send those frames from the PC ahead of time, or during idle or wait times, and let the Arduino buffer them until they are scheduled to appear. When the activation time arrives for a given frame, have the Arduino turn it on. If you know in advance the frames but not the activation times, send and buffer the frames and send just activation codes at appropriate times.
Third, you can have multiple palettes and dynamic palettes that change on the fly and can use pixel addresses or pixel lists as well as pixel maps. That is, you might use different protocols at different times. Protocol 3 might download a whole palette, 4 might change an element of a palette, 5 might send a 24-bit value v, a time t, a count n, and a list of n pixels to be set to v at time t, 6 might send a bit map of pixel settings, and so forth. Bit maps can be simple 1-bit-per-pixel maps indicating on or off, or can be k-bits-per-pixel maps, where a k-bit entry could specify a palette number or a frame number for a pixel. This is all a bit vague because there are so many possibilities; but in short, define protocols that work well with whatever you are displaying.
Fourth, given the ATmega328P's small (2KB) RAM but larger (32KB) flash memory, consider hard-coding several palettes, frames, and macros into the program. By macros, I mean routines that generate graphic elements like arcs, lines, open or filled rectangles. Any display element that is known in advance is a candidate for flash instead of RAM storage.
Your (2, 3, 2) bit idea is used "in the wild." It should be extremely simple to try out. The quality will be pretty low, but try it out and see if it meets your needs.
It seems unlikely that any other solution could save much memory compared to a 256-color lookup table, if the lookup table stays constant over time. I think anything successful would have to exploit a pattern in the kind of images you are sending to the pixels.
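To make the packing/unpacking concrete, here is a minimal sketch of the question's R:2 G:3 B:2 idea; the bit layout (red in the top bits, one bit unused) is an arbitrary choice for illustration:

#include <stdint.h>

/* PC side: pack 24-bit RGB into one byte (R:2 G:3 B:2, bit 7 unused). */
uint8_t pack_rgb232(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)(((r >> 6) << 5) | ((g >> 5) << 2) | (b >> 6));
}

/* Arduino side: expand the byte back to 24-bit RGB,
   scaling each small field to 0..255, like map(v, 0, max, 0, 255). */
void unpack_rgb232(uint8_t p, uint8_t *r, uint8_t *g, uint8_t *b)
{
    *r = (uint8_t)(((p >> 5) & 0x03) * 255 / 3);
    *g = (uint8_t)(((p >> 2) & 0x07) * 255 / 7);
    *b = (uint8_t)(( p       & 0x03) * 255 / 3);
}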
Any way you look at it, what you're really going for is image compression. So, I would recommend looking at the likes of PNG and JPG compression, to see if they're fast enough for your application.
If not, then you might consider rolling your own. There's only so far you can go with per-pixel compression; size-wise, your (2,3,2) idea is about as good as you can expect to get. You could try a quadtree-type format instead: take the average of a 4-pixel block, transmit a compressed (lossy) representation of the differences, then apply the same operation to the half-resolution image of averages...
As others point out, dithering will make your images look better at (2,3,2). Perhaps the easiest way to dither for your application is to choose a different (random or quasi-random) fixed quantization threshold offset for each color of each pixel. Both the PC and the Arduino would have a copy of this threshold table; the distribution of thresholds would prevent posterization, and the Arduino-side table would help maintain accuracy.
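A minimal sketch of that shared-threshold idea; the 4x4 table below is the classic ordered-dither pattern, used here only as an example of a table both sides could hold:

#include <stdint.h>

/* Per-pixel threshold offsets (0..63), identical on the PC and the Arduino. */
static const uint8_t threshold[4][4] = {
    {  0, 32,  8, 40 },
    { 48, 16, 56, 24 },
    { 12, 44,  4, 36 },
    { 60, 28, 52, 20 },
};

/* Quantize an 8-bit channel down to 'bits' bits with dithering. */
uint8_t quantize(uint8_t value, int bits, int x, int y)
{
    int step = 256 >> bits;                      /* size of one quantization step */
    int v = value + threshold[y & 3][x & 3] * step / 64;
    if (v > 255) v = 255;
    return (uint8_t)(v >> (8 - bits));
}

The PC would run quantize() per channel before packing; since the Arduino knows the same table, it could bias its expansion the same way to help maintain accuracy.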

How are GUIs drawn?

How do people make GUIs? I mean the basic building blocks or principles they use to draw visual components on the screen, as in KDE, GNOME, etc. Are there any simple examples of how to draw something like a rectangle on the screen by directly dealing with the hardware?
I am using a PC for those who are asking about my platform.
Well okay, let's start at the bottom. You have a monitor that displays an image. This image is a matrix of pixels, say, 1600x1200 pixels with 24-bit depth.
The monitor knows what to display from the video adapter. The video adapter knows what to display through the "frame buffer", which is a big block of memory that - in this example - contains 1600 * 1200 pixels, usually with 32 bits per pixel in contemporary cards.
The frame buffer is often accessible to the CPU as a big block of memory that it can poke into directly, and some adapters have GPUs that allow for things like rendering stuff into the frame buffer, like shaded textured triangles, so the CPU just sends commands through a "command buffer", telling it what to draw and where.
Then you have the operating system, which loads a hardware driver that communicates with the video adapter.
The operating system usually offers functions to draw into the frame buffer. Win32, for example, has lots of functions like BitBlt, LineTo, TextOut, etc. These end up talking to the driver.
Then you have something like Java, that renders its own graphics, typically using functions provided by the operating system.
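As a small illustration of drawing through those OS-provided functions, here is a minimal Win32 GDI sketch, assuming it is called from a window's WM_PAINT handler:

#include <windows.h>

/* Draw a filled rectangle and some text via GDI; the driver does the rest. */
void paint(HWND hwnd)
{
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);   /* device context for the client area */

    RECT r = { 10, 10, 110, 60 };      /* left, top, right, bottom */
    FillRect(hdc, &r, (HBRUSH)GetStockObject(LTGRAY_BRUSH));
    TextOutA(hdc, 15, 20, "Hello, GDI", 10);

    EndPaint(hwnd, &ps);
}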
The simple answer is bitmaps; in fact, this also applied to fonts on terminals in the early days.
The original GUIs, things like Xerox PARC's Alto GUI, were based on bitmap displays, and the graphics were drawn with simple bitmap drawing tools and graphics libraries, using simple geometry to determine shapes like circles, squares and rectangles, and then mapping them to display pixels.
Today's GUIs are the same, except with additional software and hardware that have sped up and improved the process and the performance of these GUIs.
The fundamental mapping of bits (e.g. 10101010) to pixels depends on the display hardware, but at a simplistic level you would provide a display buffer in memory and simply populate its bytes with the display data.
So for a basic monochrome bitmap, you'd draw it by providing the bits that represent the shape you want to draw. For example, here is a simple 8x8-pixel button:
01111110
10000001
10000001
10111101
10111101
10000001
10000001
01111110
Which you can see more easily if I render it with # and SPACE instead of 1 and 0:
 ######
#      #
#      #
# #### #
# #### #
#      #
#      #
 ######
Which as a bitmap image would look like this : http://i.stack.imgur.com/i7lVQ.png (I know it's a bit small :) but this is the sort of scale we would've begun at, when GUI's were first developed.)
If you had a more complex display (e.g. 24-bit color), you'd provide each pixel as a 24-bit number.
Obviously some bitmaps cannot be drawn by hand like we've done above (for example the border of a window); this is where geometry comes in handy: we can use simple functions to determine the pixel values required to draw a rectangle, or any other simple shape, and then build from there.
Once you are able to draw graphics in this way on a display, you then hook a drawing loop onto a system interrupt to keep the display up to date (you redraw the display very often, depending on your system's performance). This way you can make it handle interaction from user devices, e.g. a mouse.
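A minimal sketch of that "geometry to pixels" step, assuming a 32-bpp display buffer laid out row by row (names are illustrative):

#include <stdint.h>

/* Fill an axis-aligned rectangle directly into an in-memory display buffer. */
void fill_rect(uint32_t *buffer, int buf_width, int buf_height,
               int x, int y, int w, int h, uint32_t color)
{
    for (int row = y; row < y + h; row++) {
        if (row < 0 || row >= buf_height)
            continue;                              /* clip vertically */
        for (int col = x; col < x + w; col++) {
            if (col < 0 || col >= buf_width)
                continue;                          /* clip horizontally */
            buffer[row * buf_width + col] = color;
        }
    }
}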
Back in the early days, even before Xerox PARC / Alto, there were a number of early computer systems which had vector-based displays; these would make up an image by drawing lines on a CRT representation of a Cartesian plane. However, these displays never saw mainstream use, except perhaps in some early video games, like Asteroids and Tempest.
You probably need a graphics library such as, for example, OpenGL.
For direct hardware interaction, you probably need to do something like assembly, which is completely computer specific.
If you are willing to look through a lot of source code, you might look at Mesa 3D, an open source implementation of the OpenGL specification.
