I am writing an OS and want to have a GUI, but I can't find a good tutorial on drawing pixels to the screen.
I'd like an assembly + C example that I can build and run in an emulator such as Bochs or v86.
The basic idea is:
1) The bootloader uses the firmware (VBE on BIOS, GOP or UGA on UEFI) to set a graphics mode that is supported by the monitor, the video card and the OS. While doing this it gets the relevant information about the frame buffer from the firmware (physical address of the frame buffer, horizontal and vertical resolution, pixel format, bytes between horizontal lines) and passes it to the OS, so that the OS can use this information during "early initialisation" (before a native video driver is started) and can keep using it as a kind of "limp mode" if there is no suitable native video driver.
2) The OS uses this information to figure out how to write to the frame buffer. This is a calculation like physical_address = base_address + y * bytes_between_lines + x * bytes_per_pixel (where bytes_per_pixel is determined from the pixel format), as in the sketch below.
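For example, a minimal C sketch of that calculation; the struct and field names are made up for illustration, and it assumes the firmware reported a 32-bpp pixel format and that the frame buffer has been mapped somewhere the kernel can write to:

    #include <stdint.h>

    struct fb_info {
        uint8_t  *base;             /* frame buffer address (already mapped in) */
        uint32_t  width;            /* horizontal resolution in pixels */
        uint32_t  height;           /* vertical resolution in pixels */
        uint32_t  pitch;            /* bytes between the start of two horizontal lines */
        uint32_t  bytes_per_pixel;  /* derived from the pixel format, e.g. 4 for 32 bpp */
    };

    static void put_pixel(struct fb_info *fb, uint32_t x, uint32_t y, uint32_t colour)
    {
        /* physical_address = base_address + y * bytes_between_lines + x * bytes_per_pixel */
        uint8_t *p = fb->base + (uint64_t)y * fb->pitch + (uint64_t)x * fb->bytes_per_pixel;
        *(uint32_t *)p = colour;    /* assumes a 32-bpp format such as 8:8:8:8 */
    }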
Notes for "early initialisation":
- For performance reasons, it's better to draw everything into a buffer in RAM and then copy ("blit") the data from that buffer to the frame buffer.
- Also for performance, the code that blits from the RAM buffer to the frame buffer can/should use tricks to avoid copying data that hasn't changed since last time.
- To support many different pixel formats, you can use a "standard" pixel format for the buffer in RAM (e.g. maybe "8-bit red, 8-bit green, 8-bit blue, 8-bit padding") and convert it to whichever pixel format the video card happens to want (e.g. maybe "5-bit blue, 6-bit green, 5-bit red, no padding") while copying the data from the RAM buffer to the frame buffer. This lets you have a single version of every drawing function (characters, lines, rectangles, icons, ...) instead of multiple versions of many different functions, one per pixel format; a sketch of such a conversion blit follows these notes.
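Here is a rough sketch of such a "convert while blitting" routine in C, assuming the RAM buffer is always 0x00RRGGBB (8:8:8:8) and that the card happens to want a 16-bit 5:6:5 format; the exact bit positions would come from the pixel-format information the firmware reported:

    #include <stdint.h>

    /* One routine like this per hardware pixel format is all you need;
     * every drawing function only ever targets the RAM buffer. */
    static void blit_convert_565(const uint32_t *src, uint16_t *dst,
                                 uint32_t width, uint32_t height,
                                 uint32_t src_pitch_pixels, uint32_t dst_pitch_pixels)
    {
        for (uint32_t y = 0; y < height; y++) {
            for (uint32_t x = 0; x < width; x++) {
                uint32_t c = src[y * src_pitch_pixels + x];
                uint16_t r = (c >> 16) & 0xFF;
                uint16_t g = (c >>  8) & 0xFF;
                uint16_t b =  c        & 0xFF;
                /* here: red in the high bits; the real layout comes from the firmware */
                dst[y * dst_pitch_pixels + x] =
                    (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
            }
        }
    }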
Notes for "middle initialisation":
- Eventually the OS will try to find and start suitable device drivers for all the different devices. This includes trying to find a suitable driver for the video card/s (e.g. one that supports things like vertical sync, GPU, GPGPU, etc).
- You will need to design a video driver interface that native video drivers can implement and that (ideally) supports modern features (e.g. full 3D graphics and shaders, maybe).
- When there is no native video driver, the OS can/should start a "generic frame buffer" driver that implements the same video driver interface (the one designed to support hardware acceleration) but does everything in software, without the benefit of hardware acceleration.
- When a video driver is started, the OS needs some kind of "hand off" where ownership of the frame buffer is passed from the earlier boot code to the video driver. After this hand off, the earlier boot code (which was designed to draw directly to the frame buffer) should not touch the frame buffer and should instead ask the video driver to do the "convert pixel data and copy it to the frame buffer" work.
Notes for "after initialisation":
- For a traditional "2D GUI", you typically have one buffer (or "canvas" or "texture" or whatever) for the background/desktop, plus more buffers/canvases for each window or dialog box, and possibly more for smaller things (e.g. the mouse pointer, drop-down menus, "widgets", etc); applications can modify their own buffer/canvas but are prevented (for security reasons) from directly or indirectly accessing any other. The GUI then tells the video driver where each of these buffers/canvases should be drawn, and the video driver (using hardware acceleration if it's a native driver) combines ("composes") the pieces into pixel data for the whole frame, then does the pixel format conversion (hopefully on the GPU) to get the raw pixel data to send to the monitor. This means various actions (moving windows around the screen, "alt-tabbing" between windows, moving the mouse, etc) become extremely fast when there's a native video driver, because the CPU does nothing and the video card does all the work. A minimal software sketch of this composition step follows these notes.
- Ideally there would be a way (e.g. OpenGL) for an application to ask the video driver to draw things into the application's own buffer/canvas, so that more of the work is done by the video card and not by the CPU. This is especially important for 3D games, but there's no reason normal 2D applications can't benefit from the same approach for 2D graphics.
- Note that most beginners get this wrong (they don't have a well designed native video driver interface) and therefore will never have any native video drivers, because none of their software could use one anyway. These people will probably try to convince you that it's not worth the hassle (because in their experience native video drivers never materialise). The reality is that most native video drivers are extremely hard to write, but some of them (for virtual machines) aren't; and your goal should be to make it possible for other people to write drivers eventually (by designing suitable interfaces and providing adequate documentation) rather than writing all the drivers yourself.
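As a very rough illustration of the composition step described above (roughly what a "generic frame buffer" driver would end up doing in software), here is a minimal C sketch; the struct and function names are invented for the example, and there is no damage tracking or alpha blending:

    #include <stdint.h>

    struct canvas {
        uint32_t *pixels;               /* 0x00RRGGBB, owned by one application */
        uint32_t  width, height;
        int32_t   screen_x, screen_y;   /* where the GUI wants it drawn */
    };

    /* Copies each canvas into the frame at its position.  Clips against the
     * screen edges; later canvases simply overwrite earlier ones. */
    static void compose(uint32_t *frame, uint32_t fw, uint32_t fh,
                        const struct canvas *wins, int count)
    {
        for (int i = 0; i < count; i++) {
            const struct canvas *w = &wins[i];
            for (uint32_t y = 0; y < w->height; y++) {
                int32_t fy = w->screen_y + (int32_t)y;
                if (fy < 0 || fy >= (int32_t)fh) continue;
                for (uint32_t x = 0; x < w->width; x++) {
                    int32_t fx = w->screen_x + (int32_t)x;
                    if (fx < 0 || fx >= (int32_t)fw) continue;
                    frame[fy * fw + fx] = w->pixels[y * w->width + x];
                }
            }
        }
        /* the composed frame would then go through the pixel-format conversion
         * and be copied to the real frame buffer (or handed to the GPU) */
    }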
The top answer did a very good job of explaining. You did ask for some example code, though, so here's a code snippet from my GitHub; a detailed explanation follows.
1. bios_setup:
2. mov ah, 00h ; BIOS function 00h: set video mode
3. mov al, 13h ; mode 13h: 320x200, 256 colours
4. int 10h ; call the BIOS
5. mov ah, 0Ch ; BIOS function 0Ch: write graphics pixel
6. mov bh, 0 ; video page 0
7. mov al, 0 ; colour 0 (black)
8. mov cx, 0 ; x = 0
9. mov dx, 0 ; y = 0
10. int 10h ; call the BIOS
Line 2 is where the fun begins. Firstly, we move the value 0 into the ah register. At line 3, we move 13 hex into al - now we're ready for our BIOS call.
Line 4 calls the BIOS with interrupt vector 10 hex. The BIOS then looks at ah and al.
AH:
- 00h tells the BIOS we want to set the video mode
AL:
- 13h selects video mode 13h (320x200 pixels, 256 colours)
Now that we've called the interrupt on line 4, we're ready to move new values into some registers.
At line 5, we put 0C hex into the ah register.
This tells the BIOS that we want to write a graphics pixel.
At line 6, we put 0 into the bh register, which selects video page 0 (mode 13h only has one page anyway).
Next, all we have to do is set our colour, so let's start with 0, which is black.
That's all nice, but where do we want to actually draw this black pixel to?
That's where lines 8-9 come in: registers cx and dx store the x and y coordinates of the pixel to draw, respectively.
Once they are set, we call the BIOS with interrupt 10 hex, and the pixel is drawn.
After reading Brendan's elaborate and informative answer, this code will make much more sense. Certain values must be in certain registers before calling the BIOS simply because those are the registers the corresponding interrupt will check. Everything else is pretty straightforward: if you want another colour, change the value in al; if you want to draw your pixel somewhere else, change the x and y values in cx and dx. Again, this isn't very efficient for graphics-intensive programs, as it is pretty slow.
For educational purposes, however, it beats writing your own graphics driver ;)
You can still get some efficiency by drawing everything into a buffer in RAM before blitting to the screen, as Brendan said, but I'd rather keep my example simple; a sketch of that approach follows.
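For completeness, here's a rough C sketch of that "draw in a RAM buffer, then blit" idea for mode 13h. It is not part of the boot-sector example above: it assumes the kernel has since switched to 32-bit protected mode with the frame buffer at 0xA0000 identity-mapped, so it can be written directly instead of going through int 10h for every pixel:

    #include <stdint.h>

    #define VGA_WIDTH  320
    #define VGA_HEIGHT 200

    static uint8_t backbuf[VGA_WIDTH * VGA_HEIGHT];   /* back buffer in ordinary RAM */

    static void put_pixel(int x, int y, uint8_t colour)
    {
        backbuf[y * VGA_WIDTH + x] = colour;          /* cheap: no video memory traffic yet */
    }

    static void blit(void)
    {
        /* one big copy to video memory instead of one BIOS call per pixel */
        volatile uint8_t *vga = (volatile uint8_t *)0xA0000;
        for (unsigned i = 0; i < sizeof backbuf; i++)
            vga[i] = backbuf[i];
    }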
Check out the full - free - example on my GitHub. I've also included a README and a Makefile, but they are Linux-only. If you're running Windows, some googling will turn up everything you need to assemble the OS onto a bootable floppy, and just about any virtual machine host will do. Also, feel free to ask me about anything that's unclear. Cheers!
PS: I did not write a full tool, just a small NASM program that is meant to be assembled onto a floppy and run as a kernel (in a VM, if you like).
Related
I have trouble understanding these two class styles. The docs say that they align the window on a byte boundary, but I don't understand what that means.
I have tried using them and, yes, the position of the window upon creation is different, but what do they do, and why would I use them?
With modern display technology and GPUs, they (probably) do very little in terms of performance.
In older times, though, a (potentially slow) CPU had to write blocks of RAM directly to display memory. Where a display and/or bitmap has a colour depth of less than one byte per pixel - monochrome (1 bit per pixel) or low colour (say, 4 bpp) - a window and its client area could be positioned such that a row did not start on an actual byte boundary. Block-copy operations (like BitBlt) were then very slow, because the first few pixels of each row had to be set by manually combining some of the bits in display memory with some of the bits from the first bytes of the source (RAM), and this extra work was repeated on every row.
Forcing the display (be it the client area or the entire window) to have its x-origin (those flags/styles only affect the x-position) aligned to a true byte boundary allows much faster copying, because there is then a direct correspondence between bytes in the source (RAM) and bytes in the target (display); a row of bytes can simply be block-copied (with something akin to memcpy), without any manipulation of individual bits from different bytes.
As a vague analogy, consider the difference (in speed and simplicity) between: (a) copying one array of n bytes to another of the same size; and (b) replacing each byte in the second array with the combination of the lower 4 bits of one source element with the higher 4 bits of the following source element.
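Spelled out as illustrative C, the two cases look roughly like this: (a) is what a byte-aligned blit can do, while (b) shows the per-byte bit-twiddling that a half-byte misalignment forces on a 4-bpp copy:

    #include <stdint.h>
    #include <string.h>

    /* (a) byte-aligned: a plain block copy */
    static void copy_aligned(uint8_t *dst, const uint8_t *src, size_t n)
    {
        memcpy(dst, src, n);
    }

    /* (b) shifted by half a byte (one 4-bpp pixel): every destination byte is
     * stitched together from two neighbouring source bytes.
     * Assumes src has at least n + 1 bytes so the final pair is available. */
    static void copy_shifted(uint8_t *dst, const uint8_t *src, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] = (uint8_t)(((src[i] & 0x0F) << 4) | ((src[i + 1] & 0xF0) >> 4));
    }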
From Why did Windows 95 keep window coordinates at multiples of 8? by Raymond Chen:
The screen itself is a giant bitmap, and this means that copying data to the screen goes much faster if x-coordinate of the destination resides on a full byte boundary. And the most common x-coordinate is the left edge of a window’s contents (known as its client area).
Applications can request that Windows position their windows so that their client area began at these advantageous coordinates by setting the CS_BYTEALIGNCLIENT style in their window class. And pretty much all applications did this because of the performance benefit it produced.
So what happened after Windows 95 that made this optimization go away?
Oh, the optimization is still there. You can still set the CS_BYTEALIGNCLIENT style today, and the system will honor it.
The thing that changed wasn’t Windows. The thing that changed was your video card.
In the Windows 95 era, predominant graphics cards were the VGA (Video Graphics Array) and EGA (Enhanced Graphics Adapter). Older graphics cards were also supported, such as the CGA (Color Graphics Adapter) and the monochrome HGC (Hercules Graphics Card).
All of these graphics cards had something in common: They used a pixel format where multiple pixels were represented within a single byte,¹ and therefore provided an environment where byte alignment causes certain x-coordinates to become ineligible positions.
Once you upgraded your graphics card and set the color resolution to “256 colors” or higher, every pixel occupies at least a full byte,² so the requirement that the x-coordinate be byte-aligned is vacuously satisfied. Every coordinate is eligible.
Nowadays, all graphics cards use 32-bit color formats, and the requirement that the coordinate be aligned to a byte offset is satisfied by all x-coordinates.³ The multiples of 8 are no longer special.
I've been using SDL to render graphics in C. I know there are several options to create graphics at the pixel level on Windows, including SDL and OpenGL. But how do these programs do it? Fine, I can use SDL. But I'd like to know what SDL is doing so I don't feel like an ignorant fool. Am I the only one slightly frustrated by the opaque layer of frosting on modern computers?
A short explanation as to how this is done on other operating systems would also be interesting, but I am most concerned with Windows.
Edit: Since this question seems to be somewhat unclear, this is precisely what I want:
I would like to know how pixel-level graphics manipulation (drawing on the screen pixel by pixel) works on Windows. What do libraries like SDL do with the operating system to make this possible? I can manipulate the screen pixel by pixel using SDL, so what magic happens in SDL to let me do that?
Windows has many graphics APIs. Some are layers built on top of others (e.g., GDI+ on top of GDI), and others are completely independent stacks (like the Direct3D family).
In an API like GDI, there are functions like SetPixel which let you change the value of a single pixel on the screen (or within a region of the screen that you have access to). But using SetPixel to set lots of pixels is generally slow.
If you were to build a photorealistic renderer, like a ray tracer, then you'd probably build up a bitmap in memory (pixel by pixel), and use an API like BitBlt that sends the entire bitmap to the screen at once. This is much faster.
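A rough sketch of that pattern with GDI might look like the following (window creation and error handling omitted; hdc would come from BeginPaint or GetDC, and the pattern drawn into the buffer is just a placeholder):

    #include <windows.h>
    #include <stdint.h>

    #define W 320
    #define H 200

    static uint32_t pixels[W * H];          /* 0x00RRGGBB, top-down rows */

    /* build the image pixel by pixel in system memory */
    static void render(void)
    {
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                pixels[y * W + x] = (uint32_t)((x ^ y) << 8);   /* placeholder pattern */
    }

    /* push the whole bitmap to the window in one call */
    static void present(HDC hdc)
    {
        BITMAPINFO bmi = {0};
        bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
        bmi.bmiHeader.biWidth       = W;
        bmi.bmiHeader.biHeight      = -H;   /* negative height = top-down rows */
        bmi.bmiHeader.biPlanes      = 1;
        bmi.bmiHeader.biBitCount    = 32;
        bmi.bmiHeader.biCompression = BI_RGB;

        StretchDIBits(hdc, 0, 0, W, H, 0, 0, W, H,
                      pixels, &bmi, DIB_RGB_COLORS, SRCCOPY);
    }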
But it still may not be fast enough for rendering something like video. Moving all that data from system memory to the video card memory takes time. For video, it's common to use a graphics stack that's closer to the low-level graphics drivers and hardware. If the graphics card can do the video decompression directly, then sending the compressed video stream to the card will be much more efficient than sending the decompressed data from system memory to the video card--and that's often the limiting factor.
But conceptually, it's the same thing: you're manipulating a bitmap (or texture or surface or raster or ...), but that bitmap lives in graphics memory, and you're issuing commands to the GPU to set the pixels the way you want, and then to display that bitmap at some portion of the screen (often with some sort of transformation).
Modern graphics processors actually run little programs--called shaders--that can (among other things) do calculations to determine the pixel values. The GPUs are optimized to do these types of calculations and can do many of them in parallel. But ultimately, it boils down to getting the pixel values into some sort of bitmap in video memory.
What happens during a display mode change (resolution, colour depth) on an ordinary computer (classic desktops and laptops)?
It might not be so trivial since video cards are so different, but one thing is common to all of them:
The screen goes black (understandable since the signal is turned off)
It takes many seconds for the signal to return with the new mode
and if it is under D3D or GL:
The graphics device is lost and all VRAM objects must be reloaded, making the mode change take even longer
Can someone explain the underlying nature of this, and specifically why a display mode change is not a trivial reallocation of the backbuffer(s) and takes such a "long" time?
The only thing that actually changes are the settings of the so-called RAMDAC (a digital-to-analog converter attached directly to the video RAM); well, today, with digital connections, it's more like a "RAMTX" (a DVI/HDMI/DisplayPort transmitter attached to the video RAM). DOS graphics programmer veterans probably remember the fights between the RAMDAC, the specification and the woes of one's own code.
It actually doesn't take seconds until the signal returns. This is a rather quick process, but most display devices take their time to synchronize with the new signal parameters. With well-written drivers the change happens almost immediately, between vertical blanks. A few years ago, when displays were, err, stupider and analogue, you could see the picture go berserk for a short moment after changing the video mode settings, until the display resynchronized (maybe I should take a video of this while I still own equipment capable of it).
Since what's actually going on is just a change of RAMDAC settings, no data needs to be lost as long as the basic parameters stay the same: number of bits per pixel, number of components per pixel and pixel stride. In fact, OpenGL contexts usually don't lose their data on a video mode change. Of course the visible framebuffer layout changes, but that also happens when moving a window around.
DirectX Graphics is a bit of a different story, though. There is device-exclusive access, and whenever you switch between Direct3D fullscreen mode and the regular desktop, all graphics objects are swapped out; that's the reason DirectX Graphics is so laggy when switching between a game and the Windows desktop.
If the pixel data format changes, it usually requires a full reinitialization of the visible framebuffer, but today's GPUs are exceptionally good at mapping heterogeneous pixel formats onto a target framebuffer, so no delays are necessary there either.
I have written an anaglyph filter that mixes two images into one stereoscopic image. It is a fast routine that works one pixel at a time.
Right now I'm using pointers to write each calculated pixel into a memory bitmap, then BitBlt that whole image onto the window.
This seems redundant to me. I'd rather copy each pixel directly to the screen, since my anaglyph routine is quite fast. Is it possible to bypass BitBlt and simply have the pointer point directly to wherever BitBlt would copy the pixels?
I'm sure it's possible, but you really really really don't want to do this. It's much more efficient to draw the entire pattern at once.
You can't draw directly to the screen from Windows, because the graphics card memory isn't necessarily mapped in any sane order.
Blitting to the screen is amazingly fast.
Remember, you don't blit after each pixel - only when you want a new result to be shown, and even then there's no point doing it faster than your screen's refresh rate, probably 60 Hz.
You are looking for something like glMapBuffer in OpenGL, but for accessing the screen directly.
However, writing to GPU memory pixel by pixel is the slowest thing you can do. PCI works faster when you send big streams of data. There are also many issues if you both write and read data, and the pixel layout matters too (see NVIDIA's docs about fast texture transfers). BitBlt will do all of this for you in a driver-optimised way.
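To make the recommended pattern concrete, here is a small, illustrative sketch of doing the per-pixel work in a memory bitmap and leaving the screen update to a single blit. The red-cyan channel split is just one common anaglyph mixing rule, not necessarily the one the asker's filter uses:

    #include <stddef.h>
    #include <stdint.h>

    /* Mix left/right 0x00RRGGBB images into an anaglyph in system memory. */
    static void mix_anaglyph(const uint32_t *left, const uint32_t *right,
                             uint32_t *out, size_t count)
    {
        for (size_t i = 0; i < count; i++)
            out[i] = (left[i]  & 0x00FF0000)    /* red from the left eye */
                   | (right[i] & 0x0000FFFF);   /* green + blue from the right eye */
        /* 'out' is the memory bitmap; a single BitBlt/StretchDIBits call then
         * puts the whole frame on screen, which is the cheap part. */
    }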
How do people make GUIs? I mean the basic building blocks or principles they use to draw visual components on the screen, like KDE, Gnome, etc. Are there any simple examples of how to draw something like a rectangle on the screen by dealing directly with the hardware?
I am using a PC, for those asking about my platform.
Well okay, let's start at the bottom. You have a monitor that displays an image. This image is a matrix of pixels, say, 1600x1200 pixels with 24 bits depth.
The monitor knows what to display from the video adapter. The video adapter knows what to display through the "frame buffer", which is a big block of memory that - in this example - holds 1600 * 1200 pixels, usually with 32 bits per pixel on contemporary cards.
The frame buffer is often accessible to the CPU as a big block of memory that it can poke into directly, and some adapters have GPUs that can render things into the frame buffer themselves (like shaded, textured triangles), so the CPU just sends commands through a "command buffer" telling the GPU what to draw and where.
Then you have the operating system, which loads a hardware driver that communicates with the video adapter.
The operating system usually offers functions to draw to the screen; Win32, for example, has lots of functions like BitBlt, LineTo, TextOut, etc. These end up talking to the driver.
Then you have something like Java, which renders its own graphics, typically using functions provided by the operating system.
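To make the "big block of memory" idea concrete, here is an illustrative C sketch of filling a rectangle in a linear 32-bpp frame buffer; something along these lines is ultimately what happens underneath calls like BitBlt, whether the CPU or the GPU does it:

    #include <stdint.h>

    /* No clipping: the caller must keep the rectangle inside the buffer. */
    static void fill_rect(uint32_t *fb, uint32_t pitch_pixels,
                          uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                          uint32_t colour)
    {
        for (uint32_t row = 0; row < h; row++)
            for (uint32_t col = 0; col < w; col++)
                fb[(y + row) * pitch_pixels + (x + col)] = colour;
    }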
The simple answer is bitmaps; in fact, this also applied to fonts on terminals in the early days.
The original GUIs, like Xerox PARC's Alto GUI, were based on bitmap displays, and the graphics were drawn with simple bitmap drawing tools and graphics libraries, using simple geometry to determine shapes like circles, squares and rectangles and then mapping them to display pixels.
Today's GUIs are the same, except with additional software and hardware that have sped up and improved the process and the performance of these GUIs.
The fundamental mapping of bits (e.g. 10101010) to pixels depends on the display hardware, but at a simplistic level you provide a display buffer in memory and simply populate its bytes with the display data.
So for a basic monochrome bitmap, you'd draw a shape by providing the bits that represent it. For example, here is a simple 8x8-pixel button:
01111110
10000001
10000001
10111101
10111101
10000001
10000001
01111110
Which you can see more easily if I render it with # and space instead of 1 and 0:
 ######
#      #
#      #
# #### #
# #### #
#      #
#      #
 ######
Which as a bitmap image would look like this: http://i.stack.imgur.com/i7lVQ.png (I know it's a bit small :) but this is the sort of scale we would've started at when GUIs were first developed.)
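As a small, illustrative C sketch, here is how that 8x8 one-bit-per-pixel button could be expanded into a modern 32-bpp frame buffer, with each set bit becoming a foreground pixel and each clear bit a background pixel (the buffer is assumed to be linear, with a pitch given in pixels):

    #include <stdint.h>

    static const uint8_t button[8] = {
        0x7E, 0x81, 0x81, 0xBD, 0xBD, 0x81, 0x81, 0x7E   /* rows of the bitmap above */
    };

    static void draw_button(uint32_t *fb, uint32_t pitch_pixels,
                            uint32_t x, uint32_t y, uint32_t fg, uint32_t bg)
    {
        for (uint32_t row = 0; row < 8; row++)
            for (uint32_t col = 0; col < 8; col++) {
                int set = (button[row] >> (7 - col)) & 1;   /* leftmost pixel = MSB */
                fb[(y + row) * pitch_pixels + (x + col)] = set ? fg : bg;
            }
    }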
If you had a more complex display (e.g. 24-bit colour), you'd provide each pixel as a 24-bit number.
Obviously some bitmaps cannot be drawn by hand like we've done above (for example, the border of a window); this is where geometry comes in handy, and we can use simple functions to determine the pixel values required to draw a rectangle, or any other simple shape, and then build from there.
Once you are able to draw graphics on a display in this way, you hook a drawing loop onto a system interrupt to keep the display up to date (you redraw the display very often, depending on your system's performance). This way it can also handle interaction from user devices, e.g. a mouse.
Back in the early days, even before Xerox PARC and the Alto, there were a number of early computer systems that had vector-based displays; these would build up an image by drawing lines on a CRT's representation of a Cartesian plane. However, such displays never saw mainstream use, except perhaps in some early video games like Asteroids and Tempest.
You probably need a graphics library such as, for example, OpenGL.
For direct hardware interaction, you probably need to do something like assembly, which is completely computer specific.
If you are willing to look through a lot of source code, you might look at Mesa 3D, an open source implementation of the OpenGL specification.