How to read the video frame buffer in Windows

I am trying to create a small project wherein I need to capture/read the video frame buffer and calculate the average RGB value of the screen.
I don't need to write anything on the screen. I'm doing this in Windows.
Can anyone help me with any Windows API which will read the video frame buffer and calculate the average RGB value?
From what I have found so far, it seems I need to write a kernel driver that has access to read the frame buffer.
Is this the only solution?
Is there any other way of reading frame buffer?
Is there an algorithm to calculate the average RGB value from the frame buffer data?

If you want really good performance, you might have to use DirectX and capture the back buffer to a texture. Using mipmaps, it will automatically create downsampled levels all the way down to 1x1. Just grab the color of that one pixel and you're good to go.
Good luck, though. I'm working on implementing this as we speak. I'm creating an ambient light control for my room. I was getting about 15 FPS using device contexts and StretchBlt, and only got decent performance if I grabbed a single pixel with GetPixel(). That's on an i5 3570K @ 4.5 GHz.
But with the DirectX method, you could technically get hundreds if not thousands of frames per second. (When I render a spinning triangle, my 660 gets about 24,000 FPS. It couldn't be TOO much slower, minus the CPU calls.)
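For reference, here is a minimal GDI-only sketch of the same "downsample to one pixel" idea, assuming a plain desktop capture rather than DirectX: HALFTONE stretching averages the source pixels, so the single destination pixel approximates the mean screen colour. It will be far slower than the DirectX route, but it shows the averaging step in a few lines.

    // Minimal GDI sketch: average screen colour by stretching the desktop into a
    // 1x1 bitmap. HALFTONE mode averages the source pixels instead of decimating.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        int w = GetSystemMetrics(SM_CXSCREEN);
        int h = GetSystemMetrics(SM_CYSCREEN);

        HDC screenDC = GetDC(NULL);                   // DC for the whole screen
        HDC memDC    = CreateCompatibleDC(screenDC);  // memory DC holding the 1x1 target
        HBITMAP bmp  = CreateCompatibleBitmap(screenDC, 1, 1);
        HGDIOBJ old  = SelectObject(memDC, bmp);

        SetStretchBltMode(memDC, HALFTONE);           // average rather than skip pixels
        SetBrushOrgEx(memDC, 0, 0, NULL);
        StretchBlt(memDC, 0, 0, 1, 1, screenDC, 0, 0, w, h, SRCCOPY);

        COLORREF avg = GetPixel(memDC, 0, 0);         // the averaged colour
        printf("avg RGB = %d %d %d\n", GetRValue(avg), GetGValue(avg), GetBValue(avg));

        SelectObject(memDC, old);
        DeleteObject(bmp);
        DeleteDC(memDC);
        ReleaseDC(NULL, screenDC);
        return 0;
    }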

Related

Better way to display a radar PPI?

I am using Qt 4.8.6 to display multiple radar videos.
At the moment I am getting about 4096 azimuths (a full 360°) per 2.5 seconds per video.
I display my image using a class inherited from QGraphicsObject (see here), using one of the RGB channels for each video.
Per azimuth I get the angle and an array of 8192 range bins, and my image has a size of 1024x1024 pixels. For every pixel I check which range bins are present at that pixel (going through every x-coordinate and checking the maximum and minimum y-coordinate for every azimuth and pixel coordinate) and write the largest value into my image array.
My problems
Calculating each azimuth takes about 1 ms, which is far too slow. (I receive two azimuths roughly every 600 microseconds, and later there may be even more video channels.)
I want to zoom and move my image and for now have thought about two methods to do that:
Use a full-size image array and zoom and move the QGraphicsScene directly/"virtually".
That would require the array to have a size of 16384x16384x4 bytes, which is far too big (I cannot manage to allocate enough space).
Save multiple images for different scale factors and offsets; but then my transformation algorithm (which is already slow) would have to run multiple times, and the zoom and offset would only take effect after the full 2.5 seconds.
Can you think of any better methods to do that?
Are there any standard techniques for checking my algorithm for performance problems?
I know this is a very specific question, but since my mentor is not at work for the next few days, I will give it a try here.
Thank you!
I'm not sure why you are using a QGraphicsScene for the scenario you are doing. Have you considered turning your data into a raster image, and presenting the data as a bitmap?
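To illustrate what that could look like, here is a rough Qt sketch that rasterises one azimuth of range bins straight into a QImage (the names drawAzimuth, binData and numBins are made up, not from the question); a real implementation would write through scanLine() rather than setPixel() for speed.

    // Hypothetical sketch: draw one azimuth of range bins into a raster QImage,
    // keeping the strongest return per pixel in the red channel.
    #include <QImage>
    #include <QtMath>

    void drawAzimuth(QImage &img, double angleRad, const quint8 *binData, int numBins)
    {
        const int cx = img.width()  / 2;
        const int cy = img.height() / 2;
        const double maxRadius = qMin(cx, cy);

        for (int i = 0; i < numBins; ++i) {
            double r = maxRadius * i / numBins;              // map bin index to radius
            int x = cx + int(r * qCos(angleRad));
            int y = cy - int(r * qSin(angleRad));
            if (x < 0 || y < 0 || x >= img.width() || y >= img.height())
                continue;
            QRgb old = img.pixel(x, y);
            int v = qMax(int(qRed(old)), int(binData[i]));   // keep the strongest return
            img.setPixel(x, y, qRgb(v, qGreen(old), qBlue(old)));
        }
    }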

What is the max FPS in OpenCV?

I want to use a high-speed camera (500-1000 fps) for capturing rice seeds. Before proceeding, I want to know what the maximum FPS supported by OpenCV is. I want to detect the color of every pixel (2048 pixels per frame).
I have these questions :
Can I use OpenCV to do this work? What are the alternatives?
Is it possible to use OpenCV, if I limit the number of pixels per frame (for example, 50 pixels per frame)?
I don't think there is any such value as MAX_FPS in OpenCV. However, processing 500-1000 frames per second might be quite hard. What is the size (width, height, number of channels, depth) of a single frame? The only option that comes to my mind is to grab frames using the normal approach and then process them on the GPU (OpenCV has a CUDA module for this). You can try to process them one by one, or grab x frames and then process them at the same time (in parallel). Of course you can try to do it on the CPU as well, but most likely you will not be able to use as many "threads" as on the GPU.
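As a rough illustration of that point, the sketch below simply measures the frame rate you actually achieve with a trivial per-frame colour statistic; the ceiling is set by the camera driver and your processing time, not by any constant inside OpenCV. The camera index and the requested FPS are assumptions, and the property name uses the OpenCV 3+ spelling.

    // Sketch: measure the frame rate actually delivered by the camera + processing.
    #include <opencv2/opencv.hpp>
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        cv::VideoCapture cap(0);               // assumed camera index
        cap.set(cv::CAP_PROP_FPS, 500);        // request a high rate; the driver may ignore it

        cv::Mat frame;
        int frames = 0;
        int64_t start = cv::getTickCount();

        while (cap.read(frame) && frames < 1000) {
            cv::Scalar avg = cv::mean(frame);  // cheap per-frame colour statistic
            (void)avg;
            ++frames;
        }

        double seconds = (cv::getTickCount() - start) / cv::getTickFrequency();
        printf("achieved %.1f FPS\n", frames / seconds);
        return 0;
    }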

What's the fastest way to access video pixels in AS3?

I would like to copy pixels from a 1080p video from one location to another efficiently/with as little CPU impact as possible.
So far my implementation is fairly simple:
using BitmapData's draw() method to grab the pixels from the video
using BitmapData's copyPixels() to shuffle pixels about
Ideally this would have as little CPU impact as possible, but I am running out of options and could really use some tips from experienced ActionScript 3 developers.
I've profiled my code with Scout and noticed the CPU usage is mostly around 70% but goes above 100% quite a bit. I've looked into StageVideo but one of the main limitations is this:
The video data cannot be copied into a BitmapData object
(BitmapData.draw).
Is there a more direct way to access video pixels, rather than rasterizing a DisplayObject?
Can I access each video frame as a ByteArray directly and plug it into a BitmapData object?
(I found appendBytes but it seems to do the reverse of what I need in my setup.)
What is the most CPU-friendly way to manipulate pixels from an h264 1080p video in ActionScript 3?
Also, is there a faster way to move pixels around than copyPixels() in Flash Player? And I see Scout points out that the video is not hardware accelerated (.rend.video.hwrender: false). Shouldn't h264 video be hardware accelerated (even without StageVideo) according to this article, or is that only for fullscreen mode?
The latest AIR beta introduced video-as-texture support, which you could use to manipulate the video on the GPU (and do it far faster than with BitmapData). But keep in mind that it is currently available only for AIR on Windows, and there are some other limitations.

Is it possible to use pointers to write directly (low level) onto a window without using BitBlt?

I have written an anaglyph filter that mixes two images into one stereographic image. It is a fast routine that works with one pixel at a time.
Right now I'm using pointers to output each calculated pixel to a memory bitmap, then BitBlt that whole image onto the window.
This seems redundant to me. I'd rather copy each pixel directly to the screen, since my anaglyph routine is quite fast. Is it possible to bypass BitBlt and simply have the pointer point directly to wherever BitBlt would copy it to?
I'm sure it's possible, but you really really really don't want to do this. It's much more efficient to draw the entire pattern at once.
You can't draw directly to the screen from Windows because the graphics card memory isn't necessarily mapped in any sane order.
Blitting to the screen is amazingly fast.
Remember, you don't blit after each pixel, only when you want a new result to be shown. Even then there's no point doing this faster than the refresh rate of your screen, probably 60 Hz.
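For what it's worth, here is a sketch of that pattern: all per-pixel work goes through a pointer into a DIB section in system memory, and the finished frame is blitted once. The window handle and dimensions are assumed to come from the surrounding program, and the per-pixel value is a placeholder for the anaglyph result.

    #include <windows.h>

    // Write every pixel through a pointer, then BitBlt the whole frame once.
    void renderFrame(HWND hwnd, int width, int height)
    {
        BITMAPINFO bmi = {};
        bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
        bmi.bmiHeader.biWidth       = width;
        bmi.bmiHeader.biHeight      = -height;      // negative height = top-down rows
        bmi.bmiHeader.biPlanes      = 1;
        bmi.bmiHeader.biBitCount    = 32;
        bmi.bmiHeader.biCompression = BI_RGB;

        void *bits  = NULL;
        HDC wndDC   = GetDC(hwnd);
        HDC memDC   = CreateCompatibleDC(wndDC);
        HBITMAP dib = CreateDIBSection(wndDC, &bmi, DIB_RGB_COLORS, &bits, NULL, 0);
        HGDIOBJ old = SelectObject(memDC, dib);

        // Per-pixel work happens on plain memory; no GDI call per pixel.
        DWORD *pixels = (DWORD *)bits;
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
                pixels[y * width + x] = 0x00FF00FF;  // your computed 0x00RRGGBB here

        // One blit per frame; no point doing this faster than the monitor refresh.
        BitBlt(wndDC, 0, 0, width, height, memDC, 0, 0, SRCCOPY);

        SelectObject(memDC, old);
        DeleteObject(dib);
        DeleteDC(memDC);
        ReleaseDC(hwnd, wndDC);
    }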
You are looking for something like glMapBuffer in OpenGL, but accessing the screen directly.
But writing to GPU memory pixel by pixel is the slowest operation you can do. The PCI bus works faster when you send big streams of data. There are also many issues if you both write and read data, and the pixel layout matters as well (see the NVIDIA docs about fast texture transfers). BitBlt will do it for you in a driver-optimised way.
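If you do end up streaming frames through OpenGL, a rough sketch of the glMapBuffer idea with a pixel buffer object might look like the following. It assumes an extension loader (e.g. GLEW) has set up the entry points, that tex was already allocated with glTexImage2D, and that pbo, width, height and fillFrame() are placeholders for your own code.

    // Fill a mapped pixel buffer object with the whole frame, then let the
    // driver copy it into the texture asynchronously ("big stream" transfer).
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, width * height * 4, NULL, GL_STREAM_DRAW);

    void *ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    if (ptr) {
        fillFrame(static_cast<unsigned char *>(ptr), width, height); // your per-pixel code
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    }

    // The texture update reads from the bound PBO; the last argument is an
    // offset into the buffer, not a client-memory pointer.
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);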

Graphics: best performance with floating-point accumulation images

I need to speed up some particle system eye candy I'm working on. The eye candy involves additive blending, accumulation, and trails and glow on the particles. At the moment I'm rendering by hand into a floating point image buffer, converting to unsigned chars at the last minute then uploading to an OpenGL texture. To simulate glow I'm rendering the same texture multiple times at different resolutions and different offsets. This is proving to be too slow, so I'm looking at changing something. The problem is, my dev hardware is an Intel GMA950, but the target machine has an Nvidia GeForce 8800, so it is difficult to profile OpenGL stuff at this stage.
I did some very unscientific profiling and found that most of the slow down is coming from dealing with the float image: scaling all the pixels by a constant to fade them out, and converting the float image to unsigned chars and uploading to the graphics hardware. So, I'm looking at the following options for optimization:
Replace floats with uint32s in a 16.16 fixed-point configuration
Optimize float operations using SSE2 assembly (image buffer is a 1024*768*3 array of floats)
Use OpenGL Accumulation Buffer instead of float array
Use OpenGL floating-point FBOs instead of a float array
Use OpenGL pixel/vertex shaders
Have you any experience with any of these possibilities? Any thoughts, advice? Something else I haven't thought of?
The problem is simply the sheer amount of data you have to process.
Your float buffer is 9 megabytes in size, and you touch the data more than once. Most likely your rendering loop looks somewhat like this:
Clear the buffer
Render something on it (uses reads and writes)
Convert to unsigned bytes
Upload to OpenGL
That's a lot of data to move around, and the cache can't help you much because the image is much larger than the cache. Let's assume you touch every pixel five times. If so, you move 45 MB of data in and out of slow main memory. 45 MB does not sound like much, but consider that almost every memory access will be a cache miss. The CPU will spend most of its time waiting for the data to arrive.
If you want to stay on the CPU to do the rendering there's not much you can do. Some ideas:
Using SSE non-temporal loads and stores may help, but they will complicate your task quite a bit (you have to align your reads and writes).
Try breaking up your rendering into tiles, e.g. do everything on smaller rectangles (256*256 or so). The idea is that you actually get a benefit from the cache: after you've cleared your rectangle, for example, the entire tile will be in the cache. Rendering and converting to bytes will be a lot faster because there is no need to fetch the data from relatively slow main memory anymore. (A rough sketch of the tiled approach follows these suggestions.)
Last resort: Reduce the resolution of your particle effect. This will give you a good bang for the buck at the cost of visual quality.
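Here is the rough tiled-processing sketch mentioned above; the buffer names and the fade factor are assumptions, and the two passes (fade, then byte conversion) are fused per tile so the tile stays in cache between them.

    #include <algorithm>

    // Illustrative only: fade and convert the float buffer tile by tile so each
    // 256x256 block stays in cache between the two passes.
    void fadeAndConvertTiled(float *floatBuf, unsigned char *byteBuf,
                             int width, int height, float fade)
    {
        const int TILE = 256;
        for (int ty = 0; ty < height; ty += TILE) {
            for (int tx = 0; tx < width; tx += TILE) {
                const int th = std::min(TILE, height - ty);
                const int tw = std::min(TILE, width  - tx);
                for (int y = ty; y < ty + th; ++y) {
                    float         *src = floatBuf + (y * width + tx) * 3; // RGB floats
                    unsigned char *dst = byteBuf  + (y * width + tx) * 3;
                    for (int i = 0; i < tw * 3; ++i) {
                        src[i] *= fade;                                              // fade pass
                        dst[i]  = (unsigned char)(std::min(src[i], 1.0f) * 255.0f);  // convert pass
                    }
                }
            }
        }
    }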
The best solution is to move the rendering onto the graphics card. Render-to-texture functionality is standard these days. It's a bit tricky to get it working with OpenGL because you have to decide which extension to use, but once you have it working, performance is no longer an issue. (A rough FBO sketch follows at the end of this answer.)
Btw - do you really need floating point render-targets? If you get away with 3 bytes per pixel you will see a nice performance improvement.
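For the render-to-texture route, a sketch of a floating-point FBO setup could look like the following. The calls shown are core in OpenGL 3.0 (older drivers expose the same functions with EXT/ARB suffixes), an extension loader is assumed, and the 1024x768 size matches the buffer described in the question.

    // Create a floating-point render target; particle blending then happens on the GPU.
    GLuint fbo, colorTex;

    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, 1024, 768, 0, GL_RGBA, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);

    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
        // fall back to an 8-bit target (or the accumulation buffer)
    }

    // ... render the particles into the FBO here, then draw colorTex to the screen:
    glBindFramebuffer(GL_FRAMEBUFFER, 0);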
It's best to move the rendering calculation for massive particle systems like this over to the GPU, which has hardware optimized to do exactly this job as fast as possible.
Aaron is right: represent each individual particle with a sprite. You can calculate the movement of the sprites in space (e.g., accumulate their positions per frame) on the CPU using SSE2, but do all the additive blending and accumulation on the GPU via OpenGL. (Drawing sprites additively is easy enough.) You can handle your trails and blur either by doing it in shaders (the "pro" way), by rendering to an accumulation buffer and back, or by simply generating a bunch of additional sprites on the CPU to represent the trail and throwing them at the rasterizer.
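As a sketch of the additive-blending step in fixed-function OpenGL (the Particle fields, the particles container and particleTex are placeholders, not from the question):

    // Additive sprite blending: colours accumulate and never darken the framebuffer.
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, particleTex);   // assumed: a small radial-glow texture

    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE);           // additive blending
    glDepthMask(GL_FALSE);                       // don't let particles occlude each other

    glBegin(GL_QUADS);
    for (const Particle &p : particles) {        // assumed: x, y, size, r, g, b, alpha fields
        glColor4f(p.r, p.g, p.b, p.alpha);
        glTexCoord2f(0, 0); glVertex2f(p.x - p.size, p.y - p.size);
        glTexCoord2f(1, 0); glVertex2f(p.x + p.size, p.y - p.size);
        glTexCoord2f(1, 1); glVertex2f(p.x + p.size, p.y + p.size);
        glTexCoord2f(0, 1); glVertex2f(p.x - p.size, p.y + p.size);
    }
    glEnd();
    glDepthMask(GL_TRUE);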
Try to replace the manual code with sprites: an OpenGL texture with an alpha of, say, 10%. Then draw lots of them on the screen (ten of them in the same place to get the full glow).
If by "manual" you mean that you are using the CPU to poke pixels, then pretty much anything where you draw textured polygons with OpenGL instead will represent a huge speedup.
