I am working on a data visualization tool using OpenGL, and the LAB color space is the most comprehensible color space for visualization of the data I'm dealing with (3 axes of data are mapped to the 3 axes of the color space). Is there a fast (e.g. no non-integer exponentiation, suitable for execution in a shader) algorithm for approximate conversion of LAB values to and from RGB values?
If doing the actual conversion calculation in a shader is too complex/expensive, you can always use a lookup table. Since both color spaces have 3 components, you can use a 3D RGB texture to represent the lookup table.
Using a 3D texture might sound like a lot of overhead. Since 8 bits/component is often used to represent colors in OpenGL, you would need a 256x256x256 3D texture. At 4 bytes/texel, that's a 64 MByte texture, which is not outrageous, but very substantial.
However, depending on how smooth the values in the translation table are, you might be able to get away with a lower resolution. Keep in mind that texture sampling uses linear interpolation. If piecewise linear interpolation is good enough with a certain base-resolution of the lookup table, you can greatly reduce the size.
If you go in this direction and can't afford 64 MBytes for the LUT, you'll have to experiment with the LUT size and make a size/performance vs. quality trade-off.
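If you do go the LUT route, the table can be filled once on the CPU at load time, where the exact math (including the pow calls) is cheap. Below is a minimal Java sketch, assuming a D65 white point, sRGB primaries, and a hypothetical 32x32x32 table with L mapped from 0-100 and a/b from -128 to 127; uploading the resulting byte array with glTexImage3D is left to whatever GL binding you use.

public class LabLut {
    private static final double EPS = 216.0 / 24389.0;   // CIE epsilon (~0.008856)
    private static final double KAPPA = 24389.0 / 27.0;  // CIE kappa (~903.3)
    private static final double XN = 0.95047, YN = 1.0, ZN = 1.08883;  // D65 reference white

    /** Convert one CIELAB value to packed 0xRRGGBB sRGB, clamping out-of-gamut results. */
    static int labToSrgb(double L, double a, double b) {
        double fy = (L + 16.0) / 116.0;
        double fx = fy + a / 500.0;
        double fz = fy - b / 200.0;
        double xr = fx * fx * fx > EPS ? fx * fx * fx : (116.0 * fx - 16.0) / KAPPA;
        double yr = L > KAPPA * EPS ? fy * fy * fy : L / KAPPA;
        double zr = fz * fz * fz > EPS ? fz * fz * fz : (116.0 * fz - 16.0) / KAPPA;
        double X = xr * XN, Y = yr * YN, Z = zr * ZN;
        // XYZ -> linear sRGB
        double rl =  3.2406 * X - 1.5372 * Y - 0.4986 * Z;
        double gl = -0.9689 * X + 1.8758 * Y + 0.0415 * Z;
        double bl =  0.0557 * X - 0.2040 * Y + 1.0570 * Z;
        return (encode(rl) << 16) | (encode(gl) << 8) | encode(bl);
    }

    /** Clamp to [0, 1] and apply the sRGB transfer curve. */
    private static int encode(double c) {
        c = Math.max(0.0, Math.min(1.0, c));
        double s = c <= 0.0031308 ? 12.92 * c : 1.055 * Math.pow(c, 1.0 / 2.4) - 0.055;
        return (int) Math.round(s * 255.0);
    }

    /** Fill an RGBA byte array for a size^3 3D texture; upload it with glTexImage3D. */
    static byte[] buildLut(int size) {
        byte[] lut = new byte[size * size * size * 4];
        int i = 0;
        for (int bi = 0; bi < size; bi++) {            // b axis (slowest varying)
            for (int ai = 0; ai < size; ai++) {        // a axis
                for (int li = 0; li < size; li++) {    // L axis (fastest varying)
                    double L = 100.0 * li / (size - 1);
                    double a = -128.0 + 255.0 * ai / (size - 1);
                    double b = -128.0 + 255.0 * bi / (size - 1);
                    int rgb = labToSrgb(L, a, b);
                    lut[i++] = (byte) (rgb >> 16);     // R
                    lut[i++] = (byte) (rgb >> 8);      // G
                    lut[i++] = (byte) rgb;             // B
                    lut[i++] = (byte) 0xFF;            // A
                }
            }
        }
        return lut;
    }
}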
Related
I'm wondering what the trade off is between using a texture that's 128x512 vs. a texture that's 512x512.
The texture is a skateboard deck (naturally rectangular) so I initially made the texture have an aspect ratio that made the deck appear correctly.
I'd like to use a .basis texture, and I read "Transcoding to PVRTC1 (for iOS) requires square power-of-two textures." in the Three.js BasisTextureLoader documentation.
So I'm trying to weigh the loading time + performance trade off between using the 128x512 as a JPG or PNG vs. a 512x512 basis texture.
My best guess is that the 128x512 would take up less memory because it has fewer texels, but I've also read that GPUs like square textures and that Basis is much more GPU-optimized, so I'm torn on which route to take here.
Any knowledge of the performance trade-offs between these two options would be highly appreciated, especially an explanation of the benefits of basis textures in general.
Three.js only really needs power-of-two textures when you're asking the texture's .minFilter to perform mip-mapping. In that case, the GPU will make several copies of the texture, each at half the resolution of the previous one (512, 256, 128, 64, etc.), which is why it asks for a power of two. The default value does perform mip-mapping; you can see the alternative .minFilter values on this page under "Minification Filters". Nearest and Linear do not require P.O.T. textures, but you'll get pixelation artifacts when the texture is scaled down.
In WebGL you can use a 512x128 texture without problems, since both dimensions are a power of two. The performance trade-off is that you save a bunch of pixels that would have been stretched-out duplicates anyway.
I'm currently developing a skeletal animation system for my project. After a lot of reading I think I understand how it works, but while looking at the source of several projects I was wondering why the scale, rotation and translation values of each bone are stored separately. You could store all of them in a single matrix, right? Wouldn't that be more efficient, or would the extra math needed to extract the separate values make it less efficient?
Also, while I'm at it: apparently there are two common ways of storing the values, either as matrices or as vectors and quaternions, the latter being used more frequently to avoid gimbal lock. However, my project has only 2 degrees of freedom. What would be the most efficient way to store my values?
No, a matrix wouldn't be more efficient because you would have to use a 4x4 matrix, so 16 floats. This is mainly because of the way rotation + translation are stored in the matrix.
If you store the values in SRT form you end up with 9 floats (3 for scale, 3 for the quaternion, 3 for translation), as the rotation quaternion's w component can be re-computed from the other three on load.
Moreover, many game engines do not support non-uniform scaling, so you can shrink the scale to 1 float and end up with 8 floats per bone!
And that's before compression: since you know that bones won't move past a certain point (they are volume-bound), there is no point allocating precision to ranges they will never reach, so you can encode those floats down to, say, 16 bits and end up with the equivalent of 4 floats per bone.
Now to be fair, I've never implemented that last part with compression; it seemed a bit extreme to me and I didn't have the time.
But even just going from 64 bytes per bone to 32 bytes per bone is a 50% saving!
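As a concrete illustration of the 8-float SRT layout (the class and field names here are hypothetical), w is recovered from the unit-quaternion constraint, which assumes quaternions are stored with a non-negative w, and the 4x4 matrix is only rebuilt when it's actually needed for skinning:

/** One bone pose: uniform scale + rotation quaternion (w dropped) + translation. */
final class BoneTransform {
    float scale;              // uniform scale
    float qx, qy, qz;         // quaternion x, y, z; w is re-derived on use
    float tx, ty, tz;         // translation

    /** Rebuild a column-major 4x4 matrix (OpenGL convention) from the SRT data. */
    float[] toMatrix() {
        // w of a unit quaternion (assumes the stored quaternion had w >= 0)
        float w = (float) Math.sqrt(Math.max(0f, 1f - qx * qx - qy * qy - qz * qz));
        float[] m = new float[16];
        // rotation (standard quaternion-to-matrix), with the uniform scale folded in
        m[0]  = scale * (1 - 2 * (qy * qy + qz * qz));
        m[1]  = scale * (2 * (qx * qy + w * qz));
        m[2]  = scale * (2 * (qx * qz - w * qy));
        m[4]  = scale * (2 * (qx * qy - w * qz));
        m[5]  = scale * (1 - 2 * (qx * qx + qz * qz));
        m[6]  = scale * (2 * (qy * qz + w * qx));
        m[8]  = scale * (2 * (qx * qz + w * qy));
        m[9]  = scale * (2 * (qy * qz - w * qx));
        m[10] = scale * (1 - 2 * (qx * qx + qy * qy));
        // translation in the last column
        m[12] = tx; m[13] = ty; m[14] = tz; m[15] = 1f;
        return m;
    }
}

A side benefit of keeping SRT separate is that scale, quaternion and translation interpolate cleanly between keyframes, whereas interpolating matrices component-wise does not.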
I've looked high and low for an answer to this, but I haven't found anything yet. Here's my question:
Can you pack pre-calculated noise into a 2D texture in such a way that you can compute a reasonable facsimile of 3D noise without the overhead of a full-blown 3D noise algorithm?
My initial idea was to take Z slices of X-by-Y noise and pack them side by side, then for every pixel calculate the 'low' and 'high' noise texels and do a weighted interpolation between the two Z samples. Needless to say, this didn't work very well.
I know about the various shaders that generate noise, but by and large they have problems on mobile platforms due to the low-spec hardware and the various optimizations the manufacturers have put in place, so computing it on the fly isn't viable.
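For what it's worth, here is a minimal CPU-side sketch of the slice-packing idea described above (the layout and names are assumptions): Z slices are stored one after another in a single array, XY is sampled at texel resolution, and only the Z axis is blended manually. In a shader the same idea becomes two fetches into the slice atlas plus a mix() on the Z fraction, since the hardware's bilinear filter won't interpolate between different slices for you.

/** Approximate 3D noise from a set of 2D Z slices packed into one array. */
final class PackedNoise {
    final float[] atlas;     // sliceCount slices stored back to back, each sliceSize x sliceSize, row-major
    final int sliceSize;
    final int sliceCount;

    PackedNoise(float[] atlas, int sliceSize, int sliceCount) {
        this.atlas = atlas;
        this.sliceSize = sliceSize;
        this.sliceCount = sliceCount;
    }

    /** x, y, z in [0, 1); XY is sampled at texel resolution, Z is interpolated. */
    float sample(float x, float y, float z) {
        int xi = (int) (x * sliceSize) % sliceSize;
        int yi = (int) (y * sliceSize) % sliceSize;
        float zf = z * sliceCount;
        int z0 = (int) zf % sliceCount;
        int z1 = (z0 + 1) % sliceCount;           // wrap so the noise tiles in Z
        float t = zf - (int) zf;                  // blend factor between the two slices
        float a = atlas[(z0 * sliceSize + yi) * sliceSize + xi];
        float b = atlas[(z1 * sliceSize + yi) * sliceSize + xi];
        return a + (b - a) * t;                   // linear interpolation along Z
    }
}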
I have googled around but haven't found an answer that suits me for OpenGL.
I want to construct a sparse matrix with a single (main) diagonal and around 9 off-diagonals. These diagonals aren't necessarily next to the main diagonal, and they wrap around. Each diagonal is an image in row-major format, i.e. a vector of size NxM.
The size of the matrix is (NxM)x(NxM).
My question is as follows:
After some messing around with the math, I have arrived at the basic units of my operation. It involves a pixel-by-pixel multiplication of two images (WITHOUT limiting the value of the result, i.e. it can be above 1 or below 0), storing the resulting image, and then adding a number of such resulting images (again without limiting the values).
How can I multiply and add images on a pixel-by-pixel basis in OpenGL? Is it easier in 1.1 or 2.0? Will the use of textures clamp the results to between 0 and 1? Will this make full use of the GPU cores?
In order to store values outside the [0, 1] range you would have to use floating-point textures. There is no support for them in OpenGL ES 1.1, and in OpenGL ES 2.0 they are an optional extension (see other SO question).
If your implementation supports it, you could then write a fragment program to do the required math.
In OpenGL ES 1.1 you could use the glTexEnv call to set up how the pixels from different texture units are combined. You could then use "modulate" or "add" to multiply/add the values. The result would be clamped to the [0, 1] range, though.
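A minimal sketch of that fixed-function setup, assuming Android's OpenGL ES 1.x bindings (android.opengl.GLES10); texA and texB are hypothetical texture handles created elsewhere:

import android.opengl.GLES10;

final class TexCombine {
    /** Configure units 0 and 1 so the rasterised colour is texA * texB (clamped to [0, 1]). */
    static void setupModulate(int texA, int texB) {
        // Unit 0: just fetch texture A
        GLES10.glActiveTexture(GLES10.GL_TEXTURE0);
        GLES10.glEnable(GLES10.GL_TEXTURE_2D);
        GLES10.glBindTexture(GLES10.GL_TEXTURE_2D, texA);
        GLES10.glTexEnvf(GLES10.GL_TEXTURE_ENV, GLES10.GL_TEXTURE_ENV_MODE, GLES10.GL_REPLACE);

        // Unit 1: multiply the result of unit 0 by texture B
        // (swap GL_MODULATE for GL_ADD to sum instead; both clamp to [0, 1])
        GLES10.glActiveTexture(GLES10.GL_TEXTURE1);
        GLES10.glEnable(GLES10.GL_TEXTURE_2D);
        GLES10.glBindTexture(GLES10.GL_TEXTURE_2D, texB);
        GLES10.glTexEnvf(GLES10.GL_TEXTURE_ENV, GLES10.GL_TEXTURE_ENV_MODE, GLES10.GL_MODULATE);
    }
}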
How do I segment a 2D image into blobs of similar values efficiently? The given input is an array of integers, which holds the hue for non-gray pixels and the brightness for gray pixels.
I am writing a virtual mobile robot in Java, and I am using segmentation to analyze the map and also the image from the camera. This is a well-known problem in computer vision, but when it runs on a robot, performance matters, so I wanted some input. The algorithm is what matters, so you can post code in any language.
Wikipedia article: Segmentation (image processing)
[PPT] Stanford CS-223-B Lecture 11 Segmentation and Grouping (which says Mean Shift is perhaps the best technique to date)
Mean Shift Pictures (paper is also available from Dorin Comaniciu)
I would downsample, in colour space and in number of pixels, use a vision method (probably mean shift), and upscale the result.
This is good because downsampling also increases the robustness to noise, and makes it more likely that you get meaningful segments.
You could use floodfill to smooth edges afterwards if you need smoothness.
Some more thoughts (in response to your comment):
1) Did you blend as you downsampled, e.g. y[i] = (x[2i] + x[2i+1]) / 2? That should help suppress noise (see the sketch after this list).
2) How fast do you need it to be?
3) Have you tried dynamic mean shift? (Also try googling "dynamic X" for any algorithm X.)
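Regarding 1), here is a minimal sketch of a 2x box-filter downsample in Java (it averages the raw values directly, which ignores hue wrap-around for the non-gray pixels):

/** Halve an image in each dimension, averaging each 2x2 block of values. */
static int[] downsample2x(int[] src, int width, int height) {
    int w2 = width / 2, h2 = height / 2;
    int[] dst = new int[w2 * h2];
    for (int y = 0; y < h2; y++) {
        for (int x = 0; x < w2; x++) {
            int sum = src[(2 * y) * width + 2 * x]
                    + src[(2 * y) * width + 2 * x + 1]
                    + src[(2 * y + 1) * width + 2 * x]
                    + src[(2 * y + 1) * width + 2 * x + 1];
            dst[y * w2 + x] = sum / 4;   // averaging suppresses per-pixel noise
        }
    }
    return dst;
}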
Not sure how efficient it would be, but you could try using a Kohonen neural network (or self-organizing map, SOM) to group the similar values, where each pixel contains the original color and position and only the color is used for the Kohonen grouping.
You should read up before you implement this though, as my knowledge of the Kohonen network goes as far as that it is used for grouping data - so I don't know what the performance/viability options are for your scenario.
There are also Hopfield networks; from what I've read, they can be coaxed into doing grouping as well.
What I have now:
Make a buffer of the same size as the input image, initialized to UNSEGMENTED.
For each pixel in the image where the corresponding buffer value is still UNSEGMENTED, flood the buffer using that pixel's value.
a. The border check of the flood is done by testing whether a pixel's value is within EPSILON (currently set to 10) of the originating pixel's value.
b. Flood filling algorithm.
Possible issue:
The border check from 2.a. is called many times in the flood-fill algorithm. I could turn it into a lookup if I precalculated the borders using edge detection, but that might add more time than the current check.
private boolean isValuesCloseEnough(int a_lhs, int a_rhs) {
    return Math.abs(a_lhs - a_rhs) <= EPSILON;
}
Possible Enhancement:
Instead of checking every single pixel for UNSEGMENTED, I could randomly pick a few starting points. If you are expecting around 10 blobs, picking a number of random points of that order may suffice. The drawback is that you might miss a useful but small blob.
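For reference, here is a compact sketch of the approach described above, using an explicit stack instead of recursion (the class and constant names are placeholders):

import java.util.ArrayDeque;
import java.util.Deque;

final class BlobSegmenter {
    static final int UNSEGMENTED = -1;
    static final int EPSILON = 10;

    /** Returns a label buffer the size of the image; pixels in the same blob share a label. */
    static int[] segment(int[] image, int width, int height) {
        int[] labels = new int[width * height];
        java.util.Arrays.fill(labels, UNSEGMENTED);
        int nextLabel = 0;
        for (int start = 0; start < labels.length; start++) {
            if (labels[start] != UNSEGMENTED) continue;     // already part of a blob
            flood(image, labels, width, height, start, nextLabel++);
        }
        return labels;
    }

    /** Iterative 4-connected flood fill from 'start', bounded by the EPSILON test. */
    private static void flood(int[] image, int[] labels, int width, int height,
                              int start, int label) {
        int origin = image[start];                          // originating pixel's value
        Deque<Integer> stack = new ArrayDeque<>();
        labels[start] = label;
        stack.push(start);
        while (!stack.isEmpty()) {
            int p = stack.pop();
            int x = p % width, y = p / width;
            if (x > 0)          tryVisit(image, labels, p - 1, origin, label, stack);
            if (x < width - 1)  tryVisit(image, labels, p + 1, origin, label, stack);
            if (y > 0)          tryVisit(image, labels, p - width, origin, label, stack);
            if (y < height - 1) tryVisit(image, labels, p + width, origin, label, stack);
        }
    }

    /** Claim neighbour n for this blob if it is unsegmented and close enough to the origin. */
    private static void tryVisit(int[] image, int[] labels, int n, int origin, int label,
                                 Deque<Integer> stack) {
        if (labels[n] == UNSEGMENTED && Math.abs(image[n] - origin) <= EPSILON) {
            labels[n] = label;
            stack.push(n);
        }
    }
}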
Check out Eyepatch (eyepatch.stanford.edu). It should help you during the investigation phase by providing a variety of possible filters for segmentation.
An alternative to flood fill is the connected-components algorithm. So:
Cheaply classify your pixels, e.g. by dividing them up in colour space.
Run the connected-components pass to find the blobs.
Retain the blobs of significant size.
This approach is widely used in early vision approaches; see, for example, the seminal paper "Blobworld: A System for Region-Based Image Indexing and Retrieval".
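Here is a minimal sketch of that pipeline in Java, assuming the pixels have already been classified into discrete class ids; the two-pass union-find labelling below is just one common way to implement the connected-components step:

/** Two-pass 4-connected components on a classified image (one class id per pixel). */
final class ConnectedComponents {
    static int[] label(int[] classes, int width, int height) {
        int[] parent = new int[width * height];             // union-find forest
        for (int i = 0; i < parent.length; i++) parent[i] = i;

        // Pass 1: union each pixel with its left/top neighbour when the class matches.
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int p = y * width + x;
                if (x > 0 && classes[p] == classes[p - 1]) union(parent, p, p - 1);
                if (y > 0 && classes[p] == classes[p - width]) union(parent, p, p - width);
            }
        }
        // Pass 2: flatten each pixel to its root, which becomes the blob label.
        int[] labels = new int[width * height];
        for (int i = 0; i < labels.length; i++) labels[i] = find(parent, i);
        return labels;
    }

    private static int find(int[] parent, int i) {
        while (parent[i] != i) { parent[i] = parent[parent[i]]; i = parent[i]; }  // path halving
        return i;
    }

    private static void union(int[] parent, int a, int b) {
        parent[find(parent, a)] = find(parent, b);
    }
}

Blob sizes can then be counted in one more pass over the label buffer, and blobs below a size threshold discarded, which covers the "retain the blobs of significant size" step.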