Is it possible to know the brightness of a picture in Flutter?

I am building an application which has a camera inside.
After I take a photo, I want to analyze it to determine its brightness; if the brightness is bad, I have to retake the photo.
This is my code right now. It's a JavaScript function that I found and rewrote in Dart:
Thanks to @Abion47
EDIT 1
for (int i = 0; i < pixels.length; i++) {
  int pixel = pixels[i];
  int b = (pixel & 0x00FF0000) >> 16;
  int g = (pixel & 0x0000FF00) >> 8;
  int r = (pixel & 0x000000FF);
  avg = ((r + g + b) / 3).floor();
  colorSum += avg;
}
brightness = (colorSum / (width * height)).round();
// I tried with this other code
//brightness = (colorSum / pixels.length).round();
return brightness;
But I get less brightness on white than on black; the numbers are a little weird.
Do you know a better way to measure the brightness?
SOLUTION:
On further investigation we found the solution: we had an error in our image decoding, so we used the image package's decodeImage function to do it instead.
Here is our final code:
Image image = decodeImage(file.readAsBytesSync());
var data = image.getBytes(); // RGBA byte list, 4 bytes per pixel
var colorSum = 0;
for (var x = 0; x < data.length; x += 4) {
  int r = data[x];
  int g = data[x + 1];
  int b = data[x + 2];
  int avg = ((r + g + b) / 3).floor();
  colorSum += avg;
}
var brightness = (colorSum / (image.width * image.height)).floor();
return brightness;
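As a side note (not part of the original fix): a plain RGB mean weights all channels equally, while perceived brightness weights green more heavily. A minimal sketch of a luma-weighted average, in Python with Pillow and NumPy and assuming a hypothetical photo.jpg, would be:

import numpy as np
from PIL import Image

# Rec. 601 luma weights: green contributes most to perceived brightness
rgb = np.asarray(Image.open('photo.jpg').convert('RGB'), dtype=np.float64)
luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
print(luma.mean())  # 0 (black) .. 255 (white)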
Hope it helps you.

There are several things wrong with your code.
First, you are getting a range error because you are attempting to access a pixel that doesn't exist. This is probably due to width and/or height being greater than the image's actual width or height. There are a lot of ways to try to get these values, but for this application it doesn't actually matter, since the end goal is to get an average value across all pixels in the image, and you don't need the width or height of the image for that.
Second, you are fetching the color values by serializing the color value into a hex string and then parsing the individual channel substrings. Your substring is going to result in incorrect values because:
foo.substring(a, b) takes the substring of foo from a to b, exclusive. That means that a and b are indices, not lengths, and the resulting string will not include the character at b. So assuming hex is "01234567", when you do hex.substring(0, 2), you get "01", and then you do hex.substring(3, 5) you get "34" while hex.substring(6, 8) gets you "67". You need to do hex.substring(0, 2) followed by hex.substring(2, 4) and hex.substring(4, 6) to get the first three channels.
That being said, you are fetching the wrong channels. The image package stores its pixel values in ABGR format, meaning the first two characters in the hex string are going to be the alpha channel, which is unimportant when calculating image brightness. Instead, you want the second, third, and fourth channels for the blue, green, and red values respectively.
And having said all that, this is an extremely inefficient way to do this anyway when the preferred way to retrieve channel data from an integer color value is with bitwise operations on the integer itself. (Never convert a number to a string or vice versa unless you absolutely have to.)
So in summary, what you want will likely be something akin to the following:
final pixels = image.data;
double colorSum = 0;
for (int i = 0; i < pixels.length; i++) {
  int pixel = pixels[i];
  int b = (pixel & 0x00FF0000) >> 16;
  int g = (pixel & 0x0000FF00) >> 8;
  int r = (pixel & 0x000000FF);
  double avg = (r + g + b) / 3;
  colorSum += avg;
}
return colorSum / pixels.length;

Related

Issue in plotting the resultant bitmap of two bitmaps' difference

I want to compare one bitmap with another bitmap (a reference bitmap) and draw all of their differences in a resultant bitmap.
Using the code below I am able to draw only the difference area, but not with its exact colors.
Here is my code
Bitmap ResultantBitMap = new Bitmap(bitMap1.Width, bitMap1.Height);
BitmapData bitMap1Data = bitMap1.LockBits(new Rectangle(0, 0, bitMap1.Width, bitMap1.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
BitmapData bitMap2Data = bitMap2.LockBits(new Rectangle(0, 0, bitMap2.Width, bitMap2.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
BitmapData bitMapResultantData = ResultantBitMap.LockBits(new Rectangle(0, 0, ResultantBitMap.Width, ResultantBitMap.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
IntPtr scan0 = bitMap1Data.Scan0;
IntPtr scan02 = bitMap2Data.Scan0;
IntPtr scan0ResImg1 = bitMapResultantData.Scan0;
int bitMap1Stride = bitMap1Data.Stride;
int bitMap2Stride = bitMap2Data.Stride;
int ResultantImageStride = bitMapResultantData.Stride;
for (int y = 0; y < bitMap1.Height; y++)
{
    //define the pointers inside the first loop for parallelizing
    byte* p = (byte*)scan0.ToPointer();
    p += y * bitMap1Stride;
    byte* p2 = (byte*)scan02.ToPointer();
    p2 += y * bitMap2Stride;
    byte* pResImg1 = (byte*)scan0ResImg1.ToPointer();
    pResImg1 += y * ResultantImageStride;
    for (int x = 0; x < bitMap1.Width; x++)
    {
        //always copy the complete pixel when differences are found
        if (Math.Abs(p[0] - p2[0]) >= 20 || Math.Abs(p[1] - p2[1]) >= 20 || Math.Abs(p[2] - p2[2]) >= 20)
        {
            pResImg1[0] = p2[0]; // B
            pResImg1[1] = p2[1]; // G
            pResImg1[2] = p2[2]; // R
            pResImg1[3] = p2[3]; // A (opacity)
        }
        p += 4;
        p2 += 4;
        pResImg1 += 4;
    }
}
bitMap1.UnlockBits(bitMap1Data);
bitMap2.UnlockBits(bitMap2Data);
ResultantBitMap.UnlockBits(bitMapResultantData);
ResultantBitMap.Save(@"c:\abcd\abcd.jpeg");
What I want is the difference image with exact color of the reference image.
It's hard to tell what's going on without knowing what all those library calls and "+= 4"s do, but are you sure p and p2 correspond to the first and second images of your diagram?
Also, with Format32bppArgb the pixel bytes are laid out in memory as B, G, R, A on little-endian machines, so index [0] is blue rather than red. Maybe there's a problem with that, too.
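If it helps, here is a minimal sketch of the same masking idea (keep the reference image's exact pixels wherever the images differ) in Python with NumPy and Pillow; the file names and the threshold of 20 mirror the question's code and are illustrative:

import numpy as np
from PIL import Image

# load both images as RGBA; int16 avoids overflow when subtracting uint8 values
img1 = np.asarray(Image.open('bitmap1.png').convert('RGBA'), dtype=np.int16)
img2 = np.asarray(Image.open('bitmap2.png').convert('RGBA'), dtype=np.int16)

# True wherever any RGB channel differs by at least 20
mask = (np.abs(img1[..., :3] - img2[..., :3]) >= 20).any(axis=-1)

# copy the reference image's exact pixels at the differing locations
result = np.zeros_like(img2)
result[mask] = img2[mask]
Image.fromarray(result.astype(np.uint8)).save('diff.png')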

How do convolution matrices work?

How do those matrices work? Do I need to multiply every single pixel? And what about the upper-left, upper-right, bottom-left and bottom-right pixels, where there is no surrounding pixel? Does the matrix work from left to right and from top to bottom, or from top to bottom first and then left to right?
Why does this kernel (Edge enhance): http://i.stack.imgur.com/d755G.png
turn into this image: http://i.stack.imgur.com/NRdkK.jpg
The Convolution filter is applied to every single pixel.
On the edges there are a few things you can do (all leave a type of border or shrink the image):
skip the edges and crop 1 pixel from the edge of the image
substitute 0 or 255 for any of the pixels that are out of bounds for the image
use a cubic spline (or other interpolation method) between 0 (or 255) and the value of the image's edge pixel to come up with a substitute.
The order in which you apply the convolution does not matter (upper right to bottom left is most common); you should get the same results regardless of the order.
However, a common mistake when applying a convolution matrix is to overwrite the current pixel you are examining with the new value. This will affect the value you come up with for the pixel next to the current one. A better method would be to create a buffer to hold the computed values, so that previous applications of the convolution filter do not affect current application of the matrix.
From your example images it is hard to tell why the applied filter creates the black and white version without seeing the original image.
Below is a step by step example of applying a convolution kernel to an image (1D for simplicity).
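A minimal sketch of such a 1D pass in Python (the toy signal and the edge-detection kernel are illustrative):

# a 1D signal with one edge, and a 1D edge-detection kernel
signal = [10, 10, 10, 50, 50, 50]
kernel = [-1, 0, 1]

output = []
for i in range(1, len(signal) - 1):      # skip the border samples
    acc = 0
    for j, k in enumerate(kernel):       # multiply-accumulate over the window
        acc += k * signal[i - 1 + j]
    output.append(acc)

print(output)  # [0, 40, 40, 0]: non-zero exactly where the edge is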
As for the edge enhancement kernel in your post, notice the +1 next to the -1. Think about what that will do. If the region is constant the two pixel under the +/-1 will add to zero (black). If the two pixels are different they will have a non-zero value. So what you are seeing is that pixels next to each other that are different get highlighted, while ones that are the same get set to black. The bigger the difference the brighter (more white) the pixel in the filtered image.
Yes, you multiply every pixel with that matrix. The traditional method is to find the relevant pixels relative to the pixel being convolved, multiply each by its factor, and average the result. So a 3x3 blur of:
1, 1, 1,
1, 1, 1,
1, 1, 1
This matrix means you take the relevant values of the various components and multiply them, then divide by the number of elements. So for that 3 by 3 box, you add up all the red values and divide by 9, add up all the green values and divide by 9, and add up all the blue values and divide by 9.
This means a couple of things. First, you need a second giant chunk of memory to perform this operation. And you do this for every pixel you can.
However, that's only the traditional method, and the traditional method is actually needlessly convoluted (get it?). If you return the results in a corner, you never actually need any additional memory and can do the entire operation within the memory footprint you started with.
public static void convolve(int[] pixels, int offset, int stride, int x, int y, int width, int height, int[][] matrix, int parts) {
    int index = offset + x + (y * stride);
    for (int j = 0; j < height; j++, index += stride) {
        for (int k = 0; k < width; k++) {
            int pos = index + k;
            // store the result at the upper-left corner of the kernel window
            pixels[pos] = convolve(pixels, stride, pos, matrix, parts);
        }
    }
}

private static int crimp(int color) {
    return (color >= 0xFF) ? 0xFF : (color < 0) ? 0 : color;
}

private static int convolve(int[] pixels, int stride, int index, int[][] matrix, int parts) {
    int redSum = 0;
    int greenSum = 0;
    int blueSum = 0;
    int pixel, factor;
    for (int j = 0, m = matrix.length; j < m; j++, index += stride) {
        for (int k = 0, n = matrix[j].length; k < n; k++) {
            pixel = pixels[index + k];
            factor = matrix[j][k];
            redSum += factor * ((pixel >> 16) & 0xFF);
            greenSum += factor * ((pixel >> 8) & 0xFF);
            blueSum += factor * ((pixel) & 0xFF);
        }
    }
    return 0xFF000000 | ((crimp(redSum / parts) << 16) | (crimp(greenSum / parts) << 8) | (crimp(blueSum / parts)));
}
The kernel traditionally returns the value to the center-most pixel. This allows the image to blur around the edges but more or less remain where it started. It seemed like a good idea, but it's actually problematic. The correct way to do it is to put the result pixel in the upper-left corner. Then you can simply, and with no extra memory, iterate the entire image with a scanline, going one pixel at a time and writing the value, without causing errors. The bulk of the color weight is shifted up and left by one pixel. But it's one pixel, and you can shift it back down and to the right if you iterate backwards with the result pixel in the bottom-right, though this might be trouble for the cache hits.
However, a lot of modern architectures have GPUs now, so the entire image can be done simultaneously, making this something of a moot point. But it is strange that one of the most important algorithms in graphics is weird in requiring this, as that makes the easiest way to do the operation impossible and a memory hog.
So when people like Matt on this question say things like "However, a common mistake when applying a convolution matrix is to overwrite the current pixel you are examining with the new value" -- really, this is the correct way to do it; the error is writing the result pixel to the center rather than the upper-left corner, because unlike the upper-left corner, you will need the center pixel again. You won't ever need the upper-left corner again (assuming you are iterating left->right, top->bottom), so it's safe to store your value there.
"This will affect the value you come up with for the pixel next to the current one." -- If you wrote it to the upper-left corner as you processed the image as a scan, you would only overwrite data that you never need again. Using a bunch of extra memory isn't a better solution.
As such, here's likely the fastest Java blur you'd ever see.
private static void applyBlur(int[] pixels, int stride) {
    int v0, v1, v2, r, g, b;
    int pos;
    pos = 0;
    try {
        // horizontal pass: average each pixel with its two right-hand neighbors
        while (true) {
            v0 = pixels[pos];
            v1 = pixels[pos + 1];
            v2 = pixels[pos + 2];
            r = ((v0 >> 16) & 0xFF) + ((v1 >> 16) & 0xFF) + ((v2 >> 16) & 0xFF);
            g = ((v0 >> 8) & 0xFF) + ((v1 >> 8) & 0xFF) + ((v2 >> 8) & 0xFF);
            b = ((v0) & 0xFF) + ((v1) & 0xFF) + ((v2) & 0xFF);
            r /= 3;
            g /= 3;
            b /= 3;
            pixels[pos++] = r << 16 | g << 8 | b;
        }
    } catch (ArrayIndexOutOfBoundsException e) { }
    pos = 0;
    try {
        // vertical pass: average each pixel with the two pixels below it
        while (true) {
            v0 = pixels[pos];
            v1 = pixels[pos + stride];
            v2 = pixels[pos + stride + stride];
            r = ((v0 >> 16) & 0xFF) + ((v1 >> 16) & 0xFF) + ((v2 >> 16) & 0xFF);
            g = ((v0 >> 8) & 0xFF) + ((v1 >> 8) & 0xFF) + ((v2 >> 8) & 0xFF);
            b = ((v0) & 0xFF) + ((v1) & 0xFF) + ((v2) & 0xFF);
            r /= 3;
            g /= 3;
            b /= 3;
            pixels[pos++] = r << 16 | g << 8 | b;
        }
    } catch (ArrayIndexOutOfBoundsException e) { }
}

Data structures and algorithms for adaptive "uniform" mesh?

I need a data structure for storing float values on a uniformly sampled 3D mesh:
x = x0 + ix*dx where 0 <= ix < nx
y = y0 + iy*dy where 0 <= iy < ny
z = z0 + iz*dz where 0 <= iz < nz
Up to now I have used my Array class:
Array3D<float> A(nx, ny,nz);
A(0,0,0) = 0.0f; // ix = iy = iz = 0
Internally it stores the float values as a 1D array with nx * ny * nz elements.
However, now I need to represent a mesh with more values than I have RAM for,
e.g. nx = ny = nz = 2000.
I think many neighbouring nodes in such a mesh may have similar values, so I was wondering if there is some simple way to "coarsen" the mesh adaptively.
For instance, if the 8 (ix,iy,iz) nodes of a cell in this mesh have values that are less than 5% apart, they are "removed" and replaced by just one value: the mean of the 8 values.
How could I implement such a data structure in a simple and efficient way?
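One possible shape for the coarsening idea, as a minimal sketch in Python (the Cell layout and the 5% test are illustrative, not a tested design):

class Cell:
    # An octree-style cell: collapse to a single value when the 8 corner
    # samples agree to within 5%, otherwise mark it for subdivision.
    def __init__(self, corners):  # corners: the 8 float samples of a cell
        lo, hi = min(corners), max(corners)
        scale = max(abs(lo), abs(hi))
        if scale == 0 or (hi - lo) / scale < 0.05:
            self.value = sum(corners) / 8.0   # coarsened: one float per cell
            self.children = None
        else:
            self.value = None
            self.children = []                # to be filled with 8 sub-cells

cell = Cell([1.00, 1.01, 1.02, 1.00, 0.99, 1.01, 1.03, 1.00])
print(cell.value)  # ~1.0075: the whole cell stored as a single float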
EDIT:
Thanks to Ante for suggesting lossy compression. I think this could work the following way:
#define BLOCK_SIZE 64

struct CompressedArray3D {
    CompressedArray3D(int ni, int nj, int nk) {
        NI = ni/BLOCK_SIZE + 1;
        NJ = nj/BLOCK_SIZE + 1;
        NK = nk/BLOCK_SIZE + 1;
        blocks = new float*[NI*NJ*NK];
        compressedSize = new unsigned int[NI*NJ*NK];
    }

    void setBlock(int I, int J, int K, float values[BLOCK_SIZE][BLOCK_SIZE][BLOCK_SIZE]) {
        unsigned int csize;
        blocks[I*NJ*NK + J*NK + K] = compress(values, csize);
        compressedSize[I*NJ*NK + J*NK + K] = csize;
    }

    float getValue(int i, int j, int k) {
        int I = i/BLOCK_SIZE;
        int J = j/BLOCK_SIZE;
        int K = k/BLOCK_SIZE;
        int ii = i - I*BLOCK_SIZE;
        int jj = j - J*BLOCK_SIZE;
        int kk = k - K*BLOCK_SIZE;
        float *compressedBlock = blocks[I*NJ*NK + J*NK + K];
        unsigned int csize = compressedSize[I*NJ*NK + J*NK + K];
        float values[BLOCK_SIZE][BLOCK_SIZE][BLOCK_SIZE];
        decompress(compressedBlock, csize, values);
        return values[ii][jj][kk];
    }

    // number of blocks:
    int NI, NJ, NK;
    // number of samples:
    int ni, nj, nk;
    float** blocks;
    unsigned int* compressedSize;
};
For this to be useful I need a lossy compression that is:
extremely fast, also on small datasets (e.g. 64x64x64)
compresses quite hard, > 3x; never mind if it loses quite a bit of info.
Any good candidates?
It sounds like you're looking for a LOD (level of detail) adaptive mesh. It's a recurring theme in video games and terrain simulation.
For terrain, see here: http://vterrain.org/LOD/Papers/ -- look for the ROAM video, which IIRC is not only adaptive by distance, but also by view direction.
For non-terrain entities, there is a huge body of work (here's one example: Generic Adaptive Mesh Refinement).
I would suggest using OctoMap to handle large 3D data,
and extending it as shown here to handle geometrical properties.

Node.js / coffeescript performance on a math-intensive algorithm

I am experimenting with node.js to build some server-side logic, and have implemented a version of the diamond-square algorithm described here in coffeescript and Java. Given all the praise I have heard for node.js and V8 performance, I was hoping that node.js would not lag too far behind the java version.
However on a 4096x4096 map, Java finishes in under 1s but node.js/coffeescript takes over 20s on my machine...
These are my full results (the x-axis is grid size; log and linear charts):
Is this because there is something wrong with my coffeescript implementation, or is this just the nature of node.js still?
Coffeescript
genHeightField = (sz) ->
  timeStart = new Date()
  DATA_SIZE = sz
  SEED = 1000.0
  data = new Array()
  iters = 0
  # warm up the arrays to tell the js engine these are dense arrays
  # seems to have negligible effect when running on node.js though
  for rows in [0...DATA_SIZE]
    data[rows] = new Array();
    for cols in [0...DATA_SIZE]
      data[rows][cols] = 0
  data[0][0] = data[0][DATA_SIZE-1] = data[DATA_SIZE-1][0] =
    data[DATA_SIZE-1][DATA_SIZE-1] = SEED;
  h = 500.0
  sideLength = DATA_SIZE-1
  while sideLength >= 2
    halfSide = sideLength / 2
    for x in [0...DATA_SIZE-1] by sideLength
      for y in [0...DATA_SIZE-1] by sideLength
        avg = data[x][y] +
          data[x + sideLength][y] +
          data[x][y + sideLength] +
          data[x + sideLength][y + sideLength]
        avg /= 4.0;
        data[x + halfSide][y + halfSide] =
          avg + Math.random() * (2 * h) - h;
        iters++
        #console.log "A:" + x + "," + y
    for x in [0...DATA_SIZE-1] by halfSide
      y = (x + halfSide) % sideLength
      while y < DATA_SIZE-1
        avg =
          data[(x-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)][y] +
          data[(x+halfSide)%(DATA_SIZE-1)][y] +
          data[x][(y+halfSide)%(DATA_SIZE-1)] +
          data[x][(y-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)]
        avg /= 4.0;
        avg = avg + Math.random() * (2 * h) - h;
        data[x][y] = avg;
        if x is 0
          data[DATA_SIZE-1][y] = avg;
        if y is 0
          data[x][DATA_SIZE-1] = avg;
        #console.log "B: " + x + "," + y
        y += sideLength
        iters++
    sideLength /= 2
    h /= 2.0
  #console.log iters
  console.log (new Date() - timeStart)

genHeightField(256+1)
genHeightField(512+1)
genHeightField(1024+1)
genHeightField(2048+1)
genHeightField(4096+1)
Java
import java.util.Random;

class Gen {
    public static void main(String args[]) {
        genHeight(256+1);
        genHeight(512+1);
        genHeight(1024+1);
        genHeight(2048+1);
        genHeight(4096+1);
    }

    public static void genHeight(int sz) {
        long timeStart = System.currentTimeMillis();
        int iters = 0;
        final int DATA_SIZE = sz;
        final double SEED = 1000.0;
        double[][] data = new double[DATA_SIZE][DATA_SIZE];
        data[0][0] = data[0][DATA_SIZE-1] = data[DATA_SIZE-1][0] =
            data[DATA_SIZE-1][DATA_SIZE-1] = SEED;
        double h = 500.0;
        Random r = new Random();
        for (int sideLength = DATA_SIZE-1;
             sideLength >= 2;
             sideLength /= 2, h /= 2.0) {
            int halfSide = sideLength/2;
            for (int x = 0; x < DATA_SIZE-1; x += sideLength) {
                for (int y = 0; y < DATA_SIZE-1; y += sideLength) {
                    double avg = data[x][y] +
                        data[x+sideLength][y] +
                        data[x][y+sideLength] +
                        data[x+sideLength][y+sideLength];
                    avg /= 4.0;
                    data[x+halfSide][y+halfSide] =
                        avg + (r.nextDouble()*2*h) - h;
                    iters++;
                    //System.out.println("A:" + x + "," + y);
                }
            }
            for (int x = 0; x < DATA_SIZE-1; x += halfSide) {
                for (int y = (x+halfSide)%sideLength; y < DATA_SIZE-1; y += sideLength) {
                    double avg =
                        data[(x-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)][y] +
                        data[(x+halfSide)%(DATA_SIZE-1)][y] +
                        data[x][(y+halfSide)%(DATA_SIZE-1)] +
                        data[x][(y-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)];
                    avg /= 4.0;
                    avg = avg + (r.nextDouble()*2*h) - h;
                    data[x][y] = avg;
                    if (x == 0) data[DATA_SIZE-1][y] = avg;
                    if (y == 0) data[x][DATA_SIZE-1] = avg;
                    iters++;
                    //System.out.println("B:" + x + "," + y);
                }
            }
        }
        //System.out.print(iters + " ");
        System.out.println(System.currentTimeMillis() - timeStart);
    }
}
As other answerers have pointed out, JavaScript's arrays are a major performance bottleneck for the type of operations you're doing. Because they're dynamic, it's naturally much slower to access elements than it is with Java's static arrays.
The good news is that there is an emerging standard for statically typed arrays in JavaScript, already supported in some browsers. Though they are not yet supported in Node proper, you can easily add them with a library: https://github.com/tlrobinson/v8-typed-array
After installing typed-array via npm, here's my modified version of your code:
{Float32Array} = require 'typed-array'

genHeightField = (sz) ->
  timeStart = new Date()
  DATA_SIZE = sz
  SEED = 1000.0
  iters = 0
  # Initialize 2D array of floats
  data = new Array(DATA_SIZE)
  for rows in [0...DATA_SIZE]
    data[rows] = new Float32Array(DATA_SIZE)
    for cols in [0...DATA_SIZE]
      data[rows][cols] = 0
  # The rest is the same...
The key line in there is the declaration of data[rows].
With the line data[rows] = new Array(DATA_SIZE) (essentially equivalent to the original), I get the benchmark numbers:
17
75
417
1376
5461
And with the line data[rows] = new Float32Array(DATA_SIZE), I get
19
47
215
855
3452
So that one small change cuts the running time down by about 1/3, i.e. a 50% speed increase!
It's still not Java, but it's a pretty substantial improvement. Expect future versions of Node/V8 to narrow the performance gap further.
Caveat: It should be mentioned that normal JS numbers are double-precision, i.e. 64-bit floats. Using Float32Array will thus reduce precision, making this a bit of an apples-and-oranges comparison; I don't know how much of the performance improvement is from using 32-bit math, and how much is from faster array access. A Float64Array is part of the V8 spec, but isn't yet implemented in the v8-typed-array library.
If you're looking for performance in algorithms like this, both coffee/js and Java are the wrong languages to be using. JavaScript is especially poor for problems like this because it does not have a true array type: arrays are just hash maps whose keys must be integers, which obviously will not be as quick as a real array. What you want is to write this algorithm in C and call it from node (see http://nodejs.org/docs/v0.4.10/api/addons.html). Unless you're really good at hand-optimizing machine code, good C will easily outstrip any other language.
Forget about CoffeeScript for a minute, because that's not the root of the problem. That code just gets compiled to regular old JavaScript anyway when node runs it.
Just like any other javascript environment, node is single-threaded. The V8 engine is bloody fast, but for certain types of applications you might not be able to exceed the speed of the jvm.
I would first suggest trying to write out your diamond algorithm directly in JS before moving to CS. See what kinds of speed optimizations you can make.
Actually, I'm kind of interested in this problem now too and am going to take a look at doing this.
Edit #2: This is my 2nd rewrite with some optimizations, such as pre-populating the data array. It's not significantly faster, but the code is a bit cleaner.
var makegrid = function(size){
    size++; //increment by 1
    var grid = [];
    grid.length = size;
    var gsize = size - 1; //frequently used value in later calculations
    //setup grid array
    var len = size;
    while (len--) {
        grid[len] = (new Array(size+1).join(0).split('')); //creates an array of length "size" where each index === 0
    }
    //populate four corners of the grid
    //(corner_vals and height_range are assumed to be defined in the enclosing scope)
    grid[0][0] = grid[gsize][0] = grid[0][gsize] = grid[gsize][gsize] = corner_vals;
    var side_length = gsize;
    while (side_length >= 2) {
        var half_side = Math.floor(side_length / 2);
        //generate new square values
        for (var x = 0; x < gsize; x += side_length) {
            for (var y = 0; y < gsize; y += side_length) {
                //average of the square's four corners, plus a random offset
                var avg = ((grid[x][y] + grid[x+side_length][y] + grid[x][y+side_length] + grid[x+side_length][y+side_length]) / 4) + (Math.random() * (2*height_range - height_range));
                //store the value at the center point
                grid[x+half_side][y+half_side] = Math.floor(avg);
            }
        }
        //generate diamond values
        for (var x = 0; x < gsize; x += half_side) {
            for (var y = (x+half_side) % side_length; y < gsize; y += side_length) {
                var avg = Math.floor( ((grid[(x-half_side+gsize)%gsize][y] + grid[(x+half_side)%gsize][y] + grid[x][(y+half_side)%gsize] + grid[x][(y-half_side+gsize)%gsize]) / 4) + (Math.random() * (2*height_range - height_range)) );
                grid[x][y] = avg;
                if (x === 0) grid[gsize][y] = avg;
                if (y === 0) grid[x][gsize] = avg;
            }
        }
        side_length /= 2;
        height_range /= 2;
    }
    return grid;
}

makegrid(256)
makegrid(512)
makegrid(1024)
makegrid(2048)
makegrid(4096)
I have always assumed that when people describe JavaScript runtimes as 'fast', they mean relative to other interpreted, dynamic languages. A comparison to Ruby, Python or Smalltalk would be interesting. Comparing JavaScript to Java is not a fair comparison.
To answer your question, I believe that the results you are seeing are indicative of what you can expect comparing these two vastly different languages.

How to reduce the number of colors in an image with OpenCV?

I have a set of image files, and I want to reduce the number of colors of them to 64. How can I do this with OpenCV?
I need this so I can work with a 64-sized image histogram.
I'm implementing CBIR techniques
What I want is color quantization to a 4-bit palette.
This subject was well covered in OpenCV 2 Computer Vision Application Programming Cookbook:
Chapter 2 shows a few reduction operations, one of them demonstrated here in C++ and later in Python:
#include <iostream>
#include <vector>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

void colorReduce(cv::Mat& image, int div=64)
{
    int nl = image.rows;                    // number of lines
    int nc = image.cols * image.channels(); // number of elements per line
    for (int j = 0; j < nl; j++)
    {
        // get the address of row j
        uchar* data = image.ptr<uchar>(j);
        for (int i = 0; i < nc; i++)
        {
            // process each pixel
            data[i] = data[i] / div * div + div / 2;
        }
    }
}

int main(int argc, char* argv[])
{
    // Load input image (colored, 3-channel, BGR)
    cv::Mat input = cv::imread(argv[1]);
    if (input.empty())
    {
        std::cout << "!!! Failed imread()" << std::endl;
        return -1;
    }
    colorReduce(input);
    cv::imshow("Color Reduction", input);
    cv::imwrite("output.jpg", input);
    cv::waitKey(0);
    return 0;
}
Below you can find the input image (left) and the output of this operation (right):
The equivalent code in Python would be the following:
(credits to @eliezer-bernart)
import cv2
import numpy as np
input = cv2.imread('castle.jpg')
# colorReduce()
div = 64
quantized = input // div * div + div // 2
cv2.imwrite('output.jpg', quantized)
You might consider K-means, yet in this case it will most likely be extremely slow. A better approach might be doing this "manually" on your own. Let's say you have an image of type CV_8UC3, i.e. an image where each pixel is represented by 3 RGB values from 0 to 255 (Vec3b). You might "map" these 256 values to only 4 specific values, which would yield 4 x 4 x 4 = 64 possible colors.
I had a dataset where I needed to make sure that dark = black, light = white, and to reduce the number of colors of everything in between. This is what I did (C++):
inline uchar reduceVal(const uchar val)
{
    if (val < 64) return 0;
    if (val < 128) return 64;
    return 255;
}

void processColors(Mat& img)
{
    uchar* pixelPtr = img.data;
    for (int i = 0; i < img.rows; i++)
    {
        for (int j = 0; j < img.cols; j++)
        {
            const int pi = i*img.cols*3 + j*3;
            pixelPtr[pi + 0] = reduceVal(pixelPtr[pi + 0]); // B
            pixelPtr[pi + 1] = reduceVal(pixelPtr[pi + 1]); // G
            pixelPtr[pi + 2] = reduceVal(pixelPtr[pi + 2]); // R
        }
    }
}
causing [0,64) to become 0, [64,128) -> 64 and [128,255] -> 255, yielding 27 colors:
To me this seems to be neat, perfectly clear and faster than anything else mentioned in other answers.
You might also consider reducing these values to one of the multiples of some number, let's say:
inline uchar reduceVal(const uchar val)
{
    if (val < 192) return uchar(val / 64.0 + 0.5) * 64;
    return 255;
}
which would yield a set of 5 possible values: {0, 64, 128, 192, 255}, i.e. 125 colors.
There are many ways to do it. The methods suggested by jeff7 are OK, but they have some drawbacks:
method 1 has parameters N and M that you must choose, and you must also convert the image to another colorspace;
method 2 can be very slow, since you would have to compute a 16.7-million-bin histogram and sort it by frequency (to obtain the 64 highest-frequency values).
I like to use an algorithm based on the Most Significant Bits of the RGB channels to convert the image to a 64-color image. If you're using C/OpenCV, you can use something like the function below.
If you're working with gray-level images I recommend using the LUT() function of OpenCV 2.3, since it is faster. There is a tutorial on how to use LUT to reduce the number of colors; see Tutorial: How to scan images, lookup tables... However, I find it more complicated if you're working with RGB images.
void reduceTo64Colors(IplImage *img, IplImage *img_quant) {
    int i, j;
    int height = img->height;
    int width  = img->width;
    int step   = img->widthStep;
    uchar *data = (uchar *)img->imageData;
    int step2    = img_quant->widthStep;
    uchar *data2 = (uchar *)img_quant->imageData;

    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            // operator XXXXXXXX & 11000000 equivalent to XXXXXXXX AND 11000000 (=192)
            // operator 01000000 >> 2 is a 2-bit shift to the right = 00010000
            uchar C1 = (data[i*step+j*3+0] & 192) >> 2;
            uchar C2 = (data[i*step+j*3+1] & 192) >> 4;
            uchar C3 = (data[i*step+j*3+2] & 192) >> 6;
            data2[i*step2+j] = C1 | C2 | C3; // merges the 2 MSBs of each channel
        }
    }
}
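The same 2-MSB packing translates to a couple of NumPy lines, as a sketch assuming a BGR uint8 image loaded with cv2.imread (the file name is illustrative):

import cv2

img = cv2.imread('image.png')  # BGR, uint8
b, g, r = img[..., 0], img[..., 1], img[..., 2]
# keep the top 2 bits of each channel and pack them into one 6-bit index, 0..63
index = ((b & 192) >> 2) | ((g & 192) >> 4) | ((r & 192) >> 6)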
Here's a Python implementation of color quantization using K-Means Clustering with cv2.kmeans. The idea is to reduce the number of distinct colors in an image while preserving the color appearance of the image as much as possible. Here's the result:
Input -> Output
Code
import cv2
import numpy as np

def kmeans_color_quantization(image, clusters=8, rounds=1):
    h, w = image.shape[:2]
    samples = np.zeros([h*w, 3], dtype=np.float32)
    count = 0
    for x in range(h):
        for y in range(w):
            samples[count] = image[x][y]
            count += 1

    compactness, labels, centers = cv2.kmeans(
        samples,
        clusters,
        None,
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10000, 0.0001),
        rounds,
        cv2.KMEANS_RANDOM_CENTERS)

    centers = np.uint8(centers)
    res = centers[labels.flatten()]
    return res.reshape((image.shape))

image = cv2.imread('1.jpg')
result = kmeans_color_quantization(image, clusters=8)
cv2.imshow('result', result)
cv2.waitKey()
The answers suggested here are really good. I thought I would add my idea as well. I follow the formulation of many comments here, in which it is said that 64 colors can be represented by 2 bits of each channel in an RGB image.
The function in code below takes as input an image and the number of bits required for quantization. It uses bit manipulation to 'drop' the LSB bits and keep only the required number of bits. The result is a flexible method that can quantize the image to any number of bits.
#include "include\opencv\cv.h"
#include "include\opencv\highgui.h"
// quantize the image to numBits
cv::Mat quantizeImage(const cv::Mat& inImage, int numBits)
{
cv::Mat retImage = inImage.clone();
uchar maskBit = 0xFF;
// keep numBits as 1 and (8 - numBits) would be all 0 towards the right
maskBit = maskBit << (8 - numBits);
for(int j = 0; j < retImage.rows; j++)
for(int i = 0; i < retImage.cols; i++)
{
cv::Vec3b valVec = retImage.at<cv::Vec3b>(j, i);
valVec[0] = valVec[0] & maskBit;
valVec[1] = valVec[1] & maskBit;
valVec[2] = valVec[2] & maskBit;
retImage.at<cv::Vec3b>(j, i) = valVec;
}
return retImage;
}
int main ()
{
cv::Mat inImage;
inImage = cv::imread("testImage.jpg");
char buffer[30];
for(int i = 1; i <= 8; i++)
{
cv::Mat quantizedImage = quantizeImage(inImage, i);
sprintf(buffer, "%d Bit Image", i);
cv::imshow(buffer, quantizedImage);
sprintf(buffer, "%d Bit Image.png", i);
cv::imwrite(buffer, quantizedImage);
}
cv::waitKey(0);
return 0;
}
Here is an image that is used in the above function call:
Image quantized to 2 bits for each RGB channel (Total 64 Colors):
3 bits for each channel:
4 bits ...
There is the K-means clustering algorithm which is already available in the OpenCV library. In short it determines which are the best centroids around which to cluster your data for a user-defined value of k ( = no of clusters). So in your case you could find the centroids around which to cluster your pixel values for a given value of k=64. The details are there if you google around. Here's a short intro to k-means.
Something similar to what you are probably trying was asked here on SO using k-means, hope it helps.
Another approach would be to use the pyramid mean shift filter function in OpenCV. It yields somewhat "flattened" images, i.e. the number of colors is smaller, so it might be able to help you.
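A quick sketch of that suggestion in Python (the file name and the window radii are illustrative, not tuned):

import cv2

img = cv2.imread('image.png')
# arguments: source image, spatial window radius sp, color window radius sr
flattened = cv2.pyrMeanShiftFiltering(img, 21, 51)
cv2.imwrite('flattened.png', flattened)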
If you want a quick and dirty method in C++, in 1 line:
capImage &= cv::Scalar(0b11000000, 0b11000000, 0b11000000);
So, what it does is keep the upper 2 bits of each R, G, B component, and discards the lower 6 bits, hence the 0b11000000.
Because of the 3 channels in RGB, you get a maximum of 4 R x 4 G x 4 B = 64 colors. The advantage of doing this is that you can run this on any number of images and the same colors will be mapped.
Note that this can make your image a bit darker since it discards some bits.
For a greyscale image, you can do:
capImage &= 0b11111100;
This will keep the upper 6 bits, which means you get 64 grays out of 256, and again the image can become a bit darker.
Here's an example, original image = 251424 unique colors.
And the resulting image has 46 colors:
Assuming that you want to use the same 64 colors for all images (i.e. a palette not optimized per image), there are at least a couple of choices I can think of:
1) Convert to Lab or YCrCb colorspace and quantize using N bits for luminance and M bits for each color channel, N should be greater than M.
2) Compute a 3D histogram of color values over all your training images, then choose the 64 colors with the largest bin values. Quantize your images by assigning each pixel the color of the closest bin from the training set.
Method 1 is the most generic and easiest to implement, while method 2 can be better tailored to your specific dataset.
Update:
For example, 32 colors is 5 bits, so assign 3 bits to the luminance channel and 1 bit to each color channel. To do this quantization, do integer division of the luminance channel by 2^8/2^3 = 32 and of each color channel by 2^8/2^1 = 128. Now there are only 8 different luminance values and 2 different values per color channel. Recombine these values into a single integer using bit shifting or math (quantized color value = luminance*4 + color1*2 + color2).
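A sketch of that 3-1-1 split in Python, assuming a BGR image converted to YCrCb with cv2 (the file name is illustrative):

import cv2
import numpy as np

img = cv2.imread('image.png')
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)

y  = ycrcb[..., 0] // 32    # 2^8 / 2^3 = 32  -> 8 luminance levels (3 bits)
cr = ycrcb[..., 1] // 128   # 2^8 / 2^1 = 128 -> 2 levels (1 bit)
cb = ycrcb[..., 2] // 128   # 1 bit

quantized = y * 4 + cr * 2 + cb   # single integer per pixel in [0, 31]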
A simple bitwise and with a proper bitmask would do the trick.
python, for 64 colors,
img = img & int("11000000", 2)
The number of colors for an RGB image should be a perfect cube (the same along all 3 channels).
For this method, the number of possible values per channel should be a power of 2. (The code ignores this check and takes the next lower power of 2.)
import numpy as np
import cv2 as cv

def is_cube(n):
    cbrt = np.cbrt(n)
    return cbrt ** 3 == n, int(cbrt)

def reduce_color_space(img, n_colors=64):
    n_valid, cbrt = is_cube(n_colors)
    if not n_valid:
        print("n_colors should be a perfect cube")
        return
    n_bits = int(np.log2(cbrt))
    if n_bits > 8:
        print("Can't generate more colors")
        return
    bitmask = int(f"{'1' * n_bits}{'0' * (8 - n_bits)}", 2)
    return img & bitmask

img = cv.imread("image.png")
cv.imshow("orig", img)
cv.imshow("reduced", reduce_color_space(img))
cv.waitKey(0)
img = numpy.multiply(img//32, 32)
Why don't you just do matrix multiplication/division? Values will be automatically truncated.
Pseudocode:
convert your channels to unsigned characters (CV_8UC3),
divide by (total colors / desired colors): Mat = Mat / (256/64). Decimal points will be truncated.
multiply by the same number: Mat = Mat * 4.
Done. Each channel now contains only 64 colors.
