How can I read/transform the range images of the stanford bunny .ply-files? - boost

I want to read the unreconstructed data of the Stanford Bunny. The point data is stored as several range images, which have to be transformed and combined into one big point cloud, as described in the README:
These data files were obtained with a Cyberware 3030MS optical
triangulation scanner. They are stored as range images in the "ply"
format. The ".conf" file contains the transformations required to
bring each range image into a single coordinate system.
This is the .conf-file:
camera -0.0172 -0.0936 -0.734 -0.0461723 0.970603 -0.235889 0.0124573
bmesh bun000.ply 0 0 0 0 0 0 1
bmesh bun045.ply -0.0520211 -0.000383981 -0.0109223 0.00548449 -0.294635 -0.0038555 0.955586
bmesh bun090.ply 2.20761e-05 -3.34606e-05 -7.20881e-05 0.000335889 -0.708202 0.000602459 0.706009
bmesh bun180.ply 0.000116991 2.47732e-05 -4.6283e-05 -0.00215148 0.999996 -0.0015001 0.000892527
bmesh bun270.ply 0.000130273 1.58623e-05 0.000406764 0.000462632 0.707006 -0.00333301 0.7072
bmesh top2.ply -0.0530127 0.138516 0.0990356 0.908911 -0.0569874 0.154429 0.383126
bmesh top3.ply -0.0277373 0.0583887 -0.0796939 0.0598923 0.670467 0.68082 -0.28874
bmesh bun315.ply -0.00646017 -1.36122e-05 -0.0129064 0.00449209 0.38422 -0.00976512 0.923179
bmesh chin.ply 0.00435102 0.0882863 -0.108853 -0.441019 0.213083 0.00705734 0.871807
bmesh ear_back.ply -0.0829384 0.0353082 0.0711536 0.111743 0.925689 -0.215443 -0.290169
Seven values are stored for each range image, but I do not know what information they carry.
I guess that three of them describe the translation and maybe three describe the rotation. But I could not find anything about the order of these values or about how to apply them to get one point cloud.
The wiki page doesn't deal with range images, and I found nothing more on the Stanford pages. They only say that the method of Turk94 was used to scan this data set, but that paper gives no information about the required transformations. (Or I was not able to extract it from the paper.)
Does anybody know how to read these values correctly? Why is there a transformation for the camera position? Is it just a good initial viewpoint for the whole point cloud?
Thanks for your help.
EDIT:
OK. By now I have tried to read the data and transform it correctly, but nothing worked. I use the Boost library to handle the quaternions.
Here is my code for it:
boost::math::quaternion<double> translation, quaternionRotation;
//Get Transformation
translation = boost::math::quaternion<double>(0.0, lineData[2].toDouble(), lineData[3].toDouble(), lineData[4].toDouble());
quaternionRotation = boost::math::quaternion<double>(lineData[5].toDouble(),lineData[6].toDouble(),lineData[7].toDouble(),lineData[8].toDouble());
//do some file related stuff
//...
//for each line: read the point data and transform it and store the point in a data array
pointData[j].x = stringPointData[0].toDouble();
pointData[j].y = stringPointData[1].toDouble();
pointData[j].z = stringPointData[2].toDouble();
tmpQuat = boost::math::quaternion<double> (0.0,pointData[j].x,pointData[j].y,pointData[j].z);
//first translation
tmpQuat += translation;
//then quaternion rotation
tmpQuat = (quaternionRotation * (tmpQuat) * boost::math::conj(quaternionRotation));
//read the data from quaternion to a usual type
pointData[j].x = tmpQuat.R_component_2();
pointData[j].y = tmpQuat.R_component_3();
pointData[j].z = tmpQuat.R_component_4();
I assume that the first component of the quaternion is the w component and that the others refer to x, y and z, as in equation 2 from here. If necessary, I can provide screenshots of the incorrect transformations.
EDIT: According to the source code of zipper (file zipper.c), the 7 values are stored as follows:
transX transY transZ quatX quatY quatZ quatW
The quaternion is then transformed into a rotation matrix and then the rotation is performed with this new matrix. But even with this information, I am not able to transform it correctly. To test it, I implemented the function quat_to_mat() from zipper in my project:
glm::dmat4 cPlyObjectLoader::quat_to_mat(boost::math::quaternion<double> quat) const
{
float s;
float xs,ys,zs;
float wx,wy,wz;
float xx,xy,xz;
float yy,yz,zz;
glm::dmat4 mat(1.0);
s = 2 / (quat.R_component_2()*quat.R_component_2() +
quat.R_component_3()*quat.R_component_3() +
quat.R_component_4()*quat.R_component_4() +
quat.R_component_1()*quat.R_component_1());
xs = quat.R_component_2() * s;
ys = quat.R_component_3() * s;
zs = quat.R_component_4() * s;
wx = quat.R_component_1() * xs;
wy = quat.R_component_1() * ys;
wz = quat.R_component_1() * zs;
xx = quat.R_component_2() * xs;
xy = quat.R_component_2() * ys;
xz = quat.R_component_2() * zs;
yy = quat.R_component_3() * ys;
yz = quat.R_component_3() * zs;
zz = quat.R_component_4() * zs;
mat[0][0] = 1 - (yy + zz);
mat[0][1] = xy - wz;
mat[0][2] = xz + wy;
mat[0][3] = 0;
mat[1][0] = xy + wz;
mat[1][1] = 1 - (xx + zz);
mat[1][2] = yz - wx;
mat[1][3] = 0;
mat[2][0] = xz - wy;
mat[2][1] = yz + wx;
mat[2][2] = 1 - (xx + yy);
mat[2][3] = 0;
mat[3][0] = 0;
mat[3][1] = 0;
mat[3][2] = 0;
mat[3][3] = 1;
return mat;
}
Now I am doing the translation and rotation with a vector and this matrix:
quaternionRotation = boost::math::quaternion<double>(lineData[8].toDouble(),lineData[5].toDouble(),lineData[6].toDouble(),lineData[7].toDouble());
rotationMat = this->quat_to_mat(quaternionRotation);
translationVec = glm::dvec4(lineData[2].toDouble(), lineData[3].toDouble(), lineData[4].toDouble(),0.0);
//same stuff as above
//...
glm::dvec4 curPoint = glm::dvec4(pointData[j].x,pointData[j].y,pointData[j].z,1.0);
curPoint += translationVec;
curPoint = rotationMat*curPoint;
The result differs from my quaternion rotation (why? it should be the same), but it is still not correct.
Debug information:
the input of all transformations is correct
the input of all points is correct

As I read on the Stanford 3D scanning repository page:
For all the Stanford models, alignment was done using a modified ICP
algorithm, as described in this paper. These alignments are stored in
".conf" files, which list each range image in the model along with a
translation and a quaternion rotation.
Here is the link to "this paper"
Edit: The two methods are called zippering and volumetric merging.

As Ello mentioned, the Stanford 3D repository states:
For all the Stanford models, alignment was done using a modified ICP algorithm, as described in this paper. These alignments are stored in ".conf" files, which list each range image in the model along with a translation and a quaternion rotation.
But that is not enough to fully understand this data file.
It is correct that the first line:
camera -0.0172 -0.0936 -0.734 -0.0461723 0.970603 -0.235889 0.0124573
stores a good initial camera position, and every other line starting with bmesh refers to a .ply file that stores a range image.
The transformation values are stored as follows:
transX transY transZ quatX quatY quatZ quatW
where trans... refers to a translation value and quat... refers to a quaternion component. I do not yet know why the direct quaternion rotation does not work, but after converting the quaternion into a rotation matrix with the zipper code, the transformation is correct. Be aware that although the translation is stored first, the rotation has to be applied first and the translation afterwards.
My code snippet for reading and transforming the files is the following:
boost::math::quaternion<double> translation, quaternionRotation;
//Get Transformation
translationVec = glm::dvec4(lineData[2].toDouble(), lineData[3].toDouble(), lineData[4].toDouble(),0.0);
quaternionRotation = boost::math::quaternion<double>(lineData[8].toDouble(),lineData[5].toDouble(),lineData[6].toDouble(),lineData[7].toDouble());
//calculate the unit quaternion
double magnitude = std::sqrt(
quaternionRotation.R_component_1()*quaternionRotation.R_component_1()+
quaternionRotation.R_component_2()*quaternionRotation.R_component_2()+
quaternionRotation.R_component_3()*quaternionRotation.R_component_3()+
quaternionRotation.R_component_4()*quaternionRotation.R_component_4());
quaternionRotation /= magnitude;
rotationMat = this->quat_to_mat(quaternionRotation);
//do some file related stuff
//...
//for each line: read the point data and transform it and store the point in a data array
pointData[j].x = stringPointData[0].toDouble();
pointData[j].y = stringPointData[1].toDouble();
pointData[j].z = stringPointData[2].toDouble();
//transform the current point
glm::dvec4 curPoint = glm::dvec4(pointData[j].x,pointData[j].y,pointData[j].z,1.0);
//first rotation
curPoint = rotationMat*curPoint;
//then translation
curPoint += translationVec;
//store the data in a data array
pointData[j].x = curPoint.x;
pointData[j].y = curPoint.y;
pointData[j].z = curPoint.z;
I know that it is not the best solution, but it works. Feel free to optimize it yourself.
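The order of operations described above (rotate first, then translate) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the zipper code: `quat_to_rows` and `transform_point` are hypothetical names, and the matrix is applied in the transposed sense, because glm indexes its matrices as `mat[column][row]`, so filling a `glm::dmat4` with zipper's row-by-row assignments yields the transpose of the usual rotation matrix.

```python
def quat_to_rows(qx, qy, qz, qw):
    """Usual rotation matrix (list of rows) for the quaternion (qx, qy, qz, qw).

    The factor s normalizes, so the input need not be a unit quaternion.
    """
    s = 2.0 / (qx * qx + qy * qy + qz * qz + qw * qw)
    xs, ys, zs = qx * s, qy * s, qz * s
    wx, wy, wz = qw * xs, qw * ys, qw * zs
    xx, xy, xz = qx * xs, qx * ys, qx * zs
    yy, yz, zz = qy * ys, qy * zs, qz * zs
    return [
        [1.0 - (yy + zz), xy - wz, xz + wy],
        [xy + wz, 1.0 - (xx + zz), yz - wx],
        [xz - wy, yz + wx, 1.0 - (xx + yy)],
    ]


def transform_point(p, t, q):
    """Rotate p with the TRANSPOSED matrix of q, then add the translation t.

    Assumption: the transpose mirrors what the glm-based code effectively does.
    """
    rows = quat_to_rows(*q)
    # Transposed product: out[j] = sum_i rows[i][j] * p[i]
    rotated = [sum(rows[i][j] * p[i] for i in range(3)) for j in range(3)]
    return [rotated[j] + t[j] for j in range(3)]


if __name__ == "__main__":
    # bun045.ply line from the .conf file: tx ty tz qx qy qz qw
    t = (-0.0520211, -0.000383981, -0.0109223)
    q = (0.00548449, -0.294635, -0.0038555, 0.955586)
    print(transform_point([0.0, 0.1, 0.0], t, q))
```

If this reading is right, it would also explain why the earlier attempt with `q * p * conj(q)` looked wrong: that product applies the untransposed rotation, and the attempt translated before rotating.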

Here is the file converter that I wrote. It will assemble all the scans into a single file, one point per line. It supports different file formats (including Stanford .conf files).
#include <string>
#include <vector>
#include <sstream>
#include <iostream>
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
class LineInput {
public:
LineInput(const std::string& filename) {
F_ = fopen(filename.c_str(), "r" ) ;
ok_ = (F_ != 0) ;
}
~LineInput() {
if(F_ != 0) {
fclose(F_); F_ = 0 ;
}
}
bool OK() const { return ok_ ; }
bool eof() const { return feof(F_) ; }
bool get_line() {
line_[0] = '\0' ;
// Skip the empty lines
while(!isprint(line_[0])) {
if(fgets(line_, MAX_LINE_LEN, F_) == 0) {
return false ;
}
}
// If the line ends with a backslash, append
// the next line to the current line.
bool check_multiline = true ;
int total_length = MAX_LINE_LEN ;
char* ptr = line_ ;
while(check_multiline) {
int L = strlen(ptr) ;
total_length -= L ;
ptr = ptr + L - 2;
if(*ptr == '\\' && total_length > 0) {
*ptr = ' ' ;
ptr++ ;
fgets(ptr, total_length, F_) ;
} else {
check_multiline = false ;
}
}
if(total_length < 0) {
std::cerr
<< "MultiLine longer than "
<< MAX_LINE_LEN << " bytes" << std::endl ;
}
return true ;
}
int nb_fields() const { return field_.size() ; }
char* field(int i) { return field_[i] ; }
int field_as_int(int i) {
int result ;
ok_ = ok_ && (sscanf(field(i), "%d", &result) == 1) ;
return result ;
}
double field_as_double(int i) {
double result ;
ok_ = ok_ && (sscanf(field(i), "%lf", &result) == 1) ;
return result ;
}
bool field_matches(int i, const char* s) {
return !strcmp(field(i), s) ;
}
void get_fields(const char* separators=" \t\r\n") {
field_.resize(0) ;
char* tok = strtok(line_,separators) ;
while(tok != 0) {
field_.push_back(tok) ;
tok = strtok(0,separators) ;
}
}
private:
enum { MAX_LINE_LEN = 65535 } ;
FILE* F_ ;
char line_[MAX_LINE_LEN] ;
std::vector<char*> field_ ;
bool ok_ ;
} ;
std::string to_string(int x, int mindigits) {
char buff[100] ;
// Zero-pad to mindigits digits (the callers pass 3, e.g. "scan007")
snprintf(buff, sizeof(buff), "%0*d", mindigits, x) ;
return std::string(buff) ;
}
double M[4][4] ;
void transform(double* xyz) {
double xyzw[4] ;
for(unsigned int c=0; c<4; c++) {
xyzw[c] = M[3][c] ;
}
for(unsigned int j=0; j<4; j++) {
for(unsigned int i=0; i<3; i++) {
xyzw[j] += M[i][j] * xyz[i] ;
}
}
for(unsigned int c=0; c<3; c++) {
xyz[c] = xyzw[c] / xyzw[3] ;
}
}
bool read_frames_file(int no) {
std::string filename = "scan" + to_string(no,3) + ".frames" ;
std::cerr << "Reading frames from:" << filename << std::endl ;
LineInput in(filename) ;
if(!in.OK()) {
std::cerr << " ... not found" << std::endl ;
return false ;
}
while(!in.eof() && in.get_line()) {
in.get_fields() ;
if(in.nb_fields() == 17) {
int f = 0 ;
for(unsigned int i=0; i<4; i++) {
for(unsigned int j=0; j<4; j++) {
M[i][j] = in.field_as_double(f) ; f++ ;
}
}
}
}
return true ;
}
bool read_pose_file(int no) {
std::string filename = "scan" + to_string(no,3) + ".pose" ;
std::cerr << "Reading pose from:" << filename << std::endl ;
LineInput in(filename) ;
if(!in.OK()) {
std::cerr << " ... not found" << std::endl ;
return false ;
}
double xyz[3] ;
double euler[3] ;
in.get_line() ;
in.get_fields() ;
xyz[0] = in.field_as_double(0) ;
xyz[1] = in.field_as_double(1) ;
xyz[2] = in.field_as_double(2) ;
in.get_line() ;
in.get_fields() ;
euler[0] = in.field_as_double(0) * M_PI / 180.0 ;
euler[1] = in.field_as_double(1) * M_PI / 180.0 ;
euler[2] = in.field_as_double(2) * M_PI / 180.0 ;
double sx = sin(euler[0]);
double cx = cos(euler[0]);
double sy = sin(euler[1]);
double cy = cos(euler[1]);
double sz = sin(euler[2]);
double cz = cos(euler[2]);
M[0][0] = cy*cz;
M[0][1] = sx*sy*cz + cx*sz;
M[0][2] = -cx*sy*cz + sx*sz;
M[0][3] = 0.0;
M[1][0] = -cy*sz;
M[1][1] = -sx*sy*sz + cx*cz;
M[1][2] = cx*sy*sz + sx*cz;
M[1][3] = 0.0;
M[2][0] = sy;
M[2][1] = -sx*cy;
M[2][2] = cx*cy;
M[2][3] = 0.0;
M[3][0] = xyz[0];
M[3][1] = xyz[1];
M[3][2] = xyz[2];
M[3][3] = 1.0;
return true ;
}
void setup_transform_from_translation_and_quaternion(
double Tx, double Ty, double Tz,
double Qx, double Qy, double Qz, double Qw
) {
/* for unit q, just set s = 2 or set xs = Qx + Qx, etc. */
double s = 2.0 / (Qx*Qx + Qy*Qy + Qz*Qz + Qw*Qw);
double xs = Qx * s;
double ys = Qy * s;
double zs = Qz * s;
double wx = Qw * xs;
double wy = Qw * ys;
double wz = Qw * zs;
double xx = Qx * xs;
double xy = Qx * ys;
double xz = Qx * zs;
double yy = Qy * ys;
double yz = Qy * zs;
double zz = Qz * zs;
M[0][0] = 1.0 - (yy + zz);
M[0][1] = xy - wz;
M[0][2] = xz + wy;
M[0][3] = 0.0;
M[1][0] = xy + wz;
M[1][1] = 1 - (xx + zz);
M[1][2] = yz - wx;
M[1][3] = 0.0;
M[2][0] = xz - wy;
M[2][1] = yz + wx;
M[2][2] = 1 - (xx + yy);
M[2][3] = 0.0;
M[3][0] = Tx;
M[3][1] = Ty;
M[3][2] = Tz;
M[3][3] = 1.0;
}
bool read_points_file(int no) {
std::string filename = "scan" + to_string(no,3) + ".3d" ;
std::cerr << "Reading points from:" << filename << std::endl ;
LineInput in(filename) ;
if(!in.OK()) {
std::cerr << " ... not found" << std::endl ;
return false ;
}
while(!in.eof() && in.get_line()) {
in.get_fields() ;
double xyz[3] ;
if(in.nb_fields() >= 3) {
for(unsigned int c=0; c<3; c++) {
xyz[c] = in.field_as_double(c) ;
}
transform(xyz) ;
printf("%f %f %f\n",xyz[0],xyz[1],xyz[2]) ;
}
}
return true ;
}
/* only works for ASCII PLY files */
void read_ply_file(char* filename) {
std::cerr << "Reading points from:" << filename << std::endl;
LineInput in(filename) ;
if(!in.OK()) {
std::cerr << filename << ": could not open" << std::endl ;
return;
}
bool reading_vertices = false;
int nb_vertices = 0 ;
int nb_read_vertices = 0 ;
while(!in.eof() && in.get_line()) {
in.get_fields();
if(reading_vertices) {
double xyz[3] ;
for(unsigned int c=0; c<3; c++) {
xyz[c] = in.field_as_double(c) ;
}
transform(xyz) ;
printf("%f %f %f\n",xyz[0],xyz[1],xyz[2]) ;
++nb_read_vertices;
if(nb_read_vertices == nb_vertices) {
return;
}
} else if(
in.field_matches(0,"element") &&
in.field_matches(1,"vertex")
) {
nb_vertices = in.field_as_int(2);
} else if(in.field_matches(0,"end_header")) {
reading_vertices = true;
}
}
}
/* For Stanford scanning repository */
void read_conf_file(char* filename) {
LineInput in(filename) ;
if(!in.OK()) {
std::cerr << filename << ": could not open" << std::endl ;
return;
}
while(!in.eof() && in.get_line()) {
in.get_fields();
if(in.nb_fields() == 0) { continue ; }
if(in.field_matches(0,"bmesh")) {
char* filename = in.field(1);
// Translation vector
double Tx = in.field_as_double(2);
double Ty = in.field_as_double(3);
double Tz = in.field_as_double(4);
/// Quaternion
double Qx = in.field_as_double(5);
double Qy = in.field_as_double(6);
double Qz = in.field_as_double(7);
double Qw = in.field_as_double(8);
setup_transform_from_translation_and_quaternion(Tx,Ty,Tz,Qx,Qy,Qz,Qw);
read_ply_file(filename);
}
}
}
int main(int argc, char** argv) {
if(argc != 2) { return -1 ; }
if(strstr(argv[1],".conf")) {
read_conf_file(argv[1]);
} else {
int max_i = atoi(argv[1]) ;
for(int i=0; i<=max_i; i++) {
if(!read_frames_file(i)) {
read_pose_file(i) ;
}
read_points_file(i) ;
}
}
return 0 ;
}
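For readers who only need the .conf handling, the parsing step of `read_conf_file` can be sketched in a few lines of Python. This is an illustrative sketch, not a port of the converter: `parse_conf` is a hypothetical name, and the field order (translation first, then qx qy qz qw) is the one the converter above assumes.

```python
def parse_conf(text):
    """Return a list of (ply_filename, (tx, ty, tz), (qx, qy, qz, qw)) tuples.

    Only "bmesh" lines with exactly nine fields are used; the "camera" line
    and blank lines are skipped.
    """
    scans = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9 or fields[0] != "bmesh":
            continue
        values = [float(v) for v in fields[2:9]]
        scans.append((fields[1], tuple(values[:3]), tuple(values[3:])))
    return scans


if __name__ == "__main__":
    sample = (
        "camera -0.0172 -0.0936 -0.734 -0.0461723 0.970603 -0.235889 0.0124573\n"
        "bmesh bun000.ply 0 0 0 0 0 0 1\n"
        "bmesh bun045.ply -0.0520211 -0.000383981 -0.0109223 "
        "0.00548449 -0.294635 -0.0038555 0.955586\n"
    )
    for name, translation, quaternion in parse_conf(sample):
        print(name, translation, quaternion)
```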

Here is my solution, since none of the above worked for me (note that this is Python using Blender's bpy). It seems that I need to transpose the rotation part of my 4x4 transformation matrix (note that I use a standard quaternion-to-rotation-matrix conversion, not the one from zipper). Also note that Blender stores a model's vertices in local coordinates relative to the object's world transformation, so the step point = objWorld * point is Blender-specific; you do not need it elsewhere.
#loop
for meshName, transform in zip(plyFile, transformations):
    #Build Quaternion
    #transform structure [x, y, z, qx, qy, qz, qw]
    Rt = mathutils.Quaternion((transform[6], transform[3], transform[4], transform[5])).to_matrix().to_4x4()
    Rt.normalize()
    Rt.transpose()
    Rt[0][3] = transform[0]
    Rt[1][3] = transform[1]
    Rt[2][3] = transform[2]
    bpy.ops.object.select_all(action='DESELECT')
    #import the ply mesh into blender
    bpy.ops.import_mesh.ply(filepath=baseDir + meshName)
    #get the ply object
    obj = bpy.context.object
    #get objects world matrix
    objWorld = obj.matrix_world
    for index in range(len(obj.data.vertices)):
        #get local point
        point = mathutils.Vector([obj.data.vertices[index].co[0],obj.data.vertices[index].co[1], obj.data.vertices[index].co[2], 1.])
        #convert local point to world
        point = objWorld * point
        #apply ply transformation
        point = Rt * point
        #update the point in the mesh
        obj.data.vertices[index].co[0] = point[0]
        obj.data.vertices[index].co[1] = point[1]
        obj.data.vertices[index].co[2] = point[2]
#all vertex positions should be updated correctly

As mentioned in other answers, the Stanford 3D repository gives some information about the data organization in the '.conf' files, but the transformations for the bunny model did not work properly when I used the quaternion data as provided.
I was also stuck on this registration problem for the bunny model, and based on my tests I have some extra observations to add. When applying the transformations (the rotations, to be specific), I realized that the quaternion values were not rotating the cloud in the correct direction; when I used the corresponding Euler angles and changed the sign of one specific rotation axis, I got the correct registration. Back in the quaternion notation of the '.conf' file, after some tests I noticed that changing the sign of the 'w' component of the quaternion in every 'bmesh' row except the first (bun000.ply) makes the quaternion rotation usable.
Furthermore, for some reason, when registering the dragon (dragon_stand and dragon_side) and armadillo (armadillo_stand) Stanford point clouds, I had to read the quaternion data in the '.conf' file in a different order to get the correct result. It seems to be stored as:
tx ty tz qw qx qy qz
where 't' refers to a translation value and 'q' refers to a quaternion value. To be clear, I have only tested these three models, so I don't know what the default pattern for the quaternion values is. Also, for these last two point cloud models, I did not need to change the '.conf' file.
I hope this is useful for someone else trying to do the same.
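One possible reading of the sign change on 'w', offered as a sketch rather than a confirmed explanation: (x, y, z, -w) is the negated conjugate of (x, y, z, w), and a quaternion and its negation describe the same rotation, so flipping 'w' selects the inverse rotation. That is the same effect as transposing the rotation matrix, which is what other answers here do. The names `rotation_rows` and `transpose` below are hypothetical helpers; the quaternion is the bun045.ply entry from the '.conf' file.

```python
def rotation_rows(qx, qy, qz, qw):
    """Rotation matrix (list of rows) of a quaternion, normalized on the fly."""
    n = qx * qx + qy * qy + qz * qz + qw * qw
    s = 2.0 / n
    return [
        [1.0 - s * (qy * qy + qz * qz), s * (qx * qy - qw * qz), s * (qx * qz + qw * qy)],
        [s * (qx * qy + qw * qz), 1.0 - s * (qx * qx + qz * qz), s * (qy * qz - qw * qx)],
        [s * (qx * qz - qw * qy), s * (qy * qz + qw * qx), 1.0 - s * (qx * qx + qy * qy)],
    ]


def transpose(m):
    """Swap rows and columns of a 3x3 list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


if __name__ == "__main__":
    qx, qy, qz, qw = 0.00548449, -0.294635, -0.0038555, 0.955586
    original = rotation_rows(qx, qy, qz, qw)
    flipped = rotation_rows(qx, qy, qz, -qw)
    # Largest element-wise difference between the flipped-w matrix and
    # the transpose of the original matrix.
    diff = max(
        abs(a - b)
        for row_a, row_b in zip(flipped, transpose(original))
        for a, b in zip(row_a, row_b)
    )
    print("max difference:", diff)
```

For bun000.ply the stored quaternion is the identity, so flipping 'w' changes nothing there, which fits the remark that the first row needs no change.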

Just in case someone is looking for a full Python implementation based on what #DanceIgel found out, here is some code in Python 3.9.1 that also generates a figure with matplotlib:
# Python 3.9.1
import numpy as np
import sys
import math
import glob
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import open3d as o3d

def get_pointcloud_files(path):
    files = list()
    for f in glob.glob(path + '/*.ply'):
        files.append(f)
    return files

def get_pointcloud_from_file(path, filename):
    cloud = o3d.io.read_point_cloud(path + '/' + filename)
    return cloud

def get_transformations_from_file(path, filename):
    with open(path + '/' + filename) as f:
        lines = (line for line in f)
        # skiprows=1 skips the leading camera line
        source = np.loadtxt(lines, delimiter=' ', skiprows=1, dtype='str')
    source = np.delete(source, 0, 1) # drop the leading "bmesh" column
    filenames = source[:,0]
    source = source[filenames.argsort()]
    filenames = np.sort(filenames)
    translations = list()
    for row in source[:,1:4]:
        translations.append(np.reshape(row, [3,1]).astype(np.float32))
    quaternions = list()
    for row in source[:,4:]:
        quaternions.append(np.reshape(row, [4,1]).astype(np.float32))
    return filenames, translations, quaternions

def quaternion_rotation_matrix(Q):
    # Extract the values from Q (stored as qx qy qz qw, so w is last)
    q0 = Q[3]
    q1 = Q[0]
    q2 = Q[1]
    q3 = Q[2]
    # calculate unit quaternion
    magnitude = math.sqrt(q0*q0 + q1*q1 + q2*q2 + q3*q3)
    q0 = q0 / magnitude
    q1 = q1 / magnitude
    q2 = q2 / magnitude
    q3 = q3 / magnitude
    # First row of the rotation matrix
    r00 = 2 * (q0 * q0 + q1 * q1) - 1
    r01 = 2 * (q1 * q2 - q0 * q3)
    r02 = 2 * (q1 * q3 + q0 * q2)
    # Second row of the rotation matrix
    r10 = 2 * (q1 * q2 + q0 * q3)
    r11 = 2 * (q0 * q0 + q2 * q2) - 1
    r12 = 2 * (q2 * q3 - q0 * q1)
    # Third row of the rotation matrix
    r20 = 2 * (q1 * q3 - q0 * q2)
    r21 = 2 * (q2 * q3 + q0 * q1)
    r22 = 2 * (q0 * q0 + q3 * q3) - 1
    # 3x3 rotation matrix
    rot_matrix = np.array([[r00, r01, r02],
                           [r10, r11, r12],
                           [r20, r21, r22]])
    rot_matrix = np.transpose(rot_matrix)
    return rot_matrix

if __name__=="__main__": # $python visualization_bunny.py bunny/data
    path = sys.argv[1]
    # load transformations and filenames from file
    filenames, translations, quaternions = get_transformations_from_file(path, 'bun.conf')
    curr_transformation = np.zeros([3,4])
    clouds = list()
    for curr_filename, curr_quaternion, curr_translation in zip(filenames, quaternions, translations): # go through input files
        curr_cloud = get_pointcloud_from_file(path, curr_filename)
        # convert cloud to numpy
        curr_cloud = np.asarray(curr_cloud.points)
        # compute rotation matrix from quaternions
        curr_rotation_matr = quaternion_rotation_matrix(curr_quaternion)
        curr_rotation_matr = np.squeeze(curr_rotation_matr)
        curr_translation = np.squeeze(curr_translation)
        # create transformation matrix
        curr_transformation[:,0:3] = curr_rotation_matr
        curr_transformation[:,3] = curr_translation
        # transform current cloud
        for i in range(curr_cloud.shape[0]):
            # apply rotation
            curr_point = np.matmul(curr_rotation_matr, np.transpose(curr_cloud[i,:]))
            # apply translation
            curr_point = curr_point + curr_translation
            curr_cloud[i,0] = curr_point[0]
            curr_cloud[i,1] = curr_point[1]
            curr_cloud[i,2] = curr_point[2]
        # add current cloud to list of clouds
        clouds.append(curr_cloud)
    #plot separate point clouds in same graph
    ax = plt.axes(projection='3d')
    for cloud in clouds:
        ax.plot(cloud[:,0], cloud[:,1], cloud[:,2], 'bo', markersize=0.005)
    #ax.view_init(elev=90, azim=270)
    ax.view_init(elev=100, azim=270)
    plt.axis('off')
    plt.savefig("ZZZ_Stanford_Bunny_PointCloud.png", bbox_inches='tight')
    plt.show()


Is declaring a static array as "private" before a parallel loop perfectly equivalent to declaring the array inside the loop?

I've encountered a situation where the code generates different results depending on whether the arrays are defined inside the loop over index i (case #1) or declared outside that loop and listed in a private clause (case #2).
Case #2 generates the same results as the code running on CPU only.
Case #1
#pragma acc parallel loop
for (j = jbeg; j <= jend; j++){
#pragma acc loop
for (i = ibeg; i <= iend; i++){
double Rc[NFLX][NFLX];
double eta[NFLX], um[NFLX], dv[NFLX];
double lambda[NFLX], alambda[NFLX];
double fL[NFLX], fR[NFLX];
.
.
.
}
}}
Case #2
#pragma acc parallel loop
for (j = jbeg; j <= jend; j++){
double Rc[NFLX][NFLX];
double eta[NFLX], um[NFLX], dv[NFLX];
double lambda[NFLX], alambda[NFLX];
double fL[NFLX], fR[NFLX];
#pragma acc loop private(Rc[:NFLX][:NFLX], eta[:NFLX], \
um[:NFLX], lambda[:NFLX], alambda[:NFLX], \
dv[:NFLX], fL[:NFLX], fR[:NFLX])
for (i = ibeg; i <= iend; i++){
.
.
.
}
}}
I have the following values:
NFLX = 8;
jbeg = 3, jend = 258;
ibeg = 3, iend = 1026;
In which cases are the two techniques equivalent, and when is it better to choose one over the other?
This is what I see with -Minfo=accel:
case #1:
71, Local memory used for Rc,dv,fR,um,lambda,alambda,fL,eta
case #2:
71, Local memory used for Rc,dv,fR,lambda,alambda,fL,eta
CUDA shared memory used for Rc,eta
Local memory used for um
CUDA shared memory used for um,lambda,alambda,dv,fL,fR
function:
/* ********************************************************************* */
void Roe_Solver (Data *d, timeStep *Dts, Grid *grid, RBox *box)
/*
* Solve the Riemann problem between L/R states using a
* Rusanov-Lax Friedrichs flux.
*********************************************************************** */
{
int i, j, k;
int ibeg = *(box->nbeg)-1, iend = *(box->nend);
int jbeg = *(box->tbeg), jend = *(box->tend);
int kbeg = *(box->bbeg), kend = *(box->bend);
int VXn = VX1, VXt = VX2, VXb = VX3;
int MXn = MX1, MXt = MX2, MXb = MX3;
int ni, nj;
double gmm = GAMMA_EOS;
double gmm1 = gmm - 1.0;
double gmm1_inv = 1.0/gmm1;
double delta = 1.e-7;
double delta_inv = 1.0/delta;
ARRAY_OFFSET (grid, ni, nj);
INDEX_CYCLE (grid->dir, VXn, VXt, VXb);
INDEX_CYCLE (grid->dir, MXn, MXt, MXb);
#pragma acc parallel loop collapse(2) present(d, Dts, grid)
for (k = kbeg; k <= kend; k++){
for (j = jbeg; j <= jend; j++){
long int offset = ni*(j + nj*k);
double * __restrict__ cmax = &Dts->cmax [offset];
double * __restrict__ SL = &d->sweep.SL [offset];
double * __restrict__ SR = &d->sweep.SR [offset];
double um[NFLX];
double fL[NFLX], fR[NFLX];
#pragma acc loop private(um[:NFLX], fL[:NFLX], fR[:NFLX])
for (i = ibeg; i <= iend; i++){
int nv;
double scrh, vel2;
double a2, a, h;
double alambda, lambda, eta;
double s, c, hl, hr;
double bmin, bmax, scrh1;
double pL, pR;
double * __restrict__ vL = d->sweep.vL [offset + i];
double * __restrict__ vR = d->sweep.vR [offset + i];
double * __restrict__ uL = d->sweep.uL [offset + i];
double * __restrict__ uR = d->sweep.uR [offset + i];
double * __restrict__ flux = d->sweep.flux[offset + i];
double a2L = SoundSpeed2 (vL);
double a2R = SoundSpeed2 (vR);
PrimToCons (vL, uL);
PrimToCons (vR, uR);
Flux (vL, uL, fL, grid->dir);
Flux (vR, uR, fR, grid->dir);
pL = vL[PRS];
pR = vR[PRS];
s = sqrt(vR[RHO]/vL[RHO]);
um[RHO] = vL[RHO]*s;
s = 1.0/(1.0 + s);
c = 1.0 - s;
um[VX1] = s*vL[VX1] + c*vR[VX1];
um[VX2] = s*vL[VX2] + c*vR[VX2];
um[VX3] = s*vL[VX3] + c*vR[VX3];
vel2 = um[VX1]*um[VX1] + um[VX2]*um[VX2] + um[VX3]*um[VX3];
hl = 0.5*(vL[VX1]*vL[VX1] + vL[VX2]*vL[VX2] + vL[VX3]*vL[VX3]);
hl += a2L*gmm1_inv;
hr = 0.5*(vR[VX1]*vR[VX1] + vR[VX2]*vR[VX2] + vR[VX3]*vR[VX3]);
hr += a2R*gmm1_inv;
h = s*hl + c*hr;
/* ----------------------------------------------------
1. the following should be equivalent to
scrh = dv[VX1]*dv[VX1] + dv[VX2]*dv[VX2] + dv[VX3]*dv[VX3];
a2 = s*a2L + c*a2R + 0.5*gmm1*s*c*scrh;
and therefore always positive.
---------------------------------------------------- */
a2 = gmm1*(h - 0.5*vel2);
a = sqrt(a2);
/* ----------------------------------------------------------------
2. define non-zero components of conservative eigenvectors Rc,
eigenvalues (lambda) and wave strength eta = L.du
---------------------------------------------------------------- */
#pragma acc loop seq
NFLX_LOOP(nv) flux[nv] = 0.5*(fL[nv] + fR[nv]);
/* ---- (u - c_s) ---- */
SL[i] = um[VXn] - a;
/* ---- (u + c_s) ---- */
SR[i] = um[VXn] + a;
/* ---- get max eigenvalue ---- */
cmax[i] = fabs(um[VXn]) + a;
NFLX_LOOP(nv) flux[nv] = 0.5*(fL[nv] + fR[nv]) - 0.5*cmax[i]*(uR[nv] - uL[nv]);
#if DIMENSIONS > 1
/* ---------------------------------------------
3. use the HLL flux function if the interface
lies within a strong shock.
The effect of this switch is visible
in the Mach reflection test.
--------------------------------------------- */
scrh = fabs(vL[PRS] - vR[PRS]);
scrh /= MIN(vL[PRS],vR[PRS]);
if (scrh > 0.5 && (vR[VXn] < vL[VXn])){ /* -- tunable parameter -- */
bmin = MIN(0.0, SL[i]);
bmax = MAX(0.0, SR[i]);
scrh1 = 1.0/(bmax - bmin);
#pragma acc loop seq
for (nv = 0; nv < NFLX; nv++){
flux[nv] = bmin*bmax*(uR[nv] - uL[nv])
+ bmax*fL[nv] - bmin*fR[nv];
flux[nv] *= scrh1;
}
}
#endif /* DIMENSIONS > 1 */
} /* End loop on i */
}} /* End loop on j,k */
}
Technically they are equivalent, but in practice they differ. What's happening is that the compiler hoists the declaration of these arrays outside of the loops. This is standard practice for the compiler and happens before the OpenACC directives are applied. What should happen next is that these arrays are implicitly privatized within the scoping unit in which they are declared. However, the compiler doesn't currently track this, so the arrays are implicitly copied into the compute region as shared arrays. If you add the flag "-Minfo=accel", you'll see the compiler feedback messages indicating the implicit copies.
I have an open issue report requesting this support, TPR #31360; however, it has been a challenge to implement, so it is not in a released compiler yet. Hence, until (or unless) we can fix the behavior, you'll need to manually hoist the declaration of these arrays and then add them to a "private" clause.

improper mandelbrot set output plotting

I am trying to write code to display the Mandelbrot set for the numbers between (-3,-3) and (2,2) on my terminal.
The main function generates a complex number and feeds it to the analyze function.
The analyze function returns the character "*" for a complex number Z within the set and "." for numbers that lie outside the set.
The code:
#define MAX_A 2 // upperbound on real
#define MAX_B 2 // upper bound on imaginary
#define MIN_A -3 // lowerbnd on real
#define MIN_B -3 // lower bound on imaginary
#define NX 300 // no. of points along x
#define NY 200 // no. of points along y
#define max_its 50
int analyze(double real,double imag);
void main()
{
double a,b;
int x,x_arr,y,y_arr;
int array[NX][NY];
int res;
for(y=NY-1,x_arr=0;y>=0;y--,x_arr++)
{
for(x=0,y_arr++;x<=NX-1;x++,y_arr++)
{
a= MIN_A+ ( x/( (double)NX-1)*(MAX_A-MIN_A) );
b= MIN_B+ ( y/( (double)NY-1 )*(MAX_B-MIN_B) );
//printf("%f+i%f ",a,b);
res=analyze(a,b);
if(res>49)
array[x][y]=42;
else
array[x][y]=46;
}
// printf("\n");
}
for(y=0;y<NY;y++)
{
for(x=0;x<NX;x++)
printf("%2c",array[x][y]);
printf("\n");
}
}
The analyze function accepts a coordinate on the complex plane and computes (Z^2)+Z 50 times; if the complex number explodes during the computation, the function returns immediately, otherwise it returns after finishing 50 iterations:
int analyze(double real,double imag)
{
int iter=0;
double r=4.0;
while(iter<50)
{
if ( r < ( (real*real) + (imag*imag) ) )
{
return iter;
}
real= ( (real*real) - (imag*imag) + real);
imag= ( (2*real*imag)+ imag);
iter++;
}
return iter;
}
So, I am analyzing 60000 (NX * NY) numbers and displaying them on the terminal with a 3:2 ratio (300,200). I even tried 4:3 (NX:NY), but the output remains the same, and the generated shape is not even close to the Mandelbrot set.
The output also appears inverted.
I browsed around and came across lines like:
(x - 400) / ZOOM;
(y - 300) / ZOOM;
in many Mandelbrot codes, but I am unable to understand how these lines would rectify my output.
I guess I am having trouble mapping the output to the terminal!
(LB_Real,UB_Imag) --- (UB_Real,UB_Imag)
| |
(LB_Real,LB_Imag) --- (UB_Real,LB_Imag)
Any hint or help will be very useful.
The Mandelbrot recurrence is z_{n+1} = z_n^2 + c.
Here's your implementation:
real= ( (real*real) - (imag*imag) + real);
imag= ( (2*real*imag)+ imag);
Problem 1. You're updating real to its next value before you've used the old value to compute the new imag.
Problem 2. Assuming you fix problem 1, you're computing z_{n+1} = z_n^2 + z_n.
Here's how I'd do it using double:
int analyze(double cr, double ci) {
double zr = 0, zi = 0;
int r;
for (r = 0; (r < 50) && (zr*zr + zi*zi < 4.0); ++r) {
double zr1 = zr*zr - zi*zi + cr;
double zi1 = 2 * zr * zi + ci;
zr = zr1;
zi = zi1;
}
return r;
}
But it's easier to understand if you use the standard C99 support for complex numbers:
#include <complex.h>
int analyze(double cr, double ci) {
double complex c = cr + ci * I;
double complex z = 0;
int r;
for (r = 0; (r < 50) && (cabs(z) < 2); ++r) {
z = z * z + c;
}
return r;
}
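On the terminal-mapping part of the question, here is a rough Python sketch (not the asker's code) of the same escape-time loop together with one way to map a character grid onto the complex plane. The names escape_time and render, the grid size and the plane bounds are my own assumptions for illustration. Row 0 is printed first, so it is mapped to the largest imaginary value, which keeps the picture from coming out upside down.

```python
def escape_time(c: complex, max_iter: int = 50) -> int:
    """Iterations of z -> z*z + c (starting at z = 0) before |z| reaches 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) >= 2.0:
            return n
        z = z * z + c
    return max_iter


def render(nx=60, ny=24, re_min=-2.0, re_max=1.0, im_min=-1.0, im_max=1.0, max_iter=50):
    """Return ny strings of nx characters: '*' for points that never escaped."""
    rows = []
    for row in range(ny):
        # Row 0 is the top of the terminal, so it gets the largest imaginary part.
        im = im_max - row / (ny - 1) * (im_max - im_min)
        line = ""
        for col in range(nx):
            re = re_min + col / (nx - 1) * (re_max - re_min)
            inside = escape_time(complex(re, im), max_iter) == max_iter
            line += "*" if inside else "."
        rows.append(line)
    return rows


if __name__ == "__main__":
    for text_line in render():
        print(text_line)
```

Terminal characters are roughly twice as tall as they are wide, which is why this sketch uses far fewer rows than columns.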

OCR algorithm (GOCR) to 32F429IDISCOVERY board

I'm trying to port an OCR algorithm (specifically GOCR) to the 32F429IDISCOVERY board, and I'm still getting nothing back.
I'm recording an image from an OV7670 camera in RGB565 format into the SDRAM of the board. The image is then converted to greyscale and passed to the algorithm itself.
From this and other forums I got the impression that GOCR is a very good algorithm, and it seemed to work very well on a PC, but I just can't get it to work on the board.
Does anyone have experience with implementing OCR or GOCR? I am not sure where the problem is, because it behaves in a very weird way: the code stops in a different part of the algorithm almost every time.
Calling the OCR algorithm:
void ocr_algorithm(char *output_str) {
job_t job1, *job; /* fixme, dont want global variables for lib */
job=OCR_JOB=&job1;
int linecounter;
const char *line;
uint8_t r,g,b;
uint32_t n,i,buffer;
char *p_pic;
uint32_t *image = (uint32_t*) SDRAM_START_ADR;
setvbuf(stdout, (char *) NULL, _IONBF, 0); /* not buffered */
job_init(job); /* init cfg and db */
job_init_image(job); /* single image */
p_pic = malloc(IMG_ROWS*IMG_COLUMNS);
// Converting RGB565 to grayscale
i=0;
for (n = 0; n < IMG_ROWS*IMG_COLUMNS; n++) {
if (n % 2 == 0){
buffer = image[i] & 0xFFFF;
}
else{
buffer = (image[i] >> 16) & 0xFFFF;
i++;
}
r = (uint8_t) ((buffer >> 11) & 0x1F);
g = (uint8_t) ((buffer >> 5) & 0x3F);
b = (uint8_t) (buffer & 0x1F);
// RGB888
r = ((r * 527) + 23) >> 6;
g = ((g * 259) + 33) >> 6;
b = ((b * 527) + 23) >> 6;
// Greyscale
p_pic[n] = 0.299*r + 0.587*g + 0.114*b;
}
//read_picture;
job->src.p.p = p_pic;
job->src.p.x = IMG_ROWS;
job->src.p.y = IMG_COLUMNS;
job->src.p.bpp = 1;
/* call main loop */
pgm2asc(job);
//print output
strcpy(output_str, "");
linecounter = 0;
line = getTextLine(&(job->res.linelist), linecounter++);
while (line) {
strcat(output_str, line);
strcat(output_str, "\n");
line = getTextLine(&(job->res.linelist), linecounter++);
}
free_textlines(&(job->res.linelist));
job_free_image(job);
free(p_pic);
}

Fast Converting RGBA to ARGB

I am trying to convert an RGBA buffer into ARGB. Is there any way to improve the following algorithm, or any other faster way to perform this operation?
Taking into account that the alpha value is not important once in the argb buffer, and should always end up as 0xFF.
int y, x, pixel;
for (y = 0; y < height; y++)
{
for (x = 0; x < width; x++)
{
pixel = rgbaBuffer[y * width + x];
argbBuffer[(height - y - 1) * width + x] = (pixel & 0xff00ff00) | ((pixel << 16) & 0x00ff0000) | ((pixel >> 16) & 0xff);
}
}
I will focus only on the swap function:
typedef unsigned int Color32;
inline Color32 Color32Reverse(Color32 x)
{
return
// Source is in format: 0xAARRGGBB
((x & 0xFF000000) >> 24) | //______AA
((x & 0x00FF0000) >> 8) | //____RR__
((x & 0x0000FF00) << 8) | //__GG____
((x & 0x000000FF) << 24); //BB______
// Return value is in format: 0xBBGGRRAA
}
Assuming that the code is not buggy (just inefficient), I can guess that all you want to do is swap every second (even-numbered) byte (and of course invert the buffer), isn't it?
So you can achieve some optimizations by:
Avoiding the shift and masking operations
Optimizing the loop, eg economizing in the indices calculations
I would rewrite the code as follows:
int y, x;
for (y = 0; y < height; y++)
{
unsigned char *pRGBA= (unsigned char *)(rgbaBuffer+y*width);
unsigned char *pARGB= (unsigned char *)(argbBuffer+(height-y-1)*width);
for (x = 4*(width-1); x>=0; x-=4)
{
pARGB[x ] = pRGBA[x+2];
pARGB[x+1] = pRGBA[x+1];
pARGB[x+2] = pRGBA[x ];
pARGB[x+3] = 0xFF;
}
}
Please note that the more complex index calculation is performed in the outer loop only. There are four accesses to both rgbaBuffer and argbBuffer for each pixel, but I think this is more than offset by avoiding the bitwise operations and the index calculations. An alternative would be (as in your code) to fetch/store one pixel (int) at a time and do the processing locally (this economizes on memory accesses), but unless you have some efficient way to swap the two bytes and set the alpha locally (e.g. some inline assembly, so that everything is performed at register level), it won't really help.
Code you provided is very strange since it shuffles color components not rgba->argb, but rgba->rabg.
I've made a correct and optimized version of this routine.
int pixel;
int size = width * height;
for (unsigned int * rgba_ptr = rgbaBuffer, * argb_ptr = argbBuffer + size - 1; argb_ptr >= argbBuffer; rgba_ptr++, argb_ptr--)
{
// *argb_ptr = *rgba_ptr >> 8 | 0xff000000; // - this version doesn't change endianess
*argb_ptr = __builtin_bswap32(*rgba_ptr) >> 8 | 0xff000000; // This does
}
The first thing I did was simplify your shuffling expression. It is obvious that XRGB is just RGBA >> 8.
I also removed the array index calculation on each iteration and used pointers as loop variables.
This version is about 2 times faster than the original on my machine.
You can also use SSE for the shuffling if this code is intended for x86 CPUs.
I am very late to this one. But I had the exact same problem when generating video on the fly. By reusing the buffer, I could get away with only setting the R, G, B values for every frame and only setting the A once.
See below code:
byte[] _workingBuffer = null;
byte[] GetProcessedPixelData(SKBitmap bitmap)
{
ReadOnlySpan<byte> sourceSpan = bitmap.GetPixelSpan();
if (_workingBuffer == null || _workingBuffer.Length != bitmap.ByteCount)
{
// Alloc buffer
_workingBuffer = new byte[sourceSpan.Length];
// Set all the alpha
for (int i = 0; i < sourceSpan.Length; i += 4) _workingBuffer[i] = byte.MaxValue;
}
Stopwatch w = Stopwatch.StartNew();
for (int i = 0; i < sourceSpan.Length; i += 4)
{
// A
// Dont set alpha here. The alpha is already set in the buffer
//_workingBuffer[i] = byte.MaxValue;
//_workingBuffer[i] = sourceSpan[i + 3];
// R
_workingBuffer[i + 1] = sourceSpan[i];
// G
_workingBuffer[i + 2] = sourceSpan[i + 1];
// B
_workingBuffer[i + 3] = sourceSpan[i + 2];
}
Debug.Print("Copied " + sourceSpan.Length + " in " + w.Elapsed.TotalMilliseconds);
return _workingBuffer;
}
This got me to around 15 milliseconds on an iPhone for a (1920 * 1080 * 4) buffer, which is ~8 MB.
This was not nearly enough for me. My final solution was instead to do an offset memory copy (Buffer.BlockCopy in C#), since the alpha is not important.
byte[] _workingBuffer = null;
byte[] GetProcessedPixelData(SKBitmap bitmap)
{
ReadOnlySpan<byte> sourceSpan = bitmap.GetPixelSpan();
byte[] sourceArray = sourceSpan.ToArray();
if (_workingBuffer == null || _workingBuffer.Length != bitmap.ByteCount)
{
// Alloc buffer
_workingBuffer = new byte[sourceSpan.Length];
// Set first byte. This is the alpha component of the first pixel
_workingBuffer[0] = byte.MaxValue;
}
// Converts RGBA to ARGB in ~2 ms instead of ~15 ms
//
// Copies the whole buffer with a offset of 1
// R G B A R G B A R G B A
// Originally the source buffer has: R1, G1, B1, A1, R2, G2, B2, A2, R3, G3, B3, A3
// A R G B A R G B A R G B A
// After the copy it looks like: 0, R1, G1, B1, A1, R2, G2, B2, A2, R3, G3, B3, A3
// So essentially we get the wrong alpha for every pixel. But all alphas should be 255 anyways.
// The first byte is set in the alloc
Buffer.BlockCopy(sourceArray, 0, _workingBuffer, 1, sourceSpan.Length - 1);
// Below is an inefficient method of converting RGBA to ARGB. Takes ~15 ms on iPhone 12 Pro Max for a 8mb buffer (1920 * 1080 * 4 bytes)
/*
for (int i = 0; i < sourceSpan.Length; i += 4)
{
// A
// Dont set alpha here. The alpha is already set in the buffer
//_workingBuffer[i] = byte.MaxValue;
//_workingBuffer[i] = sourceSpan[i + 3];
byte sR = sourceSpan[i];
byte sG = sourceSpan[i + 1];
byte sB = sourceSpan[i + 2];
if (sR == 0 && sG == byte.MaxValue && sB == 0)
continue;
// R
_workingBuffer[i + 1] = sR;
// G
_workingBuffer[i + 2] = sG;
// B
_workingBuffer[i + 3] = sB;
}
*/
return _workingBuffer;
}
The comments in the code explain how this works. On the same iPhone it takes ~2 ms, which is sufficient for my use case.
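For readers who want to see the offset-copy idea in isolation, here is a small Python sketch of it. The function name is made up, and unlike the C# version above it also overwrites every alpha slot with 0xFF, so the result does not depend on the source alphas already being 255. Treat it as an illustration of the byte layout, not as a performance claim.

```python
def rgba_to_argb_offset_copy(src: bytes) -> bytearray:
    """Convert tightly packed RGBA bytes to ARGB bytes with alpha forced to 0xFF."""
    if len(src) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4")
    dst = bytearray(len(src))
    # Shift the whole buffer right by one byte: R, G, B of each pixel
    # land in slots 1..3 of the same pixel in the destination.
    dst[1:] = src[:-1]
    # Slot 0 of each pixel now holds the previous pixel's alpha (or 0 for the
    # first pixel); overwrite all of those slots with an opaque alpha.
    dst[0::4] = b"\xff" * (len(src) // 4)
    return dst
```

With two pixels R1 G1 B1 A1 R2 G2 B2 A2 as input, the output is FF R1 G1 B1 FF R2 G2 B2.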
Use assembly, the following is for Intel.
This example swaps Red and Blue.
void* b = pixels;
UINT len = textureWidth*textureHeight;
__asm
{
mov ecx, len // Set loop counter to pixels memory block size
mov ebx, b // Set ebx to pixels pointer
label:
mov al,[ebx+0] // Load Red to al
mov ah,[ebx+2] // Load Blue to ah
mov [ebx+0],ah // Swap Red
mov [ebx+2],al // Swap Blue
add ebx,4 // Move by 4 bytes to next pixel
dec ecx // Decrease loop counter
jnz label // If not zero jump to label
}
(pixel << 24) | (pixel >> 8) rotates a 32-bit integer 8 bits to the right, which would convert a 32-bit RGBA value to ARGB. This works because:
pixel << 24 discards the RGB portion of RGBA off the left side, resulting in A000.
pixel >> 8 discards the A portion of RGBA off the right side, resulting in 0RGB.
A000 | 0RGB == ARGB.
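The rotation described above can be checked quickly in Python. This is only a sketch: the function name is hypothetical, the pixel is assumed to be packed as 0xRRGGBBAA, and the final mask is needed because Python integers are not limited to 32 bits.

```python
def rgba_to_argb(pixel: int) -> int:
    """Rotate a 32-bit 0xRRGGBBAA value right by 8 bits, giving 0xAARRGGBB."""
    return ((pixel << 24) | (pixel >> 8)) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(rgba_to_argb(0x11223344)))
```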

How to compare two colors for similarity/difference

I want to design a program that can help me assess which of 5 pre-defined colors is most similar to a variable color, and with what percentage. The thing is that I don't know how to do that manually, step by step, so it is even more difficult to think of a program.
More details: the colors are from photographs of tubes with gel that has different colors. I have 5 tubes with different colors, where each is representative of 1 of 5 levels. I want to take photographs of other samples and, on the computer, assess which level a sample belongs to by comparing colors, and I want to know that with a percentage of approximation too. I would like a program that does something like this: http://www.colortools.net/color_matcher.html
If you can tell me what steps to take, even if they are things for me to think through and do manually, it would be very helpful.
See Wikipedia's article on Color Difference for the right leads.
Basically, you want to compute a distance metric in some multidimensional colorspace.
But RGB is not "perceptually uniform", so your Euclidean RGB distance metric suggested by Vadim will not match the human-perceived distance between colors. For a start, L*a*b* is intended to be a perceptually uniform colorspace, and the deltaE metric is commonly used. But there are more refined colorspaces and more refined deltaE formulas that get closer to matching human perception.
You'll have to learn more about colorspaces and illuminants to do the conversions. But for a quick formula that is better than the Euclidean RGB metric, just do this:
Assume that your RGB values are in the sRGB colorspace
Find the sRGB to L*a*b* conversion formulas
Convert your sRGB colors to L*a*b*
Compute deltaE between your two L*a*b* values
It's not computationally expensive, it's just some nonlinear formulas and some multiplications and additions.
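The four steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the input is 8-bit sRGB, the reference white is D65, the matrix and constants are the commonly published sRGB values, and deltaE is the plain CIE76 (Euclidean) form. All function names are made up for illustration, and closest_level is only one possible way to pick the nearest of the reference tubes.

```python
import math

# D65 reference white (assumption: sRGB input, 2-degree observer).
_XN, _YN, _ZN = 0.95047, 1.0, 1.08883


def _linearize(channel: int) -> float:
    """Undo the sRGB transfer curve for one 0..255 channel."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t: float) -> float:
    """Nonlinearity used by the XYZ -> L*a*b* conversion."""
    eps = 216 / 24389
    kappa = 24389 / 27
    return t ** (1 / 3) if t > eps else (kappa * t + 16) / 116


def srgb_to_lab(rgb):
    """Convert an (R, G, B) tuple of 0..255 ints to (L*, a*, b*)."""
    r, g, b = (_linearize(v) for v in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = _f(x / _XN), _f(y / _YN), _f(z / _ZN)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(rgb1, rgb2) -> float:
    """CIE76 colour difference: Euclidean distance in L*a*b*."""
    return math.dist(srgb_to_lab(rgb1), srgb_to_lab(rgb2))


def closest_level(sample, references) -> int:
    """Index of the reference colour with the smallest deltaE to the sample."""
    return min(range(len(references)), key=lambda i: delta_e76(sample, references[i]))
```

For the tube problem, references would hold the five calibrated tube colours and sample the colour measured from a new photograph; lighting differences between photographs will affect the result, so the photos should be taken under the same conditions.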
Just an idea that first came to my mind (sorry if stupid).
The three components of a color can be treated as the 3D coordinates of a point, and then you can calculate the distance between points.
For example:
Point1 has R1 G1 B1
Point2 has R2 G2 B2
Distance between colors is
d=sqrt((R2-R1)^2+(G2-G1)^2+(B2-B1)^2)
Percentage is
p=d/sqrt((255)^2+(255)^2+(255)^2)
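As a quick sketch of the two formulas above in Python (function names are mine; colours are assumed to be (R, G, B) tuples of 0..255 values):

```python
import math

# Largest possible distance in the RGB cube: black to white.
MAX_RGB_DISTANCE = math.sqrt(3 * 255 ** 2)


def rgb_distance(c1, c2) -> float:
    """Euclidean distance between two (R, G, B) tuples."""
    return math.dist(c1, c2)


def rgb_difference_percent(c1, c2) -> float:
    """Distance as a percentage of the black-to-white distance."""
    return rgb_distance(c1, c2) / MAX_RGB_DISTANCE * 100
```

Note that this is a difference percentage: 0 means identical colours and 100 means black versus white.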
I actually walked the same path a couple of months ago. There is no perfect answer to this question (which has been asked here a couple of times), but there is one that is more sophisticated than the sqrt((r2-r1)^2 ...) answer and easier to implement directly in RGB, without moving to alternate color spaces. I found this formula here; it is a low-cost approximation of the quite complicated real formula (by the CIE, which is the W3C of color; since this is an unfinished quest, you can find older and simpler color difference equations there).
Good Luck.
Edit: For posterity, here's the relevant C code:
#include <math.h> /* sqrt */
typedef struct {
unsigned char r, g, b;
} RGB;
double ColourDistance(RGB e1, RGB e2)
{
long rmean = ( (long)e1.r + (long)e2.r ) / 2;
long r = (long)e1.r - (long)e2.r;
long g = (long)e1.g - (long)e2.g;
long b = (long)e1.b - (long)e2.b;
return sqrt((((512+rmean)*r*r)>>8) + 4*g*g + (((767-rmean)*b*b)>>8));
}
If you have two Color objects c1 and c2, you can just compare each RGB value from c1 with that of c2.
int diffRed = Math.abs(c1.getRed() - c2.getRed());
int diffGreen = Math.abs(c1.getGreen() - c2.getGreen());
int diffBlue = Math.abs(c1.getBlue() - c2.getBlue());
Divide each of those values by the maximum channel value (255) to get the per-channel difference as a fraction:
float pctDiffRed = (float)diffRed / 255;
float pctDiffGreen = (float)diffGreen / 255;
float pctDiffBlue = (float)diffBlue / 255;
After which you can just find the average color difference in percentage.
(pctDiffRed + pctDiffGreen + pctDiffBlue) / 3 * 100
Which would give you a difference in percentage between c1 and c2.
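The same averaging, written out as a small Python sketch (the function name is hypothetical; colours are (R, G, B) tuples of 0..255 values, and all three channels are weighted equally, as in the Java snippets above):

```python
def average_channel_difference_percent(c1, c2) -> float:
    """Mean of the per-channel absolute differences, as a percentage of 255."""
    fractions = [abs(a - b) / 255 for a, b in zip(c1, c2)]
    return sum(fractions) / len(fractions) * 100
```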
A color value has more than one dimension, so there is no intrinsic way to compare two colors. You have to determine for your use case the meaning of the colors and thereby how to best compare them.
Most likely you want to compare the hue, saturation and/or lightness properties of the colors as opposed to the red/green/blue components. If you are having trouble figuring out how you want to compare them, take some pairs of sample colors and compare them mentally, then try to justify/explain to yourself why they are similar/different.
Once you know which properties/components of the colors you want to compare, then you need to figure out how to extract that information from a color.
Most likely you will just need to convert the color from the common RedGreenBlue representation to HueSaturationLightness, and then calculate something like
avghue = (color1.hue + color2.hue)/2
distance = abs(color1.hue-avghue)
This example would give you a simple scalar value indicating how far the gradient/hue of the colors are from each other.
See HSL and HSV at Wikipedia.
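A hedged Python sketch of a hue comparison, using the standard-library colorsys module. The function names are mine. One difference from the formula above: hue is an angle, so this version measures the shorter way around the colour wheel instead of a plain subtraction, which would otherwise rate two nearly identical reds (say 2 and 358 degrees) as far apart.

```python
import colorsys


def hue_degrees(rgb) -> float:
    """Hue of an (R, G, B) tuple of 0..255 values, in degrees (0..360)."""
    r, g, b = (v / 255.0 for v in rgb)
    h, _lightness, _saturation = colorsys.rgb_to_hls(r, g, b)
    return h * 360.0


def hue_distance(rgb1, rgb2) -> float:
    """Smallest angle between the two hues, in degrees (0..180)."""
    d = abs(hue_degrees(rgb1) - hue_degrees(rgb2))
    return min(d, 360.0 - d)
```

Hue alone says nothing about greys or very dark colours, so for real photographs it probably needs to be combined with saturation and lightness.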
One of the best methods to compare two colors by human perception is CIE76. The difference is called Delta-E. When it is less than 1, the human eye cannot recognize the difference.
There is a wonderful color utilities class, ColorUtils (code below), which includes CIE76 comparison methods. It was written by Daniel Strebel, University of Zurich.
From ColorUtils.class I use the method:
static double colorDifference(int r1, int g1, int b1, int r2, int g2, int b2)
r1,g1,b1 - RGB values of the first color
r2,g2,b2 - RGB values of the second color that you would like to compare
If you work with Android, you can get these values like this:
r1 = Color.red(pixel);
g1 = Color.green(pixel);
b1 = Color.blue(pixel);
ColorUtils.class by Daniel Strebel,University of Zurich:
import android.graphics.Color;
public class ColorUtil {
public static int argb(int R, int G, int B) {
return argb(0xFF, R, G, B); // fully opaque; Byte.MAX_VALUE would only be 127
}
public static int argb(int A, int R, int G, int B) {
byte[] colorByteArr = {(byte) A, (byte) R, (byte) G, (byte) B};
return byteArrToInt(colorByteArr);
}
public static int[] rgb(int argb) {
return new int[]{(argb >> 16) & 0xFF, (argb >> 8) & 0xFF, argb & 0xFF};
}
public static int byteArrToInt(byte[] colorByteArr) {
return (colorByteArr[0] << 24) + ((colorByteArr[1] & 0xFF) << 16)
+ ((colorByteArr[2] & 0xFF) << 8) + (colorByteArr[3] & 0xFF);
}
public static int[] rgb2lab(int R, int G, int B) {
//http://www.brucelindbloom.com
float r, g, b, X, Y, Z, fx, fy, fz, xr, yr, zr;
float Ls, as, bs;
float eps = 216.f / 24389.f;
float k = 24389.f / 27.f;
float Xr = 0.964221f; // reference white D50
float Yr = 1.0f;
float Zr = 0.825211f;
// RGB to XYZ
r = R / 255.f; //R 0..1
g = G / 255.f; //G 0..1
b = B / 255.f; //B 0..1
// assuming sRGB (D65)
if (r <= 0.04045)
r = r / 12.92f; // linear segment of the sRGB curve divides by 12.92
else
r = (float) Math.pow((r + 0.055) / 1.055, 2.4);
if (g <= 0.04045)
g = g / 12.92f;
else
g = (float) Math.pow((g + 0.055) / 1.055, 2.4);
if (b <= 0.04045)
b = b / 12.92f;
else
b = (float) Math.pow((b + 0.055) / 1.055, 2.4);
X = 0.436052025f * r + 0.385081593f * g + 0.143087414f * b;
Y = 0.222491598f * r + 0.71688606f * g + 0.060621486f * b;
Z = 0.013929122f * r + 0.097097002f * g + 0.71418547f * b;
// XYZ to Lab
xr = X / Xr;
yr = Y / Yr;
zr = Z / Zr;
if (xr > eps)
fx = (float) Math.pow(xr, 1 / 3.);
else
fx = (float) ((k * xr + 16.) / 116.);
if (yr > eps)
fy = (float) Math.pow(yr, 1 / 3.);
else
fy = (float) ((k * yr + 16.) / 116.);
if (zr > eps)
fz = (float) Math.pow(zr, 1 / 3.);
else
fz = (float) ((k * zr + 16.) / 116);
Ls = (116 * fy) - 16;
as = 500 * (fx - fy);
bs = 200 * (fy - fz);
int[] lab = new int[3];
lab[0] = (int) (2.55 * Ls + .5);
lab[1] = (int) (as + .5);
lab[2] = (int) (bs + .5);
return lab;
}
/**
* Computes the difference between two RGB colors by converting them to the L*a*b scale and
* comparing them using the CIE76 algorithm { http://en.wikipedia.org/wiki/Color_difference#CIE76}
*/
public static double getColorDifference(int a, int b) {
int r1, g1, b1, r2, g2, b2;
r1 = Color.red(a);
g1 = Color.green(a);
b1 = Color.blue(a);
r2 = Color.red(b);
g2 = Color.green(b);
b2 = Color.blue(b);
int[] lab1 = rgb2lab(r1, g1, b1);
int[] lab2 = rgb2lab(r2, g2, b2);
return Math.sqrt(Math.pow(lab2[0] - lab1[0], 2) + Math.pow(lab2[1] - lab1[1], 2) + Math.pow(lab2[2] - lab1[2], 2));
}
}
Just another answer, although it's similar to Supr's one - just a different color space.
The thing is: humans do not perceive differences in color uniformly, and the RGB color space ignores this. As a result, if you use the RGB color space and just compute the Euclidean distance between two colors, you may get a difference that is mathematically correct but does not coincide with what humans would tell you.
This may not be a problem (the difference is not that large, I think), but if you want to solve this "better" you should convert your RGB colors into a color space that was specifically designed to avoid the above problem. There are several, each improving on earlier models (since this is based on human perception, the "correct" values have to be measured from experimental data). There's the Lab color space, which I think would be the best, although it is a bit complicated to convert to. Simpler would be the CIE XYZ one.
Here's a site that lists the formulas to convert between different color spaces so you can experiment a bit.
A Kotlin version that lets you specify what percentage of similarity counts as a match.
Method call with the optional percent argument:
isMatchingColor(intColor1, intColor2, 95) // should match color if 95% similar
Method body:
private fun isMatchingColor(intColor1: Int, intColor2: Int, percent: Int = 90): Boolean {
// Largest per-channel difference that still counts as a match
val threshold = 255 - (255 / 100f * percent)
val diffAlpha = abs(Color.alpha(intColor1) - Color.alpha(intColor2))
val diffRed = abs(Color.red(intColor1) - Color.red(intColor2))
val diffGreen = abs(Color.green(intColor1) - Color.green(intColor2))
val diffBlue = abs(Color.blue(intColor1) - Color.blue(intColor2))
if (diffAlpha > threshold) {
return false
}
if (diffRed > threshold) {
return false
}
if (diffGreen > threshold) {
return false
}
if (diffBlue > threshold) {
return false
}
return true
}
All methods below result in a scale from 0-100.
internal static class ColorDifference
{
internal enum Method
{
Binary, // true or false, 0 is false
Square,
Dimensional,
CIE76
}
public static double Calculate(Method method, int argb1, int argb2)
{
int[] c1 = ColorConversion.ArgbToArray(argb1);
int[] c2 = ColorConversion.ArgbToArray(argb2);
return Calculate(method, c1[1], c2[1], c1[2], c2[2], c1[3], c2[3], c1[0], c2[0]);
}
public static double Calculate(Method method, int r1, int r2, int g1, int g2, int b1, int b2, int a1 = -1, int a2 = -1)
{
switch (method)
{
case Method.Binary:
return (r1 == r2 && g1 == g2 && b1 == b2 && a1 == a2) ? 0 : 100;
case Method.CIE76:
return CalculateCIE76(r1, r2, g1, g2, b1, b2);
case Method.Dimensional:
if (a1 == -1 || a2 == -1) return Calculate3D(r1, r2, g1, g2, b1, b2);
else return Calculate4D(r1, r2, g1, g2, b1, b2, a1, a2);
case Method.Square:
return CalculateSquare(r1, r2, g1, g2, b1, b2, a1, a2);
default:
throw new InvalidOperationException();
}
}
public static double Calculate(Method method, Color c1, Color c2, bool alpha)
{
switch (method)
{
case Method.Binary:
return (c1.R == c2.R && c1.G == c2.G && c1.B == c2.B && (!alpha || c1.A == c2.A)) ? 0 : 100;
case Method.CIE76:
if (alpha) throw new InvalidOperationException();
return CalculateCIE76(c1, c2);
case Method.Dimensional:
if (alpha) return Calculate4D(c1, c2);
else return Calculate3D(c1, c2);
case Method.Square:
if (alpha) return CalculateSquareAlpha(c1, c2);
else return CalculateSquare(c1, c2);
default:
throw new InvalidOperationException();
}
}
// A simple idea, based on a square
public static double CalculateSquare(int argb1, int argb2)
{
int[] c1 = ColorConversion.ArgbToArray(argb1);
int[] c2 = ColorConversion.ArgbToArray(argb2);
return CalculateSquare(c1[1], c2[1], c1[2], c2[2], c1[3], c2[3]);
}
public static double CalculateSquare(Color c1, Color c2)
{
return CalculateSquare(c1.R, c2.R, c1.G, c2.G, c1.B, c2.B);
}
public static double CalculateSquareAlpha(int argb1, int argb2)
{
int[] c1 = ColorConversion.ArgbToArray(argb1);
int[] c2 = ColorConversion.ArgbToArray(argb2);
return CalculateSquare(c1[1], c2[1], c1[2], c2[2], c1[3], c2[3], c1[0], c2[0]);
}
public static double CalculateSquareAlpha(Color c1, Color c2)
{
return CalculateSquare(c1.R, c2.R, c1.G, c2.G, c1.B, c2.B, c1.A, c2.A);
}
public static double CalculateSquare(int r1, int r2, int g1, int g2, int b1, int b2, int a1 = -1, int a2 = -1)
{
if (a1 == -1 || a2 == -1) return (Math.Abs(r1 - r2) + Math.Abs(g1 - g2) + Math.Abs(b1 - b2)) / 7.65;
else return (Math.Abs(r1 - r2) + Math.Abs(g1 - g2) + Math.Abs(b1 - b2) + Math.Abs(a1 - a2)) / 10.2;
}
// from:http://stackoverflow.com/questions/9018016/how-to-compare-two-colors
public static double Calculate3D(int argb1, int argb2)
{
int[] c1 = ColorConversion.ArgbToArray(argb1);
int[] c2 = ColorConversion.ArgbToArray(argb2);
return Calculate3D(c1[1], c2[1], c1[2], c2[2], c1[3], c2[3]);
}
public static double Calculate3D(Color c1, Color c2)
{
return Calculate3D(c1.R, c2.R, c1.G, c2.G, c1.B, c2.B);
}
public static double Calculate3D(int r1, int r2, int g1, int g2, int b1, int b2)
{
return Math.Sqrt(Math.Pow(Math.Abs(r1 - r2), 2) + Math.Pow(Math.Abs(g1 - g2), 2) + Math.Pow(Math.Abs(b1 - b2), 2)) / 4.41672955930063709849498817084;
}
// Same as above, but made 4D to include alpha channel
public static double Calculate4D(int argb1, int argb2)
{
int[] c1 = ColorConversion.ArgbToArray(argb1);
int[] c2 = ColorConversion.ArgbToArray(argb2);
return Calculate4D(c1[1], c2[1], c1[2], c2[2], c1[3], c2[3], c1[0], c2[0]);
}
public static double Calculate4D(Color c1, Color c2)
{
return Calculate4D(c1.R, c2.R, c1.G, c2.G, c1.B, c2.B, c1.A, c2.A);
}
public static double Calculate4D(int r1, int r2, int g1, int g2, int b1, int b2, int a1, int a2)
{
return Math.Sqrt(Math.Pow(Math.Abs(r1 - r2), 2) + Math.Pow(Math.Abs(g1 - g2), 2) + Math.Pow(Math.Abs(b1 - b2), 2) + Math.Pow(Math.Abs(a1 - a2), 2)) / 5.1;
}
/**
* Computes the difference between two RGB colors by converting them to the L*a*b scale and
* comparing them using the CIE76 algorithm { http://en.wikipedia.org/wiki/Color_difference#CIE76}
*/
public static double CalculateCIE76(int argb1, int argb2)
{
return CalculateCIE76(Color.FromArgb(argb1), Color.FromArgb(argb2));
}
public static double CalculateCIE76(Color c1, Color c2)
{
return CalculateCIE76(c1.R, c2.R, c1.G, c2.G, c1.B, c2.B);
}
public static double CalculateCIE76(int r1, int r2, int g1, int g2, int b1, int b2)
{
int[] lab1 = ColorConversion.ColorToLab(r1, g1, b1);
int[] lab2 = ColorConversion.ColorToLab(r2, g2, b2);
return Math.Sqrt(Math.Pow(lab2[0] - lab1[0], 2) + Math.Pow(lab2[1] - lab1[1], 2) + Math.Pow(lab2[2] - lab1[2], 2)) / 2.55;
}
}
internal static class ColorConversion
{
public static int[] ArgbToArray(int argb)
{
return new int[] { (argb >> 24) & 0xFF, (argb >> 16) & 0xFF, (argb >> 8) & 0xFF, argb & 0xFF }; // mask alpha too: >> on a negative int sign-extends
}
public static int[] ColorToLab(int R, int G, int B)
{
// http://www.brucelindbloom.com
double r, g, b, X, Y, Z, fx, fy, fz, xr, yr, zr;
double Ls, fas, fbs;
double eps = 216.0f / 24389.0f;
double k = 24389.0f / 27.0f;
double Xr = 0.964221f; // reference white D50
double Yr = 1.0f;
double Zr = 0.825211f;
// RGB to XYZ
r = R / 255.0f; //R 0..1
g = G / 255.0f; //G 0..1
b = B / 255.0f; //B 0..1
// assuming sRGB (D65)
if (r <= 0.04045) r = r / 12.92; // linear segment of the sRGB curve divides by 12.92
else r = (float)Math.Pow((r + 0.055) / 1.055, 2.4);
if (g <= 0.04045) g = g / 12.92;
else g = (float)Math.Pow((g + 0.055) / 1.055, 2.4);
if (b <= 0.04045) b = b / 12.92;
else b = (float)Math.Pow((b + 0.055) / 1.055, 2.4);
X = 0.436052025f * r + 0.385081593f * g + 0.143087414f * b;
Y = 0.222491598f * r + 0.71688606f * g + 0.060621486f * b;
Z = 0.013929122f * r + 0.097097002f * g + 0.71418547f * b;
// XYZ to Lab
xr = X / Xr;
yr = Y / Yr;
zr = Z / Zr;
if (xr > eps) fx = (float)Math.Pow(xr, 1 / 3.0);
else fx = (float)((k * xr + 16.0) / 116.0);
if (yr > eps) fy = (float)Math.Pow(yr, 1 / 3.0);
else fy = (float)((k * yr + 16.0) / 116.0);
if (zr > eps) fz = (float)Math.Pow(zr, 1 / 3.0);
else fz = (float)((k * zr + 16.0) / 116);
Ls = (116 * fy) - 16;
fas = 500 * (fx - fy);
fbs = 200 * (fy - fz);
int[] lab = new int[3];
lab[0] = (int)(2.55 * Ls + 0.5);
lab[1] = (int)(fas + 0.5);
lab[2] = (int)(fbs + 0.5);
return lab;
}
}
A simple method that only uses RGB is
cR=R1-R2
cG=G1-G2
cB=B1-B2
uR=R1+R2
distance=cR*cR*(2+uR/256) + cG*cG*4 + cB*cB*(2+(255-uR)/256)
I've used this one for a while now, and it works well enough for most purposes.
I've tried various methods like LAB color space, HSV comparisons and I've found that luminosity works pretty well for this purpose.
Here is a Python version (ImageColor comes from Pillow):
import math
from PIL import ImageColor
def lum(c):
    # Relative luminance of an (R, G, B) tuple, offset by 0.05
    def factor(component):
        component = component / 255
        if component <= 0.03928:
            component = component / 12.92
        else:
            component = math.pow((component + 0.055) / 1.055, 2.4)
        return component
    components = [factor(ci) for ci in c]
    return (components[0] * 0.2126 + components[1] * 0.7152 + components[2] * 0.0722) + 0.05
def color_distance(c1, c2):
    l1 = lum(c1)
    l2 = lum(c2)
    higher = max(l1, l2)
    lower = min(l1, l2)
    return (higher - lower) / higher
c1 = ImageColor.getrgb('white')
c2 = ImageColor.getrgb('yellow')
print(color_distance(c1, c2))
This prints:
0.0687619047619048
Android, using the ColorUtils API RGBToHSL:
I had two int ARGB colors (color1, color2) and I wanted to get the distance/difference between the two colors. Here is what I did:
private float getHue(int color) {
int R = (color >> 16) & 0xff;
int G = (color >> 8) & 0xff;
int B = (color ) & 0xff;
float[] colorHue = new float[3];
ColorUtils.RGBToHSL(R, G, B, colorHue);
return colorHue[0];
}
Then I used the code below to find the distance between the two colors, calling it as getDistance(getHue(color1), getHue(color2)).
private float getDistance(float hue1, float hue2) {
float avgHue = (hue1 + hue2)/2;
return Math.abs(hue1 - avgHue);
}
The best way is deltaE, a number that expresses the difference between two colors. If deltaE < 1, the human eye cannot recognize the difference. I wrote some canvas and JavaScript code that converts RGB to Lab and then calculates deltaE (the 1994 formula). In this example, the code compares every pixel with a base color that I saved as lab11 and colors the pixel red when its deltaE is within the accepted range. You can increase or reduce the sensitivity by widening or narrowing the acceptable range of deltaE. In this example I use 10, in the line containing (deltae <= 10):
<script>
var constants = {
canvasWidth: 700, // In pixels.
canvasHeight: 600, // In pixels.
colorMap: new Array()
};
// -----------------------------------------------------------------------------------------------------
function fillcolormap(imageObj1) {
function rgbtoxyz(red1,green1,blue1){ // a converter for converting rgb model to xyz model
var red2 = red1/255;
var green2 = green1/255;
var blue2 = blue1/255;
if(red2>0.04045){
red2 = (red2+0.055)/1.055;
red2 = Math.pow(red2,2.4);
}
else{
red2 = red2/12.92;
}
if(green2>0.04045){
green2 = (green2+0.055)/1.055;
green2 = Math.pow(green2,2.4);
}
else{
green2 = green2/12.92;
}
if(blue2>0.04045){
blue2 = (blue2+0.055)/1.055;
blue2 = Math.pow(blue2,2.4);
}
else{
blue2 = blue2/12.92;
}
red2 = (red2*100);
green2 = (green2*100);
blue2 = (blue2*100);
var x = (red2 * 0.4124) + (green2 * 0.3576) + (blue2 * 0.1805);
var y = (red2 * 0.2126) + (green2 * 0.7152) + (blue2 * 0.0722);
var z = (red2 * 0.0193) + (green2 * 0.1192) + (blue2 * 0.9505);
var xyzresult = new Array();
xyzresult[0] = x;
xyzresult[1] = y;
xyzresult[2] = z;
return(xyzresult);
} //end of rgb_to_xyz function
function xyztolab(xyz){ //a convertor from xyz to lab model
var x = xyz[0];
var y = xyz[1];
var z = xyz[2];
var x2 = x/95.047;
var y2 = y/100;
var z2 = z/108.883;
if(x2>0.008856){
x2 = Math.pow(x2,1/3);
}
else{
x2 = (7.787*x2) + (16/116);
}
if(y2>0.008856){
y2 = Math.pow(y2,1/3);
}
else{
y2 = (7.787*y2) + (16/116);
}
if(z2>0.008856){
z2 = Math.pow(z2,1/3);
}
else{
z2 = (7.787*z2) + (16/116);
}
var l= 116*y2 - 16;
var a= 500*(x2-y2);
var b= 200*(y2-z2);
var labresult = new Array();
labresult[0] = l;
labresult[1] = a;
labresult[2] = b;
return(labresult);
}
var canvas = document.getElementById('myCanvas');
var context = canvas.getContext('2d');
var imageX = 0;
var imageY = 0;
context.drawImage(imageObj1, imageX, imageY, 240, 140);
var imageData = context.getImageData(0, 0, 240, 140);
var data = imageData.data;
var n = data.length;
// iterate over all pixels
var m = 0;
for (var i = 0; i < n; i += 4) {
var red = data[i];
var green = data[i + 1];
var blue = data[i + 2];
var xyzcolor = new Array();
xyzcolor = rgbtoxyz(red,green,blue);
var lab = new Array();
lab = xyztolab(xyzcolor);
constants.colorMap.push(lab); //fill up the colormap array with lab colors.
}
}
// -----------------------------------------------------------------------------------------------------
function colorize(pixqty) {
function deltae94(lab1,lab2){ //calculating Delta E 1994
var c1 = Math.sqrt((lab1[1]*lab1[1])+(lab1[2]*lab1[2]));
var c2 = Math.sqrt((lab2[1]*lab2[1])+(lab2[2]*lab2[2]));
var dc = c1-c2;
var dl = lab1[0]-lab2[0];
var da = lab1[1]-lab2[1];
var db = lab1[2]-lab2[2];
var dh = Math.sqrt((da*da)+(db*db)-(dc*dc));
var first = dl;
var second = dc/(1+(0.045*c1));
var third = dh/(1+(0.015*c1));
var deresult = Math.sqrt((first*first)+(second*second)+(third*third));
return(deresult);
} // end of deltae94 function
var lab11 = new Array("80","-4","21");
var lab12 = new Array();
var k2=0;
var canvas = document.getElementById('myCanvas');
var context = canvas.getContext('2d');
var imageData = context.getImageData(0, 0, 240, 140);
var data = imageData.data;
for (var i=0; i<pixqty; i++) {
lab12 = constants.colorMap[i];
var deltae = deltae94(lab11,lab12);
if (deltae <= 10) {
data[i*4] = 255;
data[(i*4)+1] = 0;
data[(i*4)+2] = 0;
k2++;
} // end of if
} //end of for loop
context.clearRect(0,0,240,140);
alert(k2);
context.putImageData(imageData,0,0);
}
// -----------------------------------------------------------------------------------------------------
$(window).load(function () {
var imageObj = new Image();
imageObj.onload = function() {
fillcolormap(imageObj);
}
imageObj.src = './mixcolor.png';
});
// ---------------------------------------------------------------------------------------------------
var pixno2 = 240*140;
</script>
I used this in my Android app and it seems satisfactory, although RGB space is not recommended:
public double colourDistance(int red1,int green1, int blue1, int red2, int green2, int blue2)
{
double rmean = ( red1 + red2 )/2;
int r = red1 - red2;
int g = green1 - green2;
int b = blue1 - blue2;
double weightR = 2 + rmean/256;
double weightG = 4.0;
double weightB = 2 + (255-rmean)/256;
return Math.sqrt(weightR*r*r + weightG*g*g + weightB*b*b);
}
Then I used the following to get percent of similarity:
double maxColDist = 764.8339663572415;
double d1 = colourDistance(red1,green1,blue1,red2,green2,blue2);
String s1 = (int) Math.round(((maxColDist-d1)/maxColDist)*100) + "% match";
It works well enough.
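The constant maxColDist above appears to be the distance this formula gives between pure black and pure white (about 764.83). A minimal sketch that computes it instead of hard-coding it, assuming 8-bit channels; the class name RedmeanDistance and the method percentMatch are made up for this illustration:

```java
final class RedmeanDistance {
    // Weighted RGB distance whose red and blue weights depend on the mean
    // red level; mirrors the colourDistance method above (channels 0..255).
    static double distance(int r1, int g1, int b1, int r2, int g2, int b2) {
        double rmean = (r1 + r2) / 2.0;
        int dr = r1 - r2;
        int dg = g1 - g2;
        int db = b1 - b2;
        double weightR = 2 + rmean / 256;
        double weightG = 4.0;
        double weightB = 2 + (255 - rmean) / 256;
        return Math.sqrt(weightR * dr * dr + weightG * dg * dg + weightB * db * db);
    }

    // Similarity as a percentage, normalised by the black-to-white distance.
    static double percentMatch(int r1, int g1, int b1, int r2, int g2, int b2) {
        double max = distance(0, 0, 0, 255, 255, 255);
        return (max - distance(r1, g1, b1, r2, g2, b2)) / max * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(distance(0, 0, 0, 255, 255, 255));
        System.out.println(Math.round(percentMatch(200, 30, 30, 210, 40, 25)) + "% match");
    }
}
```

Deriving the maximum from the function itself keeps the percentage consistent if the weights are ever changed.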
I expect you want to analyze a whole image at the end, don't you? So you could check for the smallest/highest difference to the identity color matrix.
Most math operations for processing graphics use matrices, because matrix-based algorithms are often faster than classical point-by-point distance and comparison calculations (e.g. for operations using DirectX, OpenGL, ...).
So I think you should start here:
http://en.wikipedia.org/wiki/Identity_matrix
http://en.wikipedia.org/wiki/Matrix_difference_equation
... and as Beska already commented above:
This may not give the best "visible" difference...
Which also means that your algorithm depends on your definition of "similar to" if you are processing images.
You'll need to convert any RGB colors into the Lab color space to be able to compare them in the way that humans see them. Otherwise you'll be getting RGB colors that 'match' in some very strange ways.
The Wikipedia article on Color Differences gives you an intro to the various Lab color space difference algorithms that have been defined over the years. The simplest, which just takes the Euclidean distance between two Lab colours, works but has a few flaws.
Conveniently, there's a Java implementation of the more sophisticated CIEDE2000 algorithm in the OpenIMAJ project. Give it your two Lab colours and it'll return a single distance value.
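For the simplest variant mentioned above (Euclidean distance in Lab, often called CIE76), a rough sketch might look like this. It assumes the inputs are 8-bit sRGB values and a D65 white point, and uses the commonly quoted sRGB-to-XYZ coefficients rounded to four decimals. The class and method names are hypothetical; this is not the OpenIMAJ API and not CIEDE2000.

```java
final class LabDistance {
    // D65 reference white, scaled so that Y = 1 (assumption: sRGB input).
    private static final double XN = 0.95047, YN = 1.0, ZN = 1.08883;

    // Undo the sRGB transfer curve: 8-bit channel -> linear light in [0, 1].
    static double toLinear(int channel) {
        double c = channel / 255.0;
        return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
    }

    // Nonlinearity used by CIELAB: cube root with a linear segment near zero.
    static double f(double t) {
        final double delta = 6.0 / 29.0;
        return t > delta * delta * delta
                ? Math.cbrt(t)
                : t / (3 * delta * delta) + 4.0 / 29.0;
    }

    // sRGB (0..255 per channel) -> {L, a, b}.
    static double[] rgbToLab(int r, int g, int b) {
        double rl = toLinear(r), gl = toLinear(g), bl = toLinear(b);
        double x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl;
        double y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl;
        double z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl;
        double fx = f(x / XN), fy = f(y / YN), fz = f(z / ZN);
        return new double[] {116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)};
    }

    // CIE76: plain Euclidean distance between two Lab triples.
    static double deltaE76(double[] lab1, double[] lab2) {
        double dl = lab1[0] - lab2[0];
        double da = lab1[1] - lab2[1];
        double db = lab1[2] - lab2[2];
        return Math.sqrt(dl * dl + da * da + db * db);
    }

    public static void main(String[] args) {
        double[] c1 = rgbToLab(200, 30, 30);
        double[] c2 = rgbToLab(210, 40, 25);
        System.out.println(deltaE76(c1, c2));
    }
}
```

The Lab triples produced by rgbToLab could also be handed to a more sophisticated difference formula in place of deltaE76.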
The only "right" way to compare colors is to do it with deltaE in CIELab or CIELuv.
But for a lot of applications I think this is a good enough approximation:
distance = 3 * |dR| + 4 * |dG| + 3 * |dB|
I think a weighted Manhattan distance makes a lot more sense when comparing colors. Remember that color primaries are only in our head. They don't have any physical significance. CIELab and CIELuv are modelled statistically from our perception of color.
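As a small sketch, the weighted Manhattan formula above could be written like this, assuming 8-bit channels. The class and method names are only illustrative:

```java
final class ManhattanColorDistance {
    // distance = 3*|dR| + 4*|dG| + 3*|dB|, channels in 0..255.
    // The largest possible value is 10 * 255 = 2550 (black vs white).
    static int distance(int r1, int g1, int b1, int r2, int g2, int b2) {
        return 3 * Math.abs(r1 - r2)
             + 4 * Math.abs(g1 - g2)
             + 3 * Math.abs(b1 - b2);
    }

    public static void main(String[] args) {
        System.out.println(distance(200, 30, 30, 210, 40, 25));
    }
}
```

Dividing the result by 2550 would give a value between 0 and 1 if a normalised score is more convenient.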
For quick and dirty, you can do
import java.awt.Color;
private Color dropPrecision(Color c,int threshold){
return new Color((c.getRed()/threshold),
(c.getGreen()/threshold),
(c.getBlue()/threshold));
}
public boolean inThreshold(Color _1,Color _2,int threshold){
return dropPrecision(_1,threshold).equals(dropPrecision(_2,threshold)); // == would compare object references, not colour values
}
making use of integer division to quantize the colors.
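One thing to be aware of with this kind of quantization is that two values on either side of a bucket boundary are treated as different even when they are almost identical. The sketch below shows that for a single channel and contrasts it with a plain absolute-difference check, which is a different technique from the one above; all names are hypothetical:

```java
final class QuantizeDemo {
    // Integer division maps a channel value (0..255) to a bucket index.
    static int bucket(int channel, int threshold) {
        return channel / threshold;
    }

    // Quantization approach: equal only if both values share a bucket.
    static boolean sameBucket(int a, int b, int threshold) {
        return bucket(a, threshold) == bucket(b, threshold);
    }

    // Alternative: compare the absolute difference directly.
    static boolean withinThreshold(int a, int b, int threshold) {
        return Math.abs(a - b) < threshold;
    }

    public static void main(String[] args) {
        // 99 and 100 differ by 1 but fall into buckets 1 and 2 for threshold 50.
        System.out.println(sameBucket(99, 100, 50));
        System.out.println(withinThreshold(99, 100, 50));
    }
}
```

Whether the boundary effect matters depends on the use case; for coarse grouping of colours the bucket approach may be exactly what is wanted.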
Swift 5 Answer
I found this thread because I needed a Swift version of this question. As nobody has answered with the solution, here's mine:
extension UIColor {
var rgba: (red: CGFloat, green: CGFloat, blue: CGFloat, alpha: CGFloat) {
var red: CGFloat = 0
var green: CGFloat = 0
var blue: CGFloat = 0
var alpha: CGFloat = 0
getRed(&red, green: &green, blue: &blue, alpha: &alpha)
return (red, green, blue, alpha)
}
func isSimilar(to colorB: UIColor) -> Bool {
let rgbA = self.rgba
let rgbB = colorB.rgba
let diffRed = abs(rgbA.red - rgbB.red)
let diffGreen = abs(rgbA.green - rgbB.green)
let diffBlue = abs(rgbA.blue - rgbB.blue)
// Components are in 0...1, so the mean difference times 100 is a percentage.
let pct = (diffRed + diffGreen + diffBlue) / 3 * 100
return pct < 10
}
}
Usage:
let black: UIColor = UIColor.black
let white: UIColor = UIColor.white
let similar: Bool = black.isSimilar(to: white)
I set less than 10% difference to return similar colours, but you can customise this yourself.

Resources