I want to smoothly upscale grayscale images (input masks, really) that contain discrete values. The values in these images are indices that represent arbitrary concepts (e.g. "terrain types"; they are usually indices into a table) rather than values on a continuous scale, so they can't be averaged or blended in any way.
Do there exist algorithms that can do this with a more pleasing result than nearest-neighbour, which produces a very blocky, pixelated output? I am looking for something that will at least produce more rounded, more fluid results. The ideal thing would be a whitepaper, or a library (preferably in Java).
I've researched the subject, but I can't find anything. There is plenty about linear or cubic interpolation, etc., but that won't work for indexed values. The only algorithm I ever see mentioned that does not try to average values is nearest-neighbour. But surely there must be more?
Using colour here for clarity. I do of course understand that the preferred result here is impossible; I'm not asking for something that reconstitutes destroyed information, just hoping for something that will at least guesstimate something smoother than the first result.
Scan the destination image, and for every corresponding source pixel (non-integer coordinates) check whether the colors of the four surrounding pixels are the same. If yes, assign that color.
If not, perform as many bilinear interpolations as there are distinct colors. For each color in turn, assign the weight 1 to that color and 0 to the others, and interpolate the weights. Finally, keep the color with the largest weight.
By analytical geometry, one can show that in bilinear interpolation, the iso-weight curves are arcs of hyperbola. If your magnification is large, you will see them. G1 continuity is not guaranteed. If this is an annoyance, you can work with G1 bicubic interpolation instead.
If this still does not satisfy you, you can try smooth approximating surfaces rather than interpolating ones. But the principle of keeping the color of maximum weight remains.
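To make the per-pixel procedure concrete, here is a minimal NumPy sketch (my own illustration, not the answerer's code); src is assumed to be a 2D array of label indices and scale an integer magnification factor.
import numpy as np

def upscale_labels_bilinear(src, scale):
    h, w = src.shape
    out = np.empty((h * scale, w * scale), dtype=src.dtype)
    for yd in range(h * scale):
        for xd in range(w * scale):
            # Corresponding (non-integer) source coordinates.
            ys = (yd + 0.5) / scale - 0.5
            xs = (xd + 0.5) / scale - 0.5
            y0 = int(np.clip(np.floor(ys), 0, h - 2))
            x0 = int(np.clip(np.floor(xs), 0, w - 2))
            fy = float(np.clip(ys - y0, 0.0, 1.0))
            fx = float(np.clip(xs - x0, 0.0, 1.0))
            neighbours = src[y0:y0 + 2, x0:x0 + 2]
            labels = np.unique(neighbours)
            if len(labels) == 1:
                out[yd, xd] = labels[0]  # all four neighbours agree
                continue
            # Bilinear weights of the four neighbours.
            wts = np.array([[(1 - fy) * (1 - fx), (1 - fy) * fx],
                            [fy * (1 - fx),       fy * fx]])
            # For each candidate label, interpolate its 0/1 indicator (i.e. sum the
            # weights of neighbours carrying that label), then keep the maximum.
            scores = [wts[neighbours == lab].sum() for lab in labels]
            out[yd, xd] = labels[int(np.argmax(scores))]
    return out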
If there aren't many distinct colors and you want to use ready-made functions, you can work this out as follows:
split the image into several binary images (white for a chosen color, black for background);
magnify all images (to grayscale) using your favorite method;
now implement a function that assigns every pixel the color that has the largest value among the magnified images.
You can also apply a smoothing filter to the binary images before or after magnification. A short code sketch of this recipe follows the illustrations below.
For the sake of illustration, here is what you would get with two colors at a time (but this easily generalizes).
Color source image:
Smoothing applied to the binary equivalents:
Magnified:
Maximum weight decision:
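In code, the "split, magnify, take the maximum" recipe might look roughly like this (again my own OpenCV/NumPy sketch, not part of the original answer; src, scale, the Gaussian kernel size and the cubic magnification are assumed choices):
import cv2
import numpy as np

def upscale_labels_split(src, scale, smooth_ksize=5):
    labels = np.unique(src)
    h, w = src.shape
    weights = []
    for lab in labels:
        binary = (src == lab).astype(np.float32)              # white for the chosen label
        binary = cv2.GaussianBlur(binary, (smooth_ksize, smooth_ksize), 0)
        magnified = cv2.resize(binary, (w * scale, h * scale),
                               interpolation=cv2.INTER_CUBIC)  # your favorite grayscale method
        weights.append(magnified)
    # Every output pixel gets the label whose magnified weight is largest.
    best = np.argmax(np.stack(weights, axis=0), axis=0)
    return labels[best]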
One thing you could try is to extract a polygon for the boundary of each uniformly-colored region, then upscale and draw the polygon in the output image. You won't create neatly rounded edges, but you will avoid the staircase effect of nearest-neighbour interpolation. Upscaling polygons should avoid gaps between the regions too.
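A possible OpenCV sketch of this polygon idea (my own, not the answerer's; src is an assumed 2D index image, scale an integer factor, and the two-value findContours return assumes OpenCV 4):
import cv2
import numpy as np

def upscale_labels_polygons(src, scale):
    h, w = src.shape
    out = np.zeros((h * scale, w * scale), dtype=src.dtype)
    for lab in np.unique(src):
        mask = (src == lab).astype(np.uint8)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            # Optionally simplify the boundary, then scale the polygon up and
            # draw it filled with the region's label value.
            poly = cv2.approxPolyDP(contour, 1.0, True) * scale
            cv2.fillPoly(out, [poly.astype(np.int32)], int(lab))
    return out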
I guess that smoothing the shape for each value individually is a way to avoid undesired mixed values.
To handle the values individually, I started with your nearest-neighbour image and created 3 images { A.bmp, B.bmp, C.bmp } by hand.
(Each image has only 1 color region and the background is black; e.g. A.bmp is below:)
After smoothing the shape in each image, draw these shapes into one result image buffer, each with a different color.
//I use C++ and OpenCV
#include <opencv2/opencv.hpp>
#include <string>

int main()
{
    const std::string FileNames[3] = { "A.bmp", "B.bmp", "C.bmp" };
    const cv::Scalar ResultShowColor[3] = { cv::Scalar(0,255,255), cv::Scalar(0,255,0), cv::Scalar(0,0,255) };

    cv::Mat Imgs[3];
    const int KernelSize = 15;
    for( int i=0; i<3; ++i )
    {
        // Load each single-value mask, re-binarize it, smooth its outline with a
        // Gaussian blur, then threshold again to obtain a smoothed binary shape.
        Imgs[i] = cv::imread( FileNames[i], cv::IMREAD_GRAYSCALE );
        if( Imgs[i].empty() ){ return 0; }

        cv::threshold( Imgs[i], Imgs[i], 32, 255, cv::THRESH_BINARY );
        cv::GaussianBlur( Imgs[i], Imgs[i], cv::Size(KernelSize,KernelSize), 0 );
        cv::threshold( Imgs[i], Imgs[i], 255*0.5, 255, cv::THRESH_BINARY );
        cv::imshow( FileNames[i], Imgs[i] );
    }

    // Paint each smoothed shape into one result image with its own color.
    cv::Mat ResultImg = cv::Mat::zeros( Imgs[0].size(), CV_8UC3 );
    for( int i=0; i<3; ++i )
    {
        ResultImg.setTo( ResultShowColor[i], Imgs[i] );
    }

    cv::imshow( "ResultImg", ResultImg );
    if( cv::waitKey() == 's' ){ cv::imwrite( "ResultImg.png", ResultImg ); }
    return 0;
}
This is the result:
Yes, this result is not good enough; gaps exist at the boundaries of the shapes.
Therefore some ingenuity is required... but I post this because it might give you a hint.
I processed my input image and the result is below. I just need the characters. I tried but can't remove the noise surrounding the characters.
A simple erosion with a small structuring element, like a 3 x 3 square, may work: it would eliminate the small white noise and thus make the characters darker. You can also take advantage of the fact that the black areas that are not characters are connected to the boundaries of the image, so you can remove them by clearing regions connected to the border.
Therefore, perform an erosion first using imerode, then remove the border regions using imclearborder. The latter requires that the pixels touching the border are white, so feed the inverse of the imerode output into the function, then invert the result again.
Something like this will work and I'll read your image from Stack Overflow directly:
% Read the image and threshold in case
im = imread('https://i.stack.imgur.com/Hl6Y9.jpg');
im = im > 200;
% Erode
out = imerode(im, strel('square', 3));
% Remove the border and find inverse
out = ~imclearborder(~out);
We get this image now:
There are some isolated black holes near the B that you may not want. You can do some additional post-processing by using bwareaopen to remove islands that are below a certain area. I chose this to be 50 pixels from experimentation. You'll have to do this on the inverse of the output from imclearborder:
% Read the image and threshold in case
im = imread('https://i.stack.imgur.com/Hl6Y9.jpg');
im = im > 200;
% Erode
out = imerode(im, strel('square', 3));
% Remove the border
bor = imclearborder(~out);
% Remove small areas and inverse
out = ~bwareaopen(bor, 50);
We now get this:
I know about this thread on converting black to white and white to black simultaneously.
I would like to convert only black to white.
I also found a thread about doing what I am asking, but I do not understand what goes wrong.
Picture
Code
rgbImage = imread('ecg.png');
grayImage = rgb2gray(rgbImage); % for non-indexed images
level = graythresh(grayImage); % threshold for converting image to binary,
binaryImage = im2bw(grayImage, level);
% Extract the individual red, green, and blue color channels.
redChannel = rgbImage(:, :, 1);
greenChannel = rgbImage(:, :, 2);
blueChannel = rgbImage(:, :, 3);
% Make the black parts pure red.
redChannel(~binaryImage) = 255;
greenChannel(~binaryImage) = 0;
blueChannel(~binaryImage) = 0;
% Now recombine to form the output image.
rgbImageOut = cat(3, redChannel, greenChannel, blueChannel);
imshow(rgbImageOut);
Which gives
There seems to be something wrong in the red color channel.
Black is just (0,0,0) in RGB, so removing it should mean turning every (0,0,0) pixel into white (255,255,255).
Trying this idea with
redChannel(~binaryImage) = 255;
greenChannel(~binaryImage) = 255;
blueChannel(~binaryImage) = 255;
Gives
So I must have misunderstood something in MATLAB. The blue colour should not contain any black, so this last image is strange.
How can you turn only black color to white?
I want to keep the blue color of the ECG.
If I understand you properly, you want to extract out the blue ECG plot while removing the text and axes. The best way to do that would be to examine the HSV colour space of the image. The HSV colour space is great for discerning colours just like the way humans do. We can clearly see that there are two distinct colours in the image.
We can convert the image to HSV using rgb2hsv and we can examine the components separately. The hue component represents the dominant colour of the pixel, the saturation denotes the purity or how much white light there is in the pixel and the value represents the intensity or strength of the pixel.
Try visualizing each channel doing:
im = imread('http://i.stack.imgur.com/cFOSp.png'); %// Read in your image
hsv = rgb2hsv(im);
figure;
subplot(1,3,1); imshow(hsv(:,:,1)); title('Hue');
subplot(1,3,2); imshow(hsv(:,:,2)); title('Saturation');
subplot(1,3,3); imshow(hsv(:,:,3)); title('Value');
Hmm... well the hue and saturation don't help us at all. It's telling us the dominant colour and saturation are the same... but what sets them apart is the value. If you take a look at the image on the right, we can tell them apart by the strength of the colour itself. So what it's telling us is that the "black" pixels are actually blue but with almost no strength associated to it.
We can actually use this to our advantage. Any pixels whose value component is above a certain threshold are the ones we want to keep.
Try setting a threshold... something like 0.75. MATLAB's dynamic range for the HSV values is [0,1], so:
mask = hsv(:,:,3) > 0.75;
When we threshold the value component, this is what we get:
There's obviously a bit of quantization noise... especially around the axes and font. What I'm going to do next is perform a morphological erosion so that I can eliminate the quantization noise that's around each of the numbers and the axes. I'm going to make the structuring element a bit large to ensure that I remove this noise. Using the Image Processing Toolbox:
se = strel('square', 5);
mask_erode = imerode(mask, se);
We get this:
Great, so what I'm going to do now is make a copy of your original image, then set any pixel that is black from the mask I derived (above) to white in the final image. All of the other pixels should remain intact. This way, we can remove any text and the axes seen in your image:
im_final = im;
mask_final = repmat(mask_erode, [1 1 3]);
im_final(~mask_final) = 255;
I need to replicate the mask in the third dimension because this is a colour image and I need to set each channel to 255 simultaneously in the same spatial locations.
When I do that, this is what I get:
Now you'll notice that there are gaps in the graph... which is to be expected due to quantization noise. We can go further by converting this image to grayscale and thresholding it, then joining the edges together with a morphological dilation. This is safe because we have already eliminated the axes and text. We can then use this as a mask to index into the original image to obtain our final graph.
Something like this:
im2 = rgb2gray(im_final);
thresh = im2 < 200;
se = strel('line', 10, 90);
im_dilate = imdilate(thresh, se);
mask2 = repmat(im_dilate, [1 1 3]);
im_final_final = 255*ones(size(im), class(im));
im_final_final(mask2) = im(mask2);
I threshold the previous image that we got without the text and axes after I convert it to grayscale, and then I perform dilation with a line structuring element that is 90 degrees in order to connect those lines that were originally disconnected. This thresholded image will contain the pixels that we ultimately need to sample from the original image so that we can get the graph data we need.
I then take this mask, replicate it, make a completely white image and then sample from the original image and place the locations we want from the original image in the white image.
This is our final image:
Very nice! I had to do all of that image processing because your image basically has quantization noise to begin with, so it's going to be a bit harder to get the graph entirely. Ander Biguri in his answer explained in more detail about colour quantization noise so certainly check out his post for more details.
However, as a qualitative measure, we can subtract this image from the original image and see what is remaining:
imshow(rgb2gray(abs(double(im) - double(im_final_final))));
We get:
So it looks like the axes and text are removed fine, but there are some traces in the graph that we didn't capture from the original image and that makes sense. It all has to do with the proper thresholds you want to select in order to get the graph data. There are some trouble spots near the beginning of the graph, and that's probably due to the morphological processing that I did. This image you provided is quite tricky with the quantization noise, so it's going to be very difficult to get a perfect result. Also, these thresholds unfortunately are all heuristic, so play around with the thresholds until you get something that agrees with you.
Good luck!
What's the problem?
You want to detect all black parts of the image, but they are not really black
Example:
Your idea (or your code):
You first binarize the image, selecting the pixels that ARE something against the pixels that are not. In short, you do: if pixel>level; pixel is something
Therefore there is a small misconception here! When you write
% Make the black parts pure red.
it should read
% Make every pixel that is something (not background) pure red.
Therefore, when you do
redChannel(~binaryImage) = 255;
greenChannel(~binaryImage) = 255;
blueChannel(~binaryImage) = 255;
You are doing
% Make every pixel that is something (not background) white
% (or what it is the same in this case, delete them).
Therefore what you should get is a completely white image. The image is not completely white because some pixels were labelled as "not something, part of the background" by the value of level, which in the case of your image is around 0.6.
A solution one could think of is manually setting the level to 0.05 or similar, so only black pixels will be selected in the gray-to-binary thresholding. But this will not work 100%: as you can see, the numbers have some very "not black" values.
How would I try to solve the problem:
I would try to find the colour you want, extract just that colour from the image, and then delete outliers.
Extract blue using HSV (I believe I answered you somewhere else how to use HSV).
rgbImage = imread('ecg.png');
hsvImage=rgb2hsv(rgbImage);
I=rgbImage;
R=I(:,:,1);
G=I(:,:,2);
B=I(:,:,3);
th=0.1;
R((hsvImage(:,:,1)>(280/360))|(hsvImage(:,:,1)<(200/360)))=255;
G((hsvImage(:,:,1)>(280/360))|(hsvImage(:,:,1)<(200/360)))=255;
B((hsvImage(:,:,1)>(280/360))|(hsvImage(:,:,1)<(200/360)))=255;
I2= cat(3, R, G, B);
imshow(I2)
At this point we would like to get the biggest blue part, which is our signal. Therefore the best approach seems to be to first binarize the image, taking all blue pixels:
% Binarize image, getting all the pixels that are "blue"
bw=im2bw(rgb2gray(I2),0.9999);
And then using bwlabel, label all the independent pixel "islands".
% Label each "blob"
lbl=bwlabel(~bw);
The label most repeated will be the signal. So we find it and separate the background from the signal using that label.
% Find the blob with the highest amount of data. That will be your signal.
r=histc(lbl(:),1:max(lbl(:)));
[~,idxmax]=max(r);
% Profit!
signal=rgbImage;
signal(repmat((lbl~=idxmax),[1 1 3]))=255;
background=rgbImage;
background(repmat((lbl==idxmax),[1 1 3]))=255;
Here is a plot with the signal, background and difference (using the same equation as @rayryeng used).
Here is a variation on @rayryeng's solution to extract the blue signal:
%// retrieve picture
imgRGB = imread('http://i.stack.imgur.com/cFOSp.png');
%// detect axis lines and labels
imgHSV = rgb2hsv(imgRGB);
BW = (imgHSV(:,:,3) < 1);
BW = imclose(imclose(BW, strel('line',40,0)), strel('line',10,90));
%// clear those masked pixels by setting them to background white color
imgRGB2 = imgRGB;
imgRGB2(repmat(BW,[1 1 3])) = 255;
%// show extracted signal
imshow(imgRGB2)
To get a better view, here is the detected mask overlaid on top of the original image (I'm using the imoverlay function from the File Exchange):
figure
imshow(imoverlay(imgRGB, BW, uint8([255,0,0])))
Here is code for this:
rgbImage = imread('ecg.png');
redChannel = rgbImage(:, :, 1);
greenChannel = rgbImage(:, :, 2);
blueChannel = rgbImage(:, :, 3);
black = ~redChannel&~greenChannel&~blueChannel;
redChannel(black) = 255;
greenChannel(black) = 255;
blueChannel(black) = 255;
rgbImageOut = cat(3, redChannel, greenChannel, blueChannel);
imshow(rgbImageOut);
black is the area containing the black pixels. These pixels are set to white in each color channel.
In your code you use a threshold on a grayscale image, so of course a much bigger area of pixels is set to white (or red, respectively). In this code, only pixels that contain absolutely no red, green or blue are set to white.
The following code does the same with a threshold for each color channel:
rgbImage = imread('ecg.png');
redChannel = rgbImage(:, :, 1);
greenChannel = rgbImage(:, :, 2);
blueChannel = rgbImage(:, :, 3);
black = (redChannel<150)&(greenChannel<150)&(blueChannel<150);
redChannel(black) = 255;
greenChannel(black) = 255;
blueChannel(black) = 255;
rgbImageOut = cat(3, redChannel, greenChannel, blueChannel);
imshow(rgbImageOut);
I need to do computer vision tasks in order to detect water bottles or soda cans. I will obtain 'frontal' images of bottles, soda cans or any other random objects (one by one) and my algorithm should determine whether it's a bottle, a can or neither of them.
Some details about object detecting scenario:
As mentioned, I will test one single object per image/video frame.
Not all water bottles are the same. There could be variation in plastic color, lid or label. Some might have no label or lid.
The same goes for soda can variation. No wrinkled soda cans are going to be tested, though.
There could be small size variation between objects.
I could have a green (or any custom color) background.
I will do any needed filters on image.
This will be run on a Raspberry Pi.
Just in case, an example of each:
I've tested OpenCV's face detection algorithms a couple of times and I know they work pretty well, but with this approach I'd need to obtain a special Haar cascade features XML file for each custom object I want to detect.
So, the distinct alternatives I have in mind are:
Creating a custom Haar Classifier.
Considering shapes.
Considering outlines.
I'd like to get a simple algorithm and I think creating a custom Haar classifier could be even not needed. What would you suggest?
Update
I strongly considered the shape/aspect ratio approach.
However, I'm facing some issues, as bottles come in different sizes and even shapes. But this made me set the following considerations:
I'm applying a threshold with THRESH_BINARY method. (Thanks to the answers).
I will use a white background on detection.
Soda cans are all same size.
So, a bounding box for soda cans with high accuracy might distinguish a can.
What I've achieved:
Threshold really helped me, I could notice that on white background tests I would obtain for cans:
And this is what it's obtained for bottles:
So, the dominance of darker areas on the left is noticeable. There are some cases with cans where this might turn into false negatives. And for bottles, lighting and angle may lead to inconsistent results, but I really think this could be a shorter approach.
So, I'm quite confused now about how I should evaluate that darkness dominance. I've read that findContours leads to it, but I'm quite lost on how to use that function. For example, in the case of soda cans, it may find several contours, so I get lost on what to evaluate.
Note: I'm open to testing other algorithms or libraries besides OpenCV.
I see a few basic ideas here:
Check the object's (to be precise, the object's bounding rect) width/height ratio. For a can it's approximately 2-2.5; for a bottle I think it will be >3. It's a very simple idea, so it should be easy to test quickly, and I think it should have quite good accuracy. For in-between values, like 2.75 (assuming the values I gave are correct, which most likely isn't true), you can use some different algorithm (see the sketch after this list).
Check whether your object contains glass/transparent regions - if yes, then it's definitely a bottle. Here you can read more about it.
Use the GrabCut algorithm to get the object mask / a more precise shape and check whether the shape's width at the top is similar to the width at the bottom - if yes, it's a can; if not, a bottle (bottles have a screw cap at the top).
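To make idea 1 concrete, here is a rough Python/OpenCV sketch (my own, not part of this answer): mask is assumed to be a binary image of the segmented object, the ratio of the longer to the shorter bounding-box side is used, and 2.75 is just the borderline value mentioned above.
import cv2

def classify_by_elongation(mask, cutoff=2.75):
    # Find the outline of the object in the binary mask (OpenCV 4 signature).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return "unknown"
    largest = max(contours, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(largest)
    # Elongation of the bounding rect: roughly 2-2.5 for a can, >3 for a bottle
    # according to the estimates above.
    elongation = max(w, h) / float(min(w, h))
    return "can" if elongation < cutoff else "bottle"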
Since you want to recognize can vs bottle rather than Pepsi vs Coke, shape matching is probably the way to go compared to Haar and the features2d matchers like SIFT/SURF/ORB.
A unique background color will make things easier.
First create a histogram from an image of just the background
int channels[] = {0,1,2}; // use all the channels
int rgb_bins = 32; // quantize to 32 colors per channel
int histSize[] = {rgb_bins, rgb_bins, rgb_bins};
float _range[] = {0,256}; // upper bound is exclusive for calcHist
const float* ranges[] = {_range, _range, _range};
cv::SparseMat bghist;
cv::calcHist(&bg_image, 1, channels, cv::noArray(), bghist, 3, histSize, ranges );
Then use calcBackProject to create a mask of bg and not bg
cv::MatND temp_ND;
cv::calcBackProject( &bottle_image, 1, channels, bghist, temp_ND, ranges );
cv::Mat bottle_mask, bottle_backproj;
if( feeling_lazy ){
    cv::normalize(temp_ND, bottle_backproj, 0, 255, cv::NORM_MINMAX, CV_8U);
    //a small blur here could work nicely
    cv::threshold( bottle_backproj, bottle_mask, 0, 255, cv::THRESH_OTSU );
    bottle_mask = cv::Scalar(255) - bottle_mask; //invert the mask
} else {
    //finding just the right value here might be better than the above method
    int magic_threshold = 64;
    temp_ND.convertTo( bottle_backproj, CV_8U, 255.);
    //I expect temp_ND to be CV_32F ranging from 0-1, but I might be wrong.
    cv::threshold( bottle_backproj, bottle_mask, magic_threshold, 255, cv::THRESH_BINARY_INV );
}
Then either:
Compare bottle_mask or bottle_backproj to a few sample bottle masks/backprojections using matchTemplate with a threshold on confidence to decide if it's a match.
matchTemplate(bottle_mask, bottle_template, result, CV_TM_CCORR_NORMED);
double confidence; minMaxLoc( result, NULL, &confidence);
Or use matchShapes, though I've never gotten this to work properly.
double confidence = matchShapes(bottle_mask, bottle_template, CV_CONTOURS_MATCH_I3);
Or use linemod which is difficult to set up but works great for images like this where the shape isn't very complex. Aside from the linked file, I haven't found any working samples of this method so here's what I did.
First create/train the detector with some sample images
//some magic numbers
std::vector<int> T_at_level;
T_at_level.push_back(4);
T_at_level.push_back(8);
//add some padding so linemod doesn't scream at you
const int T = 32;
int width = bottle_mask.cols;
if( width % T != 0)
width += T - width % T;
int height = bottle_mask.rows;
if( height % T != 0)
height += T - height % T;
//in this case template_backproj is created specifically from a sample bottle_backproj
cv::Rect padded_roi( (width - template_backproj.cols)/2, (height - template_backproj.rows)/2, template_backproj.cols, template_backproj.rows);
cv::Mat padded_backproj = cv::Mat::zeros( height, width, template_backproj.type() );
template_backproj.copyTo( padded_backproj( padded_roi ) ); // copy into the padded canvas (plain assignment would only rebind the ROI header)
cv::Mat padded_mask = cv::Mat::zeros( height, width, template_mask.type() );
template_mask.copyTo( padded_mask( padded_roi ) );
//you might need to erode padded_mask by a few pixels.
//initialize detector
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back( cv::makePtr<cv::linemod::ColorGradient>() ); //for those that don't have a kinect
cv::Ptr<cv::linemod::Detector> new_detector = cv::makePtr<cv::linemod::Detector>(modalities, T_at_level);
//add sample images to the detector
std::vector<cv::Mat> template_images;
template_images.push_back( padded_backproj );
cv::Rect ignore_me;
const std::string class_id = "bottle";
int template_id = new_detector->addTemplate( template_images, class_id, padded_mask, &ignore_me );
Then do some matching
std::vector<cv::Mat> sources_vec;
sources_vec.push_back( padded_backproj );
//padded_backproj doesn't need to be the same size as the trained template images, but it does need to be padded the same way.
float matching_threshold = 0.8; //a higher number makes the algorithm faster
std::vector<cv::linemod::Match> matches;
std::vector<cv::String> class_ids;
new_detector->match(sources_vec, matching_threshold, matches,class_ids);
float confidence = matches.size() > 0? matches[0].similarity : 0;
As cyriel suggests, the aspect ratio (width/height) might be one useful measure. Here is some OpenCV Python code that finds contours (hopefully including the outline of the bottle or can) and gives you aspect ratio and some other measurements:
# src image should have already had some contrast enhancement (such as
# cv2.threshold) and edge finding (such as cv2.Canny)
contours, hierarchy = cv2.findContours(src, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    num_points = len(contour)
    if num_points < 5:
        # The contour has too few points to fit an ellipse. Skip it.
        continue
    # We could use area to help determine the type of object.
    # Small contours are probably false detections (not really a whole object).
    area = cv2.contourArea(contour)
    bounding_ellipse = cv2.fitEllipse(contour)
    center, radii, angle_degrees = bounding_ellipse
    # Let's define an ellipse's normal orientation to be landscape (width > height).
    # We must ensure that the ellipse's measurements match this orientation.
    if radii[0] < radii[1]:
        radii = (radii[1], radii[0])
        angle_degrees -= 90.0
    # We could use the angle to help determine the type of object.
    # A bottle or can's angle is probably approximately a multiple of 90 degrees,
    # assuming that it is at rest and not falling.
    # Calculate the aspect ratio (width / height).
    # For example, 0.5 means the object's height is 2 times its width.
    # A bottle is probably taller than a can.
    aspect_ratio = radii[0] / radii[1]
For checking transparency, you can compare the picture to a known background using histogram analysis or background subtraction.
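As an illustration of that idea, here is a rough sketch of both background subtraction and a histogram comparison inside the object region; frame, background and the thresholds are assumed inputs, not values from the question:
def foreground_mask(frame, background, diff_thresh=30):
    # Simple background subtraction: pixels that differ enough from the known
    # background are considered part of the object.
    diff = cv2.absdiff(frame, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, diff_thresh, 255, cv2.THRESH_BINARY)
    return mask

def transparency_score(frame, background, object_mask):
    # Inside the object region, measure how similar the pixels still are to the
    # known background; transparent regions let the background show through.
    hist_bg = cv2.calcHist([background], [0, 1, 2], object_mask, [8, 8, 8], [0, 256] * 3)
    hist_obj = cv2.calcHist([frame], [0, 1, 2], object_mask, [8, 8, 8], [0, 256] * 3)
    cv2.normalize(hist_bg, hist_bg)
    cv2.normalize(hist_obj, hist_obj)
    return cv2.compareHist(hist_bg, hist_obj, cv2.HISTCMP_CORREL)
A score close to 1 would suggest the background is still visible through the object (a transparent bottle), while a low score would suggest an opaque can.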
The contour's moments can be used to determine its centroid (center of gravity):
moments = cv2.moments(contour)
m00 = moments['m00']
m01 = moments['m01']
m10 = moments['m10']
centroid = (m10 / m00, m01 / m00)
You could compare this to the center. If the object is bigger ("heavier") on one end, the centroid will be closer to that end than the center is.
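Continuing the snippet above, a small sketch of that comparison; using the bounding rectangle's centre as the reference point is my own choice:
x, y, w, h = cv2.boundingRect(contour)
box_center = (x + w / 2.0, y + h / 2.0)
# If the centroid sits noticeably above or below the geometric centre, the
# object is "heavier" toward that end.
offset_y = centroid[1] - box_center[1]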
So, my main approach for detection was:
Bottles are transparent and cans are opaque
Generally algorithm consisted in:
Take a grayscale picture.
Apply a binary threshold.
Select a convenient ROI from it.
Obtain its color mean and possibly the standard deviation.
Distinguish.
Implementation was basically reduced to this function (where CAN and BOTTLE were previously defined):
int detector(int x, int y, int width, int height, int thresholdValue, CvCapture* capture) {
    Mat img;
    Rect r;
    vector<Mat> channels;
    r = Rect(x, y, width, height);

    if ( !capture ) {
        fprintf( stderr, "ERROR: capture is NULL \n" );
        getchar();
        return -1;
    }

    // Grab a frame, convert it to grayscale and binarize it.
    img = Mat(cvQueryFrame( capture ));
    cvtColor(img, img, CV_RGB2GRAY);
    threshold(img, img, 127, 255, THRESH_BINARY);

    // Take the ROI and compute its mean intensity.
    Mat roiImage = img(r);
    split(roiImage, channels);
    Scalar m = mean(channels[0]);
    float media = m[0];
    printf("Media: %f\n", media);

    // A dark ROI (mostly black pixels) is classified as a can, a light one as a bottle.
    if (media < thresholdValue) {
        return CAN;
    }
    else {
        return BOTTLE;
    }
}
As can be seen, a THRESH_BINARY threshold was applied, and a plain white background was used. However, the main and critical issue I faced with this whole approach and algorithm was luminosity changes in the environment, even minor ones.
Sometimes I noticed that THRESH_BINARY_INV might help more, but I wonder whether I could use certain threshold parameters, or whether applying other filters could remove environment lighting as an issue.
I really appreciate the aspect-ratio calculation approach from a bounding box or from finding contours, but I found this straightforward and simple when conditions were adjusted.
I'd use deep learning, based on Transfer learning.
The idea is this: given a highly complex, well-trained neural network that was trained on a similar classification task (typically over a large public dataset, like ImageNet), you can freeze the majority of its weights and only train the last layers. There are lots of tutorials out there. You don't need to have a background in deep learning.
There is a tutorial that works almost out of the box with TensorFlow here, and here there is another based on Keras.
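For a flavour of what this looks like, here is a minimal tf.keras sketch (my own, not taken from the linked tutorials; the MobileNetV2 base and the two-class head are assumptions):
import tensorflow as tf

# Pretrained ImageNet backbone with its classification head removed.
base = tf.keras.applications.MobileNetV2(weights="imagenet", include_top=False,
                                         input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze the pretrained weights, train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation="softmax"),  # bottle vs can
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=5)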
Is there any way of removing a white background and turning it into black in MATLAB?
Say i have this image:
I get the following output when I apply the code suggested in the answer, which isn't perfect:
The problem, as Andrey noticed, is that not all background pixels are "255 white". This probably happens due to the JPEG compression algorithm and also because there's a shadow of the fruit in the image.
To solve this problem, first get a binary mask of the (near-white) background region by blurring the image (this is necessary to overcome the JPEG artifacts) and then thresholding it with a very high value, a little lower than 255. Here's the solution in MATLAB:
I = imread('http://i.stack.imgur.com/5p4jV.jpg'); % Load your image.
H = fspecial('gaussian'); % Create the filter kernel.
I = imfilter(I,H); % Blur the image.
Mask = im2bw(I, 0.9); % Generate the binary mask of the near-white background.
I(repmat(Mask,[1 1 3])) = 0; % Set the background pixels to black in all channels.
Here's the output (you can also try different threshold values in im2bw):
You fail due to the anti-aliasing effect that blurs the edges of your image. The pixels that were not removed are not 255! They are a bit lower. Basically you have 2 options:
(I wrote them from the perspective of using Matlab).
Select the relevant part by using imfreehand and then create a mask by calling createMask from the API.
Finding the correct threshold level, which isn't 255. (Much harder - if possible)
Here is a Matlab code for the first:
function SO1
im = imread('c:\x.jpg');
figure();
imshow(im);
f = imfreehand();
mask = f.createMask();
mask = repmat(mask,[1 1 3]);
im(~mask) = 0;
figure;imshow(im);
end
You should draw the image to a black background.
//Your bitmap
Bitmap originalImage = new Bitmap(100, 100);
//Black background
Bitmap bitmap = new Bitmap(100, 100);
Graphics g = Graphics.FromImage(bitmap);
//Draw the background
g.FillRectangle(Brushes.Black, 0, 0, 100, 100);
//Draw the original bitmap over the black one
g.DrawImage(originalImage, 0, 0);
Yes.
If your image is saved in a variable called img:
thr = 255;
mask = sum(img,3)==thr*3;
for i=1:3
c = img(:,:,i);
c(mask)=0;
img(:,:,i)=c;
end
|-)