I trained a Haar cascade to build a poop detector, and now I'm working on the confusion matrix with my own data set. My real problem is that I don't know how to actually build the matrix, because I have 110 test images that look like this (example image).
All the images have one or two poops in the middle, and the background is cement. So when I tested my images I got good precision, but I only have TP, FP, and FN values; I never get a TN, because all my pictures look like the one in the description and I only work with one object class (poop). Can I make a confusion matrix without any TN values? Can someone explain the real process for building the confusion matrix, please?
You actually have two classes, poop and non-poop.
To get the confusion matrix you also have to test the model on images without poop; then you can get your TN count, which is where the model correctly predicts the non-poop class.
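A minimal sketch of how the counts turn into the matrix, assuming you score each test image as a whole (1 = contains poop, 0 = cement only) and record whether the cascade fired on it; the extra negative images mentioned above supply the TN cell:

def confusion_matrix(ground_truth, predictions):
    """ground_truth, predictions: lists of 0/1, one entry per test image."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(ground_truth, predictions):
        if truth == 1 and pred == 1:
            tp += 1          # poop present, detector fired
        elif truth == 0 and pred == 1:
            fp += 1          # no poop, detector fired anyway
        elif truth == 1 and pred == 0:
            fn += 1          # poop present, detector missed it
        else:
            tn += 1          # no poop, detector stayed quiet
    # rows = actual class (poop, non-poop), columns = predicted (poop, non-poop)
    return [[tp, fn],
            [fp, tn]]

# Example (hypothetical numbers): 110 positive images plus 30 cement-only images
# labels = [1] * 110 + [0] * 30
# detections = [...]   # 0/1 output of the cascade, one per image
# print(confusion_matrix(labels, detections))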
I'm not really new to MATLAB, just new to this whole Machine Learning thing.
I have to do a simple binary image classification. I don't care if it's a toolbox or just code; I just need to do it. I tried a couple of classification codes I found online on GitHub or on other sites, but most of them behaved randomly and some of them only worked on pre-defined images.
Those that worked on pre-defined images were neat (e.g. http://www.di.ens.fr/willow/events/cvml2011/materials/practical-classification/), but I had issues applying them to a new set of images, just because there were some .txt files (vectors of the names of the images, which were easy to replicate) and some .mat files (with both name and histogram).
I had issues creating the name and histogram in the same order; the piece of code that I use is:
for K = 1 : 4
    filename = sprintf('image_%04d.jpg', K);
    I = imread(filename);
    IGray = rgb2gray(I);
    H = hist(IGray(:), 32);
end
save('ImageDatabase.mat', 'I', 'H');
But for one reason or another, only the name and path of the last image remains stored (e.g. in this case, only image_0004 is stored in the name slot).
Another code that I found and that seemed easy was https://github.com/rich-hart/SVM-Classifier, but the output is really random (for me), so if someone could explain to me what is happening I'd be grateful. There are 19 training images and 20 for testing. Yet, if I remove one of the test images, 2 entries disappear from the Support Vector Structure?
Anyway, if you have a toolbox, some easier-to-adapt code, or some explanations of the above codes, I'd be grateful.
Cheers!
EDIT:
I tried following the example of this code: http://dipwm.blogspot.ro/2013/01/svm-support-vector-machine-with-matlab.html
And even though I got 30 images of 100x100 I keep getting this error:
Error using svmtrain (line 253)
Y and TRAINING must have the same number of rows.
Error in Untitled (line 74)
SVMStruct = svmtrain(Training_Set , train_label, 'kernel_function', 'linear');
There is no way to train any classifier on raw 100x100 images when you only have ~40 data points for training, testing and validation, so recommending a MATLAB toolbox wouldn't really help your problem.
The answer is: Get more data
For completeness here are two approaches you could try:
Feature extraction
Maybe there are some very obvious features in your pictures (some pictures are darker, have a white corner, etc.) that you can extract before the training. With 3-4 features you could try training a classifier on your data set. In this case I would try fitcensemble, as it is very easy to use without knowing the inner workings of the algorithm.
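For illustration only, here is a rough sketch of the same idea in Python with scikit-learn (the paragraph above suggests MATLAB's fitcensemble; RandomForestClassifier plays a comparable role here). The two features, overall brightness and the brightness of a corner patch, are placeholders for whatever is actually obvious in your pictures, and the file lists are assumed to come from your own data:

import numpy as np
from PIL import Image
from sklearn.ensemble import RandomForestClassifier

def extract_features(path):
    img = np.asarray(Image.open(path).convert("L"), dtype=float)
    return [img.mean(),              # overall darkness/brightness
            img[:20, :20].mean()]    # brightness of the top-left corner

# train_paths / train_labels / test_paths are assumed to come from your own file list
# X = np.array([extract_features(p) for p in train_paths])
# y = np.array(train_labels)         # 0/1 class per image
# clf = RandomForestClassifier(n_estimators=100).fit(X, y)
# pred = clf.predict([extract_features(p) for p in test_paths])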
Using a pre-trained classifier
You can use GoogLeNet; maybe your pictures fit one of the ImageNet categories. If your images do not match any category, try transfer learning.
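As an illustration of the transfer-learning route (the answer refers to GoogLeNet; the sketch below uses torchvision's pretrained googlenet as an analogous model and only retrains the final layer; the hyperparameters are placeholders):

import torch
import torch.nn as nn
from torchvision import models

model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                        # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)      # new 2-class head for your data

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# then loop over your (images, labels) batches:
# loss = criterion(model(images), labels); loss.backward(); optimizer.step()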
I have a small render engine written for fun. I would like to have some unit testing that would automatically render an image and then compare it to a stored image to check for differences. This should give some sort of metric so I can gauge whether the image is too far off or whether the difference can be attributed to just different timings in animations. If it can also produce the locations of the differences in the image, that would be great, but it is not necessary. We can also assume that the 2 images are exactly the same size.
What are the classic papers/techniques for that sort of thing?
(the language is Go, probably nothing exists for it yet and I'd like to implement it myself to understand what's going on. The renderer is github.com/luxengine)
Thank you
One idea could be to see your problem as a case of Image Registration.
The following figure (taken from http://it.mathworks.com/help/images/point-mapping.html) gives a flow-chart for a method to solve the image registration problem.
Using the above figure terms, the basic idea is:
find some interest points in the Fixed image;
find in the Moving image the same corresponding points;
estimate the transformation between the two images using the point correspondences. One of the simplest transformations is a translation represented by a 2D vector; the magnitude of this vector is a measure of the difference between the two images, and in your case it can be related to the shift you wrote about in your comment. A richer transformation is a homography described by a 3x3 matrix; its distance from the identity matrix is again a measure of the difference between the two images.
you can apply the transformation back: for example, in the case of the translation you apply it to the Moving image, and then the warped image can be compared (here I am simplifying a little) pixel by pixel to the Reference image. A short OpenCV sketch of these steps follows the list.
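To make the flow concrete, here is a rough sketch with OpenCV's Python bindings (your renderer is in Go, so treat this purely as an illustration of the steps; ORB is used here simply because it ships with stock OpenCV, and the file names are placeholders):

import cv2
import numpy as np

fixed = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
moving = cv2.imread("rendered.png", cv2.IMREAD_GRAYSCALE)

# 1-2: interest points and correspondences
orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(fixed, None)
kp2, des2 = orb.detectAndCompute(moving, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 3a: simplest transform, the mean translation between matched points
shift = (pts2 - pts1).mean(axis=0)
print("estimated shift:", shift, "magnitude:", np.linalg.norm(shift))

# 3b: richer transform, a homography; its distance from the identity
# is another measure of how far apart the two images are
H, _ = cv2.findHomography(pts2, pts1, cv2.RANSAC)
print("homography distance from identity:", np.linalg.norm(H - np.eye(3)))

# 4: warp the moving image back and compare pixel by pixel
warped = cv2.warpPerspective(moving, H, fixed.shape[::-1])
diff = cv2.absdiff(fixed, warped)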
Some more ideas are here: Image comparison - fast algorithm
I have read all that stuff about image similarity indexes on this forum, but I think my case is a bit different, because the images I want to compare come from an L-system generator and, as you can see below, it's hard to find obvious differences. So I couldn't decide which method and software to choose for my problem.
But let's take the story from the beginning. I have a collection of data, measured angles and lengths of branches of some plants (15 in total), and I represented them with the L-system fractal method, as already mentioned.
These images look like the following:
Plant A
Plant B
Plant C
So far I have tried to find differences using two methods:
1) By calculating the fractal dimension of those images, but as expected it was 2 for all of them.
2) By calculating the % of area coverage on the same canvas. The numbers in that case show some differences, but they are not statistically significant.
So the thought was to use another similarity index, but there are so many protocols and ideas out there that I couldn't find a starting point. I read about OpenCV, VisualCI etc., but because I've never used such methods before, I feel somewhat lost.
Any of your suggestions will be welcome.
Thank you.
First of all, I have to say I'm new to the field of computer vision and I'm currently facing a problem that I tried to solve with OpenCV (Java wrapper) without success.
Basically, I have a picture of a part of a model taken by a camera (different angles, resolutions, rotations...) and I need to find the position of that part in the model.
Example Picture:
Model Picture:
So one question is: Where should I start/which algorithm should I use?
My first try was to use KeyPoint Matching with SURF as Detector, Descriptor and BF as Matcher.
It worked for about 2 pictures out of 10. I used the default parameters and tried other detectors, without any improvement. (Maybe it's a question of the right parameters, but how do I find the right parameters combined with the right algorithm?)
Two examples:
My second try was to use color to differentiate certain elements in the model and to compare the structure with the model itself (in addition to the picture of the model, I also have an XML representation of the model).
Right now I extract the color red out of the image and adjust the H, S, V values manually, which gives the best detection for about 4 pictures but fails for the others.
Two examples:
I also tried to use edge detection (Canny, grayscale, with histogram equalization) to detect geometric structures. For some results I could imagine that it will work, but using the same Canny parameters for other pictures fails. Two examples:
As I said, I'm not familiar with computer vision and just tried out some algorithms. My problem is that I don't know which combination of algorithms and techniques is best, and on top of that which parameters I should use. Testing it all manually seems to be impossible.
Thanks in advance
gemorra
Your initial idea of using SURF features was actually very good; just try to understand how the parameters of this algorithm work and you should be able to register your images. A good starting point for your parameters would be to vary only the Hessian threshold, and to be fearless while doing so: your features are quite well defined, so try thresholds around 2000 and above (increasing in steps of 500-1000 until you get good results is totally OK).
Alternatively, you can try to detect your ellipses, calculate an affine warp that normalizes them, and run a cross-correlation to register them. This alternative implies much more work, but is quite fascinating. Some ideas on that normalization using the covariance matrix and its Cholesky decomposition are here.
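A minimal sketch of the first suggestion with OpenCV's Python bindings (the question uses the Java wrapper, where the same classes exist; SURF lives in the contrib/non-free module, so this assumes a build that includes it, and the file names are placeholders):

import cv2

img1 = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)    # picture of the part
img2 = cv2.imread("model.png", cv2.IMREAD_GRAYSCALE)   # picture of the model

# The Hessian threshold is the main knob: start around 2000 and raise it
# in steps of 500-1000 until only strong, well-defined features survive.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=2000)
kp1, des1 = surf.detectAndCompute(img1, None)
kp2, des2 = surf.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
result = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None)
cv2.imwrite("matches.png", result)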
I have created a simple program to generate random images, giving a random color to each pixel. I know that there is a very low chance of generating a recognizable image, but I would like to try.
I have observed that the longest part of the work is checking whether the images are really something. I also observed that most of the images produced are just fields of colorful individual pixels. That's why I would like to ask for an algorithm, in pseudocode, to detect similar-color regions in an image. I think that the easiest way to find meaningful images is to filter out all those random-pixel images. It's not perfect, but I think it will help.
If someone could propose another kind of filtering algorithm that would help with this task, I would also appreciate it.
To clarify this, in case my explanation was not clear enough, I will show you some images:
This is the kind of image I'm getting; basically I would describe it as "colorful noise". As you can see, all the pixels are spread individually, without grouping into similar-color regions that could form shapes of objects or anything recognizable as something.
Here you can see a conventional, "recognizable" picture. We can clearly see a dog lying on the grass with a tennis ball. If you observe this picture carefully, it can be clearly distinguished from the other one because it has groupings of similar colors that we can tell apart (the dog, a white region; the grass, a dark green region; and the tennis ball, a light green region).
What I want exactly is to discard the "pixelly" images before saving them to disk and only save the ones with color groupings. As I said before, this idea is the best I had for filtering these randomly generated images, but if someone proposes another, more efficient way, I would really appreciate it.
OK, I think this post is becoming too long... Well, if someone wants to have a look, here is the code of the program I wrote. It's really straightforward. I programmed it in Python using Pygame. I know this isn't nearly the most efficient way to do it; I'm aware of that. The thing is that I'm quite a noob in this field and I don't really know another way to do this in other languages or modules. Maybe you could also help me with this... I don't know, maybe translate the code to C++? I feel like I'm asking too many questions in the same post but, as I said tons of times, any help would be greatly appreciated.
import pygame, random

pygame.init()

# lots of printed text statements here
imageX = int(input("Enter the width of the image you want to produce: "))
imageY = int(input("Enter the height of the image you want to produce: "))
maxImages = int(input("Enter the maximum number of images you want to produce: "))
maxMem = int(input("Enter the maximum data you want to produce (MB, only works with 800x600 images): "))
maxPPS = int(input("Enter the maximum number of images you want to produce each second: "))
firstSeed = int(input("Enter the first seed you want to use: "))
print("\n\n\n\n")

seed = firstSeed
clock = pygame.time.Clock()
images = 0
keepGoing = True

while keepGoing:
    # seed
    random.seed(seed)
    # PPS (images per second)
    clock.tick(maxPPS)
    # surface
    image = pygame.Surface((imageX, imageY))
    # generation
    for x in range(imageX):
        for y in range(imageY):
            red = random.randint(0, 255)
            green = random.randint(0, 255)
            blue = random.randint(0, 255)
            image.set_at((x, y), (red, green, blue))
    # save
    pygame.image.save(image, str(seed) + ".png")
    # update parameters
    seed += 1
    images += 1
    # print seed
    print(seed - 1)
    # check end
    if images >= maxImages:
        keepGoing = False
    elif (images * 1.37) >= maxMem:   # ~1.37 MB per 800x600 image
        keepGoing = False
    pygame.event.pump()

print("\n\nThis is the last seed that was used: " + str(seed - 1))
input("\nPress Enter to exit")
Here is a butchered algorithm for you to try (try it in OpenCV):
Get image
Work with just one color dimension of the image i.e. Red or Green... or Gray ...or do the following for each separately
Sum up all the values in the image and save this as the "energy" value of the image
Use OpenCV's Smooth function to blur the image
The trick to blurring the image correctly is to choose the size of the kernel (aka filter) to be smaller than the width of important features and larger than the unimportant or noisy features. The size is controlled by defining param1 and param2.
See http://opencv.willowgarage.com/documentation/python/imgproc_image_filtering.html
Now for the output, sum up all the values to get the output "energy"
Keep image if the output has at least half of the energy of the input. Technically the trick in number 5 is the same as choosing 50 percent as the threshold to keep or discard images. So changing the threshold here is approximately the same as changing the filter size.
Optional. No need to think too much about it though, just get these energy values for some set of the images and choose the threshold by eye.
What is happening?
You're filtering out high frequencies and then seeing if there is still something left over. Most images have lots of energy at lower spatial frequencies; in fact, JPEG compression uses this fact to compress images. The filter must have an energy of one (i.e. be normalized) to work correctly, so I'm assuming that this is true.
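A hedged sketch of the recipe in Python/OpenCV. One interpretation note: a normalized blur preserves a plain pixel sum, so "energy" is taken here as the variance of the channel, which is what actually drops for noise after low-pass filtering; the kernel size and the 50 percent threshold are the knobs described above:

import cv2

def keep_image(path, kernel=(15, 15), threshold=0.5):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype("float32")
    energy_in = gray.var()                       # "energy" of the input
    blurred = cv2.blur(gray, kernel)             # low-pass (blur) filter
    energy_out = blurred.var()                   # "energy" left after blurring
    return energy_out >= threshold * energy_in   # keep if enough survives

# Random pixel noise loses almost all its variance after blurring,
# while a natural photo keeps most of it:
# print(keep_image("12345.png"))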
Hope this helps!
The simplest way of filtering out noise is to look for correlation. Nearby regions should be highly correlated in most of the image. There are so many ways to do it.
You should use a combination of the following, with some tweaking of the parameters, to get an acceptable hit/miss ratio:
Color correlation: you will find a huge amount of correlation in U/V between nearby regions in "proper" images.
Edge detection: natural images tend to have well-defined edges; the easiest way to tell noise apart from natural images is to check this.
Quite a bit more can be done: frequency analysis (noisy images contain all frequencies, while natural images usually have strong peaks), scale-space analysis, etc., depending on how complex you want to get and how much of a hit ratio you are willing to tolerate. In general, trying to get recognizable images is an open-ended topic, but you should be able to get a very high hit ratio if you specifically want to remove noise images like the one you gave in the example.
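As a rough illustration of the correlation idea (the 0.5 threshold is an assumption to tune; random RGB noise gives a neighbour correlation near zero, while natural photos are typically far higher):

import numpy as np
from PIL import Image

def neighbour_correlation(path):
    gray = np.asarray(Image.open(path).convert("L"), dtype=float)
    a = gray[:, :-1].ravel()           # each pixel...
    b = gray[:, 1:].ravel()            # ...and its right-hand neighbour
    return np.corrcoef(a, b)[0, 1]

def looks_like_noise(path, threshold=0.5):
    return neighbour_correlation(path) < threshold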
EDIT:
In general there are no exact algorithms for problems like this. You have to make assumptions about the properties of the underlying data, then use basic primitives (correlation, frequency-domain data, edges, etc.) and combine them into your algorithm for solving the problem. This is because the solution to problems like this is very data-specific, quite different from solving, say, classic computer-science algorithms. This is not to say that signal processing algorithms lack exactness; however, your current problem and many others deal with what are known as random variables and stochastic processes. You may have to search whether someone has tried to solve this problem in the literature or at some university; you can use that as your starting point and tweak the algorithm to suit you. However, you are not going to get a solution easily unless you take some time to understand some of the things I mentioned and are willing to do some experiments and empirical analysis.
Without knowing exactly what you're trying to achieve, it's difficult to offer specific help. But, reading your account did remind me of something I saw recently which, whilst quite different in implementation, has a similar end goal: generate a recognisable image from randomness.
Check out https://github.com/phl/pareidoloop by Philip McCarthy.
Philip's project starts with random polygons and the algorithm favours face like images. Two key points here: the polygons significantly reduce the amount of random noise right off the bat so the chances of generating something recognisable are significantly increased. Secondly, the algorithm favours a specific type of recognisable image: I suspect you're going to have to work towards a specific type of image so that you have some parameters with which to computationally estimate "recognisability".
hth!