I'm not really new to MATLAB, just new to this whole Machine Learning thing.
I have to do a simple binary image classification. I don't care if it's a toolbox or just code, I just need to do it. I tried a couple of classification codes I found online on GitHub or on other sites, but most of them gave seemingly random results and some of them only worked on their pre-defined images.
Those that worked on pre-defined images were neat (e.g.: http://www.di.ens.fr/willow/events/cvml2011/materials/practical-classification/), but I had issues applying them to a new set of images, because they rely on some .txt files (lists of the image names, which were easy to replicate) and some .mat files (with both names and histograms).
I had issues creating the names and histograms in the same order. The piece of code that I use is:
for K = 1 : 4
    filename = sprintf('image_%04d.jpg', K);
    I = imread(filename);
    IGray = rgb2gray(I);
    H = hist(IGray(:), 32);
end
save('ImageDatabase.mat', 'I', 'H');
But for one reason or another, only the name and path of the last image remain stored (e.g. in this case, only image_0004 is stored in the name slot).
Another piece of code that I found, which seemed easy, was: https://github.com/rich-hart/SVM-Classifier , but the output looks really random (to me), so if someone could explain what is happening I'd be grateful. There are 19 training images and 20 test images. Yet if I remove one of the test images, 2 entries disappear from the Support Vector Structure?
Anyway, if you have a toolbox, or code that is easier to adapt, or some explanation of the above codes, I'd be grateful.
Cheers!
EDIT:
I tried following the example of this code: http://dipwm.blogspot.ro/2013/01/svm-support-vector-machine-with-matlab.html
And even though I have 30 images of 100x100, I keep getting this error:
Error using svmtrain (line 253)
Y and TRAINING must have the same number of rows.
Error in Untitled (line 74)
SVMStruct = svmtrain(Training_Set , train_label, 'kernel_function', 'linear');
It is not realistic to train a classifier on raw 100x100 images (10,000 pixel values each) when you only have ~40 data points for training, testing and validation combined. So recommending a MATLAB toolbox wouldn't really solve your problem.
The answer is: get more data.
For completeness here are two approaches you could try:
Feature extraction
Maybe there are some very obvious features in your pictures (some pictures are darker, have a white corner, etc.) that you can extract before the training. With 3-4 such features you could try training a classifier on your data set. In this case I would try fitcensemble, as it is very easy to use without knowing the inner workings of the algorithm.
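To make the feature-extraction idea concrete, here is a minimal sketch in plain Python rather than MATLAB. It assumes each image is already a grayscale list of rows with values 0-255. The three features (overall brightness, contrast, brightness of the top-left corner) and the helper names are only illustrative, and the nearest-centroid rule is a simple stand-in for a real classifier such as fitcensemble, not a replacement for it.

```python
from statistics import mean, pstdev

def extract_features(img, corner=10):
    """Return [mean brightness, contrast, top-left corner brightness] for one image."""
    pixels = [p for row in img for p in row]
    corner_pixels = [p for row in img[:corner] for p in row[:corner]]
    return [mean(pixels), pstdev(pixels), mean(corner_pixels)]

def train_centroids(images, labels):
    """Average the feature vectors of each class (hypothetical stand-in classifier)."""
    by_class = {}
    for img, label in zip(images, labels):
        by_class.setdefault(label, []).append(extract_features(img))
    return {label: [mean(col) for col in zip(*feats)]
            for label, feats in by_class.items()}

def predict(centroids, img):
    """Assign the class whose centroid is closest in feature space."""
    f = extract_features(img)
    return min(centroids,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(f, centroids[label])))

# Toy data: uniformly dark and uniformly bright 100x100 images.
dark = [[20] * 100 for _ in range(100)]
bright = [[200] * 100 for _ in range(100)]
centroids = train_centroids([dark, bright], ["dark", "bright"])
print(predict(centroids, [[30] * 100 for _ in range(100)]))
```

With only three numbers per image instead of 10,000, even a few dozen examples can be enough to see whether the features separate the two classes at all.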
Using a pre-trained classifier
You can use GoogLeNet; your pictures may already fit one of the ImageNet categories. Try transfer learning if your images do not match any category.
I have a big database of pictures (say, 1 million 512x512px images) and I want to do the following query in a fast way:
Given a cropped image, find an image from the database that contains it.
(The closest question that I could find in StackOverflow is this one, which I address later in this post)
The following image illustrates what I'm trying to do.
I have the following restrictions:
(I) – The query must be fast. 10⁶ is a lot, so I don't think I can compare the query image against each image in the database individually.
(II) – I need to work with cropped images, so solutions like simple image hashing won't do it (of course, this does not apply to crop-resistant hashes)
(III) – I don't know the proportion between the area of the queried image and the image that contains it. In the example above, the refrigerant is just a small portion of the original image, but the cat takes a lot of space in the image where it is contained. Although I estimate that the proportion is always between 10%~100%, I don't know the exact amount beforehand (suppose the images in queries are always 512x512px, for example)
I've gathered some information in my research:
Simple image hash matching isn't possible because of (II) (I'm working with cropped parts)
Reddit's RepostSleuthBot (available on GitHub) is an excellent starting point for me: it can efficiently identify whether an image was already posted. Instead of simply matching hashes, it seems to use the ANNOY algorithm to find similar images (so it can match images with slight modifications in text or brightness, for example). The only problem with this approach is that it isn't well adapted to cropped images. So, this addresses (I) but not (II) and (III).
In my StackOverflow searches, the closest thing I found to help in this problem is that if I knew the proportion between the cropped image and the original, I could match it using phase correlation, like this answer says.
This addresses (II), which is awesome, but then I'll have problems with (I) because I'd have to try to match against each image of the database, and it's also infeasible because of (III).
A promising approach would be cropping-resistant image hashing. The paper Efficient Cropping-Resistant Robust Image Hashing, 10.1109/ares.2014.85, describes one, but it doesn't seem to be that performant, especially taking into consideration that I'm aiming at small crops (10%~100% of the original image) and a huge number of images.
I got stuck after this point. Is there any other algorithm or method I should be aware of? Any pointers would be much appreciated.
I'm currently working on my thesis on neural networks. I'm using CIFAR-10 as a reference dataset. Now I would like to show some example results in my paper. The problem is that the images in the dataset are 32x32 pixels, so it's really hard to recognize anything in them when printed on paper.
Is there any way to get hold of the original images with higher resolution?
UPDATE: I'm not asking for an image processing algorithm, but for the original images used in CIFAR-10. I need some higher-resolution samples to put in my paper.
I now have the same problem and I just found your question.
It seems that CIFAR was built by labeling the tinyimages dataset, and the authors are kind enough to share the index mapping from CIFAR to tinyimages. tinyimages in turn contains a metadata file with the URLs of the original images, and a toolbox for fetching any image you wish (e.g. those included in the CIFAR index).
So one may write a mat file which does this and share the results...
They're just small:
The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset.
You could use Google reverse image search if you're curious.
I'm looking for an algorithm or library that can spot the differences between two images (like in a "find the errors" game) and output the coordinates of the bounding box containing those changes.
I'm open to the algorithm being in Python, C, or almost any other language.
If you just want to show the differences, you can use the code below.
FastBitmap original = new FastBitmap(bitmap);
FastBitmap overlay = new FastBitmap(processedBitmap);
//Subtract the original with overlay and just see the differences.
Subtract sub = new Subtract(overlay);
sub.applyInPlace(original);
// Show the results
JOptionPane.showMessageDialog(null, original.toIcon());
To compare two images, you can use the ObjectiveFidelity class in the Catalano Framework.
The Catalano Framework is written in Java, so you can port this class to another LGPL project.
FastBitmap original = new FastBitmap(bitmap);
FastBitmap reconstructed = new FastBitmap(processedBitmap);
ObjectiveFidelity of = new ObjectiveFidelity(original, reconstructed);
int error = of.getTotalError();
double errorRMS = of.getErrorRMS();
double snr = of.getSignalToNoiseRatioRMS();
//Show the results
Disclaimer: I am the author of this framework, but I thought this would help.
There are many, suited for different purposes. You could get a start by looking at OpenCV, the free computer vision library with an API in C, C++, and also bindings to Python and many other languages. It can do subtraction easily and also has functions for bounding or grouping sets of points.
Aside from simple image subtraction, one of the specific uses addressed by OpenCV is motion detection or object tracking.
You can ask more specific image-related algorithmic questions on the Signal Processing Stack Exchange site.
"Parse" the two images into multiple smaller images by cropping the original image.
The size of each "sub-image" would be the "resolution" of your scanning operation. For example, if the original images are 100 pixels x 100 pixels, you could set the resolution to 10 x 10 and you'd have one hundred 10 x 10 sub-images for each original image. Save the sub-images to disk.
Next, compare each pair of sub-image files, one from each original image. If there is a file size or data difference, then you can mark that "coordinate" as having a difference on the original images.
This algorithm assumes you're not looking for the coordinates of the individual pixel differences.
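The tiling idea above can be sketched as follows. This is only an illustration under a few assumptions: both images are already loaded as equal-sized lists of pixel rows, the tiles are compared in memory instead of being saved to disk and compared as files, and the function names are made up for this example.

```python
def diff_tiles(img_a, img_b, tile=10):
    """Return (tile_row, tile_col) for every tile whose pixels differ."""
    height, width = len(img_a), len(img_a[0])
    changed = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            block_a = [row[left:left + tile] for row in img_a[top:top + tile]]
            block_b = [row[left:left + tile] for row in img_b[top:top + tile]]
            if block_a != block_b:
                changed.append((top // tile, left // tile))
    return changed

def bounding_box(changed, tile=10):
    """Pixel box (left, top, right, bottom) around all changed tiles, or None.

    right and bottom are exclusive, i.e. one past the last covered pixel.
    """
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return (min(cols) * tile, min(rows) * tile,
            (max(cols) + 1) * tile, (max(rows) + 1) * tile)

# Two 100x100 images that differ in two pixels.
original = [[0] * 100 for _ in range(100)]
modified = [row[:] for row in original]
modified[25][37] = 255
modified[58][12] = 255
tiles = diff_tiles(original, modified)
print(tiles, bounding_box(tiles))
```

The box is only as precise as the tile size, which matches the caveat above about not locating individual pixel differences.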
ImageMagick's compare (command-line) function does basically this, as you can read about/see examples of here. One constraint, though, is that both images must be of the same size and must not have been translated/rotated/scaled. If they are not of the same size/orientation/scale, you'll need to take care of that first. OpenCV contains some algorithms for that. You can find a good tutorial on OpenCV functions you could use to rectify the image here.
How can I work with my own dataset in scikit-learn?
The scikit-learn tutorials always use one of the bundled datasets as an example (digits dataset, flower dataset...)
http://scikit-learn.org/stable/datasets/index.html
e.g.: from sklearn.datasets import load_iris
I have my own images and I have no idea how to create a new dataset from them.
In particular, to get started, I use this example I found (it uses the OpenCV library):
import cv2
import numpy as np
img = cv2.imread('telamone.jpg')
# Convert it to grayscale
imgg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# SURF extraction
surf = cv2.SURF()
kp, descriptors = surf.detect(imgg, None, useProvidedKeypoints=False)
# Setting up samples and responses for kNN
samples = np.array(descriptors)
responses = np.arange(len(kp), dtype=np.float32)
I would like to extract features of a set of images, in a way useful to implement a machine learning algorithm!
You would first need to clearly define what you are trying to achieve: "extract feature to a set of images, in a way useful to implement a machine learning algorithm!" is much too vague to give you any guidance.
Are you trying to do:
image classification of the picture as a whole (e.g. indoor scene vs outdoor scene)?
object recognition (e.g. recognizing several instances of the same object in different pictures) inside sub-parts of a set of pictures, maybe using a scanning procedure with windows of various sizes?
object detection and class-based categorization (e.g. finding all occurrences of cars or pedestrians in pictures and a bounding box around each occurrence of instances of those classes)?
full picture semantic parsing a.k.a. segmentation of the pixels + class categorization of each segment (building, road, people, trees)...
Each of those tasks will require different pipelines (feature extraction + machine learning models combo).
You should probably start by reading a book on the subject, for instance: http://szeliski.org/Book/
Also, as a side note, Stack Overflow is probably not the best place to ask such open-ended questions.
I have created a simple program to generate random images, giving a random color to each pixel. I know that there is a very low chance of generating a recognizable image, but I would like to try.
I have observed that the longest part of the work is checking whether the images really show something. I also observed that most of the images produced are just fields of colorful noise made of individual pixels. That's why I would like to ask for an algorithm, in pseudocode, to detect regions of similar color in an image. I think that the easiest way to find meaningful images is to filter out all those random-pixel images. It's not perfect, but I think it will help.
If someone could propose another kind of filtering algorithm that would help with this task, I would also appreciate it.
(edited)
To clarify this, in case my explanation was not clear enough, I will show you some images:
This is the kind of image I'm getting; basically I would describe it as "colorful noise". As you can see, all the pixels are spread individually without grouping into regions of similar color that would hopefully create shapes of objects or anything recognizable.
Here you can see a conventional image, a "recognizable" picture. We can clearly see a dog lying on the grass with a tennis ball. If you look carefully at this picture, it can be clearly distinguished from the other one because it has groups of similar colors that we can tell apart (the dog, a white region; the grass, a dark green region; and the tennis ball, a light green region).
What I want exactly is to remove the "pixelly" images before saving them to the hard disk and only save the ones with color groupings. As I said before, this is the best idea I had for filtering these randomly generated images, but if someone proposes a more efficient way I would really appreciate it.
(edited)
OK, I think this post is becoming too long... Well, if someone wants to have a look, here is the code of the program I wrote. It's really straightforward. I've programmed it in Python using Pygame. I know that this isn't nearly the most efficient way to do it. The thing is that I'm quite a noob in this field and I don't really know another way to do this in other languages or modules. Maybe you could also help me with this... I don't know, maybe translate the code to C++? I feel that I'm asking too many questions in the same post but, as I have said many times, any help would be greatly appreciated.
import pygame, random
pygame.init()
#lots of printed text statements here
imageX = int(input("Enter the width of the image you want to produce: "))
imageY = int(input("Enter the height of the image you want to produce: "))
maxImages = int(input("Enter the maximum number of images you want to produce: "))
maxMem = int(input("Enter the maximum data you want to produce (MB, only works with 800x600 images): "))
maxPPS = int(input("Enter the maximum number of images you want to produce each second: "))
firstSeed = int(input("Enter the first seed you want to use: "))
print("\n\n\n\n")
seed = firstSeed
clock = pygame.time.Clock()
images = 0
keepGoing = True
while keepGoing:
    #seed
    random.seed(seed)
    #PPS
    clock.tick(maxPPS)
    #surface
    image = pygame.Surface((imageX,imageY))
    #generation
    for x in range(imageX):
        for y in range(imageY):
            red = random.randint(0,255)
            green = random.randint(0,255)
            blue = random.randint(0,255)
            image.set_at((x,y),(red,green,blue))
    #save
    pygame.image.save(image,str(seed)+".png")
    #update parameters
    seed += 1
    images += 1
    #print seed
    print(seed - 1)
    #check end
    if images >= maxImages:
        keepGoing = False
    elif (images * 1.37) >= maxMem:
        keepGoing = False
    pygame.event.pump()
print("\n\nThis is the last seed that was used: " + str(seed - 1))
input("\nPress Enter to exit")
Here is a butchered algorithm for you to try (try it in OpenCV):
1. Get image
2. Work with just one color dimension of the image i.e. Red or Green... or Gray ...or do the following for each separately
3. Sum up all the values in the image and save this as the "energy" value of the image
4. Use OpenCV's Smooth function to blur the image
5. The trick to blurring the image correctly is to choose the size of the kernel (aka filter) to be smaller than the width of important features and larger than the unimportant or noisy features. The size is controlled by defining param1 and param2.
   See http://opencv.willowgarage.com/documentation/python/imgproc_image_filtering.html
6. Now for the output, sum up all the values to get the output "energy"
7. Keep image if the output has at least half of the energy of the input. Technically the trick in number 5 is the same as choosing 50 percent as the threshold to keep or discard images. So changing the threshold here is approximately the same as changing the filter size.
8. Optional. No need to think too much about it though, just get these energy values for some set of the images and choose the threshold by eye.
What is happening?
You're filtering out the high frequencies and then seeing if there is still something left over. Most images have lots of energy at lower spatial frequencies. In fact, JPEG compression uses this fact to compress images. The filter must have an energy of one to work correctly, so I'm assuming that this is true.
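Here is a rough pure-Python sketch of the blur-and-compare test, without OpenCV. It swaps in a plain box (mean) filter for OpenCV's Smooth. One deliberate change: "energy" is measured as the sum of squared deviations from the image mean, because a normalized blur leaves the plain sum of pixel values essentially unchanged and so cannot separate the two cases. The function names, the radius and the 0.5 threshold are illustrative.

```python
import random

def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window; borders use the pixels that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def energy(img):
    """Sum of squared deviations from the mean (assumed definition of energy)."""
    pixels = [p for row in img for p in row]
    m = sum(pixels) / len(pixels)
    return sum((p - m) ** 2 for p in pixels)

def keep_image(img, radius=1, threshold=0.5):
    """Keep the image if blurring preserves at least `threshold` of its energy."""
    e_in = energy(img)
    if e_in == 0:
        return False  # completely flat image; treated as uninteresting here
    return energy(box_blur(img, radius)) / e_in >= threshold

random.seed(0)
noise = [[random.randint(0, 255) for _ in range(32)] for _ in range(32)]
blocks = [[50] * 16 + [200] * 16 for _ in range(32)]  # two flat color regions
print(keep_image(noise), keep_image(blocks))
```

This works on a single channel, as in step 2; for color images the same check could be run per channel or on a grayscale version.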
Hope this helps!
The simplest way of filtering out noise is to look for correlation. Nearby regions should be highly correlated in most of the image. There are many ways to do it.
You should use a combination of the following and do some tweaking to find parameters that give an acceptable hit/miss ratio:
Color correlation: You will find a huge amount of correlation in U/V between nearby regions in "proper" images.
Edge detection: Natural images tend to have well-defined edges. This is the easiest way to tell noise from natural images.
Quite a bit more can be done. Frequency analysis: noisy images contain all frequencies, whereas natural images usually have large peaks. Scale-space analysis and so on are also options, depending on how complex you want to get and what hit ratio you are willing to accept. In general, trying to get recognizable images is an open-ended topic, but you should be able to get a very high hit ratio if you specifically want to remove noise images like the one you gave in the example.
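As one concrete instance of the correlation idea, the sketch below computes the Pearson correlation between each pixel and its right-hand neighbor on a single channel. It assumes the image is a list of rows of numeric values. Using one grayscale channel instead of U/V, and the function name itself, are simplifications for illustration.

```python
import random

def neighbor_correlation(img):
    """Pearson correlation between every pixel and the pixel to its right."""
    left = [row[x] for row in img for x in range(len(row) - 1)]
    right = [row[x + 1] for row in img for x in range(len(row) - 1)]
    n = len(left)
    mean_l, mean_r = sum(left) / n, sum(right) / n
    cov = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    var_l = sum((a - mean_l) ** 2 for a in left)
    var_r = sum((b - mean_r) ** 2 for b in right)
    if var_l == 0 or var_r == 0:
        return 0.0  # flat image: correlation is undefined, reported as 0 here
    return cov / (var_l * var_r) ** 0.5

random.seed(1)
noise = [[random.randint(0, 255) for _ in range(32)] for _ in range(32)]
gradient = [[x * 8 for x in range(32)] for _ in range(32)]  # smooth ramp
print(neighbor_correlation(noise), neighbor_correlation(gradient))
```

Per-pixel random noise gives a value near zero, while smooth content gives a value near one, so a threshold somewhere in between (chosen by experiment) can reject the noise images.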
EDIT:
In general there are no exact algorithms for problems like this. You have to make assumptions about the properties of the underlying data, then use basic primitives (correlation, frequency-domain data, edges, etc.) and combine them into an algorithm for solving the problem. This is because the solution to problems like this is very data-specific, which is quite different from solving, say, computer science algorithms. This is not to say that signal processing algorithms lack exactness. However, your current problem and many others deal with what are known as random variables and stochastic processes. You may want to search for whether someone has tried to solve this problem in the literature or at some university, and use that as your starting point. Tweak that algorithm to suit you. However, you are not going to get a solution easily unless you take some time to understand some of the things I mentioned and are willing to do some experiments and empirical analysis.
Without knowing exactly what you're trying to achieve, it's difficult to offer specific help. But, reading your account did remind me of something I saw recently which, whilst quite different in implementation, has a similar end goal: generate a recognisable image from randomness.
Check out https://github.com/phl/pareidoloop by Philip McCarthy.
Philip's project starts with random polygons, and the algorithm favours face-like images. Two key points here. First, the polygons significantly reduce the amount of random noise right off the bat, so the chances of generating something recognisable are significantly increased. Second, the algorithm favours a specific type of recognisable image: I suspect you're going to have to work towards a specific type of image so that you have some parameters with which to computationally estimate "recognisability".
hth!