I need to use an SVM on features extracted from images, and the output of the SVM should be binary. Please share any resources available. I am working on sclera detection: I manually marked the sclera region in the training images and extracted features in the form of histogram values for each sub-region of the image. I will extract the same features from the test image. Once I have the test features, I need to compare the test and training features with the SVM to decide whether a particular block corresponds to sclera or not. If I get the output in 0/1 form, I can then use it to segment the region more easily.
Look at libsvm for Matlab, which is probably the most widely used free SVM toolbox in Matlab.
There is a useful post about libsvm in Matlab right here. When you download libsvm, it contains several examples for Matlab that I was able to run in a few minutes. You can easily modify one of those examples to get your binary output.
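If you end up using libsvm's Python interface rather than the MATLAB one, the calls are analogous (svmtrain/svmpredict in MATLAB, svm_train/svm_predict in Python). A minimal binary-classification sketch, assuming each block's histogram is one feature vector and the labels are 1 for sclera and 0 for non-sclera (the toy data below is just a placeholder; depending on how libsvm is installed, the import may be from libsvm.svmutil instead):

# Minimal binary SVM with libsvm's Python interface (svmutil).
from svmutil import svm_train, svm_predict

# toy data: replace with your per-block histogram features and 0/1 labels
train_feats  = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.1, 0.1, 0.8], [0.0, 0.2, 0.8]]
train_labels = [1, 1, 0, 0]          # 1 = sclera block, 0 = non-sclera block
test_feats   = [[0.85, 0.15, 0.0], [0.05, 0.15, 0.8]]

model = svm_train(train_labels, train_feats, '-t 2 -c 1')   # RBF kernel; tune -c/-g
p_labels, p_acc, p_vals = svm_predict([0] * len(test_feats), test_feats, model)
print(p_labels)                      # one 0/1 decision per test block, e.g. [1.0, 0.0]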
For a project I am using YOLO to detect phallusia (microbial organisms) that swim into focus in a video. The issue is that I have to train YOLO on my own data. The data needs to be segmented so I can isolate the phallusia. I am not sure how to properly segment/cut out the phallusia to fit the format that YOLO needs. For example, in the picture below I want YOLO to detect when a phallusia is in focus, similar to the one I have boxed in red. Do I just cut out that segment of the image, save it as its own image, and feed that to YOLO? Do all segmented images need to have the same dimensions? I'm not sure what I am doing and could use some guidance.
It looks like you need to start from the basics; OK, no fear. I will try to suggest a simple route to start using YOLO techniques efficiently. Luckily, the web has a lot of examples.
Understand WHAT a YOLO method is.
Andrew Ng's YOLO explanation is a good start, but only if you already know what classification and detection are.
Understand the YOLO Loss function, the heart of the algorithm.
Check the YOLO paper itself; don't be scared. On page 2, in the Unified Detection section, you will find the information about the bounding-box encoding used, but be aware that you can use whatever notation you want (or even invent a new one), as long as it stays compatible with the loss function, the real core of this algorithm.
Start to implement an example
As I wrote above, there are plenty of examples. You can check this one if you are familiar with Python and TensorFlow.
Inside it you will find a way to prepare the dataset, which I think is your real target in this question. In this case a tool named labelImg is used.
I hope it will be useful. Please share your code when it is ready, I'm curious :). Good luck!
Do I just cut-out that segment of the image and save it as its own image and feed to that to YOLO?
You need as many images of your microbial organism as you can get, in different sizes, positions, etc. It doesn't need to be the only thing in the image, but you need to know its <x> <y> <width> <height> position.
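For reference, darknet-style YOLO training expects one .txt label file per image, with one line per object: the class index followed by the box centre and size, all normalized by the image dimensions. A small sketch of that conversion (the pixel coordinates and image size below are just made-up placeholders):

# Convert a pixel bounding box to a darknet/YOLO label line:
#   <class> <x_center> <y_center> <width> <height>, all normalized to [0, 1].
def to_yolo_line(class_id, box, img_w, img_h):
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / float(img_w)
    h = (y_max - y_min) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_c, y_c, w, h)

# e.g. a 1280x720 frame with one organism boxed at (400, 200)-(560, 330):
print(to_yolo_line(0, (400, 200, 560, 330), 1280, 720))
# -> "0 0.375000 0.368056 0.125000 0.180556"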
Do all segmented images need to have the same dimensions?
No, they can be of any size and Yolo adapts them. See the VOC dataset for examples of the images Yolo is normally trained on. A couple of examples:
kitchen, dogs
Not sure what I am doing and could use some guidance.
My advice would be to follow the instructions for "Training YOLO on VOC" from the original Yolo website; https://pjreddie.com/darknet/yolo/
Once you have that working, you will have a better idea of the steps you need to take.
I had similar problems when I wanted to train YOLOv2 for some game cards.
In order to solve the problem I took a picture of every game card with my cellphone and cut them out. Because I didn't have enough training data, I wrote a dataset-generator program that produced the training data from the photos of the cards. The program can multiply, rotate, and scale an image and then place it on a background.
You may run into problems if you don't have enough training data. In that case don't panic: by rotating and scaling a handful of raw images you can generate a large dataset.
Here you can find my dataset generator, which is able to generate Pascal VOC style and darknet style training data: https://github.com/szaza/dataset-generator. Feel free to reuse it, if you need something similar.
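If you want to roll a quick augmentation step yourself instead of using the linked generator, here is a rough sketch of the rotate/scale/paste idea with OpenCV (file names, ranges, and output naming are all placeholders; a real generator would also write the matching label files):

# Rough augmentation sketch: rotate/scale a cut-out object and paste it onto a
# background at a random position. Assumes the background is larger than the object.
import random
import cv2

obj = cv2.imread("card.png")           # the cut-out object photo
bg = cv2.imread("background.png")      # a background photo

angle = random.uniform(0, 360)
scale = random.uniform(0.5, 1.0)
h, w = obj.shape[:2]
M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
warped = cv2.warpAffine(obj, M, (w, h))

bh, bw = bg.shape[:2]
x = random.randint(0, bw - w)          # random paste position
y = random.randint(0, bh - h)
sample = bg.copy()
sample[y:y + h, x:x + w] = warped      # naive paste, no mask/blending
cv2.imwrite("augmented_0001.png", sample)
# (x, y, w, h) of the paste location gives you the bounding-box label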
I have just begun to use TensorFlow in python. I want to build a binary image classifier using CNN.
I found an example code on the internet: https://github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py
The explanation is given here: https://www.tensorflow.org/get_started/mnist/pros
This code builds a small Neural Network and uses the MNIST dataset to train and test it.
I roughly understood the working of the CNN but I didn't understand the code line by line.
I want to use the same code with my own dataset of images (for both training and testing). In the example the input images are converted into an m x 784 array, where m is the number of training/testing examples and 784 comes from flattening the 28x28 images. I have converted all my images into an array of size m x 1024 using a Python script, and similarly converted the ground truth into an array of size m x 1. I have stored them in text files as X.txt and y.txt.
Now in the code I have changed the dimensions according to my image size. However, I am confused about how to feed the images into the network. Is there a way other than going through the code line by line? I would be very grateful if you could help me out.
This Get Started guide is just great for understanding the code line by line and going "deeper" into neural networks step by step.
https://www.tensorflow.org/get_started/
Try to understand it, it will really help you :)
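As for actually feeding X.txt and y.txt into the example: the MNIST tutorial pushes flat float arrays through feed_dict, so the main job is loading your text files into numpy arrays of matching shape. A minimal loading/batching sketch, assuming 32x32 grayscale images flattened to 1024 values per row and integer labels (the placeholder names x, y_, train_step, sess refer to whatever your modified example uses):

# Load flattened images/labels into numpy arrays shaped like the MNIST feed.
import numpy as np

X = np.loadtxt("X.txt", dtype=np.float32)      # shape (m, 1024)
y = np.loadtxt("y.txt", dtype=np.int64)        # shape (m,)
X /= 255.0                                     # scale pixel values to [0, 1]
assert X.shape[0] == y.shape[0]

def batches(X, y, batch_size=64):
    # yield shuffled mini-batches, ready for feed_dict={x: xb, y_: yb}
    idx = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        sel = idx[start:start + batch_size]
        yield X[sel], y[sel]

# inside the training loop of the example:
# for xb, yb in batches(X, y):
#     sess.run(train_step, feed_dict={x: xb, y_: yb})

If your network expects one-hot labels like the MNIST example does, convert y accordingly before feeding it.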
I am very unfamiliar with Caffe. My task is to train an autoencoder net on image pairs, given in .tif format, where one is a grayscale image of nerves and the other is the corresponding binary mask which shows whether a certain structure is present in the image or not. I have these in the same "train" folder. What I would like to accomplish is a meaningful experiment with these images (segmentation, classification; it is not specified). My first problem is that I do not know how to feed the images into the net without an existing train.txt. Can I use the images directly, or is another format like LMDB or HDF5 needed? Any suggestion is appreciated.
You can accomplish this with a simple classification network (an existing one like AlexNet, GoogLeNet, or LeNet). You can use just the binary mask or the grayscale image together with the class name to do this. Nvidia DIGITS is a good graphical tool for building the paired dataset and training.
Please see this link:
https://developer.nvidia.com/digits
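If you prefer to skip DIGITS and build the usual Caffe list file yourself, train.txt is just one "image_path label" line per image. A sketch of generating it (the folder layout, the "_mask.tif" naming, and the label rule are assumptions; adapt them to how your data actually encodes the class):

# Build a Caffe-style train.txt: one "image_path label" line per image.
import glob
import os
import numpy as np
from PIL import Image

def label_for(mask_path):
    # placeholder rule: label 1 if the binary mask has any set pixel, else 0
    return int(np.array(Image.open(mask_path)).any())

with open("train.txt", "w") as out:
    for img_path in sorted(glob.glob("train/*.tif")):
        if img_path.endswith("_mask.tif"):          # skip the mask files themselves
            continue
        mask_path = img_path.replace(".tif", "_mask.tif")
        if os.path.exists(mask_path):
            out.write("%s %d\n" % (img_path, label_for(mask_path)))

With a list file like this you can point an ImageData layer at it, or use Caffe's convert_imageset tool to build an LMDB from it.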
I am working on a project in which I need to highlight the difference between pair of scanned images of text.
Example images are here and here.
I am building a web app based on HTML/JS for this.
I found that OpenCV supports highlighting differences between two images.
I also saw that ImageMagick has such support.
Does OpenCV have support for automatic registration of images?
And is there a JS module for OpenCV?
Which one is more suited for my purpose?
1. Simplistic way:
Suppose the images are perfectly aligned and similarly illuminated: subtract one image from the other pixel by pixel, then threshold the result, filter out small noisy blobs, and select the biggest ones. Good for a school project (a sketch combining this with the alignment from option 2 appears after this list).
2. A bit more complicated:
Align the images, then find a way to even out the illumination, then apply the simplistic way.
How to align:
Find the text area in both images, as it is darker than the page background.
Find its corners
Use getPerspectiveTransform() to find the transform between images.
warpPerspective() one image to another.
Another way to register the two images is by feature matching. It has quite extensive support in OpenCV, and findHomography() will estimate the pose between two images from a larger set of matching points.
3. Canonical answer:
Align the image.
Convert it to text with an OCR engine.
Compare the text in the two images.
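A rough OpenCV sketch of options 1 and 2 combined (ORB feature matching for alignment, then the simple difference; file names, thresholds, and blob sizes are placeholders, and the findContours call uses the OpenCV 4 signature):

# Align two scans with feature matching, then highlight pixel differences.
import cv2
import numpy as np

a = cv2.imread("scan1.png", cv2.IMREAD_GRAYSCALE)
b = cv2.imread("scan2.png", cv2.IMREAD_GRAYSCALE)

# 1. register b onto a with ORB features + homography
orb = cv2.ORB_create(2000)
ka, da = orb.detectAndCompute(a, None)
kb, db = orb.detectAndCompute(b, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(da, db)
matches = sorted(matches, key=lambda m: m.distance)[:200]
src = np.float32([kb[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([ka[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
b_aligned = cv2.warpPerspective(b, H, (a.shape[1], a.shape[0]))

# 2. simplistic difference: subtract, threshold, keep the big blobs
diff = cv2.absdiff(a, b_aligned)
_, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
out = cv2.cvtColor(a, cv2.COLOR_GRAY2BGR)
for c in contours:
    if cv2.contourArea(c) > 50:                     # drop small noise blobs
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(out, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite("differences.png", out)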
Well, besides the great help given by vasile, you also need an answer for the web-app part.
In order to make it work on a server, you will probably need a file upload form, as well as a response from the server with the result of the algorithm. There are several ways you can do it, depending on the server restrictions you have. If you can run command-line programs, you could implement the highlighting algorithm with OpenCV and pass the two input files and an output file to the program. A PHP script could then handle uploading the files, calling the command-line program, and returning the result to the user.
Another approach could be using java and JavaCV in a web container like Apache Tomcat, for instance.
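For the record, the same upload-then-run-a-tool pattern works in other stacks too. A minimal sketch in Python/Flask, where "diff_tool" stands in for whatever OpenCV/ImageMagick command-line program you end up writing:

# Minimal web endpoint: accept two uploads, run a command-line diff tool,
# return the highlighted result. "diff_tool" is a placeholder binary name.
import os
import subprocess
import tempfile
from flask import Flask, request, send_file

app = Flask(__name__)

@app.route("/diff", methods=["POST"])
def diff():
    workdir = tempfile.mkdtemp()
    a = os.path.join(workdir, "a.png")
    b = os.path.join(workdir, "b.png")
    out = os.path.join(workdir, "out.png")
    request.files["image1"].save(a)
    request.files["image2"].save(b)
    subprocess.run(["diff_tool", a, b, out], check=True)
    return send_file(out, mimetype="image/png")

if __name__ == "__main__":
    app.run()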
Best regards,
Daniel
Sometimes two image files may be different on a file level, but a human would consider them perceptively identical. Given that, now suppose you have a huge database of images, and you wish to know if a human would think some image X is present in the database or not. If all images had a perceptive hash / fingerprint, then one could hash image X and it would be a simple matter to see if it is in the database or not.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command line tool or a library I could use to compute such a hash without implementing some algorithm from scratch?
edit: relevant code from findimagedupes, using ImageMagick
# $image is an Image::Magick object already read from file;
# "try" is findimagedupes' own error-checking wrapper around each call.
try $image->Sample("160x160!");          # force a fixed working size
try $image->Modulate(saturation=>-100);  # drop the colour information
try $image->Blur(radius=>3,sigma=>99);   # heavy blur to wash out fine detail and noise
try $image->Normalize();                 # stretch the intensity range
try $image->Equalize();                  # histogram equalization
try $image->Sample("16x16");             # shrink to the 16x16 fingerprint grid
try $image->Threshold();                 # binarize each pixel
try $image->Set(magick=>'mono');         # 1 bit per pixel output format
($blob) = $image->ImageToBlob();         # the resulting bits are the fingerprint
edit: Warning! ImageMagick $image object seems to contain information about the creation time of an image file that was read in. This means that the blob you get will be different even for the same image, if it was retrieved at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to let it print "perceptive hash", for example.
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that are not the same scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor intensive. Apparently you can also do the correlation in a log-polar space and it will be invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
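A minimal numpy sketch of plain phase correlation (translation only; the log-polar extension for rotation and scale is not shown):

# Phase correlation: recover the (dy, dx) translation between two equally
# sized grayscale images. The peak height gives a rough match confidence.
import numpy as np

def phase_correlation(a, b):
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.maximum(np.abs(R), 1e-12)     # keep only the phase
    r = np.fft.ifft2(R).real              # sharp peak at the shift
    dy, dx = np.unravel_index(np.argmax(r), r.shape)
    if dy > a.shape[0] // 2:              # map wrap-around to negative shifts
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx, r.max()

# quick self-test: shift an image by (5, -3) and recover it
img = np.random.rand(128, 128)
shifted = np.roll(np.roll(img, 5, axis=0), -3, axis=1)
print(phase_correlation(shifted, img))    # approximately (5, -3, 1.0)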
A colour histogram is good for matching the same image after it has been resized, resampled, etc.
If you want to match different people's photos of the same landmark, it's trickier - look at Haar classifiers. OpenCV is a great free library for image processing.
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize both images to 1x1 pixels; if they match exactly, there is a small probability they are the same picture.
Now resize them to 2x2 pixels; if all 4 pixels match exactly, the probability is larger.
Then 3x3; if all 9 pixels match, the chance is better still.
Then 4x4; if all 16 pixels match, better again.
And so on.
Doing it this way you can also make efficiency improvements: if the 1x1 comparison is already far off, why bother checking the 2x2 grid?
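A quick sketch of that coarse-to-fine comparison with Pillow (the size limit and tolerance are arbitrary placeholders):

# Compare ever-larger downscaled versions and bail out early when the two
# images are already clearly different at a coarse resolution.
from PIL import Image

def probably_same(path_a, path_b, max_size=8, tolerance=16):
    a = Image.open(path_a).convert("L")
    b = Image.open(path_b).convert("L")
    for n in range(1, max_size + 1):                   # 1x1, 2x2, 3x3, ...
        pa = list(a.resize((n, n)).getdata())
        pb = list(b.resize((n, n)).getdata())
        diff = sum(abs(x - y) for x, y in zip(pa, pb)) / (n * n)
        if diff > tolerance:                           # clearly different: stop early
            return False
    return True                                        # survived every level

print(probably_same("photo1.jpg", "photo2.jpg"))       # file names are placeholders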
If you have lots of images, a colour histogram could be used to get a rough measure of closeness before doing a full comparison of each image against every other one (i.e. O(n^2)).
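A rough version of that histogram pre-filter with Pillow (the cut-off value is a placeholder you would tune on your data):

# Cheap L1 distance between 256-bin grayscale histograms, used to skip
# obviously different pairs before running the expensive full comparison.
from PIL import Image

def histogram_distance(path_a, path_b):
    ha = Image.open(path_a).convert("L").resize((256, 256)).histogram()
    hb = Image.open(path_b).convert("L").resize((256, 256)).histogram()
    return sum(abs(x - y) for x, y in zip(ha, hb))

if histogram_distance("a.jpg", "b.jpg") < 20000:       # placeholder cut-off
    print("close enough: run the full comparison")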
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if they are REALLY different; I guess it would eliminate a lot of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the amount of RGB in each pixel. You could also look at other metrics, such as luminosity.