Recognition of numbers in images (MATLAB) - algorithm

I am learning image processing and am trying to start my first project: simple number recognition in an image.
So far I have applied thresholding to the image. Now I would like to know some algorithms by which my system can recognize the number in the image. Preferably the algorithm should be simple; it doesn't have to be robust, as I will be generating the images in Paint using the same font.
I have looked at similar questions here on SO and they all point to using libraries. Remember, I am trying to learn, so please don't just point me to libraries.

Are the numbers printed or hand-written?
The Computer Vision System Toolbox includes a function called ocr, which will recognize both letters and numbers.
If you are looking for hand-written digit recognition, please take a look at this example.

Related

Removing skew/distortion based on known dimensions of a shape

I have an idea for an app that takes a printed page with a square in each of its four corners and allows you to measure objects on the paper, provided at least two squares are visible. I want a user to be able to take a picture from a less than perfect angle and still have the objects measured accurately.
I'm unable to figure out exactly how to find information on this subject due to my lack of knowledge in the area. I've been able to find examples of OpenCV code that do some interesting transforms and the like, but I've yet to figure out how to phrase what I'm asking in simpler terms.
Does anyone know of papers or mathematical concepts I can look up to get further into this project?
I'm not quite sure how or who to ask other than people on this forum. Sorry for the somewhat vague question.
What you describe is very reminiscent of augmented reality marker tracking. Maybe you can start by searching these words on a search engine of your choice.
A single marker, if done correctly, can be used to identify it without confusing it with other markers AND to determine how the surface is placed in 3D space in front of the camera.
But that's all very difficult and advanced stuff. I'd strongly advise you NOT to try to implement something like this yourself; it could take years of research. The only realistic option is to use a ready-made open source library that outputs the data you need for your app.
Such a library may not even exist. In that case you'll have to buy one. Given how niche your problem is, that would be perfectly plausible.
Here I cover only the programming aspect; if you want, you can work out the mathematical aspect from these examples. Most of the functions you need are available in OpenCV. Here are some examples in Python:
To detect the printed paper, you can use the cv2.findContours function. The outermost contour is probably the paper, but you need to test this on actual images. https://docs.opencv.org/3.1.0/d4/d73/tutorial_py_contours_begin.html
If the paper is tilted (not photographed at a perfect angle), you can find the angle with cv2.minAreaRect, which returns the angle of the contour you found above. https://docs.opencv.org/3.1.0/dd/d49/tutorial_py_contour_features.html (part 7b).
If you want to rotate the paper, use cv2.warpAffine. https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html
To detect the objects on the paper, there are several methods. The easiest is to use the contours above. If the objects have specific colors, you can detect them with a color filter. https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html
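To make the rotation step concrete, here is a minimal sketch in plain Python (no OpenCV needed) of measuring a tilt angle from two reference points and building a rotation matrix about a centre. The matrix layout follows the formula documented for cv2.getRotationMatrix2D; the point coordinates and the helper names (skew_angle_deg, rotation_matrix, apply_affine) are made up for illustration. Note that this only corrects in-plane rotation; a photo taken at an oblique angle also has perspective distortion, which this sketch does not address.

```python
import math

def skew_angle_deg(p_left, p_right):
    """Angle of the line p_left -> p_right relative to the image x-axis, in degrees."""
    dx = p_right[0] - p_left[0]
    dy = p_right[1] - p_left[1]
    return math.degrees(math.atan2(dy, dx))

def rotation_matrix(center, angle_deg, scale=1.0):
    """2x3 affine matrix for a rotation about `center`.

    Same layout as the matrix documented for cv2.getRotationMatrix2D.
    """
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    cx, cy = center
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]

def apply_affine(m, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Hypothetical centres of two printed squares along the top edge of the page.
top_left = (100.0, 100.0)
top_right = (300.0, 200.0)

angle = skew_angle_deg(top_left, top_right)
m = rotation_matrix(top_left, angle)

# After rotating by the measured angle, the top edge should be horizontal.
levelled_right = apply_affine(m, top_right)
print(angle, levelled_right)
```

With a real image, the same 2x3 matrix is the kind of input cv2.warpAffine expects, and the pixel distance between the two levelled squares, together with their known printed spacing, gives a scale for measuring objects.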

Pattern recognition for image recognition

I need to write a program that recognizes some patterns in different photos. I wrote a program that takes a photo as input and creates another image with the edges of the original photo. Now I'm stuck on pattern detection. I tried taking 2D arrays of pixels and marking each possible pattern by giving each pixel a value from 0 to n (the maximum number of pixels in a sequence). Then I take the objects that the program already knows and see which one contains the most of the patterns that were found.
The problem is that, besides efficiency, the program won't work if the image is upside down (if I train it with a photo and then flip the photo, the program won't recognize it).
Can you tell me some methods to accomplish my task, or some good tutorials or courses that explain the process in more depth than just "search for patterns"?
Your problem description is very general. To get better answers, provide some characteristics of the input data and describe what kind of patterns you're looking for.
For the general problem of pattern recognition, neural networks could be useful.
For example, you could check the first chapter of this book: http://neuralnetworksanddeeplearning.com/chap1.html
It has a simple example of pattern recognition for handwritten digits.
In your case, to solve the rotation problem, you'd probably have to rotate the training examples as well.
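As an illustration of rotating the training examples, here is a small sketch in plain Python that adds the 90, 180 and 270 degree rotations of each example to the training set. The function names and the toy pattern are hypothetical, and only quarter-turn rotations are handled; arbitrary angles would need interpolation.

```python
def rotate90(grid):
    """Rotate a 2D list of pixel values 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def augment_with_rotations(image, label):
    """Return the example plus its 90, 180 and 270 degree rotations."""
    examples = []
    current = image
    for _ in range(4):
        examples.append((current, label))
        current = rotate90(current)
    return examples

# Toy 2x2 "image" standing in for a real training photo.
pattern = [[1, 0],
           [1, 1]]
augmented = augment_with_rotations(pattern, "L-shape")
for rotated, label in augmented:
    print(label, rotated)
```

The upside-down case from the question corresponds to the 180 degree entry, so a classifier trained on the augmented set has seen that orientation.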

Algorithm to detect doors and windows in an image

I'm working on an educational project where I need to detect objects, mainly doors and windows. I have tried to find specific algorithms to do this.
This is the first step in a project to detect all objects and let a user choose the object he wants. Then in the next step the system will define edges of the object accurately.
I want to detect objects by how their color differs from the background or from overlapping objects. I need an algorithm to start with. I started learning about color spaces and chose the HSV color space. I have read many papers and I know how they work, but I'm still confused and don't know which algorithm will really help.
You can use any segmentation algorithm.
You will need to extract features from the images to use in segmentation. A good approach to feature selection is to use a deep learning technique. I would recommend trying a CNN; the MatConvNet toolbox (http://www.vlfeat.org/matconvnet/) lets you implement CNNs in MATLAB.
You can also find a few pre-built CNN models for segmentation here: http://www.vlfeat.org/matconvnet/pretrained/
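For the color-based starting point mentioned in the question, a very simple form of segmentation is thresholding in HSV. The sketch below uses only Python's standard colorsys module; the hue range, the saturation and value cut-offs, and the tiny test image are assumptions chosen for illustration, and hues that wrap around 0 (reds) are not handled.

```python
import colorsys

def hue_mask(pixels, hue_lo, hue_hi, min_saturation=0.4, min_value=0.2):
    """Return a 2D list of booleans marking pixels inside a hue range.

    `pixels` is a 2D list of (r, g, b) tuples with components in 0..255.
    Hue bounds are in 0..1, as returned by colorsys.rgb_to_hsv.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = hue_lo <= h <= hue_hi
            mask_row.append(inside and s >= min_saturation and v >= min_value)
        mask.append(mask_row)
    return mask

# Hypothetical 2x2 image: two grey wall pixels and two blue door pixels.
image = [
    [(200, 200, 200), (30, 60, 200)],
    [(20, 40, 180), (210, 205, 200)],
]
blue_mask = hue_mask(image, 0.55, 0.72)
print(blue_mask)
```

The resulting mask separates the saturated blue pixels from the near-grey ones; a real door or window detector would need more than a single hue range, which is where the learned features suggested above come in.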

Logo recognition with a huge dataset

First of all, thanks for reading my question. I'm beginner in computer vision.
I read a lot but I didn't find any solution.
I have an image and I want to detect the logo or logos in it.
I also have a whole set of images with different logos, each image containing one logo and nothing more.
Can you help me with any ideas on how to detect logos in an image when I have a whole set (thousands) of training images (the set of known logos)?
For a few known logos it can be done with the SURF or SIFT feature detection algorithm, by matching the given image against all of the others, but I have a huge dataset and can't match against every image.
Trying all the images in the dataset takes far too much time :)
Would any SDK be useful? (It can be for mobile phones or for desktop.)
Or can I combine multiple algorithms for it?
I found an interesting paper about this question with a SIGMA algorithm, but I can't find any description for these algorithms (http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5495345).
I think to detect the features on the images is OK (SIFT, maybe SURF).
But I think the problem is with the big number of known images/logos.
I think it should be stored in a special way.
E.g. build a tree somehow from the thousands of known logos, or separate them into groups.
Is it possible to do this task?
I appreciate any help.
The thousands of training images are useful only for testing your algorithm; they will not help to analyze a new image.
I did a bit of pattern recognition in the past, and I would start this way: look for sharp edges (sharp color transitions too). So, an edge filter plus statistical analysis of the features that are all located in the same corner. The result of the algorithm will be a number that you can use with your training set.
Since you are doing original research, be prepared for a lot of work. If an SDK with an "ImageHasLogo()" function already exists, you will find it on Google.
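One possible reading of "an edge filter whose result is a number" is an edge-density score: the fraction of pixels with a sharp intensity jump to a neighbour. The sketch below is an assumption about what such a score could look like, not the answerer's actual method; the threshold and the toy images are made up.

```python
def edge_density(gray, threshold=50):
    """Fraction of pixels with a strong horizontal or vertical intensity jump.

    `gray` is a 2D list of grayscale values (0..255). The last row and
    column are skipped because they have no right/lower neighbour.
    """
    height, width = len(gray), len(gray[0])
    strong = 0
    counted = 0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])
            dy = abs(gray[y + 1][x] - gray[y][x])
            counted += 1
            if max(dx, dy) > threshold:
                strong += 1
    return strong / counted if counted else 0.0

# A flat region versus a region containing a bright, sharp-edged box.
flat = [[10] * 4 for _ in range(4)]
boxed = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]
print(edge_density(flat), edge_density(boxed))
```

Computing this score on one corner of the image at a time would give a number per corner, which could then be compared against the same statistic on the known logo images.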

Algorithm to detect presence of text on image

For my new assignment I am looking for a method to detect the presence of text in an image. The image is a map, for example a Google map. The task is to detect where the street/city labels are placed.
I know that the OpenCV library has algorithms that can detect features (for example human faces), such as the Haar classifier or HOG (histogram of oriented gradients), but I have heard that the training process for such algorithms is quite difficult.
Do you know of any algorithm, method or library that could do that (detect the presence of text in an image)?
Thanks,
John
There is a standard problem in vision called text detection in images. It is quite different from OCR. OCR concerns itself with what the text says, while text detection is about determining whether there is text in the image. Adi Shavit's third link is a method that addresses this problem. You can look on Google Scholar for well-cited articles on text detection.
There are several possible approaches you can take.
Use OCR. A search for OCR on Stack Overflow will show many options. These include Tesseract and Ocropus.
If your text uses a very specific fixed font, you may get away with simple template matching.
In the more general case, you might want to take a look at "Detecting Text in Natural Scenes with Stroke Width Transform".
UPDATE Jan. 2017
The OpenCV 3.2 contrib modules now include a text detection module.
It also includes a sample (C++, Python) of how to use it.
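For the fixed-font case, the template matching option can be sketched in plain Python on thresholded (0/1) images: slide each known glyph over the image and keep the position with the fewest mismatching pixels. The 3x3 glyphs and the test image below are toy data, and the function names are hypothetical; real glyphs would be cropped from images rendered in the actual font.

```python
def match_score(image, template, top, left):
    """Count pixels that differ between the template and one image window."""
    return sum(
        image[top + y][left + x] != template[y][x]
        for y in range(len(template))
        for x in range(len(template[0]))
    )

def best_match(image, template):
    """Try every position; return (mismatches, top, left) of the best fit."""
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = match_score(image, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best

def recognize(image, templates):
    """Return the label whose template fits somewhere with fewest mismatches."""
    return min(templates, key=lambda label: best_match(image, templates[label])[0])

# Toy 3x3 glyphs standing in for thresholded digits in a fixed font.
ONE = [[0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]]
SEVEN = [[1, 1, 1],
         [0, 0, 1],
         [0, 0, 1]]
templates = {"1": ONE, "7": SEVEN}

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
print(best_match(image, SEVEN), recognize(image, templates))
```

This brute-force search is only practical for small images and a handful of glyphs, and it assumes the text in the image has the same size and orientation as the templates.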
You need to tune this to a specific type of map images, or the problem is going to be very difficult (see the previous post about links to articles).
OCR is the way to go, and you should use an existing library. However, OCR is mainly done on text on white backgrounds. To reduce your problem to a regular OCR problem, you should attempt to work on the color space of the map. Likely the map text has a very specific color and this may be enough to find these pixels. You can then filter the detected pixels based on the size of connected regions.
If you literally only want to find the locations of text labels, you can do the above, and pretty much just skip the OCR step. If the labels are not too close, simple clustering algorithms can be used to find their respective positions.
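The "filter by size of connected regions" step can be sketched as a flood fill over a binary mask of candidate text pixels, keeping the bounding boxes of regions whose pixel count falls in a plausible range. This is a minimal illustration in plain Python; the size limits and the toy mask are assumptions, and in practice the mask would come from the color filtering described above.

```python
from collections import deque

def connected_regions(mask):
    """Return 4-connected regions of truthy pixels as lists of (row, col)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def label_boxes(mask, min_size=2, max_size=50):
    """Bounding boxes (top, left, bottom, right) of regions within size limits."""
    boxes = []
    for region in connected_regions(mask):
        if min_size <= len(region) <= max_size:
            rows = [y for y, _ in region]
            cols = [x for _, x in region]
            boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes

# Toy mask: two label-like blobs and one isolated noise pixel.
mask = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
boxes = label_boxes(mask)
print(boxes)
```

The single noise pixel is dropped by the minimum size, and the remaining boxes give approximate label positions without running OCR at all.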

Resources