Advice on using OCR on an image of a blackboard

I'm trying to get an image of a blackboard readable by OCR. Naturally, most OCR software doesn't like dirty images. What image processing should I try to put the image through to clean the image up?

Have you tried the OCR software yet? It's likely that the OCR software is well suited to reading what's essentially already a black and white image.
However, if you do need to preprocess the image, you could try the following:
Threshold the image.
Essentially take a greyscale version of the image and turn it into black / white pixels
Perform binary dilation to grow the remaining objects.
Perform binary erosion to shrink them back to their original size.
The idea is that dilating and then eroding (a morphological closing) fills small gaps and smooths rough, noisy edges, after which you can pass the cleaned-up image to the OCR.
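As a minimal sketch of the three steps above (threshold, dilate, erode), here is a pure-Python version working on a small grayscale grid. The function names, the 3x3 neighbourhood, and the threshold of 128 are assumptions made for illustration; the answer does not specify them. Chalk is assumed to be brighter than the board, so it becomes 1 after thresholding.

```python
def threshold(gray, t=128):
    """Return a binary image: 1 where the pixel is brighter than t, else 0."""
    return [[1 if px > t else 0 for px in row] for row in gray]


def _morph(binary, combine):
    """Apply `combine` (max or min) over each pixel's 3x3 neighbourhood.

    Neighbours that fall outside the image are simply ignored.
    """
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = [binary[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(combine(neighbours))
        out.append(row)
    return out


def dilate(binary):
    """Grow the 1-regions by one pixel in every direction."""
    return _morph(binary, max)


def erode(binary):
    """Shrink the 1-regions by one pixel in every direction."""
    return _morph(binary, min)


def close_image(binary):
    """Dilate, then erode (a morphological closing)."""
    return erode(dilate(binary))
```

On a stroke with a one-pixel break in it, the closing fills the break without lengthening the stroke. For real photographs an image library's morphology routines would likely be a better choice than these nested loops.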
There are probably plenty of methods to achieve a similar result. Given that there are entire books devoted to computer vision this answer will hardly do them justice.
The only texts I have are from 1997, but surely there's been more written on the subject since.
Algorithms for Image Processing and Computer Vision - J.R. Parker
Digital Image Processing - Gonzalez / Woods

Offhand, I'd say invert the image (reverse the colors, so that the writing is black on white) and increase the contrast a bit. You can try modifying the brightness to get the erased chalk fogginess to disappear into the background.

In Photoshop, the Levels dialog may be your most useful image adjustment. Mimicking this in code is another subject, entirely.
The basis of Levels is that you adjust the max, min and midpoints of the brightness levels. Usually shown on a histogram, you adjust the points such that you obtain the desired amount of contrast, but also move the midpoint such that text in the image is the most well-defined; critical for OCR applications. By moving the midpoint you can "eliminate" the grayscale fuzz that ordinarily surrounds handwriting by causing it to disappear into the light (or dark) areas of the image.
Also, you might try converting the image to 1-bit after such an adjustment, forcing every pixel to black or white. Sometimes this speeds up the OCR process, but be careful: it also discards detail.
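For readers who want to try the Levels idea outside Photoshop, the following is a rough sketch of such a mapping for 8-bit grayscale values. It is an approximation under stated assumptions, not Photoshop's actual implementation: the names `levels` and `apply_levels` are made up here, and the midpoint is modelled as a gamma exponent where values above 1 lighten the midtones.

```python
def levels(value, black=0, white=255, gamma=1.0):
    """Map one 0-255 brightness value through a Levels-style adjustment.

    black, white: input values that should become pure black / pure white.
    gamma: midpoint control (assumed convention: > 1 lightens, < 1 darkens).
    """
    if white <= black:
        raise ValueError("white point must be above black point")
    # Clip to [black, white], then normalise to the range 0..1.
    t = (min(max(value, black), white) - black) / (white - black)
    # Bend the midtones.
    t = t ** (1.0 / gamma)
    return round(t * 255)


def apply_levels(gray, black=0, white=255, gamma=1.0):
    """Apply the same adjustment to every pixel of a grayscale image."""
    return [[levels(px, black, white, gamma) for px in row] for row in gray]
```

Raising the black point pushes faint smudges to pure black, lowering the white point pushes solid strokes to pure white, and the gamma value decides which side the in-between grays fall toward.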

Have you tried edge detection techniques such as the Roberts cross or Sobel operators to separate the writing from the noise? Without seeing the quality of the image, I can't say how effective that would be.
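To make the suggestion concrete, here is a small sketch of the Sobel operator in pure Python. The function name and the choice to leave border pixels at zero are assumptions for illustration. Note that edge operators respond to any sharp brightness change, so they outline chalk strokes but can also respond to noise.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Return the gradient magnitude of a grayscale image.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    px = gray[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * px
                    gy += SOBEL_Y[j][i] * px
            out[y][x] = math.hypot(gx, gy)
    return out
```

Flat regions give a magnitude of zero, while a brightness step gives a large value on both sides of the step.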

Not sure how constrained you are in the choice of OCR solution, but the ABBYY OCR engine (and a web API based on it, http://www.wisetrend.com/wisetrend_ocr_cloud.shtml ) includes automatic image cleanup / texture removal options.

There are commercial solutions but cleaning up board images appears to be an open problem. Add OCR to an unsolved problem, and you get... an unsolved problem.

Related

Detecting pasted shape (or object) within an image

I'm trying to find a way to reliably determine the location of a puzzle piece in an image. The puzzle piece varies in both shape and how easy it is to find it. What algorithm(s) in the opencv module would help me with the task at hand? Or is what I'm trying to do beyond the scope of the module?
Example images below:
Update
The original title was "Detecting obscure shapes with Opencv Python". However I am interested in concepts of image-processing that would solve such a problem: How to find a pasted image inside the bigger image?
Assume the following:
The jigsaw shapes always have the same (rectangular) boundary size (i.e. a template-based searching method could work).
The jigsaw shape is not rotated to any angle (i.e. there will be straight(-ish) horizontal and vertical lines to find).
The jigsaw shape is always "pasted" into some other "original" image (i.e. a paste-detection method could work).
The solution can be OpenCV (as requested by the asker), but the core concepts should be applicable when using any appropriate software (i.e. anything that can loop through image pixels and process their values in order to achieve the described solution).
I myself use JavaScript, but of course I will understand that openCV.calcHist() becomes a histogram function in JS code. I have no problem translating a good explanation into code. I will consider OpenCV code as pseudo-code towards a working idea.
In my opinion the best approach for a canonical answer was suggested in the comments by Christoph, which is training a CNN:
Implement a generator for shapes of puzzle pieces.
Get a large set of natural images from the net.
Generate tons of sample images with puzzle pieces.
Train your model to detect those puzzle pieces.
Histogram of Largest Error
This is a rough concept of a possible algorithm.
The idea comes from an unfounded premise that seems plausible enough.
The premise is that adding the puzzle piece drastically changes the histogram of the image.
Let's assume that the puzzle piece is bounded by a 100px by 100px square.
We are going to use this square as a mask to mask out pixels that are used to calculate the histogram.
The algorithm is to find the placement of the square mask on the image such that the error between the histogram of the masked image and the original image is maximized.
There are many norms to experiment with to measure the error: You could start with the sum over the error components squared.
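A toy sketch of this search is given below. It is only one possible reading of the rough concept above, under several assumptions that the answer does not state: the image is 8-bit grayscale, the mask excludes the pixels under the square, histograms are normalised so they can be compared, and the function names and bin count are made up for illustration.

```python
def histogram(pixels, bins=8):
    """Normalised histogram of 0-255 values (fractions that sum to 1)."""
    if not pixels:
        raise ValueError("cannot build a histogram from zero pixels")
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    return [c / len(pixels) for c in counts]


def squared_error(h1, h2):
    """Sum over the squared differences of the histogram components."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))


def find_largest_error(gray, size, bins=8):
    """Slide a size x size mask over the image and return the (top, left)
    position whose removal changes the histogram the most, plus the error.

    The mask must be smaller than the image in at least one dimension.
    """
    h, w = len(gray), len(gray[0])
    full = histogram([p for row in gray for p in row], bins)
    best_err, best_pos = -1.0, None
    for top in range(h - size + 1):
        for left in range(w - size + 1):
            # Every pixel that is NOT covered by the square mask.
            rest = [gray[y][x] for y in range(h) for x in range(w)
                    if not (top <= y < top + size and left <= x < left + size)]
            err = squared_error(histogram(rest, bins), full)
            if err > best_err:
                best_err, best_pos = err, (top, left)
    return best_pos, best_err
```

This brute-force version recomputes the histogram for every placement, which is far too slow for real images; it is meant only to show the shape of the search. Whether the premise holds for natural photographs is untested, as noted above.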
I'll throw in my own attempt. It fails on the first image but works fine on the next two. I am open to other pixel-processing based techniques where possible.
I do not use OpenCV so the process is explained with words (and pictures). It is up to the reader to implement the solution in their own chosen programming language/tool.
Background:
I wondered if there was something inherent in pasted images (something maybe revealed by pixel processing or even by frequency domain analysis, eg: could a Fourier signal analysis help here?).
After some research I came across Error Level Analysis (or ELA). This page has a basic introduction for beginners.
Process: In 7 easy steps, this detects the location of a pasted puzzle piece.
(1) Take a provided cat picture and re-save 3 times as JPEG in this manner:
Save copy #1 as JPEG of quality setting 2.
Reload (to force a decode of) copy #1 then re-save copy #2 as JPEG of quality setting 5.
Reload (to force a decode of) copy #2 then re-save copy #3 as JPEG of quality setting 2.
(2) Do a difference blend-mode with the original cat picture as the base layer versus the re-saved copy #3 image. The image will be black, so we increase Levels.
(3) Increase Levels to make the ELA-detected area(s) more visible. Note: I recommend working in BT.709 or BT.601 grayscale at this point. It is not necessary, but it gives "cleaner" results when blurring later on.
(4) Alternate between applying a box blur to the image and increasing Levels, to the point where the islands disappear and a large blob remains.
(5) The blob itself is also emphasized with an increase of Levels.
(6) Finally, a Gaussian blur is used to smooth the selection area.
(7) Mark the blob area (draw an outline stroke) and compare to input image...

Algorithm to detect the change in visible luminosity in an image

I want a formula to detect/calculate the change in visible luminosity in a part of the image, provided I can calculate the RGB, HSV, HSL and CMYK color spaces.
E.g. in the above picture we notice that the left side of the image is brighter than the right side, which is in shade.
I have had a little think about this, and done some experiments in Photoshop, though you could just as well use ImageMagick which is free. Here is what I came up with.
Step 1 - Convert to Lab mode and discard the a and b channels since the Lightness channel holds most of the brightness information which, ultimately, is what we are looking for.
Step 2 - Stretch the contrast of the remaining L channel (using Levels) to accentuate the variation.
Step 3 - Perform a Gaussian blur on the image to remove local, high frequency variations in the image. I think I used 10-15 pixels radius.
Step 4 - Turn on the Histogram window and take a single row marquee and watch the histogram change as different rows are selected.
Step 5 - Look out for a strongly bimodal histogram (two distinct peaks) to identify the illumination variations.
This is not a complete, general-purpose solution, but it may hold some pointers and prompt people who know better to suggest improvements for you. Note that the method requires the image to have some areas of high uniformity, like the whitish horizontal bar across your input image. However, nearly any algorithm is going to have a hard time telling the difference between a sheet of white paper with a shadow of uneven light across it and the same sheet of paper with a grey sheet of paper laid on top of it.
In the images below, I have superimposed the histogram top right. In the first one, you can see the histogram is not narrow and bimodal because the dotted horizontal selection marquee is across the bar-code area of the image.
In the subsequent images, you can see a strong bimodal histogram because the dotted selection marquee is across a uniform area of image.
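Steps 4 and 5 can be roughly approximated in code by building a histogram for one row of the blurred lightness channel and counting its peaks. The sketch below is a simple heuristic, not what Photoshop does: the function names, the 16 bins, and the rule that a peak is a run of bins each holding at least 10% of the pixels are all assumptions chosen for illustration.

```python
def row_histogram(row, bins=16):
    """Count how many 0-255 lightness values of one image row fall in each bin."""
    counts = [0] * bins
    for v in row:
        counts[min(int(v) * bins // 256, bins - 1)] += 1
    return counts


def count_peaks(counts, min_fraction=0.1):
    """Count runs of adjacent bins that each hold at least
    min_fraction of all pixels. Two such runs suggest a bimodal row."""
    total = sum(counts)
    peaks, in_peak = 0, False
    for c in counts:
        strong = c >= min_fraction * total
        if strong and not in_peak:
            peaks += 1
        in_peak = strong
    return peaks


def is_bimodal(row, bins=16, min_fraction=0.1):
    """True when the row's histogram has exactly two distinct peaks."""
    return count_peaks(row_histogram(row, bins), min_fraction) == 2
```

A row that is half lit and half shaded reports two peaks, while a uniformly lit row reports one, and a busy row such as the bar-code area spreads its values too thinly to report any.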
The first problem is in "visible luminosity". It may mean one of several things. This discussion should be a good start. (Yes, it has incomplete and contradictory answers as well.)
Formula to determine brightness of RGB color
You should make sure you operate on a linear image, one which does not have any gamma correction applied to it. AFAIK Photoshop does not degamma and regamma images during filtering, which may produce erroneous results. It all depends on how accurate you need the results to be. Photoshop wants things to look good, not to be precise.
In principle you should first pick a formula to convert your RGB values to some luminosity value which fits your use. Then you have a single-channel image which you'll need to filter with a Gaussian filter, sliding average, or some other suitable filter. Unfortunately, this may require special tools as photoshop/gimp/etc. type programs tend to cut corners.
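As one possible instance of that pipeline, the sketch below removes the gamma encoding, converts to a luminance value, and smooths a row with a sliding average. It assumes the input is 8-bit sRGB and uses the Rec. 709 luminance weights; that is only one of the several formulas discussed in the linked question, and the function names are made up for illustration.

```python
def srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel value (result 0..1)."""
    c = c8 / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r, g, b):
    """Linear-light luminance with Rec. 709 weights (0 = black, 1 = white)."""
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))


def sliding_average(values, radius=1):
    """Average each value with its neighbours within `radius` positions.

    The window is simply shortened at both ends of the list.
    """
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius):i + radius + 1]
        out.append(sum(window) / len(window))
    return out
```

Applying `relative_luminance` to every pixel gives the single-channel image, and running `sliding_average` along rows and then columns is a crude stand-in for a proper two-dimensional filter.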
But then there is one thing you would probably like to consider. If you have an even brightness gradient across an image, the eye is happy and does not perceive it. Rather large differences go unnoticed if the contrast in the image is constant across the image. Unfortunately, the definition of contrast is not very meaningful if you do not know at least something about the content of the image. (If you have scanned/photographed documents, then the contrast is clearly between ink and paper.) In your sample image the brightness changes quite abruptly, which makes the change visible.
Just to show you how strange the human vision is in determining "brightness", see the classical checker shadow illusion:
http://en.wikipedia.org/wiki/Checker_shadow_illusion
So, my impression is that talking about the conversion formulae is probably the second or third step in the process of finding suitable image processing methods. The first step would be to try to define the problem in more detail. What do you want to accomplish?

Restoring an old manuscript with image processing

Say I have this old manuscript. What I am trying to do is process the manuscript so that all the characters in it can be perfectly recognized. What are the things I should keep in mind?
Are there any methods for approaching such a problem?
Please help, thank you.
Some graphics applications have macro recorders (e.g. Paint Shop Pro). They can record a sequence of operations applied to an image and store them as macro script. You can then run the macro in a batch process, in order to process all the images contained in a folder automatically. This might be a better option, than re-inventing the wheel.
I would start by playing around with the different functions manually, in order to see what they do to your image. There are an awful lot of things you can try: sharpening, smoothing, and removing noise with a lot of different methods and options. You can work on the contrast in many different ways (stretch, gamma correction, expand, and so on).
In addition, if your image has a yellowish background, then working on the red or green channel alone would probably lead to better results, because the blue channel has poor contrast in that case.
Do you mean that you want to make it easier for people to read the characters, or are you trying to improve image quality so that optical character recognition (OCR) software can read them?
I'd recommend that you select a specific goal for readability. For example, you might want readers to be able to read the text 20% faster if the image has been processed. If you're using OCR software to read the text, set a read rate you'd like to achieve. Having a concrete goal makes it easier to keep track of your progress.
The image processing book Digital Image Processing by Gonzalez and Woods (3rd edition) has a nice example showing how to convert an image like this to a black-on-white representation. Once you have black text on a white background, you can perform a few additional image processing steps to "clean up" the image and make it a little more readable.
Sample steps:
Convert the image to grayscale.
Apply a moving average threshold to the image. If the characters are usually about the same size in an image, then you shouldn't have much trouble selecting values for the two parameters of the moving average threshold algorithm.
Once the image has been converted to just black characters on a white background, try simple operations such as morphological "close" to fill in small gaps.
Present the original image and the cleaned image to adult readers, and time how long it takes for them to read each sample. This will give you some indication of the improvement in image quality.
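The moving average threshold step can be sketched as follows. This loosely follows the textbook description as I understand it and is simplified: the image is scanned line by line in a zigzag, a running mean of the last `n` pixels is kept, and a pixel becomes white (1) when it is brighter than `b` times that mean. The function name and the default values of `n` and `b` are assumptions; in practice `n` would be tuned to the stroke width of the characters.

```python
from collections import deque


def moving_average_threshold(gray, n=20, b=0.5):
    """Binarise a grayscale image with a running-mean threshold.

    n: number of recent pixels in the moving average.
    b: fraction of the local mean used as the threshold.
    Returns 1 for background (bright) and 0 for ink (dark).
    """
    out = [[0] * len(row) for row in gray]
    window = deque(maxlen=n)
    total = 0
    for y, row in enumerate(gray):
        # Zigzag: left to right on even rows, right to left on odd rows,
        # so the running mean never jumps across the image.
        if y % 2 == 0:
            xs = range(len(row))
        else:
            xs = range(len(row) - 1, -1, -1)
        for x in xs:
            z = row[x]
            if len(window) == n:
                total -= window[0]  # value about to drop out of the window
            window.append(z)
            total += z
            mean = total / len(window)
            out[y][x] = 1 if z > b * mean else 0
    return out
```

Because the threshold follows the local brightness, ink stays detectable even where the paper itself gets darker across the page.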
A technique called Stroke Width Transform has been discussed on SO previously. It can be used to extract character strokes from even very complex backgrounds. The SWT would be harder to implement, but could work for quite a wide variety of images:
Stroke Width Transform (SWT) implementation (Java, C#...)
The texture in the paper could present a problem for many algorithms. However, there are techniques for denoising images based on the Fast Fourier Transform (FFT), an algorithm that you can use to find 1D or 2D sinusoidal patterns in an image (e.g. grid patterns). About halfway down the following page you can see examples of FFT-based techniques for removing periodic noise:
http://www.fmwconcepts.com/misc_tests/FFT_tests/index.html
If you find a technique that works for the images you're testing, I'm sure a number of people would be interested to see the unprocessed and processed images.

How can I deblur an image in matlab?

I need to remove the blur from this image:
Image source: http://www.flickr.com/photos/63036721@N02/5733034767/
Any Ideas?
Although previous answers are right when they say that you can't recover lost information, you could investigate a little and make a few guesses.
I downloaded your image in what seems to be the original size (75x75) and you can see here a zoomed segment (one little square = one pixel)
It seems a pretty linear grayscale! Let's verify it by plotting the intensities of the central row. In Mathematica:
ListLinePlot[First /@ ImageData[i][[38]][[1 ;; 15]]]
So, it is effectively linear, starting at zero and ending at one.
So you may guess it was originally a B&W image, linearly blurred.
The easiest way to deblur that (not always giving good results, but enough in your case) is to binarize the image with a 0.5 threshold. Like this:
And this is a possible way. Just remember we are guessing a lot here!
HTH!
You cannot generally retrieve missing information.
If you know what it is an image of (in this case a Gaussian or Airy profile, so it is probably an out-of-focus image of a point source), you can determine the characteristics of the point.
Another technique is to try to determine the characteristics of the blurring, especially if you have many images from the same blurred system. Then iteratively create a possible source image, blur it by that convolution, and compare it to the blurred image.
This is the general technique used to make radio astronomy source maps (images), and it was used for the flawed Hubble Space Telescope images.
When working with images one of the most common things is to use a convolution filter. There is a "sharpen" filter that does what it can to remove blur from an image. An example of a sharpen filter can be found here:
http://www.panoramafactory.com/sharpness/sharpness.html
Some programs, like MATLAB, make convolution really easy: conv2(A,B)
And most photo editing programs have these filters under some name or another (usually "sharpen").
But keep in mind that filters can only do so much. In theory, the actual information has been lost by the blurring process and it is impossible to perfectly reconstruct the initial image (no matter what TV will lead you to believe).
In this case it seems like you have a very simple image with only black and white. Knowing this about your image you could always use a simple threshold. Set everything above a certain threshold to white, and everything below to black. Once again most photo editing software makes this really easy.
You cannot retrieve missing information, but under certain assumptions you can sharpen.
Try unsharp masking.
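For reference, unsharp masking adds back the difference between the image and a blurred copy of itself. The sketch below is a minimal pure-Python illustration, not MATLAB code: it uses a 3x3 box blur for brevity where a Gaussian blur is the more usual choice, and the function names and the `amount` parameter are assumptions.

```python
def box_blur(gray):
    """3x3 mean filter; the window is clipped at the image borders."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [gray[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def unsharp_mask(gray, amount=1.0):
    """Sharpen: original + amount * (original - blurred), clamped to 0..255."""
    blurred = box_blur(gray)
    return [[min(255, max(0, round(p + amount * (p - q))))
             for p, q in zip(row, blurred_row)]
            for row, blurred_row in zip(gray, blurred)]
```

The effect is to exaggerate the contrast on either side of an edge, which makes the image look sharper without recovering any information that the blur removed.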

Image Transformation

I've been reading a lot about image transformations/distortions on here and elsewhere, and they seem to fall into two categories: distortion of the pixels of the image while maintaining the original boundaries, or transformations like rotation, scaling, etc. What I would like to do is pretty different.
I would like to warp a rectangular image into a polygon. In particular, I want to warp an image into each one of the 50 United States. Simple mapping of the state and then cropping out parts of the image that don't fit in is not acceptable. These images have borders, and then someone's face contained inside of them. I did find this really cool paper on content-aware image resizing (paper), and while it would let me keep the focus of the images (the face) undistorted, it still maps to 4 corners. For my initial test, I don't care about warping the faces too.
Does anyone have suggestions? Research papers, code, Wikipedia pages, GIMP plugins, software tools, etc welcome.
See if multipoint distortion in ImageMagick can work for you.

Resources