Is there a way to align two scanned multiple-choice paper images in MATLAB, perhaps by using a barcode or a specific shape to auto-align them, so that I can compare and find the differences in the answers between the two papers?
This is the reference image.
This is the image I want to alter so that it overlays the reference image.
Your problem is sometimes called "rigid image registration", and there are many packages that address it. Check out, for example, the following on MATLAB Central:
http://www.mathworks.com/matlabcentral/fileexchange/19086-automatic-2d-rigid-body-image-registration
"rigid" is here as opposed to "elastic", so only translation and rotation, but no stretching is considered, since you work with paper.
Related
I'm trying to find a way to reliably determine the location of a puzzle piece in an image. The puzzle piece varies both in shape and in how easy it is to find. What algorithm(s) in the OpenCV module would help me with this task? Or is what I'm trying to do beyond the scope of the module?
Example images below:
Update
The original title was "Detecting obscure shapes with Opencv Python". However, I am interested in the image-processing concepts that would solve such a problem: how do you find a pasted image inside a bigger image?
Assume the following:
The jigsaw shapes are always of the same (rectangular) boundary size (ie: a template-based searching method could work).
The jigsaw shape is not rotated to any angle (ie: there will be straight(-ish) horizontal and vertical lines to find).
The jigsaw shape is always "pasted" into some other "original" image (ie: a paste-detection method could work).
The solution can be OpenCV (as requested by the asker), but the core concepts should be applicable when using any appropriate software (ie: can loop through image pixels to process their values, in order to achieve the described solution).
I myself use JavaScript, but of course I understand that OpenCV's calcHist() becomes a histogram function in JS code. I have no problem translating a good explanation into code. I will treat OpenCV code as pseudo-code towards a working idea.
In my opinion the best approach for a canonical answer was suggested in the comments by Christoph, which is training a CNN:
Implement a generator for shapes of puzzle pieces.
Get a large set of natural images from the net.
Generate tons of sample images with puzzle pieces.
Train your model to detect those puzzle pieces.
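As a concrete illustration of the data-generation steps above, here is a hedged Python sketch of a sample generator: it pastes a puzzle-piece-shaped crop of one image into another and returns the ground-truth bounding box. The function and its arguments are my own invention, not part of any particular library:

```python
import random
import numpy as np

def make_sample(background, donor, mask):
    """Paste a puzzle-piece-shaped crop of `donor` into `background`
    at a random position; return the image and the ground-truth box.
    `mask` is a binary (0/255) puzzle-piece silhouette, e.g. 100x100.
    Assumes both images are larger than the mask."""
    h, w = mask.shape
    bh, bw = background.shape[:2]
    x = random.randint(0, bw - w)
    y = random.randint(0, bh - h)

    # Take a same-sized patch from a random spot in the donor image.
    dx = random.randint(0, donor.shape[1] - w)
    dy = random.randint(0, donor.shape[0] - h)
    patch = donor[dy:dy + h, dx:dx + w]

    out = background.copy()
    region = out[y:y + h, x:x + w]
    m = mask > 0
    region[m] = patch[m]          # composite through the silhouette
    return out, (x, y, w, h)      # image plus bounding-box label
```

Run this over thousands of background/donor pairs with randomized masks and you have labeled training data for a detector.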
Histogram of Largest Error
This is a rough concept of a possible algorithm.
The idea comes from an unfounded premise that seems plausible enough.
The premise is that adding the puzzle piece drastically changes the histogram of the image.
Let's assume that the puzzle piece is bounded by a 100px by 100px square.
We are going to use this square as a mask to mask out pixels that are used to calculate the histogram.
The algorithm is to find the placement of the square mask on the image such that the error between the histogram of the masked image and the original image is maximized.
There are many norms to experiment with for measuring the error: you could start with the sum of the squared error components.
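For concreteness, here is one way the search could look in Python/NumPy. The 100 px mask size comes from the assumption above; the stride, bin count, and use of normalized histograms are choices of mine:

```python
import numpy as np

def find_pasted_region(gray, size=100, stride=10, bins=64):
    """Slide a size x size mask over `gray`; return the placement whose
    removal changes the (normalized) histogram of the image the most."""
    full = np.histogram(gray, bins=bins, range=(0, 256))[0].astype(float)
    n = gray.size
    best, best_xy = -1.0, (0, 0)
    for y in range(0, gray.shape[0] - size + 1, stride):
        for x in range(0, gray.shape[1] - size + 1, stride):
            win = gray[y:y + size, x:x + size]
            wh = np.histogram(win, bins=bins, range=(0, 256))[0].astype(float)
            # Histogram of the masked image = full minus window.
            masked = (full - wh) / (n - size * size)
            err = np.sum((masked - full / n) ** 2)   # squared-error norm
            if err > best:
                best, best_xy = err, (x, y)
    return best_xy, best
```

Because the masked histogram is just the full histogram minus the window's histogram, only one small histogram needs to be computed per placement.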
I'll throw in my own attempt. It fails on the first image and only works fine on the next two. I am open to other pixel-processing based techniques where possible.
I do not use OpenCV so the process is explained with words (and pictures). It is up to the reader to implement the solution in their own chosen programming language/tool.
Background:
I wondered if there was something inherent in pasted images (something maybe revealed by pixel processing or even by frequency domain analysis, eg: could a Fourier signal analysis help here?).
After some research I came across Error Level Analysis (or ELA). This page has a basic introduction for beginners.
Process: In 7 easy steps, this detects the location of a pasted puzzle piece.
(1) Take a provided cat picture and re-save 3 times as JPEG in this manner:
Save copy #1 as JPEG of quality setting 2.
Reload (to force a decode of) copy #1 then re-save copy #2 as JPEG of quality setting 5.
Reload (to force a decode of) copy #2 then re-save copy #3 as JPEG of quality setting 2.
(2) Do a difference blend-mode with the original cat picture as the base layer versus the re-saved copy #3 image. The resulting image will be mostly black, so we increase Levels.
(3) Increase Levels to make the ELA-detected area(s) more visible. Note: I recommend working in BT.709 or BT.601 grayscale at this point. It is not necessary, but it gives "cleaner" results when blurring later on.
(4) Alternate between applying a box blur to the image and increasing Levels, to the point where the small islands disappear and one large blob remains.
(5) The blob itself is also emphasized with an increase of Levels.
(6) Finally, a Gaussian blur is used to smooth the selection area.
(7) Mark the blob area (draw an outline stroke) and compare to the input image.
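In case a code outline helps, here is a rough Python translation of the seven steps using Pillow and OpenCV. The input file name, blur radii, and threshold are placeholders, and "increase Levels" is approximated with a simple min-max contrast stretch:

```python
import io
import cv2
import numpy as np
from PIL import Image

def resave(img, quality):
    """Encode to JPEG at the given quality and decode again (step 1)."""
    buf = io.BytesIO()
    img.save(buf, "JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

original = Image.open("cat.jpg").convert("RGB")   # placeholder input
copy3 = resave(resave(resave(original, 2), 5), 2) # qualities 2, 5, 2

# Step 2: difference blend of the original against copy #3.
diff = cv2.absdiff(np.array(original), np.array(copy3))

# Step 3: grayscale, then "increase Levels" (here: contrast stretch).
gray = cv2.cvtColor(diff, cv2.COLOR_RGB2GRAY)
gray = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)

# Step 4: alternate box blur and Levels until small islands vanish.
for _ in range(4):
    gray = cv2.blur(gray, (15, 15))
    gray = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)

# Steps 5-6: emphasize the blob, then smooth it with a Gaussian blur.
gray = cv2.GaussianBlur(gray, (31, 31), 0)
_, blob = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)

# Step 7: outline the largest blob on the input image.
contours, _ = cv2.findContours(blob, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
out = np.array(original).copy()
if contours:
    c = max(contours, key=cv2.contourArea)
    cv2.drawContours(out, [c], -1, (0, 255, 0), 3)
cv2.imwrite("marked.png", cv2.cvtColor(out, cv2.COLOR_RGB2BGR))
```

The 2/5/2 quality settings mirror step (1); everything after that is just the blur/stretch loop described above, with the iteration counts tuned by eye.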
I have about 6000 aerial images taken by a 3DR drone over vegetation sites.
The images have to overlap to some extent because the drone flights cover the area going east-west and then again north-south, so the images show the same area from two directions. I need the overlap between the images for extra accuracy.
I don't know how to write code in IDL to combine the images and create that overlap. Can anyone help, please?
Thanks
What you need is something identifiable that occurs in both images. Preferably you would have several such features across the field of view, so that you could recover the correct rotation as well as a simple x-y shift.
The basic steps you will need to follow are:
Source identification. Identify sources in all images that will later be used to align the images. Make sure the centering of these sources is good so that they will align better later.
Basic alignment. Start with a guess of where the images should align, then try to match the sources.
Match the sources. There are several libraries that do this for stars (in astronomical images) that could be adapted for this.
Shift and rotate the images. This can be done to the pixels themselves, or recorded in the header so that a program can manipulate the pixels on the fly.
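The question asks about IDL, which I can't write from memory, but as a sketch of the shift-estimation step, here is how a pure translation between two overlapping frames could be estimated with phase correlation in Python/OpenCV. File names are placeholders, and note that phase correlation alone gives only translation; rotation would still need matched sources or a log-polar variant:

```python
import cv2
import numpy as np

def estimate_shift(img_a, img_b):
    """Estimate the x-y translation between two overlapping frames
    via phase correlation (a stand-in for the source-matching step)."""
    a = np.float32(cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY))
    b = np.float32(cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY))
    (dx, dy), response = cv2.phaseCorrelate(a, b)
    return dx, dy, response   # response ~ confidence of the match

# Placeholder pair of overlapping frames from the survey.
a = cv2.imread("frame_0001.png")
b = cv2.imread("frame_0002.png")
dx, dy, conf = estimate_shift(a, b)
print(f"shift: ({dx:.1f}, {dy:.1f}) px, confidence {conf:.2f}")
```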
My use case is to detect the angle by which a black-and-white scanned image is rotated. Most image-processing algorithms I find online first do a bitwise NOT on the image so that the background is black and the objects are white.
My question is: what is the reason behind this? I cannot find an explanation of why this is the convention in image processing.
Thank you.
I have come across two reasons for doing this inversion:
1) An inverted image sometimes gives a better interpretation (eg. in medical image analysis)
2) When the background region of the captured image (black in colour) is needed for segmentation, as mentioned in this problem: Problems with line detection in Emgu CV
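A third, very practical reason: many library routines treat nonzero (white) pixels as the foreground. A minimal illustration in Python/OpenCV (the file name is a placeholder):

```python
import cv2

# Scanned documents are dark ink on a white background, but routines
# such as cv2.findContours treat nonzero (white) pixels as foreground,
# so the binarized image is inverted before analysis.
gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)   # placeholder scan
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
inverted = cv2.bitwise_not(bw)   # background black, objects white
```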
To represent the most popular artists from the EchoNest API, I've been trying to set up the Silverlight Toolkit's TreeMap using images, with their TreeItemDefinition.ValueBinding defined as the area of the image.
While it mostly fills up the space when the image stretch is set to 'Fill':
When setting the image stretch to 'Uniform', a lot of blank space remains:
In this post, image carving is suggested: Treemapping with a given aspect ratio
How can I know which images should be carved, and at what dimensions they should be carved, if that is possible at all?
Is this problem solvable without human intervention for a good result ?
I don't think there is a way to know in advance which images should be carved and at what dimensions. An ok-ish heuristic might be to check whether the mean energy of an image is above a certain threshold (this can be refined to check only blocks of every image and combine the results later: if an image has blocks without detail/energy, it can be carved, at least in that section).
What I think would work better is to apply seam carving to the already-composed image: that will try to carve out the white outlines (adding "artificial" energy to the patches of the images might lead to even better results, preserving the shapes of each image more). This paper might be worth checking out for other image-resizing methods too.
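As a sketch of that block-energy heuristic in Python/OpenCV (the block size and threshold are arbitrary guesses of mine, to be tuned per dataset):

```python
import cv2
import numpy as np

def carvable_blocks(img, block=32, threshold=20.0):
    """Heuristic from above: mark low-energy blocks as carvable.
    Energy = gradient magnitude; `threshold` is an arbitrary guess."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    energy = np.sqrt(gx * gx + gy * gy)

    h, w = gray.shape
    flags = []
    for y in range(0, h - block + 1, block):
        row = []
        for x in range(0, w - block + 1, block):
            row.append(energy[y:y + block, x:x + block].mean() < threshold)
        flags.append(row)
    return np.array(flags)   # True = low detail, safe to carve
```

The same energy map is what a seam-carving pass over the composed treemap would minimize along its seams.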
I'm trying to align several images in MATLAB, and I'm having trouble getting MATLAB to align them properly. I want to align them so that I can overlay them to make a larger image by stitching/averaging them together. I posted several of the images here. While it is not difficult to align 5 images manually, I want to make a script to do it so that I do not have to align hundreds of similar images manually as well.
Over the past few days I've tried several ways of getting this to work. My thought was that if I could filter the images enough, I could make a mask for the letters, and then it would be easy to align them, but I haven't been able to make it work.
I've tried using local adaptive thresholding to compensate for the nonuniform brightness level across the picture, but it hasn't allowed me to align the images properly. For the actual image alignment I've been using imregister() and normxcorr2(), but neither properly aligns the images.
I don't think that this should be that difficult to do but I haven't been able to do it. Any thoughts or insight would be greatly appreciated.
Edit: uploaded images after performing various operations here
I would do some feature detection, followed by the RANSAC algorithm to find the transformation between any pair of images.
See this demo on the MathWorks website: http://www.mathworks.com/videos/feature-detection-extraction-and-matching-with-ransac-73589.html
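The demo is in MATLAB; for comparison, here is roughly what the same feature-detection + RANSAC pipeline looks like in Python/OpenCV. ORB features and a homography are my choices here; the demo's exact detector and transformation model may differ:

```python
import cv2
import numpy as np

def register_pair(fixed, moving):
    """Feature detection + RANSAC, as suggested above (OpenCV sketch)."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(fixed, None)
    kp2, des2 = orb.detectAndCompute(moving, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

    src = np.float32([kp2[m.trainIdx].pt for m in matches])
    dst = np.float32([kp1[m.queryIdx].pt for m in matches])

    # RANSAC discards outlier matches while fitting the transformation.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = fixed.shape[:2]
    return cv2.warpPerspective(moving, H, (w, h))
```

Registering each image against a common reference this way, then averaging the warped results, would give the stitched/averaged composite described in the question.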