I have a large collection of scanned images, and they are all somewhat skewed, with a white area around them.
So, these images have rectangles of colors, surrounded by a large white area. The problem is that these rectangles of color are not parallel to the image border.
I'm sure there must be a way to programmatically detect these rectangles of color, so that I can rotate the image (thus un-skewing it) and then crop it so that just the interesting part is left. I guess I'm not really sure what this process is called, so I am having trouble searching for a solution on Google.
Does anyone know of an approach that would get me started? Any libraries out there that I should look into? Or the name of an algorithm that would help?
I am planning on using Java for this project, but I haven't really started yet, so I am open to library suggestions in any language.
border detection
hough transform (if all rectangles on an image have the same skew)
rectangle contour detection (connected component contour, then minimum area bounding rectangle)
Alyn is a third party package to detect and fix skew in images containing text. It uses Canny Edge Detection and Hough Transform to find skew.
To detect the skew, just run
./skew_detect.py -i image.jpg
To correct the skew, run
./deskew.py -i image.jpg -o skew_corrected_image.jpg
You might also try scikit-image http://scikit-image.org/docs/dev/auto_examples/.
It's a great library for the hough transformation, but also has other methods like Radon transformation and geometric transformations for this kind of task.
This is a python library.
Related
How to detect such cracks that you can see in the attached images? I have tried some OpenCV algorithms like blob detection (cv::SimpleBlobDetector) but couldn't get any results.
It is a cropped image, the full image has some other features as well, so I am not sure thresholding can work because I have to get the bounding box of the detected crack. One way is to assign several (region of interest) ROI and try to detect within that ROI, but this crack doesn't appear at the same location in the image. Any idea?
Can this problem be solved with machine/deep learning (like object detection)? If I train a model with a crack dataset? Because the crack part of the image doesn't have lots of features so I am not sure this method will work. Please guide.
Thanks.
These cracks are difficult to detect because the image is noisy (presumably X-ray) and the contrast poor, so the signal-to-noise ratio is low.
I would try by applying a gaussian filter for denoising, but only in the horizontal direction, to preserve the horizontal edges. Then detection of the horizontal edges.
This is about what a Gabor filter does. You can try different orientations.
Use mathematical morphology operation.
By example Matlab code:
a=imread('in.png');
se=strel( 'disk', 7);
b = imgaussfilt(a,1.3);
c=b-imopen(b,se);
c=3*c;
d=imclearborder(c);
imwrite(d, 'out.png');
The shape and positions of all the polygons are known beforehand. The polygons are not overlapping and will be of different colors and shapes, and there could be quite many of them. The polygons are defined in floating point based coordinates and will be painted on top of a JPEG photo as annotation.
How could I create the resulting image file as fast as possible after I get to know which color I should give each polygon?
If it would save time I would like to perform as much as possible of the computations beforehand. All information regarding geometry and positions of the polygons are known in advance. The JPEG photo is also known in advance. The only information not known beforehand is the color of each polygon.
The JPEG photo has a size of 250x250 pixels, so that would also be the image size of the resulting rasterised image.
The computations will be done on a Linux computer with a standard graphics card, so OpenGL might be a viable option. I know there are also rasterisation libraries like Cairo that could be used to paint polygons. What I wonder is if I could take advantage of the fact that I know so much of the input in advance and use that to speed up the computation. The only thing missing is the color of each polygon.
Preferably I would like to find a solution that would only precompute things in the form of data files. In other words as soon as the polygon colors are known, the algorithm would load the other information from datafiles (JPEG file, polygon geometry file and/or possibly precomputed datafiles). Of course it would be faster to start the computation out with a "warm" state ready in the GPU/CPU/RAM but I'd like to avoid that. The choice of programming language is not so import, but could for instance be C++.
To give some more background information: The JavaScript library OpenSeadragon that is running in a web browser requests image tiles from a web server. The idea is that measurement points (i.e. the polygons) could be plotted on-the-fly on to pregenerated Zooming Images (DZI format) by the web server. So for one image tile the algorithm would only need to be run one time. The aim is low latency.
I am in the process of learning how to create a lens flare application. I've got most of the basic components figured out and now I'm moving on to the more complicated ones such as the glimmers / glints / spikeball as seen here: http://wiki.nuaj.net/images/e/e1/OpticalFlaresLensObjects.png
Or these: http://ak3.picdn.net/shutterstock/videos/1996229/preview/stock-footage-blue-flare-rotate.jpg
Some have suggested creating particles that emanate outwards from the center while fading out and either increasing or decreasing in size but I've tried this and there are just too many nested loops which makes performance awful.
Someone else suggested drawing a circular gradient from center white to radius black and using some algorithms to lighten and darken areas thus producing rays.
Does anyone have any ideas? I'm really stuck on this one.
I am using a limited compiler that is similar to C but I don't have any access to antialiasing, predefined shapes, etc. Everything has to be hand-coded.
Any help would be greatly appreciated!
I would create large circle selections, then use a radial gradient. Each side of the gradient is white, but one side has 100% alpha and the other 0%. Once you have used the gradient tool to draw that gradient inside the circle. Deselect it and use the transform tool to Skew or in a sense smash it. Then duplicate it several times and turn each one creating a spiral or circle holding Ctrl to constrain when needed. Then once those several layers are in the rotation or design that you want. Group them in a folder and then you can further effect them all at once with another transform or skew. WHen you use these real smal, they are like little stars. But you can do many different things when creating each one to make them different. Like making each one lower in opacity than the last etc...
I found a few examples of how to do lens-flare 'via code'. Ideally you'd want to do this as a post-process - meaning after you're done with your regular render, you process the image further.
Fragment shaders are apt for this step. The easiest version I found is this one. The basic idea is to
Identify really bright spots in your image and potentially down sample it.
Shoot rays from the fragment to the center of the image and sample some pixels along the way.
Accumalate the samples and apply further processing - chromatic distortion etc - on it.
And you get a whole range of options to play with.
Another more common alternative seems to be
Have a set of basic images (circles, hexes) and render them as a bunch of bright objects, along the path from the camera to the light(s).
Composite this image on top of the regular render of you scene.
The problem is in determining when to turn on lens flare, since it is dependant on whether a light is visible/occluded from a camera. GPU Gems comes to rescue, with better options.
A more serious, physically based implementation is listed in this paper. This is a real-time version of making lens-flares, but you need a hardware that can support both vertex and geometry shaders.
I have a target image to be searched for a curve along its edges and a template image that contains the curve. What I need to achieve is to find the best match of the curve in the template image within the target image, and based on the score, to find out whether there is a match or not. That also includes rotation and resizing of the curve. The target image can be the output of a Canny Edge detector if that makes things easier.
I am considering to use OpenCV (by using Python or Processing/Java or if those have limited access to the required functions then by using C) to make things practical and efficient, however could not find out if I can use any functions (or a combination of them) in OpenCV that are useable for doing this job. I have been reading through the OpenCV documentation and thought at first that Contours could do this job, however all the examples show closed shapes as opposed to my case where I need to match a open curve to a part of an edge.
So is there a way to do this either by using OpenCV or with any known code or algorithm that you would suggest?
Here are some images to illustrate the problem:
My first thought was Generalized Hough Transform. However I don't know any good implementation for that.
I would try SIFT or SURF first on the canny edge image. It usually is used to find 2d areas, not 1d contours, but if you take the minimum bounding box around your contour and use that as the search pattern, it should work.
OpenCV has an implementation for that:
Features2D + Homography to find a known object
A problem may be getting a good edge image, those black shapes in the back could be distracting.
Also see this Stackoverflow answer:
Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition
Basically, suppose that I have a fingerprint. I know the dimension of my image, and I know that the fingerprint is black on a white background or that it is green on a black background or something like that.
Is there a way to process only the parts that delimit the image, in this case, the fingerprint? What I'm trying to do is basically this:
1) Delimit fingerprint
2) Extract the important points to compare to other fingerprints
3) Find best match on a database of other fingerprints that had their points previously extracted
I already have methods for 2 and 3, so now I just would have to delimit the image.
Programming language would have to be Ruby, Java or C++. Ruby preferred, then Java, and God help me if I have to use C++. I don't have any experience with image processing, but I'd like to do this with multiple common formats such as jpg, gif, png, if possible.
I think that the best way to do it is applying a edge detection filter to your image.
There are may approaches as suggested by wikipedia (article), but noone of them is trivial because they work on gradients or kernels. You should check Canny Edge Detection that should be enough straight-forward to implement: tutorial.
In any case if you want to avoid going deep into implementation details you should use OpenCV that is a computer vision library able to do these things in a simple way. You can use it for sure in C++ and Java but I think that a wrapper for Ruby is offered too. This is a simple example using that library with Canny algorithm.
EDIT: actually my answer covers point 2-3, so I'm wondering what you mean by delimiting the image? Think about the fact that scaling or rotating must be considered too if you want to compare different fingerprints: you need a fuzzy comparator.. maybe you should work on the Fast Fouried Transform version of the image that can handle such things in a better way.
An easy approach could be using threshold, like:
Convert your image to grayscale - so you have fingerprint in white on black.
Find a threshold value that gets most of the fingerprint.
Use open operation (http://en.wikipedia.org/wiki/Mathematical_morphology) to remove noise.
(experiment with dilate a few times)
Find the center of gravity (x,y) of the image and the standard deviation (vx, vy).
In the box:
[x-2vx,y-2vy],
[x-2vx,y+2vy],
[x+2vx,y+2vy],
[x+2vx,y-2vy]
You will find 95.4% of the pixels
You could narrow the box down to find the actual max and min pixels in it, if you have many outliers.
Use the box to clip from the original image.
It is simple method that might work well for your situation :)