I have a figure that looks like this:
I want to find the coordinates of all intersections of three hexagons.
How can I do this? Should I use OpenCV?
I am still trying to think of a faster/better method, but I think the following should work:
threshold your image to pure blacks and whites
generate and save a list of all black pixels for later
label your image so that each white hexagon is effectively flood-filled with a unique color (or shade of grey) - some folks call this "labelling", some call it "Blob Analysis", some call it "Connected Component Analysis". Whatever it is called, you will get something like this:
Now look at each black pixel from the list you saved in the second step and count how many different colours other than black appear in the surrounding 9x9 or 15x15 area. If there are three, it is probably an intersection of the kind you are looking for.
Of course there are variations on this - you could implement a "minimum distance from other intersection" on top, for example. Or a "black line thinning first". Or a dilation of each blob to erode the black lines and make the three colours closer together. You could scale your image down (being careful to use NEAREST_NEIGHBOUR rather than interpolation) after labelling to reduce processing time - if important.
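The labelling and neighbourhood-counting steps above can be sketched in a few lines. This is only a minimal illustration, assuming the thresholded image is already available as a boolean NumPy array (`True` for white); the function names and the plain breadth-first labelling are my own choices, and a library routine for connected components would normally be faster.

```python
from collections import deque

import numpy as np


def label_regions(white):
    """Give every 4-connected white region its own label 1, 2, 3, ...; black stays 0."""
    labels = np.zeros(white.shape, dtype=np.int32)
    h, w = white.shape
    current = 0
    for sy in range(h):
        for sx in range(w):
            if not white[sy, sx] or labels[sy, sx]:
                continue
            current += 1
            labels[sy, sx] = current
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and white[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = current
                        queue.append((ny, nx))
    return labels


def junction_pixels(white, radius=4, wanted=3):
    """Return (row, col) of black pixels whose window contains `wanted` different labels.

    radius=4 corresponds to the 9x9 window mentioned above, radius=7 to 15x15.
    """
    labels = label_regions(white)
    hits = []
    for y, x in zip(*np.nonzero(~white)):
        window = labels[max(0, y - radius):y + radius + 1,
                        max(0, x - radius):x + radius + 1]
        if len(np.unique(window[window > 0])) == wanted:
            hits.append((int(y), int(x)))
    return hits
```

With a realistic window size, several neighbouring black pixels will usually be reported for each intersection, which is where the "minimum distance from other intersection" variation comes in.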
You can try to find these features using Harris corner detector.
Also check if findContours with analysis of result intersections could give you useful information.
I don't know much about image processing so please bear with me if this is not possible to implement.
I have several sets of aerial images of the same area originating from different sources. The pictures have been taken during different seasons, under different lighting conditions etc. Unfortunately some images look patchy and suffer from discolorations, or are partially obstructed by clouds, or are pixelated, as for example in picture1 and picture2.
I would like to take as an input several images of the same area and (by some kind of averaging them) produce 1 picture of improved quality. I know some C/C++ so I could use some image processing library.
Can anybody propose any image processing algorithm to achieve it or knows any research done in this field?
I would try a "color twist" transform, i.e. a 3x3 matrix applied to the RGB components. To implement it, you need to pick color samples in areas that are split by a border, on both sides. You should find three significantly different reference colors (hence six samples). This will allow you to write the nine linear equations that determine the matrix coefficients.
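As a rough sketch of how those nine equations can be solved, assuming the three sample pairs are given as rows of 3x3 arrays (the function names here are made up for illustration):

```python
import numpy as np


def color_twist(src_samples, dst_samples):
    """Solve for the 3x3 matrix M such that M @ src_i == dst_i for three sample pairs.

    src_samples: three RGB colours picked on the altered side of the border (one per row).
    dst_samples: the matching three colours picked on the good side.
    """
    src = np.asarray(src_samples, dtype=float)
    dst = np.asarray(dst_samples, dtype=float)
    # Row form of the nine equations: src @ M.T == dst
    return np.linalg.solve(src, dst).T


def apply_twist(image, matrix):
    """Apply the colour twist to every pixel of an H x W x 3 image."""
    out = image.astype(float) @ matrix.T
    return np.clip(out, 0, 255)
```

`np.linalg.solve` raises an error if the three source colours are linearly dependent, which is the numerical reason for picking significantly different reference colors.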
Then you will correct the altered areas by means of this color twist. As the geometry of these areas is intertwined with the field patches, I don't see a better way than contouring the regions by hand.
In the case of the second picture, the limits of the regions are blurred so that you will need to blur the region mask as well and perform blending.
In any case, don't expect a perfect repair of those problems as the transform might be nonlinear, and completely erasing the edges will be difficult. I also think that colors are so washed out at places that restoring them might create ugly artifacts.
For the sake of illustration, here is a quick attempt with Photoshop using manual HLS adjustment (less powerful than a color twist).
The first thing I thought of was a kernel matrix of sorts.
Do a first pass of the photo and use an edge detection algorithm to determine the borders between the photos - this should be fairly trivial, however you will need to eliminate any overlap/fading (looks like there's a bit in picture 2), you'll see why in a minute.
Do a second pass right along each border you've detected, and assume that the pixel on either side of the border should be the same color. Determine the difference between the red, green and blue values and average them along the entire length of the line, then divide it by two. The image with the lower red, green or blue value gets this new value added. The one with the higher red, green or blue value gets this value subtracted.
On either side of this line, every pixel should now be the exact same. You can remove one of these rows if you'd like, but if the lines don't run the length of the image this could cause size issues, and the line will likely not be very noticeable.
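A minimal sketch of that second pass, under the simplifying assumption that the border is a straight vertical line at a known column and that the correction is applied to the whole patch on each side (both are assumptions of this example, not something the real images guarantee):

```python
import numpy as np


def balance_across_border(image, border_col):
    """Shift both sides of a vertical border so they agree, on average, at the border.

    image: H x W x 3 array; border_col: first column of the right-hand patch.
    """
    img = image.astype(float)
    left = img[:, border_col - 1, :]    # last column of the left patch
    right = img[:, border_col, :]       # first column of the right patch
    # Average difference along the whole border, per channel, divided by two
    half_diff = (right - left).mean(axis=0) / 2.0
    img[:, :border_col, :] += half_diff   # the lower side is raised ...
    img[:, border_col:, :] -= half_diff   # ... and the higher side is lowered
    return np.clip(img, 0, 255)
```

If the left side is the brighter one, `half_diff` is negative and the signs work out the other way round.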
This could be made far more complicated by generating a filter by passing along this line - I'll leave that to you.
One issue could be places where there was development, fall colors etc. between the shots; this might mess with your algorithm, but there's only one way to find out!
Here is an example of binary images, i.e. as input we have an imageByteArray with 2 possible values: 0 and 255.
Example1:
Example2:
The image contains some document edge on a background.
The task is to remove, or at least reduce the number of, background pixels with minimal impact on the edge pixels.
The question is what modern algorithms, techniques exist to do this?
What I do not expect as an answer: use Gaussian blur to get rid of background noise, use bitonal algorithm (Canny, Sobel, etc.) thresholds or use Hough (Hough linearization goes crazy on such noise no matter what options are set)
The simplest solution is to detect all contours and filter out the ones with the lowest length. This works well, but sometimes, depending on the image, it also erases quite a lot of useful edge pixels.
Update:
As input I have a standard RGB image with a document (driver license ID, check, bill, credit card, ...) on some background. The main task is to detect the document edges. The next steps are well known: greyscale, blur, Sobel binarization, Hough probabilistic, find a rectangle or trapezium (if a trapezium shape is found, go on to perspective transformation). On simple contrasting backgrounds it all works fine. The reason I am asking about noise reduction is that I have to work with thousands of backgrounds, and some of them give noise no matter what options are used. The noise causes additional lines no matter how Hough is configured, and the additional lines may fool the subsequent logic and seriously affect performance. (It is implemented in JavaScript, with no OpenCV or GPU support.)
It's hard to know whether this approach will work with all your images since you only provided one, but a Hough Line detection with ImageMagick and these parameters in the Terminal command-line produces this:
convert card.jpg \
\( +clone -background none -fill red -stroke red \
-strokewidth 2 -hough-lines 49x49+100 -write lines.mvg \
\) -composite hough.png
and the file lines.mvg contains 4 lines as follows:
# Hough line transform: 49x49+100
viewbox 0 0 1024 765
line 168.14,0 141.425,765 # 215
line 0,155.493 1024,191.252 # 226
line 0,653.606 1024,671.48 # 266
line 940.741,0 927.388,765 # 158
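If you want to use those lines in your own code, the file is easy to parse. Here is a small sketch in Python that relies only on the layout shown above; I am assuming the trailing number after `#` is the accumulator count for the line, and `parse_mvg_lines` is just a name chosen for this example.

```python
import re

# Matches rows such as: line 168.14,0 141.425,765 # 215
LINE_RE = re.compile(
    r"^line\s+(-?[\d.]+),(-?[\d.]+)\s+(-?[\d.]+),(-?[\d.]+)\s+#\s+(\d+)"
)


def parse_mvg_lines(text):
    """Return a list of ((x1, y1), (x2, y2), count) tuples from lines.mvg content."""
    lines = []
    for row in text.splitlines():
        match = LINE_RE.match(row.strip())
        if match:  # comment and viewbox rows simply do not match
            x1, y1, x2, y2 = map(float, match.group(1, 2, 3, 4))
            lines.append(((x1, y1), (x2, y2), int(match.group(5))))
    return lines
```

Intersecting the two near-vertical lines with the two near-horizontal ones then gives four candidate corners of the card.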
ImageMagick is installed on most Linux distros and is available for OSX and Windows from here.
I assume you did mean binary image instead of bitonic...
Do flood fill based segmentation
scan image for set pixels (color=255)
for each set pixel create a mask/map of its area
Just flood fill set pixels with 4 or 8 neighbor connection and count how many pixels you filled.
for each filled area compute its bounding box
detect edge lines
edge lines have rectangular bounding box so test its aspect ratio if close to square then this is not edge line
also too small bounding box means not an edge line
a filled-pixel count that is too small compared with the longer side of the bounding box also means the area is not an edge line
You can make this more robust if you regress a line through the set pixels of each area and compute the average distance between the regressed line and each set pixel. If it is too high, the area is not an edge line ...
recolor the areas that are not edge lines to black
so either subtract the mask from the image or flood fill with black again ...
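A rough sketch of the flood-fill segmentation and the bounding-box tests, assuming the binary image is a NumPy array with values 0 and 255. The three thresholds are placeholders that would need tuning on real images, and the line-regression refinement is not included.

```python
from collections import deque

import numpy as np


def components(binary):
    """Yield the (row, col) pixels of each 8-connected group of set pixels."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    for sy, sx in zip(*np.nonzero(binary)):
        if seen[sy, sx]:
            continue
        seen[sy, sx] = True
        queue, pixels = deque([(sy, sx)]), []
        while queue:  # flood fill of one area
            y, x = queue.popleft()
            pixels.append((y, x))
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    if binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
        yield pixels


def keep_edge_like(binary, min_side=20, max_squareness=0.5, min_fill=0.8):
    """Return a copy of `binary` in which only elongated, well-filled areas survive."""
    out = np.zeros_like(binary)
    for pixels in components(binary):
        ys, xs = zip(*pixels)
        height = max(ys) - min(ys) + 1
        width = max(xs) - min(xs) + 1
        long_side, short_side = max(height, width), min(height, width)
        if long_side < min_side:                      # bounding box too small
            continue
        if short_side / long_side > max_squareness:   # bounding box close to square
            continue
        if len(pixels) < min_fill * long_side:        # too few pixels for its length
            continue
        for y, x in pixels:
            out[y, x] = binary[y, x]
    return out
```

Note that the aspect-ratio test rejects strongly diagonal edges, because their bounding box is nearly square; that is the case the regression test is meant to catch.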
[notes]
Sometimes step #5 can mess up the inside of the document. In that case do not recolor anything; instead, remember all the regressed lines for the edge areas. Then, after the whole process is done, join together all lines that are parallel and close to the same axis (infinite line); that should reduce them to 4 big lines determining the document rectangle. Now fill all outside pixels with black (by a geometric approach).
For such tasks you would usually examine the input data carefully and try to figure out what cues you can utilize. Unfortunately you have provided only one example, which makes this approach pretty useless. Besides, this representation is not really comfortable to work with: have you done some preprocessing, or is this what you get as input? In the first case, you may get better advice if you can show us the real input.
Next, if your goal is noise reduction and not document/background segmentation - you are really limited in options. Similar to what you said, I would try to detect connected components with 255 intensity (instead of detecting contours, which can be less robust) and remove ones with small area. That may fail on certain instances.
Besides, on image you have provided you can use local statistics to suppress areas of regular noise. This will reduce background clutter if you select neighborhood size appropriately.
But again, if you are doing this for document detection - there may be more robust approaches.
For example, if you know the foreground object (driver's ID) - you can try to collect a dataset of ID images, and calculate the 'typical' color histogram - it may be rather characteristic. After that, you can backproject this histogram on input image and get either rough region of interest, or maybe even precise mask. Then you may binarize it and try to detect contours. You may try different color spaces and bin sizes to see which fits best.
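To make the backprojection idea concrete, here is a small NumPy-only sketch. It assumes 8-bit RGB input and a coarse 8x8x8 histogram; the function names and the bin count are choices made for this example, and OpenCV has its own routines for the same job.

```python
import numpy as np


def color_histogram(pixels, bins=8):
    """Build a 3-D colour histogram, scaled so that its largest bin equals 1."""
    idx = (np.asarray(pixels).reshape(-1, 3).astype(int) * bins) // 256
    hist = np.zeros((bins, bins, bins), dtype=float)
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return hist / hist.max()


def backproject(image, hist):
    """Replace each pixel of an H x W x 3 image by the histogram value of its colour."""
    bins = hist.shape[0]
    idx = (image.astype(int) * bins) // 256
    return hist[idx[..., 0], idx[..., 1], idx[..., 2]]
```

The result is a per-pixel score between 0 and 1 that can be thresholded to get the rough region of interest mentioned above.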
If you have to work in different lighting conditions you can try to equalize histogram or do some other preprocessing to reduce color variation caused by lighting.
Strictly answering the question for the binary image (i.e. after the harm has been done):
What seems characteristic of the edge pixels as opposed to noise is that they form (relatively) long and smooth chains.
So far I see no better way than tracing all chains of 8-connected pixels, for instance with a contour following algorithm, and detect the straight sections, for example by Douglas-Peucker simplification.
As the noise is only on the outside of the card, the outline of the blobs will have at least one "clean" section. Keep the sections that are long enough.
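As an illustration of the simplification step, here is a plain-Python sketch of Douglas-Peucker applied to one traced chain, followed by a filter that keeps only the long segments. The chain is assumed to be an ordered list of `(x, y)` tuples produced by whatever contour follower you use; the tolerance and minimum length are arbitrary example values.

```python
import math


def _point_line_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def douglas_peucker(points, tolerance):
    """Simplify an ordered chain of points, keeping deviations within `tolerance`."""
    if len(points) < 3:
        return list(points)
    distances = [_point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [points[0], points[-1]]
    split = worst + 1  # index of the farthest point in `points`
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right


def long_sections(points, tolerance=1.5, min_length=30.0):
    """Return the straight sections of the simplified chain that are long enough."""
    simplified = douglas_peucker(points, tolerance)
    return [(a, b) for a, b in zip(simplified, simplified[1:])
            if math.hypot(b[0] - a[0], b[1] - a[1]) >= min_length]
```

A chain that is clean for 50 pixels and then wanders into noise yields one long section for the clean part, while the jittery tail breaks into short segments that are dropped.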
This may destroy the curved corners as well, so what you should actually look for are the "smooth" paths that are long enough.
Unfortunately, I cannot point to any specific algorithm that addresses that. It should probably be based on graph analysis combined with geometry (enumerating long paths in a graph and checking the local/global curvature).
As far as I know (after reading thousands of related articles), this is not addressed anywhere in the literature.
None of the previous answers would really work. The only thing that can work here is a blob filter: filter the image so that blobs below a certain size get deleted.
I am looking for algorithms to, given an image containing a crossword
crop the image to just the crossword
distinguish between regular and barred crosswords
extract the grid size and the positions of the black squares/bars
The crossword itself can be assumed to be regular (i.e. I am interested in crosswords that have been generated by some program and published as an image, rather than scanned paper-based crosswords), and I would like the program to run without needing any inputs other than the image bitmap.
I can think of some brute-force multi-pass ways to do this (essentially using variants of imagemagick's hit-and-miss filter and then looping over the image looking for leftover dots) but I'm hoping for better ideas from people who actually know about image processing.
This is a really broad question, but I will try to give you some pointers.
These are the steps you need to take:
Detect the position of the crossword.
Detect the grid of the crossword. For this, you will need some Computer Vision algorithm (for example the Hough lines detector).
For each cell, you need to find out whether it contains a character or not. To do so, simply analyze the "amount" of white the cell contains.
For the cells containing a character, you need to recognize it. To do so, you need an OCR engine, and I recommend Tesseract.
Create your own algorithm for solving the crosswords. You can use this.
And here (1,2,3) you have an example of a Sudoku Solver in Python. The first steps are common to your problem so you can use OpenCV to solve it like this:
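import cv2
import numpy as np

# Load the image, convert it to greyscale and binarise it
img = cv2.imread('sudoku.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
# Named constants for the original literals 1, 1:
# Gaussian adaptive method and inverted binary output
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 11, 2)

# Detect the contours in the thresholded image.
# findContours returns 2 or 3 values depending on the OpenCV version;
# the contour list is always the second-to-last one.
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]

# Detect the square of the Sudoku: the largest contour
# that simplifies to a polygon with 4 corners
biggest = None
max_area = 0
for contour in contours:
    area = cv2.contourArea(contour)
    if area > 100:
        peri = cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, 0.02 * peri, True)
        if area > max_area and len(approx) == 4:
            biggest = approx
            max_area = area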
Using a screenshot of the linked crossword as example, I assume that:
the crossword grid is crisp, i.e. the horizontal and vertical grid lines are drawn at exact pixels with a constant dark colour and that there is no noise inside the grid cells,
the crossword is black or another relatively dark colour ("black") on white or light grey ("white"),
the clue numbers are written in the top left corner,
the crossword is rectangular and regular.
You can then scan the image from top to bottom to find horizontal black lines of sufficient length. A line starts with a black pixel and ends with a white pixel. Other pixels are indicators that it is not a line. (This is to weed out text and buttons.) Do the same for vertical lines.
Ideally, you now have the crossword lines. If your image is not cropped to the crossword, you might have false positives, such as the button borders. To find the crossword lines, sort them by length and look for the largest contiguous block of the same length. These should be your crossword lines, unless you have some degenerate cases.
Now do a nested loop over the horizontal and vertical lines, but skip the first line. Look two or three pixels to the northwest of the intersection of the lines. If the pixel is dark, that's a blank. If it is light, it's a cell. This heuristic seems to work well. I say dark and light here because some crosswords use grey cells to save on ink when printing, and some cells are highlighted in the screenshot.
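The line scan can be sketched as follows, assuming the image has already been reduced to a boolean NumPy mask that is `True` for "black" pixels; the helper names are invented for this example, and picking the most frequent run length is my simplification of "largest block of the same length".

```python
import numpy as np


def horizontal_runs(dark, min_length):
    """Return (row, start_col, length) for each horizontal run of dark pixels."""
    runs = []
    for y, row in enumerate(dark):
        # Pad with False so that runs touching the image border are closed too
        padded = np.concatenate(([False], row, [False]))
        changes = np.flatnonzero(padded[1:] != padded[:-1])
        for start, stop in zip(changes[::2], changes[1::2]):
            if stop - start >= min_length:
                runs.append((y, int(start), int(stop - start)))
    return runs


def most_common_length(runs):
    """Length shared by the largest group of runs; these should be the grid lines."""
    lengths = [length for _, _, length in runs]
    return max(set(lengths), key=lengths.count)
```

Vertical lines can be found with the same function by passing the transposed mask, `dark.T`.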
If you end up with no blanks, you have a barred crossword. You can find the bars by checking whether one of the pixels to the left and right of a cell border is black.
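Here is a minimal sketch of the nested loop, assuming you already have the sorted pixel positions of the grid lines and a greyscale NumPy image. The 3-pixel offset and the darkness threshold of 128 are example values only.

```python
import numpy as np


def read_grid(gray, line_rows, line_cols, offset=3, dark_below=128):
    """Probe each cell a few pixels northwest of its bottom-right corner.

    line_rows / line_cols: sorted positions of the horizontal / vertical grid lines.
    Returns a boolean array in which True marks a blank (black square).
    """
    grid = []
    for y in line_rows[1:]:          # skip the first horizontal line
        grid.append([gray[y - offset, x - offset] < dark_below
                     for x in line_cols[1:]])   # skip the first vertical line
    return np.array(grid, dtype=bool)


def is_barred(grid):
    """A grid without any blanks is taken to be a barred crossword."""
    return not grid.any()
```

Probing near the bottom-right corner of each cell keeps the sample away from the clue numbers in the top-left corner.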
Lastly, a tip: If you want to use your algorithm to find the cells in a crossword generated with the Crossword Compiler, look at the source. You will find a link to a JavaScript file /puzzles/sample/cryptic_demo/cryptic_demo_xml.js, which contains the crossword as an XML string, which also gives you the clues as a bonus.
Older versions of the Crossword Compiler, such as the one used for the Independent Cryptic, hide their data in a file loaded from an applet. The format of that file is binary, but not too hard to read if you know the original data.
Try the Hough transform to find the squares. Once you have the squares, use a histogram with a threshold on the gray scale values to check whether each one is a dark or a white square.
Thinking of an alternative way to do this.
This is similar in many respects to object recognition in computer vision.
One way would be to use a framework like OpenCV which, once trained with some samples of what you want to detect, can detect similar objects
(HAAR.js, of which I am the author, is a JavaScript library for object detection based on the Viola-Jones algorithm, which OpenCV also uses).
Apart from this (or a similar alternative), it is possible to construct a "visual" template of the crossword you want to detect (in a scale-invariant way)
and scan the images looking for correlations between parts of the image and the template (complexity O(N*M), where N is the size of the image and M the size of the template).
Since crossword grids have relatively constant shapes (especially the fixed outputs of crossword compilers), it should be relatively easy to create a prototype template and to match (and align) the detected regions to extract the shape information.
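The template scan can be illustrated with a brute-force normalised cross-correlation. This sketch works at a single scale only, so the scale-invariant part is not covered, and `match_template` is simply a name picked for the example.

```python
import numpy as np


def match_template(image, template):
    """Slide the template over the image and return ((row, col), score) of the best match.

    The score is the normalised cross-correlation, between -1 and 1.
    The cost is proportional to image size times template size.
    """
    img = image.astype(float)
    tpl = template.astype(float)
    tpl = tpl - tpl.mean()
    th, tw = tpl.shape
    best_score, best_pos = -np.inf, (0, 0)
    for y in range(img.shape[0] - th + 1):
        for x in range(img.shape[1] - tw + 1):
            patch = img[y:y + th, x:x + tw]
            patch = patch - patch.mean()
            denom = np.sqrt((patch ** 2).sum() * (tpl ** 2).sum())
            score = (patch * tpl).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score
```

To approximate scale invariance you could run the same search on a few resized copies of the template and keep the best score.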
I've already asked this question on https://dsp.stackexchange.com/ but didn't get any answers, so I hope to get some suggestions here:
I have a project in which I have to recognize 2 lines in different "position", the lines are orthogonal but can be projected on different surfaces. I'm using opencv.
The intersection can be anywhere on the frame. The lines are red (the images show just the gray scale).
UPDATE
- I'll be using a gray scale camera!
- The background and the objects on which the lines will be projected can change.
I'm not asking for code, only for hints about how I can solve this. I tried the houghlines function, but it works only for straight surfaces.
Thanks in advance!
This is not that difficult a task, as it involves straight lines. I have done a similar kind of project.
1. First of all, if your image is coloured, convert it to gray scale.
2. Then use a calibrated median filter to blur the image.
3. Now subtract the blurred image from the gray scale image.
4. After step 3, if you look at the image, you will see that the intensity at the lines is higher than in the other parts of the image: the lines are contrasted, so when we apply the median filter the subtracted value there is larger than in the rest of the image.
5. To get a cleaner distinction, create a binary image, i.e. only black and white, using a particular threshold.
6. Finally you have your lines. If there is noise, you can use top-hat filtering after step 4 and Gaussian filtering after step 5.
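The median-subtraction and thresholding steps above can be sketched with NumPy alone. The radius and threshold below are placeholder values, and the stacked-shift median is written for clarity rather than speed or memory use; a library median filter would be the practical choice.

```python
import numpy as np


def median_filter(gray, radius):
    """Median over a (2*radius+1) square window, with edge replication at the borders."""
    padded = np.pad(gray.astype(float), radius, mode="edge")
    h, w = gray.shape
    size = 2 * radius + 1
    # One shifted copy of the image per window position, then a per-pixel median
    shifts = [padded[dy:dy + h, dx:dx + w]
              for dy in range(size) for dx in range(size)]
    return np.median(np.stack(shifts), axis=0)


def bright_line_mask(gray, radius=6, threshold=20):
    """Subtract the median-blurred image from the original and threshold the residue."""
    residue = gray.astype(float) - median_filter(gray, radius)
    return residue > threshold
```

The window has to be clearly wider than the projected line; otherwise the median follows the line and the residue vanishes.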
You can take help from this paper on crack detection
I think AMI's idea is good.
You can also think about using a controlled laser source. In that case you can capture an image pair, one with the laser turned on and one with it turned off, and then find the difference.
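A tiny sketch of that difference, assuming two aligned 8-bit gray scale frames as NumPy arrays; the threshold is an arbitrary example value.

```python
import numpy as np


def laser_mask(frame_on, frame_off, threshold=30):
    """Mark pixels that became clearly brighter when the laser was switched on."""
    # Widen to a signed type first: subtracting uint8 arrays directly wraps around
    diff = frame_on.astype(np.int16) - frame_off.astype(np.int16)
    return diff > threshold
```

This only works if nothing else in the scene moves or changes brightness between the two frames.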
It can be interesting for you: http://www.instructables.com/id/3-D-Laser-Scanner/
Here's the result of subtracting the output of a median filter (r=6):
You might be able to improve things a bit by adjusting the median filter radius, but these wavy, discontinuous lines are going to be difficult to detect reliably.
You really need better source images. Here are a few suggestions:
A colour camera would help enormously. Apply a high-pass filter to the red and green channels, and calculate the difference between the two. The red lines will stand out much better then.
Can you make the light source brighter?
Have you tried putting a red filter over the camera lens? Ideally you want one with a pass band that matches the light source's wavelength as closely as possible — if the light is coming from a laser, then a suitable dichroic filter should give good results. But even a sheet of red plastic would be better than nothing. (Have you got an old pair of red/blue 3D glasses sitting around somewhere?)
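For the colour-camera suggestion, the high-pass and channel difference could look roughly like this. The sketch assumes an H x W x 3 array in RGB order and uses "channel minus box blur" as a simple high-pass filter; both the filter and the radius are my own choices for illustration.

```python
import numpy as np


def box_blur(channel, radius):
    """Mean over a (2*radius+1) square window, with edge replication at the borders."""
    padded = np.pad(channel.astype(float), radius, mode="edge")
    h, w = channel.shape
    size = 2 * radius + 1
    total = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / size ** 2


def red_line_response(rgb, radius=5):
    """High-pass the red and green channels and return their difference."""
    red, green = rgb[..., 0], rgb[..., 1]
    high_red = red - box_blur(red, radius)
    high_green = green - box_blur(green, radius)
    return high_red - high_green
```

White or grey detail appears equally in both channels and cancels out, while fine red detail stays strongly positive.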
Perhaps subtracting the grayscale image from the red channel would help to highlight the red. I'd post this as a comment but cannot do so yet.
I'm working on a project to recognize a bit code from an image like this, where a black rectangle represents a 0 bit and white (white space, not visible) a 1 bit.
Does anybody have an idea how to process the image in order to extract this information? My project is written in Java, but any solution is accepted.
thanks all for support.
I'm not an expert in image processing. I tried to apply edge detection using a Canny Edge Detector implementation (a free Java implementation, found here). I used this complete image [http://img257.imageshack.us/img257/5323/colorimg.png], reduced it (scale factor = 0.4) for faster processing, and this is the result [http://img222.imageshack.us/img222/8255/colorimgout.png]. Now, how can I decode a white rectangle as bit value 0, and no rectangle as 1?
The image has 10 rows x 16 columns. I don't use Python, but I can try to convert it to Java.
Many thanks to support.
This is recognising good old OMR (optical mark recognition).
The solution varies depending on the quality and consistency of the data you get, so noise is important.
Using an image processing library will clearly help.
Simple case: No skew in the image and no stretch or shrinkage
Create a horizontal and vertical profile of the image. i.e. sum up values in all columns and all rows and store in arrays. for an image of MxN (width x height) you will have M cells in horizontal profile and N cells in vertical profile.
Use a thresholding to find out which cells are white (empty) and which are black. This assumes you will get at least a couple of entries in each row or column. So black cells will define a location of interest (where you will expect the marks).
Based on this, you can define the lozenges in the form and get their coordinates (the rectangles where you expect marks). Then you just add up the pixel values in each lozenge and, based on the total, decide whether it has a mark or not.
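The profile idea for the simple case can be sketched like this in Python with NumPy, assuming the image has already been thresholded into a boolean mask (`True` for dark pixels) with no skew. The function names and the fill ratio are illustrative only.

```python
import numpy as np


def mark_bands(dark, axis, min_count=1):
    """Return (start, stop) index ranges where the profile along `axis` is non-empty."""
    profile = dark.sum(axis=axis)
    active = np.concatenate(([False], profile >= min_count, [False]))
    changes = np.flatnonzero(active[1:] != active[:-1])
    return [(int(a), int(b)) for a, b in zip(changes[::2], changes[1::2])]


def read_marks(dark, fill_ratio=0.5):
    """Locate the lozenges from the two profiles and test each one for a mark."""
    col_bands = mark_bands(dark, axis=0)   # summing over rows: one value per column
    row_bands = mark_bands(dark, axis=1)   # summing over columns: one value per row
    marks = np.zeros((len(row_bands), len(col_bands)), dtype=bool)
    for i, (r0, r1) in enumerate(row_bands):
        for j, (c0, c1) in enumerate(col_bands):
            marks[i, j] = dark[r0:r1, c0:c1].mean() >= fill_ratio
    return marks
```

As stated above, this relies on every row and every column containing at least one mark; an empty row or column would simply be missing from the result.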
Case 2: Skew (slant in the image)
Use a Fourier transform (FFT) to find the slant angle and then rotate the image to correct it.
Case 3: Stretch or shrink
Pretty much the same as 1 but noise is higher and reliability less.
Aliostad has made some good comments.
This is OMR and you will find it much easier to get good consistent results with a good image processing library. www.leptonica.com is a free open source 'C' library that would be a very good place to start. It could process the skew and thresholding tasks for you. Thresholding to B/W would be a good start.
Another option would be IEvolution - http://www.hi-components.com/nievolution.asp for .NET.
To be successful you will need some type of reference / registration marks to allow for skew and stretch especially if you are using document scanning or capturing from a camera image.
I am not familiar with Java, but in Python, you can use the imaging library to open the image. Then load the height and the widths, and segment the image into a grid accordingly, by Height/Rows and Width/Cols. Then, just look for black pixels in those regions, or whatever color PIL registers that black to be. This obviously relies on the grid like nature of the data.
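A short sketch of that grid segmentation, written with NumPy rather than PIL so that it stays self-contained. It assumes the image is already cropped to the code area and loaded as a gray scale array; the 10 x 16 layout comes from the question, while the two thresholds are guesses that would need checking against real images.

```python
import numpy as np


def decode_bits(gray, rows=10, cols=16, dark_below=128, min_dark_fraction=0.3):
    """Split the image into rows x cols cells; a dark rectangle means bit 0, else 1."""
    h, w = gray.shape
    row_edges = np.linspace(0, h, rows + 1).astype(int)
    col_edges = np.linspace(0, w, cols + 1).astype(int)
    bits = np.ones((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            cell = gray[row_edges[i]:row_edges[i + 1],
                        col_edges[j]:col_edges[j + 1]]
            # Fraction of the cell's pixels that are darker than the threshold
            if (cell < dark_below).mean() >= min_dark_fraction:
                bits[i, j] = 0
    return bits
```

The nested loops translate directly to Java once the pixel data is available as a two-dimensional array.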
Edit:
Doing edge detection may also be fruitful. First apply an edge detection method, such as one of those described on Wikipedia. I have used the one found at archive.alwaysmovefast.com/basic-edge-detection-in-python.html. Then convert any grayscale value less than 180 (if you want the boxes darker, just increase this value) into black, and otherwise make it completely white. Then create bounding boxes, i.e. lines where the pixels are all white. If the data isn't terribly skewed, this should work pretty well; otherwise you may need to do more work. See here for the results: http://imm.io/2BLd
Edit2:
Denis, how large is your dataset and how large are the images? If you have thousands of these images, then it is not feasible to manually remove the borders (the red background and yellow bars). I think this is important to know before proceeding. Also, I think Prewitt edge detection may prove more useful in this case, since there appears to be less noise:
The previous method of segmenting may be applied, if you do preprocess to bin in the following manner, in which case you need only count the number of black or white pixels and threshold after some training samples.