Fast algorithm required to detect changes in pixels - image

I am looking for an algorithm to detect changes in a pixel array. In the picture below, the red line illustrates the pixel array. I want the algorithm to recognize and trigger once the golf club below passes through the red line.
My original thought was to read the pixels along the line and to detect changes in color (specifically on the lower half of the line) when the color changes above a certain threshold.
Is there a better way to do this or an algorithm I can use which detects once the golf club passes the red line?
Thank you for your thoughts.

Tracking the pixels along the line is a feasible approach.
I recommend not to check the pixels in the order top to bottom or bottom to top, I recommend a kind of interlacing order to increase the chance to detect a change with few pixel-checks.
Here is what I mean:
If the index the pixels from the to to the bottom by 0,1,2,3,..63 [just an example]
I would check e.g. this order
0,32 ,16,48 ,8,24,40,56, 4,12,20,28,36,44,52,60, ...
Or if you predict a non equal distributed probability for the change of the pixels, you can take even this in account and find an order of pixel checks with, detects the change with the lowest average pixel checks.

What kind of video-source do you use? Most video compressions already use differences between images to reduce the amount of data to store.
It might be a quite a bit of work, but you might be able to see the change directly in the compressed data without decoding it. Sounds like an interesting challenge to me to implement detection of change in a video, by simply reading the undecoded data.

Related

anyway to remove algorithmically discolorations from aerial imagery

I don't know much about image processing so please bear with me if this is not possible to implement.
I have several sets of aerial images of the same area originating from different sources. The pictures have been taken during different seasons, under different lighting conditions etc. Unfortunately some images look patchy and suffer from discolorations or are partially obstructed by clouds or pix-elated, as par example picture1 and picture2
I would like to take as an input several images of the same area and (by some kind of averaging them) produce 1 picture of improved quality. I know some C/C++ so I could use some image processing library.
Can anybody propose any image processing algorithm to achieve it or knows any research done in this field?
I would try with a "color twist" transform, i.e. a 3x3 matrix applied to the RGB components. To implement it, you need to pick color samples in areas that are split by a border, on both sides. You should fing three significantly different reference colors (hence six samples). This will allow you to write the nine linear equations to determine the matrix coefficients.
Then you will correct the altered areas by means of this color twist. As the geometry of these areas is intertwined with the field patches, I don't see a better way than contouring the regions by hand.
In the case of the second picture, the limits of the regions are blurred so that you will need to blur the region mask as well and perform blending.
In any case, don't expect a perfect repair of those problems as the transform might be nonlinear, and completely erasing the edges will be difficult. I also think that colors are so washed out at places that restoring them might create ugly artifacts.
For the sake of illustration, a quick attempt with PhotoShop using manual HLS adjustment (less powerful than color twist).
The first thing I thought of was a kernel matrix of sorts.
Do a first pass of the photo and use an edge detection algorithm to determine the borders between the photos - this should be fairly trivial, however you will need to eliminate any overlap/fading (looks like there's a bit in picture 2), you'll see why in a minute.
Do a second pass right along each border you've detected, and assume that the pixel on either side of the border should be the same color. Determine the difference between the red, green and blue values and average them along the entire length of the line, then divide it by two. The image with the lower red, green or blue value gets this new value added. The one with the higher red, green or blue value gets this value subtracted.
On either side of this line, every pixel should now be the exact same. You can remove one of these rows if you'd like, but if the lines don't run the length of the image this could cause size issues, and the line will likely not be very noticeable.
This could be made far more complicated by generating a filter by passing along this line - I'll leave that to you.
The issue with this could be where there was development/ fall colors etc, this might mess with your algorithm, but there's only one way to find out!

Invoice / OCR: Detect two important points in invoice image

I am currently working on OCR software and my idea is to use templates to try to recognize data inside invoices.
However scanned invoices can have several 'flaws' with them:
Not all invoices, based on a single template, are correctly aligned under the scanner.
People can write on invoices
etc.
Example of invoice: (Have to google it, sadly cannot add a more concrete version as client data is confidential obviously)
I find my data in the invoices based on the x-values of the text.
However I need to know the scale of the invoice and the offset from left/right, before I can do any real calculations with all data that I have retrieved.
What have I tried so far?
1) Making the image monochrome and use the left and right bounds of the first appearance of a black pixel. This fails due to the fact that people can write on invoices.
2) Divide the invoice up in vertical sections, use the sections that have the highest amount of black pixels. Fails due to the fact that the distribution is not always uniform amongst similar templates.
I could really use your help on (1) how to identify important points in invoices and (2) on what I should focus as the important points.
I hope the question is clear enough as it is quite hard to explain.
Detecting rotation
I would suggest you start by detecting straight lines.
Look (perhaps randomly) for small areas with high contrast, i.e. mostly white but a fair amount of very black pixels as well. Then try to fit a line to these black pixels, e.g. using least squares method. Drop the outliers, and fit another line to the remaining points. Iterate this as required. Evaluate how good that fit is, i.e. how many of the pixels in the observed area are really close to the line, and how far that line extends beyond the observed area. Do this process for a number of regions, and you should get a weighted list of lines.
For each line, you can compute the direction of the line itself and the direction orthogonal to that. One of these numbers can be chosen from an interval [0°, 90°), the other will be 90° plus that value, so storing one is enough. Take all these directions, and find one angle which best matches all of them. You can do that using a sliding window of e.g. 5°: slide accross that (cyclic) region and find a value where the maximal number of lines are within the window, then compute the average or median of the angles within that window. All of this computation can be done taking the weights of the lines into account.
Once you have found the direction of lines, you can rotate your image so that the lines are perfectly aligned to the coordinate axes.
Detecting translation
Assuming the image wasn't scaled at any point, you can then try to use a FFT-based correlation of the image to match it to the template. Convert both images to gray, pad them with zeros till the originals take up at most 1/2 the edge length of the padded image, which preferrably should be a power of two. FFT both images in both directions, multiply them element-wise and iFFT back. The resulting image will encode how much the two images would agree for a given shift relative to one another. Simply find the maximum, and you know how to make them match.
Added text will cause no problems at all. This method will work best for large areas, like the company logo and gray background boxes. Thin lines will provide a poorer match, so in those cases you might have to blur the picture before doing the correlation, to broaden the features. You don't have to use the blurred image for further processing; once you know the offset you can return to the rotated but unblurred version.
Now you know both rotation and translation, and assumed no scaling or shearing, so you know exactly which portion of the template corresponds to which portion of the scan. Proceed.
If rotation is solved already, I'd just sum up all pixel color values horizontally and vertically to a single horizontal / vertical "line". This should provide clear spikes where you have horizontal and vertical lines in the form.
p.s. Generated a corresponding horizontal image with Gimp's scaling capabilities, attached below (it's a bit hard to see because it's only one pixel high and may get scaled down because it's > 700 px wide; the url is http://i.stack.imgur.com/Zy8zO.png ).

simple case of optical flow

General: I'm hoping that the use-case I'm about to describe is a simple case of an optical flow problem and since I don't have much knowledge on the subject, I was wondering if anyone has any suggestions on how I can approach solving my problem.
Research I've already done: I have began reading the High Accuracy Optical Flow Estimation Based on a Theory for Warping paper and am planning on looking over the Particle Video paper. I have found a MATLAB High Accuracy Optical Flow implementation of optical flow. However, the papers (and the code) seem to describe concepts that are very involved and may require a lot of time for me to dig in and understand. I am hoping that the solution to my problem may be more simple.
Problem: I have a sequence of images. The images depict a material breakage process, where the material and background are black and the cracks are white. I am interested in traversing the sequence of images in reverse in an attempt to map all of the cracks that have formed in the breakage process to the first black image. You can think of the material as a large puzzle and I am trying to put the pieces back together in the reverse order that they broke.
In each image, there can be some cracks that are just emerging and/or some cracks that have been fully formed (and thus created a fragment). Throughout the breakage process, some fragments may separate and break further. The fragments can also move farther away from one another (the change is slight between subsequent frames).
Desired Output: All of the cracks/lines in the sequence mapped to the first image in the sequence.
Additional Notes: Images are available in grayscale format (i.e. original) as well as in binary format, where the cracks have been outlined in white and the background is completely black. See below for some image examples.
The top row shows the original images and the bottom row shows the binary images. As you can see, the crack that goes down the middle grows wider and wider as the image sequence progresses. Thus, the bottom crack moves together with the lower fragment. When traversing the sequence in reverse, I hope to algorithmically realize that the middle crack comes together as one (and map it correctly to the first image), and also map the bottom crack correctly, keeping its correct correspondence (size and position) with the bottom fragment.
A sequence typically contains about 30~40 images, so I've just shown the beginning subset. Also, although these images don't show it, it is possible that a particular image only contains the beginning of the crack (i.e. its initial appearance) and in subsequent images it gets longer and longer and may join with other cracks.
Language: Although not necessary, I would like to implement the solution using MATLAB (just because most of the other code that relates to the project has been done in MATLAB). However, if OpenCV may be easier, I am flexible in my language/library usage.
Any ideas are greatly appreciated.
Traverse forward rather than reverse, and don't use optical flow. Use the fracture lines to segment the black parts, track the centroid of each black segment over time. Whenever a new fracture line appears that cuts across a black segment, split the segment into two and continue tracking each segment separately.
From this you should be able to construct a tree structure representing the segmentation of the black parts over time. The fracture lines can be added as metadata to this tree, perhaps assigning fracture lines to the segment node in which they first appeared.
I would advise you to follow your initial idea of backtracking the cracks. Yo kind of know how the cracks look like so you can track all the points that belong to the crack. You just track all the white points with an optical flow tracker, start with Lukas-Kanade tracker and see where you get. The high-accuracy optical flow method is a global one and more general, I'll track all the pixels in the image trying to keep some smoothness everywhere. The LK is a local method that will just use a small window around each point to do the tracking. The problem is that appart from the cracks all the pixels are plain black so nothing to track there, you'll just waist time trying to track something that you can't track and you don't need to track.
If lines are very straight you might end up with what's called the aperture problem and you'll get inaccurate results. You can also try some shape fitting/deformation based on snakes.
I agree to damian. Most optical flow methods like the HAOF rely on the first-order taylor approximation of the intensity constancy constrian equation I(x,t)=I(x+v,t+dt). That mean the solution depends on image derivatives where the gradient determine the motion vector magnitude and angle i.e. you need a certain amount of texture. However the very low texture of your non-binarised images could be enough. You could try histogram equalization to increase the contrast of your input data but it is important to apply the same transformation for both input images. e.g. as follows:
cv::Mat equalizeMat(grayInp1.rows, grayInp1.cols * 2 , CV_8UC1);
grayInp1.copyTo(equalizeMat(cv::Rect(0,0,grayInp1.cols,grayInp1.rows)));
grayInp2.copyTo(equalizeMat(cv::Rect(grayInp1.cols,0,grayInp2.cols,grayInp2.rows)));
cv::equalizeHist(equalizeMat,equalizeMat);
equalizeMat(cv::Rect(0,0,grayInp1.cols,grayInp1.rows)).copyTo(grayInp1);
equalizeMat(cv::Rect(grayInp1.cols,0,grayInp2.cols,grayInp2.rows)).copyTo(grayInp2);
// estimate optical flow

How do I locate black rectangles in a grid and extract the binary code from that

i'm working in a project to recognize a bit code from an image like this, where black rectangle represents 0 bit, and white (white space, not visible) 1 bit.
Somebody have any idea to process the image in order to extract this informations? My project is written in java, but any solution is accepted.
thanks all for support.
I'm not an expert in image processing, I try to apply Edge Detection using Canny Edge Detector Implementation, free java implementation find here. I used this complete image [http://img257.imageshack.us/img257/5323/colorimg.png], reduce it (scale factor = 0.4) to have fast processing and this is the result [http://img222.imageshack.us/img222/8255/colorimgout.png]. Now, how i can decode white rectangle with 0 bit value, and no rectangle with 1?
The image have 10 line X 16 columns. I don't use python, but i can try to convert it to Java.
Many thanks to support.
This is recognising good old OMR (optical mark recognition).
The solution varies depending on the quality and consistency of the data you get, so noise is important.
Using an image processing library will clearly help.
Simple case: No skew in the image and no stretch or shrinkage
Create a horizontal and vertical profile of the image. i.e. sum up values in all columns and all rows and store in arrays. for an image of MxN (width x height) you will have M cells in horizontal profile and N cells in vertical profile.
Use a thresholding to find out which cells are white (empty) and which are black. This assumes you will get at least a couple of entries in each row or column. So black cells will define a location of interest (where you will expect the marks).
Based on this, you can define in lozenges in the form and you get coordinates of lozenges (rectangles where you have marks) and then you just add up pixel values in each lozenge and based on the number, you can define if it has mark or not.
Case 2: Skew (slant in the image)
Use fourier (FFT) to find the slant value and then transform it.
Case 3: Stretch or shrink
Pretty much the same as 1 but noise is higher and reliability less.
Aliostad has made some good comments.
This is OMR and you will find it much easier to get good consistent results with a good image processing library. www.leptonica.com is a free open source 'C' library that would be a very good place to start. It could process the skew and thresholding tasks for you. Thresholding to B/W would be a good start.
Another option would be IEvolution - http://www.hi-components.com/nievolution.asp for .NET.
To be successful you will need some type of reference / registration marks to allow for skew and stretch especially if you are using document scanning or capturing from a camera image.
I am not familiar with Java, but in Python, you can use the imaging library to open the image. Then load the height and the widths, and segment the image into a grid accordingly, by Height/Rows and Width/Cols. Then, just look for black pixels in those regions, or whatever color PIL registers that black to be. This obviously relies on the grid like nature of the data.
Edit:
Doing Edge Detection may also be Fruitful. First apply an edge detection method like something from wikipedia. I have used the one found at archive.alwaysmovefast.com/basic-edge-detection-in-python.html. Then convert any grayscale value less than 180 (if you want the boxes darker just increase this value) into black and otherwise make it completely white. Then create bounding boxes, lines where the pixels are all white. If data isn't terribly skewed, then this should work pretty well, otherwise you may need to do more work. See here for the results: http://imm.io/2BLd
Edit2:
Denis, how large is your dataset and how large are the images? If you have thousands of these images, then it is not feasible to manually remove the borders (the red background and yellow bars). I think this is important to know before proceeding. Also, I think the prewitt edge detection may prove more useful in this case, since there appears to be less noise:
The previous method of segmenting may be applied, if you do preprocess to bin in the following manner, in which case you need only count the number of black or white pixels and threshold after some training samples.

Automatic tracking algorithm

I'm trying to write a simple tracking routine to track some points on a movie.
Essentially I have a series of 100-frames-long movies, showing some bright spots on dark background.
I have ~100-150 spots per frame, and they move over the course of the movie. I would like to track them, so I'm looking for some efficient (but possibly not overkilling to implement) routine to do that.
A few more infos:
the spots are a few (es. 5x5) pixels in size
the movement are not big. A spot generally does not move more than 5-10 pixels from its original position. The movements are generally smooth.
the "shape" of these spots is generally fixed, they don't grow or shrink BUT they become less bright as the movie progresses.
the spots don't move in a particular direction. They can move right and then left and then right again
the user will select a region around each spot and then this region will be tracked, so I do not need to automatically find the points.
As the videos are b/w, I though I should rely on brigthness. For instance I thought I could move around the region and calculate the correlation of the region's area in the previous frame with that in the various positions in the next frame. I understand that this is a quite naïve solution, but do you think it may work? Does anyone know specific algorithms that do this? It doesn't need to be superfast, as long as it is accurate I'm happy.
Thank you
nico
Sounds like a job for Blob detection to me.
I would suggest the Pearson's product. Having a model (which could be any template image), you can measure the correlation of the template with any section of the frame.
The result is a probability factor which determine the correlation of the samples with the template one. It is especially applicable to 2D cases.
It has the advantage to be independent from the sample absolute value, since the result is dependent on the covariance related with the mean of the samples.
Once you detect an high probability, you can track the successive frames in the neightboor of the original position, and select the best correlation factor.
However, the size and the rotation of the template matter, but this is not the case as I can understand. You can customize the detection with any shape since the template image could represent any configuration.
Here is a single pass algorithm implementation , that I've used and works correctly.
This has got to be a well reasearched topic and I suspect there won't be any 100% accurate solution.
Some links which might be of use:
Learning patterns of activity using real-time tracking. A paper by two guys from MIT.
Kalman Filter. Especially the Computer Vision part.
Motion Tracker. A student project, which also has code and sample videos I believe.
Of course, this might be overkill for you, but hope it helps giving you other leads.
Simple is good. I'd start doing something like:
1) over a small rectangle, that surrounds a spot:
2) apply a weighted average of all the pixel coordinates in the area
3) call the averaged X and Y values the objects position
4) while scanning these pixels, do something to approximate the bounding box size
5) repeat next frame with a slightly enlarged bounding box so you don't clip spot that moves
The weight for the average should go to zero for pixels below some threshold. Number 4 can be as simple as tracking the min/max position of anything brighter than the same threshold.
This will of course have issues with spots that overlap or cross paths. But for some reason I keep thinking you're tracking stars with some unknown camera motion, in which case this should be fine.
I'm afraid that blob tracking is not simple, not if you want to do it well.
Start with blob detection as genpfault says.
Now you have spots on every frame and you need to link them up. If the blobs are moving independently, you can use some sort of correspondence algorithm to link them up. See for instance http://server.cs.ucf.edu/~vision/papers/01359751.pdf.
Now you may have collisions. You can use mixture of gaussians to try to separate them, give up and let the tracks cross, use any other before-and-after information to resolve the collisions (e.g. if A and B collide and A is brighter before and will be brighter after, you can keep track of A; if A and B move along predictable trajectories, you can use that also).
Or you can collaborate with a lab that does this sort of stuff all the time.

Resources