I need to programmatically determine the best place to overlay text on an image; in other words, I need to tell the foreground from the background. I have tried ImageMagick's background-removal script (http://www.imagemagick.org/Usage/scripts/bg_removal), but unfortunately it was not good enough. The images can be photographs of pretty much anything, but usually with a blurry background.
I would now like to try liuliu's CCV. Code: https://github.com/liuliu/ccv, Demo: http://liuliu.me/ccv/js/nss/
The demo uses what looks like a JSON Haar cascade to detect faces: https://github.com/liuliu/ccv/blob/unstable/js/face.js
How do I:
1. Convert the XML Haar cascade files so they can be used with CCV
2. Generate the best cascade for my goal (text placement on an image)
3. Find any documentation for CCV
And, finally, is there a better way to approach this problem?
EDIT: I've asked the broader question here: https://stackoverflow.com/questions/10559262/programmatically-place-text-in-an-image
1. Convert the XML Haar cascade files so they can be used with CCV
2. Generate the best cascade for my goal (text placement on an image)
3. Find any documentation for CCV
I have no idea about 1) (which XML files, anyway? I guess some from OpenCV?) or 3), but here is my take on 2).
To build a Haar cascade à la Viola-Jones, you need a series of small training images that contain only your desired objects, for example faces.
One object per image, with as little background as possible, all in the same orientation and size, normalized so they all have the same average brightness and the same variance in brightness. You will need a lot of training images.
You also need a series of negative training images, of the same size, brightness, etc. as the positive examples, that contain only background.
However, I doubt that this approach will work for you at all:
Haar filters work by recognizing common rectangular light/dark structures in all your foreground objects.
So your desired foreground images need to have a common structure.
An example Haar filter cascade works like this (extremely simplified):
1. Is the rectangular region at (x1, y1) darker than the region at (x2, y2)? If no --> not a face; if yes --> continue.
2. Is the region at (x3, y3) darker than the region at (x4, y4)? If no --> not a face; if yes --> continue.
3. And so on...
(To find the position of a face in a larger image, you run this filter cascade at every possible position in the image. The cascade is very fast at rejecting non-faces, so this is feasible.)
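To make this concrete, here is a toy Python sketch of such a cascade. The region coordinates and the two tests are invented for illustration; a real Viola-Jones cascade learns thousands of such tests from training data and uses integral images to compute rectangle sums in constant time.

```python
# Toy illustration only: a two-stage "cascade" on a 24x24 grayscale
# patch. The regions and tests below are made up; a trained cascade
# would contain thousands of learned tests.

def region_mean(gray, x, y, w, h):
    """Mean brightness of a rectangle; gray is a 2D list of pixel rows."""
    total = sum(gray[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return total / float(w * h)

def looks_like_face(patch):
    """Each stage compares two rectangles; failing any stage rejects early."""
    # Stage 1: hypothetical "eyes darker than cheeks" test.
    if region_mean(patch, 4, 4, 16, 5) >= region_mean(patch, 4, 11, 16, 5):
        return False  # most windows exit here, which is what makes it fast
    # Stage 2: hypothetical "mouth darker than chin" test.
    if region_mean(patch, 8, 15, 8, 4) >= region_mean(patch, 8, 19, 8, 4):
        return False
    return True  # survived all stages
```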
So your foreground objects need to have a common pattern among them.
For faces, the eye region is darker than the cheek region, and the mouth is darker than the chin, and so on.
The same filter for faces will cease to work if you just rotate the faces.
You cannot build a good filter for both trees and faces, and you definitely cannot build one for general foreground objects; they have no such common structure among them. You would need a separate filter for each possible type of object, so unless your pictures contain only a very limited number of object types, this will not work.
I don't know much about image processing so please bear with me if this is not possible to implement.
I have several sets of aerial images of the same area originating from different sources. The pictures have been taken during different seasons, under different lighting conditions, etc. Unfortunately some images look patchy, suffer from discolorations, are partially obstructed by clouds, or are pixelated, as in picture1 and picture2 for example.
I would like to take several images of the same area as input and (by averaging them in some way) produce one picture of improved quality. I know some C/C++, so I could use an image processing library.
Can anybody propose an image processing algorithm to achieve this, or point to research done in this field?
I would try a "color twist" transform, i.e. a 3x3 matrix applied to the RGB components. To implement it, you need to pick color samples in areas that are split by a border, on both sides. You should find three significantly different reference colors (hence six samples). This gives you the nine linear equations needed to determine the matrix coefficients.
Then you will correct the altered areas by means of this color twist. As the geometry of these areas is intertwined with the field patches, I don't see a better way than contouring the regions by hand.
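A minimal sketch of fitting and applying the twist with NumPy, assuming three hypothetical color pairs sampled on both sides of a border (good[i] and bad[i] are RGB readings of the same surface in a clean area and in the altered area):

```python
# Minimal sketch of the color-twist fit with NumPy; the sample
# values below are hypothetical.
import numpy as np

good = np.array([[120, 140, 90], [200, 190, 170], [60, 80, 50]], float)
bad  = np.array([[100, 120, 95], [180, 175, 160], [50, 70, 55]], float)

# Solve M @ bad[i] = good[i]: three color pairs give nine linear
# equations for the nine coefficients of the 3x3 matrix M.
M = good.T @ np.linalg.inv(bad.T)

def correct(pixels):
    """Apply the twist to an (N, 3) float array of RGB pixels."""
    return np.clip(pixels @ M.T, 0, 255)
```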
In the case of the second picture, the limits of the regions are blurred so that you will need to blur the region mask as well and perform blending.
In any case, don't expect a perfect repair of those problems as the transform might be nonlinear, and completely erasing the edges will be difficult. I also think that colors are so washed out at places that restoring them might create ugly artifacts.
For the sake of illustration, here is a quick attempt in Photoshop using manual HLS adjustment (less powerful than a color twist).
The first thing I thought of was a kernel matrix of sorts.
Do a first pass of the photo and use an edge detection algorithm to determine the borders between the photos. This should be fairly trivial; however, you will need to eliminate any overlap/fading (it looks like there's a bit in picture 2). You'll see why in a minute.
Do a second pass right along each border you've detected, and assume that the pixels on either side of the border should be the same color. Take the difference between the red, green and blue values, average it along the entire length of the line, then divide it by two. The side with the lower red, green or blue value gets this amount added; the side with the higher value gets it subtracted.
On either side of this line, every pixel should now be very nearly the same. You can remove one of these rows if you'd like, but if the lines don't run the full length of the image this could cause size issues, and the line will likely not be very noticeable anyway.
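Here is a rough sketch of that second pass, assuming a perfectly vertical border at column x in an RGB image held as a NumPy array; a real implementation would also have to handle horizontal and slanted borders:

```python
# Equalize brightness across a vertical border at column x of an
# RGB image stored as a NumPy array of shape (H, W, 3).
import numpy as np

def equalize_border(img, x):
    left = img[:, x - 1, :].astype(float)
    right = img[:, x + 1, :].astype(float)
    # Per-channel difference, averaged along the whole border, halved.
    offset = (left - right).mean(axis=0) / 2.0
    out = img.astype(float)
    out[:, :x, :] -= offset   # brighter side comes down by half the gap
    out[:, x:, :] += offset   # darker side comes up by the other half
    return np.clip(out, 0, 255).astype(img.dtype)
```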
This could be made far more sophisticated by building a smoothing filter that runs along this line; I'll leave that to you.
A potential issue is where there was development, fall colors, etc.; this might confuse the algorithm, but there's only one way to find out!
I'm interested in charcoal-like filters such as Photoshop's Photocopy filter or its Note Paper filter.
Does anyone have a paper or some instructions on how such a filter works?
Ideally, I want to turn the following input into the following output:
Input: (image)
Output: (image)
Regards
I think it's a process akin to pan-sharpening. I could get a quite similar image in GIMP with the following steps (a rough script sketch follows the list):
1. Converting to gray
2. Duplicating into two layers
3. Lightly blurring one layer
4. Edge-detecting the other layer with a DoG (difference of Gaussians) filter with a large radius
5. Compositing the two layers, playing a bit with the transparency
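A rough PIL translation of this recipe; the blur radii and the blend weight are guesses to tune by eye:

```python
# Approximate the GIMP recipe above with PIL.
from PIL import Image, ImageChops, ImageFilter, ImageOps

img = ImageOps.grayscale(Image.open("input.png"))        # 1. convert to gray

blurred = img.filter(ImageFilter.GaussianBlur(2))        # 2./3. blurred copy

# 4. difference-of-Gaussians edge layer with a large radius
dog = ImageChops.difference(img.filter(ImageFilter.GaussianBlur(3)),
                            img.filter(ImageFilter.GaussianBlur(12)))
edges = ImageOps.invert(ImageOps.autocontrast(dog))      # dark strokes on white

result = Image.blend(blurred, edges, alpha=0.7)          # 5. composite
result.save("photocopy.png")
```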
What this filter is doing is converting the color picture into a 0/1 bitmap.
Such filters typically use a threshold function which returns 1 (white) for some values and 0 (black) for others.
One simple function would be to transform the image from color to grayscale, and then select a shade of gray above which everything is white and below which everything is black. The actual threshold could be made adaptive to the brightness of the picture (e.g. chosen so that a certain percentage of pixels ends up white).
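A sketch of that adaptive global threshold with PIL and NumPy; the 40% white fraction is an arbitrary assumption:

```python
# Pick the gray level so that a chosen fraction of pixels ends up white.
import numpy as np
from PIL import Image

gray = np.asarray(Image.open("input.png").convert("L"))
threshold = np.percentile(gray, 100 - 40)     # brightest 40% become white
binary = (gray >= threshold).astype(np.uint8) * 255
Image.fromarray(binary).save("binary.png")
```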
It can also be adaptive based on the context within the picture (i.e. a dark area may still have some white pixels to show local contrast). The trees behind the house are not all black because the filtering is sensitive to the average darkness of the region.
Also note that the area close to the light gap in the tree has a cluster of dark pixels because of its relative darkness. The edges of the house and the bench are also highlighted; there is an edge detection element at play.
I do not know exactly which effect your example uses, but several produce similar results. As VSOverFlow pointed out, thresholding an image would give you something very similar, though I do not think that is what is being used here. OpenCV has a function for this; its documentation can be found here. You may also want to look into Otsu's method for thresholding.
Again, as VSOverFlow pointed out, there is an edge detection element at play as well. You may want to investigate the Sobel and Prewitt filters. Thresholding, Sobel, and Prewitt are three simple options that will give you something similar to the image you provided. Perhaps you could threshold the result of the Prewitt filter? I have no knowledge of how Photoshop implements its filters, so if none of these options is close enough to what you are looking for, I would recommend looking for information on Photoshop's specific implementations.
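For reference, a minimal OpenCV sketch of those building blocks; running Otsu's threshold on a Sobel gradient is only one guess at approximating the effect:

```python
# Sobel gradient followed by Otsu thresholding with OpenCV.
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Sobel(gray, cv2.CV_8U, 1, 0, ksize=3)   # horizontal gradient
_, binary = cv2.threshold(edges, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("thresholded_edges.png", binary)
```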
Are there properties of digital images (e.g. DCT coefficients, pixel values, YCbCr values) that remain constant when filters like binarization, grayscale conversion, or sepia are applied, or when the image is tilted by some angle? It would also be helpful if you could suggest reading or an online tutorial on basic image processing.
It sounds like you want to know what features are robust to all sorts of image operations.
The properties you listed are not invariant under the transforms you listed. You ask whether "pixel values" remain constant under a filter that by definition modifies pixel values. The only qualified yes for your list is that DCT coefficients roughly maintain their distribution when you apply a color filter.
I'm going to make an assumption and suggest that you should read up on feature detection, where the goal is to identify salient parts of an image that remain constant after a transformation like scaling, rotation, etc. These features are useful for image stitching, object detection, query-by-image search, and lots more.
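As a concrete starting point, here is a minimal OpenCV example using the ORB detector (one of several such detectors; SIFT is a popular alternative):

```python
# Detect keypoints whose descriptors stay comparable under moderate
# rotation and scaling - the property that makes them useful for
# stitching, matching, and query-by-image search.
import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(img, None)
print(len(keypoints), "keypoints found")
```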
Q: Are there properties of digital images ... that remain constant ...
A: Sure: height and width ;-)
Q: ...or tilting the image by a certain degree...
A: Whoops - maybe not even height and width ;)
ANYWAY -
Your question is far, far too broad.
SUGGESTION:
Get a copy of Foley/van Dam:
http://www.amazon.com/Computer-Graphics-Principles-Practice-2nd/dp/0201848406
The properties of an image (i.e. its pixels) always change when you process it. Processing simply means changing the pixel values in order to get something out of the image.
There are many image processing techniques, such as noise removal, filtering, resizing, cropping, edge detection, etc.
If you want to learn from the beginning, see Bob Powell's tutorials. They are in C# and quite easy to understand.
I realize there might be a better place to ask this, but I think you all will have some valuable feedback.
People are asked to draw a shape in black on a white canvas. Each drawing is then added to a running average. I'd like the parts that the drawings mostly have in common to remain visible, and the parts that are unlike most of the other drawings to fade away.
My first problem is that I'm using ImageMagick to process the images, which means I can only composite two images at once. So I have the running-total image and the newest one to add; I cannot get a true average this way.
Secondly, I do not fully understand blending modes, particularly when different opacities are involved. I'm not sure which is best to use.
When you add the first two images, you blend them equally. But when you add the 3rd image to the mix, you have to change the weights: the two-image composite should be blended at 66.7% while the new image contributes just 33.3%. For the 4th image the weights are 75% and 25% respectively. In general, if you have n images in the composite, the new image should contribute 100/(n+1) percent when added.
As you see, the more images you have the less an individual image affects the composite result.
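A sketch of that running average with PIL (both images must have the same size and mode):

```python
# With n images already in the composite, the new one gets weight
# 1/(n+1) - exactly the 100/(n+1) percent given above.
from PIL import Image

def add_to_average(composite, new_image, n):
    """composite averages n images; returns the average over n + 1."""
    return Image.blend(composite, new_image, alpha=1.0 / (n + 1))
```

Note that ImageMagick's -average operator can also average a whole sequence in one call, which would sidestep the two-at-a-time limitation.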
I'm working on a project to recognize a bit code from an image like this, where a black rectangle represents a 0 bit and white (empty space, not visible) represents a 1 bit.
Does anybody have an idea how to process the image in order to extract this information? My project is written in Java, but any solution is accepted.
Thanks all for your support.
I'm not an expert in image processing. I tried applying edge detection using a Canny edge detector (a free Java implementation can be found here). I used this complete image [http://img257.imageshack.us/img257/5323/colorimg.png], scaled it down (scale factor = 0.4) for faster processing, and this is the result [http://img222.imageshack.us/img222/8255/colorimgout.png]. Now, how can I decode a white rectangle as a 0 bit, and no rectangle as a 1?
The image has 10 rows × 16 columns. I don't use Python, but I can try to convert a Python solution to Java.
Many thanks for the support.
This is good old OMR (optical mark recognition).
The solution varies depending on the quality and consistency of the data you get, so noise is important.
Using an image processing library will clearly help.
Simple case: no skew in the image and no stretch or shrinkage
Create horizontal and vertical profiles of the image, i.e. sum up the values in all columns and in all rows and store them in arrays. For an image of M×N (width × height) you will have M cells in the horizontal profile and N cells in the vertical profile.
Use thresholding to find out which cells are white (empty) and which are black. This assumes you will get at least a couple of entries in each row or column, so the black cells define the locations of interest (where you expect the marks).
Based on this, you can define lozenges in the form (the rectangles where you expect marks), get their coordinates, and then simply add up the pixel values in each lozenge; from that sum you can decide whether it contains a mark or not. A sketch of this follows.
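Here is a sketch of the profile idea with NumPy, assuming an already-binarized image (0 = black mark, 255 = white background); the 5% cutoff is an arbitrary assumption to tune:

```python
import numpy as np

def profile_bands(binary, axis, cutoff=0.05):
    """Group consecutive rows/columns whose fraction of black pixels
    exceeds `cutoff` into (start, end) index bands."""
    profile = (binary == 0).mean(axis=axis)  # fraction of black per row/col
    bands, start = [], None
    for i, dark in enumerate(profile > cutoff):
        if dark and start is None:
            start = i
        elif not dark and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands

# Row bands come from axis=1, column bands from axis=0; intersecting a
# row band with a column band gives one lozenge, and the black-pixel
# count inside it decides whether that position holds a mark.
```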
Case 2: Skew (slant in the image)
Use a Fourier transform (FFT) to find the slant angle, then rotate the image to correct it.
Case 3: Stretch or shrink
Pretty much the same as the simple case, but the noise is higher and the reliability lower.
Aliostad has made some good comments.
This is OMR, and you will find it much easier to get good, consistent results with a good image processing library. www.leptonica.com is a free, open-source C library that would be a very good place to start; it can handle the skew and thresholding tasks for you. Thresholding to B/W would be a good start.
Another option would be IEvolution - http://www.hi-components.com/nievolution.asp for .NET.
To be successful you will need some type of reference / registration marks to allow for skew and stretch especially if you are using document scanning or capturing from a camera image.
I am not familiar with Java, but in Python you can use the imaging library (PIL) to open the image. Then get the height and width, and segment the image into a grid by height/rows and width/cols. Then just look for black pixels in those regions (or whatever value PIL registers that black to be). This obviously relies on the grid-like nature of the data.
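A sketch of that grid approach with PIL, assuming the 10 × 16 grid from the question, an image already cropped to the code area, and an arbitrary darkness cutoff of 128:

```python
from PIL import Image

ROWS, COLS = 10, 16  # grid size from the question

img = Image.open("code.png").convert("L")   # hypothetical pre-cropped image
w, h = img.size
cw, ch = w // COLS, h // ROWS               # cell width and height

bits = []
for r in range(ROWS):
    row = []
    for c in range(COLS):
        cell = img.crop((c * cw, r * ch, (c + 1) * cw, (r + 1) * ch))
        dark = sum(1 for p in cell.getdata() if p < 128)  # "black" pixels
        # black rectangle -> 0 bit, empty cell -> 1 bit (per the question)
        row.append(0 if dark > (cw * ch) // 4 else 1)
    bits.append(row)
```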
Edit:
Edge detection may also be fruitful. First apply an edge detection method, e.g. something from Wikipedia; I used the one found at archive.alwaysmovefast.com/basic-edge-detection-in-python.html. Then convert any grayscale value less than 180 to black (increase this value if you want the boxes darker) and everything else to white. Then create bounding boxes along lines where the pixels are all white. If the data isn't terribly skewed, this should work pretty well; otherwise you may need to do more work. See here for the results: http://imm.io/2BLd
Edit2:
Denis, how large is your dataset, and how large are the images? If you have thousands of these images, it is not feasible to manually remove the borders (the red background and yellow bars), so I think it is important to know this before proceeding. Also, I think Prewitt edge detection may prove more useful in this case, since there appears to be less noise:
The previous segmentation method can still be applied if you first binarize the image in this manner; then you need only count the number of black or white pixels in each cell and pick a threshold from a few training samples.