Finding image subsets between two images - algorithm

I'm working on a way to handle hardware-based bitmap animation. As input, I've got an image sequence of a simple bitmap (it's not a video; it's more like simple shapes, even though they might contain bitmap fills). I'm building a texture atlas of this animation (so it can be rendered quickly on the GPU), and since large parts of the sequence often stand still while only a small part of it animates, I need an algorithm that can find the "common parts" between two images, so I can save memory.
The images might not have the same size (if an object is growing or shrinking, for example), so I need a way to detect the biggest common area between the two. I've seen this answer, and it partly solves my problem. I'd like to know, though, whether there is already a better algorithm for my case, especially because, since the sizes can vary, one image is not necessarily contained within the other; I need to find the common parts between the two.

One problem I see is that one image can be contained in another in many ways; how do you determine the right answer?
Does it have to be real-time? If not, you can do a simple O(n^4) search using a fitness function.
The fitness function could be the error between the images (which gives an n^8 algorithm).
UPDATE:
My analysis was wrong, sorry. The search is O(n^2) and the fitness function is O(n^2), which gives O(n^4) overall.
The whole algorithm should look something like this:
w1 = width of image 1
w2 = width of image 2
h1 = height of image 1
h2 = height of image 2

for x = -w1 to w1 + w2
    for y = -h1 to h1 + h2
        find max fitness(x, y)

fitness(xc, yc) {
    m = 0
    for each x where image 1 overlaps image 2 displaced by xc
        for each y where image 1 overlaps image 2 displaced by yc
            if (image1[x][y] == image2[x + xc][y + yc])
                m += 1
    return m
}
UPDATE: Modified fitness function to find the number of overlaps, and then try to find the most overlaps.
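A minimal MATLAB sketch of this brute-force search, assuming img1 and img2 are grayscale matrices (the function and variable names are illustrative, not from any library; the loops are tightened to cover exactly the displacements with at least one overlapping pixel):

function [bestX, bestY, bestM] = bestOverlap(img1, img2)
% Exhaustive search over all displacements (xc, yc) of image 2
% relative to image 1, maximizing the number of matching pixels.
[h1, w1] = size(img1);
[h2, w2] = size(img2);
bestM = -1; bestX = 0; bestY = 0;
for xc = -(w1-1):(w2-1)
    for yc = -(h1-1):(h2-1)
        m = fitness(img1, img2, xc, yc);
        if m > bestM
            bestM = m; bestX = xc; bestY = yc;
        end
    end
end
end

function m = fitness(img1, img2, xc, yc)
% Count pixels where image 1 matches image 2 displaced by (xc, yc),
% i.e. where img1(y, x) == img2(y + yc, x + xc), over the overlap.
[h1, w1] = size(img1);
[h2, w2] = size(img2);
xs = max(1, 1 - xc):min(w1, w2 - xc);  % overlap columns in image 1
ys = max(1, 1 - yc):min(h1, h2 - yc);  % overlap rows in image 1
m = nnz(img1(ys, xs) == img2(ys + yc, xs + xc));
end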

Related

Different methods to normalize images

I want to normalize images whose pixels can have negative values, and I found two different ways to do that. Given a two-dimensional matrix X, I can do the following:
a) X = 0.5*(X/max(abs(X)) + 1)
b) X = (X-min(X))/(max(X)-min(X))
Since I'm not an expert, I'm not sure which of the two is the more useful way to normalize images. Does one of the two options have certain advantages?
For GLCM it does not matter at all where the 0 level is; what matters is the differences between intensities. Thus, I would pick the method that stretches linearly between the min and max intensity. This method uses the output range best, and therefore introduces the least quantization error.
When comparing GLCM results across images, it is best if all images are stretched the same way. I would select a global min and max, and keep those constant for all images in the set.
Note that for other purposes, the answer will be different.
The second approach will always use the full range between 0 and 1, which may be what you want. The first approach will always map 0 to 0.5; only when the data is spread symmetrically around 0 will it also use the full range between 0 and 1.
It is up to you to decide which behavior you want.
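A small MATLAB sketch contrasting the two methods on data with negative values (X here is illustrative random data; for a matrix, the min/max are taken over all elements via X(:); globalMin and globalMax are hypothetical precomputed bounds):

X = 3 * randn(256);                       % example data with negative values

% a) symmetric scaling: 0 always maps to 0.5
Xa = 0.5 * (X / max(abs(X(:))) + 1);

% b) linear min-max stretch: always uses the full [0, 1] range
Xb = (X - min(X(:))) / (max(X(:)) - min(X(:)));

% For a set of images, stretch all of them with the same global bounds:
% Xb = (X - globalMin) / (globalMax - globalMin);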

Showing two images with the same colorbar in log

I have two sparse matrices "Matrix1" and "Matrix2" of the same size p x n.
By sparse matrix I mean that it contains a lot of exactly zero elements.
I want to show the two matrices under the same colormap and a unique colorbar. Doing this in MATLAB is straightforward:
bottom = min(min(min(Matrix1)),min(min(Matrix2)));
top = max(max(max(Matrix1)),max(max(Matrix2)));
subplot(1,2,1)
imagesc(Matrix1)
colormap(gray)
caxis manual
caxis([bottom top]);
subplot(1,2,2)
imagesc(Matrix2)
colormap(gray)
caxis manual
caxis([bottom top]);
colorbar;
My problem:
When I display a matrix with imagesc(Matrix), the low-level noise (or background) is not visible, whereas it always shows up with imagesc(10*log10(Matrix)).
That is why I want to show the 10*log10 of the matrices. But in that case the minimum value will be -Inf, since the matrices are sparse, and caxis will give an error because bottom is equal to -Inf.
What do you suggest? How can I modify the above code?
Any help will be much appreciated!
A very important point is that the minimum value in your matrix will always be 0. Leveraging this, a very simple way to address your problem is to add 1 inside the log operation, so that values that map to 0 in the original matrix also map to 0 after the log. This avoids the -Inf problem you're encountering. In fact, this is a very common way of visualizing the Fourier transform. Adding 1 inside the logarithm ensures that the output has no negative values, while the shape of the curve remains intact: the effect is simply a translation of the curve by 1 unit to the left.
Therefore, simply do imagesc(10*log10(1 + Matrix));, then the minimum is always bounded at 0 while the maximum is unbounded but subject to the largest value that is seen in Matrix.
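Applied to the code above, a sketch of the modified version (only the displayed data and the color limits change; Log1 and Log2 are just local names introduced here):

Log1 = 10*log10(1 + Matrix1);   % minimum is now 0 instead of -Inf
Log2 = 10*log10(1 + Matrix2);
bottom = min(min(Log1(:)), min(Log2(:)));
top    = max(max(Log1(:)), max(Log2(:)));
subplot(1,2,1)
imagesc(Log1)
colormap(gray)
caxis([bottom top]);
subplot(1,2,2)
imagesc(Log2)
colormap(gray)
caxis([bottom top]);
colorbar;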

Having more control over the thinning algorithm?

I have some text documents where I want to thin the text to varying widths, such as 2-pixel-wide strokes, 4-pixel-wide strokes, and so on.
I know that MATLAB already has a thinning algorithm in bwmorph, and one can get one-pixel-wide thinning by using
thinned = bwmorph(bw_image, 'thin', 'n=Inf');
But this thins the image to a 1-pixel width. Changing the value of n does not produce the desired result. Is there any way I can ensure thinning to an n-pixel width?
You could always thin the characters first, then artificially expand their skeletons using morphology. For expanding, morphological dilation is the most suitable operation. As such, thin the characters using the standard thinning algorithm, then dilate the result with a structuring element of a suitable size. The size of the structuring element dictates how thick the thinned result becomes.
To further exemplify my point, here's an example with an image I found on Google:
Reading this in with MATLAB and converting to binary:
im = im2bw(imread('https://lh3.ggpht.com/aWaaZ-BsAXSYyyHRlube_NkiB-Q-FDx-Wpgg8qi5jqrNvAvNp87amEwSUNr7PdbCizY=w300'));
This is what we get:
Performing a thinning gives us:
thinned = bwmorph(im, 'thin', 'n=Inf');
If you want to increase the thickness of the thinning result so that each stroke is n pixels thick, use a basic n x n square structuring element with the imdilate function, which performs morphological dilation on binary images. In general, to give the text an overall stroke thickness of n pixels, choose the size of the square structuring element to be n.
Here are some examples of what I have discussed above.
n = 2
This would increase the thinning to be 2 pixels wide:
se = strel('square', 2);
expand = imdilate(thinned, se);
imshow(expand);
The function strel defines different structuring elements; we choose the square one via the 'square' flag. Dilating the thinned image that you see above, we get:
n = 5
Simply change the structuring element's size to 5 x 5, and we get:
se = strel('square', 5);
expand = imdilate(thinned, se);
imshow(expand);
If you take any of the results and zoom into the text, you will see that the width of each stroke is indeed either 2 or 5 pixels. However, the assumption with the above code is that each character is sufficiently separated to allow the variable thickness of each stroke to be maintained. Should the characters be very close together, then dilation will merge these text characters together... but the thinning algorithm will most likely give you bad results even before dilation.
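Putting the steps together for an arbitrary n, a sketch reusing im from above (n = 3 is just an example value):

n = 3;                                    % desired stroke width in pixels
thinned = bwmorph(im, 'thin', 'n=Inf');   % thin strokes down to 1 pixel
se = strel('square', n);                  % n x n square structuring element
expand = imdilate(thinned, se);           % thicken skeletons to n pixels
imshow(expand);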

Looking for an algorithm (version of 2-dimensional binary search)

Easy problem and known algorithm:
I have a big array with 100 members. The first X members are 0, and the rest are 1. Find X.
I am solving it with a binary search: check member 50; if it is 0, check member 75; and so on, until I find adjacent 0 and 1.
I am looking for an optimized algorithm for the same problem in two dimensions:
I have a 2-dimensional 100*100 array. The members that are on rows 0-X AND on columns 0-Y are 0, and the rest are 1. How do I find X and Y?
Edit: The optimal solution consists of two simple binary searches.
I'm very sorry for the long and convoluted post I wrote below. What the problem fundamentally consists of is finding a point in a space that contains 100*100 elements. The best you can do is to divide this space in two at each step. You can do it in a convoluted way (the one I describe in the rest of the post), but if you realize that a binary search on the X axis still divides the search space in two at each step (and the same goes for the Y axis), then you understand that it's optimal.
I'll still leave what I wrote, and I'm sorry that I made some peremptory claims in it.
If you're looking for a simple algorithm (though not an optimal one), just run the binary search twice as suggested.
However, if you want an optimal algorithm, you can look for the boundary on X and on Y at the same time. (Note that the two algorithms have the same asymptotic complexity, but the optimal one will still be faster.)
In all the following graphics, the point (0, 0) is in the bottom left corner.
Basically, when you choose a point and get the result, you cut your space into two parts. When you think about it, that is actually the largest amount of information you can extract from a single query.
If you choose a point (the black cross) and the result is 1 (red lines), the point you're looking for cannot be in the gray space (and thus must be in the remaining white area).
On the other hand, if the value is 0 (blue lines), the point you're looking for cannot be in the gray area (and thus must be in the remaining white area).
So, if you get one 0 result and one 1 result, this is what you'll get:
The point you're looking for is either in rectangle 1, 2 or 3. You just need to check the two corners of rectangle 3 to know which of the 3 rectangles is the right one.
So the algorithm is the following:
Note the bottom-left and top-right corners of the rectangle you're working with.
Do a binary search along the diagonal of the rectangle until you've hit at least one 1 result and one 0 result.
Check the 2 other corners of rectangle 3 (you'll necessarily already know the values of the two corners on the diagonal). It is possible to check only one corner to identify the right rectangle (but you'll have to check both corners if the right rectangle is rectangle 3).
Determine whether the point you're looking for is in rectangle 1, 2 or 3.
Repeat, reducing the problem to the right rectangle, until the final rectangle is reduced to a point: that point is the value you're looking for.
Edit: if you want full optimality, note that when you choose the point (50, 50), you do not cut the space into equal parts: one part is three times bigger than the other. Ideally, you would choose a point that cuts the space into two regions of equal area.
You should compute once, at the beginning, the value factor = (1.0 - 1.0/sqrt(2.0)). Then, when you want to cut between values a and b, choose the cutting point a + factor*(b-a). When you cut the initial 100x100 rectangle at the point (100*factor, 100*factor), the quadrant excluded by a 1 result has area (100 - 100*factor)^2 = (100/sqrt(2))^2 = (100*100)/2, so either outcome leaves exactly half of the area and convergence is quicker.
Run your binary search twice. Since the zero block (when non-empty) starts at row 0 and column 0, first determine Y by running a binary search along the first row, then determine X by running one along the first column.
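A sketch of this two-pass approach in MATLAB, using 1-based indices (A is the 0/1 matrix; lastZero is a helper name introduced here for the 1D case):

function [X, Y] = findCorner(A)
% A is a 0/1 matrix where A(i, j) == 0 iff i <= X and j <= Y.
Y = lastZero(A(1, :));   % boundary column, found along the first row
X = lastZero(A(:, 1));   % boundary row, found along the first column
end

function k = lastZero(v)
% Index of the last 0 in a vector of the form [0 ... 0 1 ... 1]
% (k == 0 if the vector starts with a 1).
lo = 0; hi = numel(v);
while lo < hi
    mid = ceil((lo + hi) / 2);
    if v(mid) == 0
        lo = mid;       % the last 0 is at mid or later
    else
        hi = mid - 1;   % the last 0 is before mid
    end
end
k = lo;
end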
Simple solution: go first in X-direction and then in Y-direction.
Check (0,50); if it is 0, check (0,75); and so on, until you find adjacent 0 and 1. Then go in the Y direction from there.
Second solution:
Check member (50,50). If it is 1, check (25,25), and so on, until you find 0. Continue until you find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1) and (X+1,X). Neither or one of them will be 1. If neither, you are finished. If only one, say (X+1,X), then you know that the box's size is between (X+1,X) and (100,X). Use binary search to find the box's height.
EDIT: As Chris pointed out, it seems that the simple approach is faster.
Second solution (modified):
Check member (50,50). If it is 1, check (25,25), and so on, until you find 0. Continue until you find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1). If it is 1, do a binary search on the line (X,X+1)...(X,100); else do a binary search on the line (X,X)...(100,X).
Even then, I am probably beating a dead horse here. If this is faster, it is only by a negligible amount. This is just for theoretical fun. :)
EDIT 2: As Fezvez and Chris put it, binary search divides the search space in two most efficiently; my approach divides the area into 1/4 and 3/4 pieces. Fezvez pointed out that this could be remedied by calculating the dividing factor beforehand (but that would be extra calculation). In the modified version of my algorithm I choose the direction to go in (X or Y), which effectively also divides the search space in two, and then conduct a binary search. To conclude, this approach will always be a bit slower (and more complicated to implement).
Thank you, Igor Oks, for an interesting question. :)
Use binary search on both dimensions and the 1D case:
Start with j=50. Now the 1-D array obtained by varying i is of the desired form, so find X from the 1D case.
If X = 100 (i.e. the column contains no ones), then set j=75 (the middle of the remaining range in the j dimension) and repeat.
If X < 100, then you have found it. All that is left is to fix i=X and find Y from the 1D case.

Parabolic knapsack

Let's say I have a parabola. Now I also have a bunch of sticks that are all of the same width (yes, my drawing skills are amazing!). How can I stack these sticks within the parabola so as to minimize the space used as much as possible? I believe this falls under the category of knapsack problems, but this Wikipedia page doesn't appear to bring me closer to a real-world solution. Is this an NP-hard problem?
In this problem we are trying to minimize the area consumed (e.g. the integral), which includes vertical area.
I cooked up a solution in JavaScript using processing.js and HTML5 canvas.
This project should be a good starting point if you want to create your own solution. I added two algorithms: one that sorts the input blocks from largest to smallest, and another that shuffles the list randomly. Each item is then placed by trying buckets from the bottom (the smallest bucket) upwards until one has enough space to fit it.
Depending on the type of input, the sort algorithm can give good results in O(n^2). Here's an example of the sorted output.
Here's the insert in order algorithm.
function solve(buckets, input) {
    var buckets_length = buckets.length,
        results = [];
    for (var b = 0; b < buckets_length; b++) {
        results[b] = [];
    }
    // Sort blocks from largest to smallest.
    input.sort(function(a, b) { return b - a; });
    input.forEach(function(blockSize) {
        // Try buckets from the bottom of the parabola upwards.
        var b = buckets_length - 1;
        while (b > 0) {
            if (blockSize <= buckets[b]) {
                results[b].push(blockSize);
                buckets[b] -= blockSize;
                break;
            }
            b--;
        }
    });
    return results;
}
Project on github - https://github.com/gradbot/Parabolic-Knapsack
It's a public repo so feel free to branch and add other algorithms. I'll probably add more in the future as it's an interesting problem.
Simplifying
First, I want to simplify the problem. To do that:
I switch the axes and add them to each other; this results in 2x growth.
I assume the parabola is on a closed interval [a, b], where a = 0 and, for this example, b = 3.
Let's say you are given b (the right end of the interval) and w (the width of a segment); then you can find the total number of segments as n = Floor[b/w]. In this case there is a trivial way to maximize the Riemann sum, and the function giving the i'th segment height is f(b - b*i/(n+1)). Actually, this is an assumption and I'm not 100% sure.
A maxed-out example for 17 segments on the closed interval [0, 3] for the function Sqrt[x], real values:
The segment-height function in this case is Re[Sqrt[3 - 3*Range[1,17]/18]], and the values are:
Exact form:
{Sqrt[17/6], 2 Sqrt[2/3], Sqrt[5/2],
Sqrt[7/3], Sqrt[13/6], Sqrt[2],
Sqrt[11/6], Sqrt[5/3], Sqrt[3/2],
2/Sqrt[3], Sqrt[7/6], 1, Sqrt[5/6],
Sqrt[2/3], 1/Sqrt[2], 1/Sqrt[3],
1/Sqrt[6]}
Approximated form:
{1.6832508230603465, 1.632993161855452, 1.5811388300841898, 1.5275252316519468,
1.4719601443879744, 1.4142135623730951, 1.35400640077266, 1.2909944487358056,
1.224744871391589, 1.1547005383792517, 1.0801234497346435, 1,
0.9128709291752769, 0.816496580927726, 0.7071067811865475,
0.5773502691896258, 0.4082482904638631}
What you have arrived at is a bin-packing problem with a partially filled bin.
Finding b
Suppose b is unknown, or our task is to find the smallest possible b under which all the sticks from the initial bunch fit. Then we can at least bound the values of b:
lower limit: the sum of the segment heights equals the sum of the stick heights
upper limit: the number of segments equals the number of sticks, and the longest stick is shorter than the longest segment height
One of the simplest ways to find b is to take a pivot midway between the lower and upper limits and check whether a solution exists. The pivot then becomes the new upper or lower limit, and you repeat the process until the required precision is met; a bisection sketch follows the example below.
When you are looking for b you do not need an exact result, only a suboptimal one, and it is much faster if you use an efficient heuristic to find a pivot point relatively close to the actual b.
For example:
sort the sticks by length, largest to smallest
put each stick into the first bin it fits in, starting with the largest
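A bisection sketch in MATLAB for finding b (fitsAll is a hypothetical predicate reporting whether every stick can be packed when the interval is [0, b], e.g. via the greedy first-fit above):

function b = smallestB(sticks, lo, hi, tol)
% Bisection between the lower and upper limits described above.
while hi - lo > tol
    mid = (lo + hi) / 2;
    if fitsAll(sticks, mid)
        hi = mid;    % a packing exists: try a smaller b
    else
        lo = mid;    % no packing: b must be larger
    end
end
b = hi;              % smallest b (within tol) for which a packing was found
end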
This is equivalent to having multiple knapsacks (assuming these blocks are the same 'height', this means there's one knapsack for each 'line'), and is thus an instance of the bin packing problem.
See http://en.wikipedia.org/wiki/Bin_packing
How can I stack these sticks within the parabola such that I am minimizing the (vertical) space it uses as much as possible?
Just deal with it like any other Bin Packing problem. I'd throw meta-heuristics on it (such as tabu search, simulated annealing, ...) since those algorithms aren't problem specific.
For example, I'd start from my Cloud Balance example (= a form of bin packing) in Drools Planner. If all the sticks have the same height and there's no vertical space between 2 sticks on top of each other, there's not much I'd have to change:
Rename Computer to ParabolicRow. Remove its properties (cpu, memory, bandwidth). Give it a unique level (where 0 is the lowest row). Create a number of ParabolicRows.
Rename Process to Stick
Rename ProcessAssignement to StickAssignment
Rewrite the hard constraints so they check that there's enough room for the sum of all Sticks assigned to a ParabolicRow.
Rewrite the soft constraints to minimize the highest level of all ParabolicRows.
I'm very sure it is equivalent to bin-packing:
informal reduction
Let x be the width of the widest row; make the bins 2x big and create for every row a placeholder element of size 2x - rowWidth. Then no two placeholder elements can be packed into one bin.
To reduce bin-packing to parabolic knapsack, you just create placeholder elements of size width - binsize for all rows that are bigger than the needed bin size. Furthermore, add placeholders that fill the whole row for all rows that are smaller than binsize.
This would obviously mean your problem is NP-hard.
For other ideas, maybe look here: http://en.wikipedia.org/wiki/Cutting_stock_problem
Most likely this is the 0-1 knapsack problem or a bin-packing problem. These are NP-hard problems that I most likely don't fully understand and can't explain to you, but you can approximate them with greedy algorithms. Here is a useful article about it, http://www.developerfusion.com/article/5540/bin-packing, which I used to write my bin-packing PHP class at phpclasses.org.
Props to those who mentioned that the levels could be at varying heights (e.g., assuming the sticks are 1 unit 'thick', level 1 could go from 0.1 units to 1.1 units, or from 0.2 to 1.2 units instead).
You could of course expand the "multiple bin packing" methodology and test arbitrarily small increments (e.g., run the multiple bin-packing methodology with levels starting at 0.0, 0.1, 0.2, ..., 0.9) and then choose the best result. But it seems you would get stuck calculating for an infinite amount of time unless you had some methodology to verify that you had gotten it 'right' (or, more precisely, that you had all the 'rows' correct as to what they contain, at which point you could shift them down until they meet the edge of the parabola).
Also, the OP did not specify that the sticks had to be laid horizontally - although perhaps the OP implied it with those sweet drawings.
I have no idea how to solve such a problem optimally, but I bet there are certain cases where you could place sticks randomly, then test whether they are 'inside' the parabola, and this would beat any of the methodologies relying only on horizontal rows.
(Consider the case of a narrow parabola that we are trying to fill with 1 long stick.)
I say just throw them all in there and shake them ;)
