Image Segmentation using Mean Shift explained - algorithm
Could anyone please help me understand how Mean Shift segmentation actually works?
Here is an 8x8 matrix that I just made up:
103 103 103 103 103 103 106 104
103 147 147 153 147 156 153 104
107 153 153 153 153 153 153 107
103 153 147 96 98 153 153 104
107 156 153 97 96 147 153 107
103 153 153 147 156 153 153 101
103 156 153 147 147 153 153 104
103 103 107 104 103 106 103 107
Using the matrix above is it possible to explain how Mean Shift segmentation would separate the 3 different levels of numbers?
The basics first:
The Mean Shift segmentation is a local homogenization technique that is very useful for damping shading or tonality differences in localized objects.
An example is better than many words:
Action: replaces each pixel with the mean of those pixels that lie in a range-r neighborhood and whose value is within a distance d of its own.
Mean Shift usually takes 3 inputs:
A distance function for measuring distances between pixels. Usually the Euclidean distance, but any other well-defined distance function could be used. The Manhattan distance is sometimes another useful choice.
A radius. All pixels within this radius (measured according to the above distance) will be taken into account in the calculation.
A value difference. From all pixels inside radius r, we will take only those whose values are within this difference when calculating the mean.
Please note that the algorithm is not well defined at the borders, so different implementations will give you different results there.
I'll NOT discuss the gory mathematical details here, as they are impossible to show without proper mathematical notation, not available in StackOverflow, and also because they can be found from good sources elsewhere.
Let's look at the center of your matrix:
153 153 153 153
147 96 98 153
153 97 96 147
153 153 147 156
With reasonable choices for radius and distance, the four center pixels will get the value of 97 (their mean, 96.75, rounded) and will be different from the adjacent pixels.
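The action described above can be sketched in a few lines of Python. This is only an illustration, not what Mathematica does internally: the function name `mean_shift_filter`, the square (Chebyshev) window, the clipping of the window at the borders, and the parameter values `radius=1` and `value_diff=10` are all assumptions made for this example.

```python
def mean_shift_filter(image, radius, value_diff):
    """One pass: replace each pixel by the mean of nearby pixels with a similar value."""
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = image[r][c]
            # Assumption: square window of the given radius, clipped at the borders.
            similar = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if abs(image[rr][cc] - center) <= value_diff
            ]
            # The pixel itself always qualifies, so the list is never empty.
            result[r][c] = sum(similar) / len(similar)
    return result


matrix = [
    [103, 103, 103, 103, 103, 103, 106, 104],
    [103, 147, 147, 153, 147, 156, 153, 104],
    [107, 153, 153, 153, 153, 153, 153, 107],
    [103, 153, 147, 96, 98, 153, 153, 104],
    [107, 156, 153, 97, 96, 147, 153, 107],
    [103, 153, 153, 147, 156, 153, 153, 101],
    [103, 156, 153, 147, 147, 153, 153, 104],
    [103, 103, 107, 104, 103, 106, 103, 107],
]

filtered = mean_shift_filter(matrix, radius=1, value_diff=10)
# The four center pixels all become the mean of 96, 98, 97, 96.
print(filtered[3][3], filtered[3][4], filtered[4][3], filtered[4][4])
# prints: 96.75 96.75 96.75 96.75
```

Calling the function again on its own output corresponds to iterating the filter.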
Let's calculate it in Mathematica. Instead of showing the actual numbers, we will display a color coding, so it's easier to understand what is happening:
The color coding for your matrix is:
Then we take a reasonable Mean Shift:
MeanShiftFilter[a, 3, 3]
And we get:
Where all center elements are equal (to 97, BTW).
You may iterate several times with Mean Shift, trying to get a more homogeneous coloring. After a few iterations, you arrive at a stable non-isotropic configuration:
At this time, it should be clear that you can't select how many "colors" you get after applying Mean Shift. So, let's show how to do it, because that is the second part of your question.
To be able to set the number of output clusters in advance, you need something like K-means clustering.
It runs this way for your matrix:
b = ClusteringComponents[a, 3]
{{1, 1, 1, 1, 1, 1, 1, 1},
{1, 2, 2, 3, 2, 3, 3, 1},
{1, 3, 3, 3, 3, 3, 3, 1},
{1, 3, 2, 1, 1, 3, 3, 1},
{1, 3, 3, 1, 1, 2, 3, 1},
{1, 3, 3, 2, 3, 3, 3, 1},
{1, 3, 3, 2, 2, 3, 3, 1},
{1, 1, 1, 1, 1, 1, 1, 1}}
Or:
Which is very similar to our previous result, but as you can see, now we have only three output levels.
HTH!
A Mean-Shift segmentation works something like this:
The image data is converted into feature space
In your case, all you have are intensity values, so feature space will only be one-dimensional. (You might compute some texture features, for instance, and then your feature space would be two dimensional – and you’d be segmenting based on intensity and texture)
Search windows are distributed over the feature space
The number of windows, window size, and initial locations are arbitrary for this example – something that can be fine-tuned depending on specific applications
Mean-Shift iterations:
1.) The MEANs of the data samples within each window are computed
2.) The windows are SHIFTed to the locations equal to their previously computed means
Steps 1.) and 2.) are repeated until convergence, i.e. all windows have settled on final locations
The windows that end up on the same locations are merged
The data is clustered according to the window traversals
... e.g. all data that was traversed by windows that ended up at, say, location “2”, will form a cluster associated with that location.
So, this segmentation will (coincidentally) produce three groups. Viewing those groups in the original image format might look something like the last picture in belisarius' answer. Choosing different window sizes and initial locations might produce different results.
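The steps above can be sketched for a one-dimensional feature space (intensity only). This is a minimal sketch under stated assumptions: one flat search window is started at every sample, windows are merged when their final positions are closer than half the bandwidth, and the names `mean_shift_1d`, `bandwidth` and `tol` are made up for this example.

```python
def mean_shift_1d(samples, bandwidth, tol=1e-9, max_iter=100):
    """Cluster 1-D samples by shifting a flat window to the mean of the samples it covers."""
    final_positions = []
    for start in samples:
        center = float(start)
        for _ in range(max_iter):
            window = [x for x in samples if abs(x - center) <= bandwidth]
            new_center = sum(window) / len(window)      # 1.) the MEAN
            if abs(new_center - center) < tol:
                break                                   # the window has settled
            center = new_center                         # 2.) the SHIFT
        final_positions.append(center)

    # Merge windows that ended up at (nearly) the same location.
    modes, labels = [], []
    for position in final_positions:
        for index, mode in enumerate(modes):
            if abs(position - mode) <= bandwidth / 2:   # assumed merge tolerance
                labels.append(index)
                break
        else:
            modes.append(position)
            labels.append(len(modes) - 1)
    return modes, labels


modes, labels = mean_shift_1d([1, 2, 3, 10, 11, 12, 20, 21, 22], bandwidth=3)
print(modes)   # [2.0, 11.0, 21.0]
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

To try it on the 8x8 matrix, pass the 64 intensities as a flat list. How many groups come out depends on the chosen bandwidth, in line with the remark about window sizes above.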
Related
Efficient way to iterate over Gray code change positions
There are a number of ways of iterating over n-bit Gray codes, some more efficient than others. However, I don't actually need the Gray codes themselves; instead I would like to iterate over the index of the bit that changes between consecutive entries of a Gray code list. For example, take this 3-bit Gray code list: 000, 001, 011, 010, 110, 111, 101, 100. I would like to output 3, 2, 3, 1, 3, 2, 3. This tells us we needed to change bits 3, 2, 3, etc. in order to get the list. Here I am indexing from 1 and from the left. One way to do this would be to compute the Gray codes in order and, for each consecutive pair (x, y), compute (x XOR y) to identify which bit changed and then take the integer log base 2 of (x XOR y). However, I need the iteration to be as fast as possible, and my interest is in 30-40 bit Gray codes. Is there an efficient way to do this?
If you number the bits starting with 0 for least significant, the position of the bit to change to increase a binary-reflected Gray code is the position of the lowest bit set in an increasing binary number (see end of the Wikipedia paragraph you linked). To get the numbering you presented, subtract from 3 (that is, from the number of bit positions).

    binary-reflected ("Gray")       000  001  011  010  110  111  101  100
    binary                               001  010  011  100  101  110  111
    pos of least significant 1             0    1    0    2    0    1    0
      (count of trailing zeros, ctz)
    3 - ctz(binary)                        3    2    3    1    3    2    3
If you're working with, e.g., C with GCC intrinsics (and you definitely should be using a language that gives you fine control over the assembly output so that you can vectorize), then you can do

    long long ctr = 0LL;
    int next() { return __builtin_ctzll(++ctr); }

This returns 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, ... by counting the trailing zeroes of the counter ctr. Translate as appropriate.
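For readers who want to check the idea before writing the fast version, here is a small Python sketch of the same trailing-zero trick, using the asker's numbering (1-based, counted from the left). The function name `gray_flip_positions` is made up for this example, and Python is of course not the language to use when speed matters.

```python
def gray_flip_positions(n_bits):
    """Yield the bit (1 = leftmost) that changes at each step of the n-bit reflected Gray code."""
    for counter in range(1, 1 << n_bits):
        # counter & -counter isolates the lowest set bit; its bit_length - 1 is the ctz.
        trailing_zeros = (counter & -counter).bit_length() - 1
        yield n_bits - trailing_zeros


print(list(gray_flip_positions(3)))  # [3, 2, 3, 1, 3, 2, 3]
```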
LMC program to find the difference between double the median and the smallest of 3 inputs?
I want to write an LMC program to find the difference between twice the median and the smallest of 3 distinct inputs efficiently. I would like some help in figuring out an algorithm for this. Here is what I have so far:

    INPUT 901 - Input first
    STO 399 - Store in 99 (a)
    INPUT 901 - Input second
    STO 398 - Store in 98 (b)
    INPUT 901 - Input third
    STO 397 - Store in 97 (c)
    LOAD 597 - Load 97 (a)
    SUB 298 - Subtract 97 - 98 (a - b)
    BRP 8xx - If value positive go to xx (if value is positive a > b else b > a)
    LOAD 598 - Load 98 (b)
    SUB 299 - Subtract 98 - 99 (b - c)
    BRP 8xx - If value positive go to xx (if value is positive b > c else c > b)
    LOAD 598 - Load 98 (b) which is the median
    ADD 198 - Double to get "twice the median"

I realized at the end of the snippet that I didn't know which input was the smallest and was assuming the inputs were already sorted (which they aren't). I think I will need to somehow sort the inputs from smallest to largest to do this efficiently and determine the smallest input and the median within the same branch.
I don't know the Little Man Computer language, but it doesn't matter: this is an algorithm question. First of all, you mixed up the names of the three parameters (first you said that 99 was a, then you said 97 was a). Store the three inputs in 99, 98 and 97 (say a, b, c). Then load 99 and subtract 98 from it. If the result is positive (99 is greater than 98), swap 98 and 99, so that the smaller of the two is in location 99. Now load 98 and subtract 97 from it. If the result is positive, swap 97 and 98, so that the smaller of the two is in location 98; the largest of the three is then in location 97. At this point the two smallest numbers, that is, the smallest and the median, are in locations 98 and 99. Load 99 and subtract 98 from it. If the result is positive, 99 contains the median and 98 the smallest; otherwise it is the other way round. Now you can double the median and subtract the smallest from it.
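The three compare-and-swap steps can be written out in Python to check the logic before translating it to LMC. This is only a sketch of the algorithm; the function name is made up, and the variables `a`, `b`, `c` stand for mailboxes 99, 98 and 97.

```python
def twice_median_minus_smallest(a, b, c):
    """Return 2 * median - smallest for three distinct numbers."""
    if a > b:            # compare 99 with 98, keep the smaller in 99
        a, b = b, a
    if b > c:            # compare 98 with 97, keep the smaller in 98 (97 now holds the largest)
        b, c = c, b
    if a > b:            # order the two remaining values: a = smallest, b = median
        a, b = b, a
    return 2 * b - a


print(twice_median_minus_smallest(3, 1, 2))  # 3, since 2 * 2 - 1 = 3
```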
Slideshow Algorithm
I need to design an algorithm for a photo slideshow that is constantly receiving new images, so that the oldest pictures appear less often in the presentation until a balance is reached between the old photos and the ones that have newly appeared. I have thought that every image could have a counter of the number of times it has been shown, and that the pictures with the lowest value in that counter could be prioritized. Any other ideas or solutions would be well received.
You can achieve an overall near-uniform distribution (each image appears about the same number of times in the long run), but I wouldn't recommend doing it. Images that were available early would appear very, very rarely later on. A better user experience would be to simply choose a random image from all the available images at each step.

If you still want a near-uniform distribution in the long run, you should set the probability for any image based on the number of times it has appeared so far. For example:

    p(i) = 1 - count(i) / (max_count() + epsilon)

Here is some simple R code that simulates such a process. 37 random images are selected before a new image becomes available. This process is repeated 3000 times:

    h <- 3000               # total images
    eps <- 0.001
    t <- integer(length=h)  # t[i]: no. of instances of value i in r
    r <- c()                # produced vector of indexes of images
    m <- 0                  # highest number of appearances for an image
    for (i in 1:h)
      for (j in 1:37)       # select 37 random images in range 1..i
      {
        v <- sample(1:i, 1, prob=1-t[1:i]/(m+eps))  # select image i with weight 1-t[i]/(m+eps)
        r <- c(r, v)        # add to output vector
        t[v] <- t[v]+1      # update appearances count
        m <- max(m, t[v])   # update highest number of appearances
      }
    plot(table(r))

The output plot shows the number of times each image appeared:

[Plots: epsilon = 0.001; epsilon = 0.0001]

If we look, for example, at the indexes in the output vector at which, say, image #3 was selected:

    > which(r==3)
     [1]    75    76    77    78    79    80    81    82    83    84    85    86    87    88    89    90    91    92    93    94
    [21]    95    96    97    98    99   100   101   102   103   104   105   106   107   108   109   110   111  1189 34767 39377
    [41] 70259

Note that if epsilon is very small, the sequence will seem less random (newer images are much preferred). In the long run, however, any epsilon will do.
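For use inside an application rather than a simulation, the weighting formula can be wrapped in a small helper. This Python sketch only illustrates p(i) = 1 - count(i) / (max_count + epsilon); the names `selection_weights` and `pick_image` and the default `eps` value are assumptions of this example.

```python
import random


def selection_weights(counts, eps=0.001):
    """Weight of each image: 1 - count / (max_count + eps), always strictly positive."""
    max_count = max(counts)
    return [1 - count / (max_count + eps) for count in counts]


def pick_image(counts, rng, eps=0.001):
    """Pick an image index at random using the weights above and update its counter."""
    index = rng.choices(range(len(counts)), weights=selection_weights(counts, eps))[0]
    counts[index] += 1
    return index


rng = random.Random(42)          # seeded only to make the demonstration repeatable
counts = [0, 0, 0]               # one counter per available image
shown = [pick_image(counts, rng) for _ in range(9)]
counts.append(0)                 # a new image arrives with a zero counter
shown += [pick_image(counts, rng) for _ in range(9)]
print(counts)
```

A newly appended image starts at zero, so it gets the largest weight until its counter catches up, which is the behaviour the simulation above shows for image #3.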
Instead of a view counter, you could also try basing your algorithm on the timestamp that images were uploaded.
Need to find lowest differences between first line of an array and the rest ones
Well, I've been given a number of pairs of elements (s,h), where each pair places an element h on the s-th line of a 2D array. The lines need not all have the same number of elements; it is only known that there cannot be more than N elements on a line. What I want to do is find the lowest biggest difference (!) between a certain element of the first line and the rest of the lines. Thus, if I have 3 lines with (101,92) (100,25,95,52,101) (93,108,0,65,200), what I want to find is 3, because I have to choose 92, which gives 95-92=3 from the first line to the second and 93-92=1 from the first to the third. I have reached a point where it is certain that if I have s lines with n(i) elements each and i=0..s, then n0<=n1<=...<=ns, so as to have a good average-performance scenario when picking the best fit from the first line towards the others. However, I cannot think of a way that is better than O(n^2), or maybe even O(n^3) in some cases. Does anyone have a suggestion for a substantially better way to do this?
Combine all lines into a single list, also keeping track of which line each element comes from.
Sort this list.
Have a last-value variable for each line.
For each item in the sorted list, update the last-value variable of the line it came from. Then:
If not all lines have a last-value set yet, do nothing.
If it's an element from the first list: recalculate the biggest difference over all of the last-value variables. Store this difference.
If it's an element from any other list: if this is the first time all values have been set, calculate the biggest difference. Otherwise, if the difference between the first list's last-value and this element is bigger than the biggest difference, update the biggest difference with this difference. Store this difference.
The smallest stored difference is the desired value.

Example:

Lists: (101,92) (100,25,95,52,101) (93,108,0,65,200)

    Sorted     0   25   52   65   92   93   95  100  101  101  108  200
    Source     2    1    1    2    0    2    1    1    0    1    2    2
    Last[0]    -    -    -    -   92   92   92   92  101  101  101  101
    Last[1]    -   25   52   52   52   52   95  100  100  101  101  101
    Last[2]    0    0    0   65   65   93   93   93   93   93  108  200
    Diff       -    -    -    -   40   41    3    8    8    8    7   99
    Best       -    -    -    -   40   40    3    3    3    3    3    3

Best = 3, as required. Storing the actual items or finding them afterwards should be easy enough.

Complexity: let n be the total number of items and k the number of lists.
O(n log n) for the combine + sort.
O(nk) (worst case) for the scan, since we're checking n items and, at each item, we do at most O(k) work.
So O(n log n + nk).
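As a cross-check for small inputs, the expected answer can also be computed more directly. Note that the Python sketch below is not the merged scan described above: it sorts every other line once and then, for each element of the first line, uses a binary search to find the nearest element in each of those lines. The function name is made up, and it assumes at least two non-empty lines.

```python
from bisect import bisect_left


def lowest_biggest_difference(lines):
    """Min over first-line elements of the max distance to the nearest element in each other line."""
    first = lines[0]
    others = [sorted(line) for line in lines[1:]]

    def nearest_gap(sorted_line, x):
        # Only the neighbours around the insertion point can be nearest to x.
        pos = bisect_left(sorted_line, x)
        candidates = sorted_line[max(0, pos - 1):pos + 1]
        return min(abs(x - y) for y in candidates)

    return min(max(nearest_gap(line, x) for line in others) for x in first)


lines = [(101, 92), (100, 25, 95, 52, 101), (93, 108, 0, 65, 200)]
print(lowest_biggest_difference(lines))  # 3
```

With n items in total, k lines and n0 elements on the first line, this costs O(n log n) for the sorting plus O(n0 k log n) for the searches.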
Fastest gap sequence for shell sort?
According to Marcin Ciura's Optimal (best known) sequence of increments for shell sort algorithm, the best sequence for shellsort is 1, 4, 10, 23, 57, 132, 301, 701..., but how can I generate such a sequence? In Marcin Ciura's paper, he said: Both Knuth’s and Hibbard’s sequences are relatively bad, because they are defined by simple linear recurrences. but most algorithm books I found tend to use Knuth’s sequence: k = 3k + 1, because it's easy to generate. What's your way of generating a shellsort sequence?
Ciura's paper generates the sequence empirically -- that is, he tried a bunch of combinations and this was the one that worked the best. Generating an optimal shellsort sequence has proven to be tricky, and the problem has so far been resistant to analysis. The best known increment is Sedgewick's, which you can read about here (see p. 7).
If your data set has a definite upper bound in size, then you can hardcode the step sequence. You should probably only worry about generality if your data set is likely to grow without an upper bound. The sequence shown seems to grow roughly as an exponential series, albeit with quirks. There seems to be a majority of prime numbers, but with non-primes in the mix as well. I don't see an obvious generation formula. A valid question, assuming you must deal with arbitrarily large sets, is whether you need to emphasise worst-case performance, average-case performance, or almost-sorted performance. If the latter, you may find that a plain insertion sort using a binary search for the insertion step might be better than a shellsort. If you need good worst-case performance, then Sedgewick's sequence appears to be favoured. The sequence you mention is optimised for average-case performance, where the number of comparisons outweighs the number of moves.
I would not be ashamed to take the advice given in Wikipedia's Shellsort article:

With respect to the average number of comparisons, the best known gap sequences are 1, 4, 10, 23, 57, 132, 301, 701 and similar, with gaps found experimentally. Optimal gaps beyond 701 remain unknown, but good results can be obtained by extending the above sequence according to the recursive formula h_k = \lfloor 2.25 h_{k-1} \rfloor. Tokuda's sequence [1, 4, 9, 20, 46, 103, ...], defined by the simple formula h_k = \lceil h'_k \rceil, where h'_k = 2.25 h'_{k-1} + 1, h'_1 = 1, can be recommended for practical applications.

Guessing from the pseudonym, it seems Marcin Ciura edited the WP article himself.
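Tokuda's recurrence from the quoted passage is easy to evaluate. The following Python sketch uses exact fractions so that the ceiling is not affected by floating-point rounding; the function name `tokuda_gaps` is made up for this example.

```python
from fractions import Fraction
from math import ceil


def tokuda_gaps(count):
    """First `count` gaps h_k = ceil(h'_k), with h'_k = 2.25 * h'_{k-1} + 1 and h'_1 = 1."""
    gaps = []
    h_prime = Fraction(1)
    for _ in range(count):
        gaps.append(ceil(h_prime))
        h_prime = Fraction(9, 4) * h_prime + 1   # 2.25 written as the exact fraction 9/4
    return gaps


print(tokuda_gaps(6))  # [1, 4, 9, 20, 46, 103]
```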
The sequence is 1, 4, 10, 23, 57, 132, 301, 701, 1750. For every next number after 1750 multiply previous number by 2.25 and round down.
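A minimal Python sketch of this rule, together with a shellsort that uses the resulting gaps, might look as follows. The names `ciura_gaps` and `shellsort` are made up for this example, and the multiplication by 2.25 is done as the integer operation `* 9 // 4` to avoid floating-point rounding.

```python
CIURA = [1, 4, 10, 23, 57, 132, 301, 701, 1750]


def ciura_gaps(n):
    """Gaps smaller than n: the listed sequence, extended by floor(2.25 * previous)."""
    gaps = list(CIURA)
    while gaps[-1] * 9 // 4 < n:
        gaps.append(gaps[-1] * 9 // 4)
    return [gap for gap in gaps if gap < n] or [1]


def shellsort(items):
    """Return a sorted copy of items, using gapped insertion sort with the gaps above."""
    a = list(items)
    for gap in reversed(ciura_gaps(len(a))):
        for i in range(gap, len(a)):
            item = a[i]
            j = i
            while j >= gap and a[j - gap] > item:
                a[j] = a[j - gap]
                j -= gap
            a[j] = item
    return a


print(ciura_gaps(10000))  # [1, 4, 10, 23, 57, 132, 301, 701, 1750, 3937, 8858]
print(shellsort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```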
Sedgewick observes that coprimality is good. This rings true: if there are separate ‘streams’ not much cross-compared until the gap is small, and one stream contains mostly smalls and one mostly larges, then the small gap might need to move elements far. Coprimality maximises cross-stream comparison. Gonnet and Baeza-Yates advise growth by a factor of about 2.2; Tokuda by 2.25. It is well known that if there is a mathematical constant between 2⅕ and 2¼ then it must† be precisely √5 ≈ 2.236. So start {1, 3}, and then each subsequent is the integer closest to previous·√5 that is coprime to all previous except 1. This sequence can be pre-calculated and embedded in code. There follow the values up to 2⁶⁴ ≈ eighteen quintillion. {1, 3, 7, 16, 37, 83, 187, 419, 937, 2099, 4693, 10499, 23479, 52501, 117391, 262495, 586961, 1312481, 2934793, 6562397, 14673961, 32811973, 73369801, 164059859, 366848983, 820299269, 1834244921, 4101496331, 9171224603, 20507481647, 45856123009, 102537408229, 229280615033, 512687041133, 1146403075157, 2563435205663, 5732015375783, 12817176028331, 28660076878933, 64085880141667, 143300384394667, 320429400708323, 716501921973329, 1602147003541613, 3582509609866643, 8010735017708063, 17912548049333207, 40053675088540303, 89562740246666023, 200268375442701509, 447813701233330109, 1001341877213507537, 2239068506166650537, 5006709386067537661, 11195342530833252689} (Obviously, omit those that would overflow the relevant array index type. So if that is a signed long long, omit the last.) On average these have ≈1.96 distinct prime factors and ≈2.07 non-distinct prime factors; 19/55 ≈ 35% are prime; and all but three are square-free (2⁴, 13·19² = 4693, 3291992692409·23³ ≈ 4.0·10¹⁶). I would welcome formal reasoning about this sequence. † There’s a little mischief in this “well known … must”. Choosing ∉ℚ guarantees that the closest number that is coprime cannot be a tie, but rational with odd denominator would achieve same. 
And I like the simplicity of √5, though other possibilities include e^⅘, 11^⅓, π/√2, and √π divided by the Chow-Robbins constant. Simplicity favours √5.
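The construction rule can be sketched in Python as below. The function name `sqrt5_coprime_gaps` is made up for this example. To keep everything exact, the sketch never computes √5 in floating point: candidates above and below previous·√5 are compared through integer squares, which works because previous·√5 is irrational and so never exactly half-way between two integers.

```python
from math import gcd, isqrt


def sqrt5_coprime_gaps(count):
    """Start {1, 3}; each next gap is the integer nearest previous*sqrt(5) coprime to all earlier gaps > 1."""
    gaps = [1, 3]
    while len(gaps) < count:
        prev = gaps[-1]
        target_sq = 5 * prev * prev        # (prev * sqrt(5)) ** 2, as an exact integer
        low = isqrt(target_sq)             # floor(prev * sqrt(5))
        high = low + 1
        while True:
            # high is nearer to the target than low iff (high + low) < 2 * prev * sqrt(5)
            if (high + low) ** 2 < 4 * target_sq:
                candidate, high = high, high + 1
            else:
                candidate, low = low, low - 1
            if all(gcd(candidate, gap) == 1 for gap in gaps[1:]):
                gaps.append(candidate)
                break
    return gaps[:count]


print(sqrt5_coprime_gaps(10))  # [1, 3, 7, 16, 37, 83, 187, 419, 937, 2099]
```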
I've found this sequence, which is similar to Marcin Ciura's: 1, 4, 9, 23, 57, 138, 326, 749, 1695, 3785, 8359, 18298, 39744, etc.
For comparison, Ciura's sequence is: 1, 4, 10, 23, 57, 132, 301, 701, 1750
Each term is the mean of the first 2^i prime numbers, truncated to an integer. Python code to find the mean of prime numbers is here:

    import numpy as np

    def isprime(n):
        ''' Check if integer n is a prime '''
        n = abs(int(n))  # n is a positive integer
        if n < 2:  # 0 and 1 are not primes
            return False
        if n == 2:  # 2 is the only even prime number
            return True
        if not n & 1:  # all other even numbers are not primes
            return False
        # Range starts with 3 and only needs to go up to the square root
        # of n for all odd numbers
        for x in range(3, int(n**0.5)+1, 2):
            if n % x == 0:
                return False
        return True

    # To apply a function to a numpy array, one has to vectorize the function
    vectorized_isprime = np.vectorize(isprime)
    a = np.arange(10000000)
    primes = a[vectorized_isprime(a)]
    #print(primes)
    for i in range(2,20):
        print(primes[0:2**i].mean())

The output is:

    4.25
    9.625
    23.8125
    57.84375
    138.953125
    326.1015625
    749.04296875
    1695.60742188
    3785.09082031
    8359.52587891
    18298.4733887
    39744.887085
    85764.6216431
    184011.130096
    392925.738174
    835387.635033
    1769455.40302
    3735498.24225

The ratio between consecutive terms slowly decreases from about 2.5 towards 2. Maybe this association could improve Shellsort in the future.
I discussed this question here yesterday, including the gap sequences I have found work best given a specific (low) n. In the middle I write:

A nasty side-effect of shellsort is that when using a set of random combinations of n entries (to save processing/evaluation time) to test gaps, you may end up with either the best gaps for n entries or the best gaps for your set of combinations - most likely the latter.

The problem lies in testing the proposed gaps such that valid conclusions can be drawn. Obviously, testing the gaps against all n! orderings of a set of n unique values is unfeasible. Testing in this manner for n=16, for example, means that 20,922,789,888,000 different combinations of n values must be sorted to determine the exact average, worst and reverse-sorted cases - just to test one set of gaps, and that set might not be the best. 2^(16-2) sets of gaps are possible for n=16, the first being {1} and the last {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1}.

To illustrate how using random combinations might give incorrect results, assume n=3, which has six different orderings: 012, 021, 102, 120, 201 and 210. You produce a set of two random sequences to test the two possible gap sets, {1} and {2,1}. Assume that these sequences turn out to be 021 and 201. For {1}, 021 can be sorted with three comparisons (02, 21 and 01) and 201 with (20, 21, 01), giving a total of six comparisons; divide by two and voilà, an average of 3 and a worst case of 3. Using {2,1} gives (01, 02, 21 and 01) for 021 and (21, 10 and 12) for 201: seven comparisons, with a worst case of 4 and an average of 3.5. The actual average and worst case for {1} are 8/3 and 3, respectively. For {2,1} the values are 10/3 and 4. The averages were too high in both cases and the worst cases were correct. Had 012 been one of the cases, {1} would have given a 2.5 average - too low.
Now extend this to finding a set of random sequences for n=16 such that no set of gaps tested will be favored in comparison with the others and the result close (or equal) to the true values, all the while keeping processing to a minimum. Can it be done? Possibly. After all, everything is possible - but is it probable? I think that for this problem random is the wrong approach. Selecting the sequences according to some system may be less bad and might even be good.
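For very small n the exact figures can be obtained by brute force, which is how the n=3 numbers above can be checked. The Python sketch below counts key comparisons of a straightforward gapped insertion sort over every permutation; the function names are made up, and the way comparisons are counted (one per test of an element against the item being inserted) is an assumption that happens to match the counts given in the example.

```python
from fractions import Fraction
from itertools import permutations


def count_comparisons(sequence, gaps):
    """Shellsort a copy of `sequence` with the given gaps and return the number of comparisons."""
    a = list(sequence)
    comparisons = 0
    for gap in gaps:
        for i in range(gap, len(a)):
            item = a[i]
            j = i
            while j >= gap:
                comparisons += 1
                if a[j - gap] > item:
                    a[j] = a[j - gap]
                    j -= gap
                else:
                    break
            a[j] = item
    return comparisons


def exact_stats(n, gaps):
    """Exact (average, worst case) number of comparisons over all n! orderings."""
    counts = [count_comparisons(p, gaps) for p in permutations(range(n))]
    return Fraction(sum(counts), len(counts)), max(counts)


print(exact_stats(3, [1]))     # (Fraction(8, 3), 3)
print(exact_stats(3, [2, 1]))  # (Fraction(10, 3), 4)
```

This is only practical for small n, since the number of orderings grows as n!.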
More information regarding jdaw1's post: Gonnet and Baeza-Yates advise growth by a factor of about 2.2; Tokuda by 2.25. It is well known that if there is a mathematical constant between 2⅕ and 2¼ then it must† be precisely √5 ≈ 2.236. It is known that √5 * √5 is 5 so I think every other index should increase by a factor of five. So first index being 1 insertion sort, second being 3 then each other subsequent is of the factor 5. There follow the values up to 2⁶⁴ ≈ eighteen quintillion. {1, 3,, 15,, 75,, 375,, 1 875,, 9 375,, 46 875,, 234 375,, 1 171 875,, 5 859 375,, 29 296 875,, 146 484 375,, 732 421 875,, 3 662 109 375,, 18 310 546 875,, 91 552 734 375,, 457 763 671 875,, 2 288 818 359 375,, 11 444 091 796 875,, 57 220 458 984 375,, 286 102 294 921 875,, 1 430 511 474 609 375,, 7 152 557 373 046 875,, 35 762 786 865 234 375,, 178 813 934 326 171 875,, 894 069 671 630 859 375,, 4 470 348 358 154 296 875,} The values in the gaps can simply be calculated by taking the value before and multiply by √5 rounding to whole numbers giving the resulting array (using 2.2360679775 * 5 ^ n * 3): {1, 3, 7, 15, 34, 75, 168, 375, 839, 1 875, 4 193, 9 375, 20 963, 46 875, 104 816, 234 375, 524 078, 1 171 875, 2 620 392, 5 859 375, 13 101 961, 29 296 875, 65 509 804, 146 484 375, 327 549 020, 732 421 875, 1 637 745 101, 3 662 109 375, 8 188 725 504, 18 310 546 875, 40 943 627 518, 91 552 734 375, 204 718 137 589, 457 763 671 875, 1 023 590 687 943, 2 288 818 359 375, 5 117 953 439 713, 11 444 091 796 875, 25 589 767 198 563, 57 220 458 984 375, 127 948 835 992 813, 286 102 294 921 875, 639 744 179 964 066, 1 430 511 474 609 375, 3 198 720 899 820 328, 7 152 557 373 046 875, 15 993 604 499 101 639, 35 762 786 865 234 375, 79 968 022 495 508 194, 178 813 934 326 171 875, 399 840 112 477 540 970, 894 069 671 630 859 375, 1 999 200 562 387 704 849, 4 470 348 358 154 296 875, 9 996 002 811 938 524 246} (Obviously, omit those that would overflow the relevant array index type. 
So if that is a signed long long, omit the last.)
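The interleaved construction above (1, then 3·5^n alternating with 3·5^n·√5 rounded to the nearest integer) can be sketched in Python as follows. The function name is made up for this example. The rounding is done with exact integer arithmetic, so for very large entries the result may differ in the low digits from values computed with a truncated decimal constant for √5.

```python
from math import isqrt


def interleaved_sqrt5_gaps(count):
    """1, then pairs (3 * 5**n, round(3 * 5**n * sqrt(5))) for n = 0, 1, 2, ..."""
    gaps = [1]
    power = 3                      # 3 * 5**n
    while len(gaps) < count:
        gaps.append(power)
        # round(power * sqrt(5)) computed exactly: floor((sqrt(20 * power**2) + 1) / 2)
        gaps.append((isqrt(20 * power * power) + 1) // 2)
        power *= 5
    return gaps[:count]


print(interleaved_sqrt5_gaps(10))  # [1, 3, 7, 15, 34, 75, 168, 375, 839, 1875]
```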