Image Neighbourhood Processing in Haskell
I'm new to Haskell, and trying to learn it by thinking in terms of image processing.
So far, I have been stuck thinking about how you would implement a neighbourhood-filtering algorithm in Haskell (or any functional programming language, really).
How would a spatial averaging filter (say 3x3 kernel, 5x5 image) be written functionally? Coming from an entirely imperative background, I can't seem to come up with a way to structure the data so the solution is elegant, or to avoid simply iterating through the image matrix, which doesn't seem very declarative.
Working with neighborhoods is easy to do elegantly in a functional language. Operations like convolution with a kernel are higher order functions that can be written in terms of one of the usual tools of functional programming languages - lists.
To write some real, useful code, we'll first play pretend to explain a library.
Pretend
You can think of each image as a function from a coordinate in the image to the value of the data held at that coordinate. This would be defined over all possible coordinates, so it would be useful to pair it with some bounds which tell us where the function is defined. This would suggest a data type like
data Image coordinate value = Image {
lowerBound :: coordinate,
upperBound :: coordinate,
value :: coordinate -> value
}
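For instance, a constant 2x2 image over Int coordinates could be built like this (a made-up value, just to show the shape of the type):

blank :: Image (Int, Int) Double
blank = Image { lowerBound = (0, 0), upperBound = (1, 1), value = const 0 }
-- value blank (1, 0) is 0; the bounds say where that lookup is meaningful.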
Haskell has a very similar data type called Array in Data.Array. This data type comes with an additional feature that the value function in Image wouldn't have - it remembers the value for each coordinate so that it never needs to be recomputed. We'll work with Arrays using three functions, which I'll describe in terms of how they'd be defined for Image above. This will help us see that even though we are using the very useful Array type, everything could be written in terms of functions and algebraic data types.
type Array i e = Image i e
bounds gets the bounds of the Array
bounds :: Array i e -> (i, i)
bounds img = (lowerBound img, upperBound img)
The ! looks up a value in the Array
(!) :: Array i e -> i -> e
img ! coordinate = value img coordinate
Finally, makeArray builds an Array
makeArray :: Ix i => (i, i) -> (i -> e) -> Array i e
makeArray (lower, upper) f = Image lower upper f
Ix is a typeclass for things that behave like image coordinates: they have a range. There are instances for most of the base types like Int, Integer, Bool, Char, etc. For example, the range of (1, 5) is [1, 2, 3, 4, 5]. There's also an instance for products or tuples of things that themselves have Ix instances; the instance for tuples ranges over all combinations of the ranges of each component. For example, range (('a',1),('c',2)) is
[('a',1),('a',2),
('b',1),('b',2),
('c',1),('c',2)]
We are only interested in two functions from the Ix typeclass, range :: Ix a => (a, a) -> [a] and inRange :: Ix a => (a, a) -> a -> Bool. inRange bounds i quickly checks whether i would appear in the result of range bounds.
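For example (results shown in the comments; the bounds here are just an illustration):

range (1, 5)        -- [1,2,3,4,5]
inRange (1, 5) 3    -- True
inRange (1, 5) 9    -- False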
Reality
In reality, makeArray isn't provided by Data.Array, but we can define it in terms of listArray, which constructs an Array from a list of items in the same order as the range of its bounds.
import Data.Array
makeArray :: (Ix i) => (i, i) -> (i -> e) -> Array i e
makeArray bounds f = listArray bounds . map f . range $ bounds
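As a quick check of makeArray (the name squares is made up for this example):

squares :: Array Integer Integer
squares = makeArray (1, 5) (\i -> i * i)
-- squares ! 4 == 16, and bounds squares == (1, 5)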
When we convolve an array with a kernel, we will compute the neighborhood by adding the coordinates from the kernel to the coordinate we are calculating. The Ix typeclass doesn't require that we can combine two indexes together. There's one candidate typeclass for "things that combine" in base, Monoid, but there aren't instances for Int or Integer or other numbers because there's more than one sensible way to combine them: + and *. To address this, we'll make our own typeclass Offset for things that combine with a new operator called .+.. Usually we don't make typeclasses except for things that have laws. We'll just say that Offset should "work sensibly" with Ix.
class Offset a where
(.+.) :: a -> a -> a
Integers, the default type Haskell uses when you write an integer literal like 9, can be used as offsets.
instance Offset Integer where
(.+.) = (+)
Additionally, pairs or tuples of things that have Offset instances can be combined component-wise.
instance (Offset a, Offset b) => Offset (a, b) where
(x1, y1) .+. (x2, y2) = (x1 .+. x2, y1 .+. y2)
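With those two instances, offsets on coordinate pairs combine as you'd expect; for example (the name below is just for illustration, and an analogous Offset Int instance could be written the same way if needed):

offsetExample :: (Integer, Integer)
offsetExample = (0, 1) .+. (2, 3)   -- (2, 4)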
We have one more wrinkle before we write convolve - how will we deal with the edges of the image? I intend to pad them with 0 for simplicity. pad background makes a version of ! that's defined everywhere: outside the bounds of the Array it returns the background.
pad :: Ix i => e -> Array i e -> i -> e
pad background array i =
if inRange (bounds array) i
then array ! i
else background
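A quick illustration of pad (the names here are made up for the example):

row :: Array Integer Integer
row = makeArray (1, 3) (* 10)   -- holds 10, 20 and 30 at indices 1, 2 and 3

outside :: Integer
outside = pad 0 row 7           -- 0, because 7 is outside the bounds (1, 3)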
We're now prepared to write a higher order function for convolve. convolve a b convolves the image b with the kernel a. convolve is higher order because each of its arguments and its result is an Array, which is really a combination of a function ! and its bounds.
convolve :: (Num n, Ix i, Offset i) => Array i n -> Array i n -> Array i n
convolve a b = makeArray (bounds b) f
where
f i = sum . map (g i) . range . bounds $ a
g i o = a ! o * pad 0 b (i .+. o)
To convolve an image b with a kernel a, we make a new image defined over the same bounds as b. Each point in the image can be computed by the function f, which sums the product (*) of the value in the kernel a and the value in the padded image b for each offset o in the range of the bounds of the kernel a.
Example
With the six declarations from the previous section, we can write the example you requested, a spatial averaging filter with a 3x3 kernel applied to a 5x5 image. The kernel a defined below is a 3x3 image that uses one ninth of the value from each of the 9 sampled neighbors. The 5x5 image b is a gradient increasing from 2 in the top left corner to 10 in the bottom right corner.
main = do
let
a = makeArray ((-1, -1), (1, 1)) (const (1.0/9))
b = makeArray ((1,1),(5,5)) (\(x,y) -> fromInteger (x + y))
c = convolve a b
print b
print c
The printed input b is
array ((1,1),(5,5))
[((1,1),2.0),((1,2),3.0),((1,3),4.0),((1,4),5.0),((1,5),6.0)
,((2,1),3.0),((2,2),4.0),((2,3),5.0),((2,4),6.0),((2,5),7.0)
,((3,1),4.0),((3,2),5.0),((3,3),6.0),((3,4),7.0),((3,5),8.0)
,((4,1),5.0),((4,2),6.0),((4,3),7.0),((4,4),8.0),((4,5),9.0)
,((5,1),6.0),((5,2),7.0),((5,3),8.0),((5,4),9.0),((5,5),10.0)]
The convolved output c is
array ((1,1),(5,5))
[((1,1),1.3333333333333333),((1,2),2.333333333333333),((1,3),2.9999999999999996),((1,4),3.6666666666666665),((1,5),2.6666666666666665)
,((2,1),2.333333333333333),((2,2),3.9999999999999996),((2,3),5.0),((2,4),6.0),((2,5),4.333333333333333)
,((3,1),2.9999999999999996),((3,2),5.0),((3,3),6.0),((3,4),7.0),((3,5),5.0)
,((4,1),3.6666666666666665),((4,2),6.0),((4,3),7.0),((4,4),8.0),((4,5),5.666666666666666)
,((5,1),2.6666666666666665),((5,2),4.333333333333333),((5,3),5.0),((5,4),5.666666666666666),((5,5),4.0)]
Depending on the complexity of what you want to do, you might consider using more established libraries, like the oft-recommended repa, rather than implementing an image processing kit yourself.
Related
Proposing an algorithm for arbitrary shape Bit Matrix Transposition with BDD-like structure
We consider a bit matrix (n x m) to be a regular array containing n lines of integers of size m. I have looked in Hacker's Delight and in other sources, and the algorithms I found for this were rather specialized: square matrices with power-of-two sizes like 8x8, 32x32, 64x64, etc. (which is normal because the machine is built that way). I thought of a more general algorithm (for arbitrary n and m) which is, in the worst case, of the expected complexity (I think), but for matrices containing mostly similar columns, or more zeros than ones, the algorithm seems a bit more interesting (in the extreme, it is linear if the matrix contains the same line over and over). It follows a sort of Binary Decision Diagram manipulation.

The output is not a transposed matrix but a compressed transposed matrix: a list of pairs (V,L) where L is an int_m that indicates the lines of the transposed matrix (by setting the bits of the corresponding positions) that should contain the int_n V. The lines of the transposed matrix not appearing in any of the pairs are filled with 0. For example, for the matrix

1010111
1111000
0001010

having the transposed

110
010
110
011
100
101
100

the algorithm outputs:

(010,0100000)
(011,0001000)
(100,0000101)
(101,0000010)
(110,1010000)

and one reads the pair (100,0000101) as meaning "the value 100 is put in the 5th and the 7th line of the transposed matrix".

This is the algorithm (written in a pseudo-OCaml/C) and a picture of the progress of the algorithm on the above example. We run according to triples (index_of_current_line, V, L), which are of type (int, int_n, int_m), where int_n is the type of n-bit wide integers and int is just a machine integer wide enough to hold n. The function takes a list of these triples, the matrix, the number of lines and an accumulator for the output (a list of pairs (int_n, int_m)) and returns, at some point, that accumulator.

list of (int_n, int_m) transpose(list of triple t, int_m[n] mat, int n, list of (int_n, int_m) acc)

The first call of the transpose function is transpose([(0, 0, 2^m-1)], mat, n, []). Take "&", "|" and "xor" to be the usual bit-wise operations.

transpose(t, mat, n, acc) =
  match t with
  | [] ->
      (* the list is empty, we're done *)
      return acc
  | (i, v, l)::tt ->
      let colIn = mat[i] & l in
      (* colIn contains the positions that were set in the parent's mask "l"
         and that are also set in the line "i" *)
      match colIn with
      | 0 ->
          (* None of the positions are set in both, do not branch *)
          if (i < n) then
            (* not done with the matrix, simply move to next line *)
            transpose((i+1,v,l)::tt, mat, n, acc)
          else
            (* we reached the end of the matrix, we're at a leaf *)
            if (v > 0) then transpose(tt, mat, n, (v,l)::acc)
            else
              (* We ignore the null values and continue *)
              transpose(tt, mat, n, acc)
      | _ ->
          (* colIn is non null, i.e. some of the positions set in the parent mask "l"
             are also set in this line. If ALL the positions are, we do not branch either.
             If only some of them are and some of them are not, we branch *)
          (* First, update v *)
          let vv = v | (2^(n-i-1)) in
          (* Then get the mask for the other branch *)
          let colOut = colIn xor l in
          match colOut with
          | 0 ->
              (* All are in, none are out, no need to branch *)
              if (i < n) then transpose((i+1,vv,colIn)::tt, mat, n, acc)
              else
                (* we reached the end of the matrix, we're at a leaf *)
                transpose(tt, mat, n, (vv,colIn)::acc)
          | _ ->
              (* Some in, some out: now we branch *)
              if (i < n) then
                transpose((i+1,vv,colIn)::(i+1,v,colOut)::tt, mat, n, acc)
              else
                if (v > 0) then transpose(tt, mat, n, (vv,colIn)::(v,colOut)::acc)
                else transpose(tt, mat, n, (vv,colIn)::acc)

Notice that if the matrix is wider than it is high, it is even faster (if n = 3 and m = 64, for instance).

My questions are: Is this interesting and/or useful? Am I reinventing the wheel? Are "almost zero" matrices or "little differentiated-lines" matrices common enough for this to be interesting?

PS: If anything doesn't seem clear, please do tell, I will rewrite whatever needs to be!
Haskell Particle simulation - calculating velocities of particles
I am working on a particle simulation program using Haskell. For one of the functions I am trying to determine the new velocities of all the particles in the simulation based on the mass and velocities of all the surrounding particles. The function is of this form:

accelerate :: Float -> [Particle] -> [Particle]

Particle is a data type that contains the mass, position vector and velocity vector; the Float argument represents the delta time of the respective time step in the simulation. I would like some suggestions on possible functions I can use to traverse the list while calculating the velocities of each of the particles with respect to the other particles in the list.

One possible approach I can think of: assume there is another function velocityCalculator which has the following definition:

velocityCalculator :: Particle -> Particle -> (Float, Float)

This takes two particles and returns the updated velocity vector for the first particle. Then:

apply foldl, using the above function as the binary operator and a particle and the list of particles as the arguments, i.e. foldl velocityCalculator particle particleList
iterate through the list of particles, applying foldl to each element and building the new list containing the particles with the updated velocities

I am not sure if this is the most efficient method, so any suggestions and improvements are very much appreciated. PLEASE NOTE -> as I have said, I am only looking for suggestions, not an answer! Thanks!
Seems like you are pretty set on using foldl. For example,

"iterate through the list of particles, applying foldl to each element and building the new list containing the particles with the updated velocities"

doesn't really make sense. You apply foldl to a list to reduce it to a "summary value", according to some binary summarizing function. It doesn't really make sense to apply it to a single particle. I am answering this question assuming you are having trouble writing the program in the first place -- it's usually best to do this before worrying about efficiency. Let me know if I have assumed wrong.

I am not sure what rule you want to use to update the velocities, but I assume it's some sort of pairwise force simulation, for example gravity or electromagnetism. If this is so, here are some hints that will guide you to a solution.

type Vector = (Float, Float)

-- Finds the force exerted on one particle by the other.
-- Your code will be simplified if this returns (0,0) when the two
-- particles are the same.
findForce :: Particle -> Particle -> Vector

-- Find the sum of all forces exerted on the particle
-- by every particle in the list.
totalForce :: [Particle] -> Particle -> Vector

-- Takes a force and a particle to update, returns the same particle with
-- updated velocity.
updateVelocity :: Vector -> Particle -> Particle

-- Calculate mutual forces and update all particles' velocities
updateParticles :: [Particle] -> [Particle]

Each of these functions will be quite short, one or two lines. If you need further hints for which higher-order functions to use, pay attention to the type of the function you are trying to write, and note

map :: (a -> b) -> [a] -> [b]            -- takes a list, returns a list
filter :: (a -> Bool) -> [a] -> [a]      -- takes a list, returns a list
foldl :: (a -> b -> a) -> a -> [b] -> a  -- takes a list, returns something else
foldr :: (a -> b -> b) -> b -> [a] -> b  -- takes a list, returns something else
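To make the shape of the solution a bit more concrete, here is a minimal sketch of how those pieces could fit together. The Particle record, the vector helpers and the gravity-style force law are assumptions made for illustration, and updateVelocity takes the Float time step here so that it lines up with the accelerate signature in the question; this is not code from the answer above.

type Vector = (Float, Float)

data Particle = Particle
  { mass     :: Float
  , position :: Vector
  , velocity :: Vector
  }

vadd, vsub :: Vector -> Vector -> Vector
vadd (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)
vsub (x1, y1) (x2, y2) = (x1 - x2, y1 - y2)

vscale :: Float -> Vector -> Vector
vscale k (x, y) = (k * x, k * y)

-- Gravity-style attraction; returns (0, 0) when the two particles coincide,
-- so a particle exerts no force on itself.
findForce :: Particle -> Particle -> Vector
findForce p q
  | d == 0    = (0, 0)
  | otherwise = vscale (mass p * mass q / (d * d * d)) (vsub (position q) (position p))
  where
    (dx, dy) = vsub (position q) (position p)
    d        = sqrt (dx * dx + dy * dy)

-- Sum of the forces exerted on a particle by every particle in the list.
totalForce :: [Particle] -> Particle -> Vector
totalForce ps p = foldl vadd (0, 0) (map (findForce p) ps)

-- Apply a force over the time step dt to get an updated velocity.
updateVelocity :: Float -> Vector -> Particle -> Particle
updateVelocity dt f p = p { velocity = vadd (velocity p) (vscale (dt / mass p) f) }

-- The function from the question: update every particle's velocity.
accelerate :: Float -> [Particle] -> [Particle]
accelerate dt ps = map (\p -> updateVelocity dt (totalForce ps p) p) ps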
You might achieve a speed-up by a factor of 2 through memoization, if you give every Particle a particle_id :: Int ID and then define:

forceOf a b
  | particle_id a > particle_id b = -(forceOf b a)
  | otherwise = (pos a - pos b) *:. charge a * charge b / norm (pos a - pos b) ^ 3

where (*:.) :: Vector -> Double -> Vector is vector-scalar multiplication, so the above is a 1/r^2 force law. Notice that here we memoize pos a - pos b and then we also memoize forceOf a b for use as forceOf b a. Then you want to use

dvs = [dt * sum (map (forceOf a) particles) / mass a | a <- particles]

to get a list of changes in velocity, then

zipWith (+) (map velocity particles) dvs

One problem is that this approach doesn't do so well with numerical uncertainty: everything for time t+1 is based on things that were true at time t. You can start to solve this problem by solving a matrix equation; instead of v+ = v + dt * M v (where v = v(t) and v+ = v(t+1)), you can write v+ = v + dt * M v+, so that you have v+ = (1 - dt * M)^-1 v. That can often be more numerically stable. It is potentially even better to mix the two solutions 50/50: v+ = v + dt * M (v + v+) / 2. There are lots of options here.
Best way to do an iteration scheme
I hope this hasn't been asked before; if so, I apologize.

EDIT: For clarity, the following notation will be used: boldface uppercase for matrices, boldface lowercase for vectors, and italics for scalars.

Suppose x0 is a vector, A and B are matrix functions, and f is a vector function. I'm looking for the best way to do the following iteration scheme in Mathematica:

A0 = A(x0), B0 = B(x0), f0 = f(x0)
x1 = Inverse(A0)(B0.x0 + f0)
A1 = A(x1), B1 = B(x1), f1 = f(x1)
x2 = Inverse(A1)(B1.x1 + f1)
...

I know that a for-loop can do the trick, but I'm not quite familiar with Mathematica, and I'm concerned that this may not be the most efficient way to do it. This is a justified concern, as I would like to define a function u(N) := xN and use it in further calculations. I guess my questions are: What's the most efficient way to program the scheme? Is RecurrenceTable a way to go?

EDIT: It was a bit more complicated than I thought. I'm providing more details in order to obtain a more thorough response. Before doing the recurrence, I'm having problems understanding how to program the functions A, B and f. Matrices A and B are functions of the time step dt = 1/T and the space step dx = 1/M, where T and M are the number of points in the {0 < x < 1, 0 < t} region. This is also true for the vector function f. The dependence of A, B and f on x is rather tricky: A and B are upper and lower triangular matrices (like a tridiagonal matrix; I suppose we can call them multidiagonal), with defined constant values on their diagonals. Given a point 0 < xs < 1, I need to determine its representative xn in the mesh (the closest), and then substitute the nth row of A and B with the function v(x) (transposed, of course), and the nth row of f with the function w(x). Summarizing, A = A(dt, dx, xs, x). The same is true for B and f. Then I need to do the loop mentioned above, to define u(x) = step[T]. Hope I've explained myself.
I'm not sure if it's the best method, but I'd just use plain old memoization. You can represent an individual step as

xstep[x_] := Inverse[A[x]].(B[x].x + f[x])

and then

u[0] = x0
u[n_] := u[n] = xstep[u[n-1]]

If you know how many values you need in advance, and it's advantageous to precompute them all for some reason (e.g. you want to open a file, use its contents to calculate xN, and then free the memory), you could use NestList. Instead of the previous two lines, you'd do

xlist = NestList[xstep, x0, 10];
u[n_] := xlist[[n + 1]]

This will break if n > 10, of course (obviously, change 10 to suit your actual requirements). Of course, it may be worth looking at your specific functions to see if you can make some algebraic simplifications.
I would probably write a function that accepts A0, B0, x0, and f0, and then returns A1, B1, x1, and f1 - say

step[A0_?MatrixQ, B0_?MatrixQ, x0_?VectorQ, f0_?VectorQ] := Module[...]

I would then Nest that function. It's hard to be more precise without more precise information. Also, if your procedure is numerical, then you certainly don't want to compute Inverse[A0], as this is not a numerically stable operation. Rather, you should write A0.x1 == B0.x0 + f0 and then use a numerically stable solver to find x1. Of course, Mathematica's LinearSolve provides such an algorithm.
What's the formal term for a function that can be written in terms of `fold`?
I use the LINQ Aggregate operator quite often. Essentially, it lets you "accumulate" a function over a sequence by repeatedly applying the function on the last computed value of the function and the next element of the sequence. For example:

int[] numbers = ...
int result = numbers.Aggregate(0, (result, next) => result + next * next);

will compute the sum of the squares of the elements of an array. After some googling, I discovered that the general term for this in functional programming is "fold". This got me curious about functions that could be written as folds. In other words, the f in f = fold op. I think that a function that can be computed with this operator only needs to satisfy (please correct me if I am wrong):

f(x1, x2, ..., xn) = f(f(x1, x2, ..., xn-1), xn)

This property seems common enough to deserve a special name. Is there one?
An Iterated binary operation may be what you are looking for. You would also need to add some stopping conditions like

f(x) = something
f(x1, x2) = something2

They define a binary operation f and another function F in the link I provided to handle what happens when you get down to f(x1, x2).
To clarify the question: 'sum of squares' is a special function because it has the property that it can be expressed in terms of the fold functional plus a lambda, i.e.

sumSq = fold ((result, next) => result + next * next) 0

Which functions f have this property, where dom f = { A tuples }, ran f :: B?

Clearly, due to the mechanics of fold, the statement that f is foldable is the assertion that there exists an h :: A * B -> B such that for any n > 0, x1, ..., xn in A,

f ((x1, ..., xn)) = h (xn, f ((x1, ..., xn-1))).

The assertion that the h exists says almost the same thing as your condition that

f((x1, x2, ..., xn)) = f((f((x1, x2, ..., xn-1)), xn))   (*)

so you were very nearly correct; the difference is that you are requiring A = B, which is a bit more restrictive than being a general fold-expressible function. More problematically though, fold in general also takes a starting value a, which is set to a = f nil. The main reason your formulation (*) is wrong is that it assumes that h is whatever f does on pair lists, but that is only true when h(x, a) = a. That is, in your example of sum of squares, the starting value you gave to Aggregate was 0, which is a does-nothing when you add it, but there are fold-expressible functions where the starting value does something, in which case we have a fold-expressible function which does not satisfy (*). For example, take this fold-expressible function lengthPlusOne:

lengthPlusOne = fold ((result, next) => result + 1) 1

f (1) = 2, but f(f(), 1) = f(1, 1) = 3.

Finally, let's give an example of a function on lists not expressible in terms of fold. Suppose we had a black box function and tested it on these inputs:

f (1) = 1
f (1, 1) = 1       (1)
f (2, 1) = 1
f (1, 2, 1) = 2    (2)

Such a function on tuples (= finite lists) obviously exists (we can just define it to have those outputs above and be zero on any other lists). Yet, it is not foldable because (1) implies h(1, 1) = 1, while (2) implies h(1, 1) = 2.

I don't know if there is other terminology than just saying 'a function expressible as a fold'. Perhaps a (left/right) context-free list function would be a good way of describing it?
In functional programming, fold is used to aggregate results on collections like list, array, sequence... Your formulation of fold is incorrect, which leads to confusion. A correct formulation could be:

fold f e [x1, x2, x3, ..., xn] = f((...f(f(f(e, x1), x2), x3)...), xn)

The requirement for f is actually very loose. Let's say the type of elements is T and the type of e is U. So function f indeed takes two arguments, the first one of type U and the second one of type T, and returns a value of type U (because this value will be supplied as the first argument of function f again). In short, we have an "accumulate" function with a signature f: U * T -> U. For this reason, I don't think there is a formal term for these kinds of functions.

In your example, e = 0, T = int, U = int and your lambda function (result, next) => result + next * next has a signature f: int * int -> int, which satisfies the condition of "foldable" functions.

In case you want to know, another variant of fold is foldBack, which accumulates results in the reverse order, from xn to x1:

foldBack f [x1, x2, ..., xn] e = f(x1, f(x2, ..., f(xn, e)...))

There are interesting cases with commutative functions, which satisfy f(x, y) = f(y, x), when fold and foldBack return the same result. About fold itself, it is a specific instance of catamorphism in category theory. You can read more about catamorphism here.
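For comparison, the same accumulation written in Haskell (a small sketch, just to make the left/right distinction concrete; these definitions are not from the answers above):

-- Left fold: combines from the left, like the LINQ Aggregate example.
sumSq :: [Int] -> Int
sumSq = foldl (\acc x -> acc + x * x) 0

-- Right fold: combines from the right, like foldBack above.
sumSq' :: [Int] -> Int
sumSq' = foldr (\x acc -> x * x + acc) 0

main :: IO ()
main = print (sumSq [1, 2, 3], sumSq' [1, 2, 3])   -- (14, 14)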
Pseudo random number generator from two inputs
I need a pseudo random number generator that gives me a number in the range [-1, 1] (the range is optional) from two inputs of type float. I'll also try to explain why I need it: I'm using the Diamond-Square algorithm to create a height map for my terrain engine. The terrain is split into patches (Chunked LOD). The problem with Diamond-Square is that it uses the random function, so if two neighbor patches share the same point (x, z), I want the height to be the same for both so that I won't get a crack effect. Some may say I could fetch the height information from the neighbor patch, but then the result could differ depending on which patch was created first. So that's why I need a pseudo random number generator that returns a unique number given the two inputs (x, z). (I'm not asking someone to write such a function, I just need general feedback and/or known algorithms that do something similar.)
You need something similar to a hash function on the pair (x, z). I would suggest something like (a * x + b * z + c) ^ d where all numbers are integers, a and b are big primes so that the integer multiplications overflow, and c and d are some random integers. ^ is bitwise exclusive or. The result is a random integer which you can scale to the desired range. This assumes that the map is not used in a game where knowing the terrain is of substantial value, as such a function is not secure for keeping it a secret. In that case you'd better use some cryptographic function.
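To make that concrete, here is a minimal Haskell sketch of such a hash, assuming the coordinates have already been quantised to integers as this answer suggests; the constants and the scaling modulus are arbitrary choices for illustration, not values from the answer, and the result is deterministic but not cryptographically secure:

import Data.Bits (xor)

-- Deterministically map integer grid coordinates to a value in [-1, 1].
-- The primes and constants are arbitrary; Int arithmetic wraps on overflow,
-- which is fine for this purpose.
noise2D :: Int -> Int -> Double
noise2D x z = 2 * fromIntegral r / fromIntegral (m - 1) - 1
  where
    h = (2654435761 * x + 2246822519 * z + 374761393) `xor` 1013904223
    m = 1048576 :: Int          -- scaling modulus
    r = h `mod` m               -- always in [0, m - 1]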
If you're looking for a bijection from IRxIR -> [-1;1], I can suggest this: bijection from IR to ]-a;a[.

First let's find a bijection from IR -> ]-1;1[, so we just need to find a bijection from IRxIR -> IR:

tan(x) : ]-Pi/2;Pi/2[ -> IR
arctan(x) : IR -> ]-Pi/2;Pi/2[
1/Pi*arctan(x) + 1/2 : IR -> ]0;1[
2*arctan(x) : IR -> ]-Pi;Pi[

and

ln(x) : IR+ -> IR
exp(x) : IR -> IR+

Bijection from ]0,1[ x ]0,1[ -> ]0,1[: let's write (x,y) in ]0,1[ x ]0,1[ with

x = 0.x1x2x3x4...xn... where x1x2x3x4...xn represent the decimals of x in base 10
y = 0.y1y2y3y4...ym... likewise

Let's define z = 0.x1y1x2y2x3y3...xnyn... in ]0,1[. Then by construction we can prove that it is an exact bijection from ]0,1[ x ]0,1[ to ]0,1[. (I'm not sure it is true for numbers with infinite decimals, but it's at least a "very good" injection; tell me if I'm wrong.) Let's name this function CANTOR(x,y); then 2*CANTOR - 1 is a bijection from ]0,1[ x ]0,1[ -> ]-1,1[.

Then, combining all the above assertions, you get the bijection from IRxIR -> ]-1;1[:

(x,y) -> 2*CANTOR(1/Pi*arctan(x) + 1/2, 1/Pi*arctan(y) + 1/2) - 1

For the inverse, we proceed the same way: let RCANTOR : z -> (x,y) be the inverse of CANTOR(x,y), so RCANTOR((z+1)/2) : ]-1;1[ -> ]0,1[ x ]0,1[, and then applying t -> tan(Pi*(t - 1/2)) to each component of RCANTOR((z+1)/2) gives the inverse ]-1;1[ -> IRxIR.
Just pick any old hash function, stick in the binary description of the coordinates and use the output.