`friday` package is very slow - performance

I’m writing a Haskell program that draws big maps from Knytt Stories world files. I use the friday package to make image files, and I need to compose the many graphics layers that I put together from spritesheets. Right now, I use my own ugly function for this:
import Data.Word (Word8)
import qualified Vision.Primitive as Im
import qualified Vision.Image.Type as Im
import qualified Vision.Image.Class as Im
import Vision.Image.RGBA.Type (RGBA, RGBAPixel(..))
-- Map a Word8 in [0, 255] to a Double in [0, 1].
w2f :: Word8 -> Double
w2f = (/255) . fromIntegral . fromEnum
-- Map a Double in [0, 1] to a Word8 in [0, 255].
f2w :: Double -> Word8
f2w = toEnum . round . (*255)
-- Compose two images into one. `bottom` is wrapped to `top`'s size.
compose :: RGBA -> RGBA -> RGBA
compose bottom top =
    let newSize = Im.manifestSize top
        bottom' = wrap newSize bottom
    in Im.fromFunction newSize $ \p ->
        let RGBAPixel rB gB bB aB = bottom' Im.! p
            RGBAPixel rT gT bT aT = top Im.! p
            aB' = w2f aB; aT' = w2f aT
            ovl :: Double -> Double -> Double
            ovl cB cT = (cT * aT' + cB * aB' * (1.0 - aT')) / (aT' + aB' * (1.0 - aT'))
            (~*~) :: Word8 -> Word8 -> Word8
            cB ~*~ cT = f2w $ w2f cB `ovl` w2f cT
            aO = f2w (aT' + aB' * (1.0 - aT'))
        in RGBAPixel (rB ~*~ rT) (gB ~*~ gT) (bB ~*~ bT) aO
It simply alpha-composites a bottom layer and a top layer, like so:
If the “bottom” layer is a texture, it will be looped horizontally and vertically (by wrap) to fit the top layer’s size.
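To look at the per-pixel arithmetic in isolation, here is a minimal sketch of the same "over" formula on plain tuples, using only base. The names Px, over and wrapIndex are hypothetical stand-ins (they are not part of friday), and the guard for a fully transparent result is an assumption added here to avoid dividing by zero; the original compose has no such guard.

```haskell
import Data.Word (Word8)

-- | A pixel as plain (r, g, b, a) components; a hypothetical stand-in for RGBAPixel.
type Px = (Word8, Word8, Word8, Word8)

-- Map a Word8 in [0, 255] to a Double in [0, 1].
w2f :: Word8 -> Double
w2f = (/ 255) . fromIntegral

-- Map a Double in [0, 1] to a Word8 in [0, 255].
f2w :: Double -> Word8
f2w = round . (* 255)

-- | Composite one top pixel over one bottom pixel (bottom first, top second),
-- with the same formula as `compose`.
over :: Px -> Px -> Px
over (rB, gB, bB, aB) (rT, gT, bT, aT)
  | aO == 0   = (0, 0, 0, 0) -- assumption: both inputs fully transparent
  | otherwise = (ch rB rT, ch gB gT, ch bB bT, f2w aO)
  where
    aB' = w2f aB
    aT' = w2f aT
    aO  = aT' + aB' * (1 - aT')
    ch cB cT = f2w ((w2f cT * aT' + w2f cB * aB' * (1 - aT')) / aO)

-- | Tiling lookup: map a (row, column) position into a texture of size (h, w).
wrapIndex :: (Int, Int) -> (Int, Int) -> (Int, Int)
wrapIndex (h, w) (y, x) = (y `mod` h, x `mod` w)
```

An opaque top pixel replaces the bottom one, and a fully transparent top pixel leaves the bottom one unchanged, which gives a quick sanity check of the formula independent of any image library.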
Rendering a map takes far, far longer than it should. Rendering the map for the default world that comes with the game takes 27 minutes at -O3, even though the game itself can clearly render each separate screen in a couple of milliseconds or less. (The smaller example output I linked above takes 67 seconds, which is also far too long.)
The profiler (output is here) says the program spends about 77% of its time in compose.
Cutting this down seems like a good first step. It seems like a very simple operation, but I can’t find a native function in friday that lets me do this. Supposedly GHC should be good at collapsing all of the fromFunction stuff, but I don’t know what’s going on. Or is the package just super slow?
Here’s the full, compilable code.

As I stated in my comment, the MCE I made performs fine and does not yield any interesting output:
module Main where
import qualified Vision.Primitive as Im
import Vision.Primitive.Shape
import qualified Vision.Image.Type as Im
import qualified Vision.Image.Class as Im
import Vision.Image.RGBA.Type (RGBA, RGBAPixel(..))
import Vision.Image.Storage.DevIL (load, save, Autodetect(..), StorageError, StorageImage(..))
import Vision.Image (convert)
import Data.Word
import System.Environment (getArgs)
main :: IO ()
main = do
    [input1,input2,output] <- getArgs
    io1 <- load Autodetect input1 :: IO (Either StorageError StorageImage)
    io2 <- load Autodetect input2 :: IO (Either StorageError StorageImage)
    case (io1,io2) of
        (Left err,_) -> error $ show err
        (_,Left err) -> error $ show err
        (Right i1, Right i2) -> go (convert i1) (convert i2) output
  where
    go i1 i2 output =
        do res <- save Autodetect output (compose i1 i2)
           case res of
               Nothing -> putStrLn "Done with compose"
               Just e -> error (show (e :: StorageError))
-- Wrap an image to a given size.
wrap :: Im.Size -> RGBA -> RGBA
wrap s im =
    let Z :. h :. w = Im.manifestSize im
    in Im.fromFunction s $ \(Z :. y :. x) -> im Im.! Im.ix2 (y `mod` h) (x `mod` w)
-- Map a Word8 in [0, 255] to a Double in [0, 1].
w2f :: Word8 -> Double
w2f = (/255) . fromIntegral . fromEnum
-- Map a Double in [0, 1] to a Word8 in [0, 255].
f2w :: Double -> Word8
f2w = toEnum . round . (*255)
-- Compose two images into one. `bottom` is wrapped to `top`'s size.
compose :: RGBA -> RGBA -> RGBA
compose bottom top =
    let newSize = Im.manifestSize top
        bottom' = wrap newSize bottom
    in Im.fromFunction newSize $ \p ->
        let RGBAPixel rB gB bB aB = bottom' Im.! p
            RGBAPixel rT gT bT aT = top Im.! p
            aB' = w2f aB; aT' = w2f aT
            ovl :: Double -> Double -> Double
            ovl cB cT = (cT * aT' + cB * aB' * (1.0 - aT')) / (aT' + aB' * (1.0 - aT'))
            (~*~) :: Word8 -> Word8 -> Word8
            cB ~*~ cT = f2w $ w2f cB `ovl` w2f cT
            aO = f2w (aT' + aB' * (1.0 - aT'))
        in RGBAPixel (rB ~*~ rT) (gB ~*~ gT) (bB ~*~ bT) aO
This code loads two images, applies your compose operation, and saves the resulting image. This happens almost instantly:
% ghc -O2 so.hs && time ./so /tmp/lambda.jpg /tmp/lambda2.jpg /tmp/output.jpg && o /tmp/output.jpg
Done with compose
./so /tmp/lambda.jpg /tmp/lambda2.jpg /tmp/output.jpg 0.05s user 0.00s system 98% cpu 0.050 total
If you have an alternate MCE then please post it. Your complete code was too non-minimal for my eyes.

Related

Why does the runtime increase when moving functions to a module?

I can't figure out what's wrong. I created a template project with stack new and added some code:
module Main where
import Lib
import Data.Massiv.Array (Sz (..), Vector)
import qualified Data.Massiv.Array as A
import Data.Time.Clock.POSIX
t mul = round . (mul *) <$> getPOSIXTime
timeMillis = t 1000
area :: BBox -> Double
area (BBox (x1, y1) (x2, y2)) = (x2 - x1) * (y2 - y1)
sumOfAreas :: Vector A.DS BBox -> Double
sumOfAreas bboxes = A.sfoldl (\acc box -> acc + area box) 0 bboxes
main :: IO ()
main = do
    start <- timeMillis
    putStrLn $ "Start " ++ show start
    let
        bboxes = A.makeArray A.Seq (Sz 10000 * 10000) $ (\_ -> BBox (0, 0) (12, 12)) :: Vector A.DS BBox
        result = sumOfAreas bboxes
    putStrLn $ show result
    end <- timeMillis
    putStrLn $ show (end - start)
module Lib where
import Data.Massiv.Array (Vector)
import qualified Data.Massiv.Array as A
type Point = (Double, Double)
data BBox =
    BBox (Point) (Point)
    deriving (Eq, Show)
This takes an average of 100ms on my hardware.
Next, I decided to move the functions into the Lib module. The updated code:
module Main where
import Lib
import Data.Massiv.Array (Sz (..), Vector)
import qualified Data.Massiv.Array as A
import Data.Time.Clock.POSIX
t mul = round . (mul *) <$> getPOSIXTime
timeMillis = t 1000
main :: IO ()
main = do
    start <- timeMillis
    putStrLn $ "Start " ++ show start
    let
        bboxes = A.makeArray A.Seq (Sz 10000 * 10000) $ (\_ -> BBox (0, 0) (12, 12)) :: Vector A.DS BBox
        result = sumOfAreas bboxes
    putStrLn $ show result
    end <- timeMillis
    putStrLn $ show (end - start)
module Lib where
import Data.Massiv.Array (Vector)
import qualified Data.Massiv.Array as A
type Point = (Double, Double)
data BBox =
    BBox (Point) (Point)
    deriving (Eq, Show)
area :: BBox -> Double
area (BBox (x1, y1) (x2, y2)) = (x2 - x1) * (y2 - y1)
sumOfAreas :: Vector A.DS BBox -> Double
sumOfAreas bboxes = A.sfoldl (\acc box -> acc + area box) 0 bboxes
And the runtime is now approximately 1000 ms.
The cabal file looks like this:
cabal-version: 1.12
-- This file has been generated from package.yaml by hpack version 0.35.0.
--
-- see: https://github.com/sol/hpack
-- ... info ...
source-repository head
  type: git
  location: https://github.com/githubuser/profiling
library
  exposed-modules:
      Lib
  other-modules:
      Paths_profiling
  hs-source-dirs:
      src
  ghc-options: -Wall -threaded -O3
  build-depends:
      base >=4.7 && <5
    , massiv ==1.0.1.1
    , vector ==0.12.3.1
    , array ==0.5.4.0
    , time ==1.9.3
  default-language: Haskell2010
executable profiling-exe
  main-is: Main.hs
  other-modules:
      Paths_profiling
  hs-source-dirs:
      app
  ghc-options: -Wall -threaded -O3
  build-depends:
      base >=4.7 && <5
    , massiv ==1.0.1.1
    , vector ==0.12.3.1
    , array ==0.5.4.0
    , time ==1.9.3
    , profiling
  default-language: Haskell2010
test-suite profiling-test
  type: exitcode-stdio-1.0
  main-is: Spec.hs
  other-modules:
      Paths_profiling
  hs-source-dirs:
      test
  ghc-options: -Wall -threaded -O3
  build-depends:
      base >=4.7 && <5
    , profiling
  default-language: Haskell2010
Stack version
Version 2.9.1, Git revision 409d56031b4240221d656db09b2ba476fe6bb5b1 x86_64 hpack-0.35.0
ghc version 9.2.4.
In fact, I first ran into this behavior in one of my playground projects, when I exported a single function from the Main module itself.
What explains this, and how can I avoid it? What am I missing?

Netwire 5 - A box cannot bounce

I am trying to convert challenge 3 ( https://ocharles.org.uk/blog/posts/2013-08-01-getting-started-with-netwire-and-sdl.html ) from netwire 4.0 to netwire 5.0 using OpenGL. Unfortunately, the box does not bounce. The entire code follows. It seems to me that the function velocity does not work: when the box collides with a wall, it does not bounce but stops. How do I correct my program? Thanks in advance.
{-# LANGUAGE Arrows #-}
import Graphics.Rendering.OpenGL
import Graphics.UI.GLFW
import Data.IORef
import Prelude hiding ((.))
import Control.Monad.Fix (MonadFix)
import Control.Wire
import FRP.Netwire
isKeyDown :: (Enum k, Monoid e) => k -> Wire s e IO a e
isKeyDown k = mkGen_ $ \_ -> do
    s <- getKey k
    return $ case s of
        Press -> Right mempty
        Release -> Left mempty
acceleration :: (Monoid e) => Wire s e IO a Double
acceleration = pure ( 0) . isKeyDown (CharKey 'A') . isKeyDown (CharKey 'D')
           <|> pure (-0.5) . isKeyDown (CharKey 'A')
           <|> pure ( 0.5) . isKeyDown (CharKey 'D')
           <|> pure ( 0)
velocity :: (Monad m, HasTime t s, Monoid e) => Wire s e m (Double, Bool) Double
velocity = integralWith bounce 0
    where bounce c v
            | c = (-v)
            | otherwise = v
collided :: (Ord a, Fractional a) => (a, a) -> a -> (a, Bool)
collided (a, b) x
    | x < a = (a, True)
    | x > b = (b, True)
    | otherwise = (x, False)
position' :: (Monad m, HasTime t s) => Wire s e m Double (Double, Bool)
position' = integral 0 >>> (arr $ collided (-0.8, 0.8))
challenge3 :: (HasTime t s) => Wire s () IO a Double
challenge3 = proc _ -> do
    rec a <- acceleration -< ()
        v <- velocity -< (a, c)
        (p, c) <- position' -< v
    returnA -< p
s :: Double
s = 0.05
y :: Double
y = 0.0
renderPoint :: (Double, Double) -> IO ()
renderPoint (x, y) = vertex $ Vertex2 (realToFrac x :: GLfloat) (realToFrac y :: GLfloat)
generatePoints :: Double -> Double -> Double -> [(Double, Double)]
generatePoints x y s =
    [ (x - s, y - s)
    , (x + s, y - s)
    , (x + s, y + s)
    , (x - s, y + s) ]
runNetwork :: (HasTime t s) => IORef Bool -> Session IO s -> Wire s e IO a Double -> IO ()
runNetwork closedRef session wire = do
    pollEvents
    closed <- readIORef closedRef
    if closed
        then return ()
        else do
            (st , session') <- stepSession session
            (wt', wire' ) <- stepWire wire st $ Right undefined
            case wt' of
                Left _ -> return ()
                Right x -> do
                    clear [ColorBuffer]
                    renderPrimitive Quads $
                        mapM_ renderPoint $ generatePoints x y s
                    swapBuffers
                    runNetwork closedRef session' wire'
main :: IO ()
main = do
    initialize
    openWindow (Size 1024 512) [DisplayRGBBits 8 8 8, DisplayAlphaBits 8, DisplayDepthBits 24] Window
    closedRef <- newIORef False
    windowCloseCallback $= do
        writeIORef closedRef True
        return True
    runNetwork closedRef clockSession_ challenge3
    closeWindow
From experience, I think the trick is that you have to bounce a few pixels before the actual collision. If you only detect the collision when it happens, inertia has already put your square a little bit "inside" the wall, so the velocity is reversed on every step and the square gets stuck.
ocharles actually nods to this in the blog post:
If this position falls outside the world bounds, we adjust the square (with a small epsilon to stop it getting stuck in the wall) and return the collision information.
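As a rough illustration of that epsilon adjustment, here is a minimal sketch in plain Haskell rather than Netwire wires. The names collidedEps and step, the explicit Euler update, and the value 1e-3 for the epsilon are all assumptions made for this example; only the bounds (-0.8, 0.8) come from the question.

```haskell
-- | Clamp a position into (lo, hi) and report whether a wall was hit.
-- On a hit the position is placed `eps` inside the wall, so the next
-- step does not start outside the bounds again.
collidedEps :: Double -> (Double, Double) -> Double -> (Double, Bool)
collidedEps eps (lo, hi) x
  | x < lo    = (lo + eps, True)
  | x > hi    = (hi - eps, True)
  | otherwise = (x, False)

-- | One explicit Euler step of (position, velocity) with time step dt.
-- The velocity is negated exactly once per collision.
step :: Double -> (Double, Double) -> (Double, Double)
step dt (p, v) = (p', v')
  where
    (p', hit) = collidedEps 1e-3 (-0.8, 0.8) (p + v * dt)
    v'        = if hit then negate v else v
```

Starting from (0.79, 1.0) with dt = 0.1, the first step hits the right wall and flips the velocity; the second step moves away from the wall without reporting another collision, so the velocity is not flipped back.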
Good luck with Netwire 5. I'm playing with it too, and I'm just beginning to like it. ;)

Creating image from function in parallel

I'm writing a program in Haskell that creates a fractal and writes it to a PNG file. I have a function
f:: Int->Int->PixelRGB8
which calculates the color of the pixel with the given image coordinates. (The output color format, PixelRGB8, is not important; I can easily change it to, say, an RGB tuple or anything else.)
Using Codec.Picture, I can write
writePng "test.png" $ generateImage f width height
which indeed writes the desired image file. However, it works very slowly and I can see that my CPU load is low. I want to use parallel computation, since the computation of each pixel value does not depend on its neighbors. As far as I can see, Codec.Picture does not provide any means to do this. I understand how parMap works, but I can't see a way to apply it here. I think one possible solution is to use repa.DevIL, but I'm kind of lost in its multidimensional array notation, which looks like overkill in my case. So, the question is: how can I construct an image file from a given function in parallel?
UPDATE. Here's the complete code (the function 'extract' is omitted because it's long and called only once):
import Data.Complex
import System.IO
import Data.List.Split
import Codec.Picture
eval:: (Floating a) => [a] -> a -> a
eval [p] _ = p
eval (p:ps) z = p * z ** (fromIntegral (length ps) ) + (eval ps z)
type Comp = Complex Double
-- func, der, z, iter
convergesOrNot:: (Comp -> Comp) -> (Comp -> Comp) -> Comp->Int -> Int
convergesOrNot _ _ _ 0 = 0
convergesOrNot f d z iter | realPart (abs (f z) ) < 1e-6 = 1
                          | otherwise = convergesOrNot f d (z - (f z)/(d z)) (iter-1)
-- x, y, f, d, xMin, yMin, stepX, stepY
getPixel:: Int->Int->(Comp->Comp)->(Comp->Comp)->Double->Double->Double->Double->PixelRGB8
getPixel x y f d xMin yMin stepX stepY | convergesOrNot f d z 16 == 1 = PixelRGB8 255 255 255
                                       | otherwise = PixelRGB8 0 0 0
    where
        real = xMin + (fromIntegral x)*stepX
        imag = yMin + (fromIntegral y)*stepY
        z = real :+ imag;
data Params = Params{f :: [Comp],
    d :: [Comp],
    xMin::Double,
    yMin::Double,
    stepX::Double,
    stepY::Double,
    width::Int,
    height::Int
    } deriving (Show)
getPixelParams:: Int->Int->Params->PixelRGB8
getPixelParams x y params = getPixel x y func derv (xMin params) (yMin params) (stepX params) (stepY params)
    where
        func = \z -> eval (f params) z
        derv = \z -> eval (d params) z
main = do
    handle <- openFile "config.txt" ReadMode
    config <- hGetContents handle
    let params = extract config
    writePng "test.png" $ generateImage (\x y -> getPixelParams x y params) (width params) (height params)
    hClose handle
The profiling shows that most of the time is spent in the eval function. The result (the .prof file) is as follows (this is only the top part of the file; the rest is a bunch of zeroes):
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 91 0 0.0 0.0 100.0 100.0
main Main 183 0 0.0 0.0 99.9 100.0
main.\ Main 244 0 0.0 0.0 0.0 0.0
getPixelParams Main 245 0 0.0 0.0 0.0 0.0
getPixelParams.derv Main 269 1 0.0 0.0 0.0 0.0
getPixelParams.func Main 246 1 0.0 0.0 0.0 0.0
generateImage Codec.Picture.Types 199 1 0.0 0.0 99.8 99.9
generateImage.generated Codec.Picture.Types 234 1 0.0 0.0 99.8 99.9
generateImage.generated.lineGenerator Codec.Picture.Types 238 257 0.0 0.0 99.8 99.9
generateImage.generated.lineGenerator.column Codec.Picture.Types 239 65792 0.5 0.8 99.8 99.9
unsafeWritePixel Codec.Picture.Types 275 65536 0.0 0.0 0.0 0.0
main.\ Main 240 65536 0.1 0.0 99.2 99.1
getPixelParams Main 241 65536 0.7 0.0 99.1 99.1
getPixelParams.derv Main 270 0 0.2 0.0 19.3 18.5
getPixelParams.derv.\ Main 271 463922 0.2 0.0 19.2 18.5
eval Main 272 1391766 18.9 18.5 18.9 18.5
getPixelParams.func Main 247 0 0.5 0.0 62.3 59.0
getPixelParams.func.\ Main 248 993380 0.4 0.0 61.8 59.0
eval Main 249 3973520 61.4 59.0 61.4 59.0
getPixel Main 242 65536 0.2 0.0 16.7 21.5
getPixel.imag Main 262 256 0.0 0.0 0.0 0.0
getPixel.z Main 261 65536 0.1 0.1 0.1 0.1
getPixel.real Main 251 65536 0.2 0.1 0.2 0.1
convergesOrNot Main 243 531889 16.3 21.3 16.3 21.3
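Since the profile points at eval, which recomputes a power with ** and a list length for every coefficient, one option worth measuring is Horner's rule. The following is a minimal sketch using only base; evalHorner, derivative and newtonStep are hypothetical names, and the coefficient order (highest degree first) is assumed to match the original eval.

```haskell
import Data.Complex (Complex (..))
import Data.List (foldl')

type Comp = Complex Double

-- | Evaluate a polynomial whose coefficients are given highest degree first,
-- using Horner's rule: one multiplication and one addition per coefficient.
evalHorner :: Num a => [a] -> a -> a
evalHorner cs z = foldl' (\acc c -> acc * z + c) 0 cs

-- | Coefficients of the derivative, again highest degree first.
derivative :: Num a => [a] -> [a]
derivative cs = zipWith (*) (map fromIntegral [n, n - 1 .. 1]) cs
  where
    n = length cs - 1

-- | One Newton iteration z - f(z) / f'(z) for the polynomial cs.
newtonStep :: [Comp] -> Comp -> Comp
newtonStep cs z = z - evalHorner cs z / evalHorner (derivative cs) z
```

For the coefficients 1 0 0 1 (that is, z^3 + 1), evalHorner at z = 2 gives 9 and derivative gives the coefficients 3 0 0. Whether this actually removes the bottleneck should be confirmed by profiling again.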
UPDATE 2. After a number of changes from @Cirdec and @Jedai, the code looks like this:
import Data.Complex
import System.IO
import Data.List.Split
import qualified Data.List as DL
import Codec.Picture
import Codec.Picture.Types
import Control.Parallel
import Data.Array
import Control.Parallel.Strategies
import GHC.Conc (numCapabilities)
class Ix a => Partitionable a where
    partition :: Int -> (a, a) -> [(a, a)]
    default partition :: (Num a) => Int -> (a, a) -> [(a, a)]
    partition n r@(l,_) = zipWith (\x y -> (x, x+y-1)) starts steps
        where
            (span, longerSpans) = rangeSize r `quotRem` n
            steps = zipWith (+) (replicate (min (rangeSize r) n) (fromIntegral span)) (replicate longerSpans 1 ++ repeat 0)
            starts = scanl (+) l steps
instance Partitionable Int
instance (Partitionable a, Partitionable b) => Partitionable (a, b) where
    partition n ((x0,y0), (x1, y1)) = do
        xr'@(x0', x1') <- partition n (x0, x1)
        let n' = n * rangeSize xr' `div` rangeSize (x0, x1)
        (y0', y1') <- partition n' (y0, y1)
        return ((x0', y0'), (x1', y1'))
mkArrayPar :: (Partitionable i) => Int -> Strategy e -> (i, i) -> (i -> e) -> Array i e
mkArrayPar n s bounds f = listArray bounds (concat workUnits)
    where
        partitions = partition n bounds
        workUnits = parMap (evalList s) (map f . range) partitions
generateImagePar :: forall a . Pixel a => (Int -> Int -> a) -> Int -> Int -> Image a
generateImagePar f w h = generateImage f' w h
    where
        bounds = ((0, 0), (w-1,h-1))
        pixels = mkArrayPar numCapabilities rseq bounds (uncurry f)
        f' = curry (pixels !)
--
-- Newton
--
eval:: (Floating a) => [a] -> a -> a
eval cs z = DL.foldl1' (\acc c -> acc * z + c) cs
diff:: (Floating a) => [a] -> [a]
diff [p] = []
diff (p:ps) = [(fromIntegral (length ps) )*p] ++ diff ps
type Comp = Complex Double
convergesOrNot:: (Comp -> Comp) -> (Comp -> Comp) -> Comp->Int -> Int
convergesOrNot _ _ _ 0 = 0
convergesOrNot f d z iter | realPart (abs (f z) ) < 1e-6 = 1
                          | otherwise = convergesOrNot f d (z - (f z)/(d z)) (iter-1)
-- x, y, f, d, xMin, yMin, stepX, stepY
getPixel:: Int->Int->(Comp->Comp)->(Comp->Comp)->Double->Double->Double->Double->PixelRGB8
getPixel x y f d xMin yMin stepX stepY | convergesOrNot f d z 16 == 1 = PixelRGB8 255 255 255
                                       | otherwise = PixelRGB8 0 0 0
    where
        real = xMin + (fromIntegral x)*stepX
        imag = yMin + (fromIntegral y)*stepY
        z = real :+ imag;
data Params = Params{f :: [Comp],
    d :: [Comp],
    xMin::Double,
    yMin::Double,
    stepX::Double,
    stepY::Double,
    width::Int,
    height::Int
    } deriving (Show)
extract:: String -> Params
extract config = Params poly deriv xMin yMin stepX stepY width height
    where
        lines = splitOn "\n" config
        wh = splitOn " " (lines !! 0)
        width = read (wh !! 0) :: Int
        height = read (wh !! 1) :: Int
        bottomLeft = splitOn " " (lines !! 1)
        upperRight = splitOn " " (lines !! 2)
        xMin = read $ (bottomLeft !! 0) :: Double
        yMin = read $ (bottomLeft !! 1) :: Double
        xMax = read $ (upperRight !! 0) :: Double
        yMax = read $ (upperRight !! 1) :: Double
        stepX = (xMax - xMin)/(fromIntegral width)
        stepY = (yMax - yMin)/(fromIntegral height)
        poly = map (\x -> (read x :: Double) :+ 0) (splitOn " " (lines !! 3))
        deriv = diff poly
getPixelParams:: Int->Int->Params->PixelRGB8
getPixelParams x y params = getPixel x y func derv (xMin params) (yMin params) (stepX params) (stepY params)
    where
        func = \z -> eval (f params) z
        derv = \z -> eval (d params) z
main = do
    handle <- openFile "config.txt" ReadMode
    config <- hGetContents handle
    let params = extract config
    writePng "test.png" $ generateImagePar (\x y -> getPixelParams x y params) (width params) (height params)
    hClose handle
I compile it with
ghc -O2 -threaded -rtsopts -XDefaultSignatures -XExistentialQuantification partNewton.hs -o newton
and I run it with ./newton +RTS -N. But when I run it with this config
2048 2048
-1 -1
1 1
1 0 0 1
it fails with the error
Stack space overflow: current size 8388608 bytes.
You can calculate the pixels in parallel before generating the image. To make the pixel lookup for generateImage simple, we'll stuff all of the pixels into an Array.
{-# LANGUAGE RankNTypes #-}
import Data.Array
import Control.Parallel.Strategies
To generate the image in parallel, we'll calculate the pixels in parallel for each point within the bounds of the image. We'll build a temporary Array to hold all the pixels. The array's lookup function, !, will provide an efficient lookup function to pass to generateImage.
generateImagePar :: forall a . Pixel a => (Int -> Int -> a) -> Int -> Int -> Image a
generateImagePar f w h = generateImage f' w h
    where
        bounds = ((0, 0), (w-1,h-1))
        pixels = parMap rseq (uncurry f) (range bounds)
        pixelArray = listArray bounds pixels
        f' = curry (pixelArray !)
We can then write your example in terms of generateImagePar.
writePng "test.png" $ generateImagePar f width height
This may be no faster and may in fact be slower than using generateImage. It's important to profile your code to understand why it is slow before attempting to improve its performance. For example, if your program is memory starved or is thrashing resources, using generateImagePar will certainly be slower than using generateImage.
Partitioning the work
We can partition the work into chunks to reduce the number of sparks without resorting to any sort of mutable data structure. First we'll define the class of indexes whose ranges can be divided into partitions. We'll define a default for dividing up numeric ranges.
class Ix a => Partitionable a where
    partition :: Int -> (a, a) -> [(a, a)]
    default partition :: (Num a) => Int -> (a, a) -> [(a, a)]
    partition n r@(l,_) = zipWith (\x y -> (x, x+y-1)) starts steps
        where
            (span, longerSpans) = rangeSize r `quotRem` n
            steps = zipWith (+) (replicate (min (rangeSize r) n) (fromIntegral span)) (replicate longerSpans 1 ++ repeat 0)
            starts = scanl (+) l steps
Ints (and any other Num) can be made Partitionable using the default implementation.
instance Partitionable Int
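To see what the default implementation produces, here is a minimal sketch of the same logic specialised to Int, using only base. partitionInt is a hypothetical name for this illustration, and it assumes n is greater than 0 (n = 0 would divide by zero).

```haskell
import Data.Ix (rangeSize)

-- | Split an inclusive Int range into at most n contiguous sub-ranges whose
-- sizes differ by at most one. Mirrors the default `partition` above.
partitionInt :: Int -> (Int, Int) -> [(Int, Int)]
partitionInt n r@(l, _) = zipWith (\x y -> (x, x + y - 1)) starts steps
  where
    -- q is the base chunk size; the first `longer` chunks get one extra element.
    (q, longer) = rangeSize r `quotRem` n
    steps  = zipWith (+) (replicate (min (rangeSize r) n) q)
                         (replicate longer 1 ++ repeat 0)
    starts = scanl (+) l steps
```

For example, partitionInt 3 (0, 9) yields [(0,3),(4,6),(7,9)], and asking for more partitions than elements, as in partitionInt 4 (0, 1), yields one single-element range per element: [(0,0),(1,1)].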
Index products can be partitioned by first partitioning the first dimension, and then partitioning the second dimension if there aren't enough possible divisions in the first dimension.
instance (Partitionable a, Partitionable b) => Partitionable (a, b) where
    partition n ((x0,y0), (x1, y1)) = do
        xr'@(x0', x1') <- partition n (x0, x1)
        let n' = n * rangeSize xr' `div` rangeSize (x0, x1)
        (y0', y1') <- partition n' (y0, y1)
        return ((x0', y0'), (x1', y1'))
We can build an array in parallel by partitioning the work into units and sparking each work unit.
mkArrayPar :: (Partitionable i) => Int -> Strategy e -> (i, i) -> (i -> e) -> Array i e
mkArrayPar n s bounds f = listArray bounds (concat workUnits)
    where
        partitions = partition n bounds
        workUnits = parMap (evalList s) (map f . range) partitions
Now we can define generateImagePar in terms of making an array in parallel. A good number of partitions is a small multiple of the number of actual processors, numCapabilities; we'll start with up to one partition per processor.
import GHC.Conc (numCapabilities)
generateImagePar :: forall a . Pixel a => (Int -> Int -> a) -> Int -> Int -> Image a
generateImagePar f w h = generateImage f' w h
    where
        bounds = ((0, 0), (w-1,h-1))
        pixels = mkArrayPar numCapabilities rseq bounds (uncurry f)
        f' = curry (pixels !)

Implementing Phase Unwrapping Algorithm with Haskell Repa Array

I'm trying to implement a Phase Unwrapping Algorithm for Three Phase Structured Light Scanning in Haskell using a Repa Array. I want to implement a flood fill based unwrapping algorithm recursing outward from the point (width / 2, height / 2). Unfortunately using that method of recursion I'm getting an out of memory exception. I'm new to Haskell and the Repa library so I was wondering whether it looks like I'm doing anything glaringly wrong. Any help with this would be greatly appreciated!
Update (@leventov):
I am now considering implementing the following path-following algorithm using mutable arrays in Yarr. (Publication: K. Chen, J. Xi, Y. Yu & J. F. Chicharo, "Fast quality-guided flood-fill phase unwrapping algorithm for three-dimensional fringe pattern profilometry," in Optical Metrology and Inspection for Industrial Applications, 2010, pp. 1-9.)
{-# OPTIONS_GHC -Odph -rtsopts -fno-liberate-case -fllvm -optlo-O3 -XTypeOperators -XNoMonomorphismRestriction #-}
module Scanner where
import Data.Word
import Data.Fixed
import Data.Array.Repa.Eval
import qualified Data.Array.Repa as R
import qualified Data.Array.Repa.Repr.Unboxed as U
import qualified Data.Array.Repa.Repr.ForeignPtr as P
import Codec.BMP
import Data.Array.Repa.IO.BMP
import Control.Monad.Identity (runIdentity)
import System.Environment( getArgs )
type ImRead = Either Error Image
type Avg = P.Array R.U R.DIM2 (ImageT, ImageT, ImageT)
type ImageT = (Word8, Word8, Word8)
type PhaseT = (Float, Float, Float)
type WrapT = (Float, Int)
type Image = P.Array R.U R.DIM2 (Word8, Word8, Word8)
type Phase = P.Array R.U R.DIM2 (Float, Float, Float)
type Wrap = P.Array R.U R.DIM2 (Float, Int)
type UWrapT = (Float, Int, [(Int, Int)], String)
type DepthT = (Float, Int, String)
{-# INLINE noise #-}
{-# INLINE zskew #-}
{-# INLINE zscale #-}
{-# INLINE compute #-}
{-# INLINE main #-}
{-# INLINE doMain #-}
{-# INLINE zipImg #-}
{-# INLINE mapWrap #-}
{-# INLINE avgPhase #-}
{-# INLINE doAvg #-}
{-# INLINE doWrap #-}
{-# INLINE doPhase #-}
{-# INLINE isPhase #-}
{-# INLINE diffPhase #-}
{-# INLINE shape #-}
{-# INLINE countM #-}
{-# INLINE inArr #-}
{-# INLINE idx #-}
{-# INLINE getElem #-}
{-# INLINE start #-}
{-# INLINE unwrap #-}
{-# INLINE doUnwrap #-}
{-# INLINE doDepth #-}
{-# INLINE write #-}
noise :: Float
noise = 0.1
zskew :: Float
zskew = 24
zscale :: Float
zscale = 130
compute :: (R.Shape sh, U.Unbox e) => P.Array R.D sh e -> P.Array R.U sh e
compute a = runIdentity (R.computeP a)
main :: IO ()
main = do
    commandArguments <- getArgs
    case commandArguments of
        (file1 : file2 : file3 : _ ) -> do
            image1 <- readImageFromBMP file1
            image2 <- readImageFromBMP file2
            image3 <- readImageFromBMP file3
            doMain image1 image2 image3
        _ -> putStrLn "Not enough arguments"
doMain :: ImRead -> ImRead -> ImRead -> IO()
doMain (Right i1) (Right i2) (Right i3) = write
    where
        write = writeFile "out.txt" str
        (p, m, d, str) = start $ mapWrap i1 i2 i3
doMain _ _ _ = putStrLn "Error loading image"
zipImg :: Image -> Image -> Image -> Avg
zipImg i1 i2 i3 = U.zip3 i1 i2 i3
mapWrap :: Image -> Image -> Image -> Wrap
mapWrap i1 i2 i3 = compute $ R.map wrap avg
    where
        wrap = (doWrap . avgPhase)
        avg = zipImg i1 i2 i3
avgPhase :: (ImageT, ImageT, ImageT) -> PhaseT
avgPhase (i1, i2, i3) = (doAvg i1, doAvg i2, doAvg i3)
doAvg :: ImageT -> Float
doAvg (r, g, b) = (r1 + g1 + b1) / d1
    where
        r1 = fromIntegral r
        g1 = fromIntegral g
        b1 = fromIntegral b
        d1 = fromIntegral 765
doWrap :: PhaseT -> WrapT
doWrap (p1, p2, p3) = (wrap, mask)
    where
        wrap = isPhase $ doPhase (p1, p2, p3)
        mask = isNoise $ diffPhase [p1, p2, p3]
doPhase :: PhaseT -> (Float, Float)
doPhase (p1, p2, p3) = (x1, x2)
    where
        x1 = sqrt 3 * (p1 - p3)
        x2 = 2 * p2 - p1 - p3
isPhase :: (Float, Float) -> Float
isPhase (x1, x2) = atan2 x1 x2 / (2 * pi)
diffPhase :: [Float] -> Float
diffPhase phases = maximum phases - minimum phases
isNoise :: Float -> Int
isNoise phase = fromEnum $ phase <= noise
shape :: Wrap -> [Int]
shape wrap = R.listOfShape $ R.extent wrap
countM :: Wrap -> (Float, Int)
countM wrap = R.foldAllS count (0,0) wrap
    where count = (\(x, y) (i, j) -> (x, y))
start :: Wrap -> UWrapT
start wrap = unwrap wrap (x, y) (ph, m, [], "")
    where
        [x0, y0] = shape wrap
        x = quot x0 2
        y = quot y0 2
        (ph, m) = getElem wrap (x0, y0)
inArr :: Wrap -> (Int, Int) -> Bool
inArr wrap (x,y) = x >= 0 && y >= 0 && x < x0 && y < y0
    where
        [x0, y0] = shape wrap
idx :: (Int, Int) -> (R.Z R.:. Int R.:. Int)
idx (x, y) = (R.Z R.:. x R.:. y)
getElem :: Wrap -> (Int, Int) -> WrapT
getElem wrap (x, y) = wrap R.! idx (x, y)
unwrap :: Wrap -> (Int, Int) -> UWrapT -> UWrapT
unwrap wrap (x, y) (ph, m, done, str) =
    if
        not $ inArr wrap (x, y) ||
        (x, y) `elem` done ||
        toEnum m::Bool
    then
        (ph, m, done, str)
    else
        up
    where
        unwrap' = doUnwrap wrap (x, y) (ph, m, done, str)
        right = unwrap wrap (x+1, y) unwrap'
        left = unwrap wrap (x-1, y) right
        down = unwrap wrap (x, y+1) left
        up = unwrap wrap (x, y-1) down
doUnwrap :: Wrap -> (Int, Int) -> UWrapT -> UWrapT
doUnwrap wrap (x, y) (ph, m, done, str) = unwrapped
    where
        unwrapped = (nph, m, (x, y):done, out)
        (phase, mask) = getElem wrap (x, y)
        rph = fromIntegral $ round ph
        off = phase - (ph - rph)
        nph = ph + (mod' (off + 0.5) 1) - 0.5
        out = doDepth wrap (x, y) (nph, m, str)
doDepth :: Wrap -> (Int, Int) -> DepthT -> String
doDepth wrap (x, y) (ph, m, str) = write (x, ys, d, str)
    where
        [x0, y0] = shape wrap
        ys = y0 - y
        ydiff = fromIntegral (y - (quot y0 2))
        plane = 0.5 - ydiff / zskew
        d = (ph - plane) * zscale
write :: (Int, Int, Float, String) -> String
write (x, y, depth, str) = str ++ vertex
    where
        vertex = xstr ++ ystr ++ zstr
        xstr = show x ++ " "
        ystr = show y ++ " "
        zstr = show depth ++ "\n"
Sorry for wasting some of your time with my first misleading advice.
You should use another 2-dimensional array of pixel states (already visited or not) instead of
(x, y) `elem` done
because the latter takes linear time.
Examples of solving almost the same task: for repa and vector, and for yarr.
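As a rough sketch of the visited-array idea (not the Repa or Yarr code the question uses), the following flood fill keeps a mutable unboxed Bool array instead of a list of visited points, so each membership check takes constant time. It relies on the array package that ships with GHC. The names floodFill, demoGrid and demoCount are hypothetical, and the explicit work list in place of nested recursion is a design choice made for this example.

```haskell
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, bounds, elems, listArray, (!))
import Data.Ix (inRange)

-- | Mark every cell reachable from `start` through passable cells,
-- moving in the four axis directions.
floodFill :: UArray (Int, Int) Bool -> (Int, Int) -> UArray (Int, Int) Bool
floodFill passable start = runSTUArray $ do
  visited <- newArray (bounds passable) False
  let go [] = return ()
      go (p@(x, y) : rest)
        | not (inRange (bounds passable) p) = go rest -- outside the image
        | not (passable ! p)                = go rest -- masked-out cell
        | otherwise = do
            seen <- readArray visited p               -- constant-time check
            if seen
              then go rest
              else do
                writeArray visited p True
                go ((x + 1, y) : (x - 1, y) : (x, y + 1) : (x, y - 1) : rest)
  go [start]
  return visited

-- | A tiny 3x3 example in which the row of cells with first index 1 is a wall.
demoGrid :: UArray (Int, Int) Bool
demoGrid = listArray ((0, 0), (2, 2)) [x /= 1 | x <- [0 .. 2 :: Int], _ <- [0 .. 2 :: Int]]

-- | Number of cells reached from the corner (0, 0).
demoCount :: Int
demoCount = length (filter id (elems (floodFill demoGrid (0, 0))))
```

In the demo grid the wall separates the two sides, so only the three cells on the starting side are reached.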
Perhaps you get the out-of-memory exception because you build a string by appending to the end (in the write function). That is the worst solution: linear time and memory consumption. You would do better to aggregate results using cons (:) and write them to the output file at the end, in reverse order. Even better, write the results to another unboxed Vector of (Int, Int, Float) elements (allocate a vector of width*height size, as an upper bound on the possible size).
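The cons-then-reverse suggestion can be sketched as follows, using only base. formatVertex and renderVertices are hypothetical names; the output format (x, y and depth separated by spaces, one vertex per line) follows the write function in the question.

```haskell
-- | Format one vertex the same way `write` does: "x y depth\n".
formatVertex :: (Int, Int, Float) -> String
formatVertex (x, y, d) = show x ++ " " ++ show y ++ " " ++ show d ++ "\n"

-- | Turn vertices that were collected newest-first with (:) into the final
-- text. The list is reversed once at the end instead of appending to a
-- growing string at every step.
renderVertices :: [(Int, Int, Float)] -> String
renderVertices newestFirst = concatMap formatVertex (reverse newestFirst)
```

During the traversal each new vertex would be prepended with (:), and the result would be passed to writeFile once, for example writeFile "out.txt" (renderVertices vertices), where vertices is the accumulated list.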

Haskell Gloss Not Animating

I have a program which simulates the interaction of many agents in a community. I'm animating the interaction using the Gloss library. The pictures of the agents render correctly, but the animation does not advance. I animate it by generating a simulation, which is a list of lists of interactions; from it I take the entry corresponding to the current second of the animation and render that interaction. The simulation code works fine when I print its output to the terminal. The code:
render :: Int -> History -> Picture -- assume window to be square
render size (History int agents) = Pictures $ map (drawAgent (step`div`2) colors step) agents
    where step = size*6 `div` (length agents)
          --agents = nub $ concat $ map (\(Interaction a1 a2 _ ) -> [a1,a2]) int
          nubNames = nub $ map (getName . name) agents --ignore the first two letters of name
          colors = Map.fromList $ zipWith (\name color-> (name, color)) nubNames (cycle colorlist)
          colorlist = [red,green,blue,yellow,cyan,magenta,rose,violet,azure,aquamarine,chartreuse,orange]
drawAgent :: Int -> Map.Map String Color -> Int -> Agent -> Picture
drawAgent size colors step agent =
    color aColor (Polygon [(posA,posB),(posA,negB),(negA,negB),(negA,posB)])
    where aColor = fromMaybe black $ Map.lookup (getName $ name agent ) colors
          a = (fst $ position agent) * step
          b = (snd $ position agent) * step
          posA = fromIntegral $ a+size
          negA = fromIntegral $ a-size
          posB = fromIntegral $ b+size
          negB = fromIntegral $ b-size
simulate :: Int -> [Agent] -> [History]
simulate len agents = trace ("simulation")
    (playRound agents len) :
    simulate len (reproduce (playRound agents len))
main = do
    a <- getStdGen
    G.animate (G.InWindow "My Window" (400, 400) (0,0)) G.white
        (\time ->(render 400) $ ((simulate 5 (agent a))!!(floor time)))
  where agent a = generate a 9
        sim a = simulate 40 (agent a)
When I execute this, it will say the simulation is running, but only render the first interaction.
$ ghc -O2 -threaded main.hs && ./main
[6 of 8] Compiling Prisoners ( Prisoners.hs, Prisoners.o )
Linking main ...
simulation
simulation
simulation
It will continue like this until I stop it, rendering the same picture each time. What am I doing wrong?
(SO Comment box won't let me paste code)
Your code doesn't compile for me. What version of Gloss are you using? The API has changed between v1.0 and v1.7.x. What version of GHC? What OS?
Does this simple example work for you?
{- left click to create a circle; escape to quit -}
import Graphics.Gloss
import Graphics.Gloss.Interface.Pure.Game
initial _ = [(0.0,0.0) :: Point]
event (EventMotion (x,y)) world = world --(x,y)
event (EventKey (MouseButton LeftButton) Up mods (x,y)) world = (x,y):world
event _ world = world
step time world = world
draw pts = Pictures $ map f pts
    where f (x,y) = translate x y (circle 10)
m = play (InWindow "Hi" (600,600) (200,200)) white 1 (initial 0) draw event step
