Can this Clojure code be optimized? - performance

I wrote the code below for a game I am working on, but it seems a little slow. It's the A* search/pathfinding algorithm. It takes about 100-600 ms for a 100x100 grid, depending on the heuristic used (and consequently the number of tiles visited).
There are no reflection warnings. However, I suspect boxing might be an issue, but I don't know how to get rid of it in this case, because the computation is split among several functions. Also, I store tiles/coordinates as vectors of two numbers, like this: [x y]. But then the numbers will be boxed, right? A typical piece of code, if you don't want to read through it all, is (def add-pos (partial mapv + pos)), where pos is the aforementioned kind of two-number vector. There are several places where the numbers are manipulated in a way similar to add-pos above and put back in a vector afterwards. Is there any way to optimize code like this? Any other tips are welcome too, performance-related or otherwise.
EDIT: Thinking some more about it, I came up with a few follow-up questions: Can a Clojure function ever return primitives? Can a Clojure function ever take primitives (without any boxing)? Can I put primitives in a type/record without boxing?
(ns game.server.pathfinding
  (:use game.utils)
  (:require [clojure.math.numeric-tower :as math]
            [game.math :as gmath]
            [clojure.data.priority-map :as pm]))

(defn walkable? [x]
  (and x (= 1 x)))

(defn point->tile
  ([p] (apply point->tile p))
  ([x y] [(int x) (int y)]))
(defn get-tile
  "Gets the type of the tile at the point v in
   the grid m. v is a point in R^2, not grid indices."
  [m v]
  (get-in m (point->tile v)))
(defn integer-points
  "Given an equation: x = start + t * step, returns a list of the
   values for t that make x an integer between start and stop,
   or nil if there is no such value for t."
  [start stop step]
  (if-not (zero? step)
    (let [first-t (-> start ((if (neg? step) math/floor math/ceil))
                      (- start) (/ step))
          t-step (/ 1 (math/abs step))]
      (take-while #((if (neg? step) > <) (+ start (* step %)) stop)
                  (iterate (partial + t-step) first-t)))))
(defn crossed-tiles [[x y :as p] p2 m]
  (let [[dx dy :as diff-vec] (map - p2 p)
        ipf (fn [getter]
              (integer-points (getter p) (getter p2) (getter diff-vec)))
        x-int-ps (ipf first)
        y-int-ps (ipf second)
        get-tiles (fn [[x-indent y-indent] t]
                    (->> [(+ x-indent x (* t dx)) (+ y-indent y (* t dy))]
                         (get-tile m)))]
    (concat (map (partial get-tiles [0.5 0]) x-int-ps)
            (map (partial get-tiles [0 0.5]) y-int-ps))))
(defn clear-line?
  "Returns true if the line between p and p2 passes over only
   walkable? tiles in m, otherwise false."
  [p p2 m]
  (every? walkable? (crossed-tiles p p2 m)))

(defn clear-path?
  "Returns true if a circular object with radius r can move
   between p and p2, passing over only walkable? tiles in m,
   otherwise false.
   Note: Does not currently work for objects with a radius >= 0.5."
  [p p2 r m]
  (let [diff-vec (map (partial * r) (gmath/normalize (map - p2 p)))
        ortho1 ((fn [[x y]] (list (- y) x)) diff-vec)
        ortho2 ((fn [[x y]] (list y (- x))) diff-vec)]
    (and (clear-line? (map + ortho1 p) (map + ortho1 p2) m)
         (clear-line? (map + ortho2 p) (map + ortho2 p2) m))))
(defn straighten-path
  "Given a path in the map m, remove unnecessary nodes of
   the path. A node is removed if one can pass freely
   between the previous and the next node."
  ([m path]
   (if (> (count path) 2) (straighten-path m path nil) path))
  ([m [from mid to & tail] acc]
   (if to
     (if (clear-path? from to 0.49 m)
       (recur m (list* from to tail) acc)
       (recur m (list* mid to tail) (conj acc from)))
     (reverse (conj acc from mid)))))

(defn to-mid-points [path]
  (map (partial map (partial + 0.5)) path))

(defn to-tiles [path]
  (map (partial map int) path))
(defn a*
  "A* search for a grid of squares, mat. Tries to find a
   path from start to goal using only walkable? tiles.
   start and goal are vectors of indices into the grid,
   not points in R^2."
  [mat start goal factor]
  (let [width (count mat)
        height (count (first mat))]
    (letfn [(h [{pos :pos}] (* factor (gmath/distance pos goal)))
            (g [{:keys [pos parent]}]
              (if parent
                (+ (:g parent) (gmath/distance pos (parent :pos)))
                0))
            (make-node [parent pos]
              (let [node {:pos pos :parent parent}
                    g (g node) h (h node)
                    f (+ g h)]
                (assoc node :f f :g g :h h)))
            (get-path
              ([node] (get-path node ()))
              ([{:keys [pos parent]} path]
               (if parent
                 (recur parent (conj path pos))
                 (conj path pos))))
            (free-tile? [tile]
              (let [type (get-in mat (vec tile))]
                (and type (walkable? type))))
            (expand [closed pos]
              (let [adj [[1 0] [0 1] [-1 0] [0 -1]]
                    add-pos (partial mapv + pos)]
                (->> (take 4 (partition 2 1 (cycle adj)))
                     (map (fn [[t t2]]
                            (list* (map + t t2) (map add-pos [t t2]))))
                     (map (fn [[d t t2]]
                            (if (every? free-tile? [t t2]) d nil)))
                     (remove nil?)
                     (concat adj)
                     (map add-pos)
                     (remove (fn [[x y :as tile]]
                               (or (closed tile) (neg? x) (neg? y)
                                   (>= x width) (>= y height)
                                   (not (walkable? (get-in mat tile)))))))))
            (add-to-open [open tile->node [{:keys [pos f] :as node} & more]]
              (if node
                (if (or (not (contains? open pos))
                        (< f (open pos)))
                  (recur (assoc open pos f)
                         (assoc tile->node pos node)
                         more)
                  (recur open tile->node more))
                {:open open :tile->node tile->node}))]
      (let [start-node (make-node nil start)]
        (loop [closed #{}
               open (pm/priority-map start (:f start-node))
               tile->node {start start-node}]
          (let [[curr _] (peek open) curr-node (tile->node curr)]
            (when curr
              (if (= curr goal)
                (get-path curr-node)
                (let [exp-tiles (expand closed curr)
                      exp-nodes (map (partial make-node curr-node) exp-tiles)
                      {:keys [open tile->node]}
                      (add-to-open (pop open) tile->node exp-nodes)]
                  (recur (conj closed curr) open tile->node))))))))))
(defn find-path [mat start goal]
  (let [start-tile (point->tile start)
        goal-tile (point->tile goal)
        path (a* mat start-tile goal-tile)
        point-path (to-mid-points path)
        full-path (concat [start] point-path [goal])
        final-path (rest (straighten-path mat full-path))]
    final-path))

I recommend the Clojure High Performance Programming book for addressing questions like yours.
There are functions to unbox primitives (byte, short, int, long, float, double).
Warn-on-reflection does not apply to numeric type reflection / failure to optimize numeric code. There is a lib to force warnings for numeric reflection - primitive-math.
You can declare the types of function arguments and function return values (defn ^Integer foo [^Integer x ^Integer y] (+ x y)).
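To answer the EDIT questions about primitives: functions of up to four arguments can take and return unboxed longs and doubles if you hint them with ^long or ^double; class hints like ^Integer only help avoid reflection, they don't unbox. A minimal sketch, not from the original answer:
(defn dist-sq ^double [^double x ^double y]
  ;; x and y arrive as primitive doubles and the result is returned as a
  ;; primitive double; no boxing happens inside the body.
  (+ (* x x) (* y y)))

(dist-sq 3.0 4.0) ;; => 25.0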
Avoid apply if you want performance.
Avoid varargs (a common reason to need apply) if you want performance. Varargs functions create garbage on every invocation (in order to construct the args seq, which usually is not used outside the function body). partial always constructs a varargs function. Consider replacing the varargs (partial * x) with #(* x %); the latter can be optimized much more aggressively.
There is a tradeoff with using primitive JVM single-type arrays (they are mutable and fixed in length, which can lead to more complex and brittle code), but they will perform better than the standard Clojure sequential types, and are available if all else fails to get the performance you need.
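For example (hypothetical names, just to show the shape of the change):
;; partial builds a variadic function, so every call collects its
;; arguments into a seq before multiplying.
(def scale-partial (partial * 2.5))

;; The anonymous-function literal compiles to a fixed one-argument
;; function, with no per-call argument collection.
(def scale-inline #(* 2.5 %))

(map scale-inline [1 2 3]) ;; => (2.5 5.0 7.5)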
Also, use criterium to compare various implementations of your code, it has a bunch of tricks to help rule out the random things that affect execution time so you can see what really performs best in a tight loop.
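A minimal criterium sketch (assuming the library is on the classpath; the grid and endpoints below are placeholders, not from the question):
(require '[criterium.core :refer [quick-bench]])

;; A 100x100 grid of walkable tiles, just to give a* something to chew on.
(def test-mat (vec (repeat 100 (vec (repeat 100 1)))))

(quick-bench (a* test-mat [0 0] [99 99] 1))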
Also, regarding your representation of a point as [x y] - you can reduce the space and lookup overhead of the collection holding them with (defrecord point [x y]) (as long as you know they will remain two elements only, and you don't mind changing your code to ask for (:x point) or (:y point)). You could further optimize by making or using a simple two-number Java class (with the tradeoff of losing immutability).
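A sketch of the record-based representation (the field names x and y are the ones suggested above):
(defrecord Point [x y])

(def p (->Point 1 2))
(:x p) ;; => 1

;; The mapv-based addition from the question becomes explicit
;; field-by-field arithmetic:
(defn add-points [a b]
  (->Point (+ (:x a) (:x b)) (+ (:y a) (:y b))))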

Related

Understanding list comprehension modifier order in Clojure

I'm currently learning Clojure and am stuck with list comprehension.
;; https://stackoverflow.com/a/7625207/4110233
(defn gen-primes "Generates an infinite, lazy sequence of prime numbers"
  []
  (letfn [(reinsert [table x prime]
            (update-in table [(+ prime x)] conj prime))
          (primes-step [table d]
            (if-let [factors (get table d)]
              (recur (reduce #(reinsert %1 d %2) (dissoc table d) factors)
                     (inc d))
              (lazy-seq (cons d (primes-step (assoc table (* d d) (list d))
                                             (inc d))))))]
    (primes-step {} 2)))
(defn prime-factors-not-working [x]
  (for [y (gen-primes)
        :when (= (mod x y) 0)
        :while (< y (Math/sqrt x))]
    y))

(defn prime-factors-working [x]
  (for [y (gen-primes)
        :while (< y (Math/sqrt x))
        :when (= (mod x y) 0)]
    y))

(prime-factors-working 100)
;; ↪ (2 5)

(prime-factors-not-working 100)
;; Goes into infinite loop
(gen-primes) is a lazy sequence of prime numbers. The only difference between the working and not-working versions is the order of the :while and :when modifiers. Why does the not-working implementation go into an infinite loop?
The not-working variant expands conceptually (but not factually) into this:
(loop [ys (gen-primes)
       result []]
  (if (seq ys)
    (let [y (first ys)]
      (if (= (mod x (first ys)) 0) ;; Can be replaced with `(zero? (mod x y))` BTW.
        (if (< y (Math/sqrt x))
          (recur (next ys) (conj result y))
          result)
        (recur (next ys) result)))
    result))
As you can see, if (mod x (first ys)) is not 0, it moves on to the next number - without ever checking the < condition, so it keeps going forever.
When you exchange :when and :while, the checks in the pseudo-expansion above are also swapped - stopping the iteration once y reaches the square root of x.
The macro expansion of for is sensitive to the order in which you put :when and :while; macroexpanding gives slightly different code.
You can go very far in Clojure without relying on complicated macros beyond defn. for is not very common, and this isn't a use case where it is clearly advantageous over map->filter->take.
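For comparison, here is a rough for-free sketch of the same computation, reusing the gen-primes above:
;; Take primes while they are below the square root of x, then keep
;; the ones that divide x evenly.
(defn prime-factors-seq [x]
  (->> (gen-primes)
       (take-while #(< % (Math/sqrt x)))
       (filter #(zero? (mod x %)))))

(prime-factors-seq 100) ;; => (2 5)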
good expansion: line 28:
(when (< y (Math/sqrt x)) ; XXX
  (if (= (mod x y) 0) ; XXX
    (do
      (chunk-append b__86695 y)
      (recur (unchecked-inc i__86694)))
    (recur (unchecked-inc i__86694))))
bad expansion: line 28:
(if (= (mod x y) 0) ; XXX
  (when (< y (Math/sqrt x)) ; XXX
    (do
      (chunk-append b__86666 y)
      (recur (unchecked-inc i__86665))))
  (recur (unchecked-inc i__86665)))
You can learn about the implementation of the for macro by going to its source code (your editor should have a way of navigating to the definitions of symbols):
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L4654
There is nothing about Clojure that requires its macros to write code the way you think they should, though in this case it may be a bug in for; it's hard to tell. Some use cases may depend on the order in which :while and :when are written.
Since this is about learning, and macros are pretty much magic unless you see the code they write out, I think the best way to learn is to figure out how to view the expanded macro forms in your code. This is generally how macros are debugged.
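For instance, at a REPL (a minimal sketch):
(require '[clojure.pprint :refer [pprint]])

;; Expand one level of the for macro to see where :when and :while
;; end up in the generated code:
(pprint (macroexpand-1
          '(for [y (range 10)
                 :when (odd? y)
                 :while (< y 7)]
             y)))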
link to topic on macro expansion
how is a macro expanded in clojure?
documentation for expanding macros in cider (emacs clojure editor)
https://docs.cider.mx/cider/debugging/macroexpansion.html

Implement a faster algorithm

I have been stuck on this question for days. Apparently I need to write a better algorithm to beat the algorithm below. The code below is taken from the well-known AIMA codebase. Is there any expert here who could guide me on how to beat this algorithm?
(defun find-closest (list)
  (let ((x (car (array-dimensions list)))
        (y (cadr (array-dimensions list))))
    (let ((elems (aref list x y)))
      (dolist (e elems)
        (when (eq (type-of e) type)
          (return-from find-closest (list x y))))
      nil)))
I tried implementing a DFS but failed and I do not quite know why. Below is my code.
(defun find-closest (list)
  (let ((open (list list))
        (closed (list))
        (steps 0)
        (expanded 0)
        (stored 0))
    (loop while open do
      (let ((x (pop open)))
        (when (finished? x)
          (return (format nil "Found ~a in ~a steps.
Expanded ~a nodes, stored a maximum of ~a nodes." x steps expanded stored)))
        (incf steps)
        (pushnew x closed :test #'equal)
        (let ((successors (successors x)))
          (incf expanded (length successors))
          (setq successors
                (delete-if (lambda (a)
                             (or (find a open :test #'equal)
                                 (find a closed :test #'equal)))
                           successors))
          (setq open (append open successors))
          (setq stored (max stored (length open))))))))
Looking at the code, the function find-some-in-grid returns the first found thing of the given type. This will, essentially, give you O(n * m) time for an n * m world (imagine a world where you have one dirt on each line, alternating between "left-most" and "right-most").
Since you can pull out a list of all dirt locations, you can build a shortest traversal, or at least a shorter-than-dumb traversal, by picking the closest dirt (for some distance metric) instead of whatever dirt you happen to find first. From the code it looks like you have Manhattan distances (that is, you can only move along the X or the Y axis, not both at the same time). That should give you a robot that is at least as good as the dumb-traversal robot and frequently better, even if it's not optimal.
With the provision that I do NOT have the book and base implementation purely on what's in your question, something like this might work:
(defun find-closest-in-grid (radar type pos-x pos-y)
  (labels ((distance (x y)
             (+ (abs (- x pos-x))
                (abs (- y pos-y)))))
    (destructuring-bind (width height)
        (array-dimensions radar)
      (let ((best nil)
            (best-distance (+ width height)))
        (loop for x from 0 below width
              do (loop for y from 0 below height
                       do (loop for element in (aref radar x y)
                                do (when (eql (type-of element) type)
                                     (when (<= (distance x y) best-distance)
                                       (setf best (list x y))
                                       (setf best-distance (distance x y)))))))
        best))))

Performance of vector and array in Clojure

I am trying to solve the Maximum Subarray problem on HackerRank. This is a standard DP problem, and I wrote an O(n) solution:
(defn dp
  [v]
  (let [n (count v)]
    (loop [i 1 f (v 0) best f]
      (if (< i n)
        (let [fi (max (v i) (+ f (v i)))]
          (recur (inc i) fi (max fi best)))
        best))))

(defn positive-only
  [v]
  (reduce + (filterv #(> % 0) v)))

(defn line->ints
  [line]
  (->>
    (clojure.string/split line #" ")
    (map #(Integer. %))
    (into [])))

(let [T (Integer. (read-line))]
  (loop [test 0]
    (when (< test T)
      (let [_ (read-line)
            x (read-line)
            v (line->ints x)
            a (dp-array v)
            b (let [p (positive-only v)]
                (if (= p 0) (reduce max v) p))]
        (printf "%d %d\n" a b))
      (recur (inc test)))))
To my surprise, I got "time limit exceeded" for a large test case. I downloaded the input file and found that the above version needs about 3 seconds to run.
I thought the bottleneck is in (v i) (getting the i-th element in vector v). So I changed the data structure from vector to an array:
(defn dp-array
  [v0]
  (let [v (into-array v0)
        n (int (alength v))]
    (loop [i 1
           f (aget v 0)
           best f]
      (if (< i n)
        (let [fi (max (aget v i) (+ f (aget v i)))]
          (recur (inc i) fi (max fi best)))
        best))))
This array version is even slower. On the same input, it takes 33 seconds, much slower than the vector version. I think the slowness is due to boxing and unboxing. I tried to add type hints, but encountered run-time errors. Could anyone help me improve the dp-array function? Thanks!
Also, I would greatly appreciate it if anyone knows how to improve the vector version.
UPDATE:
Finally I managed to get my Clojure program accepted, not by optimizing the dynamic programming function, but by changing (Integer. str) to (Integer/parseInt str). In this way, reflection is avoided when converting from string to integer.
I also replaced into-array with int-array. But the speed of the two versions is still on par with each other; I would expect the array version to be faster than the vector version.
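For reference, the (Integer. str) to (Integer/parseInt str) change looks like this (tokens is a small stand-in vector, not from the original code):
(def tokens ["1" "-2" "3"]) ;; example input, stands in for the real parsing

;; Constructor form: the argument type is unknown to the compiler, so the
;; Integer constructor is chosen reflectively at runtime.
(map #(Integer. %) tokens)

;; Static method form: resolved at compile time, no reflection.
(map #(Integer/parseInt %) tokens)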
The Clojure compiler can't infer the type of v in the array version of the dp-array function, because its argument v0 has unknown type. This makes the subsequent alength and aget calls pay reflection costs. In order to avoid these unnecessary reflective calls, you have to replace into-array with long-array.
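A sketch of that suggestion applied to dp-array, assuming the input vector holds longs:
(defn dp-long-array [v0]
  ;; long-array gives v a known primitive array type, so alength and
  ;; aget compile to direct array operations instead of reflective calls.
  (let [^longs v (long-array v0)
        n (alength v)]
    (loop [i 1
           f (aget v 0)
           best f]
      (if (< i n)
        (let [fi (max (aget v i) (+ f (aget v i)))]
          (recur (inc i) fi (max fi best)))
        best))))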

How to go about composing core functions, rather then using imperative style?

I have translated this code, the snippet below, from Python to Clojure. I replaced Python's while construct with Clojure's loop-recur here. But this doesn't look idiomatic.
(loop [d 2 [n & more] (list 256)]
  (if (> n 1)
    (recur (inc d)
           (loop [x n sublist more]
             (if (= (rem x d) 0)
               (recur (/ x d) (conj sublist d))
               (conj sublist x))))
    (sort more)))
This routine gives me (3 3 31), that is, the prime factors of 279. For 256 it gives (2 2 2 2 2 2 2 2), which means 2^8.
Moreover, it performs poorly for large values, say 987654123987546 instead of 279, whereas Python's counterpart works like a charm.
How do I start composing core functions, rather than translating imperative code as is? And specifically, how can I improve this bit?
Thanks.
[Edited]
Here is the python code, I referred above,
def prime_factors(n):
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n /= d
        d = d + 1
    return factors
A straight translation of the Python code in Clojure would be:
(defn prime-factors [n]
  (let [n (atom n)          ;; The Python code makes use of mutability which
        factors (atom [])   ;; isn't idiomatic in Clojure, but can be emulated
        d (atom 2)]         ;; using atoms
    (loop []
      (when (< 1 @n)
        (loop []
          (when (== (rem @n @d) 0)
            (swap! factors conj @d)
            (swap! n quot @d)
            (recur)))
        (swap! d inc)
        (recur)))
    @factors))

(prime-factors 279) ;; => [3 3 31]
(prime-factors 987654123987546) ;; => [2 3 41 14389 279022459]
(time (prime-factors 987654123987546)) ;; "Elapsed time: 13993.984 msecs"
                                       ;; same performance on my machine
                                       ;; as the Rosetta Code solution
You can improve this code to make it more idiomatic:
from nested loops to a single loop:
(loop []
  (cond
    (<= @n 1) @factors
    (not= (rem @n @d) 0) (do (swap! d inc)
                             (recur))
    :else (do (swap! factors conj @d)
              (swap! n quot @d)
              (recur))))))
get rid of the atoms:
(defn prime-factors [n]
  (loop [n n
         factors []
         d 2]
    (cond
      (<= n 1) factors
      (not= (rem n d) 0) (recur n factors (inc d))
      :else (recur (quot n d) (conj factors d) d))))
replace == 0 by zero?:
(not (zero? (rem n d))) (recur n factors (inc d))
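With that change applied, the whole function reads (a sketch of the same loop):
(defn prime-factors [n]
  (loop [n n
         factors []
         d 2]
    (cond
      (<= n 1) factors
      (not (zero? (rem n d))) (recur n factors (inc d))
      :else (recur (quot n d) (conj factors d) d))))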
You can also overhaul it completely to make a lazy version of it:
(defn prime-factors [n]
  ((fn step [n d]
     (lazy-seq
       (when (< 1 n)
         (cond
           (zero? (rem n d)) (cons d (step (quot n d) d))
           :else (recur n (inc d))))))
   n 2))
I planned to have a section on optimization here, but I'm no specialist. The only thing I can say is that you can trivially make this code faster by interrupting the loop when d is greater than the square root of n:
(defn prime-factors [n]
  (if (< 1 n)
    (loop [n n
           factors []
           d 2]
      (let [q (quot n d)]
        (cond
          (< q d) (conj factors n)
          (zero? (rem n d)) (recur q (conj factors d) d)
          :else (recur n factors (inc d)))))
    []))
(time (prime-factors 987654123987546)) ;; "Elapsed time: 7.124 msecs"
Not every loop unrolls cleanly into an elegant "functional" decomposition.
The Rosetta Code solution suggested by @edbond is pretty simple and concise; I would say it's idiomatic since no obvious "functional" solution is apparent. That solution runs noticeably faster on my machine than your Python version for 987654123987546.
More generally, if you're looking to expand your understanding of functional idioms, Bedra and Halloway's "Programming Clojure" (pp.90-95) presents an excellent comparison of different versions of the Fibonacci sequence, using loop, lazy seqs, and an elegant "functional" version. Chouser and Fogus's "Joy of Clojure" (MEAP version) also has a nice section on function composition.

Performance of function in Clojure 1.3

I was wondering if someone could help me with the performance of this code snippet in Clojure 1.3. I am trying to implement a simple function that takes two vectors and does a sum of products.
So let's say the vectors are X (size 10,000 elements) and B (size 3 elements), and the sum of products are stored in a vector Y, mathematically it looks like this:
Y0 = B0*X2 + B1*X1 + B2*X0
Y1 = B0*X3 + B1*X2 + B2*X1
Y2 = B0*X4 + B1*X3 + B2*X2
and so on ...
For this example, the size of Y will end up being 9997, which corresponds to (10,000 - 3). I've set up the function to accept any size of X and B.
Here's the code: It basically takes (count b) elements at a time from X, reverses it, maps * onto B and sums the contents of the resulting sequence to produce an element of Y.
(defn filt [b-vec x-vec]
  (loop [n 0 sig x-vec result []]
    (if (= n (- (count x-vec) (count b-vec)))
      result
      (recur (inc n) (rest sig) (conj result (->> sig
                                                  (take (count b-vec))
                                                  (reverse)
                                                  (map * b-vec)
                                                  (apply +)))))))
Upon letting X be (vec (range 1 10001)) and B being [1 2 3], this function takes approximately 6 seconds to run. I was hoping someone could suggest improvements to the run time, whether it be algorithmic, or perhaps a language detail I might be abusing.
Thanks!
P.S. I have done (set! *warn-on-reflection* true) but don't get any reflection warning messages.
You are using count many times unnecessarily. The code below calculates the counts only once:
(defn filt [b-vec x-vec]
  (let [bc (count b-vec) xc (count x-vec)]
    (loop [n 0 sig x-vec result []]
      (if (= n (- xc bc))
        result
        (recur (inc n) (rest sig) (conj result (->> sig
                                                    (take bc)
                                                    (reverse)
                                                    (map * b-vec)
                                                    (apply +))))))))
(time (def b (filt [1 2 3] (range 10000))))
=> "Elapsed time: 50.892536 msecs"
If you really want top performance for this kind of calculation, you should use arrays rather than vectors. Arrays have a number of performance advantages:
They support O(1) indexed lookup and writes - marginally better than vectors which are O(log32 n)
They are mutable, so you don't need to construct new arrays all the time - you can just create a single array to serve as the output buffer
They are represented as Java arrays under the hood, so benefit from the various array optimisations built into the JVM
You can use primitive arrays (e.g. of Java doubles) which are much faster than if you use boxed number objects
Code would be something like:
(defn filt [^doubles b-arr
            ^doubles x-arr]
  (let [bc (count b-arr)
        xc (count x-arr)
        rc (inc (- xc bc))
        result ^doubles (double-array rc)]
    (dotimes [i rc]
      (dotimes [j bc]
        (aset result i (+ (aget result i) (* (aget x-arr (+ i j)) (aget b-arr j))))))
    result))
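Usage requires the inputs to already be primitive double arrays, e.g. (a sketch, with made-up input values):
;; Coefficients and signal as primitive double arrays:
(def b (double-array [1.0 2.0 3.0]))
(def x (double-array (range 1 10001)))

;; y is a double[] holding the sums of products.
(time (def y (filt b x)))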
To follow on to Ankur's excellent answer, you can also avoid repeated calls to the reverse function, which gets us even a little more performance.
(defn filt [b-vec x-vec]
  (let [bc (count b-vec) xc (count x-vec) bb-vec (reverse b-vec)]
    (loop [n 0 sig x-vec result []]
      (if (= n (- xc bc))
        result
        (recur (inc n) (rest sig) (conj result (->> sig
                                                    (take bc)
                                                    (map * bb-vec)
                                                    (apply +))))))))
