Performance Problem with Clojure Array

This piece of code is very slow. Executing it from the SLIME REPL on my netbook takes a couple of minutes.
(def test-array (make-array Integer/TYPE 400 400 3))
(doseq [x (range 400), y (range 400), z (range 3)]
  (aset test-array x y z 0))
Conversely, this code runs really fast:
(def max-one (apply max (map (fn [w] (apply max (map #(first %) w))) test-array)))
(def max-two (apply max (map (fn [w] (apply max (map #(second %) w))) test-array)))
(def max-three (apply max (map (fn [w] (apply max (map #(last %) w))) test-array)))
Does this have something to do with chunked sequences? Is my first example just written wrong?

You're hitting Java reflection. This blog post has a workaround:
http://clj-me.cgrand.net/2009/10/15/multidim-arrays/
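A minimal sketch of what such a workaround can look like (written for this illustration, not copied from the linked post): fetch each inner array once, give it a type hint, and call aset on the innermost int[] so the compiler can resolve the primitive overload at compile time instead of reflecting on every call. The helper name fill-3d! is made up.

```clojure
;; Hypothetical helper (not from the linked post): fills a 3-D int array
;; without reflection by hinting each level of the nested array.
(defn fill-3d!
  [^objects arr v]
  (let [v (int v)]
    (dotimes [x (alength arr)]
      (let [^objects plane (aget arr x)]       ; an int[][]
        (dotimes [y (alength plane)]
          (let [^ints row (aget plane y)]      ; an int[]
            (dotimes [z (alength row)]
              (aset row z v)))))))
  arr)

(def test-array (make-array Integer/TYPE 400 400 3))
(fill-3d! test-array 0)
```

The multi-index form (aset test-array x y z 0) goes through a variadic arity that cannot be type-hinted all the way down, which is why the sketch walks the levels by hand.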

You might get better performance from one of the four Clojure matrix implementations available through a single interface, core.matrix (published on Clojars and on GitHub).

Related

Scheme less than average function

I want to create a function in Scheme that takes a list of numbers and displays how many of them are less than the average. The answer should be 3, but my code returns 2. I believe it is not reading "15". What am I doing wrong?
(define x (list 10 60 3 55 15 45 40))
(display "The list: ")
(let ((average (/ (apply + (cdr x)) (car x))))
  (length (filter (lambda (x) (< x average)) (cdr x))))
Output:
The list:
(10 60 3 55 15 45 40)
The average:
32.57
Number of values less than average:
2

Sure, let's do this step by step!
First off, let's define a function to get us the average of a list. We'll call this function mean.
(define (mean lst)
  (/ (apply + lst) (length lst)))
We get the average by adding all the numbers together and dividing that sum by how many numbers were in the list (that is to say, the length of the list). There are Racket libraries that could provide us with this function, such as the Statistics Functions library from the math-lib package. But we'll do it ourselves since it's simple enough.
Next we have the meat of our algorithm, where we define a function that takes a list, gets the average, and keeps only the elements that are less than the average.
(define (less-than-average lst)
  (filter (λ (x) (< x (mean lst))) lst))
Looks pretty similar to your code, right? Let's see if it works:
(less-than-average (list 10 60 3 55 15 45 40))
I ran this in DrRacket and it gave me (10 3 15), which is the correct answer. So why did this work when your (very similar) code does not?
The problem with your code is that the value you are storing in average is incorrect. Viz,
(define x (list 10 60 3 55 15 45 40))
(let ((average (/ (apply + (cdr x)) (car x))))
  average)
evaluates to 21.8. As you state, the correct average is 32.57. Your current technique adds up everything in the list after the first element (that's what (apply + (cdr x)) does) and then divides that sum by the first element. This does not give you the mean of the list. What you ought to do is sum the entire list (via (apply + x)) and then divide by how many numbers are in the list (i.e. (length x)).

This answer tries to pay attention to performance. The other answer by Alex has a mean function which walks the list twice: once to add up the elements, and once to compute the length. It then calls this function for every element of the list when filtering it, resulting in a function which takes time quadratic in the length of the list being averaged. This is not a problem for small lists.
Here is a mean function which walks the list once.
(define (list-average l)
  (let average-loop ([tail l] [sum 0] [len 0])
    (if (null? tail)
        (/ sum len)
        (average-loop (rest tail) (+ sum (first tail)) (+ len 1)))))
This is a little better than one which walks it twice, but the difference is probably not significant (naively it might be twice as fast, in practice probably less).
Here is a filtering function which is careful to call the mean function only once. This is a whole complexity class faster than one which calls it for every element, resulting in a function which takes time proportional to the length of the list, not the length of the list squared.
(define (<=-average l)
  (define average (list-average l))
  (filter (λ (e) (<= e average)) l))

I will not comment too much; I have just written such a function here, and you can study it:
(define less-than-average
  (lambda (list return)
    ((lambda (self) (self self list 0 0 return))
     (lambda (self list acc n col)
       (if (null? list)
           (col (/ acc n) list)
           (let ((a (car list)))
             (self self
                   (cdr list)
                   (+ acc a)
                   (+ n 1)
                   (lambda (average l*) ; when the average is known,
                     (if (< a average)
                         (col average (cons a l*))
                         (col average l*))))))))))
1 ]=> (less-than-average (list 10 60 3 55 15 45 40)
                         (lambda (avg l*)
                           (newline)
                           (display avg)
                           (newline)
                           (display l*)
                           (newline)))
228/7
(10 3 15)

Optimisation of the number of divisors of a number

I am using the code below to get the number of divisors of a given number, as per this answer: https://stackoverflow.com/a/110365/2610955. I calculate how many times each prime factor is repeated, increment each of those counts, and take their product. But the algorithm seems too slow. Is there a way to optimise the program? I tried type hints, but they made no difference. Is there something wrong with the algorithm, or am I missing an optimisation here?
(defn prime-factors
  [^Integer n]
  (loop [num n
         divisors {}
         limit (-> n Math/sqrt Math/ceil Math/round inc)
         init 2]
    (cond
      (or (>= init limit) (zero? num)) divisors
      (zero? (rem num init)) (recur (quot num init) (update divisors init #(if (nil? %) 1 (inc %))) limit init)
      :else (recur num divisors limit (inc init)))))
(prime-factors 6)
(set! *unchecked-math* true)
(dotimes [_ 10] (->> (reduce *' (range 30 40))
                     prime-factors
                     vals
                     (map inc)
                     (reduce *' 1)
                     time))
Edit: Removed rem as it's not needed in the final output.

Minor thing: use int or long instead of Integer, as @leetwinski mentioned.
Major reason:
You loop over every number from 2 to sqrt(n), which visits many unnecessary candidates.
Take a look at the code below (this is just a dirty patch):
(defn prime-factors
  [n]
  (loop [num n
         divisors {}
         limit (-> n Math/sqrt int)
         init 2]
    (cond
      (or (>= init limit) (zero? num)) divisors
      (zero? (rem num init)) (recur (quot num init) (update divisors init #(if (nil? %) 1 (inc %))) limit init)
      :else (recur num divisors limit (if (= 2 init ) (inc init) (+ 2 init))))))
I just replaced (inc init) with (if (= 2 init ) (inc init) (+ 2 init)), so the loop visits 2 and then only odd numbers. If you run it, you will notice that the execution time drops to almost half of the original version, because it skips the even numbers (except 2).
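One thing to watch with trial division up to the square root: whatever is left of the number when the loop stops is itself a prime factor if it is greater than 1, so it has to be counted too. Below is a self-contained sketch of the odd-only loop with that case handled. It is a variant written for illustration, not the code from the question or the patch above, and divisor-count is a hypothetical helper name.

```clojure
(defn prime-factors
  "Returns a map of prime factor -> exponent for n >= 1."
  [^long n]
  (loop [num n, factors {}, d 2]
    (cond
      ;; No candidate divisor <= sqrt(num) is left.
      (> (* d d) num)
      (if (> num 1)
        (update factors num (fnil inc 0))   ; the leftover is itself prime
        factors)

      ;; d divides num: record it and keep dividing by the same d.
      (zero? (rem num d))
      (recur (quot num d) (update factors d (fnil inc 0)) d)

      ;; Otherwise move on: 2 -> 3, then odd numbers only.
      :else
      (recur num factors (if (== d 2) 3 (+ d 2))))))

(defn divisor-count
  "Number of divisors of n: the product of (exponent + 1) over its prime factors."
  [n]
  (reduce * (map inc (vals (prime-factors n)))))

(divisor-count 12) ; => 6
```

Because the bound is checked against the shrinking num rather than the original n, the loop also ends earlier once the small factors have been divided out.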
If you loop over only prime numbers, it will be much faster than this. You can get a sequence of primes from clojure.contrib.lazy-seqs/primes. Although this contrib namespace has been deprecated, you can still use it.
This is my approach:
(ns util
  (:require [clojure.contrib.lazy-seqs :refer (primes)]))
(defn factorize
  "Returns a sequence of pairs of prime factor and its exponent."
  [n]
  (loop [n n, ps primes, acc []]
    (let [p (first ps)]
      (if (<= p n)
        (if (= 0 (mod n p))
          (recur (quot n p) ps (conj acc p))
          (recur n (rest ps) acc))
        (->> (group-by identity acc)
             (map (fn [[k v]] [k (count v)])))))))
Then you can use this function like this:
(dotimes [_ 10] (->> (reduce *' (range 30 40))
                     factorize
                     (map second)
                     (map inc)
                     (reduce *' 1)
                     time))

You can increase the performance by using a primitive int in the loop. Just replace (loop [num n... with (loop [num (int n)...
For me, this runs 4 to 5 times faster.
Another variant (which is in fact the same) is to change the type hint in the function signature to ^long.
The problem is that the ^Integer type hint doesn't affect performance in your case (as far as I know). That kind of hint only helps to avoid reflection overhead when you call methods on the object (which you don't), whereas a primitive type hint (only ^long and ^double are accepted) actually converts the value to the primitive type.
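As a small illustration of the coercion described above, here is a sketch in which the argument is coerced once with long, so that every loop local stays a primitive. The function is a made-up brute-force example, not part of the question's code.

```clojure
;; Hypothetical example: counts the divisors of n by brute force.
(defn count-divisors-naive
  [n]
  (let [n (long n)]                ; coerce once; n is now a primitive long
    (loop [i 1, acc 0]             ; integer literals make i and acc primitive longs
      (if (> i n)
        acc
        (recur (inc i)
               (if (zero? (rem n i)) (inc acc) acc))))))

(count-divisors-naive 12) ; => 6
```

Writing the parameter as [^long n] instead of coercing inside the body should have the same effect on the loop.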

Sort faster in racket using hash table

So I have an example list of elements like this
(define A (list 'a 'c 'd 'e 'f 'e 'a))
Now I want to make a ranking from this sample
(define (scan lst)
  (foldl (lambda (element a-hash) (hash-update a-hash element add1 0))
         (hash)
         lst))
The result should be like this:
> #(('a . 2) ('f . 1) ('e . 2) ....)
This is because the scan function builds a hash table containing the unique keys and the number of occurrences of each key (if it meets a key it has not seen yet, it creates a new entry for that key, counting from 0).
Then I'd like to sort that hash-table because it's unsorted:
(define (rank A)
  (define ranking (scan A))
  (sort (hash->list ranking) > #:key cdr))
So the result would look like this:
#(('a . 2) ('e . 2) ('f . 1) ...)
Now I'd like to truncate the ranking and throw away the bottom at a threshold of n = 1 (i.e. keep only the elements that occur more than once).
(define (truncate lst n)
  (define l (length lst))
  (define how-many-to-take
    (for/list
        ([i l]
         #:when (> (cdr (list-ref lst i))
                   n))
      i))
  (take lst (length how-many-to-take)))
So the result might look like this:
(('a . 2) ('e . 2))
However, at a large scale this procedure is not very efficient; it takes too long. Do you have any suggestions to improve the performance?
Thank you very much.
Part 2:
I have this data structure:
(automaton x
           (vector (state y (vector a b c))
                   (state y (vector a b c)) ...))
Then I randomly generate a population of 1000 of them, and scan and rank them using the functions above. If I just scan them as they are, it already takes a long time. If I try to flatten them into a list like this
(list x y a b c y a b c...)
it'd take even more time. Here is the flatten function:
(define (flatten-au au)
  (match-define (automaton x states) au)
  (define l (vector-length states))
  (define body
    (for/list ([i (in-range l)])
      (match-define (state y z) (vector-ref states i))
      (list y (vector->list z))))
  (flatten (list x body)))
The scan function will look a bit different:
(define (scan population)
  (foldl (lambda (auto a-hash) (hash-update a-hash (flatten-au auto) add1 0))
         (hash)
         population))

Yep, I believe I see the problem. Your algorithm has O(n^2) ("n-squared") running time. This is because you're counting from one to the length of the list, then for each index, performing a list-ref, which takes time proportional to the size of the index.
This is super-easy to fix.
In fact, there's really no reason to sort it or convert it to a list if this is what you want; just filter the hash table directly. Like this...
#lang racket
(define A (build-list 1000000 (λ (idx) (random 50))))
(define (scan lst)
  (foldl (lambda (element a-hash) (hash-update a-hash element add1 0))
         (hash)
         lst))
(define ht (scan A))
(define only-repeated
  (time
   (for/hash ([(k v) (in-hash ht)]
              #:when (< 1 v))
     (values k v))))
I added the call to time to see how long it takes. For a list of size one million, on my computer this takes a measured time of 1 millisecond.
Asymptotic complexity is important!

Can this Clojure code be optimized?

I wrote the code below for a game I am working on, but it seems a little slow. If you have not checked the code yet, it's the A* search/pathfinding algorithm. It takes about 100-600 ms for a 100x100 grid, depending on the heuristic used (and consequently the number of tiles visited).
There are no reflection warnings. However, I suspect boxing might be an issue. But I don't know how to get rid of boxing in this case, because the computation is split among several functions. Also, I save tiles/coordinates as vectors of two numbers, like this: [x y]. But then the numbers will be boxed, right? A typical piece of code, if you don't want to read through it all, is: (def add-pos (partial mapv + pos)) where pos is the aforementioned kind of two-number vector. There are several places where the numbers are manipulated in a way similar to add-pos above and put back in a vector afterwards. Is there any way to optimize code like this? Any other tips are welcome too, performance-related or otherwise.
EDIT: Thinking some more about it, I came up with a few follow-up questions: Can a Clojure function ever return primitives? Can a Clojure function ever take primitives (without any boxing)? Can I put primitives in a type/record without boxing?
(ns game.server.pathfinding
(:use game.utils)
(:require [clojure.math.numeric-tower :as math]
[game.math :as gmath]
[clojure.data.priority-map :as pm]))
(defn walkable? [x]
(and x (= 1 x)))
(defn point->tile
([p] (apply point->tile p))
([x y] [(int x) (int y)]))
(defn get-tile
"Gets the type of the tile at the point v in
the grid m. v is a point in R^2, not grid indices."
[m v]
(get-in m (point->tile v)))
(defn integer-points
"Given an equation: x = start + t * step, returns a list of the
values for t that make x an integer between start and stop,
or nil if there is no such value for t."
[start stop step]
(if-not (zero? step)
(let [first-t (-> start ((if (neg? step) math/floor math/ceil))
(- start) (/ step))
t-step (/ 1 (math/abs step))]
(take-while #((if (neg? step) > <) (+ start (* step %)) stop)
(iterate (partial + t-step) first-t)))))
(defn crossed-tiles [[x y :as p] p2 m]
(let [[dx dy :as diff-vec] (map - p2 p)
ipf (fn [getter]
(integer-points (getter p) (getter p2) (getter diff-vec)))
x-int-ps (ipf first)
y-int-ps (ipf second)
get-tiles (fn [[x-indent y-indent] t]
(->> [(+ x-indent x (* t dx)) (+ y-indent y (* t dy))]
(get-tile m)))]
(concat (map (partial get-tiles [0.5 0]) x-int-ps)
(map (partial get-tiles [0 0.5]) y-int-ps))))
(defn clear-line?
"Returns true if the line between p and p2 passes over only
walkable? tiles in m, otherwise false."
[p p2 m]
(every? walkable? (crossed-tiles p p2 m)))
(defn clear-path?
"Returns true if a circular object with radius r can move
between p and p2, passing over only walkable? tiles in m,
otherwise false.
Note: Does not currently work for objects with a radius >= 0.5."
[p p2 r m]
(let [diff-vec (map (partial * r) (gmath/normalize (map - p2 p)))
ortho1 ((fn [[x y]] (list (- y) x)) diff-vec)
ortho2 ((fn [[x y]] (list y (- x))) diff-vec)]
(and (clear-line? (map + ortho1 p) (map + ortho1 p2) m)
(clear-line? (map + ortho2 p) (map + ortho2 p2) m))))
(defn straighten-path
"Given a path in the map m, remove unnecessary nodes of
the path. A node is removed if one can pass freely
between the previous and the next node."
([m path]
(if (> (count path) 2) (straighten-path m path nil) path))
([m [from mid to & tail] acc]
(if to
(if (clear-path? from to 0.49 m)
(recur m (list* from to tail) acc)
(recur m (list* mid to tail) (conj acc from)))
(reverse (conj acc from mid)))))
(defn to-mid-points [path]
(map (partial map (partial + 0.5)) path))
(defn to-tiles [path]
(map (partial map int) path))
(defn a*
"A* search for a grid of squares, mat. Tries to find a
path from start to goal using only walkable? tiles.
start and goal are vectors of indices into the grid,
not points in R^2."
[mat start goal factor]
(let [width (count mat)
height (count (first mat))]
(letfn [(h [{pos :pos}] (* factor (gmath/distance pos goal)))
(g [{:keys [pos parent]}]
(if parent
(+ (:g parent) (gmath/distance pos (parent :pos)))
0))
(make-node [parent pos]
(let [node {:pos pos :parent parent}
g (g node) h (h node)
f (+ g h)]
(assoc node :f f :g g :h h)))
(get-path
([node] (get-path node ()))
([{:keys [pos parent]} path]
(if parent
(recur parent (conj path pos))
(conj path pos))))
(free-tile? [tile]
(let [type (get-in mat (vec tile))]
(and type (walkable? type))))
(expand [closed pos]
(let [adj [[1 0] [0 1] [-1 0] [0 -1]]
add-pos (partial mapv + pos)]
(->> (take 4 (partition 2 1 (cycle adj)))
(map (fn [[t t2]]
(list* (map + t t2) (map add-pos [t t2]))))
(map (fn [[d t t2]]
(if (every? free-tile? [t t2]) d nil)))
(remove nil?)
(concat adj)
(map add-pos)
(remove (fn [[x y :as tile]]
(or (closed tile) (neg? x) (neg? y)
(>= x width) (>= y height)
(not (walkable? (get-in mat tile)))))))))
(add-to-open [open tile->node [{:keys [pos f] :as node} & more]]
(if node
(if (or (not (contains? open pos))
(< f (open pos)))
(recur (assoc open pos f)
(assoc tile->node pos node)
more)
(recur open tile->node more))
{:open open :tile->node tile->node}))]
(let [start-node (make-node nil start)]
(loop [closed #{}
open (pm/priority-map start (:f start-node))
tile->node {start start-node}]
(let [[curr _] (peek open) curr-node (tile->node curr)]
(when curr
(if (= curr goal)
(get-path curr-node)
(let [exp-tiles (expand closed curr)
exp-nodes (map (partial make-node curr-node) exp-tiles)
{:keys [open tile->node]}
(add-to-open (pop open) tile->node exp-nodes)]
(recur (conj closed curr) open tile->node))))))))))
(defn find-path [mat start goal]
(let [start-tile (point->tile start)
goal-tile (point->tile goal)
path (a* mat start-tile goal-tile)
point-path (to-mid-points path)
full-path (concat [start] point-path [goal])
final-path (rest (straighten-path mat full-path))]
final-path))

I recommend the Clojure High Performance Programming book for addressing questions like yours.
There are functions to unbox primitives (byte, short, int, long, float, double).
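*warn-on-reflection* does not cover boxed arithmetic, i.e. numeric code that the compiler failed to optimize into primitive operations. There is a lib that forces warnings for those cases: primitive-math.
You can declare primitive types for function arguments and return values, e.g. (defn foo ^long [^long x ^long y] (+ x y)). Only ^long and ^double are supported as primitive hints in a function signature; a class hint such as ^Integer only helps the compiler avoid reflection on interop calls and does not unbox anything.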
Avoid apply if you want performance.
Avoid varargs (a common reason to need apply) if you want performance. Varargs functions create garbage on every invocation (in order to construct the args sequence, which usually is not used outside the function body). partial always constructs a varargs function. Consider replacing the varargs (partial * x) with #(* x %); the latter can be optimized much more aggressively.
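Applied to the add-pos example from the question, this advice might look like the sketch below. add-pos-partial mirrors the question's definition; add-pos is a hypothetical fixed-arity replacement, and the concrete value of pos is made up for the example.

```clojure
(def pos [3 4])

;; Varargs route, as in the question: partial plus mapv.
(def add-pos-partial (partial mapv + pos))

;; Fixed-arity alternative: destructure both points and add component-wise.
(defn add-pos [[x1 y1] [x2 y2]]
  [(+ x1 x2) (+ y1 y2)])

(add-pos-partial [1 0])   ; => [4 4]
(add-pos pos [1 0])       ; => [4 4]

;; The same substitution for a single-argument helper:
(map (partial + 0.5) [1 2])   ; varargs version
(map #(+ 0.5 %) [1 2])        ; function-literal version, same result
```

Whether the difference is measurable in this code path is best checked with a benchmark rather than assumed.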
There is a tradeoff with using primitive JVM single-type arrays (they are mutable and fixed in length, which can lead to more complex and brittle code), but they will perform better than the standard Clojure sequential types, and are available if all else fails to get the performance you need.
Also, use criterium to compare various implementations of your code, it has a bunch of tricks to help rule out the random things that affect execution time so you can see what really performs best in a tight loop.
Also, regarding your representation of a point as [x y]: you can reduce the space and lookup overhead of the collection holding them with (defrecord point [x y]) (as long as you know they will remain two elements only, and you don't mind changing your code to ask for (:x point) or (:y point)). You could further optimize by making or using a simple two-number Java class (with the tradeoff of losing immutability).
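A sketch of the record idea, assuming the coordinates are integers. defrecord accepts ^long and ^double hints on its fields, which stores them unboxed; the names Point and add-points are invented for this example.

```clojure
;; Hypothetical record with primitive long fields.
(defrecord Point [^long x ^long y])

(defn add-points
  "Component-wise sum of two Points."
  [^Point a ^Point b]
  ;; Direct field access on a hinted record reads the primitive fields.
  (Point. (+ (.-x a) (.-x b))
          (+ (.-y a) (.-y b))))

(def p (add-points (->Point 3 4) (->Point 1 0)))
(:x p) ; => 4
```

Records still compare by value and work as map keys, so they can stand in for the [x y] vectors in sets and priority maps, although keyword lookups such as (:x p) return boxed values.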

How to write parallel-map using Places?

I would like to have a parallel-map function implemented in Racket. Places seem like the right thing to build off of, but they're uncharted territory for me. I'm thinking the code should look something like shown below.
#lang racket
; return xs split into n sublists
(define (chunk-into n xs)
  (define N (length xs))
  (cond [(= 1 n) (list xs)]
        [(> n N)
         (cons empty
               (chunk-into (sub1 n) xs))]
        [else
         (define m (ceiling (/ N n)))
         (cons (take xs m)
               (chunk-into (sub1 n) (drop xs m)))]))
(module+ test
  (require rackunit) ; provides check-equal?
  (check-equal? (length (chunk-into 4 (range 5))) 4)
  (check-equal? (length (chunk-into 2 (range 5))) 2))
(define (parallel-map f xs)
  (define n-cores (processor-count))
  (define xs* (chunk-into n-cores xs))
  (define ps
    (for/list ([i n-cores])
      (place ch
             (place-channel-put
              ch
              (map f
                   (place-channel-get ch))))))
  (apply append (map place-channel-put ps xs*)))
This gives the error:
f: identifier used out of context in: f
All of the examples I've seen show a design pattern of providing a main function with no arguments which somehow gets used to instantiate additional places, but that's really cumbersome to use, so I'm actively trying to avoid it. Is this possible?
Note: I also tried to make a parallel-map using futures. Unfortunately, for all my tests it was actually slower than map (I tried testing using a recursive process version of fib), but here it is in case you have any suggestions for making it faster.
(define (parallel-map f xs)
  (define xs** (chunk-into (processor-count) xs))
  (define fs (map (λ (xs*) (future (thunk (map f xs*)))) xs**))
  (apply append (map touch fs)))

I have used places before but never had to pass a function as a parameter to a place. I was able to come up with the following, rather crufty code, which uses eval:
#!/usr/bin/env racket
#lang racket
(define (worker pch)
  (define my-id (place-channel-get pch)) ; get worker id
  (define wch-w (place-channel-get pch)) ; get work channel (shared between controller and all workers) - worker side
  (define f (place-channel-get pch)) ; get function
  (define ns (make-base-namespace)) ; for eval
  (let loop ()
    (define n (place-channel-get wch-w)) ; get work order
    (let ((res (eval `(,f ,n) ns))) ; need to use eval here !!
      (eprintf "~a says ~a\n" my-id res)
      (place-channel-put wch-w res) ; put response
      (loop)))) ; loop forever
(define (parallel-map f xs)
  (define l (length xs))
  (define-values (wch-c wch-w) (place-channel)) ; create channel (2 endpoints) for work dispatch (a.k.a. shared queue)
  (for ((i (in-range (processor-count))))
    (define p (place pch (worker pch))) ; create place
    (place-channel-put p (format "worker_~a" i)) ; give worker id
    (place-channel-put p wch-w) ; give response channel
    (place-channel-put p f)) ; give function
  (for ((n xs))
    (place-channel-put wch-c n)) ; create work orders
  (let loop ((i 0) (res '())) ; response loop
    (if (= i l)
        (reverse res)
        (let ((response (sync/timeout 10 wch-c))) ; get answer with timeout (place-channel-get blocks!)
          (loop
           (+ i 1)
           (if response (cons response res) res))))))
(module+ main
  (displayln (parallel-map 'add1 (range 10))))
Running in a console gives, for example:
worker_1 says 1
worker_1 says 3
worker_1 says 4
worker_1 says 5
worker_1 says 6
worker_1 says 7
worker_1 says 8
worker_1 says 9
worker_1 says 10
worker_0 says 2
(1 3 4 5 6 7 8 9 10 2)
As I said, crufty. All suggestions are welcome!
