Random Walk in Clojure

I have written the following piece of code for a random walk, which draws random values from {-1,1}.
(defn notahappyfoo [n]
(reverse (butlast (butlast (reverse (interleave (take n (iterate rand (- 0 1)))(take n (iterate rand 1))))))))
However, the code fails to generate a satisfactory walk. The main problem stems from the function rand. Its lower bound is 0, which forced the awkward code I wrote. Namely, interleave ends up causing wild shifts in the walk, because the values are forced to swing from positive to negative. It will be hard to get any sense of a continuous path from this code.
I believe there should be an elegant form in Clojure to construct this walk. But I am not able to piece the right functions together to generate such a walk. The goals of the function I am looking to construct consist of lower and upper bounds for the random number. In the code above I have forced the interval -1 to 1. It would be nice to generalize this to -a and a. Moreover, how do I form a collection of random reals (floating points) between -a and a that has some notion of continuity?

You need a random function that takes a range
(defn myrand [a b]
(+ a (rand (- b a))))
You can then create a sequence
(def s (repeatedly #(myrand -1 1)))
Finally, you can use reductions to turn the steps into a sample walk (the running sums of the steps):
(take 10 s)
(reductions + (take 10 s))
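For readers who find it easier to follow in Python, the same three steps (a random function over a range, a stream of steps, and a running sum) can be sketched as below. The names `ranged_rand` and `random_walk` are made up for this illustration, and `itertools.accumulate` stands in for `reductions`; it is a sketch of the idea, not a translation of any particular library.

```python
import itertools
import random


def ranged_rand(a, b, rng=random):
    """Return a float in [a, b), mirroring (+ a (rand (- b a)))."""
    return a + rng.random() * (b - a)


def random_walk(n, a=1.0, rng=random):
    """Return the first n positions of a walk whose steps lie in [-a, a)."""
    # Hypothetical helper: each step is drawn independently from [-a, a).
    steps = (ranged_rand(-a, a, rng) for _ in range(n))
    # accumulate yields running sums, like Clojure's reductions with +.
    return list(itertools.accumulate(steps))


if __name__ == "__main__":
    print(random_walk(10))
```

Because each position differs from the previous one by at most `a`, the path has the "continuity" the question asks for: no step can jump further than the chosen bound.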

Related

Can I make this Clojure code (scoring a graph bisection) more efficient?

My code is spending most of its time scoring bisections: determining how many edges of a graph cross from one set of nodes to the other.
Assume bisect is a set of half of a graph's nodes (ints), and edges is a list of (directed) edges [ [n1 n2] ...] where n1,n2 are also nodes.
(defn tstBisectScore
"number of edges crossing bisect"
([bisect edges]
(tstBisectScore bisect 0 edges))
([bisect nx edge2check]
(if (empty? edge2check)
nx
(let [[n1 n2] (first edge2check)
inb1 (contains? bisect n1)
inb2 (contains? bisect n2)]
(if (or (and inb1 inb2)
(and (not inb1) (not inb2)))
(recur bisect nx (rest edge2check))
(recur bisect (inc nx) (rest edge2check))))
)))
The only clue I have, from sampling the execution of this code with VisualVM, is that most of the time is spent in clojure.core$empty_QMARK_, and most of the rest in clojure.core$contains_QMARK_. (first and rest take only a small fraction of the time.)
Any suggestions as to how I could tighten the code?
First I would say that you haven't expanded that profile deep enough. empty? is not an expensive function in general. The reason it is taking up all your time is almost surely because the input to your function is a lazy sequence, and empty? is the poor sap whose job it is to look at its elements first. So all the time in empty? is probably actually time you should be accounting to whatever generates the input sequence. You could confirm this by profiling (tstBisectScore bisect (doall edges)) and comparing to your existing profile of (tstBisectScore bisect edges).
Assuming that my hypothesis is true, almost 80% of your real workload is probably in generating the bisects, not in scoring them. So anything we do in this function can get us at most a 20% speedup, even if we replaced the whole thing with (map (constantly 0) edges).
Still, there are many local improvements to be made. Let's imagine we've determined that producing the input argument is as efficient as we can get it, and we need more speed.
When iterating eagerly over something, use next instead of rest. The point of rest is that it's a bit lazier, and always returns a non-nil sequence instead of peeking to see if there is a next element. If you know you will need the next element anyway, use next to get both bits of information at once.
In general, empty? is not an efficient way to test a sequence. (defn empty? [x] (not (seq x))) is obviously a wasted not. If you care about efficiency, write (seq x) instead, and invert your if branches. Better still, if you know x is the result of a next call, it can never be an empty sequence: only nil, or a non-empty sequence. So just write (if x ...).
(or (and inb1 inb2)
(and (not inb1) (not inb2)))
is a very expensive way to write (= inb1 inb2).
So for starters, you could instead write
(defn tstBisectScore
([bisect edges] (tstBisectScore bisect 0 (seq edges)))
([bisect nx edges]
(if edges
(recur bisect (let [[n1 n2] (first edges)
inb1 (contains? bisect n1)
inb2 (contains? bisect n2)]
(if (= inb1 inb2) nx (inc nx)))
(next edges))
nx)))
Note that I've also rearranged things a bit, by putting the if and let inside of the recur instead of duplicating the other arguments to the recur. This isn't a very popular style, and it doesn't matter to efficiency. Here it serves a pedagogical purpose: to draw your attention to the basic structure of this function that you missed. Your whole function has the structure (if xs (recur (f acc x) (next xs))). This is exactly what reduce already does!
I could write out the translation to use reduce, but first I'll also point out that you also have a map step hidden in there, mapping some elements to 1 and some to 0, and then your reduce phase is just summing the list. So, instead of using lazy sequences to do that, we'll use a transducer, and avoid allocating the intermediate sequences:
(defn tstBisectScore [bisect edges]
  ;; map each edge to 0 (same side) or 1 (crossing), then sum with +
  (transduce (map (fn [[n1 n2]]
                    (if (= (contains? bisect n1)
                           (contains? bisect n2))
                      0, 1)))
             + 0 edges))
This is a lot less code because you let existing abstractions do the work for you, and it should be more efficient because (a) these abstractions don't make the local mistakes you did, and (b) they also handle chunked sequences more efficiently, which is a sizeable boost that comes up surprisingly often when using basic tools like map, range, and filter.
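The "map each edge to 0 or 1, then sum" shape described above is not specific to Clojure. As a rough illustration, here is the same idea in Python; the name `bisect_score` is invented for this sketch, and the generator expression plays the role of the transducer in that no intermediate list is built.

```python
def bisect_score(bisect, edges):
    """Count edges whose endpoints fall on opposite sides of the bisection."""
    # (n1 in bisect) != (n2 in bisect) is True exactly when the edge crosses;
    # sum() counts True as 1, so this is the map step and the + step in one.
    return sum((n1 in bisect) != (n2 in bisect) for n1, n2 in edges)


if __name__ == "__main__":
    side = {1, 2}
    edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
    print(bisect_score(side, edges))  # edges (2, 3) and (4, 1) cross
```

The comparison `!=` on two booleans is the counterpart of replacing the or/and expression with `(= inb1 inb2)` and swapping the branches.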
This answer builds on amalloy's answer and shows some additional ways to speed up this code:
Use Java arrays:
Convert edges with (into-array (map into-array edges)). This allows you to use operations like aget, aset and especially areduce.
Use Java functions
In the following code, I replaced = with .equals and contains? with .contains.
Use type hints
Using these tips, I rewrote your function like this:
(import '(java.util HashSet Collection)) ; needed for the type hints below
(defn tst-bisect-score [^HashSet bisect
^"[[Ljava.lang.Long;" edges]
(areduce edges
i
ret
(long 0)
(+ ret
(let [^"[Ljava.lang.Long;" e (aget edges i)]
(if (.equals ^Boolean
(.contains ^HashSet bisect
^Long (aget e 0))
^Boolean
(.contains ^HashSet bisect
^Long (aget e 1)))
0 1)))))
Convert your arguments in advance with (HashSet. ^Collection bisect) and (into-array (map into-array edges)) and then call:
(tst-bisect-score bisect edges)

SICP solution to Fibonacci, set `a + b = a`, why not `a + b = b`?

I am reading the Tree Recursion section of SICP, where fib is also computed by a linear iteration.
We can also formulate an iterative process for computing the
Fibonacci numbers. The idea is to use a pair of integers a and b,
initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the
simultaneous transformations
a ← a + b
b ← a
It is not hard to show that, after applying this transformation n
times, a and b will be equal, respectively, to Fib(n + 1) and Fib(n).
Thus, we can compute Fibonacci numbers iteratively using the procedure
(rewritten in Emacs Lisp instead of Scheme)
#+begin_src emacs-lisp :session sicp
(defun fib-iter (a b count)
(if (= count 0)
b
(fib-iter (+ a b) a (- count 1))))
(defun fib (n)
(fib-iter 1 0 n))
(fib 4)
#+end_src
"Set a + b = a and b = a", it's hard to wrap my mind around it.
The general idea to find a fib is simple:
Suppose we have a completed table of Fibonacci numbers; we find Fib(X) by stepping through the table one entry at a time from 0 to X.
The book's solution is barely intuitive to me.
It seems more reasonable to set a + b = b, a = b:
(defun fib-iter (a b count)
(if (= count 0)
a
(fib-iter b (+ a b) (- count 1))
)
)
(defun fib(n)
(fib-iter 0 1 n))
So the authors' choice seems to be nothing more than counterintuitively placing b at the head, with no special purpose.
However, I surely acknowledge that SICP deserves digging deeper and deeper.
What key points am I missing? Why set a + b = a rather than a + b = b?
As far as I can see, your problem is that you don't like that the order of the arguments to fib-iter is not what you think it should be. The answer is that the order of arguments to functions is very often simply arbitrary and/or conventional: it's a choice made by the person writing the function. It does not matter to anyone but the person reading or writing the code: it's a stylistic choice. It doesn't particularly seem more intuitive to me to have fib defined as
(define (fib n)
(fib-iter 1 0 n))
(define (fib-iter next current n)
(if (zero? n)
current
(fib-iter (+ next current) next (- n 1))))
Rather than
(define (fib n)
(fib-iter 0 1 n))
(define (fib-iter current next n)
(if (zero? n)
current
(fib-iter next (+ next current) (- n 1))))
There are instances where this isn't true. For instance, Standard Lisp (warning, PDF link) defined mapcar so that the list being mapped over was the first argument and the function being mapped was the second. This means you can't extend it in the way it has been extended in more recent dialects, where it takes any positive number of lists and applies the function to the corresponding elements of all the lists.
Similarly, I think it would be extremely unintuitive to define the arguments of - or / the other way around.
But in many, many cases it's just a matter of making a choice and sticking to it.
The recurrence is given in an imperative form. For instance, in Common Lisp, we could use parallel assignment in the body of a loop:
(psetf a (+ a b)
b a)
To reduce confusion, we should think about this functionally and give the old and new variables different names:
a = a' + b'
b = a'
This is no longer an assignment but a pair of equalities; we are justified in using the ordinary "=" operator of mathematics instead of the assignment arrow.
The linear recursion does this implicitly, because it avoids assignment. The value of the expression (+ a b) is passed as the parameter a. But that's a fresh instance of a in new scope which uses the same name, not an assignment; the binding just induces the two to be equivalent.
We can see it also like this with the help of a "Fibonacci slide rule":
1   1   2   3   5   8   13
----------------------------- <-- sliding interface
            b'  a'
                b   a
As we calculate the sequence, there is a two-number window whose entries we are calling a and b, which slides along the sequence. You can read the equalities at any position directly off the slide rule: look, b = a' = 5 and a = b' + a' = 8.
You may be confused by a referring to the higher position in the sequence. You might be thinking of this labeling:
1   1   2   3   5   8   13
------------------------
            a'  b'
                a   b
Indeed, under this naming arrangement, now we have b = a' + b', as you expect, and a = b'.
It's just a matter of which variable is designated as the leading one farther along the sequence, and which is the trailing one.
The "a is leading" convention comes from the idea that a is before b in the alphabet, and so it receives the newer "updates" from the sequence first, which then pass off to b.
This may seem counterintuitive, but such a pattern appears elsewhere in mathematics, such as convolution of functions.
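To see that the two labelings really are the same computation, here is a small Python sketch. Python's tuple assignment evaluates the whole right-hand side before rebinding, so it behaves like the simultaneous transformation (or `psetf`) discussed above. The function names `fib_a_leads` and `fib_b_leads` are invented for this illustration.

```python
def fib_a_leads(n):
    """SICP's convention: a is the leading value, b trails and is returned."""
    a, b = 1, 0
    for _ in range(n):
        # Both new values are computed from the old a and b, then rebound.
        a, b = a + b, a
    return b


def fib_b_leads(n):
    """The asker's convention: b leads, a trails and is returned."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    print([fib_a_leads(n) for n in range(8)])
    print([fib_b_leads(n) for n in range(8)])
```

Both functions slide the same two-number window along the sequence; only the names of the leading and trailing entries are swapped.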

Racket - creating a water density function with certain restrictions

I am attempting to solve the following problem:
Lately, Finn has been very curious about buckets of ice water and their properties. He has been reviewing the density of water and ice. It turns out the density of water in both states depends on many factors, including the temperature, atmospheric pressure, and the purity of the water.
As an approximation, Finn has written the following function to determine the density of the water (or ice) in kg/m3 as a function of temperature t in Celsius (−273.15 ≤ t ≤ 100):
water-density(t) = ( 999.97 if t ≥ 0 ;
916.7 if t < 0 )
Write a function water-density that consumes an integer temperature t and produces either 999.97 or 916.7, depending on the value of t. However, you may only use the features of Racket given up to the end of Module 1.
You may use define and mathematical functions, but not cond, if, lists, recursion, Booleans, or other things we’ll get to later in the course. Specifically, you may use any of the functions in section 1.5 of this page: http://docs.racket-lang.org/htdp-langs/beginner.html except for the following functions, which are not allowed: sgn, floor, ceiling, round.
This is what I have so far:
(define (water-density t)
(+ (* (/ (min t 0) (min t -0.000001)) -83.27) 999.97))
This code works as long as the given temperature is not between -0.000001 and 0, but it gives wrong results for temperatures in that range. What can I do to avoid this problem? Dividing by zero is the biggest problem I have here.
This is a somewhat.... interesting way of going about teaching programming, and I have a feeling this class is going to cause more StackOverflow questions to appear in the future, but you can do it by combining max and min to make a function that returns either 1 or 0 depending on whether its input is negative:
(define (negative->boolint n)
(- 0
(min 0
(max (inexact->exact (floor n))
-1))))
This function takes a number, rounds it down with (inexact->exact (floor n)), then uses the combination of max and min to "bound" the number to be between -1 and 0, then subtracts that result from 0. Since after conversion to an integer the number can never be strictly between -1 and 0, the bounding just results in 0 for zero and positives and -1 for negatives. The subtraction means the function returns (- 0 0), which is 0, for zero and all positive numbers, and (- 0 -1), which is 1, for all negative numbers. By combining the result of this function with some arithmetic, you can get the behavior you want:
(define (water-density t)
(- 999.97
(* 83.27
(negative->boolint t))))
If t is positive or zero, then the result of (* 83.27 (negative->boolint t)) will just be zero. Otherwise, the difference of the two densities will be subtracted, giving you the correct result.
This works because it's just taking advantage of max and min's built-in conditional functionality to do conditional arithmetic. You could probably achieve the same with some level of hackery for round or abs or other statements that have conditional logic.
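As a cross-check of the clamping trick, here is a Python sketch of the same arithmetic. It leans on an assumption taken from the problem statement, namely that t is an integer, so the rounding step is not needed at all; for non-integer input this sketch would return fractional values. The snake_case names simply mirror the Racket ones.

```python
def negative_to_boolint(n):
    """Return 1 for negative integers and 0 otherwise, using only min/max."""
    # Assumes n is an integer: max(n, -1) maps every negative integer to -1,
    # min(0, ...) maps every non-negative integer to 0, and negation flips
    # the result to 1 or 0.
    return -min(0, max(n, -1))


def water_density(t):
    """Approximate density in kg/m^3 for an integer temperature t in Celsius."""
    # 83.27 is the difference 999.97 - 916.7 between the two densities.
    return 999.97 - 83.27 * negative_to_boolint(t)


if __name__ == "__main__":
    for t in (-40, -1, 0, 25):
        print(t, water_density(t))
```

Note that the negative branch is computed in floating point, so it may differ from 916.7 in the last digit.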
EDIT
My apologies, I missed the part of your question about not being able to use the rounding functions. What you want is still doable, however, by using two base functions for simulating conditionals: abs and expt. Getting conditionals from abs is fairly straightforward: you can divide a number by its absolute value to get its sign. The reason you need expt is that it lets you get around the division by zero issue with abs, because (expt 0 x) is 0 for all positive numbers, 1 for zero, and undefined for negative numbers. We can use this to make a zero->boolint function:
(define (zero->boolint x)
(expt 0 (abs x)))
With this, we can add its result to the numerator and denominator to get around division by zero in (/ x (abs x)). Since this causes the division by zero case to return 1, we now have a nonnegative->boolint function:
(define (nonnegative->boolint x)
(/ (+ 1
(/ (+ (zero->boolint x) x)
(+ (zero->boolint x) (abs x))))
2))
The inner division takes care of dividing a number by its absolute value to return -1 for negatives and 1 for positives and zero. The outer addition by 1 and then division by 2 turns this into 0 for negatives and 1 for positives and zero. In order to get a negative->boolint function, we just need some sort of not operation - which in the case of 1 for true and 0 for false is just subtracting the value from 1. So we can define negative->boolint based on only the conditional logic of abs and expt as:
(define (negative->boolint x)
(- 1 (nonnegative->boolint x)))
This works as expected with the definition of water-density. Also, please don't ever do this in real world code. No matter how "clever" it may seem at the time.
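The abs/expt construction can be checked step by step with a Python sketch, since Python's `**` also gives 1 for `0 ** 0` and 0 for zero raised to a positive power. The snake_case names mirror the Racket definitions; the results come out as floats because of the division.

```python
def zero_to_boolint(x):
    """1 when x is zero, 0 for any other number (0 ** 0 is 1, 0 ** positive is 0)."""
    return 0 ** abs(x)


def nonnegative_to_boolint(x):
    """1 for zero and positives, 0 for negatives."""
    z = zero_to_boolint(x)
    # Adding z to numerator and denominator turns 0/0 into 1/1 and leaves
    # every other x / |x| untouched, so sign is -1 for negatives, else 1.
    sign = (x + z) / (abs(x) + z)
    return (1 + sign) / 2


def negative_to_boolint(x):
    """Logical 'not' on a 0/1 value is just subtraction from 1."""
    return 1 - nonnegative_to_boolint(x)


def water_density(t):
    return 999.97 - 83.27 * negative_to_boolint(t)


if __name__ == "__main__":
    for t in (-5, 0, 5):
        print(t, negative_to_boolint(t), water_density(t))
```

Unlike the min/max version, this one does not depend on the input being an integer.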

calculate polynomial function [scheme/racket]

I am trying to write a tail-recursive function poly that will compute the value of a polynomial given a value and the list of coefficients. As in, if coeff is a list of coefficients (a0, a1, a2, ..., an), then (poly x coeff) should compute the value a0 + a1*x + a2*x^2 + a3*x^3 + ... + an*x^n.
The function is also expected to run in linear time (O(n)).
My thought is to create a helper function with an extra parameter (acc) that keeps track of where you are in the list, so you know what power to raise x to, but I can't think of how to do that.
There is no need for a helper function to track where you are on the list, since you will only need to move forward one list element at a time till you reach the end. Here's a possible skeleton
(define (poly x coeff)
(let loop ((power 0) (total 0) (clist coeff))
(cond
((null? clist) ???)
(else (loop (+ 1 power) (??????) (cdr clist))))))
That's most of it done for you. All you really have to do is work out how you should do the calculating of exponents and addition. There are two basic options and I know which one I would choose (fewer cpu cycles).
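The answer leaves the blanks for the reader, and does not say which two options it has in mind. One plausible reading, sketched here in Python rather than Scheme, is to carry the current power of x itself in the loop instead of an exponent, so each step costs one multiplication and one addition. The variable name `x_power` is an assumption of this sketch; the loop variables play the same role as the accumulators of a named let.

```python
def poly(x, coeff):
    """Evaluate a0 + a1*x + ... + an*x**n for coeff = [a0, a1, ..., an]."""
    total = 0
    x_power = 1  # x**0, the power that goes with the first coefficient
    for c in coeff:
        total += c * x_power
        x_power *= x  # step to the next power without recomputing x**k
    return total


if __name__ == "__main__":
    print(poly(2, [1, 2, 3]))  # 1 + 2*2 + 3*4
```

Each coefficient is visited once, so the whole evaluation is O(n) in the length of the list.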

Finding keys closest to a given value for clojure sorted-maps

For clojure's sorted-map, how do I find the entry having the key closest to a given value?
e.g. Suppose I have
(def my-map (sorted-map
1 A
2 B
5 C))
I would like a function like
(find-closest my-map 4)
which would return (5,C), since that's the entry with the closest key. I could do a linear search, but since the map is sorted, there should be a way of finding this value in something like O(log n).
I can't find anything in the API which makes this possible. If, for instance, I could ask for the i'th entry in the map, I could cobble together a function like the one I want, but I can't find any such function.
Edit:
So apparently sorted-map is based on a PersistentTreeMap class implemented in Java, which is a red-black tree. So this really seems like it should be doable, at least in principle.
subseq and rsubseq are very fast because they exploit the tree structure:
(def m (sorted-map 1 :a, 2 :b, 5 :c))
(defn abs [x] (if (neg? x) (- x) x))
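(defn find-closest [sm k]
  ;; a: greatest key <= k, b: smallest key >= k (either may be nil)
  (let [a (some-> (rsubseq sm <= k) first key)
        b (some-> (subseq sm >= k) first key)]
    (cond
      (nil? a) b                           ; nothing at or below k
      (nil? b) a                           ; nothing at or above k
      (< (abs (- k b)) (abs (- k a))) b    ; b is strictly closer
      :else a)))                           ; ties and exact matches go to a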
user=> (find-closest m 4)
5
user=> (find-closest m 3)
2
This does slightly more work than ideal. In the ideal scenario we would just do a <= search and then look at the node to the right to check whether there is anything closer in that direction. You can access the tree with (.tree m), but the .left and .right methods aren't public, so custom traversal is not currently possible.
Use the Clojure contrib library data.avl. It supports sorted-maps with a nearest function and other useful features.
https://github.com/clojure/data.avl
The first thing that comes to my mind is to pull the map's keys out into a vector and then to do a binary search in that. If there is no exact match to your key, the two pointers involved in a binary search will end up pointing to the two elements on either side of it, and you can then choose the closer one in a single (possibly tie breaking) operation.
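The binary-search idea in the paragraph above can be sketched in Python with the standard `bisect` module. This assumes the keys have already been copied out into an ascending list, which is an O(n) step done once; each lookup afterwards is O(log n). The function name `find_closest` simply mirrors the one in the question.

```python
from bisect import bisect_left


def find_closest(sorted_keys, k):
    """Return the key in the ascending list sorted_keys nearest to k, or None if empty."""
    if not sorted_keys:
        return None
    i = bisect_left(sorted_keys, k)  # first index whose key is >= k
    # The neighbours on either side of the insertion point are the only
    # candidates; the slice holds one or two keys depending on the edges.
    candidates = sorted_keys[max(i - 1, 0):i + 1]
    # min keeps the first minimum, so a tie goes to the smaller key.
    return min(candidates, key=lambda key: abs(key - k))


if __name__ == "__main__":
    keys = [1, 2, 5]
    print(find_closest(keys, 4), find_closest(keys, 3))
```

To return the whole entry rather than the key, look the chosen key up in the original map afterwards.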
