I have been trying to use the principal-components function from Incanter to do PCA and seem to be off track in using it. I found some sample data online from a PCA tutorial and wanted to practice on it:
(def data [[0.69 0.49] [-1.31 -1.21] [0.39 0.99] [0.09 0.29] [1.29 1.09]
[0.49 0.79] [0.19 (- 0 0.31)] [(- 0 0.81) (- 0 0.81)]
[(- 0 0.31) (- 0 0.31)] [(- 0 0.71) (- 0 1.01)]])
Upon first attempt to implement PCA I tried passing vectors to Incanter's matrix function, but found myself passing it too many arguments. At this point I decided to try a nested vector structure as defined above, but would like to avoid this route.
How would I turn data into an Incanter matrix that will be accepted as input to Incanter's principal-components function? For simplicity, let's call the new matrix fooMatrix.
Once this matrix, fooMatrix, has been constructed, the following code should work to extract the first two principal components:
(def pca (principal-components fooMatrix))
(def components (:rotation pca))
(def pc1 (sel components :cols 0))
(def pc2 (sel components :cols 1))
and then the data can be projected on the principal components by
(def principal1 (mmult fooMatrix pc1))
(def principal2 (mmult fooMatrix pc2))
Check out the Incanter API. I believe you just want (incanter.core/matrix data). These are your options for Incanter's matrix function. Maybe A2 is what you're interested in.
(def A (matrix [[1 2 3] [4 5 6] [7 8 9]])) ; produces a 3x3 matrix
(def A2 (matrix [1 2 3 4 5 6 7 8 9] 3)) ; produces the same 3x3 matrix
(def B (matrix [1 2 3 4 5 6 7 8 9])) ; produces a 9x1 column vector
Example using your data:
user=> (use '[incanter core stats charts datasets])
nil
user=> (def data [0.69 0.49 -1.31 -1.21 0.39 0.99 0.09 0.29 1.29
1.09 0.49 0.79 0.19 (- 0 0.31) (- 0 0.81) (- 0 0.81)
(- 0 0.31) (- 0 0.31) (- 0 0.71) (- 0 1.01)])
user=> (def fooMatrix (matrix data 2))
user=> (principal-components fooMatrix)
{:std-dev (1.3877785387777999 0.27215937850413047), :rotation A 2x2 matrix
-------------
-7.07e-01 -7.07e-01
-7.07e-01 7.07e-01
}
Voilà. Nested vector structure gone.
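The flat vector does not have to be retyped by hand. As a minimal sketch in plain Clojure (no Incanter calls; the names nested, flat and regrouped are made up for this illustration, and only the first three rows of the data are used), apply concat flattens the nested form, and partition shows the row-by-row grouping that a column count of 2 implies:

```clojure
;; First three rows of the sample data, in the original nested form.
(def nested [[0.69 0.49] [-1.31 -1.21] [0.39 0.99]])

;; Flatten one level: each row's values are appended in order.
(def flat (vec (apply concat nested)))

;; Regroup two values per row, mirroring what a column count of 2 means.
(def regrouped (mapv vec (partition 2 flat)))

(println flat)
(println regrouped)
```

A vector like flat is the kind of input that (matrix data 2) receives in the session above.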
Related
How do I change an element in a matrix? According to the Incanter documentation, the library is built on top of Clatrix. With Clatrix, an element is set with the command (set A 1 2 0). How do I set an element in Incanter? Thank you.
(ns cljsl.optimization
(:require [incanter.core :as i]
[incanter.stats :as s]))
;; create a matrix
cljsl.examples=> (def A (i/matrix [[0 1 2] [3 4 5]]))
cljsl.examples=> A
A 2x3 matrix
-------------
0.00e+00 1.00e+00 2.00e+00
3.00e+00 4.00e+00 5.00e+00
;; view an item
cljsl.examples=> (i/$ 0 0 A)
0.0
;; an element can be set with Clatrix; unfortunately, this does not work for me from Incanter.
cljsl.examples=> (cl/set A 1 2 0)
(require '[clojure.core.matrix :as m])
(m/mset! A 0 0 -1)
Thanks for the help. After reviewing the books Clojure for Machine Learning and Clojure for Data Science, I found a procedure that fixes the error.
Add the following dependency to the project.clj file:
[clatrix "0.5.0"]
The namespace declaration
(ns cljsl.optimization
(:require [clatrix.core :as cl]
[incanter.core :as i]
[incanter.stats :as s]))
Testing
cljsl.optimization=> (def A (i/matrix [[0 1 2] [3 4 5]]))
#'cljsl.optimization/A
cljsl.optimization=> A
A 2x3 matrix
-------------
0.00e+00 1.00e+00 2.00e+00
3.00e+00 4.00e+00 5.00e+00
cljsl.optimization=> (cl/set A 1 2 0)
#object[org.jblas.DoubleMatrix 0x1c951881 "[0.000000, 1.000000, 2.000000; 3.000000, 4.000000, 0.000000]"]
cljsl.optimization=> A
A 2x3 matrix
-------------
0.00e+00 1.00e+00 2.00e+00
3.00e+00 4.00e+00 0.00e+00
I am new to Scheme and I'm having problems with matrices. I need to create a function that takes one big and one small square matrix (with the condition that the small matrix's length is a divisor of the big one's) and builds a new matrix by applying an operation to the big one using the small one. I've successfully split the big matrix into blocks of the size I wanted, and I'm successfully operating on them to get the result.
Here is how I did it:
(define (matrix-op big small x y)
(if (< y (/ (length big) (length small)))
(if (< x (/ (length big) (length small)))
(cons (calculate (split-y (split-x big small x) small y) small)
(matrix-op big small (+ x 1) y))
(matrix-op big small 0 (+ y 1)) ; <- this is where i need to split
)
'()
)
)
My calculate function returns a single atomic value, so when I run the function like this it gives me output like '(val val val val), but what I want is output formatted like '((val val) (val val)). How can I do it? Thanks in advance.
I realized that I hadn't explained the problem properly. What I want is a function that takes two square matrices, one big and one small, splits the big one into blocks the same size as the small one, and operates on them to create a new matrix of size m/n, where the big one is mxm and the small one is nxn. Example:
big '( small '(
(8 0 3 1 5 3 2 2) (8 4)
(7 1 1 4 3 7 1 4) (9 5)
(1 3 7 4 3 6 6 3) )
(0 9 8 6 5 6 4 3)
(1 7 6 9 6 6 7 2)
(5 7 1 0 2 9 5 3)
(0 5 4 6 6 6 3 0)
(3 6 2 7 7 5 7 0)
)
I need to split big over the same size as small and calculate results like:
for x=0 y=0 part is '( calculate result is 5
(8 0)
(7 1)
)
for x=1 y=0 part is '( calculate result is 2
(3 1)
(1 4)
)
I did get the calculated results back, but with the method I gave above the return value looked like '(5 2 4 2 2 6 4 4 4 3 5 4 2 4 6 3), whereas I wanted it returned as:
'(
(5 2 4 2)
(2 6 4 4)
(4 3 5 4)
(2 4 6 3)
)
So how can I split the returned list where I want it split?
I think you are trying to do too much at once. It is always OK to split a bigger problem into a smaller problem.
If I understand your problem, the idea is to take two square matrices, where the dimension of one may be some multiple of the other's, and perform a pairwise operation on the elements. For example:
'((1 2 3) '((1 2 3) '((7 7 7) '(( 8 9 10)
(4 5 6) + '((7)) --> (4 5 6) + (7 7 7) --> (11 12 13)
(7 8 9)) (7 8 9)) (7 7 7)) (14 15 16))
I will continue with the assumption that this is what is desired.
Notice that if the two matrices were the same size, a simple nested map would easily combine all elements. What is left is the problem of the different sizes.
Solve that and you are golden.
Recap:
(define (f op small-M big-M)
(f-apply-pairwise-op
op
(f-biggify small-M (/ (length big-M) (length small-M)))
big-M))
Now you have broken the problem into two smaller pieces:
(define (f-apply-pairwise-op op A B) ...) ; produces pairwise 'A op B'
(define (f-biggify M n) ...) ; tile M n times wider and taller
Good luck!
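For readers who want to see the two pieces filled in, here is one possible sketch. It is written in Clojure rather than Scheme, it reuses the names f, f-biggify and f-apply-pairwise-op from the outline above, and it assumes that the big matrix's size is an exact multiple of the small one's. It is an illustration of the decomposition, not a definitive solution.

```clojure
(defn f-biggify
  "Tile matrix m so that it becomes n times wider and n times taller."
  [m n]
  (let [widened (map #(apply concat (repeat n %)) m)] ; repeat each row n times sideways
    (apply concat (repeat n widened))))               ; then stack n copies of those rows

(defn f-apply-pairwise-op
  "Combine two equally sized matrices a and b element by element with op."
  [op a b]
  (map (fn [row-a row-b] (map op row-a row-b)) a b))

(defn f
  "Tile small-m up to the size of big-m, then combine the two with op."
  [op small-m big-m]
  (f-apply-pairwise-op op
                       (f-biggify small-m (/ (count big-m) (count small-m)))
                       big-m))

(println (f + [[7]] [[1 2 3] [4 5 6] [7 8 9]]))
;; prints ((8 9 10) (11 12 13) (14 15 16))
```

The nested map in f-apply-pairwise-op is the "simple nested map" mentioned above; all the size handling lives in f-biggify.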
I'm writing a Monty Hall simulator, and found the need to generate a number within a range, excluding a single number.
This seemed easy, so I naively wrote up:
(The g/... functions are part of my personal library. Their use should be fairly clear):
(defn random-int-excluding
"Generates a random number between min-n and max-n; excluding excluding-n.
min-n is inclusive, while max-n is exclusive."
[min-n max-n excluding-n rand-gen]
(let [rand-n (g/random-int min-n max-n rand-gen)
rand-n' (if (= rand-n excluding-n) (inc rand-n) rand-n)]
(g/wrap rand-n' min-n (inc max-n))))
This generates a random number within the range, and if it equals the excluded number, adds one; wrapping if necessary. Of course this ended up giving the number after the excluded number twice the chance of being picked since it would be picked either if it or the excluded number are chosen. Sample output frequencies for a range of 0 to 10 (max exclusive), excluding 2:
([0 0.099882]
[1 0.100355]
[3 0.200025]
[4 0.099912]
[5 0.099672]
[6 0.099976]
[7 0.099539]
[8 0.100222]
[9 0.100417])
Then I read this answer, which seemed much simpler, and based on it, wrote up:
(defn random-int-excluding
"Generates a random number between min-n and max-n; excluding excluding-n.
min-n is inclusive, while max-n is exclusive."
[min-n max-n excluding-n rand-gen]
(let [r1 (g/random-int min-n excluding-n rand-gen)
r2 (g/random-int (inc excluding-n) max-n rand-gen)]
(if (g/random-boolean rand-gen) r1 r2)))
Basically, it splits the range into two smaller ranges: from the min to the excluded number, and from the excluded number + 1 to the max. It generates a random number from each of these ranges, then randomly chooses one of them. Unfortunately, as I noted under the answer, this gives skewed results unless both partitions are of equal size. Sample output frequencies, under the same conditions as above:
([0 0.2499497]
[1 0.2500795]
[3 0.0715849]
[4 0.071297]
[5 0.0714366]
[6 0.0714362]
[7 0.0712715]
[8 0.0715285]
[9 0.0714161])
Note that the numbers in the smaller range before the excluded number are much more likely. To fix this, I'd have to skew it to pick numbers from the larger range more frequently, and really, I'm not proficient enough in maths in general to understand how to do that.
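For reference, one way such a skew could be applied is to pick each sub-range with probability proportional to its size. The sketch below rests on several assumptions: it uses clojure.core/rand-int instead of the g/... helpers (so the rand-gen parameter is dropped), the name random-int-excluding-weighted is made up, and excluding-n is assumed to lie inside the range with at least one other value left to pick.

```clojure
(defn random-int-excluding-weighted
  "Random integer in [min-n, max-n), never excluding-n.
  Assumes min-n <= excluding-n < max-n and at least one allowed value."
  [min-n max-n excluding-n]
  (let [low-size  (- excluding-n min-n)          ; count of values below the excluded one
        high-size (- max-n (inc excluding-n))    ; count of values above it
        total     (+ low-size high-size)]
    ;; Choose the lower range with probability low-size / total.
    (if (< (rand-int total) low-size)
      (+ min-n (rand-int low-size))
      (+ (inc excluding-n) (rand-int high-size)))))

(println (sort (frequencies
                (repeatedly 10000 #(random-int-excluding-weighted 0 10 2)))))
```

With the range 0 to 10 and 2 excluded, the lower range holds 2 values and the upper range 7, so the lower range is chosen 2/9 of the time and every allowed value ends up with probability 1/9.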
I looked at the accepted answer from the linked question, but to me, it seems like a version of my first attempt that accepts more than 1 number to exclude. I'd expect, against what the answerer claimed, that the numbers at the end of the exclusion range would be favored, since if a number is chosen that's within the excluded range, it just advances the number past the range.
Since this is going to be one of the most called functions in the simulation, I'd really like to avoid the "brute-force" method of looping while the generated number is excluded since the range will only have 3 numbers, so there's a 1/3 chance that it will need to try again each attempt.
Does anyone know of a simple algorithm to choose a random number from a continuous range while excluding a single number?
To generate a number in the range [a, b] excluding c, simply generate a number in the range [a, b-1], and if the result is c then output b instead.
Just generate a lazy sequence and filter out items you don't want:
(let [ignore #{4 2}]
(frequencies
(take 2000
(remove ignore (repeatedly #(rand-int 5))))))
Advantage over the other approach of mapping excluded values to new ones: this function will also work with other discrete random number distributions.
If the size of the collection of acceptable answers is small, just put all values into a vector and use rand-nth:
http://clojuredocs.org/clojure.core/rand-nth
(def primes [ 2 3 5 7 11 13 17 19] )
(println (rand-nth primes))
(println (rand-nth primes))
(println (rand-nth primes))
~/clj > lein run
19
13
11
Update
If some of the values should occur more often than the others, just put them in the vector of values more than once. The number of occurrences of each value determines its relative weight:
(def samples [ 1 2 2 3 3 3 4 4 4 4 ] )
(def weighted-samples
(repeatedly #(rand-nth samples)))
(println (take 22 weighted-samples))
;=> (3 4 2 4 3 2 2 1 4 4 3 3 3 2 3 4 4 4 2 4 4 4)
If we wanted any number from 1 to 5, but never 3, just do this:
(def samples [ 1 2 4 5 ] )
(def weighted-samples
(repeatedly #(rand-nth samples)))
(println (take 22 weighted-samples))
(1 5 5 5 5 2 2 4 2 5 4 4 5 2 4 4 4 2 1 2 4 1)
Just to show the implementation I wrote, here's what worked for me:
(defn random-int-excluding
"Generates a random number between min-n and max-n; excluding excluding-n.
min-n is inclusive, while max-n is exclusive."
[min-n max-n excluding-n rand-gen]
(let [rand-n (g/random-int min-n (dec max-n) rand-gen)]
(if (= rand-n excluding-n)
(dec max-n)
rand-n)))
Which gives a nice even distribution:
([0 0.111502]
[1 0.110738]
[3 0.111266]
[4 0.110976]
[5 0.111162]
[6 0.111266]
[7 0.111093]
[8 0.110815]
[9 0.111182])
Just to make Alan Malloy's answer explicit:
(defn rand-int-range-excluding [from to without]
(let [n (+ from (rand-int (dec (- to from))))]
(if (= n without)
(dec to)
n)))
(->> #(rand-int-range-excluding 5 10 8)
repeatedly
(take 100)
frequencies)
;{6 28, 9 22, 5 29, 7 21}
No votes required :).
I am trying to write a function that takes a matrix (represented as a list of lists) and adds the elements down the columns and returns a vector (represented as a list):
Example:
(define sample
'((2 6 0 4)
(7 5 1 4)
(6 0 2 2)))
should return '(15 11 3 10).
I was trying to use the (list-ref) function twice to obtain the first element of each column with no luck. I am trying something like:
(map (lambda (matrix) ((list-ref (list-ref matrix 0) 0)) (+ matrix))
The solution is simple if we forget about the indexes and think about higher-order procedures, try this:
(define sample
'((2 6 0 4)
(7 5 1 4)
(6 0 2 2)))
(apply map + sample)
=> '(15 11 3 10)
Explanation: map can take multiple lists as arguments. If we apply it to sample (which is a list of lists) and pass + as the procedure to do the mapping, it'll take one element from each list in turn and add them, producing a list with the results - effectively, adding all the columns in the matrix.
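The same idiom carries over to Clojure, the other Lisp used on this page, where map also accepts several collections. The sketch below is only an illustration in that language (the name sample is reused from the question); swapping + for vector makes the column-wise grouping visible before anything is added.

```clojure
(def sample
  '((2 6 0 4)
    (7 5 1 4)
    (6 0 2 2)))

;; apply spreads the rows out as separate arguments to map,
;; so this is (map vector row1 row2 row3): one vector per column.
(println (apply map vector sample))
;; prints ([2 7 6] [6 5 0] [0 1 2] [4 4 2])

;; With + instead of vector, each column is summed.
(println (apply map + sample))
;; prints (15 11 3 10)
```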
I have code which directly mutates a matrix for performance. Before I mutate it I want to get a complete copy to store in a new symbol, which is then used by the mutation process. Is there any way that I can copy a Clojure symbol's contents into a new symbol so that the first can be mutated without affecting the second?
Here is one of my failed attempts:
(var mat1 (clatrix/matrix (clatrix/ones 2 2)))
(var mat1)
(intern 'analyzer.core 'mat1 (clatrix/matrix (clatrix/ones 2 2)))
mat1
(intern 'analyzer.core 'mat2 mat1)
mat2
(clatrix/set mat1 0 0 2)
mat1
mat2
And of course, this does not work:
(def mat1 (clatrix/matrix (clatrix/ones 2 2)))
(def mat2 mat1)
I also attempted (but not sure if I'm doing it right here anyway):
(def mat1 (clatrix/matrix (clatrix/ones 2 2)))
(def mat2 `mat1)
and
(def mat1 (clatrix/matrix (clatrix/ones 2 2)))
(def mat2 ~mat1)
and
(def mat1 (clatrix/matrix (clatrix/ones 2 2)))
(def mat2 (.dup mat1))
Any ideas?
Update
I have benchmarked the answers presented so far. I'm not sure what the symbol of the lines means.
Setup:
(def mat1 (clatrix/ones 1000 1000)) ; Creates a 1000x1000 matrix with 1.0 in each element.
From #Mars:
(criterium.core/bench (let [mat2 (clatrix/matrix mat1)]))
From #JoG:
(criterium.core/bench (let [mat2 (read-string (pr-str mat1))]))
For more general cases
#JoG's solution will work for data structures that serialize into strings well. If someone has ideas about how to make a more general solution, please respond, and I will update this.
Just use matrix again:
(require '[clatrix.core :as clatrix])
; nil
(def mat1 (clatrix/matrix [[1 1][1 1]]))
; #'user/mat1
(def mat2 (clatrix/matrix mat1))
; #'user/mat2
mat1
; A 2x2 matrix
; -------------
; 1.00e+00 1.00e+00
; 1.00e+00 1.00e+00
(clatrix/set mat1 0 0 2)
; #<DoubleMatrix [2.000000, 1.000000; 1.000000, 1.000000]>
mat1
; A 2x2 matrix
; -------------
; 2.00e+00 1.00e+00
; 1.00e+00 1.00e+00
mat2
; A 2x2 matrix
; -------------
; 1.00e+00 1.00e+00
; 1.00e+00 1.00e+00
Given that the matrix itself is also a Clojure data structure, how about
(def copy (read-string (pr-str original)))
pr-str dumps the data structure as a string, and read-string reads it back into a new data structure.
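Why a plain (def mat2 mat1) shares later mutations while an explicit copy does not can be shown without Clatrix. The sketch below uses a Java double array as a stand-in for the mutable matrix object; that substitution is an assumption made for illustration, and arr1, arr2 and arr3 are made-up names.

```clojure
(def arr1 (double-array [1.0 1.0 1.0 1.0]))
(def arr2 arr1)           ; a second name for the SAME mutable array
(def arr3 (aclone arr1))  ; a new array holding the same values

(aset arr1 0 2.0)         ; mutate through the first name

(println (aget arr2 0))   ; prints 2.0, the alias sees the change
(println (aget arr3 0))   ; prints 1.0, the copy is unaffected
```

def only binds a name to an existing object, so any copy has to be made explicitly before the mutation happens, as both answers above do.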