I am unsure why I am getting an error.
I think it may stem from a misunderstanding around the structure syntax, but I am not certain if this is the issue (it would be unsurprising if there are multiple issues).
I am emulating code (from William Gould's The Mata Book) in which the input is a scalar, but the input for the program I am writing is a colvector.
The objective of this exercise is to create a square matrix from a column vector (according to some rules) and once created, multiply this square matrix by itself.
The code is the following:
*! spatial_lag version 1.0.0
version 15
set matastrict on
//--------------------------------------------------------------
local SL struct laginfo
local RS real scalar
local RC real colvector
local RM real matrix
//--------------------------------------------------------------
mata
`SL'
{
//-------------------inputs:
`RC' v
//-------------------derived:
`RM' W
`RM' W2
`RS' n
}
void lagset(`RC' v)
{
`SL' scalar r
// Input:
r.v = v
//I set the derived variables to missing:
r.W = .z
r.W2 = .z
r.n = .z // length of vector V
}
`RM' w_mat(`SL' scalar r)
{
if (r.W == .z) {
real scalar row, i
real scalar col, j
r.W = J(r.n,r.n,0)
for (i=1; i<=r.n; i++) {
for (j=1; j<=r.n; j++) {
if (j!=i) {
if (r.v[j]==r.v[i]) {
r.W[i,j] = 1
}
}
}
}
}
return(r.W)
}
`RS' wlength(`SL' scalar r)
{
if (r.n == .z) {
r.n = length(r.v)
}
return(r.n)
}
`RM' w2mat(`SL' scalar r)
{
if (r.W2 == .z) {
r.W2 = r.W * r.W
}
return(r.W2)
}
end
This compiles without a problem, but it gives an error when I attempt to use it interactively as follows:
y=(1\1\1\2\2\2)
q = lagset(y)
w_mat(q)
w2mat(q)
The first two lines run fine, but when I run the last two of those lines, I get:
w_mat(): 3204 q[0,0] found where scalar required
<istmt>: - function returned error
What am I misunderstanding?
This particular error is unrelated to structures. Stata complains because lagset() is declared void, so it does not return anything. As a result, q ends up empty (0 x 0), and passing it to w_mat(), which expects a scalar structure, triggers the q[0,0] error. To fix it, declare lagset() as returning `SL' scalar and end the function with return(r).
JavaScript
for (var x = 0; x < 360; x += 0.5)
{
// ...
}
How do I do this in Kotlin?
Note that my step size is a floating point and so a regular range won't work:
for (x in 0.0 until 360.0 step 0.5) {
// ...
}
I also need to exclude the ending value, hence why I'm using until.
I will resort to a while loop for now:
var x = 0.0;
while (x < 360.0) {
// ...
x += 0.5
}
There isn't a way to do this right now in Kotlin because Kotlin does not have "traditional" for loops. I believe you're right in choosing a while loop. In fact, traditional for loops are just while loops in disguise:
for (init; condition; post) {
// do something
}
can always be rewritten as
init
while (condition) {
// do something
post
}
with no change in behavior, because the init statement will always execute and the condition will always be checked before the loop runs even once. One thing this construct can't give you is a variable that's only scoped to this block. If you're really after that behavior, the following would work, though it's not very idiomatic.
for (x in generateSequence(0.0) { it + 0.5 }.takeWhile { it < 360.0}) {
println(x)
}
If you're using a Sequence, you might also be interested in the more idiomatic forEach:
generateSequence(0.0) { it + 0.5 }.takeWhile { it < 360.0 }.forEach { x ->
// do something
}
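The equivalence between the two loop forms described above can be checked directly. The sketch below uses JavaScript, the language of the snippet in the question; the function names withFor and withWhile are made up for illustration.

```javascript
// Collect the values visited by a classic three-part for loop.
function withFor(start, end, step) {
  const seen = [];
  for (let x = start; x < end; x += step) {
    seen.push(x);
  }
  return seen;
}

// The same loop rewritten as init / while (condition) / post.
function withWhile(start, end, step) {
  const seen = [];
  let x = start;        // init
  while (x < end) {     // condition
    seen.push(x);       // body
    x += step;          // post
  }
  return seen;
}

const a = withFor(0, 360, 0.5);
const b = withWhile(0, 360, 0.5);
// Both arrays hold the same values, and 360 itself is excluded.
console.log(a.length === b.length, a[a.length - 1]);
```

The only visible difference is that x in withWhile stays in scope after the loop ends, which is the scoping point made above.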
As repeated addition is just a shorthand for multiplication you can look at for (x in 0.0 until 360.0 step 0.5) as "do something for as many times as 0.5 fits in 360". Therefore it would make sense to express it like this:
val times = (360 / 0.5).toInt()
repeat(times){
}
Of course, you can also inline times:
repeat(times = (360 / 0.5).toInt()){
}
The advantage is that you skip the sequence generation. The drawback is that you lose access to the counting variable x.
See also https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/repeat.html
I agree with the answer by @AdamYakes.
Since the until operator is not available for floating-point types, you cannot use it with a floating-point step value.
However, if you still want the effect of
for (x in 0.0 until 360.0 step 0.5) {
// use value
}
you can do that as follows:
for (x in 0 until 3600 step 5) {
// use (x / 10.0)
}
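The integer-counter idea above can be sketched in JavaScript, the language of the snippet in the question. The helper name scaledSteps and its parameters are hypothetical; the bounds are given in tenths so that the counter stays an integer. A side benefit is that an integer counter cannot accumulate rounding error, which matters for steps such as 0.1 that have no exact binary representation.

```javascript
// Iterate with an integer counter expressed in tenths and convert to
// the real value only when it is used.
function scaledSteps(startTenths, endTenths, stepTenths) {
  const values = [];
  for (let i = startTenths; i < endTenths; i += stepTenths) {
    values.push(i / 10); // convert tenths back to the real value
  }
  return values;
}

// Equivalent to stepping from 0.0 (inclusive) to 360.0 (exclusive) by 0.5.
const xs = scaledSteps(0, 3600, 5);
console.log(xs.length, xs[1], xs[xs.length - 1]);
```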
Kotlin is so flexible that one could hack a custom solution:
private infix fun (((Double) -> Unit, Double) -> Unit).step(step: Double) =
fun(action: (Double) -> Unit) = this(action, step)
private infix fun Double.upto(to: Double) =
fun(action: (Double) -> Unit, step: Double) {
var v = this
while (v < to) {
action(v)
v += step
}
}
fun main() {
// Usage:
(0.0 upto 360.0 step 0.5) {
println(it)
}
}
What I did above is to create two infix functions.
The upto() infix function can be used with two Doubles: it takes 0.0 upto 360.0 and returns a function that expects two arguments: an action function and the step double value.
The step infix function can be used with functions returned by upto() combined with a Double (so f step 0.5). It returns a function that calls the upto() function passing the step value and the user's { ... } block that gets called for each iteration.
Short and convoluted.
A simpler alternative:
inline fun doubleFor(
from: Double, to: Double, step: Double, action: (Double) -> Unit
) {
var v = from
while (v < to) {
action(v)
v += step
}
}
fun main() {
// Usage:
doubleFor(0.0, 360.0, 0.5) {
println(it)
}
}
A good IDE would show the argument names next to the values making the meaning of the arguments in the usage code obvious.
A third approach can be found at https://stackoverflow.com/a/44332139/2933899 which allows one to write for(x in 0.0 .. 360.0 step 0.5) { println(x) } with the inconvenience that the top boundary value is included.
Run these at https://pl.kotl.in/ZuR354Fas
Currently I'm working on an optimization problem for a course I'm doing with a fellow student. It's basically described by three equations.
Where n is an index taking values between 1 and 1180, Pr is a known vector (meaning all values of this vector are known and constant) and we have to find the vector Ps that results in the minimum value of Ef[1180].
Logically, the answer would be to set all values of Ps[n] to infinity. However, there are a few constraints:
Furthermore, the values of Es and Ps must always be a multiple of 1,000 to decrease the state space.
The above is what we figured out from the assignment description. However, we can't seem to figure out how to solve this as a dynamic programming problem. There are lots of examples around for going from a set of equations to a dynamic programming problem. However, those examples all have two or three inputs and use a 2- or 3-dimensional dictionary, respectively, to facilitate data reuse. We essentially have 1180 inputs. Creating an 1180-dimensional dictionary is not feasible.
We tried constructing Bellman equations for this problem, but the professor told us this is wrong. Then we considered brute-forcing the state space, but this is an insane job since there are 43^1180 possible combinations of input vectors Ps. Some of our fellow students advised us to check out the checkerboard example on this Wikipedia page:
Wikipedia page on dynamic programming
However, this example seems to traverse the checkerboard only once. Using a cost function would always pick the highest possible value for Ps[n] to minimize Ef[n]. However, to pick such a positive value we must have Es[n] > 0, which can only happen when previous elements Ps[i] for i < n take negative values. But the cost function will prevent Ps from having negative values. Since the cost function does not allow negative values and the Es[n] >= 0 constraint does not allow positive values, this will result in a Ps containing only zeroes, which certainly does not result in the lowest value of Ef[1180].
Any hints on how to continue would be nice. We have been staring at this problem for days now and we are completely lost at this point.
Minimizing E[1180] is equivalent to maximizing f, defined below:
f(P) = \sum_{i=1}^{1180} P_i
under the constraint:
forall n <= 1180
-6.5*10^5 <= \sum_{i=1}^n P_i <= 0
The recurrence looks like this:
f(i, sumPs, v) {
if i == 1180
return { s: sumPs, solution: v }
res = { s: -infinity, solution: [] }
# Pr(i) > Ps(i)
for psi in -21:min(Pr[i], 21)
# Es(n-1) = - sumPs
if psi <= -sumPs
tmp = f(i+1, sumPs + psi, v+[psi])
if tmp.s > res.s
res.s = tmp.s
res.solution = tmp.solution
return res
}
f(0, 0, [])
The dynamic programming approach is similar:
Initialize the first layer: an associative array with sumPs as key and {s: sum, v: optional} as value.
We could actually store nothing as the value and use a set (storing only the sums), but keeping the path is convenient for debugging.
Initialization:
for psi in -21:min(Pr[0], 0)
layer[psi] = {s: psi, v:[psi]}
To build layer i+1, you only need layer i
for i = 2:1180
nextLayer = []
for psi in range(-21, min(Pr[i-1], 21))
for candidate in layer:
if psi <= -candidate.s
maybe = candidate.s + psi
if !nextLayer[maybe]
nextLayer[maybe] = {s: maybe, v: candidate.v+[psi]}
layer = nextLayer
NB: I have not handled the 1000 factor, but that should not be a problem
const Pr = [-10,2,-2,4,-1]
function f(i, sumPs, v) {
if (i == 5) {
return { s: sumPs, v }
}
let res = { s: -1e12, v: [] }
for (let psi = -21; psi<=Math.min(Pr[i], 21); ++psi) {
if (psi <= -sumPs) {
let tmp = f(i+1, sumPs + psi, v.concat(psi))
if (tmp.s > res.s) {
res.s = tmp.s
res.v = tmp.v
}
}
}
return res
}
function dp(n, pr){
let layer = new Map
for (let psi = -21; psi <= Math.min(pr[0], 0); ++psi) {
layer.set(psi, {s: psi, v:[psi]})
}
for (let i = 2; i <= n; ++i) {
let nextLayer = new Map
for (let psi = -21; psi <= Math.min(pr[i-1], 21); ++psi) {
for (let [k, candidate] of layer) {
if (psi <= -candidate.s) {
const maybe = candidate.s + psi
if (!nextLayer.has(maybe)) {
nextLayer.set(maybe, { s: maybe, v: candidate.v.concat(psi) })
}
}
}
}
layer = nextLayer
}
return [...layer.entries()].sort((a,b) => b[0] - a[0])[0][1]
}
console.log(f(0,0,[]))
console.log(dp(5,Pr))
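The factor of 1,000 left open in the NB could be handled by converting the inputs to units of 1,000 before running the search and converting the result back afterwards. The sketch below is one possible way to do that, not part of the original solution: the helper names toUnitsFloor and fromUnits and the sample values in rawPr are made up, and rounding down is an assumption chosen so that a Ps value in whole units never exceeds the raw Pr bound.

```javascript
const UNIT = 1000;

// Largest whole number of units whose raw value does not exceed `raw`.
// Math.floor also rounds negative bounds away from zero, which keeps
// psi * UNIT <= raw true.
function toUnitsFloor(raw) {
  return Math.floor(raw / UNIT);
}

// Convert a solution expressed in units back to raw values.
function fromUnits(units) {
  return units.map(u => u * UNIT);
}

// Hypothetical raw Pr values, not taken from the assignment.
const rawPr = [-10500, 2000, -2300, 4999, -1000];
const prUnits = rawPr.map(toUnitsFloor);
console.log(prUnits);
console.log(fromUnits([-11, 2, 0]));
```

An array such as prUnits can then be passed to dp(n, pr) in place of Pr, and the v array it returns converted back with fromUnits.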
I want to release the GIL in order to parallelise a loop in Cython, where different slices of memoryviews are passed to some function inside the loop. The code looks like this:
cpdef void do_sth_in_parallel(bint[:,:] input, bint[:] output, int D):
for d in prange(D, schedule=dynamic, nogil=True):
output[d] = some_function_not_requiring_gil(x[d,:])
This is not possible, since selecting the slice x[d,:] seems to require the GIL. Running cython -a and using a normal for loop, I get the code posted below. How can this be done in pure C?
__pyx_t_5.data = __pyx_v_x.data;
__pyx_t_5.memview = __pyx_v_x.memview;
__PYX_INC_MEMVIEW(&__pyx_t_5, 0);
{
Py_ssize_t __pyx_tmp_idx = __pyx_v_d;
Py_ssize_t __pyx_tmp_shape = __pyx_v_x.shape[0];
Py_ssize_t __pyx_tmp_stride = __pyx_v_x.strides[0];
if (0 && (__pyx_tmp_idx < 0))
__pyx_tmp_idx += __pyx_tmp_shape;
if (0 && (__pyx_tmp_idx < 0 || __pyx_tmp_idx >= __pyx_tmp_shape)) {
PyErr_SetString(PyExc_IndexError, "Index out of bounds (axis 0)");
__PYX_ERR(0, 130, __pyx_L1_error)
}
__pyx_t_5.data += __pyx_tmp_idx * __pyx_tmp_stride;
}
__pyx_t_5.shape[0] = __pyx_v_x.shape[1];
__pyx_t_5.strides[0] = __pyx_v_x.strides[1];
__pyx_t_5.suboffsets[0] = -1;
__pyx_t_6.data = __pyx_v_u.data;
__pyx_t_6.memview = __pyx_v_u.memview;
__PYX_INC_MEMVIEW(&__pyx_t_6, 0);
__pyx_t_6.shape[0] = __pyx_v_u.shape[0];
__pyx_t_6.strides[0] = __pyx_v_u.strides[0];
__pyx_t_6.suboffsets[0] = -1;
The following works for me:
from cython.parallel import prange
cdef bint some_function_not_requiring_gil(bint[:] x) nogil:
return x[0]
cpdef void do_sth_in_parallel(bint[:,:] input, bint[:] output, int D):
cdef int d
for d in prange(D, schedule='dynamic', nogil=True):
output[d] = some_function_not_requiring_gil(input[d,:])
The two main changes I had to make were renaming x to input (because Cython otherwise assumes it can find x as a Python object at global scope) to fix the error
Converting to Python object not allowed without gil
and adding cdef int d to force the type of d and fix the error
Coercion from Python not allowed without the GIL
(I also created an example some_function_not_requiring_gil but I assume this is fairly obvious)
Solution that works for me:
Access the array slice using
input[d:d+1, :]
instead of
input[d,:]
and pass the resulting 2D slice to the function.
I usually find the answers to my questions by looking around here (I'm glad Stack Overflow exists!), but I haven't found the answer to this one. I hope you can help me :)
I am using the projection.matrix() function from the "popbio" package to create transition matrices. In the function, you have to specify the "stage" and "fate" (both categorical variables), and the "fertilities" (a numeric column).
Everything works fine, but I would like to apply the function to 1:n fertility columns within the data frame, and get a list of matrices generated from the same categorical variables with the different fertility values.
This is what my data frame looks like (I only include the variables I am using for this question):
stage.fate = data.frame(replicate(2, sample(0:6,40,rep=TRUE)))
stage.fate$X1 = as.factor(stage.fate$X1)
stage.fate$X2 = as.factor(stage.fate$X2)
fertilities = data.frame(replicate(10,rnorm(40, .145, .045)))
df = cbind(stage.fate, fertilities)
colnames(df)[1:2]=c("stage", "fate")
prefix = "control"
suffix = seq(1:10)
fer.names = (paste(prefix ,suffix , sep="."))
colnames(df)[3:12] = c(fer.names)
Using
library(popbio)
projection.matrix(df, fertility=control.1)
returns a single transition matrix with the fertility values incorporated into the matrix.
My problem is that I would like to generate a list of matrices with the different fertility values in one go (in reality the length of my data is >=300, and the fertility columns ~100 for each of four different treatments...).
I will appreciate your help!
-W
PS This is what the function in popbio looks like:
projection.matrix =
function (transitions, stage = NULL, fate = NULL, fertility = NULL,
sort = NULL, add = NULL, TF = FALSE)
{
if (missing(stage)) {
stage <- "stage"
}
if (missing(fate)) {
fate <- "fate"
}
nl <- as.list(1:ncol(transitions))
names(nl) <- names(transitions)
stage <- eval(substitute(stage), nl, parent.frame())
fate <- eval(substitute(fate), nl, parent.frame())
if (is.null(transitions[, stage])) {
stop("No stage column matching ", stage)
}
if (is.null(transitions[, fate])) {
stop("No fate column matching ", fate)
}
if (missing(sort)) {
sort <- levels(transitions[, stage])
}
if (missing(fertility)) {
fertility <- intersect(sort, names(transitions))
}
fertility <- eval(substitute(fertility), nl, parent.frame())
tf <- table(transitions[, fate], transitions[, stage])
T_matrix <- try(prop.table(tf, 2)[sort, sort], silent = TRUE)
if (class(T_matrix) == "try-error") {
warning(paste("Error sorting matrix.\n Make sure that levels in stage and fate columns\n match stages listed in sort option above.\n Printing unsorted matrix instead!\n"),
call. = FALSE)
sort <- TRUE
T_matrix <- prop.table(tf, 2)
}
T_matrix[is.nan(T_matrix)] <- 0
if (length(add) > 0) {
for (i in seq(1, length(add), 3)) {
T_matrix[add[i + 0], add[i + 1]] <- as.numeric(add[i +
2])
}
}
n <- length(fertility)
F_matrix <- T_matrix * 0
if (n == 0) {
warning("Missing a fertility column with individual fertility rates\n",
call. = FALSE)
}
else {
for (i in 1:n) {
fert <- tapply(transitions[, fertility[i]], transitions[,
stage], mean, na.rm = TRUE)[sort]
F_matrix[i, ] <- fert
}
}
F_matrix[is.na(F_matrix)] <- 0
if (TF) {
list(T = T_matrix, F = F_matrix)
}
else {
T_matrix + F_matrix
}
}
<environment: namespace:popbio>
My question was answered via ResearchGate by Caner Aktas
Answer:
fertility.list<-vector("list",length(suffix))
names(fertility.list)<-fer.names
for(i in suffix) fertility.list[[i]]<-projection.matrix(df,fertility=fer.names[i])
fertility.list
Applying popbio “projection.matrix” to multiple fertilities and generate list of matrices?. Available from: https://www.researchgate.net/post/Applying_popbio_projectionmatrix_to_multiple_fertilities_and_generate_list_of_matrices#5578524f60614b1a438b459b [accessed Jun 10, 2015].
Simplified example of my slow code (the function rbf is from the kernlab package) that needs speeding up:
install.packages('kernlab')
library('kernlab')
rbf <- rbfdot(sigma=1)
test <- matrix(NaN,nrow=5,ncol=10)
for (i in 1:5) {
for (j in 1:10) { test[i,j] <- rbf(i,j)}
}
I've tried outer() but it doesn't work because the rbf function doesn't return the required length (50). I need to speed this code up because I have a huge amount of data. I've read that vectorization would be the holy grail to speeding this up but I don't know how.
Could you please point me in the right direction?
If rbf is indeed the return value from a call to rbfdot, then body(rbf) looks something like
{
if (!is(x, "vector"))
stop("x must be a vector")
if (!is(y, "vector") && !is.null(y))
stop("y must a vector")
if (is(x, "vector") && is.null(y)) {
return(1)
}
if (is(x, "vector") && is(y, "vector")) {
if (!length(x) == length(y))
stop("number of dimension must be the same on both data points")
return(exp(sigma * (2 * crossprod(x, y) - crossprod(x) -
crossprod(y))))
}
}
Since most of this consists of checks, and crossprod simplifies when you are only passing in scalars, I think your function simplifies to
rbf <- function(x, y, sigma = 1)
{
exp(- sigma * (x - y) ^ 2)
}
For a possible further speedup, use the compiler package (requires R-2.14.0 or later).
rbf_loop <- function(m, n)
{
out <- matrix(NaN, nrow = m, ncol = n)
for (i in seq_len(m))
{
for (j in seq_len(n))
{
out[i,j] <- rbf(i,j)
}
}
out
}
library(compiler)
rbf_loop_cmp <- cmpfun(rbf_loop)
Then compare the timing of rbf_loop_cmp(m, n) to what you had before.
The simplification step is easier to see in reverse. If you expand (x - y) ^ 2 you get x ^ 2 - 2 * x * y + y ^ 2, which is the negative of 2 * crossprod(x, y) - crossprod(x) - crossprod(y) when x and y are scalars.
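Written out for scalar x and y, where crossprod(x, y) reduces to xy and crossprod(x) reduces to x^2, the simplification is:

```latex
\begin{aligned}
\exp\bigl(\sigma\,(2xy - x^2 - y^2)\bigr)
  &= \exp\bigl(-\sigma\,(x^2 - 2xy + y^2)\bigr) \\
  &= \exp\bigl(-\sigma\,(x - y)^2\bigr)
\end{aligned}
```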
Use the function kernelMatrix() in kernlab; it should be a couple of orders of magnitude faster than looping over the kernel function:
library(kernlab)
rbf <- rbfdot(sigma=1)
kernelMatrix(rbf, 1:5, 1:10)