How to write an arbitrary number of models (functions) in NumPyro?

I have 5 NumPyro models of the form:
def patient_1():
    disease = numpyro.sample("disease",dist.Bernoulli(0.4))
    treatment = numpyro.sample("treatment",dist.Bernoulli(0.7))
    list_variables=[disease,treatment]
    list_variables_no_treatment=[disease]
    bloodpressure = numpyro.sample("bloodpressure",dist.Normal(non_linear_fn(jnp.array(random.sample(list_variables_no_treatment,np.random.randint(0,1))),np.random.randint(0,1)*treatment),1))
    list_variables.append(bloodpressure)
    list_variables_no_treatment.append(bloodpressure)
    weight = numpyro.sample("weight", dist.Normal(non_linear_fn(jnp.array(random.sample(list_variables_no_treatment,np.random.randint(0,2))),np.random.randint(0,1)*treatment), 1))
    list_variables.append(weight)
    list_variables_no_treatment.append(weight)
    heartattack=numpyro.sample("heartattack",dist.Normal(non_linear_fn(jnp.array(random.sample(list_variables_no_treatment,np.random.randint(1,3))),np.random.randint(0,1)*treatment),1))
    list_variables.append(heartattack)
    list_variables_no_treatment.append(heartattack)
    variables=[]
    list_variables=[disease,treatment,bloodpressure,weight,heartattack]
    for i in range(5,k):
        vi=numpyro.sample('variable'+str(i),dist.Normal(non_linear_fn(jnp.array(random.sample(list_variables,np.random.randint(0,i))),np.random.randint(0,1)*treatment), 1))
        variables.append(vi)
        list_variables.append(vi)
    vi=numpyro.sample('variable'+str(k),dist.Normal(non_linear_fn(jnp.array(random.sample(list_variables,np.random.randint(0,k))),treatment), 1))
    variables.append(vi)
    list_variables.append(vi)
    return list_variables
Now, I would like to generate "k" models, so I was thinking of some form of for-loop in which I specify each of the models, but it didn't work. Any ideas?

Related

Error: requires numeric/complex matrix/vector arguments for %*%; cross validating glmmTMB model

I am adapting some k-fold cross-validation code written for glmer/merMod models to a glmmTMB model framework. All seems well until I try to use the output from the model(s) fit with training data to predict and exponentiate values into a matrix (which is then broken into quantiles/bins to assess predictive performance). I can get this line to work using glmer models, but when I run the same model using glmmTMB I get: Error in model.matrix: requires numeric/complex matrix/vector arguments. There are many other posts discussing this error, and I have tried converting the data frame into matrix form and changing the class of the covariates, with no luck. Running the parts before and after the %*% separately works, but combining them produces the error. For context, this code is intended to be run with use/availability data, so the example variables may not make sense, but they show the problem well enough. Any suggestions as to what is going on?
library(lme4)
library(glmmTMB)
# Example with mtcars dataset
data(mtcars)
# Model both with glmmTMB and lme4
m1 <- glmmTMB(am ~ mpg + wt + (1|carb), family = poisson, data=mtcars)
m2 <- glmer(am ~ mpg + wt + (1|carb), family = poisson, data=mtcars)
#--- K-fold code (hashed out sections are original glmer version of code where different)---
# define variables
k <- 5
mod <- m1 #m2
dt <- model.frame(mod) #data used
reg.list <- list() # initialize object to store all models used for cross validation
# finds the name of the response variable in the model dataframe
resp <- as.character(attr(terms(mod), "variables"))[attr(terms(mod), "response") + 1]
# define column called sets and populates it with character "train"
dt$sets <- "train"
# randomly selects a proportion of the "used"/am records (i.e. am = 1) for testing data
dt$sets[sample(which(dt[, resp] == 1), sum(dt[, resp] == 1)/k)] <- "test"
# updates the original model using only the subset of "trained" data
reg <- glmmTMB(formula(mod), data = subset(dt, sets == "train"), family=poisson,
               control = glmmTMBControl(optimizer = optim, optArgs=list(method="BFGS")))
#reg <- glmer(formula(mod), data = subset(dt, sets == "train"), family=poisson,
#             control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=2e5)))
reg.list[[i]] <- reg # store models
# uses new model created with training data (i.e. reg) to predict and exponentiate values
predall <- exp(as.numeric(model.matrix(terms(reg), dt) %*% glmmTMB::fixef(reg)))
#predall <- exp(as.numeric(model.matrix(terms(reg), dt) %*% lme4::fixef(reg)))
Without looking at the code too carefully: glmmTMB::fixef(reg) returns a list (with elements cond (conditional model parameters), zi (zero-inflation parameters), and disp (dispersion parameters)) rather than a vector.
If you replace this bit with glmmTMB::fixef(reg)[["cond"]], it will probably work.

How to select two random values from two strings if both the strings contains same values

I have two drop-downs:
1) The departure-city array contains N cities:
aaa,bbb,ccc,ddd,eee,fff,......nnn
2) The arrival-city array contains N cities:
aaa,bbb,ccc,ddd,eee,fff,......nnn
I want to select one city from each array at random, but the two cities must not match.
You can include the following in a JSR223 Sampler:
def cities =["aaa","bbb","ccc","ddd","eee","fff","test"]
//Remove a random city and assign to dep_city
cities.shuffle()
def dep_city = cities.pop()
//Remove a random city and assign to arrival_city
cities.shuffle()
def arrival_city= cities.pop()
//Setting the variables
vars.put("dep_city", dep_city)
vars.put("arrival_city", arrival_city)
SampleResult.setIgnore() //Result is not generated
Groovy is used for the scripting
Shuffle is used to randomly reorder the elements
Pop is used to remove the first element from the list
First of all, I don't think your approach is correct: a test needs to be repeatable, and your "random" logic may lead to a situation where one test run reveals a performance problem and the next one doesn't because the data is different.
So maybe it makes more sense to consider using parameterization instead, i.e. put all the cities in a CSV file and use CSV Data Set Config to read them.
If you really want your test to use random data and you have these "arrays" in the form of strings, as in your question, you can implement the randomization using any suitable JSR223 Test Element and example code like:
def dep_array = 'aaa,bbb,ccc,ddd,eee,fff'
def arr_array = 'aaa,bbb,ccc,ddd,eee,fff'
def getRandomCity(String cities, Object prev) {
    def array = cities.split(',')
    def rv = array[org.apache.commons.lang3.RandomUtils.nextInt(0, array.size())]
    if (prev != null) {
        if (rv == prev) {
            rv = getRandomCity(array.join(','), prev)
        }
    }
    return rv
}
def dep_city = getRandomCity(dep_array, null)
def arr_city = getRandomCity(arr_array, dep_city)
vars.put('dep_city', dep_city)
vars.put('arr_city', arr_city)
You will be able to access the values as ${dep_city} and ${arr_city} later on, wherever required.

Algorithm for finding object in range without looping through all other objects?

Background:
I'm just starting to make a game. It has objects that should be able to communicate with each other by "sound" (not necessarily real sound; it can be simulated, but it should behave like sound).
That means that they can only communicate with each other if they are within hearing range.
Question:
Is there some smart way to test whether another object is within hearing range without having to loop through all of the other objects? (That would become really inefficient when there are a lot of them.)
Note: There can be more than one object within hearing range, so all objects within hearing range are added to an array (or list, I haven't decided yet) for communication.
Data
Currently the object has these properties (it can be changed if needed).
Object {
    id = self.id,
    x = self.x,
    y = self.y,
    hearing_max_range = random_range(10, 20), // eg: 10
    can_hear_other = []; // append: other.id when other is in range
}
You could look into some clever data structures such as quadtrees or kd-trees, but for a problem with a fixed range query, it might not be too bad to just use simple binning. I'll present the general algorithm in python-like pseudo code.
First construct your bins:
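from collections import defaultdict

def make_bin(game_objects, bin_size):
    # bin_size should be at least the hearing range, so that every object in
    # range lies in the 3x3 block of bins around the querying object.
    object_bins = defaultdict(list)
    for obj in game_objects:
        object_bins[(obj.x // bin_size, obj.y // bin_size)].append(obj)
    return object_bins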
Then query as necessary:
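def find_neighbors(game_object, object_bins, bin_size):
    # int() keeps range() happy when coordinates are floats
    x_idx = int(game_object.x // bin_size)
    y_idx = int(game_object.y // bin_size)
    # Only the object's own bin and its 8 neighboring bins need checking
    for x_bin in range(x_idx - 1, x_idx + 2):
        for y_bin in range(y_idx - 1, y_idx + 2):
            # .get avoids creating empty bins as a side effect of the query
            for obj in object_bins.get((x_bin, y_bin), []):
                if obj is game_object:
                    continue  # an object is not its own neighbor
                if (obj.x - game_object.x)**2 + (obj.y - game_object.y)**2 <= bin_size**2:
                    yield obj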
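As a rough end-to-end check of the binning idea above, here is a minimal, self-contained sketch. The GameObject class, the sample coordinates, and the brute_force helper are hypothetical and exist only for illustration. The sketch assumes a single shared hearing range that is also used as the bin size, so the 3x3 block of bins around an object covers every candidate.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GameObject:
    """Hypothetical stand-in for the game's object type."""
    id: int
    x: float
    y: float


def make_bins(game_objects, bin_size):
    """Group objects by the integer grid cell that contains them."""
    bins = defaultdict(list)
    for obj in game_objects:
        bins[(int(obj.x // bin_size), int(obj.y // bin_size))].append(obj)
    return bins


def find_neighbors(game_object, bins, bin_size):
    """Return objects within bin_size of game_object, checking only 9 cells."""
    x_idx = int(game_object.x // bin_size)
    y_idx = int(game_object.y // bin_size)
    found = []
    for x_bin in range(x_idx - 1, x_idx + 2):
        for y_bin in range(y_idx - 1, y_idx + 2):
            for obj in bins.get((x_bin, y_bin), []):
                if obj is game_object:
                    continue  # skip the querying object itself
                dist_sq = (obj.x - game_object.x) ** 2 + (obj.y - game_object.y) ** 2
                if dist_sq <= bin_size ** 2:
                    found.append(obj)
    return found


def brute_force(game_object, game_objects, hearing_range):
    """Reference implementation that loops over every object."""
    return [
        obj for obj in game_objects
        if obj is not game_object
        and (obj.x - game_object.x) ** 2 + (obj.y - game_object.y) ** 2
        <= hearing_range ** 2
    ]


HEARING_RANGE = 10.0  # assumed shared range; also used as the bin size
objects = [
    GameObject(0, 0.0, 0.0),
    GameObject(1, 6.0, 8.0),    # exactly 10 units from object 0
    GameObject(2, 9.0, 9.0),    # neighboring cell, but too far away
    GameObject(3, -3.0, 4.0),   # in a negative-index cell, 5 units away
    GameObject(4, 25.0, 0.0),   # outside the 3x3 block entirely
]
bins = make_bins(objects, HEARING_RANGE)
near_ids = sorted(o.id for o in find_neighbors(objects[0], bins, HEARING_RANGE))
print(near_ids)
```

If each object has its own hearing_max_range, as in the question, one option is to size the bins by the largest range and compare against the querying object's own range in the distance test. That variant is not shown here.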

Returning multiple ints and passing them as multiple arguments in Lua

I have a function that takes a variable amount of ints as arguments.
thisFunction(1,1,1,2,2,2,2,3,4,4,7,4,2)
This function was given in a framework, and I'd rather not change the code of the function or the .lua file it comes from. So I want a function that repeats a number a given number of times, to make the call less repetitive. Something that works like this and achieves the same result as the call above:
thisFunction(repeatNum(1,3),repeatNum(2,4),3,repeatNum(4,2),7,4,2)
Is this possible in Lua? I'm even comfortable with something like this:
thisFunction(repeatNum(1,3,2,4,3,1,4,2,7,1,4,1,2,1))
I think you're stuck with something along the lines of your second proposed solution, i.e.
thisFunction(repeatNum(1,3,2,4,3,1,4,2,7,1,4,1,2,1))
because if you use a function that returns multiple values in the middle of a list, it's adjusted so that it only returns one value. However, at the end of a list, the function does not have its return values adjusted.
You can code repeatNum as follows. It's not optimized and there's no error-checking. This works in Lua 5.1. If you're using 5.2, you'll need to make adjustments.
function repeatNum(...)
    local results = {}
    local n = #{...}
    for i = 1,n,2 do
        local val = select(i, ...)
        local reps = select(i+1, ...)
        for j = 1,reps do
            table.insert(results, val)
        end
    end
    return unpack(results)
end
I don't have 5.2 installed on this computer, but I believe the only change you need is to replace unpack with table.unpack.
I realise this question has been answered, but I wondered whether, from a readability point of view, using tables to mark the repeats would be clearer. Of course, it's probably far less efficient.
function repeatnum(...)
    local i = 0
    local t = {...}
    local tblO = {}
    for j,v in ipairs(t) do
        if type(v) == 'table' then
            for k = 1,v[2] do
                i = i + 1
                tblO[i] = v[1]
            end
        else
            i = i + 1
            tblO[i] = v
        end
    end
    return unpack(tblO)
end
print(repeatnum({1,3},{2,4},3,{4,2},7,4,2))

Data manipulation in R in LINQ style

I'm interested if there's a package in R to support call-chain style data manipulation, like in C#/LINQ, F#?
I want to enable style like this:
var list = new[] {1,5,10,12,1};
var newList = list
.Where(x => x > 5)
.GroupBy(x => x%2)
.OrderBy(x => x.Key.ToString())
.Select(x => "Group: " + x.Key)
.ToArray();
I don't know of one, but here's the start of what it could look like:
`%then%` = function(x, body) {
    x = substitute(x)
    fl = as.list(substitute(body))
    car = fl[[1L]]
    cdr = {
        if (length(fl) == 1)
            list()
        else
            fl[-1L]
    }
    combined = as.call(
        c(list(car, x), cdr)
    )
    eval(combined, parent.frame())
}
df = data.frame(x = 1:7)
df %then% subset(x > 2) %then% print
This prints
x
3 3
4 4
5 5
6 6
7 7
If you keep using hacks like that it should be pretty simple to get the kind of syntax you find pleasing ;-)
edit: combined with plyr, this becomes not bad at all:
(data.frame(
x = c(1, 1, 1, 2, 2, 2),
y = runif(6)
)
%then% subset(y > 0.2)
%then% ddply(.(x), summarize,
ysum = sum(y),
ycount = length(y)
)
%then% print
)
dplyr chaining syntax resembles LINQ (stock example):
flights %>%
group_by(year, month, day) %>%
select(arr_delay, dep_delay) %>%
summarise(
arr = mean(arr_delay, na.rm = TRUE),
dep = mean(dep_delay, na.rm = TRUE)
) %>%
filter(arr > 30 | dep > 30)
Introduction to dplyr - Chaining
(Not an answer. More an extended comment on Owen's answer.) Owen's answer helped me understand what you were after and I thoroughly enjoyed reading his insightful answer. This "outside to inside" style reminded me of an example on the help(Reduce) page where the Funcall function is defined and then successively applied:
## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)
What I find especially intriguing about Owen's macro is that it essentially redefines the argument processing of existing functions. I tried thinking of how I might provide arguments to the "interior" functions for the Funcall approach and then realized that his %then% function had already sorted that task out. He was using the function names without their leftmost arguments but with all their other right-hand arguments. Brilliant!
https://github.com/slycoder/Rpipe
c(1,1,1,6,4,3) %|% sort() %|% unique()
# result => c(1,3,4,6)
Admittedly, it would be nice to have a where function here, or alternatively to allow anonymous functions to be passed in, but hey the source code is there: fork it and add it if you want.
