I've got discrete step functions for supply and demand, and I'm searching for an algorithm to find the equilibrium price. The data are below in R, but a solution in any language (or pseudo-code) is acceptable.
demand = data.frame(volume = c(8,2,3,1,1), price=c(1,2,3,4,5))
supply = data.frame(volume = c(3,2,4,2,3), price=c(5,4,3,2,1))
demand$volume <- cumsum(demand$volume)
supply$volume <- cumsum(supply$volume)
plot(demand, type="s")
lines(supply, type="s", col=3)
You need to take cumulative sums of the volumes from opposite ends of the price range: demand accumulates from the highest price downward, supply from the lowest price upward.
demand_cum = (15, 7, 5, 2, 1)
supply_cum = ( 3, 5, 9, 11, 14)
This shows you total, cumulative demand & supply at each price.
Now can you spot the equilibrium?
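The search can also be done mechanically. Below is a minimal sketch in base R, assuming that "equilibrium" means the price (or prices) at which the traded volume, min(demand, supply), is largest; the names prices, traded and clearing_prices are illustrative and not from the question.

```r
demand <- data.frame(volume = c(8, 2, 3, 1, 1), price = c(1, 2, 3, 4, 5))
supply <- data.frame(volume = c(3, 2, 4, 2, 3), price = c(5, 4, 3, 2, 1))

# Candidate prices: every price that appears on either side
prices <- sort(unique(c(demand$price, supply$price)))

# Demand at price p: volume of buyers willing to pay p or more
demand_cum <- sapply(prices, function(p) sum(demand$volume[demand$price >= p]))
# Supply at price p: volume of sellers willing to accept p or less
supply_cum <- sapply(prices, function(p) sum(supply$volume[supply$price <= p]))

# Volume that can actually trade at each price
traded <- pmin(demand_cum, supply_cum)

# Price(s) at which the traded volume is maximal
clearing_prices <- prices[traded == max(traded)]
clearing_prices
```

With these data the traded volume peaks at 5 units for both price 2 and price 3, so the two step curves overlap on a vertical segment and a tie-breaking rule is needed to report a single price.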
I was looking into a similar problem and found this great description: https://www.youtube.com/watch?v=FYfbM56L-mE&ab_channel=31761-Renewablesinelectricitymarkets
You can motivate a similar analysis for your problem. Consider an LP formulation. Given the dual solution, you can find the market-clearing price as follows:
demand = data.frame(Type = "demand",Q = c(8,2,3,1,1), P=c(1,2,3,4,5))
supply = data.frame(Type = "supply",Q = c(3,2,4,2,3), P=c(5,4,3,2,1))
ds <- rbind(supply,demand)
To set the problem up as an LP, proceed as follows:
ds[ds$Type == "demand","Q"] <- ds[ds$Type == "demand","Q"]
ds[ds$Type == "supply","Q"] <- ds[ds$Type == "supply","Q"]
P_s <- ds[ds$Type == "supply","P"]
P_d <- ds[ds$Type == "demand","P"]
Q_s <- ds[ds$Type == "supply","Q"]
Q_d <- ds[ds$Type == "demand","Q"]
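# Objective: minimise accepted supply cost minus accepted demand value
c_vec <- c(P_s,-P_d)
# One constraint per bid: accepted quantity <= offered quantity
A_mat <- diag(length(c_vec))
b_vec <- c(Q_s,Q_d)
dir_1 <- rep("<=",length(b_vec))
# Balance constraint: total accepted supply equals total accepted demand
A2_mat <- c(rep(1,length(Q_s)),rep(-1,length(Q_d)))
b2_vec <- 0
A_mat <- rbind(A_mat,A2_mat)
b_vec <- c(b_vec,b2_vec)
dir_1 <- c(dir_1,"=")
library(lpSolve)
sol <- lp("min", c_vec, A_mat, dir_1, b_vec, compute.sens=TRUE)
price_mc <- sol$duals[nrow(ds) + 1] # dual of the balance constraint, which corresponds to the price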
In your example, the market-clearing price is $2.
I create an optimal portfolio using optimize.portfolio with the DEoptim solver (a maxDrawdown risk objective with target -0.1 and a maximum-return objective). But when I plot the optimized portfolio object, the portfolio does not appear to be optimal: there are portfolios with the same drawdown and better returns. Could someone explain?
library('PortfolioAnalytics')
library('PerformanceAnalytics')
library('DEoptim')
data(edhec)
ret <- edhec[, 1:10]
init.portf <- portfolio.spec(assets=colnames(ret))
init.portf <- add.constraint(portfolio=init.portf, type="full_investment")
init.portf <- add.constraint(portfolio=init.portf, type="long_only")
group_list <- list(group1=c(3),
group2=c(1, 2),
group3=c(5, 7, 8))
init.portf <- add.constraint(portfolio= init.portf,
type="group",
groups=group_list,
group_min=c(0.03, 0, 0),
group_max=c(0.032, 0.2, 0.3))
ret.obj.portf <- add.objective(portfolio=init.portf, type="return",
name="mean")
ret.obj.portf <- add.objective(portfolio = ret.obj.portf,
type = 'risk',
name = 'maxDrawdown',
arguments = list(inverse=TRUE),
target = -0.1)
ret.obj.portf$constraints[[1]]$min_sum <- 0.99
ret.obj.portf$constraints[[1]]$max_sum <- 1.01
ret.obj.portf
set.seed(123)
opt.obj.no1.1 <- optimize.portfolio(R=ret, portfolio=ret.obj.portf,
optimize_method="DEoptim", search_size=2000, trace=TRUE)
opt.obj.no1.1
chart.RiskReward(opt.obj.no1.1,
main = 'Optimized Portfolio: max return and 10% maxDD, all investments allowed',
return.col = "mean", risk.col = 'maxDrawdown')
I need help with my syntax.
library(e1071) # skewness() and kurtosis()
library(ltm) # rmvlogis()
priori <- function (I, N, M) {
a <- as.matrix(runif(I, min = 0.65, max = 1.70))
b <- as.matrix(runif(I, min = -2.80, max = 2.80))
c <- as.matrix(runif(I, min = 0.00, max = 0.35))
k <- c(rnorm(N*20/100, 0, 1), rnorm(N*80/100,0, 0.01))
M <- cbind(b,a,c)
data <- as.data.frame(rmvlogis(N, M, IRT = FALSE, link = "logit", z.vals = k))
print(data)}
This is the code that generates the data.
priori.list <- vector("list", 3)
names(priori.list) <- paste0("L", seq_along(priori.list))
priori.sum.list <- vector("list", 3)
for (i in 1:3) {
for (j in 1:100) {
priori.list$L1[[j]] <- priori(10,100, M="2PL")
priori.list$L2[[j]] <- priori(20,500, M="2PL")
priori.list$L3[[j]] <- priori(40,1000,M="3PL")
priori.sum.list [[i]][[j]] <- rowSums(priori.list[[i]][[j]])
print(kurtosis(priori.sum.list[[i]][[j]]))
if(skewness(priori.sum.list[[i]][[j]])>=-1 | skewness(priori.sum.list[[i]][[j]]>=1)
& kurtosis(priori.sum.list[[i]][[j]])>=-1 | kurtosis(priori.sum.list[[i]][[j]]>=1))
{NA}
else
{return(j=j-1)}}}
Then I build a list of data sets from this code. I want to create a loop based on the skewness and kurtosis coefficients, but I couldn't get it to work. My goal: if the skewness and kurtosis coefficients fall outside the range -1 to 1, save the data set to the list; if not, regenerate it. Can you help me with the correct syntax? Thank you.
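One way to read that requirement is as a rejection loop: keep generating until a data set passes the check, then store it. The sketch below is only an illustration under several assumptions: make_scores is a hypothetical stand-in for rowSums(priori(...)), the moments are computed with base R formulas (they match e1071's skewness and kurtosis only up to a small-sample factor), and "outside" is taken to mean that both coefficients have absolute value above 1.

```r
# Sample skewness and excess kurtosis from central moments (base R only)
skew <- function(x) { d <- x - mean(x); mean(d^3) / mean(d^2)^1.5 }
kurt <- function(x) { d <- x - mean(x); mean(d^4) / mean(d^2)^2 - 3 }

# Hypothetical stand-in for rowSums(priori(...)): returns one vector of scores
make_scores <- function(n) rexp(n)

# Draw repeatedly until both coefficients lie outside [-1, 1], or give up
draw_until_accepted <- function(generate, max_tries = 1000) {
  for (attempt in seq_len(max_tries)) {
    scores <- generate()
    if (abs(skew(scores)) > 1 && abs(kurt(scores)) > 1) {
      return(scores)
    }
  }
  stop("no acceptable data set within max_tries")
}

set.seed(42)
# 100 accepted replications for one condition
accepted <- lapply(1:100, function(j) {
  draw_until_accepted(function() make_scores(200))
})
```

Putting the retry inside a helper avoids decrementing the loop counter, which a for loop in R ignores anyway, and the max_tries guard keeps the loop from running forever if the condition is rarely met.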
I am fitting a mixture model to estimate the average of a trait in each of 3 populations.
I have a label switching issue and I am trying to compute the distance between the observed and expected numbers of individuals of each genotype in each population to relabel population clusters. Below is a reproducible example.
For some reason, JAGS does not seem to compute the squared distances properly. The corresponding line in the code below is: dist[k,j] <- pow(DistNumPerClust[k,j], 2)
Hence, the output matrix results$mean$dist is different from the matrix, results$mean$DistNumPerClust^2, computed a posteriori.
Would anyone know a way to solve this?
library(R2jags)
library(runjags)
library(dirmult)
set.seed(123)
############################
## Simulation of the data ##
############################
npop=3
ngeno=2
freqbalance=1
nsamplesizeperpop <- 100
freqMLG <- t(rdirichlet(n=npop, alpha=rep(freqbalance, ngeno)))
samplesizegenoperpop <- sweep(freqMLG, 1, nsamplesizeperpop, "*")
## Compute membership (probability that a genotype comes from pop 1, 2 or 3)
## Genotype as rows and populations as columns
membership <- sweep(freqMLG, 1, rowSums(freqMLG), "/")
# Parameters for simulations
nind=90
N = npop*nind # nb of observations
clust <- rep(1:npop, each=N/npop)
geno <- c()
for (i in 1:N){
geno <- c(geno, sum(rmultinom(n=1, size=1, prob=freqMLG[, clust[i]])*1:ngeno))
}
numgeno <- as.numeric(table(geno))
## Multiply membership probabilities by sample size for each genotype
ExpNumPerClust <- sweep(membership, 1, numgeno, "*")
muOfClustsim <- c(1, 20, 50) # vector of population means
sigma <- 1.5 # residual sd
(tausim <- 1/(sigma*sigma)) # precision
# parameters are treated as data for the simulation step
data <- list(N=N, npop=npop, ngeno=ngeno, geno=geno, muOfClustsim=muOfClustsim, tausim=tausim, samplesizegenoperpop=samplesizegenoperpop)
## JAGS model
txtstring <- "
data{
# Likelihood:
for (i in 1:N){
ysim[i] ~ dnorm(eta[i], tausim) # tau is precision (1 / variance)
eta[i] <- muOfClustsim[clust[i]]
clust[i] ~ dcat( pClust[geno[i], 1:npop] )
}
for (k in 1:ngeno){
pClust[k, 1:npop] ~ ddirch( samplesizegenoperpop[k,] )
}
}
model{
fake <- 0
}
"
# Simulate with jags
out <- run.jags(txtstring, data = data, monitor=c("ysim"), sample=1, n.chains=1, summarise=FALSE)
# reformat the outputs
ysim <- coda::as.mcmc(out)[1:N]
## Estimation model
bayes.mod <- function(){
# Likelihood:
for (i in 1:N){
ysim[i] ~ dnorm(eta[i], tau) # tau is precision (1 / variance)
eta[i] <- beta[clust[i]]
clust[i] ~ dcat( pClust[geno[i], 1:npop] )
}
for (k in 1:ngeno){
## pClust membership estimates
pClust[k, 1:npop] ~ ddirch( samplesizegenoperpop[k,] )
}
for (k in 1:ngeno){
for (j in 1:npop){
# problem of label switching: try to compute the distance between ObsNumPerClust and ExpNumPerClust (i.e. between observed and expected number of individuals of each genotype in each population)
ObsNumPerClust[k,j] <- pClust[k, j] * numgeno[k]
DistNumPerClust[k,j] <- ObsNumPerClust[k,j] - ExpNumPerClust[k,j]
dist[k,j] <- pow(DistNumPerClust[k,j], 2)
}
}
# Priors
beta ~ dmnorm(mu, sigma.inv)
mu ~ dmnorm(m, V)
sigma.inv ~ dwish(R, K)
tau ~ dgamma(0.01, 0.01)
# parameters transformations
sig <- sqrt(1/ tau)
}
m = rep(1, npop)
V = diag(rep(0.01, npop))
R = diag(rep(0.1, npop))
K = npop
## Input variables
sim.dat.jags<-list("ysim","N","npop", "ngeno", "geno","m","V","R", "K", "samplesizegenoperpop","numgeno","ExpNumPerClust")
## Variables to monitor
bayes.mod.params <- c("beta","tau","sig","DistNumPerClust","dist")
## Starting values
init1 <- list(beta = c(0, 100, 1000), tau = 1)
bayes.mod.inits <- list(init1)
## Run model
bayes.mod.fit<-jags(data = sim.dat.jags, inits = bayes.mod.inits, parameters.to.save = bayes.mod.params, n.chains=1, n.iter=101000, n.burnin=1000, n.thin=200, model.file = bayes.mod)
results <- print(bayes.mod.fit)
results$mean$dist
results$mean$DistNumPerClust^2
It seems that you expect that the mean of a transformed set of values will give the same result as transforming the mean of the same set of values. But this is not the case - for example:
values <- c(1,2,3,6,8,20)
mean(values)^2
mean(values^2)
These are not the same thing.
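The gap between the two is the population variance of the values, so they agree only when every value is identical. A quick base R check on the same vector (the variable names are illustrative):

```r
values <- c(1, 2, 3, 6, 8, 20)

square_of_mean  <- mean(values)^2   # transform the summary
mean_of_squares <- mean(values^2)   # summarise the transformed values

# Population variance (divides by n, unlike var(), which divides by n - 1)
pop_var <- mean((values - mean(values))^2)

# Identity: mean(x^2) = mean(x)^2 + population variance
all.equal(mean_of_squares, square_of_mean + pop_var)
```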
The equivalent is happening in your model - you calculate dist[k,j] as the square of DistNumPerClust[k,j] and then summarise to a mean of dist, and expect this to be the same as the square of the mean of DistNumPerClust[k,j]. Or in a simpler example:
library('runjags')
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)
model <- "model {
for(i in 1 : N){
Y[i] ~ dnorm(true.y[i], precision);
true.y[i] <- (m * X[i]) + c
}
m ~ dunif(-1000,1000)
c ~ dunif(-1000,1000)
precision ~ dexp(1)
p2 <- precision^2
}"
data <- list(X=X, Y=Y, N=length(X))
results <- run.jags(model=model, monitor=c("m", "c", "precision", "p2"),
data=data, n.chains=2)
results
More specifically, these should not be expected to be the same:
summary(results)['p2','Mean']
summary(results)['precision','Mean']^2
If you want to calculate the same thing you can extract the full chain of values as an MCMC object and do your transformation on these:
p <- combine.mcmc(results,vars='precision')
p2 <- combine.mcmc(results,vars='p2')
mean(p^2)
mean(p2)
mean(p)
mean(sqrt(p2))
Now everything is equivalent.
Matt
I have a double loop that I not only dislike, but that would take 14 days to run on my computer, since it goes over 3200 records and 1090 variables at about 0.12 seconds per iteration.
Here is a smaller reproducible example. It simply counts how many columns hold the same number in two records, ignoring NAs, and then attaches the results to the original data frame.
y <- data.frame(c(1,2,1,NA,NA),c(3,3,3,4,NA),c(5,4,5,7,7),c(7,8,7,9,10))
resultdf <- NULL
for(i in 1:nrow(y))
{
results <- NULL
for(j in 1:nrow(y))
{
results <- c(results,sum((y[i,]==y[j,]),na.rm=TRUE))
}
resultdf <- cbind(resultdf,results)
}
y <- cbind(y,resultdf)
Some of the calculations are repeated (each pair of records is compared twice); avoiding that would still leave about 7 days.
If I understand correctly, some of the apply functions are implemented in C and might be faster, but I haven't been able to get any of them to work. I'm also curious whether there is a package that would run faster. Can anyone help speed up the calculation?
Thank you!
I have created data to your specifications, following @BenBolker's suggestion to use a matrix:
> y <- matrix(sample(c(1:9, NA), 3200 * 1090, replace = TRUE),
+ nrow = 3200, ncol = 1090)
and compared the computation times for three different implementations:
f1 was suggested by @Andrei:
> f1 <- function(y)apply(y, 1, function(r1)
+ apply(y, 1, function(r2)sum(r1==r2, na.rm=TRUE)))
> system.time(r1 <- f1(y))
user system elapsed
523.51 0.77 528.73
f2 was suggested by @VincentZoonekynd:
> f2 <- function(y) {
+ f <- function(i,j) sum(y[i,] == y[j,], na.rm=TRUE)
+ d <- outer( 1:nrow(y), 1:nrow(y), Vectorize(f) )
+ return(d)
+ }
> system.time(r2 <- f2(y))
user system elapsed
658.94 1.96 710.67
f3 is a double loop over the upper triangle, as suggested by @BenBolker. It is also a bit more efficient than the code in your question in that it pre-allocates the output matrix:
> f3 <- function(y) {
+ result <- matrix(NA, nrow(y), nrow(y))
+ for (i in 1:nrow(y)) {
+ row1 <- y[i, ]
+ for (j in i:nrow(y)) {
+ row2 <- y[j, ]
+ num.matches <- sum(row1 == row2, na.rm = TRUE)
+ result[i, j] <- num.matches
+ result[j, i] <- num.matches
+ }
+ }
+ return(result)
+ }
> system.time(r3 <- f3(y))
user system elapsed
167.66 0.08 168.72
So the double loop is the fastest of all three, although not as elegant and compact as the other two answers.
Here is another solution, using outer.
f <- function(i,j) sum(y[i,] == y[j,], na.rm=TRUE)
d <- outer( 1:nrow(y), 1:nrow(y), Vectorize(f) )
Indeed, you can use apply function. Given the earlier hint that a matrix works faster, I would try:
ym <- as.matrix(y)
resultdf <- apply(ym, 1, function(r1) apply(ym, 1, function(r2) sum(r1==r2, na.rm=TRUE)))
You can get rid of the inner loop (using the y and f3 from @flodel's answer):
ty <- t(y)
ix <- rep(1:nrow(y),each = ncol(y))
f4 <- function(y){
result <- matrix(0L, nrow(y), nrow(y))
for(r in 1:nrow(y))
result[r,] <- rowsum(as.numeric(ty == y[r,]), ix, na.rm = TRUE)
result
}
> system.time(out <- f4(y))
user system elapsed
52.616 21.061 74.000
> system.time(out <- f3(y))
user system elapsed
244.751 0.136 244.954
It actually does the extra work of computing everything twice, but it is still about 3 times faster in elapsed time (nearly 5 times in user time). You can make it yet another 4 times faster by using the inner workings of rowsum. See this question for an example.
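A further option, not benchmarked in this thread, is to drop the row loops altogether and use matrix products. The sketch below assumes the data take only a modest number of distinct values (as in the simulated y, which holds 1 to 9); for each value it builds a 0/1 indicator matrix and adds its cross-product, which counts the columns where two records share that value. The name count_matches is made up for this example.

```r
count_matches <- function(y) {
  y <- as.matrix(y)
  result <- matrix(0, nrow(y), nrow(y))
  for (v in unique(y[!is.na(y)])) {
    # 1 where the cell equals v, 0 otherwise (NA cells count as 0)
    ind <- (!is.na(y) & y == v) * 1
    # Entry [i, j] of ind %*% t(ind): columns where rows i and j both equal v
    result <- result + tcrossprod(ind)
  }
  result
}

# Small example from the question
y_small <- data.frame(c(1, 2, 1, NA, NA), c(3, 3, 3, 4, NA),
                      c(5, 4, 5, 7, 7), c(7, 8, 7, 9, 10))
res <- count_matches(y_small)
res
```

The cost is one matrix product per distinct value, so this pays off when the number of distinct values is small compared with the number of records; with many distinct values the loop-based versions may be the better choice.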
I was wondering if anyone could kindly help me with this seemingly easy task. I'm using nlminb to conduct optimization and compute some statistics by index. Here's an example from nlminb help.
> x <- rnbinom(100, mu = 10, size = 10)
> hdev <- function(par) {
+ -sum(dnbinom(x, mu = par[1], size = par[2], log = TRUE))
+ }
> nlminb(c(9, 12), hdev)
$par
[1] 9.730000 5.954936
$objective
[1] 297.2074
$convergence
[1] 0
$message
[1] "relative convergence (4)"
$iterations
[1] 10
$evaluations
function gradient
12 27
Suppose I generate random variables x, y, and z where z acts as an index (from 1 to 3).
> x <- rnbinom(100, mu = 10, size = 10)
> y <- rnbinom(100, mu = 10, size = 10)
> z <- rep(1:3, length=100)
> A <- cbind(x,y,z)
> hdev <- function(par) {
+ -sum(dnbinom(x+y, mu = par[1], size = par[2], log = TRUE))}
How can I apply nlminb(c(9, 12), hdev) to the data set by index z? In other words, I would like to compute nlminb(c(9, 12), hdev) for z=1, z=2, and z=3 separately. I tried by(A, z, function(A) nlminb(c(9,12), hdev)) and sparseby(A, z, function(A) nlminb(c(9,12), hdev)), but they return exactly the same values for each value of z.
I would like to turn each output into a new data frame so that it will become a 3X2 matrix.
[1] Z1_ANSWER_1 Z1_ANSWER_2
[2] Z2_ANSWER_1 Z2_ANSWER_2
[3] Z3_ANSWER_1 Z3_ANSWER_2
Since nlminb returns the summary of statistics, I needed to use CASEZ1<-nlminb$par, CASEZ2<-nlminb$par, CASEZ3<-nlminb$par and then use cbind to combine them. However, I would like to automate this process as the real data I'm working on has a lot more categories than z presented here.
If I'm not making myself clear, please let me know. I'll see if I can replicate the actual data set and functions I'm working on (I just don't have them on this computer).
Thank you very much in advance.
Let me try an approach
x <- rnbinom(100, mu = 10, size = 10)
y <- rnbinom(100, mu = 10, size = 10)
z <- rep(1:3, length=100)
A <- as.data.frame(cbind(x,y,z))
First, load the plyr library:
library(plyr)
The following code returns the results for each z
dlply(A, .(z), function(d) {
hdev <- function(par, mydata) {-sum(dnbinom(mydata, mu = par[1], size = par[2], log = TRUE))}
nlminb(c(9, 12), hdev, mydata = d$x + d$y)
}
)
Now, with this one you will get a 3x2 dataframe with the $par results
ddply(A, .(z), function(d) {
hdev <- function(par, mydata) {-sum(dnbinom(mydata, mu = par[1], size = par[2], log = TRUE))}
res <- nlminb(c(9, 12), hdev, mydata = d$x + d$y)
return(res$par)
}
)
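If you would rather not depend on plyr, the same split-apply-combine step can be sketched in base R. This is only an illustration: the names counts, fits and par_table are made up, and the lower bounds are an added assumption to keep both parameters positive during the search.

```r
set.seed(1)
x <- rnbinom(100, mu = 10, size = 10)
y <- rnbinom(100, mu = 10, size = 10)
z <- rep(1:3, length.out = 100)
A <- data.frame(x, y, z)

# Negative log-likelihood; the data are passed in explicitly
hdev <- function(par, counts) {
  -sum(dnbinom(counts, mu = par[1], size = par[2], log = TRUE))
}

# One nlminb fit per level of z; extra arguments are forwarded to hdev
fits <- lapply(split(A, A$z), function(d) {
  nlminb(c(9, 12), hdev, counts = d$x + d$y, lower = c(1e-6, 1e-6))
})

# Stack the parameter estimates: one row per z, one column per parameter
par_table <- do.call(rbind, lapply(fits, function(f) f$par))
par_table
```

The original attempts returned identical values for every z because hdev read x and y from the global environment; passing the subset in as an argument is what makes each fit use its own group.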