WinBUGS: adding time and product fixed effects in hierarchical panel data

I am working with hierarchical panel data in WinBUGS. Assume data on school performance, logs, with independent variables logp and rank. All schools are divided into three categories (cat), and I need a beta coefficient for each category (hence the HLM). I want to account for time-specific and school-specific effects in the model. One way would be to add dummy variables to the list of variables under mu[i], but that would get messy because I have up to 60 schools. I am sure there must be a better way to handle that.
My data looks like the following:
school time logs logp cat rank
1 1 4.2 8.9 1 1
1 2 4.2 8.1 1 2
1 3 3.5 9.2 1 1
2 1 4.1 7.5 1 2
2 2 4.5 6.5 1 2
3 1 5.1 6.6 2 4
3 2 6.2 6.8 3 7
#logs = log(score)
#logp = log(average hours of inputs)
#rank - rank of school
#cat = section red, section blue, section white in school (hierarchies)
My WinBUGS code is given below.
model {
  # N observations
  for (i in 1:n) {
    logs[i] ~ dnorm(mu[i], tau)
    mu[i] <- bcons + bprice*(logp[i])
             + brank[cat[i]]*(rank[i])
  }
  # C categories
  for (c in 1:C) {
    brank[c] ~ dnorm(beta, taub)
  }
  # priors
  bcons ~ dnorm(0,1.0E-6)
  bprice ~ dnorm(0,1.0E-6)
  # bad is not used in mu[i] above
  bad ~ dnorm(0,1.0E-6)
  beta ~ dnorm(0,1.0E-6)
  tau ~ dgamma(0.001,0.001)
  taub ~ dgamma(0.001,0.001)
}
As you can see in the data sample above, I have multiple observations per school over time. How can I modify the code to account for time-specific and school-specific fixed effects? I have used Stata in the past, where the fe, be and i.time options take care of fixed effects in panel data. But here I am lost.
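One way to picture the alternative to 60 dummy variables is nested indexing, the same device the model already uses for brank[cat[i]]: keep one effect per school and one per time period, and look them up by the school and time index of each observation. The Python sketch below only illustrates that lookup on the sample rows. The names a_school and g_time, the parameter values, and the choice to fix the first school and first period at zero as a baseline are all assumptions made for illustration; none of it comes from the original model, and it is not tested WinBUGS code.

```python
# Sample rows in the layout shown above:
# (school, time, logs, logp, cat, rank); indices are 1-based, as in BUGS.
rows = [
    (1, 1, 4.2, 8.9, 1, 1),
    (1, 2, 4.2, 8.1, 1, 2),
    (1, 3, 3.5, 9.2, 1, 1),
    (2, 1, 4.1, 7.5, 1, 2),
    (2, 2, 4.5, 6.5, 1, 2),
    (3, 1, 5.1, 6.6, 2, 4),
    (3, 2, 6.2, 6.8, 3, 7),
]

# Hypothetical parameter values, chosen only to make the arithmetic visible.
bcons = 1.0
bprice = 0.5
brank = [0.1, 0.2, 0.3]        # one slope per category
a_school = [0.0, 0.4, -0.2]    # one effect per school (school 1 = baseline, assumed)
g_time = [0.0, 0.3, -0.1]      # one effect per period (period 1 = baseline, assumed)


def linear_predictor(row):
    """Mean for one observation: the effects are looked up by index, not by dummies."""
    school, time, _logs, logp, cat, rank = row
    return (bcons
            + bprice * logp
            + brank[cat - 1] * rank
            + a_school[school - 1]
            + g_time[time - 1])


mu = [linear_predictor(r) for r in rows]
```

The point of the design is that the number of terms in the mean stays fixed however many schools there are; only the length of the effect vectors grows.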

Related

Invalid syntax loop in Stata

I'm trying to run a for loop to make a balance table in Stata (comparing the demographics of my dataset with national-level statistics).
For this, I'm prepping my dataset and attempting to calculate the percentages/averages for some key demographics.
preserve
rename unearnedinc_wins95 unearninc_wins95
foreach var of varlist fem age nonwhite hhsize parent employed savings_wins95 debt_wins95 earnedinc_wins95 unearninc_wins95 underfpl2019 { //continuous or binary; to put categorical vars use kwallis test
dis "for variable `var':"
tabstat `var'
summ `var'
local `var'_samplemean=r(mean)
}
clear
set obs 11
gen var=""
gen sample=.
gen F=.
gen pvalue=.
replace var="% Female" if _n==1
replace var="Age" if _n==2
replace var="% Non-white" if _n==3
replace var="HH size" if _n==4
replace var="% Parent" if _n==5
replace var="% Employed" if _n==6
replace var="Savings stock ($)" if _n==7
replace var="Debt stock ($)" if _n==8
replace var="Earned income last mo. ($)" if _n==9
replace var="Unearned income last mo. ($)" if _n==10
replace var="% Under FPL 2019" if _n==11
foreach col of varlist sample {
replace `col'=100*round(`fem_`col'mean', 0.01) if _n==1
replace `col'=round(`age_`col'mean') if _n==2
replace `col'=100*round(`nonwhite_`col'mean', 0.01) if _n==3
replace `col'=round(`hhsize_`col'mean', 0.1) if _n==4
replace `col'=100*round(`parent_`col'mean', 0.01) if _n==5
replace `col'=100*round(`employed_`col'mean', 0.01) if _n==6
replace `col'=round(`savings_wins95_`col'mean') if _n==7
replace `col'=round(`debt_wins95_`col'mean') if _n==8
replace `col'=round(`earnedinc_wins95_`col'mean') if _n==9
replace `col'=round(`unearninc_wins95_`col'mean') if _n==10
replace `col'=100*round(`underfpl2019_`col'mean', 0.01) if _n==11
}
I'm trying to run the loop above, but in the second half of the code I keep getting an 'invalid syntax' error. For context, in the first half (before clearing the dataset), the code stores the mean of each variable in a local macro (`var'_samplemean). Can someone help me fix this loop?
My sample data:
clear
input byte fem float(age nonwhite) byte(hhsize parent) float employed double(savings_wins95 debt_wins95 earnedinc_wins95 unearninc_wins95) float underfpl2019
1 35 1 6 1 1 0 2500 0 0 0
0 40 0 4 1 1 0 10000 1043 0 0
0 40 0 4 1 1 0 20000 2400 0 0
0 40 0 4 1 1 .24 20000 2000 0 0
0 40 0 4 1 1 10 . 2600 0 0
end
Thanks!
Thanks for sharing the snippet of data. Apart from the fact the variable unearninc_wins95 has already been renamed in your sample data, the code runs fine for me without returning an error.
That being said, the columns for your F-statistics and p-values are empty once the loop at the bottom of your code completes. As far as I can see there is no local/varlist called sample which you're attempting to call with the line foreach col of varlist sample{. This could be because you haven't included it in your code, in which case please do, or it could be because you haven't created the local/varlist sample, in which case this could well be the source of your error message.
Taking a step back, there are more efficient ways of achieving what I think you're after. For example, you can get (part of) what you want using the package stat2data (if you don't have it installed already, run ssc install stat2data from the command prompt). You can then run the following code:
stat2data fem age nonwhite hhsize parent employed savings_wins95 debt_wins95 earnedinc_wins95 unearninc_wins95 underfpl2019, saving("~/yourstats.dta") stat(count mean)
*which returns:
preserve
use "~/yourstats.dta", clear
. list, sep(11)

     +-----------------------------+
     | _name          sN     smean |
     |-----------------------------|
  1. | fem             5        .2 |
  2. | age             5        39 |
  3. | nonwhite        5        .2 |
  4. | hhsize          5       4.4 |
  5. | parent          5         1 |
  6. | employed        5         1 |
  7. | savings_wins    5     2.048 |
  8. | debt_wins95     4     13125 |
  9. | earnedinc_wi    5    1608.6 |
 10. | unearninc_wi    5         0 |
 11. | underfpl2019    5         0 |
     +-----------------------------+
restore
This is missing the empty F-statistic and p-value variables you created in your code above, but you can always add them in the same way you have with gen F=. and gen pvalue=.. The presence of these variables though indicates you want to run some tests at some point and then fill the cells with values from them. I'd offer advice on how to do this but it's not obvious to me from your code what you want to test. If you can clarify this I will try and edit this answer to include that.
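For readers who want to check the numbers in the listing, the count and mean per variable can be reproduced from the five sample rows. The Python sketch below is only an illustration of that arithmetic, not part of stat2data; the helper name count_and_mean is made up, and missing values (the "." in the Stata input) are represented as None and skipped, which is why debt_wins95 has a count of 4.

```python
# The five sample observations from the question; None stands for Stata's "." (missing).
columns = ["fem", "age", "nonwhite", "hhsize", "parent", "employed",
           "savings_wins95", "debt_wins95", "earnedinc_wins95",
           "unearninc_wins95", "underfpl2019"]
data = [
    (1, 35, 1, 6, 1, 1, 0,    2500,  0,    0, 0),
    (0, 40, 0, 4, 1, 1, 0,    10000, 1043, 0, 0),
    (0, 40, 0, 4, 1, 1, 0,    20000, 2400, 0, 0),
    (0, 40, 0, 4, 1, 1, 0.24, 20000, 2000, 0, 0),
    (0, 40, 0, 4, 1, 1, 10,   None,  2600, 0, 0),
]


def count_and_mean(values):
    """Return (number of non-missing values, their mean), skipping None."""
    present = [v for v in values if v is not None]
    return len(present), sum(present) / len(present)


# One (count, mean) pair per variable, keyed by variable name.
stats = {
    name: count_and_mean([row[j] for row in data])
    for j, name in enumerate(columns)
}

for name in columns:
    n, mean = stats[name]
    print(f"{name:<18}{n:>3}{mean:>10.3f}")
```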
This doesn't answer your question directly; as others gently point out the question is hard to answer without a reproducible example. But I have several small comments on your code which are better presented in this form.
Assuming that all the variables needed are indeed present in the dataset, I would recommend something more like this:
local myvarlist fem age nonwhite hhsize parent employed savings_wins95 debt_wins95 earnedinc_wins95 unearninc_wins95 underfpl2019
local desc `" "% Female" "Age" "% Non-white" "HH size" "% Parent" "% Employed" "Savings stock ($)" "Debt stock ($)" "Earned income last mo. ($)" "Unearned income last mo. ($)" "% Under FPL 2019" "'
gen variable = ""
gen mean = ""
local i = 1
// loop over the same local that was defined above
foreach var of local myvarlist {
    summ `var', meanonly
    local this : word `i' of `desc'
    replace variable = "`this'" in `i'
    // proportions are shown as percentages
    if inlist(`i', 1, 3, 5, 6, 11) {
        replace mean = strofreal(100 * r(mean), "%2.0f") in `i'
    }
    // household size is shown to one decimal place
    else if `i' == 4 {
        replace mean = strofreal(r(mean), "%2.1f") in `i'
    }
    else replace mean = strofreal(r(mean), "%2.0f") in `i'
    local ++i
}
This has not been tested.
Points arising include:
Using in is preferable for what you want over testing the observation number with if.
round() is treacherous for rounding to so many decimal places. Most of the time you will get what you want, but occasionally you will get bizarre results arising from the fact that Stata works in binary, like any equivalent program. It is safer to treat rounding as a problem in string manipulation and use display formats as offering precisely what you want.
If the text you want to show is just the variable label for each variable, this code could be simplified further.
The code hints at intent to show other stuff, which is easily done compatibly with this design.
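The point about round() is a general property of binary floating point rather than anything specific to Stata, so it can be shown in any language. The Python sketch below uses a hypothetical helper, round_to_unit, that rounds to a multiple of a unit by plain arithmetic; it is an assumed stand-in for illustration and says nothing about how Stata implements round() internally. The contrast is between the numeric result, which can carry binary noise, and a formatted string, which shows exactly the requested digits.

```python
def round_to_unit(x, unit):
    """Round x to the nearest multiple of unit using ordinary float arithmetic."""
    return round(x / unit) * unit


# Numeric rounding: 0.1 has no exact binary representation,
# so the "rounded" result is not exactly 0.3.
value = round_to_unit(0.3, 0.1)
as_number = repr(value)          # '0.30000000000000004'

# Formatting: treat the problem as string output with a display format.
as_text = format(0.3, ".1f")     # '0.3'

print(as_number)
print(as_text)
```

This mirrors the suggestion above of using strofreal() with a display format instead of storing rounded numbers.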

Laravel 5.7 Database Design Layout / Average from Collection

I have a situation where each Order can have Feedback. If the product is physical, the Feedback can have many packaging_feedbacks. Each row in packaging_feedbacks is supposed to relate to a row in packaging_feedback_details.
Feedback Model
public function packagingFeedbacks()
{
return $this->hasManyThrough('App\PackagingFeedbackDetail', 'App\PackagingFeedback',
'feedback_id', 'id', 'id', 'user_selection');
}
packaging_feedback_details
id|type_id(used to group the "names" for each feedback option)|name
1 0 well packed
2 0 bad packaging
3 1 fast shipping
4 1 express delivery
packaging_feedbacks
id|feedback_id|user_selection (pointing to the ID of packaging_feedback_details)
1 1 2
2 1 6
3 1 7
4 1 12
5 1 15
6 1 17
7 2 1
8 2 6
9 2 7
10 2 12
11 2 15
12 2 17
13 3 1
14 3 6
15 3 7
16 3 12
17 3 15
18 3 17
Now I would like to be able to get the average selection of the users for a physical product. I started by using:
$result = Product::with('userFeedbacks.packagingFeedbacks')->where('id', 1)->first();
$collection = collect();
foreach ($result->userFeedbacks as $key) {
foreach ($key->packagingFeedbacks as $skey) {
$collection->push($skey);
}
}
foreach ($collection->groupBy('type_id') as $key) {
echo($key->average('type_id'));
}
But this does not return what I need, because average() does not compute the kind of average I am after. Is there a better way? I don't think my approach is the cleverest one. And is my database design, in general, the "best" way to handle this?
The type of average you're looking for here is the mode. Laravel's collection instances have a mode() method, introduced in 5.2, which, when provided with a key, returns an array containing the most frequently occurring value(s) for that key.
If I have understood your question correctly this should give you what you're after:
$result->userFeedbacks
->flatMap->packagingFeedbacks
->groupBy('type_id')
->map->mode('id');
The above takes advantage of flatMap() and higher order messages on collections.
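To make the "group, then take the mode" idea concrete outside Laravel, here is a small Python sketch of the same two steps. The selections list is hypothetical data in the shape of packaging_feedback_details rows (detail id, type_id), and mode_by_group is a made-up helper name; like the collection method described above, it returns a list per group because several values can tie for most frequent.

```python
from collections import Counter, defaultdict

# Hypothetical user selections as (detail id, type_id) pairs:
# type 0 is the packaging group, type 1 the shipping group.
selections = [(1, 0), (2, 0), (1, 0), (3, 1), (4, 1), (4, 1)]


def mode_by_group(pairs):
    """Group detail ids by type_id and return the most frequent id(s) per group."""
    groups = defaultdict(list)
    for detail_id, type_id in pairs:
        groups[type_id].append(detail_id)

    result = {}
    for type_id, ids in groups.items():
        counts = Counter(ids)
        top = max(counts.values())
        # Keep every id that reaches the top count, so ties are not lost.
        result[type_id] = sorted(i for i, n in counts.items() if n == top)
    return result


most_common = mode_by_group(selections)
print(most_common)
```

With this data, id 1 wins in group 0 and id 4 wins in group 1.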

GRASS 7.4: r.cross unexpectedly produces category zero

I have a problem with the output of r.cross. I hope you can follow my description without MWE:
I have 3 rasters I want to cross with the following characteristics:
GRASS 7.4.0 (Bengue):~ > r.stats soil_t,lcov,watermask -N
100%
4 8 0
4 8 1
4 9 0
[...]
I would expect r.cross to create a raster with a category for each line shown above. However, I get the following:
GRASS 7.4.0 (Bengue):~ > r.cross input=soil_t,lcov,watermask output=svc
GRASS 7.4.0 (Bengue):~ > r.category svc
0
1 category 4; category 8; category 1
2 category 4; category 9; category 0
[...]
Why is the first line just zero when one would rather expect something like: 1 category 4; category 8; category 0?
EDIT: Just noticed that under GRASS version 6.4 it runs as expected:
GRASS 6.4.6 (Bengue):~ > r.category svc
0
1 category 4; category 8; category 0
2 category 4; category 8; category 1
3 category 4; category 9; category 0
So, something must be wrong with the 7.4 version of r.cross?!
Thanks for your help!
System infos:
GRASS version 7.4.0
Ubuntu MATE 16.04 (xenial)
Just in case somebody comes across this post: the same question was asked on the mailing list shortly afterwards by somebody else: https://lists.osgeo.org/pipermail/grass-user/2018-February/077934.html. It appears to be a bug that is not yet fixed in the latest release version of GRASS.

Stata: Deleting duplicates based on dates

My dataset consists of a number of variables:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(v1 v2) str11 Date float(v4 v5 v6 v7 v8)
1 2 "15-aug-2016" 1 1 1 1 1
1 2 "07-may-2015" 1 1 1 1 50
1 2 "07-may-2015" 1 1 1 1 88
1 2 "15-aug-2016" 1 1 1 1 29
end
The variable Date holds the date as a string; I convert it to a numeric date with:
generate double date = date(Date,"DMY")
My duplicates are the same for v1-v2-v4-v5-v6-v7 (as in the example), while v8 is different.
I need to delete duplicates based on v1-v2-v4-v5-v6-v7 and keep the one with the smallest date (here 07-may-2015).
I have tried without success:
1.
gsort -date
bysort v1 v2 v4 v5 v6 v7: generate dublet=_n
order dublet date
keep if dublet==1
drop dublet
--> Works for the first 25 rows or so, then keeps the wrong one a couple of times and then the right one again. (It seems to me that the bysort command undoes the sort done by gsort. Does anyone know if that's correct?)
2.
bysort v1 v2 v4 v5 v6 v7 (date) : keep if _n == _N
--> Obviously keeps the wrong one, since the sort is on date, not -date.
However, -date is not an option; Stata reports: invalid name
You could change your second attempt to bysort v1 v2 v4 v5 v6 v7 (date) : keep if _n == 1 and that should give you what you're looking for.
Since your data example contains duplicate dates (two observations are from 7 May 2015), you will get an arbitrary one of the observations with the minimum date.
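The logic of "sort by date within each group and keep the first" can be checked against the four example rows. The Python sketch below is only an illustration of that logic, with the hypothetical helper names parse_dmy and keep_earliest. One difference is an assumption of this sketch: on tied dates it keeps the first row encountered, whereas the Stata command above makes no promise about which tied observation survives.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def parse_dmy(text):
    """Turn a string such as '07-may-2015' into a date object."""
    day, month, year = text.split("-")
    return date(int(year), MONTHS[month.lower()], int(day))


# The example data: (v1, v2, Date, v4, v5, v6, v7, v8)
rows = [
    (1, 2, "15-aug-2016", 1, 1, 1, 1, 1),
    (1, 2, "07-may-2015", 1, 1, 1, 1, 50),
    (1, 2, "07-may-2015", 1, 1, 1, 1, 88),
    (1, 2, "15-aug-2016", 1, 1, 1, 1, 29),
]


def keep_earliest(records):
    """Keep one record per (v1, v2, v4, v5, v6, v7) group: the earliest date."""
    best = {}
    for rec in records:
        key = rec[0:2] + rec[3:7]          # everything except Date and v8
        current = best.get(key)
        # Strict "<" means the first record wins when dates are tied.
        if current is None or parse_dmy(rec[2]) < parse_dmy(current[2]):
            best[key] = rec
    return list(best.values())


kept = keep_earliest(rows)
print(kept)
```

If it matters which of the tied rows is kept, the tie-break has to be made explicit, for example by adding v8 to the comparison.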

How to fetch two associated Database values Using Rails 3

Hi, I have two tables in my DB. The first table is given below.
Table name-
t_hcsy_details
class name in model-
class THcsyDetails < ActiveRecord::Base
end
The values in the table are given below.
HCSY_Details_ID HCSY_ID HCSY_Fund_Type_ID Amount
1 2 1 1125
2 2 2 390
3 2 3 285
4 2 4 100
5 2 5 60
6 2 6 40
My second table is given below.
Table Name:
t_hcsy_fund_type_master
class in model:
class THcsyFundTypeMaster < ActiveRecord::Base
end
Table values are given below.
HCSY_Fund_Type_ID Fund_Type_Code Fund_Type_Name Amount
1 1 woods 1125
2 2 Burning 390
3 3 goods 285
4 4 brahmin 100
5 5 swd 60
6 6 Photo 40
I know only the HCSY_ID value (i.e. 2) of the first table, but I need Fund_Type_Name and Amount from the second table. As you can see, one HCSY_ID has 6 different records, and I need every Fund_Type_Name and Amount for that one HCSY_ID. Please help me resolve this by creating objects for the two classes shown above.
You haven't specified any associations between the models, so it would be easier to split this into two queries:
# you already have hcsy_id
fund_type_ids = THcsyDetails.where(hcsy_id: hcsy_id).pluck(:hcsy_fund_type_id)
fund_types = THcsyFundTypeMaster.where(id: fund_type_ids)
fund_types.group(:fund_type_name).sum(:amount)
If you had the associations set up properly, the above would simplify to:
THcsyDetails.
joins(association_name). # THcsyFundTypeMaster
where(hcsy_id: hcsy_id).
group("#{t = THcsyFundTypeMaster.table_name}.fund_type_name").
sum("#{t}.amount")
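The two-query approach can be pictured without a database: first collect the fund type ids that belong to the known HCSY_ID, then look those ids up in the master table and total the amounts by name. The Python sketch below does this on in-memory copies of the two tables from the question. It is only an illustration of the lookup logic, not ActiveRecord code, and the function name fund_names_and_amounts is made up.

```python
# t_hcsy_details: (HCSY_Details_ID, HCSY_ID, HCSY_Fund_Type_ID, Amount)
details = [
    (1, 2, 1, 1125),
    (2, 2, 2, 390),
    (3, 2, 3, 285),
    (4, 2, 4, 100),
    (5, 2, 5, 60),
    (6, 2, 6, 40),
]

# t_hcsy_fund_type_master: (HCSY_Fund_Type_ID, Fund_Type_Code, Fund_Type_Name, Amount)
fund_types = [
    (1, 1, "woods", 1125),
    (2, 2, "Burning", 390),
    (3, 3, "goods", 285),
    (4, 4, "brahmin", 100),
    (5, 5, "swd", 60),
    (6, 6, "Photo", 40),
]


def fund_names_and_amounts(hcsy_id):
    """Step 1: ids for this HCSY_ID. Step 2: total Amount per Fund_Type_Name."""
    wanted_ids = {row[2] for row in details if row[1] == hcsy_id}
    totals = {}
    for type_id, _code, name, amount in fund_types:
        if type_id in wanted_ids:
            totals[name] = totals.get(name, 0) + amount
    return totals


result = fund_names_and_amounts(2)
print(result)
```

As in the first Ruby snippet, the amounts here come from the master table; an unknown HCSY_ID simply yields an empty result.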

Resources