Skip to content

Instantly share code, notes, and snippets.

@snehankekre
Created September 23, 2019 23:36
Show Gist options
  • Select an option

  • Save snehankekre/098f644643d205d9165bce88c8af6104 to your computer and use it in GitHub Desktop.

Select an option

Save snehankekre/098f644643d205d9165bce88c8af6104 to your computer and use it in GitHub Desktop.
CS112_3.1_gelmanhillsimulation.R
library(arm)
library(Matching)
library("ggplot2")
data(lalonde)
ggplot(lalonde, aes(x=age, y=re78)) +
geom_point(color='#2980B9', size = 4) +
geom_smooth(method=lm, color='#2C3E50')
lm1 <- lm(lalonde$re78 ~ lalonde$age)
lm1$coef
# (Intercept) lalonde$age
# 3946.18432 53.39136
library(arm)
set.seed(123)
sim_results <- sim(lm1, n.sims = 20)
set.seed(232)
# 20 sims is too few to get reliable results
sim_results2 <- sim(lm1, n.sims = 10000)
mean(sim_results2@coef[,1])
# 3953 -- compare to 3946.18432 (when n.sims is big, e.g., 100K it's a better estimate)
mean(sim_results2@coef[,2])
#estimate the prediction intervals of y
#for every x in your data set
for(i in 1:10000){
simulated_ys <- sim_results2@coef[i,1]*rep(1, length(lalonde$age)) +
sim_results2@coef[i,2]*lalonde$age +
rnorm(length(lalonde$age), 0, sim_results2@sigma[i])
upper.bounds <- simulated_ys + 1.96*(sd(simulated_ys)/sqrt(length(simulated_ys)))
lower.bounds <- simulated_ys - 1.96*(sd(simulated_ys)/sqrt(length(simulated_ys)))
}
prediction_intervals <- do.call(rbind, Map(data.frame, LOWER=lower.bounds, UPPER=upper.bounds))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment