Note: Be sure to push your work to GitHub at least once before 5pm tomorrow to get credit for this application exercise.
appex-04-potential-outcomes-YOUR-GITHUB-HANDLE
. This repo contains a template you can build on to complete your assignment.
https://github.com/LucyMcGowan/myrepo.git
.In this lab we will work with one package: tidyverse
which is a collection of packages for doing data analysis in a “tidy” way. This should already been installed from your tech setup.
If you’d like to run your code in the Console as well you’ll also need to load the package there. To do so, run the following in the console.
Note that the package is also loaded with the same commands in your R Markdown document.
Welcome to Lucy Land, a land where I have special powers so I can see all potential outcomes 😎. I will share those powers with you! We’re going to turn my code to generate meeple into a function! Here we are generating a population that has a causal effect of 2. Copy this code into your RMarkdown document. Knit, Commit, and Push!
## Generate lucy land's meeple
make_meeple <- function(n = 50) {
tibble(
y0 = rnorm(n, 1),
y1 = y0 + 2
)
}
set.seed(1)
make_meeple()
## # A tibble: 50 × 2
## y0 y1
## <dbl> <dbl>
## 1 0.374 2.37
## 2 1.18 3.18
## 3 0.164 2.16
## 4 2.60 4.60
## 5 1.33 3.33
## 6 0.180 2.18
## 7 1.49 3.49
## 8 1.74 3.74
## 9 1.58 3.58
## 10 0.695 2.69
## # … with 40 more rows
y_obs
. Calculate the average treatment effect. Knit Commit and Push!proportion_exposed <- 0.5
d_random <- make_meeple(n = 50) %>%
mutate(x = sample(rep(c(0, 1), times = c(50 * (1 - proportion_exposed), 50 * proportion_exposed)),
y_obs = _____)
d_random %>%
summarise(mean(y_obs[x == 1]) - mean(y_obs[x == 0]))
Using similar code as above, change your exposure proportion to 25% from 50%. Does this change whether your estimate is biased? Why or why not? Knit commit and push!
Run the code below. What does it show you? Knit commit and push!
get_average_effect <- function(n, proportion_exposed = 0.5) {
d_random <- make_meeple(n) %>%
mutate(x = sample(rep(c(0, 1), times = c(n * (1 - proportion_exposed),
n * proportion_exposed))),
y_obs = ifelse(x == 1, y1, y0))
d_random %>%
summarise(mean(y_obs[x == 1]) - mean(y_obs[x == 0])) %>%
pull()
}
d <- tibble(
n = seq(10, 10000, by = 10),
tau_hat = map_dbl(n, get_average_effect)
)
ggplot(d, aes(n, tau_hat)) +
geom_line() +
geom_point()