Appex 04: Potential Outcomes

Note: Be sure to push your work to GitHub at least once before 5pm tomorrow to get credit for this application exercise.

Getting started

Packages

In this lab we will work with one package: tidyverse, which is a collection of packages for doing data analysis in a “tidy” way. It should already have been installed as part of your tech setup.

If you’d like to run your code in the Console as well, you’ll also need to load the package there. To do so, run the following in the Console.

library(tidyverse) 

Note that the same command also loads the package in your R Markdown document.

Welcome to Lucy Land, a land where I have special powers so I can see all potential outcomes 😎. I will share those powers with you! Let’s generate n = 50 meeple with one characteristic: whether they tend to be happy (half do, half don’t). Let’s turn this into a happiness index. Let’s also generate some potential outcomes: their happiness if they don’t eat ice cream (y0) and their happiness if they do (y1). It turns out ice cream consumption has nothing to do with happiness in Lucy Land! So y0 and y1 are equal; each is just the meeple’s baseline happiness index. All these happiness scores depend on is their prior likelihood of being happy. Copy this code into your R Markdown document. Knit, commit, and push!

## Generate Lucy Land's meeple
set.seed(1)

n <- 50
meeple <- tibble(
  happy = sample(rep(c(1, 0), each = n / 2)),
  happiness = case_when(
    happy == 1 ~ rbinom(n, 5, 0.7),
    happy == 0 ~ rbinom(n, 3, 0.2)
  ),
  y0 = happiness,
  y1 = happiness
)
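As a quick sanity check on the setup above, the individual causal effect `y1 - y0` can be computed for every meeple; because ice cream has no effect in Lucy Land, each one should be 0. This is only a minimal sketch: it repeats the generation code so the chunk runs on its own, and the names `effects` and `individual_effect` are introduced here for illustration.

```r
library(tidyverse)

# Same generation code as above, repeated so this chunk runs on its own
set.seed(1)
n <- 50
meeple <- tibble(
  happy = sample(rep(c(1, 0), each = n / 2)),
  happiness = case_when(
    happy == 1 ~ rbinom(n, 5, 0.7),
    happy == 0 ~ rbinom(n, 3, 0.2)
  ),
  y0 = happiness,
  y1 = happiness
)

# individual_effect is a hypothetical column name used only in this sketch
effects <- meeple %>%
  mutate(individual_effect = y1 - y0)

# Tabulate the individual effects: every meeple should have an effect of 0
effects %>%
  count(individual_effect)

# Half the meeple tend to be happy and half do not
effects %>%
  count(happy)
```

Only someone with Lucy Land powers can run this check, since it needs both `y0` and `y1` for the same meeple.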
  1. We are interested in the impact of ice cream consumption on happiness. Here the exposure, ice cream, was randomly assigned. Below is R code to randomly assign the 50 meeple to eat ice cream (or not). Let’s also create y_obs, the observed outcome for folks without our magic powers to see both potential outcomes. Add the code to your R Markdown file. Explain what each part is doing in words. Knit, commit, and push this to GitHub.
set.seed(5)

d_random <- meeple %>%
  mutate(x = sample(rep(c(1, 0), each = n / 2)),  # coin flip!
         y_obs = ifelse(x == 1, y1, y0))
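The assignment step relies on `sample(rep(c(1, 0), each = n / 2))`, which shuffles a vector holding exactly 25 ones and 25 zeros, so the two groups always end up the same size. The sketch below contrasts that with independent coin flips drawn by `rbinom()`, which balance the groups only on average. It uses base R only, and the names `x_shuffled` and `x_flips` are introduced here for illustration.

```r
set.seed(5)
n <- 50

# Shuffle a fixed set of 25 ones and 25 zeros: group sizes are exactly equal
x_shuffled <- sample(rep(c(1, 0), each = n / 2))
table(x_shuffled)

# Independent fair coin flips: group sizes can differ from run to run
x_flips <- rbinom(n, size = 1, prob = 0.5)
table(x_flips)
```

Either approach assigns ice cream without looking at anything about the meeple; the shuffled version simply guarantees a 25/25 split.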
  2. We can calculate the average causal effect by averaging across the individual effects. Copy the code below to your R Markdown file. Describe what each line is doing. Then write a sentence describing the average causal effect (using the true potential outcomes) and the average observed effect (using only y_obs). Knit, commit, and push this to GitHub.
d_random %>%
  summarise(average_noicecream = mean(y0),
            average_icecream = mean(y1),
            average_effect = mean(y1 - y0))
## # A tibble: 1 × 3
##   average_noicecream average_icecream average_effect
##                <dbl>            <dbl>          <dbl>
## 1               1.82             1.82              0
d_random %>%
  summarise(average_observed = mean(y_obs[x == 1]) - mean(y_obs[x == 0]))
## # A tibble: 1 × 1
##   average_observed
##              <dbl>
## 1            -0.68
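In symbols, the two quantities computed above can be written as follows, where i indexes the n = 50 meeple and x_i is the treatment indicator. The notation n_1 and n_0 for the number of meeple with x = 1 and x = 0 is introduced here and does not appear in the exercise code.

```latex
\text{average causal effect}
  = \frac{1}{n} \sum_{i=1}^{n} \left( y_{1,i} - y_{0,i} \right)

\text{average observed effect}
  = \frac{1}{n_1} \sum_{i \,:\, x_i = 1} y_{\text{obs},i}
  \;-\; \frac{1}{n_0} \sum_{i \,:\, x_i = 0} y_{\text{obs},i}
```

The first expression needs both potential outcomes for every meeple; the second uses only what an ordinary experimenter could observe.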
  3. Suppose I am the one conducting the experiment and I know my meeple well, so I know who is generally happy and who is not. Therefore, I assign them ice cream based on my knowledge of their happiness, giving the happier meeple ice cream and the grumpy folks none. The potential outcomes remain the same, but the treatment assignment and observed outcome change. Copy the code below for this new experiment. Describe what each line of code is doing. Knit, commit, and push this to GitHub.
d_lucy <- d_random %>%
  mutate(x = happy,
         y_obs = ifelse(x == 1, y1, y0)) 
d_lucy
## # A tibble: 50 × 6
##    happy happiness    y0    y1     x y_obs
##    <dbl>     <int> <int> <int> <dbl> <int>
##  1     1         4     4     4     1     4
##  2     0         0     0     0     0     0
##  3     1         4     4     4     1     4
##  4     0         0     0     0     0     0
##  5     1         2     2     2     1     2
##  6     0         0     0     0     0     0
##  7     1         3     3     3     1     3
##  8     1         2     2     2     1     2
##  9     0         1     1     1     0     1
## 10     1         3     3     3     1     3
## # … with 40 more rows
  4. Using code similar to that in Exercise 2, calculate the average observed effect of this new experiment. How does this compare to the true average causal effect? What can you learn from this? Knit, commit, and push the results to GitHub.