# Randomized Trials

**Session 5**

---

# Potential Outcome

<br> <br>
.box-1[

What *will* happen to you in the future given you have a particular exposure

]

<br>

.box-inv-1.medium[

**fixed** but **unknown**

]
---

# What do we observe?

.box-inv-1.medium[Recall: The causal effect is the comparison of the potential outcomes for the same unit **at the same moment** in time post-exposure]

---

# Outcomes for an individual

.box-inv-1[
`$Y_i^{obs}= Y_i(X_i)=\begin{cases}Y_i(0)&\textrm{if }X_i=0\\Y_i(1)&\textrm{if }X_i = 1\end{cases}$`
]

--
<br>

.box-1[
`$Y_i^{mis}= Y_i(1-X_i)=\begin{cases}Y_i(1)&\textrm{if }X_i=0\\Y_i(0)&\textrm{if }X_i = 1\end{cases}$`
]
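
A minimal sketch of the two definitions above. The four-unit data frame `toy` and its values are made up for illustration; the column names `y0`, `y1`, `x`, and `y_obs` follow the code used later in the deck, and `y_mis` is a name assumed here.

```r
library(dplyr)

# Hypothetical potential outcomes and exposures for four units
toy <- data.frame(
  y0 = c(1, 2, 3, 4),   # Y_i(0)
  y1 = c(2, 2, 5, 3),   # Y_i(1)
  x  = c(1, 0, 0, 1)    # X_i
)

toy <- toy %>%
  mutate(
    y_obs = ifelse(x == 1, y1, y0),  # Y_i(X_i): the outcome we get to see
    y_mis = ifelse(x == 1, y0, y1)   # Y_i(1 - X_i): the outcome we never see
  )

toy
```

Every unit contributes exactly one observed and one missing potential outcome, whichever way the exposure falls.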

---

# Causal inference is fundamentally a missing data problem

---

## Stable Unit Treatment Value Assumption

.box-inv-1[
The potential outcomes for any unit do not vary with the exposures assigned to other units, and, for each unit, there are no different forms or versions of each exposure level, which lead to different potential outcomes.]

---

## Stable Unit Treatment Value Assumption

.box-inv-1[
**The potential outcomes for any unit do not vary with the exposures assigned to other units**, and, for each unit, there are no different forms or versions of each exposure level, which lead to different potential outcomes.]

.box-1.medium[No Interference]

---

## Stable Unit Treatment Value Assumption

.box-inv-1[
The potential outcomes for any unit do not vary with the exposures assigned to other units, and, for **each unit, there are no different forms or versions of each exposure level, which lead to different potential outcomes.**]

---

# Exposure assignment

.box-inv-1[Understanding the **assignment mechanism** is crucial for understanding the causal effect]

.box-1[Assignment probability must be **individualistic** (your covariates / potential outcomes can't influence my assignment)]

---

# Exposure assignment

.box-inv-1[Understanding the **assignment mechanism** is crucial for understanding the causal effect]

.box-2[Assignment probability must be **individualistic** (your covariates / potential outcomes can't influence my assignment)]

---

# Randomized Trials!

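```r
library(dplyr)

set.seed(5)

# Randomly assign exactly half of the units to each exposure
d_random <- meeple %>%
  mutate(x = sample(rep(c(1, 0), each = n() / 2)),  # coin flip! (shuffled 1s and 0s)
         y_obs = ifelse(x == 1, y1, y0))            # we only see the assigned outcome
```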

---

# Randomized Trials!

```r
d_random %>%
  summarise(average_noicecream = mean(y0),
            average_icecream = mean(y1),
            average_effect = mean(y1 - y0))
```

```
## # A tibble: 1 × 3
##   average_noicecream average_icecream average_effect
##                <dbl>            <dbl>          <dbl>
## 1               1.82             1.82              0
```

---

# Randomized Trials!

```r
d_random %>%
  summarise(
    average_observed = 
      mean(y_obs[x == 1]) - mean(y_obs[x == 0])
    )
```

```
## # A tibble: 1 × 1
##   average_observed
##              <dbl>
## 1            -0.68
```
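
The single estimate above differs from the true average effect of 0 because it comes from one particular random assignment. A sketch of what happens across many re-randomizations, assuming a hypothetical simulated population (not the `meeple` data) in which the exposure has no effect for anyone; `estimate_once()` is a helper name introduced here.

```r
set.seed(1)

# Hypothetical population: the exposure has no effect for anyone (y1 == y0)
n  <- 100
y0 <- rnorm(n, mean = 2, sd = 1)
y1 <- y0

# One completely randomized assignment (exactly n / 2 units exposed),
# followed by the difference in observed means
estimate_once <- function() {
  x     <- sample(rep(c(1, 0), each = n / 2))
  y_obs <- ifelse(x == 1, y1, y0)
  mean(y_obs[x == 1]) - mean(y_obs[x == 0])
}

estimates <- replicate(5000, estimate_once())

mean(estimates)  # close to the true average effect of 0
sd(estimates)    # how much a single estimate varies across assignments
```

Any one estimate can miss the truth, but the estimates are centered on the true effect.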

---

# <svg aria-hidden="true" role="img" viewBox="0 0 448 512" style="height:1em;width:0.88em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M144 479H48c-26.5 0-48-21.5-48-48V79c0-26.5 21.5-48 48-48h96c26.5 0 48 21.5 48 48v352c0 26.5-21.5 48-48 48zm304-48V79c0-26.5-21.5-48-48-48h-96c-26.5 0-48 21.5-48 48v352c0 26.5 21.5 48 48 48h96c26.5 0 48-21.5 48-48z"/></svg> Pipes

.box-1.medium[Where does the name come from?]

The pipe operator, `%>%`, is implemented in the **magrittr** package and is pronounced "and then".

---

- Consider the following sequence of actions: find keys, unlock car, start car, drive to campus, park.
- Expressed as a set of nested functions in R pseudocode, this would look like:

```r
park(drive(start_car(find("keys")), to = "campus"))
```

- Writing it out using pipes gives it a more natural (and easier to read) structure:
.small[

```r
find("keys") %>%
  start_car() %>%
  drive(to = "campus") %>%
  park()
```

]
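
The car functions above are pseudocode. A runnable sketch of the same idea with base R functions (the vector `values` is made up for illustration):

```r
library(magrittr)

values <- c(1, 4, 9, 16)

# Nested: read from the inside out
nested <- round(sqrt(sum(values)), 1)

# Piped: read top to bottom, "and then"
piped <- values %>%
  sum() %>%
  sqrt() %>%
  round(1)

identical(nested, piped)
```

Both forms run the same calls in the same order; only the reading direction changes.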

---

class: title title-1
# <svg aria-hidden="true" role="img" viewBox="0 0 448 512" style="height:1em;width:0.88em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M144 479H48c-26.5 0-48-21.5-48-48V79c0-26.5 21.5-48 48-48h96c26.5 0 48 21.5 48 48v352c0 26.5-21.5 48-48 48zm304-48V79c0-26.5-21.5-48-48-48h-96c-26.5 0-48 21.5-48 48v352c0 26.5 21.5 48 48 48h96c26.5 0 48-21.5 48-48z"/></svg> What about other arguments?

To send results to a function argument other than the first one, or to use the previous result for multiple arguments, use `.`

```r
starwars %>%
  filter(species == "Human") %>%
  lm(mass ~ height, data = .)
```
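
A self-contained sketch of both uses of `.`, with the built-in `mtcars` data standing in for `starwars` so that only **magrittr** is needed; the object names `fit` and `repeated` are assumptions for illustration.

```r
library(magrittr)

# `.` sends the left-hand side to `data` instead of the first argument
fit <- mtcars %>%
  lm(mpg ~ wt, data = .)

# `.` used for more than one argument: repeat the vector once per element
repeated <- c(2, 4, 6) %>%
  rep(., times = length(.))

coef(fit)
repeated
```

Because `.` appears as a direct argument in both calls, the pipe does not also insert the left-hand side as the first argument.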

---

# Randomized trial

* The easiest way to establish a causal effect is to run a randomized controlled trial!

* Even very controlled experiments often have features that make them tricky to interpret: dropout, non-compliance with treatment, and interference

---

# Randomized trial

.box-2[Assignment probability must be **individualistic** (your covariates / potential outcomes can't influence my assignment)]

---

# Randomized trial

.box-1[Assignment probability must be **individualistic** (your covariates / potential outcomes can't influence my assignment)]

---

# Notation

.box-inv-1.medium[
.huge[
`$\mathbf{Y}(0), \mathbf{Y}(1)$`
]
]

<br>

---

# Notation

.box-inv-1.medium[
.huge[
`$\mathbf{X}$`
]
]

---

# Causal inference is fundamentally a missing data problem

---

# Outcomes for an individual

.box-inv-1[
`$Y_i^{obs}= Y_i(X_i)=\begin{cases}Y_i(0)&\textrm{if }X_i=0\\Y_i(1)&\textrm{if }X_i = 1\end{cases}$`
]

--
<br>

.box-1[
`$Y_i^{mis}= Y_i(1-X_i)=\begin{cases}Y_i(1)&\textrm{if }X_i=0\\Y_i(0)&\textrm{if }X_i = 1\end{cases}$`
]

---

# Outcomes for an individual

.box-inv-1[
`$Y_i(0)= \begin{cases}Y_i^{mis}&\textrm{if }X_i=1\\Y_i^{obs}&\textrm{if }X_i = 0\end{cases}$`
]

<br>

.box-1[
`$Y_i(1)= \begin{cases}Y_i^{mis}&\textrm{if }X_i=0\\Y_i^{obs}&\textrm{if }X_i = 1\end{cases}$`
]
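
A sketch of what this looks like in a dataset, assuming a hypothetical four-unit data frame (values made up for illustration; `y0_seen` and `y1_seen` are names introduced here). Each potential outcome is filled in only under the matching exposure and is `NA` otherwise.

```r
library(dplyr)

# Hypothetical potential outcomes and exposures for four units
toy <- data.frame(
  y0 = c(1, 2, 3, 4),
  y1 = c(2, 2, 5, 3),
  x  = c(1, 0, 0, 1)
)

# What the analyst sees: Y_i(0) only when X_i = 0, Y_i(1) only when X_i = 1
observed <- toy %>%
  mutate(
    y0_seen = ifelse(x == 0, y0, NA),
    y1_seen = ifelse(x == 1, y1, NA)
  ) %>%
  select(x, y0_seen, y1_seen)

observed

# Exactly one of the two potential outcomes is missing for every unit
rowSums(is.na(observed[c("y0_seen", "y1_seen")]))
```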
---
class: section-title-1 title-1 middle

# Causal inference is fundamentally a missing data problem

---

# Average treatment effect estimand

<br>

`$$\tau=\frac{1}{N}\sum_{i=1}^N(Y_i(1)-Y_i(0))=\bar{Y}(1)-\bar{Y}(0)$$`

---

# Average treatment effect estimand

<br>

`$$\bar{Y}(0)=\frac{1}{N}\sum_{i=1}^NY_i(0)\textrm{   and   }\bar{Y}(1) = \frac{1}{N}\sum_{i=1}^NY_i(1)$$`
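
A small numeric check that the two ways of writing the estimand agree. The potential outcomes below are made up for illustration, and both are treated as known, which never happens with real data.

```r
# Hypothetical potential outcomes for four units
y0 <- c(1, 2, 3, 4)
y1 <- c(2, 2, 5, 3)

tau_unit_level <- mean(y1 - y0)        # (1 / N) * sum of Y_i(1) - Y_i(0)
tau_from_means <- mean(y1) - mean(y0)  # Y-bar(1) - Y-bar(0)

c(tau_unit_level, tau_from_means)
```

The average of the unit-level differences equals the difference of the two averages, so the estimand can be targeted without ever pairing up `y0` and `y1` for the same unit.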

---

class: title title-1

# Randomized experiment

---

# Estimator for average treatment effect

<br>

---
class: center

<figure>
<img src="img/estimand.jpeg" width="50%">
</figure>
.footer[
Simon Grund on [Twitter](https://twitter.com/simongrund89/status/1085929122860359680?lang=bg)
]
---

# Average treatment effect estimand

<br>

`$$\tau=\frac{1}{N}\sum_{i=1}^N(Y_i(1)-Y_i(0))=\bar{Y}(1)-\bar{Y}(0)$$`

---

# Estimator for average treatment effect

```r
d_random %>%
  summarise(
    average_observed = 
      mean(y_obs[x == 1]) - mean(y_obs[x == 0])
    )
```

```
## # A tibble: 1 × 1
##   average_observed
##              <dbl>
## 1            -0.68
```

---

# Estimator for average treatment effect

.box-1.medium[Why does this work?]

.box-inv-1.medium[We know:]

`$Y_i^{obs}=Y_i(1)$` if `$X_i=1$`

`$Y_i^{obs}=Y_i(0)$` if `$X_i=0$`
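
A sketch of the remaining step, assuming a completely randomized design with `$N_t$` exposed and `$N_c$` unexposed units, so that `$E[X_i]=N_t/N$`. The symbols `$N_t$`, `$N_c$`, and `$\hat{\tau}$` are introduced here and are not defined on the slides. Substituting the two facts above into the difference in observed means:

`$$\hat{\tau}=\frac{1}{N_t}\sum_{i=1}^NX_iY_i^{obs}-\frac{1}{N_c}\sum_{i=1}^N(1-X_i)Y_i^{obs}=\frac{1}{N_t}\sum_{i=1}^NX_iY_i(1)-\frac{1}{N_c}\sum_{i=1}^N(1-X_i)Y_i(0)$$`

Taking the expectation over the random assignment, with the potential outcomes held fixed:

`$$E[\hat{\tau}]=\frac{1}{N_t}\sum_{i=1}^N\frac{N_t}{N}Y_i(1)-\frac{1}{N_c}\sum_{i=1}^N\frac{N_c}{N}Y_i(0)=\bar{Y}(1)-\bar{Y}(0)=\tau$$`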

---