# Causal assumptions

**Session 9**


---

# Notation

.box-inv-1.medium[
.huge[
`$\mathbf{Y}(0), \mathbf{Y}(1)$`
]
]

---

# Notation

```
## # A tibble: 8 × 3
## unit_1 unit_2 unit_3
## <int> <int> <int>
## 1 0 0 0
## 2 0 0 1
## 3 0 1 0
## 4 0 1 1
## 5 1 0 0
## 6 1 0 1
## 7 1 1 0
## 8 1 1 1
```
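---

# Notation

A minimal sketch of how the table on the previous slide could be built for `$N = 3$` units. The slides do not show how the table was created, so the use of base R's `expand.grid()` and the names `n_units` and `assignments` are assumptions for illustration.

```r
# Enumerate all 2^N assignment vectors for N = 3 units.
# expand.grid() varies its first argument fastest, so the units are listed
# in reverse and the columns reordered to match the row order on the slide.
n_units <- 3
assignments <- expand.grid(unit_3 = 0:1, unit_2 = 0:1, unit_1 = 0:1)
assignments <- assignments[, c("unit_1", "unit_2", "unit_3")]

nrow(assignments) == 2^n_units # TRUE: 8 possible assignment vectors
assignments[3, ]               # row 3: only unit 2 is exposed
```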

---

# Notation

```
## # A tibble: 8 × 3
## unit_1 unit_2 unit_3
## <int> <int> <int>
## 1 0 0 0
## 2 0 0 1
*## 3 0 1 0
## 4 0 1 1
## 5 1 0 0
## 6 1 0 1
## 7 1 1 0
## 8 1 1 1
```
---

# Assignment mechanism
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1.medium[

`$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
]

---
class: section-title section-title-1 middle

# Let's start with **no assumptions**

---

# Assignment mechanism
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1.medium[

`$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
]

.box-inv-1[
`$$\sum_{\mathbf{X}\in\{0,1\}^N}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) =1$$`
for all `$\mathbf{Z}$`, `$\mathbf{Y}(0)$`, and `$\mathbf{Y}(1)$`
]

---

# Assignment mechanism

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0.125
## 2 0 0 1 0.125
## 3 0 1 0 0.125
## 4 0 1 1 0.125
## 5 1 0 0 0.125
## 6 1 0 1 0.125
## 7 1 1 0 0.125
## 8 1 1 1 0.125
```

---

# Assignment mechanism

```r
d %>%
  summarise(sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## `sum(assignment_probability)`
## <dbl>
## 1 1
```

---

# Assignment mechanism

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0
## 2 0 0 1 0
## 3 0 1 0 1
## 4 0 1 1 0
## 5 1 0 0 0
## 6 1 0 1 0
## 7 1 1 0 0
## 8 1 1 1 0
```
---

# Assignment mechanism

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0
## 2 0 0 1 0
*## 3 0 1 0 1
## 4 0 1 1 0
## 5 1 0 0 0
## 6 1 0 1 0
## 7 1 1 0 0
## 8 1 1 1 0
```
---

# Assignment mechanism

```r
d %>%
  summarise(sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## `sum(assignment_probability)`
## <dbl>
## 1 1
```

---

# Assignment mechanism

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0 
## 2 0 0 1 0 
## 3 0 1 0 0.25
## 4 0 1 1 0.25
## 5 1 0 0 0 
## 6 1 0 1 0 
## 7 1 1 0 0.25
## 8 1 1 1 0.25
```
---

# Assignment mechanism

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0 
## 2 0 0 1 0 
*## 3 0 1 0 0.25
*## 4 0 1 1 0.25
## 5 1 0 0 0 
## 6 1 0 1 0 
*## 7 1 1 0 0.25
*## 8 1 1 1 0.25
```

---

# Assignment mechanism

```r
d %>%
  summarise(sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## `sum(assignment_probability)`
## <dbl>
## 1 1
```
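---

# Assignment mechanism

The three example mechanisms on the preceding slides can be checked together. This is a sketch in base R; the names `uniform`, `degenerate`, `four_way`, and `is_valid` are labels introduced here, with values copied from the tables shown earlier.

```r
# Probabilities over the 8 assignment vectors, in the row order of the tables
uniform <- rep(1 / 8, 8)                          # every vector equally likely
degenerate <- c(0, 0, 1, 0, 0, 0, 0, 0)           # only (0, 1, 0) can occur
four_way <- c(0, 0, 0.25, 0.25, 0, 0, 0.25, 0.25) # unit 2 is always exposed

# A valid assignment mechanism is a probability distribution over vectors:
# each entry lies in [0, 1] and the entries sum to 1
is_valid <- sapply(
  list(uniform = uniform, degenerate = degenerate, four_way = four_way),
  function(p) all(p >= 0 & p <= 1) && isTRUE(all.equal(sum(p), 1))
)
is_valid
```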

---
class: title title-1

# Assignment mechanism
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1[
In words: this is the probability that a particular full assignment vector occurs, for example person 1 assigned to the exposure, persons 2 and 3 to control, person 4 to the exposure, and so on. The probabilities across the full set of `$2^N$` possible assignment vectors must sum to 1.
]

.box-1.medium[This is **not** the probability that a particular person will be assigned a particular exposure]

---

# Unit Assignment Probability
.footer[Imbens and Rubin (2015) Causal Inference]

`$P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\sum_{\mathbf{X}:X_i=1}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$`

.box-1[
You can get this by summing the probabilities of all assignment vectors `$\mathbf{X}$` in which `$X_i=1$`
]

---

# Unit Assignment Probability

```r
d %>%
  filter(unit_1 == 1) %>%
  summarise(p1 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p1
## <dbl>
## 1 0.5
```

---

# Unit Assignment Probability

```r
d %>%
* filter(unit_1 == 1) %>%
  summarise(p1 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p1
## <dbl>
## 1 0.5
```

`$P_1(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\sum_{\color{orange}{\mathbf{X}:X_1=1}}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$`

---

# Unit Assignment Probability

```r
d %>%
  filter(unit_1 == 1) %>% 
* summarise(p1 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p1
## <dbl>
## 1 0.5
```

`$P_1(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\color{orange}{\sum_{\mathbf{X}:X_1=1}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))}$`

---

# Unit Assignment Probability

```r
d %>%
  filter(unit_2 == 1) %>% 
  summarise(p2 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p2
## <dbl>
## 1 1
```

---

# Unit Assignment Probability

```r
d %>%
  filter(unit_3 == 1) %>% 
  summarise(p3 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p3
## <dbl>
## 1 0.5
```
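---

# Unit Assignment Probability

The three `filter()` and `summarise()` calls can be collapsed into one step. A sketch in base R, assuming `d` holds the mechanism in which rows 3, 4, 7, and 8 each have probability 0.25; the table is rebuilt here so the block runs on its own, and `x` and `unit_probability` are names introduced for illustration.

```r
d <- expand.grid(unit_3 = 0:1, unit_2 = 0:1, unit_1 = 0:1)
d <- d[, c("unit_1", "unit_2", "unit_3")]
d$assignment_probability <- c(0, 0, 0.25, 0.25, 0, 0, 0.25, 0.25)

# P_i = sum of P(X) over the assignment vectors with X_i = 1.
# Multiplying each 0/1 column by the probabilities keeps only those rows.
x <- as.matrix(d[, c("unit_1", "unit_2", "unit_3")])
unit_probability <- colSums(x * d$assignment_probability)
unit_probability # 0.5, 1, 0.5 for units 1, 2, 3
```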

---

# Unit Assignment Probability
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[Under the definition so far, the probability that the `$i$`th unit is in the *exposed* group can be a function of:]

.box-inv-1.small[
their covariates `$(Z_i)$`
]

.box-inv-1.small[
their potential outcomes `$(Y_i(0), Y_i(1))$`
]

.box-inv-1.small[
the covariate values, exposure assignment, and potential outcomes of all other units]

---

class: title title-1
.footer[Imbens and Rubin (2015) Causal Inference]
# Average unit-level assignment probabilities

.box-1[What if we want the average unit-level assignment probabilities for units with the same covariate values `$(Z_i = z)$`?]

---

# Propensity score: The average unit assignment probability for units where `$Z_i=z$`

---

# Propensity Score

.box-inv-1.medium[

`$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
]

`$N(z)$` is the number of units where `$Z_i = z$`

---

# Propensity score

.box-1.small[Units 1 and 3 have `$Z_i = 1$`; unit 2 has `$Z_i = 0$`]

```r
# Z = 1
d %>%
  summarise(e1 = (sum(assignment_probability[unit_1 == 1]) + 
              sum(assignment_probability[unit_3 == 1])) / 2)
```

```
## # A tibble: 1 × 1
## e1
## <dbl>
## 1 0.5
```

---

# Propensity score

```r
# Z = 1
d %>%
  summarise(e1 = (sum(assignment_probability[unit_1 == 1]) + 
              sum(assignment_probability[unit_3 == 1])) / 2)
```

```
## # A tibble: 1 × 1
## e1
## <dbl>
## 1 0.5
```

--
.box-1[

`$e(z)=\frac{1}{N(z)}\sum_{\color{orange}{i: Z_i=z}}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
]

---
class: title title-1

# Propensity score

```r
# Z = 1
d %>%
* summarise(e1 = (sum(assignment_probability[unit_1 == 1]) +
              sum(assignment_probability[unit_3 == 1])) / 2)
```

```
## # A tibble: 1 × 1
## e1
## <dbl>
## 1 0.5
```

`$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`

---

# Propensity score

```r
# Z = 1
d %>%
  summarise(e1 = (sum(assignment_probability[unit_1 == 1]) + 
*             sum(assignment_probability[unit_3 == 1])) / 2)
```

```
## # A tibble: 1 × 1
## e1
## <dbl>
## 1 0.5
```

`$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`

---

# Propensity score

```r
# Z = 0
d %>%
  summarise(e0 = sum(assignment_probability[unit_2 == 1]) / 1)
```

```
## # A tibble: 1 × 1
## e0
## <dbl>
## 1 1
```

`$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
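---

# Propensity score

The two group-specific calculations can also be done in one pass. This sketch uses base R and assumes the covariate vector `z <- c(1, 0, 1)` (units 1 and 3 in category `1`, unit 2 in category `0`) together with the unit assignment probabilities 0.5, 1, and 0.5 computed earlier; `propensity` is a name introduced here.

```r
z <- c(1, 0, 1)                    # covariate value for units 1, 2, 3
unit_probability <- c(0.5, 1, 0.5) # P_i for units 1, 2, 3

# e(z) = average of P_i over the N(z) units with Z_i = z
propensity <- tapply(unit_probability, z, mean)
propensity # e(0) = 1, e(1) = 0.5
```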

---

# Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Example

`$$\mathbf{X}=\left\{\color{orange}{\begin{pmatrix}0\\0\end{pmatrix}}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \color{orange}{\begin{pmatrix}0\\1\end{pmatrix}}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\0\end{pmatrix}}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\1\end{pmatrix}}\right\}$$`

---

# Let's do a randomized experiment, so all exposure assignment possibilities are equally likely

---

# Randomized Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = ?$$`

---

# Randomized Example

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = 1/4$$`
---

# Randomized Example

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = 1/4$$`

`$P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\sum_{\mathbf{X}:X_i=1}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$`

---

# Randomized Example

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = 1/4$$`

`$P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\color{orange}{\sum_{\mathbf{X}:X_i=1}}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$`

---

# Unit `$i=1$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Unit `$i=1$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}0\\1\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\0\end{pmatrix}}, \color{orange}{\begin{pmatrix}1\\1\end{pmatrix}}\right\}$$`

.box-1[
`$$P_1(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\sum_{\mathbf{X}:X_1=1}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$$`

]

---

# Unit `$i=1$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}0\\1\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\0\end{pmatrix}}, \color{orange}{\begin{pmatrix}1\\1\end{pmatrix}}\right\}$$`

---

# Unit `$i=2$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\}$$`

---

# Unit `$i=2$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix},\color{orange}{\begin{pmatrix}0\\1\end{pmatrix}}, \begin{pmatrix}1\\0\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\1\end{pmatrix}}\right\}$$`

.box-1[
`$$P_2(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\sum_{\mathbf{X}:X_2=1}P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0),\mathbf{Y}(1))$$`

]

---

# Unit `$i=2$` assignment probability

`$$\mathbf{X}=\left\{\begin{pmatrix}0\\0\end{pmatrix},\color{orange}{\begin{pmatrix}0\\1\end{pmatrix}}, \begin{pmatrix}1\\0\end{pmatrix}, \color{orange}{\begin{pmatrix}1\\1\end{pmatrix}}\right\}$$`
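---

# Randomized Example

The two-unit example can be worked through numerically. A sketch in base R, assuming each of the four assignment vectors has probability 1/4 as stated on the earlier slide; the names `x`, `p_x`, and `p_unit` are illustrative.

```r
# All assignment vectors for N = 2 units, each with probability 1/4
x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(1, 1))
p_x <- rep(1 / 4, nrow(x))

# Unit assignment probability: add up P(X) over the vectors with X_i = 1
p_unit <- colSums(x * p_x)
p_unit # 1/4 + 1/4 = 1/2 for each unit
```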

---

# Propensity Score

.box-inv-1.medium[

`$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$`
]

.box-1[Because this experiment is randomized, the propensity score is equal to the unit assignment probabilities (1/2 for units 1 and 2)]

---

# Exposure assignment

.box-inv-1[Understanding the **assignment mechanism** is crucial for understanding the causal effect]

---
class: title title-1

# Individualistic
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[
An assignment mechanism `$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$` is individualistic if, for some function `$q(\cdot)\in[0, 1]$`,

`$$P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=q(Z_i, Y_i(0), Y_i(1))\\\textrm{ for all }i=1,\dots,N$$` 
]

---

# Individualistic
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1[
1️⃣ in words: The unit-level assignment probability for unit `$i$` is equal to some function of *only that unit's* covariates and potential outcomes
]
---

# Individualistic
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[
`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=\\c\prod_{i=1}^Nq(Z_i, Y_i(0), Y_i(1))^{X_i}(1-q(Z_i, Y_i(0), Y_i(1)))^{1-X_i}\\\textrm{for }(\mathbf{X}, \mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))\in \mathbb{A}\\\textrm{ for some set }\mathbb{A}\textrm{ and 0 elsewhere}$$`
]

---

# Individualistic
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1[
2️⃣ in words: The probability of a full assignment vector is proportional to the product of the unit-level assignment probabilities. Writing the joint probability as a product requires that the assignments of individual units be independent of one another
]

---

# Individualistic

```r
# Unit-level assignment probabilities for units 1, 2, and 3
p1 <- 0.25
p2 <- 0.5
p3 <- 0.75
d <- d %>%
 mutate(assignment_probability = 
 p1 ^ unit_1 * (1 - p1)^(1 - unit_1) * 
 p2 ^ unit_2 * (1 - p2)^(1 - unit_2) * 
 p3 ^ unit_3 * (1 - p3)^(1 - unit_3)
 )
d
```

```
## # A tibble: 8 × 4
## unit_1 unit_2 unit_3 assignment_probability
## <int> <int> <int> <dbl>
## 1 0 0 0 0.0938
## 2 0 0 1 0.281 
## 3 0 1 0 0.0938
## 4 0 1 1 0.281 
## 5 1 0 0 0.0312
## 6 1 0 1 0.0938
## 7 1 1 0 0.0312
## 8 1 1 1 0.0938
```
---

# Individualistic

```r
d %>%
  filter(unit_1 == 1) %>%
  summarise(p1 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p1
## <dbl>
## 1 0.25
```

---
class: title title-1

# Individualistic

```r
d %>%
  filter(unit_2 == 1) %>%
  summarise(p2 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p2
## <dbl>
## 1 0.5
```

---

# Individualistic

```r
d %>%
  filter(unit_3 == 1) %>%
  summarise(p3 = sum(assignment_probability))
```

```
## # A tibble: 1 × 1
## p3
## <dbl>
## 1 0.75
```
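---

# Individualistic

The product form generalizes to any number of units. A sketch in base R, assuming the same unit-level probabilities as above (0.25, 0.5, 0.75) and the constant `$c = 1$`; `q`, `x`, and `p_x` are names introduced here.

```r
q <- c(0.25, 0.5, 0.75) # q(Z_i, Y_i(0), Y_i(1)) for units 1, 2, 3

# All 2^N assignment vectors, in the same row order as the slides
x <- as.matrix(expand.grid(unit_3 = 0:1, unit_2 = 0:1, unit_1 = 0:1)[, 3:1])

# P(X) = prod_i q_i^X_i * (1 - q_i)^(1 - X_i), computed one row at a time
p_x <- apply(x, 1, function(row) prod(q^row * (1 - q)^(1 - row)))

sum(p_x)         # the probabilities sum to 1
colSums(x * p_x) # recovers 0.25, 0.50, 0.75
```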

---

# Propensity score under individualistic


---

# Propensity Score
.footer[Imbens and Rubin (2015) Causal Inference]

.box-inv-1.medium[

`$$e(z)=\frac{1}{N(z)}\sum_{i: Z_i=z}P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$$`
]

---

# Propensity Score

.box-inv-1.medium[

`$$e(z) = \frac{1}{N(z)}\sum_{i:Z_i=z}q(Z_i, Y_i(0), Y_i(1))$$`
]

---
class: title title-1

# Exposure assignment

.box-inv-1[Understanding the **assignment mechanism** is crucial for understanding the causal effect]

---

# Probabilistic
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[
An assignment mechanism `$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))$` is probabilistic if the probability of assignment to the exposure for every unit `$i$` is strictly between zero and one
]

--
 
.box-1[

`$$0 < P_i(\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) < 1,\\ \textrm{for all possible }\mathbf{Z},\mathbf{Y}(0), \mathbf{Y}(1)\\ \textrm{for all }i=1,\dots,N$$`

]
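---

# Probabilistic

A quick check of the definition against two mechanisms from earlier slides. This is a sketch; `is_probabilistic()` is a hypothetical helper written for this illustration, not a function from any package.

```r
# A mechanism is probabilistic when every unit-level probability
# lies strictly between 0 and 1
is_probabilistic <- function(p_unit) all(p_unit > 0 & p_unit < 1)

is_probabilistic(c(0.5, 1, 0.5))     # FALSE: unit 2 is always exposed
is_probabilistic(c(0.25, 0.5, 0.75)) # TRUE: the individualistic example
```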

---

# Exposure assignment

.box-inv-1[Understanding the **assignment mechanism** is crucial for understanding the causal effect]

---
class: title title-1

# Unconfounded
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[
`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1))=P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}'(0), \mathbf{Y}'(1))\\\textrm{for all }\mathbf{X},\mathbf{Z},\mathbf{Y}(0),\mathbf{Y}(1),\mathbf{Y}'(0), \mathbf{Y}'(1)$$`
]

---

# Unconfounded
.footer[Imbens and Rubin (2015) Causal Inference]

.box-1[If the assignment mechanism is unconfounded, we can drop the potential outcomes from the assignment probability]

---
class: title title-1

# Unconfounded

```r
y_0 <- c(1, 1, 1)
y_1 <- c(2, 1, 2)
p <- (0.25 + 0.5 * (y_1 - y_0))
p
```

```
## [1] 0.75 0.25 0.75
```

---

# Unconfounded

```r
y_0 <- c(1, 1, 1)
y_1 <- c(2, 1, 2)
z <- c(1, 0, 1)
p <- (0.25 + 0.5 * z)
```
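---

# Unconfounded

One way to see the difference between the two versions of `p`: under the unconfounded version, changing the potential outcomes leaves the assignment probabilities untouched. A sketch, where `p_confounded()` and `p_unconfounded()` are hypothetical wrappers around the two formulas shown on these slides.

```r
z <- c(1, 0, 1)

# Depends on the potential outcomes through the unit-level effect y_1 - y_0
p_confounded <- function(y_0, y_1, z) 0.25 + 0.5 * (y_1 - y_0)
# Depends on the covariate only
p_unconfounded <- function(y_0, y_1, z) 0.25 + 0.5 * z

# Same covariates, two different sets of potential outcomes
p_confounded(c(1, 1, 1), c(2, 1, 2), z)   # 0.75 0.25 0.75
p_confounded(c(1, 1, 1), c(1, 1, 1), z)   # 0.25 0.25 0.25
p_unconfounded(c(1, 1, 1), c(2, 1, 2), z) # 0.75 0.25 0.75
p_unconfounded(c(1, 1, 1), c(1, 1, 1), z) # 0.75 0.25 0.75
```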

---
class: title title-1

# Under individualistic + unconfounded

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = c\prod_{i=1}^N q(Z_i)^{X_i}(1-q(Z_i))^{1-X_i}$$`

---

# Under individualistic + unconfounded

`$$P(\mathbf{X}|\mathbf{Z}, \mathbf{Y}(0), \mathbf{Y}(1)) = c\prod_{i=1}^N e(Z_i)^{X_i}(1-e(Z_i))^{1-X_i}$$`

---
class: title title-1

# Propensity Score

.box-inv-1.medium[
`$$e(z) = P(X = 1 | Z = z)$$`
]

.box-1[
The propensity score is a *balancing score*. Conditional on the propensity score, the distribution of `$\mathbf{Z}$` is the same for the exposed `$(X=1)$` and unexposed `$(X=0)$` units.
]

---

<figure>
<img src = "img/estimand.jpeg" width = "50%"></img>
</figure>
.footer[
Simon Grund on [Twitter](https://twitter.com/simongrund89/status/1085929122860359680?lang=bg)
]
---

# Average treatment effect estimand

`$$\tau=\frac{1}{N}\sum_{i=1}^N(Y_i(1)-Y_i(0))=\bar{Y}(1)-\bar{Y}(0)$$`


---

# Estimator for average treatment effect

---

# Unbiased?

---

# We can rewrite the estimator

`$$\frac{1}{N}\sum_{i=1}^N\left(\frac{Y_i(1)X_i}{N_e/{N} }-\frac{Y_i(0)(1-X_i)}{N_c/{N}}\right)$$`

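---

# We can rewrite the estimator

A sketch of where the rewritten form comes from. The slide defining `$\hat\tau$` has no formula in this deck, so the starting point below is an assumption: the difference in observed group means, written with the `$\bar{Y}^{obs}_e$` and `$\bar{Y}^{obs}_c$` notation used later. It also assumes `$N_e=\sum_i X_i$`, `$N_c=\sum_i (1-X_i)$`, and that the observed outcome is `$Y_i(1)$` when `$X_i=1$` and `$Y_i(0)$` when `$X_i=0$`.

`$$\hat\tau=\bar{Y}^{obs}_e-\bar{Y}^{obs}_c=\frac{1}{N_e}\sum_{i=1}^N Y_i(1)X_i-\frac{1}{N_c}\sum_{i=1}^N Y_i(0)(1-X_i)$$`

`$$=\frac{1}{N}\sum_{i=1}^N\left(\frac{Y_i(1)X_i}{N_e/N}-\frac{Y_i(0)(1-X_i)}{N_c/N}\right)$$`

The second line multiplies and divides each term by `$N$`, which is what lets `$\mathbb{E}_X[X_i]=N_e/N$` cancel the denominator later on.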
---

# What is random?

`$$\frac{1}{N}\sum_{i=1}^N\left(\frac{Y_i(1)X_i}{N_e/N }-\frac{Y_i(0)(1-X_i)}{N_c/N}\right)$$`


---

# What is random?

`$$\frac{1}{N}{\sum_{i=1}^N}\left(\frac{Y_i(1)\color{purple}{X_i}}{N_e/N}-\frac{Y_i(0)(1-\color{purple}{X_i})}{N_c/N}\right)$$`


---

# Randomized experiment

.box-inv-1[
`$$P_X(X_i = 1 | \mathbf{Y}(0), \mathbf{Y}(1)) = \mathbb{E}_X[X_i|\mathbf{Y}(0), \mathbf{Y}(1)] = N_e/N$$`
]

---

# Randomized experiment

.box-inv-1[
`$$P_X(X_i = 0 | \mathbf{Y}(0), \mathbf{Y}(1)) = \\\mathbb{E}_X[1 - X_i|\mathbf{Y}(0), \mathbf{Y}(1)] = N_c/N$$`
]

---

# Randomized experiment: unbiased

`$$\mathbb{E}_X[\hat\tau|\mathbf{Y}(0),\mathbf{Y}(1)]=\\\frac{1}{N}\sum_{i=1}^N\left(\frac{\mathbb{E}_X[X_i]Y_i(1)}{N_e/N}-\frac{\mathbb{E}_X[1-X_i]Y_i(0)}{N_c/N}\right)$$`

---

# Randomized experiment: unbiased

`$$\mathbb{E}_X[\hat\tau|\mathbf{Y}(0),\mathbf{Y}(1)]=\\\frac{1}{N}\sum_{i=1}^N\left(\frac{(N_e/N)Y_i(1)}{N_e/N}-\frac{(N_c/N)Y_i(0)}{N_c/N}\right)$$`

---

# Randomized experiment: unbiased

`$$\mathbb{E}_X[\hat\tau|\mathbf{Y}(0),\mathbf{Y}(1)]=\\\frac{1}{N}\sum_{i=1}^N\left(Y_i(1)-Y_i(0)\right)=\tau$$`
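---

# Randomized experiment: unbiased

Unbiasedness can be checked exactly on a small example by enumerating every assignment. This sketch assumes a completely randomized experiment with `$N = 4$` units and `$N_e = 2$` exposed, so each of the `choose(4, 2) = 6` assignments is equally likely; the potential outcomes are made-up numbers.

```r
y0 <- c(1, 3, 2, 5) # hypothetical potential outcomes under control
y1 <- c(2, 3, 6, 4) # hypothetical potential outcomes under exposure
n <- length(y0)
n_e <- 2

tau <- mean(y1 - y0) # the estimand

# Each column of combn() lists the exposed units in one possible assignment
exposed_sets <- combn(n, n_e)
tau_hat <- apply(exposed_sets, 2, function(exposed) {
  x <- as.integer(seq_len(n) %in% exposed)
  y_obs <- ifelse(x == 1, y1, y0)
  sum(y_obs * x) / sum(x) - sum(y_obs * (1 - x)) / sum(1 - x)
})

tau           # 1
mean(tau_hat) # also 1: the average over all assignments equals tau
```

Any single assignment can give an estimate far from `$\tau$`; it is the average over the randomization distribution that matches.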

---

# Observational Study, one binary confounder
---
class: title title-1

# We can rewrite the estimand

`$$\tau_{Z=1}=\frac{1}{N_{Z=1}}\sum_{i:Z_i = 1}(Y_i(1)-Y_i(0))$$`


---
class: title title-1

# We can rewrite the estimand

`$$\tau = \frac{N_{Z=1}}{N}\tau_{Z=1} + \frac{N_{Z=0}}{N}\tau_{Z=0}$$`


`$$\tau = \frac{1}{N}\left[\sum_{i:Z_i=1}(Y_i(1)-Y_i(0)) + \sum_{i:Z_i=0}(Y_i(1)-Y_i(0))\right]$$`

---
class: title title-1

# Lucy Land Trial!

```r
library(tidyverse)

set.seed(1)

n <- 1000
meeple <- tibble(
 happy = sample(rep(c(1, 0), each = n / 2)),
 happiness = case_when(
 happy == 1 ~ rbinom(n, 5, 0.7),
 happy == 0 ~ rbinom(n, 3, 0.2)
 ),
 y0 = happiness,
 y1 = happiness
)
```
---

# Lucy Land Trial!

```r
set.seed(5)

d_lucy <- meeple %>%
 mutate(x = case_when(
 happy == 1 ~ rbinom(n, 1, 0.9), 
 happy == 0 ~ rbinom(n, 1, 0.1) 
 ), 
 y_obs = ifelse(x == 1, y1, y0))
```

---

# Lucy Land Trial!

```r
set.seed(5)

d_lucy <- meeple %>%
* mutate(x = case_when(
* happy == 1 ~ rbinom(n, 1, 0.9),
* happy == 0 ~ rbinom(n, 1, 0.1)
* ),
 y_obs = ifelse(x == 1, y1, y0))
```

---
class: section-title-1 title-1 middle

# The baseline indicator `happy` is a confounder for the exposure and the outcome
---

# Confounding

---

# Lucy Land Trial!

```r
d_lucy %>%
  summarise(true_causal_effect = mean(y1) - mean(y0))
```

```
## # A tibble: 1 × 1
## true_causal_effect
## <dbl>
## 1 0
```
---

# Lucy Land Trial!

```r
d_lucy %>%
  summarise(observed_causal_effect = 
              sum(y_obs * x) / sum(x) -
              sum(y_obs * (1 - x)) / sum(1 - x))
```

```
## # A tibble: 1 × 1
## observed_causal_effect
## <dbl>
## 1 2.22
```
---
class: title title-1

# Lucy Land Trial!

```r
d_lucy %>%
  summarise(observed_causal_effect = 
*             sum(y_obs * x) / sum(x) -
              sum(y_obs * (1 - x)) / sum(1 - x))
```

```
## # A tibble: 1 × 1
## observed_causal_effect
## <dbl>
## 1 2.22
```

.box-inv-1.medium[

`$\bar{Y}^{obs}_e$`
]
---

# Lucy Land Trial!

```r
d_lucy %>%
  summarise(observed_causal_effect = 
              sum(y_obs * x) / sum(x) - 
*             sum(y_obs * (1 - x)) / sum(1 - x))
```

```
## # A tibble: 1 × 1
## observed_causal_effect
## <dbl>
## 1 2.22
```

.box-inv-1.medium[

`$\bar{Y}^{obs}_c$`
]

---

# Lucy Land Trial!

```r
d_lucy %>%
  group_by(happy) %>%
  summarise(observed_causal_effect = 
              sum(y_obs * x) / sum(x) - 
              sum(y_obs * (1 - x)) / sum(1 - x))
```

```
## # A tibble: 2 × 2
## happy observed_causal_effect
## <dbl> <dbl>
## 1 0 -0.0347
## 2 1 0.0194
```

---

# Lucy Land Trial!

```r
d_lucy %>%
* group_by(happy) %>%
  summarise(observed_causal_effect = 
              sum(y_obs * x) / sum(x) - 
              sum(y_obs * (1 - x)) / sum(1 - x))
```

```
## # A tibble: 2 × 2
## happy observed_causal_effect
## <dbl> <dbl>
## 1 0 -0.0347
## 2 1 0.0194
```
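---

# Lucy Land Trial!

The stratum-specific estimates can be recombined into one overall estimate by weighting each by its share of the sample, mirroring `$\tau = \frac{N_{Z=1}}{N}\tau_{Z=1} + \frac{N_{Z=0}}{N}\tau_{Z=0}$`. A sketch in base R on freshly simulated data with the same design as above; because it does not reuse the slides' `meeple` object or seeds, the numbers will differ from those shown. The helper `diff_in_means()` is introduced here for illustration.

```r
set.seed(2024) # arbitrary seed, not the one used on the slides
n <- 1000
happy <- sample(rep(c(1, 0), each = n / 2))
# No causal effect: both potential outcomes equal baseline happiness
y0 <- ifelse(happy == 1, rbinom(n, 5, 0.7), rbinom(n, 3, 0.2))
y1 <- y0
# Exposure depends on the confounder: 0.9 if happy, 0.1 otherwise
x <- rbinom(n, 1, ifelse(happy == 1, 0.9, 0.1))
y_obs <- ifelse(x == 1, y1, y0)

diff_in_means <- function(y, x) mean(y[x == 1]) - mean(y[x == 0])

naive <- diff_in_means(y_obs, x)

# Difference in means within each level of the confounder
by_stratum <- sapply(c(0, 1), function(h) {
  diff_in_means(y_obs[happy == h], x[happy == h])
})
weights <- c(mean(happy == 0), mean(happy == 1))
adjusted <- sum(weights * by_stratum)

c(naive = naive, adjusted = adjusted)
```

The naive contrast should land well away from the true effect of 0, while the weighted stratum-specific contrast should sit close to it.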


---

# We can extend this to any number of confounders, but it gets harder to estimate

---

# What if instead we could condition on some balancing score?

---

# Balancing score

`$$X_i \perp Z_i | b(Z_i)$$`

.box-1[A balancing score `$b(Z_i)$` is a function of the covariates such that, given the balancing score, the probability of receiving the exposure no longer depends on the covariates]
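---

# Balancing score

The definition can be checked exactly on a toy example. A sketch, assuming a covariate with four values whose propensity score takes only two distinct values, so `$b(Z_i) = e(Z_i)$` is coarser than `$Z_i$`; all numbers and the helper `z_given_b()` are made up for illustration.

```r
p_z <- c(0.1, 0.3, 0.4, 0.2) # P(Z = z) for z = 1, 2, 3, 4
e_z <- c(0.2, 0.2, 0.7, 0.7) # propensity score e(z) = P(X = 1 | Z = z)

# Joint probabilities P(Z = z, X = 1) and P(Z = z, X = 0)
joint_exposed <- p_z * e_z
joint_control <- p_z * (1 - e_z)

# Distribution of Z within one level of the balancing score, for one group
z_given_b <- function(joint, b) {
  keep <- e_z == b
  joint[keep] / sum(joint[keep])
}

z_given_b(joint_exposed, 0.2) # 0.25 0.75
z_given_b(joint_control, 0.2) # 0.25 0.75: same distribution of Z in both groups
```

Within a level of `$e(Z_i)$` the exposure probability is constant, so it cancels out of the conditional distribution of `$Z_i$` in both groups.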

---