Law of Iterated Expectations

class: center middle main-title section-title-1

# Law of Iterated Expectations

.class-info[

**Session 9**

.light[STA 379/679: Causal Inference <br>
Lucy D'Agostino McGowan
]

]

---

class: title title-1

# What is expectation?

.box-1.medium[
.huge[
`$$E[X]$$`
]
]

---

class: title title-1

# What is expectation

.box-1[In the discrete case, an expected value is just a weighted average of all possible values of the discrete random variable]

--
<br>
.box-1[

`$$E[X] = \sum_xxP(X=x)$$`

]
---

class: title title-1

# What is expectation

.box-1[How do we define expectation for continuous random variables?]

--

.box-1[

`$$E[X] = \int_{-\infty}^\infty xf(x)dx$$`

]

--
<br>
.box-inv-1[
What is `\(f(x)\)`?]

---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_xxP(X=x|Y=y)$$`

]

---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_{\color{orange}x}xP(X=x|Y=y)$$`

]

---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_{\color{orange}x}\color{orange}xP(X=x|Y=y)$$`

]

---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_{\color{orange}x}\color{orange}x\color{orange}{P(X=x|Y=y)}$$`

]
---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_{x}xP(X=x|Y=y)$$`
]

<br>

.box-1[
`$$E[X|Y] = \int_{-\infty}^\infty x f(x|y)dx$$`
]

---

class: title title-1

# What is a conditional expectation?

.box-1[

`$$E[X|Y] = \sum_{x}xP(X=x|Y=y)$$`
]

<br>

.box-1[
`$$E[X|Y] = \int_{-\infty}^\infty x \color{orange}{f(x|y)}dx$$`
]
---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[

`$$\Large E[X]=E[E[X|Y]]$$`
]

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[

`$$\Large E_\color{orange}X[X]=E_\color{orange}Y[E[X|Y]]$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$\Large E[X]=E[E[X|Y]]$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=E[\color{orange}{E[X|Y]}]$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=E\left[\color{orange}{\sum_x xP(X=x|Y=y)}\right]$$`
]

--

.box-inv-1[Definition of expectation]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=\color{orange}E\left[\sum_x xP(X=x|Y=y)\right]$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\color{orange}{\sum_y}\sum_x xP(X=x|Y=y)\color{orange}{P(Y=y)}$$`
]

--
<br>
.box-inv-1[Definition of expectation]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\sum_y\sum_x x\color{orange}{P(X=x|Y=y)P(Y=y)}$$`
]

--
<br>

.box-1[

Per Bayes! `$$P(X|Y)P(Y) = P(X,Y)$$`

]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\sum_y\sum_x x\color{orange}{P(X=x, Y=y)}$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\color{orange}{\sum_x \sum_y}xP(X=x, Y=y)$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\sum_x x \sum_yP(X=x, Y=y)$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$E[X]=\sum_x x \color{orange}{\sum_yP(X=x, Y=y)}$$`
]

--
<br>
.box-1[
Probability! `\(\sum_yP(X, Y)=P(X)\)`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=\sum_x x \color{orange}{P(X = x)}$$`
]

--
<br>

.box-1[Definition of expectations]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=E[X]$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[Suppose `\(X\)` and `\(Y\)` are jointly continuous with a joint density function `\(f(x, y)\)` and marginal densities `\(f_x(x), f_y(y)\)`]

<br>

.box-1.medium[

`$$E[X]=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1.medium[

`$$E[X]=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy$$`
]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$\begin{align}E[X]&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy\\&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x|y)f_y(y)dxdy\end{align}$$`
]

--

.box-1[Why?]

---

class: title title-1

# Law of Iterated Expectations Proof

.box-1[

`$$\begin{align}E[X]&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy\\&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x|y)f_y(y)dxdy\\&=\int_{-\infty}^\infty \left\{\int_{-\infty}^\infty xf(x|y)dx\right\}f_y(y)dy\end{align}$$`
]

--

.box-1[What is in the brackets?]

---
class: title title-1

# Law of Iterated Expectations Proof

.box-1.small[

`$$\begin{align}E[X]&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy\\&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x|y)f_y(y)dxdy\\&=\int_{-\infty}^\infty \left\{\int_{-\infty}^\infty xf(x|y)dx\right\}f_y(y)dy\\&=\int_{-\infty}^\infty E[X|Y]f_y(y)dy\end{align}$$`
]
---
class: title title-1

# Law of Iterated Expectations Proof

.box-1.small[

`$$\begin{align}E[X]&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x,y)dxdy\\&=\int_{-\infty}^\infty \int_{-\infty}^\infty xf(x|y)f_y(y)dxdy\\&=\int_{-\infty}^\infty \left\{\int_{-\infty}^\infty xf(x|y)dx\right\}f_y(y)dy\\&=\int_{-\infty}^\infty E[X|Y]f_y(y)dy\\&=E[E[X|Y]]\end{align}$$`
]

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[
Practical use: if you know something about the conditional distribution `\(X|Y\)`, and you know something about `\(Y\)`, you can learn about `\(X\)`
]

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[Practical Example]

.box-inv-1.medium[We want to know the average exam grade for the class. We know:]

--
1. The average conditional on whether the student studied
2. The proportion of students who studied

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[Practical Example]

.box-inv-1.medium[We want to know the average exam grade for the class]

`$$E[\textrm{exam grade}] = E[E[\textrm{exam grade | study status}]]$$`
---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[Practical Example]

.box-inv-1.medium[We want to know the average exam grade for the class]

`$$E[\textrm{exam grade}] = \\\sum_{\textrm{study status}}E[\textrm{exam grade | study status}]P(\textrm{study status})$$`

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[Practical Example]

.box-inv-1.medium[We want to know the average exam grade for the class]

`$$E[\textrm{exam grade}] = \\E[\textrm{exam grade | studied}]P(\textrm{studied})+\\E[\textrm{exam grade | didn't study}]P(\textrm{didn't study})$$`

---

class: title title-1

# Law of Iterated Expectations

.box-1.medium[Practical Example]

.box-inv-1.medium[We want to know the average exam grade for the class]
<br>
.box-1[
`\(E[\textrm{exam grade}] = 95 \times 0.8 + 70\times 0.2 = 90\)`
]

---

class: title title-1

# Example

Let's flip a coin twice, heads is 1, tails is 0. Let `\(X\)` be the sum of the two flips, and `\(Y\)` be the maximum. We can draw a table of the joint and marginal distributions:

`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75

---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

1. Using the table on the previous slide, calculate `\(E[Y]\)`
2. Calculate `\(E[Y|X = 0]\)`
3. Calculate `\(E[Y|X = x]\)` for each possible value `\(x\)` of `\(X\)`
4. Calculate `\(E[E[Y|X=x]]\)`

<br><br>
## https://bit.ly/sta-679-s22-ae7

---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75

.box-1[
`$$E[Y] = \sum_y y P(Y = y)$$`
]

---
class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75

.box-1[
`$$E[Y] = 0 \times 0.25 + 1 \times 0.75 = 0.75$$`

]

---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75

.box-1[
`$$E[Y|X = 0] = 0 \times 1 + 0 \times 0 = 0$$`
]
---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

.pull-left[
`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75
]

.pull-right[
x | E[Y &#124; X = x]
---|--------
0 | `\(0\times 1 + 1\times 0 = 0\)`
1 | `\(0 \times 0 + 1 \times 1 = 1\)`
2 | `\(0 \times 0 + 1 \times 1 = 1\)`

]

---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

.pull-left[
`\(p_{X,Y}(x,y)\)` | | |
----------------|--|-|
 | y = 0 | y = 1 | `\(p_X(x)\)`
x = 0 | 0.25 | 0 | 0.25
x = 1 | 0 | 0.5 | 0.5
x = 2 | 0 | 0.25 | 0.25
`\(p_Y(y)\)`|0.25 | 0.75
]

.pull-right[
x | E[Y &#124; X = x]
---|--------
0 | `\(0\times 1 + 1\times 0 = 0\)`
1 | `\(0 \times 0 + 1 \times 1 = 1\)`
2 | `\(0 \times 0 + 1 \times 1 = 1\)`

]

.box-1[
`$$E[E[Y|X=x]] = 0 \times 0.25 + 1 \times 0.5 + 1 \times 0.25 = 0.75$$`
]
---

class: title title-1

# <svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> Application Exercise

```r
flip2 <- rbinom(100000, 1, 0.5)
flip1 <- rbinom(100000, 1, 0.5)
x <- flip2 + flip1
y = pmax(flip2, flip1)
mean(y)
mean(x)
mean(y[x == 0])
mean(y[x == 1])
mean(y[x == 2])
mean(y[x == 0]) * mean(x == 0) + 
  mean(y[x == 1]) * mean(x == 1) + 
  mean(y[x == 2]) * mean(x == 2)
```