# Evaluating your propensity score model

**Session 14**


---
class: title title-1

# Checking balance

.box-1.medium[Love Plots]

.box-1.medium[eCDF plots]

---
class: title title-1

# Standardized Mean Difference (SMD)

`$$\LARGE d = \frac{\bar{z}_{exposed}-\bar{z}_{unexposed}}{\sqrt{\frac{s^2_{exposed}+s^2_{unexposed}}{2}}}$$`
--
<br>
.box-1[Rule of thumb: SMD < 0.1]
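As a quick check of the formula, here is a minimal sketch that computes the SMD by hand in base R. The function name `smd_by_hand()` and the toy vectors are made up for illustration; they are not part of the `smd` package or the course data.

```r
# Hypothetical helper: SMD for one variable, following the formula above.
smd_by_hand <- function(z, exposure) {
  z_exposed   <- z[exposure == 1]
  z_unexposed <- z[exposure == 0]
  # Pooled standard deviation: square root of the average group variance.
  pooled_sd <- sqrt((var(z_exposed) + var(z_unexposed)) / 2)
  (mean(z_exposed) - mean(z_unexposed)) / pooled_sd
}

# Toy data (made up): three exposed and three unexposed units.
z        <- c(2, 4, 6, 1, 3, 5)
exposure <- c(1, 1, 1, 0, 0, 0)

d <- smd_by_hand(z, exposure)
d
```

Both groups have variance 4, so the pooled standard deviation is 2 and d = (4 - 3) / 2 = 0.5, well above the 0.1 rule of thumb.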
---

# Customer satisfaction data

```r
library(tidyverse)
library(broom)

glm(satisfied_customer_service ~ income + married + n_kids + has_pets + age + 
      education + gender + former_customer_service + previous_spend, 
    family = binomial(),
    data = customer_satisfaction) %>%
  augment(data = customer_satisfaction, type.predict = "response") %>%
  mutate(atm_wts = pmin(.fitted, 1 - .fitted) / 
           (satisfied_customer_service * .fitted + 
              (1 - satisfied_customer_service) * (1 - .fitted))
  ) -> customer_satisfaction
```

---

# Customer satisfaction data

```r
library(tidyverse)
library(broom)

*glm(satisfied_customer_service ~ income + married + n_kids + has_pets + age +
*     education + gender + former_customer_service + previous_spend,
    family = binomial(),
    data = customer_satisfaction) %>%
  augment(data = customer_satisfaction, type.predict = "response") %>%
  mutate(atm_wts = pmin(.fitted, 1 - .fitted) / 
           (satisfied_customer_service * .fitted + 
              (1 - satisfied_customer_service) * (1 - .fitted))
  ) -> customer_satisfaction
```
---

# Customer satisfaction data

```r
library(tidyverse)
library(broom)

glm(satisfied_customer_service ~ income + married + n_kids + has_pets + age + 
      education + gender + former_customer_service + previous_spend, 
    family = binomial(),
    data = customer_satisfaction) %>%
  augment(data = customer_satisfaction, type.predict = "response") %>%
* mutate(atm_wts = pmin(.fitted, 1 - .fitted) /
*          (satisfied_customer_service * .fitted +
*             (1 - satisfied_customer_service) * (1 - .fitted))
  ) -> customer_satisfaction
```
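To see what the highlighted `mutate()` computes, here is a small sketch of the same weight formula applied to made-up values. The vectors `p` and `z` are hypothetical stand-ins for `.fitted` and `satisfied_customer_service`.

```r
# Made-up propensity scores and exposure indicators (not the course data).
p <- c(0.2, 0.5, 0.9)
z <- c(1, 0, 1)

# Denominator: the probability of the exposure each unit actually received.
prob_received <- z * p + (1 - z) * (1 - p)

# Numerator: the smaller of P(exposed) and P(unexposed).
atm_wts <- pmin(p, 1 - p) / prob_received
atm_wts
```

The third unit was exposed with a propensity score of 0.9, so it is down-weighted to 0.1 / 0.9. Units that received the less likely exposure keep a weight of 1.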
---

# SMD in R

```r
library(smd)
library(tidyverse)
df %>% 
  # w is optional
  summarise(smd = smd(confounder_1, 
                      exposure, 
                      w = wts)$estimate)
```

---
class: title title-1

# SMD in R

```r
library(smd)
library(tidyverse)
customer_satisfaction %>% 
  # w is optional
  summarize(smd = smd(income, 
                      satisfied_customer_service, 
                      w = atm_wts)$estimate)
```

```
## # A tibble: 1 × 1
##      smd
##    <dbl>
## 1 0.0297
```
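For intuition about what the `w` argument changes, here is a rough base R sketch of a weighted SMD in which the group means and variances are replaced with weighted versions. The helper `weighted_smd_sketch()` is hypothetical, and its variance formula (weighted sum of squares divided by the sum of the weights) is an assumption chosen for simplicity. The `smd` package may apply a different correction, so the numbers need not match it exactly.

```r
# Hypothetical helper: weighted SMD with a simple weighted variance.
weighted_smd_sketch <- function(z, exposure, w = rep(1, length(z))) {
  group_stats <- function(keep) {
    m <- weighted.mean(z[keep], w[keep])
    v <- sum(w[keep] * (z[keep] - m)^2) / sum(w[keep])
    c(mean = m, var = v)
  }
  s1 <- group_stats(exposure == 1)
  s0 <- group_stats(exposure == 0)
  unname((s1["mean"] - s0["mean"]) / sqrt((s1["var"] + s0["var"]) / 2))
}

# Toy data (made up): the weights pull both group means to 3.5.
z        <- c(2, 4, 6, 1, 3, 5)
exposure <- c(1, 1, 1, 0, 0, 0)
wts      <- c(2, 1, 1, 1, 1, 2)

unweighted <- weighted_smd_sketch(z, exposure)
weighted   <- weighted_smd_sketch(z, exposure, w = wts)
c(unweighted = unweighted, weighted = weighted)
```

In this toy example the weights were picked so that the weighted means coincide, which drives the weighted SMD to zero while the unweighted SMD stays well above 0.1.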

---

# SMD in R

```r
smds <- df %>% 
  summarise( 
    across( 
      c(confounder_1, confounder_2, ...), 
      list(
        unweighted = ~smd(.x, exposure)$estimate, 
        weighted = ~smd(.x, exposure, wts)$estimate 
      )
    )
  )
```
---

# SMD in R

```r
smds <- df %>% 
  summarise( 
    across( 
*     c(confounder_1, confounder_2, ...),
      list(
        unweighted = ~smd(.x, exposure)$estimate, 
        weighted = ~smd(.x, exposure, wts)$estimate 
      )
    )
  )
```

---

# SMD in R

```r
smds <- df %>% 
  summarise( 
    across( 
      c(confounder_1, confounder_2, ...), 
      list(
*       unweighted = ~smd(.x, exposure)$estimate,
*       weighted = ~smd(.x, exposure, wts)$estimate
      )
    )
  )
```

---

# SMD in R

```r
smds <- customer_satisfaction %>% 
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        unweighted = ~smd(.x, satisfied_customer_service)$estimate, 
        weighted = ~smd(.x, satisfied_customer_service, atm_wts)$estimate 
      )
    )
  )
```

---

# Application Exercise

.box-1[Fit a propensity score model for `satisfied_customer_service` (you can use the one you chose for Lab 2)]

.box-1[Calculate the standardized mean differences (unweighted, `ate`-weighted, and `ato`-weighted) for all of the variables (excluding the outcome, `next_spend`)]

---

# SMD in R

```r
smds
```

```
## # A tibble: 1 × 18
##   income_unweighted income_weighted married_unweighted married_weighted
##               <dbl>           <dbl>              <dbl>            <dbl>
## 1            0.0168          0.0297             0.0357           0.0350
## # … with 14 more variables: n_kids_unweighted <dbl>, n_kids_weighted <dbl>,
## #   has_pets_unweighted <dbl>, has_pets_weighted <dbl>, age_unweighted <dbl>,
## #   age_weighted <dbl>, education_unweighted <dbl>, education_weighted <dbl>,
## #   gender_unweighted <dbl>, gender_weighted <dbl>,
## #   former_customer_service_unweighted <dbl>,
## #   former_customer_service_weighted <dbl>, previous_spend_unweighted <dbl>,
## #   previous_spend_weighted <dbl>
```

---

# SMD in R

```r
plot_df <- smds %>% 
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"), 
    names_pattern = "(.*)_(.*)"
  )
```

---

# SMD in R

```r
plot_df <- smds %>% 
* pivot_longer(
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"), 
    names_pattern = "(.*)_(.*)"
  )
```

---

# SMD in R

```r
plot_df <- smds %>% 
  pivot_longer( 
*   everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"), 
    names_pattern = "(.*)_(.*)"
  )
```

---

# SMD in R

```r
plot_df <- smds %>% 
  pivot_longer( 
    everything(),
*   values_to = "SMD",
    names_to = c("variable", "Method"), 
    names_pattern = "(.*)_(.*)"
  )
```

---
class: title title-1

# SMD in R

```r
plot_df <- smds %>% 
  pivot_longer( 
    everything(),
    values_to = "SMD", 
*   names_to = c("variable", "Method"),
*   names_pattern = "(.*)_(.*)"
  )
```
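The pattern `(.*)_(.*)` works because the first group is greedy, so the split happens at the last underscore. Here is a quick base R check on one of the column names from `smds`. `sub()` is used only to illustrate the regular expression, on the assumption that `pivot_longer()` splits the names the same way.

```r
col_name <- "former_customer_service_unweighted"
pattern  <- "(.*)_(.*)"

# The first group captures everything before the last underscore.
variable <- sub(pattern, "\\1", col_name)
# The second group captures everything after it.
method   <- sub(pattern, "\\2", col_name)

c(variable = variable, Method = method)
```

This is why variable names that themselves contain underscores still end up intact in the `variable` column.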

---
class: title title-1

# SMD in R

```r
plot_df <- smds %>% 
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"),
    names_pattern = "(.*)_(.*)"
  ) %>%
  arrange(Method, abs(SMD)) %>%
  mutate(variable = fct_inorder(variable))
```
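The `arrange()` plus `fct_inorder()` step controls the order of the variables on the y-axis. Factor levels follow first appearance, so after sorting by `Method` and then `abs(SMD)`, the levels are ordered by the SMD of whichever method sorts first. Here is a base R sketch of the same idea on made-up values, using `factor(x, levels = unique(x))` as a stand-in for `fct_inorder()`:

```r
# Made-up SMDs for three variables under two methods.
toy <- data.frame(
  variable = c("age", "income", "n_kids", "age", "income", "n_kids"),
  Method   = rep(c("unweighted", "weighted"), each = 3),
  SMD      = c(0.34, 0.02, -0.20, 0.02, 0.03, 0.01)
)

# Sort by Method, then by absolute SMD (like arrange(Method, abs(SMD))).
toy <- toy[order(toy$Method, abs(toy$SMD)), ]

# Levels in order of first appearance (like fct_inorder()).
toy$variable <- factor(toy$variable, levels = unique(toy$variable))
levels(toy$variable)
```

The levels come out as `income`, `n_kids`, `age`, which is the ordering by unweighted absolute SMD; the weighted rows do not affect it because those variables have already appeared.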

---

# SMD in R

```r
plot_df
```

```
## # A tibble: 18 × 3
##    variable                Method             SMD
##    <fct>                   <chr>            <dbl>
##  1 gender                  unweighted  0.0161    
##  2 income                  unweighted  0.0168    
##  3 has_pets                unweighted -0.0285    
##  4 married                 unweighted  0.0357    
##  5 former_customer_service unweighted  0.0616    
##  6 n_kids                  unweighted -0.203     
##  7 age                     unweighted  0.340     
##  8 education               unweighted -0.664     
##  9 previous_spend          unweighted -1.02      
## 10 education               weighted    0.00000476
## 11 former_customer_service weighted    0.00869   
## 12 previous_spend          weighted    0.0138    
## 13 gender                  weighted    0.0148    
## 14 age                     weighted    0.0194    
## 15 has_pets                weighted    0.0221    
## 16 n_kids                  weighted    0.0226    
## 17 income                  weighted    0.0297    
## 18 married                 weighted    0.0350
```

---

# Love Plot

```r
ggplot(
  data = plot_df,
  aes(x = abs(SMD), y = variable, 
      group = Method, color = Method)
) +  
  geom_line(orientation = "y") +
  geom_point() + 
  geom_vline(xintercept = 0.1, 
             color = "black", size = 0.1)
```

---

# Love Plot

```r
ggplot(
  data = plot_df,
* aes(x = abs(SMD), y = variable,
*     group = Method, color = Method)
) +  
  geom_line(orientation = "y") +
  geom_point() + 
  geom_vline(xintercept = 0.1, 
             color = "black", size = 0.1)
```

---

# Love Plot

```r
ggplot(
  data = plot_df,
  aes(x = abs(SMD), y = variable, 
      group = Method, color = Method)
) +  
* geom_line(orientation = "y") +
  geom_point() + 
  geom_vline(xintercept = 0.1, 
             color = "black", size = 0.1)
```

---

# Love Plot

```r
ggplot(
  data = plot_df,
  aes(x = abs(SMD), y = variable, 
      group = Method, color = Method)
) +  
  geom_line(orientation = "y") +
* geom_point() +
  geom_vline(xintercept = 0.1, 
             color = "black", size = 0.1)
```

---

# Love Plot

```r
ggplot(
  data = plot_df,
  aes(x = abs(SMD), y = variable, 
      group = Method, color = Method)
) +  
  geom_line(orientation = "y") +
  geom_point() + 
* geom_vline(xintercept = 0.1,
*            color = "black", size = 0.1)
```
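The vertical line at 0.1 marks the rule-of-thumb threshold. The same check can be done numerically. Here is a short sketch using the unweighted SMDs printed earlier in `plot_df`, re-typed into a hypothetical named vector `unweighted_smds` so the block runs on its own:

```r
# Unweighted SMDs copied from the plot_df output shown earlier.
unweighted_smds <- c(
  gender = 0.0161, income = 0.0168, has_pets = -0.0285,
  married = 0.0357, former_customer_service = 0.0616,
  n_kids = -0.203, age = 0.340, education = -0.664,
  previous_spend = -1.02
)

# Variables whose absolute SMD exceeds the 0.1 threshold.
imbalanced <- names(unweighted_smds)[abs(unweighted_smds) > 0.1]
imbalanced
```

All of the weighted SMDs in that output are below 0.1, so the same check on the weighted values would return no variables.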
---

# Compare Matching

```r
library(MatchIt)
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
                        data = customer_satisfaction, caliper = 0.5) %>%
  get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```
---

# Compare Matching

```r
*library(MatchIt)
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
                        data = customer_satisfaction, caliper = 0.5) %>%
  get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```

---

# Compare Matching

```r
library(MatchIt)
*matched_smds <- matchit(satisfied_customer_service ~ income + married +
*                         n_kids + has_pets + age + education + gender +
*                         former_customer_service + previous_spend,
                        data = customer_satisfaction, caliper = 0.5) %>%
  get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```

---

# Compare Matching

```r
library(MatchIt)
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
*                       data = customer_satisfaction, caliper = 0.5) %>%
  get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```

---

# Compare Matching

```r
library(MatchIt)
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
                        data = customer_satisfaction, caliper = 0.5) %>%
* get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```

---

# Compare Matching

```r
library(MatchIt)
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
                        data = customer_satisfaction, caliper = 0.5) %>%
  get_matches() %>%
* summarise(
*   across(
*     c(income, married, n_kids, has_pets, age,
*       education, gender, former_customer_service, previous_spend),
*     list(
*       matched = ~smd(.x, satisfied_customer_service)$estimate
*     )
*   )
* )
```

---

# Compare Matching

```r
matched_smds
```

```
##   income_matched married_matched n_kids_matched has_pets_matched age_matched
## 1     0.03624997     0.006514001     -0.1074924       0.04472136   0.1654134
##   education_matched gender_matched former_customer_service_matched
## 1        -0.2177035      0.0959927                      0.02060323
##   previous_spend_matched
## 1             -0.4016459
```

---

# Compare Matching

```r
plot_df_all <- matched_smds %>%
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"),
    names_pattern = "(.*)_(.*)"
  ) %>%
  bind_rows(plot_df)
```

---

# Compare Matching

```r
*plot_df_all <- matched_smds %>%
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"),
    names_pattern = "(.*)_(.*)"
  ) %>%
  bind_rows(plot_df)
```

---

# Compare Matching

```r
plot_df_all <- matched_smds %>% 
* pivot_longer(
*   everything(),
*   values_to = "SMD",
*   names_to = c("variable", "Method"),
*   names_pattern = "(.*)_(.*)"
  ) %>%
  bind_rows(plot_df)
```

---

# Compare Matching

```r
plot_df_all <- matched_smds %>% 
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"),
    names_pattern = "(.*)_(.*)"
  ) %>%
* bind_rows(plot_df)
```

---

# Compare Matching

```r
plot_df_all <- matched_smds %>% 
  pivot_longer( 
    everything(),
    values_to = "SMD", 
    names_to = c("variable", "Method"),
    names_pattern = "(.*)_(.*)"
  ) %>%
  bind_rows(plot_df) %>%
  arrange(Method, abs(SMD)) %>%
  mutate(variable = fct_inorder(variable))
```

---

# Love Plot

```r
ggplot(
  data = plot_df_all,
  aes(x = abs(SMD), y = variable, 
      group = Method, color = Method)
) +  
  geom_line(orientation = "y") +
  geom_point() + 
  geom_vline(xintercept = 0.1,  
             color = "black", size = 0.1) 
```

.right-plot[
<img src="14-evaluating-propensity-score_files/figure-html/unnamed-chunk-41-1.png" width="504" style="display: block; margin: auto;" />
]

---

# What if we changed the caliper?

```r
matched_smds <- matchit(satisfied_customer_service ~ income + married +  
                          n_kids + has_pets + age + education + gender + 
                          former_customer_service + previous_spend, 
                        data = customer_satisfaction, caliper = 0.01) %>%
  get_matches() %>%
  summarise( 
    across( 
      c(income, married, n_kids, has_pets, age, 
        education, gender, former_customer_service, previous_spend), 
      list(
        matched.calp.01 = ~smd(.x, satisfied_customer_service)$estimate
      )
    )
  )
```
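To build intuition for why a tighter caliper changes the matched sample, here is a deliberately simplified sketch of greedy 1:1 nearest-neighbor matching on a propensity score. The function `greedy_caliper_match()` and the toy scores are hypothetical. This is not MatchIt's implementation, and it assumes the caliper is expressed in standard deviations of the propensity score (which, as far as I understand, is MatchIt's default behavior via `std.caliper = TRUE`).

```r
# Hypothetical, simplified matcher (not MatchIt's algorithm).
greedy_caliper_match <- function(ps, exposed, caliper) {
  width    <- caliper * sd(ps)       # caliper in SD units of the score
  treated  <- which(exposed == 1)
  controls <- which(exposed == 0)
  pairs <- data.frame(treated = integer(0), control = integer(0))
  for (i in treated) {
    if (length(controls) == 0) break
    dists <- abs(ps[controls] - ps[i])
    j <- which.min(dists)
    # Keep the pair only if the nearest control is within the caliper.
    if (dists[j] <= width) {
      pairs <- rbind(pairs, data.frame(treated = i, control = controls[j]))
      controls <- controls[-j]       # match without replacement
    }
  }
  pairs
}

# Made-up propensity scores: units 1-3 exposed, units 4-6 unexposed.
ps      <- c(0.30, 0.60, 0.90, 0.32, 0.55, 0.10)
exposed <- c(1, 1, 1, 0, 0, 0)

wide   <- greedy_caliper_match(ps, exposed, caliper = 0.5)
narrow <- greedy_caliper_match(ps, exposed, caliper = 0.1)
c(wide = nrow(wide), narrow = nrow(narrow))
```

With the wider caliper two pairs are kept; with the narrower one only the closest pair survives. Tightening the caliper drops pairs whose scores are not close enough, which tends to improve balance at the cost of a smaller matched sample.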

---

# Love Plot

.right-plot[
<img src="14-evaluating-propensity-score_files/figure-html/unnamed-chunk-45-1.png" width="504" style="display: block; margin: auto;" />
]

---
class: title title-1

# ECDF

For continuous variables, it can be helpful to compare the _whole_ distribution before and after weighting, rather than relying on a single summary measure such as the SMD.

---

# Unweighted ECDF

```r
ggplot(customer_satisfaction, 
       aes(x = previous_spend, group = satisfied_customer_service, 
           color = factor(satisfied_customer_service))) +
  stat_ecdf() +
  scale_color_manual("Satisfied with Customer Service", 
                     values = c("#5154B8", "#5DB854"),
                     labels = c("No", "Yes")) + 
  scale_x_continuous("Previous spending", label = scales::dollar) + 
  ylab("Proportion <= x") 
```

---

# Unweighted ECDF

```r
*ggplot(customer_satisfaction,
*      aes(x = previous_spend, group = satisfied_customer_service,
*          color = factor(satisfied_customer_service))) +
  stat_ecdf() +
  scale_color_manual("Satisfied with Customer Service", 
                     values = c("#5154B8", "#5DB854"),
                     labels = c("No", "Yes")) + 
  scale_x_continuous("Previous spending", label = scales::dollar) + 
  ylab("Proportion <= x") 
```

---

# Unweighted ECDF

```r
ggplot(customer_satisfaction, 
       aes(x = previous_spend, group = satisfied_customer_service, 
           color = factor(satisfied_customer_service))) +
* stat_ecdf() +
  scale_color_manual("Satisfied with Customer Service", 
                     values = c("#5154B8", "#5DB854"),
                     labels = c("No", "Yes")) + 
  scale_x_continuous("Previous spending", label = scales::dollar) + 
  ylab("Proportion <= x") 
```

---

# Unweighted ECDF

```r
ggplot(customer_satisfaction, 
       aes(x = previous_spend, group = satisfied_customer_service, 
           color = factor(satisfied_customer_service))) +
  stat_ecdf() +
* scale_color_manual("Satisfied with Customer Service",
*                    values = c("#5154B8", "#5DB854"),
*                    labels = c("No", "Yes")) +
  scale_x_continuous("Previous spending", label = scales::dollar) + 
  ylab("Proportion <= x") 
```

---

# Unweighted ECDF

```r
ggplot(customer_satisfaction, 
       aes(x = previous_spend, group = satisfied_customer_service, 
           color = factor(satisfied_customer_service))) +
  stat_ecdf() +
  scale_color_manual("Satisfied with Customer Service", 
                     values = c("#5154B8", "#5DB854"),
                     labels = c("No", "Yes")) + 
* scale_x_continuous("Previous spending", label = scales::dollar) +
  ylab("Proportion <= x") 
```
---
class: title title-1

# Weighted ECDF

.small[
<img src="14-evaluating-propensity-score_files/figure-html/unnamed-chunk-53-1.png" width="504" style="display: block; margin: auto;" />
]

---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
* filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```

---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
* arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```

---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
* mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct),
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
---

# Weighted ECDF
.small[

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
*ecdf_0 <- customer_satisfaction %>%
* filter(satisfied_customer_service == 0) %>%
* arrange(previous_spend) %>%
* mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0,
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
]
---
class: title title-1

# Weighted ECDF
.small[

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
*ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
]
---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
* geom_line(color = "#5DB854") +
  geom_line(data = ecdf_0, 
            aes(x = previous_spend, y = cum_pct), 
            color = "#5154B8") + 
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
---

# Weighted ECDF

```r
ecdf_1 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 1) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ecdf_0 <- customer_satisfaction %>%
  filter(satisfied_customer_service == 0) %>%
  arrange(previous_spend) %>%
  mutate(cum_pct = cumsum(atm_wts) / sum(atm_wts))
ggplot(ecdf_1, aes(x = previous_spend, y = cum_pct)) +
  geom_line(color = "#5DB854") +
* geom_line(data = ecdf_0,
*           aes(x = previous_spend, y = cum_pct),
*           color = "#5154B8") +
  xlab("Previous spending") + 
  ylab("Proportion <= x") 
```
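The two `mutate()` steps above repeat the same calculation, so it could be wrapped in a small helper. Here is a sketch in base R, where `weighted_ecdf()` is a hypothetical name and the inputs are made up:

```r
# Hypothetical helper: sort by x, then accumulate the normalized weights.
weighted_ecdf <- function(x, w) {
  ord <- order(x)
  data.frame(x = x[ord], cum_pct = cumsum(w[ord]) / sum(w))
}

# Toy input: the smallest value carries half of the total weight.
res <- weighted_ecdf(x = c(3, 1, 2), w = c(1, 2, 1))
res
```

The cumulative proportions are 0.5, 0.75, and 1. With all weights equal, this reduces to the ordinary ECDF evaluated at the observed points.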

---

# Iterate on your propensity score model!

.box-1.medium[Decrease your caliper if doing matching]

.box-1.medium[Allow for more degrees of freedom for variables that look imbalanced]

.box-inv-1.medium[Polynomial terms // splines]
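As one way to give a confounder more flexibility, a natural spline can replace the linear term in the propensity score model. Here is a minimal sketch on simulated data. The simulated `previous_spend`, the exposure model, and the choice of `df = 3` are illustrative assumptions, not the course data or a recommendation.

```r
library(splines)

set.seed(14)
n <- 500
previous_spend <- runif(n, 0, 100)

# Simulated exposure whose probability is nonlinear in spending.
p_true  <- plogis(-1 + 0.0008 * (previous_spend - 50)^2)
exposed <- rbinom(n, 1, p_true)

# Linear term versus a natural spline with 3 degrees of freedom.
fit_linear <- glm(exposed ~ previous_spend, family = binomial())
fit_spline <- glm(exposed ~ ns(previous_spend, df = 3), family = binomial())

# Number of estimated coefficients in each model.
c(linear = length(coef(fit_linear)), spline = length(coef(fit_spline)))
```

The spline model spends two extra coefficients on the same variable. After refitting, recompute the weights and SMDs (or the ECDF) to see whether balance on that variable improved.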

---