Regression Discontinuity Design (RDD): An Emerging Tool for Causal Inference
Introduction
Determining causal relationships between variables is crucial for making informed decisions in almost any domain. Researchers and analysts frequently ask, “Does X affect Y?” Our first thought is often to estimate the coefficient with Ordinary Least Squares (OLS) regression. However, while OLS is straightforward and widely used, its estimates can be biased by endogeneity, for example when a relevant variable is omitted. Today, we will explore the Regression Discontinuity Design (RDD), a powerful emerging tool in causal inference that exploits a cutoff in a continuous variable, at which treatment assignment changes, to identify causal effects more reliably.
In this article, we’ll delve into the fundamental concepts of RDD, its application, and practical examples. Whether you’re an econometrician seeking to expand your toolkit or a non-economist exploring data analysis, this article will introduce you to this intriguing methodology.
Understanding RDD
The main idea behind RDD is that treatment assignment changes abruptly at a specific cutoff of a continuous variable, called the running variable. The assignment follows a known rule rather than a randomized experiment: those on one side of the threshold receive a “treatment,” while those on the other do not. This discontinuity allows us to infer the treatment’s impact by comparing observations near the threshold.
Note to remember: Those just above and just below the cutoff score are likely to be very similar in all respects except for receiving the treatment.
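To make the idea concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the variable names, the cutoff of 50, and the built-in jump of 5. It compares average outcomes in a narrow window on each side of the cutoff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a score (running variable) and a later outcome.
n = 50_000
cutoff = 50.0
true_effect = 5.0  # assumed jump at the cutoff, chosen for illustration

score = rng.uniform(0, 100, n)
treated = (score >= cutoff).astype(float)  # sharp rule: treated at or above the cutoff
outcome = 20 + 0.3 * score + true_effect * treated + rng.normal(0, 2, n)

# Compare mean outcomes in a narrow window on either side of the cutoff.
window = 1.0
just_below = outcome[(score >= cutoff - window) & (score < cutoff)]
just_above = outcome[(score >= cutoff) & (score < cutoff + window)]

naive_jump = just_above.mean() - just_below.mean()
print(f"Units just below: {len(just_below)}, just above: {len(just_above)}")
print(f"Difference in means within {window} of the cutoff: {naive_jump:.2f}")
```

The difference lands close to the simulated effect, but not exactly on it: because the outcome also rises with the score, a plain difference in means picks up a little of that slope (about 0.3 here). This is one reason RDD is usually estimated with a regression on each side of the cutoff rather than with raw averages.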
Assumptions:
- Continuity Assumption: In the absence of the treatment, the outcomes would progress smoothly across the cutoff. Any observed jump at the threshold can be attributed to the treatment itself.
- No Manipulation at the Threshold: Participants should not be able to manipulate their scores to intentionally fall on either side of the threshold. The concern is that some could shift their position relative to the cutoff to make sure they receive the treatment. One way to address this is a “donut” RDD, which excludes observations very close to the cutoff, where manipulation is most likely, and instead relies on those slightly further away but still near the threshold. Omitting the points immediately adjacent to the threshold aims to remove bias from strategic manipulation while keeping a reasonably large sample.
- Local Randomization: Near the threshold, treatment assignment should resemble random assignment, so that units on either side have similar observable and unobservable characteristics.
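The manipulation concern and the donut idea can be sketched with simulated data. The setup below is hypothetical: units with high unobserved `ability` that land within one point below the cutoff push their recorded score just over it. The helper `local_linear_jump` is an illustrative name, not a library function.

```python
import numpy as np

rng = np.random.default_rng(1)

n, cutoff, true_effect = 200_000, 50.0, 3.0
latent_score = rng.uniform(0, 100, n)
ability = rng.normal(0, 1, n)  # unobserved by the analyst

# Assumed manipulation: high-ability units within 1 point below the cutoff
# raise their recorded score by 1 point, landing just above it.
manipulates = (latent_score >= cutoff - 1) & (latent_score < cutoff) & (ability > 0)
score = np.where(manipulates, latent_score + 1.0, latent_score)

treated = (score >= cutoff).astype(float)
outcome = 10 + 0.2 * score + true_effect * treated + 2 * ability + rng.normal(0, 1, n)

# Crude density check: without manipulation these two counts would be similar.
n_below = int(np.sum((score >= cutoff - 1) & (score < cutoff)))
n_above = int(np.sum((score >= cutoff) & (score < cutoff + 1)))


def local_linear_jump(x, y, cutoff, bandwidth, donut=0.0):
    """Fit one line per side of the cutoff and return the gap between the two
    fitted values at the cutoff. Observations closer to the cutoff than
    `donut` are excluded."""
    dist = x - cutoff
    left = (dist >= -bandwidth) & (dist < -donut)
    right = (dist >= donut) & (dist < bandwidth)
    # np.polyfit returns [slope, intercept]; the intercept is the fit at dist = 0.
    left_at_cutoff = np.polyfit(dist[left], y[left], 1)[1]
    right_at_cutoff = np.polyfit(dist[right], y[right], 1)[1]
    return right_at_cutoff - left_at_cutoff


naive = local_linear_jump(score, outcome, cutoff, bandwidth=5.0)
donut = local_linear_jump(score, outcome, cutoff, bandwidth=5.0, donut=1.0)

print(f"Units just below / just above the cutoff: {n_below} / {n_above}")
print(f"Estimate with all observations: {naive:.2f}")
print(f"Donut estimate (|distance| < 1 dropped): {donut:.2f}")
```

In this simulation the pile-up just above the cutoff is visible in the counts, and the estimate that uses every observation overstates the effect because the manipulators differ in `ability`. Dropping the contaminated band brings the estimate back toward the simulated effect of 3, at the cost of extrapolating the fitted lines across the hole. In real data the width of the manipulated band is unknown, so the donut radius is itself a judgment call.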
Types of RDD
- Sharp RDD: Treatment is determined entirely by which side of a clear threshold a unit falls on. In this notation, D = 1 denotes units that receive the treatment and D = 0 those that do not.
- Fuzzy RDD: Not all individuals strictly adhere to the threshold-based assignment. Instead, the probability of treatment increases sharply near the threshold but does not fully determine assignment. An example is when an intervention is not universally implemented due to administrative errors.
This is a special case of Instrumental Variables (IV) estimation. We use two-stage least squares (2SLS), with a dummy for being above the threshold as the instrument for treatment.
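A fuzzy design can be sketched as a ratio of two jumps: the jump in the outcome divided by the jump in the probability of treatment. With one instrument (the above-cutoff dummy), the same window, and a separate linear trend on each side, this ratio should coincide with the 2SLS estimate. The data-generating process, the variable `reluctance`, and the helper `jump_at_cutoff` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

n, cutoff, true_effect, bandwidth = 200_000, 50.0, 2.0, 10.0
score = rng.uniform(0, 100, n)
above = score >= cutoff

# Hypothetical take-up: crossing the cutoff raises P(treated) from 0.2 to 0.8.
# `reluctance` is unobserved, drives take-up, and also lowers the outcome,
# so treated and untreated units are not directly comparable.
reluctance = rng.uniform(0, 1, n)
treated = (reluctance < np.where(above, 0.8, 0.2)).astype(float)
outcome = 10 + 0.1 * score + true_effect * treated - 4 * reluctance + rng.normal(0, 1, n)


def jump_at_cutoff(x, y, cutoff, bandwidth):
    """Gap at the cutoff between separate linear fits on each side."""
    dist = x - cutoff
    left = (dist >= -bandwidth) & (dist < 0)
    right = (dist >= 0) & (dist < bandwidth)
    left_at_cutoff = np.polyfit(dist[left], y[left], 1)[1]
    right_at_cutoff = np.polyfit(dist[right], y[right], 1)[1]
    return right_at_cutoff - left_at_cutoff


first_stage = jump_at_cutoff(score, treated, cutoff, bandwidth)    # jump in P(treated)
reduced_form = jump_at_cutoff(score, outcome, cutoff, bandwidth)   # jump in the outcome
fuzzy_estimate = reduced_form / first_stage

# For contrast: a naive treated-vs-untreated comparison inside the same window.
in_window = np.abs(score - cutoff) < bandwidth
naive = (outcome[in_window & (treated == 1)].mean()
         - outcome[in_window & (treated == 0)].mean())

print(f"First stage (jump in treatment probability): {first_stage:.2f}")
print(f"Reduced form (jump in outcome): {reduced_form:.2f}")
print(f"Fuzzy RDD estimate: {fuzzy_estimate:.2f}")
print(f"Naive treated-vs-untreated difference: {naive:.2f}")
```

The naive comparison is well above the simulated effect of 2 because take-up is driven by an unobserved factor that also affects the outcome. The ratio of jumps uses only the variation created by crossing the threshold, which is what the instrument is meant to isolate.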
Implementation
- Choosing a Bandwidth: The selection of the interval or “bandwidth” around the cutoff is crucial. Too wide a bandwidth brings in observations far from the cutoff that are less comparable, which can bias the estimate away from the isolated treatment effect, while too narrow a bandwidth limits the sample size and makes the estimate noisy.
- Graphical Analysis: A critical step involves plotting the outcome against the running variable near the threshold to confirm that any discontinuity appears where it is expected. If a noticeable jump or gap shows up where treatment does not change, it must be accounted for, so that changes in outcomes are attributed to the treatment rather than to other confounding factors.
- Functional Form of the Regression: The relationship between the running variable and outcome should be adequately modeled. Polynomial forms or other flexible functions can be used.
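The interaction between bandwidth and functional form can be explored with a small simulation. The sketch below assumes a curved relationship between the running variable and the outcome with a built-in jump of 2 at a cutoff of 0. The function `rd_estimate` is an illustrative helper, not a library routine.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data with a nonlinear trend and a jump of 2 at the cutoff.
n, cutoff, true_effect = 20_000, 0.0, 2.0
x = rng.uniform(-1, 1, n)
treated = (x >= cutoff).astype(float)
y = x + 5 * x**3 + true_effect * treated + rng.normal(0, 1, n)


def rd_estimate(x, y, cutoff, bandwidth, degree=1):
    """Fit a polynomial of the given degree on each side of the cutoff, using
    only observations within `bandwidth`, and return the gap between the two
    fits at the cutoff together with the number of observations used."""
    dist = x - cutoff
    left = (dist >= -bandwidth) & (dist < 0)
    right = (dist >= 0) & (dist <= bandwidth)
    # np.polyfit lists the highest power first, so the last entry is the
    # fitted value at dist = 0, i.e. at the cutoff.
    left_fit = np.polyfit(dist[left], y[left], degree)
    right_fit = np.polyfit(dist[right], y[right], degree)
    return right_fit[-1] - left_fit[-1], int(left.sum() + right.sum())


results = {}
for bandwidth in (1.0, 0.5, 0.2):
    for degree in (1, 3):
        results[(bandwidth, degree)] = rd_estimate(x, y, cutoff, bandwidth, degree)
        estimate, n_used = results[(bandwidth, degree)]
        print(f"bandwidth={bandwidth:.1f} degree={degree} n={n_used:6d} estimate={estimate:.2f}")
```

In this setup a straight line fitted over the full range misses the curvature and badly understates the jump. Narrowing the bandwidth or allowing a more flexible polynomial recovers something close to 2, but each does so with fewer effective observations or more parameters, so the estimates become noisier. Reporting results for several bandwidths and specifications, as in the loop above, is a simple way to show how sensitive the conclusion is to these choices.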
Coefficient Estimate: Local Average Treatment Effect (LATE)
In RDD, the estimated coefficient reflects the Local Average Treatment Effect (LATE). This concept emphasizes that RDD identifies the treatment effect specifically for individuals around the cutoff point. In other words, LATE captures the causal impact of the treatment on those individuals whose treatment status is determined by being just above or below the threshold.
RD-based estimates of the effect of the treatment on y are therefore local average treatment effects for people near the cutoff. In a fuzzy design they give, more specifically, the effect for compliers: people who get treated if they fall on one side of the cutoff but not if they fall on the other.
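It can help to write the two estimands down. The notation here is an assumption introduced for illustration: Y is the outcome, X the running variable, c the cutoff, D the treatment indicator, and Y(1), Y(0) the potential outcomes with and without treatment, with treatment assigned at or above the cutoff.

```latex
% Sharp design: D switches from 0 to 1 exactly at X = c.
% The second equality relies on the continuity assumption.
\tau_{\text{sharp}}
  = \lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]
  = E[\,Y(1) - Y(0) \mid X = c\,]

% Fuzzy design: the jump in the outcome is scaled by the jump in the
% probability of treatment, giving the effect for compliers at the cutoff.
\tau_{\text{fuzzy}}
  = \frac{\lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]}
         {\lim_{x \downarrow c} E[D \mid X = x] - \lim_{x \uparrow c} E[D \mid X = x]}
```

Both expressions are conditional on X = c, which is why the estimate is local: it says nothing directly about units far from the cutoff. In the sharp case the denominator of the second expression equals 1, so the two coincide.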
Robustness Check
Placebo Tests: Conduct tests on pseudo-cutoffs where no treatment effect is expected, ensuring that any observed discontinuity at the actual cutoff is due to the treatment.
In a placebo test, researchers select one or more points near the real cutoff but where no treatment is assigned (these are called “pseudo-cutoffs”). They then analyze the data around these pseudo-cutoffs to see if any discontinuity in outcomes is evident.
- Purpose: If a discontinuity is observed at these points, it suggests that there might be other factors influencing the outcome besides the treatment, indicating potential biases in the study design or unobserved confounding variables.
- Interpretation: If no discontinuity is observed at the pseudo-cutoffs, but there is a clear and significant jump at the actual treatment threshold, it strengthens the evidence that the observed treatment effect is real and attributable to the treatment itself.
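A placebo check along these lines can be sketched as follows. The data are simulated with a jump of 4 at a cutoff of 50, and the pseudo-cutoffs of 30 and 70 are arbitrary choices for illustration. The helper `local_linear_jump` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical sharp design: treatment at or above a score of 50.
n, cutoff, true_effect, bandwidth = 40_000, 50.0, 4.0, 10.0
score = rng.uniform(0, 100, n)
treated = (score >= cutoff).astype(float)
outcome = 5 + 0.1 * score + true_effect * treated + rng.normal(0, 1, n)


def local_linear_jump(x, y, cutoff, bandwidth):
    """Gap at `cutoff` between separate linear fits on each side."""
    dist = x - cutoff
    left = (dist >= -bandwidth) & (dist < 0)
    right = (dist >= 0) & (dist < bandwidth)
    left_at_cutoff = np.polyfit(dist[left], y[left], 1)[1]
    right_at_cutoff = np.polyfit(dist[right], y[right], 1)[1]
    return right_at_cutoff - left_at_cutoff


real_jump = local_linear_jump(score, outcome, cutoff, bandwidth)

# For each pseudo-cutoff, keep only observations on one side of the real
# cutoff so the true jump cannot leak into the placebo window.
placebo_jumps = {}
for pseudo in (30.0, 70.0):
    same_side = score < cutoff if pseudo < cutoff else score >= cutoff
    placebo_jumps[pseudo] = local_linear_jump(
        score[same_side], outcome[same_side], pseudo, bandwidth
    )

print(f"Jump at the real cutoff ({cutoff:.0f}): {real_jump:.2f}")
for pseudo, jump in placebo_jumps.items():
    print(f"Jump at pseudo-cutoff {pseudo:.0f}: {jump:.2f}")
```

Here the placebo jumps hover around zero while the jump at the real cutoff is close to the simulated effect. With real data the comparison should be made against standard errors rather than by eye, since a small nonzero placebo estimate is expected from noise alone.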
Conclusion
When implemented carefully, RDD offers insights across various fields like education, healthcare, and policy evaluation, providing a clear window into the causal relationship between variables. Whether you are an econometrician honing your skills or a non-economist exploring new analytical horizons, understanding RDD will bolster your ability to unlock causal insights from real-world data.
Further Reading and Resources
- Books: Angrist and Pischke’s “Mostly Harmless Econometrics”
- Paper: Lee and Lemieux’s “Regression Discontinuity Designs in Economics”
- Book chapter: the regression discontinuity chapter of Scott Cunningham’s “Causal Inference: The Mixtape”