How-to

This page is a gallery of example notebooks that demonstrate how to use CausalPy.

ANCOVA

Analysis of covariance (ANCOVA) is a simple linear model, typically with one continuous predictor (the covariate) and one categorical variable (for example, treatment versus control group). In the context of this package, ANCOVA is useful for pre-post treatment designs, with or without random assignment. The approach is similar to difference in differences, but it applies only when there is a single pre-treatment and a single post-treatment measure.
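
As a rough illustration of the model described above, the following sketch fits an ANCOVA by ordinary least squares with NumPy. It is not CausalPy's API: the data are simulated, and the variable names, sample size, and the assumed true effect of 1.5 are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
group = np.repeat([0.0, 1.0], n // 2)   # 0 = control, 1 = treated (simulated)
pre = rng.normal(10.0, 2.0, size=n)     # pre-treatment measure (the covariate)
true_effect = 1.5                       # assumed effect used to simulate the data
post = 2.0 + 0.8 * pre + true_effect * group + rng.normal(0.0, 0.5, size=n)

# Design matrix: intercept, covariate, group indicator.
X = np.column_stack([np.ones(n), pre, group])
coef, *_ = np.linalg.lstsq(X, post, rcond=None)
intercept, pre_slope, effect_estimate = coef

print(f"estimated treatment effect: {effect_estimate:.2f}")
```

The coefficient on the group indicator is the treatment effect adjusted for the pre-treatment measure.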

ANCOVA for pre/post treatment nonequivalent group designs

Difference in Differences

An analysis in which the treatment effect is estimated as the difference, between treatment conditions, in the change from pre-treatment to post-treatment observations.
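
The arithmetic can be sketched with four hypothetical cell means. The numbers below are invented for illustration, and the regression uses plain NumPy rather than CausalPy; it only shows that the interaction coefficient in a group-by-period model equals the difference of differences.

```python
import numpy as np

# Hypothetical cell means: one row per (group, period) combination.
group = np.array([0.0, 0.0, 1.0, 1.0])   # 0 = control, 1 = treated
post = np.array([0.0, 1.0, 0.0, 1.0])    # 0 = pre, 1 = post
y = np.array([10.0, 12.0, 11.0, 16.0])

# Difference of differences computed directly from the four means.
control_change = y[1] - y[0]             # 12 - 10 = 2
treated_change = y[3] - y[2]             # 16 - 11 = 5
did = treated_change - control_change    # 5 - 2 = 3

# Equivalent regression: the interaction coefficient is the same quantity.
X = np.column_stack([np.ones(4), group, post, group * post])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
interaction = coef[3]

print(f"difference in differences: {did:.1f}, interaction: {interaction:.1f}")
```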

Difference in Differences with pymc models
Banking dataset with a pymc model
Difference in Differences with scikit-learn models

Geographical lift testing

Geolift (geographical lift testing) is a method for measuring the causal impact of interventions in geographic regions. It combines synthetic control methods with difference-in-differences approaches to estimate treatment effects when interventions are applied to specific geographic areas.
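
A much simplified, non-Bayesian sketch of the idea follows. It assumes two hypothetical untreated regions, noise-free simulated sales, and an assumed 10% lift; the counterfactual for the treated region is an unconstrained least-squares combination of the control regions fitted on the pre-intervention period. None of this is CausalPy's own implementation.

```python
import numpy as np

t = np.arange(30.0)
n_pre = 20                                   # first 20 periods are pre-intervention
# Hypothetical sales in two untreated regions.
controls = np.column_stack([100.0 + 10.0 * np.sin(t / 2.0), 80.0 + 0.5 * t])
baseline = controls @ np.array([0.7, 0.5])   # treated region's sales without intervention
true_lift = 0.10                             # assumed 10% lift after the intervention
treated = baseline.copy()
treated[n_pre:] *= 1.0 + true_lift

# Fit weights on the pre-period only, then project them into the post-period.
weights, *_ = np.linalg.lstsq(controls[:n_pre], treated[:n_pre], rcond=None)
counterfactual = controls[n_pre:] @ weights

# Lift: total observed minus total counterfactual, relative to the counterfactual.
lift_pct = (treated[n_pre:] - counterfactual).sum() / counterfactual.sum()

print(f"estimated lift: {lift_pct:.1%}")
```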

Bayesian geolift with CausalPy
Multi-cell geolift analysis

Instrumental Variables Regression

A quasi-experimental design for estimating a treatment effect when endogeneity creates a risk of confounding between the treatment and the outcome. Instrumental variables help identify causal effects by using variables that affect treatment assignment but influence the outcome only through the treatment.
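
One common way to use an instrument is two-stage least squares, sketched here with NumPy on simulated data. The data-generating process (one instrument `z`, one unobserved confounder `u`, an assumed true effect of 2.0) and the helper `ols` are hypothetical and chosen only to make the bias of the naive regression visible; the notebooks listed in this section use Bayesian models instead.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)                    # instrument
u = rng.normal(size=n)                    # unobserved confounder
x = z + u + rng.normal(size=n)            # treatment, driven by instrument and confounder
true_effect = 2.0                         # assumed effect used to simulate the data
y = true_effect * x + 3.0 * u + rng.normal(size=n)

def ols(design, target):
    """Least-squares coefficients for target ~ design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

ones = np.ones(n)
# Naive regression of the outcome on the treatment is biased by the confounder.
ols_estimate = ols(np.column_stack([ones, x]), y)[1]

# Stage 1: predict the treatment from the instrument.
Z = np.column_stack([ones, z])
x_hat = Z @ ols(Z, x)
# Stage 2: regress the outcome on the predicted treatment.
iv_estimate = ols(np.column_stack([ones, x_hat]), y)[1]

print(f"naive OLS: {ols_estimate:.2f}, two-stage estimate: {iv_estimate:.2f}")
```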

Instrumental Variable Modelling (IV) with pymc models
Instrumental Regression and Justifying Instruments with pymc

Interrupted Time Series

A quasi-experimental design that uses time series methods to generate counterfactuals and estimate treatment effects. A series of observations is collected before and after a treatment, and the pre-treatment trend (or any time-series model) is used to predict what would have happened in the absence of treatment.
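
The counterfactual logic can be sketched with a linear trend fitted to the pre-treatment observations and extrapolated forward. The series, the treatment time, and the assumed effect of 4.0 are simulated for illustration, and a straight line stands in for whatever time-series model would be appropriate in practice.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(50.0)
treatment_time = 30                       # hypothetical intervention point
true_effect = 4.0                         # assumed level shift after treatment
y = 5.0 + 0.3 * t + rng.normal(0.0, 0.2, size=t.size)
y[treatment_time:] += true_effect

# Fit a linear trend using pre-treatment observations only.
pre = t < treatment_time
slope, intercept = np.polyfit(t[pre], y[pre], deg=1)

# Extrapolate the trend: this is the predicted outcome without treatment.
counterfactual = intercept + slope * t[~pre]
impact = y[~pre] - counterfactual
mean_impact = impact.mean()

print(f"mean post-treatment impact: {mean_impact:.2f}")
```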

Excess deaths due to COVID-19
Bayesian Interrupted Time Series
Interrupted Time Series (ITS) with scikit-learn models

Inverse Propensity Score Weighting

A method for estimating causal effects by weighting each observation by the inverse of the probability of the treatment it actually received, computed from its propensity score (the probability of receiving treatment given observed covariates). This adjusts for confounding by creating a pseudo-population in which treatment assignment is independent of observed covariates.
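
A minimal sketch of the weighting step follows, using simulated data with a single observed confounder and an assumed true effect of 2.0. The propensity scores are estimated with a small hand-written logistic regression (Newton's method) so that the example needs only NumPy, and the estimator shown is the normalised (Hajek) form; both are illustrative choices, not a description of how CausalPy does it.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
x = rng.normal(size=n)                                # observed confounder
treated = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))   # treatment more likely when x is high
true_effect = 2.0                                     # assumed effect used to simulate the data
y = 1.0 + true_effect * treated + 3.0 * x + rng.normal(size=n)

# Estimate propensity scores with logistic regression fitted by Newton's method.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])
    beta += np.linalg.solve(hessian, X.T @ (treated - p))
propensity = 1.0 / (1.0 + np.exp(-X @ beta))

# Normalised inverse-probability-weighted mean outcome in each arm.
w1 = treated / propensity
w0 = (1 - treated) / (1.0 - propensity)
ipw_estimate = (w1 * y).sum() / w1.sum() - (w0 * y).sum() / w0.sum()

# Unweighted comparison, which is confounded by x.
naive_estimate = y[treated == 1].mean() - y[treated == 0].mean()

print(f"naive: {naive_estimate:.2f}, weighted: {ipw_estimate:.2f}")
```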

The Paradox of Propensity Scores in Bayesian Inference
Inverse Propensity Score Weighting with pymc

Regression Discontinuity

A quasi-experimental design where treatment assignment is determined by a cutoff point along a running variable (e.g., test score, age, income). The treatment effect is estimated by comparing outcomes just above and below the cutoff, assuming units near the cutoff are similar except for treatment status.
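
The comparison around the cutoff can be sketched as a local linear regression with separate slopes on each side. The data are simulated, and the cutoff at 0, the bandwidth of 0.5, and the assumed jump of 2.0 are hypothetical values picked for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
cutoff = 0.0
running = rng.uniform(-1.0, 1.0, size=n)          # running variable
treated = (running >= cutoff).astype(float)       # sharp assignment at the cutoff
true_jump = 2.0                                   # assumed discontinuity
y = 1.0 + 0.5 * running + true_jump * treated + rng.normal(0.0, 0.3, size=n)

# Keep only observations within a bandwidth of the cutoff.
bandwidth = 0.5
near = np.abs(running - cutoff) <= bandwidth
centred = running[near] - cutoff
d = treated[near]

# Intercept, treatment indicator, slope, and a separate slope above the cutoff.
X = np.column_stack([np.ones(near.sum()), d, centred, d * centred])
coef, *_ = np.linalg.lstsq(X, y[near], rcond=None)
jump_estimate = coef[1]                           # discontinuity at the cutoff

print(f"estimated jump at the cutoff: {jump_estimate:.2f}")
```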

Sharp regression discontinuity with pymc models
Drinking age - Bayesian analysis
Sharp regression discontinuity with scikit-learn models
Drinking age with a scikit-learn model

Regression Kink Design

A variation of regression discontinuity where treatment affects the slope (rate of change) of the outcome with respect to the running variable, rather than causing a discrete jump. The treatment effect is identified by a change in the slope at the cutoff point.
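
A piecewise-linear regression is one simple way to express a change in slope without a jump. In this sketch the data are simulated, and the kink location at 0 and the assumed slope change of 1.5 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
kink = 0.0
running = rng.uniform(-1.0, 1.0, size=n)
above = (running >= kink).astype(float)
centred = running - kink
true_slope_change = 1.5                   # assumed change in slope at the kink
y = (1.0 + 0.5 * centred + true_slope_change * centred * above
     + rng.normal(0.0, 0.2, size=n))

# The outcome is continuous at the kink, so there is no separate intercept
# above it; only the slope is allowed to differ.
X = np.column_stack([np.ones(n), centred, centred * above])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
slope_change_estimate = coef[2]

print(f"estimated change in slope: {slope_change_estimate:.2f}")
```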

Regression kink design with pymc models

Synthetic Control

The synthetic control method is a statistical method for evaluating the effect of an intervention in comparative case studies. It constructs a weighted combination of control groups, against which the treated group is compared.
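
One way to build such a weighted combination is to require non-negative weights that sum to one and fit them to the pre-intervention period. The sketch below does this with projected gradient descent in NumPy. The three control series, the true weights, and the assumed effect of 3.0 are simulated, and the optimiser is an illustrative choice rather than the method used in the notebooks.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) == 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - 1.0)[0][-1]
    return np.maximum(v - (css[rho] - 1.0) / (rho + 1), 0.0)

t = np.arange(40.0)
n_pre = 30                                # first 30 periods are pre-intervention
# Hypothetical control units with different shapes over time.
controls = np.column_stack(
    [10.0 + 0.5 * t, 20.0 - 0.2 * t, 15.0 + 2.0 * np.sin(t / 3.0)]
)
true_weights = np.array([0.5, 0.3, 0.2])  # assumed weights used to simulate the data
true_effect = 3.0                         # assumed post-intervention effect
treated = controls @ true_weights
treated[n_pre:] += true_effect

# Minimise 0.5 * ||X w - y||^2 over the simplex using pre-period data only.
X_pre, y_pre = controls[:n_pre], treated[:n_pre]
step = 1.0 / np.linalg.norm(X_pre, 2) ** 2    # 1 / Lipschitz constant of the gradient
weights = np.full(3, 1.0 / 3.0)
for _ in range(5000):
    gradient = X_pre.T @ (X_pre @ weights - y_pre)
    weights = project_to_simplex(weights - step * gradient)

# Compare the treated unit with its synthetic control after the intervention.
synthetic = controls[n_pre:] @ weights
effect_estimate = (treated[n_pre:] - synthetic).mean()

print("weights:", np.round(weights, 3), "effect:", round(effect_estimate, 2))
```

Constraining the weights to the simplex keeps the synthetic control inside the range spanned by the control units, which avoids extrapolating beyond them.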

Synthetic control with pymc models
The effects of Brexit
Synthetic control with scikit-learn models