Module 4, Week 2: Deep IV & Causal Discovery

1. Introduction

This article covers two advanced topics: (1) using deep learning for instrumental variable estimation when treatment is endogenous and nonlinear, and (2) learning causal graph structures from observational data.

2. DeepIV: Deep Instrumental Variables

DeepIV (Hartford et al., 2017) extends instrumental variable regression to nonlinear, high-dimensional settings. Like two-stage least squares (2SLS), it proceeds in two stages, but each stage is fit with a neural network rather than a linear regression.

Stage 1: Model Treatment Distribution

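Learn p(W | Z, X), the conditional distribution of the treatment W given the instrument Z and covariates X. For a continuous treatment this is typically done with a mixture density network.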
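A mixture density network maps each (Z, X) to the weights, means, and standard deviations of a Gaussian mixture over W, and is trained by minimizing the negative log-likelihood of the observed treatments. The sketch below shows only that loss in NumPy; the network producing the parameters is left out, and the function name mdn_nll and the (n, K) array layout are illustrative assumptions rather than part of any DeepIV library.

```python
import numpy as np

def mdn_nll(logits, means, log_sigmas, w):
    """Mean negative log-likelihood of treatments w under a Gaussian mixture.

    logits, means, log_sigmas: arrays of shape (n, K), one row of mixture
    parameters per observation (hypothetical layout for this sketch).
    w: array of shape (n,) with the observed treatments.
    """
    # Log mixture weights via a numerically stable log-softmax.
    log_pi = logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)
    # Log density of each Gaussian component at the observed treatment.
    standardized = (w[:, None] - means) / np.exp(log_sigmas)
    log_normal = -0.5 * standardized**2 - log_sigmas - 0.5 * np.log(2 * np.pi)
    # Log of the weighted sum of component densities, then average over rows.
    log_mix = np.logaddexp.reduce(log_pi + log_normal, axis=1)
    return -log_mix.mean()

# Two observations, two components each (made-up parameter values).
logits = np.array([[0.0, 0.0], [1.0, -1.0]])
means = np.array([[-1.0, 1.0], [0.0, 2.0]])
log_sigmas = np.zeros((2, 2))
w_obs = np.array([0.5, 0.1])
print(mdn_nll(logits, means, log_sigmas, w_obs))
```

In a full implementation the three parameter arrays would be the outputs of a network taking (Z, X) as input, and this loss would be minimized by gradient descent.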

Stage 2: Estimate Treatment Effect

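Train an outcome network f(W, X) so that its average over treatments sampled from the stage 1 distribution matches the observed outcome: minimize Σᵢ (yᵢ - E[f(W, xᵢ)])², where the expectation is over W ~ p(W | zᵢ, xᵢ) and is approximated by Monte Carlo draws. Fitting Y ~ f(W, X) directly on the observed treatments would reproduce the confounding bias that the instrument is meant to remove.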
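The two stages can be sketched end to end on simulated data. To keep the example short and dependency-free, the neural networks are swapped for simple parametric stand-ins: stage 1 is a linear-Gaussian model of W given Z, and stage 2 is a quadratic in W fit by least squares on Monte Carlo averaged features. Covariates X are omitted. The data-generating process, function names, and sample sizes are all assumptions made for illustration; this is the DeepIV objective with simpler function classes, not the DeepIV implementation itself.

```python
import numpy as np

def simulate(n, rng):
    z = rng.normal(size=n)                    # instrument
    u = rng.normal(size=n)                    # unobserved confounder
    w = z + u + 0.5 * rng.normal(size=n)      # endogenous treatment
    # True structural effect of w is w - 0.5 * w**2; u also affects y.
    y = w - 0.5 * w**2 + 2.0 * u + 0.1 * rng.normal(size=n)
    return z, w, y

def features(w):
    # Basis for the outcome model: f(w) = theta0 + theta1 * w + theta2 * w**2.
    return np.stack([np.ones_like(w), w, w**2], axis=-1)

def fit_stage1(z, w):
    # Stage 1 stand-in: W | Z ~ Normal(a + b * Z, sigma**2).
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, w, rcond=None)
    sigma = np.std(w - design @ coef)
    return coef, sigma

def fit_stage2(z, y, coef, sigma, rng, n_draws=100):
    # Sample treatments from the fitted stage 1 distribution for each row.
    mean = coef[0] + coef[1] * z
    draws = mean[:, None] + sigma * rng.normal(size=(len(z), n_draws))
    # Monte Carlo estimate of E[features(W) | z], shape (n, 3).
    phi_bar = features(draws).mean(axis=1)
    # Minimize sum_i (y_i - theta . phi_bar_i)**2.
    theta, *_ = np.linalg.lstsq(phi_bar, y, rcond=None)
    return theta

rng = np.random.default_rng(0)
z, w, y = simulate(20_000, rng)
coef, sigma = fit_stage1(z, w)
theta_iv = fit_stage2(z, y, coef, sigma, rng)
theta_ols, *_ = np.linalg.lstsq(features(w), y, rcond=None)
print("two-stage estimate:", theta_iv)
print("naive regression:  ", theta_ols)
```

The naive regression of Y on the observed W absorbs the confounder into the linear coefficient, while the two-stage fit should land near the structural coefficients (0, 1, -0.5) up to sampling and Monte Carlo error.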

3. Causal Discovery from Data

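Causal discovery algorithms estimate a causal graph from observational data, usually a directed acyclic graph (DAG) or an equivalence class of DAGs that the data cannot tell apart. The main approaches are: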

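Constraint-based: PC, FCI (remove and orient edges using conditional independence tests; FCI also allows for latent confounders)
Score-based: GES, NOTEARS (optimize a score function over DAGs)
Functional causal models: ANM, LiNGAM (orient edges by exploiting cause-effect asymmetries, such as additive or non-Gaussian noise)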
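The conditional independence test at the heart of constraint-based methods can be illustrated with a partial correlation and Fisher z-test, a common choice for roughly linear-Gaussian data. The sketch below checks a simulated chain A → B → C, where A and C are dependent marginally but independent given B. The function names and the simulated chain are assumptions for illustration, and this shows a single test rather than the full PC algorithm.

```python
import math
import numpy as np

def partial_corr(data, i, j, cond=()):
    # Partial correlation of columns i and j given the columns in cond,
    # read off the inverse of the correlation matrix of those columns.
    idx = [i, j] + list(cond)
    precision = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])

def fisher_z_pvalue(data, i, j, cond=()):
    # Two-sided p-value for the null hypothesis of zero partial correlation.
    n = data.shape[0]
    r = partial_corr(data, i, j, cond)
    stat = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(stat) / math.sqrt(2))

# Simulated chain A -> B -> C (made-up coefficients).
rng = np.random.default_rng(1)
n = 5000
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(size=n)
c = 0.8 * b + rng.normal(size=n)
chain = np.column_stack([a, b, c])

p_marginal = fisher_z_pvalue(chain, 0, 2)            # A vs C
p_given_b = fisher_z_pvalue(chain, 0, 2, cond=(1,))  # A vs C given B
print("p-value, A vs C:        ", p_marginal)
print("p-value, A vs C given B:", p_given_b)
```

A constraint-based algorithm would drop the A-C edge once it finds a conditioning set, here {B}, for which the test fails to reject independence.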

4. DAG Learning with Neural Networks

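NOTEARS (Zheng et al., 2018) formulates DAG learning as a continuous optimization problem by replacing the combinatorial acyclicity requirement with a smooth equality constraint on a weighted adjacency matrix W. The original formulation assumes a linear structural equation model; later variants replace the linear model with neural networks.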

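minimize: (1/2n) ||X - XW||² + λ||W||₁
subject to: h(W) = tr(exp(W ⊙ W)) - d = 0

Here X is the n × d data matrix, W is the d × d weighted adjacency matrix (not the treatment W of Section 2), ||·||² is the squared Frobenius norm, ⊙ is the elementwise product, and exp is the matrix exponential. h(W) equals zero exactly when the graph encoded by W has no directed cycles, because tr((W ⊙ W)ᵏ) is positive only if the graph contains a closed walk of length k. A polynomial variant of the constraint, tr((I + αW ⊙ W)ᵈ) - d = 0 with α > 0, was proposed later in DAG-GNN (Yu et al., 2019).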
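The acyclicity function from the NOTEARS paper, h(W) = tr(exp(W ⊙ W)) - d, and the penalized least-squares score are easy to evaluate directly. The sketch below computes both in NumPy, using a truncated Taylor series for the trace of the matrix exponential so that no extra dependency is needed. The function names, the number of series terms, and the example matrices are assumptions for illustration; a real solver would also need the gradient and an augmented Lagrangian loop, which are omitted here.

```python
import numpy as np

def trace_expm(a, terms=30):
    # tr(exp(a)) = sum_k tr(a^k) / k!, truncated after `terms` terms.
    power = np.eye(a.shape[0])       # holds a^k / k!
    total = 0.0
    for k in range(terms):
        total += np.trace(power)
        power = power @ a / (k + 1)
    return total

def acyclicity(w):
    # h(W) = tr(exp(W * W)) - d, zero if and only if W encodes a DAG.
    return trace_expm(w * w) - w.shape[0]

def notears_score(x, w, lam):
    # (1/2n) ||X - XW||_F^2 + lam * ||W||_1, where w[i, j] weights edge i -> j.
    n = x.shape[0]
    resid = x - x @ w
    return 0.5 / n * np.sum(resid**2) + lam * np.abs(w).sum()

# A DAG (0 -> 1 -> 2) and a two-node cycle (0 <-> 1), with made-up weights.
w_dag = np.array([[0.0, 1.5, 0.0],
                  [0.0, 0.0, -2.0],
                  [0.0, 0.0, 0.0]])
w_cycle = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
print("h(DAG)   =", acyclicity(w_dag))
print("h(cycle) =", acyclicity(w_cycle))
```

Squaring the entries of W elementwise makes every term in the series non-negative, so cycles cannot cancel each other out and h(W) is strictly positive for any cyclic graph.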

5. Key Takeaways

✓ DeepIV handles nonlinear endogeneity with flexible neural approximations
✓ Causal discovery algorithms can learn graph structures from data
✓ NOTEARS enables continuous optimization for DAG learning

6. Next Week Preview

Module 5, Week 1: Time Series & Panel Data

We'll cover synthetic control methods, time-varying treatments, and marginal structural models for longitudinal causal inference.