# causal-ddml

> Use when estimating causal effects with Double/Debiased Machine Learning. Triggers on DDML, double machine learning, debiased, cross-fitting, Chernozhukov, high-dimensional, partially linear, PLR, IRM, orthogonal score.

- Author: tangjia1986
- Repository: tangjia1986gz-lab/causal-ml-skills
- Version: 20260121163613
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/tangjia1986gz-lab/causal-ml-skills
- Web: https://mule.run/skillshub/@@tangjia1986gz-lab/causal-ml-skills~causal-ddml:20260121163613

---

---
name: causal-ddml
description: Use when estimating causal effects with Double/Debiased Machine Learning. Triggers on DDML, double machine learning, debiased, cross-fitting, Chernozhukov, high-dimensional, partially linear, PLR, IRM, orthogonal score.
---

# Estimator: Double/Debiased Machine Learning (DDML)

> **Version**: 2.0.0 | **Type**: Estimator | **Structure**: K-Dense
> **Aliases**: DDML, DML, Double ML, Debiased ML, Orthogonal ML

## Overview

Double/Debiased Machine Learning (DDML) estimates causal effects by combining machine learning methods for nuisance parameter estimation with debiased/orthogonal score functions that yield valid statistical inference. The method uses sample splitting (cross-fitting) to avoid overfitting bias while leveraging the flexibility of modern ML methods.

**Key Innovation**: DDML uses Neyman-orthogonal moment conditions that are insensitive (to first order) to estimation errors in nuisance functions, enabling the use of regularized/ML estimators without compromising valid inference.

**Primary Reference**: Chernozhukov, V., et al. (2018). Double/Debiased Machine Learning for Treatment and Structural Parameters. *The Econometrics Journal*, 21(1), C1-C68.

---

## Quick Reference

### When to Use

| Scenario | Recommendation |
|----------|----------------|
| High-dimensional controls (p >> n or many relevant controls) | Use DDML |
| Complex, nonlinear confounding relationships | Use DDML |
| Need valid inference with ML adjustment | Use DDML |
| Low-dimensional, simple relationships | Consider OLS instead |
| Small samples (n < 100) | Avoid - cross-fitting unreliable |
| Unconfoundedness clearly violated | Consider IV or DID instead |

### Model Selection

| Treatment Type | Recommended Model | Estimand |
|----------------|-------------------|----------|
| Continuous | PLR | Constant ATE |
| Binary (constant effect) | PLR | Constant ATE |
| Binary (heterogeneous) | IRM | ATE with heterogeneity |
| Endogenous + binary IV | IIVM | LATE |
| Endogenous + continuous IV | PLIV | Constant ATE |

### CLI Scripts

```bash
# Complete DDML analysis
python scripts/run_ddml_analysis.py --data data.csv --outcome y --treatment d

# Tune nuisance models
python scripts/tune_nuisance_models.py --data data.csv --outcome y --treatment d

# Generate diagnostic plots
python scripts/cross_fit_diagnostics.py --data data.csv --outcome y --treatment d

# Sensitivity analysis
python scripts/sensitivity_analysis.py --data data.csv --outcome y --treatment d

# Compare PLR vs IRM
python scripts/compare_estimators.py --data data.csv --outcome y --treatment d
```

---

## Directory Structure

```
causal-ddml/
├── SKILL.md                    # This file - main documentation
├── ddml_estimator.py           # Core estimation functions
├── references/                 # Detailed reference documentation
│   ├── identification_assumptions.md   # Neyman orthogonality, cross-fitting, rates
│   ├── diagnostic_tests.md             # Cross-fit diagnostics, nuisance quality
│   ├── estimation_methods.md           # PLR, IRM, IIVM, PLIV methods
│   ├── model_selection.md              # Learner selection, ensemble approaches
│   ├── reporting_standards.md          # Tables, CIs, robustness reporting
│   └── common_errors.md                # Pitfalls and how to avoid them
├── scripts/                    # Executable CLI tools
│   ├── run_ddml_analysis.py            # Complete analysis workflow
│   ├── tune_nuisance_models.py         # Automated hyperparameter tuning
│   ├── cross_fit_diagnostics.py        # Diagnostic visualization
│   ├── sensitivity_analysis.py         # Robustness checks
│   └── compare_estimators.py           # Model comparison
└── assets/                     # Templates and formatting
    ├── latex/
    │   └── ddml_table.tex              # LaTeX table templates
    └── markdown/
        └── ddml_report.md              # Analysis report template
```

---

## Identification Assumptions

> **Detailed Reference**: `references/identification_assumptions.md`

| Assumption | Description | Testable? |
|------------|-------------|-----------|
| **Unconfoundedness** | $(Y(0), Y(1)) \perp D \| X$ | No |
| **Overlap/Positivity** | $0 < P(D=1\|X) < 1$ | Yes |
| **Neyman Orthogonality** | Score insensitive to nuisance errors | By construction |
| **Rate Conditions** | $\|\hat{\ell} - \ell_0\| \cdot \|\hat{m} - m_0\| = o_P(n^{-1/2})$ | Partially |

### Key Insight: Product Rate Condition

DDML requires that the **product** of nuisance estimation errors decays faster than $n^{-1/2}$. This is weaker than requiring each to be $\sqrt{n}$-consistent, enabling use of regularized ML estimators.

---

## Workflow

```
+-------------------------------------------------------------+
|                    DDML ESTIMATOR WORKFLOW                    |
+-------------------------------------------------------------+
|  1. SETUP          -> Define Y, D, X (high-dimensional)       |
|  2. MODEL SELECTION-> Choose first-stage ML learners          |
|  3. CROSS-FITTING  -> K-fold sample splitting (K=5 typical)   |
|  4. ESTIMATION     -> PLR (Partially Linear) or IRM (Inter.)  |
|  5. INFERENCE      -> Debiased estimates + valid SEs          |
|  6. REPORTING      -> Tables with multiple ML specifications  |
+-------------------------------------------------------------+
```

### Phase 1: Setup

```python
from ddml_estimator import validate_ddml_setup, create_ddml_data

# Validate data structure
validation = validate_ddml_setup(
    data=df,
    outcome="y",
    treatment="d",
    controls=control_vars,
    n_folds=5
)

if not validation['is_valid']:
    raise ValueError(f"Validation failed: {validation['errors']}")
```

### Phase 2: Model Selection

> **Detailed Reference**: `references/model_selection.md`

```python
from ddml_estimator import select_first_stage_learners

# Auto-select best ML learners via cross-validation
best_learners = select_first_stage_learners(
    X=df[control_vars],
    y=df['y'],
    d=df['d'],
    cv_folds=5
)
```

**Learner Recommendations**:
| Scenario | Learner |
|----------|---------|
| Sparse, linear | Lasso, Elastic Net |
| Complex nonlinear | Random Forest, XGBoost |
| Very high-dimensional | Lasso + RF ensemble |
| Unknown structure | Compare multiple |

### Phase 3: Cross-Fitting

> **Detailed Reference**: `references/identification_assumptions.md` (Section 2)

```
K-Fold Cross-Fitting (K=5):
+--------------------------------------------------+
| Fold 1: Train on [2,3,4,5] -> Predict on [1]     |
| Fold 2: Train on [1,3,4,5] -> Predict on [2]     |
| ...                                               |
+--------------------------------------------------+
Result: Out-of-sample predictions for ALL observations
```

**Choosing K**:
| Sample Size | K | n/K |
|-------------|---|-----|
| n < 500 | 2-3 | ~150+ |
| 500-2000 | 5 | ~200+ |
| n > 2000 | 5-10 | ~200+ |

### Phase 4: Estimation

> **Detailed Reference**: `references/estimation_methods.md`

**PLR (Partially Linear Regression)**:
```python
from ddml_estimator import estimate_plr

result = estimate_plr(
    data=df,
    outcome="y",
    treatment="d",
    controls=control_vars,
    ml_l='lasso',      # E[Y|X] learner
    ml_m='lasso',      # E[D|X] learner
    n_folds=5
)
```

**IRM (Interactive Regression Model)**:
```python
from ddml_estimator import estimate_irm

result = estimate_irm(
    data=df,
    outcome="y",
    treatment="d",  # Must be binary
    controls=control_vars,
    ml_g='random_forest',     # E[Y|D,X] learner
    ml_m='logistic_lasso',    # P(D=1|X) learner
    n_folds=5,
    trimming_threshold=0.01   # Handle extreme propensities
)
```

### Phase 5: Inference

```python
print(f"Effect: {result.effect:.4f}")
print(f"SE: {result.se:.4f}")
print(f"95% CI: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")
print(f"P-value: {result.p_value:.4f}")
```

### Phase 6: Robustness

> **Detailed Reference**: `references/diagnostic_tests.md`

```python
from ddml_estimator import compare_learners

comparison = compare_learners(
    data=df,
    outcome="y",
    treatment="d",
    controls=control_vars,
    learner_list=['lasso', 'ridge', 'random_forest', 'xgboost']
)

print(comparison.summary_table)
print(f"Effect range: [{comparison.sensitivity['min_effect']:.4f}, "
      f"{comparison.sensitivity['max_effect']:.4f}]")
```

---

## PLR vs IRM Model Comparison

| Aspect | PLR | IRM |
|--------|-----|-----|
| Treatment Type | Any (continuous/binary) | Binary only |
| Effect Assumption | Constant | Allows heterogeneity |
| Estimand | ATE | ATE, ATTE |
| Nuisance Functions | E[Y\|X], E[D\|X] | E[Y\|D,X], P(D=1\|X) |
| Score | Partialling out | AIPW (doubly robust) |
| When to Use | Continuous D, constant effects | Binary D, heterogeneous effects |

---

## Diagnostic Tests

> **Detailed Reference**: `references/diagnostic_tests.md`

### Cross-Fitting Stability

```bash
python scripts/cross_fit_diagnostics.py --data data.csv --outcome y --treatment d --output plots/
```

Generates:
- `fold_variation.png` - Estimates by fold
- `repetition_stability.png` - Stability across repetitions
- `residuals.png` - Residual diagnostics
- `propensity_overlap.png` - Propensity distribution (IRM)
- `nuisance_performance.png` - Predicted vs actual

### Nuisance Model Quality

```python
# Check in result diagnostics
print(result.diagnostics['r2_y_given_x'])  # Outcome model R2
print(result.diagnostics['r2_d_given_x'])  # Treatment model R2
```

### Propensity Overlap (IRM)

```python
# Check propensity distribution
print(result.diagnostics['propensity_summary'])
# {min, max, mean, n_extreme_low, n_extreme_high}
```

---

## Common Errors

> **Detailed Reference**: `references/common_errors.md`

### 1. Not Using Cross-Fitting

```python
# WRONG: In-sample predictions
model.fit(X, y)
resid = y - model.predict(X)  # Overfitted!

# CORRECT: Cross-validated predictions
resid = y - cross_val_predict(model, X, y, cv=5)
```

### 2. IRM with Continuous Treatment

```python
# WRONG
result = estimate_irm(data, outcome, continuous_treatment, controls)

# CORRECT: Use PLR for continuous treatment
result = estimate_plr(data, outcome, continuous_treatment, controls)
```

### 3. Ignoring Propensity Overlap

```python
# ALWAYS check for extreme propensities with IRM
result = estimate_irm(..., trimming_threshold=0.01)
print(result.diagnostics['n_trimmed'])
```

### 4. Single Specification

```python
# WRONG: Report only one learner
result = estimate_plr(..., ml_l='lasso')

# CORRECT: Compare multiple specifications
comparison = compare_learners(..., learner_list=['lasso', 'rf', 'xgboost'])
```

### 5. Claiming Causality Without Justification

Always discuss:
1. Why unconfoundedness is plausible
2. What confounders are included
3. Potential omitted variables
4. Sensitivity to violations

---

## Reporting Standards

> **Detailed Reference**: `references/reporting_standards.md`
> **Template**: `assets/markdown/ddml_report.md`
> **LaTeX Table**: `assets/latex/ddml_table.tex`

### Minimum Reporting Requirements

1. **Model type**: PLR or IRM
2. **ML learners**: For each nuisance function
3. **Cross-fitting**: K folds, n repetitions
4. **Point estimate**: With SE and CI
5. **Sensitivity**: Range across specifications

### Example Table

```
| Specification | Effect | SE | 95% CI |
|---------------|--------|-----|--------|
| (1) Lasso | 0.082*** | 0.008 | [0.066, 0.098] |
| (2) RF | 0.079*** | 0.009 | [0.061, 0.097] |
| (3) XGBoost | 0.081*** | 0.008 | [0.065, 0.097] |
```

---

## DoubleML Package Integration

For production use, consider the `doubleml` package:

```python
import doubleml as dml
from doubleml import DoubleMLData, DoubleMLPLR

# Prepare data
dml_data = DoubleMLData(df, y_col='outcome', d_cols='treatment', x_cols=controls)

# Estimate
dml_plr = DoubleMLPLR(dml_data, ml_l=LassoCV(), ml_m=LassoCV(), n_folds=5)
dml_plr.fit()

print(dml_plr.summary)
```

**Documentation**: https://docs.doubleml.org/

---

## Examples

### Example 1: Returns to Education

```python
# High-dimensional controls for education-wage analysis
result = run_full_ddml_analysis(
    data=df,
    outcome="log_wage",
    treatment="years_education",
    controls=['age', 'age_sq', 'female', 'married', 'region_*', 'industry_*',
              'parents_education', 'test_score', 'family_income']
)

print(result.summary_table)
```

### Example 2: Job Training Program (Binary Treatment)

```python
# Compare PLR and IRM for binary treatment
result_plr = estimate_plr(data, 'earnings', 'training', controls)
result_irm = estimate_irm(data, 'earnings', 'training', controls)

print(f"PLR (constant effect): {result_plr.effect:.2f}")
print(f"IRM (heterogeneous): {result_irm.effect:.2f}")
```

---

## Mathematical Appendix

### Neyman-Orthogonal Score (PLR)

$$
\psi^{PLR}(W; \theta, \ell, m) = (Y - \ell(X) - \theta(D - m(X)))(D - m(X))
$$

Setting $E[\psi] = 0$:
$$
\hat{\theta} = \frac{\sum_i (Y_i - \hat{\ell}(X_i))(D_i - \hat{m}(X_i))}{\sum_i (D_i - \hat{m}(X_i))^2}
$$

### Asymptotic Distribution

Under regularity conditions:
$$
\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, \sigma^2)
$$

Where:
$$
\sigma^2 = \frac{E[\psi^2]}{(E[\partial_\theta \psi])^2}
$$

---

## References

### Seminal Papers
1. Chernozhukov, V., et al. (2018). Double/Debiased Machine Learning for Treatment and Structural Parameters. *The Econometrics Journal*, 21(1), C1-C68.
2. Chernozhukov, V., et al. (2017). Double/Debiased/Neyman Machine Learning of Treatment Effects. *AER P&P*, 107(5), 261-265.

### Extensions
3. Chernozhukov, V., et al. (2022). Locally Robust Semiparametric Estimation. *Econometrica*, 90(4), 1501-1535.
4. Semenova, V., & Chernozhukov, V. (2021). Debiased Machine Learning of CATE. *The Econometrics Journal*, 24(2), 264-289.

### Software
5. DoubleML (Python/R): https://docs.doubleml.org/
6. EconML (Python): https://econml.azurewebsites.net/

---

## Related Skills

| Skill | When to Use Instead |
|-------|---------------------|
| `estimator-ols` | Low-dimensional, simple relationships |
| `estimator-psm` | Explicit propensity matching desired |
| `estimator-iv` | Unconfoundedness violated, instrument available |
| `estimator-did` | Panel data, staggered treatment |
| `causal-forest` | Focus on treatment effect heterogeneity |