The Google Paper
In 2015, D. Sculley and colleagues at Google published “Hidden Technical Debt in Machine Learning Systems” at NIPS (the conference now known as NeurIPS). The paper argued that ML systems have a special capacity for incurring technical debt: they carry all the maintenance problems of traditional software plus a set of ML-specific issues. Key debt categories: boundary erosion (no strict API contracts between components), entanglement (changing one feature affects all others, summarized as CACE: “Changing Anything Changes Everything”), hidden feedback loops (model predictions influence future training data), undeclared consumers (other systems silently depend on your model’s outputs), and data dependency debt (data dependencies are harder to track than code dependencies).
ML-Specific Debt
// Hidden Technical Debt (Sculley et al., 2015)
CACE Principle:
Changing Anything Changes Everything
→ Add 1 feature → all predictions shift
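A minimal NumPy sketch of CACE (my own illustration, not from the paper): fit a ridge-regularized linear model, then add a near-duplicate of an existing feature. The weight on the original feature shifts sharply even though its true relationship to the target never changed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y depends on two independent features.
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(scale=0.1, size=n)

def ridge(X, y, lam=10.0):
    # Ridge regression via the regularized normal equations.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Model A: two features; weights land near the true [2, 1].
w_before = ridge(np.column_stack([x1, x2]), y)

# Model B: add x3, a noisy copy of x1. The weight that used to sit
# entirely on x1 is now split between x1 and x3.
x3 = x1 + rng.normal(scale=0.1, size=n)
w_after = ridge(np.column_stack([x1, x2, x3]), y)

print("before:", w_before)  # roughly [2, 1]
print("after: ", w_after)   # x1's weight drops sharply
```

Nothing about x1 changed, yet its learned weight roughly halves: any downstream consumer of per-feature attributions sees every explanation shift.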
Feedback Loops:
Model predicts → user acts → new data
→ Model trains on influenced data
→ Predictions drift silently
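The loop above can be made concrete with a toy simulation (again mine, not the paper's): a greedy recommender that retrains only on the traffic it serves locks onto whichever item looks best first and never collects evidence about the better alternative.

```python
import numpy as np

rng = np.random.default_rng(1)

# True click-through rates; item 1 is actually better.
true_ctr = np.array([0.10, 0.12])

clicks = np.zeros(2)
shows = np.ones(2)               # pseudo-count to avoid divide-by-zero
for step in range(5000):
    est = clicks / shows         # "model": empirical CTR estimate
    item = int(np.argmax(est))   # always serve the current best guess
    shows[item] += 1
    clicks[item] += rng.random() < true_ctr[item]

est = clicks / shows
# Item 1 is never shown after initialization, so its estimate never
# updates: the model trains only on data its own predictions created.
print(est, shows)
```

The fix in practice is some form of exploration (e.g. epsilon-greedy serving) so the training data is not purely a product of the model's own choices.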
Data Dependencies:
- Unstable data sources
- Underutilized features
- Legacy features no one removes
- Correlated features masking bugs
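One way to hunt for underutilized and legacy features, sketched here on synthetic data (an illustration under my own assumptions, not a recipe from the paper), is leave-one-feature-out ablation: retrain without each feature and see how much the error moves.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: only the first two of five features carry signal.
n, d = 400, 5
X = rng.normal(size=(n, d))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

def mse(X, y):
    # Least-squares fit, then training mean squared error.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ w
    return float(r @ r) / len(y)

base = mse(X, y)
# Leave-one-feature-out ablation: drop each column, refit, and record
# how much the error rises. Features whose removal barely moves the
# error are candidate dependencies to prune.
deltas = [mse(np.delete(X, j, axis=1), y) - base for j in range(d)]
print(deltas)  # large for features 0 and 1, near zero for the rest
```

A feature that survives ablation with near-zero impact still costs a data dependency: an upstream pipeline that can break, change schema, or drift.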
Configuration Debt:
Hyperparams, thresholds, feature flags
→ Often more lines than model code
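One mitigation, sketched below with hypothetical names (not an API from the paper), is to gather hyperparameters, thresholds, and flags into a single typed, validated object instead of scattering them across the codebase, so a bad value fails loudly at load time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServingConfig:
    # Hypothetical serving knobs; names are illustrative only.
    score_threshold: float = 0.5
    max_candidates: int = 100
    feature_flags: tuple = ("use_embeddings",)

    def __post_init__(self):
        # Validate on construction so a bad config cannot ship silently.
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be in [0, 1]")
        if self.max_candidates <= 0:
            raise ValueError("max_candidates must be positive")

cfg = ServingConfig(score_threshold=0.7)

try:
    ServingConfig(score_threshold=2.0)
except ValueError as err:
    print("rejected bad config:", err)
```

Versioning such a config object alongside the model makes configuration reviewable and testable like any other code, which is the paper's prescription for configuration debt.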
Key insight: The paper’s most famous diagram shows that ML code is a tiny box surrounded by massive boxes for data collection, verification, feature extraction, serving infrastructure, monitoring, and configuration. MLOps addresses every one of those surrounding boxes.