Types of Drift
Data drift: Input feature distributions change over time — population shifts, seasonal variations, upstream pipeline changes. Detected with KS tests and PSI (Population Stability Index).
Concept drift: The relationship between features and target variables changes — policy changes, market conditions, user behavior shifts. Detected through performance metrics and error rate monitoring.
Training-serving skew: Production feature distributions differ from training data. Unlike gradual drift, it can be present and detectable from day one if training data isn’t representative of production.
Label drift: Target variable distribution changes from annotation errors or labeling criteria shifts.
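The distribution comparisons behind data-drift and training-serving-skew detection can be sketched with the two-sample Kolmogorov–Smirnov statistic (the maximum gap between two empirical CDFs). This is a minimal pure-Python illustration; the sample data is made up:

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max distance between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

# Identical samples -> statistic 0; fully disjoint ranges -> statistic 1.
train = [0.1, 0.2, 0.3, 0.4, 0.5]
serving = [5.1, 5.2, 5.3, 5.4, 5.5]
print(ks_statistic(train, train))    # 0.0
print(ks_statistic(train, serving))  # 1.0
```

In practice a library routine such as `scipy.stats.ks_2samp` also returns a p-value, which is what you would actually threshold on.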
Detection Methods
Statistical: KL Divergence, KS Test, Wasserstein Distance, PSI
Data quality: Schema validation, cardinality changes, missing values
Performance: Error rates, F1/AUC-ROC, latency degradation
Business: Conversion rates, user engagement, A/B test results
Anomaly: Isolation Forest for outlier detection
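The divergence-based detectors above compare binned feature distributions. As a minimal sketch, KL divergence over histogram bins looks like this; the bin counts and the epsilon floor for empty bins are illustrative choices:

```python
import math

def kl_divergence(p_counts, q_counts, eps=1e-6):
    """KL(P || Q) over matching histogram bins, in nats; eps guards empty bins."""
    p_total = sum(p_counts)
    q_total = sum(q_counts)
    kl = 0.0
    for p_c, q_c in zip(p_counts, q_counts):
        p = max(p_c / p_total, eps)
        q = max(q_c / q_total, eps)
        kl += p * math.log(p / q)
    return kl

baseline = [50, 30, 15, 5]    # training-time bin counts
current = [20, 25, 30, 25]    # production bin counts
print(kl_divergence(baseline, baseline))  # 0.0 — identical distributions
print(kl_divergence(baseline, current))   # > 0 — distributions have diverged
```

Note KL divergence is asymmetric, which is one reason PSI (a symmetrized variant) is often preferred for monitoring.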
Drift Monitoring Tools
Google Vertex AI Model Monitoring: Built-in data drift and feature skew detection
etsi-watchdog (open source): Plug-in architecture for custom drift algorithms, rolling-window monitoring, Slack alerting
Key PSI thresholds to track:
PSI < 0.1 → No significant drift
0.1 ≤ PSI ≤ 0.2 → Moderate drift, investigate
PSI > 0.2 → Significant drift, retrain
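These bands come from comparing expected (training) and actual (production) bin proportions. A hedged sketch of the PSI calculation follows; the bin counts and the floor applied to empty bins are illustrative:

```python
import math

def psi(expected_counts, actual_counts, floor=1e-4):
    """Population Stability Index over matching histogram bins."""
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e_c, a_c in zip(expected_counts, actual_counts):
        e = max(e_c / e_total, floor)
        a = max(a_c / a_total, floor)
        total += (a - e) * math.log(a / e)
    return total

expected = [25, 25, 25, 25]   # training-time bin counts
actual = [40, 30, 20, 10]     # production bin counts
value = psi(expected, actual)
if value < 0.1:
    print(f"PSI={value:.3f}: no significant drift")
elif value <= 0.2:
    print(f"PSI={value:.3f}: moderate drift, investigate")
else:
    print(f"PSI={value:.3f}: significant drift, retrain")
```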
Not all drift requires action: Implement severity-based alerting focused on drift that impacts performance or business outcomes. A feature distribution shift that doesn’t affect predictions is noise, not signal.
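That policy can be encoded as a simple severity gate. This is a sketch under stated assumptions: the `DriftSignal` type and the PSI/AUC thresholds are hypothetical, and the point is only to route alerts by whether drift coincides with a performance impact:

```python
from dataclasses import dataclass

@dataclass
class DriftSignal:
    feature: str
    psi: float         # magnitude of the distribution shift
    auc_delta: float   # change in model AUC vs. baseline (negative = worse)

def severity(signal, psi_limit=0.2, auc_drop_limit=0.02):
    """Alert urgently only when drift co-occurs with a performance impact."""
    drifted = signal.psi > psi_limit
    impacted = signal.auc_delta < -auc_drop_limit
    if drifted and impacted:
        return "page"     # urgent: retrain or roll back
    if drifted or impacted:
        return "ticket"   # investigate during business hours
    return "ignore"       # noise, not signal

print(severity(DriftSignal("zip_code", psi=0.35, auc_delta=-0.04)))  # page
print(severity(DriftSignal("zip_code", psi=0.35, auc_delta=0.00)))   # ticket
print(severity(DriftSignal("zip_code", psi=0.05, auc_delta=0.00)))   # ignore
```

The middle case is the one the paragraph above warns about: a large distribution shift with no prediction impact gets a ticket, not a page.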