
Data Drift

The change in the statistical properties of input data over time, which leads to degraded model performance in production.

Understanding Data Drift

Data drift refers to changes in the statistical distribution of the data an AI model encounters in production compared to the data it was trained on. Because machine learning models learn patterns from historical data, they implicitly assume that future data will follow similar distributions. When this assumption breaks (due to changing customer behavior, market conditions, seasonal patterns, or upstream system changes), model predictions become less accurate. Data drift is one of the most common causes of silent AI model degradation in production environments: the model keeps returning predictions without raising errors, so the loss of accuracy goes unnoticed unless it is monitored.

Types of Drift

Covariate drift occurs when input feature distributions change while the relationship between features and targets remains stable. Concept drift involves changes in the underlying relationship between inputs and outputs — what the model should predict given certain inputs evolves over time. Prior probability drift happens when the distribution of target classes changes. Virtual drift affects input distributions without impacting model performance. Each type requires different detection methods and remediation strategies. Gradual drift occurs slowly over time, while sudden drift results from ab
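
To make the difference between covariate drift and concept drift concrete, the toy simulation below is a minimal sketch under stated assumptions: a single Gaussian input feature, a threshold labeling rule, and a frozen model that learned the original threshold. The names `true_label`, `frozen_model`, and `simulate`, along with all parameter values, are hypothetical and chosen only for illustration.

```python
import random
import statistics


def true_label(x, threshold):
    """Ground-truth rule: the target is 1 when x exceeds the threshold."""
    return int(x > threshold)


def frozen_model(x):
    """A model 'trained' when the true threshold was 0.0 and never updated."""
    return int(x > 0.0)


def simulate(input_mean, true_threshold, n=5000, seed=0):
    """Draw n inputs, label them with the current rule, and score the frozen model."""
    rng = random.Random(seed)
    xs = [rng.gauss(input_mean, 1.0) for _ in range(n)]
    ys = [true_label(x, true_threshold) for x in xs]
    correct = sum(frozen_model(x) == y for x, y in zip(xs, ys))
    return {
        "input_mean": statistics.fmean(xs),
        "positive_rate": sum(ys) / n,
        "accuracy": correct / n,
    }


# Training-time conditions: inputs centered at 0, rule threshold at 0.
baseline = simulate(input_mean=0.0, true_threshold=0.0)

# Covariate drift: the inputs move, the input-to-target rule stays the same.
covariate = simulate(input_mean=1.5, true_threshold=0.0)

# Concept drift: the inputs are unchanged, the input-to-target rule moves.
concept = simulate(input_mean=0.0, true_threshold=1.0)

for name, result in [("baseline", baseline), ("covariate", covariate), ("concept", concept)]:
    print(name, result)
```

In this toy setup the covariate scenario shifts the input mean but leaves accuracy unchanged, because the frozen model happens to match the true rule exactly; that is the situation described above as virtual drift. It also raises the share of positive targets, which illustrates prior probability drift. The concept scenario leaves the input distribution identical to the baseline while accuracy falls, so monitoring inputs alone would not reveal it.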

Enterprise Drift Management

Implement automated drift detection using statistical tests — Kolmogorov-Smirnov for numerical features, chi-squared for categorical features, and population stability index for overall distribution comparison. Set alert thresholds calibrated to your business impact tolerance. Establish automated retraining triggers when drift exceeds acceptable levels. Maintain reference datasets that represent expected distributions and update them as business conditions evolve. Build dashboards that track drift metrics alongside model performance metrics, enabling teams to correlate performance degradation
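
As a rough illustration of the detection and trigger steps described above, the sketch below computes a population stability index and a two-sample Kolmogorov-Smirnov statistic for one numerical feature, then maps the PSI value to an action. It uses only the Python standard library; in practice a library routine such as `scipy.stats.ks_2samp` would normally replace the hand-written KS function. The function names, the equal-width binning, and the 0.1 and 0.25 thresholds (a widely used rule of thumb, not something stated in this entry) are assumptions and should be calibrated to your own tolerance for business impact.

```python
import math
import random


def psi(reference, current, bins=10):
    """Population stability index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp outliers into the edge bins
        eps = 1e-6  # floor for empty bins so the logarithm stays defined
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    i = j = 0
    gap = 0.0
    while i < len(ref) and j < len(cur):
        x = min(ref[i], cur[j])
        while i < len(ref) and ref[i] <= x:
            i += 1
        while j < len(cur) and cur[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(ref) - j / len(cur)))
    return gap


def drift_action(psi_value, alert_at=0.1, retrain_at=0.25):
    """Map a PSI value to an action; thresholds are assumed defaults."""
    if psi_value >= retrain_at:
        return "retrain"
    if psi_value >= alert_at:
        return "alert"
    return "ok"


# Synthetic data standing in for a stored reference dataset and two production windows.
rng = random.Random(42)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
stable_window = [rng.gauss(0.0, 1.0) for _ in range(2000)]
drifted_window = [rng.gauss(1.0, 1.0) for _ in range(2000)]

for name, window in [("stable", stable_window), ("drifted", drifted_window)]:
    score = psi(reference, window)
    print(name, round(score, 3), round(ks_statistic(reference, window), 3), drift_action(score))
```

A scheduled job could run a check like this per feature against the maintained reference dataset and log the scores next to model performance metrics, so that a drop in accuracy can be compared against the drift scores recorded for the same period.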