Category: Machine learning
Validation
Validation measures model performance on data that was not used for training, to estimate how well the model will generalize.
Also known as: model evaluation, testing
Expanded definition
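Validation data should be independent of the training data. In Earth observation (EO), information can easily leak between the two through time, location, or preprocessing choices, for example when normalization statistics are computed over the full dataset before it is split.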
A good validation strategy often separates by geography and time, not just random pixels, because nearby pixels are highly correlated.
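As a rough illustration of separating by geography, the sketch below assigns whole square blocks of pixels to either the training or the validation set, so that neighboring pixels from the same block never end up on both sides. The function names, the block size of 64 pixels, and the 25% validation fraction are assumptions chosen for the example, not recommendations. It also leaves no buffer between adjacent blocks, which a real strategy may need.

```python
import random


def block_id(x, y, block_size):
    """Map a pixel coordinate to the id of the square block that contains it."""
    return (x // block_size, y // block_size)


def spatial_block_split(pixels, block_size=64, test_fraction=0.25, seed=0):
    """Split pixels so every block lands wholly in train or wholly in test.

    pixels: list of (x, y) integer coordinates (hypothetical input layout).
    """
    # Collect the distinct blocks, then shuffle them reproducibly.
    blocks = sorted({block_id(x, y, block_size) for x, y in pixels})
    rng = random.Random(seed)
    rng.shuffle(blocks)

    # Hold out a fraction of the blocks, not a fraction of the pixels.
    n_test = max(1, round(len(blocks) * test_fraction))
    test_blocks = set(blocks[:n_test])

    train = [p for p in pixels if block_id(*p, block_size) not in test_blocks]
    test = [p for p in pixels if block_id(*p, block_size) in test_blocks]
    return train, test


# Illustrative grid of sampled pixel locations over a 256 x 256 scene.
pixels = [(x, y) for x in range(0, 256, 8) for y in range(0, 256, 8)]
train, test = spatial_block_split(pixels)
```

The same idea extends to time by holding out whole acquisition periods instead of, or in addition to, whole blocks.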
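When monitoring systems are evaluated, temporal integrity also matters. Retrospective reconstructions, which can draw on data that only became available after the date being evaluated, can report better performance than the system would achieve in real-time deployment.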
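One way to respect temporal integrity during evaluation is to filter inputs by the date they became available, not the date they were acquired. The sketch below assumes each observation carries hypothetical `acquired` and `available` fields. The records and values are made up for illustration only.

```python
from datetime import date


def inputs_available_on(observations, target_date):
    """Keep only observations that were available on or before target_date."""
    return [o for o in observations if o["available"] <= target_date]


# Made-up records: the third was acquired before the target date
# but was only delivered afterwards.
observations = [
    {"acquired": date(2023, 6, 1), "available": date(2023, 6, 3), "value": 0.41},
    {"acquired": date(2023, 6, 9), "available": date(2023, 6, 11), "value": 0.38},
    {"acquired": date(2023, 6, 10), "available": date(2023, 6, 20), "value": 0.35},
]

usable = inputs_available_on(observations, date(2023, 6, 12))
```

Here a retrospective evaluation that filtered on `acquired` would use all three records for 12 June 2023, while a real-time system would have had only the first two.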
Related terms
Temporal Integrity
Temporal integrity means an output for a given date is built only from information available on or before that date.
Ground Truth
Ground truth is reference information about real conditions on the ground used to train or validate models.
Training Data
Training data is the labeled examples used to fit a machine learning model.