Category: Machine learning

Validation

Validation measures a model's performance on data that was not used for training, to estimate how well the model will generalize.

Also known as: model evaluation, testing

Expanded definition

Validation data should be independent of the training data. In Earth observation (EO), it is easy to leak information accidentally through time, location, or preprocessing choices.

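A good validation strategy often separates training and validation data by geography and time rather than by randomly sampled pixels. Nearby pixels are highly correlated, so a random split puts near-duplicates of training pixels into the validation set and overstates performance.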
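As a rough illustration of separating by geography, the sketch below assigns whole square blocks of pixels to either the training or the validation set, so that neighboring pixels never end up on both sides. The function name `spatial_block_split`, the `x` and `y` keys, and the block size are assumptions made for this example, not part of any particular library.

```python
import math
import random
from collections import defaultdict


def spatial_block_split(samples, block_size, test_fraction=0.25, seed=0):
    """Split samples so that every spatial block is entirely train or entirely test.

    samples: list of dicts with hypothetical "x" and "y" coordinate keys.
    block_size: side length of a square block, in the same units as x and y.
    """
    # Group samples by the block that contains them.
    blocks = defaultdict(list)
    for s in samples:
        key = (math.floor(s["x"] / block_size), math.floor(s["y"] / block_size))
        blocks[key].append(s)

    # Shuffle block IDs reproducibly, then hold out a fraction of whole blocks.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = keys[:n_test]
    train_keys = keys[n_test:]

    train = [s for k in train_keys for s in blocks[k]]
    test = [s for k in test_keys for s in blocks[k]]
    return train, test


# Usage: a 10 x 10 grid of pixels split into 5 x 5 blocks.
samples = [{"x": x, "y": y} for x in range(10) for y in range(10)]
train, test = spatial_block_split(samples, block_size=5)
```

The same idea can be applied along the time axis by holding out whole acquisition periods instead of individual dates. Pixels on either side of a block boundary are still adjacent, so a buffer between training and validation blocks may be worth adding in practice.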

When monitoring systems are evaluated, temporal integrity also matters. Retrospective reconstructions, which can draw on data that only became available after the date being evaluated, can inflate performance relative to real-time deployment.
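The gap between retrospective and real-time evaluation can be sketched with a simple availability filter. In this example the `available` and `value` keys are hypothetical, and the mean stands in for whatever model produces the monitoring output; the point is only that the estimate for a date is built from observations available on or before that date.

```python
from datetime import date


def observations_available(observations, as_of):
    """Keep only observations that were available on or before `as_of`."""
    return [o for o in observations if o["available"] <= as_of]


def realtime_estimate(observations, as_of):
    """Estimate for `as_of` using only information available by that date.

    The mean is a placeholder for a real model.
    Returns None when nothing was available yet.
    """
    usable = observations_available(observations, as_of)
    if not usable:
        return None
    return sum(o["value"] for o in usable) / len(usable)


# Hypothetical observations; the third one arrives after the evaluation date.
observations = [
    {"available": date(2024, 1, 1), "value": 1.0},
    {"available": date(2024, 1, 10), "value": 3.0},
    {"available": date(2024, 2, 1), "value": 8.0},
]

evaluation_date = date(2024, 1, 15)
realtime = realtime_estimate(observations, evaluation_date)

# A retrospective reconstruction would see every observation, including
# the one that arrived after the evaluation date.
retrospective = sum(o["value"] for o in observations) / len(observations)
```

Scoring `retrospective` against reference data would measure a system that could not have existed on the evaluation date, which is how inflated performance figures arise.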

Related terms

Temporal Integrity

Temporal integrity means an output for a given date is built only from information available on or before that date.

Ground Truth

Ground truth is reference information about real conditions on the ground used to train or validate models.

Training Data

Training data is the set of labeled examples used to fit a machine learning model.