Sentinel-2 Scaling & Harmonization: Offsets, Processing Baselines, and “Harmonized” Collections

2026-02-17 · 7 min read · Sentinel-2 · Processing Baseline · PB 04.00 · Reflectance · Scaling · ESA · Baseline

TL;DR: Sentinel-2 reflectance bands are stored as integers (DN, digital numbers) plus metadata: a scale (quantification value) and, since PB 04.00, a per-band add offset. Convert with ρ = (DN + ADD_OFFSET) / QUANTIFICATION_VALUE and use the product’s declared special values for NoData and saturation. If you are using a “Harmonized” collection (for example Earth Engine’s Sentinel-2 harmonized datasets), the PB 04.00 offset has already been removed, so do not correct it twice.

Quick rules (SAFE to reflectance)

Processing baseline in plain language

A processing baseline (PB) is ESA’s versioned recipe for turning raw measurements into Level-1C and Level-2A products. Baseline changes can affect pixel semantics, not just metadata, so “same sensor” does not automatically mean “same radiometry”. ESA documents baseline evolution and also runs reprocessing campaigns to deliver more consistent time series across the archive, which is why you may see older acquisition dates appear with newer baseline semantics after reprocessing. (Source: Sentinel Online, Sentinel-2 processing baseline concept, baseline evolution, and archive reprocessing for consistent time series.)

PB 04.00 (operational from 2022-01-25) is the baseline where the radiometric offset change became a frequent source of confusion. The goal was practical: represent near-zero and sometimes negative values over very dark targets without clipping them during quantization, by shifting the stored range and declaring the offset in metadata. (Source: ESA STEP Forum, motivation for the PB 04.00 radiometric offset and how it avoids losing information over dark surfaces.)

The safe conversion from DN to reflectance

Think of SAFE as “numbers plus a recipe”. The numbers are the per-pixel integers, and the recipe is the metadata.

For Level-1C, the offset is typically exposed as RADIO_ADD_OFFSET, and ESA documentation describes the storage relationship as a quantized reflectance with an offset term. In practice you recover reflectance with the inverse form: add the offset (signed) and divide by the quantification value. (Source: Copernicus Data Space forum, L1C quantization and RADIO_ADD_OFFSET relationship and how to recover TOA reflectance from DN.)

For Level-2A, the same principle applies but the field name is typically BOA_ADD_OFFSET. The most robust habit is to treat both the quantification value and the per-band add offsets as metadata-driven inputs, not constants, even if a specific era tends to use the same values.
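To make "metadata-driven" concrete, here is a minimal sketch that reads the quantification value, the per-band add offsets, and the declared special values from an XML string. The embedded XML is a simplified, hand-written stand-in, not a real product file, and `read_scaling` is a hypothetical helper name. The element names match the fields discussed above, but check them against the metadata of the product you actually hold.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for product metadata (an assumption for
# illustration). Real files are namespaced and far larger.
SAMPLE_XML = """
<Product_Image_Characteristics>
  <Special_Values>
    <SPECIAL_VALUE_TEXT>NODATA</SPECIAL_VALUE_TEXT>
    <SPECIAL_VALUE_INDEX>0</SPECIAL_VALUE_INDEX>
  </Special_Values>
  <Special_Values>
    <SPECIAL_VALUE_TEXT>SATURATED</SPECIAL_VALUE_TEXT>
    <SPECIAL_VALUE_INDEX>65535</SPECIAL_VALUE_INDEX>
  </Special_Values>
  <QUANTIFICATION_VALUE unit="none">10000</QUANTIFICATION_VALUE>
  <Radiometric_Offset_List>
    <RADIO_ADD_OFFSET band_id="0">-1000</RADIO_ADD_OFFSET>
    <RADIO_ADD_OFFSET band_id="1">-1000</RADIO_ADD_OFFSET>
  </Radiometric_Offset_List>
</Product_Image_Characteristics>
"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}NAME' and return 'NAME'."""
    return tag.rsplit("}", 1)[-1]


def read_scaling(xml_text,
                 offset_tag="RADIO_ADD_OFFSET",
                 quant_tags=("QUANTIFICATION_VALUE", "BOA_QUANTIFICATION_VALUE")):
    """Return (quantification_value, {band_id: offset}, {name: special_value})."""
    root = ET.fromstring(xml_text.strip())
    quant = None
    offsets = {}
    special = {}
    for el in root.iter():
        name = local_name(el.tag)
        if name in quant_tags and quant is None:
            quant = float(el.text)
        elif name == offset_tag:
            offsets[int(el.attrib["band_id"])] = float(el.text)
        elif name == "Special_Values":
            fields = {local_name(c.tag): c.text.strip() for c in el}
            special[fields["SPECIAL_VALUE_TEXT"]] = int(fields["SPECIAL_VALUE_INDEX"])
    if quant is None:
        raise ValueError("quantification value not found in metadata")
    return quant, offsets, special


quant, offsets, special = read_scaling(SAMPLE_XML)
```

For Level-2A you would pass `offset_tag="BOA_ADD_OFFSET"`. If the offset dictionary comes back empty, the product most likely predates the offset convention, and the caller should treat the offset as zero rather than fail.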

| Product level | Add offset field (per band) | Quantification field | Special values to respect |
|---|---|---|---|
| L1C (TOA) | RADIO_ADD_OFFSET | QUANTIFICATION_VALUE | NoData and saturated are declared in metadata, do not guess |
| L2A (BOA) | BOA_ADD_OFFSET | BOA_QUANTIFICATION_VALUE or QUANTIFICATION_VALUE (varies by packaging) | NoData and saturated are declared in metadata, do not guess |

  1. Read the quantification value and the per-band add offset from metadata.
  2. Read the declared special values (NoData and saturated) and decide how you will mask them.
  3. Compute reflectance per band: ρ = (DN + ADD_OFFSET) / QUANTIFICATION_VALUE.
  4. Apply your mask policy after conversion, and do not treat “negative” as “missing” unless your application explicitly requires it.
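
The four steps above can be sketched with NumPy as follows. This is an illustrative sketch, not an official implementation: `dn_to_reflectance` is a hypothetical helper, and the numbers in the usage lines (offset -1000, quantification value 10000, special values 0 and 65535) are assumed example inputs that should come from your product's metadata in a real pipeline.

```python
import numpy as np


def dn_to_reflectance(dn, add_offset, quantification_value, special_values=()):
    """Convert integer DN to reflectance, masking declared special values as NaN."""
    dn = np.asarray(dn)
    # Flag special values on the raw DN, before any arithmetic changes them.
    mask = np.isin(dn, list(special_values))
    # Cast to float first: adding a negative offset to uint16 would wrap around.
    refl = (dn.astype(np.float64) + add_offset) / quantification_value
    # Apply the mask after conversion; negative reflectance is left untouched.
    return np.where(mask, np.nan, refl)


# Assumed example inputs; read the real ones from metadata.
dn = np.array([0, 1000, 900, 11000, 65535], dtype=np.uint16)
refl = dn_to_reflectance(dn, add_offset=-1000.0, quantification_value=10000.0,
                         special_values=(0, 65535))
```

In this example the DN of 900 becomes a small negative reflectance and stays valid, while only the two declared special values turn into NaN.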

Indices note: Pure ratios like (A − B)/(A + B) cancel a shared multiplicative scale, but they do not cancel additive offsets. Indices with constants (for example EVI2 and OSAVI) also assume physically scaled reflectance. Correct offsets and scaling first if you want your index math to mean what you think it means.
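
A small numeric check makes the point. The DN values, the offset of -1000, and the quantification value of 10000 below are made-up illustrative inputs, not taken from a real scene.

```python
def normalized_difference(a, b):
    """Generic (A - B) / (A + B) index, for example NDVI with A=NIR and B=red."""
    return (a - b) / (a + b)


# Illustrative, assumed values.
QUANT = 10000.0
OFFSET = -1000.0
red_dn, nir_dn = 1500.0, 4000.0

# Fully converted reflectance: offset and scale both applied.
red = (red_dn + OFFSET) / QUANT
nir = (nir_dn + OFFSET) / QUANT
ndvi_true = normalized_difference(nir, red)

# Offset applied but scale skipped: the shared scale cancels in the ratio.
ndvi_offset_only = normalized_difference(nir_dn + OFFSET, red_dn + OFFSET)

# Raw DN with the offset ignored: the additive term does not cancel.
ndvi_raw = normalized_difference(nir_dn, red_dn)

print(ndvi_true, ndvi_offset_only, ndvi_raw)
```

With these inputs the first two results agree at about 0.71, while the raw-DN result drops to about 0.45. The gap is entirely due to the ignored offset.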

The two meanings of “Harmonized”

“Harmonized” is overloaded, so you want to pin down which meaning you are relying on.

In a platform sense (for example Earth Engine), “harmonized Sentinel-2” usually means the PB 04.00 range shift has been removed so post-2022-01-25 scenes sit in the same numeric range as older scenes. That improves long time series out of the box, but it changes what you should do next: you still scale to 0–1 reflectance, but you do not apply the SAFE offsets again. (Source: Google Earth Engine harmonized Sentinel-2 SR, where PB 04.00+ scenes are shifted to match the older range and SR bands are scaled by 10000.)
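
One way to keep the two cases apart in code is to make the data's origin an explicit argument and refuse an offset on harmonized input. This is a sketch under assumptions: `to_reflectance` and its `harmonized` flag are hypothetical names, and the default quantification value of 10000 reflects the "scaled by 10000" convention mentioned above rather than a value read from metadata.

```python
import numpy as np


def to_reflectance(dn, quantification_value=10000.0, add_offset=0.0, harmonized=False):
    """Scale DN to reflectance; reject an add offset on harmonized input."""
    if harmonized and add_offset != 0:
        # The platform has already removed the PB 04.00 shift, so applying
        # the SAFE offset here would correct the data twice.
        raise ValueError("harmonized input must not receive an add offset")
    return (np.asarray(dn, dtype=np.float64) + add_offset) / quantification_value


# Harmonized collection: scale only.
harmonized_refl = to_reflectance([500], harmonized=True)

# PB-native SAFE values: offset from metadata (assumed -1000 here), then scale.
native_refl = to_reflectance([1500], add_offset=-1000.0)
```

Both calls describe the same surface under the assumed values, so they yield the same reflectance. Passing the offset together with `harmonized=True` raises instead of silently shifting the result.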

In a remote sensing science sense, “harmonized” can also mean cross-sensor harmonization (for example Landsat plus Sentinel products). This article is not about cross-sensor harmonization. It is about within-Sentinel-2 baseline-era alignment and offset handling.

Where teams get bitten in real projects

Most failures are quiet and look like “a small bias”. A multi-year vegetation series can show an artificial step at the PB 04.00 boundary if offsets are mishandled. Low-signal areas (deep water, heavy shadow, winter scenes) are where offset logic matters most, because that is where clipping and special-value confusion produce maps that look clean but are wrong.

The other common failure is double-correction. It happens when a team starts from a harmonized platform collection and then applies SAFE-era offsets again because a script was written for raw SAFE JP2 years ago. The result is not “slightly off”. It is systematically shifted reflectance across bands: with the commonly seen offset of -1000 and quantification value of 10000, every pixel ends up 0.1 too low in reflectance. That shift then contaminates thresholds, indices, and any downstream machine learning.

ClearSKY delivery modes

If you receive Sentinel-2 through ClearSKY, ask which semantics you are getting: baseline-harmonized reflectance (analysis-ready time series) or PB-native values (closer to ESA SAFE semantics for detailed QA). The right choice depends on whether your priority is production consistency across years or reproducing baseline-era behavior exactly. Either way, treat the conversion as metadata-driven, and keep provenance (baseline, scaling, offsets, processing version) attached to every raster you store.

FAQ

What is the safest formula to convert Sentinel-2 DN to reflectance?

Use the metadata-driven form: ρ = (DN + ADD_OFFSET) / QUANTIFICATION_VALUE. For L1C the offset is typically RADIO_ADD_OFFSET, and for L2A it is typically BOA_ADD_OFFSET. Read both the offset and the quantification value from the product metadata instead of hard-coding constants.

What changed in Processing Baseline 04.00, and why do I care?

PB 04.00 (operational from 2022-01-25) introduced an additional radiometric offset in metadata so dark-scene noise would not be clipped during quantization. If you mix baseline eras without handling offsets, long time series can show a false step change. Platform “harmonized” collections often remove that step for you, which is useful but also changes what corrections you should apply.

What does “Harmonized Sentinel-2” mean in Google Earth Engine?

In Earth Engine, “harmonized” means PB 04.00+ scenes have been shifted to match the numeric range of older scenes, which removes the baseline-era discontinuity. The reflectance bands are still scaled integers (commonly “scaled by 10000”), so you still scale to 0–1. You should not apply the SAFE-era add offsets again on top of the harmonized values.

Do NDVI and other indices require scaling and offsets first?

If your index is a pure normalized difference, a shared multiplicative scale cancels, but additive offsets do not. Indices with constants (for example EVI2) assume reflectance is physically scaled, so computing them on raw integers can bias results. When in doubt, convert to reflectance first and keep the pipeline consistent across time.

Is DN=0 always NoData in Sentinel-2?

No. Sentinel-2 products declare special values (NoData and saturated) in metadata, and some tools expose them through derived masks. A value of zero might be used as a special value in some contexts, but you should not assume it without checking the product’s declared special values. The safest approach is: read the special values, mask them intentionally, and treat “negative after offset” as valid unless your application defines otherwise.
