Sentinel-2 Scaling & Harmonization: Offsets, Processing Baselines, and “Harmonized” Collections

2025-09-23 · 6 min read · Sentinel-2 · Processing Baseline · PB 04.00 · Reflectance · Scaling · ESA · Baseline

TL;DR: Sentinel-2 L1C/L2A store quantized reflectance (integers). The safe, ESA-aligned conversion is ρ = (DN + OFFSET) / QUANTIFICATION_VALUE, where QUANTIFICATION_VALUE is usually 10000 and OFFSET is the per-band RADIO_ADD_OFFSET (L1C) or BOA_ADD_OFFSET (L2A). Treat DN=0 as NoData. “Harmonized” platform collections have typically already removed the offsets (and may align PB eras), so don’t correct them a second time.

Quick rules (SAFE → reflectance)

  • Quantization (scale factor): Sentinel-2 L1C (TOA) and L2A (BOA) are reflectance stored as integers with a scale given by QUANTIFICATION_VALUE (usually 10000).
    Compute: ρ = (DN + OFFSET) / QUANTIFICATION_VALUE.
  • Per-band offsets (PB-dependent): Since PB 04.00, the metadata lists an additive offset for each band (typically −1000 DN):
    OFFSET = RADIO_ADD_OFFSET (L1C) or BOA_ADD_OFFSET (L2A). Use the signed value from metadata; for older baselines without these fields, the offset is 0.
  • NoData: DN=0 means NoData in SAFE/JP2. Avoid treating it as a real 0.0 reflectance.
  • Harmonized collections: Some platforms publish “Harmonized” Sentinel-2 where offsets are already removed and PB eras aligned. You usually still multiply by 0.0001 (or the band’s scale) to get 0–1; do not remove offsets again.

Light references: ESA Sentinel-2 Product & Processing Baseline notes (L1C/L2A quantization & per-band offsets) and platform docs for “Harmonized” S2 (e.g., Google Earth Engine, Sentinel Hub).


“Processing Baseline” in plain language

A Processing Baseline (PB) is ESA’s exact recipe for turning raw measurements into L1C/L2A products. When ESA bumps the PB (e.g., 04.xx), users may see:

  1. Metadata offsets per band (to better represent tiny/negative physical values before scaling).
  2. Minor radiometric alignments across PB eras so intra-sensor time series are steadier.

This is not cross-sensor harmonization; it’s within-Sentinel-2 alignment.


The two meanings of “Harmonized”

  • PB-native (a.k.a. “non-harmonized” here): Values preserve the PB-era semantics from SAFE/JP2. You must apply offsets and scaling to get 0–1 reflectance; DN=0 is NoData.
  • Harmonized (platform sense): Provider has already removed offsets and may cross-baseline align older/newer scenes so long series match out of the box. You typically get bands with a scale factor (e.g., 1e-4) or direct 0–1 floats depending on the API.
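
To make the two meanings concrete, here is a minimal sketch that converts the same surface value from both delivery styles. The function name to_reflectance, the example stored values (1500 for PB-native, 500 for harmonized), and the default offset of −1000 are illustrative assumptions, not any platform’s API; read the real offset and scale from your product or provider documentation.

```python
import numpy as np

def to_reflectance(values, harmonized, offset=-1000.0, scale=1e-4):
    """Convert stored integers to 0-1 reflectance (hypothetical helper).

    harmonized=True : offsets were removed upstream, so apply the scale only.
    harmonized=False: PB-native, so add the signed offset and then scale.
    """
    out = np.asarray(values).astype(np.float64)  # float first: avoids uint16 wraparound
    if not harmonized:
        out = out + offset
    return out * scale

# The same pixel as it might be stored in each style (assumed example values).
native = to_reflectance(np.uint16(1500), harmonized=False)     # about 0.05
harm = to_reflectance(np.uint16(500), harmonized=True)         # about 0.05

# Double correction: treating a harmonized value as PB-native.
double_corrected = to_reflectance(np.uint16(500), harmonized=False)  # about -0.05
```

Both correct paths agree, while the double-corrected value lands 0.1 reflectance too low, which is the error the “do not remove offsets again” rule guards against.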

Naming caution: In remote sensing, “harmonized” can also mean cross-sensor (e.g., Landsat+Sentinel HLS). Here we mean offset/scale normalization + cross-baseline alignment within Sentinel-2.


Fields to read from metadata

Level | Field (per band)     | Typical value  | Why it matters
L1C   | RADIO_ADD_OFFSET     | ~ −1000 DN     | Add this (signed) before scaling.
L2A   | BOA_ADD_OFFSET       | ~ −1000 DN     | Add this (signed) before scaling.
Both  | QUANTIFICATION_VALUE | 10000          | Divide by this to get 0–1.
Both  | NoData rule          | DN=0 is NoData | Mask it; don’t treat as 0.0.
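
As a rough sketch of reading these fields, the snippet below parses a trimmed, hand-written XML fragment modelled on L1C product metadata. The fragment, the helper names (local_name, read_scaling), and the returned layout are illustrative assumptions. Real metadata files are namespaced and much larger, and L2A products use differently named elements (for example BOA_ADD_OFFSET), so verify tag names against your own product.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment modelled on L1C metadata (assumption: trimmed, no namespaces).
SAMPLE_XML = """
<Product_Image_Characteristics>
  <QUANTIFICATION_VALUE unit="none">10000</QUANTIFICATION_VALUE>
  <Radiometric_Offset_List>
    <RADIO_ADD_OFFSET band_id="0">-1000</RADIO_ADD_OFFSET>
    <RADIO_ADD_OFFSET band_id="1">-1000</RADIO_ADD_OFFSET>
  </Radiometric_Offset_List>
</Product_Image_Characteristics>
"""

def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}NAME' down to 'NAME'."""
    return tag.rsplit("}", 1)[-1]

def read_scaling(xml_text, offset_tag="RADIO_ADD_OFFSET",
                 quant_tag="QUANTIFICATION_VALUE"):
    """Return (q, {band_id: offset}) from metadata text (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    q = None
    offsets = {}
    for el in root.iter():
        name = local_name(el.tag)
        if name == quant_tag and q is None:
            q = float(el.text)
        elif name == offset_tag:
            offsets[int(el.attrib["band_id"])] = float(el.text)
    # An empty dict means no offset elements were found; treat the offset as 0.
    return q, offsets

q, offsets = read_scaling(SAMPLE_XML)
```

Matching on the local tag name keeps the sketch tolerant of namespaces, at the cost of being less strict than a schema-aware parser.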

SAFE → reflectance, step-by-step

  1. Open the band and read per-band metadata.
  2. Get the numbers: q = QUANTIFICATION_VALUE (often 10000), off = RADIO_ADD_OFFSET|BOA_ADD_OFFSET (signed).
  3. Compute reflectance: ρ = (DN + off) / q.
  4. Mask NoData: where DN == 0, set to missing.
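
The four steps above can be sketched with NumPy as follows. The function name dn_to_reflectance and the sample DN values are assumptions for illustration; in practice dn comes from the band raster and offset and q from its metadata.

```python
import numpy as np

def dn_to_reflectance(dn, offset, q=10000.0, nodata=0):
    """Apply rho = (DN + offset) / q and set DN == nodata to NaN."""
    dn = np.asarray(dn)
    # Cast to float before adding the (negative) offset; uint16 would wrap around.
    rho = (dn.astype(np.float64) + offset) / q
    return np.where(dn == nodata, np.nan, rho)

# Example DNs: NoData, a pixel exactly at the offset, a dark pixel, a bright pixel.
dn = np.array([0, 1000, 1500, 11000], dtype=np.uint16)
rho = dn_to_reflectance(dn, offset=-1000.0)
# rho is approximately [nan, 0.0, 0.05, 1.0]
```

Masking is decided on the original DN, not on the computed reflectance, so a valid pixel with DN=1000 (reflectance 0.0) is not confused with NoData.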

Normalized differences: Pure ratios like (A − B)/(A + B) cancel out a shared multiplicative scale but not additive offsets. If offsets exist—or if your index has constants (EVI2, OSAVI, WI2015)—perform the full correction (add offset → divide by q) first.
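
A small numeric check illustrates the point. The DN values (NIR 4000, red 2000) and the −1000 offset are made-up example numbers, and normalized_difference is a hypothetical helper.

```python
def normalized_difference(a, b):
    """Generic (A - B) / (A + B) ratio."""
    return (a - b) / (a + b)

q, off = 10000.0, -1000.0
nir_dn, red_dn = 4000.0, 2000.0

# Raw DN and scale-only inputs give the same ratio: the shared scale cancels.
nd_raw = normalized_difference(nir_dn, red_dn)                  # about 0.333
nd_scaled_only = normalized_difference(nir_dn / q, red_dn / q)  # about 0.333

# With the offset applied first, the ratio changes: offsets do not cancel.
nd_corrected = normalized_difference((nir_dn + off) / q,
                                     (red_dn + off) / q)        # about 0.5
```

The offset drops out of the numerator (it is subtracted from both bands) but is counted twice in the denominator, which is why skipping it biases the index toward zero in this example.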


Directionality & dark-scene behavior

  • One-way in practice. Going PB-native → harmonized is deterministic (add offset → divide by q → optional cross-baseline normalization). Recovering harmonized → PB-native is brittle once integer DN/PB metadata are gone or rounding/extra normalization occurred.
  • Low-signal QA prefers PB-native. Offsets were introduced so that very dark targets (deep water, heavy shadow, low-sun winter scenes) are not clipped at zero before scaling. Some harmonized feeds clamp values ≤0 to 0.0, which hides low-signal nuance. For QA, inspect PB-native values during ingestion.
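
The sketch below illustrates both points with assumed numbers: five dark-pixel DNs around the −1000 offset, a clamp at 0.0 standing in for what a harmonized feed might do, and a naive attempt to invert back to DN. It is not a description of any particular provider’s pipeline.

```python
import numpy as np

q, off = 10000.0, -1000.0
dn = np.array([990, 995, 1000, 1005, 1010], dtype=np.uint16)  # dark pixels

# PB-native conversion keeps the small negative values.
rho_native = (dn.astype(np.float64) + off) / q   # -0.001 ... 0.001, five distinct values

# A feed that clamps at zero collapses the three lowest pixels to 0.0.
rho_clamped = np.clip(rho_native, 0.0, None)

# Naive inversion of the clamped values back to DN.
dn_back = np.rint(rho_clamped * q - off).astype(np.uint16)
# dn_back is [1000, 1000, 1000, 1005, 1010]; the original 990 and 995 are lost.
```

Once values have been clamped (or rounded, or further normalized), no inverse formula can restore the original DN, which is why the conversion is treated as one-way.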

Where this bites in real projects

  • Multi-year series: Mixing PB eras without respecting offsets yields baseline shifts.
  • Indices with constants: Running EVI2/OSAVI/WI2015 on 0–10000 DN without scaling constants biases results. Convert to 0–1 first.
  • Low-signal environments: Lakes/coasts, winter scenes, deep shadows look like “all zeros” if offsets/clamping are mishandled.
  • Double-correction risk: If you start from a Harmonized feed, do not remove offsets again.
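
To show how large the bias can be for an index with a constant, this sketch evaluates EVI2 = 2.5 (NIR − Red) / (NIR + 2.4 Red + 1) once on raw DN and once on corrected reflectance. The DN values and the −1000 offset are assumed example numbers.

```python
def evi2(nir, red):
    """Two-band EVI; the constant 1.0 assumes inputs in 0-1 reflectance."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)

q, off = 10000.0, -1000.0
nir_dn, red_dn = 4000.0, 2000.0

# Wrong: raw DN in the thousands makes the "+ 1" constant negligible.
evi2_on_dn = evi2(nir_dn, red_dn)                          # about 0.568

# Right: add the offset, divide by q, then compute the index.
evi2_on_rho = evi2((nir_dn + off) / q, (red_dn + off) / q)  # about 0.325
```

The two results differ by roughly 0.24 for the same pixel, far more than typical seasonal signal in many time series.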

ClearSKY delivery options (what you get from us)

  • Historic default for long series — ClearSKY-harmonized: We deliver baseline-harmonized reflectance (offsets handled, cross-baseline alignment) so multi-year analytics match out of the box.
    Analysis-ready choice: Inside your requested area we do not emit NoData; we return very low reflectance values (near 0.0) in dark/low-signal regions (deep water, strong shadow, winter scenes). Arrays stay continuous for ML/stats.

  • NoData policy: NoData appears only outside your requested AOI (pixels not covered by the order). Inside the AOI, pixels are numeric.

  • Nimbus flexibility — PB-native on demand: Our Nimbus model + post-processing can deliver PB-native (non-harmonized) values mirroring ESA SAFE semantics so you can apply offsets/scales yourself and replicate ESA-style QA.

  • Why this differs from ESA PB-native:
    ESA PB-native products can include NoData inside scenes under certain conditions. ClearSKY-harmonized favors continuous numeric rasters for analysis-ready consistency.

  • Recommendation: Use ClearSKY-harmonized for production time series; request PB-native when you explicitly need raw PB behavior or to test your own normalization. Provenance indicates which mode you received.


FAQ

Is Sentinel-2 delivered as reflectance?

Yes. L1C is top-of-atmosphere (TOA) reflectance and L2A is bottom-of-atmosphere (BOA) reflectance. Both are stored as quantized reflectance (integers) defined by QUANTIFICATION_VALUE (commonly 10000).

What exact formula should I use from SAFE/JP2 DN to reflectance?

Use the ESA-aligned form: ρ = (DN + RADIO_ADD_OFFSET|BOA_ADD_OFFSET) / QUANTIFICATION_VALUE.
Apply the signed per-band offset from metadata, then divide by the quantification value. Mask DN=0 as NoData.

Do normalized-difference indices need scaling?

For forms like (A − B) / (A + B), a shared multiplicative scale cancels, provided both bands use the same scale and neither carries an offset.
Exceptions: indices with constants (e.g., EVI2, OSAVI, WI2015) or any band with an offset. Best practice: correct (offset + scale) first.

What does “Harmonized” mean on platforms like Earth Engine?

It means per-band offsets introduced in recent PBs are already removed and series are cross-baseline aligned. You typically still apply the band scale (e.g., multiply by 0.0001) to get 0–1. Don’t remove offsets a second time.

Is the conversion reversible? Can I go from harmonized back to PB-native?

It’s effectively one-way. Going PB-native → harmonized is deterministic; going back is unreliable once original integer DN and PB metadata are gone or additional normalization/rounding occurred.

Why would I ever want PB-native (non-harmonized) values?

For low-signal QA (deep water, shadow, winter) and method development. PB 04.xx offsets help you see sub-zero/near-zero behavior before any clamping.

Why don’t ClearSKY rasters show NoData inside my AOI?

We ship analysis-ready, continuous arrays for modeling and statistics. Inside your requested polygon we return numeric values (often very low floats) rather than NoData for dark/low-signal pixels.
NoData is used only outside your AOI. If you need ESA-style internal masking, request PB-native.
