Crop Declaration Mislabel Detection

Parcels Analyzed

Countries

Years (2018-2024)

Avg Mislabel Rate

Mislabel Rate by Country & Year

Cross-Year Consistency (France, 2018-2024)

Fields flagged as mislabeled in 2+ years are likely true declaration errors, not model mistakes.

Locations Tracked

Never Flagged

Flagged Once (model noise)

Persistent (2+ years)

Est. True Mislabel Rate

Top Confused Crop Pairs

Most common cases where the model confidently predicts a different crop than declared.

Declared	Predicted	Count	Avg Confidence

Per-Class Mislabel Rates

Methodology

1. Satellite Embeddings

We use two complementary satellite embedding systems: AEF (64-dim, annual composites) and TESSERA (128-dim, foundation model). The two are combined into a single 192-dim (64 + 128) feature vector per field.
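As a minimal sketch of the combination step, assuming the two embeddings are simply concatenated (the 64 + 128 = 192 arithmetic suggests this, but the page does not say so explicitly). The function name `combine_embeddings` is hypothetical.

```python
from typing import List, Sequence

AEF_DIM = 64       # AEF embedding size, as stated in the text
TESSERA_DIM = 128  # TESSERA embedding size, as stated in the text


def combine_embeddings(aef: Sequence[float], tessera: Sequence[float]) -> List[float]:
    """Concatenate the two per-field embeddings (assumed combination rule)."""
    if len(aef) != AEF_DIM or len(tessera) != TESSERA_DIM:
        raise ValueError("unexpected embedding size")
    # AEF occupies positions 0..63, TESSERA positions 64..191.
    return list(aef) + list(tessera)


# Toy vectors standing in for real embeddings.
feature = combine_embeddings([0.0] * AEF_DIM, [1.0] * TESSERA_DIM)
```

Any downstream classifier then sees one vector per field, with the AEF part first and the TESSERA part second.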

2. Spatial Cross-Validation

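5-fold spatial block CV: fields are grouped into blocks on a 0.5° grid, and each whole block is assigned to one fold, so neighboring fields are not split between training and test sets (no spatial leakage). Each field gets an out-of-fold prediction from a model that never saw it during training.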
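The block assignment can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: blocks are keyed by flooring longitude and latitude to the 0.5° grid, and blocks are dealt to folds round-robin in sorted order (a real setup might shuffle blocks or balance fold sizes). The names `block_id` and `assign_folds` are hypothetical.

```python
import math
from typing import Dict, Tuple

BLOCK_DEG = 0.5  # grid cell size in degrees, from the text
N_FOLDS = 5      # number of folds, from the text


def block_id(lon: float, lat: float, block_deg: float = BLOCK_DEG) -> Tuple[int, int]:
    """Return the index of the grid cell that contains the point."""
    return (math.floor(lon / block_deg), math.floor(lat / block_deg))


def assign_folds(fields: Dict[str, Tuple[float, float]],
                 n_folds: int = N_FOLDS) -> Dict[str, int]:
    """Map each field id to a fold so that a block never spans two folds."""
    blocks = sorted({block_id(lon, lat) for lon, lat in fields.values()})
    # Assumption: simple round-robin over sorted blocks.
    block_to_fold = {b: i % n_folds for i, b in enumerate(blocks)}
    return {fid: block_to_fold[block_id(lon, lat)]
            for fid, (lon, lat) in fields.items()}


# Toy field centroids as (lon, lat); "a" and "b" share a 0.5° block.
folds = assign_folds({"a": (2.10, 48.60), "b": (2.40, 48.90), "c": (3.10, 48.60)})
```

Because the fold is a function of the block rather than the field, two fields in the same cell are always trained and tested together.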

3. Confidence Thresholding

A field is flagged as mislabeled when the model predicts a crop different from the declared one with >80% probability. Lower-confidence disagreements are not flagged.
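The flagging rule above can be written as a short check. A minimal sketch, assuming the model output is a mapping from crop name to out-of-fold probability; `is_flagged` and the crop names are illustrative.

```python
from typing import Dict

THRESHOLD = 0.80  # confidence threshold from the text


def is_flagged(declared: str, probs: Dict[str, float],
               threshold: float = THRESHOLD) -> bool:
    """True when the top predicted crop differs from the declared crop
    and its probability is strictly above the threshold."""
    predicted = max(probs, key=probs.get)
    return predicted != declared and probs[predicted] > threshold


confident_disagreement = is_flagged("wheat", {"wheat": 0.10, "barley": 0.90})
weak_disagreement = is_flagged("wheat", {"wheat": 0.30, "barley": 0.70})
agreement = is_flagged("barley", {"wheat": 0.10, "barley": 0.90})
```

Only the first case is flagged: the second disagrees but falls below the threshold, and the third agrees with the declaration.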

4. Cross-Year Consistency

Fields flagged in 2+ years are classified as persistent mislabels (likely true errors). One-off flags are mostly model noise.
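The three buckets shown in the consistency panel (never flagged, flagged once, persistent) follow directly from counting flagged years per location. A minimal sketch, assuming per-location flags are available as a year-to-boolean mapping; the data layout and function names are hypothetical.

```python
from collections import Counter
from typing import Dict


def categorize(n_years_flagged: int) -> str:
    """Bucket a location by the number of years in which it was flagged."""
    if n_years_flagged == 0:
        return "never_flagged"
    if n_years_flagged == 1:
        return "flagged_once"   # treated as model noise in the text
    return "persistent"         # 2+ years: likely a true declaration error


def summarize(flags: Dict[str, Dict[int, bool]]) -> Counter:
    """Count locations per bucket; flags maps location id -> {year: flagged}."""
    return Counter(categorize(sum(years.values())) for years in flags.values())


# Toy data for three locations over three years.
summary = summarize({
    "loc1": {2018: False, 2019: False, 2020: False},
    "loc2": {2018: True, 2019: False, 2020: False},
    "loc3": {2018: True, 2019: True, 2020: False},
})
```

The sketch stops at the bucket counts; how the estimated true mislabel rate is derived from them is not specified on the page.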