A Field Guide to Learning with Noisy Labels

Annotation is expensive. And when it's expensive, corners get cut, and labels get noisy. Labels can become unreliable for a number of reasons: non-expert annotators who disagree on ambiguous cases, labels derived from weak signals like search queries or hashtags, or predictions from an earlier, less robust model used to bootstrap a larger dataset. In real-world settings (fraud detection, medical imaging, content moderation), you rarely have the luxury of a perfectly clean training set. You train on what you have, and what you have is messy. ...

December 1, 2025 · 17 min · 3475 words