Applications
Sequence labeling powers production systems across industries. Search engines use NER to identify entities in queries ("flights to Paris" → LOC: Paris) for structured search. Healthcare systems extract drug names, dosages, symptoms, and conditions from clinical notes for electronic health records. Financial applications identify company names, monetary amounts, and dates in SEC filings and earnings calls. Legal tools extract parties, dates, clauses, and obligations from contracts. Customer support systems identify product names, issue types, and account numbers in tickets. The main challenge in production is domain adaptation: a model trained on news text (CoNLL-2003) performs poorly on medical text because both the entity types and the vocabulary differ substantially. Domain-specific training data and domain-specific pre-trained models are essential for production NER.
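To make the query example concrete, here is a minimal sketch of how token-level tags become structured entities. It assumes the tagger emits BIO tags (B- starts an entity, I- continues it, O is outside any entity); the function name bio_to_spans and the tag sequences are illustrative, not taken from any particular library or model.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse token-level BIO tags into (entity_type, text) pairs."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []
    for token, tag in zip(tokens, tags):
        starts_entity = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_entity:
            # Close any open entity; a stray I- tag is treated as a new start.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag: close any open entity.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Assumed tagger output for two search queries.
query_spans = bio_to_spans(["flights", "to", "Paris"], ["O", "O", "B-LOC"])
news_spans = bio_to_spans(
    ["Tim", "Cook", "Apple", "news"], ["B-PER", "I-PER", "B-ORG", "O"]
)
print(query_spans)
print(news_spans)
```

A search backend could then route the LOC span to a destination filter instead of treating "Paris" as a plain keyword.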
Production NER
Search:
"flights to Paris" → LOC: Paris
"Tim Cook Apple news" → PER: Tim Cook, ORG: Apple
Healthcare:
"Patient takes 500mg Metformin daily"
→ DRUG: Metformin, DOSE: 500mg
Finance:
"Apple reported $94.8B revenue in Q1"
→ ORG: Apple, MONEY: $94.8B, DATE: Q1
Legal:
"Party A shall deliver by March 15"
→ PARTY: Party A, DATE: March 15
Domain gap:
News NER model on medical text: ~60% F1
Domain-adapted model: ~85% F1
Domain data is essential
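The domain gap is usually measured with entity-level F1, where a prediction counts only if both the span boundaries and the entity type match the annotation. The sketch below shows that computation on one sentence. The span format (start token, end token exclusive, type), the function name entity_f1, and both sets of model predictions are hypothetical; the toy scores it produces are not the ~60% and ~85% figures above.

```python
from typing import Set, Tuple

# (start_token, end_token_exclusive, entity_type)
Span = Tuple[int, int, str]


def entity_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match entity-level F1 over sets of typed spans."""
    if not gold and not predicted:
        return 1.0
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotation for "Patient takes 500mg Metformin daily"
# (token indices: Patient=0, takes=1, 500mg=2, Metformin=3, daily=4).
gold = {(2, 3, "DOSE"), (3, 4, "DRUG")}

# Hypothetical news-trained model: no DRUG/DOSE types, mislabels the drug as ORG.
news_model_pred = {(3, 4, "ORG")}
# Hypothetical domain-adapted model: recovers both entities.
adapted_model_pred = {(2, 3, "DOSE"), (3, 4, "DRUG")}

news_f1 = entity_f1(gold, news_model_pred)
adapted_f1 = entity_f1(gold, adapted_model_pred)
print(news_f1, adapted_f1)
```

Running the same function over a held-out set of in-domain sentences, rather than one example, is what gives a meaningful comparison between a general model and an adapted one.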
Key insight: In production, the entity types you need are rarely limited to the standard ones (PER, ORG, LOC). Real applications need custom entities (drug names, product SKUs, legal clauses), which means custom training data and domain-specific models.
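Custom entity types start with a custom label set. As a minimal sketch, assuming a BIO tagging scheme, the snippet below builds the label inventory for two custom types and encodes one annotated sentence as integer ids, which is the form most token-classification training loops expect. The names build_bio_labels, label2id, and id2label are illustrative choices, and the annotation is a hand-written example.

```python
from typing import Dict, List


def build_bio_labels(entity_types: List[str]) -> List[str]:
    """Return the BIO label inventory: "O" plus B-/I- tags for each type."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


# Custom entity types for the healthcare example.
custom_types = ["DRUG", "DOSE"]
labels = build_bio_labels(custom_types)
label2id: Dict[str, int] = {label: i for i, label in enumerate(labels)}
id2label: Dict[int, str] = {i: label for label, i in label2id.items()}

# One hand-annotated training example.
tokens = ["Patient", "takes", "500mg", "Metformin", "daily"]
tags = ["O", "O", "B-DOSE", "B-DRUG", "O"]
tag_ids = [label2id[tag] for tag in tags]
print(labels)
print(tag_ids)
```

Adding a new entity type later changes the size of the label set, so the model's classification layer generally has to be resized and retrained on data that includes the new type.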