Task Categories
Despite the enormous variety of NLP applications, most problems reduce to a handful of fundamental task types. Text classification: assign a label to a document (sentiment, topic, spam). Sequence labeling: assign a label to each token (POS tagging, NER). Sequence-to-sequence: transform one sequence into another (translation, summarization). Text generation: produce text from a prompt or context. Information extraction: pull structured data from unstructured text (relation extraction, event detection). Semantic similarity: measure how similar two texts are (paraphrase detection, search). Understanding which task type your problem maps to is the first step in choosing the right approach.
Task Types
Classification (document → label):
Sentiment, topic, spam, intent
Sequence labeling (token → label):
POS tagging, NER, chunking
Seq-to-seq (sequence → sequence):
Translation, summarization
Generation (prompt → text):
Completion, dialogue, creative
Information extraction (text → structured data):
Relations, events, entities
Semantic similarity (text pair → score):
Paraphrase, search, matching
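To make the input and output shapes concrete, here is a minimal Python sketch with one example per task type. Every sentence, label, tuple, and score in it is made up for illustration, and the names EXAMPLES and is_aligned are hypothetical rather than part of any library.

```python
# Made-up examples showing the input/output shape of each task type.
# Labels, the relation name, and the similarity score are illustrative only.
EXAMPLES = {
    "classification": {
        "input": "The battery died after two days.",
        "output": "negative",  # one label for the whole document
    },
    "sequence_labeling": {
        "input": ["Ada", "Lovelace", "lived", "in", "London"],
        "output": ["B-PER", "I-PER", "O", "O", "B-LOC"],  # one label per token
    },
    "seq2seq": {
        "input": "Das ist ein Test.",
        "output": "This is a test.",  # a new sequence, length may differ
    },
    "generation": {
        "input": "Once upon a time",
        "output": " there was a small village.",  # open-ended continuation
    },
    "information_extraction": {
        "input": "Marie Curie was born in Warsaw.",
        "output": [("Marie Curie", "born_in", "Warsaw")],  # structured tuples
    },
    "semantic_similarity": {
        "input": ("How do I reset my password?", "I forgot my password."),
        "output": 0.87,  # a score for the pair (invented value)
    },
}


def is_aligned(tokens, labels):
    """Sequence labeling requires exactly one label per token."""
    return len(tokens) == len(labels)


if __name__ == "__main__":
    for task, example in EXAMPLES.items():
        print(f"{task}: {example['input']!r} -> {example['output']!r}")
```

The alignment check is the practical difference between sequence labeling and classification: a labeled sentence with a missing or extra tag is malformed data, while a classification example only ever carries one label.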
Key insight: The task type determines the model architecture, the loss function, the evaluation metric, and the data format. Misidentifying it is a common beginner mistake in NLP, because every later choice inherits the error.
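One way to picture this dependency is a lookup from task type to a typical setup. The sketch below is an assumption-laden illustration: the pairings are common defaults in practice, not prescriptions from this text, and TaskRecipe, TYPICAL_RECIPES, and recipe_for are hypothetical names.

```python
from typing import NamedTuple


class TaskRecipe(NamedTuple):
    architecture: str
    loss: str
    metric: str


# Common default pairings (assumed for illustration; real projects vary).
TYPICAL_RECIPES = {
    "classification": TaskRecipe(
        "encoder + classification head", "cross-entropy over classes", "accuracy / F1"
    ),
    "sequence_labeling": TaskRecipe(
        "encoder + per-token head", "per-token cross-entropy", "span-level F1"
    ),
    "seq2seq": TaskRecipe(
        "encoder-decoder", "cross-entropy over target tokens", "BLEU / ROUGE"
    ),
    "generation": TaskRecipe(
        "decoder-only language model", "next-token cross-entropy", "perplexity"
    ),
    "information_extraction": TaskRecipe(
        "encoder + tagging or relation head", "cross-entropy", "F1 over extracted tuples"
    ),
    "semantic_similarity": TaskRecipe(
        "bi-encoder or cross-encoder", "contrastive or regression loss", "Spearman correlation"
    ),
}


def recipe_for(task_type: str) -> TaskRecipe:
    """Return the typical setup for a task type, or fail loudly if unknown."""
    try:
        return TYPICAL_RECIPES[task_type]
    except KeyError:
        known = ", ".join(sorted(TYPICAL_RECIPES))
        raise ValueError(
            f"unknown task type {task_type!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(recipe_for("sequence_labeling"))
```

Failing on an unknown task type is deliberate: if a problem does not map cleanly onto one of the categories, that is worth resolving before picking a model.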