Prompt Engineer / Applied AI Engineer
For LLM-based products, prompt engineering is product logic. The prompt engineer designs, tests, and iterates on the instructions that control model behavior. This role sits at the intersection of product design and engineering.
In many teams, the PM and prompt engineer work as closely as the PM and designer do in traditional software. The prompt is the product specification — it defines what the model does, how it responds, what it refuses, and how it handles edge cases.
Key skill: Systematic evaluation. A good prompt engineer doesn’t just write prompts — they build evaluation frameworks to measure prompt quality across hundreds of test cases.
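The evaluation-framework idea can be sketched in a few lines. This is a minimal illustration, not a real harness: `call_model` is a hypothetical stand-in for an LLM API call, and the prompt, cases, and check are invented for the example. The point is the shape — a prompt is scored as an aggregate pass rate over many cases, not judged by eyeballing one output.

```python
def call_model(prompt: str, case: str) -> str:
    # Placeholder for a real LLM API call (hypothetical behavior for illustration).
    return f"Refusing: {case}" if "refuse" in prompt.lower() else case.upper()

def run_eval(prompt: str, cases: list[str], check) -> float:
    """Return the fraction of test cases whose output passes `check`."""
    passed = sum(check(call_model(prompt, c)) for c in cases)
    return passed / len(cases)

cases = ["hello", "summarize this", "tell me a secret"]
score = run_eval(
    "Always refuse.",
    cases,
    check=lambda out: out.startswith("Refusing"),
)
print(f"pass rate: {score:.0%}")
```

A real framework adds versioned test sets, per-case logging, and regression comparison between prompt variants, but the core loop stays this simple.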
MLOps Engineer
Manages the operational lifecycle of models in production: deployment pipelines, model versioning, A/B testing infrastructure, monitoring, and automated retraining. Think of them as DevOps for machine learning.
Without MLOps, models that work in notebooks never make it to production — or they make it to production and silently degrade.
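The "silently degrade" failure mode is exactly what production monitoring catches. A hedged sketch of one such check, under the assumption that some quality score per prediction is already being logged: compare a recent window against a baseline and alert on a drop. Real MLOps stacks use proper drift statistics and alerting infrastructure; this only shows the concept.

```python
from statistics import mean

def quality_dropped(baseline: list[float], recent: list[float],
                    tolerance: float = 0.05) -> bool:
    """True if the recent mean score fell more than `tolerance` below baseline."""
    return mean(baseline) - mean(recent) > tolerance

# Illustrative numbers only.
baseline_scores = [0.91, 0.90, 0.92, 0.89]
recent_scores = [0.78, 0.80, 0.79, 0.81]

if quality_dropped(baseline_scores, recent_scores):
    print("alert: model quality degraded in production")
```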
Evaluation Specialist
An increasingly critical role focused on measuring AI quality. They design evaluation datasets, build automated testing pipelines, define quality rubrics, and run red-team exercises. For LLM products, evaluation is the new QA — but far more complex because you can’t write deterministic test assertions.
Some teams call this role “AI Quality Engineer” or “Eval Lead.”
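Because outputs are non-deterministic, evals assert properties of a response rather than exact strings. A minimal sketch of a rubric-based check — the criteria here are invented for illustration, and in practice a criterion might itself be scored by a second model acting as judge rather than by a string match.

```python
# Each rubric criterion maps a name to a pass/fail predicate on the output.
RUBRIC = {
    "mentions_refund_policy": lambda out: "refund" in out.lower(),
    "no_pii_leak": lambda out: "ssn" not in out.lower(),
    "under_length_limit": lambda out: len(out) <= 400,
}

def rubric_score(output: str) -> float:
    """Fraction of rubric criteria the output satisfies."""
    return sum(check(output) for check in RUBRIC.values()) / len(RUBRIC)

sample = "Per our refund policy, you may return items within 30 days."
print(rubric_score(sample))
```

Scoring properties instead of exact text is what makes the role harder than traditional QA: the rubric itself becomes a design artifact that must be maintained and validated.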
AI Architect
Designs the end-to-end system architecture: which models to use, how to chain them, where to add RAG, how to handle caching, when to use fine-tuning vs. prompting. Critical for complex systems with multiple AI components.
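One of these architectural decisions — caching — can be shown concretely. A sketch under stated assumptions: `expensive_model_call` is a placeholder for a real model invocation, and the cache is a simple exact-match memoizer (production systems often use semantic caches keyed on embeddings instead).

```python
from functools import lru_cache

calls = 0  # counts how many times the underlying "model" actually runs

@lru_cache(maxsize=1024)
def expensive_model_call(prompt: str) -> str:
    # Placeholder for a real LLM call; hypothetical response for illustration.
    global calls
    calls += 1
    return f"answer to: {prompt}"

expensive_model_call("What is RAG?")
expensive_model_call("What is RAG?")  # identical prompt: served from cache
print(calls)  # the model ran only once
```

Deciding where a cache like this sits in the chain — before the model, after retrieval, per user — is exactly the kind of call this role owns.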
Hiring reality: Most teams don’t have all these roles as separate hires. In early-stage teams, the ML engineer often covers MLOps, the PM does prompt engineering, and the data scientist handles evaluation. The roles are distinct functions, not necessarily distinct people. Know which functions you need even if one person covers multiple.