Ch 1 — Edge AI vs TinyML: What Runs Where

Deployment tiers, constraints, and selection boundaries.
Foundation: Tiers → Constraints → Workloads → Tradeoffs → Scope
Deployment Tiers
Edge is not one environment; it is a spectrum of compute classes.
Tier Map
Edge deployments usually fall into three groups: mobile devices, embedded Linux gateways, and microcontrollers. Each tier changes the feasible model size, runtime stack, and observability depth you can realistically support.
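The three tiers above can be sketched as data. The capacity figures and runtime names below are rough, assumed orders of magnitude for illustration, not vendor specifications:

```python
# Illustrative tier map; numbers are assumed orders of magnitude, not specs.
EDGE_TIERS = {
    "mobile": {
        "ram": "2-12 GB",
        "typical_model_size": "10-500 MB",
        "runtime_examples": ["Core ML", "LiteRT"],
    },
    "embedded_linux_gateway": {
        "ram": "256 MB-8 GB",
        "typical_model_size": "1-100 MB",
        "runtime_examples": ["ONNX Runtime", "LiteRT"],
    },
    "microcontroller": {
        "ram": "64 KB-2 MB",
        "typical_model_size": "10-500 KB",
        "runtime_examples": ["LiteRT for Microcontrollers"],
    },
}

def feasible_tiers(model_size_kb: float) -> list[str]:
    """Return tiers whose assumed model-size ceiling fits the model."""
    ceilings_kb = {
        "microcontroller": 500,
        "embedded_linux_gateway": 100_000,
        "mobile": 500_000,
    }
    return [tier for tier, cap in ceilings_kb.items() if model_size_kb <= cap]
```

A 2 MB model, for example, already rules out the microcontroller tier under these assumed ceilings, which is exactly the kind of early feasibility check a deployment decision memo should record.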
Why This Matters
Teams that skip tier definition often design for desktop-class assumptions and then fail during hardware bring-up. A clear tier decision early prevents expensive rework across model, firmware, and product requirements.
Practical Pattern
Write a deployment decision memo for one real feature and force each requirement into one of the edge tiers. This prevents architecture arguments from drifting into preference debates.
Key Point: Start with hardware class and operational constraints, then choose the model approach.
Latency and Availability Constraints
TinyML is often selected because response and uptime requirements are strict.
Latency Envelope
Always-on keyword spotting, safety triggers, and control loops usually need low and predictable latency without network dependency. In these cases, local inference is a reliability requirement, not just a performance optimization.
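A back-of-the-envelope comparison shows why these workloads default to local inference. Every number below is assumed for illustration only:

```python
def cloud_path_ms(network_rtt_ms: float, server_infer_ms: float) -> float:
    """Best-case cloud response time; ignores queuing, retries, and radio wake-up."""
    return network_rtt_ms + server_infer_ms

# Assumed, illustrative numbers: even a fast network round trip
# dominates a tight safety or control-loop budget.
local_ms = 8                                                # on-device inference
cloud_ms = cloud_path_ms(network_rtt_ms=60, server_infer_ms=15)
budget_ms = 20                                              # example response budget

meets_budget = {"local": local_ms <= budget_ms, "cloud": cloud_ms <= budget_ms}
```

Under these assumptions only the local path fits the budget, and the cloud figure is a best case: any retry or dropped packet makes it worse, which is why local inference is a reliability requirement here.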
Offline Behavior
Tiny deployments are expected to keep working during intermittent connectivity, power fluctuations, or bandwidth limits. Designing for offline-first behavior protects user trust and simplifies failure handling in the field.
Failure Pattern
Teams often over-scope tiny deployments by adding open-ended language tasks that belong to richer edge stacks. Keep TinyML workloads bounded and action-oriented to preserve reliability.
Key Point: If a missed or delayed response is unacceptable, prioritize on-device inference by default.
Workload Archetypes
Different edge tasks demand different model and pipeline patterns.
Common Tiny Tasks
Typical TinyML workloads include keyword spotting, visual wake word detection, vibration anomaly alerts, and simple classification on sensor streams. These tasks reward compact models with stable behavior under noisy inputs.
Boundary Cases
Open-ended generation and multi-step reasoning usually exceed strict MCU budgets and belong on stronger edge tiers. Tiny systems work best when outputs are bounded, testable, and tightly mapped to product actions.
Validation Signal
Validate decisions with real field constraints such as network loss, device sleep behavior, and wake-time budgets. Synthetic desktop checks rarely predict behavior on constrained targets.
Key Point: TinyML excels at narrow, high-value decisions made continuously at the edge.
Decision Framework
Choose between cloud, edge gateway, mobile, and TinyML with explicit criteria.
Decision Inputs
Use five inputs: maximum latency, privacy sensitivity, power budget, network reliability, and maintenance model. This framework keeps architecture decisions tied to product requirements instead of tooling preferences.
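The five inputs can be expressed as a small decision sketch. The thresholds and tier labels here are assumptions chosen for illustration, not standards; real projects should tune them in the decision memo:

```python
from dataclasses import dataclass

@dataclass
class DeploymentInputs:
    max_latency_ms: float     # hard product latency requirement
    privacy_sensitive: bool   # does raw data have to stay on-device?
    power_budget_mw: float    # sustained power available for inference (assumed unit)
    network_reliable: bool    # can we depend on connectivity in the field?
    fleet_updatable: bool     # maintenance model: can we push updates easily?

def recommend_tier(d: DeploymentInputs) -> str:
    """Sketch of the five-input framework; thresholds are illustrative assumptions."""
    if d.max_latency_ms < 50 or not d.network_reliable or d.privacy_sensitive:
        # Must run locally; the power budget decides how rich the local stack can be.
        return "tinyml" if d.power_budget_mw < 100 else "mobile_or_gateway"
    return "cloud" if d.fleet_updatable else "edge_gateway"
```

The point of the sketch is that every branch traces back to a product requirement, so a tier argument becomes a disagreement about an input value rather than a tooling preference.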
Escalation Path
If constraints relax over time, you can move from TinyML toward richer edge runtimes or hybrid cloud designs. A documented escalation path keeps model interfaces stable while underlying infrastructure evolves.
Governance Rule
Require cross-functional sign-off from product, firmware, and ML owners before committing to the target tier. Early alignment reduces downstream rework and roadmap churn.
Key Point: Architecture choice should be reversible, but the initial choice must match current constraints.
Scope Definition Checklist
A strong scope prevents shallow prototypes and fragile deployments.
Minimum Scope
Define target hardware, response-time budget, power profile, expected input range, and failure behavior before model training starts. These constraints shape data collection and directly influence model architecture choices.
Success Criteria
Success should include quality metrics plus operational metrics such as RAM peak, startup time, and field stability. TinyML programs fail when teams optimize for accuracy only and ignore operational acceptance gates.
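One way to make the operational gates explicit is a release check that fails on any missing or out-of-range metric. The metric names and thresholds below are illustrative placeholders, not recommendations:

```python
def passes_release_gate(metrics: dict) -> tuple[bool, list[str]]:
    """Check quality AND operational gates; thresholds are placeholder assumptions."""
    gates = {
        "accuracy": lambda v: v >= 0.92,          # quality metric
        "ram_peak_kb": lambda v: v <= 192,        # operational: peak RAM
        "startup_ms": lambda v: v <= 400,         # operational: cold start
        "field_crash_rate": lambda v: v <= 0.001, # operational: stability
    }
    # A metric that was never measured fails the gate, same as a bad value.
    failures = [name for name, check in gates.items()
                if name not in metrics or not check(metrics[name])]
    return (not failures, failures)
```

Treating an unmeasured metric as a failure is the sketch's key design choice: it blocks the accuracy-only release path the paragraph warns about.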
Handoff Artifact
Capture tier assumptions, model boundaries, and escalation paths in versioned docs so future updates do not break the original deployment logic. Review it at each release checkpoint so assumptions remain current.
Key Point: Production-ready TinyML starts with measurable constraints and measurable acceptance criteria.
Where Edge Programs Usually Fail
Most failures come from scope mismatch, not from model training quality.
Early Failure Signals
Warning signs include changing hardware targets mid-cycle, undefined offline behavior, and missing power acceptance criteria. These issues should trigger a scope review before additional model tuning work is approved.
Mitigation Pattern
Run a design checkpoint before model lock that revalidates tier choice against product requirements and deployment constraints. This checkpoint is often the highest-leverage quality gate in edge projects.
Key Point: Preventing scope drift early saves more time than late-stage optimization.
Execution Checklist
Use a repeatable checklist before moving from concept to implementation.
Checklist Items
Confirm tier selection, latency budget, power budget, privacy requirements, and offline fallback behavior in writing. Each item should have an accountable owner and measurable acceptance condition.
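A checklist like this can live as a versioned structure pairing each item with an accountable owner and a measurable acceptance condition. The items, roles, and conditions below are examples, not a canonical list:

```python
# (item, accountable owner role, measurable acceptance condition) — examples only.
CHECKLIST = [
    ("tier selection confirmed", "product", "decision memo signed off"),
    ("latency budget", "ml", "p95 inference <= budget on target hardware"),
    ("power budget", "firmware", "measured draw <= budget in duty-cycle test"),
    ("privacy requirements", "product", "data handling review passed"),
    ("offline fallback", "firmware", "device meets spec with radio disabled"),
]

def unowned_items(checklist, known_owners=frozenset({"product", "firmware", "ml"})):
    """Flag items whose owner is not one of the accountable roles."""
    return [item for item, owner, _ in checklist if owner not in known_owners]
```

Running the ownership check at each review checkpoint makes "each item has an accountable owner" a verifiable property rather than a hope.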
Review Cadence
Revisit this checklist at architecture freeze, first-device bring-up, and pre-release. Repeated review catches assumption drift and keeps engineering effort focused on the right deployment class.
Key Point: A short, repeated checklist outperforms one-time architecture decisions.