Ch 1 — Edge AI vs TinyML: What Runs Where

Deployment tiers, constraints, and selection boundaries.
Foundation: Tiers → Constraints → Workloads → Tradeoffs → Scope
Deployment Tiers
Edge is not one environment; it is a spectrum of compute classes.
Tier Map
Edge deployments usually fall into three groups: mobile devices, embedded Linux gateways, and microcontrollers. Each tier changes the feasible model size, runtime stack, and observability depth you can realistically support.
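The three tiers above can be sketched as data. The capacity figures and runtime names below are rough, assumed orders of magnitude for illustration, not vendor specifications:

```python
# Illustrative tier map; numbers are assumed orders of magnitude, not specs.
EDGE_TIERS = {
    "mobile": {
        "ram": "2-12 GB",
        "typical_model_size": "10-500 MB",
        "runtime_examples": ["Core ML", "LiteRT"],
    },
    "embedded_linux_gateway": {
        "ram": "256 MB-8 GB",
        "typical_model_size": "1-100 MB",
        "runtime_examples": ["ONNX Runtime", "LiteRT"],
    },
    "microcontroller": {
        "ram": "64 KB-2 MB",
        "typical_model_size": "10-500 KB",
        "runtime_examples": ["LiteRT for Microcontrollers"],
    },
}

def feasible_tiers(model_size_kb: float) -> list[str]:
    """Return tiers whose assumed model-size ceiling fits the model."""
    ceilings_kb = {
        "microcontroller": 500,
        "embedded_linux_gateway": 100_000,
        "mobile": 500_000,
    }
    return [tier for tier, cap in ceilings_kb.items() if model_size_kb <= cap]
```

A 2 MB model, for example, already rules out the microcontroller tier under these assumed ceilings, which is exactly the kind of early feasibility check a deployment decision memo should record.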
Why This Matters
Teams that skip tier definition often design for desktop-class assumptions and then fail during hardware bring-up. A clear tier decision early prevents expensive rework across model, firmware, and product requirements.
Practical Pattern
Write a deployment decision memo for one real feature and force each requirement into one of the edge tiers. This prevents architecture arguments from drifting into preference debates.
Key Point: Start with hardware class and operational constraints, then choose the model approach.
Latency and Availability Constraints
TinyML is often selected because response and uptime requirements are strict.
Latency Envelope
Always-on keyword spotting, safety triggers, and control loops usually need low and predictable latency without network dependency. In these cases, local inference is a reliability requirement, not just a performance optimization.
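A back-of-the-envelope comparison shows why these workloads default to local inference. Every number below is assumed for illustration only:

```python
def cloud_path_ms(network_rtt_ms: float, server_infer_ms: float) -> float:
    """Best-case cloud response time; ignores queuing, retries, and radio wake-up."""
    return network_rtt_ms + server_infer_ms

# Assumed, illustrative numbers: even a fast network round trip
# dominates a tight safety or control-loop budget.
local_ms = 8                                                # on-device inference
cloud_ms = cloud_path_ms(network_rtt_ms=60, server_infer_ms=15)
budget_ms = 20                                              # example response budget

meets_budget = {"local": local_ms <= budget_ms, "cloud": cloud_ms <= budget_ms}
```

Under these assumptions only the local path fits the budget, and the cloud figure is a best case: any retry or dropped packet makes it worse, which is why local inference is a reliability requirement here.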
Offline Behavior
Tiny deployments are expected to keep working during intermittent connectivity, power fluctuations, or bandwidth limits. Designing for offline-first behavior protects user trust and simplifies failure handling in the field.
Failure Pattern
Teams often over-scope tiny deployments by adding open-ended language tasks that belong to richer edge stacks. Keep TinyML workloads bounded and action-oriented to preserve reliability.
Key Point: If a missed or delayed response is unacceptable, prioritize on-device inference by default.
Workload Archetypes
Different edge tasks demand different model and pipeline patterns.
Common Tiny Tasks
Typical TinyML workloads include keyword spotting, visual wake word detection, vibration anomaly alerts, and simple classification on sensor streams. These tasks reward compact models with stable behavior under noisy inputs.
Boundary Cases
Open-ended generation and multi-step reasoning usually exceed strict MCU budgets and belong on stronger edge tiers. Tiny systems work best when outputs are bounded, testable, and tightly mapped to product actions.
Validation Signal
Validate decisions with real field constraints such as network loss, device sleep behavior, and wake-time budgets. Synthetic desktop checks rarely predict behavior on constrained targets.
Key Point: TinyML excels at narrow, high-value decisions made continuously at the edge.
Decision Framework
Choose between cloud, edge gateway, mobile, and TinyML with explicit criteria.
Decision Inputs
Use five inputs: maximum latency, privacy sensitivity, power budget, network reliability, and maintenance model. This framework keeps architecture decisions tied to product requirements instead of tooling preferences.
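The five inputs can be expressed as a small decision sketch. The thresholds and tier labels here are assumptions chosen for illustration, not standards; real projects should tune them in the decision memo:

```python
from dataclasses import dataclass

@dataclass
class DeploymentInputs:
    max_latency_ms: float     # hard product latency requirement
    privacy_sensitive: bool   # does raw data have to stay on-device?
    power_budget_mw: float    # sustained power available for inference (assumed unit)
    network_reliable: bool    # can we depend on connectivity in the field?
    fleet_updatable: bool     # maintenance model: can we push updates easily?

def recommend_tier(d: DeploymentInputs) -> str:
    """Sketch of the five-input framework; thresholds are illustrative assumptions."""
    if d.max_latency_ms < 50 or not d.network_reliable or d.privacy_sensitive:
        # Must run locally; the power budget decides how rich the local stack can be.
        return "tinyml" if d.power_budget_mw < 100 else "mobile_or_gateway"
    return "cloud" if d.fleet_updatable else "edge_gateway"
```

The point of the sketch is that every branch traces back to a product requirement, so a tier argument becomes a disagreement about an input value rather than a tooling preference.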
Escalation Path
If constraints relax over time, you can move from TinyML toward richer edge runtimes or hybrid cloud designs. A documented escalation path keeps model interfaces stable while underlying infrastructure evolves.
Governance Rule
Require cross-functional sign-off from product, firmware, and ML owners before committing to the target tier. Early alignment reduces downstream rework and roadmap churn.
Key Point: Architecture choice should be reversible, but the initial choice must match current constraints.
Scope Definition Checklist
A strong scope prevents shallow prototypes and fragile deployments.
Minimum Scope
Define target hardware, response-time budget, power profile, expected input range, and failure behavior before model training starts. These constraints shape data collection and directly influence model architecture choices.
Success Criteria
Success should include quality metrics plus operational metrics such as RAM peak, startup time, and field stability. TinyML programs fail when teams optimize for accuracy only and ignore operational acceptance gates.
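One way to make the operational gates explicit is a release check that fails on any missing or out-of-range metric. The metric names and thresholds below are illustrative placeholders, not recommendations:

```python
def passes_release_gate(metrics: dict) -> tuple[bool, list[str]]:
    """Check quality AND operational gates; thresholds are placeholder assumptions."""
    gates = {
        "accuracy": lambda v: v >= 0.92,          # quality metric
        "ram_peak_kb": lambda v: v <= 192,        # operational: peak RAM
        "startup_ms": lambda v: v <= 400,         # operational: cold start
        "field_crash_rate": lambda v: v <= 0.001, # operational: stability
    }
    # A metric that was never measured fails the gate, same as a bad value.
    failures = [name for name, check in gates.items()
                if name not in metrics or not check(metrics[name])]
    return (not failures, failures)
```

Treating an unmeasured metric as a failure is the sketch's key design choice: it blocks the accuracy-only release path the paragraph warns about.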
Handoff Artifact
Capture tier assumptions, model boundaries, and escalation paths in versioned docs so future updates do not break the original deployment logic. Review it at each release checkpoint so assumptions remain current.
Key Point: Production-ready TinyML starts with measurable constraints and measurable acceptance criteria.
Where Edge Programs Usually Fail
Most failures come from scope mismatch, not from model training quality.
Early Failure Signals
Warning signs include changing hardware targets mid-cycle, undefined offline behavior, and missing power acceptance criteria. These issues should trigger a scope review before additional model tuning work is approved.
Mitigation Pattern
Run a design checkpoint before model lock that revalidates tier choice against product requirements and deployment constraints. This checkpoint is often the highest-leverage quality gate in edge projects.
Key Point: Preventing scope drift early saves more time than late-stage optimization.
Execution Checklist
Use a repeatable checklist before moving from concept to implementation.
Checklist Items
Confirm tier selection, latency budget, power budget, privacy requirements, and offline fallback behavior in writing. Each item should have an accountable owner and measurable acceptance condition.
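A checklist like this can live as a versioned structure pairing each item with an accountable owner and a measurable acceptance condition. The items, roles, and conditions below are examples, not a canonical list:

```python
# (item, accountable owner role, measurable acceptance condition) — examples only.
CHECKLIST = [
    ("tier selection confirmed", "product", "decision memo signed off"),
    ("latency budget", "ml", "p95 inference <= budget on target hardware"),
    ("power budget", "firmware", "measured draw <= budget in duty-cycle test"),
    ("privacy requirements", "product", "data handling review passed"),
    ("offline fallback", "firmware", "device meets spec with radio disabled"),
]

def unowned_items(checklist, known_owners=frozenset({"product", "firmware", "ml"})):
    """Flag items whose owner is not one of the accountable roles."""
    return [item for item, owner, _ in checklist if owner not in known_owners]
```

Running the ownership check at each review checkpoint makes "each item has an accountable owner" a verifiable property rather than a hope.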
Review Cadence
Revisit this checklist at architecture freeze, first-device bring-up, and pre-release. Repeated review catches assumption drift and keeps engineering effort focused on the right deployment class.
Key Point: A short, repeated checklist outperforms one-time architecture decisions.