10
Hardware is changing to accommodate local AI natively.
- NPUs (Neural Processing Units): Dedicated AI chips are becoming standard in all new laptops and phones, drastically improving the battery life and performance of local models.
- Speculative Decoding: Using a tiny, lightning-fast model to guess the next 5 words, and a larger model to verify them, speeding up local generation by 2-3x.
The Bottom Line: Within a few years, every device will have an always-on, locally running SLM acting as the primary interface between the user and the operating system.