Edge & On-Device AI

Running AI where the data is — on phones, sensors, and microcontrollers — without a round trip to the cloud.

Most AI you use lives in giant data centres. When you ask your phone a question, it often sends your words across the internet, waits for a faraway computer to think, and sends the answer back. Edge AI flips this around: the thinking happens right on the device — your phone, watch, doorbell, or even a tiny sensor — with no trip to the cloud.

Think of it like cooking dinner in your own kitchen instead of ordering delivery. Delivery can bring you fancier meals, but your kitchen is faster, works when the internet is down, and nobody outside sees what you're making. The trade-off is space: a phone can't hold a restaurant's whole kitchen, so engineers shrink these AI models to fit, keeping the useful parts while trimming the rest.

The main ideas

Why on-device — Latency, privacy, offline use, and cost — the case for local inference.
Model compression — Quantization, pruning, and knowledge distillation to shrink models.
Efficient architectures — MobileNets, small language models, and hardware-aware design.
On-device runtimes — Core ML, TensorFlow Lite, ONNX Runtime, and the NPUs in modern chips.
TinyML — Machine learning on microcontrollers with kilobytes of memory.
Hybrid edge-cloud — Splitting work between device and server for the best of both.
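To make "shrinking a model" a little more concrete, here is a minimal sketch of one compression idea from the list above, quantization: storing each weight as a small 8-bit integer instead of a 32-bit float. The function names and the sample weights are made up for illustration, and real runtimes use more refined schemes (per-channel scales, calibration data), so treat this as a toy under those assumptions rather than how any particular library does it.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # One float (the scale) is kept alongside the integers so values can be recovered later.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate the original floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most about half of one scale step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error. That is the basic trade behind fitting a model into a phone's or sensor's memory.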

Related areas

AI Hardware & Compute · Data & MLOps · Computer Vision

