Edge AI (on-device LLMs) reached consumer devices in 2024-2025. Apple Intelligence (iPhone 15 Pro and later, M1 and later Macs), introduced in mid-2024, runs a 3B-parameter model on-chip. Google Gemini Nano (Pixel 8 and later, Android) is a 2B-parameter model. Llama 3.2 1B / 3B are open-weight models that run on a laptop when quantized to INT4. In the 2026 market, 42% of flagship smartphones have a built-in LLM. First-token latency is under 100 ms, and no data leaves the device. Quality, however, remains below that of frontier cloud models.
Below: key results, platform breakdown, implications, methodology, and FAQ.

| Metric | Value | Median | p75 |
|---|---|---|---|
| Flagship phones with on-device LLM | 42% | n/a | n/a |
| Apple Intelligence users (iPhone 15 Pro+) | 18% share | n/a | n/a |
| On-device TTFT (time to first token) | 85 ms | 85 ms | 160 ms |
| Apple Intelligence model size | 3B parameters, INT4 | n/a | n/a |
| Gemini Nano model size | 2B parameters | n/a | n/a |
| Quality gap vs GPT-5 (benchmark) | -30 to -50 points | n/a | n/a |
| Battery impact per 10 min of use | ~8% | 8% | 15% |
| Privacy: data stays on-device | 100% | n/a | n/a |
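
The battery row can be turned into a rough usage budget. The sketch below assumes drain scales linearly with usage time, which the source does not state; the function name `minutes_for_budget` is hypothetical, and the 8% and 15% figures are the median and p75 values from the table.

```python
def minutes_for_budget(budget_pct: float, drain_pct_per_10min: float) -> float:
    """Minutes of on-device LLM use that fit in a battery budget.

    Assumes linear drain (an illustration-only simplification).
    """
    if drain_pct_per_10min <= 0:
        raise ValueError("drain_pct_per_10min must be positive")
    return budget_pct / drain_pct_per_10min * 10


# Table values: median ~8% and p75 15% battery per 10 minutes of use.
for label, drain in [("median", 8.0), ("p75", 15.0)]:
    minutes = minutes_for_budget(20.0, drain)
    print(f"{label}: a 20% battery budget lasts ~{minutes:.1f} min")
```

Under this linear assumption, a 20% battery budget covers about 25 minutes at the median drain and about 13 minutes at p75.
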

| Platform | Share | Detail |
|---|---|---|
| iPhone 15 Pro / 16 (Apple Intelligence) | 21% | 3B on ANE |
| Pixel 8 / 9 (Gemini Nano) | 8% | 2B on TPU |
| Samsung Galaxy S24+ (Gemini Nano) | 12% | 2B |
| MacBook M1+ (Apple Intelligence) | 7% | 3B |
| Windows Copilot+ PC | 4% | Phi-3.5 / Llama 3.2 on NPU |

Statistics are drawn from Apple and Google earnings calls, StatCounter device share data, and benchmark testing of Apple Intelligence, Gemini Nano, and Llama 3.2 on reference hardware. March 2026.
Apple Intelligence is blocked by region, including the EU (DMA), China, and Russia. Workaround: change the region in your Apple ID. The trade-off is losing App Store access to region-restricted apps.
Yes, for simple tasks: summarization, classification, rewriting. The model runs on a consumer-grade CPU, and quality is comparable to GPT-3.5 for simple queries.
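
A back-of-the-envelope calculation shows why small INT4 models fit on consumer hardware. This is a minimal sketch that counts weights only (no KV cache, activations, or runtime overhead) and uses the nominal 1B and 3B parameter counts rather than exact ones; the helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Nominal sizes; real checkpoints differ slightly from the rounded names.
for name, params in [("Llama 3.2 1B", 1.0), ("Llama 3.2 3B", 3.0)]:
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"{name}: FP16 ~{fp16:.1f} GB, INT4 ~{int4:.1f} GB")
```

With these assumptions a 3B model needs about 6 GB of weights at FP16 but about 1.5 GB at INT4, which is small enough for the RAM of a typical laptop.
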
An NPU (Neural Processing Unit) is a dedicated chip for on-device AI. Apple ANE (Neural Engine): 35 TOPS. Google Tensor TPU. Intel Core Ultra NPU: 40 TOPS. An NPU runs AI workloads without loading the GPU or CPU.
No, frontier models (GPT-5, Claude Opus) are still cloud-only. On-device models are chosen for privacy, cost, and latency. A hybrid setup works best.
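
One way to picture the hybrid setup is a small router that keeps simple or private requests on-device and sends everything else to the cloud. This is an illustrative sketch only: the `Request` type, the `route` function, the task labels, and the rule that private data never leaves the device are assumptions, not part of any vendor's API.

```python
from dataclasses import dataclass

# Simple tasks the text lists as suitable for on-device models.
ON_DEVICE_TASKS = {"summary", "classification", "rewriting"}


@dataclass
class Request:
    task: str                      # hypothetical task label
    contains_private_data: bool = False


def route(req: Request) -> str:
    """Return "on-device" or "cloud" for a request."""
    # Assumed policy: private data never leaves the device.
    if req.contains_private_data:
        return "on-device"
    # Simple tasks stay local for latency and cost.
    if req.task in ON_DEVICE_TASKS:
        return "on-device"
    # Everything else goes to a frontier cloud model.
    return "cloud"


print(route(Request("summary")))                      # on-device
print(route(Request("multi-step reasoning")))         # cloud
print(route(Request("multi-step reasoning", True)))   # on-device
```

A real router would also need a fallback for when the local model's answer is too weak, which this sketch leaves out.
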