Apple rolls out on-device generative model API in iOS 18.4 for third-party apps

At its late-June platform update, Apple introduced a Local ML API in iOS 18.4 and macOS Sequoia that standardizes how third-party apps run generative models entirely on device. The API supports multiple model formats, requires developers to sign their models, and exposes energy and thermal budgets so apps can adapt inference to battery state and device temperature.
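To make the budget idea concrete, here is a minimal sketch of how an app might scale inference to thermal and battery conditions. It does not use the Local ML API described above, whose interface is not shown in this article. Every type and function name here (ThermalLevel, DeviceBudget, InferenceSettings, inferenceSettings(for:)) is hypothetical, and the thresholds are illustrative assumptions rather than Apple guidance.

```swift
// Hypothetical types for illustration only; none of these names come from Apple's API.
enum ThermalLevel: Int, Comparable {
    case nominal = 0, fair, serious, critical

    // Order levels by severity so they can be compared with >= and <.
    static func < (lhs: ThermalLevel, rhs: ThermalLevel) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

// Snapshot of the device conditions the app wants to respect.
struct DeviceBudget {
    var thermal: ThermalLevel
    var batteryFraction: Double   // 0.0 (empty) through 1.0 (full)
    var lowPowerMode: Bool
}

// Knobs an app might turn down when the budget is tight.
struct InferenceSettings: Equatable {
    var maxNewTokens: Int
    var useQuantizedModel: Bool
    var allowBackgroundWork: Bool
}

// Map a budget to settings. The cutoffs (20% battery, 64 and 512 tokens)
// are assumed values chosen for the example.
func inferenceSettings(for budget: DeviceBudget) -> InferenceSettings {
    // Critical thermal state: skip generation entirely.
    if budget.thermal == .critical {
        return InferenceSettings(maxNewTokens: 0,
                                 useQuantizedModel: true,
                                 allowBackgroundWork: false)
    }

    // Hot device, Low Power Mode, or low battery: run a reduced configuration.
    let constrained = budget.thermal >= .serious
        || budget.lowPowerMode
        || budget.batteryFraction < 0.2
    if constrained {
        return InferenceSettings(maxNewTokens: 64,
                                 useQuantizedModel: true,
                                 allowBackgroundWork: false)
    }

    // Otherwise run the full configuration; background work only when fully cool.
    return InferenceSettings(maxNewTokens: 512,
                             useQuantizedModel: false,
                             allowBackgroundWork: budget.thermal == .nominal)
}
```

On Apple platforms, the inputs could plausibly be filled from existing system calls such as ProcessInfo.processInfo.thermalState, ProcessInfo.processInfo.isLowPowerModeEnabled, and UIDevice.current.batteryLevel. Keeping the policy a pure function of a snapshot makes it easy to unit test without a device.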

The move aims to balance performance and privacy: Apple emphasizes sandboxed execution and a new system-level permission prompt for model access, modeled on the existing microphone and location prompts. Developers can also opt into a secure model cache that persists signed weights across updates and is subject to system entitlements.

For UX teams, Apple added Human-in-the-Loop guidelines and new UI controls for real-time model status (latency, energy usage). Early test builds already show snappier autocomplete and summarization features in Mail and Notes when the Local ML pipeline is used.