Foundation Models Guide / Chapter 1

Introduction to Foundation Models

The Foundation Models framework is Apple’s new framework that gives you direct access to the on-device large language model used by Apple Intelligence. This chapter introduces the framework’s capabilities and architectural decisions, and shows how it helps you build AI features for your apps without the complexity of downloading and running models yourself.

The framework’s announcement at WWDC 2025 was predicted by Bloomberg, which reported in May 2025 that Apple was preparing to allow developers to use the model that powers Apple Intelligence through an AI SDK.

Prerequisites and Context

This guide assumes you are comfortable with Swift and SwiftUI development. No prior AI or machine learning experience is required. The APIs are designed with iOS developers in mind, with familiar Swift patterns and straightforward integration into your existing app architecture.

What You Will Learn

By the end of this chapter, you will understand:

What Foundation Models can and cannot do for your apps
How guided generation solves the structured output problem
Why Foundation Models uses snapshots instead of token streaming
When to choose Foundation Models vs MLX Swift
The framework’s limitations and architectural decisions

The Foundation Models framework is available on iOS 26.0+, iPadOS 26.0+, macOS 26.0+, and visionOS 26.0+, in all regions where Apple Intelligence is available. As of September 2025, that excludes mainland China.

The best way to understand what this framework can achieve is to look at some examples of what you can build with it:

Personalized suggestions that draw on your app’s content when you provide it as context
Travel itineraries generated on-demand in a travel app
Dynamic game dialog created for characters
Content summarization and analysis of user input
Structured data extraction from unstructured text

All of this runs completely on-device, so user data stays private, features work offline, and your app size does not increase, assuming the Apple Intelligence model is already downloaded on the device.

The Model

Foundation Models is powered by an approximately 3 billion parameter large language model, with each parameter quantized to 2 bits. This model outperforms Llama 3.2 3B but is comparable to or slightly behind Qwen 3 4B and Gemma 3 4B models.

Keep in mind that this is a device-scale model. It is optimized for specific tasks such as summarization, extraction, classification, and guided generation, which converts unstructured text into structured data that you can use directly in your app.

It is not designed for real-world knowledge or advanced reasoning. The model’s training cutoff is late 2023, so you should not rely on it for recent events. Tasks that need current knowledge or deep reasoning are better handled by state-of-the-art server-side LLMs such as Claude Sonnet 4, Gemini 2.5 Pro, or OpenAI’s GPT-5.

Guided Generation

Language models produce unstructured text that is easy for humans to read but difficult to map onto views in your app. You can write code to parse the text into structured data, but this approach is error-prone and difficult to maintain—you are essentially hoping the model will always produce correctly formatted data. Foundation Models solves this with guided generation, a system that guarantees type-safe, structured output directly without parsing JSON or dealing with decoding issues.

You will learn the complete details of guided generation, including @Generable types and @Guide attributes, in a later chapter on structured generation.
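As a first taste, here is a minimal sketch of guided generation. It assumes a deployment target of iOS 26.0+ and a device where the model is available. The TripIdea type, its fields, and the prompt are hypothetical and exist only for illustration.

```swift
import FoundationModels

// Hypothetical output type for illustration; it is not part of the framework.
@Generable
struct TripIdea {
    @Guide(description: "A short, catchy title for the trip")
    var title: String

    @Guide(description: "Activities to do on the trip", .count(3))
    var activities: [String]
}

// Asks the on-device model for a TripIdea instead of free-form text.
func suggestTrip(to destination: String) async throws -> TripIdea {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Suggest a weekend trip to \(destination).",
        generating: TripIdea.self
    )
    // response.content is already a TripIdea; there is no JSON to decode.
    return response.content
}
```

The point of the sketch is the return type: the caller receives a Swift value with typed properties rather than a string it has to parse.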

Streaming with Snapshots

Foundation Models takes a different approach to streaming than other frameworks. Instead of raw deltas (short groups of characters), it streams snapshots: partially generated objects whose fields are populated with whatever the model has produced so far.

Typically, as deltas are produced, you accumulate them yourself. But when the result has structure, you need to parse it out of the accumulation after each delta. This is not trivial for complex structures, or even simple ones for that matter.

Foundation Models transforms deltas into snapshots that represent partially generated responses. Their properties are all optional and get filled in as the model produces more of the response. This is a much simpler approach than accumulating deltas and parsing them out.

This works well with SwiftUI: you hold a partially generated type in state, iterate over the response stream, and let the UI animate as the structured data fills in.
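The pattern described above can be sketched roughly as follows. The Recipe type, the view, and the prompt are hypothetical. The sketch also assumes the shipping iOS 26 SDK, where each stream element is a snapshot whose content property holds the partially generated value; early SDK seeds yielded the partial value directly, so check this against the SDK you build with.

```swift
import FoundationModels
import SwiftUI

// Hypothetical output type for illustration.
@Generable
struct Recipe {
    var name: String
    var ingredients: [String]
}

struct RecipeView: View {
    // Every property of the partially generated type is optional.
    @State private var recipe: Recipe.PartiallyGenerated?

    var body: some View {
        List {
            if let name = recipe?.name {
                Text(name).font(.headline)
            }
            ForEach(recipe?.ingredients ?? [], id: \.self) { ingredient in
                Text(ingredient)
            }
        }
        // Animate whenever another ingredient arrives.
        .animation(.default, value: recipe?.ingredients?.count)
        .task { await generate() }
    }

    private func generate() async {
        let session = LanguageModelSession()
        let stream = session.streamResponse(
            to: "Create a simple pasta recipe.",
            generating: Recipe.self
        )
        do {
            for try await snapshot in stream {
                // Assumption: the element exposes the partial value as `content`.
                recipe = snapshot.content
            }
        } catch {
            print("Generation failed: \(error)")
        }
    }
}
```

Note that the view never accumulates or parses text. Each iteration simply replaces the state with the latest snapshot.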

Tool Calling

Tool calling lets the model call functions you define, extending it beyond generating text from its own limited knowledge. With tools, the model can access real-world data, fetch data from system frameworks like HealthKit or EventKit, and take actions automatically, either inside your app or outside it, such as creating reminders.

You will learn how to build and use tools in later chapters on tool use and external API integration.
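To make the idea concrete before those chapters, here is a hedged sketch of a tool. StepCountTool, its argument, and the canned numbers are hypothetical; a real tool would query HealthKit. The sketch assumes the shipping iOS 26 SDK, where a tool's call method can return a String (early SDK seeds used a dedicated output type instead).

```swift
import FoundationModels

// Hypothetical tool for illustration; names and data are made up.
struct StepCountTool: Tool {
    let name = "getStepCount"
    let description = "Returns the user's step count for a given day."

    @Generable
    struct Arguments {
        @Guide(description: "The day to look up, such as today or yesterday")
        var day: String
    }

    func call(arguments: Arguments) async throws -> String {
        // Canned data keeps the sketch self-contained; query HealthKit in a real app.
        let steps = arguments.day.lowercased() == "today" ? 4_200 : 8_000
        return "\(steps) steps on \(arguments.day)"
    }
}

func askAboutSteps() async throws -> String {
    let session = LanguageModelSession(
        tools: [StepCountTool()],
        instructions: "Use the getStepCount tool when the user asks about steps."
    )
    // The model decides on its own whether and when to call the tool.
    let response = try await session.respond(to: "How many steps did I take today?")
    return response.content
}
```

The model generates the Arguments value through guided generation, so the tool receives typed input rather than raw text.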

Stateful Sessions and Multi-Turn Conversations

Foundation Models is built around stateful sessions that maintain conversation context. Each interaction is retained in a transcript, allowing the model to understand past interactions within a session.

You will learn how to create sessions, manage conversations, and use transcripts in the upcoming chapter on sessions and a later chapter on advanced chat patterns.
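In the meantime, a minimal sketch of a two-turn conversation looks like this. The instructions and prompts are made up for illustration, and the sketch assumes the model is available on the device.

```swift
import FoundationModels

func planDinner() async throws {
    // One session holds the whole conversation.
    let session = LanguageModelSession(
        instructions: "You are a concise cooking assistant."
    )

    let first = try await session.respond(to: "Suggest a vegetarian dinner.")
    print(first.content)

    // Same session: the follow-up can refer to the first answer
    // because both turns are kept in the session's transcript.
    let second = try await session.respond(to: "Make it spicier.")
    print(second.content)

    // Inspect the recorded history of the conversation.
    print(session.transcript)
}
```

If the second prompt were sent to a new session instead, the model would have no idea what "it" refers to.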

Foundation Models vs MLX Swift

The two frameworks serve different purposes:

Foundation Models gives you a high-level API focused on app features. It uses Apple’s system language models and provides guaranteed type-safe structured output, along with built-in tool calling and conversation memory. The biggest caveat is that it is only available on iOS 26.0+ and on devices that support Apple Intelligence.

MLX Swift gives you low-level control over the pipeline and complete flexibility over model choice. You can run any compatible model from Hugging Face or your own fine-tuned models, you get raw tool-calling output that you parse yourself, and it works on older devices (including iPhone 13 with iOS 16.0+).

Choose Foundation Models when you want to build user-facing AI features quickly with Apple’s model and can afford to limit the feature to iOS 26.0+.

Choose MLX Swift when you need specific models or complete control over the AI pipeline.

Developer Experience

The Foundation Models framework is one of the most developer-friendly frameworks from Apple. You can tell it was built by developers who actually use developer tools. The framework includes a simple playground for testing prompts directly in Xcode, and Instruments integration for performance profiling.

Limitations and Considerations

Foundation Models has several important constraints to understand:

Shared context window between input and output. The default is 4096 tokens, but starting in iOS 26.4, you can query the actual size with SystemLanguageModel.default.contextSize, and it may grow as Apple updates the on-device model
No versioning because models are tied to OS releases
Text-only with no vision capabilities as of September 2025
Performance varies as complex generations can take time

You will learn how to work within these limitations throughout the following chapters.
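As one example of working within the shared context window, the sketch below catches the error the framework throws when a session's transcript no longer fits and retries in a fresh session. The function name and the retry strategy are illustrative assumptions, not a recommended policy. The fresh session starts empty, so earlier context and any instructions are lost unless you carry them over yourself.

```swift
import FoundationModels

// Hypothetical helper: retries once in a new session if the context window overflows.
func respondWithFreshSessionFallback(
    session: LanguageModelSession,
    prompt: String
) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // Input plus output no longer fit in the window.
        // Start over with an empty transcript; previous turns are dropped.
        let fresh = LanguageModelSession()
        return try await fresh.respond(to: prompt).content
    }
}
```

A gentler variant would summarize the old conversation and seed the new session with that summary, at the cost of an extra model call.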

What’s Next

Now that you understand what Foundation Models can do, the next chapter gets you hands-on with sessions, the core building block of every Foundation Models interaction. You will learn to check availability, create your first session, and build a simple chat interface.

Before you start, verify your development environment meets the iOS 26.0+ requirement.

You need Xcode 26.0+ with the macOS 26.0+ or iOS 26.0+ SDK to run on your device, and Apple Intelligence must be enabled on your test devices. To test in the iOS Simulator, your Mac must be running the latest macOS 26 Tahoe with the latest iOS 26.0+ SDK.
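A quick way to confirm that a test device is ready is to ask the framework itself. The sketch below switches over the system model's availability; the function name and the returned messages are made up for illustration.

```swift
import FoundationModels

// Hypothetical helper that turns the model's availability into a readable message.
func describeModelAvailability() -> String {
    switch SystemLanguageModel.default.availability {
    case .available:
        return "The on-device model is ready to use."
    case .unavailable(.deviceNotEligible):
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Apple Intelligence is turned off in Settings."
    case .unavailable(.modelNotReady):
        return "The model is still downloading or otherwise not ready yet."
    case .unavailable(let reason):
        // Catches any reason not handled above.
        return "The model is unavailable: \(reason)"
    }
}
```

Calling this once at launch and logging the result is usually enough to tell a setup problem apart from a bug in your own code.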