---
name: axiom-foundation-models-diag
description: Use when debugging Foundation Models issues such as context exceeded, guardrail violations, slow generation, availability problems, unsupported language, or unexpected output. Systematic diagnostics with production crisis defense.
license: MIT
compatibility: iOS 26+, macOS 26+, iPadOS 26+, visionOS 26+
metadata:
  version: "1.0.0"
  last-updated: "2025-12-03"
---

# Foundation Models Diagnostics

## Overview

Foundation Models issues manifest as context window exceeded errors, guardrail violations, slow generation, availability failures, and unexpected output. **Core principle** 80% of Foundation Models problems stem from misunderstanding model capabilities (3B parameter device-scale model, not world knowledge), context limits (4096 tokens), or availability requirements, not framework bugs.

## Red Flags: Suspect Foundation Models Issue

If you see ANY of these, suspect a Foundation Models misunderstanding, not framework breakage:
- Generation takes >5 seconds
- Error: `exceededContextWindowSize`
- Error: `guardrailViolation`
- Error: `unsupportedLanguageOrLocale`
- Model gives hallucinated/wrong output
- UI freezes during generation
- Feature works in simulator but not on device
- ❌ **FORBIDDEN** "Foundation Models is broken, we need a different AI"
  - Foundation Models powers Apple Intelligence across millions of devices
  - Wrong output = wrong use case (world knowledge vs summarization)
  - Do not rationalize away the issue; diagnose it

**Critical distinction** Foundation Models is a **device-scale model** (3B parameters) optimized for summarization, extraction, classification, NOT world knowledge or complex reasoning. Using it for the wrong task guarantees poor results.

## Mandatory First Steps

**ALWAYS run these FIRST** (before changing code):

```swift
import FoundationModels

// 1. Check availability
let availability = SystemLanguageModel.default.availability

switch availability {
case .available:
    print("✅ Available")
case .unavailable(let reason):
    print("❌ Unavailable: \(reason)")
    // Possible reasons:
    // - Device not Apple Intelligence-capable
    // - Region not supported
    // - User not opted in
}

// Record: "Available? Yes/no, reason if not"

// 2. Check supported languages
let supported = SystemLanguageModel.default.supportedLanguages
print("Supported languages: \(supported)")
print("Current locale: \(Locale.current.language)")

if !supported.contains(Locale.current.language) {
    print("⚠️ Current language not supported!")
}

// Record: "Language supported? Yes/no"

// 3. Check context usage
let session = LanguageModelSession()
// After some interactions:
print("Transcript entries: \(session.transcript.entries.count)")

// Rough estimation (not exact); entries are rendered with
// String(describing:), so the character count is only approximate:
let transcriptText = session.transcript.entries
    .map { String(describing: $0) }
    .joined()
print("Approximate chars: \(transcriptText.count)")
print("Rough token estimate: \(transcriptText.count / 3)")
// 4096 token limit ≈ 12,000 characters

// Record: "Approaching context limit? Yes/no"

// 4. Profile with Instruments
// Run with Foundation Models Instrument template
// Check:
// - Initial model load time
// - Token counts (input/output)
// - Generation time per request
// - Areas for optimization

// Record: "Latency profile: [numbers from Instruments]"

// 5. Inspect transcript for debugging
print("Full transcript:")
for entry in session.transcript.entries {
    print("Entry: \(String(describing: entry).prefix(100))...")
}

// Record: "Any unusual entries? Repeated content?"
```

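The character-based estimate from step 3 can be wrapped in a small helper so the same check runs before every request. This is a minimal sketch in plain Swift: the function names and the 80% warning threshold are illustrative assumptions, and the 3-characters-per-token ratio is the rough heuristic from the checklist, not an exact tokenizer.

```swift
/// Rough token estimate using the ~3 characters per token heuristic (assumption, not exact).
func estimatedTokens(for text: String) -> Int {
    text.count / 3
}

/// Returns true when the estimate reaches a chosen fraction of the context window.
/// `limit` defaults to the 4096-token window; `threshold` is a hypothetical warning level.
func isNearContextLimit(_ text: String, limit: Int = 4096, threshold: Double = 0.8) -> Bool {
    Double(estimatedTokens(for: text)) >= Double(limit) * threshold
}

let sample = String(repeating: "a", count: 12_000)
print(estimatedTokens(for: sample))        // 4000
print(isNearContextLimit(sample))          // true
print(isNearContextLimit("Short prompt"))  // false
```
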
#### What this tells you
- **Unavailable** → Proceed to Pattern 1a/1b/1c (availability issues)
- **Context exceeded** → Proceed to Pattern 2a (token limit)
- **Guardrail error** → Proceed to Pattern 2b (content policy)
- **Language error** → Proceed to Pattern 2c (unsupported language)
- **Wrong output** → Proceed to Pattern 3a/3b/3c/3d (output quality)
- **Slow generation** → Proceed to Pattern 4a/4b/4c/4d (performance)
- **UI frozen** → Proceed to Pattern 5a (main thread blocking)

#### MANDATORY INTERPRETATION

Before changing ANY code, identify ONE of these:

1. If `availability = .unavailable` → Device/region/opt-in issue (not code bug)
2. If error is `exceededContextWindowSize` → Too many tokens (condense transcript)
3. If error is `guardrailViolation` → Content policy triggered (not model failure)
4. If error is `unsupportedLanguageOrLocale` → Language not supported (check supported list)
5. If output is hallucinated → Wrong use case (world knowledge vs extraction)
6. If generation >5 seconds → Not streaming or need optimization
7. If UI frozen → Calling on main thread (use Task {})

#### If diagnostics are contradictory or unclear
- STOP. Do NOT proceed to patterns yet
- Add detailed logging to every `respond()` call
- Run with Instruments Foundation Models template
- Establish baseline: what's actually happening vs what you assumed

## Decision Tree

```
Foundation Models problem?
│
├─ Won't start?
│  └─ .unavailable → Availability issue
│     ├─ Device not capable? → Pattern 1a (device requirement)
│     ├─ Region restriction? → Pattern 1b (regional availability)
│     └─ User not opted in? → Pattern 1c (Settings check)
│
├─ Generation fails?
│  ├─ exceededContextWindowSize → Context limit
│  │  └─ Long conversation or verbose prompts? → Pattern 2a (condense)
│  │
│  ├─ guardrailViolation → Content policy
│  │  └─ Sensitive or inappropriate content? → Pattern 2b (handle gracefully)
│  │
│  ├─ unsupportedLanguageOrLocale → Language issue
│  │  └─ Non-English or unsupported language? → Pattern 2c (language check)
│  │
│  └─ Other error → General error handling
│     └─ Unknown error type? → Pattern 2d (catch-all)
│
├─ Output wrong?
│  ├─ Hallucinated facts → Wrong model use
│  │  └─ Asking for world knowledge? → Pattern 3a (use case mismatch)
│  │
│  ├─ Wrong structure → Parsing issue
│  │  └─ Manual JSON parsing? → Pattern 3b (use @Generable)
│  │
│  ├─ Missing data → Tool needed
│  │  └─ Need external information? → Pattern 3c (tool calling)
│  │
│  └─ Inconsistent output → Sampling issue
│     └─ Different results each time? → Pattern 3d (temperature/greedy)
│
├─ Too slow?
│  ├─ Initial delay (1-2s) → Model loading
│  │  └─ First request slow? → Pattern 4a (prewarm)
│  │
│  ├─ Long wait for results → Not streaming
│  │  └─ User waits 3-5s? → Pattern 4b (streaming)
│  │
│  ├─ Verbose schema → Token overhead
│  │  └─ Large @Generable type? → Pattern 4c (includeSchemaInPrompt)
│  │
│  └─ Complex prompt → Too much processing
│     └─ Massive prompt or task? → Pattern 4d (break down)
│
└─ UI frozen?
   └─ Main thread blocked → Async issue
      └─ App unresponsive during generation? → Pattern 5a (Task {})
```

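The top level of the tree can also be encoded as data, which is handy for tagging bug reports or analytics events with the patterns to try first. The sketch below is plain Swift; the `Symptom` enum and its case names are hypothetical and only mirror the five branches of the tree.

```swift
/// Top-level branches of the decision tree (hypothetical names for illustration).
enum Symptom: CaseIterable {
    case wontStart, generationFails, outputWrong, tooSlow, uiFrozen

    /// Diagnostic patterns to work through for this symptom, in tree order.
    var candidatePatterns: [String] {
        switch self {
        case .wontStart:       return ["1a", "1b", "1c"]
        case .generationFails: return ["2a", "2b", "2c", "2d"]
        case .outputWrong:     return ["3a", "3b", "3c", "3d"]
        case .tooSlow:         return ["4a", "4b", "4c", "4d"]
        case .uiFrozen:        return ["5a"]
        }
    }
}

for symptom in Symptom.allCases {
    print("\(symptom): \(symptom.candidatePatterns.joined(separator: ", "))")
}
```
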
## Diagnostic Patterns

### Pattern 1a: Device Not Capable

**Symptom**:
- `SystemLanguageModel.default.availability = .unavailable`
- Reason: Device not Apple Intelligence-capable

**Diagnosis**:
```swift
let availability = SystemLanguageModel.default.availability

switch availability {
case .available:
    print("✅ Available")
case .unavailable(let reason):
    print("❌ Reason: \(reason)")
    // Check if device-related
}
```

**Fix**:
```swift
// ❌ BAD - No availability UI
let session = LanguageModelSession() // Fails on unsupported devices

// ✅ GOOD - Graceful UI
struct AIFeatureView: View {
    @State private var availability = SystemLanguageModel.default.availability

    var body: some View {
        switch availability {
        case .available:
            AIContentView()
        case .unavailable:
            VStack {
                Image(systemName: "cpu")
                Text("AI features require Apple Intelligence")
                    .font(.headline)
                Text("Available on iPhone 15 Pro and later")
                    .font(.caption)
                    .foregroundColor(.secondary)
            }
        }
    }
}
```

**Time cost**: 5-10 minutes to add UI

---

### Pattern 1b: Regional Availability

**Symptom**:
- Feature works for some users, not others
- `.unavailable` due to region restrictions

**Diagnosis**:
Foundation Models requires:
- Supported region (e.g., US, UK, Australia initially)
- May expand over time

**Fix**:
```swift
// ✅ GOOD - Clear messaging (inside a SwiftUI view body)
switch SystemLanguageModel.default.availability {
case .available:
    AIContentView() // proceed with the AI feature
case .unavailable:
    // Show region-specific message
    Text("AI features not yet available in your region")
    Text("Check Settings → Apple Intelligence for availability")
}
```

**Time cost**: 5 minutes

---

### Pattern 1c: User Not Opted In

**Symptom**:
- Device capable, region supported
- Still `.unavailable`

**Diagnosis**:
User must opt in to Apple Intelligence in Settings

**Fix**:
```swift
// ✅ GOOD - Direct user to settings (inside a SwiftUI view body)
switch SystemLanguageModel.default.availability {
case .available:
    AIContentView() // proceed with the AI feature
case .unavailable:
    VStack {
        Text("Enable Apple Intelligence")
        Text("Settings → Apple Intelligence → Enable")
        Button("Open Settings") {
            if let url = URL(string: UIApplication.openSettingsURLString) {
                UIApplication.shared.open(url)
            }
        }
    }
}
```

**Time cost**: 10 minutes

---

### Pattern 2a: Context Window Exceeded

**Symptom**:
```
Error: LanguageModelSession.GenerationError.exceededContextWindowSize
```

**Diagnosis**:
- 4096 token limit (input + output)
- Long conversations accumulate tokens
- Verbose prompts eat into limit

**Fix**:
```swift
// ❌ BAD - Unhandled error
let response = try await session.respond(to: prompt)
// Throws once the transcript outgrows the window (often after ~10-15 turns)

// ✅ GOOD - Condense transcript
var session = LanguageModelSession()

do {
    let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // Condense and continue
    session = condensedSession(from: session)
    let response = try await session.respond(to: prompt)
}

func condensedSession(from previous: LanguageModelSession) -> LanguageModelSession {
    let entries = previous.transcript.entries

    guard entries.count > 2 else {
        return LanguageModelSession(transcript: previous.transcript)
    }

    // Keep: first (instructions) + last (recent context)
    let condensed = [entries.first!, entries.last!]

    let transcript = Transcript(entries: condensed)
    return LanguageModelSession(transcript: transcript)
}
```

**Time cost**: 15-20 minutes to implement condensing

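Keeping only the first and last entry is the simplest condensing policy; a common variation keeps the first entry plus the last few turns so the model retains more recent context. The sketch below shows that selection logic on a generic array in plain Swift. The `condense(_:keepingLast:)` helper is hypothetical, and the string array merely stands in for transcript entries; the selected entries would then be handed to a new session as in the fix for this pattern.

```swift
/// Keeps the first element (typically the instructions) plus the most recent
/// `recentCount` elements. Returns the input unchanged when nothing would be dropped.
func condense<Entry>(_ entries: [Entry], keepingLast recentCount: Int) -> [Entry] {
    guard recentCount >= 0,
          let first = entries.first,
          entries.count > recentCount + 1 else {
        return entries
    }
    return [first] + entries.suffix(recentCount)
}

// Stand-in data: real code would pass transcript entries instead of strings.
let history = ["instructions", "turn 1", "turn 2", "turn 3", "turn 4"]
print(condense(history, keepingLast: 2))  // ["instructions", "turn 3", "turn 4"]
```
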
---

### Pattern 2b: Guardrail Violation

**Symptom**:
```
Error: LanguageModelSession.GenerationError.guardrailViolation
```

**Diagnosis**:
- User input triggered content policy
- Violence, hate speech, illegal activities
- Model refuses to generate

**Fix**:
```swift
// ✅ GOOD - Graceful handling
do {
    let response = try await session.respond(to: userInput)
    print(response.content)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Show user-friendly message
    print("I can't help with that request")
    // Log for review (but don't show user input to avoid storing harmful content)
}
```

**Time cost**: 5-10 minutes

---

### Pattern 2c: Unsupported Language

**Symptom**:
```
Error: LanguageModelSession.GenerationError.unsupportedLanguageOrLocale
```

**Diagnosis**:
User input is in a language the model doesn't support

**Fix**:
```swift
// ❌ BAD - No language check
let response = try await session.respond(to: userInput)
// Throws if the language is unsupported

// ✅ GOOD - Check first
let supported = SystemLanguageModel.default.supportedLanguages

guard supported.contains(Locale.current.language) else {
    // Show disclaimer
    print("Language not supported. Currently supports: \(supported)")
    return
}

// Also handle errors
do {
    let response = try await session.respond(to: userInput)
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
    print("Please use English or another supported language")
}
```

**Time cost**: 10 minutes

---

### Pattern 2d: General Error Handling

**Symptom**:
Unknown error types

**Fix**:
```swift
// ✅ GOOD - Comprehensive error handling
do {
    let response = try await session.respond(to: prompt)
    print(response.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // Handle context overflow
    session = condensedSession(from: session)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Handle content policy
    showMessage("Cannot generate that content")
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
    // Handle language issue
    showMessage("Language not supported")
} catch {
    // Catch-all for unexpected errors
    print("Unexpected error: \(error)")
    showMessage("Something went wrong. Please try again.")
}
```

**Time cost**: 10-15 minutes

---

### Pattern 3a: Hallucinated Output (Wrong Use Case)

**Symptom**:
- Model gives factually incorrect answers
- Makes up information

**Diagnosis**:
Using model for world knowledge (wrong use case)

**Fix**:
```swift
// ❌ BAD - Wrong use case
let prompt = "Who is the president of France?"
let response = try await session.respond(to: prompt)
// Will hallucinate or give outdated info

// ✅ GOOD - Use server LLM for world knowledge
// Foundation Models is for:
// - Summarization
// - Extraction
// - Classification
// - Content generation

// OR: Use Tool calling with external data source
struct GetFactTool: Tool {
    let name = "getFact"
    let description = "Fetch factual information from verified source"

    @Generable
    struct Arguments {
        let query: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // Fetch from Wikipedia API, news API, etc.
        // (fetchFactFromAPI is a placeholder for your own networking code)
        let fact = await fetchFactFromAPI(arguments.query)
        return ToolOutput(fact)
    }
}
```

**Time cost**: 20-30 minutes to implement tool OR switch to appropriate AI

---

### Pattern 3b: Wrong Structure (Not Using @Generable)

**Symptom**:
- Parsing errors
- Invalid JSON
- Wrong keys

**Diagnosis**:
Manual JSON parsing instead of @Generable

**Fix**:
```swift
// ❌ BAD - Manual parsing
let prompt = "Generate person as JSON"
let response = try await session.respond(to: prompt)
let data = response.content.data(using: .utf8)!
let person = try JSONDecoder().decode(Person.self, from: data) // Throws on malformed or mismatched JSON

// ✅ GOOD - @Generable
@Generable
struct Person {
    let name: String
    let age: Int
}

let response = try await session.respond(
    to: "Generate a person",
    generating: Person.self
)
// response.content is type-safe Person, guaranteed structure
```

**Time cost**: 10 minutes to convert to @Generable

---

### Pattern 3c: Missing Data (Need Tool)

**Symptom**:
- Model doesn't have required information
- Output is vague or generic

**Diagnosis**:
Need external data (weather, locations, contacts)

**Fix**:
```swift
// ❌ BAD - No external data
let response = try await session.respond(
    to: "What's the weather in Tokyo?"
)
// Will make up weather data

// ✅ GOOD - Tool calling
import WeatherKit

struct GetWeatherTool: Tool {
    let name = "getWeather"
    let description = "Get current weather for a city"

    @Generable
    struct Arguments {
        let city: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // WeatherKit expects a CLLocation, so resolve the city name first
        // (geocode(_:) is a placeholder for your own geocoding helper)
        let location = try await geocode(arguments.city)
        // Fetch real weather
        let weather = try await WeatherService.shared.weather(for: location)
        let temperature = weather.currentWeather.temperature.converted(to: .fahrenheit)
        return ToolOutput("Temperature: \(Int(temperature.value))°F")
    }
}

let session = LanguageModelSession(tools: [GetWeatherTool()])
let response = try await session.respond(to: "What's the weather in Tokyo?")
// Uses real weather data
```

**Time cost**: 20-30 minutes to implement tool

---

### Pattern 3d: Inconsistent Output (Sampling)

**Symptom**:
- Different output every time for same prompt
- Need consistent results for testing

**Diagnosis**:
Random sampling (default behavior)

**Fix**:
```swift
// Default: Random sampling
let response1 = try await session.respond(to: "Write a haiku")
let response2 = try await session.respond(to: "Write a haiku")
// Different every time

// ✅ For deterministic output (testing/demos)
let deterministic = try await session.respond(
    to: "Write a haiku",
    options: GenerationOptions(sampling: .greedy)
)
// Same output for same prompt (given same model version)

// ✅ For low variance
let focused = try await session.respond(
    to: "Classify this article",
    options: GenerationOptions(temperature: 0.5)
)
// Slightly varied but focused

// ✅ For high creativity
let creative = try await session.respond(
    to: "Write a creative story",
    options: GenerationOptions(temperature: 2.0)
)
// Very diverse output
```

**Time cost**: 2-5 minutes

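To build intuition for why temperature changes variance, the sketch below applies temperature scaling to a toy set of candidate scores in plain Swift. It is a generic illustration of the technique, not a description of how the framework samples internally; the scores and function names are made up for the example. Lower temperatures concentrate probability on the top candidate, higher temperatures spread it out, and greedy selection always takes the top candidate.

```swift
import Foundation

/// Softmax over toy candidate scores, scaled by temperature.
func softmax(_ scores: [Double], temperature: Double) -> [Double] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = scores.map { $0 / temperature }
    let maxScore = scaled.max() ?? 0            // subtract the max for numerical stability
    let exps = scaled.map { exp($0 - maxScore) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

/// Greedy selection: always the index of the highest score, so it is deterministic.
func greedyIndex(_ scores: [Double]) -> Int? {
    scores.indices.max { scores[$0] < scores[$1] }
}

let scores = [2.0, 1.0, 0.5]                    // hypothetical scores for three candidates
let focused = softmax(scores, temperature: 0.5) // sharper distribution
let diverse = softmax(scores, temperature: 2.0) // flatter distribution

print(focused[0] > diverse[0])                  // true: low temperature favors the top candidate
print(greedyIndex(scores) == 0)                 // true: greedy always picks the top candidate
```
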
---

### Pattern 4a: Initial Latency (Prewarm)

**Symptom**:
- First generation takes 1-2 seconds to start
- Subsequent requests faster

**Diagnosis**:
Model loading time

**Fix**:
```swift
// ❌ BAD - Load on user interaction
Button("Generate") {
    Task {
        let session = LanguageModelSession() // 1-2s delay here
        let response = try await session.respond(to: prompt)
    }
}

// ✅ GOOD - Prewarm on init
class ViewModel: ObservableObject {
    private let session: LanguageModelSession

    init() {
        // Create the session and prewarm before user interaction
        session = LanguageModelSession(instructions: "...")
        session.prewarm()
    }

    func generate(prompt: String) async throws -> String {
        // The session already exists, so no creation cost is paid here
        try await session.respond(to: prompt).content
    }
}
```

**Time cost**: 10 minutes
**Latency saved**: 1-2 seconds on first request

---

### Pattern 4b: Long Generation (Streaming)

**Symptom**:
- User waits 3-5 seconds seeing nothing
- Then entire result appears at once

**Diagnosis**:
Not streaming long generations

**Fix**:
```swift
// ❌ BAD - No streaming
let response = try await session.respond(
    to: "Generate 5-day itinerary",
    generating: Itinerary.self
)
// User waits 4 seconds seeing nothing

// ✅ GOOD - Streaming
@Generable
struct Itinerary {
    var destination: String
    var days: [DayPlan]
}

let stream = session.streamResponse(
    to: "Generate 5-day itinerary to Tokyo",
    generating: Itinerary.self
)

for try await partial in stream {
    // Update UI incrementally
    self.itinerary = partial
}
// User sees destination in 0.5s, then days progressively
```

**Time cost**: 15-20 minutes
**Perceived latency**: 0.5s vs 4s

---

### Pattern 4c: Large Schema Overhead

**Symptom**:
- Subsequent requests with same @Generable type slow

**Diagnosis**:
Schema re-inserted into prompt every time

**Fix**:
```swift
// First request - schema inserted automatically
let first = try await session.respond(
    to: "Generate first person",
    generating: Person.self
)

// ✅ Subsequent requests - skip schema insertion
// (includeSchemaInPrompt is an argument of respond, not of GenerationOptions)
let second = try await session.respond(
    to: "Generate another person",
    generating: Person.self,
    includeSchemaInPrompt: false
)
```

**Time cost**: 2 minutes
**Latency saved**: 10-20% per request

---

### Pattern 4d: Complex Prompt (Break Down)

**Symptom**:
- Generation takes >5 seconds
- Poor quality results

**Diagnosis**:
Prompt too complex for single generation

**Fix**:
```swift
// ❌ BAD - One massive prompt
let prompt = """
Generate complete 7-day itinerary with hotels, restaurants,
activities, transportation, budget, tips, and local customs
"""
// 5-8 seconds, poor quality

// ✅ GOOD - Break into steps
let overview = try await session.respond(
    to: "Generate high-level 7-day plan for Tokyo"
)

var dayDetails: [DayPlan] = []
for day in 1...7 {
    let detail = try await session.respond(
        to: "Detail activities and restaurants for day \(day) in Tokyo",
        generating: DayPlan.self
    )
    dayDetails.append(detail.content)
}
// Total time similar, but better quality and progressive results
```

**Time cost**: 20-30 minutes
**Quality improvement**: Significantly better

---

### Pattern 5a: UI Frozen (Main Thread Blocking)

**Symptom**:
- App unresponsive during generation
- UI freezes for seconds

**Diagnosis**:
Calling `respond()` on main thread synchronously

**Fix**:
```swift
// ❌ BAD - Blocking main thread
Button("Generate") {
    let response = try await session.respond(to: prompt)
    // UI frozen for 2-5 seconds!
}

// ✅ GOOD - Async task
Button("Generate") {
    Task {
        do {
            let response = try await session.respond(to: prompt)
            // Update UI on main thread
            await MainActor.run {
                self.result = response.content
            }
        } catch {
            print("Error: \(error)")
        }
    }
}
```

**Time cost**: 5 minutes
**UX improvement**: Massive (no frozen UI)

---

## Production Crisis Scenario

### Context

**Situation**: You just launched an AI-powered feature using Foundation Models. Within 2 hours:
- 20% of users report "AI feature doesn't work"
- App Store reviews dropping: "New AI broken"
- VP of Product emailing: "What's the ETA on fix?"
- Engineering manager: "Should we roll back?"

**Pressure Signals**:
- 🚨 **Revenue impact**: Feature is key selling point for new app version
- ⏰ **Time pressure**: "Fix it NOW"
- 👔 **Executive visibility**: VP watching
- 📉 **Public reputation**: App Store reviews visible to all

### Rationalization Traps

**DO NOT** fall into these traps:

1. **"Disable the feature"**
   - Loses product differentiation
   - Admits defeat
   - Doesn't learn what went wrong

2. **"Roll back to previous version"**
   - Loses weeks of work
   - Doesn't fix root cause
   - Users still angry

3. **"It works for me"**
   - Simulator ≠ real devices
   - Your device ≠ all devices
   - Ignores real problem

4. **"Switch to ChatGPT API"**
   - Violates privacy
   - Expensive at scale
   - Doesn't address availability issue

### MANDATORY Protocol

#### Phase 1: Identify (5 minutes)

```swift
// Check error distribution
// What percentage seeing what error?

// Run this on test devices:
let availability = SystemLanguageModel.default.availability

switch availability {
case .available:
    print("✅ Available")
case .unavailable(let reason):
    print("❌ Unavailable: \(reason)")
}

// Hypothesis:
// - If 20% unavailable → Availability issue (device/region/opt-in)
// - If 20% getting errors → Code bug
// - If 20% seeing wrong results → Use case mismatch
```

**Results**: Discover that 20% of users have devices without Apple Intelligence support.

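"What percentage seeing what error?" is easiest to answer when each report carries a status label that can be tallied. The plain-Swift sketch below assumes the statuses have already been collected as strings (for example from analytics or support tickets); the `distribution(of:)` helper and the sample labels are hypothetical.

```swift
/// Computes the percentage share of each reported status label.
func distribution(of statuses: [String]) -> [String: Double] {
    guard !statuses.isEmpty else { return [:] }
    var counts: [String: Int] = [:]
    for status in statuses {
        counts[status, default: 0] += 1
    }
    return counts.mapValues { Double($0) * 100 / Double(statuses.count) }
}

// Sample data for illustration only: 8 working reports, 2 unavailable reports.
let reports = Array(repeating: "available", count: 8)
    + Array(repeating: "unavailable", count: 2)
let shares = distribution(of: reports)

// Print the largest share first so the dominant failure mode stands out.
for (status, share) in shares.sorted(by: { $0.value > $1.value }) {
    print("\(status): \(share)%")
}
```
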
---

#### Phase 2: Confirm (5 minutes)

```swift
// Check which devices affected
// iPhone 15 Pro+ = ✅ Available
// iPhone 15 = ❌ Unavailable
// iPhone 14 = ❌ Unavailable

// Conclusion: Availability issue, not code bug
```

**Root cause**: Feature assumes all users have Apple Intelligence. 20% don't.

---

#### Phase 3: Device Requirements (5 minutes)

Verify:
- Apple Intelligence requires iPhone 15 Pro or later
- Or iPad with M1+ chip
- Or Mac with Apple silicon

#### 20% of user base = older devices

---

#### Phase 4: Implement Fix (15 minutes)

```swift
// ✅ Add availability check + graceful UI
struct AIFeatureView: View {
    @State private var availability = SystemLanguageModel.default.availability

    var body: some View {
        switch availability {
        case .available:
            // Show AI feature
            AIContentView()

        case .unavailable:
            // Graceful fallback
            VStack {
                Image(systemName: "sparkles")
                    .font(.largeTitle)
                    .foregroundColor(.secondary)

                Text("AI-Powered Features")
                    .font(.headline)

                Text("Available on iPhone 15 Pro and later")
                    .font(.subheadline)
                    .foregroundColor(.secondary)
                    .multilineTextAlignment(.center)

                // Offer alternative
                Button("Use Standard Mode") {
                    // Show non-AI fallback
                }
            }
        }
    }
}
```

---

#### Phase 5: Deploy (20 minutes)

1. Test on multiple devices (15 min)
   - iPhone 15 Pro: ✅ Shows AI feature
   - iPhone 14: ✅ Shows graceful message
   - iPad Pro M1: ✅ Shows AI feature

2. Submit hotfix build (5 min)

---

### Communication Template

**To VP of Product (immediate)**:
```
Root cause identified:

The AI feature requires Apple Intelligence (iPhone 15 Pro+).
20% of our users have older devices. We didn't check availability.

Fix: Added availability check with graceful fallback UI.

Timeline:
- Hotfix ready: Now
- TestFlight: 10 minutes
- App Store submission: 30 minutes
- Review: 24-48 hours (requesting expedited)

Impact mitigation:
- 80% of users see working AI feature
- 20% see clear message + standard mode fallback
- No functionality lost, just graceful degradation
```

**To Engineering Team**:
```
Post-mortem items:
1. Add availability check to launch checklist
2. Test on non-Apple-Intelligence devices
3. Document device requirements clearly
4. Add analytics for availability status
```

### Time Saved

- **Panic path (disable/rollback)**: 2 hours of meetings + lost work
- **Proper diagnosis**: about 50 minutes (the five phases: 5 + 5 + 5 + 15 + 20) from root cause → fix → deploy

### What We Learned

1. **Always check availability** before creating session
2. **Test on real devices** across device generations
3. **Graceful degradation** better than feature removal
4. **Clear messaging** to users about requirements

---

## Quick Reference Table

| Symptom | Cause | Check | Pattern | Time |
|---------|-------|-------|---------|------|
| Won't start | `.unavailable` | `SystemLanguageModel.default.availability` | 1a | 5 min |
| Region issue | Not supported region | Check supported regions | 1b | 5 min |
| Not opted in | Apple Intelligence disabled | Settings check | 1c | 10 min |
| Context exceeded | >4096 tokens | Transcript length | 2a | 15 min |
| Guardrail error | Content policy | User input type | 2b | 10 min |
| Language error | Unsupported language | `supportedLanguages` | 2c | 10 min |
| Hallucinated output | Wrong use case | Task type check | 3a | 20 min |
| Wrong structure | No `@Generable` | Manual parsing? | 3b | 10 min |
| Missing data | No tool | External data needed? | 3c | 30 min |
| Inconsistent | Random sampling | Need deterministic? | 3d | 5 min |
| Initial delay | Model loading | First request slow? | 4a | 10 min |
| Long wait | No streaming | >1s generation? | 4b | 20 min |
| Schema overhead | Re-inserting schema | Subsequent requests? | 4c | 2 min |
| Complex prompt | Too much at once | >5s generation? | 4d | 30 min |
| UI frozen | Main thread | Thread check | 5a | 5 min |

---

## Cross-References

**Related Axiom Skills**:
- `axiom-foundation-models`: Discipline skill for anti-patterns, proper usage patterns, pressure scenarios
- `axiom-foundation-models-ref`: Complete API reference with all WWDC 2025 code examples

**Apple Resources**:
- Foundation Models Framework Documentation
- WWDC 2025-286: Meet the Foundation Models framework
- WWDC 2025-301: Deep dive into the Foundation Models framework
- Instruments Foundation Models Template

---

**Last Updated**: 2025-12-03
**Version**: 1.0.0
**Skill Type**: Diagnostic