# Summarization Feature

This document explains how the Summarization feature is structured, how streaming summarization works, and where to look when adding new providers or troubleshooting the data flow.

## High-Level Flow

1. `Summarization.SummaryView` restores the most recent cached summary for the transcript (if any) and otherwise waits for the user to pick a preset from the in-view prompt cloud before issuing the first run (`QuickWhisper/Features/Summarization/Views/Summarization.SummaryView.swift`). A floating `PromptsPanelView` anchored above the window chrome lets the user submit custom instructions or pick a preset before triggering additional runs.
2. The view resolves the active configuration from `QuickWhisper.Settings` and the feature store `Summarization.Settings` (`QuickWhisper/Models/Settings.swift`, `QuickWhisper/Features/Summarization/Models/Summarization.Settings.swift`).
3. `Summarization.Service` looks up model metadata, builds the provider prompt with `Summarization.PromptBuilder` (instructions + transcript combined into the user payload; system channel reserved for system-level guidance), validates the request with `TokenMeter`, and dispatches to the selected engine (`QuickWhisper/Features/Summarization/Services/Summarization+Service.swift`).
4. Engines (OpenAI, Anthropic, Gemini, OpenRouter) conform to `Summarization.Engine` and wrap provider HTTP calls, streaming adapters, and usage parsing while returning an `AsyncThrowingStream<Summarization.Event, Error>` (`QuickWhisper/Features/Summarization/Services/Summarization+Engine.swift`).
5. `SummaryView` consumes events, updating the UI for `.token` and the final `.completed` result.
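
As a rough illustration of step 5, the sketch below drains an event stream and accumulates visible text. `SummaryEvent` and `consume(_:onUpdate:)` are hypothetical stand-ins written for this example; the real `Summarization.Event` may carry different payloads (usage data, for instance).

```swift
// Hypothetical stand-in for Summarization.Event; the real cases may differ.
enum SummaryEvent: Equatable {
    case token(String)            // incremental text
    case completed(text: String)  // final accumulated text
}

/// Drains an event stream, reporting the text shown so far after each token
/// and returning the final summary.
func consume(
    _ events: AsyncThrowingStream<SummaryEvent, Error>,
    onUpdate: (String) -> Void
) async throws -> String {
    var visible = ""
    for try await event in events {
        switch event {
        case .token(let piece):
            visible += piece
            onUpdate(visible)
        case .completed(let text):
            return text
        }
    }
    // Defensive fallback: the stream ended without `.completed`.
    return visible
}
```

A view model could call this from a `Task` and assign each update to a published property. Provider errors surface as thrown errors from the `for try await` loop.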

## Server-Sent Events (SSE) Primer

Most cloud providers stream summaries over Server-Sent Events. An SSE stream is a long-lived HTTP response whose body delivers a sequence of `data:` lines separated by blank lines. Each blank line marks the end of a logical event. The client must:

1. Keep the HTTP connection open.
2. Read incoming bytes and split them into lines.
3. Buffer lines until a blank line appears, then treat the accumulated payload as a complete event.
4. Parse the JSON carried in the `data:` field and hand the result to the UI.

The Summarization feature centralises this in two shared utilities:

- `Summarization.HTTPClient.lines(from:)` turns a streamed `URLSession.AsyncBytes` into an async sequence of lines, preserving blank lines as block separators.
- `Summarization.SSEStreamAdapter` collects those lines into logical event blocks. Engines reuse the adapter (or a bespoke variant when a provider deviates) so we only solve framing once.

Each engine interprets the provider-specific JSON payloads and emits `Summarization.Event.token` events for incremental text and a final `.completed` once the provider signals completion or the stream ends.
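
The framing rules can be sketched in a few lines. `SSEFramer` below is a hypothetical type written for this document, not the actual `Summarization.SSEStreamAdapter`. It assumes the caller already has individual lines and only cares about `data:` fields.

```swift
/// Groups raw SSE lines into event payloads. A blank line closes the current
/// event; several `data:` lines in one event are joined with "\n".
struct SSEFramer {
    private var dataLines: [String] = []

    /// Feeds one line; returns a payload when the line completes an event.
    mutating func consume(line: String) -> String? {
        if line.isEmpty {
            guard !dataLines.isEmpty else { return nil }
            defer { dataLines.removeAll() }
            return dataLines.joined(separator: "\n")
        }
        if line.hasPrefix(":") { return nil }              // comment or keep-alive
        guard line.hasPrefix("data:") else { return nil }  // other fields ignored in this sketch
        var value = line.dropFirst("data:".count)
        if value.first == " " { value = value.dropFirst() } // strip one optional space
        dataLines.append(String(value))
        return nil
    }
}
```

Keeping the framer free of networking and JSON concerns makes it easy to unit-test with plain string arrays. Each engine then only has to decode the returned payloads.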

## Domain Model

- `Summarization.Provider` lists supported providers and their UI metadata (`QuickWhisper/Features/Summarization/Models/Summarization.Domain.swift`).
- `Summarization.Model` identifies a provider/model pair and carries optional display names and context-window hints.
- `Summarization.Options` holds request knobs such as `maxOutputTokens`, `streamingEnabled`, the prompt preset identifier, and an optional custom instructions override used by the panel.
- `Summarization.Config` supplies the chosen provider/model plus an optional custom base URL.
- Provider-specific errors are normalised into `Summarization.Error` so the UI and tests receive consistent failures.
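
To make the relationships concrete, here is one possible shape for these types. It is illustrative only: apart from `maxOutputTokens` and `streamingEnabled`, every type, property, and method name is an assumption, and the real definitions in `Summarization.Domain.swift` may differ.

```swift
import Foundation

// Illustrative shapes only; not the feature's actual declarations.
enum Provider: String, CaseIterable { case openAI, anthropic, gemini, openRouter }

struct Model: Hashable {
    let provider: Provider
    let id: String
    var displayName: String? = nil   // optional UI label
    var contextWindow: Int? = nil    // optional context-window hint, in tokens
}

struct Options {
    var maxOutputTokens: Int? = nil
    var streamingEnabled = true
    var presetID: String
    var customInstructions: String? = nil

    /// Non-empty custom instructions take precedence over the preset's text
    /// (an assumed reading of "override").
    func effectiveInstructions(presetText: String) -> String {
        guard let custom = customInstructions?
                .trimmingCharacters(in: .whitespacesAndNewlines),
              !custom.isEmpty else { return presetText }
        return custom
    }
}

struct Config {
    let model: Model
    var baseURL: URL? = nil   // nil means the provider's default endpoint
}
```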

## Prompt Selection & Caching

- When no cached summary exists for a transcript, `SummaryView` surfaces `Summarization.PromptCloudView`, a lightweight chip grid that lists all built-in and custom prompts so the user can decide which instructions to run before the first request.
- Completed summaries are cached per transcript ID (`Summarization.Storage`, backed by `Summarization.ResponseCache`). Only the most recent summary for each transcript is retained; new runs overwrite the existing entry to keep state predictable.
- Prompt presets (built-in + custom) are persisted in SwiftData via `Summarization.PromptsRegistry`/`Summarization.Storage`. Built-ins are seeded once into the same store (guarded by a UserDefaults flag) so users can edit or delete them; title collisions during seeding are skipped. A reset button in Prompt Settings can upsert the canonical built-ins back into storage (restoring deleted ones) without affecting custom prompts.
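
The two storage rules above (latest-only caching and collision-skipping seeding) can be sketched with in-memory stand-ins. The real implementation uses SwiftData and a UserDefaults flag; `LatestSummaryCache`, `Prompt`, and `seed(builtIns:into:alreadySeeded:)` are hypothetical names used only for this example.

```swift
/// In-memory stand-in for a per-transcript cache that keeps one entry per transcript.
struct LatestSummaryCache {
    private var entries: [String: String] = [:]

    mutating func store(_ summary: String, for transcriptID: String) {
        entries[transcriptID] = summary   // a new run overwrites the previous one
    }

    func summary(for transcriptID: String) -> String? { entries[transcriptID] }
}

struct Prompt: Equatable {
    var title: String
    var instructions: String
}

/// One-time seeding: appends built-ins whose titles are not already present.
/// `alreadySeeded` plays the role of the persisted flag.
func seed(builtIns: [Prompt], into stored: inout [Prompt], alreadySeeded: inout Bool) {
    guard !alreadySeeded else { return }
    var titles = Set(stored.map(\.title))
    for prompt in builtIns {
        if titles.insert(prompt.title).inserted {
            stored.append(prompt)          // title collisions are skipped
        }
    }
    alreadySeeded = true
}
```

Because seeding runs only once, a built-in that the user later deletes stays deleted until the reset action restores it.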

## Capabilities and Model Registry

- `Summarization.Capabilities` describes prompt-handling traits: whether a provider supports instructions fields, whether it needs a top-level system prompt, its streaming style, and so on (`QuickWhisper/Features/Summarization/Models/Summarization.Capabilities.swift`).
- `Summarization.ModelsRegistry` is an actor that exposes provider defaults, per-model overrides, and a cached view of dynamically fetched models (`QuickWhisper/Features/Summarization/Models/Summarization.ModelsRegistry.swift`). It relies on `Summarization.NetworkAdapters` for REST discovery (`QuickWhisper/Features/Summarization/Services/Summarization+NetworkAdapters.swift`).
- Capabilities drive prompt building. Summarization instructions are always embedded in the user payload; the system channel is reserved for system-level guidance and is sent only when the provider supports it (e.g., Anthropic's `system` field or OpenAI's `instructions` field).
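
A minimal sketch of that rule follows. The `<USER_INSTRUCTIONS>` and `<TRANSCRIPT>` markers come from the prompt builder described in this document; the closing tags, the `PromptCapabilities` flag, and the `buildPrompt` signature are assumptions made for illustration.

```swift
// Hypothetical capability flag; the real Summarization.Capabilities has more traits.
struct PromptCapabilities {
    var supportsSystemChannel: Bool
}

struct BuiltPrompt: Equatable {
    var system: String?   // system-level guidance, when the provider accepts it
    var user: String      // instructions + transcript
}

/// Instructions and transcript always travel in the user payload; the system
/// channel is filled only when the provider supports one.
func buildPrompt(
    instructions: String,
    transcript: String,
    systemGuidance: String?,
    capabilities: PromptCapabilities
) -> BuiltPrompt {
    let user = """
    <USER_INSTRUCTIONS>
    \(instructions)
    </USER_INSTRUCTIONS>
    <TRANSCRIPT>
    \(transcript)
    </TRANSCRIPT>
    """
    let system = capabilities.supportsSystemChannel ? systemGuidance : nil
    return BuiltPrompt(system: system, user: user)
}
```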

## Key Management and User Settings

- API keys are managed via `Summarization.Keys` (`QuickWhisper/Features/Summarization/Services/KeysStorage/Summarization.Keys.swift`) and stored in the keychain. Engines read keys only when necessary to build requests.
- `Summarization.Settings` stores saved configurations, the active configuration, and the selected prompt preset.

## Streaming Data Flow

1. **View ➝ Service**: `SummaryView` requests a stream from `Summarization.Service`, passing the transcript and resolved configuration.
2. **Prompt Assembly**: `PromptBuilder` combines `<USER_INSTRUCTIONS>` and `<TRANSCRIPT>` into the user payload (system prompt reserved for system-level guidance); `TokenMeter` prepares provider-specific parameters based on `Summarization.ModelsRegistry` capabilities.
3. **Engine Request**: The engine constructs the HTTP request, injects API keys from `Summarization.Keys`, and calls `Summarization.HTTPClient` for streamed bytes.
4. **Transport Layer**: `HTTPClient.lines(from:)` yields lines to the engine, which feeds them into `SSEStreamAdapter`. The adapter emits complete `data:` payloads.
5. **Engine Parsing**: Each engine decodes provider-specific payloads, updates usage counters, and determines what text is new. Incremental text triggers `Summarization.Event.token`; a finish signal emits `.completed` with accumulated text and usage.
6. **UI Update**: `SummaryView` appends incoming tokens to the on-screen summary and caches the completed result per transcript via `Summarization.Storage`, keeping only the latest summary so reopening the job immediately shows the prior output.
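
Step 5 can be sketched as a small function that turns decoded chunks into events. To keep the example self-contained it takes an array instead of a live network stream, and `DecodedChunk`, `StreamEvent`, and `events(from:)` are hypothetical names; real engines decode provider-specific JSON first.

```swift
enum StreamEvent: Equatable {
    case token(String)
    case completed(text: String, outputTokens: Int?)
}

/// Provider-neutral chunk, as an engine might produce after decoding one `data:` payload.
struct DecodedChunk {
    var delta: String = ""        // new text carried by this chunk
    var outputTokens: Int? = nil  // usage counter, if the provider reported one
    var isFinished = false        // provider's finish signal
}

/// Emits `.token` for each non-empty delta, then a single `.completed` with the
/// accumulated text, either at the finish signal or when the input runs out.
func events(from chunks: [DecodedChunk]) -> AsyncThrowingStream<StreamEvent, Error> {
    AsyncThrowingStream { continuation in
        var text = ""
        var usage: Int? = nil
        for chunk in chunks {
            if !chunk.delta.isEmpty {
                text += chunk.delta
                continuation.yield(.token(chunk.delta))
            }
            usage = chunk.outputTokens ?? usage   // keep the latest reported value
            if chunk.isFinished { break }
        }
        continuation.yield(.completed(text: text, outputTokens: usage))
        continuation.finish()
    }
}
```

Emitting exactly one `.completed` per stream gives the view a single place to cache the result.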

## Markdown ➝ HTML Rendering

- Streaming output is fed into `Summarization.MarkdownStreamRenderer`, an `@MainActor` helper that buffers tokens, debounces updates, and bridges native events into the embedded web view (`SummaryWebView`).
- The renderer loads a single `WKWebView` instance backed by the bundled `summary.html` template (see `SummaryHTML.htmlURL`), which references the packaged `markdown-it.min.js` for Markdown parsing. Both live under `Features/Summarization/WebResources/` and must be copied into the app bundle.
- Markdown text is base64-encoded before crossing into JavaScript; once decoded, the template converts it to HTML and preserves scroll position when the user is near the bottom of the view.
- Visual styling relies on CSS custom properties with a `prefers-color-scheme: dark` override, so the web content automatically matches the host system theme without additional Swift plumbing.
- SwiftUI still owns the loading state: a skeleton placeholder appears until the first token arrives, after which the web view takes over and continues rendering incremental updates.
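
The base64 hand-off can be illustrated as follows. This is a sketch under stated assumptions: `renderMarkdownBase64` is a hypothetical JavaScript function name, and the actual bridge in `summary.html` may be named and shaped differently.

```swift
import Foundation

/// Builds a JavaScript call that passes Markdown to the page as base64, so
/// quotes, backslashes, and newlines never need escaping in the script source.
/// `renderMarkdownBase64` is an assumed function in the HTML template.
func renderScript(forMarkdown markdown: String) -> String {
    let encoded = Data(markdown.utf8).base64EncodedString()
    // The standard base64 alphabet contains no quote characters,
    // so the value is safe inside a single-quoted JavaScript string.
    return "renderMarkdownBase64('\(encoded)');"
}
```

The resulting string could be passed to `WKWebView.evaluateJavaScript(_:completionHandler:)`. On the JavaScript side, note that `atob` yields a byte string, so non-ASCII text needs a UTF-8 decode step (for example via `TextDecoder`) before it reaches the Markdown parser.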

## Extending the Feature

To add a new provider or engine:

1. Define provider defaults and known models in `Summarization.ModelsRegistry`, and create a discovery adapter if the provider exposes a model list.
2. Implement an engine conforming to `Summarization.Engine`. Reuse `HTTPClient` and `SSEStreamAdapter` when possible, or write a custom adapter if the protocol differs.
3. Register the engine inside `Summarization.Service` and update UI assets if the provider should appear in the picker.
4. Extend `Summarization.Provider`, `Summarization.Keys`, and settings views for configuration management.
5. Add focused tests for adapters, prompt mapping, and parsing helpers.
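
Steps 2 and 3 might look roughly like the skeleton below. Everything here is hypothetical: `SummaryEngine` stands in for `Summarization.Engine` (whose real requirements are not shown in this document), `acme` is a made-up provider, and `EngineRegistry` only illustrates the idea of registering engines by provider.

```swift
enum ProviderID: Hashable { case openAI, anthropic, acme }  // `acme` is a made-up provider

enum EngineEvent: Equatable {
    case token(String)
    case completed(String)
}

// Stand-in for Summarization.Engine; the real protocol may take more parameters.
protocol SummaryEngine {
    func summarize(prompt: String) -> AsyncThrowingStream<EngineEvent, Error>
}

/// Placeholder engine. A real one would build the HTTP request, read the
/// streamed lines, and translate provider payloads into events.
struct AcmeEngine: SummaryEngine {
    func summarize(prompt: String) -> AsyncThrowingStream<EngineEvent, Error> {
        AsyncThrowingStream { continuation in
            continuation.yield(.token("stub"))
            continuation.yield(.completed("stub"))
            continuation.finish()
        }
    }
}

/// Maps providers to engines so a service can dispatch by configuration.
struct EngineRegistry {
    private var engines: [ProviderID: any SummaryEngine] = [:]

    mutating func register(_ engine: any SummaryEngine, for provider: ProviderID) {
        engines[provider] = engine
    }

    func engine(for provider: ProviderID) -> (any SummaryEngine)? {
        engines[provider]
    }
}
```

A stub engine like this is also a convenient fixture for the focused tests in step 5, since it exercises the service and view without network access.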

## Known Gaps and Follow-Up Items

- `Summarization.Options.language` is still unused. Providers that support explicit output languages will need prompt or request updates.
- MLX streaming and summarization are unimplemented; the code path currently calls `fatalError`.
- Context-window validation logs warnings but does not yet block requests that exceed provider limits.
- The feature is still behind a debug flag in the UI.
