# Summarization Feature

This document explains how the Summarization feature is structured, how streaming summarization works, and where to look when adding new providers or troubleshooting the data flow.

## High-Level Flow

1. `Summarization.SummaryView` displays the transcript, kicks off summarization on `onAppear`, and appends streamed tokens to the rendered summary (`QuickWhisper/Features/Summarization/Views/Summarization.SummaryView.swift`).
2. The view resolves the active configuration from `QuickWhisper.Settings` and the feature store `Summarization.Settings` (`QuickWhisper/Models/Settings.swift`, `QuickWhisper/Features/Summarization/Models/Summarization.Settings.swift`).
3. `Summarization.Service` looks up model metadata, builds the provider prompt with `Summarization.PromptBuilder`, validates the request with `TokenMeter`, and dispatches to the selected engine (`QuickWhisper/Features/Summarization/Services/Summarization+Service.swift`).
4. Engines (OpenAI, Anthropic, Gemini, OpenRouter) conform to `Summarization.Engine`. Each one wraps the provider's HTTP calls, streaming adapter, and usage parsing, and returns an `AsyncThrowingStream<Summarization.Event, Error>` (`QuickWhisper/Features/Summarization/Services/Summarization+Engine.swift`).
5. `SummaryView` consumes events, updating the UI for `.token` and the final `.completed` result.
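
The consuming side of step 5 can be sketched as follows. This is a minimal illustration, not the code in `SummaryView`: `SummaryEvent`, `renderSummary(from:)`, and `makeDemoStream()` are hypothetical names, and the real `Summarization.Event` payloads may carry more than plain text (usage data, for example).

```swift
// Hypothetical stand-in for Summarization.Event; the real cases may differ.
enum SummaryEvent {
    case token(String)
    case completed(String)
}

/// Consumes an event stream the way a view might: append each token as it
/// arrives, then replace the text with the final result on `.completed`.
func renderSummary(from events: AsyncThrowingStream<SummaryEvent, Error>) async throws -> String {
    var rendered = ""
    for try await event in events {
        switch event {
        case .token(let text):
            rendered += text
        case .completed(let finalText):
            rendered = finalText
        }
    }
    return rendered
}

/// A fake engine stream used only for demonstration.
func makeDemoStream() -> AsyncThrowingStream<SummaryEvent, Error> {
    AsyncThrowingStream { continuation in
        continuation.yield(.token("Hello"))
        continuation.yield(.token(", world"))
        continuation.yield(.completed("Hello, world"))
        continuation.finish()
    }
}
```

Errors thrown by the stream surface at the `for try await` loop, so a caller can present a single failure state regardless of which engine produced it.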

## Server-Sent Events (SSE) Primer

Most cloud providers stream summaries over Server-Sent Events. An SSE stream is a long-lived HTTP response whose body delivers a sequence of `data:` lines, with a blank line marking the end of each logical event. The client must:

1. Keep the HTTP connection open.
2. Read incoming bytes and split them into lines.
3. Buffer lines until a blank line appears, then treat the accumulated payload as a complete event.
4. Parse the JSON carried in the `data:` field and hand the result to the UI.

The Summarization feature centralises this in two shared utilities:

- `Summarization.HTTPClient.lines(from:)` turns a streamed `URLSession.AsyncBytes` into an async sequence of lines, preserving blank lines as block separators.
- `Summarization.SSEStreamAdapter` collects those lines into logical event blocks. Engines reuse the adapter (or a bespoke variant when a provider deviates) so we only solve framing once.

Each engine interprets the provider-specific JSON payloads and emits `Summarization.Event.token` events for incremental text and a final `.completed` once the provider signals completion or the stream ends.
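
The framing rules described in this section can be sketched in a few lines. This is an illustrative sketch only, not the actual `Summarization.SSEStreamAdapter`: `SSEFramer` and `feed(_:)` are hypothetical names, and the sketch handles only `data:` fields and comment lines while ignoring `event:`, `id:`, and `retry:`.

```swift
/// Collects raw lines into SSE event payloads.
/// Hypothetical sketch; the real adapter's API may look different.
struct SSEFramer {
    private var dataLines: [String] = []

    /// Feed one line without its trailing newline.
    /// Returns a complete payload when a blank line closes an event.
    mutating func feed(_ line: String) -> String? {
        if line.isEmpty {
            // A blank line ends the event; consecutive blank lines emit nothing.
            guard !dataLines.isEmpty else { return nil }
            let payload = dataLines.joined(separator: "\n")
            dataLines.removeAll()
            return payload
        }
        // Lines starting with ":" are comments, often used as keep-alives.
        if line.hasPrefix(":") { return nil }
        // This sketch ignores every field other than `data:`.
        guard line.hasPrefix("data:") else { return nil }
        var value = line.dropFirst("data:".count)
        // A single leading space after the colon is not part of the value.
        if value.hasPrefix(" ") { value = value.dropFirst() }
        dataLines.append(String(value))
        return nil
    }
}
```

Multiple `data:` lines inside one event are joined with newlines, which is why the line source has to preserve blank lines rather than drop them.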

## Domain Model

- `Summarization.Provider` lists supported providers and their UI metadata (`QuickWhisper/Features/Summarization/Models/Summarization.Domain.swift`).
- `Summarization.Model` identifies a provider/model pair and carries optional display names and context-window hints.
- `Summarization.Options` holds request knobs such as `maxOutputTokens`, `streamingEnabled`, and the prompt preset identifier.
- `Summarization.Config` supplies the chosen provider/model plus an optional custom base URL.
- Provider-specific errors are normalised into `Summarization.Error` so the UI and tests receive consistent failures.
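
As a rough illustration of error normalisation, the sketch below maps HTTP status codes onto one shared error type. The case names and the status mapping are assumptions made for this example; the real `Summarization.Error` may define different cases and may also inspect provider-specific error bodies.

```swift
// Hypothetical normalised error; the real Summarization.Error cases may differ.
enum SummaryError: Error, Equatable {
    case unauthorized
    case rateLimited
    case providerUnavailable(status: Int)
    case unexpected(status: Int)
}

/// Maps an HTTP status code to a normalised error, or nil for success codes.
func normalise(httpStatus: Int) -> SummaryError? {
    switch httpStatus {
    case 200..<300:
        return nil
    case 401, 403:
        return .unauthorized
    case 429:
        return .rateLimited
    case 500..<600:
        return .providerUnavailable(status: httpStatus)
    default:
        return .unexpected(status: httpStatus)
    }
}
```

With a mapping like this, UI code and tests can match on one enum instead of branching per provider.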

## Capabilities and Model Registry

- `Summarization.Capabilities` describes prompt-handling traits: whether a provider supports an instructions field, whether it needs a top-level system prompt, its streaming style, and so on (`QuickWhisper/Features/Summarization/Models/Summarization.Capabilities.swift`).
- `Summarization.ModelsRegistry` is an actor that exposes provider defaults, per-model overrides, and a cached view of dynamically fetched models (`QuickWhisper/Features/Summarization/Models/Summarization.ModelsRegistry.swift`). It relies on `Summarization.NetworkAdapters` for REST discovery (`QuickWhisper/Features/Summarization/Services/Summarization+NetworkAdapters.swift`).
- Capabilities drive prompt building. For example, Gemini embeds instructions into the user message, while Anthropic expects a separate system channel.
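
A capability-driven prompt builder might look like the sketch below. `Capabilities`, `InstructionPlacement`, `Prompt`, and `buildPrompt` are hypothetical simplifications; the real `Summarization.Capabilities` and `Summarization.PromptBuilder` likely expose more traits than this single switch.

```swift
// Hypothetical capability flag; the real type may expose different fields.
struct Capabilities {
    enum InstructionPlacement {
        case systemChannel        // instructions go in a separate system field
        case inlineInUserMessage  // instructions are prepended to the user message
    }
    let instructionPlacement: InstructionPlacement
}

struct Prompt: Equatable {
    var system: String?
    var user: String
}

/// Places the instructions where the provider's capabilities say they belong.
func buildPrompt(instructions: String, transcript: String, capabilities: Capabilities) -> Prompt {
    switch capabilities.instructionPlacement {
    case .systemChannel:
        return Prompt(system: instructions, user: transcript)
    case .inlineInUserMessage:
        return Prompt(system: nil, user: instructions + "\n\n" + transcript)
    }
}
```

Keeping the decision in a capabilities value means a new provider can often reuse the builder by declaring its traits instead of adding a new code path.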

## Key Management and User Settings

- API keys are managed via `Summarization.Keys` (`QuickWhisper/Features/Summarization/Services/KeysStorage/Summarization.Keys.swift`) and stored in the keychain. Engines read keys only when necessary to build requests.
- `Summarization.Settings` stores saved configurations, the active configuration, and the selected prompt preset.

## Streaming Data Flow

1. **View ➝ Service**: `SummaryView` requests a stream from `Summarization.Service`, passing the transcript and resolved configuration.
2. **Prompt Assembly**: `PromptBuilder` and `TokenMeter` prepare provider-specific parameters based on `Summarization.ModelsRegistry` capabilities.
3. **Engine Request**: The engine constructs the HTTP request, injects API keys from `Summarization.Keys`, and calls `Summarization.HTTPClient` for streamed bytes.
4. **Transport Layer**: `HTTPClient.lines(from:)` yields lines to the engine, which feeds them into `SSEStreamAdapter`. The adapter emits complete `data:` payloads.
5. **Engine Parsing**: Each engine decodes provider-specific payloads, updates usage counters, and determines what text is new. Incremental text triggers `Summarization.Event.token`; a finish signal emits `.completed` with accumulated text and usage.
6. **UI Update**: `SummaryView` appends incoming tokens to the on-screen summary and caches the final result. It also persists the completed summary via `Summarization.ResponseCacheStore` so repeated requests can reuse prior results.
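
Step 5 can be illustrated with a small payload parser. This sketch assumes a chunk shape loosely modelled on OpenAI-style chat completion deltas, with a `[DONE]` sentinel closing the stream; other providers use different shapes and finish signals. `SummaryEvent`, `Chunk`, and `PayloadParser` are hypothetical names, and usage accounting is left out for brevity.

```swift
import Foundation

// Hypothetical event type mirroring Summarization.Event.
enum SummaryEvent: Equatable {
    case token(String)
    case completed(text: String)
}

// Assumed chunk shape; real provider payloads carry more fields.
struct Chunk: Decodable {
    struct Choice: Decodable {
        struct Delta: Decodable { let content: String? }
        let delta: Delta
    }
    let choices: [Choice]
}

/// Accumulates text across payloads and maps each payload to at most one event.
struct PayloadParser {
    private(set) var accumulated = ""
    private(set) var finished = false

    mutating func handle(payload: String) throws -> SummaryEvent? {
        // Ignore anything that arrives after the finish signal.
        guard !finished else { return nil }
        if payload == "[DONE]" {
            finished = true
            return .completed(text: accumulated)
        }
        let chunk = try JSONDecoder().decode(Chunk.self, from: Data(payload.utf8))
        // Chunks without new text (role-only or empty deltas) produce no event.
        guard let text = chunk.choices.first?.delta.content, !text.isEmpty else { return nil }
        accumulated += text
        return .token(text)
    }
}
```

An engine would call `handle(payload:)` for each payload emitted by the SSE adapter and forward any non-nil event into its stream continuation.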

## Extending the Feature

To add a new provider or engine:

1. Define provider defaults and known models in `Summarization.ModelsRegistry`, and create a discovery adapter if the provider exposes a model list.
2. Implement an engine conforming to `Summarization.Engine`. Reuse `HTTPClient` and `SSEStreamAdapter` when possible, or write a custom adapter if the provider's protocol differs.
3. Register the engine inside `Summarization.Service` and update UI assets if the provider should appear in the picker.
4. Extend `Summarization.Provider`, `Summarization.Keys`, and settings views for configuration management.
5. Add focused tests for adapters, prompt mapping, and parsing helpers.
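
Steps 2 and 3 might take roughly the following shape. Everything here is a hypothetical sketch: the real `Summarization.Engine` protocol, its method signature, and the way `Summarization.Service` registers engines are not shown in this document, so `Engine`, `EchoEngine`, and `EngineRegistry` are illustrative names only.

```swift
// Hypothetical provider list; `example` stands in for a newly added provider.
enum Provider: Hashable { case openAI, anthropic, example }

enum SummaryEvent: Equatable {
    case token(String)
    case completed(String)
}

// Assumed protocol shape; the real requirements may differ.
protocol Engine {
    func summarize(prompt: String) -> AsyncThrowingStream<SummaryEvent, Error>
}

/// A toy engine that streams the prompt back word by word.
/// Useful as a test double because it needs no network access.
struct EchoEngine: Engine {
    func summarize(prompt: String) -> AsyncThrowingStream<SummaryEvent, Error> {
        AsyncThrowingStream { continuation in
            var text = ""
            for word in prompt.split(separator: " ") {
                let piece = (text.isEmpty ? "" : " ") + String(word)
                text += piece
                continuation.yield(.token(piece))
            }
            continuation.yield(.completed(text))
            continuation.finish()
        }
    }
}

/// Maps providers to engines so the service can dispatch by configuration.
struct EngineRegistry {
    private var engines: [Provider: any Engine] = [:]

    mutating func register(_ engine: any Engine, for provider: Provider) {
        engines[provider] = engine
    }

    func engine(for provider: Provider) -> (any Engine)? {
        engines[provider]
    }
}
```

A network-free engine like `EchoEngine` also gives the focused tests in step 5 a deterministic stream to assert against.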

## Known Gaps and Follow-Up Items

- `Summarization.Options.language` is still unused. Providers that support explicit output languages will need prompt or request updates.
- MLX streaming and summarization are unimplemented; the code path currently calls `fatalError`.
- Context-window validation logs warnings but does not yet block requests that exceed provider limits.
- The feature is still behind a debug flag in the UI.
