# Architecture & Design

This document details the internal architecture of the Afterimage library. It is intended for advanced users who want to extend the library or understand its internals.

## System Overview

Afterimage is designed as a modular pipeline for synthetic data generation. The core philosophy is **composition over inheritance**: you build a generator by composing different strategies for prompts, instructions, and storage.

### Core Components

1. **Generators (`BaseGenerator`)**: The orchestrators. They manage the main loop, concurrency, and state.
   * `AsyncConversationGenerator`: Manages multi-turn dialogs.
   * `AsyncStructuredGenerator`: Manages single-turn structured output.
2. **Instruction Generators (`BaseInstructionGeneratorCallback`)**: Strategies for "what to ask".
   * Responsible for producing the initial user instruction/question.
   * Can have internal state (e.g., to ensure coverage of a document set).
3. **Prompt Modifiers (`BaseRespondentPromptModifierCallback`)**: Strategies for "what to know".
   * Responsible for modifying the system prompt of the assistant at runtime.
   * Used for RAG (injecting context) or persona adoption.
4. **Storage (`BaseStorage`)**: The persistence layer.
   * Decoupled from generation logic.
   * Can be swapped (JSONL vs. SQL) without changing the generator.
5. **LLM Abstraction Layer (`afterimage.providers.llm_providers`)**:
   * **Uniform Interface**: The `LLMProvider` protocol normalizes interactions across models (Gemini, OpenAI, etc.).
   * **Unified Responses**: Returns standardized `LLMResponse` or `StructuredLLMResponse` objects with consistent token counts and usage metadata.
   * **Chat Abstraction**: `ChatSession` manages conversation history statefully, independent of the underlying API's specific mechanics.
   * **Factory Creation**: `LLMFactory` allows dynamic instantiation of providers via strings.

## Extension Points

Afterimage is designed to be extended.
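To make the composition-over-inheritance idea concrete, here is a toy, self-contained sketch of the pattern. The classes below are illustrative stand-ins, not the real Afterimage classes; the names `ToyGenerator`, `EchoInstructions`, `ListStorage`, and their signatures are assumptions made up for this sketch.

```python
import asyncio
from typing import Protocol


# Toy stand-ins for the real Afterimage interfaces; names and
# signatures here are illustrative only.
class InstructionStrategy(Protocol):
    async def agenerate(self, original_prompt: str) -> str: ...


class Storage(Protocol):
    async def asave(self, record: dict) -> None: ...


class EchoInstructions:
    """Trivial 'what to ask' strategy."""

    async def agenerate(self, original_prompt: str) -> str:
        return f"Question about: {original_prompt}"


class ListStorage:
    """In-memory persistence layer; swappable without touching the generator."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    async def asave(self, record: dict) -> None:
        self.records.append(record)


class ToyGenerator:
    """Orchestrator: behavior is composed from strategies, not inherited."""

    def __init__(self, instructions: InstructionStrategy, storage: Storage):
        self.instructions = instructions
        self.storage = storage

    async def arun(self, prompt: str) -> None:
        instruction = await self.instructions.agenerate(prompt)
        await self.storage.asave({"instruction": instruction})


storage = ListStorage()
generator = ToyGenerator(EchoInstructions(), storage)
asyncio.run(generator.arun("rate limits"))
print(storage.records)  # [{'instruction': 'Question about: rate limits'}]
```

Swapping `ListStorage` for a JSONL- or SQL-backed implementation changes persistence without any change to the generator, which is the point of the design.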
Here are the common patterns:

### Custom Instruction Generator

If you want to generate instructions from a custom source (e.g., a live API or a specific algorithm), subclass `BaseInstructionGeneratorCallback`.

```python
from afterimage.base import BaseInstructionGeneratorCallback
from afterimage.common import GeneratedInstructions


class MyCustomInstructionGenerator(BaseInstructionGeneratorCallback):
    async def agenerate(self, original_prompt: str) -> GeneratedInstructions:
        # Your logic here
        return GeneratedInstructions(
            instruction="Tell me a joke about API limits.",
            context="System load is high."
        )
```

### Custom Storage

To save data to a custom backend (e.g., S3, Mongo, or a specific API endpoint), implement the `BaseStorage` protocol.

```python
from afterimage.storage import BaseStorage


class MyCloudStorage(BaseStorage):
    async def asave_conversations(self, conversations):
        # Push to cloud
        pass

    async def load_conversations(self, limit=None, offset=None):
        # Fetch from cloud
        return []
```

### Custom LLM Provider

To support a new model family (e.g., Anthropic, Mistral, or a local vLLM server), implement the `LLMProvider` protocol. You must also implement a corresponding `ChatSession`.

```python
from afterimage.providers import LLMProvider, ChatSession, LLMResponse


class MyCustomChat(ChatSession):
    async def asend_message(self, message, **kwargs) -> LLMResponse:
        # Implement stateful chat logic
        pass


class MyCustomProvider(LLMProvider):
    def initialize(self, api_key: str):
        self.client = ...

    async def agenerate_content(self, prompt: str, **kwargs) -> LLMResponse:
        # Call your API
        return LLMResponse(
            text="response",
            prompt_token_count=10,
            completion_token_count=10,
            total_token_count=20,
            finish_reason="stop",
            model_name="my-model",
            raw_response={}
        )

    def start_chat(self, **kwargs) -> ChatSession:
        return MyCustomChat()
```

**Developer Tips for LLM Providers:**

* **Async Support**: Always implement both sync and async methods.
The library core relies heavily on `agenerate_content` for performance.
* **Token Counting**: Ensure you populate the token counts in `LLMResponse`. This is critical for the `GenerationMonitor` to track costs and throughput.
* **Structured Output**: For `generate_structured`, leveraging Pydantic is highly recommended. If the underlying API doesn't support JSON schema natively, use a robust parser or a library like `instructor`.
* **Error Handling**: Wrap your API calls in try/except blocks and call `SmartKeyPool.report_error(key)` when an API error occurs, so the pool can rotate keys or back off.

## Design Patterns

* **Async-First**: The library is built from the ground up on `asyncio` for high throughput.
* **Callback Pattern**: Logic is injected via callbacks rather than by subclassing the generator itself.
* **Pydantic Models**: All data exchange (config, inputs, outputs) is validated with Pydantic models for type safety.
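The async-first and callback ideas can be sketched together in a few lines of plain `asyncio`. This is a toy illustration of the pattern, not the real Afterimage API: `generate_dataset`, `SampleFn`, and `fake_llm` are names invented for this example.

```python
import asyncio
from typing import Awaitable, Callable

# A callback that produces one sample; injected rather than subclassed.
SampleFn = Callable[[int], Awaitable[str]]


async def generate_dataset(sample_fn: SampleFn, n: int, concurrency: int = 4) -> list[str]:
    """Run the injected callback for n samples with bounded concurrency."""
    semaphore = asyncio.Semaphore(concurrency)

    async def one(i: int) -> str:
        async with semaphore:
            return await sample_fn(i)

    # gather preserves input order, so results line up with indices.
    return await asyncio.gather(*(one(i) for i in range(n)))


async def fake_llm(i: int) -> str:
    await asyncio.sleep(0)  # stand-in for a network call
    return f"sample-{i}"


results = asyncio.run(generate_dataset(fake_llm, 3))
print(results)  # ['sample-0', 'sample-1', 'sample-2']
```

Because the per-sample logic arrives as a callback, the concurrency machinery is written once and reused with any strategy, which is the same trade-off the generators make with their instruction and prompt-modifier callbacks.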