Providers

Document Providers

class afterimage.providers.document_providers.DirectoryDocumentProvider(directory: str | Path, file_patterns: List[str] | None = None, encoding: str = 'utf-8', recursive: bool = True, min_length: int = 1, cache: bool = True, target_context_usage_count: int | None = None)[source]

Bases: DocumentProvider

Search a directory for several filename patterns (txt/md/jsonl etc).

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

class afterimage.providers.document_providers.DocumentProvider(*args, **kwargs)[source]

Bases: Protocol

Unified DocumentProvider protocol.

Minimal required method for implementations:

_load_documents() -> list[Document]

Public helpers (provided by protocol defaults below):

get_documents(n: int) -> list[Document]
get_all() -> list[Document]
sample(n: int) -> list[Document]
report_doc_usage(document_id: str) -> int
set_target_context_usage_count(target_context_usage_count: int | None)
mark_fully_covered(document_id: str)
clear_cache()
__len__(), __iter__(), __getitem__(i)

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

get_all() → list[Document][source]: Return all documents (loads once if implementation caches).

get_documents(n: int) → list[Document][source]: Return up to n random documents. If n is math.inf, return all documents.

get_target_context_usage_count() → int | None[source]: Return the current target usage count, if configured.

mark_fully_covered(document_id: str) → None[source]: Exclude a document from future weighted sampling.

report_doc_usage(document_id: str) → int[source]: Record usage for a document and refresh sampling weights.

sample(n: int) → list[Document][source]: Alias for get_documents.

set_target_context_usage_count(target_context_usage_count: int | None) → None[source]: Update the target usage count used by weight calculation.

class afterimage.providers.document_providers.FileSystemDocumentProvider(path_pattern: str, encoding: str = 'utf-8', recursive: bool = False, min_length: int = 1, cache: bool = True, target_context_usage_count: int | None = None)[source]

Bases: DocumentProvider

Load text files matched by a glob pattern.

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

class afterimage.providers.document_providers.InMemoryDocumentProvider(texts: list[str | Document], target_context_usage_count: int | None = None)[source]

Bases: DocumentProvider

Simple provider backed by a list of strings.

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

class afterimage.providers.document_providers.JSONLDocumentProvider(path_pattern: str, content_key: str = 'text', encoding: str = 'utf-8', recursive: bool = False, cache: bool = True, max_docs: int | None = None, target_context_usage_count: int | None = None, preserve_ids: bool = False, include_metadata: bool = False)[source]

Bases: DocumentProvider

Load text fields from one or more JSONL files.

content_key selects which key from each JSON object to use.

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

class afterimage.providers.document_providers.QdrantDocumentProvider(client: QdrantClient, collection_name: str, content_key: str = 'text', batch_size: int = 500, scroll_filter: Filter | None = None, with_payload_keys: List[str] | None = None, cache: bool = True, max_docs: int | None = None, target_context_usage_count: int | None = None)[source]

Bases: DocumentProvider

Load text payloads from a Qdrant collection via scroll.

Note: requires qdrant-client package.

clear_cache() → None[source]: Optional: implementations can override to clear internal caches.

LLM Providers

class afterimage.providers.llm_providers.AsyncGeminiChatSession(chat, client: Client, model_name: str, max_retries: int = 3, retry_initial_delay: float = 2.0, retry_max_delay: float = 30.0)[source]

Bases: ChatSession

Asynchronous Gemini chat session implementation.

async aclose() → None[source]: Release any async resources held by the chat session.

async asend_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session asynchronously.

class afterimage.providers.llm_providers.AsyncOpenAIChatSession(client: AsyncOpenAI, model_name: str, system_instruction: str | None = None, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs)[source]

Bases: ChatSession

Asynchronous OpenAI chat session implementation.

async asend_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session asynchronously.

class afterimage.providers.llm_providers.ChatSession[source]

Bases: object

Abstract chat session interface.

async aclose() → None[source]: Release any async resources held by the chat session.

async asend_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session asynchronously.

close() → None[source]: Release any resources held by the chat session.

send_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session.

class afterimage.providers.llm_providers.CommonLLMResponse(text: str, prompt_token_count: int, completion_token_count: int, total_token_count: int, finish_reason: str, model_name: str, raw_response: Any)[source]

Bases: object

Standardized LLM response.

completion_token_count: int

finish_reason: str

model_name: str

prompt_token_count: int

raw_response: Any

text: str

total_token_count: int

class afterimage.providers.llm_providers.DeepSeekProvider(api_key: str | SmartKeyPool, model_name: str = 'deepseek-chat', system_instruction: str | None = None, **kwargs)[source]

Bases: OpenAIProvider

DeepSeek implementation using OpenAI-compatible API.

BASE_URL = 'https://api.deepseek.com'

async agenerate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema asynchronously.

generate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema.

class afterimage.providers.llm_providers.GeminiChatSession(chat, client: Client, model_name: str, max_retries: int = 3, retry_initial_delay: float = 2.0, retry_max_delay: float = 30.0)[source]

Bases: ChatSession

Gemini chat session implementation.

close() → None[source]: Release any resources held by the chat session.

send_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session.

class afterimage.providers.llm_providers.GeminiProvider(api_key: str | SmartKeyPool, model_name: str = 'gemini-2.0-flash', system_instruction: str | None = None, safety_settings: List[Dict[str, str]] | None = None, max_retries: int = 3, retry_initial_delay: float = 2.0, retry_max_delay: float = 30.0, **kwargs)[source]

Bases: LLMProvider

Google Gemini implementation.

async agenerate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt asynchronously.

async agenerate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema asynchronously.

async astart_chat(temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → ChatSession[source]: Start a new chat session asynchronously.

generate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt.

generate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema.

start_chat(temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → ChatSession[source]: Start a new chat session.

class afterimage.providers.llm_providers.LLMFactory[source]

Bases: object

Factory for creating LLM providers.

static create(*, provider: Literal['gemini', 'openai', 'deepseek', 'local', 'openrouter'], model_name: str | None = None, api_key: str | SmartKeyPool | None = None, system_instruction: str | None = None, **kwargs: Any) → LLMProvider[source]

class afterimage.providers.llm_providers.LLMProvider(*args, **kwargs)[source]

Bases: Protocol

Protocol for LLM providers.

async agenerate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt asynchronously.

async agenerate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema asynchronously.

async astart_chat(**kwargs) → ChatSession[source]: Start a new chat session asynchronously.

generate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt.

generate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema.

start_chat(**kwargs) → ChatSession[source]: Start a new chat session.

class afterimage.providers.llm_providers.LLMResponse(text: str, prompt_token_count: int, completion_token_count: int, total_token_count: int, finish_reason: str, model_name: str, raw_response: Any, reasoning_content: str | None = None)[source]

Bases: CommonLLMResponse

reasoning_content: str | None = None

class afterimage.providers.llm_providers.OpenAIChatSession(client: OpenAI, model_name: str, system_instruction: str | None = None, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs)[source]

Bases: ChatSession

OpenAI chat session implementation.

send_message(message: str | ConversationEntry, temperature: float = 0.7, **kwargs) → LLMResponse[source]: Send a message to the chat session.

class afterimage.providers.llm_providers.OpenAIProvider(api_key: str | SmartKeyPool, model_name: str = 'gpt-4o', base_url: str | None = None, system_instruction: str | None = None, **kwargs)[source]

Bases: LLMProvider

OpenAI-compatible API implementation.

async agenerate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt asynchronously.

async agenerate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema asynchronously.

async astart_chat(temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → ChatSession[source]: Start a new chat session asynchronously.

generate_content(prompt: str, temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → LLMResponse[source]: Generate completion from prompt.

generate_structured(prompt: str, schema: Type[T], temperature: float = 0.7, **kwargs) → StructuredLLMResponse[T][source]: Generate structured output that matches the given schema.

start_chat(temperature: float = 0.7, max_tokens: int | None = None, stop_sequences: List[str] | None = None, **kwargs) → ChatSession[source]: Start a new chat session.

class afterimage.providers.llm_providers.OpenRouterProvider(api_key: str | SmartKeyPool, model_name: str = 'openai/gpt-4o-mini', system_instruction: str | None = None, **kwargs)[source]

Bases: OpenAIProvider

OpenRouter (OpenAI-compatible chat and structured output where the upstream model supports it).

BASE_URL = 'https://openrouter.ai/api/v1'

class afterimage.providers.llm_providers.StructuredLLMResponse(text: str, prompt_token_count: int, completion_token_count: int, total_token_count: int, finish_reason: str, model_name: str, raw_response: Any, parsed: T, reasoning_content: str | None = None)[source]

Bases: CommonLLMResponse, Generic[T]

Standardized LLM response with structured output.

parsed: T

reasoning_content: str | None = None

Embedding providers

Async text embeddings (OpenAI-compatible APIs, Gemini, and local SentenceTransformer via a process pool). Public types are re-exported on the afterimage package.

class afterimage.EmbeddingProvider(*args, **kwargs)[source]

Bases: Protocol

Protocol for async text embedding backends.

Implementations return one dense vector per input string, preserve order, and may batch requests internally. Call aclose() when the provider is no longer needed (required for process-based providers).

async aclose() → None[source]

Release resources held by this provider (pools, clients).

Implementations should make this idempotent (safe to call multiple times). API-only providers may use a no-op.

async embed(texts: list[str]) → list[list[float]][source]

Embed each string into a floating-point vector.

Parameters:: texts – Input strings. Empty list returns [].
Returns:: Embeddings in the same order as texts; each embedding is a list of floats (dimension is model-specific).

Note

Callers must not assume a single HTTP or IPC round trip; large inputs may be split into batches by the implementation.

class afterimage.OpenAIEmbeddingProvider(api_key: str | SmartKeyPool, model: str = 'text-embedding-3-small', *, base_url: str | None = None, max_batch_size: int = 128, extra_create_kwargs: dict[str, Any] | None = None)[source]

Bases: _NoOpAcloseMixin

Embeddings via the OpenAI async client (embeddings.create).

Supports OpenAI and OpenAI-compatible servers via base_url. Uses SmartKeyPool for key rotation and error reporting consistent with chat providers.

async embed(texts: list[str]) → list[list[float]][source]

Compute embeddings for texts using the configured model.

Parameters:: texts – Non-empty list of strings to embed, or empty for no work.
Returns:: One embedding per input string, in order.
Raises:: Exception – Propagates API errors after reporting the key to the pool.

class afterimage.GeminiEmbeddingProvider(api_key: str | SmartKeyPool, model: str = 'text-embedding-004', *, max_batch_size: int = 128)[source]

Bases: _NoOpAcloseMixin

Embeddings via Google Gemini client.aio.models.embed_content.

Uses the async Gemini client, closes transient HTTP resources after each embed() call, and integrates with SmartKeyPool.

async embed(texts: list[str]) → list[list[float]][source]

Compute embeddings for texts using the configured Gemini model.

Parameters:

texts – Non-empty list of strings to embed, or empty for no work.

Returns:

One embedding per input string, in order.

Raises:

ValueError – If the API returns an embedding without values.
Exception – Propagates API errors after reporting the key to the pool.

class afterimage.ProcessEmbeddingProvider(model_name: str, *, max_workers: int = 2, max_batch_size: int = 64)[source]

Bases: object

Local embeddings using SentenceTransformer in a process pool.

Inference runs in child processes so the host asyncio loop is not blocked. The model is loaded once per worker via the pool initializer. Call aclose() to shut down workers when finished.

async aclose() → None[source]

Shut down the process pool and mark this provider closed.

Waits for workers to finish. Idempotent after the first call.

async embed(texts: list[str]) → list[list[float]][source]

Encode texts in worker processes via asyncio.loop.run_in_executor().

Parameters:

texts – Non-empty list of strings to embed, or empty for no work.

Returns:

One embedding per input string, in order.

Raises:

RuntimeError – If the provider was closed before or during use.
ImportError – In workers if sentence-transformers is missing.

class afterimage.EmbeddingProviderFactory[source]

Bases: object

Factory for constructing EmbeddingProvider instances from config.

Config dictionaries are intended to be JSON-serializable (aside from embedding-specific nested structures). Key names are matched case-insensitively.

static create(config: dict[str, Any], *, api_key: str | None = None, key_pool: SmartKeyPool | None = None) → EmbeddingProvider[source]

Build a provider from a configuration mapping.

Parameters:

config –
Must include type ("openai", "gemini", or "process"). Optional keys:
- model — embedding model id for API providers.
- model_path — HuggingFace id or path for process (also model is accepted for process).
- base_url — OpenAI-compatible base URL.
- workers — process count for process (default 2).
- max_batch_size — chunk size for batched calls.
- api_key — inline secret when not using key_pool.
api_key – Optional default API key when key_pool is omitted (OpenAI/Gemini only).
key_pool – Optional shared pool; takes precedence over api_key and env vars for API providers.

Returns:

A concrete EmbeddingProvider.

Raises:

ValueError – If type is missing, unknown, or required keys are absent.