Monitoring

class afterimage.monitoring.Alert(name: str, message: str, level: str, timestamp: datetime, data: dict[str, Any])[source]

Bases: object

Represents a monitoring alert.

data: dict[str, Any]
level: str
message: str
name: str
timestamp: datetime
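
Alert handlers passed to GenerationMonitor (via alert_handlers) are plain callables taking an Alert. A minimal sketch, using a stand-in dataclass that mirrors the Alert fields above so it runs without afterimage installed:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

@dataclass
class Alert:
    """Stand-in mirroring afterimage.monitoring.Alert's fields."""
    name: str
    message: str
    level: str
    timestamp: datetime
    data: dict[str, Any] = field(default_factory=dict)

def format_alert(alert: Alert) -> str:
    """A minimal alert handler body: render the alert as one line."""
    return f"[{alert.level}] {alert.name}: {alert.message}"

alert = Alert("low_success_rate", "success rate below threshold",
              "warning", datetime.now(), {"success_rate": 0.85})
print(format_alert(alert))
```

The alert name, message, and data values here are illustrative, not emitted by the library.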
class afterimage.monitoring.FileLogHandler(log_dir: Path)[source]

Bases: LogHandler

Default file-based log handler.

handle_log(message: dict[str, Any]) None[source]

Handle a log message.

class afterimage.monitoring.FileMetricHandler(log_dir: Path)[source]

Bases: MetricHandler

Default file-based metric handler.

handle_metric(metric_name: str, value: float, metadata: dict[str, Any]) None[source]

Handle a metric event.

class afterimage.monitoring.GenerationMonitor(log_dir: str | Path | None = None, metric_handlers: list[MetricHandler] | None = None, log_handlers: list[LogHandler] | None = None, alert_handlers: list[Callable[[Alert], None]] | None = None, metrics_interval: int = 60, shutdown_timeout: int = 5, *, alert_min_success_rate: float | None = None, alert_max_generation_time_seconds: float | None = None, alert_max_error_rate: float | None = None, alert_max_prompt_token_mean: float | None = None, alert_max_completion_token_mean: float | None = None, alert_max_total_token_mean: float | None = None, alert_max_conversation_length_mean: float | None = None, token_usage_callback: Callable[[TokenUsageReport], None] | None = None, token_usage_callback_interval_seconds: float = 60.0)[source]

Bases: object

Monitors and tracks conversation generation metrics.

export_metrics(output_path: str | Path, format: str = 'json', window: timedelta | None = None) None[source]

Export metrics data to various formats.

Parameters:
  • output_path – Path to save the exported data

  • format – Export format ('json', 'csv', 'excel', 'parquet')

  • window – Optional time window to filter metrics

get_metrics(metric_name: str, window: timedelta = datetime.timedelta(seconds=300)) dict[str, float][source]

Get aggregated metrics for a time window.

Parameters:
  • metric_name – Name of metric to retrieve

  • window – Time window for aggregation

Returns:

Dict containing metric aggregates
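
The exact aggregate keys are not documented here; the following sketch shows one plausible windowed aggregation (mean/min/max/count are assumptions, and may differ from the keys the library actually returns):

```python
from datetime import datetime, timedelta

def aggregate(events: list[tuple[datetime, float]],
              window: timedelta,
              now: datetime) -> dict[str, float]:
    """Aggregate metric values recorded within [now - window, now]."""
    values = [v for ts, v in events if ts >= now - window]
    if not values:
        return {}
    return {
        "count": float(len(values)),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

now = datetime(2024, 1, 1, 12, 0)
events = [
    (now - timedelta(seconds=400), 1.0),  # outside the 300 s window
    (now - timedelta(seconds=120), 2.0),
    (now - timedelta(seconds=60), 4.0),
]
print(aggregate(events, timedelta(seconds=300), now))
# → {'count': 2.0, 'mean': 3.0, 'min': 2.0, 'max': 4.0}
```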

get_total_token_usage(window: timedelta | None = None) TokenUsageReport[source]

Get total token usage summed across all events, optionally restricted to a time window. Results are grouped by model name (one entry per model) to support cost calculation.

Parameters:

window – If set, only include events with timestamp >= (now - window). If None, include all events.

Returns:

TokenUsageReport with by_model (one entry per model) and total_* fields.
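
The grouping and window filter described above can be sketched as follows. The event dicts and their field names are illustrative stand-ins, not the library's internal schema:

```python
from datetime import datetime, timedelta

def sum_token_usage(events, window=None, now=None):
    """Sum prompt/completion tokens per model, optionally within a window."""
    now = now or datetime.now()
    by_model: dict[str, dict[str, int]] = {}
    for e in events:
        # Keep only events with timestamp >= (now - window), as documented.
        if window is not None and e["timestamp"] < now - window:
            continue
        m = by_model.setdefault(e["model_name"],
                                {"prompt_tokens": 0, "completion_tokens": 0})
        m["prompt_tokens"] += e["prompt_token_count"]
        m["completion_tokens"] += e["completion_token_count"]
    return by_model

now = datetime(2024, 1, 1)
events = [
    {"timestamp": now, "model_name": "model-a",
     "prompt_token_count": 10, "completion_token_count": 5},
    {"timestamp": now, "model_name": "model-a",
     "prompt_token_count": 20, "completion_token_count": 7},
]
print(sum_token_usage(events))
# → {'model-a': {'prompt_tokens': 30, 'completion_tokens': 12}}
```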

log_error(message: str, error: Exception | None = None, **data)[source]

Log an error message.

log_info(message: str, **data)[source]

Log an info message.

log_warning(message: str, **data)[source]

Log a warning message.

plot_metric(metric_name: str, window: timedelta = datetime.timedelta(seconds=3600), rolling_window: int = 10, figsize: tuple = (12, 8)) Figure[source]

Plot a specific metric over time.

Parameters:
  • metric_name – Name of metric to plot

  • window – Time window for visualization

  • rolling_window – Window size for rolling average

  • figsize – Figure size for plot

Returns:

matplotlib figure

record_metric(metric_name: str, value: float, metadata: dict[str, Any] | None = None)[source]

Record a metric via the internal queue.

save_metrics() Path[source]

Save current metrics to disk.

Returns:

Path to saved metrics file

shutdown()[source]

Gracefully shut down monitoring.

track_evaluation(duration: float, success: bool, evaluator_type: str, scores: dict[str, float], **kwargs) None[source]

Track evaluation metrics.

Parameters:
  • duration – Time taken for evaluation

  • success – Whether evaluation completed successfully

  • evaluator_type – Type of evaluator (e.g., 'coherence', 'factuality')

  • scores – Dictionary of evaluation scores

  • **kwargs – Additional metadata

track_generation(duration: float, success: bool, **kwargs)[source]

Track generation metrics via the internal queue.

Parameters:
  • duration – Time taken for generation.

  • success – Whether generation completed successfully.

  • **kwargs – Additional metrics or metadata for logging. Common metrics include:
    - prompt_token_count: Number of tokens in the prompt (input tokens).
    - completion_token_count: Number of tokens in the completion (output tokens).
    - total_token_count: Total number of tokens used (input + output tokens).
    - model_name: Name of the model used.
    - finish_reason: Reason for generation completion.
    - error: Error message if generation failed.
    - turns: Number of turns in the conversation.
    When passed as kwargs, these are automatically converted to individual metrics and logged with a timestamp.
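
The kwargs-to-metrics conversion can be sketched as below. The derived metric names ("generation_time", "generation_success") and the numeric-only filter are assumptions for illustration, not the library's exact logic; timestamping and queueing are omitted:

```python
from typing import Any

def expand_generation_kwargs(duration: float, success: bool,
                             **kwargs: Any) -> dict[str, float]:
    """Turn track_generation arguments into individual numeric metrics."""
    metrics = {"generation_time": duration,
               "generation_success": float(success)}
    for key, value in kwargs.items():
        # Only numeric kwargs become metrics; strings like model_name
        # would instead travel as metadata.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            metrics[key] = float(value)
    return metrics

print(expand_generation_kwargs(1.5, True, prompt_token_count=120,
                               completion_token_count=40, model_name="model-a"))
```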

visualize_metrics(save_dir: str | Path | None = None, figsize: tuple = (12, 6), return_figures: bool = False) dict[str, Figure] | None[source]

Generate visualizations for metrics.

Parameters:
  • save_dir – Optional directory to save plots.

  • figsize – Figure size for plots.

  • return_figures – Whether to return the figures.

Returns:

Dict of matplotlib figures if return_figures is True, otherwise None.

class afterimage.monitoring.LogHandler(*args, **kwargs)[source]

Bases: Protocol

Protocol for custom log handling.

handle_log(message: dict[str, Any]) None[source]

Handle a log message.
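
Because LogHandler is a Protocol, any object with a matching handle_log method conforms structurally; no subclassing is required. A minimal in-memory handler sketch:

```python
from typing import Any

class InMemoryLogHandler:
    """Collects log messages in a list; conforms to LogHandler structurally."""

    def __init__(self) -> None:
        self.messages: list[dict[str, Any]] = []

    def handle_log(self, message: dict[str, Any]) -> None:
        self.messages.append(message)

handler = InMemoryLogHandler()
handler.handle_log({"level": "info", "message": "generation started"})
```

Such a handler could then be passed to GenerationMonitor via the log_handlers argument.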

class afterimage.monitoring.MetricHandler(*args, **kwargs)[source]

Bases: Protocol

Protocol for custom metric handling.

handle_metric(metric_name: str, value: float, metadata: dict[str, Any]) None[source]

Handle a metric event.
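
As with LogHandler, any object exposing a matching handle_metric method satisfies this Protocol. A sketch of a handler that records only values above a threshold (the metric name used is illustrative):

```python
from typing import Any

class ThresholdMetricHandler:
    """Records metric values exceeding a threshold; conforms to MetricHandler."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.exceeded: list[tuple[str, float]] = []

    def handle_metric(self, metric_name: str, value: float,
                      metadata: dict[str, Any]) -> None:
        if value > self.threshold:
            self.exceeded.append((metric_name, value))

handler = ThresholdMetricHandler(threshold=2.0)
handler.handle_metric("generation_time", 3.5, {})
handler.handle_metric("generation_time", 1.2, {})
```

Such a handler could then be passed to GenerationMonitor via the metric_handlers argument.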

class afterimage.monitoring.ModelTokenUsage(model_name: str, prompt_tokens: int, completion_tokens: int, total_tokens: int)[source]

Bases: object

Token usage for a single model.

completion_tokens: int
model_name: str
prompt_tokens: int
total_tokens: int
class afterimage.monitoring.TokenUsageReport(by_model: list[ModelTokenUsage], total_prompt_tokens: int = 0, total_completion_tokens: int = 0, total_tokens: int = 0)[source]

Bases: object

Token usage report: per-model breakdown plus totals for cost calculation.

by_model: list[ModelTokenUsage]
total_completion_tokens: int = 0
total_prompt_tokens: int = 0
total_tokens: int = 0
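
A common use of the report is cost estimation. A sketch using stand-in dataclasses mirroring the fields above (the model name and per-1K-token prices are made-up numbers, not real pricing):

```python
from dataclasses import dataclass

@dataclass
class ModelTokenUsage:
    """Stand-in mirroring afterimage.monitoring.ModelTokenUsage."""
    model_name: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

@dataclass
class TokenUsageReport:
    """Stand-in mirroring afterimage.monitoring.TokenUsageReport."""
    by_model: list[ModelTokenUsage]
    total_prompt_tokens: int = 0
    total_completion_tokens: int = 0
    total_tokens: int = 0

# Hypothetical (prompt, completion) USD prices per 1K tokens, keyed by model.
PRICES = {"model-a": (0.001, 0.002)}

def estimate_cost(report: TokenUsageReport) -> float:
    """Sum per-model cost from the report's per-model breakdown."""
    total = 0.0
    for usage in report.by_model:
        p_in, p_out = PRICES[usage.model_name]
        total += usage.prompt_tokens / 1000 * p_in
        total += usage.completion_tokens / 1000 * p_out
    return total

report = TokenUsageReport(
    by_model=[ModelTokenUsage("model-a", 10_000, 5_000, 15_000)],
    total_prompt_tokens=10_000,
    total_completion_tokens=5_000,
    total_tokens=15_000,
)
print(estimate_cost(report))  # → 0.02
```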