# Using AfterImage with Local Models
AfterImage supports any OpenAI-compatible local model server. No API key needed.
## Quick start

```bash
pip install afterimage
afterimage generate -c examples/configs/local.yaml
```
## vLLM

```bash
pip install vllm
vllm serve Qwen/Qwen3-1.7B --port 8000
```

Config:

```yaml
model:
  provider: local
  base_url: http://localhost:8000/v1
  model_name: Qwen/Qwen3-1.7B
```
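Once a server is running, it can help to sanity-check the endpoint before pointing AfterImage at it. The sketch below builds the standard OpenAI-compatible `/chat/completions` request that vLLM, Ollama, and llama.cpp all accept; `build_chat_request` is an illustrative helper, not part of AfterImage:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model_name: str, prompt: str):
    """Build the URL and JSON payload for an OpenAI-compatible
    /chat/completions call (the API these local servers expose)."""
    url = f"{base_url.rstrip('/')}/chat/completions"
    payload = {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, payload

url, payload = build_chat_request(
    "http://localhost:8000/v1", "Qwen/Qwen3-1.7B", "ping"
)

# Uncomment to actually hit a running server:
# req = urllib.request.Request(
#     url,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# resp = json.loads(urllib.request.urlopen(req).read())
# print(resp["choices"][0]["message"]["content"])
```

Because the same `base_url` convention is used everywhere, the identical check works against any of the servers below by swapping in their port and model name.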
## Ollama

```bash
ollama pull llama3.2
ollama serve
```

Config:

```yaml
model:
  provider: local
  base_url: http://localhost:11434/v1
  model_name: llama3.2
```
## llama.cpp

```bash
./llama-server -m model.gguf --port 8000
```

Config:

```yaml
model:
  provider: local
  base_url: http://localhost:8000/v1
  model_name: model
```
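All three configs share the same three fields under `model:`. How AfterImage itself loads them is internal, but as a sketch of how such a block maps onto client settings (the hand-rolled parser here is for illustration only; a real loader would use a YAML library):

```python
def parse_model_block(text: str) -> dict:
    """Collect the flat key: value lines under a `model:` block.
    Illustrative only -- a real loader would use PyYAML."""
    fields = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.endswith(":"):  # skip the bare `model:` header
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields

config = parse_model_block("""
model:
  provider: local
  base_url: http://localhost:8000/v1
  model_name: model
""")
# config == {"provider": "local",
#            "base_url": "http://localhost:8000/v1",
#            "model_name": "model"}
```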
## Tips
- Lower `max_concurrency` for CPU inference (1-2 is usually best)
- `max_turns: 1` keeps generation fast for small models
- Quality gating (`auto_improve: true`) requires local embeddings: `pip install "afterimage[embeddings-local]"`
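On why lowering `max_concurrency` helps: CPU inference parallelizes poorly, so queueing requests behind a small semaphore avoids thrashing the cores. A generic sketch of the pattern a setting like `max_concurrency` controls (not AfterImage's actual implementation):

```python
import asyncio

async def generate(prompt: str) -> str:
    # Stand-in for a real model call; CPU inference would block far longer.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"

async def run_all(prompts, max_concurrency: int = 2):
    """Run generations with at most `max_concurrency` in flight at once."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(prompt):
        async with sem:
            return await generate(prompt)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(p) for p in prompts))

results = asyncio.run(run_all(["a", "b", "c", "d"], max_concurrency=2))
# results == ["echo: a", "echo: b", "echo: c", "echo: d"]
```

With `max_concurrency: 1` the semaphore degenerates to fully serial execution, which is often the fastest option on a CPU-only machine.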